package owl-base


Module Owl_dataframe

Type definition
type t

Abstract dataframe type.

type series =
  | Bool_Series of bool array
  | Int_Series of int array
  | Float_Series of float array
  | String_Series of string array
  | Any_Series

Abstract series type.

type elt =
  | Bool of bool
  | Int of int
  | Float of float
  | String of string
  | Any

Type of the elements in a series.

Packing & unpacking element
val pack_bool : bool -> elt

Pack the boolean value to ``elt`` type.

val pack_int : int -> elt

Pack the int value to ``elt`` type.

val pack_float : float -> elt

Pack the float value to ``elt`` type.

val pack_string : string -> elt

Pack the string value to ``elt`` type.

val unpack_bool : elt -> bool

Unpack ``elt`` type to boolean value.

val unpack_int : elt -> int

Unpack ``elt`` type to int value.

val unpack_float : elt -> float

Unpack ``elt`` type to float value.

val unpack_string : elt -> string

Unpack ``elt`` type to string value.
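
A minimal sketch of packing and unpacking elements, assuming the owl-base library is installed and linked. The alias ``Df`` and the helper ``describe`` are hypothetical names introduced for illustration only; the values are made up.

```ocaml
module Df = Owl_dataframe

(* Pack plain OCaml values into the elt variant. *)
let packed_int = Df.pack_int 42
let packed_str = Df.pack_string "owl"

(* The elt constructors are public, so a packed value can also be
   inspected by pattern matching. [describe] is a hypothetical helper. *)
let describe (e : Df.elt) =
  match e with
  | Df.Bool b -> Printf.sprintf "bool %b" b
  | Df.Int i -> Printf.sprintf "int %d" i
  | Df.Float f -> Printf.sprintf "float %g" f
  | Df.String s -> Printf.sprintf "string %s" s
  | Df.Any -> "any"

(* Unpack with the function that matches the constructor. *)
let n = Df.unpack_int packed_int
let s = Df.unpack_string packed_str
```

Each ``unpack_*`` function should be paired with an element built by the matching ``pack_*`` function or constructor.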

Packing & unpacking series
val pack_bool_series : bool array -> series

Pack boolean array to ``series`` type.

val pack_int_series : int array -> series

Pack int array to ``series`` type.

val pack_float_series : float array -> series

Pack float array to ``series`` type.

val pack_string_series : string array -> series

Pack string array to ``series`` type.

val unpack_bool_series : series -> bool array

Unpack ``series`` type to boolean array.

val unpack_int_series : series -> int array

Unpack ``series`` type to int array.

val unpack_float_series : series -> float array

Unpack ``series`` type to float array.

val unpack_string_series : series -> string array

Unpack ``series`` type to string array.
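
The series functions mirror the element functions but work on whole arrays. The following sketch assumes owl-base is available; ``Df`` and ``series_length`` are hypothetical names, and the data is illustrative.

```ocaml
module Df = Owl_dataframe

(* Wrap typed arrays as series, one series per future column. *)
let ages = Df.pack_int_series [| 25; 30; 35 |]
let names = Df.pack_string_series [| "Alice"; "Bob"; "Carol" |]

(* [series_length] is a hypothetical helper: it matches on the public
   constructors to find the length of the underlying array. *)
let series_length (s : Df.series) =
  match s with
  | Df.Bool_Series a -> Array.length a
  | Df.Int_Series a -> Array.length a
  | Df.Float_Series a -> Array.length a
  | Df.String_Series a -> Array.length a
  | Df.Any_Series -> 0

(* Recover the plain int array from the series. *)
let raw_ages = Df.unpack_int_series ages
```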

Obtain properties
val row_num : t -> int

``row_num x`` returns the number of rows in ``x``.

val col_num : t -> int

``col_num x`` returns the number of columns in ``x``.

val shape : t -> int * int

``shape x`` returns the shape of ``x``, i.e. ``(row number, column number)``.

val numel : t -> int

``numel x`` returns the number of elements in ``x``.

val types : t -> string array

``types x`` returns the string representation of column types.

val get_heads : t -> string array

``get_heads x`` returns the column names of ``x``.

val set_heads : t -> string array -> unit

``set_heads x head_names`` sets ``head_names`` as the column names of ``x``.

val id_to_head : t -> int -> string

``id_to_head x i`` converts column index ``i`` to its corresponding head name.

val head_to_id : t -> string -> int

``head_to_id x head_name`` converts head name to its corresponding column index.
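
As a sketch of how these property functions fit together, the example below builds a small dataframe with ``make`` (described under Core operations) and queries it. It assumes owl-base is available; the alias ``Df`` and the sample data are illustrative.

```ocaml
module Df = Owl_dataframe

(* A 3-row, 2-column dataframe with columns "name" and "age". *)
let df =
  let name = Df.pack_string_series [| "Alice"; "Bob"; "Carol" |] in
  let age = Df.pack_int_series [| 25; 30; 35 |] in
  Df.make ~data:[| name; age |] [| "name"; "age" |]

(* shape returns (rows, columns); numel is their product. *)
let rows, cols = Df.shape df
let total = Df.numel df

(* Translate between column names and column indices. *)
let age_index = Df.head_to_id df "age"
let second_head = Df.id_to_head df 1
```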

Basic get and set functions
val get : t -> int -> int -> elt

``get x i j`` returns the element at ``(i,j)``.

val set : t -> int -> int -> elt -> unit

``set x i j v`` sets the value of element at ``(i,j)`` to ``v``.

val get_by_name : t -> int -> string -> elt

``get_by_name x i head_name`` is similar to ``get`` but uses column name.

val set_by_name : t -> int -> string -> elt -> unit

``set_by_name x i head_name v`` is similar to ``set`` but uses column name.
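
A short sketch of element access by position and by column name, assuming owl-base is available. The alias ``Df`` and the sample data are illustrative; row index comes first, column index or name second.

```ocaml
module Df = Owl_dataframe

let df =
  let name = Df.pack_string_series [| "Alice"; "Bob"; "Carol" |] in
  let age = Df.pack_int_series [| 25; 30; 35 |] in
  Df.make ~data:[| name; age |] [| "name"; "age" |]

(* Read by (row, column index) and by (row, column name). *)
let bob_age = Df.get df 1 1 |> Df.unpack_int
let carol_name = Df.get_by_name df 2 "name" |> Df.unpack_string

(* Both setters update the dataframe in place and return unit. *)
let () = Df.set df 0 1 (Df.pack_int 26)
let () = Df.set_by_name df 2 "age" (Df.pack_int 36)

let alice_age = Df.get_by_name df 0 "age" |> Df.unpack_int
```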

val get_row : t -> int -> elt array

``get_row x i`` returns the ith row in ``x``.

val get_col : t -> int -> series

``get_col x i`` returns the ith column in ``x``.

val get_rows : t -> int array -> elt array array

``get_rows x a`` returns the rows of ``x`` specified in ``a``.

val get_cols : t -> int array -> series array

``get_cols x a`` returns the columns of ``x`` specified in ``a``.

val get_col_by_name : t -> string -> series

``get_col_by_name`` is similar to ``get_col`` but uses column name.

val get_cols_by_name : t -> string array -> series array

``get_cols_by_name`` is similar to ``get_cols`` but uses column names.

val get_slice : int list list -> t -> t

``get_slice s x`` returns a slice of ``x`` defined by ``s``. For more details, please refer to the documentation of ``Owl_dense_ndarray_generic``.

val get_slice_by_name : (int list * string list) -> t -> t

``get_slice_by_name`` is similar to ``get_slice`` but uses column names.

val head : int -> t -> t

``head n x`` returns the top ``n`` rows of ``x``.

val tail : int -> t -> t

``tail n x`` returns the bottom ``n`` rows of ``x``.

Core operations
val make : ?data:series array -> string array -> t

``make ~data head_names`` creates a dataframe with an array of series data and corresponding column names. If ``data`` is not passed in, the function will return an empty dataframe.

val copy : t -> t

``copy x`` returns a copy of dataframe ``x``.

val copy_struct : t -> t

``copy_struct x`` only copies the structure of ``x`` with empty series.

val reset : t -> unit

``reset x`` resets the dataframe ``x`` by setting all the series to empty.

val unique : t -> string -> series

``unique x head`` removes the duplicates from the column specified by the ``head`` name and returns only the unique values as a series.

val sort : ?inc:bool -> t -> string -> t

``sort ~inc x head`` sorts the entries in the dataframe ``x`` according to the specified column by head name ``head``. By default, ``inc`` equals ``true``, indicating increasing order.

val min_i : t -> string -> int

``min_i x head`` returns the row index of the minimum value in the column specified by the ``head`` name.

val max_i : t -> string -> int

``max_i x head`` returns the row index of the maximum value in the column specified by the ``head`` name.
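
The sorting and extremum functions can be sketched as follows, assuming owl-base is available. The alias ``Df`` and the sample data are illustrative. The example relies on ``sort`` returning a new dataframe, as its signature suggests.

```ocaml
module Df = Owl_dataframe

(* Ages are deliberately out of order: 30, 25, 35. *)
let df =
  let name = Df.pack_string_series [| "Alice"; "Bob"; "Carol" |] in
  let age = Df.pack_int_series [| 30; 25; 35 |] in
  Df.make ~data:[| name; age |] [| "name"; "age" |]

(* Sort by the "age" column, increasing and then decreasing. *)
let ascending = Df.sort ~inc:true df "age"
let descending = Df.sort ~inc:false df "age"

(* Row indices (in df) of the smallest and largest age. *)
let youngest_row = Df.min_i df "age"
let oldest_row = Df.max_i df "age"

let youngest_name = Df.get_by_name ascending 0 "name" |> Df.unpack_string
let oldest_name = Df.get_by_name descending 0 "name" |> Df.unpack_string
```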

val append_row : t -> elt array -> unit

``append_row x row`` appends a row to the dataframe ``x``.

val append_col : t -> series -> string -> unit

``append_col x col head`` appends the column ``col`` with head name ``head`` to the dataframe ``x``.

val insert_row : t -> int -> elt array -> unit

``insert_row x i row`` inserts ``row`` at position ``i`` into dataframe ``x``.

val insert_col : t -> int -> string -> series -> unit

``insert_col x j col_head s`` inserts series ``s`` with column head ``col_head`` at position ``j`` into dataframe ``x``.

val remove_row : t -> int -> unit

``remove_row x i`` removes the ``i``-th row of ``x``. Negative index is accepted.

val remove_col : t -> int -> unit

``remove_col x i`` removes the ``i``-th column of ``x``. Negative index is accepted.
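
The row functions above modify the dataframe in place. A minimal sketch, assuming owl-base is available; the alias ``Df`` and the sample data are illustrative.

```ocaml
module Df = Owl_dataframe

let df =
  let name = Df.pack_string_series [| "Alice"; "Bob"; "Carol" |] in
  let age = Df.pack_int_series [| 25; 30; 35 |] in
  Df.make ~data:[| name; age |] [| "name"; "age" |]

(* Append one row; its elements follow the column order (name, age). *)
let () = Df.append_row df [| Df.pack_string "Dave"; Df.pack_int 40 |]
let rows_after_append = Df.row_num df

(* Remove the first row, so "Bob" becomes row 0. *)
let () = Df.remove_row df 0
let rows_after_remove = Df.row_num df
let first_name = Df.get_by_name df 0 "name" |> Df.unpack_string
```

Because these functions return ``unit``, call ``copy`` first if the original dataframe must be kept unchanged.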

val concat_horizontal : t -> t -> t

``concat_horizontal x y`` merges two dataframes ``x`` and ``y``. Note that ``x`` and ``y`` must have the same number of rows, and each column name should be unique.

val concat_vertical : t -> t -> t

``concat_vertical x y`` concatenates two dataframes by appending ``y`` to ``x``. The two dataframes ``x`` and ``y`` must have the same number of columns and the same column names.

Iteration functions
val iteri_row : (int -> elt array -> unit) -> t -> unit

``iteri_row f x`` iterates over the rows of ``x`` and applies ``f`` to each row index and row.

val iter_row : (elt array -> unit) -> t -> unit

``iter_row`` is similar to ``iteri_row`` without passing in row indices.

val mapi_row : (int -> elt array -> elt array) -> t -> t

``mapi_row f x`` transforms the current dataframe ``x`` to a new dataframe by applying function ``f`` to each row. Note that the value returned by ``f`` must be consistent with the rows of ``x`` w.r.t. length and type, otherwise a runtime error will occur.

val map_row : (elt array -> elt array) -> t -> t

``map_row`` is similar to ``mapi_row`` but without passing in row indices.

val filteri_row : (int -> elt array -> bool) -> t -> t

``filteri_row f x`` creates a new dataframe from ``x`` by keeping only those rows which satisfy the condition ``f``.

val filter_row : (elt array -> bool) -> t -> t

``filter_row`` is similar to ``filteri_row`` without passing in row indices.

val filter_mapi_row : (int -> elt array -> elt array option) -> t -> t

``filter_mapi_row f x`` creates a new dataframe from ``x`` by applying ``f`` to each row. If ``f`` returns ``None`` then the row is excluded from the returned dataframe; if ``f`` returns ``Some row`` then ``row`` is included.

val filter_map_row : (elt array -> elt array option) -> t -> t

``filter_map_row`` is similar to ``filter_mapi_row`` without passing in row indices.
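
The row-wise iteration functions can be sketched as below, assuming owl-base is available. The alias ``Df`` and the sample data are illustrative; each row arrives as an ``elt array`` in column order (name, age).

```ocaml
module Df = Owl_dataframe

let df =
  let name = Df.pack_string_series [| "Alice"; "Bob"; "Carol" |] in
  let age = Df.pack_int_series [| 25; 30; 35 |] in
  Df.make ~data:[| name; age |] [| "name"; "age" |]

(* iter_row: accumulate the sum of the "age" column. *)
let total_age =
  let acc = ref 0 in
  Df.iter_row (fun row -> acc := !acc + Df.unpack_int row.(1)) df;
  !acc

(* filter_row: keep the rows whose age is at least 30. *)
let adults = Df.filter_row (fun row -> Df.unpack_int row.(1) >= 30) df

(* map_row: return rows of the same length and types, with age + 1. *)
let older =
  Df.map_row
    (fun row -> [| row.(0); Df.pack_int (Df.unpack_int row.(1) + 1) |])
    df

(* filter_map_row: drop rows under 30 and add 1 to the age of the rest. *)
let older_adults =
  Df.filter_map_row
    (fun row ->
      let a = Df.unpack_int row.(1) in
      if a >= 30 then Some [| row.(0); Df.pack_int (a + 1) |] else None)
    df
```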

Extended indexing operators
val (.%()) : t -> (int * string) -> elt

Extended indexing operator associated with ``get_by_name`` function.

val (.%()<-) : t -> (int * string) -> elt -> unit

Extended indexing operator associated with ``set_by_name`` function.

val (.?()) : t -> (elt array -> bool) -> t

Extended indexing operator associated with ``filter_row`` function.

val (.?()<-) : t -> (elt array -> bool) -> (elt array -> elt array) -> t

Extended indexing operator associated with ``filter_map_row`` function. Given a dataframe ``x``, ``f`` is used for filtering and ``g`` is used for transforming. In other words, ``x.?(f) <- g`` means that if ``f row`` is ``true`` then ``g row`` is included in the returned dataframe.

val (.$()) : t -> (int list * string list) -> t

Extended indexing operator associated with ``get_slice_by_name`` function.
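
A sketch of the operators in use, assuming owl-base is available and an OCaml version that supports user-defined indexing operators. The alias ``Df`` and the sample data are illustrative. The operators are brought into scope with a local open, ``Df.( ... )``.

```ocaml
module Df = Owl_dataframe

let df =
  let name = Df.pack_string_series [| "Alice"; "Bob"; "Carol" |] in
  let age = Df.pack_int_series [| 25; 30; 35 |] in
  Df.make ~data:[| name; age |] [| "name"; "age" |]

(* x.%(i, head) reads like get_by_name x i head. *)
let age0 = Df.(df.%(0, "age"))

(* x.%(i, head) <- v writes like set_by_name x i head v. *)
let () = Df.(df.%(0, "age") <- pack_int 26)

(* x.?(f) filters rows like filter_row f x. *)
let over_28 = Df.(df.?(fun row -> unpack_int row.(1) > 28))

(* x.?(f) <- g keeps the rows satisfying f and transforms them with g.
   Unlike an ordinary assignment, this expression returns a new dataframe. *)
let bumped =
  Df.(df.?(fun row -> unpack_int row.(1) > 28)
      <- (fun row -> [| row.(0); pack_int (unpack_int row.(1) + 1) |]))
```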

IO & helper functions
val of_csv : ?sep:char -> ?head:string array -> ?types:string array -> string -> t

``of_csv ~sep ~head ~types fname`` creates a dataframe by reading the data in a CSV file with the name ``fname``. Currently, the function supports four data types: ``b`` for boolean; ``i`` for int; ``f`` for float; ``s`` for string.

Note that if the ``types`` parameter is omitted, all the elements will be parsed as strings by default.

Parameters:
* ``sep``: delimiter, the default one is tab.
* ``head``: column names, if not passed in, the first line of the CSV file will be used.
* ``types``: data type of each column, must be consistent with ``head``.

val to_csv : ?sep:char -> t -> string -> unit

``to_csv ~sep x fname`` writes the dataframe ``x`` to a CSV file named ``fname``. The delimiter is specified by ``sep``.

val to_rows : t -> elt array array

``to_rows x`` returns an array of rows in ``x``.

val to_cols : t -> series array

``to_cols x`` returns an array of columns in ``x``.

``print x`` pretty prints a dataframe on the terminal.

val elt_to_str : elt -> string

``elt_to_str x`` converts element ``x`` to its string representation.
