package caisar

Module Ovo

type t

t is an SVM. Each SVM is defined by a number (>1) of inputs or "features" and a number of outputs or "classes". Each class has a name and a number (>=1) of Support Vectors (SV).

Each support vector is defined as a vector of floats, one float for each input.

Additional information in the SVM record includes:

  • the dual coefficients (a vector of floats of cardinality nb_SVs * (nb_classes - 1));
  • the intercept (a vector of floats of cardinality nb_classes * (nb_classes - 1) / 2, i.e., one per pair of classes);
  • the kernel type, which is either Linear, RBF (with a gamma parameter), or Polynomial (with gamma, degree, and coef parameters).
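
As a rough illustration of the description above, the record could be pictured as follows. This is only a sketch: the type `t` is abstract in this interface, and every type, constructor, and field name below (`kernel`, `svm`, `classes`, and so on) is hypothetical rather than taken from the actual implementation.

```ocaml
(* Hypothetical model of the SVM record described above; the real
   definition of Ovo.t is not exposed and may differ. *)
type kernel =
  | Linear
  | Rbf of { gamma : float }
  | Polynomial of { gamma : float; degree : float; coef : float }

type svm = {
  nb_ins : int;                         (* number of inputs (features) *)
  classes : (string * int) array;       (* class name, number of SVs of the class *)
  support_vectors : float array array;  (* one row of nb_ins floats per SV *)
  dual_coefs : float array;
  intercept : float array;
  kernel : kernel;
}

(* The number of classes and the total number of support vectors are
   derived from the per-class description. *)
let nb_classes svm = Array.length svm.classes

let nb_svs svm =
  Array.fold_left (fun acc (_, n) -> acc + n) 0 svm.classes
```

In this model, the accessors documented below would simply read or derive these fields.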

val nb_ins : t -> int

nb_ins ovo is the number of inputs of ovo.

val nb_classes : t -> int

nb_classes ovo is the number of classes of ovo (i.e., the length of the output vector).

val class_name : t -> int -> string

class_name ovo cl is the name of class cl in ovo.

val nb_svs : t -> int

nb_svs ovo is the total number of support vectors in SVM ovo.

val parse : string -> (t, string) Stdlib.Result.t

Parses an OVO file according to the format implemented here: https://github.com/abstract-machine-learning/data-collection/blob/master/trainers/classifier_mapper.py#L21

The input format is:

ovo_file ::== header type_of_kernel classes_description dual_coef support_vector intercept

header ::== 'ovo' nb_ins=INT nb_classes=INT

type_of_kernel ::== 'linear' | 'rbf' gamma=FLOAT | ('polynomial'|'poly') ('gamma' gamma=FLOAT)? degree=FLOAT coef=FLOAT

classes_description ::== (classname1=STRING nb_sv_of_class1=INT classname0=STRING nb_sv_of_class0=INT) | (classname=STRING nb_sv_of_class=INT)^(nb_classes)

dual_coef ::== (FLOAT)^(total_nb_sv * (nb_classes - 1))

support_vector ::== (FLOAT)^(total_nb_sv * nb_ins)

intercept ::== (FLOAT)^(nb_classes * (nb_classes - 1) / 2)
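
The lengths of the three trailing sections follow directly from the header and the class descriptions. The sketch below computes them from the grammar; the names `sizes` and `section_sizes` are illustrative and are not part of this module.

```ocaml
(* Expected number of FLOAT tokens in each trailing section of an OVO
   file, derived from the grammar above. Illustrative names only. *)
type sizes = { dual_coef : int; support_vector : int; intercept : int }

let section_sizes ~nb_ins ~nb_svs_per_class =
  let nb_classes = List.length nb_svs_per_class in
  let total_nb_sv = List.fold_left ( + ) 0 nb_svs_per_class in
  {
    dual_coef = total_nb_sv * (nb_classes - 1);
    support_vector = total_nb_sv * nb_ins;
    (* one intercept per unordered pair of classes *)
    intercept = nb_classes * (nb_classes - 1) / 2;
  }
```

For instance, with 4 inputs and three classes holding 3, 5, and 2 support vectors, the file would be expected to contain 20 dual coefficients, 40 support-vector components, and 3 intercepts.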

Notes:

* The classes are described in increasing order (class 0, class 1, class 2, etc.), *except* when there are exactly two classes: in that case class 1 is described before class 0 (see classes_description).
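
The two-class exception can be sketched as a small normalization step, reading classes_description as listing class 1 before class 0 in that case. The function `in_class_order` is hypothetical and only illustrates the ordering rule; it is not how the parser is necessarily written.

```ocaml
(* Given the (name, nb_sv) pairs in the order they appear in the file,
   return them in increasing class order. Only the two-class case is
   stored in reverse (class 1 first, then class 0). *)
let in_class_order (entries : (string * int) list) : (string * int) list =
  match entries with
  | [ class1; class0 ] -> [ class0; class1 ]
  | _ -> entries
```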

* When the polynomial kernel's 'gamma' parameter is not specified (as in the implementation mentioned above), it is assumed to be 'auto' as defined by scikit-learn, i.e., it is computed as 1.0 /. nb_ins. Since version 0.22 of scikit-learn, the default seems to be 'scale' instead (cf. https://github.com/scikit-learn/scikit-learn/issues/12741); however, 'scale' requires access to the training data, which is not available here. In general, it is better to save the 'gamma' value in the OVO file; the 'auto' fallback is kept for compatibility reasons.
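
The fallback rule for 'gamma' amounts to the following, assuming the optional value read from the file is represented as a `float option`. The name `resolve_gamma` is hypothetical.

```ocaml
(* Use the gamma stored in the file when present; otherwise fall back to
   the 'auto' rule described above, 1 / nb_ins. *)
let resolve_gamma ~nb_ins (gamma : float option) : float =
  match gamma with
  | Some g -> g
  | None -> 1.0 /. float_of_int nb_ins
```

With 4 inputs and no 'gamma' in the file, this yields 0.25; an explicit value always takes precedence.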

val to_nn : t -> Nir.Ngraph.t