package earley

Module Earley_core.Charset

A module providing efficient character sets.

Type

type charset

The abstract type for a character set.

type t = charset

Synonym of charset.

Charset construction

val empty : charset

The empty character set.

val full : charset

The full character set.

val singleton : char -> charset

singleton c returns a charset containing only c.

val range : char -> char -> charset

range cmin cmax returns the charset containing all the characters from cmin to cmax, both included.

val from_string : string -> charset

from_string s returns the charset described by the string s, which may contain standalone characters and ranges. A range is built from a start character and an end character, separated by '-'. The character '-' itself may appear as a standalone character only in first position. An example of a valid description is "-_a-zA-Z0-9". Invalid_argument is raised if the description is ill-formed.
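
As an illustration of the description format, here is a minimal sketch that builds a charset from the example string above and probes it with mem. It assumes the earley package is installed and linked so that Earley_core.Charset is in scope; the helper charset_of_description is a hypothetical name introduced here, not part of the library.

```ocaml
(* Assumption: the earley package is installed and linked. *)
module Charset = Earley_core.Charset

(* Hypothetical helper: turn the documented Invalid_argument exception
   into an option, which is convenient for user-supplied descriptions. *)
let charset_of_description (s : string) : Charset.t option =
  try Some (Charset.from_string s) with Invalid_argument _ -> None

(* '-' in first position is a standalone character, followed by '_'
   and the three ranges a-z, A-Z and 0-9. *)
let ident_chars = Charset.from_string "-_a-zA-Z0-9"

let () =
  List.iter
    (fun c -> Printf.printf "%C allowed: %b\n" c (Charset.mem ident_chars c))
    ['-'; '_'; 'k'; 'K'; '7'; ' '; '!']
```

Wrapping the call in an option-returning helper is only one possible design; code that treats an ill-formed description as a programming error can call from_string directly and let the exception propagate.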

val union : charset -> charset -> charset

union cs1 cs2 builds a new charset that contains the union of the characters of cs1 and cs2.

val complement : charset -> charset

complement cs returns a new charset containing exactly the characters that are not in cs.

val add : charset -> char -> charset

add cs c returns a new charset containing the characters of cs and the character c.

val del : charset -> char -> charset

del cs c returns a new charset containing the characters of cs but not the character c.

Membership test

val mem : charset -> char -> bool

mem cs c tests whether the charset cs contains c.
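
The construction functions combine naturally with mem. The sketch below is one possible way to compose them, assuming the earley package is installed and linked; the charset names (lower, ident, consonants, not_digit) are illustrative only.

```ocaml
(* Assumption: the earley package is installed and linked. *)
module Charset = Earley_core.Charset

let lower = Charset.range 'a' 'z'
let digits = Charset.range '0' '9'

(* Lowercase letters, digits and the underscore. *)
let ident = Charset.add (Charset.union lower digits) '_'

(* del has type charset -> char -> charset, so it can be folded over a
   list of characters to remove several of them. *)
let consonants = List.fold_left Charset.del lower ['a'; 'e'; 'i'; 'o'; 'u']

(* Every character that is not a decimal digit. *)
let not_digit = Charset.complement digits

let () =
  Printf.printf "'_' in ident: %b\n" (Charset.mem ident '_');
  Printf.printf "'e' in consonants: %b\n" (Charset.mem consonants 'e');
  Printf.printf "'5' in not_digit: %b\n" (Charset.mem not_digit '5')
```

Since add, del, union and complement each return a new charset, lower and digits can still be reused after the other sets have been derived from them.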

Printing and string representation

val print : out_channel -> charset -> unit

print oc cs prints the charset cs to the output channel oc. A compact format is used for printing: common ranges are used, and the full and empty charsets are abbreviated.

val print_full : out_channel -> charset -> unit

print_full oc cs is the same as print oc cs, but it does not use abbreviations (i.e. all characters are displayed).

val show : charset -> string

show cs builds a string representing the charset cs using the same compact format as print.

val show_full : charset -> string

show_full cs is the same as show cs, but it does not use abbreviations (i.e. all characters appear).
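
Because print and print_full take the output channel first, they fit the %a conversion of Printf.printf, while show and show_full are handy for building messages. A minimal sketch, assuming the earley package is installed and linked; the exact text produced is left to the library and is not reproduced here.

```ocaml
(* Assumption: the earley package is installed and linked. *)
module Charset = Earley_core.Charset

let digits = Charset.range '0' '9'

let () =
  (* %a expects a printer of type out_channel -> 'a -> unit,
     followed by the value to print. *)
  Printf.printf "compact: %a\n" Charset.print digits;
  Printf.printf "full:    %a\n" Charset.print_full digits;
  (* show returns the compact representation as a string. *)
  let message = "expected one of " ^ Charset.show digits in
  print_endline message
```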

Manipulating charsets imperatively

val copy : charset -> charset

copy cs makes a copy of the charset cs.

val addq : charset -> char -> unit

addq cs c adds the character c to the charset cs in place. Users must be particularly careful when using this function: it should not be applied directly to empty, full or the result of the singleton function, as it would change their value permanently. It is advisable to prefer add, or to work on a copy.

val delq : charset -> char -> unit

delq cs c deletes the character c from the charset cs in place. The same recommendations as for addq apply.
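
The recommendation to work on a copy can be sketched as follows, assuming the earley package is installed and linked. The function name hex_digits is hypothetical; the point is that the in-place updates only ever touch a private copy, never the shared empty value.

```ocaml
(* Assumption: the earley package is installed and linked. *)
module Charset = Earley_core.Charset

(* Build a charset imperatively, starting from a private copy of empty. *)
let hex_digits () : Charset.t =
  let cs = Charset.copy Charset.empty in
  (* addq cs has type char -> unit, so it can be passed to String.iter. *)
  String.iter (Charset.addq cs) "0123456789abcdefABCDEF";
  cs

let hex = hex_digits ()

(* Derive a lowercase-only variant, again by mutating a copy. *)
let lower_hex =
  let cs = Charset.copy hex in
  String.iter (Charset.delq cs) "ABCDEF";
  cs

let () =
  Printf.printf "'F' in hex: %b\n" (Charset.mem hex 'F');
  Printf.printf "'F' in lower_hex: %b\n" (Charset.mem lower_hex 'F')
```

When only a handful of characters are involved, the purely functional add and del are usually simpler and avoid the aliasing concern entirely.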
