package orsetto
Module Ucs_regx
Regular expression parsing, search and matching with Unicode text.
Overview
This module implements Unicode regular expression parsing, search and matching in pure Objective Caml. The implementation claims support for Requirements Level 1 (Basic Unicode Support) of Unicode Technical Standard #18 (Unicode Regular Expressions), with the following exceptions and extensions:
- No support for line boundaries.
- No support for word boundaries.
- No support for case insensitive matching.
- Additional support for the Block enumerated property.
At present, there is no support for the Script_Extensions property.
Modules
module DFA : sig ... end
Deterministic finite automata for Unicode code points.
val of_text : Ucs_text.t -> t
Use of_text s to make a regular expression denoted by s. Raises Invalid_argument if s does not denote a valid regular expression.
Use of_uchars s to make a regular expression denoted by the Unicode code points in s. Raises Invalid_argument if the characters do not denote a valid regular expression.
Use of_dfa_term s to make a regular expression for recognizing the language term s.
val test : t -> Ucs_text.t -> bool
Use test r t to test whether the text t matches the regular expression r.
val contains : t -> Ucs_text.t -> bool
Use contains r t to test whether r recognizes any substring of t.
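The relationship between of_text, test and contains can be sketched without depending on the library itself. The sketch below is only an illustration: the REGX signature copies the three documented signatures (with the text type left abstract), while Classify, verdict, and the Literal stand-in module are hypothetical names introduced here. It also assumes that test checks the whole text whereas contains checks any substring, which is how the descriptions above read.

```ocaml
(* The three operations as documented, with the text type abstracted. *)
module type REGX = sig
  type text
  type t
  val of_text : text -> t          (* may raise Invalid_argument *)
  val test : t -> text -> bool
  val contains : t -> text -> bool
end

(* Hypothetical helper: classify a subject against a pattern. *)
module Classify (R : REGX) = struct
  type verdict = Whole | Partial | No_match | Bad_pattern

  let classify pattern subject =
    match R.of_text pattern with
    | exception Invalid_argument _ -> Bad_pattern
    | r ->
      if R.test r subject then Whole
      else if R.contains r subject then Partial
      else No_match
end

(* Toy stand-in (NOT Ucs_regx): patterns are literal, non-empty strings. *)
module Literal : REGX with type text = string = struct
  type text = string
  type t = string

  let of_text p =
    if p = "" then invalid_arg "Literal.of_text: empty pattern" else p

  let test r s = String.equal r s

  let contains r s =
    let n = String.length s and m = String.length r in
    let rec at i = i + m <= n && (String.sub s i m = r || at (i + 1)) in
    at 0
end

module C = Classify (Literal)
```

With the real library, a module satisfying REGX with text = Ucs_text.t would take the place of Literal; how Ucs_text.t values are constructed is outside the scope of this page.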
Use search r s to search with r in a confluently persistent sequence s for the first accepted subsequence. Returns None if s does not contain a matching subsequence. Otherwise, returns Some (start, limit), where start is the index at which the first matching subsequence begins and limit is the index just past the end of the longest matching subsequence.
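To make the (start, limit) convention concrete, here is a toy model of leftmost-longest search over a plain OCaml string. It is not the Ucs_regx implementation: search_model and is_digits are hypothetical names, the "regular expression" is replaced by an arbitrary acceptance predicate, and the model assumes that limit refers to the longest match beginning at start.

```ocaml
(* Brute-force leftmost-longest search: try each start from left to right,
   and for each start try the longest candidate first. *)
let search_model (accepts : string -> bool) (s : string) : (int * int) option =
  let n = String.length s in
  let rec try_start start =
    if start > n then None
    else
      let rec try_limit limit =
        if limit < start then None
        else if accepts (String.sub s start (limit - start)) then
          Some (start, limit)
        else try_limit (limit - 1)
      in
      match try_limit n with
      | Some _ as found -> found
      | None -> try_start (start + 1)
  in
  try_start 0

(* Example predicate: one or more ASCII digits. *)
let is_digits t =
  let ok = ref (t <> "") in
  String.iter (fun c -> if c < '0' || c > '9' then ok := false) t;
  !ok
```

For example, search_model is_digits "ab123cd45" evaluates to Some (2, 5): the first run of digits starts at index 2, and index 5 is one past its last digit.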
val split : t -> Ucs_text.t -> Ucs_text.t Seq.t
Use split r s to split s into a sequence of slices comprising the substrings of s that are separated by disjoint substrings matching r, which are found by searching from left to right. If r does not match any substring of s, then a sequence containing just s is returned, even if s is an empty slice.
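The splitting rule can be illustrated with a small stand-alone model on plain strings. This is a sketch under stated assumptions, not the library's code: split_model is a hypothetical name, and the separator pattern is simplified to "a maximal run of characters satisfying is_sep". How the real function treats leading or trailing separators is not stated above, so the model's behavior in those cases should not be read as documentation.

```ocaml
(* Lazily split [s] at maximal runs of separator characters,
   scanning from left to right. *)
let split_model (is_sep : char -> bool) (s : string) : string Seq.t =
  let n = String.length s in
  (* Index of the first separator character at or after [i], if any. *)
  let rec find_sep i =
    if i >= n then None
    else if is_sep s.[i] then Some i
    else find_sep (i + 1)
  in
  (* Index just past the run of separator characters starting at [i]. *)
  let rec skip_sep i = if i < n && is_sep s.[i] then skip_sep (i + 1) else i in
  let rec pieces from () =
    match find_sep from with
    | None ->
      (* No further separator: the remainder is the last piece. *)
      Seq.Cons (String.sub s from (n - from), Seq.empty)
    | Some start ->
      let limit = skip_sep start in
      Seq.Cons (String.sub s from (start - from), pieces limit)
  in
  pieces 0
```

As in the description above, an input with no separator yields a one-element sequence, including when the input is empty: List.of_seq (split_model ((=) ',') "") is [""].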
val quote : Ucs_text.t -> Ucs_text.t
Use quote s to make a copy of s in which all the special characters are converted into escape sequences.
val unquote : Ucs_text.t -> Ucs_text.t
Use unquote s to make a copy of s in which all the escape sequences are converted into ordinary characters.
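The intended round trip between quote and unquote can be sketched with a toy model on plain strings. Everything here is an assumption made for illustration: the set of special characters and the backslash escape syntax are hypothetical and are not taken from Ucs_regx, and quote_model and unquote_model are names introduced only for this sketch.

```ocaml
(* Hypothetical set of special characters; the real set is not listed here. *)
let specials = ".*+?|()[]{}\\"
let is_special c = String.contains specials c

(* Prefix every special character with a backslash. *)
let quote_model (s : string) : string =
  let b = Buffer.create (2 * String.length s) in
  String.iter
    (fun c ->
       if is_special c then Buffer.add_char b '\\';
       Buffer.add_char b c)
    s;
  Buffer.contents b

(* Drop each escaping backslash and keep the character that follows it. *)
let unquote_model (s : string) : string =
  let n = String.length s in
  let b = Buffer.create n in
  let rec go i =
    if i < n then
      if s.[i] = '\\' && i + 1 < n then begin
        Buffer.add_char b s.[i + 1];
        go (i + 2)
      end else begin
        Buffer.add_char b s.[i];
        go (i + 1)
      end
  in
  go 0;
  Buffer.contents b
```

In this model, unquote_model (quote_model s) returns s for any string s, which is the property one would expect the real pair to satisfy when quoting literal text for use inside a pattern.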