package orsetto
Module Ucs_regx
Regular expression parsing, search and matching with Unicode text.
Overview
This module implements Unicode regular expression parsing, search and matching in pure Objective Caml. The implementation claims support for Requirements Level 1 (Basic Unicode Support) of Unicode Technical Standard #18, with the following exceptions:
- No support for line boundaries.
- No support for word boundaries.
- No support for case insensitive matching.
- Additional support for the Block enumerated property.
At present, there is no support for the Script_Extensions property.
Modules
module DFA : sig ... end
Deterministic finite automata for Unicode code points.
val of_text : Ucs_text.t -> t
Use of_text s to make a regular expression denoted by s. Raises Invalid_argument if s does not denote a valid regular expression.
Use of_uchars s to make a regular expression denoted by the Unicode code points in s. Raises Invalid_argument if the characters do not denote a valid regular expression.
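Because both constructors raise on invalid input, callers that accept patterns from users may prefer a result-returning wrapper. The following is a minimal sketch, assuming the exception meant above is the standard Invalid_argument. The names try_compile and toy_compile are hypothetical: toy_compile is only a stand-in parser so that the sketch runs without the library.

```ocaml
(* Generic wrapper: turn a constructor that raises Invalid_argument into one
   that returns a result instead. *)
let try_compile (compile : 'a -> 'b) (src : 'a) : ('b, string) result =
  match compile src with
  | r -> Ok r
  | exception Invalid_argument msg -> Error msg

(* Hypothetical stand-in for a regular expression constructor. It is NOT the
   library's parser: it only rejects unbalanced parentheses. *)
let toy_compile (s : string) : string =
  let depth = ref 0 in
  String.iter
    (fun c ->
      if c = '(' then incr depth
      else if c = ')' then begin
        decr depth;
        if !depth < 0 then invalid_arg "toy_compile: unbalanced ')'"
      end)
    s;
  if !depth <> 0 then invalid_arg "toy_compile: unbalanced '('";
  s
```

With the library available, the same wrapper would presumably be applied as try_compile Ucs_regx.of_text pattern.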
Use of_dfa_term s to make a regular expression for recognizing the language term s.
val test : t -> Ucs_text.t -> bool
Use test r t to test whether the text t matches the regular expression r.
val contains : t -> Ucs_text.t -> bool
Use contains r t to test whether r recognizes any substring of t.
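The difference between test (the whole text must match) and contains (some substring must match) can be illustrated with a toy model. This sketch does not use the library: it assumes, purely for illustration, that a regular expression is a predicate over plain OCaml strings, and the names toy_regx and digits are hypothetical.

```ocaml
(* Toy model: a "regular expression" is a predicate that accepts or rejects
   a whole string. This is NOT the library's representation. *)
type toy_regx = string -> bool

(* test: the entire text must be accepted. *)
let test (r : toy_regx) (t : string) : bool = r t

(* contains: some substring t[i..j) must be accepted (brute force). *)
let contains (r : toy_regx) (t : string) : bool =
  let n = String.length t in
  let rec ending_at i j =
    j <= n && (r (String.sub t i (j - i)) || ending_at i (j + 1))
  in
  let rec from_start i = i <= n && (ending_at i i || from_start (i + 1)) in
  from_start 0

(* Accepts one or more ASCII digits, as the pattern [0-9]+ would. *)
let digits : toy_regx = fun s ->
  let ok = ref (s <> "") in
  String.iter (fun c -> if c < '0' || c > '9' then ok := false) s;
  !ok
```

Under this model, test digits "year 2024" is false while contains digits "year 2024" is true.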
Use search r s to search with r in a confluently persistent sequence s for the first accepted subsequence. Returns None if s does not contain a matching subsequence. Otherwise, returns Some (start, limit) where start is the index of the first matching subsequence, and limit is the index after the end of the longest matching subsequence.
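The (start, limit) result describes a half-open interval: the leftmost position where a match begins, extended to the longest match from that position. A brute-force sketch of that rule follows. It assumes a predicate over plain strings in place of a compiled regular expression and a string in place of the persistent sequence. The helper is_digits is hypothetical, and the library is not expected to search this way.

```ocaml
(* Toy model of leftmost-longest search over a plain string. *)
let search (accepts : string -> bool) (s : string) : (int * int) option =
  let n = String.length s in
  (* Largest limit such that s[start..limit) is accepted, if any. *)
  let longest_from start =
    let rec go limit =
      if limit < start then None
      else if accepts (String.sub s start (limit - start)) then Some limit
      else go (limit - 1)
    in
    go n
  in
  (* Try each start position from left to right; stop at the first hit. *)
  let rec scan start =
    if start > n then None
    else
      match longest_from start with
      | Some limit -> Some (start, limit)
      | None -> scan (start + 1)
  in
  scan 0

(* Accepts one or more ASCII digits, as the pattern [0-9]+ would. *)
let is_digits (s : string) : bool =
  let ok = ref (s <> "") in
  String.iter (fun c -> if c < '0' || c > '9' then ok := false) s;
  !ok
```

In this model, searching "ab123cd45" for digits yields Some (2, 5): the match starts at index 2 and limit 5 is one past the final digit of "123".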
val split : t -> Ucs_text.t -> Ucs_text.t Seq.t
Use split r s to split s into a sequence of slices comprising the substrings in s that are separated by disjoint substrings matching r, which are found by searching from left to right. If r does not match any substring in s, then a sequence containing just s is returned, even if s is an empty slice.
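The splitting rule can be sketched on plain strings. This toy model assumes the separator is a maximal run of characters satisfying a hypothetical predicate is_sep, roughly what a pattern such as [, ]+ would match. It is not the library's implementation. How the library treats a separator at the very start or end of s is not stated above, so the empty edge slices this sketch produces in that case are an assumption.

```ocaml
(* Toy model of split over a plain string with a character-class separator. *)
let split (is_sep : char -> bool) (s : string) : string Seq.t =
  let n = String.length s in
  (* Advance past a maximal run of separator characters. *)
  let rec skip_sep i = if i < n && is_sep s.[i] then skip_sep (i + 1) else i in
  (* Walk left to right; `from` marks where the current slice began. *)
  let rec go from i acc =
    if i >= n then List.rev (String.sub s from (n - from) :: acc)
    else if is_sep s.[i] then
      let next = skip_sep i in
      go next next (String.sub s from (i - from) :: acc)
    else go from (i + 1) acc
  in
  List.to_seq (go 0 0 [])

(* Hypothetical separator class: commas and spaces. *)
let comma_or_space c = c = ',' || c = ' '
```

As in the description above, a text with no separator comes back as a one-element sequence, including when the text is empty.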
val quote : Ucs_text.t -> Ucs_text.t
Use quote s to make a copy of s by converting all the special characters into escape sequences.
val unquote : Ucs_text.t -> Ucs_text.t
Use unquote s to make a copy of s by converting all the escape sequences into ordinary characters.
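The relationship between quote and unquote can be sketched as a round trip over plain strings. The set of special characters and the backslash escape syntax used here are assumptions made for illustration; the library's actual set and syntax are not stated above and may differ.

```ocaml
(* Hypothetical set of special characters, chosen for illustration only. *)
let is_special (c : char) : bool = String.contains "\\.*+?|()[]{}^$" c

(* quote: prefix every special character with a backslash. *)
let quote (s : string) : string =
  let b = Buffer.create (String.length s * 2) in
  String.iter
    (fun c ->
      if is_special c then Buffer.add_char b '\\';
      Buffer.add_char b c)
    s;
  Buffer.contents b

(* unquote: drop each escaping backslash and keep the character after it. *)
let unquote (s : string) : string =
  let n = String.length s in
  let b = Buffer.create n in
  let rec go i =
    if i < n then
      if s.[i] = '\\' && i + 1 < n then begin
        Buffer.add_char b s.[i + 1];
        go (i + 2)
      end else begin
        Buffer.add_char b s.[i];
        go (i + 1)
      end
  in
  go 0;
  Buffer.contents b
```

In this model unquote (quote s) = s for every s. With the library, a plausible use is Ucs_regx.(of_text (quote literal)) to build an expression that matches literal verbatim, which type-checks against the signatures listed here.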