package orsetto
Module Ucs_text
Unicode texts encoded in UTF-8 as string.
Types
Functions
val nil : t
A distinguished empty text.
Use of_seq s to compose a text by consuming s. Raises Failure if more than Sys.max_string_length octets are required.
val of_string : string -> t
Use of_string s to compose a text from the octets in s. Raises Invalid_argument if the octets do not encode a valid Unicode text with UTF-8.
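The validity check described above can be approximated with the standard library alone. The following is a minimal sketch, not orsetto's implementation: checked_utf8 is a hypothetical helper, and it assumes OCaml 4.14 or later for String.is_valid_utf_8.

```ocaml
(* Assumption: OCaml >= 4.14, which provides String.is_valid_utf_8. *)

(* Hypothetical helper: return the string unchanged when its octets are
   valid UTF-8, otherwise raise Invalid_argument. *)
let checked_utf8 (s : string) : string =
  if String.is_valid_utf_8 s then s
  else invalid_arg "checked_utf8: not valid UTF-8"

(* "café" with U+00E9 encoded as the two octets C3 A9. *)
let ok = checked_utf8 "caf\xC3\xA9"

(* A lone E9 octet (Latin-1 "é") is not a complete UTF-8 sequence. *)
let rejected =
  match checked_utf8 "caf\xE9" with
  | _ -> false
  | exception Invalid_argument _ -> true
```
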
val of_slice : string Cf_slice.t -> t
Use of_slice sl to compose a text from the octets in sl. Raises Invalid_argument if the octets do not encode a valid Unicode text with UTF-8.
val length : t -> int
Use length t to count the number of code points in t.
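Counting code points differs from counting octets whenever the text contains non-ASCII characters. A rough illustration using only the standard library (OCaml 4.14 or later assumed); count_code_points is a hypothetical name and says nothing about how the library computes length internally.

```ocaml
(* Assumption: OCaml >= 4.14 for String.get_utf_8_uchar and
   Uchar.utf_decode_length. *)

(* Hypothetical helper: walk the string one decoded sequence at a time and
   count how many sequences were seen. *)
let count_code_points (s : string) : int =
  let n = String.length s in
  let rec loop i count =
    if i >= n then count
    else
      let d = String.get_utf_8_uchar s i in
      loop (i + Uchar.utf_decode_length d) (count + 1)
  in
  loop 0 0

let sample = "caf\xC3\xA9"                 (* "café" *)
let octets = String.length sample          (* 5 octets *)
let points = count_code_points sample      (* 4 code points *)
```
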
Use sub t pos len to obtain a fresh text comprising the len code points after the first pos code points.
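Because positions are measured in code points rather than octets, extracting a range means first translating both bounds into byte offsets. The sketch below shows that idea on a plain UTF-8 string with the standard library (OCaml 4.14 or later assumed). The names advance and sub_code_points are hypothetical, and the out-of-bounds behaviour shown here is an assumption, since the documentation above does not state what sub raises.

```ocaml
(* Assumption: OCaml >= 4.14 and a valid UTF-8 input string. *)

(* Byte offset reached after skipping [k] code points starting at byte [i]. *)
let rec advance (s : string) (i : int) (k : int) : int =
  if k = 0 then i
  else if i >= String.length s then invalid_arg "advance: out of bounds"
  else
    let d = String.get_utf_8_uchar s i in
    advance s (i + Uchar.utf_decode_length d) (k - 1)

(* Hypothetical helper: the [len] code points after the first [pos]. *)
let sub_code_points (s : string) (pos : int) (len : int) : string =
  if pos < 0 || len < 0 then invalid_arg "sub_code_points";
  let start = advance s 0 pos in
  let stop = advance s start len in
  String.sub s start (stop - start)

(* "naïve": skip 1 code point, take 3, giving "aïv". *)
let middle = sub_code_points "na\xC3\xAFve" 1 3
```
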
Use equal a b to compare the octets in a and b for equivalence. Note: may return false even when a and b are canonically equivalent, if the two texts are not first transformed to the same normalization form.
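The caveat about canonical equivalence is easy to see with "é", which can be written either as the single code point U+00E9 or as U+0065 followed by the combining acute accent U+0301. The sketch below uses plain strings and String.equal as a stand-in for octet comparison; it does not call the library.

```ocaml
(* Two canonically equivalent spellings of "é". *)
let precomposed = "\xC3\xA9"     (* U+00E9, NFC form *)
let decomposed = "e\xCC\x81"     (* U+0065 U+0301, NFD form *)

(* Octet-wise comparison sees two different byte sequences. *)
let same_octets = String.equal precomposed decomposed   (* false *)
```

Normalizing both texts to the same form before comparing avoids this mismatch.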
Use compare a b to compare a and b for the total order defined by comparing the octet strings of their UTF-8 encoding. As with equal, this may return non-zero even when a and b are canonically equivalent, when the two texts are not first transformed to the same normalization form.
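One consequence of ordering by UTF-8 octets is that canonically equivalent texts can sort far apart. The following sketch uses String.compare on plain strings as a stand-in for the octet-wise order described above; it is an illustration only and does not use the library's own function.

```ocaml
let precomposed = "\xC3\xA9"     (* U+00E9 *)
let decomposed = "e\xCC\x81"     (* U+0065 U+0301, canonically equivalent *)

(* String.compare orders by unsigned octet value: 'e' (0x65) sorts before
   'z' (0x7A), which sorts before the lead octet 0xC3. *)
let sorted = List.sort String.compare [ precomposed; "z"; decomposed ]

(* The two spellings of "é" end up on opposite sides of "z". *)
let split_apart = (sorted = [ decomposed; "z"; precomposed ])
```
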
val is_normalized : ?nf:(module Ucs_normal.Profile) -> t -> bool
Use is_normalized ?nf t to test whether t is normalized according to nf. If ~nf is not used, then NFC is assumed.
val normalize : ?nf:(module Ucs_normal.Profile) -> t -> t
Use normalize ?nf t to produce the equivalent text normalized according to nf. If t is already normalized, then the result t' is the identical text, i.e. t == t'. If ~nf is not used, then NFC is assumed.
val encode_scheme :
?utf:(module Ucs_transport.Profile) ->
unit ->
t Cf_encode.scheme
Use encode_scheme ?utf () to make an encoding scheme that emits a text encoded according to utf. If ~utf is not used, then UTF-8 is assumed.
val decode_scheme :
?utf:(module Ucs_transport.Profile) ->
int ->
t Cf_decode.scheme
Use decode_scheme ?utf n to make a decoding scheme that scans n octets encoded according to utf to produce a text comprising those code points. If ~utf is not used, then UTF-8 is assumed.
Raises Invalid_argument if n < 0. Raises Cf_decode.Invalid if the octets are not a valid encoding according to the transport form.
module Unsafe : sig ... end