package containers

A modular, clean and powerful extension of the OCaml standard library

Sources

v3.8.tar.gz
md5=f1c717c9a1015e81253f226ae594f547
sha512=7640b6af5a61e53e52eac51f237a06c5c21597374481af218cf0601c2b9059b96254058b92adb73ce20b1dece4ccaffb99d1b29b235c4dc954619738d8d0de40


Module CCUtf8_string

Unicode String, in UTF8

A unicode string represented by a utf8 bytestring. This representation is convenient for manipulating normal OCaml strings that are encoded in UTF8.

We perform only basic decoding and encoding between codepoints and bytestrings. For more elaborate operations, please use the excellent Uutf.

status: experimental

  • since 2.1
type uchar = Uchar.t
type 'a gen = unit -> 'a option
type 'a iter = ('a -> unit) -> unit

Fast internal iterator.

  • since 2.8
type t = private string

A UTF8 string

val equal : t -> t -> bool
val hash : t -> int
val compare : t -> t -> int
val pp : Format.formatter -> t -> unit
val to_string : t -> string

Identity.

exception Malformed of string * int

Malformed string at given offset

val to_gen : ?idx:int -> t -> uchar gen

Generator of unicode codepoints.

  • parameter idx

    offset where to start the decoding.

val to_iter : ?idx:int -> t -> uchar iter

Iterator of unicode codepoints.

  • parameter idx

    offset where to start the decoding.

  • since 2.8
val to_seq : ?idx:int -> t -> uchar Seq.t

Sequence of unicode codepoints. Renamed from to_std_seq since 3.0.

  • parameter idx

    offset where to start the decoding.

  • since 3.0
val to_list : ?idx:int -> t -> uchar list

List of unicode codepoints.

  • parameter idx

    offset where to start the decoding.
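The four decoding functions above differ only in how the codepoints are delivered. The following is a minimal sketch, assuming the containers library is linked and that idx is a byte offset into the underlying string; the names s, all, tail, g and count are illustrative only.

```ocaml
(* Requires the containers library (module CCUtf8_string). *)
module U = CCUtf8_string

(* "hé" with explicit byte escapes: 'h' is 1 byte, U+00E9 is 2 bytes. *)
let s : U.t = U.of_string_exn "h\xc3\xa9"

(* Eager decoding into a list of codepoints. *)
let all : int list = List.map Uchar.to_int (U.to_list s)

(* Start decoding after the first byte, which skips 'h'. *)
let tail : int list = List.map Uchar.to_int (U.to_list ~idx:1 s)

(* A generator is a closure: each call yields the next codepoint,
   then None once the string is exhausted. *)
let g = U.to_gen s
let first = g ()
let second = g ()
let third = g ()

(* An iter pushes each codepoint into a callback. *)
let count = ref 0
let () = U.to_iter s (fun _ -> incr count)

(* A Seq.t is lazy; List.of_seq forces it. *)
let via_seq : int list = List.map Uchar.to_int (List.of_seq (U.to_seq s))
```

Here all and via_seq hold the same two codepoints, while tail holds only the second one.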

val fold : ?idx:int -> ('a -> uchar -> 'a) -> 'a -> t -> 'a
val iter : ?idx:int -> (uchar -> unit) -> t -> unit
val n_chars : t -> int

Number of characters.

val n_bytes : t -> int

Number of bytes.
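The two lengths differ as soon as the string contains a non-ASCII codepoint. A small sketch, assuming the containers library is linked; the names s, chars, bytes and folded are illustrative.

```ocaml
(* Requires the containers library (module CCUtf8_string). *)
module U = CCUtf8_string

(* "héllo": five codepoints, but 'é' (U+00E9) takes two bytes. *)
let s = U.of_string_exn "h\xc3\xa9llo"

let chars = U.n_chars s   (* 5 *)
let bytes = U.n_bytes s   (* 6 *)

(* Counting codepoints by hand with fold gives the same result as n_chars. *)
let folded = U.fold (fun acc _ -> acc + 1) 0 s
```

Note that n_chars counts codepoints, so a character written as a base letter plus a combining mark counts as two.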

val map : (uchar -> uchar) -> t -> t
val filter_map : (uchar -> uchar option) -> t -> t
val flat_map : (uchar -> t) -> t -> t
val empty : t

Empty string.

  • since 3.5
val append : t -> t -> t

Append two strings together.

val concat : t -> t list -> t

concat sep l concatenates each string in l, inserting sep in between each string. Similar to String.concat.
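As a brief illustration of the argument order (separator first, then the list), here is a sketch assuming the containers library is linked; sep, parts, joined and ab are illustrative names.

```ocaml
(* Requires the containers library (module CCUtf8_string). *)
module U = CCUtf8_string

let sep = U.of_string_exn ", "
let parts = List.map U.of_string_exn [ "a"; "b"; "c" ]

(* The separator goes between elements, not after the last one. *)
let joined = U.to_string (U.concat sep parts)   (* "a, b, c" *)

(* append joins exactly two strings with no separator. *)
let ab = U.to_string (U.append (U.of_string_exn "a") (U.of_string_exn "b"))
```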

val of_uchar : uchar -> t

of_uchar c is a string with only one unicode char in it.

  • since 3.5
val make : int -> uchar -> t

make n c makes a new string with n copies of c in it.

  • since 3.5
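A short sketch of of_uchar and make, assuming the containers library is linked (version 3.5 or later for these two functions); e_acute, one and three are illustrative names.

```ocaml
(* Requires the containers library (module CCUtf8_string), >= 3.5. *)
module U = CCUtf8_string

let e_acute = Uchar.of_int 0xE9   (* U+00E9, 'é' *)

(* A string holding a single codepoint, encoded as two bytes. *)
let one = U.of_uchar e_acute

(* Three copies of the same codepoint: 3 characters, 6 bytes. *)
let three = U.make 3 e_acute
```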
val of_seq : uchar Seq.t -> t

Build a string from unicode codepoints. Renamed from of_std_seq since 3.0.

  • since 3.0
val of_iter : uchar iter -> t

Build a string from unicode codepoints.

  • since 2.8
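The builders mirror the decoders: each takes codepoints from a different kind of producer and returns the same encoded string. A minimal sketch, assuming the containers library is linked; cps and the from_* names are illustrative, and of_list is the list-based builder declared in this module.

```ocaml
(* Requires the containers library (module CCUtf8_string). *)
module U = CCUtf8_string

(* The codepoints of "hé". *)
let cps = List.map Uchar.of_int [ 0x68; 0xE9 ]

let from_list = U.of_list cps
let from_seq = U.of_seq (List.to_seq cps)

(* An iter is any function that feeds elements to a callback. *)
let from_iter = U.of_iter (fun k -> List.iter k cps)
```

All three values encode to the bytes "h\xc3\xa9".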
val uchar_to_bytes : uchar -> char iter

Translate the unicode codepoint into its utf-8 bytes, passed one by one to the iterator's callback. This can be used, for example, in combination with Buffer.add_char on a pre-allocated buffer to add the bytes one by one (despite its name, Buffer.add_char takes individual bytes, not unicode codepoints).

  • since 3.2
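The Buffer.add_char combination described above can be sketched as follows, assuming the containers library is linked (version 3.2 or later); buf and euro are illustrative names.

```ocaml
(* Requires the containers library (module CCUtf8_string), >= 3.2. *)
module U = CCUtf8_string

(* A pre-allocated buffer that receives the encoded bytes. *)
let buf = Buffer.create 16

(* U+20AC (euro sign) encodes to three utf-8 bytes; each one is
   handed to the callback, which appends it to the buffer. *)
let () =
  U.uchar_to_bytes (Uchar.of_int 0x20AC) (fun byte -> Buffer.add_char buf byte)

let euro = Buffer.contents buf   (* "\xe2\x82\xac" *)
```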
val of_gen : uchar gen -> t
val of_list : uchar list -> t
val of_string_exn : string -> t

Validate string by checking it is valid UTF8.

val of_string : string -> t option

Safe version of of_string_exn.

val is_valid : string -> bool

Check whether the string is valid UTF8.
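A sketch of the validating entry points, assuming the containers library is linked. The byte 0xFF never appears in well-formed UTF8, so it serves as the invalid input here; good, bad and the other names are illustrative.

```ocaml
(* Requires the containers library (module CCUtf8_string). *)
module U = CCUtf8_string

let good = "h\xc3\xa9llo"   (* well-formed: "héllo" *)
let bad = "\xff"            (* 0xFF is never valid in UTF8 *)

(* is_valid only answers yes or no. *)
let good_ok = U.is_valid good
let bad_ok = U.is_valid bad

(* of_string validates and wraps the result in an option. *)
let parsed : U.t option = U.of_string good
let rejected : U.t option = U.of_string bad
```

of_string is usually the more convenient choice for untrusted input, since a failed validation shows up as None rather than as an exception.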

val unsafe_of_string : string -> t

Conversion from a string without validating. CAUTION this is unsafe and can break all the other functions in this module. Use only if you're sure the string is valid UTF8. Upon iteration, if an invalid substring is met, Malformed will be raised.
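The failure mode described above can be observed with a deliberately invalid input. This is a minimal sketch, assuming the containers library is linked; suspicious and outcome are illustrative names, and the exact offset reported is not asserted here.

```ocaml
(* Requires the containers library (module CCUtf8_string). *)
module U = CCUtf8_string

(* No validation happens here, so the invalid byte is accepted. *)
let suspicious : U.t = U.unsafe_of_string "\xff"

(* The problem only surfaces once the string is decoded. *)
let outcome =
  match U.to_list suspicious with
  | _ -> "decoded"
  | exception U.Malformed (_, offset) ->
    Printf.sprintf "malformed at byte offset %d" offset
```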
