package textutils_kernel


Module Utf8_text

Text is text encoded in UTF-8.

Under the hood, this is just a String.t, but the type is abstract so that the compiler will remind us not to use String.length when we mean Text.width.
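The effect of keeping the type abstract can be sketched with a small stand-in module. This is an illustration using only the OCaml standard library (4.14 or later), not the library's actual implementation; the module name `Text` and its contents are hypothetical.

```ocaml
(* Hypothetical stand-in for Utf8_text: a string behind an abstract type. *)
module Text : sig
  type t
  val of_string : string -> t
  val to_string : t -> string
  val width : t -> int   (* number of code points *)
  val bytes : t -> int   (* number of bytes in the UTF-8 encoding *)
end = struct
  type t = string
  let of_string s = s
  let to_string t = t
  let bytes = String.length
  let width t =
    (* Walk the string one decoded code point at a time. *)
    let rec go i n =
      if i >= String.length t then n
      else
        let d = String.get_utf_8_uchar t i in
        go (i + Uchar.utf_decode_length d) (n + 1)
    in
    go 0 0
end

let () =
  (* "caf\xc3\xa9" is "café": 4 code points, 5 bytes. *)
  let t = Text.of_string "caf\xc3\xa9" in
  (* [String.length t] would be rejected by the type checker here,
     because outside the module [Text.t] is not known to be a string. *)
  Printf.printf "width=%d bytes=%d\n" (Text.width t) (Text.bytes t)
  (* prints: width=4 bytes=5 *)
```

Callers that really want the byte count have to ask for it by name, which is the reminder the abstract type is meant to provide.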

type t
include Ppx_compare_lib.Comparable.S with type t := t
val compare : t Base__Ppx_compare_lib.compare
include Ppx_quickcheck_runtime.Quickcheckable.S with type t := t
val quickcheck_generator : t Base_quickcheck.Generator.t
val quickcheck_observer : t Base_quickcheck.Observer.t
val quickcheck_shrinker : t Base_quickcheck.Shrinker.t
val sexp_of_t : t -> Sexplib0.Sexp.t

The invariant is that t is a sequence of well-formed UTF-8 code points.

include Core.Invariant.S with type t := t
val invariant : t Base__Invariant_intf.inv
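As a rough picture of what checking this invariant involves, the sketch below validates a raw string with the standard library decoder (OCaml 4.14 or later). The function name `check_utf8` and its error message are hypothetical, and the library's own `invariant` may report failures differently.

```ocaml
(* Returns [Ok ()] if [s] is well-formed UTF-8, otherwise the byte offset
   of the first malformed sequence. Stdlib-only sketch. *)
let check_utf8 (s : string) : (unit, string) result =
  let len = String.length s in
  let rec go i =
    if i >= len then Ok ()
    else
      let d = String.get_utf_8_uchar s i in
      if Uchar.utf_decode_is_valid d
      then go (i + Uchar.utf_decode_length d)
      else Error (Printf.sprintf "invalid UTF-8 at byte offset %d" i)
  in
  go 0
```

For a plain yes/no answer, `String.is_valid_utf_8` from the same standard library version does the equivalent check.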
include Core.Container.S0 with type t := t with type elt := Core.Uchar.t
val mem : t -> Core.Uchar.t -> bool
val length : t -> int
val is_empty : t -> bool
val iter : t -> f:(Core.Uchar.t -> unit) -> unit
val fold : t -> init:'accum -> f:('accum -> Core.Uchar.t -> 'accum) -> 'accum
val fold_result : t -> init:'accum -> f:('accum -> Core.Uchar.t -> ('accum, 'e) Base__.Result.t) -> ('accum, 'e) Base__.Result.t
val fold_until : t -> init:'accum -> f: ('accum -> Core.Uchar.t -> ('accum, 'final) Base__Container_intf.Continue_or_stop.t) -> finish:('accum -> 'final) -> 'final
val exists : t -> f:(Core.Uchar.t -> bool) -> bool
val for_all : t -> f:(Core.Uchar.t -> bool) -> bool
val count : t -> f:(Core.Uchar.t -> bool) -> int
val sum : (module Base__Container_intf.Summable with type t = 'sum) -> t -> f:(Core.Uchar.t -> 'sum) -> 'sum
val find : t -> f:(Core.Uchar.t -> bool) -> Core.Uchar.t option
val find_map : t -> f:(Core.Uchar.t -> 'a option) -> 'a option
val to_list : t -> Core.Uchar.t list
val to_array : t -> Core.Uchar.t array
val min_elt : t -> compare:(Core.Uchar.t -> Core.Uchar.t -> int) -> Core.Uchar.t option
val max_elt : t -> compare:(Core.Uchar.t -> Core.Uchar.t -> int) -> Core.Uchar.t option
include Core.Stringable.S with type t := t
val of_string : string -> t
val to_string : t -> string
val width : t -> int

width t approximates the displayed width of t.

We incorrectly assume that every code point has the same width. This is better than String.length for many code points, but doesn't work for double-width characters or combining diacritics.
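To make the limitation concrete, the sketch below counts code points with the standard library (OCaml 4.14 or later) and compares that with the byte length. The helper `count_code_points` is a hypothetical stand-in for the approximation described above, not the library's code.

```ocaml
(* Count code points by decoding one UTF-8 sequence at a time. *)
let count_code_points (s : string) : int =
  let rec go i n =
    if i >= String.length s then n
    else
      let d = String.get_utf_8_uchar s i in
      go (i + Uchar.utf_decode_length d) (n + 1)
  in
  go 0 0

let () =
  List.iter
    (fun (label, s) ->
      Printf.printf "%s: code points=%d bytes=%d\n"
        label (count_code_points s) (String.length s))
    [ "ascii", "abc"                    (* 3 columns on screen *)
    ; "precomposed", "\u{e9}"           (* e-acute as one code point: 1 column *)
    ; "combining", "e\u{301}"           (* e + combining acute: still 1 column *)
    ; "cjk", "\u{65e5}\u{672c}"         (* two ideographs: typically 4 columns *)
    ]
(* prints:
   ascii: code points=3 bytes=3
   precomposed: code points=1 bytes=2
   combining: code points=2 bytes=3
   cjk: code points=2 bytes=6 *)
```

The first two rows are cases where counting code points beats `String.length`; the last two are the combining-diacritic and double-width cases where it overestimates or underestimates the displayed width.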

val bytes : t -> int

bytes t is the number of bytes in the UTF-8 encoding of t.

val chunks_of : t -> width:int -> prefer_split_on_spaces:bool -> t list

chunks_of t ~width splits t into chunks no wider than width characters, such that

t = t |> chunks_of ~width |> concat

chunks_of always returns at least one chunk, which may be empty.

If prefer_split_on_spaces = true and such a space exists, t will be split on the last U+0020 SPACE before the chunk becomes too wide. Otherwise, the split happens exactly at width characters.
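The splitting rule can be sketched on plain strings with the standard library (OCaml 4.14 or later). This is one reading of the description, not the library's implementation. It assumes valid UTF-8 input, a positive width, and that the space stays at the end of the earlier chunk so that concatenating the chunks gives back the input. The helper names are hypothetical.

```ocaml
(* Decode a UTF-8 string into an array of code points. *)
let uchars_of_string s =
  let rec go i acc =
    if i >= String.length s then List.rev acc
    else
      let d = String.get_utf_8_uchar s i in
      go (i + Uchar.utf_decode_length d) (Uchar.utf_decode_uchar d :: acc)
  in
  Array.of_list (go 0 [])

(* Encode an array of code points back to a UTF-8 string. *)
let string_of_uchars a =
  let b = Buffer.create (Array.length a) in
  Array.iter (Buffer.add_utf_8_uchar b) a;
  Buffer.contents b

let chunks_of s ~width ~prefer_split_on_spaces =
  if width <= 0 then invalid_arg "chunks_of: width must be positive";
  let cps = uchars_of_string s in
  let n = Array.length cps in
  let space = Uchar.of_int 0x20 in
  let chunk start stop = string_of_uchars (Array.sub cps start (stop - start)) in
  let rec go start acc =
    if n - start <= width then List.rev (chunk start n :: acc)
    else
      (* [hard] is the exclusive end of a chunk of exactly [width] code points. *)
      let hard = start + width in
      let stop =
        if not prefer_split_on_spaces then hard
        else
          (* Search backwards for the last space inside the candidate chunk;
             assumption: the chunk ends just after that space. *)
          let rec find i =
            if i < start then hard
            else if Uchar.equal cps.(i) space then i + 1
            else find (i - 1)
          in
          find (hard - 1)
      in
      go stop (chunk start stop :: acc)
  in
  go 0 []
```

With this sketch, `chunks_of "hello world foo" ~width:7 ~prefer_split_on_spaces:true` yields `["hello "; "world "; "foo"]`, while passing `false` yields `["hello w"; "orld fo"; "o"]`. An empty input yields a single empty chunk, matching the "at least one chunk" guarantee.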

val of_uchar_list : Core.Uchar.t list -> t
val concat : ?sep:t -> t list -> t
val iteri : t -> f:(int -> Core.Uchar.t -> unit) -> unit

iteri t ~f calls f index uchar for every uchar in t. index counts characters, not bytes.
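The difference between a character index and a byte offset shows up as soon as the text contains a multi-byte code point. The sketch below mimics the behaviour on a plain string with the standard library (OCaml 4.14 or later); `iteri_uchars` is a hypothetical helper, not the library function.

```ocaml
(* Call [f index uchar] for each code point; [index] counts code points. *)
let iteri_uchars (s : string) ~(f : int -> Uchar.t -> unit) : unit =
  let rec go byte_pos index =
    if byte_pos < String.length s then begin
      let d = String.get_utf_8_uchar s byte_pos in
      f index (Uchar.utf_decode_uchar d);
      go (byte_pos + Uchar.utf_decode_length d) (index + 1)
    end
  in
  go 0 0

let () =
  (* "a", then U+00E9 (two bytes), then "b". *)
  iteri_uchars "a\u{e9}b" ~f:(fun i u ->
    Printf.printf "%d: U+%04X\n" i (Uchar.to_int u))
(* prints:
   0: U+0061
   1: U+00E9
   2: U+0062 *)
```

Here "b" sits at byte offset 3 in the encoded string but is reported with index 2.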

val split : t -> on:char -> t list

split t ~on returns the substrings between and not including occurrences of on. on must be an ASCII char (in range '\000' to '\127').
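The ASCII restriction is what makes a byte-level split safe: in UTF-8, bytes below 128 never occur inside a multi-byte sequence, so splitting on an ASCII byte cannot cut a code point in half. A stdlib-only sketch on plain strings follows; `split_ascii` is a hypothetical name, and raising `Invalid_argument` for a non-ASCII separator is an assumption rather than documented library behaviour.

```ocaml
(* Split a UTF-8 string on an ASCII separator at the byte level. *)
let split_ascii (s : string) ~(on : char) : string list =
  if Char.code on > 127 then invalid_arg "split_ascii: on must be ASCII";
  String.split_on_char on s

let () =
  (* Each piece is still valid UTF-8: "a", U+00E9, U+65E5. *)
  let parts = split_ascii "a,\u{e9},\u{65e5}" ~on:',' in
  Printf.printf "%d pieces\n" (List.length parts)
  (* prints: 3 pieces *)
```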
