Module `Utf8_text`

Text is text encoded in UTF-8.

Under the hood, this is just a String.t, but the type is abstract so that the compiler will remind us not to use String.length when we mean Text.width.
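The effect of the abstract type can be sketched with a small stand-in module. `Text` below is a hypothetical, stdlib-only model (OCaml 4.14 or later), not the library's implementation: its signature hides that `t` is a string, so `String.length` cannot be applied to it by accident.

```ocaml
(* Hypothetical stand-in for Utf8_text: the signature hides that t is a string. *)
module Text : sig
  type t
  val of_string : string -> t
  val to_string : t -> string
  val width : t -> int
end = struct
  type t = string
  let of_string s = s
  let to_string t = t

  (* Count code points by decoding one UTF-8 sequence at a time. *)
  let width t =
    let rec go i n =
      if i >= String.length t then n
      else
        let d = String.get_utf_8_uchar t i in
        go (i + Uchar.utf_decode_length d) (n + 1)
    in
    go 0 0
end

(* "café": the final é (U+00E9) takes two bytes in UTF-8. *)
let t = Text.of_string "caf\xc3\xa9"

let () = Printf.printf "width = %d\n" (Text.width t)

(* [String.length t] does not type-check here, because Text.t is not string.
   Getting the byte length requires an explicit conversion: *)
let () = Printf.printf "bytes = %d\n" (String.length (Text.to_string t))
```

The explicit `to_string` call is the point: reaching for the byte length becomes a visible decision rather than a silent mistake.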

type t

include Ppx_compare_lib.Comparable.S with type t := t

val compare : t Base__Ppx_compare_lib.compare

include Ppx_quickcheck_runtime.Quickcheckable.S with type t := t

val quickcheck_generator : t Base_quickcheck.Generator.t

val quickcheck_observer : t Base_quickcheck.Observer.t

val quickcheck_shrinker : t Base_quickcheck.Shrinker.t

val sexp_of_t : t -> Sexplib0.Sexp.t

The invariant is that t is a sequence of well-formed UTF-8 code points.

include Core.Invariant.S with type t := t

val invariant : t Base__Invariant_intf.inv
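The property stated by the invariant can be illustrated with the standard library alone. The sketch below is not the library's `invariant` function; `is_well_formed` is a hypothetical helper that wraps `String.is_valid_utf_8` (OCaml 4.14 or later) to show which byte strings count as well-formed UTF-8.

```ocaml
(* Hypothetical helper: true when every byte of s belongs to a valid
   UTF-8 encoded code point. *)
let is_well_formed (s : string) : bool = String.is_valid_utf_8 s

(* "café" with é encoded as the two bytes C3 A9. *)
let ok = is_well_formed "caf\xc3\xa9"

(* The lead byte C3 is missing its continuation byte. *)
let truncated = is_well_formed "caf\xc3"

(* C0 AF is an overlong encoding of '/', which UTF-8 forbids. *)
let overlong = is_well_formed "\xc0\xaf"

let () =
  Printf.printf "ok=%b truncated=%b overlong=%b\n" ok truncated overlong
```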

include Core.Container.S0 with type t := t with type elt := Core.Uchar.t

val mem : t -> Core.Uchar.t -> bool

val length : t -> int

val is_empty : t -> bool

val iter : t -> f:(Core.Uchar.t -> unit) -> unit

val fold : t -> init:'accum -> f:('accum -> Core.Uchar.t -> 'accum) -> 'accum

val fold_result : 
  t ->
  init:'accum ->
  f:('accum -> Core.Uchar.t -> ('accum, 'e) Base__.Result.t) ->
  ('accum, 'e) Base__.Result.t

val fold_until : 
  t ->
  init:'accum ->
  f:
    ('accum ->
      Core.Uchar.t ->
      ('accum, 'final) Base__Container_intf.Continue_or_stop.t) ->
  finish:('accum -> 'final) ->
  'final

val exists : t -> f:(Core.Uchar.t -> bool) -> bool

val for_all : t -> f:(Core.Uchar.t -> bool) -> bool

val count : t -> f:(Core.Uchar.t -> bool) -> int

val sum : 
  (module Base__Container_intf.Summable with type t = 'sum) ->
  t ->
  f:(Core.Uchar.t -> 'sum) ->
  'sum

val find : t -> f:(Core.Uchar.t -> bool) -> Core.Uchar.t option

val find_map : t -> f:(Core.Uchar.t -> 'a option) -> 'a option

val to_list : t -> Core.Uchar.t list

val to_array : t -> Core.Uchar.t array

val min_elt : 
  t ->
  compare:(Core.Uchar.t -> Core.Uchar.t -> int) ->
  Core.Uchar.t option

val max_elt : 
  t ->
  compare:(Core.Uchar.t -> Core.Uchar.t -> int) ->
  Core.Uchar.t option

include Core.Stringable.S with type t := t

val of_string : string -> t

val to_string : t -> string

val width : t -> int

width t approximates the displayed width of t.

We incorrectly assume that every code point has the same width. This is better than String.length for many code points, but doesn't work for double-width characters or combining diacritics.
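To make the approximation concrete, the sketch below counts code points with the standard library (OCaml 4.14 or later) and compares the result with `String.length`. It assumes that `width` amounts to a code point count, as the description above suggests; `count_code_points` is a hypothetical helper, not the library's code.

```ocaml
(* Hypothetical helper: number of code points in a UTF-8 string. *)
let count_code_points (s : string) : int =
  let rec go i n =
    if i >= String.length s then n
    else go (i + Uchar.utf_decode_length (String.get_utf_8_uchar s i)) (n + 1)
  in
  go 0 0

(* U+00E9, precomposed e-acute: one code point, one column. *)
let precomposed = "\xc3\xa9"

(* U+0065 followed by U+0301 COMBINING ACUTE ACCENT: two code points,
   but it renders in a single column. *)
let combining = "e\xcc\x81"

(* U+65E5, a CJK ideograph: one code point, usually two columns wide. *)
let wide = "\xe6\x97\xa5"

let () =
  List.iter
    (fun s ->
      Printf.printf "code points = %d, String.length = %d\n"
        (count_code_points s) (String.length s))
    [ precomposed; combining; wide ]
```

The first case is where counting code points beats `String.length`; the second overestimates the displayed width and the third underestimates it.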

val bytes : t -> int

bytes t is the number of bytes in the UTF-8 encoding of t.
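As a reminder of why the byte count and the code point count differ, this stdlib-only sketch (OCaml 4.14 or later) sums the UTF-8 encoded size of a few code points. `encoded_bytes` is a hypothetical helper for illustration and does not use the library.

```ocaml
(* Hypothetical helper: total UTF-8 size of a list of code points. *)
let encoded_bytes (us : Uchar.t list) : int =
  List.fold_left (fun acc u -> acc + Uchar.utf_8_byte_length u) 0 us

(* U+0041 (1 byte), U+00E9 (2 bytes), U+20AC (3 bytes), U+1F600 (4 bytes). *)
let sample = List.map Uchar.of_int [ 0x41; 0xE9; 0x20AC; 0x1F600 ]

(* Cross-check by actually encoding the code points into a buffer. *)
let encoded =
  let b = Buffer.create 16 in
  List.iter (Buffer.add_utf_8_uchar b) sample;
  Buffer.contents b

let () =
  Printf.printf "code points = %d, bytes = %d\n"
    (List.length sample) (encoded_bytes sample)
```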

val chunks_of : t -> width:int -> prefer_split_on_spaces:bool -> t list

chunks_of t ~width splits t into chunks no wider than width characters, such that

t = t |> chunks_of ~width |> concat

chunks_of always returns at least one chunk, which may be empty.

If prefer_split_on_spaces = true and such a space exists, t will be split on the last U+0020 SPACE before the chunk becomes too wide. Otherwise, the split happens exactly at width characters.
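The exact-width case and the round-trip property can be sketched with the standard library (OCaml 4.14 or later). This is a simplified model on plain strings, not the library's implementation: `code_points` and `chunks_exact` are hypothetical helpers, and the space-preferring behaviour is deliberately left out.

```ocaml
(* Hypothetical helper: one substring per UTF-8 encoded code point. *)
let code_points (s : string) : string list =
  let rec go i acc =
    if i >= String.length s then List.rev acc
    else
      let n = Uchar.utf_decode_length (String.get_utf_8_uchar s i) in
      go (i + n) (String.sub s i n :: acc)
  in
  go 0 []

(* Hypothetical helper: chunks of at most [width] code points each.
   Always returns at least one chunk; for "" that chunk is empty. *)
let chunks_exact (s : string) ~(width : int) : string list =
  if width <= 0 then invalid_arg "chunks_exact: width must be positive";
  let flush cur acc = String.concat "" (List.rev cur) :: acc in
  let rec go cps cur n acc =
    match cps with
    | [] -> List.rev (flush cur acc)
    | cp :: rest ->
      if n = width then go cps [] 0 (flush cur acc)
      else go rest (cp :: cur) (n + 1) acc
  in
  go (code_points s) [] 0 []

(* "héllo w": seven code points, eight bytes. *)
let input = "h\xc3\xa9llo w"
let chunks = chunks_exact input ~width:3

let () = List.iter (fun c -> Printf.printf "[%s]\n" c) chunks
```

Because the chunks are consecutive slices of the input, concatenating them with an empty separator gives back the original string, which is the property the documentation states.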

val of_uchar_list : Core.Uchar.t list -> t

val concat : ?sep:t -> t list -> t

val iteri : t -> f:(int -> Core.Uchar.t -> unit) -> unit

iteri t ~f calls f index uchar for every uchar in t. index counts characters, not bytes.
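The difference between a character index and a byte offset can be shown with a stdlib-only model (OCaml 4.14 or later). `iteri_code_points` is a hypothetical helper that mirrors the documented behaviour on a plain string; it is not the library function.

```ocaml
(* Hypothetical helper: call f with the code point index (not the byte
   offset) and the decoded code point. *)
let iteri_code_points (s : string) ~(f : int -> Uchar.t -> unit) : unit =
  let rec go byte_pos index =
    if byte_pos < String.length s then begin
      let d = String.get_utf_8_uchar s byte_pos in
      f index (Uchar.utf_decode_uchar d);
      go (byte_pos + Uchar.utf_decode_length d) (index + 1)
    end
  in
  go 0 0

(* "aéb": 'b' sits at byte offset 3 but is the code point with index 2. *)
let seen =
  let acc = ref [] in
  iteri_code_points "a\xc3\xa9b" ~f:(fun i u -> acc := (i, Uchar.to_int u) :: !acc);
  List.rev !acc

let () = List.iter (fun (i, u) -> Printf.printf "%d: U+%04X\n" i u) seen
```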

val split : t -> on:char -> t list

split t ~on returns the substrings between and not including occurrences of on. on must be an ASCII char (in range '\000' to '\127').
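One likely reason for the ASCII restriction is that, in UTF-8, bytes below 128 never occur inside a multi-byte sequence, so splitting on such a byte cannot cut a code point in half. The sketch below illustrates this with `String.split_on_char` from the standard library (OCaml 4.14 or later for the validity check); it is a model of the behaviour, not the library's code.

```ocaml
(* "naïve,café,,日": splitting on ',' keeps every piece well-formed.
   Consecutive separators yield an empty piece. *)
let parts = String.split_on_char ',' "na\xc3\xafve,caf\xc3\xa9,,\xe6\x97\xa5"

let all_valid = List.for_all String.is_valid_utf_8 parts

(* Splitting on a non-ASCII byte can break an encoding: A9 is the second
   byte of é, so the first piece ends with a dangling lead byte C3. *)
let broken = String.split_on_char '\xa9' "caf\xc3\xa9"

let broken_valid = List.for_all String.is_valid_utf_8 broken

let () =
  Printf.printf "pieces = %d, all valid = %b, broken valid = %b\n"
    (List.length parts) all_valid broken_valid
```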

package textutils_kernel