package orsetto
Module Ucs_scan.Create
Use Create(UTF) to make a scanner module for the encoding UTF.
Parameters
module P : Ucs_transport.Profile
Signature
module Annot : Cf_annot.Textual.Unicode.Profile
The annotation system for Unicode encoded texts.
The basic scanner composed with the annotation system.
include Cf_scan.Profile
with type symbol := Uchar.t
and type position := Cf_annot.Textual.position
and type 'a form := 'a Annot.form
include Cf_monad.Unary.Profile with type +'r t := 'r t
Module inclusions from Cf_monad_core and Cf_seqmonad.
include Cf_monad.Core.Unary.Profile with type 'r t := 'r t
val return : 'r -> 'r t
Use return a to apply the binding to a.
Use map m ~f to return the result of applying f to the value returned by m.
Use disregard m to ignore the value returned by m and apply the unit value to the bound function.
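The following is a minimal sketch of what return, map and disregard mean for a scanner. It uses a toy scanner type over a list of code points, which is an assumption made for illustration and not the representation used by orsetto; the helper next is hypothetical.

```ocaml
(* Toy model, NOT the orsetto implementation: a scanner is a function from
   the remaining input to an optional (value, rest) pair. *)
type 'r t = Uchar.t list -> ('r * Uchar.t list) option

(* [return a] produces [a] without consuming input. *)
let return (a : 'r) : 'r t = fun input -> Some (a, input)

(* [map m ~f] applies [f] to the value produced by [m]. *)
let map (m : 'a t) ~(f : 'a -> 'b) : 'b t = fun input ->
  match m input with
  | None -> None
  | Some (v, rest) -> Some (f v, rest)

(* [disregard m] runs [m] and replaces its value with unit. *)
let disregard (m : 'a t) : unit t = map m ~f:ignore

(* Hypothetical helper: consume one code point, if any. *)
let next : Uchar.t t = function
  | [] -> None
  | c :: rest -> Some (c, rest)

let input = [ Uchar.of_char 'a'; Uchar.of_char 'b' ]
let code = map next ~f:Uchar.to_int
```

In this model, code input produces the integer code of the first symbol, while disregard next input consumes the same symbol and produces only the unit value.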
module Infix : sig ... end
Deprecated module alias.
include Cf_seqmonad.Functor.Unary with type 'r t := 'r t
Use collect s to bind in sequence every monad value in the finite sequence s and collect all the returned values. Returns (n, s) where n is the number of values collected and s is the list of values in reverse order, i.e. from last collected to first collected. Never returns and exhausts all memory if s never terminates.
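The reverse ordering of the collected list can be illustrated with a small sketch. The scanner type, bind, and the helper next are assumptions of this toy model rather than orsetto definitions.

```ocaml
(* Toy model, NOT the orsetto implementation. *)
type 'r t = Uchar.t list -> ('r * Uchar.t list) option

let return a : 'r t = fun input -> Some (a, input)

let bind (m : 'a t) (f : 'a -> 'b t) : 'b t = fun input ->
  match m input with
  | None -> None
  | Some (v, rest) -> f v rest

(* Sketch of [collect]: run each scanner of the finite sequence in order,
   counting results and consing them, so the list ends up reversed. *)
let collect (s : 'r t Seq.t) : (int * 'r list) t =
  let rec loop n acc s =
    match s () with
    | Seq.Nil -> return (n, acc)
    | Seq.Cons (m, tl) -> bind m (fun v -> loop (n + 1) (v :: acc) tl)
  in
  loop 0 [] s

(* Hypothetical helper: consume one code point, if any. *)
let next : Uchar.t t = function [] -> None | c :: rest -> Some (c, rest)

let input = List.map Uchar.of_char [ 'x'; 'y'; 'z' ]
let result = collect (List.to_seq [ next; next; next ]) input
```

Here result holds the count 3 together with the symbols in the order z, y, x. Use List.rev on the list if the original order is needed.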
val nil : 'r t
A scanner that never produces any value.
val fin : bool t
A scanner that returns true at the end of the input sequence, otherwise returns false.
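As a rough illustration of nil and fin, the sketch below models them over a toy scanner type. The type is an assumption of this example, and whether the real fin leaves the input untouched is also assumed.

```ocaml
(* Toy model, NOT the orsetto implementation. *)
type 'r t = Uchar.t list -> ('r * Uchar.t list) option

(* [nil] never produces a value, whatever the input. *)
let nil : 'r t = fun _ -> None

(* [fin] always produces: true at end of input, false otherwise.
   Assumption: it does not consume any input. *)
let fin : bool t = fun input -> Some (input = [], input)
```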
Backtracking
val pos : mark -> unit Annot.form
Use pos mark to make a unit value attributed with the position of the captured mark. If the mark was captured at the end of the stream, then the result is an implicit form.
Terminal Scanner
val any : Uchar.t Annot.form t
The universal symbol scanner. Recognizes any symbol in the input stream and produces its form. Does not produce anything at the end of input.
val one : Uchar.t -> Uchar.t Annot.form t
The literal symbol scanner. Use one symbol to make a scanner that recognizes symbol in the input stream and produces its form.
val sat : (Uchar.t -> bool) -> Uchar.t Annot.form t
The symbol satisfier scanner. Use sat f to make a scanner that recognizes any symbol for which applying f returns true and produces its form.
val ign : (Uchar.t -> bool) -> unit Annot.form t
The ignore scanner. Use ign f to make a scanner that scans the input while applying f to each symbol returns true, then produces a unit form that annotates the span of ignored symbols. Produces an implicit unit form if the end of input has already been reached.
val tok : (Uchar.t -> 'r option) -> 'r Annot.form t
The symbolic token scanner. Use tok f to make a scanner that recognizes any symbol for which applying f returns Some v, then produces the form of v.
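The terminal scanners above can be sketched over a toy model in which a form is a value paired with an optional span of input indices. The form record, the state type, and the helpers is_space and digit are assumptions for illustration; the real Annot.form and positions are richer.

```ocaml
(* Toy annotation: [span = None] plays the role of an "implicit" form. *)
type 'a form = { value : 'a; span : (int * int) option }

(* Toy scanner state: current index and remaining code points. *)
type state = int * Uchar.t list
type 'r t = state -> ('r * state) option

let any : Uchar.t form t = function
  | _, [] -> None
  | i, c :: rest -> Some ({ value = c; span = Some (i, i + 1) }, (i + 1, rest))

let sat (f : Uchar.t -> bool) : Uchar.t form t = fun st ->
  match any st with
  | Some (form, _) as r when f form.value -> r
  | _ -> None

let one (symbol : Uchar.t) : Uchar.t form t = sat (Uchar.equal symbol)

let tok (f : Uchar.t -> 'r option) : 'r form t = fun st ->
  match any st with
  | None -> None
  | Some (form, st') ->
    (match f form.value with
     | None -> None
     | Some v -> Some ({ value = v; span = form.span }, st'))

let ign (f : Uchar.t -> bool) : unit form t = fun (start, input) ->
  let rec skip i = function
    | c :: rest when f c -> skip (i + 1) rest
    | rest -> (i, rest)
  in
  match input with
  | [] -> Some ({ value = (); span = None }, (start, []))
  | _ ->
    let stop, rest = skip start input in
    Some ({ value = (); span = Some (start, stop) }, (stop, rest))

(* Hypothetical helpers for the usage below. *)
let is_space c = Uchar.equal c (Uchar.of_char ' ')

let digit =
  tok (fun c ->
      let n = Uchar.to_int c - Char.code '0' in
      if n >= 0 && n <= 9 then Some n else None)

let start : state = (0, List.map Uchar.of_char [ ' '; ' '; '7' ])
```

Applying ign is_space to start skips the two spaces and annotates their span, after which digit produces 7 annotated with the position of that single symbol.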
Scanner Composers
val ntyp : 'r Cf_type.nym -> 'r Annot.form t -> Cf_type.opaque Annot.form t
The opaque value form scanner composer. Use ntyp n p to make a scanner that encloses the value contained in the form produced by p in an opaque value with the runtime type indicated by n and returns its form in the same position.
val dflt : 'r -> 'r Annot.form t -> 'r Annot.form t
The default value scanner. Use dflt v p to produce the output of p or the default value v with implicit annotation if p does not produce output.
The optional scanner composer. Use opt p to make a scanner that produces either Some v if p produces v, or otherwise None.
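A minimal sketch of the difference between dflt and opt follows. The toy types, the signature given to opt, and the helper minus are assumptions of this example and are not taken from orsetto.

```ocaml
(* Toy model, NOT the orsetto implementation. *)
type 'a form = { value : 'a; span : (int * int) option }
type state = int * Uchar.t list
type 'r t = state -> ('r * state) option

(* [dflt v p]: output of [p], else [v] in an implicit (span-less) form. *)
let dflt (v : 'r) (p : 'r form t) : 'r form t = fun st ->
  match p st with
  | Some _ as r -> r
  | None -> Some ({ value = v; span = None }, st)

(* [opt p]: [Some v] if [p] produces [v], otherwise [None]. In this
   model it always produces. *)
let opt (p : 'r t) : 'r option t = fun st ->
  match p st with
  | Some (v, st') -> Some (Some v, st')
  | None -> Some (None, st)

(* Hypothetical helper: recognize an ASCII minus sign. *)
let minus : char form t = function
  | i, c :: rest when Uchar.equal c (Uchar.of_char '-') ->
    Some ({ value = '-'; span = Some (i, i + 1) }, (i + 1, rest))
  | _ -> None

let st : state = (0, [ Uchar.of_char '5' ])
```

With no minus sign at the head of st, dflt '+' minus produces '+' without a span, while opt minus produces None; neither consumes input.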
The visitor scanner composer. Use vis ?a ?b f v to compose a scanner that recognizes a sequence of elements in the input stream by applying a visitor function f at each element to obtain its scanner. The first element is visited with the initializer v, and each following element is visited with the value returned by the preceding scanner.
If ~a is used, then it specifies the minimum number of elements to visit. If ~b is used, then it specifies the maximum number of elements to visit. Composition raises Invalid_argument if a < 0 or b < a.
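The visitor pattern described above behaves like a fold over the input, which the sketch below illustrates by accumulating a decimal number. The signature given to vis here is a guess made for this toy model, since the page does not show it, and the helpers step and chars are hypothetical.

```ocaml
(* Toy model, NOT the orsetto implementation. *)
type 'r t = Uchar.t list -> ('r * Uchar.t list) option

(* Assumed shape: [f] maps the previous value to the next scanner. *)
let vis ?(a = 0) ?b (f : 'r -> 'r t) (v : 'r) : 'r t =
  if a < 0 then invalid_arg "vis: a < 0";
  (match b with Some b when b < a -> invalid_arg "vis: b < a" | _ -> ());
  fun input ->
    let rec loop n v input =
      let at_max = match b with Some b -> n >= b | None -> false in
      match (if at_max then None else f v input) with
      | Some (v', rest) -> loop (n + 1) v' rest
      | None -> if n >= a then Some (v, input) else None
    in
    loop 0 v input

(* Hypothetical visitor: extend the accumulator by one ASCII digit. *)
let step (acc : int) : int t = function
  | c :: rest when Uchar.to_int c >= 48 && Uchar.to_int c <= 57 ->
    Some ((acc * 10) + (Uchar.to_int c - 48), rest)
  | _ -> None

let chars s = List.map Uchar.of_char (List.of_seq (String.to_seq s))
let number = vis ~a:1 step 0
```

In this model number applied to the input "123x" produces 123 and leaves "x" unread, and it does not produce on input that starts with a non-digit because the minimum of one element is not met.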
The homogeneous list scanner composer. Use seq ?a ?b p to create a new scanner that uses p to recognize and produce, in order, each element in a sequence of elements in the input stream.
If ~a is used, then it specifies the minimum number of elements that must be recognized and produced in the output. If ~b is used, then it specifies the maximum number of elements to recognize. Composition raises Invalid_argument if a < 0 or b < a.
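The effect of the ~a and ~b bounds can be sketched with a toy scanner type. The type, the result type chosen for seq, and the helper digit are assumptions of this example. Note that the bounds are checked when the scanner is composed, before any input is seen.

```ocaml
(* Toy model, NOT the orsetto implementation. *)
type 'r t = Uchar.t list -> ('r * Uchar.t list) option

let seq ?(a = 0) ?b (p : 'r t) : 'r list t =
  if a < 0 then invalid_arg "seq: a < 0";
  (match b with Some b when b < a -> invalid_arg "seq: b < a" | _ -> ());
  fun input ->
    let rec loop n acc input =
      let at_max = match b with Some b -> n >= b | None -> false in
      match (if at_max then None else p input) with
      | Some (v, rest) -> loop (n + 1) (v :: acc) rest
      | None -> if n >= a then Some (List.rev acc, input) else None
    in
    loop 0 [] input

(* Hypothetical helper: one ASCII decimal digit (codes 48 to 57). *)
let digit : int t = function
  | c :: rest when Uchar.to_int c >= 48 && Uchar.to_int c <= 57 ->
    Some (Uchar.to_int c - 48, rest)
  | _ -> None

let input = List.map Uchar.of_char [ '1'; '2'; '3'; 'x' ]
```

On this input, seq digit produces all three digits, seq ~b:2 digit stops after two, and seq ~a:4 digit does not produce because only three elements are available.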
The bounded multiple choice scanner. Use alt ps to create a scanner that produces the output from the first scanner in ps that produces. If no scanner in ps produces output, then the resulting scanner does not produce.
The unbounded multiple choice scanner. Use alt ps to create a scanner that produces the output from the first scanner in ps that produces. If no scanner in ps produces output, then the resulting scanner does not produce.
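First-match choice can be sketched as follows. The toy scanner type is an assumption, and the names alt_seq and lit are hypothetical: alt_seq stands in for a choice over a possibly lazy sequence of scanners, and alt for a choice over a list.

```ocaml
(* Toy model, NOT the orsetto implementation. *)
type 'r t = Uchar.t list -> ('r * Uchar.t list) option

(* Choice over a sequence: try each scanner on the same input, in order,
   and stop at the first one that produces. *)
let rec alt_seq (ps : 'r t Seq.t) : 'r t = fun input ->
  match ps () with
  | Seq.Nil -> None
  | Seq.Cons (p, tl) ->
    (match p input with Some _ as r -> r | None -> alt_seq tl input)

(* Choice over a finite list, expressed with the sequence version. *)
let alt (ps : 'r t list) : 'r t = alt_seq (List.to_seq ps)

(* Hypothetical helper: recognize the ASCII character [ch], produce [v]. *)
let lit (ch : char) (v : 'r) : 'r t = function
  | c :: rest when Uchar.equal c (Uchar.of_char ch) -> Some (v, rest)
  | _ -> None

let keyword = alt [ lit 't' true; lit 'f' false ]
```

Every alternative is tried from the same starting input, so a failed alternative does not consume anything in this model.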
Error Parsers
A distinguished syntax failure exception.
val fail : string -> 'r t
Use fail msg to raise Bad_syntax with msg optionally annotated with the current position.
Use or_fail msg p to make a scanner that raises Bad_syntax with msg if p does not recognize its input. It may be convenient to call this function with a pipeline operator, i.e. p |> or_fail "reasons".
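The pipeline usage of or_fail can be sketched with a toy model. The scanner type, the payload of Bad_syntax (a bare string here, with no position), and the helper digit are assumptions of this example.

```ocaml
(* Toy model, NOT the orsetto implementation. *)
type 'r t = Uchar.t list -> ('r * Uchar.t list) option

(* Toy stand-in for the library exception; the payload shape is assumed. *)
exception Bad_syntax of string

let fail (msg : string) : 'r t = fun _ -> raise (Bad_syntax msg)

let or_fail (msg : string) (p : 'r t) : 'r t = fun input ->
  match p input with
  | Some _ as r -> r
  | None -> raise (Bad_syntax msg)

(* Hypothetical helper: one ASCII decimal digit (codes 48 to 57). *)
let digit : int t = function
  | c :: rest when Uchar.to_int c >= 48 && Uchar.to_int c <= 57 ->
    Some (Uchar.to_int c - 48, rest)
  | _ -> None

(* Turn "does not produce" into a hard syntax error. *)
let strict = digit |> or_fail "expected a digit"
```

The design point is that digit alone simply does not produce on bad input, which lets an enclosing choice try another alternative, whereas strict aborts the whole scan with an exception.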
val err : ?x:exn -> unit -> 'r t
Use err ~x () to make a scanner that raises x. If ?x is not provided, then it raises Not_found.
Use errf ~xf () to make a scanner that captures a mark and applies it to xf to raise an exception. If ?xf is not provided, then it raises Not_found.
Use req ~x p to make a scanner that either produces the output of p or raises x. If p does not produce and ?x is not provided, then it raises Not_found.
Use reqf ~xf p to make a scanner that either produces the output of p or captures a mark at the current input and applies it to xf to raise an exception. If ?xf is not provided, then it raises Not_found.
val cast : ('a -> 'b) -> 'a Annot.form t -> 'b Annot.form t
The conversion scanner. Use cast f p to create a scanner that produces the result of applying f to the value recognized by p. If f raises Not_found, then the resulting scanner does not produce. If f raises Failure s, then it is caught and Bad_syntax is raised with the corresponding message decorated with the position.
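The three outcomes of cast (a converted value, no production on Not_found, and Bad_syntax on Failure) can be sketched as below. This toy model drops the annotation forms and positions; the scanner type, the string payload of Bad_syntax, and the helpers word and chars are assumptions.

```ocaml
(* Toy model, NOT the orsetto implementation. *)
type 'r t = Uchar.t list -> ('r * Uchar.t list) option

(* Toy stand-in for the library exception; the payload shape is assumed. *)
exception Bad_syntax of string

let cast (f : 'a -> 'b) (p : 'a t) : 'b t = fun input ->
  match p input with
  | None -> None
  | Some (v, rest) ->
    (match f v with
     | w -> Some (w, rest)
     | exception Not_found -> None
     | exception Failure s -> raise (Bad_syntax s))

(* Hypothetical helper: consume all remaining input as a UTF-8 string. *)
let word : string t = function
  | [] -> None
  | input ->
    let b = Buffer.create 8 in
    List.iter (Buffer.add_utf_8_uchar b) input;
    Some (Buffer.contents b, [])

let chars s = List.map Uchar.of_char (List.of_seq (String.to_seq s))

(* [int_of_string] raises [Failure] on malformed input. *)
let number = cast int_of_string word

(* [List.assoc] raises [Not_found] for unknown keys. *)
let flag = cast (fun s -> List.assoc s [ ("yes", true) ]) word
```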
The error check scanner. Use ck p to create a new scanner that produces either Ok v if p produces v or Error x if scanning the input with p raises the exception x.
The error recovery scanner. Use sync p to scan the input with p until it produces or reaches the end of input. Produces Some v if p ever produces v, otherwise produces None if the end of input is reached without p producing a value.
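A rough sketch of ck and sync over a toy scanner type follows. Several details are assumptions of this model rather than documented behavior: where the input stands after ck catches an exception, and that sync advances one symbol at a time between attempts. The helpers boom, semi and chars are hypothetical.

```ocaml
(* Toy model, NOT the orsetto implementation. *)
type 'r t = Uchar.t list -> ('r * Uchar.t list) option

(* Toy stand-in for the library exception; the payload shape is assumed. *)
exception Bad_syntax of string

(* [ck p]: reify an exception raised by [p] as an [Error] result.
   Assumption: the input is left where it was before [p] ran. *)
let ck (p : 'r t) : ('r, exn) result t = fun input ->
  match p input with
  | Some (v, rest) -> Some (Ok v, rest)
  | None -> None
  | exception x -> Some (Error x, input)

(* [sync p]: retry [p] at each successive symbol until it produces or the
   input ends. Assumption: one symbol is skipped per failed attempt. *)
let rec sync (p : 'r t) : 'r option t = fun input ->
  match p input with
  | Some (v, rest) -> Some (Some v, rest)
  | None ->
    (match input with
     | [] -> Some (None, [])
     | _ :: rest -> sync p rest)

(* Hypothetical helpers. *)
let boom : unit t = fun _ -> raise (Bad_syntax "boom")

let semi : unit t = function
  | c :: rest when Uchar.equal c (Uchar.of_char ';') -> Some ((), rest)
  | _ -> None

let chars s = List.map Uchar.of_char (List.of_seq (String.to_seq s))
```

A typical recovery strategy is to wrap a statement scanner in ck and, on Error, use sync with a delimiter scanner such as semi to resume at the next statement.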
Elaboration
val lift :
?start:Cf_annot.Textual.position ->
'r t ->
Uchar.t Seq.t ->
'r Seq.t
Use lift p s to map s into a persistent sequence of the values produced by p. If ~start is provided, then it specifies the starting position of the first symbol in s.
val of_seq : ?start:Cf_annot.Textual.position -> 'r t -> Uchar.t Seq.t -> 'r
Use of_seq p s to parse s with p and return the result. Raises Not_found if p does not recognize the entire sequence of s. If ~start is provided, then it specifies the starting position of the first symbol in s.
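The contrast between lift, which yields a value for every successive match, and of_seq, which demands that the whole input be recognized, can be sketched as below. This toy model has no positions, so ?start is omitted, and it uses a plain Seq.t, which is recomputed on each traversal. The scanner type and the helpers digit and ucs are assumptions.

```ocaml
(* Toy model, NOT the orsetto implementation. *)
type 'r t = Uchar.t Seq.t -> ('r * Uchar.t Seq.t) option

(* [of_seq p s]: the value of [p], provided nothing is left unread. *)
let of_seq (p : 'r t) (s : Uchar.t Seq.t) : 'r =
  match p s with
  | Some (v, rest) ->
    (match rest () with Seq.Nil -> v | Seq.Cons _ -> raise Not_found)
  | None -> raise Not_found

(* [lift p s]: lazily apply [p] again and again, yielding each value,
   and stop at the first point where [p] does not produce. *)
let rec lift (p : 'r t) (s : Uchar.t Seq.t) : 'r Seq.t = fun () ->
  match p s with
  | None -> Seq.Nil
  | Some (v, rest) -> Seq.Cons (v, lift p rest)

(* Hypothetical helper: one ASCII decimal digit (codes 48 to 57). *)
let digit : int t = fun s ->
  match s () with
  | Seq.Cons (c, rest) when Uchar.to_int c >= 48 && Uchar.to_int c <= 57 ->
    Some (Uchar.to_int c - 48, rest)
  | _ -> None

(* Hypothetical helper: the code points of an ASCII string. *)
let ucs s = Seq.map Uchar.of_char (String.to_seq s)

let digits = List.of_seq (lift digit (ucs "123x"))
```

In this model digits is the list of the three leading digits, of_seq digit (ucs "7") returns 7, and of_seq digit (ucs "78") raises Not_found because one symbol is left over.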
module Affix : sig ... end
Combinator operators
val ows : unit Annot.form t
The optional white space parser, which recognizes a sequence of zero or more code points with the White Space property.
val rws : unit Annot.form t
The required white space parser, which recognizes a sequence of one or more code points with the White Space property. Returns the unit value annotated with its location.
val ids : Ucs_text.t Annot.form t
A parser that recognizes a programming language identifier, i.e. a sequence of code points beginning with one that has the Id Start property, followed by zero or more with the Id Continue property. Returns the text of the identifier, normalized to NFC, and annotated with its location in the input stream.
val of_text : 'a t -> Ucs_text.t -> 'a
Use of_text p s to parse s with p and return the result. Raises Not_found if p does not recognize the entire text of s.