package containers

A modular, clean and powerful extension of the OCaml standard library

Sources

v3.5.tar.gz
md5=efc44e54af764ddb969ec823b7539a3e
sha512=df7c147233f13490710e81279a365290c85d4a00280d56a5bd2a74c579568abbe08c04a60c80f2936d7c15194b58b54b112b974eb8a0d28e131bae5ef38ac10d

Module CCParse

Very Simple Parser Combinators

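  open CCParse;;

  (* A binary tree with integers at the leaves. *)
  type tree = L of int | N of tree * tree;;

  let mk_leaf x = L x
  let mk_node x y = N(x,y)

  (* A tree is either "(" tree tree ")" or an integer leaf.
     try_ lets <|> fall through to the leaf case when "(" is absent. *)
  let ptree = fix @@ fun self ->
    skip_space *>
      ( (try_ (char '(') *> (pure mk_node <*> self <*> self) <* char ')')
        <|>
          (U.int >|= mk_leaf) )
  ;;

  parse_string_exn ptree "(1 (2 3))" ;;
  parse_string_exn ptree "((1 2) (3 (4 5)))" ;;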

Parse a list of words

  open Containers.Parse;;
  let p = U.list ~sep:"," U.word;;
  parse_string_exn p "[abc , de, hello ,world  ]";;

Stress Test

This makes a list of 100_000 integers, prints it and parses it back.

  let p = CCParse.(U.list ~sep:"," U.int);;

  let l = CCList.(1 -- 100_000);;
  let l_printed =
    CCFormat.(to_string (within "[" "]" (list ~sep:(return ",@,") int))) l;;

  let l' = CCParse.parse_string_exn p l_printed;;

  assert (l=l');;

type 'a or_error = ('a, string) result
type line_num = int
type col_num = int
type parse_branch
val string_of_branch : parse_branch -> string
exception ParseError of parse_branch * (unit -> string)

The payload is the parsing branch and a function that produces the error message.

Input

type position
type state
val state_of_string : string -> state

Combinators

type 'a t = state -> ok:('a -> unit) -> err:(exn -> unit) -> unit

Takes the input and two continuations:

  • ok to call with the result when it's done
  • err to call when the parser met an error
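
Since 'a t is a plain function type in this version, a parser can be run by hand by supplying both continuations. The following is a minimal sketch, assuming the containers library is installed; the `result` reference is a hypothetical helper used only to capture the outcome.

```ocaml
(* Sketch: run a parser directly through its continuation interface.
   Assumes the `containers` library (CCParse as documented on this page).
   `result` is an illustrative name, not part of the library. *)
let result : char option ref = ref None

let () =
  let st = CCParse.state_of_string "a" in
  CCParse.char 'a' st
    ~ok:(fun c -> result := Some c)    (* called with the parsed value *)
    ~err:(fun _exn -> result := None)  (* called if the parser fails *)
```

In everyday use the parse functions listed further down wrap this plumbing, so calling the continuations directly is rarely needed.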

val return : 'a -> 'a t

Always succeeds, without consuming its input.

val pure : 'a -> 'a t

Synonym for return.

val (>|=) : 'a t -> ('a -> 'b) -> 'b t

Map. p >|= f applies f to the result of p.

val map : ('a -> 'b) -> 'a t -> 'b t
val map2 : ('a -> 'b -> 'c) -> 'a t -> 'b t -> 'c t
val map3 : ('a -> 'b -> 'c -> 'd) -> 'a t -> 'b t -> 'c t -> 'd t
val (>>=) : 'a t -> ('a -> 'b t) -> 'b t

Monadic bind. p >>= f results in a new parser which behaves as p then, in case of success, applies f to the result.
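
As an illustration of bind, the sketch below chooses the follow-up parser from the value just parsed: it accepts an integer and then either returns it or fails. It assumes the containers library is installed and relies on U.int as used in the examples at the top of this page; `natural` and the error text are illustrative names.

```ocaml
(* Sketch: use >>= to validate the result of a first parser.
   Assumes the `containers` library; `natural` is an illustrative name. *)
let natural : int CCParse.t =
  CCParse.(
    U.int >>= fun n ->
    if n >= 0 then return n
    else fail "expected a non-negative integer")

let accepted = CCParse.parse_string natural "7"
let rejected = CCParse.parse_string natural "-3"
```

Here `accepted` holds the parsed integer, while `rejected` is an Error because the second step refuses negative values.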

val (<*>) : ('a -> 'b) t -> 'a t -> 'b t

Applicative.

val (<*) : 'a t -> _ t -> 'a t

a <* b parses a into x, parses b and ignores its result, and returns x.

val (*>) : _ t -> 'a t -> 'a t

a *> b parses a, then parses b into x, and returns x. The result of a is ignored.
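
The sequencing operators are typically combined to keep only the interesting part of the input. A small sketch, assuming the containers library is installed and U.int as used in the examples at the top of this page; `parenthesized` and `pair` are illustrative names.

```ocaml
(* Sketch: keep the integer, drop the surrounding parentheses.
   Assumes the `containers` library. *)
let parenthesized : int CCParse.t =
  CCParse.(char '(' *> U.int <* char ')')

(* Sketch: combine two results applicatively, ignoring the comma. *)
let pair : (int * int) CCParse.t =
  CCParse.(pure (fun a b -> (a, b)) <*> (U.int <* char ',') <*> U.int)

let inner = CCParse.parse_string parenthesized "(42)"
let both = CCParse.parse_string pair "1,2"
```

Note that `*>` has the precedence of `*`, so it binds tighter than `<*`, `<*>` and `>|=`; the first parser reads as `(char '(' *> U.int) <* char ')'`.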

val fail : string -> 'a t

fail msg fails with the given message. It can trigger a backtrack.

val failf : ('a, unit, string, 'b t) format4 -> 'a

Printf-style version of fail: the message is built from a format string and its arguments.

val parsing : string -> 'a t -> 'a t

parsing s p behaves like p; if p fails, the error also records that s was being parsed.

val eoi : unit t

Expects the end of input, fails otherwise.

val nop : unit t

Succeeds with ().

val char : char -> char t

char c parses the character c and nothing else.

val char_if : (char -> bool) -> char t

char_if f parses a character c if f c = true.

val chars_if : (char -> bool) -> string t

chars_if f parses a (possibly empty) string of chars that satisfy f.

val chars1_if : (char -> bool) -> string t

Like chars_if, but only non-empty strings.
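
The difference between the two variants shows up on input with no matching character. A minimal sketch, assuming the containers library is installed; `digits0`, `digits1` and the other bindings are illustrative names.

```ocaml
(* Sketch: chars_if accepts an empty match, chars1_if does not.
   Assumes the `containers` library. *)
let digits0 : string CCParse.t = CCParse.(chars_if is_num)
let digits1 : string CCParse.t = CCParse.(chars1_if is_num)

let some_digits = CCParse.parse_string digits0 "2024"
let no_digits = CCParse.parse_string digits0 ""

let non_empty_fails =
  match CCParse.parse_string digits1 "" with
  | Ok _ -> false
  | Error _ -> true
```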

val endline : char t

Parse '\n'.

val space : char t

Tab or space.

val white : char t

Tab or space or newline.

val skip_chars : (char -> bool) -> unit t

Skip 0 or more chars satisfying the predicate.

val skip_space : unit t

Skip ' ' and '\t'.

val skip_white : unit t

Skip ' ' and '\t' and '\n'.

val is_alpha : char -> bool

Is the char a letter?

val is_num : char -> bool

Is the char a digit?

val is_alpha_num : char -> bool

Is the char a letter or a digit?

val is_space : char -> bool

True on ' ' and '\t'.

val is_white : char -> bool

True on ' ' and '\t' and '\n'.

val (<|>) : 'a t -> 'a t -> 'a t

a <|> b tries to parse a. If a fails without consuming any input, it backtracks and tries to parse b; otherwise it fails as a. See try_ to ensure a does not consume anything (but it is best to avoid wrapping large parsers with try_).

val (<?>) : 'a t -> string -> 'a t

a <?> msg behaves like a, but if a fails without consuming any input, it fails with msg instead. Useful as the last choice in a series of <|>: a <|> b <|> c <?> "expected a|b|c".

val try_ : 'a t -> 'a t

try_ p parses like p, but backtracks to the starting position if p fails. Useful in combination with <|>.
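
The interaction between <|> and try_ matters when two alternatives share a prefix. The sketch below assumes the containers library is installed and follows the behaviour described above for this version; the keyword strings and binding names are illustrative.

```ocaml
(* Sketch: alternatives sharing the prefix "l".
   Assumes the `containers` library (CCParse as documented on this page). *)

(* Without try_, the first branch consumes input before failing,
   so <|> does not try the second branch. *)
let keyword_no_try : string CCParse.t =
  CCParse.(string "let" <|> string "lambda")

(* With try_, the first branch rewinds on failure, so the second is tried. *)
let keyword : string CCParse.t =
  CCParse.(try_ (string "let") <|> string "lambda")

let with_try = CCParse.parse_string keyword "lambda"

let without_try_fails =
  match CCParse.parse_string keyword_no_try "lambda" with
  | Ok _ -> false
  | Error _ -> true
```

Wrapping only the short, ambiguous prefix in try_ keeps backtracking cheap, in line with the advice above about not wrapping large parsers.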

val suspend : (unit -> 'a t) -> 'a t

suspend f is the same as f (), but evaluates f () only when needed.

val string : string -> string t

string s parses exactly the string s, and nothing else.

val many : 'a t -> 'a list t

many p parses a list of p, eagerly (as long as possible).

val many1 : 'a t -> 'a list t

Parse a non-empty list.

val skip : _ t -> unit t

skip p parses p zero or more times and ignores the results.

val sep : by:_ t -> 'a t -> 'a list t

sep ~by p parses a list of p separated by by.

val sep1 : by:_ t -> 'a t -> 'a list t

sep1 ~by p parses a non-empty list of p, separated by by.
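
A short sketch of sep on comma-separated integers, assuming the containers library is installed and U.int as used in the examples at the top of this page; `csv_ints` is an illustrative name.

```ocaml
(* Sketch: a comma-separated list of integers, without brackets.
   Assumes the `containers` library. *)
let csv_ints : int list CCParse.t =
  CCParse.(sep ~by:(char ',') U.int)

let ints = CCParse.parse_string csv_ints "1,2,3"
```

For bracketed, whitespace-tolerant lists, U.list (shown in the examples at the top of the page) is likely the more convenient entry point.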

val fix : ('a t -> 'a t) -> 'a t

Fixpoint combinator. fix f builds a recursive parser p that behaves as f p.
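
To illustrate recursion through fix, the sketch below computes the nesting depth of a run of balanced parentheses. It assumes the containers library is installed; `depth` is an illustrative name, and try_ is used so that the empty alternative is reached when no opening parenthesis is present.

```ocaml
(* Sketch: nesting depth of "((...))" via a recursive parser.
   Assumes the `containers` library. *)
let depth : int CCParse.t =
  CCParse.(
    fix (fun self ->
      (* "(" inner ")" has depth one more than inner... *)
      (try_ (char '(') *> self <* char ')' >|= fun d -> d + 1)
      (* ...and anything else has depth 0, consuming nothing. *)
      <|> return 0))

let two = CCParse.parse_string depth "(())"
let zero = CCParse.parse_string depth ""
```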

val memo : 'a t -> 'a t

Memoize the parser. memo p behaves like p, but when called in a state (read: position in input) it has already processed, memo p returns the stored result directly. The implementation uses an underlying hashtable. This can be costly in memory, but can improve run time considerably when there is a lot of backtracking involving p.

This function is not thread-safe.

val fix_memo : ('a t -> 'a t) -> 'a t

Like fix, but the fixpoint is memoized.

val get_lnum : int t

Reflect the current line number.

val get_cnum : int t

Reflect the current column number.

val get_pos : (int * int) t

Reflect the current (line, column) numbers.
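
The position parsers can be read at any point in a sequence. The sketch below, which assumes the containers library is installed, measures how far the line number moves across one newline rather than relying on a particular starting value; `lines_advanced` is an illustrative name.

```ocaml
(* Sketch: observe the line number before and after a newline.
   Assumes the `containers` library. *)
let lines_advanced : int CCParse.t =
  CCParse.(
    get_lnum >>= fun before ->
    endline *> get_lnum >|= fun after ->
    after - before)

let delta = CCParse.parse_string lines_advanced "\n"
```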

Parse

Those functions have a label ~p on the parser, since 0.14.

val parse : 'a t -> state -> 'a or_error

parse p st applies p on the input, and returns Ok x if p succeeds with x, or Error s otherwise.

val parse_exn : 'a t -> state -> 'a

Unsafe version of parse: raises ParseError on failure instead of returning Error.

val parse_string : 'a t -> string -> 'a or_error

Specialization of parse for string inputs.
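
A typical call site pattern-matches on the or_error result. A minimal sketch, assuming the containers library is installed and U.int as used in the examples at the top of this page; `describe` and `from_state` are illustrative names.

```ocaml
(* Sketch: handle both outcomes of parse_string.
   Assumes the `containers` library. *)
let describe (input : string) : string =
  match CCParse.parse_string CCParse.U.int input with
  | Ok n -> Printf.sprintf "parsed %d" n
  | Error _msg -> "parse error"

(* The same parser applied to an explicitly constructed state. *)
let from_state = CCParse.parse CCParse.U.int (CCParse.state_of_string "42")
```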

val parse_string_exn : 'a t -> string -> 'a
val parse_file : 'a t -> string -> 'a or_error

parse_file p file parses file with p by opening the file and reading it whole.

val parse_file_exn : 'a t -> string -> 'a

Infix

module Infix : sig ... end

Utils

This is useful to parse OCaml-like values in a simple way.

module U : sig ... end

Let operators on OCaml >= 4.08.0, nothing otherwise.

  • since 2.8

include CCShimsMkLet_.S with type 'a t_let := 'a t

val (let+) : 'a t -> ('a -> 'b) -> 'b t
val (and+) : 'a t -> 'b t -> ('a * 'b) t
val (let*) : 'a t -> ('a -> 'b t) -> 'b t
val (and*) : 'a t -> 'b t -> ('a * 'b) t
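
With the binding operators, a sequence of parsers reads like ordinary let bindings. A minimal sketch, assuming OCaml >= 4.08, the containers library, and U.int as used in the examples at the top of this page; `point` is an illustrative name.

```ocaml
(* Sketch: parse "x,y" into a pair using let* and let+.
   Requires OCaml >= 4.08 and the `containers` library. *)
let point : (int * int) CCParse.t =
  let open CCParse in
  let* x = U.int in
  let* _ = char ',' in
  let+ y = U.int in
  (x, y)

let parsed_point = CCParse.parse_string point "3,4"
```

This is equivalent to chaining >>= and >|= by hand, but avoids the nested closures.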