package pcre
Sources
sha256=088a32dc2a38627559e409048e451aaea574bb4f1902a534210b1a2f54a7b820
sha512=f42ceb53956e522dc0f364a0d9fc4e0699f9d08534e562b4b2ae20c9bb6c6423e9cb1d0b65eaab4c82244b8623707e9b314ae8ff9718061d6dc629bf0a8e3f95
Description
pcre-ocaml offers library functions for string pattern matching and substitution, similar to the functionality offered by the Perl language.
Published: 23 Jun 2025
README
PCRE-OCaml - Perl Compatibility Regular Expressions for OCaml
This OCaml library interfaces with the C library PCRE, providing Perl-compatible regular expressions for string matching.
Features
PCRE-OCaml offers:
- Pattern searching
- Subpattern extraction
- String splitting by patterns
- Pattern substitution
Reasons to choose PCRE-OCaml:
- The PCRE library by Philip Hazel is mature and stable, implementing nearly all Perl regular expression features. High-level OCaml functions (split, replace, etc.) are compatible with Perl functions, as much as OCaml allows. Some developers find Perl-style regex syntax more intuitive and powerful than the Emacs-style regex used in OCaml's Str module.
- PCRE-OCaml is reentrant and thread-safe, unlike the Str module. This reentrancy offers convenience, eliminating concerns about library state.
- High-level replacement and substitution functions in OCaml are faster than those in the Str module. When compiled to native code, they can even outperform Perl's C-based functions.
- Returned data is unique, allowing safe destructive updates without side effects.
- The library interface uses labels and default arguments for enhanced programming comfort.
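As a brief illustration of the four features listed above, the following sketch uses Pcre.pmatch, Pcre.extract, Pcre.split, and Pcre.replace with the ~pat convenience argument. The sample strings and patterns are made up for illustration; consult the API documentation for the full set of optional arguments.

```ocaml
(* Sketch only: the sample inputs are hypothetical.
   Build against the pcre library. *)

(* Pattern searching: does the subject contain a run of digits? *)
let has_digits = Pcre.pmatch ~pat:"\\d+" "order 42"

(* Subpattern extraction: index 0 holds the whole match,
   followed by one entry per capture group. *)
let parts = Pcre.extract ~pat:"(\\w+)@(\\w+)" "user@host"

(* String splitting by a pattern. *)
let fields = Pcre.split ~pat:"," "a,b,c"

(* Pattern substitution: $1 and $2 refer to the captured groups. *)
let swapped = Pcre.replace ~pat:"(\\d+)-(\\d+)" ~templ:"$2-$1" "10-20"
```

Each call passes the pattern through a labeled argument and leaves every other optional argument at its default.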
Usage
To browse the API documentation, run:
odig odoc pcre
Or, from a checkout of the source tree:
dune build @doc
Consult the API documentation for details.
Functions support two flag types:
Convenience flags: Readable and concise, translated internally on each call. Example:
let rex = Pcre.regexp ~flags:[`ANCHORED; `CASELESS] "some pattern" in (* ... *)
These are easy to use but may incur overhead in loops. For performance optimization, consider the next approach.
Internal flags: Predefined and translated from convenience flags for optimal loop performance. Example:
let iflags = Pcre.cflags [`ANCHORED; `CASELESS] in
for i = 1 to 1000 do
  let rex = Pcre.regexp ~iflags "some pattern constructed at runtime" in
  (* ... *)
done
Translating flags outside loops saves cycles. For the same reason, avoid compiling a regex inside a loop, which is what passing ~pat does on every iteration:
for i = 1 to 1000 do
  let chunks = Pcre.split ~pat:"[ \t]+" "foo bar" in
  (* ... *)
done
Instead, compile the regex once before the loop:
let rex = Pcre.regexp "[ \t]+" in
for i = 1 to 1000 do
  let chunks = Pcre.split ~rex "foo bar" in
  (* ... *)
done
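One way to keep a precompiled regex out of the caller's way is to capture it in a closure. This is only a sketch of that idea; the name split_blanks and the sample inputs are hypothetical and not part of the library.

```ocaml
(* Sketch: split_blanks is a hypothetical helper, not part of Pcre.
   The regex is compiled once, when the closure is built;
   each later call only runs the split. *)
let split_blanks : string -> string list =
  let rex = Pcre.regexp "[ \t]+" in
  fun subject -> Pcre.split ~rex subject

(* Reuse the same compiled regex across many subjects. *)
let chunks = List.map split_blanks ["foo bar"; "baz\tqux  quux"]
```

Callers then get the performance of the precompiled form without having to thread ~rex through their own code.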
Functions use optional arguments with intuitive defaults. For instance, Pcre.split defaults to whitespace as the pattern. The examples directory contains applications demonstrating PCRE-OCaml's functionality.
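A minimal sketch of the whitespace default, contrasted with an explicit pattern; the input strings are made up for illustration.

```ocaml
(* No ~pat or ~rex given: the split falls back to the whitespace default. *)
let words = Pcre.split "foo bar  baz"

(* Supplying ~pat overrides the default for this call only. *)
let columns = Pcre.split ~pat:";" "foo;bar;baz"
```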
Restartable (Partial) Pattern Matching
PCRE includes a DFA match function for restarting partial matches with new input, exposed via pcre_dfa_exec. While not suitable for extracting submatches or splitting strings, it's useful for streaming and search tasks.
Example of a partial match restarted:
utop # open Pcre;;
utop # let rex = regexp "12+3";;
val rex : regexp = <abstr>
utop # let workspace = Array.make 40 0;;
val workspace : int array =
[| ... |]
utop # pcre_dfa_exec ~rex ~flags:[`PARTIAL] ~workspace "12222";;
Exception: Pcre.Error Partial.
utop # pcre_dfa_exec ~rex ~flags:[`PARTIAL; `DFA_RESTART] ~workspace "2222222";;
Exception: Pcre.Error Partial.
utop # pcre_dfa_exec ~rex ~flags:[`PARTIAL; `DFA_RESTART] ~workspace "2222222";;
Exception: Pcre.Error Partial.
utop # pcre_dfa_exec ~rex ~flags:[`PARTIAL; `DFA_RESTART] ~workspace "223xxxx";;
- : int array = [|0; 3; 0|]
Refer to the pcre_dfa_exec documentation and the dfa_restart example for more information.
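The session above can be wrapped in a small driver that feeds chunks one at a time. The following is a sketch under stated assumptions, not the library's own streaming API: the outcome type and the feed function are hypothetical names, the workspace size of 40 is simply copied from the session, and a failed match is assumed to raise Not_found as with the other matching functions.

```ocaml
(* Sketch: outcome and feed are hypothetical, not part of Pcre. *)
type outcome =
  | Matched     (* some chunk completed the match *)
  | Incomplete  (* input ran out while the match was still partial *)
  | No_match    (* the input cannot match (assumed to raise Not_found) *)

let feed rex chunks =
  (* The workspace carries the DFA state from one call to the next. *)
  let workspace = Array.make 40 0 in
  let rec go first = function
    | [] -> Incomplete
    | chunk :: rest ->
      (* Only continuation chunks carry `DFA_RESTART. *)
      let flags =
        if first then [`PARTIAL] else [`PARTIAL; `DFA_RESTART]
      in
      match Pcre.pcre_dfa_exec ~rex ~flags ~workspace chunk with
      | _ -> Matched
      | exception Pcre.Error Pcre.Partial -> go false rest
      | exception Not_found -> No_match
  in
  go true chunks

let rex = Pcre.regexp "12+3"
let complete = feed rex ["12222"; "2222222"; "223xxxx"]
let pending = feed rex ["12222"; "2222222"]
```

With the chunks from the session, complete is Matched and pending is Incomplete, mirroring the Partial exceptions and the final successful call shown there.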
Contact Information and Contributing
Submit bug reports, feature requests, and contributions via the GitHub issue tracker.
For the latest information, visit: https://mmottl.github.io/pcre-ocaml
Dependencies (4)
- conf-libpcre (build)
- dune-configurator
- ocaml >= "4.08"
- dune >= "2.7"
Used by (52)
- aifad
- anthill
- bap-veri
- benchmark = "1.5"
- camelsnakekebab
- camlp5 >= "8.00~alpha06" & < "8.02.01"
- chamo
- coccinelle
- comby
- comby-kernel
- commons >= "1.8.0"
- cppffigen
- devkit
- dirsp-ps2ocaml
- dotenv >= "0.0.3"
- duppy < "0.9.4"
- expect < "0.1.0"
- grib < "0.11.0"
- janestreet_csv
- jingoo < "1.2.21"
- lastfm < "0.3.4"
- ldap < "2.5.1"
- line-up-words
- liquidsoap < "2.3.3"
- liquidsoap-core < "2.3.0"
- matita
- mparser-pcre
- oasis2debian
- ocaml-http
- ocaml-inifiles
- ocaml_db_model
- ocaml_pgsql_model
- ocamldap < "transition"
- ocsigenserver < "5.1.0"
- pa_ppx < "0.14"
- pa_ppx_ag
- pa_ppx_hashcons < "0.11"
- pa_ppx_migrate < "0.11"
- pa_ppx_regexp < "0.05"
- pa_ppx_unique < "0.11"
- patdiff < "v0.16.1"
- pgocaml < "2.3"
- rdf
- rdf_impls
- rotor
- server-reason-react < "0.2.0"
- shcaml
- sihl < "0.1.5"
- spectrum >= "0.4.0"
- squirrel
- stk
- tyxml < "4.2.0"
Conflicts
None