package term-tools
Install
Dune Dependency
Authors
Maintainers
Sources
md5=75acb747da03c0cd94d22c7f57cf0b1b
sha512=594187718db0e8b6decd40b353b1b3dad883deb59b7196378bb16783d22070d853f1ab5c5017d5eb826f8c435fde991769a0fe557e31abb2c50953f0e757525a
doc/index.html
Term tools
This library implements some tools to manipulate first-order terms. This page introduces the library with some basic examples. If you're ready to dive in the API, look at Term_tools
, especially the functor Term_tools.Make
.
Hash-consed first-order terms
A first-order term is either
- a variable (represented as an integer), or
- a primitive symbol applied to a list of sub-terms whose length correspond to the arity of the symbol.
Primitive symbols and their arities are specified by a signature
.
For instance, arithmetic expressions such as x_0 + x_1 * x_1
can be represented as terms constructed with primitives for addition, multiplication, etc. Let us define the corresponding signature.
open Term_tools
(* The type of primitive symbols *)
type prim = Add | Mul | Neg | Float of float
(* [Prim] implements [Intf.Signature].
We use the [Stdlib]'s polymorphic comparison and hash operators for simplicity. *)
module Prim : Intf.Signature with type t = prim = struct
type t = prim
let compare (x : t) (y : t) = Stdlib.compare x y
let equal (x : t) (y : t) = Stdlib.( = ) x y
let hash = Hashtbl.hash
let pp fmtr = function
| Add -> Fmt.pf fmtr "Add"
| Mul -> Fmt.pf fmtr "Mul"
| Neg -> Fmt.pf fmtr "Neg"
| Float f -> Fmt.pf fmtr "%.1f" f
(* Each primitive is associated to an arity, which maps each constructor to its number
of expected arguments;
e.g. addition is a binary operation, negation is a unary operation and
constants are 0-ary. *)
let arity = function Add | Mul -> 2 | Neg -> 1 | Float _ -> 0
end
The functor Term_tools.Make
packs all features of the library under a single functor taking an argument of type Intf.Signature
.
module Pack = Term_tools.Make (Prim)
open Pack
The module Term
contained in Pack
provides operations to create and manipulate hash-consed terms over the given signature (see Intf.Term
). Hash-consing is a technique that ensures that terms are allocated at most once: it is guaranteed that structurally equal terms are physically equal. Terms are constructed using the functions Term.prim
for primitive applications and Term.var
for variables.
Let us define some convenient wrappers to create terms. Note that the correctness of arities is dynamically checked by Term.prim
.
let add x y = Term.prim Add [| x; y |]
let mul x y = Term.prim Mul [| x; y |]
let neg x = Term.prim Neg [| x |]
let float f = Term.prim (Prim.Float f) [||]
let var s = Term.var s
The mathematical expression x_0 + x_1 * x_1
can be represented by the term
let t = add (var 0) (mul (var 1) (var 1))
Manipulating terms with zippers
Zippers allow to navigate and edit first-order terms in a purely applicative way. A zipper corresponds to a term in-context. One embeds a term inside an empty context using Zipper.of_term
. One can zoom in on a subterm (reps. zoom out) using Zipper.move_at
(resp. Zipper.move_up
).
let zipper = Zipper.of_term t
let left = Zipper.move_at zipper 0 |> Option.get
let () = Fmt.pr "left: %a@." Term.pp (Zipper.cursor left)
let right = Zipper.move_at zipper 1 |> Option.get
let () = Fmt.pr "right: %a@." Term.pp (Zipper.cursor right)
left: V(0) right: Mul(V(1), V(1))
Zipper.replace
allows to replace the term under focus by another one. Finally, one can retrieve a term using Zipper.to_term
.
let rewritten = Zipper.replace (float 42.0) right |> Zipper.to_term
let () = Fmt.pr "rewritten: %a@." Term.pp rewritten
Folding over terms
The function Zipper.fold
allows to fold with a zipper over all subterms of a given term. The term is traversed in a preorder, depth-first, left-to-right fashion. Let's try:
let all_subterms =
Zipper.fold (fun z acc -> Zipper.cursor z :: acc) [] (Zipper.of_term t)
let () = Fmt.pr "%a@." (Fmt.Dump.list Term.pp) all_subterms
[V(1); V(1); Mul(V(1), V(1)); V(0); Add(V(0), Mul(V(1), V(1)))]
Note that since the subterms are accumulated by pushing on a list, we get the results in reverse.
Folding over variables only
We can also fold over variables only. The term representation allows to skip entirely variable-free subtrees when doing so.
let all_variables =
Zipper.fold_variables (fun v _z acc -> v :: acc) [] (Zipper.of_term t)
let () = Fmt.pr "%a@." Fmt.Dump.(list Fmt.int) all_variables
[1; 1; 0]
Rewriting
We will illustrate rewriting by implementing some toy constant folding. The Pattern
module provides facilities to search for subterms having some particular shape.
We then define some patterns corresponding to terms that can be folded.
(* A pattern matching any float constant. [prim_pred] is a generic predicate on primitives. *)
let float_patt =
Pattern.(prim_pred (function Float _ -> true | _ -> false) list_empty)
(* A pattern matching an addition of float constants. *)
let add_patt = Pattern.(prim Prim.Add (float_patt @. float_patt @. list_empty))
(* A pattern matching a multiplication of float constants. *)
let mul_patt = Pattern.(prim Prim.Add (float_patt @. float_patt @. list_empty))
(* A pattern matching negation of a float constant. *)
let neg_patt = Pattern.(prim Prim.Neg (float_patt @. list_empty))
Upon detecting such subterms, we will need to reduce them. The following illustrates how to do so using Term.destruct
, which performs pattern matching on terms. It takes the following arguments:
- a function to be called if the term is a primitive, to which the primitive and subterms are passed
- a function to be called if the term is a variable, to which the variable is passed.
- the term to be analyzed
get_float
extracts the floating point value out of a Float
term, or returns None
if not possible.
let get_float (term : Term.t) : float option =
Term.destruct
(fun prim _ -> match prim with Prim.Float f -> Some f | _ -> None)
(fun _ -> None)
term
reduce
performs a step of constant folding if possible.
let reduce (term : Term.t) : Term.t option =
Term.destruct
(fun prim operands ->
match (prim, operands) with
| ((Add | Mul), [| l; r |]) ->
Option.bind (get_float l) @@ fun l ->
Option.bind (get_float r) @@ fun r ->
Option.some
(match prim with
| Add -> float (l +. r)
| Mul -> float (l *. r)
| _ -> assert false)
| (Neg, [| x |]) ->
Option.bind (get_float x) @@ fun x -> Option.some (float (-.x))
| _ -> Option.none)
(fun _ -> Option.none)
term
Constant folding iteratively looks for subterms to simplify until none is left. Pattern.first_matches
searches the term for an occurrence of a subterm matching any of the patterns in the provided list. If a pattern is found, we perform the rewrite, print the outcome and continue.
let rec rewrite_until_fixpoint term =
let matches = Pattern.first_match [add_patt; mul_patt; neg_patt] term in
match matches with
| [] -> term
| zipper :: _ ->
let rewritten =
match reduce (Zipper.cursor zipper) with
| Some reduced -> reduced
| None -> failwith "can't happen"
in
Fmt.pr "%a -> %a@." Term.pp term Term.pp rewritten ;
rewrite_until_fixpoint rewritten
Let's try this out on some dummy term.
let expression = add (float 1.0) (add (float 2.0) (mul (float 3.0) (float 4.0)))
let normalized = rewrite_until_fixpoint expression
The sequence of rewrites is:
Add(1.0, Add(2.0, Mul(3.0, 4.0))) -> Add(1.0, Add(2.0, 12.0)) Add(1.0, Add(2.0, 12.0)) -> Add(1.0, 14.0) Add(1.0, 14.0) -> 15.0
Substitutions
Variables denote placeholders for terms that may replace them. This mechanism is mediated through substitutions, which are finitely supported functions from variables to terms. The following is a substitution mapping
- the variable
0
to the termfloat 0.0
- the variable
1
to the termneg (float 42.0)
- the variable
2
to the termfloat 2.0
let subst =
[(0, float 0.0); (1, neg (float 42.0)); (2, float 2.0)]
|> List.to_seq |> Subst.of_seq
The terms associated to each variable in the domain of a substitution can be obtained through Subst.get
.
let () =
assert (Option.equal Term.equal (Subst.get 0 subst) (Some (float 0.0)))
let () = assert (Option.equal Term.equal (Subst.get 3 subst) None)
One can also apply a substitution to the variables contained in a term using Subst.lift
.
let term = add (var 1) (mul (var 2) (var 2))
let substituted = Subst.lift subst term
The value substituted
is equal to:
Add(Neg(42.0), Mul(2.0, 2.0))
Intermezzo: the refinement preorder on terms
Applying a substitution to a term intuitively "refines" it. More formally, one can define a preorder \le
on terms where a term t_1 \le t_2
if there exists a substitution \sigma
such that t_1 = \sigma(t_2)
. The maximal (equivalence class of) elements of this preorder are variables, and the minimal elements are ground terms (i.e. terms without variables). (This is a preorder and not a partial order because terms equal modulo variable renaming generalize each other).
Here is an increasing sequence of terms in the preorder \le
:
- the term
t1 = add (float 2.0) (float 2.0)
is ground - the term
t1
refinest2 = add (var 1) (var 1)
via the substitution1 \mapsto
float 2.0
- the term
t2
refinest3 = add (var 1) (var 2)
via the substitution2 \mapsto
var 1
- the term
t3
refinesvar 3
via the substitution3 \mapsto
t3
Unification
Some pairs of terms t_1, t_2
admit a common refinement. Formally, a unifier of t_1
and t_2
is a substitution which equates the two terms. A unification problem is a conjunction of equations between terms and a solution is a substitution which when applied to all terms satisfies all equations.
The library provides a module to compute such solutions. Unification proceeds on a state that allows to accumulate equations. Let us create an empty state.
let uf_state = Unification.empty ()
One can unify terms using Unification.unify
. This function returns None
when no unifier can be found, or an updated state in the other case. At any point, we can get a solution from the state using Unification.subst
which returns a substitution.
let t1 = add (mul (float 1.0) (float 2.0)) (var 1)
let t2 = add (var 2) (mul (float 3.0) (float 4.0))
let () =
match Unification.unify t1 t2 uf_state with
| None -> failwith "unification failed"
| Some uf_state' ->
let subst = Unification.subst uf_state' in
Fmt.pr "%a@." Subst.pp subst
V(1) -> Mul(3.0, 4.0); V(2) -> Mul(1.0, 2.0)