package wu-manber-fuzzy-search

  1. Overview
  2. Docs
Wu-Manber approximate string matching

Install

Dune Dependency

Authors

Maintainers

Sources

wu-manber-fuzzy-search-0.1.0.tbz
sha256=6383ed484151c85bd4de90592391d9117c9ea03eddc04bef0af5cfe57b25f56f
sha512=6d6aae059225726a24ad44eb1a47d90507d9c9a429be41a9c577baf01d29c63d7dd3bc5767602cc4dfb21d28f296ea4000f7a9b91a3e50740c85b02771f66a95

doc/wu-manber-fuzzy-search/Wu_Manber/StringSearch/index.html

Module Wu_Manber.StringSearchSource

An module for fuzzy searching in strings

This module is intended to be used for searching as well as an example of properly using the mid-level functors in the FirstMatch module.

The search function in this module stop after the first match, and returns the error count together with the number of characters of the text read by the algorithm.

The following is a description of how to use the functions in Wu_Manber.FirstMatch.

We first collect all the relevant operations of strings and characters in a module called Pattern.

Sourcemodule Pattern : sig ... end

Module collecting operations for strings.

Next we create a Matcher using one of the functors in the Matcher module.

module ArrayMatcher = Matcher.MakeArrayMatcher (Pattern)
Sourcemodule ArrayMatcher : sig ... end

Next, we use the FirstMatch.Make functor to create functions that can search through sequences of characters.

module FirstMatch = FirstMatch.Make (Pattern) (ArrayMatcher)
Sourcemodule FirstMatch : sig ... end

Lastly, to search through strings instead of sequences, we wrap the functions from the FirstMatch module.

let search ~k ~pattern ~text =
  let seq = String.to_seq text in
  FirstMatch.first_match ~pattern ~k seq

let search_right_leaning ~k ~pattern ~text =
  let seq = String.to_seq text in
  FirstMatch.first_right_leaning_match ~pattern ~k seq

let report = FirstMatch.report

search ~k ~pattern ~text searches for the first match in a string using the basic Wu and Manber algorithm, allowing for ~k errors for a match. ~pattern must have length less than or equal 63.

Sourceval search_right_leaning : k:int -> pattern:Pattern.t -> text:String.t -> (int * int * Optint.Int63.t array) option

search_right_leaning ~k ~pattern ~text searches for the first match in a string using the right leaning variant of the Wu and Manber algorithm, allowing for ~k errors for a match. ~pattern must have length less than or equal 63.

Sourceval report : (int * int * Optint.Int63.t array) option -> string

report produces a texual description of the results of the above functions.

OCaml

Innovation. Community. Security.