String Searching Algorithms
(require string-searchers) | package: string-searchers |
Provides a variety of string search algorithms written in Typed Racket. They look for sequences of exact code points or bytes, not equivalencies. When using with non-ASCII text, consider normalizing strings first.
1 Searching for single strings
Functions for searching for a single substring in strings.
1.1 Knuth-Morris-Pratt
Search for strings using the Knuth-Morris-Pratt algorithm. These functions are also available in the string-searchers/kmp module, without the kmp- prefix.
procedure
(kmp-string-contains haystack needle [ start end]) → (Option Index) haystack : String needle : String start : Index = 0 end : Index = (string-length haystack)
procedure
(kmp-string-contains-ci haystack needle [ start end]) → (Option Index) haystack : String needle : String start : Index = 0 end : Index = (string-length haystack)
procedure
(kmp-make-matcher pattern [ #:case-insensitive case-insensitive?]) → kmp-matcher pattern : String case-insensitive? : Boolean = #f
procedure
(kmp-matcher? obj) → Boolean
obj : Any
procedure
(kmp-matcher-ci? m) → Boolean
m : kmp-matcher
procedure
(kmp-matcher-pattern m) → String
m : kmp-matcher
procedure
(kmp-find-string m text [start end]) → (Option Index)
m : kmp-matcher text : String start : Index = 0 end : Index = (string-length text)
procedure
(kmp-find-all-strings m text [ start end #:overlap overlap?]) → (Listof Index) m : kmp-matcher text : String start : Index = 0 end : Index = (string-length text) overlap? : Boolean = #t
1.2 Boyer-Moore-Horspool
Search for strings using the Boyer-Moore-Horspool algorithm. These functions are also available in the string-searchers/bmh module, without the bmh- prefix.
1.2.1 Strings
procedure
(bmh-string-contains haystack needle [ start end]) → (Option Index) haystack : String needle : String start : Index = 0 end : Index = (string-length haystack)
procedure
(bmh-string-contains-ci haystack needle [ start end]) → (Option Index) haystack : String needle : String start : Index = 0 end : Index = (string-length haystack)
procedure
(bmh-make-matcher pattern [ #:case-insensitive case-insensitive?]) → bmh-matcher pattern : String case-insensitive? : Boolean = #f
procedure
(bmh-matcher? obj) → Boolean
obj : Any
procedure
(bmh-matcher-ci? m) → Boolean
m : bmh-matcher
procedure
(bmh-matcher-pattern m) → String
m : bmh-matcher
procedure
(bmh-find-string m text [start end]) → (Option Index)
m : bmh-matcher text : String start : Index = 0 end : Index = (string-length text)
procedure
(bmh-find-all-strings m text [ start end #:overlap overlap?]) → (Listof Index) m : bmh-matcher text : String start : Index = 0 end : Index = (string-length text) overlap? : Boolean = #t
1.2.2 Byte Strings
Note: There are no case-insensitive routines for byte strings.
procedure
(bmh-byte-string-contains haystack needle [ start end]) → (Option Index) haystack : Bytes needle : Bytes start : Index = 0 end : Index = (bytes-length haystack)
procedure
(bmh-make-byte-matcher pattern) → bmh-byte-matcher
pattern : Bytes
procedure
(bmh-byte-matcher? obj) → Boolean
obj : Any
procedure
m : bmh-byte-matcher
procedure
(bmh-find-byte-string m text [start end]) → (Option Index)
m : bmh-byte-matcher text : Bytes start : Index = 0 end : Index = (bytes-length text)
procedure
(bmh-find-all-byte-strings m text [ start end #:overlap overlap?]) → (Listof Index) m : bmh-byte-matcher text : Bytes start : Index = 0 end : Index = (bytes-length text) overlap? : Boolean = #t
2 Searching for multiple strings
Sections for searching for any of multiple different substrings in a string.
2.1 Aho-Corasick
Search for strings using the Aho-Corasick algorithm. These functions are also available in the string-searchers/ahoc module, without the ahoc- prefix.
procedure
(ahoc-make-matcher s ...) → ahoc-matcher
s : String
procedure
(list->ahoc-matcher strings) → ahoc-matcher
strings : (Listof String)
procedure
(ahoc-matcher? obj) → Boolean
obj : Any
procedure
(ahoc-matcher-patterns m) → (Listof String)
m : ahoc-matcher