blog-2024-02-01

Text normalization

This is for pepperino search. I'm also planning to add bigrams and trigrams, which should be simple.

https://nlp.stanford.edu/IR-book/html/htmledition/normalization-equivalence-classing-of-terms-1.html
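
A rough sketch of what normalization plus n-grams could look like. This is not pepperino's actual code: the function names normalize and ngrams are made up for illustration, Python is used only because the implementation language isn't stated here, and it assumes word-level n-grams, with case folding and accent stripping as the equivalence classes.

```python
import unicodedata


def normalize(text: str) -> list[str]:
    """Case-fold, strip diacritics and split into alphanumeric tokens (hypothetical helper)."""
    # NFKD separates base characters from combining marks, so the marks can be dropped.
    decomposed = unicodedata.normalize("NFKD", text.casefold())
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace everything that is not a letter or digit with a space, then split on whitespace.
    cleaned = "".join(ch if ch.isalnum() else " " for ch in stripped)
    return cleaned.split()


def ngrams(tokens: list[str], n: int) -> list[tuple[str, ...]]:
    """Return every run of n consecutive tokens (assumes word-level n-grams)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = normalize("Crème Brûlée, please!")
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)
```

With this, "Crème" and "creme" end up in the same equivalence class, and bigrams/trigrams are just sliding windows over the normalized tokens, which is why adding them should be cheap.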

At the core of pepperino search is the ReverseLookupMap<A, B>, which manages two mappings: forward: Map<A, List<B>> and backward: Map<B, List<A>>. Adding A -> (1, 2, 3) to forward also adds 1 -> A, 2 -> A and 3 -> A to backward.
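
A minimal sketch of that idea, assuming nothing beyond the description above. Only the names ReverseLookupMap, forward and backward come from the text; the add, values_for and keys_for methods are hypothetical, and the duplicate check is an assumption about the desired behaviour.

```python
from typing import Generic, Hashable, Iterable, TypeVar

A = TypeVar("A", bound=Hashable)
B = TypeVar("B", bound=Hashable)


class ReverseLookupMap(Generic[A, B]):
    """Keeps a forward map A -> [B] and a backward map B -> [A] in sync."""

    def __init__(self) -> None:
        self.forward: dict[A, list[B]] = {}
        self.backward: dict[B, list[A]] = {}

    def add(self, key: A, values: Iterable[B]) -> None:
        """Register key -> values and mirror each pair in the backward map."""
        forward_list = self.forward.setdefault(key, [])
        for value in values:
            # Assumption: each pair is stored once, so repeated adds are ignored.
            if value not in forward_list:
                forward_list.append(value)
            backward_list = self.backward.setdefault(value, [])
            if key not in backward_list:
                backward_list.append(key)

    def values_for(self, key: A) -> list[B]:
        """Forward lookup; returns a copy, empty if the key is unknown."""
        return list(self.forward.get(key, []))

    def keys_for(self, value: B) -> list[A]:
        """Backward lookup; returns a copy, empty if the value is unknown."""
        return list(self.backward.get(value, []))


index: ReverseLookupMap[str, int] = ReverseLookupMap()
index.add("A", [1, 2, 3])
index.add("B", [3])
```

For search, A could be a document id and B a normalized term, so keys_for(term) answers "which documents contain this term" without scanning the forward map.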

Random links of the day



last edited by: stefs at Friday, February 2, 2024, 8:28:12 PM Central European Standard Time

