LemmaExpander

class LemmaExpander(config)

Bases: BatchedStep

Keeps track of the lemmatised versions of relevant terms.

Note

Can leverage pos_mutations from POSBooster to tag lemmas only for important terms.
The ES query generator then uses this metadata to generate exact vs. lemmatised ES query clauses.
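
As a rough illustration of what "exact vs. lemmatised" clauses could look like, here is a minimal sketch that turns one term plus a lemma_map into an Elasticsearch bool query. The helper name build_term_clauses, the field names text.exact and text.lemma, and the boost value are all assumptions made for this example; the actual ES query generator may build its clauses differently.

```python
from typing import Any, Dict, List


def build_term_clauses(
    term: str,
    lemma_map: Dict[str, str],
    exact_field: str = "text.exact",   # hypothetical field name
    lemma_field: str = "text.lemma",   # hypothetical field name
) -> Dict[str, Any]:
    """Build a bool query with an exact clause and, if known, a lemma clause."""
    # The exact clause always exists and is boosted (boost value is an assumption).
    clauses: List[Dict[str, Any]] = [
        {"match": {exact_field: {"query": term, "boost": 2.0}}}
    ]
    # Add a lemmatised clause only when lemma_map has an entry for the term.
    lemma = lemma_map.get(term)
    if lemma is not None:
        clauses.append({"match": {lemma_field: {"query": lemma}}})
    return {"bool": {"should": clauses, "minimum_should_match": 1}}


query = build_term_clauses("shoes", {"shoes": "shoe"})
```

Terms without an entry in lemma_map get only the exact clause, which is one way the "important terms only" tagging could limit query expansion.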

Input

Uses field nlp (spacy.tokens.doc.Doc), which contains the analysed spaCy Doc (from SpacyNormalizer).

Output

Writes field lemma_map (dict), which maps each original term to its root form, e.g. {'original': 'lemmatized'}.
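
The shape of lemma_map can be sketched as follows. This is not the class's actual implementation: build_lemma_map and FakeToken are hypothetical names, and the lowercasing, the skipping of terms whose lemma equals the original, and the optional POS filter are assumptions. FakeToken only mimics the text, lemma_ and pos_ attributes of a spaCy Token so the example runs without a language model; tokens from a real spaCy Doc could be passed in the same way.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass
class FakeToken:
    """Stand-in exposing the spaCy Token attributes used below."""
    text: str
    lemma_: str
    pos_: str


def build_lemma_map(
    tokens: Iterable[FakeToken],
    important_pos: Optional[Set[str]] = None,
) -> Dict[str, str]:
    """Map each original term to its lemma, e.g. {'shoes': 'shoe'}."""
    lemma_map: Dict[str, str] = {}
    for tok in tokens:
        # Optionally restrict to "important" parts of speech (assumed behaviour).
        if important_pos is not None and tok.pos_ not in important_pos:
            continue
        original = tok.text.lower()
        lemma = tok.lemma_.lower()
        # Skip terms that are already in root form (assumed behaviour).
        if lemma != original:
            lemma_map[original] = lemma
    return lemma_map


tokens = [
    FakeToken("running", "run", "VERB"),
    FakeToken("shoes", "shoe", "NOUN"),
    FakeToken("for", "for", "ADP"),
]
all_lemmas = build_lemma_map(tokens)
noun_lemmas = build_lemma_map(tokens, important_pos={"NOUN"})
```

Here all_lemmas covers both "running" and "shoes", while noun_lemmas keeps only the noun, which mirrors the idea of tagging lemmas only for important terms.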

Parameters:

Methods Summary

process_doc(doc)    Process a document

Methods Documentation

process_doc(doc)

Process a document