LemmaExpander

class LemmaExpander(config)

Bases: BatchedStep

Keeps track of the lemmatised versions of relevant terms.

Note

  • Can use pos_mutations from POSBooster to tag lemmas only for important terms.

  • The ES query generator then uses this metadata to generate exact vs. lemmatised ES query clauses.
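To make the second note concrete, here is a minimal sketch of how a downstream query generator might turn a lemma map into exact and lemmatised clauses. It is not the actual ES query generator: the function name build_term_clauses, the field names "text" and "text.lemma", and the boost value are all hypothetical. Only the bool/should/match structure is standard Elasticsearch query DSL.

```python
from typing import Dict, List


def build_term_clauses(
    lemma_map: Dict[str, str],
    exact_field: str = "text",        # hypothetical field holding the raw text
    lemma_field: str = "text.lemma",  # hypothetical field holding lemmatised text
    exact_boost: float = 2.0,         # hypothetical boost favouring exact matches
) -> List[dict]:
    """Build one bool clause per term: an exact match plus an optional lemma match."""
    clauses = []
    for original, lemma in lemma_map.items():
        # The exact clause always targets the original surface form.
        should = [{"match": {exact_field: {"query": original, "boost": exact_boost}}}]
        # Add a lemmatised clause only when the lemma differs from the original.
        if lemma != original:
            should.append({"match": {lemma_field: {"query": lemma}}})
        clauses.append({"bool": {"should": should, "minimum_should_match": 1}})
    return clauses


clauses = build_term_clauses({"running": "run", "fast": "fast"})
```

In this sketch a term whose lemma equals its surface form gets only the exact clause, so unchanged terms do not produce redundant clauses.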

Input

  • Uses the field nlp (spacy.tokens.doc.Doc), which contains the analysed spaCy Doc (from SpacyNormalizer).

Output

  • Writes the field lemma_map (dict), which maps each original term to its root form, e.g. {'original': 'lemmatized'}.
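The input/output contract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the step's actual implementation. The Token dataclass is a stand-in that mirrors only the text and lemma_ attributes of a spaCy token, so the example runs without spaCy or a language model. The important_terms filter is a hypothetical simplification of how pos_mutations might restrict tagging.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass
class Token:
    """Stand-in for a spaCy token; only `text` and `lemma_` are used here."""
    text: str
    lemma_: str


def build_lemma_map(
    tokens: Iterable[Token],
    important_terms: Optional[Set[str]] = None,  # hypothetical filter, e.g. derived from pos_mutations
) -> Dict[str, str]:
    """Map each original term to its lemma, optionally keeping only important terms."""
    lemma_map: Dict[str, str] = {}
    for token in tokens:
        # Skip terms outside the important set when a filter is supplied.
        if important_terms is not None and token.text not in important_terms:
            continue
        lemma_map[token.text] = token.lemma_
    return lemma_map


tokens = [Token("running", "run"), Token("shoes", "shoe"), Token("for", "for")]
full_map = build_lemma_map(tokens)
filtered_map = build_lemma_map(tokens, important_terms={"running", "shoes"})
```

With a real spaCy Doc, iterating over the Doc yields tokens that expose the same text and lemma_ attributes, so the loop body would stay the same.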

Parameters:
  • step (str, "app") – app

  • type (str, "query_processing") – query_processing

  • name (str, "lemma_tagger") – lemma_tagger

  • analyzed_input_field (str, "nlp") – query

  • output_field (str, "lemma_map") – Tagged lemma terms

  • path (str, ".") – .
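For reference, the documented defaults can be written out as a plain dictionary. This is only a sketch: the variable names are hypothetical, and how the config object is actually constructed and passed to LemmaExpander is not specified here, so the example stops short of instantiating the class.

```python
# Documented parameter defaults, collected in a plain dict (illustrative only).
lemma_expander_defaults = {
    "step": "app",
    "type": "query_processing",
    "name": "lemma_tagger",
    "analyzed_input_field": "nlp",   # field holding the analysed spaCy Doc
    "output_field": "lemma_map",     # field receiving the {'original': 'lemmatized'} mapping
    "path": ".",
}

# Overriding a single value while keeping the other defaults;
# "query_lemmas" is a hypothetical field name.
custom_config = {**lemma_expander_defaults, "output_field": "query_lemmas"}
```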

Methods Summary

process_doc(doc)

Process a document

Methods Documentation

process_doc(doc)

Process a document

Parameters:

doc (Document) – Document

Returns:

Processed document

Return type:

Document