CustomSpacyNormalizer

class CustomSpacyNormalizer(config)

Bases: SpacyNormalizer

Overrides the spaCy tokenizer so that tokens are NOT split on / or - characters.

Parameters
  • step (str, "app") – app

  • type (str, "query_processing") – query_processing

  • name (str, "custom_spacy_normalizer") – custom_spacy_normalizer

  • input_fields (list, ["user_terms_str"]) – This step takes only one input field

  • infix_split_chars (str, "<>=/") – Additional characters that are used to split a single token.

  • merge_noun_chunks (bool, True) – merge noun chunks into a single token

  • product_recognizer__with_noun_num (bool, True) – Recognize consecutive NOUNs followed by a NUM as a PRODUCT entity

  • path (str, ".") – path
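To make the effect of infix_split_chars concrete, here is a minimal sketch in plain Python of what "splitting a single token on additional characters" can look like. It is not the library's implementation: the helper name split_on_infixes is hypothetical, and only the default value "<>=/" is taken from the parameter list above.

```python
import re


def split_on_infixes(token: str, infix_split_chars: str = "<>=/") -> list[str]:
    """Split one token on any of the given characters, keeping the separators."""
    if not infix_split_chars:
        # Nothing to split on: return the token unchanged.
        return [token]
    # Build a character class from the configured characters; re.escape keeps
    # characters such as "-" or "]" literal inside the class.
    pattern = re.compile("([" + re.escape(infix_split_chars) + "])")
    # The capturing group makes re.split return the separators as well; empty
    # pieces (from leading, trailing or adjacent separators) are dropped.
    return [piece for piece in pattern.split(token) if piece]


pieces = split_on_infixes("size>=10")
```

A character that is not listed, such as "-", leaves the token intact under this sketch, so a term like "usb-c" stays whole.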

Methods Summary

customise_spacy(nlp)

Customise spacy model.

Methods Documentation

customise_spacy(nlp)

Customise spacy model.

Lets custom steps that inherit from this step customise specific spaCy components, for example the behaviour of the Tokenizer.

Parameters
  • nlp (Language) – the spaCy pipeline to customise

Return type
  None
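As an illustration of the kind of customisation this hook is meant for, the sketch below replaces the tokenizer's infix rules so that "/" and "-" no longer split a token. It is an assumption-laden example rather than this class's actual code: the function is written standalone instead of as a method on a subclass, and the infix list is modelled on spaCy's default rules with the hyphen rules and the "/" character left out.

```python
import spacy
from spacy.lang.char_classes import (
    ALPHA,
    ALPHA_LOWER,
    ALPHA_UPPER,
    CONCAT_QUOTES,
    LIST_ELLIPSES,
    LIST_ICONS,
)
from spacy.language import Language
from spacy.util import compile_infix_regex


def customise_spacy(nlp: Language) -> None:
    """Swap in infix rules that do not split on "/" or "-"."""
    # Assumption: modelled on spaCy's default infix rules, minus the hyphen
    # rules and the "/" character.
    infixes = (
        LIST_ELLIPSES
        + LIST_ICONS
        + [
            # Arithmetic operators between digits (hyphen deliberately omitted).
            r"(?<=[0-9])[+\*^](?=[0-9-])",
            # A period between a lowercase and an uppercase letter.
            r"(?<=[{al}{q}])\.(?=[{au}{q}])".format(
                al=ALPHA_LOWER, au=ALPHA_UPPER, q=CONCAT_QUOTES
            ),
            # A comma between letters.
            r"(?<=[{a}]),(?=[{a}])".format(a=ALPHA),
            # Comparison-style characters between letters ("/" deliberately omitted).
            r"(?<=[{a}0-9])[:<>=](?=[{a}])".format(a=ALPHA),
        ]
    )
    nlp.tokenizer.infix_finditer = compile_infix_regex(infixes).finditer


# A blank English pipeline needs no model download.
nlp = spacy.blank("en")
customise_spacy(nlp)
tokens = [token.text for token in nlp("usb-c/hdmi adapter")]
```

In a subclass, the same body would go into an overridden customise_spacy method, which modifies the pipeline in place and returns None.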