CustomSpacyNormalizer

class CustomSpacyNormalizer(config)

Bases: SpacyNormalizer

Overrides the spaCy Tokenizer so that tokens are NOT split on / or - characters.

Parameters:

step (str, "app") – app
type (str, "query_processing") – query_processing
name (str, "custom_spacy_normalizer") – custom_spacy_normalizer
input_fields (list, ["user_terms_str"]) – This step only takes one input field
infix_split_chars (str, "<>=/") – Additional characters that are used to split a single token.
merge_noun_chunks (bool, True) – merge noun chunks into a single token
product_recognizer__with_noun_num (bool, True) – Recognize consecutive NOUNs followed by a NUM as a PRODUCT entity
path (str, ".") – path
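To illustrate what "characters that are used to split a single token" means for infix_split_chars, here is a minimal sketch in plain Python. It is not this step's implementation: the helper name split_on_infixes is hypothetical, and real spaCy infix rules also check the characters around the separator, which this sketch ignores.

```python
import re


def split_on_infixes(token, infix_split_chars="<>=/"):
    """Split one token on any of the given characters, keeping the separators.

    Hypothetical helper for illustration only; not part of the documented class.
    """
    if not infix_split_chars:
        return [token]
    # Build a character class from the configured characters; the capture
    # group makes re.split return the separators as well.
    pattern = "([" + re.escape(infix_split_chars) + "])"
    return [piece for piece in re.split(pattern, token) if piece]


print(split_on_infixes("usb/hdmi"))         # "/" is in the set, so it splits
print(split_on_infixes("usb/hdmi", "<>="))  # "/" removed from the set, no split
```

Dropping a character from the set is what keeps tokens such as usb/hdmi in one piece.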

Methods Summary

customise_spacy(nlp)

Customise spacy model.

Methods Documentation

customise_spacy(nlp)

Customise spacy model.

Lets custom steps that inherit from this step customise specific spaCy components, for example the behaviour of the Tokenizer.

Parameters:

nlp (Language) – the spaCy model to customise

Return type:

None
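As a rough sketch of the kind of tokenizer customisation such a hook could perform, the standalone function below rebuilds spaCy's infix rules without the patterns that split on / and -. It follows the pattern shown in spaCy's own tokenizer documentation and assumes spaCy 3.x; it is written as a free function with the same signature as the hook, and it is an assumption, not the actual body of this class's method.

```python
import spacy
from spacy.lang.char_classes import (
    ALPHA,
    ALPHA_LOWER,
    ALPHA_UPPER,
    CONCAT_QUOTES,
    LIST_ELLIPSES,
    LIST_ICONS,
)
from spacy.util import compile_infix_regex


def customise_spacy(nlp):
    """Replace the tokenizer's infix rules so "/" and "-" no longer split tokens."""
    infixes = (
        LIST_ELLIPSES
        + LIST_ICONS
        + [
            # Arithmetic operators between digits; "-" left out on purpose.
            r"(?<=[0-9])[+\*^](?=[0-9-])",
            r"(?<=[{al}{q}])\.(?=[{au}{q}])".format(
                al=ALPHA_LOWER, au=ALPHA_UPPER, q=CONCAT_QUOTES
            ),
            r"(?<=[{a}]),(?=[{a}])".format(a=ALPHA),
            # The default hyphen rule between letters is omitted entirely,
            # and "/" is dropped from the character set below.
            r"(?<=[{a}0-9])[:<>=](?=[{a}])".format(a=ALPHA),
        ]
    )
    nlp.tokenizer.infix_finditer = compile_infix_regex(infixes).finditer


nlp = spacy.blank("en")  # blank pipeline: tokenizer only, no model download
customise_spacy(nlp)
print([token.text for token in nlp("usb-c/hdmi adapter")])
```

The function mutates the pipeline in place and returns None, matching the documented return type.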
