Normalizer

class squirro.lib.nlp.steps.normalizers.Normalizer(config)

Bases: squirro.lib.nlp.steps.batched_step.BatchedStep

The Normalizer step applies the normalization selected by type to the configured fields of each Document.

Parameters
  • type (str) – Type of Normalizer (‘character’, ‘html’, ‘lowercase’, ‘punctuation’, or ‘stopwords’)

  • fields (list, []) – List of fields to normalize

  • input_fields (list, None) – List of fields to read the text to be normalized from

  • output_fields (list, None) – List of fields to write the normalized text to
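
As a rough illustration of how the parameters above fit together, the following sketch builds two configuration dictionaries using only the documented keys (type, fields, input_fields, output_fields). The helper check_normalizer_config and the field names body and clean_body are hypothetical and are not part of squirro.lib.nlp. How the resulting dictionary is passed to the Normalizer or embedded in a pipeline is not shown here, since this page does not describe it.

```python
# Allowed values for `type`, as listed in the parameter description above.
NORMALIZER_TYPES = {"character", "html", "lowercase", "punctuation", "stopwords"}


def check_normalizer_config(config):
    """Hypothetical helper (not part of squirro.lib.nlp).

    Sanity-checks a config dict against the documented parameters and
    returns it unchanged if it looks plausible.
    """
    if config.get("type") not in NORMALIZER_TYPES:
        raise ValueError(f"unknown normalizer type: {config.get('type')!r}")
    for key in ("fields", "input_fields", "output_fields"):
        value = config.get(key)
        # Each of these parameters is documented as a list (or None / empty).
        if value is not None and not isinstance(value, list):
            raise TypeError(f"{key} must be a list or None")
    return config


# Variant 1: name the fields to normalize via `fields`.
lowercase_config = check_normalizer_config({
    "type": "lowercase",
    "fields": ["body"],  # "body" is an illustrative field name
})

# Variant 2: name separate input and output fields.
html_config = check_normalizer_config({
    "type": "html",
    "input_fields": ["body"],
    "output_fields": ["clean_body"],  # illustrative field name
})
```

Whether fields and the input_fields/output_fields pair may be combined in a single configuration is not stated above, so the sketch keeps them in separate variants.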