Normalizer

class Normalizer(config)

Bases: BatchedStep

The Normalizer step applies the configured type of normalization to the selected fields of each Document.

Parameters:
  • type (str) – Type of Normalizer (‘character’, ‘html’, ‘lowercase’, ‘punctuation’, or ‘stopwords’)

  • fields (list, []) – List of fields to normalize

  • input_fields (list, None) – List of fields to normalize from

  • output_fields (list, None) – List of fields to normalize to
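
The parameters above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: `normalize_text`, `apply_config`, and `STOPWORDS` are hypothetical names, documents are modeled as plain dicts, and the exact behavior of each type is an assumption (in particular, 'character' is assumed here to mean Unicode NFKC normalization). The sketch also assumes that `input_fields` and `output_fields` are paired by position, and that `fields` normalizes each field in place when the other two are not given.

```python
import html
import re
import string
import unicodedata

# Hypothetical stopword list; the real step presumably uses its own.
STOPWORDS = {"a", "an", "and", "of", "the"}


def normalize_text(text, kind):
    """Apply one normalization type to a string (illustrative only)."""
    if kind == "character":
        # Assumption: fold compatibility characters with Unicode NFKC.
        return unicodedata.normalize("NFKC", text)
    if kind == "html":
        # Strip tags, then decode entities such as &amp;.
        return html.unescape(re.sub(r"<[^>]+>", "", text))
    if kind == "lowercase":
        return text.lower()
    if kind == "punctuation":
        # Remove ASCII punctuation characters.
        return text.translate(str.maketrans("", "", string.punctuation))
    if kind == "stopwords":
        return " ".join(w for w in text.split() if w.lower() not in STOPWORDS)
    raise ValueError(f"unknown normalizer type: {kind!r}")


def apply_config(doc, config):
    """Normalize a dict-based document according to a config dict."""
    fields = config.get("fields") or []
    # Assumption: explicit input/output lists take precedence over `fields`.
    input_fields = config.get("input_fields") or fields
    output_fields = config.get("output_fields") or input_fields
    for src, dst in zip(input_fields, output_fields):
        doc[dst] = normalize_text(doc[src], config["type"])
    return doc


# In-place normalization of a single field.
doc = apply_config({"title": "Hello WORLD"},
                   {"type": "lowercase", "fields": ["title"]})

# Reading from one field and writing to another keeps the original intact.
doc2 = apply_config(
    {"body": "<p>Tom &amp; Jerry</p>"},
    {"type": "html", "input_fields": ["body"], "output_fields": ["body_clean"]},
)
```

Under these assumptions, separate input and output fields are useful when the raw text must be preserved alongside its normalized form.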