Normalizers Package

Functions

make_normalizer(config)

Normalizer factory: creates a Normalizer from the given config.
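
The factory idea can be illustrated with a small, self-contained sketch. This is not the package's implementation: the `"type"` config key, the registry, the `normalize` method, and every class and function name below are hypothetical stand-ins chosen for illustration.

```python
import string


class LowercaseSketch:
    """Hypothetical stand-in for a lowercasing normalizer."""

    def __init__(self, config):
        self.config = config

    def normalize(self, text):
        return text.lower()


class PunctuationSketch:
    """Hypothetical stand-in for a punctuation-stripping normalizer."""

    def __init__(self, config):
        self.config = config

    def normalize(self, text):
        # Delete every ASCII punctuation character.
        return text.translate(str.maketrans("", "", string.punctuation))


# Assumed mapping from a config "type" value to a class; the real package
# may select its normalizers in a different way.
_REGISTRY = {
    "lowercase": LowercaseSketch,
    "punctuation": PunctuationSketch,
}


def make_normalizer_sketch(config):
    """Look up the class named by config["type"] and instantiate it."""
    kind = config["type"]
    try:
        cls = _REGISTRY[kind]
    except KeyError:
        raise ValueError(f"unknown normalizer type: {kind!r}") from None
    return cls(config)


normalizer = make_normalizer_sketch({"type": "lowercase"})
print(normalizer.normalize("Hello World"))
```

A registry keeps the factory a plain dictionary lookup, so adding a new normalizer type in this sketch only requires one more entry.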

Classes

CharacterNormalizer(config)

The character Normalizer removes numeric digits.

EmailParseNormalizer(config)

The email parse Normalizer parses a string containing an email and extracts the email body.

HTMLNormalizer(config)

The HTML Normalizer removes HTML markup.

LowercaseNormalizer(config)

The lowercase Normalizer lowercases everything.

Normalizer(config)

The Normalizer step applies specific normalizations to fields for each Document.

PunctuationNormalizer(config)

The punctuation Normalizer strips punctuation from text.

SentimentTermNormalizer(config)

Extracts positive and negative terms/phrases from given text.

SpacyNormalizer(config)

Multilingual spaCy text analyzer.

StopwordsNormalizer(config)

The stopwords Normalizer strips stopwords from the text.
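
To make the kinds of normalization listed above concrete, here is a minimal standard-library sketch that applies several of them, in order, to selected fields of a document. It does not use the package's classes. The dictionary-shaped document, the tiny stopword set, and all function names are assumptions made for illustration.

```python
import re
import string
from html.parser import HTMLParser

# Illustrative stopword set only; a real list would be much longer.
STOPWORDS = {"the", "a", "an", "is", "of"}


class _TextExtractor(HTMLParser):
    """Collects text content and discards tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(text):
    parser = _TextExtractor()
    parser.feed(text)
    parser.close()
    return "".join(parser.parts)


def remove_digits(text):
    return re.sub(r"\d", "", text)


def lowercase(text):
    return text.lower()


def strip_punctuation(text):
    return text.translate(str.maketrans("", "", string.punctuation))


def remove_stopwords(text):
    # Splitting on whitespace also collapses repeated spaces.
    return " ".join(word for word in text.split() if word not in STOPWORDS)


def normalize_fields(document, fields, steps):
    """Return a copy of document with each step applied to the named fields."""
    result = dict(document)
    for field in fields:
        value = result[field]
        for step in steps:
            value = step(value)
        result[field] = value
    return result


doc = {"id": 7, "body": "<p>The price is 42 dollars, FINAL!</p>"}
steps = [strip_html, remove_digits, lowercase, strip_punctuation, remove_stopwords]
clean = normalize_fields(doc, ["body"], steps)
print(clean["body"])  # prints "price dollars final"
```

Order matters in this sketch: lowercasing runs before stopword removal so that "The" matches the lowercase stopword list, and HTML is stripped first so that punctuation removal does not mangle the tags.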