The email parser normalizer parses an email string to extract the email body. Given a non-HTML email string, it extracts the body and then applies additional cleaning to remove footers that the Python email parser does not strip.
- Find a regex match for "Body:" in the email string.
- Extract the body using the Python email parser.
- Given a list as discard_footers (e.g. ["Best regards", "Warm Regards"]), discard the body text after the first appearance of a footer string.
Example input:
From: Squirro\nTo: Hi,\nI hope to find you well. In the emails before youve learned more about our Insights Engine.\nBest regards, Squirro
Example output:
I hope to find you well. In the emails before youve learned more about our Insights Engine.
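The three steps described above can be sketched roughly as follows. This is a minimal illustration, not the normalizer's actual implementation: the function name `extract_body` is hypothetical, and the way the "Body:" match and the parser are combined (marker first, parser as fallback) is an assumption, since the description does not spell it out.

```python
import re
from email import message_from_string


def extract_body(raw_email, discard_footers=()):
    """Hypothetical helper illustrating the described steps."""
    # Step 1: look for a literal "Body:" marker; if found, keep the text after it.
    marker = re.search(r"Body:", raw_email)
    if marker:
        body = raw_email[marker.end():]
    else:
        # Step 2 (assumed fallback): let the standard-library email parser
        # separate headers from the body of a plain, non-multipart message.
        body = message_from_string(raw_email).get_payload()

    # Step 3: cut the body at the earliest occurrence of any footer string.
    cut = len(body)
    for footer in discard_footers:
        pos = body.find(footer)
        if pos != -1:
            cut = min(cut, pos)
    return body[:cut].strip()


raw = (
    "From: a@example.com\nTo: b@example.com\n\n"
    "Hi,\nI hope to find you well.\nBest regards, Squirro"
)
print(extract_body(raw, ["Best regards", "Warm Regards"]))
```

The footer check here is a plain case-sensitive substring search; whether the real step matches case-insensitively or only at line starts is not stated in the description.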
type (str) – email_parse
discard_footers (list) – Discard the text after the first occurrence of any of these footer strings.
input_fields (list) – Fields containing the email text to clean
output_fields (list) – Fields in which to record the cleaned email body
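Put together, a step configuration using these parameters might look like the sketch below, written as a Python dictionary. Only `type`, `discard_footers`, `input_fields` and `output_fields` come from the parameter list above; the `"step": "normalizer"` key and the field names `body` and `clean_body` are assumptions made for illustration.

```python
# Hypothetical configuration for the email_parse normalizer step.
email_parse_step = {
    "step": "normalizer",  # assumption: not part of the documented parameters
    "type": "email_parse",
    "discard_footers": ["Best regards", "Warm Regards"],
    "input_fields": ["body"],         # hypothetical field holding the raw email
    "output_fields": ["clean_body"],  # hypothetical field for the cleaned body
}
```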
Process a document