ProximityFilter
- class ProximityFilter(config)
Bases: RegexFilter
The proximity RegexFilter filters every Document by checking whether terms occur within a specified proximity of each other.
- Note
The format for proximity rules is based on the Squirro Phrase Search syntax (e.g. "issue shares~6"). To make the search one-directional, append "|" (e.g. "issue shares~6|"). A rule can contain more than two terms, in which case the distance applies between consecutive terms, or a single exact-match term with no proximity distance.
Expressions are case-insensitive.
Each rule is limited to 20 words to bound the complexity of the generated regex.
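To make the rule syntax concrete, here is a minimal Python sketch of how such a rule could be turned into a regular expression. It is not Squirro's implementation: the helper names parse_rule and rule_to_regex are hypothetical, and the sketch assumes that a distance of N means "at most N intervening words" and that a rule without "|" matches the terms in either order.

```python
import re

MAX_WORDS = 20  # per-rule word limit stated in the note above


def parse_rule(rule):
    """Split a rule such as 'issue shares~6|' into (terms, distance, one_directional)."""
    rule = rule.strip()
    one_directional = rule.endswith("|")
    if one_directional:
        rule = rule[:-1]
    body, sep, dist = rule.rpartition("~")
    if sep:
        terms, distance = body.split(), int(dist)
    else:
        # No "~": treat the rule as an exact match with no gap allowed.
        terms, distance = rule.split(), 0
    if not terms or len(terms) > MAX_WORDS:
        raise ValueError(f"rule must contain 1 to {MAX_WORDS} words: {rule!r}")
    return terms, distance, one_directional


def _chain(terms, distance):
    # Assumption: up to `distance` intervening words between consecutive terms.
    gap = r"\W+(?:\w+\W+){0,%d}" % distance
    return r"\b" + gap.join(re.escape(t) for t in terms) + r"\b"


def rule_to_regex(rule):
    """Compile a proximity rule into a case-insensitive pattern (hypothetical helper)."""
    terms, distance, one_directional = parse_rule(rule)
    alternatives = [_chain(terms, distance)]
    if not one_directional and len(terms) > 1:
        # Assumption: without "|" the terms may also appear in reverse order.
        alternatives.append(_chain(terms[::-1], distance))
    return re.compile("|".join(alternatives), re.IGNORECASE)


pattern = rule_to_regex("appoint CEO~3")
print(bool(pattern.search("The board will appoint a new CEO today")))
```

Under these assumptions, appending "|" drops the reversed alternative, so "appoint CEO~3|" would no longer match text in which "CEO" precedes "appoint".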
Input - all input fields need to be of type str.
Output - the output field is filled with data of type str.
- Parameters:
Example
{
    "step": "filter",
    "type": "proximity",
    "fields": ["body"],
    "matching_label": "m&a",
    "output_field": "prediction",
    "whitelist_terms": ["appoint CEO~3"]
}
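Before handing a step like this to a pipeline, it can help to check the rules against the 20-word limit. The sketch below is one possible pre-flight check, not part of the library: rule_word_count and too_long_rules are hypothetical helpers, and it assumes that the limit counts only the terms, not the "~N" distance or the trailing "|".

```python
import json

MAX_WORDS_PER_RULE = 20  # limit stated in the note above

# The example step from this page, loaded as a plain dict.
step = json.loads("""
{
    "step": "filter",
    "type": "proximity",
    "fields": ["body"],
    "matching_label": "m&a",
    "output_field": "prediction",
    "whitelist_terms": ["appoint CEO~3"]
}
""")


def rule_word_count(rule):
    """Count the terms in a rule, ignoring '~N' and a trailing '|' (assumption)."""
    rule = rule.strip().rstrip("|")
    body = rule.rpartition("~")[0] or rule
    return len(body.split())


def too_long_rules(step_config):
    """Return the whitelist rules that exceed the per-rule word limit."""
    return [
        rule
        for rule in step_config.get("whitelist_terms", [])
        if rule_word_count(rule) > MAX_WORDS_PER_RULE
    ]


print(too_long_rules(step))
```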
Attributes Summary
Attributes Documentation
- REG_FLAGS = 2
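In Python's standard re module the integer 2 is the value of re.IGNORECASE, which fits the statement that expressions are case-insensitive. The short sketch below illustrates that correspondence. That the class passes REG_FLAGS to re in this way is an assumption, not something this page confirms.

```python
import re

REG_FLAGS = 2  # value documented above

# In the standard library, re.IGNORECASE has the integer value 2.
assert REG_FLAGS == re.IGNORECASE

# Assumption: the flag is used when compiling rule patterns.
pattern = re.compile(r"\bissue\b", REG_FLAGS)
print(bool(pattern.search("ISSUE shares")))  # prints True
```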