RegexFilter
class squirro.lib.nlp.steps.filters.RegexFilter(config)
Bases: squirro.lib.nlp.steps.filters.Filter
Filter documents based on supplied lists of blacklist and whitelist regexes.
Parameters
type (str) – regex
blacklist_regexes (list, []) – List of blacklist regexes to apply
fields (list) – Fields to apply the regexes to
matching_label (str, 'match') – Label given if regex matches
non_matching_label (str, 'no_match') – Label given if regex does not match
whitelist_regexes (list, []) – List of whitelist regexes to apply
rule_field (str, None) – Field to record the rule which triggered the match (mainly used in the context of proximity filters)
no_rule_matched_label (str, 'NO_RULE_MATCHED') – Rule value recorded if no regex matches (mainly used in the context of proximity filters)
default_language (str, 'en') – Default language if language_field is not present.
language_field (str, 'language') – Document field that gives the language.
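The parameters above can be collected into a step configuration. The following is a minimal sketch, not a confirmed Squirro pipeline definition: the parameter names and defaults are taken from the list above, while the "step" key, the field names, and the example patterns are assumptions chosen for illustration.

```python
import re

# Hypothetical configuration for a RegexFilter step.
# Parameter names and default values come from the parameter list;
# the "step" key, the field names and the patterns are assumptions.
regex_filter_config = {
    "step": "filter",                     # assumption: not stated in this reference
    "type": "regex",
    "fields": ["title", "body"],          # illustrative field names
    "whitelist_regexes": [r"\bmerger\b", r"\bacquisition\b"],
    "blacklist_regexes": [r"\bnewsletter\b"],
    "matching_label": "match",            # documented default
    "non_matching_label": "no_match",     # documented default
    "rule_field": None,                   # documented default
    "no_rule_matched_label": "NO_RULE_MATCHED",
    "default_language": "en",
    "language_field": "language",
}

# Sanity check that every pattern is a valid Python regex before it is
# handed to the filter; re.compile raises re.error on a malformed pattern.
all_patterns = (
    regex_filter_config["whitelist_regexes"]
    + regex_filter_config["blacklist_regexes"]
)
compiled_patterns = [re.compile(p) for p in all_patterns]
```

Compiling the patterns up front is only a convenience for catching malformed expressions early; how the filter itself combines whitelist and blacklist matches is not specified in this reference.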
Attributes Summary
Methods Summary
process_doc(doc): Process a document
Attributes Documentation
REG_FLAGS = 0
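If REG_FLAGS is passed as the flags argument when the regexes are compiled (an assumption, since this reference does not say how the attribute is used), a value of 0 means no flags are set, so matching is case-sensitive. The sketch below uses only Python's standard re module to show what that implies and how an inline flag can change it per pattern.

```python
import re

# Value documented above. Treating it as the `flags` argument of
# re.compile is an assumption made for this illustration.
REG_FLAGS = 0

text = "Merger announced"

# With flags == 0 the pattern is case-sensitive, so "merger" does not
# match the capitalised "Merger".
case_sensitive = re.compile(r"merger", REG_FLAGS)
found_case_sensitive = case_sensitive.search(text) is not None

# An inline flag inside the pattern enables case-insensitive matching
# without changing the flags argument.
inline_flag = re.compile(r"(?i)merger", REG_FLAGS)
found_inline = inline_flag.search(text) is not None

# Combining flags explicitly has the same effect.
explicit_flag = re.compile(r"merger", REG_FLAGS | re.IGNORECASE)
found_explicit = explicit_flag.search(text) is not None
```

Under that assumption, patterns that should ignore case need the inline `(?i)` prefix in the whitelist or blacklist entry itself.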
Methods Documentation