QueryModifier

class squirro.lib.nlp.apps.query_processing.QueryModifier(config)

Bases: squirro.lib.nlp.steps.batched_step.BatchedStep

Modify raw query using metadata collected from prior steps.

Parameters
  • step (str, "app") – app

  • type (str, "query_processing") – query_processing

  • name (str, "query_modifier") – query_modifier

  • raw_input_field (str, "query") – raw user query to modify

  • term_mutations_metadata (list, ["term_expansion_mutations","pos_mutations"]) – names of the metadata fields that hold the mutations to apply; mutations are applied in the listed order

  • output_field (str, "enriched_query") – the modified query string

  • path (str, ".") – path where the step file is located; required by the schema but not actually used
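
As a minimal sketch, the parameters above could be written out as a step configuration like the one below. The keys and default values are copied from the parameter list; the variable name `query_modifier_config` is hypothetical, and how the dictionary is embedded in a surrounding pipeline definition is not covered by this page.

```python
# Hypothetical configuration for the QueryModifier step.
# Keys and defaults mirror the documented parameter list; nothing else is assumed.
query_modifier_config = {
    "step": "app",
    "type": "query_processing",
    "name": "query_modifier",
    # Field that holds the raw user query to modify.
    "raw_input_field": "query",
    # Metadata fields written by prior steps; the order of this list matters.
    "term_mutations_metadata": ["term_expansion_mutations", "pos_mutations"],
    # Field that receives the modified query string.
    "output_field": "enriched_query",
}
```

Because the order of `term_mutations_metadata` matters, swapping the two entries may produce a different enriched query.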

Attributes Summary

REMOVE_PUNCTUATION

Methods Summary

append_string(raw_query_string, mutations)

Concatenate the mutations (which act as filters) with the raw query string.

process_doc(doc)

Process a document

replace_by_key_value(raw_query_string, mutations)

Apply mutations as defined from prior steps on raw_query_string.

Attributes Documentation

REMOVE_PUNCTUATION = '!"\\#\\$%\\&\'\\(\\)\\*\\+,\\-\\./:;<=>\\?@\\[\\\\\\]\\^_`\\{\\|\\}\\~'
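
The constant looks like a regex-escaped set of ASCII punctuation characters. The sketch below shows one way such a pattern could be used to strip punctuation from a query string. It assumes the value is equivalent to `re.escape(string.punctuation)` on Python 3.7 or later; `PUNCTUATION_PATTERN` and `strip_punctuation` are hypothetical names, and this page does not state how the class itself uses the attribute.

```python
import re
import string

# Assumption: an escaped punctuation set like REMOVE_PUNCTUATION is meant to be
# placed inside a regex character class.
PUNCTUATION_PATTERN = re.compile("[" + re.escape(string.punctuation) + "]")


def strip_punctuation(text: str) -> str:
    """Hypothetical helper: replace punctuation with spaces, then collapse whitespace."""
    without_punctuation = PUNCTUATION_PATTERN.sub(" ", text)
    return " ".join(without_punctuation.split())


print(strip_punctuation('(phone AND "Apple Inc.")'))  # prints: phone AND Apple Inc
```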

Methods Documentation

static append_string(raw_query_string, mutations)

Concatenate the mutations (which act as filters) with the raw query string.
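
The concatenation could look roughly like the sketch below. This is not the library's implementation: `append_string_sketch` is a hypothetical stand-in, and it assumes that `mutations` is an iterable of ready-made filter clauses joined to the query with single spaces. The real data structure and separator may differ.

```python
from typing import Iterable


def append_string_sketch(raw_query_string: str, mutations: Iterable[str]) -> str:
    """Hypothetical stand-in for append_string, not the Squirro implementation."""
    # Assumption: each mutation is already a complete filter clause.
    # Empty or whitespace-only clauses are skipped.
    clauses = [m.strip() for m in mutations if m and m.strip()]
    # Append the clauses after the original query, which is kept unchanged.
    return " ".join([raw_query_string.strip(), *clauses]).strip()


print(append_string_sketch("phone apple", ["language:en"]))  # prints: phone apple language:en
```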

process_doc(doc)

Process a document

Parameters

doc (Document) – Document

Returns

Processed document

Return type

Document

replace_by_key_value(raw_query_string, mutations)

Apply mutations as defined from prior steps on raw_query_string.

Note: The whole approach of mutation-dicts & regex-matching on the raw-query-string is not ideal.

The reason for this approach is to keep the user's original query syntax, such as logical grouping clauses, in place. For example, the user query (phone AND apple) could be rewritten as (phone^10 AND (apple^5 OR "Apple Inc."^0.7)).

Return type

str
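
The regex-based replacement described for replace_by_key_value can be illustrated with the sketch below. It is a hypothetical stand-in rather than the actual method: it assumes `mutations` is a dictionary mapping an original term to its replacement text, and it matches whole terms only, so that grouping parentheses and operators such as AND stay untouched.

```python
import re
from typing import Dict


def replace_by_key_value_sketch(raw_query_string: str, mutations: Dict[str, str]) -> str:
    """Hypothetical stand-in for replace_by_key_value, not the Squirro implementation."""
    if not mutations:
        return raw_query_string
    # Try longer keys first so that overlapping keys prefer the longer match.
    keys = sorted(mutations, key=len, reverse=True)
    # Match a key only when it is not embedded in a larger word.
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(k) for k in keys) + r")(?!\w)"
    )
    # A single substitution pass: replacement text is never re-scanned.
    return pattern.sub(lambda m: mutations[m.group(1)], raw_query_string)


mutations = {
    "phone": "phone^10",
    "apple": '(apple^5 OR "Apple Inc."^0.7)',
}
print(replace_by_key_value_sketch("(phone AND apple)", mutations))
# prints: (phone^10 AND (apple^5 OR "Apple Inc."^0.7))
```

Running all replacements in one pass avoids rewriting text that an earlier mutation has just inserted, for example the word apple inside the expanded clause.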