FullTextMatch

class FullTextMatch#

fulltext_match — Perform explicit lexical full-text matching on specified fields.

This profile can be applied at the retrieval (query) stage or at the rescore stage. It can be configured at the project level (for the query and rescore stages) or used inline within query syntax.

Programmatic Usage with Squirro Query Syntax#

Match terms across multiple fields with field-specific boosts and options.

Basic Example#

Match the term television across multiple fields with different boosts:

Search within key phrases (keywords.nlp_tag__phrases) with a boost factor of 10.
Search within the title (title.*.unstemmed/stemmed) with a boost factor of 3.
Use cross_fields scoring.
Require at least 75% of the terms to match across the specified fields.

profile:{fulltext_match fields:"nlp_tag__phrases^10,title^3"
        minimum_should_match:75%
        query_type:cross_fields
        text:television }
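When the profile is generated from application code, the query fragment can be assembled from its parts. The following is a minimal sketch, not part of the Squirro API: `build_fulltext_match` is a hypothetical helper, and it assumes that the parameters may be written on a single line separated by spaces and that values containing whitespace or commas must be quoted, as in the examples on this page.

```python
def build_fulltext_match(text, fields, query_type=None, minimum_should_match=None):
    """Assemble a profile:{fulltext_match ...} fragment (hypothetical helper)."""

    def quote(value):
        # Assumption: values with whitespace or commas need double quotes.
        # Embedded double quotes are not escaped in this sketch.
        value = str(value)
        return f'"{value}"' if any(c in value for c in " ,") else value

    parts = ["fulltext_match", f"fields:{quote(','.join(fields))}"]
    if minimum_should_match is not None:
        parts.append(f"minimum_should_match:{quote(minimum_should_match)}")
    if query_type is not None:
        parts.append(f"query_type:{quote(query_type)}")
    parts.append(f"text:{quote(text)}")
    return "profile:{" + " ".join(parts) + "}"


query = build_fulltext_match(
    "television",
    ["nlp_tag__phrases^10", "title^3"],
    query_type="cross_fields",
    minimum_should_match="75%",
)
print(query)
```

The printed fragment carries the same parameters as the basic example above, collapsed onto one line.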

Complex `minimum_should_match`#

Specifies how many terms of the query have to match on the specified fields.

The minimum_should_match syntax allows a combination of multiple rules, separated by whitespace.

Use quotes for more advanced minimum-should-match rules:

profile:{fulltext_match fields:"nlp_tag__phrases^10,title^3"
        minimum_should_match:"2<75% 4<50%"
        text:"television program"
        query_type:cross_fields}

Phrase Matching#

Enforce phrase-level matching by setting query_type:phrase:

profile:{fulltext_match fields:"nlp_tag__phrases^10,title^3"
        text:"television program"
        query_type:phrase }
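To illustrate what phrase-level matching means compared to plain term matching, the sketch below checks whether the query tokens occur consecutively and in order. It is a conceptual illustration only: the tokenizer is a naive lowercase whitespace split, the function names are hypothetical, and the search backend's real analysis (stemming, punctuation handling, scoring) is not modeled.

```python
def tokens(text):
    # Naive tokenizer: lowercase and split on whitespace (an assumption,
    # real analyzers do considerably more).
    return text.lower().split()


def matches_phrase(field_value, query):
    """True if the query tokens appear consecutively, in order, in the field."""
    doc, q = tokens(field_value), tokens(query)
    if not q:
        return False
    return any(doc[i:i + len(q)] == q for i in range(len(doc) - len(q) + 1))


def matches_all_terms(field_value, query):
    """True if every query token appears somewhere in the field, in any order."""
    doc, q = set(tokens(field_value)), tokens(query)
    return bool(q) and all(t in doc for t in q)


title = "A program about television history"
print(matches_all_terms(title, "television program"))  # terms present, any order
print(matches_phrase(title, "television program"))     # exact phrase required
```

In this example the title contains both terms but not the phrase, so only the term-level check succeeds.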

Usage in Project Configuration#

For details on configuring scoring profiles, see How to Configure Scoring Profiles (topic.search.document-scoring-profiles).

Rescoring Top-Ranked Documents#

Rescore documents based on query-term matches in a specific field (e.g., tag) with custom field boosts.

Final score formula:

doc_score = original_score * (1.2 * BM25_score(query_terms_match in `tag`))

"fulltext_inject_rescore": {
    "query": "profile:{fulltext_match text:{{query_terms}} fields:tag^10}",
    "stage": "rescore",
    "config": {
        "rescore_window_size": 100,
        "rescore_query_weight": 1.2,
        "rescore_score_mode": "multiply"
    }
}
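The effect of this configuration can be worked through with a small sketch that applies the formula above to a ranked result list. The function name and data layout are hypothetical. The sketch also assumes that documents outside the rescore window, or without a match for the rescore query, keep their original score and that only the window is re-sorted; the actual behavior is determined by the search backend.

```python
def rescore_hits(hits, rescore_scores, window_size=100, rescore_query_weight=1.2):
    """Apply doc_score = original_score * (weight * BM25_score) to the top hits.

    hits: list of (doc_id, original_score), sorted by score descending.
    rescore_scores: dict of doc_id -> BM25 score of the rescore query
                    (only documents that match the rescore query).
    """
    rescored = []
    for rank, (doc_id, original_score) in enumerate(hits):
        bm25 = rescore_scores.get(doc_id)
        if rank < window_size and bm25 is not None:
            score = original_score * (rescore_query_weight * bm25)
        else:
            # Assumption: no match or outside the window leaves the score as-is.
            score = original_score
        rescored.append((doc_id, score))
    # Only the rescore window is re-ordered; the tail keeps its position.
    window = sorted(rescored[:window_size], key=lambda hit: hit[1], reverse=True)
    return window + rescored[window_size:]


hits = [("a", 2.0), ("b", 1.5), ("c", 1.0)]
print(rescore_hits(hits, {"b": 3.0, "c": 9.0}, window_size=2))
```

With a window of 2, document `b` moves ahead of `a`, while `c` stays last even though it matches the rescore query, because it lies outside the window.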

Boosting Matching Documents#

Boost the original score of all documents where query terms match a specific field (e.g., tags).

Final score formula:

doc_score = original_score * 11 (if query terms match inside tags)

"fulltext_inject_scale_by": {
    "query": "scale_by:{ profile:{ fulltext_match text:{{query_terms}} fields:tags } }^11"
}
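Unlike the windowed rescore, this boost applies to every matching document. A minimal sketch of the formula above, using a hypothetical function name and assuming that non-matching documents keep their original score:

```python
def apply_scale_by(hits, matching_ids, factor=11.0):
    """Multiply the score of every matching document by `factor` and re-sort.

    hits: list of (doc_id, original_score).
    matching_ids: set of doc_ids whose `tags` field matches the query terms.
    """
    scaled = [
        (doc_id, score * factor if doc_id in matching_ids else score)
        for doc_id, score in hits
    ]
    return sorted(scaled, key=lambda hit: hit[1], reverse=True)


print(apply_scale_by([("a", 2.0), ("b", 0.5), ("c", 1.0)], {"b"}))
```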

pydantic model PluginConfig#

Fields:

fields (list[str] | None)
minimum_should_match (str | int | None)
query_type (Literal['cross_fields', 'best_fields', 'bool_prefix', 'most_fields', 'phrase', 'phrase_prefix'])
text (str | None)

PluginConfig.plugin_name: ClassVar[str] = 'fulltext_match'#: Used to register and reference the plugin within a query.

field PluginConfig.text: Optional[str] = ''#: Text to be used for search

field PluginConfig.fields: Optional[list[str]] = ['title', 'body']#: Comma-separated list of searchable fields. Supports field boosts using the ^ operator.
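As an illustration of the field^boost notation, the sketch below splits a comma-separated field specification into (field, boost) pairs. `parse_fields` is a hypothetical helper, not the plugin's own parser, and the default boost of 1.0 for fields without a ^ suffix is an assumption.

```python
def parse_fields(spec):
    """Split 'a^10,b^3' into [('a', 10.0), ('b', 3.0)]."""
    parsed = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # ignore empty entries such as a trailing comma
        name, sep, boost = item.partition("^")
        # Assumption: a field without an explicit boost is weighted 1.0.
        parsed.append((name, float(boost) if sep else 1.0))
    return parsed


print(parse_fields("nlp_tag__phrases^10,title^3"))
```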

field PluginConfig.query_type: Literal['cross_fields', 'best_fields', 'bool_prefix', 'most_fields', 'phrase', 'phrase_prefix'] = 'best_fields'#: Defines how the terms have to match on the provided fields. For more information, see query-type-parameters.

field PluginConfig.minimum_should_match: Union[str, int, None] = None#

Specifies how many terms of the query have to match on the content.

The official syntax allows a combination of multiple rules separated by whitespace. For example, the setting 3<75% 7<5 means:

1–3 tokens: all tokens have to match
4–7 tokens: 75% of tokens have to match
>7 tokens: at least 5 tokens have to match

If not defined, the project configuration default is applied; see topic.search.query-strategy.
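To make the rule syntax concrete, the following sketch computes how many terms must match for a given specification and term count. It assumes the rules are evaluated the way Elasticsearch documents its minimum_should_match parameter: conditional rules are listed in ascending order, percentages are rounded down, and negative values mean "all but this many". The function names are hypothetical, and the real evaluation happens in the search backend.

```python
def _simple(value, term_count):
    # Evaluate a single rule value such as "75%", "-25%", "5" or "-2".
    value = value.strip()
    if value.endswith("%"):
        calc = int(term_count * int(value[:-1]) / 100)  # truncates toward zero
    else:
        calc = int(value)
    result = term_count + calc if calc < 0 else calc
    # Clamp to the valid range (assumption based on the documented behavior).
    return max(0, min(result, term_count))


def required_matches(spec, term_count):
    """Number of query terms that must match for `spec` and `term_count`."""
    spec = str(spec).strip()
    if "<" not in spec:
        return _simple(spec, term_count)
    result = term_count  # at or below the first bound, all terms are required
    for rule in spec.split():
        bound, value = rule.split("<", 1)
        if term_count <= int(bound):
            return result
        result = _simple(value, term_count)
    return result


for n in (2, 3, 4, 7, 8, 10):
    print(n, required_matches("3<75% 7<5", n))
```

For `3<75% 7<5`, queries with up to three terms require all terms, queries with four to seven terms require 75% of them (rounded down), and longer queries require five.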