Typeahead Suggestions#

Profiles: Project Creator, Search User

The search bar within Squirro Cognitive Search offers a rich set of suggestions out of the box. Upon activation, it provides suggestions immediately, even before the user types anything.

By default, the search bar offers suggestions as the user types, based on settings that the project creator can configure.

After a search is performed, new suggestions appear based on the context of the initial query. The suggestions therefore stay relevant to the current search and are refined as the user continues to type.

Based on Document Features#

Typeahead suggestions can come from four different sources:

Content (Key Phrases), as described on this page.
Document title.
Labels and their values.
Communities.

Content-based Typeahead#

Squirro provides search completion based on document features. This is in addition to the typeahead based on label values.

Features#

Matching on key-phrases
Fuzzy phrase-term matching
- Optional enforcing of in-order term matching
- Support for phrase slop: maximum number of intervening unmatched positions
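
The in-order and slop options can be illustrated with a small, self-contained sketch. It is a simplified model for intuition only, not Squirro's implementation: the function span_matches and the sample sentence are hypothetical, each term is assumed to match exactly one whole token, and slop is counted as the number of positions inside the matched window that are not occupied by a matched term.

```python
from itertools import product


def span_matches(tokens, terms, slop=6, in_order=False):
    """Return True if every term occurs in tokens within the allowed slop."""
    # Collect all positions at which each term occurs.
    positions = [[i for i, tok in enumerate(tokens) if tok == term] for term in terms]
    if any(not found for found in positions):
        return False  # at least one term is missing entirely

    # Try every way of picking one position per term.
    for combo in product(*positions):
        if len(set(combo)) != len(combo):
            continue  # two terms may not share a position
        if in_order and list(combo) != sorted(combo):
            continue  # terms must appear in the order they were given
        # Positions inside the window that are not matched terms.
        unmatched = (max(combo) - min(combo) + 1) - len(combo)
        if unmatched <= slop:
            return True
    return False


tokens = "annual report on renewable energy investment".split()
print(span_matches(tokens, ["report", "energy"], slop=2))                 # True
print(span_matches(tokens, ["report", "energy"], slop=1))                 # False
print(span_matches(tokens, ["energy", "report"], slop=2, in_order=True))  # False
```

Two words ("on", "renewable") sit between "report" and "energy", so the pair matches with a slop of 2 but not with a slop of 1. With in-order matching enforced, the reversed pair no longer matches.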

Setup#

To enable the content-based typeahead, add the NLP Keyphrase Tagger pipelet (found in the Discover section, see Discover (NLP Tagger) Pipeline Step) to the corresponding pipeline workflow.

This enrichment creates a hidden label, nlp_tag__phrases, which contains the extracted key phrases.

Based on User Interaction#

User’s Saved Searches
User’s Search History
Popular queries (see Popular Query Suggestions) across the project (or for a single user).
User’s Collections

Using `SquirroClient`#

All suggestions can be accessed programmatically via the SquirroClient (Python SDK).

For this, see TopicApiBaseMixin, specifically get_typeahead_suggestions().
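
A minimal sketch of such a call is shown below. The helper fetch_suggestions is hypothetical, and the keyword arguments (project_id, searchbar_query, cursor_pos, max_suggestions) are assumptions that should be checked against the TopicApiBaseMixin reference for your SDK version. The cluster URL, token, and project ID in the usage comment are placeholders.

```python
def fetch_suggestions(client, project_id, text, max_suggestions=10):
    """Request typeahead suggestions for the text typed so far.

    `client` is expected to be an authenticated SquirroClient instance.
    The keyword names passed below are assumptions, see the lead-in.
    """
    return client.get_typeahead_suggestions(
        project_id=project_id,
        searchbar_query=text,
        cursor_pos=len(text),  # cursor placed at the end of the typed text
        max_suggestions=max_suggestions,
    )


# Usage against a real cluster (replace the placeholders):
#
#   from squirro_client import SquirroClient
#   client = SquirroClient(None, None, cluster="https://squirro.example.com")
#   client.authenticate(refresh_token="<refresh token>")
#   print(fetch_suggestions(client, "<project id>", "renew"))
```

Passing the client in as an argument keeps the helper independent of how authentication is set up in a given deployment.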

Configuration#

Typeahead Suggestions can be configured in Setup > Settings > Project Configuration.

You can configure which typeahead suggesters are used in each search state.

Project Configuration frontend.userapp.search.typeahead-suggesters

The full list of suggesters and their descriptions can be found here.

Suggestions Based on Document Content#

Project Configuration topic.typeahead.content.configuration

Reference: Configuration Schema

pydantic model ContentTypeaheadConfig#

Content-based Typeahead configuration.

For more information about the Typeahead Suggestions feature in Squirro see: https://docs.squirro.com/en/latest/technical/search/features/typeahead.html

Fields:

fuzziness (str)
fuzzy_max_expansions (int)
fuzzy_prefix_length (int)
highlighter (squirro.common.config_schema.typeahead.TypeaheadHighlighter)
in_order (bool)
slop (int)
terminate_after (int)

field fuzziness: str = 'AUTO'#: Specifies the maximum edit distance allowed for matching. Accepted values are 0, 1, or 2 (the maximum allowed Levenshtein edit distance, that is, the number of edits) and AUTO (derives the edit distance from the length of the term).

field fuzzy_max_expansions: int = 1#: Maximum number of variations created.

field fuzzy_prefix_length: int = 2#: Number of beginning characters left unchanged when creating expansions.
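
To make the interaction between fuzziness and fuzzy_prefix_length concrete, here is a rough sketch. It assumes that AUTO follows the usual Elasticsearch thresholds (terms of up to 2 characters must match exactly, 3 to 5 characters allow one edit, longer terms allow two) and uses plain Levenshtein distance. The functions auto_fuzziness, levenshtein, and fuzzy_match are hypothetical helpers for illustration, not Squirro APIs.

```python
def auto_fuzziness(term):
    """Edit distance allowed for a term under the assumed AUTO thresholds."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


def levenshtein(a, b):
    """Classic dynamic-programming Levenshtein edit distance."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def fuzzy_match(typed, candidate, fuzziness="AUTO", prefix_length=2):
    """Check whether candidate is an acceptable fuzzy variant of typed."""
    # The first prefix_length characters must be identical.
    if typed[:prefix_length] != candidate[:prefix_length]:
        return False
    max_edits = auto_fuzziness(typed) if fuzziness == "AUTO" else int(fuzziness)
    return levenshtein(typed, candidate) <= max_edits


print(fuzzy_match("finacial", "financial"))  # True: one edit, prefix "fi" kept
print(fuzzy_match("inancial", "financial"))  # False: prefix differs
print(fuzzy_match("bank", "bask"))           # True: one edit allowed for 4 characters
```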

field highlighter: TypeaheadHighlighter = TypeaheadHighlighter.UNIFIED#: Highlighter type.

field in_order: bool = False#: Require matched spans to be in-order.

field slop: int = 6#: Maximum number of intervening unmatched positions.

field terminate_after: int = 100000#: Tradeoff between accurate matching and speed. The typeahead search request is terminated after N matching documents (per shard) are found. As a result, higher-scoring results may be missed.
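
The fields above can be collected into a single configuration value. The sketch below assumes that topic.typeahead.content.configuration accepts a JSON object keyed by these field names; the helper content_typeahead_config is hypothetical and only mirrors the documented defaults. The highlighter field is left out because its serialized value is not given here.

```python
import json

ALLOWED_FUZZINESS = {"0", "1", "2", "AUTO"}

# Defaults copied from the field reference above.
DEFAULTS = {
    "fuzziness": "AUTO",
    "fuzzy_max_expansions": 1,
    "fuzzy_prefix_length": 2,
    "in_order": False,
    "slop": 6,
    "terminate_after": 100000,
}


def content_typeahead_config(**overrides):
    """Merge overrides into the documented defaults and validate them."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    config = {**DEFAULTS, **overrides}
    if config["fuzziness"] not in ALLOWED_FUZZINESS:
        raise ValueError(f"invalid fuzziness: {config['fuzziness']!r}")
    return config


# Stricter matching: terms must appear in order, at most two positions apart.
print(json.dumps(content_typeahead_config(in_order=True, slop=2), indent=4))
```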

Suggestions Based on Document Title#

Project Configuration topic.typeahead.title.configuration

Reference: Configuration Schema

pydantic model TitleTypeaheadConfig#

Title-based Typeahead configuration.

For more information about the Typeahead Suggestions feature in Squirro see: https://docs.squirro.com/en/latest/technical/search/features/typeahead.html

Fields:

enabled (bool)
fuzziness (str)
fuzzy_max_expansions (int)
fuzzy_prefix_length (int)
highlighter (squirro.common.config_schema.typeahead.TypeaheadHighlighter)
in_order (bool)
query_builder (squirro.common.config_schema.typeahead.TitleTypeahedQueryBuilder)
slop (int)
terminate_after (int)

field enabled: bool = True#: Enable title-based typeahead suggestions.

field fuzziness: str = 'AUTO'#: Specifies the maximum edit distance allowed for matching. Accepted values are 0, 1, or 2 (the maximum allowed Levenshtein edit distance, that is, the number of edits) and AUTO (derives the edit distance from the length of the term).

field fuzzy_max_expansions: int = 1#: Maximum number of variations created.

field fuzzy_prefix_length: int = 2#: Number of beginning characters left unchanged when creating expansions.

field highlighter: TypeaheadHighlighter = TypeaheadHighlighter.UNIFIED#: Highlighter type.

field in_order: bool = False#: Require matched spans to be in-order. This option is applicable only to the ‘span’ query builder.

field query_builder: TitleTypeahedQueryBuilder = TitleTypeahedQueryBuilder.SPAN#: The type of query builder to be used. If set to ‘span’ the span near query will be utilized. If set to ‘match’, the match query will be built. This option is intended for advanced users who understand its implications.

field slop: int = 6#: Maximum number of intervening unmatched positions. This option is applicable only to the ‘span’ query builder.

field terminate_after: int = 100000#: Tradeoff between accurate matching and speed. The typeahead search request is terminated after N matching documents (per shard) are found. As a result, higher-scoring results may be missed.
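
Because in_order and slop apply only to the ‘span’ query builder, it can help to filter them out when switching to ‘match’. The sketch below assumes that topic.typeahead.title.configuration accepts a JSON object keyed by the field names above and that the query builder is serialized as 'span' or 'match'. The helper title_typeahead_config is hypothetical.

```python
import json

# Options documented as applicable only to the 'span' query builder.
SPAN_ONLY_OPTIONS = ("in_order", "slop")


def title_typeahead_config(query_builder="span", **options):
    """Return (config, ignored) for the title-based typeahead."""
    if query_builder not in ("span", "match"):
        raise ValueError(f"unknown query_builder: {query_builder!r}")
    ignored = []
    if query_builder == "match":
        # Span-only options have no effect with the match query builder.
        ignored = [name for name in SPAN_ONLY_OPTIONS if name in options]
    config = {"enabled": True, "query_builder": query_builder}
    config.update({k: v for k, v in options.items() if k not in ignored})
    return config, ignored


config, ignored = title_typeahead_config("match", fuzziness="1", slop=2)
print(json.dumps(config, indent=4))
print("Dropped span-only options:", ignored)
```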

Suggestions Based on Label Values#

Project Configuration topic.typeahead.facetvalue.configuration

{
    "terminate_after": int,
    "sample_shard_size": int,
    "aggregation_method": "significant_terms",
    "community_drill_down": {
        "enabled": true,
        "for_top_n": 1
    }
}

Reference: Configuration Schema

terminate_after: Type: int

Required: False

Default: 15000

Tradeoff between accurate matching and speed. The typeahead search request is terminated after N matching documents (per shard) are found. As a result, higher-scoring results may be missed.

sample_shard_size: Type: int

Required: False

Default: 100

Facet Value suggestions are based on the Terms aggregation on the matching subset of Items. The sample_shard_size parameter limits how many top-scoring documents are collected in the sample processed on each shard. Lower values tend to return label-value suggestions faster (but in a less accurate order) and tighten the focus to high-relevance matches rather than the potentially very long tail of low-quality matches.

aggregation_method: Type: string

Required: False

Default: significant_terms

Choose between the significant_terms and terms aggregations to find the best-matching label values.

community_drill_down: Type: json

Required: False

Enable drilling down into the top-ranked communities by setting community_drill_down.enabled to true.
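
As a concrete instance of the schema above, the following sketch builds a value for topic.typeahead.facetvalue.configuration from the documented defaults. The helper facetvalue_typeahead_config is hypothetical, and the community_drill_down values are taken from the schema example rather than from a documented default.

```python
import json

AGGREGATION_METHODS = ("significant_terms", "terms")


def facetvalue_typeahead_config(aggregation_method="significant_terms",
                                sample_shard_size=100,
                                terminate_after=15000,
                                drill_down=True):
    """Build a label-value typeahead configuration as a plain dict."""
    if aggregation_method not in AGGREGATION_METHODS:
        raise ValueError(f"unknown aggregation_method: {aggregation_method!r}")
    return {
        "terminate_after": terminate_after,
        "sample_shard_size": sample_shard_size,
        "aggregation_method": aggregation_method,
        # Values as shown in the schema example above.
        "community_drill_down": {"enabled": drill_down, "for_top_n": 1},
    }


# A smaller sample per shard trades ordering accuracy for speed.
print(json.dumps(facetvalue_typeahead_config(sample_shard_size=50), indent=4))
```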
