Query Syntax

Query Syntax#

Profiles: Project Creator, Search User

You can use query syntax within Squirro search bars to return better, more refined search results.

This page details the query syntax techniques available within Squirro.

Project creators can configure query syntax options. Search users can use the techniques outlined in this page to improve search accuracy.

Download the Advanced Query Syntax Cheat Sheet PDF for a handy reference.

Introduction#

Per default, the title and body fields are taken into account when searching.

Sequences of query terms are combined using the OR operator, although the project’s configured minimum-should-match strategy also applies.

Dynamically tagged text labels can also be configured to be searchable.

Reference: For details, see How To Use Best-Bets Labels to Map Query Terms.

Boolean Operators#

Use AND, OR, NOT, + (plus sign) or - (minus sign) to explicitly combine terms. Be aware that the operators need to be in all capital letters.

The following restrictions apply:

The + or required operator requires that the term after the + symbol exists somewhere.
The - or prohibit operator excludes documents that contain the term after the - symbol.

Example Boolean Queries#

Query	Description
`squirro AND memonic`	Search documents that contain squirro and memonic.
`squirro OR memonic`	Search documents that contain either squirro or memonic.
`+memonic -squirro`	Search documents that contain memonic but do not contain squirro.
`squirro NOT memonic`	Search documents that contain squirro but do not contain memonic.

Grouping#

Use round brackets / parentheses for grouping.

Example Grouping Queries#

Query	Description
`(java AND solr) OR (python AND elasticsearch)`	Search documents that contain both java and solr, or documents that contain both python and elasticsearch.
`nektoon AND (squirro OR memonic)`	Search documents that contain nektoon and either squirro or memonic.

Phrase Search#

Use double quotes at the beginning and ending of a phrase to perform a phrase search. Phrase search is useful to make the search results more precise by making sure that terms have to be found within close distance (per default the distance is set to 5 terms).

You can also add a slop to the phrase with a tilde ~ to manually specify the allowed distance between terms.

Note: The tilde operator only works with phrase searches, not other types of queries.

Example Phrase Search Queries#

Query	Description
`"oracle financial services"~1`	Find documents where oracle, financial and services match in exact this order and within three terms.
`"oracle financial leasing"~3`	Find documents where oracle, financial and leasing must match but allow for up to 3 additional terms between them. The order of the terms is no longer strict, but swapping two words is equivalent to adding two words in terms of edit distance.
`"oracle financial services"`	Find documents where oracle, financial and services are found within the configured default `phrase_slop` distance. See project’s query strategy configuration (`topic.search.query-strategy::phrase.phrase_slop`)

Wildcard Search#

Find documents that contain terms matching a wildcard pattern. Wildcard term matching is applied on title, body, and searchable Labels.

Two wildcard operators are supported:

*, which matches zero or more characters
?, which matches any single character

Avoid using wildcard queries with leading * or ? patterns. This can increase iterations needed to find matching terms and thus cause very slow search performance.

Example Wildcard Queries#

Query	Description
`squirr*`	Search documents that contain e.g. for squirro and squirrel.
`*emonic`	Search documents that contain e.g. for memonic and mnemonic.
`te?t`	Search documents that contain e.g. for test and text.
`name:*`	Search documents that have e.g. the field “name” ¹
`-name:*`	Search documents that do not have e.g. the field “name” ¹
`name:squir*`	Search documents that contain the “name” field started by “squir”, e.g. name:squirro and name:squirrel. ¹

¹ Note that label names containing spaces need to be put inside quotes in queries

Field Search#

Only search in specific fields, see examples below:

Query	Description
`$title:France`	Search documents that have the term France in the title
`$body:France`	Search documents that have the term France in the document body
`$item_id:PgnAQM1FTSCP1uNOesoE7Q`	Search for a specific document by id
`$item_created_at>="2023-02-15T00:00:00"`	Search documents created after Feb. 14, 2023
`$item_created_at<"2015-02-01T00:00:00"`	Search documents created before Feb. 1, 2015
`$item_created_at>="now-7d/d"`	Search documents created in the last 7 days (see Elasticsearch Documentation )
`$_size > 100000`	Search documents with size > 100’000 bytes
`$link:"<url here>"`	Search documents including a specific url. Url must be in quotes.

Label Search#

Use any document label to restrict the search, see examples below:

Query	Description
`Country:France`	Search documents that have a label named Country with a value France
`Country:"United Kingdom"`	Search documents that have a label named Country with a value United Kingdom
"Mixed Sentiment":Yes :sub:`1`	Search documents that have a label named Mixed Sentiment with a value Yes

¹ Note that label names containing spaces need to be put inside quotes in queries

Important: Label search is case sensitive.

Term-Level Boosting#

Note

This feature only boosts individual terms or facets, to boost the full query clause see Boosting Queries By Optional Ranking Signals.

Individual elements of a query can be prioritized by boosting them. Note that sorting needs to be by relevance to notice the changed relevance scores.

Example Boosted Queries#

Query	Description
`France^10 Europe`	Search for France and Europe, but boost matches of “France”.
`France OR Country:France^10`	Search for France in full text, as well as the “Country” label and boost items that have the value defined in the country label.
`France^0.1 Europe`	Search for France and Europe, but de-prioritize matches of “France” (the default boost is 1.0).

Sorting#

You can use the following query syntax to sort the result:

sort:<field_name>[:<order>]

Where <field_name> is either date (default) or relevance or any item field name you want to sort by and <order> is either asc for ascending or desc for descending. The order suffix is optional, the default order is descending.²

Additionally, you can add a second (or third etc) sorting criteria by adding

[;<2nd_sort_field>[:<2nd_order>]]

to the query syntax.

² Note: The square brackets above mean that those fields are optional. Those brackets are not part of the syntax.

Example Sorting Queries#

Query	Description
`sort:date`	Sort by date (descending order by default)
`sort:date:asc`	Sort by date in ascending order
`sort:relevance:desc`	Sort by relevance in descending order
`sort:my_sortable_facet:desc;date:desc`	Sort by “my_sortable_facet” in descending order; additionally add a second sorting by descending date
`(Moon landing) sort:date`	Sort query by date (default order is descending)

Multiple Sorting Criteria#

Squirro supports multiple sorting criteria, meaning that when multiple items match a criteria (e.g. the exact date), the next sorting criteria is then applied to those items.

In terms of syntax, the sorting criteria are applied in the order defined in the query. The first sorting criterion is the primary one; the second sorting criterion is the secondary one, and so on.

For example, if you use the syntax sort:date:desc sort:$title:asc, Squirro will first sort results by date in descending order, and then sort by title in ascending (alphabetical A-Z) order.

Time Increment#

It is possible to control the time increments shown in the main timeline and in the dashboard widgets. To do so, add time_increment:<value> to a query.

Here is the Bugzilla Project without a time_increment set:

The same query, with time_increment:year

Possible values are:

time_increment:minute
time_increment:hour
time_increment:day
time_increment:week
time_increment:month
time_increment:quarter
time_increment:year

This can also be combined with values for more flexibility. For example:

time_increment:12hours
time_increment:4days
time_increment:8weeks
time_increment:6months
time_increment:3year

There is a performance impact when using a time increment that results in many individual increments. This impact is both in the user interface, where each increment needs to be drawn, as well as on the Elasticsearch level, where they need to be calculated. So use the time_increment setting carefully.

Entity Search#

Note

Entity search should only be performed by Project Creators with access to the Entities page (under Setup→Explore). You must know the exact entity name to perform Entity search.

Query syntax to search for items having entities satisfied some criteria:

entity:{< any query to match a single entity document >}

Example:

Search for Items containing a specific Entity of type company:
```
entity:{type:company AND name:"Thomson Reuters"}
```
Search for Items containing at least one company-typed Entity “Thomson Reuters” and another one Entity “Squirro”:
```
entity:{type:company AND name:"Thomson Reuters"} AND entity:{type:company AND name:Squirro}
```
Search for Items containing a specific Entity of type company with a confidence higher than 80%:
```
entity:{type:company AND name:"Thomson Reuters" AND confidence > 0.8}
```
Search for Items containing any Entity of type company with confidence higher than 70%:
```
entity:{type:company AND NOT confidence < 0.7}
```
Search for Items containing no Entity of type company with confidence higher or equal than 20%:
```
entity:{type:company AND confidence < 0.2}
```
Search for Items containing any Entity of type deal with at least a 70% confidence:
```
entity:{type:deal AND confidence > 0.7}
```

Search for Items containing a specific Entity of type deal:

entity:{type:deal AND properties.size:100 AND properties.region:US AND properties.industry:Tech AND properties.target:Whatsapp AND properties.acquirer:Facebook}

Search for Items containing one Entity with target Squirro and another Entity with target Whatsapp:

entity:{type:deal AND properties.target:Squirro AND properties.industry:Tech} AND entity:{type:deal AND properties.target:Whatsapp AND properties.industry:Tech}

Search for Items containing an Entity of type deal with a property size bigger than 100:
```
entity:{type:deal AND properties.size > 100}
```

Starred and Read Items#

Starred items are items marked as favorite / bookmarked items.

Note: You need to enable flags for your project(s) before you can query for starred and read items. To do so, set the enable_flags_for_project_ids property in topic.ini. For more details, see topic.ini.

Query syntax for (un)starred items and (un)read items:

is:starred
is:unstarred
is:read
is:unread

Scoring Profiles and Queries#

Accessing Scoring Profiles#

Scoring profiles use document metadata as additional filtering criteria to return the most relevant documents (according to the scoring profile).

Reference: For an introduction to scoring profiles see How to Use Scoring Profiles to Customize Document Relevancy Scoring.

Those profiles can be directly used in the query syntax using the profile:{} literal.

Scoring profiles can either reference a configured profile from the project configuration by name or leverage a plugin without any project configuration required.

Out-of-the-Box Plugins#

Plugins shipped out of the box by Squirro include the following:

last_read: boosts a user’s recently read items. The more recent the item was read, the higher the score.
popular_item: boosts popular items. The more popular the item within a project, the higher the score.
concept: runs concept search from within the search bar.
recommend_on_searches: recommend items based on users’ search histories.
subscribed_communities: add community filter queries for a user’s subscribed communities.
recency_boost: make recent documents more relevant.

Example: For more information about these plugins, see relevancy-scoring-plugins.

Example Queries#

The examples below show how different scoring profiles can be referenced in the search bar:

Filter by Last Read Items#

# Filter on last read user items with default settings
profile:{ last_read }

# Filter on last 5 read user items
profile:{ last_read count:5 }

Date Time recency Boosting#

# Boost documents by their item_creation date (default date field ranking)
profile:{ recency_boost }

# Boost documents by a custom date_time label, for example `last_updated`
profile:{ recency_boost date_field:last_updated }

Combinations of profiles#

# Search for specific documents within users search history, and boost resulting items by their date recency
elasticsearch tutorial profile:{ last_read count:1000 } profile:{ recency_boost }

# Run concept search on a given text with additional boosting of recently trendy items
profile:{ concept text:"covid outbreak" } profile:{ popular_item last_months:1 }

# Filter documents that belong to user's subscribed communities, and boost the resulting items by their date recency
profile:{ subscribed_communities } profile:{ recency_boost }--------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Boosting Queries By Optional Ranking Signals#

Warning

This feature only works if sort by relevancy is applied (sort:relevance)

In addition to ranking by profile, it is also possible to augment scoring by incorporating extra query clauses using the rank_by:{} and scale_by:{} literals.

These query clauses do not have to match a document for the document ot be retrieved, but matching documents receive a boost to their relevance score.

rank_by:{} clauses can be used to incorporate queries which will contribute to the total query score an additional score equal to the score of the wrapped query for each document (additive contribution).

scale_by:{} clauses can be used to incorporate queries which, when matched, will cause the final score of a scored document to be the overall query score multiplied by a factor defined by the magnitude assigned to this specific clause.

Note that these score boosting clauses, even when nested inside grouped clauses under some boolean structure, will instead still apply on the whole query. Thus `` (wifi rank_by:{source:official}) OR (login rank_by:{source:unofficial}) `` will yield results that match the terms wifi or login and will augment the score of documents that are tagged as official or unofficial in the source field.

Example Queries#

Query	Description
`cheese rank_by:{ Country:France^10 }`	Search for the term `cheese` but boost items that additionally are tagged with `Country:France`. The final score per document is the score for the term `cheese` added to the score for the country tag (which is multiplied by 10 before being added). Items that don’t match `France` are still returned.
`wifi login rank_by:{ source:official^5 OR is_faq:True^10}`	Search for the term sequence `wifi login` with most important relevancy signal beeing `is_faq` and the second signal being items coming from the `official` named datasource.
`rank_by:{ entity:{"properties.Sentiment":"negative"} } rank_by:{ source:official^5 OR is_faq:True^10}`	Boost items that are tagged on sentence level with a classification output (`entities`), in this case `negative` sentiment.
`arbitrage scale_by:{ $item_created_at>="2023-01-01" }^2`	Search for the term `arbitrage` and boost the score of documents that are created after 2023 by a factor of 2

It’s also possible to use scoring profiles within the rank_by clause to enable use-case-specific ranking, for example on the dashboard or widget level.

Query	Description
`rank_by:{ profile:{plugin:last_read $last_days:7} }`	Match all available items in the project, but boost the user’s recently-read items according to read time. The more recent, the more relevant.

Cheat Sheet#

Download the Advanced Query Syntax Cheat Sheet PDF shown in the image below for a handy reference guide.

Query Syntax

Contents

Query Syntax#

Introduction#

Boolean Operators#

Example Boolean Queries#

Grouping#

Example Grouping Queries#

Phrase Search#

Example Phrase Search Queries#

Wildcard Search#

Example Wildcard Queries#

Field Search#

Label Search#

Term-Level Boosting#

Example Boosted Queries#

Sorting#

Example Sorting Queries#

Multiple Sorting Criteria#

Time Increment#

Entity Search#

Starred and Read Items#

Scoring Profiles and Queries#

Accessing Scoring Profiles#

Out-of-the-Box Plugins#

Example Queries#

Boosting Queries By Optional Ranking Signals#

Example Queries#

Cheat Sheet#