How To Use Best-Bets Labels to Map Query Terms

Profiles: Project Creator, Search Engineer

This page provides an example of how to use best-bets labels to map query terms.

Reference: Learn more about Document Relevancy.

Relying on the default relevancy score is often good enough for homogeneous datasets, where all documents share a similar structure (for example, document size) and/or come from the same source repository or domain.

But what if the documents share similar content yet differ widely in domain and structure?

Search Use Case Walkthrough

In this example, a user wants to find a tutorial about a product, but does not know the exact product name or what kind of information exists.

The user searches for elastic search tutorial, and the initial result in that project is the following:

https://s3.amazonaws.com/download.squirro.net/docs/technical/search/relevancy/relevancy_best_bets-initial-result.png

Why are the wrong documents ranked highest?

  • The response contains many documents where the search matches on the title (title matches carry more weight by default), but none of these results are actually relevant.

    • Additionally, these top-ranked results have very short content and contain matches in both title and body, which results in a high score.

  • The expected document, titled Learning Elasticsearch, does not appear among the top results.

    • The terms elastic and search match, but the ebook contains a lot of text, so its overall relevancy score is not high enough. This is partly because BM25 similarity scoring takes document length into account and favors shorter documents by default.
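The length effect described above can be illustrated with a minimal sketch of the classic BM25 term score. The corpus statistics, term frequencies, and document lengths below are hypothetical values chosen for illustration; k1=1.2 and b=0.75 are the commonly used defaults, and the scoring used in your project may differ in detail.

```python
import math

def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Score one query term against one document with the classic BM25 formula."""
    # Inverse document frequency (smoothed variant that is always positive).
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalization: documents longer than average get a larger denominator.
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1) / (tf + norm)

# Hypothetical numbers: a short note with 2 matches vs. a long ebook with 50 matches.
short_note = bm25_term_score(tf=2, doc_len=50, avg_doc_len=1000, n_docs=1000, doc_freq=100)
long_ebook = bm25_term_score(tf=50, doc_len=100_000, avg_doc_len=1000, n_docs=1000, doc_freq=100)

print(f"short note: {short_note:.3f}")
print(f"long ebook: {long_ebook:.3f}")
```

Under these assumed numbers the short note outscores the ebook even though the ebook contains the term far more often, which mirrors the ranking problem in this example.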

How can this be improved?

Several techniques can be applied to bring the correct answer to the top.

Here we show how searchable labels can be used to apply the search tuning technique called Best Bets: tag a document with additional content that you expect users to search for when they are looking for that document.

As a project creator, you can analyze the query behavior of users to get a better understanding of which keywords are searched for most often.
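As a rough illustration of such an analysis, the sketch below counts the most frequent terms in a list of logged queries. The query log is hypothetical, and the whitespace tokenization is a simplifying assumption; how you export real query logs depends on your setup.

```python
from collections import Counter

def top_query_terms(queries, n=5):
    """Return the n most frequent lowercase terms across all logged queries."""
    counts = Counter()
    for query in queries:
        # Naive whitespace tokenization; a real analyzer would be more elaborate.
        counts.update(query.lower().split())
    return counts.most_common(n)

# Hypothetical query log, for illustration only.
query_log = [
    "elastic search tutorial",
    "Elastic Search guide",
    "elasticsearch tutorial",
    "search tutorial",
]

print(top_query_terms(query_log, n=3))
```

Frequent terms that do not appear in the expected document are good candidates for a best-bets label.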

Create Searchable Label Best Bets

Go to Data > Labels and create a new searchable label. This label stores the additional information used for document matching.

https://s3.amazonaws.com/download.squirro.net/docs/technical/search/relevancy/relevancy_best_bets-add-label.png

The boost for the searchable label is currently set in the project configuration:

https://s3.amazonaws.com/download.squirro.net/docs/technical/search/relevancy/relevancy_best_bets-boost.png

Whenever a query term matches the configured searchable label, the original score of the label match is multiplied by the label boost to promote the document overall.
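The effect of the boost can be sketched in a few lines. This is a simplified model, not the actual scoring implementation: it assumes that per-field match scores are summed, and the field scores and the label name best_bets are hypothetical.

```python
def boosted_score(field_scores, label_name, label_boost):
    """Combine per-field match scores, multiplying the label match by its boost."""
    total = 0.0
    for field, score in field_scores.items():
        # Only the searchable label's match score is multiplied by the boost.
        total += score * label_boost if field == label_name else score
    return total

# Hypothetical per-field scores for the ebook; "best_bets" is an assumed label name.
ebook_scores = {"title": 0.4, "body": 0.8, "best_bets": 2.0}

without_boost = boosted_score(ebook_scores, "best_bets", label_boost=1.0)
with_boost = boosted_score(ebook_scores, "best_bets", label_boost=5.0)
print(without_boost, with_boost)
```

A higher boost increases only the contribution of the label match, so documents without a matching label keep their original score.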

Annotate Target Document

Tag the target document with keywords, phrases, or alternate descriptions that are expected to match user queries. This maps additional user query vocabulary to the document, that is, content which is not available in the document itself.

Tag the document with the expected keywords: elastic search tutorial guide

https://s3.amazonaws.com/download.squirro.net/docs/technical/search/relevancy/relevancy_best_bets-tag-item.png

This technique is also useful for adding synonyms that are scoped to one document only.
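To see why the annotation helps, the sketch below compares which query terms match the document title and which match the label value. The item structure, the label name best_bets, and the exact-token matching are assumptions made for illustration; the real search analyzer tokenizes and matches differently.

```python
def matched_terms(query, text):
    """Return the query terms that also occur in the given text (case-insensitive)."""
    text_terms = set(text.lower().split())
    return [term for term in query.lower().split() if term in text_terms]

# Hypothetical item; the label name "best_bets" is an assumption.
item = {
    "title": "Learning Elasticsearch",
    "keywords": {"best_bets": ["elastic search tutorial guide"]},
}

query = "elastic search tutorial"
title_matches = matched_terms(query, item["title"])
label_matches = matched_terms(query, " ".join(item["keywords"]["best_bets"]))

print("title matches:", title_matches)
print("label matches:", label_matches)
```

With exact-token matching, the title alone matches none of the query terms, while the label value covers all of them.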

Result

For the same user query, the top-ranked document is now the expected ebook.

https://s3.amazonaws.com/download.squirro.net/docs/technical/search/relevancy/relevancy_best_bets-boosted-result.png