SemanticRescore#

class SemanticRescore#

semantic_rescore | Rescores the top N ranked paragraphs based on the similarity between the query embedding and the pre-computed paragraph embeddings. This is performed within the rescore phase inside Elastic.
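As a rough illustration of what a rescore phase looks like at the search-engine level, the sketch below builds an Elasticsearch-style search body with a `rescore` section. It is not the query this plugin actually generates: the helper name `build_rescore_request`, the field name `paragraph_embedding`, and the `match_all` first-pass query are assumptions made for the example. Only the general shape of the `rescore` section (`window_size`, `query_weight`, `rescore_query_weight`, `score_mode`) follows the Elasticsearch rescore API.

```python
from typing import Any, Dict, List


def build_rescore_request(
    query_vector: List[float],
    window_size: int = 1000,
    query_weight: float = 1.0,
    rescore_query_weight: float = 2.0,
    score_mode: str = "multiply",
    embedding_field: str = "paragraph_embedding",  # hypothetical field name
) -> Dict[str, Any]:
    """Build a search body whose top hits are rescored by vector similarity."""
    return {
        # Placeholder first-pass query (assumption); the real one comes from the user query.
        "query": {"match_all": {}},
        "rescore": {
            # Only the top `window_size` hits (per shard) are rescored.
            "window_size": window_size,
            "query": {
                "rescore_query": {
                    "script_score": {
                        "query": {"match_all": {}},
                        "script": {
                            # +1.0 keeps the script score non-negative.
                            "source": (
                                "cosineSimilarity(params.query_vector, "
                                f"'{embedding_field}') + 1.0"
                            ),
                            "params": {"query_vector": query_vector},
                        },
                    }
                },
                "query_weight": query_weight,
                "rescore_query_weight": rescore_query_weight,
                "score_mode": score_mode,
            },
        },
    }


body = build_rescore_request([0.1, 0.2, 0.3])
```

The defaults in the helper mirror the PluginConfig defaults documented on this page.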

pydantic model PluginConfig#
Fields
  • apply_phrases_to_encoder (bool)

  • embedding_type (Optional[squirro.common.clients.transformers.EmbeddingDataType])

  • normalize_embeddings (bool)

  • query_weight (float)

  • rescore_query_weight (float)

  • score_mode (squirro.lib.search.relevancy.plugins.retrieve.embedding_retrievers.RescoreScoreMode)

  • text (Optional[str])

  • window_size (int)

  • worker (str)

PluginConfig.plugin_name: ClassVar[str] = 'semantic_rescore'#

Used to register and reference the plugin within a query.

field PluginConfig.text: Optional[str] = ''#

Explicitly sets the text used as input for the embedding. If left empty, the terms of the overall user query are injected and embedded.

field PluginConfig.window_size: int = 1000#

The number of top-ranked paragraphs (N) to which semantic rescoring is applied.

field PluginConfig.query_weight: float = 1.0#

The weight applied to the score of the original query. Together with rescore_query_weight, it controls the relative importance of the original query and the rescore query.

field PluginConfig.rescore_query_weight: float = 2.0#

The weight applied to the score of the rescore query. Together with query_weight, it controls the relative importance of the original query and the rescore query.

field PluginConfig.score_mode: squirro.lib.search.relevancy.plugins.retrieve.embedding_retrievers.RescoreScoreMode = RescoreScoreMode.multiply#

Controls how the score of the original query and the score of the rescore query are combined. The default is multiply.
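To make the interplay of window_size, query_weight, rescore_query_weight, and score_mode concrete, here is a minimal pure-Python sketch. It assumes the combination follows the Elasticsearch rescore convention (each score is multiplied by its weight first, then the two are combined according to the mode) and that RescoreScoreMode mirrors the Elasticsearch modes; only multiply is confirmed by this page. The function names `combine_scores` and `rescore_top_n` are hypothetical.

```python
from typing import Dict, List, Tuple


def combine_scores(
    original: float,
    rescore: float,
    query_weight: float = 1.0,
    rescore_query_weight: float = 2.0,
    score_mode: str = "multiply",
) -> float:
    """Combine two scores after weighting (Elasticsearch-style, assumed)."""
    primary = query_weight * original
    secondary = rescore_query_weight * rescore
    if score_mode == "multiply":
        return primary * secondary
    if score_mode == "total":
        return primary + secondary
    if score_mode == "avg":
        return (primary + secondary) / 2
    if score_mode == "max":
        return max(primary, secondary)
    if score_mode == "min":
        return min(primary, secondary)
    raise ValueError(f"unknown score_mode: {score_mode}")


def rescore_top_n(
    hits: List[Tuple[str, float]],
    similarity: Dict[str, float],
    window_size: int = 1000,
) -> List[Tuple[str, float]]:
    """Rescore only the first `window_size` hits; the rest keep their order."""
    window, rest = hits[:window_size], hits[window_size:]
    rescored = [
        (doc_id, combine_scores(score, similarity[doc_id]))
        for doc_id, score in window
    ]
    rescored.sort(key=lambda hit: hit[1], reverse=True)
    return rescored + rest


# Hits are sorted by first-pass score; only the top 2 are rescored.
hits = [("a", 3.0), ("b", 2.0), ("c", 1.0)]
similarity = {"a": 0.1, "b": 0.9, "c": 0.99}
reranked = rescore_top_n(hits, similarity, window_size=2)
```

In this toy run, paragraph "b" overtakes "a" because of its higher similarity, while "c" stays last: it lies outside the window and is never rescored, however similar it is.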

field PluginConfig.apply_phrases_to_encoder: bool = True#

Query processing may rewrite the user query so that detected entities are matched exactly as a phrase. If this option is enabled, the additional phrase is appended to the user terms and used in the query embedding call.
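The effect of this flag on the text sent to the encoder can be sketched as follows. This is an illustration only: the function `build_encoder_input`, its parameters, and the space-separated joining are assumptions, not the plugin's actual implementation.

```python
from typing import Iterable


def build_encoder_input(
    user_terms: str,
    phrases: Iterable[str],
    apply_phrases_to_encoder: bool = True,
) -> str:
    """Return the text to embed, optionally with detected phrases appended."""
    if not apply_phrases_to_encoder:
        return user_terms
    # Append each detected phrase after the user terms (joining is an assumption).
    return " ".join([user_terms, *phrases]).strip()


with_phrases = build_encoder_input("interest rates", ["European Central Bank"])
without_phrases = build_encoder_input(
    "interest rates", ["European Central Bank"], apply_phrases_to_encoder=False
)
```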

field PluginConfig.worker: str = 'query-fast'#

Which deployed sentence-embeddings worker (@transformer-service) to use.

field PluginConfig.normalize_embeddings: bool = False#

If set to true, embeddings are normalized to unit length (length 1). In that case, the faster dot product is used instead of cosine similarity. Note that normalization works well only with the float embedding type.
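The reason the dot product can stand in for cosine similarity is that the two are identical for unit-length vectors, while the dot product skips the norm computation. A small standard-library sketch of that identity (helper names are illustrative):

```python
import math
from typing import List


def l2_normalize(vector: List[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: List[float], b: List[float]) -> float:
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    return dot(a, b) / (norm_a * norm_b)


query = [1.0, 2.0, 3.0]
paragraph = [2.0, 0.5, 1.0]

# On normalized vectors the plain dot product gives the cosine similarity.
fast = dot(l2_normalize(query), l2_normalize(paragraph))
reference = cosine_similarity(query, paragraph)
```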

field PluginConfig.embedding_type: Optional[squirro.common.clients.transformers.EmbeddingDataType] = None#

The data type used to encode embeddings: either float or byte. If set to byte, embeddings are quantized. If not set, the default type is read from the project configuration option topic.search.default-embedding-settings.
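To give an intuition for what byte quantization means, the sketch below maps float components in the range [-1, 1] to signed 8-bit integers. The scaling scheme (multiply by 127, round, clamp) is an assumption chosen for illustration; the quantization actually applied by the embedding service may differ.

```python
import math
from typing import List


def quantize_to_byte(vector: List[float]) -> List[int]:
    """Map floats in [-1, 1] to int8 values in [-128, 127] (assumed scheme)."""
    return [max(-128, min(127, round(x * 127))) for x in vector]


def length(vector: List[float]) -> float:
    return math.sqrt(sum(x * x for x in vector))


unit_vector = [0.6, 0.8]  # already has length 1
quantized = quantize_to_byte(unit_vector)

# Scaling back shows the rounding error: the vector is no longer exactly unit length.
restored_length = length([q / 127 for q in quantized])
```

Under this assumed scheme, rounding changes the vector length slightly, which is one plausible reason why normalization combined with the dot product is best suited to the float type.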