SemanticRescore

class SemanticRescore

semantic_rescore | Rescores the top N ranked paragraphs based on the similarity between the query embedding and the pre-computed paragraph embeddings. This is performed during the rescore phase inside Elasticsearch.
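The rescoring idea described above can be sketched in plain Python. This is a minimal illustration, not the plugin's implementation: the function names `cosine_similarity` and `rescore_top_n` are hypothetical, cosine similarity is assumed as the similarity measure, and the sketch simply reorders the top N results, whereas the actual rescore phase in Elasticsearch may combine the original score with the similarity score.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rescore_top_n(
    ranked_ids: List[str],
    paragraph_embeddings: Dict[str, Sequence[float]],
    query_embedding: Sequence[float],
    top_n: int,
) -> List[str]:
    """Reorder only the first top_n results by similarity to the query embedding.

    Results beyond top_n keep their original order (assumption for this sketch).
    """
    head, tail = ranked_ids[:top_n], ranked_ids[top_n:]
    rescored = sorted(
        head,
        key=lambda pid: cosine_similarity(query_embedding, paragraph_embeddings[pid]),
        reverse=True,
    )
    return rescored + tail


# Hypothetical toy data: four paragraphs with 2-dimensional embeddings.
ranked = ["p1", "p2", "p3", "p4"]
embeddings = {
    "p1": [1.0, 0.0],
    "p2": [0.0, 1.0],
    "p3": [0.6, 0.8],
    "p4": [0.0, 1.0],
}
reordered = rescore_top_n(ranked, embeddings, query_embedding=[0.0, 1.0], top_n=3)
```

Only the first three paragraphs are reordered here; `p4` stays in last place even though it matches the query well, which is the trade-off of rescoring a limited window.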

pydantic model PluginConfig
Fields:
  • apply_phrases_to_encoder (bool)

  • embedding_type (squirro.common.clients.transformers.EmbeddingDataType | None)

  • text (str | None)

  • vector_field (str | None)

  • worker (str)

PluginConfig.plugin_name: ClassVar[str] = 'semantic_rescore'

Used to register and reference the plugin within a query.

field PluginConfig.text: Optional[str] = ''

Explicitly sets the text used as input for the query embedding. If not set, the overall user-query terms are injected and embedded instead.

field PluginConfig.apply_phrases_to_encoder: bool = True

Query processing might rewrite the user query to match detected entities exactly as a phrase. If this option is enabled, the additional phrase is appended to the user terms and used in the query embedding call.
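How `text` and `apply_phrases_to_encoder` interact can be sketched as follows. This is an illustration under stated assumptions, not the plugin's code: `build_encoder_input` is a hypothetical helper, phrases are assumed to be joined to the user terms with a single space, and an explicitly set `text` is assumed to be used as-is without appended phrases.

```python
from typing import List


def build_encoder_input(
    user_terms: str,
    detected_phrases: List[str],
    text: str = "",
    apply_phrases_to_encoder: bool = True,
) -> str:
    """Return the string that would be sent to the embedding worker."""
    # An explicitly configured text overrides the user-query terms (assumption:
    # detected phrases are not appended in this case).
    if text:
        return text
    parts = [user_terms]
    # Append phrases produced by query processing only when the option is enabled.
    if apply_phrases_to_encoder:
        parts.extend(detected_phrases)
    return " ".join(part for part in parts if part)


with_phrases = build_encoder_input("acme quarterly report", ["acme corp"])
without_phrases = build_encoder_input(
    "acme quarterly report", ["acme corp"], apply_phrases_to_encoder=False
)
explicit_text = build_encoder_input(
    "acme quarterly report", ["acme corp"], text="revenue outlook"
)
```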

field PluginConfig.worker: str = 'query-fast'

The deployed sentence-embeddings worker (@transformer-service) to use.

field PluginConfig.embedding_type: Optional[EmbeddingDataType] = None

The data type used to encode embeddings: either float or byte. If set to byte, embeddings are quantized. If not set, the default type is read from the project configuration setting topic.search.default-embedding-settings.
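To make the float versus byte distinction concrete, the sketch below quantizes a float embedding into signed 8-bit integers. The documentation does not say how the quantization is performed, so the scheme used here (symmetric scaling by the largest absolute value) and the name `quantize_to_int8` are assumptions chosen purely for illustration.

```python
from typing import List, Sequence


def quantize_to_int8(vector: Sequence[float]) -> List[int]:
    """Map floats to integers in [-127, 127] by scaling with the max absolute value.

    Assumed scheme for illustration; the service may quantize differently.
    """
    max_abs = max((abs(v) for v in vector), default=0.0)
    if max_abs == 0.0:
        # An all-zero (or empty) vector has nothing to scale.
        return [0] * len(vector)
    scale = 127.0 / max_abs
    # Clamp defensively so every value fits into a signed byte.
    return [max(-128, min(127, round(v * scale))) for v in vector]


quantized = quantize_to_int8([1.0, -1.0, 0.0])
```

A byte vector takes a quarter of the storage of a 32-bit float vector at the cost of some precision in the similarity computation.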

field PluginConfig.vector_field: Optional[str] = None

The name of the vector field in Elasticsearch used for querying, e.g. 384-byte-intfloat/multilingual-e5-small.
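For quick reference, the documented fields and their defaults can be collected in a plain dictionary. This is only a summary sketch: the variable names are hypothetical, and how such options are actually passed to the plugin within a query is not covered here.

```python
# Defaults copied from the field documentation above.
semantic_rescore_defaults = {
    "text": "",
    "apply_phrases_to_encoder": True,
    "worker": "query-fast",
    "embedding_type": None,  # falls back to the project configuration
    "vector_field": None,
}

# Hypothetical override: pin the vector field to the example name from the docs.
semantic_rescore_options = {
    **semantic_rescore_defaults,
    "vector_field": "384-byte-intfloat/multilingual-e5-small",
}
```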