SemanticSearch
- class SemanticSearch
semantic | Perform kNN search on paragraph embeddings.
Uses the registered embeddings service to encode the query. Note: the index embeddings must be compatible with the query embeddings.
- pydantic model PluginConfig
- Fields
embedding_type (Optional[squirro.common.clients.transformers.EmbeddingDataType])
k (int)
knn_boost (float)
normalize_embeddings (bool)
num_candidates (int)
similarity_threshold (Optional[float])
text (str)
worker (str)
- PluginConfig.plugin_name: ClassVar[str] = 'semantic'
Used to register and reference the plugin within a query.
- field PluginConfig.worker: str = 'query-fast'
The deployed sentence-embeddings worker (@transformer-service) to use.
- field PluginConfig.normalize_embeddings: bool = False
If set to true, embeddings are normalized to unit length (length 1). In that case, the faster dot product is used instead of cosine similarity. Note that normalization works well only with the float embedding type.
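The equivalence behind this option can be checked with a minimal sketch. It uses plain Python lists rather than any Squirro API, and the helper names (`normalize`, `dot`, `cosine`) and the sample vectors are illustrative assumptions. For unit-length vectors, the dot product gives the same value as cosine similarity while skipping the per-comparison norm computation.

```python
import math


def normalize(vec):
    """Scale a vector to unit (L2) length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity: dot product divided by both vector lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Hypothetical query and paragraph embeddings (illustrative values only).
query = [3.0, 4.0]
doc = [1.0, 2.0]

q_unit, d_unit = normalize(query), normalize(doc)

# On normalized vectors the cheap dot product matches the cosine similarity.
assert math.isclose(dot(q_unit, d_unit), cosine(query, doc))
```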
- field PluginConfig.similarity_threshold: Optional[float] = None
The minimum similarity required for a vector to be considered a match (optional). The scale of the threshold depends on the similarity metric in use. It refers to the true similarity, before that value has been transformed into _score and the boost has been applied. To derive a threshold from an observed _score, use the corresponding inverted score function.
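As a rough illustration of inverting a score, the sketch below assumes the score transformations documented for Elasticsearch kNN search on float vectors (`(1 + similarity) / 2` for cosine and dot product, `1 / (1 + distance^2)` for L2) and assumes the boost is a plain multiplier on `_score`. This page does not confirm which backend or formulas apply, so treat the function names and formulas as assumptions to verify against your deployment.

```python
import math


def similarity_to_score(similarity: float, metric: str, boost: float = 1.0) -> float:
    """Forward transform: true similarity -> boosted _score (assumed formulas)."""
    if metric in ("cosine", "dot_product"):
        score = (1.0 + similarity) / 2.0
    elif metric == "l2_norm":
        # For l2_norm the "similarity" is a distance: smaller means closer.
        score = 1.0 / (1.0 + similarity ** 2)
    else:
        raise ValueError(f"unsupported metric: {metric}")
    return score * boost


def score_to_similarity(score: float, metric: str, boost: float = 1.0) -> float:
    """Inverse transform: boosted _score -> true similarity (assumed formulas)."""
    raw = score / boost  # undo the multiplicative boost first
    if metric in ("cosine", "dot_product"):
        return 2.0 * raw - 1.0
    if metric == "l2_norm":
        return math.sqrt(1.0 / raw - 1.0)
    raise ValueError(f"unsupported metric: {metric}")


# A hit observed with _score 1.8 under a boost of 2.0 and cosine similarity
# corresponds to a true cosine similarity of about 0.8.
threshold = score_to_similarity(1.8, "cosine", boost=2.0)
```

Under these assumptions, a value such as `threshold` is what would be passed as `similarity_threshold`, not the raw `_score`.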
- field PluginConfig.embedding_type: Optional[squirro.common.clients.transformers.EmbeddingDataType] = None
The data type used to encode embeddings: either float or byte. If set to byte, embeddings are quantized. If not set, the default type is read from the project configuration setting topic.search.default-embedding-settings.
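To give a feel for what quantization to byte means, here is a minimal sketch. The actual quantization scheme used by the embeddings service is not described here; the symmetric scale-and-round mapping to the range [-127, 127] and the helper name `quantize_to_byte` are assumptions made purely for illustration.

```python
import math


def quantize_to_byte(vec, scale=127):
    """Map floats in [-1, 1] to integers in [-scale, scale] (hypothetical scheme)."""
    return [max(-scale, min(scale, round(x * scale))) for x in vec]


# A unit-length float embedding (illustrative values only).
unit = [0.6, 0.8]

quantized = quantize_to_byte(unit)  # [76, 102]

# Mapping back to floats shows the rounding error introduced by quantization:
# the restored vector is close to, but no longer exactly, unit length.
restored = [q / 127 for q in quantized]
restored_norm = math.sqrt(sum(x * x for x in restored))
```

In this toy scheme the rounding step shifts the vector length slightly away from 1, which is one way precision is traded for a smaller storage footprint.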