SklearnTFIDFEmbedder
- class SklearnTFIDFEmbedder(config)
Bases: Embedder
The TFIDF Embedder encodes the provided text using the sklearn TfidfVectorizer.
Input - the input field must be of type str.
Output - the output field is filled with data of type numpy.ndarray.
- Parameters:
Example
{
    "step": "embedder",
    "type": "sklearn_tfidf",
    "name": "sklearn_tfidf",
    "input_field": "body",
    "model_kwargs": {
        "min_df": 5,
        "ngram_range": "1, 3"
    },
    "output_field": "embedded_body"
}
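To make the model_kwargs in the example configuration more concrete, here is a minimal sketch of what they mean when passed to sklearn's TfidfVectorizer directly. It is not the embedder's actual implementation: the parse_ngram_range helper and the sample documents are hypothetical, and it is only an assumption that the string "1, 3" ends up as the tuple (1, 3) that TfidfVectorizer expects.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer


def parse_ngram_range(value):
    """Hypothetical helper: turn the config string "1, 3" into the tuple (1, 3)."""
    low, high = (int(part) for part in value.split(","))
    return (low, high)


# Values taken from the example configuration; the string-to-tuple
# conversion is an assumption about how the step handles "ngram_range".
model_kwargs = {"min_df": 5, "ngram_range": parse_ngram_range("1, 3")}

# Made-up training documents. min_df=5 keeps only terms that occur in at
# least five documents, so the corpus must be large enough for some to survive.
docs = [
    "the parser reads the log file",
    "the parser writes the log file",
    "the parser skips the log file",
    "the parser rotates the log file",
    "the parser closes the log file",
    "a short unrelated note",
]

vectorizer = TfidfVectorizer(**model_kwargs)
vectorizer.fit(docs)

# transform() returns a sparse matrix; toarray() converts it to a dense
# numpy.ndarray with one row per input string.
embedded = vectorizer.transform(docs).toarray()
print(type(embedded), embedded.shape)
```

With a small corpus, a min_df of 5 can prune every term, in which case fitting raises a ValueError, so a lower value may be more practical while experimenting.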
Methods Summary
- load(): Load a step.
- process_batch(batch): Process a batch of documents.
- save(): Save a step.
- train(docs): Train a step on a set of documents.
Methods Documentation
- load()
Load a step.
- process_batch(batch)
Process a batch of documents. If not defined, it defaults to calling self.process_doc for each document in the batch.
- save()
Save a step.
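The fallback described for process_batch can be sketched as follows. This is only an illustration under stated assumptions: the Step base class and UppercaseStep subclass are hypothetical stand-ins, documents are assumed to be plain dicts, and whether the real method returns a new list or updates documents in place is not stated in this reference.

```python
class Step:
    """Hypothetical stand-in for the pipeline's step base class."""

    def process_doc(self, doc):
        raise NotImplementedError

    def process_batch(self, batch):
        # Default behaviour: apply process_doc to every document in turn.
        return [self.process_doc(doc) for doc in batch]


class UppercaseStep(Step):
    """Hypothetical step that defines only process_doc."""

    def process_doc(self, doc):
        # Copy the document so the input batch is left untouched.
        updated = dict(doc)
        updated["body_upper"] = doc["body"].upper()
        return updated


batch = [{"body": "first"}, {"body": "second"}]
result = UppercaseStep().process_batch(batch)
print(result)
```

A step that can vectorize many texts in one call would typically override process_batch itself rather than rely on this per-document loop.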