SklearnClassifier
- class SklearnClassifier(config)
Bases: Classifier
The sklearn Classifier is a generic scikit-learn classifier.
- Note - currently the following scikit-learn classifiers are supported:
- Input - the input field needs to be of type list[float or int] or numpy.ndarray. In addition, the types str, float, int, list[str] and numpy.sparse matrices are supported.
- Output - the output field is filled with data of type str, or of type dict{str: float} if the model has the property "predict_proba". The key of the dict is the predicted class name and the value is the probability/confidence returned by the model.
- Parameters:
Example
{
    "step": "classifier",
    "type": "sklearn",
    "model_type": "GaussianNB",
    "model_kwargs": {},
    "label_field": "label",
    "input_fields": ["embedded_extract"],
    "output_field": "prediction"
}
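The two output shapes described above (a plain str, or a dict{str: float} when the model has "predict_proba") can be illustrated with a minimal Python sketch. It is not the library's implementation: `format_prediction` and both stub models are hypothetical names introduced here, and the stubs only mimic the `classes_`, `predict` and `predict_proba` members that scikit-learn classifiers expose. The sketch also assumes that the dict holds a single entry for the predicted class, which is one reading of the description.

```python
from typing import Dict, List, Sequence, Union


class StubProbabilisticModel:
    """Hypothetical stand-in for a classifier that offers predict_proba."""

    classes_ = ["ham", "spam"]

    def predict(self, X: Sequence[Sequence[float]]) -> List[str]:
        return ["spam" if row[0] >= 0.5 else "ham" for row in X]

    def predict_proba(self, X: Sequence[Sequence[float]]) -> List[List[float]]:
        # One row of per-class probabilities per input, ordered like classes_.
        return [[1.0 - row[0], row[0]] for row in X]


class StubLabelOnlyModel:
    """Hypothetical stand-in for a classifier without predict_proba."""

    classes_ = ["ham", "spam"]

    def predict(self, X: Sequence[Sequence[float]]) -> List[str]:
        return ["spam" if row[0] >= 0.5 else "ham" for row in X]


def format_prediction(model, features: Sequence[float]) -> Union[str, Dict[str, float]]:
    """Return {predicted class: probability} if possible, else the label as str."""
    if hasattr(model, "predict_proba"):
        probabilities = list(model.predict_proba([features])[0])
        # Assumption: keep only the most probable class as the single dict key.
        best = max(range(len(probabilities)), key=probabilities.__getitem__)
        return {str(model.classes_[best]): float(probabilities[best])}
    return str(model.predict([features])[0])


if __name__ == "__main__":
    print(format_prediction(StubProbabilisticModel(), [0.8]))
    print(format_prediction(StubLabelOnlyModel(), [0.8]))
```

A consumer of the output field should therefore be prepared to handle both a str and a dict, depending on the configured model_type.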
Methods Summary
- load(): Load a step.
- process_batch(batch): Process a batch of documents.
- save(): Save a step.
- train(docs): Train a step on a set of documents.
Methods Documentation
- load()
Load a step
- process_batch(batch)
Process a batch of documents. If not defined, it defaults to calling self.process_doc for each document in the batch.
- save()
Save a step
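The default behaviour described for process_batch (falling back to self.process_doc for each document) might look roughly like the following Python sketch. `SketchStep` and `UppercaseLabelStep` are hypothetical classes written for illustration only; the real base class, its signatures and the document type are not shown in this reference and are assumed here to be plain dicts and lists.

```python
from typing import Any, Dict, List

Document = Dict[str, Any]  # assumption: a document is a plain dict


class SketchStep:
    """Hypothetical base class, not the library's actual implementation."""

    def process_doc(self, doc: Document) -> Document:
        raise NotImplementedError

    def process_batch(self, batch: List[Document]) -> List[Document]:
        # Default: no batch logic defined, so process documents one by one.
        return [self.process_doc(doc) for doc in batch]


class UppercaseLabelStep(SketchStep):
    """Toy subclass that defines only process_doc and inherits process_batch."""

    def process_doc(self, doc: Document) -> Document:
        updated = dict(doc)  # copy so the input document is left untouched
        updated["prediction"] = str(doc.get("label", "")).upper()
        return updated


if __name__ == "__main__":
    step = UppercaseLabelStep()
    print(step.process_batch([{"label": "a"}, {"label": "b"}]))
```

A step that can score many documents in a single model call would override process_batch directly instead of relying on this per-document fallback.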