SklearnClassifier
- class SklearnClassifier(config)
Bases: Classifier
The sklearn Classifier is a generic scikit-learn classifier.
- Note - currently the following scikit-learn classifiers are supported:
Input - the input field needs to be of type list[float or int] or numpy.ndarray. The types str, float, int, list[str] and numpy.sparse matrices are also supported.
Output - the output field is filled with data of type str, or of type dict{str: float} if the model has the property “predict_proba”. Each key of the dict is a predicted class name and each value is the probability/confidence returned by the model for that class.
- Parameters:
Example
{
  "step": "classifier",
  "type": "sklearn",
  "model_type": "GaussianNB",
  "model_kwargs": {},
  "label_field": "label",
  "input_fields": ["embedded_extract"],
  "output_field": "prediction"
}
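To make the input and output description above more concrete, here is a minimal sketch of how a configuration like this one might translate into scikit-learn calls. The `SUPPORTED` lookup table and the `build_model` and `predict` helpers are hypothetical names introduced for illustration only; the step's actual internals are not shown in this reference and may differ. The toy training data is made up.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Subset of the example configuration that concerns the model itself.
config = {"model_type": "GaussianNB", "model_kwargs": {}}

# Assumption: a plain name-to-class lookup. The real step may resolve
# "model_type" in a different way.
SUPPORTED = {"GaussianNB": GaussianNB}


def build_model(cfg):
    """Instantiate the estimator named in the config (hypothetical helper)."""
    return SUPPORTED[cfg["model_type"]](**cfg["model_kwargs"])


def predict(model, features):
    """Return dict{str: float} if the model has predict_proba, else a str."""
    x = np.asarray(features, dtype=float).reshape(1, -1)
    if hasattr(model, "predict_proba"):
        probabilities = model.predict_proba(x)[0]
        # classes_ is ordered the same way as the predict_proba columns.
        return {str(c): float(p) for c, p in zip(model.classes_, probabilities)}
    return str(model.predict(x)[0])


# Toy training data: two well-separated clusters with string labels.
X = np.array([[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]])
y = np.array(["neg", "neg", "pos", "pos"])

model = build_model(config)
model.fit(X, y)
prediction = predict(model, [5.1, 5.0])
```

Here `prediction` is a dictionary with one entry per class name, which matches the dict{str: float} output described above. A model without `predict_proba` would yield a single class name as a str instead.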
Methods Summary
load(): Load a step.
process_batch(batch): Process a batch of documents.
save(): Save a step.
train(docs): Train a step on a set of documents.
Methods Documentation
- load()
Load a step.
- process_batch(batch)
Process a batch of documents. If not defined, this defaults to calling self.process_doc for each document in the batch.
- save()
Save a step.
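The fallback described for process_batch can be sketched roughly as follows. The `Step` base class and the `UppercaseStep` subclass are hypothetical stand-ins written for this illustration; only the method names `process_batch` and `process_doc` come from the reference above, and the real base class may handle batches differently.

```python
class Step:
    """Hypothetical base class illustrating the per-document fallback."""

    def process_doc(self, doc):
        raise NotImplementedError

    def process_batch(self, batch):
        # Default behaviour: process each document of the batch in turn.
        return [self.process_doc(doc) for doc in batch]


class UppercaseStep(Step):
    """Toy step that defines only process_doc and inherits process_batch."""

    def process_doc(self, doc):
        updated = dict(doc)  # copy so the input document is left untouched
        updated["text"] = updated["text"].upper()
        return updated


batch = [{"text": "a"}, {"text": "b"}]
result = UppercaseStep().process_batch(batch)
```

A step that can handle many documents at once, such as a classifier calling its model on a stacked feature matrix, would typically override `process_batch` rather than rely on this loop.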