KFoldValidation

class squirro.lib.nlp.steps.classifiers.KFoldValidation(config)

Bases: squirro.lib.nlp.steps.classifiers.Classifier

The k-fold validation Classifier wrapper runs the specified Classifier under k-fold validation. From the results on the k unseen test sets, it creates the files required for the metrics and confusion matrix shown in the AI Studio Validation screens.

Note - The stored model is trained on all input data; k-fold validation is used only to produce the metrics.
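
The following is a minimal, library-independent sketch of the principle described above, not Squirro's implementation. It scores a classifier on k unseen test folds, then trains the model that would be stored on all data. `MajorityClassifier`, `k_fold_indices` and `k_fold_validate` are hypothetical names introduced only for this illustration.

```python
from collections import Counter


def k_fold_indices(n_items, k):
    """Yield (train_idx, test_idx) pairs; each item lands in exactly one test fold."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


class MajorityClassifier:
    """Hypothetical stand-in classifier: predicts the most common training label."""

    def train(self, labels):
        self.prediction = Counter(labels).most_common(1)[0][0]

    def predict(self):
        return self.prediction


def k_fold_validate(labels, k=5):
    """Return (accuracy measured on unseen folds, model trained on all data)."""
    correct = 0
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        # A fresh model per fold, so test items are never seen during training.
        fold_model = MajorityClassifier()
        fold_model.train([labels[j] for j in train_idx])
        correct += sum(labels[j] == fold_model.predict() for j in test_idx)
    # The fold models are discarded; the kept model is trained on everything.
    final_model = MajorityClassifier()
    final_model.train(labels)
    return correct / len(labels), final_model


labels = ["a"] * 8 + ["b"] * 2
accuracy, model = k_fold_validate(labels, k=5)
```

In this sketch the metrics come only from the per-fold models, while the returned model has seen every item, which is why the reported accuracy is an estimate for the stored model rather than a direct measurement of it.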

Input - Since this is a wrapper step, see the wrapped Classifier.

Output - Since this is a wrapper step, see the wrapped Classifier.

Parameters
  • type (str) – kfold

  • classifier (str) –

    model to be used

    Deprecated since version 3.2.0: Define classifier type using the classifier_params parameter.

  • classifier_params (dict, {}) – parameters of the wrapped Classifier, including its type

  • k (int, 5) – number of folds the dataset is split into

  • output_path (str) – k-fold output path

Example

{
    "step": "classifier",
    "type": "kfold",
    "k": 5,
    "label_field": "label",
    "output_field": "prediction",
    "output_path": "./ml_results.json",
    "classifier_params": {
        "step": "classifier",
        "type": "cosine_similarity",
        "input_fields": ["embedded_extract"],
        "output_field": "prediction",
        "label_field": "label"
    }
}
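
As a hedged convenience, the same step configuration can be assembled as a Python dictionary and serialized with the standard `json` module, which guarantees syntactically valid JSON (correct commas, no trailing separators). The keys and values mirror the example above; the variable names `kfold_step` and `serialized` are assumptions made for this sketch, and how the resulting JSON is handed to a pipeline is not shown here.

```python
import json

# Step configuration mirroring the example above.
kfold_step = {
    "step": "classifier",
    "type": "kfold",
    "k": 5,
    "label_field": "label",
    "output_field": "prediction",
    "output_path": "./ml_results.json",
    # Parameters of the wrapped classifier, including its type.
    "classifier_params": {
        "step": "classifier",
        "type": "cosine_similarity",
        "input_fields": ["embedded_extract"],
        "output_field": "prediction",
        "label_field": "label",
    },
}

# json.dumps emits strict JSON even though the Python literal uses trailing commas.
serialized = json.dumps(kfold_step, indent=4)
```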

Methods Summary

process(docs)

Process a set of documents

train(docs)

Train the step on a set of documents

Methods Documentation

process(docs)

Process a set of documents

Parameters

docs (generator(Document)) – Generator of documents

Returns

Generator of processed documents

Return type

generator(Document)

train(docs)

Train the step on a set of documents

Parameters

docs (generator(Document)) – Generator of documents

Returns

Generator of processed documents

Return type

generator(Document)
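
Both methods take a generator of documents and return a generator, so work happens only when the result is consumed. The sketch below illustrates that contract with a hypothetical stand-in class, `EchoLabelStep`, which is not part of squirro.lib.nlp. Plain dictionaries stand in for Document objects, and the field names are borrowed from the example configuration.

```python
class EchoLabelStep:
    """Hypothetical step; it mirrors the generator-in, generator-out contract only."""

    def __init__(self, label_field="label", output_field="prediction"):
        self.label_field = label_field
        self.output_field = output_field
        self.seen_labels = set()

    def train(self, docs):
        # Learn from each document, then pass it on unchanged.
        for doc in docs:
            self.seen_labels.add(doc[self.label_field])
            yield doc

    def process(self, docs):
        # Toy prediction: the alphabetically first label seen during training.
        fallback = min(self.seen_labels) if self.seen_labels else None
        for doc in docs:
            doc[self.output_field] = fallback
            yield doc


step = EchoLabelStep()
training_docs = [{"label": "spam"}, {"label": "ham"}]

# Generators are lazy: nothing is learned until the output is consumed.
trained = list(step.train(iter(training_docs)))
processed = list(step.process(iter([{"text": "hello"}])))
```

Calling `step.train(...)` without iterating over its result would leave `seen_labels` empty in this sketch, which is the main pitfall of generator-based steps.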