FastTextClassifier

class squirro.lib.nlp.steps.classifiers.FastTextClassifier(config)

Bases: squirro.lib.nlp.steps.classifiers.Classifier

FastText classifier (see: https://fasttext.cc/).

Parameters
  • type (str) – fasttext

  • cutoff (int, 100000) – Cutoff for quantization (in fastText, the number of words and n-grams to retain)

  • learning_rate (float, 1.0) – Learning rate

  • min_count (int, 1) – Minimum number of times a word must appear to be included in the dictionary

  • min_prob (float, 0.0) – Minimum prediction probability to return

  • n_epochs (int, 25) – Number of training epochs

  • n_grams (int, 2) – Maximum length of word n-grams

  • n_predictions (int, None) – Number of label predictions to return. By default this will be the number of unique labels.

  • quantize (bool, False) – Whether or not to quantize the model
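
The parameters above can be collected into a step configuration. The following is a minimal sketch, not the library's own code: the helper names `build_config` and `to_fasttext_kwargs` are hypothetical, and the mapping onto the keyword arguments of fastText's `train_supervised` (`lr`, `epoch`, `wordNgrams`, `minCount`) is an assumption about how this step might pass its settings through.

```python
# Defaults exactly as listed in the parameter list above.
DEFAULTS = {
    "type": "fasttext",
    "cutoff": 100000,
    "learning_rate": 1.0,
    "min_count": 1,
    "min_prob": 0.0,
    "n_epochs": 25,
    "n_grams": 2,
    "n_predictions": None,
    "quantize": False,
}


def build_config(**overrides):
    """Hypothetical helper: merge overrides into the documented defaults."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


def to_fasttext_kwargs(config):
    """Assumed mapping onto fasttext.train_supervised keyword arguments."""
    return {
        "lr": config["learning_rate"],
        "epoch": config["n_epochs"],
        "wordNgrams": config["n_grams"],
        "minCount": config["min_count"],
    }


config = build_config(n_epochs=50, quantize=True)
print(to_fasttext_kwargs(config))
```

Rejecting unknown keys early is a design choice of this sketch; it catches misspelled parameter names before any training starts.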

Methods Summary

clean()

Clean step

load()

Load a step

process_doc(doc)

Process a document

save()

Save a step

train(docs)

Train a step on a set of documents

Methods Documentation

clean()

Clean step

load()

Load a step

process_doc(doc)

Process a document

Parameters

doc (Document) – Document

Returns

Processed document

Return type

Document
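
The `min_prob` and `n_predictions` parameters presumably shape which labels end up on a processed document. The sketch below illustrates one plausible reading: keep at most `n_predictions` labels, ranked by probability, and drop any below `min_prob`. The function `select_predictions`, the score dictionary, and the inclusive threshold are all assumptions for illustration, not the actual implementation.

```python
def select_predictions(scores, min_prob=0.0, n_predictions=None):
    """Hypothetical post-filter over {label: probability} scores."""
    # Per the parameter description, the default is the number of unique labels.
    if n_predictions is None:
        n_predictions = len(scores)
    # Rank labels from most to least probable.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # Keep the top n, then drop anything below the probability floor.
    return [(label, prob) for label, prob in ranked[:n_predictions] if prob >= min_prob]


scores = {"sports": 0.7, "politics": 0.2, "tech": 0.1}
print(select_predictions(scores, min_prob=0.15))
print(select_predictions(scores, n_predictions=1))
```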

save()

Save a step

train(docs)

Train a step on a set of documents

Parameters

docs (generator(Document)) – Generator of documents

Returns

Generator of processed documents

Return type

generator(Document)
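
Because `train` is documented as returning a generator, callers may need to consume the result for the work to happen. The sketch below shows that behaviour with a hypothetical stand-in class, `ToyStep`, and plain dictionaries in place of `Document` objects. Whether the real step is lazy in this way is an assumption.

```python
from typing import Dict, Iterable, Iterator


class ToyStep:
    """Hypothetical stand-in for a trainable step; not the Squirro class."""

    def __init__(self) -> None:
        self.labels_seen = set()

    def train(self, docs: Iterable[Dict]) -> Iterator[Dict]:
        # A trainer needs the full set before fitting, so materialize it.
        batch = list(docs)
        for doc in batch:
            self.labels_seen.update(doc.get("labels", []))
        # Hand the documents back to the caller one by one.
        for doc in batch:
            yield doc


def make_docs() -> Iterator[Dict]:
    yield {"id": "1", "body": "goal scored", "labels": ["sports"]}
    yield {"id": "2", "body": "vote counted", "labels": ["politics"]}


step = ToyStep()
result = step.train(make_docs())
before = set(step.labels_seen)  # nothing has run yet: generator body is lazy
processed = list(result)        # iterating triggers the training pass
print(before, step.labels_seen, len(processed))
```

In this sketch, calling `train` without iterating over its return value leaves the step untrained, which is a common pitfall with generator-based APIs.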