SklearnProjector

class squirro.lib.nlp.steps.projectors.SklearnProjector(config)

Bases: squirro.lib.nlp.steps.projectors.Projector

Generic scikit-learn Projector. See http://scikit-learn.org/stable/index.html.

Parameters
  • type (str) – sklearn

  • model_type (str, 'svd') – Type of scikit-learn projection

  • model_kwargs (dict, {}) – Keyword arguments for the scikit-learn model

  • n_components (int) – Number of vector components after projection

  • normalize_output (bool, True) – Whether or not to normalize the output
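
As a rough illustration of what these parameters describe, the sketch below performs an equivalent projection directly with scikit-learn. It assumes that model_type 'svd' corresponds to sklearn.decomposition.TruncatedSVD and that normalize_output means L2-normalizing each output row; neither is confirmed by this page. The params dict and the random input matrix are hypothetical and are not the step's actual configuration format.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.preprocessing import normalize

# Hypothetical values mirroring the documented parameters.
params = {
    "type": "sklearn",
    "model_type": "svd",                  # assumed to mean TruncatedSVD
    "model_kwargs": {"random_state": 0},  # passed through to the model
    "n_components": 3,
    "normalize_output": True,
}

# Made-up input: 20 document vectors with 10 features each.
rng = np.random.default_rng(0)
vectors = rng.random((20, 10))

# Fit the projection and reduce each vector to n_components dimensions.
model = TruncatedSVD(
    n_components=params["n_components"], **params["model_kwargs"]
)
projected = model.fit_transform(vectors)

# Assumption: normalization rescales every projected row to unit L2 norm.
if params["normalize_output"]:
    projected = normalize(projected)
```

Here model_kwargs is forwarded unchanged to the scikit-learn constructor, which is the usual reason for exposing such a dictionary alongside the dedicated n_components option.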

Methods Summary

load()

Load a step

process_batch(batch)

Process a batch of documents.

save()

Save a step

train(docs)

Train a step on a set of documents

Methods Documentation

load()

Load a step

process_batch(batch)

Process a batch of documents. If a step does not define this method, it defaults to calling self.process_doc on each document in the batch.

Parameters

batch (list(Document)) – List of documents

Returns

List of processed documents

Return type

list(Document)
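
The per-document fallback described above can be sketched as follows. This is a minimal illustration, not the library's code: SketchStep, UppercaseTitle, and the dict-based Document alias are hypothetical stand-ins; only the method names process_batch and process_doc come from this page.

```python
from typing import Dict, List

# Hypothetical stand-in for the library's Document type.
Document = Dict[str, object]


class SketchStep:
    """Hypothetical base step showing the documented fallback."""

    def process_doc(self, doc: Document) -> Document:
        raise NotImplementedError

    def process_batch(self, batch: List[Document]) -> List[Document]:
        # Default behaviour: process each document individually.
        return [self.process_doc(doc) for doc in batch]


class UppercaseTitle(SketchStep):
    """Hypothetical step that defines only process_doc."""

    def process_doc(self, doc: Document) -> Document:
        doc["title"] = str(doc.get("title", "")).upper()
        return doc


# The subclass inherits process_batch, which delegates to process_doc.
processed = UppercaseTitle().process_batch([{"title": "a"}, {"title": "b"}])
```

A step such as a projector would typically override process_batch instead, since transforming all vectors in one call is cheaper than transforming them one at a time.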

save()

Save a step

train(docs)

Train a step on a set of documents

Parameters

docs (generator(Document)) – Generator of documents

Returns

Generator of processed documents

Return type

generator(Document)