FileLoader

class squirro.lib.nlp.steps.loaders.FileLoader(config)

Bases: squirro.lib.nlp.steps.loaders.Loader

The FileLoader scans a directory for files, loads each file either line by line or as a whole, and transforms the loaded content into Document items. The directory path of each loaded file is stored in the field named by directory_field.

Parameters
  • type (str) – loader type; set to file for this loader

  • directory_field (str) – name of the field in which to store the directory path of a loaded file

  • encoding (str, 'utf-8') – encoding of the files to load

  • output_field (str) – name of the field in which to store the loaded line or the whole file content

  • per_line (bool, False) – if true, each line of a file becomes a separate Document; otherwise each file is loaded as a whole

Example

{
    "step": "loader",
    "type": "file",
    "directory_field": "loaded_file_path",
    "output_field": "file_content"
}
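
The optional parameters can presumably be set in the same step configuration. The following is a minimal sketch, assuming the step accepts the keys listed under Parameters exactly as named; the encoding value is illustrative and the variable name step_config is hypothetical.

```python
import json

# Step configuration built as a Python dict and serialised to JSON.
# Assumption: optional parameters are passed as top-level keys,
# just like the required ones in the example above.
step_config = {
    "step": "loader",
    "type": "file",
    "directory_field": "loaded_file_path",
    "output_field": "file_content",
    "encoding": "latin-1",  # illustrative value; the documented default is 'utf-8'
    "per_line": True,       # one Document per line instead of one per file
}

print(json.dumps(step_config, indent=4))
```

With per_line left at its default of False, each file would be loaded as a single Document.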

Methods Summary

process(directory)

Process a directory and yield documents.

Methods Documentation

process(directory)

Process a directory and yield documents.

Parameters

directory (str) – path to the directory containing the *.txt files

Returns

Generator of Documents read from files

Return type

generator(Document)
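
The behaviour described above can be illustrated with a standalone sketch. This is not Squirro's implementation: load_directory and demo are hypothetical names, plain dicts stand in for Document items, and the sketch assumes that directory_field receives the parent directory of each file and that trailing newlines are stripped in per-line mode. The real loader may differ on both points.

```python
import tempfile
from pathlib import Path
from typing import Dict, Iterator, List, Tuple


def load_directory(
    directory: str,
    directory_field: str = "loaded_file_path",
    output_field: str = "file_content",
    encoding: str = "utf-8",
    per_line: bool = False,
) -> Iterator[Dict[str, str]]:
    """Yield one dict per *.txt file, or one per line when per_line is True."""
    # Sorted so the output order is deterministic across platforms.
    for path in sorted(Path(directory).glob("*.txt")):
        if per_line:
            with path.open(encoding=encoding) as handle:
                for line in handle:
                    yield {
                        directory_field: str(path.parent),
                        # Assumption: drop the trailing newline of each line.
                        output_field: line.rstrip("\n"),
                    }
        else:
            yield {
                directory_field: str(path.parent),
                output_field: path.read_text(encoding=encoding),
            }


def demo() -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Load two small files, once as whole files and once line by line."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "a.txt").write_text("first\nsecond\n", encoding="utf-8")
        Path(tmp, "b.txt").write_text("third\n", encoding="utf-8")
        # Consume the generators before the temporary directory is removed.
        whole = list(load_directory(tmp))
        lines = list(load_directory(tmp, per_line=True))
    return whole, lines
```

Because the function is a generator, files are read lazily as the caller iterates, which mirrors the documented generator(Document) return type.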