SquirroQueryLoader

class SquirroQueryLoader(config)

Bases: Loader

Loads items from a Squirro instance using the SquirroClient and transforms them into Document items.

Note - The fields client_id, client_secret, cluster, token, and project_id do not need to be set when the loader is used inside the Squirro machine learning service.

Parameters
  • type (str) – squirro_query

  • batch_size (int, 1000) – size of Squirro query batch

  • client_id (str, None) – Squirro client id

  • client_secret (str, None) – Squirro client secret

  • cluster (str) – Squirro cluster

  • token (str) – Squirro token

  • project_id (str) – id of Squirro project

  • merge_sub_items (bool, False) – whether to load content from sub_items and return it as a merged field

Example

{
    "step": "loader",
    "type": "squirro_query",
    "batch_size": 100,
    "cluster": "CLUSTER",
    "token": "TOKEN",
    "project_id": "PROJECT_ID",
    "fields": [ "body", "keywords.label" ]
}
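
The same step can be assembled and checked in Python before it is written out as JSON. The following is a minimal sketch, not part of the Squirro API: the helper build_loader_step and its inside_ml_service flag are hypothetical names, and treating cluster, token, and project_id as required outside the machine learning service is an assumption drawn from the note above.

```python
import json

# Assumption: these are the connection fields needed outside the
# Squirro machine learning service (client_id and client_secret default to None).
CONNECTION_FIELDS = ("cluster", "token", "project_id")


def build_loader_step(inside_ml_service=False, batch_size=1000, **options):
    """Return a squirro_query loader step as a plain dict (hypothetical helper)."""
    step = {"step": "loader", "type": "squirro_query", "batch_size": batch_size}
    step.update(options)
    if not inside_ml_service:
        # Fail early if a connection field is absent from the step.
        missing = [name for name in CONNECTION_FIELDS if name not in step]
        if missing:
            raise ValueError("missing connection fields: " + ", ".join(missing))
    return step


step = build_loader_step(
    batch_size=100,
    cluster="CLUSTER",
    token="TOKEN",
    project_id="PROJECT_ID",
    fields=["body", "keywords.label"],
)
# json.dumps always emits valid JSON, so separators cannot be forgotten.
print(json.dumps(step, indent=4))
```

Serializing through json.dumps rather than writing the object by hand avoids syntax slips such as a missing comma between keys.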

Methods Summary

process(dataset)

Process a query and yield documents.

Methods Documentation

process(dataset)#

Process a query and yield documents.

Arguments

dataset (dict): Dictionary containing query strings

Yields

Document – A document from a query
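
Because process yields documents rather than returning a list, callers consume results lazily. The sketch below illustrates that generator contract with stand-in data only; it is not the library's implementation. The function fetch_batch, the FAKE_ITEMS list, and the "query_string" key of dataset are all hypothetical, and plain dicts stand in for Document objects.

```python
from typing import Dict, Iterator, List

# Stand-in data; a real loader would fetch items from a Squirro project.
FAKE_ITEMS = [{"id": str(i), "body": f"item {i}"} for i in range(5)]


def fetch_batch(query: str, start: int, count: int) -> List[dict]:
    """Hypothetical stand-in for one batched query against the server."""
    return FAKE_ITEMS[start:start + count]


def process(dataset: Dict[str, str], batch_size: int = 2) -> Iterator[dict]:
    """Yield one document at a time, requesting batch_size items per call."""
    start = 0
    while True:
        batch = fetch_batch(dataset["query_string"], start, batch_size)
        if not batch:
            break  # no more results for this query
        for item in batch:
            yield item
        start += len(batch)


# Iterating the generator drives the batched fetching.
docs = list(process({"query_string": "*"}))
```

With this shape, a smaller batch_size means more round trips but less data held in memory per request.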