Bulk Exporter

The Bulk Exporter is part of the Squirro Toolbox. It can extract the content of a Squirro project and export it to a CSV file.

Basic Usage

The Bulk Exporter is run from the command line with the squirro_bulk_export command and requires an output file. The simplest invocation is as follows:

squirro_bulk_export ^
    --token ... ^
    --cluster https://next.squirro.net ^
    --project-id ... ^
    --out-file data.csv

Note: The example above uses Windows syntax, where a circumflex ^ at the end of a line continues the command on the next line.

Tip: On Mac and Linux, use a backslash \ at the end of each line instead.
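
The difference between the two continuation characters can be illustrated with a short Python sketch. The helper wrap_command is hypothetical and not part of the Squirro Toolbox; it only renders the text of the command and does not run it. It assumes the option values contain no spaces or shell metacharacters, so no quoting is applied.

```python
import os


def wrap_command(program, options, windows=None):
    """Render a multi-line command with the platform's continuation character.

    options is a list of (flag, value) pairs. Values are inserted as-is,
    so this sketch assumes they need no shell quoting.
    """
    if windows is None:
        # os.name is "nt" on Windows and "posix" on Mac and Linux.
        windows = os.name == "nt"
    continuation = "^" if windows else "\\"
    lines = [program] + [f"    {flag} {value}" for flag, value in options]
    # Every line except the last one ends with the continuation character.
    return f" {continuation}\n".join(lines)


# Placeholder values copied from the example above; replace "..." with real values.
options = [
    ("--token", "..."),
    ("--cluster", "https://next.squirro.net"),
    ("--project-id", "..."),
    ("--out-file", "data.csv"),
]

print(wrap_command("squirro_bulk_export", options, windows=False))
```

With windows=False the sketch prints the Mac and Linux form of the invocation, with each line except the last ending in a backslash. With windows=True it reproduces the circumflex form shown above.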

Arguments

The following table lists all of the arguments:

Argument

Mandatory

Description

General Options

-h

Show a help message and exit.

--version

Output the tool version and exit.

--verbose, -v

Increase log verbosity.

  • Not specified: the tool outputs all warnings and errors.

  • Specified once or more: informational messages are also output.

  • Specified twice or more: debugging messages are also shown.

  • Specified three times or more: more information is included in all messages.

--log-file FILE

Path to a log file on disk where the log output is written. If this is not specified, the log messages are shown on the console.

Connection Options (see Connecting to Squirro to find these values)

--token TOKEN
-t TOKEN

Yes

The Authentication Token with which to authenticate. If the token value starts with a dash, use an equals sign to specify the value, like this: --token="-12345…"

--cluster URL
-c URL

The Squirro Cluster from which to export the data.

--project-id PROJECT_ID

Yes

The Project Identifier from which to export the data.

Export Options

--out-file FILE
-o FILE

Yes

Output file where the CSV data will be stored.

--query QUERY

The query for which to export the results. Defaults to an empty query, which returns all the items in the project.

--batch-size BATCH_SIZE

Number of items requested from the server in one request. Increasing this from the default of 100 can improve export performance, but may degrade overall system performance.

--keyword-delimiter DELIMITER

The separator to use between multiple item keywords. Default is a comma.
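
When the export is started from a script, the arguments listed above can be assembled into an argument list. The following is a minimal Python sketch, not part of the Squirro Toolbox: build_export_command and run_export are hypothetical helpers, and only flags from the table are used. It assumes that squirro_bulk_export is available on the PATH and that verbosity can be raised by repeating -v as separate arguments. A token that starts with a dash is attached with an equals sign, as described for --token.

```python
import subprocess


def build_export_command(token, cluster, project_id, out_file, *,
                         query=None, batch_size=None,
                         keyword_delimiter=None, verbosity=0, log_file=None):
    """Assemble the argument list for squirro_bulk_export (hypothetical helper)."""
    cmd = ["squirro_bulk_export"]
    # A token starting with a dash must be attached with an equals sign.
    if token.startswith("-"):
        cmd.append(f"--token={token}")
    else:
        cmd += ["--token", token]
    cmd += ["--cluster", cluster, "--project-id", project_id,
            "--out-file", out_file]
    # Optional export settings are only added when explicitly requested,
    # so the tool's own defaults apply otherwise.
    if query is not None:
        cmd += ["--query", query]
    if batch_size is not None:
        if batch_size < 1:
            raise ValueError("batch_size must be a positive integer")
        cmd += ["--batch-size", str(batch_size)]
    if keyword_delimiter is not None:
        cmd += ["--keyword-delimiter", keyword_delimiter]
    # Assumption: verbosity is raised by repeating -v as separate arguments.
    cmd += ["-v"] * verbosity
    if log_file is not None:
        cmd += ["--log-file", log_file]
    return cmd


def run_export(cmd):
    """Run the assembled command; raises CalledProcessError on a non-zero exit."""
    subprocess.run(cmd, check=True)
```

For example, build_export_command("-12345", "https://next.squirro.net", "my-project", "data.csv", batch_size=500, verbosity=2) yields a list that starts with squirro_bulk_export and --token=-12345 and ends with two -v entries. The token and project identifier here are placeholders. Because the list is passed to subprocess.run without a shell, no line-continuation characters or shell quoting are needed.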