When you set up a data source to ingest data into Squirro, you assign a pipeline workflow to that data source. The pipeline workflow defines a set of built-in pipeline steps that are applied to the data passing through the pipeline. In addition to the built-in steps, you can add your own custom enrichment or processing steps, called pipelets. To add a pipelet to the pipeline, first upload it to the Squirro server, then use the pipeline editor (the Pipeline tab) to add it to the pipeline and, if required, adjust its configuration.

Pipelets are developed in the Python programming language. The Pipelets Tutorial covers how to develop a pipelet and include it in the pipeline.

Where do Pipelets fit in the process of loading data?

Pipelets modify items. Data is referred to as an item once it has been transformed into the consistent and predictable Squirro item format. For this reason, always design your pipelets to work with data in the Squirro item format. You don't have to perform this transformation yourself: the Squirro data loader tool automatically converts all source data (data produced by data loader plugins) into the Squirro item format.
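To make the item format concrete, the sketch below shows what an item might look like by the time it reaches a pipelet. The field names shown (`id`, `title`, `body`, `created_at`, `keywords`) are illustrative; consult the Squirro item format reference for the authoritative schema.

```python
# Illustrative sketch of an item in the Squirro item format.
# Field names here are examples, not an exhaustive or authoritative schema.
item = {
    "id": "item-0001",                    # unique item identifier
    "title": "Quarterly report",          # display title
    "body": "<p>Revenue grew 8%.</p>",    # item body (HTML)
    "created_at": "2023-01-15T09:30:00",  # creation timestamp
    "keywords": {                         # labels/facets as key -> list of values
        "source": ["annual-reports"],
    },
}

# A pipelet receives such dictionaries one at a time and can read or
# modify any of these fields before the item continues down the pipeline.
print(item["title"])
```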

This documentation covers the basic workflow of working with pipelets and the interface that pipelets need to implement.
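As a preview of that interface, the sketch below shows the general shape of a pipelet: a class with a `consume(item)` generator method that receives one item at a time and yields the (possibly modified) item onward. A stand-in base class is defined here so the example runs on its own; in a real pipelet you would instead subclass the base class provided by the Squirro SDK (`from squirro.sdk import PipeletV1`), and the exact interface is documented in the sections that follow.

```python
class PipeletV1:
    """Stand-in for the Squirro SDK pipelet base class, so this
    sketch is self-contained. Real pipelets import it from squirro.sdk."""
    pass


class UppercaseTitlePipelet(PipeletV1):
    """Example enrichment step: upper-case each item's title and tag it."""

    def consume(self, item):
        # 'item' is a dict in the Squirro item format (keys such as
        # 'title', 'body', 'keywords' -- see the item format reference).
        if item.get("title"):
            item["title"] = item["title"].upper()
        # Record that this pipelet processed the item, using a
        # hypothetical 'processed_by' keyword for illustration.
        item.setdefault("keywords", {})["processed_by"] = ["uppercase_pipelet"]
        # Pipelets are generators: yield the modified item(s) downstream.
        yield item


# Usage sketch: feed one item through the pipelet.
pipelet = UppercaseTitlePipelet()
result = next(pipelet.consume({"title": "hello world", "keywords": {}}))
```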