Load (Transform Input) Pipeline Step

The Transform Input step is used during the Load section to perform the transformations defined in the data source configuration process.

This page describes how the Transform Input step works and how to configure it.

Overview

Originally, the logic contained within the Transform Input step was part of the Data Loader itself. At that time, the Squirro data processing pipeline began with the Enrich steps rather than the Load step.

In Squirro release 3.3.1, this logic was separated from the Data Loader and moved into its own pipeline step, which makes troubleshooting easier.

The Transform Input step takes the extracted data in row format and transforms it into the Squirro item format, using the mapping configuration defined for the data source.

Figure: Transform Input step in the Data Processing Pipeline

Note

If you wish to keep the transformation inside the data loading process instead, edit the common.ini file and change the value of item_transformation_in_pipeline from true to false.

How It Works

When you create a data source using a data connector from within the UI, you define the mapping configurations.

Here, you define how a source field maps to a Squirro item field.

Reference: Learn more about the Squirro Item Format.

For example, if the source document has a field named article, you might map that source field to the Squirro item field title.

In the data loading process, this mapping corresponds to steps 2 and 3, Map to item fields and Map to labels, as shown in the example screenshot below:

Warning

Every document must include a mapped Title or Body field.
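
The mapping described in this section can be pictured as a lookup from source field names to item field names. The following is a minimal Python sketch of that idea, not Squirro's implementation: the function name `map_row_to_item`, the mapping dictionaries, and the sample row are hypothetical, and the item is shown as a plain dictionary in which labels are assumed to live under a `keywords` key. The sketch also mirrors the warning by rejecting rows that end up with neither a title nor a body.

```python
def map_row_to_item(row, field_mapping, label_mapping):
    """Build an item-like dict from one extracted row (illustrative only)."""
    item = {}

    # "Map to item fields": copy each mapped source field to its item field.
    for source_field, item_field in field_mapping.items():
        value = row.get(source_field)
        if value not in (None, ""):
            item[item_field] = value

    # "Map to labels": each label holds a list of values.
    labels = {}
    for source_field, label_name in label_mapping.items():
        value = row.get(source_field)
        if value not in (None, ""):
            labels[label_name] = value if isinstance(value, list) else [value]
    if labels:
        # Assumption: labels are stored under the "keywords" key.
        item["keywords"] = labels

    # Mirror the warning: a title or a body must be mapped.
    if "title" not in item and "body" not in item:
        raise ValueError("row maps to neither a title nor a body")
    return item


# Hypothetical extracted row and mappings.
row = {"article": "Quarterly results", "text": "Revenue grew.", "dept": "Finance"}
item = map_row_to_item(
    row,
    field_mapping={"article": "title", "text": "body"},
    label_mapping={"dept": "Department"},
)
print(item)
```

A row that only carries label fields, such as `{"dept": "Finance"}`, raises `ValueError` in this sketch, because no title or body is mapped.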

Configuring the Transform Input Step

The Transform Input step includes a configuration option in your Squirro installation’s common.ini file named item_transformation_in_pipeline.

This option controls whether the Transform Input step is enabled (true) or disabled (false). When it is set to false, any Transform Input step present in a pipeline is skipped.

Unlike some other steps (e.g. MIME detection), the Transform Input step has no further step-specific configuration options in common.ini.
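
To illustrate how a boolean option of this kind is typically read, here is a short Python sketch using the standard library `configparser` module. Only the option name `item_transformation_in_pipeline` comes from this page. The `[pipeline]` section name, the sample file contents, and the helper `transform_input_enabled` are assumptions made for the example; check your own common.ini for the actual section.

```python
import configparser

# Hypothetical excerpt of common.ini; the [pipeline] section name is assumed.
SAMPLE_INI = """
[pipeline]
item_transformation_in_pipeline = true
"""


def transform_input_enabled(ini_text, section="pipeline"):
    """Return the boolean value of item_transformation_in_pipeline."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # getboolean accepts true/false, yes/no, on/off and 1/0 (case-insensitive)
    # and raises an error if the section or option is missing.
    return parser.getboolean(section, "item_transformation_in_pipeline")


enabled = transform_input_enabled(SAMPLE_INI)
print("Transform Input runs in the pipeline:", enabled)
```

The sketch deliberately sets no fallback value, since this page does not state what happens when the option is absent.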

Reference: To learn more about configuring the Transform Input step within common.ini, see common.ini.

Benefits of Transform Input Step

The primary reason for moving the field mapping logic out of the Data Loader and into a standalone step is to assist in troubleshooting.

Because the transformation now runs as part of the Data Processing Pipeline, you can view data loading error logs within the Squirro Monitoring space.

Reference: To learn more, see Squirro Monitoring.

Additionally, because the mappings are applied in the Transform Input step, you can freely change the mappings of a data source without needing to extract the data again from the third-party system.
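
Conceptually, this works because the extracted rows and the mapping are kept separate, so a changed mapping can be applied to the same rows. The sketch below illustrates that idea only. The names `apply_mapping` and `extracted_rows` are hypothetical, and plain dictionaries stand in for Squirro's actual data structures.

```python
def apply_mapping(rows, field_mapping):
    """Return item-like dicts by renaming mapped source fields (illustrative)."""
    return [
        {
            item_field: row[source_field]
            for source_field, item_field in field_mapping.items()
            if source_field in row
        }
        for row in rows
    ]


# Rows extracted once from a (hypothetical) third-party system.
extracted_rows = [{"article": "Quarterly results", "summary": "Revenue grew."}]

# First mapping: the article field becomes the title.
first = apply_mapping(extracted_rows, {"article": "title"})

# Changed mapping: the same rows are reused, with no second extraction.
second = apply_mapping(extracted_rows, {"summary": "title", "article": "body"})

print(first)
print(second)
```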

Note

If you run into any errors you are unable to troubleshoot yourself, contact Squirro Support for help.