PipelineWorkflowMixin

class PipelineWorkflowMixin

Bases: object

Methods Summary

add_model_to_pipeline_workflow(project_id, ...)

Updates a pipeline workflow with a published model from AI Studio.

delete_pipeline_workflow(project_id, workflow_id)

Deletes a pipeline workflow, provided it is no longer in use.

get_pipeline_workflow(project_id, workflow_id)

Return the pipeline workflow with id workflow_id in the project with project_id.

get_pipeline_workflows(project_id[, ...])

Return all pipeline workflows for project with project_id.

get_pipeline_workflows_presets(project_id)

Return a list of pre-made pipeline workflows covering various use cases.

modify_pipeline_workflow(project_id, workflow_id)

Updates a pipeline workflow.

move_pipeline_workflow_step(project_id, ...)

Move a pipelet step within a workflow.

new_pipeline_workflow(project_id, name[, steps])

Creates a new pipeline workflow.

rerun_pipeline_workflow(project_id, workflow_id)

Rerun a pipeline workflow using the data of its configured data sources.

Methods Documentation

add_model_to_pipeline_workflow(project_id, workflow_id, model_id, document_label=None)

Updates a pipeline workflow with a published model from AI Studio.

Parameters:
  • project_id – project id

  • workflow_id – pipeline workflow id

  • model_id – id of published model in AI Studio

  • document_label – (required only for document-level classifiers) name (not display name) of the label that the model should use to classify documents (the label must exist).

Example:

>>> client.add_model_to_pipeline_workflow(
...     project_id='project_id_1',
...     workflow_id='pipeline_workflow_id_1',
...     model_id='model_id_1',
...     document_label='ml_classification')

delete_pipeline_workflow(project_id, workflow_id)

Deletes a pipeline workflow, provided it is no longer in use. Project default workflows cannot be deleted, nor can workflows that still have data sources referring to them.

Parameters:
  • project_id – project id

  • workflow_id – pipeline workflow id

Returns:

204 if the deletion was successful

Example:

>>> client.delete_pipeline_workflow(
...     project_id='project_id_1',
...     workflow_id='pipeline_workflow_id_1')

get_pipeline_workflow(project_id, workflow_id, omit_steps=False, omit_sources=False)

Return the pipeline workflow with id workflow_id in the project with project_id.

Parameters:
  • project_id – project id

  • workflow_id – pipeline workflow id

  • omit_steps – whether to omit the workflow steps from the response, for better performance.

  • omit_sources – whether to omit from the response the data sources configured to use this pipeline workflow (both flags are shown in the second example below).

Returns:

A dictionary of the pipeline workflow.

Example:

>>> client.get_pipeline_workflow('project_id_1', 'workflow_id_1')
{'id': 'pipeline_workflow_id_1',
 'project_id': 'project_id_1',
 'name': 'Pipeline Workflow 1',
 'steps': [
    {"name": "Pipelet",
     "type": "pipelet",
     "display_name": "PermID OpenCalais",
     "id": "XPOxEgNSR3W4TirOwOA-ng",
     "config": {"config": {"api_key": "AGa8A65", "confidence": 0.7},
                "pipelet": "searches/PermID Entities Enrichment"},
    },
    {"name": "Index",
     "type": "index",
      ...
    }
 ]
}
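
Assuming the same project and workflow as above, the omit flags can be used to trim the response for large workflows; a minimal sketch:

>>> workflow = client.get_pipeline_workflow(
...     'project_id_1', 'workflow_id_1',
...     omit_steps=True, omit_sources=True)
>>> workflow['name']
'Pipeline Workflow 1'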

get_pipeline_workflows(project_id, omit_steps=False, omit_sources=False)

Return all pipeline workflows for project with project_id.

Parameters:
  • project_id – id of the project within tenant

  • omit_steps – whether to omit the workflow steps from the response, for better performance.

  • omit_sources – whether to omit from the response the data sources configured to use each pipeline workflow.

Returns:

A list of pipeline workflow dictionaries.

Example:

>>> client.get_pipeline_workflows('project_id_1')
[{'id': 'pipeline_workflow_id_1',
  'project_id': 'project_id_1',
  'name': 'Pipeline Workflow 1',
  'project_default': True,
  'steps': [
     {"name": "Pipelet",
      "type": "pipelet",
      "display_name": "PermID OpenCalais",
      "id": "XPOxEgNSR3W4TirOwOA-ng",
      "config": {"config": {"api_key": "AGa865", "confidence": 0.7},
                 "pipelet": "searches/PermID Entities Enrichment"},
     },
     {"name": "Index",
      "type": "index",
      ...
     }
  ]
 },
 {'id': 'pipeline_workflow_id_2',
  ...
 },
 ...
]

get_pipeline_workflows_presets(project_id)

Return a list of pre-made pipeline workflows covering various use cases.

Parameters:

project_id – the Id of the project

Returns:

A list of dictionaries, where each dictionary represents a pipeline workflow preset.
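
A minimal usage sketch (the exact fields of each preset dictionary are not documented here, so only the call is shown):

>>> presets = client.get_pipeline_workflows_presets('project_id_1')
>>> for preset in presets:
...     print(preset)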

modify_pipeline_workflow(project_id, workflow_id, name=None, steps=None, project_default=None)

Updates a pipeline workflow.

Parameters:
  • project_id – project id

  • workflow_id – pipeline workflow id

  • name – name of workflow or None if no change

  • steps – list of dictionaries of step properties; each step must at least specify its type, which must be one of the known step types. Steps need to be ordered in a specific way. Can be None if no change.

  • project_default – whether the pipeline workflow should become the new project default workflow. Allowed values are True or None. The project_default flag cannot be cleared directly, because at any time exactly one project default pipeline workflow needs to exist. To change the project default workflow, instead set True on the new default workflow; as a side effect this clears the flag on the previous default (see the second example below).

Example:

>>> client.modify_pipeline_workflow(
...     project_id='project_id_1',
...     workflow_id='pipeline_workflow_id_1',
...     name='Pipeline Workflow 1',
...     steps=[{"name": "Index",
...             "type": "index"}])
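
To switch the project default, set project_default=True on the workflow that should become the new default; a hypothetical call (IDs are illustrative):

>>> client.modify_pipeline_workflow(
...     project_id='project_id_1',
...     workflow_id='pipeline_workflow_id_2',
...     project_default=True)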

move_pipeline_workflow_step(project_id, workflow_id, step_id, after)

Move a pipelet step within a workflow.

Parameters:
  • project_id – id of project that owns the workflow

  • workflow_id – pipeline workflow id

  • step_id – id of the pipelet step to move

  • after – id of the step after which the pipelet step should be moved, or None if the pipelet step should become the first step (see the second example below)

Returns:

updated workflow

Example:

>>> client.move_pipeline_workflow_step('2aEVClLRRA-vCCIvnuEAvQ',
...                                    'Ue1OceLkQlyz21wpPqml9Q',
...                                    'nJXpKUSERmSgQRjxX7LrZw',
...                                    'language-detection')
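
To move a step to the first position, pass None as after; a sketch reusing the same illustrative IDs:

>>> client.move_pipeline_workflow_step('2aEVClLRRA-vCCIvnuEAvQ',
...                                    'Ue1OceLkQlyz21wpPqml9Q',
...                                    'nJXpKUSERmSgQRjxX7LrZw',
...                                    None)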

new_pipeline_workflow(project_id, name, steps=None)

Creates a new pipeline workflow.

Parameters:
  • project_id – project id

  • name – name of workflow

  • steps – list of dictionaries of step properties; each step must at least specify its type, which must be one of the known step types. Steps need to be ordered in a specific way. If steps is None or the empty list, the default steps will be set (see the second example below).

Example:

>>> client.new_pipeline_workflow(
...     project_id='project_id_1',
...     name='Pipeline Workflow 1',
...     steps=[{"name": "Index",
...             "type": "index"}])
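
Because steps=None falls back to the default steps, a workflow with the default pipeline can be created by omitting steps entirely; a minimal sketch (the name is illustrative):

>>> client.new_pipeline_workflow(
...     project_id='project_id_1',
...     name='Default Steps Workflow')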

rerun_pipeline_workflow(project_id, workflow_id, from_index=False, step_ids=None, run_linked_steps=None, query=None, include_sub_items=None)

Rerun a pipeline workflow using the data of its configured data sources.

Parameters:
  • project_id (str) – the Id of the project that this workflow belongs to.

  • workflow_id (str) – the Id of the pipeline workflow to rerun.

  • from_index (bool) – if True, the rerun uses the already indexed Squirro items as input to the workflow for ingestion (rerun from index).

  • step_ids (Union[str, List[str], None]) – the IDs of one or more steps of the provided pipeline workflow that the rerun from index will be executed on. The rest of the steps in the workflow will be omitted.

  • run_linked_steps (Optional[bool]) – whether to also rerun from index the linked steps of the step given in step_ids. It has an effect only when the step_ids parameter is provided. If step_ids contains multiple step IDs, only the first step in the list, together with its set of linked steps, is rerun from index (see the second example at the end of this section).

  • query (Optional[str]) – a query expressed in Squirro’s Query Syntax [1]. It has an effect only with the rerun from index, in order to rerun only the items returned by the query. [1] https://go.squirro.com/query-syntax

  • include_sub_items (Optional[bool]) – if set to True, then the sub-items of the items will also be fetched and included in the rerun data. If set to False, then no sub-items will be fetched. If set to None (default), then the behaviour will be determined by the value of the project setting: datasource.rerun.index.include-sub-items. This option has an effect only with the rerun from index mode.

Example:

>>> client.rerun_pipeline_workflow(
...    "EcKKf_dxRe-xrCB8g1fGCg",
...    "Or0UiK-qROeE1x8kVlBZkQ",
...    from_index=True,
...    step_ids=["aiNJX35dRhCfqc3a3l84PA", "qsXDOkMvQ-62-iM7O1Fp5w"],
...    query="source:g2hqOvX8SZmR7R2RPmMlDw")

The above example invokes a rerun from index for the workflow with id Or0UiK-qROeE1x8kVlBZkQ of the project with id EcKKf_dxRe-xrCB8g1fGCg, using only the two provided steps, identified by their ids (both steps are already part of the workflow), and only on the items of the source with id g2hqOvX8SZmR7R2RPmMlDw (the workflow can be configured to be used by many sources, but here we rerun only on the items of one specific configured source).
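
To also rerun the linked steps of a single step, run_linked_steps can be combined with one step id; a hypothetical call reusing the illustrative IDs above:

>>> client.rerun_pipeline_workflow(
...    "EcKKf_dxRe-xrCB8g1fGCg",
...    "Or0UiK-qROeE1x8kVlBZkQ",
...    from_index=True,
...    step_ids=["aiNJX35dRhCfqc3a3l84PA"],
...    run_linked_steps=True)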