MachineLearningMixin

class squirro_client.topic.MachineLearningMixin

Bases: object

Methods Summary

clone_machinelearning_workflow(project_id, …)

Clone the Machine Learning Workflow.

delete_machinelearning_job(project_id, …)

Delete a Machine Learning job.

delete_machinelearning_workflow(project_id, …)

Delete a Machine Learning workflow.

get_machinelearning_job(project_id, …[, …])

Return a particular Machine Learning job.

get_machinelearning_jobs(project_id, …[, …])

Return all the Machine Learning jobs for a particular Machine Learning workflow.

get_machinelearning_workflow(project_id, …)

Return a specific Machine Learning Workflow in a project.

get_machinelearning_workflow_assets(…[, …])

Return all binary assets, such as trained models, associated with a Machine Learning Workflow.

get_machinelearning_workflows(project_id)

Return all Machine Learning workflows for a project.

kill_machinelearning_job(project_id, …)

Kill a Machine Learning job if it is running.

modify_machinelearning_workflow(project_id, …)

Modify an existing Machine Learning workflow.

new_machinelearning_job(project_id, …[, …])

Create a new Machine Learning job.

new_machinelearning_workflow(project_id, …)

Create a new Machine Learning Workflow.

run_machinelearning_job(project_id, …)

Schedule a Machine Learning job to run now.

run_machinelearning_workflow(project_id, …)

Run a Machine Learning workflow directly on Squirro items.

wait_for_machinelearning_job(project_id, …)

Wait for the first run of the Machine Learning job to complete.

Methods Documentation

clone_machinelearning_workflow(project_id, ml_workflow_id, name=None, type=None)

Clone the Machine Learning Workflow.

Parameters
  • project_id (str) – Id of the Squirro project.

  • ml_workflow_id (str) – Id of the Machine Learning workflow.

  • name (Optional[str]) – Optional name for the cloned Machine Learning workflow. If not specified, the clone keeps the name of the original workflow.

  • type (Optional[str]) – Optional parameter to define type of the Machine learning workflow. Possible values are other, query, query_default, ais, published, document_embedder_queries, document_embedder_queries_default. If not specified, the type of the workflow will be the same as the original one.
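The accepted `type` values listed above can be checked locally before the request is sent. The following is a minimal sketch, assuming a hypothetical helper `clone_kwargs` (not part of squirro_client) and assuming that unset arguments should simply be omitted so that the clone keeps the original name and type:

```python
# Workflow types listed in this reference for the `type` parameter.
WORKFLOW_TYPES = frozenset({
    "other", "query", "query_default", "ais", "published",
    "document_embedder_queries", "document_embedder_queries_default",
})


def clone_kwargs(name=None, type=None):
    """Build the optional keyword arguments for clone_machinelearning_workflow.

    Hypothetical helper: rejects unknown types locally and leaves out
    arguments that were not set.
    """
    if type is not None and type not in WORKFLOW_TYPES:
        raise ValueError(f"unknown workflow type: {type!r}")
    kwargs = {}
    if name is not None:
        kwargs["name"] = name
    if type is not None:
        kwargs["type"] = type
    return kwargs


# With a real client the call would look like this (not executed here):
# client.clone_machinelearning_workflow(
#     project_id, ml_workflow_id, **clone_kwargs(name="copy", type="query"))
```

Validating on the client side only catches typos early; the server remains the authority on which types it accepts.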

delete_machinelearning_job(project_id, ml_workflow_id, ml_job_id)

Delete a Machine Learning job.

Parameters
  • project_id – Id of the Squirro project.

  • ml_workflow_id – Id of the Machine Learning workflow.

  • ml_job_id – Id of the Machine Learning job.

delete_machinelearning_workflow(project_id, ml_workflow_id)

Delete a Machine Learning workflow.

Parameters
  • project_id – Id of the Squirro project.

  • ml_workflow_id – Id of the Machine Learning workflow.

get_machinelearning_job(project_id, ml_workflow_id, ml_job_id, include_run_log=None, last_n_log_lines=None, include_results=None)

Return a particular Machine Learning job.

Parameters
  • project_id – Id of the Squirro project.

  • ml_workflow_id – Id of the Machine Learning workflow.

  • ml_job_id – Id of the Machine Learning job.

  • include_run_log – Boolean flag to optionally fetch the last run log of the job.

  • last_n_log_lines – Integer to fetch only the last n lines of the last run log.

  • include_results – Boolean flag to optionally fetch the last run results.
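The optional flags in the signature can be combined to fetch a job together with the tail of its last run log and its last run results. The sketch below assumes a hypothetical wrapper `fetch_job_with_log_tail` and uses a stand-in `RecordingClient` so that it runs without a server. The structure of the returned job is not described in this reference, so the stand-in returns an empty placeholder.

```python
def fetch_job_with_log_tail(client, project_id, ml_workflow_id, ml_job_id, lines=50):
    """Fetch a job together with the tail of its last run log and its results."""
    return client.get_machinelearning_job(
        project_id,
        ml_workflow_id,
        ml_job_id,
        include_run_log=True,    # also fetch the last run log
        last_n_log_lines=lines,  # but only its last `lines` lines
        include_results=True,    # also fetch the last run results
    )


class RecordingClient:
    """Stand-in for a real client so the sketch runs without a server."""

    def __init__(self):
        self.calls = []

    def get_machinelearning_job(self, project_id, ml_workflow_id, ml_job_id,
                                include_run_log=None, last_n_log_lines=None,
                                include_results=None):
        # Record the arguments; the real return structure is not described here.
        self.calls.append((project_id, ml_workflow_id, ml_job_id,
                           include_run_log, last_n_log_lines, include_results))
        return {}


stub = RecordingClient()
fetch_job_with_log_tail(stub, "project-id", "workflow-id", "job-id", lines=20)
```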

get_machinelearning_jobs(project_id, ml_workflow_id, include_internal_jobs=None)

Return all the Machine Learning jobs for a particular Machine Learning workflow.

Parameters
  • project_id – Id of the Squirro project.

  • ml_workflow_id – Id of the Machine Learning workflow.

  • include_internal_jobs – Boolean. Whether to include internal jobs. You generally do not need these, as they are used to optimize inference runs.

get_machinelearning_workflow(project_id, ml_workflow_id)

Return a specific Machine Learning Workflow in a project.

Parameters
  • project_id – Id of the Squirro project.

  • ml_workflow_id – Id of the Machine Learning workflow.

get_machinelearning_workflow_assets(project_id, ml_workflow_id, write_to_disk=None)

Return all binary assets, such as trained models, associated with a Machine Learning Workflow.

Parameters
  • project_id – Id of the Squirro project.

  • ml_workflow_id – Id of the Machine Learning workflow.

  • write_to_disk – Boolean. Whether to write the exported Machine Learning workflow assets to disk.

get_machinelearning_workflows(project_id)

Return all Machine Learning workflows for a project.

Parameters
  • project_id – Id of the Squirro project.

kill_machinelearning_job(project_id, ml_workflow_id, ml_job_id)

Kill a Machine Learning job if it is running.

Parameters
  • project_id – Id of the Squirro project.

  • ml_workflow_id – Id of the Machine Learning workflow.

  • ml_job_id – Id of the Machine Learning job.

modify_machinelearning_workflow(project_id, ml_workflow_id, name=None, config=None, ml_models=None, type=None)

Modify an existing Machine Learning workflow.

Parameters
  • project_id – Id of the Squirro project.

  • ml_workflow_id – Id of the Machine Learning workflow.

  • name – Name of Machine learning workflow.

  • config – Dictionary of Machine Learning workflow config. For detailed documentation, see https://squirro.atlassian.net/wiki/spaces/DOC/pages/337215576/Squirro+Machine+Learning+Documentation

  • ml_models – Directory with ml_models to be uploaded into the workflow path.

  • type – Optional parameter to define type of the Machine learning workflow. Possible values are other, query, query_default, ais, published, document_embedder_queries, document_embedder_queries_default. If not specified, the default type is other.

new_machinelearning_job(project_id, ml_workflow_id, type, scheduling_options=None)

Create a new Machine Learning job.

Parameters
  • project_id – Id of the Squirro project.

  • ml_workflow_id – Id of the Machine learning workflow.

  • type – Type of the Machine Learning job. Possible values are training and inference.

  • scheduling_options – Scheduling options for the job.

Example:

>>> client.new_machinelearning_job(
...     project_id='2aEVClLRRA-vCCIvnuEAvQ',
...     ml_workflow_id='129aVASaFNPN3NG10-ASDF',
...     type='training',
...     scheduling_options={"time_based": {"repeat_every": "1d"}})
'13nv0va0svSDv3333v'

new_machinelearning_workflow(project_id, name, config, ml_models=None, type=None)

Create a new Machine Learning Workflow.

Parameters
  • project_id – Id of the Squirro project.

  • name – Name of Machine learning workflow.

  • config – Dictionary of Machine Learning workflow config. For detailed documentation, see https://squirro.atlassian.net/wiki/spaces/DOC/pages/337215576/Squirro+Machine+Learning+Documentation

  • ml_models – Directory with ml_models to be uploaded into the workflow path.

  • type – Optional parameter to define type of the Machine learning workflow. Possible values are other, query, query_default, ais, published, document_embedder_queries, document_embedder_queries_default. If not specified, the default type is other.

run_machinelearning_job(project_id, ml_workflow_id, ml_job_id)

Schedule a Machine Learning job to run now.

Parameters
  • project_id – Id of the Squirro project.

  • ml_workflow_id – Id of the Machine Learning workflow.

  • ml_job_id – Id of the Machine Learning job.

run_machinelearning_workflow(project_id, ml_workflow_id, data, asynchronous=False)

Run a Machine Learning workflow directly on Squirro items.

Parameters
  • project_id – Id of the Squirro project.

  • ml_workflow_id – Id of the Machine Learning workflow.

  • data – Data to run through Machine Learning workflow.

  • asynchronous – Whether to run the Machine Learning workflow asynchronously (recommended for large data batches).
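Because asynchronous execution is recommended for large data batches, a caller may want to pick the flag based on batch size. The sketch below is one possible way to do that. The wrapper `run_workflow`, its `batch_size` argument and the `async_threshold` default of 1000 are all hypothetical, and `data` is treated as opaque because its structure is not specified in this reference. A stand-in `RecordingClient` replaces the real client so the sketch runs without a server.

```python
def run_workflow(client, project_id, ml_workflow_id, data, batch_size,
                 async_threshold=1000):
    """Run a workflow, switching to asynchronous mode for large batches.

    `batch_size` and `async_threshold` are illustrative assumptions; the
    threshold is arbitrary and should be tuned for the deployment.
    """
    asynchronous = batch_size >= async_threshold
    return client.run_machinelearning_workflow(
        project_id, ml_workflow_id, data, asynchronous=asynchronous)


class RecordingClient:
    """Stand-in for a real client; records which mode was requested."""

    def __init__(self):
        self.async_flags = []

    def run_machinelearning_workflow(self, project_id, ml_workflow_id, data,
                                     asynchronous=False):
        self.async_flags.append(asynchronous)
        return None  # placeholder; the real return value is not described here


stub = RecordingClient()
run_workflow(stub, "project-id", "workflow-id", data=object(), batch_size=10)
run_workflow(stub, "project-id", "workflow-id", data=object(), batch_size=5000)
```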

wait_for_machinelearning_job(project_id, ml_workflow_id, ml_job_id, max_wait_time=600)

Wait for the first run of the Machine Learning job to complete. This is often useful in automated scripts that set up a job and then need to wait for it to finish before inspecting its logs or results.

Parameters
  • project_id – Id of the Squirro project.

  • ml_workflow_id – Id of the Machine Learning workflow.

  • ml_job_id – Id of the Machine Learning job.

  • max_wait_time – Maximum time, in seconds, to wait for the Machine Learning job (default: 600).
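The scripted use case described for this method can be sketched as a create, run, wait, inspect sequence. This is an illustration under stated assumptions rather than a definitive recipe: the helper `train_and_inspect` is hypothetical, and it assumes that `new_machinelearning_job` returns the new job's id, as the example for that method in this reference suggests. A stand-in `RecordingClient` records the order of calls so the sketch runs without a server.

```python
def train_and_inspect(client, project_id, ml_workflow_id, max_wait_time=600):
    """Create a training job, run it now, wait for it, then fetch its log."""
    # Assumption: the call returns the id of the newly created job.
    ml_job_id = client.new_machinelearning_job(
        project_id, ml_workflow_id, type="training")
    client.run_machinelearning_job(project_id, ml_workflow_id, ml_job_id)
    client.wait_for_machinelearning_job(
        project_id, ml_workflow_id, ml_job_id, max_wait_time=max_wait_time)
    return client.get_machinelearning_job(
        project_id, ml_workflow_id, ml_job_id,
        include_run_log=True, include_results=True)


class RecordingClient:
    """Stand-in for a real client; records the order of the calls."""

    def __init__(self):
        self.calls = []

    def new_machinelearning_job(self, project_id, ml_workflow_id, type,
                                scheduling_options=None):
        self.calls.append("new")
        return "job-id"  # placeholder id

    def run_machinelearning_job(self, project_id, ml_workflow_id, ml_job_id):
        self.calls.append("run")

    def wait_for_machinelearning_job(self, project_id, ml_workflow_id,
                                     ml_job_id, max_wait_time=600):
        self.calls.append("wait")

    def get_machinelearning_job(self, project_id, ml_workflow_id, ml_job_id,
                                include_run_log=None, last_n_log_lines=None,
                                include_results=None):
        self.calls.append("get")
        return {}  # placeholder; the real structure is not described here


stub = RecordingClient()
train_and_inspect(stub, "project-id", "workflow-id", max_wait_time=60)
```

Whether an explicit `run_machinelearning_job` call is needed depends on how the job is scheduled; a job created with `scheduling_options` may already run on its own schedule.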