Model-as-a-Service#
This page provides an overview of creating, customizing, and prototyping machine learning (ML) models as part of Squirro’s Model-as-a-Service (MaaS).
Overview#
Model-as-a-Service (MaaS) allows Squirro customers to import custom ML models and speed up the prototyping phase for ML projects in Squirro.
Installation#
To use MaaS, you must install the required packages from the Squirro mirror on your target Squirro server as follows:
yum install squirro-miniforge
yum install squirro-python38-mlflow
Prerequisites#
Before uploading a model, you must first create one.
For more information on MLFlow Models, see the official MLFlow Model site or an Example MLFlow Model.
Note: As an alternative to MaaS, you can create no-code models using the Squirro AI Studio. For a complete tutorial, see AI Studio.
Creating an MLFlow Model#
You can create an MLFlow Model in one of two ways:
train an MLFlow model on your local machine or on your exploration server
wrap an existing (pre-trained) model into the structure of an MLFlow Model and run it locally
Either way, MLFlow stores the (trained) model in the MLFlow base folder (mlruns/0/) with a unique hash (<HASH>) after executing the run command.
The simplest structure of the MLFlow Model is as follows:
├── artifacts
│   └── model
│       ├── conda.yaml
│       ├── MLmodel
│       ├── python_model.pkl
│       └── requirements.txt
└── meta.yaml
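Before uploading, it can help to confirm that a run folder contains the files listed in this layout. The following is a minimal sketch using only the Python standard library. The helper name `missing_model_files` is hypothetical and not part of MLFlow or Squirro, and the file list simply mirrors the tree above; models saved with other flavors may store different files.

```python
from pathlib import Path

# Files from the simplest layout, relative to mlruns/0/<HASH>/.
EXPECTED_FILES = (
    "meta.yaml",
    "artifacts/model/MLmodel",
    "artifacts/model/conda.yaml",
    "artifacts/model/python_model.pkl",
    "artifacts/model/requirements.txt",
)


def missing_model_files(run_dir):
    """Return the expected files that are absent from run_dir, in listed order."""
    root = Path(run_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling `missing_model_files("mlruns/0/<HASH>")` with the real hash returns an empty list when the layout is complete.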
MLFlow documentation#
See the (external) links below for information on creating MLFlow models.
Built-in Model Flavors (to write your own model)
Data Structure#
To use the MLFlow Model later in the context of a Squirro ML Workflow, you need to stick to a specific data structure:
the input must be a pandas DataFrame with an id and named feature fields as columns.
the output must be a pandas DataFrame with an id and result fields as columns.
For more information about pandas, see the official pandas website.
Example:
input DataFrame
    id   text
0   id0  this is an example sentence.
1   id1  hello world.
2   id2  random sentence.
3   id3  test sentence.
output DataFrame
    id   class
0   id0  class1
1   id1  class0
2   id2  class0
3   id3  class1
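The input and output contract can be sketched in a few lines, assuming pandas is installed. The function `classify` and its keyword rule are hypothetical placeholders for a real model, so the labels it produces are arbitrary. In a custom MLFlow Python model, logic like this would typically live in the `predict(self, context, model_input)` method of a `mlflow.pyfunc.PythonModel` subclass.

```python
import pandas as pd


def classify(model_input: pd.DataFrame) -> pd.DataFrame:
    """Toy stand-in for a model: takes id/text columns, returns id/class columns."""
    missing = {"id", "text"} - set(model_input.columns)
    if missing:
        raise ValueError(f"input is missing columns: {sorted(missing)}")
    # Hypothetical rule standing in for real inference.
    labels = [
        "class1" if "sentence" in text.lower() else "class0"
        for text in model_input["text"]
    ]
    return pd.DataFrame({"id": model_input["id"], "class": labels})


input_df = pd.DataFrame(
    {
        "id": ["id0", "id1"],
        "text": ["this is an example sentence.", "hello world."],
    }
)
output_df = classify(input_df)
```

The important part is the shape: the `id` column is passed through unchanged so that results can be matched back to their inputs.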
Uploading a Model#
There are two ways to upload an MLFlow Model to Squirro:
Via squirro_asset (see the squirro_asset CLI Reference). Large models (>500MB; the exact threshold is under revision) can cause nginx issues. In that case, use scp instead.
go into the MLFlow base folder
send the (trained) model via squirro_asset
squirro_asset -vvv mlflow_models upload -t $TOKEN -c $CLUSTER -f mlruns/0/<HASH>/
Via scp:
go into the MLFlow base folder
verify that the destination directory exists (on the Squirro server)
<BASE_DIR>=/var/lib/squirro/topic/assets/mlflow_models # default path
mkdir -p <BASE_DIR>/mlruns/0
compress the directory with the (trained) model (wherever you have trained your model)
cd mlruns/0/ && tar -czvf trained_model.tar.gz <HASH>/
send it to the MLFlow base folder on the Squirro server
scp trained_model.tar.gz <SQUIRRO_SERVER_URL>:/tmp/
ssh into the Squirro server and extract the sent archive
cd <BASE_DIR>/mlruns/0 && mv /tmp/trained_model.tar.gz <BASE_DIR>/mlruns/0/ # create the dirs if not existing
tar -xzvf trained_model.tar.gz
adjust artifact_uri in the meta.yaml with the new path of the MLFlow Model (file:///<BASE_DIR>/mlruns/0/<HASH>/artifacts)
sed -i '/artifact_uri/c\artifact_uri: file:///<BASE_DIR>/mlruns/0/<HASH>/artifacts' <HASH>/meta.yaml
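The artifact_uri adjustment can also be sketched in Python, for example inside a deployment script. This uses only the standard library and assumes that meta.yaml holds artifact_uri on a single line that starts with that key. The helper name `set_artifact_uri` is hypothetical. Unlike the literal sed pattern, it builds the URI with `Path.as_uri()`, which requires an absolute base directory and produces the `file:///...` form.

```python
from pathlib import Path


def set_artifact_uri(meta_path, base_dir, run_hash):
    """Point the artifact_uri line of meta.yaml at the model's new location."""
    # as_uri() requires an absolute path and returns a file:/// URI.
    uri = (Path(base_dir) / "mlruns" / "0" / run_hash / "artifacts").as_uri()
    path = Path(meta_path)
    lines = path.read_text().splitlines()
    # Replace only the line that starts with the artifact_uri key.
    updated = [
        f"artifact_uri: {uri}" if line.startswith("artifact_uri:") else line
        for line in lines
    ]
    path.write_text("\n".join(updated) + "\n")
    return uri
```

All other lines of meta.yaml are written back unchanged.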
Starting the Service#
To start a Model-as-a-Service, follow the steps below:
make sure you are in the MLFlow base folder on the Squirro server
activate the Squirro environment
squirro_activate3
serve the model identified by the <HASH> as a service listening to the chosen port <PORT>
mlflow models serve -m runs:/<HASH>/model -p <PORT>
Note
there is no service orchestration provided at this stage
keep an eye on memory and storage consumption because, among other things:
a started model service loads the model into memory and keeps it there
a new conda environment is created for every new model that has a different conda.yaml file
on-premise customers need to manually package their conda environment. This can be done as explained here.
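Once a service is running, a manual smoke test against its /invocations endpoint can confirm that the model responds. The sketch below only builds the request with the Python standard library; the helper name `build_invocation_request` and port 5000 are hypothetical. The payload format depends on the installed MLflow version: the pandas "split" layout with a `format=pandas-split` content type is assumed for 1.x releases, and the `dataframe_split` wrapper for 2.x releases, so verify which one your server accepts.

```python
import json
import urllib.request


def build_invocation_request(port, columns, rows, legacy_format=True):
    """Build (but do not send) a POST request for the local /invocations endpoint."""
    split = {"columns": list(columns), "data": [list(row) for row in rows]}
    if legacy_format:
        # Assumed MLflow 1.x style: bare "split" JSON flagged in the content type.
        body = split
        content_type = "application/json; format=pandas-split"
    else:
        # Assumed MLflow 2.x style: the same structure under "dataframe_split".
        body = {"dataframe_split": split}
        content_type = "application/json"
    return urllib.request.Request(
        f"http://localhost:{port}/invocations",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": content_type},
        method="POST",
    )


request = build_invocation_request(5000, ["id", "text"], [["id0", "hello world."]])
# To send it to a running service:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Within Squirro itself, the mlflow_maas workflow step performs this call, so the sketch is only meant for checking a service by hand.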
Using MaaS#
To use your model, you must first create an ML Workflow.
Your ML Workflow can then be used as an inference ML Job scheduled at an interval or as a published model in the enrich pipeline.
For more information on publishing models, see How To Publish ML Models Using the Squirro Client.
Example Workflows#
Below are document-level and sentence-level examples of ML Workflows.
Document Level
{
    "dataset": {
        "infer": {
            "count": 10,
            "query_string": "language:en"
        }
    },
    "pipeline": [
        {
            "fields": ["body"],
            "step": "loader",
            "type": "squirro_query"
        },
        {
            "fields": ["body"],
            "step": "filter",
            "type": "empty"
        },
        {
            "input_mapping": {"body": "text"},
            "output_mapping": {"class": "keywords.prediction"},
            "process_endpoint": "http://localhost:<PORT>/invocations",
            "name": "mlflow_maas",
            "step": "mlflow_maas",
            "type": "mlflow_maas"
        },
        {
            "fields": ["keywords.prediction"],
            "step": "saver",
            "type": "squirro_item"
        }
    ]
}
Sentence Level (With Entity Generation)
{
    "dataset": {
        "infer": {
            "count": 10,
            "query_string": "language:en"
        }
    },
    "pipeline": [
        {
            "fields": ["body"],
            "step": "loader",
            "type": "squirro_query"
        },
        {
            "fields": ["body"],
            "step": "filter",
            "type": "empty"
        },
        {
            "input_fields": ["body"],
            "output_fields": ["extract_sentences"],
            "step": "tokenizer",
            "type": "sentences_nltk"
        },
        {
            "fields": ["extract_sentences"],
            "step": "filter",
            "type": "doc_split"
        },
        {
            "input_mapping": {"extract_sentences": "text"},
            "output_mapping": {"class": "prediction"},
            "process_endpoint": "http://localhost:<PORT>/invocations",
            "name": "mlflow_maas",
            "step": "mlflow_maas",
            "type": "mlflow_maas"
        },
        {
            "fields": ["extract_sentences", "prediction"],
            "step": "filter",
            "type": "doc_join"
        },
        {
            "entity_name_field": "Catalyst",
            "entity_type": "Catalyst",
            "excluded_values": [],
            "extract_field": "extract_sentences",
            "format_values": false,
            "global_property_field_map": {},
            "modes": ["process"],
            "property_field_map": {
                "Catalyst": ["prediction"]
            },
            "required_properties": ["Catalyst"],
            "source_field": "body",
            "step": "filter",
            "type": "squirro_entity"
        },
        {
            "fields": ["entities"],
            "step": "saver",
            "type": "squirro_item"
        }
    ]
}
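Both workflows contain the same mlflow_maas step and differ only in the field mappings and the port of the served model. As a minimal sketch, that step could be generated in Python rather than edited by hand. The helper name `mlflow_maas_step` and port 5000 are hypothetical; the keys are copied from the workflow examples on this page.

```python
import json


def mlflow_maas_step(port, input_mapping, output_mapping):
    """Return an mlflow_maas pipeline step that targets a locally served model."""
    return {
        "input_mapping": dict(input_mapping),
        "output_mapping": dict(output_mapping),
        "process_endpoint": f"http://localhost:{port}/invocations",
        "name": "mlflow_maas",
        "step": "mlflow_maas",
        "type": "mlflow_maas",
    }


# Mappings of the document-level workflow; port 5000 is a placeholder.
step = mlflow_maas_step(5000, {"body": "text"}, {"class": "keywords.prediction"})
print(json.dumps(step, indent=4))
```

The printed JSON can be pasted into the pipeline list of a workflow in place of the hand-written step.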