How to Use Multi-Label Proximity Filters
Through AI Studio, Squirro provides a process to build binary and multi-class machine learning classifiers.
With Proximity Filters (Squirro’s rule-based approach), you can generate binary classifiers and multi-label classifiers.
Reference: For background information on various types of machine learning classification, see the third-party guide Types of Classification or the following individual articles:
On this page, you will learn how to set up a multi-label proximity filter classifier using Squirro’s AI Studio.
Step 1: Candidate Set
Create Candidate Sets to filter the data corpus down to a more relevant subset of documents, which will help you later in the process.
In this example, we focus on management change, plan to invest, and IPO:
Step 2: Ground Truth Rule Generation
First, define the ground truth in AI Studio Step 2: Ground Truth. Make sure that all the labels below are added and that the proximity search checkbox is activated:
Next, you can start to generate rules for the different labels in one of two ways:
By labeling in the List view or the Focus view:
or
In the Rule overview tab:
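This page does not show the syntax or matching logic of the generated rules. As a rough intuition only, a proximity rule can be thought of as a check that two terms occur within a given number of tokens of each other. The following Python sketch illustrates that idea; the function names, the tokenization, and the distance measure are assumptions made for illustration and do not represent Squirro's implementation.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def within_proximity(text, term_a, term_b, max_distance):
    """Return True if term_a and term_b occur at most max_distance tokens apart.

    Hypothetical helper: terms are expected in lowercase, and distance is the
    absolute difference between token positions.
    """
    tokens = tokenize(text)
    positions_a = [i for i, tok in enumerate(tokens) if tok == term_a]
    positions_b = [i for i, tok in enumerate(tokens) if tok == term_b]
    return any(
        abs(a - b) <= max_distance for a in positions_a for b in positions_b
    )


sentence = "The board appointed a new chief executive on Monday."
# "appointed" and "executive" are 4 tokens apart, so a window of 5 matches.
print(within_proximity(sentence, "appointed", "executive", 5))
# "appointed" and "monday" are 6 tokens apart, so a window of 2 does not match.
print(within_proximity(sentence, "appointed", "monday", 2))
```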
Step 3: Model
After defining the rules, you can move forward to build a proximity filter classifier model:
To generate a multi-label proximity filter, you need to add all labels (comma-separated) to the Label Tags field on the Configure Template page:
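To make the expected shape of the Label Tags value concrete, the Python sketch below splits a comma-separated string into individual labels. The label spellings and the whitespace trimming are assumptions for illustration; how AI Studio parses the field internally is not documented on this page.

```python
def parse_label_tags(raw):
    """Split a comma-separated string into trimmed, non-empty labels."""
    return [part.strip() for part in raw.split(",") if part.strip()]


# Hypothetical field value using the three example labels from this page.
labels = parse_label_tags("management change, plan to invest, IPO")
print(labels)
```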
The Proximity Filter is generated after clicking Save and Train and can be viewed in the model overview:
Step 4: Validation
The validation of the proximity filter can be viewed by clicking Validate:
Note: This screen is empty if there are no sentences labeled in the ground truth.
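The metrics shown on the Validate screen are not described on this page. A common way to validate a multi-label classifier is to compare predicted label sets against ground truth label sets and compute precision and recall per label. The Python sketch below illustrates that general technique with made-up data; it is not a description of what AI Studio computes.

```python
def per_label_precision_recall(truth, predicted, labels):
    """Compute (precision, recall) per label from parallel lists of label sets."""
    scores = {}
    for label in labels:
        pairs = list(zip(truth, predicted))
        tp = sum(1 for t, p in pairs if label in t and label in p)
        fp = sum(1 for t, p in pairs if label not in t and label in p)
        fn = sum(1 for t, p in pairs if label in t and label not in p)
        # Report 0.0 when a ratio is undefined (no predictions or no truth).
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores[label] = (precision, recall)
    return scores


# Illustrative data only: one set of labels per sentence.
truth = [{"IPO"}, {"IPO", "plan to invest"}, {"management change"}, set()]
predicted = [{"IPO"}, {"IPO"}, {"management change", "IPO"}, set()]
labels = ["management change", "plan to invest", "IPO"]

for label, (precision, recall) in per_label_precision_recall(
    truth, predicted, labels
).items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
```

With no labeled sentences, every count is zero and there is nothing meaningful to report, which is consistent with the empty screen mentioned in the note above.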
Step 5: Publish
The multi-label proximity filter can be published the same way as other AI Studio models in the data ingestion pipeline.
The only difference is that a sentence can now be classified with more than one of the labels, which is what makes this a multi-label classification.
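The multi-label behavior can be sketched in Python as follows: each label has its own set of rules, and a sentence receives every label for which at least one rule matches. The rules here are simple keyword predicates that stand in for proximity rules, and all names are hypothetical; this is not how the published pipeline step is implemented.

```python
def classify(sentence, rules_by_label):
    """Return the set of labels that have at least one matching rule."""
    return {
        label
        for label, rules in rules_by_label.items()
        if any(rule(sentence) for rule in rules)
    }


# Stand-in predicates (assumptions): real proximity rules would also
# constrain how far apart the terms may be.
rules_by_label = {
    "management change": [
        lambda s: "appointed" in s.lower() and "ceo" in s.lower()
    ],
    "plan to invest": [
        lambda s: "plans" in s.lower() and "invest" in s.lower()
    ],
    "IPO": [lambda s: "ipo" in s.lower()],
}

multi = "The newly appointed CEO plans to invest after the company files for an IPO."
single = "The company files for an IPO."
none = "Quarterly revenue rose."

print(sorted(classify(multi, rules_by_label)))   # three labels
print(sorted(classify(single, rules_by_label)))  # one label
print(sorted(classify(none, rules_by_label)))    # no labels
```

Because the result is a set rather than a single value, a sentence can carry zero, one, or several labels, in contrast to a multi-class classifier, which assigns exactly one.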