Recommendations

Profiles: Search Engineer, Data Scientist

Squirro can make item, entity, and label recommendations based on various types of inputs.

This page explains the theory behind recommendations in Squirro and the three methods provided, which are based on:

non-correlated labels,
correlated labels, or
machine learning.

Tip

To enable entity feedback you must create a studio plugin to handle the data which comes from the UI in the frontend.

Methods of Recommendation#

The recommendation problem can be formulated in the following way:

Given input features f1, f2, …, fn (for example, the labels of an item or the properties of an entity), recommend the top classes C (for example, a Client or an Investor in the Investment Banking App) ranked by score(C) = score(C, f1, f2, …, fn).
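As a minimal sketch of this formulation, the snippet below ranks candidate classes by a score and keeps the top k. The `recommendTop` helper, the class names, the feature strings, and the score values are all hypothetical and are not part of the Squirro API; `scoreFn` stands in for whichever scoring method is used.

```javascript
// Hypothetical helper: rank candidate classes by score and keep the top k.
// `scoreFn(cls, features)` stands in for any scoring method.
function recommendTop(classes, features, scoreFn, k) {
  return classes
    .map((cls) => ({ cls, score: scoreFn(cls, features) }))
    .sort((a, b) => b.score - a.score) // highest score first
    .slice(0, k);
}

// Toy scores for illustration only (made-up values, not real output).
const toyScores = { "Client A": 0.82, "Client B": 0.41, "Client C": 0.67 };

const top2 = recommendTop(
  Object.keys(toyScores),
  ["Sector:Energy", "Region:EMEA"], // made-up input features
  (cls) => toyScores[cls],
  2
);

console.log(top2.map((r) => r.cls)); // Client A, then Client C
```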

Squirro provides three methods for computing score(C, f1, f2, …, fn): non-correlated labels, correlated labels, and machine learning.

With the non-correlated labels and correlated labels methods, recommendations are available as soon as the data is loaded into storage; no training process is required.

For the machine learning methods, you must train a model first and know its ml_workflow_id.

Non-correlated labels#

The score of a class C is the average of the scores of the individual input features.

The score of an individual feature f is the normalized probability that f co-occurs with C in a document or entity:

score(C, f1, f2, …, fn) = (score(C, f1) + score(C, f2) + … + score(C, fn)) / n

where score(C, fi) = power_norm(P(C | fi)) = power_norm(#E(C, fi) / #E(fi))

#E(C, fi): number of entities that contain both C and fi
#E(fi): number of entities that contain fi
power_norm: power normalization function, explained in Understanding Power Normalization Functions below
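The averaging can be sketched on a toy data set as follows. This is only an illustration under stated assumptions: the entity records, the values norm_base = 0.5 and accelerator = 2, and the choice to treat an unseen feature as probability 0 are all made up for the example and do not describe Squirro's implementation.

```javascript
// Toy entities (made-up data): one class label plus a list of feature labels.
const entities = [
  { cls: "Zurich", features: ["Engineer", "Senior"] },
  { cls: "Zurich", features: ["Engineer", "Junior"] },
  { cls: "Geneva", features: ["Engineer", "Senior"] },
  { cls: "Geneva", features: ["Analyst", "Senior"] },
];

// Assumed parameter values, chosen for illustration only.
const NORM_BASE = 0.5;
const ACCELERATOR = 2;

// power_norm(x) = norm_base + x^(1/accelerator) * (1 - norm_base)
function powerNorm(x) {
  return NORM_BASE + Math.pow(x, 1 / ACCELERATOR) * (1 - NORM_BASE);
}

// P(C | f) = #E(C, f) / #E(f)
function conditional(cls, feature) {
  const withFeature = entities.filter((e) => e.features.includes(feature));
  if (withFeature.length === 0) return 0; // assumption: unseen feature counts as 0
  return withFeature.filter((e) => e.cls === cls).length / withFeature.length;
}

// Average of the normalized per-feature scores.
function nonCorrelatedScore(cls, features) {
  const total = features.reduce(
    (sum, f) => sum + powerNorm(conditional(cls, f)),
    0
  );
  return total / features.length;
}

const input = ["Engineer", "Junior"];
const zurich = nonCorrelatedScore("Zurich", input);
const geneva = nonCorrelatedScore("Geneva", input);
console.log({ zurich, geneva }); // Zurich scores higher than Geneva here
```

Because each feature is scored independently, a class still receives a score when no single entity contains every input feature.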

Correlated labels#

The score of each class C is the conditional probability of C given all input features.

By definition: score(C, f1, f2, …, fn) = P(C | f1, f2, …, fn) = #E(C, f1, f2, …, fn) / #E(f1, f2, …, fn)

where #E(f1, f2, …, fn) is the number of documents or entities that contain all of f1, f2, …, fn, and #E(C, f1, f2, …, fn) is the number of those that also contain C.

However, when a class does not contain all of the features f1, f2, …, fn, #E(f1, f2, …, fn) can be 0, which leaves the score undefined (division by zero). So, given n input features of which class C contains only l (l <= n), the final score is computed as:

score(C, f1, f2, …, fn) = (l + P(C | f1, f2, …, fl) * (n - l)) / n = (l + #E(C, f1, f2, …, fl) / #E(f1, f2, …, fl) * (n - l)) / n
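To make the base definition and its failure case concrete, here is a small sketch on toy data. The entity records are made up, and the function covers only the plain conditional probability; it returns null when no entity contains all input features and does not attempt the partial-match formula above.

```javascript
// Toy entities (made-up data): one class label plus a list of feature labels.
const entities = [
  { cls: "Zurich", features: ["Engineer", "Senior"] },
  { cls: "Zurich", features: ["Engineer", "Junior"] },
  { cls: "Geneva", features: ["Engineer", "Senior"] },
  { cls: "Geneva", features: ["Analyst", "Senior"] },
];

// True when the entity carries every one of the given features.
const hasAll = (entity, features) =>
  features.every((f) => entity.features.includes(f));

// P(C | f1..fn) = #E(C, f1..fn) / #E(f1..fn)
// Returns null when #E(f1..fn) is 0, because the ratio is then undefined.
function correlatedScore(cls, features) {
  const matching = entities.filter((e) => hasAll(e, features));
  if (matching.length === 0) return null;
  return matching.filter((e) => e.cls === cls).length / matching.length;
}

const seenTogether = correlatedScore("Zurich", ["Engineer", "Senior"]);
const neverTogether = correlatedScore("Zurich", ["Analyst", "Junior"]);
console.log(seenTogether);  // 1 of the 2 matching entities is Zurich
console.log(neverTogether); // null: no entity has both features
```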

Machine learning (Classification)#

Generates scored classes of one label (entity property) by using other labels (entity properties) as input. This is a typical classification task, and can be done via AI Studio.

We currently support most classifiers available through SKLearn, though a custom classifier is also possible.

Example machine learning workflow:

workflow.json:

{
  "name": "test",
  "config": {
    "dataset": {
      "query_string": "*"
    },
    "path": ".",
    "pipeline": [{
      "step": "loader",
      "type": "squirro_query",
      "fields": ["keywords.Salary", "keywords.City", "keywords.Job"]
    },{
      "step": "checkpoint",
      "type": "disk",
      "batch_size": 64
    },{
      "step": "classifier",
      "type": "sklearn",
      "model_type": "SVC",
      "model_kwargs": {"probability": true},
      "input_fields": ["keywords.Salary", "keywords.Job"],
      "label_field": "keywords.City",
      "output_field": "keywords.City",
      "explanation_field": "keywords.City_explanation"
    }]
  }
}

Machine learning (Regression Aggregation)#

Generates scored classes of one label (entity property) by aggregating another label (entity property). The scoring of individual items (entities) is a regression task, and can be done via the machine learning service.

The aggregation is then performed by a filter step in the machine learning workflow.

Squirro supports most regressors available through SKLearn, though a custom regressor is also possible.

Example machine learning workflow:

workflow.json:

{
  "name": "test",
  "config": {
    "dataset": {
      "query_string": "*"
    },
    "path": ".",
    "pipeline": [{
      "step": "loader",
      "type": "squirro_query",
      "fields": ["keywords.Salary", "keywords.City", "keywords.Job"]
    },{
      "step": "checkpoint",
      "type": "disk",
      "batch_size": 64
    },{
      "step": "classifier",
      "type": "sklearn",
      "model_type": "SVR",
      "input_fields": ["keywords.City", "keywords.Job"],
      "label_field": "keywords.Salary",
      "output_field": "keywords.Salary",
      "explanation_field": "keywords.Salary_explanation"
    },{
      "step": "filter",
      "type": "aggregate",
      "aggregated_fields": ["keywords.Salary"],
      "aggregating_field": "keywords.City"
    }]
  }
}

Understanding Power Normalization Functions#

When a probability is used directly as the score, the value is usually quite small.

For example, if the highest probability is 1/10, displaying it as 10% may make the user less confident about taking that recommendation.

Therefore, we define a function called power normalization that transforms the probability into a more reasonable value to display to the user:

power_norm(x) = norm_base + power(x, 1/accelerator) * (1 - norm_base )

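This function guarantees the following properties (writing F for power_norm, with 0 <= norm_base < 1 and accelerator > 0):

If 0 <= x <= 1, then 0 <= F(x) <= 1
If xi < xj, then F(xi) < F(xj)

where

accelerator: the higher this value, the more quickly the score approaches 1
norm_base: the base value that separates scores from 0 (power_norm(0) = norm_base)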
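The effect of the accelerator can be checked with a direct transcription of the formula. The function name `powerNorm` and the sample probability 0.1 are choices made for this sketch; the parameter values norm_base = 0.5 and accelerator = 1, 2, 3, 4 are used purely as an example.

```javascript
// power_norm(x) = norm_base + x^(1/accelerator) * (1 - norm_base)
function powerNorm(x, normBase, accelerator) {
  return normBase + Math.pow(x, 1 / accelerator) * (1 - normBase);
}

// A raw probability of 0.1 with norm_base = 0.5 and accelerator = 1..4.
const normalized = [1, 2, 3, 4].map((acc) => powerNorm(0.1, 0.5, acc));
console.log(normalized.map((v) => v.toFixed(3)));
// -> [ '0.550', '0.658', '0.732', '0.781' ]
```

A raw score of 10% is therefore displayed as a value between 55% and about 78%, while the ordering of the underlying probabilities is preserved.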

The figure below shows different values of power normalization function with norm_base = 0.5 and accelerator = 1, 2, 3, 4

Visualising Recommendations#

This section discusses how recommendations can be visualized within your projects.

Items and Entities#

Given a project with entities, the appropriate call can be made to the Squirro API to fetch a list of recommendations and related items.

Entity Based Recommendations#

It is possible to retrieve recommended content based on entities and machine learning. For example, if you were interested in “job opportunities”, you might be able to build visualizations like the “Job Recommendation App” below.

Job Recommendation App - Example#

The small application below was built using the following widgets: Search Bar, Entities, Recommendations and the Entities List.

Job Recommendation App - How-to Video Tutorial#

The video below will show you how to quickly set up a “Job Recommendation App” using the Dashboard Editor.

Using pre-configured Input Features#

You can create dashboards that show the REC Results widget with results already populated, so that the user does not need to select options in the REC Input dropdown first.

The same values are also pre-selected in the REC Input dropdown and are used by the REC Explanations widget.

To do this, pass the input_features JSON object to the Dashboard Store. In the example below, “Catalyst” is an entity type and “Expansion” and “Earnings” are entity values.

To understand how input_features affect results, see SquirroClient Entities Recommend Endpoint.

Note: The value is passed as _inputFeatures on the Dashboard Store and is converted to input_features when it is passed to the API.

Input Features Example

{
    "_inputFeatures": {
        "Catalyst": [ "Expansion", "Earnings" ]
    }
}

Adding Actions - Integrating with Third Parties#

Extend Recommendations to allow seamless integration into third parties like Salesforce and ServiceNow by adding Actions to the top right-hand corner of the Recommendations card.

In a Custom Widget you can do the following:

Adding Custom Actions

return Widgets.Recommendations.extend({
    customCardActions: [
        {
            iconName: 'business',
            action: function(ev, model, widgetModel) {
                // add your own logic for the action 'click' event.
                var $cardElem = $(ev.currentTarget).closest('.vRecommendationRowContent');
                var selectedRecId = $cardElem.data('rec-target-value');
                Materialize.toast('Opportunity for ' +  selectedRecId + ' has been created!', 4000);
                $(ev.currentTarget).closest('.card').remove();
            },
        },
        {
            iconName: 'close',
            action: function(ev, model, widgetModel) {
                // How to get additionalReturnedFeatures from the API
                // 'model' is the recommendation model
                // 'Sector' and 'Industry' are fetched below by passing them into the additionalReturnedFeatures property.
                // Now they exist on our model
                console.log(model.get('return_features').Sector);
                console.log(model.get('return_features').Industry);
            },
        },
        {
            iconName: 'search',
            action: (ev, model, widgetModel) => {
                // widgetModel is the widget model as a JSON object.
                // You can get your widget properties or custom widget properties
                // which you define in your config.json in a custom widget
                console.log(widgetModel.customName);
            },
        },
    ],

    additionalReturnedFeatures: ['Sector', 'Industry']
});

Feedback for Entity Recommendations#

To enable entity feedback you must create a studio plugin to handle the data which comes from the UI in the frontend. Outlined below are the various POST objects which can be consumed with the custom studio plugin endpoint.

To enable this feature, set the “Feedback endpoint” option in the configuration of the “REC Explanations” widget or the “Result List” widget so that it points to your custom studio plugin.

Example Data Posts#

On Click thumbs-up / On Click thumbs-down:#

{
  action: "positive || negative"
  entityId: "apxq8rIOZA2JbdeKeR2JSw"
  itemId: "rMUhbBQu-x9NulyqaD0rhw"
  projectId: "o-WW8l2rTKefV1WhUovsVg"
  textExtract: "[nS8N2CV0BG]↵    ↵    * MICHELIN  a fait état lundi d'une chute de 78,4%↵de son résultat opérationnel au
  premier semestre, conséquence du↵coup d'arrêt du marché automobile provoqué par l'épidémie de↵coronavirus."
  userId: "gMuEUQBPT3iEKtUBssAegg"
  username: "[email protected]"
}

After clicking thumbs-up and adding a comment, or after clicking thumbs-down#

{
  action: "positive_feedback || not_interesting"
  entityId: "apxq8rIOZA2JbdeKeR2JSw"
  feedbackText: "Hi"
  itemId: "rMUhbBQu-x9NulyqaD0rhw"
  projectId: "o-WW8l2rTKefV1WhUovsVg"
  userId: "gMuEUQBPT3iEKtUBssAegg"
}

On Submission of an Incorrect Classification

{
  action: "incorrect_classification"
  catalyst_change_from: "Rumour"
  catalyst_change_to: "Layoff"
  entityId: "apxq8rIOZA2JbdeKeR2JSw"
  feedbackText: "This is a demo"
  itemId: "rMUhbBQu-x9NulyqaD0rhw"
  projectId: "o-WW8l2rTKefV1WhUovsVg"
  userId: "gMuEUQBPT3iEKtUBssAegg"
}
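The three payload shapes above can be summarized in a small check. This is a hypothetical sketch rather than studio plugin code: the `missingFields` helper and the field lists are inferred only from the example payloads, every field shown in an example is treated as required (an assumption), and the `catalyst_change_from` and `catalyst_change_to` names are copied from the example as-is.

```javascript
// Fields seen in the example payloads, keyed by action.
// Inferred from the examples only; this is not an official schema.
const COMMON = ["entityId", "itemId", "projectId", "userId"];
const FIELDS_BY_ACTION = new Map([
  ["positive", [...COMMON, "textExtract", "username"]],
  ["negative", [...COMMON, "textExtract", "username"]],
  ["positive_feedback", [...COMMON, "feedbackText"]],
  ["not_interesting", [...COMMON, "feedbackText"]],
  [
    "incorrect_classification",
    [...COMMON, "feedbackText", "catalyst_change_from", "catalyst_change_to"],
  ],
]);

// Returns the names of missing fields, or null for an unknown action.
function missingFields(payload) {
  const expected = FIELDS_BY_ACTION.get(payload.action);
  if (!expected) return null;
  return expected.filter((field) => !(field in payload));
}

const incomplete = {
  action: "not_interesting",
  entityId: "apxq8rIOZA2JbdeKeR2JSw",
  itemId: "rMUhbBQu-x9NulyqaD0rhw",
  projectId: "o-WW8l2rTKefV1WhUovsVg",
  userId: "gMuEUQBPT3iEKtUBssAegg",
};
console.log(missingFields(incomplete)); // feedbackText is reported as missing
```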