BERTSentiment

class squirro.lib.nlp.steps.classifiers.BERTSentiment(config)

Bases: squirro.lib.nlp.steps.classifiers.Classifier

BERT Sentiment. Detects the sentiment of text fragments using transformer-based pre-trained models. The following pre-trained models are provided:

DistilBERT: A pre-trained NLP model for sentiment analysis that is smaller and faster than the BERT model. For more information, see: https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english

The model predicts a sentiment label for the input text, one of [“positive”, “negative”], and a confidence score for that label. Currently only the predicted label is used.

FinBERT: FinBERT is a pre-trained NLP model to analyze sentiment of financial text. It is built by further training the BERT language model in the finance domain, using a large financial corpus and thereby fine-tuning it for financial sentiment classification. For more details please visit: https://huggingface.co/ProsusAI/finbert.

The model predicts a sentiment label for the input text, one of [“positive”, “neutral”, “negative”], and a confidence score for that label. Currently only the predicted label is used.

Example:

>>> classifier('stocks rallied and the British pound gained.')
[{'label': 'positive', 'score': 0.8983614444732666}]
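
Because only the predicted label is used, the confidence score is discarded. The following is a minimal sketch of that selection, assuming the classifier returns a list of dictionaries shaped like the example output above. The helper name `extract_label` is hypothetical and not part of the Squirro API.

```python
from typing import Dict, List, Union

Prediction = Dict[str, Union[str, float]]


def extract_label(predictions: List[Prediction]) -> str:
    """Return the label of the first prediction and ignore its score.

    Hypothetical helper: assumes the output shape shown in the example.
    """
    if not predictions:
        raise ValueError("classifier returned no predictions")
    return str(predictions[0]["label"])


# Output copied from the documented example.
output = [{"label": "positive", "score": 0.8983614444732666}]
label = extract_label(output)
print(label)
```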

FinBERT is not installed by default. To install it, run `sudo yum install squirro-finbert`.

Note: Truncation is enabled when the pre-trained BERT model is executed, so that input sentences exceeding the model's maximum input length are shortened.
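
The effect of truncation can be illustrated with a small sketch. It is not the tokenizer's actual implementation: it assumes a limit of 512 tokens (typical for BERT-base style models, but not stated in this document), uses plain strings as a stand-in for real subword tokens, and the function name `truncate_tokens` is hypothetical.

```python
from typing import List

MAX_LENGTH = 512  # assumed limit; typical for BERT-base style models


def truncate_tokens(tokens: List[str], max_length: int = MAX_LENGTH) -> List[str]:
    """Keep the leading tokens so the sequence, plus [CLS] and [SEP], fits."""
    if max_length < 2:
        raise ValueError("max_length must leave room for [CLS] and [SEP]")
    budget = max_length - 2  # reserve room for the two special tokens
    return ["[CLS]"] + tokens[:budget] + ["[SEP]"]


# A 600-token input is cut down; a short input is left unchanged.
long_input = ["word"] * 600
truncated = truncate_tokens(long_input)
short = truncate_tokens(["stocks", "rallied"])
print(len(truncated), short)
```

Anything beyond the limit is dropped from the end of the input, so sentiment expressed late in a very long text does not reach the model.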

Parameters
  • type (str) – bertsentiment

  • model_name (str) – finbert or distilbert

  • input_fields (list) – Fields used to detect sentiment. The text fragments from all fields in the list are joined.

  • output_field (str) – Field to assign sentiment to

  • pretrained_models_dir (str, None) – Directory where the pre-trained models are stored (default: “/var/lib/squirro/machinelearning/pretrained_models”)
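
As an illustration, the parameters above could be collected in a configuration like the one sketched below. Only the documented keys are used. The field names `body` and `sentiment` and the helper `check_config` are assumptions for this example, and a real pipeline definition may require further keys that are not covered here.

```python
import json
from typing import Any, Dict, List

# Hypothetical configuration built from the documented parameters only.
step_config: Dict[str, Any] = {
    "type": "bertsentiment",
    "model_name": "finbert",  # or "distilbert"
    "input_fields": ["body"],  # assumed field name
    "output_field": "sentiment",  # assumed field name
    "pretrained_models_dir": "/var/lib/squirro/machinelearning/pretrained_models",
}


def check_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in the config (empty if none)."""
    problems = []
    if config.get("type") != "bertsentiment":
        problems.append("type must be 'bertsentiment'")
    if config.get("model_name") not in {"finbert", "distilbert"}:
        problems.append("model_name must be 'finbert' or 'distilbert'")
    if not isinstance(config.get("input_fields"), list):
        problems.append("input_fields must be a list")
    if not isinstance(config.get("output_field"), str):
        problems.append("output_field must be a string")
    return problems


problems = check_config(step_config)
print(json.dumps(step_config, indent=2))
```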

Methods Summary

process_doc(doc)

Process a document

Methods Documentation

process_doc(doc)

Process a document

Parameters

doc (Document) – Document

Returns

Processed document

Return type

Document
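
The behavior described for this step (join the input fields, classify the text, store the label) can be sketched as follows. This is not Squirro's implementation: the document is modeled as a plain dictionary, the fragments are joined with a single space (the separator is an assumption), and `process_doc_sketch` and `stub_classifier` are hypothetical names, with the stub standing in for the pre-trained model.

```python
from typing import Callable, Dict, List

Classifier = Callable[[str], List[Dict[str, object]]]


def process_doc_sketch(
    doc: Dict[str, str],
    input_fields: List[str],
    output_field: str,
    classifier: Classifier,
) -> Dict[str, str]:
    """Join the input fields, classify the text, and store the label."""
    # Joining with a single space is an assumption; missing or empty
    # fields are skipped.
    text = " ".join(doc[name] for name in input_fields if doc.get(name))
    predictions = classifier(text)
    # Only the label is kept; the score is ignored.
    doc[output_field] = str(predictions[0]["label"])
    return doc


def stub_classifier(text: str) -> List[Dict[str, object]]:
    """Stand-in for the pre-trained model; always answers 'positive'."""
    return [{"label": "positive", "score": 1.0}]


doc = {"title": "Markets", "body": "stocks rallied and the British pound gained."}
result = process_doc_sketch(doc, ["title", "body"], "sentiment", stub_classifier)
print(result["sentiment"])
```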