Question Answering#

Profiles: Project Creator, Search User

This page provides an overview of the Question Answering (QA) feature in Squirro.

Project creators can configure the feature and choose where it’s displayed. Search users can use the feature to discover answers to queries typed as questions.

Reference: For project creator configuration instructions, see How To Use The Question Answering Feature.

Introduction#

Question Answering is a feature that allows users to discover direct answers by phrasing their search query in the form of a question.

In the Cognitive Search template, it can be accessed by clicking the dialogue icon (highlighted in orange) as shown in the example below:

(Screenshot: the Question Answering dialogue icon in the Cognitive Search interface)

In this example, asking what FDA means and clicking the QA icon returns an answer showing that FDA stands for Food and Drug Administration.

Clicking the answer opens the document in the viewer, with the passage used to provide the answer highlighted.

Tip: You only need to phrase your query in the form of a question; you do not need to add a question mark (?) at the end.

Note

There will not always be an answer available for your query. The widget’s ability to provide an answer depends both on how the question is phrased and on the information available in your project’s data sources.

Background#

Question Answering is a general-purpose task in which a question is answered by extracting the answer as a span from a context text supplied together with the question, as in the following example:

Question: "What color is it?"
Context: "The tomato is red."

Answer: "red"

If the answer cannot be found as part of the context, the question is deemed unanswerable and the answer is an empty string, as in the example:

Question: "What is his nickname?"
Context: "The tomato is red."

Answer: "" (unanswerable)

However, QA is not a closed-book Question Answering model: it always requires a context, supplied along with the question, from which the answer can potentially be extracted as a span of words.

Furthermore, QA is not a generative Question Answering model: it can only answer with text taken verbatim from the context.

Frequently Asked Questions#

Is it necessary to fine-tune a QA model for a specific use case?#

In general, this is not required. If the questions and context texts are formulated in natural (English) language, the model should be able to predict the answer from the context or say “unanswerable” if the context does not contain an extractable answer.

Fine-tuning a new model might be necessary only if the target distribution is very different from the training distribution, e.g. if the questions and contexts are formulated in a different language, or if the English terms used differ radically from what a native English speaker would expect. A sketch of the procedure follows below.
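Should fine-tuning become necessary, the sketch below shows the general shape of the procedure using the Hugging Face transformers Trainer on SQuAD2.0-format data. The base model, hyperparameters, and output directory are illustrative assumptions, not Squirro defaults.

```python
# A minimal sketch of fine-tuning an extractive QA model on SQuAD2.0-style
# data with the Hugging Face transformers Trainer. Model and hyperparameters
# are illustrative only.
from datasets import load_dataset
from transformers import (AutoModelForQuestionAnswering, AutoTokenizer,
                          DefaultDataCollator, Trainer, TrainingArguments)

model_name = "distilbert-base-uncased"  # assumed base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
squad = load_dataset("squad_v2")

def preprocess(examples):
    # Tokenize question/context pairs and map each answer to token
    # positions; unanswerable questions are pointed at the CLS token.
    inputs = tokenizer(
        examples["question"], examples["context"],
        max_length=384, truncation="only_second",
        return_offsets_mapping=True, padding="max_length",
    )
    start_positions, end_positions = [], []
    for i, offsets in enumerate(inputs["offset_mapping"]):
        answer = examples["answers"][i]
        if not answer["text"]:  # unanswerable -> label the CLS token
            start_positions.append(0)
            end_positions.append(0)
            continue
        start_char = answer["answer_start"][0]
        end_char = start_char + len(answer["text"][0])
        sequence_ids = inputs.sequence_ids(i)
        ctx_start = sequence_ids.index(1)
        ctx_end = len(sequence_ids) - 1 - sequence_ids[::-1].index(1)
        # Treat answers truncated out of the window as unanswerable.
        if offsets[ctx_start][0] > start_char or offsets[ctx_end][1] < end_char:
            start_positions.append(0)
            end_positions.append(0)
            continue
        # Move token pointers onto the answer span.
        idx = ctx_start
        while idx <= ctx_end and offsets[idx][0] <= start_char:
            idx += 1
        start_positions.append(idx - 1)
        idx = ctx_end
        while idx >= ctx_start and offsets[idx][1] >= end_char:
            idx -= 1
        end_positions.append(idx + 1)
    inputs["start_positions"] = start_positions
    inputs["end_positions"] = end_positions
    inputs.pop("offset_mapping")
    return inputs

train = squad["train"].map(preprocess, batched=True,
                           remove_columns=squad["train"].column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments("qa-finetune", per_device_train_batch_size=16,
                           num_train_epochs=2, learning_rate=3e-5),
    train_dataset=train,
    data_collator=DefaultDataCollator(),
)
trainer.train()
```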

How many samples are needed for fine-tuning a QA model?#

It is not possible to give a definitive answer, as it depends on many interacting factors.

However, SQuAD2.0, the de facto standard dataset for training and evaluating QA models, can serve as a guide.

It contains a total of about 150,000 question-context pairs.

Of these 150,000 samples, about 50,000 questions are unanswerable from the supplied context and require the model to predict no answer.

The dataset is split into a training set, a validation set, and a test set containing 129,941, 6,078, and 5,915 sample pairs, respectively.
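For a hands-on look at the data, the splits and the share of unanswerable questions can be inspected with the Hugging Face datasets library, as in this short sketch. Note that the publicly downloadable release ships only the train and validation splits; the official test set answers are withheld for the SQuAD leaderboard.

```python
# A short sketch inspecting SQuAD2.0 with the Hugging Face datasets library.
# Note: the public download ships only the train and validation splits; the
# official test set is withheld for the SQuAD leaderboard.
from datasets import load_dataset

squad = load_dataset("squad_v2")

for split, ds in squad.items():
    # Unanswerable questions have an empty answers["text"] list.
    unanswerable = sum(1 for ex in ds if not ex["answers"]["text"])
    print(f"{split}: {len(ds)} samples, {unanswerable} unanswerable")
```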

Fine-tuning a model to achieve comparable performance to models trained on the SQuAD2.0 task will likely require constructing a dataset with similar properties.

Configuration#

The QA feature can be configured in Setup > Settings > Project Configuration.

Configuration key: topic.search.qa-configuration

Reference: Configuration Schema