Known Entity Extraction#
This page introduces Known Entity Extraction (KEE) and links to information about its setup, configuration, and testing using the command line interface (CLI) tool or the Studio plugin.
KEE is a proprietary Squirro technology that enriches data after it is loaded. KEE links your unstructured data to structured information.
Your unstructured data can take any form, including Word or PDF documents, emails, news articles, call notes, and social media content.
The following are examples of structured information that can be linked to your unstructured data using KEE:
Company lists from a CRM system such as Salesforce.
Portfolios of securities held by a specific investor in an asset management system.
Lists of people in a user authentication system.
Product lists from internal databases.
This documentation explains how to create links between structured information and unstructured data using the KEE functionality.
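To make the idea of linking concrete, the following is a minimal conceptual sketch in Python. It is not KEE's actual matching logic or API: the `KNOWN_COMPANIES` mapping, the `crm-...` identifiers, and the `link_entities` function are all hypothetical names invented for illustration, and the matching is a plain whole-word, case-insensitive search.

```python
import re

# Hypothetical structured data: company names mapped to CRM identifiers.
KNOWN_COMPANIES = {
    "Acme Corporation": "crm-001",
    "Globex": "crm-002",
}


def link_entities(text, known_entities):
    """Return {entity_name: entity_id} for each known entity named in text."""
    matches = {}
    for name, entity_id in known_entities.items():
        # Whole-word, case-insensitive match of the full entity name.
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            matches[name] = entity_id
    return matches


# Hypothetical unstructured data, for example a call note.
note = "Discussed renewal terms with Acme Corporation and a globex subsidiary."
links = link_entities(note, KNOWN_COMPANIES)
print(links)
```

In this sketch, the returned mapping is the "link": it ties a piece of free text to records in a structured list. KEE itself is configured rather than hand-coded, as described in the sections below.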
As KEE is a proprietary Squirro technology, it’s recommended that you familiarize yourself with the core Squirro concepts before working with KEE.
In particular, make sure you understand the Squirro Architecture and Item Format.
KEE Set Up#
There are two ways to work with KEE: the Studio plugin (in your browser) or the command line interface (CLI).
To access the Studio plugin, navigate to Setup > AI Studio > Known Entity Extraction in your Squirro instance.
To learn more about using KEE in the Studio plugin, see the KEE Studio Plugin page.
For a step-by-step walkthrough of how to create a KEE using the Studio plugin, see the KEE Studio Plugin Tutorial.
Command Line Interface#
You can set up and deploy a KEE to your Squirro project using the CLI Tool, which is included in the Squirro Toolbox.
To learn more about using KEE from the CLI, see the KEE CLI Tool page.
For a step-by-step walkthrough of how to create a KEE from your CLI, see the KEE CLI Tool Tutorial.
Configuration options are stored in the config.json file in the KEE root folder.
For information on format, structure, variables, custom pipelets, and other configuration references, see the KEE Configuration page.
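If you want to inspect a configuration programmatically, a small helper along the following lines can read the file. This is a sketch only: `load_kee_config` is a hypothetical helper name, not part of the Squirro Toolbox, and it assumes nothing about the contents of config.json beyond it being a JSON object. See the KEE Configuration page for the actual keys.

```python
import json
from pathlib import Path


def load_kee_config(kee_root):
    """Read config.json from the given KEE root folder and return it as a dict."""
    config_path = Path(kee_root) / "config.json"
    with config_path.open(encoding="utf-8") as handle:
        config = json.load(handle)
    if not isinstance(config, dict):
        # Assumption: the top level of the file is a JSON object.
        raise ValueError(f"{config_path} does not contain a JSON object")
    return config
```

For example, `sorted(load_kee_config("path/to/kee"))` lists the top-level option names, where `path/to/kee` stands in for your own KEE root folder.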
Tokenizers and Filters#
Tokenizers and filters process both the entity names and the input text so that the two are compatible for matching.
For information on built-in and custom tokenizers and filters, see the Tokenizers and Filters page.
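The general idea can be sketched in plain Python. The functions below are illustrative assumptions, not KEE's built-in tokenizers or filters: `tokenize`, `lowercase_filter`, `suffix_filter`, and the suffix list are hypothetical. The point is that the same processing is applied to both the entity name and the input text, so that variants such as "Acme, Inc." and "ACME" reduce to the same tokens.

```python
import re


def tokenize(text):
    """Split text into word tokens (a simple illustrative tokenizer)."""
    return re.findall(r"\w+", text)


def lowercase_filter(tokens):
    """Lowercase every token so that matching ignores case."""
    return [token.lower() for token in tokens]


def suffix_filter(tokens, suffixes=("inc", "ltd", "corp")):
    """Drop legal-form suffixes (hypothetical list) from lowercased tokens."""
    return [token for token in tokens if token not in suffixes]


def normalize(text):
    """Apply the tokenizer, then each filter in order."""
    return suffix_filter(lowercase_filter(tokenize(text)))


def contains_sequence(tokens, sequence):
    """Return True if sequence occurs as a contiguous run inside tokens."""
    n = len(sequence)
    return n > 0 and any(
        tokens[i:i + n] == sequence for i in range(len(tokens) - n + 1)
    )


entity_tokens = normalize("Acme, Inc.")
text_tokens = normalize("ACME announced quarterly results")
print(contains_sequence(text_tokens, entity_tokens))  # prints True
```

Without the shared processing, the raw strings "Acme, Inc." and "ACME" would not match each other; after it, both reduce to the single token `acme`.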
You can use the test subcommand of the KEE CLI tool to test items.
For more information, see Testing.