Retrieval Augmented Generation#

Retrieval Augmented Generation (RAG) is a technology that combines the capabilities of retrieval systems with language models to produce more accurate, context-aware responses. RAG enables conversational AI applications and workflows to produce higher-quality outputs without requiring costly model retraining, making it a scalable and efficient solution for a wide range of use cases and industries.

Understanding RAG#

Retrieval Augmented Generation adds an additional processing layer to conversations between a user and a language model. Instead of passing prompts directly to the LLM, the system first analyzes the user input and augments it with additional knowledge.

SquirroGPT, one of the core components of the Squirro platform, natively supports RAG and drastically simplifies interactions with enterprise systems by using natural language processing (NLP), machine learning (ML), and large language models (LLMs). Its chat feature is fully integrated with the agent framework and can access various data types and formats, often dispersed across multiple systems and platforms. Agents combine the reasoning and language generation capabilities of LLMs with retrieval mechanisms for querying enterprise data, structured or unstructured, as well as external data sources.

Standard RAG#

With each user prompt, the system transforms the input into an optimized query and passes it to the Squirro Retriever, the built-in tool that grants SquirroGPT access to the Squirro search engine. The retriever quickly surfaces relevant information across vast amounts of indexed data, and organizations can extend its reach using data connectors. The system then combines the user input with the retrieved content and uses the LLM to generate a coherent, contextually relevant response. This mechanism supports a broad range of retrieval approaches: the LLM can call the retriever iteratively, adjusting the query whenever it deems the initial search results insufficient to generate a response.
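
The loop described above can be sketched as follows. This is a minimal illustration only: the in-memory keyword index stands in for the Squirro Retriever and search engine, and `transform_query`, `retrieve`, and the bracketed "LLM answer" strings are hypothetical placeholders, not Squirro APIs.

```python
import re

# Toy corpus standing in for indexed enterprise data.
DOCUMENTS = {
    "doc-1": "SquirroGPT supports retrieval augmented generation natively.",
    "doc-2": "Data connectors extend retrieval to external systems.",
}

STOPWORDS = {"what", "is", "the", "a", "an", "how", "does"}

def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def transform_query(prompt: str) -> set[str]:
    """Stand-in for the LLM turning a prompt into an optimized query."""
    return tokens(prompt) - STOPWORDS

def retrieve(terms: set[str]) -> list[str]:
    """Toy keyword retriever: return documents sharing a term with the query."""
    return [text for text in DOCUMENTS.values() if terms & tokens(text)]

def answer(prompt: str, max_rounds: int = 3) -> str:
    """Retrieve, and iteratively broaden the query if results are insufficient."""
    terms = transform_query(prompt)
    for _ in range(max_rounds):
        hits = retrieve(terms)
        if hits:  # enough context: combine prompt and retrieved content
            return f"[LLM answer to {prompt!r} grounded in: " + " ".join(hits) + "]"
        terms = set(list(terms)[:-1])  # broaden by dropping a term, then retry
    return "[LLM answer without retrieved context]"
```

In production the query transformation and answer generation are performed by the LLM itself, and retrieval runs against the full search index rather than a dictionary.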

By decoupling the retrieval layer from the LLM, organizations can easily benefit from the latest advancements in Squirro’s technology while maintaining flexibility in their language model deployments. This separation enables smooth transitions between different LLM versions or providers, allowing organizations to capitalize on the latest breakthroughs in model development.
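One way to picture this decoupling is a narrow interface between the retrieval layer and the model: any provider that satisfies the interface can be swapped in without touching retrieval. The sketch below is illustrative; the `LLM` protocol and provider classes are hypothetical, not part of the Squirro platform.

```python
from typing import Protocol

class LLM(Protocol):
    """Any model provider only needs to implement generate()."""
    def generate(self, prompt: str, context: str) -> str: ...

class ProviderA:
    def generate(self, prompt: str, context: str) -> str:
        return f"ProviderA answer to {prompt!r} using {context!r}"

class ProviderB:
    def generate(self, prompt: str, context: str) -> str:
        return f"ProviderB answer to {prompt!r} using {context!r}"

def rag_answer(llm: LLM, prompt: str, retrieved: str) -> str:
    # The retrieval layer stays fixed; only the llm argument changes
    # when the organization switches model versions or providers.
    return llm.generate(prompt, retrieved)
```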

Standard RAG approaches require ingesting and indexing data, a process that can be resource-intensive. When new data sources are added or existing content changes, the system must undergo a re-indexing phase to remain accurate and efficient. This can mean organizations need more storage capacity and processing power, increasing costs. The Squirro technology layers can natively manage these re-processing stages with ease and at scale, supporting a broad range of use cases. However, specific business requirements, such as accessing structured and unstructured data in real-time or near real-time without indexing, may require enhancements to the standard RAG approach.

Enhanced RAG#

The agent framework enhances the standard RAG approach by providing access to an extended range of data sources, tapping into additional external or internal knowledge while bypassing the need for ingestion and indexing. Data from these sources is accessed and queried in real-time, either through a direct connection to the system or through the Squirro Data Virtualization add-on. Organizations maintain oversight and control over the information accessed, in accordance with their security policies.

The system processes user prompts by combining each prompt with system instructions and passing the result directly to the LLM for content analysis. At this point in the workflow, the model is aware of the available tools it can call to assist with response generation. Based on the analysis, the model transforms the user input into an optimized format and delegates specific tasks to one or more tools. These tools can interact with remote APIs and systems, or execute software programs and scripts to gather additional information or perform subtasks. The processed data returns to the LLM, which reanalyzes the full context and generates the response to the user prompt.
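
The steps above can be condensed into a small dispatch loop. Everything here is a hypothetical sketch: the tool registry, the `model_step` heuristic standing in for the LLM's tool-selection reasoning, and the bracketed final answer are placeholders, not the agent framework's actual API.

```python
# Hypothetical tool registry; in the real workflow these would call
# remote APIs, systems, or scripts.
def search_tool(query: str) -> str:
    return f"results for {query!r}"

def count_tool(text: str) -> str:
    return str(len(text.split()))

TOOLS = {"search": search_tool, "count": count_tool}

def model_step(prompt: str) -> dict:
    """Stand-in for the LLM analyzing the prompt and choosing a tool."""
    if prompt.startswith("count:"):
        return {"tool": "count", "args": {"text": prompt[len("count:"):]}}
    return {"tool": "search", "args": {"query": prompt}}

def run_agent(prompt: str) -> str:
    call = model_step(prompt)                    # 1. model analyzes the prompt
    result = TOOLS[call["tool"]](**call["args"]) # 2. delegated tool executes
    # 3. tool output returns to the model, which generates the final response
    return f"[final answer using {call['tool']} output: {result}]"
```

A real agent may run several such rounds, with the model deciding after each tool result whether it has enough context to answer.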

Key components#

The Enterprise GenAI Platform component is the central hub for the enhanced RAG mechanics, providing a robust foundation whose core features and capabilities support the other key components.

[Image: RAG Key Components]

Agent Framework#

The Squirro agent framework is a structured and reusable software environment that simplifies the development, deployment, and management of AI agents. Learn more

AI Guardrails#

The AI Guardrails add-on provides a strategic framework that enforces strict guidelines for content generation. Learn more

Privacy Layer#

The Privacy Layer add-on is a set of technologies and protocols designed to protect personally identifiable information (PII) by removing it from the content transferred to the large language model with minimal impact. Learn more

Data Virtualization#

The Data Virtualization add-on extends the capabilities of AI workflows by providing frictionless access to a vast array of structured and unstructured data in real-time or near real-time. Learn more

Knowledge Graph#

The Knowledge Graph add-on combines RAG with semantic technology to deliver more accurate, comprehensive, and reliable results.