Agentic RAG
Agentic RAG is an advanced form of Retrieval-Augmented Generation (RAG) that uses autonomous agents to improve query understanding and retrieval accuracy. Because it adapts the retrieval process to the context and intent of each user query, Agentic RAG is particularly effective when working with structured data in enterprise applications.
Why Agentic RAG?
The current search architecture is optimized for unstructured data and shows limitations when handling structured data from sources like Jira. Highly specific queries yield suboptimal results because the retrieval configuration is static.
Agentic RAG adds an LLM-driven layer that replaces this one-size-fits-all approach. Its agents analyze the intent of each query to determine the optimal retrieval parameters and filters, so the search strategy adapts to the query context.
Agents in Agentic RAG
Search AI currently provides the following four agents.
- Query Rephrasing Agent
The Query Rephrasing Agent clarifies user queries by taking the conversation context and user intent into account. It uses previous conversation turns to rewrite the user input into a more precise, self-contained query.
Example:
- User Query: "What about India?"
- Previous Conversation: "What is the leave policy for America?"
- Rephrased Query: "What is the leave policy for India?"
This agent is currently available only for the advanced Search API. Please refer to the API documentation to learn more about its usage.
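The rephrasing step can be pictured as prompt construction over the conversation history. The following is a minimal sketch, not the product's implementation: the function names `build_rephrase_prompt` and `rephrase`, the prompt wording, and the `fake_llm` stand-in are all hypothetical and exist only to make the idea concrete.

```python
from typing import Callable, List


def build_rephrase_prompt(history: List[str], query: str) -> str:
    """Combine prior turns and the new query into one instruction (hypothetical wording)."""
    turns = "\n".join(f"- {turn}" for turn in history)
    return (
        "Rewrite the latest user query as a standalone question, "
        "using the previous conversation for context.\n"
        f"Previous conversation:\n{turns}\n"
        f"Latest query: {query}\n"
        "Standalone query:"
    )


def rephrase(history: List[str], query: str, llm: Callable[[str], str]) -> str:
    # With no history there is nothing to resolve, so the query is returned unchanged.
    if not history:
        return query
    return llm(build_rephrase_prompt(history, query)).strip()


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call; it returns a canned answer so the sketch runs offline.
    return " What is the leave policy for India? "


print(rephrase(["What is the leave policy for America?"], "What about India?", fake_llm))
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM client.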
- Result Type Classification Agent
The Result Type Classification Agent interprets queries to determine whether the user seeks a specific answer or a list of search results. This ensures that the application responds with the most appropriate result type.
For instance,
- Query: "Give me a list of tickets assigned to John."
- Output: search
- Query: "Give me details on the custom embeddings story."
- Output: answers
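One way to consume the classification is to map the label returned by the LLM onto a fixed set of result types. The sketch below assumes the two labels shown in the example (`search` and `answers`); the `ResultType` enum, the `parse_result_type` helper, and the fallback to `answers` for unrecognized output are hypothetical choices, not documented behavior.

```python
from enum import Enum


class ResultType(Enum):
    SEARCH = "search"    # the user wants a list of results
    ANSWERS = "answers"  # the user wants a single, specific answer


def parse_result_type(llm_output: str,
                      default: ResultType = ResultType.ANSWERS) -> ResultType:
    """Normalize the raw label from the model and map it to a ResultType."""
    label = llm_output.strip().lower()
    for result_type in ResultType:
        if result_type.value == label:
            return result_type
    # Assumption: fall back to a default when the model returns an unexpected label.
    return default


print(parse_result_type("search"))
print(parse_result_type(" Answers\n"))
```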
- Query Transformation Agent
The Query Transformation Agent identifies key terms within a query, removing noise and prioritizing relevant documents. Extracting meaningful phrases and keywords ensures that results are aligned with the user's intent.
Query: "What is the work-from-home policy for Kore.ai?"
Processing:
- Extracted Key Terms: "Work from Home Policy", "Kore.ai"
- Refined Query: "Kore.ai work-from-home policy details."
- Boosting Applied: Documents containing these terms in the title or content are ranked higher.
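The boosting described above can be illustrated with a simple term-match score. This is a sketch under stated assumptions: the `Doc` type, the `boost_score` and `rank` functions, and the weights (2.0 for a title match, 1.0 for a content match) are hypothetical, and the actual ranking in Search AI may combine boosts with other relevance signals.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Doc:
    title: str
    content: str


def boost_score(doc: Doc, key_terms: List[str],
                title_weight: float = 2.0, content_weight: float = 1.0) -> float:
    """Add a boost for every key term found in the title or the content."""
    title = doc.title.lower()
    content = doc.content.lower()
    score = 0.0
    for term in key_terms:
        needle = term.lower()
        if needle in title:
            score += title_weight    # assumed weight for a title match
        if needle in content:
            score += content_weight  # assumed weight for a content match
    return score


def rank(docs: List[Doc], key_terms: List[str]) -> List[Doc]:
    # Higher boost first; sorted() is stable, so ties keep their input order.
    return sorted(docs, key=lambda d: boost_score(d, key_terms), reverse=True)


docs = [
    Doc("Travel Policy", "Kore.ai travel reimbursement rules."),
    Doc("Cafeteria Menu", "Lunch options for this week."),
    Doc("Work from Home Policy", "Kore.ai employees may work remotely."),
]
for doc in rank(docs, ["Work from Home Policy", "Kore.ai"]):
    print(doc.title)
```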
- Metadata Extractor Agent
The Metadata Extractor Agent extracts relevant sources and fields from a query, maps them to structured data, and applies filters or boosts for accurate retrieval. This agent ensures the system applies appropriate filters to refine the results.
Query: "Find Jira tickets assigned to John with status ‘In Progress’ and priority ‘High’."
Processing:
- Source Identified: Jira
- Extracted Fields: Assignee, Status, Priority
- Action Taken:
- Filter Applied
- No Boost Needed
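The Jira example above can be sketched as a structured extraction that is then applied as an exact-match filter. The dictionary layout, the field names (`assignee`, `status`, `priority`), and the `apply_filters` helper are hypothetical illustrations; the agent's real output format is not specified here.

```python
from typing import Any, Dict, List

# Hypothetical shape of what the agent might extract from the example query.
extraction: Dict[str, Any] = {
    "source": "Jira",
    "filters": {"assignee": "John", "status": "In Progress", "priority": "High"},
    "boosts": {},  # no boost needed for this query
}


def apply_filters(records: List[Dict[str, Any]],
                  extracted: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Keep records from the identified source that match every extracted filter."""
    matches = []
    for record in records:
        if record.get("source") != extracted["source"]:
            continue
        if all(record.get(field) == value
               for field, value in extracted["filters"].items()):
            matches.append(record)
    return matches


tickets = [
    {"id": "T-1", "source": "Jira", "assignee": "John",
     "status": "In Progress", "priority": "High"},
    {"id": "T-2", "source": "Jira", "assignee": "John",
     "status": "Done", "priority": "High"},
    {"id": "T-3", "source": "Confluence", "assignee": "John",
     "status": "In Progress", "priority": "High"},
]
print([t["id"] for t in apply_filters(tickets, extraction)])
```

A filter excludes non-matching records outright, whereas a boost only reorders them, which is why a query with explicit field values like this one is handled with filters.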
Limitations of Agentic RAG
While Agentic RAG significantly enhances query accuracy, certain limitations may arise:
- Increased Latency: Agentic RAG involves multiple LLM calls, each of which adds to the total response time for the user.
- False Positives: Extracted key terms might include common words, leading to irrelevant results.
- Large Prompts: The query enrichment node includes metadata from multiple sources, which can produce large prompts and reduce performance.
- Manual Configuration: By default, all metadata keys are boosted. Users need to manually configure filters as needed.
- Control Mechanisms: Dynamic decisions on boosting versus filtering are handled by the LLM, but additional mechanisms may be required for fine-tuning.
- Source Identification: Incorrect source identification may lead to retrieving irrelevant or incomplete results.
Enabling Agentic RAG
Prerequisites
- Configure an LLM. Currently, only OpenAI and Azure OpenAI 4.0 models are supported for Agentic RAG.
- Under Gen AI features, enable the required agents.
Steps to Enable
To enable Agentic RAG, go to Agentic RAG under Responses and click Enable Now.
Select the model for Agentic RAG and click Confirm.
By default, this enables all the RAG agents, and every agent uses the model selected in this step.
To change the model or prompt settings, or to enable or disable a particular agent, go to Gen AI features under Generative AI Tools and update the settings.
Testing the Agentic RAG feature
To test the agents, use the Test option on the Agentic RAG page. Enter a query and verify the response. When Agentic RAG is enabled, the debug logs include an additional Retrieval tab. This tab shows the sequence of agents invoked by the application, the input each agent sent to the LLM, and the output from each agent. It also shows the time taken by the LLM to complete each request.