Agent Node¶
The Agent Node lets you leverage LLMs and generative AI with Tool calling to build AI-powered, sophisticated, and versatile virtual assistants capable of handling complex tasks and providing dynamic, data-driven interactions. With its streamlined entity collection, contextual intelligence, multilingual support, and integration with external systems, the node empowers platform users to deliver exceptional human-like conversational experiences to their employees and customers.
Benefits¶
- Entity Collection: The Agent Node simplifies the process of gathering entities within a conversation, reducing the need for multiple entity nodes. This streamlined approach enhances the user experience by making virtual assistance interactions more natural and user-friendly.
- System Context, Business Rules, and Exit Scenarios: The Agent Node incorporates system context, business rules, and predefined exit scenarios to ensure accurate and relevant responses. This contextual intelligence helps guide the conversation, handle various user inputs effectively, and maintain alignment with enterprise business rules.
- Multilingual Support: The Agent Node supports both English and non-English virtual assistance languages, enabling platform users to create virtual assistants that cater to a diverse user base and facilitate multilingual interactions.
- Configuration Flexibility: The Agent Node can be configured like any other node in the XO Platform, providing flexibility in its integration within dialog tasks. This allows platform users to seamlessly incorporate the Agent Node into their existing conversational flows.
- Tool Calling: Tool calling is the ability to identify when external functions are needed, select appropriate ones, invoke them with correct parameters, process their outputs, and incorporate the results into responses.
Quick Start Guide¶
Setup Agent Node¶
Model Configuration¶
The Agent Node supports variants of LLM, including OpenAI, Azure OpenAI, Amazon Bedrock, and Custom LLM. To learn more, see Model and Supported Features.
Prompt Setup¶
To learn more, see Agen Node Prompt Setup.
Configure Agent Node¶
By default, the Agent Node is disabled. To enable the node, see Enable GenAI Feature.
Add the node to a dialog task and configure the node's properties and tool calling capabilities.
Add Agent Node to a Dialog Task¶
Steps to add an Agent Node to a Dialog Task:
- Go to Automation > Dialogs and select the task that you are working with.
- You can add the Agent Node just like any other node. You can find it in the main list of nodes.
Component Properties¶
The component properties empower you to configure the following settings. The changes made within this section affect this node across all instances and dialog tasks.
It allows you to provide a Name and Display Name for the node. The node name cannot contain spaces.
Model Configuration¶
Adjusting the settings allows you to fine-tune the model’s behavior to meet your needs. The default settings work fine for most cases. You can tweak the settings and find the right balance for your use case. A few settings are common in the features, and a few are feature-specific:
- Model: The selected model for which the settings are displayed.
- Prompt/Instructions or Context: Add feature/use case-specific instructions or context to guide the model.
- Conversation History Length: This setting allows you to specify the number of recent messages sent to the LLM as context. These messages include both user messages and virtual assistant (VA) messages. The default value is 10. This conversation history can be seen from the debug logs. Note: Applicable only if you are using a custom prompt.
- Temperature: The setting controls the randomness of the model’s output. A higher temperature, like 0.8 or above, can result in unexpected, creative, and less relevant responses. On the other hand, a lower temperature, like 0.5 or below, makes the output more focused and relevant.
- Max Tokens: It indicates the total number of tokens used in the API call to the model. It affects the cost and the time taken to receive a response. A token can be as short as one character or as long as one word, depending on the text.
- Fallback Behavior: Fallback behavior lets the system determine the optimal course of action on LLM call failure or the Guardrails are violated. You can select fallback behavior as:
- Trigger the Task Execution Failure Event
- Skip the current node and jump to a particular node. The system skips the node and transitions to the node the user selects. By default, ‘End of Dialog’ is selected.
Pre-Processor Script¶
This property helps execute a script as the first step when the Agent Node is reached. Use the script to manipulate data and incorporate it into rules or exit scenarios as required. The Pre-processor Script has the same properties as the Script Node. Learn more.
To define a pre-processor script, click Define Script, add the script you want to execute, and click Save. Enable Auto Save to save your work automatically after one second of inactivity. It must be re-enabled each time you open the editor.
Entities¶
Note
Entity collection is applicable only for V1 Prompts.
Specify the entities to be collected by LLM during runtime. In the Entities section, click + Add, enter an Entity Name, and select the Entity Type from the drop-down list.
Most entity types are supported. Here are the exceptions: custom, composite, list of items (enumerated and lookup), and attachment. See Entity Types for more information.
System Context¶
Add a brief description of the use case context to guide the model.
Tools¶
Tools allow the Agent Node to interact with external services, fetching or posting data as needed. When called, they let language models perform tasks or obtain information by executing actions linked to Script, Service, or Search AI nodes. Users can add a maximum of 5 tools to each node.
Note
The Agent Node supports tool-calling with custom JavaScript prompts in non-streaming mode.
Click + Add to open the New Tool creation window.
Define the following details for tool configuration:
- Name: Add a meaningful name that helps the language model identify the tool to call during the conversation.
- Description: Provide a detailed explanation of what the tool does to help the language model understand when to call it.
- Parameters:Specify the inputs the tool needs to collect from the user. Define up to 10 parameters for each tool and mark them as mandatory or optional.
- Name: Enter the parameter name.
- Description: Enter an appropriate description of the parameter.
- Type: Select the parameter type (String, Boolean, or Integer).
- Actions: These are the nodes that the XO Platform executes when the language model requests a tool call with the required parameters. Users can add up to 5 actions for each tool. These actions are chained and executed sequentially, where the output of one action becomes the input for the next.
- Node Type: Select the node type (Service Node, Script Node, Search AI Node) from the dropdown.
- Node Name: Select a new or existing node from the dropdown.
- Response Path: The final output from the action nodes is required to be added as a Response Path for the Platform to understand where to look for the actual response in the payload. Choose the specific key or path that defines the output.
- Choose transition: Define the behavior after tool execution:
- Default: Send the response back to the LLM. It is mandatory to have a Response Path in this case.
- Exit Node: Follow the transitions defined for the Agent Node.
- Jump to a Node: You can jump to any node defined in the dialog.
Jump to a Node Transition
The Jump-to-Node transition option enables the creation of sophisticated dialog workflows. It allows for dynamic branching based on tool execution results, significantly streamlining the design of complex conversation flows.
Key Updates:
- Added "Jump-to-Node" transition option for tools within the Agent node.
- Enables seamless navigation to specified target nodes following tool execution.
- Maintains complete session-level conversation history across all transitions.
- Supports transitions to both orphan nodes and sub-dialogs.
- Ensures full backward compatibility with existing tool configurations.
Rules¶
Add the business rules that the collected entities should respect. In the rules section, click + Add, then enter a short and to-the-point sentence, such as:
- The airport name should include the IATA Airport Code;
- The passenger’s name should include the last name.
There is a 250-character limit to the Rules field, and you can add a maximum of 5 rules.
Example Business Rules:
- Policy_number must be exactly 10 digits.
- Incident_date must be within the last 30 days from current date.
- claim_type must be one of: auto, home, health, life.
- All required entities must be collected before submission.
Exit Scenarios¶
Specify the scenarios that should terminate entity collection and return to the dialog task. This means the node ends interaction with the generative AI model and returns to the dialog flow within the XO Platform. Well-defined exit scenarios create clear boundaries for conversations and improve the overall user experience.
Click Add Scenario, then enter short, clear, and to-the-point phrases that specifically tell the generative AI model when to exit and return to the dialog flow. For example, Exit when the user wants to book more than 5 tickets in a single booking and return "max limit reached"
.
There is a 250-character limit to the Scenarios field, and you can add a maximum of 5 scenarios.
Common Exit Scenarios:
- User explicitly asks to end the conversation
- User has provided invalid information after 3 attempts
- All required entities have been collected
- User requests to speak with a human agent
Post-Processor Script¶
This property initiates the post-processor script after processing every user input as part of the Agent Node. Use the script to manipulate the response captured in the context variables just before exiting the Agent Node for both the success and exit scenarios. The Post-processor Script has the same properties as the Script Node. Learn more.
Important Considerations
If the Agent Node requires multiple user inputs, the post-processor is executed for every user input received.
To define a post-processor script, click Define Script and add the script you want to execute.
Instance Properties¶
Configure the instance-specific fields for this node. These apply only for this instance and will not affect this adaptive dialog node when used in other tasks. You must configure Instance Properties for each task where this node is used.
User Input¶
Define how user input validation occurs for this node:
- Mandatory: This entity is required and must be provided before proceeding.
- Allowed Retries: Configure the maximum number of times a user is prompted for a valid input. You can choose between 5-25 retries in 5-retries increments. The default value is 10 retries.
- Behavior on Exceeding Retries: Define what happens when the user exceeds the allowed retries. You can choose to either End the Dialog or Transition to a Node – in which case you can select the node to transition to.
Auto Correct¶
Toggle enable/disable the Auto Correct for spelling and other common errors.
Advanced Controls¶
Configure advanced controls for this node instance as follows:
Intent Detection
This applies only to String and Description entities: Select one of these options to determine the course of action if the VA encounters an entity as a part of the user utterance:
- Accept input as entity value and discard the detected intent: The VA captures the user entry as a string or description and ignores the intent.
- Prefer user input as intent and proceed with Hold & Resume settings: The user input is considered for intent detection, and the VA proceeds according to the Hold & Resume settings.
- Ask the user how to proceed: Allow the user to specify if they meant intent or entity.
Interruptions Behavior
To define the interruption handling at this node. You can select from the below options:
- Use the task level ‘Interruptions Behavior’ setting: The VA refers to the Interruptions Behavior settings set at the dialog task level.
- Customize for this node: You can customize the Interruptions Behavior settings by selecting this option and configuring it. You can choose whether to allow interruptions or not, or to allow the end user to select the behavior. You can further customize Hold and Resume behavior. Read the Interruption Handling and Context Switching article for more information.
Custom Tags
Add Custom Meta Tags to the conversation flow to profile VA-user conversations and derive business-critical insights from usage and execution metrics. You can define tags to be attached to messages, users, and sessions. See Custom Meta Tags for details.
Connections Properties¶
Note
If the node is at the bottom of the sequence, then only the connection property is visible.
Define the transition conditions from this node. These conditions apply only to this instance and will not affect this node’s use in any other dialog. For a detailed setup guide, See Adding IF-Else Conditions to Node Connections for a detailed setup guide.
The Connection Path property offers three default variants:
- Not Connected - No specific next node is defined
- End of Dialog - Explicitly ends the current dialog
- Return to Flow - Terminates the Dialog Task and returns control to the Flow Builder. The Flow Builder resumes from the next node.
Note
Deflect to Chat works only with Kore Voice Gateway Channels (Phone number or SIP Transfer).
Tool Definition¶
Tool calling is the ability to identify when external functions are needed, select appropriate ones, invoke them with correct parameters, process their outputs, and incorporate the results into responses.
- Interaction with External Systems: The introduction of tool calling expands the Agent Node's capabilities beyond text generation. It enables interaction with external systems and databases, facilitating real-time data retrieval, calculations, and system-specific operations. This integration allows for more dynamic and data-driven conversational experiences.
- Dynamic Prompt Enhancement: The Agent Node's prompt is enhanced to include tool definitions and contextual information. Based on user input and ongoing conversation, the language model can dynamically decide whether to generate text or call a tool. The dynamic prompt adaptation ensures that the virtual assistant provides the most appropriate response or action at each step of the interaction.
Agent Node Execution¶
Execution Flow¶
During runtime, the Agent Node efficiently orchestrates interactions between the node, language model, and XO Platform to enable seamless user experiences and integration with external systems. You can work with this node like any other node within Dialog Tasks and invoke it within multiple tasks.
During runtime, the node behaves as follows:
- Input Processing: When the agent node receives user input, it processes it first through a Pre-Processor script. This script runs only once before the orchestration starts between the node and the platform. This script can perform tasks like formatting the input or extracting relevant information before sending the input to the language model.
- Entities Collection:
- The platform invokes the Generative AI model to understand the user input.
- The platform uses the entities and business rules defined in the node configurations to understand user input and identify the required entity values.
- The responses required to prompt/inform the user are automatically generated based on the conversation context.
- The platform drives the conversation until all the defined entities are captured.
- Contextual Intents
- Contextual intents (Dialog or FAQs) recognized from user input continue to be honored according to the Interruption Settings defined in the virtual assistance definition.
- Post completion of the contextual intents, the flows can return to the Agent Node.
- Language Model Decision: The language model analyzes the processed user input and decides whether to respond with generated text or call a tool:
- Text Response: If the language model determines that a text response is appropriate, it generates the response and sends it to the XO Platform. The platform then renders this response to the user.
- Tool Call Execution: When the language model decides to call a tool, it sends a tool request to the XO Platform. The platform identifies the action linked to the called tool, which could be a script, service, or Search AI node. The XO Platform executes this action and retrieves the output.
- Output Appending: Depending on the selected transition, the XO Platform may exit the node or append the output to the request prompt for enriched context and send the updated prompt back to the model for further processing.
- Post-Processing: Before presenting the final output, the XO Platform passes the response from the language model through a Post-Processor script. This script runs every time a response is received. It allows further manipulation of the response, such as formatting the output or integrating it with other elements of the conversation.
- Exit Conditions
- The platform exits from the Agent Node when any of the defined exit conditions are met.
- These conditions allow you to define scenarios that require a different path in the conversation, such as handing off to a human agent.
- The platform can also exit the Agent Node when the user exceeds the maximum number of volleys (retries to capture the required entities).
- Iterative Process: This process repeats for each conversation volley, ensuring the Agent Node dynamically adapts to user input and leverages the power of language models and external systems through tool calls.
Testing Agent Node/Debug Logs¶
The debug logs capture the entire execution flow, including the conversation history array and the tools being called. The conversation history array tracks the interaction between the user and the assistant, while the tool calls (FundsTransfer
, PayeesAvailableCheck
) represent the specific actions or functions invoked by the assistant to fulfill the user's request.
By examining the debug logs, users can trace the steps taken by the assistant, understand how it processes the user's input, and see how it interacts with different tools to complete the requested task. The logs provide crucial visibility into the underlying execution and are invaluable for debugging, monitoring, and gaining a deeper understanding of the assistant's behavior.
The debug logs on the left side of the screenshot below provide a comprehensive view of the execution flow and the interactions between the user, the assistant (Finance Buddy), and the underlying system. This detailed view ensures that you are fully informed about the process.
Here's a step-by-step explanation of the execution captured in the debug logs:
- The user initiates the conversation with Finance Buddy, requesting to transfer funds to the user.
- Finance Buddy responds, asking how it can help the user.
- The user expresses their intent to transfer funds to a person.
- The Agent Node is initiated (
Agent node initiated
). - The Agent Node Request Response Details are captured in JSON format and contain the conversation history up to this point.
- The tool execution (
FundsTransfer
) is initiated. This indicates that the assistant has determined that the tool needs to be called based on the user's request. - The assistant checks if the person (Raj Kumar) is already registered as a payee in the user's account. To verify this, it calls the
PayeesAvailableCheck
tool. - The
PayeesAvailableCheck
tool completes execution, and the result is captured in the debug logs. The assistant determines that the person is registered as a payee. - The assistant informs the user that the person is registered as a payee and requests additional details to proceed with the fund transfer. It asks the user to select the transfer type (NEFT/IMPS/RTGS), provide the transfer amount, and confirm their account ID.
- The user provides the requested information.
- The assistant calls the
FundsTransfer
tool with the provided details to initiate the fund transfer. - The
FundsTransfer
tool completes execution, and the Agent Node captures the updated conversation history array in the request-response details. - The XO Platform exits the Agent node.
Best Practices¶
To view recommended guidelines for using the agent node, see Best Practices.
FAQs¶
To view commonly asked questions about the agent node, see FAQs.