Add an External Model using API Integration¶
You can connect an external model to Agent Platform using API integration. This feature extends Agent Platform's functionality by allowing you to bring in models from external sources.
Add an External Model¶
The steps to add an external model using API integration are given below:
- Click Models on the top navigation bar. The Models page is displayed.
- Click Add a model. The Add an external model dialog is displayed.
- Select the Custom integration option to connect models via API integration, and click Next. The Custom API integration dialog is displayed.
- Enter a Model name and Model endpoint URL in the respective fields.
- Select the Authorization profile you want to use with the request payload from the options configured on the Settings console. Learn more about Auth Profiles. To proceed without authentication, keep the default selection, None.
- In the Headers section, specify the header Key and Value pairs that need to be sent along with the request payload.
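For example, a custom model endpoint commonly expects a content type header and, if you are not using an authorization profile, an API key header. The header names and values below are illustrative placeholders, not required settings:

```json
{
  "Content-Type": "application/json",
  "x-api-key": "<your-api-key>"
}
```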
- In the Model configurations section, select one of the following options to define the model's API settings:
Option A: Default
Selecting this option lets you define the variables and request body manually and test the response from the model.
- Variables: Provide the variables that you want to use in the request payload, including:
- Prompt variables: The Prompt variable is mandatory by default. You can turn ON the toggle for the System prompt and examples if required.
- Custom variables: (Optional) Define any additional variables that you want to reference in the request body.
- Body: The request body must include the model's relevant parameters, which you must define manually. For dynamic variable mapping, use `{{variable}}`. Ensure the body is in the correct format; otherwise, API testing won't work.
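For example, if your model exposes an OpenAI-compatible chat endpoint, the body might look like the sketch below. The field names are assumptions about your endpoint, and `{{prompt}}` and `{{system_prompt}}` stand in for whatever variables you defined above:

```json
{
  "model": "my-custom-model",
  "messages": [
    { "role": "system", "content": "{{system_prompt}}" },
    { "role": "user", "content": "{{prompt}}" }
  ],
  "temperature": 0.7
}
```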
These inputs are used to test the connection and receive a response from the model. Once the response is generated, you must configure the JSON path to capture the Output path, Input tokens, and Output tokens, as follows:
- Output Path: When you interact with the model, you send a request in a specific format and receive a response in a corresponding format, often as a large JSON object. As a user, you are mainly interested in extracting the model's answer from this response. The output path refers to the location or key within the JSON where the model's main output is stored. Knowing this path is essential, especially in the prompt playground, as it tells you exactly which key to map to populate the response in the playground box. For example, in the sample response below, the output path is `choices[0].message.content`.
- Tokens: Tokens are the units of text data provided to or generated by an LLM. Depending on the tokenization method, a token may be as short as a single character or as long as a word. Across the product (in the playground, agents, and other areas), token usage is displayed for every model. For commercial models, this information is supplied by the provider. For open-source and fine-tuned models, the platform calculates the token counts and displays them in the UI. For custom API integration models, where the token calculation is not known, you can define which keys in the response JSON correspond to this information.
- Input Tokens: Input tokens are fed into the LLM for processing. For example, `usage.prompt_tokens` indicates the input tokens in the sample response below.
- Output Tokens: Output tokens represent the text generated by the LLM after processing a prompt. Like input tokens, they can range from a single character to an entire word, depending on the tokenization method. For example, `usage.completion_tokens` indicates the output tokens in the sample response below.
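As an illustration, the sample response below follows the OpenAI Chat Completions format; your model's response may use different keys. With this shape, the Output path is `choices[0].message.content`, Input tokens map to `usage.prompt_tokens`, and Output tokens map to `usage.completion_tokens`:

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Paris is the capital of France."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 8,
    "total_tokens": 20
  }
}
```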
Option B: Existing Model Provider Structures
When only a minimal request/response structure is defined, external models added through custom API integration are limited to basic text generation. This prevents them from being used for advanced scenarios such as tool calling, structured outputs, and multimodal use cases, and without a comprehensive way to define request/response mappings, these models cannot be fully integrated across the platform. The Existing Model Provider Structures option removes that limitation by:
- Allowing users to provide complete request/response definitions through the UI or OpenAPI specifications.
- Supporting commonly known request/response schemas (e.g., OpenAI Chat Completions, Anthropic Messages).
With the Default option, you must manually define the request payload variables, including the model's static and dynamic body parameters, and test the response from the configured LLM. With this option, you instead enable the required features and choose an LLM provider to automatically map the request and response schemas to that provider's standard API format.
- Model Features: Enable or disable one or more of the following features to make the model available within the modules that use them.
Important
- This configuration requires at least one feature to be enabled.
- You can enable any feature, but the model must support it; otherwise, you may see unexpected behavior.
- Structured response: Specifies that the model supports the generation of a structured response. Enabling this flag allows the model to be used for generating a structured output within Prompts and Tools Flow.
- Data generation: Specifies that the model can be used for synthetic data generation for text-based tasks. Turning this flag on allows the model to be used for prompt generation in Prompts Studio.
- Streaming: Specifies that the model supports real-time, token-by-token generation for faster AI responses. Turning this flag on allows the model to be used for generating streaming responses within Agentic Apps. (coming soon)
- Tool calling: Specifies that the model supports tool calling. Enabling this flag allows the model to be used within Agentic Apps and for tool calls within the AI Text-to-text node in workflow tools.
- Support Tools: Specifies whether the model supports simple tool calling, that is, dynamic function or API calls made by the LLM to perform actions or retrieve real-time data during generation (see the sketch after this list).
- Parallel Tool Calling: Specifies whether the model can handle parallel tool calls arising from a single user request.
- Modalities Support: Specifies the modalities supported by the model. Enabling this flag allows the model to run Text-to-Text, Text-to-Image, Image-to-Text, and Audio-to-Text tasks for seamless downstream integration within the Tools Flow.
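For reference, tool calling in an OpenAI-style request is usually expressed as a tools array, as in the hedged sketch below; the get_weather function and its parameters are purely hypothetical. A model that supports parallel tool calling can return several such tool calls in a single response.

```json
{
  "model": "my-custom-model",
  "messages": [
    { "role": "user", "content": "What is the weather in Paris?" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {
            "city": { "type": "string" }
          },
          "required": ["city"]
        }
      }
    }
  ]
}
```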
- Body: In this section, choose a provider to set the API reference. The platform uses this mapping to resolve your model's request-response structure. The available options include:
- Anthropic (Messages): Specifies that the selected model follows the request-response structure similar to Anthropic’s Messages API.
- OpenAI (Chat Completions): Specifies that the selected model follows the request-response structure similar to OpenAI’s Chat Completions API.
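For example, a model that follows Anthropic's Messages structure expects a request roughly like the sketch below (values are placeholders), whereas a model that follows OpenAI's Chat Completions structure expects bodies like the earlier examples in this article:

```json
{
  "model": "my-custom-model",
  "max_tokens": 1024,
  "system": "You are a helpful assistant.",
  "messages": [
    { "role": "user", "content": "What is the capital of France?" }
  ]
}
```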
Note
Click Save as Draft to save the model. The status will be updated to Draft.
- Click Confirm to save the details and add the external model to the list.
Manage Custom API Integrations¶
Once the integration is successful and the inference toggle is ON, you can use the model across Agent Platform. You can also turn inferencing OFF if needed.
To manage an integration, click the three-dot icon corresponding to its name and choose from the following options:
- View: View the integration details.
- Copy: Make an editable copy of the integration details.
- Delete: Remove the integration.