Prompts & LLM Configuration¶
Learn how to configure prompts and LLM settings for your agents.
LLM Model Configuration¶
Basic Configuration¶
from agenticai_core.designtime.models.llm_model import LlmModel, LlmModelConfig
llm = LlmModel(
model="gpt-4o",
provider="Open AI",
connection_name="Default Connection",
max_timeout="60 Secs",
max_iterations="25",
modelConfig=LlmModelConfig(
temperature=0.7,
max_tokens=1600,
top_p=1.0,
frequency_penalty=0.0,
presence_penalty=0.0
)
)
Supported Providers¶
OpenAI¶
llm = LlmModel(
model="gpt-4o",
provider="Open AI",
connection_name="OpenAI Connection",
modelConfig=LlmModelConfig(
temperature=0.7,
max_tokens=1600,
frequency_penalty=0.0,
presence_penalty=0.0,
top_p=1.0
)
)
Anthropic (Claude)¶
llm = LlmModel(
model="claude-3-5-sonnet-20240620",
provider="Anthropic",
connection_name="Anthropic Connection",
modelConfig=LlmModelConfig(
temperature=1.0,
max_tokens=1024,
top_p=0.7,
top_k=5 # Anthropic-specific
)
)
Azure OpenAI¶
llm = LlmModel(
model="gpt-4",
provider="Azure OpenAI",
connection_name="Azure Connection",
modelConfig=LlmModelConfig(
temperature=0.8,
max_tokens=2048
)
)
Using Builder Pattern¶
from agenticai_core.designtime.models.llm_model import (
LlmModelBuilder, LlmModelConfigBuilder
)
# Build config
config_dict = LlmModelConfigBuilder() \
.set_temperature(0.7) \
.set_max_tokens(1600) \
.set_top_p(0.9) \
.build()
config = LlmModelConfig(**config_dict)
# Build model
llm_dict = LlmModelBuilder() \
.set_model("gpt-4o") \
.set_provider("Open AI") \
.set_connection_name("Default") \
.set_model_config(config) \
.build()
llm = LlmModel(**llm_dict)
LLM Parameters¶
Temperature (0.0 - 2.0)¶
Controls randomness in generation:
- 0.0 - 0.3: Deterministic, focused (good for factual tasks)
- 0.4 - 0.7: Balanced creativity and consistency
- 0.8 - 1.5: Creative, diverse responses
- 1.6 - 2.0: Highly random (experimental)
# Factual task
config = LlmModelConfig(temperature=0.1)
# Balanced
config = LlmModelConfig(temperature=0.7)
# Creative
config = LlmModelConfig(temperature=1.2)
Max Tokens¶
Maximum tokens to generate:
Guidelines:
- Short answers: 500-1000
- Detailed responses: 1000-2000
- Long-form content: 2000-4000
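For example, illustrative budgets at either end of these ranges:
# Short answer
config = LlmModelConfig(max_tokens=800)
# Long-form content
config = LlmModelConfig(max_tokens=4000)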
Top P (0.0 - 1.0)¶
Nucleus sampling parameter:
- 0.1 - 0.5: Very focused sampling
- 0.6 - 0.9: Balanced diversity
- 0.95 - 1.0: Maximum diversity
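Mirroring the temperature examples above, with illustrative values:
# Focused sampling
config = LlmModelConfig(top_p=0.3)
# Balanced diversity
config = LlmModelConfig(top_p=0.9)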
Penalties (-2.0 to 2.0)¶
Reduce repetition:
config = LlmModelConfig(
frequency_penalty=0.5, # Penalize frequent tokens
presence_penalty=0.3 # Encourage topic diversity
)
Prompt Configuration¶
System Prompt¶
Base role definition:
from agenticai_core.designtime.models.prompt import Prompt
prompt = Prompt(
system="You are a helpful assistant."
)
Custom Prompt¶
Detailed instructions and context:
prompt = Prompt(
system="You are a helpful assistant.",
custom="""You are an intelligent banking assistant designed to help
customers manage their financial needs efficiently and securely.
## Your Capabilities
- Check account balances
- Process transactions
- Answer banking policy questions
- Provide loan information
## Customer Context
You have access to:
{{memory.accountInfo.accounts}}
Use this information for quick responses.
"""
)
With Instructions¶
Structured guidelines:
prompt = Prompt(
system="You are a banking assistant.",
custom="Help customers with account management.",
instructions=[
"""### Security Protocols
- Never ask for passwords, PINs, or CVV numbers
- If a request seems suspicious, politely decline""",
"""### Speaking Style
- Use natural, conversational language
- Keep responses concise
- Provide key information first""",
"""### Handling Requests
1. Greet the customer warmly
2. Identify their need
3. Execute the request efficiently
4. Summarize and ask if anything else is needed"""
]
)
Template Variables¶
Prompts support runtime variable substitution:
| Variable | Description |
|---|---|
| `{{app_name}}` | Application name |
| `{{app_description}}` | Application description |
| `{{agent_name}}` | Current agent name |
| `{{memory.store.field}}` | Access memory store data |
| `{{session_id}}` | Current session identifier |
Example with Templates¶
prompt = Prompt(
custom="""You are acting as {{agent_name}} for the application "{{app_name}}".
Application Description:
{{app_description}}
Customer Account Information:
{{memory.accountInfo.accounts}}
Use the above context to provide quick, accurate responses.
"""
)
Orchestrator Prompts¶
For supervisor/orchestrator agents:
supervisor_prompt = Prompt(
system="You are a helpful assistant.",
custom="""You are an AI Supervisor for "{{app_name}}".
### Your Team
You manage multiple workers:
- BillingAgent: Handles payments and billing
- SupportAgent: General customer support
- TechnicalAgent: Technical issues
### Routing Rules
1. **Small-talk**: Route to user with a friendly response
2. **Direct Routing**: Match requests to worker expertise
3. **Follow-up**: Route responses to same worker
4. **Route to user**: When the request is unrelated or the task is complete
5. **Multi-Intent**: Break into sequential requests
"""
)
Best Practices¶
1. Clear Role Definition¶
Start with a clear role:
prompt = Prompt(
system="You are a helpful assistant.",
custom="You are a banking assistant specializing in account management..."
)
2. Structured Instructions¶
Use instructions for important rules:
prompt = Prompt(
instructions=[
"### Rule 1\nNever ask for sensitive information",
"### Rule 2\nAlways confirm before executing transactions"
]
)
3. Context Injection¶
Use template variables for dynamic context:
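# A minimal sketch reusing the template variables and memory path from earlier examples
prompt = Prompt(
    custom="""You are {{agent_name}} for the application "{{app_name}}".
Customer Account Information:
{{memory.accountInfo.accounts}}
Use this context to personalize responses.
"""
)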
4. Security Guidelines¶
Always include security instructions for sensitive applications:
instructions=[
"""### Security
- Never ask for passwords, PINs, CVV, or OTPs
- Verify unusual requests
- Escalate suspicious activity"""
]
5. Voice Agent Considerations¶
For voice/audio agents:
instructions=[
"""### Speaking Style
- Use natural, conversational language
- Avoid markdown formatting
- Speak numbers clearly
- Use pauses with commas
- Keep responses concise"""
]
Configuration Tips¶
Task-Specific Settings¶
Match the sampling settings to the task. A minimal sketch, with illustrative values drawn from the temperature and top-p guidance above:
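# Factual tasks: deterministic, focused
factual = LlmModelConfig(temperature=0.1, top_p=0.5)
# Creative tasks: diverse, exploratory
creative = LlmModelConfig(temperature=1.2, top_p=0.95)
# Balanced default
balanced = LlmModelConfig(temperature=0.7, top_p=0.9)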
Cost Optimization¶
- Use smaller models for simple tasks
- Set appropriate max_tokens
- Configure max_iterations to limit tool calls
- Set reasonable timeouts
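A sketch combining these settings; the model name and values are illustrative, not prescribed defaults:
llm = LlmModel(
    model="gpt-4o-mini",  # illustrative smaller model for simple tasks
    provider="Open AI",
    connection_name="Default Connection",
    max_timeout="30 Secs",  # reasonable timeout
    max_iterations="10",  # limit tool calls
    modelConfig=LlmModelConfig(max_tokens=800)  # bounded output budget
)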
Quality Optimization¶
- Use latest model versions
- Increase max_tokens for detailed responses
- Lower temperature for consistency
- Increase max_iterations for complex workflows
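An illustrative quality-oriented configuration (values are examples, not defaults):
llm = LlmModel(
    model="gpt-4o",
    provider="Open AI",
    connection_name="Default Connection",
    max_iterations="50",  # headroom for complex workflows
    modelConfig=LlmModelConfig(
        temperature=0.2,  # consistent, focused output
        max_tokens=4000  # room for detailed responses
    )
)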