Agents
Agents are the core building blocks of Grid applications. They combine language models, tools, and custom logic to create intelligent systems that can understand, reason, and act.
What is an Agent?
In Grid, an agent is an intelligent entity that:
- Processes natural language inputs from users
- Reasons about the best way to respond
- Uses tools to perform actions and gather information
- Maintains context throughout conversations
- Follows configured behaviors and system prompts
Creating Agents
Grid provides the createConfigurableAgent factory function for creating agents:
import {
  createConfigurableAgent,
  baseLLMService,
  createToolExecutor
} from "@mrck-labs/grid-core";

// Create services
const llmService = baseLLMService({
  langfuse: { enabled: false }
});
const toolExecutor = createToolExecutor();

// Create a basic agent
const agent = createConfigurableAgent({
  llmService,
  toolExecutor,
  config: {
    id: "my-agent",
    type: "general",
    version: "1.0.0",
    prompts: {
      system: "You are a helpful assistant."
    },
    metadata: {
      id: "my-agent",
      type: "general",
      name: "My Agent",
      description: "A helpful assistant",
      capabilities: ["general"],
      version: "1.0.0"
    },
    tools: {
      builtin: [],
      custom: [],
      mcp: []
    },
    behavior: {
      maxRetries: 3,
      responseFormat: "text"
    }
  }
});
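Once created, you can send the agent a message. A minimal sketch, assuming act() accepts a messages array (the form used in the Voice Methods section later on this page):

const response = await agent.act({
  messages: [{ role: "user", content: "Hello!" }]
});
console.log(response.content);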
Agent Configuration
Agents are highly configurable through the AgentConfig interface:
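The examples on this page exercise the fields below. This sketch is inferred from those examples and is not the library's exhaustive type definition:

// Shape of AgentConfig as used in the examples on this page (inferred, not exhaustive)
interface AgentConfigSketch {
  id: string;
  type: string;
  version: string;
  prompts: { system: string };
  metadata: {
    id: string;
    type: string;
    name: string;
    description: string;
    capabilities: string[];
    version: string;
  };
  tools: { builtin: unknown[]; custom: unknown[]; mcp: unknown[] };
  behavior: { maxRetries: number; responseFormat: string };
  voice?: {
    enabled: boolean;
    autoSpeak?: boolean;
    interruptible?: boolean;
  };
}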
LLM Service Configuration
Agents use the baseLLMService to configure LLM interactions:
// Configure the LLM service
const llmService = baseLLMService({
  // Model configuration is handled by environment variables
  // or can be passed when calling agent.act()
  langfuse: { enabled: true } // Enable observability
});

const toolExecutor = createToolExecutor();

const agent = createConfigurableAgent({
  llmService,
  toolExecutor,
  config: { /* agent config */ },
});
System Prompts
System prompts define your agent's personality and behavior:
const agent = createConfigurableAgent({
  llmService: baseLLMService({ /* ... */ }),
  toolExecutor: createToolExecutor(),
  config: {
    id: "customer-service",
    type: "general",
    version: "1.0.0",
    prompts: {
      system: `You are a customer service agent for TechCorp.
        - Be professional and courteous
        - Help customers with product inquiries
        - Escalate complex issues to human agents
        - Never share internal company information`
    },
    metadata: {
      id: "customer-service",
      type: "general",
      name: "Customer Service Agent",
      description: "Handles customer inquiries",
      capabilities: ["general"],
      version: "1.0.0"
    },
    tools: {
      builtin: [],
      custom: [],
      mcp: []
    },
    behavior: {
      maxRetries: 3,
      responseFormat: "text"
    }
  }
});
Tool Integration
Agents can use tools to extend their capabilities:
// Create a tool executor and register tools
const toolExecutor = createToolExecutor();
toolExecutor.registerTool(searchTool);
toolExecutor.registerTool(calculatorTool);
toolExecutor.registerTool(emailTool);

const agent = createConfigurableAgent({
  llmService: baseLLMService({ /* ... */ }),
  toolExecutor,
  config: {
    id: "tool-agent",
    type: "general",
    version: "1.0.0",
    prompts: {
      system: "You are an AI assistant with access to various tools."
    },
    metadata: {
      id: "tool-agent",
      type: "general",
      name: "Tool Agent",
      description: "Agent with tool capabilities",
      capabilities: ["general"],
      version: "1.0.0"
    },
    tools: {
      builtin: [],
      custom: [searchTool, calculatorTool, emailTool],
      mcp: []
    },
    behavior: {
      maxRetries: 3,
      responseFormat: "text"
    }
  }
});
Agent Lifecycle
Understanding the agent lifecycle helps you build robust applications:
1. Input Processing
When you call agent.act(), the agent:
- Receives the user input
- Applies any input transformations
- Validates the input
2. LLM Interaction
The agent:
- Sends the processed input to the LLM
- Receives the response
- Parses any tool calls
3. Tool Execution
If tools are called:
- Validates tool parameters
- Executes tools in sequence or parallel
- Collects tool results
4. Response Generation
The agent:
- Processes tool results
- Generates final response
- Applies output transformations
5. Error Handling
Throughout the lifecycle:
- Errors are caught and handled
- Retry logic is applied if configured
- Custom error handlers are invoked
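Putting the lifecycle together, here is a sketch of a guarded call. It assumes that errors which exhaust the configured retries (step 5) are rethrown to the caller:

try {
  const response = await agent.act({
    messages: [{ role: "user", content: "Summarize this report." }]
  });
  console.log(response.content);
} catch (error) {
  // Reached only after the retry logic in step 5 is exhausted
  console.error("Agent failed after retries:", error);
}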
Advanced Features
Custom Handlers (Hooks)
Customize agent behavior at key points:
const agent = createConfigurableAgent({
  // llmService, toolExecutor, and config omitted for brevity
  customHandlers: {
    // Transform input before processing
    transformInput: async (input) => {
      console.log("User input:", input.messages[0].content);
      return input;
    },
    // Validate responses
    validateResponse: async (response) => {
      if (response.content.includes("ERROR")) {
        return { isValid: false, reason: "Response contains error" };
      }
      return { isValid: true };
    },
    // Handle errors with retry logic
    onError: async (error, attempt) => {
      if (attempt < 3) {
        return { shouldRetry: true, delayMs: 1000 * attempt };
      }
      return { shouldRetry: false };
    },
  },
});
Behavior Configuration
Fine-tune agent behavior:
const agent = createConfigurableAgent({
  // llmService, toolExecutor, and config omitted for brevity
  behaviorConfig: {
    maxRetries: 3,
    retryDelay: 1000,
    continueOnError: false,
    parallelToolExecution: true,
    requireExplicitToolCalls: false,
  },
});
Progress Tracking
Progress tracking is handled at the conversation level, not the agent level:
import { createConversationLoop } from "@mrck-labs/grid-core";

// Create a conversation with progress tracking
const conversation = createConversationLoop({
  agent,
  onProgress: (update) => {
    switch (update.type) {
      case "thinking":
        console.log("🤔 Agent is thinking...");
        break;
      case "tool_execution":
        console.log(`🔧 Running ${update.toolName}...`);
        break;
      case "error":
        console.log(`❌ Error: ${update.message}`);
        break;
    }
  },
});
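Drive the conversation with sendMessage (used again in the Autonomous Agents example below); the progress callback fires while the agent works:

const reply = await conversation.sendMessage("Summarize today's open tickets");
console.log(reply);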
Agent Patterns
Specialized Agents
Create agents for specific domains:
import { researchAgent, mathDataAgent } from "@mrck-labs/grid-agents";

// Use pre-built agents
const researcher = researchAgent;
const calculator = mathDataAgent;

// Or create custom specialized agents
const toolExecutor = createToolExecutor();
toolExecutor.registerTool(lookupOrder);
toolExecutor.registerTool(checkInventory);
toolExecutor.registerTool(createTicket);

const supportAgent = createConfigurableAgent({
  llmService: baseLLMService({ langfuse: { enabled: true } }),
  toolExecutor,
  config: {
    id: "support-agent",
    type: "general",
    version: "1.0.0",
    prompts: {
      system: "You are a customer support specialist..."
    },
    metadata: {
      id: "support-agent",
      type: "general",
      name: "Support Agent",
      description: "Customer support specialist",
      capabilities: ["general"],
      version: "1.0.0"
    },
    tools: {
      builtin: [],
      custom: [lookupOrder, checkInventory, createTicket],
      mcp: []
    },
    behavior: {
      maxRetries: 3,
      responseFormat: "text"
    }
  }
});
Multi-Model Agents
Use different models for different tasks:
// Fast agent for simple queries
const fastAgent = createConfigurableAgent({
  llmService: baseLLMService({
    model: "gpt-3.5-turbo",
    apiKey: process.env.OPENAI_API_KEY,
  }),
  config: {
    id: "fast-agent",
    type: "general",
  },
});

// Powerful agent for complex reasoning
const powerfulAgent = createConfigurableAgent({
  llmService: baseLLMService({
    model: "gpt-4",
    apiKey: process.env.OPENAI_API_KEY,
  }),
  config: {
    id: "powerful-agent",
    type: "general",
  },
});

// Router logic: assessComplexity is an application-defined heuristic, not part of Grid
async function handleQuery(query: string) {
  const complexity = assessComplexity(query);
  const agent = complexity > 0.7 ? powerfulAgent : fastAgent;
  return agent.act({ messages: [{ role: "user", content: query }] });
}
Autonomous Agents
Create agents that can work independently:
import { createConversationLoop } from "@mrck-labs/grid-core";

const toolExecutor = createToolExecutor();
toolExecutor.registerTool(search);
toolExecutor.registerTool(analyze);
toolExecutor.registerTool(summarize);
toolExecutor.registerTool(save);

const autonomousAgent = createConfigurableAgent({
  llmService: baseLLMService({ langfuse: { enabled: true } }),
  toolExecutor,
  config: {
    id: "autonomous-researcher",
    type: "general",
    version: "1.0.0",
    prompts: {
      system: `You are an autonomous research agent.
        Break down complex tasks into steps and work through them systematically.`
    },
    metadata: {
      id: "autonomous-researcher",
      type: "general",
      name: "Autonomous Researcher",
      description: "Autonomous research agent",
      capabilities: ["general"],
      version: "1.0.0"
    },
    tools: {
      builtin: [],
      custom: [search, analyze, summarize, save],
      mcp: []
    },
    behavior: {
      maxRetries: 3,
      responseFormat: "text"
    }
  }
});

// Run with a conversation loop for autonomous behavior
const loop = createConversationLoop({
  agent: autonomousAgent,
});

const result = await loop.sendMessage(
  "Research and summarize recent AI breakthroughs"
);
Voice Capabilities
Agents can be enhanced with voice capabilities by providing a voice service:
import { elevenlabsVoiceService } from "@mrck-labs/grid-core";

// Create a voice service
const voiceService = elevenlabsVoiceService({
  apiKey: process.env.ELEVENLABS_API_KEY,
  defaultVoiceId: "21m00Tcm4TlvDq8ikWAM", // Rachel voice
});

// Create a voice-enabled agent
const voiceAgent = createConfigurableAgent({
  llmService,
  toolExecutor,
  voiceService, // This enables voice capabilities!
  config: {
    id: "voice-assistant",
    prompts: {
      system: "You are a helpful voice assistant. Keep responses concise for speech."
    },
    voice: {
      enabled: true,
      autoSpeak: true, // Automatically speak responses
      interruptible: true, // Allow interruption mid-speech
    }
  }
});
Voice Methods
Voice-enabled agents gain additional methods:
// Check voice availability
if (voiceAgent.hasVoice()) {
  // Speak text
  await voiceAgent.speak("Hello! How can I help you?");

  // Note: listen() requires application-level audio input.
  // The base agent's listen() method throws an error.
  // Use the terminal voice service or the Web Audio API for actual recording.
  // Example with terminal voice:
  // const terminalVoice = new TerminalVoiceService();
  // const recording = await terminalVoice.startRecording();
  // const audioInput = await recording.stop();
  // const transcript = await voiceService.transcribe(audioInput);
}

// Voice is integrated into the normal act flow
const response = await voiceAgent.act({
  messages: [{ role: "user", content: "What's the weather?" }]
});
// The response is automatically spoken if autoSpeak is true
Voice Configuration
Configure voice behavior at the agent level:
const config = {
  voice: {
    enabled: true, // Enable voice features
    voiceId: "voice-id", // Default voice
    autoSpeak: true, // Auto-speak responses
    interruptible: true, // Allow interruption
    autoListen: false, // Auto-listen after speaking
    mixedModality: { // Mixed voice/text config
      enabled: true,
      mergeStrategy: 'temporal'
    }
  }
};
Best Practices
1. Clear System Prompts
- Be specific about the agent's role
- Define boundaries and limitations
- Include examples when helpful
- For voice agents, optimize for spoken language
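For example, a system prompt that applies these guidelines (illustrative content only):

const systemPrompt = `You are a returns assistant for an online store.
- Only answer questions about orders, shipping, and returns
- If you are unsure, say so and offer to connect a human agent
- Example answer: "Your return label has been emailed to you."`;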
2. Tool Selection
- Only include necessary tools
- Ensure tool descriptions are clear
- Test tool combinations thoroughly
- Consider voice-friendly tool responses
3. Error Handling
- Always implement error handlers
- Provide meaningful error messages
- Log errors for debugging
- Gracefully fall back from voice to text
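One way to handle the last point, assuming speak() rejects when audio output fails (a sketch, not a prescribed pattern):

try {
  await voiceAgent.speak(response.content);
} catch (error) {
  // Fall back from voice to text when audio output fails
  console.log(response.content);
}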
4. Performance
- Use appropriate models for tasks
- Cache responses when possible
- Monitor token usage
- Use voice streaming for faster responses
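A minimal application-level response cache, keyed on the raw query for illustration (nothing Grid-specific):

const responseCache = new Map<string, string>();

async function cachedAct(query: string): Promise<string> {
  const cached = responseCache.get(query);
  if (cached) return cached; // Skip the LLM call entirely
  const response = await agent.act({
    messages: [{ role: "user", content: query }]
  });
  responseCache.set(query, response.content);
  return response.content;
}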
5. Security
- Validate all inputs
- Limit tool permissions
- Never expose sensitive data
- Secure voice API keys
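Input validation fits naturally in the transformInput hook shown earlier. A sketch that rejects oversized inputs (the 4000-character limit is an arbitrary example):

const customHandlers = {
  transformInput: async (input) => {
    const text = input.messages[0].content;
    if (text.length > 4000) {
      throw new Error("Input exceeds maximum allowed length");
    }
    return input;
  },
};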
6. Voice Optimization
- Keep responses concise for speech
- Use natural, conversational language
- Test with different voices
- Handle audio failures gracefully
Next Steps
Now that you understand agents, explore:
- Tools - Extend agent capabilities
- Services Architecture - Understand the underlying systems
- Event Handlers - Implement persistence with events
- Pre-built Agents - Use ready-made agents