Agents

Agents are the core building blocks of Grid applications. They combine language models, tools, and custom logic to create intelligent systems that can understand, reason, and act.

What is an Agent?

In Grid, an agent is an intelligent entity that:

  • Processes natural language inputs from users
  • Reasons about the best way to respond
  • Uses tools to perform actions and gather information
  • Maintains context throughout conversations
  • Follows configured behaviors and system prompts

Creating Agents

Grid provides the createConfigurableAgent factory function for creating agents:

import {
  createConfigurableAgent,
  baseLLMService,
  createToolExecutor
} from "@mrck-labs/grid-core";

// Create services
const llmService = baseLLMService({
  langfuse: { enabled: false }
});
const toolExecutor = createToolExecutor();

// Create a basic agent
const agent = createConfigurableAgent({
  llmService,
  toolExecutor,
  config: {
    id: "my-agent",
    type: "general",
    version: "1.0.0",
    prompts: {
      system: "You are a helpful assistant."
    },
    metadata: {
      id: "my-agent",
      type: "general",
      name: "My Agent",
      description: "A helpful assistant",
      capabilities: ["general"],
      version: "1.0.0"
    },
    tools: {
      builtin: [],
      custom: [],
      mcp: []
    },
    behavior: {
      maxRetries: 3,
      responseFormat: "text"
    }
  }
});

Agent Configuration

Agents are highly configurable through the AgentConfig interface.
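
All of the fields below appear in the examples throughout this guide. As a rough orientation, here is an illustrative sketch of the shape those examples exercise (field names are taken from the samples; the exact types are assumptions, not the library's authoritative definition):

// Illustrative sketch only; inferred from the examples in this guide
interface AgentConfigSketch {
  id: string;
  type: string;              // e.g. "general"
  version: string;           // e.g. "1.0.0"
  prompts: {
    system: string;          // the agent's system prompt
  };
  metadata: {
    id: string;
    type: string;
    name: string;
    description: string;
    capabilities: string[];
    version: string;
  };
  tools: {
    builtin: unknown[];      // built-in tool definitions
    custom: unknown[];       // tools registered with the tool executor
    mcp: unknown[];          // MCP server tools
  };
  behavior: {
    maxRetries: number;
    responseFormat: string;  // e.g. "text"
  };
  voice?: {                  // optional; see Voice Capabilities below
    enabled: boolean;
    autoSpeak?: boolean;
    interruptible?: boolean;
  };
}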

LLM Service Configuration

Agents use the baseLLMService to configure LLM interactions:

// Configure the LLM service
const llmService = baseLLMService({
  // Model configuration is handled by environment variables
  // or can be passed when calling agent.act()
  langfuse: { enabled: true } // Enable observability
});

const toolExecutor = createToolExecutor();

const agent = createConfigurableAgent({
  llmService,
  toolExecutor,
  config: { /* agent config */ },
});

System Prompts

System prompts define your agent's personality and behavior:

const agent = createConfigurableAgent({
  llmService: baseLLMService({ /* ... */ }),
  toolExecutor: createToolExecutor(),
  config: {
    id: "customer-service",
    type: "general",
    version: "1.0.0",
    prompts: {
      system: `You are a customer service agent for TechCorp.
- Be professional and courteous
- Help customers with product inquiries
- Escalate complex issues to human agents
- Never share internal company information`
    },
    metadata: {
      id: "customer-service",
      type: "general",
      name: "Customer Service Agent",
      description: "Handles customer inquiries",
      capabilities: ["general"],
      version: "1.0.0"
    },
    tools: {
      builtin: [],
      custom: [],
      mcp: []
    },
    behavior: {
      maxRetries: 3,
      responseFormat: "text"
    }
  }
});

Tool Integration

Agents can use tools to extend their capabilities:

// Create tool executor and register tools
const toolExecutor = createToolExecutor();
toolExecutor.registerTool(searchTool);
toolExecutor.registerTool(calculatorTool);
toolExecutor.registerTool(emailTool);

const agent = createConfigurableAgent({
  llmService: baseLLMService({ /* ... */ }),
  toolExecutor,
  config: {
    id: "tool-agent",
    type: "general",
    version: "1.0.0",
    prompts: {
      system: "You are an AI assistant with access to various tools."
    },
    metadata: {
      id: "tool-agent",
      type: "general",
      name: "Tool Agent",
      description: "Agent with tool capabilities",
      capabilities: ["general"],
      version: "1.0.0"
    },
    tools: {
      builtin: [],
      custom: [searchTool, calculatorTool, emailTool],
      mcp: []
    },
    behavior: {
      maxRetries: 3,
      responseFormat: "text"
    }
  }
});
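
With tools registered on the executor and listed in the config, the agent invokes them during a normal act() call. A minimal usage sketch (assuming the response exposes a content field, as in the validation example later in this guide):

// Ask a question that should route through the calculator tool
const response = await agent.act({
  messages: [
    { role: "user", content: "What is 1728 divided by 12?" }
  ]
});

console.log(response.content); // final answer, produced after tool execution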

Agent Lifecycle

Understanding the agent lifecycle helps you build robust applications; a minimal end-to-end call is sketched after the steps below:

1. Input Processing

When you call agent.act(), the agent:

  • Receives the user input
  • Applies any input transformations
  • Validates the input

2. LLM Interaction

The agent:

  • Sends the processed input to the LLM
  • Receives the response
  • Parses any tool calls

3. Tool Execution

If tools are called:

  • Validates tool parameters
  • Executes tools in sequence or parallel
  • Collects tool results

4. Response Generation

The agent:

  • Processes tool results
  • Generates final response
  • Applies output transformations

5. Error Handling

Throughout the lifecycle:

  • Errors are caught and handled
  • Retry logic is applied if configured
  • Custom error handlers are invoked
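
Putting the phases together, a single act() call runs the whole cycle. A minimal sketch (the input shape matches the other examples in this guide; any response fields beyond content are assumptions):

// 1. Input is received, transformed, and validated
// 2-4. The LLM responds, tools run if called, and a final response is generated
// 5. Failures are retried according to the behavior config (maxRetries: 3 above)
const response = await agent.act({
  messages: [{ role: "user", content: "Summarize the latest sales report." }]
});

console.log(response.content);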

Advanced Features

Custom Handlers (Hooks)

Customize agent behavior at key points:

const agent = createConfigurableAgent({
  llmService,
  toolExecutor,
  config: { /* agent config */ },
  customHandlers: {
    // Transform input before processing
    transformInput: async (input) => {
      console.log("User input:", input.messages[0].content);
      return input;
    },

    // Validate responses
    validateResponse: async (response) => {
      if (response.content.includes("ERROR")) {
        return { isValid: false, reason: "Response contains error" };
      }
      return { isValid: true };
    },

    // Handle errors with retry logic
    onError: async (error, attempt) => {
      if (attempt < 3) {
        return { shouldRetry: true, delayMs: 1000 * attempt };
      }
      return { shouldRetry: false }; // give up after the third attempt
    },
  },
});

Behavior Configuration

Fine-tune agent behavior:

const agent = createConfigurableAgent({
  llmService,
  toolExecutor,
  config: {
    // ...id, prompts, metadata, tools as in the earlier examples
    behavior: {
      maxRetries: 3,
      retryDelay: 1000,
      continueOnError: false,
      parallelToolExecution: true,
      requireExplicitToolCalls: false,
    },
  },
});

Progress Tracking

Progress tracking is handled at the conversation level, not the agent level:

import { createConversationLoop } from "@mrck-labs/grid-core";

// Create conversation with progress tracking
const conversation = createConversationLoop({
  agent,
  onProgress: (update) => {
    switch (update.type) {
      case "thinking":
        console.log("🤔 Agent is thinking...");
        break;
      case "tool_execution":
        console.log(`🔧 Running ${update.toolName}...`);
        break;
      case "error":
        console.log(`❌ Error: ${update.message}`);
        break;
    }
  },
});
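
With the loop in place, progress updates fire while each message is processed. A short usage sketch (sendMessage appears later in this guide; the update types are the ones handled above):

const reply = await conversation.sendMessage(
  "Find recent AI news and summarize it"
);
// Console output interleaves 🤔 / 🔧 updates with the final reply
console.log(reply);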

Agent Patterns

Specialized Agents

Create agents for specific domains:

import { researchAgent, mathDataAgent } from "@mrck-labs/grid-agents";

// Use pre-built agents
const researcher = researchAgent;
const calculator = mathDataAgent;

// Or create custom specialized agents
const toolExecutor = createToolExecutor();
toolExecutor.registerTool(lookupOrder);
toolExecutor.registerTool(checkInventory);
toolExecutor.registerTool(createTicket);

const supportAgent = createConfigurableAgent({
  llmService: baseLLMService({ langfuse: { enabled: true } }),
  toolExecutor,
  config: {
    id: "support-agent",
    type: "general",
    version: "1.0.0",
    prompts: {
      system: "You are a customer support specialist..."
    },
    metadata: {
      id: "support-agent",
      type: "general",
      name: "Support Agent",
      description: "Customer support specialist",
      capabilities: ["general"],
      version: "1.0.0"
    },
    tools: {
      builtin: [],
      custom: [lookupOrder, checkInventory, createTicket],
      mcp: []
    },
    behavior: {
      maxRetries: 3,
      responseFormat: "text"
    }
  }
});

Multi-Model Agents

Use different models for different tasks:

// Fast agent for simple queries
const fastAgent = createConfigurableAgent({
  llmService: baseLLMService({
    model: "gpt-3.5-turbo",
    apiKey: process.env.OPENAI_API_KEY,
  }),
  toolExecutor: createToolExecutor(),
  config: {
    id: "fast-agent",
    type: "general",
    // ...version, prompts, metadata, tools, behavior as in the earlier examples
  },
});

// Powerful agent for complex reasoning
const powerfulAgent = createConfigurableAgent({
  llmService: baseLLMService({
    model: "gpt-4",
    apiKey: process.env.OPENAI_API_KEY,
  }),
  toolExecutor: createToolExecutor(),
  config: {
    id: "powerful-agent",
    type: "general",
    // ...version, prompts, metadata, tools, behavior as in the earlier examples
  },
});

// Router logic
async function handleQuery(query: string) {
  const complexity = assessComplexity(query);
  const agent = complexity > 0.7 ? powerfulAgent : fastAgent;
  return agent.act({
    messages: [{ role: "user", content: query }]
  });
}
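
assessComplexity is left to the application. A naive, purely illustrative heuristic (not part of Grid) based on query length and keywords:

// Hypothetical complexity heuristic: longer, multi-step queries score higher
function assessComplexity(query: string): number {
  const keywords = ["analyze", "compare", "explain why", "step by step"];
  const keywordScore = keywords.some((k) => query.toLowerCase().includes(k)) ? 0.5 : 0;
  const lengthScore = Math.min(query.length / 400, 0.5);
  return keywordScore + lengthScore; // 0..1, matching the 0.7 threshold above
}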

Autonomous Agents

Create agents that can work independently:

const toolExecutor = createToolExecutor();
toolExecutor.registerTool(search);
toolExecutor.registerTool(analyze);
toolExecutor.registerTool(summarize);
toolExecutor.registerTool(save);

const autonomousAgent = createConfigurableAgent({
  llmService: baseLLMService({ langfuse: { enabled: true } }),
  toolExecutor,
  config: {
    id: "autonomous-researcher",
    type: "general",
    version: "1.0.0",
    prompts: {
      system: `You are an autonomous research agent.
Break down complex tasks into steps and work through them systematically.`
    },
    metadata: {
      id: "autonomous-researcher",
      type: "general",
      name: "Autonomous Researcher",
      description: "Autonomous research agent",
      capabilities: ["general"],
      version: "1.0.0"
    },
    tools: {
      builtin: [],
      custom: [search, analyze, summarize, save],
      mcp: []
    },
    behavior: {
      maxRetries: 3,
      responseFormat: "text"
    }
  }
});

// Run with conversation loop for autonomous behavior
import { createConversationLoop } from "@mrck-labs/grid-core";

const loop = createConversationLoop({
  agent: autonomousAgent,
});

const result = await loop.sendMessage(
  "Research and summarize recent AI breakthroughs"
);

Voice Capabilities

Agents can be enhanced with voice capabilities by providing a voice service:

import { elevenlabsVoiceService } from "@mrck-labs/grid-core";

// Create voice service
// Create voice service
const voiceService = elevenlabsVoiceService({
  apiKey: process.env.ELEVENLABS_API_KEY,
  defaultVoiceId: "21m00Tcm4TlvDq8ikWAM", // Rachel voice
});

// Create voice-enabled agent
const voiceAgent = createConfigurableAgent({
  llmService,
  toolExecutor,
  voiceService, // This enables voice capabilities!
  config: {
    id: "voice-assistant",
    // ...type, version, metadata, tools, behavior as in the earlier examples
    prompts: {
      system: "You are a helpful voice assistant. Keep responses concise for speech."
    },
    voice: {
      enabled: true,
      autoSpeak: true, // Automatically speak responses
      interruptible: true, // Allow interruption mid-speech
    }
  }
});

Voice Methods

Voice-enabled agents gain additional methods:

// Check voice availability
if (voiceAgent.hasVoice()) {
  // Speak text
  await voiceAgent.speak("Hello! How can I help you?");

  // Note: listen() requires application-level audio input.
  // The base agent's listen() method throws an error;
  // use terminal voice service or web audio API for actual recording.
  // Example with terminal voice:
  // const terminalVoice = new TerminalVoiceService();
  // const recording = await terminalVoice.startRecording();
  // const audioInput = await recording.stop();
  // const transcript = await voiceService.transcribe(audioInput);
}

// Voice is integrated into the normal act flow
const response = await voiceAgent.act({
  messages: [{ role: "user", content: "What's the weather?" }]
});
// The response is automatically spoken if autoSpeak is true

Voice Configuration

Configure voice behavior at the agent level:

const config = {
  voice: {
    enabled: true, // Enable voice features
    voiceId: "voice-id", // Default voice
    autoSpeak: true, // Auto-speak responses
    interruptible: true, // Allow interruption
    autoListen: false, // Auto-listen after speaking
    mixedModality: { // Mixed voice/text config
      enabled: true,
      mergeStrategy: 'temporal'
    }
  }
};

Best Practices

1. Clear System Prompts

  • Be specific about the agent's role
  • Define boundaries and limitations
  • Include examples when helpful
  • For voice agents, optimize for spoken language

2. Tool Selection

  • Only include necessary tools
  • Ensure tool descriptions are clear
  • Test tool combinations thoroughly
  • Consider voice-friendly tool responses

3. Error Handling

  • Always implement error handlers
  • Provide meaningful error messages
  • Log errors for debugging
  • Gracefully fall back from voice to text (see the sketch after this list)
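
For the voice fallback in particular, wrapping speak() in a try/catch keeps the text response usable when synthesis fails. A minimal sketch, assuming autoSpeak is disabled so speech is triggered manually (the fallback logic is an illustration, not built-in Grid behavior):

const response = await voiceAgent.act({
  messages: [{ role: "user", content: "Read me today's summary." }]
});

try {
  if (voiceAgent.hasVoice()) {
    await voiceAgent.speak(response.content);
  }
} catch (error) {
  // Voice output failed; fall back to plain text
  console.warn("Voice unavailable, showing text instead:", error);
  console.log(response.content);
}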

4. Performance

  • Use appropriate models for tasks
  • Cache responses when possible
  • Monitor token usage
  • Use voice streaming for faster responses

5. Security

  • Validate all inputs
  • Limit tool permissions
  • Never expose sensitive data
  • Secure voice API keys

6. Voice Optimization

  • Keep responses concise for speech
  • Use natural, conversational language
  • Test with different voices
  • Handle audio failures gracefully

Next Steps

Now that you understand agents, explore: