Grit Agent SDK¶
An integrated framework for building AI chat agents with support for multiple LLM providers, conversation memory, knowledge bases (RAG), and multi-agent handoffs.
Key Features¶
- Dual-Provider Support — Build agents using OpenAI (GPT-5, o3) or Claude (Sonnet, Opus, Haiku)
- Unified Interface — Both providers share identical APIs (process_chat, build_messages, create_new_thread)
- Conversation Memory — Automatic persistence across sessions with thread-based isolation
- Knowledge Base (RAG) — Semantic search over your documents using pgvector
- Agent Handoffs — Delegate tasks to specialized sub-agents
- MCP Database Access — Query models through Model Context Protocol
- Streaming Responses — Real-time token streaming via async generators
- Multi-modal Support — Process PDFs and images alongside text
Architecture¶
┌────────────────────────────────────────────────────────────────┐
│                        Your Application                        │
└────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌────────────────────────────────────────────────────────────────┐
│                          Agent Model                           │
│    (Database configuration: system prompt, model, features)    │
└────────────────────────────────────────────────────────────────┘
                                │
                    ┌───────────┴───────────┐
                    ▼                       ▼
          ┌───────────────────┐   ┌───────────────────┐
          │  BaseOpenAIAgent  │   │  BaseClaudeAgent  │
          │   (OpenAI SDK)    │   │   (Claude SDK)    │
          └───────────────────┘   └───────────────────┘
                    │                       │
                    └───────────┬───────────┘
                                ▼
          ┌───────────────────────────────────────────┐
          │               Core Services               │
          │  • MemoryStoreService (conversation)      │
          │  • KnowledgeBaseVectorStoreService (RAG)  │
          └───────────────────────────────────────────┘
Quick Start¶
1. Create an Agent via Admin¶
Navigate to the Admin panel and create a new Agent with:
| Field | Example Value |
|---|---|
| Name | Customer Support Agent |
| System Prompt | You are a helpful customer support assistant... |
| Metadata → model_name | gpt-4o or claude-sonnet-4-5 |
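The same record can be created programmatically if you prefer to script setup instead of using the Admin. The system_prompt field name below is assumed from the "System Prompt" Admin field; name and the metadata JSONField are documented later in this guide:

from grit.agent.models import Agent

agent_model = Agent.objects.create(
    name="Customer Support Agent",
    # Field name assumed from the "System Prompt" Admin field
    system_prompt="You are a helpful customer support assistant...",
    metadata={"model_name": "gpt-4o"},
)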
2. Use the Agent in Your View¶
from grit.agent.models import Agent


async def chat_view(request):
    # Get agent configuration from the database (async ORM call)
    agent_model = await Agent.objects.aget(name="Customer Support Agent")
    config = agent_model.get_config()

    # Get the appropriate agent class (auto-detected from model_name)
    agent_class = Agent.objects.get_agent_class(
        config.agent_class,
        model_name=config.model_name
    )

    # Create and initialize the agent
    agent = await agent_class.create(config=config)

    # Stream the response
    async for chunk in agent.process_chat(
        user=request.user,
        thread_id="unique-thread-id",
        new_message="Hello, I need help with my order"
    ):
        yield chunk
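To serve this generator from a real Django view, wrap it in a StreamingHttpResponse, which accepts asynchronous iterators in Django 4.2+. This wiring is illustrative and not part of the SDK:

from django.http import StreamingHttpResponse

async def chat_stream(request):
    # chat_view(request) is the async generator defined above; Django streams
    # each yielded chunk to the client as soon as it is produced.
    return StreamingHttpResponse(
        chat_view(request),
        content_type="text/plain; charset=utf-8",
    )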
3. Factory Method Pattern¶
For cleaner initialization, use the create() factory method:
# This handles both __init__ and async initialize() in one call
agent = await BaseOpenAIAgent.create(config=config)
Agent Providers¶
The SDK supports two LLM providers with identical interfaces. The provider is auto-detected from the model_name in your configuration.
OpenAI Agents¶
from grit.agent.openai_agent import BaseOpenAIAgent
agent = await BaseOpenAIAgent.create(config=config)
Supported Models:
- gpt-4o — Fast, cost-effective (default)
- gpt-4.1 / gpt-4.1-mini — Latest GPT-4 variants
- o1 / o3 — Reasoning models (supports reasoning_effort)
- gpt-4.5 / gpt-5 — Advanced capabilities
Configuration:
# In Agent metadata
{
    "model_name": "gpt-4o",
    "enable_web_search": True,
    "reasoning_effort": "medium"  # For o1/o3 models: "low", "medium", "high"
}
Claude Agents¶
from grit.agent.claude_agent import BaseClaudeAgent
agent = await BaseClaudeAgent.create(config=config)
Supported Models:
- claude-sonnet-4-5 — Balanced performance (default)
- claude-haiku-4-5 — Fast responses
- claude-opus-4-5 — Highest capability
Configuration:
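A minimal metadata example, mirroring the OpenAI configuration above (keys beyond model_name and enable_knowledge_base should be checked against the Agent Metadata Fields reference below):

# In Agent metadata
{
    "model_name": "claude-sonnet-4-5",
    "enable_knowledge_base": True
}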
Core Features¶
Conversation Memory¶
Conversations are automatically persisted per user and thread. The MemoryStoreService handles storage using the ORM.
# Memory is managed automatically by process_chat()
# Each message is persisted immediately to prevent race conditions
# To access memory directly:
from grit.agent.store import MemoryStoreService
memory_service = MemoryStoreService()
namespace = ("memories", str(user.id))
# Get conversation history
memory = memory_service.get_memory(namespace, thread_id)
if memory:
    history = memory.value['conversation_history']
# List recent conversations (up to 20)
recent = memory_service.list_memories(namespace)
Thread Isolation: Each thread_id maintains its own conversation history. Create new threads with agent.create_new_thread(session_key).
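For example, a fresh conversation for the current browser session might be started like this (assuming create_new_thread is synchronous and returns the new thread identifier, as the note above implies):

# Start a new, isolated conversation for this browser session
thread_id = agent.create_new_thread(request.session.session_key)

# Subsequent calls with this thread_id use a separate history
async for chunk in agent.process_chat(
    user=request.user,
    thread_id=thread_id,
    new_message="Let's start over.",
):
    ...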
Knowledge Base (RAG)¶
Enable semantic search over your documents using vector embeddings and pgvector.
1. Enable Knowledge Base:
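Set the flag in the agent's metadata (see the Agent Metadata Fields reference later in this guide):

# In Agent metadata
{
    "model_name": "gpt-4o",
    "enable_knowledge_base": True
}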
2. Link Knowledge Bases:
Associate KnowledgeBase records with your Agent via the knowledge_bases many-to-many relationship.
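With the standard many-to-many API this is a one-liner. The KnowledgeBase import path and the record lookup below are assumptions for illustration:

from grit.agent.models import Agent, KnowledgeBase  # KnowledgeBase path assumed

agent_model = Agent.objects.get(name="Customer Support Agent")
kb = KnowledgeBase.objects.get(name="Product FAQ")  # example record
agent_model.knowledge_bases.add(kb)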
3. Add Documents:
from grit.agent.store import KnowledgeBaseVectorStoreService
kb_service = KnowledgeBaseVectorStoreService()
# Add a document (automatically chunked and embedded)
kb_service.add_document(
    knowledge_base_id=str(knowledge_base.id),
    file_path="docs/faq.md",
    text="Your document content here...",
    chunk_size=300,  # words per chunk
)
4. Automatic RAG Integration:
When enable_knowledge_base is True, the agent automatically:
- Searches linked knowledge bases for relevant content
- Injects retrieved documents into the conversation context
- Wraps results in <retrieved_knowledge> tags
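The injected block looks roughly like the following. Only the <retrieved_knowledge> tag name is documented; the layout inside it is an illustrative assumption:

<retrieved_knowledge>
Source: docs/faq.md
Refunds are available within 30 days of delivery...

Source: docs/shipping.md
Standard shipping takes 3 to 5 business days...
</retrieved_knowledge>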
Agent Handoffs¶
Delegate conversations to specialized sub-agents when specific expertise is needed.
1. Configure Sub-Agents:
In the Admin, add agents to the sub_agents relationship on your main agent.
2. Handoff Behavior:
The SDK automatically:
- Detects when a handoff is appropriate
- Transfers conversation history to the sub-agent
- Updates the current_agent_id in memory
# Handoff instructions are auto-generated in the system prompt:
# "You can transfer conversations to specialized agents..."
# Available agents:
# - Billing Support: Handles payment and invoice questions
# - Technical Support: Handles product technical issues
3. Sub-Agent Configuration:
Each sub-agent can have its own:
- Model (different providers allowed)
- System prompt
- Knowledge bases
- Tools
MCP Database Access¶
User-mode agents can query models through Model Context Protocol (MCP).
1. Use User-Mode Agent:
from grit.agent.openai_agent import BaseOpenAIUserModeAgent
from grit.agent.claude_agent import BaseClaudeUserModeAgent
# Pass request context for authentication
agent = BaseOpenAIUserModeAgent(config=config, request=request)
await agent.initialize()
2. Register Models for MCP:
Models must have a scoped manager to be queryable:
from grit.agent.mcp_server import mcp_registry
# Register in your app's ready() method
mcp_registry.register(YourModel)
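What counts as a scoped manager is defined by your project; a hypothetical model that could be registered might look like this (the manager name and scoping rule are illustrative assumptions, not an SDK contract):

from django.conf import settings
from django.db import models

class ScopedDocumentManager(models.Manager):
    """Hypothetical manager that restricts rows to the requesting user."""

    def for_user(self, user):
        # Method name and filter are illustrative assumptions
        return self.get_queryset().filter(owner=user)

class Document(models.Model):
    owner = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    title = models.CharField(max_length=255)

    objects = ScopedDocumentManager()

# Then, in your AppConfig.ready():
# mcp_registry.register(Document)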
3. Available Operations:
The agent gains an mcp_query tool with these operations:
- list — Paginated listing with optional filters
- retrieve — Get single record by primary key
- search — Full-text search across specified fields
Configuration Reference¶
AgentConfig Fields¶
| Field | Type | Description |
|---|---|---|
| id | str | Unique identifier (UUID) |
| label | str | Display name |
| description | str | Agent description |
| model_name | str | LLM model identifier |
| enable_web_search | bool | Enable web search tool (OpenAI only) |
| enable_knowledge_base | bool | Enable RAG from linked knowledge bases |
| knowledge_bases | list | Linked KnowledgeBase records |
| reasoning_effort | str | For o1/o3 models: "low", "medium", "high" |
Agent Metadata Fields¶
Store these in the metadata JSONField on the Agent model:
{
    "model_name": "gpt-4o",
    "description": "Customer support assistant",
    "enable_web_search": True,
    "enable_knowledge_base": False,
    "agent_class": "grit.agent.openai_agent.BaseOpenAIAgent",
    "reasoning_effort": "medium",
    "suggested_messages": [
        "How can I track my order?",
        "I need to return a product"
    ],
    "overview_html": "<p>I help with customer inquiries.</p>",
    "tags": {"Type": ["support", "public"]}
}
Extending Agents¶
Create custom agent classes by inheriting from the base classes:
from grit.agent.openai_agent import BaseOpenAIAgent
class CustomAgent(BaseOpenAIAgent):
    def get_agent_instructions_context(self) -> dict:
        """Add custom context variables for prompt templates."""
        return {
            "custom_data": self.fetch_custom_data()
        }

    def _build_tools(self):
        """Add custom tools to the agent."""
        tools = super()._build_tools()
        tools.append(self.my_custom_tool)
        return tools

    def on_agent_end(self, user_id, thread_id, new_message, final_output):
        """Hook for post-processing after response completes."""
        super().on_agent_end(user_id, thread_id, new_message, final_output)
        self.log_interaction(user_id, final_output)
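To have the Quick Start factory resolve this subclass instead of the built-in class, point the agent_class metadata key at its dotted path (the module path below is illustrative):

# In Agent metadata
{
    "agent_class": "myapp.agents.CustomAgent",
    "model_name": "gpt-4o"
}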
Streaming Interface¶
Both providers use the same async generator pattern:
async for chunk in agent.process_chat(user, thread_id, message):
    # chunk is a string fragment of the response
    await send_to_client(chunk)
The process_chat method:
- Persists the user message immediately
- Builds conversation context (history + RAG)
- Streams tokens from the LLM
- Persists the assistant response on completion
- Handles handoffs if detected
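As a sketch of how the pieces fit together over WebSockets, a Django Channels consumer could drive process_chat like this. Channels is not part of the SDK, and the consumer below is illustrative:

import json

from channels.generic.websocket import AsyncWebsocketConsumer

from grit.agent.models import Agent

class ChatConsumer(AsyncWebsocketConsumer):
    async def receive(self, text_data=None, bytes_data=None):
        payload = json.loads(text_data)

        # Resolve and initialize the configured agent (see Quick Start)
        agent_model = await Agent.objects.aget(name="Customer Support Agent")
        config = agent_model.get_config()
        agent_class = Agent.objects.get_agent_class(
            config.agent_class, model_name=config.model_name
        )
        agent = await agent_class.create(config=config)

        # Relay each streamed chunk to the browser as it arrives
        async for chunk in agent.process_chat(
            user=self.scope["user"],
            thread_id=payload["thread_id"],
            new_message=payload["message"],
        ):
            await self.send(text_data=chunk)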