Grit Agent SDK¶
An integrated framework for building AI chat agents with support for multiple LLM providers, conversation memory, knowledge bases (RAG), and multi-agent handoffs.
Key Features¶
- Dual-Provider Support — Build agents using OpenAI (GPT-5, o3) or Claude (Sonnet, Opus, Haiku)
- Unified Interface — Both providers share identical APIs (process_chat, build_messages, create_new_thread)
- Conversation Memory — Automatic persistence across sessions with thread-based isolation
- Knowledge Base (RAG) — Semantic search over your documents using pgvector
- Agent Handoffs — Delegate tasks to specialized sub-agents
- MCP Database Access — Query models through Model Context Protocol
- Streaming Responses — Real-time token streaming via async generators
- Multi-modal Support — Process PDFs and images alongside text
Architecture¶
┌────────────────────────────────────────────────────────────────┐
│                        Your Application                        │
└────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌────────────────────────────────────────────────────────────────┐
│                          Agent Model                           │
│    (Database configuration: system prompt, model, features)    │
└────────────────────────────────────────────────────────────────┘
                                │
                    ┌───────────┴───────────┐
                    ▼                       ▼
          ┌───────────────────┐   ┌───────────────────┐
          │  BaseOpenAIAgent  │   │  BaseClaudeAgent  │
          │   (OpenAI SDK)    │   │   (Claude SDK)    │
          └───────────────────┘   └───────────────────┘
                    │                       │
                    └───────────┬───────────┘
                                ▼
          ┌───────────────────────────────────────────┐
          │               Core Services               │
          │  • MemoryStoreService (conversation)      │
          │  • KnowledgeBaseVectorStoreService (RAG)  │
          └───────────────────────────────────────────┘
Quick Start¶
1. Create an Agent via Admin¶
Navigate to the Admin panel and create a new Agent with:
| Field | Example Value |
|---|---|
| Name | Customer Support Agent |
| System Prompt | You are a helpful customer support assistant... |
| Metadata → model_name | gpt-4o or claude-sonnet-4-5 |
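The same record can be created programmatically if you prefer to script setup instead of using the Admin. The system_prompt field name below is assumed from the "System Prompt" Admin field; name and the metadata JSONField are documented later in this guide:

from grit.agent.models import Agent

agent_model = Agent.objects.create(
    name="Customer Support Agent",
    # Field name assumed from the "System Prompt" Admin field
    system_prompt="You are a helpful customer support assistant...",
    metadata={"model_name": "gpt-4o"},
)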
2. Use the Agent in Your View¶
from grit.agent.models import Agent


async def chat_view(request):
    # Get agent configuration from the database (async ORM call)
    agent_model = await Agent.objects.aget(name="Customer Support Agent")
    config = agent_model.get_config()

    # Get the appropriate agent class (auto-detected from model_name)
    agent_class = Agent.objects.get_agent_class(
        config.agent_class,
        model_name=config.model_name
    )

    # Create and initialize the agent
    agent = await agent_class.create(config=config)

    # Stream the response
    async for chunk in agent.process_chat(
        user=request.user,
        thread_id="unique-thread-id",
        new_message="Hello, I need help with my order"
    ):
        yield chunk
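To serve this generator from a real Django view, wrap it in a StreamingHttpResponse, which accepts asynchronous iterators in Django 4.2+. This wiring is illustrative and not part of the SDK:

from django.http import StreamingHttpResponse

async def chat_stream(request):
    # chat_view(request) is the async generator defined above; Django streams
    # each yielded chunk to the client as soon as it is produced.
    return StreamingHttpResponse(
        chat_view(request),
        content_type="text/plain; charset=utf-8",
    )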
3. Factory Method Pattern¶
For cleaner initialization, use the create() factory method:
# This handles both __init__ and async initialize() in one call
agent = await BaseOpenAIAgent.create(config=config)
Agent Providers¶
The SDK supports two LLM providers with identical interfaces. The provider is auto-detected from the model_name in your configuration.
OpenAI Agents¶
from grit.agent.openai_agent import BaseOpenAIAgent
agent = await BaseOpenAIAgent.create(config=config)
Supported Models:
- gpt-4o — Fast, cost-effective (default)
- gpt-4.1 / gpt-4.1-mini — Latest GPT-4 variants
- o1 / o3 — Reasoning models (supports reasoning_effort)
- gpt-4.5 / gpt-5 — Advanced capabilities
Configuration:
# In Agent metadata
{
    "model_name": "gpt-4o",
    "enable_web_search": True,
    "reasoning_effort": "medium"  # For o1/o3 models: "low", "medium", "high"
}
Claude Agents¶
from grit.agent.claude_agent import BaseClaudeAgent
agent = await BaseClaudeAgent.create(config=config)
Supported Models:
- claude-sonnet-4-5 — Balanced performance (default)
- claude-haiku-4-5 — Fast responses
- claude-opus-4-5 — Highest capability
Configuration:
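A minimal metadata example, mirroring the OpenAI configuration above (keys beyond model_name and enable_knowledge_base should be checked against the Agent Metadata Fields reference below):

# In Agent metadata
{
    "model_name": "claude-sonnet-4-5",
    "enable_knowledge_base": True
}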
Core Features¶
Conversation Memory¶
Conversations are automatically persisted per user and thread. The MemoryStoreService handles storage using the ORM.
# Memory is managed automatically by process_chat()
# Each message is persisted immediately to prevent race conditions
# To access memory directly:
from grit.agent.store import MemoryStoreService
memory_service = MemoryStoreService()
namespace = ("memories", str(user.id))
# Get conversation history
memory = memory_service.get_memory(namespace, thread_id)
if memory:
    history = memory.value['conversation_history']
# List recent conversations (up to 20)
recent = memory_service.list_memories(namespace)
Thread Isolation: Each thread_id maintains its own conversation history. Create new threads with agent.create_new_thread(session_key).
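For example, a fresh conversation for the current browser session might be started like this (assuming create_new_thread is synchronous and returns the new thread identifier, as the note above implies):

# Start a new, isolated conversation for this browser session
thread_id = agent.create_new_thread(request.session.session_key)

# Subsequent calls with this thread_id use a separate history
async for chunk in agent.process_chat(
    user=request.user,
    thread_id=thread_id,
    new_message="Let's start over.",
):
    ...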
Knowledge Base (RAG)¶
Enable semantic search over your documents using vector embeddings and pgvector.
1. Enable Knowledge Base:
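Set the flag in the agent's metadata (see the Agent Metadata Fields reference later in this guide):

# In Agent metadata
{
    "model_name": "gpt-4o",
    "enable_knowledge_base": True
}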
2. Link Knowledge Bases:
Associate KnowledgeBase records with your Agent via the knowledge_bases many-to-many relationship.
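With the standard many-to-many API this is a one-liner. The KnowledgeBase import path and the record lookup below are assumptions for illustration:

from grit.agent.models import Agent, KnowledgeBase  # KnowledgeBase path assumed

agent_model = Agent.objects.get(name="Customer Support Agent")
kb = KnowledgeBase.objects.get(name="Product FAQ")  # example record
agent_model.knowledge_bases.add(kb)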
3. Add Documents:
from grit.agent.store import KnowledgeBaseVectorStoreService
kb_service = KnowledgeBaseVectorStoreService()
# Add a document (automatically chunked and embedded)
kb_service.add_document(
    knowledge_base_id=str(knowledge_base.id),
    file_path="docs/faq.md",
    text="Your document content here...",
    chunk_size=300,  # words per chunk
)
4. Automatic RAG Integration:
When enable_knowledge_base is True, the agent automatically:
- Searches linked knowledge bases for relevant content
- Injects retrieved documents into the conversation context
- Wraps results in <retrieved_knowledge> tags
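The injected block looks roughly like the following. Only the <retrieved_knowledge> tag name is documented; the layout inside it is an illustrative assumption:

<retrieved_knowledge>
Source: docs/faq.md
Refunds are available within 30 days of delivery...

Source: docs/shipping.md
Standard shipping takes 3 to 5 business days...
</retrieved_knowledge>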
Agent Handoffs¶
Delegate conversations to specialized sub-agents when specific expertise is needed.
1. Configure Sub-Agents:
In the Admin, add agents to the sub_agents relationship on your main agent.
2. Handoff Behavior:
The SDK automatically:
- Detects when a handoff is appropriate
- Transfers conversation history to the sub-agent
- Updates the current_agent_id in memory
# Handoff instructions are auto-generated in the system prompt:
# "You can transfer conversations to specialized agents..."
# Available agents:
# - Billing Support: Handles payment and invoice questions
# - Technical Support: Handles product technical issues
3. Sub-Agent Configuration:
Each sub-agent can have its own:
- Model (different providers allowed)
- System prompt
- Knowledge bases
- Tools
MCP Database Access¶
User-mode agents can query models through Model Context Protocol (MCP).
1. Use User-Mode Agent:
from grit.agent.openai_agent import BaseOpenAIUserModeAgent
from grit.agent.claude_agent import BaseClaudeUserModeAgent
# Pass request context for authentication
agent = BaseOpenAIUserModeAgent(config=config, request=request)
await agent.initialize()
2. Register Models for MCP:
Models must have a scoped manager to be queryable:
from grit.agent.mcp_server import mcp_registry
# Register in your app's ready() method
mcp_registry.register(YourModel)
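What counts as a scoped manager is defined by your project; a hypothetical model that could be registered might look like this (the manager name and scoping rule are illustrative assumptions, not an SDK contract):

from django.conf import settings
from django.db import models

class ScopedDocumentManager(models.Manager):
    """Hypothetical manager that restricts rows to the requesting user."""

    def for_user(self, user):
        # Method name and filter are illustrative assumptions
        return self.get_queryset().filter(owner=user)

class Document(models.Model):
    owner = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    title = models.CharField(max_length=255)

    objects = ScopedDocumentManager()

# Then, in your AppConfig.ready():
# mcp_registry.register(Document)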
3. Available Operations:
The agent gains an mcp_query tool with these operations:
- list — Paginated listing with optional filters
- retrieve — Get single record by primary key
- search — Full-text search across specified fields
Configuration Reference¶
AgentConfig Fields¶
| Field | Type | Description |
|---|---|---|
| id | str | Unique identifier (UUID) |
| label | str | Display name |
| description | str | Agent description |
| model_name | str | LLM model identifier |
| enable_web_search | bool | Enable web search tool (OpenAI only) |
| enable_knowledge_base | bool | Enable RAG from linked knowledge bases |
| knowledge_bases | list | Linked KnowledgeBase records |
| reasoning_effort | str | For o1/o3 models: "low", "medium", "high" |
Agent Metadata Fields¶
Store these in the metadata JSONField on the Agent model:
{
    "model_name": "gpt-4o",
    "description": "Customer support assistant",
    "enable_web_search": True,
    "enable_knowledge_base": False,
    "agent_class": "grit.agent.openai_agent.BaseOpenAIAgent",
    "reasoning_effort": "medium",
    "suggested_messages": [
        "How can I track my order?",
        "I need to return a product"
    ],
    "overview_html": "<p>I help with customer inquiries.</p>",
    "tags": {"Type": ["support", "public"]}
}
Extending Agents¶
Create custom agent classes by inheriting from the base classes:
from grit.agent.openai_agent import BaseOpenAIAgent
class CustomAgent(BaseOpenAIAgent):
    def get_agent_instructions_context(self) -> dict:
        """Add custom context variables for prompt templates."""
        return {
            "custom_data": self.fetch_custom_data()
        }

    def _build_tools(self):
        """Add custom tools to the agent."""
        tools = super()._build_tools()
        tools.append(self.my_custom_tool)
        return tools

    def on_agent_end(self, user_id, thread_id, new_message, final_output):
        """Hook for post-processing after response completes."""
        super().on_agent_end(user_id, thread_id, new_message, final_output)
        self.log_interaction(user_id, final_output)
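To have the Quick Start factory resolve this subclass instead of the built-in class, point the agent_class metadata key at its dotted path (the module path below is illustrative):

# In Agent metadata
{
    "agent_class": "myapp.agents.CustomAgent",
    "model_name": "gpt-4o"
}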
Streaming Interface¶
Both providers use the same async generator pattern:
async for chunk in agent.process_chat(user, thread_id, message):
    # chunk is a string fragment of the response
    await send_to_client(chunk)
The process_chat method:
- Persists the user message immediately
- Builds conversation context (history + RAG)
- Streams tokens from the LLM
- Persists the assistant response on completion
- Handles handoffs if detected
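As a sketch of how the pieces fit together over WebSockets, a Django Channels consumer could drive process_chat like this. Channels is not part of the SDK, and the consumer below is illustrative:

import json

from channels.generic.websocket import AsyncWebsocketConsumer

from grit.agent.models import Agent

class ChatConsumer(AsyncWebsocketConsumer):
    async def receive(self, text_data=None, bytes_data=None):
        payload = json.loads(text_data)

        # Resolve and initialize the configured agent (see Quick Start)
        agent_model = await Agent.objects.aget(name="Customer Support Agent")
        config = agent_model.get_config()
        agent_class = Agent.objects.get_agent_class(
            config.agent_class, model_name=config.model_name
        )
        agent = await agent_class.create(config=config)

        # Relay each streamed chunk to the browser as it arrives
        async for chunk in agent.process_chat(
            user=self.scope["user"],
            thread_id=payload["thread_id"],
            new_message=payload["message"],
        ):
            await self.send(text_data=chunk)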