From a4f684466c129efcb6d5809130d47645aa7650d7 Mon Sep 17 00:00:00 2001 From: qzl Date: Tue, 3 Mar 2026 17:29:01 +0800 Subject: [PATCH] chore: clean up opencode skill files and old design docs, update configuration docs MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- .env.example | 5 + .opencode/skills/ag-ui/SKILL.md | 111 - .opencode/skills/ag-ui/llms-full.txt | 10632 --- .opencode/skills/ag-ui/modules/agents.md | 16 - .../skills/ag-ui/modules/architecture.md | 17 - .../skills/ag-ui/modules/contributing.md | 15 - .opencode/skills/ag-ui/modules/drafts.md | 53 - .opencode/skills/ag-ui/modules/events.md | 20 - .../skills/ag-ui/modules/generative-ui.md | 11 - .opencode/skills/ag-ui/modules/messages.md | 21 - .opencode/skills/ag-ui/modules/middleware.md | 18 - .opencode/skills/ag-ui/modules/overview.md | 28 - .opencode/skills/ag-ui/modules/protocol.md | 11 - .opencode/skills/ag-ui/modules/reasoning.md | 26 - .../skills/ag-ui/modules/serialization.md | 21 - .opencode/skills/ag-ui/modules/state.md | 18 - .opencode/skills/ag-ui/modules/tools.md | 24 - .opencode/skills/ag-ui/scripts/README.md | 163 - .../skills/ag-ui/scripts/minimal_agent.ts | 99 - .../ag-ui/scripts/state_sync_example.ts | 255 - .../skills/ag-ui/scripts/tool_call_example.ts | 201 - .opencode/skills/crewai/SKILL.md | 93 - .opencode/skills/crewai/llms-full.md | 53221 ---------------- .opencode/skills/crewai/modules/agents.md | 18 - .../skills/crewai/modules/collaboration.md | 20 - .opencode/skills/crewai/modules/crews.md | 20 - .opencode/skills/crewai/modules/flows.md | 13 - .../skills/crewai/modules/installation.md | 13 - .opencode/skills/crewai/modules/knowledge.md | 17 - .opencode/skills/crewai/modules/llms.md | 11 - .opencode/skills/crewai/modules/memory.md | 11 - .opencode/skills/crewai/modules/planning.md 
| 10 - .../skills/crewai/modules/quickstart-tools.md | 52 - .opencode/skills/crewai/modules/reasoning.md | 10 - .opencode/skills/crewai/modules/tasks.md | 20 - .opencode/skills/crewai/modules/tools.md | 13 - .opencode/skills/crewai/modules/training.md | 12 - .../2026-02-26-social-data-model-redesign.md | 651 - docs/plans/2026-02-27-invite-code-design.md | 161 - ...6-02-27-invite-code-implementation-plan.md | 309 - .../2026-02-27-schedule-items-api-design.md | 191 - ...02-27-schedule-items-api-implementation.md | 1244 - docs/plans/2026-02-28-ag-ui-chat-design.md | 567 - ...26-02-28-ag-ui-chat-implementation-plan.md | 2463 - ...6-02-28-calendar-sharing-implementation.md | 714 - docs/plans/2026-02-28-friendship-design.md | 136 - ...26-02-28-friendship-implementation-plan.md | 870 - docs/runtime/runtime-database.md | 150 +- 48 files changed, 134 insertions(+), 72641 deletions(-) delete mode 100644 .opencode/skills/ag-ui/SKILL.md delete mode 100644 .opencode/skills/ag-ui/llms-full.txt delete mode 100644 .opencode/skills/ag-ui/modules/agents.md delete mode 100644 .opencode/skills/ag-ui/modules/architecture.md delete mode 100644 .opencode/skills/ag-ui/modules/contributing.md delete mode 100644 .opencode/skills/ag-ui/modules/drafts.md delete mode 100644 .opencode/skills/ag-ui/modules/events.md delete mode 100644 .opencode/skills/ag-ui/modules/generative-ui.md delete mode 100644 .opencode/skills/ag-ui/modules/messages.md delete mode 100644 .opencode/skills/ag-ui/modules/middleware.md delete mode 100644 .opencode/skills/ag-ui/modules/overview.md delete mode 100644 .opencode/skills/ag-ui/modules/protocol.md delete mode 100644 .opencode/skills/ag-ui/modules/reasoning.md delete mode 100644 .opencode/skills/ag-ui/modules/serialization.md delete mode 100644 .opencode/skills/ag-ui/modules/state.md delete mode 100644 .opencode/skills/ag-ui/modules/tools.md delete mode 100644 .opencode/skills/ag-ui/scripts/README.md delete mode 100644 .opencode/skills/ag-ui/scripts/minimal_agent.ts 
delete mode 100644 .opencode/skills/ag-ui/scripts/state_sync_example.ts delete mode 100644 .opencode/skills/ag-ui/scripts/tool_call_example.ts delete mode 100644 .opencode/skills/crewai/SKILL.md delete mode 100644 .opencode/skills/crewai/llms-full.md delete mode 100644 .opencode/skills/crewai/modules/agents.md delete mode 100644 .opencode/skills/crewai/modules/collaboration.md delete mode 100644 .opencode/skills/crewai/modules/crews.md delete mode 100644 .opencode/skills/crewai/modules/flows.md delete mode 100644 .opencode/skills/crewai/modules/installation.md delete mode 100644 .opencode/skills/crewai/modules/knowledge.md delete mode 100644 .opencode/skills/crewai/modules/llms.md delete mode 100644 .opencode/skills/crewai/modules/memory.md delete mode 100644 .opencode/skills/crewai/modules/planning.md delete mode 100644 .opencode/skills/crewai/modules/quickstart-tools.md delete mode 100644 .opencode/skills/crewai/modules/reasoning.md delete mode 100644 .opencode/skills/crewai/modules/tasks.md delete mode 100644 .opencode/skills/crewai/modules/tools.md delete mode 100644 .opencode/skills/crewai/modules/training.md delete mode 100644 docs/plans/2026-02-26-social-data-model-redesign.md delete mode 100644 docs/plans/2026-02-27-invite-code-design.md delete mode 100644 docs/plans/2026-02-27-invite-code-implementation-plan.md delete mode 100644 docs/plans/2026-02-27-schedule-items-api-design.md delete mode 100644 docs/plans/2026-02-27-schedule-items-api-implementation.md delete mode 100644 docs/plans/2026-02-28-ag-ui-chat-design.md delete mode 100644 docs/plans/2026-02-28-ag-ui-chat-implementation-plan.md delete mode 100644 docs/plans/2026-02-28-calendar-sharing-implementation.md delete mode 100644 docs/plans/2026-02-28-friendship-design.md delete mode 100644 docs/plans/2026-02-28-friendship-implementation-plan.md diff --git a/.env.example b/.env.example index 0a9375a..d7ae22b 100644 --- a/.env.example +++ b/.env.example @@ -143,3 +143,8 @@ 
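SOCIAL_STORAGE__BUCKET=agent-chat-attachments SOCIAL_STORAGE__SIGNED_URL_TTL_SECONDS=600 SOCIAL_STORAGE__MAX_FILE_SIZE_MB=20 SOCIAL_STORAGE__RETENTION_DAYS=30 + +###### +# LLM API KEY +LLM_DEEPSEEK_API_KEY= +LLM_DASHSCOPE_API_KEY= diff --git a/.opencode/skills/ag-ui/SKILL.md b/.opencode/skills/ag-ui/SKILL.md deleted file mode 100644 index fe08786..0000000 --- a/.opencode/skills/ag-ui/SKILL.md +++ /dev/null @@ -1,111 +0,0 @@ ---- -name: ag-ui -description: AG-UI protocol for agent-user interaction. Use when implementing agent chat, streaming events, tool calls, state synchronization, SSE, multimodal messages, MCP/A2A integration, or any AG-UI protocol development. ---- - -# AG-UI Skills - -The authoritative guide to AG-UI protocol development. **Must be used** when: building agentic applications, implementing agent-user interaction, handling streaming events, managing the tool call lifecycle, synchronizing state, handling multimodal messages, or integrating MCP/A2A. Provides a complete module index that maps each module to line numbers in the source file. - -## When to use - -Scenarios where this skill **must be used**: -- Implementing streaming interaction between an agent and the frontend -- Handling agent lifecycle events (RunStarted/Finished, StepStarted/Finished) -- Implementing tool calls (ToolCall event stream) -- Managing agent state and synchronizing it with the frontend -- Building agent applications that integrate the MCP/A2A protocols -- Implementing human-in-the-loop collaboration (Interrupts, Approval flows) -- Handling multimodal messages (text, images, audio, video) - -**Query patterns**: -- "how to implement agent streaming responses" -- "tool call event flow" -- "agent state delta synchronization" -- "human-in-the-loop interrupt" -- "AG-UI and MCP integration" - -## Module index - -Look up the corresponding sections of the source file by functional module: - -| Module | Purpose | Source file lines | -|------|------|------------| -| [protocol](modules/protocol.md) | Protocol overview and its relationship to MCP and A2A | 1-33 | -| [agents](modules/agents.md) | Agent concepts, architecture, types, implementation | 35-451 | -| [architecture](modules/architecture.md) | Core architecture, design principles, runtime mechanics | 453-679 | -| [events](modules/events.md) | Detailed reference for all event types | 680-1475 | -| [generative-ui](modules/generative-ui.md) | Generative UI specs (A2UI/MCP-UI) | 1476-1496 | -| [messages](modules/messages.md) | Message structure, types, synchronization | 1498-1952 | -| [middleware](modules/middleware.md) | Middleware: transforming, filtering, and augmenting the event stream | 1954-2158 | -| [reasoning](modules/reasoning.md) | LLM reasoning support, encrypted reasoning content | 2160-2638 | -| [serialization](modules/serialization.md) | Event stream serialization, compaction, branching | 2640-2827 | -| [state](modules/state.md) | State synchronization between agent and frontend | 2829-3080 | -| [tools](modules/tools.md) | Tool definitions, call lifecycle | 3082-3441 | -| 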
[drafts](modules/drafts.md) | Draft proposals: Generative UI, Interrupts, Meta Events, Multimodal | 3492-4846 | -| [contributing](modules/contributing.md) | Contributing guide, roadmap, changelog | 3443-3485 | -| [overview](modules/overview.md) | **General introduction to the AG-UI protocol** | 4894-5261 | - -## Source files - -- `llms-full.txt` - Complete AG-UI protocol documentation (single source of truth, 10632 lines) -- `scripts/` - Runnable example code (see "Example scripts" below) - -## Example scripts - -The `scripts/` directory contains TypeScript examples that can be run directly: - -| Example | Purpose | Reference | -|------|------|---------| -| [minimal_agent.ts](scripts/minimal_agent.ts) | Minimal agent implementation | [agents](modules/agents.md) lines 132-197 | -| [tool_call_example.ts](scripts/tool_call_example.ts) | Tool call flow | [events](modules/events.md) lines 938-1066 | -| [state_sync_example.ts](scripts/state_sync_example.ts) | Snapshot-delta state synchronization | [events](modules/events.md) lines 1067-1155 | - -**Running the examples**: -```bash -# Install dependencies -npm install @ag-ui/client rxjs - -# Run -npx ts-node scripts/minimal_agent.ts -``` - -See [scripts/README.md](scripts/README.md) for details - -## Common events quick reference - -| Scenario | Key events | See | -|------|---------|------| -| Streaming response | TextMessageStart → Content → End | [events](modules/events.md) lines 835-937 | -| Tool call | ToolCallStart → Args → End → Result | [events](modules/events.md) lines 938-1066 | -| State synchronization | StateSnapshot, StateDelta | [events](modules/events.md) lines 1067-1155 | -| Lifecycle | RunStarted/Finished, StepStarted/Finished | [events](modules/events.md) lines 715-754 | -| Human-in-the-loop interrupt | RunFinished(interrupt) | [drafts](modules/drafts.md) lines 3897-3920 | - -## Quick paths - -**Getting started**: -1. [overview](modules/overview.md) - **Understand the overall shape and positioning of the AG-UI protocol** -2. [protocol](modules/protocol.md) - Where AG-UI sits in the AI protocol stack (its relationship to MCP/A2A) -3. [architecture](modules/architecture.md) - Core concepts and design principles -4. [agents](modules/agents.md) - Basic agent implementation -5. 
Run [minimal_agent.ts](scripts/minimal_agent.ts) to try out the basic event flow - -**Implementing features**: -- Streaming responses → [events](modules/events.md) (TextMessage events) + [minimal_agent.ts](scripts/minimal_agent.ts) -- Tool calls → [tools](modules/tools.md) + [events](modules/events.md) (ToolCall events) + [tool_call_example.ts](scripts/tool_call_example.ts) -- State synchronization → [state](modules/state.md) + [events](modules/events.md) (StateDelta) + [state_sync_example.ts](scripts/state_sync_example.ts) -- Middleware → [middleware](modules/middleware.md) - -**Advanced features**: -- Human-in-the-loop collaboration → [drafts](modules/drafts.md) (Interrupts) -- Multimodal → [drafts](modules/drafts.md) (Multimodal Messages) -- Generative UI → [generative-ui](modules/generative-ui.md) + [drafts](modules/drafts.md) (Generative UI) -- Encrypted reasoning → [reasoning](modules/reasoning.md) - -## Suggested usage - -1. Start by reading [architecture](modules/architecture.md) to understand the core concepts -2. Consult specific modules as needed -3. For event types, see [events](modules/events.md) -4. For implementation details, see the corresponding feature module diff --git a/.opencode/skills/ag-ui/llms-full.txt b/.opencode/skills/ag-ui/llms-full.txt deleted file mode 100644 index b1ee653..0000000 --- a/.opencode/skills/ag-ui/llms-full.txt +++ /dev/null @@ -1,10632 +0,0 @@ -# MCP, A2A, and AG-UI -Source: https://docs.ag-ui.com/agentic-protocols - -Understanding how AG-UI complements and works with MCP and A2A - -## Agentic Protocols - -The agentic ecosystem is rapidly organizing around a family of open, complementary protocols, each addressing a distinct layer of interaction. AG-UI has emerged as the 3rd leg of the AI protocol landscape: - -
- AI Protocol Stack -
- -You can connect your application to agents directly via **AG-UI**, **MCP**, and **A2A**. - -* **MCP** (Model Context Protocol) Connects agents to tools and to context, but those tools are themselves becoming agentic. -* **A2A** (Agent to Agent) Connects agents to other agents. -* **AG-UI (Agent–User Interaction)** Connects agents to users (through user-facing applications). - - You can think of AG-UI as the **"kitchen sink" protocol**, informed by bottom-up, real-world needs for building best-in-class agentic applications. - -These three agentic protocols are complementary and have distinct technical goals; a single agent can and often does use all 3 simultaneously. - -## AG-UI Handshakes with MCP and A2A - -AG-UI contributors have recently added handshakes, allowing AG-UI to "front for" agents through MCP and A2A protocols, which allows AG-UI client apps and libraries to seamlessly use MCP and A2A supporting agents. - -AG-UI's mandate is to support the full set of building blocks required by modern agentic applications. - -## Generative UI Specs - -Recently several [generative ui specs](./concepts/generative-ui-specs) (including MCP-UI, Open JSON UI, and A2UI) have been released which allow agents to deliver UI widgets through the interaction protocols. AG-UI works with all of these. Visit our [generative ui specs page](./concepts/generative-ui-specs) to learn more. - - -# Agents -Source: https://docs.ag-ui.com/concepts/agents - -Learn about agents in the Agent User Interaction Protocol - -# Agents - -Agents are the core components in the AG-UI protocol that process requests and -generate responses. They establish a standardized way for front-end applications -to communicate with AI services through a consistent interface, regardless of -the underlying implementation. - -## What is an Agent? - -In AG-UI, an agent is a class that: - -1. Manages conversation state and message history -2. Processes incoming messages and context -3. 
Generates responses through an event-driven streaming interface -4. Follows a standardized protocol for communication - -Agents can be implemented to connect with any AI service, including: - -* Large language models (LLMs) like GPT-4 or Claude -* Custom AI systems -* Retrieval augmented generation (RAG) systems -* Multi-agent systems - -## Agent Architecture - -All agents in AG-UI extend the `AbstractAgent` class, which provides the -foundation for: - -* State management -* Message history tracking -* Event stream processing -* Tool usage - -```typescript theme={null} -import { AbstractAgent } from "@ag-ui/client" - -class MyAgent extends AbstractAgent { - run(input: RunAgentInput): RunAgent { - // Implementation details - } -} -``` - -### Core Components - -AG-UI agents have several key components: - -1. **Configuration**: Agent ID, thread ID, and initial state -2. **Messages**: Conversation history with user and assistant messages -3. **State**: Structured data that persists across interactions -4. **Events**: Standardized messages for communication with clients -5. **Tools**: Functions that agents can use to interact with external systems - -## Agent Types - -AG-UI provides different agent implementations to suit various needs: - -### AbstractAgent - -The base class that all agents extend. It handles core event processing, state -management, and message history. 
- -### HttpAgent - -A concrete implementation that connects to remote AI services via HTTP: - -```typescript theme={null} -import { HttpAgent } from "@ag-ui/client" - -const agent = new HttpAgent({ - url: "https://your-agent-endpoint.com/agent", - headers: { - Authorization: "Bearer your-api-key", - }, -}) -``` - -### Custom Agents - -You can create custom agents to integrate with any AI service by extending -`AbstractAgent`: - -```typescript theme={null} -class CustomAgent extends AbstractAgent { - // Custom properties and methods - - run(input: RunAgentInput): RunAgent { - // Implement the agent's logic - } -} -``` - -## Implementing Agents - -### Basic Implementation - -To create a custom agent, extend the `AbstractAgent` class and implement the -required `run` method: - -```typescript theme={null} -import { - AbstractAgent, - RunAgent, - RunAgentInput, - EventType, - BaseEvent, -} from "@ag-ui/client" -import { Observable } from "rxjs" - -class SimpleAgent extends AbstractAgent { - run(input: RunAgentInput): RunAgent { - const { threadId, runId } = input - - return () => - new Observable((observer) => { - // Emit RUN_STARTED event - observer.next({ - type: EventType.RUN_STARTED, - threadId, - runId, - }) - - // Send a message - const messageId = Date.now().toString() - - // Message start - observer.next({ - type: EventType.TEXT_MESSAGE_START, - messageId, - role: "assistant", - }) - - // Message content - observer.next({ - type: EventType.TEXT_MESSAGE_CONTENT, - messageId, - delta: "Hello, world!", - }) - - // Message end - observer.next({ - type: EventType.TEXT_MESSAGE_END, - messageId, - }) - - // Emit RUN_FINISHED event - observer.next({ - type: EventType.RUN_FINISHED, - threadId, - runId, - }) - - // Complete the observable - observer.complete() - }) - } -} -``` - -## Agent Capabilities - -Agents in the AG-UI protocol provide a rich set of capabilities that enable -sophisticated AI interactions: - -### Interactive Communication - -Agents establish 
bi-directional communication channels with front-end -applications through event streams. This enables: - -* Real-time streaming responses character-by-character -* Immediate feedback loops between user and AI -* Progress indicators for long-running operations -* Structured data exchange in both directions - -### Tool Usage - -Agents can use tools to perform actions and access external resources. -Importantly, tools are defined and passed in from the front-end application to -the agent, allowing for a flexible and extensible system: - -```typescript theme={null} -// Tool definition -const confirmAction = { - name: "confirmAction", - description: "Ask the user to confirm a specific action before proceeding", - parameters: { - type: "object", - properties: { - action: { - type: "string", - description: "The action that needs user confirmation", - }, - importance: { - type: "string", - enum: ["low", "medium", "high", "critical"], - description: "The importance level of the action", - }, - details: { - type: "string", - description: "Additional details about the action", - }, - }, - required: ["action"], - }, -} - -// Running an agent with tools from the frontend -agent.runAgent({ - tools: [confirmAction], // Frontend-defined tools passed to the agent - // other parameters -}) -``` - -Tools are invoked through a sequence of events: - -1. `TOOL_CALL_START`: Indicates the beginning of a tool call -2. `TOOL_CALL_ARGS`: Streams the arguments for the tool call -3. `TOOL_CALL_END`: Marks the completion of the tool call - -Front-end applications can then execute the tool and provide results back to the -agent. 
This bidirectional flow enables sophisticated human-in-the-loop workflows -where: - -* The agent can request specific actions be performed -* Humans can execute those actions with appropriate judgment -* Results are fed back to the agent for continued reasoning -* The agent maintains awareness of all decisions made in the process - -This mechanism is particularly powerful for implementing interfaces where AI and -humans collaborate. For example, [CopilotKit](https://docs.copilotkit.ai/) -leverages this exact pattern with their -[`useCopilotAction`](https://docs.copilotkit.ai/guides/frontend-actions) hook, -which provides a simplified way to define and handle tools in React -applications. - -By keeping the AI informed about human decisions through the tool mechanism, -applications can maintain context and create more natural collaborative -experiences between users and AI assistants. - -### State Management - -Agents maintain a structured state that persists across interactions. This state -can be: - -* Updated incrementally through `STATE_DELTA` events -* Completely refreshed with `STATE_SNAPSHOT` events -* Accessed by both the agent and front-end -* Used to store user preferences, conversation context, or application state - -```typescript theme={null} -// Accessing agent state -console.log(agent.state.preferences) - -// State is automatically updated during agent runs -agent.runAgent().subscribe((event) => { - if (event.type === EventType.STATE_DELTA) { - // State has been updated - console.log("New state:", agent.state) - } -}) -``` - -### Multi-Agent Collaboration - -AG-UI supports agent-to-agent handoff and collaboration: - -* Agents can delegate tasks to other specialized agents -* Multiple agents can work together in a coordinated workflow -* State and context can be transferred between agents -* The front-end maintains a consistent experience across agent transitions - -For example, a general assistant agent might hand off to a specialized coding -agent 
when programming help is needed, passing along the conversation context -and specific requirements. - -### Human-in-the-Loop Workflows - -Agents support human intervention and assistance: - -* Agents can request human input on specific decisions -* Front-ends can pause agent execution and resume it after human feedback -* Human experts can review and modify agent outputs before they're finalized -* Hybrid workflows combine AI efficiency with human judgment - -This enables applications where the agent acts as a collaborative partner rather -than an autonomous system. - -### Conversational Memory - -Agents maintain a complete history of conversation messages: - -* Past interactions inform future responses -* Message history is synchronized between client and server -* Messages can include rich content (text, structured data, references) -* The context window can be managed to focus on relevant information - -```typescript theme={null} -// Accessing message history -console.log(agent.messages) - -// Adding a new user message -agent.messages.push({ - id: "msg_123", - role: "user", - content: "Can you explain that in more detail?", -}) -``` - -### Metadata and Instrumentation - -Agents can emit metadata about their internal processes: - -* Reasoning steps through custom events -* Performance metrics and timing information -* Source citations and reference tracking -* Confidence scores for different response options - -This allows front-ends to provide transparency into the agent's decision-making -process and help users understand how conclusions were reached. 
- -## Using Agents - -Once you've implemented or instantiated an agent, you can use it like this: - -```typescript theme={null} -// Create an agent instance -const agent = new HttpAgent({ - url: "https://your-agent-endpoint.com/agent", -}) - -// Add initial messages if needed -agent.messages = [ - { - id: "1", - role: "user", - content: "Hello, how can you help me today?", - }, -] - -// Run the agent -agent - .runAgent({ - runId: "run_123", - tools: [], // Optional tools - context: [], // Optional context - }) - .subscribe({ - next: (event) => { - // Handle different event types - switch (event.type) { - case EventType.TEXT_MESSAGE_CONTENT: - console.log("Content:", event.delta) - break - // Handle other events - } - }, - error: (error) => console.error("Error:", error), - complete: () => console.log("Run complete"), - }) -``` - -## Agent Configuration - -Agents accept configuration through the constructor: - -```typescript theme={null} -interface AgentConfig { - agentId?: string // Unique identifier for the agent - description?: string // Human-readable description - threadId?: string // Conversation thread identifier - initialMessages?: Message[] // Initial messages - initialState?: State // Initial state object -} - -// Using the configuration -const agent = new HttpAgent({ - agentId: "my-agent-123", - description: "A helpful assistant", - threadId: "thread-456", - initialMessages: [ - { id: "1", role: "system", content: "You are a helpful assistant." }, - ], - initialState: { preferredLanguage: "English" }, -}) -``` - -## Agent State Management - -AG-UI agents maintain state across interactions: - -```typescript theme={null} -// Access current state -console.log(agent.state) - -// Access messages -console.log(agent.messages) - -// Clone an agent with its state -const clonedAgent = agent.clone() -``` - -## Conclusion - -Agents are the foundation of the AG-UI protocol, providing a standardized way to -connect front-end applications with AI services. 
By implementing the -`AbstractAgent` class, you can create custom integrations with any AI service -while maintaining a consistent interface for your applications. - -The event-driven architecture enables real-time, streaming interactions that are -essential for modern AI applications, and the standardized protocol ensures -compatibility across different implementations. - - -# Core architecture -Source: https://docs.ag-ui.com/concepts/architecture - -Understand how AG-UI connects front-end applications to AI agents - -Agent User Interaction Protocol (AG-UI) is built on a flexible, event-driven -architecture that enables seamless, efficient communication between front-end -applications and AI agents. This document covers the core architectural -components and concepts. - -## Design Principles - -AG-UI is designed to be lightweight and minimally opinionated, making it easy to -integrate with a wide range of agent implementations. The protocol's flexibility -comes from its simple requirements: - -1. **Event-Driven Communication**: Agents need to emit any of the 16 - standardized event types during execution, creating a stream of updates that - clients can process. - -2. **Bidirectional Interaction**: Agents accept input from users, enabling - collaborative workflows where humans and AI work together seamlessly. - -The protocol includes a built-in middleware layer that maximizes compatibility -in two key ways: - -* **Flexible Event Structure**: Events don't need to match AG-UI's format - exactly—they just need to be AG-UI-compatible. This allows existing agent - frameworks to adapt their native event formats with minimal effort. - -* **Transport Agnostic**: AG-UI doesn't mandate how events are delivered, - supporting various transport mechanisms including Server-Sent Events (SSE), - webhooks, WebSockets, and more. This flexibility lets developers choose the - transport that best fits their architecture. 
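As a rough illustration of the transport-agnostic point, the sketch below frames events for Server-Sent Events, one of the transports named above. The `SketchEvent` type and the `encodeSseFrame` and `decodeSseFrames` helpers are hypothetical names introduced here for illustration, not AG-UI SDK APIs; only the `data:` line followed by a blank line comes from the SSE format itself.

```typescript
// Hypothetical minimal event shape for illustration; the real AG-UI types live in the SDK.
type SketchEvent = { type: string; [key: string]: unknown }

// Encode one event as an SSE frame: a "data:" line holding JSON, then a blank line.
function encodeSseFrame(event: SketchEvent): string {
  return `data: ${JSON.stringify(event)}\n\n`
}

// Decode a buffer of complete SSE frames back into events, skipping non-data frames.
function decodeSseFrames(buffer: string): SketchEvent[] {
  return buffer
    .split("\n\n")
    .map((frame) => frame.trim())
    .filter((frame) => frame.startsWith("data:"))
    .map((frame) => JSON.parse(frame.slice("data:".length).trim()) as SketchEvent)
}

const events: SketchEvent[] = [
  { type: "RUN_STARTED", threadId: "t1", runId: "r1" },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "m1", delta: "Hello" },
  { type: "RUN_FINISHED", threadId: "t1", runId: "r1" },
]

// Simulate the wire: every event becomes one frame, then the client parses them back.
const wire = events.map((event) => encodeSseFrame(event)).join("")
const decoded = decodeSseFrames(wire)
console.log(decoded.map((event) => event.type).join(" -> "))
```

The same event objects could be carried over WebSockets or webhooks by swapping only the two framing helpers, which is the property the bullet above describes.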
- -This pragmatic approach makes AG-UI easy to adopt without requiring major -changes to existing agent implementations or frontend applications. - -## Architectural Overview - -AG-UI follows a client-server architecture that standardizes communication -between agents and applications: - -```mermaid theme={null} -flowchart LR - subgraph "Frontend" - App["Application"] - Client["AG-UI Client"] - end - - subgraph "Backend" - A1["AI Agent A"] - P["Secure Proxy"] - A2["AI Agent B"] - A3["AI Agent C"] - end - - App <--> Client - Client <-->|"AG-UI Protocol"| A1 - Client <-->|"AG-UI Protocol"| P - P <-->|"AG-UI Protocol"| A2 - P <-->|"AG-UI Protocol"| A3 - - class P mintStyle; - classDef mintStyle fill:#E0F7E9,stroke:#66BB6A,stroke-width:2px,color:#000000; - - style App rx:5, ry:5; - style Client rx:5, ry:5; - style A1 rx:5, ry:5; - style P rx:5, ry:5; - style A2 rx:5, ry:5; - style A3 rx:5, ry:5; -``` - -* **Application**: User-facing apps (e.g. chat or any AI-enabled application). -* **AG-UI Client**: Generic communication clients like `HttpAgent` or - specialized clients for connecting to existing protocols. -* **Agents**: Backend AI agents that process requests and generate streaming - responses. -* **Secure Proxy**: Backend services that provide additional capabilities and - act as a secure proxy. - -## Core components - -### Protocol layer - -AG-UI's protocol layer provides a flexible foundation for agent communication.
- -* **Universal compatibility**: Connect to any protocol by implementing - `run(input: RunAgentInput) -> Observable<BaseEvent>` - -The protocol's primary abstraction enables applications to run agents and -receive a stream of events: - -```typescript theme={null} -// Core agent execution interface -type RunAgent = () => Observable<BaseEvent> - -class MyAgent extends AbstractAgent { - run(input: RunAgentInput): RunAgent { - const { threadId, runId } = input - return () => - from([ - { type: EventType.RUN_STARTED, threadId, runId }, - { - type: EventType.MESSAGES_SNAPSHOT, - messages: [ - { id: "msg_1", role: "assistant", content: "Hello, world!" } - ], - }, - { type: EventType.RUN_FINISHED, threadId, runId }, - ]) - } -} -``` - -### Standard HTTP client - -AG-UI offers a standard HTTP client `HttpAgent` that can be used to connect to -any endpoint that accepts POST requests with a body of type `RunAgentInput` and -sends a stream of `BaseEvent` objects. - -`HttpAgent` supports the following transports: - -* **HTTP SSE (Server-Sent Events)** - - * Text-based streaming for wide compatibility - * Easy to read and debug - -* **HTTP binary protocol** - * Highly performant and space-efficient custom transport - * Robust binary serialization for production environments - -### Message types - -AG-UI defines several event categories for different aspects of agent -communication: - -* **Lifecycle events** - - * `RUN_STARTED`, `RUN_FINISHED`, `RUN_ERROR` - * `STEP_STARTED`, `STEP_FINISHED` - -* **Text message events** - - * `TEXT_MESSAGE_START`, `TEXT_MESSAGE_CONTENT`, `TEXT_MESSAGE_END` - -* **Tool call events** - - * `TOOL_CALL_START`, `TOOL_CALL_ARGS`, `TOOL_CALL_END` - -* **State management events** - - * `STATE_SNAPSHOT`, `STATE_DELTA`, `MESSAGES_SNAPSHOT` - -* **Special events** - * `RAW`, `CUSTOM` - -## Running Agents - -To run an agent, you create a client instance and execute it: - -```typescript theme={null} -// Create an HTTP agent client -const agent = new HttpAgent({ - url: 
"https://your-agent-endpoint.com/agent", - agentId: "unique-agent-id", - threadId: "conversation-thread" -}); - -// Start the agent and handle events -agent.runAgent({ - tools: [...], - context: [...] -}).subscribe({ - next: (event) => { - // Handle different event types - switch(event.type) { - case EventType.TEXT_MESSAGE_CONTENT: - // Update UI with new content - break; - // Handle other event types - } - }, - error: (error) => console.error("Agent error:", error), - complete: () => console.log("Agent run complete") -}); -``` - -## State Management - -AG-UI provides efficient state management through specialized events: - -* `STATE_SNAPSHOT`: Complete state representation at a point in time -* `STATE_DELTA`: Incremental state changes using JSON Patch format (RFC 6902) -* `MESSAGES_SNAPSHOT`: Complete conversation history - -These events enable efficient client-side state management with minimal data -transfer. - -## Tools and Handoff - -AG-UI supports agent-to-agent handoff and tool usage through standardized -events: - -* Tool definitions are passed in the `runAgent` parameters -* Tool calls are streamed as sequences of `TOOL_CALL_START` → `TOOL_CALL_ARGS` → - `TOOL_CALL_END` events -* Agents can hand off to other agents, maintaining context continuity - -## Events - -All communication in AG-UI is based on typed events. Every event inherits from -`BaseEvent`: - -```typescript theme={null} -interface BaseEvent { - type: EventType - timestamp?: number - rawEvent?: any -} -``` - -Events are strictly typed and validated, ensuring reliable communication between -components. - - -# Events -Source: https://docs.ag-ui.com/concepts/events - -Understanding events in the Agent User Interaction Protocol - -# Events - -The Agent User Interaction Protocol uses a streaming event-based architecture. -Events are the fundamental units of communication between agents and frontends, -enabling real-time, structured interaction. 
- -## Event Types Overview - -Events in the protocol are categorized by their purpose: - -| Category | Description | -| ----------------------- | --------------------------------------- | -| Lifecycle Events | Monitor the progression of agent runs | -| Text Message Events | Handle streaming textual content | -| Tool Call Events | Manage tool executions by agents | -| State Management Events | Synchronize state between agents and UI | -| Activity Events | Represent ongoing activity progress | -| Special Events | Support custom functionality | -| Draft Events | Proposed events under development | - -## Base Event Properties - -All events share a common set of base properties: - -| Property | Description | -| ----------- | ---------------------------------------------------------------- | -| `type` | The specific event type identifier | -| `timestamp` | Optional timestamp indicating when the event was created | -| `rawEvent` | Optional field containing the original event data if transformed | - -## Lifecycle Events - -These events represent the lifecycle of an agent run. A typical agent run -follows a predictable pattern: it begins with a `RunStarted` event, may contain -multiple optional `StepStarted`/`StepFinished` pairs, and concludes with either -a `RunFinished` event (success) or a `RunError` event (failure). - -Lifecycle events provide crucial structure to agent runs, enabling frontends to -track progress, manage UI states appropriately, and handle errors gracefully. -They create a consistent framework for understanding when operations begin and -end, making it possible to implement features like loading indicators, progress -tracking, and error recovery mechanisms. 
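The lifecycle pattern described above can be sketched as a small sequence check. This is a hypothetical helper rather than anything shipped with an AG-UI SDK: the `checkLifecycle` name, the local `LifecycleEvent` union, and the rule that a successful run should have closed every step it opened are all assumptions made for illustration.

```typescript
// Local, simplified event shapes (assumption); field names follow the text above.
type LifecycleEvent =
  | { type: "RUN_STARTED"; threadId: string; runId: string }
  | { type: "STEP_STARTED"; stepName: string }
  | { type: "STEP_FINISHED"; stepName: string }
  | { type: "RUN_FINISHED"; threadId: string; runId: string }
  | { type: "RUN_ERROR"; message: string; code?: string }

// Returns a list of problems; an empty list means the sequence looks well-formed.
function checkLifecycle(events: LifecycleEvent[]): string[] {
  const problems: string[] = []
  const openSteps: string[] = []
  const first = events[0]
  const last = events[events.length - 1]

  if (!first || first.type !== "RUN_STARTED") {
    problems.push("run must begin with RUN_STARTED")
  }
  if (!last || (last.type !== "RUN_FINISHED" && last.type !== "RUN_ERROR")) {
    problems.push("run must end with RUN_FINISHED or RUN_ERROR")
  }

  for (const event of events) {
    if (event.type === "STEP_STARTED") openSteps.push(event.stepName)
    if (event.type === "STEP_FINISHED") {
      // Match the most recently opened step with the same name.
      const index = openSteps.lastIndexOf(event.stepName)
      if (index === -1) problems.push(`STEP_FINISHED without STEP_STARTED: ${event.stepName}`)
      else openSteps.splice(index, 1)
    }
  }

  // Assumption: only a successful run is expected to have closed all of its steps.
  if (last && last.type === "RUN_FINISHED") {
    for (const name of openSteps) problems.push(`step never finished: ${name}`)
  }
  return problems
}

const problems = checkLifecycle([
  { type: "RUN_STARTED", threadId: "t1", runId: "r1" },
  { type: "STEP_STARTED", stepName: "plan" },
  { type: "STEP_FINISHED", stepName: "plan" },
  { type: "RUN_FINISHED", threadId: "t1", runId: "r1" },
])
console.log(problems.length === 0 ? "well-formed" : problems.join("; "))
```

A frontend could run a check like this in development builds to catch agents that never emit a terminal event, which would otherwise leave loading indicators stuck.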
- -```mermaid theme={null} -sequenceDiagram - participant Agent - participant Client - - Note over Agent,Client: Run begins - Agent->>Client: RunStarted - - opt Sending steps is optional - Note over Agent,Client: Step execution - Agent->>Client: StepStarted - Agent->>Client: StepFinished - end - - Note over Agent,Client: Run completes - alt - Agent->>Client: RunFinished - else - Agent->>Client: RunError - end -``` - -The `RunStarted` and either `RunFinished` or `RunError` events are mandatory, -forming the boundaries of an agent run. Step events are optional and may occur -multiple times within a run, allowing for structured, observable progress -tracking. - -### RunStarted - -Signals the start of an agent run. - -The `RunStarted` event is the first event emitted when an agent begins -processing a request. It establishes a new execution context identified by a -unique `runId`. This event serves as a marker for frontends to initialize UI -elements such as progress indicators or loading states. It also provides crucial -identifiers that can be used to associate subsequent events with this specific -run. - -| Property | Description | -| ------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `threadId` | ID of the conversation thread | -| `runId` | ID of the agent run | -| `parentRunId` | (Optional) Lineage pointer for branching/time travel. If present, refers to a prior run within the same thread, creating a git-like append-only log | -| `input` | (Optional) The exact agent input payload that was sent to the agent for this run. May omit messages already present in history; compactEvents() will normalize | - -### RunFinished - -Signals the successful completion of an agent run. - -The `RunFinished` event indicates that an agent has successfully completed all -its work for the current run. 
Upon receiving this event, frontends should -finalize any UI states that were waiting on the agent's completion. This event -marks a clean termination point and indicates that no further processing will -occur in this run unless explicitly requested. The optional `result` field can -contain any output data produced by the agent run. - -| Property | Description | -| ---------- | ----------------------------- | -| `threadId` | ID of the conversation thread | -| `runId` | ID of the agent run | -| `result` | Optional result data from run | - -### RunError - -Signals an error during an agent run. - -The `RunError` event indicates that the agent encountered an error it could not -recover from, causing the run to terminate prematurely. This event provides -information about what went wrong, allowing frontends to display appropriate -error messages and potentially offer recovery options. After a `RunError` event, -no further processing will occur in this run. - -| Property | Description | -| --------- | ------------------- | -| `message` | Error message | -| `code` | Optional error code | - -### StepStarted - -Signals the start of a step within an agent run. - -The `StepStarted` event indicates that the agent is beginning a specific subtask -or phase of its processing. Steps provide granular visibility into the agent's -progress, enabling more precise tracking and feedback in the UI. Steps are -optional but highly recommended for complex operations that benefit from being -broken down into observable stages. The `stepName` could be the name of a node -or function that is currently executing. - -| Property | Description | -| ---------- | ---------------- | -| `stepName` | Name of the step | - -### StepFinished - -Signals the completion of a step within an agent run. - -The `StepFinished` event indicates that the agent has completed a specific -subtask or phase. When paired with a corresponding `StepStarted` event, it -creates a bounded context for a discrete unit of work. 
Frontends can use these -events to update progress indicators, show completion animations, or reveal -results specific to that step. The `stepName` must match the corresponding -`StepStarted` event to properly pair the beginning and end of the step. - -| Property | Description | -| ---------- | ---------------- | -| `stepName` | Name of the step | - -## Text Message Events - -These events represent the lifecycle of text messages in a conversation. Text -message events follow a streaming pattern, where content is delivered -incrementally. A message begins with a `TextMessageStart` event, followed by one -or more `TextMessageContent` events that deliver chunks of text as they become -available, and concludes with a `TextMessageEnd` event. - -This streaming approach enables real-time display of message content as it's -generated, creating a more responsive user experience compared to waiting for -the entire message to be complete before showing anything. - -```mermaid theme={null} -sequenceDiagram - participant Agent - participant Client - - Note over Agent,Client: Message begins - Agent->>Client: TextMessageStart - - loop Content streaming - Agent->>Client: TextMessageContent - end - - Note over Agent,Client: Message completes - Agent->>Client: TextMessageEnd -``` - -The `TextMessageContent` events each contain a `delta` field with a chunk of -text. Frontends should concatenate these deltas in the order received to -construct the complete message. The `messageId` property links all related -events, allowing the frontend to associate content chunks with the correct -message. - -### TextMessageStart - -Signals the start of a text message. - -The `TextMessageStart` event initializes a new text message in the conversation. -It establishes a unique `messageId` that will be referenced by subsequent -content chunks and the end event. This event allows frontends to prepare the UI -for an incoming message, such as creating a new message bubble with a loading -indicator. 
The `role` property identifies whether the message is coming from the -assistant or potentially another participant in the conversation. - -| Property | Description | -| ----------- | ------------------------------------------------------------------------------- | -| `messageId` | Unique identifier for the message | -| `role` | Role of the message sender ("developer", "system", "assistant", "user", "tool") | - -### TextMessageContent - -Represents a chunk of content in a streaming text message. - -The `TextMessageContent` event delivers incremental parts of the message text as -they become available. Each event contains a small chunk of text in the `delta` -property that should be appended to previously received chunks. The streaming -nature of these events enables real-time display of content, creating a more -responsive and engaging user experience. Implementations should handle these -events efficiently to ensure smooth text rendering without visible delays or -flickering. - -| Property | Description | -| ----------- | -------------------------------------- | -| `messageId` | Matches the ID from `TextMessageStart` | -| `delta` | Text content chunk (non-empty) | - -### TextMessageEnd - -Signals the end of a text message. - -The `TextMessageEnd` event marks the completion of a streaming text message. -After receiving this event, the frontend knows that the message is complete and -no further content will be added. This allows the UI to finalize rendering, -remove any loading indicators, and potentially trigger actions that should occur -after message completion, such as enabling reply controls or performing -automatic scrolling to ensure the full message is visible. - -| Property | Description | -| ----------- | -------------------------------------- | -| `messageId` | Matches the ID from `TextMessageStart` | - -### TextMessageChunk - -Convenience event that expands to Start → Content → End automatically. 
- -The `TextMessageChunk` event lets you omit explicit `TextMessageStart` and -`TextMessageEnd` events. The client stream transformer expands chunks into the -standard triad: - -* First chunk for a message must include `messageId` and will emit - `TextMessageStart` (role defaults to `assistant` when not provided). -* Each chunk with a `delta` emits a `TextMessageContent` for the current - `messageId`. -* `TextMessageEnd` is emitted automatically when the stream switches to a new - message ID or when the stream completes. - -| Property | Description | -| ----------- | ------------------------------------------------------------------------------------ | -| `messageId` | Optional unique identifier for the message; required on the first chunk of a message | -| `role` | Optional role of the sender ("developer", "system", "assistant", "user") | -| `delta` | Optional text content of the message | - -## Tool Call Events - -These events represent the lifecycle of tool calls made by agents. Tool calls -follow a streaming pattern similar to text messages. When an agent needs to use -a tool, it emits a `ToolCallStart` event, followed by one or more `ToolCallArgs` -events that stream the arguments being passed to the tool, and concludes with a -`ToolCallEnd` event. - -This streaming approach allows frontends to show tool executions in real-time, -making the agent's actions transparent and providing immediate feedback about -what tools are being invoked and with what parameters. - -```mermaid theme={null} -sequenceDiagram - participant Agent - participant Client - - Note over Agent,Client: Tool call begins - Agent->>Client: ToolCallStart - - loop Arguments streaming - Agent->>Client: ToolCallArgs - end - - Note over Agent,Client: Tool call completes - Agent->>Client: ToolCallEnd - - Note over Agent,Client: Tool execution result - Agent->>Client: ToolCallResult -``` - -The `ToolCallArgs` events each contain a `delta` field with a chunk of the -arguments. 
Frontends should concatenate these deltas in the order received to -construct the complete arguments object. The `toolCallId` property links all -related events, allowing the frontend to associate argument chunks with the -correct tool call. - -### ToolCallStart - -Signals the start of a tool call. - -The `ToolCallStart` event indicates that the agent is invoking a tool to perform -a specific function. This event provides the name of the tool being called and -establishes a unique `toolCallId` that will be referenced by subsequent events -in this tool call. Frontends can use this event to display tool usage to users, -such as showing a notification that a specific operation is in progress. The -optional `parentMessageId` allows linking the tool call to a specific message in -the conversation, providing context for why the tool is being used. - -| Property | Description | -| ----------------- | ----------------------------------- | -| `toolCallId` | Unique identifier for the tool call | -| `toolCallName` | Name of the tool being called | -| `parentMessageId` | Optional ID of the parent message | - -### ToolCallArgs - -Represents a chunk of argument data for a tool call. - -The `ToolCallArgs` event delivers incremental parts of the tool's arguments as -they become available. Each event contains a segment of the argument data in the -`delta` property. These deltas are often JSON fragments that, when combined, -form the complete arguments object for the tool. Streaming the arguments is -particularly valuable for complex tool calls where constructing the full -arguments may take time. Frontends can progressively reveal these arguments to -users, providing insight into exactly what parameters are being passed to tools. - -| Property | Description | -| ------------ | ----------------------------------- | -| `toolCallId` | Matches the ID from `ToolCallStart` | -| `delta` | Argument data chunk | - -### ToolCallEnd - -Signals the end of a tool call. 
- -The `ToolCallEnd` event marks the completion of a tool call. After receiving -this event, the frontend knows that all arguments have been transmitted and the -tool execution is underway or completed. This allows the UI to finalize the tool -call display and prepare for potential results. In systems where tool execution -results are returned separately, this event indicates that the agent has -finished specifying the tool and its arguments, and is now waiting for or has -received the results. - -| Property | Description | -| ------------ | ----------------------------------- | -| `toolCallId` | Matches the ID from `ToolCallStart` | - -### ToolCallResult - -Provides the result of a tool call execution. - -The `ToolCallResult` event delivers the output or result from a tool that was -previously invoked by the agent. This event is sent after the tool has been -executed by the system and contains the actual output generated by the tool. -Unlike the streaming pattern of tool call specification (start, args, end), the -result is delivered as a complete unit since tool execution typically produces a -complete output. Frontends can use this event to display tool results to users, -append them to the conversation history, or trigger follow-up actions based on -the tool's output. - -| Property | Description | -| ------------ | ----------------------------------------------------------- | -| `messageId` | ID of the conversation message this result belongs to | -| `toolCallId` | Matches the ID from the corresponding `ToolCallStart` event | -| `content` | The actual result/output content from the tool execution | -| `role` | Optional role identifier, typically "tool" for tool results | - -### ToolCallChunk - -Convenience event that expands to Start → Args → End automatically. - -The `ToolCallChunk` event lets you omit explicit `ToolCallStart` and -`ToolCallEnd` events. 
The client stream transformer expands chunks into the -standard tool-call triad: - -* First chunk for a tool call must include `toolCallId` and `toolCallName` and - will emit `ToolCallStart` (propagating any `parentMessageId`). -* Each chunk with a `delta` emits a `ToolCallArgs` for the current `toolCallId`. -* `ToolCallEnd` is emitted automatically when the stream switches to a new - `toolCallId` or when the stream completes. - -| Property | Description | -| ----------------- | -------------------------------------------------------------------- | -| `toolCallId` | Optional on later chunks; required on the first chunk of a tool call | -| `toolCallName` | Optional on later chunks; required on the first chunk of a tool call | -| `parentMessageId` | Optional ID of the parent message | -| `delta` | Optional argument data chunk (often a JSON fragment) | - -## State Management Events - -These events are used to manage and synchronize the agent's state with the -frontend. State management in the protocol follows an efficient snapshot-delta -pattern where complete state snapshots are sent initially or infrequently, while -incremental updates (deltas) are used for ongoing changes. - -This approach optimizes for both completeness and efficiency: snapshots ensure -the frontend has the full state context, while deltas minimize data transfer for -frequent updates. Together, they enable frontends to maintain an accurate -representation of agent state without unnecessary data transmission. 
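
A minimal sketch of the snapshot-delta pattern follows, assuming state is a plain JSON object. The `applyOp` helper is a hypothetical, deliberately partial JSON Patch applier: it covers only `add`, `replace`, and `remove` on object members, and it is more lenient than RFC 6902 (for instance, it does not verify that a `replace` target already exists). A production client would more likely rely on a complete JSON Patch library.

```typescript
type State = Record<string, unknown>;

interface PatchOp {
  op: "add" | "replace" | "remove";
  path: string; // JSON Pointer such as "/user/name"
  value?: unknown;
}

// Hypothetical event shapes for this sketch only.
type StateEvent =
  | { type: "STATE_SNAPSHOT"; snapshot: State }
  | { type: "STATE_DELTA"; delta: PatchOp[] };

// Decode the JSON Pointer escapes "~1" (slash) and "~0" (tilde).
function unescapePointer(segment: string): string {
  return segment.replace(/~1/g, "/").replace(/~0/g, "~");
}

// Apply one operation to a deep copy of the state (objects only, no arrays).
function applyOp(state: State, op: PatchOp): State {
  const keys = op.path.split("/").slice(1).map(unescapePointer);
  const last = keys.pop();
  if (last === undefined) throw new Error("root-level patches are not handled here");
  const next = JSON.parse(JSON.stringify(state)) as State;
  let target: State = next;
  for (const key of keys) {
    const child = target[key];
    if (typeof child !== "object" || child === null) {
      throw new Error(`missing path segment: ${key}`);
    }
    target = child as State;
  }
  if (op.op === "remove") delete target[last];
  else target[last] = op.value;
  return next;
}

function reduceState(state: State, event: StateEvent): State {
  // A snapshot replaces the state outright; it is never merged.
  if (event.type === "STATE_SNAPSHOT") return event.snapshot;
  // Deltas are applied in the order they appear.
  return event.delta.reduce(applyOp, state);
}
```

If `applyOp` throws, the client's copy has probably diverged from the agent's, which is the point at which requesting a fresh snapshot makes sense.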
- -```mermaid theme={null} -sequenceDiagram - participant Agent - participant Client - - Note over Agent,Client: Initial state transfer - Agent->>Client: StateSnapshot - - Note over Agent,Client: Incremental updates - loop State changes over time - Agent->>Client: StateDelta - Agent->>Client: StateDelta - end - - Note over Agent,Client: Occasional full refresh - Agent->>Client: StateSnapshot - - loop More incremental updates - Agent->>Client: StateDelta - end - - Note over Agent,Client: Message history update - Agent->>Client: MessagesSnapshot -``` - -The combination of snapshots and deltas allows frontends to efficiently track -changes to agent state while ensuring consistency. Snapshots serve as -synchronization points that reset the state to a known baseline, while deltas -provide lightweight updates between snapshots. - -### StateSnapshot - -Provides a complete snapshot of an agent's state. - -The `StateSnapshot` event delivers a comprehensive representation of the agent's -current state. This event is typically sent at the beginning of an interaction -or when synchronization is needed. It contains all state variables relevant to -the frontend, allowing it to completely rebuild its internal representation. -Frontends should replace their existing state model with the contents of this -snapshot rather than trying to merge it with previous state. - -| Property | Description | -| ---------- | ----------------------- | -| `snapshot` | Complete state snapshot | - -### StateDelta - -Provides a partial update to an agent's state using JSON Patch. - -The `StateDelta` event contains incremental updates to the agent's state in the -form of JSON Patch operations (as defined in RFC 6902). Each delta represents -specific changes to apply to the current state model. This approach is -bandwidth-efficient, sending only what has changed rather than the entire state. -Frontends should apply these patches in sequence to maintain an accurate state -representation. 
If a frontend detects inconsistencies after applying patches, it -may request a fresh `StateSnapshot`. - -| Property | Description | -| -------- | ----------------------------------------- | -| `delta` | Array of JSON Patch operations (RFC 6902) | - -### MessagesSnapshot - -Provides a snapshot of all messages in a conversation. - -The `MessagesSnapshot` event delivers a complete history of messages in the -current conversation. Unlike the general state snapshot, this focuses -specifically on the conversation transcript. This event is useful for -initializing the chat history, synchronizing after connection interruptions, or -providing a comprehensive view when a user joins an ongoing conversation. -Frontends should use this to establish or refresh the conversational context -displayed to users. - -| Property | Description | -| ---------- | ------------------------ | -| `messages` | Array of message objects | - -## Activity Events - -Activity Events expose structured, in-progress activity updates that occur -between chat messages. They follow the same snapshot/delta pattern as the state -system so that UIs can render a complete activity view immediately and then -incrementally update it as new information arrives. - -### ActivitySnapshot - -Delivers a complete snapshot of an activity message. - -| Property | Description | -| -------------- | --------------------------------------------------------------------------------------------- | -| `messageId` | Identifier for the `ActivityMessage` this event updates | -| `activityType` | Activity discriminator (for example `"PLAN"`, `"SEARCH"`) | -| `content` | Structured JSON payload representing the full activity state | -| `replace` | Optional. Defaults to `true`. When `false`, ignore the snapshot if the message already exists | - -Frontends should either create a new `ActivityMessage` or replace the existing -one with the payload supplied by the snapshot. 
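
The `replace` flag can be illustrated with a short sketch. The `Map`-based store and the `applyActivitySnapshot` function are assumptions made for this example, not part of the protocol or any SDK.

```typescript
interface ActivityMessage {
  id: string;
  role: "activity";
  activityType: string;
  content: Record<string, unknown>;
}

// Hypothetical event shape mirroring the property table above.
interface ActivitySnapshotEvent {
  type: "ACTIVITY_SNAPSHOT";
  messageId: string;
  activityType: string;
  content: Record<string, unknown>;
  replace?: boolean; // defaults to true
}

// Create the activity message, or replace it unless replace is false
// and a message with that ID already exists.
function applyActivitySnapshot(
  store: Map<string, ActivityMessage>,
  event: ActivitySnapshotEvent
): void {
  const shouldReplace = event.replace ?? true;
  if (!shouldReplace && store.has(event.messageId)) return;
  store.set(event.messageId, {
    id: event.messageId,
    role: "activity",
    activityType: event.activityType,
    content: event.content,
  });
}
```

With `replace: false`, a late or repeated snapshot cannot overwrite an activity that the client already holds.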
- -### ActivityDelta - -Applies incremental updates to an existing activity using JSON Patch operations. - -| Property | Description | -| -------------- | ------------------------------------------------------------------------ | -| `messageId` | Identifier for the target activity message | -| `activityType` | Activity discriminator (mirrors the value from the most recent snapshot) | -| `patch` | Array of RFC 6902 JSON Patch operations to apply to the activity data | - -Activity deltas should be applied in order to the previously synchronized -activity content. If an application detects divergence, it can request or emit a -fresh `ActivitySnapshot` to resynchronize. - -## Special Events - -Special events provide flexibility in the protocol by allowing for -system-specific functionality and integration with external systems. These -events don't follow the standard lifecycle or streaming patterns of other event -types but instead serve specialized purposes. - -### Raw - -Used to pass through events from external systems. - -The `Raw` event acts as a container for events originating from external systems -or sources that don't natively follow the Agent UI Protocol. This event type -enables interoperability with other event-based systems by wrapping their events -in a standardized format. The enclosed event data is preserved in its original -form inside the `event` property, while the optional `source` property -identifies the system it came from. Frontends can use this information to handle -external events appropriately, either by processing them directly or by -delegating them to system-specific handlers. - -| Property | Description | -| -------- | -------------------------- | -| `event` | Original event data | -| `source` | Optional source identifier | - -### Custom - -Used for application-specific custom events. - -The `Custom` event provides an extension mechanism for implementing features not -covered by the standard event types. 
Unlike `Raw` events which act as -passthrough containers, `Custom` events are explicitly part of the protocol but -with application-defined semantics. The `name` property identifies the specific -custom event type, while the `value` property contains the associated data. This -mechanism allows for protocol extensions without requiring formal specification -changes. Teams should document their custom events to ensure consistent -implementation across frontends and agents. - -| Property | Description | -| -------- | ------------------------------- | -| `name` | Name of the custom event | -| `value` | Value associated with the event | - -## Reasoning Events - -Reasoning events support LLM reasoning visibility and continuity, enabling -chain-of-thought reasoning while maintaining privacy. These events allow agents -to surface reasoning signals (e.g., summaries) and support encrypted reasoning -items for state carry-over across turns, especially under `store:false` or zero -data retention (ZDR) policies, without exposing raw chain-of-thought. - -See -[OpenAI ZDR documentation](https://developers.openai.com/cookbook/examples/responses_api/reasoning_items/#encrypted-reasoning-items), -[OpenAI store parameter documentation](https://platform.openai.com/docs/api-reference/responses/create#responses_create-store), -and -[Gemini Thought Signatures](https://ai.google.dev/gemini-api/docs/thought-signatures) -for the underlying concept of encrypted reasoning items, which inspired this -design. - -See [Reasoning](/concepts/reasoning) for comprehensive documentation including -privacy considerations, compliance guidance, and implementation examples. 
- -```mermaid theme={null} -sequenceDiagram - participant Agent - participant Client - - Note over Agent,Client: Reasoning begins - Agent->>Client: ReasoningStart - - Note over Agent,Client: Stream reasoning content - Agent->>Client: ReasoningMessageStart - Agent->>Client: ReasoningMessageContent - Agent->>Client: ReasoningMessageEnd - - Note over Agent,Client: Reasoning completes - Agent->>Client: ReasoningEnd -``` - -### ReasoningStart - -Marks the start of reasoning. - -The `ReasoningStart` event signals that the agent is beginning a reasoning -process. It establishes a reasoning context identified by a unique `messageId`. - -| Property | Description | -| ----------- | ----------------------------------- | -| `messageId` | Unique identifier of this reasoning | - -### ReasoningMessageStart - -Signals the start of a reasoning message. - -The `ReasoningMessageStart` event begins a streaming reasoning message. This -message will contain the visible portion of the agent's reasoning that should be -displayed to users (e.g., a summary or partial chain-of-thought). - -| Property | Description | -| ----------- | --------------------------------------------- | -| `messageId` | Unique identifier of the message | -| `role` | Role of the reasoning message (`"assistant"`) | - -### ReasoningMessageContent - -Represents a chunk of content in a streaming reasoning message. - -The `ReasoningMessageContent` event delivers incremental reasoning content to -the client. Multiple content events with the same `messageId` should be -concatenated to form the complete visible reasoning. - -| Property | Description | -| ----------- | ------------------------------------------ | -| `messageId` | Matches ID from ReasoningMessageStart | -| `delta` | Reasoning content chunk (non-empty string) | - -### ReasoningMessageEnd - -Signals the end of a reasoning message. - -The `ReasoningMessageEnd` event indicates that all content for the specified -reasoning message has been sent. 
Clients should finalize any UI representing -this reasoning message. - -| Property | Description | -| ----------- | ------------------------------------- | -| `messageId` | Matches ID from ReasoningMessageStart | - -### ReasoningMessageChunk - -A convenience event to auto start/close reasoning messages. - -The `ReasoningMessageChunk` event simplifies implementation by automatically -managing message lifecycle. The first chunk with a `messageId` implicitly starts -the message. An empty `delta` or the next non-reasoning event implicitly closes -the message. - -| Property | Description | -| ----------- | --------------------------------------------------------- | -| `messageId` | Message ID (first event must be non-empty) | -| `delta` | Reasoning content chunk (empty string closes the message) | - -### ReasoningEnd - -Marks the end of reasoning. - -The `ReasoningEnd` event signals that the agent has completed its reasoning -process for the given context. No further reasoning events with the same -`messageId` should be expected after this event. - -| Property | Description | -| ----------- | ----------------------------------- | -| `messageId` | Unique identifier of this reasoning | - -### ReasoningEncryptedValue - -Attaches encrypted chain-of-thought reasoning to a message or tool call. - -The `ReasoningEncryptedValue` event carries encrypted reasoning content that -represents the LLM's internal chain-of-thought related to a specific entity. -This allows the agent to preserve reasoning state across conversation turns -without exposing the raw content to the client. The client stores and forwards -these encrypted values opaquely—only the agent (or authorized backend) can -decrypt them. 
- -| Property | Description | -| ---------------- | -------------------------------------------------------- | -| `subtype` | Entity type: `"message"` or `"tool-call"` | -| `entityId` | ID of the message or tool call this reasoning belongs to | -| `encryptedValue` | Encrypted chain-of-thought content blob | - -Use cases: - -* **Message reasoning**: Attach encrypted reasoning to an `AssistantMessage` or - `ReasoningMessage` to preserve context for follow-up turns -* **Tool call reasoning**: Attach encrypted reasoning to a tool call to capture - why the agent chose specific arguments or how it interpreted results - -## Deprecated Events - - - The following events are deprecated and will be removed in version 1.0.0. Use - the corresponding Reasoning events instead. - - -### Thinking Events (Deprecated) - -The `THINKING_*` events have been replaced by `REASONING_*` events: - -| Deprecated Event | Replacement | -| ------------------------------- | --------------------------- | -| `THINKING_START` | `REASONING_START` | -| `THINKING_END` | `REASONING_END` | -| `THINKING_TEXT_MESSAGE_START` | `REASONING_MESSAGE_START` | -| `THINKING_TEXT_MESSAGE_CONTENT` | `REASONING_MESSAGE_CONTENT` | -| `THINKING_TEXT_MESSAGE_END` | `REASONING_MESSAGE_END` | - -See [Reasoning Migration](/concepts/reasoning#migration-from-thinking-events) -for detailed migration guidance. - -## Draft Events - -These events are currently in draft status and may change before finalization. -They represent proposed extensions to the protocol that are under active -development and discussion. - -### Meta Events - - - DRAFT - - -[View Proposal](/drafts/meta-events) - -Meta events provide annotations and signals independent of agent runs, such as -user feedback or external system events. - -#### MetaEvent - -A side-band annotation event that can occur anywhere in the stream. 
- -| Property | Description | -| ---------- | ---------------------------------------------------- | -| `metaType` | Application-defined type (e.g., "thumbs\_up", "tag") | -| `payload` | Application-defined payload | - -### Modified Lifecycle Events - - - DRAFT - - -[View Proposal](/drafts/interrupts) - -Extensions to existing lifecycle events to support interrupts and branching. - -#### RunFinished (Extended) - -The `RunFinished` event gains new fields to support interrupt-aware workflows. - -| Property | Description | -| ----------- | ------------------------------------------------ | -| `outcome` | Optional: "success" or "interrupt" | -| `interrupt` | Optional: Contains interrupt details when paused | - -See [Serialization](/concepts/serialization) for lineage and input capture. - -#### RunStarted (Extended) - -The `RunStarted` event gains new fields to support branching and input tracking. - -| Property | Description | -| ------------- | ------------------------------------------------- | -| `parentRunId` | Optional: Parent run ID for branching/time travel | -| `input` | Optional: The exact agent input for this run | - -## Event Flow Patterns - -Events in the protocol typically follow specific patterns: - -1. **Start-Content-End Pattern**: Used for streaming content (text messages, - tool calls) - * `Start` event initiates the stream - * `Content` events deliver data chunks - * `End` event signals completion - -2. **Snapshot-Delta Pattern**: Used for state synchronization - * `Snapshot` provides complete state - * `Delta` events provide incremental updates - -3. 
**Lifecycle Pattern**: Used for monitoring agent runs - * `Started` events signal beginnings - * `Finished`/`Error` events signal endings - -## Implementation Considerations - -When implementing event handlers: - -* Events should be processed in the order they are received -* Events with the same ID (e.g., `messageId`, `toolCallId`) belong to the same - logical stream -* Implementations should be resilient to out-of-order delivery -* Custom events should follow the established patterns for consistency - - -# Generative UI -Source: https://docs.ag-ui.com/concepts/generative-ui-specs - -Understanding AG-UI's relationship with generative UI specifications - -## **AG-UI and Generative UI Specs** - -Several recently released specs have enabled agents to return generative UI, increasing the power and flexibility of the Agent\<->User conversation. - -A2UI, MCP-UI, and Open-JSON-UI are all **generative UI specifications.** Generative UIs allow agents to respond to users not only with text but also with dynamic UI components. - -| **Specification** | **Origin / Maintainer** | **Purpose** | -| ----------------- | ----------------------- | -------------------------------------------------------------------------------------------------------------------- | -| **A2UI** | Google | A declarative, LLM-friendly Generative UI spec. JSONL-based and streaming, designed for platform-agnostic rendering. | -| **Open-JSON-UI** | OpenAI | An open standardization of OpenAI's internal declarative Generative UI schema. | -| **MCP-UI** | Microsoft + Shopify | A fully open, iframe-based Generative UI standard extending MCP for user-facing experiences. | - -Despite the naming similarities, **AG-UI is not a generative UI specification** — it's a **User Interaction protocol** that provides the **bi-directional runtime connection** between the agent and the application. 
- -AG-UI natively supports all of the above generative UI specs and allows developers to define **their own custom generative UI standards** as well. - - -# Messages -Source: https://docs.ag-ui.com/concepts/messages - -Understanding message structure and communication in AG-UI - -# Messages - -Messages form the backbone of communication in the AG-UI protocol. They -represent the conversation history between users and AI agents, and provide a -standardized way to exchange information regardless of the underlying AI service -being used. - -## Message Structure - -AG-UI messages follow a vendor-neutral format, ensuring compatibility across -different AI providers while maintaining a consistent structure. This allows -applications to switch between AI services (like OpenAI, Anthropic, or custom -models) without changing the client-side implementation. - -The basic message structure includes: - -```typescript theme={null} -interface BaseMessage { - id: string // Unique identifier for the message - role: string // The role of the sender (user, assistant, system, tool, reasoning) - content?: string // Optional text content of the message - name?: string // Optional name of the sender - encryptedContent?: string // Optional encrypted content for privacy-preserving state continuity -} -``` - -The `role` discriminator can be `"user"`, `"assistant"`, `"system"`, `"tool"`, -`"developer"`, `"activity"`, or `"reasoning"`. Concrete message types extend -this shape with the fields they need. - -> The `encryptedContent` field enables privacy-preserving workflows where -> sensitive content (such as reasoning chains) can be passed across turns -> without exposing the raw content. This is particularly useful for zero data -> retention (ZDR) compliance and `store:false` scenarios. 
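
Because `role` acts as the discriminator, a client that receives untyped JSON may want to check it before treating an object as a message. The sketch below is one way to do that; `KNOWN_ROLES`, `Role`, and `parseRole` are hypothetical names, and the list simply mirrors the roles named above.

```typescript
// Roles listed in the message structure section above.
const KNOWN_ROLES = [
  "user",
  "assistant",
  "system",
  "tool",
  "developer",
  "activity",
  "reasoning",
] as const;

type Role = (typeof KNOWN_ROLES)[number];

// Return the role as a typed value, or undefined when it is not recognized.
function parseRole(value: string): Role | undefined {
  return KNOWN_ROLES.find((role) => role === value);
}
```

Returning `undefined` instead of throwing leaves it to the caller to decide whether an unknown role is an error or something to skip.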
- -## Message Types - -AG-UI supports several message types to accommodate different participants in a -conversation: - -### User Messages - -Messages from the end user to the agent: - -```typescript theme={null} -interface UserMessage { - id: string - role: "user" - content: string | InputContent[] // Text or multimodal input from the user - name?: string // Optional user identifier -} - -type InputContent = TextInputContent | BinaryInputContent - -interface TextInputContent { - type: "text" - text: string -} - -interface BinaryInputContent { - type: "binary" - mimeType: string - id?: string - url?: string - data?: string - filename?: string -} -``` - -> For `BinaryInputContent`, provide at least one of `id`, `url`, or `data` to -> reference the payload. - -This structure keeps traditional plain-text inputs working while enabling richer -payloads such as images, audio clips, or uploaded files in the same message. - -### Assistant Messages - -Messages from the AI assistant to the user: - -```typescript theme={null} -interface AssistantMessage { - id: string - role: "assistant" - content?: string // Text response from the assistant (optional if using tool calls) - name?: string // Optional assistant identifier - toolCalls?: ToolCall[] // Optional tool calls made by the assistant - encryptedContent?: string // Optional encrypted content for state continuity -} -``` - -### System Messages - -Instructions or context provided to the agent: - -```typescript theme={null} -interface SystemMessage { - id: string - role: "system" - content: string // Instructions or context for the agent - name?: string // Optional identifier -} -``` - -### Tool Messages - -Results from tool executions: - -```typescript theme={null} -interface ToolMessage { - id: string - role: "tool" - content: string // Result from the tool execution - toolCallId: string // ID of the tool call this message responds to - error?: string // Optional error message if the tool execution failed - 
 encryptedValue?: string // Optional encrypted reasoning for state continuity -} -``` - -Key points: - -* The `toolCallId` links the result back to the original tool call -* Use `error` to indicate tool execution failures -* Use `encryptedValue` to attach encrypted chain-of-thought related to how the - agent interpreted or processed the tool result - -### Activity Messages - -Structured UI messages that exist only on the frontend. Used for progress, -status, or any custom visual element that shouldn’t be sent to the model: - -```typescript theme={null} -interface ActivityMessage { - id: string - role: "activity" - activityType: string // e.g. "PLAN", "SEARCH", "SCRAPE" - content: Record<string, any> // Structured payload rendered by the frontend -} -``` - -Key points: - -* Emitted via `ACTIVITY_SNAPSHOT` and `ACTIVITY_DELTA` to support live, - updateable UI (checklists, steps, search-in-progress, etc.). -* **Frontend-only:** never forwarded to the agent, so no filtering and no LLM - confusion. -* **Customizable:** define your own `activityType` and `content` and render a - matching UI component. -* **Streamable:** can be updated over time for long-running operations. -* Helps persist/restore custom events by turning them into durable message - objects. 
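To illustrate the "Customizable" point, here is a minimal sketch of dispatching on `activityType` to pick a renderer. The `ActivityLike` shape, the `renderers` registry, and `renderActivity` are hypothetical names; a real frontend would return UI components rather than strings, which are used here only to keep the sketch self-contained.

```typescript
// Hypothetical shape mirroring ActivityMessage (illustration only).
interface ActivityLike {
  id: string
  role: "activity"
  activityType: string
  content: Record<string, unknown>
}

type ActivityRenderer = (content: Record<string, unknown>) => string

// Registry of renderers keyed by activityType; "SEARCH" and "PLAN" are example types.
const renderers: Record<string, ActivityRenderer> = {
  SEARCH: (c) => `Searching for "${String(c.query)}"...`,
  PLAN: (c) => `Plan with ${Array.isArray(c.steps) ? c.steps.length : 0} step(s)`,
}

// Fall back to a generic label when no renderer is registered for the type.
function renderActivity(msg: ActivityLike): string {
  const render = renderers[msg.activityType]
  return render ? render(msg.content) : `[${msg.activityType}]`
}
```

Keeping the registry keyed by `activityType` means new activity kinds can be added without touching the dispatch function.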
- -### Developer Messages - -Internal messages used for development or debugging: - -```typescript theme={null} -interface DeveloperMessage { - id: string - role: "developer" - content: string - name?: string -} -``` - -### Reasoning Messages - -Messages representing the agent's internal reasoning or chain-of-thought -process: - -```typescript theme={null} -interface ReasoningMessage { - id: string - role: "reasoning" - content: string // Reasoning content (visible to client) - encryptedValue?: string // Optional encrypted reasoning for state continuity -} -``` - - - Unlike Activity messages, Reasoning messages are intended to represent the - agent's internal thought process and may be encrypted for privacy and are - meant to be sent back to the agent for further processing on subsequent turns. - - -Key points: - -* Emitted via `REASONING_MESSAGE_START`, `REASONING_MESSAGE_CONTENT`, and - `REASONING_MESSAGE_END` events. -* **Visibility control:** Content may be visible to users (as a summary) or - fully encrypted. -* **Encrypted values:** Use `REASONING_ENCRYPTED_VALUE` events to attach - encrypted chain-of-thought to messages or tool calls without exposing content. -* **State continuity:** Encrypted reasoning items can be passed across - conversation turns without exposing raw chain-of-thought. -* **Privacy-first:** Supports `store:false` and zero data retention (ZDR) - policies while preserving reasoning capabilities. -* **Separate from assistant messages:** Reasoning is kept distinct from final - responses to avoid polluting the conversation history. - -See [Reasoning Events](/concepts/events#reasoning-events) for the streaming -event lifecycle. 
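The key points above can be sketched as a small reducer that folds a reasoning event stream into `ReasoningMessage`-shaped objects. This is a hedged illustration, not the SDK's implementation: the event shapes are simplified local types, the field names (`messageId`, `delta`, `subtype`, `entityId`, `encryptedValue`) follow the reasoning event examples in this documentation, and `foldReasoning` is a hypothetical helper.

```typescript
// Simplified local event shapes (assumption: field names as used in the reasoning examples).
type ReasoningStreamEvent =
  | { type: "REASONING_MESSAGE_START"; messageId: string }
  | { type: "REASONING_MESSAGE_CONTENT"; messageId: string; delta: string }
  | { type: "REASONING_MESSAGE_END"; messageId: string }
  | {
      type: "REASONING_ENCRYPTED_VALUE"
      subtype: "message" | "tool-call"
      entityId: string
      encryptedValue: string
    }

interface ReasoningMessageLike {
  id: string
  role: "reasoning"
  content: string
  encryptedValue?: string
}

// Hypothetical helper: accumulate deltas per messageId and attach encrypted values.
function foldReasoning(events: ReasoningStreamEvent[]): ReasoningMessageLike[] {
  const byId = new Map<string, ReasoningMessageLike>()
  for (const event of events) {
    switch (event.type) {
      case "REASONING_MESSAGE_START":
        byId.set(event.messageId, { id: event.messageId, role: "reasoning", content: "" })
        break
      case "REASONING_MESSAGE_CONTENT": {
        const msg = byId.get(event.messageId)
        if (msg) msg.content += event.delta
        break
      }
      case "REASONING_ENCRYPTED_VALUE": {
        // Only message-targeted values are attached; tool-call values are ignored in this sketch.
        const msg = event.subtype === "message" ? byId.get(event.entityId) : undefined
        if (msg) msg.encryptedValue = event.encryptedValue
        break
      }
      case "REASONING_MESSAGE_END":
        break // Nothing to do: the message is already complete in the map.
    }
  }
  return Array.from(byId.values())
}
```

The client never inspects `encryptedValue`; it only stores the blob alongside the visible summary so both can be sent back on a later turn.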
- -## Vendor Neutrality - -AG-UI messages are designed to be vendor-neutral, meaning they can be easily -mapped to and from proprietary formats used by various AI providers: - -```typescript theme={null} -// Example: Converting AG-UI messages to OpenAI format -const openaiMessages = agUiMessages - .filter((msg) => ["user", "system", "assistant"].includes(msg.role)) - .map((msg) => ({ - role: msg.role as "user" | "system" | "assistant", - content: msg.content || "", - // Map tool calls if present - ...(msg.role === "assistant" && msg.toolCalls - ? { - tool_calls: msg.toolCalls.map((tc) => ({ - id: tc.id, - type: tc.type, - function: { - name: tc.function.name, - arguments: tc.function.arguments, - }, - })), - } - : {}), - })) -``` - -This abstraction allows AG-UI to serve as a common interface regardless of the -underlying AI service. - -## Message Synchronization - -Messages can be synchronized between client and server through two primary -mechanisms: - -### Complete Snapshots - -The `MESSAGES_SNAPSHOT` event provides a complete view of all messages in a -conversation: - -```typescript theme={null} -interface MessagesSnapshotEvent { - type: EventType.MESSAGES_SNAPSHOT - messages: Message[] // Complete array of all messages -} -``` - -This is typically used: - -* When initializing a conversation -* After connection interruptions -* When major state changes occur -* To ensure client-server synchronization - -### Streaming Messages - -For real-time interactions, new messages can be streamed as they're generated: - -1. **Start a message**: Indicate a new message is being created - - ```typescript theme={null} - interface TextMessageStartEvent { - type: EventType.TEXT_MESSAGE_START - messageId: string - role: string - } - ``` - -2. 
**Stream content**: Send content chunks as they become available - - ```typescript theme={null} - interface TextMessageContentEvent { - type: EventType.TEXT_MESSAGE_CONTENT - messageId: string - delta: string // Text chunk to append - } - ``` - -3. **End a message**: Signal the message is complete - ```typescript theme={null} - interface TextMessageEndEvent { - type: EventType.TEXT_MESSAGE_END - messageId: string - } - ``` - -This streaming approach provides a responsive user experience with immediate -feedback. - -## Tool Integration in Messages - -AG-UI messages elegantly integrate tool usage, allowing agents to perform -actions and process their results: - -### Tool Calls - -Tool calls are embedded within assistant messages: - -```typescript theme={null} -interface ToolCall { - id: string // Unique ID for this tool call - type: "function" // Type of tool call - function: { - name: string // Name of the function to call - arguments: string // JSON-encoded string of arguments - } -} -``` - -Example assistant message with tool calls: - -```typescript theme={null} -{ - id: "msg_123", - role: "assistant", - content: "I'll help you with that calculation.", - toolCalls: [ - { - id: "call_456", - type: "function", - function: { - name: "calculate", - arguments: '{"expression": "24 * 7"}' - } - } - ] -} -``` - -### Tool Results - -Results from tool executions are represented as tool messages: - -```typescript theme={null} -{ - id: "result_789", - role: "tool", - content: "168", - toolCallId: "call_456" // References the original tool call -} -``` - -This creates a clear chain of tool usage: - -1. Assistant requests a tool call -2. Tool executes and returns a result -3. Assistant can reference and respond to the result - -## Streaming Tool Calls - -Similar to text messages, tool calls can be streamed to provide real-time -visibility into the agent's actions: - -1. 
**Start a tool call**: - - ```typescript theme={null} - interface ToolCallStartEvent { - type: EventType.TOOL_CALL_START - toolCallId: string - toolCallName: string - parentMessageId?: string // Optional link to parent message - } - ``` - -2. **Stream arguments**: - - ```typescript theme={null} - interface ToolCallArgsEvent { - type: EventType.TOOL_CALL_ARGS - toolCallId: string - delta: string // JSON fragment to append to arguments - } - ``` - -3. **End a tool call**: - ```typescript theme={null} - interface ToolCallEndEvent { - type: EventType.TOOL_CALL_END - toolCallId: string - } - ``` - -This allows frontends to show tools being invoked progressively as the agent -constructs its reasoning. - -## Practical Example - -Here's a complete example of a conversation with tool usage: - -```typescript theme={null} -// Conversation history -;[ - // User query - { - id: "msg_1", - role: "user", - content: "What's the weather in New York?", - }, - - // Assistant response with tool call - { - id: "msg_2", - role: "assistant", - content: "Let me check the weather for you.", - toolCalls: [ - { - id: "call_1", - type: "function", - function: { - name: "get_weather", - arguments: '{"location": "New York", "unit": "celsius"}', - }, - }, - ], - }, - - // Tool result - { - id: "result_1", - role: "tool", - content: - '{"temperature": 22, "condition": "Partly Cloudy", "humidity": 65}', - toolCallId: "call_1", - }, - - // Assistant's final response using tool results - { - id: "msg_3", - role: "assistant", - content: - "The weather in New York is partly cloudy with a temperature of 22°C and 65% humidity.", - }, -] -``` - -## Conclusion - -The message structure in AG-UI enables sophisticated conversational AI -experiences while maintaining vendor neutrality. By standardizing how messages -are represented, synchronized, and streamed, AG-UI provides a consistent way to -implement interactive human-agent communication regardless of the underlying AI -service. 
- -This system supports everything from simple text exchanges to complex tool-based -workflows, all while optimizing for both real-time responsiveness and efficient -data transfer. - - -# Middleware -Source: https://docs.ag-ui.com/concepts/middleware - -Transform and intercept events in AG-UI agents - -# Middleware - -Middleware in AG-UI provides a powerful way to transform, filter, and augment the event streams that flow through agents. It enables you to add cross-cutting concerns like logging, authentication, rate limiting, and event filtering without modifying the core agent logic. - -Examples below assume the relevant RxJS operators/utilities (`map`, `tap`, `catchError`, `switchMap`, `timer`, etc.) are imported. - -## What is Middleware? - -Middleware sits between the agent execution and the event consumer, allowing you to: - -1. **Transform events** – Modify or enhance events as they flow through the pipeline -2. **Filter events** – Selectively allow or block certain events -3. **Add metadata** – Inject additional context or tracking information -4. **Handle errors** – Implement custom error recovery strategies -5. **Monitor execution** – Add logging, metrics, or debugging capabilities - -## How Middleware Works - -Middleware forms a chain where each middleware wraps the next, creating layers of functionality. When an agent runs, the event stream flows through each middleware in sequence. - -```typescript theme={null} -import { AbstractAgent } from "@ag-ui/client" - -const agent = new MyAgent() - -// Middleware chain: logging -> auth -> filter -> agent -agent.use(loggingMiddleware, authMiddleware, filterMiddleware) - -// When agent runs, events flow through all middleware -await agent.runAgent() -``` - -Middleware added with `agent.use(...)` is applied in `runAgent()`. `connectAgent()` currently calls `connect()` directly and does not run middleware. - -## Function-Based Middleware - -For simple transformations, you can use function-based middleware. 
This is the most concise way to add middleware: - -```typescript theme={null} -import { MiddlewareFunction } from "@ag-ui/client" -import { EventType } from "@ag-ui/core" -import { map } from "rxjs/operators" - -const prefixMiddleware: MiddlewareFunction = (input, next) => { - return next.run(input).pipe( - map(event => { - if ( - event.type === EventType.TEXT_MESSAGE_CHUNK || - event.type === EventType.TEXT_MESSAGE_CONTENT - ) { - return { - ...event, - delta: `[AI]: ${event.delta}` - } - } - return event - }) - ) -} - -agent.use(prefixMiddleware) -``` - -## Class-Based Middleware - -For more complex scenarios requiring state or configuration, use class-based middleware: - -```typescript theme={null} -import { AbstractAgent, Middleware } from "@ag-ui/client" -import { BaseEvent, RunAgentInput } from "@ag-ui/core" -import { Observable } from "rxjs" -import { finalize, tap } from "rxjs/operators" - -// MetricsService is an application-defined type, not part of AG-UI -class MetricsMiddleware extends Middleware { - private eventCount = 0 - - constructor(private metricsService: MetricsService) { - super() - } - - run(input: RunAgentInput, next: AbstractAgent): Observable<BaseEvent> { - const startTime = Date.now() - - return this.runNext(input, next).pipe( - tap(event => { - this.eventCount++ - this.metricsService.recordEvent(event.type) - }), - finalize(() => { - const duration = Date.now() - startTime - this.metricsService.recordDuration(duration) - this.metricsService.recordEventCount(this.eventCount) - }) - ) - } -} - -agent.use(new MetricsMiddleware(metricsService)) -``` - -If you are writing class middleware, prefer the helper methods: - -* `runNext(input, next)` normalizes chunk events into full `TEXT_MESSAGE_*`/`TOOL_CALL_*` sequences. -* `runNextWithState(input, next)` also provides accumulated `messages` and `state` after each event. 
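To make "normalizes chunk events into full sequences" more concrete, here is a simplified model of the idea over a plain array. It is an illustration under stated assumptions, not the library's `runNext` implementation: `expandChunks` and `SketchEvent` are hypothetical, the role defaults to `"assistant"`, and an open message is assumed to close when a different `messageId`, a non-chunk event, or the end of the stream is reached.

```typescript
// Simplified local event shapes (illustration only, not the SDK types).
type SketchEvent =
  | { type: "TEXT_MESSAGE_CHUNK"; messageId: string; delta: string }
  | { type: "TEXT_MESSAGE_START"; messageId: string; role: string }
  | { type: "TEXT_MESSAGE_CONTENT"; messageId: string; delta: string }
  | { type: "TEXT_MESSAGE_END"; messageId: string }
  | { type: "RUN_FINISHED" }

// Hypothetical helper: expand chunk events into explicit START/CONTENT/END triples.
function expandChunks(events: SketchEvent[]): SketchEvent[] {
  const out: SketchEvent[] = []
  let openId: string | null = null

  // Emit an END for the currently open message, if any.
  const closeOpen = () => {
    if (openId !== null) {
      out.push({ type: "TEXT_MESSAGE_END", messageId: openId })
      openId = null
    }
  }

  for (const event of events) {
    if (event.type === "TEXT_MESSAGE_CHUNK") {
      if (openId !== event.messageId) {
        // A chunk for a new messageId starts a new message (role is an assumed default).
        closeOpen()
        openId = event.messageId
        out.push({ type: "TEXT_MESSAGE_START", messageId: event.messageId, role: "assistant" })
      }
      if (event.delta.length > 0) {
        out.push({ type: "TEXT_MESSAGE_CONTENT", messageId: event.messageId, delta: event.delta })
      }
    } else {
      // Any non-chunk event closes the open message before passing through unchanged.
      closeOpen()
      out.push(event)
    }
  }
  closeOpen()
  return out
}
```

Downstream middleware that only understands `TEXT_MESSAGE_START`/`CONTENT`/`END` can then ignore chunk events entirely, which is the practical benefit of normalizing early.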
- -## Built-in Middleware - -AG-UI provides several built-in middleware components for common use cases: - -### FilterToolCallsMiddleware - -Filter tool calls based on allowed or disallowed lists: - -```typescript theme={null} -import { FilterToolCallsMiddleware } from "@ag-ui/client" - -// Only allow specific tools -const allowedFilter = new FilterToolCallsMiddleware({ - allowedToolCalls: ["search", "calculate"] -}) - -// Or block specific tools -const blockedFilter = new FilterToolCallsMiddleware({ - disallowedToolCalls: ["delete", "modify"] -}) - -agent.use(allowedFilter) -``` - -`FilterToolCallsMiddleware` filters emitted `TOOL_CALL_*` events. It does not block tool execution in the upstream model/runtime. - -## Middleware Patterns - -Common patterns include logging, auth via `forwardedProps`, and rate limiting. See the [JS middleware reference](/sdk/js/client/middleware) for concrete implementations. - -## Combining Middleware - -You can combine multiple middleware to create sophisticated processing pipelines: - -```typescript theme={null} -const logMiddleware: MiddlewareFunction = (input, next) => next.run(input) -const metricsMiddleware = new MetricsMiddleware(metricsService) -const filterMiddleware = new FilterToolCallsMiddleware({ allowedToolCalls: ["search"] }) - -agent.use(logMiddleware, metricsMiddleware, filterMiddleware) -``` - -## Execution Order - -Middleware executes in the order it's added, with each middleware wrapping the next: - -1. First middleware receives the original input -2. It can modify the input before passing to the next middleware -3. Each middleware processes events from the next in the chain -4. 
The final middleware calls the actual agent - -```typescript theme={null} -agent.use(middleware1, middleware2, middleware3) - -// Execution flow: -// → middleware1 -// → middleware2 -// → middleware3 -// → agent.run() -// ← events flow back through middleware3 -// ← events flow back through middleware2 -// ← events flow back through middleware1 -``` - -## Best Practices - -1. **Keep middleware focused** – Each middleware should have a single responsibility -2. **Handle errors gracefully** – Use RxJS error handling operators -3. **Avoid blocking operations** – Use async patterns for I/O operations -4. **Document side effects** – Clearly indicate if middleware modifies state -5. **Test middleware independently** – Write unit tests for each middleware -6. **Consider performance** – Be mindful of processing overhead in the event stream - -## Advanced Use Cases - -### Conditional Middleware - -Apply middleware based on runtime conditions: - -```typescript theme={null} -const conditionalMiddleware: MiddlewareFunction = (input, next) => { - if (input.forwardedProps?.debug === true) { - // Apply debug logging - return next.run(input).pipe( - tap(event => console.debug(event)) - ) - } - return next.run(input) -} -``` - -For event transformation and stream-control variants, see the [JS middleware reference](/sdk/js/client/middleware). - -## Conclusion - -Middleware provides a flexible and powerful way to extend AG-UI agents without modifying their core logic. Whether you need simple event transformation or complex stateful processing, the middleware system offers the tools to build robust, maintainable agent applications. - - -# Reasoning -Source: https://docs.ag-ui.com/concepts/reasoning - -Support for LLM reasoning visibility and continuity in AG-UI - -# Reasoning - -AG-UI provides first-class support for LLM reasoning, enabling chain-of-thought -visibility while maintaining privacy and state continuity across conversation -turns. 
- -## Overview - -Modern LLMs increasingly use chain-of-thought reasoning to improve response -quality. AG-UI's reasoning support addresses three key challenges: - -* **Reasoning visibility**: Surface reasoning signals (e.g., summaries) to users - without exposing raw chain-of-thought -* **State continuity**: Maintain reasoning context across turns using encrypted - reasoning items, even under `store:false` or zero data retention (ZDR) - policies -* **Privacy compliance**: Support enterprise privacy requirements while - preserving reasoning capabilities - - - Unlike Activity messages, Reasoning messages are intended to represent the - agent's internal thought process and may be encrypted for privacy and are - meant to be sent back to the agent for further processing on subsequent turns. - - -## ReasoningMessage - -The `ReasoningMessage` type represents reasoning content in the message history: - -```typescript theme={null} -interface ReasoningMessage { - id: string - role: "reasoning" - content: string // Reasoning content (visible to client) - encryptedValue?: string // Optional encrypted reasoning for state continuity -} -``` - -| Property | Type | Description | -| ---------------- | ------------- | ---------------------------------------------------- | -| `id` | `string` | Unique identifier for the reasoning message | -| `role` | `"reasoning"` | Message role discriminator | -| `content` | `string` | Reasoning content visible to the client | -| `encryptedValue` | `string?` | Encrypted chain-of-thought blob for state continuity | - -Key characteristics: - -* **Separate from assistant messages**: Reasoning is kept distinct from final - responses to avoid polluting conversation history -* **Streamable**: Content arrives via streaming events -* **Optional encryption**: When `encryptedValue` is present, it represents - encrypted chain-of-thought that the client stores and forwards opaquely - -## Reasoning Events - -Reasoning events manage the lifecycle of reasoning 
messages. See -[Events](/concepts/events#reasoning-events) for the complete event reference. - -### Event Flow - -A typical reasoning flow follows this pattern: - -```mermaid theme={null} -sequenceDiagram - participant Agent - participant Client - - Note over Agent,Client: Reasoning begins - Agent->>Client: ReasoningStart - - Note over Agent,Client: Stream visible reasoning - Agent->>Client: ReasoningMessageStart - Agent->>Client: ReasoningMessageContent (delta) - Agent->>Client: ReasoningMessageContent (delta) - Agent->>Client: ReasoningMessageEnd - - Note over Agent,Client: Attach encrypted chain-of-thought - Agent->>Client: ReasoningEncryptedValue - - Note over Agent,Client: Reasoning completes - Agent->>Client: ReasoningEnd -``` - -### Event Types - -| Event | Purpose | -| ------------------------- | ------------------------------------------------------------- | -| `ReasoningStart` | Marks beginning of reasoning phase | -| `ReasoningMessageStart` | Begins a streaming reasoning message | -| `ReasoningMessageContent` | Delivers reasoning content chunks | -| `ReasoningMessageEnd` | Completes a reasoning message | -| `ReasoningMessageChunk` | Convenience event that auto-manages message lifecycle | -| `ReasoningEnd` | Marks completion of reasoning | -| `ReasoningEncryptedValue` | Attaches encrypted chain-of-thought to a message or tool call | - -## Privacy and Compliance - -AG-UI reasoning is designed with privacy-first principles: - -### Zero Data Retention (ZDR) - -For deployments requiring zero data retention: - -1. **Encrypted reasoning values** can carry state across turns without storing - decryptable content on the client -2. The client receives and forwards `encryptedValue` blobs opaquely via - `ReasoningEncryptedValue` events -3. 
Only the agent (or authorized backend) can decrypt the reasoning content - -### Visibility Control - -Agents control what reasoning is visible to users: - -* **Full visibility**: Stream the complete chain-of-thought via - `ReasoningMessageContent` events -* **Summary only**: Emit a condensed summary while attaching detailed reasoning - as encrypted values -* **Hidden**: Use only `ReasoningEncryptedValue` events with no visible - streaming - -### Compliance Considerations - -| Requirement | Solution | -| ----------------------- | ----------------------------------------------------------------------------------- | -| GDPR right to erasure | Encrypted content can be discarded without losing reasoning capability | -| SOC 2 data handling | Reasoning content never stored in plaintext on client | -| HIPAA minimum necessary | Only summaries exposed; detailed reasoning stays encrypted | -| Audit logging | `ReasoningStart`/`ReasoningEnd` events provide audit trail without content exposure | - -## Example Implementations - -### Basic Reasoning Flow - -A simple implementation showing visible reasoning: - -```typescript theme={null} -// Agent emits reasoning start -yield { - type: "REASONING_START", - messageId: "reasoning-001", -} - -// Stream visible reasoning content -yield { - type: "REASONING_MESSAGE_START", - messageId: "msg-123", - role: "assistant", -} - -yield { - type: "REASONING_MESSAGE_CONTENT", - messageId: "msg-123", - delta: "Let me ", -} - -yield { - type: "REASONING_MESSAGE_CONTENT", - messageId: "msg-123", - delta: "think through ", -} - -yield { - type: "REASONING_MESSAGE_CONTENT", - messageId: "msg-123", - delta: "this step ", -} - -yield { - type: "REASONING_MESSAGE_CONTENT", - messageId: "msg-123", - delta: "by step...", -} - -yield { - type: "REASONING_MESSAGE_END", - messageId: "msg-123", -} - -// End reasoning -yield { - type: "REASONING_END", - messageId: "reasoning-001", -} -``` - -### Encrypted Content for State Continuity - -When maintaining 
reasoning state across turns without exposing content, use the -`ReasoningEncryptedValue` event to attach encrypted chain-of-thought to messages -or tool calls: - -```typescript theme={null} -// Agent emits reasoning start -yield { - type: "REASONING_START", - messageId: "reasoning-002", -} - -// Stream a visible summary for the user -yield { - type: "REASONING_MESSAGE_START", - messageId: "msg-456", - role: "assistant", -} - -yield { - type: "REASONING_MESSAGE_CONTENT", - messageId: "msg-456", - delta: "Analyzing your request...", -} - -yield { - type: "REASONING_MESSAGE_END", - messageId: "msg-456", -} - -// Attach encrypted chain-of-thought to the reasoning message -yield { - type: "REASONING_ENCRYPTED_VALUE", - subtype: "message", - entityId: "msg-456", - encryptedValue: "eyJhbGciOiJBMjU2R0NNIiwiZW5jIjoiQTI1NkdDTSJ9...", -} - -yield { - type: "REASONING_END", - messageId: "reasoning-002", -} - -// On subsequent turns, client sends back the message with encryptedValue -// which the agent can decrypt to restore reasoning context -``` - -### Attaching Encrypted Reasoning to Tool Calls - -You can also attach encrypted reasoning to tool calls to capture why the agent -chose specific arguments or how it interpreted results: - -```typescript theme={null} -// Tool call with encrypted reasoning -yield { - type: "TOOL_CALL_START", - toolCallId: "tool-123", - toolCallName: "search_database", - parentMessageId: "msg-789", -} - -yield { - type: "TOOL_CALL_ARGS", - toolCallId: "tool-123", - delta: '{"query": "user preferences"}', -} - -yield { - type: "TOOL_CALL_END", - toolCallId: "tool-123", -} - -// Attach encrypted reasoning explaining why this tool was called -yield { - type: "REASONING_ENCRYPTED_VALUE", - subtype: "tool-call", - entityId: "tool-123", - encryptedValue: "encrypted-reasoning-about-tool-selection...", -} -``` - -### ZDR-Compliant Implementation - -For zero data retention scenarios: - -```typescript theme={null} -// Server-side: encrypt reasoning before 
sending -const encryptedReasoning = await encrypt(detailedChainOfThought, secretKey) - -yield { - type: "REASONING_START", - messageId: "reasoning-003", -} - -// Only emit a high-level summary to the client -yield { - type: "REASONING_MESSAGE_CHUNK", - messageId: "summary-001", - delta: "Processing your request securely...", -} - -yield { - type: "REASONING_MESSAGE_CHUNK", - messageId: "summary-001", - delta: "", // Empty delta closes the message -} - -// Attach the encrypted chain-of-thought -yield { - type: "REASONING_ENCRYPTED_VALUE", - subtype: "message", - entityId: "summary-001", - encryptedValue: encryptedReasoning, -} - -yield { - type: "REASONING_END", - messageId: "reasoning-003", -} - -// Client stores only: -// - The encrypted blob (cannot decrypt) -// - The summary text (no sensitive details) -// Full reasoning is never persisted in plaintext -``` - -### Using the Convenience Chunk Event - -The `ReasoningMessageChunk` event simplifies implementation by auto-managing -message lifecycle: - -```typescript theme={null} -// First chunk with messageId starts the message automatically -yield { - type: "REASONING_MESSAGE_CHUNK", - messageId: "msg-789", - delta: "Analyzing the problem space...", -} - -// Subsequent chunks continue the stream -yield { - type: "REASONING_MESSAGE_CHUNK", - messageId: "msg-789", - delta: " Considering multiple approaches...", -} - -// Empty delta (or next non-reasoning event) closes automatically -yield { - type: "REASONING_MESSAGE_CHUNK", - messageId: "msg-789", - delta: "", -} -``` - -## Client Integration - -### Handling Reasoning Events - -```typescript theme={null} -import { EventType, type BaseEvent } from "@ag-ui/core" - -function handleEvent(event: BaseEvent) { - switch (event.type) { - case EventType.REASONING_START: - // Initialize reasoning UI (e.g., "thinking" indicator) - console.log("Agent is reasoning...") - break - - case EventType.REASONING_MESSAGE_CONTENT: - // Append visible reasoning to UI - 
appendReasoningText(event.messageId, event.delta) - break - - case EventType.REASONING_ENCRYPTED_VALUE: - // Store encrypted value for the referenced entity - if (event.subtype === "message") { - storeMessageEncryptedValue(event.entityId, event.encryptedValue) - } else if (event.subtype === "tool-call") { - storeToolCallEncryptedValue(event.entityId, event.encryptedValue) - } - break - - case EventType.REASONING_END: - // Finalize reasoning UI - console.log("Reasoning complete") - break - } -} -``` - -### Passing Encrypted Reasoning Back - -When making subsequent requests, include stored encrypted values: - -```typescript theme={null} -const response = await agent.run({ - threadId: "thread-123", - messages: [ - ...previousMessages, - { - id: "reasoning-002", - role: "reasoning", - content: "Analyzing your request...", // Visible summary - encryptedValue: storedEncryptedBlob, // Opaque to client - }, - { - id: "user-msg-001", - role: "user", - content: "Follow up question...", - }, - ], -}) -``` - -## Migration from Thinking Events - - - The `THINKING_*` events are deprecated and will be removed in version 1.0.0. - New implementations should use `REASONING_*` events. - - -### Deprecated Events - -The following events are deprecated: - -| Deprecated Event | Replacement | -| ------------------------------- | --------------------------- | -| `THINKING_START` | `REASONING_START` | -| `THINKING_END` | `REASONING_END` | -| `THINKING_TEXT_MESSAGE_START` | `REASONING_MESSAGE_START` | -| `THINKING_TEXT_MESSAGE_CONTENT` | `REASONING_MESSAGE_CONTENT` | -| `THINKING_TEXT_MESSAGE_END` | `REASONING_MESSAGE_END` | - -### Migration Steps - -1. **Update event types**: Replace all `THINKING_*` event types with their - `REASONING_*` equivalents -2. **Update message types**: Use `ReasoningMessage` with `role: "reasoning"` - instead of any thinking-specific message types -3. 
**Add encrypted value support**: Consider using `ReasoningEncryptedValue` - events for improved privacy compliance -4. **Test thoroughly**: Ensure existing functionality works with the new event - types - -### Example Migration - -Before (deprecated): - -```typescript theme={null} -// ❌ Deprecated - do not use -yield { type: "THINKING_START", messageId: "think-001" } -yield { type: "THINKING_TEXT_MESSAGE_START", messageId: "msg-001" } -yield { type: "THINKING_TEXT_MESSAGE_CONTENT", messageId: "msg-001", delta: "..." } -yield { type: "THINKING_TEXT_MESSAGE_END", messageId: "msg-001" } -yield { type: "THINKING_END", messageId: "think-001" } -``` - -After (current): - -```typescript theme={null} -// ✅ Current implementation -yield { type: "REASONING_START", messageId: "reasoning-001" } -yield { type: "REASONING_MESSAGE_START", messageId: "msg-001", role: "assistant" } -yield { type: "REASONING_MESSAGE_CONTENT", messageId: "msg-001", delta: "..." } -yield { type: "REASONING_MESSAGE_END", messageId: "msg-001" } -yield { type: "REASONING_END", messageId: "reasoning-001" } -``` - -## Best Practices - -1. **Always pair start/end events**: Every `ReasoningStart` should have a - corresponding `ReasoningEnd` -2. **Use encrypted values for sensitive reasoning**: When chain-of-thought - contains sensitive information, use `ReasoningEncryptedValue` to attach - encrypted content to messages or tool calls -3. **Provide user feedback**: Even with encrypted reasoning, emit visible - summaries so users know the agent is working -4. **Handle missing events gracefully**: Clients should be resilient to - incomplete event streams -5. 
**Consider bandwidth**: For very long reasoning chains, consider emitting - only summaries to reduce data transfer - -## Related Documentation - -* [Events](/concepts/events#reasoning-events) - Complete event type reference -* [Messages](/concepts/messages#reasoning-messages) - Message type documentation -* [Serialization](/concepts/serialization) - State continuity and lineage - - -# Serialization -Source: https://docs.ag-ui.com/concepts/serialization - -Serialize event streams for history restore, branching, and compaction in AG-UI - -# Serialization - -Serialization in AG-UI provides a standard way to persist and restore the event -stream that drives an agent–UI session. With a serialized stream you can: - -* Restore chat history and UI state after reloads or reconnects -* Attach to running agents and continue receiving events -* Create branches (time travel) from any prior run -* Compact stored history to reduce size without losing meaning - -This page explains the model, the updated event fields, and practical usage -patterns with examples. - -## Core Concepts - -* Stream serialization – Convert the full event history to and from a portable - representation (e.g., JSON) for storage in databases, files, or logs. -* Event compaction – Reduce verbose streams to snapshots while preserving - semantics (e.g., merge content chunks, collapse deltas into snapshots). -* Run lineage – Track branches of conversation using a `parentRunId`, forming - a git‑like append‑only log that enables time travel and alternative paths. 
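As a rough illustration of stream serialization and compaction together, the sketch below round-trips a text message stream through JSON and then merges the content chunks into a single `MESSAGES_SNAPSHOT`-shaped object. The types and the `compactTextMessages` helper are hypothetical and cover only text message events; a real compactor would also handle tool calls and state.

```typescript
// Simplified local event shapes (illustration only, not the SDK types).
type TextEvent =
  | { type: "TEXT_MESSAGE_START"; messageId: string; role: string }
  | { type: "TEXT_MESSAGE_CONTENT"; messageId: string; delta: string }
  | { type: "TEXT_MESSAGE_END"; messageId: string }

interface SnapshotMessage {
  id: string
  role: string
  content: string
}

interface MessagesSnapshot {
  type: "MESSAGES_SNAPSHOT"
  messages: SnapshotMessage[]
}

// Hypothetical helper: merge START/CONTENT/END sequences into one snapshot.
function compactTextMessages(events: TextEvent[]): MessagesSnapshot {
  const byId = new Map<string, SnapshotMessage>()
  for (const event of events) {
    if (event.type === "TEXT_MESSAGE_START") {
      byId.set(event.messageId, { id: event.messageId, role: event.role, content: "" })
    } else if (event.type === "TEXT_MESSAGE_CONTENT") {
      const msg = byId.get(event.messageId)
      if (msg) msg.content += event.delta // Concatenate chunks in arrival order.
    }
  }
  return { type: "MESSAGES_SNAPSHOT", messages: Array.from(byId.values()) }
}

// Serialize the stream as it would be stored, then restore and compact it.
const stored = JSON.stringify([
  { type: "TEXT_MESSAGE_START", messageId: "msg1", role: "user" },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "msg1", delta: "Hello " },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "msg1", delta: "world" },
  { type: "TEXT_MESSAGE_END", messageId: "msg1" },
])
const snapshot = compactTextMessages(JSON.parse(stored) as TextEvent[])
```

The cast after `JSON.parse` is unchecked; production code would likely validate the restored events before compacting them.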
- -## Updated Event Fields - -The `RunStarted` event includes additional optional fields: - -```ts theme={null} -type RunStartedEvent = BaseEvent & { - type: EventType.RUN_STARTED - threadId: string - runId: string - /** Parent for branching/time travel within the same thread */ - parentRunId?: string - /** Exact agent input for this run (may omit messages already in history) */ - input?: AgentInput -} -``` - -These fields enable lineage tracking and let implementations record precisely -what was passed to the agent, independent of previously recorded messages. - -## Event Compaction - -Compaction reduces noise in an event stream while keeping the same observable -outcome. A typical implementation provides a utility: - -```ts theme={null} -declare function compactEvents(events: BaseEvent[]): BaseEvent[] -``` - -Common compaction rules include: - -* Message streams – Combine `TEXT_MESSAGE_*` sequences into a single message - snapshot; concatenate adjacent `TEXT_MESSAGE_CONTENT` for the same message. -* Tool calls – Collapse tool call start/content/end into a compact record. -* State – Merge consecutive `STATE_DELTA` events into a single final - `STATE_SNAPSHOT` and discard superseded updates. -* Run input normalization – Remove from `RunStarted.input.messages` any - messages already present earlier in the stream. - -## Branching and Time Travel - -Setting `parentRunId` on a `RunStarted` event creates a git‑like lineage. The -stream becomes an immutable append‑only log where each run can branch from any -previous run. 
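
To make the lineage idea concrete, the following sketch walks `parentRunId` links to recover the chain of runs leading to a given run. It assumes every non-root run records its parent explicitly; `RunStartedSketch` and `lineageOf` are hypothetical names introduced for illustration, not SDK APIs.

```typescript
// Hypothetical, trimmed-down RunStarted shape for this sketch.
type RunStartedSketch = {
  type: "RUN_STARTED"
  threadId: string
  runId: string
  parentRunId?: string
}

// Return run ids from the root down to `runId`, following parentRunId links.
function lineageOf(runId: string, runs: RunStartedSketch[]): string[] {
  const parentOf = new Map<string, string | undefined>()
  for (const run of runs) parentOf.set(run.runId, run.parentRunId)

  const chain: string[] = []
  const seen = new Set<string>() // guards against accidental cycles
  let current: string | undefined = runId
  while (current !== undefined && parentOf.has(current) && !seen.has(current)) {
    seen.add(current)
    chain.unshift(current)
    current = parentOf.get(current)
  }
  return chain
}
```

Because the log is append-only, restoring a branch then amounts to replaying only the events whose run appears in the returned chain.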
-
-```mermaid theme={null}
-gitGraph
-  commit id: "run1"
-  commit id: "run2"
-  branch alternative
-  checkout alternative
-  commit id: "run3 (parent run2)"
-  commit id: "run4"
-  checkout main
-  commit id: "run5 (parent run2)"
-  commit id: "run6"
-```
-
-Benefits:
-
-* Multiple branches in the same serialized log
-* Immutable history (append‑only)
-* Deterministic time travel to any point
-
-## Examples
-
-### Basic Serialization
-
-```ts theme={null}
-// Serialize event stream
-const events: BaseEvent[] = [...];
-const serialized = JSON.stringify(events);
-
-await storage.save(threadId, serialized);
-
-// Restore and compact later
-const restored = JSON.parse(await storage.load(threadId));
-const compacted = compactEvents(restored);
-```
-
-### Event Compaction
-
-Before:
-
-```ts theme={null}
-[
-  { type: "TEXT_MESSAGE_START", messageId: "msg1", role: "user" },
-  { type: "TEXT_MESSAGE_CONTENT", messageId: "msg1", delta: "Hello " },
-  { type: "TEXT_MESSAGE_CONTENT", messageId: "msg1", delta: "world" },
-  { type: "TEXT_MESSAGE_END", messageId: "msg1" },
-  { type: "STATE_DELTA", delta: [{ op: "add", path: "/foo", value: 1 }] },
-  { type: "STATE_DELTA", delta: [{ op: "replace", path: "/foo", value: 2 }] },
-]
-```
-
-After:
-
-```ts theme={null}
-[
-  {
-    type: "MESSAGES_SNAPSHOT",
-    messages: [{ id: "msg1", role: "user", content: "Hello world" }],
-  },
-  {
-    type: "STATE_SNAPSHOT",
-    snapshot: { foo: 2 },
-  },
-]
-```
-
-### Branching With `parentRunId`
-
-```ts theme={null}
-// Original run
-{
-  type: "RUN_STARTED",
-  threadId: "thread1",
-  runId: "run1",
-  input: { messages: ["Tell me about Paris"] },
-}
-
-// Branch from run1
-{
-  type: "RUN_STARTED",
-  threadId: "thread1",
-  runId: "run2",
-  parentRunId: "run1",
-  input: { messages: ["Actually, tell me about London instead"] },
-}
-```
-
-### Normalized Input
-
-```ts theme={null}
-// First run includes full message
-{
-  type: "RUN_STARTED",
-  runId: "run1",
-  input: { messages: [{ id: "msg1", role: "user", content:
"Hello" }] }, -} - -// Second run omits already‑present message -{ - type: "RUN_STARTED", - runId: "run2", - input: { messages: [{ id: "msg2", role: "user", content: "How are you?" }] }, - // msg1 omitted; it already exists in history -} -``` - -## Implementation Notes - -* Provide SDK helpers for compaction and (de)serialization. -* Store streams append‑only; prefer incremental writes when possible. -* Consider compression when persisting long histories. -* Add indexes by `threadId`, `runId`, and timestamps for fast retrieval. - -## See Also - -* Concepts: [Events](/concepts/events), [State Management](/concepts/state) -* SDKs: TypeScript encoder and core event types - - -# State Management -Source: https://docs.ag-ui.com/concepts/state - -Understanding state synchronization between agents and frontends in AG-UI - -# State Management - -State management is a core feature of the AG-UI protocol that enables real-time -synchronization between agents and frontend applications. By providing efficient -mechanisms for sharing and updating state, AG-UI creates a foundation for -collaborative experiences where both AI agents and human users can work together -seamlessly. - -## Shared State Architecture - -In AG-UI, state is a structured data object that: - -1. Persists across interactions with an agent -2. Can be accessed by both the agent and the frontend -3. Updates in real-time as the interaction progresses -4. 
Provides context for decision-making on both sides - -This shared state architecture creates a bidirectional communication channel -where: - -* Agents can access the application's current state to make informed decisions -* Frontends can observe and react to changes in the agent's internal state -* Both sides can modify the state, creating a collaborative workflow - -## State Synchronization Methods - -AG-UI provides two complementary methods for state synchronization: - -### State Snapshots - -The `STATE_SNAPSHOT` event delivers a complete representation of an agent's -current state: - -```typescript theme={null} -interface StateSnapshotEvent { - type: EventType.STATE_SNAPSHOT - snapshot: any // Complete state object -} -``` - -Snapshots are typically used: - -* At the beginning of an interaction to establish the initial state -* After connection interruptions to ensure synchronization -* When major state changes occur that require a complete refresh -* To establish a new baseline for future delta updates - -When a frontend receives a `STATE_SNAPSHOT` event, it should replace its -existing state model entirely with the contents of the snapshot. - -### State Deltas - -The `STATE_DELTA` event delivers incremental updates to the state using JSON -Patch format (RFC 6902): - -```typescript theme={null} -interface StateDeltaEvent { - type: EventType.STATE_DELTA - delta: JsonPatchOperation[] // Array of JSON Patch operations -} -``` - -Deltas are bandwidth-efficient, sending only what has changed rather than the -entire state. 
This approach is particularly valuable for: - -* Frequent small updates during streaming interactions -* Large state objects where most properties remain unchanged -* High-frequency updates that would be inefficient to send as full snapshots - -## JSON Patch Format - -AG-UI uses the JSON Patch format (RFC 6902) for state deltas, which defines a -standardized way to express changes to a JSON document: - -```typescript theme={null} -interface JsonPatchOperation { - op: "add" | "remove" | "replace" | "move" | "copy" | "test" - path: string // JSON Pointer (RFC 6901) to the target location - value?: any // The value to apply (for add, replace) - from?: string // Source path (for move, copy) -} -``` - -Common operations include: - -1. **add**: Adds a value to an object or array - - ```json theme={null} - { "op": "add", "path": "/user/preferences", "value": { "theme": "dark" } } - ``` - -2. **replace**: Replaces a value - - ```json theme={null} - { "op": "replace", "path": "/conversation_state", "value": "paused" } - ``` - -3. **remove**: Removes a value - - ```json theme={null} - { "op": "remove", "path": "/temporary_data" } - ``` - -4. **move**: Moves a value from one location to another - ```json theme={null} - { "op": "move", "path": "/completed_items", "from": "/pending_items/0" } - ``` - -Frontends should apply these patches in sequence to maintain an accurate state -representation. If inconsistencies are detected after applying patches, the -frontend can request a fresh `STATE_SNAPSHOT`. 
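
As a rough illustration of applying patches in sequence, here is a deliberately small applier. It is a sketch under stated assumptions, not a substitute for a full RFC 6902 library: it handles only `add`, `replace`, and `remove` on object members (no arrays, no `move`/`copy`/`test`), and `PatchOp`, `JsonObject`, `parsePointer`, and `applyOps` are hypothetical names.

```typescript
// Hypothetical subset of JSON Patch used only for this sketch.
type PatchOp =
  | { op: "add" | "replace"; path: string; value: unknown }
  | { op: "remove"; path: string }

type JsonObject = { [key: string]: unknown }

// Split a JSON Pointer such as "/user/preferences" into unescaped segments.
function parsePointer(path: string): string[] {
  if (!path.startsWith("/")) throw new Error(`Unsupported path: ${path}`)
  return path
    .slice(1)
    .split("/")
    .map((segment) => segment.replace(/~1/g, "/").replace(/~0/g, "~"))
}

// Apply operations in order to a copy of `state`; the input is never mutated,
// so a failing operation leaves the caller's state untouched.
function applyOps(state: JsonObject, ops: PatchOp[]): JsonObject {
  const next = JSON.parse(JSON.stringify(state)) as JsonObject
  for (const operation of ops) {
    const segments = parsePointer(operation.path)
    const key = segments.pop() as string
    let parent: JsonObject = next
    for (const segment of segments) {
      const child = parent[segment]
      if (typeof child !== "object" || child === null || Array.isArray(child)) {
        throw new Error(`Missing parent object for ${operation.path}`)
      }
      parent = child as JsonObject
    }
    if (operation.op === "remove") {
      if (!(key in parent)) throw new Error(`Nothing to remove at ${operation.path}`)
      delete parent[key]
    } else {
      if (operation.op === "replace" && !(key in parent)) {
        throw new Error(`Nothing to replace at ${operation.path}`)
      }
      parent[key] = operation.value
    }
  }
  return next
}
```

Working on a copy is what makes the "request a fresh snapshot on failure" strategy safe: if any operation throws, the previous state is still intact while the client waits for the next `STATE_SNAPSHOT`.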
-
-## State Processing in AG-UI
-
-In the AG-UI implementation, state deltas are applied using the
-`fast-json-patch` library:
-
-```typescript theme={null}
-case EventType.STATE_DELTA: {
-  const { delta } = event as StateDeltaEvent;
-
-  try {
-    // Apply the JSON Patch operations to the current state without mutating the original
-    const result = applyPatch(state, delta, true, false);
-    state = result.newDocument;
-    return emitUpdate({ state });
-  } catch (error: unknown) {
-    // Normalize the caught value so it can be logged safely
-    const errorMessage = error instanceof Error ? error.message : String(error);
-    console.warn(
-      `Failed to apply state patch:\n` +
-      `Current state: ${JSON.stringify(state, null, 2)}\n` +
-      `Patch operations: ${JSON.stringify(delta, null, 2)}\n` +
-      `Error: ${errorMessage}`
-    );
-    return emitNoUpdate();
-  }
-}
-```
-
-This implementation ensures that:
-
-* Patches are applied atomically (all or none)
-* The original state is not mutated during the application process
-* Errors are caught and handled gracefully
-
-## Human-in-the-Loop Collaboration
-
-The shared state system is fundamental to human-in-the-loop workflows in AG-UI.
-It enables:
-
-1. **Real-time visibility**: Users can observe the agent's thought process and
-   current status
-2. **Contextual awareness**: The agent can access user actions, preferences, and
-   application state
-3. **Collaborative decision-making**: Both human and AI can contribute to the
-   evolving state
-4. **Feedback loops**: Humans can correct or guide the agent by modifying state
-   properties
-
-For example, an agent might update its state with a proposed action:
-
-```json theme={null}
-{
-  "proposal": {
-    "action": "send_email",
-    "recipient": "client@example.com",
-    "content": "Draft email content..."
-  }
-}
-```
-
-The frontend can display this proposal to the user, who can then approve,
-reject, or modify it before execution.
-
-## CopilotKit Implementation
-
-[CopilotKit](https://docs.copilotkit.ai), a popular framework for building AI
-assistants, leverages AG-UI's state management system through its "shared state"
-feature.
This implementation enables bidirectional state synchronization between -agents (particularly LangGraph agents) and frontend applications. - -CopilotKit's shared state system is implemented through: - -```jsx theme={null} -// In the frontend React application -const { state: agentState, setState: setAgentState } = useCoAgent({ - name: "agent", - initialState: { someProperty: "initialValue" }, -}) -``` - -This hook creates a real-time connection to the agent's state, allowing: - -1. Reading the agent's current state in the frontend -2. Updating the agent's state from the frontend -3. Rendering UI components based on the agent's state - -On the backend, LangGraph agents can emit state updates using: - -```python theme={null} -# In the LangGraph agent -async def tool_node(self, state: ResearchState, config: RunnableConfig): - # Update state with new information - tool_state = { - "title": new_state.get("title", ""), - "outline": new_state.get("outline", {}), - "sections": new_state.get("sections", []), - # Other state properties... - } - - # Emit updated state to frontend - await copilotkit_emit_state(config, tool_state) - - return tool_state -``` - -These state updates are transmitted using AG-UI's state snapshot and delta -mechanisms, creating a seamless shared context between agent and frontend. - -## Best Practices - -When implementing state management in AG-UI: - -1. **Use snapshots judiciously**: Full snapshots should be sent only when - necessary to establish a baseline. -2. **Prefer deltas for incremental changes**: Small state updates should use - deltas to minimize data transfer. -3. **Structure state thoughtfully**: Design state objects to support partial - updates and minimize patch complexity. -4. **Handle state conflicts**: Implement strategies for resolving conflicting - updates from agent and frontend. -5. **Include error recovery**: Provide mechanisms to resynchronize state if - inconsistencies are detected. -6. 
**Consider security implications**: Avoid storing sensitive information in - shared state. - -## Conclusion - -AG-UI's state management system provides a powerful foundation for building -collaborative applications where humans and AI agents work together. By -efficiently synchronizing state between frontend and backend through snapshots -and JSON Patch deltas, AG-UI enables sophisticated human-in-the-loop workflows -that combine the strengths of both human intuition and AI capabilities. - -The implementation in frameworks like CopilotKit demonstrates how this shared -state approach can create collaborative experiences that are more effective than -either fully autonomous systems or traditional user interfaces. - - -# Tools -Source: https://docs.ag-ui.com/concepts/tools - -Understanding tools and how they enable human-in-the-loop AI workflows - -# Tools - -Tools are a fundamental concept in the AG-UI protocol that enable AI agents to -interact with external systems and incorporate human judgment into their -workflows. By defining tools in the frontend and passing them to agents, -developers can create sophisticated human-in-the-loop experiences that combine -AI capabilities with human expertise. - -## What Are Tools? - -In AG-UI, tools are functions that agents can call to: - -1. Request specific information -2. Perform actions in external systems -3. Ask for human input or confirmation -4. Access specialized capabilities - -Tools bridge the gap between AI reasoning and real-world actions, allowing -agents to accomplish tasks that would be impossible through conversation alone. 
- -## Tool Structure - -Tools follow a consistent structure that defines their name, purpose, and -expected parameters: - -```typescript theme={null} -interface Tool { - name: string // Unique identifier for the tool - description: string // Human-readable explanation of what the tool does - parameters: { - // JSON Schema defining the tool's parameters - type: "object" - properties: { - // Tool-specific parameters - } - required: string[] // Array of required parameter names - } -} -``` - -The `parameters` field uses [JSON Schema](https://json-schema.org/) to define -the structure of arguments that the tool accepts. This schema is used by both -the agent (to generate valid tool calls) and the frontend (to validate and parse -tool arguments). - -## Frontend-Defined Tools - -A key aspect of AG-UI's tool system is that tools are defined in the frontend -and passed to the agent during execution: - -```typescript theme={null} -// Define tools in the frontend -const userConfirmationTool = { - name: "confirmAction", - description: "Ask the user to confirm a specific action before proceeding", - parameters: { - type: "object", - properties: { - action: { - type: "string", - description: "The action that needs user confirmation", - }, - importance: { - type: "string", - enum: ["low", "medium", "high", "critical"], - description: "The importance level of the action", - }, - }, - required: ["action"], - }, -} - -// Pass tools to the agent during execution -agent.runAgent({ - tools: [userConfirmationTool], - // Other parameters... -}) -``` - -This approach has several advantages: - -1. **Frontend control**: The frontend determines what capabilities are available - to the agent -2. **Dynamic capabilities**: Tools can be added or removed based on user - permissions, context, or application state -3. **Separation of concerns**: Agents focus on reasoning while frontends handle - tool implementation -4. 
**Security**: Sensitive operations are controlled by the application, not the - agent - -## Tool Call Lifecycle - -When an agent needs to use a tool, it follows a standardized sequence of events: - -1. **ToolCallStart**: Indicates the beginning of a tool call with a unique ID - and tool name - - ```typescript theme={null} - { - type: EventType.TOOL_CALL_START, - toolCallId: "tool-123", - toolCallName: "confirmAction", - parentMessageId: "msg-456" // Optional reference to a message - } - ``` - -2. **ToolCallArgs**: Streams the tool arguments as they're generated - - ```typescript theme={null} - { - type: EventType.TOOL_CALL_ARGS, - toolCallId: "tool-123", - delta: '{"act' // Partial JSON being streamed - } - ``` - - ```typescript theme={null} - { - type: EventType.TOOL_CALL_ARGS, - toolCallId: "tool-123", - delta: 'ion":"Depl' // More JSON being streamed - } - ``` - - ```typescript theme={null} - { - type: EventType.TOOL_CALL_ARGS, - toolCallId: "tool-123", - delta: 'oy the application to production"}' // Final JSON fragment - } - ``` - -3. **ToolCallEnd**: Marks the completion of the tool call - ```typescript theme={null} - { - type: EventType.TOOL_CALL_END, - toolCallId: "tool-123" - } - ``` - -The frontend accumulates these deltas to construct the complete tool call -arguments. Once the tool call is complete, the frontend can execute the tool and -provide results back to the agent. - -## Tool Results - -After a tool has been executed, the result is sent back to the agent as a "tool -message": - -```typescript theme={null} -{ - id: "result-789", - role: "tool", - content: "true", // Tool result as a string - toolCallId: "tool-123" // References the original tool call -} -``` - -This message becomes part of the conversation history, allowing the agent to -reference and incorporate the tool's result in subsequent responses. - -## Human-in-the-Loop Workflows - -The AG-UI tool system is especially powerful for implementing human-in-the-loop -workflows. 
By defining tools that request human input or confirmation, -developers can create AI experiences that seamlessly blend autonomous operation -with human judgment. - -For example: - -1. Agent needs to make an important decision -2. Agent calls the `confirmAction` tool with details about the decision -3. Frontend displays a confirmation dialog to the user -4. User provides their input -5. Frontend sends the user's decision back to the agent -6. Agent continues processing with awareness of the user's choice - -This pattern enables use cases like: - -* **Approval workflows**: AI suggests actions that require human approval -* **Data verification**: Humans verify or correct AI-generated data -* **Collaborative decision-making**: AI and humans jointly solve complex - problems -* **Supervised learning**: Human feedback improves future AI decisions - -## CopilotKit Integration - -[CopilotKit](https://docs.copilotkit.ai/) provides a simplified way to work with -AG-UI tools in React applications through its -[`useCopilotAction`](https://docs.copilotkit.ai/guides/frontend-actions) hook: - -```tsx theme={null} -import { useCopilotAction } from "@copilotkit/react-core" - -// Define a tool for user confirmation -useCopilotAction({ - name: "confirmAction", - description: "Ask the user to confirm an action", - parameters: { - type: "object", - properties: { - action: { - type: "string", - description: "The action to confirm", - }, - }, - required: ["action"], - }, - handler: async ({ action }) => { - // Show a confirmation dialog - const confirmed = await showConfirmDialog(action) - return confirmed ? "approved" : "rejected" - }, -}) -``` - -This approach makes it easy to define tools that integrate with your React -components and handle the tool execution logic in a clean, declarative way. 
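
For clients that consume the event stream directly rather than through a hook, the accumulation step described under Tool Call Lifecycle could look roughly like the sketch below. The event shapes are simplified, string literals stand in for `EventType` members, and `ToolCallEventSketch`, `CompletedToolCall`, and `collectToolCalls` are hypothetical names, not SDK exports.

```typescript
// Hypothetical, simplified tool call events for this sketch.
type ToolCallEventSketch =
  | { type: "TOOL_CALL_START"; toolCallId: string; toolCallName: string }
  | { type: "TOOL_CALL_ARGS"; toolCallId: string; delta: string }
  | { type: "TOOL_CALL_END"; toolCallId: string }

type CompletedToolCall = { id: string; name: string; args: unknown }

// Buffer argument fragments per tool call and parse them once the call ends.
function collectToolCalls(events: ToolCallEventSketch[]): CompletedToolCall[] {
  const pending = new Map<string, { name: string; buffer: string }>()
  const completed: CompletedToolCall[] = []
  for (const event of events) {
    if (event.type === "TOOL_CALL_START") {
      pending.set(event.toolCallId, { name: event.toolCallName, buffer: "" })
    } else if (event.type === "TOOL_CALL_ARGS") {
      const call = pending.get(event.toolCallId)
      if (call) call.buffer += event.delta
    } else {
      const call = pending.get(event.toolCallId)
      if (!call) continue // END without START: ignored in this sketch
      pending.delete(event.toolCallId)
      // Fragments are only valid JSON once the whole call has arrived.
      const args: unknown = call.buffer === "" ? {} : JSON.parse(call.buffer)
      completed.push({ id: event.toolCallId, name: call.name, args })
    }
  }
  return completed
}
```

Parsing only at `TOOL_CALL_END` is a design choice of this sketch: individual deltas such as `'{"act'` are not valid JSON on their own, so earlier parsing would need a streaming parser.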
- -## Tool Examples - -Here are some common types of tools used in AG-UI applications: - -### User Confirmation - -```typescript theme={null} -{ - name: "confirmAction", - description: "Ask the user to confirm an action", - parameters: { - type: "object", - properties: { - action: { - type: "string", - description: "The action to confirm" - }, - importance: { - type: "string", - enum: ["low", "medium", "high", "critical"], - description: "The importance level" - } - }, - required: ["action"] - } -} -``` - -### Data Retrieval - -```typescript theme={null} -{ - name: "fetchUserData", - description: "Retrieve data about a specific user", - parameters: { - type: "object", - properties: { - userId: { - type: "string", - description: "ID of the user" - }, - fields: { - type: "array", - items: { - type: "string" - }, - description: "Fields to retrieve" - } - }, - required: ["userId"] - } -} -``` - -### User Interface Control - -```typescript theme={null} -{ - name: "navigateTo", - description: "Navigate to a different page or view", - parameters: { - type: "object", - properties: { - destination: { - type: "string", - description: "Destination page or view" - }, - params: { - type: "object", - description: "Optional parameters for the navigation" - } - }, - required: ["destination"] - } -} -``` - -### Content Generation - -```typescript theme={null} -{ - name: "generateImage", - description: "Generate an image based on a description", - parameters: { - type: "object", - properties: { - prompt: { - type: "string", - description: "Description of the image to generate" - }, - style: { - type: "string", - description: "Visual style for the image" - }, - dimensions: { - type: "object", - properties: { - width: { type: "number" }, - height: { type: "number" } - }, - description: "Dimensions of the image" - } - }, - required: ["prompt"] - } -} -``` - -## Best Practices - -When designing tools for AG-UI: - -1. **Clear naming**: Use descriptive, action-oriented names -2. 
**Detailed descriptions**: Include thorough descriptions to help the agent - understand when and how to use the tool -3. **Structured parameters**: Define precise parameter schemas with descriptive - field names and constraints -4. **Required fields**: Only mark parameters as required if they're truly - necessary -5. **Error handling**: Implement robust error handling in tool execution code -6. **User experience**: Design tool UIs that provide appropriate context for - human decision-making - -## Conclusion - -Tools in AG-UI bridge the gap between AI reasoning and real-world actions, -enabling sophisticated workflows that combine the strengths of AI and human -intelligence. By defining tools in the frontend and passing them to agents, -developers can create interactive experiences where AI and humans collaborate -efficiently. - -The tool system is particularly powerful for implementing human-in-the-loop -workflows, where AI can suggest actions but defer critical decisions to humans. -This balances automation with human judgment, creating AI experiences that are -both powerful and trustworthy. - - -# Contributing -Source: https://docs.ag-ui.com/development/contributing - -How to participate in Agent User Interaction Protocol development - -# Naming conventions - -Add your package under `integrations/` with docs and tests. - -If your integration is work in progress, you can still add it to main branch. -You can prefix it with `wip-`, i.e. -(`integrations/wip-your-integration`) or if you're a third party -contributor use the `community` prefix, i.e. -(`integrations/community/your-integration`). - -For questions and discussions, please use -[GitHub Discussions](https://github.com/ag-ui-protocol/ag-ui/discussions). - - -# Roadmap -Source: https://docs.ag-ui.com/development/roadmap - -Our plans for evolving Agent User Interaction Protocol - -You can follow the progress of the AG-UI Protocol on our -[public roadmap](https://github.com/orgs/ag-ui-protocol/projects/1). 
- -## Get Involved - -If you’d like to contribute ideas, feature requests, or bug reports to -the roadmap, please see the [Contributing Guide](https://github.com/ag-ui-protocol/ag-ui/blob/main/CONTRIBUTING.md) -for details on how to get involved. - - -# What's New -Source: https://docs.ag-ui.com/development/updates - -The latest updates and improvements to AG-UI - - - * Initial release of the Agent User Interaction Protocol - - - -# Generative User Interfaces -Source: https://docs.ag-ui.com/drafts/generative-ui - -AI-generated interfaces without custom tool renderers - -# Generative User Interfaces - -## Summary - -### Problem Statement - -Currently, creating custom user interfaces for agent interactions requires -programmers to define specific tool renderers. This limits the flexibility and -adaptability of agent-driven applications. - -### Motivation - -This draft describes an AG-UI extension that addresses **generative user -interfaces**—interfaces produced directly by artificial intelligence without -requiring a programmer to define custom tool renderers. The key idea is to -leverage our ability to send client-side tools to the agent, thereby enabling -this capability across all agent frameworks supported by AG-UI. - -## Status - -* **Status**: Draft -* **Author(s)**: Markus Ecker ([mail@mme.xyz](mailto:mail@mme.xyz)) - -## Challenges and Limitations - -### Tool Description Length - -OpenAI enforces a limit of 1024 characters for tool descriptions. Gemini and -Anthropic impose no such limit. - -### Arguments JSON Schema Constraints - -Classes, nesting, `$ref`, and `oneOf` are not reliably supported across LLM -providers. - -### Context Window Considerations - -Injecting a large UI description language into an agent may reduce its -performance. Agents dedicated solely to UI generation perform better than agents -combining UI generation with other tasks. 
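
One way to act on these constraints is a pre-flight check on tool definitions before they are sent to an agent. The sketch below is illustrative only: `ToolSketch` and `findPortabilityIssues` are hypothetical names, the 1024-character figure is the OpenAI limit quoted above, and the keyword scan is naive (it would also flag a property that happens to be named `oneOf`).

```typescript
// Hypothetical minimal tool shape for this sketch.
type ToolSketch = { name: string; description: string; parameters: unknown }

const MAX_DESCRIPTION_LENGTH = 1024 // limit cited above for OpenAI
const RISKY_KEYWORDS = ["$ref", "oneOf"] // schema features noted as unreliable

// Return human-readable warnings; an empty array means no issue was found.
function findPortabilityIssues(tool: ToolSketch): string[] {
  const issues: string[] = []
  if (tool.description.length > MAX_DESCRIPTION_LENGTH) {
    issues.push(`description exceeds ${MAX_DESCRIPTION_LENGTH} characters`)
  }
  // Walk the schema recursively and report any risky keyword by its path.
  const visit = (node: unknown, path: string): void => {
    if (Array.isArray(node)) {
      node.forEach((item, index) => visit(item, `${path}/${index}`))
      return
    }
    if (typeof node !== "object" || node === null) return
    for (const [key, value] of Object.entries(node)) {
      if (RISKY_KEYWORDS.includes(key)) issues.push(`${path}/${key} may not be portable`)
      visit(value, `${path}/${key}`)
    }
  }
  visit(tool.parameters, "parameters")
  return issues
}
```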
-
-## Detailed Specification
-
-### Two-Step Generation Process
-
-```mermaid theme={null}
-flowchart TD
-    A[Agent needs UI] --> B["Step 1: What? <br/> Agent calls generateUserInterface <br/> (description, data, output)"]
-    B --> C["Step 2: How? <br/> Secondary generator builds actual UI <br/> (JSON Schema, React, etc.)"]
-    C --> D[Rendered UI shown to user]
-    D --> E[Validated user input returned to Agent]
-```
-
-### Step 1: What to Generate?
-
-Inject a lightweight tool into the agent:
-
-**Tool Definition:**
-
-* **Name:** `generateUserInterface`
-* **Arguments:**
-  * **description**: A high-level description of the UI (e.g., *"A form for
-    entering the user's address"*)
-  * **data**: Arbitrary pre-populated data for the generated UI
-  * **output**: A description or schema of the data the agent expects the user
-    to submit back (fields, required/optional, types, constraints)
-
-**Example Tool Call:**
-
-```json theme={null}
-{
-  "tool": "generateUserInterface",
-  "arguments": {
-    "description": "A form that collects a user's shipping address.",
-    "data": {
-      "firstName": "Ada",
-      "lastName": "Lovelace",
-      "city": "London"
-    },
-    "output": {
-      "type": "object",
-      "required": [
-        "firstName",
-        "lastName",
-        "street",
-        "city",
-        "postalCode",
-        "country"
-      ],
-      "properties": {
-        "firstName": { "type": "string", "title": "First Name" },
-        "lastName": { "type": "string", "title": "Last Name" },
-        "street": { "type": "string", "title": "Street Address" },
-        "city": { "type": "string", "title": "City" },
-        "postalCode": { "type": "string", "title": "Postal Code" },
-        "country": {
-          "type": "string",
-          "title": "Country",
-          "enum": ["GB", "US", "DE", "AT"]
-        }
-      }
-    }
-  }
-}
-```
-
-### Step 2: How to Generate?
-
-Delegate UI generation to a secondary LLM or agent:
-
-* The CopilotKit user stays in control: they can build their own generators, add
-  custom libraries, include additional prompts, etc.
-* On tool invocation, the secondary model consumes `description`, `data`, and - `output` to generate the user interface -* This model is focused solely on UI generation, ensuring maximum fidelity and - consistency -* The generation method can be swapped as needed (e.g., JSON, HTML, or other - renderable formats) -* The UI format description is not subject to structural or length constraints, - allowing arbitrarily complex specifications - -## Implementation Examples - -### Example Output: UISchemaGenerator - -```json theme={null} -{ - "jsonSchema": { - "title": "Shipping Address", - "type": "object", - "required": [ - "firstName", - "lastName", - "street", - "city", - "postalCode", - "country" - ], - "properties": { - "firstName": { "type": "string", "title": "First name" }, - "lastName": { "type": "string", "title": "Last name" }, - "street": { "type": "string", "title": "Street address" }, - "city": { "type": "string", "title": "City" }, - "postalCode": { "type": "string", "title": "Postal code" }, - "country": { - "type": "string", - "title": "Country", - "enum": ["GB", "US", "DE", "AT"] - } - } - }, - "uiSchema": { - "type": "VerticalLayout", - "elements": [ - { - "type": "Group", - "label": "Personal Information", - "elements": [ - { "type": "Control", "scope": "#/properties/firstName" }, - { "type": "Control", "scope": "#/properties/lastName" } - ] - }, - { - "type": "Group", - "label": "Address", - "elements": [ - { "type": "Control", "scope": "#/properties/street" }, - { "type": "Control", "scope": "#/properties/city" }, - { "type": "Control", "scope": "#/properties/postalCode" }, - { "type": "Control", "scope": "#/properties/country" } - ] - } - ] - }, - "initialData": { - "firstName": "Ada", - "lastName": "Lovelace", - "city": "London", - "country": "GB" - } -} -``` - -### Example Output: ReactFormHookGenerator - -```tsx theme={null} -import React from "react" -import { useForm } from "react-hook-form" -import { z } from "zod" -import { zodResolver } 
from "@hookform/resolvers/zod"
-
-// ----- Schema (contract) -----
-const AddressSchema = z.object({
-  firstName: z.string().min(1, "Required"),
-  lastName: z.string().min(1, "Required"),
-  street: z.string().min(1, "Required"),
-  city: z.string().min(1, "Required"),
-  postalCode: z.string().regex(/^[A-Za-z0-9\-\s]{3,10}$/, "3–10 chars"),
-  country: z.enum(["GB", "US", "DE", "AT", "FR", "IT", "ES"]),
-})
-export type Address = z.infer<typeof AddressSchema>
-
-type Props = {
-  initialData?: Partial<Address>
-  meta?: { title?: string; submitLabel?: string }
-  respond: (data: Address) => void // <-- called on successful submit
-}
-
-const COUNTRIES: Address["country"][] = [
-  "GB",
-  "US",
-  "DE",
-  "AT",
-  "FR",
-  "IT",
-  "ES",
-]
-
-export default function AddressForm({ initialData, meta, respond }: Props) {
-  const {
-    register,
-    handleSubmit,
-    formState: { errors },
-  } = useForm<Address>({
-    resolver: zodResolver(AddressSchema),
-    defaultValues: {
-      firstName: "",
-      lastName: "",
-      street: "",
-      city: "",
-      postalCode: "",
-      country: "GB",
-      ...initialData,
-    },
-  })
-
-  const onSubmit = (data: Address) => {
-    // Guaranteed to match AddressSchema
-    respond(data)
-  }
-
-  return (
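-    <form onSubmit={handleSubmit(onSubmit)}>
-      {meta?.title && <h2>{meta.title}</h2>}
-
-      {/* Section: Personal Information */}
-      <fieldset>
-        <legend>Personal Information</legend>
-
-        <div>
-          <label htmlFor="firstName">First name</label>
-          <input id="firstName" {...register("firstName")} />
-          {errors.firstName && <small>{errors.firstName.message}</small>}
-        </div>
-
-        <div>
-          <label htmlFor="lastName">Last name</label>
-          <input id="lastName" {...register("lastName")} />
-          {errors.lastName && <small>{errors.lastName.message}</small>}
-        </div>
-      </fieldset>
-
-      {/* Section: Address */}
-      <fieldset>
-        <legend>Address</legend>
-
-        <div>
-          <label htmlFor="street">Street address</label>
-          <input id="street" {...register("street")} />
-          {errors.street && <small>{errors.street.message}</small>}
-        </div>
-
-        <div>
-          <label htmlFor="city">City</label>
-          <input id="city" {...register("city")} />
-          {errors.city && <small>{errors.city.message}</small>}
-        </div>
-
-        <div>
-          <label htmlFor="postalCode">Postal code</label>
-          <input id="postalCode" {...register("postalCode")} />
-          {errors.postalCode && <small>{errors.postalCode.message}</small>}
-        </div>
-
-        <div>
-          <label htmlFor="country">Country</label>
-          <select id="country" {...register("country")}>
-            {COUNTRIES.map((code) => (
-              <option key={code} value={code}>
-                {code}
-              </option>
-            ))}
-          </select>
-          {errors.country && <small>{errors.country.message}</small>}
-        </div>
-      </fieldset>
-
-      <div>
-        <button type="submit">{meta?.submitLabel ?? "Submit"}</button>
-      </div>
-    </form>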
-
- ) -} -``` - -## Implementation Considerations - -### Client SDK Changes - -TypeScript SDK additions: - -* New `generateUserInterface` tool type -* UI generator registry for pluggable generators -* Validation layer for generated UI schemas -* Response handler for user-submitted data - -Python SDK additions: - -* Support for UI generation tool invocation -* Schema validation utilities -* Serialization for UI definitions - -### Integration Impact - -* All AG-UI integrations can leverage this capability without modification -* Frameworks emit standard tool calls; client handles UI generation -* Backward compatible with existing tool-based UI approaches - -## Use Cases - -### Dynamic Forms - -Agents can generate forms on-the-fly based on conversation context without -pre-defined schemas. - -### Data Visualization - -Generate charts, graphs, or tables appropriate to the data being discussed. - -### Interactive Workflows - -Create multi-step wizards or guided processes tailored to user needs. - -### Adaptive Interfaces - -Generate different UI layouts based on user preferences or device capabilities. - -## Testing Strategy - -* Unit tests for tool injection and invocation -* Integration tests with multiple UI generators -* E2E tests demonstrating various UI types -* Performance benchmarks comparing single vs. two-step generation -* Cross-provider compatibility testing - -## References - -* [AG-UI Tools Documentation](/concepts/tools) -* [JSON Schema](https://json-schema.org/) -* [React Hook Form](https://react-hook-form.com/) -* [JSON Forms](https://jsonforms.io/) - - -# Interrupt-Aware Run Lifecycle -Source: https://docs.ag-ui.com/drafts/interrupts - -Native support for human-in-the-loop pauses and interrupts - -# Interrupt-Aware Run Lifecycle Proposal - -## Summary - -### Problem Statement - -Agents often need to pause execution to request human approval, gather -additional input, or confirm potentially risky actions. 
Currently, there's no -standardized way to handle these interruptions across different agent -frameworks. - -### Motivation - -Support **human-in-the-loop pauses** (and related mechanisms) natively in AG-UI -and CopilotKit. This enables compatibility with various framework interrupts, -workflow suspend/resume, and other framework-specific pause mechanisms. - -## Status - -* **Status**: Draft -* **Author(s)**: Markus Ecker ([mail@mme.xyz](mailto:mail@mme.xyz)) - -## Overview - -This proposal introduces a standardized interrupt/resume pattern: - -```mermaid theme={null} -sequenceDiagram - participant Agent - participant Client as Client App - - Agent-->>Client: RUN_FINISHED { outcome: "interrupt", interrupt:{ id, reason, payload }} - Client-->>Agent: RunAgentInput.resume { threadId, interruptId, payload } - Agent-->>Client: RUN_FINISHED { outcome: "success", result } -``` - -## Detailed Specification - -### Updates to RUN\_FINISHED Event - -```typescript theme={null} -type RunFinishedOutcome = "success" | "interrupt" - -type RunFinished = { - type: "RUN_FINISHED" - - // ... existing fields - - outcome?: RunFinishedOutcome // optional for back-compat (see rules below) - - // Present when outcome === "success" (or when outcome omitted and interrupt is absent) - result?: any - - // Present when outcome === "interrupt" (or when outcome omitted and interrupt is present) - interrupt?: { - id?: string // id can be set when needed - reason?: string // e.g. "human_approval" | "upload_required" | "policy_hold" - payload?: any // arbitrary JSON for UI (forms, proposals, diffs, etc.) - } -} -``` - -When a run finishes with `outcome == "interrupt"`, the agent indicates that on -the next run, a value needs to be provided to continue. - -### Updates to RunAgentInput - -```typescript theme={null} -type RunAgentInput = { - // ... 
existing fields - - // NEW: resume channel for continuing a suspension - resume?: { - interruptId?: string // echo back if one was provided - payload?: any // arbitrary JSON: approvals, edits, files-as-refs, etc. - } -} -``` - -### Contract Rules - -* Resume requests **must** use the same `threadId` -* When given in the `interrupt`, the `interruptId` must be provided via - `RunAgentInput` -* Agents should handle missing or invalid resume payloads gracefully - -## Implementation Examples - -### Minimal Interrupt/Resume - -**Agent sends interrupt:** - -```json theme={null} -{ - "type": "RUN_FINISHED", - "threadId": "t1", - "runId": "r1", - "outcome": "interrupt", - "interrupt": { - "id": "int-abc123", - "reason": "human_approval", - "payload": { - "proposal": { - "tool": "sendEmail", - "args": { "to": "a@b.com", "subject": "Hi", "body": "…" } - } - } - } -} -``` - -**User responds:** - -```json theme={null} -{ - "threadId": "t1", - "runId": "r2", - "resume": { - "interruptId": "int-abc123", - "payload": { "approved": true } - } -} -``` - -### Complex Approval Flow - -**Agent requests approval with context:** - -```json theme={null} -{ - "type": "RUN_FINISHED", - "threadId": "thread-456", - "runId": "run-789", - "outcome": "interrupt", - "interrupt": { - "id": "approval-001", - "reason": "database_modification", - "payload": { - "action": "DELETE", - "table": "users", - "affectedRows": 42, - "query": "DELETE FROM users WHERE last_login < '2023-01-01'", - "rollbackPlan": "Restore from backup snapshot-2025-01-23", - "riskLevel": "high" - } - } -} -``` - -**User approves with modifications:** - -```json theme={null} -{ - "threadId": "thread-456", - "runId": "run-790", - "resume": { - "interruptId": "approval-001", - "payload": { - "approved": true, - "modifications": { - "batchSize": 10, - "dryRun": true - } - } - } -} -``` - -## Use Cases - -### Human Approval - -Agents pause before executing sensitive operations (sending emails, making -purchases, deleting data). 
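As a rough illustration of the human-approval case, the sketch below shows how a client might detect an interrupted run and build the follow-up input. The local type names and the `isInterrupted` and `buildResume` helpers are hypothetical; they only mirror the field shapes given in this proposal and are not taken from any published SDK.

```typescript
// Local types that mirror the field shapes in this proposal (assumed, not SDK exports).
type Interrupt = { id?: string; reason?: string; payload?: unknown }

type RunFinished = {
  type: "RUN_FINISHED"
  threadId: string
  runId: string
  outcome?: "success" | "interrupt"
  result?: unknown
  interrupt?: Interrupt
}

type ResumeInput = {
  threadId: string
  runId: string
  resume: { interruptId?: string; payload?: unknown }
}

// Back-compat rule: when `outcome` is omitted, the presence of `interrupt` decides.
function isInterrupted(event: RunFinished): boolean {
  if (event.outcome !== undefined) return event.outcome === "interrupt"
  return event.interrupt !== undefined
}

// Hypothetical helper: builds the input for the run that resumes an interrupt.
function buildResume(event: RunFinished, nextRunId: string, payload: unknown): ResumeInput {
  if (!isInterrupted(event)) throw new Error("run did not end in an interrupt")
  const resume: ResumeInput["resume"] = { payload }
  // Echo the interrupt id back only when the agent provided one.
  if (event.interrupt?.id !== undefined) resume.interruptId = event.interrupt.id
  // Resume requests must reuse the same threadId.
  return { threadId: event.threadId, runId: nextRunId, resume }
}

const finished: RunFinished = {
  type: "RUN_FINISHED",
  threadId: "t1",
  runId: "r1",
  outcome: "interrupt",
  interrupt: { id: "int-abc123", reason: "human_approval", payload: { tool: "sendEmail" } },
}

const resumeInput = buildResume(finished, "r2", { approved: true })
console.log(JSON.stringify(resumeInput))
```

Keeping the back-compat check in one function means callers never have to inspect `outcome` and `interrupt` separately.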
- -### Information Gathering - -Agent requests additional context or files from the user mid-execution. - -### Policy Enforcement - -Automatic pauses triggered by organizational policies or compliance -requirements. - -### Multi-Step Wizards - -Complex workflows where each step requires user confirmation or input. - -### Error Recovery - -Agent pauses when encountering an error, allowing user to provide guidance. - -## Implementation Considerations - -### Client SDK Changes - -TypeScript SDK: - -* Extended `RunFinishedEvent` type with outcome and interrupt fields -* Updated `RunAgentInput` with resume field -* Helper methods for interrupt handling - -Python SDK: - -* Extended `RunFinishedEvent` class -* Updated `RunAgentInput` with resume support -* Interrupt state management utilities - -### Framework Integration - -**Planning Frameworks:** - -* Map framework interrupts to AG-UI interrupt events -* Handle resume payloads in execution continuation - -**Workflow Systems:** - -* Convert workflow suspensions to AG-UI interrupts -* Resume workflow execution with provided payload - -**Custom Frameworks:** - -* Provide interrupt/resume adapter interface -* Documentation for integration patterns - -### UI Considerations - -* Standard components for common interrupt reasons -* Customizable interrupt UI based on payload -* Clear indication of pending interrupts -* History of interrupt/resume actions - -## Testing Strategy - -* Unit tests for interrupt/resume serialization -* Integration tests with multiple frameworks -* E2E tests demonstrating various interrupt scenarios -* State consistency tests across interrupt boundaries -* Performance tests for rapid interrupt/resume cycles - -## References - -* [AG-UI Events Documentation](/concepts/events) -* [AG-UI State Management](/concepts/state) - - -# Meta Events -Source: https://docs.ag-ui.com/drafts/meta-events - -Annotations and signals independent of agent runs - -# Meta Events Proposal - -## Summary - -### Problem 
Statement - -Currently, AG-UI events are tightly coupled to agent runs. There's no -standardized way to attach user feedback, annotations, or external signals to -the event stream that are independent of the agent's execution lifecycle. - -### Motivation - -AG-UI is extended with **MetaEvents**, a new class of events that can occur at -any point in the event stream, independent of agent runs. MetaEvents provide a -way to attach annotations, signals, or feedback to a serialized stream. They may -originate from users, clients, or external systems rather than from agents. -Examples include reactions such as thumbs up/down on a message. - -## Status - -* **Status**: Draft -* **Author(s)**: Markus Ecker ([mail@mme.xyz](mailto:mail@mme.xyz)) - -## Detailed Specification - -### Overview - -This proposal introduces: - -* A new **MetaEvent** type for side-band annotations -* Events that can appear anywhere in the stream -* Support for user feedback, tags, and external annotations -* Extensible payload structure for application-specific data - -## New Type: MetaEvent - -```typescript theme={null} -type MetaEvent = BaseEvent & { - type: EventType.META - /** - * Application-defined type of the meta event. - * Examples: "thumbs_up", "thumbs_down", "tag", "note" - */ - metaType: string - - /** - * Application-defined payload. - * May reference other entities (e.g., messageId) or contain freeform data. 
-   */
-  payload: Record<string, any>
-}
-```
-
-### Key Characteristics
-
-* **Run-independent**: MetaEvents are not tied to any specific run lifecycle
-* **Position-flexible**: Can appear before, between, or after runs
-* **Origin-diverse**: May come from users, clients, or external systems
-* **Extensible**: Applications define their own metaType values and payload
-  schemas
-
-## Implementation Examples
-
-### User Feedback
-
-**Thumbs Up:**
-
-```json theme={null}
-{
-  "id": "evt_123",
-  "ts": 1714063982000,
-  "type": "META",
-  "metaType": "thumbs_up",
-  "payload": {
-    "messageId": "msg_456",
-    "userId": "user_789"
-  }
-}
-```
-
-**Thumbs Down with Reason:**
-
-```json theme={null}
-{
-  "id": "evt_124",
-  "ts": 1714063985000,
-  "type": "META",
-  "metaType": "thumbs_down",
-  "payload": {
-    "messageId": "msg_456",
-    "userId": "user_789",
-    "reason": "inaccurate",
-    "comment": "The calculation seems incorrect"
-  }
-}
-```
-
-### Annotations
-
-**User Note:**
-
-```json theme={null}
-{
-  "id": "evt_789",
-  "ts": 1714064001000,
-  "type": "META",
-  "metaType": "note",
-  "payload": {
-    "text": "Important question to revisit",
-    "relatedRunId": "run_001",
-    "author": "user_123"
-  }
-}
-```
-
-**Tag Assignment:**
-
-```json theme={null}
-{
-  "id": "evt_890",
-  "ts": 1714064100000,
-  "type": "META",
-  "metaType": "tag",
-  "payload": {
-    "tags": ["important", "follow-up"],
-    "threadId": "thread_001"
-  }
-}
-```
-
-### External System Events
-
-**Analytics Event:**
-
-```json theme={null}
-{
-  "id": "evt_901",
-  "ts": 1714064200000,
-  "type": "META",
-  "metaType": "analytics",
-  "payload": {
-    "event": "conversation_shared",
-    "properties": {
-      "shareMethod": "link",
-      "recipientCount": 3
-    }
-  }
-}
-```
-
-**Moderation Flag:**
-
-```json theme={null}
-{
-  "id": "evt_902",
-  "ts": 1714064300000,
-  "type": "META",
-  "metaType": "moderation",
-  "payload": {
-    "action": "flag",
-    "messageId": "msg_999",
-    "category": "inappropriate_content",
-    "confidence": 0.95
-  }
-}
-```
-
-## Common
Meta Event Types - -While applications can define their own types, these are commonly used: - -| MetaType | Description | Typical Payload | -| ------------- | ----------------- | ---------------------------------- | -| `thumbs_up` | Positive feedback | `{ messageId, userId }` | -| `thumbs_down` | Negative feedback | `{ messageId, userId, reason? }` | -| `note` | User annotation | `{ text, relatedId?, author }` | -| `tag` | Categorization | `{ tags[], targetId }` | -| `bookmark` | Save for later | `{ messageId, userId }` | -| `copy` | Content copied | `{ messageId, content }` | -| `share` | Content shared | `{ messageId, method }` | -| `rating` | Numeric rating | `{ messageId, rating, maxRating }` | - -## Use Cases - -### User Feedback Collection - -Capture user reactions to agent responses for quality improvement. - -### Conversation Annotation - -Allow users to add notes, tags, or bookmarks to important parts of -conversations. - -### Analytics and Tracking - -Record user interactions and behaviors without affecting agent execution. - -### Content Moderation - -Flag or mark content for review by external moderation systems. - -### Collaborative Features - -Enable multiple users to annotate or comment on shared conversations. - -### Audit Trail - -Create a complete record of all interactions, not just agent responses. 
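To make the feedback-collection use case concrete, here is a minimal sketch of how a client might pick meta events out of a mixed stream and tally reactions per message. The event shapes are assumed from the JSON examples in this proposal, `RunEvent` is a reduced stand-in for run-lifecycle events, and `metaEvents` and `feedbackScore` are hypothetical helpers rather than SDK functions.

```typescript
// Event shapes assumed from the JSON examples in this proposal.
type MetaEvent = {
  id: string
  ts: number
  type: "META"
  metaType: string
  payload: Record<string, unknown>
}

// Stand-in for run-lifecycle events, reduced to what this sketch needs.
type RunEvent = { type: "RUN_STARTED" | "RUN_FINISHED"; runId: string }

type StreamEvent = MetaEvent | RunEvent

// Hypothetical helper: keeps only the meta events of a mixed stream.
function metaEvents(stream: StreamEvent[]): MetaEvent[] {
  return stream.filter((e): e is MetaEvent => e.type === "META")
}

// Hypothetical helper: thumbs-up count minus thumbs-down count for one message.
function feedbackScore(stream: StreamEvent[], messageId: string): number {
  let score = 0
  for (const e of metaEvents(stream)) {
    if (e.payload["messageId"] !== messageId) continue
    if (e.metaType === "thumbs_up") score += 1
    if (e.metaType === "thumbs_down") score -= 1
  }
  return score
}

// Meta events may appear before, between, or after runs.
const stream: StreamEvent[] = [
  { type: "RUN_STARTED", runId: "run_001" },
  { id: "evt_1", ts: 1714063982000, type: "META", metaType: "thumbs_up", payload: { messageId: "msg_456", userId: "user_789" } },
  { type: "RUN_FINISHED", runId: "run_001" },
  { id: "evt_2", ts: 1714063983000, type: "META", metaType: "thumbs_up", payload: { messageId: "msg_456", userId: "user_790" } },
  { id: "evt_3", ts: 1714063985000, type: "META", metaType: "thumbs_down", payload: { messageId: "msg_456", userId: "user_791", reason: "inaccurate" } },
  { id: "evt_4", ts: 1714063990000, type: "META", metaType: "thumbs_down", payload: { messageId: "msg_999", userId: "user_789" } },
]

console.log(feedbackScore(stream, "msg_456"))
```

Because the payload is application-defined, the sketch reads `messageId` defensively instead of assuming every meta event carries one.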
- -## Implementation Considerations - -### Client SDK Changes - -TypeScript SDK: - -* New `MetaEvent` type in `@ag-ui/core` -* Helper functions for common meta event types -* MetaEvent filtering and querying utilities - -Python SDK: - -* `MetaEvent` class implementation -* Meta event builders for common types -* Event stream filtering capabilities - -## Testing Strategy - -* Unit tests for MetaEvent creation and validation -* Integration tests with mixed event streams -* Performance tests with high-volume meta events -* Security tests for payload validation - -## References - -* [AG-UI Events Documentation](/concepts/events) -* [Event Sourcing](https://martinfowler.com/eaaDev/EventSourcing.html) -* [CQRS Pattern](https://martinfowler.com/bliki/CQRS.html) - - -# Multi-modal Messages -Source: https://docs.ag-ui.com/drafts/multimodal-messages - -Support for multimodal input messages including text, images, audio, video, and documents - -# Multi-modal Messages Proposal - -## Summary - -### Problem Statement - -Current AG-UI protocol only supports text-based user messages. As LLMs -increasingly support multimodal inputs (images, audio, files), the protocol -needs to evolve to handle these richer input types. - -### Motivation - -Evolve AG-UI to support **multimodal input messages** without breaking existing -apps. Inputs may include text, images, audio, video, and documents. Each -modality is represented as a distinct, typed content part with a clear source -discriminator (`data` for inline base64, `url` for references), making it -straightforward to map to any LLM provider's API. - -## Status - -* **Status**: Implemented — October 16, 2025 -* **Author(s)**: Markus Ecker ([mail@mme.xyz](mailto:mail@mme.xyz)), Alem Tuzlak ([t.zlak97@gmail.com](mailto:t.zlak97@gmail.com)) - -## Detailed Specification - -### Overview - -Extend the `UserMessage` `content` property to be either a string or an array of -`InputContentPart` objects. 
Each modality (image, audio, video, document) has
-its own dedicated part type with a typed `source` that is either inline `data`
-or a `url` reference. This makes it trivial to map content parts to any LLM
-provider's API.
-
-```typescript theme={null}
-/**
- * Supported input modality types for multimodal content.
- */
-type Modality = "text" | "image" | "audio" | "video" | "document"
-
-// ── Source types ──────────────────────────────────────────────
-
-interface InputContentDataSource {
-  /** Indicates this is inline data content. */
-  type: "data"
-  /** The base64-encoded content value. */
-  value: string
-  /** MIME type of the content (e.g., "image/png", "audio/wav"). Required. */
-  mimeType: string
-}
-
-interface InputContentUrlSource {
-  /** Indicates this is URL-referenced content. */
-  type: "url"
-  /** HTTP(S) URL or data URI pointing to the content. */
-  value: string
-  /** Optional MIME type hint for when it can't be inferred from the URL. */
-  mimeType?: string
-}
-
-type InputContentSource = InputContentDataSource | InputContentUrlSource
-
-// ── Content part types ────────────────────────────────────────
-
-interface TextInputPart {
-  type: "text"
-  /** The text content. */
-  text: string
-}
-
-interface ImageInputPart<TMetadata = unknown> {
-  type: "image"
-  /** Source of the image content. */
-  source: InputContentSource
-  /** Provider-specific metadata (e.g., OpenAI detail: "auto" | "low" | "high"). */
-  metadata?: TMetadata
-}
-
-interface AudioInputPart<TMetadata = unknown> {
-  type: "audio"
-  /** Source of the audio content. */
-  source: InputContentSource
-  /** Provider-specific metadata (e.g., format, sample rate). */
-  metadata?: TMetadata
-}
-
-interface VideoInputPart<TMetadata = unknown> {
-  type: "video"
-  /** Source of the video content. */
-  source: InputContentSource
-  /** Provider-specific metadata (e.g., duration, resolution). */
-  metadata?: TMetadata
-}
-
-interface DocumentInputPart<TMetadata = unknown> {
-  type: "document"
-  /** Source of the document content.
*/ - source: InputContentSource - /** Provider-specific metadata (e.g., Anthropic media_type for PDFs). */ - metadata?: TMetadata -} - -type InputContentPart = - | TextInputPart - | ImageInputPart - | AudioInputPart - | VideoInputPart - | DocumentInputPart - -// ── Updated UserMessage ─────────────────────────────────────── - -type UserMessage = { - id: string - role: "user" - content: string | InputContentPart[] - name?: string -} -``` - -### Modality Type - -The `Modality` type enumerates the supported content modalities: - -| Value | Description | -| ------------ | ------------------------------------------ | -| `"text"` | Plain text content | -| `"image"` | Image content (JPEG, PNG, GIF, WebP, etc.) | -| `"audio"` | Audio content (WAV, MP3, OGG, etc.) | -| `"video"` | Video content (MP4, WebM, etc.) | -| `"document"` | Document content (PDF, DOCX, XLSX, etc.) | - -### Source Types - -Every non-text content part carries a `source` property that describes how the -content is delivered. The source is a discriminated union with two variants: - -#### InputContentDataSource - -Inline base64-encoded content. - -| Property | Type | Required | Description | -| ---------- | -------- | -------- | ----------------------------------------------- | -| `type` | `"data"` | ✓ | Discriminator for inline data | -| `value` | `string` | ✓ | Base64-encoded content | -| `mimeType` | `string` | ✓ | MIME type (required to ensure correct handling) | - -#### InputContentUrlSource - -URL-referenced content. - -| Property | Type | Required | Description | -| ---------- | --------- | -------- | ------------------------------- | -| `type` | `"url"` | ✓ | Discriminator for URL reference | -| `value` | `string` | ✓ | HTTP(S) URL or data URI | -| `mimeType` | `string?` | | Optional MIME type hint | - -### Content Part Types - -#### TextInputPart - -Represents plain text content within a multimodal message. 
- -| Property | Type | Description | -| -------- | -------- | ------------------------------- | -| `type` | `"text"` | Identifies this as text content | -| `text` | `string` | The text content | - -#### ImageInputPart - -Represents image content. Maps directly to provider image inputs (e.g., OpenAI -vision, Anthropic image blocks). - -| Property | Type | Description | -| ---------- | -------------------- | -------------------------------------------------------- | -| `type` | `"image"` | Identifies this as image content | -| `source` | `InputContentSource` | Either inline data or URL reference | -| `metadata` | `TMetadata?` | Provider-specific metadata (e.g., OpenAI `detail` level) | - -#### AudioInputPart - -Represents audio content. - -| Property | Type | Description | -| ---------- | -------------------- | ------------------------------------------------------ | -| `type` | `"audio"` | Identifies this as audio content | -| `source` | `InputContentSource` | Either inline data or URL reference | -| `metadata` | `TMetadata?` | Provider-specific metadata (e.g., format, sample rate) | - -#### VideoInputPart - -Represents video content. - -| Property | Type | Description | -| ---------- | -------------------- | ------------------------------------------------------- | -| `type` | `"video"` | Identifies this as video content | -| `source` | `InputContentSource` | Either inline data or URL reference | -| `metadata` | `TMetadata?` | Provider-specific metadata (e.g., duration, resolution) | - -#### DocumentInputPart - -Represents document content such as PDFs, Word documents, or spreadsheets. 
- -| Property | Type | Description | -| ---------- | -------------------- | --------------------------------------------------------- | -| `type` | `"document"` | Identifies this as document content | -| `source` | `InputContentSource` | Either inline data or URL reference | -| `metadata` | `TMetadata?` | Provider-specific metadata (e.g., Anthropic `media_type`) | - -### Provider Metadata - -The generic `metadata` field on each content part allows provider-specific -information to flow through the protocol without polluting the core schema. -Examples: - -* **OpenAI**: `ImageInputPart<{ detail: 'auto' | 'low' | 'high' }>` -* **Anthropic**: `DocumentInputPart<{ media_type: 'application/pdf' }>` -* **Custom**: Any provider can define its own metadata shape - -## Implementation Examples - -### Simple Text Message (Backward Compatible) - -```json theme={null} -{ - "id": "msg-001", - "role": "user", - "content": "What's in this image?" -} -``` - -### Image with Inline Data - -```json theme={null} -{ - "id": "msg-002", - "role": "user", - "content": [ - { - "type": "text", - "text": "What's in this image?" - }, - { - "type": "image", - "source": { - "type": "data", - "value": "/9j/4AAQSkZJRg...", - "mimeType": "image/jpeg" - } - } - ] -} -``` - -### Image with URL Reference - -```json theme={null} -{ - "id": "msg-003", - "role": "user", - "content": [ - { - "type": "text", - "text": "What's in this image?" - }, - { - "type": "image", - "source": { - "type": "url", - "value": "https://example.com/photo.png" - }, - "metadata": { - "detail": "high" - } - } - ] -} -``` - -### Multiple Images with Question - -```json theme={null} -{ - "id": "msg-004", - "role": "user", - "content": [ - { - "type": "text", - "text": "What are the differences between these images?" 
- }, - { - "type": "image", - "source": { - "type": "url", - "value": "https://example.com/image1.png", - "mimeType": "image/png" - } - }, - { - "type": "image", - "source": { - "type": "url", - "value": "https://example.com/image2.png", - "mimeType": "image/png" - } - } - ] -} -``` - -### Audio Transcription Request - -```json theme={null} -{ - "id": "msg-005", - "role": "user", - "content": [ - { - "type": "text", - "text": "Please transcribe this audio recording" - }, - { - "type": "audio", - "source": { - "type": "url", - "value": "https://example.com/meeting-recording.wav", - "mimeType": "audio/wav" - } - } - ] -} -``` - -### Document Analysis - -```json theme={null} -{ - "id": "msg-006", - "role": "user", - "content": [ - { - "type": "text", - "text": "Summarize the key points from this PDF" - }, - { - "type": "document", - "source": { - "type": "url", - "value": "https://example.com/reports/q4-2024.pdf", - "mimeType": "application/pdf" - } - } - ] -} -``` - -### Video Analysis - -```json theme={null} -{ - "id": "msg-007", - "role": "user", - "content": [ - { - "type": "text", - "text": "Describe what happens in this video" - }, - { - "type": "video", - "source": { - "type": "url", - "value": "https://example.com/demo.mp4", - "mimeType": "video/mp4" - }, - "metadata": { - "duration": 120 - } - } - ] -} -``` - -### Mixed Modalities - -```json theme={null} -{ - "id": "msg-008", - "role": "user", - "content": [ - { - "type": "text", - "text": "Compare the screenshot with the design spec" - }, - { - "type": "image", - "source": { - "type": "data", - "value": "iVBORw0KGgo...", - "mimeType": "image/png" - } - }, - { - "type": "document", - "source": { - "type": "url", - "value": "https://example.com/design-spec.pdf", - "mimeType": "application/pdf" - } - } - ] -} -``` - -## Implementation Considerations - -### Client SDK Changes - -TypeScript SDK: - -* New `Modality` type and all `InputContentPart` types in `@ag-ui/core` -* `InputContentSource`, 
`InputContentDataSource`, `InputContentUrlSource` types -* Updated `UserMessage` with `content: string | InputContentPart[]` -* Helper methods for constructing typed content parts -* Provider-specific metadata generics on each content part type - -Python SDK: - -* Pydantic models for each content part type (`TextInputPart`, `ImageInputPart`, - etc.) -* `InputContentSource` discriminated union -* Updated `UserMessage` model -* Provider-specific metadata support via generics - -### Framework Integration - -Frameworks need to: - -* Parse typed `InputContentPart` parts and dispatch on `part.type` -* Map content parts to provider-specific formats (the typed structure makes this - straightforward) -* Use `source.type` to determine whether to send inline data or a URL to the - provider -* Forward `metadata` to providers that support it -* Handle fallbacks for models that don't support certain modalities -* Validate that `mimeType` is appropriate for the declared content part type - -## Use Cases - -### Visual Question Answering - -Users can upload images (`ImageInputPart`) and ask questions about them. - -### Document Processing - -Upload PDFs, Word documents, or spreadsheets (`DocumentInputPart`) for analysis. - -### Audio Transcription and Analysis - -Process voice recordings, podcasts, or meeting audio (`AudioInputPart`). - -### Video Understanding - -Analyze video content (`VideoInputPart`) for summaries, descriptions, or content -moderation. - -### Multi-modal Comparison - -Compare multiple images, documents, or mixed media using different content part -types in a single message. - -### Screenshot Analysis - -Share screenshots (`ImageInputPart`) for UI/UX feedback or debugging assistance. 
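The dispatch described under Framework Integration (branch on `part.type`, then on `source.type`) could look roughly like the sketch below. The content types are simplified local copies of the ones in this proposal, with the four media parts folded into one `MediaInputPart`. `ProviderPart` is a hypothetical provider-neutral target shape, not any real provider's request format.

```typescript
// Simplified local copies of the proposal's types (media parts folded into one).
type InputContentSource =
  | { type: "data"; value: string; mimeType: string }
  | { type: "url"; value: string; mimeType?: string }

type TextInputPart = { type: "text"; text: string }

type MediaInputPart = {
  type: "image" | "audio" | "video" | "document"
  source: InputContentSource
  metadata?: unknown
}

type InputContentPart = TextInputPart | MediaInputPart

// Hypothetical target shape; a real integration would emit the provider's own format.
type ProviderPart = { kind: string; text?: string; url?: string }

// Backward compatibility: a plain string becomes a single text part.
function normalizeContent(content: string | InputContentPart[]): InputContentPart[] {
  return typeof content === "string" ? [{ type: "text", text: content }] : content
}

// Inline data is turned into a data URI; URL sources pass through unchanged.
function sourceToUrl(source: InputContentSource): string {
  return source.type === "data"
    ? `data:${source.mimeType};base64,${source.value}`
    : source.value
}

// Dispatch on part.type first, then on source.type via sourceToUrl.
function toProviderParts(content: string | InputContentPart[]): ProviderPart[] {
  return normalizeContent(content).map((part): ProviderPart =>
    part.type === "text"
      ? { kind: "text", text: part.text }
      : { kind: part.type, url: sourceToUrl(part.source) },
  )
}

const mixed = toProviderParts([
  { type: "text", text: "Compare the screenshot with the design spec" },
  { type: "image", source: { type: "data", value: "iVBORw0KGgo...", mimeType: "image/png" } },
  {
    type: "document",
    source: { type: "url", value: "https://example.com/design-spec.pdf", mimeType: "application/pdf" },
  },
])

console.log(mixed)
```

Normalizing the string form first keeps the mapping code on a single path, so existing text-only messages need no special handling further down.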
- -## Testing Strategy - -* Unit tests for each `InputContentPart` type and `InputContentSource` variant -* Validate `source.type` discriminator correctly narrows the union -* Integration tests with multimodal LLMs (OpenAI, Anthropic, Google) -* Backward compatibility tests with plain `string` content -* Verify `metadata` passthrough for provider-specific fields -* Performance tests for large base64 payloads in `InputContentDataSource` -* Security tests for URL validation and content sanitization -* Type-safety tests ensuring generic `TMetadata` works across SDKs - -## References - -* [OpenAI Vision API](https://platform.openai.com/docs/guides/vision) -* [Anthropic Vision](https://docs.anthropic.com/en/docs/vision) -* [Vercel AI SDK — Multi-modal Content](https://sdk.vercel.ai/docs/ai-sdk-core/generating-text#multi-modal-content) -* [MIME Types](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types) -* [Data URLs](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URLs) - - -# Overview -Source: https://docs.ag-ui.com/drafts/overview - -Draft changes being considered for the AG-UI protocol - -# Overview - -This section contains draft changes being considered for the AG-UI protocol. These proposals are under internal review and may be modified or withdrawn before implementation. 
- -## Current Drafts - - - - Support for LLM reasoning visibility and continuity with encrypted content - - - - Support for multimodal input messages including images, audio, and files - - - - Native support for agent pauses requiring human approval or input - - - - AI-generated interfaces without requiring custom tool renderers - - - - Annotations and signals independent of agent runs - - - -## Status Definitions - -* **Draft** - Initial proposal under consideration -* **Under Review** - Active development and testing -* **Accepted** - Approved for implementation -* **Implemented** - Merged into the main protocol specification -* **Withdrawn** - Proposal has been withdrawn or superseded - - -# AG-UI Overview -Source: https://docs.ag-ui.com/introduction - - - -# The Agent–User Interaction (AG-UI) Protocol - -AG-UI is an open, lightweight, event-based protocol that standardizes how AI agents connect to user-facing applications. - -AG-UI is designed to be the general-purpose, bi-directional connection between a user-facing application and any agentic backend. - -Built for simplicity and flexibility, it standardizes how agent state, UI intents, and user interactions flow between your model/agent runtime and user-facing frontend applications—to allow application developers to ship reliable, debuggable, user‑friendly agentic features fast while focusing on application needs and avoiding complex ad-hoc wiring. - -
- -*** - -## Agentic Protocols - - - Confused about "A2UI" and "AG-UI"? That's understandable! Despite the naming similarities, they are quite different and work well together. A2UI is a [generative UI specification](./concepts/generative-ui-specs) - allowing agents to deliver UI widgets, where AG-UI is the Agent↔User Interaction protocol - which connects an agentic frontend to any agentic backend. [Learn more](https://copilotkit.ai/ag-ui-and-a2ui) - - -AG-UI is one of three prominent open [agentic protocols](./agentic-protocols). - -| **Layer** | **Protocol / Example** | **Purpose** | -| ---------------------------- | ------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------ | -| **Agent ↔ User Interaction** | **AG-UI
(Agent–User Interaction Protocol)** | The open, event-based standard that connects agents to user-facing applications — enabling real-time, multimodal, interactive experiences. | -| **Agent ↔ Tools & Data** | **MCP
(Model Context Protocol)** | Open standard (originated by Anthropic) that lets agents securely connect to external systems — tools, workflows, and data sources. | -| **Agent ↔ Agent** | **A2A
(Agent to Agent)** | Open standard (originated by Google) which defines how agents coordinate and share work across distributed agentic systems. | - -*** - -## Building blocks (today & upcoming) - -
-* **Streaming chat**: Live token and event streaming for responsive multi turn sessions, with cancel and resume.
-* **Multimodality**: Typed attachments and real time media (files, images, audio, transcripts); supports voice, previews, annotations, provenance.
-* **Generative UI, static**: Render model output as stable, typed components under app control.
-* **Generative UI, declarative**: Small declarative language for constrained yet open-ended agent UIs; agents propose trees and constraints, the app validates and mounts.
-* **Shared state**: (Read-only & read-write). Typed store shared between agent and app, with streamed event-sourced diffs and conflict resolution for snappy collaboration.
-* **Thinking steps**: Visualize intermediate reasoning from traces and tool events; no raw chain of thought.
-* **Frontend tool calls**: Typed handoffs from agent to frontend-executed actions, and back.
-* **Backend tool rendering**: Visualize backend tool outputs in app and chat, emit side effects as first-class events.
-* **Interrupts (human in the loop)**: Pause, approve, edit, retry, or escalate mid flow without losing state.
-* **Sub-agents and composition**: Nested delegation with scoped state, tracing, and cancellation.
-* **Agent steering**: Dynamically redirect agent execution with real-time user input to guide behavior and outcomes.
-* **Tool output streaming**: Stream tool results and logs so UIs can render long-running effects in real time.
-* **Custom events**: Open-ended data exchange for needs not covered by the protocol.
-
-***
-
-## Why Agentic Apps need AG-UI
-
-Agentic applications break the simple request/response model that dominated frontend-backend development in the pre-agentic era: a client makes a request, the server returns data, the client renders it, and the interaction ends.
-
-#### The requirements of user‑facing agents
-
-While agents are just software, they exhibit characteristics that make them challenging to serve behind traditional REST/GraphQL APIs:
-
-* Agents are **long‑running** and **stream** intermediate work—often across multi‑turn sessions.
-* Agents are **nondeterministic** and can **control application UI nondeterministically**.
-* Agents simultaneously mix **structured + unstructured IO** (e.g. text & voice, alongside tool calls and state updates).
-* Agents need user-interactive **composition**: e.g. they may call sub‑agents, often recursively.
-* And more...
-
-AG-UI is an event-based protocol that enables dynamic communication between agentic frontends and backends. It builds on top of the foundational protocols of the web (HTTP, WebSockets) as an abstraction layer designed for the agentic age—bridging the gap between traditional client-server architectures and the dynamic, stateful nature of AI agents.
-
-***
-
-## AG-UI in Action
-
-
- -
- - - You can see demo apps of the AG-UI features with the framework of your choice, with preview, code, and walkthrough docs in the [AG-UI Dojo](https://dojo.ag-ui.com/) - - -*** - -## Supported Integrations - -AG-UI was born from CopilotKit's initial **partnership** with LangGraph and CrewAI - and brings the incredibly popular agent-user-interactivity infrastructure to the wider agentic ecosystem. - -**1st party** = the platforms that have AG‑UI built in and provide documentation for guidance. - -### Direct to LLM - -| Framework | Status | AG-UI Resources | -| :------------ | --------- | ------------------------------------------------ | -| Direct to LLM | Supported | [Docs](https://docs.copilotkit.ai/direct-to-llm) | - -### Agent Framework - Partnerships - -| Framework | Status | AG-UI Resources | -| :----------------------------------------------- | --------- | --------------------------------------------------------------------------------------------------------------------- | -| [LangGraph](https://www.langchain.com/langgraph) | Supported | [Docs](https://docs.copilotkit.ai/langgraph/), [Demos](https://dojo.ag-ui.com/langgraph-fastapi/feature/shared_state) | -| [CrewAI](https://crewai.com/) | Supported | [Docs](https://docs.copilotkit.ai/crewai-flows), [Demos](https://dojo.ag-ui.com/crewai/feature/shared_state) | - -### Agent Framework - 1st Party - -| Framework | Status | AG-UI Resources | -| :--------------------------------------------------------------------------------------------------------- | ----------- | --------------------------------------------------------------------------------------------------------------------------------------------------- | -| [Microsoft Agent Framework](https://azure.microsoft.com/en-us/blog/introducing-microsoft-agent-framework/) | Supported | [Docs](https://docs.copilotkit.ai/microsoft-agent-framework), [Demos](https://dojo.ag-ui.com/microsoft-agent-framework-dotnet/feature/shared_state) | -| [Google 
ADK](https://google.github.io/adk-docs/get-started/) | Supported | [Docs](https://docs.copilotkit.ai/adk), [Demos](https://dojo.ag-ui.com/adk-middleware/feature/shared_state?openCopilot=true) | -| [AWS Strands Agents](https://github.com/strands-agents/sdk-python) | Supported | [Docs](https://docs.copilotkit.ai/aws-strands), [Demos](https://dojo.ag-ui.com/aws-strands/feature/shared_state) | -| [Mastra](https://mastra.ai/) | Supported | [Docs](https://docs.copilotkit.ai/mastra/), [Demos](https://dojo.ag-ui.com/mastra/feature/tool_based_generative_ui) | -| [Pydantic AI](https://github.com/pydantic/pydantic-ai) | Supported | [Docs](https://docs.copilotkit.ai/pydantic-ai/), [Demos](https://dojo.ag-ui.com/pydantic-ai/feature/shared_state) | -| [Agno](https://github.com/agno-agi/agno) | Supported | [Docs](https://docs.copilotkit.ai/agno/), [Demos](https://dojo.ag-ui.com/agno/feature/tool_based_generative_ui) | -| [LlamaIndex](https://github.com/run-llama/llama_index) | Supported | [Docs](https://docs.copilotkit.ai/llamaindex/), [Demos](https://dojo.ag-ui.com/llamaindex/feature/shared_state) | -| [AG2](https://ag2.ai/) | Supported | [Docs](https://docs.copilotkit.ai/ag2/) [Demos](https://dojo.ag-ui.com/ag2/feature/shared_state) | -| [AWS Bedrock Agents](https://aws.amazon.com/bedrock/agents/) | In Progress | – | - -### Agent Framework - Community - -| Framework | Status | AG-UI Resources | -| :----------------------------------------------------------------- | ----------- | --------------- | -| [OpenAI Agent SDK](https://openai.github.io/openai-agents-python/) | In Progress | – | -| [Cloudflare Agents](https://developers.cloudflare.com/agents/) | In Progress | – | - -### Agent Interaction Protocols - -| Protocol | Status | AG-UI Resources | Integrations | -| :------------------------------------------ | --------- | ----------------------------------------------- | ------------ | -| [A2A Middleware](https://a2a-protocol.org/) | Supported | 
[Docs](https://docs.copilotkit.ai/a2a-protocol) | Partnership | - -### Specification (standard) - -| Framework | Status | AG-UI Resources | -| :------------------------------------------------------- | --------- | ---------------------------------------------------------------------------------------------------------------------------------------------- | -| [Oracle Agent Spec](http://oracle.github.io/agent-spec/) | Supported | [Docs](https://go.copilotkit.ai/copilotkit-oracle-docs), [Demos](https://dojo.ag-ui.com/agent-spec-langgraph/feature/tool_based_generative_ui) | - -### SDKs - -| SDK | Status | AG-UI Resources | Integrations | -| :----------- | ----------- | ------------------------------------------------------------------------------------------------------------ | ------------ | -| [Kotlin]() | Supported | [Getting Started](https://github.com/ag-ui-protocol/ag-ui/blob/main/docs/sdk/kotlin/overview.mdx) | Community | -| [Golang]() | Supported | [Getting Started](https://github.com/ag-ui-protocol/ag-ui/blob/main/docs/sdk/go/overview.mdx) | Community | -| [Dart]() | Supported | [Getting Started](https://github.com/ag-ui-protocol/ag-ui/tree/main/sdks/community/dart) | Community | -| [Java]() | Supported | [Getting Started](https://github.com/ag-ui-protocol/ag-ui/blob/main/docs/sdk/java/overview.mdx) | Community | -| [Rust]() | Supported | [Getting Started](https://github.com/ag-ui-protocol/ag-ui/tree/main/sdks/community/rust/crates/ag-ui-client) | Community | -| [.NET]() | In Progress | [PR](https://github.com/ag-ui-protocol/ag-ui/pull/38) | Community | -| [Nim]() | In Progress | [PR](https://github.com/ag-ui-protocol/ag-ui/pull/29) | Community | -| [Flowise]() | In Progress | [GitHub Source](https://github.com/ag-ui-protocol/ag-ui/issues/367) | Community | -| [Langflow]() | In Progress | [GitHub Source](https://github.com/ag-ui-protocol/ag-ui/issues/366) | Community | - -### Clients - -| Client | Status | AG-UI Resources | Integrations | -| 
:----------------------------------------------------- | ----------- | ----------------------------------------------------------------------------- | ------------ | -| [CopilotKit](https://github.com/CopilotKit/CopilotKit) | Supported | [Getting Started](https://docs.copilotkit.ai/direct-to-llm/guides/quickstart) | 1st Party | -| [Terminal + Agent]() | Supported | [Getting Started](https://docs.ag-ui.com/quickstart/clients) | Community | -| [React Native](https://reactnative.dev/) | Help Wanted | [GitHub Source](https://github.com/ag-ui-protocol/ag-ui/issues/510) | Community | - -*** - -## Quick Start - -Choose the path that fits your needs: - - - - Build agentic applications powered by AG-UI compatible agents. - - - - Build integrations for new agent frameworks, custom in-house solutions, or use AG-UI without any agent framework. - - - - Build new clients for AG-UI-compatible agents (web, mobile, slack, messaging, etc.) - - - -## Explore AG-UI - -Dive deeper into AG-UI's core concepts and capabilities: - - - - Understand how AG-UI connects agents, protocols, and front-ends - - - - Learn about AG-UI's event-driven protocol - - - -## Resources - -Explore guides, tools, and integrations to help you build, optimize, and extend -your AG-UI implementation. These resources cover everything from practical -development workflows to debugging techniques. - - - - Use Cursor to build AG-UI implementations faster - - - - Fix common issues when working with AG-UI servers and clients - - - -## Contributing - -Want to contribute? Check out our -[Contributing Guide](/development/contributing) to learn how you can help -improve AG-UI. 
- -## Support and Feedback - -Here's how to get help or provide feedback: - -* For bug reports and feature requests related to the AG-UI specification, SDKs, - or documentation (open source), please - [create a GitHub issue](https://github.com/ag-ui-protocol/ag-ui/issues) -* For discussions or Q\&A about AG-UI, please join the [Discord community](https://discord.gg/Jd3FzfdJa8) - - -# Build applications -Source: https://docs.ag-ui.com/quickstart/applications - -Build agentic applications using compatible AG-UI event streams - -# Introduction - -AG-UI provides a concise, event-driven protocol that lets any agent stream rich, -structured output to any client. It can be used to connect any agentic system to -any client. - -A client is defined as any system that can receive, display, and respond to -AG-UI events. For more information on existing clients and integrations, see -the [integrations](/integrations) page. - -# Automatic Setup - -AG-UI provides a CLI tool to automatically create or scaffold a new application with any client and server. - -```sh theme={null} -npx create-ag-ui-app@latest -``` - - - -Once the setup is done, start the server with: - -```sh theme={null} -npm run dev -``` - -For the CopilotKit example you can head to [http://localhost:3000/copilotkit](http://localhost:3000/copilotkit) to see the app in action. - - -# Build clients -Source: https://docs.ag-ui.com/quickstart/clients - -Showcase: build a conversational CLI agent from scratch using AG-UI and Mastra - -# Introduction - -A client implementation allows you to **build conversational applications that -leverage AG-UI's event-driven protocol**. This approach creates a direct -interface between your users and AI agents, demonstrating direct access to the -AG-UI protocol. - -## When to use a client implementation - -Building your own client is useful if you want to explore/hack on the AG-UI -protocol.
For production use, use a full-featured client like -[CopilotKit](https://copilotkit.ai). - -## What you'll build - -In this guide, we'll create a CLI client that: - -1. Uses the `MastraAgent` from `@ag-ui/mastra` -2. Connects to OpenAI's GPT-4o model -3. Implements a weather tool for real-world functionality -4. Provides an interactive chat interface in the terminal - -Let's get started! - -## Prerequisites - -Before we begin, make sure you have: - -* [Node.js](https://nodejs.org/) **22.13.0 or later** -* An **OpenAI API key** -* [pnpm](https://pnpm.io/) package manager - -### 1. Provide your OpenAI API key - -First, let's set up your API key: - -```bash theme={null} -# Set your OpenAI API key -export OPENAI_API_KEY=your-api-key-here -``` - -### 2. Install pnpm - -If you don't have pnpm installed: - -```bash theme={null} -# Install pnpm -npm install -g pnpm -``` - -## Step 1 – Initialize your project - -Create a new directory for your AG-UI client: - -```bash theme={null} -mkdir my-ag-ui-client -cd my-ag-ui-client -``` - -Initialize a new Node.js project: - -```bash theme={null} -pnpm init -``` - -### Set up TypeScript and basic configuration - -Install TypeScript and essential development dependencies: - -```bash theme={null} -pnpm add -D typescript @types/node tsx -``` - -Create a `tsconfig.json` file: - -```json theme={null} -{ - "compilerOptions": { - "target": "ES2022", - "module": "commonjs", - "lib": ["ES2022"], - "outDir": "./dist", - "rootDir": "./src", - "strict": true, - "esModuleInterop": true, - "skipLibCheck": true, - "forceConsistentCasingInFileNames": true, - "resolveJsonModule": true - }, - "include": ["src/**/*"], - "exclude": ["node_modules", "dist"] -} -``` - -Update your `package.json` scripts: - -```json theme={null} -{ - "scripts": { - "start": "tsx src/index.ts", - "dev": "tsx --watch src/index.ts", - "build": "tsc", - "clean": "rm -rf dist" - } -} -``` - -## Step 2 – Install AG-UI and dependencies - -Install the core AG-UI packages and 
dependencies: - -```bash theme={null} -# Core AG-UI packages -pnpm add @ag-ui/client @ag-ui/core @ag-ui/mastra - -# Mastra ecosystem packages -pnpm add @mastra/core @mastra/client-js @mastra/memory @mastra/libsql - -# Mastra peer dependencies -pnpm add zod -``` - -## Step 3 – Create your agent - -Let's create a basic conversational agent. Create `src/agent.ts`: - -```typescript theme={null} -import { Agent } from "@mastra/core/agent" -import { MastraAgent } from "@ag-ui/mastra" -import { Memory } from "@mastra/memory" -import { LibSQLStore } from "@mastra/libsql" - -export const agent = new MastraAgent({ - resourceId: "cliExample", - agent: new Agent({ - id: "ag-ui-assistant", - name: "AG-UI Assistant", - instructions: ` - You are a helpful AI assistant. Be friendly, conversational, and helpful. - Answer questions to the best of your ability and engage in natural conversation. - `, - model: "openai/gpt-4o", - memory: new Memory({ - storage: new LibSQLStore({ - id: "storage-memory", - url: "file:./assistant.db", - }), - }), - }), - threadId: "main-conversation", -}) -``` - -### What's happening in the agent? - -1. **MastraAgent** – We wrap a Mastra Agent with the AG-UI protocol adapter -2. **Model Configuration** – We use OpenAI's GPT-4o for high-quality responses -3. **Memory Setup** – We configure persistent memory using LibSQL for - conversation context -4. **Instructions** – We give the agent basic guidelines for helpful - conversation - -## Step 4 – Create the CLI interface - -Now let's create the interactive chat interface. Create `src/index.ts`: - -```typescript theme={null} -import * as readline from "readline" -import { agent } from "./agent" -import { randomUUID } from "@ag-ui/client" - -const rl = readline.createInterface({ - input: process.stdin, - output: process.stdout, -}) - -async function chatLoop() { - console.log("🤖 AG-UI Assistant started!") - console.log("Type your messages and press Enter. 
Press Ctrl+D to quit.\n") - - return new Promise<void>((resolve) => { - const promptUser = () => { - rl.question("> ", async (input) => { - if (input.trim() === "") { - promptUser() - return - } - console.log("") - - // Pause input while processing - rl.pause() - - // Add user message to conversation - agent.messages.push({ - id: randomUUID(), - role: "user", - content: input.trim(), - }) - - try { - // Run the agent with event handlers - await agent.runAgent( - {}, // No additional configuration needed - { - onTextMessageStartEvent() { - process.stdout.write("🤖 Assistant: ") - }, - onTextMessageContentEvent({ event }) { - process.stdout.write(event.delta) - }, - onTextMessageEndEvent() { - console.log("\n") - }, - } - ) - } catch (error) { - console.error("❌ Error:", error) - } - - // Resume input - rl.resume() - promptUser() - }) - } - - // Handle Ctrl+D to quit - rl.on("close", () => { - console.log("\n👋 Thanks for using AG-UI Assistant!") - resolve() - }) - - promptUser() - }) -} - -async function main() { - await chatLoop() -} - -main().catch(console.error) -``` - -### What's happening in the CLI interface? - -1. **Readline Interface** – We create an interactive prompt for user input -2. **Message Management** – We add each user input to the agent's conversation - history -3. **Event Handling** – We listen to AG-UI events to provide real-time feedback -4. **Streaming Display** – We show the agent's response as it's being generated - -## Step 5 – Test your assistant - -Let's run your new AG-UI client: - -```bash theme={null} -pnpm dev -``` - -You should see: - -``` -🤖 AG-UI Assistant started! -Type your messages and press Enter. Press Ctrl+D to quit. - -> -``` - -Try asking questions like: - -* "Hello! How are you?" -* "What can you help me with?" -* "Tell me a joke" -* "Explain quantum computing in simple terms" - -You'll see the agent respond with streaming text in real-time!
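
The handlers in Step 4 print each delta as soon as it arrives. If you also want the complete text of a response (for logging or history, say), the same three events can be folded into a single string. The sketch below is a minimal illustration that does not import the AG-UI packages: the `TextEvent` type and the `collectMessages` helper are hypothetical stand-ins, and the field names (`messageId`, `delta`) are assumptions modeled on the handlers shown above rather than the definitive `@ag-ui/core` definitions.

```typescript
// Hypothetical, simplified event shapes for illustration only.
// The real event definitions live in @ag-ui/core and may differ.
type TextEvent =
  | { type: "TEXT_MESSAGE_START"; messageId: string }
  | { type: "TEXT_MESSAGE_CONTENT"; messageId: string; delta: string }
  | { type: "TEXT_MESSAGE_END"; messageId: string }

// Fold a sequence of text events into finished messages keyed by messageId.
function collectMessages(events: TextEvent[]): Map<string, string> {
  const inProgress = new Map<string, string>()
  const finished = new Map<string, string>()

  for (const event of events) {
    switch (event.type) {
      case "TEXT_MESSAGE_START":
        // A new message begins with an empty buffer
        inProgress.set(event.messageId, "")
        break
      case "TEXT_MESSAGE_CONTENT":
        // Append each streamed chunk to the buffer for its message
        inProgress.set(
          event.messageId,
          (inProgress.get(event.messageId) ?? "") + event.delta
        )
        break
      case "TEXT_MESSAGE_END":
        // Move the completed buffer to the finished map
        finished.set(event.messageId, inProgress.get(event.messageId) ?? "")
        inProgress.delete(event.messageId)
        break
    }
  }

  return finished
}

// Simulated stream: one message delivered in two chunks
const sample: TextEvent[] = [
  { type: "TEXT_MESSAGE_START", messageId: "m1" },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "m1", delta: "Hello, " },
  { type: "TEXT_MESSAGE_CONTENT", messageId: "m1", delta: "world" },
  { type: "TEXT_MESSAGE_END", messageId: "m1" },
]

const messages = collectMessages(sample)
console.log(messages.get("m1")) // prints "Hello, world"
```

Keying the buffers by message ID keeps the sketch correct even if chunks from more than one message are interleaved; a message that never receives its end event simply stays out of the finished map.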
- -## Step 6 – Understanding the AG-UI event flow - -Let's break down what happens when you send a message: - -1. **User Input** – You type a question and press Enter -2. **Message Added** – Your input is added to the conversation history -3. **Agent Processing** – The agent analyzes your request and formulates a - response -4. **Response Generation** – The agent streams its response back -5. **Streaming Output** – You see the response appear word by word - -### Event types you're handling: - -* `onTextMessageStartEvent` – Agent starts responding -* `onTextMessageContentEvent` – Each chunk of the response -* `onTextMessageEndEvent` – Response is complete - -## Step 7 – Add tool functionality - -Now that you have a working chat interface, let's add some real-world -capabilities by creating tools. We'll start with a weather tool. - -### Create your first tool - -Let's create a weather tool that your agent can use. Create the directory -structure: - -```bash theme={null} -mkdir -p src/tools -``` - -Create `src/tools/weather.tool.ts`: - -```typescript theme={null} -import { createTool } from "@mastra/core/tools" -import { z } from "zod" - -interface GeocodingResponse { - results: { - latitude: number - longitude: number - name: string - }[] -} - -interface WeatherResponse { - current: { - time: string - temperature_2m: number - apparent_temperature: number - relative_humidity_2m: number - wind_speed_10m: number - wind_gusts_10m: number - weather_code: number - } -} - -export const weatherTool = createTool({ - id: "get-weather", - description: "Get current weather for a location", - inputSchema: z.object({ - location: z.string().describe("City name"), - }), - outputSchema: z.object({ - temperature: z.number(), - feelsLike: z.number(), - humidity: z.number(), - windSpeed: z.number(), - windGust: z.number(), - conditions: z.string(), - location: z.string(), - }), - execute: async (inputData) => { - return await getWeather(inputData.location) - }, -}) - -const getWeather = 
async (location: string) => { - const geocodingUrl = `https://geocoding-api.open-meteo.com/v1/search?name=${encodeURIComponent( - location - )}&count=1` - const geocodingResponse = await fetch(geocodingUrl) - const geocodingData = (await geocodingResponse.json()) as GeocodingResponse - - if (!geocodingData.results?.[0]) { - throw new Error(`Location '${location}' not found`) - } - - const { latitude, longitude, name } = geocodingData.results[0] - - const weatherUrl = `https://api.open-meteo.com/v1/forecast?latitude=${latitude}&longitude=${longitude}&current=temperature_2m,apparent_temperature,relative_humidity_2m,wind_speed_10m,wind_gusts_10m,weather_code` - - const response = await fetch(weatherUrl) - const data = (await response.json()) as WeatherResponse - - return { - temperature: data.current.temperature_2m, - feelsLike: data.current.apparent_temperature, - humidity: data.current.relative_humidity_2m, - windSpeed: data.current.wind_speed_10m, - windGust: data.current.wind_gusts_10m, - conditions: getWeatherCondition(data.current.weather_code), - location: name, - } -} - -function getWeatherCondition(code: number): string { - const conditions: Record<number, string> = { - 0: "Clear sky", - 1: "Mainly clear", - 2: "Partly cloudy", - 3: "Overcast", - 45: "Foggy", - 48: "Depositing rime fog", - 51: "Light drizzle", - 53: "Moderate drizzle", - 55: "Dense drizzle", - 56: "Light freezing drizzle", - 57: "Dense freezing drizzle", - 61: "Slight rain", - 63: "Moderate rain", - 65: "Heavy rain", - 66: "Light freezing rain", - 67: "Heavy freezing rain", - 71: "Slight snow fall", - 73: "Moderate snow fall", - 75: "Heavy snow fall", - 77: "Snow grains", - 80: "Slight rain showers", - 81: "Moderate rain showers", - 82: "Violent rain showers", - 85: "Slight snow showers", - 86: "Heavy snow showers", - 95: "Thunderstorm", - 96: "Thunderstorm with slight hail", - 99: "Thunderstorm with heavy hail", - } - return conditions[code] || "Unknown" -} -``` - -### What's happening in the weather tool? - -1.
**Tool Definition** – We use `createTool` from Mastra to define the tool's - interface -2. **Input Schema** – We specify that the tool accepts a location string -3. **Output Schema** – We define the structure of the weather data returned -4. **API Integration** – We fetch data from Open-Meteo's free weather API -5. **Data Processing** – We convert weather codes to human-readable conditions - -### Update your agent - -Now let's update our agent to use the weather tool. Update `src/agent.ts`: - -```typescript theme={null} -import { weatherTool } from "./tools/weather.tool" // <--- Import the tool - -export const agent = new MastraAgent({ - agent: new Agent({ - // ... - - tools: { weatherTool }, // <--- Add the tool to the agent - - // ... - }), - threadId: "main-conversation", -}) -``` - -### Update your CLI to handle tools - -Update your CLI interface in `src/index.ts` to handle tool events: - -```typescript theme={null} -// Add these new event handlers to your agent.runAgent call: -await agent.runAgent( - {}, // No additional configuration needed - { - // ... existing event handlers ... - - onToolCallStartEvent({ event }) { - console.log("🔧 Tool call:", event.toolCallName) - }, - onToolCallArgsEvent({ event }) { - process.stdout.write(event.delta) - }, - onToolCallEndEvent() { - console.log("") - }, - onToolCallResultEvent({ event }) { - if (event.content) { - console.log("🔍 Tool call result:", event.content) - } - }, - } -) -``` - -### Test your weather tool - -Now restart your application and try asking about weather: - -```bash theme={null} -pnpm dev -``` - -Try questions like: - -* "What's the weather like in London?" -* "How's the weather in Tokyo today?" -* "Is it raining in Seattle?" - -You'll see the agent use the weather tool to fetch real data and provide -detailed responses! - -## Step 8 – Add more functionality - -### Create a browser tool - -Let's add a web browsing capability. 
First install the `open` package: - -```bash theme={null} -pnpm add open -``` - -Create `src/tools/browser.tool.ts`: - -```typescript theme={null} -import { createTool } from "@mastra/core/tools" -import { z } from "zod" -import open from "open" - -export const browserTool = createTool({ - id: "open-browser", - description: "Open a URL in the default web browser", - inputSchema: z.object({ - url: z.url().describe("The URL to open"), - }), - outputSchema: z.object({ - success: z.boolean(), - message: z.string(), - }), - execute: async (inputData) => { - try { - await open(inputData.url) - return { - success: true, - message: `Opened ${inputData.url} in your default browser`, - } - } catch (error) { - return { - success: false, - message: `Failed to open browser: ${error}`, - } - } - }, -}) -``` - -### Update your agent with both tools - -Update `src/agent.ts` to include both tools: - -```typescript theme={null} -import { Agent } from "@mastra/core/agent" -import { MastraAgent } from "@ag-ui/mastra" -import { Memory } from "@mastra/memory" -import { LibSQLStore } from "@mastra/libsql" -import { weatherTool } from "./tools/weather.tool" -import { browserTool } from "./tools/browser.tool" - -export const agent = new MastraAgent({ - resourceId: "cliExample", - agent: new Agent({ - id: "ag-ui-assistant", - name: "AG-UI Assistant", - instructions: ` - You are a helpful assistant with weather and web browsing capabilities. - - For weather queries: - - Always ask for a location if none is provided - - Use the weatherTool to fetch current weather data - - For web browsing: - - Always use full URLs (e.g., "https://www.google.com") - - Use the browserTool to open web pages - - Be friendly and helpful in all interactions!
- `, - model: "openai/gpt-4o", - tools: { weatherTool, browserTool }, // Add both tools - memory: new Memory({ - storage: new LibSQLStore({ - id: "storage-memory", - url: "file:./assistant.db", - }), - }), - }), - threadId: "main-conversation", -}) -``` - -Now you can ask your assistant to open websites: "Open Google for me" or "Show -me the weather website". - -## Step 9 – Deploy your client - -### Building your client - -Create a production build: - -```bash theme={null} -pnpm build -``` - -### Create a startup script - -Add to your `package.json`: - -```json theme={null} -{ - "bin": { - "weather-assistant": "./dist/index.js" - } -} -``` - -Add a shebang to your built `dist/index.js`: - -```javascript theme={null} -#!/usr/bin/env node -// ... rest of your compiled code -``` - -Make it executable: - -```bash theme={null} -chmod +x dist/index.js -``` - -### Link globally - -Install your CLI globally: - -```bash theme={null} -pnpm link --global -``` - -Now you can run `weather-assistant` from anywhere! - -## Extending your client - -Your AG-UI client is now a solid foundation. Here are some ideas for -enhancement: - -### Add more tools - -* **Calculator tool** – For mathematical operations -* **File system tool** – For reading/writing files -* **API tools** – For connecting to other services -* **Database tools** – For querying data - -### Improve the interface - -* **Rich formatting** – Use libraries like `chalk` for colored output -* **Progress indicators** – Show loading states for long operations -* **Configuration files** – Allow users to customize settings -* **Command-line arguments** – Support different modes and options - -### Add persistence - -* **Conversation history** – Save and restore chat sessions -* **User preferences** – Remember user settings -* **Tool results caching** – Cache expensive API calls - -## Share your client - -Built something useful? Consider sharing it with the community: - -1. **Open source it** – Publish your code on GitHub -2. 
**Publish to npm** – Make it installable via `npm install` -3. **Create documentation** – Help others understand and extend your work -4. **Join discussions** – Share your experience in the - [AG-UI GitHub Discussions](https://github.com/orgs/ag-ui-protocol/discussions) - -## Conclusion - -You've built a complete AG-UI client from scratch! Your weather assistant -demonstrates the core concepts: - -* **Event-driven architecture** with real-time streaming -* **Tool integration** for real-world functionality -* **Conversation memory** for context retention -* **Interactive CLI interface** for user engagement - -From here, you can extend your client to support any use case – from simple CLI -tools to complex conversational applications. The AG-UI protocol provides the -foundation, and your creativity provides the possibilities. - -Happy building! 🚀 - - -# Introduction -Source: https://docs.ag-ui.com/quickstart/introduction - -Learn how to get started building an AG-UI integration - -