social-app/docs/plans/2026-03-07-agent-agui-full-alignment.md
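zl-q 120df903d2 feat: AG-UI protocol alignment and route navigation
- Frontend: add SSE streaming support, stateSnapshot events, and a route navigation tool
- Frontend: implement the tool-call approval flow, including display of the pending state
- Backend: refactor agent state management and session persistence
- Docs: add the agent-agui-full-alignment design document
- Tests: add related unit and integration tests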
2026-03-07 17:30:20 +08:00

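# AG-UI Full Alignment Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

**Goal:** The entire agent pipeline, frontend and backend, uses AG-UI as its single protocol format; complete the run/resume/SSE/history/tool-approval loop; and unify how the frontend connects to and parses the real API and the mock API.

**Architecture:** The backend `RunAgentInput` plus the AG-UI event model is the single source of truth. The frontend calls the same set of `/agent/*` endpoints through one API client and consumes the same event format. The tool chain splits into frontend tools (approval + resume required) and backend tools (server-side execution + persistence + event callback + cost accounting). The history endpoint returns a `STATE_SNAPSHOT` event payload per day.

**Tech Stack:** FastAPI + Pydantic + SQLAlchemy + Redis Stream + Flutter + Dio + json_serializable
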
---
## Intake Contract
- Objective: fully complete the AG-UI alignment, remove the dual-format compatibility logic, and wire up tool approval and history loading end to end.
- Deliverable: backend endpoint/service/tool implementations, frontend service/model/tool changes, documentation updates, test cases, and verification output.
- Constraints:
  - run/resume/request/event/history accept exactly one AG-UI format.
  - No legacy-compatible input and no "dual-field tolerant parsing" is kept.
  - Both tool flows must be testable: the frontend routing tool and the backend calendar tool.
- Verification target:
  - `uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q`
  - `uv run ruff check backend/src/core/agent backend/src/v1/agent backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent`
  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart`

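## Review Findings (basis for this refactor)
- [ ] `RunService.run` and `ResumeService.resume` still keep legacy parameter branches (`session_id/user_input/tool_call_id/tool_result`), which violates the "single-protocol input" constraint.
- [ ] The frontend `ToolCallResultEvent` accepts both `result` and `content`, which is dual-format parsing.
- [ ] The frontend `AgUiService` still has forked mock/real implementations, and `loadHistory` is not wired to the real API.
- [ ] The backend has no history endpoint; history is currently faked locally on the frontend by `MockHistoryService`.
- [ ] The current tool flow mostly relies on a fixed placeholder `user_tool_result`; there is no fully verified chain of "frontend tool approval + resume callback + backend tool execution and persistence".

## Execution Tasks (continuously updated)
### Task 1: Strict single-protocol enforcement (remove compatibility branches)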
**Files:**
- Modify: `backend/src/core/agent/application/run_service.py`
- Modify: `backend/src/core/agent/application/resume_service.py`
- Modify: `backend/src/v1/agent/service.py`
- Modify: `apps/lib/features/chat/data/models/ag_ui_event.dart`
- Test: `backend/tests/unit/core/agent/test_run_resume_service.py`
- Test: `backend/tests/unit/v1/agent/test_service.py`
- Test: `apps/test/features/chat/ag_ui_event_test.dart`
**Checklist:**
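- [x] Remove the backend legacy input path; accept only `RunAgentInput`
- [x] Remove the frontend `ToolCallResult` dual-format tolerance; fix on the single AG-UI format
- [x] Update the corresponding unit tests (red first, then green)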
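
The single-format rule in the checklist above can be sketched as a parser that accepts only `content` and rejects the legacy `result` field instead of falling back to it. This is a minimal Python illustration, not the project's Dart model: the function name, the `ToolCallResultEvent` dataclass shape, and the `toolCallId` wire key are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class ToolCallResultEvent:
    """Illustrative event shape; only `content` is confirmed by this plan."""

    tool_call_id: str
    content: str


def parse_tool_call_result(payload: Mapping[str, Any]) -> ToolCallResultEvent:
    """Accept exactly one wire format and reject everything else."""
    if payload.get("type") != "TOOL_CALL_RESULT":
        raise ValueError("not a TOOL_CALL_RESULT event")
    if "result" in payload:
        # The legacy field is rejected outright rather than used as a fallback.
        raise ValueError("legacy field 'result' is not accepted; use 'content'")
    tool_call_id = payload.get("toolCallId")  # assumed key name
    content = payload.get("content")
    if not isinstance(tool_call_id, str) or not isinstance(content, str):
        raise ValueError("'toolCallId' and 'content' must both be strings")
    return ToolCallResultEvent(tool_call_id=tool_call_id, content=content)
```

Failing loudly on the legacy field makes a regression to dual-format producers show up in tests rather than being silently absorbed.
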
### Task 2: History endpoint (returns `STATE_SNAPSHOT` per day)
**Files:**
- Modify: `backend/src/v1/agent/router.py`
- Modify: `backend/src/v1/agent/service.py`
- Modify: `backend/src/v1/agent/repository.py`
- Add: `backend/src/v1/agent/history.py` (if needed)
- Test: `backend/tests/integration/v1/agent/test_routes.py`
- Test: `backend/tests/unit/v1/agent/test_service.py`
**Checklist:**
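- [x] Add the history endpoint (with owner check + date cursor)
- [x] Query session messages and aggregate them by day
- [x] Return a single day of history plus `hasMore` in the `STATE_SNAPSHOT` event format
- [x] Add the missing tests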
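
The day-based aggregation described above can be sketched as follows. This is an illustration under assumptions, not the actual service code: the message dict keys (`id`, `role`, `content`, `created_at`), the function name, and the layout inside `snapshot` are hypothetical; only the `STATE_SNAPSHOT` type, the per-day grouping, the `before` date cursor, and `hasMore` come from the plan.

```python
from collections import defaultdict
from datetime import date, datetime
from typing import Any, Optional


def history_snapshot(
    messages: list[dict[str, Any]], before: Optional[date] = None
) -> dict[str, Any]:
    """Return the most recent day of messages strictly before `before`."""
    by_day: dict[date, list[dict[str, Any]]] = defaultdict(list)
    for msg in messages:
        by_day[msg["created_at"].date()].append(msg)

    # Days eligible under the cursor, oldest first.
    days = sorted(d for d in by_day if before is None or d < before)
    if not days:
        empty = {"date": None, "messages": [], "hasMore": False}
        return {"type": "STATE_SNAPSHOT", "snapshot": empty}

    day = days[-1]
    ordered = sorted(by_day[day], key=lambda m: m["created_at"])
    return {
        "type": "STATE_SNAPSHOT",
        "snapshot": {
            "date": day.isoformat(),
            "messages": [
                {"id": m["id"], "role": m["role"], "content": m["content"]}
                for m in ordered
            ],
            # More history exists if any older day remains under the cursor.
            "hasMore": len(days) > 1,
        },
    }
```

A client would page backwards by passing the returned `date` as the next `before` value until `hasMore` is false.
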
### Task 3: Unify frontend mock/real API access and parsing
**Files:**
- Modify: `apps/lib/features/chat/data/services/ag_ui_service.dart`
- Modify: `apps/lib/core/api/mock_api_client.dart`
- Modify: `apps/lib/core/api/i_api_client.dart` (if needed)
- Modify: `apps/lib/features/chat/presentation/bloc/chat_bloc.dart`
- Remove/Modify: `apps/lib/features/chat/data/services/mock_history_service.dart`
- Test: `apps/test/features/chat/ag_ui_service_test.dart`
- Test: `apps/test/features/chat/chat_bloc_test.dart`
**Checklist:**
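- [x] `sendMessage/loadHistory/resume` all go through one unified API call path
- [x] Mock mode serves responses for the same endpoints through `MockApiClient` and no longer takes a forked local code path
- [x] The frontend consumes one AG-UI event stream format (SSE + history snapshot)
- [x] Add the missing tests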
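
The idea behind the checklist above, one service code path with the mock swapped in at the client layer, can be sketched in Python even though the real code is Dart. Everything here is illustrative: the `ApiClient` protocol, the method signatures, and the `snapshot.messages` layout are assumptions; the history path follows the endpoint recorded in the execution log.

```python
from typing import Any, Protocol


class ApiClient(Protocol):
    """The only seam between the service and the transport."""

    def get(self, path: str, params: dict[str, str]) -> dict[str, Any]: ...


class MockApiClient:
    """Serves canned responses in the same shape the real endpoint returns."""

    def __init__(self, responses: dict[str, dict[str, Any]]) -> None:
        self._responses = responses
        self.calls: list[tuple[str, dict[str, str]]] = []

    def get(self, path: str, params: dict[str, str]) -> dict[str, Any]:
        self.calls.append((path, params))
        return self._responses[path]


class AgUiService:
    """Has no mock/real branches; it only knows the client interface."""

    def __init__(self, client: ApiClient) -> None:
        self._client = client

    def load_history(self, thread_id: str, before: str) -> list[dict[str, Any]]:
        path = f"/api/v1/agent/runs/{thread_id}/history"
        event = self._client.get(path, {"before": before})
        if event.get("type") != "STATE_SNAPSHOT":
            raise ValueError("expected a STATE_SNAPSHOT event")
        return event["snapshot"]["messages"]
```

Because the mock returns the same event shape as the server, the parsing logic under test is the logic that runs in production.
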
### Task 4: Close the tool loop (frontend routing tool + backend calendar tool)
**Files:**
- Add/Modify: `backend/src/core/agent/...` (tool orchestration modules)
- Modify: `backend/src/core/agent/application/run_service.py`
- Modify: `backend/src/core/agent/application/resume_service.py`
- Modify: `backend/src/core/agent/infrastructure/queue/tasks.py`
- Modify: `apps/lib/features/chat/data/tools/tool_registry.dart`
- Add: `apps/lib/features/chat/data/tools/navigation_tool.dart` (if needed)
- Modify: `apps/lib/features/chat/presentation/bloc/chat_bloc.dart`
- Modify: `apps/lib/features/home/ui/screens/home_screen.dart` (approval action if needed)
- Test: backend + apps agent related tests
**Checklist:**
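- [x] Declare frontend tools and backend tools in `RunAgentInput.tools`
- [x] Backend: implement execution of the `create_calendar_event` tool (persisted to `schedule_items`)
- [x] Frontend: implement execution of the `navigate_to_route` tool (runs after approval)
- [x] When the backend calls a frontend tool, the session enters pending; after the user approves, the frontend calls resume and sends back a `tool` message
- [x] Backend resume handling keeps persistence, state transitions, event forwarding, and cost accounting correct
- [x] Add the missing end-to-end test scenarios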
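
The split between the two tool kinds can be sketched as a small dispatcher: backend tools run immediately and emit `TOOL_CALL_RESULT`, while frontend tools park the session in a pending state until a `tool` message arrives via resume. This is a simplified model under assumptions; the `Session` class, the function names, the status strings, and the in-memory stand-in for the calendar tool are hypothetical, and the real backend tool writes to `schedule_items`.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class Session:
    status: str = "running"
    pending_tool_call: Optional[dict[str, Any]] = None
    events: list[dict[str, Any]] = field(default_factory=list)


FRONTEND_TOOLS = {"navigate_to_route"}
# Stand-in for the real tool, which would persist a row instead.
BACKEND_TOOLS: dict[str, Callable[[dict[str, Any]], str]] = {
    "create_calendar_event": lambda args: f"created:{args['title']}",
}


def handle_tool_call(session: Session, call_id: str, name: str, args: dict[str, Any]) -> None:
    if name in FRONTEND_TOOLS:
        # Frontend tools need user approval: park the run until resume.
        session.status = "pending"
        session.pending_tool_call = {"id": call_id, "name": name, "args": args}
    elif name in BACKEND_TOOLS:
        content = BACKEND_TOOLS[name](args)
        session.events.append(
            {"type": "TOOL_CALL_RESULT", "toolCallId": call_id, "content": content}
        )
    else:
        raise ValueError(f"unknown tool: {name}")


def resume(session: Session, tool_message: dict[str, Any]) -> None:
    pending = session.pending_tool_call
    if session.status != "pending" or pending is None:
        raise ValueError("no pending tool call to resume")
    if tool_message.get("role") != "tool" or tool_message.get("toolCallId") != pending["id"]:
        raise ValueError("resume message does not match the pending tool call")
    session.events.append(
        {"type": "TOOL_CALL_RESULT", "toolCallId": pending["id"], "content": tool_message["content"]}
    )
    session.pending_tool_call = None
    session.status = "running"
```
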
### Task 5: Sync protocol and API documentation
**Files:**
- Modify: `docs/runtime/runtime-route.md`
- Modify: `docs/bugs/2026-03-07-agent-module-review.md` (if needed, to write back conclusions)
**Checklist:**
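- [x] Document the single protocol format for run/resume/history/sse
- [x] Document the tool approval and resume callback flow
- [x] Note the change date and include examples
### Task 6: Resolve high-risk review findings (concurrency / security / frontend robustness)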
**Files:**
- Modify: `backend/src/v1/agent/service.py`
- Modify: `backend/src/core/agent/application/run_service.py`
- Modify: `backend/src/core/agent/application/resume_service.py`
- Modify: `backend/src/core/agent/application/session_state_persistence.py`
- Modify: `apps/lib/features/chat/data/services/ag_ui_service.dart`
- Modify: `apps/lib/features/chat/presentation/bloc/chat_bloc.dart`
- Modify: `apps/lib/features/chat/data/tools/route_navigation_tool.dart`
- Test: `backend/tests/unit/core/agent/test_run_resume_service.py`
- Test: `backend/tests/unit/v1/agent/test_service.py`
- Test: `backend/tests/unit/core/agent/test_state_snapshot.py`
- Test: `backend/tests/integration/core/agent/test_queue_run_resume.py`
- Test: `apps/test/features/chat/ag_ui_service_test.dart`
- Test: `apps/test/features/chat/chat_bloc_test.dart`
- Test: `apps/test/features/chat/tool_registry_test.dart`
**Checklist:**
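- [x] Fix the session-creation race: `enqueue_run` catches `IntegrityError`, rolls back, and re-checks the owner
- [x] Fix resume approval integrity: bind `toolName + toolArgsSha256 + nonce` and validate strictly
- [x] Fix frontend SSE fault tolerance: a single malformed packet no longer breaks the whole stream
- [x] Fix the frontend empty tool-result card regression: do not render a placeholder card when `ui == null`
- [x] Fix the security boundary of the frontend navigation tool: add route whitelist / prefix validation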
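
The `toolName + toolArgsSha256 + nonce` binding can be sketched as below. This is one plausible reading of the checklist item, not the actual `ResumeService` code: the function names and the canonical-JSON hashing scheme are assumptions; the `pending_tool_*` field names are taken from the execution log.

```python
import hashlib
import hmac
import json
import secrets
from typing import Any


def args_sha256(args: dict[str, Any]) -> str:
    """Hash a canonical JSON form so key order does not change the digest."""
    canonical = json.dumps(args, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def issue_guard(tool_name: str, args: dict[str, Any]) -> dict[str, str]:
    """Record what was approved when the session enters pending."""
    return {
        "pending_tool_name": tool_name,
        "pending_tool_args_sha256": args_sha256(args),
        "pending_tool_nonce": secrets.token_urlsafe(16),
    }


def _same(a: str, b: str) -> bool:
    # Compare as bytes in constant time; str inputs must be ASCII otherwise.
    return hmac.compare_digest(a.encode("utf-8"), b.encode("utf-8"))


def verify_resume(guard: dict[str, str], tool_name: str, args: dict[str, Any], nonce: str) -> bool:
    """Resume is valid only if all three bound values match."""
    checks = [
        _same(guard["pending_tool_name"], tool_name),
        _same(guard["pending_tool_args_sha256"], args_sha256(args)),
        _same(guard["pending_tool_nonce"], nonce),
    ]
    return all(checks)
```

Binding the argument hash means an approval for one route cannot be replayed with different arguments, and the nonce ties the resume to one specific pending call.
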
### Task 7: Resolve L2 re-review blockers (follow-up fixes after the second review)
**Files:**
- Modify: `backend/src/core/agent/application/resume_service.py`
- Modify: `backend/src/core/agent/application/run_service.py`
- Modify: `apps/lib/features/chat/data/services/ag_ui_service.dart`
- Test: `backend/tests/unit/core/agent/test_run_resume_service.py`
- Test: `apps/test/features/chat/ag_ui_service_test.dart`
**Checklist:**
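- [x] Fix SSE replay: the frontend stores `Last-Event-ID` and sends it when reconnecting
- [x] Tighten backend DB write triggers: remove the "auto-create schedule from keywords" path, keeping only the explicit `#tool:` trigger
- [x] Fix resume result injection: the backend persists and replays only the sanitized, controlled payload
- [x] Fix resume being sent after a failed frontend execution: abort resume when the local tool returns `ok != true`
- [x] Add the corresponding regression tests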
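
Two of the SSE behaviours in this plan, skipping a single malformed packet and resuming with `Last-Event-ID`, can be sketched together. This is a simplified Python reader, not the Dart `AgUiService`: the class and method names are hypothetical, and it handles only the `id` and `data` fields of the SSE wire format. One design choice to note: it advances the stored id only for frames that parsed successfully.

```python
import json
from typing import Any, Iterable, Iterator, Optional


class SseReader:
    def __init__(self) -> None:
        self.last_event_id: Optional[str] = None
        self.skipped = 0  # count of malformed frames that were dropped

    def events(self, lines: Iterable[str]) -> Iterator[dict[str, Any]]:
        data_lines: list[str] = []
        event_id: Optional[str] = None
        for raw in lines:
            line = raw.rstrip("\r\n")
            if line == "":
                # A blank line terminates one frame.
                if data_lines:
                    try:
                        event = json.loads("\n".join(data_lines))
                        if not isinstance(event, dict):
                            raise ValueError("event payload must be a JSON object")
                    except ValueError:
                        # json.JSONDecodeError subclasses ValueError:
                        # drop this frame, keep reading the stream.
                        self.skipped += 1
                    else:
                        if event_id is not None:
                            self.last_event_id = event_id
                        yield event
                data_lines, event_id = [], None
                continue
            if line.startswith(":"):
                continue  # SSE comment / keep-alive
            name, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if name == "data":
                data_lines.append(value)
            elif name == "id":
                event_id = value

    def resume_headers(self) -> dict[str, str]:
        """Headers to send when reconnecting to the same thread."""
        if self.last_event_id is None:
            return {}
        return {"Last-Event-ID": self.last_event_id}
```
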
### Task 8: Close medium-risk security gaps (HTTP-layer limit checks up front + fail-closed guards)
**Files:**
- Modify: `backend/src/v1/agent/router.py`
- Add: `backend/tests/unit/v1/agent/test_router_guards.py`
- Modify: `backend/tests/integration/v1/agent/test_routes.py`
**Checklist:**
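- [x] The HTTP layer validates `RunAgentInput` limits (size / message count / text length) before enqueueing
- [x] On Redis errors, run rate limiting and the SSE quota now fail closed
- [x] Add guard unit tests and router integration tests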
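
The two guards above can be sketched as a pre-enqueue limit check and a rate-limit wrapper that denies the request when the counter backend fails. This is illustrative only: the limit values, the function names, and the `counter` callable (standing in for a Redis increment) are assumptions, not the project's `parse_run_input` or `_allow_run_request`.

```python
import json
from typing import Any, Callable

# Hypothetical limits; the plan does not state the real values.
MAX_BODY_BYTES = 64 * 1024
MAX_MESSAGES = 50
MAX_TEXT_CHARS = 4000


def check_limits(run_input: dict[str, Any]) -> None:
    """Reject oversized input before any work is enqueued."""
    body = json.dumps(run_input, ensure_ascii=False).encode("utf-8")
    if len(body) > MAX_BODY_BYTES:
        raise ValueError("payload too large")
    messages = run_input.get("messages", [])
    if len(messages) > MAX_MESSAGES:
        raise ValueError("too many messages")
    for message in messages:
        if len(str(message.get("content", ""))) > MAX_TEXT_CHARS:
            raise ValueError("message text too long")


def allow_request(counter: Callable[[], int], limit: int) -> bool:
    """Fail closed: if the counter backend errors, deny the request."""
    try:
        current = counter()
    except Exception:
        return False
    return current <= limit
```

Failing closed trades availability for safety: during a Redis outage the endpoint rejects runs instead of running without any rate limit.
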
## Execution Log (updated as each item completes)
- 2026-03-07 16:35: Initialized the plan document; recorded the review findings and the task breakdown.
- 2026-03-07 16:44: Completed Task 1. Backend `RunService/ResumeService` accept only `RunAgentInput`; the frontend `ToolCallResultEvent` uses only `content`.
  Verification:
  - `uv run pytest backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/integration/core/agent/test_queue_run_resume.py backend/tests/unit/v1/agent/test_service.py -q` passed (some tests `skip`).
  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart` passed.
- 2026-03-07 16:50: Completed Task 2. Added `GET /api/v1/agent/runs/{thread_id}/history?before=YYYY-MM-DD`, which aggregates session messages by day and returns `STATE_SNAPSHOT` (including `hasMore`).
  Verification:
  - `uv run pytest backend/tests/unit/v1/agent/test_service.py backend/tests/integration/v1/agent/test_routes.py -q` passed.
- 2026-03-07 17:09: Completed Task 3. The frontend `AgUiService` now uses one API call path; mock and real modes share the same requests and event parsing; history now goes through `/api/v1/agent/history` and `STATE_SNAPSHOT`.
  Verification:
  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart` passed.
- 2026-03-07 17:09: Completed Task 4. Added the frontend `navigate_to_route` tool (executes after approval, then resumes) and the backend `create_calendar_event` tool (persists to `schedule_items` and returns `TOOL_CALL_RESULT`), and injected the available tools into the system prompt for the backend to parse.
  Verification:
  - `uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q` passed (some tests `skip`).
  - `uv run ruff check backend/src/core/agent backend/src/v1/agent backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent` passed.
- 2026-03-07 17:10: Completed Task 5. `docs/runtime/runtime-route.md` now includes the history endpoint and a `STATE_SNAPSHOT` example, and the run/resume protocol description is updated to the single format.
- 2026-03-07 17:29: Completed Task 6. Resolved the high-risk review items:
  - Backend `enqueue_run` now handles the concurrent session-creation race (`IntegrityError -> rollback -> owner recheck`).
  - Backend run/resume add a pending tool guard (`pending_tool_name/pending_tool_args_sha256/pending_tool_nonce`) and strict resume validation.
  - Frontend SSE parsing tolerates malformed packets, a tool result without `ui` no longer renders an empty card, and the navigation tool has a whitelist.
  Verification:
  - `uv run pytest backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/unit/core/agent/test_state_snapshot.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/core/agent/test_queue_run_resume.py -q` passed (`25 passed, 3 skipped`).
  - `uv run ruff check backend/src/core/agent/application/run_service.py backend/src/core/agent/application/resume_service.py backend/src/core/agent/application/session_state_persistence.py backend/src/v1/agent/service.py backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/unit/core/agent/test_state_snapshot.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/core/agent/test_queue_run_resume.py` passed.
  - `cd apps && flutter test test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart` passed (`33 passed`).
- 2026-03-07 17:33: Ran the full set of target verification commands:
  - `uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q` passed (some tests `skip`).
  - `uv run ruff check backend/src/core/agent backend/src/v1/agent backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent` passed.
  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart` passed (`69 passed`).
- 2026-03-07 17:46: Completed Task 7 (second-round fixes for blocking items newly raised by the L2 gate):
  - Frontend `AgUiService` now resumes with `Last-Event-ID`, avoiding duplicate replay on the same thread.
  - Backend `RunService` no longer writes to the database automatically on schedule keywords; only the explicit tool trigger remains.
  - Backend `ResumeService` adds a sanitize step that rejects injected `ui/content` payloads.
  - After approval, if the local tool fails on the frontend, resume is no longer called.
  Verification:
  - `uv run pytest backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/unit/core/agent/test_state_snapshot.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/core/agent/test_queue_run_resume.py -q` passed (`26 passed, 3 skipped`).
  - `uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q` passed (some tests `skip`).
  - `uv run ruff check backend/src/core/agent/application/run_service.py backend/src/core/agent/application/resume_service.py backend/src/core/agent/application/session_state_persistence.py backend/src/v1/agent/service.py backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/unit/core/agent/test_state_snapshot.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/core/agent/test_queue_run_resume.py` passed.
  - `cd apps && flutter test test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart` passed (`35 passed`).
  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart` passed (`71 passed`).
  - L2 re-review result: `code-reviewer` and `security-reviewer` confirmed that the earlier HIGH findings are resolved and found no new CRITICAL/HIGH issues.
- 2026-03-07 17:56: Completed Task 8 (closing medium-risk security gaps):
  - `router` adds `parse_run_input` pre-validation to `/agent/runs` and `/agent/runs/{thread_id}/resume`.
  - `_allow_run_request` and `_acquire_sse_slot` now fail closed on Redis errors.
  - Added `test_router_guards.py` and extended `test_routes.py` to cover the 422 response for oversized payloads.
  Verification:
  - `uv run pytest backend/tests/unit/v1/agent/test_router_guards.py backend/tests/integration/v1/agent/test_routes.py -q` passed (`8 passed`).
  - `uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q` passed (some tests `skip`).
  - `uv run ruff check backend/src/core/agent backend/src/v1/agent backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent` passed.
  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart` passed (`71 passed`).
  - L2 re-review result: the incremental `code-reviewer` and `security-reviewer` passes both confirmed there are currently no new `CRITICAL/HIGH` findings.