120df903d2
- 前端: 添加 SSE 流式支持、stateSnapshot 事件、路由导航工具 - 前端: 实现工具调用审批流程,支持 pending 状态展示 - 后端: Agent 状态管理与会话持久化相关重构 - 文档: 新增 agent-agui-full-alignance 设计文档 - 测试: 补充相关单元测试和集成测试
16 KiB
16 KiB
AG-UI 全量对齐改造 Implementation Plan
For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
Goal: 前后端 Agent 全链路仅使用 AG-UI 单一协议格式,补齐 run/resume/SSE/history/工具审批闭环,并完成前端真 API 与 mock API 的统一接入与解析。
Architecture: 以后端 RunAgentInput + AG-UI 事件模型为唯一真源,前端统一通过 API 客户端调用同一组 /agent/* 接口并消费同一事件格式。工具链分为前端工具(需审批 + resume)和后端工具(服务端执行 + 入库 + 事件回传 + 成本入账),历史接口按“天”返回 STATE_SNAPSHOT 事件负载。
Tech Stack: FastAPI + Pydantic + SQLAlchemy + Redis Stream + Flutter + Dio + json_serializable
Intake Contract
- Objective: 完整完成 AG-UI 对齐改造,移除双格式兼容逻辑,打通工具审批与历史加载。
- Deliverable: 后端接口/服务/工具实现、前端服务/模型/工具改造、文档更新、测试用例与验证输出。
- Constraints:
- run/resume/request/event/history 只允许一种 AG-UI 格式。
- 不保留 legacy 兼容输入与“双字段容错解析”。
- 前后端工具流必须可测试:前端路由工具 + 后端日历工具。
- Verification target:
uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -quv run ruff check backend/src/core/agent backend/src/v1/agent backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agentcd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart
审阅结论(作为改造依据)
RunService.run与ResumeService.resume仍保留 legacy 参数分支(session_id/user_input/tool_call_id/tool_result),违背“单协议输入”。- 前端
ToolCallResultEvent同时兼容result与content,属于双格式解析。 - 前端
AgUiService仍存在 mock/true 分叉实现,loadHistory真 API 未接入。 - 后端缺少历史接口;当前历史仅前端本地
MockHistoryService伪造。 - 当前 tool 流程以固定占位
user_tool_result为主,缺少“前端工具审批 + resume 回传 + 后端工具执行入库”的完整验证链路。
执行任务(持续更新)
Task 1: 严格单协议化(移除兼容分支)
Files:
- Modify:
backend/src/core/agent/application/run_service.py - Modify:
backend/src/core/agent/application/resume_service.py - Modify:
backend/src/v1/agent/service.py - Modify:
apps/lib/features/chat/data/models/ag_ui_event.dart - Test:
backend/tests/unit/core/agent/test_run_resume_service.py - Test:
backend/tests/unit/v1/agent/test_service.py - Test:
apps/test/features/chat/ag_ui_event_test.dart
Checklist:
- 删除后端 legacy 入参路径,只接受
RunAgentInput - 删除前端
ToolCallResult双格式容错,固定 AG-UI 单格式 - 更新对应单元测试(先红后绿)
Task 2: 历史接口(按天返回 STATE_SNAPSHOT)
Files:
- Modify:
backend/src/v1/agent/router.py - Modify:
backend/src/v1/agent/service.py - Modify:
backend/src/v1/agent/repository.py - Add:
backend/src/v1/agent/history.py(if needed) - Test:
backend/tests/integration/v1/agent/test_routes.py - Test:
backend/tests/unit/v1/agent/test_service.py
Checklist:
- 新增 history endpoint(含 owner 校验 + 日期游标)
- 查询会话消息并按天聚合
- 以
STATE_SNAPSHOT事件格式返回单日历史与hasMore - 补齐测试
Task 3: 前端统一 mock/true API 接入与解析
Files:
- Modify:
apps/lib/features/chat/data/services/ag_ui_service.dart - Modify:
apps/lib/core/api/mock_api_client.dart - Modify:
apps/lib/core/api/i_api_client.dart(if needed) - Modify:
apps/lib/features/chat/presentation/bloc/chat_bloc.dart - Remove/Modify:
apps/lib/features/chat/data/services/mock_history_service.dart - Test:
apps/test/features/chat/ag_ui_service_test.dart - Test:
apps/test/features/chat/chat_bloc_test.dart
Checklist:
sendMessage/loadHistory/resume全部走统一 API 调用路径- mock 模式通过
MockApiClient提供同接口响应,不再走本地分叉逻辑 - 前端统一消费 AG-UI 事件流(SSE + history snapshot)
- 补齐测试
Task 4: 工具链闭环(前端路由工具 + 后端日历工具)
Files:
- Add/Modify:
backend/src/core/agent/...(tool orchestration modules) - Modify:
backend/src/core/agent/application/run_service.py - Modify:
backend/src/core/agent/application/resume_service.py - Modify:
backend/src/core/agent/infrastructure/queue/tasks.py - Modify:
apps/lib/features/chat/data/tools/tool_registry.dart - Add:
apps/lib/features/chat/data/tools/navigation_tool.dart(if needed) - Modify:
apps/lib/features/chat/presentation/bloc/chat_bloc.dart - Modify:
apps/lib/features/home/ui/screens/home_screen.dart(approval action if needed) - Test: backend + apps agent related tests
Checklist:
- 在
RunAgentInput.tools中组织前端工具与后端工具声明 - 后端实现
create_calendar_event工具执行(入库schedule_items) - 前端实现
navigate_to_route工具执行能力(审批后执行) - 后端对前端工具发起调用时进入 pending,前端审批同意后调用 resume 回传
toolmessage - 后端处理 resume:落库、状态迁移、事件转发、成本核算保持正确
- 补齐端到端测试场景
Task 5: 协议与接口文档同步
Files:
- Modify:
docs/runtime/runtime-route.md - Modify:
docs/bugs/2026-03-07-agent-module-review.md(if needed for结论回写)
Checklist:
- 记录 run/resume/history/sse 的单协议格式
- 记录工具审批与 resume 回传流程
- 标注变更日期与示例
Task 6: 审查高危问题收敛(并发/安全/前端健壮性)
Files:
- Modify:
backend/src/v1/agent/service.py - Modify:
backend/src/core/agent/application/run_service.py - Modify:
backend/src/core/agent/application/resume_service.py - Modify:
backend/src/core/agent/application/session_state_persistence.py - Modify:
apps/lib/features/chat/data/services/ag_ui_service.dart - Modify:
apps/lib/features/chat/presentation/bloc/chat_bloc.dart - Modify:
apps/lib/features/chat/data/tools/route_navigation_tool.dart - Test:
backend/tests/unit/core/agent/test_run_resume_service.py - Test:
backend/tests/unit/v1/agent/test_service.py - Test:
backend/tests/unit/core/agent/test_state_snapshot.py - Test:
backend/tests/integration/core/agent/test_queue_run_resume.py - Test:
apps/test/features/chat/ag_ui_service_test.dart - Test:
apps/test/features/chat/chat_bloc_test.dart - Test:
apps/test/features/chat/tool_registry_test.dart
Checklist:
- 修复会话创建竞态:
enqueue_run捕获IntegrityError后回滚并回查 owner - 修复 resume 审批完整性:绑定
toolName + toolArgsSha256 + nonce并强校验 - 修复前端 SSE 容错:单条坏包不再中断整流
- 修复前端 tool result 空卡片回归:
ui == null时不渲染占位卡片 - 修复前端导航工具安全边界:增加路由白名单/前缀校验
Task 7: L2 复核阻塞项收敛(二次审查后补修)
Files:
- Modify:
backend/src/core/agent/application/resume_service.py - Modify:
backend/src/core/agent/application/run_service.py - Modify:
apps/lib/features/chat/data/services/ag_ui_service.dart - Test:
backend/tests/unit/core/agent/test_run_resume_service.py - Test:
apps/test/features/chat/ag_ui_service_test.dart
Checklist:
- 修复 SSE 重放:前端保存并续传
Last-Event-ID - 收紧后端写库触发:移除“关键词自动创建日程”路径,仅保留显式
#tool:触发 - 修复 resume 结果注入:后端仅使用 sanitize 后的受控 payload 落库/回放
- 修复前端执行失败仍 resume:本地工具
ok != true时中止 resume - 补充对应回归测试
Task 8: 安全中风险补齐(HTTP 限额前置 + fail-closed 守卫)
Files:
- Modify:
backend/src/v1/agent/router.py - Add:
backend/tests/unit/v1/agent/test_router_guards.py - Modify:
backend/tests/integration/v1/agent/test_routes.py
Checklist:
- HTTP 层在 enqueue 前执行
RunAgentInput限额校验(大小/消息数/文本长度) - Redis 异常时 run 限流与 SSE 配额改为 fail-closed
- 补齐守卫单测与路由集成测试
执行日志(每完成一项即更新)
- 2026-03-07 16:35: 初始化计划文档,录入审阅结论与任务拆解。
- 2026-03-07 16:44: 完成 Task 1。后端
RunService/ResumeService仅接受RunAgentInput;前端ToolCallResultEvent仅使用content。
验证:uv run pytest backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/integration/core/agent/test_queue_run_resume.py backend/tests/unit/v1/agent/test_service.py -q通过(含部分skip)。cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart通过。
- 2026-03-07 16:50: 完成 Task 2。新增
GET /api/v1/agent/runs/{thread_id}/history?before=YYYY-MM-DD,按天聚合会话消息并返回STATE_SNAPSHOT(含hasMore)。
验证:uv run pytest backend/tests/unit/v1/agent/test_service.py backend/tests/integration/v1/agent/test_routes.py -q通过。
- 2026-03-07 17:09: 完成 Task 3。前端
AgUiService统一为 API 调用路径,mock/true 共用请求与事件解析;历史改走/api/v1/agent/history的STATE_SNAPSHOT。 验证:cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart通过。
- 2026-03-07 17:09: 完成 Task 4。新增前端
navigate_to_route工具(审批后执行并 resume),后端create_calendar_event工具(落库schedule_items,回传TOOL_CALL_RESULT),并将可用工具注入系统提示词供后端解析。 验证:uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q通过(含skip)。uv run ruff check backend/src/core/agent backend/src/v1/agent backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent通过。
- 2026-03-07 17:10: 完成 Task 5。
docs/runtime/runtime-route.md已新增 history 接口与STATE_SNAPSHOT示例,更新 run/resume 协议描述为单格式。 - 2026-03-07 17:29: 完成 Task 6。收敛审查高危项:
- 后端
enqueue_run增加并发建会话竞态处理(IntegrityError -> rollback -> owner recheck)。 - 后端 run/resume 增加 pending tool guard(
pending_tool_name/pending_tool_args_sha256/pending_tool_nonce)与 resume 强校验。 - 前端 SSE 解析增加坏包容错,tool result 无 ui 时不渲染空卡片,导航工具增加白名单。 验证:
uv run pytest backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/unit/core/agent/test_state_snapshot.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/core/agent/test_queue_run_resume.py -q通过(25 passed, 3 skipped)。uv run ruff check backend/src/core/agent/application/run_service.py backend/src/core/agent/application/resume_service.py backend/src/core/agent/application/session_state_persistence.py backend/src/v1/agent/service.py backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/unit/core/agent/test_state_snapshot.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/core/agent/test_queue_run_resume.py通过。cd apps && flutter test test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart通过(33 passed)。
- 后端
- 2026-03-07 17:33: 执行全量目标验证命令:
uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q通过(含skip)。uv run ruff check backend/src/core/agent backend/src/v1/agent backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent通过。cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart通过(69 passed)。
- 2026-03-07 17:46: 完成 Task 7(针对 L2 门禁新增阻塞项的二次修复):
- 前端
AgUiService增加Last-Event-ID续传,规避同线程重复回放。 - 后端
RunService去除“日程关键词自动写库”,仅保留显式工具触发。 - 后端
ResumeService新增 sanitize 流程,拒绝注入式ui/content污染。 - 前端审批后若本地工具执行失败,不再继续调用 resume。 验证:
uv run pytest backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/unit/core/agent/test_state_snapshot.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/core/agent/test_queue_run_resume.py -q通过(26 passed, 3 skipped)。uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q通过(含skip)。uv run ruff check backend/src/core/agent/application/run_service.py backend/src/core/agent/application/resume_service.py backend/src/core/agent/application/session_state_persistence.py backend/src/v1/agent/service.py backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/unit/core/agent/test_state_snapshot.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/core/agent/test_queue_run_resume.py通过。cd apps && flutter test test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart通过(35 passed)。cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart通过(71 passed)。- L2 复核结果:
code-reviewer与security-reviewer复核后确认此前 HIGH 已收敛,未发现新的 CRITICAL/HIGH。
- 前端
- 2026-03-07 17:56: 完成 Task 8(安全中风险补齐):
router在/agent/runs与/agent/runs/{thread_id}/resume增加parse_run_input前置校验。_allow_run_request与_acquire_sse_slot在 Redis 异常时改为 fail-closed。- 新增
test_router_guards.py,并扩展test_routes.py覆盖超大 payload 422。 验证: uv run pytest backend/tests/unit/v1/agent/test_router_guards.py backend/tests/integration/v1/agent/test_routes.py -q通过(8 passed)。uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q通过(含skip)。uv run ruff check backend/src/core/agent backend/src/v1/agent backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent通过。cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart通过(71 passed)。- L2 复核结果:增量
code-reviewer与security-reviewer均确认当前无新的CRITICAL/HIGH。