# AG-UI Full Alignment Refactor Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

**Goal:** Use AG-UI as the single protocol format across the entire frontend and backend Agent pipeline, close the remaining gaps in run/resume/SSE/history/tool approval, and unify how the frontend connects to and parses both the real API and the mock API.

**Architecture:** The backend `RunAgentInput` plus the AG-UI event model is the single source of truth. The frontend calls the same set of `/agent/*` endpoints through the API client and consumes the same event format. The tool chain is split into frontend tools (approval required, then resume) and backend tools (server-side execution, persistence, event relay, and cost accounting). The history endpoint returns a `STATE_SNAPSHOT` event payload per day.

**Tech Stack:** FastAPI + Pydantic + SQLAlchemy + Redis Stream + Flutter + Dio + json_serializable

---

## Intake Contract

- Objective: Complete the AG-UI alignment refactor in full, remove the dual-format compatibility logic, and get tool approval and history loading working end to end.
- Deliverable: Backend endpoint/service/tool implementations, frontend service/model/tool changes, documentation updates, test cases, and verification output.
- Constraints:
  - Only one AG-UI format is allowed for run/resume/request/event/history.
  - No legacy-compatible input and no "dual-field tolerant parsing" is retained.
  - The frontend and backend tool flows must both be testable: a frontend routing tool plus a backend calendar tool.
- Verification target:
  - `uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q`
  - `uv run ruff check backend/src/core/agent backend/src/v1/agent backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent`
  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart`

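## Review Findings (Basis for the Refactor)

- [ ] `RunService.run` and `ResumeService.resume` still keep legacy parameter branches (`session_id/user_input/tool_call_id/tool_result`), which violates the "single-protocol input" rule.
- [ ] The frontend `ToolCallResultEvent` accepts both `result` and `content`, which is dual-format parsing.
- [ ] The frontend `AgUiService` still has separate mock/real implementations, and `loadHistory` is not wired to the real API.
- [ ] The backend has no history endpoint; history is currently faked only by the frontend's local `MockHistoryService`.
- [ ] The current tool flow relies mainly on a fixed placeholder `user_tool_result` and lacks an end-to-end verification path for "frontend tool approval + resume callback + backend tool execution with persistence".
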
## Tasks (Continuously Updated)

### Task 1: Strict Single-Protocol Enforcement (Remove Compatibility Branches)

**Files:**
- Modify: `backend/src/core/agent/application/run_service.py`
- Modify: `backend/src/core/agent/application/resume_service.py`
- Modify: `backend/src/v1/agent/service.py`
- Modify: `apps/lib/features/chat/data/models/ag_ui_event.dart`
- Test: `backend/tests/unit/core/agent/test_run_resume_service.py`
- Test: `backend/tests/unit/v1/agent/test_service.py`
- Test: `apps/test/features/chat/ag_ui_event_test.dart`

**Checklist:**
- [x] Remove the backend legacy input path; accept only `RunAgentInput`
- [x] Remove the frontend `ToolCallResult` dual-format tolerance; fix on the single AG-UI format
- [x] Update the corresponding unit tests (red first, then green)

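
To illustrate the "single format, no tolerant parsing" rule, the sketch below parses a tool-call result and rejects the legacy `result` field instead of falling back to it. It is written in Python for brevity; the real model lives in `ag_ui_event.dart`. The function name `parse_tool_call_result` and the `toolCallId` field are assumptions made for this example, not names confirmed by the codebase.

```python
def parse_tool_call_result(event: dict) -> dict:
    """Parse a TOOL_CALL_RESULT event, accepting only the `content` field."""
    if event.get("type") != "TOOL_CALL_RESULT":
        raise ValueError(f"unexpected event type: {event.get('type')!r}")
    if "result" in event:
        # Legacy field: fail loudly rather than silently accept a second format.
        raise ValueError("legacy field 'result' is not accepted; use 'content'")
    content = event.get("content")
    if not isinstance(content, str):
        raise ValueError("'content' must be a string")
    # `toolCallId` is an assumed field name for this sketch.
    return {"toolCallId": event.get("toolCallId"), "content": content}
```

Rejecting the legacy field outright, rather than ignoring it, makes a stale producer show up as a test failure instead of a silently empty result.
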
### Task 2: History Endpoint (Returns `STATE_SNAPSHOT` per Day)

**Files:**
- Modify: `backend/src/v1/agent/router.py`
- Modify: `backend/src/v1/agent/service.py`
- Modify: `backend/src/v1/agent/repository.py`
- Add: `backend/src/v1/agent/history.py` (if needed)
- Test: `backend/tests/integration/v1/agent/test_routes.py`
- Test: `backend/tests/unit/v1/agent/test_service.py`

**Checklist:**
- [x] Add the history endpoint (with owner check and date cursor)
- [x] Query session messages and aggregate them by day
- [x] Return a single day's history and `hasMore` in `STATE_SNAPSHOT` event format
- [x] Add the missing tests

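
The per-day aggregation with a date cursor can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the service's actual code: the message shape (`id`, `role`, `content`, `created_at`), the function name `build_history_snapshot`, and the layout of the `snapshot` object are all hypothetical. Only the `STATE_SNAPSHOT` event type, the `before` cursor, and the `hasMore` flag come from this plan.

```python
from datetime import date, datetime


def build_history_snapshot(messages: list, before: date) -> dict:
    """Return the most recent day of messages strictly before `before`."""
    by_day = {}
    for msg in messages:
        day = msg["created_at"].date()
        if day < before:
            by_day.setdefault(day, []).append(msg)

    if not by_day:
        # Nothing older than the cursor: empty snapshot, no further pages.
        return {
            "type": "STATE_SNAPSHOT",
            "snapshot": {"date": None, "messages": [], "hasMore": False},
        }

    latest = max(by_day)
    day_messages = sorted(by_day[latest], key=lambda m: m["created_at"])
    return {
        "type": "STATE_SNAPSHOT",
        "snapshot": {
            "date": latest.isoformat(),
            "messages": [
                {"id": m["id"], "role": m["role"], "content": m["content"]}
                for m in day_messages
            ],
            # More pages exist if any day older than `latest` has messages.
            "hasMore": len(by_day) > 1,
        },
    }
```

A client would page backwards by passing the returned `date` as the next `before` value until `hasMore` is false. A real implementation would more likely push the day filter into the SQL query than load every message into memory.
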
### Task 3: Unified Frontend Access and Parsing for Mock/Real APIs

**Files:**
- Modify: `apps/lib/features/chat/data/services/ag_ui_service.dart`
- Modify: `apps/lib/core/api/mock_api_client.dart`
- Modify: `apps/lib/core/api/i_api_client.dart` (if needed)
- Modify: `apps/lib/features/chat/presentation/bloc/chat_bloc.dart`
- Remove/Modify: `apps/lib/features/chat/data/services/mock_history_service.dart`
- Test: `apps/test/features/chat/ag_ui_service_test.dart`
- Test: `apps/test/features/chat/chat_bloc_test.dart`

**Checklist:**
- [x] `sendMessage/loadHistory/resume` all go through the unified API call path
- [x] Mock mode serves responses for the same endpoints through `MockApiClient` and no longer takes a separate local code path
- [x] The frontend consumes the AG-UI event stream uniformly (SSE + history snapshot)
- [x] Add the missing tests

### Task 4: Closing the Tool Chain Loop (Frontend Routing Tool + Backend Calendar Tool)

**Files:**
- Add/Modify: `backend/src/core/agent/...` (tool orchestration modules)
- Modify: `backend/src/core/agent/application/run_service.py`
- Modify: `backend/src/core/agent/application/resume_service.py`
- Modify: `backend/src/core/agent/infrastructure/queue/tasks.py`
- Modify: `apps/lib/features/chat/data/tools/tool_registry.dart`
- Add: `apps/lib/features/chat/data/tools/navigation_tool.dart` (if needed)
- Modify: `apps/lib/features/chat/presentation/bloc/chat_bloc.dart`
- Modify: `apps/lib/features/home/ui/screens/home_screen.dart` (approval action if needed)
- Test: agent-related tests in backend and apps

**Checklist:**
- [x] Organize the frontend and backend tool declarations in `RunAgentInput.tools`
- [x] Backend implements `create_calendar_event` tool execution (persisted to `schedule_items`)
- [x] Frontend implements `navigate_to_route` tool execution (runs after approval)
- [x] When the backend issues a call to a frontend tool, the run enters pending; once the user approves on the frontend, the frontend calls resume and sends back a `tool` message
- [x] Backend resume handling keeps persistence, state transitions, event forwarding, and cost accounting correct
- [x] Add the missing end-to-end test scenarios

### Task 5: Sync Protocol and API Documentation

**Files:**
- Modify: `docs/runtime/runtime-route.md`
- Modify: `docs/bugs/2026-03-07-agent-module-review.md` (if needed, to write back the findings)

**Checklist:**
- [x] Document the single-protocol format for run/resume/history/sse
- [x] Document the tool approval and resume callback flow
- [x] Note the change date and include examples

### Task 6: Resolve High-Risk Review Findings (Concurrency / Security / Frontend Robustness)

**Files:**
- Modify: `backend/src/v1/agent/service.py`
- Modify: `backend/src/core/agent/application/run_service.py`
- Modify: `backend/src/core/agent/application/resume_service.py`
- Modify: `backend/src/core/agent/application/session_state_persistence.py`
- Modify: `apps/lib/features/chat/data/services/ag_ui_service.dart`
- Modify: `apps/lib/features/chat/presentation/bloc/chat_bloc.dart`
- Modify: `apps/lib/features/chat/data/tools/route_navigation_tool.dart`
- Test: `backend/tests/unit/core/agent/test_run_resume_service.py`
- Test: `backend/tests/unit/v1/agent/test_service.py`
- Test: `backend/tests/unit/core/agent/test_state_snapshot.py`
- Test: `backend/tests/integration/core/agent/test_queue_run_resume.py`
- Test: `apps/test/features/chat/ag_ui_service_test.dart`
- Test: `apps/test/features/chat/chat_bloc_test.dart`
- Test: `apps/test/features/chat/tool_registry_test.dart`

**Checklist:**
- [x] Fix the session-creation race: `enqueue_run` catches `IntegrityError`, rolls back, and re-checks the owner
- [x] Fix resume approval integrity: bind `toolName + toolArgsSha256 + nonce` and validate strictly
- [x] Fix frontend SSE tolerance: a single malformed packet no longer interrupts the whole stream
- [x] Fix the frontend tool-result empty-card regression: no placeholder card is rendered when `ui == null`
- [x] Fix the frontend navigation tool's security boundary: add a route whitelist/prefix check

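
The approval binding in the checklist above (`toolName + toolArgsSha256 + nonce`) can be pictured with the following sketch. It assumes the arguments are hashed over a canonical JSON encoding and that comparison is constant-time. The function names are hypothetical and the `route` argument is invented for the example. The `pending_tool_*` keys mirror the guard fields named in the execution log, but how the real service stores and checks them is not shown in this plan.

```python
import hashlib
import hmac
import json
import secrets


def tool_args_sha256(args: dict) -> str:
    """Hash a canonical JSON encoding so key order does not change the digest."""
    canonical = json.dumps(args, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def issue_pending_guard(tool_name: str, args: dict) -> dict:
    """Record what the user is asked to approve when the run enters pending."""
    return {
        "pending_tool_name": tool_name,
        "pending_tool_args_sha256": tool_args_sha256(args),
        "pending_tool_nonce": secrets.token_urlsafe(16),
    }


def _same(a: str, b: str) -> bool:
    # Compare as bytes so non-ASCII tool names do not raise in compare_digest.
    return hmac.compare_digest(a.encode("utf-8"), b.encode("utf-8"))


def verify_resume(guard: dict, tool_name: str, args_sha256: str, nonce: str) -> bool:
    """All three bound values must match; any mismatch rejects the resume."""
    checks = [
        _same(guard["pending_tool_name"], tool_name),
        _same(guard["pending_tool_args_sha256"], args_sha256),
        _same(guard["pending_tool_nonce"], nonce),
    ]
    return all(checks)
```

Binding the argument hash means an approval granted for one set of arguments cannot be replayed with different ones, and the nonce ties the resume to one specific pending call.
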
### Task 7: Resolve L2 Re-Review Blockers (Follow-Up Fixes After the Second Review)

**Files:**
- Modify: `backend/src/core/agent/application/resume_service.py`
- Modify: `backend/src/core/agent/application/run_service.py`
- Modify: `apps/lib/features/chat/data/services/ag_ui_service.dart`
- Test: `backend/tests/unit/core/agent/test_run_resume_service.py`
- Test: `apps/test/features/chat/ag_ui_service_test.dart`

**Checklist:**
- [x] Fix SSE replay: the frontend stores `Last-Event-ID` and sends it when reconnecting
- [x] Tighten the backend write trigger: remove the "keyword-triggered automatic schedule creation" path and keep only the explicit `#tool:` trigger
- [x] Fix resume result injection: the backend persists/replays only the sanitized, controlled payload
- [x] Fix resume still firing after a frontend execution failure: abort resume when the local tool returns `ok != true`
- [x] Add the corresponding regression tests

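
The two SSE behaviours in this plan, skipping a malformed packet (Task 6) and resuming from `Last-Event-ID` (Task 7), can be sketched together. The example is Python for brevity, whereas the real client is the Dart `AgUiService`. `parse_sse` and `reconnect_headers` are hypothetical names, and advancing the stored id even for a dropped frame is an assumption made for this sketch, not a confirmed detail of the implementation.

```python
import json


def parse_sse(lines, last_event_id=None):
    """Parse SSE lines into JSON events, skipping malformed frames.

    Returns (events, last_event_id).
    """
    events = []
    data_parts, pending_id = [], None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current frame.
            if data_parts:
                try:
                    events.append(json.loads("\n".join(data_parts)))
                except json.JSONDecodeError:
                    pass  # bad packet: drop this frame, keep reading the stream
            if pending_id is not None:
                # Assumption: advance the cursor even for a dropped frame,
                # so a reconnect does not replay the same bad packet.
                last_event_id = pending_id
            data_parts, pending_id = [], None
            continue
        if line.startswith(":"):
            continue  # SSE comment / keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if field == "data":
            data_parts.append(value)
        elif field == "id":
            pending_id = value
    return events, last_event_id


def reconnect_headers(last_event_id):
    """Headers to send when reopening the stream."""
    return {"Last-Event-ID": last_event_id} if last_event_id else {}
```

On reconnect, the client passes `reconnect_headers(last_id)` so the server can continue after the last delivered event rather than replaying the thread from the start.
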
### Task 8: Close Medium-Risk Security Gaps (HTTP Limits Up Front + Fail-Closed Guards)

**Files:**
- Modify: `backend/src/v1/agent/router.py`
- Add: `backend/tests/unit/v1/agent/test_router_guards.py`
- Modify: `backend/tests/integration/v1/agent/test_routes.py`

**Checklist:**
- [x] The HTTP layer validates `RunAgentInput` limits (size / message count / text length) before enqueue
- [x] On Redis errors, run rate limiting and SSE quota now fail closed
- [x] Add guard unit tests and route integration tests

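
A rough sketch of the two guards in this task follows: a limit check that runs before anything is enqueued, and a wrapper that denies the request when the quota backend itself fails. The limit values and both function names are hypothetical. The actual `parse_run_input`, `_allow_run_request`, and `_acquire_sse_slot` implementations are not shown in this plan.

```python
# Hypothetical limits, chosen only for illustration.
MAX_BODY_BYTES = 64 * 1024
MAX_MESSAGES = 50
MAX_TEXT_CHARS = 4000


def check_run_input_limits(raw_body: bytes, payload: dict) -> list:
    """Return a list of violations; an empty list means the input may be enqueued."""
    errors = []
    if len(raw_body) > MAX_BODY_BYTES:
        errors.append("body too large")
    messages = payload.get("messages", [])
    if len(messages) > MAX_MESSAGES:
        errors.append("too many messages")
    for i, msg in enumerate(messages):
        content = msg.get("content")
        if isinstance(content, str) and len(content) > MAX_TEXT_CHARS:
            errors.append(f"message {i} text too long")
    return errors


def allow_fail_closed(check) -> bool:
    """Run a quota check; deny the request if the check itself raises."""
    try:
        return bool(check())
    except Exception:
        # Fail closed: an unavailable limiter must not become an open door.
        return False
```

In a router, a non-empty violation list would map to a 4xx response (the execution log mentions 422 for oversized payloads), and `allow_fail_closed` would wrap the Redis-backed counter so that a Redis outage rejects requests rather than letting them through unmetered.
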
## Execution Log (Updated as Each Item Completes)

- 2026-03-07 16:35: Initialized the plan document; recorded the review findings and task breakdown.
- 2026-03-07 16:44: Completed Task 1. The backend `RunService/ResumeService` accept only `RunAgentInput`; the frontend `ToolCallResultEvent` uses only `content`.
  Verification:
  - `uv run pytest backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/integration/core/agent/test_queue_run_resume.py backend/tests/unit/v1/agent/test_service.py -q` passed (including some `skip`).
  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart` passed.
- 2026-03-07 16:50: Completed Task 2. Added `GET /api/v1/agent/runs/{thread_id}/history?before=YYYY-MM-DD`, which aggregates session messages by day and returns a `STATE_SNAPSHOT` (including `hasMore`).
  Verification:
  - `uv run pytest backend/tests/unit/v1/agent/test_service.py backend/tests/integration/v1/agent/test_routes.py -q` passed.
- 2026-03-07 17:09: Completed Task 3. The frontend `AgUiService` now uses a single API call path; mock and real modes share the same request and event parsing. History now reads the `STATE_SNAPSHOT` from `/api/v1/agent/history`.
  Verification:
  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart` passed.
- 2026-03-07 17:09: Completed Task 4. Added the frontend `navigate_to_route` tool (executes after approval, then resumes) and the backend `create_calendar_event` tool (persists to `schedule_items` and returns `TOOL_CALL_RESULT`), and injected the available tools into the system prompt for the backend to parse.
  Verification:
  - `uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q` passed (including `skip`).
  - `uv run ruff check backend/src/core/agent backend/src/v1/agent backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent` passed.
- 2026-03-07 17:10: Completed Task 5. `docs/runtime/runtime-route.md` now includes the history endpoint and a `STATE_SNAPSHOT` example, and the run/resume protocol description has been updated to the single format.
- 2026-03-07 17:29: Completed Task 6. Resolved the high-risk review findings:
  - Backend `enqueue_run` now handles the concurrent session-creation race (`IntegrityError -> rollback -> owner recheck`).
  - Backend run/resume add a pending tool guard (`pending_tool_name/pending_tool_args_sha256/pending_tool_nonce`) and strict validation on resume.
  - Frontend SSE parsing tolerates malformed packets, tool results without `ui` no longer render an empty card, and the navigation tool has a whitelist.
  Verification:
  - `uv run pytest backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/unit/core/agent/test_state_snapshot.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/core/agent/test_queue_run_resume.py -q` passed (`25 passed, 3 skipped`).
  - `uv run ruff check backend/src/core/agent/application/run_service.py backend/src/core/agent/application/resume_service.py backend/src/core/agent/application/session_state_persistence.py backend/src/v1/agent/service.py backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/unit/core/agent/test_state_snapshot.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/core/agent/test_queue_run_resume.py` passed.
  - `cd apps && flutter test test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart` passed (`33 passed`).
- 2026-03-07 17:33: Ran the full target verification commands:
  - `uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q` passed (including `skip`).
  - `uv run ruff check backend/src/core/agent backend/src/v1/agent backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent` passed.
  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart` passed (`69 passed`).
- 2026-03-07 17:46: Completed Task 7 (second round of fixes for blockers newly raised by the L2 gate):
  - Frontend `AgUiService` adds `Last-Event-ID` resumption to avoid duplicate replay on the same thread.
  - Backend `RunService` drops the "schedule keyword auto-write" path and keeps only explicit tool triggers.
  - Backend `ResumeService` adds a sanitize step that rejects injected `ui/content` pollution.
  - If the local tool fails after frontend approval, resume is no longer called.
  Verification:
  - `uv run pytest backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/unit/core/agent/test_state_snapshot.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/core/agent/test_queue_run_resume.py -q` passed (`26 passed, 3 skipped`).
  - `uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q` passed (including `skip`).
  - `uv run ruff check backend/src/core/agent/application/run_service.py backend/src/core/agent/application/resume_service.py backend/src/core/agent/application/session_state_persistence.py backend/src/v1/agent/service.py backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/unit/core/agent/test_state_snapshot.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/core/agent/test_queue_run_resume.py` passed.
  - `cd apps && flutter test test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart` passed (`35 passed`).
  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart` passed (`71 passed`).
  - L2 re-review result: `code-reviewer` and `security-reviewer` confirmed that the earlier HIGH findings are resolved and found no new CRITICAL/HIGH.
- 2026-03-07 17:56: Completed Task 8 (closing medium-risk security gaps):
  - `router` adds `parse_run_input` pre-validation on `/agent/runs` and `/agent/runs/{thread_id}/resume`.
  - `_allow_run_request` and `_acquire_sse_slot` now fail closed on Redis errors.
  - Added `test_router_guards.py` and extended `test_routes.py` to cover the 422 response for oversized payloads.
  Verification:
  - `uv run pytest backend/tests/unit/v1/agent/test_router_guards.py backend/tests/integration/v1/agent/test_routes.py -q` passed (`8 passed`).
  - `uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q` passed (including `skip`).
  - `uv run ruff check backend/src/core/agent backend/src/v1/agent backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent` passed.
  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart` passed (`71 passed`).
  - L2 re-review result: the incremental `code-reviewer` and `security-reviewer` passes both confirmed there are currently no new `CRITICAL/HIGH` findings.