docs: remove stale docs, add auth token refresh design doc
@@ -1,221 +0,0 @@
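# AG-UI Full Alignment Refactor Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

**Goal:** Make the entire frontend/backend Agent pipeline use the AG-UI protocol format exclusively, close the run/resume/SSE/history/tool-approval loop, and unify how the frontend accesses and parses the real API and the mock API.

**Architecture:** The backend `RunAgentInput` plus the AG-UI event model is the single source of truth. The frontend calls the same set of `/agent/*` endpoints through one API client and consumes one event format. The tool chain splits into frontend tools (approval + resume required) and backend tools (server-side execution + persistence + event relay + cost accounting). The history endpoint returns a `STATE_SNAPSHOT` event payload per day.

**Tech Stack:** FastAPI + Pydantic + SQLAlchemy + Redis Stream + Flutter + Dio + json_serializable

---
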
## Intake Contract

- Objective: Complete the AG-UI alignment refactor, remove the dual-format compatibility logic, and get tool approval and history loading working end to end.
- Deliverable: Backend endpoint/service/tool implementations, frontend service/model/tool changes, documentation updates, test cases, and verification output.
- Constraints:
  - run/resume/request/event/history accept exactly one AG-UI format.
  - No legacy-compatible input and no "dual-field tolerant parsing" is retained.
  - Both tool flows must be testable: the frontend routing tool and the backend calendar tool.
- Verification target:
  - `uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q`
  - `uv run ruff check backend/src/core/agent backend/src/v1/agent backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent`
  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart`

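## Review Findings (Basis for the Refactor)

- [ ] `RunService.run` and `ResumeService.resume` still keep legacy parameter branches (`session_id/user_input/tool_call_id/tool_result`), which violates "single-protocol input".
- [ ] The frontend `ToolCallResultEvent` accepts both `result` and `content`, which is dual-format parsing.
- [ ] The frontend `AgUiService` still has forked mock/true implementations, and `loadHistory` is not wired to the real API.
- [ ] The backend has no history endpoint; history is currently fabricated locally on the frontend by `MockHistoryService`.
- [ ] The current tool flow relies mainly on a fixed `user_tool_result` placeholder and lacks a complete verification chain covering "frontend tool approval + resume callback + backend tool execution and persistence".
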
## Tasks (Continuously Updated)

### Task 1: Strict Single Protocol (Remove Compatibility Branches)

**Files:**
- Modify: `backend/src/core/agent/application/run_service.py`
- Modify: `backend/src/core/agent/application/resume_service.py`
- Modify: `backend/src/v1/agent/service.py`
- Modify: `apps/lib/features/chat/data/models/ag_ui_event.dart`
- Test: `backend/tests/unit/core/agent/test_run_resume_service.py`
- Test: `backend/tests/unit/v1/agent/test_service.py`
- Test: `apps/test/features/chat/ag_ui_event_test.dart`

**Checklist:**
- [x] Remove the backend legacy input path; accept only `RunAgentInput`
- [x] Remove the frontend `ToolCallResult` dual-format tolerance; settle on the single AG-UI format
- [x] Update the corresponding unit tests (red first, then green)

### Task 2: History Endpoint (Per-Day `STATE_SNAPSHOT`)

**Files:**
- Modify: `backend/src/v1/agent/router.py`
- Modify: `backend/src/v1/agent/service.py`
- Modify: `backend/src/v1/agent/repository.py`
- Add: `backend/src/v1/agent/history.py` (if needed)
- Test: `backend/tests/integration/v1/agent/test_routes.py`
- Test: `backend/tests/unit/v1/agent/test_service.py`

**Checklist:**
- [x] Add the history endpoint (with owner check + date cursor)
- [x] Query session messages and aggregate them by day
- [x] Return a single day of history plus `hasMore` in the `STATE_SNAPSHOT` event format
- [x] Add tests

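The per-day aggregation in the checklist above could look roughly like the following. This is a minimal sketch under stated assumptions, not the project's implementation: the `Message` shape, the `build_history_snapshot` name, and the layout inside `snapshot` (`date`, `messages`, `hasMore` placement) are hypothetical. Only the `STATE_SNAPSHOT` event type, the date cursor, and the `hasMore` flag come from the plan.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass
class Message:
    # Hypothetical message row; the real model lives in the repository layer.
    role: str
    content: str
    created_at: datetime


def build_history_snapshot(messages: list, before: date) -> dict:
    """Return the most recent day of history strictly before `before`."""
    days = sorted({m.created_at.date() for m in messages if m.created_at.date() < before})
    if not days:
        return {"type": "STATE_SNAPSHOT", "snapshot": {"date": None, "messages": [], "hasMore": False}}
    target = days[-1]
    day_messages = sorted(
        (m for m in messages if m.created_at.date() == target),
        key=lambda m: m.created_at,
    )
    return {
        "type": "STATE_SNAPSHOT",
        "snapshot": {
            "date": target.isoformat(),
            "messages": [{"role": m.role, "content": m.content} for m in day_messages],
            # More history exists if any earlier day has messages.
            "hasMore": len(days) > 1,
        },
    }
```

A client would page backwards by passing the returned `date` as the next `before` cursor until `hasMore` is false.
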
### Task 3: Unified Frontend Access and Parsing for Mock/True API

**Files:**
- Modify: `apps/lib/features/chat/data/services/ag_ui_service.dart`
- Modify: `apps/lib/core/api/mock_api_client.dart`
- Modify: `apps/lib/core/api/i_api_client.dart` (if needed)
- Modify: `apps/lib/features/chat/presentation/bloc/chat_bloc.dart`
- Remove/Modify: `apps/lib/features/chat/data/services/mock_history_service.dart`
- Test: `apps/test/features/chat/ag_ui_service_test.dart`
- Test: `apps/test/features/chat/chat_bloc_test.dart`

**Checklist:**
- [x] `sendMessage/loadHistory/resume` all go through one unified API call path
- [x] Mock mode serves the same endpoint responses through `MockApiClient` and no longer takes a forked local path
- [x] The frontend consumes one AG-UI event stream (SSE + history snapshot)
- [x] Add tests

### Task 4: Closed Tool Loop (Frontend Routing Tool + Backend Calendar Tool)

**Files:**
- Add/Modify: `backend/src/core/agent/...` (tool orchestration modules)
- Modify: `backend/src/core/agent/application/run_service.py`
- Modify: `backend/src/core/agent/application/resume_service.py`
- Modify: `backend/src/core/agent/infrastructure/queue/tasks.py`
- Modify: `apps/lib/features/chat/data/tools/tool_registry.dart`
- Add: `apps/lib/features/chat/data/tools/navigation_tool.dart` (if needed)
- Modify: `apps/lib/features/chat/presentation/bloc/chat_bloc.dart`
- Modify: `apps/lib/features/home/ui/screens/home_screen.dart` (approval action if needed)
- Test: backend + apps agent related tests

**Checklist:**
- [x] Declare frontend and backend tools in `RunAgentInput.tools`
- [x] Backend implements `create_calendar_event` tool execution (persisted to `schedule_items`)
- [x] Frontend implements `navigate_to_route` tool execution (runs after approval)
- [x] When the backend calls a frontend tool, the run enters pending; after the user approves, the frontend calls resume and sends back a `tool` message
- [x] Backend resume handling keeps persistence, state transitions, event forwarding, and cost accounting correct
- [x] Add end-to-end test scenarios

### Task 5: Sync Protocol and API Documentation

**Files:**
- Modify: `docs/runtime/runtime-route.md`
- Modify: `docs/bugs/2026-03-07-agent-module-review.md` (if needed to write back conclusions)

**Checklist:**
- [x] Document the single-protocol format for run/resume/history/sse
- [x] Document the tool approval and resume callback flow
- [x] Note the change date and include examples

### Task 6: Close Out High-Risk Review Findings (Concurrency / Security / Frontend Robustness)

**Files:**
- Modify: `backend/src/v1/agent/service.py`
- Modify: `backend/src/core/agent/application/run_service.py`
- Modify: `backend/src/core/agent/application/resume_service.py`
- Modify: `backend/src/core/agent/application/session_state_persistence.py`
- Modify: `apps/lib/features/chat/data/services/ag_ui_service.dart`
- Modify: `apps/lib/features/chat/presentation/bloc/chat_bloc.dart`
- Modify: `apps/lib/features/chat/data/tools/route_navigation_tool.dart`
- Test: `backend/tests/unit/core/agent/test_run_resume_service.py`
- Test: `backend/tests/unit/v1/agent/test_service.py`
- Test: `backend/tests/unit/core/agent/test_state_snapshot.py`
- Test: `backend/tests/integration/core/agent/test_queue_run_resume.py`
- Test: `apps/test/features/chat/ag_ui_service_test.dart`
- Test: `apps/test/features/chat/chat_bloc_test.dart`
- Test: `apps/test/features/chat/tool_registry_test.dart`

**Checklist:**
- [x] Fix the session-creation race: `enqueue_run` catches `IntegrityError`, rolls back, and re-checks the owner
- [x] Fix resume approval integrity: bind `toolName + toolArgsSha256 + nonce` and validate strictly
- [x] Fix frontend SSE fault tolerance: a single bad packet no longer interrupts the whole stream
- [x] Fix the frontend empty tool-result card regression: do not render a placeholder card when `ui == null`
- [x] Fix the frontend navigation tool security boundary: add route allowlist/prefix validation

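The resume approval binding in the checklist above (`toolName + toolArgsSha256 + nonce`) can be sketched as follows. This is an illustration only: the function names and the canonical-JSON hashing scheme are assumptions, and the guard keys reuse the `pending_tool_*` names that appear in the execution log. The actual `ResumeService` logic is not shown in this plan.

```python
import hashlib
import hmac
import json
import secrets


def tool_args_sha256(args: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so equal args hash equally.
    canonical = json.dumps(args, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def create_pending_guard(tool_name: str, args: dict) -> dict:
    """Record what was shown to the user when the run enters pending."""
    return {
        "pending_tool_name": tool_name,
        "pending_tool_args_sha256": tool_args_sha256(args),
        "pending_tool_nonce": secrets.token_hex(16),
    }


def _same(a: str, b: str) -> bool:
    # Constant-time comparison on bytes, which also works for non-ASCII input.
    return hmac.compare_digest(a.encode("utf-8"), b.encode("utf-8"))


def verify_resume(guard: dict, tool_name: str, args: dict, nonce: str) -> bool:
    """Accept a resume only if name, args hash, and nonce all match the guard."""
    checks = (
        _same(guard["pending_tool_name"], tool_name),
        _same(guard["pending_tool_args_sha256"], tool_args_sha256(args)),
        _same(guard["pending_tool_nonce"], nonce),
    )
    return all(checks)
```

Hashing a canonical form means key order in the resumed payload does not matter, while any change to the approved arguments is rejected.
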
### Task 7: Close Out L2 Re-Review Blockers (Follow-Up Fixes After Second Review)

**Files:**
- Modify: `backend/src/core/agent/application/resume_service.py`
- Modify: `backend/src/core/agent/application/run_service.py`
- Modify: `apps/lib/features/chat/data/services/ag_ui_service.dart`
- Test: `backend/tests/unit/core/agent/test_run_resume_service.py`
- Test: `apps/test/features/chat/ag_ui_service_test.dart`

**Checklist:**
- [x] Fix SSE replay: the frontend stores and resends `Last-Event-ID`
- [x] Tighten backend DB-write triggers: remove the "keyword auto-creates schedule" path and keep only the explicit `#tool:` trigger
- [x] Fix resume result injection: the backend persists/replays only the sanitized, controlled payload
- [x] Fix resume-after-failure on the frontend: abort resume when the local tool returns `ok != true`
- [x] Add the corresponding regression tests

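The two SSE fixes named in Tasks 6 and 7 (tolerating a bad packet and resending `Last-Event-ID`) can be illustrated together. The real client is Dart code in `ag_ui_service.dart`; the sketch below uses Python only to stay consistent with the other snippets here. The `SseReader` class is hypothetical, handles only the `id:` and `data:` fields, and its choice to advance the cursor past a malformed packet is an assumption rather than documented project behavior.

```python
import json
from typing import Iterable, Iterator, Optional


class SseReader:
    """Parse SSE lines, skip malformed JSON payloads, remember the last event id."""

    def __init__(self) -> None:
        self.last_event_id: Optional[str] = None

    def events(self, lines: Iterable[str]) -> Iterator[dict]:
        pending_id: Optional[str] = None
        data: list = []
        for raw in lines:
            line = raw.rstrip("\r\n")
            if line.startswith("id:"):
                pending_id = line[3:].strip()
            elif line.startswith("data:"):
                value = line[5:]
                data.append(value[1:] if value.startswith(" ") else value)
            elif line == "":
                # A blank line ends the current event.
                if pending_id is not None:
                    # Advance the cursor even for a bad packet so it is not replayed.
                    self.last_event_id = pending_id
                if data:
                    try:
                        yield json.loads("\n".join(data))
                    except json.JSONDecodeError:
                        pass  # one bad packet is dropped; the stream continues
                pending_id, data = None, []

    def reconnect_headers(self) -> dict:
        # Sent on reconnect so the server can skip events already delivered.
        return {"Last-Event-ID": self.last_event_id} if self.last_event_id else {}
```

On reconnect, the client would merge `reconnect_headers()` into the request headers of the new SSE connection.
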
### Task 8: Fill In Medium-Risk Security Gaps (HTTP Limits Up Front + Fail-Closed Guards)

**Files:**
- Modify: `backend/src/v1/agent/router.py`
- Add: `backend/tests/unit/v1/agent/test_router_guards.py`
- Modify: `backend/tests/integration/v1/agent/test_routes.py`

**Checklist:**
- [x] The HTTP layer validates `RunAgentInput` limits (size / message count / text length) before enqueue
- [x] On Redis errors, run rate limiting and the SSE quota now fail closed
- [x] Add guard unit tests and route integration tests

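"Fail closed" in the checklist above means that when the limiter backend cannot be reached, the request is rejected rather than let through. A minimal sketch, assuming a counter callable in place of a real Redis client; `allow_run_request` and `RateLimitBackendError` are hypothetical names, not the router's actual `_allow_run_request` implementation:

```python
from typing import Callable


class RateLimitBackendError(Exception):
    """Stand-in for a Redis connectivity error (hypothetical name)."""


def allow_run_request(incr_counter: Callable[[str], int], key: str, limit: int) -> bool:
    """Return True only when the counter backend answers and the caller is under the limit."""
    try:
        current = incr_counter(key)
    except RateLimitBackendError:
        # Fail closed: if the limiter cannot be consulted, reject the request.
        return False
    return current <= limit
```

The fail-open alternative would return `True` in the `except` branch, which keeps the service available during an outage but removes the protection exactly when it is hardest to observe.
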
## Execution Log (Updated as Each Item Completes)

- 2026-03-07 16:35: Initialized the plan document; recorded the review findings and the task breakdown.
- 2026-03-07 16:44: Completed Task 1. Backend `RunService/ResumeService` accept only `RunAgentInput`; frontend `ToolCallResultEvent` uses only `content`.
  Verification:
  - `uv run pytest backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/integration/core/agent/test_queue_run_resume.py backend/tests/unit/v1/agent/test_service.py -q` passed (including some `skip`).
  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart` passed.
- 2026-03-07 16:50: Completed Task 2. Added `GET /api/v1/agent/runs/{thread_id}/history?before=YYYY-MM-DD`, which aggregates session messages by day and returns a `STATE_SNAPSHOT` (including `hasMore`).
  Verification:
  - `uv run pytest backend/tests/unit/v1/agent/test_service.py backend/tests/integration/v1/agent/test_routes.py -q` passed.
- 2026-03-07 17:09: Completed Task 3. Frontend `AgUiService` now uses one API call path; mock/true share request and event parsing; history now uses the `STATE_SNAPSHOT` from `/api/v1/agent/history`.
  Verification:
  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart` passed.
- 2026-03-07 17:09: Completed Task 4. Added the frontend `navigate_to_route` tool (runs after approval, then resumes) and the backend `create_calendar_event` tool (persists to `schedule_items`, returns `TOOL_CALL_RESULT`), and injected the available tools into the system prompt for the backend to parse.
  Verification:
  - `uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q` passed (including `skip`).
  - `uv run ruff check backend/src/core/agent backend/src/v1/agent backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent` passed.
- 2026-03-07 17:10: Completed Task 5. `docs/runtime/runtime-route.md` now includes the history endpoint and a `STATE_SNAPSHOT` example, and the run/resume protocol description is updated to the single format.
- 2026-03-07 17:29: Completed Task 6. Closed out the high-risk review findings:
  - Backend `enqueue_run` now handles the concurrent session-creation race (`IntegrityError -> rollback -> owner recheck`).
  - Backend run/resume add a pending tool guard (`pending_tool_name/pending_tool_args_sha256/pending_tool_nonce`) and strict resume validation.
  - Frontend SSE parsing tolerates bad packets, tool results without ui no longer render an empty card, and the navigation tool has an allowlist.
  Verification:
  - `uv run pytest backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/unit/core/agent/test_state_snapshot.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/core/agent/test_queue_run_resume.py -q` passed (`25 passed, 3 skipped`).
  - `uv run ruff check backend/src/core/agent/application/run_service.py backend/src/core/agent/application/resume_service.py backend/src/core/agent/application/session_state_persistence.py backend/src/v1/agent/service.py backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/unit/core/agent/test_state_snapshot.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/core/agent/test_queue_run_resume.py` passed.
  - `cd apps && flutter test test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart` passed (`33 passed`).
- 2026-03-07 17:33: Ran the full target verification commands:
  - `uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q` passed (including `skip`).
  - `uv run ruff check backend/src/core/agent backend/src/v1/agent backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent` passed.
  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart` passed (`69 passed`).
- 2026-03-07 17:46: Completed Task 7 (second-round fixes for the new blockers raised by the L2 gate):
  - Frontend `AgUiService` now resends `Last-Event-ID`, avoiding duplicate replay on the same thread.
  - Backend `RunService` drops the "schedule keyword auto-write to DB" path and keeps only explicit tool triggers.
  - Backend `ResumeService` adds a sanitize step that rejects injected `ui/content` pollution.
  - If the local tool fails after approval, the frontend no longer calls resume.
  Verification:
  - `uv run pytest backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/unit/core/agent/test_state_snapshot.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/core/agent/test_queue_run_resume.py -q` passed (`26 passed, 3 skipped`).
  - `uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q` passed (including `skip`).
  - `uv run ruff check backend/src/core/agent/application/run_service.py backend/src/core/agent/application/resume_service.py backend/src/core/agent/application/session_state_persistence.py backend/src/v1/agent/service.py backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/unit/core/agent/test_state_snapshot.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/core/agent/test_queue_run_resume.py` passed.
  - `cd apps && flutter test test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart` passed (`35 passed`).
  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart` passed (`71 passed`).
  - L2 re-review result: after re-review, `code-reviewer` and `security-reviewer` confirmed the earlier HIGH findings are resolved and found no new CRITICAL/HIGH.
- 2026-03-07 17:56: Completed Task 8 (medium-risk security gaps):
  - `router` adds `parse_run_input` pre-validation on `/agent/runs` and `/agent/runs/{thread_id}/resume`.
  - `_allow_run_request` and `_acquire_sse_slot` now fail closed on Redis errors.
  - Added `test_router_guards.py` and extended `test_routes.py` to cover a 422 for oversized payloads.
  Verification:
  - `uv run pytest backend/tests/unit/v1/agent/test_router_guards.py backend/tests/integration/v1/agent/test_routes.py -q` passed (`8 passed`).
  - `uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q` passed (including `skip`).
  - `uv run ruff check backend/src/core/agent backend/src/v1/agent backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent` passed.
  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart` passed (`71 passed`).
  - L2 re-review result: the incremental `code-reviewer` and `security-reviewer` passes both confirm there are currently no new `CRITICAL/HIGH` findings.
@@ -1,207 +0,0 @@
# Agent Tool Architecture Design

**Date:** 2026-03-08
**Source:** `docs/bugs/2026-03-08-agent-tool-architecture.md`
**Scope:** `backend/src/core/agent`
**Status:** Approved for planning

---

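## 1. Objective

Fix the 8 issues in the Agent tool architecture. The first priority is restoring the end-to-end loop (after tool approval, inference continues and produces a final reply). Within the same release, also deliver structured tool output, tiered storage, decoupled stage strategies, and multimodal and voice input.

---

## 2. Deliverables

1. Two-phase fix blueprint (Phase 1 + Phase 2)
2. Unified event and state machine design (AG-UI Step events + approval resume)
3. Redrawn interface boundaries and responsibilities (run/resume/runtime/persistence)
4. Risk and rollback strategy
5. Acceptance criteria (two golden paths)

---
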
## 3. Constraints And Decisions

### 3.1 Release Strategy

- One-shot cutover
- No gradual rollout
- No dual-track operation
- No compatibility code left behind

### 3.2 Contract Decisions

- The `run` endpoint may take a breaking change: drop the semantics where the frontend sends the full `messages` history
- The frontend sends only the current input; the backend is authoritative for history
- Phase 1 introduces no client hint
- The tool architecture migrates fully to CrewAI Tools in Phase 1 (no bridging)

### 3.3 AG-UI Event Decisions

- All three stages always emit `StepStarted/StepFinished`: `intent`, `execution`, `organization`
- Waiting for tool approval does not add a separate step; it is an internal state of execution
- The backend emits only English machine names; the frontend maps them to display copy

### 3.4 ASR / Multimodal Decisions

- The first multimodal version supports file upload only (no URLs)
- The first ASR version is "upload audio after recording ends -> backend returns the transcript synchronously"
- The frontend fills the transcript back into the input box, then calls run

---

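## 4. Complexity And Risk

- **Complexity:** S2 (architectural change spanning multiple core modules)
- **Risk Tier:** L2 (includes a high-risk security item: the frontend can tamper with history)

Risk-driven principle: fix the loop and the security issues first, then expand capabilities.

---
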
## 5. Phased Plan

## Phase 1 - Close Loop And Stop Security Bleeding

**Bugs:** #1, #5, #6

### Goals

1. The backend becomes the single source of truth for history and context
2. After tool approval, the Agent Loop resumes and continues
3. Tool execution migrates fully to the CrewAI Tools registry

### Module Boundaries

- `backend/src/core/agent/application/run_service.py`
  - Only parses the current input, assembles backend context, and triggers the runtime
  - Remove the path that trusts frontend history
  - Remove hard-coded tool dispatch

- `backend/src/core/agent/application/resume_service.py`
  - After approval is confirmed, trigger an asynchronous continuation and return `accepted` immediately
  - Must not set `COMPLETED` directly after tool execution
  - Add `approval_request_id` idempotency protection

- `backend/src/core/agent/infrastructure/crewai/runtime.py`
  - Introduce CrewAI Tools registration and injection
  - Assemble the tool set per agent/stage
  - Emit Step start/end events uniformly for all three stages

- `backend/src/core/agent/application/session_state_persistence.py`
  - Guarantee consistent persistence of approval state, tool results, and continuation state
  - Keep a stable interface for Phase 2 metadata extensions

### Runtime Flow (Phase 1)

1. `run` receives the current input
2. The backend reads Redis/DB to rebuild history
3. Enter the three stages: intent/execution/organization
4. If execution triggers a tool approval, enter `WAITING_APPROVAL`
5. After approving, the frontend calls `resume`
6. `resume` triggers the continuation asynchronously: execute tool -> write tool result -> continue loop
7. Generate the final assistant reply and emit `RunFinished`

---

## Phase 2 - Capability Completion In Same Version

**Order:** #3 -> #2 -> #4 -> #7 -> #8

### #3 Tool Output As UI Schema v1

- Unified tool output structure: `type/version/data/actions`
- A single version, `v1`; no parallel versions in the short term

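One way to pin down the `type/version/data/actions` envelope is a small validated container. The sketch below is illustrative: the `ToolUiResult` class and its validation rules are assumptions, and only the four field names and the single `v1` version come from the design.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class ToolUiResult:
    """Envelope with the four fields named in the design: type/version/data/actions."""

    type: str
    data: dict
    actions: list = field(default_factory=list)
    version: str = "v1"

    def __post_init__(self) -> None:
        # Only one schema version is supported, per the "single version" decision.
        if self.version != "v1":
            raise ValueError(f"unsupported UI schema version: {self.version}")
        if not self.type:
            raise ValueError("type must be a non-empty string")

    def to_dict(self) -> dict:
        return {
            "type": self.type,
            "version": self.version,
            "data": self.data,
            "actions": self.actions,
        }
```

Rejecting unknown versions at construction time keeps every tool on the same schema until a second version is deliberately introduced.
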
### #2 Tool Result Object Storage

- Large payloads go to object storage
- The DB stores only the summary, index, and checksum information
- Enable `storage_bucket/storage_path/payload_sha256`

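The split between inline and object-stored payloads might be sketched like this. Everything except the three metadata field names (`storage_bucket`, `storage_path`, `payload_sha256`) is an assumption made for illustration: the 8 KiB threshold, the `ObjectStore` protocol, the path layout, and the `persist_tool_result` name are not taken from the design.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional, Protocol

INLINE_LIMIT_BYTES = 8 * 1024  # assumed threshold; the design does not specify one


class ObjectStore(Protocol):
    # Hypothetical minimal interface; MinIO or Supabase clients would be adapted to it.
    def put(self, bucket: str, path: str, body: bytes) -> None: ...


@dataclass
class ToolResultMetadata:
    payload_sha256: str
    size_bytes: int
    inline_payload: Optional[dict] = None
    storage_bucket: Optional[str] = None
    storage_path: Optional[str] = None


def persist_tool_result(store: ObjectStore, bucket: str, message_id: str, payload: dict) -> ToolResultMetadata:
    body = json.dumps(payload, sort_keys=True, ensure_ascii=False).encode("utf-8")
    digest = hashlib.sha256(body).hexdigest()
    if len(body) <= INLINE_LIMIT_BYTES:
        # Small results stay inline; only the checksum is added.
        return ToolResultMetadata(digest, len(body), inline_payload=payload)
    path = f"tool-results/{message_id}/{digest}.json"
    store.put(bucket, path, body)  # write the object first; the DB row keeps only the reference
    return ToolResultMetadata(digest, len(body), storage_bucket=bucket, storage_path=path)
```

Storing the digest alongside the path lets a reader verify the object after download and detect a mismatch between storage and DB metadata.
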
### #4 Stage-Level Strategy Decoupling

- intent/execution/organization each support independent parameters and tool strategies
- The intent stage can be configured as read-only (tools disabled)

### #7 Multimodal Input

- The first version supports image file upload as input
- Non-text content is no longer dropped

### #8 ASR API

- Add a speech transcription API (returns the transcript synchronously)
- Speech transcription is decoupled from agent run

---

## 6. Session State And Events

Recommended state machine:

`RUNNING -> WAITING_APPROVAL -> RESUMING -> RUNNING -> COMPLETED/FAILED`

Key constraints:

- A repeated approval request must not execute the tool again (idempotency)
- `COMPLETED` is set only when the loop ends naturally
- Step events cover the full lifecycle of all three stages

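The state machine and the idempotency constraint above can be made concrete with a transition table. This is a sketch under assumptions: the `Session` class and in-memory `seen_approval_ids` set are hypothetical, and allowing `FAILED` from `WAITING_APPROVAL` and `RESUMING` is an inference from the "timeout terminal state" control in the risk section rather than something the state line spells out.

```python
from enum import Enum


class SessionState(str, Enum):
    RUNNING = "RUNNING"
    WAITING_APPROVAL = "WAITING_APPROVAL"
    RESUMING = "RESUMING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


# Allowed transitions, read off the recommended state machine.
TRANSITIONS = {
    SessionState.RUNNING: {SessionState.WAITING_APPROVAL, SessionState.COMPLETED, SessionState.FAILED},
    SessionState.WAITING_APPROVAL: {SessionState.RESUMING, SessionState.FAILED},
    SessionState.RESUMING: {SessionState.RUNNING, SessionState.FAILED},
    SessionState.COMPLETED: set(),
    SessionState.FAILED: set(),
}


class Session:
    def __init__(self) -> None:
        self.state = SessionState.RUNNING
        self.seen_approval_ids = set()

    def transition(self, target: SessionState) -> None:
        if target not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state.value} -> {target.value}")
        self.state = target

    def approve(self, approval_request_id: str) -> bool:
        """Return True if this call started a resume, False for a duplicate request."""
        if approval_request_id in self.seen_approval_ids:
            return False  # idempotent: the tool must not run twice
        self.transition(SessionState.RESUMING)
        self.seen_approval_ids.add(approval_request_id)
        return True
```
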
---

## 7. Acceptance Criteria

### 7.1 Golden Path A (No Tool)

After the user sends input, the run goes through all three stages and produces a final reply; the frontend receives the complete step events and `RunFinished`.

### 7.2 Golden Path B (Tool + Approval + Resume)

The user triggers a tool call; after approval the system continues asynchronously and eventually produces an assistant reply. The session does not end immediately after approval.

### 7.3 Security Validation

Even if the frontend submits forged history fields, the backend's actual context is unaffected.

### 7.4 Event Validation

Every run must include `StepStarted/StepFinished` for `intent/execution/organization`.

---

## 8. Risk And Rollback

### High Risk: #6 Context Ownership Migration

- Risk: context bound to the wrong session, missing history
- Control: session ownership check + consistent Redis/DB reads
- Rollback: fall back to "backend DB-only history rebuild"

### High Risk: #5 Async Resume Consistency

- Risk: duplicate approvals, stuck state
- Control: approval idempotency key + state transition constraints + timeout terminal state
- Rollback: degrade to "return the tool result only, no automatic continuation"

### Medium Risk: #2 Storage Split Consistency

- Risk: object storage and DB metadata diverge
- Control: write the object first, then the metadata + compensating cleanup on failure
- Rollback: temporarily return to inline DB storage

---

## 9. Bug-To-Phase Mapping

- **Phase 1:** #1, #5, #6
- **Phase 2:** #2, #3, #4, #7, #8

---

## 10. Next Step

Move on to implementation planning: break this design into a task-level executable plan (files, tests, commands, acceptance evidence).
@@ -1,449 +0,0 @@
# Agent Tool Architecture Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

**Goal:** Fix the 8 issues in the agent tool architecture: first restore the end-to-end loop and security correctness, then deliver UI Schema, object storage, stage decoupling, multimodal input, and ASR.

**Architecture:** Delivery is in two phases. Phase 1 completes backend ownership of context, the full migration to CrewAI Tools, and the asynchronous continuation loop after approval. Phase 2 extends capabilities one by one in the order `#3 -> #2 -> #4 -> #7 -> #8`. All changes follow AG-UI event stream semantics, and all three stages always emit StepStarted/StepFinished.

**Tech Stack:** FastAPI, Pydantic, CrewAI, LiteLLM, Redis, Postgres, MinIO/Supabase Storage, pytest

**Status:** Completed on 2026-03-08 (Task 1-8 delivered; Task 4/5/6 finalized with E2E object-storage verification)

---

### Task 1: Lock Down the Phase 1 Contract (Remove Frontend History Semantics)

**Files:**
- Modify: `backend/src/core/agent/domain/agui_input.py`
- Modify: `backend/src/core/agent/application/run_service.py`
- Modify: `backend/src/v1/agent/schemas.py`
- Test: `backend/tests/unit/core/agent/test_run_resume_service.py`

**Step 1: Write the failing test**

```python
def test_run_ignores_client_history_messages(fake_run_input_with_messages):
    result = service.run(run_input=fake_run_input_with_messages)
    assert result.used_context_source == "backend"
```

**Step 2: Run test to verify it fails**

Run: `cd backend && uv run pytest tests/unit/core/agent/test_run_resume_service.py -k ignores_client_history -v`
Expected: FAIL; the current implementation still reads/depends on frontend history.

**Step 3: Write minimal implementation**

```python
# run_service.py
user_input = extract_latest_user_text(run_input)
history = await load_context_from_backend_sources(session_id)
```

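The plan does not show `extract_latest_user_text` itself. As a rough illustration of "take only the current input, ignore client-supplied history", a helper along these lines would do; it models the run input as a plain dict with a `messages` list instead of the project's Pydantic `RunAgentInput`, which is an assumption made to keep the sketch self-contained.

```python
from typing import Any


def extract_latest_user_text(run_input: dict) -> str:
    """Return the text of the last user message; earlier client messages are ignored."""
    messages: list = run_input.get("messages", [])
    for message in reversed(messages):
        content: Any = message.get("content")
        if message.get("role") == "user" and isinstance(content, str):
            return content
    raise ValueError("run input contains no user text message")
```
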
**Step 4: Run test to verify it passes**

Run: `cd backend && uv run pytest tests/unit/core/agent/test_run_resume_service.py -k ignores_client_history -v`
Expected: PASS

**Step 5: Commit**

```bash
git add backend/src/core/agent/domain/agui_input.py backend/src/core/agent/application/run_service.py backend/src/v1/agent/schemas.py backend/tests/unit/core/agent/test_run_resume_service.py
git commit -m "refactor(agent): make backend own conversation context"
```

### Task 2: Full Migration to CrewAI Tools (Replace Hard-Coded Dispatch)

**Files:**
- Create: `backend/src/core/agent/infrastructure/crewai/tools_registry.py`
- Create: `backend/src/core/agent/infrastructure/crewai/tools/create_calendar_event_tool.py`
- Modify: `backend/src/core/agent/infrastructure/crewai/runtime.py`
- Modify: `backend/src/core/agent/application/run_service.py`
- Test: `backend/tests/unit/core/agent/test_crewai_runtime.py`

**Step 1: Write the failing test**

```python
def test_runtime_uses_registered_crewai_tools():
    runtime = build_runtime_with_registry(["create_calendar_event"])
    result = runtime.execute(user_input="Create a calendar event for me", system_prompt="x")
    assert result.tool_calls[0].tool_name == "create_calendar_event"
```

**Step 2: Run test to verify it fails**

Run: `cd backend && uv run pytest tests/unit/core/agent/test_crewai_runtime.py -k registered_crewai_tools -v`
Expected: FAIL; the current path is still hard-coded in run_service.

**Step 3: Write minimal implementation**

```python
# tools_registry.py
TOOLS = {"create_calendar_event": CreateCalendarEventTool()}

# Illustrative stage mapping (an assumption; the plan does not define it).
STAGE_TOOL_MAP = {"execution": list(TOOLS.values())}


def tools_for_stage(stage: str) -> list[BaseTool]:
    return STAGE_TOOL_MAP.get(stage, [])
```

**Step 4: Run test to verify it passes**

Run: `cd backend && uv run pytest tests/unit/core/agent/test_crewai_runtime.py -k registered_crewai_tools -v`
Expected: PASS

**Step 5: Commit**

```bash
git add backend/src/core/agent/infrastructure/crewai/tools_registry.py backend/src/core/agent/infrastructure/crewai/tools/create_calendar_event_tool.py backend/src/core/agent/infrastructure/crewai/runtime.py backend/src/core/agent/application/run_service.py backend/tests/unit/core/agent/test_crewai_runtime.py
git commit -m "feat(agent): migrate backend tools to crewai tool registry"
```

### Task 3: Fix the Asynchronous Continuation Loop After Approval (#5)

**Files:**
- Modify: `backend/src/core/agent/application/resume_service.py`
- Modify: `backend/src/core/agent/infrastructure/queue/tasks.py`
- Modify: `backend/src/core/agent/application/session_state_persistence.py`
- Test: `backend/tests/integration/core/agent/test_queue_run_resume.py`

**Step 1: Write the failing test**

```python
# thread_id is an assumed fixture for a run that is waiting on approval.
def test_resume_triggers_async_loop_until_final_assistant_message(client, thread_id):
    response = client.post(f"/v1/agent/runs/{thread_id}/resume", json={"approve": True})
    assert response.status_code == 202
    assert eventually_has_final_assistant_message(thread_id)
```

**Step 2: Run test to verify it fails**

Run: `cd backend && uv run pytest tests/integration/core/agent/test_queue_run_resume.py -k triggers_async_loop -v`
Expected: FAIL; the run currently completes right after approval.

**Step 3: Write minimal implementation**

```python
# resume_service.py
await mark_session_resuming(...)
await enqueue_resume_task(...)
return ResumeAccepted(...)
```

**Step 4: Run test to verify it passes**

Run: `cd backend && uv run pytest tests/integration/core/agent/test_queue_run_resume.py -k triggers_async_loop -v`
Expected: PASS

**Step 5: Commit**

```bash
git add backend/src/core/agent/application/resume_service.py backend/src/core/agent/infrastructure/queue/tasks.py backend/src/core/agent/application/session_state_persistence.py backend/tests/integration/core/agent/test_queue_run_resume.py
git commit -m "fix(agent): continue agent loop asynchronously after tool approval"
```

### Task 4: Complete Step Events for All Three Stages (intent/execution/organization)

**Files:**
- Modify: `backend/src/core/agent/infrastructure/crewai/runtime.py`
- Modify: `backend/src/core/agent/infrastructure/agui/bridge.py`
- Test: `backend/tests/unit/core/agent/test_agui_bridge.py`
- Test: `backend/tests/integration/v1/agent/test_sse_flow_live.py`

**Step 1: Write the failing test**

```python
def test_each_stage_emits_step_started_and_finished():
    events = collect_events_from_run(...)
    assert has_step_pair(events, "intent")
    assert has_step_pair(events, "execution")
    assert has_step_pair(events, "organization")
```

**Step 2: Run test to verify it fails**

Run: `cd backend && uv run pytest tests/integration/v1/agent/test_sse_flow_live.py -k emits_step_started_and_finished -v`
Expected: FAIL; events for at least one stage are missing.

**Step 3: Write minimal implementation**

```python
emit_step_started(stage)
stage_output = run_stage(stage)
emit_step_finished(stage)
```

**Step 4: Run test to verify it passes**

Run: `cd backend && uv run pytest tests/integration/v1/agent/test_sse_flow_live.py -k emits_step_started_and_finished -v`
Expected: PASS

**Step 5: Commit**

```bash
git add backend/src/core/agent/infrastructure/crewai/runtime.py backend/src/core/agent/infrastructure/agui/bridge.py backend/tests/unit/core/agent/test_agui_bridge.py backend/tests/integration/v1/agent/test_sse_flow_live.py
git commit -m "feat(agent): emit ag-ui step events for three-stage flow"
```

### Task 5: Unify Tool Output as UI Schema v1 (#3)

**Files:**
- Modify: `backend/src/core/agent/infrastructure/crewai/tools/create_calendar_event_tool.py`
- Modify: `backend/src/core/agent/domain/message_metadata.py`
- Test: `backend/tests/unit/core/agent/test_run_resume_service.py`

**Step 1: Write the failing test**

```python
def test_calendar_tool_returns_ui_schema_v1():
    result = run_calendar_tool(...)
    assert result["type"] == "calendar_card.v1"
    assert result["version"] == "v1"
```

**Step 2: Run test to verify it fails**

Run: `cd backend && uv run pytest tests/unit/core/agent/test_run_resume_service.py -k returns_ui_schema_v1 -v`
Expected: FAIL; the tool currently returns a bare status/event_id.

**Step 3: Write minimal implementation**

```python
return {
    "type": "calendar_card.v1",
    "version": "v1",
    "data": {...},
    "actions": [...],
}
```

**Step 4: Run test to verify it passes**

Run: `cd backend && uv run pytest tests/unit/core/agent/test_run_resume_service.py -k returns_ui_schema_v1 -v`
Expected: PASS

**Step 5: Commit**

```bash
git add backend/src/core/agent/infrastructure/crewai/tools/create_calendar_event_tool.py backend/src/core/agent/domain/message_metadata.py backend/tests/unit/core/agent/test_run_resume_service.py
git commit -m "feat(agent): return tool results as ui schema v1"
```

### Task 6: Tool Result Object Storage (#2)

**Files:**
- Modify: `backend/src/core/agent/application/session_state_persistence.py`
- Modify: `backend/src/core/agent/domain/message_metadata.py`
- Test: `backend/tests/integration/core/agent/test_session_message_persistence.py`

**Step 1: Write the failing test**

```python
def test_large_tool_payload_persisted_to_object_storage():
    meta = persist_large_tool_result(...)
    assert meta.storage_bucket is not None
    assert meta.storage_path is not None
```

**Step 2: Run test to verify it fails**

Run: `cd backend && uv run pytest tests/integration/core/agent/test_session_message_persistence.py -k object_storage -v`
Expected: FAIL; metadata is currently empty.

**Step 3: Write minimal implementation**

```python
payload_ref = await persist_tool_result_payload(...)
metadata.storage_bucket = payload_ref.bucket
metadata.storage_path = payload_ref.path
metadata.payload_sha256 = payload_ref.sha256
```

**Step 4: Run test to verify it passes**

Run: `cd backend && uv run pytest tests/integration/core/agent/test_session_message_persistence.py -k object_storage -v`
Expected: PASS

**Step 5: Commit**

```bash
git add backend/src/core/agent/application/session_state_persistence.py backend/src/core/agent/domain/message_metadata.py backend/tests/integration/core/agent/test_session_message_persistence.py
git commit -m "feat(agent): persist large tool results to object storage"
```

### Task 7: 三阶段参数解耦(#4)
|
||||
|
||||
**Files:**
|
||||
- Modify: `backend/src/core/agent/infrastructure/crewai/runtime.py`
|
||||
- Modify: `backend/src/core/agent/infrastructure/config/resolver.py`
|
||||
- Test: `backend/tests/unit/core/agent/test_config_resolver.py`
|
||||
- Test: `backend/tests/unit/core/agent/test_crewai_runtime.py`
|
||||
|
||||
**Step 1: Write the failing test**
|
||||
|
||||
```python
|
||||
def test_intent_stage_can_disable_tools():
|
||||
cfg = load_stage_config(intent_tools=[])
|
||||
result = run_intent_stage(cfg)
|
||||
assert result.tool_calls == []
|
||||
```
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
|
||||
Run: `cd backend && uv run pytest tests/unit/core/agent/test_crewai_runtime.py -k intent_stage_can_disable_tools -v`
|
||||
Expected: FAIL,当前三阶段共享同一 llm/tools 配置。
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
```python
|
||||
stage_cfg = config.for_stage(stage)
|
||||
run_stage(..., llm_config=stage_cfg.llm, tools=stage_cfg.tools)
|
||||
```
|
||||
|
||||
**Step 4: Run test to verify it passes**
|
||||
|
||||
Run: `cd backend && uv run pytest tests/unit/core/agent/test_crewai_runtime.py -k intent_stage_can_disable_tools -v`
|
||||
Expected: PASS
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git add backend/src/core/agent/infrastructure/crewai/runtime.py backend/src/core/agent/infrastructure/config/resolver.py backend/tests/unit/core/agent/test_config_resolver.py backend/tests/unit/core/agent/test_crewai_runtime.py
git commit -m "refactor(agent): decouple llm and tool strategy by stage"
```
|
||||
|
||||
### Task 8: Support multimodal image input (file upload) (#7)
|
||||
|
||||
**Files:**
|
||||
- Modify: `backend/src/core/agent/domain/agui_input.py`
|
||||
- Modify: `backend/src/core/agent/infrastructure/crewai/runtime.py`
|
||||
- Modify: `backend/src/core/agent/infrastructure/litellm/client.py`
|
||||
- Test: `backend/tests/unit/core/agent/test_litellm_client.py`
|
||||
|
||||
**Step 1: Write the failing test**
|
||||
|
||||
```python
def test_image_content_block_is_preserved_for_llm():
    payload = build_multimodal_payload(text="分析图片", image_file="a.png")
    assert payload_contains_image_block(payload)
```
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
|
||||
Run: `cd backend && uv run pytest tests/unit/core/agent/test_litellm_client.py -k image_content_block_is_preserved -v`
|
||||
Expected: FAIL, because non-text content is currently dropped.
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
```python
if item.type == "image":
    blocks.append({"type": "image_url", "image_url": {"url": signed_file_url}})
```
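
A slightly fuller sketch of the mapping in Step 3, using only the standard library, might look like the following. `ContentItem` and `to_llm_blocks` are hypothetical names rather than code from the repository, and the URL is a placeholder for whatever signed file URL the backend produces. The sketch raises on unknown content types instead of silently dropping them, which is an assumption about the desired behavior, not something the plan specifies.

```python
from dataclasses import dataclass


@dataclass
class ContentItem:
    """Hypothetical input item; `type` is "text" or "image"."""

    type: str
    text: str = ""
    url: str = ""  # assumed to be an already-signed file URL


def to_llm_blocks(items: list[ContentItem]) -> list[dict]:
    """Convert input items into chat-style content blocks."""
    blocks: list[dict] = []
    for item in items:
        if item.type == "text":
            blocks.append({"type": "text", "text": item.text})
        elif item.type == "image":
            blocks.append({"type": "image_url", "image_url": {"url": item.url}})
        else:
            # Fail loudly so unsupported content is never lost without notice.
            raise ValueError(f"unsupported content type: {item.type}")
    return blocks


blocks = to_llm_blocks(
    [
        ContentItem(type="text", text="Describe this image"),
        ContentItem(type="image", url="https://files.example.invalid/a.png?sig=abc"),
    ]
)
```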
|
||||
|
||||
**Step 4: Run test to verify it passes**
|
||||
|
||||
Run: `cd backend && uv run pytest tests/unit/core/agent/test_litellm_client.py -k image_content_block_is_preserved -v`
|
||||
Expected: PASS
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git add backend/src/core/agent/domain/agui_input.py backend/src/core/agent/infrastructure/crewai/runtime.py backend/src/core/agent/infrastructure/litellm/client.py backend/tests/unit/core/agent/test_litellm_client.py
git commit -m "feat(agent): support multimodal image input blocks"
```
|
||||
|
||||
### Task 9: Add a synchronous ASR transcription API (#8)
|
||||
|
||||
**Files:**
|
||||
- Create: `backend/src/v1/agent/asr_router.py`
|
||||
- Modify: `backend/src/v1/agent/router.py`
|
||||
- Create: `backend/src/v1/agent/asr_service.py`
|
||||
- Create: `backend/src/v1/agent/asr_schemas.py`
|
||||
- Test: `backend/tests/integration/v1/agent/test_routes.py`
|
||||
|
||||
**Step 1: Write the failing test**
|
||||
|
||||
```python
def test_asr_transcribe_returns_sync_transcript(client, wav_file):
    resp = client.post("/v1/agent/asr/transcribe", files={"audio": wav_file})
    assert resp.status_code == 200
    assert resp.json()["transcript"]
```
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
|
||||
Run: `cd backend && uv run pytest tests/integration/v1/agent/test_routes.py -k asr_transcribe_returns_sync_transcript -v`
|
||||
Expected: FAIL, because the route does not exist yet.
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
|
||||
```python
@router.post("/asr/transcribe")
async def transcribe(audio: UploadFile) -> AsrTranscribeResponse:
    text = await asr_service.transcribe(audio)
    return AsrTranscribeResponse(transcript=text)
```
|
||||
|
||||
**Step 4: Run test to verify it passes**
|
||||
|
||||
Run: `cd backend && uv run pytest tests/integration/v1/agent/test_routes.py -k asr_transcribe_returns_sync_transcript -v`
|
||||
Expected: PASS
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git add backend/src/v1/agent/asr_router.py backend/src/v1/agent/router.py backend/src/v1/agent/asr_service.py backend/src/v1/agent/asr_schemas.py backend/tests/integration/v1/agent/test_routes.py
git commit -m "feat(agent): add synchronous asr transcription endpoint"
```
|
||||
|
||||
### Task 10: Full verification and documentation alignment
|
||||
|
||||
**Files:**
|
||||
- Modify: `docs/runtime/runtime-route.md`
|
||||
- Modify: `docs/bugs/2026-03-08-agent-tool-architecture.md` (backfill status)
|
||||
|
||||
**Step 1: Run targeted unit suite**
|
||||
|
||||
Run: `cd backend && uv run pytest tests/unit/core/agent -v`
|
||||
Expected: PASS
|
||||
|
||||
**Step 2: Run targeted integration suite**
|
||||
|
||||
Run: `cd backend && uv run pytest tests/integration/core/agent tests/integration/v1/agent -v`
|
||||
Expected: PASS
|
||||
|
||||
**Step 3: Run e2e smoke for agent flow**
|
||||
|
||||
Run: `cd backend && uv run pytest tests/e2e -k "agent or mobile_health" -v`
|
||||
Expected: PASS, or an explicitly recorded reason for each skip
|
||||
|
||||
**Step 4: Run quality gates**
|
||||
|
||||
Run: `cd backend && uv run ruff check src tests && uv run basedpyright`
|
||||
Expected: PASS
|
||||
|
||||
**Step 5: Final commit**
|
||||
|
||||
```bash
git add docs/runtime/runtime-route.md docs/bugs/2026-03-08-agent-tool-architecture.md
git commit -m "docs(agent): align runtime docs with new tool architecture"
```
|
||||
|
||||
---
|
||||
|
||||
## Verification Evidence Requirements
|
||||
|
||||
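When implementation is complete, the following must be produced:

1. Verification results for both golden paths (no tools, and resuming after tool approval)
2. Log excerpts of the StepStarted/StepFinished events for the three stages
3. Security verification results (tampering with history on the frontend has no effect)
4. A sample request/response for the synchronous ASR transcription endpoint
5. A summary of key command output (pytest/ruff/basedpyright)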
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
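- This plan does not retain any compatibility logic.
- This plan uses a one-shot cutover.
- If the scope escalates from S2 to S3 during implementation, pause and update the plan before continuing.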
|
||||
@@ -1,129 +0,0 @@
|
||||
# Runtime Refactor and Prompt Centralization Implementation Plan
|
||||
|
||||
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
|
||||
|
||||
**Goal:** Refactor CrewAI runtime into reusable modules, centralize all prompt text under `core/agent/prompt`, and diagnose flaky front-tool interrupt behavior without adding hardcoded runtime heuristics.
|
||||
|
||||
**Architecture:** Keep `runtime.py` as a thin facade and move parsing/tool/prompt composition/stage execution into cohesive modules. Prompt strings (including stage contracts and injected tool-context instructions) are generated exclusively by prompt-module functions. Keep behavior equivalent by default; only add diagnostic observability for flaky live scenario analysis.
|
||||
|
||||
**Tech Stack:** Python 3.12, FastAPI backend, CrewAI, Pydantic v2, pytest, ruff, basedpyright.
|
||||
|
||||
---
|
||||
|
||||
### Task 1: Add prompt module and centralize all runtime prompt text
|
||||
|
||||
**Files:**
|
||||
- Create: `backend/src/core/agent/prompt/__init__.py`
|
||||
- Create: `backend/src/core/agent/prompt/runtime_stage_prompts.py`
|
||||
- Modify: `backend/src/core/agent/infrastructure/crewai/runtime.py`
|
||||
- Test: `backend/tests/unit/core/agent/test_crewai_runtime.py`
|
||||
|
||||
**Step 1: Write failing test**
|
||||
- Add unit test asserting runtime uses prompt builder output (not inline literals) for stage description/contract/tool context.
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
- Run: `uv run pytest backend/tests/unit/core/agent/test_crewai_runtime.py::test_runtime_uses_prompt_module_for_stage_descriptions -q`
|
||||
- Expected: FAIL because runtime still composes inline strings.
|
||||
|
||||
**Step 3: Implement prompt module**
|
||||
- Add prompt functions:
|
||||
- `build_stage_output_contract(stage: str) -> str`
|
||||
- `build_stage_task_description(...) -> str`
|
||||
- `build_intent_multimodal_prompt(...) -> str`
|
||||
- Use mainstream prompt structure: role/objective/context/constraints/output-format.
|
||||
- Keep rules non-hardcoded and behavior-oriented, avoid keyword-triggered branching rules.
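
To illustrate the role/objective/context/constraints/output-format structure named in Step 3, a prompt-module helper could assemble sections in a fixed order. The sketch below is hypothetical: `render_structured_prompt`, `PROMPT_SECTIONS`, and the sample section text are invented for illustration and are not the module's actual functions or prompts.

```python
# Section order follows the structure named in the plan.
PROMPT_SECTIONS = ("role", "objective", "context", "constraints", "output_format")


def render_structured_prompt(sections: dict[str, str]) -> str:
    """Render all sections in a fixed order, rejecting incomplete input."""
    missing = [name for name in PROMPT_SECTIONS if not sections.get(name)]
    if missing:
        raise ValueError(f"missing prompt sections: {missing}")
    return "\n\n".join(
        f"## {name.replace('_', ' ').title()}\n{sections[name].strip()}"
        for name in PROMPT_SECTIONS
    )


prompt = render_structured_prompt(
    {
        "role": "You are the intent stage of an assistant.",
        "objective": "Decide what the user wants to accomplish.",
        "context": "Available tools are listed by the caller.",
        "constraints": "Describe behavior; do not branch on keywords.",
        "output_format": "Return a single JSON object.",
    }
)
```

Keeping the section order in one place means the runtime only passes data, and no prompt literals need to live in `runtime.py`.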
|
||||
|
||||
**Step 4: Wire runtime to prompt functions**
|
||||
- Replace inline prompt strings in runtime with prompt-module function calls.
|
||||
- Ensure no prompt literals remain in runtime except minimal wiring labels.
|
||||
|
||||
**Step 5: Run tests**
|
||||
- Run: `uv run pytest backend/tests/unit/core/agent/test_crewai_runtime.py -q`
|
||||
- Expected: PASS.
|
||||
|
||||
---
|
||||
|
||||
### Task 2: Split runtime into reusable modules and keep facade stable
|
||||
|
||||
**Files:**
|
||||
- Create: `backend/src/core/agent/infrastructure/crewai/runtime_models.py`
|
||||
- Create: `backend/src/core/agent/infrastructure/crewai/runtime_parsers.py`
|
||||
- Create: `backend/src/core/agent/infrastructure/crewai/runtime_tools.py`
|
||||
- Create: `backend/src/core/agent/infrastructure/crewai/runtime_stage_runner.py`
|
||||
- Modify: `backend/src/core/agent/infrastructure/crewai/runtime.py`
|
||||
- Modify: `backend/src/core/agent/infrastructure/crewai/__init__.py` (if needed)
|
||||
- Test: `backend/tests/unit/core/agent/test_crewai_runtime.py`
|
||||
|
||||
**Step 1: Write failing test**
|
||||
- Add/adjust unit test that imports `CrewAIRuntime` facade and verifies existing contract (`execute`, `map_events`, `is_registered_backend_tool`) still works after split.
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
- Run: `uv run pytest backend/tests/unit/core/agent/test_crewai_runtime.py::test_runtime_facade_contract_stable_after_refactor -q`
|
||||
- Expected: FAIL before module split wiring.
|
||||
|
||||
**Step 3: Extract models/parsers/tools/stage-runner**
|
||||
- Move Pydantic result models to `runtime_models.py`.
|
||||
- Move parse/normalize helpers to `runtime_parsers.py`.
|
||||
- Move tool normalization, routing tool class, pending-front-tool extraction to `runtime_tools.py`.
|
||||
- Move `_run_stage_with_crewai` + usage extraction to `runtime_stage_runner.py`.
|
||||
|
||||
**Step 4: Keep runtime facade thin**
|
||||
- `runtime.py` retains orchestration flow and public API only.
|
||||
- Import and compose extracted modules; no behavior change intended.
|
||||
|
||||
**Step 5: Run tests**
|
||||
- Run: `uv run pytest backend/tests/unit/core/agent/test_crewai_runtime.py -q`
|
||||
- Expected: PASS.
|
||||
|
||||
---
|
||||
|
||||
### Task 3: Diagnose front-tool interrupt instability with explicit observability
|
||||
|
||||
**Files:**
|
||||
- Modify: `backend/src/core/agent/infrastructure/crewai/runtime.py`
|
||||
- Modify: `backend/src/core/agent/infrastructure/crewai/runtime_stage_runner.py`
|
||||
- Modify: `backend/tests/e2e/test_agent_live_flow.py`
|
||||
- Modify: `docs/bugs/2026-03-08-backend-tool-no-events.md`
|
||||
|
||||
**Step 1: Add failing/diagnostic assertion in live test path**
|
||||
- Extend test to capture and print structured diagnostics when `pending_tool_call_id` is `None`:
|
||||
- intent/execution raw+structured output
|
||||
- tool payload injected into prompts
|
||||
- captured tool calls list
|
||||
|
||||
**Step 2: Run targeted live test for evidence**
|
||||
- Run: `AGENT_LIVE_E2E=1 uv run pytest backend/tests/e2e/test_agent_live_flow.py::test_agent_live_front_tool_interrupt_resume_continue -v -rs`
|
||||
- Expected: still flaky/fail, but with actionable diagnostics.
|
||||
|
||||
**Step 3: Analyze evidence and apply non-hardcoded fix**
|
||||
- If input ambiguity: refine test input prompt text under test fixture.
|
||||
- If tool-description injection issue: fix prompt-builder injection logic.
|
||||
- Do not add keyword heuristics in runtime branching.
|
||||
|
||||
**Step 4: Re-run live targeted test**
|
||||
- Same command as Step 2.
|
||||
- Expected: improved stability or clearly documented unresolved root cause.
|
||||
|
||||
**Step 5: Update bug doc**
|
||||
- Add root-cause findings and next actions under Bug 3 section.
|
||||
|
||||
---
|
||||
|
||||
### Task 4: Full verification and hygiene
|
||||
|
||||
**Files:**
|
||||
- Modify (if needed): `backend/tests/unit/core/agent/test_run_resume_service.py`
|
||||
|
||||
**Step 1: Run impacted unit suites**
|
||||
- `uv run pytest backend/tests/unit/core/agent/test_crewai_runtime.py -q`
|
||||
- `uv run pytest backend/tests/unit/core/agent/test_run_resume_service.py -q`
|
||||
|
||||
**Step 2: Run lint/type checks**
|
||||
- `uv run ruff check backend/src/core/agent/prompt backend/src/core/agent/infrastructure/crewai backend/tests/unit/core/agent/test_crewai_runtime.py backend/tests/e2e/test_agent_live_flow.py`
|
||||
- `uv run basedpyright backend/src/core/agent/prompt backend/src/core/agent/infrastructure/crewai backend/tests/unit/core/agent/test_crewai_runtime.py`
|
||||
|
||||
**Step 3: Optional live regression pack (if env ready)**
|
||||
- `AGENT_LIVE_E2E=1 uv run pytest backend/tests/e2e/test_agent_live_flow.py -m live -v -rs`
|
||||
|
||||
**Step 4: Report residual risk**
|
||||
- If live still flaky, report exact failure mode and captured diagnostics (no workaround heuristics).
|
||||
@@ -1,303 +0,0 @@
|
||||
# Cloud Supabase Env Cleanup & JWKS Migration Implementation Plan
|
||||
|
||||
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
|
||||
|
||||
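**Goal:** After switching to cloud Supabase, remove the variables and orchestration for the locally self-hosted Supabase infrastructure, keep Redis + DB + init-job, and change backend JWT verification from `JWT_SECRET` to JWKS public-key verification.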
|
||||
|
||||
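**Architecture:** Backend configuration converges on the minimal set required to run the business logic (Supabase URL/anon/service role + DB + Redis). The auth path fetches public keys via JWKS and verifies signatures by `kid`, replacing shared-secret HS256. Docker orchestration keeps only the business dependencies (redis, db, init-job) and no longer runs the full local Supabase stack.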
|
||||
|
||||
**Tech Stack:** FastAPI, Pydantic Settings, PyJWT (PyJWKClient), Docker Compose, pytest
|
||||
|
||||
---
|
||||
|
||||
### Task 1: Lock down the cloud-mode configuration contract (test first, then change)
|
||||
|
||||
**Files:**
|
||||
- Modify: `backend/tests/unit/test_settings_supabase_env.py`
|
||||
- Modify: `.env.example`
|
||||
|
||||
**Step 1: Write a failing test that defines the new Supabase configuration contract**
|
||||
|
||||
```python
def test_social_prefixed_supabase_env_populates_settings(monkeypatch: MonkeyPatch) -> None:
    monkeypatch.setenv("SOCIAL_SUPABASE__PUBLIC_URL", "https://project.example.supabase.co")
    monkeypatch.setenv("SOCIAL_SUPABASE__ANON_KEY", "anon-key")
    monkeypatch.setenv("SOCIAL_SUPABASE__SERVICE_ROLE_KEY", "service-key")
    monkeypatch.setenv("SOCIAL_SUPABASE__JWT_AUDIENCE", "authenticated")

    settings = Settings()

    assert settings.supabase.public_url == "https://project.example.supabase.co"
    assert settings.supabase.jwt_issuer == "https://project.example.supabase.co/auth/v1"
    assert settings.supabase.jwks_url.endswith("/auth/v1/.well-known/jwks.json")
```
|
||||
|
||||
**Step 2: Run the test to confirm it fails**

Run: `uv run pytest backend/tests/unit/test_settings_supabase_env.py -v`
Expected: FAIL (the `public_url/jwks_url` fields do not exist, or the assertions fail)
|
||||
|
||||
**Step 3: Make the minimal change that lets the test pass (settings only; logic changes follow in later tasks)**

Update `.env.example` to a draft of the minimal cloud-mode variables (placeholders for now; later tasks will finalize the wording):
- `SOCIAL_SUPABASE__PUBLIC_URL=`
- `SOCIAL_SUPABASE__ANON_KEY=`
- `SOCIAL_SUPABASE__SERVICE_ROLE_KEY=`
- `SOCIAL_SUPABASE__JWT_AUDIENCE=authenticated`
- `SOCIAL_SUPABASE__JWT_ISSUER=` (optional; derived from PUBLIC_URL by default)
- `SOCIAL_SUPABASE__JWKS_URL=` (optional; derived from PUBLIC_URL by default)
|
||||
|
||||
**Step 4: Run the test to confirm it passes**
|
||||
|
||||
Run: `uv run pytest backend/tests/unit/test_settings_supabase_env.py -v`
|
||||
Expected: PASS
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git add backend/tests/unit/test_settings_supabase_env.py .env.example
git commit -m "test: define cloud supabase settings contract"
```
|
||||
|
||||
### Task 2: Refactor SupabaseSettings (remove the JWT_SECRET dependency)
|
||||
|
||||
**Files:**
|
||||
- Modify: `backend/src/core/config/settings.py`
|
||||
- Modify: `backend/tests/unit/test_settings_supabase_env.py`
|
||||
|
||||
**Step 1: Write a failing test that pins down the default derivation behavior**
|
||||
|
||||
```python
assert settings.supabase.jwt_issuer == "https://project.example.supabase.co/auth/v1"
assert settings.supabase.jwks_url == "https://project.example.supabase.co/auth/v1/.well-known/jwks.json"
assert "jwt_secret" not in settings.model_dump()["supabase"]
```
|
||||
|
||||
**Step 2: Run the test to confirm it fails**
|
||||
|
||||
Run: `uv run pytest backend/tests/unit/test_settings_supabase_env.py -v`
|
||||
Expected: FAIL
|
||||
|
||||
**Step 3: Implement the minimal configuration refactor**

In `SupabaseSettings`, change to:
- Required: `public_url`, `anon_key`, `service_role_key`
- Optional: `site_url`, `additional_redirect_urls`
- Added: `jwt_audience` (default `authenticated`), `jwt_issuer` (default `${public_url}/auth/v1`), `jwks_url` (default `${jwt_issuer}/.well-known/jwks.json`)
- Removed: `jwt_secret`, `public_scheme`, `public_host`, `kong_http_port`, `kong_https_port`
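
The default derivation rule can be sketched with a standard-library dataclass. This is a stand-in for illustration only: the real `SupabaseSettings` is a Pydantic settings class, and `SupabaseSettingsSketch`, the empty-string sentinel, and the trailing-slash handling are assumptions rather than the actual implementation.

```python
from dataclasses import dataclass


@dataclass
class SupabaseSettingsSketch:
    """Hypothetical stand-in showing only the issuer/JWKS default derivation."""

    public_url: str
    anon_key: str
    service_role_key: str
    jwt_audience: str = "authenticated"
    jwt_issuer: str = ""  # empty means "derive from public_url"
    jwks_url: str = ""  # empty means "derive from jwt_issuer"

    def __post_init__(self) -> None:
        base = self.public_url.rstrip("/")
        if not self.jwt_issuer:
            self.jwt_issuer = f"{base}/auth/v1"
        if not self.jwks_url:
            self.jwks_url = f"{self.jwt_issuer}/.well-known/jwks.json"


settings = SupabaseSettingsSketch(
    public_url="https://project.example.supabase.co",
    anon_key="anon-key",
    service_role_key="service-key",
)
```

Deriving `jwks_url` from `jwt_issuer` rather than from `public_url` means an explicitly overridden issuer also moves the JWKS location, matching the defaults listed above.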
|
||||
|
||||
**Step 4: Run the test to confirm it passes**
|
||||
|
||||
Run: `uv run pytest backend/tests/unit/test_settings_supabase_env.py -v`
|
||||
Expected: PASS
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git add backend/src/core/config/settings.py backend/tests/unit/test_settings_supabase_env.py
git commit -m "refactor: migrate supabase config to cloud jwks fields"
```
|
||||
|
||||
### Task 3: Introduce a JWKS verification component and wire it into the auth dependency
|
||||
|
||||
**Files:**
|
||||
- Create: `backend/src/core/auth/jwt_verifier.py`
|
||||
- Modify: `backend/src/v1/users/dependencies.py`
|
||||
- Create: `backend/tests/unit/core/auth/test_jwt_verifier.py`
|
||||
|
||||
**Step 1: Write failing tests first (core JWT verification behavior)**
|
||||
|
||||
```python
def test_verify_token_with_jwks_success(...):
    claims = verifier.verify(token)
    assert claims["sub"] == str(user_id)


def test_verify_token_rejects_invalid_issuer(...):
    with pytest.raises(TokenValidationError):
        verifier.verify(token_with_wrong_iss)
```
|
||||
|
||||
**Step 2: Run the test to confirm it fails**

Run: `uv run pytest backend/tests/unit/core/auth/test_jwt_verifier.py -v`
Expected: FAIL (the module/class does not exist)
|
||||
|
||||
**Step 3: Implement minimal JWKS verification logic**
|
||||
|
||||
```python
from typing import Any

import jwt


class JwtVerifier:
    def __init__(self, jwks_url: str, issuer: str, audience: str) -> None: ...

    def verify(self, token: str) -> dict[str, Any]:
        # Pick the signing key whose `kid` matches the token header.
        key = self._jwks_client.get_signing_key_from_jwt(token)
        return jwt.decode(
            token,
            key.key,
            algorithms=["RS256", "ES256"],
            audience=self._audience,
            issuer=self._issuer,
            options={"require": ["sub", "aud", "iss", "exp"]},
        )
```
|
||||
|
||||
In `get_current_user`, replace the original `jwt_secret + HS256` verification and map failures uniformly to the existing 401/503 semantics.
|
||||
|
||||
**Step 4: Run the test to confirm it passes**
|
||||
|
||||
Run: `uv run pytest backend/tests/unit/core/auth/test_jwt_verifier.py -v`
|
||||
Expected: PASS
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git add backend/src/core/auth/jwt_verifier.py backend/src/v1/users/dependencies.py backend/tests/unit/core/auth/test_jwt_verifier.py
git commit -m "feat: validate access tokens via supabase jwks"
```
|
||||
|
||||
### Task 4: Regression-test the auth path and keep live tests compatible
|
||||
|
||||
**Files:**
|
||||
- Modify: `backend/tests/integration/v1/agent/test_sse_flow_live.py`
|
||||
- Modify: `backend/tests/integration/test_auth_routes.py` (if needed)
|
||||
|
||||
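**Step 1: Write a failing test / change how the live test obtains its token**

Change the live test from "issue an HS256 token locally" to "obtain an access token through a real login", or skip when no test account is available.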
|
||||
|
||||
```python
if not os.getenv("AGENT_LIVE_EMAIL") or not os.getenv("AGENT_LIVE_PASSWORD"):
    pytest.skip("missing live supabase credentials")
```
|
||||
|
||||
**Step 2: Run the related tests to confirm they fail (or that the old logic no longer fits)**

Run: `uv run pytest backend/tests/integration/v1/agent/test_sse_flow_live.py -m live -v`
Expected: unusable under the old code, or dependent on jwt_secret
|
||||
|
||||
**Step 3: Make the minimal implementation change**

- Remove the tests' dependency on `config.supabase.jwt_secret`.
- Keep the `@pytest.mark.live` behavior unchanged so regular CI is unaffected.
|
||||
|
||||
**Step 4: Run the test to confirm it passes (or skips in a controlled way)**

Run: `uv run pytest backend/tests/integration/v1/agent/test_sse_flow_live.py -m live -v`
Expected: PASS, or an explainable SKIP (missing credentials)
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git add backend/tests/integration/v1/agent/test_sse_flow_live.py backend/tests/integration/test_auth_routes.py
git commit -m "test: align live auth flow with cloud supabase tokens"
```
|
||||
|
||||
### Task 5: Trim Docker Compose (remove local Supabase, keep Redis/DB/init-job)
|
||||
|
||||
**Files:**
|
||||
- Modify: `infra/docker/docker-compose.yml`
|
||||
|
||||
**Step 1: Write a failing verification (compose structure assertion)**

Add a lightweight scripted check (it can be run ad hoc for this task and need not be committed):
|
||||
|
||||
```bash
docker compose --env-file .env -f infra/docker/docker-compose.yml config
```
|
||||
|
||||
Before the change, record the Supabase services currently included (`studio/kong/auth/rest/...`) as a baseline for comparison.

**Step 2: Run the check to confirm the current state (baseline)**

Run: `docker compose --env-file .env -f infra/docker/docker-compose.yml config`
Expected: the output includes the full set of Supabase services
|
||||
|
||||
**Step 3: Trim with a minimal implementation**

- Remove services: `studio/kong/mail-templates/auth/rest/realtime/storage/imgproxy/meta/functions/analytics/vector/supavisor`
- Keep services: `redis`, `db`, `init-job`
- Remove from the `init-job` environment variables: `SOCIAL_SUPABASE__ANON_KEY`, `SOCIAL_SUPABASE__SERVICE_ROLE_KEY`, `SOCIAL_SUPABASE__JWT_SECRET`
- Switch the `db` service to the minimal configuration the business needs (only what database startup and health checks require)
|
||||
|
||||
**Step 4: Run compose validation**

Run: `docker compose --env-file .env -f infra/docker/docker-compose.yml config`
Expected: PASS, with only redis/db/init-job remaining
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git add infra/docker/docker-compose.yml
git commit -m "refactor: remove local supabase stack from compose"
```
|
||||
|
||||
### Task 6: Clean up the environment template and runtime docs
|
||||
|
||||
**Files:**
|
||||
- Modify: `.env.example`
|
||||
- Modify: `docs/runtime/runtime-runbook.md`
|
||||
- Modify: `infra/scripts/dev-migrate.sh`
|
||||
|
||||
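**Step 1: Write the doc/template checkpoints first (manually verifiable)**

The following must hold:
- `.env.example` no longer contains local Supabase infrastructure variables (logflare/pooler/studio/kong/jwt_secret, etc.)
- The backend-required items are kept and labeled: `PUBLIC_URL`, `ANON_KEY`, `SERVICE_ROLE_KEY`
- The runbook's health checks target Redis/DB/Web rather than Kong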
|
||||
|
||||
**Step 2: Run the baseline check (before the change)**

Run: `uv run pytest backend/tests/unit/test_settings_supabase_env.py -v`
Expected: the result serves as the regression baseline once the environment template has been changed
|
||||
|
||||
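**Step 3: Update the docs with minimal changes**

- `docs/runtime/runtime-runbook.md`: change the "start infrastructure" description to `redis + db`.
- `infra/scripts/dev-migrate.sh`: change the message from "Requires Supabase services" to "Requires db/redis services".
- `.env.example`: group variables for cloud mode and make the frontend/backend variable boundary explicit.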
|
||||
|
||||
**Step 4: Run the check to confirm it passes**
|
||||
|
||||
Run: `docker compose --env-file .env -f infra/docker/docker-compose.yml config`
|
||||
Expected: PASS
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git add .env.example docs/runtime/runtime-runbook.md infra/scripts/dev-migrate.sh
git commit -m "docs: update runtime guide for cloud supabase mode"
```
|
||||
|
||||
### Task 7: Full verification and pre-release checks
|
||||
|
||||
**Files:**
|
||||
- Modify: `docs/runtime/runtime-runbook.md` (record the verification commands and results)
|
||||
|
||||
**Step 1: Run static checks**
|
||||
|
||||
Run: `uv run ruff check backend/src backend/tests`
|
||||
Expected: PASS
|
||||
|
||||
**Step 2: Run type checks**
|
||||
|
||||
Run: `uv run basedpyright`
|
||||
Expected: PASS
|
||||
|
||||
**Step 3: Run tests (scoped to the affected areas)**
|
||||
|
||||
Run: `uv run pytest backend/tests/unit/test_settings_supabase_env.py backend/tests/unit/core/auth/test_jwt_verifier.py -v`
|
||||
Expected: PASS
|
||||
|
||||
Run: `uv run pytest backend/tests/integration/test_users_routes.py backend/tests/integration/test_auth_routes.py -v`
|
||||
Expected: PASS
|
||||
|
||||
**Step 4: Run the runtime gate verification**
|
||||
|
||||
Run: `docker compose --env-file .env -f infra/docker/docker-compose.yml up -d redis db && docker compose --env-file .env -f infra/docker/docker-compose.yml run --rm --build init-job uv run python -m core.runtime.cli bootstrap`
|
||||
Expected: PASS (migration + init-data succeed)
|
||||
|
||||
**Step 5: Commit**
|
||||
|
||||
```bash
git add docs/runtime/runtime-runbook.md
git commit -m "chore: record cloud supabase migration verification"
```
|
||||
@@ -0,0 +1,69 @@
|
||||
# Auth Token Compatibility + Refresh Singleflight Implementation Plan
|
||||
|
||||
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
|
||||
|
||||
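**Goal:** Accept the access token claims that cloud Supabase actually issues (a token missing `iss` still passes), fix the refresh storm triggered by 401s on the frontend, and eliminate the bursts of 401/429 warnings in the logs.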
|
||||
|
||||
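**Architecture:** The backend keeps HS256 signature verification with `exp/sub` as required claims, and changes `iss` from "must be present" to "validated when present". The frontend adds refresh singleflight and a re-entrancy guard in the interceptor so that concurrent 401s do not trigger multiple refreshes or a self-recursive refresh. Dead branches and redundant state are cleaned up along the way.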
|
||||
|
||||
**Tech Stack:** FastAPI, PyJWT, Flutter, Dio, flutter_test
|
||||
|
||||
---
|
||||
|
||||
### Task 1: Make backend JWT claim checks tolerant (tokens without `iss` pass)
|
||||
|
||||
**Files:**
|
||||
- Modify: `backend/src/core/auth/jwt_verifier.py`
|
||||
- Test: `backend/tests/unit/core/auth/test_jwt_verifier.py`
|
||||
|
||||
**Step 1: Write failing test**
|
||||
- Add a test case: a token without `iss` should verify successfully when `sub/exp` and the HS256 signature are valid.
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
- Run: `cd backend && uv run pytest tests/unit/core/auth/test_jwt_verifier.py -q`
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
- Remove `iss` from the `require` option of `jwt.decode`, keeping only `sub/exp`.
- If the payload contains `iss` and an issuer is configured, compare the issuer manually and raise an error on mismatch.
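
The "validate when present" rule for `iss` can be expressed as a small check applied to the already-decoded claims. The sketch below is a minimal illustration under assumptions: `check_optional_issuer` is a hypothetical helper name, and it deliberately leaves signature, `sub`, and `exp` validation to `jwt.decode`.

```python
from typing import Optional


class TokenValidationError(Exception):
    """Raised when a decoded token fails a claim check."""


def check_optional_issuer(claims: dict, expected_issuer: Optional[str]) -> None:
    """Validate `iss` only when the token carries it and an issuer is configured."""
    token_issuer = claims.get("iss")
    if token_issuer is None or not expected_issuer:
        # Nothing to compare: a missing claim or missing config is accepted.
        return
    if token_issuer != expected_issuer:
        raise TokenValidationError("issuer mismatch")


# A token without `iss` passes; a matching `iss` passes as well.
check_optional_issuer({"sub": "user-1", "exp": 1}, "https://project.example/auth/v1")
check_optional_issuer(
    {"sub": "user-1", "exp": 1, "iss": "https://project.example/auth/v1"},
    "https://project.example/auth/v1",
)
```

A token that does carry `iss` is still held to the configured issuer, so relaxing the requirement does not accept tokens from a different issuer.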
|
||||
|
||||
**Step 4: Run test to verify it passes**
|
||||
- Run: `cd backend && uv run pytest tests/unit/core/auth/test_jwt_verifier.py -q`
|
||||
|
||||
### Task 2: Frontend refresh singleflight + recursion guard
|
||||
|
||||
**Files:**
|
||||
- Modify: `apps/lib/core/api/api_interceptor.dart`
|
||||
- Test: `apps/test/core/api/api_interceptor_test.dart`
|
||||
|
||||
**Step 1: Write failing tests**
|
||||
- Concurrent 401s call `onTokenRefresh` only once.
- A 401 from `/api/v1/auth/sessions/refresh` itself does not trigger a refresh retry.
|
||||
|
||||
**Step 2: Run tests to verify failures**
|
||||
- Run: `cd apps && flutter test test/core/api/api_interceptor_test.dart`
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
- Add a `_refreshFuture` singleflight field.
- When a non-refresh request hits a 401, await the same refresh future.
- Short-circuit the refresh/logout auth endpoints and already-retried requests to avoid infinite re-entry.
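
The singleflight idea is language-independent, so it is sketched here in Python with `asyncio` rather than in the Dart interceptor itself. `RefreshSingleflight`, `should_attempt_refresh`, and `fake_refresh` are hypothetical names for illustration; only the refresh path comes from the plan, and the logout endpoint is omitted because its path is not given.

```python
import asyncio
from typing import Awaitable, Callable, Optional

REFRESH_PATH = "/api/v1/auth/sessions/refresh"  # path taken from the plan


def should_attempt_refresh(path: str, already_retried: bool) -> bool:
    """Skip refresh for the refresh endpoint itself and for retried requests."""
    return not already_retried and path != REFRESH_PATH


class RefreshSingleflight:
    """Share one in-flight refresh among all concurrent callers."""

    def __init__(self, refresh: Callable[[], Awaitable[str]]) -> None:
        self._refresh = refresh
        self._inflight: Optional[asyncio.Task] = None

    async def get_token(self) -> str:
        if self._inflight is None:
            self._inflight = asyncio.ensure_future(self._run())
        return await self._inflight

    async def _run(self) -> str:
        try:
            return await self._refresh()
        finally:
            # Clear the slot so a later 401 can start a fresh refresh.
            self._inflight = None


calls = 0


async def fake_refresh() -> str:
    global calls
    calls += 1
    await asyncio.sleep(0.01)  # simulate network latency
    return f"token-{calls}"


async def demo() -> list:
    flight = RefreshSingleflight(fake_refresh)
    # Five concurrent "401 handlers" all wait on the same refresh.
    return await asyncio.gather(*(flight.get_token() for _ in range(5)))


tokens = asyncio.run(demo())
```

In the Dart interceptor the same role would be played by the `_refreshFuture` field: the first 401 creates the future, later 401s await it, and it is cleared once the refresh settles.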
|
||||
|
||||
**Step 4: Run tests to verify pass**
|
||||
- Run: `cd apps && flutter test test/core/api/api_interceptor_test.dart`
|
||||
|
||||
### Task 3: Clean up dead/legacy branches and run regression verification
|
||||
|
||||
**Files:**
|
||||
- Modify: `apps/lib/core/api/api_interceptor.dart` (remove the dead retry branch)
- Modify: `backend/src/core/auth/jwt_verifier.py` (delete code paths that are no longer used)
|
||||
|
||||
**Step 1: Refactor cleanup**
|
||||
- Delete unreachable branches and duplicated logic while keeping behavior unchanged.
|
||||
|
||||
**Step 2: Full targeted verification**
|
||||
- Run: `cd backend && uv run ruff check src tests`
|
||||
- Run: `cd backend && uv run basedpyright`
|
||||
- Run: `cd backend && uv run pytest tests/unit/core/auth/test_jwt_verifier.py tests/unit/v1/users -q`
|
||||
- Run: `cd apps && flutter test test/core/api/api_interceptor_test.dart test/features/auth`
|
||||
|
||||
**Step 3: Runtime spot-check**
|
||||
- Run: after logging in to obtain a token, request `/api/v1/agent/history` and confirm it no longer returns 401 because of a missing `iss`.
|
||||
@@ -1,65 +0,0 @@
|
||||
# Supabase JWKS Auth Reliability Implementation Plan
|
||||
|
||||
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
|
||||
|
||||
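**Goal:** Make the backend verify signatures reliably with JWKS/RS256 against cloud Supabase, and surface upstream Auth timeout errors correctly as 503, so that the sign-up/login/password-reset flows are observable.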
|
||||
|
||||
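**Architecture:** Keep the automatic `PUBLIC_URL -> issuer/jwks` derivation. JWT verification continues to enforce RS256, but the JWKS fetch adds `apikey` and `Authorization` headers. The Auth Gateway adds a unified error mapping that classifies upstream timeout/gateway errors as service unavailable (503); everything else keeps the existing 401/422 semantics.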
|
||||
|
||||
**Tech Stack:** FastAPI, Pydantic, PyJWT (`PyJWKClient`), Supabase Python SDK, pytest.
|
||||
|
||||
---
|
||||
|
||||
### Task 1: JWKS header support (test first)
|
||||
|
||||
**Files:**
|
||||
- Modify: `backend/tests/unit/core/auth/test_jwt_verifier.py`
|
||||
- Modify: `backend/src/core/auth/jwt_verifier.py`
|
||||
- Modify: `backend/src/v1/users/dependencies.py`
|
||||
|
||||
**Step 1: Write failing test**
|
||||
- Add a test case for `JwtVerifier` asserting that `apikey` and `Authorization: Bearer <anon_key>` are passed when `PyJWKClient` is initialized.
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
- Run: `uv run pytest backend/tests/unit/core/auth/test_jwt_verifier.py -v`
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
- Add an `apikey` parameter to `JwtVerifier.__init__` and inject it into the JWKS request headers.
- `get_jwt_verifier()` passes in `config.supabase.anon_key`.
|
||||
|
||||
**Step 4: Run test to verify it passes**
|
||||
- Run: `uv run pytest backend/tests/unit/core/auth/test_jwt_verifier.py -v`
|
||||
|
||||
### Task 2: Map upstream Auth timeout errors to 503 (test first)
|
||||
|
||||
**Files:**
|
||||
- Modify: `backend/tests/unit/v1/auth/test_auth_gateway.py`
|
||||
- Modify: `backend/src/v1/auth/gateway.py`
|
||||
|
||||
**Step 1: Write failing test**
|
||||
- Add a timeout-error test for `create_verification` that expects `HTTPException(status_code=503)`.
|
||||
|
||||
**Step 2: Run test to verify it fails**
|
||||
- Run: `uv run pytest backend/tests/unit/v1/auth/test_auth_gateway.py -v`
|
||||
|
||||
**Step 3: Write minimal implementation**
|
||||
- Add an AuthError classification function that recognizes timeout/request_timeout/upstream timeout.
- Map these to 503 in the sign-up, login, refresh, and reset branches.
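
A classification helper along these lines might work on the error message text. This is a sketch under assumptions: `status_for_auth_error` and `TIMEOUT_MARKERS` are hypothetical names, and matching on a lowercase message string is a guess at how the upstream error can be inspected, not the Supabase SDK's documented error shape.

```python
# Markers named in the plan; plain "timeout" already covers the other two,
# which are listed only to mirror the plan's wording.
TIMEOUT_MARKERS = ("timeout", "request_timeout", "upstream timeout")


def status_for_auth_error(message: str, default_status: int) -> int:
    """Return 503 for upstream timeouts, otherwise the caller's usual status."""
    normalized = message.lower()
    if any(marker in normalized for marker in TIMEOUT_MARKERS):
        return 503
    return default_status


timeout_status = status_for_auth_error("Upstream Timeout while calling auth", 401)
credentials_status = status_for_auth_error("Invalid login credentials", 401)
```

Passing the default status in keeps the existing 401/422 semantics with each caller, so only the timeout case is centralized.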
|
||||
|
||||
**Step 4: Run test to verify it passes**
|
||||
- Run: `uv run pytest backend/tests/unit/v1/auth/test_auth_gateway.py -v`
|
||||
|
||||
### Task 3: Regression verification
|
||||
|
||||
**Files:**
|
||||
- Modify: `backend/tests/unit/test_settings_supabase_env.py` (if needed)
|
||||
|
||||
**Step 1: Run targeted suites**
|
||||
- Run: `uv run pytest backend/tests/unit/core/auth/test_jwt_verifier.py backend/tests/unit/v1/auth/test_auth_gateway.py backend/tests/unit/test_settings_supabase_env.py -v`
|
||||
|
||||
**Step 2: Run quality gates**
|
||||
- Run: `uv run ruff check backend/src backend/tests`
|
||||
- Run: `uv run basedpyright backend/src`
|
||||
|
||||
**Step 3: Document runtime checks**
|
||||
- Record the environment variables required for JWT/JWKS and the commands for manual integration testing.
|
||||