diff --git a/docs/bugs/2026-03-07-agent-module-review.md b/docs/bugs/2026-03-07-agent-module-review.md
deleted file mode 100644
index 30d5c9c..0000000
--- a/docs/bugs/2026-03-07-agent-module-review.md
+++ /dev/null
@@ -1,188 +0,0 @@
-# Agent Module Review Report
-
-**Date**: 2026-03-07
-**Scope**: `backend/src/core/agent`
-**Status**: Pending fix
-
----
-
-## 🔴 HIGH - Blocking Issues
-
-### 1. Synchronous LLM call blocks the async event loop
-
-**File**: `infrastructure/crewai/runtime.py:126`
-
-**Problem**:
-```python
-response = run_completion(...)  # synchronous call
-```
-
-`run_completion` calls `litellm.completion()`, which is synchronous, while `RunService.run()` is an async method. The call blocks the entire event loop and severely degrades performance under high concurrency.
-
-**Recommendation**: Use `litellm.acompletion()` or `asyncio.to_thread()`.
-
-**Affected code**:
-- `infrastructure/litellm/client.py` - needs an async variant
-- `infrastructure/crewai/runtime.py` - `_run_stage()` needs to become async
-
----
-
-## 🟡 MEDIUM - Needs Fixing
-
-### 2. Missing input length validation
-
-**File**: `application/run_service.py:63`
-
-**Problem**:
-```python
-async def run(self, *, session_id: str, user_input: str) -> dict[str, object]:
-```
-
-`user_input` has no length limit, so a malicious user can send an oversized input to burn tokens and resources.
-
-**Recommendation**: Add a maximum length check (for example, 10000 characters).
-
-```python
-MAX_USER_INPUT_LENGTH = 10000
-
-if len(user_input) > MAX_USER_INPUT_LENGTH:
-    raise ValueError(f"user_input exceeds maximum length of {MAX_USER_INPUT_LENGTH}")
-```
-
----
-
-### 3. LLM calls have no timeout
-
-**File**: `infrastructure/crewai/runtime.py:126`
-
-**Problem**: `run_completion` sets no timeout, so if the LLM API hangs, the request blocks indefinitely.
-
-**Recommendation**: Add a `timeout` parameter.
-
-```python
-def run_completion(
-    *,
-    model: str,
-    api_key: str,
-    messages: list[dict[str, Any]],
-    temperature: float | None = None,
-    max_tokens: int | None = None,
-    timeout: float | None = None,  # new
-) -> Any:
-    kwargs["timeout"] = timeout
-    ...
-```
-
----
-
-### 4. Hard-coded tool result
-
-**File**: `application/resume_service.py:52`
-
-**Problem**:
-```python
-content='{"status":"ok"}',
-```
-
-The tool execution result is hard-coded as `{"status":"ok"}`. This looks like placeholder code; the actual tool result is never used.
-
-**Recommendation**: Implement real tool execution, or mark this explicitly as not yet implemented.
-
----
-
-### 5.
Cache write errors fail silently
-
-**File**: `infrastructure/persistence/user_context_cache.py:95-96`
-
-**Problem**:
-```python
-async def set(self, *, session_id: UUID, context: UserAgentContext) -> None:
-    ...
-    except Exception:
-        return None
-```
-
-When `set()` fails it silently returns `None`. The caller cannot tell whether the cache write succeeded, which makes cache-miss problems hard to diagnose.
-
-**Recommendation**: Log the failure or raise an exception.
-
-```python
-except Exception as exc:
-    logger.warning("Failed to cache user context", session_id=str(session_id), error=str(exc))
-    return None
-```
-
----
-
-## 🟢 LOW - Suggested Improvements
-
-### 6. Redis Stream response shape is not validated
-
-**File**: `infrastructure/events/redis_stream.py:62`
-
-**Problem**:
-```python
-_, entries = response[0]
-```
-
-The code assumes the response is well formed; an unexpected shape raises `IndexError`.
-
-**Recommendation**: Add a defensive check.
-
----
-
-### 7. Path restriction does not allow subdirectories
-
-**File**: `infrastructure/crewai/loader.py:47`
-
-**Problem**:
-```python
-if resolved.parent != base_dir:
-```
-
-Only files located directly under `base_dir` are allowed, which may limit a future move to templates in subdirectories.
-
-**Recommendation**: Check instead whether the path is inside `base_dir` (subdirectories allowed).
-
----
-
-### 8. Exception details are lost
-
-**File**: `infrastructure/queue/tasks.py:112`
-
-**Problem**:
-```python
-except Exception:  # noqa: BLE001
-    error_id = "agent_runtime_failed"
-    logger.exception(...)
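-```
-
-All exceptions are caught but identified only by `error_id`, so the concrete exception type is lost and troubleshooting is harder.
-
-**Recommendation**: Record the exception type in the log.
-
----
-
-## ✅ Good Practices
-
-The following design choices deserve recognition:
-
-- **Clear DDD layering**: domain / application / infrastructure have distinct responsibilities
-- **Repositories do not commit**: the Service controls transaction boundaries
-- **Concurrency control**: `FOR UPDATE` locks prevent concurrent-update problems
-- **Sensitive field redaction**: `agui/bridge.py` implements `_redact_sensitive()`
-- **Path traversal protection**: `loader.py` uses `_resolve_allowed_path()`
-- **Protocol abstraction**: dependencies are decoupled through Protocol
-
----
-
-## Suggested Fix Priority
-
-| Priority | Issue | Estimated effort |
-|--------|------|----------|
-| P0 | Synchronous LLM call blocking | 2h |
-| P1 | Input length validation | 0.5h |
-| P1 | LLM timeout control | 1h |
-| P2 | Hard-coded tool result | TBD |
-| P2 | Cache exception handling | 0.5h |
-| P3 | Other LOW issues | 1h |
diff --git a/docs/bugs/2026-03-08-agent-tool-architecture.md b/docs/bugs/2026-03-08-agent-tool-architecture.md
deleted file mode 100644
index ace5431..0000000
--- a/docs/bugs/2026-03-08-agent-tool-architecture.md
+++ /dev/null
@@ -1,323 +0,0 @@
-# Agent Module Review Report - Tool Architecture
-
-**Date**: 2026-03-08
-**Scope**: `backend/src/core/agent`
-**Status**: Pending evaluation
-
----
-
-## 🟡 MEDIUM - Tool Architecture Issues
-
-### 1. CrewAI tool module is unused; tools are hard-coded
-
-**Files**:
-- `application/run_service.py:406` - `_execute_backend_tool()`
-- `infrastructure/crewai/runtime.py` - three-stage flow
-
-**Problem**:
-
-The agent currently uses only CrewAI's **agent/task configuration templates** (YAML) and **does not use CrewAI's tool system**:
-
-```
-In use:
-├── agents.yaml (agent role definitions)
-└── tasks.yaml (task definitions)
-
-Not in use:
-├── @tool decorator
-├── BaseTool class
-└── Tools registry
-```
-
-**Current implementation**:
-```python
-# run_service.py:406
-async def _execute_backend_tool(self, *, tool_name, tool_args, ...):
-    if tool_name != "create_calendar_event":  # hard-coded check
-        raise ValueError(f"unsupported backend tool: {tool_name}")
-    # run the tool manually...
-```
-
-**Impact**:
-1. Every new tool requires a code change in `_execute_backend_tool()`
-2. CrewAI's tool selection and result handling cannot be used
-3. Integration with CrewAI is shallow, so the framework's strengths go unused
-4. Tool descriptions and other prompt information cannot be injected into the agent automatically
-
----
-
-## 🟡 MEDIUM - Tool Result Storage Issues
-
-### 2.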
Storing tool results in object storage is not enabled
-
-**Files**:
-- `application/session_state_persistence.py:52` - `persist_tool_result_payload()`
-- `models/agent_chat_message.py` - messages table
-
-**Problem**:
-
-`persist_tool_result_payload()` is defined and can upload tool results to object storage (MinIO/Supabase Storage), but **the function is never called**.
-
-Current implementation:
-- Tool results are stored directly in the database field `messages.content`
-- `metadata_json` defines `storage_bucket`, `storage_path`, and related fields, but they are all `None`
-
-```python
-# message_metadata.py:17-27
-class MessageMetadataToolResult(BaseModel):
-    storage_bucket: str | None = None  # currently unused
-    storage_path: str | None = None  # currently unused
-    payload_sha256: str | None = None  # currently unused
-```
-
-**Impact**:
-1. Tool results (especially large payloads such as UI components) live in the database and add load to it
-2. The storage interface is defined but unused, leaving dead code
-3. The CDN acceleration and bandwidth advantages of object storage are unavailable
-
----
-
-## 🟡 MEDIUM - Tool Output Format Issues
-
-### 3. Tool output is not a UI Schema, so the frontend cannot render it directly
-
-**Files**:
-- `application/run_service.py:456-479` - `_execute_backend_tool()`
-
-**Problem**:
-
-The `create_calendar_event` tool currently returns a **minimal status payload**, not a UI Schema the frontend can render:
-
-```python
-# run_service.py:456-479
-event_id = str(schedule_item.id)
-ui_card = {
-    "type": "calendar_card.v1",
-    "version": "v1",
-    "data": {...},
-    "actions": [...]
-}
-# ui_card is built but never returned as the tool result
-return {"status": "ok", "event_id": event_id}  # only a simple structure is returned
-```
-
-**Current output**:
-```json
-{
-  "status": "ok",
-  "event_id": "xxx"
-}
-```
-
-**Expected output** (UI Schema):
-```json
-{
-  "type": "calendar_card.v1",
-  "version": "v1",
-  "data": {
-    "id": "xxx",
-    "title": "Meeting",
-    "startAt": "2026-03-08T15:00:00Z",
-    ...
-  },
-  "actions": [
-    {"type": "link", "label": "View details", "target": "/calendar/events/xxx"}
-  ]
-}
-```
-
-**Impact**:
-1. The frontend cannot render rich UI components directly
-2. The frontend has to parse the payload by hand before rendering, which adds frontend work
-3. The `ui_schema` capability of the AG-UI protocol cannot be used
-
----
-
-## 🟡 MEDIUM - Stage Configuration Issues
-
-### 4.
Three-stage flow parameters are hard-coded; stages cannot have different strategies
-
-**Files**:
-- `infrastructure/crewai/runtime.py:190-277` - `CrewAIRuntime.execute()`
-
-**Problem**:
-
-The three-stage flow (intent → execution → organization) is hard-coded in `run_agent_task()`, so per-stage parameters, such as the tools each stage may use, cannot be configured:
-
-```python
-# runtime.py:203-277
-# intent stage
-intent_text, intent_usage = _run_stage(
-    litellm_model=litellm_model,
-    api_key=...,
-    llm_config=self._llm_config,  # same configuration
-    stage="intent",
-    ...
-)
-
-# execution stage (if any)
-execution_text, execution_usage = _run_stage(
-    litellm_model=litellm_model,
-    api_key=...,
-    llm_config=self._llm_config,  # same configuration
-    stage="execution",
-    ...
-)
-
-# organization stage
-organization_text, organization_usage = _run_stage(
-    litellm_model=litellm_model,
-    api_key=...,
-    llm_config=self._llm_config,  # same configuration
-    stage="organization",
-    ...
-)
-```
-
-**Current limitations**:
-1. The intent stage cannot be given a read-only LLM (tool calls disallowed)
-
-
-**Impact**:
-1. LLM behavior cannot be controlled precisely per stage
-2. The intent-recognition stage may trigger tool calls by mistake
-3. Unnecessary LLM call cost is incurred
-4. Architectural flexibility is reduced
-
----
-
-## 🔴 HIGH - Broken Agent Loop
-
-### 5. Agent loop does not continue after tool approval
-
-**Files**:
-- `application/resume_service.py:121-158`
-
-**Problem**:
-
-After the frontend approves a tool call, the backend returns the tool result but **does not continue the agent loop**; it marks the session COMPLETED and stops.
-
-Current flow:
-```python
-# resume_service.py:121-127
-snapshot = self._state_persistence.build_completed_snapshot()
-await session_repository.update_runtime_state(
-    chat_session=chat_session,
-    status=AgentChatSessionStatus.COMPLETED,  # completed immediately
-    state_snapshot=snapshot,
-    ...
-)
-```
-
-Missing flow:
-```
-1. Receive the tool result
-2. Store the tool result in the context as a message
-3. Call the LLM again (with the tool result)
-4. Generate the final reply
-5. Mark as COMPLETED
-```
-
-**Impact**:
-1. After the user approves a tool, the agent does not go on to generate a reply
-2. The whole agent loop breaks after tool approval
-3. The user experience is incomplete
-
----
-
-## 🔴 HIGH - Incorrect Architecture for Conversation History and User Context
-
-### 6.
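Conversation history is maintained by the frontend, violating the backend architecture design
-
-**Files**:
-- `application/run_service.py:89-124`
-- `domain/agui_input.py`
-
-**Problem**:
-
-In the current architecture, **conversation history is maintained and passed entirely by the frontend**:
-
-```
-Frontend → GET /runs/{thread_id}/history → backend returns historical messages
-Frontend → POST /runs/{thread_id}/run → frontend puts the history into run_input.messages and sends it to the backend
-Backend → reads only the latest user_input from run_input; does not read history from the database
-```
-
-Code evidence (`run_service.py:89-124`):
-```python
-async def run(self, *, run_input: RunAgentInput):
-    user_input = extract_latest_user_text(run_input)  # takes only the latest user message
-
-    runtime_result = await asyncio.to_thread(
-        runtime.execute,
-        user_input=user_input,  # passes only the latest input
-        system_prompt=system_prompt,
-    )
-```
-
-**Impact**:
-1. **High-severity security risk**: the frontend can tamper with conversation history and forge context
-2. **Architecture violation**: both user context and conversation history should be maintained by the backend
-3. **Data inconsistency**: the frontend may drop or mishandle historical messages
-4. **No multi-device sync**: different frontend devices may see different histories
-5. **Wasted tokens**: every request carries the full history, inflating the request body
-6. The original plan document already specified that the backend caches conversation history in Redis, with a database read as the fallback
-
----
-
-## 🟡 MEDIUM - Multimodal Input Support
-
-### 7. Images and other multimodal input are not supported
-
-**Files**:
-- `domain/agui_input.py:64-86` - `extract_latest_user_text()`
-- `infrastructure/crewai/runtime.py:121-136` - `_run_stage()`
-- `infrastructure/litellm/client.py`
-
-**Problem**:
-
-The current architecture **supports plain-text input only**; images and other multimodal content are dropped:
-
-Code evidence (`agui_input.py:64-86`):
-```python
-def extract_latest_user_text(run_input: RunAgentInput) -> str:
-    if isinstance(content, list):
-        for item in content:
-            if getattr(item, "type", None) != "text":
-                continue  # ❌ skips non-text types (images are dropped)
-```
-
-Code evidence (`runtime.py:125`):
-```python
-messages.append({"role": "user", "content": user_content})  # passes only str
-```
-
-**Impact**:
-1. Users cannot send images for multimodal interaction
-2. Multimodal LLM capability is wasted
-3. Scenarios such as "upload an image for the AI to analyze" cannot be implemented
-
----
-
-## 🟡 MEDIUM - Missing Speech Recognition (ASR)
-
-### 8. Routes for the fun-asr-realtime speech recognition API are not implemented
-
-**Files**:
-- None (feature missing)
-
-**Problem**:
-
-The backend **does not implement speech recognition** and cannot process audio data sent by the frontend:
-
-Current state:
-- `dashscope` is used only for the LLM (qwen3.5-flash and similar)
-- There is no fun-asr, ASR, audio, or transcribe code at all
-- The v1 routes contain no speech/audio API
-
-**Impact**:
-1. Users cannot send voice messages
-2. Real-time voice conversation cannot be implemented
-3.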
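The frontend has to perform ASR itself, adding to its workload
-
----
diff --git a/docs/bugs/2026-03-08-backend-tool-no-events.md b/docs/bugs/2026-03-08-backend-tool-no-events.md
deleted file mode 100644
index 9684383..0000000
--- a/docs/bugs/2026-03-08-backend-tool-no-events.md
+++ /dev/null
@@ -1,118 +0,0 @@
-# Bug - Backend Tool Events and Frontend Interrupt Stability
-
-**Date**: 2026-03-08
-**Scope**: `backend/src/core/agent`
-
-## Status
-
-- [x] Bug 1 fixed: backend tool call events were not forwarded
-- [x] Bug 2 fixed: history did not filter out internal messages with negative seq
-- [ ] Bug 3 under investigation: live frontend tool interrupt is unstable
-
----
-
-## Bug 1 - Backend tool calls do not forward events to the frontend (fixed)
-
-### Fix
-
-- `run_service.py` now consumes the runtime's `tool_calls` (`target=backend`) and emits:
-  - `TOOL_CALL_START`
-  - `TOOL_CALL_ARGS`
-  - `TOOL_CALL_END`
-  - `TOOL_CALL_RESULT`
-- It also persists a `role=TOOL` message whose metadata uses `tool_result`.
-
-### Verification
-
-- `backend/tests/unit/core/agent/test_run_resume_service.py::test_run_service_executes_backend_calendar_tool_and_emits_result`
-
----
-
-## Bug 2 - seq design flaw and history exposing internal messages (fixed)
-
-### Fix
-
-- `SessionRepository.next_message_seq()` supports a `mode`:
-  - `public`: increments based on positive sequence numbers only
-  - `internal`: decrements based on negative sequence numbers
-- The history query in `v1/agent/repository.py` adds a `seq > 0` filter.
-
-### Verification
-
-- `backend/tests/unit/v1/agent/test_repository.py::test_get_history_day_filters_out_negative_seq_messages`
-
----
-
-## Bug 3 - Live frontend tool interrupt is unstable (under investigation)
-
-### Symptoms
-
-- `test_agent_live_front_tool_interrupt_resume_continue` fails intermittently or persistently.
-- Failure point: `pending_tool_call_id` is `None`.
-
-### Evidence collected
-
-- The input text explicitly asks for a tool call.
-- The frontend tool description is injected into the prompt, and the tool list is visible in the execution stage.
-- In some failing samples the model's execution output describes "approval required" in text or structured form but never triggers a real tool call event.
-- Common shapes of execution_data:
-  - `tool_used/tool_name`
-  - `approval_status/approval_required`
-  - `target_route/target`
-  - but no real tool call event.
-
-### Current assessment
-
-- The problem is not that the tool was not injected.
-- Mainly, the model in the execution stage degrades "should call the tool" into "describe the approval status in text", so the runtime never obtains a pending call.
-
-### Improvements made (no hard-coded fallback)
-
-- Prompts are centralized in `core/agent/prompt/runtime_stage_prompts.py`.
-- The execution prompt adds a rule: when a tool can satisfy the request, it must be called through the runtime tool interface; fabricating tool-result text is not allowed.
-- The pending-call extraction logic is hardened to handle the `approval_required/target` variant structures.
-- `DynamicRoutingTool._run` now accepts `**kwargs`, for compatibility with CrewAI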
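direct-argument invocation (previously it accepted only `payload`, which caused `unexpected keyword argument`).
-- The strict `output_pydantic` constraint is disabled in the execution stage, so that structured output does not converge too early and interfere with the ReAct tool action loop.
-
-### Latest verification (evening of 2026-03-08)
-
-- The frontend-interrupt live test case still fails:
-  - `AGENT_LIVE_E2E=1 uv run pytest backend/tests/e2e/test_agent_live_flow.py::test_agent_live_front_tool_interrupt_resume_continue -v -rs`
-  - Result: `pending_tool_call_id = null`
-  - The assistant text claims "approval triggered / awaiting confirmation", but the runtime still captures no real tool call.
-- The backend-tool live test case did not reach its assertions in this environment:
-  - `AGENT_LIVE_E2E=1 uv run pytest backend/tests/e2e/test_agent_live_flow.py::test_agent_live_image_calendar_tool_persistence -v -rs`
-  - `Tool result storage unavailable` has been traced and fixed (a test initialization-order problem, not a Docker Storage service failure)
-  - The new failure is a business assertion: no `schedule_items` row was created
-- Non-live evidence:
-  - `uv run pytest backend/tests/unit/core/agent/test_crewai_runtime_tools.py -q` PASS (verifies that front tool kwargs reach the runtime)
-  - `uv run pytest backend/tests/unit/core/agent/test_run_resume_service.py -q` PASS (backend tool chain unit tests pass)
-
-### Follow-up recommendations
-
-1. Keep collecting raw execution output from failing live samples and classify it statistically.
-2. Evaluate adding a CrewAI guardrail to the execution stage: if NEEDS_EXECUTION and zero tool calls, treat the output as invalid and retry.
-3. If it is still unstable, consider upgrading the model or enabling a stronger structured-calling strategy on the critical path.
-4.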
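Improve observability: in the execution stage, log the list of injected tool names plus a redacted excerpt of the Crew's raw action text, to distinguish "not injected" from "injected but did not act".
-
----
-
-## Additional Investigation Findings (CrewAI tools and Storage)
-
-### A) Alignment with the CrewAI tools mechanism
-
-- The official tools documentation requires the `args_schema` of a `BaseTool` to match the parameter semantics of `_run`; the example is `_run(self, argument: str)`.
-- In ReAct mode the CrewAI executor runs a tool only after the `Action / Action Input` text has been parsed by its parser.
-- Our earlier `_run(self, payload: dict)` risked mismatching the kwargs shape used at runtime; it has been changed to `_run(self, **kwargs)` for compatibility.
-- If the execution stage overemphasizes "output strict JSON directly", it conflicts with the ReAct tool action loop; the prompt now includes explicit `Action` / `Action Input` constraints.
-
-### B) Root cause of Tool result storage unavailable
-
-- The root cause is not a Supabase Docker Storage outage; `docker compose ps` shows `supabase-storage` as healthy.
-- The real cause is that the live test called `create_tool_result_storage()` before `supabase_service.initialize()`, so the admin client was not yet initialized and `None` was returned.
-- The test order has been fixed: initialize Supabase first, then create the storage.
-
-### C) Current blockers
-
-- The backend image scenario also exposed an AG-UI multimodal input compatibility problem: `type=image` does not conform to the current `RunAgentInput` (which expects `binary`).
-- This has been fixed by sending `binary` input and adding `binary` parsing support in `agui_input`; the test case no longer terminates early on payload validation failure.
diff --git a/docs/plans/2026-03-07-agent-agui-full-alignment.md b/docs/plans/2026-03-07-agent-agui-full-alignment.md
deleted file mode 100644
index d709830..0000000
--- a/docs/plans/2026-03-07-agent-agui-full-alignment.md
+++ /dev/null
@@ -1,221 +0,0 @@
-# AG-UI Full Alignment Implementation Plan
-
-> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.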
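-
-**Goal:** Use a single AG-UI protocol format across the entire frontend/backend Agent chain, close the run/resume/SSE/history/tool-approval loop, and unify how the frontend integrates and parses the real API and the mock API.
-
-**Architecture:** The backend `RunAgentInput` plus the AG-UI event model is the single source of truth. The frontend calls the same set of `/agent/*` endpoints through one API client and consumes the same event format. The tool chain is split into frontend tools (approval + resume required) and backend tools (server-side execution + persistence + event return + cost accounting). The history endpoint returns a `STATE_SNAPSHOT` event payload per day.
-
-**Tech Stack:** FastAPI + Pydantic + SQLAlchemy + Redis Stream + Flutter + Dio + json_serializable
-
----
-
-## Intake Contract
-
-- Objective: Fully complete the AG-UI alignment, remove dual-format compatibility logic, and connect tool approval and history loading end to end.
-- Deliverable: Backend endpoint/service/tool implementation, frontend service/model/tool changes, documentation updates, test cases, and verification output.
-- Constraints:
-  - run/resume/request/event/history allow exactly one AG-UI format.
-  - No legacy-compatible input and no "dual-field tolerant parsing" is kept.
-  - Frontend and backend tool flows must be testable: frontend routing tool + backend calendar tool.
-- Verification target:
-  - `uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q`
-  - `uv run ruff check backend/src/core/agent backend/src/v1/agent backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent`
-  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart`
-
-## Review Findings (basis for this work)
-
-- [ ] `RunService.run` and `ResumeService.resume` still keep legacy parameter branches (`session_id/user_input/tool_call_id/tool_result`), which violates "single-protocol input".
-- [ ] The frontend `ToolCallResultEvent` accepts both `result` and `content`, which is dual-format parsing.
-- [ ] The frontend `AgUiService` still has separate mock/true implementations, and `loadHistory` is not wired to the real API.
-- [ ] The backend has no history endpoint; history is currently faked locally by the frontend `MockHistoryService`.
-- [ ] The current tool flow relies mainly on a fixed placeholder `user_tool_result` and lacks a complete verification chain for "frontend tool approval + resume return + backend tool execution and persistence".
-
-## Execution Tasks (continuously updated)
-
-### Task 1: Strict single protocol (remove compatibility branches)
-
-**Files:**
-- Modify: `backend/src/core/agent/application/run_service.py`
-- Modify: `backend/src/core/agent/application/resume_service.py`
-- Modify: `backend/src/v1/agent/service.py`
-- Modify: `apps/lib/features/chat/data/models/ag_ui_event.dart`
-- Test: `backend/tests/unit/core/agent/test_run_resume_service.py`
--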
Test: `backend/tests/unit/v1/agent/test_service.py`
-- Test: `apps/test/features/chat/ag_ui_event_test.dart`
-
-**Checklist:**
-- [x] Remove the backend legacy input path; accept only `RunAgentInput`
-- [x] Remove the frontend dual-format tolerance for `ToolCallResult`; fix it to the single AG-UI format
-- [x] Update the corresponding unit tests (red first, then green)
-
-### Task 2: History endpoint (returns `STATE_SNAPSHOT` per day)
-
-**Files:**
-- Modify: `backend/src/v1/agent/router.py`
-- Modify: `backend/src/v1/agent/service.py`
-- Modify: `backend/src/v1/agent/repository.py`
-- Add: `backend/src/v1/agent/history.py` (if needed)
-- Test: `backend/tests/integration/v1/agent/test_routes.py`
-- Test: `backend/tests/unit/v1/agent/test_service.py`
-
-**Checklist:**
-- [x] Add the history endpoint (with owner check + date cursor)
-- [x] Query session messages and aggregate them by day
-- [x] Return a single day's history and `hasMore` in the `STATE_SNAPSHOT` event format
-- [x] Add the missing tests
-
-### Task 3: Unify frontend mock/true API integration and parsing
-
-**Files:**
-- Modify: `apps/lib/features/chat/data/services/ag_ui_service.dart`
-- Modify: `apps/lib/core/api/mock_api_client.dart`
-- Modify: `apps/lib/core/api/i_api_client.dart` (if needed)
-- Modify: `apps/lib/features/chat/presentation/bloc/chat_bloc.dart`
-- Remove/Modify: `apps/lib/features/chat/data/services/mock_history_service.dart`
-- Test: `apps/test/features/chat/ag_ui_service_test.dart`
-- Test: `apps/test/features/chat/chat_bloc_test.dart`
-
-**Checklist:**
-- [x] `sendMessage/loadHistory/resume` all go through the unified API call path
-- [x] Mock mode serves the same endpoint responses through `MockApiClient` and no longer uses a separate local code path
-- [x] The frontend consumes one AG-UI event stream (SSE + history snapshot)
-- [x] Add the missing tests
-
-### Task 4: Close the tool-chain loop (frontend routing tool + backend calendar tool)
-
-**Files:**
-- Add/Modify: `backend/src/core/agent/...` (tool orchestration modules)
-- Modify: `backend/src/core/agent/application/run_service.py`
-- Modify: `backend/src/core/agent/application/resume_service.py`
-- Modify: `backend/src/core/agent/infrastructure/queue/tasks.py`
-- Modify: `apps/lib/features/chat/data/tools/tool_registry.dart`
-- Add: `apps/lib/features/chat/data/tools/navigation_tool.dart` (if needed)
-- Modify: `apps/lib/features/chat/presentation/bloc/chat_bloc.dart`
-- Modify:
`apps/lib/features/home/ui/screens/home_screen.dart` (approval action if needed)
-- Test: backend + apps agent related tests
-
-**Checklist:**
-- [x] Organize frontend and backend tool declarations in `RunAgentInput.tools`
-- [x] Backend implements `create_calendar_event` tool execution (persists to `schedule_items`)
-- [x] Frontend implements `navigate_to_route` tool execution (runs after approval)
-- [x] When the backend calls a frontend tool it enters pending; after the user approves, the frontend calls resume and returns a `tool` message
-- [x] Backend handles resume: persistence, state transition, event forwarding, and cost accounting remain correct
-- [x] Add the missing end-to-end test scenarios
-
-### Task 5: Sync protocol and API documentation
-
-**Files:**
-- Modify: `docs/runtime/runtime-route.md`
-- Modify: `docs/bugs/2026-03-07-agent-module-review.md` (if needed, to write conclusions back)
-
-**Checklist:**
-- [x] Document the single-protocol format for run/resume/history/sse
-- [x] Document the tool approval and resume return flow
-- [x] Note the change date and examples
-
-### Task 6: Resolve high-severity review findings (concurrency/security/frontend robustness)
-
-**Files:**
-- Modify: `backend/src/v1/agent/service.py`
-- Modify: `backend/src/core/agent/application/run_service.py`
-- Modify: `backend/src/core/agent/application/resume_service.py`
-- Modify: `backend/src/core/agent/application/session_state_persistence.py`
-- Modify: `apps/lib/features/chat/data/services/ag_ui_service.dart`
-- Modify: `apps/lib/features/chat/presentation/bloc/chat_bloc.dart`
-- Modify: `apps/lib/features/chat/data/tools/route_navigation_tool.dart`
-- Test: `backend/tests/unit/core/agent/test_run_resume_service.py`
-- Test: `backend/tests/unit/v1/agent/test_service.py`
-- Test: `backend/tests/unit/core/agent/test_state_snapshot.py`
-- Test: `backend/tests/integration/core/agent/test_queue_run_resume.py`
-- Test: `apps/test/features/chat/ag_ui_service_test.dart`
-- Test: `apps/test/features/chat/chat_bloc_test.dart`
-- Test: `apps/test/features/chat/tool_registry_test.dart`
-
-**Checklist:**
-- [x] Fix the session-creation race: `enqueue_run` catches `IntegrityError`, rolls back, and re-checks the owner
-- [x] Fix resume approval integrity: bind `toolName + toolArgsSha256 + nonce` and validate strictly
-- [x] Fix frontend SSE fault tolerance: a single bad packet no longer interrupts the whole stream
-- [x] Fix the frontend empty-card regression for tool results: do not render a placeholder card when `ui == null`
-- [x] Fix the security boundary of the frontend navigation tool: add a route allowlist/prefix check
-
-### Task 7: Resolve L2 review blockers (follow-up fixes after the second review)
-
-**Files:**
-- Modify:
`backend/src/core/agent/application/resume_service.py`
-- Modify: `backend/src/core/agent/application/run_service.py`
-- Modify: `apps/lib/features/chat/data/services/ag_ui_service.dart`
-- Test: `backend/tests/unit/core/agent/test_run_resume_service.py`
-- Test: `apps/test/features/chat/ag_ui_service_test.dart`
-
-**Checklist:**
-- [x] Fix SSE replay: the frontend stores and resends `Last-Event-ID`
-- [x] Tighten the backend write trigger: remove the "create a schedule automatically on keyword" path and keep only the explicit `#tool:` trigger
-- [x] Fix resume result injection: the backend persists/replays only the sanitized, controlled payload
-- [x] Fix the frontend resuming after a failed execution: abort resume when the local tool returns `ok != true`
-- [x] Add the corresponding regression tests
-
-### Task 8: Close medium-severity security gaps (HTTP limits up front + fail-closed guards)
-
-**Files:**
-- Modify: `backend/src/v1/agent/router.py`
-- Add: `backend/tests/unit/v1/agent/test_router_guards.py`
-- Modify: `backend/tests/integration/v1/agent/test_routes.py`
-
-**Checklist:**
-- [x] The HTTP layer validates `RunAgentInput` limits (size/message count/text length) before enqueue
-- [x] On Redis errors, run rate limiting and the SSE quota switch to fail-closed
-- [x] Add guard unit tests and route integration tests
-
-## Execution Log (updated after each completed item)
-
-- 2026-03-07 16:35: Initialized the plan document; recorded the review findings and task breakdown.
-- 2026-03-07 16:44: Completed Task 1. Backend `RunService/ResumeService` accept only `RunAgentInput`; frontend `ToolCallResultEvent` uses only `content`.
-  Verification:
-  - `uv run pytest backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/integration/core/agent/test_queue_run_resume.py backend/tests/unit/v1/agent/test_service.py -q` passed (with some `skip`).
-  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart` passed.
-- 2026-03-07 16:50: Completed Task 2. Added `GET /api/v1/agent/runs/{thread_id}/history?before=YYYY-MM-DD`, which aggregates session messages by day and returns `STATE_SNAPSHOT` (with `hasMore`).
-  Verification:
-  - `uv run pytest backend/tests/unit/v1/agent/test_service.py backend/tests/integration/v1/agent/test_routes.py -q` passed.
-- 2026-03-07 17:09: Completed Task 3. Frontend `AgUiService` is unified onto the API call path; mock/true share requests and event parsing; history now uses the `STATE_SNAPSHOT` from `/api/v1/agent/history`.
-  Verification:
-  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart
test/features/chat/chat_bloc_test.dart` passed.
-- 2026-03-07 17:09: Completed Task 4. Added the frontend `navigate_to_route` tool (runs after approval, then resumes) and the backend `create_calendar_event` tool (persists to `schedule_items` and returns `TOOL_CALL_RESULT`), and injected the available tools into the system prompt for the backend to parse.
-  Verification:
-  - `uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q` passed (with `skip`).
-  - `uv run ruff check backend/src/core/agent backend/src/v1/agent backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent` passed.
-- 2026-03-07 17:10: Completed Task 5. `docs/runtime/runtime-route.md` now includes the history endpoint and a `STATE_SNAPSHOT` example, and the run/resume protocol description is updated to the single format.
-- 2026-03-07 17:29: Completed Task 6. Resolved the high-severity review findings:
-  - Backend `enqueue_run` handles the concurrent session-creation race (`IntegrityError -> rollback -> owner recheck`).
-  - Backend run/resume add a pending tool guard (`pending_tool_name/pending_tool_args_sha256/pending_tool_nonce`) and strict resume validation.
-  - Frontend SSE parsing tolerates bad packets, tool results without ui no longer render an empty card, and the navigation tool has an allowlist.
-  Verification:
-  - `uv run pytest backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/unit/core/agent/test_state_snapshot.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/core/agent/test_queue_run_resume.py -q` passed (`25 passed, 3 skipped`).
-  - `uv run ruff check backend/src/core/agent/application/run_service.py backend/src/core/agent/application/resume_service.py backend/src/core/agent/application/session_state_persistence.py backend/src/v1/agent/service.py backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/unit/core/agent/test_state_snapshot.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/core/agent/test_queue_run_resume.py` passed.
-  - `cd apps && flutter test test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart` passed (`33 passed`).
-- 2026-03-07 17:33: Ran the full set of target verification commands:
-  - `uv run pytest backend/tests/unit/core/agent
backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q` passed (with `skip`).
-  - `uv run ruff check backend/src/core/agent backend/src/v1/agent backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent` passed.
-  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart` passed (`69 passed`).
-- 2026-03-07 17:46: Completed Task 7 (second round of fixes for blockers newly raised by the L2 gate):
-  - Frontend `AgUiService` resends `Last-Event-ID` to avoid duplicate replay on the same thread.
-  - Backend `RunService` no longer writes to the database on schedule keywords; only the explicit tool trigger remains.
-  - Backend `ResumeService` adds a sanitize step that rejects injected `ui/content` contamination.
-  - After approval, if the local tool fails, the frontend no longer calls resume.
-  Verification:
-  - `uv run pytest backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/unit/core/agent/test_state_snapshot.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/core/agent/test_queue_run_resume.py -q` passed (`26 passed, 3 skipped`).
-  - `uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q` passed (with `skip`).
-  - `uv run ruff check backend/src/core/agent/application/run_service.py backend/src/core/agent/application/resume_service.py backend/src/core/agent/application/session_state_persistence.py backend/src/v1/agent/service.py backend/tests/unit/core/agent/test_run_resume_service.py backend/tests/unit/core/agent/test_state_snapshot.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/core/agent/test_queue_run_resume.py` passed.
-  - `cd apps && flutter test test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart` passed (`35 passed`).
-  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart
test/features/chat/tool_registry_test.dart` passed (`71 passed`).
-  - L2 review result: after re-review, `code-reviewer` and `security-reviewer` confirmed that the earlier HIGH findings are resolved and found no new CRITICAL/HIGH.
-- 2026-03-07 17:56: Completed Task 8 (closing medium-severity security gaps):
-  - `router` adds `parse_run_input` pre-validation on `/agent/runs` and `/agent/runs/{thread_id}/resume`.
-  - `_allow_run_request` and `_acquire_sse_slot` now fail closed on Redis errors.
-  - Added `test_router_guards.py` and extended `test_routes.py` to cover a 422 on oversized payloads.
-  Verification:
-  - `uv run pytest backend/tests/unit/v1/agent/test_router_guards.py backend/tests/integration/v1/agent/test_routes.py -q` passed (`8 passed`).
-  - `uv run pytest backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent -q` passed (with `skip`).
-  - `uv run ruff check backend/src/core/agent backend/src/v1/agent backend/tests/unit/core/agent backend/tests/unit/v1/agent backend/tests/integration/core/agent backend/tests/integration/v1/agent` passed.
-  - `cd apps && flutter test test/features/chat/ag_ui_event_test.dart test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart test/features/chat/tool_registry_test.dart` passed (`71 passed`).
-  - L2 review result: the incremental `code-reviewer` and `security-reviewer` passes both confirmed there are currently no new `CRITICAL/HIGH` findings.
diff --git a/docs/plans/2026-03-08-agent-tool-architecture-design.md b/docs/plans/2026-03-08-agent-tool-architecture-design.md
deleted file mode 100644
index 4f5c55a..0000000
--- a/docs/plans/2026-03-08-agent-tool-architecture-design.md
+++ /dev/null
@@ -1,207 +0,0 @@
-# Agent Tool Architecture Design
-
-**Date:** 2026-03-08
-**Source:** `docs/bugs/2026-03-08-agent-tool-architecture.md`
-**Scope:** `backend/src/core/agent`
-**Status:** Approved for planning
-
----
-
-## 1. Objective
-
-Fix the 8 issues in the Agent tool architecture. First restore the end-to-end loop (after tool approval, continue reasoning and produce the final reply), then, within the same version, add structured tool output, tiered storage, decoupled stage strategies, and multimodal and voice input.
-
----
-
-## 2. Deliverables
-
-1. Two-phase fix blueprint (Phase 1 + Phase 2)
-2. Unified event and state machine design (AG-UI Step events + approval resume)
-3. Redrawn interface boundaries and responsibilities (run/resume/runtime/persistence)
-4. Risk and rollback strategy
-5.
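Acceptance criteria (two golden paths)
-
----
-
-## 3. Constraints And Decisions
-
-### 3.1 Release Strategy
-
-- Single cutover
-- No gradual rollout
-- No dual-track operation
-- No compatibility code left behind
-
-### 3.2 Contract Decisions
-
-- The `run` endpoint may change in breaking ways: the semantics of the frontend passing the full history in `messages` are removed
-- The frontend sends only the current input; the backend is authoritative for history
-- Phase 1 does not introduce a client hint
-- The tool architecture migrates fully to CrewAI Tools in Phase 1 (no bridging)
-
-### 3.3 AG-UI Event Decisions
-
-- The three stages always emit `StepStarted/StepFinished`: `intent`, `execution`, `organization`
-- Waiting for tool approval does not add a separate step; it is an internal state of execution
-- The backend emits only English machine names; the frontend produces its own display copy
-
-### 3.4 ASR / Multimodal Decisions
-
-- The first multimodal release supports file upload only (no URL)
-- The first ASR release is "upload the audio after recording ends -> backend returns the transcript synchronously"
-- The frontend fills the transcript back into the input box, then calls run
-
----
-
-## 4. Complexity And Risk
-
-- **Complexity:** S2 (architectural change across several core modules)
-- **Risk Tier:** L2 (includes a high-severity security item: the frontend can tamper with history)
-
-Risk-driven principle: fix the loop and the security problem first, then expand capabilities.
-
----
-
-## 5. Phased Plan
-
-## Phase 1 - Close Loop And Stop Security Bleeding
-
-**Bugs:** #1, #5, #6
-
-### Goals
-
-1. The backend becomes the single source of truth for history and context
-2. After tool approval, the Agent Loop resumes and continues
-3. Tool execution migrates fully to the CrewAI Tools registry
-
-### Module Boundaries
-
-- `backend/src/core/agent/application/run_service.py`
-  - Responsible only for parsing the current input, assembling backend context, and triggering the runtime
-  - Remove the path that trusts frontend history
-  - Remove hard-coded tool dispatch
-
-- `backend/src/core/agent/application/resume_service.py`
-  - After approval is confirmed, trigger an asynchronous continuation and return `accepted` immediately
-  - Must not set `COMPLETED` directly after tool execution
-  - Add `approval_request_id` idempotency protection
-
-- `backend/src/core/agent/infrastructure/crewai/runtime.py`
-  - Introduce CrewAI Tools registration and injection
-  - Assemble the tool set per agent/stage
-  - Emit Step start/end events uniformly for all three stages
-
-- `backend/src/core/agent/application/session_state_persistence.py`
-  - Ensure approval state, tool results, and continuation state are persisted consistently
-  - Keep a consistent interface for the Phase 2 metadata extension
-
-### Runtime Flow (Phase 1)
-
-1. `run` receives the current input
-2. The backend reads Redis/DB to rebuild the history
-3. Enter the three stages intent/execution/organization
-4. If execution triggers tool approval: enter `WAITING_APPROVAL`
-5. After approval, the frontend calls `resume`
-6. `resume` triggers the continuation asynchronously: run the tool -> write the tool result -> continue the loop
-7.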
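Generate the final assistant reply and emit `RunFinished`
-
----
-
-## Phase 2 - Capability Completion In Same Version
-
-**Order:** #3 -> #2 -> #4 -> #7 -> #8
-
-### #3 Tool Output As UI Schema v1
-
-- Unified tool output structure: `type/version/data/actions`
-- A single version `v1`; no parallel versions in the short term
-
-### #2 Tool Result Object Storage
-
-- Large payloads go to object storage
-- The DB stores only the summary, index, and checksum information
-- Enable `storage_bucket/storage_path/payload_sha256`
-
-### #4 Stage-Level Strategy Decoupling
-
-- intent/execution/organization support independent parameters and tool strategies
-- The intent stage can be configured as read-only (tools disabled)
-
-### #7 Multimodal Input
-
-- The first release supports image file upload as input
-- Non-text content is no longer dropped
-
-### #8 ASR API
-
-- Add a speech transcription API (returns the transcript synchronously)
-- Speech transcription is decoupled from agent run
-
----
-
-## 6. Session State And Events
-
-Recommended state machine:
-
-`RUNNING -> WAITING_APPROVAL -> RESUMING -> RUNNING -> COMPLETED/FAILED`
-
-Key constraints:
-
-- A repeated approval request must not run the tool again (idempotency)
-- `COMPLETED` is set only when the loop ends naturally
-- Step events cover the full lifecycle of all three stages
-
----
-
-## 7. Acceptance Criteria
-
-## 7.1 Golden Path A (No Tool)
-
-After the user's input, the run goes through all three stages and produces the final reply; the frontend receives the complete step events and `RunFinished`.
-
-## 7.2 Golden Path B (Tool + Approval + Resume)
-
-The user triggers a tool call; after approval the system continues asynchronously and eventually produces the assistant reply; the session does not end right after approval.
-
-## 7.3 Security Validation
-
-Even if the frontend submits forged history fields, the backend's actual context is unaffected.
-
-## 7.4 Event Validation
-
-Every run must include `StepStarted/StepFinished` for `intent/execution/organization`.
-
----
-
-## 8. Risk And Rollback
-
-### High Risk: #6 Context Ownership Migration
-
-- Risk: context bound to the wrong session, missing history
-- Control: session ownership check + consistent Redis/DB reads
-- Rollback: fall back to "backend DB-only history rebuild"
-
-### High Risk: #5 Async Resume Consistency
-
-- Risk: duplicate approvals, stuck state
-- Control: approval idempotency key + state transition constraints + timeout to a terminal state
-- Rollback: degrade to "return the tool result only, no automatic continuation"
-
-### Medium Risk: #2 Storage Split Consistency
-
-- Risk: object storage and DB metadata become inconsistent
-- Control: write the object first, then the metadata + compensating cleanup on failure
-- Rollback: temporarily fall back to inline DB storage
-
----
-
-## 9. Bug-To-Phase Mapping
-
-- **Phase 1:** #1, #5, #6
-- **Phase 2:** #2, #3, #4, #7, #8
-
----
-
-## 10.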
Next Step - -进入 implementation planning:将本设计拆解为任务级可执行计划(文件、测试、命令、验收证据)。 diff --git a/docs/plans/2026-03-08-agent-tool-architecture-implementation-plan.md b/docs/plans/2026-03-08-agent-tool-architecture-implementation-plan.md deleted file mode 100644 index 60acc1d..0000000 --- a/docs/plans/2026-03-08-agent-tool-architecture-implementation-plan.md +++ /dev/null @@ -1,449 +0,0 @@ -# Agent Tool Architecture Implementation Plan - -> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task. - -**Goal:** 修复 agent 工具架构 8 个问题,先恢复端到端闭环与安全正确性,再补齐 UI Schema、对象存储、阶段解耦、多模态与 ASR。 - -**Architecture:** 采用两阶段落地。Phase 1 先完成后端上下文主控、CrewAI Tools 完整迁移、审批后异步续跑闭环;Phase 2 按 `#3 -> #2 -> #4 -> #7 -> #8` 逐项扩展能力。所有变更遵循 AG-UI 事件流语义,三阶段固定发送 StepStarted/StepFinished。 - -**Tech Stack:** FastAPI, Pydantic, CrewAI, LiteLLM, Redis, Postgres, MinIO/Supabase Storage, pytest - -**Status:** Completed on 2026-03-08 (Task 1-8 delivered; Task 4/5/6 finalized with E2E object-storage verification) - ---- - -### Task 1: 锁定 Phase 1 契约(移除前端历史语义) - -**Files:** -- Modify: `backend/src/core/agent/domain/agui_input.py` -- Modify: `backend/src/core/agent/application/run_service.py` -- Modify: `backend/src/v1/agent/schemas.py` -- Test: `backend/tests/unit/core/agent/test_run_resume_service.py` - -**Step 1: Write the failing test** - -```python -def test_run_ignores_client_history_messages(fake_run_input_with_messages): - result = service.run(run_input=fake_run_input_with_messages) - assert result.used_context_source == "backend" -``` - -**Step 2: Run test to verify it fails** - -Run: `cd backend && uv run pytest tests/unit/core/agent/test_run_resume_service.py -k ignores_client_history -v` -Expected: FAIL,当前实现仍读取/依赖前端 history。 - -**Step 3: Write minimal implementation** - -```python -# run_service.py -user_input = extract_latest_user_text(run_input) -history = await load_context_from_backend_sources(session_id) -``` - -**Step 4: Run test to verify it passes** - -Run: `cd 
backend && uv run pytest tests/unit/core/agent/test_run_resume_service.py -k ignores_client_history -v`
-Expected: PASS
-
-**Step 5: Commit**
-
-```bash
-git add backend/src/core/agent/domain/agui_input.py backend/src/core/agent/application/run_service.py backend/src/v1/agent/schemas.py backend/tests/unit/core/agent/test_run_resume_service.py
-git commit -m "refactor(agent): make backend own conversation context"
-```
-
-### Task 2: Full CrewAI Tools migration (replace hard-coded dispatch)
-
-**Files:**
-- Create: `backend/src/core/agent/infrastructure/crewai/tools_registry.py`
-- Create: `backend/src/core/agent/infrastructure/crewai/tools/create_calendar_event_tool.py`
-- Modify: `backend/src/core/agent/infrastructure/crewai/runtime.py`
-- Modify: `backend/src/core/agent/application/run_service.py`
-- Test: `backend/tests/unit/core/agent/test_crewai_runtime.py`
-
-**Step 1: Write the failing test**
-
-```python
-def test_runtime_uses_registered_crewai_tools():
-    runtime = build_runtime_with_registry(["create_calendar_event"])
-    # The user input is Chinese for "create a calendar event for me".
-    result = runtime.execute(user_input="帮我创建日历事件", system_prompt="x")
-    assert result.tool_calls[0].tool_name == "create_calendar_event"
-```
-
-**Step 2: Run test to verify it fails**
-
-Run: `cd backend && uv run pytest tests/unit/core/agent/test_crewai_runtime.py -k registered_crewai_tools -v`
-Expected: FAIL; the current path is still hard-coded in run_service.
-
-**Step 3: Write minimal implementation**
-
-```python
-# tools_registry.py
-TOOLS = {"create_calendar_event": CreateCalendarEventTool()}
-
-# Assumed mapping: only the execution stage is given tools here.
-STAGE_TOOL_MAP = {"execution": list(TOOLS.values())}
-
-def tools_for_stage(stage: str) -> list[BaseTool]:
-    return STAGE_TOOL_MAP.get(stage, [])
-```
-
-**Step 4: Run test to verify it passes**
-
-Run: `cd backend && uv run pytest tests/unit/core/agent/test_crewai_runtime.py -k registered_crewai_tools -v`
-Expected: PASS
-
-**Step 5: Commit**
-
-```bash
-git add backend/src/core/agent/infrastructure/crewai/tools_registry.py backend/src/core/agent/infrastructure/crewai/tools/create_calendar_event_tool.py
backend/src/core/agent/infrastructure/crewai/runtime.py backend/src/core/agent/application/run_service.py backend/tests/unit/core/agent/test_crewai_runtime.py
-git commit -m "feat(agent): migrate backend tools to crewai tool registry"
-```
-
-### Task 3: Fix the asynchronous resume loop after approval (#5)
-
-**Files:**
-- Modify: `backend/src/core/agent/application/resume_service.py`
-- Modify: `backend/src/core/agent/infrastructure/queue/tasks.py`
-- Modify: `backend/src/core/agent/application/session_state_persistence.py`
-- Test: `backend/tests/integration/core/agent/test_queue_run_resume.py`
-
-**Step 1: Write the failing test**
-
-```python
-def test_resume_triggers_async_loop_until_final_assistant_message(client, run_id):
-    # run_id: assumed fixture for a run that is paused awaiting tool approval
-    response = client.post(f"/v1/agent/runs/{run_id}/resume", json={"approve": True})
-    assert response.status_code == 202
-    assert eventually_has_final_assistant_message(run_id)
-```
-
-**Step 2: Run test to verify it fails**
-
-Run: `cd backend && uv run pytest tests/integration/core/agent/test_queue_run_resume.py -k triggers_async_loop -v`
-Expected: FAIL; the run currently completes immediately after approval.
-
-**Step 3: Write minimal implementation**
-
-```python
-# resume_service.py
-await mark_session_resuming(...)
-await enqueue_resume_task(...)
-return ResumeAccepted(...)
-``` - -**Step 4: Run test to verify it passes** - -Run: `cd backend && uv run pytest tests/integration/core/agent/test_queue_run_resume.py -k triggers_async_loop -v` -Expected: PASS - -**Step 5: Commit** - -```bash -git add backend/src/core/agent/application/resume_service.py backend/src/core/agent/infrastructure/queue/tasks.py backend/src/core/agent/application/session_state_persistence.py backend/tests/integration/core/agent/test_queue_run_resume.py -git commit -m "fix(agent): continue agent loop asynchronously after tool approval" -``` - -### Task 4: 三阶段 Step 事件完整化(intent/execution/organization) - -**Files:** -- Modify: `backend/src/core/agent/infrastructure/crewai/runtime.py` -- Modify: `backend/src/core/agent/infrastructure/agui/bridge.py` -- Test: `backend/tests/unit/core/agent/test_agui_bridge.py` -- Test: `backend/tests/integration/v1/agent/test_sse_flow_live.py` - -**Step 1: Write the failing test** - -```python -def test_each_stage_emits_step_started_and_finished(): - events = collect_events_from_run(...) 
- assert has_step_pair(events, "intent") - assert has_step_pair(events, "execution") - assert has_step_pair(events, "organization") -``` - -**Step 2: Run test to verify it fails** - -Run: `cd backend && uv run pytest tests/integration/v1/agent/test_sse_flow_live.py -k emits_step_started_and_finished -v` -Expected: FAIL,至少一个阶段事件缺失。 - -**Step 3: Write minimal implementation** - -```python -emit_step_started(stage) -stage_output = run_stage(stage) -emit_step_finished(stage) -``` - -**Step 4: Run test to verify it passes** - -Run: `cd backend && uv run pytest tests/integration/v1/agent/test_sse_flow_live.py -k emits_step_started_and_finished -v` -Expected: PASS - -**Step 5: Commit** - -```bash -git add backend/src/core/agent/infrastructure/crewai/runtime.py backend/src/core/agent/infrastructure/agui/bridge.py backend/tests/unit/core/agent/test_agui_bridge.py backend/tests/integration/v1/agent/test_sse_flow_live.py -git commit -m "feat(agent): emit ag-ui step events for three-stage flow" -``` - -### Task 5: 工具输出统一为 UI Schema v1(#3) - -**Files:** -- Modify: `backend/src/core/agent/infrastructure/crewai/tools/create_calendar_event_tool.py` -- Modify: `backend/src/core/agent/domain/message_metadata.py` -- Test: `backend/tests/unit/core/agent/test_run_resume_service.py` - -**Step 1: Write the failing test** - -```python -def test_calendar_tool_returns_ui_schema_v1(): - result = run_calendar_tool(...) 
- assert result["type"] == "calendar_card.v1" - assert result["version"] == "v1" -``` - -**Step 2: Run test to verify it fails** - -Run: `cd backend && uv run pytest tests/unit/core/agent/test_run_resume_service.py -k returns_ui_schema_v1 -v` -Expected: FAIL,当前返回简单 status/event_id。 - -**Step 3: Write minimal implementation** - -```python -return { - "type": "calendar_card.v1", - "version": "v1", - "data": {...}, - "actions": [...], -} -``` - -**Step 4: Run test to verify it passes** - -Run: `cd backend && uv run pytest tests/unit/core/agent/test_run_resume_service.py -k returns_ui_schema_v1 -v` -Expected: PASS - -**Step 5: Commit** - -```bash -git add backend/src/core/agent/infrastructure/crewai/tools/create_calendar_event_tool.py backend/src/core/agent/domain/message_metadata.py backend/tests/unit/core/agent/test_run_resume_service.py -git commit -m "feat(agent): return tool results as ui schema v1" -``` - -### Task 6: 工具结果对象存储(#2) - -**Files:** -- Modify: `backend/src/core/agent/application/session_state_persistence.py` -- Modify: `backend/src/core/agent/domain/message_metadata.py` -- Test: `backend/tests/integration/core/agent/test_session_message_persistence.py` - -**Step 1: Write the failing test** - -```python -def test_large_tool_payload_persisted_to_object_storage(): - meta = persist_large_tool_result(...) - assert meta.storage_bucket is not None - assert meta.storage_path is not None -``` - -**Step 2: Run test to verify it fails** - -Run: `cd backend && uv run pytest tests/integration/core/agent/test_session_message_persistence.py -k object_storage -v` -Expected: FAIL,当前 metadata 为空。 - -**Step 3: Write minimal implementation** - -```python -payload_ref = await persist_tool_result_payload(...) 
-metadata.storage_bucket = payload_ref.bucket -metadata.storage_path = payload_ref.path -metadata.payload_sha256 = payload_ref.sha256 -``` - -**Step 4: Run test to verify it passes** - -Run: `cd backend && uv run pytest tests/integration/core/agent/test_session_message_persistence.py -k object_storage -v` -Expected: PASS - -**Step 5: Commit** - -```bash -git add backend/src/core/agent/application/session_state_persistence.py backend/src/core/agent/domain/message_metadata.py backend/tests/integration/core/agent/test_session_message_persistence.py -git commit -m "feat(agent): persist large tool results to object storage" -``` - -### Task 7: 三阶段参数解耦(#4) - -**Files:** -- Modify: `backend/src/core/agent/infrastructure/crewai/runtime.py` -- Modify: `backend/src/core/agent/infrastructure/config/resolver.py` -- Test: `backend/tests/unit/core/agent/test_config_resolver.py` -- Test: `backend/tests/unit/core/agent/test_crewai_runtime.py` - -**Step 1: Write the failing test** - -```python -def test_intent_stage_can_disable_tools(): - cfg = load_stage_config(intent_tools=[]) - result = run_intent_stage(cfg) - assert result.tool_calls == [] -``` - -**Step 2: Run test to verify it fails** - -Run: `cd backend && uv run pytest tests/unit/core/agent/test_crewai_runtime.py -k intent_stage_can_disable_tools -v` -Expected: FAIL,当前三阶段共享同一 llm/tools 配置。 - -**Step 3: Write minimal implementation** - -```python -stage_cfg = config.for_stage(stage) -run_stage(..., llm_config=stage_cfg.llm, tools=stage_cfg.tools) -``` - -**Step 4: Run test to verify it passes** - -Run: `cd backend && uv run pytest tests/unit/core/agent/test_crewai_runtime.py -k intent_stage_can_disable_tools -v` -Expected: PASS - -**Step 5: Commit** - -```bash -git add backend/src/core/agent/infrastructure/crewai/runtime.py backend/src/core/agent/infrastructure/config/resolver.py backend/tests/unit/core/agent/test_config_resolver.py backend/tests/unit/core/agent/test_crewai_runtime.py -git commit -m "refactor(agent): 
decouple llm and tool strategy by stage" -``` - -### Task 8: 多模态图片输入(文件上传)支持(#7) - -**Files:** -- Modify: `backend/src/core/agent/domain/agui_input.py` -- Modify: `backend/src/core/agent/infrastructure/crewai/runtime.py` -- Modify: `backend/src/core/agent/infrastructure/litellm/client.py` -- Test: `backend/tests/unit/core/agent/test_litellm_client.py` - -**Step 1: Write the failing test** - -```python -def test_image_content_block_is_preserved_for_llm(): - payload = build_multimodal_payload(text="分析图片", image_file="a.png") - assert payload_contains_image_block(payload) -``` - -**Step 2: Run test to verify it fails** - -Run: `cd backend && uv run pytest tests/unit/core/agent/test_litellm_client.py -k image_content_block_is_preserved -v` -Expected: FAIL,当前非 text 被丢弃。 - -**Step 3: Write minimal implementation** - -```python -if item.type == "image": - blocks.append({"type": "image_url", "image_url": {"url": signed_file_url}}) -``` - -**Step 4: Run test to verify it passes** - -Run: `cd backend && uv run pytest tests/unit/core/agent/test_litellm_client.py -k image_content_block_is_preserved -v` -Expected: PASS - -**Step 5: Commit** - -```bash -git add backend/src/core/agent/domain/agui_input.py backend/src/core/agent/infrastructure/crewai/runtime.py backend/src/core/agent/infrastructure/litellm/client.py backend/tests/unit/core/agent/test_litellm_client.py -git commit -m "feat(agent): support multimodal image input blocks" -``` - -### Task 9: 新增 ASR 同步转写 API(#8) - -**Files:** -- Create: `backend/src/v1/agent/asr_router.py` -- Modify: `backend/src/v1/agent/router.py` -- Create: `backend/src/v1/agent/asr_service.py` -- Create: `backend/src/v1/agent/asr_schemas.py` -- Test: `backend/tests/integration/v1/agent/test_routes.py` - -**Step 1: Write the failing test** - -```python -def test_asr_transcribe_returns_sync_transcript(client, wav_file): - resp = client.post("/v1/agent/asr/transcribe", files={"audio": wav_file}) - assert resp.status_code == 200 - assert 
resp.json()["transcript"] -``` - -**Step 2: Run test to verify it fails** - -Run: `cd backend && uv run pytest tests/integration/v1/agent/test_routes.py -k asr_transcribe_returns_sync_transcript -v` -Expected: FAIL,当前无路由。 - -**Step 3: Write minimal implementation** - -```python -@router.post("/asr/transcribe") -async def transcribe(audio: UploadFile) -> AsrTranscribeResponse: - text = await asr_service.transcribe(audio) - return AsrTranscribeResponse(transcript=text) -``` - -**Step 4: Run test to verify it passes** - -Run: `cd backend && uv run pytest tests/integration/v1/agent/test_routes.py -k asr_transcribe_returns_sync_transcript -v` -Expected: PASS - -**Step 5: Commit** - -```bash -git add backend/src/v1/agent/asr_router.py backend/src/v1/agent/router.py backend/src/v1/agent/asr_service.py backend/src/v1/agent/asr_schemas.py backend/tests/integration/v1/agent/test_routes.py -git commit -m "feat(agent): add synchronous asr transcription endpoint" -``` - -### Task 10: 全量验证与文档对齐 - -**Files:** -- Modify: `docs/runtime/runtime-route.md` -- Modify: `docs/bugs/2026-03-08-agent-tool-architecture.md` (状态回填) - -**Step 1: Run targeted unit suite** - -Run: `cd backend && uv run pytest tests/unit/core/agent -v` -Expected: PASS - -**Step 2: Run targeted integration suite** - -Run: `cd backend && uv run pytest tests/integration/core/agent tests/integration/v1/agent -v` -Expected: PASS - -**Step 3: Run e2e smoke for agent flow** - -Run: `cd backend && uv run pytest tests/e2e -k "agent or mobile_health" -v` -Expected: PASS 或明确记录跳过原因 - -**Step 4: Run quality gates** - -Run: `cd backend && uv run ruff check src tests && uv run basedpyright` -Expected: PASS - -**Step 5: Final commit** - -```bash -git add docs/runtime/runtime-route.md docs/bugs/2026-03-08-agent-tool-architecture.md -git commit -m "docs(agent): align runtime docs with new tool architecture" -``` - ---- - -## Verification Evidence Requirements - -实施完成时必须输出: - -1. 双金路径验证结果(无工具 + 工具审批后续跑) -2. 
三阶段 StepStarted/StepFinished 事件日志片段 -3. 安全验证结果(前端 history 篡改无效) -4. ASR 同步转写接口请求/响应样例 -5. 关键命令输出摘要(pytest/ruff/basedpyright) - ---- - -## Notes - -- 本计划不包含兼容逻辑保留。 -- 本计划采用一次性切换。 -- 若实施中出现 S2 -> S3 范围升级,先暂停并更新计划,再继续执行。 diff --git a/docs/plans/2026-03-08-runtime-refactor-prompt-centralization.md b/docs/plans/2026-03-08-runtime-refactor-prompt-centralization.md deleted file mode 100644 index 2054c52..0000000 --- a/docs/plans/2026-03-08-runtime-refactor-prompt-centralization.md +++ /dev/null @@ -1,129 +0,0 @@ -# Runtime Refactor and Prompt Centralization Implementation Plan - -> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task. - -**Goal:** Refactor CrewAI runtime into reusable modules, centralize all prompt text under `core/agent/prompt`, and diagnose flaky front-tool interrupt behavior without adding hardcoded runtime heuristics. - -**Architecture:** Keep `runtime.py` as a thin facade and move parsing/tool/prompt composition/stage execution into cohesive modules. Prompt strings (including stage contracts and injected tool-context instructions) are generated exclusively by prompt-module functions. Keep behavior equivalent by default; only add diagnostic observability for flaky live scenario analysis. - -**Tech Stack:** Python 3.12, FastAPI backend, CrewAI, Pydantic v2, pytest, ruff, basedpyright. - ---- - -### Task 1: Add prompt module and centralize all runtime prompt text - -**Files:** -- Create: `backend/src/core/agent/prompt/__init__.py` -- Create: `backend/src/core/agent/prompt/runtime_stage_prompts.py` -- Modify: `backend/src/core/agent/infrastructure/crewai/runtime.py` -- Test: `backend/tests/unit/core/agent/test_crewai_runtime.py` - -**Step 1: Write failing test** -- Add unit test asserting runtime uses prompt builder output (not inline literals) for stage description/contract/tool context. 
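A minimal sketch of what the failing test in Step 1 could look like, assuming the runtime receives its prompt builder by injection. `FakeRuntime` and `describe_stage` are hypothetical names used only for illustration, and the body of `build_stage_output_contract` is a placeholder rather than the plan's actual prompt text.

```python
from dataclasses import dataclass
from typing import Callable


def build_stage_output_contract(stage: str) -> str:
    """Hypothetical stand-in for the prompt-module builder named in this plan."""
    return f"Stage '{stage}': return one JSON object matching the {stage} result schema."


@dataclass
class FakeRuntime:
    """Hypothetical runtime stub; the real runtime's wiring may differ."""

    contract_builder: Callable[[str], str]

    def describe_stage(self, stage: str) -> str:
        # The runtime delegates to the builder instead of composing text inline.
        return self.contract_builder(stage)


def test_runtime_uses_prompt_module_for_stage_descriptions() -> None:
    calls: list[str] = []

    def spy(stage: str) -> str:
        # Record each stage the runtime asks the builder about.
        calls.append(stage)
        return build_stage_output_contract(stage)

    runtime = FakeRuntime(contract_builder=spy)
    description = runtime.describe_stage("intent")

    assert calls == ["intent"]
    assert description == build_stage_output_contract("intent")
```

Injecting the builder keeps the assertion about behavior (the runtime asked the prompt module) rather than about exact prompt wording, which is likely to change.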
- -**Step 2: Run test to verify it fails** -- Run: `uv run pytest backend/tests/unit/core/agent/test_crewai_runtime.py::test_runtime_uses_prompt_module_for_stage_descriptions -q` -- Expected: FAIL because runtime still composes inline strings. - -**Step 3: Implement prompt module** -- Add prompt functions: - - `build_stage_output_contract(stage: str) -> str` - - `build_stage_task_description(...) -> str` - - `build_intent_multimodal_prompt(...) -> str` -- Use mainstream prompt structure: role/objective/context/constraints/output-format. -- Keep rules non-hardcoded and behavior-oriented, avoid keyword-triggered branching rules. - -**Step 4: Wire runtime to prompt functions** -- Replace inline prompt strings in runtime with prompt-module function calls. -- Ensure no prompt literals remain in runtime except minimal wiring labels. - -**Step 5: Run tests** -- Run: `uv run pytest backend/tests/unit/core/agent/test_crewai_runtime.py -q` -- Expected: PASS. - ---- - -### Task 2: Split runtime into reusable modules and keep facade stable - -**Files:** -- Create: `backend/src/core/agent/infrastructure/crewai/runtime_models.py` -- Create: `backend/src/core/agent/infrastructure/crewai/runtime_parsers.py` -- Create: `backend/src/core/agent/infrastructure/crewai/runtime_tools.py` -- Create: `backend/src/core/agent/infrastructure/crewai/runtime_stage_runner.py` -- Modify: `backend/src/core/agent/infrastructure/crewai/runtime.py` -- Modify: `backend/src/core/agent/infrastructure/crewai/__init__.py` (if needed) -- Test: `backend/tests/unit/core/agent/test_crewai_runtime.py` - -**Step 1: Write failing test** -- Add/adjust unit test that imports `CrewAIRuntime` facade and verifies existing contract (`execute`, `map_events`, `is_registered_backend_tool`) still works after split. 
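The facade-contract check described in Step 1 might be sketched as follows. `CrewAIRuntimeStub` and `missing_contract_members` are hypothetical helpers, and the method signatures are assumptions; only the three public method names come from this plan.

```python
PUBLIC_CONTRACT = ("execute", "map_events", "is_registered_backend_tool")


def missing_contract_members(runtime_cls: type) -> list[str]:
    """Return the public contract names that runtime_cls does not expose as callables."""
    return [
        name
        for name in PUBLIC_CONTRACT
        if not callable(getattr(runtime_cls, name, None))
    ]


class CrewAIRuntimeStub:
    """Hypothetical stand-in for the real facade; method signatures are assumed."""

    def execute(self, *, user_input: str, system_prompt: str) -> dict[str, str]:
        return {"user_input": user_input, "system_prompt": system_prompt}

    def map_events(self, result: dict[str, str]) -> list[dict[str, str]]:
        return [result]

    def is_registered_backend_tool(self, tool_name: str) -> bool:
        return False


def test_runtime_facade_contract_stable_after_refactor() -> None:
    # An empty list means every public entry point survived the module split.
    assert missing_contract_members(CrewAIRuntimeStub) == []
```

In the real suite the stub would be replaced by the imported `CrewAIRuntime`, so the test fails as soon as a public method is dropped or renamed during extraction.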
- -**Step 2: Run test to verify it fails** -- Run: `uv run pytest backend/tests/unit/core/agent/test_crewai_runtime.py::test_runtime_facade_contract_stable_after_refactor -q` -- Expected: FAIL before module split wiring. - -**Step 3: Extract models/parsers/tools/stage-runner** -- Move Pydantic result models to `runtime_models.py`. -- Move parse/normalize helpers to `runtime_parsers.py`. -- Move tool normalization, routing tool class, pending-front-tool extraction to `runtime_tools.py`. -- Move `_run_stage_with_crewai` + usage extraction to `runtime_stage_runner.py`. - -**Step 4: Keep runtime facade thin** -- `runtime.py` retains orchestration flow and public API only. -- Import and compose extracted modules; no behavior change intended. - -**Step 5: Run tests** -- Run: `uv run pytest backend/tests/unit/core/agent/test_crewai_runtime.py -q` -- Expected: PASS. - ---- - -### Task 3: Diagnose front-tool interrupt instability with explicit observability - -**Files:** -- Modify: `backend/src/core/agent/infrastructure/crewai/runtime.py` -- Modify: `backend/src/core/agent/infrastructure/crewai/runtime_stage_runner.py` -- Modify: `backend/tests/e2e/test_agent_live_flow.py` -- Modify: `docs/bugs/2026-03-08-backend-tool-no-events.md` - -**Step 1: Add failing/diagnostic assertion in live test path** -- Extend test to capture and print structured diagnostics when `pending_tool_call_id` is `None`: - - intent/execution raw+structured output - - tool payload injected into prompts - - captured tool calls list - -**Step 2: Run targeted live test for evidence** -- Run: `AGENT_LIVE_E2E=1 uv run pytest backend/tests/e2e/test_agent_live_flow.py::test_agent_live_front_tool_interrupt_resume_continue -v -rs` -- Expected: still flaky/fail, but with actionable diagnostics. - -**Step 3: Analyze evidence and apply non-hardcoded fix** -- If input ambiguity: refine test input prompt text under test fixture. -- If tool-description injection issue: fix prompt-builder injection logic. 
-- Do not add keyword heuristics in runtime branching. - -**Step 4: Re-run live targeted test** -- Same command as Step 2. -- Expected: improved stability or clearly documented unresolved root cause. - -**Step 5: Update bug doc** -- Add root-cause findings and next actions under Bug 3 section. - ---- - -### Task 4: Full verification and hygiene - -**Files:** -- Modify (if needed): `backend/tests/unit/core/agent/test_run_resume_service.py` - -**Step 1: Run impacted unit suites** -- `uv run pytest backend/tests/unit/core/agent/test_crewai_runtime.py -q` -- `uv run pytest backend/tests/unit/core/agent/test_run_resume_service.py -q` - -**Step 2: Run lint/type checks** -- `uv run ruff check backend/src/core/agent/prompt backend/src/core/agent/infrastructure/crewai backend/tests/unit/core/agent/test_crewai_runtime.py backend/tests/e2e/test_agent_live_flow.py` -- `uv run basedpyright backend/src/core/agent/prompt backend/src/core/agent/infrastructure/crewai backend/tests/unit/core/agent/test_crewai_runtime.py` - -**Step 3: Optional live regression pack (if env ready)** -- `AGENT_LIVE_E2E=1 uv run pytest backend/tests/e2e/test_agent_live_flow.py -m live -v -rs` - -**Step 4: Report residual risk** -- If live still flaky, report exact failure mode and captured diagnostics (no workaround heuristics). diff --git a/docs/plans/2026-03-09-cloud-supabase-jwks-migration-plan.md b/docs/plans/2026-03-09-cloud-supabase-jwks-migration-plan.md deleted file mode 100644 index 61797dc..0000000 --- a/docs/plans/2026-03-09-cloud-supabase-jwks-migration-plan.md +++ /dev/null @@ -1,303 +0,0 @@ -# Cloud Supabase Env Cleanup & JWKS Migration Implementation Plan - -> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task. 
- -**Goal:** 切换到云 Supabase 后,移除本地自托管 Supabase 基础设施变量与编排,保留 Redis + DB + init-job,并将后端 JWT 验签从 `JWT_SECRET` 改为 JWKS 公钥验签。 - -**Architecture:** 后端配置收敛到“业务运行所需最小集合”(Supabase URL/anon/service role + DB + Redis)。认证链路采用 JWKS 拉取公钥并按 `kid` 验签,替代共享密钥 HS256。Docker 编排只保留业务依赖(redis、db、init-job),不再编排本地 Supabase 全家桶。 - -**Tech Stack:** FastAPI, Pydantic Settings, PyJWT (PyJWKClient), Docker Compose, pytest - ---- - -### Task 1: 固化云模式配置契约(先测后改) - -**Files:** -- Modify: `backend/tests/unit/test_settings_supabase_env.py` -- Modify: `.env.example` - -**Step 1: 写失败测试,定义新 Supabase 配置契约** - -```python -def test_social_prefixed_supabase_env_populates_settings(monkeypatch: MonkeyPatch) -> None: - monkeypatch.setenv("SOCIAL_SUPABASE__PUBLIC_URL", "https://project.example.supabase.co") - monkeypatch.setenv("SOCIAL_SUPABASE__ANON_KEY", "anon-key") - monkeypatch.setenv("SOCIAL_SUPABASE__SERVICE_ROLE_KEY", "service-key") - monkeypatch.setenv("SOCIAL_SUPABASE__JWT_AUDIENCE", "authenticated") - - settings = Settings() - - assert settings.supabase.public_url == "https://project.example.supabase.co" - assert settings.supabase.jwt_issuer == "https://project.example.supabase.co/auth/v1" - assert settings.supabase.jwks_url.endswith("/auth/v1/.well-known/jwks.json") -``` - -**Step 2: 运行测试确认失败** - -Run: `uv run pytest backend/tests/unit/test_settings_supabase_env.py -v` -Expected: FAIL(`public_url/jwks_url` 字段不存在或断言失败) - -**Step 3: 最小改动让测试通过(仅 settings 相关,逻辑改动在后续任务)** - -更新 `.env.example` 为云模式最小变量草案(先占位,后续任务会补最终文案): -- `SOCIAL_SUPABASE__PUBLIC_URL=` -- `SOCIAL_SUPABASE__ANON_KEY=` -- `SOCIAL_SUPABASE__SERVICE_ROLE_KEY=` -- `SOCIAL_SUPABASE__JWT_AUDIENCE=authenticated` -- `SOCIAL_SUPABASE__JWT_ISSUER=`(可选,默认由 PUBLIC_URL 推导) -- `SOCIAL_SUPABASE__JWKS_URL=`(可选,默认由 PUBLIC_URL 推导) - -**Step 4: 运行测试确认通过** - -Run: `uv run pytest backend/tests/unit/test_settings_supabase_env.py -v` -Expected: PASS - -**Step 5: Commit** - -```bash -git add backend/tests/unit/test_settings_supabase_env.py .env.example -git 
commit -m "test: define cloud supabase settings contract" -``` - -### Task 2: 重构 SupabaseSettings(移除 JWT_SECRET 依赖) - -**Files:** -- Modify: `backend/src/core/config/settings.py` -- Modify: `backend/tests/unit/test_settings_supabase_env.py` - -**Step 1: 写失败测试,约束默认推导行为** - -```python -assert settings.supabase.jwt_issuer == "https://project.example.supabase.co/auth/v1" -assert settings.supabase.jwks_url == "https://project.example.supabase.co/auth/v1/.well-known/jwks.json" -assert "jwt_secret" not in settings.model_dump()["supabase"] -``` - -**Step 2: 运行测试确认失败** - -Run: `uv run pytest backend/tests/unit/test_settings_supabase_env.py -v` -Expected: FAIL - -**Step 3: 实现最小配置重构** - -在 `SupabaseSettings` 中改为: -- 必填:`public_url`, `anon_key`, `service_role_key` -- 可选:`site_url`, `additional_redirect_urls` -- 新增:`jwt_audience`(默认 `authenticated`)、`jwt_issuer`(默认 `${public_url}/auth/v1`)、`jwks_url`(默认 `${jwt_issuer}/.well-known/jwks.json`) -- 删除:`jwt_secret`, `public_scheme`, `public_host`, `kong_http_port`, `kong_https_port` - -**Step 4: 运行测试确认通过** - -Run: `uv run pytest backend/tests/unit/test_settings_supabase_env.py -v` -Expected: PASS - -**Step 5: Commit** - -```bash -git add backend/src/core/config/settings.py backend/tests/unit/test_settings_supabase_env.py -git commit -m "refactor: migrate supabase config to cloud jwks fields" -``` - -### Task 3: 引入 JWKS 验签组件并接入认证依赖 - -**Files:** -- Create: `backend/src/core/auth/jwt_verifier.py` -- Modify: `backend/src/v1/users/dependencies.py` -- Create: `backend/tests/unit/core/auth/test_jwt_verifier.py` - -**Step 1: 先写失败测试(JWT 验签核心行为)** - -```python -def test_verify_token_with_jwks_success(...): - claims = verifier.verify(token) - assert claims["sub"] == str(user_id) - -def test_verify_token_rejects_invalid_issuer(...): - with pytest.raises(TokenValidationError): - verifier.verify(token_with_wrong_iss) -``` - -**Step 2: 运行测试确认失败** - -Run: `uv run pytest backend/tests/unit/core/auth/test_jwt_verifier.py -v` -Expected: FAIL(模块/类不存在) 
-
-**Step 3: Implement minimal JWKS verification**
-
-```python
-from typing import Any
-
-import jwt
-from jwt import PyJWKClient
-
-
-class JwtVerifier:
-    def __init__(self, jwks_url: str, issuer: str, audience: str) -> None:
-        self._jwks_client = PyJWKClient(jwks_url)
-        self._issuer = issuer
-        self._audience = audience
-
-    def verify(self, token: str) -> dict[str, Any]:
-        # Select the public key that matches the token's `kid` header.
-        key = self._jwks_client.get_signing_key_from_jwt(token)
-        return jwt.decode(
-            token,
-            key.key,
-            algorithms=["RS256", "ES256"],
-            audience=self._audience,
-            issuer=self._issuer,
-            options={"require": ["sub", "aud", "iss", "exp"]},
-        )
-```
-
-In `get_current_user`, replace the original `jwt_secret + HS256` verification and map outcomes uniformly onto the existing 401/503 semantics.
-
-**Step 4: Run the tests to confirm they pass**
-
-Run: `uv run pytest backend/tests/unit/core/auth/test_jwt_verifier.py -v`
-Expected: PASS
-
-**Step 5: Commit**
-
-```bash
-git add backend/src/core/auth/jwt_verifier.py backend/src/v1/users/dependencies.py backend/tests/unit/core/auth/test_jwt_verifier.py
-git commit -m "feat: validate access tokens via supabase jwks"
-```
-
-### Task 4: Regress the auth path and keep live tests compatible
-
-**Files:**
-- Modify: `backend/tests/integration/v1/agent/test_sse_flow_live.py`
-- Modify: `backend/tests/integration/test_auth_routes.py` (if needed)
-
-**Step 1: Write the failing test / adjust how live tests obtain tokens**
-
-Change the live tests from "sign an HS256 token locally" to "obtain an access token through a real login", or "skip when no test account is available".
-
-```python
-if not os.getenv("AGENT_LIVE_EMAIL") or not os.getenv("AGENT_LIVE_PASSWORD"):
-    pytest.skip("missing live supabase credentials")
-```
-
-**Step 2: Run the related tests to confirm they fail (or that the old logic no longer fits)**
-
-Run: `uv run pytest backend/tests/integration/v1/agent/test_sse_flow_live.py -m live -v`
-Expected: unusable under the old code / depends on jwt_secret
-
-**Step 3: Complete the minimal implementation change**
-
-- Remove the tests' dependency on `config.supabase.jwt_secret`.
-- Keep `@pytest.mark.live` behavior unchanged so that regular CI is unaffected.
-
-**Step 4: Run the tests to confirm they pass (or a controlled skip)**
-
-Run: `uv run pytest backend/tests/integration/v1/agent/test_sse_flow_live.py -m live -v`
-Expected: PASS, or an explainable SKIP (missing credentials)
-
-**Step 5: Commit**
-
-```bash
-git add backend/tests/integration/v1/agent/test_sse_flow_live.py backend/tests/integration/test_auth_routes.py
-git commit -m "test: align live auth flow with cloud supabase tokens"
-```
-
-### Task 5: 裁剪 Docker Compose(移除本地
Supabase,保留 Redis/DB/init-job) - -**Files:** -- Modify: `infra/docker/docker-compose.yml` - -**Step 1: 写失败验证(compose 结构断言)** - -添加一个轻量脚本化检查(可在本任务临时执行,不必入库): - -```bash -docker compose --env-file .env -f infra/docker/docker-compose.yml config -``` - -在改造前记录当前包含的 Supabase 服务(`studio/kong/auth/rest/...`)作为对照。 - -**Step 2: 执行检查确认当前状态(基线)** - -Run: `docker compose --env-file .env -f infra/docker/docker-compose.yml config` -Expected: 输出包含 Supabase 全家桶服务 - -**Step 3: 最小实现裁剪** - -- 删除服务:`studio/kong/mail-templates/auth/rest/realtime/storage/imgproxy/meta/functions/analytics/vector/supavisor` -- 保留服务:`redis`, `db`, `init-job` -- `init-job` 环境变量移除:`SOCIAL_SUPABASE__ANON_KEY`, `SOCIAL_SUPABASE__SERVICE_ROLE_KEY`, `SOCIAL_SUPABASE__JWT_SECRET` -- `db` 服务切换为业务最小化所需配置(仅数据库启动与健康检查必需) - -**Step 4: 运行 compose 校验** - -Run: `docker compose --env-file .env -f infra/docker/docker-compose.yml config` -Expected: PASS,且仅保留 redis/db/init-job - -**Step 5: Commit** - -```bash -git add infra/docker/docker-compose.yml -git commit -m "refactor: remove local supabase stack from compose" -``` - -### Task 6: 清理环境模板与运行文档 - -**Files:** -- Modify: `.env.example` -- Modify: `docs/runtime/runtime-runbook.md` -- Modify: `infra/scripts/dev-migrate.sh` - -**Step 1: 先写文档/模板检查点(人工可核验)** - -定义必须满足: -- `.env.example` 不再包含本地 Supabase 基础设施变量(logflare/pooler/studio/kong/jwt_secret 等) -- 保留并标注后端必需项:`PUBLIC_URL`, `ANON_KEY`, `SERVICE_ROLE_KEY` -- runbook 的健康检查改为 Redis/DB/Web,而非 Kong - -**Step 2: 运行基线检查(改造前)** - -Run: `uv run pytest backend/tests/unit/test_settings_supabase_env.py -v` -Expected: 作为环境模板改造后的回归基线 - -**Step 3: 最小实现文档更新** - -- `docs/runtime/runtime-runbook.md`:把“启动基础设施”描述改为 `redis + db`。 -- `infra/scripts/dev-migrate.sh`:将提示从“Requires Supabase services”改为“Requires db/redis services”。 -- `.env.example`:按云模式分组,明确前端/后端变量边界。 - -**Step 4: 运行检查确认通过** - -Run: `docker compose --env-file .env -f infra/docker/docker-compose.yml config` -Expected: PASS - -**Step 5: Commit** - -```bash -git add .env.example 
docs/runtime/runtime-runbook.md infra/scripts/dev-migrate.sh -git commit -m "docs: update runtime guide for cloud supabase mode" -``` - -### Task 7: 全量验证与发布前检查 - -**Files:** -- Modify: `docs/runtime/runtime-runbook.md`(记录验证命令与结果) - -**Step 1: 运行静态检查** - -Run: `uv run ruff check backend/src backend/tests` -Expected: PASS - -**Step 2: 运行类型检查** - -Run: `uv run basedpyright` -Expected: PASS - -**Step 3: 运行测试(按影响面)** - -Run: `uv run pytest backend/tests/unit/test_settings_supabase_env.py backend/tests/unit/core/auth/test_jwt_verifier.py -v` -Expected: PASS - -Run: `uv run pytest backend/tests/integration/test_users_routes.py backend/tests/integration/test_auth_routes.py -v` -Expected: PASS - -**Step 4: 运行运行时门禁验证** - -Run: `docker compose --env-file .env -f infra/docker/docker-compose.yml up -d redis db && docker compose --env-file .env -f infra/docker/docker-compose.yml run --rm --build init-job uv run python -m core.runtime.cli bootstrap` -Expected: PASS(迁移 + init-data 成功) - -**Step 5: Commit** - -```bash -git add docs/runtime/runtime-runbook.md -git commit -m "chore: record cloud supabase migration verification" -``` diff --git a/docs/plans/2026-03-10-auth-token-compat-refresh-singleflight.md b/docs/plans/2026-03-10-auth-token-compat-refresh-singleflight.md new file mode 100644 index 0000000..b0b47c0 --- /dev/null +++ b/docs/plans/2026-03-10-auth-token-compat-refresh-singleflight.md @@ -0,0 +1,69 @@ +# Auth Token Compatibility + Refresh Singleflight Implementation Plan + +> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task. 
+
+**Goal:** Accept the access token claims that cloud Supabase actually issues (tokens missing `iss` still pass), and fix the refresh storm triggered by frontend 401s, eliminating the bursts of 401/429 warnings in the logs.
+
+**Architecture:** The backend keeps HS256 signature verification and the mandatory `exp/sub` checks, and changes `iss` from "must be present" to "validated when present". The frontend adds refresh singleflight and re-entrancy protection to the interceptor, so that concurrent 401s cannot trigger multiple refreshes or a self-recursive refresh. Dead branches and redundant state are cleaned up along the way.
+
+**Tech Stack:** FastAPI, PyJWT, Flutter, Dio, flutter_test
+
+---
+
+### Task 1: Backend JWT claim compatibility (tokens without `iss` pass)
+
+**Files:**
+- Modify: `backend/src/core/auth/jwt_verifier.py`
+- Test: `backend/tests/unit/core/auth/test_jwt_verifier.py`
+
+**Step 1: Write failing test**
+- Add a case: a token without `iss`, but with valid `sub/exp` and a valid HS256 signature, should verify successfully.
+
+**Step 2: Run test to verify it fails**
+- Run: `cd backend && uv run pytest tests/unit/core/auth/test_jwt_verifier.py -q`
+
+**Step 3: Write minimal implementation**
+- Remove `iss` from the `require` option of `jwt.decode`, keeping only `sub/exp`.
+- If the payload contains `iss` and an issuer is configured, compare the issuer manually; raise an error on mismatch.
+
+**Step 4: Run test to verify it passes**
+- Run: `cd backend && uv run pytest tests/unit/core/auth/test_jwt_verifier.py -q`
+
+### Task 2: Frontend refresh singleflight + recursion guard
+
+**Files:**
+- Modify: `apps/lib/core/api/api_interceptor.dart`
+- Test: `apps/test/core/api/api_interceptor_test.dart`
+
+**Step 1: Write failing tests**
+- Concurrent 401s call `onTokenRefresh` only once.
+- A 401 from `/api/v1/auth/sessions/refresh` itself does not trigger a refresh retry.
+
+**Step 2: Run tests to verify failures**
+- Run: `cd apps && flutter test test/core/api/api_interceptor_test.dart`
+
+**Step 3: Write minimal implementation**
+- Add a `_refreshFuture` singleflight field.
+- When a non-refresh request hits a 401, await the same refresh future.
+- Short-circuit the refresh/logout auth endpoints and already-retried requests to avoid infinite re-entry.
+
+**Step 4: Run tests to verify pass**
+- Run: `cd apps && flutter test test/core/api/api_interceptor_test.dart`
+
+### Task 3: Clean up dead/legacy branches and run regression verification
+
+**Files:**
+- Modify: `apps/lib/core/api/api_interceptor.dart` (remove dead retry branches)
+- Modify: `backend/src/core/auth/jwt_verifier.py` (delete code paths that are no longer used)
+
+**Step 1: Refactor cleanup**
+- Delete unreachable branches and duplicated logic, keeping behavior unchanged.
+
+**Step 2: Full targeted verification**
+- Run: `cd backend && uv run ruff check src tests`
+- Run: `cd backend && uv run basedpyright`
+- Run: `cd backend && uv run pytest tests/unit/core/auth/test_jwt_verifier.py tests/unit/v1/users -q`
+- Run: `cd apps && flutter test test/core/api/api_interceptor_test.dart test/features/auth`
+
+**Step 3: Runtime spot-check**
+- Run: log in to obtain a token, then request `/api/v1/agent/history` and confirm it no longer returns 401 because of a missing `iss`.
diff --git a/docs/plans/2026-03-10-supabase-jwks-auth-reliability-plan.md b/docs/plans/2026-03-10-supabase-jwks-auth-reliability-plan.md
deleted file mode 100644
index dd963ff..0000000
--- a/docs/plans/2026-03-10-supabase-jwks-auth-reliability-plan.md
+++ /dev/null
@@ -1,65 +0,0 @@
-# Supabase JWKS Auth Reliability Implementation Plan
-
-> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
-
-**Goal:** Have the backend verify signatures reliably via JWKS/RS256 against cloud Supabase, and surface upstream Auth timeouts correctly as 503, so that the sign-up/login/password-reset flows are observable.
-
-**Architecture:** Keep the automatic `PUBLIC_URL -> issuer/jwks` derivation and keep enforcing RS256 for JWT verification, but add the `apikey` and `Authorization` headers to the JWKS fetch. The Auth Gateway gains a unified error mapping that classifies upstream timeout/gateway errors as service unavailable (503); everything else keeps its existing 401/422 semantics.
-
-**Tech Stack:** FastAPI, Pydantic, PyJWT (`PyJWKClient`), Supabase Python SDK, pytest.
-
----
-
-### Task 1: JWKS header support (test first)
-
-**Files:**
-- Modify: `backend/tests/unit/core/auth/test_jwt_verifier.py`
-- Modify: `backend/src/core/auth/jwt_verifier.py`
-- Modify: `backend/src/v1/users/dependencies.py`
-
-**Step 1: Write failing test**
-- Add a `JwtVerifier` case asserting that `PyJWKClient` is initialized with `apikey` and `Authorization: Bearer `.
-
-**Step 2: Run test to verify it fails**
-- Run: `uv run pytest backend/tests/unit/core/auth/test_jwt_verifier.py -v`
-
-**Step 3: Write minimal implementation**
-- `JwtVerifier.__init__` takes a new `apikey` parameter and injects it into the JWKS request headers.
-- `get_jwt_verifier()` passes `config.supabase.anon_key`.
-
-**Step 4: Run test to verify it passes**
-- Run: `uv run pytest backend/tests/unit/core/auth/test_jwt_verifier.py -v`
-
-### Task 2: Map upstream Auth timeout errors to 503 (test first)
-
-**Files:**
-- Modify: 
`backend/tests/unit/v1/auth/test_auth_gateway.py`
-- Modify: `backend/src/v1/auth/gateway.py`
-
-**Step 1: Write failing test**
-- Add a timeout-error test for `create_verification` that expects `HTTPException(status_code=503)`.
-
-**Step 2: Run test to verify it fails**
-- Run: `uv run pytest backend/tests/unit/v1/auth/test_auth_gateway.py -v`
-
-**Step 3: Write minimal implementation**
-- Add an AuthError classification function that recognizes timeout/request_timeout/upstream timeout.
-- Map them to 503 in the sign-up, login, refresh, and reset branches.
-
-**Step 4: Run test to verify it passes**
-- Run: `uv run pytest backend/tests/unit/v1/auth/test_auth_gateway.py -v`
-
-### Task 3: Regression verification
-
-**Files:**
-- Modify: `backend/tests/unit/test_settings_supabase_env.py` (if needed)
-
-**Step 1: Run targeted suites**
-- Run: `uv run pytest backend/tests/unit/core/auth/test_jwt_verifier.py backend/tests/unit/v1/auth/test_auth_gateway.py backend/tests/unit/test_settings_supabase_env.py -v`
-
-**Step 2: Run quality gates**
-- Run: `uv run ruff check backend/src backend/tests`
-- Run: `uv run basedpyright backend/src`
-
-**Step 3: Document runtime checks**
-- Record the environment variables required for JWT/JWKS and the commands for manual end-to-end testing.