feat: add Agent step events and image attachments

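- Add support for the stepStarted/stepFinished event types
- Implement image attachment upload and preview in the frontend
- Improve tool result storage and event handling in the backend
- Extend the related unit and integration tests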
2026-03-12 09:29:57 +08:00
parent 87215f9d41
commit 7b8865e256
45 changed files with 3869 additions and 308 deletions
@@ -67,3 +67,60 @@ uv run pytest tests/integration/v1/agent/test_sse_flow_live.py::test_agent_runs_
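 2. **Continuous session memory test** - verify that the session reads historical context from the database
 3. **Tool call test** - calendar read/write/delete/share + user lookup + time awareness
 4. **Session failure investigation** - find the cause of the most recent failure and fix it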
+
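+## 9. Progress and Conclusions for This Round (2026-03-12)
+
+### 9.1 Feedback Loop Status
+
+1. **Persisting tokens/cost for the intent/execution stages**: resolved.
+2. **Continuous session memory (today + yesterday context)**: resolved.
+3. **Tool call smoke test (read/write/delete/share + user lookup + time awareness)**: partially resolved.
+4. **Root cause and fix for the latest failed session**: resolved.
+5. **Feedback synced to the docs**: done (this section).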
+
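+### 9.2 Key Fixes
+
+1. **Stage telemetry backfill** (intent/execution):
+   - estimate tokens when usage is missing;
+   - estimate cost from project pricing via `LiteLLMService.calculate_cost`;
+   - backfill `response_metadata.inputTokens/outputTokens/cost` and persist it.
+
+2. **Session memory context injection**:
+   - before execution, the runtime reads the user/assistant messages of the same session from the last two days (today + yesterday);
+   - the intent prompt gains a `[Conversation Context]` block, so it no longer sees only the latest user input.
+
+3. **Tool call stability fixes**:
+   - tool names now consistently use underscores (`calendar_read`/`calendar_write`/`user_resolve`), which fixes the OpenAI/LiteLLM tool name regex error;
+   - the intent prompt now injects the merged intent + execution tool schema, which avoids the false "no write tool available" judgment.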
+
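The "today + yesterday" window from fix 2 could be sketched roughly as below. This is only an illustration: the message shape (`role`, `content`, `created_at`), the function names, and the choice to evaluate the window in Beijing time (UTC+8) are assumptions, not the runtime's actual code.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: "today + yesterday" is evaluated in Beijing time (UTC+8).
BEIJING = timezone(timedelta(hours=8))


def recent_context(messages, now):
    """Return user/assistant messages created since 00:00 yesterday (Beijing time)."""
    local_now = now.astimezone(BEIJING)
    window_start = datetime.combine(
        local_now.date() - timedelta(days=1), time.min, tzinfo=BEIJING
    )
    return [
        m
        for m in messages
        if m["role"] in ("user", "assistant") and m["created_at"] >= window_start
    ]


def render_context(messages):
    """Format the selected messages as a [Conversation Context] prompt block."""
    lines = ["[Conversation Context]"]
    lines += [f"{m['role']}: {m['content']}" for m in messages]
    return "\n".join(lines)
```

Tool messages are filtered out here because the fix only mentions user/assistant messages.
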
+### 9.3 Live Evidence
+
+#### A) tokens/cost persistence (thread=`cb1681c2-c223-4ced-bcfd-76f7252ba2d8`)
+
+- intent: `input_tokens=1541`, `output_tokens=37`, `cost=0.000382`
+- execution: `input_tokens=2161`, `output_tokens=376`, `cost=0.005450`
+- report: `input_tokens=3266`, `output_tokens=318`, `cost=0.007256`
+- session aggregate: `total_tokens=13518`, `total_cost=0.019473`
+
+#### B) Continuous session memory (thread=`9c456736-d5e5-48a4-b9db-55f507baf573`)
+
+- run `mem-1`: `请记住口令是蓝鲸42，只回复已记住。` ("Remember that the passphrase is 蓝鲸42; reply only with 'remembered'.")
+- run `mem-2`: `只回复我刚才让你记住的口令，不要解释。` ("Reply only with the passphrase I just asked you to remember; do not explain.")
+- assistant reply: `蓝鲸42` (memory hit).
+
+#### C) Tool calls + time awareness (thread=`cb1681c2-c223-4ced-bcfd-76f7252ba2d8`, run=`run-tool-1`)
+
+- The event sequence includes the execution stage and multiple `TOOL_CALL_RESULT` events
+- Tool call results: `calendar_write`, `calendar_read` (multiple times)
+- The assistant reply includes time-aware information (Beijing time date/weekday/time of day)
+
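+### 9.4 Root Cause of the Latest Failed Session
+
+- Failed sample: `d6bc4dbd-8361-4a39-bf09-12b3392e0e70`
+- Root cause: a tool name containing a dot (e.g. `calendar.write`) fails validation:
+  - `Invalid 'tools[0].function.name' ... expected pattern ^[a-zA-Z0-9_-]+$`
+- After the fix: the same execution path now reliably enters execution and produces `TOOL_CALL_RESULT`.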
+
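A guard against the naming problem above might look like the following sketch. The regex is the one quoted in the validation error; the helper names are hypothetical and not taken from the codebase.

```python
import re

# Pattern quoted in the validation error above.
TOOL_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9_-]+$")


def is_valid_tool_name(name: str) -> bool:
    """Check a tool name against the pattern the API expects."""
    return bool(TOOL_NAME_PATTERN.fullmatch(name))


def normalize_tool_name(name: str) -> str:
    """Replace dots with underscores, then reject anything still invalid."""
    candidate = name.replace(".", "_")
    if not is_valid_tool_name(candidate):
        raise ValueError(f"invalid tool name: {name!r}")
    return candidate
```

Running such a check when tools are registered would surface a bad name before any request reaches the model provider.
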
+### 9.5 Open Items
+
+- Complete live evidence for the combined `user_resolve` + calendar **share + delete** path is still missing (this round's run was interrupted: `Tool execution aborted`).
@@ -0,0 +1,173 @@
+# Agent Tool UI Schema and Frontend Event Wiring Design
+
+## Goal
+
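+Fix the data contract for agent tool results and the frontend/backend integration:
+
+1. SSE `TOOL_CALL_RESULT` continues to carry a `ui` that can be rendered in real time.
+2. When persisting, `messages.content` stores only a key summary; the full tool result (including the `ui schema`) goes to object storage.
+3. `messages.metadata` stores only the access path and index fields; history uses the metadata to backfill the complete tool card data.
+4. The frontend formally wires up all three paths (runs/events/history) and unifies realtime and history rendering behavior.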
+
+## Constraints
+
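+- Postpone smoke tests; finish the tool data fix and the frontend/backend interface integration first.
+- Keep the format that the existing frontend `UiSchemaRenderer` can parse; make no breaking protocol changes.
+- Do not extend the new `resume` requirements for now.
+- Follow AG-UI event semantics and the existing FastAPI routing conventions.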
+
+## Selected Approach
+
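+Adopt a backward-compatible enhancement:
+
+- The event stream stays compatible for the frontend (`TOOL_CALL_RESULT` carries `ui` + `content`).
+- Persistence and replay get structured enhancements (storage + metadata index + summary content).
+- The frontend uses one mapping layer for realtime and history, so that messages of the same kind render consistently.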
+
+## Design A: Unified Data Contract
+
+### SSE Event Contract (Realtime)
+
+The `TOOL_CALL_RESULT` event continues to include the fields the frontend can currently parse:
+
+- `callId`
+- `toolName`
+- `args`
+- `result`
+- `error`
+- `content` (summary of the key result)
+- `ui` (tool card schema)
+
+This ensures the frontend realtime stream can display tool cards without waiting for history.
+
+### Persistence Contract (Database + Storage)
+
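+Tool messages are persisted in two layers:
+
+- `messages.content`: stores only `content_summary` (short text, for low-cost context and fallback display).
+- Object storage: stores the full payload (`ui`, `args`, `result`, `error`, timestamps, tool identifiers, etc.).
+- `messages.metadata`: stores only indexes and the access path: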
+  - `tool_call_id`
+  - `tool_name`
+  - `run_id`
+  - `stage`
+  - `task_id`
+  - `storage_bucket`
+  - `storage_path`
+  - `summary_version`
+
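As a rough illustration of this split, the sketch below turns one tool result event into the three persisted parts. The function name, the `SUMMARY_VERSION` constant, and the way bucket/path are passed in are assumptions; timestamps are omitted for brevity.

```python
from typing import Any

SUMMARY_VERSION = 1  # hypothetical version marker for the summary rules


def split_tool_result(
    event: dict[str, Any], *, run_id: str, stage: str, task_id: str, bucket: str, path: str
) -> tuple[str, dict[str, Any], dict[str, Any]]:
    """Split a tool result event into (content, metadata, full_payload)."""
    # Full payload: everything the tool card needs, destined for object storage.
    full_payload = {
        key: event.get(key)
        for key in ("callId", "toolName", "args", "result", "error", "ui")
    }
    # Metadata: only index fields and the access path, never the payload itself.
    metadata = {
        "tool_call_id": event["callId"],
        "tool_name": event["toolName"],
        "run_id": run_id,
        "stage": stage,
        "task_id": task_id,
        "storage_bucket": bucket,
        "storage_path": path,
        "summary_version": SUMMARY_VERSION,
    }
    # Content: the short summary that goes into messages.content.
    return event.get("content", ""), metadata, full_payload
```
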
+### History Contract
+
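+When serializing history:
+
+1. First read the full payload via `metadata.storage_bucket/storage_path`.
+2. Backfill `ui` from the payload and keep the summary `content`.
+3. If the storage read fails, fall back to `messages.content` so that history stays readable.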
+
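The three history steps above can be sketched as follows. The row shape and the `(bucket, path)` signature of the injected `read_json` callable are assumptions for illustration; the real `_to_snapshot_message` may differ.

```python
from typing import Any, Callable


def hydrate_tool_message(
    row: dict[str, Any], read_json: Callable[[str, str], dict[str, Any]]
) -> dict[str, Any]:
    """Build a history snapshot for a tool message, backfilling `ui` from storage."""
    meta = row.get("metadata") or {}
    snapshot = {
        "role": "tool",
        "content": row.get("content", ""),  # summary, always kept
        "toolCallId": meta.get("tool_call_id"),
        "ui": None,
    }
    bucket, path = meta.get("storage_bucket"), meta.get("storage_path")
    if not (bucket and path):
        return snapshot  # nothing to hydrate from
    try:
        payload = read_json(bucket, path)
    except Exception:
        # Storage read failed: return the summary-only snapshot.
        return snapshot
    snapshot["ui"] = payload.get("ui")
    return snapshot
```

Passing the reader in as a parameter keeps the function easy to unit test with a fake storage layer.
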
+## Design B: Frontend Wiring (runs/events/history)
+
+### runs
+
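+- `POST /api/v1/agent/runs` is only responsible for creating the run and starting execution.
+- The frontend keeps `threadId/runId` and local stream state; this path carries no rendering logic.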
+
+### events
+
+- SSE is the only source for realtime rendering.
+- `TOOL_CALL_RESULT` renders `ToolResultItem` directly from the `ui` inside the event.
+- `STEP_STARTED/STEP_FINISHED` display the three-stage status (intent/execution/report).
+
+### history
+
+- Replay via `/api/v1/agent/history` or `/api/v1/agent/runs/{threadId}/history`.
+- Tool messages read `ui` first (backfilled by the backend from metadata + storage).
+- User messages read `attachments` to render multimodal content.
+
+### Consistency Rule
+
+- Realtime events and history snapshots both go through the same `ChatListItem` mapping layer.
+- `content` is only fallback text, not the primary data for tool cards.
+
+## Design C: Backend Implementation Details
+
+### Modules to Change
+
+- `backend/src/core/agentscope/events/store.py`
+  - Add summary generation and storage upload for tool results.
+  - On `append_message`, write the summary content + metadata index.
+- `backend/src/core/agentscope/tools/tool_result_storage.py`
+  - Reuse the existing `upload_json/read_json` as the access layer for the full payload.
+- `backend/src/v1/agent/repository.py`
+  - `_to_snapshot_message` reads storage via metadata first for tool messages and backfills `ui`.
+- `backend/src/core/agentscope/runtime/agent_route_runtime.py`
+  - Ensure the `tool.result` event continues to carry `ui` and the summary `content`.
+
+### Failure Fallback
+
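+- Storage write failure: do not block the main flow; at minimum keep `messages.content` readable and mark the payload as missing in metadata.
+- Storage read failure: history returns the summary `content` and `ui` is empty.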
+
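A minimal sketch of the write-failure path, assuming injected `upload_json` and `append_message` callables with the signatures shown. The `storage_missing` flag is a hypothetical convention; the design only says that metadata marks the payload as missing.

```python
from typing import Any, Callable


def persist_tool_result(
    content: str,
    metadata: dict[str, Any],
    payload: dict[str, Any],
    *,
    upload_json: Callable[[str, str, dict[str, Any]], None],
    append_message: Callable[..., None],
) -> dict[str, Any]:
    """Upload the full payload, then persist the summary; never raise on upload errors."""
    meta = dict(metadata)  # do not mutate the caller's dict
    try:
        upload_json(meta["storage_bucket"], meta["storage_path"], payload)
    except Exception:
        # Hypothetical convention: drop the path fields and flag the payload as missing.
        meta.pop("storage_bucket", None)
        meta.pop("storage_path", None)
        meta["storage_missing"] = True
    # The summary is persisted in both cases, so history stays readable.
    append_message(role="tool", content=content, metadata=meta)
    return meta
```
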
+## Design D: content_summary Rule Engine
+
+### Function
+
+Add a new pure function:
+
+`build_tool_content_summary(tool_name, args, result, error) -> str`
+
+### Rules (Priority)
+
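+1. Errors first: if `error` is present, output a failure summary directly.
+2. Tool-specific templates:
+   - `calendar_write`: `已创建日程：{title}（{start_time}）` ("Event created: {title} ({start_time})")
+   - `calendar_read`: `查询到 {count} 条日程（{date_range}）` ("Found {count} events ({date_range})")
+   - `calendar_delete`: `已删除日程：{title_or_id}` ("Event deleted: {title_or_id}")
+   - `calendar_share`: `已分享日程给 {target}` ("Event shared with {target}")
+   - `user_resolve`: `已匹配用户：{name_or_id}` ("User matched: {name_or_id}")
+3. Generic fallback: prefer `result.content`; otherwise extract common keys and join them into a sentence.
+4. Final fallback: `{tool_name} 执行完成/执行失败` ("{tool_name} succeeded/failed").
+5. Cleanup: strip newlines and repeated spaces, limit the length, and avoid large JSON blobs.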
+
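The rule priority could be sketched as below. This is an illustration under stated assumptions, not the final rule table: `MAX_LEN` is a made-up limit, placeholder values are looked up by their literal names in `args`/`result`, and the "extract common keys" part of rule 3 is left out.

```python
import re
from typing import Any, Optional

MAX_LEN = 120  # hypothetical length limit

TEMPLATES = {
    "calendar_write": "已创建日程：{title}（{start_time}）",
    "calendar_read": "查询到 {count} 条日程（{date_range}）",
    "calendar_delete": "已删除日程：{title_or_id}",
    "calendar_share": "已分享日程给 {target}",
    "user_resolve": "已匹配用户：{name_or_id}",
}


def _clean(text: str) -> str:
    """Rule 5: collapse whitespace and cap the length."""
    text = re.sub(r"\s+", " ", text).strip()
    return text if len(text) <= MAX_LEN else text[: MAX_LEN - 1] + "…"


def build_tool_content_summary(
    *, tool_name: str, args: Optional[dict], result: Any, error: Any
) -> str:
    # Rule 1: errors take priority.
    if error:
        return _clean(f"{tool_name} 执行失败")
    result_dict = result if isinstance(result, dict) else {}
    fields = {**(args or {}), **result_dict}
    # Rule 2: tool-specific template, skipped when a placeholder is missing.
    template = TEMPLATES.get(tool_name)
    if template:
        try:
            return _clean(template.format_map(fields))
        except KeyError:
            pass
    # Rule 3 (partial): prefer result.content when it is a non-empty string.
    content = result_dict.get("content")
    if isinstance(content, str) and content.strip():
        return _clean(content)
    # Rule 4: final fallback.
    return _clean(f"{tool_name} 执行完成")
```

Falling through on a missing placeholder keeps a half-filled template out of `messages.content`.
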
+### Summary Storage Policy
+
+- `messages.content` stores the summary.
+- `summary_version` is stored in metadata to support future changes to the summary algorithm.
+
+## Testing and Acceptance
+
+### Backend
+
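+- Unit tests:
+  - `events/store`: tool result summary write, metadata path write, fallback on storage errors.
+  - `v1/agent/repository`: history backfills `ui` from metadata; falls back to content when storage is missing.
+  - Summary function: cover success, failure, missing-field, and overlong-text cases.
+- Integration tests:
+  - `/runs` + `/events`: realtime `TOOL_CALL_RESULT` carries `ui`.
+  - `/history`: the `ui` of returned tool messages comes from metadata + storage.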
+
+### Frontend
+
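+- Unit/component tests:
+  - `AgUiService` parses the `ui` of `TOOL_CALL_RESULT`.
+  - `ChatBloc`: both realtime events and history snapshots produce `ToolResultItem`.
+  - `UiSchemaRenderer`: cards render consistently on history replay.
+  - User message attachment rendering (history).
+- Page behavior checks:
+  - The message list updates in real time as events arrive.
+  - The three step stages switch correctly.
+  - Tool cards display correctly after pulling up to load history.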
+
+## Risks and Mitigations
+
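+- Risk: storage is unavailable, so history cards are missing.
+  - Mitigation: show the summary content as a fallback; do not block the conversation.
+- Risk: an event format change breaks frontend realtime parsing.
+  - Mitigation: keep the existing `ToolCallResultEvent` fields; no breaking renames.
+- Risk: the summary rules have insufficient coverage.
+  - Mitigation: version the rules and expand the test samples.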
+
+## Out of Scope
+
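+- The resume extension protocol and interaction strategy.
+- A new round of live smoke acceptance.
+- A new UI style refactor; this work only connects the pipeline and fixes the data contract.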
@@ -0,0 +1,283 @@
+# Agent UI Schema and Event Wiring Implementation Plan
+
+> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
+
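+**Goal:** Make agent tool results render consistently across realtime events and history replay: SSE carries the UI in real time, the persisted content stores a summary, and the full UI schema is kept in storage and backfilled via metadata.
+
+**Architecture:** The backend introduces a "summary + full payload split" strategy in the `TOOL_CALL_RESULT` persistence path: the summary is written to `messages.content`, the full payload goes to object storage, and metadata stores only the index path; on history reads, storage is looked up via metadata to backfill `ui`. The frontend reuses the existing AG-UI event model, maps all three paths (runs/events/history) onto `ChatListItem`, and adds step event rendering and multimodal history rendering.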
+
+**Tech Stack:** FastAPI, SQLAlchemy, AgentScope runtime/events, Supabase Storage, Flutter (Bloc/Cubit), Dart models/tests, AG-UI events
+
+---
+
+### Task 1: Add Tool Summary Rule Engine (Backend)
+
+**Files:**
+- Create: `backend/src/core/agentscope/events/tool_result_summary.py`
+- Test: `backend/tests/unit/core/agentscope/events/test_tool_result_summary.py`
+
+**Step 1: Write the failing test**
+
+```python
+from core.agentscope.events.tool_result_summary import build_tool_content_summary
+
+
+def test_calendar_write_summary() -> None:
+    text = build_tool_content_summary(
+        tool_name="calendar_write",
+        args={"title": "项目评审"},
+        result={"start_time": "明天 10:00"},
+        error=None,
+    )
+    assert text.startswith("已创建日程")
+```
+
+**Step 2: Run test to verify it fails**
+
+Run: `uv run pytest backend/tests/unit/core/agentscope/events/test_tool_result_summary.py -q`
+Expected: FAIL with import/module/function missing.
+
+**Step 3: Write minimal implementation**
+
+```python
+def build_tool_content_summary(*, tool_name: str, args, result, error) -> str:
+    if error:
+        return f"{tool_name} 执行失败"
+    if tool_name == "calendar_write":
+        return "已创建日程"
+    return f"{tool_name} 执行完成"
+```
+
+**Step 4: Run test to verify it passes**
+
+Run: `uv run pytest backend/tests/unit/core/agentscope/events/test_tool_result_summary.py -q`
+Expected: PASS.
+
+**Step 5: Extend tests for all rules and refactor**
+
+Add cases for `calendar_read/calendar_delete/calendar_share/user_resolve/error/fallback/truncation` and implement full rule table.
+
+**Step 6: Commit**
+
+```bash
+git add backend/src/core/agentscope/events/tool_result_summary.py backend/tests/unit/core/agentscope/events/test_tool_result_summary.py
+git commit -m "feat: add deterministic tool result summary engine"
+```
+
+### Task 2: Persist Full Tool Payload to Storage and Keep Content Lightweight
+
+**Files:**
+- Modify: `backend/src/core/agentscope/events/store.py`
+- Test: `backend/tests/unit/core/agentscope/events/test_store.py`
+
+**Step 1: Write the failing tests**
+
+Add tests asserting:
+- `TOOL_CALL_RESULT` persists summary to `content`.
+- metadata includes `storage_bucket/storage_path/tool_call_id`.
+- uploaded payload includes full `ui/args/result/error`.
+
+**Step 2: Run targeted tests (RED)**
+
+Run: `uv run pytest backend/tests/unit/core/agentscope/events/test_store.py -q`
+Expected: FAIL on new assertions.
+
+**Step 3: Implement minimal storage write path**
+
+In `_persist_tool_call_result`:
+- build `full_payload` from event fields.
+- call summary engine for `content`.
+- upload payload via tool result storage (inject dependency if needed).
+- store only path/index in metadata.
+
+**Step 4: Run tests (GREEN)**
+
+Run: `uv run pytest backend/tests/unit/core/agentscope/events/test_store.py -q`
+Expected: PASS.
+
+**Step 5: Add fallback test and implementation**
+
+Add case where storage upload fails but tool message still persists with summary and no crash.
+
+**Step 6: Commit**
+
+```bash
+git add backend/src/core/agentscope/events/store.py backend/tests/unit/core/agentscope/events/test_store.py
+git commit -m "feat: store tool payload in object storage with metadata index"
+```
+
+### Task 3: Hydrate History Tool UI from Metadata Storage Path
+
+**Files:**
+- Modify: `backend/src/v1/agent/repository.py`
+- Test: `backend/tests/unit/v1/agent/test_repository.py`
+
+**Step 1: Write failing tests**
+
+Add/adjust assertions:
+- history tool payload resolves `ui` from storage payload.
+- when storage missing, fallback to `messages.content` summary.
+
+**Step 2: Run tests (RED)**
+
+Run: `uv run pytest backend/tests/unit/v1/agent/test_repository.py -q`
+Expected: FAIL on `ui` hydration and fallback assertions.
+
+**Step 3: Implement minimal hydration logic**
+
+In `_to_snapshot_message` for tool role:
+- read storage via `metadata.storage_bucket/storage_path`.
+- map hydrated payload fields to snapshot (`ui`, `content`, `toolCallId`).
+- keep safe fallback when storage read fails.
+
+**Step 4: Run tests (GREEN)**
+
+Run: `uv run pytest backend/tests/unit/v1/agent/test_repository.py -q`
+Expected: PASS.
+
+**Step 5: Commit**
+
+```bash
+git add backend/src/v1/agent/repository.py backend/tests/unit/v1/agent/test_repository.py
+git commit -m "fix: hydrate tool ui from metadata storage in history snapshots"
+```
+
+### Task 4: Keep SSE TOOL_CALL_RESULT Compatible with Existing Frontend Parsing
+
+**Files:**
+- Modify: `backend/src/core/agentscope/runtime/agent_route_runtime.py`
+- Test: `backend/tests/unit/core/agentscope/runtime/test_agent_route_runtime.py`
+
+**Step 1: Write failing test**
+
+Add assertion that emitted `TOOL_CALL_RESULT` data contains expected renderable fields (`callId/toolName/result/error` and `ui` path from result payload).
+
+**Step 2: Run tests (RED)**
+
+Run: `uv run pytest backend/tests/unit/core/agentscope/runtime/test_agent_route_runtime.py -q`
+Expected: FAIL on missing/incorrect payload fields.
+
+**Step 3: Implement minimal payload normalization**
+
+Normalize tool result event payload so frontend can keep current parsing without contract breaks.
+
+**Step 4: Run tests (GREEN)**
+
+Run: `uv run pytest backend/tests/unit/core/agentscope/runtime/test_agent_route_runtime.py -q`
+Expected: PASS.
+
+**Step 5: Commit**
+
+```bash
+git add backend/src/core/agentscope/runtime/agent_route_runtime.py backend/tests/unit/core/agentscope/runtime/test_agent_route_runtime.py
+git commit -m "fix: preserve frontend-compatible tool result event payload"
+```
+
+### Task 5: Wire Frontend History + Events to Unified Rendering Path
+
+**Files:**
+- Modify: `apps/lib/features/chat/data/services/ag_ui_service.dart`
+- Modify: `apps/lib/features/chat/presentation/bloc/chat_bloc.dart`
+- Modify: `apps/lib/features/chat/data/models/tool_result.dart`
+- Modify: `apps/lib/features/home/ui/screens/home_screen.dart`
+- Test: `apps/test/features/chat/ag_ui_service_test.dart`
+- Create/Modify: `apps/test/features/chat/chat_bloc_test.dart`
+
+**Step 1: Write failing tests**
+
+Add tests asserting:
+- history tool message with `ui` becomes `ToolResultItem`.
+- SSE `TOOL_CALL_RESULT` with `ui` renders same item shape.
+- attachments in history user message are mapped for multimodal rendering.
+
+**Step 2: Run tests (RED)**
+
+Run: `cd apps && flutter test test/features/chat/ag_ui_service_test.dart`
+Expected: FAIL on new mapping assertions.
+
+**Step 3: Implement minimal mapping changes**
+
+- In service/bloc, unify history and event mapping into same conversion path.
+- Keep existing `UiSchemaRenderer` input format untouched.
+- Ensure fallback to content text when `ui` missing.
+
+**Step 4: Run tests (GREEN)**
+
+Run: `cd apps && flutter test test/features/chat/ag_ui_service_test.dart`
+Expected: PASS.
+
+**Step 5: Commit**
+
+```bash
+git add apps/lib/features/chat/data/services/ag_ui_service.dart apps/lib/features/chat/presentation/bloc/chat_bloc.dart apps/lib/features/chat/data/models/tool_result.dart apps/lib/features/home/ui/screens/home_screen.dart apps/test/features/chat/ag_ui_service_test.dart apps/test/features/chat/chat_bloc_test.dart
+git commit -m "feat: unify realtime and history tool card rendering"
+```
+
+### Task 6: Add Step Event Rendering for Intent/Execution/Report
+
+**Files:**
+- Modify: `apps/lib/features/chat/presentation/bloc/chat_bloc.dart`
+- Modify: `apps/lib/features/home/ui/screens/home_screen.dart`
+- Test: `apps/test/features/chat/chat_bloc_test.dart`
+
+**Step 1: Write failing test**
+
+Add test verifying `STEP_STARTED/STEP_FINISHED` transitions produce visible stage state.
+
+**Step 2: Run tests (RED)**
+
+Run: `cd apps && flutter test test/features/chat/chat_bloc_test.dart`
+Expected: FAIL on missing stage state.
+
+**Step 3: Implement minimal state and UI**
+
+- Track current stage enum in `ChatState`.
+- Render compact stage progress row in chat screen.
+
+**Step 4: Run tests (GREEN)**
+
+Run: `cd apps && flutter test test/features/chat/chat_bloc_test.dart`
+Expected: PASS.
+
+**Step 5: Commit**
+
+```bash
+git add apps/lib/features/chat/presentation/bloc/chat_bloc.dart apps/lib/features/home/ui/screens/home_screen.dart apps/test/features/chat/chat_bloc_test.dart
+git commit -m "feat: render agent step progress from AG-UI events"
+```
+
+### Task 7: Verification Gate (Backend + Frontend)
+
+**Files:**
+- Modify (if needed): `docs/plans/2026-03-11-agent-multimodal-smoke-runbook.md`
+
+**Step 1: Run backend targeted tests**
+
+Run: `uv run pytest backend/tests/unit/core/agentscope/events/test_tool_result_summary.py backend/tests/unit/core/agentscope/events/test_store.py backend/tests/unit/v1/agent/test_repository.py backend/tests/unit/core/agentscope/runtime/test_agent_route_runtime.py -q`
+Expected: PASS.
+
+**Step 2: Run frontend targeted tests**
+
+Run: `cd apps && flutter test test/features/chat/ag_ui_service_test.dart test/features/chat/chat_bloc_test.dart`
+Expected: PASS.
+
+**Step 3: Run backend quality checks**
+
+Run: `uv run ruff check backend/src backend/tests`
+Expected: PASS.
+
+**Step 4: Run backend type checks**
+
+Run: `uv run basedpyright`
+Expected: 0 errors.
+
+**Step 5: Update runbook evidence**
+
+Record changed contract, test evidence, and known follow-ups.
+
+**Step 6: Commit**
+
+```bash
+git add docs/plans/2026-03-11-agent-multimodal-smoke-runbook.md
+git commit -m "docs: record tool ui schema storage and rendering verification"
+```