Merge branch 'worktree/refactor-tool-cli-skill-ui-schema' into dev

2026-04-23 16:04:13 +08:00
parent dcb0eb4c65 f708bce585
commit bb36bdbb9b
151 changed files with 6725 additions and 5216 deletions
@@ -161,19 +161,28 @@ run filtering semantics:
  messages: Array<{
    id: string;
    seq: number;
-    role: "user" | "assistant";
+    role: "user" | "assistant" | "tool";
    content: string;
+    suggestedActions?: string[];
    attachments?: Array<{       // signed-URL list for user attachments
      mimeType: string;
      url: string;
    }>;
-    ui_schema?: object | null; // compiled UI for assistant messages
+    ui_schema?: object | null; // compiled UI for tool messages
    timestamp: string;         // ISO-8601
  }>;
 }
 ```

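-Tool messages are used in the storage layer to resume runtime context and are not returned by `/history`. On resume, `metadata.tool_agent_output` is the primary source of truth (`content` is a lightweight summary).
+`/history` returns tool messages so the UI can be rebuilt. A tool message's `ui_schema` is compiled from `metadata.tool_agent_output.ui_hints`.
+
+In the current protocol, `messages[].content` is always a string:
+
+- assistant: the answer text
+- tool: the JSON text projection of the tool result
+- user: the user's input text
+
+Structured fields (such as tool result / ui hints and suggested actions) are derived from metadata; `content` does not need to be upgraded to a JSON object.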

 Visibility notes:

@@ -204,8 +204,8 @@ interface ForwardedProps {

 | runtime_mode | Description | Pipeline | Difference |
 |--------------|------|----------|------|
-| `chat` | Standard chat mode | `router` -> `worker` | `enabled_tools` and `context` come from `system_agents.yaml` |
-| `automation` | Automation task mode | `router` -> `worker` | `enabled_tools` and `context` come from `AutomationJob.config` (injected via `runtime_config`) |
+| `chat` | Standard chat mode | `router` -> `worker` | `enabled_skills` and `context` come from `system_agents.yaml` |
+| `automation` | Automation task mode | `router` -> `worker` | `enabled_skills` and `context` come from `AutomationJob.config` (injected via `runtime_config`) |

 > `runtime_mode` only affects `RuntimeConfig` (tool list and context configuration); it does not change the execution stages. Both modes use the same fixed two-stage pipeline.

@@ -454,7 +454,7 @@ interface HistoryMessageAssistant {
  seq: number;
  role: "assistant";
  content: string;
-  ui_schema: UiSchemaRenderer | null; // compiled from agent_output.ui_hints
+  ui_schema: UiSchemaRenderer | null; // assistant text messages currently carry no UI by default; usually null
  timestamp: string;
 }

@@ -127,6 +127,11 @@ data: <json>

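 ### 3.3 Tool events

+Frontend rendering constraints (current implementation):
+
+- Tool UI rendering consumes only `TOOL_CALL_RESULT.ui_schema`.
+- `TOOL_CALL_START` / `TOOL_CALL_ARGS` / `TOOL_CALL_END` are kept only as execution-observation events; the frontend's main chat stream does not render intermediate-state cards.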
+
 #### `TOOL_CALL_START`

 ```json
@@ -184,18 +189,22 @@ data: <json>
  "tool_call_id": "...",
  "tool_call_args": {},
  "status": "success" | "failure" | "partial",
-  "result": "...",
-  "error": null
+  "result": {},
+  "error": null,
+  "ui_schema": {}
 }
 ```

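-Note: `TOOL_CALL_RESULT` no longer carries `ui_schema`. The tool result is delivered through the `result` field as compact, structured, actionable information (key facts such as id/status/count first), for the agent's subsequent reasoning and tool orchestration.
+Note: the `result` field of `TOOL_CALL_RESULT` provides compact, structured, actionable information (key facts such as id/status/count first) for the agent's subsequent reasoning and tool orchestration. If the corresponding tool output contains `ui_hints`, the backend compiles them into `ui_schema` at the codec layer and sends it with the event.
+
+Current `ui_hints` policy: generated only for the CRUD subcommands of the current canonical CLI (`calendar.create/read/update/delete`, `contacts.read`, `memory.update`); `calendar.share` does not generate `ui_hints`.

 Additional constraints:

 - `tool_call_id` must match the `toolCallId` of the same call's `TOOL_CALL_START/ARGS/END` events and must be unique per tool call.
 - `tool_call_args` represents only a snapshot of the input parameters.
 - `result` represents only execution output facts and does not repeat input parameters already present in `tool_call_args`.
+- `ui_schema` is the renderable UI wire format; its source data is `metadata.tool_agent_output.ui_hints`.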

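 #### 3.3.1 Tool name display convention (frontend localization)

@@ -206,21 +215,22 @@ Tool name fields in the SSE protocol are kept exactly as the backend emits them, with no server-side translation:

 The frontend display layer renders Chinese labels through a single tool-name localization mapping, which must accept both naming styles:

- dot style: `memory.write`, `calendar.read`
- snake style: `memory_write`, `calendar_read`
+- dot style: `memory.update`, `calendar.read`
+- snake style: `memory_update`, `calendar_read`

 The current canonical mapping (canonical -> Chinese label) is: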

 - `calendar.read` -> `读取日程`
- `calendar.write` -> `写入日程`
+- `calendar.create` -> `创建日程`
+- `calendar.update` -> `更新日程`
+- `calendar.delete` -> `删除日程`
 - `calendar.share` -> `共享日程`
- `user.lookup` -> `查找联系人`
- `memory.write` -> `写入记忆`
- `memory.forget` -> `清理记忆`
+- `contacts.read` -> `读取联系人`
+- `memory.update` -> `更新记忆`

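 Compatibility strategy:

-1. Normalize by alias first (for example `memory_write` -> `memory.write`)
+1. Normalize by alias first (for example `memory_update` -> `memory.update`)
 2. On a canonical-mapping hit, display the Chinese label
 3. On a miss, fall back to the raw tool name (preserves backward compatibility)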
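
The three-step strategy above can be sketched as a small lookup. This is a language-neutral illustration written in Python, not the frontend's actual code: the `ALIASES` table and the function name `display_tool_name` are assumptions, while the labels are copied from the canonical mapping listed above.

```python
# Hypothetical alias table: snake-style names normalized to canonical dot-style names.
ALIASES = {
    "calendar_read": "calendar.read",
    "memory_update": "memory.update",
}

# Canonical name -> Chinese display label (taken from the mapping in this section).
LABELS = {
    "calendar.read": "读取日程",
    "calendar.create": "创建日程",
    "calendar.update": "更新日程",
    "calendar.delete": "删除日程",
    "calendar.share": "共享日程",
    "contacts.read": "读取联系人",
    "memory.update": "更新记忆",
}


def display_tool_name(raw_name: str) -> str:
    """Return the localized label, or the raw tool name when nothing matches."""
    canonical = ALIASES.get(raw_name, raw_name)  # step 1: alias normalization
    return LABELS.get(canonical, raw_name)       # steps 2 and 3: hit, else fallback
```

Falling back to `raw_name` rather than to `canonical` keeps unknown tools displayed exactly as the backend sent them.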

@@ -242,11 +252,8 @@ Tool name fields in the SSE protocol are kept exactly as the backend emits them, with no server-side translation:
  "stage": "worker",
  "status": "success" | "partial_success" | "failed",
  "answer": "...",
-  "key_points": [],
-  "result_type": "execution_report" | "clarification" | "error_report" | "unknown",
  "suggested_actions": [],
-  "error": null,
-  "ui_schema": {}
+  "error": null
 }
 ```

@@ -427,5 +434,5 @@ WHERE (visibility_mask & 2) != 0
 | agent output visibility_mask | `UI_HISTORY \| CONTEXT_ASSEMBLY` (memory stage: `UI_HISTORY` only) | `UI_HISTORY` |
 | Included in /history | ✅ | ✅ (agent output only) |
 | Included in context assembly | ✅ (automatic) | ❌ (injected via run_input) |
-| enabled_tools source | `system_agents.yaml` worker config | `AutomationJob.config.enabled_tools` |
+| enabled_skills source | `system_agents.yaml` worker config | `AutomationJob.config.enabled_skills` |
 | context config source | `system_agents.yaml` router context_messages | `AutomationJob.config.context` |
@@ -0,0 +1,240 @@
+# Agent Tool Protocol
+
+This document defines the protocol boundaries between the AgentScope tool, the project CLI, the tool post-processor, and SSE/history/persistence in the current project.
+
+## 1. Scope
+
+This protocol covers:
+
+- AgentScope tool wrapper
+- the restricted in-project CLI tool
+- runtime tool post-processor
+- `ToolResponse`
+- `ToolAgentOutput`
+- tool message `content`
+- the SSE/history/persistence representation of tool results
+
+This protocol does not cover:
+
+- frontend visual implementation details
+- worker answer copy format
+- ordinary non-tool assistant text output
+
+## 2. Core Principles
+
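+1. The project CLI is a restricted tool execution boundary, not a general-purpose shell.
+2. The agent exposes exactly one AgentScope tool: `project_cli`.
+3. Skills only disclose to the agent how to use `project_cli`; they do not handle execution transport or permission decisions.
+4. The Router is the CLI's sole command-dispatch core and allows only whitelisted `command + subcommand` pairs.
+5. Every CLI subcommand is bound to a Python handler.
+6. Handlers may call only permitted internal capabilities; arbitrary system command execution is not exposed.
+6.1 `project_cli` command permissions are constrained jointly by the runtime `allowed_commands` and the CLI router whitelist; enabling a skill cannot implicitly open them up.
+7. `ToolAgentOutput.result` is the canonical machine-oriented tool result.
+8. `ToolResponse` does not carry the full `ToolAgentOutput`; it carries only the text projection intended for the agent.
+9. Tool UI comes only from `ToolAgentOutput.ui_hints` and no longer goes through the worker `ui_hints -> ui_schema` path.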
+
+## 3. Execution Flow
+
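+A tool call executes in the following order:
+
+1. The AgentScope tool `project_cli` receives the model-generated tool call.
+2. The wrapper maps `command + subcommand + args` to project CLI input.
+3. The runtime injects the controlled credential into the CLI subprocess through environment variables.
+4. The CLI router dispatches `(command, subcommand)` to the corresponding Python handler.
+5. The handler executes the business logic and returns a structured `result`.
+6. The wrapper projects `result` to text and writes it to `ToolResponse.content`.
+7. The runtime tool post-processor builds the full `ToolAgentOutput` from `result` and the runtime context.
+8. `ToolAgentOutput` flows into: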
+   - `TOOL_CALL_RESULT`
+   - `metadata.tool_agent_output`
+   - history replay
+   - context rebuild
+
+## 4. Input Channels
+
+Structured input from the agent to `project_cli`:
+
+```json
+{
+  "command": "calendar",
+  "subcommand": "read",
+  "args": {
+    "start_at": "2026-04-21T00:00:00+08:00",
+    "end_at": "2026-04-22T00:00:00+08:00"
+  }
+}
+```
+
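+CLI runtime input uses a combined-channel approach:
+
+- `argv` is primary:
+  - command
+  - subcommand
+  - mode / formatting flags
+- `stdin` is secondary:
+  - larger JSON payloads
+  - complex object arguments
+  - multi-step batch operation payloads
+- environment variables:
+  - controlled credential
+  - runtime-only internal context
+
+Constraints:
+
+- Authentication information that is hidden from the model must not enter tool args.
+- The CLI does not accept arbitrary token strings from natural language or model-supplied arguments.
+- The backend runtime may inject authentication credentials only through controlled environment variables.
+
+Permission boundary:
+
+- `enabled_skills` controls only skill document visibility and registration.
+- `allowed_commands` controls the set of commands `project_cli` may execute.
+- The two responsibilities are decoupled, avoiding the implicit coupling of "a visible skill means an authorized command".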
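
A minimal sketch of the permission boundary, assuming `allowed_commands` is a collection of `"command.subcommand"` strings (the real shape is not specified here). The whitelist entries are the canonical subcommands named in this document; `is_command_permitted` is a hypothetical helper name.

```python
from typing import Iterable, Set, Tuple

# Router whitelist of (command, subcommand) pairs, using names from this document.
ROUTER_WHITELIST: Set[Tuple[str, str]] = {
    ("calendar", "create"), ("calendar", "read"), ("calendar", "update"),
    ("calendar", "delete"), ("calendar", "share"),
    ("contacts", "read"), ("memory", "update"),
}


def is_command_permitted(
    command: str,
    subcommand: str,
    allowed_commands: Iterable[str],
    enabled_skills: Iterable[str] = (),
) -> bool:
    """A command runs only if the router whitelist AND allowed_commands both permit it.

    enabled_skills is accepted but deliberately never consulted: skill
    visibility must not grant command authorization.
    """
    in_whitelist = (command, subcommand) in ROUTER_WHITELIST
    in_allowed = f"{command}.{subcommand}" in set(allowed_commands)
    return in_whitelist and in_allowed
```

With this shape, enabling the `calendar` skill while leaving `allowed_commands` empty still yields `False` for every calendar command.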
+
+## 5. CLI Output Contract
+
+A CLI handler's raw success output must be a uniformly structured result.
+
+Example:
+
+```json
+{
+  "ok": true,
+  "command": "calendar",
+  "subcommand": "read",
+  "data": {
+    "items": [
+      {
+        "id": "evt_123",
+        "title": "Project sync",
+        "startAt": "2026-04-21T10:00:00+08:00"
+      }
+    ],
+    "count": 1
+  }
+}
+```
+
+On failure, the CLI handler must return a structured error result.
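
The success and failure envelopes can be illustrated with a small dispatch sketch. The success shape follows the example above; the `error` object's fields (`code`, `message`) and the stand-in handler are assumptions, since the protocol only requires the error to be structured.

```python
from typing import Any, Callable, Dict, Tuple

Handler = Callable[[Dict[str, Any]], Dict[str, Any]]


def calendar_read(args: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in handler: a real one would call the permitted internal capability.
    items = [{"id": "evt_123", "title": "Project sync",
              "startAt": "2026-04-21T10:00:00+08:00"}]
    return {"items": items, "count": len(items)}


# Whitelisted (command, subcommand) pairs, each bound to a Python handler.
HANDLERS: Dict[Tuple[str, str], Handler] = {
    ("calendar", "read"): calendar_read,
}


def dispatch(command: str, subcommand: str, args: Dict[str, Any]) -> Dict[str, Any]:
    envelope = {"command": command, "subcommand": subcommand}
    handler = HANDLERS.get((command, subcommand))
    if handler is None:
        # Hypothetical error shape for a command outside the whitelist.
        return {"ok": False, **envelope,
                "error": {"code": "unknown_command",
                          "message": f"{command}.{subcommand} is not whitelisted"}}
    try:
        return {"ok": True, **envelope, "data": handler(args)}
    except Exception as exc:  # handler failures also become structured errors
        return {"ok": False, **envelope,
                "error": {"code": "handler_error", "message": str(exc)}}
```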
+
+## 6. ToolResponse Contract
+
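+`ToolResponse` is used only to give the agent a text projection of the tool result.
+
+Rules:
+
+- `ToolResponse.content` contains only the complete JSON text projection of `result`.
+- The full `ToolAgentOutput` is no longer serialized into `ToolResponse.content`.
+- The text projection must carry the same information as `result`; no summarizing or truncation.
+
+Example: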
+
+```json
+{"status":"success","items":[{"id":"evt_123","title":"Project sync","startAt":"2026-04-21T10:00:00+08:00"}],"count":1}
+```
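
One way to produce such a projection, as a sketch only: compact `json.dumps` output keeps every field, so parsing the text yields the original `result` again. The function name `project_result` is hypothetical.

```python
import json
from typing import Any, Dict


def project_result(result: Dict[str, Any]) -> str:
    """Serialize result to compact JSON text without dropping or summarizing fields."""
    return json.dumps(result, ensure_ascii=False, separators=(",", ":"))


result = {
    "status": "success",
    "items": [{"id": "evt_123", "title": "Project sync",
               "startAt": "2026-04-21T10:00:00+08:00"}],
    "count": 1,
}
content = project_result(result)
```

`ensure_ascii=False` keeps non-ASCII titles readable in the agent's context, and the round trip `json.loads(content) == result` is a cheap check that the projection lost nothing.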
+
+## 7. Tool Post-Processor Contract
+
+The runtime must run the tool post-processor after the tool call completes.
+
+The post-processor builds the full `ToolAgentOutput`, which includes at least:
+
+- `tool_name` (fixed to `project_cli`)
+- `tool_call_id`
+- `tool_call_args`
+- `status`
+- `result`
+- `error`
+- `ui_hints`
+
+Rules:
+
+- `result` is the source of truth.
+- `result` should retain `command`, `subcommand`, and `data`.
+- `ui_hints` are generated by the post-processor, not by the worker.
+- On tool failure, `error` must be a structured object.
+- `status` must be `success | failure | partial`.
+
+`ui_hints` output scope (current protocol):
+
+- Emitted: CRUD calls among the current canonical CLI subcommands
+  - `calendar.create`
+  - `calendar.read`
+  - `calendar.update`
+  - `calendar.delete`
+  - `contacts.read`
+  - `memory.update`
+- Not emitted: non-CRUD subcommands (for example `calendar.share`)
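
A rough sketch of a post-processor that follows these rules. The `ui_hints` payload shape, the `ok`-based status derivation, and the function name are all assumptions made for illustration; the real `ui_hints` schema is not given here, and `partial` status is not modeled.

```python
from typing import Any, Dict, Optional

# Subcommands for which ui_hints are emitted under the current protocol.
UI_HINT_SUBCOMMANDS = {
    "calendar.create", "calendar.read", "calendar.update",
    "calendar.delete", "contacts.read", "memory.update",
}


def post_process(
    tool_call_id: str,
    tool_call_args: Dict[str, Any],
    result: Dict[str, Any],
    error: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a ToolAgentOutput-like dict from the handler result."""
    key = f"{result.get('command')}.{result.get('subcommand')}"
    # Assumed derivation: any structured error or ok=False means failure.
    failed = error is not None or result.get("ok") is False
    ui_hints = None
    if key in UI_HINT_SUBCOMMANDS:
        # Hypothetical hint payload; only the emit/skip decision reflects the text.
        ui_hints = {"subcommand": key, "data": result.get("data")}
    return {
        "tool_name": "project_cli",  # fixed by the protocol
        "tool_call_id": tool_call_id,
        "tool_call_args": tool_call_args,
        "status": "failure" if failed else "success",
        "result": result,            # kept whole: command, subcommand, data
        "error": error,
        "ui_hints": ui_hints,
    }
```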
+
+## 8. ToolAgentOutput Contract
+
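+`ToolAgentOutput` is consumed inside the system and by the frontend; it is not the primary input to the model context.
+
+Consumers:
+
+- `TOOL_CALL_RESULT`
+- database storage in `metadata.tool_agent_output`
+- `/history` tool UI replay
+- cold-path context rebuild
+
+Rules:
+
+- `tool_name` is fixed to `project_cli`.
+- `result` must be a JSON-native, machine-oriented structure.
+- It must include the facts needed for subsequent chained calls, such as ID/outcome/status/count.
+- `ui_hints` is the single source of truth for tool UI.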
+
+## 9. History Replay Contract
+
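+`/history` must support tool UI replay.
+
+Rules:
+
+- history returns tool messages to clients.
+- A tool message's UI is restored by reading `metadata.tool_agent_output.ui_hints` and compiling it to `ui_schema` before returning.
+- The tool message `content` remains the JSON text projection of `result`.
+
+### 9.1 Storage type decision for `messages.content`
+
+- Current decision: `messages.content` stays `text` and is not migrated to `jsonb`.
+- Reasons:
+  - the `messages` table holds multi-role messages (user/assistant/tool), so a uniform text payload in `content` is more stable;
+  - a tool's structured data is already carried by `metadata.tool_agent_output.result` and `metadata.tool_agent_output.ui_hints`;
+  - `/history`, SSE, and context rebuild all currently work on "`content` text + structured metadata fields", which avoids dual-track schema evolution;
+  - writing a dict directly into `messages.content` has been observed to cause a driver type error (`expected str, got dict`); keeping `text` reduces write ambiguity.
+- Constraint: anything written to `messages.content` must be a string; structured objects must go into `metadata`.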
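
The constraint can be made concrete with a sketch of how a tool message row might be assembled before it is written. The row layout and the helper name `build_tool_message_row` are assumptions; only the split between string `content` and structured `metadata` comes from the text.

```python
import json
from typing import Any, Dict


def build_tool_message_row(tool_agent_output: Dict[str, Any]) -> Dict[str, Any]:
    """Keep content a string; put the structured object under metadata."""
    content = json.dumps(
        tool_agent_output["result"], ensure_ascii=False, separators=(",", ":")
    )
    return {
        "role": "tool",
        "content": content,  # always str, never dict
        "metadata": {"tool_agent_output": tool_agent_output},
    }


row = build_tool_message_row({
    "tool_name": "project_cli",
    "result": {"command": "memory", "subcommand": "update", "data": {"id": "m_1"}},
    "ui_hints": None,
})
```

Serializing at this single point means no caller hands a dict to the `content` column, which is the failure mode described in the reasons above.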
+
+## 10. SSE Contract
+
+Rules:
+
+- `TEXT_MESSAGE_END` no longer contains worker `ui_hints` or `ui_schema`.
+- `TOOL_CALL_RESULT` carries `ui_schema` (compiled from `ui_hints` by the backend codec).
+- Frontend tool UI rendering is based on `ui_schema` (compiled from `ui_hints`).
+
+## 11. Controlled Credential Contract
+
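+The tool runtime's authentication boundary uses a controlled credential.
+
+Rules:
+
+- Neither chat nor automation may treat `owner_id` as a credential.
+- The controlled credential is issued by the issuer of the current bearer token, within the same trust boundary.
+- Target TTL is 5-10 minutes.
+- The credential covers only the permission window the tool runtime needs.
+- The credential is injected into the CLI only through backend-controlled env.
+- The raw credential must not be exposed in logs, error responses, history, or SSE.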
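
A sketch of env-only credential injection, assuming the CLI runs as a child process. The variable name `PROJECT_CLI_CREDENTIAL` and both helper names are hypothetical; the credential is passed neither in `argv` nor in `stdin`, and the parent's own environment is left unmodified.

```python
import os
import subprocess
import sys
from typing import List

CREDENTIAL_ENV = "PROJECT_CLI_CREDENTIAL"  # hypothetical variable name


def run_cli(argv: List[str], stdin_payload: str,
            credential: str) -> subprocess.CompletedProcess:
    """Run the CLI child with the credential visible only in its environment."""
    env = dict(os.environ)            # copy, so the parent env stays untouched
    env[CREDENTIAL_ENV] = credential
    return subprocess.run(argv, input=stdin_payload, env=env,
                          capture_output=True, text=True, check=False)


def redact(text: str, credential: str) -> str:
    """Mask the raw credential before text reaches logs, errors, history, or SSE."""
    return text.replace(credential, "[REDACTED]")


# Usage: a stand-in child process that only reports whether it received the credential.
child = [sys.executable, "-c",
         "import os; print(os.environ.get('PROJECT_CLI_CREDENTIAL') == 'cred-abc')"]
proc = run_cli(child, "", "cred-abc")
```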
+
+## 12. Security Constraints
+
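+- No shell execution capability is exposed.
+- Arbitrary tokens must not be passed through tool args.
+- A user Bearer token must not be forged from `owner_id`.
+- A DB session must not be injected directly across the CLI boundary.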
+
+## 13. Compatibility Strategy
+
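+- Strategy: `backward-compatible`.
+- `ui_schema` is retained as the wire format, compiled from `ui_hints` by the backend codec.
+- The frontend renderer continues to consume `ui_schema`.
+- `ui_hints` is an internal field and is not sent directly to the frontend.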
@@ -14,7 +14,7 @@ scheduler computation, and Flutter settings pages.
 - `is_system`: boolean (`true` when `bootstrap_key != null`; read-only derived field)
 - `config`: object
  - `input_template`: string
-  - `enabled_tools`: string[]
+  - `enabled_skills`: string[] (`calendar | contacts | memory`)
  - `context`: object
    - `source`: `latest_chat`
    - `window_mode`: `day | number`
@@ -93,12 +93,12 @@ data: <json>

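 ### 5.2 UI compiler consistency

-Both paths use the backend `ui_compiler.compile(...)` to compile the **worker**'s `ui_hints` into a renderable structure:
+Both paths use the backend `ui_compiler.compile(...)` to compile the **tool**'s `ui_hints` into a renderable structure:

 - events: compiled before the runtime sends the event; the field name is `ui_schema`
 - history: compiled during history conversion; the field name is `ui_schema`

-Tool results no longer go through the UI compilation path: `TOOL_CALL_RESULT` provides the `tool_call_args` + `result` pair.
+Tool results go through the UI compilation path: `TOOL_CALL_RESULT` keeps `tool_call_args` + `result` and may also carry `ui_schema`.

 - `metadata.tool_agent_output` is the complete source of truth for tool messages (used for runtime observation and history replay).
 - `message.content` stays a lightweight summary (currently equal to `result`).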
@@ -126,10 +126,9 @@ Tool results no longer go through the UI compilation path: `TOOL_CALL_RESULT` provides `tool_call_args

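 ### 7.1 Backend generation

- The runtime uses `ui_hints.action.type = navigation` to produce navigation actions.
+- The runtime produces navigation actions from `ui_hints.action.type = navigation` in the tool output.
 - After compilation, `ui_schema` keeps `action.type = navigation`, `action.path`, and `action.params`.
- Routes should be constrained by the backend static route catalog:
-  - `backend/src/core/config/static/route/frontend_routes.yaml`
+- The tool capability provides the concrete path directly; the agent itself does not need to maintain a route_id concept.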

 ### 7.2 Frontend consumption (unified parsing rules)

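@@ -144,12 +143,12 @@ Tool results no longer go through the UI compilation path: `TOOL_CALL_RESULT` provides `tool_call_args
 - Key business actions (create, edit, share, handle invitations, etc.) should preferably be designed as deep-linkable page routes rather than existing only in transient overlays.
 - If the UI uses a sheet-style presentation, the page route should still hold the state, with the sheet rendered as an in-page surface.
 - `todo.edit` must be implemented as a standalone sub-page (`/todo/{id}/edit`); the main edit flow should not live in a dialog inside the detail page.
- Backends should preferably use the following route_ids to generate navigation (examples):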
-  - `calendar.event_create` -> `/calendar/events/new`
-  - `calendar.event_edit` -> `/calendar/events/{id}/edit`
-  - `calendar.event_share` -> `/calendar/events/{id}/share`
-  - `todo.create` -> `/todo/new`
-  - `todo.edit` -> `/todo/{id}/edit`
+- Tool capabilities should preferably output the following concrete paths (examples):
+  - `/calendar/events/new`
+  - `/calendar/events/{id}/edit`
+  - `/calendar/events/{id}/share`
+  - `/todo/new`
+  - `/todo/{id}/edit`

 ### 7.4 Recommended constraints

@@ -294,8 +294,8 @@ interface NavigateAction {
 // 2) path MUST NOT include query string or fragment.
 // 3) params, when provided, is treated as query params only.
 // 4) params values MUST be scalar (string | number | boolean).
-// 5) Backend MUST generate path from route catalog
-//    `backend/src/core/config/static/route/frontend_routes.yaml`.
+// 5) Backend/tool layer MUST generate concrete internal path directly.
+//    Agent prompt does not carry route catalog contract.

 // URL action
 interface UrlAction {