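qzl d2d292a99e fix(agent): fix the skill action card calling convention, add strongly typed memory validation, and clean up dead code

- All calendar action .md files: replace skill/action with module/method plus a mode field
- handler_memory: add Pydantic extra=forbid models to replace hand-written dict validation
- memory/SKILL.md: document every field of UserMemoryContent/WorkProfileContent
- Remove the dead code _batch_status from handler_calendar and the legacy runner alias AgentScopeReActRunner
- Align PRD §5.2-5.6 and the sse-events protocol with the actual module/method implementation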

2026-04-24 14:10:57 +08:00


Agent SSE Events

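This document describes the event protocol of GET /api/v1/agent/runs/{thread_id}/events.

The current protocol requires every SSE subscription to carry runId explicitly; event output is isolated per run.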

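1) Event pipeline

Backend events flow as follows:

The Runtime emits AG-UI events directly (e.g. RUN_STARTED, TOOL_CALL_RESULT)
agui_codec only performs protocol alignment and field sanitization (for example, removing backend-internal statistics fields)
Each event is simultaneously:
- persisted to the database (for history)
- published to a Redis Stream (for SSE)
/runs/{thread_id}/events reads from the Redis Stream and outputs SSE (only events for the target runId)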
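The fan-out described above can be sketched with in-memory stand-ins. This is only an illustration: `InMemoryPipeline`, `emit`, and `read_after` are hypothetical names, the lists stand in for the database and the Redis Stream, and the `"<n>-0"` IDs merely imitate the shape of Redis stream IDs.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

Event = dict[str, Any]


@dataclass
class InMemoryPipeline:
    """Hypothetical stand-in for the database + Redis Stream fan-out."""

    history: list[Event] = field(default_factory=list)             # stands in for the database
    stream: list[tuple[str, Event]] = field(default_factory=list)  # stands in for the Redis Stream

    def emit(self, event: Event) -> str:
        """Persist the event and publish it under a new stream ID."""
        stream_id = f"{len(self.stream) + 1}-0"  # simplified, Redis-like ID shape
        self.history.append(dict(event))
        self.stream.append((stream_id, dict(event)))
        return stream_id

    def read_after(self, last_event_id: Optional[str] = None) -> list[tuple[str, Event]]:
        """Return stream entries after last_event_id (all entries if it is unknown)."""
        ids = [sid for sid, _ in self.stream]
        start = ids.index(last_event_id) + 1 if last_event_id in ids else 0
        return self.stream[start:]
```

A reader that reconnects would pass the last `id` it saw to `read_after`, which mirrors how a stream ID lets a subscriber resume without replaying earlier entries.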

2) SSE frame format

Each event follows standard SSE:

id: <redis-stream-id>
event: <AG-UI-EVENT-TYPE>
data: <json>

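id can be used to resume the stream (Last-Event-ID)
event matches the type field inside the JSON (e.g. RUN_STARTED)
An event is sent out only when the runId in its payload matches the query runId
Keep-alive comment frames may appear while the stream is idle: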

: keep-alive
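To make the frame layout concrete, here is a minimal sketch of writing and reading one frame. `format_sse_frame` and `parse_sse_frame` are illustrative helper names, not part of the backend; the parser assumes single-line `data` fields, which holds for compact JSON.

```python
import json
from typing import Any, Optional


def format_sse_frame(stream_id: str, event: dict[str, Any]) -> str:
    """Serialize one event as an SSE frame: id, event, data, then a blank line."""
    data = json.dumps(event, ensure_ascii=False, separators=(",", ":"))
    return f"id: {stream_id}\nevent: {event['type']}\ndata: {data}\n\n"


def parse_sse_frame(frame: str) -> Optional[dict[str, Any]]:
    """Parse one frame; return None for comment-only frames such as ': keep-alive'."""
    fields: dict[str, str] = {}
    for line in frame.splitlines():
        if not line or line.startswith(":"):
            continue  # blank separator or SSE comment line
        name, _, value = line.partition(":")
        # SSE strips a single leading space after the colon.
        fields[name] = value[1:] if value.startswith(" ") else value
    if "data" not in fields:
        return None
    return {
        "id": fields.get("id"),
        "event": fields.get("event"),
        "data": json.loads(fields["data"]),
    }
```

A client can skip any frame for which `parse_sse_frame` returns `None`, which covers the keep-alive comments.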

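3) Event types (current implementation)

3.0 Filtering and termination rules

The request parameter runId identifies the target run of this subscription.
The server forwards only events where event.runId == runId.
The SSE connection ends only after the target run receives a terminal event:
- RUN_FINISHED
- RUN_ERROR
Events from other runs (including terminal ones) do not end the current connection.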
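These rules amount to a filter plus a stop condition. The generator below is a sketch under that reading; `events_for_run` is a hypothetical name and the input is any iterable of already-decoded event dicts.

```python
from typing import Any, Iterable, Iterator

TERMINAL_TYPES = {"RUN_FINISHED", "RUN_ERROR"}


def events_for_run(events: Iterable[dict[str, Any]], run_id: str) -> Iterator[dict[str, Any]]:
    """Yield only the target run's events and stop after its terminal event."""
    for event in events:
        if event.get("runId") != run_id:
            continue  # other runs are never forwarded, terminal or not
        yield event
        if event.get("type") in TERMINAL_TYPES:
            return  # the connection ends only on the target run's terminal event
```

Note that the terminal event itself is forwarded before the stream closes, so the subscriber always sees how the run ended.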

3.1 Run lifecycle

`RUN_STARTED`

{
  "type": "RUN_STARTED",
  "threadId": "...",
  "runId": "..."
}

`RUN_FINISHED`

{
  "type": "RUN_FINISHED",
  "threadId": "...",
  "runId": "..."
}

`RUN_ERROR`

{
  "type": "RUN_ERROR",
  "threadId": "...",
  "runId": "...",
  "message": "runtime execution failed",
  "code": null
}

Cancellation semantics (current implementation):

{
  "type": "RUN_ERROR",
  "threadId": "...",
  "runId": "...",
  "message": "run canceled by user",
  "code": "RUN_CANCELED"
}

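Note: RUN_CANCELED indicates that the user actively interrupted the run; at this stage the backend still reuses the session's failed status for compatibility.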
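Because cancellation arrives as a RUN_ERROR, a consumer has to inspect `code` to tell it apart from a real failure. A small sketch of that check follows; `classify_run_end` and its return labels are illustrative, not part of the protocol.

```python
from typing import Any


def classify_run_end(event: dict[str, Any]) -> str:
    """Return 'finished', 'canceled', or 'failed' for a terminal event."""
    if event["type"] == "RUN_FINISHED":
        return "finished"
    if event["type"] == "RUN_ERROR":
        # User cancellation is signalled through code, not through a separate event type.
        return "canceled" if event.get("code") == "RUN_CANCELED" else "failed"
    raise ValueError(f"not a terminal event: {event['type']}")
```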

3.2 Step events

`STEP_STARTED`

{
  "type": "STEP_STARTED",
  "threadId": "...",
  "runId": "...",
  "stepName": "router" | "worker"
}

`STEP_FINISHED`

{
  "type": "STEP_FINISHED",
  "threadId": "...",
  "runId": "...",
  "stepName": "router" | "worker"
}

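3.3 Tool events

Frontend rendering constraints (current implementation):

Tool UI rendering consumes only TOOL_CALL_RESULT.ui_schema.
TOOL_CALL_START / TOOL_CALL_ARGS / TOOL_CALL_END are kept only as execution-observation events; the frontend's main chat stream does not render intermediate-state cards.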

`TOOL_CALL_START`

{
  "type": "TOOL_CALL_START",
  "threadId": "...",
  "runId": "...",
  "messageId": "...",
  "toolCallId": "...",
  "toolCallName": "...",
  "stage": "worker"
}

`TOOL_CALL_ARGS`

{
  "type": "TOOL_CALL_ARGS",
  "threadId": "...",
  "runId": "...",
  "messageId": "...",
  "toolCallId": "...",
  "toolCallName": "...",
  "args": {},
  "stage": "worker"
}

`TOOL_CALL_END`

{
  "type": "TOOL_CALL_END",
  "threadId": "...",
  "runId": "...",
  "messageId": "...",
  "toolCallId": "...",
  "toolCallName": "...",
  "stage": "worker"
}

`TOOL_CALL_RESULT`

{
  "type": "TOOL_CALL_RESULT",
  "threadId": "...",
  "runId": "...",
  "messageId": "...",
  "role": "tool",
  "stage": "worker",
  "tool_name": "...",
  "tool_call_id": "...",
  "tool_call_args": {},
  "status": "success" | "failure" | "partial",
  "result": {},
  "error": null,
  "ui_schema": {}
}

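Note: the result field of TOOL_CALL_RESULT carries compact, structured, actionable information (key facts such as id/status/count take priority) for the agent's subsequent reasoning and tool orchestration. If the corresponding tool output contains ui_hints, the backend compiles them into ui_schema at the codec layer and sends it with the event.

Current ui_hints policy: generated only for canonical methods that currently have stable display semantics, for example calendar.read, calendar.create, calendar.update, calendar.delete, calendar.share, calendar.accept_invite, calendar.reject_invite, contacts.read, memory.update.

Protocol migration notes:

The model-side canonical structure of tool_call_args has been unified as module/method/input.
The SSE event field name tool_call_args is unchanged, but the shape of its inner object follows the current project_cli protocol.
Frontends and debugging tools must no longer assume that tool_call_args.command / tool_call_args.subcommand exist.

Additional constraints:

tool_call_id must match the toolCallId of the TOOL_CALL_START/ARGS/END events of the same call and must be unique per tool call.
tool_call_args represents only a snapshot of the input parameters.
result represents only the facts of the execution output and does not repeat input parameters already present in tool_call_args.
ui_schema is the renderable UI wire format; its source data comes from metadata.tool_agent_output.ui_hints.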
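A debugging tool could check the ID constraint with a small pass over the event stream. The sketch below assumes that TOOL_CALL_START arrives before the other events of the same call, which the protocol text does not state explicitly; `check_tool_call_ids` is a hypothetical helper.

```python
from typing import Any, Iterable


def check_tool_call_ids(events: Iterable[dict[str, Any]]) -> list[str]:
    """Return a list of human-readable problems; an empty list means consistent IDs."""
    started: set[str] = set()
    problems: list[str] = []
    for event in events:
        kind = event.get("type")
        if kind == "TOOL_CALL_START":
            call_id = event.get("toolCallId")
            if call_id in started:
                problems.append(f"duplicate toolCallId: {call_id}")
            started.add(call_id)
        elif kind in ("TOOL_CALL_ARGS", "TOOL_CALL_END"):
            call_id = event.get("toolCallId")
            if call_id not in started:
                problems.append(f"{kind} without START: {call_id}")
        elif kind == "TOOL_CALL_RESULT":
            call_id = event.get("tool_call_id")  # snake_case on the result event
            if call_id not in started:
                problems.append(f"TOOL_CALL_RESULT without START: {call_id}")
    return problems
```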

Recommended tool_call_args shape:

{
  "module": "calendar",
  "method": "read",
  "input": {
    "mode": "event",
    "event_id": "evt_123"
  }
}
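A consumer that reads this shape defensively might look like the sketch below. `describe_tool_call` is a hypothetical helper; it touches only module/method/input and never the legacy command/subcommand keys.

```python
from typing import Any


def describe_tool_call(tool_call_args: dict[str, Any]) -> tuple[str, dict[str, Any]]:
    """Return ("module.method", input) from canonical args, tolerating missing keys."""
    module = tool_call_args.get("module")
    method = tool_call_args.get("method")
    name = f"{module}.{method}" if module and method else "unknown"
    payload = tool_call_args.get("input")
    return name, payload if isinstance(payload, dict) else {}
```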

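3.3.1 Tool name display convention (frontend localization)

Tool name fields in the SSE protocol keep the backend's original values; no server-side translation is applied:

TOOL_CALL_START/ARGS/END.toolCallName
TOOL_CALL_RESULT.tool_name

The frontend display layer renders Chinese labels through a single tool-name localization mapping, which must support two naming styles:

dot style: memory.update, calendar.read
snake style: memory_update, calendar_read

The current canonical mapping (canonical -> Chinese label) is:

calendar.read -> 读取日程 (read events)
calendar.create -> 创建日程 (create event)
calendar.update -> 更新日程 (update event)
calendar.delete -> 删除日程 (delete event)
calendar.share -> 邀请参与者 (invite participants)
calendar.accept_invite -> 接受邀请 (accept invitation)
calendar.reject_invite -> 拒绝邀请 (reject invitation)
contacts.read -> 读取联系人 (read contacts)
memory.update -> 更新记忆 (update memory)

Compatibility strategy:

Normalize by alias first (e.g. memory_update -> memory.update)
Show the Chinese label when the canonical mapping matches
Fall back to the raw tool name when there is no match (to preserve backward compatibility)

This convention constrains display only; it does not change wire event field definitions or values.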
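The lookup order above can be sketched as follows. It is written in Python for consistency with the other examples here, although the real mapping lives in the frontend; deriving the alias table from the canonical names is one possible approach, not necessarily the one the frontend uses.

```python
TOOL_LABELS: dict[str, str] = {
    "calendar.read": "读取日程",
    "calendar.create": "创建日程",
    "calendar.update": "更新日程",
    "calendar.delete": "删除日程",
    "calendar.share": "邀请参与者",
    "calendar.accept_invite": "接受邀请",
    "calendar.reject_invite": "拒绝邀请",
    "contacts.read": "读取联系人",
    "memory.update": "更新记忆",
}

# snake-style alias -> canonical dot-style name, e.g. memory_update -> memory.update
ALIASES: dict[str, str] = {name.replace(".", "_"): name for name in TOOL_LABELS}


def display_tool_name(raw: str) -> str:
    """Normalize by alias, then map to the label; fall back to the raw name."""
    canonical = ALIASES.get(raw, raw)
    return TOOL_LABELS.get(canonical, raw)
```

Building the alias table from the canonical keys avoids guessing which underscore separates module from method in names such as calendar_accept_invite.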

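3.4 Text completion event

`TEXT_MESSAGE_END`

The current implementation sends the complete result only after the worker finishes its output; it does not send token delta events.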

{
  "type": "TEXT_MESSAGE_END",
  "threadId": "...",
  "runId": "...",
  "messageId": "...",
  "role": "assistant",
  "stage": "worker",
  "status": "success" | "partial_success" | "failed",
  "answer": "...",
  "suggested_actions": [],
  "error": null
}

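inputTokens, outputTokens, cost, latencyMs, and model are backend-internal statistics fields and are not exposed in the external SSE protocol.

Additional notes on the current implementation:

On the wire, the TEXT_MESSAGE_END payload includes usage summary fields such as totalTokens, cachedPromptTokens, promptCacheHitTokens, promptCacheMissTokens, reasoningTokens, costSource, and usageComplete, for use by the frontend observability panel.
These fields come from the backend usage normalization layer and are part of the AG-UI event data; they do not change the main structure of TEXT_MESSAGE_END.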
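The split between hidden and exposed fields can be illustrated with a minimal sanitization step. This is a sketch of the idea only: `sanitize_for_wire` is a hypothetical name and is not claimed to match how agui_codec is actually written.

```python
from typing import Any

# Fields this document names as backend-internal statistics.
INTERNAL_FIELDS = frozenset({"inputTokens", "outputTokens", "cost", "latencyMs", "model"})


def sanitize_for_wire(event: dict[str, Any]) -> dict[str, Any]:
    """Drop backend-internal fields; everything else, including usage summaries, passes through."""
    return {key: value for key, value in event.items() if key not in INTERNAL_FIELDS}
```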

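5) Usage audit protocol (backend-internal)

This section describes the backend's internal auditing and billing strategy for LLM usage. The protocol serves database persistence, cost accounting, and runtime observability; it is not directly exposed in the external SSE protocol.

5.1 Current provider scope

DashScope (Qwen)
DeepSeek

The current implementation is deeply adapted only for these two providers.

5.2 Raw field collection (Provider -> Runtime)

TrackingChatModel reads the provider's direct fields first and fills in from metadata only when a field cannot be read.

Priority order:

Direct fields (preferred)
- usage.input_tokens
- usage.output_tokens
- usage.total_tokens
- usage.time (seconds)
- usage.cost (if present)
metadata fields (fill-in)
- metadata.prompt_tokens
- metadata.completion_tokens
- metadata.total_tokens
- metadata.prompt_tokens_details.cached_tokens
- metadata.prompt_cache_hit_tokens
- metadata.prompt_cache_miss_tokens
- metadata.completion_tokens_details.reasoning_tokens
- metadata.cost / metadata.total_cost (if present)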
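The "direct field first, metadata second" rule can be sketched as below. The pairing of usage.input_tokens with metadata.prompt_tokens (and output with completion) is an assumption based on the field names, and `collect_usage` is a hypothetical function rather than the actual TrackingChatModel code.

```python
from typing import Any


def first_present(*values: Any) -> Any:
    """Return the first value that is not None, or None if all are missing."""
    for value in values:
        if value is not None:
            return value
    return None


def collect_usage(usage: dict[str, Any], metadata: dict[str, Any]) -> dict[str, Any]:
    """Prefer direct usage fields; fill gaps from metadata."""
    prompt_details = metadata.get("prompt_tokens_details") or {}
    completion_details = metadata.get("completion_tokens_details") or {}
    seconds = usage.get("time")
    return {
        "input_tokens": first_present(usage.get("input_tokens"), metadata.get("prompt_tokens")),
        "output_tokens": first_present(usage.get("output_tokens"), metadata.get("completion_tokens")),
        "total_tokens": first_present(usage.get("total_tokens"), metadata.get("total_tokens")),
        "latency_ms": None if seconds is None else seconds * 1000,  # usage.time is in seconds
        "cached_prompt_tokens": prompt_details.get("cached_tokens"),
        "prompt_cache_hit_tokens": metadata.get("prompt_cache_hit_tokens"),
        "prompt_cache_miss_tokens": metadata.get("prompt_cache_miss_tokens"),
        "reasoning_tokens": completion_details.get("reasoning_tokens"),
        "direct_cost": first_present(
            usage.get("cost"), metadata.get("cost"), metadata.get("total_cost")
        ),
    }
```

Testing for `None` rather than truthiness matters here, since a legitimate value of 0 tokens must not trigger the metadata fallback.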

5.3 Normalized internal usage_summary fields

TrackingChatModel.usage_summary() currently outputs:

input_tokens
output_tokens
total_tokens
latency_ms (converted from usage.time * 1000)
cached_prompt_tokens
prompt_cache_hit_tokens
prompt_cache_miss_tokens
reasoning_tokens
direct_cost
direct_cost_observed (0/1)
direct_cost_complete (0/1)
model_call_records
usage_records
direct_cost_records
cost_source (provider | catalog_fallback)

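5.4 Cost calculation strategy (rigor first)

Core principle: use the provider-returned cost directly whenever it is available; fall back only when it is missing.

LiteLLMService.build_usage_metadata() applies these rules:

The provider's direct cost is used only when all of the following hold:
- usageComplete == true (model_call_records == usage_records)
- direct_cost_observed == 1
- direct_cost_complete == 1
- direct_cost is a valid non-negative number
Otherwise the cost is computed from catalog prices as a fallback (calculate_cost)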

5.5 Fallback billing details

Tier selection: prompt_tokens is matched against pricing_tiers.max_prompt_tokens
Formula:

cost = uncached_prompt_tokens * input_cost_per_token
     + cached_prompt_tokens   * cached_token_rate
     + completion_tokens      * output_cost_per_token

cached_token_rate rules:
- If the tier configures cache_hit_cost_per_token and it is > 0, use that value
- Otherwise fall back to input_cost_per_token
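A worked sketch of the fallback formula follows. Several details are assumptions made for illustration: tiers are plain dicts, the first tier whose max_prompt_tokens covers the prompt is chosen, prompts larger than every bound use the top tier, and uncached_prompt_tokens is taken as prompt_tokens minus cached_prompt_tokens. `select_tier` and `fallback_cost` are hypothetical names, not the actual calculate_cost.

```python
from typing import Any

Tier = dict[str, Any]


def select_tier(tiers: list[Tier], prompt_tokens: int) -> Tier:
    """Pick the smallest tier whose max_prompt_tokens covers the prompt."""
    if not tiers:
        raise ValueError("no pricing tiers configured")
    ordered = sorted(tiers, key=lambda tier: tier["max_prompt_tokens"])
    for tier in ordered:
        if prompt_tokens <= tier["max_prompt_tokens"]:
            return tier
    return ordered[-1]  # assumption: oversized prompts are billed at the top tier


def fallback_cost(tier: Tier, prompt_tokens: int, cached_prompt_tokens: int,
                  completion_tokens: int) -> float:
    """Apply the fallback formula with the cached_token_rate rule."""
    cached = min(max(cached_prompt_tokens, 0), prompt_tokens)
    uncached = prompt_tokens - cached  # assumption: cached tokens are a subset of the prompt
    cache_rate = tier.get("cache_hit_cost_per_token") or 0
    if cache_rate <= 0:
        cache_rate = tier["input_cost_per_token"]
    return (
        uncached * tier["input_cost_per_token"]
        + cached * cache_rate
        + completion_tokens * tier["output_cost_per_token"]
    )
```

With input at 2e-6, cache hits at 5e-7, and output at 8e-6 per token, a call with 1000 prompt tokens (400 cached) and 200 completion tokens costs 600 * 2e-6 + 400 * 5e-7 + 200 * 8e-6 = 0.003.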

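5.6 Internal costSource semantics

provider: the provider's direct cost is used
catalog_fallback: normal fallback to the price table
catalog_fallback_incomplete_provider_cost: the provider returned a direct cost for only part of the calls, so the price table is used
incomplete_usage_fallback: the usage itself is incomplete, so the price table is used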
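Combining the conditions in 5.4 with these four values gives a decision function along the following lines. The precedence between the two "incomplete" cases is an assumption, as is the exact test for a "valid" cost; `choose_cost_source` is a hypothetical name and not the body of build_usage_metadata().

```python
import math
from typing import Any


def choose_cost_source(summary: dict[str, Any]) -> str:
    """Map a usage_summary-like dict to one of the four costSource values."""
    usage_complete = summary.get("model_call_records") == summary.get("usage_records")
    observed = summary.get("direct_cost_observed") == 1
    complete = summary.get("direct_cost_complete") == 1
    cost = summary.get("direct_cost")
    valid = (
        isinstance(cost, (int, float))
        and not isinstance(cost, bool)
        and math.isfinite(cost)
        and cost >= 0
    )
    if usage_complete and observed and complete and valid:
        return "provider"
    if not usage_complete:
        return "incomplete_usage_fallback"  # assumption: checked before the cost flags
    if observed and not complete:
        return "catalog_fallback_incomplete_provider_cost"
    return "catalog_fallback"
```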

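5.7 Currently observed response characteristics of DeepSeek / DashScope

Based on current production probes and run results:

Both providers reliably return input_tokens, output_tokens, and time
The top-level usage.total_tokens may be empty, but metadata.total_tokens is available
DeepSeek commonly returns prompt_tokens_details.cached_tokens, prompt_cache_hit_tokens, and prompt_cache_miss_tokens
DashScope commonly returns completion_tokens_details.reasoning_tokens (may be null)
Neither provider currently returns a direct cost field reliably, so most cases use the catalog fallback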

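6) Snapshot events

The encoder supports mappings for the following AG-UI types:

STATE_SNAPSHOT
MESSAGES_SNAPSHOT

The main flow of /runs/{thread_id}/events does not usually emit these two event types; use /history for history queries.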

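7) Field naming conventions

Common top-level event fields use AG-UI style: type, threadId, runId
Some business fields keep the underscore names inherited from the runtime model:
- tool_name
- tool_call_id
- tool_call_args
- ui_schema

These names are a constraint of the current backend implementation; the documentation matches the implementation.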

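8) Visibility and context loading

visibility_mask bitmask system

Persisted messages use a single visibility_mask field (a bitmask) to control visibility for different consumers:

Bit	Constant	Description
0	`UI_HISTORY`	Messages visible to the `/history` API projection
1	`CONTEXT_ASSEMBLY`	Messages visible to runtime context assembly

When user input is stored, chat mode sets mask = UI_HISTORY | CONTEXT_ASSEMBLY (value 3) and automation mode sets mask = 0. When agent run outputs are stored, automation mode sets mask = UI_HISTORY (value 1), so they appear in history but do not take part in context assembly.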

/history API

GET /api/v1/agent/history projects only messages that have the UI_HISTORY bit set:

WHERE (visibility_mask & 1) != 0

Runtime context assembly

load_context_messages filters by the CONTEXT_ASSEMBLY bit when querying context:

WHERE (visibility_mask & 2) != 0

Effects:

chat mode user input: mask=3 → enters /history ✅, enters context assembly ✅
automation mode user input: mask=0 → enters /history ❌, enters context assembly ❌
automation mode agent output: mask=1 → enters /history ✅, enters context assembly ❌
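The bit layout and the two filters can be mirrored in a few lines. The flag names come from the table above; the `Visibility` class and the helper functions are illustrative and not taken from the codebase.

```python
from enum import IntFlag


class Visibility(IntFlag):
    UI_HISTORY = 1 << 0        # bit 0, value 1
    CONTEXT_ASSEMBLY = 1 << 1  # bit 1, value 2


def user_input_mask(runtime_mode: str) -> Visibility:
    """Mask applied to stored user input for each runtime mode."""
    if runtime_mode == "chat":
        return Visibility.UI_HISTORY | Visibility.CONTEXT_ASSEMBLY
    if runtime_mode == "automation":
        return Visibility(0)
    raise ValueError(f"unknown runtime_mode: {runtime_mode}")


def visible_in_history(mask: int) -> bool:
    """Same test as WHERE (visibility_mask & 1) != 0."""
    return (mask & Visibility.UI_HISTORY) != 0


def visible_in_context(mask: int) -> bool:
    """Same test as WHERE (visibility_mask & 2) != 0."""
    return (mask & Visibility.CONTEXT_ASSEMBLY) != 0
```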

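Automation mode context injection

Because automation user input has mask=0 and does not enter context assembly, before the router is called the latest user message from RunAgentInput.messages is injected at the head of the context (condition: the context is empty or its last message is not from the user).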
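The injection condition can be sketched as below. The code follows the wording literally and places the message at the head of the context; whether the implementation really prepends, and the `role` key used to detect user messages, are assumptions. `inject_user_message` is a hypothetical name.

```python
from typing import Any, Optional

Message = dict[str, Any]


def latest_user_message(messages: list[Message]) -> Optional[Message]:
    """Return the most recent message with role 'user', if any."""
    for message in reversed(messages):
        if message.get("role") == "user":
            return message
    return None


def inject_user_message(context: list[Message], run_input_messages: list[Message]) -> list[Message]:
    """Inject the latest user message when the context is empty or does not end with a user turn."""
    needs_injection = not context or context[-1].get("role") != "user"
    latest = latest_user_message(run_input_messages)
    if not needs_injection or latest is None:
        return list(context)
    return [latest, *context]  # head of the context, per the wording above
```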

Summary of runtime_mode differences

Dimension	`chat`	`automation`
Pipeline	`router` -> `worker`	`router` -> `worker`
User input visibility_mask	`UI_HISTORY \| CONTEXT_ASSEMBLY`	`0`
Agent output visibility_mask	`UI_HISTORY \| CONTEXT_ASSEMBLY` (memory stage: `UI_HISTORY` only)	`UI_HISTORY`
Enters /history	✅	✅ (agent output only)
Enters context assembly	✅ (automatic)	❌ (injected via run_input)
enabled_skills source	`system_agents.yaml` worker config	`AutomationJob.config.enabled_skills`
context config source	`system_agents.yaml` router context_messages	`AutomationJob.config.context`
