# Agent SSE Events

This document describes the event protocol of `GET /api/v1/agent/runs/{thread_id}/events`.

> The current protocol requires SSE subscriptions to carry an explicit `runId`; event output is isolated per run.

---

## 1) Event pipeline

Events flow through the backend as follows:

1. The runtime emits AG-UI events directly (for example `RUN_STARTED`, `TOOL_CALL_RESULT`).
2. `agui_codec` only performs protocol alignment and field sanitization (for example, removing statistics fields that are internal to the backend).
3. Each event is then, at the same time:
   - persisted to the database (for history);
   - published to a Redis Stream (for SSE).
4. `/runs/{thread_id}/events` reads from the Redis Stream and emits SSE, forwarding only events of the target `runId`.

---

## 2) SSE frame format

Every event follows standard SSE framing:

```text
id: <redis-stream-id>
event: <AG-UI-EVENT-TYPE>
data: <json>
```

- `id` can be used to resume the stream after a disconnect (`Last-Event-ID`).
- `event` matches the `type` field inside the JSON (for example `RUN_STARTED`).
- An event is sent only when the `runId` in its payload matches the `runId` query parameter.
- While idle, keep-alive comment frames may appear:

```text
: keep-alive
```

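
As a rough illustration of the framing above, the sketch below parses a sequence of already-decoded lines into frames and skips comment lines such as `: keep-alive`. It is a minimal sketch, not the project's client: `parse_sse`, `SseFrame`, and the sample stream ID are hypothetical names and values, and unlike a browser `EventSource` it resets `id` for every frame.

```python
import json
from typing import Iterable, Iterator, Optional, TypedDict


class SseFrame(TypedDict):
    id: Optional[str]
    event: Optional[str]
    data: dict


def parse_sse(lines: Iterable[str]) -> Iterator[SseFrame]:
    """Yield one frame per blank-line-terminated block; skip comment lines."""
    event_id: Optional[str] = None
    event_type: Optional[str] = None
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the frame, but only if data was collected.
            if data_parts:
                yield {
                    "id": event_id,
                    "event": event_type,
                    "data": json.loads("\n".join(data_parts)),
                }
            event_id, event_type, data_parts = None, None, []
            continue
        if line.startswith(":"):
            continue  # comment frame, e.g. ": keep-alive"
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # SSE strips a single leading space
        if field == "id":
            event_id = value
        elif field == "event":
            event_type = value
        elif field == "data":
            data_parts.append(value)


# Hypothetical sample stream: one keep-alive comment, then one event.
sample = [
    ": keep-alive",
    "",
    "id: 1710550861000-0",
    "event: RUN_STARTED",
    'data: {"type": "RUN_STARTED", "threadId": "t1", "runId": "r1"}',
    "",
]
frames = list(parse_sse(sample))
```

A client that reconnects would send the last `id` it saw in the `Last-Event-ID` request header.
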
---

## 3) Event types (current implementation)

### 3.0 Filtering and termination rules

- The `runId` request parameter identifies the target run of this subscription.
- The server forwards only events where `event.runId == runId`.
- The SSE connection ends only after the target run receives a terminal event:
  - `RUN_FINISHED`
  - `RUN_ERROR`
- Events from other runs (including terminal events) do not end the current connection.

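
The rules above can be sketched as a small generator. This is an illustration under stated assumptions, not the server code: `forward_for_run` and `TERMINAL_TYPES` are hypothetical names, and events are modeled as plain dicts rather than Redis Stream entries.

```python
from typing import Iterable, Iterator

TERMINAL_TYPES = {"RUN_FINISHED", "RUN_ERROR"}


def forward_for_run(events: Iterable[dict], run_id: str) -> Iterator[dict]:
    """Yield only events of `run_id` and stop after its terminal event."""
    for event in events:
        if event.get("runId") != run_id:
            continue  # other runs are never forwarded, terminal or not
        yield event
        if event.get("type") in TERMINAL_TYPES:
            return  # only the target run's terminal event ends the stream


stream = [
    {"type": "RUN_STARTED", "runId": "r1"},
    {"type": "RUN_FINISHED", "runId": "r2"},  # other run: ignored
    {"type": "TEXT_MESSAGE_END", "runId": "r1"},
    {"type": "RUN_FINISHED", "runId": "r1"},
    {"type": "RUN_STARTED", "runId": "r1"},  # after terminal: never reached
]
forwarded = [e["type"] for e in forward_for_run(stream, "r1")]
```
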
### 3.1 Run lifecycle

#### `RUN_STARTED`

```json
{
  "type": "RUN_STARTED",
  "threadId": "...",
  "runId": "..."
}
```

#### `RUN_FINISHED`

```json
{
  "type": "RUN_FINISHED",
  "threadId": "...",
  "runId": "..."
}
```

#### `RUN_ERROR`

```json
{
  "type": "RUN_ERROR",
  "threadId": "...",
  "runId": "...",
  "message": "runtime execution failed",
  "code": null
}
```

Cancellation semantics (current implementation):

```json
{
  "type": "RUN_ERROR",
  "threadId": "...",
  "runId": "...",
  "message": "run canceled by user",
  "code": "RUN_CANCELED"
}
```

Note: `RUN_CANCELED` means the user actively interrupted the run. At this stage the backend still reuses the session `failed` status to stay compatible.

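
Because cancellation arrives as a `RUN_ERROR`, a consumer has to look at `code` to tell the two cases apart. A minimal sketch, assuming events are plain dicts; `classify_run_error` and its return labels are hypothetical:

```python
def classify_run_error(event: dict) -> str:
    """Return "canceled" for a user cancel, "failed" for any other RUN_ERROR."""
    if event.get("type") != "RUN_ERROR":
        raise ValueError("not a RUN_ERROR event")
    return "canceled" if event.get("code") == "RUN_CANCELED" else "failed"


failure = {"type": "RUN_ERROR", "message": "runtime execution failed", "code": None}
cancel = {"type": "RUN_ERROR", "message": "run canceled by user", "code": "RUN_CANCELED"}
```
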
### 3.2 Step events

#### `STEP_STARTED`

```json
{
  "type": "STEP_STARTED",
  "threadId": "...",
  "runId": "...",
  "stepName": "router" | "worker"
}
```

#### `STEP_FINISHED`

```json
{
  "type": "STEP_FINISHED",
  "threadId": "...",
  "runId": "...",
  "stepName": "router" | "worker"
}
```

### 3.3 Tool events

Frontend rendering constraints (current implementation):

- Tool UI rendering consumes only `TOOL_CALL_RESULT.ui_schema`.
- `TOOL_CALL_START` / `TOOL_CALL_ARGS` / `TOOL_CALL_END` are kept only as execution-observability events; the frontend main chat stream does not render intermediate-state cards.

#### `TOOL_CALL_START`

```json
{
  "type": "TOOL_CALL_START",
  "threadId": "...",
  "runId": "...",
  "messageId": "...",
  "toolCallId": "...",
  "toolCallName": "...",
  "stage": "worker"
}
```

#### `TOOL_CALL_ARGS`

```json
{
  "type": "TOOL_CALL_ARGS",
  "threadId": "...",
  "runId": "...",
  "messageId": "...",
  "toolCallId": "...",
  "toolCallName": "...",
  "args": {},
  "stage": "worker"
}
```

#### `TOOL_CALL_END`

```json
{
  "type": "TOOL_CALL_END",
  "threadId": "...",
  "runId": "...",
  "messageId": "...",
  "toolCallId": "...",
  "toolCallName": "...",
  "stage": "worker"
}
```

#### `TOOL_CALL_RESULT`

```json
{
  "type": "TOOL_CALL_RESULT",
  "threadId": "...",
  "runId": "...",
  "messageId": "...",
  "role": "tool",
  "stage": "worker",
  "tool_name": "...",
  "tool_call_id": "...",
  "tool_call_args": {},
  "status": "success" | "failure" | "partial",
  "result": {},
  "error": null,
  "ui_schema": {}
}
```

Note: the `result` field of `TOOL_CALL_RESULT` carries compact, structured, actionable information (key facts such as id/status/count come first) for the agent's subsequent reasoning and tool orchestration. If the corresponding tool output contains `ui_hints`, the backend compiles them into `ui_schema` at the codec layer and sends it with the event.

Current `ui_hints` policy: hints are generated only for canonical methods that currently have stable display semantics, for example `calendar.read`, `calendar.create`, `calendar.update`, `calendar.delete`, `calendar.share`, `calendar.accept_invite`, `calendar.reject_invite`, `contacts.read`, `memory.update`.

Protocol migration notes:

- The model-side canonical structure of `tool_call_args` is now unified as `module/method/input`.
- The SSE event field name `tool_call_args` is unchanged, but the shape of the object inside it follows the current `project_cli` protocol.
- Frontends and debugging tools must no longer assume that `tool_call_args.command` / `tool_call_args.subcommand` exist.

Additional constraints:

- `tool_call_id` must match the `toolCallId` of `TOOL_CALL_START/ARGS/END` for the same call, and must be unique per tool call.
- `tool_call_args` is only a snapshot of the input arguments.
- `result` states only execution output facts and does not repeat input arguments already present in `tool_call_args`.
- `ui_schema` is the renderable UI wire format; its source data comes from `metadata.tool_agent_output.ui_hints`.

Recommended `tool_call_args` shape:

```json
{
  "module": "calendar",
  "method": "read",
  "input": {
    "mode": "event",
    "event_id": "evt_123"
  }
}
```

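
A consumer that follows the migration notes should read `tool_call_args` defensively. The sketch below is an assumption-laden illustration: `canonical_method` is a hypothetical helper, and joining `module` and `method` with a dot to obtain names like `calendar.read` is inferred from the examples in this document rather than confirmed by it.

```python
from typing import Optional


def canonical_method(tool_call_args: dict) -> Optional[str]:
    """Return "module.method" when both keys are strings, else None."""
    module = tool_call_args.get("module")
    method = tool_call_args.get("method")
    if isinstance(module, str) and isinstance(method, str):
        return f"{module}.{method}"
    return None  # unknown shape: do not guess from legacy keys


args = {
    "module": "calendar",
    "method": "read",
    "input": {"mode": "event", "event_id": "evt_123"},
}
legacy_args = {"command": "calendar", "subcommand": "read"}
```

Returning `None` for the legacy shape keeps the caller from silently depending on `command` / `subcommand`.
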
#### 3.3.1 Tool name display convention (frontend localization)

Tool name fields in the SSE protocol are passed through exactly as the backend produces them; the server does not translate them:

- `TOOL_CALL_START/ARGS/END.toolCallName`
- `TOOL_CALL_RESULT.tool_name`

The frontend display layer renders Chinese labels through a single tool-name localization map, which must accept two naming styles:

- dot style: `memory.update`, `calendar.read`
- snake style: `memory_update`, `calendar_read`

The current canonical mapping (canonical -> Chinese label) is:

- `calendar.read` -> `读取日程` (read schedule)
- `calendar.create` -> `创建日程` (create schedule)
- `calendar.update` -> `更新日程` (update schedule)
- `calendar.delete` -> `删除日程` (delete schedule)
- `calendar.share` -> `邀请参与者` (invite participants)
- `calendar.accept_invite` -> `接受邀请` (accept invitation)
- `calendar.reject_invite` -> `拒绝邀请` (reject invitation)
- `contacts.read` -> `读取联系人` (read contacts)
- `memory.update` -> `更新记忆` (update memory)

Compatibility strategy:

1. Normalize by alias first (for example `memory_update` -> `memory.update`).
2. On a canonical mapping hit, display the Chinese label.
3. On a miss, fall back to the raw tool name (this preserves backward compatibility).

This convention constrains display only; it does not change wire event field definitions or values.

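
The three-step compatibility strategy can be sketched as follows. The label table is copied from the mapping above; the alias rule (replace the first underscore with a dot) is an assumption made for this sketch, and the real frontend may use an explicit alias table instead. `normalize_tool_name` and `display_tool_name` are hypothetical names.

```python
CANONICAL_LABELS = {
    "calendar.read": "读取日程",
    "calendar.create": "创建日程",
    "calendar.update": "更新日程",
    "calendar.delete": "删除日程",
    "calendar.share": "邀请参与者",
    "calendar.accept_invite": "接受邀请",
    "calendar.reject_invite": "拒绝邀请",
    "contacts.read": "读取联系人",
    "memory.update": "更新记忆",
}


def normalize_tool_name(name: str) -> str:
    """Turn a snake-style alias into dot style by replacing the first underscore."""
    if "." in name:
        return name  # already dot style
    module, sep, method = name.partition("_")
    return f"{module}.{method}" if sep else name


def display_tool_name(name: str) -> str:
    """Step 1: normalize. Step 2: look up. Step 3: fall back to the raw name."""
    return CANONICAL_LABELS.get(normalize_tool_name(name), name)
```

For example, `display_tool_name("calendar_accept_invite")` resolves through `calendar.accept_invite`, while an unmapped name such as `web_search` is shown unchanged.
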
### 3.4 Text completion events

#### `TEXT_MESSAGE_END`

The current implementation sends the complete result only after the worker output finishes; no token delta events are sent.

```json
{
  "type": "TEXT_MESSAGE_END",
  "threadId": "...",
  "runId": "...",
  "messageId": "...",
  "role": "assistant",
  "stage": "worker",
  "status": "success" | "partial_success" | "failed",
  "answer": "...",
  "suggested_actions": [],
  "error": null
}
```

`inputTokens`, `outputTokens`, `cost`, `latencyMs`, and `model` are backend-internal statistics fields and are not exposed in the external SSE protocol.

Additional notes on the current implementation:

- In the wire payload, `TEXT_MESSAGE_END` includes usage summary fields such as `totalTokens`, `cachedPromptTokens`, `promptCacheHitTokens`, `promptCacheMissTokens`, `reasoningTokens`, `costSource`, and `usageComplete`, for use by the frontend observability panel.
- These fields come from the backend usage normalization layer and are part of the AG-UI event data; they do not change the main structure of `TEXT_MESSAGE_END`.

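
The split between internal and exposed fields can be pictured as a key filter. This is only a sketch of the idea, assuming events are flat dicts; `sanitize_event` and `INTERNAL_FIELDS` are hypothetical names and the actual `agui_codec` logic may differ.

```python
INTERNAL_FIELDS = frozenset(
    {"inputTokens", "outputTokens", "cost", "latencyMs", "model"}
)


def sanitize_event(event: dict) -> dict:
    """Return a copy of `event` without backend-internal statistics fields."""
    return {key: value for key, value in event.items() if key not in INTERNAL_FIELDS}


raw_event = {
    "type": "TEXT_MESSAGE_END",
    "runId": "r1",
    "answer": "...",
    "inputTokens": 120,       # internal: dropped
    "cost": 0.0026,           # internal: dropped
    "totalTokens": 150,       # usage summary: kept
    "usageComplete": True,    # usage summary: kept
}
wire_event = sanitize_event(raw_event)
```
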
---

## 5) Usage audit protocol (backend internal)

This section describes the backend's internal auditing and billing strategy for LLM usage. The protocol is used for database persistence, cost statistics, and runtime observability; it is not exposed directly through the external SSE protocol.

### 5.1 Current provider scope

- DashScope (Qwen)
- DeepSeek

The current implementation is deeply adapted only for these two providers.

### 5.2 Raw field collection (Provider -> Runtime)

`TrackingChatModel` reads the provider's direct fields first and fills in from metadata only when a direct field is unavailable.

The priority order is:

1. Direct fields (preferred)
   - `usage.input_tokens`
   - `usage.output_tokens`
   - `usage.total_tokens`
   - `usage.time` (seconds)
   - `usage.cost` (if present)
2. metadata fields (fill-in)
   - `metadata.prompt_tokens`
   - `metadata.completion_tokens`
   - `metadata.total_tokens`
   - `metadata.prompt_tokens_details.cached_tokens`
   - `metadata.prompt_cache_hit_tokens`
   - `metadata.prompt_cache_miss_tokens`
   - `metadata.completion_tokens_details.reasoning_tokens`
   - `metadata.cost` / `metadata.total_cost` (if present)

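
The "direct field first, metadata as fill-in" rule can be sketched like this. It is not the `TrackingChatModel` code: `collect_usage` and `first_present` are hypothetical helpers, and pairing `input_tokens` with `prompt_tokens` (and `output_tokens` with `completion_tokens`) is an assumption of this sketch. The `latency_ms` conversion follows the `usage.time * 1000` rule stated in section 5.3.

```python
from typing import Any, Optional


def first_present(*candidates: Any) -> Optional[Any]:
    """Return the first candidate that is not None, or None if all are missing."""
    for value in candidates:
        if value is not None:
            return value
    return None


def collect_usage(usage: dict, metadata: dict) -> dict:
    """Prefer provider direct fields; fall back to metadata only when missing."""
    seconds = usage.get("time")
    return {
        "input_tokens": first_present(
            usage.get("input_tokens"), metadata.get("prompt_tokens")
        ),
        "output_tokens": first_present(
            usage.get("output_tokens"), metadata.get("completion_tokens")
        ),
        "total_tokens": first_present(
            usage.get("total_tokens"), metadata.get("total_tokens")
        ),
        "latency_ms": None if seconds is None else seconds * 1000,
        "direct_cost": first_present(
            usage.get("cost"), metadata.get("cost"), metadata.get("total_cost")
        ),
    }


summary = collect_usage(
    usage={"input_tokens": 120, "output_tokens": 30, "total_tokens": None, "time": 1.5},
    metadata={"prompt_tokens": 999, "total_tokens": 150},
)
```

In the sample call, the direct `input_tokens` wins over `metadata.prompt_tokens`, while the empty top-level `total_tokens` is filled from metadata.
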
### 5.3 Normalized internal usage_summary fields

`TrackingChatModel.usage_summary()` currently outputs:

- `input_tokens`
- `output_tokens`
- `total_tokens`
- `latency_ms` (converted as `usage.time * 1000`)
- `cached_prompt_tokens`
- `prompt_cache_hit_tokens`
- `prompt_cache_miss_tokens`
- `reasoning_tokens`
- `direct_cost`
- `direct_cost_observed` (0/1)
- `direct_cost_complete` (0/1)
- `model_call_records`
- `usage_records`
- `direct_cost_records`
- `cost_source` (`provider` | `catalog_fallback`)

### 5.4 Cost calculation strategy (rigor first)

Core principle: **use the cost returned by the provider whenever it is available; fall back only when it is missing.**

`LiteLLMService.build_usage_metadata()` applies the following rules:

1. Use the provider's direct cost only when all of these conditions hold:
   - `usageComplete == true` (`model_call_records == usage_records`)
   - `direct_cost_observed == 1`
   - `direct_cost_complete == 1`
   - `direct_cost` is a valid non-negative number
2. Otherwise, compute the cost from catalog prices as a fallback (`calculate_cost`).

### 5.5 Fallback billing details

- Tier selection: `prompt_tokens` is matched against `pricing_tiers.max_prompt_tokens`.
- Formula:

```text
cost = uncached_prompt_tokens * input_cost_per_token
     + cached_prompt_tokens * cached_token_rate
     + completion_tokens * output_cost_per_token
```

- `cached_token_rate` rule:
  - If the tier configures `cache_hit_cost_per_token` and it is > 0, use that value.
  - Otherwise, fall back to `input_cost_per_token`.

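
A worked sketch of the fallback formula follows. Several details are assumptions made for illustration and are not confirmed by this document: `uncached_prompt_tokens` is taken as `prompt_tokens - cached_prompt_tokens`, a tier matches when `prompt_tokens <= max_prompt_tokens` with the largest tier as a catch-all, and all prices are invented. `PricingTier`, `select_tier`, and `fallback_cost` are hypothetical names, not the real `calculate_cost` signature.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PricingTier:
    max_prompt_tokens: int
    input_cost_per_token: float
    output_cost_per_token: float
    cache_hit_cost_per_token: Optional[float] = None


def select_tier(tiers: Sequence[PricingTier], prompt_tokens: int) -> PricingTier:
    """Pick the smallest tier that fits the prompt; else the largest tier."""
    ordered = sorted(tiers, key=lambda tier: tier.max_prompt_tokens)
    for tier in ordered:
        if prompt_tokens <= tier.max_prompt_tokens:
            return tier
    return ordered[-1]


def fallback_cost(
    tier: PricingTier,
    prompt_tokens: int,
    cached_prompt_tokens: int,
    completion_tokens: int,
) -> float:
    """Apply the fallback formula with the cached_token_rate rule."""
    cached_rate = tier.cache_hit_cost_per_token
    if cached_rate is None or cached_rate <= 0:
        cached_rate = tier.input_cost_per_token  # no usable cache-hit price
    uncached_prompt_tokens = prompt_tokens - cached_prompt_tokens
    return (
        uncached_prompt_tokens * tier.input_cost_per_token
        + cached_prompt_tokens * cached_rate
        + completion_tokens * tier.output_cost_per_token
    )


# Invented prices, for illustration only.
tiers = [
    PricingTier(32_000, 2e-6, 6e-6, cache_hit_cost_per_token=5e-7),
    PricingTier(128_000, 4e-6, 12e-6),
]
small = select_tier(tiers, prompt_tokens=1_000)
cost_with_cache_rate = fallback_cost(small, 1_000, 400, 200)
cost_without_cache_rate = fallback_cost(tiers[1], 1_000, 400, 200)
```

With these sample numbers, the first call prices the 400 cached tokens at the cache-hit rate, while the second tier has no cache-hit price and charges them at the input rate.
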
### 5.6 Internal costSource semantics

- `provider`: the provider's direct cost is used.
- `catalog_fallback`: the regular price-table fallback is used.
- `catalog_fallback_incomplete_provider_cost`: the provider returned some direct cost, but it is incomplete, so the price table is used.
- `incomplete_usage_fallback`: the usage data itself is incomplete, so the price table is used.

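
Combining the conditions of section 5.4 with the labels above gives a decision function along these lines. The order in which the fallback labels are checked is an assumption of this sketch, as is treating an invalid `direct_cost` as a plain `catalog_fallback`; `choose_cost_source` is a hypothetical name, not `LiteLLMService.build_usage_metadata()` itself.

```python
import math


def choose_cost_source(summary: dict) -> str:
    """Map a usage_summary-like dict to one of the four costSource labels."""
    usage_complete = summary["model_call_records"] == summary["usage_records"]
    if not usage_complete:
        return "incomplete_usage_fallback"

    observed = summary.get("direct_cost_observed") == 1
    complete = summary.get("direct_cost_complete") == 1
    cost = summary.get("direct_cost")
    valid = (
        isinstance(cost, (int, float))
        and not isinstance(cost, bool)
        and math.isfinite(cost)
        and cost >= 0
    )

    if observed and complete and valid:
        return "provider"
    if observed and not complete:
        return "catalog_fallback_incomplete_provider_cost"
    return "catalog_fallback"


base = {"model_call_records": 2, "usage_records": 2}
provider_case = {**base, "direct_cost_observed": 1, "direct_cost_complete": 1, "direct_cost": 0.01}
partial_case = {**base, "direct_cost_observed": 1, "direct_cost_complete": 0, "direct_cost": 0.01}
no_cost_case = {**base, "direct_cost_observed": 0, "direct_cost_complete": 0, "direct_cost": None}
incomplete_usage_case = {"model_call_records": 2, "usage_records": 1}
```
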
### 5.7 Return characteristics currently observed for DeepSeek / DashScope

Based on current production probes and run results:

- Both providers reliably return `input_tokens`, `output_tokens`, and `time`.
- The top-level `usage.total_tokens` may be empty, but `metadata.total_tokens` is available.
- DeepSeek commonly returns `prompt_tokens_details.cached_tokens`, `prompt_cache_hit_tokens`, and `prompt_cache_miss_tokens`.
- DashScope commonly returns `completion_tokens_details.reasoning_tokens` (which may be `null`).
- Neither provider currently returns a direct `cost` field reliably, so most cases end up on the catalog fallback.

## 6) Snapshot events

The encoder supports mapping the following AG-UI types:

- `STATE_SNAPSHOT`
- `MESSAGES_SNAPSHOT`

The main `/runs/{thread_id}/events` flow does not normally produce these two event types; use `/history` for history queries.

---

## 7) Field naming conventions

- Common top-level event fields use the AG-UI style: `type`, `threadId`, `runId`.
- Some business fields keep the snake_case names inherited from the runtime model:
  - `tool_name`
  - `tool_call_id`
  - `tool_call_args`
  - `ui_schema`

This naming is a constraint of the current backend implementation; the documentation matches the implementation.

## 8) Visibility and context loading

### visibility_mask bitmask system

Persisted messages use a single `visibility_mask` field (a bitmask) to control visibility for different consumers:

| Bit | Value | Constant | Description |
|-----|-------|----------|-------------|
| 0 | 1 | `UI_HISTORY` | Message is visible to the `/history` API projection |
| 1 | 2 | `CONTEXT_ASSEMBLY` | Message is visible to runtime context assembly |

> When user input is stored, `chat` mode sets `mask = UI_HISTORY | CONTEXT_ASSEMBLY` (value 3) and `automation` mode sets `mask = 0`.
> When agent run output is stored, `automation` mode sets `mask = UI_HISTORY` (value 1), so the output is shown in history but does not take part in context assembly.

### /history API

`GET /api/v1/agent/history` projects only messages that have the `UI_HISTORY` bit set:

```sql
WHERE (visibility_mask & 1) != 0
```

### Runtime context assembly

`load_context_messages` filters on the `CONTEXT_ASSEMBLY` bit when querying context:

```sql
WHERE (visibility_mask & 2) != 0
```

**Impact**:

- `chat` mode user input: mask=3 → appears in `/history` ✅, enters context assembly ✅
- `automation` mode user input: mask=0 → appears in `/history` ❌, enters context assembly ❌
- `automation` mode agent output: mask=1 → appears in `/history` ✅, enters context assembly ❌

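
The bit layout and the two filters can be reproduced with an `IntFlag`. This is an illustrative model only: the constant names come from the table in this section, while `Visibility`, `in_history`, and `in_context` are hypothetical names for the sketch.

```python
from enum import IntFlag


class Visibility(IntFlag):
    UI_HISTORY = 1 << 0        # bit 0, value 1
    CONTEXT_ASSEMBLY = 1 << 1  # bit 1, value 2


def in_history(mask: int) -> bool:
    """Mirror of: WHERE (visibility_mask & 1) != 0"""
    return (mask & Visibility.UI_HISTORY) != 0


def in_context(mask: int) -> bool:
    """Mirror of: WHERE (visibility_mask & 2) != 0"""
    return (mask & Visibility.CONTEXT_ASSEMBLY) != 0


CHAT_USER_INPUT = Visibility.UI_HISTORY | Visibility.CONTEXT_ASSEMBLY
AUTOMATION_USER_INPUT = Visibility(0)
AUTOMATION_AGENT_OUTPUT = Visibility.UI_HISTORY
```

Evaluating `in_history` and `in_context` on the three masks reproduces the ✅/❌ combinations listed under **Impact**.
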
### Automation mode context injection

Because automation user input has `mask=0` and therefore does not enter context assembly, the latest user message from `RunAgentInput.messages` is injected at the head of the context before the router is called (condition: the context is empty, or its last message is not a user message).

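
The injection condition can be sketched as below. Messages are modeled as dicts with a `role` key, "head of the context" is read as index 0, and `inject_latest_user_message` is a hypothetical name; all three are assumptions of this sketch rather than details confirmed by the document.

```python
def inject_latest_user_message(
    context: list[dict], run_input_messages: list[dict]
) -> list[dict]:
    """Prepend the latest user message when the context is empty
    or its last message is not a user message."""
    latest_user = next(
        (m for m in reversed(run_input_messages) if m.get("role") == "user"),
        None,
    )
    needs_injection = not context or context[-1].get("role") != "user"
    if latest_user is None or not needs_injection:
        return list(context)  # nothing to inject: return an unchanged copy
    return [latest_user, *context]


job_prompt = {"role": "user", "content": "summarize today's calendar"}
assistant_turn = {"role": "assistant", "content": "earlier reply"}

from_empty = inject_latest_user_message([], [job_prompt])
after_assistant = inject_latest_user_message([assistant_turn], [job_prompt])
already_user = inject_latest_user_message([job_prompt], [job_prompt])
```
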
### Summary of runtime_mode differences

| Dimension | `chat` | `automation` |
|-----------|--------|--------------|
| Pipeline | `router` -> `worker` | `router` -> `worker` |
| User input visibility_mask | `UI_HISTORY \| CONTEXT_ASSEMBLY` | `0` |
| Agent output visibility_mask | `UI_HISTORY \| CONTEXT_ASSEMBLY` (memory stage: `UI_HISTORY` only) | `UI_HISTORY` |
| Appears in /history | ✅ | ✅ (agent output only) |
| Enters context assembly | ✅ (automatic) | ❌ (injected via run_input) |
| enabled_skills source | `system_agents.yaml` worker config | `AutomationJob.config.enabled_skills` |
| context config source | `system_agents.yaml` router context_messages | `AutomationJob.config.context` |