208 lines
5.7 KiB
Markdown
208 lines
5.7 KiB
Markdown
|
|
# Agent Tool Architecture Design
|
|||
|
|
|
|||
|
|
**Date:** 2026-03-08
|
|||
|
|
**Source:** `docs/bugs/2026-03-08-agent-tool-architecture.md`
|
|||
|
|
**Scope:** `backend/src/core/agent`
|
|||
|
|
**Status:** Approved for planning
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 1. Objective
|
|||
|
|
|
|||
|
|
修复 Agent 工具架构相关 8 个问题,优先恢复端到端闭环能力(工具审批后继续推理并产出最终回复),并在同版本内补齐工具输出结构化、存储分层、阶段策略解耦、多模态与语音输入能力。
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 2. Deliverables
|
|||
|
|
|
|||
|
|
1. 两阶段修复蓝图(Phase 1 + Phase 2)
|
|||
|
|
2. 统一事件与状态机设计(AG-UI Step 事件 + 审批恢复)
|
|||
|
|
3. 接口边界与职责重划分(run/resume/runtime/persistence)
|
|||
|
|
4. 风险与回滚策略
|
|||
|
|
5. 验收标准(双金路径)
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 3. Constraints And Decisions
|
|||
|
|
|
|||
|
|
### 3.1 Release Strategy
|
|||
|
|
|
|||
|
|
- 一次性切换
|
|||
|
|
- 不做灰度
|
|||
|
|
- 不做双轨
|
|||
|
|
- 不留兼容代码
|
|||
|
|
|
|||
|
|
### 3.2 Contract Decisions
|
|||
|
|
|
|||
|
|
- `run` 接口允许破坏性变更:移除前端传完整历史 `messages` 的语义
|
|||
|
|
- 前端只传本次输入,历史以后端为准
|
|||
|
|
- Phase 1 不引入 client hint
|
|||
|
|
- 工具架构在 Phase 1 完整迁移至 CrewAI Tools(非桥接)
|
|||
|
|
|
|||
|
|
### 3.3 AG-UI Event Decisions
|
|||
|
|
|
|||
|
|
- 三阶段固定发 `StepStarted/StepFinished`:`intent`, `execution`, `organization`
|
|||
|
|
- 等待工具审批不单独新增 step,归属 execution 内部状态
|
|||
|
|
- 后端只发英文机器名,前端自行文案化
|
|||
|
|
|
|||
|
|
### 3.4 ASR / Multimodal Decisions
|
|||
|
|
|
|||
|
|
- 多模态首版只支持文件上传(不支持 URL)
|
|||
|
|
- ASR 首版为“录音结束后上传音频 -> 后端同步返回 transcript”
|
|||
|
|
- 前端将 transcript 回填输入框,再调用 run
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 4. Complexity And Risk
|
|||
|
|
|
|||
|
|
- **Complexity:** S2(跨多个核心模块的架构调整)
|
|||
|
|
- **Risk Tier:** L2(包含高危安全项:前端可篡改历史)
|
|||
|
|
|
|||
|
|
风险驱动原则:先修复闭环与安全问题,再扩展能力面。
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 5. Phased Plan
|
|||
|
|
|
|||
|
|
## Phase 1 - Close Loop And Stop Security Bleeding
|
|||
|
|
|
|||
|
|
**Bugs:** #1, #5, #6
|
|||
|
|
|
|||
|
|
### Goals
|
|||
|
|
|
|||
|
|
1. 后端成为历史与上下文唯一事实源
|
|||
|
|
2. 工具审批后恢复并继续 Agent Loop
|
|||
|
|
3. 工具执行完整迁移到 CrewAI Tools 注册体系
|
|||
|
|
|
|||
|
|
### Module Boundaries
|
|||
|
|
|
|||
|
|
- `backend/src/core/agent/application/run_service.py`
|
|||
|
|
- 仅负责本次输入解析、后端上下文组装、触发 runtime
|
|||
|
|
- 移除前端历史信任路径
|
|||
|
|
- 移除硬编码工具分发
|
|||
|
|
|
|||
|
|
- `backend/src/core/agent/application/resume_service.py`
|
|||
|
|
- 审批确认后触发异步续跑,立即返回 `accepted`
|
|||
|
|
- 不可在工具执行后直接置 `COMPLETED`
|
|||
|
|
- 增加 `approval_request_id` 幂等保护
|
|||
|
|
|
|||
|
|
- `backend/src/core/agent/infrastructure/crewai/runtime.py`
|
|||
|
|
- 引入 CrewAI Tools 注册与注入
|
|||
|
|
- 按 agent/stage 装配工具集
|
|||
|
|
- 三阶段统一发 Step start/end 事件
|
|||
|
|
|
|||
|
|
- `backend/src/core/agent/application/session_state_persistence.py`
|
|||
|
|
- 保障审批状态、工具结果、续跑状态一致性落库
|
|||
|
|
- 为 Phase 2 元数据扩展保留一致接口
|
|||
|
|
|
|||
|
|
### Runtime Flow (Phase 1)
|
|||
|
|
|
|||
|
|
1. `run` 接收本次输入
|
|||
|
|
2. 后端读取 Redis/DB 重建历史
|
|||
|
|
3. 进入 intent/execution/organization 三阶段
|
|||
|
|
4. execution 中若触发工具审批:进入 `WAITING_APPROVAL`
|
|||
|
|
5. 前端审批后调用 `resume`
|
|||
|
|
6. `resume` 异步触发续跑:执行工具 -> 写 tool result -> 继续 loop
|
|||
|
|
7. 生成最终 assistant 回复并 `RunFinished`
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Phase 2 - Capability Completion In Same Version
|
|||
|
|
|
|||
|
|
**Order:** #3 -> #2 -> #4 -> #7 -> #8
|
|||
|
|
|
|||
|
|
### #3 Tool Output As UI Schema v1
|
|||
|
|
|
|||
|
|
- 统一工具输出结构:`type/version/data/actions`
|
|||
|
|
- 单一版本 `v1`,短期不做多版本并行
|
|||
|
|
|
|||
|
|
### #2 Tool Result Object Storage
|
|||
|
|
|
|||
|
|
- 大 payload 存对象存储
|
|||
|
|
- DB 仅存摘要、索引、校验信息
|
|||
|
|
- 启用 `storage_bucket/storage_path/payload_sha256`
|
|||
|
|
|
|||
|
|
### #4 Stage-Level Strategy Decoupling
|
|||
|
|
|
|||
|
|
- intent/execution/organization 支持独立参数与工具策略
|
|||
|
|
- intent 阶段可配置为只读(禁工具)
|
|||
|
|
|
|||
|
|
### #7 Multimodal Input
|
|||
|
|
|
|||
|
|
- 首版支持图片文件上传输入
|
|||
|
|
- 不再丢弃非 text 内容
|
|||
|
|
|
|||
|
|
### #8 ASR API
|
|||
|
|
|
|||
|
|
- 新增语音转写 API(同步返回 transcript)
|
|||
|
|
- 语音转写与 agent run 解耦
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 6. Session State And Events
|
|||
|
|
|
|||
|
|
推荐状态机:
|
|||
|
|
|
|||
|
|
`RUNNING -> WAITING_APPROVAL -> RESUMING -> RUNNING -> COMPLETED/FAILED`
|
|||
|
|
|
|||
|
|
关键约束:
|
|||
|
|
|
|||
|
|
- 重复审批请求不得重复执行工具(幂等)
|
|||
|
|
- `COMPLETED` 仅在 loop 自然结束时设置
|
|||
|
|
- Step 事件覆盖三阶段完整生命周期
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 7. Acceptance Criteria
|
|||
|
|
|
|||
|
|
## 7.1 Golden Path A (No Tool)
|
|||
|
|
|
|||
|
|
用户输入后,完整经历三阶段并产出最终回复;前端收到完整 step 事件与 `RunFinished`。
|
|||
|
|
|
|||
|
|
## 7.2 Golden Path B (Tool + Approval + Resume)
|
|||
|
|
|
|||
|
|
用户触发工具调用,审批后系统异步续跑并最终产出 assistant 回复;会话不在审批后直接结束。
|
|||
|
|
|
|||
|
|
## 7.3 Security Validation
|
|||
|
|
|
|||
|
|
前端即使提交伪造历史字段,也不会影响后端实际上下文。
|
|||
|
|
|
|||
|
|
## 7.4 Event Validation
|
|||
|
|
|
|||
|
|
每轮 run 必须包含 `intent/execution/organization` 的 `StepStarted/StepFinished`。
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 8. Risk And Rollback
|
|||
|
|
|
|||
|
|
### High Risk: #6 Context Ownership Migration
|
|||
|
|
|
|||
|
|
- 风险:上下文错绑、历史缺失
|
|||
|
|
- 控制:会话归属校验 + Redis/DB 一致性读取
|
|||
|
|
- 回滚:可退到“后端 DB-only 历史重建”
|
|||
|
|
|
|||
|
|
### High Risk: #5 Async Resume Consistency
|
|||
|
|
|
|||
|
|
- 风险:重复审批、状态卡死
|
|||
|
|
- 控制:审批幂等键 + 状态跃迁约束 + 超时终态
|
|||
|
|
- 回滚:降级为“仅返回工具结果,不自动续跑”
|
|||
|
|
|
|||
|
|
### Medium Risk: #2 Storage Split Consistency
|
|||
|
|
|
|||
|
|
- 风险:对象存储与 DB 元数据不一致
|
|||
|
|
- 控制:先对象后元数据 + 失败补偿清理
|
|||
|
|
- 回滚:临时退回 DB 内联存储
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 9. Bug-To-Phase Mapping
|
|||
|
|
|
|||
|
|
- **Phase 1:** #1, #5, #6
|
|||
|
|
- **Phase 2:** #2, #3, #4, #7, #8
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 10. Next Step
|
|||
|
|
|
|||
|
|
进入 implementation planning:将本设计拆解为任务级可执行计划(文件、测试、命令、验收证据)。
|