15 KiB
Agent Tool Architecture Implementation Plan
For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
Goal: 修复 agent 工具架构 8 个问题,先恢复端到端闭环与安全正确性,再补齐 UI Schema、对象存储、阶段解耦、多模态与 ASR。
Architecture: 采用两阶段落地。Phase 1 先完成后端上下文主控、CrewAI Tools 完整迁移、审批后异步续跑闭环;Phase 2 按 #3 -> #2 -> #4 -> #7 -> #8 逐项扩展能力。所有变更遵循 AG-UI 事件流语义,三阶段固定发送 StepStarted/StepFinished。
Tech Stack: FastAPI, Pydantic, CrewAI, LiteLLM, Redis, Postgres, MinIO/Supabase Storage, pytest
Task 1: 锁定 Phase 1 契约(移除前端历史语义)
Files:
- Modify:
backend/src/core/agent/domain/agui_input.py - Modify:
backend/src/core/agent/application/run_service.py - Modify:
backend/src/v1/agent/schemas.py - Test:
backend/tests/unit/core/agent/test_run_resume_service.py
Step 1: Write the failing test
def test_run_ignores_client_history_messages(fake_run_input_with_messages):
result = service.run(run_input=fake_run_input_with_messages)
assert result.used_context_source == "backend"
Step 2: Run test to verify it fails
Run: cd backend && uv run pytest tests/unit/core/agent/test_run_resume_service.py -k ignores_client_history -v
Expected: FAIL,当前实现仍读取/依赖前端 history。
Step 3: Write minimal implementation
# run_service.py
user_input = extract_latest_user_text(run_input)
history = await load_context_from_backend_sources(session_id)
Step 4: Run test to verify it passes
Run: cd backend && uv run pytest tests/unit/core/agent/test_run_resume_service.py -k ignores_client_history -v
Expected: PASS
Step 5: Commit
git add backend/src/core/agent/domain/agui_input.py backend/src/core/agent/application/run_service.py backend/src/v1/agent/schemas.py backend/tests/unit/core/agent/test_run_resume_service.py
git commit -m "refactor(agent): make backend own conversation context"
Task 2: CrewAI Tools 完整迁移(替换硬编码分发)
Files:
- Create:
backend/src/core/agent/infrastructure/crewai/tools_registry.py - Create:
backend/src/core/agent/infrastructure/crewai/tools/create_calendar_event_tool.py - Modify:
backend/src/core/agent/infrastructure/crewai/runtime.py - Modify:
backend/src/core/agent/application/run_service.py - Test:
backend/tests/unit/core/agent/test_crewai_runtime.py
Step 1: Write the failing test
def test_runtime_uses_registered_crewai_tools():
runtime = build_runtime_with_registry(["create_calendar_event"])
result = runtime.execute(user_input="帮我创建日历事件", system_prompt="x")
assert result.tool_calls[0].tool_name == "create_calendar_event"
Step 2: Run test to verify it fails
Run: cd backend && uv run pytest tests/unit/core/agent/test_crewai_runtime.py -k registered_crewai_tools -v
Expected: FAIL,当前路径仍是 run_service 硬编码。
Step 3: Write minimal implementation
# tools_registry.py
TOOLS = {"create_calendar_event": CreateCalendarEventTool()}
def tools_for_stage(stage: str) -> list[BaseTool]:
return STAGE_TOOL_MAP.get(stage, [])
Step 4: Run test to verify it passes
Run: cd backend && uv run pytest tests/unit/core/agent/test_crewai_runtime.py -k registered_crewai_tools -v
Expected: PASS
Step 5: Commit
git add backend/src/core/agent/infrastructure/crewai/tools_registry.py backend/src/core/agent/infrastructure/crewai/tools/create_calendar_event_tool.py backend/src/core/agent/infrastructure/crewai/runtime.py backend/src/core/agent/application/run_service.py backend/tests/unit/core/agent/test_crewai_runtime.py
git commit -m "feat(agent): migrate backend tools to crewai tool registry"
Task 3: 修复审批后异步续跑闭环(#5)
Files:
- Modify:
backend/src/core/agent/application/resume_service.py - Modify:
backend/src/core/agent/infrastructure/queue/tasks.py - Modify:
backend/src/core/agent/application/session_state_persistence.py - Test:
backend/tests/integration/core/agent/test_queue_run_resume.py
Step 1: Write the failing test
def test_resume_triggers_async_loop_until_final_assistant_message(client):
response = client.post("/v1/agent/runs/{id}/resume", json={"approve": True})
assert response.status_code == 202
assert eventually_has_final_assistant_message(id)
Step 2: Run test to verify it fails
Run: cd backend && uv run pytest tests/integration/core/agent/test_queue_run_resume.py -k triggers_async_loop -v
Expected: FAIL,当前审批后直接完成。
Step 3: Write minimal implementation
# resume_service.py
await mark_session_resuming(...)
await enqueue_resume_task(...)
return ResumeAccepted(...)
Step 4: Run test to verify it passes
Run: cd backend && uv run pytest tests/integration/core/agent/test_queue_run_resume.py -k triggers_async_loop -v
Expected: PASS
Step 5: Commit
git add backend/src/core/agent/application/resume_service.py backend/src/core/agent/infrastructure/queue/tasks.py backend/src/core/agent/application/session_state_persistence.py backend/tests/integration/core/agent/test_queue_run_resume.py
git commit -m "fix(agent): continue agent loop asynchronously after tool approval"
Task 4: 三阶段 Step 事件完整化(intent/execution/organization)
Files:
- Modify:
backend/src/core/agent/infrastructure/crewai/runtime.py - Modify:
backend/src/core/agent/infrastructure/agui/bridge.py - Test:
backend/tests/unit/core/agent/test_agui_bridge.py - Test:
backend/tests/integration/v1/agent/test_sse_flow_live.py
Step 1: Write the failing test
def test_each_stage_emits_step_started_and_finished():
events = collect_events_from_run(...)
assert has_step_pair(events, "intent")
assert has_step_pair(events, "execution")
assert has_step_pair(events, "organization")
Step 2: Run test to verify it fails
Run: cd backend && uv run pytest tests/integration/v1/agent/test_sse_flow_live.py -k emits_step_started_and_finished -v
Expected: FAIL,至少一个阶段事件缺失。
Step 3: Write minimal implementation
emit_step_started(stage)
stage_output = run_stage(stage)
emit_step_finished(stage)
Step 4: Run test to verify it passes
Run: cd backend && uv run pytest tests/integration/v1/agent/test_sse_flow_live.py -k emits_step_started_and_finished -v
Expected: PASS
Step 5: Commit
git add backend/src/core/agent/infrastructure/crewai/runtime.py backend/src/core/agent/infrastructure/agui/bridge.py backend/tests/unit/core/agent/test_agui_bridge.py backend/tests/integration/v1/agent/test_sse_flow_live.py
git commit -m "feat(agent): emit ag-ui step events for three-stage flow"
Task 5: 工具输出统一为 UI Schema v1(#3)
Files:
- Modify:
backend/src/core/agent/infrastructure/crewai/tools/create_calendar_event_tool.py - Modify:
backend/src/core/agent/domain/message_metadata.py - Test:
backend/tests/unit/core/agent/test_run_resume_service.py
Step 1: Write the failing test
def test_calendar_tool_returns_ui_schema_v1():
result = run_calendar_tool(...)
assert result["type"] == "calendar_card.v1"
assert result["version"] == "v1"
Step 2: Run test to verify it fails
Run: cd backend && uv run pytest tests/unit/core/agent/test_run_resume_service.py -k returns_ui_schema_v1 -v
Expected: FAIL,当前返回简单 status/event_id。
Step 3: Write minimal implementation
return {
"type": "calendar_card.v1",
"version": "v1",
"data": {...},
"actions": [...],
}
Step 4: Run test to verify it passes
Run: cd backend && uv run pytest tests/unit/core/agent/test_run_resume_service.py -k returns_ui_schema_v1 -v
Expected: PASS
Step 5: Commit
git add backend/src/core/agent/infrastructure/crewai/tools/create_calendar_event_tool.py backend/src/core/agent/domain/message_metadata.py backend/tests/unit/core/agent/test_run_resume_service.py
git commit -m "feat(agent): return tool results as ui schema v1"
Task 6: 工具结果对象存储(#2)
Files:
- Modify:
backend/src/core/agent/application/session_state_persistence.py - Modify:
backend/src/core/agent/domain/message_metadata.py - Test:
backend/tests/integration/core/agent/test_session_message_persistence.py
Step 1: Write the failing test
def test_large_tool_payload_persisted_to_object_storage():
meta = persist_large_tool_result(...)
assert meta.storage_bucket is not None
assert meta.storage_path is not None
Step 2: Run test to verify it fails
Run: cd backend && uv run pytest tests/integration/core/agent/test_session_message_persistence.py -k object_storage -v
Expected: FAIL,当前 metadata 为空。
Step 3: Write minimal implementation
payload_ref = await persist_tool_result_payload(...)
metadata.storage_bucket = payload_ref.bucket
metadata.storage_path = payload_ref.path
metadata.payload_sha256 = payload_ref.sha256
Step 4: Run test to verify it passes
Run: cd backend && uv run pytest tests/integration/core/agent/test_session_message_persistence.py -k object_storage -v
Expected: PASS
Step 5: Commit
git add backend/src/core/agent/application/session_state_persistence.py backend/src/core/agent/domain/message_metadata.py backend/tests/integration/core/agent/test_session_message_persistence.py
git commit -m "feat(agent): persist large tool results to object storage"
Task 7: 三阶段参数解耦(#4)
Files:
- Modify:
backend/src/core/agent/infrastructure/crewai/runtime.py - Modify:
backend/src/core/agent/infrastructure/config/resolver.py - Test:
backend/tests/unit/core/agent/test_config_resolver.py - Test:
backend/tests/unit/core/agent/test_crewai_runtime.py
Step 1: Write the failing test
def test_intent_stage_can_disable_tools():
cfg = load_stage_config(intent_tools=[])
result = run_intent_stage(cfg)
assert result.tool_calls == []
Step 2: Run test to verify it fails
Run: cd backend && uv run pytest tests/unit/core/agent/test_crewai_runtime.py -k intent_stage_can_disable_tools -v
Expected: FAIL,当前三阶段共享同一 llm/tools 配置。
Step 3: Write minimal implementation
stage_cfg = config.for_stage(stage)
run_stage(..., llm_config=stage_cfg.llm, tools=stage_cfg.tools)
Step 4: Run test to verify it passes
Run: cd backend && uv run pytest tests/unit/core/agent/test_crewai_runtime.py -k intent_stage_can_disable_tools -v
Expected: PASS
Step 5: Commit
git add backend/src/core/agent/infrastructure/crewai/runtime.py backend/src/core/agent/infrastructure/config/resolver.py backend/tests/unit/core/agent/test_config_resolver.py backend/tests/unit/core/agent/test_crewai_runtime.py
git commit -m "refactor(agent): decouple llm and tool strategy by stage"
Task 8: 多模态图片输入(文件上传)支持(#7)
Files:
- Modify:
backend/src/core/agent/domain/agui_input.py - Modify:
backend/src/core/agent/infrastructure/crewai/runtime.py - Modify:
backend/src/core/agent/infrastructure/litellm/client.py - Test:
backend/tests/unit/core/agent/test_litellm_client.py
Step 1: Write the failing test
def test_image_content_block_is_preserved_for_llm():
payload = build_multimodal_payload(text="分析图片", image_file="a.png")
assert payload_contains_image_block(payload)
Step 2: Run test to verify it fails
Run: cd backend && uv run pytest tests/unit/core/agent/test_litellm_client.py -k image_content_block_is_preserved -v
Expected: FAIL,当前非 text 被丢弃。
Step 3: Write minimal implementation
if item.type == "image":
blocks.append({"type": "image_url", "image_url": {"url": signed_file_url}})
Step 4: Run test to verify it passes
Run: cd backend && uv run pytest tests/unit/core/agent/test_litellm_client.py -k image_content_block_is_preserved -v
Expected: PASS
Step 5: Commit
git add backend/src/core/agent/domain/agui_input.py backend/src/core/agent/infrastructure/crewai/runtime.py backend/src/core/agent/infrastructure/litellm/client.py backend/tests/unit/core/agent/test_litellm_client.py
git commit -m "feat(agent): support multimodal image input blocks"
Task 9: 新增 ASR 同步转写 API(#8)
Files:
- Create:
backend/src/v1/agent/asr_router.py - Modify:
backend/src/v1/agent/router.py - Create:
backend/src/v1/agent/asr_service.py - Create:
backend/src/v1/agent/asr_schemas.py - Test:
backend/tests/integration/v1/agent/test_routes.py
Step 1: Write the failing test
def test_asr_transcribe_returns_sync_transcript(client, wav_file):
resp = client.post("/v1/agent/asr/transcribe", files={"audio": wav_file})
assert resp.status_code == 200
assert resp.json()["transcript"]
Step 2: Run test to verify it fails
Run: cd backend && uv run pytest tests/integration/v1/agent/test_routes.py -k asr_transcribe_returns_sync_transcript -v
Expected: FAIL,当前无路由。
Step 3: Write minimal implementation
@router.post("/asr/transcribe")
async def transcribe(audio: UploadFile) -> AsrTranscribeResponse:
text = await asr_service.transcribe(audio)
return AsrTranscribeResponse(transcript=text)
Step 4: Run test to verify it passes
Run: cd backend && uv run pytest tests/integration/v1/agent/test_routes.py -k asr_transcribe_returns_sync_transcript -v
Expected: PASS
Step 5: Commit
git add backend/src/v1/agent/asr_router.py backend/src/v1/agent/router.py backend/src/v1/agent/asr_service.py backend/src/v1/agent/asr_schemas.py backend/tests/integration/v1/agent/test_routes.py
git commit -m "feat(agent): add synchronous asr transcription endpoint"
Task 10: 全量验证与文档对齐
Files:
- Modify:
docs/runtime/runtime-route.md - Modify:
docs/bugs/2026-03-08-agent-tool-architecture.md(状态回填)
Step 1: Run targeted unit suite
Run: cd backend && uv run pytest tests/unit/core/agent -v
Expected: PASS
Step 2: Run targeted integration suite
Run: cd backend && uv run pytest tests/integration/core/agent tests/integration/v1/agent -v
Expected: PASS
Step 3: Run e2e smoke for agent flow
Run: cd backend && uv run pytest tests/e2e -k "agent or mobile_health" -v
Expected: PASS 或明确记录跳过原因
Step 4: Run quality gates
Run: cd backend && uv run ruff check src tests && uv run basedpyright
Expected: PASS
Step 5: Final commit
git add docs/runtime/runtime-route.md docs/bugs/2026-03-08-agent-tool-architecture.md
git commit -m "docs(agent): align runtime docs with new tool architecture"
Verification Evidence Requirements
实施完成时必须输出:
- 双金路径验证结果(无工具 + 工具审批后续跑)
- 三阶段 StepStarted/StepFinished 事件日志片段
- 安全验证结果(前端 history 篡改无效)
- ASR 同步转写接口请求/响应样例
- 关键命令输出摘要(pytest/ruff/basedpyright)
Notes
- 本计划不包含兼容逻辑保留。
- 本计划采用一次性切换。
- 若实施中出现 S2 -> S3 范围升级,先暂停并更新计划,再继续执行。