
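qzl 1c02503d1d refactor: simplify the AgentScope runtime module and event handling

- Remove redundant user_token parameter passing
- Refactor the tool.result event to use the ToolAgentOutput model
- Refactor the text.end event to use the WorkerAgentOutput model
- Simplify tool result handling in the store module
- Update router/service for the new event structure
- Clean up obsolete test files and design docs
- Add the AgentRuns multimodal storage design doc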

2026-03-13 17:27:18 +08:00


Agent Runs Multimodal Refactor Implementation Plan

For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

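Goal: Make runs/resume use real multimodal image input, and persist worker/tool output according to the new structured metadata spec.

Architecture: Keep the existing event pipeline and do not introduce a side-channel write path to the database. The request entry point validates the URL security boundary; the runtime converts binary blocks into image_url blocks the model can recognize; the event store uniformly validates WorkerAgentOutput / ToolAgentOutput and performs the content mapping.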

Tech Stack: FastAPI, Pydantic v2, SQLAlchemy AsyncSession, AgentScope, LiteLLM, Redis Stream

Task 1: Runs input security boundary

Files:

Modify: backend/src/core/agentscope/schemas/agui_input.py
Modify: backend/src/v1/agent/router.py
Modify: backend/src/v1/agent/service.py
Test: backend/tests/unit/v1/agent/test_agent_router.py

Step 1: Write the failing test

def test_runs_rejects_non_project_signed_url(...) -> None:
    payload = build_run_payload_with_binary_url("https://evil.example.com/storage/v1/object/sign/...")
    resp = client.post("/api/v1/agent/runs", json=payload, headers=auth_headers)
    assert resp.status_code == 422

Step 2: Run test to verify it fails

Run: pytest backend/tests/unit/v1/agent/test_agent_router.py::test_runs_rejects_non_project_signed_url -v Expected: FAIL (the URL is not currently blocked)

Step 3: Write minimal implementation

def validate_binary_signed_url_scope(*, url: str, user_id: UUID, thread_id: UUID) -> tuple[str, str]:
    bucket, path = supabase_service.parse_signed_url(url)
    # check host, bucket, path prefix agent-inputs/{user_id}/{thread_id}/uploads/
    return bucket, path

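Call the validator at the runs/resume request entry point; if the request contains a binary block and the current model does not support vision, raise HTTPException(status_code=422, ...).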
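The scope check from Step 3 can be sketched without the Supabase client, as a rough illustration only. The allowed host, the bucket name, and the use of ValueError are assumptions made for this example; the plan itself delegates parsing to supabase_service.parse_signed_url, and the request entry point would translate a failure into the 422 response.

```python
from urllib.parse import unquote, urlparse
from uuid import UUID

# Hypothetical values: the real project host and bucket are not stated in this plan.
ALLOWED_HOST = "project.supabase.co"
ALLOWED_BUCKET = "agent-uploads"
SIGN_PREFIX = "/storage/v1/object/sign/"


def validate_binary_signed_url_scope(*, url: str, user_id: UUID, thread_id: UUID) -> tuple[str, str]:
    """Return (bucket, path) if the signed URL stays inside the caller's upload scope."""
    parsed = urlparse(url)
    # Host check: only signed URLs issued by this project's storage are accepted.
    if parsed.scheme != "https" or parsed.hostname != ALLOWED_HOST:
        raise ValueError("signed URL host is outside the project")
    if not parsed.path.startswith(SIGN_PREFIX):
        raise ValueError("URL is not a signed storage object URL")

    # Everything after the sign prefix is "<bucket>/<object path>".
    bucket, _, object_path = unquote(parsed.path[len(SIGN_PREFIX):]).partition("/")
    if bucket != ALLOWED_BUCKET:
        raise ValueError("signed URL bucket is not allowed")

    # Path prefix check: the object must live under this user's and thread's uploads.
    expected_prefix = f"agent-inputs/{user_id}/{thread_id}/uploads/"
    if ".." in object_path.split("/") or not object_path.startswith(expected_prefix):
        raise ValueError("signed URL path is outside the user's thread uploads")
    return bucket, object_path
```

Raising a plain ValueError keeps the helper independent of FastAPI, so the router can decide how to map the failure to an HTTP status.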

Step 4: Run test to verify it passes

Run: pytest backend/tests/unit/v1/agent/test_agent_router.py::test_runs_rejects_non_project_signed_url -v Expected: PASS

Step 5: Commit

git add backend/src/core/agentscope/schemas/agui_input.py backend/src/v1/agent/router.py backend/src/v1/agent/service.py backend/tests/unit/v1/agent/test_agent_router.py
git commit -m "fix: enforce signed image url scope on runs"

Task 2: Runtime multimodal pass-through (remove text-serialized images)

Files:

Modify: backend/src/core/agentscope/runtime/orchestrator.py
Modify: backend/src/core/agentscope/prompts/agent_prompt.py
Test: backend/tests/unit/core/agentscope/runtime/test_orchestrator.py

Step 1: Write the failing test

async def test_orchestrator_passes_image_url_block_to_runner() -> None:
    command = build_run_input_with_binary("https://project.supabase.co/storage/v1/object/sign/...")
    await orchestrator.run(..., command=command, ...)
    assert fake_runner.user_input[1]["type"] == "image_url"

Step 2: Run test to verify it fails

Run: pytest backend/tests/unit/core/agentscope/runtime/test_orchestrator.py::test_orchestrator_passes_image_url_block_to_runner -v Expected: FAIL (the current path may still serialize images as text)

Step 3: Write minimal implementation

def _to_model_multimodal_blocks(content_blocks: list[dict[str, Any]]) -> list[dict[str, Any]]:
    # text -> {"type": "text", "text": ...}
    # binary -> {"type": "image_url", "image_url": {"url": ...}}
    ...  # conversion body omitted in this plan

Change the runner input to the multimodal blocks above; image blocks must not be concatenated into a plain string.

Step 4: Run test to verify it passes

Run: pytest backend/tests/unit/core/agentscope/runtime/test_orchestrator.py::test_orchestrator_passes_image_url_block_to_runner -v Expected: PASS

Step 5: Commit

git add backend/src/core/agentscope/runtime/orchestrator.py backend/src/core/agentscope/prompts/agent_prompt.py backend/tests/unit/core/agentscope/runtime/test_orchestrator.py
git commit -m "feat: pass image blocks as multimodal payload to model"

Task 3: Structured worker persistence (content=answer)

Files:

Modify: backend/src/core/agentscope/events/store.py
Modify: backend/src/core/agentscope/runtime/orchestrator.py
Test: backend/tests/unit/core/agentscope/events/test_store.py

Step 1: Write the failing test

async def test_text_message_end_persists_worker_output_and_answer_content() -> None:
    event = build_text_end_event(worker_agent_output={"answer": "ok", ...})
    await store.persist(event)
    assert saved.content == "ok"
    assert saved.metadata_json["worker_agent_output"]["answer"] == "ok"

Step 2: Run test to verify it fails

Run: pytest backend/tests/unit/core/agentscope/events/test_store.py::test_text_message_end_persists_worker_output_and_answer_content -v Expected: FAIL

Step 3: Write minimal implementation

worker = WorkerAgentOutput.model_validate(event.get("workerAgentOutput") or {})
content = worker.answer
metadata["worker_agent_output"] = worker.model_dump(mode="json")

The orchestrator writes workerAgentOutput into the data of the text.end event.
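The mapping in Step 3 can be exercised in isolation as follows. This is a sketch under assumptions: the WorkerAgentOutput class here is a stand-in with only the answer field (the only field this plan confirms), and map_text_end_event is a hypothetical helper name, not a function in store.py.

```python
from typing import Any

from pydantic import BaseModel, ValidationError


class WorkerAgentOutput(BaseModel):
    # Stand-in model: the real one lives in the project's schemas and has more fields.
    answer: str


def map_text_end_event(event: dict[str, Any]) -> tuple[str, dict[str, Any]]:
    """Validate the worker payload and derive (content, metadata) for persistence."""
    worker = WorkerAgentOutput.model_validate(event.get("workerAgentOutput") or {})
    metadata = {"worker_agent_output": worker.model_dump(mode="json")}
    # content mirrors the answer; the full structure is kept in metadata.
    return worker.answer, metadata


content, metadata = map_text_end_event({"workerAgentOutput": {"answer": "ok"}})

try:
    # An event without the payload validates an empty dict and is rejected.
    map_text_end_event({})
except ValidationError:
    print("event without workerAgentOutput is rejected")
```

Because the fallback is an empty dict, a missing payload surfaces as a validation error instead of an empty message being stored.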

Step 4: Run test to verify it passes

Run: pytest backend/tests/unit/core/agentscope/events/test_store.py::test_text_message_end_persists_worker_output_and_answer_content -v Expected: PASS

Step 5: Commit

git add backend/src/core/agentscope/events/store.py backend/src/core/agentscope/runtime/orchestrator.py backend/tests/unit/core/agentscope/events/test_store.py
git commit -m "refactor: persist worker output schema with answer as message content"

Task 4: Structured tool persistence (content=result_summary) and removal of the legacy summary logic

Files:

Modify: backend/src/core/agentscope/events/store.py
Modify: backend/src/core/agentscope/runtime/orchestrator.py
Delete: backend/src/core/agentscope/events/tool_result_summary.py
Test: backend/tests/unit/core/agentscope/events/test_store.py

Step 1: Write the failing test

async def test_tool_result_persists_tool_output_and_summary_content() -> None:
    event = build_tool_result_event(tool_agent_output={"result_summary": "done", ...})
    await store.persist(event)
    assert saved.content == "done"
    assert saved.metadata_json["tool_agent_output"]["result_summary"] == "done"

Step 2: Run test to verify it fails

Run: pytest backend/tests/unit/core/agentscope/events/test_store.py::test_tool_result_persists_tool_output_and_summary_content -v Expected: FAIL

Step 3: Write minimal implementation

tool = ToolAgentOutput.model_validate(event.get("toolAgentOutput") or {})
content = tool.result_summary
metadata["tool_agent_output"] = tool.model_dump(mode="json")

Remove the build_tool_content_summary imports and calls.

Step 4: Run test to verify it passes

Run: pytest backend/tests/unit/core/agentscope/events/test_store.py::test_tool_result_persists_tool_output_and_summary_content -v Expected: PASS

Step 5: Commit

git add backend/src/core/agentscope/events/store.py backend/src/core/agentscope/runtime/orchestrator.py backend/tests/unit/core/agentscope/events/test_store.py backend/src/core/agentscope/events/tool_result_summary.py
git commit -m "refactor: persist tool output schema and remove legacy summary builder"

Task 5: Consolidate the worker output model alias (optional second phase)

Files:

Modify: backend/src/schemas/agent/runtime_models.py
Modify: backend/src/schemas/messages/chat_message.py
Test: backend/tests/unit/schemas/agent/test_runtime_models.py

Step 1: Write the failing test

def test_worker_output_lite_disallows_ui_hints() -> None:
    with pytest.raises(ValidationError):
        WorkerAgentOutputLite.model_validate({... , "ui_hints": {...}})

Step 2: Run test to verify it fails

Run: pytest backend/tests/unit/schemas/agent/test_runtime_models.py::test_worker_output_lite_disallows_ui_hints -v Expected: depends on the current state (if it already fails, keep it as a guard test)

Step 3: Write minimal implementation

WorkerAgentOutput = WorkerAgentOutputLite | WorkerAgentOutputRich

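To avoid widening the change, the current definitions can be kept as they are, with only a comment added noting that resolve_worker_output_model determines the runtime constraints.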
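The interaction between the union alias and the lite restriction can be illustrated with a small sketch. The field definitions below are assumptions (only answer and ui_hints appear in this plan), and extra="forbid" is one way the lite model could reject ui_hints, not necessarily how the project does it.

```python
from typing import Any, Optional, Union

from pydantic import BaseModel, ConfigDict, TypeAdapter, ValidationError


class WorkerAgentOutputLite(BaseModel):
    # Assumed config: unknown keys such as ui_hints are rejected.
    model_config = ConfigDict(extra="forbid")
    answer: str


class WorkerAgentOutputRich(BaseModel):
    model_config = ConfigDict(extra="forbid")
    answer: str
    ui_hints: Optional[dict[str, Any]] = None


WorkerAgentOutput = Union[WorkerAgentOutputLite, WorkerAgentOutputRich]

payload = {"answer": "ok", "ui_hints": {"layout": "card"}}

# The lite model on its own refuses the payload.
try:
    WorkerAgentOutputLite.model_validate(payload)
    lite_rejected = False
except ValidationError:
    lite_rejected = True

# Validating against the union accepts it, because the rich member matches.
via_union = TypeAdapter(WorkerAgentOutput).validate_python(payload)
```

This is why the alias alone does not enforce lite mode: validation against the union falls through to the rich model, so the mode-specific model still has to be chosen at runtime, which is the role the plan assigns to resolve_worker_output_model.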

Step 4: Run test to verify it passes

Run: pytest backend/tests/unit/schemas/agent/test_runtime_models.py -v Expected: PASS

Step 5: Commit

git add backend/src/schemas/agent/runtime_models.py backend/src/schemas/messages/chat_message.py backend/tests/unit/schemas/agent/test_runtime_models.py
git commit -m "refactor: clarify worker output model contract for lite and rich modes"

Task 6: End-to-end regression and documentation sync

Files:

Modify: docs/protocols/agent-chat-messages.md
Modify: docs/runtime/runtime-route.md

Step 1: Run targeted backend tests

Run: pytest backend/tests/unit/v1/agent/test_agent_router.py backend/tests/unit/core/agentscope/runtime/test_orchestrator.py backend/tests/unit/core/agentscope/events/test_store.py -v Expected: PASS

Step 2: Run lint/type checks

Run: cd backend && ruff check src tests && mypy src Expected: PASS

Step 3: Update docs for new contracts

Document the URL security boundary for runs and the 422 error code.
Document the persistence contract for worker_agent_output/tool_agent_output and the content mapping rules.

Step 4: Final verification

Run: pytest backend/tests -q Expected: PASS

Step 5: Commit

git add docs/protocols/agent-chat-messages.md docs/runtime/runtime-route.md
git commit -m "docs: align runs multimodal and structured persistence contracts"
