# AgentScope Skill + CLI Tool Refactor Implementation Checklist

## Purpose

This file is the execution checklist for implementing the PRD in:

- `.trellis/tasks/04-20-refactor-tool-cli-skill-ui-schema/prd.md`

Use this document as the working guide during implementation.
Do not mark an item complete until the code, docs, and verification for that item are actually done.

## Required Standards Read Before Backend Changes

- [x] Read `backend/AGENTS.md`
- [x] Read `.trellis/spec/backend/index.md`
- [x] Read `.trellis/spec/backend/database-guidelines.md`
- [x] Read `.trellis/spec/backend/error-handling.md`
- [x] Read `.trellis/spec/backend/logging-guidelines.md`
- [x] Read `.trellis/spec/backend/quality-guidelines.md`
- [x] Confirm `.trellis/spec/backend/type-safety.md` does not exist; use current backend schema/type rules from `backend/AGENTS.md` and repository code as the effective type-safety baseline

## Non-Negotiable Constraints

- [x] Protocol docs are updated before implementation changes that alter contracts
- [ ] New backend runtime code reads configuration only through `core.config.settings`
- [ ] New backend runtime code uses project logging, never `print()`
- [ ] New backend errors follow RFC 7807 with stable `code`
- [ ] Any new or changed error codes are updated in `docs/protocols/common/http-error-codes.md`
- [ ] Repository/service layering remains intact
- [ ] `owner_id` is never treated as a credential
- [ ] No new error swallowing is introduced
- [x] `ToolAgentOutput.result` remains the canonical machine-oriented tool result field

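As an illustration of the "RFC 7807 with stable `code`" constraint above, the following is a minimal sketch of a problem object. The helper name `build_problem` and the code value `TOOL_CLI_EXIT_NONZERO` are hypothetical; the project's real codes live in `docs/protocols/common/http-error-codes.md`, and its actual error classes are not shown in this checklist.

```python
from http import HTTPStatus
from typing import Any, Dict


def build_problem(
    status: int, code: str, detail: str, type_uri: str = "about:blank"
) -> Dict[str, Any]:
    """Build an RFC 7807 problem object with a stable `code` extension member."""
    return {
        "type": type_uri,                     # RFC 7807 default when no type URI is defined
        "title": HTTPStatus(status).phrase,   # short, human-readable summary
        "status": status,
        "detail": detail,
        "code": code,                         # extension member; clients branch on this
    }


# Hypothetical code value, used only for illustration.
problem = build_problem(502, "TOOL_CLI_EXIT_NONZERO", "CLI tool exited with a non-zero status")
```

Clients can then match on `code` rather than on `title` or `detail`, which are free to change wording.
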
## Execution Order

- [ ] Phase 0 completed before any runtime contract change is implemented
- [x] Phase 1 completed before replacing tool execution with CLI-backed wrappers
- [ ] Phase 2 completed before auth credential transport is wired into queue/runtime
- [ ] Phase 3 completed before frontend contract alignment begins
- [ ] Phase 4 completed before cleanup is considered done
- [ ] Phase 5 verification completed before task is marked finished

## Phase 0: Protocol Docs First

### 0.1 Define the tool protocol source of truth

- [x] Add `docs/protocols/agent/tool-protocol.md`
- [x] Document that CLI execution produces structured `result` as the source payload
- [x] Document that `ToolResponse` only carries the text projection of `result`
- [x] Document that runtime tool post-processing reconstructs full `ToolAgentOutput`
- [x] Document that tool post-processing is responsible for `status`, `error`, and `ui_hints`
- [x] Document that `message.content` is the full JSON text projection of `ToolAgentOutput.result`
- [x] Document that `ToolAgentOutput` is used for SSE, persistence, history recovery, and context rebuild
- [x] Document CLI input channel split: `argv` primary, `stdin` secondary, environment variables for controlled auth injection
- [x] Document stdout JSON shape and non-zero exit semantics
- [x] Document that shell execution is not exposed; router is whitelist-only

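The input channel split and the stdout/exit-code rules in 0.1 can be sketched as follows. This is only an illustration under stated assumptions: `FAKE_CLI` is a stand-in script, not the project CLI entrypoint, and the environment variable name `TOOL_AUTH_TOKEN`, the `success`/`error` status values, and the error codes are all hypothetical.

```python
import json
import os
import subprocess
import sys
from typing import Any, Dict, List, Optional

# Stand-in for the project CLI (hypothetical). It echoes argv and stdin back
# as JSON and exits non-zero when the credential variable is missing.
FAKE_CLI = (
    "import json, os, sys\n"
    "if 'TOOL_AUTH_TOKEN' not in os.environ:\n"
    "    print(json.dumps({'error': 'missing credential'}))\n"
    "    sys.exit(2)\n"
    "print(json.dumps({'result': {'argv': sys.argv[1:], 'stdin': sys.stdin.read()}}))\n"
)


def run_cli(
    args: List[str], stdin_text: str = "", token: Optional[str] = None
) -> Dict[str, Any]:
    """Run the CLI with argv as the primary channel and stdin as the secondary one."""
    env = dict(os.environ)
    env.pop("TOOL_AUTH_TOKEN", None)
    if token is not None:
        env["TOOL_AUTH_TOKEN"] = token  # credential travels only through the environment
    proc = subprocess.run(
        [sys.executable, "-c", FAKE_CLI, *args],  # argument list, never a shell string
        input=stdin_text,
        capture_output=True,
        text=True,
        env=env,
        timeout=30,
    )
    try:
        payload = json.loads(proc.stdout)
    except json.JSONDecodeError:
        payload = None
    if not isinstance(payload, dict):
        return {"status": "error", "error": {"code": "CLI_MALFORMED_STDOUT"}, "result": None}
    if proc.returncode != 0:
        error = {
            "code": "CLI_NONZERO_EXIT",
            "exit_code": proc.returncode,
            "detail": payload.get("error"),
        }
        return {"status": "error", "error": error, "result": None}
    return {"status": "success", "error": None, "result": payload.get("result")}


ok = run_cli(["calendar", "list"], stdin_text="{}", token="demo-token")
denied = run_cli(["calendar", "list"])
```

Passing an argument list to `subprocess.run` without `shell=True` keeps the adapter in line with the "shell execution is not exposed" rule.
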
### 0.2 Remove `ui_schema` from active protocol

- [x] Update `docs/protocols/ui/data-flow.md`
- [x] Replace worker-driven UI source descriptions with tool-driven `ui_hints`
- [x] Explicitly document that worker output no longer includes `ui_hints`
- [x] Explicitly document that history tool UI recovery reads `metadata.tool_agent_output.ui_hints` and compiles to `ui_schema`
- [x] Update `docs/protocols/ui/ui-schema.md`
- [x] Clarify that `ui_hints` is the descriptive UI representation (source), `ui_schema` is the rendered format (wire format)
- [x] Clarify that frontend renderer continues to consume `ui_schema`
- [x] Document that `ui_hints → ui_schema` compilation path remains unchanged, only `ui_hints` source changes

### 0.3 Update SSE and HTTP contracts

- [x] Update `docs/protocols/agent/sse-events.md`
- [x] Remove worker `ui_hints` from `TEXT_MESSAGE_END`
- [x] Define `TOOL_CALL_RESULT` payload with `ui_schema` (compiled from `ui_hints`)
- [x] Document that `ui_hints → ui_schema` compilation happens in backend codec
- [x] Provide examples where `result` is object-shaped instead of string-shaped
- [x] Update `docs/protocols/agent/api-endpoints.md`
- [x] Define `/history` response contract for tool UI replay from `metadata.tool_agent_output.ui_hints` compiled to `ui_schema`
- [x] Remove any statement that `/history` is assistant-only UI-wise if tool UI replay is now supported
- [x] Update `docs/protocols/agent/run-agent-input.md`
- [x] Clarify that frontend does not submit auth token as tool arg
- [x] Clarify that backend-controlled tool registration remains backend-owned

### 0.4 Update auth and automation protocol docs

- [x] Update `docs/protocols/models/auth.md`
- [x] Define controlled credential purpose, TTL, scope, and audit expectations
- [x] Define relationship between normal bearer issuer and automation credential issuer
- [x] Update `docs/protocols/models/automation-jobs.md`
- [x] State that `owner_id` is only an identity reference, not a credential
- [x] Document automation credential issuance path before queue/runtime execution
- [x] Update `docs/protocols/common/http-error-codes.md` if new codes are introduced for CLI/runtime/credential failures

### 0.5 Phase 0 verification

- [x] Confirm protocol docs no longer describe worker `ui_hints` as UI source
- [x] Confirm protocol docs explicitly document `ui_hints → ui_schema` compilation path
- [x] Confirm docs explicitly define `ToolResponse` vs `ToolAgentOutput` responsibility split
- [x] Confirm docs explicitly define `/history` tool UI replay path (from `ui_hints` compiled to `ui_schema`)
- [x] Confirm docs explicitly define controlled credential transport and TTL

## Phase 1: Backend Contract Models And Persistence Path

### 1.1 Refactor runtime schemas

- [x] Update `backend/src/schemas/agent/runtime_models.py`
- [x] Remove `WorkerAgentOutputRich.ui_hints`
- [x] Remove `AgentOutput` inheritance that depends on worker UI payload
- [x] Make `resolve_worker_output_model()` return the non-UI worker output model path
- [x] Change `ToolAgentOutput.result` from `str` to JSON-native structured payload type
- [x] Add `ui_hints` to `ToolAgentOutput`
- [x] Keep `ToolAgentOutput` strict with `extra="forbid"`
- [x] Review any validator changes required to keep result deterministic and JSON-native

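One way to read "JSON-native" in 1.1 is that `result` may contain only values that JSON itself can represent. The checker below is a standalone sketch of that idea using only the standard library; the function name `is_json_native` is hypothetical and the real validator in `runtime_models.py` may be written differently.

```python
import math
from typing import Any


def is_json_native(value: Any) -> bool:
    """Return True if `value` consists only of JSON-representable Python values."""
    if value is None or isinstance(value, (bool, int, str)):
        return True
    if isinstance(value, float):
        return math.isfinite(value)  # NaN and Infinity are not valid JSON numbers
    if isinstance(value, list):
        return all(is_json_native(item) for item in value)
    if isinstance(value, dict):
        # JSON object keys must be strings.
        return all(isinstance(k, str) and is_json_native(v) for k, v in value.items())
    return False  # tuples, sets, datetimes, custom objects, and so on


accepted = is_json_native({"events": [{"id": 1, "title": "standup", "done": False}]})
rejected = is_json_native({"when": object()})
```

Rejecting tuples and non-string keys here avoids values that `json.dumps` would silently coerce, which would make the stored payload differ from the in-memory one.
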
### 1.2 Update chat message metadata schema consumers

- [x] Review `backend/src/schemas/domain/chat_message.py`
- [x] Ensure `tool_agent_output` accepts the updated structured `ToolAgentOutput`
- [x] Confirm metadata serialization remains compatible with persistence and context cache usage

### 1.3 Separate ToolResponse from ToolAgentOutput

- [x] Update `backend/src/core/agentscope/tools/utils/tool_response_builder.py`
- [x] Stop serializing full `ToolAgentOutput` directly into `ToolResponse.content`
- [x] Make `build_tool_response()` emit only the text projection of `result`
- [x] Decide and implement the helper that projects structured `result` to stable JSON text
- [x] Update error response builder to follow the same split cleanly

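The "stable JSON text" projection mentioned in 1.3 could look like the sketch below. The names `project_result` and `parse_projection` are hypothetical, and the choice of sorted keys with compact separators is an assumption about what "stable" means here, not the project's confirmed rule.

```python
import json
from typing import Any


def project_result(result: Any) -> str:
    """Project a structured result to deterministic JSON text."""
    return json.dumps(
        result,
        sort_keys=True,          # key order no longer depends on insertion order
        separators=(",", ":"),   # no incidental whitespace
        ensure_ascii=False,      # keep non-ASCII text readable
        allow_nan=False,         # refuse NaN/Infinity instead of emitting invalid JSON
    )


def parse_projection(text: str) -> Any:
    """Recover the structured result from its text projection."""
    return json.loads(text)


text_a = project_result({"b": 1, "a": [1, 2]})
text_b = project_result({"a": [1, 2], "b": 1})
```

Because the same function can serve both the hot path and the context rebuild, two logically equal results always produce the same `message.content`.
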
### 1.4 Add tool post-processing path

- [x] Introduce a runtime tool post-processing module in backend tool/runtime layer
- [x] Define the post-processor input contract from raw tool execution result
- [x] Define the post-processor output as full `ToolAgentOutput`
- [x] Ensure post-processor is the only place generating `ui_hints` for tools
- [x] Ensure worker code does not generate tool UI fields anymore

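The post-processing step in 1.4 can be pictured as a single function that turns a raw execution result into a full `ToolAgentOutput`. The sketch below uses a plain dataclass with the four fields this checklist names (`status`, `result`, `error`, `ui_hints`); the raw input shape, the hint structure, the status values, and the error code are all hypothetical, and the real model may carry more fields.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class ToolAgentOutput:
    """Illustrative stand-in for the real model; field set taken from this checklist."""
    status: str
    result: Any
    error: Optional[Dict[str, Any]] = None
    ui_hints: List[Dict[str, Any]] = field(default_factory=list)


def post_process(tool_name: str, raw: Dict[str, Any]) -> ToolAgentOutput:
    """Build the full output; this is the only place that produces ui_hints."""
    if raw.get("exit_code", 0) != 0 or "result" not in raw:
        error = {"code": "TOOL_EXECUTION_FAILED", "detail": raw.get("stderr", "")}
        return ToolAgentOutput(status="error", result=None, error=error)
    # Hypothetical hint shape: the real ui_hints contract is defined in the protocol docs.
    hints = [{"kind": "tool_result", "tool": tool_name}]
    return ToolAgentOutput(status="success", result=raw["result"], ui_hints=hints)


ok_output = post_process("calendar.list", {"exit_code": 0, "result": {"events": []}})
failed_output = post_process("calendar.list", {"exit_code": 1, "stderr": "boom"})
```
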
### 1.5 Update parsing and stage emission

- [x] Update `backend/src/core/agentscope/utils/parsing.py`
- [x] Stop assuming text blocks contain full serialized `ToolAgentOutput`
- [x] Add helpers to parse the text projection back into structured result where required
- [x] Update `backend/src/core/agentscope/runtime/stage_emitter.py`
- [x] Remove worker `ui_hints` emission from final text events
- [x] Emit `TOOL_CALL_RESULT` based on full post-processed `ToolAgentOutput`
- [x] Ensure emitted tool payload carries structured `result` and `ui_hints`

### 1.6 Update AG-UI codec and event storage

- [x] Update `backend/src/core/agentscope/events/agui_codec.py`
- [x] Remove worker `ui_hints → ui_schema` compilation path
- [x] Remove `ui_schema`-specific output shaping
- [x] Ensure tool events pass through tool-derived `ui_hints`
- [x] Update `backend/src/core/agentscope/events/store.py`
- [x] Persist tool message `content` as the JSON text projection of `result`
- [x] Persist full post-processed `ToolAgentOutput` in metadata
- [x] Ensure worker metadata no longer expects `ui_hints`

### 1.7 Unify cold/hot runtime paths

- [x] Update `backend/src/core/agentscope/runtime/tasks.py`
- [x] Replace `_serialize_tool_agent_output()` assumptions that rely on old `ToolAgentOutput` shape
- [x] Ensure context rebuild uses the same content projection rule as hot-path execution
- [x] Stop rebuilding tool context from legacy string-only result assumptions
- [x] Review `backend/src/core/agentscope/caches/context_messages_cache.py`
- [x] Define whether old cache payloads are backward-read compatible or intentionally invalidated
- [x] Ensure runtime cold path and hot path see the same tool message shape

### 1.8 Update `/history` backend shaping

- [x] Update `backend/src/v1/agent/utils.py`
- [x] Remove worker `ui_hints` compilation logic
- [x] Stop returning `ui_schema`
- [x] Add tool UI replay logic from `metadata.tool_agent_output.ui_hints`
- [x] Keep user attachment handling intact
- [x] Update `backend/src/v1/agent/schemas.py`
- [x] Remove `UiSchemaRenderer` dependency from `HistoryMessage`
- [x] Redefine history response shape to carry tool UI replay payload
- [x] Update role constraints if tool-derived history items need explicit representation
- [x] Review `backend/src/v1/agent/repository.py` for any history query assumptions that prevent tool UI replay

### 1.9 Phase 1 verification

- [x] Unit tests cover `ToolAgentOutput.result` as structured payload
- [x] Unit tests confirm worker output schema no longer includes `ui_hints`
- [x] Unit tests confirm `ToolResponse` no longer embeds full `ToolAgentOutput`
- [x] Unit tests confirm event store persists full `ToolAgentOutput` metadata and projected content separately
- [x] Unit tests confirm `/history` shaping no longer emits `ui_schema`
- [x] Unit tests confirm tool UI replay uses `metadata.tool_agent_output.ui_hints`

## Phase 2: CLI-Backed Tools And Skill Registration

### 2.1 Replace direct Python tool registration

- [x] Update `backend/src/core/agentscope/tools/tool_config.py`
- [x] Replace function-name-centric mapping with CLI capability/wrapper-centric mapping
- [x] Unify config and runtime skill selection on `enabled_skills`
- [x] Keep approval config support aligned with the new tool names
- [x] Update `backend/src/core/agentscope/tools/toolkit.py`
- [x] Remove direct imports of `custom/calendar.py`, `custom/memory.py`, `custom/user_lookup.py`
- [x] Register CLI-backed wrappers instead of Python business functions
- [x] Preserve `enabled_skills` filtering behavior

### 2.2 Add CLI adapter, router, and entrypoint

- [x] Add a CLI adapter module in `backend/src/core/agentscope/tools/`
- [x] Adapter must invoke only the project CLI entrypoint
- [x] Adapter must pass args via `argv` primarily and `stdin` secondarily where required
- [x] Adapter must inject auth credential only via controlled environment variables
- [x] Adapter must parse stdout JSON and map failures to structured errors
- [x] Add a command router module in `backend/src/core/agentscope/tools/`
- [x] Router must be whitelist-only
- [x] Router must map commands to Python handlers
- [x] Router must not expose generic shell execution
- [x] Add a Python console entrypoint module in `backend/src/core/agentscope/tools/`
- [x] Update `pyproject.toml` with the console script entry

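The router requirements in 2.2 (whitelist-only, command-to-handler mapping, no generic shell execution) amount to a fixed lookup table. The sketch below shows that pattern; the command name `calendar.list`, the handler, and the exception class are hypothetical examples, not the project's actual command set.

```python
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any]], Dict[str, Any]]


class UnknownCommandError(Exception):
    """Raised when a command is not in the whitelist."""


def list_events(args: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical handler; a real one would call the allowed service layer.
    return {"events": [], "limit": args.get("limit", 10)}


# The whitelist: only commands listed here can ever run.
ROUTES: Dict[str, Handler] = {
    "calendar.list": list_events,
}


def dispatch(command: str, args: Dict[str, Any]) -> Dict[str, Any]:
    """Look the command up in the whitelist; never fall back to a shell."""
    handler = ROUTES.get(command)
    if handler is None:
        raise UnknownCommandError(command)
    return handler(args)


routed = dispatch("calendar.list", {"limit": 5})
```

Since unknown commands raise instead of being passed to `subprocess` or `os.system`, adding a capability always requires an explicit entry in `ROUTES`.
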
### 2.3 Migrate tool implementations to CLI handlers

- [x] Replace old `backend/src/core/agentscope/tools/custom/*.py` direct runtime tools with CLI handler implementations
- [x] Remove old direct AgentScope tool-function implementations from final runtime wiring
- [x] Ensure new handlers only call allowed internal services/repositories
- [x] Ensure handler boundaries follow schema -> repository -> service layering
- [x] Ensure handlers raise typed errors instead of transport exceptions where applicable

### 2.4 Register AgentScope skills

- [x] Populate `backend/src/core/agentscope/tools/custom` with skill assets using AgentScope-native layout
- [x] Add required `SKILL.md` files
- [x] Ensure skill content explains when to use each tool and how to compose them
- [x] Register skills through AgentScope-native registration path in toolkit/runtime setup
- [x] Ensure skill assets are included in runtime/deployment packaging

### 2.5 Update runner and middleware linkages

- [x] Update `backend/src/core/agentscope/runtime/runner.py`
- [x] Build toolkit from CLI-backed wrappers instead of Python functions
- [x] Keep `enabled_skills` and stage-based selection behavior intact
- [x] Update `backend/src/core/agentscope/tools/tool_middleware.py`
- [x] Ensure middleware name resolution still works with the new tool registration path
- [x] Update `backend/src/core/agentscope/prompts/agent_prompt.py`
- [x] Remove any prompt assumptions that still act as pseudo-skill behavior
- [x] Keep prompt aligned with skill-driven disclosure instead of duplicating the full tool contract

### 2.6 Phase 2 verification

- [x] Unit tests cover CLI adapter success path
- [x] Unit tests cover CLI adapter malformed stdout path
- [x] Unit tests cover CLI adapter non-zero exit path
- [x] Unit tests confirm toolkit only registers enabled CLI-backed tools
- [x] Unit tests confirm middleware still recognizes the active tool names
- [x] Smoke test confirms AgentScope skill registration succeeds from project skill assets

## Phase 3: Controlled Credential And Queue Transport

### 3.1 Define backend auth runtime objects

- [x] Review `backend/src/core/auth/models.py`
- [x] Add any missing auth runtime model needed for controlled credential transport
- [x] Keep `CurrentUser` as identity model if still appropriate, but do not overload it as credential carrier without an explicit design

### 3.2 Add controlled credential issuance path

- [x] Add a credential issuer service under `backend/src/core/auth/` or another appropriate auth module
- [x] Keep issuer in the same trust boundary as current bearer token issuing system
- [x] Ensure issued credential is short-lived according to PRD target
- [x] Ensure issuer encodes only the minimal scope required for tool execution
- [x] Ensure logs do not expose raw credentials

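To make the "short-lived" and "minimal scope" requirements in 3.2 concrete, here is a toy issuer built on an HMAC-signed payload. It is not the project's credential format: the token layout, the `SECRET` constant, the scope string `tool:calendar`, and the 60-second TTL are all assumptions chosen for illustration, and the PRD's actual TTL target is not stated in this checklist.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Any, Dict, List, Optional

SECRET = b"demo-only-secret"  # placeholder; a real key would come from settings


def _sign(body: str) -> str:
    return hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()


def issue_credential(
    subject: str, scope: List[str], ttl_seconds: int, now: Optional[float] = None
) -> str:
    """Issue a signed credential that expires `ttl_seconds` after issuance."""
    issued_at = int(time.time() if now is None else now)
    claims = {"sub": subject, "scope": scope, "exp": issued_at + ttl_seconds}
    body = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode()).decode()
    return f"{body}.{_sign(body)}"


def verify_credential(
    token: str, required_scope: str, now: Optional[float] = None
) -> Optional[Dict[str, Any]]:
    """Return the claims if signature, expiry, and scope all check out, else None."""
    body, _, signature = token.partition(".")
    if not hmac.compare_digest(signature.encode(), _sign(body).encode()):
        return None
    claims = json.loads(base64.urlsafe_b64decode(body.encode()))
    current = time.time() if now is None else now
    if current >= claims["exp"] or required_scope not in claims["scope"]:
        return None
    return claims


token = issue_credential("user-1", ["tool:calendar"], ttl_seconds=60, now=1000)
valid_claims = verify_credential(token, "tool:calendar", now=1030)
```

Note that only the claims, never the token string, would be safe to log, which matches the "logs do not expose raw credentials" item.
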
### 3.3 Wire chat enqueue path

- [x] Update `backend/src/v1/agent/service.py`
- [x] Stop enqueueing only `owner_id` for runtime auth purposes
- [x] Enqueue the controlled credential or resolvable credential handle required by worker runtime
- [x] Ensure queue payload does not expose raw token in model-visible fields
- [x] Keep session ownership checks intact

### 3.4 Wire automation dispatch path

- [x] Update `backend/src/core/automation/scheduler.py`
- [x] Stop creating runtime auth solely as `CurrentUser(id=owner_id)`
- [x] Issue or obtain automation controlled credential before enqueueing run
- [x] Ensure `owner_id` remains only a lookup/reference input
- [x] Ensure automation runtime uses the same CLI auth injection mechanism as chat runtime

### 3.5 Update task runtime injection

- [x] Update `backend/src/core/agentscope/runtime/tasks.py`
- [x] Read controlled credential from queued command payload
- [x] Inject controlled credential into CLI runtime environment variables
- [x] Remove any path that implicitly depends on `owner_id` as execution credential
- [x] Keep user-context loading behavior explicit and separate from auth credential handling

### 3.6 Add settings and error mapping

- [x] Update `backend/src/core/config/settings.py` for any new CLI/credential configuration
- [x] Keep new config values typed and centralized
- [x] Update error handling paths to use stable problem codes for credential/CLI failures
- [x] Update `docs/protocols/common/http-error-codes.md` if these codes are new

### 3.7 Phase 3 verification

- [x] Unit tests confirm chat enqueue includes required controlled credential transport data
- [x] Unit tests confirm automation dispatch no longer relies on `owner_id` as credential
- [x] Unit tests confirm task runtime injects controlled credential only via env vars
- [x] Unit tests confirm credential issuance TTL and scope constraints
- [x] Logs and error payloads do not expose raw credentials

## Phase 4: Frontend Contract Alignment

### 4.1 Update event parsing

- [x] Update `apps/lib/core/chat/ag_ui_event.dart`
- [x] Remove active wire parsing paths that depend on `ui_schema`
- [x] Parse tool event `ui_hints` directly from updated payload contract
- [x] Parse structured `result` instead of string-only assumptions

### 4.2 Update history parsing and cache

- [x] Update `apps/lib/core/chat/chat_history_repository.dart`
- [x] Align cached history format with the new backend history response shape
- [x] Ensure history replay can rebuild tool UI items from backend-provided tool metadata/UI payload

### 4.3 Update chat service and item models

- [x] Update `apps/lib/core/chat/ag_ui_service.dart`
- [x] Ensure SSE handling matches the new tool event contract
- [x] Update `apps/lib/core/chat/chat_list_item.dart`
- [x] Remove item model assumptions that a rendered UI payload must be named `uiSchema`

### 4.4 Update rendering path

- [x] Update `apps/lib/features/chat/presentation/bloc/chat_bloc_events.dart`
- [x] Ensure tool results become visible UI items through direct tool payloads
- [x] Update `apps/lib/features/home/presentation/widgets/home_chat_item_renderer.dart`
- [x] Continue reusing the existing renderer component if it still fits the new input shape
- [x] Update `apps/lib/shared/widgets/ui_schema/ui_schema_renderer.dart` only as needed to accept the new direct tool UI input contract

### 4.5 Phase 4 verification

- [x] Frontend tests confirm SSE tool event parsing without `ui_schema`
- [x] Frontend tests confirm history replay rebuilds tool UI correctly
- [x] Frontend tests confirm refresh/reload still shows prior tool UI consistently

## Phase 5: Cleanup, Regression Tests, And Final Validation

### 5.1 Backend test updates

- [x] Update `backend/tests/unit/core/agentscope/events/test_store.py`
- [x] Update `backend/tests/unit/core/agentscope/events/test_agui_codec.py`
- [x] Update `backend/tests/unit/core/agentscope/runtime/test_stage_emitter.py`
- [x] Update `backend/tests/unit/core/agentscope/runtime/test_tasks.py`
- [x] Update `backend/tests/unit/v1/agent/test_utils.py`
- [x] Update `backend/tests/unit/schemas/agent/test_runtime_models.py`
- [x] Add tests for CLI adapter, command router, and tool post-processing
- [x] Add tests for controlled credential issuance and queue transport

### 5.2 Frontend test updates

- [x] Update `apps/test/core/chat/ag_ui_event_test.dart`
- [x] Update `apps/test/features/chat/presentation/bloc/chat_bloc_test.dart`
- [x] Add tests for history repository if needed by the new replay contract

### 5.3 Remove obsolete code paths

- [x] Remove worker `ui_hints` usage from runtime/event/history code paths
- [x] Remove active `ui_schema` contract usage from backend response shaping (N/A: `ui_schema` is still used as the wire format)
- [x] Remove old direct `custom/*.py` tool runtime wiring
- [x] Remove any parsing logic that assumes `ToolResponse` carries full `ToolAgentOutput` JSON
- [x] Remove dead compatibility helpers only after replacement path is verified

### 5.4 Run verification commands

- [x] Run relevant backend unit tests with `uv run pytest ...`
- [x] Run relevant frontend tests
- [x] Run backend lint checks required for touched files
- [x] Run backend type checks required for touched files
- [x] If skill registration/package wiring changed, run a focused smoke check of the CLI-backed tool path

### 5.5 Final acceptance audit against PRD

- [x] `ui_hints → ui_schema` compilation path is preserved (only `ui_hints` source changes from worker to tool)
- [x] `WorkerAgentOutput` no longer has `ui_hints`
- [x] `/history` tool UI replay compiles `metadata.tool_agent_output.ui_hints` to `ui_schema`
- [x] `ToolResponse` carries only projected result text
- [x] Tool post-processor generates full `ToolAgentOutput`
- [x] `ToolAgentOutput.result` is structured and machine-oriented
- [x] `message.content` is the full JSON text projection of `result`
- [x] CLI uses whitelist router and no shell execution path
- [x] Chat and automation both use controlled credential injection, not `owner_id` as credential
- [x] AgentScope skills are registered from project skill assets
- [x] Hot path and cold path tool context are unified
- [x] Frontend receives `ui_schema` from `TOOL_CALL_RESULT` and history
- [x] Relevant docs, tests, lint, and type checks are updated

## Suggested First Implementation Slice

- [ ] Complete Phase 0 only
- [ ] Do not start backend runtime refactor until Phase 0 contract text is committed and reviewed

## Progress Log

- [x] Phase 0 complete
- [x] Phase 1 complete
- [x] Phase 2 complete
- [x] Phase 3 complete
- [x] Phase 4 complete
- [x] Phase 5 complete