.trellis/tasks/04-23-redesign-single-cli-skill-disclosure/decision-log.md

# Decision Log: Single CLI + Progressive Skill Disclosure Redesign

## Accepted Decisions

### D1. Keep one executable tool

Accepted.

Reason:

- lower tool-selection complexity
- lower repeated schema exposure cost
- consistent with the original valid direction of the CLI refactor

### D2. Replace `command/subcommand/args` with `module/method/input`

Accepted.

Reason:

- keeps one tool while removing ambiguous CLI-history semantics
- aligns the worker-facing protocol with business intent
- reduces repeated failure from action guessing
- keeps skill files decoupled from the executable tool contract

### D2.1 Remove `skill` from `project_cli`

Accepted.

Reason:

- skill files are guidance, not transport
- project_cli should execute by business module and method only
- error messages and validation should remain tool-native and not point back into skill docs

### D2.2 Use strong typed calendar read inputs

Accepted.

Reason:

- `date: str` plus manual parsing is glue code and too easy to misuse
- Pydantic `date`, timezone-aware `datetime`, and `UUID` give stricter and clearer validation
- `calendar.read` can cover day/range/event reads with one module-scoped method while still keeping input modes explicit

### D3. Keep progressive disclosure through skill files

Accepted.

Reason:

- allows the model to load only current-scenario knowledge
- avoids injecting every action definition into every model call
- fits AgentScope skill usage better than giant tool schemas

### D4. Split skill knowledge into index + action cards

Accepted.

Reason:

- real progressive disclosure needs smaller files, not only a long `SKILL.md`
- action-scoped files are easier for the worker to read and apply correctly

### D5. Set worker `max_iters=7`

Accepted.

Reason:

- current default 10 is too high for repeated invalid action discovery
- 7 preserves room for complex tasks without keeping the current waste level

### D6. Keep worker temperature unchanged

Accepted.

Reason:

- explicit user requirement
- this task focuses on protocol clarity and token efficiency, not generation-style tuning

### D7. Remove semantic reliance on worker `context_messages`

Accepted.

Reason:

- current runtime does not feed those messages into worker execution
- keeping the config active on worker is misleading and complicates reasoning about cost

## Rejected Decisions

### R1. Re-split into many domain tools

Rejected.

Reason:

- increases tool schema size
- increases selection ambiguity
- pushes the design back toward the old high-token path

### R2. Keep old CLI shape and only improve skill writing

Rejected.

Reason:

- failure came from structural action ambiguity, not only poor wording
- `read` remains overloaded even with better prose

### R3. Keep broad legacy input compatibility

Rejected.

Reason:

- old aliases teach the model that guessing is acceptable
- compatibility paths increase parser complexity and maintenance burden
- the repository is still early enough to tighten the protocol cleanly

### R4. Add duplicate-failure circuit breaker now

Rejected for this task.

Reason:

- user explicitly wants to keep some exploration room while refining skills
- this redesign should first fix the protocol itself

## Open Questions To Resolve During Implementation

1. Should router output add all optional execution hint fields in one step or phase them in gradually?
2. Should `worker.config.context_messages` be removed from schema entirely or retained as ignored/deprecated for one migration cycle?
3. Should `calendar` action files be separate files under `actions/` or a single file with stable sections and `ranges` reads?
4. Should action validation errors include `suggested_alternative_actions` for every validation failure or only for selected known-confusion cases?
5. Should `archive` become an explicit calendar action now, or remain represented via `update_event.patch.status = archived` until there is a dedicated route and UI contract?
6. ~~Frontend/live integration assertions still need migration from `skill/action` to `module/method`~~ Resolved 2026-04-24: assertions migrated and integration tests passing.

## Session 2026-04-24: Integration Test Debugging

### D8. Tool schema `input` must be required, not optional

Accepted.

Root cause of `project_cli` repeatedly receiving `input: {}`:
- `input: dict[str, Any] | None = None` generated a tool schema with `input` as optional, nullable, and `additionalProperties: true`
- Small models (qwen3.5-flash) interpret this as "input can be anything, including empty object"
- The tool schema has higher priority than skill file content in the model's attention
- Fix: changed to `input: dict[str, Any]` (required, no default, no nullable)

### D9. Router must not resolve time or suggest tool args

Accepted.

Previous router prompt included instructions for:
- Including `project_cli_defaults` in `context_summary`
- Standardizing time values to project_cli input format
- Resolving relative dates via `system_time_local`

This violated the router/worker responsibility split:
- Router: intent extraction + context summary + requires_tool evidence
- Worker: tool selection + time resolution via skill + ENV section + tool execution

Fix: removed all `project_cli_defaults` and time-resolution instructions from router prompt.
Time resolution is now the sole responsibility of worker + skill file, using `system_time_local` from ENV section as the single time source.

### D10. Skill files should reference ENV section variable names explicitly

Accepted.

Instead of abstract instructions like "resolve dates using system_time_local", skill files should directly reference `system_time_local` and `timezone_effective` from `USER_CONTEXT_JSON` in the ENV section, with concrete examples showing how to extract values.
feat(agent): redesign project_cli with module/method/input protocol 2026-04-24 13:24:13 +08:00			`# Decision Log: Single CLI + Progressive Skill Disclosure Redesign`

			`## Accepted Decisions`

			`### D1. Keep one executable tool`

			`Accepted.`

			`Reason:`

			`- lower tool-selection complexity`
			`- lower repeated schema exposure cost`
			`- consistent with the original valid direction of the CLI refactor`

			### D2. Replace `command/subcommand/args` with `module/method/input`

			`Accepted.`

			`Reason:`

			`- keeps one tool while removing ambiguous CLI-history semantics`
			`- aligns the worker-facing protocol with business intent`
			`- reduces repeated failure from action guessing`
			`- keeps skill files decoupled from the executable tool contract`

			### D2.1 Remove `skill` from `project_cli`

			`Accepted.`

			`Reason:`

			`- skill files are guidance, not transport`
			`- project_cli should execute by business module and method only`
			`- error messages and validation should remain tool-native and not point back into skill docs`

			`### D2.2 Use strong typed calendar read inputs`

			`Accepted.`

			`Reason:`

			- `date: str` plus manual parsing is glue code and too easy to misuse
			- Pydantic `date`, timezone-aware `datetime`, and `UUID` give stricter and clearer validation
			- `calendar.read` can cover day/range/event reads with one module-scoped method while still keeping input modes explicit

			`### D3. Keep progressive disclosure through skill files`

			`Accepted.`

			`Reason:`

			`- allows the model to load only current-scenario knowledge`
			`- avoids injecting every action definition into every model call`
			`- fits AgentScope skill usage better than giant tool schemas`

			`### D4. Split skill knowledge into index + action cards`

			`Accepted.`

			`Reason:`

			- real progressive disclosure needs smaller files, not only a long `SKILL.md`
			`- action-scoped files are easier for the worker to read and apply correctly`

			### D5. Set worker `max_iters=7`

			`Accepted.`

			`Reason:`

			`- current default 10 is too high for repeated invalid action discovery`
			`- 7 preserves room for complex tasks without keeping the current waste level`

			`### D6. Keep worker temperature unchanged`

			`Accepted.`

			`Reason:`

			`- explicit user requirement`
			`- this task focuses on protocol clarity and token efficiency, not generation-style tuning`

			### D7. Remove semantic reliance on worker `context_messages`

			`Accepted.`

			`Reason:`

			`- current runtime does not feed those messages into worker execution`
			`- keeping the config active on worker is misleading and complicates reasoning about cost`

			`## Rejected Decisions`

			`### R1. Re-split into many domain tools`

			`Rejected.`

			`Reason:`

			`- increases tool schema size`
			`- increases selection ambiguity`
			`- pushes the design back toward the old high-token path`

			`### R2. Keep old CLI shape and only improve skill writing`

			`Rejected.`

			`Reason:`

			`- failure came from structural action ambiguity, not only poor wording`
			- `read` remains overloaded even with better prose

			`### R3. Keep broad legacy input compatibility`

			`Rejected.`

			`Reason:`

			`- old aliases teach the model that guessing is acceptable`
			`- compatibility paths increase parser complexity and maintenance burden`
			`- the repository is still early enough to tighten the protocol cleanly`

			`### R4. Add duplicate-failure circuit breaker now`

			`Rejected for this task.`

			`Reason:`

			`- user explicitly wants to keep some exploration room while refining skills`
			`- this redesign should first fix the protocol itself`

			`## Open Questions To Resolve During Implementation`

			`1. Should router output add all optional execution hint fields in one step or phase them in gradually?`
			2. Should `worker.config.context_messages` be removed from schema entirely or retained as ignored/deprecated for one migration cycle?
			3. Should `calendar` action files be separate files under `actions/` or a single file with stable sections and `ranges` reads?
			4. Should action validation errors include `suggested_alternative_actions` for every validation failure or only for selected known-confusion cases?
			5. Should `archive` become an explicit calendar action now, or remain represented via `update_event.patch.status = archived` until there is a dedicated route and UI contract?
			6. ~~Frontend/live integration assertions still need migration from `skill/action` to `module/method`~~ Resolved 2026-04-24: assertions migrated and integration tests passing.

			`## Session 2026-04-24: Integration Test Debugging`

			### D8. Tool schema `input` must be required, not optional

			`Accepted.`

			Root cause of `project_cli` repeatedly receiving `input: {}`:
			- `input: dict[str, Any] \| None = None` generated a tool schema with `input` as optional, nullable, and `additionalProperties: true`
			`- Small models (qwen3.5-flash) interpret this as "input can be anything, including empty object"`
			`- The tool schema has higher priority than skill file content in the model's attention`
			- Fix: changed to `input: dict[str, Any]` (required, no default, no nullable)

			`### D9. Router must not resolve time or suggest tool args`

			`Accepted.`

			`Previous router prompt included instructions for:`
			- Including `project_cli_defaults` in `context_summary`
			`- Standardizing time values to project_cli input format`
			- Resolving relative dates via `system_time_local`

			`This violated the router/worker responsibility split:`
			`- Router: intent extraction + context summary + requires_tool evidence`
			`- Worker: tool selection + time resolution via skill + ENV section + tool execution`

			Fix: removed all `project_cli_defaults` and time-resolution instructions from router prompt.
			Time resolution is now the sole responsibility of worker + skill file, using `system_time_local` from ENV section as the single time source.

			`### D10. Skill files should reference ENV section variable names explicitly`

			`Accepted.`

			Instead of abstract instructions like "resolve dates using system_time_local", skill files should directly reference `system_time_local` and `timezone_effective` from `USER_CONTEXT_JSON` in the ENV section, with concrete examples showing how to extract values.