social-app/.trellis/tasks/04-23-redesign-single-cli-skill-disclosure/decision-log.md

# Decision Log: Single CLI + Progressive Skill Disclosure Redesign

## Accepted Decisions

### D1. Keep one executable tool

Accepted.

Reason:

- lower tool-selection complexity
- lower repeated schema exposure cost
- consistent with the original valid direction of the CLI refactor

### D2. Replace `command/subcommand/args` with `module/method/input`

Accepted.

Reason:

- keeps one tool while removing ambiguous CLI-history semantics
- aligns the worker-facing protocol with business intent
- reduces repeated failure from action guessing
- keeps skill files decoupled from the executable tool contract

### D2.1 Remove `skill` from `project_cli`

Accepted.

Reason:

- skill files are guidance, not transport
- project_cli should execute by business module and method only
- error messages and validation should remain tool-native and not point back into skill docs

### D2.2 Use strong typed calendar read inputs

Accepted.

Reason:

- `date: str` plus manual parsing is glue code and too easy to misuse
- Pydantic `date`, timezone-aware `datetime`, and `UUID` give stricter and clearer validation
- `calendar.read` can cover day/range/event reads with one module-scoped method while still keeping input modes explicit

### D3. Keep progressive disclosure through skill files

Accepted.

Reason:

- allows the model to load only current-scenario knowledge
- avoids injecting every action definition into every model call
- fits AgentScope skill usage better than giant tool schemas

### D4. Split skill knowledge into index + action cards

Accepted.

Reason:

- real progressive disclosure needs smaller files, not only a long `SKILL.md`
- action-scoped files are easier for the worker to read and apply correctly

### D5. Set worker `max_iters=7`

Accepted.

Reason:

- current default 10 is too high for repeated invalid action discovery
- 7 preserves room for complex tasks without keeping the current waste level

### D6. Keep worker temperature unchanged

Accepted.

Reason:

- explicit user requirement
- this task focuses on protocol clarity and token efficiency, not generation-style tuning

### D7. Remove semantic reliance on worker `context_messages`

Accepted.

Reason:

- current runtime does not feed those messages into worker execution
- keeping the config active on worker is misleading and complicates reasoning about cost

## Rejected Decisions

### R1. Re-split into many domain tools

Rejected.

Reason:

- increases tool schema size
- increases selection ambiguity
- pushes the design back toward the old high-token path

### R2. Keep old CLI shape and only improve skill writing

Rejected.

Reason:

- failure came from structural action ambiguity, not only poor wording
- `read` remains overloaded even with better prose

### R3. Keep broad legacy input compatibility

Rejected.

Reason:

- old aliases teach the model that guessing is acceptable
- compatibility paths increase parser complexity and maintenance burden
- the repository is still early enough to tighten the protocol cleanly

### R4. Add duplicate-failure circuit breaker now

Rejected for this task.

Reason:

- user explicitly wants to keep some exploration room while refining skills
- this redesign should first fix the protocol itself

## Open Questions To Resolve During Implementation

1. Should router output add all optional execution hint fields in one step or phase them in gradually?
2. Should `worker.config.context_messages` be removed from schema entirely or retained as ignored/deprecated for one migration cycle?
3. Should `calendar` action files be separate files under `actions/` or a single file with stable sections and `ranges` reads?
4. Should action validation errors include `suggested_alternative_actions` for every validation failure or only for selected known-confusion cases?
5. Should `archive` become an explicit calendar action now, or remain represented via `update_event.patch.status = archived` until there is a dedicated route and UI contract?
6. ~~Frontend/live integration assertions still need migration from `skill/action` to `module/method`~~ Resolved 2026-04-24: assertions migrated and integration tests passing.

## Session 2026-04-24: Integration Test Debugging

### D8. Tool schema `input` must be required, not optional

Accepted.

Root cause of `project_cli` repeatedly receiving `input: {}`:
- `input: dict[str, Any] | None = None` generated a tool schema with `input` as optional, nullable, and `additionalProperties: true`
- Small models (qwen3.5-flash) interpret this as "input can be anything, including empty object"
- The tool schema has higher priority than skill file content in the model's attention
- Fix: changed to `input: dict[str, Any]` (required, no default, no nullable)

### D9. Router must not resolve time or suggest tool args

Accepted.

Previous router prompt included instructions for:
- Including `project_cli_defaults` in `context_summary`
- Standardizing time values to project_cli input format
- Resolving relative dates via `system_time_local`

This violated the router/worker responsibility split:
- Router: intent extraction + context summary + requires_tool evidence
- Worker: tool selection + time resolution via skill + ENV section + tool execution

Fix: removed all `project_cli_defaults` and time-resolution instructions from router prompt.
Time resolution is now the sole responsibility of worker + skill file, using `system_time_local` from ENV section as the single time source.

### D10. Skill files should reference ENV section variable names explicitly

Accepted.

Instead of abstract instructions like "resolve dates using system_time_local", skill files should directly reference `system_time_local` and `timezone_effective` from `USER_CONTEXT_JSON` in the ENV section, with concrete examples showing how to extract values.