fix(ce-sessions): unblock session-history on Claude Code #800
@digitalcostas wanna test this?
…ynthesis-only

The ce-session-historian agent deadlocked when dispatched as a subagent in Claude Code because its first action was Skill(ce-session-inventory), and subagents cannot invoke the Skill tool (anthropics/claude-code#38719). The spinner hung at "Initializing…" indefinitely; after timeout the orchestrator received a spurious "user doesn't want to proceed" rejection.

The fix removes every code path that has a subagent calling Skill:

- Move 4 extraction scripts into plugins/compound-engineering/skills/ce-sessions/scripts/ (single home; ce-session-inventory and ce-session-extract skills deleted)
- Rewrite ce-sessions/SKILL.md as the full orchestrator: discovery, branch + keyword filtering, scan-window selection, top-5 deep-dive cap, mktemp scratch dir, per-session extraction with new --output PATH flag (extraction bytes write directly to scratch, never round-trip through main-context tool results), dispatch of synthesis-only historian
- Reshape ce-session-historian.agent.md to synthesis-only: receives file paths in dispatch prompt, reads via native file-read tool, returns prose findings. No Skill calls, no Bash discovery, no orchestration logic
- Update ce-compound Phase 1 to delegate to ce-sessions via the platform's skill-invocation primitive (semantic-prose form per plan-handoff.md line 57 convention, not literal Skill(...) syntax). Specifies dispatch ordering so the parallel research subagents and ce-sessions still run concurrently, preserving wall-clock parallelism
- Add --output PATH to extract-skeleton.py and extract-errors.py: when set, scripts write to file and emit only a one-line JSON status to stdout. Stdout-mode behavior preserved when omitted (additive API change)
- Add regression test (tests/skills/ce-session-historian-no-skill-tool.test.ts) asserting the agent body never instructs Skill(ce-session-inventory), Skill(ce-session-extract), or any literal Skill(...) tool-call expression
- Register ce-session-inventory and ce-session-extract in legacy-cleanup lookups (STALE_SKILL_DIRS, LEGACY_ONLY_SKILL_DESCRIPTIONS, and EXTRA_LEGACY_ARTIFACTS_BY_PLUGIN) so existing flat-installs sweep on upgrade
- Fix broken See Also links in docs/skills/ce-sessions.md

The bug is structurally gone: no subagent in the post-refactor flow ever invokes the Skill tool. Plan with full design rationale, alternatives considered (including issue #794's Options 1 and 2), and implementation units lives at docs/plans/2026-05-08-001-fix-ce-sessions-orchestration-refactor-plan.md.

All 1337 bun tests pass; bun run release:validate passes (37 skills, 49 agents).

Closes #794.
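To illustrate the additive --output PATH contract described in this commit message, here is a minimal Python sketch. It is not the code from extract-skeleton.py or extract-errors.py: the function names build_parser and emit, and the JSON status keys (status, path, bytes), are assumptions made for this example. The source only states that a one-line JSON status is printed when the flag is set.

```python
import argparse
import json
from pathlib import Path
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Parser with an optional --output flag; omitting it keeps stdout mode."""
    parser = argparse.ArgumentParser(description="Illustrative extraction script")
    parser.add_argument(
        "--output",
        metavar="PATH",
        default=None,
        help="write the extraction to PATH and print only a JSON status line",
    )
    return parser


def emit(extracted: str, output: Optional[str]) -> str:
    """Return the text the script should print to stdout."""
    if output is None:
        # Stdout mode: behavior is unchanged, the extraction itself is printed.
        return extracted
    data = extracted.encode("utf-8")
    Path(output).write_bytes(data)
    # File mode: only a one-line status is printed. Key names are assumptions.
    return json.dumps({"status": "ok", "path": output, "bytes": len(data)})
```

A script built this way would call `print(emit(text, build_parser().parse_args().output))`, so the orchestrator sees a short status line instead of the full extraction whenever it passes a scratch path.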
605e311 to dd25db0
Tested via marketplace-cached install of
Install path
This routes through
Test 1:
summarize_claude_tool sliced inp.get("query", "") and inp.get("prompt", "")
unconditionally. When MCP or specialized tools put a dict in those fields,
dict[:80] raises TypeError: unhashable type: 'slice' and the per-session
extraction silently fails. Same exposure existed in handle_cursor's
tool_use path.

Add a _safe_slice helper and reroute every potentially-non-string field
through it, then add regression tests for dict-shaped query, command,
prompt, pattern, fall-through to a later string field, and the cursor path.

Fixes #805
Summary
Session-history features (/ce-sessions and /ce-compound Phase 1) work on Claude Code again. Previously, the historian agent's first action was a Skill tool call to fetch session inventory, but Claude Code does not permit subagents to invoke the Skill tool (anthropics/claude-code#38719), so the spinner hung at Initializing… indefinitely and orchestrators received a spurious "user doesn't want to proceed" rejection.

The fix is structural: orchestration moves to the ce-sessions skill (main context, where the Skill tool works), and the historian becomes synthesis-only. It receives pre-extracted file paths in its dispatch prompt and reads them via the native file-read tool. No subagent ever invokes Skill again.

What changed
- ce-sessions is the canonical entry point. It owns the full pipeline: discovery, branch + keyword filtering, scan-window selection, top-5 deep-dive cap, scratch directory, per-session extraction, and synthesis dispatch. Scripts that were spread across ce-session-inventory/ and ce-session-extract/ now live in one home at plugins/compound-engineering/skills/ce-sessions/scripts/.
- ce-session-historian is synthesis-only. Receives {problem_topic, scratch_dir, sessions, output_schema}, reads the path files via the native file-read tool, returns prose findings. No Skill calls, no Bash discovery, no orchestration logic.
- ce-compound Phase 1 delegates via the platform's skill-invocation primitive in semantic-prose form (per ce-plan/references/plan-handoff.md line 57), not a literal Skill(...) tool-call expression. The literal form would propagate Claude-Code-specific syntax to Codex, Cursor, Gemini, OpenCode, Pi, and Kiro when the skill ships verbatim through the converters. Dispatch ordering is pinned: launch the three background research subagents first, then invoke ce-sessions, so wall-clock parallelism is preserved.
- --output PATH. When set, scripts write to file and emit only a one-line JSON status to stdout. Extraction content (~50 KB+ per session × 5 sessions) never round-trips through orchestrator tool results. Stdout-mode behavior preserved when the flag is omitted.
- ce-session-inventory and ce-session-extract were user-invocable: false script holders. With their callers gone, they're deleted and registered in STALE_SKILL_DIRS, LEGACY_ONLY_SKILL_DESCRIPTIONS, and EXTRA_LEGACY_ARTIFACTS_BY_PLUGIN so existing flat-installs sweep on upgrade.
- A regression test asserts that the agent body never instructs Skill(ce-session-inventory), Skill(ce-session-extract), or any literal Skill(...) tool-call expression.

Why this shape
Issue #794 proposed two narrower fixes: refactor the agent to invoke scripts directly via Bash from subagent context, or have the orchestrator pre-fetch inventory and pass it into the subagent's dispatch prompt. Both leave Skill calls in subagent context (per-session extraction in Option B remains a Skill round-trip; Option A trips on the same script-path-resolution problem ce-sessions navigates, but agents have no sibling-scripts/ convention to lean on). This refactor moves every Skill call to main context, eliminating the deadlock structurally rather than working around it on a per-call basis. Full design rationale, alternatives considered, and implementation units are in docs/plans/2026-05-08-001-fix-ce-sessions-orchestration-refactor-plan.md.

Test plan
- --output PATH tests on the extract scripts; 4 new regression tests for the no-Skill-from-subagent invariant).
- bun run release:validate passes (49 agents, 37 skills after the two deletions).
- --plugin-dir): /ce-sessions "what did we work on this week" and /ce-compound Full mode with session history opted in both complete without Initializing… hangs. The plan's Risks table flags an empirical verification step in case the bare bash scripts/... runtime invocation doesn't resolve from the skill directory; established slash-command precedents (ce-clean-gone-branches, ce-resolve-pr-feedback, ce-optimize) argue it does, but verifying on a real install before merge is cheap.

Post-Deploy Monitoring & Validation
Initializing… hangs on /ce-sessions or /ce-compound. The fix should eliminate them. If they recur, the architecture's central assumption (skill-invocation primitive works from inside an executing skill body in main context) is wrong and needs revisiting.

Closes #794.
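
The no-Skill-from-subagent invariant behind this PR can be checked with a plain text scan. The repository's actual regression test is the TypeScript file tests/skills/ce-session-historian-no-skill-tool.test.ts, whose contents are not shown in this thread; the Python sketch below only illustrates the idea, and the regex and the function name find_skill_calls are assumptions.

```python
import re
from typing import List

# Matches a literal tool-call expression such as Skill(ce-session-inventory).
# The capital "S" keeps prose like "skill-invocation primitive" from matching.
SKILL_CALL = re.compile(r"\bSkill\s*\(")


def find_skill_calls(agent_body: str) -> List[str]:
    """Return every line of an agent definition that contains Skill(...)."""
    return [
        line.strip()
        for line in agent_body.splitlines()
        if SKILL_CALL.search(line)
    ]
```

A test would read the agent file, call `find_skill_calls` on its text, and fail if the returned list is non-empty.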