fix(ce-sessions): unblock session-history on Claude Code by tmchow · Pull Request #800 · EveryInc/compound-engineering-plugin

tmchow · 2026-05-08T15:40:31Z

Summary

Session-history features (/ce-sessions and /ce-compound Phase 1) work on Claude Code again. Previously, the historian agent's first action was a Skill tool call to fetch session inventory, but Claude Code does not permit subagents to invoke the Skill tool (anthropics/claude-code#38719), so the spinner hung at Initializing… indefinitely and orchestrators received a spurious "user doesn't want to proceed" rejection.

The fix is structural: orchestration moves to the ce-sessions skill (main context, where the Skill tool works), and the historian becomes synthesis-only. It receives pre-extracted file paths in its dispatch prompt and reads them via the native file-read tool. No subagent ever invokes Skill again.

What changed

ce-sessions is the canonical entry point. It owns the full pipeline: discovery, branch + keyword filtering, scan-window selection, top-5 deep-dive cap, scratch directory, per-session extraction, and synthesis dispatch. Scripts that were spread across ce-session-inventory/ and ce-session-extract/ now live in one home at plugins/compound-engineering/skills/ce-sessions/scripts/.
ce-session-historian is synthesis-only. Receives {problem_topic, scratch_dir, sessions, output_schema}, reads the path files via the native file-read tool, returns prose findings. No Skill calls, no Bash discovery, no orchestration logic.
ce-compound Phase 1 delegates via the platform's skill-invocation primitive in semantic-prose form (per ce-plan/references/plan-handoff.md line 57), not a literal Skill(...) tool-call expression. The literal form would propagate Claude-Code-specific syntax to Codex, Cursor, Gemini, OpenCode, Pi, and Kiro when the skill ships verbatim through the converters. Dispatch ordering is pinned: launch the three background research subagents first, then invoke ce-sessions, so wall-clock parallelism is preserved.
Extract scripts gain --output PATH. When set, scripts write to file and emit only a one-line JSON status to stdout. Extraction content (~50 KB+ per session × 5 sessions) never round-trips through orchestrator tool results. Stdout-mode behavior preserved when the flag is omitted.
Two callerless skills removed. ce-session-inventory and ce-session-extract were user-invocable: false script holders. With their callers gone, they're deleted and registered in STALE_SKILL_DIRS, LEGACY_ONLY_SKILL_DESCRIPTIONS, and EXTRA_LEGACY_ARTIFACTS_BY_PLUGIN so existing flat-installs sweep on upgrade.
Regression test asserts the agent body never instructs Skill(ce-session-inventory), Skill(ce-session-extract), or any literal Skill(...) tool-call expression.
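
For reviewers, a minimal sketch of the `--output PATH` contract described above. This is not the code in extract-skeleton.py or extract-errors.py: the function name `emit` and the status keys (`status`, `path`, `bytes`) are assumptions made for illustration. Only the behavior is taken from this PR: write to the file and print a one-line JSON status when the flag is set, and print the content itself when it is omitted.

```python
import json
from pathlib import Path
from typing import Optional


def emit(content: str, output: Optional[str] = None) -> str:
    """Return the text an extract script would print to stdout."""
    if output is None:
        # Stdout mode (flag omitted): the extraction itself is printed.
        return content
    path = Path(output)
    path.write_text(content, encoding="utf-8")
    # File mode: only a one-line JSON status goes to stdout.
    # The key names below are hypothetical.
    status = {"status": "ok", "path": str(path), "bytes": len(content.encode("utf-8"))}
    return json.dumps(status)
```

With this shape, the orchestrator only ever sees the short status line in its tool result, and the historian reads the file at the reported path.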

Why this shape

Issue #794 proposed two narrower fixes: refactor the agent to invoke scripts directly via Bash from subagent context (Option A), or have the orchestrator pre-fetch inventory and pass it into the subagent's dispatch prompt (Option B). Both leave Skill calls in subagent context (per-session extraction in Option B remains a Skill round-trip; Option A trips on the same script-path-resolution problem ce-sessions navigates, but agents have no sibling-scripts/ convention to lean on). This refactor moves every Skill call to main context, eliminating the deadlock structurally rather than working around it on a per-call basis. Full design rationale, alternatives considered, and implementation units are in docs/plans/2026-05-08-001-fix-ce-sessions-orchestration-refactor-plan.md.

Test plan

1337 bun tests pass (3 new --output PATH tests on the extract scripts; 4 new regression tests for the no-Skill-from-subagent invariant).
bun run release:validate passes (49 agents, 37 skills after the two deletions).
Manual smoke test on a marketplace-cached install (not --plugin-dir): /ce-sessions "what did we work on this week" and /ce-compound Full mode with session history opted in both complete without Initializing… hangs. The plan's Risks table flags an empirical verification step in case the bare bash scripts/... runtime invocation doesn't resolve from the skill directory; established slash-command precedents (ce-clean-gone-branches, ce-resolve-pr-feedback, ce-optimize) argue it does, but verifying on a real install before merge is cheap.
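
The no-Skill-from-subagent invariant can be sketched as a simple text check. The actual regression tests in this PR are bun tests; the Python below is only an illustration, and the pattern and the function name `find_skill_calls` are assumptions rather than what the test file contains.

```python
import re
from typing import List

# Matches a literal tool-call expression such as Skill(ce-session-inventory).
SKILL_CALL = re.compile(r"\bSkill\s*\(")


def find_skill_calls(agent_body: str) -> List[int]:
    """Return the 1-based line numbers that contain a literal Skill(...) call."""
    return [
        number
        for number, line in enumerate(agent_body.splitlines(), start=1)
        if SKILL_CALL.search(line)
    ]
```

An agent body passes the invariant when the returned list is empty.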

Post-Deploy Monitoring & Validation

Watch for any new bug reports referencing Initializing… hangs on /ce-sessions or /ce-compound. The fix should eliminate them. If they recur, the architecture's central assumption (skill-invocation primitive works from inside an executing skill body in main context) is wrong and needs revisiting.
Monitor for cross-platform regressions when the plugin converts and ships to Codex, Cursor, Gemini, OpenCode, Pi, and Kiro. The semantic-prose invocation form should round-trip cleanly through every converter; if a target's converter doesn't recognize the prose pattern, ce-compound's Phase 1 would silently skip session-history enrichment.
Validation window: through next plugin release. Owner: plugin maintainers.

Closes #794.

tmchow · 2026-05-08T16:07:52Z

@digitalcostas wanna test this?

…ynthesis-only

The ce-session-historian agent deadlocked when dispatched as a subagent in Claude Code because its first action was Skill(ce-session-inventory), and subagents cannot invoke the Skill tool (anthropics/claude-code#38719). The spinner hung at "Initializing…" indefinitely; after timeout the orchestrator received a spurious "user doesn't want to proceed" rejection.

The fix removes every code path that has a subagent calling Skill:

- Move 4 extraction scripts into plugins/compound-engineering/skills/ce-sessions/scripts/ (single home; ce-session-inventory and ce-session-extract skills deleted)
- Rewrite ce-sessions/SKILL.md as the full orchestrator: discovery, branch + keyword filtering, scan-window selection, top-5 deep-dive cap, mktemp scratch dir, per-session extraction with new --output PATH flag (extraction bytes write directly to scratch, never round-trip through main-context tool results), dispatch of synthesis-only historian
- Reshape ce-session-historian.agent.md to synthesis-only: receives file paths in dispatch prompt, reads via native file-read tool, returns prose findings. No Skill calls, no Bash discovery, no orchestration logic
- Update ce-compound Phase 1 to delegate to ce-sessions via the platform's skill-invocation primitive (semantic-prose form per plan-handoff.md line 57 convention, not literal Skill(...) syntax). Specifies dispatch ordering so the parallel research subagents and ce-sessions still run concurrently, so wall-clock parallelism is preserved
- Add --output PATH to extract-skeleton.py and extract-errors.py: when set, scripts write to file and emit only a one-line JSON status to stdout. Stdout-mode behavior preserved when omitted (additive API change)
- Add regression test (tests/skills/ce-session-historian-no-skill-tool.test.ts) asserting the agent body never instructs Skill(ce-session-inventory), Skill(ce-session-extract), or any literal Skill(...) tool-call expression
- Register ce-session-inventory and ce-session-extract in legacy-cleanup lookups (STALE_SKILL_DIRS, LEGACY_ONLY_SKILL_DESCRIPTIONS, and EXTRA_LEGACY_ARTIFACTS_BY_PLUGIN) so existing flat-installs sweep on upgrade
- Fix broken See Also links in docs/skills/ce-sessions.md

The bug is structurally gone: no subagent in the post-refactor flow ever invokes the Skill tool. Plan with full design rationale, alternatives considered (including issue #794's Options 1 and 2), and implementation units lives at docs/plans/2026-05-08-001-fix-ce-sessions-orchestration-refactor-plan.md.

All 1337 bun tests pass; bun run release:validate passes (37 skills, 49 agents).

Closes #794.

digitalcostas · 2026-05-08T19:11:11Z

Tested via marketplace-cached install of tmchow/debug-issue-794 on Claude Code, macOS Darwin 25.4.0. Plugin v3.7.0 / gitCommitSha: dd25db036c908b597b60e3cb242bec20abb9048a confirmed in ~/.claude/plugins/installed_plugins.json.

Install path

/plugin marketplace add https://github.com/EveryInc/compound-engineering-plugin.git#tmchow/debug-issue-794
/plugin install compound-engineering@compound-engineering-plugin

This routes through ~/.claude/plugins/cache/ like a published release — exercising the residual risk you flagged in the test plan, not just the diff. Filesystem verified: ce-session-inventory/ and ce-session-extract/ are absent from the skills cache; ce-sessions/scripts/ contains all 4 scripts.

Test 1: `/ce-sessions "what did we work on this week"`

✅ End-to-end success. 100 sessions inventoried (0 parse errors). 5 selected by size+recency+branch diversity, 3 returned substantive skeletons via the new --output PATH flag (status JSON to stdout, content to $SCRATCH/<id>.skeleton.txt). ce-session-historian subagent dispatched on sonnet, completed in 134s with 10 tool_uses (all file-reads, zero Skill calls in subagent context). Returned proper 4-section synthesis. No Initializing… hang.

The bare bash scripts/discover-sessions.sh runtime invocation resolved correctly from the marketplace-cached skill base dir — the residual risk you flagged in the test plan is empirically clear.

Test 2: `/ce-compound` Full mode + session-history opt-in

✅ End-to-end success on the path that previously deadlocked. The pinned dispatch ordering works as intended — background research subagents launched first (Context Analyzer 20s, Solution Extractor 51s, Related Docs Finder 91s), then Skill(ce-sessions) invoked from main context (~75s including historian dispatch on 4 sessions). Wall-clock for Phase 1 = max(~91s, ~75s) ≈ 91s, not 237s sequential — the parallelism preservation works.
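
The wall-clock figures above work out as follows. This is just an arithmetic check on the reported durations; the variable and function names are illustrative.

```python
from typing import Dict, Tuple

# Durations in seconds, as reported for this test run (the ce-sessions figure is approximate).
RESEARCH = {"Context Analyzer": 20, "Solution Extractor": 51, "Related Docs Finder": 91}
CE_SESSIONS = 75


def phase1_wall_clock(research: Dict[str, int], ce_sessions: int) -> Tuple[int, int]:
    """Return (sequential, parallel) wall-clock estimates in seconds."""
    sequential = sum(research.values()) + ce_sessions
    # When everything runs concurrently, the slowest task sets the total.
    parallel = max(max(research.values()), ce_sessions)
    return sequential, parallel


print(phase1_wall_clock(RESEARCH, CE_SESSIONS))  # (237, 91)
```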

End-to-end the skill produced a clean overlap-detected update to an existing learning doc, ran validate-frontmatter.py (exit=0), and completed Phase 2.5 + Discoverability without issue.

Separate pre-existing bug found during testing

Filed as #805 — extract-skeleton.py crashes with TypeError: unhashable type: 'slice' on one of five keyword-matched sessions due to a dict[:80] slice. Pre-existing (the consolidation moved the file, didn't re-author the code path) and doesn't block this merge.

Summary

LGTM from the original reporter. PR #800 fixes the deadlock structurally on Claude Code, verified on a marketplace-cached install. Will retire CLAUDE.md Rule 6 (the "answer no" workaround in our project rules) once plugin v3.7.0+ ships from a stable release.

Confirmed working on this side; this closes #794.

summarize_claude_tool sliced inp.get("query", "") and inp.get("prompt", "") unconditionally. When MCP or specialized tools put a dict in those fields, dict[:80] raises TypeError: unhashable type: 'slice' and the per-session extraction silently fails. Same exposure existed in handle_cursor's tool_use path. Add a _safe_slice helper and reroute every potentially-non-string field through it, then add regression tests for dict-shaped query, command, prompt, pattern, fall-through to a later string field, and the cursor path. Fixes #805

tmchow force-pushed the tmchow/debug-issue-794 branch from 605e311 to dd25db0 on May 8, 2026 16:34

digitalcostas mentioned this pull request May 8, 2026

extract-skeleton.py crashes on dict-shaped tool input — TypeError: unhashable type: 'slice' #805

Closed

tmchow merged commit 81710ef into main May 8, 2026
2 checks passed

github-actions Bot mentioned this pull request May 8, 2026

chore: release main #806

Merged

LLMpsycho pushed a commit to LLMpsycho/compound-engineering-plugin that referenced this pull request May 8, 2026

fix(ce-sessions): unblock session-history on Claude Code (EveryInc#800)

872551b

tmchow merged 2 commits into main from tmchow/debug-issue-794
