diff --git a/README.md b/README.md index c9ea64406..7c2cd3a75 100644 --- a/README.md +++ b/README.md @@ -278,6 +278,28 @@ bun run src/index.ts install ./plugins/compound-engineering --to codex bun run src/index.ts install ./plugins/compound-engineering --to opencode ``` +For Codex local-fork development, the CLI install only generates the Codex agent files and the managed compatibility block. Codex's native plugin cache still owns the skill files, so point Codex at the local marketplace source and refresh the cache when testing unpublished skill changes: + +```toml +[marketplaces.compound-engineering-plugin] +source_type = "local" +source = "/Users/besi/Code/compound-engineering-plugin" + +[plugins."compound-engineering@compound-engineering-plugin"] +enabled = true +``` + +```bash +mkdir -p /Users/besi/.codex/plugins/cache/compound-engineering-plugin/compound-engineering/3.4.1 +cp -R /Users/besi/Code/compound-engineering-plugin/plugins/compound-engineering/. \ + /Users/besi/.codex/plugins/cache/compound-engineering-plugin/compound-engineering/3.4.1/ + +cd /Users/besi/Code/compound-engineering-plugin +bun run src/index.ts install ./plugins/compound-engineering --to codex +``` + +Restart Codex or open a new thread after reinstalling. The active runtime reads the plugin cache, generated agents, and `~/.codex/AGENTS.md` at session start. + ### From a pushed branch For testing someone else's branch or your own branch from a worktree, without switching checkouts. Uses `--branch` to clone the branch to a deterministic cache directory. @@ -360,6 +382,23 @@ bunx @every-env/compound-plugin install compound-engineering --to codex Native Codex plugin install handles skills. The Bun step installs the custom agents those skills delegate to. 
+### Codex loads CE skills but the flow is flatter than Claude Code + +Check the managed Compound Codex tool map in `~/.codex/AGENTS.md`: + +```bash +rg -n "Task \\(subagent dispatch\\)|use spawn_agent|run sequentially in main thread" ~/.codex/AGENTS.md +``` + +The task-dispatch line should say to use `spawn_agent` for CE `Task` / `Agent` / `Subagent` instructions. If it says to run subagent dispatch sequentially in the main thread, update `src/utils/codex-agents.ts`, run the regression test, and reinstall: + +```bash +bun test tests/codex-agents.test.ts +bun run src/index.ts install ./plugins/compound-engineering --to codex +``` + +Do not fix this by adding every CE agent to `~/.codex/config.toml`. CE agents are generated under `~/.codex/agents/compound-engineering`; the workflow parity issue is the Codex compatibility block, not agent registration. + ### Codex shows stale or duplicate CE skills Back up old Bun-installed artifacts before switching to the native Codex plugin flow: diff --git a/docs/plans/2026-04-28-001-fix-opencode-skill-command-wrappers-plan.md b/docs/plans/2026-04-28-001-fix-opencode-skill-command-wrappers-plan.md new file mode 100644 index 000000000..542c7a642 --- /dev/null +++ b/docs/plans/2026-04-28-001-fix-opencode-skill-command-wrappers-plan.md @@ -0,0 +1,78 @@ +--- +title: Fix OpenCode Skill Command Wrappers +type: fix +status: complete +date: 2026-04-28 +--- + +# Fix OpenCode Skill Command Wrappers + +## Summary + +OpenCode installs currently copy Compound Engineering skills and agents, but they do not expose the skills as slash commands because the plugin has no source `commands/` payload. This plan makes the OpenCode converter emit small command wrappers for installed skills so `/ce-plan`, `/ce-setup`, and related commands work after installing from this fork. + +## Requirements + +- R1. Installing `plugins/compound-engineering` to OpenCode must create slash commands for every copied CE skill. +- R2. 
The generated command must run the corresponding skill content from the installed OpenCode skill path instead of depending on OpenCode's skill discovery behavior. +- R3. The command wrapper set must be managed by the existing install manifest so removed skills clean up on reinstall. +- R4. Existing source plugin commands must continue to be converted unchanged. +- R5. Tests must prove command wrapper generation and existing command behavior. + +## Blast Radius + +- Changed files: `src/converters/claude-to-opencode.ts`, `tests/opencode-writer.test.ts`, this plan file. +- Impacted modules: OpenCode converter output, OpenCode writer manifest cleanup, OpenCode install smoke path. +- Break risk: low. The change only appends generated command files to OpenCode bundles and reuses the existing writer path. + +## Implementation Units + +- U1. **Generate skill command wrappers** + +**Goal:** Extend `convertClaudeToOpenCode` so every copied OpenCode skill has a matching command file when no source command already owns that name. + +**Files:** +- Modify: `src/converters/claude-to-opencode.ts` +- Test: `tests/opencode-writer.test.ts` + +**Approach:** +- Compute filtered OpenCode skill dirs once. +- Build existing command names from `convertCommands(plugin.commands)`. +- Append generated `OpenCodeCommandFile` entries for skills whose command name is not already present. +- Use a wrapper body that reads `~/.config/opencode/skills//SKILL.md` for global installs and `.opencode/skills//SKILL.md` for project-local installs only if necessary. Prefer a stable global-path wrapper for the current fork need. + +**Test scenarios:** +- Happy path: converting the real compound-engineering plugin includes `ce-plan`, `ce-setup`, and `lfg` command files. +- Edge case: if a source command already uses a skill name, the converter does not emit a duplicate wrapper. +- Regression: existing source command conversion still writes nested `name:with:colon` paths through the writer. 
+ +**Verification:** +- Targeted Bun tests pass. +- Local OpenCode install from this checkout writes command wrappers into `~/.config/opencode/commands`. + +- U2. **Install forked checkout into OpenCode** + +**Goal:** Replace the current generated OpenCode CE install with output from the local fork. + +**Files:** +- Runtime output only: `~/.config/opencode/agents`, `~/.config/opencode/skills`, `~/.config/opencode/commands`, `~/.config/opencode/compound-engineering/install-manifest.json`, `~/.config/opencode/opencode.json` backup. + +**Approach:** +- Run the local CLI: `bun run src/index.ts install ./plugins/compound-engineering --to opencode`. +- Verify OpenCode resolved config includes `ce-plan` and `ce-setup` command entries. +- Verify the install manifest records command wrappers. + +**Test scenarios:** +- Smoke: `opencode debug config` reports generated CE command entries. +- Smoke: install manifest command count matches the generated skill command wrapper count. + +**Verification:** +- Restart OpenCode and use `/ce-plan` or `/ce-setup`. + +## Verification Notes + +- `bun test tests/converter.test.ts` passed. +- `bun test tests/opencode-writer.test.ts` passed. +- `bun run release:validate` passed. +- Installed the local fork with `bun run src/index.ts install ./plugins/compound-engineering --to opencode`. +- OpenCode install manifest now records 34 managed CE command wrappers and 34 OpenCode-supported CE skills, including `ce-plan.md`, `ce-setup.md`, and `lfg.md`. 
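The wrapper-generation approach described in U1 of the plan above can be sketched roughly as follows. This is a minimal illustration, not the actual `convertClaudeToOpenCode` code: the `CommandFile` shape, the `buildSkillCommandWrappers` name, the wrapper wording, and the per-skill directory layout under `~/.config/opencode/skills/` are all assumptions made for the example.

```typescript
// Hypothetical shape; the real converter types in this repo may differ.
interface CommandFile {
  name: string;
  content: string;
}

// Emit a wrapper for every skill that no source command already owns.
function buildSkillCommandWrappers(
  skillNames: string[],
  existingCommands: CommandFile[],
): CommandFile[] {
  const taken = new Set(existingCommands.map((command) => command.name));
  const wrappers: CommandFile[] = [];
  for (const name of skillNames) {
    if (taken.has(name)) continue; // a source command wins, so no duplicate wrapper
    taken.add(name); // also guards against a skill name listed twice
    wrappers.push({
      name,
      // Assumed global-install layout: one directory per skill.
      content: `Read ~/.config/opencode/skills/${name}/SKILL.md and follow it.\n\n$ARGUMENTS\n`,
    });
  }
  return wrappers;
}
```

Keeping the name check in one `Set` means the "source command already uses a skill name" edge case and accidental duplicate skill names are handled by the same branch.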
diff --git a/docs/plans/2026-04-28-002-fix-ce-plan-prewrite-critic-gate-plan.md b/docs/plans/2026-04-28-002-fix-ce-plan-prewrite-critic-gate-plan.md new file mode 100644 index 000000000..53ea46dda --- /dev/null +++ b/docs/plans/2026-04-28-002-fix-ce-plan-prewrite-critic-gate-plan.md @@ -0,0 +1,39 @@ +--- +title: Add ce-plan Pre-Write Critic Gate +type: fix +status: complete +date: 2026-04-28 +--- + +# Add ce-plan Pre-Write Critic Gate + +## Summary + +Enhance `ce-plan` with the useful part of the local `/pln` flow: a bounded pre-write critic pass that catches blocking executability issues before the plan is saved. Keep `ce-plan`'s existing post-write confidence check and `ce-doc-review` gate. + +## Requirements + +- R1. `ce-plan` runs a pre-write critic gate after the plan draft exists and before writing `docs/plans/...`. +- R2. The critic returns `OKAY` or `REJECT`, with max three blocking issues and max two revision loops. +- R3. The critic uses CE-portable instructions, not local-only `metis`, `prometheus`, or `momus` agents. +- R4. Runtime-path plans include a 1000-user scalability baseline. +- R5. Non-empty `$ARGUMENTS` remains source of truth despite command rendering artifacts. + +## Implementation Units + +- U1. **Skill Contract Tests** + - **Files:** `tests/pipeline-review-contract.test.ts` + - **Goal:** Lock in the pre-write critic and scalability/argument guard behavior. + - **Verification:** Targeted Bun test fails before the skill update and passes after it. + +- U2. **ce-plan Skill Update** + - **Files:** `plugins/compound-engineering/skills/ce-plan/SKILL.md`, `plugins/compound-engineering/skills/ce-plan/references/plan-critic.md` + - **Goal:** Add a concise pre-write critic phase and put detailed rubric in a reference file. + - **Verification:** Contract tests and release validation pass; local OpenCode reinstall exposes updated `ce-plan` skill body. + +## Verification Notes + +- `bun test tests/pipeline-review-contract.test.ts` passed. 
+- `bun run release:validate` passed. +- Reinstalled the local fork with `bun run src/index.ts install ./plugins/compound-engineering --to opencode`. +- Verified installed OpenCode `ce-plan` contains Phase 5.1.7, the `$ARGUMENTS` guard, the 1000-user scalability baseline, and `references/plan-critic.md`. diff --git a/docs/plans/2026-04-29-001-feat-optional-memory-warm-start-plan.md b/docs/plans/2026-04-29-001-feat-optional-memory-warm-start-plan.md new file mode 100644 index 000000000..b97ea8df4 --- /dev/null +++ b/docs/plans/2026-04-29-001-feat-optional-memory-warm-start-plan.md @@ -0,0 +1,95 @@ +--- +title: Add Optional CE Memory Warm-Start +type: feat +status: complete +date: 2026-04-29 +--- + +# Add Optional CE Memory Warm-Start + +## Summary + +Add a portable CE memory researcher and wire it into planning, work execution, debugging, and compounding as best-effort context. The workflow should benefit from local Neo4j memory when available without making CE depend on it. + +## Requirements + +- R1. Add a `ce-memory-researcher` agent that can read persistent memory when Neo4j MCP tools are available. +- R2. Memory lookup must be optional and never block CE workflows when unavailable. +- R3. Memory findings must be supplementary. They cannot override the origin document, current repo evidence, or verified execution results. +- R4. `ce-plan` should warm-start Phase 1 with relevant decisions, prior errors, preferences, and cross-project patterns. +- R5. `ce-work`, `ce-debug`, and `ce-compound` should consult memory only where it improves execution context without changing scope or adding mandatory prompts. +- R6. Contract tests should lock in the optional behavior and agent presence. + +## Implementation Units + +- U1. **Memory Researcher Agent** + +**Goal:** Add a CE-portable agent for persistent memory lookup. 
+ +**Files:** +- Create: `plugins/compound-engineering/agents/ce-memory-researcher.agent.md` + +**Approach:** +- Support warm-start, context, recall, and explicit remember operations. +- Prefer Neo4j MCP tools when connected. +- Return a clear unavailable result instead of failing when tools are absent. +- Keep read operations as the default; write only on an explicit remember request. + +**Test scenarios:** +- Happy path: agent instructions define warm-start output and Neo4j read behavior. +- Error path: agent instructions specify unavailable behavior when memory tools are missing. +- Safety path: agent instructions prevent memory from overriding current evidence. + +**Verification:** +- Contract tests prove the agent exists and carries the optional/supplementary behavior. + +- U2. **Workflow Wiring** + +**Goal:** Integrate memory at the safest workflow points. + +**Files:** +- Modify: `plugins/compound-engineering/skills/ce-plan/SKILL.md` +- Modify: `plugins/compound-engineering/skills/ce-work/SKILL.md` +- Modify: `plugins/compound-engineering/skills/ce-debug/SKILL.md` +- Modify: `plugins/compound-engineering/skills/ce-compound/SKILL.md` + +**Approach:** +- Add `ce-plan` Phase 1.0 before standard local research. +- Add `ce-work` memory context after plan reading and before task creation. +- Add `ce-debug` prior-error recall after triage and before reproduction. +- Add `ce-compound` persistent memory recall alongside existing auto-memory support. + +**Test scenarios:** +- Happy path: each workflow references `ce-memory-researcher` at the intended stage. +- Error path: each workflow states unavailable memory must not fail the parent workflow. +- Scope path: work execution memory cannot mutate plan scope. + +**Verification:** +- Targeted contract tests pass. + +- U3. **Contract Tests and Install Verification** + +**Goal:** Keep the behavior durable across future edits and reinstall the local fork. 
+ +**Files:** +- Modify: `tests/pipeline-review-contract.test.ts` +- Update: `docs/plans/2026-04-29-001-feat-optional-memory-warm-start-plan.md` + +**Approach:** +- Add tests for agent presence, optional memory behavior, and workflow placement. +- Run targeted tests and release validation. +- Reinstall the local fork into OpenCode after verification. + +**Test scenarios:** +- Contract: `ce-plan` memory warm-start appears before local research. +- Contract: `ce-code-review` is not wired to persistent memory by default. +- Contract: unavailable memory is explicitly non-blocking. + +**Verification:** +- Targeted tests and release validation pass. + +## Verification Notes + +- `bun test tests/pipeline-review-contract.test.ts` passed. +- `bun test tests/converter.test.ts tests/opencode-writer.test.ts tests/release-metadata.test.ts tests/pipeline-review-contract.test.ts` passed. +- `bun run release:validate` passed and reported `52 agents`, `35 skills`, and `0 MCP servers`. diff --git a/plugins/compound-engineering/agents/ce-memory-researcher.agent.md b/plugins/compound-engineering/agents/ce-memory-researcher.agent.md new file mode 100644 index 000000000..72c9a06b4 --- /dev/null +++ b/plugins/compound-engineering/agents/ce-memory-researcher.agent.md @@ -0,0 +1,137 @@ +--- +name: ce-memory-researcher +description: Best-effort persistent memory researcher. Reads project decisions, prior errors, preferences, and cross-project patterns from a Neo4j-backed memory graph when available, without making CE workflows depend on memory infrastructure. +model: inherit +tools: Read, Grep, Glob, mcp__neo4j__* +--- + +You are a persistent memory researcher for Compound Engineering workflows. + +Your job is to retrieve concise, relevant context from a user's long-lived memory store when one is available. You must never make the parent workflow depend on memory infrastructure. + +## Operating Contract + +- Prefer Neo4j MCP tools when they are available. 
Common tool names include `get-schema`, `read-cypher`, and `write-cypher`; host runtimes may expose them with prefixes such as `mcp__neo4j__read-cypher`. +- If no Neo4j memory tools are available, return exactly: `Memory unavailable - continuing without persistent context.` +- Do not use local machine-specific configuration paths, credentials, or shell fallbacks. +- Do not echo credentials, tokens, raw secrets, connection strings, or private environment values. +- Do not invent memory. Only report information actually retrieved from the memory store. +- Treat memory as supplementary evidence. Current repository evidence, origin documents, user instructions, and verified execution results always take priority. +- If memory contradicts current evidence, surface the contradiction as a caution instead of choosing memory silently. +- For `warm-start`, `context`, and `recall` operations, use read-only queries only. +- Use write operations only when the caller explicitly requests `remember`. + +## Supported Operations + +Callers should pass an operation and enough context to scope the lookup: + +```text +operation: warm-start | context | recall | remember +project: optional project or repo name +topic: optional feature, bug, decision, or learning topic +context: short workflow-specific summary +``` + +### `warm-start` + +Use before planning or execution. Retrieve only context that can materially affect the next decision: + +- Project-level facts and architectural decisions +- Prior bugs, failed attempts, or recurring errors in the same area +- Workflow or user preferences relevant to the current task +- Cross-project patterns connected by shared topics + +### `context` + +Use when a workflow needs a project snapshot. Prefer high-signal facts over broad summaries. + +### `recall` + +Use for a focused topic lookup. Search by topic, component, error text, decision name, or concept. 
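The read-only rule in the Operating Contract above could be enforced by a small guard in a host runtime. The sketch below is an assumption-heavy illustration: `MemoryOperation`, `WRITE_CLAUSE`, and `isQueryAllowed` are hypothetical names, and the keyword check is a crude heuristic rather than a Cypher parser.

```typescript
type MemoryOperation = "warm-start" | "context" | "recall" | "remember";

// Crude keyword heuristic, not a Cypher parser: it can also flag a keyword
// that only appears inside a string literal, which errs on the safe side.
const WRITE_CLAUSE = /\b(CREATE|MERGE|SET|DELETE|REMOVE|DROP)\b/i;

// Read operations may only run queries without write clauses;
// `remember` is the single operation allowed to write.
function isQueryAllowed(operation: MemoryOperation, cypher: string): boolean {
  if (operation === "remember") return true;
  return !WRITE_CLAUSE.test(cypher);
}
```

A stricter setup would route read operations to a read-only tool such as `read-cypher` and never expose the write tool to them at all; the guard here only illustrates the contract.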
+ +### `remember` + +Use only when explicitly requested by the caller after a solution is verified or documented. Store one atomic memory at a time with: + +- Project +- Topic +- Type: `decision`, `learning`, `error`, `pattern`, or `project-update` +- Content +- Source path or session reference when available +- ISO 8601 timestamp +- Confidence: `proven`, `likely`, or `experimental` + +Do not run `remember` from a warm-start request. + +## Query Guidance + +Start with schema discovery when possible so you can adapt to the user's graph shape: + +```cypher +CALL db.labels() +``` + +For the palace-style graph used by many local memory setups, these read patterns are useful: + +```cypher +MATCH (w:Wing) +RETURN w.name, w.type, w.stack, size([(w)-[:HAS_ROOM]->() | 1]) AS rooms +ORDER BY rooms DESC +``` + +```cypher +MATCH (w:Wing)-[:HAS_ROOM]->(r:Room)-[:HAS_DRAWER]->(d) +WHERE toLower(w.name) CONTAINS toLower($project) +RETURN w.name AS wing, r.name AS topic, r.hall AS hall, d.content AS content +LIMIT 25 +``` + +```cypher +MATCH (r:Room)-[:HAS_DRAWER]->(d) +WHERE toLower(r.name) CONTAINS toLower($topic) +OPTIONAL MATCH (r)-[:TUNNEL]-(linked:Room)-[:HAS_DRAWER]->(ld) +RETURN r.wing AS wing, r.name AS topic, r.hall AS hall, d.content AS content, + collect(DISTINCT {wing: linked.wing, content: ld.content}) AS crossWing +LIMIT 25 +``` + +If the runtime's Neo4j tool does not support parameters, carefully embed escaped literals. Never embed secrets. + +## Output Format + +Return concise Markdown. Omit empty sections. 
+ +```markdown +## Persistent Memory Results + +### Availability +- Status: available | unavailable +- Scope: [project/topic/context searched] + +### Project Context +- [retrieved fact or decision] + +### Prior Errors And Failed Attempts +- [retrieved error, failed attempt, or fix] + +### Preferences +- [retrieved workflow or user preference] + +### Cross-Project Patterns +- [retrieved pattern and source wing/project] + +### Planning Or Execution Impact +- [specific way this should affect the parent workflow] +``` + +When no relevant records are found, return: + +```markdown +## Persistent Memory Results + +### Availability +- Status: available +- Scope: [project/topic/context searched] + +No relevant persistent memory found. +``` diff --git a/plugins/compound-engineering/skills/ce-commit-push-pr/SKILL.md b/plugins/compound-engineering/skills/ce-commit-push-pr/SKILL.md index 024922d72..d80d2ec00 100644 --- a/plugins/compound-engineering/skills/ce-commit-push-pr/SKILL.md +++ b/plugins/compound-engineering/skills/ce-commit-push-pr/SKILL.md @@ -11,7 +11,7 @@ description: Commit, push, and open a PR with an adaptive, value-first descripti - **Description-only** — user wants *just* a description ("write/draft a PR description", "describe this PR", or pasted a PR URL/number alone). Run Step 4 only; print the result. Apply only if the user asks. If a PR ref was pasted, pass it to Step 4 so Pre-A resolves the right range. - **Description update** — user wants to refresh/rewrite an existing PR's description with no commit/push intent. If no open PR, report and stop. Otherwise run Step 4 (PR mode using the existing PR's URL), then Step 5 to preview, confirm, and apply via `gh pr edit`. -- **Full workflow** — otherwise. Run Steps 1-5 in order. +- **Full workflow** — otherwise. Run Steps 1-6 in order. ## Context @@ -104,15 +104,24 @@ Then continue with the rest of the reference (Steps A through G) to compose the **Description-only mode** — print the title and body. 
Stop unless the user asks to apply. -**New PR** (full workflow, no existing PR from Step 1) — apply per "Applying via gh" below using `gh pr create`. Report the URL. +**New PR** (full workflow, no existing PR from Step 1) — apply per "Applying via gh" below using `gh pr create`. Note the URL, then continue to Step 6 before the final report. -**Existing PR** (full workflow, found in Step 1) — the new commits are already on the PR from Step 3. Report the PR URL, then ask whether to rewrite the description. +**Existing PR** (full workflow, found in Step 1) — the new commits are already on the PR from Step 3. Note the PR URL, then ask whether to rewrite the description. -- **No** — done. -- **Yes** — run Step 4 if not already done, then preview and apply (see below). +- **No** — continue to Step 6. +- **Yes** — run Step 4 if not already done, then preview and apply (see below), then continue to Step 6. **Description update mode, or existing-PR rewrite confirmed** — preview before applying. Ask: "New title: `` (`<N>` chars). Summary leads with: `<first two sentences>`. Total body: `<L>` lines. Apply?" If declined, the user may pass focus text back for a regenerate; do not apply. If confirmed, apply per "Applying via gh" below using `gh pr edit` and report the URL. + +## Step 6: Best-effort post-ship memory capture + +Full workflow only. Skip this step for Description-only generation and Description Update workflow. + +After the branch has been pushed and the PR has been created, updated, or identified, read `references/memory-capture.md` and follow it. Provide the PR URL, available commit range or commit list, final diff context from Step 4 when available, testing notes, and any plan path or implementation summary supplied by the caller. + +This step must never block the PR report. If no reusable memory candidates exist, if `ce-memory-researcher` is unavailable, or if the memory write fails, continue reporting the PR and mention the skip briefly. Then output the PR URL. 
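The non-blocking behavior required by Step 6 can be pictured as a wrapper that always produces the PR report, whatever the memory capture does. The following is a synchronous sketch for brevity; `CaptureOutcome`, `finalReport`, and the report wording are hypothetical and not part of the skill.

```typescript
type CaptureOutcome =
  | { kind: "saved"; count: number }
  | { kind: "skipped"; reason: string };

// Always returns a report ending with the PR URL, even if capture throws.
function finalReport(prUrl: string, capture: () => CaptureOutcome): string {
  let note: string;
  try {
    const outcome = capture();
    note =
      outcome.kind === "saved"
        ? `Post-ship memory: saved ${outcome.count}.`
        : `Post-ship memory: skipped (${outcome.reason}).`;
  } catch {
    // A failed write is reported briefly and never propagates.
    note = "Post-ship memory: skipped (write failed).";
  }
  return `${note}\n${prUrl}`;
}
```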
+ --- ## Applying via gh diff --git a/plugins/compound-engineering/skills/ce-commit-push-pr/references/memory-capture.md b/plugins/compound-engineering/skills/ce-commit-push-pr/references/memory-capture.md new file mode 100644 index 000000000..6cb3c1179 --- /dev/null +++ b/plugins/compound-engineering/skills/ce-commit-push-pr/references/memory-capture.md @@ -0,0 +1,65 @@ +# Post-Ship Memory Capture + +Use this reference from `ce-commit-push-pr` Step 6 after a full ship workflow has pushed the branch and created, updated, or identified the PR. + +## Goal + +Save only generalizable, verified learnings that can help future work in this or another project. This is best-effort memory capture, not a required shipping gate. + +## Inputs + +Use the context already gathered by `ce-commit-push-pr`: + +- PR URL, when available +- Commit list or commit range +- Final diff context from PR description generation, when available +- Testing and validation notes from the caller, when available +- Plan path or implementation summary from the caller, when available + +Do not run broad new research. Inspect only enough local context to decide whether a memory is worth saving. + +## Candidate Criteria + +Save a memory only when it is both verified and likely reusable. 
Good candidates include: + +- A non-obvious bug root cause and the fix that worked +- A tool, framework, platform, or integration gotcha +- A reusable architecture, testing, migration, or workflow pattern +- A user or team preference that should guide future agent behavior +- A cross-project pattern that is not tied to one repository's incidental file names + +Do not save: + +- Secrets, credentials, tokens, customer data, private environment values, or raw logs that may contain them +- Generic PR summaries, TODOs, or restatements of what changed +- Unverified guesses, failed validation results, or speculative lessons +- Repo-specific trivia that will not help future decisions +- More than three memories from one ship flow + +Prefer skipping over storing low-signal memory. Memory quality matters more than memory volume. + +## Write Flow + +1. Identify up to three atomic memory candidates. +2. If no candidates pass the criteria, return `Post-ship memory: no reusable candidates.` +3. If a `ce-memory-researcher` agent is unavailable, return `Post-ship memory: unavailable.` +4. For each candidate, dispatch `ce-memory-researcher` with `operation: remember`. + +Each `remember` request must include: + +```text +operation: remember +project: <repo or project name> +topic: <short reusable topic> +type: decision | learning | error | pattern | project-update +content: <one atomic verified memory> +source: <PR URL and/or commit SHA and/or plan path> +timestamp: <ISO 8601 timestamp> +confidence: proven | likely | experimental +``` + +Use `confidence: proven` when the learning is backed by passing validation, merged/pushed commits, or direct evidence in the final diff. Use `confidence: likely` only when the learning is useful but not fully proven. Do not store `experimental` memories unless the caller explicitly asks. + +## Failure Handling + +If memory infrastructure is unavailable or a write fails, do not retry more than once and do not block reporting the PR. 
Mention the failure briefly in the final report. diff --git a/plugins/compound-engineering/skills/ce-compound/SKILL.md b/plugins/compound-engineering/skills/ce-compound/SKILL.md index 4b1a93601..de07ba675 100644 --- a/plugins/compound-engineering/skills/ce-compound/SKILL.md +++ b/plugins/compound-engineering/skills/ce-compound/SKILL.md @@ -110,6 +110,16 @@ and codebase findings take priority over these notes. If no relevant entries are found, proceed to Phase 1 without passing memory context. +### Phase 0.6: Persistent Memory Recall (Best Effort) + +If a `ce-memory-researcher` agent is available, ask it for `operation: recall` using the problem being documented, repo name, branch, affected component, and any known root cause or solution summary. + +Use retrieved persistent memory as supplementary context only: +- Pass relevant results to the Context Analyzer and Solution Extractor task prompts in Phase 1 alongside any auto-memory excerpt +- Tag memory-sourced material incorporated into the final documentation with `(persistent memory)` so its origin is clear +- If persistent memory contradicts the verified conversation history, code evidence, or final fix, note it as cautionary context instead of treating it as truth +- If memory is unavailable, returns no relevant entries, or the agent cannot be dispatched, continue without persistent context and do not fail the Compound workflow + ### Phase 1: Research Launch research subagents. Each returns text data to the orchestrator. 
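The tagging and contradiction rules from Phase 0.6 in the `ce-compound` change above might look like this when memory findings are rendered into the Phase 1 task prompts. `MemoryFinding` and `renderMemoryContext` are hypothetical names, and the caution wording is only one possible phrasing.

```typescript
interface MemoryFinding {
  text: string;
  // True when the finding disagrees with verified history, code, or the final fix.
  contradictsCurrentEvidence: boolean;
}

// Tag every memory-sourced line so its origin stays visible downstream.
function renderMemoryContext(findings: MemoryFinding[]): string[] {
  return findings.map((finding) =>
    finding.contradictsCurrentEvidence
      ? `Caution, conflicts with current evidence: ${finding.text} (persistent memory)`
      : `${finding.text} (persistent memory)`,
  );
}
```

An empty input yields an empty list, which matches the rule that missing or irrelevant memory simply contributes nothing.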
diff --git a/plugins/compound-engineering/skills/ce-debug/SKILL.md b/plugins/compound-engineering/skills/ce-debug/SKILL.md index 848649910..8210c0091 100644 --- a/plugins/compound-engineering/skills/ce-debug/SKILL.md +++ b/plugins/compound-engineering/skills/ce-debug/SKILL.md @@ -54,6 +54,15 @@ Read the full conversation — the original description AND every comment, with **Prior-attempt awareness:** If the user indicates prior failed attempts ("I've been trying", "keeps failing", "stuck"), ask what they have already tried before investigating. This avoids repeating failed approaches and is one of the few cases where asking first is the right call. +#### 0.5 Optional Persistent Memory Recall (Best Effort) + +After triage and before reproduction, use persistent memory only as supplementary context: + +- If a `ce-memory-researcher` agent is available, ask for `operation: recall` with the bug summary, error text, affected component, issue reference, and project name when known. +- Look specifically for prior root causes, failed attempts, environment gotchas, and recurring patterns that can sharpen the investigation. +- Do not skip reproduction or causal tracing because memory suggests an answer. +- If memory is unavailable, returns no relevant entries, or the agent cannot be dispatched, continue without persistent context and do not fail the debug workflow. + --- ### Phase 1: Investigate diff --git a/plugins/compound-engineering/skills/ce-plan/SKILL.md b/plugins/compound-engineering/skills/ce-plan/SKILL.md index 4dd7436ce..80c77da02 100644 --- a/plugins/compound-engineering/skills/ce-plan/SKILL.md +++ b/plugins/compound-engineering/skills/ce-plan/SKILL.md @@ -24,6 +24,8 @@ Ask one question at a time. Prefer a concise single-select choice when natural o <feature_description> #$ARGUMENTS </feature_description> +Treat `$ARGUMENTS` as the source of truth whenever it is non-empty. 
Treat command rendering artifacts (collapsed previews, shortened snippets, ellipses, or UI truncation) as display artifacts only. Do not ask the user to re-paste the objective, and do not state that the objective is missing or truncated when `$ARGUMENTS` contains content. + **If the feature description above is empty, ask the user:** "What would you like to plan? Describe the task, goal, or project you have in mind." Then wait for their response before continuing. If the input is present but unclear or underspecified, do not abandon — ask one or two clarifying questions, or proceed to Phase 0.4's planning bootstrap to establish enough context. The goal is always to help the user plan, never to exit the workflow. @@ -55,6 +57,16 @@ Every plan should contain: A plan is ready when an implementer can start confidently without needing the plan to write the code for them. +## Scalability Baseline + +For runtime-path work, assume at least 1000 simultaneous users. When planning, call out any: +- blocking I/O or unbounded work on hot paths +- list endpoints without pagination or limits +- missing indexes, caching, batching, or background work +- shared-state or session contention + +If the request cannot reasonably satisfy that baseline, make it a blocking concern before finalizing the plan. State how runtime-path changes remain safe under concurrent use when relevant. 
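One item in the Scalability Baseline above, list endpoints without pagination or limits, can be made concrete with a bounded pagination helper. This is only a sketch: `paginate`, `DEFAULT_LIMIT`, and `MAX_LIMIT` are illustrative names and values, and a real endpoint would push the limit into the database query rather than slice an in-memory array.

```typescript
const DEFAULT_LIMIT = 50; // illustrative default page size
const MAX_LIMIT = 200; // illustrative hard cap, regardless of what the caller asks for

interface Page<T> {
  page: T[];
  nextOffset: number | null; // null when there is nothing left to fetch
}

// Clamp caller-supplied offset and limit so one request can never return unbounded data.
function paginate<T>(items: readonly T[], offset = 0, limit = DEFAULT_LIMIT): Page<T> {
  const safeOffset = Math.max(0, Math.floor(offset));
  const safeLimit = Math.min(MAX_LIMIT, Math.max(1, Math.floor(limit)));
  const next = safeOffset + safeLimit;
  return {
    page: items.slice(safeOffset, next),
    nextOffset: next < items.length ? next : null,
  };
}
```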
+ ## Workflow ### Phase 0: Resume, Source, and Scope @@ -189,13 +201,23 @@ Fires **only in solo invocation** — when Phase 0.2 found no upstream brainstor ### Phase 1: Gather Context -#### 1.1 Local Research (Always Runs) +#### 1.0 Optional Persistent Memory Warm-Start (Best Effort) -Prepare a concise planning context summary (a paragraph or two) to pass as input to the research agents: +Before standard local research, prepare a concise planning context summary (a paragraph or two): - If an origin document exists, summarize the problem frame, requirements, and key decisions from that document - Otherwise use the feature description directly - If `STRATEGY.md` exists, read it and include the relevant pieces (target problem, approach, active tracks) in the summary so downstream research and planning decisions are anchored to product strategy +If a `ce-memory-researcher` agent is available, dispatch it before Phase 1.1: + +- Task ce-memory-researcher(operation: warm-start. Include project/repo name when known, topic, and the planning context summary.) + +Use returned memory only when it materially affects planning: prior decisions, prior errors, failed attempts, workflow preferences, or cross-project patterns. Treat memory as supplementary evidence; it must not override the origin document, current repo evidence, or explicit user instructions. If memory is unavailable, returns no relevant entries, or the agent cannot be dispatched, continue without persistent context and do not fail the workflow. + +#### 1.1 Local Research (Always Runs) + +Use the planning context summary from Phase 1.0, or prepare it now if persistent memory warm-start was skipped. Pass it as input to the research agents. + Run these agents in parallel: - Task ce-repo-research-analyst(Scope: technology, architecture, patterns. 
{planning context summary}) @@ -276,6 +298,7 @@ If Step 1.2 indicates external research is useful, run these agents in parallel: Summarize: - Relevant codebase patterns and file paths - Relevant institutional learnings +- Relevant persistent memory findings, if gathered, labeled as supplementary memory context - Organizational context from Slack conversations, if gathered (prior discussions, decisions, or domain knowledge relevant to the feature) - External references and best practices, if gathered - Related issues, PRs, or prior art @@ -540,6 +563,14 @@ Fires **only when the plan was sourced from an upstream brainstorm doc** (Phase **Headless mode**: internal draft is composed but stage 2 (chat-time call-outs) is skipped — no synchronous user to confirm to. Proceed to Phase 5.2 plan-write. Inferred bets from the internal draft route to a `## Assumptions` section in the plan instead of Key Technical Decisions. See `references/synthesis-summary.md` Headless mode for the full routing. +#### 5.1.7 Pre-Write Critic Gate + +Before writing the plan file, run a bounded critic pass over the complete draft. This catches blocking executability issues while revision is still cheap and before the plan becomes the durable artifact. + +**STOP. Load `references/plan-critic.md` now before continuing.** It defines the OKAY/REJECT rubric, max three blocking issues, max two revision loops, CE-portable reviewer guidance, and the fallback behavior when the critic still rejects after the loop limit. + +Only revise blocking issues surfaced by the critic. Do not use this gate for perfectionism, stylistic rewrites, or broad re-planning. 
+ #### 5.2 Write Plan File **REQUIRED: Write the plan file to disk before presenting any options.** diff --git a/plugins/compound-engineering/skills/ce-plan/references/plan-critic.md b/plugins/compound-engineering/skills/ce-plan/references/plan-critic.md new file mode 100644 index 000000000..1e01926d4 --- /dev/null +++ b/plugins/compound-engineering/skills/ce-plan/references/plan-critic.md @@ -0,0 +1,82 @@ +# Plan Critic + +Use this reference at `ce-plan` Phase 5.1.7, after the complete plan draft exists and before writing it to disk. + +## Goal + +Answer one question: can a capable implementer execute this plan without getting blocked? + +This is a pre-write gate, not a second planning workflow. Catch blockers while the draft is still cheap to revise. Keep the post-write confidence check and `ce-doc-review` intact because they catch different issue classes. + +## Execution + +Use the lightest reliable critic path: + +- Prefer a CE-shipped reviewer when the platform supports subagent dispatch. Use `ce-feasibility-reviewer` for executability. Add `ce-coherence-reviewer` only when the draft has complex sequencing, many cross-references, or high contradiction risk. +- If subagent dispatch is unavailable, run this rubric as a self-critic in the parent session. +- Do not use local-only planning agents or local command-specific contracts. This skill must work from the Compound Engineering plugin alone. + +Send the critic: + +- The complete draft plan text +- The origin document path and summary, if any +- The plan depth and risk profile +- A request for an `OKAY` or `REJECT` verdict only + +## Rubric + +Return `OKAY` by default unless there is a true blocker. + +`OKAY` means: + +- The plan has enough file paths, patterns, sequencing, and verification detail to start work. +- Any remaining uncertainty is implementation-time discovery, explicitly deferred, or minor enough not to block. +- The plan preserves source requirements and scope boundaries. 
+ +`REJECT` means one or more blockers would stop or mislead implementation: + +- Referenced files or patterns are missing, impossible to locate, or clearly unrelated. +- An implementation unit is too vague to start. +- Dependencies or sequencing contradict each other. +- A planning-owned question is hidden as certainty or deferred to implementation without justification. +- Runtime-path work violates the scalability baseline without mitigation. +- Feature-bearing units omit meaningful test scenarios or verification outcomes. +- The plan changes product scope beyond the user's request or origin document. + +Return max three blocking issues. Each issue must include: + +- The section or U-ID affected +- Why it blocks implementation +- The smallest plan change that would resolve it + +Do not reject for style, completeness polish, optional edge cases, wording preference, or a merely non-optimal architecture. + +## Revision Loop + +Run max two revision loops: + +1. Draft critic returns `OKAY` or `REJECT`. +2. If `OKAY`, continue to Phase 5.2 and write the plan. +3. If `REJECT`, revise only the blocking sections. +4. Re-run the critic on the revised draft. +5. Stop after max two revision loops, even if the critic still returns `REJECT`. + +If still rejected after max two revision loops: + +- Write the plan only if it is useful as a durable artifact. +- Record unresolved blockers in `Open Questions`, `Risks & Dependencies`, or the affected implementation units. +- Clearly state in the handoff summary that the pre-write critic still had unresolved blockers. +- Do not route directly to `ce-work` without surfacing those blockers first. + +## Output Shape + +Use this compact result: + +```text +Critic verdict: OKAY|REJECT +Summary: [1-2 sentences] +Blocking issues: +1. [section/U-ID] [why blocking] -> [smallest fix] +``` + +Omit `Blocking issues` when the verdict is `OKAY`. 
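The revision loop and output shape described in `plan-critic.md` can be sketched in TypeScript as follows. This is an illustration only, not code from the plugin: `CriticResult`, `parseCriticResult`, and `runCriticGate` are hypothetical names, the critic and reviser are passed in as plain callbacks, and the sketch assumes that one "revision loop" means one revise-and-recheck pass (so at most two revisions and three critic calls).

```typescript
// Hypothetical sketch: none of these names exist in the plugin.
type Verdict = "OKAY" | "REJECT"

interface CriticResult {
  verdict: Verdict
  summary: string
  blockingIssues: string[]
}

const MAX_BLOCKING_ISSUES = 3
const MAX_REVISION_LOOPS = 2

// Parse the compact "Critic verdict / Summary / Blocking issues" text shape.
function parseCriticResult(text: string): CriticResult {
  const lines = text.split("\n").map((line) => line.trim())
  const verdictLine = lines.find((line) => line.startsWith("Critic verdict:"))
  const verdictText = verdictLine?.slice("Critic verdict:".length).trim()
  if (verdictText !== "OKAY" && verdictText !== "REJECT") {
    throw new Error(`Unrecognized critic verdict: ${verdictText ?? "(missing)"}`)
  }
  const summaryLine = lines.find((line) => line.startsWith("Summary:"))
  const summary = summaryLine ? summaryLine.slice("Summary:".length).trim() : ""
  // Numbered lines ("1. ...") are blocking issues; OKAY carries none,
  // and REJECT keeps at most three.
  const blockingIssues =
    verdictText === "OKAY"
      ? []
      : lines
          .filter((line) => /^\d+\.\s/.test(line))
          .map((line) => line.replace(/^\d+\.\s+/, ""))
          .slice(0, MAX_BLOCKING_ISSUES)
  return { verdict: verdictText, summary, blockingIssues }
}

// Run the critic, revising only while it rejects and the loop budget remains.
function runCriticGate(
  draft: string,
  critic: (draft: string) => CriticResult,
  revise: (draft: string, issues: string[]) => string,
): { draft: string; result: CriticResult; revisions: number } {
  let current = draft
  let result = critic(current)
  let revisions = 0
  while (result.verdict === "REJECT" && revisions < MAX_REVISION_LOOPS) {
    current = revise(current, result.blockingIssues)
    revisions += 1
    result = critic(current)
  }
  // A REJECT here means the budget ran out; the caller records the blockers.
  return { draft: current, result, revisions }
}

const sample = [
  "Critic verdict: REJECT",
  "Summary: Unit U2 cannot start as written.",
  "Blocking issues:",
  "1. [U2] no target file named -> name the file to change",
].join("\n")
const parsed = parseCriticResult(sample)
console.log(parsed.verdict, parsed.blockingIssues.length) // prints "REJECT 1"
```

When `runCriticGate` returns with a `REJECT` verdict, the fallback rules above still apply: the unresolved blockers go into the plan and the handoff summary. The sketch leaves that step to the caller.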
diff --git a/plugins/compound-engineering/skills/ce-work-beta/SKILL.md b/plugins/compound-engineering/skills/ce-work-beta/SKILL.md index 7c5c8e38a..e8d5f0684 100644 --- a/plugins/compound-engineering/skills/ce-work-beta/SKILL.md +++ b/plugins/compound-engineering/skills/ce-work-beta/SKILL.md @@ -112,6 +112,7 @@ Determine how to proceed based on what was provided in `<input_document>`. - If clarifying questions were needed above, get user approval on the resolved answers. If no clarifications were needed, proceed without a separate approval step — plan scope is the plan's authority, not something to renegotiate - **Do not skip this** - better to ask questions now than build the wrong thing - **Do not edit the plan body during execution.** The plan is a decision artifact; progress lives in git commits and the task tracker. The only plan mutation during ce-work is the final `status: active → completed` flip at shipping (see `references/shipping-workflow.md` Phase 4 Step 2). Legacy plans may contain `- [ ]` / `- [x]` marks on unit headings — ignore them as state; per-unit completion is determined during execution by reading the current file state. + - **Optional persistent memory context:** If a `ce-memory-researcher` agent is available, ask it for `operation: warm-start` using the plan summary, target project, and affected topics. Use results only for execution context: prior decisions, failed attempts, workflow preferences, or cross-project patterns. Memory is supplementary and must not alter plan scope, override current code evidence, or block execution when unavailable. 2. **Setup Environment** diff --git a/plugins/compound-engineering/skills/ce-work/SKILL.md b/plugins/compound-engineering/skills/ce-work/SKILL.md index 72340ff75..9b1a29b25 100644 --- a/plugins/compound-engineering/skills/ce-work/SKILL.md +++ b/plugins/compound-engineering/skills/ce-work/SKILL.md @@ -58,6 +58,7 @@ Determine how to proceed based on what was provided in `<input_document>`. 
- If clarifying questions were needed above, get user approval on the resolved answers. If no clarifications were needed, proceed without a separate approval step — plan scope is the plan's authority, not something to renegotiate - **Do not skip this** - better to ask questions now than build the wrong thing - **Do not edit the plan body during execution.** The plan is a decision artifact; progress lives in git commits and the task tracker. The only plan mutation during ce-work is the final `status: active → completed` flip at shipping (see `references/shipping-workflow.md` Phase 4 Step 2). Legacy plans may contain `- [ ]` / `- [x]` marks on unit headings — ignore them as state; per-unit completion is determined during execution by reading the current file state. + - **Optional persistent memory context:** If a `ce-memory-researcher` agent is available, ask it for `operation: warm-start` using the plan summary, target project, and affected topics. Use results only for execution context: prior decisions, failed attempts, workflow preferences, or cross-project patterns. Memory is supplementary and must not alter plan scope, override current code evidence, or block execution when unavailable. 2. **Setup Environment** diff --git a/plugins/compound-engineering/skills/ce-work/references/shipping-workflow.md b/plugins/compound-engineering/skills/ce-work/references/shipping-workflow.md index 612db97f8..7b9b2646a 100644 --- a/plugins/compound-engineering/skills/ce-work/references/shipping-workflow.md +++ b/plugins/compound-engineering/skills/ce-work/references/shipping-workflow.md @@ -26,7 +26,7 @@ This file contains the shipping workflow (Phase 3-4). It is loaded when all Phas Every change gets reviewed before shipping. Default to Tier 1 and escalate to Tier 2 only when a concrete signal calls for it. Tier 2 is materially more expensive in time and tokens -- pay that cost when a signal justifies it, not as a default. 
- **Tier 1 -- harness-native code review (default).** Run your built-in code review command or skill (e.g., `/review` in Claude Code). Address blocking and suggested findings inline before Final Validation. Skip the Residual Work Gate. If the current harness has no built-in code review command or skill, escalate to Tier 2 -- Tier 1 cannot run, and "Every change gets reviewed" still applies. + **Tier 1 -- harness-native code review (default).** Run your built-in code review command or skill (e.g., `/review` in Claude Code). Address blocking and suggested findings inline before Final Validation. If the reviewer returns structured read-only output with a non-empty `findings` array, route those findings through the Read-only Review Finding Gate below before Final Validation. Skip the Residual Work Gate only after no unresolved Tier 1 findings remain. If the current harness has no built-in code review command or skill, escalate to Tier 2 -- Tier 1 cannot run, and "Every change gets reviewed" still applies. **Tier 2 -- `ce-code-review` (escalation).** Invoke the `ce-code-review` skill with `mode:autofix`, passing `plan:<path>` when known. Then proceed to the Residual Work Gate. @@ -55,7 +55,19 @@ This file contains the shipping workflow (Phase 3-4). It is loaded when all Phas Skip this gate entirely when the review reported `Residual actionable work: none.` or when only Tier 1 was used. Do not proceed past this gate on an `Accept and proceed` decision until the agent has recorded whether the durable sink is `PR Known Residuals` or `docs/residual-review-findings/<branch-or-head-sha>.md`. -5. **Final Validation** +5. **Read-only Review Finding Gate** (REQUIRED when any read-only review reports findings) + + This gate covers review outputs that are not produced by `ce-code-review mode:autofix`, including harness-native reviews, standalone reviewer agents, or `ce-code-review mode:report-only` / `mode:headless` returns. 
If the review output contains a non-empty `findings` array, or otherwise lists concrete P0-P3 findings, do not stop after reporting them and do not proceed to Final Validation until every finding is handled. + + For each finding, choose the first applicable route: + - **Fix now** -- investigate the finding, modify the relevant source or generated asset, and rerun the review or the most direct validator that proves the finding is resolved. + - **Convert to Tier 2 residual** -- when the finding is real but not safe to fix inline, record it as residual actionable work and run the Residual Work Gate path above. + - **Durably defer** -- only with explicit user approval, record the finding in the PR "Known Residuals" section or `docs/residual-review-findings/<branch-or-head-sha>.md` before shipping. + - **Reject as false positive** -- record the evidence for rejection in the handoff summary. + + A read-only review report that says "No tests or gates were run" does not satisfy verification. If fixes were applied, run the focused test, lint, asset check, screenshot check, or other validator that proves the affected behavior or artifact changed. + +6. **Final Validation** - All tasks marked completed - Testing addressed -- tests pass and new/changed behavior has corresponding test coverage (or an explicit justification for why tests are not needed) - Linting passes @@ -65,7 +77,7 @@ This file contains the shipping workflow (Phase 3-4). It is loaded when all Phas - If the plan has a `Requirements` section (or legacy `Requirements Trace`), verify each requirement is satisfied by the completed work - If any `Deferred to Implementation` questions were noted, confirm they were resolved during execution -6. **Prepare Operational Validation Plan** (REQUIRED) +7. **Prepare Operational Validation Plan** (REQUIRED) - Add a `## Post-Deploy Monitoring & Validation` section to the PR description for every change. 
- Include concrete: - Log queries/search terms @@ -99,7 +111,7 @@ This file contains the shipping workflow (Phase 3-4). It is loaded when all Phas - Testing notes (tests added/modified, manual testing performed) - Evidence context from step 1, so `ce-commit-push-pr` can decide whether to ask about capturing evidence - Figma design link (if applicable) - - The Post-Deploy Monitoring & Validation section (see Phase 3 Step 6) + - The Post-Deploy Monitoring & Validation section (see Phase 3 Step 7) - Any "Known Residuals" accepted in the Phase 3 Residual Work Gate, rendered as a dedicated section in the PR body with severity, file:line, and title per finding If the user prefers to commit without creating a PR, load the `ce-commit` skill instead. diff --git a/src/converters/claude-to-opencode.ts b/src/converters/claude-to-opencode.ts index 50b378fb7..28182e5e9 100644 --- a/src/converters/claude-to-opencode.ts +++ b/src/converters/claude-to-opencode.ts @@ -5,6 +5,7 @@ import { type ClaudeCommand, type ClaudeHooks, type ClaudePlugin, + type ClaudeSkill, type ClaudeMcpServer, filterSkillsByPlatform, } from "../types/claude" @@ -86,7 +87,9 @@ export function convertClaudeToOpenCode( options: ClaudeToOpenCodeOptions, ): OpenCodeBundle { const agentFiles = plugin.agents.map((agent) => convertAgent(agent, options)) + const skillDirs = filterSkillsByPlatform(plugin.skills, "opencode") const cmdFiles = convertCommands(plugin.commands) + const generatedSkillCommands = convertSkillCommandWrappers(skillDirs, cmdFiles) const mcp = plugin.mcpServers ? convertMcp(plugin.mcpServers) : undefined const plugins = plugin.hooks ? 
[convertHooks(plugin.hooks)] : [] @@ -101,9 +104,9 @@ export function convertClaudeToOpenCode( pluginName: plugin.manifest.name, config, agents: agentFiles, - commandFiles: cmdFiles, + commandFiles: [...cmdFiles, ...generatedSkillCommands], plugins, - skillDirs: filterSkillsByPlatform(plugin.skills, "opencode").map((skill) => ({ sourceDir: skill.sourceDir, name: skill.name })), + skillDirs: skillDirs.map((skill) => ({ sourceDir: skill.sourceDir, name: skill.name })), } } @@ -154,6 +157,35 @@ function convertCommands(commands: ClaudeCommand[]): OpenCodeCommandFile[] { return files } +function convertSkillCommandWrappers( + skills: ClaudeSkill[], + existingCommands: OpenCodeCommandFile[], +): OpenCodeCommandFile[] { + const existingCommandNames = new Set(existingCommands.map((command) => command.name)) + const files: OpenCodeCommandFile[] = [] + + for (const skill of skills) { + if (existingCommandNames.has(skill.name)) continue + + const frontmatter: Record<string, unknown> = { + description: skill.description ?? 
`Run the ${skill.name} skill`, + "argument-hint": skill.argumentHint, + } + const body = [ + `Read and follow the installed OpenCode skill instructions for \`${skill.name}\`.`, + "", + `Use the project-local skill file if it exists: \`.opencode/skills/${skill.name}/SKILL.md\``, + `Otherwise use the global skill file: \`~/.config/opencode/skills/${skill.name}/SKILL.md\``, + "", + "<skill_input>$ARGUMENTS</skill_input>", + ].join("\n") + + files.push({ name: skill.name, content: formatFrontmatter(frontmatter, body) }) + } + + return files +} + function convertMcp(servers: Record<string, ClaudeMcpServer>): Record<string, OpenCodeMcpServer> { const result: Record<string, OpenCodeMcpServer> = {} for (const [name, server] of Object.entries(servers)) { diff --git a/src/utils/codex-agents.ts b/src/utils/codex-agents.ts index a59dcdbc5..6f8c5e6af 100644 --- a/src/utils/codex-agents.ts +++ b/src/utils/codex-agents.ts @@ -19,7 +19,8 @@ Tool mapping: - LS: use ls via shell_command - WebFetch/WebSearch: use curl or Context7 for library docs - AskUserQuestion/Question: present choices as a numbered list in chat and wait for a reply number. For multi-select (multiSelect: true), accept comma-separated numbers. Never skip or auto-configure — always wait for the user's response before proceeding. -- Task (subagent dispatch) / Subagent / Parallel: run sequentially in main thread; use multi_tool_use.parallel for tool calls +- Task (subagent dispatch) / Agent / Subagent: use spawn_agent with agent_type set to the requested custom agent name. A user invoking a CE skill that contains Task/Agent/Subagent instructions is authorizing those dispatches for that workflow. +- Parallel Task groups: spawn all independent agents before waiting; wait for every spawned agent before synthesis or handoff. If spawn_agent is unavailable, run the same agent tasks sequentially in the main thread and state that the platform degraded to sequential execution. 
- TaskCreate/TaskUpdate/TaskList/TaskGet/TaskStop/TaskOutput (Claude Code task-tracking, current): use update_plan (Codex's task-tracking primitive) - TodoWrite/TodoRead (Claude Code task-tracking, legacy — deprecated, replaced by Task* tools): use update_plan - Skill: open the referenced SKILL.md and follow it diff --git a/tests/codex-agents.test.ts b/tests/codex-agents.test.ts index 13d795685..668d832c5 100644 --- a/tests/codex-agents.test.ts +++ b/tests/codex-agents.test.ts @@ -21,6 +21,8 @@ describe("ensureCodexAgentsFile", () => { const content = await readFile(agentsPath) expect(content).toContain(CODEX_AGENTS_BLOCK_START) expect(content).toContain("Tool mapping") + expect(content).toContain("use spawn_agent") + expect(content).not.toContain("run sequentially in main thread") expect(content).toContain("TaskCreate/TaskUpdate/TaskList/TaskGet/TaskStop/TaskOutput") expect(content).toContain("use update_plan") expect(content).toContain(CODEX_AGENTS_BLOCK_END) diff --git a/tests/converter.test.ts b/tests/converter.test.ts index aae363245..8ac26c502 100644 --- a/tests/converter.test.ts +++ b/tests/converter.test.ts @@ -15,7 +15,7 @@ const compoundEngineeringRoot = path.join( ) describe("convertClaudeToOpenCode", () => { - test("current compound-engineering output is skills and subagents, not commands", async () => { + test("current compound-engineering output includes skill command wrappers", async () => { const plugin = await loadClaudePlugin(compoundEngineeringRoot) const bundle = convertClaudeToOpenCode(plugin, { agentMode: "subagent", @@ -25,14 +25,62 @@ describe("convertClaudeToOpenCode", () => { expect(bundle.agents.length).toBeGreaterThan(0) expect(bundle.skillDirs.length).toBeGreaterThan(0) - expect(bundle.commandFiles).toHaveLength(0) + expect(bundle.commandFiles.find((command) => command.name === "ce-plan")).toBeDefined() + expect(bundle.commandFiles.find((command) => command.name === "ce-setup")).toBeDefined() + expect(bundle.commandFiles.find((command) 
=> command.name === "lfg")).toBeDefined() expect(bundle.plugins).toHaveLength(0) expect(bundle.config.tools).toBeUndefined() + const cePlanCommand = bundle.commandFiles.find((command) => command.name === "ce-plan") + expect(cePlanCommand).toBeDefined() + const parsedCommand = parseFrontmatter(cePlanCommand!.content) + expect(parsedCommand.data.description).toContain("Create structured plans") + expect(parsedCommand.data["argument-hint"]).toContain("feature description") + expect(parsedCommand.body).toContain(".opencode/skills/ce-plan/SKILL.md") + expect(parsedCommand.body).toContain("~/.config/opencode/skills/ce-plan/SKILL.md") + expect(parsedCommand.body).toContain("$ARGUMENTS") + const parsedAgents = bundle.agents.map((agent) => parseFrontmatter(agent.content)) expect(parsedAgents.every((agent) => agent.data.mode === "subagent")).toBe(true) }) + test("does not generate a skill wrapper when a command already owns the skill name", () => { + const plugin: ClaudePlugin = { + root: "/tmp/plugin", + manifest: { name: "fixture", version: "1.0.0" }, + agents: [], + commands: [ + { + name: "skill-one", + description: "Source command wins", + body: "Run the source command.", + sourcePath: "/tmp/plugin/commands/skill-one.md", + }, + ], + skills: [ + { + name: "skill-one", + description: "Generated wrapper should be skipped", + sourceDir: "/tmp/plugin/skills/skill-one", + skillPath: "/tmp/plugin/skills/skill-one/SKILL.md", + }, + ], + } + + const bundle = convertClaudeToOpenCode(plugin, { + agentMode: "subagent", + inferTemperature: false, + permissions: "none", + }) + + const matchingCommands = bundle.commandFiles.filter((command) => command.name === "skill-one") + expect(matchingCommands).toHaveLength(1) + const parsedCommand = parseFrontmatter(matchingCommands[0]!.content) + expect(parsedCommand.data.description).toBe("Source command wins") + expect(parsedCommand.body).toContain("Run the source command.") + 
expect(parsedCommand.body).not.toContain(".opencode/skills/skill-one/SKILL.md") + }) + test("from-command mode: map allowedTools to global permission block", async () => { const plugin = await loadClaudePlugin(fixtureRoot) const bundle = convertClaudeToOpenCode(plugin, { diff --git a/tests/pipeline-review-contract.test.ts b/tests/pipeline-review-contract.test.ts index f090a3783..3d1ac0246 100644 --- a/tests/pipeline-review-contract.test.ts +++ b/tests/pipeline-review-contract.test.ts @@ -314,6 +314,37 @@ describe("ce-plan testing contract", () => { }) describe("ce-plan review contract", () => { + test("requires a pre-write critic gate before saving the plan", async () => { + const skill = await readRepoFile("plugins/compound-engineering/skills/ce-plan/SKILL.md") + const critic = await readRepoFile("plugins/compound-engineering/skills/ce-plan/references/plan-critic.md") + + expect(skill).toContain("#### 5.1.7 Pre-Write Critic Gate") + expect(skill).toContain("`references/plan-critic.md`") + + const criticGateIdx = skill.indexOf("5.1.7 Pre-Write Critic Gate") + const writePlanIdx = skill.indexOf("5.2 Write Plan File") + expect(criticGateIdx).toBeGreaterThan(-1) + expect(criticGateIdx).toBeLessThan(writePlanIdx) + + expect(critic).toContain("OKAY") + expect(critic).toContain("REJECT") + expect(critic).toContain("max three blocking issues") + expect(critic).toContain("max two revision loops") + expect(critic).not.toMatch(/\bmetis\b|\bprometheus\b|\bmomus\b/) + }) + + test("carries pln argument and scalability guardrails into ce-plan", async () => { + const content = await readRepoFile("plugins/compound-engineering/skills/ce-plan/SKILL.md") + + expect(content).toContain("Treat `$ARGUMENTS` as the source of truth") + expect(content).toContain("command rendering artifacts") + expect(content).toContain("1000 simultaneous users") + expect(content).toContain("blocking I/O or unbounded work on hot paths") + expect(content).toContain("list endpoints without pagination or 
limits") + expect(content).toContain("missing indexes, caching, batching, or background work") + expect(content).toContain("shared-state or session contention") + }) + test("requires document review after confidence check", async () => { // Document review instructions extracted to references/plan-handoff.md const content = await readRepoFile("plugins/compound-engineering/skills/ce-plan/references/plan-handoff.md") @@ -376,6 +407,87 @@ describe("ce-plan review contract", () => { }) }) +describe("ce persistent memory contract", () => { + test("ships a portable best-effort memory researcher agent", async () => { + const content = await readRepoFile("plugins/compound-engineering/agents/ce-memory-researcher.agent.md") + + expect(content).toContain("name: ce-memory-researcher") + expect(content).toContain("mcp__neo4j__*") + expect(content).toContain("Memory unavailable - continuing without persistent context.") + expect(content).toContain("Treat memory as supplementary evidence") + expect(content).toContain("Use write operations only when the caller explicitly requests `remember`") + expect(content).not.toContain("~/.factory/mcp.json") + expect(content).not.toContain("cypher-shell") + }) + + test("ce-plan warm-starts from memory before local research", async () => { + const content = await readRepoFile("plugins/compound-engineering/skills/ce-plan/SKILL.md") + + expect(content).toContain("#### 1.0 Optional Persistent Memory Warm-Start") + expect(content).toContain("Task ce-memory-researcher(operation: warm-start") + expect(content).toContain("continue without persistent context and do not fail the workflow") + expect(content).toContain("Relevant persistent memory findings") + + const memoryIdx = content.indexOf("1.0 Optional Persistent Memory Warm-Start") + const localResearchIdx = content.indexOf("1.1 Local Research") + expect(memoryIdx).toBeGreaterThan(-1) + expect(memoryIdx).toBeLessThan(localResearchIdx) + }) + + test("ce-work uses memory only as execution context 
without changing scope", async () => { + const stable = await readRepoFile("plugins/compound-engineering/skills/ce-work/SKILL.md") + const beta = await readRepoFile("plugins/compound-engineering/skills/ce-work-beta/SKILL.md") + + for (const content of [stable, beta]) { + expect(content).toContain("Optional persistent memory context") + expect(content).toContain("ce-memory-researcher") + expect(content).toContain("must not alter plan scope") + expect(content).toContain("block execution when unavailable") + } + }) + + test("ce-commit-push-pr captures reusable post-ship memory without blocking PR reporting", async () => { + const skill = await readRepoFile("plugins/compound-engineering/skills/ce-commit-push-pr/SKILL.md") + const reference = await readRepoFile( + "plugins/compound-engineering/skills/ce-commit-push-pr/references/memory-capture.md" + ) + + expect(skill).toContain("Best-effort post-ship memory capture") + expect(skill).toContain("references/memory-capture.md") + expect(skill).toContain("must never block the PR report") + expect(reference).toContain("operation: remember") + expect(reference).toContain("ce-memory-researcher") + expect(reference).toContain("Save a memory only when it is both verified and likely reusable") + expect(reference).toContain("More than three memories from one ship flow") + expect(reference).toContain("Do not save") + }) + + test("ce-debug recalls prior bug context without replacing investigation", async () => { + const content = await readRepoFile("plugins/compound-engineering/skills/ce-debug/SKILL.md") + + expect(content).toContain("#### 0.5 Optional Persistent Memory Recall") + expect(content).toContain("ce-memory-researcher") + expect(content).toContain("prior root causes, failed attempts") + expect(content).toContain("Do not skip reproduction or causal tracing") + expect(content).toContain("do not fail the debug workflow") + }) + + test("ce-compound can use persistent memory as supplementary evidence", async () => { + const 
content = await readRepoFile("plugins/compound-engineering/skills/ce-compound/SKILL.md") + + expect(content).toContain("### Phase 0.6: Persistent Memory Recall") + expect(content).toContain("ce-memory-researcher") + expect(content).toContain("(persistent memory)") + expect(content).toContain("do not fail the Compound workflow") + }) + + test("ce-code-review remains diff-grounded by default", async () => { + const content = await readRepoFile("plugins/compound-engineering/skills/ce-code-review/SKILL.md") + + expect(content).not.toContain("ce-memory-researcher") + }) +}) + describe("ce-doc-review contract", () => { test("findings-schema autofix_class enum uses ce-code-review-aligned tier names", async () => { const schema = JSON.parse(