diff --git a/.compound-engineering/config.local.example.yaml b/.compound-engineering/config.local.example.yaml index dd3d9b040..236b41a9d 100644 --- a/.compound-engineering/config.local.example.yaml +++ b/.compound-engineering/config.local.example.yaml @@ -12,14 +12,14 @@ # work_delegate_effort: high # minimal | low | medium | high | xhigh (omit to use ~/.codex/config.toml default) # --- Dispatch (external workspace delegation) --- -# Settings for /ce-dispatch, which fans out plan implementation units to -# external agent workspaces (e.g., Conductor) via GitHub issues. +# Settings for /ce-dispatch, which hands one plan implementation unit at a time +# off to an external agent workspace (e.g., Conductor) via a GitHub issue. +# Single-unit sync MVP: orchestrator and agent coordinate via issue comments +# and the resulting PR; the user pings each side manually. -# dispatch_mode: conductor # conductor | (default: conductor) # dispatch_branch_prefix: dispatch/ # branch prefix suggested in dispatch prompts (default: dispatch/) # dispatch_base_branch: main # PR base branch (default: repo default branch) # dispatch_labels: ce-dispatch # comma-separated labels applied to created issues (default: ce-dispatch) -# dispatch_auto_review: true # true | false (default: true) -- auto-run ce-code-review on each new PR # --- Product pulse --- # Settings written by /ce-product-pulse first-run interview. Re-run the skill with diff --git a/plugins/compound-engineering/skills/ce-dispatch/SKILL.md b/plugins/compound-engineering/skills/ce-dispatch/SKILL.md index 17aecb642..037f31f57 100644 --- a/plugins/compound-engineering/skills/ce-dispatch/SKILL.md +++ b/plugins/compound-engineering/skills/ce-dispatch/SKILL.md @@ -1,23 +1,27 @@ --- name: ce-dispatch -description: "[BETA] Dispatch plan implementation units to external agent workspaces via GitHub issues. Use after ce-plan to fan out execution to Conductor workspaces or any issue-driven agent workflow. 
One issue per implementation unit, dispatched in dependency order; the orchestrator monitors PRs, gates merges on dependencies, and re-dispatches newly unblocked units." +description: "[BETA] Dispatch a single plan implementation unit to an external agent workspace via a GitHub issue. Use after ce-plan when you already have a worktree open in Conductor (or any issue-driven workflow) and want the agent to run the compound-engineering loop end-to-end (work -> code review -> compound -> PR). Orchestrator and agent coordinate sync via issue comments and the PR; the user pings each side manually." disable-model-invocation: true argument-hint: "[Plan doc path. Blank to auto-detect latest plan]" --- -# Dispatch Implementation Units to External Agent Workspaces +# Dispatch a Single Implementation Unit -Fan out a structured plan's implementation units to external agent workspaces (Conductor or any issue-driven agent platform) by creating one GitHub issue per dispatchable unit. The orchestrator monitors the resulting pull requests, enforces dependency-ordered merges, and re-dispatches units whose dependencies have just merged. +Hand off **one** implementation unit from a structured plan to an **external agent workspace** (Conductor or any issue-driven workflow) by creating a single GitHub issue. The orchestrator and the agent coordinate **synchronously** via issue comments and the eventual pull request -- no polling, no automated webhooks. The user pings each side manually. -This skill is a sibling to `ce-work` and `ce-work-beta`. Where `ce-work` executes a plan in the **current** session and `ce-work-beta` can delegate to `codex exec`, `ce-dispatch` hands units off to **separate workspaces** that the dispatching session does not control directly. Use it when units are independent enough to parallelize across workspaces, when you want human-in-the-loop review at the PR layer, or when integrating with a workspace platform (e.g., Conductor) that picks up GitHub issues. 
+This skill is the dispatch sibling to `ce-work` and `ce-work-beta`. Where `ce-work` executes a plan in the **current** session and `ce-work-beta` can delegate to `codex exec`, `ce-dispatch` hands one unit off to a **separate workspace** and lets that workspace's agent run the standard compound-engineering loop end-to-end (work -> code review -> compound -> PR). -For background on Conductor's specific behavior (issue-to-workspace lifecycle, startup scripts, PR creation flow), see `references/conductor-notes.md`. For the structure of the prompt embedded in each issue, see `references/dispatch-prompt-template.md`. +For background on Conductor's specific behavior (issue-to-workspace lifecycle, startup scripts, PR creation flow), see `references/conductor-notes.md`. For the structure of the prompt embedded in the issue body, see `references/dispatch-prompt-template.md`. + +## Why one unit at a time? + +This is the MVP shape: simple, sync, in-the-loop. Multi-unit fan-out, dependency graphs, parallel orchestration, and merge-gate enforcement belong in a future iteration. For now, every dispatch is a single GitHub issue and the user opens (or has already opened) one Conductor workspace per dispatch. The chicken-and-egg of "the worktree exists before the issue exists" is solved by **user-first ordering**: the user creates the workspace in Conductor, then invokes `ce-dispatch` from the orchestrating session and supplies the worktree path. ## Interaction Method -When asking the user a question, use the platform's blocking question tool: `AskUserQuestion` in Claude Code (call `ToolSearch` with `select:AskUserQuestion` first if its schema isn't loaded), `request_user_input` in Codex, `ask_user` in Gemini, `ask_user` in Pi (requires the `pi-ask-user` extension). Fall back to numbered options in chat only when no blocking tool exists in the harness or the call errors (e.g., Codex edit modes) — not because a schema load is required. Never silently skip the question. 
+When asking the user a question, use the platform's blocking question tool: `AskUserQuestion` in Claude Code (call `ToolSearch` with `select:AskUserQuestion` first if its schema isn't loaded), `request_user_input` in Codex, `ask_user` in Gemini, `ask_user` in Pi (requires the `pi-ask-user` extension). Fall back to numbered options in chat only when no blocking tool exists in the harness or the call errors (e.g., Codex edit modes) -- not because a schema load is required. Never silently skip the question. -The Phase 4 monitor loop renders **6 menu options**, which exceeds the 4-option cap most blocking tools enforce. For that menu — and only that menu — render a numbered list directly in chat per the option-overflow exception in `plugins/compound-engineering/AGENTS.md`. Tell the user "Pick a number or describe what you want." so the list retains the open-endedness of the blocking tool. Earlier phases (Phase 0 plan-path confirmation, Phase 3 confirm-before-creating-issues) stay within the 4-option cap and use the blocking tool. +The Phase 4 respond menu has **4 options**, which fits the 4-option cap most blocking tools enforce -- always use the blocking tool for it. Earlier phases (Phase 0 plan-path confirmation, Phase 1 unit selection, Phase 3 confirm-before-creating-issue) likewise use the blocking tool. ## Input @@ -34,100 +38,104 @@ If `` is non-empty: If `` is empty: - Auto-detect the latest plan in `docs/plans/`. Sort by file mtime descending; pick the most recently modified `*.md` whose frontmatter has `status: active`. If multiple plans tie, prefer the one whose filename matches today's or yesterday's date prefix. -- Confirm the auto-detected plan with the user via the blocking question tool before proceeding ("Dispatch plan ``? Yes / Pick another / Cancel"). Never silently dispatch the wrong plan. +- Confirm the auto-detected plan with the user via the blocking question tool before proceeding ("Dispatch from plan ``? Yes / Pick another / Cancel"). 
Never silently dispatch the wrong plan. - If no candidate plan exists, stop and tell the user to pass a plan path explicitly. -Resolve the plan path to a repo-relative form (relative to `git rev-parse --show-toplevel`) for use in issue bodies. Repo-relative paths only — absolute paths break across machines. +Resolve the plan path to a repo-relative form (relative to `git rev-parse --show-toplevel`) for use in the issue body. Repo-relative paths only -- absolute paths break across machines. #### 0.2 Read dispatch config -Read `dispatch_*` keys from `.compound-engineering/config.local.yaml` at the repo root (use the native file-read tool — `Read` in Claude Code, `read_file` in Codex). All keys are optional; missing values fall through to the documented defaults below. +Read `dispatch_*` keys from `.compound-engineering/config.local.yaml` at the repo root (use the native file-read tool -- `Read` in Claude Code, `read_file` in Codex). All keys are optional; missing values fall through to the documented defaults below. Config keys and resolution: | Key | Values | Default | |---|---|---| -| `dispatch_mode` | `conductor`, or another short identifier | `conductor` | | `dispatch_branch_prefix` | any string (no leading/trailing slashes) | `dispatch/` | | `dispatch_base_branch` | any branch name | repo's default branch (`git symbolic-ref --short refs/remotes/origin/HEAD`) | | `dispatch_labels` | comma-separated label list | `ce-dispatch` | -| `dispatch_auto_review` | `true` or `false` | `true` | If a key has an unrecognized value, fall through to the default for that key. Do not error. Store the resolved values for the rest of the workflow: -- `mode` — string identifier; affects the wording of in-prompt hints (e.g., "Conductor's `Create PR` action") but never gates behavior. Unknown modes still work — they just get generic phrasing. 
-- `branch_prefix` — used to suggest branch names in the dispatch prompt -- `base_branch` — recorded in metadata; the in-workspace agent targets this branch with the PR -- `labels` — list of labels applied to each created issue -- `auto_review` — when true, Phase 4's review action invokes `ce-code-review` automatically; when false, the user must opt in per PR +- `branch_prefix` -- used to suggest a branch name in the dispatch prompt +- `base_branch` -- recorded in the issue metadata; the in-workspace agent targets this branch with the PR +- `labels` -- list of labels applied to the created issue + +Removed in this MVP: `dispatch_mode`, `dispatch_auto_review`. Mode is no longer multiplexed (one shape only); auto-review is no longer wired (the user opts in to review per PR via the Phase 4 menu). + +#### 0.3 Confirm worktree path and agent name + +The user must have already created a Conductor workspace (or another worktree-based workspace) for this dispatch. Without that, there is no place for the eventual agent to run. -### Phase 1: Parse Plan and Build Dependency Graph +Ask via the blocking question tool: "Paste the absolute path of the Conductor worktree you opened for this dispatch. (e.g., `/Users/you/conductor/workspaces//`)" -Read the plan file and extract the structured fields needed for dispatch. +- The basename of that path (the last path segment) becomes the **agent name** used in the dispatch issue body. The orchestrator and agent address each other in comments using this name (e.g., `[orchestrator -> jackson]`, `[jackson -> orchestrator]`). It is purely a label -- no infrastructure depends on it. +- If the user can't provide a worktree path, stop and tell them: "Create a Conductor workspace for this dispatch first (Cmd+Shift+N), then re-invoke `/ce-dispatch `." Do not invent a path. +- Do not validate the path against the orchestrator's filesystem -- the worktree typically lives outside the orchestrator's checkout and validation would always fail. 
-#### 1.1 Identify implementation units +Record `worktree_path` and `agent_name` for use in Phase 2. -Locate the `Implementation Units` section. Each unit is a top-level bullet whose heading is `- U. ****` (e.g., `- U1. **Add rate limiter**`). For each unit, capture: +### Phase 1: Pick One Implementation Unit + +Read the plan file. Locate the `Implementation Units` section. Each unit is a top-level bullet whose heading is `- U. ****` (e.g., `- U1. **Add rate limiter**`). Capture each unit's: - **U-ID** (e.g., `U1`, `U3`) - **Name** (the bolded heading text) - **Goal** (the unit's "Goal" or "Why" field) -- **Files** (the unit's `Files:` section — Create, Modify, Read paths) +- **Files** (the unit's `Files:` section -- Create, Modify, Read paths) - **Patterns** (the unit's `Patterns to follow` field, if present) - **Approach** (the unit's `Approach` field, if present) - **Verification** (the unit's `Verification` or `Test scenarios` field) -- **Dependencies** (any `Depends on:` field listing other U-IDs, or inferred from the plan's sequencing prose; default to `none`) If the plan has no recognizable Implementation Units section, stop and tell the user the plan must contain implementation units before dispatch. Do not invent units. -#### 1.2 Build the dependency graph - -Construct a directed graph from the captured `Dependencies` lists. Nodes are U-IDs, edges point from a dependency to its dependent (so `U2 depends on U1` means `U1 → U2`). - -- **Cycle check**: detect cycles via topological sort. If any cycle exists, stop and tell the user which U-IDs form the cycle — dispatch cannot proceed until the plan is corrected. -- **Roots** (units with `Dependencies: none`) are the initial dispatch candidates. - -#### 1.3 Parallel Safety Check - -Mirror the parallel-safety analysis from `ce-work` (the canonical version lives in `plugins/compound-engineering/skills/ce-work/SKILL.md`'s "Parallel Safety Check" section). 
Build a file-to-unit mapping from every unit's `Files:` section (Create, Modify, and Test paths). Detect intersections. +Present the captured units to the user via the blocking question tool (single-select). Each option is `: `. The user picks **one**. Multi-unit fan-out is intentionally out of scope for this MVP. -Each external workspace runs in its own working tree (Conductor: one workspace = one branch = one isolated working tree), so file overlap between units in different workspaces does **not** corrupt git state — but it predicts merge conflicts when those PRs land. +This MVP does not build a dependency graph and does not run a parallel-safety check -- only one unit is in flight per `ce-dispatch` invocation. If the user wants to dispatch a second unit while the first is still open, they invoke `ce-dispatch` again separately. -For each pair of units that share files, log the predicted overlap (e.g., "U2 and U4 both modify `config/routes.rb` — expect a merge conflict on the second PR; the agent in the second workspace should rebase before opening the PR"). Carry this forecast into the dispatch prompts (the `` block already tells agents to scope tightly; predicted-overlap pairs additionally get a one-line hint at the bottom of `` naming the other U-ID). +### Phase 2: Generate the Dispatch Prompt -### Phase 2: Generate Dispatch Prompts - -For each dispatchable unit (initially the roots; later, units whose dependencies have all merged), render a self-contained prompt using the template in `references/dispatch-prompt-template.md`. Load that file now and follow its required structure. +For the selected unit, render a self-contained prompt using the template in `references/dispatch-prompt-template.md`. Load that file now and follow its required structure. 
Substitute concrete values for every section: -- `` — plan file repo-relative path; one-sentence project context -- `` — Goal from the unit (single-unit case) -- `` — the unit's combined Create/Modify/Read file list -- `` — the unit's `Patterns to follow` content (or the fallback line) -- `` — the unit's Approach field -- `` — the template's constraints, plus any predicted-overlap hint -- `` — the template's testing guidance, anchored to this unit's test scenarios -- `` — the project's combined test/lint commands (read from the plan or from the repo's package manifest) -- `` — the template's ce-plugin block, unchanged -- `` — the template's PR-description schema, unchanged + +- `` -- a checklist of repo-relative paths the in-workspace agent should `Read` first, before doing any work. Include (only those that exist in the target repo): + - the plan file (the unit was extracted from it) + - `README.md` + - `AGENTS.md` and/or `CLAUDE.md` (root and any plugin-scoped equivalents) + - any `docs/architecture.md`, `docs/architecture/`, or top-level architecture document + - the unit's `Patterns to follow` files (verbatim from the unit) + - the unit's `Files:` paths (so the agent reads existing code before editing) + This is **progressive context exposure**: list paths, do not inline content. The agent reads what it needs. +- `` -- the `agent_name` (worktree dirname from Phase 0.3) and the `worktree_path` (absolute path the user supplied). The agent uses `agent_name` to sign comments; the orchestrator uses it to address the agent. +- `` -- one paragraph orienting the agent: plan path (repo-relative), one-sentence project context (read from plan frontmatter or repo README), note that this issue was created by `ce-dispatch` and corresponds to a single unit. +- `` -- the unit's Goal, verbatim. +- `` -- the unit's combined Create/Modify/Read file list, repo-relative. 
+- `` -- the unit's `Patterns to follow` content, or the fallback line `"No explicit patterns referenced -- follow existing conventions in the modified files."` +- `` -- the unit's Approach, verbatim. +- `` -- the template's constraints block, unchanged. +- `` -- the template's testing guidance, anchored to this unit's test scenarios. +- `` -- the project's combined test/lint commands (read from the plan or from the repo's package manifest). +- `` -- the template's explicit nine-step compound-engineering loop, unchanged. +- `` -- the template's comment-protocol block, unchanged. +- `` -- the template's PR-description schema, unchanged. After the rendered XML body, append the metadata HTML comment from the template, populated with: - `plan: ` -- `unit_ids: ` -- `dependencies: ` +- `unit_id: ` +- `agent_name: ` +- `worktree_path: ` - `expected_branch: -` (e.g., `dispatch/U3-add-rate-limiter`) - `base_branch: ` - `labels: ` - `dispatched_at: ` -**Coalescing units into one issue:** by default, dispatch one unit per issue. Coalesce two or more units into one issue **only** when (a) they share no dependency edges with each other, (b) they share substantial context (same files or same patterns), and (c) coalescing actually reduces work for the in-workspace agent. Default to one-per-issue when in doubt — splitting later costs less than re-merging conflicting PRs. - -### Phase 3: Create Issues +Single-unit dispatch only -- there is no multi-unit coalescing in this MVP. `dependencies:` is intentionally not part of the metadata in the single-unit shape; the orchestrator does not gate merges on dependencies in this iteration. -Before creating any issue, present the dispatch plan to the user via the blocking question tool: list each unit being dispatched in this round (U-ID, name, expected branch), the labels that will be applied, and the base branch. Options: `Create all`, `Create one at a time`, `Cancel`. Default to `Create all` when the user picks it explicitly. 
+### Phase 3: Create the Issue -For each unit being dispatched in this round (only units whose dependencies are already merged or have none): +Before creating the issue, present the dispatch summary to the user via the blocking question tool: U-ID and goal, agent name and worktree path, the labels that will be applied, and the base branch. Options: `Create the issue`, `Edit metadata`, `Cancel`. Default to `Create the issue` when the user picks it explicitly. ```bash gh issue create \ @@ -137,67 +145,57 @@ gh issue create \ ``` Notes: -- Write the rendered prompt to a per-run scratch file under `mktemp -d -t ce-dispatch-XXXXXX` (per the repo's "Scratch Space" guidance in `AGENTS.md`). The scratch directory holds one file per dispatched unit so retries can re-use them. -- The label list comes from `dispatch_labels` (default `ce-dispatch`). If a label does not yet exist in the repo, `gh` prints a warning — surface it to the user once and offer to create the label via `gh label create` (single confirmation, not per-issue). -- After each successful issue creation, capture the issue URL and number and append them to an in-memory `dispatched_units` map keyed by U-ID: `{ U3: { issue_number: 142, issue_url: "...", expected_branch: "dispatch/U3-...", status: "issue_created", pr: null } }`. -- If `gh issue create` fails (auth error, rate limit, etc.), stop the round and surface the error. Do not try to "recover" by retrying with different flags — the user needs to fix the underlying problem. +- Write the rendered prompt to a per-run scratch file under `mktemp -d -t ce-dispatch-XXXXXX` (per the repo's "Scratch Space" guidance in `AGENTS.md`). +- The label list comes from `dispatch_labels` (default `ce-dispatch`). If a label does not yet exist in the repo, `gh` prints a warning -- surface it to the user once and offer to create the label via `gh label create` (single confirmation). 
+- After successful issue creation, capture and store the issue URL and number for the Phase 4 loop. +- If `gh issue create` fails (auth error, rate limit, etc.), stop and surface the error. Do not retry blindly -- the user needs to fix the underlying problem. -After all issues in the round are created, summarize to the user: count, U-IDs dispatched, base branch, and the expectation that workspaces will pick them up. +After the issue is created, tell the user: "Issue `#` created at ``. Open the Conductor workspace at `` and tell the agent: `Read issue # in this repo, then begin.` When the agent posts a comment back here or opens a PR, ping me in this orchestrator session and I'll respond via the Phase 4 menu." -### Phase 4: Monitor and Review +### Phase 4: Respond Loop -This phase is an **interactive loop**. Each iteration the orchestrator presents the user with a numbered menu (rendered in chat — six options exceeds the blocking tool's 4-option cap; see "Interaction Method" above). The user picks an option (or describes what they want in free text); the orchestrator acts; the loop repeats until the user picks `Done for now` or all units are merged. +This phase is an **interactive loop**. Each iteration the orchestrator presents the user with a four-option menu via the blocking question tool. The user picks an option (or describes what they want in free text); the orchestrator acts; the loop repeats until the user picks `Done for now` or the unit is marked complete. -Render the menu as a numbered list and tell the user "Pick a number or describe what you want." +The four options: -``` -Dispatch status: / merged. open PRs. waiting on dependencies. -1. Check PR status — pull latest gh pr view / gh pr checks for every dispatched unit -2. Review a PR — run ce-code-review on a specific PR -3. Merge a PR — squash-merge a PR whose dependencies are all merged and CI is green -4. Dispatch newly unblocked units — re-run Phases 2-3 for units whose dependencies just merged -5. 
Show dependency graph — render the current state of the dispatch graph (merged / open / blocked) -6. Done for now — exit the loop; the dispatched issues and PRs persist -``` +1. **Reply to agent comment** -- read the issue thread, surface the latest agent-to-orchestrator comment, capture the user's reply in the orchestrator session (full context loaded), and post the reply back via `gh issue comment`. +2. **Review the PR** -- the agent has opened a PR; pull it and either invoke `ce-code-review` against it, or capture user-typed feedback and post it as a PR review comment via `gh pr review`. +3. **Mark unit complete** -- the PR has been merged (manually, by the user, in Conductor or the GitHub UI); close the issue and tell the user to archive the Conductor workspace. +4. **Done for now** -- exit the loop; the issue and PR persist. The user can re-invoke `/ce-dispatch ` later to resume the loop. #### 4.1 Routing -Act on the user's selection — do not just announce it. The bare per-option action lives inline below. Elaborate sub-flows (review tool selection, conflict resolution prose) live further down. - -- **Check PR status (1)** — for each dispatched unit, run `gh pr list --state all --search "head:"` (or query by linked issue if the workspace renamed the branch); `--state all` is required because `gh pr list` defaults to open PRs only and would otherwise miss PRs merged outside this orchestrator (GitHub UI, Conductor, another shell). For each match, run `gh pr view --json state,mergeable,statusCheckRollup,headRefName`. Update `dispatched_units` with the latest PR number, state (`OPEN`, `MERGED`, `CLOSED`), CI rollup, and mergeable flag. Re-render the loop status line and re-render the menu. - -- **Review a PR (2)** — ask the user which U-ID's PR to review (blocking tool single-select from open PRs in `dispatched_units`). 
Then invoke the `ce-code-review` skill via the platform's skill-invocation primitive (`Skill` in Claude Code, `Skill` in Codex, the equivalent on Gemini/Pi), passing the PR URL as the argument. When `dispatch_auto_review: true`, also auto-trigger this for every newly opened PR before the user is asked to merge it (record per-PR `reviewed: true` so it isn't re-run). - -- **Merge a PR (3)** — ask which U-ID's PR to merge (blocking tool single-select from PRs that pass the merge gate below). Apply this gate before merging: - - All of the unit's dependencies (per the dependency graph) are already in state `MERGED` in `dispatched_units`. If any dependency is not yet merged, refuse with the message "Cannot merge `` — dependency `` is still . Merge it first." and re-render the menu. - - CI rollup on the PR is green (no `FAILURE` or `ERROR` checks). If checks are pending, ask the user whether to wait or skip. - - The PR has a `## Dispatch Result` section in its body with `Status: completed`. If the section is missing or `Status` is `partial` / `failed`, refuse and surface the issue back to the user. - - When all gates pass, run `gh pr merge --squash --delete-branch`. After the merge succeeds, run the project's test suite (`bun test`, `pytest`, etc., as inferred from the plan or repo manifest); if it fails, surface the failure prominently and ask the user whether to revert. Update `dispatched_units[].status` to `merged`. +Act on the user's selection -- do not just announce it. The bare per-option action lives inline below. - On merge conflict (`gh pr merge` reports the PR is not mergeable due to conflicts), do **not** attempt to resolve the conflict in the dispatching session — the conflict belongs to the workspace that produced the PR. Surface the conflict and advise the user: "Open the workspace, run `git fetch origin && git rebase origin/`, resolve conflicts, push, and re-run option 1 to refresh status." Re-render the menu without merging. 
+- **Reply to agent comment (1)** -- run `gh issue view --json comments,body --jq '.comments[-3:]'` to fetch the latest comments (or pull more if context is missing). Identify the latest comment whose body matches the agent-to-orchestrator comment-protocol shape (`[ -> orchestrator]`). Show the question to the user. Capture the user's reply via the blocking question tool ("free text" / "ask a follow-up first" / "skip"). If `free text`, format the reply as `**[orchestrator -> ] **\n\n` and post via `gh issue comment --body-file `. After posting, tell the user: "Reply posted. Ping the agent in Conductor: `Read the new comment on issue # and continue.`" -- **Dispatch newly unblocked units (4)** — recompute the dispatchable set: U-IDs whose dependencies are all `merged` and that have not yet been dispatched. Re-enter Phases 2-3 for that set. If the set is empty, say so and re-render the menu. +- **Review the PR (2)** -- run `gh pr list --state all --search " in:body"` (or `gh pr view ` if the user supplied it). `--state all` is required because `gh pr list` defaults to open PRs only and would otherwise miss a PR merged elsewhere. If a PR is found and is `OPEN`, ask the user (blocking tool): `Run ce-code-review now / Type feedback to post / Approve and tell user to merge / Skip`. + - `Run ce-code-review` -- invoke the `ce-code-review` skill via the platform's skill-invocation primitive (`Skill` in Claude Code, `Skill` in Codex, the equivalent on Gemini/Pi), passing the PR URL. + - `Type feedback to post` -- capture user-typed feedback, post via `gh pr review --comment --body-file `. Tell the user: "Review posted. Ping the agent in Conductor: `Run /ce-resolve-pr-feedback on PR #`." + - `Approve and tell user to merge` -- post `gh pr review --approve --body "Approved via ce-dispatch."` Tell the user: "Approved. Merge in Conductor / GitHub UI when ready, then re-enter the loop and pick `Mark unit complete`." + - `Skip` -- re-render the menu. 
-- **Show dependency graph (5)** — render an ASCII graph (or a Mermaid diagram if the harness renders one) of all U-IDs, with each node labeled by U-ID and current state (`merged` / `open #PR` / `blocked` / `pending`). Re-render the menu. +- **Mark unit complete (3)** -- run `gh pr view --json state,mergedAt`. If the PR state is `MERGED`, run `gh issue close --comment "Unit complete. PR merged: ."` Tell the user: "Issue `#` closed. You can archive the Conductor workspace at `` now (Conductor's Archive action)." Exit the loop. If the PR is not merged yet, tell the user to merge first and re-enter the loop. -- **Done for now (6)** — print a summary (units merged, units still open, units blocked) and exit the loop. The dispatched issues and PRs persist in GitHub; the user can re-invoke `ce-dispatch` later to resume monitoring. +- **Done for now (4)** -- print a summary (issue URL, PR URL if open, current state) and exit the loop. The dispatched issue and PR persist on GitHub; the user can re-invoke `ce-dispatch` later to resume. If the user enters free text instead of a number, interpret intent and route to the closest option, or ask one clarifying question and resume the loop. #### 4.2 Completion -The skill is **not** complete until the user picks `Done for now` or every unit in the plan is in state `merged`. Re-rendering the menu and stopping at the user's selection without acting on it is not completion — fire the routed action. +The skill is **not** complete until the user picks `Done for now` or `Mark unit complete`. Re-rendering the menu and stopping at the user's selection without acting on it is not completion -- fire the routed action. -When every unit is merged, congratulate the user, optionally run the plan's final verification command (e.g., the full test suite from ``), and exit the loop. Do not auto-close the dispatched issues — `gh pr merge` typically closes them via the linked-issue mechanism, but verify and report. 
+When the unit is marked complete, congratulate the user briefly and exit. Do not auto-rescan the plan for follow-up units -- multi-unit dispatch is out of scope; the user invokes `ce-dispatch` again for the next unit. ## Pipeline Mode -If `ce-dispatch` is invoked from an automated workflow (e.g., LFG, or any `disable-model-invocation` upstream), skip the Phase 4 interactive loop and return immediately after Phase 3 with a structured summary of dispatched units. The caller decides what to do with the open PRs. +If `ce-dispatch` is invoked from an automated workflow (e.g., LFG, or any `disable-model-invocation` upstream), skip the Phase 4 interactive loop and return immediately after Phase 3 with a structured summary of the dispatched unit (issue URL, agent name, worktree path, expected branch). The caller decides what to do next. ## What ce-dispatch does NOT do -- It does not programmatically create Conductor workspaces. Conductor opens workspaces from issues at the user's discretion (per `references/conductor-notes.md`, section 1). +- It does not programmatically create Conductor workspaces. The user creates the workspace before invoking the skill (see Phase 0.3). - It does not write to or modify the dispatched workspace's filesystem. The orchestrating session only touches GitHub via `gh` and the local plan file. -- It does not edit the plan file. Plan mutations are `ce-plan`'s job; execution progress lives in git and the dispatched-units map, never in the plan body. -- It does not run a long-running background poller. The Phase 4 menu refreshes on user request — there is no implicit "watch" loop between menu interactions. +- It does not edit the plan file. Plan mutations are `ce-plan`'s job; execution progress lives in git, the GitHub issue thread, and the resulting PR. +- It does not run a long-running background poller or webhook. The Phase 4 menu refreshes only when the user re-enters it. 
+- It does not fan out multiple units, build a dependency graph, or gate merges on dependency order. One unit per dispatch; the user invokes `ce-dispatch` again for the next unit. +- It does not auto-merge PRs or run the project's test suite after merge. The user merges manually (in Conductor or the GitHub UI) and re-enters the loop with `Mark unit complete`. diff --git a/plugins/compound-engineering/skills/ce-dispatch/references/conductor-notes.md b/plugins/compound-engineering/skills/ce-dispatch/references/conductor-notes.md index 2764cdc79..e7664e063 100644 --- a/plugins/compound-engineering/skills/ce-dispatch/references/conductor-notes.md +++ b/plugins/compound-engineering/skills/ce-dispatch/references/conductor-notes.md @@ -61,5 +61,5 @@ Source: [Workflow](https://www.conductor.build/docs/concepts/workflow), [From is - **Label** (default `ce-dispatch`): so humans can filter their issue list; not a Conductor requirement. - **Branch name suggestion** (`dispatch_branch_prefix` + U-ID + slug): so the orchestrator can correlate PRs back to U-IDs in Phase 4; the in-workspace agent is encouraged but not forced to honor it. -- **HTML metadata comment in the issue body** (plan path, U-ID, dependencies, expected branch, base branch): structured data the orchestrator parses on subsequent runs to detect dependency state without rebuilding the graph from scratch. The HTML comment renders invisibly to humans on GitHub but stays parseable. +- **HTML metadata comment in the issue body** (plan path, U-ID, agent name, worktree path, expected branch, base branch, labels, dispatched-at timestamp): structured data the orchestrator parses on subsequent runs so the Phase 4 respond loop can pick up where the user left off without re-asking for the worktree path or branch metadata. The HTML comment renders invisibly to humans on GitHub but stays parseable. 
Single-unit MVP: dependency-graph state is intentionally not part of the metadata; multi-unit dispatch (when it returns) can re-introduce that field. - **PR-based output contract** (a `## Dispatch Result` section in the PR description): replaces ce-work-beta's `--output-schema` JSON, since dispatched agents don't have a shared scratch directory with the orchestrator. The PR description is the durable handoff surface. diff --git a/plugins/compound-engineering/skills/ce-dispatch/references/dispatch-prompt-template.md b/plugins/compound-engineering/skills/ce-dispatch/references/dispatch-prompt-template.md index b57fae0ac..0f53d281f 100644 --- a/plugins/compound-engineering/skills/ce-dispatch/references/dispatch-prompt-template.md +++ b/plugins/compound-engineering/skills/ce-dispatch/references/dispatch-prompt-template.md @@ -1,47 +1,74 @@ # Dispatch Prompt Template -Build the dispatch prompt for each implementation unit (or coalesced batch of units with no inter-batch dependencies) using the XML-tagged sections below. The full rendered prompt becomes the **GitHub issue body** so the in-workspace agent (e.g., a Conductor workspace opened from the issue) sees the entire instruction set as its starting context. +Build the dispatch prompt for a single implementation unit using the XML-tagged sections below. The full rendered prompt becomes the **GitHub issue body** so the in-workspace agent (e.g., a Conductor workspace opened in the user-supplied worktree) sees the entire instruction set as its starting context. -The prompt is intentionally self-contained: do not assume the in-workspace agent has access to scratch directories, side-channel files, or shared state with the dispatching orchestrator. The plan file is referenced by repo-relative path so the agent can `Read` it for additional context. 
+The prompt is intentionally self-contained: do not assume the in-workspace agent has access to scratch directories, side-channel files, or shared state with the dispatching orchestrator. The plan file and orientation files are referenced by repo-relative path so the agent can `Read` them for additional context (progressive context exposure -- list paths, do not inline content). + +This template is for **single-unit sync MVP dispatch**. Multi-unit fan-out, dependency-graph metadata, and parallel coordination hints are intentionally not part of this template. ## Required structure Render exactly these sections, in this order. Keep the XML tags so downstream tooling (and the contract test) can validate structure. ```xml + +[A checklist of repo-relative paths the agent should `Read` first, before doing +any work. Include only paths that exist in the target repo. Recommended set: +- The plan file (this unit was extracted from it) +- README.md +- AGENTS.md (and CLAUDE.md if it diverges from AGENTS.md; root-level and any + plugin-scoped equivalents the unit touches) +- docs/architecture.md, docs/architecture/, or any top-level architecture doc +- The unit's `Patterns to follow` files (verbatim from the unit) +- The unit's `Files:` paths so the agent reads existing code before editing + +Render as a Markdown bullet list with each path inline-quoted in backticks. +The agent reads these to come in green and build context, rather than the +orchestrator wasting prompt tokens inlining the content here.] + + + +[The agent's name and worktree absolute path, supplied by the user when +ce-dispatch was invoked. + +Render as: +- agent-name: `` +- worktree-path: `` + +The agent uses `agent-name` to sign comments on this issue +(`[ -> orchestrator]`). The orchestrator addresses the agent the +same way (`[orchestrator -> ]`). It is purely a label -- no +infrastructure depends on it.] 
+ + [One paragraph orienting the in-workspace agent: - Plan file path (repo-relative) the unit was extracted from - One-sentence project context (read from plan frontmatter / repo README if available) - Note that this issue was created by ce-dispatch and corresponds to a single - implementation unit (or a small batch of independent units) from the plan. -The agent should `Read` the plan file for the full picture before starting.] + implementation unit from the plan. +The agent should `Read` the plan file (and the orientation files above) for +the full picture before starting.] -[For a single-unit dispatch: Goal from the implementation unit, verbatim. -For a coalesced multi-unit dispatch: list each unit with its U-ID and Goal, -stating the concrete job, repository context, and expected end state for each. -Multi-unit dispatch is only valid when the units have no dependencies on each -other and share enough context that batching is more efficient than separate -issues -- otherwise prefer one issue per unit.] +[The Goal from the implementation unit, verbatim. Single-unit dispatch only -- +no multi-unit coalescing in this template.] -[Combined file list from the unit(s) -- files to create, modify, or read. +[The unit's combined file list -- files to create, modify, or read. Use the plan's `Files:` section as the source of truth. Repo-relative paths only.] -[File paths and conventions from the unit(s) "Patterns to follow" fields. If no +[File paths and conventions from the unit's "Patterns to follow" field. If no patterns are specified: "No explicit patterns referenced -- follow existing conventions in the modified files."] -[For a single-unit dispatch: Approach from the unit, verbatim. -For a multi-unit dispatch: list each unit's approach, noting any suggested -ordering within the batch.] +[The Approach from the unit, verbatim.] @@ -63,8 +90,9 @@ ordering within the batch.] - Resolve the task fully before opening the PR. 
Do not stop at the first plausible implementation if verification has not passed. - If you discover mid-execution that the unit's scope is wrong, the plan is - inconsistent, or required context is missing, surface that in the PR body's - `Issues` field rather than silently expanding scope. + inconsistent, or required context is missing, surface that in a new comment + on this issue using the `` below -- do not silently + expand scope. @@ -88,38 +116,101 @@ fix the issues and re-run until they pass. Do not open the PR until verification passes -- the orchestrator will not re-run verification before merging. -[Test and lint commands from the project. Use the union of the unit(s) -verification commands as a single combined invocation.] +[Test and lint commands from the project. Use the unit's verification +commands as a single combined invocation.] -The Compound Engineering (CE) plugin may be installed in this workspace -- -check by running the platform's plugin/skill listing command, or by listing -skills available to the harness. Two execution paths: - -- **Option A (preferred when CE plugin is installed):** Invoke `/ce-work` with - the plan path passed as the argument (use the platform's skill-invocation - primitive: `Skill` in Claude Code, `Skill` in Codex, the equivalent on - Gemini/Pi). `ce-work` reads the plan, builds a task list scoped to this - unit's U-ID, follows the project's patterns, and runs the standard - shipping workflow. -- **Option B (CE plugin not installed):** Follow the ``, ``, - ``, ``, ``, ``, and `` - sections in this prompt directly without delegating to a CE skill. - -Once implementation passes verification, commit and push. If the CE plugin is -installed, prefer `/ce-commit-push-pr` to author commits and open the PR with -project-aware metadata. Otherwise commit with `git commit`, push with -`git push`, and open the PR with the harness's PR action or `gh pr create`. - -The CE plugin is optional. 
The dispatch prompt is fully self-contained -without it. +The Compound Engineering (CE) plugin is the recommended path for this +dispatch. Follow the **nine-step sequence** below. Each step is explicit so +the agent runs the full compound-engineering loop end-to-end (work -> code +review -> compound -> PR -> standby for feedback). + +1. **Read the orientation files** in `` above. Build context + before doing any work. Do not skip this -- the orchestrator selected + these files specifically. +2. **Run `/ce-work`** with the plan path passed as the argument (use the + platform's skill-invocation primitive: `Skill` in Claude Code, `Skill` + in Codex, the equivalent on Gemini/Pi). `ce-work` reads the plan, builds + a task list scoped to this unit's U-ID, and walks the implementation. + If `ce-work` produces a task list that needs the orchestrator's input + (ambiguity, missing context, scope question), STOP and use the + `` to ask -- do not proceed past the question. +3. **Implement and verify** per ``, ``, ``, + ``, ``, ``, and `` above. +4. **Run `/ce-code-review`** against your branch before opening the PR. + Use the platform's skill-invocation primitive. Address findings inline + if straightforward; defer to the orchestrator via the comment protocol + if the finding implies architectural change. +5. **Run `/ce-compound`** if the unit produced learnings worth capturing + (a non-obvious bug fix, a pattern that should be documented, a + reproducible failure mode). Skip when there are no learnings. +6. **Run `/ce-commit-push-pr`** to commit the work, push the branch, and + open the PR with an adaptive description. If `ce-commit-push-pr` is not + available in this workspace, fall back to `git commit && git push && + gh pr create` and write the `## Dispatch Result` section by hand per + ``. +7. **Append a comment** to this issue with the PR URL. Format: + `**[ -> orchestrator] **\n\nPR opened: . Standing by for review.` +8. **Stop. 
Wait for orchestrator ping.** Do not poll. Do not start the + next unit. Conductor (or the user) will surface the new orchestrator + comment to you when the orchestrator replies. +9. **On orchestrator ping** with PR feedback: run `/ce-resolve-pr-feedback` + on the PR (use the platform's skill-invocation primitive). On + orchestrator ping with an issue-comment clarification: re-read the + issue thread, then continue the work. Loop until the orchestrator + approves the PR. + +If the CE plugin is **not** installed in this workspace, fall back to +following ``, ``, ``, ``, ``, +``, and `` directly, and use `git` + `gh` for the +commit/push/PR steps. The compound-engineering sequence still applies; only +the skill invocations are replaced with manual equivalents. + +Use issue comments **only for clarifications** you cannot resolve from this +issue body and the orientation files. Routine progress updates do not +belong in comments -- the PR description is the durable progress surface. + +**When to comment:** +- A decision you cannot make from this issue body alone changes a public + interface. +- A decision introduces a new dependency or pattern not already in + `` or the orientation files. +- The unit's stated approach turns out to be wrong, inconsistent with the + plan, or missing required context. +- Verification reveals the plan itself is wrong (e.g., the test scenarios + contradict the goal). + +**Format:** +- Open a new comment on this issue. +- First line: `**[ -> orchestrator] **` +- Then a blank line, then the body. The body must include: + 1. **Question** -- one or two sentences naming the decision. + 2. **What you considered** -- options you evaluated and why none was + obvious. + 3. **What you need from the orchestrator** -- the specific input that + unblocks you. + +**After commenting:** +- STOP. Do not proceed past the open question. Do not start the next + related task. Wait for an `**[orchestrator -> ] **` + reply. 
+- On reply, re-read the full comment thread before continuing. +- If the reply does not fully unblock, ask a follow-up using the same + format and stop again. + +The orchestrator addresses you in the same shape: +`**[orchestrator -> ] **` followed by the reply +body. You should be able to identify orchestrator replies unambiguously +from the prefix. + + Report the result via the **PR description**, not via a JSON file or scratch -artifact -- ce-dispatch reads the PR body to drive Phase 4 monitoring, -review, and merge gating. +artifact -- ce-dispatch reads the PR body in the Phase 4 respond loop to +drive review and merge gating. Render this section verbatim under a top-level `## Dispatch Result` heading in the PR description (Markdown, not XML in the rendered PR): @@ -144,7 +235,7 @@ in the PR description (Markdown, not XML in the rendered PR): (e.g., `bun test -- 14 passed, 0 failed` or `pytest -- exit code 0`). If verification was not possible, say why. -**Unit ID:** the U-ID(s) this PR satisfies (e.g., `U3` or `U3, U5`). +**Unit ID:** the U-ID this PR satisfies (e.g., `U3`). **Plan path:** the repo-relative plan file path. ``` @@ -156,8 +247,9 @@ Append the following HTML comment **outside** the `` block, at ```html ``` +Note: `dependencies:` is intentionally absent from the metadata in this single-unit MVP. Dependency-graph orchestration is out of scope for the MVP; the orchestrator does not gate merges on dependency state. A future iteration can re-introduce the field when multi-unit dispatch returns. + ## What the orchestrator does NOT include in the prompt - **Scratch directory paths**: the in-workspace agent has its own filesystem; do not reference paths from the orchestrator's machine. - **Codex CLI invocation flags or `--output-schema` artifacts**: `ce-dispatch` does not delegate to `codex exec` directly; the in-workspace agent runs whatever harness Conductor (or another platform) provides. 
-- **Orchestrator-private state**: dependency graphs, parallel-safety analysis, or the dispatch order. The in-workspace agent only needs its own unit context. +- **Orchestrator-private state**: dependency graphs, parallel-safety analysis, dispatch order. The single-unit MVP does not produce any of those. ## Token budget guidance -Keep the rendered prompt under ~8k tokens when possible. If a unit's plan section is large, link to the plan via repo-relative path inside `` rather than inlining the full text — the agent can `Read` it. +Keep the rendered prompt under ~8k tokens when possible. The `` block is the main lever: list paths, do not inline content. If a unit's plan section is large, link to the plan via repo-relative path inside `` rather than inlining the full text -- the agent can `Read` it. diff --git a/plugins/compound-engineering/skills/ce-plan/SKILL.md b/plugins/compound-engineering/skills/ce-plan/SKILL.md index 1c1f109e5..4e11783da 100644 --- a/plugins/compound-engineering/skills/ce-plan/SKILL.md +++ b/plugins/compound-engineering/skills/ce-plan/SKILL.md @@ -904,7 +904,7 @@ After document review and final checks, present this menu. The five options belo - **Start `/ce-work`** — Invoke the `ce-work` skill via the platform's skill-invocation primitive (`Skill` in Claude Code, `Skill` in Codex, the equivalent on Gemini/Pi), passing the plan path as the skill argument. Do not merely tell the user to type `/ce-work` — fire the invocation now so the plan executes in this session. - **Create Issue** — Detect the project tracker (`gh` for GitHub, `linear` for Linear) and create the issue from the plan file as described under "Issue Creation" in `references/plan-handoff.md`. After creation, display the issue URL and ask whether to proceed to `/ce-work` via the platform's blocking question tool. 
- **Open in Proof (web app) — review and comment to iterate with the agent** — Load the `ce-proof` skill in HITL-review mode with the plan file as `source file`, the plan title as `doc title`, identity `ai:compound-engineering` / `Compound Engineering`, and recommended next step `/ce-work`. Then follow the post-HITL resync logic in `references/plan-handoff.md`, which handles the four `ce-proof` return statuses, re-runs `ce-doc-review` after material edits, and falls back gracefully on upload failure. -- **Dispatch to external agents** — Invoke the `ce-dispatch` skill via the platform's skill-invocation primitive (`Skill` in Claude Code, `Skill` in Codex, the equivalent on Gemini/Pi), passing the plan path as the skill argument. Do not merely tell the user to type `/ce-dispatch` — fire the invocation now so the plan's implementation units fan out to GitHub issues in this session. +- **Dispatch to external agents** — Invoke the `ce-dispatch` skill via the platform's skill-invocation primitive (`Skill` in Claude Code, `Skill` in Codex, the equivalent on Gemini/Pi), passing the plan path as the skill argument. Do not merely tell the user to type `/ce-dispatch` — fire the invocation now. The skill hands one implementation unit at a time off to a Conductor (or other issue-driven) workspace via a single GitHub issue; the orchestrator and agent coordinate sync via issue comments and the resulting PR. - **Done for now** — Display a brief confirmation that the plan file is saved and end the turn. Do not start follow-up work without an explicit further user prompt. If the user asks for another document review (either from a contextual prompt about residual findings or via free-form request), load the `ce-doc-review` skill with the plan path for another pass and then return to this menu. For free-text revisions outside the five options, accept the input and loop back to this menu after applying the revision. 
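A minimal sketch of how an orchestrator might read the `ce-dispatch-metadata` footer back out of an issue body on a later run. It assumes the footer is an HTML comment that opens with the `ce-dispatch-metadata` marker and carries one `key: value` pair per line; the contract test anchors keys at line start, but the exact footer layout is not shown in this diff. `DispatchMetadata` and `parseDispatchMetadata` are hypothetical names, not part of the skill.

```typescript
// Hypothetical helper: the footer layout (marker line, then `key: value`
// lines, then `-->`) is an assumption, not the skill's confirmed format.
interface DispatchMetadata {
  [key: string]: string
}

function parseDispatchMetadata(issueBody: string): DispatchMetadata | null {
  // Capture everything between the marker line and the closing `-->`.
  const match = issueBody.match(/<!--\s*ce-dispatch-metadata\s*\n([\s\S]*?)-->/)
  if (!match) return null

  const block = match[1] ?? ""
  const metadata: DispatchMetadata = {}
  for (const line of block.split("\n")) {
    // Split on the first colon only, so values such as ISO timestamps
    // keep their own colons.
    const sep = line.indexOf(":")
    if (sep === -1) continue
    const key = line.slice(0, sep).trim()
    const value = line.slice(sep + 1).trim()
    if (key) metadata[key] = value
  }
  return metadata
}
```

Returning `null` when the marker is missing lets the caller tell an issue that `ce-dispatch` never touched apart from one whose footer is present but empty.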
diff --git a/plugins/compound-engineering/skills/ce-plan/references/plan-handoff.md b/plugins/compound-engineering/skills/ce-plan/references/plan-handoff.md index 0d12daf5c..363f0ce4d 100644 --- a/plugins/compound-engineering/skills/ce-plan/references/plan-handoff.md +++ b/plugins/compound-engineering/skills/ce-plan/references/plan-handoff.md @@ -64,7 +64,7 @@ Based on selection (the bare per-option routing is also stated inline in the SKI - `status: aborted` -> fall back to the options without changes. If the initial upload fails (network error, Proof API down), retry once after a short wait. If it still fails, tell the user the upload didn't succeed and briefly explain why, then return to the options — don't leave them wondering why the option did nothing. -- **Dispatch to external agents** -> Invoke the `ce-dispatch` skill via the platform's skill-invocation primitive (`Skill` in Claude Code, `Skill` in Codex, the equivalent on Gemini/Pi), passing the plan path as the skill argument. Do not merely tell the user to type `/ce-dispatch` — fire the invocation now so the plan's implementation units fan out to GitHub issues in this session. The dispatched workspaces (e.g., Conductor) pick up those issues for parallel execution; the orchestrating session monitors the resulting PRs and gates merges on dependency order. +- **Dispatch to external agents** -> Invoke the `ce-dispatch` skill via the platform's skill-invocation primitive (`Skill` in Claude Code, `Skill` in Codex, the equivalent on Gemini/Pi), passing the plan path as the skill argument. Do not merely tell the user to type `/ce-dispatch` — fire the invocation now. The skill is the single-unit sync MVP: it hands one implementation unit at a time off to a Conductor (or other issue-driven) workspace via a single GitHub issue, and the orchestrating session coordinates with the agent via issue comments and the resulting PR. 
The user pings each side manually; dispatching multiple units, dependency-graph orchestration, and merge-gate enforcement are out of scope for the MVP and require re-invoking `ce-dispatch` per unit. - **Done for now** -> Display a brief confirmation that the plan file is saved and end the turn. Do not start follow-up work without an explicit further user prompt. - **If the user asks for another document review** (either from the contextual prompt when P0/P1 findings remain, or by free-form request) -> Load the `ce-doc-review` skill with the plan path for another pass, then return to the options - **Other** -> Accept free text for revisions and loop back to options diff --git a/plugins/compound-engineering/skills/ce-setup/references/config-template.yaml b/plugins/compound-engineering/skills/ce-setup/references/config-template.yaml index dd3d9b040..236b41a9d 100644 --- a/plugins/compound-engineering/skills/ce-setup/references/config-template.yaml +++ b/plugins/compound-engineering/skills/ce-setup/references/config-template.yaml @@ -12,14 +12,14 @@ # work_delegate_effort: high # minimal | low | medium | high | xhigh (omit to use ~/.codex/config.toml default) # --- Dispatch (external workspace delegation) --- -# Settings for /ce-dispatch, which fans out plan implementation units to -# external agent workspaces (e.g., Conductor) via GitHub issues. +# Settings for /ce-dispatch, which hands one plan implementation unit at a time +# off to an external agent workspace (e.g., Conductor) via a GitHub issue. +# Single-unit sync MVP: orchestrator and agent coordinate via issue comments +# and the resulting PR; the user pings each side manually. 
-# dispatch_mode: conductor # conductor | (default: conductor) # dispatch_branch_prefix: dispatch/ # branch prefix suggested in dispatch prompts (default: dispatch/) # dispatch_base_branch: main # PR base branch (default: repo default branch) # dispatch_labels: ce-dispatch # comma-separated labels applied to created issues (default: ce-dispatch) -# dispatch_auto_review: true # true | false (default: true) -- auto-run ce-code-review on each new PR # --- Product pulse --- # Settings written by /ce-product-pulse first-run interview. Re-run the skill with diff --git a/tests/skills/ce-dispatch-contract.test.ts b/tests/skills/ce-dispatch-contract.test.ts index e65425c3a..a85114a74 100644 --- a/tests/skills/ce-dispatch-contract.test.ts +++ b/tests/skills/ce-dispatch-contract.test.ts @@ -55,17 +55,26 @@ describe("ce-dispatch SKILL.md frontmatter", () => { expect(fm.name).toBe("ce-dispatch") }) - test("description is present and mentions dispatch + plan implementation units", () => { + test("description carries [BETA] prefix per the beta-skills framework", () => { const description = fm.description expect(typeof description).toBe("string") const desc = description as string expect(desc.length).toBeGreaterThan(40) expect(desc.length).toBeLessThanOrEqual(1024) - expect(desc.toLowerCase()).toContain("dispatch") - expect(desc.toLowerCase()).toContain("implementation unit") + // BETA marker (paired with disable-model-invocation: true) + expect(desc.startsWith("[BETA]")).toBe(true) }) - test("disable-model-invocation is true (beta skill)", () => { + test("description names dispatch + single implementation unit (single-unit MVP shape)", () => { + const desc = (parseFrontmatter(SKILL_BODY).description as string).toLowerCase() + expect(desc).toContain("dispatch") + expect(desc).toContain("implementation unit") + // The MVP is single-unit; the description should reflect that so users + // who skim it (or AskUserQuestion summaries) don't expect fan-out. 
+ expect(desc).toContain("single") + }) + + test("disable-model-invocation is true (beta skill triplet)", () => { expect(fm["disable-model-invocation"]).toBe(true) }) @@ -89,7 +98,7 @@ describe("ce-dispatch SKILL.md phases", () => { } }) - test("Phase 0 covers input + config resolution", () => { + test("Phase 0 covers input + config resolution + worktree confirmation", () => { const phase0Start = phaseHeadingIndex(0) const phase1Start = phaseHeadingIndex(1) expect(phase0Start).toBeGreaterThan(-1) @@ -101,61 +110,110 @@ describe("ce-dispatch SKILL.md phases", () => { // Auto-detects latest plan when input is blank expect(phase0Region.toLowerCase()).toContain("latest") expect(phase0Region).toContain("docs/plans") + // Single-unit MVP-specific: Phase 0 confirms a worktree path (the chicken- + // and-egg fix -- user creates the workspace before the skill creates the + // issue) and uses the worktree dirname as the agent name. + expect(phase0Region.toLowerCase()).toContain("worktree") + expect(phase0Region.toLowerCase()).toContain("agent name") + }) + + test("Phase 0 documents only the three retained dispatch_* keys (single-unit MVP)", () => { + const phase0Start = phaseHeadingIndex(0) + const phase1Start = phaseHeadingIndex(1) + const phase0Region = SKILL_BODY.slice(phase0Start, phase1Start) + // Retained keys + expect(phase0Region).toContain("dispatch_branch_prefix") + expect(phase0Region).toContain("dispatch_base_branch") + expect(phase0Region).toContain("dispatch_labels") + // Removed keys must NOT be documented as live config (the SKILL.md may + // mention them by name in the "Removed in this MVP" callout, but they + // must not appear in the live config table). Anchor on the table row + // shape `| ` so a callout that references the removed name in + // prose doesn't satisfy the assertion. 
+ expect(phase0Region).not.toMatch(/^\|\s*`?dispatch_mode`?\s*\|/m) + expect(phase0Region).not.toMatch(/^\|\s*`?dispatch_auto_review`?\s*\|/m) }) - test("Phase 1 includes Parallel Safety Check (file-to-unit mapping, overlap detection)", () => { + test("Phase 1 picks ONE implementation unit (no dependency graph in MVP)", () => { const phase1Start = phaseHeadingIndex(1) const phase2Start = phaseHeadingIndex(2) expect(phase2Start).toBeGreaterThan(phase1Start) const phase1Region = SKILL_BODY.slice(phase1Start, phase2Start) - expect(phase1Region).toContain("Parallel Safety Check") - expect(phase1Region).toContain("file-to-unit") - expect(phase1Region.toLowerCase()).toContain("overlap") - // Dependency graph + cycle detection - expect(phase1Region.toLowerCase()).toContain("dependency") - expect(phase1Region.toLowerCase()).toContain("cycle") + // Reads the plan and parses implementation units + expect(phase1Region.toLowerCase()).toContain("implementation unit") + // Single-unit MVP: explicitly picks one + expect(phase1Region.toLowerCase()).toMatch(/(pick|select)[^.]*one/) + // The MVP intentionally drops parallel-safety / dependency-graph behavior + // -- those will return when multi-unit dispatch returns. Guard against + // accidental re-introduction in the wrong phase. + expect(phase1Region).not.toContain("Parallel Safety Check") }) - test("Phase 2 generates dispatch prompts using the template", () => { + test("Phase 2 generates the dispatch prompt using the template", () => { const phase2Start = phaseHeadingIndex(2) const phase3Start = phaseHeadingIndex(3) const phase2Region = SKILL_BODY.slice(phase2Start, phase3Start) expect(phase2Region).toContain("references/dispatch-prompt-template.md") + // The single-unit MVP requires the new , , + // and sections to be populated alongside the legacy + // sections. Phase 2 is where they get filled in. 
+ expect(phase2Region).toContain("") + expect(phase2Region).toContain("") + expect(phase2Region).toContain("") }) - test("Phase 3 creates issues via gh and only dispatches root-or-unblocked units", () => { + test("Phase 3 creates exactly one issue via gh", () => { const phase3Start = phaseHeadingIndex(3) const phase4Start = phaseHeadingIndex(4) const phase3Region = SKILL_BODY.slice(phase3Start, phase4Start) expect(phase3Region).toContain("gh issue create") expect(phase3Region).toContain("[CE-Dispatch]") expect(phase3Region.toLowerCase()).toContain("label") - // Only dispatches units whose dependencies are merged or have none - expect(phase3Region.toLowerCase()).toMatch(/dependenc(y|ies)/) + // Single-unit MVP: tells the user to open the Conductor workspace and + // point the agent at the new issue. The handoff instruction is what + // closes the chicken-and-egg loop. + expect(phase3Region.toLowerCase()).toContain("conductor") }) - test("Phase 4 monitor loop has six options, including dependency-aware merge", () => { + test("Phase 4 respond loop has exactly four options (single-unit MVP)", () => { const phase4Start = phaseHeadingIndex(4) const phase4Region = SKILL_BODY.slice(phase4Start) - // The six menu options - expect(phase4Region).toContain("Check PR status") - expect(phase4Region).toContain("Review a PR") - expect(phase4Region).toContain("Merge a PR") - expect(phase4Region).toContain("Dispatch newly unblocked units") - expect(phase4Region).toContain("Show dependency graph") + // The four MVP menu options + expect(phase4Region).toContain("Reply to agent comment") + expect(phase4Region).toContain("Review the PR") + expect(phase4Region).toContain("Mark unit complete") expect(phase4Region).toContain("Done for now") - // Six options exceed 4-option cap -> numbered list in chat - expect(phase4Region.toLowerCase()).toContain("numbered list") - // Dependency-ordered merge gating - expect(phase4Region.toLowerCase()).toContain("dependency") - 
expect(phase4Region.toLowerCase()).toContain("merge") - // Conflict guidance - expect(phase4Region.toLowerCase()).toContain("rebase") + // Four options fits the 4-option blocking-tool cap, so the SKILL.md + // should explicitly say to use the blocking question tool (not a + // numbered list in chat). + expect(phase4Region.toLowerCase()).toContain("blocking question tool") + // The MVP intentionally drops the dependency-aware merge gate, the + // dependency-graph rendering, and the auto-review path. Guard against + // accidental re-introduction. + expect(phase4Region).not.toContain("Show dependency graph") + expect(phase4Region).not.toContain("Dispatch newly unblocked units") + }) + + test("Phase 4 routing references ce-resolve-pr-feedback for PR feedback round-trip", () => { + const phase4Start = phaseHeadingIndex(4) + const phase4Region = SKILL_BODY.slice(phase4Start) + // The dispatch-side respond loop hands PR-review feedback back to the + // in-workspace agent via ce-resolve-pr-feedback. Without this routing + // the user has to remember the chain manually -- defeats the point of + // composing existing CE skills. + expect(phase4Region).toContain("ce-resolve-pr-feedback") + expect(phase4Region).toContain("ce-code-review") }) }) describe("dispatch-prompt-template required XML sections", () => { + // The single-unit MVP expands the section list with three new ones + // (, , ) that capture the + // sync-comms protocol. Existing sections stay so the dispatched agent's + // mental model of the prompt shape is unchanged. 
const requiredSections = [ + "", + "", "", "", "", @@ -165,6 +223,7 @@ describe("dispatch-prompt-template required XML sections", () => { "", "", "", + "", "", ] @@ -174,13 +233,154 @@ describe("dispatch-prompt-template required XML sections", () => { }) } - test("template metadata footer is an HTML comment with required keys", () => { + test("template metadata footer is an HTML comment with single-unit keys", () => { expect(TEMPLATE_BODY).toContain("ce-dispatch-metadata") expect(TEMPLATE_BODY).toContain("plan:") - expect(TEMPLATE_BODY).toContain("unit_ids:") - expect(TEMPLATE_BODY).toContain("dependencies:") + // Single-unit MVP: unit_id (singular), not unit_ids (multi). + expect(TEMPLATE_BODY).toContain("unit_id:") + expect(TEMPLATE_BODY).not.toMatch(/^unit_ids:/m) + // New single-unit-specific metadata: agent_name + worktree_path + expect(TEMPLATE_BODY).toContain("agent_name:") + expect(TEMPLATE_BODY).toContain("worktree_path:") + // Existing keys expect(TEMPLATE_BODY).toContain("expected_branch:") expect(TEMPLATE_BODY).toContain("base_branch:") + expect(TEMPLATE_BODY).toContain("labels:") + expect(TEMPLATE_BODY).toContain("dispatched_at:") + }) + + test("template metadata footer drops dependencies (single-unit MVP)", () => { + // The MVP does not gate on a dependency graph; carrying `dependencies:` + // in the metadata implies behavior the skill no longer enforces. Guard + // against accidental re-introduction. + // Anchor on `:` shape so a prose mention of the word "dependencies" + // doesn't satisfy the assertion. The metadata footer has the key on its + // own line. 
+ expect(TEMPLATE_BODY).not.toMatch(/^dependencies:/m) + }) +}) + +describe("dispatch-prompt-template nine-step compound-engineering loop", () => { + function extractSection(body: string, tag: string): string { + const open = `<${tag}>` + const close = `</${tag}>` + const start = body.indexOf(open) + const end = body.indexOf(close, start) + expect(start).toBeGreaterThan(-1) + expect(end).toBeGreaterThan(start) + return body.slice(start + open.length, end) + } + + test("ce-plugin block invokes the four CE skills the agent runs end-to-end", () => { + const cePluginSection = extractSection(TEMPLATE_BODY, "ce-plugin") + // The four CE skills the dispatched agent invokes during its loop. + // Without these explicit references the agent might paraphrase the steps + // and drift from the compound-engineering sequence. + expect(cePluginSection).toContain("/ce-work") + expect(cePluginSection).toContain("/ce-code-review") + expect(cePluginSection).toContain("/ce-compound") + expect(cePluginSection).toContain("/ce-commit-push-pr") + // PR-feedback round-trip is via ce-resolve-pr-feedback after the + // orchestrator pings. + expect(cePluginSection).toContain("/ce-resolve-pr-feedback") + }) + + test("ce-plugin block is a numbered nine-step sequence", () => { + const cePluginSection = extractSection(TEMPLATE_BODY, "ce-plugin") + // Steps 1-9 must each appear as `N.` at the start of a line. This is + // the prescriptive form -- an unnumbered list of suggestions is what + // we used to have, and it caused dispatched agents to skip steps. + for (const n of [1, 2, 3, 4, 5, 6, 7, 8, 9]) { + const stepPattern = new RegExp(`^${n}\\.\\s+\\*\\*`, "m") + expect(stepPattern.test(cePluginSection)).toBe(true) + } + }) + + test("ce-plugin block tells the agent to STOP and wait for orchestrator ping", () => { + const cePluginSection = extractSection(TEMPLATE_BODY, "ce-plugin") + // After PR open, the agent must STOP.
Without this directive the agent + // tends to start the next plausible task; the sync MVP requires it to + // wait for the orchestrator's review. + expect(cePluginSection.toLowerCase()).toContain("stop") + expect(cePluginSection.toLowerCase()).toContain("wait for orchestrator") + expect(cePluginSection.toLowerCase()).toContain("do not poll") + }) +}) + +describe("dispatch-prompt-template <orientation> + <agent-identity>", () => { + function extractSection(body: string, tag: string): string { + const open = `<${tag}>` + const close = `</${tag}>` + const start = body.indexOf(open) + const end = body.indexOf(close, start) + expect(start).toBeGreaterThan(-1) + expect(end).toBeGreaterThan(start) + return body.slice(start + open.length, end) + } + + test("<orientation> lists paths (progressive context exposure), does not inline content", () => { + const orientation = extractSection(TEMPLATE_BODY, "orientation") + // The block must direct the agent to `Read` paths -- not embed them. + // This is the progressive-context-exposure principle: list paths, let + // the agent read what it needs. + expect(orientation.toLowerCase()).toContain("read") + expect(orientation.toLowerCase()).toContain("repo-relative") + // Recommended set must mention the canonical orientation files so a + // sloppy renderer can't ship an empty orientation block. + expect(orientation).toContain("README") + expect(orientation).toContain("AGENTS.md") + expect(orientation.toLowerCase()).toContain("plan") + }) + + test("<agent-identity> carries agent-name and worktree-path", () => { + const identity = extractSection(TEMPLATE_BODY, "agent-identity") + // Both pieces ride here so the comment-protocol prefix + // ([ -> orchestrator]) and the orchestrator's "open the + // worktree at " hint both have a single source of truth.
+ expect(identity).toContain("agent-name") + expect(identity).toContain("worktree-path") + }) +}) + +describe("dispatch-prompt-template <comment-protocol>", () => { + function extractSection(body: string, tag: string): string { + const open = `<${tag}>` + const close = `</${tag}>` + const start = body.indexOf(open) + const end = body.indexOf(close, start) + expect(start).toBeGreaterThan(-1) + expect(end).toBeGreaterThan(start) + return body.slice(start + open.length, end) + } + + test("comment-protocol restricts comments to clarifications", () => { + const protocol = extractSection(TEMPLATE_BODY, "comment-protocol") + // The MVP separates two surfaces: comments are for blocking + // clarifications; the PR description is for routine progress. Without + // this guardrail agents tend to chat-log progress on the issue, which + // floods the orchestrator's review surface. + expect(protocol.toLowerCase()).toContain("clarification") + }) + + test("comment-protocol prescribes the timestamped -> orchestrator format", () => { + const protocol = extractSection(TEMPLATE_BODY, "comment-protocol") + // The prefix shape is what the SKILL.md Phase 4 routing parses to find + // the latest agent comment. Drift in the prefix shape breaks the + // round-trip silently. + expect(protocol).toContain("[ -> orchestrator]") + expect(protocol).toContain("[orchestrator -> ]") + expect(protocol.toLowerCase()).toContain("iso 8601") + }) + + test("comment-protocol tells agent to STOP after asking, not proceed", () => { + const protocol = extractSection(TEMPLATE_BODY, "comment-protocol") + // The whole point of the comment protocol is: ask, then stop. If the + // agent proceeds anyway it will make architectural decisions without + // the orchestrator's full context (the airgap concern). Guard the + // STOP-after-ask directive.
+ expect(protocol.toLowerCase()).toContain("stop") + expect(protocol.toLowerCase()).toContain("wait for") + expect(protocol.toLowerCase()).toContain("do not proceed") }) }) @@ -250,31 +450,43 @@ describe("dispatch-prompt-template output contract (PR description, not JSON fil }) }) -describe("config templates carry dispatch_* keys", () => { - const dispatchKeys = [ - "dispatch_mode", +describe("config templates carry the retained dispatch_* keys (single-unit MVP)", () => { + // The MVP keeps three keys and drops two. Both config files MUST mirror + // each other -- the contract test below checks both surfaces explicitly + // because past drift between them caused silent default mismatches. + const retainedKeys = [ "dispatch_branch_prefix", "dispatch_base_branch", "dispatch_labels", - "dispatch_auto_review", ] + const droppedKeys = ["dispatch_mode", "dispatch_auto_review"] - for (const key of dispatchKeys) { - test(`ce-setup config-template.yaml documents ${key}`, () => { + for (const key of retainedKeys) { + test(`ce-setup config-template.yaml documents retained key ${key}`, () => { expect(SETUP_CONFIG_BODY).toContain(key) }) - test(`root config.local.example.yaml documents ${key}`, () => { + test(`root config.local.example.yaml documents retained key ${key}`, () => { expect(ROOT_CONFIG_BODY).toContain(key) }) } + + for (const key of droppedKeys) { + test(`ce-setup config-template.yaml drops removed key ${key}`, () => { + expect(SETUP_CONFIG_BODY).not.toContain(key) + }) + + test(`root config.local.example.yaml drops removed key ${key}`, () => { + expect(ROOT_CONFIG_BODY).not.toContain(key) + }) + } }) describe("ce-plan post-generation menu surfaces dispatch as a fifth option", () => { test("plan-handoff.md lists 'Dispatch to external agents' as option 4 in the menu", () => { - // The numbered menu now has 5 options; "Dispatch" sits at position 4 - // (between Proof and Done for now). 
The exact position is asserted to - // catch accidental reordering that would break user expectations. + // The numbered menu still has 5 options; "Dispatch" still sits at + // position 4 (between Proof and Done for now). Single-unit MVP changes + // the action description in the routing line, not the menu label. expect(PLAN_HANDOFF_BODY).toMatch( /4\.\s+\*\*Dispatch to external agents\*\*/, ) @@ -312,6 +524,26 @@ describe("ce-plan post-generation menu surfaces dispatch as a fifth option", () expect(bulletText.toLowerCase()).toContain("skill-invocation primitive") expect(bulletText.toLowerCase()).toContain("plan path") }) + + test("ce-plan routing wording reflects single-unit MVP shape (no fan-out language)", () => { + // The MVP dispatches one unit per invocation. Routing copy that talks + // about "fanning out" or "parallel execution" misleads users who skim + // the routing line and expect multi-issue creation. Both surfaces + // (SKILL.md and plan-handoff.md) must converge. + const phaseStart = PLAN_SKILL_BODY.indexOf("##### 5.3.8") + const phaseRegion = PLAN_SKILL_BODY.slice(phaseStart) + const dispatchBullet = phaseRegion.match( + /-\s+\*\*Dispatch to external agents\*\*[^\n]+/, + ) + expect(dispatchBullet).not.toBeNull() + expect(dispatchBullet![0].toLowerCase()).not.toMatch(/fan out|fan-out|parallel execution/) + + const handoffLine = PLAN_HANDOFF_BODY.match( + /\*\*Dispatch to external agents\*\* ->[^\n]+/, + ) + expect(handoffLine).not.toBeNull() + expect(handoffLine![0].toLowerCase()).not.toMatch(/fan out|fan-out|parallel execution/) + }) }) describe("ce-dispatch SKILL.md regression guards (Codex-flagged bugs)", () => { @@ -319,21 +551,20 @@ describe("ce-dispatch SKILL.md regression guards (Codex-flagged bugs)", () => { // bot on EveryInc#762. Without these, the original `gh pr list` and // `git symbolic-ref` invocations silently return the wrong data. 
- test("Phase 4 status refresh queries merged PRs, not just open ones", () => { + test("Phase 4 PR-discovery query uses --state all so merged PRs are visible", () => { // `gh pr list` defaults to open PRs only (CLI manual: "only lists open PRs" - // by default). Dispatched PRs merged outside this orchestrator (GitHub UI, - // Conductor, another shell) must still be discovered, otherwise the - // dependency graph never advances and `Dispatch newly unblocked units` - // can stay stuck even after prerequisites are merged. Required: --state all - // (or --state merged on a separate pass). + // by default). A PR merged outside this orchestrator (GitHub UI, + // Conductor, another shell) must still be discoverable when the user + // re-enters the loop later, otherwise `Mark unit complete` can't find + // the merged PR. Required: --state all (or --state merged on a separate + // pass). const phase4Start = SKILL_BODY.indexOf("### Phase 4:") expect(phase4Start).toBeGreaterThan(-1) const phase4Region = SKILL_BODY.slice(phase4Start) // Match `gh pr list` invocations (those that include flags/arguments, // identified by the `--search` flag we always pass) and require a state // flag on each. A bare prose mention of `gh pr list` without arguments - // is not an invocation and is exempt. Allow `--state all` or - // `--state merged`. + // is not an invocation and is exempt. const ghPrListInvocations = phase4Region.match(/gh pr list[^\n`]*--search[^\n`]*/g) ?? [] expect(ghPrListInvocations.length).toBeGreaterThan(0)