Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
38 changes: 30 additions & 8 deletions plugins/compound-engineering/skills/ce-code-review/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -292,15 +292,26 @@ If the output is non-empty, inform the user: "You have uncommitted changes on th
git checkout <branch>
```

Then detect the review base branch and compute the merge-base. Run the `scripts/resolve-base.sh` script, which handles fork-safe remote resolution with multi-fallback detection (PR metadata -> `origin/HEAD` -> `gh repo view` -> common branch names):

```
RESOLVE_OUT=$(bash scripts/resolve-base.sh) || { echo "ERROR: resolve-base.sh failed"; exit 1; }
Then detect the review base branch and compute the merge-base. Run the bundled `resolve-base.sh` script, which handles fork-safe remote resolution with multi-fallback detection (PR metadata -> `origin/HEAD` -> `gh repo view` -> common branch names). Resolve the script via `${CLAUDE_SKILL_DIR}` (never via the reviewed repo's working directory), so a repo-controlled `scripts/resolve-base.sh` cannot be used for arbitrary code execution at review time:

```
RESOLVE_SCRIPT=""
if [ -n "${CLAUDE_SKILL_DIR:-}" ] && [ -f "${CLAUDE_SKILL_DIR}/scripts/resolve-base.sh" ]; then
real_base=$(cd "${CLAUDE_SKILL_DIR}" 2>/dev/null && pwd -P) || real_base=""
repo_top=$(git rev-parse --show-toplevel 2>/dev/null) || repo_top=""
if [ -n "$repo_top" ]; then real_repo=$(cd "$repo_top" 2>/dev/null && pwd -P) || real_repo=""; else real_repo=$(pwd -P); fi
case "$real_base" in
""|"$real_repo"|"$real_repo"/*) ;;
*) RESOLVE_SCRIPT="$real_base/scripts/resolve-base.sh" ;;
esac
fi
if [ -z "$RESOLVE_SCRIPT" ]; then printf '%s\n' 'ERROR: resolve-base.sh is unavailable because ${CLAUDE_SKILL_DIR} is not exposed or resolves inside the reviewed repo. Re-run with `base:<ref>` or use a harness that exposes the skill directory.'; exit 1; fi
RESOLVE_OUT=$(bash "$RESOLVE_SCRIPT") || { echo "ERROR: resolve-base.sh failed"; exit 1; }
if [ -z "$RESOLVE_OUT" ] || echo "$RESOLVE_OUT" | grep -q '^ERROR:'; then echo "${RESOLVE_OUT:-ERROR: resolve-base.sh produced no output}"; exit 1; fi
BASE=$(echo "$RESOLVE_OUT" | sed 's/^BASE://')
```

If the script outputs an error, stop instead of falling back to `git diff HEAD`; a branch review without the base branch would only show uncommitted changes and silently miss all committed work.
If `${CLAUDE_SKILL_DIR}` is unavailable or resolves inside the reviewed repo, the helper script is missing, or the script outputs an error, stop instead of falling back to `git diff HEAD`; a branch review without the base branch would only show uncommitted changes and silently miss all committed work.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this will work reliably. other harnesses will see the literal ${CLAUDE_SKILL_DIR} variable which will have inconsistent results.


On success, produce the diff:

Expand All @@ -312,15 +323,26 @@ You may still fetch additional PR metadata with `gh pr view` for title, body, li

**If no argument (standalone on current branch):**

Detect the review base branch and compute the merge-base using the same `scripts/resolve-base.sh` script as branch mode:
Detect the review base branch and compute the merge-base using the same bundled `resolve-base.sh` script as branch mode. Resolve the script via `${CLAUDE_SKILL_DIR}` (never via the reviewed repo's working directory):

```
RESOLVE_OUT=$(bash scripts/resolve-base.sh) || { echo "ERROR: resolve-base.sh failed"; exit 1; }
RESOLVE_SCRIPT=""
if [ -n "${CLAUDE_SKILL_DIR:-}" ] && [ -f "${CLAUDE_SKILL_DIR}/scripts/resolve-base.sh" ]; then
real_base=$(cd "${CLAUDE_SKILL_DIR}" 2>/dev/null && pwd -P) || real_base=""
repo_top=$(git rev-parse --show-toplevel 2>/dev/null) || repo_top=""
if [ -n "$repo_top" ]; then real_repo=$(cd "$repo_top" 2>/dev/null && pwd -P) || real_repo=""; else real_repo=$(pwd -P); fi
case "$real_base" in
""|"$real_repo"|"$real_repo"/*) ;;
*) RESOLVE_SCRIPT="$real_base/scripts/resolve-base.sh" ;;
esac
fi
if [ -z "$RESOLVE_SCRIPT" ]; then printf '%s\n' 'ERROR: resolve-base.sh is unavailable because ${CLAUDE_SKILL_DIR} is not exposed or resolves inside the reviewed repo. Re-run with `base:<ref>` or use a harness that exposes the skill directory.'; exit 1; fi
RESOLVE_OUT=$(bash "$RESOLVE_SCRIPT") || { echo "ERROR: resolve-base.sh failed"; exit 1; }
if [ -z "$RESOLVE_OUT" ] || echo "$RESOLVE_OUT" | grep -q '^ERROR:'; then echo "${RESOLVE_OUT:-ERROR: resolve-base.sh produced no output}"; exit 1; fi
BASE=$(echo "$RESOLVE_OUT" | sed 's/^BASE://')
```

If the script outputs an error, stop instead of falling back to `git diff HEAD`; a standalone review without the base branch would only show uncommitted changes and silently miss all committed work on the branch.
If `${CLAUDE_SKILL_DIR}` is unavailable or resolves inside the reviewed repo, the helper script is missing, or the script outputs an error, stop instead of falling back to `git diff HEAD`; a standalone review without the base branch would only show uncommitted changes and silently miss all committed work on the branch.

On success, produce the diff:

Expand Down
152 changes: 146 additions & 6 deletions tests/review-skill-contract.test.ts
Original file line number Diff line number Diff line change
@@ -1,12 +1,56 @@
import { readFile } from "fs/promises"
import path from "path"
import { execSync } from "node:child_process"
import fs from "node:fs"
import os from "node:os"
import path from "node:path"
import { readFile } from "node:fs/promises"
import { describe, expect, test } from "bun:test"
import { parseFrontmatter } from "../src/utils/frontmatter"

async function readRepoFile(relativePath: string): Promise<string> {
return readFile(path.join(process.cwd(), relativePath), "utf8")
}

function shellQuote(value: string): string {
return `'${value.replaceAll("'", "'\\''")}'`
}

function runResolveProbe(snippet: string, cwd: string, env: NodeJS.ProcessEnv) {
try {
return {
status: 0,
output: execSync(`bash -c ${shellQuote(snippet)}`, {
cwd,
env: {
PATH: process.env.PATH,
...env,
},
encoding: "utf8",
stdio: ["ignore", "pipe", "pipe"],
}),
}
} catch (error) {
const failed = error as { status?: number; stdout?: Buffer | string; stderr?: Buffer | string }
return {
status: failed.status ?? 1,
output: `${failed.stdout?.toString() ?? ""}${failed.stderr?.toString() ?? ""}`,
}
}
}

function extractResolveProbe(content: string): string {
const blocks = content.match(
/RESOLVE_SCRIPT=""\n[\s\S]*?BASE=\$\(echo "\$RESOLVE_OUT" \| sed 's\/\^BASE:\/\/'\)/g,
)

expect(blocks).toHaveLength(2)
expect(blocks?.[0]).toBe(blocks?.[1])

return blocks![0].replace(
`BASE=$(echo "$RESOLVE_OUT" | sed 's/^BASE://')`,
`echo "RESOLVED:$RESOLVE_SCRIPT"`,
)
}

describe("ce-code-review contract", () => {
test("documents explicit modes and orchestration boundaries", async () => {
const content = await readRepoFile("plugins/compound-engineering/skills/ce-code-review/SKILL.md")
Expand Down Expand Up @@ -631,20 +675,116 @@ describe("ce-code-review contract", () => {
// PR mode still has an inline error for unresolved base
expect(content).toContain('echo "ERROR: Unable to resolve PR base branch')

// Branch and standalone modes delegate to resolve-base.sh and check its ERROR: output.
// The script itself emits ERROR: when the base is unresolved.
expect(content).toContain("scripts/resolve-base.sh")
// Branch and standalone modes delegate to resolve-base.sh from the harness-exposed
// skill directory and check its ERROR: output. The script itself emits ERROR: when
// the base is unresolved.
expect(content).toContain('"${CLAUDE_SKILL_DIR:-}"')
expect(content).toContain('[ -f "${CLAUDE_SKILL_DIR}/scripts/resolve-base.sh" ]')
// Reject helper paths against the reviewed-repo top-level, not just CWD --
// otherwise a launch from a monorepo subdirectory leaves repo-controlled
// sibling paths (e.g., `<repo-root>/.evil`) accepted.
expect(content).toContain("git rev-parse --show-toplevel")
// No bare relative-path fallback to `scripts/resolve-base.sh` -- that would resolve
// against the reviewed repo's CWD and could execute a repo-controlled script.
expect(content).not.toContain('RESOLVE_SCRIPT="scripts/resolve-base.sh"')
expect(content).not.toMatch(/bash\s+scripts\/resolve-base\.sh/)
// No CLAUDE_PLUGIN_ROOT fallback -- AGENTS.md documents CLAUDE_SKILL_DIR as the
// canonical primitive for skill-bundled scripts; adding a second variable is bloat.
expect(content).not.toContain("CLAUDE_PLUGIN_ROOT")
expect(content).toContain(
"Re-run with `base:<ref>` or use a harness that exposes the skill directory.",
)
const resolveScript = await readRepoFile(
"plugins/compound-engineering/skills/ce-code-review/scripts/resolve-base.sh",
)
expect(resolveScript).toContain("ERROR:")

// Branch and standalone modes must stop on script error, not fall back
expect(content).toContain(
"If the script outputs an error, stop instead of falling back to `git diff HEAD`",
"is unavailable or resolves inside the reviewed repo, the helper script is missing, or the script outputs an error, stop instead of falling back to `git diff HEAD`",
)
})

test("resolve-base probe resolves CLAUDE_SKILL_DIR and rejects repo-controlled paths", async () => {
const content = await readRepoFile("plugins/compound-engineering/skills/ce-code-review/SKILL.md")
const snippet = extractResolveProbe(content)
const tempDir = fs.mkdtempSync(path.join(os.tmpdir(), "ce-code-review-contract-"))

try {
const skillDir = path.join(tempDir, "skill-dir")
const evilCwd = path.join(tempDir, "evil-cwd")
const evilSkillDir = path.join(evilCwd, ".evil")

for (const dir of [
path.join(skillDir, "scripts"),
path.join(evilCwd, "scripts"),
path.join(evilSkillDir, "scripts"),
]) {
fs.mkdirSync(dir, { recursive: true })
}

fs.writeFileSync(path.join(skillDir, "scripts", "resolve-base.sh"), "echo BASE:from-skill-dir\n")
fs.writeFileSync(path.join(evilCwd, "scripts", "resolve-base.sh"), "echo BASE:from-cwd-evil\n")
fs.writeFileSync(path.join(evilSkillDir, "scripts", "resolve-base.sh"), "echo BASE:from-cwd-evil\n")

const realSkillDir = fs.realpathSync(skillDir)
const realEvilSkillDir = fs.realpathSync(evilSkillDir)
const realEvilCwd = fs.realpathSync(evilCwd)
const skillResolveScript = path.join(realSkillDir, "scripts", "resolve-base.sh")

// Unset: must fail closed, never fall through to the reviewed-repo decoy
const unset = runResolveProbe(snippet, realEvilCwd, {})
expect(unset.status).not.toBe(0)
expect(unset.output).toContain("Re-run with `base:<ref>`")
expect(unset.output).not.toContain("from-cwd-evil")

// Empty: same fail-closed behavior
const empty = runResolveProbe(snippet, realEvilCwd, { CLAUDE_SKILL_DIR: "" })
expect(empty.status).not.toBe(0)
expect(empty.output).toContain("Re-run with `base:<ref>`")

// Happy path: resolves to the bundled helper, even with a decoy in the CWD
expect(
runResolveProbe(snippet, realEvilCwd, { CLAUDE_SKILL_DIR: realSkillDir }).output,
).toContain(`RESOLVED:${skillResolveScript}`)

// Hostile harness: CLAUDE_SKILL_DIR points inside the reviewed-repo CWD --
// must be rejected and fail closed, NOT execute the decoy.
const evil = runResolveProbe(snippet, realEvilCwd, { CLAUDE_SKILL_DIR: realEvilSkillDir })
expect(evil.status).not.toBe(0)
expect(evil.output).toContain("Re-run with `base:<ref>`")
expect(evil.output).not.toContain("from-cwd-evil")

// Monorepo subdir attack: reviewer launched from a subdirectory of the reviewed
// repo. CLAUDE_SKILL_DIR points at a sibling of the subdir (e.g., `<repo-root>/.evil`)
// -- it lives outside the CWD subdir but is still repo-controlled. The guard must
// reject it by comparing against the repo top-level, not just the CWD.
const monorepoRoot = fs.mkdtempSync(path.join(os.tmpdir(), "ce-code-review-monorepo-"))
try {
const subDir = path.join(monorepoRoot, "services", "foo")
const sneakSkill = path.join(monorepoRoot, ".evil")
fs.mkdirSync(subDir, { recursive: true })
fs.mkdirSync(path.join(sneakSkill, "scripts"), { recursive: true })
fs.writeFileSync(
path.join(sneakSkill, "scripts", "resolve-base.sh"),
"echo BASE:from-monorepo-evil\n",
)
execSync("git init -q", { cwd: monorepoRoot })

const realSubDir = fs.realpathSync(subDir)
const realSneakSkill = fs.realpathSync(sneakSkill)
const monorepo = runResolveProbe(snippet, realSubDir, { CLAUDE_SKILL_DIR: realSneakSkill })
expect(monorepo.status).not.toBe(0)
expect(monorepo.output).toContain("Re-run with `base:<ref>`")
expect(monorepo.output).not.toContain("from-monorepo-evil")
} finally {
fs.rmSync(monorepoRoot, { recursive: true, force: true })
}
} finally {
fs.rmSync(tempDir, { recursive: true, force: true })
}
})

test("orchestration callers pass explicit mode flags", async () => {
const lfg = await readRepoFile("plugins/compound-engineering/skills/lfg/SKILL.md")
expect(lfg).toMatch(/ce-code-review[^\n]*mode:autofix/)
Expand Down