Content-address the Docker BuildKit cache #28482

Draft
9larsons wants to merge 1 commit into main from ci/content-addressed-docker-cache

@9larsons

Contributor

Prototype for PLA-80. Draft — needs a real-CI A/B before merge.

Problem

The core/full/e2e image builds read from and write to a single mutable registry cache ref, :cache-main, which every main push overwrites. In a sample of 20 recent main runs, the install layer misses ~70% of the time on commits with no dependency change (6–7s on a hit vs 76–139s on a miss), adding 60–140s to the critical path per miss.

The cache key is provably stable: pnpm deploy/pack.js output is byte-deterministic, and no cache-key inputs changed across either the hits or the misses. The misses alternate within the hour, a far finer grain than the daily cleanup-ghcr job. That points at the single mutable shared ref being unreliable under Ghost main's frequent, cancel-happy concurrency (the likely but unconfirmed cause is BuildKit mode=max re-export thinning the shared tag after a cache hit).

Change

Key the cache by a hash of the dependency inputs instead of one shared tag:

  • cache-from: :cache-deps-<hash> (content tag, tried first) → :cache-main (fallback)
  • cache-to (only when pushing to GHCR — fork/artifact path unchanged):
    • push to main: :cache-deps-<hash> + :cache-main (keeps the fallback warm for first-build-with-new-deps)
    • pull_request: :cache-pr-<n> only (PR builds don't pollute the shared content tag)

<hash> = hashFiles('pnpm-lock.yaml','Dockerfile.production') for core/full; the e2e build adds 'e2e/Dockerfile.e2e' to the same inputs. Cache strings are computed in the strategy steps and emitted as outputs rather than written as nested inline ternaries.
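To make the content-addressing idea concrete, here is a minimal local sketch of a dependency hash. The helper name `deps_hash` is hypothetical, and it assumes `sha256sum` is available. It mirrors the general shape of hashFiles (hash each file, then hash the digests) but will not reproduce the exact value GitHub Actions computes, so it is only useful for checking that unchanged inputs give an unchanged tag.

```shell
# deps_hash FILE...: print one hex digest covering the given files, in order.
# Hypothetical helper for local experiments; not the workflow's actual hashing.
deps_hash() {
  for f in "$@"; do
    # Per-file SHA-256, keeping only the hex digest column.
    sha256sum "$f" | cut -d' ' -f1
  done | sha256sum | cut -d' ' -f1   # hash the list of per-file digests
}
```

Running `deps_hash pnpm-lock.yaml Dockerfile.production` twice on an unchanged checkout should print the same digest both times, and editing either file should change it.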

Each dependency state gets its own immutable cache tag, so a dep-stable build reliably hits regardless of concurrent main pushes.
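The routing described above can be sketched as a small shell function of the kind a strategy step might run before writing its outputs. This is an illustration under stated assumptions, not the PR's actual step: the function name `cache_refs`, its argument order, and the image name in the usage note are all hypothetical, and the `push` event is assumed to mean a push to main. The `type=registry,ref=...,mode=max` strings follow BuildKit's registry cache syntax.

```shell
# cache_refs EVENT SHOULD_PUSH IMAGE DEPS_HASH [PR_NUMBER]
# Prints the cache-from entries, then the cache-to entries, one per line.
# Hypothetical helper; the body runs in a subshell so variables stay local.
cache_refs() (
  event=$1 should_push=$2 image=$3 hash=$4 pr=${5:-}

  echo "cache-from:"
  # Content tag first, shared mutable tag as the fallback.
  echo "type=registry,ref=${image}:cache-deps-${hash}"
  echo "type=registry,ref=${image}:cache-main"

  echo "cache-to:"
  # Only export cache when the build pushes to the registry.
  if [ "$should_push" = "true" ]; then
    case "$event" in
      push)  # assumed to be a push to main
        echo "type=registry,ref=${image}:cache-deps-${hash},mode=max"
        echo "type=registry,ref=${image}:cache-main,mode=max"
        ;;
      pull_request)
        echo "type=registry,ref=${image}:cache-pr-${pr},mode=max"
        ;;
    esac
  fi
)
```

For example, `cache_refs pull_request true ghcr.io/example/app abc123 42` lists both import refs but only the `cache-pr-42` export, so a PR build never writes the shared content tag. The sketch does not model the e2e-specific gating of cache-from on should-push.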

Expected impact

~70% misses → consistent hits → ~60–130s off the critical path on most builds (dep-stable is the common case).

Notes / things to verify on a real run

  • mode=max "thinning on re-export" is unproven. Writing both :cache-deps-<hash> and :cache-main on main is intentional, but whether the second export thins the first must be confirmed via BUILDKIT_PROGRESS=plain logs. Core/full build steps already set plain progress; consider adding it to the e2e build step for the first verification run.
  • No cleanup-ghcr.yml change. Its age cutoff already prunes old cache-deps-* tags. Edge case (follow-up): a dependency set stable longer than the cutoff would have its tag pruned and fall back to :cache-main — still correct, just a miss.
  • Validated: YAML parses; actionlint clean (only pre-existing style-level SC2086 notes consistent with the rest of the file).
  • Independent of #28471, "Build E2E image within build_artifacts to skip base-image re-pull" (which moves the e2e build); if that lands first, this will need a trivial rebase of the e2e cache block.
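For the first verification run, a rough way to classify the install layer as a hit or a miss from a saved BUILDKIT_PROGRESS=plain log is sketched below. It assumes the plain format prints a step header such as `#7 [deps 3/5] RUN pnpm install` followed by `#7 CACHED` when the step is served from cache; the function name `step_cache_status` and the step text used to find the install layer are hypothetical and would need adjusting to the real Dockerfile.

```shell
# step_cache_status PATTERN < build.log
# Prints "hit", "miss", or "not-found" for steps whose header contains PATTERN.
# Hypothetical helper; assumes plain-progress lines of the form "#N ...".
step_cache_status() {
  awk -v pat="$1" '
    # Remember the vertex id (e.g. "#7") of any step line containing the pattern.
    index($0, pat) && $1 ~ /^#[0-9]+$/ { ids[$1] = 1; seen = 1 }
    # A remembered vertex reported as "#7 CACHED" was served from cache.
    $2 == "CACHED" && ($1 in ids) { hit = 1 }
    END {
      if (!seen) print "not-found"
      else if (hit) print "hit"
      else print "miss"
    }
  '
}
```

Running it over the logs of consecutive dep-stable main builds, for example `step_cache_status 'pnpm install' < build.log`, gives a quick hit/miss series to compare before and after the change.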

…isses

The core/full/e2e image builds read/write a single mutable registry cache ref
(:cache-main), overwritten by every main push. Under heavy concurrent main
builds this misses ~70% of the time on no-dep-change runs (the cache key is
stable; the shared mutable ref is unreliable), adding 60-140s to the critical
path per miss.

Key the cache by a hash of the dependency inputs instead:
- cache-from: :cache-deps-<hash> (content tag) then :cache-main (fallback)
- cache-to (only when pushing to GHCR):
  - push to main: :cache-deps-<hash> + :cache-main (keep fallback warm)
  - pull_request: :cache-pr-<n> only (don't pollute the shared content tag)

Cache strings are computed in the strategy steps and emitted as outputs rather
than nested inline. e2e cache-from stays gated on should-push (its fork path
uses the plain docker driver, which can't import registry cache). No non-cache
behavior changes; fork/artifact path is unaffected.
@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Contributor

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1f46f721-0071-4b50-a14f-00b893fd732f

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@codecov

codecov Bot commented Jun 10, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.75%. Comparing base (c454a50) to head (3ff6f9b).
⚠️ Report is 19 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main   #28482   +/-   ##
=======================================
  Coverage   73.74%   73.75%           
=======================================
  Files        1541     1541           
  Lines      132184   132231   +47     
  Branches    15784    15793    +9     
=======================================
+ Hits        97484    97528   +44     
+ Misses      33734    33714   -20     
- Partials      966      989   +23     
Flag Coverage Δ
admin-tests 54.67% <ø> (ø)
e2e-tests 73.75% <ø> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.

@9larsons changed the title from "Content-address the Docker BuildKit cache (PLA-80)" to "Content-address the Docker BuildKit cache" on Jun 10, 2026