Skip to content

Binary cache improvements#523

Open
edolstra wants to merge 13 commits into
mainfrom
eelcodolstra/nix-421
Open

Binary cache improvements#523
edolstra wants to merge 13 commits into
mainfrom
eelcodolstra/nix-421

Conversation

@edolstra

@edolstra edolstra commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator

Motivation

This makes two improvements to the binary cache protocol:

  • It adds a field PartialClosure to .narinfo files, which lets a store path declare not just its direct references but also some or all of its indirect references. This lets queryMissing() traverse the closure graph faster, especially for high-latency binary caches.
  • It adds an endpoint /get-narinfos-v1 that lets a binary cache query many store paths in one HTTP call. This is intended for dynamic binary caches (like FlakeHub/Attic) where there may be some non-trivial overhead to each HTTP call (as opposed to HTTP/2 calls to a static CDN like cache.nixos.org).

Some timings (against a nix-serve on localhost with a 100ms latency), showing wall clock time and number of HTTP requests:

old PartialClosure PartialClosure + get-narinfos-v1
present firefox, nix-serve 100ms 1.80 (370) 1.64 (370) 1.25 (3)
mostly missing firefox, nix-serve 100ms 6.27 (3592) 4.48 (3636) 3.13 (22)

Context


Add 👍 to pull requests you find important.

The Nix maintainer team uses a GitHub project board to schedule and track reviews.

Summary by CodeRabbit

  • New Features

    • Added support for batch fetching multiple cache info records in a single request, improving cache lookups.
    • Cache metadata now preserves unrecognized fields for better forward compatibility.
  • Bug Fixes

    • Improved handling of cache metadata and disk cache storage.
    • Fixed progress/log messages to use custom activity text when provided.
    • Enhanced substitutable path lookup to better handle partial reference information and missing entries.

@coderabbitai

coderabbitai Bot commented Jun 26, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

The PR documents and persists GetNarInfosV1 cache-info fields, adds partialClosure to narinfo data and serving, introduces batch queryPathInfos support and /get-narinfos-v1, rewrites queryMissing to use batched substituter queries, and adds optional transfer activity text.

Changes

Batch narinfo and cache-info flow

Layer / File(s) Summary
Cache-info contract and persistence
doc/manual/source/protocols/nix-cache-info.md, src/libstore/include/nix/store/binary-cache-store.hh, src/libstore/include/nix/store/nar-info-disk-cache.hh, src/libstore/nar-info-disk-cache.cc
GetNarInfosV1 is documented, cache-info metadata is represented as generic fields, and the SQLite disk cache stores those fields in the v2 layout.
Cache-info parsing and init
src/libstore/binary-cache-store.cc, src/libstore/http-binary-cache-store.cc, src/libstore-tests/nar-info-disk-cache.cc
The cache-info parser preserves unknown fields, applies known fields to the store, reuses parsed fields during HTTP cache init, and updates the cache tests to assert on the field map.
Partial closure data
src/libstore/include/nix/store/nar-info.hh, src/libstore/include/nix/store/path-info.hh, src/libstore/nar-info.cc, src/libstore/misc.cc, src/libstore/remote-store.cc
partialClosure is added to narinfo and substitutable path info, parsed and serialized by NarInfo, copied into substitutable path info results, and noted in remote-store handling.
Narinfo serving
src/nix/serve.cc
makeNarInfo computes partialClosure hints, /<hash>.narinfo tracks already-sent paths per request, /get-narinfos-v1 accepts POST bodies of hash parts, and microhttpd request bodies are accumulated across callbacks.
Batch path-info API
src/libstore/include/nix/store/store-api.hh, src/libstore/store-api.cc, src/libstore/include/nix/store/http-binary-cache-store.hh, src/libstore/http-binary-cache-store.cc
Store gains queryPathInfos, the default implementation fans out to queryPathInfo, and HttpBinaryCacheStore batches misses through /get-narinfos-v1 and updates local caches from the reply.
Missing-path traversal
src/libstore/misc.cc, tests/functional/binary-cache.sh
queryMissing is rewritten around a batched async worklist, substituter queries via queryPathInfos, and traversal from partialClosure, and the functional binary-cache test updates its log assertion.

Transfer activity text

Layer / File(s) Summary
Activity text fallback
src/libstore/include/nix/store/filetransfer.hh, src/libstore/filetransfer.cc
FileTransferRequest gains activityText, and TransferItem::act() uses it when present instead of the default verb-and-URI text.

Sequence Diagram(s)

sequenceDiagram
  participant Store
  participant HttpBinaryCacheStore
  participant CmdServe
  participant NarInfoDiskCacheImpl
  Store->>HttpBinaryCacheStore: queryPathInfos(paths, callback)
  HttpBinaryCacheStore->>CmdServe: POST /get-narinfos-v1 with hashPart() lines
  CmdServe-->>HttpBinaryCacheStore: concatenated narinfo blocks
  HttpBinaryCacheStore->>NarInfoDiskCacheImpl: upsert narinfo fields
  HttpBinaryCacheStore-->>Store: callback(results)
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • DeterminateSystems/nix-src#514: Rewrites Store::queryMissing around coroutine-based worklists and concurrent substituter querying in src/libstore/misc.cc.

Poem

I hopped through caches by moonbeam light,
And gathered narinfos in one big bite.
With partial trails and fields in tow,
The store found paths in a tidy flow.
Thump-thump, the bytes all nuzzle bright. 🐇

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 10.34% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title is related to the PR, but it is too generic to convey the main protocol changes or the new endpoint. Use a more specific title mentioning the narinfo PartialClosure field and/or the new /get-narinfos-v1 batching endpoint.
✅ Passed checks (3 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch eelcodolstra/nix-421

Comment @coderabbitai help to get the list of available commands.

@edolstra edolstra changed the title Eelcodolstra/nix 421 Binary cache improvements Jun 26, 2026
@github-actions

github-actions Bot commented Jun 26, 2026

Copy link
Copy Markdown

@github-actions github-actions Bot temporarily deployed to pull request June 26, 2026 15:02 Inactive

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🧹 Nitpick comments (1)
src/libstore/remote-store.cc (1)

229-229: 🗄️ Data Integrity & Integration | 🔵 Trivial

Track the remaining partialClosure protocol gap.

RemoteStore::querySubstitutablePathInfos() still drops SubstitutablePathInfo::partialClosure; if this API remains supported, this needs a worker-protocol extension or an explicit compatibility note.

I can help draft the protocol-compatible implementation or open a follow-up issue.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/libstore/remote-store.cc` at line 229,
RemoteStore::querySubstitutablePathInfos() still ignores
SubstitutablePathInfo::partialClosure, so update the remote-store worker
protocol or add an explicit compatibility note to account for it. Use the
existing querySubstitutablePathInfos() flow and the
SubstitutablePathInfo/partialClosure handling point to either transmit this
field end-to-end or clearly document that it is unsupported for now, removing
the FIXME once the chosen behavior is implemented.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@doc/manual/source/protocols/nix-cache-info.md`:
- Around line 64-68: Add a language tag to the example fence in the
nix-cache-info documentation so the Markdown lint rule is satisfied; update the
fenced block that contains the StoreDir/WantMassQuery/Priority/GetNarInfosV1
example to use a text code fence, keeping the example content unchanged.

In `@src/libstore/include/nix/store/nar-info-disk-cache.hh`:
- Around line 31-41: `CacheInfo` currently drops the validated `StoreDir`, which
prevents `HttpBinaryCacheStore::init()` from checking cached metadata against
the active store root before using `WantMassQuery` and `GetNarInfosV1`. Update
`CacheInfo` and the related cache-hit path so the parsed `nix-cache-info`
retains `storeDir` alongside `fields`, then have `HttpBinaryCacheStore::init()`
compare it with the current store directory and refetch or invalidate the cache
when they differ; keep `BinaryCacheStore::parseNixCacheInfo()` as the source of
validation.

In `@src/libstore/include/nix/store/nar-info.hh`:
- Around line 20-29: The new partialClosure field is only present in the narinfo
struct, but the JSON serialization path still drops it. Update
UnkeyedNarInfo::toJSON() and UnkeyedNarInfo::fromJSON() to include
partialClosure in the JSON round-trip, matching the existing handling for the
other narinfo fields so the hint survives serialization and deserialization.

In `@src/libstore/misc.cc`:
- Around line 302-306: The opaque path handling in the `DerivedPath::Opaque`
visitor and the related `pathsToQuery`/`unknown` logic should respect
`useSubstitutes` and the availability of `subs` before queueing anything. Update
the code in `maybeQueryPathInfo`/`pathsToQuery` processing so missing opaque
paths are only queued when substitutes are enabled and there is at least one
substituter to query, otherwise ensure they are treated as `unknown` immediately
and never fetched from substituters.
- Around line 319-323: The concurrent substituter query in
queryMissing()/queryPathInfos handling currently lets a single
sub->queryPathInfos() failure abort the whole lookup, which breaks the old
fallback behavior. Update the forEachAsync loop around the Store substituter
queries to handle failures per substituter: keep disabled substituters skipped
as before, catch/report errors from each sub->queryPathInfos() call, and
continue querying the remaining stores so tryFallback semantics are preserved.
Use the existing queryMissing, forEachAsync, and sub->queryPathInfos symbols to
locate the fix.
- Around line 328-338: The aggregation in misc.cc is double-counting substitute
results when multiple substituters return the same path. Update the loop over
infos in the substitution handling path so narSize and downloadSize are only
added once per unique path, matching the deduplication already done by
willSubstitute; use the existing path key in the infos iteration (and the
NarInfo cast) to track which paths have already been counted before adding
sizes.

In `@src/nix/main.cc`:
- Around line 495-496: The shutdown stats logging in Finally should not call
getFileTransfer(), because that helper can lazily create a new curlFileTransfer
during teardown and report the wrong stats. Update the Finally lambda near
printStats/debug to read the request count from the existing singleton instance
only, or guard against a missing instance without allocating a new one, using
getFileTransfer and curlFileTransfer as the relevant symbols to adjust.

In `@src/nix/serve.cc`:
- Around line 30-41: makeNarInfo currently lets computeFSClosure() failures
abort narinfo generation, turning an optional partialClosure hint into a hard
error. Update makeNarInfo in serve.cc so closure computation is best-effort:
catch or otherwise handle failures from store.computeFSClosure(info.path,
closure), leave ni.partialClosure empty when it cannot be computed, and still
return the rest of the NarInfo response for /foo.narinfo and /get-narinfos-v1.
- Around line 317-324: The POST handling in RequestState::body for the
/get-narinfos-v1 path currently buffers the entire upload without any limit, so
add an early size cap before appending upload_data. Update the request-handling
logic around the RequestState allocation and the upload_data_size check to
reject oversized bodies as soon as they exceed the allowed maximum, and return
an error response instead of continuing to buffer in memory.

---

Nitpick comments:
In `@src/libstore/remote-store.cc`:
- Line 229: RemoteStore::querySubstitutablePathInfos() still ignores
SubstitutablePathInfo::partialClosure, so update the remote-store worker
protocol or add an explicit compatibility note to account for it. Use the
existing querySubstitutablePathInfos() flow and the
SubstitutablePathInfo/partialClosure handling point to either transmit this
field end-to-end or clearly document that it is unsupported for now, removing
the FIXME once the chosen behavior is implemented.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a18fca0d-0e40-467a-84fd-9a40a835375f

📥 Commits

Reviewing files that changed from the base of the PR and between 923cd70 and fb87d94.

📒 Files selected for processing (20)
  • doc/manual/source/protocols/nix-cache-info.md
  • src/libstore-tests/nar-info-disk-cache.cc
  • src/libstore/binary-cache-store.cc
  • src/libstore/filetransfer.cc
  • src/libstore/http-binary-cache-store.cc
  • src/libstore/include/nix/store/binary-cache-store.hh
  • src/libstore/include/nix/store/filetransfer.hh
  • src/libstore/include/nix/store/http-binary-cache-store.hh
  • src/libstore/include/nix/store/nar-info-disk-cache.hh
  • src/libstore/include/nix/store/nar-info.hh
  • src/libstore/include/nix/store/path-info.hh
  • src/libstore/include/nix/store/store-api.hh
  • src/libstore/misc.cc
  • src/libstore/nar-info-disk-cache.cc
  • src/libstore/nar-info.cc
  • src/libstore/remote-store.cc
  • src/libstore/store-api.cc
  • src/nix/main.cc
  • src/nix/serve.cc
  • tests/functional/binary-cache.sh
💤 Files with no reviewable changes (1)
  • tests/functional/binary-cache.sh

Comment thread doc/manual/source/protocols/nix-cache-info.md Outdated
Comment thread src/libstore/include/nix/store/nar-info-disk-cache.hh
Comment thread src/libstore/include/nix/store/nar-info.hh
Comment thread src/libstore/misc.cc Outdated
Comment thread src/libstore/misc.cc
Comment thread src/libstore/misc.cc
Comment thread src/nix/main.cc Outdated
Comment thread src/nix/serve.cc Outdated
Comment thread src/nix/serve.cc
Comment thread doc/manual/source/protocols/nix-cache-info.md Outdated
@github-actions github-actions Bot temporarily deployed to pull request June 26, 2026 15:41 Inactive
edolstra and others added 10 commits June 26, 2026 17:52
This allows the client to start prefetching .narinfo files faster.
This allows servers to advertise optional endpoints.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…in one HTTP call

This adds an optional endpoint /get-narinfos-v1 to binary caches that
allows querying multiple narinfos in one HTTP call, which may reduce
latency and overhead for non-static caches like cache.flakehub.com.

To expose this, Store now has a virtual method queryPathInfos() that
HttpBinaryCacheStore overrides. queryMissing() has been rewritten to
try to batch paths together.

Note: queryMissing() now tries all substituters for a store path
concurrently rather than sequentially, unlike
querySubstitutablePathInfosAsync().

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Instead of a typed "Features" set, store all non-standard nix-cache-info
fields (everything other than StoreDir/WantMassQuery/Priority) verbatim in
a std::map<std::string, std::string> on CacheInfo and BinaryCacheStore.

This keeps the format forward-compatible: a newer server can advertise a
capability that an older client persists in its disk cache without
understanding it (e.g. a future BloomFilter field). It also makes the
endpoint for fetching multiple narinfos server-configurable: the server
advertises "GetNarInfosV1: /get-narinfos-v1", and the client POSTs to
whatever path that field specifies, falling back to per-path .narinfo
fetches when it is absent.

The disk cache stores the field map as JSON in the BinaryCaches table.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@edolstra edolstra force-pushed the eelcodolstra/nix-421 branch from 08da7ee to 9934bb0 Compare June 26, 2026 15:53
Comment thread src/libstore/nar-info.cc
@github-actions github-actions Bot temporarily deployed to pull request June 26, 2026 16:00 Inactive
@github-actions github-actions Bot temporarily deployed to pull request June 28, 2026 09:56 Inactive
The /get-narinfos-v1 endpoint now returns a stream of newline-delimited
JSON objects (one store path's metadata per line) rather than
concatenated .narinfo text. Since this is a new interface, only the V2
JSON format is supported.

The JSON serialisation of NarInfo gains a "partialClosure" field, and a
keyed NarInfo::toJSON()/fromJSON() that includes the store path (as a
"path" field) so each object is self-describing. The single
/<hash>.narinfo endpoint still serves the legacy text format.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@github-actions github-actions Bot temporarily deployed to pull request June 29, 2026 10:08 Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants