fix(sqlite): fence remote requests by generation #5014

Draft
NathanFlurry wants to merge 1 commit into counter-latency/core-metrics-endpoint from counter-latency/sqlite-generation-fence

Conversation

@NathanFlurry
Member

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@railway-app

railway-app Bot commented May 11, 2026

🚅 Deployed to the rivet-pr-5014 environment in rivet-frontend

| Service | Status | Web | Updated (UTC) |
| --- | --- | --- | --- |
| kitchen-sink | ❌ Build Failed (View Logs) | Web | May 19, 2026 at 7:04 pm |
| website | 😴 Sleeping (View Logs) | Web | May 14, 2026 at 8:04 am |
| frontend-cloud | ❌ Build Failed (View Logs) | Web | May 11, 2026 at 12:07 pm |
| frontend-inspector | ❌ Build Failed (View Logs) | Web | May 11, 2026 at 12:07 pm |
| ladle | ❌ Build Failed (View Logs) | Web | May 11, 2026 at 12:07 pm |
| mcp-hub | ✅ Success (View Logs) | Web | May 11, 2026 at 12:06 pm |

@claude
Contributor

claude Bot commented May 11, 2026

PR Review: fix(sqlite): fence remote requests by generation

Overview

This PR adds generation-based fencing to SQLite remote requests so that a stale actor instance cannot accidentally read or write another actor's database. The mechanism is a new GenerationFencedTransport wrapper that automatically injects expected_generation into every get_pages / commit request, and matching server-side routing that validates the generation rather than just checking actor existence.


What works well

  • Clean decorator pattern. GenerationFencedTransport wraps an existing transport handle and injects the generation via get_or_insert, which correctly preserves any caller-supplied value. The SqliteTransport trait is fully satisfied (only get_pages and commit), so there is no missing delegation.

  • is_startup_database_miss simplification is correct. Removing the expected_generation.is_none() guard is required: because GenerationFencedTransport now always populates expected_generation, the old guard would never be satisfied, which would suppress the quiet startup path on every first boot and cause spurious error logs.

  • Namespace security is maintained. validate_remote_sqlite_generation constructs the UDB actor key with conn.namespace_id, so the generation lookup is implicitly namespace-scoped. Skipping the separate validate_sqlite_actor call when a generation is present is safe.
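
To make the decorator behavior concrete, here is a minimal sketch of how such a wrapper could inject the generation. It is not the PR's code: the request structs, the synchronous trait signature, the return type, and the EchoTransport test double are all simplified assumptions chosen so the example stands alone. Only the names GenerationFencedTransport, SqliteTransport, get_pages, commit, and expected_generation come from the review.

```rust
// Hypothetical, simplified request types; the real ones carry more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct GetPagesRequest {
    pub page_numbers: Vec<u32>,
    pub expected_generation: Option<u64>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct CommitRequest {
    pub dirty_page_numbers: Vec<u32>,
    pub expected_generation: Option<u64>,
}

// Assumed trait shape: two methods, as the review describes. The return type
// (the generation the callee saw) exists only to make the sketch observable.
pub trait SqliteTransport {
    fn get_pages(&self, req: GetPagesRequest) -> Result<Option<u64>, String>;
    fn commit(&self, req: CommitRequest) -> Result<Option<u64>, String>;
}

pub struct GenerationFencedTransport<T> {
    inner: T,
    generation: u64,
}

impl<T: SqliteTransport> GenerationFencedTransport<T> {
    pub fn new(inner: T, generation: u64) -> Self {
        Self { inner, generation }
    }
}

impl<T: SqliteTransport> SqliteTransport for GenerationFencedTransport<T> {
    fn get_pages(&self, mut req: GetPagesRequest) -> Result<Option<u64>, String> {
        // get_or_insert writes only when the field is None, so a
        // caller-supplied generation is left untouched.
        let _ = req.expected_generation.get_or_insert(self.generation);
        self.inner.get_pages(req)
    }

    fn commit(&self, mut req: CommitRequest) -> Result<Option<u64>, String> {
        let _ = req.expected_generation.get_or_insert(self.generation);
        self.inner.commit(req)
    }
}

// Test double that reports the generation it received.
pub struct EchoTransport;

impl SqliteTransport for EchoTransport {
    fn get_pages(&self, req: GetPagesRequest) -> Result<Option<u64>, String> {
        Ok(req.expected_generation)
    }

    fn commit(&self, req: CommitRequest) -> Result<Option<u64>, String> {
        Ok(req.expected_generation)
    }
}
```

Wrapping EchoTransport with generation 7 and sending a request whose expected_generation is None should surface Some(7) at the inner transport, while a request that already carries Some(3) should pass through unchanged.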


Issues

1. Dead parameter should be removed

is_startup_database_miss has _expected_generation: Option<u64> but never uses it. The parameter is also passed at the one call site in handle_sqlite_get_pages_response. Both the definition and call site should drop the argument.

2. Fragile string match in is_initial_main_page_missing

message == "actor does not exist" is a verbatim match against a string produced by two different validation functions. If either message changes, the VFS will silently treat a brand-new database fetch as a fatal error. The other two conditions in the same function use contains(), not equality.

A structured SqliteStorageError variant would be more robust. At minimum, using contains("actor does not exist") would be consistent with the rest of the function.
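
Both options can be sketched side by side. This is illustrative only: the variant names on SqliteStorageError and both helper function names are hypothetical, and the real is_initial_main_page_missing checks two further conditions that are not reproduced here.

```rust
// Hypothetical error type; the variant names are illustrative, not the project's.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SqliteStorageError {
    ActorNotFound,
    Other(String),
}

/// Option A (minimal): substring match, consistent with the contains() checks
/// the function reportedly uses for its other conditions.
pub fn message_indicates_missing_actor(message: &str) -> bool {
    message.contains("actor does not exist")
}

/// Option B (structured): match on a variant so that rewording the
/// human-readable message cannot change control flow.
pub fn error_indicates_missing_actor(err: &SqliteStorageError) -> bool {
    matches!(err, SqliteStorageError::ActorNotFound)
}
```

Option A still depends on the phrase surviving in some form, so it only tolerates added context around the message. Option B removes the dependency on wording entirely, at the cost of carrying a typed error across the transport boundary.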

3. Implicit coupling between VFS and transport wrapper

fetch_initial_pages in vfs.rs always passes expected_generation: None, relying on GenerationFencedTransport in database.rs to fill it in via get_or_insert. This is correct today but non-obvious: a reader of vfs.rs alone cannot tell the generation will be injected before the request leaves the process. A brief comment at the None assignment would make this explicit.
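
One possible form of that comment, shown on a stand-in request builder. The GetPagesRequest struct, the initial_pages_request helper, and the choice of page 1 are assumptions made for the sketch; the actual fetch_initial_pages body is not shown in this review.

```rust
// Hypothetical, simplified request type.
#[derive(Debug, Clone, PartialEq)]
pub struct GetPagesRequest {
    pub page_numbers: Vec<u32>,
    pub expected_generation: Option<u64>,
}

// Stand-in for the request construction inside fetch_initial_pages.
pub fn initial_pages_request() -> GetPagesRequest {
    GetPagesRequest {
        // Assumption for the sketch: the initial fetch asks for page 1.
        page_numbers: vec![1],
        // Intentionally None. GenerationFencedTransport (database.rs) fills
        // this in via get_or_insert before the request leaves the process,
        // so the VFS never needs to know the current generation.
        expected_generation: None,
    }
}
```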

4. No tests

The change touches security-relevant validation logic. A regression would allow a stale actor instance to access another generation's data. A unit test for validate_sqlite_actor_for_request confirming (a) the generation path is taken when expected_generation is Some and (b) the actor-existence path is taken when it is None would be valuable.
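
A rough shape for such a test is sketched below against an in-memory stand-in. Everything here is hypothetical: validate_request_sketch is not the real validate_sqlite_actor_for_request, FakeActorStore replaces the UDB lookup, and the error variants are invented for illustration. The point is only to show the two branches being exercised.

```rust
use std::collections::HashMap;

// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum ValidationError {
    ActorNotFound,
    GenerationMismatch { expected: u64, actual: u64 },
}

// In-memory stand-in for the real actor lookup: actor id -> current generation.
pub struct FakeActorStore {
    generations: HashMap<String, u64>,
}

impl FakeActorStore {
    pub fn with_actor(actor_id: &str, generation: u64) -> Self {
        let mut generations = HashMap::new();
        generations.insert(actor_id.to_string(), generation);
        Self { generations }
    }
}

// Hypothetical stand-in for the routing under review: validate the generation
// when one is supplied, otherwise fall back to checking actor existence.
pub fn validate_request_sketch(
    store: &FakeActorStore,
    actor_id: &str,
    expected_generation: Option<u64>,
) -> Result<(), ValidationError> {
    let actual = store
        .generations
        .get(actor_id)
        .copied()
        .ok_or(ValidationError::ActorNotFound)?;
    match expected_generation {
        // Generation path: a stale caller must be rejected.
        Some(expected) if expected != actual => {
            Err(ValidationError::GenerationMismatch { expected, actual })
        }
        // Matching generation, or existence-only path when None.
        _ => Ok(()),
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn generation_path_rejects_stale_instance() {
        let store = FakeActorStore::with_actor("actor-a", 5);
        assert_eq!(
            validate_request_sketch(&store, "actor-a", Some(4)),
            Err(ValidationError::GenerationMismatch { expected: 4, actual: 5 })
        );
        assert_eq!(validate_request_sketch(&store, "actor-a", Some(5)), Ok(()));
    }

    #[test]
    fn existence_path_used_when_generation_absent() {
        let store = FakeActorStore::with_actor("actor-a", 5);
        assert_eq!(validate_request_sketch(&store, "actor-a", None), Ok(()));
        assert_eq!(
            validate_request_sketch(&store, "missing", None),
            Err(ValidationError::ActorNotFound)
        );
    }
}
```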


Summary

The core approach is correct and the security properties hold. Two actionable items before merge: (1) remove the dead _expected_generation parameter from is_startup_database_miss and its call site, and (2) harden is_initial_main_page_missing against future message changes.

@MasterPtato force-pushed the counter-latency/sqlite-generation-fence branch from 71dd250 to becf8f8 on May 19, 2026 19:03
@MasterPtato force-pushed the counter-latency/core-metrics-endpoint branch from f21c5de to 6be56bf on May 19, 2026 19:03