Skip to content

fix(rivetkit): exit pid1 after signal shutdown#5045

Draft
NathanFlurry wants to merge 1 commit into
active-actors/fix-active-countfrom
rivetkit-shutdown/exit-pid1-after-drain
Draft

fix(rivetkit): exit pid1 after signal shutdown#5045
NathanFlurry wants to merge 1 commit into
active-actors/fix-active-countfrom
rivetkit-shutdown/exit-pid1-after-drain

Conversation

@NathanFlurry
Copy link
Copy Markdown
Member

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 12, 2026

Code Review: fix(rivetkit): exit pid1 after signal shutdown

Overview

This PR fixes a real and important edge case: when a Node.js process runs as PID 1 inside a container (e.g., Docker), the OS kernel does not install default signal handlers. Re-raising SIGINT/SIGTERM via process.kill(process.pid, signal) silently swallows the signal instead of terminating the process, causing the container to hang until a forceful kill timeout. The fix detects PID 1 and calls process.exit() with the standard POSIX convention exit codes (128 + signal number) instead.


Correctness

  • Exit codes are correct: SIGINT = 2 → 130, SIGTERM = 15 → 143. These match the POSIX convention and are what container orchestrators and shell scripts expect.
  • Both call sites are updated: the double-signal path and the clean shutdown completion path both go through finishShutdownSignal, so behavior is consistent.
  • Non-PID-1 processes are unaffected: the else branch preserves the existing process.kill(process.pid, signal) re-raise behavior, which is correct for normal processes that have default kernel signal handlers.

Code Quality

  • The helper functions are small, focused, and well-named.
  • signalExitCode exhausts all variants of the ShutdownSignal discriminated union without a fallthrough — TypeScript's exhaustiveness check will flag a compile error if ShutdownSignal grows a new variant without updating this function. This aligns with project conventions around explicit enum matching.
  • The updated comment explains the why (PID 1 signal swallowing) rather than just the what.

Suggestions / Minor Observations

  • finishShutdownSignal is called after #removeSignalHandlers() in both call sites, so there is no risk of re-entering the shutdown flow when process.kill is called on non-PID-1 paths. The ordering is safe.
  • One subtle note: on a non-PID-1 process, process.kill(process.pid, signal) re-raises synchronously and the default handler runs before returning. On PID-1, process.exit() also terminates synchronously. Both paths are equivalent from the caller's perspective.
  • Test coverage: there are no automated tests for the PID 1 path. This is understandable since testing PID 1 behavior requires a real container environment, but the PR description does not describe how the fix was verified manually. Worth filling in before merging (e.g., noting a docker stop test).

Security / Performance

No security or performance concerns. This is a purely behavioral fix for a container deployment edge case.


Summary

Clean, minimal, and correct fix. Code quality is good, conventions are followed, and the change is appropriately scoped. The main gap is that the PR description does not describe how it was tested — worth documenting before merging.

Verdict: Looks good. Suggest documenting the manual test method in the PR description.

@MasterPtato MasterPtato force-pushed the rivetkit-shutdown/exit-pid1-after-drain branch from a20fe59 to e344619 Compare May 14, 2026 17:26
@MasterPtato MasterPtato force-pushed the active-actors/fix-active-count branch from 253e406 to 87e5d7d Compare May 14, 2026 17:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant