Session unloadable when Windows shell exits with negative exit code -- data.kind.exitCode: Number must be greater than or equal to 0 #3454

@SQLBImhugh

Describe the bug

Sessions become permanently unloadable after any shell command exits with a negative Windows exit code. The session loader's Zod schema constrains data.kind.exitCode to >= 0, but Windows' GetExitCodeProcess returns a DWORD that the Node/Rust runtime surfaces as a signed i32, so any process that crashes with a high-bit exit code (e.g. .NET unhandled exception 0xE0434352, access violation 0xC0000005, stack overflow 0xC00000FD) gets serialized into events.jsonl as a negative number. On /resume, schema validation fails and the entire session — every conversation turn, every edit — is lost.

Error message

Error: Session '7406ade6-17c1-4924-ba12-36185ec0821b' was found but could not be loaded.
Error: Session file is corrupted (line 3844: data.kind.exitCode: Number must be greater than or equal to 0)

[process exited with code 1 (0x00000001)]

Offending event in events.jsonl

{
  "type": "system.notification",
  "data": {
    "content": "<system_notification>\nShell command \"Re-run launcher with ProcessStartInfo.ArgumentList\" (shellId: launcher5) has exited with exit code -532462766. Use read_powershell with shellId \"launcher5\" to retrieve the output.\n</system_notification>",
    "kind": {
      "type": "shell_completed",
      "shellId": "launcher5",
      "exitCode": -532462766,
      "description": "Re-run launcher with ProcessStartInfo.ArgumentList"
    }
  },
  ...
}

-532462766 is 0xE0434352 reinterpreted as a signed i32 — the canonical .NET CLR unhandled-exception code. It is a perfectly valid Windows process exit code; the schema is wrong to reject it.
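The signed/unsigned relationship can be checked with a few lines of TypeScript. This is only an illustrative sketch: the helper names (toUnsignedExitCode, toSignedExitCode, formatExitCodeHex) are hypothetical and are not taken from the Copilot CLI source.

```typescript
// Hypothetical helpers for illustration; not part of the Copilot CLI codebase.

// Reinterpret a signed 32-bit exit code as the unsigned DWORD value.
function toUnsignedExitCode(code: number): number {
  return code >>> 0;
}

// Reinterpret an unsigned DWORD as a signed 32-bit integer.
function toSignedExitCode(code: number): number {
  return code | 0;
}

// Format an exit code as an 8-digit uppercase hex string.
function formatExitCodeHex(code: number): string {
  const hex = toUnsignedExitCode(code).toString(16).toUpperCase();
  return "0x" + ("00000000" + hex).slice(-8);
}

console.log(toUnsignedExitCode(-532462766)); // 3762504530
console.log(formatExitCodeHex(-532462766));  // 0xE0434352
console.log(toSignedExitCode(3221225477));   // -1073741819
```

Both directions are lossless for 32-bit values, so either representation could be stored as long as the schema accepts it.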

Affected version

  • Session created with: GitHub Copilot CLI 1.0.39
  • Resume attempted with: GitHub Copilot CLI 1.0.51
  • OS: Windows 11

Steps to reproduce

  1. On Windows, run any command from a Copilot CLI session that exits with a high-bit exit code. Easiest repro: spawn a .NET process that throws an unhandled exception, e.g.
    pwsh -NoProfile -Command "throw 'boom'"
    or any native crash:
    cmd /c "exit /b 3221225477"   # 0xC0000005 - returns as -1073741819
  2. Allow the shell_completed notification to be written to events.jsonl.
  3. Exit the session.
  4. Run copilot --resume <session-id>. It fails with "Session file is corrupted".

Expected behavior

  • Negative exitCode values should not invalidate the session file. Windows exit codes are 32-bit unsigned values; the schema must accept the full i32 range (or store them as unsigned u32/i64).
  • Loader should be tolerant: a single malformed system.notification event should not destroy a session containing thousands of valid events. Skip-and-warn would be far better than hard-fail.

Suggested fixes

  1. Widen the schema for kind.exitCode: drop the >= 0 constraint, accept any i32 (or i64). This matches what the OS actually returns.
  2. Coerce at write time: when surfacing the exit code into the event, convert to unsigned (exitCode >>> 0 in JS) so the hex value is preserved and is always >= 0. This is also more diagnostic for users (3762504530 vs -532462766 — neither is obvious, but the unsigned form matches what cmd /c exit and Task Manager display).
  3. Graceful loader degradation: on per-event schema failure, log a warning and skip the event rather than aborting the entire session load. Even a "session loaded with N skipped events" is infinitely better than total loss.

This bug class has bitten the loader before — see #1266 (organization_list[0].name is null Zod rejection) and #1338. A general "tolerant loader" pass would prevent the next variant from causing the same total-loss outcome.
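Fixes 1 and 3 could be combined roughly as follows. This is a minimal sketch under stated assumptions, not the actual loader: the SessionEvent shape, the function names, and the "user.message" event type in the sample are all hypothetical, and plain checks stand in for the real Zod schema so the example has no dependencies.

```typescript
// Hypothetical event shape; only the fields shown in this issue are modelled.
interface SessionEvent {
  type: string;
  data?: { kind?: { type?: string; exitCode?: unknown } };
}

interface LoadResult {
  events: SessionEvent[];
  skipped: { line: number; reason: string }[];
}

// Fix 1: accept any integer in the signed or unsigned 32-bit range.
function isValidExitCode(value: unknown): boolean {
  return (
    typeof value === "number" &&
    isFinite(value) &&
    Math.floor(value) === value &&
    value >= -2147483648 &&
    value <= 4294967295
  );
}

// Returns a reason string for an invalid event, or null when it is acceptable.
function validateEvent(value: unknown): string | null {
  if (typeof value !== "object" || value === null) return "event is not an object";
  const event = value as SessionEvent;
  if (typeof event.type !== "string") return "missing type";
  const kind = event.data?.kind;
  if (kind && kind.type === "shell_completed" && !isValidExitCode(kind.exitCode)) {
    return "data.kind.exitCode: not a 32-bit integer";
  }
  return null;
}

// Fix 3: validate line by line; record bad lines instead of aborting the load.
function loadEventsTolerantly(jsonl: string): LoadResult {
  const result: LoadResult = { events: [], skipped: [] };
  jsonl.split("\n").forEach((raw, index) => {
    const text = raw.trim();
    if (text === "") return; // ignore blank lines, e.g. a trailing newline
    try {
      const parsed: unknown = JSON.parse(text);
      const problem = validateEvent(parsed);
      if (problem === null) result.events.push(parsed as SessionEvent);
      else result.skipped.push({ line: index + 1, reason: problem });
    } catch (err) {
      result.skipped.push({ line: index + 1, reason: "invalid JSON" });
    }
  });
  return result;
}

// Sample input: two valid events, one bad exitCode, one truncated line.
const sample = [
  '{"type":"user.message","data":{}}',
  '{"type":"system.notification","data":{"kind":{"type":"shell_completed","exitCode":-532462766}}}',
  '{"type":"system.notification","data":{"kind":{"type":"shell_completed","exitCode":"oops"}}}',
  '{"type":"system.notif',
].join("\n");

const loaded = loadEventsTolerantly(sample);
console.log(loaded.events.length);  // 2
console.log(loaded.skipped.length); // 2
```

The skipped list carries line numbers, so a caller could print a "session loaded with N skipped events" warning and still point at the offending lines.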

Workaround

Patch the offending line in place in ~/.copilot/session-state/<id>/events.jsonl:

# Find the bad line
Select-String -Path events.jsonl -Pattern '"exitCode":-\d+'

# Replace the negative value with its unsigned u32 equivalent
# e.g. -532462766 -> 3762504530  (both are 0xE0434352)

Critical: preserve LF-only line endings (PowerShell's default is CRLF and will silently break parsing).

After patching, /resume works and no other data is lost.
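The replacement step can also be done programmatically. The sketch below is a hypothetical helper (patchNegativeExitCodes is not an existing tool) that rewrites the raw file text without touching line endings. It assumes every negative exitCode is a 32-bit value. Reading and writing the file, for example with Node's fs.readFileSync and fs.writeFileSync using "utf8", is left out so the function stays self-contained.

```typescript
// Hypothetical helper: rewrite each negative "exitCode" in raw events.jsonl
// text to its unsigned 32-bit equivalent. It operates on the string as-is,
// so existing LF line endings are left unchanged.
function patchNegativeExitCodes(jsonl: string): string {
  return jsonl.replace(
    /("exitCode":\s*)(-\d+)/g,
    (_match: string, prefix: string, digits: string) => {
      const unsigned = Number(digits) >>> 0; // assumes a 32-bit value
      return prefix + String(unsigned);
    }
  );
}

const before = '{"kind":{"exitCode":-532462766}}\n{"kind":{"exitCode":0}}\n';
console.log(patchNegativeExitCodes(before) === '{"kind":{"exitCode":3762504530}}\n{"kind":{"exitCode":0}}\n'); // true
```

The human-readable "content" string in the same event still mentions the negative number, but it is plain text and is not what the schema check in the error message rejects.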

Related issues (different root causes, same failure mode)

  • #2012 — Raw U+2028/U+2029 in events.jsonl breaks JSON.parse on /resume (write-time escaping bug)
  • #2209 — Long-lived session shows corrupted despite valid events.jsonl (likely size limit)
  • #1864 — Session corrupted after power loss (truncated JSON)
  • #1266 (closed) — Zod schema too strict for real-world data (same class of bug)

The common thread: any unexpected value in events.jsonl destroys the whole session. The loader needs to be tolerant of per-event failures, independent of fixing this specific schema.

Labels

  • area:platform-windows (Windows-specific: PowerShell, cmd, Git Bash, WSL, Windows Terminal)
  • area:sessions (Session management, resume, history, session picker, and session state)
  • area:tools (Built-in tools: file editing, shell, search, LSP, git, and tool call behavior)
