fix(sandbox): length-frame exec-stdin writes so file I/O works over a…#3719
fix(sandbox): length-frame exec-stdin writes so file I/O works over a…#3719imran31415 wants to merge 5 commits into
Conversation
… TLS DOCKER_HOST The Docker sandbox materializes files with `docker exec` + `tar -x`/`cat` reading the payload from stdin, then signals end-of-input with `raw_sock.shutdown(SHUT_WR)` (wrapped in `except Exception: pass`). Over a TLS DOCKER_HOST that half-close never delivers a stdin EOF to the container, so the in-container `tar`/`cat` blocks forever and `session.write()` / `apply_manifest()` hang. This reproduces against Docker-in-Docker sidecars and remote daemons reached over TLS (both common setups). Fix without falling back to `put_archive()` (deliberately avoided for volume-driver-mounted containers, see `write`): frame the payload by length and pipe the real command through `head -c <n>`, so the reader stops after exactly <n> bytes regardless of whether the stdin half-close is delivered. Works identically over unix sockets, TLS TCP, and DinD. Adds regression tests asserting the exec command is length-framed and the exact payload is streamed (seekable and non-seekable streams). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b9dd8f0e13
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
…emory Addresses review feedback: `_measure_stream` previously drained a non-seekable stream into a BytesIO on the event-loop thread, which both blocked the loop and was unbounded (a large HTTP-body/pipe upload could OOM the process). Now all measuring/spooling runs inside the executor thread (`_write`), and a non-seekable stream is copied into a `SpooledTemporaryFile` — kept in memory up to 16 MiB, then spilled to disk — so the length can be framed without buffering the whole payload in RAM. Seekable streams (the common case: in-memory tar, sized file payloads) are still measured in place with zero copy. The spool is closed in a finally. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a7d491ceed
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Addresses review feedback. `head -c "$n" | "$@"` reports only the consumer's exit status, so on an image without `head` the `head: not found` error was discarded while `cat`/`tar` saw an empty pipe, exited 0, and the write "succeeded" after creating an empty file — silent data loss. Capture the producer's status via a temp file and exit non-zero (98) when it fails, so a missing/broken `head` surfaces as a WorkspaceArchiveWriteError instead. Uses a portable POSIX pattern (no `pipefail`, which dash lacks). Verified end-to-end against a real container: normal writes still exit 0 with correct contents; with `head` removed the exec now exits 98 instead of silently writing an empty file. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: dfe259dc47
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Addresses review feedback. The send loop read the payload to EOF while the exec was framed with the length measured earlier by _measure_stream. If a seekable input changed in between (e.g. an open file appended or truncated during write()), that desynced: extra bytes piled up behind a `head -c` that had already stopped, and a short read left `head` blocked on a TLS stdin that never EOFs — re-introducing the hang this change removes. Send at most `payload_length` bytes (the framed count) and raise WorkspaceArchiveWriteError if the stream ends early, instead of truncating silently or blocking. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a7f0b5a76e
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| stream.seek(0, io.SEEK_END) | ||
| end = stream.tell() | ||
| stream.seek(start) | ||
| return end - start, stream, None |
There was a problem hiding this comment.
Clamp negative measured lengths before framing
When a seekable stream's current position is past its end (BytesIO(b"abc").seek(10) or an open file left beyond current EOF), end - start is negative. That negative value is passed to head -c "$n"; head --help documents --bytes=[-]NUM as printing all but the last NUM bytes for a leading -, so over the TLS Docker transport the in-container head again waits for stdin EOF while the send loop sends nothing (remaining > 0 is false), recreating the hang this patch is fixing. Clamp this to 0 or fail before starting the exec.
Useful? React with 👍 / 👎.
Addresses review feedback. A seekable stream positioned past its end (e.g. BytesIO(b"abc").seek(10), or a file left beyond EOF) makes `end - start` negative. That negative count becomes `head -c -N`, which means "all but the last N bytes" and reads to EOF — re-hanging over a TLS stdin — while the send loop (`remaining > 0` false) sends nothing. Clamp the measured length to 0: a position past end has no readable bytes, so a 0-byte write (`head -c 0`, which exits immediately) is the correct and safe result. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
seratch
left a comment
There was a problem hiding this comment.
Thanks for the contribution. I traced this against #3718, and this PR targets the same failing path: Docker sandbox file materialization through _stream_into_exec for session.write() / apply_manifest(). The byte-count framing approach is a good fit here because it keeps the existing exec-stdin path, avoids put_archive() and its volume-driver mount issue, and removes the dependency on stdin half-close delivery.
I also ran a local runtime check against a real Docker container. On the current unix-socket Docker context, both base and this PR pass normal write() / hydrate_workspace() round trips, and this PR also passes the new stream_into_exec tests, a non-seekable stream write/read, and the missing-head failure path where it now raises WorkspaceArchiveWriteError instead of silently writing an empty file. I could not reproduce the TLS-specific hang locally because my Docker context is not TLS-backed, but the code path and issue scope line up with #3718.
Before merging, please make one focused cleanup fix: if _measure_stream() creates a SpooledTemporaryFile for a non-seekable stream and then stream.read() or spool.write() raises before returning, the spool should be closed inside _measure_stream() because the caller never receives it. Please add a regression test for that failure path.
|
Also, can you fix the mypy/pyright errors? |
What
Resolves #3718
Fixes a hang in the Docker sandbox:
session.write()/apply_manifest()blocks forever whenDOCKER_HOSTis reached over TLS (Docker-in-Docker sidecars, remote daemons).Root cause
DockerSandboxSession._stream_into_execstreams the payload into adocker execrunningtar -x/catfrom stdin, then signals end-of-input withraw_sock.shutdown(socket.SHUT_WR)(insideexcept Exception: pass). Over a TLS transport that half-close never delivers a stdin-EOF to the container, so the in-container reader blocks forever, the exec never exits, and the client's drain loop hangs. Over a unix socket the half-close works, which is why this only bites TLS daemons.Fix
Frame the payload by length and pipe the real command through
head -c <n>, so the reader terminates on a byte count instead of a stdin half-close:put_archive()(volume-driver-mount plugins reject the duplicateMount, per the existing comments inread/write).shutdown(SHUT_WR)is left in place as a harmless best-effort no-op; termination no longer depends on it._stream_into_exec). The interactive PTY path is untouched.Length measurement (off the event loop, bounded memory)
_measure_streamruns entirely on the executor thread (never the event loop):seek/tellwith zero copy.SpooledTemporaryFile— kept in memory up to 16 MiB, then spilled to disk — so the length can be framed without buffering the whole payload in RAM. The spool is closed in afinally.Tests
Adds regression tests to
tests/sandbox/test_docker.py:test_stream_into_exec_length_frames_stdin_payload— asserts the exec command ishead -c <n>-framed with<n>== payload byte length, and that exactly the payload is streamed (payload includes NUL / non-UTF-8 bytes).test_stream_into_exec_frames_non_seekable_stream— a non-seekable stream is spooled and framed with the correct byte count.Validation:
tests/sandbox/→ 817 passed, 20 skippedruff format --checkandruff check→ cleanNotes
Discovered while running the Strix pentest tool (which uses this SDK's Docker sandbox) inside a Kubernetes dev workspace whose only Docker access is a TLS-only DinD sidecar — the scan hung during workspace materialization until this fix.