feat(envd): persist command stdout/stderr to the log pipeline #2935
mishushakov wants to merge 23 commits into
Commands started via envd now get a unique cid that is returned in the
StartEvent and stamped on every stdout/stderr log line, so their output is
persisted through the existing Loki pipeline and capped per command. Adds
GET /v2/sandboxes/{sandboxID}/commands/{cid}/logs to retrieve a single
command's output, with the cid filter threaded through the local Loki query
and the remote edge contract.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Code Review
In packages/api/internal/handlers/command_logs.go, the expression new(time.UnixMilli(*params.Cursor)) is invalid Go syntax because new expects a type rather than a value. This will cause a compilation error, which can be resolved by converting the timestamp to a time.Time value first and then taking its address.
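One way to apply the fix described above is to materialize the `time.Time` value in a local variable and then take its address. This is a minimal sketch, assuming `params.Cursor` is a `*int64` holding Unix milliseconds; the helper name `cursorToTime` is hypothetical and not taken from the PR.

```go
package main

import (
	"fmt"
	"time"
)

// cursorToTime converts an optional Unix-millisecond cursor into an optional
// time.Time. The name and signature are illustrative, not from the PR.
func cursorToTime(cursorMs *int64) *time.Time {
	if cursorMs == nil {
		return nil
	}
	t := time.UnixMilli(*cursorMs) // materialize the value first
	return &t                      // then take its address
}

func main() {
	ms := int64(1700000000000)
	fmt.Println(cursorToTime(&ms).UTC())
	fmt.Println(cursorToTime(nil) == nil)
}
```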
❌ 3 Tests Failed; 4 ❄️ flaky test(s) reported.
The cid is stamped on process_start/process_end lifecycle lines too, so filtering by cid alone returned them alongside output. Add an event_type=process_output filter so the command-logs endpoint returns only stdout/stderr.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Drop the envd-assigned cid (and its proto changes); commands are already
identified by the pid returned in StartEvent. Output lines are now stamped
with pid, and the retrieval endpoint becomes
GET /v2/sandboxes/{sandboxID}/commands/{pid}/logs with start/end query params.
The time window disambiguates a reused pid: within [start, end] a pid maps to
a single command execution. Filter stays scoped to event_type=process_output.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Drop the dedicated /v2/sandboxes/{sandboxID}/commands/{pid}/logs route in
favor of an optional pid query param on the existing
/v2/sandboxes/{sandboxID}/logs endpoint. Callers scope to a single command's
output by passing pid (filtered to event_type=process_output), bounding the
window with the existing cursor/direction params.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Swap the pid query param on /v2/sandboxes/{sandboxID}/logs for a `query`
field that is appended verbatim as LogQL pipeline stages after the
server-built selector ({teamID, sandboxID, category!="metrics"} | json).
Because LogQL cannot reopen a stream selector mid-pipeline, the team/sandbox
scoping is always enforced and the client expression can only narrow within
the caller's own logs. A length cap (1024) bounds abuse. Callers fetch a
single command's output via `| pid="<pid>" | event_type="process_output"`.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Previously the server forced a `| json` stage before the client query, so a line-filter query like `|= "error"` produced the invalid `| json |= "error"`. Now the server contributes only the enforced stream selector and the client supplies the entire LogQL pipeline (including its own parser stage). Tenant scoping still holds since LogQL cannot reopen a selector mid-pipeline.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
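The selector-plus-client-pipeline composition could look roughly like the sketch below. It assumes the label names `teamID`, `sandboxID`, and `category` from the commit message and the 1024-character cap; the function name `buildLogQL` and the error text are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// maxClientQueryLen is the length cap mentioned in the commit message.
const maxClientQueryLen = 1024

// buildLogQL prepends the server-enforced stream selector to a client-supplied
// pipeline. The server adds no parser stage of its own, so the client decides
// whether to start with `| json`, a line filter, or nothing at all.
func buildLogQL(teamID, sandboxID, clientQuery string) (string, error) {
	if len(clientQuery) > maxClientQueryLen {
		return "", errors.New("query too long")
	}
	// strconv.Quote escapes quotes and backslashes inside the label values.
	selector := fmt.Sprintf(`{teamID=%s, sandboxID=%s, category!="metrics"}`,
		strconv.Quote(teamID), strconv.Quote(sandboxID))
	clientQuery = strings.TrimSpace(clientQuery)
	if clientQuery == "" {
		return selector, nil
	}
	return selector + " " + clientQuery, nil
}

func main() {
	q, err := buildLogQL("team-1", "sbx-1", `| json | pid="1234" | event_type="process_output"`)
	fmt.Println(q, err)
}
```

With this shape a line filter such as `|= "error"` lands directly after the selector, which avoids the invalid `| json |= "error"` form.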
…ogs-method # Conflicts: # packages/api/internal/api/api.gen.go
Move the /v2/sandboxes/{sandboxID}/logs `query` field and its supporting
provider/resources/edge changes out to a separate PR (#2963). This branch now
contains only the envd command-output persistence: stdout/stderr is written to
the log pipeline (stamped with pid, capped per command).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add GET /sandboxes/{sandboxID}/logs/command/{pid}, which returns the
persisted output (stdout/stderr) of a single command by reusing the
sandbox logs pipeline with a pid + event_type=process_output filter.
A start/end time window disambiguates reused pids.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… into mishushakov/command-logs-method
Drop the dedicated /sandboxes/{sandboxID}/logs/command/{pid} route in favor
of an optional pid query param on the existing /v2/sandboxes/{sandboxID}/logs
endpoint. Callers scope to a single command's output by passing pid (filtered
to event_type=process_output), bounding the window with the existing
cursor/direction params.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 069a09341d
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c4476183f1
The budget only subtracted message payload bytes, so commands emitting tiny lines (e.g. `yes`) could produce ~2M individual log records (each with ~200 bytes of field/JSON framing overhead) before tripping the 2 MiB cap, flooding the exporter the cap was meant to protect. Charge a fixed per-event overhead per record so the budget also bounds record count (~8k records worst case).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5ca9fbbace
…rd cap
Charging synthetic overhead bytes per record silently taxed every command's payload budget: typical ~100-200 byte log lines would truncate at ~0.6-0.9 MiB of real output instead of the advertised 2 MiB. Keep the byte budget honest (payload only) and bound the tiny-line flood case (e.g. `yes`) directly with a per-command record cap of 10k, shared across stdout/stderr like the byte cap.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
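A payload-only byte budget combined with a separate record cap might be sketched as follows. The 2 MiB and 10k limits come from the commit messages; the type `outputBudget`, its method names, and the choice to reject a whole line that does not fit (rather than persist part of it) are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

const (
	maxOutputBytes   = 2 << 20 // 2 MiB of payload per command
	maxOutputRecords = 10_000  // record cap per command
)

// outputBudget is shared by a command's stdout and stderr loggers.
type outputBudget struct {
	mu        sync.Mutex
	bytesLeft int
	recsLeft  int
	truncated bool
}

func newOutputBudget() *outputBudget {
	return &outputBudget{bytesLeft: maxOutputBytes, recsLeft: maxOutputRecords}
}

// allow reports whether a line of n payload bytes may be persisted.
// markTruncated is true exactly once, on the first rejected line, so the
// caller can emit a single truncation marker.
func (b *outputBudget) allow(n int) (ok, markTruncated bool) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.truncated {
		return false, false
	}
	if b.recsLeft == 0 || n > b.bytesLeft {
		b.truncated = true
		return false, true
	}
	b.recsLeft--
	b.bytesLeft -= n
	return true, false
}

func main() {
	b := newOutputBudget()
	persisted := 0
	for i := 0; i < 20_000; i++ { // simulates `yes`: many 2-byte lines
		if ok, _ := b.allow(2); ok {
			persisted++
		}
	}
	fmt.Println(persisted) // the record cap trips long before the byte cap
}
```

Keeping the two limits independent means ordinary output is bounded only by its real size, while a flood of tiny lines is bounded by record count.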
The truncation marker only set event_type, so the pid-scoped retrieval filter (| pid = ... | event_type = process_output) dropped it, and command-specific queries lost the only indication that output was truncated. Emit it with the same pid/stream fields as regular output lines.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit aad63dc.
Completed empty lines (e.g. printf 'a\n\nb\n' or test runners separating sections) are part of the command's real output and are delivered on the live stream, but emitLine discarded them, making pid-filtered retrieval diverge from actual stdout/stderr. Emit them as records with an empty message. The guard was only ever reachable for blank lines: the over-long segment path can't produce an empty slice and flush() is guarded by len(buf) > 0.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
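The blank-line behavior can be illustrated with a small line splitter. This is a simplified sketch, not the PR's implementation: the `lineBuffer` type is hypothetical, and it omits the over-long segment handling the commit message mentions.

```go
package main

import (
	"bytes"
	"fmt"
)

// lineBuffer accumulates chunks and emits each completed line, including
// empty ones, so persisted output matches what the live stream delivered.
type lineBuffer struct {
	buf  []byte
	emit func(line string)
}

func (l *lineBuffer) Write(p []byte) (int, error) {
	l.buf = append(l.buf, p...)
	for {
		i := bytes.IndexByte(l.buf, '\n')
		if i < 0 {
			break
		}
		l.emit(string(l.buf[:i])) // "" for a blank line; still emitted
		l.buf = l.buf[i+1:]
	}
	return len(p), nil
}

// flush emits a trailing partial line, if any, when the stream ends.
func (l *lineBuffer) flush() {
	if len(l.buf) > 0 {
		l.emit(string(l.buf))
		l.buf = nil
	}
}

func main() {
	var lines []string
	lb := &lineBuffer{emit: func(s string) { lines = append(lines, s) }}
	lb.Write([]byte("a\n\nb"))
	lb.Write([]byte("\n"))
	lb.flush()
	fmt.Printf("%q\n", lines) // prints ["a" "" "b"]
}
```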
The stdout/stderr reader goroutines are launched in New, but cmd.Start() runs later in Start: a non-PTY command that writes immediately on startup can deliver output (and trigger Handler.Pid -> cmd.Process.Pid) before Start has published cmd.Process. That is a nil dereference that panics envd, and an unsynchronized read even after publication. Publish the pid via an atomic and a ready channel closed after cmd.Start() succeeds; the loggers block on it instead of touching cmd.Process. Output implies the process started, so the wait can't deadlock on a failed Start.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
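The atomic-plus-ready-channel pattern described above might look like this in isolation. The `pidGate` type and its method names are hypothetical; the real change lives on the process handler and may be structured differently.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// pidGate publishes a pid exactly once and lets readers block until it is set.
type pidGate struct {
	pid   atomic.Int32
	ready chan struct{}
	once  sync.Once
}

func newPidGate() *pidGate { return &pidGate{ready: make(chan struct{})} }

// publish is meant to be called after cmd.Start() succeeds.
func (g *pidGate) publish(pid int) {
	g.once.Do(func() {
		g.pid.Store(int32(pid))
		close(g.ready) // closing the channel releases every waiter
	})
}

// wait blocks until the pid has been published, then returns it.
func (g *pidGate) wait() int {
	<-g.ready
	return int(g.pid.Load())
}

func main() {
	g := newPidGate()
	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // stands in for a stdout logger started before Start
		defer wg.Done()
		fmt.Println("pid:", g.wait())
	}()
	g.publish(4242) // in real code this would be cmd.Process.Pid
	wg.Wait()
}
```

Because loggers only call `wait` after receiving output, and output implies a started process, the channel is always closed by the time a logger can block on it.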
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5d952a5e21
Command output lines are always persisted at info (the truncation marker at warn), so combining pid with a minimum level of warn/error built a LogQL pipeline that filtered out every output line: pid-scoped retrieval returned empty despite the output being persisted. Drop the level stage when pid is set and document the behavior on the pid params.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
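The interaction between the pid filter and the level filter could be sketched as below. The function `buildPipeline`, the leading `| json` stage, and the example level stage are assumptions for illustration; only the pid and event_type filters come from the PR text.

```go
package main

import (
	"fmt"
	"strings"
)

// buildPipeline returns the LogQL stages appended after the stream selector.
// minLevelStage is an already-built level filter stage, or "" for none.
func buildPipeline(pid *int, minLevelStage string) string {
	stages := []string{"| json"}
	if pid != nil {
		// Output lines are persisted at info, so a warn/error floor would
		// hide all of them; the level stage is skipped when pid is set.
		stages = append(stages,
			fmt.Sprintf(`| pid="%d"`, *pid),
			`| event_type="process_output"`)
	} else if minLevelStage != "" {
		stages = append(stages, minLevelStage)
	}
	return strings.Join(stages, " ")
}

func main() {
	pid := 1234
	level := `| level=~"warn|error"` // hypothetical level stage
	fmt.Println(buildPipeline(&pid, level))
	fmt.Println(buildPipeline(nil, level))
}
```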
An Edge deployment that predates the pid query param ignores it and returns unfiltered sandbox logs, which the caller would mistake for command-scoped output. Unlike level/search (where degrading to unfiltered logs is acceptable and only logged), pid scoping is a correctness contract, so return 501 unless the Edge response advertises support via the new X-E2B-Edge-Feature-Sandbox-Logs-Pid-Filtering-Enabled header.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
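The feature-header check could be expressed along these lines. The header name is taken from the commit message; treating the value `"true"` as "supported", and the helper names `statusForEdgeResponse` and `edgeHeaders`, are assumptions.

```go
package main

import (
	"fmt"
	"net/http"
)

const pidFilterHeader = "X-E2B-Edge-Feature-Sandbox-Logs-Pid-Filtering-Enabled"

// statusForEdgeResponse decides which status to return to the caller.
// A pid-scoped request is only honored when the Edge response advertises
// support; otherwise the unfiltered logs must not be passed through.
func statusForEdgeResponse(pidRequested bool, edgeHeaders http.Header) int {
	if pidRequested && edgeHeaders.Get(pidFilterHeader) != "true" {
		return http.StatusNotImplemented // 501
	}
	return http.StatusOK
}

// edgeHeaders builds a fake Edge response header set for the demo below.
func edgeHeaders(supported bool) http.Header {
	h := http.Header{}
	if supported {
		h.Set(pidFilterHeader, "true")
	}
	return h
}

func main() {
	fmt.Println(statusForEdgeResponse(true, edgeHeaders(false)))  // older Edge, pid requested
	fmt.Println(statusForEdgeResponse(true, edgeHeaders(true)))   // Edge advertises support
	fmt.Println(statusForEdgeResponse(false, edgeHeaders(false))) // no pid, nothing to verify
}
```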
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ae853a10b1
b357648 to ae853a1

Commands started via envd previously had their stdout/stderr discarded once no client was streaming. This persists each non-PTY command's output through the existing envd → Loki log pipeline as `process_output` log lines, stamped with the command's `pid` and capped per command (with a truncation marker) to protect the exporter buffer and downstream storage. Output is captured even when no client is attached. The envd version is bumped (0.6.1 → 0.6.2) for the behavioral change; PTY/interactive sessions are out of scope. Includes unit tests for the output line-buffering and per-command cap.

Retrieval is handled separately in #2963 (a `query` LogQL filter on the sandbox logs endpoint); with both, a caller fetches one command's output via `query=| json | pid="<pid>" | event_type="process_output"`. This PR is independently useful: persisted output already shows up in the existing sandbox logs (search/level).

🤖 Generated with Claude Code
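For callers, the retrieval query needs URL encoding before it goes on the wire. The sketch below assumes the `/v2/sandboxes/{sandboxID}/logs` path and `query` parameter described in the commit history; the base URL is a placeholder and `commandLogsURL` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/url"
)

// commandLogsURL builds a request URL for one command's persisted output.
// baseURL is a placeholder; the query expression mirrors the PR description.
func commandLogsURL(baseURL, sandboxID string, pid int) string {
	q := url.Values{}
	q.Set("query", fmt.Sprintf(`| json | pid="%d" | event_type="process_output"`, pid))
	return fmt.Sprintf("%s/v2/sandboxes/%s/logs?%s",
		baseURL, url.PathEscape(sandboxID), q.Encode())
}

func main() {
	fmt.Println(commandLogsURL("https://api.example.test", "sbx-1", 1234))
}
```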