feat(api): add LogQL query filter to sandbox logs endpoint#2963
feat(api): add LogQL query filter to sandbox logs endpoint#2963mishushakov wants to merge 6 commits into
Conversation
Add an optional `query` field to GET /v2/sandboxes/{sandboxID}/logs that is
appended verbatim as a LogQL pipeline after the server-built stream selector
({teamID, sandboxID, category!="metrics"}). Because LogQL cannot reopen a
stream selector mid-pipeline, tenant/sandbox scoping is always enforced
server-side and the client expression can only narrow within its own logs;
a 1024-char cap bounds abuse. The field is threaded through the cluster
resources layer and the edge contract, and takes precedence over level/search.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
PR SummaryMedium Risk Overview Reviewed by Cursor Bugbot for commit 5a9b568. Bugbot is set up for automated code reviews on this repo. Configure here. |
Move the /v2/sandboxes/{sandboxID}/logs `query` field and its supporting
provider/resources/edge changes out to a separate PR (#2963). This branch now
contains only the envd command-output persistence: stdout/stderr is written to
the log pipeline (stamped with pid, capped per command).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
The QuerySandboxLogs function swallows errors returned by l.client.QueryRange and ResponseMapper, returning a nil error instead of propagating it. With the introduction of user-customizable LogQL queries, syntax errors are highly likely, and swallowing them will result in a confusing user experience where invalid queries silently return empty logs with a 200 OK status. Please update QuerySandboxLogs to propagate these errors back to the caller.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
❌ 5 Tests Failed:
View the top 2 failed test(s) by shortest run time
View the full list of 3 ❄️ flaky test(s)
To view more test analytics, go to the Test Analytics Dashboard |
QuerySandboxLogs previously swallowed errors from QueryRange and ResponseMapper, returning empty logs with a 200 OK. With user-supplied LogQL queries, syntax errors silently returned empty results. Propagate those errors, and validate the query locally with Loki's parser before sending it so a malformed client query returns a helpful 400 with the parse message instead of an opaque 500 (the logcli client discards the HTTP status and retries 400s, so it can't be classified from the returned error). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The stream selector and structured level/search filters are always built server-side and sanitized, so trust them by default and skip parsing them. Validate only when the client supplies a raw pipeline. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
@claude review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 73142a90a8
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| logger.L().Error(ctx, "error when returning logs for sandbox", zap.Error(err), logger.WithSandboxID(sandboxID)) | ||
|
|
||
| return make([]logs.LogEntry, 0), nil | ||
| return nil, fmt.Errorf("error when returning logs for sandbox: %w", err) |
There was a problem hiding this comment.
Preserve best-effort log responses on Loki errors
In local-cluster deployments, any Loki transport error from QueryRange now propagates through resources_local.go as a 500, whereas the previous implementation logged the failure and returned an empty log list. That means a transient Loki outage or timeout will break the sandbox logs endpoint entirely for callers even when the request/query is otherwise valid; only ErrInvalidQuery needs to be surfaced as an error, while the existing best-effort behavior should remain for upstream Loki failures.
Useful? React with 👍 / 👎.
| // and filter stages in valid LogQL order, e.g. `| json | pid="1234"` or `|= "error"`. | ||
| // This takes precedence over the structured level/search filters. | ||
| if pipelineValue := strings.TrimSpace(utils.DerefOrDefault(pipeline, "")); pipelineValue != "" { | ||
| return selector + " " + pipelineValue |
There was a problem hiding this comment.
🔒 Agentic Security Review
Severity: HIGH
The new query value is concatenated verbatim after the enforced selector, but there is no guard that the fragment must begin with a pipeline stage. A caller can supply a full binary expression (for example or {category!="metrics"}), which changes the query root from selector + pipeline to selector OR attacker_selector and can bypass sandbox/team scoping.
Impact: An authenticated caller can potentially expand log reads beyond the intended sandbox/team boundary and retrieve unauthorized logs.
Reviewed by Cursor Security Reviewer for commit 73142a9. Configure here.
There was a problem hiding this comment.
can instruct the parser to avoid any negative selectors perhaps
Only a malformed client query (ErrInvalidQuery) should fail the request. Transport errors, timeouts, and mapping failures from Loki revert to the prior best-effort behavior: log and return an empty list, so a transient Loki outage doesn't break the sandbox logs endpoint for valid requests. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit bc7f8dd. Configure here.
A client-supplied pipeline is appended after the enforced stream selector. Parse the resulting query and verify on the AST that it is a log-selector query whose stream selector still pins the caller's teamID and sandboxID, rejecting anything that re-roots the query (e.g. a binary `or` selector or a metric expression) as ErrInvalidQuery. Defense-in-depth so scoping does not rely on LogQL grammar invariants. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bc7f8dd420
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| Direction: direction, | ||
| Level: logToEdgeLevel(level), | ||
| Search: search, | ||
| Query: query, |
There was a problem hiding this comment.
Gate query filtering on edge support
When the API talks to a remote cluster whose edge service has not yet been rolled to the new spec, this new query parameter is just an unknown query string and will be ignored, so callers using ?query=... can get a 200 with unfiltered logs. The existing compatibility check in this file only verifies the level/search feature header, so mixed-version remote clusters have no failure or warning path for this new filter.
Useful? React with 👍 / 👎.
Add the exact "or {teamID=…, sandboxID=…}" union form (and a metric-union
variant) to the scope-bypass regression suite. These do not parse as valid
LogQL (the `or` operator requires sample-expr operands, not log selectors),
and would also be caught by the AST scope guard; pin them so a future Loki
bump can't silently start accepting them.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
@claude review |
|
was a nice experiment, but we don't want to lock ourselves into LogQL. |


Adds an optional
queryfield toGET /v2/sandboxes/{sandboxID}/logsthat is appended verbatim as a LogQL pipeline after the server-built stream selector ({teamID, sandboxID, category!="metrics"}). Since LogQL can't reopen a stream selector mid-pipeline, tenant/sandbox scoping is always enforced server-side and a client's expression can only narrow within its own logs (bounded by a 1024-char cap); it takes precedence over thelevel/searchfilters. The field is threaded through the cluster resources layer and the edgeV1SandboxLogscontract, with regenerated API/edge/integration clients.This is split out from #2935 (envd command-output persistence) and can merge independently — the two are decoupled, but together they let callers fetch a single command's output with
query=| json | pid="<pid>" | event_type="process_output".🤖 Generated with Claude Code