Follow-up from PR #88 — surfaced during T2.2 (E2E happy-path) when an error_message was usefully present on the TaskRecord but [redacted] on the dashboard's Recent-Events widget.
Functional description
Code near agent/src/telemetry.py:488 (exact line should be re-verified) uses a constant, _METRICS_REDACT_KEYS, and replaces the value of every field whose key is in that constant with "[redacted]" before writing the event to TaskEventsTable. The error field is in that list. The intent is reasonable: tool stderr can contain accidentally leaked secrets (env values, tokens, paths), so redaction-by-default is safer than allowlisting.
The current implementation is too aggressive: structural errors that have zero secret-leak risk also get redacted. Example: missing built-in hard-deny policies: /app/policies/hard_deny.cedar — a clear, actionable error message that the dashboard's Recent-Events widget shows as [redacted], forcing an operator to dig into CloudWatch to see what actually happened.
This blunt redaction is the "secure by default but operationally painful" failure mode. The fix is to run a real secret-pattern scan (the same output_scanner.py already used for tool stderr) and only redact on a positive match.
User-visible impact:
Dashboard Recent-Events widget shows [redacted] for errors that are perfectly safe to display.
Operators have to manually correlate "task X failed" with CloudWatch logs to diagnose.
The error_message field on the TaskRecord (read by CLI) is unaffected, so this is dashboard-only — but the dashboard is the natural triage surface for incidents.
Technical root cause
Why the current behavior:
Original choice: "treat any string under a sensitive key as potentially secret and redact everything." This was defensible when the agent did not yet have a separate secret-pattern scanner.
What's available now:
agent/src/output_scanner.py (used for tool stderr) implements pattern-based secret detection. Its false-positive rate is low. False negatives are the main concern, but because the scanner is already shared with the tool-output path, using it here doesn't change the threat model.
Why this is P3:
The leaky path (TaskRecord error_message) is unaffected — secrets-in-errors are the bigger concern and that path is not redacted, by design.
The dashboard is a triage surface; today's redaction creates friction but doesn't lose data.
Risk of regression is low (the change is "swap blanket replacement for pattern-scan-then-replace").
Why this is worth filing:
Every operator who diagnoses a failed task notices it.
The pattern-scan code already exists.
~1 hour of work.
Proposed approach
Replace the _redact_value blanket replacement with a call to output_scanner.scan_for_secrets(value). If the scanner returns matches, redact only the matched substrings (which the scanner can already do). If no matches, return the value unchanged.
Add tests that confirm:
A structural error (missing built-in hard-deny policies: ...) passes through unredacted.
A real secret pattern (AWS_ACCESS_KEY_ID=AKIA...) gets redacted at the secret position only, with the rest of the message intact.
Update the docstring on _METRICS_REDACT_KEYS to reflect the new pattern-aware semantics.
Acceptance criteria
_redact_value (or its caller) uses output_scanner for redaction decisions
Test: a structural error string passes unredacted
Test: an actual secret pattern is redacted at the matched substring only
Dashboard Recent-Events widget shows readable error messages for no-secret cases (verify by submitting a task that fails with a structural error)
No new false-positives versus the existing output_scanner behavior on the tool-stderr path
Out of scope
Tightening output_scanner.py to catch new secret patterns — that's a separate scanner-tuning issue.
Adding redaction to the TaskRecord error_message field (would break diagnosis; deliberate non-goal).
Operator opt-in to "show full error including any redactions" — not worth the complexity.
References
agent/src/telemetry.py (current redaction implementation; line numbers need verification)
agent/src/output_scanner.py (the pattern-scan logic)
T2.2 in .e2e-test-plan.md (the originating observation: "agent fails at 0 turns with [redacted] error" was the diagnostic-killer)
Where the redaction lives in agent/src/telemetry.py: _METRICS_REDACT_KEYS (constant tuple), _redact_value (the substitution function), _emit_metric_report (caller)
7c65704 (one of the 18-finding fixes; touched related telemetry code)