Make telemetry redaction pattern-aware so structural errors aren't redacted #118

@scoropeza

Follow-up from PR #88 — surfaced during T2.2 (E2E happy-path) when an error_message was usefully present on the TaskRecord but [redacted] on the dashboard's Recent-Events widget.

Functional description

agent/src/telemetry.py:488 (approximate; the exact line should be re-verified) uses a constant _METRICS_REDACT_KEYS and replaces the value of every field whose key appears in that constant with "[redacted]" before writing the event to TaskEventsTable. The error field is in that list. The intent is reasonable: tool stderr can contain accidentally leaked secrets (env values, tokens, paths), so redacting by default is safer than allowlisting.

The current implementation is too aggressive: structural errors that have zero secret-leak risk also get redacted. Example: missing built-in hard-deny policies: /app/policies/hard_deny.cedar — a clear, actionable error message that the dashboard's Recent-Events widget shows as [redacted], forcing an operator to dig into CloudWatch to see what actually happened.
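To make the failure mode concrete, here is a minimal sketch of what the blanket behavior appears to amount to. It is a reconstruction from the description above, not a copy of telemetry.py: the contents of _METRICS_REDACT_KEYS beyond "error", the redact_event helper, and the event shape are all assumptions made for illustration.

```python
# Hypothetical reconstruction of the current blanket redaction.
# Only the names _METRICS_REDACT_KEYS and _redact_value come from the issue;
# everything else here is illustrative.

# Assumed contents: the issue only confirms that "error" is in the list.
_METRICS_REDACT_KEYS = ("error",)


def _redact_value(value):
    """Blanket substitution: the input is ignored entirely."""
    return "[redacted]"


def redact_event(event):
    """Hypothetical caller: redact every field whose key is sensitive."""
    return {
        key: _redact_value(value) if key in _METRICS_REDACT_KEYS else value
        for key, value in event.items()
    }


event = {
    "task_id": "t-123",  # illustrative field
    "error": "missing built-in hard-deny policies: /app/policies/hard_deny.cedar",
}
redacted = redact_event(event)
print(redacted["error"])
```

The structural error carries no secret, yet nothing of it survives, which is what the dashboard ends up displaying.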

This blunt redaction is the "secure by default but operationally painful" failure mode. The fix is to run a real secret-pattern scan (the same output_scanner.py already used for tool stderr) and only redact on a positive match.

User-visible impact:

  • Dashboard Recent-Events widget shows [redacted] for errors that are perfectly safe to display.
  • Operators have to manually correlate "task X failed" with CloudWatch logs to diagnose.
  • The error_message field on the TaskRecord (read by CLI) is unaffected, so this is dashboard-only — but the dashboard is the natural triage surface for incidents.

Technical root cause

Where the redaction lives:

  • agent/src/telemetry.py: _METRICS_REDACT_KEYS (constant tuple), _redact_value (the substitution function), _emit_metric_report (caller).

Why the current behavior:

  • Original choice: "treat any string under a sensitive key as potentially-secret, redact everything." Defensible at a time when the agent had no separate secret-pattern scanner.

What's available now:

  • agent/src/output_scanner.py (used for tool stderr) implements pattern-based secret detection. False-positive rate is low; false-negative rate is the concern but the scanner is shared with the tool-output path, so adding it here doesn't change the threat model.

Why this is P3:

  • The leaky path (TaskRecord error_message) is unaffected — secrets-in-errors are the bigger concern and that path is not redacted, by design.
  • The dashboard is a triage surface; today's redaction creates friction but doesn't lose data.
  • Risk of regression is low (the change is "swap blanket replacement for pattern-scan-then-replace").

Why this is worth filing:

  • Every operator who diagnoses a failed task notices it.
  • The pattern-scan code already exists.
  • ~1 hour of work.

Proposed approach

  1. Replace the _redact_value blanket replacement with a call to output_scanner.scan_for_secrets(value). If the scanner returns matches, redact only the matched substrings (which the scanner can already do). If no matches, return the value unchanged.
  2. Add tests that confirm:
    • A structural error (missing built-in hard-deny policies: ...) passes through unredacted.
    • A real secret pattern (AWS_ACCESS_KEY_ID=AKIA...) gets redacted at the secret position only, with the rest of the message intact.
  3. Update the docstring on _METRICS_REDACT_KEYS to reflect the new pattern-aware semantics.

Acceptance criteria

  • _redact_value (or its caller) uses output_scanner for redaction decisions
  • Test: a structural error string passes unredacted
  • Test: an actual secret pattern is redacted at the matched substring only
  • Dashboard Recent-Events widget shows readable error messages for no-secret cases (verify by submitting a task that fails with a structural error)
  • No new false-positives versus the existing output_scanner behavior on the tool-stderr path

Out of scope

  • Tightening output_scanner.py to catch new secret patterns — that's a separate scanner-tuning issue.
  • Adding redaction to the TaskRecord error_message field (would break diagnosis; deliberate non-goal).
  • Operator opt-in to "show full error including any redactions" — not worth the complexity.

References

  • agent/src/telemetry.py (current redaction implementation; line numbers need verification)
  • agent/src/output_scanner.py (the pattern-scan logic)
  • T2.2 in .e2e-test-plan.md (the originating observation: "agent fails at 0 turns with [redacted] error" was the diagnostic-killer)
  • PR #88 (feat: Cedar HITL approval gates for agent tool use), commit 7c65704 (one of the 18-finding fixes; touched related telemetry code)
