Tighten trace-artifacts bucket IAM: scope runtime s3:PutObject to per-user prefix #59

@scoropeza

Follow-up from PR #52 — security hardening identified during design review; shipped as a pragmatic MVP with a TODO pointing at this follow-up.

Functional description

When a task's --trace option is enabled, the agent uploads a detailed execution trace (a gzip-compressed JSONL artifact) to a shared S3 bucket. Per-user isolation relies on the agent code convention that each trace goes under traces/<user_id>/<task_id>.jsonl.gz.
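
As a rough illustration of that key convention, the sketch below builds the object key and a gzip-compressed JSONL body. The function names and the rejection of "/" in IDs are assumptions made for this example; they are not taken from agent/src/telemetry.py.

```python
import gzip
import json


def trace_key(user_id: str, task_id: str) -> str:
    """Build the object key traces/<user_id>/<task_id>.jsonl.gz."""
    for part in (user_id, task_id):
        # Hypothetical validation: an empty ID or one containing "/" would
        # produce a key outside the intended per-user prefix.
        if not part or "/" in part:
            raise ValueError(f"invalid key component: {part!r}")
    return f"traces/{user_id}/{task_id}.jsonl.gz"


def encode_trace(events: list[dict]) -> bytes:
    """Serialize events as JSON Lines (one object per line), then gzip."""
    lines = "".join(json.dumps(event) + "\n" for event in events)
    return gzip.compress(lines.encode("utf-8"))
```

Nothing in this convention is checked by S3 or IAM; the prefix only holds as long as every caller builds keys this way.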

The problem: that isolation is a convention the agent is trusted to follow, not a boundary that IAM enforces. The runtime's IAM role currently has s3:PutObject on the entire bucket (traceArtifactsBucket/*), so a compromised or buggy runtime could overwrite another user's trace artifact. The only thing preventing cross-user writes today is the agent's own code being well-behaved.
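
To make the gap concrete, here is a small sketch that approximates IAM resource matching with fnmatch (in IAM, * also spans "/"). The bucket name and user IDs are hypothetical, and fnmatch is only an approximation of IAM's matcher, not the real evaluation logic.

```python
from fnmatch import fnmatchcase

BUCKET_ARN = "arn:aws:s3:::trace-artifacts-example"  # hypothetical bucket name


def resource_allows(pattern: str, object_arn: str) -> bool:
    """Approximate IAM resource matching: * matches any characters, including '/'."""
    return fnmatchcase(object_arn, pattern)


current_grant = f"{BUCKET_ARN}/*"              # what the runtime role has today
scoped_grant = f"{BUCKET_ARN}/traces/alice/*"  # what per-user scoping would look like

own_trace = f"{BUCKET_ARN}/traces/alice/task-9.jsonl.gz"
other_trace = f"{BUCKET_ARN}/traces/bob/task-1.jsonl.gz"

# The bucket-wide grant covers another user's key; the scoped one does not.
broad_reaches_other = resource_allows(current_grant, other_trace)
scoped_reaches_other = resource_allows(scoped_grant, other_trace)
scoped_reaches_own = resource_allows(scoped_grant, own_trace)
```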

Who cares: security reviewers, enterprise customers, anyone running this in a regulated environment. Not currently exploitable (our code doesn't have that bug), but it's a defense-in-depth gap that will come up in any security audit.

Technical context

cdk/src/stacks/agent.ts grants the AgentCore runtime s3:PutObject on traceArtifactsBucket/* via grantPut(runtime) — no prefix, no condition. Design §10.1 calls out this loosening as the pragmatic MVP. A TODO at the grantPut call site points at this issue.

Why the obvious fix doesn't work today: PR #52's L4 spike confirmed that Bedrock AgentCore's InvokeAgentRuntimeCommand does NOT currently expose sessionTags or TransitiveTagKeys. The originally-envisioned aws:PrincipalTag/UserId condition on the bucket resource policy is not viable — the SDK type InvokeAgentRuntimeRequest only accepts runtimeSessionId, runtimeUserId, traceId, etc., and runtimeUserId is a logical identifier, not an IAM principal tag.
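
For reference, one way the originally envisioned scoping could have been expressed is with an IAM policy variable in the resource ARN, sketched below as a Python dict. The bucket name is hypothetical, and the statement would only work if the invocation carried a UserId session tag, which is exactly what the API does not expose today.

```python
import json

BUCKET = "trace-artifacts-example"  # hypothetical bucket name

# Sketch only: IAM would substitute the caller's UserId principal tag into the
# ARN at evaluation time, limiting writes to that user's own prefix.
envisioned_statement = {
    "Effect": "Allow",
    "Action": "s3:PutObject",
    "Resource": f"arn:aws:s3:::{BUCKET}/traces/${{aws:PrincipalTag/UserId}}/*",
}

print(json.dumps(envisioned_statement, indent=2))
```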

Proposed options (all require separate design review + PR)

  1. Orchestrator-minted presigned PUT URL (preferred): the orchestrator signs a short-lived PutObject URL scoped to the exact key traces/<user_id>/<task_id>.jsonl.gz and passes it in the AgentCore invocation payload. A presigned URL stays usable until it expires, so a short expiry is what keeps it effectively one-shot. Drop grantPut(runtime) entirely. upload_trace_to_s3 switches from boto3.client('s3').put_object to an HTTPS PUT against the presigned URL.

  2. Dedicated uploader Lambda: a narrowly-scoped Lambda with an IAM policy conditioned on a caller-supplied user_id uploads on behalf of the runtime. Runtime loses s3:PutObject entirely and invokes the Lambda instead.

  3. Watch for SessionTags API: track AWS release notes for @aws-sdk/client-bedrock-agentcore updates exposing session tags; revisit this issue when supported. Lowest effort but indefinite timeline.
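
Option 1 could look roughly like the sketch below. The function names, the 300-second expiry, and the Content-Type header are assumptions for illustration; s3_client is expected to be a boto3 S3 client created by the orchestrator, whose own role would then need s3:PutObject on the key it signs.

```python
import urllib.request

PRESIGN_TTL_SECONDS = 300  # hypothetical expiry; shorter narrows the reuse window


def mint_trace_put_url(s3_client, bucket: str, user_id: str, task_id: str) -> str:
    """Orchestrator side: presign a PUT for exactly one trace key."""
    key = f"traces/{user_id}/{task_id}.jsonl.gz"
    return s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=PRESIGN_TTL_SECONDS,
        HttpMethod="PUT",
    )


def build_trace_put_request(presigned_url: str, body: bytes) -> urllib.request.Request:
    """Agent side: build the HTTPS PUT; no AWS credentials are involved."""
    return urllib.request.Request(
        presigned_url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/gzip"},
    )


def upload_trace_via_presigned_url(
    presigned_url: str, body: bytes, timeout: float = 30.0
) -> int:
    """Send the PUT and return the HTTP status; urlopen raises on 4xx/5xx."""
    request = build_trace_put_request(presigned_url, body)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Because the key is baked into the signature, the runtime cannot redirect the upload to another user's prefix even if its own code misbehaves.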

Acceptance criteria

  • Runtime IAM role has no s3:PutObject on the trace-artifacts bucket, OR the grant is scoped with a condition whose effective reach is no wider than the user's own traces/<user_id>/ prefix.
  • upload_trace_to_s3 test coverage unchanged or improved.
  • cdk-nag suppression near the grant call in cdk/src/stacks/agent.ts is updated to reflect the tightened posture (the current suppression mentions "broad PutObject" and should be narrowed or removed).
  • TODO comment at the grantPut call site is resolved or updated to point at a clearer next step.
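
The first criterion could be checked mechanically along the lines of the sketch below. It assumes the runtime role's policy statements are available as plain dicts with already-resolved ARN strings (a synthesized CDK template usually holds intrinsic functions instead), and it does not evaluate conditions: statements carrying a Condition are skipped and would still need manual review.

```python
def as_list(value):
    """IAM allows a single string or a list for Action and Resource."""
    return value if isinstance(value, list) else [value]


def overly_broad_put_grants(statements: list[dict], bucket_arn: str) -> list[dict]:
    """Return unconditioned Allow statements granting PutObject on the whole bucket."""
    broad_resources = {f"{bucket_arn}/*", "*"}
    put_actions = {"s3:PutObject", "s3:*", "*"}
    flagged = []
    for statement in statements:
        if statement.get("Effect") != "Allow" or "Condition" in statement:
            continue
        actions = as_list(statement.get("Action", []))
        resources = as_list(statement.get("Resource", []))
        grants_put = any(action in put_actions for action in actions)
        covers_bucket = any(resource in broad_resources for resource in resources)
        if grants_put and covers_bucket:
            flagged.append(statement)
    return flagged
```

A test for the stack could assert that this returns an empty list for the runtime role once the grant is removed or narrowed.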

Out of scope

  • Changing the trace key layout (traces/<user_id>/<task_id>.jsonl.gz is load-bearing across orchestrator, agent, handler, and CLI — renaming is a separate project).
  • Migration to a different storage backend.

References

  • cdk/src/stacks/agent.ts: grantPut(runtime) call site (has the TODO comment)
  • agent/src/telemetry.py — trace key construction (traces/<user_id>/<task_id>.jsonl.gz)
  • docs/design/ARCHITECTURE.md §10.1 — design decision and loosening rationale
  • @aws-sdk/client-bedrock-agentcore InvokeAgentRuntimeRequest type (session-tag gap)
