
feat(linear): resolve API token via AgentCore Identity (Phase 2.0a)#100

Closed
isadeks wants to merge 9 commits into aws-samples:main from isadeks:feat/agentcore-identity-api-key

Conversation


@isadeks isadeks commented May 15, 2026

Summary

Phase 2.0a of the Linear v2 plan: migrate the agent runtime's Linear personal API token resolution from AWS Secrets Manager to Amazon Bedrock AgentCore Identity. This is the "validate the Identity SDK" step before the bigger OAuth + Gateway cutover in Phase 2.0b.

Per Alain's guidance from the v1.1 review thread:

"Ideally yeah, you can try when exposing the server through agentcore gateway. you will setup an outbound auth for your server using agentcore identity. that identity can be (AC identity is like a wrapper around secrets manager) api key or oauth. start by using api key, if it works, switch to oauth."

⚠️ Stacked on PR #87

This PR depends on #87 (feat/linear-processor-feedback) and was developed against it as the base. The diff against main includes #87's changes — review #87 first; the additional changes here are scoped to:

  • agent/pyproject.toml — adds bedrock-agentcore==1.9.1
  • agent/uv.lock — synced
  • agent/src/config.py::resolve_linear_api_token — rewrites against IdentityClient
  • agent/tests/test_config.py — replaces 5 Secrets-Manager tests with 10 Identity tests
  • cdk/src/stacks/agent.ts — swaps Linear secret grant for Identity env + IAM (runtime only)
  • docs/guides/LINEAR_SETUP_GUIDE.md + starlight mirror — adds Step 4.5

Once #87 merges, this PR's diff will resolve to those changes only.

Scope: agent runtime only

Lambdas (orchestrator + processor) intentionally keep using Secrets Manager. Reason: the Python bedrock_agentcore SDK has no Node.js equivalent — Lambda migration requires @aws-sdk/client-bedrock-agentcore raw API calls and folds into 2.0b's bigger refactor (where OAuth tokens replace API keys for all consumers in one cutover).

End-state of 2.0a:

  • Agent reads from AgentCore Identity ✓
  • Lambdas read from Secrets Manager (unchanged) ✓
  • Both point at the same underlying token value (the admin runs agentcore add credential once and populates the Linear API token in both stores)

What changed

agent/src/config.py::resolve_linear_api_token

  • Drops boto3 SecretsManager fetch + LINEAR_API_TOKEN_SECRET_ARN env var
  • Reads new env LINEAR_API_KEY_PROVIDER_NAME (provider name in the Identity vault, default: linear-api-key)
  • Calls IdentityClient.get_api_key() with the workload access token auto-injected into BedrockAgentCoreContext by AgentCore Runtime
  • Caches the resolved token in LINEAR_API_TOKEN env so downstream consumers stay unchanged: channel_mcp.py's ${LINEAR_API_TOKEN} placeholder in .mcp.json and linear_reactions.py's GraphQL Authorization header

Why imperative IdentityClient.get_api_key() instead of the @requires_api_key decorator: API keys don't need refresh. The decorator pattern shines for OAuth (refresh tokens, scopes, per-session binding) and is the right shape for 2.0b. For a static API key fetched once at agent startup, the imperative form keeps the MCP-config-with-placeholder model working unchanged.

Verified by reading the SDK's auth.py: the @requires_api_key decorator does exactly client.get_api_key(provider_name=..., agent_identity_token=BedrockAgentCoreContext.get_workload_access_token()). Inside AgentCore Runtime the context returns the auto-injected token; outside (Lambda, local dev) it returns None. Our imperative version matches that behaviour without the decorator's call-site rewrite.
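
The resolution order described above can be sketched roughly as follows. This is an illustration, not the code in config.py: the two callables `fetch_api_key` and `get_workload_token` are hypothetical stand-ins for IdentityClient.get_api_key() and BedrockAgentCoreContext.get_workload_access_token(), injected so the sketch runs without the SDK installed, and the `AWS_REGION` variable name is an assumption (the PR only says "missing region").

```python
import os
from typing import Callable, MutableMapping, Optional


def resolve_linear_api_token(
    fetch_api_key: Callable[[str, str], Optional[str]],  # hypothetical stand-in for the SDK call
    get_workload_token: Callable[[], Optional[str]],  # hypothetical stand-in for the context lookup
    env: MutableMapping[str, str] = os.environ,
    log: Callable[[str, str], None] = lambda level, msg: print(f"[{level}] {msg}"),
) -> str:
    """Return the Linear API token, or "" when it cannot be resolved."""
    # Fast path: an earlier call already cached the token in the environment.
    cached = env.get("LINEAR_API_TOKEN", "")
    if cached:
        return cached

    # No provider name means Linear is not configured; return quietly.
    provider_name = env.get("LINEAR_API_KEY_PROVIDER_NAME", "")
    if not provider_name:
        return ""

    # Assumption: the region is read from AWS_REGION.
    if not env.get("AWS_REGION"):
        log("WARN", "region not set; skipping Linear token lookup")
        return ""

    # Outside AgentCore Runtime (Lambda, local dev) there is no workload token.
    workload_token = get_workload_token()
    if not workload_token:
        log("WARN", "no workload access token; skipping Linear token lookup")
        return ""

    # Defensive: treat a None result as "not resolved".
    token = fetch_api_key(provider_name, workload_token) or ""
    if token:
        # Side effect: cache for downstream consumers that read the env var.
        env["LINEAR_API_TOKEN"] = token
    return token
```

Passing a plain dict as `env` keeps the sketch easy to exercise in tests; a second call with the same dict returns the cached value without calling `fetch_api_key` again.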

Preserves PR #87's nice-to-have improvements

  • ImportError graceful fallback adapted from boto3 to bedrock_agentcore — degrade with WARN, don't crash the agent
  • AccessDeniedException (likely missing IAM permission) and ResourceNotFoundException (provider name typo / not yet created) logged at ERROR rather than WARN, so they page someone
  • Other ClientErrors (transient throttle, network) stay at WARN
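
A minimal sketch of that severity split, assuming the standard botocore error-response shape (`{"Error": {"Code": ...}}`). The helper name `severity_for` and the constant are hypothetical and are not taken from config.py.

```python
from typing import Any, Mapping

# Error codes this PR treats as persistent misconfiguration rather than transient noise.
PERSISTENT_ERROR_CODES = frozenset(
    {"AccessDeniedException", "ResourceNotFoundException"}
)


def severity_for(error_response: Mapping[str, Any]) -> str:
    """Map a botocore-style error response to a log severity."""
    code = error_response.get("Error", {}).get("Code", "")
    # Persistent IAM/config problems page someone; everything else stays a warning.
    return "ERROR" if code in PERSISTENT_ERROR_CODES else "WARN"
```

With a real `ClientError`, the mapping passed in would be `exc.response`.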

cdk/src/stacks/agent.ts

On the AgentCore runtime:

  • Drops linearIntegration.apiTokenSecret.grantRead(runtime) and the LINEAR_API_TOKEN_SECRET_ARN env-var override
  • Adds LINEAR_API_KEY_PROVIDER_NAME env (hardcoded 'linear-api-key' for now)
  • Adds IAM for bedrock-agentcore:GetResourceApiKey + bedrock-agentcore:GetWorkloadAccessToken

Lambdas untouched — they still grant on the Linear secret and read from Secrets Manager. Verified at synth: only the AgentCore runtime's IAM policy and env vars changed.

Resource scope on the new IAM is * for now; the canonical AgentCore Identity ARN format isn't fully documented in public AWS docs as of 2026-05-15. Tighten in 2.0b when OAuth migration documents the resource shape.
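
For reference, the wildcard-scoped grant described above corresponds roughly to the policy document below, written as a Python dict rather than the CDK TypeScript in agent.ts. The two action names come from this PR; the variable names are illustrative only.

```python
import json

# The two actions added for the AgentCore runtime role, scoped to "*" for now.
linear_identity_statement = {
    "Effect": "Allow",
    "Action": [
        "bedrock-agentcore:GetResourceApiKey",
        "bedrock-agentcore:GetWorkloadAccessToken",
    ],
    "Resource": "*",  # to be tightened in 2.0b once the ARN shape is documented
}

policy_document = {
    "Version": "2012-10-17",
    "Statement": [linear_identity_statement],
}

print(json.dumps(policy_document, indent=2))
```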

docs/guides/LINEAR_SETUP_GUIDE.md

Adds Step 4.5 documenting the one-time admin command users run alongside the existing bgagent linear setup wizard:

agentcore add credential --type api-key --name linear-api-key
# (paste the same lin_api_… token when prompted)

Explains the dual-store setup and notes that 2.0b will retire the duplicate. Starlight mirror synced.

Tests

agent/tests/test_config.py::TestResolveLinearApiToken — 10 tests:

  • Cached env-var fast path (no SDK call)
  • Missing provider name → empty cleanly
  • Missing region → empty + WARN, no SDK call
  • Workload token absent (outside runtime) → empty + WARN
  • Happy path with env-var side-effect
  • Botocore errors swallowed (no crash)
  • SDK returns None → empty (defensive)
  • ImportError fallback (bedrock_agentcore unavailable)
  • AccessDeniedException → ERROR severity
  • ResourceNotFoundException → ERROR severity

542 agent / 1271 cdk / 196 cli — all green. Lint clean. Typecheck clean. CDK synth clean.

Phase plan context

2.0a  ← THIS PR — API key via AgentCore Identity (agent only)
2.0b  ← Linear OAuth + Gateway in one cutover (all consumers, retires Secrets Manager)
2.1   ← Agent Sessions UX (status pill, activities, retires reaction-spam)
2.2   ← AgentSessionEvent webhooks (@bgagent natural-language prompts)
2.3   ← HITL adapter via select activities (post Sam's #88 merge)

Reviewer notes

  • Smoke test not run yet. The MCP placeholder, GraphQL header, and run.sh env-var pass-through are all unchanged; only the source of LINEAR_API_TOKEN changed. Unit tests cover the new code path with realistic SDK mocks. Will deploy and smoke-test before requesting review.
  • Provider name hardcoded as 'linear-api-key' in agent.ts. Happy to parametrize via CDK context (-c linear-api-key-provider=...) if reviewer prefers — opted for the simple shape since there's one deployment.
  • "Outbound auth" is AgentCore Identity's terminology, which matches Alain's note. The provider lives in AgentCore Identity's token vault (an AWS-managed service separate from Secrets Manager).

bgagent and others added 9 commits May 13, 2026 15:34
Closes the silent-drop UX gap that appeared whenever a Linear-triggered task
was rejected before the agent container started — the user would apply the
trigger label, see nothing happen, and have no way to know why. Reactions
and progress comments are emitted by the agent container; nothing fired
until that point, so all upstream rejections were invisible on the Linear
side.

This commit wires a best-effort GraphQL feedback path covering all six
distinct rejection points:

In `linear-webhook-processor.ts` (pre-`createTaskCore`):
  1. Issue has no projectId → "isn't in a project" comment
  2. Project not onboarded / removed → "isn't onboarded; admin can run
     `bgagent linear onboard-project`" comment
  3. Webhook missing organization or actor → diagnostic comment
  4. Linear actor has no linked platform user → "v1 only the API-token
     owner can submit; multi-user OAuth is on the v3 roadmap" comment
  5. `createTaskCore` returns non-201 → message branched on status:
     guardrail/validation block surfaces the user-facing error string;
     503 prompts the user to re-apply the label; other 4xx/5xx falls
     through to a generic message.

In `orchestrate-task.ts` (post-201, in admission control):
  6. User concurrency cap rejection → "concurrency limit; wait for one
     to finish, then re-apply the label" comment.

All five processor paths and the orchestrator path call a shared helper,
`reportIssueFailure(secretArn, issueId, message)`, that runs the comment
and ❌ reaction in parallel via `Promise.allSettled`. The helper:

  - Reuses the existing 5-minute `getLinearSecret` cache from
    `linear-verify.ts` (no extra Secrets Manager hits on warm Lambdas).
  - Swallows network, auth, and GraphQL errors with WARN logs — Linear
    feedback is advisory and must never gate the rejection path.
  - Posts to Linear's hosted GraphQL endpoint; mutation shapes match
    `agent/src/linear_reactions.py` (`commentCreate`, `reactionCreate`).
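
A Python analogue of the helper's "run both, swallow failures" behaviour, for
readers more familiar with the agent side. The real helper is TypeScript and
uses `Promise.allSettled`; this sketch swaps in `asyncio.gather` with
`return_exceptions=True`, and the function and parameter names are
hypothetical.

```python
import asyncio
from typing import Awaitable, Callable


async def report_issue_failure(
    post_comment: Callable[[], Awaitable[None]],
    add_reaction: Callable[[], Awaitable[None]],
    log: Callable[[str, str], None] = lambda level, msg: print(f"[{level}] {msg}"),
) -> int:
    """Run both Linear calls concurrently; return how many succeeded."""
    # return_exceptions=True collects failures as values instead of raising,
    # so one failed call cannot cancel or mask the other.
    results = await asyncio.gather(
        post_comment(), add_reaction(), return_exceptions=True
    )
    failures = [r for r in results if isinstance(r, BaseException)]
    for failure in failures:
        # Feedback is advisory: log and move on, never re-raise.
        log("WARN", f"Linear feedback failed: {failure!r}")
    return len(results) - len(failures)
```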

CDK plumbing:

  - `linear-integration.ts` — wires `LINEAR_API_TOKEN_SECRET_ARN` into
    the webhook processor and grants read on the existing
    `LinearIntegration.apiTokenSecret`.
  - `agent.ts` — grants the same secret to `orchestrator.fn` and
    populates the env var. The grant is unconditional; the orchestrator
    only invokes the helper when `task.channel_source === 'linear'`.

The non-Linear case is a hard no-op at the call site:
`notifyLinearOnConcurrencyCap` early-returns on `channel_source !== 'linear'`,
and the processor only handles Linear payloads. Slack/API/webhook tasks are
unaffected.

Tests (28 new; 1240 → 1268, all green):

  - `cdk/test/handlers/shared/linear-feedback.test.ts` (13 tests):
    mutation shape, auth header, error swallowing in 4 distinct failure
    modes (secret-resolution null, non-2xx, GraphQL `errors`, network
    throw), `Promise.allSettled` partial-success semantics.
  - `cdk/test/handlers/linear-webhook-processor.test.ts` (10 new tests
    in a `user-visible feedback` describe block): one assertion per
    rejection path + happy-path-doesn't-fire + filter-rejection-doesn't-
    fire (the latter is intentional UX — the processor sees many events
    that aren't tasks, and dropping a comment on each would be noisy).
  - `cdk/test/handlers/orchestrate-task-feedback.test.ts` (5 tests):
    new file; covers `notifyLinearOnConcurrencyCap` directly with
    `withDurableExecution` mocked. Asserts the linear path fires; the
    api/webhook/slack paths no-op; missing metadata, missing env, and
    undefined `channel_metadata` all no-op cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
 nits

Wraps the v1.1 polish theme from PR aws-samples#87. Five small additions, all
agent-side or docs:

State-on-start (the user-visible one):
  - prompt_builder._channel_prompt_addendum now instructs the agent to
    transition the originating Linear issue to `In Progress` (or `Todo`
    fallback) at agent-start, mirroring the existing `In Review` chain
    fired at PR-open. Closes the gap where the issue stayed at `Backlog`
    during real agent work — only the 👀 reaction and "🤖 Starting"
    comment signaled progress, while humans-using-Linear expect the state
    column to reflect "being worked." Skips if the issue is already in
    `In Progress` or any later state; doesn't loop on list_issue_statuses.

Alain aws-samples#63 review nits (4 small surgical changes):
  - linear_reactions.py: auth-failure circuit breaker. Track consecutive
    401/403s; after 3 strikes, log ERROR once and short-circuit all later
    _graphql calls (return None) until the container restarts. Resets on
    any 2xx response. Replaces the prior behaviour where revoked tokens
    flooded CloudWatch with WARNs and wasted Linear API quota indefinitely.
  - pipeline.py: declare `linear_eyes_reaction_id: str | None = None`
    explicitly before the try block instead of relying on
    `locals().get("linear_eyes_reaction_id")` in the crash handler.
    Functionally identical; survives refactors and reads cleanly.
  - config.py::resolve_linear_api_token: narrow `except Exception` to
    `(BotoCoreError, ClientError)` from botocore.exceptions. Switch
    `print()` to `shell.log("WARN", ...)` so warnings join the structured
    log stream the rest of the agent uses.
  - LINEAR_SETUP_GUIDE.md + cli/src/commands/linear.ts: stop telling
    users to run `bgagent linear link <code>` when auto-link fails — the
    code generator is a v3 feature that doesn't ship in v1, so the
    suggestion was misleading. Replaced with explicit admin-assisted
    fallback (DynamoDB put-item with steps to find workspaceId, viewerId,
    Cognito sub) and a clear "this command exists but is non-functional
    in v1" note.
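
The auth-failure circuit breaker described above can be sketched as below.
This is a simplified illustration, not the code in linear_reactions.py: the
commit describes module-level state, whereas this sketch wraps it in a
hypothetical `AuthCircuitBreaker` class and guards it with a lock on the
assumption that more than one thread may report statuses.

```python
import threading

AUTH_FAILURE_THRESHOLD = 3  # "3 strikes", per the commit message


class AuthCircuitBreaker:
    """Opens after N consecutive 401/403 responses and stays open."""

    def __init__(self, threshold: int = AUTH_FAILURE_THRESHOLD) -> None:
        self._threshold = threshold
        self._lock = threading.Lock()
        self._consecutive_failures = 0
        self._open = False

    def allow_request(self) -> bool:
        """Callers check this first and skip the HTTP call when False."""
        with self._lock:
            return not self._open

    def record_status(self, status: int) -> bool:
        """Record an HTTP status; return True only when this call trips the breaker."""
        with self._lock:
            if status in (401, 403):
                self._consecutive_failures += 1
                if self._consecutive_failures >= self._threshold and not self._open:
                    self._open = True
                    return True  # caller logs ERROR exactly once
            elif 200 <= status < 300:
                # Any success resets the strike counter.
                self._consecutive_failures = 0
            return False
```

Returning True only on the tripping call gives the caller a natural place to
log ERROR once; after that, `allow_request()` stays False until the process
restarts.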

Tests: 532 agent + 1268 cdk + 196 cli, all green. Deployed to
backgroundagent-dev. Smoke-tested 👀-on-start (156ms, agent unblocked)
in the prior commit; state-on-start smoke is the next manual step.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Whitespace-only changes flagged by CI's self-mutation guard. No behaviour
change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- linear_reactions: guard auth-circuit globals with `_auth_state_lock`
  so the daemon sweep thread and the main thread can't race the
  read-modify-write on `_consecutive_auth_failures` /
  `_auth_circuit_open`.
- linear_reactions: wrap the daemon sweep target in
  `_sweep_stale_reactions_safe` so an unexpected exception logs at
  ERROR instead of dying silently (stderr from a daemon thread doesn't
  reliably reach CloudWatch).
- linear_reactions: only increment the sweep delete counter when
  `_graphql(_DELETE_MUTATION, ...)` actually returns a non-None
  response — previously the summary log overstated success.
- config: hoist `import boto3` out of the catch-narrowed try/except
  so an `ImportError` (boto3 missing from the image) degrades to a
  WARN log instead of crashing the agent.
- orchestrate-task: wrap `notifyLinearOnConcurrencyCap` in a
  defensive try/catch — durable-execution retries the entire
  admission-control step on throw, which would re-fire `failTask` +
  `emitTaskEvent` and produce duplicate events.
- tests: 1 new throw-propagation test for `notifyLinearOnConcurrencyCap`,
  3 new tests for `resolve_linear_api_token` (cached env, no-arn,
  ImportError fallback). Auto-reset fixture in
  `test_linear_reactions.py` now also resets the circuit-breaker
  globals between tests so future cases don't leak state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- linear_reactions: log a single DEBUG line when the auth circuit
  breaker short-circuits a call, so the path isn't zero-trace once
  open.
- config: split the `(BotoCoreError, ClientError)` catch so
  `AccessDeniedException` logs at ERROR instead of WARN — IAM
  misconfig is persistent and should page someone, not blend into
  transient warnings. Also drop the personal name from the inline
  reference to the aws-samples#63 review.
- linear-webhook-processor: tighten `buildCreateTaskFailureMessage`
  param types to `number` / `string` (no `| undefined`) — the only
  caller passes `APIGatewayProxyResult` fields which are always
  defined. Removes dead fallback-to-`'unknown'` branches.
- test_config: 2 new tests covering the split exception path
  (AccessDenied → ERROR; ResourceNotFound → WARN).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Migrates the agent runtime's Linear personal API token resolution from
AWS Secrets Manager to Amazon Bedrock AgentCore Identity. This is the
"validate Identity SDK" step of the v2 plan; Phase 2.0b will swap the
API key for OAuth and converge Linear MCP onto AgentCore Gateway in
one cutover.

Per Alain's guidance: "start by using api key, if it works, switch to
oauth. you will setup an outbound auth for your server using agentcore
identity. that identity can be (AC identity is like a wrapper around
secrets manager) api key or oauth."

## Scope: agent runtime only

Lambdas (orchestrator + processor) intentionally keep using Secrets
Manager via the existing `LinearApiTokenSecret` for now. The Python
`bedrock_agentcore` SDK has no Node.js equivalent — Lambda migration
requires `@aws-sdk/client-bedrock-agentcore` raw API calls and folds
into 2.0b's bigger refactor. End-state of 2.0a: agent reads from
Identity, Lambdas read from Secrets Manager, both pointing at the same
underlying token value (admin populates both).

## What changed

`agent/src/config.py::resolve_linear_api_token`:

  - Drops boto3 SecretsManager fetch + `LINEAR_API_TOKEN_SECRET_ARN` env.
  - Reads new env `LINEAR_API_KEY_PROVIDER_NAME` (provider name in
    Identity vault).
  - Calls `IdentityClient.get_api_key()` with the workload access token
    auto-injected into `BedrockAgentCoreContext` by AgentCore Runtime
    (verified by reading the SDK's `auth.py` decorator implementation —
    no manual workload-identity mint needed inside the runtime).
  - Caches the resolved token in `LINEAR_API_TOKEN` so downstream
    consumers stay unchanged: `channel_mcp.py`'s `${LINEAR_API_TOKEN}`
    placeholder in `.mcp.json` and `linear_reactions.py`'s GraphQL
    Authorization header.

Preserves PR aws-samples#87's nice-to-have improvements:

  - `ImportError` graceful fallback (now for `bedrock_agentcore` instead
    of `boto3`) — degrade with WARN, don't crash the agent.
  - `AccessDeniedException` and `ResourceNotFoundException` logged at
    ERROR severity (persistent IAM/config bugs that should page).
    Other ClientErrors stay at WARN (transient throttle/network).

`agent/pyproject.toml`: adds `bedrock-agentcore==1.9.1` dep.

`cdk/src/stacks/agent.ts`:

  - On the AgentCore runtime: drops
    `linearIntegration.apiTokenSecret.grantRead(runtime)` and the
    `LINEAR_API_TOKEN_SECRET_ARN` env-var
    override. Adds `LINEAR_API_KEY_PROVIDER_NAME` env (hardcoded
    `'linear-api-key'` for now; can parametrize later via context if
    multi-environment naming is needed) and IAM permissions for
    `bedrock-agentcore:GetResourceApiKey` and
    `bedrock-agentcore:GetWorkloadAccessToken`.
  - Lambdas (orchestrator + processor) untouched — they still grant on
    the Linear secret and read from Secrets Manager.
  - Resource scope on the new IAM is `*` for now; AgentCore Identity ARN
    format isn't fully standardized in public docs as of 2026-05-15.
    Tighten in 2.0b when OAuth migration documents the canonical
    resource shape.

`docs/guides/LINEAR_SETUP_GUIDE.md`: adds Step 4.5 documenting the
one-time `agentcore add credential --type api-key --name linear-api-key`
admin command users must run alongside the existing
`bgagent linear setup` wizard. Notes that Lambdas keep Secrets Manager
temporarily and
2.0b will retire the dual-store setup. Starlight mirror synced.

## Tests

`agent/tests/test_config.py::TestResolveLinearApiToken` — 10 tests
covering: cached env var fast-path; missing provider name; missing
region; workload token absent (outside runtime); happy path with
env-var side-effect; botocore error swallowed with WARN; SDK returns
None defensively; ImportError fallback; AccessDeniedException → ERROR
severity; ResourceNotFoundException → ERROR severity.

542 agent / 1271 cdk / 196 cli, all green. Lint + typecheck clean.
CDK synth clean.

## Migration notes for reviewer

`bedrock_agentcore` SDK confirmed working in our runtime image (verified
in the installed Python packages post-install). The `BedrockAgentCoreContext` workload
token auto-injection is documented behaviour for code running inside
AgentCore Runtime — verified by reading the SDK's `@requires_api_key`
decorator implementation, which uses the same context lookup we use
here.

Stacked on PR aws-samples#87 (`feat/linear-processor-feedback`). Will conflict on
`config.py` and `test_config.py` if aws-samples#87 needs further rework before
merge — happy to rebase.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

isadeks commented May 15, 2026

Reopening on the fork stacked on #87 — wrong base.

@isadeks isadeks closed this May 15, 2026