Follow-up from PR #88 — surfaced during the Cedar HITL merge when adding matching_rule_ids to the TaskApprovalsTable.user_id-status-index GSI projection.
Functional description
ABCA's TaskApprovalsTable exposes a user_id-status-index Global Secondary Index used by GET /tasks/pending to return per-user pending approvals in a single query. When a new feature needs another attribute available on that GSI (e.g. matching_rule_ids for the Cedar HITL UX), DynamoDB rejects in-place updates to the GSI's nonKeyAttributes. The only way forward is to delete the GSI and recreate it, which, for a CDK construct that replaces the backing table, means destroy + redeploy.
Today this works fine in dev (the affected stack rebuilds cleanly), but there is no documented prod-safe procedure for adding a field to a GSI projection on a table holding live customer data. Any team taking ABCA past dev hits this the first time they evolve the schema.
User-visible impact:
Operator running cdk deploy on a stack with live data sees a CloudFormation rollback with Cannot update GSI's properties other than Provisioned Throughput and Contributor Insights Specification.
No documented recovery path; the only way through (in dev) was to rename the construct id (TaskApprovalsTable → TaskApprovalsTableV2), which forces a table replacement and drops every pending approval row.
A team running this in prod would either lose data or have to write the migration tooling from scratch under time pressure.
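For the runbook's detection step, a minimal sketch of how an operator script might spot this failure in stack events. It assumes the events have already been fetched (for example via CloudFormation DescribeStackEvents) and uses only the LogicalResourceId, ResourceStatus, and ResourceStatusReason fields. The function name and the sample events are hypothetical.

```typescript
// Subset of the per-event fields returned by CloudFormation DescribeStackEvents.
interface StackEvent {
  LogicalResourceId: string;
  ResourceStatus: string;
  ResourceStatusReason?: string;
}

// Error text quoted in this issue. Matched as a substring because the reason
// string may carry extra context around the service error.
const GSI_REJECTION = "Cannot update GSI's properties other than Provisioned Throughput";

// Returns the logical ids (deduplicated) of resources whose update failed
// with the GSI projection rejection.
function findGsiProjectionRejections(events: StackEvent[]): string[] {
  const ids: string[] = [];
  for (const e of events) {
    const reason = e.ResourceStatusReason ?? "";
    const failed = e.ResourceStatus === "UPDATE_FAILED";
    if (failed && reason.indexOf(GSI_REJECTION) !== -1 && ids.indexOf(e.LogicalResourceId) === -1) {
      ids.push(e.LogicalResourceId);
    }
  }
  return ids;
}

// Hypothetical events; real logical ids carry a CDK-generated hash suffix.
const sampleEvents: StackEvent[] = [
  { LogicalResourceId: "TaskTable", ResourceStatus: "UPDATE_COMPLETE" },
  {
    LogicalResourceId: "TaskApprovalsTable",
    ResourceStatus: "UPDATE_FAILED",
    ResourceStatusReason:
      "Cannot update GSI's properties other than Provisioned Throughput and Contributor Insights Specification.",
  },
];

console.log(findGsiProjectionRejections(sampleEvents));
```

A non-empty result is the signal to stop retrying cdk deploy and switch to the migration procedure.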
Technical context
Where the constraint lives: AWS DynamoDB API. UpdateTable accepts GSI updates only for provisioned throughput and contributor insights — the projection list is fixed at GSI creation time.
Where it bit us in PR #88:
cdk/src/constructs/task-approvals-table.ts: the V2 suffix on the construct id is the dev workaround; commit c149993 (initial) and the rename to V2 (during PR #88, "feat: Cedar HITL approval gates for agent tool use") are the audit trail.
cdk/test/constructs/task-approvals-table.test.ts — uses Match.arrayWith([..., 'matching_rule_ids']) to lock the projection list. This catches future projection drift but doesn't help with the migration itself.
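To make the constraint concrete, here is a small sketch of a pre-deploy check that compares a deployed GSI definition with the desired one and reports whether the change can be applied in place. The GsiSpec shape, the function names, and the task_id attribute are illustrative assumptions, not part of the ABCA codebase. The key names are inferred from the index name.

```typescript
// Minimal description of a GSI, limited to what is fixed at creation time.
interface GsiSpec {
  indexName: string;
  partitionKey: string;
  sortKey?: string;
  projectionType: "ALL" | "KEYS_ONLY" | "INCLUDE";
  nonKeyAttributes: string[];
}

type ChangeKind = "none" | "requires-new-index";

// Order-insensitive comparison of two attribute lists.
function sameAttributes(a: string[], b: string[]): boolean {
  return a.every((x) => b.indexOf(x) !== -1) && b.every((x) => a.indexOf(x) !== -1);
}

// Any difference in key schema or projection cannot be applied to the
// existing index, so it is reported as needing a new index.
function classifyGsiChange(deployed: GsiSpec, desired: GsiSpec): ChangeKind {
  const keysMatch =
    deployed.partitionKey === desired.partitionKey && deployed.sortKey === desired.sortKey;
  const projectionMatch =
    deployed.projectionType === desired.projectionType &&
    sameAttributes(deployed.nonKeyAttributes, desired.nonKeyAttributes);
  return keysMatch && projectionMatch ? "none" : "requires-new-index";
}

// Hypothetical deployed definition; "task_id" is an assumed projected attribute.
const deployedIndex: GsiSpec = {
  indexName: "user_id-status-index",
  partitionKey: "user_id",
  sortKey: "status",
  projectionType: "INCLUDE",
  nonKeyAttributes: ["task_id"],
};

const desiredIndex: GsiSpec = {
  ...deployedIndex,
  nonKeyAttributes: ["task_id", "matching_rule_ids"],
};

console.log(classifyGsiChange(deployedIndex, desiredIndex));
```

Running a check like this in CI would surface the problem before CloudFormation does, which complements the Match.arrayWith test rather than replacing it.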
Two viable patterns for prod:
Dual-index shadow pattern: ship a new GSI alongside the old one (e.g. user_id-status-v2-index), backfill it asynchronously, switch reads via a feature flag, then remove the old GSI in a follow-up release. Zero data loss; operator visibility into progress; safe to abort. ~1 week of work to template + document.
Drain-and-rebuild: pause writes via a scheduled rule, snapshot the table to S3, recreate the table+GSI, restore. Faster (~1 hour for our table sizes) but requires a write outage and a tested restore procedure.
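The read switch in the dual-index pattern could look roughly like the sketch below. DynamoDB backfills a newly created GSI itself and reports progress through the IndexStatus and Backfilling fields of DescribeTable, so the sketch only flips reads once the new index is ACTIVE and no longer backfilling. The key attribute names (user_id, status), the "PENDING" value, the string types, and all function names are assumptions made for illustration.

```typescript
// Subset of an entry in DescribeTable's GlobalSecondaryIndexes list.
interface GsiDescription {
  IndexName: string;
  IndexStatus: "CREATING" | "UPDATING" | "DELETING" | "ACTIVE";
  Backfilling?: boolean;
}

const OLD_INDEX = "user_id-status-index";
const NEW_INDEX = "user_id-status-v2-index";

// An index is safe to read from once it is ACTIVE and not backfilling.
function isReadyForReads(indexes: GsiDescription[], name: string): boolean {
  return indexes.some(
    (i) => i.IndexName === name && i.IndexStatus === "ACTIVE" && i.Backfilling !== true
  );
}

// The feature flag alone is not enough: fall back to the old index until the
// new one is ready, which also keeps the rollout safe to abort.
function chooseReadIndex(flagEnabled: boolean, indexes: GsiDescription[]): string {
  return flagEnabled && isReadyForReads(indexes, NEW_INDEX) ? NEW_INDEX : OLD_INDEX;
}

// Builds a Query input for pending approvals. Attribute names are aliased
// because "status" is a DynamoDB reserved word.
function pendingApprovalsQuery(tableName: string, indexName: string, userId: string) {
  return {
    TableName: tableName,
    IndexName: indexName,
    KeyConditionExpression: "#uid = :uid AND #st = :st",
    ExpressionAttributeNames: { "#uid": "user_id", "#st": "status" },
    ExpressionAttributeValues: { ":uid": { S: userId }, ":st": { S: "PENDING" } },
  };
}

// Hypothetical snapshot taken while the new index is still backfilling.
const midMigration: GsiDescription[] = [
  { IndexName: OLD_INDEX, IndexStatus: "ACTIVE" },
  { IndexName: NEW_INDEX, IndexStatus: "CREATING", Backfilling: true },
];

const index = chooseReadIndex(true, midMigration);
console.log(pendingApprovalsQuery("TaskApprovalsTable", index, "user-123").IndexName);
```

In practice the readiness result would be checked once by the operator (or cached) rather than on every request. Note also that DynamoDB allows only one GSI creation or deletion per UpdateTable call, which is one reason the old index is removed in a follow-up release.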
Other ABCA tables with GSIs that may face the same evolution: TaskTable.status-index, TaskEventsTable (no GSI today but planned per design §10).
Proposed options
Recommend option 1 (dual-index shadow) as the canonical procedure, with option 2 (drain-and-rebuild) documented as the "small table / acceptable outage" alternative. Land as:
A runbook under docs/operations/ (new directory) titled "Migrating DynamoDB GSI projections."
A reusable construct cdk/src/constructs/dual-index-table.ts that wraps the dual-index ship + backfill pattern, parameterized by table name + index name + new attribute list.
A worked example doing the migration the runbook describes against TaskApprovalsTable on the backgroundagent-dev stack so reviewers can see the procedure end-to-end.
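As a starting point for discussing the construct's interface, the parameterization described above (table name, index name, new attribute list) might be sketched as plain types plus a planning function. Everything here is hypothetical: the prop names, the default "v2" suffix, and the planShadowIndex helper are not existing ABCA or CDK APIs, and a real construct would translate the plan into CDK index definitions.

```typescript
// Hypothetical inputs for the proposed dual-index construct.
interface DualIndexMigrationProps {
  tableName: string;
  currentIndexName: string;
  currentNonKeyAttributes: string[];
  addedAttributes: string[];
  shadowSuffix?: string; // assumed default: "v2"
}

// What the construct would ship in the first release of the migration.
interface ShadowIndexPlan {
  tableName: string;
  keepIndex: string;
  addIndex: string;
  nonKeyAttributes: string[];
}

function planShadowIndex(props: DualIndexMigrationProps): ShadowIndexPlan {
  const suffix = props.shadowSuffix ?? "v2";
  // "user_id-status-index" becomes "user_id-status-v2-index".
  const base = props.currentIndexName.replace(/-index$/, "");
  const addIndex = `${base}-${suffix}-index`;

  // Union of old and new attributes, preserving order and dropping duplicates.
  const merged = props.currentNonKeyAttributes.slice();
  props.addedAttributes.forEach((attr) => {
    if (merged.indexOf(attr) === -1) merged.push(attr);
  });

  return {
    tableName: props.tableName,
    keepIndex: props.currentIndexName,
    addIndex,
    nonKeyAttributes: merged,
  };
}

// "task_id" is an assumed existing projected attribute, used for illustration.
const plan = planShadowIndex({
  tableName: "TaskApprovalsTable",
  currentIndexName: "user_id-status-index",
  currentNonKeyAttributes: ["task_id"],
  addedAttributes: ["matching_rule_ids"],
});

console.log(plan.addIndex, plan.nonKeyAttributes);
```

Keeping the planning step as a pure function would make it easy to unit test separately from the CDK synthesis.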
Acceptance criteria
docs/operations/dynamodb-gsi-migration.md exists and is linked from docs/design/ARCHITECTURE.md
Runbook covers: detection (what error you see), preconditions, step-by-step (with aws CLI commands and CDK changes), rollback, and how to verify success
Construct unit tests pin the projection list with Match.arrayWith (already done for TaskApprovalsTable; confirm for any new construct)
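For the "how to verify success" part of the runbook, one possible check is sketched below. It takes the GlobalSecondaryIndexes list from a DescribeTable response and confirms the new index is ACTIVE and projects the required attributes. The function name and the sample data are hypothetical; only the DescribeTable field names (IndexName, IndexStatus, Projection.ProjectionType, Projection.NonKeyAttributes) come from the DynamoDB API.

```typescript
// Subset of an entry in DescribeTable's GlobalSecondaryIndexes list.
interface IndexDescription {
  IndexName: string;
  IndexStatus?: string;
  Projection?: { ProjectionType?: string; NonKeyAttributes?: string[] };
}

// Returns a list of problems; an empty list means the check passed.
function verifyShadowIndex(
  indexes: IndexDescription[],
  indexName: string,
  requiredAttributes: string[]
): string[] {
  const matches = indexes.filter((i) => i.IndexName === indexName);
  if (matches.length === 0) return [`index ${indexName} not found`];

  const index = matches[0];
  const problems: string[] = [];
  if (index.IndexStatus !== "ACTIVE") {
    problems.push(`index ${indexName} is not ACTIVE`);
  }
  // A projection of type ALL already includes every attribute.
  if (index.Projection?.ProjectionType !== "ALL") {
    const projected = index.Projection?.NonKeyAttributes ?? [];
    for (const attr of requiredAttributes) {
      if (projected.indexOf(attr) === -1) {
        problems.push(`attribute ${attr} is not projected`);
      }
    }
  }
  return problems;
}

// Hypothetical DescribeTable excerpt after the migration has completed.
const afterMigration: IndexDescription[] = [
  {
    IndexName: "user_id-status-v2-index",
    IndexStatus: "ACTIVE",
    Projection: { ProjectionType: "INCLUDE", NonKeyAttributes: ["task_id", "matching_rule_ids"] },
  },
];

console.log(verifyShadowIndex(afterMigration, "user_id-status-v2-index", ["matching_rule_ids"]));
```

The same function can serve as a precondition check before switching reads and as the final verification before the old index is removed.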
Out of scope
backgroundagent-dev stack; the runbook can be tested against an ephemeral stack.
References
cdk/src/constructs/task-approvals-table.ts (the V2 suffix is the dev workaround)
cdk/test/constructs/task-approvals-table.test.ts (Match.arrayWith projection lock)
UpdateTable API constraints: https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_UpdateTable.html#API_UpdateTable_RequestSyntax
TaskApprovalsTable → TaskApprovalsTableV2 was the dev-only escape hatch.