Feature: Integrate AKW multi-agent capabilities into ABCA

### Component

None

### Describe the feature

## Summary

Extend ABCA from a single-purpose coding agent platform into a general-purpose autonomous agent platform by integrating the AKW (Autonomous Knowledge Work) subsystems. The result is a single AWS-hosted platform that runs both coding tasks (git, PRs, build/lint) and knowledge-work tasks (research, document generation, email triage, etc.) on shared infrastructure, with a shared memory system, trust model, and blueprint registry.

## Motivation

ABCA today is hardwired for coding tasks: every task requires a GitHub repo, clones it, runs an agent against it, and opens a PR. This makes the platform unusable for tasks that have no repo (research, document drafting, data analysis). AKW solves exactly this problem with a blueprint-driven, task-mode-agnostic agent loop, but lacks ABCA's production-grade AWS infrastructure (durable orchestration, AgentCore compute isolation, Cognito auth, Cedar policy enforcement).

Merging the two gives each what it lacks:
- AKW gets cloud-scale durable execution, AgentCore MicroVM isolation, and Bedrock Guardrail screening
- ABCA gets blueprint-driven extensibility, semantic long-term memory, risk-aware admission, in-execution human-in-the-loop (HITL), self-extending tool generation, and support for non-coding task domains

## Proposed changes

### 1. `task_mode` field — decouple coding from knowledge tasks

Add a `task_mode: 'coding' | 'knowledge'` field to blueprints and `TaskRecord`. Coding tasks follow the existing path (repo clone, GitHub context hydration, build/lint post-hooks). Knowledge tasks skip all git scaffolding and run the agent directly against instructions + memory context.

**Touches:** `cdk/src/handlers/shared/types.ts`, `agent/src/config.py`, `agent/src/pipeline.py`, `cdk/src/constructs/blueprint.ts`
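
A minimal sketch of how the agent side might branch on `task_mode`. Apart from `task_mode` itself, the `TaskRecord` fields, the `plan_steps` helper, and the step names are hypothetical and only illustrate the two paths described above; this is not the existing `pipeline.py` logic.

```python
from dataclasses import dataclass
from typing import List, Literal, Optional

TaskMode = Literal["coding", "knowledge"]


@dataclass(frozen=True)
class TaskRecord:
    # Field names other than task_mode are illustrative assumptions.
    task_id: str
    instructions: str
    task_mode: TaskMode = "coding"  # assumed default so existing tasks are unaffected
    repo: Optional[str] = None


def plan_steps(task: TaskRecord) -> List[str]:
    """Return the (hypothetical) pipeline steps for a task."""
    if task.task_mode == "coding":
        if not task.repo:
            raise ValueError("coding tasks require a repo")
        return ["clone_repo", "hydrate_github_context", "run_agent", "build_lint", "open_pr"]
    if task.task_mode == "knowledge":
        # No git scaffolding: instructions plus memory context only.
        return ["load_memory_context", "run_agent"]
    raise ValueError(f"unknown task_mode: {task.task_mode!r}")
```

Defaulting to `"coding"` is one way to keep existing submissions valid without a migration; whether the real field should default or be required is an open design choice.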

---

### 2. Blueprint registry

Replace hardcoded system prompts with a `FilesystemRegistryService` backed by YAML blueprint files. Each blueprint declares its task type, system prompt, tool set, execution phases, HITL conditions, and parameters. The registry resolves which blueprint to load at runtime.

Add an initial set of blueprints covering coding (`new_task`, `pr_iteration`, `pr_review`) and knowledge (`web_research`, `pubmed`, `document_draft`, `email_triage`) task types.

**New files:** `agent/blueprints/`, `agent/src/registry/`
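
As a rough sketch of the resolution step, the following uses an in-memory stand-in for `FilesystemRegistryService`. The `Blueprint` field names, `InMemoryRegistry`, and `REQUIRED_KEYS` are assumptions made for illustration; in the proposed design the dictionaries would come from parsing the YAML files under `agent/blueprints/`.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Assumed minimum schema; the proposal lists the concepts, not the key names.
REQUIRED_KEYS = ("task_type", "task_mode", "system_prompt", "tools", "phases")


@dataclass(frozen=True)
class Blueprint:
    task_type: str
    task_mode: str
    system_prompt: str
    tools: List[str]
    phases: List[str]
    hitl_conditions: List[str] = field(default_factory=list)
    parameters: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, raw: Dict[str, Any]) -> "Blueprint":
        missing = [k for k in REQUIRED_KEYS if k not in raw]
        if missing:
            raise ValueError(f"blueprint missing keys: {missing}")
        return cls(**raw)


class InMemoryRegistry:
    """Stand-in for FilesystemRegistryService (hypothetical interface)."""

    def __init__(self) -> None:
        self._by_type: Dict[str, Blueprint] = {}

    def register(self, raw: Dict[str, Any]) -> None:
        blueprint = Blueprint.from_dict(raw)
        self._by_type[blueprint.task_type] = blueprint

    def resolve(self, task_type: str) -> Blueprint:
        if task_type not in self._by_type:
            raise LookupError(f"no blueprint registered for {task_type!r}")
        return self._by_type[task_type]
```

Validating required keys at registration time means a malformed blueprint fails when it is loaded rather than in the middle of a task.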

---

### 3. Mem0 long-term memory backend

Add a `Mem0LTM` backend that runs alongside the existing AgentCore Memory. Mem0 provides semantic search (via embeddings), contradiction detection before writes, and a memory lifecycle engine (decay + consolidation). AgentCore Memory remains authoritative for episodic task history; Mem0 stores tool knowledge, repo learnings, and cross-task semantic facts.

Deploy Mem0 + Qdrant as an ECS Fargate service (`Mem0Backend` CDK construct) in the agent VPC, reachable at `mem0.agent-services:8001` via Cloud Map.

**New files:** `agent/src/backends/ltm/mem0.py`, `cdk/src/constructs/mem0-backend.ts`, `agent/mem0/`
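
The split of responsibilities between the two stores can be sketched as a small router. Everything here is hypothetical: the memory-kind labels, `ListBackend`, and `MemoryRouter` are illustrative names, and the backends are in-memory fakes rather than real AgentCore Memory or Mem0 clients.

```python
from typing import Dict, List

# Kinds taken from the description above; the string labels are assumptions.
AGENTCORE_KINDS = {"episodic_task_history"}
MEM0_KINDS = {"tool_knowledge", "repo_learning", "semantic_fact"}


class ListBackend:
    """In-memory fake standing in for a real memory backend."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.records: List[Dict[str, str]] = []

    def write(self, record: Dict[str, str]) -> None:
        self.records.append(record)


class MemoryRouter:
    """Send each memory kind to the backend that is authoritative for it."""

    def __init__(self, agentcore: ListBackend, mem0: ListBackend) -> None:
        self.agentcore = agentcore
        self.mem0 = mem0

    def backend_for(self, kind: str) -> ListBackend:
        if kind in AGENTCORE_KINDS:
            return self.agentcore
        if kind in MEM0_KINDS:
            return self.mem0
        raise ValueError(f"unknown memory kind: {kind!r}")

    def write(self, kind: str, content: str) -> str:
        backend = self.backend_for(kind)
        backend.write({"kind": kind, "content": content})
        return backend.name
```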

---

### 4. Blueprint phase tracking and PatternEvaluator (HITL)

Add `BlueprintTracker` to track which execution phase the agent is in and `PatternEvaluator` to evaluate HITL conditions between turns. When a pattern fires (e.g. conflicting evidence detected, low corpus quality, scope violation), the agent pauses at `AWAITING_APPROVAL`, writes a `PendingApproval` record, and waits for human input before resuming.

**New files:** `agent/src/blueprint_tracker.py`, `agent/src/pattern_evaluator.py`
**New status:** `AWAITING_APPROVAL` added to `TaskStatusType` with transitions `RUNNING → AWAITING_APPROVAL → RUNNING`
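
A minimal sketch of the between-turn check, assuming patterns are simple predicates over a per-turn state dictionary. Only the `RUNNING` and `AWAITING_APPROVAL` statuses and the transitions between them come from this proposal; `Pattern`, `after_turn`, and the `corpus_quality` key are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# The two edges named in the proposal.
ALLOWED = {("RUNNING", "AWAITING_APPROVAL"), ("AWAITING_APPROVAL", "RUNNING")}


@dataclass(frozen=True)
class Pattern:
    name: str
    fires: Callable[[Dict], bool]  # predicate over the turn state


class PatternEvaluator:
    def __init__(self, patterns: List[Pattern]) -> None:
        self.patterns = patterns

    def evaluate(self, turn_state: Dict) -> Optional[str]:
        """Return the name of the first pattern that fires, if any."""
        for pattern in self.patterns:
            if pattern.fires(turn_state):
                return pattern.name
        return None


def transition(current: str, target: str) -> str:
    if (current, target) not in ALLOWED:
        raise ValueError(f"illegal transition {current} -> {target}")
    return target


def after_turn(status: str, turn_state: Dict,
               evaluator: PatternEvaluator) -> Tuple[str, Optional[str]]:
    """Pause a running task when a HITL pattern fires; otherwise leave it alone."""
    fired = evaluator.evaluate(turn_state)
    if status == "RUNNING" and fired is not None:
        return transition(status, "AWAITING_APPROVAL"), fired
    return status, None
```

In this sketch the caller would write the `PendingApproval` record when the returned status is `AWAITING_APPROVAL`, and call `transition("AWAITING_APPROVAL", "RUNNING")` once a human approves.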

---

### 5. Risk-aware pre-flight pipeline

Add a 4-stage pre-flight Lambda (readiness check → context hydration → risk assessment → admission policy) invoked as a durable step in the orchestrator. Pre-flight writes `pre_flight_decision` (`ADMIT | ADMIT_WITH_HITL | DEFER | REJECT`) and `risk_tier` (`LOW | MEDIUM | HIGH | CRITICAL`) to `TaskRecord` before the agent session starts.

**New files:** `agent/src/preflight/`, `cdk/src/constructs/preflight-lambda.ts`
**Orchestrator change:** new `invoke-preflight` durable step between `admission-control` and `hydrate-context`
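
The four stages can be sketched as a single function. The decision and tier values come from the description above, but the mapping from tier to decision (`ADMISSION_POLICY`), the risk flags, and the flag-counting heuristic are assumptions made only to keep the example concrete.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

RISK_TIERS = ("LOW", "MEDIUM", "HIGH", "CRITICAL")
# Assumed policy: the proposal names the decisions but not this mapping.
ADMISSION_POLICY = {
    "LOW": "ADMIT",
    "MEDIUM": "ADMIT",
    "HIGH": "ADMIT_WITH_HITL",
    "CRITICAL": "REJECT",
}
RISK_FLAGS = ("sends_email", "uses_secrets", "touches_production")  # hypothetical


@dataclass(frozen=True)
class PreflightResult:
    pre_flight_decision: str
    risk_tier: Optional[str]  # None when the task never reaches risk assessment


def run_preflight(task: Dict, known_task_types: Set[str]) -> PreflightResult:
    # Stage 1: readiness check.
    if not task.get("instructions"):
        return PreflightResult("REJECT", None)
    if task.get("task_type") not in known_task_types:
        return PreflightResult("DEFER", None)
    # Stage 2: context hydration (stubbed; a real stage would fetch memory context).
    _context = {"memory_context": []}
    # Stage 3: risk assessment, here a toy count of risky flags.
    score = sum(1 for flag in RISK_FLAGS if task.get(flag))
    tier = RISK_TIERS[min(score, len(RISK_TIERS) - 1)]
    # Stage 4: admission policy.
    return PreflightResult(ADMISSION_POLICY[tier], tier)
```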

---

### 6. SandboxManager + ECS sidecar for tool execution

Add `SandboxManager` to execute dynamically generated tool code in isolated ECS task containers (network-none, read-only filesystem, tmpfs /tmp). `SecretManager` scopes secrets per tool ID under `/abca/tools/{tool_id}/`. The ECS sidecar CDK construct provisions the execution environment in the agent VPC.

**New files:** `agent/src/sandbox/`, `cdk/src/constructs/sandbox-sidecar.ts`
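
One way the per-tool secret scoping could be enforced is a prefix check on the secret path. The `/abca/tools/{tool_id}/` layout is from the description above; the tool-ID format, the function names, and the traversal checks are assumptions for illustration.

```python
import re

# Assumed ID format: lowercase letters, digits, underscore, hyphen.
TOOL_ID_RE = re.compile(r"[a-z0-9][a-z0-9_-]{0,63}")
SECRET_NAME_RE = re.compile(r"[A-Za-z0-9_-]+")


def secret_prefix(tool_id: str) -> str:
    if not TOOL_ID_RE.fullmatch(tool_id):
        raise ValueError(f"invalid tool id: {tool_id!r}")
    return f"/abca/tools/{tool_id}/"


def secret_path(tool_id: str, name: str) -> str:
    if not SECRET_NAME_RE.fullmatch(name):
        raise ValueError(f"invalid secret name: {name!r}")
    return secret_prefix(tool_id) + name


def may_read(tool_id: str, path: str) -> bool:
    """A tool may only read secrets under its own prefix."""
    if ".." in path.split("/"):
        return False  # refuse path traversal outright
    return path.startswith(secret_prefix(tool_id))
```

The trailing slash in the prefix matters: without it, a tool named `pubmed` would also match paths belonging to `pubmed_v2`.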

---

### 7. ToolBuilderAgent and BlueprintBuilderAgent (meta-agents)

Add two new task types:
- `generate_tool` — `ToolBuilderAgent` searches `CapabilityIndex` for existing tools, generates new tool code if none found, tests it in the sandbox, and promotes it to the registry
- `generate_blueprint` — `BlueprintBuilderAgent` generates a new YAML blueprint for an unknown task type and promotes it through the DRAFT → VALIDATED → PRODUCTION pipeline

These agents enable the platform to extend itself without a code deploy.

**New files:** `agent/src/agents/tool_builder/`, `agent/src/agents/blueprint_builder/`
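
The two flows above can be sketched as follows. The `DRAFT → VALIDATED → PRODUCTION` stages and the search, generate, test, promote order come from the description; `promote`, `find_or_build`, and the callables passed into it are hypothetical stand-ins for the real agents, sandbox, and registry.

```python
from enum import Enum
from typing import Callable, Optional, Tuple


class Stage(Enum):
    DRAFT = "DRAFT"
    VALIDATED = "VALIDATED"
    PRODUCTION = "PRODUCTION"


_NEXT = {Stage.DRAFT: Stage.VALIDATED, Stage.VALIDATED: Stage.PRODUCTION}


def promote(stage: Stage, checks_passed: bool) -> Stage:
    """Advance one stage at a time, and only when the checks for that stage pass."""
    if stage not in _NEXT:
        raise ValueError("already in PRODUCTION")
    return _NEXT[stage] if checks_passed else stage


def find_or_build(
    intent: str,
    search: Callable[[str], Optional[str]],
    generate: Callable[[str], str],
    sandbox_test: Callable[[str], bool],
    register: Callable[[str, str], str],
) -> Tuple[str, bool]:
    """Return (tool_name, was_generated), reusing an existing tool when one matches."""
    existing = search(intent)
    if existing is not None:
        return existing, False
    code = generate(intent)
    if not sandbox_test(code):
        raise RuntimeError("generated tool failed sandbox tests")
    return register(intent, code), True
```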

---

### 8. CapabilityIndex — semantic tool search

Add `CapabilityIndex` backed by `Mem0LTM`. When `ToolBuilderAgent` is asked to find or generate a tool, it first searches the index semantically before generating new code. Registered tools are stored with embeddings so future searches find them by intent, not just name.

**New files:** `agent/src/registry/capability_index.py`
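
To illustrate the search-before-generate lookup, here is a toy index that ranks tools by cosine similarity. It deliberately swaps real embeddings for bag-of-words counts so the example runs without any model or Mem0 service, which means it only matches on shared words; the class interface, `min_score` threshold, and tool names are assumptions.

```python
import math
import re
from collections import Counter
from typing import Dict, Optional


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class CapabilityIndex:
    def __init__(self) -> None:
        self._tools: Dict[str, Counter] = {}

    def register(self, name: str, description: str) -> None:
        self._tools[name] = embed(description)

    def search(self, intent: str, min_score: float = 0.3) -> Optional[str]:
        """Return the best-matching tool name, or None if nothing is close enough."""
        query = embed(intent)
        best_name, best_score = None, 0.0
        for name, vector in self._tools.items():
            score = cosine(query, vector)
            if score > best_score:
                best_name, best_score = name, score
        return best_name if best_score >= min_score else None
```

A `None` result is the signal for `ToolBuilderAgent` to fall through to code generation.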

---

### 9. Trust & Graduation

Add `TrustEventsTable` (DynamoDB) and a `TrustEmitter` that records typed signals (`TOOL_SUCCESS`, `TOOL_FAILURE`, `SCOPE_VIOLATION`, `HITL_TRIGGERED`, `TASK_COMPLETE`, `TASK_FAILED`, etc.) on every significant agent event. `AutonomyGraduationEngine` accumulates net points per agent and promotes the autonomy level (`restricted → supervised → autonomous`) when thresholds are met, reducing HITL gate frequency for agents with strong track records.

**New files:** `agent/src/trust/`, `cdk/src/constructs/trust-events-table.ts`
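
A sketch of the point accumulation, under stated assumptions: the signal names and autonomy levels are from the description above, while the point values, the thresholds, and the choice to derive the level from the current net total (so a large penalty can also lower it) are invented for illustration.

```python
from collections import defaultdict
from typing import Dict

# Assumed weights; the proposal does not specify point values.
POINTS = {
    "TOOL_SUCCESS": 1,
    "TASK_COMPLETE": 5,
    "TOOL_FAILURE": -1,
    "HITL_TRIGGERED": -1,
    "TASK_FAILED": -5,
    "SCOPE_VIOLATION": -10,
}
# Assumed thresholds, highest level first.
THRESHOLDS = (("autonomous", 100), ("supervised", 20))


class AutonomyGraduationEngine:
    def __init__(self) -> None:
        self._points: Dict[str, int] = defaultdict(int)

    def record(self, agent_id: str, signal: str) -> None:
        if signal not in POINTS:
            raise ValueError(f"unknown trust signal: {signal!r}")
        self._points[agent_id] += POINTS[signal]

    def net_points(self, agent_id: str) -> int:
        return self._points.get(agent_id, 0)

    def level(self, agent_id: str) -> str:
        points = self.net_points(agent_id)
        for name, minimum in THRESHOLDS:
            if points >= minimum:
                return name
        return "restricted"
```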

---

### 10. HITL approve/reject API

Add `POST /v1/tasks/{id}/approve` and `POST /v1/tasks/{id}/reject` Lambda handlers backed by `ApprovalsTable` (DynamoDB). Add `bgagent approve <task-id> [comment]` and `bgagent reject <task-id> [reason]` CLI commands. The orchestrator poll loop does not count `AWAITING_APPROVAL` cycles against `MAX_POLL_ATTEMPTS`.

**New files:** `cdk/src/handlers/approve-task.ts`, `cdk/src/handlers/reject-task.ts`
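
The poll-budget rule can be sketched as a loop over observed statuses. `MAX_POLL_ATTEMPTS`, `AWAITING_APPROVAL`, `RUNNING`, and `FAILED` appear in this proposal; the `COMPLETED` and `TIMED_OUT` names, the tiny limit, and feeding statuses from an iterable instead of a real poll are assumptions that keep the example self-contained.

```python
from typing import Iterable

MAX_POLL_ATTEMPTS = 3  # tiny value for illustration only


def poll_until_done(statuses: Iterable[str],
                    max_attempts: int = MAX_POLL_ATTEMPTS) -> str:
    """Consume observed statuses; time spent awaiting approval is not counted."""
    attempts = 0
    for status in statuses:
        if status in ("COMPLETED", "FAILED"):
            return status
        if status == "AWAITING_APPROVAL":
            continue  # waiting on a human: does not consume the poll budget
        attempts += 1
        if attempts >= max_attempts:
            return "TIMED_OUT"
    return "TIMED_OUT"
```

With this rule a task can sit in `AWAITING_APPROVAL` for any number of cycles and still have its full polling budget when it resumes.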

---

## Out of scope for this issue

- **DEFER sub-workflow** — unknown task type currently returns `FAILED`; auto-spawning a `BlueprintBuilderAgent` child and resuming the original task requires a new durable sub-workflow
- **Per-blueprint HITL timeout** (`hitl_max_wait_sec`) — orchestrator currently uses the global 8.5h poll limit
- **PR #54 TUI wiring** — interactive CLI TUI exists with mock data; real endpoint wiring is a separate task
- **LangGraph / LangSmith integration** — deferred to a later phase

## Acceptance criteria

- [ ] `bgagent submit --task-mode knowledge --task "summarise recent papers on RAG"` completes without a repo argument
- [ ] Submitting a task with an unregistered task type triggers `BlueprintBuilderAgent` (DEFER path), not a hard failure
- [ ] A task that hits a HITL condition transitions to `AWAITING_APPROVAL`; `bgagent approve <id>` resumes it to completion
- [ ] A second run of the same knowledge task shows non-empty `memory_context` from Mem0 in the trace
- [ ] `generate_tool` task produces a tool registered in the registry and discoverable via `search_capability_index`
- [ ] `TrustEventsTable` records a `TASK_COMPLETE` event after every successful task
- [ ] All existing `new_task`, `pr_iteration`, `pr_review` task types continue to pass end-to-end

## References

- Detailed source-level comparison: `~/merge-streams/compare.md`
- Phase-by-phase plan and deployed environment details: `~/merge-streams/plan.md`
- Implementation branch: [`merge/akw-integration`](https://github.com/aws-samples/sample-autonomous-cloud-coding-agents/tree/merge/akw-integration)


### Use case

Today, ABCA can autonomously write code, open pull requests, and iterate on them, but only when the work involves a GitHub repository. Any task that doesn't fit that mold (researching a topic, drafting a document, triaging emails, analyzing data) can't run on the platform. This feature removes that constraint. By integrating AKW's blueprint-driven agent loop and knowledge-work task types, the cloud infrastructure that runs coding agents can also run research agents, document agents, or any other autonomous workflow that can be described in a YAML blueprint, with the same security model, memory system, human-in-the-loop controls, and auditability. A team could submit a coding task and a literature review in the same CLI session, watch both run in parallel in isolated compute environments, approve or steer them mid-execution, and have the results accumulate in a shared long-term memory that later tasks can draw on.

### Proposed solution

_No response_

### Other information

_No response_

### Acknowledgements

- [ ] I may be able to implement this feature
- [ ] This might be a breaking change

Feature: Integrate AKW multi-agent capabilities into ABCA #99