feat: add CancellationToken for graceful agent execution cancellation #1772
Codecov Report: ❌ Patch coverage is 74.4%.
Issue: Code coverage gap noted by Codecov (74.4% patch coverage). The Codecov report indicates 11 lines missing coverage.

While the integration tests cover many scenarios, targeted unit tests with mock signals would improve coverage and make it easier to verify each checkpoint works correctly in isolation.
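For illustration, a minimal sketch of one such unit test (the `agent` fixture, the stop-reason value, and the pytest-asyncio setup are assumptions, not the PR's actual tests):

```python
import pytest

@pytest.mark.asyncio
async def test_cancel_signal_set_before_invocation(agent):
    # Hypothetical fixture: an agent wired to a mocked model provider.
    agent.cancel()                               # set the cancel signal up front
    result = await agent.invoke_async("hello")
    assert result.stop_reason != "end_turn"      # assumed: the invocation ends early

    # The signal is cleared when the invocation completes,
    # so the agent should be usable again afterwards.
    await agent.invoke_async("hello again")
```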
Review Summary
Assessment: Request Changes

This is a well-designed feature that adds graceful cancellation support to agents. The implementation is thread-safe, has clear checkpoints, and includes comprehensive tests. However, there's a critical issue that needs to be addressed before merging.
The core implementation is solid - the four-checkpoint design is sensible and the thread-safety approach is sound.
Force-pushed from ae46812 to c9535ce
Force-pushed from c9535ce to 05bc0bf
🔬 Adversarial Testing: Cancel + Interrupt Interactions

I ran additional adversarial tests focusing on how cancellation interacts with the interrupt state.

Result: ✅ No critical bugs found — 2 design observations documented below.
🔍 Finding 1: Cancel is NOT checked during sequential tool execution

Observation (Design Choice — Not a Bug): once tool execution for a turn has started, cancel() is not re-checked between tools, so every tool in the batch runs to completion.

Evidence:

```python
# Cancel called after first tool starts → BOTH tools complete fully
execution_log = ['start_0.1', 'end_0.1', 'start_0.5', 'end_0.5']
```

Impact: for long-running tools, cancellation does not take effect until the whole batch has finished.

Is this a bug? No — this appears intentional. Stopping a running tool mid-execution could leave state inconsistent.

🔍 Finding 2: Cancel during interrupt state doesn't clear interrupt

Observation (UX Question — May Need Documentation): when the agent is in an interrupted state, calling cancel() does not clear the interrupt, so the next invocation must still resume from the interrupt.
Reproduction:

```python
result = await agent.invoke_async("start")  # Triggers interrupt
assert result.stop_reason == "interrupt"

agent.cancel()  # User wants to cancel

# This FAILS with TypeError!
await agent.invoke_async("normal prompt")  # Must resume from interrupt

# To recover, must manually:
agent._interrupt_state.deactivate()
agent._cancel_signal.clear()
```

Is this a bug? Probably not — but it's a UX question:
The current behavior is defensible but should be documented.

✅ Confirmed: No Race Conditions

Thread safety verified with 50+ concurrent threads calling cancel() simultaneously.
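A minimal sketch of that kind of stress test (the helper name and thread count are illustrative, not the actual test code):

```python
import threading

def hammer_cancel(agent, n_threads: int = 50) -> None:
    """Call agent.cancel() from many threads at once; it should never raise."""
    barrier = threading.Barrier(n_threads)

    def worker() -> None:
        barrier.wait()      # release all threads at the same moment
        agent.cancel()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```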
✅ Confirmed: No Information Loss

The implementation correctly adds cancel toolResults for pending toolUses, keeping the conversation state valid.

🎯 State Machine Analysis

Conclusion: The implementation is solid. The two findings are design observations, not bugs — both behaviors are defensible and appear intentional. Consider documenting the cancel + interrupt interaction for users.
Move cancel_signal out of invocation_state to avoid leaking internal implementation details to hooks, tools, and model providers. The signal is now passed as a dedicated parameter to stream_messages and process_stream, while event_loop continues accessing it directly via agent._cancel_signal.
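A rough sketch of the wiring described above (signatures are simplified and assumed, not the library's actual code):

```python
import threading
from typing import AsyncIterator, Optional

async def stream_messages(
    model,
    messages,
    cancel_signal: Optional[threading.Event] = None,
) -> AsyncIterator[dict]:
    # The signal is an explicit parameter rather than an entry in
    # invocation_state, so hooks, tools, and providers never see it.
    async for event in model.stream(messages):
        if cancel_signal is not None and cancel_signal.is_set():
            break                  # stop consuming the stream once cancelled
        yield event
```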
Force-pushed from f6ff47f to 6057e55
@jgoyani1 Hello, thank you very much for contributing this feature. I tested it as soon as the new version was released. However, I noticed an issue with the token usage statistics when a request is canceled. I did a quick test and found that after cancellation, the reported token count is inaccurate. It doesn't include the size of each prompt, and it also misses the tokens that have already been generated. Right now, it seems to be counting only a very small portion of the total.

Normal output without interruption: Total: 12087
Interrupted near the end of output: Total: 35

Accurate token statistics would be very helpful for billing-related applications.
The root cause is in process_stream (streaming.py). When cancellation is detected mid-stream, the function returns immediately with usage=Usage(inputTokens=0, outputTokens=0, totalTokens=0), because the usage metadata from the model provider has not been received yet at the point where the stream is cut short.

Would investigate more on how to fetch these values, and will post an update here if it's possible to get the token count on the fly while streaming.
thx, here's my current approach: after the output is interrupted, I can't get the complete usage data returned by the model provider, so for now I'm doing a local rough estimate with tiktoken using cl100k_base.

For the input side, I count tokens by concatenating the full context sent to the model each time, including the system prompt, conversation history, tool results, and so on. For the output side, I count tokens by concatenating the text chunks that were already streamed back.

There are two issues with this approach:

- cl100k_base is the tokenizer used for OpenAI models, but its tokenization rules differ from those used by Claude, DeepSeek, Kimi, Grok, and others, so the numbers will inevitably be off.
- The estimated results are only for display and reference. They do not represent the actual token count used for billing by the provider.
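Roughly, that estimate could look like this (the message shapes and function names are assumptions; cl100k_base only approximates non-OpenAI tokenizers):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def estimate_input_tokens(system_prompt: str, history: list[str], tool_results: list[str]) -> int:
    # Concatenate everything sent to the model and count tokens locally.
    full_context = "\n".join([system_prompt, *history, *tool_results])
    return len(enc.encode(full_context))

def estimate_output_tokens(streamed_chunks: list[str]) -> int:
    # Only the text chunks received before cancellation are counted.
    return len(enc.encode("".join(streamed_chunks)))
```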
Motivation
Agents need a way to be stopped from external contexts — web request handlers, background threads, timeout logic. Currently there is no graceful cancellation mechanism, so callers have no way to interrupt a running agent without killing the process.
Resolves: #81
Public API Changes
New `agent.cancel()` method for graceful cancellation. Cancellation is checked at two strategic checkpoints.

A `{"text": "Cancelled by user"}` `toolResult` is added for each pending `toolUse` to maintain valid conversation state. The agent is reusable after cancellation — the cancel signal is automatically cleared when the invocation completes.
Use Cases
`agent.cancel()` can be called from a web request handler, a background thread, or timeout logic to stop a running invocation, as in the sketch below.
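A background-thread timeout, for instance, could look like this (the helper is hypothetical; `agent.cancel()` is the only call taken from the PR):

```python
import threading

def start_cancel_watchdog(agent, timeout_seconds: float) -> threading.Timer:
    # agent.cancel() is thread-safe, so it can be triggered from a timer thread
    # while the invocation runs elsewhere (e.g. in a web request handler).
    timer = threading.Timer(timeout_seconds, agent.cancel)
    timer.start()
    return timer        # call .cancel() on the timer if the agent finishes first
```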