Skip to content

fix(core): prevent listener crashes on late CDP transaction responses#242

Closed
Avejack wants to merge 1 commit intocdpdriver:mainfrom
Avejack:fix/listener-crash-late-cdp-response
Closed

fix(core): prevent listener crashes on late CDP transaction responses#242
Avejack wants to merge 1 commit intocdpdriver:mainfrom
Avejack:fix/listener-crash-late-cdp-response

Conversation

@Avejack
Copy link
Copy Markdown
Contributor

@Avejack Avejack commented Feb 23, 2026

Summary

This PR fixes a race condition where late CDP responses for cancelled/completed transactions could raise InvalidStateError and terminate Listener.listener_loop(), causing automation sessions to become unhealthy over long runtimes.

Problem

In Transaction.__call__, response completion used set_result() / set_exception() directly without guarding future state.
When a response arrived after cancellation/completion, asyncio.InvalidStateError could be raised.
That exception propagated from tx(**message) in listener_loop, killing the listener task.

Root Cause

  • Transaction.__call__ had no done-state guard.
  • No safe handling for InvalidStateError during future completion.
  • listener_loop invoked transaction completion without isolation.
  • Connection.send() and _send_oneshot() could leave stale mapper entries on cancellation.

Changes

  • Updated zendriver/core/connection.py.
  • Added done-state guard in Transaction.__call__.
  • Added safe completion helpers:
    • _set_result_safely
    • _set_exception_safely
  • Converted parse-failure path to set transaction exception instead of raising out.
  • Added missing-response guard ("result" not present).
  • Added cancellation cleanup in Connection.send() mapper lifecycle.
  • Added cancellation/finally cleanup in Connection._send_oneshot().
  • Added Listener._complete_transaction() wrapper to prevent listener crash on completion errors.
  • Updated listener response paths to use _complete_transaction.
  • Added changelog entry in CHANGELOG.md under Unreleased -> Fixed.
  • Added regression tests in tests/core/test_connection.py:
    • test_transaction_late_response_after_cancel_does_not_raise
    • test_transaction_parse_mismatch_sets_exception_not_raise
    • test_send_cancellation_removes_tx_from_mapper
    • test_listener_complete_transaction_handles_exceptions

Behavior Impact

  • No public API changes.
  • Internal reliability improvement only.
  • Late/stale responses are safely ignored or converted into transaction-level errors.
  • Listener task remains alive when transaction completion encounters unexpected errors.

Validation

Executed:

  • poetry run pytest -q tests/core/test_connection.py

Result:

  • 4 passed

Notes

A pre-existing warning remains unrelated to this fix:

  • zendriver/core/connection.py warns about continue in a finally block.

Risk Assessment

Low.
Changes are scoped to internal transaction/listener robustness and covered by new regression tests.

Pre-merge Checklist

  • I have described my change in the section above.
  • I have ran the ./scripts/format.sh and ./scripts/lint.sh scripts. My code is properly formatted and has no linting errors.
  • I have ran uv run pytest and ensured all tests pass.
  • I have added my change to CHANGELOG.md under the [Unreleased] section.

Harden transaction completion and listener handling so stale/late CDP
responses cannot crash the websocket listener task with InvalidStateError.

- Guard Transaction.__call__ against already-completed futures.
- Add safe completion helpers that swallow/log InvalidStateError:
  _set_result_safely and _set_exception_safely.
- Treat malformed CDP responses (missing "result") as transaction errors.
- Stop raising parse failures out of Transaction.__call__; set transaction
  exceptions instead.
- Clean mapper entries when Connection.send() is cancelled.
- Mirror cancellation/stale-mapper cleanup in Connection._send_oneshot().
- Add Listener._complete_transaction() to isolate transaction completion
  errors and keep listener_loop alive.
- Add regression tests for late response after cancel, parse mismatch path,
  send-cancellation mapper cleanup, and listener completion safety.
- Document fix in CHANGELOG under Unreleased.

This addresses long-running instability where listener_loop could terminate
after a late response arrived for a cancelled/completed transaction.
@Avejack Avejack requested a review from a team as a code owner February 23, 2026 14:52
@stephanlensky
Copy link
Copy Markdown
Member

Thanks for the contribution @Avejack, the changes make sense to me.

Looks like the lint is failing, would you mind taking a look? I'd suggest using the included scripts & uv to verify tests/formatting instead of poetry.

@codecov
Copy link
Copy Markdown

codecov bot commented Feb 24, 2026

Codecov Report

❌ Patch coverage is 66.66667% with 15 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
zendriver/core/connection.py 66.66% 15 Missing ⚠️

📢 Thoughts on this report? Let us know!

@github-actions
Copy link
Copy Markdown
Contributor

This pull request has been marked stale because it has been open for 30 days with no activity. If there is no activity within 7 days, it will be automatically closed.

@github-actions github-actions bot added the stale label Mar 27, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 4, 2026

This pull request was automatically closed because it has been inactive for 7 days since being marked as stale.

@github-actions github-actions bot closed this Apr 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants