Skip to content

Python: Restrict persisted checkpoint deserialization by default#4941

Open
moonbox3 wants to merge 9 commits intomicrosoft:mainfrom
moonbox3:agent/fix-4894-1
Open

Python: Restrict persisted checkpoint deserialization by default#4941
moonbox3 wants to merge 9 commits intomicrosoft:mainfrom
moonbox3:agent/fix-4894-1

Conversation

@moonbox3
Copy link
Copy Markdown
Contributor

@moonbox3 moonbox3 commented Mar 27, 2026

Motivation and Context

Persisted checkpoint loading in the Python SDK is currently too permissive by default: arbitrary pickled types can be restored unless callers add their own safeguards. Before GA, checkpoint persistence should default to a narrower, framework-approved type surface while still giving applications an explicit extension point for additional safe types.

Fixes #4894

Description

This change hardens FileCheckpointStorage so persisted checkpoints are deserialized with a restricted unpickler by default. Built-in safe value types and agent_framework internal types continue to round-trip without extra configuration, while application-specific types must now be explicitly allowed via the new allowed_checkpoint_types parameter using module:qualname entries. The checkpoint encoding layer enforces that allow-list during deserialization, the storage and encoding docs now describe the safer default posture, and the workflow checkpoint tests were updated to cover blocked arbitrary callables and __reduce__ payloads, explicit allow-listed custom types, and existing round-trip scenarios that rely on custom test types.

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

Note: PR autogenerated by moonbox3's agent

Copilot and others added 2 commits March 27, 2026 05:46
Add RestrictedUnpickler to _checkpoint_encoding.py that limits which
types may be instantiated during pickle deserialization.  By default
FileCheckpointStorage now uses the restricted unpickler, allowing only:

- Built-in Python value types (primitives, datetime, uuid, decimal,
  collections, etc.)
- All agent_framework.* internal types
- Additional types specified via the new allowed_checkpoint_types
  parameter on FileCheckpointStorage

This narrows the default type surface area for persisted checkpoints
while keeping framework-owned scenarios working without extra
configuration.  Developers can extend the allowed set by passing
"module:qualname" strings to allowed_checkpoint_types.

The decode_checkpoint_value function retains backward-compatible
unrestricted behavior when called without the new allowed_types kwarg.

Fixes microsoft#4894

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add explicit type annotation for super().find_class() return value
to satisfy mypy's no-any-return check.

Fixes microsoft#4894

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 27, 2026 05:56
@moonbox3 moonbox3 self-assigned this Mar 27, 2026
Copy link
Copy Markdown
Contributor Author

@moonbox3 moonbox3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 4 | Confidence: 92%

✓ Correctness

This PR contains only cosmetic and linting changes: reformatting indentation in test files, collapsing a multi-line call onto one line, adding a # noqa: S301 comment to suppress a Bandit pickle warning on the restricted unpickler, splitting a return into annotated local variable + return (likely to satisfy a linter), replacing try/except with contextlib.suppress, and removing an unused import. All changes are semantically equivalent to the original code with no behavioral differences.

✓ Security Reliability

This PR contains purely cosmetic changes: a # noqa: S301 suppression on the restricted unpickler class (appropriate since the class intentionally subclasses pickle.Unpickler to add safety restrictions), a local variable assignment in find_class for type annotation, indentation fixes in tests, replacing try/except with contextlib.suppress, and removing an unused import. No security-relevant behavior is changed. The existing restricted unpickler design, allowlist logic, and fallback to unrestricted pickle when allowed_types is None are all unchanged.

✓ Test Coverage

This PR is purely formatting, linting, and style cleanup with no behavioral changes. Production code changes are: (1) a # noqa: S301 comment to suppress Bandit's pickle warning, (2) an intermediate typed variable before returning in find_class, and (3) a single-line reformatting of a function call. Test changes are: removing an unused import (_BUILTIN_ALLOWED_TYPE_KEYS), replacing a try/except: pass with contextlib.suppress, and fixing indentation. Since no behavior changed, no new tests are needed and the existing test suite adequately covers the restricted unpickler logic including blocked callables, blocked __reduce__ payloads, code execution prevention, user-type allow/block, built-in safe types, and file-based checkpoint round-trips.

✓ Design Approach

This PR makes four types of changes: (1) reformats a long function call in _checkpoint.py, (2) adds a # noqa: S301 suppression to the _RestrictedUnpickler class definition and splits a return into an intermediate typed variable, (3) replaces try/except Exception: pass with contextlib.suppress(Exception) in a security test, and (4) fixes indentation in test files. None of these represent fundamentally misguided design approaches. The # noqa: S301 on the class definition is appropriate—the file already uses the same pattern for pickle.loads and the import itself. The contextlib.suppress(Exception) change is semantically equivalent to the original. The one minor concern is the intermediate result: type = super().find_class(module, name); return result pattern, which appears to work around a linter warning in an indirect way rather than either (a) placing a targeted # noqa on the return line or (b) simply keeping the direct return. The type annotation result: type is also imprecise since pickle.Unpickler.find_class has a broader return type in most stubs, though this has no runtime impact. No blocking design issues exist.

Suggestions

  • The intermediate variable result: type = super().find_class(module, name); return result in _RestrictedUnpickler.find_class appears to be a workaround for a linter warning. It is clearer to keep the direct return super().find_class(module, name) # noqa: S301 # nosec form, which is already the established pattern used on the pickle.loads call in the same file. The current split adds indirection without improving readability or type safety.

Automated review by moonbox3's agents

@markwallace-microsoft
Copy link
Copy Markdown
Member

markwallace-microsoft commented Mar 27, 2026

Python Test Coverage

Python Test Coverage Report •
FileStmtsMissCoverMissing
packages/core/agent_framework/_workflows
   _checkpoint.py158398%115–116, 299
   _checkpoint_encoding.py65296%250–251
TOTAL28028341087% 

Python Unit Test Overview

Tests Skipped Failures Errors Time
5464 20 💤 0 ❌ 0 🔥 1m 28s ⏱️

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens Python workflow checkpoint persistence by introducing restricted pickle deserialization for file-based checkpoints, addressing typing/lint issues, and updating/adding tests to cover the new allow-list behavior (Fixes #4894).

Changes:

  • Added a restricted unpickler and a built-in allow-list for checkpoint deserialization, with an extension mechanism via allow-listed "module:qualname" strings.
  • Updated FileCheckpointStorage to accept allowed_checkpoint_types and to use restricted deserialization by default.
  • Updated existing workflow checkpoint tests and added new tests validating restricted vs. unrestricted decoding behavior.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
python/packages/core/agent_framework/_workflows/_checkpoint_encoding.py Adds restricted unpickling path + allow-list and threads restriction through decode helpers.
python/packages/core/agent_framework/_workflows/_checkpoint.py Adds allowed_checkpoint_types option and applies restricted decoding on load/list.
python/packages/core/tests/workflow/test_request_info_event_rehydrate.py Updates file-based checkpoint tests to pass required allowed types.
python/packages/core/tests/workflow/test_checkpoint.py Updates file-based checkpoint roundtrip tests to pass required allowed types for custom objects.
python/packages/core/tests/workflow/test_checkpoint_unrestricted_pickle.py New tests validating restricted decoding blocks unsafe payloads and allow-list behavior.

Copilot and others added 2 commits March 27, 2026 06:04
Remove unnecessary intermediate variable and apply # noqa: S301 # nosec
directly on the super().find_class() call, matching the established
pattern used on the pickle.loads() call in the same file.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

@moonbox3 moonbox3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 4 | Confidence: 93%

✓ Correctness

This is a trivial comment-only change that swaps a ruff/flake8-bandit lint suppression (# noqa: S301) for a mypy type suppression (# type: ignore[no-any-return]) on the super().find_class() call. The # nosec bandit suppression is correctly retained. The type ignore is appropriate because pickle.Unpickler.find_class is typed as returning Any in typeshed, while this override declares a type return. The class-level # noqa: S301 on line 82 already covers the S301 rule for this class, so removing it from line 102 should not introduce new lint warnings. No behavioral change whatsoever.

✓ Security Reliability

This change swaps a ruff/flake8 suppression comment (# noqa: S301) for a mypy suppression comment (# type: ignore[no-any-return]) on the super().find_class() call in _RestrictedUnpickler. The # nosec bandit suppression is retained. No runtime behavior is altered — this is purely a static-analysis annotation change. The existing security controls (allowlist-based find_class, _BUILTIN_ALLOWED_TYPE_KEYS, framework module prefix check) remain intact and are sound for their stated purpose.

✓ Test Coverage

This is a purely cosmetic change: replacing a Ruff/flake8 # noqa: S301 suppression comment with a mypy # type: ignore[no-any-return] on the find_class return in _RestrictedUnpickler. No runtime behavior is altered. The existing test suite in test_checkpoint_decode.py and test_checkpoint_unrestricted_pickle.py comprehensively covers the find_class code path — including allowed built-in types, framework types, explicitly allowed user types, blocked arbitrary callables, blocked __reduce__ payloads, and code-execution prevention. No new behavior was introduced, so no additional tests are needed.

✗ Design Approach

The change correctly adds # type: ignore[no-any-return] to silence a mypy complaint about pickle.Unpickler.find_class being typed as Any in typeshed while the override declares -> type. However, it removes the # noqa: S301 suppression that was previously on the same line. Since the project runs ruff with the "S" (bandit) rule set enabled and ruff respects # noqa (not # nosec), dropping # noqa: S301 will likely produce a ruff S301 lint failure on this line. The # nosec retained here only suppresses direct bandit runs, not ruff's bandit rules. The other two pickle-related lines in the same file (line 82, line 246) still carry # noqa: S301, making this removal inconsistent.

Flagged Issues

  • Removing # noqa: S301 from line 102 will cause a ruff S301 lint failure. The project has "S" rules enabled in ruff (pyproject.toml select), and ruff uses # noqa syntax—not # nosec—to suppress those violations. Lines 82 and 246 in the same file still carry # noqa: S301; line 102 should too.

Suggestions

  • Removing # noqa: S301 means ruff/flake8 (with the S plugin) will now flag this line with S301 (use of pickle). Verify that your CI linting configuration either doesn't enable the S ruleset for this file or that the project-level config already suppresses S301 globally, otherwise this will become a lint failure.

Automated review by moonbox3's agents

Copilot and others added 2 commits March 27, 2026 06:11
…t#4894)

The review feedback correctly identified that removing the # noqa: S301
suppression from the find_class return statement would cause a ruff S301
lint failure, since the project enables bandit ("S") rules. This
restores consistency with lines 82 and 246 in the same file.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

@moonbox3 moonbox3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 4 | Confidence: 94%

✓ Correctness

This is a trivial cleanup removing a redundant # noqa: S301 comment from line 102. The class-level suppression on line 82 already covers the S301 (pickle usage) diagnostic for the entire class, and the # nosec comment (Bandit) remains in place on line 102. No correctness concerns.

✓ Security Reliability

This is a cosmetic-only change that removes a redundant # noqa: S301 (ruff/flake8 suppression) from a line that already carries # nosec (Bandit suppression). The underlying super().find_class(module, name) call and all surrounding security logic — the restricted unpickler allowlist, the framework-module prefix check, and the type-verification step — are completely unchanged. No security or reliability concerns are introduced.

✓ Test Coverage

This PR removes a redundant # noqa: S301 lint suppression comment from a line that already carries # nosec (bandit's own suppression directive). The change is purely cosmetic — no logic, no behavior, and no API surface is altered. There is no test coverage implication because no runtime behavior changed. Existing tests in test_checkpoint_encode.py, test_checkpoint_decode.py, and test_checkpoint_unrestricted_pickle.py already cover the _RestrictedUnpickler.find_class path, and no new tests are needed for a comment removal.

✓ Design Approach

The diff removes a # noqa: S301 suppression comment from a line that already carries # nosec. The noqa: S301 suppression is for Bandit's 'pickle usage' warning as surfaced through flake8-bugbear or similar linters; # nosec covers Bandit's direct scan. Since super().find_class() in a restricted unpickler is intentional and controlled, the # nosec annotation is the correct Bandit suppression mechanism. Removing the redundant # noqa: S301 is a minor cleanup with no functional impact. There are no design or correctness concerns in this change.

Suggestions

  • Confirm that the CI linter pipeline does not flag S301 via a noqa-based tool (e.g., flake8 or ruff) independently of Bandit's nosec mechanism. If so, removing # noqa: S301 could reintroduce a lint failure.

Automated review by moonbox3's agents

@moonbox3 moonbox3 changed the title Python: Fix type annotation and code quality issues in checkpoint persistence Python: Restrict persisted checkpoint deserialization by default Mar 27, 2026
Copilot and others added 3 commits March 27, 2026 07:08
- Move module docstring to proper position after __future__ import
- Fix find_class return type annotation to type[Any]
- Add missing # noqa: S301 pragma on find_class return
- Improve error message to reference both allowed_types param and
  FileCheckpointStorage.allowed_checkpoint_types
- Add -> None return annotation to FileCheckpointStorage.__init__
- Replace tempfile.mktemp with TemporaryDirectory in test
- Replace contextlib.suppress with pytest.raises for precise assertion
- Remove unused contextlib import

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… return type

- Move module docstring before 'from __future__' import so it populates
  __doc__ (comment #4)
- Change find_class return annotation from type[Any] to type to avoid
  misleading callers about non-type returns like copyreg._reconstructor
  (comment #2)

Comments #1, #3, #5, #6, #7, #8 were already addressed in the current code.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@moonbox3 moonbox3 added the workflows Related to Workflows in agent-framework label Mar 27, 2026
@moonbox3 moonbox3 moved this to In Review in Agent Framework Mar 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

python workflows Related to Workflows in agent-framework

Projects

Status: In Review

Development

Successfully merging this pull request may close these issues.

Python: Harden Python checkpoint persistence defaults

3 participants