Skip to content

Conversation

@rwgk
Copy link
Collaborator

@rwgk rwgk commented Jan 18, 2026

Description

Currently all changes are Cursor-generated and unreviewed

The first commit is PR #5958 squashed: 0da9df0

Cursor-driven changes (oldest@top to newest@bottom)

  • Claude 4.5 Opus
    • a8812e7 Add instance_base cleanup and improve reset() method
    • 40efead Add patients container cleanup in internals destructor
    • 5b1464d Add null check before resetting unique_ptr in reset() method
    • 297f32c Add comment clarifying when internals destructor runs
    • 9eb5239 Add exception handling to reset() method
  • GPT-5.2 Codex Extra High
    • 51a8187 Avoid recreating internals during type dealloc

Suggested changelog entry:

  • Placeholder.

rwgk added 8 commits January 17, 2026 11:59
Destruct internals during interpreter finalization to fix PyTypeObject leaks.
During subinterpreter destruction, the internals object stored in a capsule
in the interpreter's state dict is not freed. The capsule destructor
(internals_shutdown) only runs at interpreter shutdown, not when individual
subinterpreters are destroyed, causing a memory leak of ~1.7 MB per
subinterpreter cycle.

This commit adds a reset() method to internals_pp_manager that explicitly
resets the unique_ptr in the capsule before Py_EndInterpreter is called,
ensuring the internals object is freed when the subinterpreter is destroyed.

Changes:
- Added reset() method to internals_pp_manager that acquires the GIL and
  resets the unique_ptr from the state dict
- Updated destroy() to also reset the unique_ptr before deleting the
  cached pointer-to-pointer
- Modified subinterpreter destructor to call reset() before Py_EndInterpreter

Note: Initial testing shows the leak persists, suggesting the issue may be
more complex. Further investigation with heaptrack is needed to identify
the root cause.
- Added Py_CLEAR(instance_base) to internals destructor to ensure it's
  properly decref'd when internals is destroyed
- Modified reset() to directly access the capsule from the state dict
  instead of using cached pointer, ensuring we reset the correct unique_ptr

Note: Leak investigation continues - these changes don't fully resolve
the ~1.7 MB per iteration leak yet.
The patients container stores PyObject* pointers that are INCREF'd when
added (see add_patient in class.h). These need to be DECREF'd when the
internals object is destroyed to prevent leaks.

Note: This doesn't fully resolve the leak yet - investigation continues.
Check that the unique_ptr actually contains an object before resetting it
to avoid unnecessary operations on null pointers.
Clarify that when called from reset() before Py_EndInterpreter,
the interpreter is definitely alive, so cleanup should always happen.
Wrap get_python_state_dict() call in try-catch to handle potential
exceptions gracefully. If we can't get the state dict or capsule,
silently continue as the capsule destructor will clean up eventually.
Introduce a non-creating internals lookup and use it in
pybind11_meta_dealloc so teardown paths don’t accidentally rebuild
internals and their static types during Py_EndInterpreter. This prevents
with_internals from triggering make_static_property_type while the
interpreter is finalizing, which heaptrack showed as leaked allocations.

Changes:
- add is_interpreter_finalizing() helper
- add internals_pp_manager::get_pp_if_exists() and a state-dict lookup
  helper that does not create capsules
- switch reset() to use the non-creating lookup
- update pybind11_meta_dealloc to operate only when internals exist,
  avoiding implicit internals creation during GC/finalization

This reduces the leaked allocations tied to pybind11_meta_dealloc and
keeps teardown from reinitializing internals after they were reset.
@rwgk rwgk requested a review from henryiii as a code owner January 18, 2026 03:53
@rwgk
Copy link
Collaborator Author

rwgk commented Jan 18, 2026

Summary @ commit 51a8187

  • Avoid recreating pybind11 internals during type deallocation to prevent teardown from reinitializing static types.
  • Add a non-creating internals lookup so teardown paths can use existing internals safely without creating capsules.
  • Leak investigation indicates the remaining large leak is dominated by CPython per-interpreter obmalloc arenas rather than pybind11-owned objects.

What heaptrack shows now

  • The pybind11_meta_dealloc → make_static_property_type() leak path is gone (filtered leak output is empty for that symbol).
  • Leaked allocations dropped (126 → 76) and total leaked bytes dropped slightly (4.40M → 4.35M for 10 iterations).
  • The dominant leak remains _PyUnicode_InitGlobalObjects::init_interned_dict via pycore_interp_init, i.e., per-interpreter obmalloc arenas.

Root-cause direction: obmalloc arenas

Controlled experiments with PyInterpreterConfig.use_main_obmalloc = 1 reduce the leak to ~28–30 KB/iter in both pybind11 and pure C tests:

  • pybind11, default config: ~1712 KB/iter
  • pybind11, use_main_obmalloc=1: ~28 KB/iter
  • pure C, default config: ~202 KB/iter
  • pure C, use_main_obmalloc=1: ~30 KB/iter

This strongly suggests per-interpreter obmalloc arenas are not freed on Py_EndInterpreter (CPython behavior/limitation). pybind11 exercises more allocations (types/interned strings), so retained arenas are larger.

Next steps I can take

  • Add an opt-in knob or doc note for use_main_obmalloc=1 to mitigate leaks in subinterpreters.
  • Draft a minimal CPython repro and issue write-up pointing to per-interpreter obmalloc arenas not being released.

@rwgk rwgk requested review from Skylion007, b-pass and oremanj and removed request for henryiii January 18, 2026 03:56
@rwgk
Copy link
Collaborator Author

rwgk commented Jan 18, 2026

@b-pass @oremanj @Skylion007 Please let me know any feedback. If you have ideas for how to direct Cursor, please post them here, I'll try them out.

AFAICT Claude 4.5 Opus got stuck kind-of, therefore I switched to GPT-5.2 Codex Extra High. At first sight it looks like it got a lot deeper, but I didn't look in any depth myself.

If we can trust the Summary @ commit 51a8187 above, and IIUC, the remaining problems appear to be more likely in CPython than in pybind11. But TBH, I don't know anything about use_main_obmalloc, I could be completely off.

@rwgk
Copy link
Collaborator Author

rwgk commented Jan 18, 2026

This is meant to help jump-start work in other agents:

pr5961_context_2026-01-17+2010.md

It references a bunch of files I have on my workstation. Please let me know if you think any of those might help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant