Move enum explanations and health checks from cuda_core to cuda_bindings#1805

Open
rwgk wants to merge 7 commits into NVIDIA:main from rwgk:move_enum_explanations

rwgk commented Mar 23, 2026

Closes #1712

The DRIVER_CU_RESULT_EXPLANATIONS and RUNTIME_CUDA_ERROR_EXPLANATIONS dicts are fundamentally tied to the cuda-bindings release (they must match the enums shipped in that release). Having them live exclusively in cuda_core meant the health-check tests failed whenever cuda_core was tested against a different version of cuda-bindings.

Refactor

  • Copy the dicts to cuda_bindings/cuda/bindings/_utils/ as the
    authoritative source (_EXPLANATIONS + _CTK_MAJOR_MINOR_PATCH).
  • Keep fallback copies in cuda_core/cuda/core/_utils/
    (_FALLBACK_EXPLANATIONS) so error messages still include human-readable
    explanations when paired with older cuda-bindings that don't ship the dicts.
    Each module tries to import from cuda.bindings._utils first
    (ModuleNotFoundError-guarded) and falls back to the local copy.
  • Move the exhaustive health-check tests (every enum code present, no
    extras) to cuda_bindings/tests/test_enum_explanations.py, where they
    belong.
  • Add lightweight smoke + version-sync tests in cuda_core that verify:
    • codes 0, 1, 2 exist (smoke)
    • the cuda_core fallback is at least as new as the cuda_bindings copy
      (_CTK_MAJOR_MINOR_PATCH comparison)
    • when versions match, the dicts are identical (catches mid-CTK-cycle drift)

rwgk added 5 commits March 22, 2026 20:44
…VIDIA#1712)

The explanation dicts are fundamentally tied to the bindings version, so
they belong in cuda_bindings. This copies them (keeping the cuda_core
originals for backward compatibility) and adds the corresponding health
tests under cuda_bindings/tests/.

Made-with: Cursor
These tests now live in cuda_bindings/tests/test_enum_explanations.py,
where they belong alongside the explanation dicts they verify.

Made-with: Cursor
…llback (NVIDIA#1712)

Each explanation module now tries to import the authoritative dict from
cuda.bindings._utils (ModuleNotFoundError-guarded) and falls back to its
own copy for older cuda-bindings that don't ship it yet. Smoke tests
added for both dicts.

Made-with: Cursor
NVIDIA#1712)

Rename explanation dicts to _EXPLANATIONS / _FALLBACK_EXPLANATIONS,
add _CTK_MAJOR_MINOR_PATCH to each module, and enforce that the
cuda_core fallback copy is as new as (and in-sync with) cuda_bindings.
Parametrize the smoke and version-check tests to cover both driver and
runtime without duplication.

Made-with: Cursor
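
The version-sync rule from the last commit message can be illustrated with a small check. This is a sketch under assumptions rather than the test added in this PR: it assumes `_CTK_MAJOR_MINOR_PATCH` is a tuple of integers such as `(major, minor, patch)`, and the function name `check_fallback_in_sync` and the example values are hypothetical.

```python
def check_fallback_in_sync(fallback_version, fallback_dict,
                           bindings_version, bindings_dict):
    """Return None if the fallback copy is acceptable, else a message.

    Versions are assumed to be (major, minor, patch) tuples of ints,
    which Python compares element by element from left to right.
    """
    if fallback_version < bindings_version:
        return "fallback copy is older than the cuda_bindings copy"
    if fallback_version == bindings_version and fallback_dict != bindings_dict:
        return "same CTK version but the dicts differ"
    # A newer fallback paired with an older cuda_bindings copy is allowed.
    return None
```

For example, `check_fallback_in_sync((1, 2, 0), {0: "a"}, (1, 2, 0), {0: "b"})` reports drift at the same version, while a fallback at `(1, 3, 0)` checked against bindings at `(1, 2, 0)` passes regardless of dict contents.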
rwgk self-assigned this Mar 23, 2026

copy-pr-bot bot commented Mar 23, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

rwgk added the bug, P0, cuda.bindings, and cuda.core labels Mar 23, 2026

rwgk commented Mar 23, 2026

/ok to test

rwgk commented Mar 23, 2026

/ok to test

rwgk marked this pull request as ready for review March 23, 2026 06:50
rwgk requested a review from leofang March 23, 2026 18:43

Development

Successfully merging this pull request may close these issues.

[Bug]: Move enum explanations and health checks from cuda_core to cuda_bindings
