feat(pathfinder): centralize CUDA env var handling, prioritize CUDA_PATH over CUDA_HOME #1801

Open
rwgk wants to merge 24 commits into NVIDIA:main from rwgk:CUDA_PATH_CUDA_HOME_cleanup

@rwgk
Collaborator

@rwgk rwgk commented Mar 20, 2026

Closes #1433

Continuation of #1519 (squash-merged and rebased onto main; @rparolin as co-author).

Summary

Centralizes CUDA_PATH/CUDA_HOME handling in cuda-pathfinder and establishes a consistent priority: CUDA_PATH takes precedence over CUDA_HOME when both are set.

  • New public API: cuda.pathfinder.get_cuda_path_or_home() returns the highest-priority CUDA root directory. Warns when both vars are set and differ.
  • CUDA_PATH > CUDA_HOME: All consumers now go through the centralized function. The previous ad-hoc CUDA_HOME > CUDA_PATH ordering in some files is removed.
  • Empty strings treated as undefined: CUDA_PATH="" no longer shadows a valid CUDA_HOME.
  • found_via provenance label: Changed from "CUDA_HOME" to "CUDA_PATH" in LoadedDL, LocatedHeaderDir, LocatedStaticLib, and LocatedBitcodeLib to match the new priority.
  • Multi-path splitting dropped: The os.pathsep-based splitting in build_hooks.py is removed. We are unaware of real-world usage, and at this stage we believe the extra complexity isn't warranted. It could be added back later if needed.
  • Consumers migrated: conftest.py, cuda_core tests/examples, cuda_bindings examples all use get_cuda_path_or_home().
  • Documentation: pathfinder 1.5.0 release notes added, cuda_bindings environment variable docs updated, API docs updated.
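
The priority, empty-string, and warning rules listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the cuda-pathfinder implementation: `resolve_cuda_root` and its explicit `environ` parameter are hypothetical, and a plain string comparison stands in for whatever path comparison the real function performs.

```python
from __future__ import annotations

import warnings
from collections.abc import Mapping

# Priority order described in the summary: CUDA_PATH first, then CUDA_HOME.
ENV_VARS_ORDERED = ("CUDA_PATH", "CUDA_HOME")


def resolve_cuda_root(environ: Mapping[str, str]) -> str | None:
    """Hypothetical stand-in for get_cuda_path_or_home(), taking the environment explicitly."""
    # Empty strings are dropped here, so they behave like unset variables.
    defined = [(name, environ[name]) for name in ENV_VARS_ORDERED if environ.get(name)]
    if not defined:
        return None
    winner_name, winner_value = defined[0]
    # Warn when both variables are set and disagree (assumption: plain string comparison).
    if len(defined) == 2 and defined[1][1] != winner_value:
        warnings.warn(
            f"CUDA_PATH and CUDA_HOME are both set and differ; using {winner_name}",
            stacklevel=2,
        )
    return winner_value
```

Passing the environment as an argument keeps the sketch easy to exercise with plain dictionaries; real callers would use `cuda.pathfinder.get_cuda_path_or_home()` directly.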

Build hooks and cuda.pathfinder at build time

The build_hooks.py files in cuda_bindings and cuda_core currently use os.environ.get("CUDA_PATH", os.environ.get("CUDA_HOME")) directly instead of get_cuda_path_or_home(). This is necessary because in PEP 517 isolated build environments, backend-path=["."] causes the cuda namespace package to resolve to only the project's cuda/ directory, shadowing the installed cuda-pathfinder in the build-env's site-packages.

A proven workaround exists: replace the _NamespacePath with a plain list that includes the site-packages cuda/ directory, allowing import cuda.pathfinder to succeed. This is demonstrated and tested in PR #1803. After cuda-pathfinder >= 1.5 is released on PyPI, a follow-up PR will switch the build hooks to use get_cuda_path_or_home().
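
The shadowing and the plain-list workaround can be reproduced with a throwaway namespace package. The sketch below is only an illustration of the mechanism, not the code from PR #1803: `demo_ns`, `extend_namespace_path`, and the temporary directory layout are all hypothetical.

```python
from __future__ import annotations

import importlib
import sys
import tempfile
from pathlib import Path


def extend_namespace_path(package_name: str, extra_dirs: list[str]) -> None:
    """Replace a namespace package's __path__ with a plain list that also covers extra_dirs."""
    package = importlib.import_module(package_name)
    package.__path__ = list(package.__path__) + list(extra_dirs)
    importlib.invalidate_caches()


def demo() -> str:
    """Import demo_ns.finder from a directory that is not on sys.path."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "project"      # plays the role of backend-path=["."]
        site = Path(tmp) / "site-packages"   # plays the role of the build env's site-packages
        (project / "demo_ns").mkdir(parents=True)
        (site / "demo_ns" / "finder").mkdir(parents=True)
        (site / "demo_ns" / "finder" / "__init__.py").write_text("WHERE = 'site-packages'\n")

        sys.path.insert(0, str(project))  # only the project root is importable
        try:
            # Without this call, `import demo_ns.finder` would raise ModuleNotFoundError.
            extend_namespace_path("demo_ns", [str(site / "demo_ns")])
            return importlib.import_module("demo_ns.finder").WHERE
        finally:
            # Undo the global side effects of the demonstration.
            sys.path.remove(str(project))
            for name in [n for n in sys.modules if n.split(".")[0] == "demo_ns"]:
                del sys.modules[name]
```

Submodule lookup consults the parent package's `__path__`, so once the second directory is listed there the import succeeds even though that directory is not on `sys.path`.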

…d onto main.

Adds cuda.pathfinder._utils.env_var_constants with canonical search order,
enhances get_cuda_home_or_path() with robust path comparison and caching,
and updates documentation across all packages to reflect the new priority.

Co-authored-by: Rob Parolin <rparolin@nvidia.com>
Made-with: Cursor
@copy-pr-bot
Contributor

copy-pr-bot bot commented Mar 20, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@rwgk
Collaborator Author

rwgk commented Mar 20, 2026

Decision: drop os.pathsep splitting of CUDA_HOME/CUDA_PATH

Both cuda_bindings/build_hooks.py and cuda_core/build_hooks.py currently split the env var value on os.pathsep and iterate over a list of CUDA roots. After reviewing the codebase and the broader ecosystem, I'm removing this pattern. Rationale:

  • No real-world usage. The NVIDIA CTK installer (Windows), conda environments, pixi configs, and CMake's FindCUDAToolkit all set CUDA_PATH/CUDA_HOME to a single directory. There is no documented convention for pathsep-separated multi-path values.
  • Never tested. The two build_hooks.py files even disagreed on env var priority (CUDA_HOME > CUDA_PATH in bindings, CUDA_PATH > CUDA_HOME in core) — if anyone had used multi-path, that inconsistency would have surfaced.
  • Avoids premature complexity. Adding multi-path support to the new centralized get_cuda_home_or_path() (especially combined with the conflict-warning logic) would add design surface for a feature nobody uses today.

Both _get_cuda_paths() functions will be replaced with a plain get_cuda_home_or_path() call returning a single path. If a real multi-path use case surfaces later, we can add it with proper design at that point.
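
As a rough illustration of what is being dropped, the sketch below contrasts the list-splitting pattern with the single-path reading. `split_cuda_roots` is a hypothetical name, and the plain `str.split` is an assumption about how the removed code behaved.

```python
from __future__ import annotations

import os


def split_cuda_roots(value: str) -> list[str]:
    """Removed pattern (assumed shape): treat the value as an os.pathsep-separated list."""
    return value.split(os.pathsep)


# A conventional single-directory value yields a one-element list, so users who
# set one directory see the same root with or without the split.
single = os.path.join(os.sep, "usr", "local", "cuda")
multi = os.pathsep.join([single, os.path.join(os.sep, "opt", "cuda")])

single_roots = split_cuda_roots(single)  # one entry: the directory itself
multi_roots = split_cuda_roots(multi)    # two entries; no longer supported after this change
```

Only values that actually contain `os.pathsep` are affected by the removal.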


Related commit: 2164c33

Drop os.pathsep splitting of CUDA_PATH/CUDA_HOME in both build_hooks.py files.
Both functions now delegate to get_cuda_home_or_path() from cuda.pathfinder,
returning a single path.

See NVIDIA#1801 (comment)

Made-with: Cursor
@rwgk
Collaborator Author

rwgk commented Mar 20, 2026

Decision: treat empty CUDA_PATH/CUDA_HOME as undefined

Rob's original implementation preserved empty strings (CUDA_PATH="") as valid return values from get_cuda_home_or_path(). I'm changing this so that empty strings are equivalent to the variable not being set. Rationale:

  • No valid use case. An empty string is not a valid filesystem path. Nobody intentionally sets CUDA_PATH="" to mean "the CUDA path is empty."
  • Fixes a subtle shadowing bug. With the old behavior, CUDA_PATH="" + CUDA_HOME=/usr/local/cuda would return "" (CUDA_PATH wins by priority), silently hiding the valid CUDA_HOME. Treating empty as undefined lets it correctly fall through.
  • Simplifies callers. All downstream consumers already do if not cuda_path: raise .... Returning "" just pushes the falsy check to every call site.
  • Matches convention. The standard os.environ.get("CUDA_PATH") idiom throughout the codebase has always treated empty and unset interchangeably (via if not value). This aligns with that expectation.
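
The shadowing case can be reproduced with plain dictionaries. Both helper names below are hypothetical; they contrast a lookup that preserves empty strings with one that treats them as undefined, and neither is taken from the cuda-pathfinder source.

```python
from __future__ import annotations

from collections.abc import Mapping


def first_set(environ: Mapping[str, str]) -> str | None:
    """Preserves empty strings: CUDA_HOME is consulted only when CUDA_PATH is absent."""
    return environ.get("CUDA_PATH", environ.get("CUDA_HOME"))


def first_non_empty(environ: Mapping[str, str]) -> str | None:
    """Treats empty strings as undefined, so the lookup falls through to CUDA_HOME."""
    return environ.get("CUDA_PATH") or environ.get("CUDA_HOME") or None


env = {"CUDA_PATH": "", "CUDA_HOME": "/usr/local/cuda"}
shadowed = first_set(env)        # "" hides the valid CUDA_HOME
resolved = first_non_empty(env)  # "/usr/local/cuda"
```

With the second form, callers need only a single `is None` (or falsy) check.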

This addresses my review comment on Rob's PR: #1519 (comment)

rwgk added 10 commits March 20, 2026 12:49
Safe: currently an internal-only API (not yet public).
Made-with: Cursor
Export get_cuda_path_or_home from cuda.pathfinder.__init__. External
consumers now import from cuda.pathfinder directly. Rename constant
to _CUDA_PATH_ENV_VARS_ORDERED and remove all public references to it.

Made-with: Cursor
Pathfinder 1.5.0 release notes no longer claim cross-package consistency
(that depends on future bindings/core releases). cuda_bindings env var
docs now defer to pathfinder release notes for migration guidance.

Made-with: Cursor
@rwgk
Collaborator Author

rwgk commented Mar 21, 2026

Dump of findings from independent review with Cursor GPT-5.4 Extra High:

  Findings
  • High cuda_core and cuda_bindings now import get_cuda_path_or_home() in their build hooks (cuda_core/build_hooks.py:33, cuda_bindings/build_hooks.py:38), but their metadata still allows
     older cuda-pathfinder releases in both build and runtime deps (cuda_core/pyproject.toml:10, cuda_core/pyproject.toml:51, cuda_bindings/pyproject.toml:9,
    cuda_bindings/pyproject.toml:35). That means a resolver can legally choose a 1.4.x release that does not export this symbol, and isolated source builds will fail with ImportError
    before the actual CUDA-path logic runs.
  • Medium The new CUDA_PATH-first behavior is not reflected in public provenance fields. Winner selection now goes through get_cuda_path_or_home()
    (cuda_pathfinder/cuda/pathfinder/_utils/env_vars.py:77, cuda_pathfinder/cuda/pathfinder/_utils/env_vars.py:105), but env-based results still hard-code found_via="CUDA_HOME" in
    headers/libs lookup (cuda_pathfinder/cuda/pathfinder/_headers/find_nvidia_headers.py:110, cuda_pathfinder/cuda/pathfinder/_dynamic_libs/search_steps.py:198,
    cuda_pathfinder/cuda/pathfinder/_static_libs/find_static_lib.py:152, cuda_pathfinder/cuda/pathfinder/_static_libs/find_bitcode_lib.py:148). So callers can be told the path came from
    CUDA_HOME even when CUDA_PATH won.
  • Medium The header-dependent core tests now skip based on “an env var resolved” rather than “headers actually exist.” skipif_need_cuda_headers only checks the helper result in
    cuda_core/tests/conftest.py:257, while the shared helpers still require a real <cuda_root>/include directory in cuda_core/tests/helpers/__init__.py:13 and
    cuda_core/tests/helpers/__init__.py:17. With CUDA_PATH=/bad/path, these tests will run and fail for the wrong reason instead of skipping cleanly.
  • Low The docs version switcher points multiple releases at the wrong docs set. cuda_pathfinder/docs/nv-versions.json:7, cuda_pathfinder/docs/nv-versions.json:8,
    cuda_pathfinder/docs/nv-versions.json:11, cuda_pathfinder/docs/nv-versions.json:12, cuda_pathfinder/docs/nv-versions.json:15, and cuda_pathfinder/docs/nv-versions.json:16 map 1.5.0,
    1.4.3, and 1.4.2 to the 1.4.1 URL.

  Open questions / assumptions
  • I did not count the removed os.pathsep multi-root handling in the build hooks as a formal finding. It looks intentional, but it is still a compatibility change worth documenting if
    anyone relies on path-list values in CUDA_PATH or CUDA_HOME.
  • I could not run trustworthy local commands/tests in this environment because command execution was returning empty success responses, so this is a static review against main plus the
    PR/issue context rather than an executed test pass.

  Compared with main, the overall CUDA_PATH precedence cleanup looks directionally right, but I would not merge without fixing the dependency bounds first; that one is a real build-breaker.

rwgk added 2 commits March 20, 2026 21:03
…docs/nv-versions.json before

Discovered via independent review from GPT-5.4 Extra High
Aligns the provenance label with the new CUDA_PATH-first priority.
The label signals the highest-priority env var name, not necessarily
which variable supplied the value.

Discovered via independent review from GPT-5.4 Extra High

Made-with: Cursor
@rwgk
Collaborator Author

rwgk commented Mar 21, 2026

Responses to #1801 (comment)

• High cuda_core and cuda_bindings now import get_cuda_path_or_home() in their build hooks (cuda_core/build_hooks.py:33, cuda_bindings/build_hooks.py:38), but their metadata still allows
older cuda-pathfinder releases in both build and runtime deps (cuda_core/pyproject.toml:10, cuda_core/pyproject.toml:51, cuda_bindings/pyproject.toml:9,
cuda_bindings/pyproject.toml:35). That means a resolver can legally choose a 1.4.x release that does not export this symbol, and isolated source builds will fail with ImportError
before the actual CUDA-path logic runs.

This is understood: we cannot immediately pin cuda-bindings and cuda-core to cuda-pathfinder>=1.5 because that version is not released yet. We will resolve this chicken-and-egg situation by releasing cuda-pathfinder 1.5.0 (shortly after this PR is merged), then adding the pins before any new cuda-bindings or cuda-core release is made.

• Medium The new CUDA_PATH-first behavior is not reflected in public provenance fields. Winner selection now goes through get_cuda_path_or_home()
(cuda_pathfinder/cuda/pathfinder/_utils/env_vars.py:77, cuda_pathfinder/cuda/pathfinder/_utils/env_vars.py:105), but env-based results still hard-code found_via="CUDA_HOME" in
headers/libs lookup (cuda_pathfinder/cuda/pathfinder/_headers/find_nvidia_headers.py:110, cuda_pathfinder/cuda/pathfinder/_dynamic_libs/search_steps.py:198,
cuda_pathfinder/cuda/pathfinder/_static_libs/find_static_lib.py:152, cuda_pathfinder/cuda/pathfinder/_static_libs/find_bitcode_lib.py:148). So callers can be told the path came from
CUDA_HOME even when CUDA_PATH won.

It was an oversight that CUDA_HOME was still assigned to found_via. We decided to fix this with commit 8d3ed03, which also adds documentation calling out that the provenance label aligns with the new CUDA_PATH-first priority and does not necessarily name the variable that supplied the value (in keeping with the status quo).

• Medium The header-dependent core tests now skip based on “an env var resolved” rather than “headers actually exist.” skipif_need_cuda_headers only checks the helper result in
cuda_core/tests/conftest.py:257, while the shared helpers still require a real <cuda_root>/include directory in cuda_core/tests/helpers/__init__.py:13 and
cuda_core/tests/helpers/__init__.py:17. With CUDA_PATH=/bad/path, these tests will run and fail for the wrong reason instead of skipping cleanly.

This was intentional. The rationale, discussed previously (with Opus 4.6 1M Thinking), was that simple is best here: downstream failures are acceptable if CUDA_PATH unexpectedly points to a location that does not have the include subdirectory.

• Low The docs version switcher points multiple releases at the wrong docs set. cuda_pathfinder/docs/nv-versions.json:7, cuda_pathfinder/docs/nv-versions.json:8,
cuda_pathfinder/docs/nv-versions.json:11, cuda_pathfinder/docs/nv-versions.json:12, cuda_pathfinder/docs/nv-versions.json:15, and cuda_pathfinder/docs/nv-versions.json:16 map 1.5.0,
1.4.3, and 1.4.2 to the 1.4.1 URL.

Fixed with commit 6d065e9

Open questions / assumptions
• I did not count the removed os.pathsep multi-root handling in the build hooks as a formal finding. It looks intentional, but it is still a compatibility change worth documenting if
anyone relies on path-list values in CUDA_PATH or CUDA_HOME.

Yes, this was intentional: #1801 (comment)

Since this affects the build hooks only, we believe it'd be more distracting than helpful to mention it in the release notes or other documentation.

@rwgk
Collaborator Author

rwgk commented Mar 21, 2026

/ok to test

The build backends run in an isolated venv created by pyproject-build.
Although cuda-pathfinder is listed in build-system.requires and gets
installed, the cuda namespace package from backend-path=["."] shadows
the installed cuda-pathfinder, making `from cuda.pathfinder import ...`
fail with ModuleNotFoundError. This broke all CI wheel builds.

Revert _get_cuda_path() to use os.environ.get() directly with
CUDA_PATH > CUDA_HOME priority, and remove cuda-pathfinder from
build-system.requires (it was not there on main; our PR added it).

Made-with: Cursor
@rwgk
Collaborator Author

rwgk commented Mar 21, 2026

/ok to test


@rwgk
Collaborator Author

rwgk commented Mar 23, 2026

/ok to test

@rwgk rwgk self-assigned this Mar 23, 2026
@rwgk rwgk added the cuda.pathfinder Everything related to the cuda.pathfinder module label Mar 23, 2026
@rwgk
Collaborator Author

rwgk commented Mar 23, 2026

@rwgk
Collaborator Author

rwgk commented Mar 23, 2026

Commit 1eff0f5 only changed the release notes, so I canceled the CI to avoid wasting resources. See the URL in the previous comment for a link to a successful full run.

@rwgk rwgk requested a review from cpcloud March 23, 2026 18:43
Contributor

@cpcloud cpcloud left a comment


One blocking regression remains in the new CUDA_PATH flow:

  • conftest.py and cuda_core/tests/conftest.py now only check whether a CUDA env var is set. If CUDA_PATH/CUDA_HOME points at a stale or incomplete root, we stop skipping and fail later in compilation instead.

I left inline suggestions on the affected lines.


import pytest

from cuda.pathfinder import get_cuda_path_or_home
Contributor

@cpcloud cpcloud Mar 23, 2026


To keep the repo-level core Cython gate accurate after the env-var change, this file needs os again if you preserve the include/ existence check below.

Suggested change
-from cuda.pathfinder import get_cuda_path_or_home
+import os
+import pytest
+from cuda.pathfinder import get_cuda_path_or_home

conftest.py Outdated

 def pytest_collection_modifyitems(config, items):  # noqa: ARG001
-    cuda_home = os.environ.get("CUDA_HOME")
+    cuda_home = get_cuda_path_or_home()
Contributor

@cpcloud cpcloud Mar 23, 2026


Blocking: this now treats any non-empty CUDA_PATH/CUDA_HOME as if headers are available. If the env var points at a stale or incomplete root, the core Cython tests stop skipping and fail later in compilation instead. Please keep the include/ existence check here.

Suggested change
-    cuda_home = get_cuda_path_or_home()
+    cuda_root = get_cuda_path_or_home()
+    cuda_home = cuda_root if cuda_root is not None and os.path.isdir(os.path.join(cuda_root, "include")) else None

Collaborator Author


Done: commit 9c13237

I believe that skipping when CUDA_PATH is set but there is no include/ directory is likely to mask oversights. My commit introduces a hard, obvious failure in that situation, which I believe will be more helpful than a skip that is likely to get overlooked.

Comment on lines 257 to 260
 skipif_need_cuda_headers = pytest.mark.skipif(
-    not os.path.isdir(os.path.join(os.environ.get("CUDA_PATH", ""), "include")),
+    get_cuda_path_or_home() is None,
     reason="need CUDA header",
 )
Contributor


Same regression here: any non-empty CUDA_PATH/CUDA_HOME now marks headers as available. test_cooperative_launch() includes <cooperative_groups.h>, so a stale toolkit root stops skipping and fails inside Program.compile() instead. Please keep the include-dir check at this marker.

Suggested change
-skipif_need_cuda_headers = pytest.mark.skipif(
-    not os.path.isdir(os.path.join(os.environ.get("CUDA_PATH", ""), "include")),
-    get_cuda_path_or_home() is None,
-    reason="need CUDA header",
-)
+skipif_need_cuda_headers = pytest.mark.skipif(
+    (cuda_root := get_cuda_path_or_home()) is None or not os.path.isdir(os.path.join(cuda_root, "include")),
+    reason="need CUDA header",
+)

Collaborator Author

@rwgk rwgk Mar 23, 2026


This was deliberate (see comment above), but commit 9c13237 resolves this, and your other feedback above, in a safer way:

  • We skip if CUDA_PATH / CUDA_HOME is not set.
  • If set, we fail hard if the include/ directory doesn't exist.

(I believe in practice that amounts to "hey, your CUDA_PATH is pointing to the wrong directory", which will be obvious.)

@@ -183,20 +183,23 @@ def find_in_conda(ctx: SearchContext) -> FindResult | None:


def find_in_cuda_home(ctx: SearchContext) -> FindResult | None:
Contributor

@cpcloud cpcloud Mar 23, 2026


Non-blocking architecture note: after the recent descriptor/search-step refactors, this helper now represents the shared CUDA env-var anchor rather than specifically CUDA_HOME. Renaming this step (and the analogous header step/tests) to something like find_in_cuda_env() would keep the step layer aligned with the centralized get_cuda_path_or_home() helper and the new found_via="CUDA_PATH" provenance.

Suggested change
-def find_in_cuda_home(ctx: SearchContext) -> FindResult | None:
+def find_in_cuda_env(ctx: SearchContext) -> FindResult | None:

Collaborator Author


Done: commit 0255083

I decided to simply make this find_in_cuda_path. cuda_env is too ambiguous (no obvious connection to CUDA_PATH or CUDA_HOME).

Thanks for catching this; I overlooked it.

rwgk added 4 commits March 23, 2026 15:18
Add a helper that skips tests when no CUDA path is set, but
asserts that the include/ subdirectory exists when one is — surfacing
stale or incomplete toolkit roots at collection time instead of
letting them fail later in compilation.

Applied in both the root conftest.py and cuda_core/tests/conftest.py.

Made-with: Cursor
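
The skip-or-fail policy described in this commit message can be sketched without pytest. `cuda_headers_skip_reason` is a hypothetical name, and raising `AssertionError` is an assumption about how the hard failure is surfaced; the actual helper in the PR may differ.

```python
from __future__ import annotations

import os


def cuda_headers_skip_reason(cuda_root: str | None) -> str | None:
    """Return a skip reason when no CUDA root is configured; otherwise validate the root.

    A configured root without an include/ subdirectory is treated as a
    misconfiguration and fails loudly instead of being skipped.
    """
    if cuda_root is None:
        return "need CUDA header (CUDA_PATH / CUDA_HOME not set)"
    include_dir = os.path.join(cuda_root, "include")
    if not os.path.isdir(include_dir):
        raise AssertionError(f"CUDA root is set but has no include directory: {include_dir}")
    return None  # headers are expected to be available, so do not skip
```

In a conftest.py, the return value could feed `pytest.mark.skipif(reason is not None, reason=reason or "")`, with the root obtained from `get_cuda_path_or_home()`.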

Labels

cuda.pathfinder Everything related to the cuda.pathfinder module

Development

Successfully merging this pull request may close these issues.

[FEA]: Improve handling of CUDA_HOME and CUDA_PATH in build across projects

2 participants