Fix a shutdown issue and make debugging of CUDA or WebGPU EPs easier #835

Merged
skottmckay merged 3 commits into main from skottmckay/FixShutdownIssueAndImproveDebuggability
Jun 25, 2026
@skottmckay

Collaborator
  • Handle late logging statements emitted during static cleanup more gracefully.
    • For example, WebGPU has a callback that is triggered on wgpu instance destruction, which happens during static destruction because of globals in GenAI.
      • The GenAI globals aren't required for plugin EPs, but that's a separate issue/cleanup.
  • Make it possible to drop in a custom WebGPU or CUDA EP for debugging.
    • Add an env var that specifies the path to load the library from.

….g. webgpu has a callback that is triggered on wgpu instance destruction which happens during static destruction due to globals in genai (which aren't required for plugin EPs but that's a separate issue/cleanup).

Make it possible to drop in a custom webgpu or cuda EP for debugging.
Copilot AI review requested due to automatic review settings June 24, 2026 04:20
Copilot AI left a comment

Contributor

Pull request overview

This PR targets the C++ SDK (sdk_v2/cpp) with two independent improvements.

First, it hardens shutdown by routing the ORT log callback through a process-static std::atomic<ILogger*> (s_ort_logger) instead of a raw logger_param baked into CreateEnvWithCustomLogger. OrtEnv is a process singleton that can outlive a Manager (e.g., GenAI globals trigger late teardown logs during static destruction), so the previous design risked dereferencing a dangling logger_ pointer. The new design nulls the pointer at the start of ~Manager, mirroring the existing s_oga_logger pattern.

Second, it adds the FOUNDRY_LOCAL_CUDA_EP_LIBRARY and FOUNDRY_LOCAL_WEBGPU_EP_LIBRARY environment variables so a developer can drop in a custom EP library for debugging, bypassing the CDN download/verify flow.

Changes:

  • Route ORT log callbacks through an atomic global logger that is cleared on Manager teardown to avoid a use-after-free from late ORT static-cleanup logs.
  • Add an env-var override in the CUDA and WebGPU bootstrappers to register a custom provider library path directly.
  • Add supporting includes (utils.h, <cstdlib>) for the override logic.
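The atomic-logger change described above can be illustrated with a minimal sketch. This is not the code from manager.cc: ILogger here is a hypothetical stand-in with a single Log method, OrtLogCallback is reduced to one parameter (the real ORT logging callback takes more), and VectorLogger exists only so the behavior can be observed. Only the names s_ort_logger, ILogger, OrtLogCallback, and Manager come from the PR summary.

```cpp
#include <atomic>
#include <string>
#include <vector>

// Hypothetical stand-in for the SDK's logger interface (assumption).
struct ILogger {
    virtual ~ILogger() = default;
    virtual void Log(const std::string& message) = 0;
};

// Process-static pointer read by the log callback.
// nullptr means "no live logger", so late messages are dropped.
static std::atomic<ILogger*> s_ort_logger{nullptr};

// Simplified callback signature (assumption): forwards to the current
// logger if one is still registered, otherwise does nothing.
void OrtLogCallback(const char* message) {
    ILogger* logger = s_ort_logger.load();
    if (logger == nullptr) {
        return;  // log arrived after Manager teardown; ignore it
    }
    logger->Log(message);
}

// Illustrative logger that records messages in memory.
struct VectorLogger : ILogger {
    std::vector<std::string> lines;
    void Log(const std::string& message) override { lines.push_back(message); }
};

class Manager {
public:
    explicit Manager(ILogger* logger) : logger_(logger) {
        // ... other construction work would go here ...
        s_ort_logger.store(logger_);  // publish at the end of the ctor
    }

    ~Manager() {
        s_ort_logger.store(nullptr);  // clear first, before other teardown
        // ... remaining teardown would go here ...
    }

private:
    ILogger* logger_;  // not owned in this sketch
};
```

In this sketch, a message logged while a Manager is alive reaches the logger, and a message logged after the Manager is destroyed is silently dropped instead of dereferencing a stale pointer. Note that the pattern only covers callbacks that start after the pointer is cleared; a callback already in flight on another thread during destruction would need additional synchronization.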

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| sdk_v2/cpp/src/manager.cc | Introduces s_ort_logger atomic; OrtLogCallback reads it; passes nullptr to CreateEnvWithCustomLogger; sets it at end of ctor and clears it at start of dtor. |
| sdk_v2/cpp/src/ep_detection/cuda_ep_bootstrapper.cc | Adds FOUNDRY_LOCAL_CUDA_EP_LIBRARY override to register a custom CUDA provider before the normal download path. |
| sdk_v2/cpp/src/ep_detection/webgpu_ep_bootstrapper.cc | Adds FOUNDRY_LOCAL_WEBGPU_EP_LIBRARY override to register a custom WebGPU provider before the normal download path. |
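The env-var override in the two bootstrappers might look roughly like the sketch below. The helper name GetEpLibraryOverride is hypothetical and the actual registration call is not shown, since the PR summary does not spell it out; only the environment variable names come from the PR. Treating an empty value as "not set" is also an assumption of this sketch.

```cpp
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical helper: returns the custom library path if the given
// environment variable is set to a non-empty value, otherwise nullopt.
std::optional<std::string> GetEpLibraryOverride(const char* env_var_name) {
    const char* value = std::getenv(env_var_name);
    if (value == nullptr || *value == '\0') {
        return std::nullopt;  // unset or empty: use the normal path
    }
    return std::string(value);
}
```

A bootstrapper could call GetEpLibraryOverride("FOUNDRY_LOCAL_CUDA_EP_LIBRARY") (or the WebGPU variable) first, register the returned path directly when one is present, and fall through to the normal download/verify flow otherwise.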

@skottmckay skottmckay merged commit 0fdabf3 into main Jun 25, 2026
47 checks passed
@skottmckay skottmckay deleted the skottmckay/FixShutdownIssueAndImproveDebuggability branch June 25, 2026 03:52