fix(cuda_std): align WarpShuffleMode discriminants with LLVM intrinsics #360

Open
Snehal-Reddy wants to merge 1 commit into Rust-GPU:main from Snehal-Reddy:fix/warp-shuffle-mode

Conversation

@Snehal-Reddy
Contributor

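The `WarpShuffleMode` enum previously relied on implicit discriminants (`Up = 0`, `Down = 1`, `Idx = 2`, `Xor = 3`), which do not match the mode values expected by the `llvm.nvvm.shfl.sync.i32` intrinsic (`Idx = 0`, `Up = 1`, `Down = 2`, `Bfly = 3`). Only `Xor` happened to line up (with `Bfly = 3`); the other three variants mapped to the wrong shuffle mode. This PR aligns the discriminants with the intrinsic's values.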

Fixes #359
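
For illustration, here is a minimal sketch of what the aligned enum could look like. It is not the actual diff: the `#[repr(u32)]` attribute, the derives, and the `mode_operand` helper are assumptions made for the example, and only the variant names and mode values come from the description above.

```rust
/// Sketch of a shuffle-mode enum whose discriminants are pinned to the
/// mode values listed above for `llvm.nvvm.shfl.sync.i32`.
/// `#[repr(u32)]` and the derives are assumptions for this example.
#[repr(u32)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum WarpShuffleMode {
    /// Read from a lane with a lower ID (mode 1).
    Up = 1,
    /// Read from a lane with a higher ID (mode 2).
    Down = 2,
    /// Read from an explicitly indexed lane (mode 0).
    Idx = 0,
    /// Butterfly (XOR) exchange, called `Bfly` on the intrinsic side (mode 3).
    Xor = 3,
}

/// Hypothetical helper: the integer that would be passed as the
/// intrinsic's mode operand.
pub fn mode_operand(mode: WarpShuffleMode) -> u32 {
    mode as u32
}
```

With explicit discriminants, the declaration order of the variants no longer affects the value passed to the intrinsic, so reordering or adding variants later cannot silently reintroduce the mismatch.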

Development

Successfully merging this pull request may close these issues.

Incorrect WarpShuffleMode discriminants mapping to wrong shuffle instructions

2 participants