cuda::std::simd Always use _CCCL_HOST_DEVICE_API #9191

Merged
fbusato merged 1 commit into NVIDIA:main from fbusato:fix-simd-cccl-api
May 30, 2026

Conversation

@fbusato fbusato commented May 29, 2026


Description

Conservatively use _CCCL_HOST_DEVICE_API (no Tile) for now.

@fbusato fbusato self-assigned this May 29, 2026
@fbusato fbusato requested a review from a team as a code owner May 29, 2026 19:14
@fbusato fbusato added the libcu++ For all items related to libcu++ label May 29, 2026
@fbusato fbusato added this to CCCL May 29, 2026
@fbusato fbusato requested a review from davebayer May 29, 2026 19:14
@github-project-automation github-project-automation Bot moved this to Todo in CCCL May 29, 2026
@fbusato fbusato moved this from Todo to In Review in CCCL May 29, 2026
coderabbitai Bot commented May 29, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: db498d27-eda1-4f34-ba57-dd2437aee42a

📥 Commits

Reviewing files that changed from the base of the PR and between 294326d and c2d9976.

📒 Files selected for processing (2)
  • libcudacxx/include/cuda/std/__simd/complex_math.h
  • libcudacxx/include/cuda/std/__simd/specializations/fixed_size_mask.h

📝 Walkthrough

Summary by CodeRabbit

  • Enhancements
    • SIMD complex math operations (real, imaginary, absolute value, argument, norm, conjugate, projection, and transcendental functions) now support execution on both host and device.
    • Mask storage operations enhanced for improved device code compatibility.

Walkthrough

This PR updates API annotations across SIMD complex math operations and mask storage to use _CCCL_HOST_DEVICE_API instead of _CCCL_API. The change applies consistently to unary and binary helper functors, their generator mechanics, all public SIMD vector entry points, and the fixed-size mask accessor, enabling these operations to execute on both host and device without altering function signatures, logic, or return types.
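As a rough illustration of what such an annotation change looks like, here is a minimal sketch. It does not reproduce the actual libcu++ sources: `EXAMPLE_HOST_DEVICE_API`, `example_complex`, `example_vec4`, `example_fn_norm`, and `example_norm_lane` are all hypothetical names, and the macro is only an assumed stand-in for a host/device annotation, not the real definition of `_CCCL_HOST_DEVICE_API`.

```cpp
// Hypothetical stand-in for a host/device annotation macro. This is NOT the
// real definition of _CCCL_HOST_DEVICE_API; it only mimics the general idea.
#if defined(__CUDACC__)
#  define EXAMPLE_HOST_DEVICE_API __host__ __device__
#else
#  define EXAMPLE_HOST_DEVICE_API
#endif

// Hypothetical complex value and 4-lane vector, used only for illustration.
struct example_complex
{
  float re;
  float im;
};

struct example_vec4
{
  example_complex lanes[4];

  // Lane accessor carrying the annotation, so it is usable from host and device code.
  EXAMPLE_HOST_DEVICE_API constexpr example_complex get(int idx) const noexcept
  {
    return lanes[idx];
  }
};

// Unary helper functor: only the annotation on operator() differs between
// a host-only and a host/device build; signature and logic stay the same.
struct example_fn_norm
{
  EXAMPLE_HOST_DEVICE_API constexpr float operator()(example_complex z) const noexcept
  {
    return z.re * z.re + z.im * z.im;
  }
};

// Entry point that applies the functor to a single lane.
EXAMPLE_HOST_DEVICE_API constexpr float example_norm_lane(const example_vec4& v, int idx) noexcept
{
  return example_fn_norm{}(v.get(idx));
}
```

With a plain host compiler the macro expands to nothing and the code behaves like ordinary C++. When compiled as CUDA, the same declarations would also be callable from device code, with no change to signatures or return types.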

Changes

SIMD Host-Device Callability

| Layer / File(s) | Summary |
| --- | --- |
| SIMD Complex Unary Operations: libcudacxx/include/cuda/std/__simd/complex_math.h | Unary helper functors (__fn_real, __fn_imag, __fn_abs, __fn_arg, __fn_norm, __fn_conj, __fn_proj, __fn_exp, __fn_log, __fn_log10, __fn_sqrt, __fn_sin, __fn_asin, __fn_cos, __fn_acos, __fn_tan, __fn_atan, __fn_sinh, __fn_asinh, __fn_cosh, __fn_acosh, __fn_tanh, __fn_atanh) and internal generator __gen_complex_apply_unary update their operator() annotations. Real-valued vector functions (real, imag, abs, arg, norm) and complex-valued vector functions (conj, proj, exp, log, log10, sqrt, sin, asin, cos, acos, tan, atan, sinh, asinh, cosh, acosh, tanh, atanh) update their declarations. All preserve template constraints, return types, and generator-based construction. |
| SIMD Complex Binary Operations: libcudacxx/include/cuda/std/__simd/complex_math.h | Binary helper functors __fn_polar_binary and __fn_pow_binary, and entry points polar(const basic_vec<_Tp, _Abi>&, const basic_vec<_Tp, _Abi>& = {}) and pow(const basic_vec<_Tp, _Abi>&, const basic_vec<_Tp, _Abi>&) update annotations while maintaining signatures and generator wiring. |
| Fixed-Size Mask Storage Accessor: libcudacxx/include/cuda/std/__simd/specializations/fixed_size_mask.h | The __mask_storage<_Bytes, __fixed_size<_Np>>::__get(__simd_size_type __idx) const noexcept accessor updates its attribute macro. |

Possibly Related PRs

  • NVIDIA/cccl#8475: Introduced SIMD complex math operations (cuda::std::simd Complex), which are the targets of these annotation updates.

Suggested Reviewers

  • wmaxey
  • bernhardmgruber
  • pciolkosz

Comment @coderabbitai help to get the list of available commands and usage tips.

@fbusato fbusato enabled auto-merge (squash) May 29, 2026 19:27
@github-actions


🥳 CI Workflow Results

🟩 Finished in 1h 21m: Pass: 100%/116 | Total: 1d 09h | Max: 51m 49s | Hits: 97%/348808

See results here.

@fbusato fbusato merged commit fb8629d into NVIDIA:main May 30, 2026
137 of 138 checks passed
shwina pushed a commit to shwina/cccl that referenced this pull request Jun 1, 2026

Labels

libcu++ For all items related to libcu++

Projects

Archived in project

3 participants