2.1.0 rc rebase1 by ktangsali · Pull Request #1673 · NVIDIA/physicsnemo

ktangsali · 2026-05-27T00:12:00Z

PhysicsNeMo Pull Request

Description

Rebase RC into main

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.
The CHANGELOG.md is up to date with these changes.
An issue is linked to this pull request.
If I am implementing a new model or modifying any existing model, I have followed the Models Implementation Coding Standards.

Dependencies

Review Process

All PRs are reviewed by the PhysicsNeMo team before merging.

Depending on which files are changed, GitHub may automatically assign a maintainer for review.

We are also testing AI-based code review tools (e.g., Greptile), which may add automated comments with a confidence score.
This score reflects the AI’s assessment of merge readiness and is not a qualitative judgment of your work, nor is
it an indication that the PR will be accepted / rejected.

AI-generated feedback should be reviewed critically for usefulness.
You are not required to respond to every AI comment, but they are intended to help both authors and reviewers.
Please react to Greptile comments with 👍 or 👎 to provide feedback on their accuracy.

copy-pr-bot · 2026-05-27T00:12:04Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

ktangsali · 2026-05-27T00:20:05Z

/blossom-ci

ktangsali · 2026-05-27T01:28:23Z

Blossom CI passed (see report here), and GitHub CI passed too.

* add fixes for the nvfuser bug
* test(natten): narrow CPU-backward skip to FlexAttention NotImplementedError

  Cherry-picked test/nn/functional/test_natten.py from upstream commit 7f2451a ("Ci deps group (#1634)"). The previous device == "cpu" early-skip was too broad; this wraps the forward call and only skips on the specific NotImplementedError raised by FlexAttention's CPU-backward guard. If natten picks a different backend (or FlexAttention ever supports CPU backward), the test will run.
* black formatting

---------

Co-authored-by: Corey adams <6619961+coreyjadams@users.noreply.github.com>
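The narrowed skip described in the natten commit message can be sketched roughly as follows. This is a minimal illustration that uses only the standard library's `unittest`; `run_forward_backward`, its `backend` argument, and the error message are hypothetical stand-ins and are not taken from test/nn/functional/test_natten.py.

```python
import unittest


def run_forward_backward(device: str, backend: str) -> float:
    """Hypothetical stand-in for the attention forward/backward call."""
    if device == "cpu" and backend == "flex":
        # Assumed behaviour: this backend refuses CPU backward.
        raise NotImplementedError("CPU backward is not supported")
    return 1.0


def checked_call(device: str, backend: str) -> float:
    """Skip only on the specific NotImplementedError, not on every CPU run."""
    try:
        return run_forward_backward(device, backend)
    except NotImplementedError as exc:
        # Turn the known limitation into a skip; any other error propagates.
        raise unittest.SkipTest(f"backend limitation: {exc}") from exc
```

Compared with an unconditional skip on CPU, this shape lets the test body run whenever the call succeeds, which is the behaviour the commit message describes.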

…1642)

* add missing dependencies for examples
* Replace np.infty with np.inf based on new API
* add some more missing dependencies
* update use of FusedAdam with native torch's Adam
* add tensorboard deps
* fix ci issues
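NumPy 2.0 removed the `np.infty` alias, so the replacement listed above is a mechanical rename. A minimal sketch of such a rewrite is shown below; `replace_infty` is a hypothetical helper written for illustration (it assumes the conventional `np` import name) and is not part of this PR.

```python
import re

# Matches the alias `np.infty` as a whole token (assumes the usual `np` import name).
_INFTY = re.compile(r"\bnp\.infty\b")


def replace_infty(source: str) -> str:
    """Rewrite `np.infty` to `np.inf`, leaving existing `np.inf` untouched."""
    return _INFTY.sub("np.inf", source)
```

For example, `replace_infty("best = np.infty")` returns `"best = np.inf"`, while text that already uses `np.inf` is returned unchanged.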

* Refactor weight initialization to use PyTorch's trunc_normal_ directly
  - Updated internal weight initialization in distributed AFNO layers and EarthAttention blocks to utilize `torch.nn.init.trunc_normal_` instead of legacy implementations.
  - Deprecated `trunc_normal_` wrapper in `physicsnemo.nn.module.utils` and removed the in-tree legacy implementation.
  - Regenerated forward-accuracy reference outputs for several models to align with the new initialization method.
  - Updated tests to skip on PyTorch versions below 2.12 due to changes in RNG algorithms affecting output consistency.
* fix doctest for dit layers

---------

Co-authored-by: Kaustubh Tangsali <ktangsali@nvidia.com>
Co-authored-by: Kaustubh Tangsali <71059996+ktangsali@users.noreply.github.com>
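The commit message does not show how the `trunc_normal_` wrapper was deprecated. One common pattern is a thin wrapper that emits a `DeprecationWarning` and forwards to the replacement, sketched here under that assumption. `deprecated_alias` and `_stand_in_init` are hypothetical names; in the real code the target would presumably be `torch.nn.init.trunc_normal_`, which is replaced by a plain-Python placeholder here to keep the example self-contained.

```python
import warnings
from typing import Callable


def deprecated_alias(target: Callable, old_name: str, new_name: str) -> Callable:
    """Return a wrapper that warns, then forwards every argument to `target`."""

    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name} is deprecated; use {new_name} instead.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not the wrapper
        )
        return target(*args, **kwargs)

    return wrapper


def _stand_in_init(values, std=1.0):
    """Hypothetical placeholder for the real initializer."""
    return [v * std for v in values]


# Old name kept importable, but every call now warns and delegates.
trunc_normal_ = deprecated_alias(
    _stand_in_init, "trunc_normal_", "torch.nn.init.trunc_normal_"
)
```

Keeping the old name as a forwarding wrapper lets existing callers continue to work while steering them to the upstream function.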

* Update shard tensor ring attention to use the expected default ring attention results directory
* Update examples/minimal/ShardTensorExamples/6_ring_attention/README.md

  Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* Update examples/minimal/ShardTensorExamples/6_ring_attention/benchmark_sharded_attention.py

  Co-authored-by: Negin Sobhani <negin513@gmail.com>

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Negin Sobhani <negin513@gmail.com>

…sion (#1658)

* Update tensordict dependency constraints and add regression tests for Mesh under torch.compile
  - Adjusted the tensordict dependency in pyproject.toml to be upper-bounded due to regressions in version 0.12.x, with a note to drop the upper bound once the related PR is merged.
  - Introduced a new test file for regression testing of the Mesh class to ensure compatibility with torch.compile, specifically addressing issues caused by the tensordict 0.12.x changes. The tests validate that cached properties and data fields behave correctly when compiled.
* Update CHANGELOG and bump mlflow and starlette versions
  - Added a new entry in CHANGELOG detailing the fix for constructing a Mesh inside a torch.compile-traced function, addressing regressions from tensordict 0.12.0.
  - Updated the mlflow and starlette package versions to 3.12.0 and 0.52.1 respectively, along with their corresponding source distribution and wheel URLs.
  - Adjusted tensordict dependency constraints to ensure compatibility with the latest changes.
* format

ktangsali · 2026-05-27T01:59:14Z

/blossom-ci

ktangsali requested review from CharlelieLrt, NickGeneva, RishikeshRanade, coreyjadams, dallasfoster, loliverhennigh, mnabian, peterdsharpe and pzharrington as code owners May 27, 2026 00:12

NickGeneva approved these changes May 27, 2026

ktangsali and others added 17 commits May 27, 2026 01:51

Update version

0412ac6

Bump up package versions to fix CVEs (#1649)

f958017

* bump up package versions to fix cves
* fix greptile comments
* update

UV Security fixes (#1650)

0a9093b

* fix cve in uv

Security fixes (#1652)

ac9bd5b

* few-more-security-fixes

Update Jupyter notebook kernel specification to use '.venv' and 'python3' for improved environment consistency. (#1653)

053fdd2

Update cuml-cu13 version in requirements.txt (#1659)

d394b27

Fix zarr v2 API and update to zarr v3 (#1661)

e6ecc1b

Add tensorboard, matplotlib, and termcolor to requirements (#1664)

4afe6e9

Fix Warp mesh Poisson residual conflicts

b5b156e

Prune residual accepted-point conflicts before returning Warp mesh Poisson samples.
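The pruning step named in this commit can be illustrated with a small sketch. It assumes that a "conflict" means two accepted samples lying closer than the Poisson-disk minimum distance; that reading, the `prune_conflicts` name, and the quadratic pure-Python loop are all assumptions for illustration and do not reflect the Warp implementation.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def prune_conflicts(points: Sequence[Point], min_dist: float) -> List[Point]:
    """Greedily keep points that are at least `min_dist` from every kept point."""
    kept: List[Point] = []
    for p in points:
        # Accept p only if it does not conflict with any earlier survivor.
        if all(math.dist(p, q) >= min_dist for q in kept):
            kept.append(p)
    return kept
```

Because the pass is greedy and order-dependent, earlier samples win ties. A production version would likely use a spatial grid or hash instead of the O(n²) comparison shown here.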

np.infty -> np.inf (#1668)

240e93d

Gnn recipes bug fixes (#1672)

2e76b7a

* gnn recipes bug fixes
* minor fixes

update versions

054a461

Move main-side CHANGELOG entries from [2.1.0] to [2.2.0]

bf1a3ba

ktangsali force-pushed the 2.1.0-rc-rebase1 branch from c4386f7 to bf1a3ba May 27, 2026 01:58

ktangsali merged commit e579a9f into main May 27, 2026
4 checks passed

ktangsali deleted the 2.1.0-rc-rebase1 branch May 27, 2026 03:05

ktangsali merged 18 commits into main from 2.1.0-rc-rebase1

6 participants