feat: Enable TP>1 in Knowledge Distillation #1297

Merged
thomasdhc merged 2 commits into NVIDIA-NeMo:main from Separius:patch-1 on Feb 17, 2026

@Separius
Contributor

Gather tensors in the KD loss so that knowledge distillation (KD) works with tensor parallelism (TP) > 1.
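
A minimal sketch of why a gather can matter here. It assumes that tensor parallelism splits the logits along the vocabulary dimension, which the PR does not state explicitly. The helper names (`kd_loss`, `shard`, `gather`) and the forward-KL form of the loss are hypothetical illustrations in plain Python, not the code in `kd_loss.py`.

```python
import math


def softmax(logits):
    # Numerically stable softmax over one list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, temperature=1.0):
    # Forward KL(teacher || student) for one token (hypothetical loss form).
    p = softmax([x / temperature for x in teacher_logits])
    q = softmax([x / temperature for x in student_logits])
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))


def shard(logits, tp_size):
    # Simulate TP ranks each holding a contiguous slice of the vocabulary.
    step = len(logits) // tp_size
    return [logits[i * step:(i + 1) * step] for i in range(tp_size)]


def gather(shards):
    # Simulate an all-gather: concatenate the slices back into the full vocabulary.
    return [x for s in shards for x in s]


student = [2.0, 0.5, -1.0, 0.0]
teacher = [1.5, 1.0, -0.5, 0.2]
tp_size = 2

student_shards = shard(student, tp_size)
teacher_shards = shard(teacher, tp_size)

# Loss on the unsharded logits (what TP = 1 would compute).
reference = kd_loss(student, teacher)

# Gather first, then compute the loss: matches the reference.
gathered = kd_loss(gather(student_shards), gather(teacher_shards))

# Compute the loss on each local slice: softmax normalizes over only part
# of the vocabulary, so the summed result differs from the reference.
per_shard = [kd_loss(s, t) for s, t in zip(student_shards, teacher_shards)]

print("reference:", reference)
print("gathered: ", gathered)
print("per-shard sum:", sum(per_shard))
```

Under these assumptions, only the gathered variant reproduces the single-device loss, because the softmax normalizer needs every vocabulary entry.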

gather tensors in kd loss to enable TP > 1 with KD
@akoumpa changed the title from "Enable TP>1 in Knowledge Distillation" to "feat: Enable TP>1 in Knowledge Distillation" on Feb 17, 2026
@akoumpa
Contributor

akoumpa commented Feb 17, 2026

/ok to test 606b2bb

Contributor

@akoumpa left a comment

Thanks a lot @Separius, looking great.

Contributor

@thomasdhc left a comment

Bypass codecov

@thomasdhc disabled auto-merge on February 17, 2026, 15:01
@thomasdhc merged commit 78083fe into NVIDIA-NeMo:main on Feb 17, 2026
50 of 51 checks passed
akoumpa added a commit that referenced this pull request Feb 25, 2026
* Update kd_loss.py

gather tensors in kd loss to enable TP > 1 with KD

* lint kd_loss.py

---------

Co-authored-by: Alexandros Koumparoulis <153118171+akoumpa@users.noreply.github.com>
linnanwang pushed a commit that referenced this pull request Apr 24, 2026
* Update kd_loss.py

gather tensors in kd loss to enable TP > 1 with KD

* lint kd_loss.py

---------

Co-authored-by: Alexandros Koumparoulis <153118171+akoumpa@users.noreply.github.com>
3 participants