Benchmark: Model benchmark - deterministic training support #731
Aishwarya-Tonpe wants to merge 1 commit into main
@microsoft-github-policy-service agree company="Microsoft"
Codecov Report
@@ Coverage Diff @@
## main #731 +/- ##
==========================================
- Coverage 85.70% 85.68% -0.03%
==========================================
Files 102 103 +1
Lines 7703 7886 +183
==========================================
+ Hits 6602 6757 +155
- Misses 1101 1129 +28
Thanks for addressing all the comments. Since this is a big PR, could we do an apples-to-apples comparison before merging it? For example,
Tested and compared all 3 items listed above. Looks good.
Copilot reviewed 17 out of 17 changed files in this pull request and generated 4 comments.
Force-pushed: 4c1ebf8 to dd0457d
Force-pushed: dd0457d to f2c7554
Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.
Force-pushed: f2c7554 to f831f73
Force-pushed: f831f73 to 840c62f
Copilot reviewed 18 out of 18 changed files in this pull request and generated 6 comments.
tests/benchmarks/model_benchmarks/test_pytorch_determinism_all.py (outdated)
Force-pushed: 840c62f to 181b9ad
Force-pushed: 181b9ad to 20c1fac
Copilot reviewed 18 out of 18 changed files in this pull request and generated 7 comments.
Force-pushed: 20c1fac to 2803619
Force-pushed: 2803619 to 34689f9
Copilot reviewed 18 out of 18 changed files in this pull request and generated 3 comments.
Force-pushed: 34689f9 to c163ddb
Force-pushed: c163ddb to b5ad62a
Copilot reviewed 18 out of 18 changed files in this pull request and generated 5 comments.
Force-pushed: b5ad62a to a6ce77c
Force-pushed: a6ce77c to 2b52174
Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.
Adds an opt-in deterministic training mode to SuperBench's PyTorch model benchmarks. When it is enabled with --enable-determinism, PyTorch deterministic algorithms are enforced, and per-step numerical fingerprints (loss, activation means) are recorded as metrics. These can be compared across runs using the existing sb result diagnosis pipeline to verify bit-exact reproducibility, which is useful for hardware validation and platform comparison.
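As a rough illustration of what enforcing deterministic algorithms typically involves in PyTorch, the sketch below seeds the RNGs and turns on PyTorch's deterministic switches. It is not the code from this PR: the helper name `enable_determinism` and the default seed are assumptions, and the `torch` import is made optional so the snippet also runs where PyTorch is not installed.

```python
import os
import random

try:
    import torch  # optional: without it, only Python's RNG is seeded
except ImportError:
    torch = None


def enable_determinism(seed=42):
    """Hypothetical helper: seed RNGs and request deterministic kernels."""
    # cuBLAS needs a fixed workspace size for deterministic results on
    # CUDA >= 10.2; this must be set before the first cuBLAS call.
    os.environ.setdefault('CUBLAS_WORKSPACE_CONFIG', ':4096:8')
    random.seed(seed)
    if torch is not None:
        torch.manual_seed(seed)  # seeds the CPU and CUDA generators
        # Raise an error when an op has no deterministic implementation.
        torch.use_deterministic_algorithms(True)
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False  # no autotuned kernel choice
    return seed
```

Calling such a helper once, before the model and data loaders are built, is the usual pattern; whether the PR applies the same set of switches is not stated in this conversation.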
Flags added -
--enable-determinism
--check-frequency: number of steps between metric recordings
--deterministic-seed
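To make the roles of --check-frequency and --deterministic-seed concrete, here is a toy, pure-Python stand-in for a training loop. The function name `run_with_fingerprints`, the fake loss formula, and the choice to record on steps divisible by the frequency are all assumptions for illustration, not SuperBench's actual implementation. The point is only that a fixed seed plus periodic recording yields per-step fingerprints that can be compared between runs.

```python
import random


def run_with_fingerprints(num_steps, check_frequency, seed):
    """Toy loop: record a 'loss' fingerprint every check_frequency steps."""
    rng = random.Random(seed)  # private RNG, so the seed fully fixes the run
    weight = 0.0
    fingerprints = {}
    for step in range(num_steps):
        grad = rng.uniform(-1.0, 1.0)  # stand-in for a real gradient
        weight -= 0.1 * grad
        loss = (weight - 0.5) ** 2  # stand-in for a real loss value
        if step % check_frequency == 0:
            fingerprints[step] = loss
    return fingerprints


run_a = run_with_fingerprints(num_steps=100, check_frequency=10, seed=42)
run_b = run_with_fingerprints(num_steps=100, check_frequency=10, seed=42)
print(sorted(run_a))   # the recorded steps: 0, 10, ..., 90
print(run_a == run_b)  # same seed and settings give identical fingerprints
```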
Changes -
Updated pytorch_base.py to handle deterministic settings and logging.
Added a new example script: pytorch_deterministic_example.py
Added a test file, test_pytorch_determinism_all.py, to verify that everything works as expected.
Usage -
Step 1: Run 1. Run with --enable-determinism; the necessary metrics are recorded in the results-summary.jsonl file.
Step 2: Generate the baseline file from the Run 1 results using sb result generate-baseline.
Step 3: Run 2. Run with --enable-determinism again, on a different machine (or the same machine); the metrics are again recorded in the results-summary.jsonl file.
Step 4: Run diagnosis on the results generated from the 2 runs using the sb result diagnosis command.
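The comparison in Step 4 is done by sb result diagnosis; the snippet below is not that command, only a sketch of what a bit-exact check between two runs amounts to. The metric names, the values, and the `diff_fingerprints` helper are made up for illustration.

```python
import struct


def float_bits(value):
    """Return the IEEE-754 double bit pattern of value as an int."""
    return struct.unpack('<Q', struct.pack('<d', value))[0]


def diff_fingerprints(baseline, candidate):
    """Return metric names that are missing or differ in any bit."""
    mismatches = []
    for name in sorted(set(baseline) | set(candidate)):
        if name not in baseline or name not in candidate:
            mismatches.append(name)
        elif float_bits(baseline[name]) != float_bits(candidate[name]):
            mismatches.append(name)
    return mismatches


# Hypothetical metric names and values, standing in for two runs' results.
run1 = {'loss_step_0': 2.3025851, 'loss_step_10': 1.9459101}
run2 = {'loss_step_0': 2.3025851, 'loss_step_10': 1.9459101 + 1e-12}
print(diff_fingerprints(run1, run1))  # []
print(diff_fingerprints(run1, run2))  # ['loss_step_10']
```

Comparing bit patterns rather than using a tolerance is what distinguishes a reproducibility check from an accuracy check: a difference of 1e-12 is numerically negligible but still shows that the two runs did not execute identically.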
Note -