Skip to content

STATISTICS-92: TruncatedNormalDistribution rejection-sampler uses incorrect bounds for mirroring checl#125

Merged
aherbert merged 3 commits intoapache:masterfrom
kevinmilner:STATISTICS-92-TruncatedNormal-rejection-sampler
Mar 16, 2026
Merged

STATISTICS-92: TruncatedNormalDistribution rejection-sampler uses incorrect bounds for mirroring checl#125
aherbert merged 3 commits intoapache:masterfrom
kevinmilner:STATISTICS-92-TruncatedNormal-rejection-sampler

Conversation

@kevinmilner
Copy link
Contributor

Fix for STATISTICS-92. This corrects the rejection sampler mirroring test in TruncatedNormalDistribution to use the standard normal bounds rather than the original bounds.

As is, when mean > 0, lower > 0, lower < mean, and upper > mean, samples are only returned that are > mean. For example, here is the distribution for:

ContinuousDistribution dist = TruncatedNormalDistribution.of(1, 0.1, 0.7, 1.3).createSampler(RandomSource.XO_RO_SHI_RO_128_PP.create(123456l));

trunc_dist_1 0_0 1_0 7_1 3

The black histogram is the current implementation and the orange histogram is with this fix.

And when mean > 0, lower > 0, lower < mean, and upper < mean, the sampler gets stuck in an infinite loop.

The same is true on the negative side, for example:

ContinuousDistribution dist = TruncatedNormalDistribution.of(-1, 0.1, -1.3, -0.7).createSampler(RandomSource.XO_RO_SHI_RO_128_PP.create(123456l));

trunc_dist_-1 0_0 1_-1 3_-0 7

The fix is simple. I also added tests that detect the infinite loop cases with a timeout of 1 second; they fail with the current implementation, and pass with the fix. I could add tests for the other cases that don't trigger the infinite loop, for example, to ensure that values on either side of the mean are correctly sampled, but I'm not sure your policy on tests that rely on randomness (even with fixed seeds). The tests that I added demonstrate the issue and cannot suceed with the current implementation. Let me know if you would like any changes!

@aherbert
Copy link
Contributor

Thanks for the fix. I added some test files to the JIRA report that fail the current build. Can you add them to this PR please.

@codecov-commenter
Copy link

codecov-commenter commented Mar 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.70%. Comparing base (d524259) to head (cc66e00).
⚠️ Report is 200 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master     #125      +/-   ##
============================================
+ Coverage     99.62%   99.70%   +0.08%     
+ Complexity     2441      647    -1794     
============================================
  Files           116      120       +4     
  Lines          6087     6403     +316     
  Branches        951      967      +16     
============================================
+ Hits           6064     6384     +320     
+ Misses           10        6       -4     
  Partials         13       13              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@kevinmilner
Copy link
Contributor Author

Added, thanks @aherbert!

@aherbert aherbert merged commit 3c2edbc into apache:master Mar 16, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants