Fix serial-path reproducibility for L'Ecuyer-CMRG (list) seeds #89

Open
mronkko wants to merge 1 commit into philchalmers:main from mronkko:fix-serial-iseed-reproducibility

@mronkko mronkko commented Jun 5, 2026

Contributor

Summary

set_seed() did not actually apply list-type (L'Ecuyer-CMRG) seeds on the serial (non-parallel) execution path, so seeded runs were not reproducible when parallel = FALSE.

The helper assigned the seed to its local function frame:

set_seed <- function(seed){
    if(is.list(seed)) .Random.seed <- seed[[1L]]   # local binding -> discarded on return
    else set.seed(seed)
    invisible(NULL)
}

R's RNG only ever consults globalenv()$.Random.seed, so this assignment had no effect and the L'Ecuyer-CMRG state was silently dropped. The serial loop then ran off the ambient global RNG state.
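The scoping problem can be reproduced in base R without the package. The sketch below is illustrative only: `set_seed_local()` and `set_seed_global()` are hypothetical stand-ins for the helper before and after the change, not the package's own code.

```r
# Hypothetical stand-ins for the helper (names are assumptions for illustration).
set_seed_local <- function(seed) {
    .Random.seed <- seed   # binds in the function frame; discarded on return
    invisible(NULL)
}
set_seed_global <- function(seed) {
    assign(".Random.seed", seed, envir = globalenv())  # the binding the RNG reads
    invisible(NULL)
}

old_kind <- RNGkind("L'Ecuyer-CMRG")   # switch generator, remembering the old kinds
set.seed(1234)
state <- get(".Random.seed", envir = globalenv())  # a valid L'Ecuyer-CMRG state

set_seed_local(state);  a1 <- runif(3)   # state NOT reset: draws continue the stream
set_seed_local(state);  a2 <- runif(3)
set_seed_global(state); b1 <- runif(3)   # state reset: draws restart from `state`
set_seed_global(state); b2 <- runif(3)

identical(a1, a2)  # FALSE: the local assignment had no effect
identical(b1, b2)  # TRUE: the global assignment restored the state

do.call(RNGkind, as.list(old_kind))    # restore the caller's RNG kinds
```

Here `a1` happens to equal `b1` only because `state` was captured immediately before the first draw; in a real session the ambient state would differ from run to run.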

This affected:

  • runArraySimulation(..., iseed = <int>, parallel = FALSE): genSeeds() produces a length-1 list holding a 7-integer L'Ecuyer state, which was dropped.
  • runSimulation(seed = genSeeds(design, iseed = ...), parallel = FALSE): same list-seed path.

The parallel path was unaffected, because it installs the seed via clusterSetRNGSubStream() / clusterSetRNGStream(), which assign() into the workers' .GlobalEnv.

Fix

One-line change: assign into .GlobalEnv (matching the idiom already used at analysis.R:21 and util.R:711):

if(is.list(seed)) .GlobalEnv$.Random.seed <- seed[[1L]]

Setting a valid L'Ecuyer-CMRG vector as the global .Random.seed also switches the active generator automatically, since R derives the RNG kind from .Random.seed[1]. No explicit RNGkind() call is needed (the array path already sets it).
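The automatic generator switch can be checked in base R. This is a minimal sketch under the assumption that the session may start on any generator; the variable names are illustrative and nothing here calls the package.

```r
old_kind <- RNGkind("L'Ecuyer-CMRG")   # temporarily switch to capture a valid state
set.seed(2026)
lecuyer_state <- get(".Random.seed", envir = globalenv())

RNGkind("Mersenne-Twister")            # move to a different generator
kind_before <- RNGkind()[1L]

# Install the captured state globally, with no RNGkind() call.
assign(".Random.seed", lecuyer_state, envir = globalenv())
x <- runif(1)                          # draw from the installed state
kind_after <- RNGkind()[1L]

kind_before  # "Mersenne-Twister"
kind_after   # "L'Ecuyer-CMRG"

do.call(RNGkind, as.list(old_kind))    # restore the caller's RNG kinds
```

The first element of the state vector encodes the generator, so a state captured under one kind carries that kind with it when it is installed.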

Verification

Before the fix, two serial runs with the same iseed produced different per-replication draws; after the fix they are byte-identical (and parallel runs remain reproducible as before):

| Mode | Before | After |
|---|---|---|
| parallel = FALSE | not reproducible | reproducible |
| parallel = TRUE | reproducible | reproducible |

A regression test asserting serial-path reproducibility of runArraySimulation(..., iseed, parallel = FALSE) is added to tests/tests/test-03-array.R.
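The shape of such a check can be sketched in base R. This is not the test added in the PR: `serial_run()` is a hypothetical stand-in for one serial run, and the seed is built by hand rather than with genSeeds().

```r
# Hypothetical stand-in for one serial run: install a list-type seed, then draw.
serial_run <- function(seed, n = 5L) {
    .GlobalEnv$.Random.seed <- seed[[1L]]   # same idiom as the fix
    rnorm(n)
}

old_kind <- RNGkind("L'Ecuyer-CMRG")
set.seed(42)
# A length-1 list holding a 7-integer state, mirroring the list-seed shape above.
seed <- list(get(".Random.seed", envir = globalenv()))

first <- serial_run(seed)
invisible(runif(100))                       # perturb the ambient RNG state
second <- serial_run(seed)

stopifnot(identical(first, second))         # fails if the seed is not applied

do.call(RNGkind, as.list(old_kind))         # restore the caller's RNG kinds
```

Perturbing the ambient state between the two runs makes it explicit that reproducibility comes from the seed and not from whatever state the session happens to be in.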

🤖 Generated with Claude Code

set_seed() assigned `.Random.seed` to its local function frame rather than
.GlobalEnv when given a list-type (L'Ecuyer-CMRG) seed. R's RNG only consults
globalenv()$.Random.seed, so the seed state was silently discarded on the
serial (non-parallel) path. This made runArraySimulation(..., iseed,
parallel = FALSE) and runSimulation(seed = <genSeeds() list>, parallel = FALSE)
non-reproducible across runs, while the parallel path (which installs the seed
via clusterSetRNGSubStream() into the workers' global env) was unaffected.

Assigning into .GlobalEnv installs a valid L'Ecuyer-CMRG state; R derives the
RNG kind from .Random.seed[1], so the generator switches automatically.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>