Switch to larger Helios instances for CI #10018

Merged
emilyalbini merged 1 commit into main from ea-large-helios
Mar 10, 2026

Member

@emilyalbini commented Mar 10, 2026

This PR switches the slowest Helios jobs to larger instance sizes, significantly speeding up CI. Alongside this, I have already deployed a Buildomat configuration change that runs all Helios jobs on Zen 4 AWS instances, instead of Zen 3 instances on either AWS or lab Gimlets. Together, these two changes should bring CI times down considerably.

Unfortunately we cannot use Zen 5 AWS instances (like we did on Linux) until oxidecomputer/stlouis#938 is fixed.

build-and-test (helios)

| Architecture | CPU cores | RAM | Execution time | Price per build |
|---|---|---|---|---|
| Old: Zen 3 | 8 | 32 GB | 135 minutes | $1.04 |
| Zen 4 | 8 | 32 GB | 110 minutes | $0.85 |
| Zen 4 | 16 | 128 GB | 58 minutes | $1.24 |
| New: Zen 4 | 32 | 256 GB | 45 minutes | $1.87 |

This job was being slowed down by a nondeterministic amount because it ran out of memory and was forced to aggressively page memory to disk. It turns out it was using around 150% of the RAM allocated to the VM. Switching to memory-optimized AWS instances (2x the RAM) fixed the problem.
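The "2x the RAM" figure can be cross-checked against the table. The following is a minimal sketch that computes RAM per core for each tested configuration; the row values are copied from the table, while the variable and function names are purely illustrative.

```python
# RAM-per-core ratio for each configuration in the build-and-test table.
# (label, cpu_cores, ram_gb) are copied from the table; names are illustrative.
configs = [
    ("Zen 3, 8 cores", 8, 32),
    ("Zen 4, 8 cores", 8, 32),
    ("Zen 4, 16 cores", 16, 128),
    ("Zen 4, 32 cores", 32, 256),
]


def ram_per_core(cores: int, ram_gb: int) -> float:
    """GB of RAM available per CPU core."""
    return ram_gb / cores


ratios = {label: ram_per_core(cores, ram) for label, cores, ram in configs}

for label, ratio in ratios.items():
    print(f"{label}: {ratio:.0f} GB per core")
```

The 16-core and 32-core rows come out at 8 GB per core against 4 GB per core for the 8-core rows, which is consistent with the memory-optimized instances offering twice the RAM for the same core count.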

The switch from 16 to 32 cores is fairly expensive and has diminishing returns, as it did for the Linux instance, but it is still a win of about 13 minutes (58 to 45 minutes). Once we switch to Zen 5 it might be worth going back to 16 cores.
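One way to put a number on the diminishing returns is the extra price paid per minute saved at each size step. This is a rough sketch using only the Zen 4 rows of the table above; the helper name is hypothetical, and the 8 to 16 core step also changes the RAM-per-core ratio, so it is not a pure core-count comparison.

```python
# (cpu_cores, minutes, price_usd) for the three Zen 4 rows of the table.
zen4_rows = [
    (8, 110, 0.85),
    (16, 58, 1.24),
    (32, 45, 1.87),
]


def marginal_cost_per_minute(smaller, larger):
    """Extra dollars per build paid for each minute saved by the larger size."""
    _, min_small, price_small = smaller
    _, min_large, price_large = larger
    return (price_large - price_small) / (min_small - min_large)


steps = {}
for smaller, larger in zip(zen4_rows, zen4_rows[1:]):
    key = f"{smaller[0]}->{larger[0]}"
    steps[key] = marginal_cost_per_minute(smaller, larger)
    print(f"{key} cores: ${steps[key]:.4f} per minute saved")
```

With these figures, each minute saved by going from 16 to 32 cores costs more than six times as much as a minute saved by going from 8 to 16 cores.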

omicron-common

| Architecture | CPU cores | RAM | Execution time | Price per build |
|---|---|---|---|---|
| Old: Zen 3 | 8 | 32 GB | 6 minutes | $0.05 |
| New: Zen 4 | 8 | 32 GB | 5 minutes | $0.04 |

The switch has negligible impact on a job this short, but it's not worth creating a dedicated target just to keep this job on Zen 3, so it moves to Zen 4 along with everything else.

helios / package

| Architecture | CPU cores | RAM | Execution time | Price per build |
|---|---|---|---|---|
| Old: Zen 3 | 8 | 32 GB | 46 minutes | $0.35 |
| Zen 4 | 8 | 32 GB | 30 minutes | $0.23 |
| New: Zen 4 | 16 | 64 GB | 24 minutes | $0.37 |

The wins from 8 cores to 16 cores are not that impressive, but this job is a dependency of the "deploy" job which we cannot really speed up (it needs to run on a lab Gimlet, and we can't shard it as far as I'm aware), so any time we can shave is worth it.

helios / build TUF repo

| Architecture | CPU cores | RAM | Execution time | Price per build |
|---|---|---|---|---|
| Old: Zen 3 | 8 | 32 GB | 80 minutes | $0.62 |
| New: Zen 4 | 16 | 128 GB | 43 minutes | $0.92 |

Similarly to the build-and-test job, this one was paging memory to disk because the VM did not have enough memory (although to a lesser extent). After the size increase the job spent a lot of its time in single-threaded CPU work, so I didn't bother testing more cores.

check-features (helios)

| Architecture | CPU cores | RAM | Execution time | Price per build |
|---|---|---|---|---|
| Old: Zen 3 | 8 | 32 GB | 38 minutes | $0.29 |
| Zen 4 | 1 | 4 GB | 55 minutes | $0.04 |
| Zen 4 | 2 | 8 GB | 38 minutes | $0.07 |
| New: Zen 4 | 8 | 32 GB | 24 minutes | $0.19 |

This job was mostly single-threaded, so I tried aggressively reducing the VM size, but with mixed results. In the end I decided to keep it on the now-Zen 4 standard target.
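The "mostly single-threaded" observation can be roughly quantified from the three Zen 4 rows with the Karp-Flatt metric, which estimates a job's serial fraction from a measured speedup. This is a back-of-the-envelope sketch, not something the measurements were designed for: it assumes the 1-core run is a fair baseline, that one run per size is representative, and that the smaller RAM sizes (4 GB and 8 GB) did not slow the small VMs down on their own.

```python
# (cpu_cores, minutes) for the Zen 4 rows of the check-features table.
runs = [(1, 55), (2, 38), (8, 24)]


def serial_fraction(speedup: float, cores: int) -> float:
    """Karp-Flatt estimate of the serial fraction from a measured speedup."""
    return (1 / speedup - 1 / cores) / (1 - 1 / cores)


baseline_minutes = runs[0][1]  # the single-core run
estimates = {}
for cores, minutes in runs[1:]:
    speedup = baseline_minutes / minutes
    estimates[cores] = serial_fraction(speedup, cores)
    print(f"{cores} cores: speedup {speedup:.2f}x, "
          f"serial fraction ~{estimates[cores]:.0%}")
```

Both data points put the serial fraction somewhat above one third, which under Amdahl's law would cap the achievable speedup below 3x regardless of core count.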

clippy (helios)

| Architecture | CPU cores | RAM | Execution time | Price per build |
|---|---|---|---|---|
| Old: Zen 3 | 8 | 32 GB | 26 minutes | $0.20 |
| New: Zen 4 | 8 | 32 GB | 18 minutes | $0.14 |
| Zen 4 | 16 | 64 GB | 17 minutes | $0.26 |

It turns out there was essentially no benefit (18 to 17 minutes) in going from 8 to 16 cores for this job.
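To see the overall effect across the six jobs listed in this description, the "Old" and "New" rows of each table can be tallied. The sketch below only restates the table figures; the tuple layout and function name are illustrative, and the price total covers just these six jobs, not a full CI run.

```python
# (job, old_minutes, old_price_usd, new_minutes, new_price_usd), taken from
# the "Old" and "New" rows of each table in this description.
jobs = [
    ("build-and-test (helios)", 135, 1.04, 45, 1.87),
    ("omicron-common", 6, 0.05, 5, 0.04),
    ("helios / package", 46, 0.35, 24, 0.37),
    ("helios / build TUF repo", 80, 0.62, 43, 0.92),
    ("check-features (helios)", 38, 0.29, 24, 0.19),
    ("clippy (helios)", 26, 0.20, 18, 0.14),
]


def time_saved_pct(old_minutes: int, new_minutes: int) -> float:
    """Percentage reduction in execution time."""
    return 100 * (old_minutes - new_minutes) / old_minutes


for name, old_min, _, new_min, _ in jobs:
    print(f"{name}: {time_saved_pct(old_min, new_min):.0f}% less time")

# Round the sums to cents to avoid floating-point noise in the totals.
old_total = round(sum(job[2] for job in jobs), 2)
new_total = round(sum(job[4] for job in jobs), 2)
print(f"Total price for these six jobs: ${old_total:.2f} -> ${new_total:.2f}")
```

On these numbers the combined price of the six jobs rises from $2.55 to $3.53 per run, while the longest job drops from 135 to 45 minutes.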

Collaborator

@davepacheco left a comment

Looks great (once you rebase onto #10001).

@emilyalbini enabled auto-merge (squash) March 10, 2026 18:01
@emilyalbini merged commit a4f3f9d into main Mar 10, 2026
16 checks passed
@emilyalbini deleted the ea-large-helios branch March 10, 2026 19:20