cranelift(aarch64): share base+index across offset loads via uimm12 amode by darmie · Pull Request #13766 · bytecodealliance/wasmtime

darmie · 2026-06-30T08:04:06Z

When several loads share base + (extended index) and differ only by a
uimm12-scaled offset (e.g. fields read through a computed index), AArch64 amode
selection emitted a separate add base, #offset per load, defeating CSE of the
shared base + index.

This folds the offset into the load's immediate and keeps base + index as one
value, so a single add base, index, uxtw is materialized and reused. A zero
offset keeps the single-instruction RegExtended form, so it is
neutral-or-better: a cluster of N such loads drops N address-adds to 1, a lone
load is unchanged.

Tested: new aarch64 precise-output filetest + a stack round-trip runtest
(aarch64/x86_64/s390x); the full aarch64 filetest and runtest suites pass.

…mode When several loads share `base + (extended index)` and differ only by a uimm12-scaled offset, keep `base + index` as one CSE-able value and fold each offset into the load's immediate, so a single `add base, index, uxtw` is reused instead of a per-load `add base, #offset`. A zero offset keeps the single-instruction RegExtended form.

The amode CSE folds constant offsets into load/store immediates and keeps base+index as one add. Single-use cases are neutral (offset moves into the immediate); debug-exceptions sheds 2 instructions by reusing base+index across four offsets.

alexcrichton · 2026-06-30T17:40:33Z

Thanks for this! I'm wary of the duplication being added here though, and amode rules are some of the most sensitive in Cranelift so we ideally want as little duplication as possible. For example the offset=0 rules are duplicates of above rules, and even using AMode.UnsignedOffset is a duplicate of prior rules as well. Would it be possible to shuffle rules around and add if-let clauses perhaps to tweak existing rules? For example rules could use if-let or some sort of extraction matcher to ensure that offset is 0. Either that or the rule using AMode.UnsignedOffset could be given higher priority.

Address review feedback: the offset==0 rules only existed to beat the new UnsignedOffset rules for a zero offset, duplicating rules 4/5. Replace them with a `uimm12_scaled_nonzero_from_i64` guard on rules 8/9 so a zero offset falls through to the existing RegExtended rules instead. The UnsignedOffset rules simply take higher priority than 4/5 for non-zero offsets. Codegen and all aarch64 filetests are unchanged.

darmie · 2026-07-01T06:49:55Z

For example the offset=0 rules are duplicates of above rules, and even using AMode.UnsignedOffset is a duplicate of prior rules as well. Would it be possible to shuffle rules around and add if-let clauses perhaps to tweak existing rules? For example rules could use if-let or some sort of extraction matcher to ensure that offset is 0. Either that or the rule using AMode.UnsignedOffset could be given higher priority.

I have removed the duplication. Also added uimm12_scaled_nonzero_from_i64 (+ its verifier spec) so rules 8/9's UnsignedOffset only matches non-zero offsets and takes higher priority than 4/5; a zero offset falls through to the existing RegExtended rules naturally.

github-actions · 2026-07-01T10:02:56Z

Subscribe to Label Action

cc @cfallin, @fitzgen

Details

This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "isle"

Thus the following users have been cc'd because of the following labels:

cfallin: isle
fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

alexcrichton · 2026-07-01T14:56:04Z

This helps remove some duplication, yeah, but this is still effectively a duplication of rule (2), however, and it's also a duplication of 3/4/5 in terms of matching pattern. My thinking is that what you want can be achieved by raising the priority of rule 2 because the premise here is that the static offset should have the highest priority for getting folded into the instruction itself while leaving all other computations in a CSE-able form.

darmie requested a review from a team as a code owner June 30, 2026 08:04

darmie requested review from alexcrichton and removed request for a team June 30, 2026 08:04

darmie requested a review from a team as a code owner June 30, 2026 08:25

github-actions Bot added cranelift Issues related to the Cranelift code generator cranelift:area:aarch64 Issues related to AArch64 backend. labels Jun 30, 2026

github-actions Bot added the isle Related to the ISLE domain-specific language label Jul 1, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

cranelift(aarch64): share base+index across offset loads via uimm12 amode#13766

cranelift(aarch64): share base+index across offset loads via uimm12 amode#13766
darmie wants to merge 3 commits into
bytecodealliance:mainfrom
darmie:aarch64-amode-shared-base-cse

darmie commented Jun 30, 2026

Uh oh!

alexcrichton commented Jun 30, 2026

Uh oh!

darmie commented Jul 1, 2026

Uh oh!

github-actions Bot commented Jul 1, 2026

Uh oh!

alexcrichton commented Jul 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

darmie commented Jun 30, 2026

Uh oh!

alexcrichton commented Jun 30, 2026

Uh oh!

darmie commented Jul 1, 2026

Uh oh!

github-actions Bot commented Jul 1, 2026

Subscribe to Label Action

Uh oh!

alexcrichton commented Jul 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants