Skip to content

Rec group builder#13687

Open
somdoron wants to merge 5 commits into
bytecodealliance:mainfrom
somdoron:rec-group-builder
Open

Rec group builder#13687
somdoron wants to merge 5 commits into
bytecodealliance:mainfrom
somdoron:rec-group-builder

Conversation

@somdoron

Copy link
Copy Markdown
Contributor

Implements the embedder API requested in issue #10176.

Embedders can currently only build one-off struct/array/func types, so types that reference
themselves or each other can't be constructed directly. This adds RecGroupBuilder: declare
kind-typed labels, use them as forward references, and build() the whole recursion group at
once. It lowers to module-canonical WasmSubTypes and reuses the existing rec-group
registration path.

Also fixes an unrelated pre-existing bug in StorageType::is_val_type (matched I16 instead of
ValType), in its own commit.

It previously matched the `I16` variant instead of `ValType`, so it
returned the wrong answer for every storage type. The method is
otherwise unused inside the tree, which is why this went unnoticed.
@somdoron somdoron requested review from a team as code owners June 18, 2026 13:08
@somdoron somdoron requested review from alexcrichton and pchickey and removed request for a team June 18, 2026 13:08
Adds `RecGroupBuilder`, which lets embedders declare kind-typed labels
(`PendingStructId`/`PendingArrayId`/`PendingFuncId`), use them as forward
references while defining other types via a small build-time "template"
family, and register the whole group at once with `build()`. This makes
it possible to construct self-referential and mutually-recursive
struct/array/func types directly from the embedder API, which previously
required plucking such types out of a module's imports/exports.

The builder lowers its members to module-canonical `WasmSubType`s (using
0-based `Module` indices for intra-group references and `Engine` indices
for already-registered types) and reuses the existing rec-group
registration path, so hash-consing, runtime canonicalization, supertype
lists, and GC layouts all come for free.

Implements the embedder API requested in bytecodealliance#10176.
@somdoron somdoron force-pushed the rec-group-builder branch from e4e67d9 to ff45908 Compare June 18, 2026 13:19
@github-actions github-actions Bot added wasmtime:api Related to the API of the `wasmtime` crate itself wasmtime:docs Issues related to Wasmtime's documentation labels Jun 18, 2026
@fitzgen fitzgen added the wasm-proposal:gc Issues with the implementation of the gc wasm proposal label Jun 18, 2026
@fitzgen fitzgen requested review from fitzgen and removed request for alexcrichton and pchickey June 18, 2026 16:57
@fitzgen

fitzgen commented Jun 18, 2026

Copy link
Copy Markdown
Member

@somdoron given that you just posted an AI plan verbatim in this PR's associated issue, I am assuming you also used AI to implement this PR. Can you confirm that you have already reviewed the AI tool's work in detail before posting it here and asking reviewers to spend their time on it? And are you taking full responsibility for the code in this PR?

https://github.com/bytecodealliance/governance/blob/main/AI_TOOL_POLICY.md

@somdoron

Copy link
Copy Markdown
Contributor Author

@fitzgen I confirmed that I reviewed the AI tool. I reviewed the code, and I'm taking full responsibility for the code in this PR. The PR is for a real problem I'm suffering from, and is focused and small.

@github-actions

Copy link
Copy Markdown

Subscribe to Label Action

cc @fitzgen

Details This issue or pull request has been labeled: "wasm-proposal:gc", "wasmtime:api", "wasmtime:docs"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: wasm-proposal:gc

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@fitzgen fitzgen left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, I want to say that this feature is going to involve a lot of API design, and probably won't land super quickly because of that. If you really need to be able to define rec groups in the host now, you can do that via building Modules (either via WAT strings or via wasm_encoder) that define the rec groups you need and export a global of each of the rec groups types so you can exfiltrate the GlobalType and the global type's ValType on the host via Module::exports. So that should unblock you for the time being, if necessary.

Some high-level thoughts on the design:

  • I don't think we need to distinguish between pending structs versus pending arrays at the type level. I don't think it is buying us much in terms of correctness by construction or whatever, but it makes ramping up on the API that much harder.

  • I don't love adding FieldTemplate/HeapTypeTemplate/etc... types to Wasmtime's public API. I think we could avoid that (and also not force callers to construct slices of them, potentially allocating when they otherwise wouldn't) by having a struct type builder that borrows the rec group builder and exposes a method to define fields one at a time, something like this:

    impl RecGroupBuilder<'_> {
        pub fn define_struct(&mut self, ty: PendingType) -> StructTypeBuilder<'_> { ... }
    }
    
    pub struct StructTypeBuilder<'a> {
        rec_group: &'a mut RecGroupBuilder,
        // ...
    }
    
    impl StructTypeBuilder<'_> {
        /// Define a normal, non-forward-reference field.
        pub fn field(&mut self, ty: FieldType) { ... }
    
        /// Define a field that is a forward reference to another type defined in this rec group.
        pub fn forward_ref_field(&mut self, ty: PendingType) -> ForwardRefFieldBuilder<'_> { ... }
    }
    
    pub struct ForwardRefFieldBuilder<'a> {
        parent: StructOrArrayTypeBuilder<'a>,
        // ...
    }
    
    impl ForwardRefFieldBuilder<'_> {
        pub fn nullable(&mut self, nullable: bool) -> &mut Self { ... }
    
        pub fn shared(&mut self, shared: bool) -> &mut Self { ... }
    
        pub fn finish(self) -> &mut StructTypeBuilder<'_> { ... }
    }

Replace the public surface with a fluent nested-builder API:

- Use a single kind-agnostic `PendingType` handle instead of separate
  `PendingStructId`/`PendingArrayId`/`PendingFuncId`.
- Drop the public `*Template` value types. Concrete fields/params take the
  existing `FieldType`/`ValType` directly; forward references to in-group
  siblings use `forward_ref_field`/`forward_ref_element`/`forward_ref_param`/
  `forward_ref_result`, which take plain `(mutability, is_nullable, target)`
  arguments (no sub-builders, no slice allocation).
- Finality and supertypes are set via builder methods: `finality(Finality)`,
  `supertype(<concrete>)` for already-registered supertypes, and
  `forward_supertype(PendingType)` for in-group siblings. `build()` performs
  the same structural/finality validation the one-off constructors do, after
  registration so that forward supertypes resolve.

Getters are `get_struct`/`get_array`/`get_func -> Option<..>`.
@somdoron

Copy link
Copy Markdown
Contributor Author

Thanks, I have a workaround for now (similar to what you suggested).

I applied your suggestion, using a builder API for struct/array/func. I haven't built the ForwardRefFieldBuilder yet, I went with an API similar to StructType::new, taking both nullability and mutability as parameters. It does lack the shared field at the moment. Let me know if you think the field builder API is preferred.

@fitzgen

fitzgen commented Jun 29, 2026

Copy link
Copy Markdown
Member

I applied your suggestion, using a builder API for struct/array/func. I haven't built the ForwardRefFieldBuilder yet, I went with an API similar to StructType::new, taking both nullability and mutability as parameters. It does lack the shared field at the moment. Let me know if you think the field builder API is preferred.

Yeah the builder API is preferred because it is less brittle as the Wasm spec evolves (e.g. with the addition of shared to reference types).

@somdoron somdoron force-pushed the rec-group-builder branch 2 times, most recently from f786458 to 8f26178 Compare June 29, 2026 16:42
@somdoron

Copy link
Copy Markdown
Contributor Author

Done, now with the builder API for fields / array elements / function params.

I didn't add shared to any of the builders yet, as it isn't supported yet and I think it depends on the shared-everything-threads proposal.

@fitzgen fitzgen left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I think we are on the right track. Some more comments below.

(Don't want to start getting too detailed with the review until the skeleton / overall design is in the final shape.)

types: impl ExactSizeIterator<Item = WasmSubType>,
) -> Result<Vec<RegisteredType>, OutOfMemory> {
let len = types.len();
assert!(len >= 1, "a rec group must contain at least one type");

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't true at the spec level, and is easy to support (just remove this assertion and everything should work, AFAICT) so I don't think we should require it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment on lines +86 to +129
/// A struct field or array element being defined: either an already-known
/// concrete/abstract type, or a forward reference to a sibling in this group.
enum FieldDef {
Concrete(FieldType),
Forward {
target: u32,
nullable: bool,
mutable: bool,
},
}

/// A function parameter or result being defined: either an already-known
/// concrete/abstract type, or a forward reference to a sibling in this group.
enum ValDef {
Concrete(ValType),
Forward { target: u32, nullable: bool },
}

/// A supertype being defined: either a sibling in this group or an
/// already-registered concrete type. Generic over the concrete kind.
enum SuperDef<T> {
Forward(u32),
Known(T),
}

/// The in-progress definition of one member of a rec group.
enum MemberDef {
Struct {
finality: Finality,
supertype: Option<SuperDef<StructType>>,
fields: Vec<FieldDef>,
},
Array {
finality: Finality,
supertype: Option<SuperDef<ArrayType>>,
element: Option<FieldDef>,
},
Func {
finality: Finality,
supertype: Option<SuperDef<FuncType>>,
params: Vec<ValDef>,
results: Vec<ValDef>,
},
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than duplicating all these types for the forward-or-not cases, could we reuse the types from wasmtime_environ::types::* and have the not-forward cases be EngineOrModuleTypeIndex::Engine and references within the rec group be EngineOrModuleTypeIndex::RecGroup or something like that?

pub struct RecGroupBuilder {
engine: Engine,
builder_id: usize,
members: Vec<Option<MemberDef>>,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: my previous comment, we would ideally make this become

Suggested change
members: Vec<Option<MemberDef>>,
inner: WasmRecGroup,

Although unfortunately that would require defining all struct vs array vs function types up front for all the members, leading to an awkward API for the builder, so I guess we probably actually want something like this:

Suggested change
members: Vec<Option<MemberDef>>,
members: Vec<Option<WasmSubType>>,

Although I guess we could do the first if we had different RecGroupBuilder::declare_struct vs RecGroupBuilder::declare_array vs RecGroupBuilder::decalre_func, as you originally formulated it... I didn't realize the implications that change would have here. I think we actually do want to have those separately-typed declare methods so we can have this builder wrap a WasmRecGroup. Sorry for the churn.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Vec<Option<WasmSubType>> might be better. The reason is that TypeRegistryInner::register_rec_group_types takes an iterator of WasmSubType, so we don't get much benefit from wrapping a WasmRecGroup here.

In any case, we can't construct the underlying WasmSubTypes for the WasmRecGroup until the struct/array/func builder finishes. We could create a placeholder of the correct composite kind when declare_* is called, but we'd still have to replace it once the struct/array/func builder finishes. Since that overwrite is needed either way, I don't think the separately-typed declare methods are required to wrap a WasmRecGroup — unless you'd prefer them for API reasons.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correction: we do need the separate declare functions — they're required to build the WasmHeapType for a forward reference, whose concrete variant (ConcreteStruct / ConcreteArray / ConcreteFunc) encodes the composite kind. So the kind has to be known at declare time, even though the body isn't filled in yet.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some conclusions after digging in:

  1. We can use multiple declare functions, one per type.
  2. We can use WasmHeapType with EngineOrModuleTypeIndex::Engine and EngineOrModuleTypeIndex::Module for forward references. That would remove FieldDef, ValDef, and SuperDef.
  3. Using WasmRecGroup or Vec<WasmSubType> for the RecGroupBuilder's internal type isn't very elegant, since they aren't mutable.
  4. We can use Vec<WasmSubType>: create a placeholder on declare_*, and create the real type when we call RecGroupBuilder::build or Array/Func/StructBuilder::finish.

Anyway, I think keeping the MemberDef enum is more elegant than trying to use WasmRecGroup or Vec<WasmSubType>, since it's mutable and allows accumulation.

Comment on lines +152 to +153
/// Declare a new type in this rec group, returning a handle that can be used
/// as a forward reference before the type is defined.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should document that the order of declare calls determines the order that the types are defined within the rec group, which has semantically visible implications. In the following example, $f != $f' and $s != $s', but you might otherwise not realize this when using the RecGroupBuilder, declare, and PendingType.

(rec (type $f (func))
     (type $s (struct)))

(rec (type $s' (struct))
     (type $f' (func)))

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, added documentation to each declare function and RecGroupBuilder.

@somdoron

somdoron commented Jul 1, 2026

Copy link
Copy Markdown
Contributor Author

The new commit stores Vec<Option<WasmSubType>> instead of a MemberDef. Each StructTypeBuilder/ArrayTypeBuilder/FuncTypeBuilder must have its finish method called to commit the type to the RecGroupBuilder, consistent with ForwardRefFieldBuilder. If finish is never called on one of these builders, that type's definition is silently dropped.

At the moment only RecGroupBuilder::build reports errors: the first error is stored on the RecGroupBuilder and returned when build is called. We could instead have StructTypeBuilder/ArrayTypeBuilder/FuncTypeBuilder::finish return the error directly.

For example, two structs where $a has a forward reference to $b:

let mut builder = RecGroupBuilder::new(&engine);

let a = builder.declare_struct();
let b = builder.declare_struct();

builder
    .define_struct(a)
    .forward_ref_field(b)
    .nullable(true)
    .finish()  // commit the field, back to the struct builder
    .finish(); // commit the struct to the rec group

builder
    .define_struct(b)
    .field(FieldType::new(
        Mutability::Const,
        StorageType::ValType(ValType::I32),
    ))
    .finish();

let group = builder.build()?;
let a: StructType = group.get_struct(a).unwrap();
let b: StructType = group.get_struct(b).unwrap();

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

wasm-proposal:gc Issues with the implementation of the gc wasm proposal wasmtime:api Related to the API of the `wasmtime` crate itself wasmtime:docs Issues related to Wasmtime's documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants