Skip to content

Add ML-KEM re-encryption test vectors.#254

Merged
cpu merged 2 commits into
C2SP:mainfrom
lukaszobernig:mlkem_reenc_comp_bugs
Jun 25, 2026
Merged

Add ML-KEM re-encryption test vectors.#254
cpu merged 2 commits into
C2SP:mainfrom
lukaszobernig:mlkem_reenc_comp_bugs

Conversation

@lukaszobernig

Copy link
Copy Markdown
Contributor

This adds ML-KEM test vectors that fail to decapsulate on a correct implementation but decapsulate to the correct shared secret on a faulty implementation. The first vector tests whether the implementation does not compare c_1 and c_1' correctly, specifically whether the implementation forgets to compare the last byte. The second vector tests the same for the comparison of c_2 and c_2'.

@lukaszobernig lukaszobernig force-pushed the mlkem_reenc_comp_bugs branch from dfe281f to 91738eb Compare June 10, 2026 17:15
@cpu cpu self-assigned this Jun 18, 2026
@cpu cpu force-pushed the mlkem_reenc_comp_bugs branch from 91738eb to 44dcf46 Compare June 22, 2026 15:13
@cpu

cpu commented Jun 22, 2026

Copy link
Copy Markdown
Member

Thanks for the PR!

While reviewing this & fiddling with introducing the c_1/c_2 comparison bugs in an implementation I realized that the original mlkem_semi_expanded_decaps_test_schema.json should have the expected K like the seed-form mlkem_test_schema.json already had. I think this was an oversight when we landed the semi-expanded schema/vectors initially. I went ahead and tacked on a commit that updates the schema + backfills the values (backfill tool here).

I also had some review feedback for your new vectors beyond adding K and since I was pushing commits I applied it directly. Please push-back if you disagree with the changes.

  1. I think these should be result = valid - the invalid cases so far have been instances where Decapsulate fails, and that isn't applicable here.
  2. I think these vectors should be in a new group so that we can have a source distinct from the original "github/aws/aws-lc". I went with "github/lukaszobernig/reenc" but if you'd prefer something else we can change it.

With all of the above handled I was able to verify that introducing a representative c_1/c_2 comparison bug in the Go ML-KEM implementation went unnoticed by existing vectors, but was caught by the expected new vectors 👍

@sgmenda @FiloSottile Would appreciate you folks reviewing before I merge since I've made adjustments of my own. 🙇

@cpu cpu requested review from FiloSottile and sgmenda June 22, 2026 15:21
@lukaszobernig

Copy link
Copy Markdown
Contributor Author

Thanks for making those changes, I wasn't super sure how to include the vectors into the existing schema. But what you're proposing sounds good to me.

@sgmenda sgmenda left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the changes including @cpu 's improvements look great to me.

also ran them against aws-lc with no issues.

thanks so much for the contribution.

"type": "string",
"format": "HexBytes",
"description": "If present, the shared key the implementation MUST return on a successful Decapsulate call."
},

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this required for result: valid?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

As a side-note, adding these sort of conditional schema rules seems to interact badly with some code generators. As a concrete example the one I've been using for JSON Schema -> Go will stop emitting a typed structure for anything that has a conditional rule, instead spitting out an interface{} that we have to cast to a manually defined structure.

I'm operating on the assumption we care more about vector accuracy vs fidelity of code generation so adding this kind of conditional validation is the right call and we'll deal with the less useful code generation downstream, but I wanted to mention it as an explicit trade-off. (It's also possible other JSON schema tools handle it better, I haven't done an exhaustive survey).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's unfortunate, but yeah not having to manually notice missing fields feels more important.

This sounds very fixable, btw. Want me to point @filippo-claude at it?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds very fixable, btw. Want me to point @filippo-claude at it?

Sure! If you want to take a crack at it I've been using https://github.com/omissis/go-jsonschema Maybe there's an easy fix.

tools/schemagen.go in this repo is already setup and just needs a small diff to output the generated code instead of just checking that it builds:
diff --git a/tools/schemagen/schemagen.go b/tools/schemagen/schemagen.go
index 5b656c8..3c8fd4f 100644
--- a/tools/schemagen/schemagen.go
+++ b/tools/schemagen/schemagen.go
@@ -17,7 +17,10 @@ import (
        "github.com/atombender/go-jsonschema/pkg/generator"
 )

-var schemaDirectory = flag.String("schemas-dir", "schemas", "directory containing schema files")
+var (
+       schemaDirectory = flag.String("schemas-dir", "schemas", "directory containing schema files")
+       dumpStdout      = flag.Bool("stdout", false, "write the generated source to stdout")
+)

 func main() {
        flag.Parse()
@@ -61,11 +64,17 @@ func main() {
        if sourceCount := len(sources); sourceCount != 1 {
                log.Fatalf("expected to generate 1 source file, got %d\n", sourceCount)
        }
-       _, ok := sources[ouputName]
+       source, ok := sources[ouputName]
        if !ok {
                log.Fatalf("missing generated %q output file source", ouputName)
        }

+       if *dumpStdout {
+               if _, err := os.Stdout.Write(source); err != nil {
+                       log.Fatalf("writing source to stdout: %v", err)
+               }
+       }
+
        for _, warning := range warnings {
                log.Printf("⚠ Warning: %s", warning)
        }

If you apply that you can get a nice loop going by deleting schemas/mlkem_semi_expanded_decaps_test_schema.json#L116-L129, running tools/schemagen.go --stdout > before.go, then reverting the schema back and running it again. You'll see the nice structure drop-out of the generated code and get replaced with interface{}.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cpu and others added 2 commits June 25, 2026 17:07
Adds a required 'ek' field and an optional 'K' field to the
semi-expanded ML-KEM decaps schema. The former carries the encapsulation
key bytes of 'dk', and the latter carries the expected shared key
produced by a successful Decapsulate call (if applicable).

Existing vectors using this schema have the new values backfilled.
@cpu cpu force-pushed the mlkem_reenc_comp_bugs branch from 44dcf46 to c2bd219 Compare June 25, 2026 21:07
@cpu cpu merged commit ee7b4f7 into C2SP:main Jun 25, 2026
4 checks passed
@cpu

cpu commented Jun 25, 2026

Copy link
Copy Markdown
Member

Thanks again @lukaszobernig!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants