refactor: move spamoor benchmark into testify suite#3107
Conversation
refactor: move spamoor benchmark into testify suite

- Create test/e2e/benchmark/ subpackage with SpamoorSuite (testify/suite)
- Move spamoor smoke test into suite as TestSpamoorSmoke
- Split helpers into focused files: traces.go, output.go, metrics.go
- Introduce resultWriter for defer-based benchmark JSON output
- Export shared symbols from evm_test_common.go for cross-package use
- Restructure CI to fan-out benchmark jobs and fan-in publishing
- Run benchmarks on PRs only when benchmark-related files change
Resolve conflicts keeping the benchmark suite refactoring:

- benchmark.yml: keep path filters and suite-style test command
- evm_spamoor_smoke_test.go: keep deleted (moved to benchmark pkg)
- evm_test_common.go: keep exported types, drop writeTraceBenchmarkJSON (now in benchmark/output.go)
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info — Configuration used: defaults · Review profile: CHILL · Plan: Pro

📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 Walkthrough

Moves Spamoor benchmark logic into a new benchmark suite at test/e2e/benchmark/.

Changes
Sequence Diagram(s)

sequenceDiagram
participant Test as "Spamoor Smoke Test"
participant Suite as "SpamoorSuite"
participant Jaeger as "Jaeger (tracing)"
participant Reth as "Reth (execution)"
participant Sequencer as "Sequencer"
participant Spamoor as "Spamoor"
Test->>Suite: run SetupTest
Suite->>Jaeger: start Jaeger
Suite->>Reth: start Reth (OTLP -> Jaeger)
Suite->>Sequencer: start Sequencer
Suite->>Spamoor: build & start Spamoor (RPC -> Sequencer)
Test->>Spamoor: create & run spammers
Spamoor->>Sequencer: submit txs
Sequencer->>Reth: propose blocks
Spamoor->>Jaeger: export traces
Test->>Jaeger: collect traces & metrics
Jaeger-->>Test: return trace payloads
Test->>Test: parse/aggregate spans, assert, write results
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs
🚥 Pre-merge checks: ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).
Claude finished @chatton's task in 1m 54s — View job

Code Review
Good refactor overall — isolating the Spamoor benchmark into its own testify suite with a clean split of helpers is a good move. I found a few issues ranging from functional bugs to style nitpicks.

Bugs / Functional Issues

1. OTLP endpoint configuration is incorrect — The OpenTelemetry spec requires `OTEL_EXPORTER_OTLP_ENDPOINT` to be a base URL (the SDK appends `/v1/traces` itself), and the protocol values must be one of `grpc`, `http/protobuf`, or `http/json`.

```go
// current (broken)
"OTEL_EXPORTER_OTLP_ENDPOINT="+jg.Internal.IngestHTTPEndpoint()+"/v1/traces",
"OTEL_EXPORTER_OTLP_PROTOCOL=http",
"OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http",

// correct
"OTEL_EXPORTER_OTLP_ENDPOINT="+jg.Internal.IngestHTTPEndpoint(), // base URL, no suffix
"OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf",
"OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http/protobuf",

// OTEL_EXPORTER_OTLP_TRACES_ENDPOINT is fine as-is (per-signal = full URL)
```

2. Hardcoded `rethTag: "pr-140"` — This references an ephemeral PR image tag that will become invalid once the image is cleaned up. It should be configurable via an env var with a sensible default so CI can override it without a code change.

```go
rethTag := os.Getenv("BENCH_RETH_TAG")
if rethTag == "" {
	rethTag = "pr-140"
}
e := s.setupEnv(config{
	rethTag:     rethTag,
	serviceName: "ev-node-smoke",
})
```

Design / Robustness

3. `resultWriter.flush` should log failures rather than assert fatally, so a flush error cannot obscure the real test failure:

```go
data, err := json.MarshalIndent(w.entries, "", "  ")
if err != nil {
	w.t.Logf("WARNING: failed to marshal benchmark JSON: %v", err)
	return
}
if err := os.WriteFile(outputPath, data, 0644); err != nil {
	w.t.Logf("WARNING: failed to write benchmark JSON to %s: %v", outputPath, err)
	return
}
```

4. Fixed 60-second sleep for trace accumulation — This is the canonical fragile-test antipattern. On slow CI runners it may be too short; on fast machines it wastes time.

5. …

Code Quality Nitpicks

6. Loop variable `t` is confusingly named:

```go
for _, t := range traces { // 't' reads as testing.TB to any Go developer
```

Rename to something more descriptive, e.g. `traceEntry`.

7. Stale doc comment — The comment still says `setupCommonEVMEnv`; it should read `// SetupCommonEVMEnv creates and initializes ev-reth instances...`

8. The polling loop uses fixed sleeps.

9. Missing path trigger for Block Executor benchmarks — The workflow runs the Block Executor benchmarks, but the PR paths filter does not include their source tree. Note: I cannot modify workflow files directly (GitHub App permission restriction), but this is a straightforward one-liner addition to the paths filter.

Performance Regression

The benchmark-action comment shows 3 Spamoor metrics regressed >1.5x vs the previous run (Batch.SetHeight 1.60x, Store.GetMetadata 1.57x, Store.NewBatch 1.75x). These are likely noise from the infrastructure refactor rather than a code regression, but worth checking. They could be caused by the new suite initialization overhead or Docker orchestration differences in the refactored setup.

Summary: The structural refactor is clean and the direction is right. The two hard-blockers before merge are the OTLP endpoint misconfiguration (item 1), which will produce silently incorrect tracing data, and the hardcoded PR image tag (item 2), which will cause test failures once that image is cleaned up.
defer w.flush()

e := s.setupEnv(config{
	rethTag: "pr-140",

the PR the traces were added, we can change this to a proper tag in a follow up.
pull_request:
  branches:
    - main
  paths:

only run benchmark tests for PRs when relevant files are changed.
go test sets the working directory to the package under test, so the env var should be relative to test/e2e/benchmark/, not test/e2e/.
Actionable comments posted: 2
♻️ Duplicate comments (1)
test/e2e/benchmark/spamoor_smoke_test.go (1)
20-23: ⚠️ Potential issue | 🟠 Major

Hardcoded `rethTag: "pr-140"` will become stale and cause opaque failures.

This tag refers to a specific container image built for a PR. Once that image is cleaned up or the tag is overwritten, the test will silently fail to start or pull the container with a confusing error. Expose this via an environment variable or CLI flag so it can be updated without a code change.
🔧 Suggested refactor
```diff
+rethTagFromEnv := os.Getenv("BENCH_RETH_TAG")
+if rethTagFromEnv == "" {
+	rethTagFromEnv = "pr-140"
+}
 e := s.setupEnv(config{
-	rethTag:     "pr-140",
+	rethTag:     rethTagFromEnv,
 	serviceName: "ev-node-smoke",
 })
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/benchmark/spamoor_smoke_test.go` around lines 20 - 23, Replace the hardcoded rethTag value ("pr-140") in the test setup by reading it from configuration (environment variable or test CLI flag) so tests don't break when image tags change: update the call site where s.setupEnv(config{ rethTag: "pr-140", ... }) is used in spamoor_smoke_test.go to instead pull the tag from an env var (eg. os.Getenv("RETH_TAG")) or a provided flag, and adjust the config struct and any test bootstrap that constructs config to accept and pass that dynamic value (look for the config type, its rethTag field, and the setupEnv function to make the change).
🧹 Nitpick comments (5)
test/e2e/benchmark/metrics.go (1)
15-34: Consider accepting a `context.Context` for cancellable polling.

`requireHTTP` drives a blocking polling loop with no way to interrupt it if the parent test is cancelled mid-flight (e.g., via a `t.Context()` deadline or a `FailNow` from another goroutine). For long timeouts this can leave dangling goroutines/blocked tests.

🔧 Suggested refactor
```diff
-func requireHTTP(t testing.TB, url string, timeout time.Duration) {
+func requireHTTP(t testing.TB, ctx context.Context, url string, timeout time.Duration) {
 	t.Helper()
 	client := &http.Client{Timeout: 200 * time.Millisecond}
-	deadline := time.Now().Add(timeout)
+	ctx, cancel := context.WithTimeout(ctx, timeout)
+	defer cancel()
 	var lastErr error
-	for time.Now().Before(deadline) {
+	ticker := time.NewTicker(100 * time.Millisecond)
+	defer ticker.Stop()
+	for {
+		select {
+		case <-ctx.Done():
+			t.Fatalf("daemon not ready at %s: %v", url, lastErr)
+			return
+		case <-ticker.C:
+		}
 		resp, err := client.Get(url)
 		if err == nil {
 			_ = resp.Body.Close()
 			if resp.StatusCode >= 200 && resp.StatusCode < 300 {
 				return
 			}
 			lastErr = fmt.Errorf("status %d", resp.StatusCode)
 		} else {
 			lastErr = err
 		}
-		time.Sleep(100 * time.Millisecond)
 	}
-	t.Fatalf("daemon not ready at %s: %v", url, lastErr)
 }
```

As per coding guidelines, "Use context.Context for cancellation in Go."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/benchmark/metrics.go` around lines 15 - 34, The requireHTTP function currently blocks and can't be cancelled; change its signature to accept a context.Context (e.g., requireHTTP(ctx context.Context, t testing.TB, url string, timeout time.Duration) or use t.Context() inside) and use that context for cancellation inside the polling loop: create requests with http.NewRequestWithContext or attach ctx to the client.Do call, replace time.Sleep with a select on ctx.Done() and time.After(100*time.Millisecond), check ctx.Err() and call t.Fatalf when the context is cancelled, and ensure you still honor the timeout logic and preserve the lastErr handling in the loop.

test/e2e/benchmark/spamoor_smoke_test.go (1)
68-70: Fixed 60s `time.Sleep` is fragile for trace accumulation. On fast machines traces may be ready much sooner; on slow CI runners traces may not be ready even after 60s. Consider polling the Jaeger API with a timeout instead of an unconditional sleep, making the test both faster in the happy path and more robust in the slow path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/benchmark/spamoor_smoke_test.go` around lines 68 - 70, Replace the fixed time.Sleep(60 * time.Second) wait with a polling loop that queries the Jaeger API until the expected traces/spans from spamoor appear or a context deadline elapses; implement this by replacing the direct call to time.Sleep with a function (e.g., waitForTraces(ctx, serviceName, expectedSpanCount)) that uses a context with timeout, exponential/backoff retries, and HTTP calls to the Jaeger query endpoint, returning early when traces are observed and failing the test only if the timeout is reached—update the test to call waitForTraces instead of time.Sleep and use the existing service identifier used in the test to locate traces.

test/e2e/benchmark/traces.go (1)
23-47: Loop variable `t` shadows the conventional test-helper name convention.

Using `t` as a range variable in `extractSpansFromTraces` (which imports "testing") is not a bug but is slightly confusing, particularly for readers accustomed to `t` meaning `testing.TB`. A more descriptive name like `traceEntry` or `raw` would improve clarity.

🔧 Suggested refactor
```diff
-	for _, t := range traces {
-		traceMap, ok := t.(map[string]any)
+	for _, traceEntry := range traces {
+		traceMap, ok := traceEntry.(map[string]any)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/benchmark/traces.go` around lines 23 - 47, In extractSpansFromTraces, rename the range variable t to a clearer name (e.g., traceEntry or rawTrace) to avoid shadowing the conventional testing variable t; update all uses of t in that function (the outer loop in extractSpansFromTraces and its casts to traceMap) so the logic remains unchanged but the variable is more descriptive and less confusing to readers familiar with testing.TB.test/e2e/benchmark/output.go (1)
67-76: `require.NoError` inside a deferred `flush` will call `runtime.Goexit` on failure, which is safe but worth documenting.

When `flush` is called via `defer` after a test has already failed, `require.NoError` → `FailNow` → `runtime.Goexit` is invoked from within the deferred function, which Go handles correctly (deferred calls complete). No behavioral bug here, but the interaction with `require` vs `os.WriteFile` returning an error silently (by being suppressed) could be surprising. Consider logging the error path rather than asserting fatally so benchmark output failures don't obscure the real test failure.

🔧 Suggested refactor
```diff
 func (w *resultWriter) flush() {
 	outputPath := os.Getenv("BENCH_JSON_OUTPUT")
 	if outputPath == "" || len(w.entries) == 0 {
 		return
 	}
 	data, err := json.MarshalIndent(w.entries, "", "  ")
-	require.NoError(w.t, err, "failed to marshal benchmark JSON")
-	require.NoError(w.t, os.WriteFile(outputPath, data, 0644), "failed to write benchmark JSON to %s", outputPath)
+	if err != nil {
+		w.t.Logf("WARNING: failed to marshal benchmark JSON: %v", err)
+		return
+	}
+	if err := os.WriteFile(outputPath, data, 0644); err != nil {
+		w.t.Logf("WARNING: failed to write benchmark JSON to %s: %v", outputPath, err)
+		return
+	}
 	w.t.Logf("wrote %d benchmark entries to %s", len(w.entries), outputPath)
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/benchmark/output.go` around lines 67 - 76, In resultWriter.flush, avoid using require.NoError (which calls FailNow/Goexit when deferred) for the json.MarshalIndent and os.WriteFile errors; instead check the error values from json.MarshalIndent and os.WriteFile and log them non-fatally (e.g., use w.t.Errorf or w.t.Logf with descriptive messages referencing outputPath and the error) so flush reports failures without invoking runtime.Goexit from a deferred context; update the calls that currently use require.NoError(w.t, ...) to explicit if err != nil { w.t.Errorf("...: %v", err); return } or similar logging behavior for both the marshal and write steps in resultWriter.flush.test/e2e/evm_test_common.go (1)
532-534: Doc comment references the old unexported name.

Line 532 says `setupCommonEVMEnv` but the function is now exported as `SetupCommonEVMEnv`. Per Go convention, the doc comment should start with the exported identifier name.

Proposed fix

```diff
-// setupCommonEVMEnv creates and initializes ev-reth instances, while also initializing the local ev-node instance
+// SetupCommonEVMEnv creates and initializes ev-reth instances, while also initializing the local ev-node instance
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/evm_test_common.go` around lines 532 - 534, The doc comment currently starts with the old unexported name; update the comment to begin with the exported function name SetupCommonEVMEnv so it follows Go doc conventions (e.g., change the leading sentence from "setupCommonEVMEnv creates..." to "SetupCommonEVMEnv creates..."), leaving the rest of the description intact and ensuring the comment now documents the exported SetupCommonEVMEnv function.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/benchmark.yml:
- Around line 11-16: Update the PR path filter in the GitHub Actions workflow to
include the block executor benchmark sources so changes to that package trigger
the workflow; specifically add the path pattern 'block/internal/executing/**'
(or more broadly 'block/**' if preferred) to the existing paths array in
.github/workflows/benchmark.yml so the job that runs `go test
./block/internal/executing/...` will trigger on relevant PRs.
In `@test/e2e/benchmark/suite_test.go`:
- Around line 73-78: The OTLP env setup in the b.WithTag(...).WithEnv call is
wrong: change the value for "OTEL_EXPORTER_OTLP_ENDPOINT" to use the base ingest
URL (remove the "/v1/traces" suffix) so the SDK can append per-signal paths,
keep "OTEL_EXPORTER_OTLP_TRACES_ENDPOINT" as-is (full URL), and update
"OTEL_EXPORTER_OTLP_PROTOCOL" and "OTEL_EXPORTER_OTLP_TRACES_PROTOCOL" from
"http" to the spec-correct "http/protobuf" (or another valid spec value like
"grpc" or "http/json" if intended).
---
Duplicate comments:
In `@test/e2e/benchmark/spamoor_smoke_test.go`:
- Around line 20-23: Replace the hardcoded rethTag value ("pr-140") in the test
setup by reading it from configuration (environment variable or test CLI flag)
so tests don't break when image tags change: update the call site where
s.setupEnv(config{ rethTag: "pr-140", ... }) is used in spamoor_smoke_test.go to
instead pull the tag from an env var (eg. os.Getenv("RETH_TAG")) or a provided
flag, and adjust the config struct and any test bootstrap that constructs config
to accept and pass that dynamic value (look for the config type, its rethTag
field, and the setupEnv function to make the change).
---
Nitpick comments:
In `@test/e2e/benchmark/metrics.go`:
- Around line 15-34: The requireHTTP function currently blocks and can't be
cancelled; change its signature to accept a context.Context (e.g.,
requireHTTP(ctx context.Context, t testing.TB, url string, timeout
time.Duration) or use t.Context() inside) and use that context for cancellation
inside the polling loop: create requests with http.NewRequestWithContext or
attach ctx to the client.Do call, replace time.Sleep with a select on ctx.Done()
and time.After(100*time.Millisecond), check ctx.Err() and call t.Fatalf when the
context is cancelled, and ensure you still honor the timeout logic and preserve
the lastErr handling in the loop.
In `@test/e2e/benchmark/output.go`:
- Around line 67-76: In resultWriter.flush, avoid using require.NoError (which
calls FailNow/Goexit when deferred) for the json.MarshalIndent and os.WriteFile
errors; instead check the error values from json.MarshalIndent and os.WriteFile
and log them non-fatally (e.g., use w.t.Errorf or w.t.Logf with descriptive
messages referencing outputPath and the error) so flush reports failures without
invoking runtime.Goexit from a deferred context; update the calls that currently
use require.NoError(w.t, ...) to explicit if err != nil { w.t.Errorf("...: %v",
err); return } or similar logging behavior for both the marshal and write steps
in resultWriter.flush.
In `@test/e2e/benchmark/spamoor_smoke_test.go`:
- Around line 68-70: Replace the fixed time.Sleep(60 * time.Second) wait with a
polling loop that queries the Jaeger API until the expected traces/spans from
spamoor appear or a context deadline elapses; implement this by replacing the
direct call to time.Sleep with a function (e.g., waitForTraces(ctx, serviceName,
expectedSpanCount)) that uses a context with timeout, exponential/backoff
retries, and HTTP calls to the Jaeger query endpoint, returning early when
traces are observed and failing the test only if the timeout is reached—update
the test to call waitForTraces instead of time.Sleep and use the existing
service identifier used in the test to locate traces.
In `@test/e2e/benchmark/traces.go`:
- Around line 23-47: In extractSpansFromTraces, rename the range variable t to a
clearer name (e.g., traceEntry or rawTrace) to avoid shadowing the conventional
testing variable t; update all uses of t in that function (the outer loop in
extractSpansFromTraces and its casts to traceMap) so the logic remains unchanged
but the variable is more descriptive and less confusing to readers familiar with
testing.TB.
In `@test/e2e/evm_test_common.go`:
- Around line 532-534: The doc comment currently starts with the old unexported
name; update the comment to begin with the exported function name
SetupCommonEVMEnv so it follows Go doc conventions (e.g., change the leading
sentence from "setupCommonEVMEnv creates..." to "SetupCommonEVMEnv creates..."),
leaving the rest of the description intact and ensuring the comment now
documents the exported SetupCommonEVMEnv function.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- .github/workflows/benchmark.yml
- test/e2e/benchmark/metrics.go
- test/e2e/benchmark/output.go
- test/e2e/benchmark/spamoor_smoke_test.go
- test/e2e/benchmark/suite_test.go
- test/e2e/benchmark/traces.go
- test/e2e/evm_contract_bench_test.go
- test/e2e/evm_force_inclusion_e2e_test.go
- test/e2e/evm_full_node_e2e_test.go
- test/e2e/evm_spamoor_smoke_test.go
- test/e2e/evm_test_common.go
- test/e2e/failover_e2e_test.go
💤 Files with no reviewable changes (1)
- test/e2e/evm_spamoor_smoke_test.go
paths:
  - 'test/e2e/benchmark/**'
  - 'test/e2e/evm_contract_bench_test.go'
  - 'test/e2e/evm_test_common.go'
  - 'test/e2e/sut_helper.go'
  - '.github/workflows/benchmark.yml'
PR path filter misses Block Executor benchmark sources.
This workflow runs go test ./block/internal/executing/... (Line 40), but PR paths do not include that tree. Changes to those benchmarks/sources can skip this workflow unexpectedly.
Suggested fix
pull_request:
branches:
- main
paths:
- 'test/e2e/benchmark/**'
- 'test/e2e/evm_contract_bench_test.go'
- 'test/e2e/evm_test_common.go'
- 'test/e2e/sut_helper.go'
+ - 'block/internal/executing/**'
- '.github/workflows/benchmark.yml'🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/benchmark.yml around lines 11 - 16, Update the PR path
filter in the GitHub Actions workflow to include the block executor benchmark
sources so changes to that package trigger the workflow; specifically add the
path pattern 'block/internal/executing/**' (or more broadly 'block/**' if
preferred) to the existing paths array in .github/workflows/benchmark.yml so the
job that runs `go test ./block/internal/executing/...` will trigger on relevant
PRs.
b.WithTag(cfg.rethTag).WithEnv(
	"OTEL_EXPORTER_OTLP_ENDPOINT="+jg.Internal.IngestHTTPEndpoint()+"/v1/traces",
	"OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="+jg.Internal.IngestHTTPEndpoint()+"/v1/traces",
	"OTEL_EXPORTER_OTLP_PROTOCOL=http",
	"OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http",
	"RUST_LOG=debug",
🧩 Analysis chain
🌐 Web query:
OpenTelemetry OTEL_EXPORTER_OTLP_ENDPOINT specification base URL path suffix
💡 Result:
For OTLP/HTTP, OTEL_EXPORTER_OTLP_ENDPOINT is specified as a base URL. Exporters must append a signal-specific relative path to it:
- Traces:
v1/traces - Metrics:
v1/metrics - Logs:
v1/logs
So if you set:
OTEL_EXPORTER_OTLP_ENDPOINT=http://collector:4318
the actual requests go to:
http://collector:4318/v1/traces,.../v1/metrics,.../v1/logs(respectively). The base URL may itself include a path prefix (e.g./mycollector/), in which case thev1/*suffix is appended under that prefix. [1]
If you instead set a per-signal endpoint (e.g. OTEL_EXPORTER_OTLP_TRACES_ENDPOINT), that URL is used as-is (no automatic /v1/traces appending); if it has no path component, / is used. [1]
Separately, the OTLP protocol spec defines the default OTLP/HTTP paths as /v1/traces, /v1/metrics, /v1/logs (server-side defaults), but clients may be configured to use non-default paths. [2]
For OTLP/gRPC, OTEL_EXPORTER_OTLP_ENDPOINT is a target for the gRPC client (typically host/port; path suffixes like /v1/traces are not part of the gRPC endpoint concept). [1]
Sources: [1] OpenTelemetry OTLP Exporter spec (Endpoint URLs for OTLP/HTTP) [2] OTLP Specification (default OTLP/HTTP URL paths)
🌐 Web query:
OTEL_EXPORTER_OTLP_PROTOCOL valid values http/protobuf http/json grpc
💡 Result:
OTEL_EXPORTER_OTLP_PROTOCOL valid values are exactly: grpc, http/protobuf, http/json. [1][2]
🏁 Script executed:
find . -name "suite_test.go" -path "*/benchmark/*" | head -5Repository: evstack/ev-node
Length of output: 93
🏁 Script executed:
sed -n '65,85p' ./test/e2e/benchmark/suite_test.goRepository: evstack/ev-node
Length of output: 889
🏁 Script executed:
rg "IngestHTTPEndpoint" -A 2 -B 2 | head -50Repository: evstack/ev-node
Length of output: 1044
🏁 Script executed:
rg -i "reth.*otlp|otlp.*protocol" | head -20Repository: evstack/ev-node
Length of output: 292
🏁 Script executed:
rg -i "http/protobuf|http/json" test/ | head -20Repository: evstack/ev-node
Length of output: 41
🌐 Web query:
Reth OTLP exporter HTTP protocol http/protobuf http/json specification
💡 Result:
OTLP exporter over HTTP: http/protobuf vs http/json (spec)
Transport & endpoints (OTLP/HTTP):
- Uses HTTP POST over HTTP/1.1 or HTTP/2. (opentelemetry.io)
- Default URL paths:
- Traces:
POST /v1/traceswithExportTraceServiceRequest - Metrics:
POST /v1/metricswithExportMetricsServiceRequest - Logs:
POST /v1/logswithExportLogsServiceRequest(opentelemetry.io)
- Traces:
- When SDKs are configured with a base endpoint (
OTEL_EXPORTER_OTLP_ENDPOINT), they construct per-signal URLs by appendingv1/traces,v1/metrics,v1/logs(unless a per-signal endpoint is set). (opentelemetry.io)
Encoding choices (same protobuf schema, different wire format):
http/protobuf: protobuf messages serialized in binary proto3. (opentelemetry.io)Content-Type: application/x-protobuf(request + response). (opentelemetry.io)
http/json: protobuf messages serialized using protobuf JSON mapping, with OTLP-specific rules. (opentelemetry.io)Content-Type: application/json(request + response). (opentelemetry.io)
OTLP JSON-specific mapping rules (important for http/json):
traceId/spanIdare hex strings (not base64). (opentelemetry.io)- Enums MUST be integers (enum names not allowed). (opentelemetry.io)
- Unknown JSON fields must be ignored by receivers. (opentelemetry.io)
- JSON field names are lowerCamelCase. (opentelemetry.io)
- 64-bit integers in JSON are encoded as decimal strings (numbers or strings accepted when decoding). (opentelemetry.io)
Compression:
- Client may gzip the request body; if so it MUST send
Content-Encoding: gzip. (opentelemetry.io) - Client may send
Accept-Encoding: gzip; server may gzip responses. (opentelemetry.io)
Responses:
- Success is HTTP 200 OK with an
Export<signal>ServiceResponse(e.g.,ExportTraceServiceResponse). (opentelemetry.io) - Server must respond using the same Content-Type it received. (opentelemetry.io)
How exporters select protocol:
OTEL_EXPORTER_OTLP_PROTOCOL(and per-signal variants) supports:grpc,http/protobuf,http/json; default is typicallyhttp/protobuf. (opentelemetry.io)
Where the protobuf definitions live:
- The canonical
.protofiles and OTLP protocol materials are inopen-telemetry/opentelemetry-proto. (github.com)
Citations:
- 1: https://opentelemetry.io/docs/specs/otlp/?utm_source=openai
- 2: https://opentelemetry.io/docs/specs/otlp/?utm_source=openai
- 3: https://opentelemetry.io/docs/specs/otel/protocol/exporter/?utm_source=openai
- 4: https://opentelemetry.io/docs/specs/otlp/?utm_source=openai
- 5: https://opentelemetry.io/docs/specs/otlp/?utm_source=openai
- 6: https://opentelemetry.io/docs/specs/otlp/?utm_source=openai
- 7: https://opentelemetry.io/docs/specs/otlp/?utm_source=openai
- 8: https://opentelemetry.io/docs/specs/otlp/?utm_source=openai
- 9: https://opentelemetry.io/docs/specs/otlp/?utm_source=openai
- 10: https://opentelemetry.io/docs/specs/otlp/?utm_source=openai
- 11: https://opentelemetry.io/docs/specs/otlp/?utm_source=openai
- 12: https://opentelemetry.io/docs/specs/otlp/?utm_source=openai
- 13: https://opentelemetry.io/docs/specs/otlp/?utm_source=openai
- 14: https://opentelemetry.io/docs/specs/otlp/?utm_source=openai
- 15: https://opentelemetry.io/docs/specs/otlp/?utm_source=openai
- 16: https://opentelemetry.io/docs/specs/otlp/?utm_source=openai
- 17: https://opentelemetry.io/docs/specs/otel/protocol/exporter/?utm_source=openai
- 18: https://github.com/open-telemetry/opentelemetry-proto?utm_source=openai
Remove the /v1/traces suffix from OTEL_EXPORTER_OTLP_ENDPOINT and change protocol values to http/protobuf.
The OpenTelemetry specification requires that OTEL_EXPORTER_OTLP_ENDPOINT be a base URL, with the SDK automatically appending signal-specific paths (/v1/traces, /v1/metrics, /v1/logs). Setting it to …/v1/traces will cause requests to be sent to …/v1/traces/v1/traces. The per-signal endpoint OTEL_EXPORTER_OTLP_TRACES_ENDPOINT is correct as-is since it expects the full URL without automatic appending.
Additionally, OTEL_EXPORTER_OTLP_PROTOCOL and OTEL_EXPORTER_OTLP_TRACES_PROTOCOL must use spec-defined values: grpc, http/protobuf, or http/json. The value "http" is not valid.
Proposed fix
- "OTEL_EXPORTER_OTLP_ENDPOINT="+jg.Internal.IngestHTTPEndpoint()+"/v1/traces",
+ "OTEL_EXPORTER_OTLP_ENDPOINT="+jg.Internal.IngestHTTPEndpoint(),
"OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="+jg.Internal.IngestHTTPEndpoint()+"/v1/traces",
- "OTEL_EXPORTER_OTLP_PROTOCOL=http",
- "OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http",
+ "OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf",
+ "OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http/protobuf",📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
b.WithTag(cfg.rethTag).WithEnv(
	"OTEL_EXPORTER_OTLP_ENDPOINT="+jg.Internal.IngestHTTPEndpoint(),
	"OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="+jg.Internal.IngestHTTPEndpoint()+"/v1/traces",
	"OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf",
	"OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=http/protobuf",
	"RUST_LOG=debug",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/e2e/benchmark/suite_test.go` around lines 73 - 78, The OTLP env setup in
the b.WithTag(...).WithEnv call is wrong: change the value for
"OTEL_EXPORTER_OTLP_ENDPOINT" to use the base ingest URL (remove the
"/v1/traces" suffix) so the SDK can append per-signal paths, keep
"OTEL_EXPORTER_OTLP_TRACES_ENDPOINT" as-is (full URL), and update
"OTEL_EXPORTER_OTLP_PROTOCOL" and "OTEL_EXPORTER_OTLP_TRACES_PROTOCOL" from
"http" to the spec-correct "http/protobuf" (or another valid spec value like
"grpc" or "http/json" if intended).
there is a different env OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_EXPORTER_OTLP_TRACES_ENDPOINT based one which reth version being used, adding both to work with both versions.
@chatton, I understand you're setting both to support different reth versions. However, there's a spec compliance issue with the current approach:
According to the OpenTelemetry specification, when using OTEL_EXPORTER_OTLP_ENDPOINT, spec-compliant SDKs automatically append /v1/traces to construct the full URL. Setting it to ...IngestHTTPEndpoint()+"/v1/traces" will cause compliant implementations to send requests to .../v1/traces/v1/traces (double path).
The correct approach for version compatibility would be:
- OTEL_EXPORTER_OTLP_ENDPOINT = base URL without /v1/traces (for spec-compliant versions)
- OTEL_EXPORTER_OTLP_TRACES_ENDPOINT = full URL with /v1/traces (already correct, used as-is by all versions)

This way:

- Spec-compliant reth versions will use OTEL_EXPORTER_OTLP_ENDPOINT and append the path correctly
- Non-compliant versions will fall back to OTEL_EXPORTER_OTLP_TRACES_ENDPOINT which has the full path
Additionally, the protocol values need to be updated from "http" to one of the spec-defined values: "grpc", "http/protobuf", or "http/json". The value "http" is not valid per the OTLP specification.
Would you like me to verify reth's specific behavior across versions to confirm the best compatibility approach?
🧠 Learnings used
Learnt from: CR
Repo: evstack/ev-node PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-24T11:34:24.001Z
Learning: Applies to **/*_test.go : Ensure tests are deterministic
Learnt from: CR
Repo: evstack/ev-node PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-24T11:34:24.001Z
Learning: Applies to **/*_test.go : Use table-driven tests where appropriate
Learnt from: CR
Repo: evstack/ev-node PR: 0
File: types/CLAUDE.md:0-0
Timestamp: 2026-02-24T11:34:49.206Z
Learning: Applies to types/**/*_test.go : Test serialization round-trips (marshal/unmarshal) for all network types
Learnt from: CR
Repo: evstack/ev-node PR: 0
File: types/CLAUDE.md:0-0
Timestamp: 2026-02-24T11:34:49.206Z
Learning: Applies to types/**/*_test.go : Use table-driven tests for all validation paths, serialization round-trips, and edge cases in test files
go test treats all arguments after an unknown flag (--evm-binary) as test binary args, so ./benchmark/ was never recognized as a package pattern.
go test sets the cwd to the package directory (test/e2e/benchmark/), so the binary path needs an extra parent traversal.
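That working-directory difference can be illustrated with a small sketch (relToRepoRoot is a hypothetical helper, not code from the repo): the number of ".." segments needed to reach the repo root grows with the depth of the package under test.

```go
package main

import (
	"path/filepath"
	"strings"
)

// relToRepoRoot builds a repo-root-relative path as seen from a package
// directory: one ".." per path segment in pkgDir, since go test runs each
// package's tests with the package directory as the working directory.
func relToRepoRoot(pkgDir, repoRelPath string) string {
	ups := strings.Count(filepath.ToSlash(pkgDir), "/") + 1
	return strings.Repeat("../", ups) + repoRelPath
}
```

Moving the test from test/e2e/ down into test/e2e/benchmark/ adds one segment, hence the extra parent traversal in the binary path.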
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Spamoor Trace Benchmarks'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.50.
| Benchmark suite | Current: ae525ca | Previous: 212ac08 | Ratio |
|---|---|---|---|
| SpamoorSmoke - Batch.SetHeight (avg) | 14.966867469879517 us | 9.33933933933934 us | 1.60 |
| SpamoorSmoke - Store.GetMetadata (avg) | 9.872934263679765 us | 6.306730415593968 us | 1.57 |
| SpamoorSmoke - Store.NewBatch (avg) | 13.335843373493976 us | 7.635542168674699 us | 1.75 |
This comment was automatically generated by workflow using github-action-benchmark.
The benchmark package doesn't define the --binary flag that test-e2e passes. It has its own CI workflow so it doesn't need to run here.
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

```
@@            Coverage Diff             @@
##             main    #3107      +/-   ##
==========================================
- Coverage   60.95%   60.93%   -0.02%
==========================================
  Files         113      113
  Lines       11617    11617
==========================================
- Hits         7081     7079       -2
- Misses       3738     3739       +1
- Partials      798      799       +1
```

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
client := &http.Client{Timeout: 200 * time.Millisecond}
deadline := time.Now().Add(timeout)
var lastErr error
for time.Now().Before(deadline) {
personal preference: require.Eventually()
// requireHTTP polls a URL until it returns a 2xx status code or the timeout expires.
func requireHTTP(t testing.TB, url string, timeout time.Duration) {
nit: naming is hard. how about requireHostUp
Overview
Introduces a benchmarking package within the e2e dir and a test suite to which we can add additional spamoor tests.