feat(vortex-bench): wire SpatialBench into the bench orchestrator by HarukiMoriarty · Pull Request #8607 · vortex-data/vortex

HarukiMoriarty · 2026-06-26T17:51:24Z

Summary

Wires SpatialBench into the vx-bench / bench-orchestrator pipeline so it can be run end-to-end like the other benchmarks (datagen → Parquet → Vortex conversion → query). It builds on the WKB datagen that landed in #8598.

Command to run:

uv run --project bench-orchestrator vx-bench run spatialbench --engine duckdb --format parquet,vortex --opt scale-factor=N --queries 1,2,3,4,5,6,7,8,9 --iterations 3

Limitations

DuckDB-only. For now, SpatialBench queries use DuckDB-specific ST_* spatial SQL functions that DataFusion does not provide yet. This is expressed as a single ad-hoc entry: BENCHMARK_ENGINES = { SPATIALBENCH: {DUCKDB} }.
No dictionary encoding / compaction on the WKB column. WKB geometry blobs are large and effectively unique, so running the dictionary builder over them balloons memory (tens of GB) for zero compression gain. The normal compaction path is preserved for every other column on every other benchmark.
Queries 10, 11, and 12 currently time out because DuckDB has poor support for spatial indexes.
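
The argument for skipping dictionary encoding on the WKB column can be illustrated with a toy encoder. This is a minimal sketch, not Vortex's dictionary builder: `dictionary_encode`, `encoded_size`, `raw_size`, and the 4-byte code width are all hypothetical choices made for illustration. It shows that when every value is unique, the dictionary holds a full copy of the data and the codes are pure overhead.

```python
def dictionary_encode(values):
    """Toy dictionary encoder: returns (dictionary, codes) for a list of byte strings."""
    index = {}        # value -> code; this lookup table is what grows with cardinality
    dictionary = []
    codes = []
    for value in values:
        code = index.get(value)
        if code is None:
            code = len(dictionary)
            index[value] = code
            dictionary.append(value)
        codes.append(code)
    return dictionary, codes


def raw_size(values):
    """Bytes needed to store the values back to back."""
    return sum(len(v) for v in values)


def encoded_size(dictionary, codes, code_width=4):
    """Bytes for the dictionary entries plus one fixed-width code per row (assumed width)."""
    return raw_size(dictionary) + code_width * len(codes)


# A low-cardinality column: the same 100-byte value repeated 1000 times.
repeated = [b"x" * 100] * 1000

# A stand-in for WKB geometry blobs: 1000 distinct 100-byte values.
unique = [i.to_bytes(4, "little") + bytes(96) for i in range(1000)]

for name, column in (("repeated", repeated), ("unique", unique)):
    dictionary, codes = dictionary_encode(column)
    print(name, "raw:", raw_size(column), "dict-encoded:", encoded_size(dictionary, codes))
```

In the unique case the encoder also keeps every blob alive in its lookup table while building, which is consistent with the memory growth described above.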

Performance

SF=1

┏━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Query ┃ duckdb:parquet (base) ┃   duckdb:vortex ┃
┡━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ 1     │                39.1ms │  14.6ms (0.37x) │
│ 2     │                72.1ms │  24.9ms (0.35x) │
│ 3     │                57.7ms │  20.1ms (0.35x) │
│ 4     │               113.1ms │  70.6ms (0.62x) │
│ 5     │               354.2ms │ 288.4ms (0.81x) │
│ 6     │               169.5ms │  91.6ms (0.54x) │
│ 7     │               156.7ms │  71.5ms (0.46x) │
│ 8     │               196.5ms │  80.3ms (0.41x) │
│ 9     │                20.3ms │  18.7ms (0.92x) │
└───────┴───────────────────────┴─────────────────┘

SF=3

┏━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Query ┃ duckdb:parquet (base) ┃   duckdb:vortex ┃
┡━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ 1     │                50.7ms │  31.5ms (0.62x) │
│ 2     │               126.2ms │  60.3ms (0.48x) │
│ 3     │                71.9ms │  42.9ms (0.60x) │
│ 4     │               539.9ms │  64.9ms (0.12x) │
│ 5     │               948.7ms │ 874.5ms (0.92x) │
│ 6     │               656.2ms │ 121.7ms (0.19x) │
│ 7     │               256.6ms │ 232.1ms (0.90x) │
│ 8     │               273.8ms │ 244.6ms (0.89x) │
│ 9     │                35.7ms │  27.9ms (0.78x) │
└───────┴───────────────────────┴─────────────────┘

SF=10

┏━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Query ┃ duckdb:parquet (base) ┃   duckdb:vortex ┃
┡━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ 1     │               158.6ms │ 114.4ms (0.72x) │
│ 2     │               255.1ms │ 219.3ms (0.86x) │
│ 3     │               229.2ms │ 181.5ms (0.79x) │
│ 4     │               184.5ms │ 134.3ms (0.73x) │
│ 5     │                 3.30s │   3.08s (0.93x) │
│ 6     │               476.4ms │ 348.9ms (0.73x) │
│ 7     │               918.2ms │ 961.2ms (1.05x) │
│ 8     │               980.6ms │ 926.7ms (0.94x) │
│ 9     │                33.7ms │  33.6ms (1.00x) │
└───────┴───────────────────────┴─────────────────┘
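
The parenthesized factors in the tables above are consistent with the vortex time divided by the parquet baseline, so values below 1.00x mean the vortex lane is faster. The sketch below reproduces two cells under that assumption; `relative_time` and `format_cell` are hypothetical helpers, not the orchestrator's actual formatting code.

```python
def relative_time(candidate_ms: float, baseline_ms: float) -> float:
    """Candidate time as a fraction of the baseline (assumed definition of the 'x' factor)."""
    return candidate_ms / baseline_ms


def format_cell(candidate_ms: float, baseline_ms: float) -> str:
    """Render a cell in the same style as the tables, e.g. '14.6ms (0.37x)'."""
    ratio = relative_time(candidate_ms, baseline_ms)
    return f"{candidate_ms:.1f}ms ({ratio:.2f}x)"


# SF=1, query 1: parquet 39.1ms, vortex 14.6ms
print(format_cell(14.6, 39.1))

# SF=10, query 7: parquet 918.2ms, vortex 961.2ms (vortex slightly slower)
print(format_cell(961.2, 918.2))
```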

codspeed-hq · 2026-06-26T17:57:20Z

Merging this PR will improve performance by 11.93%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 1 improved benchmark
✅ 1594 untouched benchmarks
⏩ 4 skipped benchmarks¹

Performance Changes

	Mode	Benchmark	`BASE`	`HEAD`	Efficiency
⚡	Simulation	`bitwise_not_vortex_buffer_mut[128]`	273.6 ns	244.4 ns	+11.93%

Tip: Curious why this is faster? Comment `@codspeedbot explain why this is faster` on this PR, or use the CodSpeed MCP directly with your agent.

Comparing nemo/spatial-wire-vx-bench (8fa6ee0) with develop (0a45777)

¹ 4 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, they can be archived to remove them from the performance reports.

myrrc

LGTM

DuckDB's GEOMETRY only accepts little-endian (NDR) WKB, but the externally sourced zone table (Overture Maps via spatialbench-cli) is big-endian, so the vortex lane failed spatial queries with "Only little-endian WKB is supported".

Re-encode geometry columns to little-endian during the parquet->vortex conversion so the vortex file stores canonical little-endian WKB; columns that are already little-endian pass through without a copy.

Also drop the best-effort warn around zone generation so data-generation failures propagate.

Signed-off-by: Nemo Yu <zyu379@wisc.edu>
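
The re-encoding described in this commit message can be sketched as follows. This is an illustrative Python version, not the Rust code in the PR: `wkb_to_little_endian` and `_reencode` are hypothetical names, and the sketch assumes plain 2D ISO WKB (geometry types 1 to 7, no Z/M or SRID flags). It relies on the standard WKB layout: a byte-order byte (0 = big-endian/XDR, 1 = little-endian/NDR), a uint32 geometry type, then counts and float64 coordinates.

```python
import struct


def _reencode(buf: bytes, pos: int, out: bytearray) -> int:
    """Re-encode the geometry starting at buf[pos] as little-endian; return the end position."""
    if buf[pos] not in (0, 1):
        raise ValueError(f"invalid WKB byte-order marker: {buf[pos]}")
    order = "<" if buf[pos] == 1 else ">"
    (geom_type,) = struct.unpack_from(order + "I", buf, pos + 1)
    pos += 5
    out.extend(b"\x01")
    out.extend(struct.pack("<I", geom_type))

    def read_count() -> int:
        nonlocal pos
        (n,) = struct.unpack_from(order + "I", buf, pos)
        pos += 4
        out.extend(struct.pack("<I", n))
        return n

    def copy_points(n: int) -> None:
        nonlocal pos
        coords = struct.unpack_from(f"{order}{2 * n}d", buf, pos)
        pos += 16 * n
        out.extend(struct.pack(f"<{2 * n}d", *coords))

    if geom_type == 1:                      # Point
        copy_points(1)
    elif geom_type == 2:                    # LineString
        copy_points(read_count())
    elif geom_type == 3:                    # Polygon: rings of points
        for _ in range(read_count()):
            copy_points(read_count())
    elif geom_type in (4, 5, 6, 7):         # Multi* / GeometryCollection: nested geometries
        for _ in range(read_count()):
            pos = _reencode(buf, pos, out)
    else:
        raise ValueError(f"unsupported WKB geometry type: {geom_type}")
    return pos


def wkb_to_little_endian(wkb: bytes) -> bytes:
    """Return little-endian WKB; input that is already little-endian is returned as-is."""
    if wkb[:1] == b"\x01":
        return wkb  # pass through without a copy (only the outer marker is checked here)
    out = bytearray()
    _reencode(wkb, 0, out)
    return bytes(out)


big_endian_point = struct.pack(">BIdd", 0, 1, 1.5, -2.0)
print(wkb_to_little_endian(big_endian_point).hex())
```

The pass-through branch mirrors the "already little-endian" case in the commit message; a stricter version would also inspect nested geometries, since each one carries its own byte-order marker.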

AdamGS · 2026-07-01T14:20:28Z


-def targets_from_axes(engine: str, format: str) -> tuple[list[BenchmarkTarget], list[str]]:
+def targets_from_axes(
+    engine: str, format: str, benchmark: Benchmark | None = None


I don't know much about this, but why is this needed?

AdamGS · 2026-07-01T14:22:22Z

+///
+/// For SpatialBench (`skip_binary_dict`), the geometry blobs are large and
+/// unique, so the dictionary builder balloons memory (tens of GB) for zero gain.
+fn write_options_for(


this is clunky, not sure I have a better way of doing that right now :/

AdamGS · 2026-07-01T14:24:05Z

+    // Generate into a scratch dir so the CLI's `zone.parquet` name can't collide with the base
+    // tables, then move the produced parts into place as `zone_{part}.parquet`.
+    // Start from an empty scratch dir (clear any leftover from an interrupted run).


We already have code that handles idempotent datagen.

HarukiMoriarty added the changelog/feature label Jun 26, 2026

HarukiMoriarty requested a review from myrrc June 26, 2026 17:52

HarukiMoriarty mentioned this pull request Jun 26, 2026

feat(vortex-bench): infra of SpatialBench on DuckDB, plain binary #8598

Merged

HarukiMoriarty force-pushed the nemo/spatial-wire-vx-bench branch from a0218b2 to fdb0872 Compare June 26, 2026 17:55

myrrc requested a review from AdamGS June 29, 2026 13:27

myrrc reviewed Jun 29, 2026

Comment thread vortex-bench/src/spatialbench/datagen/wkb.rs Outdated

myrrc approved these changes Jun 29, 2026

Base automatically changed from nemo/spatial-wkb to develop June 30, 2026 20:57

HarukiMoriarty requested a review from a team June 30, 2026 20:57

feat: wire in the vx-benchmark

b362551

Signed-off-by: Nemo Yu <zyu379@wisc.edu>

HarukiMoriarty force-pushed the nemo/spatial-wire-vx-bench branch from fdb0872 to b362551 Compare June 30, 2026 21:10

HarukiMoriarty and others added 2 commits July 1, 2026 10:09

Merge branch 'develop' into nemo/spatial-wire-vx-bench

8fa6ee0

HarukiMoriarty enabled auto-merge (squash) July 1, 2026 14:12

AdamGS reviewed Jul 1, 2026

HarukiMoriarty wants to merge 3 commits into develop from nemo/spatial-wire-vx-bench
