Add split metadata enforcement and CLI support for leak-free benchmarks #19

Draft
Codex wants to merge 2 commits into main from codex/add-train-val-test-split-methodology

@Codex Codex AI commented Apr 1, 2026

The benchmark flow risked data leakage because train/val/test separation was not enforced, Anti-DreamBooth’s holdout was only metadata, and reports lacked split lineage. This update introduces split manifests, enforces split membership in benchmarking, and clarifies holdout handling.

  • Split infrastructure: New SplitType/SplitMetadata, random split creator, manifest save/load, overlap validation (src/auralock/benchmarks/splits.py), plus README guidance and auralock split create/validate CLI.
  • Benchmark enforcement: ProtectionService.benchmark_file/benchmark_directory now require split metadata, validate membership, warn on non-test splits, and embed split info in reports; CLI benchmark mandates --split-manifest/--split-type.
  • Anti-DreamBooth clarity: Notes updated to treat set_C as holdout; tests made path-agnostic.
  • Tests: Added coverage for split manifests and updated the CLI/Docker benchmark tests to import the correct modules.
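
The split infrastructure described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/auralock/benchmarks/splits.py: only the name SplitType comes from this PR, while `create_random_split`, `find_overlaps`, `save_manifest`, `load_manifest`, the `seed` parameter, and the JSON layout (one list of paths per split) are assumptions made for the example.

```python
import json
import random
from enum import Enum
from pathlib import Path


class SplitType(str, Enum):
    TRAIN = "train"
    VAL = "val"
    TEST = "test"


def create_random_split(paths, train_ratio=0.7, val_ratio=0.15, seed=0):
    """Shuffle paths with a fixed seed and cut them into train/val/test."""
    # Sort first so the result does not depend on directory listing order.
    items = sorted(str(p) for p in paths)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train_ratio)
    n_val = int(len(items) * val_ratio)
    return {
        SplitType.TRAIN.value: items[:n_train],
        SplitType.VAL.value: items[n_train:n_train + n_val],
        # Whatever remains goes to test, so every path is assigned exactly once.
        SplitType.TEST.value: items[n_train + n_val:],
    }


def find_overlaps(manifest):
    """Return every path that appears in more than one split."""
    seen, overlaps = {}, set()
    for split, items in manifest.items():
        for item in items:
            if item in seen and seen[item] != split:
                overlaps.add(item)
            seen.setdefault(item, split)
    return overlaps


def save_manifest(manifest, path):
    Path(path).write_text(json.dumps(manifest, indent=2), encoding="utf-8")


def load_manifest(path):
    """Load a manifest and refuse it if any path sits in two splits."""
    manifest = json.loads(Path(path).read_text(encoding="utf-8"))
    overlaps = find_overlaps(manifest)
    if overlaps:
        raise ValueError(f"paths assigned to multiple splits: {sorted(overlaps)}")
    return manifest
```

Validating on load, rather than only at creation time, means a hand-edited manifest that reintroduces an overlap is rejected before any benchmark runs.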

Example:

# Create deterministic splits
auralock split create ./artworks --output splits.json --train-ratio 0.7 --val-ratio 0.15 --test-ratio 0.15

# Benchmark only declared test images; warns if not test
auralock benchmark ./artworks \
  --profiles safe,balanced \
  --split-manifest splits.json \
  --split-type test \
  --report reports/benchmark.json
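
The enforcement step behind `--split-manifest`/`--split-type` might look something like the sketch below. It is illustrative only: `check_split_membership` is a hypothetical helper, the manifest is assumed to map each split name to a list of paths, and rejecting non-member files (rather than silently skipping them) is an assumption about behavior this PR does not spell out.

```python
import warnings


def check_split_membership(manifest, split_type, candidates):
    """Validate candidates against one split and return split info for the report."""
    if split_type not in manifest:
        raise ValueError(f"unknown split type: {split_type!r}")
    if split_type != "test":
        # Benchmarking on train/val is allowed but flagged, since the numbers
        # are not held-out results.
        warnings.warn(
            f"benchmarking on the {split_type!r} split, not the test split",
            stacklevel=2,
        )
    members = set(manifest[split_type])
    missing = [c for c in candidates if c not in members]
    if missing:
        raise ValueError(f"files not in the {split_type!r} split: {missing}")
    # This dictionary stands in for the split lineage embedded in a report.
    return {"split_type": split_type, "files": list(candidates)}
```

Returning the split name together with the checked file list gives the report enough lineage to show which split produced each result.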

Co-authored-by: VoDaiLocz <88762074+VoDaiLocz@users.noreply.github.com>
@Codex Codex AI changed the title from "[WIP] Add train/validation/test split methodology to benchmark" to "Add split metadata enforcement and CLI support for leak-free benchmarks" Apr 1, 2026
@Codex Codex AI requested a review from VoDaiLocz April 1, 2026 13:39