
feat(helpers): add sync_audio.py for precise multicam audio sync via GCC-PHAT #69

Open
robertgenito wants to merge 1 commit into browser-use:main from robertgenito:feat/sync-audio-gccphat

@robertgenito robertgenito commented Jun 17, 2026


What

Adds helpers/sync_audio.py, a precise multicam audio-sync helper, documents it in SKILL.md, and declares scipy as a direct dependency.

Why

Multicam projects need frame-accurate audio alignment. Eyeballing offsets from Scribe's (claps) audio_event timestamps is off by 200-500ms, and plain cross-correlation between mismatched microphones gives a broad, ambiguous peak. GCC-PHAT (Generalized Cross-Correlation with Phase Transform) whitens the cross-power spectrum before correlating, so the peak stays sharp regardless of mic frequency response or room reverb. The resulting offset is sub-frame accurate even between a studio mic and an on-camera mic.
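To make the whitening step concrete, here is a minimal GCC-PHAT sketch. It is not the code in helpers/sync_audio.py: the function name `gcc_phat`, its parameters, the sign convention (positive lag means the target is delayed relative to the reference), and the peak-over-mean `sharpness` measure are all assumptions chosen for illustration, and it uses NumPy's FFT rather than scipy.signal.

```python
from typing import Optional, Tuple

import numpy as np


def gcc_phat(ref: np.ndarray, tgt: np.ndarray, sample_rate: int,
             max_lag_s: Optional[float] = None) -> Tuple[float, float]:
    """Return (lag_seconds, sharpness). Hypothetical helper, not the PR's API.

    Assumed convention: a positive lag means `tgt` is delayed relative to `ref`.
    """
    # Zero-pad to a power of two >= len(ref) + len(tgt) so the circular
    # correlation computed via the FFT behaves like a linear one.
    n = len(ref) + len(tgt)
    n_fft = 1 << (n - 1).bit_length()

    ref_spec = np.fft.rfft(ref, n=n_fft)
    tgt_spec = np.fft.rfft(tgt, n=n_fft)

    # Cross-power spectrum, then the phase transform: divide by the magnitude
    # so every frequency bin contributes phase only (this is the "whitening").
    cross = tgt_spec * np.conj(ref_spec)
    cross /= np.abs(cross) + 1e-12
    cc = np.fft.irfft(cross, n=n_fft)

    # Reorder so the array covers lags -max_shift .. +max_shift.
    max_shift = n_fft // 2
    if max_lag_s is not None:
        max_shift = min(max_shift, int(max_lag_s * sample_rate))
    cc = np.concatenate((cc[-max_shift:], cc[: max_shift + 1]))

    peak = int(np.argmax(np.abs(cc)))
    lag_samples = peak - max_shift
    # Assumed sharpness measure: peak height relative to the mean level.
    sharpness = float(np.abs(cc[peak]) / (np.mean(np.abs(cc)) + 1e-12))
    return lag_samples / sample_rate, sharpness
```

Because the magnitude is divided out, a microphone's frequency response changes how much each bin is trusted far less than it would in plain cross-correlation, which is why the peak stays narrow. The `1e-12` guard only avoids division by zero in silent bins.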

How it works

  • Takes a reference video, one or more targets, and rough sync timestamps for at least one shared transient (a clap).
  • Extracts a window of audio around each event (default 10s), runs GCC-PHAT to find the precise lag, and combines the lag with the window-start delta into a precise source-time offset.
  • Averages across events per target, and flags disagreement over 50ms as likely clap-pattern aliasing or genuine clock drift, falling back to the highest-sharpness measurement.
  • Outputs sync_offsets.json with final offsets per target plus per-event measurements.
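The combination and averaging steps above can be sketched as follows. This is an illustration under stated assumptions, not the helper's actual code: the function names, the `sharpness` key, the sign convention (offset = target time minus reference time), and measuring "disagreement" as the max-minus-min spread are all hypothetical. Only the `precise_offset` key and the 50ms threshold come from the PR.

```python
from statistics import mean

DISAGREEMENT_S = 0.050  # the 50 ms threshold named in the description


def precise_offset(ref_window_start_s, tgt_window_start_s, lag_s):
    """Combine the window-start delta with the measured lag.

    Assumed convention: lag_s > 0 means the target window's audio is delayed
    relative to the reference window, and the result is target source time
    minus reference source time for the same instant.
    """
    return (tgt_window_start_s - ref_window_start_s) + lag_s


def combine_events(measurements, threshold_s=DISAGREEMENT_S):
    """Reduce per-event measurements for one target to a single offset.

    `measurements` is a list of dicts with 'precise_offset' (seconds) and
    'sharpness' (higher is better); the 'sharpness' key is an assumption.
    """
    if not measurements:
        raise ValueError("no measurements for this target")
    offsets = [m["precise_offset"] for m in measurements]
    spread = max(offsets) - min(offsets)
    if spread > threshold_s:
        # Events disagree: trust the single sharpest peak instead of the mean.
        best = max(measurements, key=lambda m: m["sharpness"])
        return {"offset": best["precise_offset"], "spread": spread,
                "method": "highest_sharpness"}
    return {"offset": mean(offsets), "spread": spread, "method": "mean"}
```

Falling back to one measurement rather than averaging matters when a clap pattern aliases: averaging a correct offset with one that locked onto the wrong clap would produce a value that matches neither.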

Usage

# JSON config (recommended)
uv run python helpers/sync_audio.py --config sync.json

# or CLI flags
uv run python helpers/sync_audio.py \
    --reference DESK=/path/desk.mp4 \
    --target A1=/path/c0036.mp4 \
    --event clap_1:DESK=6.23,A1=101.03 \
    --out sync_offsets.json

Dependencies

Declares scipy in pyproject.toml. It was already present transitively (via librosa), but this helper imports scipy.signal and scipy.io.wavfile directly, so it should be an explicit dependency.


Summary by cubic

Adds helpers/sync_audio.py to precisely sync multicam audio using GCC-PHAT. Produces sub-frame accurate offsets across dissimilar mics and writes sync_offsets.json.

  • New Features

    • GCC-PHAT-based sync from rough event times (e.g., claps) between a reference and multiple targets.
    • Config file or CLI; per-target window sizing; per-event measurements with sharpness.
    • Averages multiple events; warns on >50ms disagreement and falls back to highest-sharpness result.
    • Documented in SKILL.md.
  • Dependencies

    • Declare scipy as a direct dependency (used via scipy.signal and scipy.io.wavfile).

Written for commit 6df055a. Summary will update on new commits.


…GCC-PHAT

Multicam projects need frame-accurate audio alignment. Eyeballing offsets
from Scribe's (claps) audio_event timestamps drifts 200-500ms, and plain
cross-correlation between mismatched mics gives a broad, ambiguous peak.
GCC-PHAT whitens the cross-power spectrum before correlating, so the peak
stays sharp regardless of mic frequency response or room reverb. It is
sub-frame accurate even between a studio mic and an on-camera mic.

The helper takes a reference video, one or more targets, and rough sync
timestamps for at least one shared transient (a clap). It cross-correlates
10s+ windows around each event, combines the lag with the window-start delta
into a precise source-time offset, averages across events per target, and
flags disagreement over 50ms as likely clap-pattern aliasing or clock drift.
Output goes to sync_offsets.json, the helper is documented in SKILL.md, and
scipy is declared as a direct dependency.

More about me: https://geni.to/about

Signed-off-by: Robert Genito <robert@robertgenito.com>

@cubic-dev-ai cubic-dev-ai Bot left a comment


2 issues found across 3 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="helpers/sync_audio.py">

<violation number="1" location="helpers/sync_audio.py:197">
P2: Default reference name "REF" is inconsistent with the docstring example and help text, causing events to be silently skipped when users follow the documented CLI pattern.</violation>

<violation number="2" location="helpers/sync_audio.py:270">
P2: Target measurement coverage is not validated: if a target has no shared events, the script silently produces partial JSON output and exits 0.</violation>
</file>


Comment thread helpers/sync_audio.py
    print("=" * 60)
    for cam, ms in by_target.items():
        if not ms:
            print(f"  {cam}: no measurements")

@cubic-dev-ai cubic-dev-ai Bot Jun 17, 2026


P2: Target measurement coverage is not validated: if a target has no shared events, the script silently produces partial JSON output and exits 0.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At helpers/sync_audio.py, line 270:

<comment>Target measurement coverage is not validated: if a target has no shared events, the script silently produces partial JSON output and exits 0.</comment>

<file context>
@@ -0,0 +1,295 @@
+    print("=" * 60)
+    for cam, ms in by_target.items():
+        if not ms:
+            print(f"  {cam}: no measurements")
+            continue
+        offsets = [m["precise_offset"] for m in ms]
</file context>
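One possible way to address this, sketched under assumptions rather than taken from the PR: check coverage before writing the output and exit non-zero if any target has no measurements. The function names `missing_targets` and `require_full_coverage` are hypothetical; `by_target` is assumed to map each target name to its list of measurements, as the loop in the file context suggests.

```python
import sys


def missing_targets(by_target):
    """Return the names of targets that ended up with no measurements."""
    return [cam for cam, ms in by_target.items() if not ms]


def require_full_coverage(by_target):
    """Abort with a non-zero exit status if any target was never measured."""
    missing = missing_targets(by_target)
    if missing:
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit("No shared events measured for target(s): " + ", ".join(missing))
```

Whether the script should still write a partial sync_offsets.json before exiting is a design choice this sketch leaves open; the point is only that the failure becomes visible to callers through the exit status.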

Comment thread helpers/sync_audio.py
    if not (args.reference and args.target and args.event):
        sys.exit("Need --config OR (--reference + --target + --event …)")

    ref_name, ref_path = args.reference.split("=", 1) if "=" in args.reference else ("REF", args.reference)

@cubic-dev-ai cubic-dev-ai Bot Jun 17, 2026


P2: Default reference name "REF" is inconsistent with the docstring example and help text, causing events to be silently skipped when users follow the documented CLI pattern.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At helpers/sync_audio.py, line 197:

<comment>Default reference name "REF" is inconsistent with the docstring example and help text, causing events to be silently skipped when users follow the documented CLI pattern.</comment>

<file context>
@@ -0,0 +1,295 @@
+    if not (args.reference and args.target and args.event):
+        sys.exit("Need --config OR (--reference + --target + --event …)")
+
+    ref_name, ref_path = args.reference.split("=", 1) if "=" in args.reference else ("REF", args.reference)
+    targets = {}
+    for t in args.target:
</file context>
