This document describes the high-level architecture of WebBlackbox, the design decisions behind it, and how the components interact.
WebBlackbox is a three-tier core system with an optional collaboration tier:
- Recording Tier — A Chrome extension captures events from multiple sources
- Processing Tier — A pipeline chunks, compresses, indexes, and archives events
- Playback Tier — A Player SDK and React UI provide analysis and visualization
- Collaboration Tier (optional) — `share-server` stores uploaded archives and emits redacted secondary indexes for share links
┌─────────────────────────────────────────────────────────────────────┐
│ RECORDING TIER │
│ │
│ ┌──────────┐ ┌──────────┐ ┌────────────┐ ┌──────────────────┐ │
│ │ Injected │→ │ Content │→ │ Service │← │ CDP Router │ │
│ │ Script │ │ Script │ │ Worker │ │ (chrome.debugger) │ │
│ └──────────┘ └──────────┘ └─────┬──────┘ └──────────────────┘ │
│ │ │
│ ┌─────▼──────┐ │
│ │ Recorder │ │
│ │ (normalize │ │
│ │ + buffer) │ │
│ └─────┬──────┘ │
└────────────────────────────────────┼────────────────────────────────┘
│
┌────────────────────────────────────▼────────────────────────────────┐
│ PROCESSING TIER │
│ │
│ ┌───────────┐ ┌─────────┐ ┌──────────┐ ┌──────────────────┐ │
│ │ Chunker │→ │ Codec │→ │ Indexer │→ │ Archive Exporter │ │
│ └───────────┘ └─────────┘ └──────────┘ └────────┬─────────┘ │
│ │ │
│ ┌────────────────────────────┐ │ │
│ │ Blob Storage (SHA-256 │ │ │
│ │ dedup, ref counting) │──────────────────────┘ │
│ └────────────────────────────┘ │
└──────────────────────────────────────────┬─────────────────────────┘
│
.webblackbox
ZIP archive
│
┌──────────────────────────────────────────▼─────────────────────────┐
│ PLAYBACK TIER │
│ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ Player SDK │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌─────────────┐ │ │
│ │ │ Query │ │ Network │ │ DOM │ │ Code │ │ │
│ │ │ Engine │ │ Waterfall│ │ Differ │ │ Generator │ │ │
│ │ └──────────┘ └──────────┘ └──────────┘ └─────────────┘ │ │
│ └────────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ Player UI (React) │ │
│ │ Timeline │ Network │ Console │ Storage │ DOM │ Perf │ │
│ └────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
All data flows through the `@webblackbox/protocol` package. Every event, message, and configuration has a corresponding TypeScript type and Zod validation schema. This ensures:
- Type safety across all packages at compile time
- Runtime validation at system boundaries
- Forward compatibility via the `v: 1` protocol version field
- Strict schemas prevent accidental data corruption
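As a rough illustration of runtime validation at a boundary, the sketch below uses a hand-written type guard as a dependency-free stand-in for the Zod schemas mentioned above. The `EventEnvelope` shape, its field names other than `v`, and `parseEnvelope` are assumptions made for this example, not the package's actual types.

```typescript
// Hypothetical envelope; only the `v: 1` version field comes from the text above.
interface EventEnvelope {
  v: 1;
  type: string;
  t: number; // monotonic timestamp in ms (assumed field name)
  data: unknown;
}

// Hand-written stand-in for a schema: rejects unknown protocol versions and
// malformed fields, and drops any keys that are not part of the envelope.
function parseEnvelope(input: unknown): EventEnvelope {
  if (typeof input !== "object" || input === null) {
    throw new Error("envelope must be an object");
  }
  const rec = input as Record<string, unknown>;
  if (rec.v !== 1) {
    throw new Error(`unsupported protocol version: ${String(rec.v)}`);
  }
  if (typeof rec.type !== "string" || rec.type.length === 0) {
    throw new Error("type must be a non-empty string");
  }
  if (typeof rec.t !== "number" || !Number.isFinite(rec.t)) {
    throw new Error("t must be a finite number");
  }
  return { v: 1, type: rec.type, t: rec.t, data: rec.data };
}
```

Checking the version first means a future `v: 2` payload fails loudly instead of being misread as a `v: 1` event.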
WebBlackbox uses an event-sourced architecture. All state changes are captured as immutable events with monotonic timestamps. The ring buffer and archive are append-only event logs that can be replayed to reconstruct session state at any point in time.
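The replay idea can be sketched with a small fold over an append-only log. The `StorageOp` shape and `replayStorage` are hypothetical names chosen for this example; the real event types are defined in the protocol package.

```typescript
// Hypothetical storage events, ordered by monotonic timestamp `t`.
type StorageOp =
  | { t: number; kind: "set"; key: string; value: string }
  | { t: number; kind: "remove"; key: string };

// Replays the log up to `until` (inclusive) and returns the derived state.
function replayStorage(log: readonly StorageOp[], until: number): Map<string, string> {
  const state = new Map<string, string>();
  for (const op of log) {
    if (op.t > until) break; // the log is ordered, so nothing later can apply
    if (op.kind === "set") state.set(op.key, op.value);
    else state.delete(op.key);
  }
  return state;
}

const storageLog: StorageOp[] = [
  { t: 100, kind: "set", key: "theme", value: "dark" },
  { t: 250, kind: "set", key: "token", value: "abc" },
  { t: 400, kind: "remove", key: "token" },
];
```

Because events are never mutated, calling `replayStorage(storageLog, 300)` and `replayStorage(storageLog, 400)` yields two different but equally valid views of the same session.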
Binary data (screenshots, DOM snapshots, response bodies) is stored as blobs identified by SHA-256 hashes. This provides:
- Automatic deduplication — Identical content is stored once
- Integrity verification — Hashes are checked on read
- Reference counting — Blobs are cleaned up when no longer referenced
Sensitive data is redacted before it enters the pipeline:
- Headers like `Authorization`, `Cookie`, and `Set-Cookie` are scrubbed
- Body content matching patterns like `password`, `token`, `secret` is masked
- DOM elements matching CSS selectors like `input[type='password']` are blocked
- Optional SHA-256 hashing preserves correlation analysis without exposing raw values
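A minimal sketch of header and body redaction follows. The header names and key patterns come from the list above; the function names and the `[REDACTED]` placeholder are assumptions for illustration and may differ from what the recorder actually emits.

```typescript
const SENSITIVE_HEADERS = new Set(["authorization", "cookie", "set-cookie"]);
const SENSITIVE_KEY = /password|token|secret/i;
const PLACEHOLDER = "[REDACTED]"; // assumed placeholder value

// Replaces the value of any sensitive header, matching names case-insensitively.
function redactHeaders(headers: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    out[name] = SENSITIVE_HEADERS.has(name.toLowerCase()) ? PLACEHOLDER : value;
  }
  return out;
}

// Recursively masks values whose key matches a sensitive pattern.
function redactBody(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(redactBody);
  if (typeof value === "object" && value !== null) {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = SENSITIVE_KEY.test(key) ? PLACEHOLDER : redactBody(inner);
    }
    return out;
  }
  return value;
}
```

Both functions return new objects, so the original payload is left untouched and only the redacted copy needs to be handed to the pipeline.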
Each package has a single responsibility:
| Package | Responsibility |
|---|---|
| `protocol` | Data definitions and validation |
| `recorder` | Event collection and normalization |
| `pipeline` | Event processing and archival |
| `player-sdk` | Session analysis and code generation |
| `cdp-router` | Chrome DevTools Protocol management |
WebBlackbox captures events from three sources: the CDP Router, the content script, and the injected script. Together they cover:
- Network domain — HTTP requests, responses, WebSocket frames, failures
- Runtime domain — JavaScript exceptions, console API calls
- Log domain — Browser log entries
- Page domain — Navigation events, frame lifecycle
- User interactions — click, dblclick, keydown, input, submit, scroll, mousemove, focus, blur, resize, visibilitychange
- DOM mutations — Batched MutationObserver records
- DOM snapshots — Full page snapshots at intervals
- Screenshots — SnapDOM captures on idle and after actions
- Console — Intercepts `console.log/warn/error/etc.` calls
- Storage — Monitors localStorage, sessionStorage, and IndexedDB operations
- Network/Error hooks — Captures fetch/XHR lifecycle plus page/runtime errors
Raw events from all three sources are normalized by the `DefaultEventNormalizer` into a consistent `WebBlackboxEvent` format. This unified representation enables downstream processing to be source-agnostic.
Events are stored in a time-windowed ring buffer (default: 10 minutes). When the buffer exceeds its time window, the oldest events are automatically pruned. This keeps memory usage bounded while always preserving recent context.
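The pruning behavior can be sketched as follows. `TimeWindowBuffer` and `TimedEvent` are hypothetical names for this example, and the sketch assumes events arrive in timestamp order; the recorder's actual buffer may be implemented differently.

```typescript
interface TimedEvent {
  t: number; // monotonic timestamp in ms
  type: string;
}

// Keeps only events that fall within `windowMs` of the most recently pushed event.
class TimeWindowBuffer<E extends TimedEvent> {
  private events: E[] = [];
  private readonly windowMs: number;

  constructor(windowMs: number = 10 * 60 * 1000) {
    this.windowMs = windowMs; // 10 minutes, the default from the text above
  }

  push(event: E): void {
    this.events.push(event);
    const cutoff = event.t - this.windowMs;
    // Events are ordered, so everything before the first fresh entry is stale.
    const firstFresh = this.events.findIndex((e) => e.t >= cutoff);
    if (firstFresh > 0) this.events.splice(0, firstFresh);
  }

  // Returns a copy so callers (for example a freeze handler) cannot mutate the buffer.
  snapshot(): readonly E[] {
    return this.events.slice();
  }
}
```

Pruning on every push keeps memory proportional to the event rate times the window length, regardless of how long the session runs.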
User actions (clicks, form submissions, navigation) create "action spans" — time windows (default: 1500ms) that group related events. Network requests initiated during an action span are linked via `ref.act`, enabling cause-effect analysis in the player.
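One way to picture the linking step is shown below. Only the 1500ms default and the `ref.act` field come from the text; `ActionSpan`, `NetEvent`, `openSpan`, and `linkToAction` are hypothetical, and the rule that the most recent overlapping span wins is an assumption of this sketch.

```typescript
interface ActionSpan {
  id: string;
  start: number;
  end: number;
}

interface NetEvent {
  t: number;
  url: string;
  ref?: { act?: string };
}

const ACTION_WINDOW_MS = 1500; // default span length from the text above

function openSpan(id: string, t: number): ActionSpan {
  return { id, start: t, end: t + ACTION_WINDOW_MS };
}

// Tags a request with the latest span whose window contains its timestamp.
function linkToAction(event: NetEvent, spans: readonly ActionSpan[]): NetEvent {
  let match: ActionSpan | undefined;
  for (const span of spans) {
    if (event.t >= span.start && event.t <= span.end) match = span; // later spans win
  }
  return match ? { ...event, ref: { ...event.ref, act: match.id } } : event;
}
```

A request that starts outside every span is returned unchanged, so the player can distinguish user-initiated traffic from background polling.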
The recorder evaluates freeze conditions on every event:
- Error freeze — Uncaught JavaScript exceptions or unhandled promise rejections
- Network freeze — Network request failure rate exceeds threshold
- Performance freeze — Long tasks exceeding 200ms
- Manual freeze — User-triggered markers (Ctrl+Shift+M)
When a freeze is triggered, the ring buffer contents are preserved, providing full context around the issue.
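The four conditions could be evaluated along these lines. The 200ms long-task limit comes from the list above; the failure-rate threshold, the minimum request count, and all type and class names are assumptions, since the text does not state them.

```typescript
type FreezeReason = "error" | "network" | "performance" | "manual";

// Hypothetical, simplified view of the events the recorder inspects.
type Signal =
  | { kind: "exception" } // uncaught exception or unhandled rejection
  | { kind: "request"; failed: boolean }
  | { kind: "longtask"; durationMs: number }
  | { kind: "marker" } // user-triggered marker
  | { kind: "other" };

class FreezeEvaluator {
  private total = 0;
  private failed = 0;
  private readonly failureRateThreshold: number;
  private readonly minRequests: number;

  // Both defaults are assumed values for illustration only.
  constructor(failureRateThreshold = 0.5, minRequests = 5) {
    this.failureRateThreshold = failureRateThreshold;
    this.minRequests = minRequests;
  }

  // Returns the freeze reason for this signal, or null to keep recording.
  evaluate(signal: Signal): FreezeReason | null {
    switch (signal.kind) {
      case "exception":
        return "error";
      case "marker":
        return "manual";
      case "longtask":
        return signal.durationMs > 200 ? "performance" : null;
      case "request": {
        this.total++;
        if (signal.failed) this.failed++;
        const rate = this.failed / this.total;
        const enough = this.total >= this.minRequests;
        return enough && rate > this.failureRateThreshold ? "network" : null;
      }
      default:
        return null;
    }
  }
}
```

Requiring a minimum number of requests before the network condition can fire avoids freezing on a single early failure.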
Events are grouped into size-bounded chunks (default: 512KB). Each chunk is:
- Serialized as NDJSON (newline-delimited JSON)
- Encoded with chunk codecs (`none`, `gzip`, `br`, `zst`)
- Hashed with SHA-256 for integrity
- Stored with metadata (timestamps, event count, byte length)
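The steps above can be sketched with Node's built-in `zlib` and `crypto` modules. This example covers only the `gzip` codec, and `ChunkRecord`, `LoggedEvent`, `chunkEvents`, and `decodeChunk` are hypothetical names; the pipeline's real chunk format is not shown in this document.

```typescript
import { createHash } from "node:crypto";
import { gunzipSync, gzipSync } from "node:zlib";

interface LoggedEvent { t: number; type: string; }

interface ChunkRecord {
  codec: "gzip";
  tStart: number;
  tEnd: number;
  eventCount: number;
  byteLength: number; // size of the uncompressed NDJSON
  sha256: string; // hex hash of the encoded bytes
  bytes: Buffer;
}

function sealChunk(lines: string[], tStart: number, tEnd: number): ChunkRecord {
  const ndjson = lines.join("\n") + "\n";
  const bytes = gzipSync(Buffer.from(ndjson, "utf8"));
  const sha256 = createHash("sha256").update(bytes).digest("hex");
  const byteLength = Buffer.byteLength(ndjson, "utf8");
  return { codec: "gzip", tStart, tEnd, eventCount: lines.length, byteLength, sha256, bytes };
}

// Groups events into chunks whose NDJSON size stays within `maxBytes`
// (a single oversized event still gets a chunk of its own).
function chunkEvents(events: readonly LoggedEvent[], maxBytes = 512 * 1024): ChunkRecord[] {
  const chunks: ChunkRecord[] = [];
  let lines: string[] = [];
  let size = 0;
  let tStart = 0;
  let tEnd = 0;
  for (const event of events) {
    const line = JSON.stringify(event);
    const lineBytes = Buffer.byteLength(line, "utf8") + 1; // +1 for the newline
    if (lines.length > 0 && size + lineBytes > maxBytes) {
      chunks.push(sealChunk(lines, tStart, tEnd));
      lines = [];
      size = 0;
    }
    if (lines.length === 0) tStart = event.t;
    lines.push(line);
    size += lineBytes;
    tEnd = event.t;
  }
  if (lines.length > 0) chunks.push(sealChunk(lines, tStart, tEnd));
  return chunks;
}

// Verifies the hash, then decompresses and parses the chunk back into events.
function decodeChunk(chunk: ChunkRecord): LoggedEvent[] {
  const actual = createHash("sha256").update(chunk.bytes).digest("hex");
  if (actual !== chunk.sha256) throw new Error("chunk integrity check failed");
  const text = gunzipSync(chunk.bytes).toString("utf8").trimEnd();
  return text.split("\n").map((line) => JSON.parse(line) as LoggedEvent);
}
```

Hashing the encoded bytes rather than the raw NDJSON lets a reader reject a corrupted chunk before spending time decompressing it.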
Three indexes are built for efficient querying:
- Time Index — Maps timestamp ranges to chunks for O(log n) time-based lookup
- Request Index — Maps network request IDs to event IDs for request tracing
- Inverted Index — Maps searchable terms to event IDs for full-text search
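To make the time and inverted indexes concrete, here is a minimal sketch of both. It assumes chunk ranges are sorted and non-overlapping and uses a naive lowercase tokenizer; `ChunkRange`, `findChunk`, and `buildInvertedIndex` are hypothetical names, not the indexer's API.

```typescript
interface ChunkRange {
  chunkId: string;
  tStart: number;
  tEnd: number;
}

// Binary search over ranges sorted by tStart: O(log n) lookup of the chunk holding `t`.
function findChunk(ranges: readonly ChunkRange[], t: number): ChunkRange | undefined {
  let lo = 0;
  let hi = ranges.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const range = ranges[mid];
    if (t < range.tStart) hi = mid - 1;
    else if (t > range.tEnd) lo = mid + 1;
    else return range;
  }
  return undefined;
}

// Maps each lowercase alphanumeric term to the IDs of the events containing it.
function buildInvertedIndex(
  events: readonly { id: string; text: string }[],
): Map<string, Set<string>> {
  const index = new Map<string, Set<string>>();
  for (const event of events) {
    const terms = event.text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
    for (const term of terms) {
      let ids = index.get(term);
      if (!ids) {
        ids = new Set<string>();
        index.set(term, ids);
      }
      ids.add(event.id);
    }
  }
  return index;
}
```

A request index would follow the same pattern as the inverted index, keyed by request ID instead of by term.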
In the extension pipeline, chunks are persisted first and indexes are rebuilt on demand during `finalizeIndexes()` / export. This avoids keeping full-session request and inverted indexes resident in offscreen memory during long-running recordings.
Binary content is stored as content-addressable blobs:
Event: screen.screenshot { shotId: "abc", ... }
→ Blob: sha256("...") = "7f83b1657ff1..."
→ Storage: blobs/sha256-7f83b1657ff1....webp
Blobs are deduplicated by hash and reference-counted. The `MemoryPipelineStorage` implementation uses Maps for the extension's offscreen document context.
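A Map-backed store with deduplication, integrity checks, and reference counting might look like the sketch below. `BlobStore` is a hypothetical class written for this example; it is not `MemoryPipelineStorage`, and it uses Node's `crypto` module so that it runs outside the browser.

```typescript
import { createHash } from "node:crypto";

function sha256Hex(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// In-memory content-addressable store keyed by the hex SHA-256 of the content.
class BlobStore {
  private readonly blobs = new Map<string, { bytes: Uint8Array; refs: number }>();

  // Stores the bytes once per unique content and returns the hash key.
  put(bytes: Uint8Array): string {
    const hash = sha256Hex(bytes);
    const existing = this.blobs.get(hash);
    if (existing) existing.refs++;
    else this.blobs.set(hash, { bytes, refs: 1 });
    return hash;
  }

  // Returns the bytes after re-checking the hash, or undefined if the key is unknown.
  get(hash: string): Uint8Array | undefined {
    const entry = this.blobs.get(hash);
    if (!entry) return undefined;
    if (sha256Hex(entry.bytes) !== hash) {
      throw new Error(`blob ${hash} failed integrity check`);
    }
    return entry.bytes;
  }

  // Drops one reference; the blob is deleted when the count reaches zero.
  release(hash: string): void {
    const entry = this.blobs.get(hash);
    if (!entry) return;
    entry.refs--;
    if (entry.refs <= 0) this.blobs.delete(hash);
  }

  get size(): number {
    return this.blobs.size;
  }
}
```

Two screenshots with identical pixels therefore cost one stored blob and two references, and the blob disappears only after both referencing events are pruned.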
The export process creates a `.webblackbox` ZIP file:
- All chunks are collected from storage
- Indexes are finalized
- Blobs are included
- Manifest is generated with metadata and stats
- Integrity hashes are computed for all files
- Optional AES-GCM encryption is applied
- Everything is packaged into a ZIP archive
Opening an archive for playback follows these steps:
- ZIP is extracted
- Manifest is parsed and validated
- If encrypted, encryption metadata is extracted
- Chunks are decrypted (if needed) and decoded
- Indexes are loaded
- Blobs are kept in the archive for on-demand retrieval
The Player SDK provides a flexible query API that filters events by:
- Time range — Monotonic timestamp start/end
- Event types — Array of specific types
- Levels — debug, info, warn, error
- Text — Full-text search using the inverted index
- Request ID — Network request correlation
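The filter semantics can be illustrated with a small in-memory query function. `PlayerEvent`, `EventQuery`, and `queryEvents` are hypothetical names, and the sketch uses a linear substring scan for text instead of the inverted index that the SDK is described as using.

```typescript
type Level = "debug" | "info" | "warn" | "error";

interface PlayerEvent {
  id: string;
  t: number;
  type: string;
  level: Level;
  text: string;
  reqId?: string;
}

// Every field is optional; omitted criteria match all events.
interface EventQuery {
  tStart?: number;
  tEnd?: number;
  types?: string[];
  levels?: Level[];
  text?: string;
  reqId?: string;
}

// All provided criteria must match (logical AND).
function queryEvents(events: readonly PlayerEvent[], q: EventQuery): PlayerEvent[] {
  const needle = q.text?.toLowerCase();
  return events.filter(
    (e) =>
      (q.tStart === undefined || e.t >= q.tStart) &&
      (q.tEnd === undefined || e.t <= q.tEnd) &&
      (q.types === undefined || q.types.includes(e.type)) &&
      (q.levels === undefined || q.levels.includes(e.level)) &&
      (needle === undefined || e.text.toLowerCase().includes(needle)) &&
      (q.reqId === undefined || e.reqId === q.reqId),
  );
}
```

For example, `queryEvents(events, { levels: ["error"], tStart: 1000, tEnd: 5000 })` would return only the errors recorded inside that time range.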
| Analysis | Method | Description |
|---|---|---|
| Network waterfall | `getNetworkWaterfall()` | Complete request/response timeline |
| Realtime streams | `getRealtimeNetworkTimeline()` | WebSocket and SSE analysis |
| Storage operations | `getStorageTimeline()` | All storage operations chronologically |
| DOM diffing | `getDomDiffTimeline()` | Changes between DOM snapshots |
| Performance | `getPerformanceArtifacts()` | CPU profiles, heap snapshots, vitals |
| Action spans | `buildDerived()` | User action analysis with stats |
| Session comparison | `compareWith()` | Diff two sessions |
The Player SDK can generate executable code from captured data:
- curl — Replay any HTTP request from the command line
- fetch — Replay any request in JavaScript
- HAR — Standard HTTP Archive for tool interop
- Playwright test — Automated test script from user actions
- Playwright mock — Test script with captured response mocks
- Bug report — Markdown-formatted report with context
- GitHub/Jira issues — Pre-filled issue templates
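As an example of the simplest generator, the sketch below turns a captured request into a `curl` command. `CapturedRequest`, `shellQuote`, and `toCurl` are hypothetical names; the SDK's actual generator and its output format may differ.

```typescript
interface CapturedRequest {
  method: string;
  url: string;
  headers: Record<string, string>;
  body?: string;
}

// Wraps a value in single quotes for a POSIX shell, escaping embedded single quotes.
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Builds a one-line curl command that replays the captured request.
function toCurl(req: CapturedRequest): string {
  const parts = ["curl", "-X", req.method.toUpperCase(), shellQuote(req.url)];
  for (const [name, value] of Object.entries(req.headers)) {
    parts.push("-H", shellQuote(`${name}: ${value}`));
  }
  if (req.body !== undefined) parts.push("--data-raw", shellQuote(req.body));
  return parts.join(" ");
}
```

Quoting every argument matters here because captured URLs and bodies routinely contain `&`, `?`, and quotes that a shell would otherwise interpret.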
The Chrome extension operates across multiple execution contexts with strict message-passing boundaries:
Page World Extension World Background
┌──────────┐ ┌──────────────┐ ┌──────────────┐
│ Injected │ │ Content │ │ Service │
│ Script │──────→│ Script │───────→│ Worker │
│ │ │ │ │ │
│ window. │ │ chrome. │ │ CDP Router │
│ postMsg │ │ runtime. │ │ Recorder │
│ │ │ connect/port │ │ │
└──────────┘ └──────────────┘ └──────┬───────┘
│
┌──────▼───────┐
│ Offscreen │
│ Document │
│ (Pipeline) │
└──────────────┘
- Page World → Extension: `window.postMessage` (injected → content)
- Extension → Background: `chrome.runtime.connect` + `port.postMessage` (content → SW)
- Background → Offscreen: `chrome.runtime.connect` + `port.postMessage` (SW ↔ offscreen)
- CDP: `chrome.debugger.sendCommand` / `onEvent` (SW ↔ browser)
- Sensitive headers are redacted before entering the pipeline
- Body content is pattern-matched and scrubbed
- DOM elements with sensitive selectors are masked
- Archives can be encrypted with AES-GCM
- Algorithm: AES-GCM (256-bit key)
- Key Derivation: PBKDF2 with SHA-256, 120,000 iterations
- Salt: Random 16-byte salt per archive
- IV: Random 12-byte IV per file within the archive
- Scope: Event chunks, indexes, and blobs; manifest remains readable
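The parameters above map onto standard primitives. The sketch below uses Node's `crypto` module so it can run from the command line; the extension itself presumably runs in a browser context, so treat the function names and the `EncryptedFile` layout as assumptions rather than the archive's real on-disk format.

```typescript
import { createCipheriv, createDecipheriv, pbkdf2Sync, randomBytes } from "node:crypto";

const PBKDF2_ITERATIONS = 120000;
const KEY_BYTES = 32; // 256-bit AES key

interface EncryptedFile {
  iv: Buffer; // 12 random bytes, unique per file
  tag: Buffer; // GCM authentication tag
  ciphertext: Buffer;
}

// Derives the archive key from a passphrase and the per-archive 16-byte salt.
function deriveKey(passphrase: string, salt: Buffer): Buffer {
  return pbkdf2Sync(passphrase, salt, PBKDF2_ITERATIONS, KEY_BYTES, "sha256");
}

function encryptFile(key: Buffer, plaintext: Buffer): EncryptedFile {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return { iv, tag: cipher.getAuthTag(), ciphertext };
}

// Throws if the key is wrong or the ciphertext was modified.
function decryptFile(key: Buffer, file: EncryptedFile): Buffer {
  const decipher = createDecipheriv("aes-256-gcm", key, file.iv);
  decipher.setAuthTag(file.tag);
  return Buffer.concat([decipher.update(file.ciphertext), decipher.final()]);
}
```

Deriving the key once per archive and drawing a fresh IV per file keeps the expensive PBKDF2 step out of the per-file path while never reusing a key and IV pair.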
- The extension requires `debugger` permission for CDP access
- `<all_urls>` host permission is needed for content script injection
- Users must explicitly grant permissions during installation