V0.5+: own getSessionMessages — re-implement to preserve compactMetadata, subtype, isCompactSummary #13

Description

@yyq1025

Problem

@anthropic-ai/claude-agent-sdk's getSessionMessages strips a bunch of fields from the JSONL on the way out. Specifically, every returned `SessionMessage` is shaped:

```
{ type, uuid, session_id, message, parent_tool_use_id, timestamp }
```

So we lose:

  • `subtype` (e.g. `'compact_boundary'` vs `'stop_hook_summary'` — both come back as anonymous `{type:'system'}`)
  • `compactMetadata` (trigger / preTokens / postTokens / durationMs)
  • `isCompactSummary` (the flag on the SDK-injected user message after compaction)
  • `logicalParentUuid` (compact backpointer)
  • `content` (outer-level field on system messages, distinct from `message.content`)

Confirmed across SDK 0.3.142 and 0.3.152: same 6-field shape in both versions. The strip happens in an internal mapping function (`AH` in the minified mjs) that the public API funnels through. `includeSystemMessages: true` (the only relevant option) gives us back system messages, but each one is anonymous, so we can't distinguish a compact_boundary from a stop_hook_summary.

Why it matters

Immediate: resume-time compact UI can't render the divider with token counts. Live path is fine (SDK live stream emits `SDKCompactBoundaryMessage` and `SDKStatusMessage` with full fields), but reload loses the divider entirely or falls back to a degraded "Context compacted" line with no numbers.

Future use cases the strip blocks:

  • Per-message timestamps for UI (`timestamp` IS in the strip output, but other fields aren't — and we may want more)
  • `logicalParentUuid` for handling partial compaction's preserved-segment splice correctly
  • Distinguishing system message subtypes for richer chrome (stop_hook_summary as a different visual)
  • Any future field Claude Code adds to JSONL that SDK happens not to surface

Why feasible to re-implement

The Python SDK at `claude-agent-sdk-python` (cloned locally for reference) is non-minified, well-annotated, and reads the same JSONL format. Re-implementation is roughly a line-by-line TS port. Key insight: Python SDK doesn't "strip" anything either; it just doesn't put the extra fields into its narrow public `SessionMessage` type. The raw transcript-entry dict carries everything, and `_parse_transcript_entries` keeps it. We just need our own message type that's wider than the SDK's.
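A minimal sketch of what that wider type and its adapter could look like. The extra field names come from the list of stripped fields above; the name `toExtendedMessage`, the camelCase raw key `sessionId`, and the `null` default for `parent_tool_use_id` are assumptions for illustration, not confirmed against the JSONL or the SDK source.

```typescript
// Hypothetical wider message type: the SDK's six fields plus the ones it drops.
interface CompactMetadata {
  trigger?: string;
  preTokens?: number;
  postTokens?: number;
  durationMs?: number;
}

interface ExtendedSessionMessage {
  // The six fields the SDK already returns.
  type: string;
  uuid: string;
  session_id: string;
  message: unknown;
  parent_tool_use_id: string | null;
  timestamp?: string;
  // Fields listed in this issue as lost.
  subtype?: string;
  compactMetadata?: CompactMetadata;
  isCompactSummary?: boolean;
  logicalParentUuid?: string;
  content?: unknown;
}

type RawEntry = Record<string, unknown>;

// Return the value only if it is a string, otherwise undefined.
function asString(v: unknown): string | undefined {
  return typeof v === "string" ? v : undefined;
}

// Adapter from one raw transcript entry to the wider type.
function toExtendedMessage(entry: RawEntry): ExtendedSessionMessage {
  const meta = entry["compactMetadata"];
  return {
    type: asString(entry["type"]) ?? "",
    uuid: asString(entry["uuid"]) ?? "",
    session_id: asString(entry["sessionId"]) ?? "", // assumption: raw key is camelCase
    message: entry["message"],
    parent_tool_use_id: null, // assumption: not derivable from the raw entry
    timestamp: asString(entry["timestamp"]),
    subtype: asString(entry["subtype"]),
    compactMetadata:
      typeof meta === "object" && meta !== null ? (meta as CompactMetadata) : undefined,
    isCompactSummary: entry["isCompactSummary"] === true ? true : undefined,
    logicalParentUuid: asString(entry["logicalParentUuid"]),
    content: entry["content"],
  };
}
```

Keeping the first six fields identical to the SDK's shape means existing consumers of `SessionMessage` should be able to accept the wider type unchanged.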

Critical algorithm detail (Python source notes this explicitly — easy to get wrong):

Walk `parentUuid` NOT `logicalParentUuid`. `logicalParentUuid` is set on compact_boundary entries as a backpointer to the pre-compact chain; following it would duplicate post-compaction content.
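A hedged sketch of the walk-back, to make the rule concrete. `buildChain` and `ChainEntry` are hypothetical names, leaf selection is assumed to have happened already, and the sample data assumes the compact_boundary entry has a null `parentUuid`; none of this is copied from the Python source.

```typescript
interface ChainEntry {
  uuid: string;
  parentUuid: string | null;
  // Present on compact_boundary entries; deliberately never followed here.
  logicalParentUuid?: string | null;
}

// Rebuild the conversation chain from a chosen leaf back to the root,
// following parentUuid only. Returns entries in chronological order.
function buildChain<T extends ChainEntry>(entries: T[], leafUuid: string): T[] {
  const byUuid = new Map<string, T>();
  for (const e of entries) byUuid.set(e.uuid, e);

  const chain: T[] = [];
  const seen = new Set<string>(); // guards against cycles in corrupt transcripts
  let cur = byUuid.get(leafUuid);
  while (cur !== undefined && !seen.has(cur.uuid)) {
    seen.add(cur.uuid);
    chain.push(cur);
    cur = cur.parentUuid !== null ? byUuid.get(cur.parentUuid) : undefined;
  }
  return chain.reverse();
}

// Illustrative data: two pre-compact entries, then a boundary, a summary, and a reply.
const sample: ChainEntry[] = [
  { uuid: "a", parentUuid: null },
  { uuid: "b", parentUuid: "a" },
  { uuid: "boundary", parentUuid: null, logicalParentUuid: "b" },
  { uuid: "summary", parentUuid: "boundary" },
  { uuid: "c", parentUuid: "summary" },
];
const chainUuids = buildChain(sample, "c").map((e) => e.uuid);
```

Under these assumptions the walk from `c` stops at the boundary, so the pre-compact entries `a` and `b` stay out of the result even though `logicalParentUuid` points at them.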

Estimated effort

| Module | LOC | Notes |
| --- | --- | --- |
| Path resolution (project key + worktree) | 120-150 | Bun/Node hash-mismatch fallback via prefix scan |
| JSONL streaming parser | 80-120 | Line-by-line, skip corrupt |
| Chain rebuild + compact splice | 100-140 | UUID index, leaf selection, walk-back |
| Filtering (isMeta / isSidechain / teamName) | 40-60 | Match SDK filter contract |
| Enhanced SessionMessage type + adapter | 50-80 | Wider type that preserves the fields we want |
| Tests + types | 100-150 | |
| Total | 490-700 | 2-3 focused days |
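For the "line-by-line, skip corrupt" parser row, a rough sketch of the core loop. It works on in-memory text to stay self-contained; a streaming version would feed the same per-line logic from a file reader. `parseJsonl` and `TranscriptEntry` are hypothetical names.

```typescript
type TranscriptEntry = Record<string, unknown>;

// Parse JSONL text one line at a time. Blank lines, lines that fail to parse,
// and lines that are not JSON objects are skipped rather than aborting the read.
function parseJsonl(text: string): TranscriptEntry[] {
  const entries: TranscriptEntry[] = [];
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(trimmed);
    } catch {
      continue; // corrupt or truncated line, e.g. from an interrupted write
    }
    if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
      entries.push(parsed as TranscriptEntry);
    }
  }
  return entries;
}
```

Because every parsed object is kept whole, fields such as `subtype` and `compactMetadata` survive this stage without any explicit handling.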

Tradeoffs vs the V0 heuristic approach

For V0 we ship a degraded "detect summary by content prefix, look back for preceding system" heuristic that gives us divider position without token counts. Works but:

  • No token-count caption on divider (`Context compacted · 215k → 18k` becomes just `Context compacted`)
  • Fragile to content-prefix changes (Claude Code's summary template is hardcoded today; could change)
  • Doesn't generalize beyond compact (every new SDK-strip pain repeats the same workaround pattern)

The re-implementation kills all three.
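To illustrate the first bullet, here is one way the divider caption could be built once `preTokens` and `postTokens` are available, with a fallback to the degraded text when they are not. `formatCompactCaption` is a hypothetical helper, and rounding to the nearest thousand is an assumption about the display format.

```typescript
interface CompactCounts {
  preTokens?: number;
  postTokens?: number;
}

// Render a token count as "215k" style; counts under 1000 are shown as-is.
function formatTokens(n: number): string {
  return n >= 1000 ? `${Math.round(n / 1000)}k` : String(n);
}

// Full caption when both counts are known, degraded caption otherwise.
function formatCompactCaption(meta?: CompactCounts): string {
  if (meta?.preTokens === undefined || meta.postTokens === undefined) {
    return "Context compacted";
  }
  return `Context compacted · ${formatTokens(meta.preTokens)} → ${formatTokens(meta.postTokens)}`;
}
```

With the V0 heuristic only the fallback branch is reachable on resume, which is exactly the degradation described above.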

Migration path

  1. Write new `packages/daemon/src/messages/session-reader.ts` (port from Python SDK)
  2. Define `ExtendedSessionMessage` type in protocol or daemon (wider than SDK's)
  3. Replace the `getSessionMessages` call site in `daemon/src/index.ts`
  4. Strip the V0 heuristic from `normalize.ts`; replace with proper field reads
  5. `normalize.ts` can now emit `compact_divider` with full metadata on resume path, matching live path's reducer-emitted dividers

`getSessionInfo` we likely keep using from SDK (smaller surface). Only `getSessionMessages` worth replacing.
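Steps 4 and 5 might reduce to something like the following once the reader preserves the fields. This is only a sketch: `ReaderMessage`, `TranscriptItem`, and `normalizeMessage` are hypothetical, and the real `normalize.ts` output shape is not shown in this issue.

```typescript
// Subset of the wider message type that this step needs.
interface ReaderMessage {
  type: string;
  uuid: string;
  subtype?: string;
  isCompactSummary?: boolean;
  compactMetadata?: { preTokens?: number; postTokens?: number };
}

// Hypothetical normalized items; only the compact-related kinds are spelled out.
type TranscriptItem =
  | { kind: "compact_divider"; uuid: string; preTokens: number | undefined; postTokens: number | undefined }
  | { kind: "compact_summary"; uuid: string }
  | { kind: "message"; uuid: string };

// Classify by explicit fields instead of sniffing the summary's content prefix.
function normalizeMessage(m: ReaderMessage): TranscriptItem {
  if (m.type === "system" && m.subtype === "compact_boundary") {
    return {
      kind: "compact_divider",
      uuid: m.uuid,
      preTokens: m.compactMetadata?.preTokens,
      postTokens: m.compactMetadata?.postTokens,
    };
  }
  if (m.type === "user" && m.isCompactSummary === true) {
    return { kind: "compact_summary", uuid: m.uuid };
  }
  return { kind: "message", uuid: m.uuid };
}
```

A `stop_hook_summary` system message falls through to the generic branch here; giving it its own kind would be a small extension.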

When to pick this up

When at least one of:

  • Multiple session-read features are blocked by the strip (timestamps for UI, partial-compact preserved-segment splice, etc.)
  • Claude Code changes the summary template text and breaks the V0 heuristic
  • We add fork session support (issue V0.5+: support /fork, /rename, /rewind slash commands #10) and need full `parentUuid` chain rebuild on our side
  • A weekend opens up

Not blocking V0 ship.

Related

  • V0 compact UI heuristic ships in upcoming Slice 2 commits — this issue's GA replaces that heuristic
  • Slice 2 plans: noise filter / compact_started + compact_applied lifecycle (live works regardless) / divider + summary in transcript
  • Fork sessions (V0.5+: support /fork, /rename, /rewind slash commands #10) would also need this for correct fork chain rebuilds
