Problem
`getSessionMessages` in `@anthropic-ai/claude-agent-sdk` strips a number of fields from the JSONL on the way out. Specifically, every returned `SessionMessage` has the shape `{ type, uuid, session_id, message, parent_tool_use_id, timestamp }`.
So we lose:
- `isCompactSummary` (the flag on the SDK-injected user message after compaction)
- `logicalParentUuid` (compact backpointer)
- `content` (outer-level field on system messages, distinct from `message.content`)
Confirmed on SDK 0.3.142 and 0.3.152: both versions return the same 6-field shape. The strip happens in an internal mapping function (`AH` in the minified mjs) that the public API funnels through. `includeSystemMessages: true` (the only relevant option) gives us back system messages, but each one is anonymous: we can't distinguish a compact_boundary from a stop_hook_summary.
Why it matters
Immediate: resume-time compact UI can't render the divider with token counts. Live path is fine (SDK live stream emits `SDKCompactBoundaryMessage` and `SDKStatusMessage` with full fields), but reload loses the divider entirely or falls back to a degraded "Context compacted" line with no numbers.
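As a rough illustration of the two captions described above, the sketch below formats the divider text with and without token counts. The helper names and the `preTokens` / `postTokens` parameters are hypothetical; this issue does not say how the counts are named in the JSONL or in the live messages.

```typescript
// Hypothetical helper: renders a token count in the short "215k" style.
function formatTokens(n: number): string {
  return n >= 1000 ? `${Math.round(n / 1000)}k` : String(n);
}

// Hypothetical helper: builds the divider caption.
// Falls back to the degraded caption when either count is unavailable,
// which is the situation on the resume path today.
function compactCaption(preTokens?: number, postTokens?: number): string {
  if (preTokens === undefined || postTokens === undefined) {
    return "Context compacted";
  }
  return `Context compacted · ${formatTokens(preTokens)} → ${formatTokens(postTokens)}`;
}
```

With both counts present this yields the full caption; with either missing it degrades to the bare "Context compacted" line.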
Future use cases the strip blocks:
- Per-message timestamps for UI (`timestamp` is in the stripped output, but other fields aren't, and we may want more)
- `logicalParentUuid` for handling partial compaction's preserved-segment splice correctly
- Distinguishing system message subtypes for richer chrome (stop_hook_summary as a different visual)
- Any future field Claude Code adds to the JSONL that the SDK happens not to surface
Why feasible to re-implement
The Python SDK at `claude-agent-sdk-python` (cloned locally for reference) is non-minified, well-annotated, and reads the same JSONL format. Re-implementation is roughly a line-by-line TS port. Key insight: the Python SDK doesn't actually strip anything at the parsing layer; it simply doesn't copy the extra fields into its narrow public `SessionMessage` type. The raw transcript-entry dict carries everything, and `_parse_transcript_entries` keeps it. We just need our own type that is wider than the SDK's.
Critical algorithm detail (the Python source notes this explicitly, and it is easy to get wrong):
Walk `parentUuid` NOT `logicalParentUuid`. `logicalParentUuid` is set on compact_boundary entries as a backpointer to the pre-compact chain; following it would duplicate post-compaction content.
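A minimal sketch of that walk-back, assuming each entry carries `uuid`, `parentUuid`, and optionally `logicalParentUuid`. The `TranscriptEntry` type and `rebuildChain` name are hypothetical, and leaf selection is left to the caller. The walk reads only `parentUuid`, so it stops at any entry whose `parentUuid` is null.

```typescript
// Hypothetical minimal entry shape; real transcript entries carry more fields.
interface TranscriptEntry {
  uuid: string;
  parentUuid: string | null;
  logicalParentUuid?: string | null; // deliberately never followed below
  type: string;
}

// Walks back from the chosen leaf via parentUuid and returns the chain
// in chronological order (oldest first).
function rebuildChain(entries: TranscriptEntry[], leafUuid: string): TranscriptEntry[] {
  const byUuid = new Map<string, TranscriptEntry>();
  for (const entry of entries) byUuid.set(entry.uuid, entry);

  const chain: TranscriptEntry[] = [];
  const seen = new Set<string>(); // guards against cycles in corrupt files
  let current = byUuid.get(leafUuid);
  while (current !== undefined && !seen.has(current.uuid)) {
    seen.add(current.uuid);
    chain.push(current);
    current = current.parentUuid !== null ? byUuid.get(current.parentUuid) : undefined;
  }
  return chain.reverse();
}
```

If a compact_boundary entry has a null `parentUuid` (an assumption here, not something this issue states), the walk ends at the boundary and the pre-compact chain is reachable only through `logicalParentUuid`, which this function ignores.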
Estimated effort
| Module | LOC | Notes |
| --- | --- | --- |
| Path resolution (project key + worktree) | 120-150 | Bun/Node hash-mismatch fallback via prefix scan |
| JSONL streaming parser | 80-120 | Line-by-line, skip corrupt |
| Chain rebuild + compact splice | 100-140 | UUID index, leaf selection, walk-back |
| Filtering (isMeta / isSidechain / teamName) | 40-60 | Match SDK filter contract |
| Enhanced SessionMessage type + adapter | 50-80 | Wider type that preserves the fields we want |
| Tests + types | 100-150 | |
| Total | 490-700 | 2-3 focused days |
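For a sense of scale on the parser row, here is a small sketch of the "line-by-line, skip corrupt" behaviour. It takes the file contents as a string to stay self-contained; a real reader would presumably stream from disk. The names `JsonlEntry` and `parseJsonlLines` are hypothetical.

```typescript
// Hypothetical raw-entry type: every field from the JSONL line is kept as-is.
type JsonlEntry = Record<string, unknown>;

// Parses JSONL text one line at a time. Blank lines are ignored; lines that
// fail to parse, or that parse to something other than a JSON object, are
// counted and skipped instead of aborting the whole read.
function parseJsonlLines(text: string): { entries: JsonlEntry[]; skipped: number } {
  const entries: JsonlEntry[] = [];
  let skipped = 0;
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const value: unknown = JSON.parse(trimmed);
      if (typeof value === "object" && value !== null && !Array.isArray(value)) {
        entries.push(value as JsonlEntry);
      } else {
        skipped++;
      }
    } catch {
      skipped++;
    }
  }
  return { entries, skipped };
}
```

Returning the skip count alongside the entries is a design choice for this sketch, so callers can log how much of a transcript was unreadable.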
Tradeoffs vs the V0 heuristic approach
For V0 we ship a degraded "detect summary by content prefix, look back for preceding system" heuristic that gives us divider position without token counts. Works but:
- No token-count caption on divider (`Context compacted · 215k → 18k` becomes just `Context compacted`)
- Fragile to content-prefix changes (Claude Code's summary template is hardcoded today; could change)
- Doesn't generalize beyond compact (every new SDK-strip pain repeats the same workaround pattern)
The re-implementation kills all three.
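For comparison, the V0 heuristic might look roughly like the sketch below. This is one reading of "detect summary by content prefix, look back for preceding system", not the shipped code: the `V0Message` shape and `findDividerIndexes` name are hypothetical, and the prefix is passed in as a parameter because the actual summary template text is not given here.

```typescript
// Hypothetical flattened view of a resumed message.
interface V0Message {
  type: string; // e.g. "user", "assistant", "system"
  text: string; // flattened text content
}

// Returns the indexes where a compact divider should be rendered.
// A user message starting with the summary prefix marks a compaction; if the
// message immediately before it is a system message, the divider is anchored
// there, otherwise on the summary message itself.
function findDividerIndexes(messages: V0Message[], summaryPrefix: string): number[] {
  const dividers: number[] = [];
  messages.forEach((msg, i) => {
    if (msg.type !== "user" || !msg.text.startsWith(summaryPrefix)) return;
    const prev = i > 0 ? messages[i - 1] : undefined;
    dividers.push(prev !== undefined && prev.type === "system" ? i - 1 : i);
  });
  return dividers;
}
```

The sketch makes the fragility visible: everything hinges on `summaryPrefix` matching the current template, and no token counts are available at any point.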
Migration path
1. Write new `packages/daemon/src/messages/session-reader.ts` (port from Python SDK)
2. Define `ExtendedSessionMessage` type in protocol or daemon (wider than SDK's)
3. Replace the `getSessionMessages` call site in `daemon/src/index.ts`
4. Strip the V0 heuristic from `normalize.ts`; replace with proper field reads
5. `normalize.ts` can now emit `compact_divider` with full metadata on resume path, matching live path's reducer-emitted dividers
We will likely keep using `getSessionInfo` from the SDK (smaller surface). Only `getSessionMessages` is worth replacing.
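One possible shape for `ExtendedSessionMessage` and its adapter is sketched below. The six base fields come from the shape reported in this issue, and the three extras are the fields it lists as lost. The individual field types, the `BaseSessionMessage` name, and the `extendMessage` helper are assumptions for illustration.

```typescript
// The six fields getSessionMessages is reported to return (types assumed).
interface BaseSessionMessage {
  type: string;
  uuid: string;
  session_id: string;
  message: unknown;
  parent_tool_use_id: string | null;
  timestamp?: string;
}

// Wider type that keeps the fields this issue lists as lost.
interface ExtendedSessionMessage extends BaseSessionMessage {
  isCompactSummary?: boolean;
  logicalParentUuid?: string | null;
  content?: unknown; // outer-level field on system messages
}

// Hypothetical adapter: copies the extra fields from the raw JSONL entry onto
// an already-narrowed message, checking types instead of trusting the file.
function extendMessage(
  base: BaseSessionMessage,
  raw: Record<string, unknown>,
): ExtendedSessionMessage {
  const out: ExtendedSessionMessage = { ...base };
  const isSummary = raw["isCompactSummary"];
  if (typeof isSummary === "boolean") out.isCompactSummary = isSummary;
  const logicalParent = raw["logicalParentUuid"];
  if (typeof logicalParent === "string" || logicalParent === null) {
    out.logicalParentUuid = logicalParent;
  }
  if ("content" in raw) out.content = raw["content"];
  return out;
}
```

Keeping the extras optional means entries that never had them (ordinary user and assistant turns) pass through unchanged.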
When to pick this up
When at least one of:
- Multiple session-read features are blocked by the strip (timestamps for UI, partial-compact preserved-segment splice, etc.)
- Claude Code changes the summary template text and breaks the V0 heuristic
Not blocking V0 ship.
Related