fix(sessions): show "Thinking…" on the group chip while the agent thinks #2961
adboio wants to merge 5 commits into
A turn that's mid extended-thinking folds into a collapsed ToolCallGroupChip,
but `summarize()` derived the chip's label/spinner only from `tool_call`
updates. Thought chunks carry no tool status or title, so a thinking-only turn
summarized as `liveLabel = null` / `active = false` / `doneLabel = "Worked"` —
the chip read as finished ("Worked", no spinner) while the agent was actively
thinking, with the reasoning buried inside the collapsed chip.
Make `summarize()` thinking-aware: a trailing, still-streaming thought
(`thoughtComplete !== true`, the same flag ThoughtView uses for its spinner)
now marks the group active with a "Thinking…" live label. The existing chip
logic then shows "Thinking…" with a spinner (or "Read a file · Thinking…" when
work preceded it). Works live since the active turn's summary is never cached.
Adds buildThreadGroups.test.ts covering the thinking-only, tool-after-thought,
thought-after-work, and completed-thought cases.
Generated-By: PostHog Code
Task-Id: 6fa59fea-01a6-4d72-aadb-2e745e2c5495
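The thinking-aware summary described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from `buildThreadGroups.ts`: the `Item` shapes, the `title` / `doneTitle` fields, and the rule that the last tool call supplies the done label are all assumptions made so the example is self-contained. Only the `thoughtComplete !== true` check and the `GroupSummary` field names come from the text above.

```typescript
// Hypothetical, simplified item shapes; the real ConversationItem is richer.
type ThoughtItem = { kind: "thought"; thoughtComplete?: boolean };
type ToolItem = {
  kind: "tool_call";
  title: string; // assumed live label, e.g. "Read file.ts"
  doneTitle: string; // assumed done label, e.g. "Read a file"
  status: "in_progress" | "completed";
};
type Item = ThoughtItem | ToolItem;

interface GroupSummary {
  active: boolean;
  liveLabel: string | null;
  doneLabel: string;
  hasCountableWork: boolean;
}

function summarize(items: Item[]): GroupSummary {
  const tools = items.filter((i): i is ToolItem => i.kind === "tool_call");
  const running = tools.filter((t) => t.status === "in_progress");

  // Tool-derived state: the latest in-flight tool call supplies the live label.
  const lastRunning = running[running.length - 1];
  let active = lastRunning !== undefined;
  let liveLabel: string | null = lastRunning ? lastRunning.title : null;

  // Thinking-aware part: a trailing thought that is still streaming marks the
  // group active, even though it carries no tool status or title.
  const last = items[items.length - 1];
  if (last !== undefined && last.kind === "thought" && last.thoughtComplete !== true) {
    active = true;
    liveLabel = "Thinking…";
  }

  const lastTool = tools[tools.length - 1];
  return {
    active,
    liveLabel,
    doneLabel: lastTool ? lastTool.doneTitle : "Worked",
    hasCountableWork: tools.length > 0,
  };
}
```

Because the check looks only at the trailing item, a tool call that starts after a thought takes the live label back, which matches the tool-after-thought case listed above.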
React Doctor found no issues in the changed files.

    describe("buildThreadGroups summary — thinking awareness", () => {
      it("reads a turn mid extended-thinking as live, not 'Worked'", () => {
        // A still-streaming thought (thoughtComplete falsy) is the only activity so
        // far: the chip must say it's thinking, not fall back to the done label.
        const summary = summaryOf([thought("th1", { thoughtComplete: false })]);

        expect(summary.active).toBe(true);
        expect(summary.liveLabel).toBe("Thinking…");
        expect(summary.hasCountableWork).toBe(false);
      });

      it("keeps the tool's live label when a tool runs after thinking", () => {
        // Thought, then an in-flight tool call: the tool is the latest activity, so
        // its title wins over the thinking label.
        const summary = summaryOf([
          thought("th1", { thoughtComplete: true }),
          toolItem("t1"),
        ]);

        expect(summary.active).toBe(true);
        expect(summary.liveLabel).toBe("Read file.ts");
      });

      it("shows thinking again when a thought trails completed tool work", () => {
        // Tool finished, agent is thinking once more: countable work plus a live
        // thinking label, so the chip can read "Read a file · Thinking…".
        const summary = summaryOf([
          toolItem("t1"),
          thought("th1", { thoughtComplete: false }),
        ]);

        expect(summary.active).toBe(true);
        expect(summary.liveLabel).toBe("Thinking…");
        expect(summary.hasCountableWork).toBe(true);
        expect(summary.doneLabel).toBe("Read a file");
      });

      it("does not treat a completed thought as live work", () => {
        // A finished turn whose only activity was thinking: no live label, falls
        // back to the "Worked" done label (there is no countable tool work).
        const summary = summaryOf([
          thought("th1", { thoughtComplete: true }, completeContext),
        ]);

        expect(summary.active).toBe(false);
        expect(summary.liveLabel).toBeNull();
        expect(summary.doneLabel).toBe("Worked");
      });
    });

The four cases share the same shape — a list of ConversationItems in, a set of GroupSummary fields out — which is the canonical fit for it.each. With separate it() blocks, adding or adjusting a case also requires updating the surrounding boilerplate, and the inline comments duplicate what a table's column names would express. Consolidating into an it.each table (with active, liveLabel, hasCountableWork, and doneLabel as expectation columns) keeps every scenario OnceAndOnlyOnce and matches the repo's stated test convention.
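As a rough illustration of the table shape this suggestion describes, the sketch below puts the four scenarios into one array of rows with the expectation fields as columns. It uses a plain helper instead of a test runner so it runs on its own; with Jest or Vitest the same rows would be handed to `it.each`. The `Step` encoding, `failingRows`, and `standIn` are hypothetical names invented for this example and are not part of the PR.

```typescript
// Expectation columns; a field left out of a row is simply not checked.
interface Expected {
  active?: boolean;
  liveLabel?: string | null;
  hasCountableWork?: boolean;
  doneLabel?: string;
}

interface Row<Item> {
  name: string;
  items: Item[];
  expected: Expected;
}

// Runs each row through `summarize` and returns the names of rows that fail.
// With Jest or Vitest this loop would be replaced by something like
// it.each(rows)("$name", ({ items, expected }) => { ... }).
function failingRows<Item>(
  rows: Row<Item>[],
  summarize: (items: Item[]) => Expected,
): string[] {
  return rows
    .filter(({ items, expected }) => {
      const actual = summarize(items);
      return (Object.keys(expected) as (keyof Expected)[]).some(
        (key) => actual[key] !== expected[key],
      );
    })
    .map((row) => row.name);
}

// Hypothetical compact encoding of conversation items, for this sketch only.
type Step = "streaming-thought" | "finished-thought" | "tool";

const rows: Row<Step>[] = [
  { name: "thinking-only turn", items: ["streaming-thought"],
    expected: { active: true, liveLabel: "Thinking…", hasCountableWork: false } },
  { name: "tool after thought", items: ["finished-thought", "tool"],
    expected: { active: true, liveLabel: "Read file.ts" } },
  { name: "thought after completed work", items: ["tool", "streaming-thought"],
    expected: { active: true, liveLabel: "Thinking…", hasCountableWork: true, doneLabel: "Read a file" } },
  { name: "completed thought only", items: ["finished-thought"],
    expected: { active: false, liveLabel: null, doneLabel: "Worked" } },
];

// Stand-in for the function under test, present only so the sketch executes.
const standIn = (items: Step[]): Expected => {
  const last = items[items.length - 1];
  const liveLabel =
    last === "streaming-thought" ? "Thinking…" : last === "tool" ? "Read file.ts" : null;
  const worked = items.indexOf("tool") !== -1;
  return {
    active: liveLabel !== null,
    liveLabel,
    hasCountableWork: worked,
    doneLabel: worked ? "Read a file" : "Worked",
  };
};
```

Adding a scenario then means adding one row, and the column names carry what the inline comments previously explained.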
The `biome ci` quality check enforces formatting; wrap the guard's throw onto its own line so the file matches the formatter's output.
Generated-By: PostHog Code
Task-Id: 6fa59fea-01a6-4d72-aadb-2e745e2c5495

The four cases share one shape (items in, GroupSummary fields out), so fold them into a single it.each table (active / liveLabel / hasCountableWork / doneLabel as expectation columns), matching the repo's parameterised-test convention. Per Greptile review feedback.
Generated-By: PostHog Code
Task-Id: 6fa59fea-01a6-4d72-aadb-2e745e2c5495

Expanding a "Worked" chip whose only activity was thinking showed an empty bordered box, and even with content it showed a redundant collapsed "Thinking" row needing a second click.

Two causes:
- Blank extended-thinking streams as a text-less agent_thought_chunk, which ThoughtView renders as null. But the chip's bordered box draws whenever it has children (a hidden child still defeats CSS :empty), so the box stayed empty.
- ThoughtView collapses its content by default, so a non-blank thought inside an already-expanded chip showed only a "Thinking" header.

Fixes:
- groupItemRendersContent() mirrors the null-returning branches of SessionUpdateView/ThoughtView; ConversationView only feeds renderable items to the chip and passes no children when none render, so ToolRow skips the box.
- ToolCallGroupChip gains an `expandable` prop: a group with no renderable body is a plain summary line with no caret, instead of a caret opening onto nothing.
- ThoughtView renders its content open by default: thinking has no useful one-line summary, so once revealed the reasoning itself is what's worth seeing.

Adds groupItemRendersContent test cases (blank/streaming/text/tool).
Generated-By: PostHog Code
Task-Id: 6fa59fea-01a6-4d72-aadb-2e745e2c5495
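The renderability check in the commit above might look something like the sketch below. It is an assumption-laden simplification: the `GroupItem` shapes and the `chipBody` helper are invented for illustration, and the real `groupItemRendersContent()` mirrors more null-returning branches than the single blank-thought rule shown here.

```typescript
// Hypothetical, simplified item shapes for illustration only.
type GroupItem =
  | { kind: "thought"; text: string }
  | { kind: "tool_call"; title: string };

// True when the item would draw something inside the expanded chip.
// Assumption: a thought with no visible text renders as null.
function groupItemRendersContent(item: GroupItem): boolean {
  if (item.kind === "thought") return item.text.trim().length > 0;
  return true;
}

// Hypothetical helper: only renderable items become children, and the chip
// is expandable only when at least one child would actually render.
function chipBody(items: GroupItem[]): { expandable: boolean; children: GroupItem[] } {
  const children = items.filter(groupItemRendersContent);
  return { expandable: children.length > 0, children };
}
```

Filtering before the children reach the chip is what avoids the empty bordered box: a hidden child would still count as a child, whereas an empty list lets the chip render as a plain summary line.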
Problem
When the agent is thinking, its reasoning streams as `agent_thought_chunk` items. By default (`conversationCollapseMode: "all"`) every turn folds into a collapsed `ToolCallGroupChip`, whose label/spinner come from `summarize()` in `buildThreadGroups.ts`. That function only inspected `tool_call` updates (thought chunks carry no tool status or title), so a turn that's mid-thought (before its first tool call) summarized as:

`liveLabel = null`, `active = false`, no countable work → `doneLabel = "Worked"`

The chip then computed `running = false` and rendered a static "Worked" with no spinner, even though the agent was actively thinking. The actual reasoning was buried inside the collapsed chip, so you had to expand it to see anything was happening.

Fix
Make `summarize()` thinking-aware: if the group's trailing item is a still-streaming thought (`thoughtComplete !== true`, the same flag `ThoughtView` uses to drive its own spinner), the group is marked `active` with a `"Thinking…"` live label. The existing `ToolCallGroupChip` logic does the rest: it shows "Thinking…" with a spinner, or "Read a file · Thinking…" when work preceded it.

Works live because the active turn's summary is never cached (both the memoization in `buildThreadGroups` and the incremental grouper recompute it on each streamed chunk).

Tests
Adds `buildThreadGroups.test.ts` covering: thinking-only turn, tool-after-thought, thought-after-completed-work, and completed-thought-only. All pass; existing `incrementalThreadGrouping` tests still pass.

Note
This fixes the chip masquerading as "Worked". A separate, smaller layer remains: even after expanding the chip, the inner `ThoughtView` row is itself collapsed by default, so live thinking text needs one more click. Left as-is for now; happy to also auto-expand the thought body while it streams if wanted.

🤖 Generated with Claude Code
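The point about the active turn's summary never being cached can be illustrated with a small memoization sketch. Everything here is hypothetical: `makeSummaryCache`, the `Turn` shape, and keying the cache by turn id are assumptions chosen to show the idea, not the memoization that `buildThreadGroups` or the incremental grouper actually use.

```typescript
interface Summary {
  active: boolean;
  liveLabel: string | null;
}

// Hypothetical turn shape: `complete` is false while chunks are still streaming.
interface Turn<Item> {
  id: string;
  items: Item[];
  complete: boolean;
}

// Wraps a summarize function so that finished turns are computed once, while
// the in-progress turn is recomputed on every call and so reflects each chunk.
function makeSummaryCache<Item>(summarize: (items: Item[]) => Summary) {
  const cache: { [turnId: string]: Summary | undefined } = {};
  return function summaryFor(turn: Turn<Item>): Summary {
    if (!turn.complete) return summarize(turn.items); // never cached
    const hit = cache[turn.id];
    if (hit !== undefined) return hit;
    const fresh = summarize(turn.items);
    cache[turn.id] = fresh;
    return fresh;
  };
}
```

Under this arrangement a change to `summarize()` such as the thinking-aware label takes effect on the next streamed chunk without any cache invalidation.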