fix(sessions): show "Thinking…" on the group chip while the agent thinks #2961

Draft
adboio wants to merge 5 commits into main from posthog-code/thinking-chip-label

adboio commented Jun 27, 2026

Contributor

Problem

When the agent is thinking, its reasoning streams as agent_thought_chunk items. By default (conversationCollapseMode: "all") every turn folds into a collapsed ToolCallGroupChip, whose label/spinner come from summarize() in buildThreadGroups.ts. That function only inspected tool_call updates — thought chunks carry no tool status or title — so a turn that's mid-thought (before its first tool call) summarized as:

  • liveLabel = null, active = false, no countable work → doneLabel = "Worked"

The chip then computed running = false and rendered a static "Worked" with no spinner, even though the agent was actively thinking. The actual reasoning was buried inside the collapsed chip, so you had to expand it to see anything was happening.

Fix

Make summarize() thinking-aware: if the group's trailing item is a still-streaming thought (thoughtComplete !== true — the same flag ThoughtView uses to drive its own spinner), the group is marked active with a "Thinking…" live label. The existing ToolCallGroupChip logic does the rest:

  • Mid-thought, no tools yet → chip shows "Thinking…" with a spinner instead of "Worked".
  • Thinking after some tool work → chip shows "Read a file · Thinking…".
  • A finished thinking-only turn still collapses to "Worked" (correct — it's done).

This works live because the active turn's summary is never cached: both the memoization in buildThreadGroups and the incremental grouper recompute it on each streamed chunk.
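As a rough illustration of the rule described above, here is a minimal sketch of a thinking-aware summary. The `Item` and `GroupSummary` shapes, the `doneTitle` field, and the `chipText` helper are all assumptions made for this example; the real `summarize()` in buildThreadGroups.ts and the real ToolCallGroupChip likely differ in detail.

```typescript
// Hypothetical, simplified shapes: the repo's real ConversationItem and
// GroupSummary types almost certainly carry more fields than this.
type ThoughtItem = { kind: "thought"; thoughtComplete?: boolean };
type ToolItem = {
  kind: "tool_call";
  title: string; // live title, e.g. "Read file.ts"
  doneTitle: string; // assumed past-tense summary, e.g. "Read a file"
  status: "in_progress" | "completed";
};
type Item = ThoughtItem | ToolItem;

interface GroupSummary {
  active: boolean;
  liveLabel: string | null;
  doneLabel: string;
  hasCountableWork: boolean;
}

function summarize(items: Item[]): GroupSummary {
  const tools = items.filter((i): i is ToolItem => i.kind === "tool_call");
  const last = items[items.length - 1] as Item | undefined;

  let liveLabel: string | null = null;
  if (last !== undefined && last.kind === "tool_call") {
    // An in-flight tool call keeps its own title as the live label.
    if (last.status === "in_progress") liveLabel = last.title;
  } else if (last !== undefined && last.thoughtComplete !== true) {
    // Trailing thought that is still streaming: the turn is live.
    liveLabel = "Thinking…";
  }

  return {
    active: liveLabel !== null,
    liveLabel,
    doneLabel:
      tools.length > 0 ? tools.map((t) => t.doneTitle).join(", ") : "Worked",
    hasCountableWork: tools.length > 0,
  };
}

// Assumed chip composition: done work first, then the live label.
function chipText(s: GroupSummary): string {
  if (!s.active || s.liveLabel === null) return s.doneLabel;
  return s.hasCountableWork ? `${s.doneLabel} · ${s.liveLabel}` : s.liveLabel;
}
```

Under these assumptions, a lone streaming thought yields "Thinking…", a completed tool call followed by a streaming thought yields "Read a file · Thinking…", and a finished thinking-only turn falls back to "Worked".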

Tests

Adds buildThreadGroups.test.ts covering: thinking-only turn, tool-after-thought, thought-after-completed-work, and completed-thought-only. All pass; existing incrementalThreadGrouping tests still pass.

Note

This fixes the chip masquerading as "Worked". A separate, smaller layer remains: even after expanding the chip, the inner ThoughtView row is itself collapsed by default, so live thinking text needs one more click. Left as-is for now — happy to also auto-expand the thought body while it streams if wanted.

🤖 Generated with Claude Code

A turn that's mid extended-thinking folds into a collapsed ToolCallGroupChip,
but `summarize()` derived the chip's label/spinner only from `tool_call`
updates. Thought chunks carry no tool status or title, so a thinking-only turn
summarized as `liveLabel = null` / `active = false` / `doneLabel = "Worked"` —
the chip read as finished ("Worked", no spinner) while the agent was actively
thinking, with the reasoning buried inside the collapsed chip.

Make `summarize()` thinking-aware: a trailing, still-streaming thought
(`thoughtComplete !== true`, the same flag ThoughtView uses for its spinner)
now marks the group active with a "Thinking…" live label. The existing chip
logic then shows "Thinking…" with a spinner (or "Read a file · Thinking…" when
work preceded it). Works live since the active turn's summary is never cached.

Adds buildThreadGroups.test.ts covering the thinking-only, tool-after-thought,
thought-after-work, and completed-thought cases.

Generated-By: PostHog Code
Task-Id: 6fa59fea-01a6-4d72-aadb-2e745e2c5495

github-actions Bot commented Jun 27, 2026


React Doctor found no issues in the changed files. 🎉

Reviewed by React Doctor for commit c6c511b.


greptile-apps Bot commented Jun 27, 2026

Contributor

Reviews (1): Last reviewed commit: "fix(sessions): show "Thinking…" on the g..."

Comment on lines +65 to +113
describe("buildThreadGroups summary — thinking awareness", () => {
  it("reads a turn mid extended-thinking as live, not 'Worked'", () => {
    // A still-streaming thought (thoughtComplete falsy) is the only activity so
    // far: the chip must say it's thinking, not fall back to the done label.
    const summary = summaryOf([thought("th1", { thoughtComplete: false })]);

    expect(summary.active).toBe(true);
    expect(summary.liveLabel).toBe("Thinking…");
    expect(summary.hasCountableWork).toBe(false);
  });

  it("keeps the tool's live label when a tool runs after thinking", () => {
    // Thought, then an in-flight tool call: the tool is the latest activity, so
    // its title wins over the thinking label.
    const summary = summaryOf([
      thought("th1", { thoughtComplete: true }),
      toolItem("t1"),
    ]);

    expect(summary.active).toBe(true);
    expect(summary.liveLabel).toBe("Read file.ts");
  });

  it("shows thinking again when a thought trails completed tool work", () => {
    // Tool finished, agent is thinking once more: countable work plus a live
    // thinking label, so the chip can read "Read a file · Thinking…".
    const summary = summaryOf([
      toolItem("t1"),
      thought("th1", { thoughtComplete: false }),
    ]);

    expect(summary.active).toBe(true);
    expect(summary.liveLabel).toBe("Thinking…");
    expect(summary.hasCountableWork).toBe(true);
    expect(summary.doneLabel).toBe("Read a file");
  });

  it("does not treat a completed thought as live work", () => {
    // A finished turn whose only activity was thinking: no live label, falls
    // back to the "Worked" done label (there is no countable tool work).
    const summary = summaryOf([
      thought("th1", { thoughtComplete: true }, completeContext),
    ]);

    expect(summary.active).toBe(false);
    expect(summary.liveLabel).toBeNull();
    expect(summary.doneLabel).toBe("Worked");
  });
});


P2 Prefer parameterised tests

The four cases share the same shape — a list of ConversationItems in, a set of GroupSummary fields out — which is the canonical fit for it.each. With separate it() blocks, adding or adjusting a case also requires updating the surrounding boilerplate, and the inline comments duplicate what a table's column names would express. Consolidating into an it.each table (with active, liveLabel, hasCountableWork, and doneLabel as expectation columns) keeps every scenario OnceAndOnlyOnce and matches the repo's stated test convention.

Context Used: Do not attempt to comment on incorrect alphabetica... (source)

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
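To make the suggestion concrete, the four scenarios could be folded into one table along these lines. This is only a sketch: `summarizeStandIn` and the string-encoded `Item` type are hypothetical stand-ins for the repo's `summarize()` and ConversationItem fixtures, and a plain loop stands in for the test runner so the block runs on its own. With Vitest, the same `cases` array could be passed to `it.each(cases)`.

```typescript
// Hypothetical stand-ins: the real tests would feed ConversationItems to the
// repo's summarize(); a string per item keeps this sketch self-contained.
type Item = "thought:streaming" | "thought:done" | "tool";

interface Summary {
  active: boolean;
  liveLabel: string | null;
  hasCountableWork: boolean;
  doneLabel: string;
}

function summarizeStandIn(items: Item[]): Summary {
  const last = items[items.length - 1];
  const hasTool = items.some((i) => i === "tool");
  const liveLabel =
    last === "tool"
      ? "Read file.ts"
      : last === "thought:streaming"
        ? "Thinking…"
        : null;
  return {
    active: liveLabel !== null,
    liveLabel,
    hasCountableWork: hasTool,
    doneLabel: hasTool ? "Read a file" : "Worked",
  };
}

// One row per scenario; only the columns a scenario cares about are listed.
interface Case {
  name: string;
  items: Item[];
  expected: Partial<Summary>;
}

const cases: Case[] = [
  {
    name: "thinking only",
    items: ["thought:streaming"],
    expected: { active: true, liveLabel: "Thinking…", hasCountableWork: false },
  },
  {
    name: "tool after thought",
    items: ["thought:done", "tool"],
    expected: { active: true, liveLabel: "Read file.ts" },
  },
  {
    name: "thought after tool work",
    items: ["tool", "thought:streaming"],
    expected: {
      active: true,
      liveLabel: "Thinking…",
      hasCountableWork: true,
      doneLabel: "Read a file",
    },
  },
  {
    name: "completed thought only",
    items: ["thought:done"],
    expected: { active: false, liveLabel: null, doneLabel: "Worked" },
  },
];

// Returns the names of rows whose expected columns do not match.
function failingCases(table: Case[]): string[] {
  return table
    .filter(({ items, expected }) => {
      const actual = summarizeStandIn(items);
      return (Object.keys(expected) as (keyof Summary)[]).some(
        (key) => actual[key] !== expected[key],
      );
    })
    .map((c) => c.name);
}
```

Adding a scenario then means adding one row rather than another `it()` block with its own setup and assertions.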

adboio and others added 4 commits June 27, 2026 11:36
The `biome ci` quality check enforces formatting; wrap the guard's throw onto
its own line so the file matches the formatter's output.

Generated-By: PostHog Code
Task-Id: 6fa59fea-01a6-4d72-aadb-2e745e2c5495

The four cases share one shape (items in, GroupSummary fields out), so fold
them into a single it.each table (active / liveLabel / hasCountableWork /
doneLabel as expectation columns), matching the repo's parameterised-test
convention. Per Greptile review feedback.

Generated-By: PostHog Code
Task-Id: 6fa59fea-01a6-4d72-aadb-2e745e2c5495

Expanding a "Worked" chip whose only activity was thinking showed an empty
bordered box, and even with content it showed a redundant collapsed "Thinking"
row needing a second click.

Two causes:
- Blank extended-thinking streams as a text-less agent_thought_chunk, which
  ThoughtView renders as null. But the chip's bordered box draws whenever it has
  children (a hidden child still defeats CSS :empty), so the box stayed empty.
- ThoughtView collapses its content by default, so a non-blank thought inside an
  already-expanded chip showed only a "Thinking" header.

Fixes:
- groupItemRendersContent() mirrors the null-returning branches of
  SessionUpdateView/ThoughtView; ConversationView only feeds renderable items to
  the chip and passes no children when none render, so ToolRow skips the box.
- ToolCallGroupChip gains an `expandable` prop: a group with no renderable body
  is a plain summary line with no caret, instead of a caret opening onto nothing.
- ThoughtView renders its content open by default — thinking has no useful
  one-line summary, so once revealed the reasoning itself is what's worth seeing.

Adds groupItemRendersContent test cases (blank/streaming/text/tool).

Generated-By: PostHog Code
Task-Id: 6fa59fea-01a6-4d72-aadb-2e745e2c5495
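The renderability filter described in the commit above can be sketched roughly as follows. The `GroupItem` shape and the `chipBody` helper are assumptions for illustration; the actual `groupItemRendersContent()` mirrors the null-returning branches of SessionUpdateView/ThoughtView, which are not shown in this PR conversation.

```typescript
// Hypothetical item shape; the repo's real types are richer.
type GroupItem =
  | { kind: "thought"; text: string }
  | { kind: "tool_call"; title: string };

// Assumed rule: a thought with no visible text renders nothing, so it must not
// count as chip content. Tool calls are assumed to always render a row.
function groupItemRendersContent(item: GroupItem): boolean {
  if (item.kind === "thought") return item.text.trim().length > 0;
  return true;
}

interface ChipBody {
  expandable: boolean; // false: plain summary line, no caret
  children: GroupItem[]; // only items that will actually render
}

// Feed only renderable items to the chip; with none, the chip is not expandable
// and gets no children, so no empty bordered box is drawn.
function chipBody(items: GroupItem[]): ChipBody {
  const children = items.filter(groupItemRendersContent);
  return { expandable: children.length > 0, children };
}
```

Deciding renderability before building the children avoids relying on CSS `:empty`, which the commit notes is defeated by a hidden child.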