Skip to content

fix(worktree): bound git ops and always clear the setup spinner#3051

Merged
charlesvien merged 1 commit into
mainfrom
posthog-code/fix-stuck-worktree-setup
Jul 2, 2026
Merged

fix(worktree): bound git ops and always clear the setup spinner#3051
charlesvien merged 1 commit into
mainfrom
posthog-code/fix-stuck-worktree-setup

Conversation

@frankh

@frankh frankh commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Problem

Some tasks get stuck forever on the "Setting up worktree…" spinner with no error. Two independent causes:

  1. Unbounded git ops under the write lock (the real hang). Every new local task runs WorktreeManager.createWorktree({ fetchBeforeCreate: true }), which does git fetch origin <branch> inside GitOperationManager.executeWrite. If that fetch blocks (network stall, a credential helper waiting on stdin, an auth prompt) the operation promise never settles, so the finally { releaseWrite() } never runs — the per-repo write lock is held forever and every subsequent worktree creation for that repo queues behind it. One bad fetch strands many tasks. The raw git worktree add and post-checkout hook spawns likewise had no timeout.

  2. Leaked spinner on failure. TaskCreationSaga calls setProvisioningActive before the worktree step but only calls clearProvisioning on the happy path. The Saga base class catches any thrown step, rolls back, and returns {success:false} without ever reaching the clear — so even once a hang becomes an error, the spinner stays up.

Changes

  • packages/git/src/worktree.ts
    • spawnWorktreeAdd: kill + reject on a 60s timeout.
    • runPostCheckoutHook: kill + resolve a non-fatal warning on a 120s timeout.
    • New fetchRefWithTimeout helper: wraps the fetch in an AbortController (60s) and threads the signal through executeWrite, which aborts the git subprocess and releases the write lock. Used by resolveFreshBaseRef (default new-task path) and createWorktreeForRemoteBranch.
  • packages/core/src/task-detail/taskCreationSaga.ts: wrap the saga body in try/finally so clearProvisioning always runs, tearing down the spinner and letting the existing error toast in useTaskCreation surface. Removes the now-redundant mid-flow clear.

Testing

  • pnpm --filter @posthog/core test taskCreationSaga — 16 pass, incl. two new cases asserting clearProvisioning fires on both the worktree failure and happy paths.
  • pnpm --filter @posthog/git test worktree — 59 pass; fetch paths now route through fetchRefWithTimeout (incl. the existing unreachable-remote fallback and remote-branch fetch cases).
  • @posthog/git typechecks clean.

Created with PostHog Code

@github-actions

github-actions Bot commented Jul 1, 2026

Copy link
Copy Markdown

React Doctor found no issues in the changed files. 🎉

Reviewed by React Doctor for commit 6ba7c6e.

@greptile-apps

greptile-apps Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Comments Outside Diff (2)

  1. packages/core/src/task-detail/taskCreationSaga.test.ts, line 9-59 (link)

    P2 New tests not parameterized

    The two new cases test the same invariant — clearProvisioning is always called — under success and failure conditions. Per the team's preference for parameterised tests, these could be expressed as a single it.each block with [true, false] (or a descriptive table) controlling the mock setup, making the shared guarantee and its two scenarios explicit in one place.

    Context Used: Do not attempt to comment on incorrect alphabetica... (source)

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

  2. packages/git/src/worktree.ts, line 469-480 (link)

    P2 Double-reject possible after SIGKILL timeout. When the timer fires it calls proc.kill("SIGKILL") and then reject(...). The resulting close event always fires asynchronously with code = null; because null !== 0 is true, the handler calls reject(...) a second time. JavaScript ignores the second call silently, but it will produce a misleading "exited with code null" error message in the reject argument that could surface in logs or test output. Adding an early-return guard makes the intent clear.

Reviews (1): Last reviewed commit: "fix(worktree): bound git ops and always ..." | Re-trigger Greptile

frankh added a commit that referenced this pull request Jul 1, 2026
Address Greptile review on #3051:

- spawnWorktreeAdd / runPostCheckoutHook: after the timeout SIGKILLs the
  process, the `close` handler fires with code=null and settled a second
  time, producing a misleading "exited with code null" error that clobbered
  the real timeout message. Add a `settled` guard so the timeout result wins
  and later events are ignored.
- taskCreationSaga.test.ts: fold the two "clears provisioning on
  success/failure" cases into a single it.each, matching the repo's
  parameterised-test convention while keeping the happy-path connect
  assertion.

Generated-By: PostHog Code
Task-Id: f3c8da8b-fc20-47bc-becd-575128b4b70f
#3054 added kill-timeouts to `git worktree add` and the post-checkout hook,
and #3053 restructured the saga to clear the provisioning spinner and keep the
task for retry when provisioning fails. Both landed on main and supersede the
worktree-add/hook-timeout and spinner-clearing parts of this PR, so those are
dropped here.

The one hang path neither covered is the `git fetch` that precedes worktree
creation — the trigger both fixes cited. It still runs unbounded inside the
per-repo write lock, so a blocked fetch (network stall, unreachable remote)
never settles, the lock is never released, and every later worktree creation
for that repo queues behind it forever.

Bound the two fetch call sites (resolveFreshBaseRef and
createWorktreeForRemoteBranch) with a timeout. The fetch runs through
simple-git rather than a raw spawn, so it can't use armProcessTimeout; instead
it aborts via the AbortSignal executeWrite already forwards to its scoped git
client, which kills the fetch subprocess and releases the write lock. On
timeout the trunk path degrades to the local ref and the remote-branch path
fails with a retryable error, rather than hanging.
@frankh frankh force-pushed the posthog-code/fix-stuck-worktree-setup branch from 1f7bfe7 to 6ba7c6e Compare July 1, 2026 13:39
@charlesvien charlesvien merged commit 72496ff into main Jul 2, 2026
23 checks passed
@charlesvien charlesvien deleted the posthog-code/fix-stuck-worktree-setup branch July 2, 2026 01:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants