Gateway: mid-stream socket drops (no SSE ping forwarding) + silent body-param stripping breaks Anthropic betas #3092

@VXNCXNX

PostHog Code gateway — API integrator report: mid-stream socket drops & Anthropic beta passthrough

Date: 2026-07-02
Gateway: https://gateway.us.posthog.com/posthog_code (US region)
Client: Anthropic TypeScript SDK (v0.91.x) over Bun's fetch, streaming SSE from /v1/messages
Models: claude-fable-5 (primary), claude-opus-4-8

All findings below were reproduced today against the live gateway with an OAuth access token from the PostHog Code beta. Repro commands use $POSTHOG_TOKEN as a placeholder.


Finding 1 — Anthropic ping keepalives are not forwarded → mid-stream TCP kills

Symptom

During long silent stretches of a streaming response (typically extended thinking on claude-fable-5 / Opus, where Anthropic emits no content deltas for tens of seconds), the TCP connection is closed by an intermediary. The client surfaces:

Error: The socket connection was closed unexpectedly. For more information,
pass `verbose: true` in the second argument to fetch()

This happens mid-response — after message_start and often after partial thinking/text has streamed — so it can't be handled as a simple pre-request connection retry.

Corroboration: the official PostHog Code app shows the same stall

This is not client-specific. In the official PostHog Code desktop app, on the same account and models, we regularly see turns where nothing arrives for 2–3 minutes: the UI stays in its "working" state with no output and no error, then eventually recovers or produces a fresh answer. That is exactly the signature of this failure absorbed silently — the stream dies (or goes irrecoverably quiet) mid-turn, and the client retries/re-issues the request without surfacing anything to the user. The user experience is a silent multi-minute hang; the API-integrator experience is the raw socket error above. Same gateway path, same root cause, two presentations.

If you can correlate server-side: look for connections on /v1/messages closed upstream-idle after ~60–120s of zero payload bytes during active turns, paired with a same-conversation retry request arriving seconds later.

What we believe is happening

  • api.anthropic.com emits SSE ping events during quiet periods precisely to keep intermediaries from idle-killing the connection.
  • The gateway does not forward these ping events to the client (we have never observed one in captured SSE traffic through the gateway, while they are routine against the official endpoint).
  • With zero bytes flowing, an LB/proxy on the gateway path enforces an idle timeout and resets the connection. The longer the model thinks, the higher the probability of a kill.
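
The second point can be checked against any raw SSE capture. A minimal sketch, assuming the capture is available as text: it tallies `event:` names and comment lines so a missing `ping` is easy to spot. The helper name and the sample capture are illustrative and hand-written, not real gateway traffic.

```typescript
// Tally SSE event names and comment lines in a raw capture.
// Illustrative helper; the sample below is hand-written, not captured traffic.
interface SseTally {
  events: Record<string, number>; // count per `event:` name
  comments: number;               // lines starting with ":" (SSE comments)
}

function tallySse(raw: string): SseTally {
  const tally: SseTally = { events: {}, comments: 0 };
  for (const line of raw.split(/\r?\n/)) {
    if (line.startsWith(":")) {
      tally.comments += 1;
    } else if (line.startsWith("event:")) {
      const name = line.slice("event:".length).trim();
      tally.events[name] = (tally.events[name] ?? 0) + 1;
    }
  }
  return tally;
}

const sampleCapture = [
  "event: message_start",
  'data: {"type":"message_start"}',
  "",
  "event: ping",
  'data: {"type":"ping"}',
  "",
  "event: content_block_delta",
  'data: {"type":"content_block_delta"}',
  "",
].join("\n");

const sampleTally = tallySse(sampleCapture);
console.log("ping events:", sampleTally.events["ping"] ?? 0);
console.log("comment lines:", sampleTally.comments);
```

Running the same tally over a direct capture and a gateway capture of the same long-thinking turn would show whether pings and comments are both absent on the gateway path.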

Impact

  • Long-thinking turns die non-deterministically. Integrators have to raise client-side stream-idle timeouts anyway, because without pings a quiet-but-alive stream is indistinguishable from a dead one. But the longer the client is willing to wait, the more likely the gateway-path idle kill fires first, so the raw socket error surfaces instead of a clean client timeout.
  • The failure text is transport-level and vendor-specific (Bun/undici/node each produce different strings), so generic retry classifiers frequently treat it as fatal rather than transient. Integrators each have to discover and special-case it.
  • Where a client absorbs the failure with a silent retry (as the official app appears to), the user pays twice: minutes of dead air on the UI, and the partial turn's input tokens re-billed on the retry.

Suggested fix

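Forward Anthropic's ping events verbatim, or synthesize an SSE comment (: keepalive\n\n) every ~15–30s of upstream silence. Either keeps the connection warm end-to-end. SSE comments are ignored by spec-compliant parsers, and ping events are already a documented part of Anthropic's stream that its SDKs skip.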
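
A minimal sketch of the comment-injection variant, assuming the gateway can wrap the upstream body as a web `ReadableStream`. The names `withKeepalive` and `trailingNewlines` are hypothetical, and nothing here reflects how the gateway is actually built. The one subtlety is that a comment may only be injected between events: a blank line written into the middle of a partially forwarded event would dispatch it early.

```typescript
// SSE comment frame; spec-compliant parsers ignore lines starting with ":".
const KEEPALIVE_FRAME = new TextEncoder().encode(": keepalive\n\n");

// Count consecutive LF bytes at the end of everything forwarded so far,
// skipping CR so that both "\n\n" and "\r\n\r\n" count as a blank line.
function trailingNewlines(previous: number, chunk: Uint8Array): number {
  let count = previous;
  for (let i = 0; i < chunk.length; i++) {
    if (chunk[i] === 0x0a) count += 1;
    else if (chunk[i] !== 0x0d) count = 0;
  }
  return count;
}

// Hypothetical wrapper: forwards upstream bytes unchanged and injects a
// comment frame once the upstream has been silent for roughly idleMs.
function withKeepalive(
  upstream: ReadableStream<Uint8Array>,
  idleMs: number = 15_000,
): ReadableStream<Uint8Array> {
  const reader = upstream.getReader();
  let lastActivity = Date.now();
  let newlines = 2; // start of stream counts as an event boundary
  let timer: ReturnType<typeof setInterval> | undefined;
  const stop = (): void => {
    if (timer !== undefined) clearInterval(timer);
  };

  return new ReadableStream<Uint8Array>({
    start(controller) {
      // Checked every idleMs / 2, so a frame can lag by up to half an interval.
      timer = setInterval(() => {
        const idle = Date.now() - lastActivity >= idleMs;
        // Only inject between events, never inside a partially sent one.
        if (idle && newlines >= 2) {
          controller.enqueue(KEEPALIVE_FRAME);
          lastActivity = Date.now();
        }
      }, Math.max(1, Math.floor(idleMs / 2)));
    },
    async pull(controller) {
      try {
        const result = await reader.read();
        if (result.done) {
          stop();
          controller.close();
          return;
        }
        newlines = trailingNewlines(newlines, result.value);
        lastActivity = Date.now();
        controller.enqueue(result.value);
      } catch (err) {
        stop();
        controller.error(err);
      }
    },
    cancel(reason) {
      stop();
      return reader.cancel(reason);
    },
  });
}
```

In a fetch-style proxy this would be used as `new Response(withKeepalive(upstreamResponse.body), { headers })`, assuming `upstreamResponse.body` is non-null. Forwarding Anthropic's own ping events needs none of this; the sketch only covers the fallback option.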

Client-side mitigation we applied (works, but shouldn't be needed)

We now wrap the response stream and reclassify mid-stream socket deaths as transient connection errors so our retry layer restarts the turn. That re-bills the partial turn's tokens on every retry — server-side keepalives would eliminate both the failure and the re-billing.
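
The reclassification step can be sketched as follows. This is an illustration, not the wrapper described above: only the Bun message is taken from this report, while the undici and Node patterns are assumptions about what those runtimes commonly emit and should be verified per runtime. The function name is hypothetical.

```typescript
// Message fragments suggesting the transport died mid-stream.
const TRANSIENT_PATTERNS: RegExp[] = [
  /socket connection was closed unexpectedly/i, // Bun fetch (from this report)
  /\bterminated\b/i,                            // undici (assumed)
  /other side closed/i,                         // undici (assumed)
  /ECONNRESET|socket hang up/i,                 // Node (assumed)
];

function isTransientStreamError(err: unknown): boolean {
  // Walk the `cause` chain, since fetch errors often wrap the socket error.
  let current: unknown = err;
  for (let depth = 0; depth < 5 && current != null; depth++) {
    const e = current as { message?: unknown; code?: unknown; cause?: unknown };
    const text = `${String(e.code ?? "")} ${String(e.message ?? current)}`;
    if (TRANSIENT_PATTERNS.some((pattern) => pattern.test(text))) return true;
    current = e.cause;
  }
  return false;
}

const bunError = new Error(
  "The socket connection was closed unexpectedly. For more information, " +
    "pass `verbose: true` in the second argument to fetch()",
);
console.log(isTransientStreamError(bunError));
```

A retry layer would call this only for errors raised after the stream has started, and restart the turn when it returns true. Every such restart re-sends the full request.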


Finding 2 — Request body is schema-filtered: unknown params silently dropped → Anthropic betas unusable

Symptom

The gateway strips top-level request-body parameters it doesn't recognize before forwarding to Anthropic, instead of passing them through or rejecting them. This silently disables Anthropic beta features that are negotiated via new body params plus an anthropic-beta header.

Concrete case: server-side fallback (server-side-fallback-2026-06-01, refusals-and-fallback docs) — the feature that retries a Fable-5 classifier refusal on another model inside one API call.

Evidence (all reproduced 2026-07-02)

The key probe: a fallbacks chain that is guaranteed to 400 on api.anthropic.com (fallback model identical to the primary — the API requires distinct entries) returns 200 through the gateway. The only explanation is that fallbacks never reaches Anthropic.

| # | Request | Expected (direct Anthropic) | Observed via gateway |
|---|---------|-----------------------------|----------------------|
| 1 | fallbacks: [{model: <same as primary>}] + anthropic-beta: server-side-fallback-2026-06-01 | 400 "must be distinct" | 200, normal completion |
| 2 | unknown top-level param frobnicate: true, no beta header | 400 "Unexpected value(s)" | 200, normal completion |
| 3 | fallbacks: "bogus" (wrong type) + beta header | 400 type error | 200, normal completion |
| 4 | same as #2 but without ?beta=true query param | 400 | 200 (both gateway paths filter) |

Repro for probe 1:

curl -sS "https://gateway.us.posthog.com/posthog_code/v1/messages?beta=true" \
  -H "x-api-key: $POSTHOG_TOKEN" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: server-side-fallback-2026-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-fable-5",
    "max_tokens": 16,
    "fallbacks": [{"model": "claude-fable-5"}],
    "messages": [{"role": "user", "content": "Say ok"}]
  }'
# → 200 with a normal assistant message. Direct to api.anthropic.com this is a hard 400.

Note the anthropic-beta header appears to pass through fine (no error, no behavior change) — it's specifically the body param that's filtered.
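
The inference behind the probes can be written down explicitly. A minimal sketch with hypothetical names (`Probe`, `interpretProbe`, `buildProbeBody`): each probe adds a param that the table above lists as a 400 on the direct API, so a 200 through the gateway means the param was removed before it reached Anthropic. The sketch makes no network calls; the observed status is passed in.

```typescript
// Hypothetical probe descriptor; field names are illustrative.
interface Probe {
  label: string;
  extraBody: Record<string, unknown>; // params added on top of a minimal request
  expectedDirectStatus: number;       // direct-API status, per the table above
}

type Verdict = "rejected" | "silently-stripped" | "inconclusive";

// A param that must trigger a 400 upstream cannot have arrived if we got 200.
function interpretProbe(probe: Probe, observedStatus: number): Verdict {
  if (probe.expectedDirectStatus !== 400) return "inconclusive";
  if (observedStatus === 400) return "rejected"; // by upstream or by the gateway
  if (observedStatus === 200) return "silently-stripped";
  return "inconclusive"; // e.g. 401, 429, or 5xx say nothing about filtering
}

function buildProbeBody(model: string, probe: Probe): Record<string, unknown> {
  return {
    model,
    max_tokens: 16,
    messages: [{ role: "user", content: "Say ok" }],
    ...probe.extraBody,
  };
}

const probes: Probe[] = [
  {
    label: "duplicate fallback",
    extraBody: { fallbacks: [{ model: "claude-fable-5" }] },
    expectedDirectStatus: 400,
  },
  { label: "unknown param", extraBody: { frobnicate: true }, expectedDirectStatus: 400 },
  { label: "wrong type", extraBody: { fallbacks: "bogus" }, expectedDirectStatus: 400 },
];

for (const probe of probes) {
  // 200 is what the gateway returned for every probe in this report.
  console.log(probe.label, "->", interpretProbe(probe, 200));
}
```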

Impact

  • Server-side fallback cannot be used through the gateway at all. Given Fable-5's classifier refusals are exactly the problem your own PR PostHog/code#3078 addresses client-side (SDK fallbackModel), forwarding fallbacks would give every API integrator the same protection in one round trip, with Anthropic's fallback-credit billing (a pre-output refusal attempt costs nothing) instead of a full client-side re-send.
  • More generally: any future Anthropic beta that introduces body params will be silently broken through the gateway until each param is individually whitelisted. Silent dropping is the worst failure mode — a 400 would at least tell integrators the feature is unsupported.

Suggested fix

Whitelist fallbacks (and ideally adopt pass-through-with-denylist rather than parse-and-rebuild-with-allowlist for /v1/messages bodies). If filtering must stay, return a 4xx or a warning header when a param is dropped.
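
A minimal sketch of the pass-through-with-denylist shape, assuming the gateway handles the request body as a parsed JSON object. The denylist entry and the `x-gateway-dropped-params` header name are hypothetical placeholders, not anything the gateway implements today.

```typescript
// Params the gateway wants to control itself. Illustrative only; this is not
// PostHog's real list.
const DENYLIST: ReadonlySet<string> = new Set(["metadata"]);

// Hypothetical response header for reporting dropped params.
const DROPPED_HEADER = "x-gateway-dropped-params";

interface ForwardPlan {
  body: Record<string, unknown>; // what gets forwarded upstream
  dropped: string[];             // top-level keys removed by the gateway
}

// Forward everything except denylisted keys, and remember what was removed.
function planForward(
  body: Record<string, unknown>,
  denylist: ReadonlySet<string> = DENYLIST,
): ForwardPlan {
  const forwarded: Record<string, unknown> = {};
  const dropped: string[] = [];
  for (const [key, value] of Object.entries(body)) {
    if (denylist.has(key)) dropped.push(key);
    else forwarded[key] = value;
  }
  return { body: forwarded, dropped };
}

// Surface dropped params to the caller instead of failing silently.
function droppedHeaders(dropped: string[]): Record<string, string> {
  return dropped.length > 0 ? { [DROPPED_HEADER]: dropped.join(",") } : {};
}

const plan = planForward({
  model: "claude-fable-5",
  fallbacks: [{ model: "claude-opus-4-8" }],
  metadata: { user_id: "abc" },
});
console.log(Object.keys(plan.body), droppedHeaders(plan.dropped));
```

With this shape a new beta param such as fallbacks reaches Anthropic without a gateway change, and anything the gateway does remove is visible to the integrator.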


Finding 3 (related, previously observed) — context_management is half-parsed → upstream 400

When a request contains Anthropic's context_management body param (e.g. {"edits":[{"type":"clear_thinking_20251015"}]}) alongside thinking: {"type":"adaptive"}, the gateway forwards context_management but drops the paired thinking field. Anthropic then rejects the request:

400 ... `clear_thinking_20251015` strategy requires thinking to be enabled or adaptive

So the body filtering is not only dropping unknown params (Finding 2) — for at least one known param it forwards the param while dropping a field it depends on. We currently strip context_management client-side to avoid the 400. Consistent pass-through would fix this too.
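
The client-side workaround amounts to removing one key before the request is sent. A minimal sketch with a hypothetical helper name, operating on the request params as a plain object:

```typescript
// Return a copy of the request params without context_management,
// leaving the caller's object untouched.
function stripContextManagement(
  params: Record<string, unknown>,
): Record<string, unknown> {
  const copy: Record<string, unknown> = { ...params };
  delete copy["context_management"];
  return copy;
}

const originalParams: Record<string, unknown> = {
  model: "claude-fable-5",
  max_tokens: 1024,
  thinking: { type: "adaptive" },
  context_management: { edits: [{ type: "clear_thinking_20251015" }] },
  messages: [{ role: "user", content: "Say ok" }],
};

const safeParams = stripContextManagement(originalParams);
console.log(Object.keys(safeParams));
```

The cost is that the thinking-clearing strategy is simply unavailable through the gateway until the paired fields are forwarded together.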

Also worth noting: unsigned thinking blocks

Thinking blocks returned through the gateway carry an empty signature (signature: ""), while the official endpoint returns signed blocks. Replaying such a block to Anthropic on the next turn produces a 400, so integrators must strip or downgrade thinking blocks in multi-turn conversations. If the gateway is re-serializing responses, preserving the original signature bytes would restore standard multi-turn replay behavior.
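
Stripping can be sketched as a filter over the message history before the next turn is sent. The types below are a simplified assumption of the Messages API content-block shape, and the function name is hypothetical.

```typescript
// Simplified content-block and message shapes (assumed, not the SDK's types).
type ContentBlock = { type: string; signature?: string; [key: string]: unknown };

interface ChatMessage {
  role: "user" | "assistant";
  content: string | ContentBlock[];
}

// Drop thinking blocks whose signature is missing or empty, since replaying
// them is what triggers the 400 described above.
function dropUnsignedThinking(messages: ChatMessage[]): ChatMessage[] {
  return messages.map((message) => {
    if (typeof message.content === "string") return message;
    const content = message.content.filter(
      (block) => !(block.type === "thinking" && !block.signature),
    );
    // Note: a message left with no blocks would need separate handling.
    return { ...message, content };
  });
}

const history: ChatMessage[] = [
  { role: "user", content: "Say ok" },
  {
    role: "assistant",
    content: [
      { type: "thinking", thinking: "...", signature: "" },
      { type: "text", text: "ok" },
    ],
  },
];

const replayable = dropUnsignedThinking(history);
console.log(JSON.stringify(replayable[1]));
```

Signed blocks pass through untouched, so the same filter stays harmless if the gateway starts preserving signatures.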


Summary of asks, in priority order

  1. Keepalives: forward Anthropic ping SSE events (or inject SSE comments) so idle LB timeouts stop killing long thinking turns mid-stream.
  2. fallbacks passthrough: whitelist the fallbacks body param + server-side-fallback-2026-06-01 beta so classifier refusals can fall back server-side (the API-integrator counterpart of PostHog/code#3078, "feat(agent): add refusal handling and model fallback").
  3. Body filtering policy: prefer pass-through; if filtering stays, fail loudly instead of silently dropping params, and forward context_management together with its paired thinking field.
  4. Thinking signatures: preserve signature on thinking blocks for standard multi-turn replay.