feat(agent-applications): model policy config UI + session model by benjackwhite · Pull Request #2900 · PostHog/code

benjackwhite · 2026-06-24T13:53:55Z

Problem

The agent config pane couldn't show or edit the new spec.models (the auto/manual model picker + optimize_for) — the model node still rendered the legacy spec.model string, so new specs showed "not set". There was also no way to see which model a session actually used, or to browse the served models and their costs.

Changes

Surface and edit spec.models in the agent config pane, save it back to a draft, and show the model a session used.

- Types (agent-platform-types): add AgentModelPolicy (auto level / manual models[]), AgentModelOptimizeFor, and ModelCatalog; spec.model becomes optional (legacy fallback).
- Model section (AgentModelConfig): an interactive policy editor with mode / optimize for / level / reasoning as dropdowns with per-option descriptions + icons; an "auto level resolves to" preview with live pricing; and a searchable model browser of every served model with its cost profile (input/output/cache $/Mtok, context window, provider; sort by name / cheapest / priciest). Manual mode supports add-from-browser and drag-to-reorder (@dnd-kit).
  - optimize for (cost | availability): the session-stability control mirroring spec.models.optimize_for. Cost (default) pins the first working model for the whole session (warm prompt cache, no mid-session failover); Availability fails over to the next model if the session's model goes down (survives outages, re-reads context cold). Applies to both auto and manual.
- Save (useApplyAgentSpec): a create-draft-and-apply path that patches the spec on a draft in place, or branches a fresh draft from a non-draft revision first (and archives the orphan best-effort if the patch fails). Dirty banner + reset; "Save to new draft" when the source isn't a draft.
- Catalog (useModelCatalog): fetches the real catalog (GET …/agent_applications/models/ → janitor → gateway), with a known-levels fallback while loading.
- Session detail (AgentSessionDetailBody): a Model KPI showing the model(s) that answered (one, or primary +N on fallback).

Default policy is auto / medium / cost.
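To make the shapes above concrete, here is a minimal sketch of how the policy types and the legacy fallback might fit together. Only the type names, `spec.models`, `optimize_for`, the optional `spec.model`, and the auto / medium / cost default come from this description; the `mode` discriminator, the `"low"` level, the `ManualModelEntry` shape, and the `describeModel` helper are assumptions made for illustration.

```typescript
// Hypothetical shapes: the real definitions live in agent-platform-types.
type AgentModelOptimizeFor = "cost" | "availability";
type AgentModelLevel = "low" | "medium" | "high"; // "low" is assumed

interface ManualModelEntry {
  model: string; // assumed field name for a manual entry
}

type AgentModelPolicy =
  | { mode: "auto"; level: AgentModelLevel; optimize_for?: AgentModelOptimizeFor }
  | { mode: "manual"; models: ManualModelEntry[]; optimize_for?: AgentModelOptimizeFor };

interface AgentSpecSketch {
  model?: string; // legacy string, now optional
  models?: AgentModelPolicy; // new policy object
}

// Default stated in the PR description: auto / medium / cost.
const DEFAULT_POLICY: AgentModelPolicy = {
  mode: "auto",
  level: "medium",
  optimize_for: "cost",
};

function describePolicy(policy: AgentModelPolicy): string {
  const optimizeFor = policy.optimize_for ?? "cost";
  return policy.mode === "auto"
    ? `auto / ${policy.level} / ${optimizeFor}`
    : `manual (${policy.models.length} models) / ${optimizeFor}`;
}

// Prefer the new policy, then the legacy string, then the default.
// The precedence order here is an assumption, not confirmed by the PR.
function describeModel(spec: AgentSpecSketch): string {
  if (spec.models) return describePolicy(spec.models);
  if (spec.model) return `legacy: ${spec.model}`;
  return describePolicy(DEFAULT_POLICY);
}
```

Modelling the policy as a discriminated union keeps `level` and `models` from being set at the same time, which is one way to encode the auto / manual split in the type system.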

How did you test this?

- pnpm --filter @posthog/ui --filter @posthog/shared typecheck: clean.
- pnpm biome check on changed files: clean.
- useApplyAgentSpec unit tests (PATCH-in-place, clone-then-PATCH, orphaned-draft archive-on-failure) pass.
- Drove the running dev app over CDP and screenshotted the model section (auto + manual), the dropdowns, and the manual drag-to-reorder list against a live agent. (Screenshot above predates the optimize for row; will refresh.)

Automatic notifications

Publish to changelog?
Alert Sales and Marketing teams?

Surface and edit `spec.model_policy` (the auto/manual model picker) in the agent config pane, and show which model a session used.

- agent-platform-types: add AgentModelPolicy (auto level / manual list) + ModelCatalog types; make spec.model optional (legacy).
- Model section: interactive policy editor (mode/level/reasoning dropdowns with descriptions + icons), an "auto level resolves to" preview with live pricing, and a searchable model browser (all served models + cost profiles). Manual mode supports add-from-browser and drag-to-reorder. Editing is local-state only; no save is wired yet.
- useModelCatalog: catalog stand-in (snapshot of the gateway /v1/models), the single swap point for a real model-info endpoint.
- Session detail: add a "Model" KPI showing the model(s) that answered.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions · 2026-06-24T13:54:30Z

React Doctor found 1 issue in 1 file · 1 warning.

⚠️ src/features/agent-applications/components/AgentModelConfig.tsx:50: Many related useState calls (prefer-useReducer)

Reviewed by React Doctor for commit f012d09.

greptile-apps · 2026-06-24T13:57:45Z

Reviews (1): Last reviewed commit: "feat(agent-applications): model policy c..."

greptile-apps · 2026-06-24T13:57:49Z

+  },
+  {
+    value: "high",
+    title: "High",
+    description: "Top-tier — long, branching, reasoning-heavy work.",
+  },
+] as const;
+
+const REASONING_OPTIONS = [
+  {
+    value: "default",
+    title: "Default",


Stale local state when revision changes

All four useState calls are initialized from spec.model_policy (and spec.reasoning), but useState initializers run only on mount. DetailBody has no key prop tied to the revision, so when the user switches to a different revision in AgentRevisionBar, spec changes but mode, level, reasoning, and manual stay at the previous revision's values. The dirty banner then compares stale state against the new revision's policy, so it will typically show "preview — not saved yet" for a revision the user has never touched.

The cleanest fix is to add key={revisionId} on <ModelBody> (or <AgentModelConfig>) in DetailBody, forcing a remount whenever the revision changes.

Make the model picker functional: save edits and read the real catalog.

- api-client: updateAgentRevisionSpec (draft-only PATCH) + getAgentModelCatalog.
- useApplyAgentSpec: a "create draft and apply changes" path that PATCHes a draft in place, or else clones the revision to a fresh draft, applies, and selects it. Freeze/promote stay on the existing lifecycle buttons.
- AgentModelConfig: Save / Reset bar (auto-branches a draft on non-draft revisions); threaded through the config pane.
- useModelCatalog: read GET …/agent_applications/models/ (drops the stand-in snapshot; small levels fallback while loading).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…catalog

- createAgentDraftRevisionFrom: new_draft returns `{ revision, source_revision_id }`, not a flat revision. Returning the wrapper left `.id` undefined and 404'd the follow-up spec PATCH (also affected "Clone to draft"). Unwrap it.
- tests: api-client (unwrap regression, updateAgentRevisionSpec PATCH, getAgentModelCatalog GET) + useApplyAgentSpec (draft-in-place vs clone+patch, missing-appId guard).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… entrypoint

Mirror the backend spec change in the agent-builder UI + wire shapes.

- Rename the `model_policy` field → `models` (AgentSpec + accessors + tests). The `AgentModelPolicy` type keeps its name.
- Drop `spec.entrypoint` (removed from the spec) and its config-pane row.
- Manual mode now seeds its model list from the level you were on when you switch from auto, so you start from auto's choices and edit rather than a blank slate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

greptile-apps · 2026-06-24T16:22:18Z

Reviews (2): Last reviewed commit: "refactor(agent-applications): rename spe..."

…y check, set lookup

- fmtUsd: fixed-precision formatting (min 2 / max 4 fraction digits) so the cost column reads consistently and survives float noise from the catalog API.
- dirty check: canonicalise via a recursive key-sorting serializer so the "not saved yet" banner doesn't fire when the server serialises spec.models with a different key order than the locally-built policy. (Greptile's replacer-array suggestion would have dropped nested manual-entry keys.)
- session metrics: dedupe distinct models with a Set instead of Array.includes inside the loop.

P1 stale-state-on-revision-change was already handled by `key={ctx.revisionId}` on ModelBody (greptile reviewed a pre-key commit).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
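As an illustration of the dirty check and cost formatting described in this commit, the following is a minimal sketch rather than the code that landed. The function names `canonicalise` and `isDirty`, the `en-US` locale, and the `model` key on manual entries are assumptions; `fmtUsd` is named in the commit but its body here is a guess at the stated behaviour (min 2 / max 4 fraction digits).

```typescript
// Recursively rebuild objects with sorted keys so that two structurally
// equal values serialise identically regardless of key insertion order.
function canonicalise(value: unknown): unknown {
  if (Array.isArray(value)) {
    // Array order is meaningful (manual model priority), so keep it.
    return value.map(canonicalise);
  }
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const key of Object.keys(source).sort()) {
      sorted[key] = canonicalise(source[key]);
    }
    return sorted;
  }
  return value;
}

// Hypothetical dirty check: compare canonical serialisations.
function isDirty(local: unknown, saved: unknown): boolean {
  return JSON.stringify(canonicalise(local)) !== JSON.stringify(canonicalise(saved));
}

// Why a replacer array is not enough: it acts as a key allowlist at every
// depth, so keys of nested objects that are not listed get dropped.
const viaReplacerArray = JSON.stringify(
  { mode: "manual", models: [{ model: "a" }] },
  ["mode", "models"],
);

// Assumed implementation of the fixed-precision formatter.
const usdFormatter = new Intl.NumberFormat("en-US", {
  style: "currency",
  currency: "USD",
  minimumFractionDigits: 2,
  maximumFractionDigits: 4,
});

function fmtUsd(amount: number): string {
  return usdFormatter.format(amount);
}
```

With this sketch, `isDirty({ mode: "auto", level: "medium" }, { level: "medium", mode: "auto" })` is `false`, while reordering the entries of a manual list still counts as a change. `viaReplacerArray` evaluates to `{"mode":"manual","models":[{}]}`, which shows the nested-key loss the commit message refers to.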

… after clone

useApplyAgentSpec clones a fresh draft when the target isn't already a draft, then PATCHes the spec onto it. If the PATCH failed, that draft was left orphaned (a copy of the source with no edit landed), and repeated failed applies piled up empty drafts. Now the just-cloned draft is archived best-effort on PATCH failure; the original error is always rethrown, and a pre-existing draft passed in by the caller is never touched.

Addresses greptile review on #2900.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
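The control flow described in this commit can be sketched as follows. This is an illustration under stated assumptions, not the hook itself: the `RevisionApi` interface, its method names, the `isDraft` flag, and the `applySpec` function are all hypothetical stand-ins for the real api-client calls.

```typescript
// Hypothetical revision shape and API surface, for illustration only.
interface Revision {
  id: string;
  isDraft: boolean;
}

interface RevisionApi {
  cloneToDraft(sourceRevisionId: string): Promise<Revision>;
  patchSpec(revisionId: string, spec: unknown): Promise<Revision>;
  archive(revisionId: string): Promise<void>;
}

async function applySpec(api: RevisionApi, target: Revision, spec: unknown): Promise<Revision> {
  // A draft passed in by the caller is patched in place and never archived.
  if (target.isDraft) {
    return api.patchSpec(target.id, spec);
  }

  // Otherwise branch a fresh draft from the non-draft revision first.
  const draft = await api.cloneToDraft(target.id);
  try {
    return await api.patchSpec(draft.id, spec);
  } catch (patchError) {
    // Best-effort cleanup of the draft this call created.
    try {
      await api.archive(draft.id);
    } catch (_archiveError) {
      // Ignored on purpose: the PATCH failure is the error worth surfacing.
    }
    throw patchError;
  }
}
```

Swallowing the archive failure keeps the original PATCH error visible to the caller, which matches the "original error is always rethrown" behaviour the commit describes.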

…l in model config

Surfaces spec.models.optimize_for in the model editor as a dropdown alongside mode/level/reasoning (applies to both auto and manual):

- Cost (default): pin the first working model for the session (warm cache, no mid-session failover).
- Availability: fail over to the next model if the session's model goes down.

Wired into the policy object, dirty check, reset, and the create-draft-and-apply save path. Adds AgentModelOptimizeFor to the shared spec types.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…odel KPI

The Model `MetricItem` always emitted `text-[14px]` and additionally `text-[12.5px]` when `mono` was true. Tailwind JIT generates both rules and the resolved font-size depends on stylesheet emission order, not className order, which is fragile across builds. Make the size class conditional so only one is ever present.

Also drop a stale "no save wired yet" comment on AgentModelConfig; the save path landed in the same series.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
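A small sketch of the conditional size class, assuming a helper that builds the class string. The helper name `metricValueClass` and the `font-mono` class are hypothetical; only the two size classes and the `mono` flag come from the commit message.

```typescript
// Emit exactly one font-size utility so the result never depends on the
// order in which Tailwind writes the two rules into the stylesheet.
function metricValueClass(mono: boolean): string {
  const sizeClass = mono ? "text-[12.5px]" : "text-[14px]";
  // "font-mono" is an assumed companion class for the mono variant.
  return mono ? `font-mono ${sizeClass}` : sizeClass;
}
```

Because both utilities set the same `font-size` property with equal specificity, the one emitted later in the CSS wins when both are present; selecting a single class sidesteps that.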

dmarticus

Verified the spec.model / spec.entrypoint loosening — no other consumers in packages/ or apps/, so safe. Pushed a fix for the Tailwind size collision on the mono Model KPI (e341704). LGTM otherwise — three non-blocking nits inline.

- ModelBrowser sort: rank cheapest/priciest by blended input+output cost, not input alone. Input-only mis-ranks reasoning models (cheap input, dominant output), exactly the ones cost-conscious authors compare.
- useApplyAgentSpec: derive the revision-prefix invalidation key from the shared agentApplicationsKeys factory (new revisionPrefix) instead of a hand-rolled array, so it can't silently drift from revision().
- test: capture and exercise onSuccess, asserting it invalidates the detail/revisions/revision-prefix keys via the factory. This closes the gap where mocking useMutation left invalidation keys untested.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
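To show why the blended ranking matters, here is a minimal sketch. The field names `inputPerMtok` and `outputPerMtok`, the `sortByCost` helper, and the choice of a plain sum as the blend are assumptions; the commit only states that input and output cost are combined.

```typescript
// Hypothetical cost profile; prices are in $ per million tokens.
interface ModelCost {
  id: string;
  inputPerMtok: number;
  outputPerMtok: number;
}

// Assumed blend: a plain sum of input and output price.
function blendedCost(model: ModelCost): number {
  return model.inputPerMtok + model.outputPerMtok;
}

function sortByCost(models: readonly ModelCost[], order: "cheapest" | "priciest"): ModelCost[] {
  const sign = order === "cheapest" ? 1 : -1;
  // Copy before sorting so the caller's array is left untouched;
  // ties fall back to the id for a stable, readable order.
  return [...models].sort(
    (a, b) => sign * (blendedCost(a) - blendedCost(b)) || a.id.localeCompare(b.id),
  );
}

// Made-up example data: the reasoning model has the cheaper input price
// but a much higher output price.
const exampleModels: ModelCost[] = [
  { id: "reasoner", inputPerMtok: 1, outputPerMtok: 20 },
  { id: "chat", inputPerMtok: 2, outputPerMtok: 4 },
];
```

Sorting `exampleModels` by input price alone would put `reasoner` first, whereas `sortByCost(exampleModels, "cheapest")` puts `chat` first because its blended cost is 6 against 21.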

greptile-apps Bot reviewed Jun 24, 2026


benjackwhite and others added 3 commits June 24, 2026 16:40

benjackwhite marked this pull request as ready for review June 24, 2026 16:18

greptile-apps Bot reviewed Jun 24, 2026


Comment thread packages/ui/src/features/agent-applications/hooks/useApplyAgentSpec.ts

benjackwhite requested a review from a team June 25, 2026 06:33

benjackwhite and others added 4 commits June 25, 2026 08:51

Merge remote-tracking branch 'origin/main' into agent-model-policy-ui

e891524

dmarticus approved these changes Jun 25, 2026


Comment thread packages/ui/src/features/agent-applications/components/AgentModelConfig.tsx Outdated

Comment thread packages/ui/src/features/agent-applications/components/AgentModelConfig.tsx

Comment thread packages/ui/src/features/agent-applications/hooks/useApplyAgentSpec.test.ts Outdated

benjackwhite and others added 2 commits June 26, 2026 10:56

Merge remote-tracking branch 'origin/main' into agent-model-policy-ui

316abd1

benjackwhite merged commit a685c8b into main Jun 26, 2026
23 checks passed

benjackwhite deleted the agent-model-policy-ui branch June 26, 2026 10:08

benjackwhite merged 11 commits into main from agent-model-policy-ui
