feat(agent-applications): model policy config UI + session model #2900

Merged
benjackwhite merged 11 commits into main from agent-model-policy-ui
Jun 26, 2026

benjackwhite commented Jun 24, 2026

Problem

The agent config pane couldn't show or edit the new spec.models (the auto/manual model picker + optimize_for) — the model node still rendered the legacy spec.model string, so new specs showed "not set". There was also no way to see which model a session actually used, or to browse the served models and their costs.

Changes

[Screenshot: 2026-06-24 18 03 45]

Surface and edit spec.models in the agent config pane, save it back to a draft, and show the model a session used.

  • Types (agent-platform-types): add AgentModelPolicy (auto level / manual models[]), AgentModelOptimizeFor, and ModelCatalog; spec.model becomes optional (legacy fallback).
  • Model section (AgentModelConfig): an interactive policy editor — mode / optimize for / level / reasoning as dropdowns with per-option descriptions + icons; an "auto level resolves to" preview with live pricing; and a searchable model browser of every served model with its cost profile (input/output/cache $/Mtok, context window, provider; sort by name / cheapest / priciest). Manual mode supports add-from-browser and drag-to-reorder (@dnd-kit).
    • optimize for (cost | availability) — the session-stability control mirroring spec.models.optimize_for. Cost (default) pins the first working model for the whole session (warm prompt cache, no mid-session failover); Availability fails over to the next model if the session's model goes down (survives outages, re-reads context cold). Applies to both auto and manual.
  • Save (useApplyAgentSpec): a create-draft-and-apply path — patches the spec on a draft in place, or branches a fresh draft from a non-draft revision first (and archives the orphan best-effort if the patch fails). Dirty banner + reset; "Save to new draft" when the source isn't a draft.
  • Catalog (useModelCatalog): fetches the real catalog (GET …/agent_applications/models/ → janitor → gateway), with a known-levels fallback while loading.
  • Session detail (AgentSessionDetailBody): a Model KPI showing the model(s) that answered (one, or primary +N on fallback).

Default policy is auto / medium / cost.
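
As a rough illustration of the type changes listed above, the policy could be modelled as a discriminated union. This is a sketch only: the `mode` discriminant, the `"low"` level, the `ManualModelEntry` shape, and the `resolvePolicy` helper (including how a legacy `spec.model` is mapped) are assumptions for illustration, not the definitions in agent-platform-types.

```typescript
// Hypothetical shapes inferred from the PR description; the real
// definitions live in agent-platform-types and may differ.
type AgentModelOptimizeFor = "cost" | "availability";
type AgentModelLevel = "low" | "medium" | "high"; // "low" is assumed

interface ManualModelEntry {
  model: string; // assumed field name for the model id
}

type AgentModelPolicy =
  | { mode: "auto"; level: AgentModelLevel; optimize_for?: AgentModelOptimizeFor }
  | { mode: "manual"; models: ManualModelEntry[]; optimize_for?: AgentModelOptimizeFor };

interface AgentSpec {
  model?: string; // legacy string, now optional
  models?: AgentModelPolicy;
}

// The default stated in the PR description: auto / medium / cost.
const DEFAULT_MODEL_POLICY: AgentModelPolicy = {
  mode: "auto",
  level: "medium",
  optimize_for: "cost",
};

// Assumed fallback order: new policy, then legacy string, then the default.
function resolvePolicy(spec: AgentSpec): AgentModelPolicy {
  if (spec.models) return spec.models;
  if (spec.model) {
    return { mode: "manual", models: [{ model: spec.model }], optimize_for: "cost" };
  }
  return DEFAULT_MODEL_POLICY;
}
```

With shapes like these, a spec that carries neither field resolves to the default policy, and a legacy-only spec still renders something other than "not set".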

How did you test this?

  • pnpm --filter @posthog/ui --filter @posthog/shared typecheck — clean.
  • pnpm biome check on changed files — clean.
  • useApplyAgentSpec unit tests (PATCH-in-place, clone-then-PATCH, orphaned-draft archive-on-failure) pass.
  • Drove the running dev app over CDP and screenshotted the model section (auto + manual), the dropdowns, and the manual drag-to-reorder list against a live agent. (Screenshot above predates the optimize for row — will refresh.)

Automatic notifications

  • Publish to changelog?
  • Alert Sales and Marketing teams?

Surface and edit `spec.model_policy` (the auto/manual model picker) in the
agent config pane, and show which model a session used.

- agent-platform-types: add AgentModelPolicy (auto level / manual list) +
  ModelCatalog types; make spec.model optional (legacy).
- Model section: interactive policy editor (mode/level/reasoning dropdowns
  with descriptions + icons), an "auto level resolves to" preview with live
  pricing, and a searchable model browser (all served models + cost profiles).
  Manual mode supports add-from-browser and drag-to-reorder. Editing is
  local-state only — no save wired yet.
- useModelCatalog: catalog stand-in (snapshot of the gateway /v1/models), the
  single swap point for a real model-info endpoint.
- Session detail: add a "Model" KPI showing the model(s) that answered.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
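
The "Model" KPI described above (one model, or primary +N on fallback) could be derived roughly as follows. The helper name, the per-turn input shape, and the empty-state label are hypothetical; only the "primary +N" format comes from the PR description.

```typescript
// Hypothetical helper: `modelsByTurn` holds the model id that answered each turn.
function modelKpiLabel(modelsByTurn: string[]): string {
  // A Set keeps first-seen order, so the first entry is the primary model.
  const distinct = Array.from(new Set(modelsByTurn));
  const primary = distinct[0];
  if (primary === undefined) return "n/a"; // assumed empty-state label
  const extra = distinct.length - 1;
  return extra === 0 ? primary : `${primary} +${extra}`;
}
```

For example, a session answered by `a`, `b`, `a`, `c` would be labelled `a +2`.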
github-actions Bot commented Jun 24, 2026

React Doctor found 1 issue in 1 file · 1 warning.

1 warning

src/features/agent-applications/components/AgentModelConfig.tsx

Reviewed by React Doctor for commit f012d09.

greptile-apps Bot commented Jun 24, 2026

Reviews (1): Last reviewed commit: "feat(agent-applications): model policy c..." | Re-trigger Greptile

Comment on lines +152 to +163
  },
  {
    value: "high",
    title: "High",
    description: "Top-tier — long, branching, reasoning-heavy work.",
  },
] as const;

const REASONING_OPTIONS = [
  {
    value: "default",
    title: "Default",

P1 Stale local state when revision changes

All four useState calls are initialized from spec.model_policy (and spec.reasoning), but useState initializers run only on mount. DetailBody has no key prop tied to the revision, so when the user switches to a different revision in AgentRevisionBar, spec changes but mode, level, reasoning, and manual stay at the previous revision's values. The dirty banner then compares stale state against the new revision's policy, so it will typically show "preview — not saved yet" for a revision the user has never touched.

The cleanest fix is to add key={revisionId} on <ModelBody> (or <AgentModelConfig>) in DetailBody, forcing a remount whenever the revision changes.
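
The mechanism behind this comment can be modelled without React. The sketch below is a framework-free stand-in, not the component code: `EditorInstance` plays the role of a mounted component whose constructor acts like a `useState` initializer, and `Host` plays the role of the parent that decides whether to reuse or remount it. All names are hypothetical.

```typescript
// Stand-in for a mounted component: initial state is captured once.
class EditorInstance {
  level: string;
  constructor(initialLevel: string) {
    this.level = initialLevel; // analogous to a useState initializer
  }
}

// Stand-in for the parent: reuses the instance unless the key changes.
class Host {
  private instance?: EditorInstance;
  private lastKey?: string;

  render(specLevel: string, key?: string): EditorInstance {
    const keyChanged = key !== undefined && key !== this.lastKey;
    if (!this.instance || keyChanged) {
      this.instance = new EditorInstance(specLevel); // (re)mount
    }
    this.lastKey = key;
    return this.instance;
  }
}

// Without a key, the second render keeps the first revision's state.
const unkeyed = new Host();
unkeyed.render("medium");
const stale = unkeyed.render("high").level;

// With a per-revision key, the instance is rebuilt from the new spec.
const keyed = new Host();
keyed.render("medium", "rev-1");
const fresh = keyed.render("high", "rev-2").level;
```

Here `stale` stays at the first revision's level while `fresh` follows the new one, which is the difference a revision-bound `key` makes.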

benjackwhite and others added 3 commits June 24, 2026 16:40
Make the model picker functional: save edits and read the real catalog.

- api-client: updateAgentRevisionSpec (draft-only PATCH) + getAgentModelCatalog.
- useApplyAgentSpec: "create draft and apply changes" — PATCH a draft in place,
  else clone the revision to a fresh draft, apply, and select it. Freeze/promote
  stay on the existing lifecycle buttons.
- AgentModelConfig: Save / Reset bar (auto-branches a draft on non-draft
  revisions); threaded through the config pane.
- useModelCatalog: read GET …/agent_applications/models/ (drops the stand-in
  snapshot; small levels fallback while loading).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…catalog

- createAgentDraftRevisionFrom: new_draft returns `{ revision, source_revision_id }`,
  not a flat revision — returning the wrapper left `.id` undefined and 404'd the
  follow-up spec PATCH (also affected "Clone to draft"). Unwrap it.
- tests: api-client (unwrap regression, updateAgentRevisionSpec PATCH,
  getAgentModelCatalog GET) + useApplyAgentSpec (draft-in-place vs clone+patch,
  missing-appId guard).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
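
The unwrap regression described above comes down to the response shape. A minimal sketch, assuming only the `{ revision, source_revision_id }` wrapper named in the commit message; the `AgentRevision` fields and the `unwrapNewDraft` helper are hypothetical:

```typescript
interface AgentRevision {
  id: string;
}

// Wrapper returned by new_draft, per the commit message.
interface NewDraftResponse {
  revision: AgentRevision;
  source_revision_id: string;
}

// Hypothetical helper: hand callers the flat revision, not the wrapper.
function unwrapNewDraft(response: NewDraftResponse): AgentRevision {
  return response.revision;
}

const response: NewDraftResponse = {
  revision: { id: "draft-1" },
  source_revision_id: "rev-9",
};

// Treating the wrapper itself as a revision leaves `.id` undefined,
// which is what broke the follow-up spec PATCH.
const wrongId = (response as unknown as AgentRevision).id;
const rightId = unwrapNewDraft(response).id;
```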
… entrypoint

Mirror the backend spec change in the agent-builder UI + wire shapes.

- Rename the `model_policy` field → `models` (AgentSpec + accessors + tests).
  The `AgentModelPolicy` type keeps its name.
- Drop `spec.entrypoint` (removed from the spec) — and its config-pane row.
- Manual mode now seeds its model list from the level you were on when you
  switch from auto, so you start from auto's choices and edit rather than a
  blank slate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
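
Seeding manual mode from the current auto level might look like the following. This is a sketch under assumptions: the `LevelCatalog` shape (level to ordered model ids), the `"low"` level, and the `switchToManual` name are all illustrative rather than taken from the PR.

```typescript
type Level = "low" | "medium" | "high"; // "low" is assumed
type LevelCatalog = Record<Level, string[]>; // hypothetical: level -> ordered model ids

interface ManualPolicy {
  mode: "manual";
  models: string[];
}

// Start the manual list from whatever the auto level resolved to,
// copying the array so later edits do not mutate the catalog.
function switchToManual(level: Level, catalog: LevelCatalog): ManualPolicy {
  return { mode: "manual", models: catalog[level].slice() };
}
```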
@benjackwhite benjackwhite marked this pull request as ready for review June 24, 2026 16:18
greptile-apps Bot commented Jun 24, 2026

Reviews (2): Last reviewed commit: "refactor(agent-applications): rename spe..." | Re-trigger Greptile

…y check, set lookup

- fmtUsd: fixed-precision formatting (min 2 / max 4 fraction digits) so the
  cost column reads consistently and survives float noise from the catalog API.
- dirty check: canonicalise via a recursive key-sorting serializer so the
  "not saved yet" banner doesn't fire when the server serialises spec.models
  with a different key order than the locally-built policy. (Greptile's
  replacer-array suggestion would have dropped nested manual-entry keys.)
- session metrics: dedupe distinct models with a Set instead of Array.includes
  inside the loop.

P1 stale-state-on-revision-change was already handled by `key={ctx.revisionId}`
on ModelBody (greptile reviewed a pre-key commit).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
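
The two formatting fixes in this commit can be sketched as below. The bodies are illustrative, not the PR's code: `fmtUsd` is the name from the commit message, while the use of `Intl.NumberFormat` with the `en-US` locale, and the `sortKeys` and `canonicalise` helpers, are assumptions.

```typescript
// Fixed-precision USD formatting: at least 2 and at most 4 fraction digits.
const usdFormatter = new Intl.NumberFormat("en-US", {
  style: "currency",
  currency: "USD",
  minimumFractionDigits: 2,
  maximumFractionDigits: 4,
});

function fmtUsd(value: number): string {
  return usdFormatter.format(value);
}

// Recursively rebuild objects with sorted keys; arrays keep their order
// because the order of a manual model list is meaningful.
function sortKeys(value: unknown): unknown {
  if (Array.isArray(value)) return value.map((item) => sortKeys(item));
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const key of Object.keys(source).sort()) {
      sorted[key] = sortKeys(source[key]);
    }
    return sorted;
  }
  return value;
}

// Two policies are "equal" for the dirty check when these strings match.
function canonicalise(value: unknown): string {
  return JSON.stringify(sortKeys(value));
}
```

An array passed as the `JSON.stringify` replacer acts as an allow-list at every nesting level, so keys of nested manual entries that are not in the list would be dropped; sorting keys recursively avoids that.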
@benjackwhite benjackwhite requested a review from a team June 25, 2026 06:33
benjackwhite and others added 4 commits June 25, 2026 08:51
… after clone

useApplyAgentSpec clones a fresh draft when the target isn't already a
draft, then PATCHes the spec onto it. If the PATCH failed, that draft was
left orphaned (a copy of the source with no edit landed), and repeated
failed applies piled up empty drafts. Now the just-cloned draft is archived
best-effort on PATCH failure; the original error is always rethrown, and a
pre-existing draft passed in by the caller is never touched.

Addresses greptile review on #2900.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
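
The clone, patch, and archive-on-failure flow described in this commit can be sketched with an injected API object. The `SpecApi` interface, the `RevisionRef` shape, and the `applySpec` name are hypothetical stand-ins for the real api-client calls inside useApplyAgentSpec.

```typescript
interface RevisionRef {
  id: string;
  isDraft: boolean;
}

// Hypothetical stand-in for the api-client functions the hook calls.
interface SpecApi {
  cloneToDraft(sourceId: string): Promise<RevisionRef>;
  patchSpec(draftId: string, spec: object): Promise<void>;
  archive(draftId: string): Promise<void>;
}

async function applySpec(api: SpecApi, target: RevisionRef, spec: object): Promise<RevisionRef> {
  // Only clone when the target is not already a draft.
  const cloned = target.isDraft ? null : await api.cloneToDraft(target.id);
  const draft = cloned ?? target;
  try {
    await api.patchSpec(draft.id, spec);
    return draft;
  } catch (error) {
    if (cloned) {
      // Best-effort cleanup of the draft created above; an archive
      // failure is swallowed so it cannot mask the original error.
      await api.archive(cloned.id).catch(() => undefined);
    }
    // A caller-supplied draft is never archived; the error always propagates.
    throw error;
  }
}
```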
…l in model config

Surfaces spec.models.optimize_for in the model editor as a dropdown alongside
mode/level/reasoning (applies to both auto and manual):
- Cost (default): pin the first working model for the session — warm cache, no
  mid-session failover.
- Availability: fail over to the next model if the session's model goes down.

Wired into the policy object, dirty check, reset, and the create-draft-and-apply
save path. Adds AgentModelOptimizeFor to the shared spec types.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
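
To make the two `optimize_for` settings concrete, here is a toy model of the selection behaviour they describe. The real selection happens outside this UI and is not part of the PR; `pickModel`, its parameters, and the "next working model after the pinned one" rule are assumptions made for illustration.

```typescript
type OptimizeFor = "cost" | "availability";

// Hypothetical selector: `candidates` is the ordered model list,
// `pinned` is the model the session already uses (if any).
function pickModel(
  candidates: string[],
  pinned: string | undefined,
  optimizeFor: OptimizeFor,
  isUp: (model: string) => boolean,
): string | undefined {
  // New session: take the first working model.
  if (pinned === undefined) return candidates.find((m) => isUp(m));
  // Healthy pinned model: keep it so the prompt cache stays warm.
  if (isUp(pinned)) return pinned;
  // Cost: stay pinned, no mid-session failover.
  if (optimizeFor === "cost") return pinned;
  // Availability: move to the next working model after the pinned one.
  const next = candidates.indexOf(pinned) + 1;
  return candidates.slice(next).find((m) => isUp(m));
}
```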
…odel KPI

The Model `MetricItem` always emitted `text-[14px]` and additionally
`text-[12.5px]` when `mono` was true. Tailwind JIT generates both rules
and the resolved font-size depends on stylesheet emission order, not
className order — fragile across builds.

Make the size class conditional so only one is ever present.

Also drop a stale "no save wired yet" comment on AgentModelConfig — the
save path landed in the same series.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
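
The conditional size class amounts to choosing exactly one font-size utility. A minimal sketch, in which the `metricValueClass` name and the `font-mono` utility are assumptions and only the two `text-[...]` classes come from the commit message:

```typescript
// Emit exactly one font-size utility so the result cannot depend on
// the order in which Tailwind writes the rules to the stylesheet.
function metricValueClass(mono: boolean): string {
  return mono ? "font-mono text-[12.5px]" : "text-[14px]";
}
```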

dmarticus left a comment

Verified the spec.model / spec.entrypoint loosening — no other consumers in packages/ or apps/, so safe. Pushed a fix for the Tailwind size collision on the mono Model KPI (e341704). LGTM otherwise — three non-blocking nits inline.

Comment thread packages/ui/src/features/agent-applications/components/AgentModelConfig.tsx Outdated
Comment thread packages/ui/src/features/agent-applications/hooks/useApplyAgentSpec.test.ts Outdated
benjackwhite and others added 2 commits June 26, 2026 10:56
- ModelBrowser sort: rank cheapest/priciest by blended input+output cost,
  not input alone — input-only mis-ranks reasoning models (cheap input,
  dominant output), exactly the ones cost-conscious authors compare.
- useApplyAgentSpec: derive the revision-prefix invalidation key from the
  shared agentApplicationsKeys factory (new revisionPrefix) instead of a
  hand-rolled array, so it can't silently drift from revision().
- test: capture and exercise onSuccess, asserting it invalidates the
  detail/revisions/revision-prefix keys via the factory — closes the gap
  where mocking useMutation left invalidation keys untested.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
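
Both changes in this commit are small enough to sketch. Everything below is illustrative: the `ModelCost` fields, the plain input + output sum used as the "blended" cost, and the concrete key segments are assumptions; only the names `agentApplicationsKeys`, `revisionPrefix`, and `revision` appear in the commit message.

```typescript
interface ModelCost {
  id: string;
  inputPerMtok: number; // $ per million input tokens
  outputPerMtok: number; // $ per million output tokens
}

// Assumption: "blended" is a plain sum; the PR may weight the two differently.
const blendedCost = (m: ModelCost): number => m.inputPerMtok + m.outputPerMtok;

function sortByCost(models: ModelCost[], order: "cheapest" | "priciest"): ModelCost[] {
  const sign = order === "cheapest" ? 1 : -1;
  // Copy first so the caller's array is left untouched.
  return models.slice().sort((a, b) => sign * (blendedCost(a) - blendedCost(b)));
}

// Key factory: revision() is built on revisionPrefix(), so an invalidation
// by prefix cannot drift away from the per-revision key.
const revisionPrefix = (appId: string) => ["agent-applications", appId, "revision"] as const;
const revision = (appId: string, revisionId: string) =>
  [...revisionPrefix(appId), revisionId] as const;
const agentApplicationsKeys = { revisionPrefix, revision };
```

With a reasoning-style model priced at 1 in / 20 out and a plain model at 3 in / 6 out, input-only sorting would rank the reasoning model cheapest, while the blended sort ranks the plain model first.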
@benjackwhite benjackwhite merged commit a685c8b into main Jun 26, 2026
23 checks passed
@benjackwhite benjackwhite deleted the agent-model-policy-ui branch June 26, 2026 10:08
2 participants