feat(agent-applications): model policy config UI + session model #2900
Surface and edit `spec.model_policy` (the auto/manual model picker) in the agent config pane, and show which model a session used.
- agent-platform-types: add AgentModelPolicy (auto level / manual list) + ModelCatalog types; make spec.model optional (legacy).
- Model section: interactive policy editor (mode/level/reasoning dropdowns with descriptions + icons), an "auto level resolves to" preview with live pricing, and a searchable model browser (all served models + cost profiles). Manual mode supports add-from-browser and drag-to-reorder. Editing is local-state only; no save is wired yet.
- useModelCatalog: catalog stand-in (snapshot of the gateway /v1/models), the single swap point for a real model-info endpoint.
- Session detail: add a "Model" KPI showing the model(s) that answered.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
React Doctor found 1 issue in 1 file (1 warning).
  },
  {
    value: "high",
    title: "High",
    description: "Top-tier — long, branching, reasoning-heavy work.",
  },
] as const;

const REASONING_OPTIONS = [
  {
    value: "default",
    title: "Default",
Stale local state when revision changes
All four useState calls are initialized from spec.model_policy (and spec.reasoning), but useState initializers run only on mount. DetailBody has no key prop tied to the revision, so when the user switches to a different revision in AgentRevisionBar, spec changes but mode, level, reasoning, and manual stay at the previous revision's values. The dirty banner then compares stale state against the new revision's policy, so it will typically show "preview — not saved yet" for a revision the user has never touched.
The cleanest fix is to add key={revisionId} on <ModelBody> (or <AgentModelConfig>) in DetailBody, forcing a remount whenever the revision changes.
Make the model picker functional: save edits and read the real catalog.
- api-client: updateAgentRevisionSpec (draft-only PATCH) + getAgentModelCatalog.
- useApplyAgentSpec: "create draft and apply changes". PATCH a draft in place, else clone the revision to a fresh draft, apply, and select it. Freeze/promote stay on the existing lifecycle buttons.
- AgentModelConfig: Save / Reset bar (auto-branches a draft on non-draft revisions); threaded through the config pane.
- useModelCatalog: read GET …/agent_applications/models/ (drops the stand-in snapshot; small levels fallback while loading).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…catalog
- createAgentDraftRevisionFrom: new_draft returns `{ revision, source_revision_id }`,
not a flat revision — returning the wrapper left `.id` undefined and 404'd the
follow-up spec PATCH (also affected "Clone to draft"). Unwrap it.
- tests: api-client (unwrap regression, updateAgentRevisionSpec PATCH,
getAgentModelCatalog GET) + useApplyAgentSpec (draft-in-place vs clone+patch,
missing-appId guard).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
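The unwrap fix in this commit can be illustrated with a minimal sketch. Only the wrapper keys `revision` and `source_revision_id` come from the commit message; the `AgentRevision` fields, the `unwrapNewDraft` helper, and the sample IDs are hypothetical.

```typescript
// Hypothetical minimal shapes; only the wrapper's two keys come from the commit message.
interface AgentRevision {
  id: string;
  state: string;
}

interface NewDraftResponse {
  revision: AgentRevision;
  source_revision_id: string;
}

// Unwrap the response so callers receive a flat revision with a defined `id`.
function unwrapNewDraft(response: NewDraftResponse): AgentRevision {
  return response.revision;
}

// Made-up sample payload for illustration.
const response: NewDraftResponse = {
  revision: { id: "rev_draft_1", state: "draft" },
  source_revision_id: "rev_live_1",
};

// Treating the wrapper itself as a revision leaves `id` undefined at runtime,
// which is how a follow-up PATCH could end up addressed to ".../undefined".
const wrongId = (response as unknown as AgentRevision).id;
const rightId = unwrapNewDraft(response).id;
```

Here `wrongId` is `undefined` while `rightId` is `"rev_draft_1"`, which matches the failure mode the commit describes.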
… entrypoint
Mirror the backend spec change in the agent-builder UI + wire shapes.
- Rename the `model_policy` field → `models` (AgentSpec + accessors + tests). The `AgentModelPolicy` type keeps its name.
- Drop `spec.entrypoint` (removed from the spec) and its config-pane row.
- Manual mode now seeds its model list from the level you were on when you switch from auto, so you start from auto's choices and edit rather than a blank slate.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…y check, set lookup
- fmtUsd: fixed-precision formatting (min 2 / max 4 fraction digits) so the
cost column reads consistently and survives float noise from the catalog API.
- dirty check: canonicalise via a recursive key-sorting serializer so the
"not saved yet" banner doesn't fire when the server serialises spec.models
with a different key order than the locally-built policy. (Greptile's
replacer-array suggestion would have dropped nested manual-entry keys.)
- session metrics: dedupe distinct models with a Set instead of Array.includes
inside the loop.
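The dirty-check bullet above describes a recursive key-sorting serializer. A minimal sketch of that idea follows; the function names, the `Json` type, and the sample policy objects (including the `id` and `reasoning` entry keys) are assumptions for illustration, not the code in the PR.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize with object keys sorted at every depth, so key order never affects equality.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    // Array order is preserved: in a manual list, position is meaningful.
    return `[${value.map(canonicalize).join(",")}]`;
  }
  const entries = Object.keys(value)
    .sort()
    .map((key) => `${JSON.stringify(key)}:${canonicalize(value[key])}`);
  return `{${entries.join(",")}}`;
}

// The editor is dirty only when the canonical forms differ.
function isDirty(local: Json, saved: Json): boolean {
  return canonicalize(local) !== canonicalize(saved);
}

// Same content, different key order (hypothetical policy shapes).
const saved: Json = { mode: "manual", models: [{ id: "model-a", reasoning: "default" }] };
const local: Json = { models: [{ reasoning: "default", id: "model-a" }], mode: "manual" };

// A top-level replacer array acts as a whitelist at every depth, so nested keys vanish.
const lossy = JSON.stringify(local, Object.keys(local).sort());
```

With these inputs `isDirty(local, saved)` is `false`, while `lossy` is `{"mode":"manual","models":[{}]}`: the replacer-array approach drops the nested entry keys, which is the problem the commit message calls out.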
P1 stale-state-on-revision-change was already handled by `key={ctx.revisionId}`
on ModelBody (greptile reviewed a pre-key commit).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
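The fixed-precision formatting described in this commit could look roughly like the sketch below. The commit only states "min 2 / max 4 fraction digits"; using `Intl.NumberFormat` with the `en-US` locale and a currency style is an assumption about how `fmtUsd` is built.

```typescript
// Assumed implementation: the commit specifies only min 2 / max 4 fraction digits.
const usdFormatter = new Intl.NumberFormat("en-US", {
  style: "currency",
  currency: "USD",
  minimumFractionDigits: 2,
  maximumFractionDigits: 4,
});

// Format a dollar amount with 2 to 4 fraction digits.
function fmtUsd(amount: number): string {
  return usdFormatter.format(amount);
}

const whole = fmtUsd(3); // "$3.00"
const noisy = fmtUsd(0.1 + 0.2); // "$0.30" (float noise is rounded away)
const small = fmtUsd(0.0375); // "$0.0375"
```

Rounding to at most four fraction digits absorbs values such as `0.30000000000000004`, and the two-digit minimum keeps the cost column aligned for round prices.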
… after clone
useApplyAgentSpec clones a fresh draft when the target isn't already a draft, then PATCHes the spec onto it. If the PATCH failed, that draft was left orphaned (a copy of the source with no edit landed), and repeated failed applies piled up empty drafts. Now the just-cloned draft is archived best-effort on PATCH failure; the original error is always rethrown, and a pre-existing draft passed in by the caller is never touched.
Addresses greptile review on #2900.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
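The clone, PATCH, and cleanup flow described above can be sketched without React or a network. The `SpecApi` interface, the `Revision` states other than `draft`, and the `applySpec` name are hypothetical stand-ins for the real api-client and hook.

```typescript
// Hypothetical API surface, injected so the sketch runs without a network.
interface Revision {
  id: string;
  state: "draft" | "frozen" | "live"; // only "draft" is named in the source
}

interface SpecApi {
  cloneToDraft(sourceId: string): Promise<Revision>;
  patchSpec(revisionId: string, spec: Record<string, unknown>): Promise<Revision>;
  archive(revisionId: string): Promise<void>;
}

async function applySpec(
  api: SpecApi,
  target: Revision,
  spec: Record<string, unknown>,
): Promise<Revision> {
  const isExistingDraft = target.state === "draft";
  // PATCH a draft in place; otherwise branch a fresh draft first.
  const draft = isExistingDraft ? target : await api.cloneToDraft(target.id);
  try {
    return await api.patchSpec(draft.id, spec);
  } catch (error) {
    if (!isExistingDraft) {
      // Best-effort cleanup of the draft this call created; archive errors are swallowed.
      await api.archive(draft.id).catch(() => undefined);
    }
    // The original PATCH error is always rethrown.
    throw error;
  }
}
```

The `isExistingDraft` flag is what keeps a caller-supplied draft from being archived: only a draft cloned inside this call is eligible for cleanup.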
…l in model config
Surfaces spec.models.optimize_for in the model editor as a dropdown alongside mode/level/reasoning (applies to both auto and manual):
- Cost (default): pin the first working model for the session (warm cache, no mid-session failover).
- Availability: fail over to the next model if the session's model goes down.
Wired into the policy object, dirty check, reset, and the create-draft-and-apply save path. Adds AgentModelOptimizeFor to the shared spec types.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…odel KPI
The Model `MetricItem` always emitted `text-[14px]` and additionally `text-[12.5px]` when `mono` was true. Tailwind JIT generates both rules, and the resolved font-size depends on stylesheet emission order, not className order, which is fragile across builds. Make the size class conditional so only one is ever present.
Also drop a stale "no save wired yet" comment on AgentModelConfig; the save path landed in the same series.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
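A minimal sketch of the conditional size class, assuming a helper that builds the class string (the `metricValueClass` name and the `font-mono` utility are illustrative; the two size classes are the ones named in the commit):

```typescript
// Hypothetical helper: emit exactly one font-size utility so CSS emission order cannot matter.
function metricValueClass(mono: boolean): string {
  const size = mono ? "text-[12.5px]" : "text-[14px]";
  return mono ? `font-mono ${size}` : size;
}

const monoClass = metricValueClass(true);
const defaultClass = metricValueClass(false);
```

Because the two size utilities never appear together, the rendered font size no longer depends on which rule Tailwind happens to emit last.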
dmarticus
left a comment
Verified the spec.model / spec.entrypoint loosening — no other consumers in packages/ or apps/, so safe. Pushed a fix for the Tailwind size collision on the mono Model KPI (e341704). LGTM otherwise — three non-blocking nits inline.
- ModelBrowser sort: rank cheapest/priciest by blended input+output cost, not input alone. Input-only mis-ranks reasoning models (cheap input, dominant output), exactly the ones cost-conscious authors compare.
- useApplyAgentSpec: derive the revision-prefix invalidation key from the shared agentApplicationsKeys factory (new revisionPrefix) instead of a hand-rolled array, so it can't silently drift from revision().
- test: capture and exercise onSuccess, asserting it invalidates the detail/revisions/revision-prefix keys via the factory. This closes the gap where mocking useMutation left invalidation keys untested.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
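The blended-cost ranking can be sketched as below. The `ModelCost` shape, the model IDs, and the prices are made up, and treating "blended" as a plain sum of input and output price is an assumption; the actual weighting in ModelBrowser may differ.

```typescript
// Hypothetical catalog row; prices are USD per million tokens and the values are made up.
interface ModelCost {
  id: string;
  inputPerMtok: number;
  outputPerMtok: number;
}

// Assumption: "blended" is a plain sum of input and output price.
function blendedCost(model: ModelCost): number {
  return model.inputPerMtok + model.outputPerMtok;
}

// Sort a copy so the caller's catalog order is left untouched.
function sortCheapestFirst(models: readonly ModelCost[]): ModelCost[] {
  return [...models].sort((a, b) => blendedCost(a) - blendedCost(b));
}

const catalog: ModelCost[] = [
  { id: "reasoner", inputPerMtok: 1, outputPerMtok: 20 },
  { id: "balanced", inputPerMtok: 3, outputPerMtok: 6 },
];

const byBlended = sortCheapestFirst(catalog).map((m) => m.id);
const byInputOnly = [...catalog]
  .sort((a, b) => a.inputPerMtok - b.inputPerMtok)
  .map((m) => m.id);
```

With these sample prices, input-only sorting ranks `reasoner` first even though its output price dominates, while the blended sort ranks `balanced` first.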
Problem
The agent config pane couldn't show or edit the new `spec.models` (the auto/manual model picker + `optimize_for`): the `model` node still rendered the legacy `spec.model` string, so new specs showed "not set". There was also no way to see which model a session actually used, or to browse the served models and their costs.
Changes
Surface and edit `spec.models` in the agent config pane, save it back to a draft, and show the model a session used.
- `agent-platform-types`: add `AgentModelPolicy` (auto `level` / manual `models[]`), `AgentModelOptimizeFor`, and `ModelCatalog`; `spec.model` becomes optional (legacy fallback).
- `AgentModelConfig`: an interactive policy editor. `mode` / `optimize for` / `level` / `reasoning` are dropdowns with per-option descriptions + icons; there is an "auto level resolves to" preview with live pricing, and a searchable model browser of every served model with its cost profile (input/output/cache $/Mtok, context window, provider; sort by name / cheapest / priciest). Manual mode supports add-from-browser and drag-to-reorder (@dnd-kit).
- `optimize for` (cost | availability): the session-stability control mirroring `spec.models.optimize_for`. Cost (default) pins the first working model for the whole session (warm prompt cache, no mid-session failover); Availability fails over to the next model if the session's model goes down (survives outages, re-reads context cold). Applies to both auto and manual.
- `useApplyAgentSpec`: a create-draft-and-apply path. It patches the spec on a draft in place, or branches a fresh draft from a non-draft revision first (and archives the orphan best-effort if the patch fails). Dirty banner + reset; "Save to new draft" when the source isn't a draft.
- `useModelCatalog`: fetches the real catalog (`GET …/agent_applications/models/` → janitor → gateway), with a known-levels fallback while loading.
- `AgentSessionDetailBody`: a Model KPI showing the model(s) that answered (one, or `primary +N` on fallback).
Default policy is `auto / medium / cost`.
How did you test this?
- `pnpm --filter @posthog/ui --filter @posthog/shared typecheck`: clean.
- `pnpm biome check` on changed files: clean.
- `useApplyAgentSpec` unit tests (PATCH-in-place, clone-then-PATCH, orphaned-draft archive-on-failure) pass.
- (… `optimize for` row, will refresh.)
Automatic notifications