Expose explicit primaryUpstreamProvider for Cedar authz on VirtualMCPServer#5199
Expose explicit primaryUpstreamProvider for Cedar authz on VirtualMCPServer#5199tgrunnagle wants to merge 6 commits intomainfrom
Conversation
Adds an optional primaryUpstreamProvider field to the inline authz config on VirtualMCPServer so users with multiple upstream IDPs can pin Cedar to a non-first provider, instead of being silently bound to whichever upstream happens to be listed first. Changes for issue #5197: - Add PrimaryUpstreamProvider to InlineAuthzConfig (shared type, vMCP-only in practice, mirroring the SubjectProviderName precedent on the token- exchange and AWS-STS strategies). - Switch the converter from unconditional first-upstream binding to an explicit-then-fallback resolution; both branches normalize through authserver.ResolveUpstreamName. - Reject the spec with AuthServerConfigValidated=False (AuthzUpstreamUnknown) when the explicit name does not match any declared upstream — Cedar would otherwise deny every request at runtime. - Suppress the AuthzUpstreamSelectionWarning advisory when the user has set the field explicitly; the auto-selection it warns about is no longer happening. - Extend converter and validator tests; regenerate CRD YAMLs and API docs. Existing manifests without the new field keep current behavior — the fallback branch is unchanged for that path.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #5199 +/- ##
==========================================
+ Coverage 67.65% 67.72% +0.06%
==========================================
Files 607 607
Lines 61982 62241 +259
==========================================
+ Hits 41937 42153 +216
- Misses 16883 16925 +42
- Partials 3162 3163 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
f24fcbe to
b5d2821
Compare
Fixed issues from code review: - MEDIUM: Reject explicit primaryUpstreamProvider when no embedded auth server is configured. The early-return direct-IdP branch in validateAuthzUpstreamAvailable now checks for a non-empty explicit name first and returns SpecValidationError with ConditionReasonAuthzUpstreamUnknown when set — closing the silent misconfiguration where the converter would forward an unresolvable name into Cedar config at runtime. - MEDIUM: Update the converter block comment so it accurately describes both rejection paths (mismatch with declared upstreams AND explicit name without an embedded AS), keeping the comment synchronized with the validator's behavior per the go-style.md rule. - MEDIUM: Replace the misleading "is normalized via ResolveUpstreamName" converter test with a case that actually exercises normalization (upstream Name:"" resolving to "default", user pinning to "default"). Removes redundancy with the single-upstream-honored case and matches the test's claimed assertion. - MEDIUM: Add validator test case for explicit primaryUpstreamProvider with no embedded auth server, locking the new rejection in.
b5d2821 to
b1499ed
Compare
tgrunnagle
left a comment
There was a problem hiding this comment.
Multi-Agent Consensus Review
Agents consulted: kubernetes-operator, security, architecture, test-coverage, go-quality, review-friendliness, general/cross-cutting. Phase 4.5 codex cross-review skipped (CLI not installed).
Consensus Summary
| # | Finding | Consensus | Severity | Action |
|---|---|---|---|---|
| F1 | Field on shared InlineAuthzConfig is silent no-op on MCPServer/MCPRemoteProxy |
10/10 | MEDIUM | Fix |
| F2 | validateAuthzUpstreamAvailable size + 3× duplicated rejection sequence |
10/10 | MEDIUM | Fix |
| F3 | Helper in wrong package; converter inlines duplicate | 9/10 | MEDIUM | Fix |
| F4 | Converter forwards explicit name without AS — defense in depth | 9/10 | MEDIUM | Fix |
| F5 | ConfigMap-sourced authz bypasses validation, no tests, TODO not tracked | 9/10 | MEDIUM | Fix |
| F6 | Doc comment on validateAuthzUpstreamAvailable not updated for new behavior |
8/10 | MEDIUM | Fix |
| F7 | No test for clearing stale AuthzUpstreamUnknown condition |
8/10 | MEDIUM | Fix |
| F8 | Mutable matched bool — use slices.ContainsFunc |
8/10 | LOW | Fix |
| F9 | Stale comment "Leaving PrimaryUpstreamProvider empty…" in converter | 8/10 | LOW | Fix |
| F10 | ConditionReasonAuthzUpstreamUnknown collapses two distinct paths |
7/10 | LOW | Fix |
| F11 | Free-string field not pre-validated against DNS-label regex | 7/10 | LOW | Fix |
| F12 | kubebuilder doc omits inline-only constraint | 7/10 | LOW | Fix |
| F13 | Helper name verbose / misleading | 7/10 | LOW | Fix |
| F14 | PR description missing "where to focus" hint | 7/10 | LOW | Body-only |
Overall
The PR is well-scoped and addresses a real, documented user pain (Cedar evaluating against the wrong upstream IDP because PrimaryUpstreamProvider was hard-bound to whichever upstream was listed first). The core inline-authz fix is correct, idiomatic, and well-tested for the inline path; the converter switch + validator rejection branches form a sensible explicit-then-fallback shape, and the multi-upstream advisory is correctly suppressed when the user disambiguates. The SubjectProviderName precedent invoked in the PR description is genuine and the field/condition naming is consistent with it. Reviewing as draft → COMMENT-level event (also forced because the reviewer is the PR author).
The MEDIUM findings cluster around three architectural choices worth deciding before flipping out of draft. F1 (shared InlineAuthzConfig exposes the field on MCPServer and MCPRemoteProxy as a silent no-op) is the most user-visible — neither admission nor a status condition tells operators their override has no effect on those CRDs. F4 (converter forwards an explicit name even when no AS is configured) reduces defense in depth: the validator is the only thing standing between a misconfigured spec and Cedar's deny-every-request fail-closed behavior. F5 (ConfigMap-sourced authz silently bypasses the new admission validator) is the most future-incident-prone gap; the TODO is acknowledged but not tracked.
The remaining findings are mechanical: the validator function has tripled its rejection-sequence boilerplate (F2), the helper is in the wrong package and the converter inlines its logic (F3), one comment is now stale (F9), one doc block is missing the new behavior (F6), one mutable-bool loop violates the immutable-assignment rule (F8). F10–F13 are polish: a single condition reason for two distinct paths, no kubebuilder Pattern on the upstream-name string, doc could mention the inline-only constraint, and the helper name is verbose. None block correctness on the inline-authz happy path.
Documentation
cmd/thv-operator/controllers/virtualmcpserver_controller.go:569— function doc onvalidateAuthzUpstreamAvailabledoes not describe the two new rejection branches (F6).cmd/thv-operator/pkg/vmcpconfig/converter.go:199— "Leaving PrimaryUpstreamProvider empty (no embedded AS or no upstreams)…" comment is no longer accurate after the explicit-forwarding switch (F9).cmd/thv-operator/api/v1beta1/mcpserver_types.go:783— kubebuilder marker doesn't mention the field is only honored whenAuthzConfigRef.Type == 'inline'(F12).- F14 (PR-level): This PR exceeds the repo's stated 400 LOC / 10-file guideline because the bulk is generated CRD YAML + auto-generated docs. Per
.claude/rules/pr-creation.md, large PRs are acceptable for generated code, but a one-sentence "Where to focus" hint under Special notes would convert the apparent 13-file review into the actual 4 hand-written files (controller, converter, two API types) plus 2 test files.
Generated with Claude Code
Inline review detail (companion to the consensus review above)GitHub's review-comment API is returning persistent Pinned to head F1 ·
|
Addresses #5199 review comments: - MEDIUM cmd/thv-operator/api/v1beta1/mcpserver_types.go:785 (F1): surface advisory condition AuthzPrimaryUpstreamProviderIgnored on MCPServer/MCPRemoteProxy reconcilers when primaryUpstreamProvider is set; the field has no embedded auth server to honor it. - MEDIUM cmd/thv-operator/controllers/virtualmcpserver_controller.go:562 (F3): move the explicit-provider lookup to AuthzConfigRef. ExplicitPrimaryUpstreamProvider() so the validator and the converter share one accessor. - LOW mcpserver_types.go:784 (F11): add kubebuilder Pattern matching the upstream-name regex from pkg/authserver/config.go so kubectl apply rejects malformed values up-front. - LOW mcpserver_types.go:783 (F12): doc the inline-only constraint and point at the configMap TODO in vmcpconfig/converter.go. - LOW virtualmcpserver_controller.go:562 (F13): renamed via the move in F3 — the new method has the cleaner name.
Addresses #5199 review comments: - MEDIUM cmd/thv-operator/controllers/virtualmcpserver_controller.go:628 (F2): extract rejectAuthzAdmission to dedupe the boilerplate (clear advisory + log + Phase=Failed + condition + ObservedGeneration + return *SpecValidationError) used by the two new authz rejection branches. - MEDIUM virtualmcpserver_controller.go:569 (F6): expand the function doc to describe the new explicit-primaryUpstreamProvider rejections alongside the original behaviors. - LOW virtualmcpserver_controller.go:687 (F8): replace the matched-bool/break loop with slices.ContainsFunc. The "no upstreams declared" branch (line 635) is intentionally not folded into the helper because it returns a plain error (causing requeue) rather than *SpecValidationError (no requeue) — preserving the existing semantics for that branch.
Addresses #5199 review comments: - MEDIUM cmd/thv-operator/pkg/vmcpconfig/converter.go:213 (F4): add a defensive cross-check so Convert refuses to produce an unresolvable PrimaryUpstreamProvider — even if invoked outside the reconcile flow (CLI dry-run, webhook, test harness). Errors when the explicit name has no embedded auth server, or doesn't match any declared upstream. Validator remains the primary user-facing fail-loud point. - LOW converter.go:199 (F9): retract the stale "no embedded AS leaves PrimaryUpstreamProvider empty" claim and describe the post-PR contract precisely. - LOW cmd/thv-operator/api/v1beta1/virtualmcpserver_types.go:407 (F10): introduce ConditionReasonAuthzPrimaryProviderRequiresAuthServer for the no-AS case and keep ConditionReasonAuthzUpstreamUnknown for the mismatch case, so dashboards/alerts can route the two misconfigurations separately. Updated the validator branch and the matching test to use the new reason. Updated converter test "explicit primary provider without auth server is forwarded" → "is rejected" to reflect the new fail-closed contract, and added a sibling test for the upstream-mismatch case.
Addresses #5199 review comments: - MEDIUM cmd/thv-operator/pkg/vmcpconfig/converter.go:210 (F5): - File tracking issue #5208 covering both configMap TODOs (policy load + primaryUpstreamProvider override) and reference it from the comments. - Add validator test TestVirtualMCPServerValidateAuthzUpstreamAvailable_ConfigMapFallThrough that pins the documented inline-only contract: configMap-sourced authz never trips the new admission rejections, the multi-upstream advisory still fires, and configMap users get auto-selection of the first upstream. - MEDIUM cmd/thv-operator/controllers/virtualmcpserver_controller_test.go :3718 (F7): add TestVirtualMCPServerValidateAuthzUpstreamAvailable_ClearsStaleAuthzUnknown table with both new failure reasons. Mirrors the production reconcile order (validateAuthServerConfig → validateAuthzUpstreamAvailable) to lock in the recoverability contract: a corrected spec transitions AuthServerConfigValidated back to True and the stale rejection reason does not survive.
|
Addressed all 13 findings from the consensus review across 4 commits.
Notes:
|
Summary
A VirtualMCPServer with multiple upstream IDPs configured on its embedded auth server has no way to point Cedar at a non-first upstream — the operator unconditionally binds
PrimaryUpstreamProviderto whichever upstream happens to be listed first in the spec. This forces users to reorder the list as a side-channel for an authz decision and offers no opt-out beyond editing the upstream order. This PR exposes an explicitprimaryUpstreamProviderfield on the inline authz config so users can pin Cedar's claim source, while preserving existing first-upstream behavior when the field is empty so existing manifests continue to work unchanged. Behavior brings Cedar/authz to parity with theSubjectProviderNameprecedent already established on the token-exchange and AWS-STS strategies ofMCPExternalAuthConfig.Closes #5197
Type of change
Test plan
task test)task lint-fix)API Compatibility
v1beta1API, OR theapi-break-allowedlabel is applied and the migration guidance is described above.The new
primaryUpstreamProviderfield is optional withomitempty. Existing manifests without the field continue to use the first-upstream auto-binding fallback verbatim.Changes
cmd/thv-operator/api/v1beta1/mcpserver_types.goPrimaryUpstreamProviderto sharedInlineAuthzConfig(vMCP-only in practice, mirrorsSubjectProviderName).cmd/thv-operator/api/v1beta1/virtualmcpserver_types.goConditionReasonAuthzUpstreamUnknownfor the new rejection paths.cmd/thv-operator/pkg/vmcpconfig/converter.goauthserver.ResolveUpstreamName.cmd/thv-operator/controllers/virtualmcpserver_controller.goAuthServerConfigValidated=False,AuthzUpstreamUnknown); suppress theAuthzUpstreamSelectionWarningadvisory when an explicit choice is set; advisory message now points users at the new field.cmd/thv-operator/controllers/virtualmcpserver_controller_test.gocmd/thv-operator/pkg/vmcpconfig/converter_test.goResolveUpstreamName(""upstream +"default"explicit), and explicit-forwarded-without-AS contract.deploy/charts/operator-crds/files/crds/*.yaml,templates/*.yaml,docs/operator/crd-api.mdDoes this introduce a user-facing change?
Yes. Users authoring
VirtualMCPServermanifests can now setspec.incomingAuth.authzConfig.inline.primaryUpstreamProviderto explicitly choose which upstream IDP's access token claims Cedar evaluates. When unset, behavior is unchanged: the controller defaults to the first configured upstream and emits the existingAuthzUpstreamSelectionWarningadvisory if multiple upstreams are declared. Setting the field to a name that does not match any declared upstream — or setting it without an embedded auth server — now rejects the spec at admission rather than failing silently at request time.Implementation plan
Approved implementation plan
Sub-issue of #5146, addressing root cause 1: the operator auto-binds
PrimaryUpstreamProviderto the first upstream provider's name without exposing a CRD field that lets users override or opt out.PrimaryUpstreamProvider stringto the sharedInlineAuthzConfig(matches theSubjectProviderNameprecedent on token-exchange/AWS-STS strategies — also defined on shared types but documented as vMCP-only).ResolveUpstreamNameon the user-supplied value too, so the field is normalized consistently with the auto-selected path.validateAuthzUpstreamAvailable): Reject (SpecValidationError,AuthServerConfigValidated=False) when an explicitPrimaryUpstreamProvideris set but does not match any entry inSpec.AuthServerConfig.UpstreamProvidersafterResolveUpstreamNamenormalization on both sides. Suppress theAuthzUpstreamSelectionWarningadvisory when the user has set the field explicitly. Leave the existing "embedded AS but zero upstreams" reject path unchanged. (Code review added: also reject when an explicit name is set with no embedded auth server at all.)task operator-generate,task operator-manifests,task -d cmd/thv-operator crdref-gen.Special notes for reviewers
InlineAuthzConfigtype (also reachable fromMCPServerandMCPRemoteProxy) but is documented as vMCP-only and has no effect elsewhere. This matches theSubjectProviderNameprecedent. The alternative — introducing a vMCP-specificVirtualMCPInlineAuthzConfigandAuthzConfigRef— was rejected as a larger refactor with knock-on changes in the converter,controllerutil.GenerateOrUpdateInlineAuthzConfigMap, and tests. Happy to revisit if reviewers prefer the stricter typing.pkg/authz/authorizers/cedar/core.goand will be a separate change. This PR only makes the auto-binding overridable.Address code review feedback) reflects feedback from/review-code-for-issueand adds the "explicit name without embedded AS" rejection plus a converter test that actually exercisesResolveUpstreamNamenormalization (the previous test was misleading becauseResolveUpstreamNameis the identity function for non-empty input).