
Bump K8s test versions to 1.32.x-1.35.x; update vcluster, cert-manager, kuttl, and upgrade tests #1334

Merged
david-yu merged 36 commits into main from bump/kubernetes-test-versions-1.32-1.35
Mar 30, 2026

Conversation


@david-yu david-yu commented Mar 21, 2026

Summary

  • Update supported Kubernetes version range for testing from v1.30.x-v1.33.x to v1.32.x-v1.35.x
  • Per-PR tests run against the minimum supported version (K8s 1.32.x) using the default K3S/Kind images
  • Nightly tests run against the maximum supported version (K8s 1.35.x) by overriding K3S_IMAGE in a separate Buildkite pipeline step
  • Bump kuttl from v0.19.0 to v0.25.0 to support Kind v0.31.0 node images required for K8s 1.32.x
  • Bump vcluster from v0.28.0 to v0.31.2 — the old version fails to initialize on K8s 1.32
  • Bump vcluster cert-manager from v1.8.0 to v1.17.2 — the old version only supports K8s 1.19-1.24
  • Update acceptance upgrade tests from operator v25.1.3 to v25.2.2 and default Redpanda image to v26.1.1-rc5
  • Add CLAUDE.md documenting repo structure, CI patterns, and a step-by-step checklist for future K8s version bumps

Integration test retry for transient DNS failures

Integration tests (TestIntegrationClientFactory, TestIntegrationClientFactoryTLSListeners) occasionally see transient DNS resolution failures during SRV lookups after cluster startup. These are spurious — StatefulSet pods have stable name-[ordinal] DNS identities and the PodDialer correctly handles them via the SRV record code path in charts/redpanda/client/client.go.

The fix wraps client connection assertions in require.Eventually retry loops (2min timeout, 5s interval) to tolerate transient DNS propagation delays rather than failing on the first attempt.

Root cause: vcluster v0.28.0 incompatible with K8s 1.32

The acceptance upgrade tests were failing with secrets "vc-vcluster-xxx" not found — the vcluster pod itself failed to initialize on K8s 1.32, so the kubeconfig secret was never created. This caused all vcluster-dependent tests (operator upgrades, field manager regressions) to fail. The subsequent INSTALLATION FAILED: context deadline exceeded errors from helm were a downstream symptom.

Fix: Bump vcluster from v0.28.0 → v0.31.2 which supports K8s 1.32+.

Updated in:

  • pkg/vcluster/vcluster.go — vcluster chart version constant
  • Taskfile.yml — DEFAULT_TEST_VCLUSTER_VERSION
  • All integration test files that import the vcluster-pro image

Root cause: cert-manager v1.8.0 incompatible with K8s 1.32

The vcluster test infrastructure also deploys cert-manager v1.8.0 inside the vcluster for webhook TLS certificate management. cert-manager v1.8.0 only supports K8s 1.19-1.24 and fails to start on K8s 1.32, preventing the operator's webhook certificates from being issued.

Fix: Bump cert-manager from v1.8.0 → v1.17.2 which supports K8s 1.29-1.33+.

Updated in:

  • pkg/vcluster/vcluster.go — cert-manager chart version constant
  • Taskfile.yml — DEFAULT_SECOND_TEST_CERTMANAGER_VERSION
  • All integration test files that import cert-manager images

Acceptance test improvements

operatorIsRunning readiness check: Replaced immediate require.Equal assertions with require.Eventually (2min timeout, 5s interval) that polls until the operator deployment has available replicas. The previous check used checkStableResource which only waited for the resource version to stabilize — not for the pod to become ready.

Upgrade test versions: Bumped starting operator from v25.1.3 to v25.2.2 and default Redpanda image to redpanda-unstable:v26.1.1-rc5 (26.1 is not yet GA).

Before this PR:

| Test | Upgrade Path | Default Redpanda Image |
|---|---|---|
| Operator upgrade from 25.1.3 | v25.1.3 → dev | redpandadata/redpanda:v25.3.1 |
| Regression - field managers | v25.1.3 → v25.3.1 → dev | redpandadata/redpanda:v25.3.1 |
| Console v2 to v3 (2 scenarios) | v25.1.3 → dev | redpandadata/redpanda:v25.3.1 |

After this PR:

| Test | Upgrade Path | Default Redpanda Image |
|---|---|---|
| Operator upgrade from 25.2.2 | v25.2.2 → dev | redpandadata/redpanda-unstable:v26.1.1-rc5 |
| Regression - field managers | v25.2.2 → v25.3.1 → dev | redpandadata/redpanda-unstable:v26.1.1-rc5 |
| Console v2 to v3 (2 scenarios) | v25.2.2 → dev | redpandadata/redpanda-unstable:v26.1.1-rc5 |

Why the 3-step path for field managers? The regression test verifies that upgrading through v25.3.1 introduces a *kube.Ctl field manager (a known issue in that version), and then upgrading to the current dev build removes it. v25.3.1 must be the intermediate step because it's the version that exhibits the regression — skipping it means the regression never appears and the test has nothing to verify.

Changes

| File | Change |
|---|---|
| pkg/k3d/k3d.go | Default K3S image: v1.29.6-k3s2 → v1.32.13-k3s1 |
| pkg/vcluster/vcluster.go | vcluster: v0.28.0 → v0.31.2; cert-manager: v1.8.0 → v1.17.2 |
| operator/kind*.yaml | Kind node images: v1.29.8 → v1.32.11 (with sha256 digest from Kind v0.31.0) |
| Taskfile.yml | Kube test components: v1.29.6 → v1.32.13; vcluster: 0.28.0 → 0.31.2; cert-manager: v1.8.0 → v1.17.2; default test Redpanda image: redpanda:v25.3.1 → redpanda-unstable:v26.1.1-rc5; add v25.3.1 operator image to pull list |
| ci/kuttl.nix | Kuttl: v0.19.0 → v0.25.0 (embeds Kind v0.31.0, required for kindest/node v1.32.x) |
| pkg/lint/testdata/tool-versions.txtar | Updated kuttl version golden |
| operator/*_test.go (3 files) | kube-controller-manager/kube-apiserver: v1.29.6 → v1.32.13; cert-manager: v1.8.0 → v1.17.2; vcluster-pro: 0.28.0 → 0.31.2 |
| operator/pkg/client/factory_test.go | Wrap Kafka/Admin client assertions in require.Eventually retry loops; update test images |
| acceptance/steps/operator.go | operatorIsRunning uses require.Eventually instead of immediate assertions |
| acceptance/features/operator-upgrades.feature | Start from v25.2.2 (was v25.1.3) |
| acceptance/features/upgrade-regressions.feature | Start from v25.2.2 (was v25.1.3); intermediate upgrade to v25.3.1 (published); final upgrade to dev chart |
| acceptance/features/console-upgrades.feature | Start from v25.2.2 (was v25.1.3) |
| acceptance/steps/defaults.go | Default Redpanda image: redpanda:v25.3.1 → redpanda-unstable:v26.1.1-rc5 |
| .buildkite/pipeline.yml | Split ci-entry-point into per-PR (min K8s) and nightly (max K8s via K3S_IMAGE=rancher/k3s:v1.35.2-k3s1) |
| CLAUDE.md | New file: repo guide, CI patterns, K8s version bump checklist (including vcluster, cert-manager, upgrade test docs) |

Why bump kuttl?

Kuttl v0.19.0 embedded Kind v0.24.0 which maxes out at kindest/node:v1.31.0. Attempting to use kindest/node:v1.32.11 with the old kuttl caused failed to detect containerd snapshotter errors because Kind v0.24.0 doesn't understand the containerd configuration in newer node images. Kuttl v0.25.0 embeds Kind v0.31.0, which natively supports kindest/node:v1.32.11.

How nightly K8s version override works

The K3S_IMAGE env var set on the nightly Buildkite step propagates through buildkite-agent pipeline upload into the testsuite pipeline. The k3d package (pkg/k3d/k3d.go:93-95) checks K3S_IMAGE and uses it to override the default image when creating test clusters.

CLAUDE.md

Documents learnings from this bump, including:

  • Repository structure and build system
  • CI lint flow (generate → lint → git diff --exit-code)
  • Golden test file patterns
  • Step-by-step checklist for bumping Kubernetes versions (11 locations: k3d, Kind, kuttl, Taskfile, test images, golden files, nightly, envtest, vcluster, cert-manager, acceptance upgrade tests)
  • Proto conflict workaround

Test plan

  • Per-PR CI: Lint passes
  • Per-PR CI: Unit tests pass
  • Per-PR CI: Kuttl-V1-Nodepools tests pass with kuttl v0.25.0 and kindest/node v1.32.11
  • Per-PR CI: Integration tests pass (vcluster v0.31.2 + cert-manager v1.17.2)
  • Per-PR CI: Acceptance tests pass (vcluster v0.31.2 + operator v25.2.2 → v25.3.1 → dev + Redpanda v26.1.1-rc5)
  • Verify nightly schedule triggers the K8s 1.35.x test run via K8S_NIGHTLY=1

🤖 Generated with Claude Code

Update the supported Kubernetes version range for testing from
v1.30.x-v1.33.x to v1.32.x-v1.35.x.

Per-PR tests use the minimum supported version (K8s 1.32.x):
- Default K3S image: rancher/k3s:v1.32.13-k3s1
- Kind node images: kindest/node:v1.32.11
- Kube test components: v1.32.13

Nightly tests use the maximum supported version (K8s 1.35.x):
- Split ci-entry-point into two steps: one for per-PR tests (minimum
  K8s version) and one for nightly tests (maximum K8s version via
  K3S_IMAGE=rancher/k3s:v1.35.2-k3s1)
- The K3S_IMAGE env var propagates through to the testsuite pipeline
  and overrides the default in pkg/k3d/k3d.go

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
david-yu and others added 7 commits March 20, 2026 22:04
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Kind v0.31.0 requires the @sha256 digest to guarantee the correct
image for the release. Without it, the containerd snapshotter
detection fails.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Kuttl v0.19.0 embeds Kind v0.24.0 which cannot handle kindest/node
images from Kind v0.31.0 (fails with "failed to detect containerd
snapshotter"). Since kuttl tests use Kind internally via startKIND,
the Kind node images must stay compatible with kuttl's embedded Kind
library.

The K8s version bump to 1.32.x-1.35.x is achieved through k3d
(integration and acceptance tests) which is not affected by this
limitation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Kuttl v0.19.0 embedded Kind v0.24.0 which couldn't handle kindest/node
images from Kind v0.31.0 ("failed to detect containerd snapshotter").
Kuttl v0.25.0 embeds Kind v0.31.0, enabling support for K8s 1.32.x
node images.

- ci/kuttl.nix: bump from v0.19.0 to v0.25.0
- operator/kind*.yaml: kindest/node v1.29.8 -> v1.32.11 with sha256 digest
- Taskfile.yml: kube test component images v1.29.6 -> v1.32.13

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The integration tests import kube-controller-manager and kube-apiserver
images into the k3d cluster. These were hardcoded to v1.29.6 and need
to match the bumped test version (v1.32.13).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Documents repository structure, build system, CI lint flow, golden
test patterns, Kubernetes version testing architecture, and a
step-by-step checklist for bumping Kubernetes versions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
david-yu changed the title from "Bump Kubernetes test versions from 1.29.x to 1.32.x-1.35.x" to "Bump Kubernetes test versions from 1.29.x to 1.32.x-1.35.x; add CLAUDE.md" on Mar 21, 2026
david-yu and others added 7 commits March 22, 2026 22:41
- Revert pipeline.yml split: restore single ci-entry-point with
  original nightly condition instead of two separate steps
- Move K3S_IMAGE env var to flake.nix devshell: nightly and local
  dev default to max K8s version (v1.35.2), per-PR tests use the
  hardcoded default in pkg/k3d/k3d.go (v1.32.13)
- CLAUDE.md: note -update flag is legacy, prefer -update-golden
- CLAUDE.md: wrap all commands in nix develop -c for correct
  tool versions and environment

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Nix's string interpolation syntax conflicts with shell parameter
expansion containing colons (e.g. v1.35.2-k3s1). Use a plain value
instead of eval to avoid the parser error.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove references to the legacy -update flag per review feedback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…steps

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The integration test failures with spurious DNS resolution errors
(non-StatefulSet DNS names in SRV records) were transient, not a
systemic issue. StatefulSet pods have stable name-[ordinal] DNS
identities, so the PodDialer correctly handles them.

Wrap KafkaClient and AdminClient assertions in require.Eventually
retry loops to guard against transient DNS propagation delays
during test cluster startup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…1.1-rc5

The old operator v25.1.3 can't install on K8s 1.35, causing all upgrade
acceptance tests to fail with helm install timeouts. Bump the upgrade-from
version to v25.2.2 (latest 25.2.x) which has better K8s compatibility.

Also update the default Redpanda image to the v26.1.1-rc5 unstable build
for testing with the upcoming 26.1 release (not yet GA).

Changes:
- operator-upgrades.feature: upgrade from v25.2.2 (was v25.1.3)
- upgrade-regressions.feature: start from v25.2.2 (was v25.1.3)
- console-upgrades.feature: start from v25.2.2 (was v25.1.3)
- defaults.go: default Redpanda image to redpanda-unstable:v26.1.1-rc5
- Taskfile.yml: update test image defaults and add v26.1.1-rc5 to pull list

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
david-yu changed the title from "Bump Kubernetes test versions from 1.29.x to 1.32.x-1.35.x; add CLAUDE.md" to "Bump Kubernetes test versions to 1.32.x-1.35.x; update upgrade tests to 25.2.2 + Redpanda v26.1.1-rc5" on Mar 24, 2026
david-yu and others added 5 commits March 23, 2026 22:34
cert-manager v1.8.0 only supports K8s 1.19-1.24 and fails to start
on K8s 1.32+. This caused the operator upgrade acceptance tests to
timeout because cert-manager couldn't issue webhook TLS certificates,
so the old operator's helm install never completed.

cert-manager v1.17.2 supports K8s 1.29-1.33+, covering our test
range of K8s 1.32-1.35.

Updated in:
- pkg/vcluster/vcluster.go (certManagerChartversion)
- Taskfile.yml (DEFAULT_SECOND_TEST_CERTMANAGER_VERSION)
- All integration test files that import cert-manager images

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two acceptance test fixes:

1. operatorIsRunning now uses require.Eventually (2min timeout) to
   wait for the operator deployment to have available replicas instead
   of immediately asserting after checkStableResource. The previous
   check only waited for the resource version to stabilize, but the
   pod may not be ready yet — especially after namespace switches
   between scenarios.

2. Bump upgrade-from operator version from v25.2.2 to v25.3.1. The
   v25.2.2 operator still times out on helm install in K8s 1.32
   vclusters. v25.3.1 is the latest release and most likely to be
   compatible with the K8s 1.32 API surface.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The acceptance upgrade tests install operator v25.3.1 from the public
helm repo, which requires the operator container image to be available
in the k3d cluster. Without pre-pulling it, the image pull inside the
vcluster times out causing INSTALLATION FAILED: context deadline exceeded.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root cause: vcluster v0.28.0 fails to initialize on K8s 1.32 —
the vcluster pod never starts, so the kubeconfig secret
"vc-<name>" is never created, causing all vcluster-dependent
tests to fail with "secrets not found".

Changes:
- Bump vcluster from v0.28.0 to v0.31.2 (supports K8s 1.32+)
- Revert upgrade-from operator version back to v25.2.2 (from v25.3.1)
  since the vcluster fix is the actual blocker
- Remove unused v25.3.1 operator image from pull list
- Update vcluster-pro image refs in all integration test files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
david-yu changed the title from "Bump Kubernetes test versions to 1.32.x-1.35.x; update upgrade tests to 25.2.2 + Redpanda v26.1.1-rc5" to "Bump K8s test versions to 1.32.x-1.35.x; update vcluster, cert-manager, kuttl, and upgrade tests" on Mar 24, 2026
david-yu and others added 3 commits March 24, 2026 11:35
The field managers regression test was upgrading from v25.2.2 to v25.2.2
(same version) due to an earlier replace_all error. Fix the intermediate
upgrade step to use the local dev chart ("../operator/chart") so it
actually tests upgrading from v25.2.2 to the current build (v26.1.x).

Also add sections 9-11 to CLAUDE.md documenting the vcluster, cert-manager,
and acceptance upgrade test version dependencies that must be updated when
bumping Kubernetes versions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The field managers regression test needs a 3-step upgrade path:
  v25.2.2 → v25.3.1 → dev chart

v25.3.1 introduced the *kube.Ctl field manager regression, and the
dev chart fixes it. The previous commit incorrectly skipped v25.3.1
by upgrading directly to dev, so the regression never appeared and
the test timed out waiting for *kube.Ctl.

Also add the v25.3.1 operator image to the pull list so it's
available inside the k3d cluster.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
flake.nix was unconditionally setting K3S_IMAGE=rancher/k3s:v1.35.2-k3s1
in the devshell, which meant ALL CI runs (including per-PR) used K8s 1.35
instead of the intended K8s 1.32 minimum.

Remove the K3S_IMAGE env var from flake.nix so:
- Per-PR tests use the Go default from pkg/k3d/k3d.go (v1.32.13-k3s1)
- Nightly tests override via the Buildkite schedule env setting:
  K3S_IMAGE=rancher/k3s:v1.35.2-k3s1

The Buildkite nightly schedule must be configured to set both:
  K8S_NIGHTLY=1 (gate condition in pipeline.yml)
  K3S_IMAGE=rancher/k3s:v1.35.2-k3s1 (runtime override for k3d)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

david-yu commented Mar 25, 2026

Action needed in Buildkite UI: The nightly schedule for this pipeline must be configured with these env vars:

  K8S_NIGHTLY=1
  K3S_IMAGE=rancher/k3s:v1.35.2-k3s1

K8S_NIGHTLY=1 is the gate condition (already in pipeline.yml line 39). K3S_IMAGE is the runtime override that pkg/k3d/k3d.go reads via os.LookupEnv. Both must be set in the Buildkite schedule's environment settings — this can't be done in code, only in the Buildkite UI.

Resolve conflict in redpanda_controller_test.go: take main's
refactor that uses the importImages variable instead of a hardcoded
list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
david-yu and others added 4 commits March 26, 2026 11:19
…2 images

The redpanda_controller_test.go was missed when bumping test
infrastructure versions. It still imported:
- vcluster-pro:0.28.0 (should be 0.31.2)
- kube-controller-manager:v1.29.6 (should be v1.32.13)
- kube-apiserver:v1.29.6 (should be v1.32.13)
- cert-manager:v1.8.0 (should be v1.17.2)

This caused TestIntegrationRedpandaController to fail immediately
with "Image 'ghcr.io/loft-sh/vcluster-pro:0.28.0' couldn't be found
in the container runtime" since only v0.31.2 is pre-pulled.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
vcluster v0.31.2 may not create the kubeconfig secret (vc-<name>)
immediately after helm install --wait completes. The secret creation
is asynchronous — the vcluster pod is ready but the secret hasn't
been written yet.

Replace the single Get with wait.PollUntilContextTimeout (2 min
timeout, 2 sec interval) to handle this race condition.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor

@andrewstucki andrewstucki left a comment


Some things we should definitely strip out, some things that we'll want to double check, and a question about what we want to document as our supported range (are we going to tell users they must be on 1.32-1.35? if so, let's also constrain the helm charts to that)

Comment thread pkg/k3d/k3d.go

const (
-	DefaultK3sImage = `rancher/k3s:v1.29.6-k3s2`
+	DefaultK3sImage = `rancher/k3s:v1.32.13-k3s1`
Contributor
I believe we talked about keeping the default k3s image the earliest documented version we'll support. Are we planning on telling everyone we only support 1.32+? If so we should also change the helm manifests to match. If not, I'd keep this as is and solely overwrite the K3S_IMAGE variable in nightly tests.

Contributor Author
I think we should keep helm chart installation unblocked by the K8s version, but the testing should be done on what we say we support. We can always spot-check certain versions manually if asked, but I don't think we should keep older versions around, as there will be a tendency to support a large surface area and not test against non-EOL K8s versions.

Contributor Author

@david-yu david-yu Mar 27, 2026


Industry Comparison

| Project | K8s Versions Tested | Min | Max | Tool |
|---|---|---|---|---|
| Istio | 12 (!!) | 1.23 | 1.35 | Kind (custom images) |
| cert-manager | 5 | 1.31 | 1.35 | Kind |
| ArgoCD | 4 | 1.32 | 1.35 | K3s |
| Prometheus Operator | 1 | 1.35 | 1.35 | Kind |

Key Takeaways

  • Istio is an extreme outlier — they test 12 minor versions (1.23–1.35), including many long-EOL versions. Most projects don't do this.
  • cert-manager and ArgoCD represent the mainstream — 4–5 versions, roughly tracking what cloud providers still support. Their minimums (1.31, 1.32) include at most 1 recently-EOL version.
  • Prometheus Operator only tests on the latest version in CI.
  • Nobody uses the minimum supported version as the default CI test target — the default is always a recent version, with older versions in matrix/nightly runs.

Contributor Author


Cert manager also just moved to 1.32 - 1.35 as of Friday: https://cert-manager.io/docs/releases/#currently-supported-releases

- Remove "Post-Merge: Tagging and Publishing" section to prevent
  accidental release cutting via Claude (git tag push, workflow triggers)
- Use task-based generators (task generate, task k8s:generate, task lint,
  task test:unit) instead of raw tool invocations for consistency with CI
- Note that chart template tests use -update instead of -update-golden

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@david-yu
Contributor Author

CLAUDE.md review feedback addressed

Commit 33a09505:

  • Removed "Post-Merge: Tagging and Publishing" section (prevents accidental release cutting)
  • Switched to task-based generators (task generate, task k8s:generate, task lint, task test:unit) for consistency with CI
  • Added note that chart template tests use -update instead of -update-golden

Unit test failure (Build 12372)

The only failure is TestLaggingPeerCatchesUpViaSnapshot in pkg/multicluster/leaderelection — a raft peer connectivity timing issue (connection refused during peer startup). This is a pre-existing flaky test unrelated to the K8s version bump. The test uses localhost gRPC connections between raft peers and is sensitive to startup timing on CI machines.

david-yu and others added 6 commits March 27, 2026 09:35
Address review comment: consolidate Build System section to reference
task generate instead of individual gotohelm and gen commands.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The nix devshell already sets GOLANG_PROTOBUF_REGISTRATION_CONFLICT=ignore,
so recommend nix develop -c instead of manual env var prefix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace all raw gotohelm and gen references with task-based equivalents.
Remove the standalone gotohelm and k8s:generate entries from Common
Commands since task generate covers both.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Explain what the tools do but explicitly state to use task-based
commands instead of invoking gotohelm, gen schema, or gen partial
directly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ce tests

- upgrade-regressions: use v25.2.1 (pre-dates the field manager fix in v25.2.2)
- console-upgrades: revert to v25.1.3 (v25.2.2 already has Console v3 migration, making the v2→v3 test invalid)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
v25.2.2 supports the Stable status condition, so use it for both
pre-upgrade and post-upgrade checks instead of the weaker Ready check.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
david-yu requested a review from andrewstucki March 27, 2026 20:11
v25.2.1 doesn't use cluster.redpanda.com/operator as the Service field
manager, causing the test to poll forever and timeout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@david-yu
Contributor Author

@andrewstucki Ready for review. Here is the docs PR associated with this bump: redpanda-data/docs#1627

david-yu merged commit 35ac290 into main Mar 30, 2026
10 checks passed
@RafalKorepta RafalKorepta deleted the bump/kubernetes-test-versions-1.32-1.35 branch April 3, 2026 15:37