
Stub image int test #1836

Open
Igor-splunk wants to merge 1 commit into develop from integration-tests-spike

Conversation

@Igor-splunk
Collaborator

Integration Tests

Envtest-based integration tests that run the real reconciliation logic against a
lightweight in-process Kubernetes API server. No full cluster, no real Splunk image,
no Docker required.

What it does

The test suite boots envtest
(embedded etcd + kube-apiserver), registers the operator's controllers, and exercises
them end-to-end through the Kubernetes API — the same code path a production deployment
takes.
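In outline, the boot follows the standard controller-runtime envtest pattern. The sketch below is a minimal illustration, not the suite's actual code: the CRD path, function name, and the omitted scheme/reconciler registration are assumptions.

```go
package integration

import (
	"context"
	"log"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/envtest"
)

func bootEnvtest(ctx context.Context) (*envtest.Environment, error) {
	testEnv := &envtest.Environment{
		// Assumed CRD location; point this at the repo's generated CRDs.
		CRDDirectoryPaths: []string{"../../config/crd/bases"},
	}
	cfg, err := testEnv.Start() // launches embedded etcd + kube-apiserver
	if err != nil {
		return nil, err
	}
	mgr, err := ctrl.NewManager(cfg, ctrl.Options{})
	if err != nil {
		return nil, err
	}
	// ...register StandaloneReconciler (and friends) with mgr here...
	go func() {
		if err := mgr.Start(ctx); err != nil { // runs the reconcile loops
			log.Println(err)
		}
	}()
	return testEnv, nil // caller defers testEnv.Stop()
}
```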

Because envtest has no kubelet, a fake kubelet goroutine fills that role: it watches
for StatefulSets, creates Pod objects with Ready status, and patches StatefulSet
.status so the operator sees ReadyReplicas and can advance the CR to PhaseReady.
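The fake-kubelet loop boils down to the pattern below. This is a simplified sketch using stand-in types rather than the real `appsv1.StatefulSet`/`corev1.Pod` structs, and the names are illustrative:

```go
package main

import "fmt"

// Stand-in types; the real suite works with k8s.io/api objects.
type StatefulSet struct {
	Name          string
	Replicas      int
	ReadyReplicas int // stands in for .status.readyReplicas
}

type Pod struct {
	Name  string
	Ready bool
}

// fakeKubelet plays the kubelet's role: for each StatefulSet it "creates"
// the expected Pods already in Ready state, then patches the StatefulSet
// status so the operator sees ReadyReplicas and can advance the CR.
func fakeKubelet(sts *StatefulSet) []Pod {
	pods := make([]Pod, 0, sts.Replicas)
	for i := 0; i < sts.Replicas; i++ {
		pods = append(pods, Pod{
			Name:  fmt.Sprintf("%s-%d", sts.Name, i), // StatefulSet pod naming
			Ready: true,
		})
	}
	sts.ReadyReplicas = len(pods) // the status patch the operator waits on
	return pods
}

func main() {
	sts := &StatefulSet{Name: "splunk-mytest-standalone", Replicas: 3}
	pods := fakeKubelet(sts)
	fmt.Println(len(pods), sts.ReadyReplicas, pods[0].Name)
	// prints: 3 3 splunk-mytest-standalone-0
}
```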

Pod exec commands (used by addTelApp, secret rotation, etc.) are intercepted by a
mock PodExecClient that always succeeds.
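The interception pattern looks roughly like this. The interface and method names below are assumptions modeled on the operator's pod-exec abstraction, not its exact signatures:

```go
package main

import "fmt"

// Assumed shape of the exec abstraction; the real PodExecClient in the
// operator codebase has a richer signature.
type PodExecClient interface {
	RunPodExecCommand(cmd string) (stdout string, err error)
}

// mockPodExecClient always succeeds, so reconciliation paths that shell
// into pods (addTelApp, secret rotation, ...) proceed without a kubelet.
type mockPodExecClient struct{}

func (mockPodExecClient) RunPodExecCommand(cmd string) (string, error) {
	return "", nil // pretend the command ran cleanly
}

func main() {
	var c PodExecClient = mockPodExecClient{}
	out, err := c.RunPodExecCommand("/opt/splunk/bin/splunk status")
	fmt.Println(out == "", err == nil) // prints: true true
}
```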

Test cases

Tests are table-driven — each entry is a standaloneTestCase struct. Every subtest
gets its own namespace for isolation.
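Per-subtest namespaces can be derived from the subtest name; a hypothetical helper (the real suite's naming scheme may differ) only needs to produce a DNS-1123-safe string:

```go
package main

import (
	"fmt"
	"strings"
)

// nsForTest is a hypothetical helper: lower-case the subtest name and
// replace anything outside [a-z0-9-] so it is a valid namespace name.
func nsForTest(name string) string {
	var b strings.Builder
	for _, r := range strings.ToLower(name) {
		switch {
		case r >= 'a' && r <= 'z', r >= '0' && r <= '9':
			b.WriteRune(r)
		default:
			b.WriteRune('-')
		}
	}
	return strings.Trim(b.String(), "-")
}

func main() {
	fmt.Println(nsForTest("deploy standalone reaches PhaseReady"))
	// prints: deploy-standalone-reaches-phaseready
}
```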

| Category | Test | What it verifies |
| --- | --- | --- |
| Green path | deploy standalone reaches PhaseReady | StatefulSet, Services, Secret created; CR phase = Ready; status fields correct |
| Green path | deploy standalone with 3 replicas | Multi-replica StatefulSet; ReadyReplicas == 3 |
| Red path | negative liveness delay rejected by API | CRD validation (+kubebuilder:validation:Minimum=0) rejects the CR at admission |
| Red path | negative readiness delay rejected by API | Same — API-level rejection |
| Red path | missing SPLUNK_GENERAL_TERMS → PhaseError | CR is accepted but operator validation sets PhaseError |
| Red path | pause annotation blocks reconciliation | Controller short-circuits; no resources created; Phase stays empty |
| Red path | CR deletion succeeds and CR disappears | Deploy → PhaseReady → delete → CR is removed |

Adding a new test case

Append an entry to the tests slice in TestStandaloneIntegration:

{
    name:             "my new scenario",
    crName:           "mytest",
    spec:             defaultSpec(),         // or customise
    annotations:      nil,                   // optional
    needsFakeKubelet: true,                  // false if no pods needed
    validate: func(t *testing.T, g gomega.Gomega, ctx context.Context, c client.Client, ns, crName string) {
        // assertions go here
    },
},

Set validate: nil if you expect the API server to reject the CR at creation time.

Prerequisites

The only prerequisite is the setup-envtest binary, which the Makefile already manages:

make setup-envtest

This downloads the correct etcd and kube-apiserver binaries into bin/k8s/.

How to run

# One-liner (resolves KUBEBUILDER_ASSETS automatically)
KUBEBUILDER_ASSETS="$(bin/setup-envtest use --bin-dir bin -p path)" \
  go test -v -count=1 -timeout 300s ./test/integration/...

Or run a single subtest:

KUBEBUILDER_ASSETS="$(bin/setup-envtest use --bin-dir bin -p path)" \
  go test -v -count=1 -timeout 60s -run "TestStandaloneIntegration/pause" ./test/integration/...

Expected output:

--- PASS: TestStandaloneIntegration (≈18s)
    --- PASS: deploy standalone reaches PhaseReady          (2.5s)
    --- PASS: deploy standalone with 3 replicas             (2.0s)
    --- PASS: negative liveness delay rejected by API       (0.0s)
    --- PASS: negative readiness delay rejected by API      (0.0s)
    --- PASS: missing SPLUNK_GENERAL_TERMS → PhaseError     (1.0s)
    --- PASS: pause annotation blocks reconciliation        (5.0s)
    --- PASS: CR deletion succeeds and CR disappears        (3.0s)

Architecture

┌─────────────────────────────────────────────────┐
│                 go test process                 │
│                                                 │
│  ┌──────────┐    ┌──────────────────────────┐   │
│  │ envtest  │    │ controller-manager       │   │
│  │ (etcd +  │◄───│  StandaloneReconciler    │   │
│  │  apisvr) │    │  (real ApplyStandalone)  │   │
│  └──────────┘    └──────────────────────────┘   │
│       ▲                                         │
│       │          ┌──────────────────────────┐   │
│       └──────────│ fake kubelet goroutine   │   │
│                  │  - creates Pods          │   │
│                  │  - patches STS status    │   │
│                  └──────────────────────────┘   │
│                                                 │
│  Mocked: GetPodExecClient (no real pod exec)    │
│  Mocked: probe script paths (point to repo)     │
└─────────────────────────────────────────────────┘

Known limitations

  • No garbage collector — envtest doesn't run the K8s GC controller, so
    owner-reference cascading deletes don't happen. The CR deletion test verifies
    the CR is removed but doesn't assert StatefulSet cleanup.
  • No admission webhooks — unless explicitly configured in the envtest
    Environment, webhook-based validation is skipped.
  • Global state: SPLUNK_GENERAL_TERMS and function-variable mocks
    (GetPodExecClient, probe script locations) are process-global, so subtests
    run sequentially.
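The function-variable mocking that forces sequential subtests follows the usual Go pattern; the names below are illustrative, not the operator's actual variables:

```go
package main

import "fmt"

// A package-level function variable: production code calls through it,
// tests swap it out. Because it is process-global, two subtests that
// install different mocks cannot safely run in parallel.
var getPodExecClient = func() string { return "real exec client" }

// withMockedExec installs a mock for the duration of body, then restores
// the original so the next subtest starts from a clean state.
func withMockedExec(mock func() string, body func()) {
	saved := getPodExecClient
	getPodExecClient = mock
	defer func() { getPodExecClient = saved }()
	body()
}

func main() {
	withMockedExec(func() string { return "mock" }, func() {
		fmt.Println(getPodExecClient()) // prints: mock
	})
	fmt.Println(getPodExecClient()) // prints: real exec client
}
```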

@vivekr-splunk
Collaborator

Thanks for putting this together. I like the direction of getting faster reconciliation coverage without requiring a full cluster. A couple of things looked risky from the operator/product side:

  1. tools/splunk-stub/* looks unused in the current envtest flow. The new suite never runs containers; it creates Pod objects directly and drives readiness by patching pod/StatefulSet status in fakeKubelet (test/integration/standalone_integration_test.go). That means the stub image adds maintenance surface without affecting test behavior today. If the intent is to support a future cluster-backed path, would it make sense to land that image together with the first test that actually consumes it?

  2. The fake kubelet currently synthesizes pods with only controller-revision-hash (test/integration/standalone_integration_test.go, around pod creation in fakeKubelet). The operator's services select on the standard Splunk labels from the generated workload config (pkg/splunk/enterprise/configuration.go). Because of that, the suite can still report PhaseReady without checking an important product contract: that the generated Services actually target the pods the operator created. Could we either copy the StatefulSet template labels onto the synthetic pods, or add an assertion that the service selectors match at least one pod?
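A selector-match assertion of the kind suggested in (2) reduces to a label-subset check. A minimal sketch with plain maps follows; the label key shown is an assumption for illustration (the real test would read the generated Service selectors and pod labels through the client):

```go
package main

import "fmt"

// selectorMatches reports whether every key/value in a Service selector
// is present on the pod's labels, the same subset semantics the
// endpoints controller uses to pick pods for a Service.
func selectorMatches(selector, podLabels map[string]string) bool {
	for k, v := range selector {
		if podLabels[k] != v {
			return false
		}
	}
	return true
}

func main() {
	// Hypothetical selector key; the actual Splunk labels come from
	// pkg/splunk/enterprise/configuration.go.
	selector := map[string]string{"app.kubernetes.io/instance": "splunk-mytest-standalone"}
	synthetic := map[string]string{"controller-revision-hash": "abc123"} // today's fake pods
	fmt.Println(selectorMatches(selector, synthetic))
	// prints: false — the Service would target no pods
}
```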

Happy to take another look after that is tightened up.
