HYPERFLEET-930 - chore: update claude.md context #159
Merged: openshift-merge-bot merged 1 commit into openshift-hyperfleet:main from kuudori:HYPERFLEET-930 on Jun 2, 2026 (+172 −177)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,171 @@ | ||
| # CLAUDE.md | ||
|
|
||
| ## Project Identity | ||
|
|
||
| HyperFleet Sentinel is a **Kubernetes resource watcher** that polls the HyperFleet API for cluster/nodepool updates, makes orchestration decisions via CEL-based decision logic, and publishes CloudEvents to message brokers. Stateless, horizontally scalable via label-based sharding, delegates all state persistence to the API. | ||
|
|
||
| - **Language**: Go 1.25 (see `go.mod`) | ||
| - **Messaging**: Broker abstraction (RabbitMQ, GCP Pub/Sub, Stub) | ||
| - **API Client**: Generated from [hyperfleet-api-spec](https://github.com/openshift-hyperfleet/hyperfleet-api-spec) — see [openapi/README.md](openapi/README.md) | ||
| - **Deployment**: Helm chart in `charts/` | ||
|
|
||
| Sentinel is one component in the HyperFleet control plane: | ||
| - **API** — persists cluster/nodepool state (source of truth) | ||
| - **Sentinel** — watches API, decides when resources need reconciliation, publishes events | ||
| - **Adapters** — consume events, execute provisioning/deprovisioning, report back to API | ||
| - **Broker** (RabbitMQ or Pub/Sub) — decouples Sentinel from adapters | ||
|
|
||
| ## Critical First Steps | ||
|
|
||
| **Generated OpenAPI client is NOT committed to git.** Before any build, test, or development task: | ||
|
|
||
| ```bash | ||
| make generate # Extracts OpenAPI spec from hyperfleet-api-spec module and generates Go client | ||
| ``` | ||
|
|
||
| Setup sequence for a fresh clone: | ||
| 1. `make generate` — generate OpenAPI client in `pkg/api/openapi/` | ||
| 2. `make download` — fetch Go dependencies | ||
| 3. `make build` — build `bin/sentinel` binary | ||
| 4. `make test` — verify unit tests pass | ||
|
|
||
| ## Verification | ||
|
|
||
| | Command | What it does | | ||
| |---|---| | ||
| | `make verify` | go vet + format check (fast) | | ||
| | `make lint` | golangci-lint (comprehensive) | | ||
| | `make test` | all tests (`./...`), writes `coverage.out` profile | | ||
| | `make test-unit` | unit tests only — specific internal/ and pkg/ packages | | ||
| | `make test-integration` | integration tests with testcontainers (Docker required) | | ||
| | `make test-coverage` | runs `make test` then opens HTML coverage report | | ||
| | `make test-helm` | Helm chart lint + template validation (10 scenarios) | | ||
| | `make test-all` | test + test-integration + test-helm + lint | | ||
|
|
||
| Quick feedback: `make verify && make test-unit`. Full pre-push: `make test-all`. | ||
|
|
||
| **PR pre-flight order:** | ||
| 1. `make generate` | ||
| 2. `make fmt` | ||
| 3. `make lint` | ||
| 4. `make test-unit` | ||
| 5. `make test-integration` — if broker/API changes | ||
| 6. `make test-helm` — if chart changes | ||
| 7. Update CHANGELOG.md if the change is user-visible | ||
|
|
||
| ## Source of Truth | ||
|
|
||
| | Topic | Where to look | | ||
| |---|---| | ||
| | Configuration reference | [docs/config.md](docs/config.md) | | ||
| | Metrics definitions | [docs/metrics.md](docs/metrics.md), `internal/metrics/` | | ||
| | Local/GKE deployment | [docs/running-sentinel.md](docs/running-sentinel.md) | | ||
| | Multi-instance sharding | [docs/multi-instance-deployment.md](docs/multi-instance-deployment.md) | | ||
| | Alerts and runbooks | [docs/alerts.md](docs/alerts.md), [docs/runbook.md](docs/runbook.md) | | ||
| | Helm values | [charts/values.yaml](charts/values.yaml) | | ||
| | Contributing and setup | [CONTRIBUTING.md](CONTRIBUTING.md) | | ||
| | OpenAPI client generation | [openapi/README.md](openapi/README.md) | | ||
| | Example configs | `configs/dev-example.yaml`, `configs/rabbitmq-example.yaml`, `configs/gcp-pubsub-example.yaml` | | ||
| | Broker configuration | `broker.yaml` (loaded by hyperfleet-broker; override path via `BROKER_CONFIG_FILE` env var) | | ||
| | CloudEvents / CEL payloads | `internal/payload/` | | ||
| | Resource profiling | [docs/resource-profiling.md](docs/resource-profiling.md) | | ||
|
|
||
| ## Architecture Context | ||
|
|
||
| Sentinel's job: **decide when**, not **execute how**. It can be killed and restarted at any time without data loss — this is what makes label-based sharding safe. The `message_decision` config uses CEL expressions to decide when to publish — see `DefaultMessageDecision()` in `internal/config/config.go` for default expressions. | ||
|
|
||
| ### Key Internal Patterns | ||
| - **Config validation fails fast** — `Validate()` returns error at startup, `LoadConfig()` propagates to main which exits non-zero | ||
| - **Context propagation** — `context.Context` threaded through all calls with correlation keys (OpID, TraceID, SpanID, DecisionReason) | ||
| - **Health probes** — `/healthz` (liveness: stale poll detection), `/readyz` (readiness: broker + first successful poll) | ||
|
|
||
| ## Code Conventions | ||
|
|
||
| ### Commit Messages | ||
| Format: `HYPERFLEET-### - type: description` | ||
|
|
||
| Example: | ||
| ``` | ||
| HYPERFLEET-427 - feat: add standard metrics labels | ||
|
|
||
| Adds resource_type and resource_selector labels to all | ||
| Prometheus metrics for consistent querying. | ||
|
|
||
| Co-Authored-By: Claude <noreply@anthropic.com> | ||
| ``` | ||
| Co-Authored-By trailer required on all Claude-assisted commits. | ||
|
|
||
| ### Configuration | ||
| - Config struct in `internal/config/config.go` — YAML struct tags, validation via `Validate()` | ||
| - All durations use `time.Duration` with YAML `duration` format (e.g., `5s`, `30m`) | ||
| - Config precedence (highest wins): CLI flags > env vars (`HYPERFLEET_*`) > YAML file > defaults | ||
| - Broker credentials handled separately via `broker.yaml` (or `BROKER_CONFIG_FILE` env var) | ||
|
|
||
| ### CLI Commands | ||
| - `sentinel serve --config config.yaml` — run the service | ||
| - `sentinel config-dump --config config.yaml` — print merged config (debug precedence issues) | ||
| - `sentinel version` — print version, commit, build date | ||
| - Run `sentinel serve --help` for full flag list | ||
|
|
||
| ### Error Handling | ||
| - Log at boundaries (main service loop), not deep in call stack | ||
|
|
||
| ### Logging | ||
| - Custom structured logger in `pkg/logger/` — stdlib only, no external deps | ||
| - Interface: `logger.HyperFleetLogger` with `Info()`, `Error()`, `Warn()`, `Debug()`, `V(level)` (verbosity), `Extra()` | ||
| - Create via `logger.NewHyperFleetLogger()` — uses global config | ||
| - Chaining: `logger.Extra("key", val).Extra("key2", val2).Info("msg")` | ||
| - **IMPORTANT: always use `pkg/logger`, never `log/slog` directly** | ||
|
|
||
| ### CloudEvents Payloads | ||
| `message_data` config uses CEL expressions, not static values: | ||
| ```yaml | ||
| message_data: | ||
| id: resource.id | ||
| kind: resource.kind | ||
| href: resource.href | ||
| ``` | ||
| CEL context: | ||
| - `resource` — cluster/nodepool object from API (id, kind, href, generation, status, labels, etc.) | ||
| - `reason` — decision reason string from engine (e.g., `"message decision matched"`, `"message decision result is false"`) | ||
| - `condition("Type")` — custom function to look up resource status condition by type name | ||
| - `now` — current timestamp | ||
| - `timestamp()`, `duration()` — standard CEL time functions | ||
|
|
||
| ### Testing | ||
| - Table-driven tests with plain `if` assertions — no testify | ||
| - Mocking via simple interface implementations (e.g., MockPublisher), no gomock | ||
| - Unit tests live alongside code: `foo_test.go` next to `foo.go` | ||
| - Integration tests in `test/integration/` with `//go:build integration` tag | ||
| - Prometheus metrics verified with `prometheus/testutil` | ||
| - Run single test: `go test -run TestDecisionEngine ./internal/engine/...` | ||
|
|
||
| ## Git Workflow | ||
|
|
||
| - Branch from `main`, PR back to `main` | ||
| - Branch naming: `HYPERFLEET-###-short-description` | ||
| - Pre-commit hooks: run `make install-hooks` to install — enforces commit message format (`hyperfleet-commitlint`), Go formatting, linting, and vet | ||
|
|
||
| ## Project Boundaries | ||
|
|
||
| **DO NOT**: | ||
| - Add business logic to Sentinel — orchestration decisions only, execution belongs in adapters | ||
| - Store state in Sentinel — it is stateless, API is source of truth | ||
| - Hardcode the resource polling interval — always use `poll_interval` from config for the main sentinel loop; adding a second resource polling loop bypasses the single-ticker backpressure model | ||
|
|
||
| **DO**: | ||
| - Update `hyperfleet-api-spec` version in `go.mod` and run `make generate` when API spec changes | ||
| - New exported functions require unit tests; new broker/API interactions require integration tests | ||
| - Add metrics when adding observable behavior — see [docs/metrics.md](docs/metrics.md) for conventions | ||
| - Convention: `message_data` should include `id`, `kind`, `href` fields (not enforced by validation, but expected by downstream adapters) — see `configs/dev-example.yaml` | ||
| - Use broker abstraction (`hyperfleet-broker`) — never import RabbitMQ/Pub/Sub clients directly | ||
|
|
||
| ## Gotchas | ||
|
|
||
| - **`make generate` is mandatory** — build and tests fail without it; generated code is gitignored | ||
| - **`pkg/api/openapi/` is read-only** — never hand-edit, always regenerate | ||
| - **Broker config comes from `broker.yaml`** (or `BROKER_CONFIG_FILE` env var), not sentinel YAML config — handled by hyperfleet-broker library | ||
| - **CEL expressions in `message_data` are compiled at startup** — syntax errors fail fast, but semantic errors (wrong field names on resource) surface at evaluation time | ||
| - **Metrics labels must include `resource_type` and `resource_selector`** — see [docs/metrics.md](docs/metrics.md) for naming conventions | ||
| - **Metrics use `sync.Once` registration** — call `ResetSentinelMetrics()` in tests to avoid duplicate registration panics | ||
| - **No testify** — project uses plain Go assertions and table-driven tests; don't introduce testify |
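The `sync.Once` gotcha above can be reproduced without the Prometheus client. The sketch below uses a toy `registry` that panics on duplicate names, loosely mimicking a registry that rejects re-registration. Everything in it is an assumption for illustration (`registry`, `initMetrics`, `resetMetricsForTest`, and the metric name), and the real `ResetSentinelMetrics()` may work differently.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a toy stand-in for a metrics registry:
// registering the same name twice panics.
type registry struct {
	mu    sync.Mutex
	names map[string]bool
}

func newRegistry() *registry { return &registry{names: map[string]bool{}} }

// mustRegister adds name or panics if it is already present.
func (r *registry) mustRegister(name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.names[name] {
		panic(fmt.Sprintf("duplicate registration: %s", name))
	}
	r.names[name] = true
}

var (
	registerOnce sync.Once
	defaultReg   = newRegistry()
)

// initMetrics registers collectors at most once per process,
// no matter how many times it is called.
func initMetrics() {
	registerOnce.Do(func() {
		// Illustrative metric name, not necessarily one Sentinel exports.
		defaultReg.mustRegister("hyperfleet_sentinel_events_published_total")
	})
}

// resetMetricsForTest gives each test a fresh registry and a fresh Once,
// so a later initMetrics call registers again without panicking.
// It is not safe to call concurrently with initMetrics.
func resetMetricsForTest() {
	registerOnce = sync.Once{}
	defaultReg = newRegistry()
}
```

The point of the pattern is that the `Once` and the registry must be reset together: resetting only the `Once` would re-register into the old registry and trigger the duplicate panic the gotcha warns about.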
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,177 +1 @@ | ||
| # CLAUDE.md | ||
|
|
||
| ## Project Identity | ||
|
|
||
| HyperFleet Sentinel is a **Kubernetes resource watcher** that polls the HyperFleet API for cluster/nodepool updates, makes orchestration decisions based on max age intervals, and publishes CloudEvents to message brokers. It is stateless, horizontally scalable via label-based sharding, and delegates all state persistence to the API. | ||
|
|
||
| - **Language**: Go 1.25+ | ||
| - **Messaging**: Broker abstraction supporting RabbitMQ, GCP Pub/Sub, and Stub implementations | ||
| - **API Client**: Generated from the [hyperfleet-api-spec](https://github.com/openshift-hyperfleet/hyperfleet-api-spec) Go module — see [openapi/README.md](openapi/README.md) | ||
| - **Deployment**: Helm chart with PodMonitoring (GKE) and ServiceMonitor (Prometheus Operator) | ||
|
|
||
| ## Critical First Steps | ||
|
|
||
| **Generated OpenAPI client is NOT committed to git.** Before any build, test, or development task: | ||
|
|
||
| ```bash | ||
| make generate # Extracts OpenAPI spec from hyperfleet-api-spec module and generates Go client | ||
| ``` | ||
|
|
||
| Setup sequence for a fresh clone: | ||
| 1. `make generate` — generate OpenAPI client in `pkg/api/openapi/` | ||
| 2. `make download` — fetch Go dependencies | ||
| 3. `make build` — build `bin/sentinel` binary | ||
| 4. `make test` — verify unit tests pass | ||
|
|
||
| ## Verification Commands | ||
|
|
||
| | Command | What it does | | ||
| |---|---| | ||
| | `make verify` | go vet + format check (fast) | | ||
| | `make lint` | golangci-lint (comprehensive) | | ||
| | `make test` | unit tests only (no external deps) | | ||
| | `make test-integration` | integration tests with testcontainers (RabbitMQ, Pub/Sub) | | ||
| | `make test-helm` | Helm chart lint and validation | | ||
| | `make test-all` | lint + unit + integration + helm tests | | ||
|
|
||
| Use `make verify && make test` for fast local feedback. Use `make test-all` before pushing. | ||
|
|
||
| ## Code Conventions | ||
|
|
||
| ### Commit Messages | ||
| Format: `HYPERFLEET-### - type: description` | ||
|
|
||
| Example: | ||
| ``` | ||
| HYPERFLEET-427 - feat: add standard metrics labels | ||
|
|
||
| Adds resource_type and resource_selector labels to all | ||
| Prometheus metrics for consistent querying. | ||
|
|
||
| Co-Authored-By: Claude <noreply@anthropic.com> | ||
| ``` | ||
|
|
||
| ### Import Ordering | ||
| 1. Standard library | ||
| 2. External packages (`github.com/google/cel-go`, `github.com/prometheus/client_golang`) | ||
| 3. HyperFleet packages (`github.com/openshift-hyperfleet/hyperfleet-broker`, etc.) | ||
| 4. Internal packages (`github.com/openshift-hyperfleet/hyperfleet-sentinel/internal/...`) | ||
|
|
||
| ### Configuration | ||
| - Config lives in `internal/config/config.go` — struct tags for YAML, validation via `Validate()` | ||
| - All durations use `time.Duration` with YAML `duration` format (e.g., `5s`, `30m`) | ||
| - Environment variables override YAML only for broker credentials (via hyperfleet-broker library) | ||
| - Config validation fails fast at startup — never run with invalid config | ||
|
|
||
| ### Error Handling | ||
| - Errors propagate with context: `fmt.Errorf("failed to poll API: %w", err)` | ||
| - Log errors at the boundary (main service loop), not deep in call stack | ||
| - Use structured logging: `logger.Error("msg", "key", value, "error", err)` | ||
|
|
||
| ### Metrics | ||
| - All metrics defined in `pkg/metrics/metrics.go` — use Prometheus client conventions | ||
| - Standard labels on all metrics: `resource_type`, `resource_selector` | ||
| - Counter: `_total` suffix (e.g., `hyperfleet_sentinel_events_published_total`) | ||
| - Gauge: no suffix (e.g., `hyperfleet_sentinel_pending_resources`) | ||
| - Histogram: `_seconds` suffix (e.g., `hyperfleet_sentinel_poll_duration_seconds`) | ||
|
|
||
| ### Testing | ||
| - Unit tests: mock external dependencies (API client, broker), fast, deterministic | ||
| - Integration tests: testcontainers for real RabbitMQ/Pub/Sub, slower, covers end-to-end flows | ||
| - Test file naming: `*_test.go` alongside implementation | ||
| - Integration tests: `test/integration/*_test.go` with build tag `//go:build integration` | ||
|
|
||
| ### CloudEvents Structure | ||
| Events use CEL expressions from `message_data` config to build payloads: | ||
| ```yaml | ||
| message_data: | ||
| id: resource.id # CEL expressions, not static values | ||
| kind: resource.kind | ||
| href: resource.href | ||
| generation: resource.generation | ||
| ``` | ||
|
|
||
| CEL context includes: | ||
| - `resource` — the cluster/nodepool object from API | ||
| - `reason` — decision string ("not_reconciled", "reconciled_stale", "reconciled_fresh") | ||
|
|
||
| ## Project Boundaries | ||
|
|
||
| **DO NOT**: | ||
| - Modify generated code in `pkg/api/openapi/` — regenerate via `make generate` instead | ||
| - Add dependencies without checking licenses (`go-licenses` reports in CI) | ||
| - Commit broker credentials or GCP service account keys | ||
| - Add business logic to Sentinel — orchestration decisions only, execution belongs in adapters | ||
| - Store state in Sentinel — it is stateless, API is the source of truth | ||
| - Poll faster than API can handle — respect backpressure and rate limits | ||
|
|
||
| **DO**: | ||
| - Update `hyperfleet-api-spec` version in `go.mod` and run `make generate` when the API spec changes | ||
| - Add tests for new features (unit + integration if broker/API interaction) | ||
| - Update Prometheus metrics when adding observable behaviors | ||
| - Update CHANGELOG.md for user-visible changes | ||
| - Follow the ObjectReference pattern for CloudEvents payloads (id, kind, href) | ||
| - Use broker abstraction (`hyperfleet-broker`) — never import RabbitMQ/Pub/Sub clients directly | ||
|
|
||
| ## Architecture Context | ||
|
|
||
| Sentinel is one component in the HyperFleet control plane: | ||
| - **API** persists cluster/nodepool state (source of truth) | ||
| - **Sentinel** watches API, decides when resources need reconciliation, publishes events | ||
| - **Adapters** consume events, execute provisioning/deprovisioning, report status back to API | ||
| - **Broker** (RabbitMQ or Pub/Sub) decouples Sentinel from adapters | ||
|
|
||
| Sentinel's job: **decide when**, not **execute how**. Max age intervals define "when": | ||
| - `max_age_not_reconciled`: poll frequently for unstable resources | ||
| - `max_age_reconciled`: poll infrequently for stable resources | ||
|
|
||
| ## Local Development | ||
|
|
||
| ```bash | ||
| # 1. Start HyperFleet API (see hyperfleet-api repo) and RabbitMQ | ||
| docker run -d -p 5672:5672 rabbitmq:3-management | ||
|
|
||
| # 2. Configure (see configs/dev-example.yaml and broker.yaml for templates) | ||
| # 3. Run Sentinel | ||
| ./bin/sentinel serve --config config.yaml | ||
|
|
||
| # Watch events at http://localhost:15672 (guest/guest) | ||
| ``` | ||
|
|
||
| For detailed local/GKE deployment, see [docs/running-sentinel.md](docs/running-sentinel.md). | ||
|
|
||
| ## Helm Chart | ||
|
|
||
| Chart lives in `charts/` with values for: | ||
| - Multiple Sentinel instances with different `resource_selector` (sharding) | ||
| - Monitoring: PodMonitoring (GKE/GMP) or ServiceMonitor (Prometheus Operator) | ||
| - Broker config via ConfigMap (type, topic) + Secret (credentials) | ||
|
|
||
| Example: deploy 2 Sentinels watching different shards: | ||
| ```bash | ||
| helm install sentinel-shard-1 ./charts \ | ||
| --set config.resourceSelector[0].label=shard \ | ||
| --set config.resourceSelector[0].value=1 \ | ||
| --set broker.topic=hyperfleet-prod-clusters | ||
|
|
||
| helm install sentinel-shard-2 ./charts \ | ||
| --set config.resourceSelector[0].label=shard \ | ||
| --set config.resourceSelector[0].value=2 \ | ||
| --set broker.topic=hyperfleet-prod-clusters | ||
| ``` | ||
|
|
||
| Both read from the same API and publish to the same topic, but watch different label-filtered subsets. | ||
|
|
||
| ## Validation Checklist | ||
|
|
||
| Before submitting a PR: | ||
| 1. `make generate` — ensure OpenAPI client is current | ||
| 2. `make fmt` — format code | ||
| 3. `make verify` — vet and format check | ||
| 4. `make lint` — pass golangci-lint | ||
| 5. `make test` — pass unit tests | ||
| 6. `make test-integration` — pass integration tests (if broker/API changes) | ||
| 7. `make test-helm` — validate Helm chart | ||
| 8. Update CHANGELOG.md for user-visible changes | ||
| 9. Add metrics if new observable behavior | ||
| 10. Commit message follows `HYPERFLEET-### - type: description` format | ||
| @AGENTS.md | ||