HYPERFLEET-1056 - feat: Helmfile infrastructure configurations#44
HYPERFLEET-1056 - feat: Helmfile infrastructure configurations#44ma-hill wants to merge 1 commit into
Conversation
|
Skipping CI for Draft Pull Request. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
📝 WalkthroughSummary by CodeRabbitRelease Notes
WalkthroughThis PR migrates HyperFleet infrastructure from direct Helm chart management to a Helmfile-based, environment-scoped architecture. It adds a development RabbitMQ Helm chart, introduces comprehensive E2E adapter configurations for five Kubernetes resource types across GCP and Kind environments, integrates Terraform-driven values generation, and refactors the Makefile and documentation to reflect the new workflow. Obsolete adapter and sentinel chart definitions are removed. Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes 🚥 Pre-merge checks | ✅ 5 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
✨ Simplify code
Comment |
5b7f9d6 to
ccc611e
Compare
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
CONTRIBUTING.md (1)
112-135:⚠️ Potential issue | 🟠 Major | ⚡ Quick winThese Helmfile examples are not runnable as written.
The direct
helmfile ... templatecommands omitNAMESPACE, even though the new templates callrequiredEnv "NAMESPACE", and the lint example drops the=<env>assignment entirely. Copy-pasting this section will fail before rendering.Suggested fix
-helmfile -f helmfile/helmfile.yaml.gotmpl -e gcp template -helmfile -f helmfile/helmfile.yaml.gotmpl -e kind template -helmfile -f helmfile/helmfile.yaml.gotmpl -e e2e-gcp template -helmfile -f helmfile/helmfile.yaml.gotmpl -e e2e-kind template +NAMESPACE=<namespace> helmfile -f helmfile/helmfile.yaml.gotmpl -e gcp template +NAMESPACE=<namespace> helmfile -f helmfile/helmfile.yaml.gotmpl -e kind template +NAMESPACE=<namespace> helmfile -f helmfile/helmfile.yaml.gotmpl -e e2e-gcp template +NAMESPACE=<namespace> helmfile -f helmfile/helmfile.yaml.gotmpl -e e2e-kind template @@ -HELMFILE_ENV make lint-helmfile +HELMFILE_ENV=<env> make lint-helmfile🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@CONTRIBUTING.md` around lines 112 - 135, The Helmfile examples omit the required NAMESPACE env var and the lint make target example drops the HELMFILE_ENV assignment; update the examples so every direct helmfile command and the make invocations set both HELMFILE_ENV and NAMESPACE (e.g. export or prefix HELMFILE_ENV=<env> NAMESPACE=<ns> before running) and ensure the make targets referenced (template-helmfile, lint-helmfile) are shown with the HELMFILE_ENV and NAMESPACE assignment so copy-paste will supply requiredEnv "NAMESPACE" for the templates.
🟠 Major comments (23)
scripts/generate-rabbitmq-values.sh-23-37 (1)
23-37:⚠️ Potential issue | 🟠 Major | ⚡ Quick winValidate adapter/topic inputs before using them in paths and YAML.
Lines 74-88 write
adapter,topic, andNAMESPACEstraight into filenames and YAML without boundary checks. A malformed--adapter-topicslike../../outside=fooescapesOUT_DIR, and quotes/newlines can produce invalid or attacker-controlled values files. Reject anything outside a tight allowlist before writing.Possible hardening
+is_safe_token() { + [[ "$1" =~ ^[a-z0-9]([-.a-z0-9]*[a-z0-9])?$ ]] +} + IFS=',' read -ra MAPPINGS <<< "$ADAPTER_TOPICS" for mapping in "${MAPPINGS[@]}"; do adapter="${mapping%%=*}" topic="${mapping##*=}" + + if ! is_safe_token "$adapter" || ! is_safe_token "$topic" || ! is_safe_token "$NAMESPACE"; then + echo "ERROR: adapter, topic, and namespace must match [a-z0-9.-] and cannot contain path separators" >&2 + exit 1 + fi + file="${OUT_DIR}/${adapter}.yaml"As per coding guidelines, "Validate input at system boundaries (HTTP handlers, CLI parsers, webhook receivers)."
Also applies to: 74-88
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/generate-rabbitmq-values.sh` around lines 23 - 37, The script currently accepts --adapter-topics, NAMESPACE and OUT_DIR without validation which allows path traversal or injection into filenames/YAML; after flag parsing (use ADAPTER_TOPICS, NAMESPACE, OUT_DIR variables) validate and normalize inputs: reject ADAPTER_TOPICS entries that contain '/', '..', quotes, newlines or any chars outside a strict allowlist (e.g., only [A-Za-z0-9_.-]); split ADAPTER_TOPICS and validate each adapter and topic pair and fail with an error if any item is invalid; also validate NAMESPACE with the same allowlist and ensure OUT_DIR is a safe path (make absolute or ensure it doesn’t contain traversal segments) before writing files; perform these checks in the same script scope that uses ADAPTER_TOPICS/OUT_DIR/NAMESPACE so functions like require_value remain unchanged.helm/rabbitmq/values.yaml-12-14 (1)
12-14:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAvoid default
guest/guestin RabbitMQ Helm values (breaks in-cluster auth if loopback-only is kept)RabbitMQ’s built-in
guestuser is loopback-only by default, so clients connecting via a Kubernetes Service (network/Pod IP, not localhost) typically can’t authenticate withauth.username: guest/auth.password: guest. The chart’shelm/rabbitmqdirectory only containstemplates/deployment.yaml/service.yaml(no rabbitmq.conf/loopback override files detected), so these defaults are likely used as-is—consider a non-guestdefault or require explicit credentials at install time.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@helm/rabbitmq/values.yaml` around lines 12 - 14, Replace the insecure default auth.username: guest / auth.password: guest in the helm/rabbitmq values by removing the hardcoded guest credentials and either (a) make auth.username/auth.password required (empty by default) so installs fail fast and document this, or (b) set auth.password to a generated/templated secret value; update the chart templates (templates/deployment.yaml and any secret/template logic) to consume .Values.auth.username and .Values.auth.password and to error or reference a Secret when values are empty so the chart never relies on the loopback-only guest account.Makefile-170-179 (1)
170-179:⚠️ Potential issue | 🟠 Major | ⚡ Quick winE2E environments are skipped by generation and validation.
The PR advertises
e2e-gcpande2e-kindas supported Helmfile environments, but this logic only handles exactgcpandkind. Todaye2e-kindnever generates RabbitMQ values, and both e2e envs can passcheck-helmfile-env-generatedwithout validating the directory they depend on.Also applies to: 247-255
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Makefile` around lines 170 - 179, The generate-rabbitmq-values Makefile target incorrectly treats only exact HELMFILE_ENV values ('kind' and 'gcp'), skipping e2e-kind/e2e-gcp; update the conditional in the generate-rabbitmq-values rule (which references HELMFILE_ENV and BROKER_TYPE) to match prefixes or patterns (e.g., treat values starting with "kind" or "gcp" the same as exact names) so e2e-* envs are handled the same as kind/gcp, and apply the same pattern fix to the other similar block around the later rule lines (the other generate/validate checks mentioned). Also tighten check-helmfile-env-generated so it validates the expected output directory (GENERATED_RABBITMQ_DIR) actually exists and fails if missing, referencing the check-helmfile-env-generated target and GENERATED_RABBITMQ_DIR variable.terraform/helm-values-files.tf-38-56 (1)
38-56:⚠️ Potential issue | 🟠 Major | ⚡ Quick winBootstrap the output directory before writing files.
These
local_fileresources write into../generated-values-from-terraform, but nothing here creates that directory. On a fresh checkout,terraform applycan fail before Helmfile ever sees the generated values.Does the Terraform hashicorp/local provider's local_file resource create missing parent directories for the filename automatically?🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@terraform/helm-values-files.tf` around lines 38 - 56, The Terraform config writes files via local_file.adapter_values and local_file.sentinel_values into local.helm_values_dir but does not ensure the target directory exists; add a resource to create the directory (e.g., a null_resource.create_helm_values_dir with a provisioner "local-exec" running mkdir -p ${local.helm_values_dir}) and add depends_on = [null_resource.create_helm_values_dir] to both local_file.adapter_values and local_file.sentinel_values so the directory is created before attempting to write the files.Makefile-197-211 (1)
197-211:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFix
helmfileconfig path for component-scoped install targets
install-api,install-sentinels, andinstall-adaptersrunhelmfile apply ...without-f, sohelmfilefalls back to looking for./helmfile.yaml/./helmfile.d. This repo doesn’t have a roothelmfile.yaml(onlyhelmfile/helmfile.yaml.gotmpl), so these targets can fail or apply the wrong state.Suggested fix
install-api: check-helmfile-env ## Install HyperFleet API - helmfile apply -e $(HELMFILE_ENV) -l component=api + helmfile -f helmfile/helmfile.yaml.gotmpl apply -e $(HELMFILE_ENV) -l component=api install-sentinels: check-helmfile-env ## Install Hyperfleet Sentinels - helmfile apply -e $(HELMFILE_ENV) -l component=sentinel + helmfile -f helmfile/helmfile.yaml.gotmpl apply -e $(HELMFILE_ENV) -l component=sentinel install-adapters: check-helmfile-env ## Install Hyperfleet Adapters - helmfile apply -e $(HELMFILE_ENV) -l component=adapter + helmfile -f helmfile/helmfile.yaml.gotmpl apply -e $(HELMFILE_ENV) -l component=adapter🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Makefile` around lines 197 - 211, The component-specific Make targets install-api, install-sentinels, and install-adapters currently call "helmfile apply" without the -f flag so helmfile may load a wrong/default file; update each target (install-api, install-sentinels, install-adapters) to pass the same -f helmfile/helmfile.yaml.gotmpl flag used by install-hyperfleet and keep the existing -e $(HELMFILE_ENV) and label filters so helmfile uses the repo's helmfile template rather than falling back to ./helmfile.yaml.helmfile/values/base-adapter.yaml.gotmpl-22-24 (1)
22-24:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRemove
SIMULATE_RESULTfrom the shared base values.This template is merged into every adapter release, so hardcoding
SIMULATE_RESULT=successmakes non-E2E environments opt into simulated behavior by default. Scope this flag to the E2E overlays, or make it an explicit opt-in instead of a base default.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@helmfile/values/base-adapter.yaml.gotmpl` around lines 22 - 24, Remove the hardcoded env entry named SIMULATE_RESULT from the shared template (the env: - name: SIMULATE_RESULT value: success block) so it is not merged into all adapter releases; instead add a dedicated conditional or move this variable into the E2E overlay values and templates (e.g., guard with a value like .Values.e2e.enableSimulate or set SIMULATE_RESULT only in the E2E overlay values file) so non-E2E environments do not opt into simulated behavior by default.helmfile/environments/kind/kind-config.yaml.gotmpl-3-4 (1)
3-4:⚠️ Potential issue | 🟠 Major | ⚡ Quick winValidate the env-selected broker and service types before rendering.
A bad
BROKER_TYPEhere still flows into the shared templates, buthelmfile/values/base-sentinel.yaml.gotmplonly handlesgooglepubsubandrabbitmq, so other values render an unusable broker config.API_SERVICE_TYPEis also forwarded without checks. Please enforce the supported values in this environment template instead of letting invalid input reach deployment time.As per coding guidelines, "Validate input at system boundaries (HTTP handlers, CLI parsers, webhook receivers)."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@helmfile/environments/kind/kind-config.yaml.gotmpl` around lines 3 - 4, The environment template currently forwards BROKER_TYPE and API_SERVICE_TYPE without validating allowed values, causing invalid broker/service values to reach helm templates (base-sentinel only supports "googlepubsub" and "rabbitmq"); update the gotmpl around brokerType and serviceType to validate the env values against allowed lists and either coerce to a safe default or error early. Specifically, read env "BROKER_TYPE" and ensure it is one of ["googlepubsub","rabbitmq"] before assigning brokerType (otherwise set the default "rabbitmq" or abort render), and similarly validate env "API_SERVICE_TYPE" against allowed Kubernetes service types (e.g., "ClusterIP","NodePort","LoadBalancer") before assigning serviceType; implement these checks in the helmfile/environments/kind/kind-config.yaml.gotmpl template so invalid inputs never get forwarded to base-sentinel.yaml.gotempl or downstream charts.helmfile/environments/gcp/gcp-config.yaml.gotmpl-3-5 (1)
3-5:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFail fast on unsupported
BROKER_TYPEandAPI_SERVICE_TYPEvalues.These env vars are accepted unchecked here. Downstream,
helmfile/values/base-sentinel.yaml.gotmplonly materializes broker settings forgooglepubsubandrabbitmq, so any otherBROKER_TYPEproduces a partial config, and an invalidAPI_SERVICE_TYPEis passed through to the Service manifest unchanged. Please validate both against an explicit allowlist at this boundary.As per coding guidelines, "Validate input at system boundaries (HTTP handlers, CLI parsers, webhook receivers)."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@helmfile/environments/gcp/gcp-config.yaml.gotmpl` around lines 3 - 5, Add validation in the helmfile template to fail fast when env var values are unsupported: read BROKER_TYPE and API_SERVICE_TYPE into locals (e.g., $brokerType and $serviceType) and use template conditionals with an explicit allowlist (e.g., broker allowlist ["googlepubsub","rabbitmq"] and service allowlist ["LoadBalancer","ClusterIP","NodePort"]) and call the template fail function if a value is not in the allowlist; update helmfile/environments/gcp/gcp-config.yaml.gotmpl to perform these checks before emitting projectId/brokerType/serviceType (and reference values/base-sentinel.yaml.gotmpl behavior when choosing broker allowlist entries).helmfile/configs/e2e/adapters/cl-deployment/adapter-task-config.yaml-4-8 (1)
4-8:⚠️ Potential issue | 🟠 Major | ⚡ Quick winValidate
clusterIdbefore reusing it in URLs and namespace-scoped resources.This adapter takes
event.idand propagates it into API URLs, namespace selection, and the status writeback path without a boundary check. A bad value here breaks the whole e2e chain, not just this deployment step.As per coding guidelines, "Validate input at system boundaries (HTTP handlers, CLI parsers, webhook receivers)."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@helmfile/configs/e2e/adapters/cl-deployment/adapter-task-config.yaml` around lines 4 - 8, The param "clusterId" (sourced from event.id) must be validated before being used in API URLs, Kubernetes namespace selection, or status writeback paths; add a validation step where the adapter reads this param to enforce allowed characters/length (e.g., alphanumeric, dashes, max length) or match a strict regex, return a clear error if invalid, and sanitize/encode the value (URL-encode for API paths and validate against Kubernetes namespace rules) before constructing URLs or namespaces so no raw event.id is directly reused.helmfile/configs/e2e/adapters/cl-namespace/adapter-task-config.yaml-4-8 (1)
4-8:⚠️ Potential issue | 🟠 Major | ⚡ Quick winValidate
clusterIdbefore templating it into URLs and Kubernetes metadata.
event.idis external input, and this file later injects it into/clusters/{{ .clusterId }}, the Namespace name, labels, and the status-update URL. Without a boundary check, malformed values can either break reconciliation or hit an unintended API path. Constrain it to the Kubernetes-safe/path-safe subset before any templating happens.As per coding guidelines, "Validate input at system boundaries (HTTP handlers, CLI parsers, webhook receivers)."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@helmfile/configs/e2e/adapters/cl-namespace/adapter-task-config.yaml` around lines 4 - 8, The param "clusterId" (params -> name: "clusterId", referenced as {{ .clusterId }} and used in /clusters/{{ .clusterId }}, Namespace name/labels and status URLs) is coming from external event.id and must be validated/sanitized before templating: enforce it conforms to a Kubernetes DNS-1123 label (lowercase a-z0-9 and '-', <=63 chars, no leading/trailing '-'), and reject or canonicalize any value that fails; for URL paths either validate against a stricter path-safe subset or percent-encode after validation; implement this check at the parameter parsing/binding step (where params are read) and return a clear validation error or abort templating if clusterId is invalid.helmfile/configs/e2e/adapters/cl-job/adapter-task-config.yaml-4-8 (1)
4-8:⚠️ Potential issue | 🟠 Major | ⚡ Quick winValidate
clusterIdbefore using it across HTTP and Kubernetes boundaries.
event.idis injected into/clusters/{{ .clusterId }}, the target namespace, selectors, and the status-update URL with no boundary validation. A malformed value can break both the API calls and the resource naming path for this adapter chain.As per coding guidelines, "Validate input at system boundaries (HTTP handlers, CLI parsers, webhook receivers)."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@helmfile/configs/e2e/adapters/cl-job/adapter-task-config.yaml` around lines 4 - 8, Validate and sanitize the incoming clusterId (params -> name: "clusterId", source: "event.id") at the adapter boundary before any use in templates like /clusters/{{ .clusterId }}, namespace names, selectors or status-update URLs: enforce a strict whitelist (e.g., DNS-1123 label regex for Kubernetes names: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?$ with appropriate max length), reject or return a clear error for invalid values, URL-encode clusterId when inserting into HTTP paths, and/or transform to a safe canonical form; implement this check in the param parsing/handler that reads params.clusterId so the rest of the code can assume a validated, safe identifier.helmfile/configs/e2e/adapters/cl-deployment/adapter-task-resource-deployment.yaml-23-28 (1)
23-28:⚠️ Potential issue | 🟠 Major | ⚡ Quick winThis Deployment still relies on a default root-capable security context.
The container/pod spec has no explicit hardening, so it inherits defaults that allow root and a writable root filesystem. That is both a security regression and a portability risk for clusters enforcing restricted admission.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@helmfile/configs/e2e/adapters/cl-deployment/adapter-task-resource-deployment.yaml` around lines 23 - 28, The Deployment currently leaves the pod and container security context to defaults; add explicit pod-level and container-level securityContext entries to harden the workload: set pod spec.securityContext with runAsNonRoot: true and a non-root runAsUser/fsGroup (e.g., >1000), and on the container named "test" set securityContext with readOnlyRootFilesystem: true, allowPrivilegeEscalation: false, drop all capabilities (capabilities.drop: ["ALL"]) and optionally a seccompProfile/runtimeClass if required by your cluster; ensure these keys are added alongside the existing spec -> containers -> - name: test block so the pod will be non-root and use a read-only root filesystem.helmfile/configs/e2e/adapters/cl-job/adapter-task-resource-job.yaml-16-29 (1)
16-29:⚠️ Potential issue | 🟠 Major | ⚡ Quick winHarden this Job instead of relying on the default root-capable security context.
Right now the pod/container run with the platform defaults, which is exactly what Trivy flagged. Even for e2e, this weakens the cluster baseline and can fail under restricted admission policies.
Suggested hardening
spec: backoffLimit: 0 template: spec: + securityContext: + seccompProfile: + type: RuntimeDefault restartPolicy: Never containers: - name: hello-world image: alpine:3.19 imagePullPolicy: IfNotPresent command: ["/bin/sh", "-c", "echo 'Hello, World!'"] + securityContext: + allowPrivilegeEscalation: false + readOnlyRootFilesystem: true + runAsNonRoot: true + runAsUser: 65532 + runAsGroup: 65532 + capabilities: + drop: ["ALL"] resources: requests: memory: "32Mi"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@helmfile/configs/e2e/adapters/cl-job/adapter-task-resource-job.yaml` around lines 16 - 29, The Job's Pod uses platform defaults and runs root-capable containers; update the manifest (under spec and the container with name "hello-world") to harden its securityContext: add a pod-level securityContext with runAsNonRoot: true and runAsUser: 1000 (or another non-root UID), and add container-level securityContext for the "hello-world" container with runAsNonRoot: true, runAsUser: 1000, allowPrivilegeEscalation: false, readOnlyRootFilesystem: true, capabilities: drop: ["ALL"], and seccompProfile: { type: "RuntimeDefault" } (or Localhost with a profile if required by policy); ensure no privileged: true is set and remove any settings that would grant extra capabilities.helmfile/configs/e2e/adapters/cl-deployment/adapter-task-resource-deployment.yaml-25-26 (1)
25-26:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPin the nginx image and harden the test Deployment’s securityContext defaults.
- name: test image: nginx:latest
nginx:latestis a floating tag; HyperFleet standards call for container image digest pinning to avoid supply-chain/registry drift—use a pinned version orsha256digest instead.- This Deployment template lacks the required restrictive defaults (
podSecurityContextand containersecurityContext:runAsNonRoot,allowPrivilegeEscalation: false,capabilities.drop: ["ALL"],readOnlyRootFilesystem: true,seccompProfile.type: RuntimeDefault).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@helmfile/configs/e2e/adapters/cl-deployment/adapter-task-resource-deployment.yaml` around lines 25 - 26, Replace the floating image tag for the container named "test" with a pinned image (use a specific version tag or an image@sha256:<digest>) and add restrictive security defaults to the Deployment template: add podSecurityContext with runAsNonRoot: true and a non-root runAsUser, and add container-level securityContext for the "test" container including runAsNonRoot: true, allowPrivilegeEscalation: false, capabilities.drop: ["ALL"], readOnlyRootFilesystem: true, and seccompProfile.type: RuntimeDefault; ensure these fields are added in the Deployment spec template for the pod and container corresponding to the "test" container.helmfile/configs/e2e/adapters/cl-deployment/adapter-task-config.yaml-39-52 (1)
39-52:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFix
validationCheckgate for list-shapedclusterJobStatuscapture
clusterJobStatuscomes from a JSONPath filter (capture.field) where the adapter executor captures a single string only when the JSONPath yields exactly one match; otherwise it captures[]interface{...}. If the/clusters/.../statusesresponse contains multiplecl-job/Availablestatuses,validationCheck’sclusterJobStatus == "True"won’t match correctly and can keep the gate blocked—update the JSONPath to return a single scalar (e.g., add an index) or adjust the CEL expression to accept a list.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@helmfile/configs/e2e/adapters/cl-deployment/adapter-task-config.yaml` around lines 39 - 52, The validation gate fails when capture.clusterJobStatus yields a list instead of a single string; update the capture or CEL to handle lists: either modify the clusterAdapterStatus.capture.field JSONPath (the capture named "clusterJobStatus") to return a single scalar (e.g., add an index selector so it returns the first match) or change the validationCheck expression to accept a list (use a CEL any/exists style check over clusterJobStatus to see if any element == "True") so the condition clusterJobStatus == "True" works for both single values and arrays.helmfile/configs/e2e/adapters/cl-maestro/adapter-task-config.yaml-11-14 (1)
11-14:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDon't overwrite the event generation.
paramsalready bindsgenerationfromevent.generation, thenclusterStatus.capturereuses the same name for the GET response field. That makesobserved_generationdepend on the follow-up API response instead of the triggering event.Proposed fix
- name: "clusterName" field: "name" - - name: "generation" + - name: "clusterGeneration" field: "generation"Also applies to: 32-33, 198-199
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@helmfile/configs/e2e/adapters/cl-maestro/adapter-task-config.yaml` around lines 11 - 14, The capture in clusterStatus.capture is overwriting the event-bound params.generation (event.generation), so rename the GET response field there (e.g., change the captured key from "generation" to something like "api_generation" or "captured_generation") and ensure observed_generation continues to use the original params.generation (the event value) rather than the captured field; update all occurrences where clusterStatus.capture currently uses "generation" (including the other instances mentioned) to use the new name and adjust any downstream references accordingly.helmfile/configs/e2e/adapters/cl-maestro/adapter-task-config.yaml-43-45 (1)
43-45:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRemove the hard-coded placement target.
placementClusterNameis fixed to"cluster1", and Line 65 uses it as the Maestrotarget_cluster. Every event will reconcile against that same consumer, regardless of the actual placement result.Also applies to: 65-65
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@helmfile/configs/e2e/adapters/cl-maestro/adapter-task-config.yaml` around lines 43 - 45, The placementClusterName is hard-coded to the literal "\"cluster1\"" which forces every event to reconcile to the same consumer; remove the fixed expression and wire placementClusterName to the placement adapter output (or the appropriate templated variable) so that target_cluster (where it's consumed) uses the actual placement result instead of the literal. Update the adapter-task-config entry for placementClusterName and any references to it (e.g., target_cluster) to read from the placement result variable rather than the hard-coded string.helmfile/configs/e2e/adapters/cl-maestro/adapter-task-resource-manifestwork.yaml-30-32 (1)
30-32:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMake this file template-aware for tooling.
The standalone
{{ if ... }}block makes the raw file invalid YAML, which already shows up in static analysis. Any generic YAML lint/parse step over*.yamlwill choke on it until it is rendered.#!/bin/bash # Verify whether raw YAML files contain standalone Go-template control blocks # and whether repo tooling lint/parses them as plain YAML. fd -e yaml -e yml | xargs rg -n '^\s*\{\{\s*(if|else|end|range)\b' rg -n 'yamllint|kubeconform|kubectl apply -f|helm lint|helmfile lint|yq' -SExpected result: this file should appear in the first command. If the second command shows generic YAML tooling over
*.yaml, rename this template to a template-specific extension and update the manifest ref accordingly.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@helmfile/configs/e2e/adapters/cl-maestro/adapter-task-resource-manifestwork.yaml` around lines 30 - 32, This file contains a raw Go template control block ("{{ if .platformType }}" / "{{ end }}") which makes the YAML invalid to generic linters; to fix, rename the file to a template-aware extension (for example change adapter-task-resource-manifestwork.yaml -> adapter-task-resource-manifestwork.yaml.gotmpl or .tpl) and update any code/manifest references that consume this file to point to the new name so tooling skips plain-YAML parsing; ensure the template tokens ("{{ if .platformType }}" and "{{ end }}") remain unchanged inside the renamed file.helmfile/configs/e2e/adapters/cl-maestro/adapter-task-resource-manifestwork.yaml-6-6 (1)
6-6:⚠️ Potential issue | 🟠 Major | ⚡ Quick winNormalize
clusterIdbefore using it inManifestWork.metadata.name(keepdiscovery.by_namein sync)In
helmfile/configs/e2e/adapters/cl-maestro/adapter-task-resource-manifestwork.yaml,ManifestWork.metadata.nameuses raw{{ .clusterId }}-{{ .adapter.name }}even though the template already lower-cases nested resource names via{{ .clusterId | lower }}-....helmfile/configs/e2e/adapters/cl-maestro/adapter-task-config.yamlthen discovers the ManifestWork using rawdiscovery.by_name: "{{ .clusterId }}-{{ .adapter.name }}", so any non-DNS-1123-safeclusterIdwill break create/discovery/status reporting. Apply DNS-1123-safe normalization toclusterIdfor the ManifestWork name (and update the pairedby_name), or document a strict upstream guarantee thatclusterIdis already Kubernetes-name safe.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@helmfile/configs/e2e/adapters/cl-maestro/adapter-task-resource-manifestwork.yaml` at line 6, Normalize .clusterId to a DNS-1123-safe form before composing ManifestWork.metadata.name and use the same normalized value in discovery.by_name; specifically, update adapter-task-resource-manifestwork.yaml where ManifestWork.metadata.name is built (currently "{{ .clusterId }}-{{ .adapter.name }}") to apply the same normalization/filters used elsewhere (e.g., lowercasing and replacing/stripping invalid chars to produce a DNS-1123-compliant name) and then update the paired value in adapter-task-config.yaml (discovery.by_name) to use the identical normalized expression so create/discovery/status use the same safe name.CONTRIBUTING.md-247-260 (1)
247-260:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFix the cleanup shell snippets.
export HELMFILE_ENV=gcp or HELMFILE_ENV=e2e-gcpis not valid shell, so these commands will not set the environment the way the section describes.Suggested fix
-export HELMFILE_ENV=gcp or HELMFILE_ENV=e2e-gcp +export HELMFILE_ENV=gcp # or: export HELMFILE_ENV=e2e-gcp @@ -export HELMFILE_ENV=kind or HELMFILE_ENV=e2e-kind +export HELMFILE_ENV=kind # or: export HELMFILE_ENV=e2e-kind🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@CONTRIBUTING.md` around lines 247 - 260, The shell snippets use an invalid construct "export HELMFILE_ENV=gcp or HELMFILE_ENV=e2e-gcp"; update the examples so they show valid shell exports for HELMFILE_ENV (e.g., a single export per line with a comment or two separate example lines: one exporting gcp and one exporting e2e-gcp) and do the same for the kind/e2e-kind section; keep the NAMESPACE export as-is and ensure the subsequent commands (make uninstall-hyperfleet, make uninstall-maestro, make destroy-terraform) remain unchanged so the documentation demonstrates correct, runnable shell commands using the HELMFILE_ENV and NAMESPACE variables.README.md-205-219 (1)
205-219:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDocument the E2E generated-values flow too.
This section says generated values are only used for
HELMFILE_ENV=gcporHELMFILE_ENV=kind, but the README earlier introducese2e-gcpande2e-kindas supported Helmfile environments. As written, E2E readers are told the wrong prerequisite path for broker values.Suggested fix
-### Generated Helm Values - only used for HELMFILE_ENV=gcp or HELMFILE_ENV=kind +### Generated Helm Values - used for GCP and kind environment families -Broker configuration values are auto-generated and consumed by Helmfile: +Broker configuration values are auto-generated and consumed by Helmfile across the matching standard and E2E environments: -**For GCP environments (`gcp`):** +**For GCP environments (`gcp`, `e2e-gcp`):** - `make install-terraform` generates Google Pub/Sub configuration via `terraform/helm-values-files.tf` - Files are written to `generated-values-from-terraform/` - Contains: topic names, project ID, subscription IDs, Workload Identity bindings -- Helmfile automatically reads these values when deploying with `HELMFILE_ENV=gcp` +- Helmfile automatically reads these values when deploying with `HELMFILE_ENV=gcp` or `HELMFILE_ENV=e2e-gcp` -**For kind environments (`kind`):** +**For kind environments (`kind`, `e2e-kind`):** - `make generate-rabbitmq-values` generates RabbitMQ configuration - Files are written to `generated-values-rabbitmq/` - Contains: RabbitMQ URL, exchange names, queue names, routing keys -- Helmfile automatically reads these values when deploying with `HELMFILE_ENV=kind` +- Helmfile automatically reads these values when deploying with `HELMFILE_ENV=kind` or `HELMFILE_ENV=e2e-kind`🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@README.md` around lines 205 - 219, Update the "Generated Helm Values" section to include E2E environments by mentioning HELMFILE_ENV values e2e-gcp and e2e-kind alongside gcp and kind; explicitly state that for E2E workflows the same generation steps apply (e.g., `make install-terraform` → generated-values-from-terraform for GCP/e2e-gcp and `make generate-rabbitmq-values` → generated-values-rabbitmq for kind/e2e-kind) and that Helmfile reads those files when deploying with HELMFILE_ENV set to gcp, kind, e2e-gcp or e2e-kind so E2E readers are pointed to the correct prerequisite path for broker values.helmfile/environments/e2e-kind/adapter-configs.yaml.gotmpl-49-54 (1)
49-54:⚠️ Potential issue | 🟠 Major | ⚡ Quick winKeep
broker.createenabled forcl-job.Every other adapter in this template sets
broker.create: true, butcl-jobomits it. If the chart defaults that flag to false, this adapter will render without broker wiring even though queue/exchange settings are present.Suggested fix
broker: + create: true rabbitmq: url: "amqp://guest:guest@rabbitmq:5672" queue: {{ $namespace }}-clusters-cl-job🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@helmfile/environments/e2e-kind/adapter-configs.yaml.gotmpl` around lines 49 - 54, The cl-job adapter block is missing broker.create and may therefore not wire the RabbitMQ settings; update the cl-job adapter (the broker section for "cl-job") to include broker.create: true so that the RabbitMQ settings (broker.rabbitmq.url, queue, exchange, routingKey) are actually applied, matching the other adapters in this template.helmfile/environments/e2e-kind/adapter-configs.yaml.gotmpl-133-139 (1)
133-139:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFix the
np-configmapqueue name.Line 137 points the nodepool adapter at
{{ $namespace }}-nodepools-cl-namespaceinstead of annp-configmapqueue. That misroutes messages and can collide with a different consumer naming scheme.Suggested fix
broker: create: true rabbitmq: url: "amqp://guest:guest@rabbitmq:5672" - queue: {{ $namespace }}-nodepools-cl-namespace + queue: {{ $namespace }}-nodepools-np-configmap exchange: {{ $namespace }}-nodepools routingKey: #🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@helmfile/environments/e2e-kind/adapter-configs.yaml.gotmpl` around lines 133 - 139, The broker.rabbitmq.queue value is pointing at the wrong queue (currently "{{ $namespace }}-nodepools-cl-namespace"); update it to the np-configmap queue name (e.g. "{{ $namespace }}-np-configmap") so the nodepool adapter publishes to the correct queue; edit the broker -> rabbitmq -> queue entry to use "{{ $namespace }}-np-configmap" (leave exchange and routingKey as-is) and ensure any consumer/service that expects "np-configmap" matches this name to avoid collisions.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@Makefile`:
- Around line 117-130: The create-maestro-consumer Make target embeds
MAESTRO_CONSUMER directly into single-quoted shell JSON and grep patterns, which
allows a consumer name containing a single-quote to break quoting and enable
shell/JSON injection; fix it by (1) validating MAESTRO_CONSUMER against a strict
allowlist regex (e.g., alphanumerics, dashes, underscores) early in the
create-maestro-consumer flow and fail fast if it doesn't match, and (2) stop
inline single-quote interpolation when building the POST body and when checking
existence—construct the JSON payload safely (e.g., use jq or a small python
snippet to produce '{"name": <value>}' into a variable or stdin) and use a
JSON-aware check (e.g., pipe the API response through jq to test .name ==
MAESTRO_CONSUMER) instead of grep; update the parts of the
create-maestro-consumer target that reference MAESTRO_CONSUMER and the curl -d /
grep commands accordingly.
---
Outside diff comments:
In `@CONTRIBUTING.md`:
- Around line 112-135: The Helmfile examples omit the required NAMESPACE env var
and the lint make target example drops the HELMFILE_ENV assignment; update the
examples so every direct helmfile command and the make invocations set both
HELMFILE_ENV and NAMESPACE (e.g. export or prefix HELMFILE_ENV=<env>
NAMESPACE=<ns> before running) and ensure the make targets referenced
(template-helmfile, lint-helmfile) are shown with the HELMFILE_ENV and NAMESPACE
assignment so copy-paste will supply requiredEnv "NAMESPACE" for the templates.
---
Major comments:
In `@CONTRIBUTING.md`:
- Around line 247-260: The shell snippets use an invalid construct "export
HELMFILE_ENV=gcp or HELMFILE_ENV=e2e-gcp"; update the examples so they show
valid shell exports for HELMFILE_ENV (e.g., a single export per line with a
comment or two separate example lines: one exporting gcp and one exporting
e2e-gcp) and do the same for the kind/e2e-kind section; keep the NAMESPACE
export as-is and ensure the subsequent commands (make uninstall-hyperfleet, make
uninstall-maestro, make destroy-terraform) remain unchanged so the documentation
demonstrates correct, runnable shell commands using the HELMFILE_ENV and
NAMESPACE variables.
In `@helm/rabbitmq/values.yaml`:
- Around line 12-14: Replace the insecure default auth.username: guest /
auth.password: guest in the helm/rabbitmq values by removing the hardcoded guest
credentials and either (a) make auth.username/auth.password required (empty by
default) so installs fail fast and document this, or (b) set auth.password to a
generated/templated secret value; update the chart templates
(templates/deployment.yaml and any secret/template logic) to consume
.Values.auth.username and .Values.auth.password and to error or reference a
Secret when values are empty so the chart never relies on the loopback-only
guest account.
In `@helmfile/configs/e2e/adapters/cl-deployment/adapter-task-config.yaml`:
- Around line 4-8: The param "clusterId" (sourced from event.id) must be
validated before being used in API URLs, Kubernetes namespace selection, or
status writeback paths; add a validation step where the adapter reads this param
to enforce allowed characters/length (e.g., alphanumeric, dashes, max length) or
match a strict regex, return a clear error if invalid, and sanitize/encode the
value (URL-encode for API paths and validate against Kubernetes namespace rules)
before constructing URLs or namespaces so no raw event.id is directly reused.
- Around line 39-52: The validation gate fails when capture.clusterJobStatus
yields a list instead of a single string; update the capture or CEL to handle
lists: either modify the clusterAdapterStatus.capture.field JSONPath (the
capture named "clusterJobStatus") to return a single scalar (e.g., add an index
selector so it returns the first match) or change the validationCheck expression
to accept a list (use a CEL any/exists style check over clusterJobStatus to see
if any element == "True") so the condition clusterJobStatus == "True" works for
both single values and arrays.
In
`@helmfile/configs/e2e/adapters/cl-deployment/adapter-task-resource-deployment.yaml`:
- Around line 23-28: The Deployment currently leaves the pod and container
security context to defaults; add explicit pod-level and container-level
securityContext entries to harden the workload: set pod spec.securityContext
with runAsNonRoot: true and a non-root runAsUser/fsGroup (e.g., >1000), and on
the container named "test" set securityContext with readOnlyRootFilesystem:
true, allowPrivilegeEscalation: false, drop all capabilities (capabilities.drop:
["ALL"]) and optionally a seccompProfile/runtimeClass if required by your
cluster; ensure these keys are added alongside the existing spec -> containers
-> - name: test block so the pod will be non-root and use a read-only root
filesystem.
- Around line 25-26: Replace the floating image tag for the container named
"test" with a pinned image (use a specific version tag or an
image@sha256:<digest>) and add restrictive security defaults to the Deployment
template: add podSecurityContext with runAsNonRoot: true and a non-root
runAsUser, and add container-level securityContext for the "test" container
including runAsNonRoot: true, allowPrivilegeEscalation: false,
capabilities.drop: ["ALL"], readOnlyRootFilesystem: true, and
seccompProfile.type: RuntimeDefault; ensure these fields are added in the
Deployment spec template for the pod and container corresponding to the "test"
container.
In `@helmfile/configs/e2e/adapters/cl-job/adapter-task-config.yaml`:
- Around line 4-8: Validate and sanitize the incoming clusterId (params -> name:
"clusterId", source: "event.id") at the adapter boundary before any use in
templates like /clusters/{{ .clusterId }}, namespace names, selectors or
status-update URLs: enforce a strict whitelist (e.g., DNS-1123 label regex for
Kubernetes names: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?$ with appropriate max length),
reject or return a clear error for invalid values, URL-encode clusterId when
inserting into HTTP paths, and/or transform to a safe canonical form; implement
this check in the param parsing/handler that reads params.clusterId so the rest
of the code can assume a validated, safe identifier.
In `@helmfile/configs/e2e/adapters/cl-job/adapter-task-resource-job.yaml`:
- Around line 16-29: The Job's Pod uses platform defaults and runs root-capable
containers; update the manifest (under spec and the container with name
"hello-world") to harden its securityContext: add a pod-level securityContext
with runAsNonRoot: true and runAsUser: 1000 (or another non-root UID), and add
container-level securityContext for the "hello-world" container with
runAsNonRoot: true, runAsUser: 1000, allowPrivilegeEscalation: false,
readOnlyRootFilesystem: true, capabilities: drop: ["ALL"], and seccompProfile: {
type: "RuntimeDefault" } (or Localhost with a profile if required by policy);
ensure no privileged: true is set and remove any settings that would grant extra
capabilities.
In `@helmfile/configs/e2e/adapters/cl-maestro/adapter-task-config.yaml`:
- Around line 11-14: The capture in clusterStatus.capture is overwriting the
event-bound params.generation (event.generation), so rename the GET response
field there (e.g., change the captured key from "generation" to something like
"api_generation" or "captured_generation") and ensure observed_generation
continues to use the original params.generation (the event value) rather than
the captured field; update all occurrences where clusterStatus.capture currently
uses "generation" (including the other instances mentioned) to use the new name
and adjust any downstream references accordingly.
- Around line 43-45: The placementClusterName is hard-coded to the literal
"\"cluster1\"" which forces every event to reconcile to the same consumer;
remove the fixed expression and wire placementClusterName to the placement
adapter output (or the appropriate templated variable) so that target_cluster
(where it's consumed) uses the actual placement result instead of the literal.
Update the adapter-task-config entry for placementClusterName and any references
to it (e.g., target_cluster) to read from the placement result variable rather
than the hard-coded string.
In
`@helmfile/configs/e2e/adapters/cl-maestro/adapter-task-resource-manifestwork.yaml`:
- Around line 30-32: This file contains a raw Go template control block ("{{ if
.platformType }}" / "{{ end }}") which makes the YAML invalid to generic
linters; to fix, rename the file to a template-aware extension (for example
change adapter-task-resource-manifestwork.yaml ->
adapter-task-resource-manifestwork.yaml.gotmpl or .tpl) and update any
code/manifest references that consume this file to point to the new name so
tooling skips plain-YAML parsing; ensure the template tokens ("{{ if
.platformType }}" and "{{ end }}") remain unchanged inside the renamed file.
- Line 6: Normalize .clusterId to a DNS-1123-safe form before composing
ManifestWork.metadata.name and use the same normalized value in
discovery.by_name; specifically, update adapter-task-resource-manifestwork.yaml
where ManifestWork.metadata.name is built (currently "{{ .clusterId }}-{{
.adapter.name }}") to apply the same normalization/filters used elsewhere (e.g.,
lowercasing and replacing/stripping invalid chars to produce a
DNS-1123-compliant name) and then update the paired value in
adapter-task-config.yaml (discovery.by_name) to use the identical normalized
expression so create/discovery/status use the same safe name.
In `@helmfile/configs/e2e/adapters/cl-namespace/adapter-task-config.yaml`:
- Around line 4-8: The param "clusterId" (params -> name: "clusterId",
referenced as {{ .clusterId }} and used in /clusters/{{ .clusterId }}, Namespace
name/labels and status URLs) is coming from external event.id and must be
validated/sanitized before templating: enforce it conforms to a Kubernetes
DNS-1123 label (lowercase a-z0-9 and '-', <=63 chars, no leading/trailing '-'),
and reject or canonicalize any value that fails; for URL paths either validate
against a stricter path-safe subset or percent-encode after validation;
implement this check at the parameter parsing/binding step (where params are
read) and return a clear validation error or abort templating if clusterId is
invalid.
In `@helmfile/environments/e2e-kind/adapter-configs.yaml.gotmpl`:
- Around line 49-54: The cl-job adapter block is missing broker.create and may
therefore not wire the RabbitMQ settings; update the cl-job adapter (the broker
section for "cl-job") to include broker.create: true so that the RabbitMQ
settings (broker.rabbitmq.url, queue, exchange, routingKey) are actually
applied, matching the other adapters in this template.
- Around line 133-139: The broker.rabbitmq.queue value is pointing at the wrong
queue (currently "{{ $namespace }}-nodepools-cl-namespace"); update it to the
np-configmap queue name (e.g. "{{ $namespace }}-np-configmap") so the nodepool
adapter publishes to the correct queue; edit the broker -> rabbitmq -> queue
entry to use "{{ $namespace }}-np-configmap" (leave exchange and routingKey
as-is) and ensure any consumer/service that expects "np-configmap" matches this
name to avoid collisions.
In `@helmfile/environments/gcp/gcp-config.yaml.gotmpl`:
- Around line 3-5: Add validation in the helmfile template to fail fast when env
var values are unsupported: read BROKER_TYPE and API_SERVICE_TYPE into locals
(e.g., $brokerType and $serviceType) and use template conditionals with an
explicit allowlist (e.g., broker allowlist ["googlepubsub","rabbitmq"] and
service allowlist ["LoadBalancer","ClusterIP","NodePort"]) and call the template
fail function if a value is not in the allowlist; update
helmfile/environments/gcp/gcp-config.yaml.gotmpl to perform these checks before
emitting projectId/brokerType/serviceType (and reference
values/base-sentinel.yaml.gotmpl behavior when choosing broker allowlist
entries).
In `@helmfile/environments/kind/kind-config.yaml.gotmpl`:
- Around line 3-4: The environment template currently forwards BROKER_TYPE and
API_SERVICE_TYPE without validating allowed values, causing invalid
broker/service values to reach helm templates (base-sentinel only supports
"googlepubsub" and "rabbitmq"); update the gotmpl around brokerType and
serviceType to validate the env values against allowed lists and either coerce
to a safe default or error early. Specifically, read env "BROKER_TYPE" and
ensure it is one of ["googlepubsub","rabbitmq"] before assigning brokerType
(otherwise set the default "rabbitmq" or abort render), and similarly validate
env "API_SERVICE_TYPE" against allowed Kubernetes service types (e.g.,
"ClusterIP","NodePort","LoadBalancer") before assigning serviceType; implement
these checks in the helmfile/environments/kind/kind-config.yaml.gotmpl template
so invalid inputs never get forwarded to base-sentinel.yaml.gotempl or
downstream charts.
In `@helmfile/values/base-adapter.yaml.gotmpl`:
- Around line 22-24: Remove the hardcoded env entry named SIMULATE_RESULT from
the shared template (the env: - name: SIMULATE_RESULT value: success block) so
it is not merged into all adapter releases; instead add a dedicated conditional
or move this variable into the E2E overlay values and templates (e.g., guard
with a value like .Values.e2e.enableSimulate or set SIMULATE_RESULT only in the
E2E overlay values file) so non-E2E environments do not opt into simulated
behavior by default.
In `@Makefile`:
- Around line 170-179: The generate-rabbitmq-values Makefile target incorrectly
treats only exact HELMFILE_ENV values ('kind' and 'gcp'), skipping
e2e-kind/e2e-gcp; update the conditional in the generate-rabbitmq-values rule
(which references HELMFILE_ENV and BROKER_TYPE) to match prefixes or patterns
(e.g., treat values starting with "kind" or "gcp" the same as exact names) so
e2e-* envs are handled the same as kind/gcp, and apply the same pattern fix to
the other similar block around the later rule lines (the other generate/validate
checks mentioned). Also tighten check-helmfile-env-generated so it validates the
expected output directory (GENERATED_RABBITMQ_DIR) actually exists and fails if
missing, referencing the check-helmfile-env-generated target and
GENERATED_RABBITMQ_DIR variable.
- Around line 197-211: The component-specific Make targets install-api,
install-sentinels, and install-adapters currently call "helmfile apply" without
the -f flag so helmfile may load a wrong/default file; update each target
(install-api, install-sentinels, install-adapters) to pass the same -f
helmfile/helmfile.yaml.gotmpl flag used by install-hyperfleet and keep the
existing -e $(HELMFILE_ENV) and label filters so helmfile uses the repo's
helmfile template rather than falling back to ./helmfile.yaml.
In `@README.md`:
- Around line 205-219: Update the "Generated Helm Values" section to include E2E
environments by mentioning HELMFILE_ENV values e2e-gcp and e2e-kind alongside
gcp and kind; explicitly state that for E2E workflows the same generation steps
apply (e.g., `make install-terraform` → generated-values-from-terraform for
GCP/e2e-gcp and `make generate-rabbitmq-values` → generated-values-rabbitmq for
kind/e2e-kind) and that Helmfile reads those files when deploying with
HELMFILE_ENV set to gcp, kind, e2e-gcp or e2e-kind so E2E readers are pointed to
the correct prerequisite path for broker values.
In `@scripts/generate-rabbitmq-values.sh`:
- Around line 23-37: The script currently accepts --adapter-topics, NAMESPACE
and OUT_DIR without validation which allows path traversal or injection into
filenames/YAML; after flag parsing (use ADAPTER_TOPICS, NAMESPACE, OUT_DIR
variables) validate and normalize inputs: reject ADAPTER_TOPICS entries that
contain '/', '..', quotes, newlines or any chars outside a strict allowlist
(e.g., only [A-Za-z0-9_.-]); split ADAPTER_TOPICS and validate each adapter and
topic pair and fail with an error if any item is invalid; also validate
NAMESPACE with the same allowlist and ensure OUT_DIR is a safe path (make
absolute or ensure it doesn’t contain traversal segments) before writing files;
perform these checks in the same script scope that uses
ADAPTER_TOPICS/OUT_DIR/NAMESPACE so functions like require_value remain
unchanged.
In `@terraform/helm-values-files.tf`:
- Around line 38-56: The Terraform config writes files via
local_file.adapter_values and local_file.sentinel_values into
local.helm_values_dir but does not ensure the target directory exists; add a
resource to create the directory (e.g., a null_resource.create_helm_values_dir
with a provisioner "local-exec" running mkdir -p ${local.helm_values_dir}) and
add depends_on = [null_resource.create_helm_values_dir] to both
local_file.adapter_values and local_file.sentinel_values so the directory is
created before attempting to write the files.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 531ca917-2119-485a-a6aa-8cb630b7edf4
📒 Files selected for processing (65)
.gitignoreCONTRIBUTING.mdMakefileREADME.mdenv.gcpenv.kindhelm/adapter1/Chart.yamlhelm/adapter2/Chart.yamlhelm/adapter2/dryrun-adapter2-discovery.jsonhelm/adapter3/Chart.yamlhelm/api/values.yamlhelm/rabbitmq/Chart.yamlhelm/rabbitmq/README.mdhelm/rabbitmq/templates/_helpers.tplhelm/rabbitmq/templates/deployment.yamlhelm/rabbitmq/templates/service.yamlhelm/rabbitmq/values.yamlhelm/sentinel-clusters/Chart.yamlhelm/sentinel-clusters/values.yamlhelm/sentinel-nodepools/Chart.yamlhelm/sentinel-nodepools/values.yamlhelmfile/configs/base/adapters/adapter1/adapter-config.yamlhelmfile/configs/base/adapters/adapter1/adapter-task-config.yamlhelmfile/configs/base/adapters/adapter2/adapter-config.yamlhelmfile/configs/base/adapters/adapter2/adapter-task-config.yamlhelmfile/configs/base/adapters/adapter3/adapter-config.yamlhelmfile/configs/base/adapters/adapter3/adapter-task-config.yamlhelmfile/configs/e2e/adapters/cl-deployment/adapter-config.yamlhelmfile/configs/e2e/adapters/cl-deployment/adapter-task-config.yamlhelmfile/configs/e2e/adapters/cl-deployment/adapter-task-resource-deployment.yamlhelmfile/configs/e2e/adapters/cl-job/adapter-config.yamlhelmfile/configs/e2e/adapters/cl-job/adapter-task-config.yamlhelmfile/configs/e2e/adapters/cl-job/adapter-task-resource-job.yamlhelmfile/configs/e2e/adapters/cl-maestro/adapter-config.yamlhelmfile/configs/e2e/adapters/cl-maestro/adapter-task-config.yamlhelmfile/configs/e2e/adapters/cl-maestro/adapter-task-resource-manifestwork.yamlhelmfile/configs/e2e/adapters/cl-namespace/adapter-config.yamlhelmfile/configs/e2e/adapters/cl-namespace/adapter-task-config.yamlhelmfile/configs/e2e/adapters/np-configmap/adapter-config.yamlhelmfile/configs/e2e/adapters/np-configmap/adapter-task-config.yamlhelmfile/configs/e2e/adapters/np-configmap/adapter-task-resource-configmap.yamlhelmfile/configs/e2e/adapters/np-configmap/values.yamlhelmfile/environments/e2e-gcp/adapter-configs.yaml.gotmplhelmfile/environments/e2e-gcp/e2e-gcp-config.yaml.gotmplhelmfile/environments/e2e-gcp/sentinel-configs.yamlhelmfile/environments/e2e-kind/adapter-configs.yaml.gotmplhelmfile/environments/e2e-kind/e2e-kind-config.yaml.gotmplhelmfile/environments/e2e-kind/sentinel-configs.yamlhelmfile/environments/gcp/adapter-configs.yaml.gotmplhelmfile/environments/gcp/gcp-config.yaml.gotmplhelmfile/environments/gcp/sentinel-configs.yaml.gotmplhelmfile/environments/kind/adapter-configs.yaml.gotmplhelmfile/environments/kind/kind-config.yaml.gotmplhelmfile/environments/kind/sentinel-configs.yaml.gotmplhelmfile/helmfile.yaml.gotmplhelmfile/values/.gitkeephelmfile/values/base-adapter.yaml.gotmplhelmfile/values/base-api.yaml.gotmplhelmfile/values/base-sentinel.yaml.gotmplscripts/generate-rabbitmq-values.shscripts/kind-build-images.shscripts/tf-helm-values.shterraform/helm-values-files.tfterraform/outputs.tfterraform/versions.tf
💤 Files with no reviewable changes (10)
- helm/adapter2/dryrun-adapter2-discovery.json
- helm/adapter2/Chart.yaml
- helm/api/values.yaml
- helm/sentinel-nodepools/values.yaml
- helm/adapter1/Chart.yaml
- helm/adapter3/Chart.yaml
- scripts/tf-helm-values.sh
- helm/sentinel-clusters/Chart.yaml
- helm/sentinel-clusters/values.yaml
- helm/sentinel-nodepools/Chart.yaml
| create-maestro-consumer: check-kubectl check-maestro-namespace ## Create a Maestro consumer (requires Maestro server running) | ||
| @echo "Creating Maestro consumer '$(MAESTRO_CONSUMER)'..." | ||
| @for i in 1 2 3 4 5; do \ | ||
| exists=$$(kubectl exec deploy/maestro --namespace $(MAESTRO_NS) --kubeconfig $(KUBECONFIG) -- \ | ||
| curl -sS --connect-timeout 5 --max-time 10 http://maestro.$(MAESTRO_NS).svc.cluster.local:8000/api/maestro/v1/consumers \ | ||
| exists=$$(kubectl exec deploy/maestro --namespace $(MAESTRO_NAMESPACE) -- \ | ||
| curl -sS --connect-timeout 5 --max-time 10 http://maestro.$(MAESTRO_NAMESPACE).svc.cluster.local:8000/api/maestro/v1/consumers \ | ||
| 2>/dev/null | grep -F -c -- '"name":"$(MAESTRO_CONSUMER)"') 2>/dev/null || true; \ | ||
| if [ "$$exists" -gt 0 ]; then \ | ||
| echo "OK: consumer '$(MAESTRO_CONSUMER)' already exists"; exit 0; \ | ||
| fi; \ | ||
| status=$$(kubectl exec deploy/maestro --namespace $(MAESTRO_NS) --kubeconfig $(KUBECONFIG) -- \ | ||
| status=$$(kubectl exec deploy/maestro --namespace $(MAESTRO_NAMESPACE) --kubeconfig $(KUBECONFIG) -- \ | ||
| curl -sS --connect-timeout 5 --max-time 10 -o /dev/null -w '%{http_code}' -X POST \ | ||
| -H "Content-Type: application/json" \ | ||
| http://maestro.$(MAESTRO_NS).svc.cluster.local:8000/api/maestro/v1/consumers \ | ||
| http://maestro.$(MAESTRO_NAMESPACE).svc.cluster.local:8000/api/maestro/v1/consumers \ | ||
| -d '{"name": "$(MAESTRO_CONSUMER)"}' 2>/dev/null) 2>/dev/null || status="error"; \ |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate the Makefile and show the relevant section
ls -la
find . -maxdepth 2 -name 'Makefile' -print
echo "---- Makefile lines 90-160 ----"
sed -n '90,160p' Makefile
echo "---- Search for MAESTRO_CONSUMER ----"
rg -n "MAESTRO_CONSUMER" Makefile
echo "---- Search for validation/check targets ----"
rg -n "check-maestro|maestro-consumer" MakefileRepository: openshift-hyperfleet/hyperfleet-infra
Length of output: 6672
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "---- Search assignments for MAESTRO_CONSUMER ----"
rg -n "^(MAESTRO_CONSUMER)\s*([?:+]?=)" Makefile || true
rg -n "MAESTRO_CONSUMER\s*[:?+]?=" Makefile || true
echo "---- Show top of Makefile for variable defaults ----"
sed -n '1,120p' Makefile
echo "---- Search for other mentions in repository (if any) ----"
rg -n "MAESTRO_CONSUMER" . || trueRepository: openshift-hyperfleet/hyperfleet-infra
Length of output: 5811
Fix shell/JSON injection risk in create-maestro-consumer
MAESTRO_CONSUMER is used via Make interpolation inside single-quoted shell literals in create-maestro-consumer (-d '{"name": "$(MAESTRO_CONSUMER)"}' and the corresponding grep string). If the consumer name contains a ', it breaks the surrounding shell quoting and can lead to shell command injection / malformed JSON. Add input constraints (allowlist/regex for allowed consumer-name characters) and escape/construct the JSON payload without inline single-quote interpolation (e.g., build it with jq/python or use safe shell escaping).
🧰 Tools
🪛 checkmake (0.3.2)
[warning] 117-117: Target body for "create-maestro-consumer" exceeds allowed length of 5 lines (20).
(maxbodylength)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@Makefile` around lines 117 - 130, The create-maestro-consumer Make target
embeds MAESTRO_CONSUMER directly into single-quoted shell JSON and grep
patterns, which allows a consumer name containing a single-quote to break
quoting and enable shell/JSON injection; fix it by (1) validating
MAESTRO_CONSUMER against a strict allowlist regex (e.g., alphanumerics, dashes,
underscores) early in the create-maestro-consumer flow and fail fast if it
doesn't match, and (2) stop inline single-quote interpolation when building the
POST body and when checking existence—construct the JSON payload safely (e.g.,
use jq or a small python snippet to produce '{"name": <value>}' into a variable
or stdin) and use a JSON-aware check (e.g., pipe the API response through jq to
test .name == MAESTRO_CONSUMER) instead of grep; update the parts of the
create-maestro-consumer target that reference MAESTRO_CONSUMER and the curl -d /
grep commands accordingly.
Summary
Migrates HyperFleet Infrastructure from individual Helm chart management to Helmfile-based declarative orchestration across four deployment environments (GCP, kind, E2E-GCP, E2E-kind). The previous approach required running multiple
make install-*commands with complex variable overrides, while Helmfile provides a single configuration file describing all releases, their dependencies, and environment-specific values. This simplifies local development, enables consistent E2E testing workflows, and reduces deployment complexity through environment-specific defaults inenv.gcpandenv.kindfiles.HYPERFLEET-1056
Changes
Helmfile Orchestration:
helmfile/helmfile.yaml.gotmplwith repository definitions, release configurations for API/Sentinels/Adapters, and environment-specific value templates using Go templatinghelmfile/environments/:gcp,kind,e2e-gcp,e2e-kind, each with adapter and sentinel broker configurationscl-deployment,cl-job,cl-maestro,cl-namespace,np-configmapinhelmfile/configs/e2e/adapters/helm/adapter{1,2,3}/tohelmfile/configs/base/adapters/for reuse across environmentsgenerated-values-from-terraform/(GCP) orgenerated-values-rabbitmq/(kind) based onHELMFILE_ENVEnvironment Configuration:
env.gcpdefining GCP environment defaults:BROKER_TYPE=googlepubsub,API_SERVICE_TYPE=LoadBalancer, user-specific namespaces,IMAGE_PULL_POLICY=Alwaysenv.kinddefining kind environment defaults:BROKER_TYPE=rabbitmq,API_SERVICE_TYPE=ClusterIP, local namespace,IMAGE_PULL_POLICY=IfNotPresentHELMFILE_ENVvalue (contains "gcp" →env.gcp, otherwise →env.kind).envto.gitignorefor local environment configuration overrides.envfile >env.gcp/env.kind> Makefile defaultsMakefile Refactoring:
install-hyperfleet,install-api,install-sentinels,install-adaptersnow callhelmfile applywith component labelscheck-helmfile-envprerequisite that validates kubectl context matchesHELMFILE_ENV(GKE for gcp, kind for kind) and checks generated values directory existstemplate-helmfile,build-helmfile,lint-helmfiletargets for Helmfile validationmake helptarget that automatically organizes targets by# ====section headers (Terraform Targets, Maestro Targets, etc.)Terraform Integration:
scripts/tf-helm-values.shscriptterraform/helm-values-files.tfthat directly generates Helm values files ingenerated-values-from-terraform/duringterraform applylocal_fileresources for sentinel and adapter broker configurationsRabbitMQ Support:
helm/rabbitmq/local Helm chart for RabbitMQ deployments in kind environments (includes deployment, service, ConfigMap templates)scripts/generate-rabbitmq-values.shto generate broker configuration values for RabbitMQ environments ingenerated-values-rabbitmq/make generate-rabbitmq(alias:generate-rabbitmq-values) target with smart detection - displays info message ifHELMFILE_ENVcontains "gcp", otherwise generates RabbitMQ valuesDocumentation Updates:
HELMFILE_ENVbehaviorCleanup:
helm/api/,helm/sentinel-clusters/,helm/sentinel-nodepools/,helm/adapter{1,2,3}/(charts now sourced from component repos)terraform/envs/gke/dev.tfbackend.example,dev.tfvars.example).generated/)Notes
Helmfile Dependency: This introduces a new dependency on
helmfileCLI tool, which must be installed alongside Helm. The README and CONTRIBUTING docs include installation instructions.Breaking Changes: The previous
make install-allandmake install-all-rabbitmqaggregate targets have been removed. Users should now use environment-specific workflows documented in CONTRIBUTING.md or specifyHELMFILE_ENVexplicitly.Generated Values: Terraform-generated values (
generated-values-from-terraform/) are created duringterraform applyviaterraform/helm-values-files.tf. RabbitMQ values (generated-values-rabbitmq/) must be generated manually withmake generate-rabbitmqbefore deploying to kind environments.Test Plan
helmfile -f helmfile/helmfile.yaml.gotmpl -e <env> templateHELMFILE_ENV=kind make install-hyperfleetHELMFILE_ENV=gcp make install-hyperfleetgenerated-values-from-terraform/make helpdisplays categorized targets correctlycheck-helmfile-envprerequisite validates kubectl context matches environment