Per-user Helm charts for the MCP with OpenShift lab, deployed via ArgoCD ApplicationSets during user provisioning.
📖 Developer Guide & Documentation: https://rhpds.github.io/ocpsandbox-mcp-with-openshift-gitops/
The original MCP with OpenShift lab provisioned a full dedicated CNV cluster per order. That model does not scale for modern workshop delivery.
The goal is for an attendee to click Order and have a fully working OpenShift environment in minutes — not 45–60 minutes. This is only possible with pre-provisioned shared clusters. The Sandbox API scheduler-only pattern is the foundation that makes this possible.
Speed — A dedicated CNV cluster takes 45–60 minutes to provision. A shared cluster order using Sandbox API takes 2–3 minutes — the cluster is already running, only per-tenant resources are provisioned. For event formats like Lightning Labs at Summit 2026 (self-service, walk-up sessions with no instructor) or booth demos, waiting an hour is not an option.
Lightning Labs at Summit 2026 — The event team is introducing self-service labs as one of the biggest attractions of the event. An attendee walks up, orders, gets their environment in 2–3 minutes, completes the lab, and the environment is destroyed automatically. That cycle must work at scale, concurrently, across the event floor. One dedicated cluster per order cannot support that.
Scale across use cases — Full workshop sessions, Lightning Labs, booth demos, and on-demand ordering from demo.redhat.com can all be served from a single shared cluster pool without provisioning a dedicated cluster for each.
Establishing a reusable pattern — This is the first lab in RHDP to use the shared cluster + Sandbox API scheduler-only pattern end-to-end. The tenant roles (ocp4_workload_tenant_keycloak_user, ocp4_workload_tenant_namespace, ocp4_workload_tenant_gitea) were built to be generic and reusable — not tied to MCP. Any lab running on a shared OpenShift cluster can use the same building blocks.
Authentication — HTPasswd with sequential usernames (user1, user2) made sense when each order got its own cluster. On a shared cluster that model breaks down: sequential usernames conflict across concurrent orders and you need an identity provider that handles isolated user sessions. RHBK is already on the cluster; each order adds one user to the existing realm and removes it on destroy.
This lab uses the Infra / Tenant pattern for OCP Sandbox deployments. Shared cluster infrastructure is provisioned once (infra), and per-user resources are deployed on demand (tenant).
┌─────────────────────────────────────────────┐
│ OCP Cluster (Pre-provisioned) │
│ │
┌───────────────────┐ │ ┌──────────────────────────────────────┐ │
│ Cluster │ │ │ Infra (Cluster Provisioner) │ │
│ Provisioner │──┼─▶│ │ │
│ (run once per │ │ │ - Keycloak (RHBK + OAuth) │ │
│ cluster, CI or │ │ │ - Gitea Operator │ │
│ manual) │ │ │ - OpenShift Pipelines (Tekton) │ │
└───────────────────┘ │ │ - OpenShift GitOps (ArgoCD) │ │
│ │ - ToolHive Operator │ │
│ │ - User Workload Monitoring │ │
│ │ - CloudNativePG Operator │ │
│ │ - OCP Console Embed (iframe CSP) │ │
│ └──────────────────────────────────────┘ │
│ │
┌───────────────────┐ │ ┌──────────────────────────────────────┐ │
│ OCPSandbox API │ │ │ Sandbox API (before workloads run) │ │
│ (on each order) │──┼─▶│ │ │
└───────────────────┘ │ │ - Creates Keycloak user │ │
│ │ - Creates per-user namespaces │ │
│ │ with quotas and limit ranges │ │
│ └──────────────────────────────────────┘ │
│ │
┌───────────────────┐ │ ┌──────────────────────────────────────┐ │
│ User Provisioner │ │ │ Tenant (Per-User Resources) │ │
│ (AgnosticD │──┼─▶│ │ │
│ workloads, runs │ │ │ - ArgoCD AppProject │ │
│ after sandbox │ │ │ - Per-user Gitea instance (from op) │ │
│ API creates │ │ │ - Gitea user + repos │ │
│ namespaces) │ │ │ - LiteLLM Virtual Key │ │
└───────────────────┘ │ │ - MCP OpenShift Server (ToolHive) │ │
│ │ - MCP Gitea Server (ToolHive) │ │
│ │ - LibreChat (AI UI) │ │
│ │ - Pipeline Failure Agent │ │
│ │ - Showroom (Lab Guide) │ │
│ └──────────────────────────────────────┘ │
└─────────────────────────────────────────────┘

Shared infrastructure is deployed once per cluster (manually or via CI), managed via the `rhpds.ocpsandbox_mcp_with_openshift` collection's `tests/e2e/cluster-provision.yml` playbook.
The cluster provisioner installs operators and shared prerequisites only. It does NOT create any per-user resources. Per-user instances (Gitea CRs, MCP servers, etc.) are created by the user provisioner from these pre-installed operators.
Key components:
- Keycloak (RHBK) as OCP OAuth provider — OCPSandbox API creates users dynamically when `keycloak: "yes"` is set on the cluster pool.
- Gitea Operator installed cluster-wide — the user provisioner creates per-user `Gitea` CRs.
- ToolHive Operator installed cluster-wide — the user provisioner deploys per-user `MCPServer` CRs.
- OpenShift Pipelines (Tekton) — shared pipeline infrastructure for agent builds.
- OpenShift GitOps (ArgoCD) — the user provisioner creates per-user AppProjects and ApplicationSets.
- CloudNativePG Operator — available for per-user database instances.
- User Workload Monitoring — metrics collection for per-user workloads.
- OCP Console Embed — patches IngressController CSP headers for Showroom iframe embedding.
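The operator installs themselves are standard OLM subscriptions. As a rough sketch of what the provisioner applies for one of them (the channel, namespace, and catalog source here are assumptions, not values from the playbook):

```yaml
# Hypothetical example: installing the OpenShift GitOps operator via OLM.
# Channel and catalog source are assumptions; check the playbook for the
# actual values.
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openshift-gitops-operator
  namespace: openshift-operators
spec:
  channel: latest
  name: openshift-gitops-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
```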
The AgnosticD `config: namespace` framework sets the `K8S_AUTH_HOST`, `K8S_AUTH_API_KEY`, and `K8S_AUTH_VERIFY_SSL` environment variables on every workload role via `apply: environment:`. This means workload roles that use `kubernetes.core` modules work without a kubeconfig file — the environment variables provide cluster access directly.
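A minimal sketch of that mechanism, assuming one of the tenant roles and with illustrative variable names on the right-hand side:

```yaml
# Sketch only: the framework injects cluster credentials into each included
# workload role, so kubernetes.core modules need no kubeconfig file.
- name: Run a workload role with cluster access supplied via environment
  ansible.builtin.include_role:
    name: ocp4_workload_tenant_namespace
    apply:
      environment:
        K8S_AUTH_HOST: "{{ openshift_api_url }}"        # illustrative variable
        K8S_AUTH_API_KEY: "{{ openshift_api_token }}"   # illustrative variable
        K8S_AUTH_VERIFY_SSL: "false"
```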
As a fallback (for e2e tests or environments outside the AgnosticD framework), the MCP user role also writes a kubeconfig to disk, at the path given by the `KUBECONFIG` environment variable or `~/.kube/config` by default.
Per-user resources deployed on each lab order via the OCPSandbox API using config: namespace.
The user provisioner is split across multiple roles, each handling a single concern:
| Role | What it does |
|---|---|
| ArgoCD AppProject | Creates ArgoCD AppProject scoped to the user |
| Gitea Instance | Deploys per-user Gitea CR (operator creates the instance) |
| Gitea User | Creates Gitea user, migrates repos from GitHub |
| LiteLLM Virtual Key | Creates LiteMaaS virtual key for LLM API access |
| MCP User (`mcp_user`) | Creates Gitea API token, SCC, and ArgoCD ApplicationSets that deploy this repo's Helm charts |
| Showroom | Deploys lab guide UI |
This repository contains the Helm charts for the tenant-level components. ArgoCD ApplicationSets
(created by the mcp_user role) deploy these charts into per-user namespaces.
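A minimal sketch of what such an ApplicationSet can look like, assuming a git directory generator over `tenant/` and the username `jdoe`; the generator choice and field values are illustrative, not copied from the `mcp_user` role:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: tenant-jdoe
  namespace: openshift-gitops
spec:
  goTemplate: true
  generators:
    - git:
        repoURL: https://github.com/rhpds/ocpsandbox-mcp-with-openshift-gitops
        revision: main
        directories:
          - path: tenant/*
  template:
    metadata:
      name: '{{ .path.basename }}-jdoe'
    spec:
      project: jdoe                       # the per-user AppProject
      source:
        repoURL: https://github.com/rhpds/ocpsandbox-mcp-with-openshift-gitops
        targetRevision: main
        path: '{{ .path.path }}'
        helm:
          values: |
            tenant:
              username: jdoe              # the single contract variable
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{ .path.basename }}-jdoe'  # {suffix}-{username}, pre-created
      syncPolicy:
        automated: {}
        syncOptions:
          - CreateNamespace=false         # namespaces already exist
```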
Each user gets isolated namespaces (created by OCPSandbox API):
| Namespace | Component |
|---|---|
| `agent-{username}` | Pipeline Failure Agent (primary sandbox) |
| `librechat-{username}` | LibreChat AI UI + Meilisearch + MongoDB |
| `mcp-openshift-{username}` | OpenShift MCP Server (via ToolHive MCPServer CR) |
| `mcp-gitea-{username}` | Gitea MCP Server (via ToolHive MCPServer CR) |
| `gitea-{username}` | Per-user Gitea instance (via Gitea operator CR) |
| `showroom-{username}` | Showroom Lab Guide |
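For illustration, the shape of one such namespace with its quota and limit range (the actual values come from the Sandbox API configuration; the numbers below are placeholders):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: agent-jdoe               # {suffix}-{username}
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: sandbox-quota
  namespace: agent-jdoe
spec:
  hard:                          # placeholder values
    requests.cpu: "2"
    requests.memory: 4Gi
    limits.cpu: "4"
    limits.memory: 8Gi
---
apiVersion: v1
kind: LimitRange
metadata:
  name: sandbox-limits
  namespace: agent-jdoe
spec:
  limits:
    - type: Container
      default:                   # placeholder values
        cpu: 500m
        memory: 512Mi
      defaultRequest:
        cpu: 100m
        memory: 128Mi
```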
User orders lab via RHDP catalog
│
▼
OCPSandbox API (runs BEFORE workloads)
│
├── Selects cluster from pool (cloud_selector)
├── Creates Keycloak user (keycloak: "yes")
│ → sandbox_openshift_user / sandbox_openshift_password
├── Creates 6 namespaces with quotas:
│ agent, librechat, mcp-gitea, mcp-openshift, gitea, showroom
│ → Each exposed as {var}_namespace Ansible variable
│
▼
AgnosticD Workloads (config: namespace, runs IN ORDER)
│
├── 1. ArgoCD AppProject (per-user RBAC)
├── 2. Gitea Instance (deploys Gitea CR → operator creates instance)
├── 3. Gitea User (creates user + migrates repos from GitHub)
├── 4. LiteLLM Virtual Key (LLM API access)
├── 5. MCP User Role:
│ ├── Create Gitea API token
│ ├── Discover cluster ingress domain
│ ├── Write kubeconfig for downstream roles
│ ├── Create SCC + ArgoCD ApplicationSets ─────┐
│ └── Save user_info for Showroom │
├── 6. OCP Console Embed (idempotent) │
└── 7. Showroom (lab guide UI) │
│
┌───────────────────────┘
▼
ArgoCD syncs
Helm charts from
THIS REPO (tenant/)
│
┌───────────────┼───────────────┐
│ │ │
▼ ▼ ▼
mcp-openshift mcp-gitea librechat + agent
(ToolHive) (ToolHive) (Helm charts)
On destroy (REVERSE order):
7. Showroom removed
6. Console embed (no-op)
5. MCP User: deletes ApplicationSets + SCC
4. LiteLLM: deletes virtual key (external resource)
3. Gitea User: purges user via admin API
2. Gitea Instance: no-op (namespace deletion handles it)
1. ArgoCD: deletes AppProject
      → Sandbox API deletes all namespaces (catch_all: false
        ensures destroy job completes first)

ArgoCD deploys into namespaces that Ansible pre-creates — they never need to be specified explicitly in AgnosticV (AgV).
The convention is {suffix}-{username}. Both layers derive the same names independently:
| Layer | How namespace names are constructed |
|---|---|
| Ansible | Derives `{suffix}-{username}` from the ordered username when creating namespaces |
| ArgoCD | Derives the same `{suffix}-{username}` from `tenant.username` in the chart values |
tenant.username is the single contract variable. You pass it once in gitops_bootstrap_helm_values and both layers produce identical namespace names. CreateNamespace=false in the ArgoCD Application tells ArgoCD to use the existing namespace rather than create it.
This is why you never specify namespace names in AgV — they are fully derived from the username.
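In practice the contract is just this fragment of the bootstrap values (a sketch; the surrounding structure of `gitops_bootstrap_helm_values` is assumed):

```yaml
gitops_bootstrap_helm_values:
  tenant:
    username: jdoe    # both Ansible and ArgoCD derive {suffix}-jdoe from this
```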
The Pipeline Failure Agent connects MCP servers to an LLM for automated pipeline failure analysis:
┌─────────────────┐ POST /report-failure ┌─────────────────┐
│ Tekton │ ────────────────────────────▶│ Pipeline │
│ Pipeline │ │ Failure Agent │
└─────────────────┘ └────────┬────────┘
│
┌────────────────────┼────────────────────┐
│ │ │
▼ ▼ ▼
┌────────────────┐ ┌────────────────┐ ┌────────────────┐
│ LiteLLM │ │ MCP OpenShift │ │ MCP Gitea │
│ (LLM API) │ │ Server │ │ Server │
└────────────────┘ └────────────────┘ └────────────────┘
│ │
▼ ▼
     Pod Logs              Issue Creation

The agent operates in an iterative loop:
1. Receive Failure Report — Pipeline sends pod details to `/report-failure`
2. Build Prompt — Agent creates a prompt with pod context and examples
3. LLM Analysis — Model analyzes the situation and requests tools
4. Tool Execution — Agent executes requested tools via MCP servers (`pods_log`, `create_issue`)
5. Iterate — Process continues until the model completes or max iterations is reached
6. Return Result — Agent returns the final result (usually an issue URL)
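As an illustration of step 1, a pipeline could report failures from a `finally` task; the agent Service name, port, and payload fields below are assumptions, not the lab's actual definitions:

```yaml
# Hypothetical fragment of a Tekton Pipeline spec: call the agent when the
# pipeline has failed.
finally:
  - name: report-failure
    when:
      - input: "$(tasks.status)"
        operator: in
        values: ["Failed"]
    taskSpec:
      steps:
        - name: post-report
          image: registry.access.redhat.com/ubi9/ubi-minimal
          script: |
            curl -s -X POST "http://pipeline-failure-agent:8080/report-failure" \
              -H "Content-Type: application/json" \
              -d '{"namespace": "$(context.pipelineRun.namespace)", "pipelineRun": "$(context.pipelineRun.name)"}'
```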
ocpsandbox-mcp-with-openshift-gitops/
│
├── tenant/ # Per-user Helm charts (deployed by ArgoCD)
│ ├── agent/ # Pipeline Failure Agent
│ │ ├── Chart.yaml
│ │ ├── values.yaml
│ │ └── templates/
│ ├── librechat/ # LibreChat config overrides
│ │ ├── Chart.yaml
│ │ ├── values.yaml
│ │ └── templates/
│ ├── mcp-gitea/ # Gitea MCP Server (ToolHive MCPServer CR)
│ │ ├── Chart.yaml
│ │ ├── values.yaml
│ │ └── templates/
│ └── mcp-openshift/ # OpenShift MCP Server (ToolHive MCPServer CR)
│ ├── Chart.yaml
│ ├── values.yaml
│ └── templates/
│
├── agent/ # Agent source code
│ ├── main.py # FastAPI server + agent logic
│ ├── mcp_client.py # MCP server client (SSE + streamable-http)
│ ├── requirements.txt
│ ├── Containerfile
│ └── tests/
│
└── README.adoc                # This file

The tenant/ directory follows the rhpds/ci-template-gitops pattern where:
- `infra/` = cluster-scoped resources (not used here — infra is handled by the cluster provisioner playbook)
- `tenant/` = per-user (namespace-scoped) resources
| Repository | Purpose |
|---|---|
| `rhpds.ocpsandbox_mcp_with_openshift` | Ansible collection with the provisioner roles and playbooks |
| AgnosticD | Generic sandbox roles (`ocp4_workload_tenant_keycloak_user`, `ocp4_workload_tenant_namespace`, `ocp4_workload_tenant_gitea`) |
| | LiteLLM virtual key provisioning |
| | Showroom lab guide content |
| `ocpsandbox-mcp-with-openshift-gitops` | This repo — GitOps charts |
| Server | Transport | Image | Purpose |
|---|---|---|---|
| OpenShift | SSE | | K8s/OCP resource management, pod logs, exec |
| Gitea | streamable-http | | Issue management, repository operations, PR workflows |
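The charts render these as ToolHive `MCPServer` resources. A trimmed sketch for the Gitea server (the image reference and port are placeholders; the real values live in `tenant/mcp-gitea/values.yaml`):

```yaml
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPServer
metadata:
  name: mcp-gitea
  namespace: mcp-gitea-jdoe
spec:
  image: <gitea-mcp-server-image>   # placeholder
  transport: streamable-http
  port: 8080                        # placeholder
```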
| Variable | Description | Default |
|---|---|---|
| `LITELLM_URL` | Base URL for the LiteLLM API endpoint | (required) |
| `LITELLM_API_KEY` | API key for LiteLLM authentication | (required) |
| | Model identifier to use for analysis | |
| `MCP_OPENSHIFT_URL` | URL for the OpenShift MCP server | (required) |
| | Transport type for the OpenShift MCP server | |
| `MCP_GITEA_URL` | URL for the Gitea MCP server | (required) |
| | Transport type for the Gitea MCP server | |
| | Gitea repository owner for issue creation | |
| | Gitea repository name for issue creation | |
| | Server listening port | |
cd agent
pip install -r requirements.txt
export LITELLM_URL="http://your-litellm-endpoint"
export LITELLM_API_KEY="your-api-key"
export MCP_OPENSHIFT_URL="http://mcp-openshift-server/sse"
export MCP_GITEA_URL="http://mcp-gitea-server/mcp"
python main.py