
MCP with OpenShift - GitOps Repository

Why Sandbox API — Shared Clusters

The original MCP with OpenShift lab provisioned a full dedicated CNV cluster per order. That model does not scale for modern workshop delivery.

Zero Touch OpenShift

The goal is for an attendee to click Order and have a fully working OpenShift environment in minutes — not 45–60 minutes. This is only possible with pre-provisioned shared clusters. The Sandbox API scheduler-only pattern is the foundation that makes this possible.

Key Drivers

Speed — A dedicated CNV cluster takes 45–60 minutes to provision. A shared cluster order using Sandbox API takes 2–3 minutes — the cluster is already running, only per-tenant resources are provisioned. For event formats like Lightning Labs at Summit 2026 (self-service, walk-up sessions with no instructor) or booth demos, waiting an hour is not an option.

Lightning Labs at Summit 2026 — The event team is introducing self-service labs as one of the biggest attractions of the event. An attendee walks up, orders, gets their environment in 2–3 minutes, completes the lab, and the environment is destroyed automatically. That cycle must work at scale, concurrently, across the event floor. One dedicated cluster per order cannot support that.

Scale across use cases — Full workshop sessions, Lightning Labs, booth demos, and on-demand ordering from demo.redhat.com can all be served from a single shared cluster pool without provisioning a dedicated cluster for each.

Establishing a reusable pattern — This is the first lab in RHDP to use the shared cluster + Sandbox API scheduler-only pattern end-to-end. The tenant roles (ocp4_workload_tenant_keycloak_user, ocp4_workload_tenant_namespace, ocp4_workload_tenant_gitea) were built to be generic and reusable — not tied to MCP. Any lab running on a shared OpenShift cluster can use the same building blocks.

Authentication — HTPasswd with sequential usernames (user1, user2) made sense when each order got its own cluster. On a shared cluster that model breaks down: sequential usernames conflict across concurrent orders and you need an identity provider that handles isolated user sessions. RHBK is already on the cluster; each order adds one user to the existing realm and removes it on destroy.

Deployment Architecture

This lab uses the Infra / Tenant pattern for OCP Sandbox deployments. Shared cluster infrastructure is provisioned once (infra), and per-user resources are deployed on demand (tenant).

                         ┌─────────────────────────────────────────────┐
                         │          OCP Cluster (Pre-provisioned)      │
                         │                                             │
  ┌───────────────────┐  │  ┌──────────────────────────────────────┐   │
  │ Cluster           │  │  │  Infra (Cluster Provisioner)         │   │
  │ Provisioner       │──┼─▶│                                      │   │
  │ (run once per     │  │  │  - Keycloak (RHBK + OAuth)           │   │
  │  cluster, CI or   │  │  │  - Gitea Operator                    │   │
  │  manual)          │  │  │  - OpenShift Pipelines (Tekton)      │   │
  └───────────────────┘  │  │  - OpenShift GitOps (ArgoCD)         │   │
                         │  │  - ToolHive Operator                 │   │
                         │  │  - User Workload Monitoring          │   │
                         │  │  - CloudNativePG Operator            │   │
                         │  │  - OCP Console Embed (iframe CSP)    │   │
                         │  └──────────────────────────────────────┘   │
                         │                                             │
  ┌───────────────────┐  │  ┌──────────────────────────────────────┐   │
  │ OCPSandbox API    │  │  │  Sandbox API (before workloads run)  │   │
  │ (on each order)   │──┼─▶│                                      │   │
  └───────────────────┘  │  │  - Creates Keycloak user             │   │
                         │  │  - Creates per-user namespaces       │   │
                         │  │    with quotas and limit ranges      │   │
                         │  └──────────────────────────────────────┘   │
                         │                                             │
  ┌───────────────────┐  │  ┌──────────────────────────────────────┐   │
  │ User Provisioner  │  │  │  Tenant (Per-User Resources)         │   │
  │ (AgnosticD        │──┼─▶│                                      │   │
  │  workloads, runs  │  │  │  - ArgoCD AppProject                 │   │
  │  after sandbox    │  │  │  - Per-user Gitea instance (from op) │   │
  │  API creates      │  │  │  - Gitea user + repos                │   │
  │  namespaces)      │  │  │  - LiteLLM Virtual Key               │   │
  └───────────────────┘  │  │  - MCP OpenShift Server (ToolHive)   │   │
                         │  │  - MCP Gitea Server (ToolHive)       │   │
                         │  │  - LibreChat (AI UI)                 │   │
                         │  │  - Pipeline Failure Agent            │   │
                         │  │  - Showroom (Lab Guide)              │   │
                         │  └──────────────────────────────────────┘   │
                         └─────────────────────────────────────────────┘

Infra (Cluster Provisioner)

Shared infrastructure deployed once per cluster (manually or via CI). Managed via the rhpds.ocpsandbox_mcp_with_openshift collection’s tests/e2e/cluster-provision.yml playbook.

The cluster provisioner installs operators and shared prerequisites only. It does NOT create any per-user resources. Per-user instances (Gitea CRs, MCP servers, etc.) are created by the user provisioner from these pre-installed operators.

Key components:

  • Keycloak (RHBK) as OCP OAuth provider — OCPSandbox API creates users dynamically when keycloak: "yes" is set on the cluster pool.

  • Gitea Operator installed cluster-wide — user provisioner creates per-user Gitea CRs.

  • ToolHive Operator installed cluster-wide — user provisioner deploys per-user MCPServer CRs.

  • OpenShift Pipelines (Tekton) — shared pipeline infrastructure for agent builds.

  • OpenShift GitOps (ArgoCD) — user provisioner creates per-user AppProjects and ApplicationSets.

  • CloudNativePG Operator — available for per-user database instances.

  • User Workload Monitoring — metrics collection for per-user workloads.

  • OCP Console Embed — patches IngressController CSP headers for Showroom iframe embedding.

Cluster Authentication

The AgnosticD config: namespace framework sets K8S_AUTH_HOST, K8S_AUTH_API_KEY, and K8S_AUTH_VERIFY_SSL environment variables on every workload role via apply: environment:. This means workload roles using kubernetes.core modules work without a kubeconfig file — the env vars provide cluster access directly.

As a fallback (for e2e tests or environments outside the AgnosticD framework), the MCP user role also writes a kubeconfig to disk at the path specified by KUBECONFIG env var or ~/.kube/config.
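The precedence between the injected environment variables and the kubeconfig fallback can be sketched as follows. This is an illustrative helper, not code from the actual roles; the real workloads rely on kubernetes.core modules reading the K8S_AUTH_* variables directly.

```python
import os

def resolve_cluster_access(env=os.environ):
    """Illustrative sketch: prefer the K8S_AUTH_* variables set by the
    AgnosticD framework; otherwise fall back to a kubeconfig path taken
    from KUBECONFIG or ~/.kube/config."""
    if env.get("K8S_AUTH_HOST") and env.get("K8S_AUTH_API_KEY"):
        return {
            "mode": "env",
            "host": env["K8S_AUTH_HOST"],
            "api_key": env["K8S_AUTH_API_KEY"],
            # kubernetes.core treats K8S_AUTH_VERIFY_SSL as a boolean flag
            "verify_ssl": env.get("K8S_AUTH_VERIFY_SSL", "true").lower() == "true",
        }
    return {
        "mode": "kubeconfig",
        "path": env.get("KUBECONFIG", os.path.expanduser("~/.kube/config")),
    }
```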

Tenant (User Provisioner)

Per-user resources deployed on each lab order via the OCPSandbox API using config: namespace.

The user provisioner is split across multiple roles, each handling a single concern:

Role                                      What it does
argocd_user (namespaced_workloads)        Creates ArgoCD AppProject scoped to the user
gitea_instance (namespaced_workloads)     Deploys per-user Gitea CR (operator creates the instance)
gitea_user (namespaced_workloads)         Creates Gitea user, migrates repos from GitHub
litellm_virtual_keys (rhpds collection)   Creates LiteMaaS virtual key for LLM API access
mcp_user (rhpds collection)               Creates Gitea API token, SCC, and ArgoCD ApplicationSets that deploy this repo's Helm charts
showroom (agnosticd collection)           Deploys lab guide UI

This repository contains the Helm charts for the tenant-level components. ArgoCD ApplicationSets (created by the mcp_user role) deploy these charts into per-user namespaces.

Each user gets isolated namespaces (created by OCPSandbox API):

Namespace              Component
agent-{guid}           Pipeline Failure Agent (primary sandbox)
librechat-{guid}       LibreChat AI UI + Meilisearch + MongoDB
mcp-openshift-{guid}   OpenShift MCP Server (via ToolHive MCPServer CR)
mcp-gitea-{guid}       Gitea MCP Server (via ToolHive MCPServer CR)
gitea-{guid}           Per-user Gitea instance (via Gitea operator CR)
showroom-{guid}        Showroom Lab Guide

How It Works

User orders lab via RHDP catalog
       │
       ▼
OCPSandbox API (runs BEFORE workloads)
       │
       ├── Selects cluster from pool (cloud_selector)
       ├── Creates Keycloak user (keycloak: "yes")
       │     → sandbox_openshift_user / sandbox_openshift_password
       ├── Creates 6 namespaces with quotas:
       │     agent, librechat, mcp-gitea, mcp-openshift, gitea, showroom
       │     → Each exposed as {var}_namespace Ansible variable
       │
       ▼
AgnosticD Workloads (config: namespace, runs IN ORDER)
       │
       ├── 1. ArgoCD AppProject (per-user RBAC)
       ├── 2. Gitea Instance (deploys Gitea CR → operator creates instance)
       ├── 3. Gitea User (creates user + migrates repos from GitHub)
       ├── 4. LiteLLM Virtual Key (LLM API access)
       ├── 5. MCP User Role:
       │      ├── Create Gitea API token
       │      ├── Discover cluster ingress domain
       │      ├── Write kubeconfig for downstream roles
       │      ├── Create SCC + ArgoCD ApplicationSets ─────┐
       │      └── Save user_info for Showroom               │
       ├── 6. OCP Console Embed (idempotent)                │
       └── 7. Showroom (lab guide UI)                       │
                                                            │
                                    ┌───────────────────────┘
                                    ▼
                              ArgoCD syncs
                              Helm charts from
                              THIS REPO (tenant/)
                                    │
                    ┌───────────────┼───────────────┐
                    │               │               │
                    ▼               ▼               ▼
             mcp-openshift    mcp-gitea      librechat + agent
             (ToolHive)       (ToolHive)     (Helm charts)

On destroy (REVERSE order):
  7. Showroom removed
  6. Console embed (no-op)
  5. MCP User: deletes ApplicationSets + SCC
  4. LiteLLM: deletes virtual key (external resource)
  3. Gitea User: purges user via admin API
  2. Gitea Instance: no-op (namespace deletion handles it)
  1. ArgoCD: deletes AppProject
  → Sandbox API deletes all namespaces (catch_all: false
    ensures destroy job completes first)

How Namespace Targeting Works

ArgoCD deploys into namespaces that Ansible pre-creates — they never need to be specified explicitly in AgV.

The convention is {suffix}-{username}. Both layers derive the same names independently:

Layer                                                   How namespace names are constructed
Ansible (ocp4_workload_tenant_namespace)                ocp4_workload_tenant_namespace_prefix + suffix → librechat-mcpuser-drw4x
ArgoCD (tenant/bootstrap/templates/applications.yaml)   printf "librechat-%s" $username, where $username = tenant.username from Helm values

tenant.username is the single contract variable. You pass it once in gitops_bootstrap_helm_values and both layers produce identical namespace names. CreateNamespace=false in the ArgoCD Application tells ArgoCD to use the existing namespace rather than create it.

This is why you never specify namespace names in AgV — they are fully derived from the username.
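The shared {suffix}-{username} convention both layers rely on can be sketched in a few lines. The function name is hypothetical; the suffixes match the namespace table above.

```python
# Component suffixes, as listed in the per-user namespace table.
SUFFIXES = ["agent", "librechat", "mcp-openshift", "mcp-gitea", "gitea", "showroom"]

def tenant_namespaces(username: str) -> dict:
    """Derive every per-user namespace from the single contract
    variable (tenant.username) using the {suffix}-{username} rule."""
    return {suffix: f"{suffix}-{username}" for suffix in SUFFIXES}
```

Because both Ansible and ArgoCD apply this same rule independently, passing tenant.username once is enough for the two layers to agree on every namespace name.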

Agent Architecture

The Pipeline Failure Agent connects MCP servers to an LLM for automated pipeline failure analysis:

┌─────────────────┐     POST /report-failure     ┌─────────────────┐
│  Tekton         │ ────────────────────────────▶│  Pipeline       │
│  Pipeline       │                              │  Failure Agent  │
└─────────────────┘                              └────────┬────────┘
                                                          │
                                     ┌────────────────────┼────────────────────┐
                                     │                    │                    │
                                     ▼                    ▼                    ▼
                            ┌────────────────┐   ┌────────────────┐   ┌────────────────┐
                            │  LiteLLM       │   │  MCP OpenShift │   │  MCP Gitea     │
                            │  (LLM API)     │   │  Server        │   │  Server        │
                            └────────────────┘   └────────────────┘   └────────────────┘
                                                          │                    │
                                                          ▼                    ▼
                                                   Pod Logs            Issue Creation

The agent operates in an iterative loop:

  1. Receive Failure Report — Pipeline sends pod details to /report-failure

  2. Build Prompt — Agent creates a prompt with pod context and examples

  3. LLM Analysis — Model analyzes the situation and requests tools

  4. Tool Execution — Agent executes requested tools via MCP servers (pods_log, create_issue)

  5. Iterate — Process continues until the model completes or max iterations reached

  6. Return Result — Agent returns the final result (usually issue URL)
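The loop above can be sketched as follows. This is a minimal illustration, not the agent's actual implementation: call_llm and the tools mapping stand in for the LiteLLM client and the MCP server calls.

```python
MAX_ITERATIONS = 10  # illustrative budget; the real agent's limit may differ

def run_agent(failure_report, call_llm, tools):
    """Drive the analyze -> tool -> iterate loop until the model
    returns a final answer or the iteration budget is exhausted.

    call_llm: callable taking the message history, returning either
      {"tool": name, "args": {...}} or {"tool": None, "content": ...}.
    tools: mapping of tool name (e.g. pods_log, create_issue) to a
      callable that executes it via the relevant MCP server.
    """
    # Step 1-2: receive the failure report and build the prompt
    messages = [{"role": "user", "content": f"Pipeline failed: {failure_report}"}]
    for _ in range(MAX_ITERATIONS):
        reply = call_llm(messages)          # step 3: LLM analysis
        if reply.get("tool") is None:       # model finished
            return reply["content"]         # step 6: final result
        name, args = reply["tool"], reply.get("args", {})
        result = tools[name](**args)        # step 4: execute via MCP server
        # step 5: feed the tool result back and iterate
        messages.append({"role": "tool", "name": name, "content": result})
    return "max iterations reached"
```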

Repository Structure

ocpsandbox-mcp-with-openshift-gitops/
│
├── tenant/                        # Per-user Helm charts (deployed by ArgoCD)
│   ├── agent/                     # Pipeline Failure Agent
│   │   ├── Chart.yaml
│   │   ├── values.yaml
│   │   └── templates/
│   ├── librechat/                 # LibreChat config overrides
│   │   ├── Chart.yaml
│   │   ├── values.yaml
│   │   └── templates/
│   ├── mcp-gitea/                 # Gitea MCP Server (ToolHive MCPServer CR)
│   │   ├── Chart.yaml
│   │   ├── values.yaml
│   │   └── templates/
│   └── mcp-openshift/             # OpenShift MCP Server (ToolHive MCPServer CR)
│       ├── Chart.yaml
│       ├── values.yaml
│       └── templates/
│
├── agent/                         # Agent source code
│   ├── main.py                    # FastAPI server + agent logic
│   ├── mcp_client.py              # MCP server client (SSE + streamable-http)
│   ├── requirements.txt
│   ├── Containerfile
│   └── tests/
│
└── README.adoc                    # This file

The tenant/ directory follows the rhpds/ci-template-gitops pattern where:

  • infra/ = cluster-scoped resources (not used here — infra is handled by the cluster provisioner playbook)

  • tenant/ = per-user (namespace-scoped) resources

Repository                                   Purpose

rhpds.ocpsandbox_mcp_with_openshift          Ansible collection with ocp4_workload_ocpsandbox_mcp_user role + e2e tests + cluster provisioner
agnosticd/namespaced_workloads               Generic sandbox roles (argocd_user, gitea_instance, gitea_user, keycloak_user)
rhpds.litellm_virtual_keys                   LiteLLM virtual key provisioning
rhpds/lb1726-mcp-showroom                    Showroom lab guide content
rhpds/ocpsandbox-mcp-with-openshift-gitops   This repo — GitOps charts

MCP Servers

Server      Transport         Image                                             Purpose
OpenShift   SSE               quay.io/containers/kubernetes_mcp_server:latest   K8s/OCP resource management, pod logs, exec
Gitea       streamable-http   docker.gitea.com/gitea-mcp-server:latest          Issue management, repository operations, PR workflows

Configuration

Agent Environment Variables

Variable                  Description                                 Default
LITELLM_URL               Base URL for the LiteLLM API endpoint       (required)
LITELLM_API_KEY           API key for LiteLLM authentication          (required)
LITELLM_MODEL             Model identifier to use for analysis        openai/Llama-4-Scout-17B-16E-W4A16
MCP_OPENSHIFT_URL         URL for the OpenShift MCP server            (required)
MCP_OPENSHIFT_TRANSPORT   Transport type for OpenShift MCP server     sse
MCP_GITEA_URL             URL for the Gitea MCP server                (required)
MCP_GITEA_TRANSPORT       Transport type for Gitea MCP server         streamable-http
GITEA_OWNER               Gitea repository owner for issue creation   user1
GITEA_REPO                Gitea repository name for issue creation    mcp
PORT                      Server listening port                       8000
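A loader for this configuration could look like the sketch below. The defaults are copied from the table; the function itself is illustrative and not the agent's actual startup code.

```python
import os

# Defaults taken from the configuration table above.
DEFAULTS = {
    "LITELLM_MODEL": "openai/Llama-4-Scout-17B-16E-W4A16",
    "MCP_OPENSHIFT_TRANSPORT": "sse",
    "MCP_GITEA_TRANSPORT": "streamable-http",
    "GITEA_OWNER": "user1",
    "GITEA_REPO": "mcp",
    "PORT": "8000",
}
REQUIRED = ["LITELLM_URL", "LITELLM_API_KEY", "MCP_OPENSHIFT_URL", "MCP_GITEA_URL"]

def load_config(env=os.environ):
    """Collect required variables, failing fast if any are unset,
    then fill in documented defaults for the optional ones."""
    missing = [k for k in REQUIRED if not env.get(k)]
    if missing:
        raise RuntimeError(f"missing required variables: {missing}")
    cfg = {k: env[k] for k in REQUIRED}
    for key, default in DEFAULTS.items():
        cfg[key] = env.get(key, default)
    return cfg
```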

Local Development

Agent

cd agent
pip install -r requirements.txt

export LITELLM_URL="http://your-litellm-endpoint"
export LITELLM_API_KEY="your-api-key"
export MCP_OPENSHIFT_URL="http://mcp-openshift-server/sse"
export MCP_GITEA_URL="http://mcp-gitea-server/mcp"

python main.py

Container Build

cd agent
podman build -t pipeline-agent -f Containerfile .

Testing

cd agent
pip install -r requirements.txt
pytest tests/ -v --cov=. --cov-report=term-missing
