Update release with new fabrica-based services; remove old services#50

Draft
travisbcotton wants to merge 34 commits into main from trcotton/tokensmith-container

@travisbcotton travisbcotton commented Apr 2, 2026

Pull Request Template

Thank you for your contribution! Please ensure the following before submitting:

Checklist

  • My code follows the style guidelines of this project
  • I have added/updated comments where needed
  • I have added tests that prove my fix is effective or my feature works
  • I have run make test (or equivalent) locally and all tests pass
  • DCO Sign-off: All commits are signed off (git commit -s) with my real name and email
  • REUSE Compliance:
    • Each new/modified source file has SPDX copyright and license headers
    • Any non-commentable files include a <filename>.license sidecar
    • All referenced licenses are present in the LICENSES/ directory

Description

Please include a summary of the change and which issue is fixed.
Also include relevant motivation and context.

Fixes #(issue)

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation update

For more info, see Contributing Guidelines.

@travisbcotton travisbcotton marked this pull request as draft April 7, 2026 20:51
@travisbcotton travisbcotton force-pushed the trcotton/tokensmith-container branch from 9bf779d to ef8d070 Compare April 7, 2026 20:52
@travisbcotton travisbcotton changed the title from "added tokensmith basic config file; update env file" to "Update release with new fabrica-based services; remove old services" Apr 7, 2026
Comment thread systemd/containers/boot-service.service
Comment thread systemd/containers/boot-service.service
Comment thread systemd/configs/openchami.env Outdated
Comment thread scripts/tokensmith_bootstrap_token
Comment thread systemd/containers/tokensmith.container Outdated

davidallendj commented Apr 9, 2026

Just a couple of other notes before merging. We need to update the *.container files to use the most up-to-date version of our services including:

  1. SMD after this PR is merged.
  2. Tokensmith to v0.3.0 or later
  3. Boot-service after creating a release
  4. Metadata-service after creating a release

We also need to update systemd/targets/openchami.target to require the new services as well.

Comment thread systemd/containers/tokensmith.container

davidallendj commented Apr 9, 2026

Another note...we're going to update the CoreDHCP config in /etc/openchami/configs/coredhcp.yaml to reflect the change from this PR if we upgrade to the latest version.

Here's a snippet of what the tutorial config should look like after the changes:

    - coresmd: |
        svc_base_uri=https://demo.openchami.cluster:8443 
        ipxe_base_uri=http://172.16.0.254:8081 
        ca_cert=/root_ca/root_ca.crt 
        cache_valid=30s 
        lease_time=1h 
        single_port=false
    - bootloop: |
        lease_file=/tmp/coredhcp.db 
        script_path=default 
        lease_time=5m 
        ipv4_start=172.16.0.200 
        ipv4_end=172.16.0.250

Comment thread systemd/configs/openchami.env

davidallendj commented Apr 13, 2026

A couple of changes:

  1. I think OPAAL_URL can be removed
  2. I think JWKS_URL should be updated to use the tokensmith JWKS endpoint. In the tutorial, it will be something like http://tokensmith:8080/.well-known/jwks.json.
  3. The same change needs to be made to SMD_JWKS_URL as well.
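The three env-file changes listed above could be applied with something like the following sketch. The function name `update_jwks_env` is hypothetical, the JWKS URL is the tutorial value mentioned in item 2, and the sketch assumes each variable sits on its own `NAME=value` line in the env file.

```shell
#!/bin/bash
# Hypothetical helper: point an OpenCHAMI env file at tokensmith's JWKS endpoint.
# Assumes plain NAME=value lines; the URL is the tutorial value, adjust as needed.
update_jwks_env() {
  local env_file="$1"
  local jwks="http://tokensmith:8080/.well-known/jwks.json"
  local tmp
  tmp="$(mktemp)" || return 1
  # Drop OPAAL_URL and rewrite both JWKS variables; all other lines pass through.
  sed -e '/^OPAAL_URL=/d' \
      -e "s|^JWKS_URL=.*|JWKS_URL=${jwks}|" \
      -e "s|^SMD_JWKS_URL=.*|SMD_JWKS_URL=${jwks}|" \
      "$env_file" > "$tmp" && cat "$tmp" > "$env_file"
  local rc=$?
  rm -f "$tmp"
  return "$rc"
}
```

For example, `update_jwks_env systemd/configs/openchami.env` would edit the env file in this repository in place, keeping its permissions because the content is written back with `cat` rather than by replacing the file.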

@travisbcotton travisbcotton force-pushed the trcotton/tokensmith-container branch from 8ab666e to fc525d0 Compare April 14, 2026 14:11
davidallendj commented

We'll need /etc/openchami/configs/haproxy.cfg to be updated to remove the old service routes and add the new ones for tokensmith, boot-service, and metadata-service.


synackd commented Apr 15, 2026

We'll have to note these major changes in the release notes once this is merged. We'll want to bump the minor version on the tag.

Signed-off-by: Travis Cotton <trcotton@lanl.gov>
… list

Signed-off-by: Travis Cotton <trcotton@lanl.gov>
… flags

Signed-off-by: Travis Cotton <trcotton@lanl.gov>
@travisbcotton travisbcotton force-pushed the trcotton/tokensmith-container branch from 651ce7d to caa1bd3 Compare April 16, 2026 14:43
… real this time

Signed-off-by: Travis Cotton <trcotton@lanl.gov>
…ner 😔

Signed-off-by: Travis Cotton <trcotton@lanl.gov>

davidallendj commented Apr 16, 2026

Should we provide a /etc/openchami/configs/boot-service.yaml here alongside the /etc/openchami/configs/tokensmith.json? I think it should go in systemd/configs/boot-service.yaml in this repo so that it gets copied to the appropriate location.

Edit: Just to add, here's the default boot-service config.yaml:

systemd/configs/boot-service.yaml
# SPDX-FileCopyrightText: 2025 OpenCHAMI Contributors
#
# SPDX-License-Identifier: MIT

# OpenCHAMI Boot Service Configuration Example
#
# This is a comprehensive example configuration file for the OpenCHAMI boot service.
# To use this configuration:
#   1. Copy this file to config.yaml: cp config.example.yaml config.yaml
#   2. Customize the settings below for your environment
#   3. Remove or comment out sections you don't need
#
# Configuration precedence (highest to lowest):
#   1. Command-line flags
#   2. Environment variables (e.g., BOOT_SERVICE_PORT=8082)
#   3. Configuration file (config.yaml)
#   4. Default values

# =============================================================================
# SERVER CONFIGURATION
# =============================================================================

# HTTP server settings
port: 8082                    # Port to listen on
host: "0.0.0.0"              # Interface to bind to (0.0.0.0 for all interfaces)
read_timeout: 30             # HTTP read timeout in seconds
write_timeout: 30            # HTTP write timeout in seconds
idle_timeout: 120            # HTTP idle timeout in seconds

# =============================================================================
# STORAGE CONFIGURATION
# =============================================================================

# Data storage settings
data_dir: "./data"           # Directory for storing boot configurations
storage_type: "file"         # Storage backend: "file", "database" (future)

# Database settings (when storage_type: "database")
# database:
#   driver: "postgres"
#   host: "localhost"
#   port: 5432
#   name: "boot_service"
#   user: "boot_user"
#   password: "boot_password"
#   ssl_mode: "require"
#   max_connections: 25
#   connection_timeout: 30

# =============================================================================
# FEATURE TOGGLES
# =============================================================================

# Authentication
enable_auth: false           # Enable TokenSmith JWT authentication
                            # Set to true for production environments

# Metrics and monitoring
enable_metrics: true         # Enable Prometheus metrics endpoint
metrics_port: 9092          # Port for metrics endpoint (/metrics)

# API compatibility
enable_legacy_api: true     # Enable legacy BSS-compatible endpoints
                           # Disable to force use of new API only

# =============================================================================
# AUTHENTICATION CONFIGURATION (when enable_auth: true)
# =============================================================================

auth:
  # Core authentication settings
  enabled: false             # Must match enable_auth above

  # JWT validation method (choose one):

  # Option 1: JWKS URL (recommended for production)
  jwks_url: "https://auth.openchami.org/.well-known/jwks.json"
  jwks_refresh_interval: "1h"  # How often to refresh JWKS cache

  # Option 2: Static RSA public key (for development/testing)
  # jwt_public_key: |
  #   -----BEGIN PUBLIC KEY-----
  #   MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA...
  #   -----END PUBLIC KEY-----

  # JWT validation options
  jwt_issuer: "https://auth.openchami.org"     # Expected token issuer
  jwt_audience: "boot-service"                  # Expected token audience
  validate_expiration: true                     # Check token expiration
  validate_issuer: true                        # Validate issuer claim
  validate_audience: true                      # Validate audience claim

  # Authorization requirements
  required_claims: ["sub", "iss", "aud"]      # Required JWT claims
  required_scopes: ["boot:read"]              # Required OAuth2 scopes

  # Development/testing options (never use in production)
  allow_empty_token: false    # Allow requests without tokens
  non_enforcing: false       # Log auth failures but don't block requests

# =============================================================================
# HARDWARE STATE MANAGER INTEGRATION
# =============================================================================

# HSM (Hardware State Manager) settings
hsm_url: "http://localhost:27779"  # URL of the HSM service
                                   # Set to your HSM endpoint

# TokenSmith-backed HSM service authentication
# When both hsm_url and tokensmith_url are configured, boot-service exchanges a
# bootstrap token for short-lived service tokens and adds them to HSM requests.
# Standardized env vars: TOKENSMITH_URL, TOKENSMITH_BOOTSTRAP_TOKEN,
# TOKENSMITH_TARGET_SERVICE, TOKENSMITH_SCOPES, TOKENSMITH_REFRESH_SKEW_SEC
tokensmith_url: "http://localhost:8080"
tokensmith_target_service: "hsm"
tokensmith_scopes: "hsm:read"
tokensmith_refresh_skew_sec: 120
# tokensmith_bootstrap_token: "<bootstrap-jwt>"  # Prefer env var for secrets
# Environment fallback: TOKENSMITH_BOOTSTRAP_TOKEN

# HSM authentication (when HSM requires auth)
# hsm_auth:
#   type: "service_token"      # Authentication type for HSM
#   service_name: "boot-service"
#   token_endpoint: "http://tokensmith:8080/token"

# =============================================================================
# EXTERNAL SERVICES
# =============================================================================

# TokenSmith authentication service (when enable_auth: true)
tokensmith:
  url: "http://localhost:8080"                    # TokenSmith service URL
  timeout: 30                                    # Request timeout in seconds

  # Service-to-service authentication
  service_auth:
    enabled: false                               # Enable service tokens
    service_name: "boot-service"                 # This service's identifier
    token_endpoint: "/token"                     # Token endpoint path

# BSS (Boot Script Service) integration
bss:
  enabled: false                                 # Enable BSS integration
  url: "http://localhost:27778"                 # BSS service URL
  timeout: 30                                   # Request timeout in seconds

# =============================================================================
# LOGGING AND MONITORING
# =============================================================================

# Logging configuration
logging:
  level: "info"               # Log level: debug, info, warn, error
  format: "json"             # Log format: json, text
  output: "stdout"           # Log output: stdout, stderr, file
  # file: "/var/log/boot-service.log"  # Log file (when output: file)

# Health check configuration
health:
  enabled: true              # Enable health check endpoint
  endpoint: "/health"        # Health check URL path
  timeout: 5                # Health check timeout in seconds

# =============================================================================
# PERFORMANCE AND SCALING
# =============================================================================

# Request limits
limits:
  max_request_size: "10MB"   # Maximum request body size
  max_concurrent: 100        # Maximum concurrent requests
  rate_limit: 1000          # Requests per minute per IP

# Caching (future feature)
# cache:
#   enabled: false
#   type: "memory"           # Cache type: memory, redis
#   ttl: "5m"               # Cache TTL
#   max_size: "100MB"       # Maximum cache size

# =============================================================================
# DEVELOPMENT AND TESTING
# =============================================================================

# Development mode settings
development:
  enabled: false             # Enable development mode
  cors_enabled: true        # Enable CORS for browser testing
  cors_origins: ["*"]       # Allowed CORS origins
  debug_endpoints: false    # Enable debug/diagnostic endpoints
  mock_services: false      # Use mock external services

# =============================================================================
# DEPLOYMENT ENVIRONMENT EXAMPLES
# =============================================================================

# Uncomment and modify one of these sections for your deployment environment:

# --- Development Environment ---
# enable_auth: false
# enable_metrics: true
# logging:
#   level: "debug"
# development:
#   enabled: true
#   debug_endpoints: true

# --- Production Environment ---
# enable_auth: true
# enable_metrics: true
# auth:
#   enabled: true
#   jwks_url: "https://auth.openchami.org/.well-known/jwks.json"
#   jwt_issuer: "https://auth.openchami.org"
#   jwt_audience: "boot-service"
#   required_scopes: ["boot:read"]
# logging:
#   level: "info"
#   format: "json"

# --- Kubernetes/Container Environment ---
# port: 8080
# host: "0.0.0.0"
# data_dir: "/data"
# auth:
#   jwks_url: "http://tokensmith:8080/.well-known/jwks.json"
#   jwt_issuer: "openchami-tokensmith"
#   jwt_audience: "openchami-cluster"
# hsm_url: "http://smd:27779"
# logging:
#   format: "json"
#   output: "stdout"

Signed-off-by: Travis Cotton <trcotton@lanl.gov>
TOKENSMITH_KEY_DIR=/tokensmith/data/keys
TOKENSMITH_RFC8693_BOOTSTRAP_STORE=/tokensmith/data/bootstrap
TOKENSMITH_RFC8693_REFRESH_STORE=/tokensmith/data/refresh
TOKENSMITH_OIDC_PROVIDER=http://hydra:4444

If we're removing Hydra, should this be something different?

If we are removing hydra, what will our OIDC provider be? Does tokensmith act as an OIDC provider too?

Exec=serve --enable-legacy-api=false --enable-auth=true --tokensmith_url=http://tokensmith:8080 --hsm-url=http://smd:27779 --tokensmith-target-service smd --port 8081

[Service]
ExecStartPre=/usr/local/sbin/tokensmith_bootstrap_token.sh boot-service

I think tokensmith_bootstrap_token.sh -> tokensmith_bootstrap_token with the file name change.

Signed-off-by: Travis Cotton <trcotton@lanl.gov>
…metadata-service

Signed-off-by: Travis Cotton <trcotton@lanl.gov>
… out lines too

Signed-off-by: Travis Cotton <trcotton@lanl.gov>

synackd commented Apr 16, 2026

In reply to davidallendj's Apr 9 note above about updating the CoreDHCP config in /etc/openchami/configs/coredhcp.yaml (the coresmd and bootloop snippet):

We may want to add default hostname rules since, when no rule is given, the default is to prefix hostnames with unknown-. Maybe something like:

rule=type:Node,hostname:n{04d}
rule=type:NodeBMC,hostname:{id}

With these rules, node hostnames will look like n0001 and BMC hostnames will be their xnames.
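As a rough illustration of what the first rule is expected to produce, assuming `{04d}` behaves like a zero-padded, four-digit node number (the exact pattern semantics are defined by coresmd and are not confirmed here), the same padding can be reproduced with `printf`. The function name `node_hostname` is made up for this sketch.

```shell
#!/bin/bash
# Hypothetical illustration of n{04d}: "n" plus the node number zero-padded to 4 digits.
# Pass a plain decimal number without leading zeros (printf would read "08" as octal).
node_hostname() {
  printf 'n%04d\n' "$1"
}
```

Calling `node_hostname 1` prints `n0001`, which matches the hostname format described above.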

@synackd synackd left a comment

Initial code review without testing this yet.

@@ -0,0 +1,17 @@
#!/bin/bash

CLIENT="${1}"

We'll probably want to either err if no argument is passed or set this to a default value if unset (e.g. CLIENT="${1:-default_val}"). Not sure what the default value would be, so maybe it would be better to err?
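One way to take the "err" route is sketched below. The helper name `require_client` and the usage message are hypothetical and not part of the script under review; the script name in the message is taken from scripts/tokensmith_bootstrap_token.

```shell
#!/bin/bash
# Hypothetical sketch: refuse to continue when no client name is given.
require_client() {
  if [ -z "${1:-}" ]; then
    echo "usage: tokensmith_bootstrap_token <client>" >&2
    return 1
  fi
  printf '%s\n' "$1"
}

# In the script itself this could be used as:
#   CLIENT="$(require_client "$@")" || exit 1
```

A shorter alternative is bash's built-in parameter check, `CLIENT="${1:?client name required}"`, which prints the message and exits a non-interactive script when `$1` is unset or empty. The function form only gives more control over the usage text.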

")

SECRET_NAME="${CLIENT}-bootstrap-token"
printf '%s' "$TOKENSMITH_BOOTSTRAP_TOKEN" | podman secret rm ${SECRET_NAME} 2>/dev/null || true

I think -i will also ignore errors if the secret doesn't exist. From podman secret rm --help:

-i, --ignore   Ignore errors when a specified secret is missing
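A sketch of how the secret refresh might look with `--ignore` in place of `2>/dev/null || true` follows. The function name `refresh_secret` and the `PODMAN` override variable are assumptions made for illustration; only `podman secret rm --ignore` (per the help text quoted above) and `podman secret create NAME -` (read the value from stdin) are relied on.

```shell
#!/bin/bash
# Hypothetical sketch: replace a podman secret, tolerating a missing old one.
# PODMAN can be overridden so the function can be exercised without podman installed.
refresh_secret() {
  local name="$1" value="$2"
  local podman="${PODMAN:-podman}"
  # --ignore (-i) makes "secret rm" succeed when the secret does not exist,
  # while other failures still surface instead of being swallowed by "|| true".
  "$podman" secret rm --ignore "$name" >/dev/null || return 1
  # Read the secret value from stdin ("-") so it never appears in the argument list.
  printf '%s' "$value" | "$podman" secret create "$name" - >/dev/null
}
```

In the bootstrap script this would be called as something like `refresh_secret "${CLIENT}-bootstrap-token" "$TOKENSMITH_BOOTSTRAP_TOKEN"`.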

use_backend opaal if PATH_opaal
use_backend opaal-idp if PATH_opaal-idp
# add new services
acl PATH_boot-service path_beg -i /boot-service

Since metadata-service uses /metadata, do we want to use /boot here?

SMD_URL=http://smd:27779
OPAAL_URL=http://opaal:3333
JWKS_URL=http://opaal:3333/keys
IMPERSONATION=true

This was originally for cloud-init. Does the metadata-service need this too?

@@ -0,0 +1,32 @@
[Unit]
Description=The bss container

s/bss/boot-service/

Description=The bss container
PartOf=openchami.target

# Ensure SMD has started already

Maybe replace SMD with "dependent services" since there is more than one.

Comment on lines -10 to -11
Wants=hydra-gen-jwks.service
After=hydra-gen-jwks.service

Should we replace these with tokensmith since SMD will depend on it?

#Image=tokensmith:test
EnvironmentFile=/etc/openchami/configs/openchami.env

Exec=serve --oidc-issuer="$TOKENSMITH_OIDC_PROVIDER" --issuer="$TOKENSMITH_ISSUER" --port="$TOKENSMITH_PORT" --cluster-id="$TOKENSMITH_CLUSTER_ID" --openchami-id="$TOKENSMITH_OPENCHAMI_ID" --config="$TOKENSMITH_CONFIG" --key-dir="$TOKENSMITH_KEY_DIR" --rfc8693-bootstrap-store="$TOKENSMITH_RFC8693_BOOTSTRAP_STORE" --rfc8693-refresh-store="$TOKENSMITH_RFC8693_REFRESH_STORE" --enable-local-user-mint

Can we multi-line this for readability and version control purposes? E.g.:

Exec=serve \
  --oidc-issuer="$TOKENSMITH_OIDC_PROVIDER" \
  --issuer="$TOKENSMITH_ISSUER" \
  --port="$TOKENSMITH_PORT" \
  --cluster-id="$TOKENSMITH_CLUSTER_ID" \
  --openchami-id="$TOKENSMITH_OPENCHAMI_ID" \
  --config="$TOKENSMITH_CONFIG" \
  --key-dir="$TOKENSMITH_KEY_DIR" \
  --rfc8693-bootstrap-store="$TOKENSMITH_RFC8693_BOOTSTRAP_STORE" \
  --rfc8693-refresh-store="$TOKENSMITH_RFC8693_REFRESH_STORE" \
  --enable-local-user-mint

4 participants