Add minimum changes to support containers #8
base: main
Conversation
Hi, just my observation and a few comments. For a container runtime, the current JobSpec is minimal (an image and volume mounts), and there may be a need to express more options for the container runtime. Rather than pushing these flags directly into the JobSpec, the API could introduce a dedicated container-specific object. This also raises additional questions for facilities and the IRI Interface implementation (how this would work in practice across all facilities), as each facility might use a different container runtime (Docker, Apptainer, Podman…), and not everyone allows fully privileged containers (just my guess). How are these capabilities exposed (container runtime, supported flags), and who does the "heavy lifting" of translating container parameters to each facility's container runtime? Is it the IRI Interface, or is it left to the end user to identify each facility's capabilities and make the changes required to run jobs?
Separating the configuration into a separate container-specific object is a good idea. However, I think we need to be careful to avoid exposing too much, as we need to allow sites to implement the interface, so it really needs to be the lowest common denominator that the container runtimes used across the different sites can support. For example, I didn't expose the network configuration, as I was thinking that we should just default to the host. For MPI and GPU configuration, I would say that these options could be enabled if the job spec dictated that they were necessary, to avoid duplicating configuration.
Yes, as I said above, we need to expose a very minimal subset of container functionality, so it can be implemented successfully across sites. I see this interface as a subset of container functionality rather than a superset of all container runtime options. We could also provide a site-specific "extra container options" property, as an escape hatch that would allow sites to support more advanced options, but these would not necessarily be supported across all sites.
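To make the discussion concrete, here is a minimal sketch of what such a container-specific object might look like, with image, mounts, and a site-specific escape hatch. All names here (`ContainerSpec`, `VolumeMount`, `extra_options`) are illustrative assumptions, not the actual API proposed in this PR.

```python
from dataclasses import dataclass, field

@dataclass
class VolumeMount:
    source: str            # path on the host
    target: str            # path inside the container
    read_only: bool = True

@dataclass
class ContainerSpec:
    """Hypothetical lowest-common-denominator container configuration."""
    image: str                                           # container image reference
    mounts: list = field(default_factory=list)           # list of VolumeMount
    # Site-specific escape hatch: opaque options a given facility's runtime
    # may honour; not guaranteed to be portable across sites.
    extra_options: dict = field(default_factory=dict)

# Example usage with an illustrative image name:
spec = ContainerSpec(
    image="example.org/science/app:latest",
    mounts=[VolumeMount("/scratch/data", "/data")],
    extra_options={"apptainer": ["--nv"]},
)
```

The escape hatch keeps the required surface small while still letting a facility accept runtime-specific flags it chooses to support.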
Force-pushed 205b127 to b5bdf4a
It would be great to be able to get my job executable or script to run inside a specified container image. That would remove my need to login to the system manually and compile my code before submitting a job. However, HPC systems need special mounts/definitions in order to use GPUs / accelerators and the system MPI libraries, etc. Some references to compare when doing this are:
I think the common subset is the image name and mount list. However, we should also try to add something like a
@frobnitzem I agree that we need to be able to support GPUs and MPI libraries. I was proposing that, given this configuration is already covered in the job spec, each implementation would use that to add the appropriate options.
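The idea above, that each implementation derives runtime flags from the existing JobSpec rather than duplicating GPU/MPI settings in the container object, could be sketched as follows. The function name, parameters, and runtime dispatch are purely illustrative assumptions; only the flags themselves (`--nv` for Apptainer, `--gpus` for Docker) are real runtime options.

```python
def derive_runtime_flags(gpus_per_task: int, runtime: str) -> list:
    """Hypothetical translation of JobSpec-level resource requests into
    container-runtime-specific flags, so the user never writes them."""
    flags = []
    if gpus_per_task > 0:
        if runtime == "apptainer":
            flags.append("--nv")        # Apptainer's NVIDIA GPU passthrough flag
        elif runtime == "docker":
            flags.append("--gpus=all")  # Docker's GPU access option
    return flags
```

A facility's implementation would apply this kind of mapping internally, which is what keeps the portable JobSpec free of runtime-specific flags.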
This PR adds some properties to the JobSpec to allow containerized jobs to be run.