@richiejp
Collaborator

Testing fixes to #7423

  • ci(workflows): bump GitHub Actions images to Ubuntu 24.04
  • ci(workflows): remove CUDA 11.x support from GitHub Actions (incompatible with ubuntu:24.04)
  • ci(workflows): bump GitHub Actions CUDA support to 12.9
  • build(docker): bump base image to ubuntu:24.04 and adjust Vulkan SDK/packages
  • fix(backend): correct context paths for Python backends in workflows, Makefile and Dockerfile
  • chore(make): disable parallel backend builds to avoid race conditions
  • chore(make): export CUDA_MAJOR_VERSION and CUDA_MINOR_VERSION for override
  • chore(make): add backends/faster-whisper and docker-save-faster-whisper targets
  • build(backend): update backend Dockerfiles to Ubuntu 24.04
  • chore(backend): add ROCm env vars and default AMDGPU_TARGETS for hipBLAS builds
  • chore(chatterbox): bump ROCm PyTorch to 2.9.1+rocm6.4 and update index URL; align hipblas requirements
  • chore: add local-ai-launcher to .gitignore
  • ci(workflows): fix backends GitHub Actions workflows after rebase
  • build(docker): use build-time UBUNTU_VERSION variable
  • chore(docker): remove libquadmath0 from requirements-stage base image
  • chore(make): add backends/vllm to .NOTPARALLEL to prevent parallel builds
  • chore(make): remove duplicate docker-build-vllm target
  • fix(docker): correct CUDA installation steps in backend Dockerfiles
  • chore(backend): update ROCm to 6.4 and align Python hipblas requirements
  • ci(workflows): switch GitHub Actions runners to Ubuntu-24.04 for CUDA on arm64 builds
  • build(docker): update base image and backend Dockerfiles for Ubuntu 24.04 compatibility on arm64
  • build(backend): increase timeout for uv installs behind slow networks on backend/Dockerfile.python
  • ci(workflows): switch GitHub Actions runners to Ubuntu-24.04 for vibevoice backend
  • ci(workflows): fix failing GitHub Actions runners
  • fix: Allow FROM_SOURCE to be unset

@netlify

netlify bot commented Dec 29, 2025

Deploy Preview for localai ready!

Name Link
🔨 Latest commit 6d04e23
🔍 Latest deploy log https://app.netlify.com/projects/localai/deploys/695d0646f498680008151af6
😎 Deploy Preview https://deploy-preview-7769--localai.netlify.app

@richiejp
Collaborator Author

we have a chicken and egg problem here with the intel image, so I have set it to the upstream image to test building in the CI.

@mudler
Owner

mudler commented Dec 29, 2025

> we have a chicken and egg problem here with the intel image, so I have set it to the upstream image to test building in the CI.

we might also not need it anymore; it really was a workaround, as I was having issues using the upstream images directly

@richiejp richiejp marked this pull request as ready for review December 30, 2025 09:39
@richiejp richiejp enabled auto-merge (squash) December 30, 2025 09:42
@richiejp
Collaborator Author

Cool, OK, all images managed to build (the current build is blocked by a 503 Service Unavailable error). I'm guessing the GGML-based backends will all be fine at runtime based on previous testing, and the Python ones... maybe less so. Do you want to merge this and then scramble to fix the resulting issues? @mudler

c.c. @toalex77

@mudler
Owner

mudler commented Jan 2, 2026

> Cool, OK, all images managed to build (the current build is blocked by a 503 Service Unavailable error). I'm guessing the GGML-based backends will all be fine at runtime based on previous testing, and the Python ones... maybe less so. Do you want to merge this and then scramble to fix the resulting issues? @mudler
>
> c.c. @toalex77

yup let's pick it up from master and fix remaining issues there

else
curl -O https://developer.download.nvidia.com/compute/cuda/repos/ubuntu${UBUNTU_VERSION}/arm64/cuda-keyring_1.1-1_all.deb
fi
curl -O https://developer.download.nvidia.com/compute/cuda/repos/ubuntu${UBUNTU_VERSION}/sbsa/cuda-keyring_1.1-1_all.deb
Owner

any specific reason for this? it looks like a regression

Contributor

I changed this part because the Ubuntu 24.04 repository at https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/arm64/ doesn't contain any of the packages required by the subsequent apt-get install, whereas they are available in the sbsa repository.
I then tested the build, and it worked.
Obviously, not having the necessary hardware, I couldn't test the actual functionality of what had been compiled.
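
The repository choice described above can be sketched as a small shell helper. This is only an illustration: the function names `cuda_repo_arch` and `cuda_keyring_url` are hypothetical, and the arm64 to sbsa mapping reflects what was observed for Ubuntu 24.04 in this thread, not a general rule for every CUDA or Ubuntu version.

```shell
#!/bin/sh
# Hypothetical helper: map Docker's TARGETARCH to the directory name used
# under https://developer.download.nvidia.com/compute/cuda/repos/ubuntu<ver>/.
# Assumption from this thread: for Ubuntu 24.04 the arm64 packages live in "sbsa".
cuda_repo_arch() {
    case "$1" in
        amd64) echo "x86_64" ;;
        arm64) echo "sbsa" ;;
        *) echo "unsupported TARGETARCH: $1" >&2; return 1 ;;
    esac
}

# Build the keyring URL in the same shape as the curl lines in this review.
# $1 = TARGETARCH, $2 = UBUNTU_VERSION without the dot (e.g. 2404).
cuda_keyring_url() {
    arch="$(cuda_repo_arch "$1")" || return 1
    echo "https://developer.download.nvidia.com/compute/cuda/repos/ubuntu${2}/${arch}/cuda-keyring_1.1-1_all.deb"
}

cuda_keyring_url arm64 2404
```

Keeping the mapping in one place would make it easier to revisit if the arm64 directory is ever populated for a later Ubuntu release.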

Owner

Ok, I have both arches (DGX Spark and AGX Orin) so I will be able to test here

libcublas-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} \
libcusparse-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} \
libcusolver-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION}
if [ "${CUDA_MAJOR_VERSION}" = "13" ] && [ "arm64" = "$TARGETARCH" ]; then
Owner

does this work for CUDA 12?

Collaborator Author

It appears that these packages exist with CUDA 12. However I am running arm64 locally in QEMU and the build fails later on with what appears to be an error where the x86_64 protoc exe has found its way into the build.

curl -O https://developer.download.nvidia.com/compute/cuda/repos/ubuntu${UBUNTU_VERSION}/x86_64/cuda-keyring_1.1-1_all.deb
fi
if [ "arm64" = "$TARGETARCH" ]; then
if [ "${CUDA_MAJOR_VERSION}" = "13" ]; then
Owner

ditto here

apt-get install -y --no-install-recommends \
software-properties-common pciutils
if [ "amd64" = "$TARGETARCH" ]; then
echo https://developer.download.nvidia.com/compute/cuda/repos/ubuntu${UBUNTU_VERSION}/x86_64/cuda-keyring_1.1-1_all.deb
Owner

this seems an oversight

libcublas-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} \
libcusparse-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} \
libcusolver-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION}
if [ "${CUDA_MAJOR_VERSION}" = "13" ] && [ "arm64" = "$TARGETARCH" ]; then
Owner

ditto here

else
curl -O https://developer.download.nvidia.com/compute/cuda/repos/ubuntu${UBUNTU_VERSION}/arm64/cuda-keyring_1.1-1_all.deb
fi
curl -O https://developer.download.nvidia.com/compute/cuda/repos/ubuntu${UBUNTU_VERSION}/sbsa/cuda-keyring_1.1-1_all.deb
Owner

ditto here

libcusparse-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} \
libcusolver-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION}
if [ "${CUDA_MAJOR_VERSION}" = "13" ] && [ "arm64" = "$TARGETARCH" ]; then
if [ "arm64" = "$TARGETARCH" ]; then
Owner

ditto

libcusparse-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} \
libcusolver-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION}
if [ "${CUDA_MAJOR_VERSION}" = "13" ] && [ "arm64" = "$TARGETARCH" ]; then
if [ "arm64" = "$TARGETARCH" ]; then
Owner

ditto

Dockerfile Outdated
  RUN wget -qO - https://repositories.intel.com/gpu/intel-graphics.key | \
  gpg --yes --dearmor --output /usr/share/keyrings/intel-graphics.gpg
- RUN echo "deb [arch=amd64 signed-by=/usr/share/keyrings/intel-graphics.gpg] https://repositories.intel.com/gpu/ubuntu jammy/lts/2350 unified" > /etc/apt/sources.list.d/intel-graphics.list
+ RUN echo "deb [arch=amd64 signed-by=/usr/share/keyrings/intel-graphics.gpg] https://repositories.intel.com/gpu/ubuntu noble/lts/2350 unified" > /etc/apt/sources.list.d/intel-graphics.list
Owner

good catch, this likely would be better to have its own ARG
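
A minimal sketch of that suggestion, written as the shell a `RUN` step would execute. The variable name `INTEL_REPO_CODENAME` and the helper `intel_graphics_source_line` are hypothetical; in a Dockerfile the value would presumably come from an `ARG` with a default.

```shell
#!/bin/sh
# Hypothetical: INTEL_REPO_CODENAME would be supplied by a Dockerfile ARG.
# The "noble" default and the "lts/2350 unified" suffix are copied from the
# Dockerfile lines under review.
INTEL_REPO_CODENAME="${INTEL_REPO_CODENAME:-noble}"

# Print the apt source line for the given Ubuntu codename.
intel_graphics_source_line() {
    echo "deb [arch=amd64 signed-by=/usr/share/keyrings/intel-graphics.gpg] https://repositories.intel.com/gpu/ubuntu ${1}/lts/2350 unified"
}

# In the image, this output would be redirected to
# /etc/apt/sources.list.d/intel-graphics.list.
intel_graphics_source_line "$INTEL_REPO_CODENAME"
```

With the codename parameterized, a future base image bump would only need a different build argument rather than an edit to the `RUN` line.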

COPY python/${BACKEND} /${BACKEND}
COPY backend.proto /${BACKEND}/backend.proto
COPY python/common/ /${BACKEND}/common
COPY backend/python/${BACKEND} /${BACKEND}
Owner

mh, this is interesting: some of the Makefile commands were not correctly setting the context path to backend, but here we now assume the context is the whole repository. I tried to avoid this because the main LocalAI checkout is likely to be in a dirtier state during development (e.g. model files that then get dumped into the build context and make things slow). Any reason to use . as the build context? Can we keep it scoped to the backend directory instead?

Collaborator Author

I think we have to, or else improve the .dockerignore, because I seem to be sending 6GB+ to the context during a backend build.

Contributor

Hi, I made this change in commit 9323392, essentially because all the Python backends failed during local build tests. Since their behavior differed from the other backends, I took it as something to be standardized, without really being aware of the reason behind it.
So it might be something that needs to be undone and fixed in another way (maybe with some comments explaining why the context is different?).

Owner

@toalex77 I see, yes, this should definitely be standardized one way or another. I tend to prefer isolating the context as much as possible, as it hints at which files are actually needed (we only need things from the backend directory because there is a common backend.proto, but we don't need anything from the top-level repository).
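
To make the two options concrete, here is a hedged sketch of how the build context changes the `docker build` invocation. The helper name `backend_build_cmd`, the image tag, and passing `BACKEND` as a build argument are assumptions for illustration; only `backend/Dockerfile.python` and the `${BACKEND}` variable appear in this PR, and the Makefile's real flags may differ.

```shell
#!/bin/sh
# Hypothetical helper that prints (does not run) a docker build command.
# $1 = backend name, $2 = build context directory.
backend_build_cmd() {
    echo "docker build -f backend/Dockerfile.python --build-arg BACKEND=$1 -t local-ai-backend:$1 $2"
}

# Context scoped to backend/: COPY sources look like python/${BACKEND}.
backend_build_cmd faster-whisper backend

# Context at the repository root: COPY sources look like backend/python/${BACKEND},
# and everything not excluded by .dockerignore is sent to the builder.
backend_build_cmd faster-whisper .
```

The only difference between the two commands is the final argument, but it determines both which `COPY` paths are valid and how much data is transferred before the build starts.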

include:
# CUDA 11 builds
- build-type: 'cublas'
cuda-major-version: "11"
Owner

if we drop cuda 11 (which is totally fine) we should also update docs accordingly

Collaborator Author

Seems like there was a lot more left over than just the docs, so I have removed everything related to CUDA 11. I believe this will remove support for Kepler GPUs (released 2012), although they may still work with Vulkan for GGML based backends.

@github-actions github-actions bot added the kind/documentation Improvements or additions to documentation label Jan 5, 2026
@richiejp
Collaborator Author

richiejp commented Jan 5, 2026

"HTTP status server error (503 Service Temporarily Unavailable) for url
#22 67.94 (https://pypi.jetson-ai-lab.io/root/pypi/+f/63c/f8bbe7522de3b/h11-0.16.0-py3-none-any.whl)
" <-- failed twice in a row now.

EDIT: 3 times now, this time with sentencepiece package.
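
Since these failures are transient 503 responses from the package index, one generic mitigation is to retry the failing step. The sketch below is not what this PR does (the PR raises the timeout for uv installs); `retry` and `RETRY_DELAY` are hypothetical names, and `true` stands in for the real install command.

```shell
#!/bin/sh
# Hypothetical retry wrapper: run a command up to $1 times,
# sleeping RETRY_DELAY seconds (default 5) between attempts.
retry() {
    attempts="$1"
    shift
    n=1
    until "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        n=$((n + 1))
        sleep "${RETRY_DELAY:-5}"
    done
}

# Example: 'true' is a placeholder for the real install command.
retry 3 true
```

A retry only helps with short outages; if the index stays unavailable, the step still fails after the last attempt.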

toalex77 and others added 21 commits January 6, 2026 09:40
… Makefile and Dockerfile

Signed-off-by: Alessandro Sturniolo <[email protected]>
…x URL; align hipblas requirements

Signed-off-by: Alessandro Sturniolo <[email protected]>
…4.04 compatibility on arm64

Signed-off-by: Alessandro Sturniolo <[email protected]>
… on backend/Dockerfile.python

Signed-off-by: Alessandro Sturniolo <[email protected]>
Signed-off-by: Richard Palethorpe <[email protected]>
@mudler
Owner

mudler commented Jan 6, 2026

let's test on master! it's much easier to catch up from there given our build matrix. I can also help a bit more by testing on both l4t platforms

@mudler mudler disabled auto-merge January 6, 2026 14:26
@mudler mudler merged commit e6ba26c into mudler:master Jan 6, 2026
36 of 100 checks passed
Labels

dependencies kind/documentation Improvements or additions to documentation

3 participants