chore(ci): replace maximize-build-space with free-disk-space in e2e #4336

Merged
wolfboys merged 3 commits into apache:dev from limc5462:chore/update-e2e-disk-space
Mar 23, 2026

@limc5462
Contributor

What changes were proposed in this pull request

This is a new issue that surfaced following my previous fix for another E2E failure.

Currently, the E2E tests are consistently failing with "No space left on device" errors. The main cause is the easimon/maximize-build-space action used in the workflow. To improve disk space utilization, this action consolidates space from multiple disks into a single LVM mount, but leaves only about 10GB available on the root directory after mounting. Ideally, all operations would run within the LVM volume. However, some hard-to-track, unexpected disk writes in our workflow land outside the LVM volume. As a result, the 10GB of root directory space is quickly exhausted, leading to E2E failures.
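
One way to track down such writes is a diagnostic step that reports root filesystem usage. The following is a minimal sketch and is not part of this PR; the step name, the `du` depth, and the number of entries listed are assumptions chosen for illustration.

```yaml
steps:
  # Hypothetical diagnostic step (not part of this PR): reports how full
  # the root filesystem is and which directories on it are the largest.
  - name: Report root disk usage
    if: always()   # run even when an earlier step has failed
    run: |
      df -h /
      # -x keeps du on the root filesystem, so other mounts are skipped
      sudo du -xh / --max-depth=2 2>/dev/null | sort -rh | head -n 20 || true
```

Placing a step like this after the E2E job's main steps would show whether the root filesystem, rather than the LVM volume, is the one filling up.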

I tried directly adjusting the reservation parameters of easimon/maximize-build-space (for example, setting root-reserve-mb: 30720 in hopes of preserving 30GB for the root directory), but no matter how it was adjusted, the root directory still only had 10GB left after the LVM mount was completed. This might be a bug in the action itself, or perhaps the parameters failed to take effect as expected in this specific environment.
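
For reference, the attempt described above would look roughly like the following. This is only a sketch: the commit SHA is the one the workflow pinned for the old action, `root-reserve-mb` is the parameter named above, and the trailing `df` check is a hypothetical step added for illustration.

```yaml
steps:
  - name: Maximize runner space
    uses: easimon/maximize-build-space@fc881a613ad2a34aca9c9624518214ebc21dfc0c
    with:
      # Intended to keep 30GB free on the root filesystem
      root-reserve-mb: 30720
  # Hypothetical check (not from the PR): print the space left on the root
  # filesystem after the LVM mount has been created.
  - name: Check root free space
    run: df -h /
```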

Looking at the overall environment, the total physical disk space of the Runner is actually sufficient, and we don't need to introduce a complex LVM mounting mechanism to consolidate space. Therefore, this PR replaces the space-clearing solution with jlumbroso/free-disk-space. By directly removing large, pre-installed packages that are not actually needed in the environment (such as dotnet, android, haskell, etc.), we can free up ample disk space in a much simpler and more direct way, thereby completely resolving the disk space shortage issue in the E2E tests.

Brief change log

  • Replaced easimon/maximize-build-space with jlumbroso/free-disk-space@v1.3.1 in the GitHub Actions workflow.
  • Configured the new action to remove unnecessary pre-installed environments (dotnet, android, haskell, codeql, docker-images, large-packages) to directly free up Runner disk space instead of using LVM.
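
A minimal sketch of what the replacement step might look like, assuming the input names documented in the jlumbroso/free-disk-space README. The merged workflow pins a commit SHA rather than the tag used here, and codeql is left out because its exact input name is not shown in this conversation.

```yaml
steps:
  - name: Free Disk Space
    uses: jlumbroso/free-disk-space@v1.3.1
    with:
      # Remove large pre-installed toolchains the E2E job does not need
      dotnet: true
      android: true
      haskell: true
      docker-images: true
      large-packages: true
```

Unlike the LVM approach, this frees space on the root filesystem itself, so writes that land anywhere on the root directory benefit from it.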

Verifying this change

This change is primarily verified by the E2E tests in the CI workflow.

Please note: The previous "No space left on device" error was deterministic and occurred on every run; this fix resolves it. However, during verification, I found that the E2E tests occasionally fail due to other, unrelated flaky issues. If you encounter a failure that is not related to disk space while verifying this PR, re-running the jobs will very likely produce a passing run.

Does this pull request potentially affect one of the following parts

- Dependencies (does it add or upgrade a dependency): no

Contributor

Copilot AI left a comment

Pull request overview

This PR updates the E2E GitHub Actions workflow to address consistent “No space left on device” failures by switching from an LVM-based disk maximization approach to a package-removal based disk cleanup action.

Changes:

  • Replaced easimon/maximize-build-space with jlumbroso/free-disk-space@v1.3.1 in the E2E workflow.
  • Configured the new action to remove several large preinstalled components (dotnet/android/haskell/codeql/docker images/large packages) to free runner disk space.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@limc5462
Contributor Author

This workflow run failed: https://github.com/apache/streampark/actions/runs/23380797925
The failure is just a flaky error in UserManagementTest. Could you please help re-run it? @wolfboys

With the code in this PR, we won't see the "no space left on device" errors anymore. If the re-run is lucky enough to pass UserManagementTest, we will see a fully green E2E run, which can serve as the basis for merging.

Also, here is a screenshot of a fully successful E2E run I had a few days ago:
[image: screenshot of the successful E2E run]

- name: Maximize runner space
  uses: easimon/maximize-build-space@fc881a613ad2a34aca9c9624518214ebc21dfc0c
- name: Free Disk Space
  uses: jlumbroso/free-disk-space@21bdaa2c9e347d9d7fdb1bd6124d61c0f335a419 # v1.3.1

Member

Can you add a comment here?
Reference: https://github.com/jlumbroso/free-disk-space

Contributor Author

Done, and used the correct SHA.

Member

@wolfboys left a comment

LGTM

@wolfboys merged commit 37583a4 into apache:dev on Mar 23, 2026
42 checks passed