Skip to content

feat(marketplace): GPU (CUDA) plugin bundle variants + bundle-based demo image#638

Open
staging-devin-ai-integration[bot] wants to merge 10 commits into
mainfrom
devin/1782650002-marketplace-gpu-bundles
Open

feat(marketplace): GPU (CUDA) plugin bundle variants + bundle-based demo image#638
staging-devin-ai-integration[bot] wants to merge 10 commits into
mainfrom
devin/1782650002-marketplace-gpu-bundles

Conversation

@staging-devin-ai-integration

@staging-devin-ai-integration staging-devin-ai-integration Bot commented Jun 28, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Optional CUDA bundle variants, backward-compatible. Manifests gain a variants[] array next to the canonical bundle. When empty, manifests are byte-identical to today's CPU-only ones, so existing clients/tooling are unaffected. The install client resolves a bundle by explicit accelerator (REST/config default_accelerator) or CUDA auto-detect, always falling back to the CPU bundle when no GPU match exists. Auto-detect probes both libcuda.so.1 (driver) and libcudart.so (runtime) so a driver-only host doesn't pick a CUDA bundle it can't load.
  • BREAKING (on-disk layout): bundle installs are keyed by id + version + accelerator. Installs now extract to bundles/<id>/<version>/<accelerator>/ instead of bundles/<id>/<version>/. This lets CPU and CUDA builds of the same version coexist, makes variant switching deterministic (never a silent no-op), and removes the fragile .accelerator marker file (and its fail-open path where a missing marker + install_models=true returned success without fetching the requested variant). ActivePluginRecord now records the activated accelerator (#[serde(default)], empty for pre-existing records). Existing bundles/<id>/<version>/ installs become invisible after upgrade and are simply re-fetched into the new layout; uninstall reaps empty <version>/<id> dirs.
  • Append-only registry passes. build_registry.py --accelerator cuda layers a cuda variant onto the already-published CPU manifest (immutable). verify_bundles.py --accelerator cuda allowlists CUDA NEEDED/RUNPATH deps that the strict CPU gate rejects. Shared manifest/registry logic (build_manifest, SHERPA_*_LIBS) is now in scripts/marketplace/manifest_builder.py so the builder, the append-only guard, and the bundle verifier can't drift. Accelerator comparison is case-normalized to match the Rust eq_ignore_ascii_case resolver.
  • CI publishes GPU bundles on the self-hosted Ada runner. build-marketplace-cuda builds CUDA plugins, regenerates + minisigns the variant manifests, and uploads <id>-<ver>-cuda-bundle.tar.zst to the same per-plugin release. CPU bundle naming is unchanged.
  • Dockerfile.demo consumes the marketplace instead of rebuilding from source. All 11 plugins are fetched as sha256-pinned prebuilt CPU bundles (the artifacts users install), making the demo a regression surface for published bundles. The whisper bundle build now pins SOURCE_DATE_EPOCH + -march=x86-64 -mtune=generic so future ggml builds are portable; ggml-tiny-q5_1.bin is now sha256-pinned in the demo; the slint stage injects a version-matched plugin.yml for the pinned 0.4.0 bundle.

Review & Validation

  • Breaking layout: confirm the id+version+accelerator install path is acceptable; existing bundles/<id>/<version>/ installs are re-fetched (not migrated) after upgrade.
  • Bundle/model sha256 pins in Dockerfile.demo match docs/public/registry/plugins/<id>/<version>/manifest.json.
  • CUDA variant resolution & fallback: cuda variant only on GPU/explicit request, CPU fallback otherwise (122 marketplace unit tests pass, incl. side-by-side install + driver-only auto-detect).
  • Self-hosted Ada runner: verify runner labels ([self-hosted, linux, x64, gpu]) and that the CUDA sherpa-onnx archive / minisign secret are available.

Notes

  • Whisper portability is forward-looking only. The flags apply on the next whisper rebuild; the already-published whisper-0.3.0 bundle the demo ships is reused under append-only and is not retroactively rebuilt. Making the shipped bundle portable requires a whisper version bump — flagged on the review thread for a maintainer decision.
  • I cannot run-test CUDA bundles here (no GPU); GPU validation is build + verify_bundles.py --accelerator cuda only. Publishing + minisign signing happen in CI on the Ada runner.
  • #9 (copytree per variant pass) is intentionally left as a full-tree copy: publish-registry assembles the registry wholesale (rm -rf docs/public/registry && cp -R dist/registry/*), so the variant pass output must contain every plugin's manifest — copying only affected plugins would drop CPU-only plugins. The carried files are small JSON manifests + signatures (bundles live in releases), so cost is O(manifest count).

Link to Devin session: https://staging.itsdev.in/sessions/3fe4bb10d20147bc9384def4bc81694b
Requested by: @streamer45


Devin Review

Status Commit
🟢 Reviewed e0bba4f
Open in Devin Review (Staging)

…emo image

Add optional per-plugin CUDA bundle variants alongside the canonical CPU
bundle, end-to-end across schema, install client, release pipeline, and
the CPU demo image.

- marketplace.rs: add backward-compatible variants[] to manifests
  (empty => byte-identical to pre-variant CPU manifests).
- marketplace_installer.rs: resolve bundle by explicit accelerator or
  CUDA auto-detect (libcuda probe), falling back to the CPU bundle.
- build_registry.py / verify_bundles.py: --accelerator passes layer a
  cuda variant onto the published CPU manifest (append-only, immutable);
  CUDA NEEDED/RUNPATH deps allowlisted for cuda bundles only.
- CI: build-marketplace-cuda job on the self-hosted Ada GPU runner builds
  + publishes <id>-<ver>-cuda-bundle.tar.zst to the per-plugin release.
- Dockerfile.demo: fetch sha256-pinned prebuilt CPU marketplace bundles
  instead of rebuilding 11 plugins from source; drop the from-source
  sherpa-onnx copy (bundles vendor their own libs) and the torch-based
  Helsinki conversion (pull pre-converted model tarballs).

Signed-off-by: streamkit-devin <devin@streamkit.dev>
@streamer45 streamer45 self-assigned this Jun 28, 2026
@streamer45 streamer45 self-requested a review June 28, 2026 13:06
@staging-devin-ai-integration

Copy link
Copy Markdown
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment, CI, and merge conflict monitoring

@codecov

codecov Bot commented Jun 28, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 85.12195% with 61 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.24%. Comparing base (5faed28) to head (e0bba4f).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
apps/skit/src/marketplace_installer.rs 81.81% 52 Missing ⚠️
apps/skit/src/plugins.rs 62.50% 6 Missing ⚠️
apps/skit/src/server/plugins.rs 86.36% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #638      +/-   ##
==========================================
+ Coverage   85.22%   85.24%   +0.01%     
==========================================
  Files         249      249              
  Lines       75412    75765     +353     
  Branches     2437     2314     -123     
==========================================
+ Hits        64272    64584     +312     
- Misses      11134    11175      +41     
  Partials        6        6              
Flag Coverage Δ
backend 85.29% <85.12%> (+0.01%) ⬆️
ui 84.71% <ø> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
core 85.79% <ø> (ø)
engine 83.76% <ø> (ø)
api 91.14% <ø> (ø)
nodes 84.84% <ø> (+<0.01%) ⬆️
server 85.22% <85.12%> (+0.03%) ⬆️
plugin-native 84.79% <ø> (ø)
plugin-wasm 95.41% <ø> (ø)
ui-services 86.29% <ø> (+0.02%) ⬆️
ui-components 70.18% <ø> (ø)
Files with missing lines Coverage Δ
apps/skit/src/config.rs 97.62% <ø> (ø)
apps/skit/src/marketplace.rs 99.30% <100.00%> (+0.09%) ⬆️
apps/skit/src/plugin_records.rs 91.66% <ø> (ø)
apps/skit/src/server/plugins.rs 83.41% <86.36%> (+0.09%) ⬆️
apps/skit/src/plugins.rs 84.82% <62.50%> (-0.28%) ⬇️
apps/skit/src/marketplace_installer.rs 82.34% <81.81%> (+0.23%) ⬆️

... and 2 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@staging-devin-ai-integration staging-devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 5 potential issues.

Open in Devin Review (Staging)
Debug

Playground

Comment thread .github/workflows/marketplace-build.yml
Comment thread apps/skit/src/marketplace_installer.rs Outdated
@@ -803,7 +817,7 @@ impl PluginInstaller {
}

@staging-devin-ai-integration staging-devin-ai-integration Bot Jun 28, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Info: Accelerator marker keyed only by id+version; switching variants requires manual uninstall

The install directory and download cache are keyed by plugin id + version only, so a CPU and CUDA build of the same version cannot coexist. The new .accelerator marker plus the mismatch check in handle_bundle_install (marketplace_installer.rs:811-826) correctly turns a variant switch into an explicit error rather than a silent no-op. Legacy bundles installed before this marker existed read as None and skip the check, so a machine that gains a GPU and auto-detects cuda will get the CPU error message only after the marker is present; for pre-marker installs the variant simply stays whatever was first installed. This is the intended behavior per the marker doc comment, noted here for reviewer awareness of the UX (no automatic re-install on accelerator change).

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Comment on lines +45 to +54
features=()
if grep -qE '^[[:space:]]*cuda[[:space:]]*=' "${plugin_dir}/Cargo.toml"; then
features=(--features cuda)
fi

echo "Building CUDA plugin: ${plugin} ${features[*]:-}"
(
cd "${plugin_dir}"
CARGO_TARGET_DIR="${target_dir}" cargo build --release "${features[@]}"
)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 helsinki/whisper CUDA builds rely on toolkit + dynamic CUDA loading on self-hosted runner

build_official_plugins_cuda.sh builds plugins with --features cuda only when their Cargo.toml declares a cuda = feature. whisper (whisper-rs/cuda) and helsinki (candle-core/cuda etc.) both qualify. These require an NVIDIA CUDA toolkit/nvcc at build time, which the build-marketplace-cuda job assumes is preinstalled on the self-hosted runner (only stated in a comment, not enforced). Candle/cudarc and whisper.cpp CUDA generally dlopen libcudart/libcublas/libcudnn at runtime, which verify_bundles.py --accelerator cuda permits via CUDA_DEP_ALLOWLIST. No code defect, but the CUDA build correctness depends entirely on runner provisioning that CI does not validate.

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Comment thread Dockerfile.demo
Comment on lines 451 to +453
# libfdk-aac2 is in Debian's non-free component.
# sherpa-onnx / ONNX Runtime are NOT installed here: each sherpa-backed
# bundle vendors its own libs (RUNPATH=$ORIGIN).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 Demo image now depends on marketplace bundles being fully self-contained

The runtime stage no longer installs sherpa-onnx/onnxruntime into /usr/local/lib (previously copied from kokoro-builder), relying on each sherpa-backed bundle vendoring its own libs with RUNPATH=$ORIGIN. whisper's ort (onnxruntime) for Silero VAD and helsinki's Candle are expected to statically link / self-contain. This matches what external marketplace users install, so if a bundle is not self-contained it is a marketplace-wide regression rather than demo-specific. Confirmed whisper uses ort and helsinki uses Candle (not CTranslate2), so no CTranslate2 runtime dependency is missing from the slimmed runtime image. Worth a one-time validation that the demo container actually loads all 12 plugins.

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Comment on lines 462 to +475

# Variant passes (e.g. cuda) layer onto an already-published registry, so
# copy the full existing tree forward; untouched plugins must stay present
# and only the variant-updated manifests are overwritten below.
if accelerator != "cpu":
if not args.existing_registry:
print(
"ERROR: --accelerator other than 'cpu' requires --existing-registry",
file=sys.stderr,
)
return 1
src_plugins = existing_registry_path / "plugins"
if src_plugins.exists():
shutil.copytree(src_plugins, registry_out / "plugins", dirs_exist_ok=True)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Info: CUDA registry pass correctly preserves full tree, index, and pubkey

The cuda accelerator pass copies the existing CPU registry's plugins/ tree forward (build_registry.py:466-475), overwrites only variant-updated manifests, then regenerates index.json from existing_registry (loaded from the CPU registry) and copies streamkit.pub from the checked-out repo default path (build_registry.py:749-751). Since the CUDA job runs actions/checkout, the default docs/public/registry/streamkit.pub exists, so the merged registry uploaded with overwrite=true is complete. The append-only immutability check correctly diffs only non-variants fields for variant passes and carries variants forward on CPU reuse passes. Logic verified as consistent.

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

publish-registry depends on build-marketplace-cuda, which is skipped when
build_cuda=false. A skipped dependency fails the implicit success() gate,
so CPU registry publishing was silently skipped too. Gate publish-registry
on the CPU build succeeding and the CUDA build not failing.

Signed-off-by: streamkit-devin <devin@streamkit.dev>

@staging-devin-ai-integration staging-devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 4 new potential issues.

Open in Devin Review (Staging)
Debug

Playground

Comment thread apps/skit/src/marketplace_installer.rs
Comment thread .github/workflows/marketplace-build.yml
Comment thread .github/workflows/marketplace-build.yml
Comment thread apps/skit/src/marketplace_installer.rs Outdated
Comment on lines 982 to 994
let accelerator = self.resolve_accelerator(request.accelerator.as_deref());
let bundle = manifest.resolve_bundle(accelerator.as_deref()).ok_or_else(|| {
InstallError::Other(anyhow!("Plugin manifest missing required `bundle` section"))
})?;
if let Some(requested) = accelerator.as_deref() {
tracing::info!(
plugin_id = %manifest.id,
requested_accelerator = requested,
selected_accelerator = bundle.accelerator,
"Resolved plugin bundle variant"
);
}
let bundle_url = self

@staging-devin-ai-integration staging-devin-ai-integration Bot Jun 28, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Info: Accelerator auto-detection installs CPU bundle for CPU-only plugins (expected)

resolve_accelerator in apps/skit/src/marketplace_installer.rs:958-971 can return Some("cuda") purely from runtime auto-detection (cuda_runtime_available) even for plugins that declare no CUDA variant. PluginManifest::resolve_bundle (apps/skit/src/marketplace.rs:261-282) then falls back to the canonical CPU bundle when no matching variant exists, so the install still succeeds with the CPU artifact. This is the intended graceful fallback, but note the emitted log line Resolved plugin bundle variant requested=cuda selected=cpu will appear for every install on a CUDA host, which could be confusing during debugging.

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged — the CPU-fallback behavior is intended. The log wording (requested=cuda selected=cpu on a CUDA host for CPU-only plugins) is cosmetic; happy to soften it to a debug-level line or note "no cuda variant; using cpu" if you'd prefer, but leaving as-is for now to keep this PR scoped.

build_registry.py requires --signing-key, but the CUDA registry variant
step omitted it, so argparse would abort the GPU registry build (and, via
the publish-registry guard, block CPU publishing too).

Signed-off-by: streamkit-devin <devin@streamkit.dev>
staging-devin-ai-integration[bot]

This comment was marked as resolved.

…switch

Addresses review findings on the marketplace GPU bundle work:

- Make build-marketplace-cuda best-effort (continue-on-error) so a failed
  GPU job no longer blocks CPU per-plugin releases or registry publishing;
  simplify the publish-registry guard to gate solely on the CPU build.
- Guard the empty Bash array expansion in build_official_plugins_cuda.sh so
  feature-less plugins build under set -u on Bash < 4.4.
- Embed manifest.json in CUDA variant bundles for parity with CPU bundles.
- Record the activated accelerator in an installed bundle and return a clear
  'already installed as <variant>' error instead of a silent no-op when a
  request would switch the variant for an already-installed version.

Signed-off-by: streamkit-devin <devin@streamkit.dev>

@staging-devin-ai-integration staging-devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 3 new potential issues.

Open in Devin Review (Staging)
Debug

Playground

Comment thread Dockerfile.demo
Comment thread scripts/marketplace/build_registry.py
Comment on lines +45 to +53
features=()
if grep -qE '^[[:space:]]*cuda[[:space:]]*=' "${plugin_dir}/Cargo.toml"; then
features=(--features cuda)
fi

echo "Building CUDA plugin: ${plugin} ${features[*]:-}"
(
cd "${plugin_dir}"
CARGO_TARGET_DIR="${target_dir}" cargo build --release ${features[@]+"${features[@]}"}

@staging-devin-ai-integration staging-devin-ai-integration Bot Jun 28, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Info: CUDA-capable plugins relying on sherpa must keep sherpa linkage or the cuda build aborts the whole batch

build_official_plugins_cuda.sh:64-74 errors out (exit 1, aborting the entire script under set -e) for any plugin that declares the cuda accelerator but has neither a cuda Cargo feature nor libsherpa-onnx-c-api.so linkage. Today all six cuda-declared plugins satisfy this (whisper/helsinki have a cuda feature; kokoro/matcha/sensevoice/vad link sherpa), so the batch builds fine. But this is a sharp edge: if a future cuda-declared plugin drops sherpa linkage or its cuda feature, the guard aborts the whole CUDA build (not just that plugin), silently producing zero cuda variants while the job stays green due to continue-on-error.

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved in 6b4b1e8. build_official_plugins_cuda.sh now fails loudly for any cuda-declared plugin that ends up as a plain CPU build: after compiling, if the plugin has no cuda Cargo feature, it readelf-checks the produced artifact for libsherpa-onnx-c-api.so linkage (the EP-agnostic sherpa runtime that build_registry.py packages against GPU sherpa libs). If neither a cuda feature nor sherpa linkage is present, it errors out rather than packaging an unaccelerated .so as the cuda variant.

… variant build

Asset types (e.g. Slint's) declared in plugin.yml were dropped by
build_manifest, so manifest.json never carried them. The server only
learns asset types from a plugin.yml beside the .so, so both raw-extraction
consumers (the demo image) and real marketplace installs lost the type.

- build_manifest now includes a non-empty assets block (omitted when empty,
  keeping asset-less manifests byte-identical for the append-only check).
- build_bundle embeds a plugin.yml beside the entrypoint so bundles are
  self-describing, matching what the installer writes post-download.
- Bump slint 0.4.0 -> 0.5.0 so the corrected, assets-bearing manifest can be
  published under the append-only registry; inject the source plugin.yml into
  the demo's pinned (pre-self-describing) slint bundle in the interim.
- build_official_plugins_cuda.sh fails loudly when a cuda-declared plugin has
  neither a cuda Cargo feature nor sherpa linkage (was silently packaging a
  CPU .so as the cuda variant).
- CUDA variant pass verifies the existing manifest signature before re-signing,
  matching the CPU reuse path.
- Mirror the assets field in check_registry_versions.py.

Signed-off-by: streamkit-devin <devin@streamkit.dev>

@staging-devin-ai-integration staging-devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 5 new potential issues.

Open in Devin Review (Staging)
Debug

Playground

Comment on lines +995 to +1008
fn resolve_accelerator(&self, requested: Option<&str>) -> Option<String> {
if let Some(value) = requested.map(str::trim).filter(|value| !value.is_empty()) {
return Some(value.to_string());
}
if let Some(value) =
self.default_accelerator.as_deref().map(str::trim).filter(|value| !value.is_empty())
{
return Some(value.to_string());
}
if cuda_runtime_available() {
return Some("cuda".to_string());
}
None
}

@staging-devin-ai-integration staging-devin-ai-integration Bot Jun 28, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Plugin installs default to GPU builds on machines that only have the graphics driver, causing failures

The installer treats the mere presence of the NVIDIA driver stub (libcuda.so.1 probed by cuda_runtime_available() at apps/skit/src/marketplace_installer.rs:2153) as a signal to auto-select the GPU plugin build whenever no accelerator is configured, so a machine that has the driver but not the CUDA runtime ends up installing a GPU build it cannot load.
Impact: On systems with an NVIDIA driver but no CUDA toolkit, installing a GPU-capable plugin now silently picks the CUDA bundle and fails to load, whereas it previously installed and ran the working CPU build.

Auto-detection heuristic and fallback path

resolve_accelerator (apps/skit/src/marketplace_installer.rs:995-1008) returns, in order, the explicit request, the configured default_accelerator, and finally "cuda" when cuda_runtime_available() is true. The default config value is None (apps/skit/src/config.rs:594), so auto-detection is the effective default. cuda_runtime_available() only dlopens libcuda.so.1, which is present whenever the NVIDIA driver is installed, but the CUDA bundles (e.g. whisper built with --features cuda) additionally need libcudart/libcublas/libcudnn at load time — these are permitted as NEEDED deps by verify_bundles.py but are not vendored into the bundle. resolve_bundle only falls back to CPU when a plugin has no matching variant (apps/skit/src/marketplace.rs:261-282); for cuda-capable plugins it returns the CUDA variant, which then fails dlopen on a runtime-less host.

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already addressed on the current head: CUDA_RUNTIME_LIBS probes both libcuda.so.1 (driver stub) and libcudart.so (runtime), and cuda_runtime_available() only returns true when the full stack is loadable — driver-only hosts fall back to the CPU bundle. The comment's citation of a libcuda.so.1-only probe reflects an older revision.

Comment thread Dockerfile.demo Outdated
Comment thread Dockerfile.demo
Comment on lines +587 to +605
if not verify_existing_signature(
existing["manifest_path"],
existing["signature_path"],
public_key_path,
):
print(
f"ERROR: {plugin_id}@{plugin_version} — existing manifest "
f"signature verification failed before attaching the "
f"'{accelerator}' variant. The manifest may have been "
f"modified after signing; revert the changes or bump the "
f"version to trigger re-signing.",
file=sys.stderr,
)
return 1

manifest_dir = registry_out / "plugins" / plugin_id / plugin_version
manifest_path = manifest_dir / "manifest.json"
write_json(manifest_path, would_be_manifest)
sign_manifest(manifest_path, signing_key)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 Variant pass re-signs an already-published manifest, changing its bytes

The cuda variant pass in build_registry.py:560-605 rebuilds, re-signs, and overwrites an existing published manifest.json to append the variants list. The append-only guard only checks that non-variant core fields are unchanged, but the manifest.json bytes and its minisig DO change when the first variant is attached. Clients that cached the prior manifest/signature will see a different (still validly signed) manifest on next fetch. The PR docs describe this as intended ("layers a cuda variant onto the already-published CPU manifest ... append-only; a published variant is immutable"), so the manifest-mutation-on-first-variant is by design — flagging only so reviewers confirm cache/CDN invalidation is handled for the published registry.

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Comment thread Dockerfile.demo
…eview findings

Key install dirs by id+version+accelerator so CPU and CUDA variants of the
same version coexist and switching never silently no-ops; removes the
.accelerator marker and its fail-open path. resolve_accelerator is probed once
and threaded through download_bundle. ActivePluginRecord now tracks the
activated accelerator.

Build/registry: extract shared manifest_builder (build_manifest + SHERPA libs)
so the builder and append-only guard cannot drift; normalize accelerator case
in reuse/dedup; drop the dead variants fallback. Whisper bundle build pins
SOURCE_DATE_EPOCH + -march=x86-64 for portable ggml. Demo pins
ggml-tiny-q5_1.bin sha256 and injects a version-matched slint plugin.yml.

Signed-off-by: streamkit-devin <devin@streamkit.dev>

@staging-devin-ai-integration staging-devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 4 new potential issues.

Open in Devin Review (Staging)
Debug

Playground

Comment on lines +45 to +53
# whisper.cpp/ggml enables GGML_NATIVE (-march=native) by default, which can
# emit AVX/AVX512 that SIGILLs on older CPUs than the build runner. The
# published bundle must run on any x86-64 host, so pin a portable baseline
# (SOURCE_DATE_EPOCH disables ggml's native-default upstream).
if [ "${plugin}" = "whisper" ]; then
export SOURCE_DATE_EPOCH=1
export CFLAGS="-O3 -pipe -fPIC -march=x86-64 -mtune=generic"
export CXXFLAGS="-O3 -pipe -fPIC -march=x86-64 -mtune=generic"
fi

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 Demo now consumes prebuilt whisper 0.3.0 bundle that predates the portability flags

This PR adds portable build flags (SOURCE_DATE_EPOCH=1, -march=x86-64 -mtune=generic) for whisper in scripts/marketplace/build_official_plugins.sh:49-53, mirroring what the now-deleted whisper-builder stage in Dockerfile.demo used to do. However, the demo now pulls the already-published whisper-0.3.0 bundle (Dockerfile.demo:154 pins WHISPER_BUNDLE_SHA256=79212cfa..., matching docs/public/registry/plugins/whisper/0.3.0/manifest.json). Because the registry is append-only, whisper 0.3.0 is reused (never rebuilt) so the new portability flags do NOT apply to the bundle the demo ships. If the published 0.3.0 bundle was compiled with ggml's default GGML_NATIVE (-march=native) on a newer build host, the demo image could SIGILL on older CPUs — a regression from the prior demo which always compiled whisper portably from source. Worth confirming how the published 0.3.0 bundle was actually built (and whether a whisper version bump is needed to rebuild it portably).

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, and worth calling out explicitly: the portability flags here are forward-looking only. Because the registry is append-only, whisper-0.3.0 is reused (never rebuilt), so the bundle the demo actually ships still reflects however 0.3.0 was originally compiled. These flags only take effect the next time whisper's version is bumped and the bundle is rebuilt.

Making the currently shipped 0.3.0 bundle portable would require a whisper version bump to trigger a portable rebuild, which is a separate decision I don't want to make unilaterally inside this PR. Flagging to the maintainer: do you want a whisper version bump in this PR to rebuild 0.3.0 portably, or track it as a follow-up? Until then the fix prevents the regression for all future bundles but does not retroactively fix 0.3.0.

Comment on lines +60 to +61
"assets": plugin.get("assets") or None,
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Info: build_manifest now emits assets; only safe because slint was version-bumped

scripts/marketplace/manifest_builder.py:60 adds "assets": plugin.get("assets") or None to the canonical manifest, which the old build_manifest/build_manifest_from_plugin in build_registry.py/check_registry_versions.py did not emit. The registry is append-only: any plugin that already has a committed manifest at its current version AND declares assets in plugin.yml would now produce a differing would-be manifest and fail the immutability/version checks. The only plugin with assets is slint, and it was bumped 0.4.0→0.5.0 in this PR (Cargo.toml + plugin.yml + official-plugins.json), so no committed manifest collides. This is correct for the current tree but is a latent constraint: any future plugin that gains assets without a version bump will trip the append-only guard.

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged — this is correct for the current tree (slint is the only plugin with assets and it was bumped 0.4.0→0.5.0, so no committed manifest collides) and the latent constraint is intentional: a plugin that gains assets without a version bump should trip the append-only guard, since changing a published manifest's content without a version bump is exactly what that guard exists to catch. The author would bump the version, same as for any other manifest-affecting metadata change.

Comment on lines +221 to +232
build-marketplace-cuda:
name: Build Marketplace CUDA Variants
needs: [build-marketplace]
if: ${{ inputs.build_cuda }}
# GPU variants are best-effort: a failure on the (potentially flaky)
# self-hosted runner must never regress CPU registry publishing or per-plugin
# releases. continue-on-error keeps the overall workflow green so downstream
# CPU jobs still run; the failure is still visible on this job.
continue-on-error: true
# Self-hosted NVIDIA (Ada) runner: CUDA toolkit + nvcc must be preinstalled.
# Adjust the labels to match your runner registration if they differ.
runs-on: [self-hosted, linux, x64, gpu]

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 Release workflow always invokes the CUDA job, which requires the self-hosted GPU runner

build_cuda defaults to true (.github/workflows/marketplace-build.yml:14-18) and marketplace-release.yml:18 calls the reusable workflow without passing build_cuda, so build-marketplace-cuda (which targets runs-on: [self-hosted, linux, x64, gpu]) always runs on release. continue-on-error: true makes the job's conclusion success even if its steps fail, and publish-registry/create-releases use always()/job-success semantics, so a failing GPU build degrades gracefully. However, if no runner with those labels is registered, the job stays queued indefinitely rather than failing — and downstream jobs (needs: [..., build-marketplace-cuda]) wait for it to complete, stalling the entire release. This is acceptable given the workflow is manual (workflow_dispatch) and the only caller, but it does hard-couple releases to GPU runner availability.

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, and by design for now: the workflow is workflow_dispatch-only with a single caller, so coupling releases to GPU-runner availability is acceptable. If no runner with the [self-hosted, linux, x64, gpu] labels is registered the job stays queued rather than failing, which would stall downstream jobs — the mitigation is operational (ensure the runner is registered, or pass build_cuda=false to skip it). If we later want releases to be fully decoupled from GPU availability we can add a timeout or split the CUDA build into its own non-blocking workflow; tracking that as a follow-up rather than expanding this PR.

Comment on lines 681 to +684
})?;
}

if tokio::fs::try_exists(&bundles_root).await.unwrap_or(false) {
let bundles_root_real =
plugin_paths::ensure_existing_dir_under(&base_real, &bundles_root, "bundles").await?;
let mut entries = tokio::fs::read_dir(&bundles_root_real).await?;
if entries.next_entry().await?.is_none() {
let _ = tokio::fs::remove_dir(&bundles_root_real).await;
}
}
remove_dir_if_empty(&base_real, &version_dir, "version").await?;

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Info: Uninstall of an accelerator variant orphans sibling variant bundle dirs (no active record)

remove_active_record_and_bundle (apps/skit/src/server/plugins.rs:660-684) now removes only the accelerator-specific bundle dir and leaves sibling variants under the version dir (per the comment). Since there is exactly one active record per plugin id (active/{plugin_id}.json), uninstalling the active variant deletes the record while a previously-extracted sibling variant's bundle dir remains on disk with no record referencing it. The installer's bundle_dir.exists() fast-path (marketplace_installer.rs:829) will reuse it on a later reinstall, so this is intentional caching rather than a leak per se, but operators may be surprised that uninstall leaves variant bundle directories behind under bundles/<id>/<version>/.

(Refers to lines 660-684)

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intentional, and a direct consequence of the re-keying: sibling variant bundle dirs under bundles/<id>/<version>/ are kept as a cache so reinstalling a previously-extracted variant hits the bundle_dir.exists() fast-path instead of re-downloading. remove_dir_if_empty still reaps the <version> and <id> dirs once the last variant is gone, so there's no unbounded leak — only an extracted-but-inactive variant lingers. If operators find the leftover dirs surprising we could add a --prune-variants flag to uninstall, but I'd keep the cache-by-default behavior; tracking the optional flag as a follow-up.

Signed-off-by: streamkit-devin <devin@streamkit.dev>

@staging-devin-ai-integration staging-devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 4 new potential issues.

Open in Devin Review (Staging)
Debug

Playground

Comment thread scripts/marketplace/build_official_plugins_cuda.sh
Comment on lines +62 to +69
manifest = strip_none(manifest)
if variants:
ordered: dict = {}
for key, value in manifest.items():
ordered[key] = value
if key == "bundle":
ordered["variants"] = variants
manifest = ordered

@staging-devin-ai-integration staging-devin-ai-integration Bot Jul 4, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Info: New ValueError guard fails loud on variants-without-bundle

manifest_builder.py:63-68 now raises ValueError when variants is truthy but bundle_block is None. I traced all callers: in scripts/marketplace/build_registry.py the CPU reuse path passes variants=existing_manifest.get("variants") which is falsy for the common case (never triggers), and the variant pass always supplies a non-None existing_manifest.get("bundle") together with merged_variants. This converts a previously silent data-loss case (variants dropped because they anchor after the bundle key) into an explicit failure — a safe improvement, not a regression.

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guarded in 026a083: build_manifest now raises ValueError when variants is passed without a bundle block, so a future caller can't silently lose variants.

Comment on lines 681 to +685
})?;
}

if tokio::fs::try_exists(&bundles_root).await.unwrap_or(false) {
let bundles_root_real =
plugin_paths::ensure_existing_dir_under(&base_real, &bundles_root, "bundles").await?;
let mut entries = tokio::fs::read_dir(&bundles_root_real).await?;
if entries.next_entry().await?.is_none() {
let _ = tokio::fs::remove_dir(&bundles_root_real).await;
}
}
remove_dir_if_empty(&base_real, &version_dir, "version").await?;
remove_dir_if_empty(&base_real, &bundles_root, "bundles").await?;

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Info: Accelerator-keyed bundle removal is consistent with installer layout

The refactored remove_active_record_and_bundle (apps/skit/src/server/plugins.rs:660-685) computes version_dir/<accelerator> for new records and version_dir for legacy (empty-accelerator) records, then prunes empty parents via remove_dir_if_empty. I verified this matches the installer, which writes bundles to bundles/<id>/<version>/<accelerator>/ (marketplace_installer.rs:1211-1215) and stores the accelerator on the active record (activate_bundle at 1301-1309). The integration test test_uninstall_marketplace_plugin_removes_bundle sets accelerator: "cpu" but stages the bundle directly under bundles/gain/1.0.0/ (no accelerator subdir); it still passes only because unload_plugin(remove_file=true) deletes the entrypoint .so (plugins.rs:875-885), emptying the version dir so remove_dir_if_empty removes it. The accelerator-keyed removal itself is a no-op in that test's layout — not a production bug, but the test does not actually exercise the new accelerator-subdir removal path.

(Refers to lines 660-685)

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in 9684702: test_uninstall_marketplace_plugin_removes_bundle now stages the bundle under bundles/gain/1.0.0/cpu/ (matching the installer's accelerator-keyed layout) and additionally asserts the empty version dir is pruned, so the accelerator-subdir removal path is actually exercised. Verified the test passes locally.

Comment thread Dockerfile.demo
Comment on lines +425 to +438
FROM bundle-deps AS slint-bundle
ARG SLINT_PLUGIN_VERSION=0.4.0
ARG SLINT_BUNDLE_SHA256=2b33767cdd35b9eeba3bec4e1801aa020a59ef9a61c54b28f7c5d71fa087034d

WORKDIR /bundle
COPY plugins/native/slint/plugin.yml plugin.yml.src
RUN set -eux; \
url="https://github.com/streamer45/streamkit/releases/download/plugin-slint-v${SLINT_PLUGIN_VERSION}/slint-${SLINT_PLUGIN_VERSION}-bundle.tar.zst"; \
curl --proto '=https' --tlsv1.2 -fsSL -o slint-bundle.tar.zst "$url"; \
echo "${SLINT_BUNDLE_SHA256} slint-bundle.tar.zst" | sha256sum -c -; \
mkdir -p extracted; \
tar --zstd -xf slint-bundle.tar.zst -C extracted; \
test -f extracted/libslint.so; \
sed "s/^version:.*/version: ${SLINT_PLUGIN_VERSION}/" plugin.yml.src > extracted/plugin.yml

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 Demo image pins slint 0.4.0 while the repo bumps slint to 0.5.0

Dockerfile.demo pins SLINT_PLUGIN_VERSION=0.4.0 (line 426) and injects a version-matched plugin.yml, while plugins/native/slint/{Cargo.toml,plugin.yml} and official-plugins.json bump slint to 0.5.0. This is intentional per the in-file comments (the 0.5.0 self-describing bundle is not yet published, so the demo keeps the last published bundle and injects plugin.yml), but it means the demo image will lag the registry until someone updates the pin. Confirming the 0.4.0 bundle + SLINT_BUNDLE_SHA256 remain downloadable is worthwhile before merge, since the demo build will hard-fail on a sha mismatch or 404.

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified: downloaded plugin-slint-v0.4.0/slint-0.4.0-bundle.tar.zst from the release and its sha256 matches the pinned SLINT_BUNDLE_SHA256 (2b33767c…). The 0.4.0 pin is intentional until a self-describing 0.5.0 bundle is published; the in-file comment flags updating the pin at that point.

…uard variant manifests

Signed-off-by: streamkit-devin <devin@streamkit.dev>

@staging-devin-ai-integration staging-devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 new potential issue.

Open in Devin Review (Staging)
Debug

Playground

Comment on lines +61 to +66
if [ "${plugin}" = "whisper" ]; then
export SOURCE_DATE_EPOCH=1
export CFLAGS="-O3 -pipe -fPIC -march=x86-64 -mtune=generic"
export CXXFLAGS="-O3 -pipe -fPIC -march=x86-64 -mtune=generic"
fi
CARGO_TARGET_DIR="${target_dir}" cargo build --release ${features[@]+"${features[@]}"}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Info: Whisper CFLAGS scoped to subshell in CUDA build script

scripts/marketplace/build_official_plugins_cuda.sh:61-65 exports SOURCE_DATE_EPOCH/CFLAGS/CXXFLAGS only inside the ( cd ... ) subshell, so they do not leak to subsequent plugins in the loop. For whisper has_cuda_feature is 1, so --features cuda is used and the final sherpa-linkage guard (lines 75-86) is correctly skipped. The portable-baseline flags affect only host C/C++ ggml kernels, not nvcc GPU kernels, matching the CPU build script's behavior.

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Signed-off-by: streamkit-devin <devin@streamkit.dev>

@staging-devin-ai-integration staging-devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 new potential issue.

Open in Devin Review (Staging)
Debug

Playground

Comment on lines 453 to 461
assert!(!tokio::fs::try_exists(&record_path).await.unwrap(), "Active record should be removed");
assert!(
!tokio::fs::try_exists(&bundle_dir).await.unwrap(),
"Bundle directory should be removed"
"Accelerator variant directory should be removed"
);
assert!(
!tokio::fs::try_exists(&version_dir).await.unwrap(),
"Empty version directory should be pruned"
);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Info: Uninstall test correctly exercises accelerator-keyed layout

The updated test_uninstall_marketplace_plugin_removes_bundle places the entrypoint under bundles/gain/1.0.0/cpu/ and sets accelerator: "cpu" on the record. This matches the production logic in apps/skit/src/server/plugins.rs:666-685: a non-empty accelerator resolves bundle_dir = version_dir/cpu, which is removed, after which remove_dir_if_empty prunes the now-empty version_dir and bundles_root. The two new assertions (variant dir removed, empty version dir pruned) are consistent with this flow. No behavioral gap found in the incremental change.

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

…lugins

Expose the active accelerator on PluginSummary (from active records on
restart and from the installer on fresh installs), add an accelerator
selector (auto-detect/cpu/cuda) to the marketplace details pane wired to
InstallPluginRequest.accelerator, and show the active variant in the
installed plugins list.

Signed-off-by: streamkit-devin <devin@streamkit.dev>

@staging-devin-ai-integration staging-devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 0 new potential issues.

Open in Devin Review (Staging)
Debug

Playground

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants