feat(marketplace): GPU (CUDA) plugin bundle variants + bundle-based demo image by staging-devin-ai-integration[bot] · Pull Request #638 · streamer45/streamkit

staging-devin-ai-integration · 2026-06-28T13:05:58Z

Summary

Optional CUDA bundle variants, backward-compatible. Manifests gain a variants[] array next to the canonical bundle. When empty, manifests are byte-identical to today's CPU-only ones, so existing clients/tooling are unaffected. The install client resolves a bundle by explicit accelerator (REST/config default_accelerator) or CUDA auto-detect, always falling back to the CPU bundle when no GPU match exists. Auto-detect probes both libcuda.so.1 (driver) and libcudart.so (runtime) so a driver-only host doesn't pick a CUDA bundle it can't load.
BREAKING (on-disk layout): bundle installs are keyed by id + version + accelerator. Installs now extract to bundles/<id>/<version>/<accelerator>/ instead of bundles/<id>/<version>/. This lets CPU and CUDA builds of the same version coexist, makes variant switching deterministic (never a silent no-op), and removes the fragile .accelerator marker file (and its fail-open path where a missing marker + install_models=true returned success without fetching the requested variant). ActivePluginRecord now records the activated accelerator (#[serde(default)], empty for pre-existing records). Existing bundles/<id>/<version>/ installs become invisible after upgrade and are simply re-fetched into the new layout; uninstall reaps empty <version>/<id> dirs.
Append-only registry passes. build_registry.py --accelerator cuda layers a cuda variant onto the already-published CPU manifest (immutable). verify_bundles.py --accelerator cuda allowlists CUDA NEEDED/RUNPATH deps that the strict CPU gate rejects. Shared manifest/registry logic (build_manifest, SHERPA_*_LIBS) is now in scripts/marketplace/manifest_builder.py so the builder, the append-only guard, and the bundle verifier can't drift. Accelerator comparison is case-normalized to match the Rust eq_ignore_ascii_case resolver.
CI publishes GPU bundles on the self-hosted Ada runner. build-marketplace-cuda builds CUDA plugins, regenerates + minisigns the variant manifests, and uploads <id>-<ver>-cuda-bundle.tar.zst to the same per-plugin release. CPU bundle naming is unchanged.
Dockerfile.demo consumes the marketplace instead of rebuilding from source. All 11 plugins are fetched as sha256-pinned prebuilt CPU bundles (the artifacts users install), making the demo a regression surface for published bundles. The whisper bundle build now pins SOURCE_DATE_EPOCH + -march=x86-64 -mtune=generic so future ggml builds are portable; ggml-tiny-q5_1.bin is now sha256-pinned in the demo; the slint stage injects a version-matched plugin.yml for the pinned 0.4.0 bundle.

Review & Validation

Breaking layout: confirm the id+version+accelerator install path is acceptable; existing bundles/<id>/<version>/ installs are re-fetched (not migrated) after upgrade.
Bundle/model sha256 pins in Dockerfile.demo match docs/public/registry/plugins/<id>/<version>/manifest.json.
CUDA variant resolution & fallback: cuda variant only on GPU/explicit request, CPU fallback otherwise (122 marketplace unit tests pass, incl. side-by-side install + driver-only auto-detect).
Self-hosted Ada runner: verify runner labels ([self-hosted, linux, x64, gpu]) and that the CUDA sherpa-onnx archive / minisign secret are available.

Notes

Whisper portability is forward-looking only. The flags apply on the next whisper rebuild; the already-published whisper-0.3.0 bundle the demo ships is reused under append-only and is not retroactively rebuilt. Making the shipped bundle portable requires a whisper version bump — flagged on the review thread for a maintainer decision.
I cannot run-test CUDA bundles here (no GPU); GPU validation is build + verify_bundles.py --accelerator cuda only. Publishing + minisign signing happen in CI on the Ada runner.
#9 (copytree per variant pass) is intentionally left as a full-tree copy: publish-registry assembles the registry wholesale (rm -rf docs/public/registry && cp -R dist/registry/*), so the variant pass output must contain every plugin's manifest — copying only affected plugins would drop CPU-only plugins. The carried files are small JSON manifests + signatures (bundles live in releases), so cost is O(manifest count).

Link to Devin session: https://staging.itsdev.in/sessions/3fe4bb10d20147bc9384def4bc81694b
Requested by: @streamer45

Devin Review

Status	Commit
🟢 Reviewed	`e0bba4f`

…emo image Add optional per-plugin CUDA bundle variants alongside the canonical CPU bundle, end-to-end across schema, install client, release pipeline, and the CPU demo image. - marketplace.rs: add backward-compatible variants[] to manifests (empty => byte-identical to pre-variant CPU manifests). - marketplace_installer.rs: resolve bundle by explicit accelerator or CUDA auto-detect (libcuda probe), falling back to the CPU bundle. - build_registry.py / verify_bundles.py: --accelerator passes layer a cuda variant onto the published CPU manifest (append-only, immutable); CUDA NEEDED/RUNPATH deps allowlisted for cuda bundles only. - CI: build-marketplace-cuda job on the self-hosted Ada GPU runner builds + publishes <id>-<ver>-cuda-bundle.tar.zst to the per-plugin release. - Dockerfile.demo: fetch sha256-pinned prebuilt CPU marketplace bundles instead of rebuilding 11 plugins from source; drop the from-source sherpa-onnx copy (bundles vendor their own libs) and the torch-based Helsinki conversion (pull pre-converted model tarballs). Signed-off-by: streamkit-devin <devin@streamkit.dev>

staging-devin-ai-integration · 2026-06-28T13:06:01Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

Disable automatic comment, CI, and merge conflict monitoring

codecov · 2026-06-28T13:08:28Z

Codecov Report

❌ Patch coverage is 85.12195% with 61 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.24%. Comparing base (5faed28) to head (e0bba4f).
⚠️ Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
apps/skit/src/marketplace_installer.rs	81.81%	52 Missing ⚠️
apps/skit/src/plugins.rs	62.50%	6 Missing ⚠️
apps/skit/src/server/plugins.rs	86.36%	3 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #638      +/-   ##
==========================================
+ Coverage   85.22%   85.24%   +0.01%     
==========================================
  Files         249      249              
  Lines       75412    75765     +353     
  Branches     2437     2314     -123     
==========================================
+ Hits        64272    64584     +312     
- Misses      11134    11175      +41     
  Partials        6        6

Flag	Coverage Δ
backend	`85.29% <85.12%> (+0.01%)`	⬆️
ui	`84.71% <ø> (+0.01%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components	Coverage Δ
core	`85.79% <ø> (ø)`
engine	`83.76% <ø> (ø)`
api	`91.14% <ø> (ø)`
nodes	`84.84% <ø> (+<0.01%)`	⬆️
server	`85.22% <85.12%> (+0.03%)`	⬆️
plugin-native	`84.79% <ø> (ø)`
plugin-wasm	`95.41% <ø> (ø)`
ui-services	`86.29% <ø> (+0.02%)`	⬆️
ui-components	`70.18% <ø> (ø)`

Files with missing lines	Coverage Δ
apps/skit/src/config.rs	`97.62% <ø> (ø)`
apps/skit/src/marketplace.rs	`99.30% <100.00%> (+0.09%)`	⬆️
apps/skit/src/plugin_records.rs	`91.66% <ø> (ø)`
apps/skit/src/server/plugins.rs	`83.41% <86.36%> (+0.09%)`	⬆️
apps/skit/src/plugins.rs	`84.82% <62.50%> (-0.28%)`	⬇️
apps/skit/src/marketplace_installer.rs	`82.34% <81.81%> (+0.23%)`	⬆️

... and 2 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

staging-devin-ai-integration

Devin Review found 5 potential issues.

Debug

Playground

staging-devin-ai-integration · 2026-06-28T13:11:42Z

@@ -803,7 +817,7 @@ impl PluginInstaller {
        }


📝 Info: Accelerator marker keyed only by id+version; switching variants requires manual uninstall

The install directory and download cache are keyed by plugin id + version only, so a CPU and CUDA build of the same version cannot coexist. The new .accelerator marker plus the mismatch check in handle_bundle_install (marketplace_installer.rs:811-826) correctly turns a variant switch into an explicit error rather than a silent no-op. Legacy bundles installed before this marker existed read as None and skip the check, so a machine that gains a GPU and auto-detects cuda will get the CPU error message only after the marker is present; for pre-marker installs the variant simply stays whatever was first installed. This is the intended behavior per the marker doc comment, noted here for reviewer awareness of the UX (no automatic re-install on accelerator change).

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

staging-devin-ai-integration · 2026-06-28T13:11:43Z

+  features=()
+  if grep -qE '^[[:space:]]*cuda[[:space:]]*=' "${plugin_dir}/Cargo.toml"; then
+    features=(--features cuda)
+  fi
+
+  echo "Building CUDA plugin: ${plugin} ${features[*]:-}"
+  (
+    cd "${plugin_dir}"
+    CARGO_TARGET_DIR="${target_dir}" cargo build --release "${features[@]}"
+  )


🚩 helsinki/whisper CUDA builds rely on toolkit + dynamic CUDA loading on self-hosted runner

build_official_plugins_cuda.sh builds plugins with --features cuda only when their Cargo.toml declares a cuda = feature. whisper (whisper-rs/cuda) and helsinki (candle-core/cuda etc.) both qualify. These require an NVIDIA CUDA toolkit/nvcc at build time, which the build-marketplace-cuda job assumes is preinstalled on the self-hosted runner (only stated in a comment, not enforced). Candle/cudarc and whisper.cpp CUDA generally dlopen libcudart/libcublas/libcudnn at runtime, which verify_bundles.py --accelerator cuda permits via CUDA_DEP_ALLOWLIST. No code defect, but the CUDA build correctness depends entirely on runner provisioning that CI does not validate.

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

staging-devin-ai-integration · 2026-06-28T13:11:44Z

 # libfdk-aac2 is in Debian's non-free component.
+# sherpa-onnx / ONNX Runtime are NOT installed here: each sherpa-backed
+# bundle vendors its own libs (RUNPATH=$ORIGIN).


🚩 Demo image now depends on marketplace bundles being fully self-contained

The runtime stage no longer installs sherpa-onnx/onnxruntime into /usr/local/lib (previously copied from kokoro-builder), relying on each sherpa-backed bundle vendoring its own libs with RUNPATH=$ORIGIN. whisper's ort (onnxruntime) for Silero VAD and helsinki's Candle are expected to statically link / self-contain. This matches what external marketplace users install, so if a bundle is not self-contained it is a marketplace-wide regression rather than demo-specific. Confirmed whisper uses ort and helsinki uses Candle (not CTranslate2), so no CTranslate2 runtime dependency is missing from the slimmed runtime image. Worth a one-time validation that the demo container actually loads all 12 plugins.

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

staging-devin-ai-integration · 2026-06-28T13:11:45Z


+    # Variant passes (e.g. cuda) layer onto an already-published registry, so
+    # copy the full existing tree forward; untouched plugins must stay present
+    # and only the variant-updated manifests are overwritten below.
+    if accelerator != "cpu":
+        if not args.existing_registry:
+            print(
+                "ERROR: --accelerator other than 'cpu' requires --existing-registry",
+                file=sys.stderr,
+            )
+            return 1
+        src_plugins = existing_registry_path / "plugins"
+        if src_plugins.exists():
+            shutil.copytree(src_plugins, registry_out / "plugins", dirs_exist_ok=True)


📝 Info: CUDA registry pass correctly preserves full tree, index, and pubkey

The cuda accelerator pass copies the existing CPU registry's plugins/ tree forward (build_registry.py:466-475), overwrites only variant-updated manifests, then regenerates index.json from existing_registry (loaded from the CPU registry) and copies streamkit.pub from the checked-out repo default path (build_registry.py:749-751). Since the CUDA job runs actions/checkout, the default docs/public/registry/streamkit.pub exists, so the merged registry uploaded with overwrite=true is complete. The append-only immutability check correctly diffs only non-variants fields for variant passes and carries variants forward on CPU reuse passes. Logic verified as consistent.

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

publish-registry depends on build-marketplace-cuda, which is skipped when build_cuda=false. A skipped dependency fails the implicit success() gate, so CPU registry publishing was silently skipped too. Gate publish-registry on the CPU build succeeding and the CUDA build not failing. Signed-off-by: streamkit-devin <devin@streamkit.dev>

staging-devin-ai-integration

Devin Review found 4 new potential issues.

Debug

Playground

staging-devin-ai-integration · 2026-06-28T14:51:56Z

+        let accelerator = self.resolve_accelerator(request.accelerator.as_deref());
+        let bundle = manifest.resolve_bundle(accelerator.as_deref()).ok_or_else(|| {
            InstallError::Other(anyhow!("Plugin manifest missing required `bundle` section"))
        })?;
+        if let Some(requested) = accelerator.as_deref() {
+            tracing::info!(
+                plugin_id = %manifest.id,
+                requested_accelerator = requested,
+                selected_accelerator = bundle.accelerator,
+                "Resolved plugin bundle variant"
+            );
+        }
        let bundle_url = self


📝 Info: Accelerator auto-detection installs CPU bundle for CPU-only plugins (expected)

resolve_accelerator in apps/skit/src/marketplace_installer.rs:958-971 can return Some("cuda") purely from runtime auto-detection (cuda_runtime_available) even for plugins that declare no CUDA variant. PluginManifest::resolve_bundle (apps/skit/src/marketplace.rs:261-282) then falls back to the canonical CPU bundle when no matching variant exists, so the install still succeeds with the CPU artifact. This is the intended graceful fallback, but note the emitted log line Resolved plugin bundle variant requested=cuda selected=cpu will appear for every install on a CUDA host, which could be confusing during debugging.

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Acknowledged — the CPU-fallback behavior is intended. The log wording (requested=cuda selected=cpu on a CUDA host for CPU-only plugins) is cosmetic; happy to soften it to a debug-level line or note "no cuda variant; using cpu" if you'd prefer, but leaving as-is for now to keep this PR scoped.

build_registry.py requires --signing-key, but the CUDA registry variant step omitted it, so argparse would abort the GPU registry build (and, via the publish-registry guard, block CPU publishing too). Signed-off-by: streamkit-devin <devin@streamkit.dev>

…switch Addresses review findings on the marketplace GPU bundle work: - Make build-marketplace-cuda best-effort (continue-on-error) so a failed GPU job no longer blocks CPU per-plugin releases or registry publishing; simplify the publish-registry guard to gate solely on the CPU build. - Guard the empty Bash array expansion in build_official_plugins_cuda.sh so feature-less plugins build under set -u on Bash < 4.4. - Embed manifest.json in CUDA variant bundles for parity with CPU bundles. - Record the activated accelerator in an installed bundle and return a clear 'already installed as <variant>' error instead of a silent no-op when a request would switch the variant for an already-installed version. Signed-off-by: streamkit-devin <devin@streamkit.dev>

staging-devin-ai-integration

Devin Review found 3 new potential issues.

Debug

Playground

staging-devin-ai-integration · 2026-06-28T15:27:01Z

+  features=()
+  if grep -qE '^[[:space:]]*cuda[[:space:]]*=' "${plugin_dir}/Cargo.toml"; then
+    features=(--features cuda)
+  fi
+
+  echo "Building CUDA plugin: ${plugin} ${features[*]:-}"
+  (
+    cd "${plugin_dir}"
+    CARGO_TARGET_DIR="${target_dir}" cargo build --release ${features[@]+"${features[@]}"}


📝 Info: CUDA-capable plugins relying on sherpa must keep sherpa linkage or the cuda build aborts the whole batch

build_official_plugins_cuda.sh:64-74 errors out (exit 1, aborting the entire script under set -e) for any plugin that declares the cuda accelerator but has neither a cuda Cargo feature nor libsherpa-onnx-c-api.so linkage. Today all six cuda-declared plugins satisfy this (whisper/helsinki have a cuda feature; kokoro/matcha/sensevoice/vad link sherpa), so the batch builds fine. But this is a sharp edge: if a future cuda-declared plugin drops sherpa linkage or its cuda feature, the guard aborts the whole CUDA build (not just that plugin), silently producing zero cuda variants while the job stays green due to continue-on-error.

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Resolved in 6b4b1e8. build_official_plugins_cuda.sh now fails loudly for any cuda-declared plugin that ends up as a plain CPU build: after compiling, if the plugin has no cuda Cargo feature, it readelf-checks the produced artifact for libsherpa-onnx-c-api.so linkage (the EP-agnostic sherpa runtime that build_registry.py packages against GPU sherpa libs). If neither a cuda feature nor sherpa linkage is present, it errors out rather than packaging an unaccelerated .so as the cuda variant.

… variant build Asset types (e.g. Slint's) declared in plugin.yml were dropped by build_manifest, so manifest.json never carried them. The server only learns asset types from a plugin.yml beside the .so, so both raw-extraction consumers (the demo image) and real marketplace installs lost the type. - build_manifest now includes a non-empty assets block (omitted when empty, keeping asset-less manifests byte-identical for the append-only check). - build_bundle embeds a plugin.yml beside the entrypoint so bundles are self-describing, matching what the installer writes post-download. - Bump slint 0.4.0 -> 0.5.0 so the corrected, assets-bearing manifest can be published under the append-only registry; inject the source plugin.yml into the demo's pinned (pre-self-describing) slint bundle in the interim. - build_official_plugins_cuda.sh fails loudly when a cuda-declared plugin has neither a cuda Cargo feature nor sherpa linkage (was silently packaging a CPU .so as the cuda variant). - CUDA variant pass verifies the existing manifest signature before re-signing, matching the CPU reuse path. - Mirror the assets field in check_registry_versions.py. Signed-off-by: streamkit-devin <devin@streamkit.dev>

staging-devin-ai-integration

Devin Review found 5 new potential issues.

Debug

Playground

staging-devin-ai-integration · 2026-06-28T17:06:26Z

+    fn resolve_accelerator(&self, requested: Option<&str>) -> Option<String> {
+        if let Some(value) = requested.map(str::trim).filter(|value| !value.is_empty()) {
+            return Some(value.to_string());
+        }
+        if let Some(value) =
+            self.default_accelerator.as_deref().map(str::trim).filter(|value| !value.is_empty())
+        {
+            return Some(value.to_string());
+        }
+        if cuda_runtime_available() {
+            return Some("cuda".to_string());
+        }
+        None
+    }


🟡 Plugin installs default to GPU builds on machines that only have the graphics driver, causing failures

The installer treats the mere presence of the NVIDIA driver stub (libcuda.so.1 probed by cuda_runtime_available() at apps/skit/src/marketplace_installer.rs:2153) as a signal to auto-select the GPU plugin build whenever no accelerator is configured, so a machine that has the driver but not the CUDA runtime ends up installing a GPU build it cannot load.
Impact: On systems with an NVIDIA driver but no CUDA toolkit, installing a GPU-capable plugin now silently picks the CUDA bundle and fails to load, whereas it previously installed and ran the working CPU build.

Auto-detection heuristic and fallback path

resolve_accelerator (apps/skit/src/marketplace_installer.rs:995-1008) returns, in order, the explicit request, the configured default_accelerator, and finally "cuda" when cuda_runtime_available() is true. The default config value is None (apps/skit/src/config.rs:594), so auto-detection is the effective default. cuda_runtime_available() only dlopens libcuda.so.1, which is present whenever the NVIDIA driver is installed, but the CUDA bundles (e.g. whisper built with --features cuda) additionally need libcudart/libcublas/libcudnn at load time — these are permitted as NEEDED deps by verify_bundles.py but are not vendored into the bundle. resolve_bundle only falls back to CPU when a plugin has no matching variant (apps/skit/src/marketplace.rs:261-282); for cuda-capable plugins it returns the CUDA variant, which then fails dlopen on a runtime-less host.

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

This is already addressed on the current head: CUDA_RUNTIME_LIBS probes both libcuda.so.1 (driver stub) and libcudart.so (runtime), and cuda_runtime_available() only returns true when the full stack is loadable — driver-only hosts fall back to the CPU bundle. The comment's citation of a libcuda.so.1-only probe reflects an older revision.

staging-devin-ai-integration · 2026-06-28T17:06:29Z

+            if not verify_existing_signature(
+                existing["manifest_path"],
+                existing["signature_path"],
+                public_key_path,
+            ):
+                print(
+                    f"ERROR: {plugin_id}@{plugin_version} — existing manifest "
+                    f"signature verification failed before attaching the "
+                    f"'{accelerator}' variant. The manifest may have been "
+                    f"modified after signing; revert the changes or bump the "
+                    f"version to trigger re-signing.",
+                    file=sys.stderr,
+                )
+                return 1
+
+            manifest_dir = registry_out / "plugins" / plugin_id / plugin_version
+            manifest_path = manifest_dir / "manifest.json"
+            write_json(manifest_path, would_be_manifest)
+            sign_manifest(manifest_path, signing_key)


🚩 Variant pass re-signs an already-published manifest, changing its bytes

The cuda variant pass in build_registry.py:560-605 rebuilds, re-signs, and overwrites an existing published manifest.json to append the variants list. The append-only guard only checks that non-variant core fields are unchanged, but the manifest.json bytes and its minisig DO change when the first variant is attached. Clients that cached the prior manifest/signature will see a different (still validly signed) manifest on next fetch. The PR docs describe this as intended ("layers a cuda variant onto the already-published CPU manifest ... append-only; a published variant is immutable"), so the manifest-mutation-on-first-variant is by design — flagging only so reviewers confirm cache/CDN invalidation is handled for the published registry.

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

…eview findings Key install dirs by id+version+accelerator so CPU and CUDA variants of the same version coexist and switching never silently no-ops; removes the .accelerator marker and its fail-open path. resolve_accelerator is probed once and threaded through download_bundle. ActivePluginRecord now tracks the activated accelerator. Build/registry: extract shared manifest_builder (build_manifest + SHERPA libs) so the builder and append-only guard cannot drift; normalize accelerator case in reuse/dedup; drop the dead variants fallback. Whisper bundle build pins SOURCE_DATE_EPOCH + -march=x86-64 for portable ggml. Demo pins ggml-tiny-q5_1.bin sha256 and injects a version-matched slint plugin.yml. Signed-off-by: streamkit-devin <devin@streamkit.dev>

staging-devin-ai-integration

Devin Review found 4 new potential issues.

Debug

Playground

staging-devin-ai-integration · 2026-06-28T20:01:47Z

+    # whisper.cpp/ggml enables GGML_NATIVE (-march=native) by default, which can
+    # emit AVX/AVX512 that SIGILLs on older CPUs than the build runner. The
+    # published bundle must run on any x86-64 host, so pin a portable baseline
+    # (SOURCE_DATE_EPOCH disables ggml's native-default upstream).
+    if [ "${plugin}" = "whisper" ]; then
+      export SOURCE_DATE_EPOCH=1
+      export CFLAGS="-O3 -pipe -fPIC -march=x86-64 -mtune=generic"
+      export CXXFLAGS="-O3 -pipe -fPIC -march=x86-64 -mtune=generic"
+    fi


🚩 Demo now consumes prebuilt whisper 0.3.0 bundle that predates the portability flags

This PR adds portable build flags (SOURCE_DATE_EPOCH=1, -march=x86-64 -mtune=generic) for whisper in scripts/marketplace/build_official_plugins.sh:49-53, mirroring what the now-deleted whisper-builder stage in Dockerfile.demo used to do. However, the demo now pulls the already-published whisper-0.3.0 bundle (Dockerfile.demo:154 pins WHISPER_BUNDLE_SHA256=79212cfa..., matching docs/public/registry/plugins/whisper/0.3.0/manifest.json). Because the registry is append-only, whisper 0.3.0 is reused (never rebuilt) so the new portability flags do NOT apply to the bundle the demo ships. If the published 0.3.0 bundle was compiled with ggml's default GGML_NATIVE (-march=native) on a newer build host, the demo image could SIGILL on older CPUs — a regression from the prior demo which always compiled whisper portably from source. Worth confirming how the published 0.3.0 bundle was actually built (and whether a whisper version bump is needed to rebuild it portably).

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Correct, and worth calling out explicitly: the portability flags here are forward-looking only. Because the registry is append-only, whisper-0.3.0 is reused (never rebuilt), so the bundle the demo actually ships still reflects however 0.3.0 was originally compiled. These flags only take effect the next time whisper's version is bumped and the bundle is rebuilt.

Making the currently shipped 0.3.0 bundle portable would require a whisper version bump to trigger a portable rebuild, which is a separate decision I don't want to make unilaterally inside this PR. Flagging to the maintainer: do you want a whisper version bump in this PR to rebuild 0.3.0 portably, or track it as a follow-up? Until then the fix prevents the regression for all future bundles but does not retroactively fix 0.3.0.

staging-devin-ai-integration · 2026-06-28T20:01:48Z

+        "assets": plugin.get("assets") or None,
+    }


📝 Info: build_manifest now emits assets; only safe because slint was version-bumped

scripts/marketplace/manifest_builder.py:60 adds "assets": plugin.get("assets") or None to the canonical manifest, which the old build_manifest/build_manifest_from_plugin in build_registry.py/check_registry_versions.py did not emit. The registry is append-only: any plugin that already has a committed manifest at its current version AND declares assets in plugin.yml would now produce a differing would-be manifest and fail the immutability/version checks. The only plugin with assets is slint, and it was bumped 0.4.0→0.5.0 in this PR (Cargo.toml + plugin.yml + official-plugins.json), so no committed manifest collides. This is correct for the current tree but is a latent constraint: any future plugin that gains assets without a version bump will trip the append-only guard.

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Acknowledged — this is correct for the current tree (slint is the only plugin with assets and it was bumped 0.4.0→0.5.0, so no committed manifest collides) and the latent constraint is intentional: a plugin that gains assets without a version bump should trip the append-only guard, since changing a published manifest's content without a version bump is exactly what that guard exists to catch. The author would bump the version, same as for any other manifest-affecting metadata change.

staging-devin-ai-integration · 2026-06-28T20:01:49Z

+  build-marketplace-cuda:
+    name: Build Marketplace CUDA Variants
+    needs: [build-marketplace]
+    if: ${{ inputs.build_cuda }}
+    # GPU variants are best-effort: a failure on the (potentially flaky)
+    # self-hosted runner must never regress CPU registry publishing or per-plugin
+    # releases. continue-on-error keeps the overall workflow green so downstream
+    # CPU jobs still run; the failure is still visible on this job.
+    continue-on-error: true
+    # Self-hosted NVIDIA (Ada) runner: CUDA toolkit + nvcc must be preinstalled.
+    # Adjust the labels to match your runner registration if they differ.
+    runs-on: [self-hosted, linux, x64, gpu]


🚩 Release workflow always invokes the CUDA job, which requires the self-hosted GPU runner

build_cuda defaults to true (.github/workflows/marketplace-build.yml:14-18) and marketplace-release.yml:18 calls the reusable workflow without passing build_cuda, so build-marketplace-cuda (which targets runs-on: [self-hosted, linux, x64, gpu]) always runs on release. continue-on-error: true makes the job's conclusion success even if its steps fail, and publish-registry/create-releases use always()/job-success semantics, so a failing GPU build degrades gracefully. However, if no runner with those labels is registered, the job stays queued indefinitely rather than failing — and downstream jobs (needs: [..., build-marketplace-cuda]) wait for it to complete, stalling the entire release. This is acceptable given the workflow is manual (workflow_dispatch) and the only caller, but it does hard-couple releases to GPU runner availability.

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Agreed, and by design for now: the workflow is workflow_dispatch-only with a single caller, so coupling releases to GPU-runner availability is acceptable. If no runner with the [self-hosted, linux, x64, gpu] labels is registered the job stays queued rather than failing, which would stall downstream jobs — the mitigation is operational (ensure the runner is registered, or pass build_cuda=false to skip it). If we later want releases to be fully decoupled from GPU availability we can add a timeout or split the CUDA build into its own non-blocking workflow; tracking that as a follow-up rather than expanding this PR.

staging-devin-ai-integration · 2026-06-28T20:01:50Z

        })?;
    }

-    if tokio::fs::try_exists(&bundles_root).await.unwrap_or(false) {
-        let bundles_root_real =
-            plugin_paths::ensure_existing_dir_under(&base_real, &bundles_root, "bundles").await?;
-        let mut entries = tokio::fs::read_dir(&bundles_root_real).await?;
-        if entries.next_entry().await?.is_none() {
-            let _ = tokio::fs::remove_dir(&bundles_root_real).await;
-        }
-    }
+    remove_dir_if_empty(&base_real, &version_dir, "version").await?;


📝 Info: Uninstall of an accelerator variant orphans sibling variant bundle dirs (no active record)

remove_active_record_and_bundle (apps/skit/src/server/plugins.rs:660-684) now removes only the accelerator-specific bundle dir and leaves sibling variants under the version dir (per the comment). Since there is exactly one active record per plugin id (active/{plugin_id}.json), uninstalling the active variant deletes the record while a previously-extracted sibling variant's bundle dir remains on disk with no record referencing it. The installer's bundle_dir.exists() fast-path (marketplace_installer.rs:829) will reuse it on a later reinstall, so this is intentional caching rather than a leak per se, but operators may be surprised that uninstall leaves variant bundle directories behind under bundles/<id>/<version>/.

(Refers to lines 660-684)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Intentional, and a direct consequence of the re-keying: sibling variant bundle dirs under bundles/<id>/<version>/ are kept as a cache so reinstalling a previously-extracted variant hits the bundle_dir.exists() fast-path instead of re-downloading. remove_dir_if_empty still reaps the <version> and <id> dirs once the last variant is gone, so there's no unbounded leak — only an extracted-but-inactive variant lingers. If operators find the leftover dirs surprising we could add a --prune-variants flag to uninstall, but I'd keep the cache-by-default behavior; tracking the optional flag as a follow-up.

Signed-off-by: streamkit-devin <devin@streamkit.dev>

staging-devin-ai-integration

Devin Review found 4 new potential issues.

Debug

Playground

staging-devin-ai-integration · 2026-07-04T07:52:01Z

+    manifest = strip_none(manifest)
+    if variants:
+        ordered: dict = {}
+        for key, value in manifest.items():
+            ordered[key] = value
+            if key == "bundle":
+                ordered["variants"] = variants
+        manifest = ordered


📝 Info: New ValueError guard fails loud on variants-without-bundle

manifest_builder.py:63-68 now raises ValueError when variants is truthy but bundle_block is None. I traced all callers: in scripts/marketplace/build_registry.py the CPU reuse path passes variants=existing_manifest.get("variants") which is falsy for the common case (never triggers), and the variant pass always supplies a non-None existing_manifest.get("bundle") together with merged_variants. This converts a previously silent data-loss case (variants dropped because they anchor after the bundle key) into an explicit failure — a safe improvement, not a regression.

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Guarded in 026a083: build_manifest now raises ValueError when variants is passed without a bundle block, so a future caller can't silently lose variants.

staging-devin-ai-integration · 2026-07-04T07:52:02Z

        })?;
    }

-    if tokio::fs::try_exists(&bundles_root).await.unwrap_or(false) {
-        let bundles_root_real =
-            plugin_paths::ensure_existing_dir_under(&base_real, &bundles_root, "bundles").await?;
-        let mut entries = tokio::fs::read_dir(&bundles_root_real).await?;
-        if entries.next_entry().await?.is_none() {
-            let _ = tokio::fs::remove_dir(&bundles_root_real).await;
-        }
-    }
+    remove_dir_if_empty(&base_real, &version_dir, "version").await?;
+    remove_dir_if_empty(&base_real, &bundles_root, "bundles").await?;


📝 Info: Accelerator-keyed bundle removal is consistent with installer layout

The refactored remove_active_record_and_bundle (apps/skit/src/server/plugins.rs:660-685) computes version_dir/<accelerator> for new records and version_dir for legacy (empty-accelerator) records, then prunes empty parents via remove_dir_if_empty. I verified this matches the installer, which writes bundles to bundles/<id>/<version>/<accelerator>/ (marketplace_installer.rs:1211-1215) and stores the accelerator on the active record (activate_bundle at 1301-1309). The integration test test_uninstall_marketplace_plugin_removes_bundle sets accelerator: "cpu" but stages the bundle directly under bundles/gain/1.0.0/ (no accelerator subdir); it still passes only because unload_plugin(remove_file=true) deletes the entrypoint .so (plugins.rs:875-885), emptying the version dir so remove_dir_if_empty removes it. The accelerator-keyed removal itself is a no-op in that test's layout — not a production bug, but the test does not actually exercise the new accelerator-subdir removal path.

(Refers to lines 660-685)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Good catch — fixed in 9684702: test_uninstall_marketplace_plugin_removes_bundle now stages the bundle under bundles/gain/1.0.0/cpu/ (matching the installer's accelerator-keyed layout) and additionally asserts the empty version dir is pruned, so the accelerator-subdir removal path is actually exercised. Verified the test passes locally.

staging-devin-ai-integration · 2026-07-04T07:52:03Z

+FROM bundle-deps AS slint-bundle
+ARG SLINT_PLUGIN_VERSION=0.4.0
+ARG SLINT_BUNDLE_SHA256=2b33767cdd35b9eeba3bec4e1801aa020a59ef9a61c54b28f7c5d71fa087034d
+
+WORKDIR /bundle
+COPY plugins/native/slint/plugin.yml plugin.yml.src
+RUN set -eux; \
+    url="https://github.com/streamer45/streamkit/releases/download/plugin-slint-v${SLINT_PLUGIN_VERSION}/slint-${SLINT_PLUGIN_VERSION}-bundle.tar.zst"; \
+    curl --proto '=https' --tlsv1.2 -fsSL -o slint-bundle.tar.zst "$url"; \
+    echo "${SLINT_BUNDLE_SHA256}  slint-bundle.tar.zst" | sha256sum -c -; \
+    mkdir -p extracted; \
+    tar --zstd -xf slint-bundle.tar.zst -C extracted; \
+    test -f extracted/libslint.so; \
+    sed "s/^version:.*/version: ${SLINT_PLUGIN_VERSION}/" plugin.yml.src > extracted/plugin.yml


🔍 Demo image pins slint 0.4.0 while the repo bumps slint to 0.5.0

Dockerfile.demo pins SLINT_PLUGIN_VERSION=0.4.0 (line 426) and injects a version-matched plugin.yml, while plugins/native/slint/{Cargo.toml,plugin.yml} and official-plugins.json bump slint to 0.5.0. This is intentional per the in-file comments (the 0.5.0 self-describing bundle is not yet published, so the demo keeps the last published bundle and injects plugin.yml), but it means the demo image will lag the registry until someone updates the pin. Confirming the 0.4.0 bundle + SLINT_BUNDLE_SHA256 remain downloadable is worthwhile before merge, since the demo build will hard-fail on a sha mismatch or 404.

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Verified: downloaded plugin-slint-v0.4.0/slint-0.4.0-bundle.tar.zst from the release and its sha256 matches the pinned SLINT_BUNDLE_SHA256 (2b33767c…). The 0.4.0 pin is intentional until a self-describing 0.5.0 bundle is published; the in-file comment flags updating the pin at that point.

…uard variant manifests Signed-off-by: streamkit-devin <devin@streamkit.dev>

staging-devin-ai-integration

Devin Review found 1 new potential issue.

Debug

Playground

staging-devin-ai-integration · 2026-07-04T07:56:31Z

+    if [ "${plugin}" = "whisper" ]; then
+      export SOURCE_DATE_EPOCH=1
+      export CFLAGS="-O3 -pipe -fPIC -march=x86-64 -mtune=generic"
+      export CXXFLAGS="-O3 -pipe -fPIC -march=x86-64 -mtune=generic"
+    fi
+    CARGO_TARGET_DIR="${target_dir}" cargo build --release ${features[@]+"${features[@]}"}


📝 Info: Whisper CFLAGS scoped to subshell in CUDA build script

scripts/marketplace/build_official_plugins_cuda.sh:61-65 exports SOURCE_DATE_EPOCH/CFLAGS/CXXFLAGS only inside the ( cd ... ) subshell, so they do not leak to subsequent plugins in the loop. For whisper has_cuda_feature is 1, so --features cuda is used and the final sherpa-linkage guard (lines 75-86) is correctly skipped. The portable-baseline flags affect only host C/C++ ggml kernels, not nvcc GPU kernels, matching the CPU build script's behavior.

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Signed-off-by: streamkit-devin <devin@streamkit.dev>

staging-devin-ai-integration

Devin Review found 1 new potential issue.

Debug

Playground

staging-devin-ai-integration · 2026-07-04T07:59:59Z

    assert!(!tokio::fs::try_exists(&record_path).await.unwrap(), "Active record should be removed");
    assert!(
        !tokio::fs::try_exists(&bundle_dir).await.unwrap(),
-        "Bundle directory should be removed"
+        "Accelerator variant directory should be removed"
+    );
+    assert!(
+        !tokio::fs::try_exists(&version_dir).await.unwrap(),
+        "Empty version directory should be pruned"
    );


📝 Info: Uninstall test correctly exercises accelerator-keyed layout

The updated test_uninstall_marketplace_plugin_removes_bundle places the entrypoint under bundles/gain/1.0.0/cpu/ and sets accelerator: "cpu" on the record. This matches the production logic in apps/skit/src/server/plugins.rs:666-685: a non-empty accelerator resolves bundle_dir = version_dir/cpu, which is removed, after which remove_dir_if_empty prunes the now-empty version_dir and bundles_root. The two new assertions (variant dir removed, empty version dir pruned) are consistent with this flow. No behavioral gap found in the incremental change.

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

…lugins Expose the active accelerator on PluginSummary (from active records on restart and from the installer on fresh installs), add an accelerator selector (auto-detect/cpu/cuda) to the marketplace details pane wired to InstallPluginRequest.accelerator, and show the active variant in the installed plugins list. Signed-off-by: streamkit-devin <devin@streamkit.dev>

staging-devin-ai-integration

Devin Review found 0 new potential issues.

Debug

Playground

streamer45 self-assigned this Jun 28, 2026

streamer45 self-requested a review June 28, 2026 13:06

staging-devin-ai-integration Bot commented Jun 28, 2026

View reviewed changes

This comment was marked as resolved.

Sign in to view

staging-devin-ai-integration Bot commented Jun 28, 2026

View reviewed changes

chore(ci): retrigger checks after apt mirror flake in servo lint

3e712cb

Signed-off-by: streamkit-devin <devin@streamkit.dev>

staging-devin-ai-integration Bot commented Jul 4, 2026

View reviewed changes

fix(marketplace): apply whisper portability flags to CUDA build and g…

026a083

…uard variant manifests Signed-off-by: streamkit-devin <devin@streamkit.dev>

staging-devin-ai-integration Bot commented Jul 4, 2026

View reviewed changes

test(server): exercise accelerator-keyed bundle removal on uninstall

9684702

Signed-off-by: streamkit-devin <devin@streamkit.dev>

staging-devin-ai-integration Bot commented Jul 4, 2026

View reviewed changes

staging-devin-ai-integration Bot mentioned this pull request Jul 4, 2026

chore(deps): cargo-deny advisories failing on main (anyhow, quick-xml, ttf-parser, wasmtime-wasi) #648

Closed

staging-devin-ai-integration Bot commented Jul 4, 2026

View reviewed changes

Conversation

staging-devin-ai-integration Bot commented Jun 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Review & Validation

Notes

Devin Review

Uh oh!

staging-devin-ai-integration Bot commented Jun 28, 2026

🤖 Devin AI Engineer

Uh oh!

codecov Bot commented Jun 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

staging-devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

staging-devin-ai-integration Bot Jun 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

staging-devin-ai-integration Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

staging-devin-ai-integration Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

staging-devin-ai-integration Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

staging-devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

staging-devin-ai-integration Bot Jun 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

staging-devin-ai-integration Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

This comment was marked as resolved.

Uh oh!

staging-devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

staging-devin-ai-integration Bot Jun 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

staging-devin-ai-integration Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

staging-devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

staging-devin-ai-integration Bot Jun 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

staging-devin-ai-integration Bot Jul 4, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

staging-devin-ai-integration Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

staging-devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

staging-devin-ai-integration Bot Jun 28, 2026

Choose a reason for hiding this comment

staging-devin-ai-integration Bot commented Jun 28, 2026 •

edited

Loading

codecov Bot commented Jun 28, 2026 •

edited

Loading

staging-devin-ai-integration Bot Jun 28, 2026 •

edited

Loading

staging-devin-ai-integration Bot Jun 28, 2026 •

edited

Loading

staging-devin-ai-integration Bot Jun 28, 2026 •

edited

Loading

staging-devin-ai-integration Bot Jun 28, 2026 •

edited

Loading

staging-devin-ai-integration Bot Jul 4, 2026 •

edited

Loading