Include Qwen shared experts in MoE LoRA #720

Open
vivekkalyan wants to merge 3 commits into main from codex/lora-target-presets

vivekkalyan (Collaborator) commented Jun 5, 2026

Summary

  • include Qwen3.5/Qwen3.6 MoE shared-expert gate/up/down targets in the default LoRA target set
  • wrap and export shared-expert LoRA weights alongside routed grouped experts
  • preserve shared-expert target modules when converting fused MoE adapters to vLLM format
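
A minimal sketch of what including shared-expert targets in a default target set could look like. This is not the PR's code: the module paths follow the Hugging Face Qwen MoE naming style (`mlp.shared_expert.gate_proj` and so on), which is an assumption for Qwen3.5/Qwen3.6, and `select_lora_targets` and `is_shared_expert_projection` are hypothetical helpers. It also assumes "gate/up/down" refers to the `gate_proj`/`up_proj`/`down_proj` projections rather than the router or the scalar shared-expert gate. Routed experts are left out to keep the example short.

```python
# Hypothetical projection names, modeled on Hugging Face Qwen MoE modules (assumption).
EXPERT_PROJECTIONS = ("gate_proj", "up_proj", "down_proj")
ATTENTION_PROJECTIONS = ("q_proj", "k_proj", "v_proj", "o_proj")


def is_shared_expert_projection(module_name: str) -> bool:
    """True for `...shared_expert.{gate,up,down}_proj`, false for everything else."""
    parts = module_name.split(".")
    return (
        len(parts) >= 2
        and parts[-2] == "shared_expert"
        and parts[-1] in EXPERT_PROJECTIONS
    )


def select_lora_targets(module_names, include_shared_experts=True):
    """Pick attention projections and, optionally, shared-expert projections."""
    selected = []
    for name in module_names:
        leaf = name.rsplit(".", 1)[-1]
        if is_shared_expert_projection(name):
            if include_shared_experts:
                selected.append(name)
        elif leaf in ATTENTION_PROJECTIONS:
            selected.append(name)
    return selected


# Illustrative module names only; a real model has many layers and more modules.
example_modules = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.mlp.gate",                     # router, not a LoRA target here
    "model.layers.0.mlp.shared_expert.gate_proj",
    "model.layers.0.mlp.shared_expert.up_proj",
    "model.layers.0.mlp.shared_expert.down_proj",
    "model.layers.0.mlp.shared_expert_gate",       # scalar gate, not a projection
]

targets = select_lora_targets(example_modules)
```

Matching on the last two path components, rather than on the leaf name alone, keeps `shared_expert_gate` and the router `gate` out of the target set while still catching the three shared-expert projections.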

Validation

  • Sky H200 validation on art-qwen35-shared-expert-lora-test: created a rank-8 Qwen3.5 MoE ART adapter, published it to vLLM format, round-tripped it back into Megatron, compared its structure against the run57 Tinker adapter, and loaded it with the stock vLLM LoRA loader, which loaded the shared-expert gate/up/down weights and the fused routed experts successfully.
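
The structure comparison in the validation step can be pictured roughly as follows. This is a sketch under stated assumptions, not the script used for the run: `compare_adapter_structure` is a hypothetical helper, adapters are represented as plain `{tensor_name: shape}` mappings, and the tensor names and dimensions below are made up for illustration (only the rank of 8 comes from the description above).

```python
def compare_adapter_structure(candidate, reference):
    """Compare two adapters given as {tensor_name: shape} mappings.

    Reports tensors missing from the candidate, tensors the reference
    does not have, and tensors present in both with different shapes.
    """
    candidate_keys = set(candidate)
    reference_keys = set(reference)
    shape_mismatches = {
        key: (tuple(candidate[key]), tuple(reference[key]))
        for key in sorted(candidate_keys & reference_keys)
        if tuple(candidate[key]) != tuple(reference[key])
    }
    return {
        "missing": sorted(reference_keys - candidate_keys),
        "unexpected": sorted(candidate_keys - reference_keys),
        "shape_mismatches": shape_mismatches,
    }


# Illustrative shapes: rank 8, hypothetical hidden size 2048 and expert width 512.
reference_adapter = {
    "layers.0.mlp.shared_expert.gate_proj.lora_A.weight": (8, 2048),
    "layers.0.mlp.shared_expert.gate_proj.lora_B.weight": (512, 8),
    "layers.0.mlp.shared_expert.down_proj.lora_A.weight": (8, 512),
    "layers.0.mlp.shared_expert.down_proj.lora_B.weight": (2048, 8),
}

# A candidate that dropped the shared-expert down projection during conversion.
candidate_adapter = {
    "layers.0.mlp.shared_expert.gate_proj.lora_A.weight": (8, 2048),
    "layers.0.mlp.shared_expert.gate_proj.lora_B.weight": (512, 8),
}

report = compare_adapter_structure(candidate_adapter, reference_adapter)
```

An empty report (no missing, unexpected, or mismatched tensors) would indicate that the two adapters share the same structure; here the report flags the two dropped `down_proj` tensors as missing.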
