fix: GDN F.conv1d fallback is missing the groups argument + out_norm compatibility with Ascend NPU #94
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -138,8 +138,11 @@ def build_model( | |
| lm_model = model.language_model if hasattr(model, 'language_model') else model | ||
| for layer in lm_model.decoder.layers: | ||
| if hasattr(layer.self_attention, 'out_norm'): | ||
| - assert hasattr(layer.self_attention.out_norm, 'zero_centered_gamma') | ||
| - layer.self_attention.out_norm.zero_centered_gamma = False | ||
| + out_norm = layer.self_attention.out_norm | ||
| + if hasattr(out_norm, 'zero_centered_gamma'): | ||
| + out_norm.zero_centered_gamma = False | ||
| + elif hasattr(out_norm, 'config'): | ||
| + out_norm.config.layernorm_zero_centered_gamma = False | ||
|
Collaborator
The config object is shared, so this change would set config.layernorm_zero_centered_gamma to False everywhere else as well. As far as I remember, only out_norm needs layernorm_zero_centered_gamma set to False.
Author
Hello, thanks for the reply! We currently want to train qwen3.6-27b, and verification showed that this is indeed wrong. We are now adopting |
||
| return model | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,5 +1,5 @@ | ||
| # Make sure to modify __release_datetime__ to release time when making official release. | ||
| - __version__ = '1.4.0.dev0' | ||
| + __version__ = '1.4.0' | ||
| # default release datetime for branches under active development is set | ||
| # to be a time far-far-away-into-the-future | ||
| - __release_datetime__ = '2099-12-31 23:59:59' | ||
| + __release_datetime__ = '2026-05-17 23:59:59' |
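Before accessing out_norm.config.layernorm_zero_centered_gamma, consider checking that the attribute actually exists. MindSpeed's RMSNorm on Ascend NPU is expected to carry this config, but a hasattr check is safer and avoids an AttributeError in other environments or with different versions of the config object. Also, because the config object is usually shared across multiple layers, modifying this attribute may have a global effect; please confirm that this is intended.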
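The shared-config concern raised in both review comments can be reproduced with a small, self-contained sketch. Everything here is hypothetical and for illustration only: `ToyNorm` and `disable_zero_centered_gamma` are made-up names, `SimpleNamespace` stands in for the real config class, and the copy-before-write approach is one possible workaround, not necessarily the fix this PR ended up adopting.

```python
import copy
from types import SimpleNamespace


class ToyNorm:
    """Hypothetical stand-in for a norm module that holds a reference to a config."""

    def __init__(self, config):
        self.config = config


def disable_zero_centered_gamma(norm):
    """Turn the flag off for this norm only (illustrative helper, not a real API)."""
    if hasattr(norm, 'zero_centered_gamma'):
        # The module exposes its own attribute, so no shared state is involved.
        norm.zero_centered_gamma = False
    elif hasattr(getattr(norm, 'config', None), 'layernorm_zero_centered_gamma'):
        # Shallow-copy first so the write lands on a private config object
        # instead of the one shared by every other layer.
        norm.config = copy.copy(norm.config)
        norm.config.layernorm_zero_centered_gamma = False


# One config object shared by two modules, as in the reviewed code path.
shared = SimpleNamespace(layernorm_zero_centered_gamma=True)
out_norm = ToyNorm(shared)
other_norm = ToyNorm(shared)

disable_zero_centered_gamma(out_norm)
```

After the call, `out_norm.config` is a private copy with the flag off, while `shared` and `other_norm.config` keep their original value. This only helps if the norm reads the flag from `self.config` at forward time; a module that caches the value at construction would need a different fix.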