Fix scheduler stepping and label dtype handling in train.py and eval.py (#152)#221

Open
egekaya1 wants to merge 1 commit into ML4SCI:main from egekaya1:fix/scheduler-label-dtype-152
Conversation

@egekaya1
  • scheduler.step(loss) -> scheduler.step(): CosineAnnealingWarmRestarts does not accept a metric argument; passing the loss caused it to be treated as an epoch override, breaking the cosine LR cycle entirely.

  • labels.type(torch.LongTensor).to(device) -> labels.to(device, dtype=torch.long) in train.py: torch.LongTensor is CPU-specific, so the old form allocated an intermediate CPU tensor on every batch before transferring to the device.

  • batch_y.type(torch.LongTensor) -> batch_y.to(dtype=torch.long) in eval.py: the same dtype issue, plus the original had no .to(device) call at all, so labels never moved to the GPU; it only worked by accident because the logits were pulled back to the CPU before metric computation.

While reviewing eval.py, I noticed that micro_auroc is initialised as an empty
list and never populated, so np.mean(micro_auroc) always returns NaN. This
is already tracked in #164 and is out of scope here, but I'm flagging it for
whoever picks that up.
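A two-line repro of that symptom (variable name taken from the report; the surrounding eval code is omitted):

```python
import numpy as np

micro_auroc = []                # initialised but never appended to
result = np.mean(micro_auroc)   # emits a RuntimeWarning and yields nan
```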

Fixes #152


eval.py was missed by all prior fix attempts (ML4SCI#156, ML4SCI#168, ML4SCI#173).


Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

Incorrect scheduler stepping and label dtype handling in training loop

1 participant