
fix: missing log in multitask training#5382

Open
iProzd wants to merge 2 commits into deepmodeling:master from iProzd:0407_fix_mul_log

Conversation

@iProzd
Member

@iProzd iProzd commented Apr 7, 2026

Fix the missing log during multitask training, introduced by #4850.

Summary by CodeRabbit

  • Refactor

    • Cleaner multi-task training logs: task-specific training metrics are always shown, and validation metrics are emitted only when valid results are present to avoid incomplete validation output.
  • Tests

    • Added post-training checks to ensure the training log contains expected model-probability columns and that data rows match the header column count.

@iProzd iProzd requested review from OutisLi, Copilot and njzjz April 7, 2026 08:26
@github-actions github-actions bot added the Python label Apr 7, 2026
@iProzd iProzd requested a review from anyangml April 7, 2026 08:26
@dosubot dosubot bot added the bug label Apr 7, 2026
@coderabbitai
Contributor

coderabbitai bot commented Apr 7, 2026

📝 Walkthrough

Walkthrough

Refactored multi-task logging in deepmd/pt/train/training.py to always emit per-task training metrics when rank == 0 and to emit per-task validation metrics only when valid_results[_key] is truthy; added post-training assertions in source/tests/pt/test_multitask.py to validate lcurve.out headers and row counts against configured model probabilities.

Changes

Multi-task training logging restructure — deepmd/pt/train/training.py
Restructured the rank == 0 logging block: training metrics for each task are logged unconditionally under the rank guard; validation metrics logging is moved into a nested if valid_results[_key]: block.

Multitask test additions — source/tests/pt/test_multitask.py
Added assertions after training to parse lcurve.out: extract header columns (skipping comments), ensure every model_prob key appears in at least one column, and verify the first data row has the same column count as the header.
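The lcurve.out checks described above can be sketched roughly as follows. This is a hedged, self-contained sketch: the helper name `check_lcurve`, the sample file layout, and the `model_prob` dict shape are assumptions for illustration; the real assertions live in source/tests/pt/test_multitask.py.

```python
def check_lcurve(text, model_prob):
    """Hypothetical sketch of the post-training lcurve.out checks."""
    lines = text.splitlines()
    # Header columns come from the last comment line, with the "#" stripped.
    header = [ln for ln in lines if ln.startswith("#")][-1]
    columns = header.lstrip("# ").split()
    # Every model key must appear in at least one column name.
    for key in model_prob:
        assert any(key in col for col in columns), f"no column for {key}"
    # The first data row must have the same column count as the header.
    data_lines = [ln for ln in lines if not ln.startswith("#")]
    assert len(data_lines[0].split()) == len(columns), "column count mismatch"
    return columns

sample = (
    "# step model_1_trn model_2_trn lr\n"
    "      0   1.23e-01   2.34e-01   1e-03\n"
)
print(check_lcurve(sample, {"model_1": 0.5, "model_2": 0.5}))
# → ['step', 'model_1_trn', 'model_2_trn', 'lr']
```

A missing per-task column or a truncated data row would trip one of the two assertions, which is exactly the failure mode the original bug produced.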

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning — Docstring coverage is 0.00%, below the required threshold of 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)
  • Description Check ✅ Passed — Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed — The title 'fix: missing log in multitask training' directly and specifically describes the main change: addressing missing log output in multitask training, which aligns with the changeset that restructures the validation metrics logging in the training code.


Contributor

Copilot AI left a comment


Copilot wasn't able to review any files in this pull request.



Comment on lines 1285 to +1306
train_results[_key] = log_loss_train(
    loss, more_loss, _task_key=_key
)
valid_results[_key] = log_loss_valid(_task_key=_key)
if self.rank == 0:
    log.info(
        format_training_message_per_task(
            batch=display_step_id,
            task_name=_key + "_trn",
            rmse=train_results[_key],
            learning_rate=cur_lr,
        )
    )
    if valid_results[_key]:
        log.info(
            format_training_message_per_task(
                batch=display_step_id,
                task_name=_key + "_val",
                rmse=valid_results[_key],
                learning_rate=None,
            )
        )

Copilot AI Apr 7, 2026


The indentation change moves valid_results[_key] = log_loss_valid(...) (and both log messages) into the same inner block as train_results[_key] = .... Previously, validation collection/logging happened at a broader scope; if that outer condition isn’t always entered (e.g., loss not computed/filtered for a task), this will skip validation results and logs for that task, potentially reintroducing the ‘missing log’ behavior. Consider restoring valid_results assignment and the rank==0 logging block to the prior scope (keeping only the intended logging fix), so per-task validation/logging isn’t coupled to that inner conditional.
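The scoping concern above can be illustrated with a minimal runnable sketch. The stub helpers below stand in for deepmd's `log_loss_train`, `log_loss_valid`, and the logging call; their names mirror the diff, but the return shapes and the two-task setup are assumptions for illustration only.

```python
def log_loss_train(key):
    # Stub for the real deepmd helper: per-task training metrics.
    return {f"{key}_rmse": 0.1}

def log_loss_valid(key):
    # Stub: returns an empty dict when a task has no validation data.
    return {f"{key}_rmse": 0.2} if key != "model_2" else {}

rank = 0
train_results, valid_results, logged = {}, {}, []
for _key in ("model_1", "model_2"):
    train_results[_key] = log_loss_train(_key)
    valid_results[_key] = log_loss_valid(_key)  # collected for every task
    if rank == 0:
        logged.append(_key + "_trn")            # training metrics always logged
        if valid_results[_key]:                 # validation only when present
            logged.append(_key + "_val")

print(logged)  # → ['model_1_trn', 'model_1_val', 'model_2_trn']
```

With validation collection kept at the outer per-task scope, a task whose inner training branch is skipped or whose validation set is empty simply omits its `_val` line, rather than dropping its log output entirely.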

@codecov

codecov bot commented Apr 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.38%. Comparing base (adf5b83) to head (2a88470).
⚠️ Report is 1 commit behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5382      +/-   ##
==========================================
- Coverage   80.40%   80.38%   -0.02%     
==========================================
  Files         812      812              
  Lines       83476    83477       +1     
  Branches     4050     4050              
==========================================
- Hits        67115    67106       -9     
- Misses      15139    15147       +8     
- Partials     1222     1224       +2     


@anyangml
Collaborator

anyangml commented Apr 7, 2026

maybe add a simple UT to count number of columns in lcurve.out?

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@source/tests/pt/test_multitask.py`:
- Line 76: Rename the ambiguous comprehension variable `l` in the list
comprehension inside test_multitask.py to a clearer name (e.g., `line`) so ruff
E741 is avoided; update the expression `data_lines = [l for l in lines if not
l.startswith("#")]` to use the new identifier (`line`) and ensure any adjacent
references in the same scope match the rename.

📥 Commits

Reviewing files that changed from the base of the PR and between 739bd1a and 2a88470.

📒 Files selected for processing (1)
  • source/tests/pt/test_multitask.py

len(cols_for_model), 0, f"No lcurve columns found for {mk}"
)
# data line column count should match header
data_lines = [l for l in lines if not l.startswith("#")]
Contributor


⚠️ Potential issue | 🟡 Minor

Rename ambiguous variable l to pass CI.

The variable name l is flagged by ruff (E741) as ambiguous since it can be confused with 1 or I. As per coding guidelines, ruff check . must pass before committing.

Proposed fix
-        data_lines = [l for l in lines if not l.startswith("#")]
+        data_lines = [line for line in lines if not line.startswith("#")]
🧰 Tools
🪛 Ruff (0.15.9)

[error] 76-76: Ambiguous variable name: l

(E741)



Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants