
feat(pt_expt): add eval_typeebd, eval_descriptor, eval_fitting_last_layer #5391

Open
wanghan-iapcm wants to merge 10 commits into deepmodeling:master from wanghan-iapcm:feat-pt-expt-eval-diagnostic

Conversation

@wanghan-iapcm
Collaborator

@wanghan-iapcm wanghan-iapcm commented Apr 11, 2026

Summary

  • Add eval_typeebd, eval_descriptor, eval_fitting_last_layer to pt_expt's DeepEval, using the eager _dpmodel for diagnostic computation
  • Add get_dp_atomic_model() API to the model hierarchy (make_model, FrozenModel, SpinModel) for clean access to the underlying DPAtomicModel
  • Add set_return_middle_output to GeneralFitting with proper FittingOutputDef registration — fitting_check_output now evaluates output_def() dynamically
  • Preserve middle_output through DipoleFitting/PolarFitting call() by modifying the results dict in-place
  • Add cross-backend reference values for fparam_aparam model (descriptor + fit_ll)
  • Remove .pte/.pt2 skips in test_models.py for test_descriptor and test_fitting_last_layer
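The middle-output toggle above can be sketched with a minimal self-contained example. TinyFitting is an illustrative stand-in for GeneralFitting (the class, weights, and shapes are made up for the sketch, not the actual deepmd implementation): when the toggle is on, the output dict carries the last hidden-layer activations under "middle_output" alongside the primary output.

```python
import numpy as np

class TinyFitting:
    """Illustrative stand-in for GeneralFitting's middle-output toggle."""

    def __init__(self, in_dim, hidden):
        rng = np.random.default_rng(0)
        self.w1 = rng.standard_normal((in_dim, hidden))
        self.w2 = rng.standard_normal((hidden, 1))
        self.eval_return_middle_output = False

    def set_return_middle_output(self, enable=True):
        self.eval_return_middle_output = enable

    def __call__(self, descriptor):
        middle = np.tanh(descriptor @ self.w1)  # output of the last hidden layer
        out = {"energy": middle @ self.w2}
        if self.eval_return_middle_output:
            out["middle_output"] = middle
        return out

ft = TinyFitting(in_dim=4, hidden=8)
desc = np.ones((2, 4))                      # (nframes * natoms, in_dim)
assert "middle_output" not in ft(desc)      # off by default
ft.set_return_middle_output(True)
assert ft(desc)["middle_output"].shape == (2, 8)
```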

Limitations

  • Spin models: eval_descriptor and eval_fitting_last_layer raise NotImplementedError (SpinModel preprocesses coords with virtual atoms)
  • DPZBLModel: get_dp_atomic_model() returns None (LinearAtomicModel has no single descriptor/fitting_net)
  • model_check_output still caches output_def at init (only fitting_check_output is dynamic) — not an issue since middle_output is consumed at fitting level
  • _middle_output_def() added to all fitting subclasses but only tested for InvarFitting, DipoleFitting, PolarFitting (not PropertyFitting, DOSFitting)
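The get_dp_atomic_model() contract from the limitations above can be sketched with stand-in classes (these are illustrative stubs, not the actual deepmd classes): wrappers delegate to a single underlying atomic model when one exists, and return None for linear combinations such as DPZBLModel.

```python
class DPAtomicModel:
    """Stand-in for the plain descriptor + fitting atomic model."""

class LinearAtomicModel:
    """Stand-in for DPZBL's linear combination of atomic models."""

class WrappedModel:
    """Stand-in for the make_model / FrozenModel / SpinModel wrappers."""

    def __init__(self, atomic_model):
        self.atomic_model = atomic_model

    def get_dp_atomic_model(self):
        # Expose the underlying DPAtomicModel when there is exactly one;
        # return None otherwise (a LinearAtomicModel has no single
        # descriptor/fitting_net to expose).
        if isinstance(self.atomic_model, DPAtomicModel):
            return self.atomic_model
        return None

assert isinstance(WrappedModel(DPAtomicModel()).get_dp_atomic_model(), DPAtomicModel)
assert WrappedModel(LinearAtomicModel()).get_dp_atomic_model() is None
```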

Test plan

  • 10 dpmodel middle_output tests (shape, toggle, output_def registration, decorator validation, dipole/polar passthrough)
  • 6 eval diagnostic tests (typeebd, descriptor, fitting_last_layer for se_e2_a and DPA1)
  • 4 get_dp_atomic_model tests (energy, ZBL, spin, frozen)
  • 3 spin model diagnostic tests (typeebd works, descriptor/fitting raises)
  • 2 ASE neighbor list consistency tests
  • Cross-backend consistency via fparam_aparam reference values in test_models.py

Summary by CodeRabbit

  • New Features

    • Inference APIs now expose type-embedding, descriptor, and fitting-last-layer outputs; models may return intermediate fitting-network outputs and higher-level fitters propagate them.
    • New public accessor to retrieve an underlying atomic model from wrapped models.
  • Refactor

    • Consolidated input preparation and evaluation flow; multi-output handling and output-definition validation unified.
  • Tests

    • Large expansion of unit and integration tests and test data for new outputs, toggles, determinism, and model variants.

Han Wang added 6 commits April 7, 2026 21:03
Add pt_expt backend (.pte/.pt2) to the parameterized extensions in
test_models.py to verify convert-backend works for the new exportable
formats.

The fparam_aparam model (1 atom type) is switched from type_one_side=False
to type_one_side=True (with ndim 2→1), which is equivalent for single-type
models but enables pt_expt export. Models with type_one_side=False and
multiple types (se_e2_a, se_e2_r) are skipped for .pte/.pt2 as make_fx
cannot trace data-dependent indexing in NetworkCollection(ndim=2).

tests/pt/__init__.py may set a fake default device for CPU fallback,
which poisons AOTInductor compilation. Temporarily clear the default
device before converting to .pt2, matching the pattern used in
test_change_bias.py.
…ayer

Implement three diagnostic eval methods for the pt_expt backend's DeepEval,
using the eager _dpmodel (deserialized from model.json in the archive) to
compute intermediate results directly.

Key changes:
- Add get_dp_atomic_model() API to model hierarchy (make_model CM,
  FrozenModel, SpinModel) for clean access to DPAtomicModel
- Add set_return_middle_output / middle_output support in GeneralFitting
- Extract _prepare_inputs helper from _eval_model for reuse
- Preserve _call_common output dict in DipoleFitting/PolarFitting call()
  so middle_output flows through without ad-hoc propagation
- Remove .pte/.pt2 skips in cross-backend test_descriptor/test_fitting_last_layer
- Add fparam_aparam descriptor/fit_ll reference values to testcase YAML

Limitations:
- SpinModel: eval_descriptor/eval_fitting_last_layer raise NotImplementedError
  (virtual atom preprocessing required)
- DPZBLModel/LinearEnergyModel: raise NotImplementedError
  (no single descriptor/fitting_net)
- se_e2_a + fparam: cannot export to .pte (torch.export limitation)

Make fitting_check_output evaluate output_def() dynamically so that
middle_output is validated when enabled via set_return_middle_output.
Each fitting subclass includes _middle_output_def() in its output_def().
# Conflicts:
#	deepmd/pt_expt/infer/deep_eval.py
#	source/tests/pt_expt/infer/test_deep_eval.py
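The dynamic fitting_check_output behavior described in the last commit can be sketched as a self-contained decorator. This is a simplified stand-in for the real decorator in deepmd/dpmodel/output_def.py; the toy class and SimpleNamespace output defs are illustrative only.

```python
from types import SimpleNamespace

def fitting_check_output(cls):
    """Sketch: validate returned keys against output_def() evaluated per call."""
    orig_call = cls.__call__

    def checked_call(self, *args, **kwargs):
        ret = orig_call(self, *args, **kwargs)
        # Re-evaluate output_def() on every call, so outputs registered
        # dynamically (like middle_output) are accepted.
        allowed = {d.name for d in self.output_def()}
        unknown = set(ret) - allowed
        if unknown:
            raise KeyError(f"undeclared outputs: {sorted(unknown)}")
        return ret

    cls.__call__ = checked_call
    return cls

@fitting_check_output
class ToyFitting:
    def __init__(self):
        self.return_middle = False

    def output_def(self):
        defs = [SimpleNamespace(name="energy")]
        if self.return_middle:
            defs.append(SimpleNamespace(name="middle_output"))
        return defs

    def __call__(self):
        out = {"energy": 0.0}
        if self.return_middle:
            out["middle_output"] = 0.0
        return out

ft = ToyFitting()
assert set(ft()) == {"energy"}                   # validated against current def
ft.return_middle = True
assert set(ft()) == {"energy", "middle_output"}  # accepted: def re-evaluated
```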
@wanghan-iapcm wanghan-iapcm requested a review from njzjz April 11, 2026 06:35
@dosubot dosubot bot added the new feature label Apr 11, 2026

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 372f2db944


@coderabbitai
Contributor

coderabbitai bot commented Apr 11, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

📝 Walkthrough


Adds optional "middle_output" emission from fitting networks (declaration, computation, and propagation), updates fitting subclasses to include/return it, exposes atomic-submodel accessors, refactors DeepEval input preparation and adds diagnostic eval APIs, and adds tests and YAML expectations for these features.

Changes

  • Fitting base: deepmd/dpmodel/fitting/general_fitting.py
    Add eval_return_middle_output flag, set_return_middle_output(), _middle_output_def(), and compute/return "middle_output" from _call_common() when enabled.
  • Fitting subclasses: deepmd/dpmodel/fitting/dipole_fitting.py, deepmd/dpmodel/fitting/dos_fitting.py, deepmd/dpmodel/fitting/invar_fitting.py, deepmd/dpmodel/fitting/polarizability_fitting.py, deepmd/dpmodel/fitting/property_fitting.py
    Include *self._middle_output_def() in output_def(); Dipole/Polar call sites updated to operate on the full _call_common() dict, update the primary output in place, and return the full results mapping.
  • Model accessors: deepmd/dpmodel/model/frozen.py, deepmd/dpmodel/model/make_model.py, deepmd/dpmodel/model/spin_model.py
    Add get_dp_atomic_model() accessor to expose the underlying DPAtomicModel or return None; add TYPE_CHECKING guards for forward refs.
  • Output validation: deepmd/dpmodel/output_def.py
    fitting_check_output.wrapper.__call__ now calls self.output_def() at runtime and validates the returned mapping against it.
  • DeepEval & preprocessing: deepmd/pt_expt/infer/deep_eval.py
    Extract _prepare_inputs() from _eval_model() to centralize input shaping/tiling/fparam/aparam handling; add _is_spin_model(), eval_typeebd(), eval_descriptor(), and eval_fitting_last_layer() utilities with appropriate guards and middle-output toggling.
  • Tests (fitting middle output): source/tests/common/dpmodel/test_fitting_middle_output.py
    New E2E tests exercising middle_output toggling, shapes, determinism, fparam/aparam handling, output_def() registration, and propagation through Dipole/Polar fittings.
  • Tests (deep_eval / infer): source/tests/pt_expt/infer/test_deep_eval.py, source/tests/infer/test_models.py
    Add tests for the DeepEval APIs (eval_typeebd, eval_descriptor, eval_fitting_last_layer), update eval calls to pass fparam/aparam, add get_dp_atomic_model wrappers and ASE consistency checks.
  • Test data: source/tests/infer/fparam_aparam-testcase.yaml
    Extend YAML expected outputs to include descriptor and fit_ll arrays for inference validation.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant Client as Client
participant DeepEval as DeepEval
participant Model as ExportedModel
participant DPAtomic as DPAtomicModel
Client->>DeepEval: eval_descriptor(coords,box,atypes,fparam?,aparam?)
DeepEval->>DeepEval: _prepare_inputs(...) -> tensors, nframes, natoms
DeepEval->>Model: forward call with tensors
Model-->>DeepEval: results dict (primary output + optional middle_output)
DeepEval->>DPAtomic: get_dp_atomic_model() (if required)
DeepEval-->>Client: numpy array (descriptor / fitting-last-layer / type ebd)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


Suggested labels

enhancement

Suggested reviewers

  • njzjz
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 70.45%, below the required threshold of 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The title accurately and concisely describes the main feature additions: three new evaluation methods (eval_typeebd, eval_descriptor, eval_fitting_last_layer) for the pt_expt module.




Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
deepmd/pt_expt/infer/deep_eval.py (1)

709-723: ⚠️ Potential issue | 🟠 Major

Don’t make eval_descriptor() depend on an unused aparam.

This code path never forwards aparam_t into dp_am.descriptor(), but Lines 715-721 still reject calls when dim_aparam > 0 and the caller omits it. That turns the descriptor-only diagnostic into a fitting-input requirement and will fail aparam-enabled models unless users pass a dummy array.

One way to decouple descriptor evaluation from fitting-only inputs
     def _prepare_inputs(
         self,
         coords: np.ndarray,
         cells: np.ndarray | None,
         atom_types: np.ndarray,
         fparam: np.ndarray | None,
         aparam: np.ndarray | None,
+        *,
+        require_aparam: bool = True,
     ) -> tuple:
@@
-        elif self.get_dim_aparam() > 0:
+        elif require_aparam and self.get_dim_aparam() > 0:
             # Exported models (.pt2/.pte) are compiled with aparam as a
             # required positional input.  Unlike fparam, there is no default.
             raise ValueError(
                 f"aparam is required for this model (dim_aparam={self.get_dim_aparam()}) "
                 "but was not provided."
             )
@@
-        ) = self._prepare_inputs(coords, cells, atom_types, fparam, aparam)
+        ) = self._prepare_inputs(
+            coords,
+            cells,
+            atom_types,
+            fparam,
+            aparam,
+            require_aparam=False,
+        )

Also applies to: 1071-1092

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/pt_expt/infer/deep_eval.py` around lines 709 - 723, The
eval_descriptor() path should not require aparam when it isn't passed to
dp_am.descriptor(); remove the ValueError branch and instead set aparam_t = None
when aparam is missing even if get_dim_aparam() > 0 so descriptor evaluation can
proceed without a dummy array. Update the block around aparam handling in
eval_descriptor() (where aparam_t is created and where get_dim_aparam() is
checked) to avoid raising and ensure downstream calls to dp_am.descriptor()
receive None when aparam is omitted.
🧹 Nitpick comments (1)
deepmd/dpmodel/output_def.py (1)

97-97: Remove stale cached self.md to avoid misleading state.

self.md is initialized but no longer used after switching to call-time md. Cleaning it up will reduce confusion.

Also applies to: 105-107

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/dpmodel/output_def.py` at line 97, Remove the stale cached attribute
self.md that is set via self.output_def()—delete the assignment "self.md =
self.output_def()" (and the same stale assignments at the other occurrences
around lines 105–107) and any dead references to self.md so the code uses the
call-time md only; locate this in the class constructor (or initializer) where
output_def() is invoked and remove the self.md assignment while leaving the
output_def method and any call-site usages that pass md unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deepmd/dpmodel/fitting/general_fitting.py`:
- Around line 441-447: The _middle_output_def method and related middle-output
handling assume at least one hidden layer by using self.neuron[-1]; add a guard
that checks eval_return_middle_output together with the neuron config (e.g., if
self.eval_return_middle_output and not self.neuron: raise
ValueError("middle_output requires at least one hidden layer") or disable the
feature), and apply the same guard/validation in the other place that accesses
self.neuron[-1] (the code referenced near lines 723-724) so enabling
middle_output when neuron is empty is rejected with a clear error or safely
bypassed.

In `@deepmd/pt_expt/infer/deep_eval.py`:
- Around line 1071-1091: The unpacking from self._prepare_inputs and
dp_am.descriptor creates many unused locals that trigger RUF059; adjust the
tuple unpacking to mark unused values with underscore-prefixed names (e.g.,
_aparam_t, _nframes) or single underscores for values you won't use, and
similarly change rot_mat, g2, h2, sw to _rot_mat, _g2, _h2, _sw (or _
placeholders) so only needed symbols like ext_coord_t, ext_atype_t, nlist_t,
mapping_t, fparam_t and descriptor remain as used variables; update the unpack
statements in the block around the call sites of _prepare_inputs and
dp_am.descriptor accordingly.
- Around line 997-1003: The _is_spin_model method currently checks
isinstance(self._dpmodel, SpinModel) which misses PT Expt spin models; change it
to return the precomputed flag (use getattr(self, "_is_spin", False)) so
_is_spin_model() reliably detects spin models (affecting _is_spin_model,
eval_typeebd, eval_descriptor, and eval_fitting_last_layer guards).

In `@source/tests/common/dpmodel/test_fitting_middle_output.py`:
- Line 97: The test currently unpacks three values from
self._build_fitting(mixed_types=True) into ft, descriptor, atype but never uses
descriptor or atype, causing ruff RUF059; fix this by changing the unpack to use
underscore-prefixed placeholders (e.g., ft, _descriptor, _atype =
self._build_fitting(mixed_types=True)) or a starred ignore (ft, *_ =
self._build_fitting(mixed_types=True)) so only ft is treated as used while
satisfying ruff, referencing the _build_fitting call and the ft variable.

In `@source/tests/pt_expt/infer/test_deep_eval.py`:
- Around line 1667-1692: The test currently only asserts zbl_am is not an
instance of DPAtomicModelDP; instead call the new API: invoke
get_dp_atomic_model() on the DPZBLLinearEnergyAtomicModel instance
(zbl_am.get_dp_atomic_model()) and assert it returns None (e.g.
self.assertIsNone(zbl_am.get_dp_atomic_model())), keeping the existing
DPAtomicModelDP instance (dp_am) creation for context.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: d2119289-f2dc-4a74-90fb-243855953c0d

📥 Commits

Reviewing files that changed from the base of the PR and between baab3e8 and 372f2db.

📒 Files selected for processing (15)
  • deepmd/dpmodel/fitting/dipole_fitting.py
  • deepmd/dpmodel/fitting/dos_fitting.py
  • deepmd/dpmodel/fitting/general_fitting.py
  • deepmd/dpmodel/fitting/invar_fitting.py
  • deepmd/dpmodel/fitting/polarizability_fitting.py
  • deepmd/dpmodel/fitting/property_fitting.py
  • deepmd/dpmodel/model/frozen.py
  • deepmd/dpmodel/model/make_model.py
  • deepmd/dpmodel/model/spin_model.py
  • deepmd/dpmodel/output_def.py
  • deepmd/pt_expt/infer/deep_eval.py
  • source/tests/common/dpmodel/test_fitting_middle_output.py
  • source/tests/infer/fparam_aparam-testcase.yaml
  • source/tests/infer/test_models.py
  • source/tests/pt_expt/infer/test_deep_eval.py

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
source/tests/pt_expt/infer/test_deep_eval.py (1)

1667-1692: ⚠️ Potential issue | 🟡 Minor

Assert the new API contract in the ZBL test.

At Line 1691, the test only checks isinstance(...) and never validates get_dp_atomic_model(). This misses the behavior the test name/docstring claims to cover.

Suggested fix
@@
-        # zbl_am is a LinearEnergyAtomicModel, not DPAtomicModel
-        self.assertFalse(isinstance(zbl_am, DPAtomicModelDP))
+        # ZBL linear model has no single underlying DPAtomicModel
+        self.assertIsNone(zbl_am.get_dp_atomic_model())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/pt_expt/infer/test_deep_eval.py` around lines 1667 - 1692, The
test test_zbl_model_returns_none currently only asserts zbl_am is not an
instance of DPAtomicModelDP; update it to also validate the new API by calling
zbl_am.get_dp_atomic_model() and asserting it returns the underlying
DPAtomicModel (dp_am) or is an instance of DPAtomicModelDP (e.g.,
self.assertIs(zbl_am.get_dp_atomic_model(), dp_am) or
self.assertIsInstance(zbl_am.get_dp_atomic_model(), DPAtomicModelDP)); ensure
you reference the existing local variables zbl_am and dp_am and keep the
existing isinstance check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@source/tests/pt_expt/infer/test_deep_eval.py`:
- Around line 1717-1738: The test claims to validate
FrozenModel.get_dp_atomic_model() but deserializes a regular EnergyModel;
replace or adjust the test so it actually exercises a FrozenModel: wrap the
EnergyModel in a FrozenModel (create FrozenModel(model) or otherwise construct a
FrozenModel) before serializing with deserialize_to_file, then deserialize into
DeepPot and assert that dp.deep_eval._dpmodel is an instance of FrozenModel and
that dp.deep_eval._dpmodel.get_dp_atomic_model() returns a DPAtomicModelDP;
alternatively, if you prefer not to change behavior, rename the test and
docstring to reflect that it validates a regular model delegation rather than
FrozenModel. Ensure references to FrozenModel, get_dp_atomic_model,
deserialize_to_file, DeepPot, and deep_eval._dpmodel are updated accordingly.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: ea1128ee-1903-4ac7-a915-ee0c8372fb07

📥 Commits

Reviewing files that changed from the base of the PR and between 372f2db and 13fefa6.

📒 Files selected for processing (1)
  • source/tests/pt_expt/infer/test_deep_eval.py

- Raise ValueError (not warning) when set_return_middle_output(True) with
  neuron=[] (no hidden layers)
- Fix _is_spin_model() to use precomputed self._is_spin flag instead of
  isinstance check against dpmodel SpinModel (which misses pt_expt SpinModel)
- Fix RUF059 unused unpacked variables in eval_descriptor and
  eval_fitting_last_layer
- Fix RUF059 unused unpack in test_middle_output_registered_in_output_def
- Rename test_frozen_model_delegates to test_deserialized_model_delegates
  to match what it actually tests
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (1)
source/tests/pt_expt/infer/test_deep_eval.py (1)

1667-1691: ⚠️ Potential issue | 🟡 Minor

Assert the new get_dp_atomic_model() contract directly.

This still only proves zbl_am is not a DPAtomicModel. The test would keep passing even if DPZBLLinearEnergyAtomicModel.get_dp_atomic_model() regressed, so it does not cover the API named in the test/docstring.

Minimal fix
-        # LinearEnergyAtomicModel is not a DPAtomicModel
-        self.assertFalse(isinstance(zbl_am, DPAtomicModelDP))
+        self.assertIsNone(zbl_am.get_dp_atomic_model())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/pt_expt/infer/test_deep_eval.py` around lines 1667 - 1691, The
test currently only checks isinstance(zbl_am, DPAtomicModelDP) but should assert
the get_dp_atomic_model() contract directly; update the test to call
DPZBLLinearEnergyAtomicModel.get_dp_atomic_model() on zbl_am and assert it
returns None (e.g., self.assertIsNone(zbl_am.get_dp_atomic_model())), keeping
the existing construction of zbl_am and imports intact so the assertion verifies
the API behavior rather than type identity.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deepmd/pt_expt/infer/deep_eval.py`:
- Around line 1151-1164: The code unconditionally sets
fitting_net.set_return_middle_output(True) and then resets to False, which
overwrites any prior setting; modify the block to first read and store the
fitting_net's current middle_output state (e.g., via an existing getter or
attribute on fitting_net), set it to True for the call, and in the finally
restore the original stored value instead of hardcoding False; update references
around dp_am.fitting_net, fitting_net.set_return_middle_output and the
try/finally so the original setting is preserved for subsequent callers.

In `@source/tests/pt_expt/infer/test_deep_eval.py`:
- Around line 1777-1782: The test swaps in a SpinModel into
cls.dp.deep_eval._dpmodel but doesn't update the internal spin flag, so
_is_spin_model() still sees deep_eval._is_spin as False and the spin-only guards
in eval_descriptor() and eval_fitting_last_layer() are not exercised; after
assigning cls.dp.deep_eval._dpmodel = cls.spin_model, set
cls.dp.deep_eval._is_spin = True (or otherwise mark the deep_eval instance as
spin) so _is_spin_model() returns True and the NotImplementedError paths are
properly tested.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 116a01e7-ade6-4e0b-96f1-f46567668e8f

📥 Commits

Reviewing files that changed from the base of the PR and between 13fefa6 and f012b21.

📒 Files selected for processing (4)
  • deepmd/dpmodel/fitting/general_fitting.py
  • deepmd/pt_expt/infer/deep_eval.py
  • source/tests/common/dpmodel/test_fitting_middle_output.py
  • source/tests/pt_expt/infer/test_deep_eval.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • deepmd/dpmodel/fitting/general_fitting.py

pt_expt.SpinModel inherits from dpmodel.SpinModel, so isinstance works
correctly. The coderabbitai suggestion to use self._is_spin was wrong.
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
deepmd/pt_expt/infer/deep_eval.py (1)

684-723: ⚠️ Potential issue | 🟠 Major

Don't require fitting-only params on the descriptor path.

eval_descriptor() drops aparam_t completely and only forwards fparam_t when dp_am.add_chg_spin_ebd is enabled, but _prepare_inputs() now raises for any nonzero dim_aparam / dim_fparam. That breaks descriptor diagnostics for models whose descriptor is independent of those inputs.

Minimal direction
 def _prepare_inputs(
     self,
     coords: np.ndarray,
     cells: np.ndarray | None,
     atom_types: np.ndarray,
     fparam: np.ndarray | None,
     aparam: np.ndarray | None,
+    *,
+    require_fparam: bool = True,
+    require_aparam: bool = True,
 ) -> tuple:
@@
-        elif self.get_dim_fparam() > 0:
+        elif require_fparam and self.get_dim_fparam() > 0:
@@
-        elif self.get_dim_aparam() > 0:
+        elif require_aparam and self.get_dim_aparam() > 0:

Then have eval_descriptor() call _prepare_inputs(..., require_aparam=False, require_fparam=getattr(dp_am, "add_chg_spin_ebd", False)).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/pt_expt/infer/deep_eval.py` around lines 684 - 723, _eval_descriptor()
currently calls _prepare_inputs() which unconditionally raises if dim_aparam or
dim_fparam > 0, preventing descriptor-only evaluation when those params are only
used for fitting; change the _prepare_inputs() call sites (notably in
eval_descriptor) to pass flags that relax these requirements: add optional
parameters require_aparam (default True) and require_fparam (default True) to
_prepare_inputs(), and in eval_descriptor() call _prepare_inputs(...,
require_aparam=False, require_fparam=getattr(dp_am, "add_chg_spin_ebd", False))
so aparam is not required for descriptor-only runs while fparam is only required
when dp_am.add_chg_spin_ebd is true; update any other callers as needed to
preserve current behavior when inputs are actually required.
♻️ Duplicate comments (1)
deepmd/pt_expt/infer/deep_eval.py (1)

1155-1168: ⚠️ Potential issue | 🟠 Major

Restore the previous middle_output setting.

This helper always leaves fitting_net with eval_return_middle_output=False. If the fitting net was already configured to emit "middle_output", this diagnostic call mutates shared model state and changes later evaluations.

Minimal fix
             atype = ext_atype_t[:, :natoms]
             fitting_net = dp_am.fitting_net
+            prev_return_middle_output = getattr(
+                fitting_net, "eval_return_middle_output", False
+            )
             fitting_net.set_return_middle_output(True)
             try:
                 ret = fitting_net(
                     descriptor,
                     atype,
@@
                     fparam=fparam_t,
                     aparam=aparam_t,
                 )
             finally:
-                fitting_net.set_return_middle_output(False)
+                fitting_net.set_return_middle_output(prev_return_middle_output)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/pt_expt/infer/deep_eval.py` around lines 1155 - 1168, The call
unconditionally resets fitting_net.set_return_middle_output(False) in the
finally block, which clobbers any previous configuration; capture the prior
state before changing it and restore that saved value in the finally block.
Specifically, before calling fitting_net.set_return_middle_output(True) read and
store the current value (e.g., prev = fitting_net.get_return_middle_output() or
store the attribute used internally), then call set_return_middle_output(True),
run fitting_net(...) to get ret, and in the finally call
set_return_middle_output(prev) to restore the original setting; update the code
paths around fitting_net, set_return_middle_output, and the try/finally in
deep_eval.py accordingly.
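The save-and-restore fix above generalizes naturally to a context manager. The helper below is hypothetical (it is not part of this PR) and uses a stub in place of the real fitting net; it shows the same invariant: the caller's prior toggle state survives the diagnostic call.

```python
from contextlib import contextmanager

@contextmanager
def return_middle_output(fitting_net):
    """Temporarily enable middle_output, restoring the caller's setting."""
    prev = getattr(fitting_net, "eval_return_middle_output", False)
    fitting_net.set_return_middle_output(True)
    try:
        yield fitting_net
    finally:
        fitting_net.set_return_middle_output(prev)

class Stub:
    """Stand-in for a fitting net the caller has already configured."""

    def __init__(self):
        self.eval_return_middle_output = True  # caller already enabled it

    def set_return_middle_output(self, enable=True):
        self.eval_return_middle_output = enable

net = Stub()
with return_middle_output(net):
    assert net.eval_return_middle_output is True
assert net.eval_return_middle_output is True  # prior setting preserved
```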
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deepmd/pt_expt/infer/deep_eval.py`:
- Around line 1032-1040: eval_descriptor (and the similar method around lines
1094-1102) forwards coords, cells, and atom_types directly to _prepare_inputs
which assumes numpy arrays; callers passing Python lists will hit AttributeError
on .shape. Before calling _prepare_inputs, normalize the public inputs the same
way eval() does: coerce coords and atom_types to numpy arrays (coords =
np.asarray(coords), atom_types = np.asarray(atom_types)), coerce cells to None
or np.asarray(cells) if provided, and apply the same fparam/aparam handling used
by eval(); then pass these normalized arrays into _prepare_inputs so list inputs
work identically to the eval() API.
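The normalization suggested above can be sketched as a small helper. normalize_inputs is a hypothetical name for illustration; the real method in deep_eval.py may fold this into _prepare_inputs directly.

```python
import numpy as np


def normalize_inputs(coords, cells, atom_types, fparam=None, aparam=None):
    """Coerce public inputs to numpy arrays, mirroring what eval() does."""
    coords = np.asarray(coords)
    atom_types = np.asarray(atom_types)
    # Optional inputs stay None when absent, otherwise become arrays too.
    cells = None if cells is None else np.asarray(cells)
    fparam = None if fparam is None else np.asarray(fparam)
    aparam = None if aparam is None else np.asarray(aparam)
    return coords, cells, atom_types, fparam, aparam


# Plain Python lists now behave identically to ndarrays:
coords, cells, atom_types, _, _ = normalize_inputs(
    [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0]], None, [[0, 1]]
)
print(coords.shape, atom_types.shape)  # (1, 6) (1, 2)
```

Passing the normalized arrays into _prepare_inputs removes the AttributeError on .shape for list callers without changing behavior for existing ndarray callers.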

---

Outside diff comments:
In `@deepmd/pt_expt/infer/deep_eval.py`:
- Around line 684-723: _eval_descriptor() currently calls _prepare_inputs()
which unconditionally raises if dim_aparam or dim_fparam > 0, preventing
descriptor-only evaluation when those params are only used for fitting; change
the _prepare_inputs() call sites (notably in eval_descriptor) to pass flags that
relax these requirements: add optional parameters require_aparam (default True)
and require_fparam (default True) to _prepare_inputs(), and in eval_descriptor()
call _prepare_inputs(..., require_aparam=False, require_fparam=getattr(dp_am,
"add_chg_spin_ebd", False)) so aparam is not required for descriptor-only runs
while fparam is only required when dp_am.add_chg_spin_ebd is true; update any
other callers as needed to preserve current behavior when inputs are actually
required.

---

Duplicate comments:
In `@deepmd/pt_expt/infer/deep_eval.py`:
- Around line 1155-1168: The call unconditionally resets
fitting_net.set_return_middle_output(False) in the finally block, which clobbers
any previous configuration; capture the prior state before changing it and
restore that saved value in the finally block. Specifically, before calling
fitting_net.set_return_middle_output(True) read and store the current value
(e.g., prev = fitting_net.get_return_middle_output() or store the attribute used
internally), then call set_return_middle_output(True), run fitting_net(...) to
get ret, and in the finally call set_return_middle_output(prev) to restore the
original setting; update the code paths around fitting_net,
set_return_middle_output, and the try/finally in deep_eval.py accordingly.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 42260024-e1cc-45fa-905b-8a7e51d9399c

📥 Commits

Reviewing files that changed from the base of the PR and between f012b21 and 1df931a.

📒 Files selected for processing (1)
  • deepmd/pt_expt/infer/deep_eval.py

Accept Python lists (not just np.ndarray) by coercing coords, cells,
atom_types with np.array() at the top, matching the eval() API.
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
source/tests/pt_expt/infer/test_deep_eval.py (2)

1439-1444: Normalize the swapped-in eager models to inference mode and dtype.

model_fp is the only helper model in these setups that is neither moved to torch.float64 nor switched to .eval() before replacing deep_eval._dpmodel. That works only as long as the eager diagnostic path stays insensitive to dtype/train-mode differences.

Proposed change
         model_fp = EnergyModel(ds_fp, ft_fp, type_map=cls.type_map)
+        model_fp = model_fp.to(torch.float64)
+        model_fp.eval()
         pte_fp = os.path.join(cls._tmpdir, "fp.pte")
         deserialize_to_file(pte_fp, {"model": model_sea.serialize()})
         cls.dp_fp = DeepPot(pte_fp)
         cls.dp_fp.deep_eval._dpmodel = model_fp

Also applies to: 1590-1595

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/pt_expt/infer/test_deep_eval.py` around lines 1439 - 1444, The
swapped-in eager EnergyModel instance (model_fp) is not set to the correct dtype
or eval mode before being assigned to DeepPot.deep_eval._dpmodel; update the
setup to call model_fp.double() (or .to(torch.float64)) and model_fp.eval()
after constructing EnergyModel (and do the same for the analogous helper model
in the 1590-1595 block) so the eager diagnostic path uses inference-mode and
float64 dtype when replacing deep_eval._dpmodel.

1476-1488: Avoid bitwise equality for floating-point diagnostic outputs.

These values come from PT inference, so assert_array_equal() enforces bitwise equality, which is stricter than the behavior you actually want to validate and can become brittle on different kernels/devices. assert_allclose() with tight tolerances is safer here. Please verify this on the CUDA/PT test lane as well.

Proposed change
-        np.testing.assert_array_equal(d1, d2)
+        np.testing.assert_allclose(d1, d2, rtol=1e-12, atol=1e-12)
...
-        np.testing.assert_array_equal(d1, d2)
+        np.testing.assert_allclose(d1, d2, rtol=1e-12, atol=1e-12)
...
-        np.testing.assert_array_equal(fit_ll1, fit_ll2)
+        np.testing.assert_allclose(fit_ll1, fit_ll2, rtol=1e-12, atol=1e-12)
...
-        np.testing.assert_array_equal(fit_ll1, fit_ll2)
+        np.testing.assert_allclose(fit_ll1, fit_ll2, rtol=1e-12, atol=1e-12)

Also applies to: 1630-1650
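The difference between the two assertions can be seen on a toy array. The 1e-13 perturbation below is a hypothetical stand-in for last-ulp noise from a different kernel or device.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = a + 1e-13  # tiny numerical noise standing in for a kernel difference

# np.testing.assert_array_equal(a, b) would raise AssertionError here,
# while a tight allclose tolerance still catches real regressions:
np.testing.assert_allclose(a, b, rtol=1e-12, atol=1e-12)
```

The effective per-element tolerance is atol + rtol * |desired|, so 1e-12/1e-12 still rejects anything beyond a few ulps at these magnitudes.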

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/pt_expt/infer/test_deep_eval.py` around lines 1476 - 1488,
Replace bitwise equality checks in the deterministic descriptor tests by using
numpy.testing.assert_allclose with tight tolerances: in
test_descriptor_deterministic_sea and test_descriptor_deterministic_dpa1 call
np.testing.assert_allclose(d1, d2, rtol=1e-6, atol=1e-8) (or similar tight
tolerances) instead of np.testing.assert_array_equal; do the same for the other
similar tests that call deep_eval.eval_descriptor (e.g., the tests around the
later block) and run the CUDA/PT lane to confirm tolerances are acceptable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@source/tests/pt_expt/infer/test_deep_eval.py`:
- Around line 1361-1367: The test test_typeebd_dpa1 is too permissive: it
accepts both ntypes and ntypes+1 rows from deep_eval.eval_typeebd(), which hides
a regression where a padding row is returned; update the test to assert the
documented contract exactly by requiring typeebd.shape[0] == self.nt (not
allowing self.nt+1), or if the implementation intentionally includes a padding
row, update the API/docs in deepmd/pt_expt/infer/deep_eval.py to state it
returns (ntypes+1, tebd_dim) and then change this test (and the analogous test
at the other location) to assert shape[0] == self.nt + 1 and adjust expectations
accordingly so tests reflect the declared contract for eval_typeebd().
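An exact-contract assertion avoids the permissive either/or check. The array below is a hypothetical stand-in for the eval_typeebd() output; the (ntypes, tebd_dim) shape is the contract assumed in the comment above.

```python
import numpy as np

# Hypothetical type-embedding table: one row per real atom type,
# no padding row.
ntypes, tebd_dim = 4, 8
typeebd = np.random.default_rng(0).normal(size=(ntypes, tebd_dim))

# Assert the documented contract exactly; a check accepting either ntypes
# or ntypes + 1 rows would silently pass an unintended padding row.
assert typeebd.shape == (ntypes, tebd_dim)
```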

---

Nitpick comments:
In `@source/tests/pt_expt/infer/test_deep_eval.py`:
- Around line 1439-1444: The swapped-in eager EnergyModel instance (model_fp) is
not set to the correct dtype or eval mode before being assigned to
DeepPot.deep_eval._dpmodel; update the setup to call model_fp.double() (or
.to(torch.float64)) and model_fp.eval() after constructing EnergyModel (and do
the same for the analogous helper model in the 1590-1595 block) so the eager
diagnostic path uses inference-mode and float64 dtype when replacing
deep_eval._dpmodel.
- Around line 1476-1488: Replace bitwise equality checks in the deterministic
descriptor tests by using numpy.testing.assert_allclose with tight tolerances:
in test_descriptor_deterministic_sea and test_descriptor_deterministic_dpa1 call
np.testing.assert_allclose(d1, d2, rtol=1e-6, atol=1e-8) (or similar tight
tolerances) instead of np.testing.assert_array_equal; do the same for the other
similar tests that call deep_eval.eval_descriptor (e.g., the tests around the
later block) and run the CUDA/PT lane to confirm tolerances are acceptable.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: cce00baf-8f7d-4d55-b18f-96d82106d4a4

📥 Commits

Reviewing files that changed from the base of the PR and between 1df931a and 4f4983d.

📒 Files selected for processing (2)
  • deepmd/pt_expt/infer/deep_eval.py
  • source/tests/pt_expt/infer/test_deep_eval.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • deepmd/pt_expt/infer/deep_eval.py

@codecov

codecov bot commented Apr 11, 2026

Codecov Report

❌ Patch coverage is 95.87629% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.35%. Comparing base (baab3e8) to head (4f4983d).

Files with missing lines            Patch %  Lines
deepmd/pt_expt/infer/deep_eval.py   96.42%   2 Missing ⚠️
deepmd/dpmodel/model/frozen.py      50.00%   1 Missing ⚠️
deepmd/dpmodel/model/make_model.py  80.00%   1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5391      +/-   ##
==========================================
+ Coverage   80.33%   80.35%   +0.02%     
==========================================
  Files         819      819              
  Lines       85356    85445      +89     
  Branches     4139     4139              
==========================================
+ Hits        68571    68662      +91     
+ Misses      15509    15508       -1     
+ Partials     1276     1275       -1     

☔ View full report in Codecov by Sentry.


@wanghan-iapcm wanghan-iapcm added the Test CUDA Trigger test CUDA workflow label Apr 11, 2026
@github-actions github-actions bot removed the Test CUDA Trigger test CUDA workflow label Apr 11, 2026
@njzjz njzjz added this pull request to the merge queue Apr 11, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 11, 2026

3 participants