
feat(pt_expt): add eval_typeebd, eval_descriptor, eval_fitting_last_layer #5391

Open
wanghan-iapcm wants to merge 10 commits into deepmodeling:master from wanghan-iapcm:feat-pt-expt-eval-diagnostic

Conversation

@wanghan-iapcm
Collaborator

@wanghan-iapcm wanghan-iapcm commented Apr 11, 2026

Summary

  • Add eval_typeebd, eval_descriptor, eval_fitting_last_layer to pt_expt's DeepEval, using the eager _dpmodel for diagnostic computation
  • Add get_dp_atomic_model() API to the model hierarchy (make_model, FrozenModel, SpinModel) for clean access to the underlying DPAtomicModel
  • Add set_return_middle_output to GeneralFitting with proper FittingOutputDef registration — fitting_check_output now evaluates output_def() dynamically
  • Preserve middle_output through DipoleFitting/PolarFitting call() by modifying the results dict in-place
  • Add cross-backend reference values for fparam_aparam model (descriptor + fit_ll)
  • Remove .pte/.pt2 skips in test_models.py for test_descriptor and test_fitting_last_layer
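The middle-output toggle above can be sketched with a minimal self-contained example. TinyFitting is an illustrative stand-in for GeneralFitting (the class, weights, and shapes are made up for the sketch, not the actual deepmd implementation): when the toggle is on, the output dict carries the last hidden-layer activations under "middle_output" alongside the primary output.

```python
import numpy as np

class TinyFitting:
    """Illustrative stand-in for GeneralFitting's middle-output toggle."""

    def __init__(self, in_dim, hidden):
        rng = np.random.default_rng(0)
        self.w1 = rng.standard_normal((in_dim, hidden))
        self.w2 = rng.standard_normal((hidden, 1))
        self.eval_return_middle_output = False

    def set_return_middle_output(self, enable=True):
        self.eval_return_middle_output = enable

    def __call__(self, descriptor):
        middle = np.tanh(descriptor @ self.w1)  # output of the last hidden layer
        out = {"energy": middle @ self.w2}
        if self.eval_return_middle_output:
            out["middle_output"] = middle
        return out

ft = TinyFitting(in_dim=4, hidden=8)
desc = np.ones((2, 4))                      # (nframes * natoms, in_dim)
assert "middle_output" not in ft(desc)      # off by default
ft.set_return_middle_output(True)
assert ft(desc)["middle_output"].shape == (2, 8)
```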

Limitations

  • Spin models: eval_descriptor and eval_fitting_last_layer raise NotImplementedError (SpinModel preprocesses coords with virtual atoms)
  • DPZBLModel: get_dp_atomic_model() returns None (LinearAtomicModel has no single descriptor/fitting_net)
  • model_check_output still caches output_def at init (only fitting_check_output is dynamic) — not an issue since middle_output is consumed at fitting level
  • _middle_output_def() added to all fitting subclasses but only tested for InvarFitting, DipoleFitting, PolarFitting (not PropertyFitting, DOSFitting)
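The get_dp_atomic_model() contract from the limitations above can be sketched with stand-in classes (these are illustrative stubs, not the actual deepmd classes): wrappers delegate to a single underlying atomic model when one exists, and return None for linear combinations such as DPZBLModel.

```python
class DPAtomicModel:
    """Stand-in for the plain descriptor + fitting atomic model."""

class LinearAtomicModel:
    """Stand-in for DPZBL's linear combination of atomic models."""

class WrappedModel:
    """Stand-in for the make_model / FrozenModel / SpinModel wrappers."""

    def __init__(self, atomic_model):
        self.atomic_model = atomic_model

    def get_dp_atomic_model(self):
        # Expose the underlying DPAtomicModel when there is exactly one;
        # return None otherwise (a LinearAtomicModel has no single
        # descriptor/fitting_net to expose).
        if isinstance(self.atomic_model, DPAtomicModel):
            return self.atomic_model
        return None

assert isinstance(WrappedModel(DPAtomicModel()).get_dp_atomic_model(), DPAtomicModel)
assert WrappedModel(LinearAtomicModel()).get_dp_atomic_model() is None
```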

Test plan

  • 10 dpmodel middle_output tests (shape, toggle, output_def registration, decorator validation, dipole/polar passthrough)
  • 6 eval diagnostic tests (typeebd, descriptor, fitting_last_layer for se_e2_a and DPA1)
  • 4 get_dp_atomic_model tests (energy, ZBL, spin, frozen)
  • 3 spin model diagnostic tests (typeebd works, descriptor/fitting raises)
  • 2 ASE neighbor list consistency tests
  • Cross-backend consistency via fparam_aparam reference values in test_models.py

Summary by CodeRabbit

  • New Features

    • Inference APIs now expose type-embedding, descriptor, and fitting-last-layer outputs; models may return intermediate fitting-network outputs and higher-level fitters propagate them.
    • New public accessor to retrieve an underlying atomic model from wrapped models.
  • Refactor

    • Consolidated input preparation and evaluation flow; multi-output handling and output-definition validation unified.
  • Tests

    • Large expansion of unit and integration tests and test data for new outputs, toggles, determinism, and model variants.

Han Wang added 6 commits April 7, 2026 21:03
Add pt_expt backend (.pte/.pt2) to the parameterized extensions in
test_models.py to verify convert-backend works for the new exportable
formats.

The fparam_aparam model (1 atom type) is switched from type_one_side=False
to type_one_side=True (with ndim 2→1), which is equivalent for single-type
models but enables pt_expt export. Models with type_one_side=False and
multiple types (se_e2_a, se_e2_r) are skipped for .pte/.pt2 as make_fx
cannot trace data-dependent indexing in NetworkCollection(ndim=2).

tests/pt/__init__.py may set a fake default device for CPU fallback,
which poisons AOTInductor compilation. Temporarily clear the default
device before converting to .pt2, matching the pattern used in
test_change_bias.py.
…ayer

Implement three diagnostic eval methods for the pt_expt backend's DeepEval,
using the eager _dpmodel (deserialized from model.json in the archive) to
compute intermediate results directly.

Key changes:
- Add get_dp_atomic_model() API to model hierarchy (make_model CM,
  FrozenModel, SpinModel) for clean access to DPAtomicModel
- Add set_return_middle_output / middle_output support in GeneralFitting
- Extract _prepare_inputs helper from _eval_model for reuse
- Preserve _call_common output dict in DipoleFitting/PolarFitting call()
  so middle_output flows through without ad-hoc propagation
- Remove .pte/.pt2 skips in cross-backend test_descriptor/test_fitting_last_layer
- Add fparam_aparam descriptor/fit_ll reference values to testcase YAML

Limitations:
- SpinModel: eval_descriptor/eval_fitting_last_layer raise NotImplementedError
  (virtual atom preprocessing required)
- DPZBLModel/LinearEnergyModel: raise NotImplementedError
  (no single descriptor/fitting_net)
- se_e2_a + fparam: cannot export to .pte (torch.export limitation)

Make fitting_check_output evaluate output_def() dynamically so that
middle_output is validated when enabled via set_return_middle_output.
Each fitting subclass includes _middle_output_def() in its output_def().
# Conflicts:
#	deepmd/pt_expt/infer/deep_eval.py
#	source/tests/pt_expt/infer/test_deep_eval.py
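The dynamic fitting_check_output behavior described in the last commit can be sketched as a self-contained decorator. This is a simplified stand-in for the real decorator in deepmd/dpmodel/output_def.py; the toy class and SimpleNamespace output defs are illustrative only.

```python
from types import SimpleNamespace

def fitting_check_output(cls):
    """Sketch: validate returned keys against output_def() evaluated per call."""
    orig_call = cls.__call__

    def checked_call(self, *args, **kwargs):
        ret = orig_call(self, *args, **kwargs)
        # Re-evaluate output_def() on every call, so outputs registered
        # dynamically (like middle_output) are accepted.
        allowed = {d.name for d in self.output_def()}
        unknown = set(ret) - allowed
        if unknown:
            raise KeyError(f"undeclared outputs: {sorted(unknown)}")
        return ret

    cls.__call__ = checked_call
    return cls

@fitting_check_output
class ToyFitting:
    def __init__(self):
        self.return_middle = False

    def output_def(self):
        defs = [SimpleNamespace(name="energy")]
        if self.return_middle:
            defs.append(SimpleNamespace(name="middle_output"))
        return defs

    def __call__(self):
        out = {"energy": 0.0}
        if self.return_middle:
            out["middle_output"] = 0.0
        return out

ft = ToyFitting()
assert set(ft()) == {"energy"}                   # validated against current def
ft.return_middle = True
assert set(ft()) == {"energy", "middle_output"}  # accepted: def re-evaluated
```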
@wanghan-iapcm wanghan-iapcm requested a review from njzjz April 11, 2026 06:35
@dosubot dosubot bot added the new feature label Apr 11, 2026

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 372f2db944


@coderabbitai
Contributor

coderabbitai bot commented Apr 11, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

📝 Walkthrough


Adds optional "middle_output" emission from fitting networks (declaration, computation, and propagation), updates fitting subclasses to include/return it, exposes atomic-submodel accessors, refactors DeepEval input preparation and adds diagnostic eval APIs, and adds tests and YAML expectations for these features.

Changes

  • Fitting base: deepmd/dpmodel/fitting/general_fitting.py
    Add eval_return_middle_output flag, set_return_middle_output(), _middle_output_def(), and compute/return "middle_output" from _call_common() when enabled.
  • Fitting subclasses: deepmd/dpmodel/fitting/dipole_fitting.py, deepmd/dpmodel/fitting/dos_fitting.py, deepmd/dpmodel/fitting/invar_fitting.py, deepmd/dpmodel/fitting/polarizability_fitting.py, deepmd/dpmodel/fitting/property_fitting.py
    Include *self._middle_output_def() in output_def(); Dipole/Polar call sites updated to operate on the full _call_common() dict, update the primary output in place, and return the full results mapping.
  • Model accessors: deepmd/dpmodel/model/frozen.py, deepmd/dpmodel/model/make_model.py, deepmd/dpmodel/model/spin_model.py
    Add get_dp_atomic_model() accessor to expose the underlying DPAtomicModel or return None; add TYPE_CHECKING guards for forward refs.
  • Output validation: deepmd/dpmodel/output_def.py
    fitting_check_output.wrapper.__call__ now calls self.output_def() at runtime and validates the returned mapping against it.
  • DeepEval & preprocessing: deepmd/pt_expt/infer/deep_eval.py
    Extract _prepare_inputs() from _eval_model() to centralize input shaping/tiling/fparam/aparam handling; add _is_spin_model(), eval_typeebd(), eval_descriptor(), and eval_fitting_last_layer() utilities with appropriate guards and middle-output toggling.
  • Tests (fitting middle output): source/tests/common/dpmodel/test_fitting_middle_output.py
    New E2E tests exercising middle_output toggling, shapes, determinism, fparam/aparam handling, output_def() registration, and propagation through Dipole/Polar fittings.
  • Tests (deep_eval / infer): source/tests/pt_expt/infer/test_deep_eval.py, source/tests/infer/test_models.py
    Add tests for the DeepEval APIs (eval_typeebd, eval_descriptor, eval_fitting_last_layer), update eval calls to pass fparam/aparam, add get_dp_atomic_model wrappers and ASE consistency checks.
  • Test data: source/tests/infer/fparam_aparam-testcase.yaml
    Extend YAML expected outputs to include descriptor and fit_ll arrays for inference validation.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant Client as Client
participant DeepEval as DeepEval
participant Model as ExportedModel
participant DPAtomic as DPAtomicModel
Client->>DeepEval: eval_descriptor(coords,box,atypes,fparam?,aparam?)
DeepEval->>DeepEval: _prepare_inputs(...) -> tensors, nframes, natoms
DeepEval->>Model: forward call with tensors
Model-->>DeepEval: results dict (primary output + optional middle_output)
DeepEval->>DPAtomic: get_dp_atomic_model() (if required)
DeepEval-->>Client: numpy array (descriptor / fitting-last-layer / type ebd)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


Suggested labels

enhancement

Suggested reviewers

  • njzjz
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 70.45%, below the required threshold of 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The title accurately and concisely describes the main feature additions: three new evaluation methods (eval_typeebd, eval_descriptor, eval_fitting_last_layer) for the pt_expt module.




Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
deepmd/pt_expt/infer/deep_eval.py (1)

709-723: ⚠️ Potential issue | 🟠 Major

Don’t make eval_descriptor() depend on an unused aparam.

This code path never forwards aparam_t into dp_am.descriptor(), but Lines 715-721 still reject calls when dim_aparam > 0 and the caller omits it. That turns the descriptor-only diagnostic into a fitting-input requirement and will fail aparam-enabled models unless users pass a dummy array.

One way to decouple descriptor evaluation from fitting-only inputs
     def _prepare_inputs(
         self,
         coords: np.ndarray,
         cells: np.ndarray | None,
         atom_types: np.ndarray,
         fparam: np.ndarray | None,
         aparam: np.ndarray | None,
+        *,
+        require_aparam: bool = True,
     ) -> tuple:
@@
-        elif self.get_dim_aparam() > 0:
+        elif require_aparam and self.get_dim_aparam() > 0:
             # Exported models (.pt2/.pte) are compiled with aparam as a
             # required positional input.  Unlike fparam, there is no default.
             raise ValueError(
                 f"aparam is required for this model (dim_aparam={self.get_dim_aparam()}) "
                 "but was not provided."
             )
@@
-        ) = self._prepare_inputs(coords, cells, atom_types, fparam, aparam)
+        ) = self._prepare_inputs(
+            coords,
+            cells,
+            atom_types,
+            fparam,
+            aparam,
+            require_aparam=False,
+        )

Also applies to: 1071-1092

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/pt_expt/infer/deep_eval.py` around lines 709 - 723, The
eval_descriptor() path should not require aparam when it isn't passed to
dp_am.descriptor(); remove the ValueError branch and instead set aparam_t = None
when aparam is missing even if get_dim_aparam() > 0 so descriptor evaluation can
proceed without a dummy array. Update the block around aparam handling in
eval_descriptor() (where aparam_t is created and where get_dim_aparam() is
checked) to avoid raising and ensure downstream calls to dp_am.descriptor()
receive None when aparam is omitted.
🧹 Nitpick comments (1)
deepmd/dpmodel/output_def.py (1)

97-97: Remove stale cached self.md to avoid misleading state.

self.md is initialized but no longer used after switching to call-time md. Cleaning it up will reduce confusion.

Also applies to: 105-107

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/dpmodel/output_def.py` at line 97, Remove the stale cached attribute
self.md that is set via self.output_def()—delete the assignment "self.md =
self.output_def()" (and the same stale assignments at the other occurrences
around lines 105–107) and any dead references to self.md so the code uses the
call-time md only; locate this in the class constructor (or initializer) where
output_def() is invoked and remove the self.md assignment while leaving the
output_def method and any call-site usages that pass md unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deepmd/dpmodel/fitting/general_fitting.py`:
- Around line 441-447: The _middle_output_def method and related middle-output
handling assume at least one hidden layer by using self.neuron[-1]; add a guard
that checks eval_return_middle_output together with the neuron config (e.g., if
self.eval_return_middle_output and not self.neuron: raise
ValueError("middle_output requires at least one hidden layer") or disable the
feature), and apply the same guard/validation in the other place that accesses
self.neuron[-1] (the code referenced near lines 723-724) so enabling
middle_output when neuron is empty is rejected with a clear error or safely
bypassed.

In `@deepmd/pt_expt/infer/deep_eval.py`:
- Around line 1071-1091: The unpacking from self._prepare_inputs and
dp_am.descriptor creates many unused locals that trigger RUF059; adjust the
tuple unpacking to mark unused values with underscore-prefixed names (e.g.,
_aparam_t, _nframes) or single underscores for values you won't use, and
similarly change rot_mat, g2, h2, sw to _rot_mat, _g2, _h2, _sw (or _
placeholders) so only needed symbols like ext_coord_t, ext_atype_t, nlist_t,
mapping_t, fparam_t and descriptor remain as used variables; update the unpack
statements in the block around the call sites of _prepare_inputs and
dp_am.descriptor accordingly.
- Around line 997-1003: The _is_spin_model method currently checks
isinstance(self._dpmodel, SpinModel) which misses PT Expt spin models; change it
to return the precomputed flag (use getattr(self, "_is_spin", False)) so
_is_spin_model() reliably detects spin models (affecting _is_spin_model,
eval_typeebd, eval_descriptor, and eval_fitting_last_layer guards).

In `@source/tests/common/dpmodel/test_fitting_middle_output.py`:
- Line 97: The test currently unpacks three values from
self._build_fitting(mixed_types=True) into ft, descriptor, atype but never uses
descriptor or atype, causing ruff RUF059; fix this by changing the unpack to use
underscore-prefixed placeholders (e.g., ft, _descriptor, _atype =
self._build_fitting(mixed_types=True)) or a starred ignore (ft, *_ =
self._build_fitting(mixed_types=True)) so only ft is treated as used while
satisfying ruff, referencing the _build_fitting call and the ft variable.

In `@source/tests/pt_expt/infer/test_deep_eval.py`:
- Around line 1667-1692: The test currently only asserts zbl_am is not an
instance of DPAtomicModelDP; instead call the new API: invoke
get_dp_atomic_model() on the DPZBLLinearEnergyAtomicModel instance
(zbl_am.get_dp_atomic_model()) and assert it returns None (e.g.
self.assertIsNone(zbl_am.get_dp_atomic_model())), keeping the existing
DPAtomicModelDP instance (dp_am) creation for context.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: d2119289-f2dc-4a74-90fb-243855953c0d

📥 Commits

Reviewing files that changed from the base of the PR and between baab3e8 and 372f2db.

📒 Files selected for processing (15)
  • deepmd/dpmodel/fitting/dipole_fitting.py
  • deepmd/dpmodel/fitting/dos_fitting.py
  • deepmd/dpmodel/fitting/general_fitting.py
  • deepmd/dpmodel/fitting/invar_fitting.py
  • deepmd/dpmodel/fitting/polarizability_fitting.py
  • deepmd/dpmodel/fitting/property_fitting.py
  • deepmd/dpmodel/model/frozen.py
  • deepmd/dpmodel/model/make_model.py
  • deepmd/dpmodel/model/spin_model.py
  • deepmd/dpmodel/output_def.py
  • deepmd/pt_expt/infer/deep_eval.py
  • source/tests/common/dpmodel/test_fitting_middle_output.py
  • source/tests/infer/fparam_aparam-testcase.yaml
  • source/tests/infer/test_models.py
  • source/tests/pt_expt/infer/test_deep_eval.py

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
source/tests/pt_expt/infer/test_deep_eval.py (1)

1667-1692: ⚠️ Potential issue | 🟡 Minor

Assert the new API contract in the ZBL test.

At Line 1691, the test only checks isinstance(...) and never validates get_dp_atomic_model(). This misses the behavior the test name/docstring claims to cover.

Suggested fix
@@
-        # zbl_am is a LinearEnergyAtomicModel, not DPAtomicModel
-        self.assertFalse(isinstance(zbl_am, DPAtomicModelDP))
+        # ZBL linear model has no single underlying DPAtomicModel
+        self.assertIsNone(zbl_am.get_dp_atomic_model())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/pt_expt/infer/test_deep_eval.py` around lines 1667 - 1692, The
test test_zbl_model_returns_none currently only asserts zbl_am is not an
instance of DPAtomicModelDP; update it to also validate the new API by calling
zbl_am.get_dp_atomic_model() and asserting it returns the underlying
DPAtomicModel (dp_am) or is an instance of DPAtomicModelDP (e.g.,
self.assertIs(zbl_am.get_dp_atomic_model(), dp_am) or
self.assertIsInstance(zbl_am.get_dp_atomic_model(), DPAtomicModelDP)); ensure
you reference the existing local variables zbl_am and dp_am and keep the
existing isinstance check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@source/tests/pt_expt/infer/test_deep_eval.py`:
- Around line 1717-1738: The test claims to validate
FrozenModel.get_dp_atomic_model() but deserializes a regular EnergyModel;
replace or adjust the test so it actually exercises a FrozenModel: wrap the
EnergyModel in a FrozenModel (create FrozenModel(model) or otherwise construct a
FrozenModel) before serializing with deserialize_to_file, then deserialize into
DeepPot and assert that dp.deep_eval._dpmodel is an instance of FrozenModel and
that dp.deep_eval._dpmodel.get_dp_atomic_model() returns a DPAtomicModelDP;
alternatively, if you prefer not to change behavior, rename the test and
docstring to reflect that it validates a regular model delegation rather than
FrozenModel. Ensure references to FrozenModel, get_dp_atomic_model,
deserialize_to_file, DeepPot, and deep_eval._dpmodel are updated accordingly.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: ea1128ee-1903-4ac7-a915-ee0c8372fb07

📥 Commits

Reviewing files that changed from the base of the PR and between 372f2db and 13fefa6.

📒 Files selected for processing (1)
  • source/tests/pt_expt/infer/test_deep_eval.py

- Raise ValueError (not warning) when set_return_middle_output(True) with
  neuron=[] (no hidden layers)
- Fix _is_spin_model() to use precomputed self._is_spin flag instead of
  isinstance check against dpmodel SpinModel (which misses pt_expt SpinModel)
- Fix RUF059 unused unpacked variables in eval_descriptor and
  eval_fitting_last_layer
- Fix RUF059 unused unpack in test_middle_output_registered_in_output_def
- Rename test_frozen_model_delegates to test_deserialized_model_delegates
  to match what it actually tests
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (1)
source/tests/pt_expt/infer/test_deep_eval.py (1)

1667-1691: ⚠️ Potential issue | 🟡 Minor

Assert the new get_dp_atomic_model() contract directly.

This still only proves zbl_am is not a DPAtomicModel. The test would keep passing even if DPZBLLinearEnergyAtomicModel.get_dp_atomic_model() regressed, so it does not cover the API named in the test/docstring.

Minimal fix
-        # LinearEnergyAtomicModel is not a DPAtomicModel
-        self.assertFalse(isinstance(zbl_am, DPAtomicModelDP))
+        self.assertIsNone(zbl_am.get_dp_atomic_model())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/pt_expt/infer/test_deep_eval.py` around lines 1667 - 1691, The
test currently only checks isinstance(zbl_am, DPAtomicModelDP) but should assert
the get_dp_atomic_model() contract directly; update the test to call
DPZBLLinearEnergyAtomicModel.get_dp_atomic_model() on zbl_am and assert it
returns None (e.g., self.assertIsNone(zbl_am.get_dp_atomic_model())), keeping
the existing construction of zbl_am and imports intact so the assertion verifies
the API behavior rather than type identity.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deepmd/pt_expt/infer/deep_eval.py`:
- Around line 1151-1164: The code unconditionally sets
fitting_net.set_return_middle_output(True) and then resets to False, which
overwrites any prior setting; modify the block to first read and store the
fitting_net's current middle_output state (e.g., via an existing getter or
attribute on fitting_net), set it to True for the call, and in the finally
restore the original stored value instead of hardcoding False; update references
around dp_am.fitting_net, fitting_net.set_return_middle_output and the
try/finally so the original setting is preserved for subsequent callers.

In `@source/tests/pt_expt/infer/test_deep_eval.py`:
- Around line 1777-1782: The test swaps in a SpinModel into
cls.dp.deep_eval._dpmodel but doesn't update the internal spin flag, so
_is_spin_model() still sees deep_eval._is_spin as False and the spin-only guards
in eval_descriptor() and eval_fitting_last_layer() are not exercised; after
assigning cls.dp.deep_eval._dpmodel = cls.spin_model, set
cls.dp.deep_eval._is_spin = True (or otherwise mark the deep_eval instance as
spin) so _is_spin_model() returns True and the NotImplementedError paths are
properly tested.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 116a01e7-ade6-4e0b-96f1-f46567668e8f

📥 Commits

Reviewing files that changed from the base of the PR and between 13fefa6 and f012b21.

📒 Files selected for processing (4)
  • deepmd/dpmodel/fitting/general_fitting.py
  • deepmd/pt_expt/infer/deep_eval.py
  • source/tests/common/dpmodel/test_fitting_middle_output.py
  • source/tests/pt_expt/infer/test_deep_eval.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • deepmd/dpmodel/fitting/general_fitting.py

pt_expt.SpinModel inherits from dpmodel.SpinModel, so isinstance works
correctly. The coderabbitai suggestion to use self._is_spin was wrong.
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
deepmd/pt_expt/infer/deep_eval.py (1)

684-723: ⚠️ Potential issue | 🟠 Major

Don't require fitting-only params on the descriptor path.

eval_descriptor() drops aparam_t completely and only forwards fparam_t when dp_am.add_chg_spin_ebd is enabled, but _prepare_inputs() now raises for any nonzero dim_aparam / dim_fparam. That breaks descriptor diagnostics for models whose descriptor is independent of those inputs.

Minimal direction
 def _prepare_inputs(
     self,
     coords: np.ndarray,
     cells: np.ndarray | None,
     atom_types: np.ndarray,
     fparam: np.ndarray | None,
     aparam: np.ndarray | None,
+    *,
+    require_fparam: bool = True,
+    require_aparam: bool = True,
 ) -> tuple:
@@
-        elif self.get_dim_fparam() > 0:
+        elif require_fparam and self.get_dim_fparam() > 0:
@@
-        elif self.get_dim_aparam() > 0:
+        elif require_aparam and self.get_dim_aparam() > 0:

Then have eval_descriptor() call _prepare_inputs(..., require_aparam=False, require_fparam=getattr(dp_am, "add_chg_spin_ebd", False)).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/pt_expt/infer/deep_eval.py` around lines 684 - 723, _eval_descriptor()
currently calls _prepare_inputs() which unconditionally raises if dim_aparam or
dim_fparam > 0, preventing descriptor-only evaluation when those params are only
used for fitting; change the _prepare_inputs() call sites (notably in
eval_descriptor) to pass flags that relax these requirements: add optional
parameters require_aparam (default True) and require_fparam (default True) to
_prepare_inputs(), and in eval_descriptor() call _prepare_inputs(...,
require_aparam=False, require_fparam=getattr(dp_am, "add_chg_spin_ebd", False))
so aparam is not required for descriptor-only runs while fparam is only required
when dp_am.add_chg_spin_ebd is true; update any other callers as needed to
preserve current behavior when inputs are actually required.
♻️ Duplicate comments (1)
deepmd/pt_expt/infer/deep_eval.py (1)

1155-1168: ⚠️ Potential issue | 🟠 Major

Restore the previous middle_output setting.

This helper always leaves fitting_net with eval_return_middle_output=False. If the fitting net was already configured to emit "middle_output", this diagnostic call mutates shared model state and changes later evaluations.

Minimal fix
             atype = ext_atype_t[:, :natoms]
             fitting_net = dp_am.fitting_net
+            prev_return_middle_output = getattr(
+                fitting_net, "eval_return_middle_output", False
+            )
             fitting_net.set_return_middle_output(True)
             try:
                 ret = fitting_net(
                     descriptor,
                     atype,
@@
                     fparam=fparam_t,
                     aparam=aparam_t,
                 )
             finally:
-                fitting_net.set_return_middle_output(False)
+                fitting_net.set_return_middle_output(prev_return_middle_output)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/pt_expt/infer/deep_eval.py` around lines 1155 - 1168, The call
unconditionally resets fitting_net.set_return_middle_output(False) in the
finally block, which clobbers any previous configuration; capture the prior
state before changing it and restore that saved value in the finally block.
Specifically, before calling fitting_net.set_return_middle_output(True) read and
store the current value (e.g., prev = fitting_net.get_return_middle_output() or
store the attribute used internally), then call set_return_middle_output(True),
run fitting_net(...) to get ret, and in the finally call
set_return_middle_output(prev) to restore the original setting; update the code
paths around fitting_net, set_return_middle_output, and the try/finally in
deep_eval.py accordingly.
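The save-and-restore fix above generalizes naturally to a context manager. The helper below is hypothetical (it is not part of this PR) and uses a stub in place of the real fitting net; it shows the same invariant: the caller's prior toggle state survives the diagnostic call.

```python
from contextlib import contextmanager

@contextmanager
def return_middle_output(fitting_net):
    """Temporarily enable middle_output, restoring the caller's setting."""
    prev = getattr(fitting_net, "eval_return_middle_output", False)
    fitting_net.set_return_middle_output(True)
    try:
        yield fitting_net
    finally:
        fitting_net.set_return_middle_output(prev)

class Stub:
    """Stand-in for a fitting net the caller has already configured."""

    def __init__(self):
        self.eval_return_middle_output = True  # caller already enabled it

    def set_return_middle_output(self, enable=True):
        self.eval_return_middle_output = enable

net = Stub()
with return_middle_output(net):
    assert net.eval_return_middle_output is True
assert net.eval_return_middle_output is True  # prior setting preserved
```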
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deepmd/pt_expt/infer/deep_eval.py`:
- Around line 1032-1040: eval_descriptor (and the similar method around lines
1094-1102) forwards coords, cells, and atom_types directly to _prepare_inputs
which assumes numpy arrays; callers passing Python lists will hit AttributeError
on .shape. Before calling _prepare_inputs, normalize the public inputs the same
way eval() does: coerce coords and atom_types to numpy arrays (coords =
np.asarray(coords), atom_types = np.asarray(atom_types)), coerce cells to None
or np.asarray(cells) if provided, and apply the same fparam/aparam handling used
by eval(); then pass these normalized arrays into _prepare_inputs so list inputs
work identically to the eval() API.
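The normalization suggested above can be sketched as a small helper. normalize_inputs is a hypothetical name for illustration; the real method in deep_eval.py may fold this into _prepare_inputs directly.

```python
import numpy as np


def normalize_inputs(coords, cells, atom_types, fparam=None, aparam=None):
    """Coerce public inputs to numpy arrays, mirroring what eval() does."""
    coords = np.asarray(coords)
    atom_types = np.asarray(atom_types)
    # Optional inputs stay None when absent, otherwise become arrays too.
    cells = None if cells is None else np.asarray(cells)
    fparam = None if fparam is None else np.asarray(fparam)
    aparam = None if aparam is None else np.asarray(aparam)
    return coords, cells, atom_types, fparam, aparam


# Plain Python lists now behave identically to ndarrays:
coords, cells, atom_types, _, _ = normalize_inputs(
    [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0]], None, [[0, 1]]
)
print(coords.shape, atom_types.shape)  # (1, 6) (1, 2)
```

Passing the normalized arrays into _prepare_inputs removes the AttributeError on .shape for list callers without changing behavior for existing ndarray callers.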

---

Outside diff comments:
In `@deepmd/pt_expt/infer/deep_eval.py`:
- Around line 684-723: _eval_descriptor() currently calls _prepare_inputs()
which unconditionally raises if dim_aparam or dim_fparam > 0, preventing
descriptor-only evaluation when those params are only used for fitting; change
the _prepare_inputs() call sites (notably in eval_descriptor) to pass flags that
relax these requirements: add optional parameters require_aparam (default True)
and require_fparam (default True) to _prepare_inputs(), and in eval_descriptor()
call _prepare_inputs(..., require_aparam=False, require_fparam=getattr(dp_am,
"add_chg_spin_ebd", False)) so aparam is not required for descriptor-only runs
while fparam is only required when dp_am.add_chg_spin_ebd is true; update any
other callers as needed to preserve current behavior when inputs are actually
required.

---

Duplicate comments:
In `@deepmd/pt_expt/infer/deep_eval.py`:
- Around line 1155-1168: The call unconditionally resets
fitting_net.set_return_middle_output(False) in the finally block, which clobbers
any previous configuration; capture the prior state before changing it and
restore that saved value in the finally block. Specifically, before calling
fitting_net.set_return_middle_output(True) read and store the current value
(e.g., prev = fitting_net.get_return_middle_output() or store the attribute used
internally), then call set_return_middle_output(True), run fitting_net(...) to
get ret, and in the finally call set_return_middle_output(prev) to restore the
original setting; update the code paths around fitting_net,
set_return_middle_output, and the try/finally in deep_eval.py accordingly.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 42260024-e1cc-45fa-905b-8a7e51d9399c

📥 Commits

Reviewing files that changed from the base of the PR and between f012b21 and 1df931a.

📒 Files selected for processing (1)
  • deepmd/pt_expt/infer/deep_eval.py

Accept Python lists (not just np.ndarray) by coercing coords, cells,
atom_types with np.array() at the top, matching the eval() API.
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
source/tests/pt_expt/infer/test_deep_eval.py (2)

1439-1444: Normalize the swapped-in eager models to inference mode and dtype.

model_fp is the only helper model in these setups that is neither moved to torch.float64 nor switched to .eval() before replacing deep_eval._dpmodel. That works only as long as the eager diagnostic path stays insensitive to dtype/train-mode differences.

Proposed change
         model_fp = EnergyModel(ds_fp, ft_fp, type_map=cls.type_map)
+        model_fp = model_fp.to(torch.float64)
+        model_fp.eval()
         pte_fp = os.path.join(cls._tmpdir, "fp.pte")
         deserialize_to_file(pte_fp, {"model": model_sea.serialize()})
         cls.dp_fp = DeepPot(pte_fp)
         cls.dp_fp.deep_eval._dpmodel = model_fp

Also applies to: 1590-1595

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/pt_expt/infer/test_deep_eval.py` around lines 1439 - 1444, The
swapped-in eager EnergyModel instance (model_fp) is not set to the correct dtype
or eval mode before being assigned to DeepPot.deep_eval._dpmodel; update the
setup to call model_fp.double() (or .to(torch.float64)) and model_fp.eval()
after constructing EnergyModel (and do the same for the analogous helper model
in the 1590-1595 block) so the eager diagnostic path uses inference-mode and
float64 dtype when replacing deep_eval._dpmodel.

1476-1488: Avoid bitwise equality for floating-point diagnostic outputs.

These values come from PT inference, so assert_array_equal() enforces bitwise equality, which is stricter than the behavior you actually want to validate and can become brittle on different kernels/devices. assert_allclose() with tight tolerances is safer here. Please verify this on the CUDA/PT test lane as well.

Proposed change
-        np.testing.assert_array_equal(d1, d2)
+        np.testing.assert_allclose(d1, d2, rtol=1e-12, atol=1e-12)
...
-        np.testing.assert_array_equal(d1, d2)
+        np.testing.assert_allclose(d1, d2, rtol=1e-12, atol=1e-12)
...
-        np.testing.assert_array_equal(fit_ll1, fit_ll2)
+        np.testing.assert_allclose(fit_ll1, fit_ll2, rtol=1e-12, atol=1e-12)
...
-        np.testing.assert_array_equal(fit_ll1, fit_ll2)
+        np.testing.assert_allclose(fit_ll1, fit_ll2, rtol=1e-12, atol=1e-12)

Also applies to: 1630-1650
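The difference between the two assertions can be seen on a toy array. The 1e-13 perturbation below is a hypothetical stand-in for last-ulp noise from a different kernel or device.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = a + 1e-13  # tiny numerical noise standing in for a kernel difference

# np.testing.assert_array_equal(a, b) would raise AssertionError here,
# while a tight allclose tolerance still catches real regressions:
np.testing.assert_allclose(a, b, rtol=1e-12, atol=1e-12)
```

The effective per-element tolerance is atol + rtol * |desired|, so 1e-12/1e-12 still rejects anything beyond a few ulps at these magnitudes.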

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/pt_expt/infer/test_deep_eval.py` around lines 1476 - 1488,
Replace bitwise equality checks in the deterministic descriptor tests by using
numpy.testing.assert_allclose with tight tolerances: in
test_descriptor_deterministic_sea and test_descriptor_deterministic_dpa1 call
np.testing.assert_allclose(d1, d2, rtol=1e-6, atol=1e-8) (or similar tight
tolerances) instead of np.testing.assert_array_equal; do the same for the other
similar tests that call deep_eval.eval_descriptor (e.g., the tests around the
later block) and run the CUDA/PT lane to confirm tolerances are acceptable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@source/tests/pt_expt/infer/test_deep_eval.py`:
- Around line 1361-1367: The test test_typeebd_dpa1 is too permissive: it
accepts both ntypes and ntypes+1 rows from deep_eval.eval_typeebd(), which hides
a regression where a padding row is returned; update the test to assert the
documented contract exactly by requiring typeebd.shape[0] == self.nt (not
allowing self.nt+1), or if the implementation intentionally includes a padding
row, update the API/docs in deepmd/pt_expt/infer/deep_eval.py to state it
returns (ntypes+1, tebd_dim) and then change this test (and the analogous test
at the other location) to assert shape[0] == self.nt + 1 and adjust expectations
accordingly so tests reflect the declared contract for eval_typeebd().
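An exact-contract assertion avoids the permissive either/or check. The array below is a hypothetical stand-in for the eval_typeebd() output; the (ntypes, tebd_dim) shape is the contract assumed in the comment above.

```python
import numpy as np

# Hypothetical type-embedding table: one row per real atom type,
# no padding row.
ntypes, tebd_dim = 4, 8
typeebd = np.random.default_rng(0).normal(size=(ntypes, tebd_dim))

# Assert the documented contract exactly; a check accepting either ntypes
# or ntypes + 1 rows would silently pass an unintended padding row.
assert typeebd.shape == (ntypes, tebd_dim)
```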

---

Nitpick comments:
In `@source/tests/pt_expt/infer/test_deep_eval.py`:
- Around line 1439-1444: The swapped-in eager EnergyModel instance (model_fp) is
not set to the correct dtype or eval mode before being assigned to
DeepPot.deep_eval._dpmodel; update the setup to call model_fp.double() (or
.to(torch.float64)) and model_fp.eval() after constructing EnergyModel (and do
the same for the analogous helper model in the 1590-1595 block) so the eager
diagnostic path uses inference-mode and float64 dtype when replacing
deep_eval._dpmodel.
- Around line 1476-1488: Replace bitwise equality checks in the deterministic
descriptor tests by using numpy.testing.assert_allclose with tight tolerances:
in test_descriptor_deterministic_sea and test_descriptor_deterministic_dpa1 call
np.testing.assert_allclose(d1, d2, rtol=1e-6, atol=1e-8) (or similar tight
tolerances) instead of np.testing.assert_array_equal; do the same for the other
similar tests that call deep_eval.eval_descriptor (e.g., the tests around the
later block) and run the CUDA/PT lane to confirm tolerances are acceptable.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: cce00baf-8f7d-4d55-b18f-96d82106d4a4

📥 Commits

Reviewing files that changed from the base of the PR and between 1df931a and 4f4983d.

📒 Files selected for processing (2)
  • deepmd/pt_expt/infer/deep_eval.py
  • source/tests/pt_expt/infer/test_deep_eval.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • deepmd/pt_expt/infer/deep_eval.py

@codecov

codecov bot commented Apr 11, 2026

Codecov Report

❌ Patch coverage is 95.87629% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.35%. Comparing base (baab3e8) to head (4f4983d).

Files with missing lines            Patch %  Lines
deepmd/pt_expt/infer/deep_eval.py   96.42%   2 Missing ⚠️
deepmd/dpmodel/model/frozen.py      50.00%   1 Missing ⚠️
deepmd/dpmodel/model/make_model.py  80.00%   1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5391      +/-   ##
==========================================
+ Coverage   80.33%   80.35%   +0.02%     
==========================================
  Files         819      819              
  Lines       85356    85445      +89     
  Branches     4139     4139              
==========================================
+ Hits        68571    68662      +91     
+ Misses      15509    15508       -1     
+ Partials     1276     1275       -1     

☔ View full report in Codecov by Sentry.


@wanghan-iapcm wanghan-iapcm added the Test CUDA Trigger test CUDA workflow label Apr 11, 2026
@github-actions github-actions bot removed the Test CUDA Trigger test CUDA workflow label Apr 11, 2026
@njzjz njzjz added this pull request to the merge queue Apr 11, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 11, 2026

3 participants