Conversation


@willg-nv commented Dec 17, 2025

What does this PR do?

Type of change: new feature

Overview: This PR integrates the automated QDQ placement tool into ModelOpt. It is part 3 of 4 of the change and contains the following:

  1. Implements the QDQAutotuner class and the Autotuner CLI interface.
  2. Implements a benchmark to measure the end-to-end (E2E) inference time of QDQ models.
  3. Adds unit tests for the QDQ Autotuner and its configuration.

Part 1: #701
Part 2: #702
Part 3: #703
Part 4: #704

Usage

python -m modelopt.onnx.quantization.autotune --model model.onnx

Testing

Implemented unit tests for QDQAutotuner and Config classes.

Before your PR is "Ready for review"

  • Make sure you read and follow the Contributor guidelines and that your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: Yes
  • Did you add or update any necessary documentation?: No; documentation will be added in part 4.
  • Did you update the Changelog?: No; the changelog will be updated in part 4.

Additional Information

@willg-nv willg-nv requested a review from a team as a code owner December 17, 2025 06:56

copy-pr-bot bot commented Dec 17, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@willg-nv (Author)

@vishalpandya1990 could you help me review this PR? thanks!

@vishalpandya1990 (Contributor)

> @vishalpandya1990 could you help me review this PR? thanks!

Sorry for the delay. Added Ajinkya for review.

@willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part3 branch from 3454bba to 4b9d789 on December 31, 2025 02:09
@@ -0,0 +1,157 @@
# SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
@gcunhase (Contributor) commented Jan 8, 2026


Please update the year to 2026.

"--output",
"-o",
type=str,
default=DEFAULT_OUTPUT_DIR,
@gcunhase (Contributor) commented Jan 8, 2026

Can we update this behavior to match the ONNX quantization and Autocast workflows?

There, if output_path is not given, the resulting model is saved next to the input model with a name extension. For example, the quantized model.onnx is saved as model.quant.onnx, and the converted model is saved as model.fp16.onnx.
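The convention described here could be sketched as below. The helper name and the `.autotuned` suffix are assumptions for illustration, not the actual ModelOpt implementation:

```python
from pathlib import Path


def default_output_path(onnx_path: str, suffix: str = "autotuned") -> str:
    """Derive an output path next to the input model, mirroring the quantize
    (model.quant.onnx) and autocast (model.fp16.onnx) conventions.
    The 'autotuned' suffix is a hypothetical placeholder."""
    p = Path(onnx_path)
    return str(p.with_name(f"{p.stem}.{suffix}{p.suffix}"))


print(default_output_path("models/resnet50.onnx"))  # models/resnet50.autotuned.onnx
```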

See more details in:

Contributor

Suggestion: rename this as output_path to match other ONNX workflows.

help=f"Number of timing runs (default: {DEFAULT_TIMING_RUNS})",
)
trt_group.add_argument(
"--plugin-libraries",
Contributor

@ajrasane should we update all argparse option names to use _ instead of - to match the ONNX quant and autocast workflows?
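For what it's worth, argparse already normalizes dashes when deriving the attribute name, so the rename would only change what users type on the command line, not the Python-side access:

```python
import argparse

parser = argparse.ArgumentParser()
# argparse replaces '-' with '_' when deriving dest from a long option name,
# so --plugin-libraries is read back as args.plugin_libraries either way.
parser.add_argument("--plugin-libraries", nargs="*", default=[])

args = parser.parse_args(["--plugin-libraries", "libfoo_plugin.so"])
print(args.plugin_libraries)  # ['libfoo_plugin.so']
```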


# Model and Output
io_group = parser.add_argument_group("Model and Output")
io_group.add_argument("--model", "-m", type=str, required=True, help="Path to ONNX model file")
Contributor

Suggestion: rename this as onnx_path to match other ONNX workflows.

from modelopt.onnx.quantization.autotune.common import PatternCache, RegionType


def create_simple_conv_model():
Contributor

Can we move this to tests/_test_utils/onnx/lib_test_models.py? Alternatively, NonSimplifiedModel or build_resnet_block could be used here instead?

@ajrasane WDYT?
