[OMNIML-3776]: add clear docs restrict the model types (#1105)
shengliangxu wants to merge 1 commit into main
Conversation
Our current library does not support loading quantized models, which makes QA confusing. Let's document this clearly. More detail in the NVBug: https://nvbugspro.nvidia.com/bug/5993598

Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>
🧹 Nitpick comments (1)
examples/llm_eval/README.md (1)
62-82: Consider adding the same clarification to the auto_quantize sections. The auto_quantize sections (both here and in the MMLU section at lines 131-142) also perform simulated quantization and use the same scripts (lm_eval_hf.py, mmlu.py) with similar --quant_cfg parameters. For consistency, and to fully address the PR objective of reducing confusion, consider adding the same clarification that the original unquantized model is required as input. Similarly, the "Customize quantization method for evaluation" section (lines 200-210) could benefit from the same clarification.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `examples/llm_eval/README.md` around lines 62-82, add a short clarifying sentence to the auto_quantize sections (the blocks referencing lm_eval_hf.py and mmlu.py and flags like --quant_cfg and --auto_quantize_bits) stating that auto_quantize performs simulated per-layer quantization and therefore requires the original unquantized pretrained model as input (not an already-quantized checkpoint); also add the same clarification to the "Customize quantization method for evaluation" section that describes using --quant_cfg, so readers know to provide the original model for these simulated quantization workflows.
📒 Files selected for processing (1): examples/llm_eval/README.md
Codecov Report ✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff            @@
##             main    #1105    +/-  ##
========================================
- Coverage   70.24%   70.23%   -0.02%
========================================
  Files         227      227
  Lines       25909    25909
========================================
- Hits        18201    18198       -3
- Misses       7708     7711       +3
```