refactor(examples): rename llm_ptq → hf_ptq (symlink for back-compat) #1759
Edwardf0t1 wants to merge 2 commits
The example covers Hugging Face LLM and VLM PTQ, so "llm_ptq" has been a misnomer since the vlm_ptq consolidation. Rename the directory to examples/hf_ptq and leave a relative symlink examples/llm_ptq -> hf_ptq so existing paths and commands keep working through a deprecation window.

- git mv examples/llm_ptq -> examples/hf_ptq and tests/examples/llm_ptq -> tests/examples/hf_ptq (CI maps the matrix name to both examples/<name> and tests/examples/<name>).
- Add back-compat symlink examples/llm_ptq -> hf_ptq (tracked as a symlink).
- Update CI matrices and all repo path references (docs, READMEs, skills, launcher/debugger tools, tests) from llm_ptq to hf_ptq. Python identifiers and test-util module names (run_llm_ptq_command, llm_ptq_utils) are kept.
- Preserve the CODEOWNERS team slug and historical CHANGELOG entries; add a CHANGELOG deprecation note for the rename.

Follow-up to the examples/vlm_ptq -> examples/llm_ptq consolidation (#1705), targeted for the same 0.46 release.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>
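The commit message notes that CI maps each matrix name to both examples/<name> and tests/examples/<name>, which is why both directories are moved together. A minimal sketch of that mapping follows, assuming this layout; the helper names `example_dirs` and `missing_dirs` are hypothetical and are not part of the repository or its workflows.

```python
from pathlib import Path


def example_dirs(name: str, repo_root: Path) -> tuple[Path, Path]:
    """Return the (example, test) directories one matrix entry is assumed to map to."""
    return repo_root / "examples" / name, repo_root / "tests" / "examples" / name


def missing_dirs(names: list[str], repo_root: Path) -> list[Path]:
    """List directories that the matrix entries expect but that do not exist."""
    missing = []
    for name in names:
        for directory in example_dirs(name, repo_root):
            # A rename that moves only one of the two trees shows up here.
            if not directory.is_dir():
                missing.append(directory)
    return missing
```

Running something like `missing_dirs(["hf_ptq"], Path("."))` from the repository root would report whichever of the two directories was left behind by a partial rename.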
📝 Walkthrough
Changes: llm_ptq → hf_ptq rename and vlm_ptq consolidation
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks | ✅ 5 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (5 passed)
Warning
CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.
Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.agents/skills/ptq/SKILL.md (1)
108-108: ⚠️ Potential issue | 🟡 Minor
Correct the launcher script path to match the actual template location.
Line 108 references `common/hf_ptq/hf_ptq.sh`, which does not exist. The correct path is `common/hf/ptq.sh`, as documented in the launcher guide itself. Replace `common/hf_ptq/hf_ptq.sh` with `common/hf/ptq.sh`.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@.agents/skills/ptq/SKILL.md` at line 108, the launcher script path referenced in the SKILL.md file is incorrect. Locate the reference to `common/hf_ptq/hf_ptq.sh` on line 108 and replace it with the correct path `common/hf/ptq.sh` to match the actual template location documented in the launcher guide.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.agents/skills/ptq/references/unsupported-models.md:
- Line 150: In the unsupported-models.md file, replace the broken documentation
link `examples/hf_ptq/moe.md` with the correct path `examples/hf_ptq/README.md`.
The README file contains the actual MoE quantization documentation that should
be referenced instead of the non-existent moe.md file.
In `@README.md`:
- Line 33: The README.md file contains anchor links pointing to sections in
examples/hf_ptq/README.md that do not exist as markdown headings. To fix this,
either add the missing sections (`#llama-4`,
`#model-quantization-and-trt-llm-conversion`, and
`#deploy-fp8-quantized-model-using-vllm`) as properly formatted markdown headings
in examples/hf_ptq/README.md, or update the links in the main README to
reference only the existing anchors (`#support-matrix` and
`#hugging-face-supported-models`). Choose the approach that best maintains the
documentation structure and user experience.
---
Outside diff comments:
In @.agents/skills/ptq/SKILL.md:
- Line 108: The launcher script path referenced in the SKILL.md file is
incorrect. Locate the reference to `common/hf_ptq/hf_ptq.sh` on line 108 and
replace it with the correct path `common/hf/ptq.sh` to match the actual template
location documented in the launcher guide.
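The README finding above turns on whether each `#anchor` in a link matches a real heading in the target file. A rough sketch of such a check follows. It assumes a simplified version of GitHub's heading-to-anchor rule (lowercase, drop punctuation, spaces to hyphens), and the function names are hypothetical; GitHub's real rule has more edge cases, such as duplicate headings.

```python
import re

FENCE = "`" * 3  # code-fence marker, built this way to keep the example readable


def github_slug(heading: str) -> str:
    """Approximate the anchor GitHub generates for a heading (simplified rule)."""
    text = heading.strip().lower()
    text = re.sub(r"[^\w\- ]", "", text)  # keep word characters, hyphens, spaces
    return text.replace(" ", "-")


def heading_anchors(markdown: str) -> set[str]:
    """Collect anchors for all ATX headings, skipping fenced code blocks."""
    anchors = set()
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        match = re.match(r"#{1,6}\s+(.*)", line)
        if match and not in_fence:
            anchors.add(github_slug(match.group(1)))
    return anchors


def missing_anchors(links: list[str], markdown: str) -> list[str]:
    """Return the linked anchors that have no matching heading."""
    available = heading_anchors(markdown)
    return [link for link in links if link.lstrip("#") not in available]
```

Feeding it the text of examples/hf_ptq/README.md together with the anchors used in the top-level README would list the ones that need either a new heading or a repointed link.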
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 36b1e7eb-b9ba-4047-a71d-d2be1dedff6f
📒 Files selected for processing (65)
- .agents/skills/common/environment-setup.md
- .agents/skills/deployment/references/support-matrix.md
- .agents/skills/ptq/SKILL.md
- .agents/skills/ptq/references/slurm-setup-ptq.md
- .agents/skills/ptq/references/unsupported-models.md
- .github/CODEOWNERS
- .github/workflows/_example_tests_runner.yml
- .github/workflows/example_tests.yml
- CHANGELOG.rst
- README.md
- docs/source/deployment/3_unified_hf.rst
- docs/source/guides/10_recipes.rst
- docs/source/guides/_compress_quantized_models.rst
- docs/source/guides/_customized_model_quantization.rst
- docs/source/index.rst
- examples/deepseek/README.md
- examples/deepseek/deepseek_v4/quantize_to_nvfp4.py
- examples/gpt-oss/README.md
- examples/hf_ptq/.gitignore
- examples/hf_ptq/README.md
- examples/hf_ptq/cast_mxfp4_to_nvfp4.py
- examples/hf_ptq/example_utils.py
- examples/hf_ptq/fsdp2.yaml
- examples/hf_ptq/hf_ptq.py
- examples/hf_ptq/multinode_ptq.py
- examples/hf_ptq/nemotron_vl_calib.py
- examples/hf_ptq/notebooks/1_FP4-FP8_PTQ_Min-Max_Calibration.ipynb
- examples/hf_ptq/notebooks/2_PTQ_AWQ_Calibration.ipynb
- examples/hf_ptq/notebooks/3_PTQ_AutoQuantization.ipynb
- examples/hf_ptq/requirements.txt
- examples/hf_ptq/run_tensorrt_llm.py
- examples/hf_ptq/scripts/huggingface_example.sh
- examples/hf_ptq/scripts/parser.sh
- examples/hf_ptq/vlm_utils.py
- examples/llm_eval/README.md
- examples/llm_ptq
- examples/llm_qat/README.md
- examples/llm_qat/llama_factory/README.md
- examples/llm_qat/notebooks/QAT_QAD_Walkthrough.ipynb
- examples/megatron_bridge/quantize.py
- examples/model_hub/README.md
- examples/pruning/minitron/NVIDIA-Nemotron-Nano-9B-v2/README.md
- examples/speculative_decoding/README.md
- examples/vllm_serve/README.md
- examples/vlm_ptq/README.md
- examples/vlm_ptq/scripts/huggingface_example.sh
- modelopt/recipe/presets.py
- modelopt/torch/quantization/utils/numeric_utils.py
- tests/_test_utils/examples/llm_ptq_example_utils.py
- tests/_test_utils/examples/run_command.py
- tests/examples/hf_ptq/_extensions/test_torch_extensions.py
- tests/examples/hf_ptq/test_cast_mxfp4_to_nvfp4.py
- tests/examples/hf_ptq/test_deploy.py
- tests/examples/hf_ptq/test_example_utils.py
- tests/examples/hf_ptq/test_hf_ptq_args.py
- tests/examples/hf_ptq/test_llm_ptq.py
- tests/examples/hf_ptq/test_vlm_ptq.py
- tests/examples/speculative_decoding/test_eagle_offline_ptq.py
- tests/gpu/torch/export/test_unified_hf_export_and_check_safetensors.py
- tests/gpu/torch/quantization/test_gpt_oss_mxfp4_nvfp4_cast_cuda.py
- tools/debugger/CLAUDE.md
- tools/debugger/README.md
- tools/launcher/common/eagle3/hf_ptq.sh
- tools/launcher/common/hf/ptq.sh
- tools/launcher/examples/Qwen/Qwen3-8B/hf_ptq.yaml
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #1759 +/- ##
==========================================
+ Coverage 74.29% 76.65% +2.36%
==========================================
Files 511 511
Lines 56356 56356
==========================================
+ Hits 41868 43200 +1332
+ Misses 14488 13156 -1332
Remove vlm_ptq codeowner entry
  /examples/llm_distill @NVIDIA/modelopt-torch-distill-codeowners
  /examples/llm_eval @NVIDIA/modelopt-examples-llm_ptq-codeowners
- /examples/llm_ptq @NVIDIA/modelopt-examples-llm_ptq-codeowners
+ /examples/hf_ptq @NVIDIA/modelopt-examples-llm_ptq-codeowners
Should we use @NVIDIA/modelopt-torch-quantization-codeowners here?
  from _test_utils.examples.run_command import MODELOPT_ROOT

- _LLM_PTQ_DIR = MODELOPT_ROOT / "examples" / "llm_ptq"
+ _LLM_PTQ_DIR = MODELOPT_ROOT / "examples" / "hf_ptq"
- _LLM_PTQ_DIR = MODELOPT_ROOT / "examples" / "hf_ptq"
+ _HF_PTQ_DIR = MODELOPT_ROOT / "examples" / "hf_ptq"
rename file to hf_ptq_example_utils.py
Rename function
def run_hf_ptq_command(*, model: str, quant: str, vlm: bool = False, **kwargs):
| LLM Quantization | [View Support Matrix](./examples/hf_ptq/README.md#support-matrix) |
| VLM Quantization | [View Support Matrix](./examples/hf_ptq/README.md#hugging-face-supported-models) |
Both point to the same table. Let's unify:
- | LLM Quantization | [View Support Matrix](./examples/hf_ptq/README.md#support-matrix) |
- | VLM Quantization | [View Support Matrix](./examples/hf_ptq/README.md#hugging-face-supported-models) |
+ | LLM / VLM Quantization | [View Support Matrix](./examples/hf_ptq/README.md#support-matrix) |
- Fix broken docs link examples/hf_ptq/moe.md -> README.md (unsupported-models)
- Fix broken README anchors to existing hf_ptq/README.md headings
- Fix launcher script path common/hf_ptq/hf_ptq.sh -> common/hf/ptq.sh in SKILL.md
- Remove stale examples/vlm_ptq CODEOWNERS entry
- Unify README LLM/VLM support-matrix rows
- Rename test util: llm_ptq_example_utils.py -> hf_ptq_example_utils.py, _LLM_PTQ_DIR -> _HF_PTQ_DIR, run_llm_ptq_command -> run_hf_ptq_command

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Zhiyu Cheng <zhiyuc@nvidia.com>
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/_test_utils/examples/llm_ptq_utils.py`:
- Line 22: The file tests/_test_utils/examples/llm_ptq_utils.py has not been
renamed despite the PR's stated intention to rename the llm_ptq_utils module to
hf_ptq_utils. While the imports within the file have been updated to use the new
naming convention (such as run_hf_ptq_command), the physical file itself still
uses the old name. Rename the file from llm_ptq_utils.py to hf_ptq_utils.py to
complete the module refactoring and ensure consistency with the updated imports.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: e6bde70e-76ac-42bd-9fe0-bec3c6a84b60
📒 Files selected for processing (12)
- .agents/skills/ptq/SKILL.md
- .agents/skills/ptq/references/unsupported-models.md
- .github/CODEOWNERS
- README.md
- tests/_test_utils/examples/hf_ptq_example_utils.py
- tests/_test_utils/examples/llm_ptq_utils.py
- tests/_test_utils/examples/run_command.py
- tests/examples/hf_ptq/test_cast_mxfp4_to_nvfp4.py
- tests/examples/hf_ptq/test_example_utils.py
- tests/examples/hf_ptq/test_vlm_ptq.py
- tests/examples/llm_eval/test_llm_eval.py
- tests/gpu/torch/quantization/test_gpt_oss_mxfp4_nvfp4_cast_cuda.py
💤 Files with no reviewable changes (1)
- .github/CODEOWNERS
✅ Files skipped from review due to trivial changes (3)
- tests/examples/hf_ptq/test_vlm_ptq.py
- .agents/skills/ptq/references/unsupported-models.md
- README.md
🚧 Files skipped from review as they are similar to previous changes (3)
- tests/gpu/torch/quantization/test_gpt_oss_mxfp4_nvfp4_cast_cuda.py
- .agents/skills/ptq/SKILL.md
- tests/examples/hf_ptq/test_example_utils.py
  import pytest
  import torch
- from _test_utils.examples.run_command import run_llm_ptq_command
+ from _test_utils.examples.run_command import run_hf_ptq_command
📐 Maintainability & Code Quality | 🔴 Critical
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify the actual filename and check for old module name references
# Check if the new filename exists
echo "=== Checking for hf_ptq_utils.py ==="
fd -t f 'hf_ptq_utils.py' tests/_test_utils/examples/
# Check if the old filename still exists
echo -e "\n=== Checking for llm_ptq_utils.py ==="
fd -t f 'llm_ptq_utils.py' tests/_test_utils/examples/
# Search for imports of the old module name
echo -e "\n=== Searching for imports of llm_ptq_utils ==="
rg -n 'from _test_utils.examples.llm_ptq_utils import' --type py
# Search for imports of the new module name
echo -e "\n=== Searching for imports of hf_ptq_utils ==="
rg -n 'from _test_utils.examples.hf_ptq_utils import' --type py
Repository: NVIDIA/Model-Optimizer
Length of output: 328
Rename module file from llm_ptq_utils.py to hf_ptq_utils.py to match the PR's stated module renaming.
The file tests/_test_utils/examples/llm_ptq_utils.py still uses its old name despite the PR summary stating that llm_ptq_utils should be renamed to hf_ptq_utils. While the imports within the file were updated (e.g., run_hf_ptq_command), the module file itself was not renamed. Rename the file to complete the module refactoring.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/_test_utils/examples/llm_ptq_utils.py` at line 22, The file
tests/_test_utils/examples/llm_ptq_utils.py has not been renamed despite the
PR's stated intention to rename the llm_ptq_utils module to hf_ptq_utils. While
the imports within the file have been updated to use the new naming convention
(such as run_hf_ptq_command), the physical file itself still uses the old name.
Rename the file from llm_ptq_utils.py to hf_ptq_utils.py to complete the module
refactoring and ensure consistency with the updated imports.
What does this PR do?
Type of change: refactor / deprecation (examples)
Follow-up to #1705 (which consolidated `examples/vlm_ptq` into `examples/llm_ptq`). Since that example now covers Hugging Face LLM and VLM PTQ, the `llm_ptq` name is a misnomer. This renames the directory to `examples/hf_ptq` and leaves a relative symlink `examples/llm_ptq → hf_ptq` so existing paths/commands keep working during a deprecation window.
Requested by @kevalmorabia97 on #1705 (with the symlink-for-back-compat approach), targeted for the same 0.46 release as the consolidation.
Changes
- `git mv examples/llm_ptq → examples/hf_ptq` and `tests/examples/llm_ptq → tests/examples/hf_ptq` (the CI runner maps the matrix name to both `examples/<name>` and `tests/examples/<name>`).
- Add back-compat symlink `examples/llm_ptq → hf_ptq`.
- Update CI matrices and all repo path references from `llm_ptq` to `hf_ptq`.
- Keep Python identifiers and test-util module names (`run_llm_ptq_command`, `llm_ptq_utils`); they name the LLM-PTQ task, not the directory.
- Preserve the CODEOWNERS team slug (`modelopt-examples-llm_ptq-codeowners`) and historical CHANGELOG entries; add a CHANGELOG deprecation note.
Back-compat caveats (inherent to git directory symlinks)
- cwd/pytest resolution work through the symlink.
- `examples/llm_ptq/...` won't navigate in. All internal references are repointed to `hf_ptq`, so the symlink is only for legacy external/CLI use.
Usage (unchanged via symlink)
Testing
- `bash -n` on moved/edited shell scripts (new path + via symlink).
- `py_compile` on moved/edited Python; test re-export shim repointed to `examples/hf_ptq/example_utils`.
- `examples/llm_ptq` as a single symlink (mode 120000), not a duplicated tree (no pre-commit / pytest double-processing).
- `pre-commit run` on all changed files passes.
Before your PR is "Ready for review"
`examples/llm_ptq` paths valid; see caveats above)
Additional Information
Follow-up (later release): remove the `examples/llm_ptq` symlink once external references have migrated.
🤖 Generated with Claude Code
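The testing notes above check that `examples/llm_ptq` is a single symlink (mode 120000) rather than a copied tree. A small sketch of that property follows, using a throwaway directory instead of the real repository. The helper names are hypothetical, the placeholder file only stands in for the real script, and on Windows creating symlinks may need extra privileges.

```python
import os
import stat
import tempfile
from pathlib import Path


def make_compat_symlink(examples_dir: Path, old: str = "llm_ptq", new: str = "hf_ptq") -> Path:
    """Create a relative symlink examples/<old> -> <new> and return its path."""
    link = examples_dir / old
    # The target is the bare directory name, so the link stays valid if the repo moves.
    os.symlink(new, link, target_is_directory=True)
    return link


def is_single_symlink(path: Path) -> bool:
    """True if the path itself is a symlink (file type bits 0o120000), not a real directory."""
    return stat.S_ISLNK(os.lstat(path).st_mode)


with tempfile.TemporaryDirectory() as tmp:
    examples = Path(tmp) / "examples"
    (examples / "hf_ptq").mkdir(parents=True)
    (examples / "hf_ptq" / "hf_ptq.py").write_text("# placeholder\n")

    link = make_compat_symlink(examples)
    link_is_symlink = is_single_symlink(link)               # the old name is only a link
    link_target = os.readlink(link)                         # relative target, not an absolute path
    old_path_resolves = (link / "hf_ptq.py").is_file()      # legacy paths still reach the files
```

Using `os.lstat` rather than `os.stat` matters here: `os.stat` follows the link and would report a directory, which is exactly the case the check is meant to rule out.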