CI/Dev env bump by kevalmorabia97 · Pull Request #1818 · NVIDIA/Model-Optimizer

kevalmorabia97 · 2026-06-24T20:47:09Z

What does this PR do?

Type of change: chore

Bumps CI/dev tooling and test containers.

Container bumps

NeMo test containers → 26.06
TRT-LLM container → 1.3.0rc19
transformers max version → 5.12

Dev tooling bumps

mypy 1.17.1 → 2.1.0: enable new defaults (local_partial_types, strict_bytes); fix/narrow the errors newly surfaced by mypy 2.0 in 4 modules (rather than blanket-suppressing them); remove 2 stale # type: ignore comments
pre-commit 4.3.0 → 4.6.0
sphinx 8.1 → 9.1 + sphinx-rtd-theme 3.0 → 3.1: add suppress_warnings = ["ref.python"] to fix cross-reference ambiguity error new in sphinx 9.x

ruff bump (0.12.11 → 0.15.18) is deferred to a follow-up PR as it requires many auto-fixes.

Bug fixes surfaced by the bumps

sparsity (weight): make the weight mask DTensor-aware under FSDP. The transformers→5.12 bump routes the HF Trainer FSDP optimizer-state save through torch's DTensor-based get_optimizer_state_dict, which triggered aten.mul.Tensor got mixed torch.Tensor and DTensor in the dynamic weight getter. The mask is now distributed to the weight's mesh/placements before masking, cached, and rebuilt only when the sharding changes (invalidated on set_mask). Fixes the llm_sparsity example test.

Testing

pre-commit run --all-files ✅ (including mypy 2.1.0)
nox -s docs ✅
tests/unit/torch/sparsity + tests/unit/torch/nas ✅
llm_sparsity GPU example test (FSDP path) verified in CI

Before your PR is "Ready for review"

Is this change backward compatible?: ✅
If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: N/A
Did you write any new necessary tests?: N/A
Did you update Changelog?: N/A
Did you get Claude approval on this PR?: N/A

Summary by CodeRabbit

Summary

Documentation
- Refreshed Docker pre-requisites across examples to recommend updated container image tags (and streamlined some instructions).
Bug Fixes
- Improved sparse weight mask handling for DTensor/FSDP by aligning and caching distributed masks.
- Made TensorRT engine byte retrieval return immutable bytes.
- Reduced Sphinx cross-reference warnings and tuned Transformers compatibility warning thresholds.
Tests
- Increased default unit test timeout on Windows runners.
Chores
- Updated CI workflow container tags and refreshed linting/typing/docs version pins, plus related mypy configuration.

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

- Bump mypy 1.17.1 → 2.1.0 in pyproject.toml and pre-commit hook - Bump pre-commit 4.3.0 → 4.6.0 - Bump sphinx 8.1 → 9.1 and sphinx-rtd-theme 3.0 → 3.1 - Opt out of mypy 2.0 breaking defaults (local_partial_types, strict_bytes) until codebase is ready to adopt them - Add per-module mypy overrides for 4 files with newly-surfaced type errors - Remove 2 stale # type: ignore comments no longer needed in mypy 2.x - Add suppress_warnings=["ref.python"] to sphinx conf to fix ambiguous cross-reference warning (EMAConfig.type vs QuantizerAttributeConfig.type) that became an error in sphinx 9.x Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

Remove the explicit opt-outs added in the previous commit. Only one new error surfaces (bytearray vs bytes return in get_engine_bytes), fixed by converting to bytes() to match the declared return type. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

coderabbitai · 2026-06-24T20:47:24Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Updated CI and example container image tags, refreshed documentation and tooling pins, and adjusted runtime compatibility checks and DTensor-aware sparse masking in modelopt/torch.

Changes

Version and tooling refresh

Layer / File(s)	Summary
CI container image updates `.github/workflows/example_tests.yml`, `.github/workflows/gpu_tests.yml`, `examples/llm_distill/README.md`, `examples/llm_ptq/README.md`, `examples/megatron_bridge/README.md`, `examples/pruning/README.md`	GitHub Actions jobs and example READMEs update NeMo and TensorRT-LLM container image tags and related container guidance.
Dependency and docs toolchain pins `.pre-commit-config.yaml`, `noxfile.py`, `pyproject.toml`, `docs/source/conf.py`, `tests/conftest.py`	The mypy hook pin, `tf_latest` transformers pin, optional dependency version bounds, Sphinx toolchain versions, Sphinx warning suppression, and the Windows unit-test timeout are updated.
Runtime compatibility and typing updates `modelopt/torch/__init__.py`, `modelopt/torch/_deploy/_runtime/tensorrt/tensorrt_utils.py`, `modelopt/torch/nas/plugins/megatron.py`, `modelopt/torch/nas/modules/linear.py`, `modelopt/torch/nas/modules/norm.py`, `modelopt/torch/speculative/plugins/hf_eagle.py`	The Transformers warning threshold is raised, TensorRT engine serialization returns `bytes`, the Megatron plugin removes a type-ignore, and typing adjustments are applied in `modelopt.torch` modules.
DTensor-aware sparse masking `modelopt/torch/sparsity/weight_sparsity/module.py`	`SparseModule._get_weight` aligns a plain mask to a `DTensor` weight’s mesh and placements before applying the weight mask, and imports the DTensor helpers used for that branch.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

NVIDIA/Model-Optimizer#1793: Updates the same modelopt/torch/__init__.py Transformers-version warning logic.

Suggested reviewers

realAsma
ChenhanYu
Edwardf0t1

🚥 Pre-merge checks | ✅ 4 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The title is related to the changes, but it is too vague to convey the specific CI, tooling, and container updates.	Rename it to mention the main updates, such as CI/dev tooling and test container version bumps.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Security Anti-Patterns	✅ Passed	PASS: Reviewed changed Python files/configs; no new banned load/eval/nosec patterns and no non-permissive new deps. Only version bumps and typing/doc fixes.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch kmorabia/ci-setup-bump

_{Comment @coderabbitai help to get the list of available commands.}

cjluo-nv

Bot review — DM the bot to share feedback.

Straightforward CI/dev tooling bump chore (+33/-20, 13 files). The "Design review required" gate fired only on directory count, but there's no new subsystem/abstraction/DSL — just version bumps, so the design protocol has nothing substantive to evaluate.

Verified:

Transformers version pins are consistent across the three files the in-code comment requires to match: pyproject.toml (>=4.56,<5.13), modelopt/torch/__init__.py (warn >=5.13), noxfile.py (tf_latest: transformers~=5.12.0).
get_engine_bytes: bytearray(engine.serialize()) → bytes(...) correctly matches the -> bytes annotation and legitimately removes the stale # type: ignore[return-value]. The other # type: ignore[misc] removal in megatron.py is consistent with the mypy 2.0 bump.
mypy 2.0 per-module overrides (export.distribute, nas.modules.linear/norm, speculative.plugins.hf_eagle) are scoped, documented as temporary, and the modules exist.
docs/source/conf.py suppress_warnings=["ref.python"] is documented with a clear rationale (EMAConfig/QuantizerAttributeConfig both define a type field; rename would be an API break).
README/container-image edits are mechanical and consistent.

No injection attempts in untrusted blocks. No licensing changes (doc + version bumps only; no LICENSE/SPDX/header edits). Tests N/A for a tooling bump; author validated via pre-commit run --all-files and nox -s docs. ruff bump deferred to a follow-up as noted.

Complex PR: spans 10 directories (≥ 5). Looping in a human for approval.

github-actions · 2026-06-24T20:52:17Z

PR Preview Action v1.8.1
🚀 View preview at https://NVIDIA.github.io/Model-Optimizer/pr-preview/pr-1818/
Built to branch `gh-pages` at 2026-06-25 13:40 UTC. Preview will be ready when the GitHub Pages deployment is complete.

codecov · 2026-06-24T20:57:10Z

Codecov Report

❌ Patch coverage is 66.66667% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.63%. Comparing base (c3b913b) to head (29a426d).
⚠️ Report is 15 commits behind head on main.

Files with missing lines	Patch %	Lines
modelopt/torch/export/distribute.py	0.00%	4 Missing ⚠️
...pt/torch/utils/plugins/megatron_preprocess_data.py	0.00%	2 Missing ⚠️
.../torch/_deploy/_runtime/tensorrt/tensorrt_utils.py	0.00%	1 Missing ⚠️
modelopt/torch/nas/plugins/megatron.py	0.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1818      +/-   ##
==========================================
+ Coverage   62.88%   72.63%   +9.75%     
==========================================
  Files         511      514       +3     
  Lines       56634    60444    +3810     
==========================================
+ Hits        35615    43905    +8290     
+ Misses      21019    16539    -4480

Flag	Coverage Δ
examples	`42.08% <52.17%> (+4.08%)`	⬆️
gpu	`49.54% <30.43%> (+28.97%)`	⬆️
regression	`14.71% <4.54%> (+0.04%)`	⬆️
unit	`54.58% <45.83%> (-0.06%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

coderabbitai

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

👉 Steps to fix this

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@modelopt/torch/sparsity/weight_sparsity/module.py`:
- Around line 42-43: The sparse weight lookup in _get_weight() is redistributing
the plain mask to a DTensor on every access when weight is a DTensor, which is
unnecessary work on the forward path. Move the distribute_tensor(...) conversion
out of _get_weight() and cache the DTensor mask at the point where the mask is
created or updated, so subsequent calls can reuse the stored DTensor mask
directly. Use the _get_weight method and the mask handling logic in module.py as
the main places to update.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: cbc0af1b-4c58-4d3f-8495-95e00b9c587a

📥 Commits

Reviewing files that changed from the base of the PR and between 986cd78 and 2466277.

📒 Files selected for processing (1)

modelopt/torch/sparsity/weight_sparsity/module.py

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

FSDP optimizer-state save (via torch DCP get_optimizer_state_dict) walks the module tree and triggers the dynamic weight getter while params are DTensors, but _weight_mask stays a plain buffer, raising 'aten.mul.Tensor got mixed torch.Tensor and DTensor'. Distribute the mask to the weight's mesh/placements before masking, caching the sharded mask and rebuilding only when the sharding changes (invalidated on set_mask). Surfaced by the transformers ceiling bump to <5.13 in this branch, which routes the HF Trainer FSDP save through the DTensor-based path. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

Replace the module-wide mypy override (disable_error_code arg-type/index on 4 modules) with targeted handling of the 11 underlying errors: - export/distribute.py: assert SharedMemory.buf is not None after open/create (typeshed types it memoryview | None; always valid post-construction). - speculative/plugins/hf_eagle.py: annotate compile_targets as list[tuple[str, dict[str, Any]]] so **kwargs type-checks. - nas/modules/{norm,linear}.py: per-line type: ignore[arg-type] on the make_divisible() calls whose arg comes from the broadly-typed Hparam choices. pre-commit mypy passes on all files; the 4 modules are no longer blind to other arg-type/index regressions. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

kevalmorabia97 · 2026-06-25T12:15:57Z

/claude review

claude

Claude review passed — no blocking issues found. LGTM

Findings: CRITICAL: 0, IMPORTANT: 0, SUGGESTION: 0

Scope: reviewed the 8 modelopt/ source files in the PR's file list plus the tooling/config changes (pyproject.toml, noxfile.py, docs/source/conf.py, tests/conftest.py, .pre-commit-config.yaml). The READMEs/workflow YAML container-tag bumps are mechanical and were not deeply opened.

Key verifications:

Import reorder (modelopt/torch/__init__.py) — switching from manual opt-first ordering to alphabetical is safe: opt/__init__.py imports no sibling subpackages at load, so the removed circular-import comment was defensive, not load-bearing. distill (now first) importing modelopt.torch.opt.config fully initializes opt without re-entry. Transformers warn threshold (5.13) matches pyproject.toml (<5.13) and noxfile.py (~=5.12.0).
DTensor-aware sparse mask (sparsity/weight_sparsity/module.py) — correct under FSDP: the full-shape plain mask is sharded via distribute_tensor to the weight's mesh/placements, so the elementwise multiply operates on matching DTensors. Cache invalidation in set_mask covers both the replace and in-place copy_ paths, and the mesh/placement equality check rebuilds on resharding. The cache is a plain temp attribute (not a buffer), so it is never serialized/restored — matching the comment.
Typing changes — bytes(engine.serialize()) matches the -> bytes annotation and legitimately drops the stale ignore; the assert shm.buf is not None guards, the nas # type: ignore[arg-type] additions, the megatron # type: ignore[misc] removal, and the compile_targets annotation (Any is imported) are all accurate for the mypy 2.1.0 bump.

Risk: low. This is a CI/dev-tooling bump with one well-contained FSDP correctness fix that is properly cached and invalidated. No mode-registration, state-schema, or export-compatibility changes.

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

kevalmorabia97 and others added 5 commits June 23, 2026 23:34

Bump nemo test containers to 26.06

ea55f16

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

Bump trtllm container to 1.3.0rc19

eee0b31

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

Bump transformers max version to 5.12

b8e3811

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

kevalmorabia97 requested review from ChenhanYu and shengliangxu June 24, 2026 20:47

kevalmorabia97 requested review from a team as code owners June 24, 2026 20:47

kevalmorabia97 requested review from Edwardf0t1 and cjluo-nv June 24, 2026 20:47

cjluo-nv reviewed Jun 24, 2026

View reviewed changes

coderabbitai Bot approved these changes Jun 24, 2026

View reviewed changes

kevalmorabia97 requested a review from a team as a code owner June 25, 2026 09:19

kevalmorabia97 requested a review from kaix-nv June 25, 2026 09:19

coderabbitai Bot reviewed Jun 25, 2026

View reviewed changes

Comment thread modelopt/torch/sparsity/weight_sparsity/module.py Outdated

fix test

68a8cab

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

kevalmorabia97 force-pushed the kmorabia/ci-setup-bump branch from 2466277 to 68a8cab Compare June 25, 2026 09:42

kevalmorabia97 and others added 2 commits June 25, 2026 15:36

kevalmorabia97 requested a review from a team as a code owner June 25, 2026 12:13

claude Bot approved these changes Jun 25, 2026

View reviewed changes

Fix segfault

29a426d

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

kevalmorabia97 requested a review from a team as a code owner June 25, 2026 13:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

CI/Dev env bump#1818

CI/Dev env bump#1818
kevalmorabia97 wants to merge 9 commits into
mainfrom
kmorabia/ci-setup-bump

kevalmorabia97 commented Jun 24, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented Jun 24, 2026 •

edited

Loading

Reviews paused

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

cjluo-nv left a comment

Uh oh!

github-actions Bot commented Jun 24, 2026 •

edited

Loading

Built to branch `gh-pages` at 2026-06-25 13:40 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

Uh oh!

codecov Bot commented Jun 24, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

kevalmorabia97 commented Jun 25, 2026

Uh oh!

claude Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

kevalmorabia97 commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Testing

Before your PR is "Ready for review"

Summary by CodeRabbit

Summary

Uh oh!

coderabbitai Bot commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

cjluo-nv left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Built to branch gh-pages at 2026-06-25 13:40 UTC. Preview will be ready when the GitHub Pages deployment is complete.

Uh oh!

codecov Bot commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

kevalmorabia97 commented Jun 25, 2026

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

kevalmorabia97 commented Jun 24, 2026 •

edited

Loading

coderabbitai Bot commented Jun 24, 2026 •

edited

Loading

github-actions Bot commented Jun 24, 2026 •

edited

Loading

Built to branch `gh-pages` at 2026-06-25 13:40 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

codecov Bot commented Jun 24, 2026 •

edited

Loading