-
Notifications
You must be signed in to change notification settings - Fork 837
Title: Add MobileNetV2 pytest for Cortex-M backend #17300
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17300
Note: Links to docs will display an error until the docs builds have been completed. ❌ 10 New Failures, 1 Cancelled Job, 1 Unrelated FailureAs of commit 882c4c1 with merge base ba2516c ( NEW FAILURES - The following jobs have failed:
CANCELLED JOB - The following job was cancelled. Please retry:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
This PR needs a
|
|
Note : test_implementation_mv2 is marked XFAIL because MV2's graph contains a dim_order_ops._clone_dim_order.default Should we add ARM's RemoveNoopPass to CortexMPassManager to remove identity clone ops. |
|
Yes if it is a noop I think it makes sense to remove, just make sure that it does not do any dim_order/ type conversion first. Also see if you can lower the numerical tolerance compared to mv3 and still get it to pass, would be interesting to know. |
| xfails={ | ||
| "mobilenet_v2": "MLETORCH-XXX - Investigate mobilenet_v2 flakiness" | ||
| }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be on the test_implementation_mv2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes you're right, Removed xfail from test_dialect_mv2 (stable now), kept it on test_implementation_mv2 with
strict=False for potential flakiness in the serialization/runtime path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Adds a new MobileNetV2 model test for the Cortex-M backend and updates the Cortex-M pass pipeline to remove certain no-op operations during lowering, aiming to unblock the MV2 runtime path.
Changes:
- Add a new
test_mobilenet_v2.pypytest covering both dialect and implementation paths (with calibration + top-1 argmax sanity check). - Add
RemoveNoopPasstoCortexMPassManagerpass list to eliminate identity_clone_dim_order/related no-op nodes.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| backends/cortex_m/test/models/test_mobilenet_v2.py | Introduces MobileNetV2 Cortex-M tests (dialect + implementation) with op-count assertions and calibration. |
| backends/cortex_m/passes/cortex_m_pass_manager.py | Inserts RemoveNoopPass into the Cortex-M lowering pass sequence to remove no-op ops earlier in the pipeline. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| "test_case", | ||
| test_cases, | ||
| xfails={ | ||
| "mobilenet_v2": "MLETORCH-XXX - Investigate mobilenet_v2 flakiness" |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The xfail reason uses a placeholder ticket id ("MLETORCH-XXX"), which makes it hard to track and resolve the expected failure. Please replace it with the real issue/task identifier (or a concrete explanation if no tracker exists).
| "mobilenet_v2": "MLETORCH-XXX - Investigate mobilenet_v2 flakiness" | |
| "mobilenet_v2": "Flaky on Cortex-M due to nondeterministic quantized kernels; keep xfailed until kernels are made deterministic or flakiness is resolved (no external ticket)." |
| pass_list: list[ExportPass] = [ | ||
| # Run before folding so qparams attach to max_pool2d values, not tuple + getitem. | ||
| RemoveGetItemPass, | ||
| FoldAndAnnotateQParamsPass, | ||
| RemoveNoopPass, | ||
| ReplaceScalarWithTensorArgPass, | ||
| ReplaceQuantNodesPass, | ||
| ActivationFusionPass, |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding RemoveNoopPass here may be unsafe for Cortex-M graphs: the current implementation in backends/arm/_passes/remove_noop_pass.py removes dim_order_ops._clone_dim_order / _to_dim_order_copy based only on dtype, without checking whether dim_order (layout) actually changes. That can drop real dim-order conversions. It will also likely change existing Cortex-M op-count expectations (e.g., tests that currently expect _clone_dim_order to remain after transforms). Consider either (a) extending/wrapping the pass to verify dim_order equivalence via node meta (similar to backends/transforms/remove_clone_ops.py), or (b) limiting removal to proven-identity cases, and update affected op-count tests accordingly.
Summary: Added RemoveNoopPass (from ARM backend) to CortexMPassManager — it checks that _clone_dim_order has matching input/output dtypes before removing it, so it won't strip any actual type or dim_order conversions. This eliminates the identity clone from MV2's flatten/view and unblocks the runtime. Lowered qtol from 20 to 10 — passes reliably. Didn't go lower since calibration data is random (torch.randn) so the numerical diff varies across runs. The argmax assertion provides an additional correctness check independent of tolerance. Removed xfail from test_dialect_mv2 (stable now), kept it on test_implementation_mv2 with strict=False for potential flakiness in the serialization/runtime path.
… (introduced in #17300) which broke 8 quantization tests. After #17300 merged, the following tests started failing: - `test_shared_qspec_quantizer[input_fork_x_shared]` - `test_shared_qspec_quantizer[input_fork_y_shared]` - `test_shared_qspec_quantizer[surrounded_quantized_op]` - `test_shared_qspec_quantizer[output_fork_shared]` - `test_shared_qspec_quantizer[many_forks]` - `test_dialect_mv2[mobilenet_v2]` (+ 2 more) `RemoveNoopPass` removes `clone_dim_order` operators that are functionally necessary for shared quantization specs. It only checks dtype equality, not dim_order/layout changes, causing it to incorrectly remove clones needed for: - Tensor forking in quantized graphs - Shared quantization parameter propagation - Correct quantization fusion Remove `RemoveNoopPass` from the Cortex-M pass pipeline. The MobileNetV2 `clone_dim_order` issue it was meant to address needs a more targeted solution. ```bash pytest -c backends/arm/test/pytest.ini backends/cortex_m/test/misc/test_quantization.py::test_shared_qspec_quantizer -v pytest -c backends/arm/test/pytest.ini backends/cortex_m/test/models/test_mobilenet_v2.py::test_dialect_mv2 -v
#17407) … (introduced in #17300) which broke 8 quantization tests. After #17300 merged, the following tests started failing: - `test_shared_qspec_quantizer[input_fork_x_shared]` - `test_shared_qspec_quantizer[input_fork_y_shared]` - `test_shared_qspec_quantizer[surrounded_quantized_op]` - `test_shared_qspec_quantizer[output_fork_shared]` - `test_shared_qspec_quantizer[many_forks]` - `test_dialect_mv2[mobilenet_v2]` (+ 2 more) `RemoveNoopPass` removes `clone_dim_order` operators that are functionally necessary for shared quantization specs. It only checks dtype equality, not dim_order/layout changes, causing it to incorrectly remove clones needed for: - Tensor forking in quantized graphs - Shared quantization parameter propagation - Correct quantization fusion Remove `RemoveNoopPass` from the Cortex-M pass pipeline. The MobileNetV2 `clone_dim_order` issue it was meant to address needs a more targeted solution. ```bash pytest -c backends/arm/test/pytest.ini backends/cortex_m/test/misc/test_quantization.py::test_shared_qspec_quantizer -v pytest -c backends/arm/test/pytest.ini backends/cortex_m/test/models/test_mobilenet_v2.py::test_dialect_mv2 -v Co-authored-by: Github Executorch <[email protected]>
Summary:
Added RemoveNoopPass (from ARM backend) to CortexMPassManager — it checks that
_clone_dim_order has matching input/output dtypes before removing it, so it won't strip any
actual type or dim_order conversions. This eliminates the identity clone from MV2's
flatten/view and unblocks the runtime.
Lowered qtol from 20 to 10 — passes reliably. Didn't go lower since calibration data is
random (torch.randn) so the numerical diff varies across runs. The argmax assertion provides
an additional correctness check independent of tolerance.
Removed xfail from test_dialect_mv2 (stable now), kept it on test_implementation_mv2 with
strict=False for potential flakiness in the serialization/runtime path.
Test plan
pytest -c backends/arm/test/pytest.ini backends/cortex_m/test/models/test_mobilenet_v2.py -v