feature: support native YOLO .pt models while ensuring compatibility with Torchvision models#495

Open
kashtennyson wants to merge 2 commits into JdeRobot:master from kashtennyson:issue-449

Conversation

@kashtennyson

Description

This PR adds support for loading native Ultralytics YOLOv8 .pt models while ensuring a consistent interface for the rest of the library. This is a fix for #449.

The Problem:
Native YOLO .pt models often return a tuple (inference_tensor, loss_tensor) rather than a raw tensor, which causes "too many values to unpack" errors in the inference and eval methods. Additionally, these models frequently use float16 (half) precision, leading to dtype mismatches with input images or NMS kernel errors on certain backends.

The Solution:
Following previous feedback, I have centralized the fix within the TorchImageDetectionModel class. I implemented a local Adapter class (DetectionModelWrapper) that standardizes the model's behavior at the source:

  • Tuple Unpacking: Automatically extracts the primary detection tensor.
  • Input Alignment: Automatically casts input images to match the model's native dtype (fixing "Float vs Half" errors).
  • Output Alignment: Ensures results are returned as float32 to maintain compatibility with torchvision.ops.nms.
  • Graceful Fallback: Wrapped the .pt loading logic to provide a clear error message suggesting the installation of ultralytics if it is missing.
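
For reference, the adapter behavior described above could look roughly like the sketch below. This is an illustrative reconstruction, not the PR's actual code: the class name DetectionModelWrapper comes from the description, but the method bodies and the stand-in _TupleModel are assumptions.

```python
import torch
import torch.nn as nn


class DetectionModelWrapper(nn.Module):
    """Adapter that normalizes a native YOLO .pt model's interface.

    Sketch of the behavior described in the PR; the actual
    implementation in the PR may differ.
    """

    def __init__(self, model: nn.Module):
        super().__init__()
        self.model = model
        # Infer the model's native dtype (e.g. float16 for many YOLO weights).
        self.native_dtype = next(model.parameters()).dtype

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Input alignment: cast images to the model's native dtype
        # (avoids "Float vs Half" errors).
        out = self.model(x.to(self.native_dtype))
        # Tuple unpacking: native YOLO models may return (inference, loss);
        # keep only the primary detection tensor.
        if isinstance(out, (tuple, list)):
            out = out[0]
        # Output alignment: torchvision.ops.nms expects float32 boxes/scores.
        return out.float()


# Minimal check with a stand-in model that mimics YOLO's tuple output
# (hypothetical, for illustration only).
class _TupleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin = nn.Linear(3, 3)

    def forward(self, x):
        return (self.lin(x), torch.zeros(1))


wrapped = DetectionModelWrapper(_TupleModel())
out = wrapped(torch.randn(2, 3))
assert out.dtype == torch.float32
```

Because the wrapper is itself an nn.Module, downstream code can call it exactly like the Torchvision models it sits alongside, which is what keeps the library interface consistent.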

This PR supersedes #469. It implements a more stable version by ensuring compatibility with Torchvision models alongside Ultralytics YOLO models.


Architectural Question for Maintainers

I have implemented the DetectionModelWrapper as a local class within the __init__ method of TorchImageDetectionModel to keep the fix strictly within the requested section and ensure that the normalization is context-specific to the model instance.

Do you prefer this local encapsulation, or would you like me to refactor the wrapper into a private, module-level class (e.g., _ModelNormalizationWrapper) at the top of the file to keep the __init__ method more concise?

@dpascualhe dpascualhe self-requested a review March 25, 2026 19:08
@dpascualhe dpascualhe self-assigned this Mar 25, 2026
@dpascualhe
Collaborator

Hi, thanks for your contribution! I'll review the PR thoroughly when I can since this is an important upgrade.

@kashtennyson
Author

Alright @dpascualhe. Thanks for the update!

I am also currently working on a broader refactor to provide global .pt support across all tasks (Detection, Segmentation, and LiDAR) by centralizing the loading and normalization logic into a shared BaseTorchModel utility. Your guidance and feedback are crucial for these architectural decisions. Looking forward to your thoughts!

