Fix Mxfp4 dequantize #43326

Cyrilvallez · 2026-01-16T15:02:50Z

What does this PR do?

As per the title. Fixes #43317.
The dequantization code moves tensors to other devices without being asked to, resulting in OOMs when used with device_map="auto". Also, there is no need to upcast the dtype to int64: these tensors are only used for indexing, and the wider dtype takes up more memory in constrained device_map="auto" settings.
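
To illustrate the two points above, here is a minimal sketch, not the code from this PR. The helper name `unpack_fp4` is hypothetical, the choice of `torch.int32` for the indices is an assumption (the description only says that int64 is unnecessary), and the per-block scales that real MXFP4 dequantization applies are left out. It assumes a recent PyTorch in which int32 tensors are accepted as indices.

```python
import torch

# The 16 values representable by a 4-bit E2M1 float code.
FP4_VALUES = [
    +0.0, +0.5, +1.0, +1.5, +2.0, +3.0, +4.0, +6.0,
    -0.0, -0.5, -1.0, -1.5, -2.0, -3.0, -4.0, -6.0,
]


def unpack_fp4(packed: torch.Tensor, dtype: torch.dtype = torch.bfloat16) -> torch.Tensor:
    """Hypothetical helper: unpack two 4-bit codes per uint8 byte via a lookup table.

    Block scales are deliberately omitted to keep the sketch short.
    """
    # Build the table on the input's own device, so nothing is moved implicitly.
    lut = torch.tensor(FP4_VALUES, dtype=dtype, device=packed.device)

    # The codes are only used as indices into a 16-entry table.
    # int32 (an assumption here) is enough; int64 would double these temporaries.
    idx_lo = (packed & 0x0F).to(torch.int32)
    idx_hi = (packed >> 4).to(torch.int32)

    # Allocate the output on the same device as the input as well.
    out = torch.empty(
        *packed.shape[:-1], packed.shape[-1] * 2, dtype=dtype, device=packed.device
    )
    out[..., 0::2] = lut[idx_lo]  # low nibble goes to even positions
    out[..., 1::2] = lut[idx_hi]  # high nibble goes to odd positions
    return out
```

With this shape of code, the result lives wherever the packed weights already were, and any move to another device is left to the caller (or to the device_map logic) rather than done inside the conversion.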

Cyrilvallez · 2026-01-16T15:04:43Z

src/transformers/integrations/mxfp4.py

-    if target_device == "cpu" and torch.cuda.is_available():
-        torch.cuda.empty_cache()


This is extremely bad! It is very slow and works against the memory allocator during loading: the call itself takes a lot of time to run, and we then need more time later to re-allocate that memory!

HuggingFaceDocBuilderDev · 2026-01-16T15:16:37Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

SunMarc

Thanks for fixing! Can you have a quick look, @MekkCyber?

MekkCyber

Yes we can do that, lgtm

github-actions · 2026-01-16T16:40:20Z

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=43326&sha=da664a

remove wrong device and dtype conversions

bed1bca

fix

72d31f4

fix

d25e7ed

SunMarc approved these changes Jan 16, 2026

SunMarc requested a review from MekkCyber January 16, 2026 15:23

Cyrilvallez added 2 commits January 16, 2026 16:26

remove more empty_cache

0b04391

style

655ae24

MekkCyber approved these changes Jan 16, 2026

Cyrilvallez added 2 commits January 16, 2026 17:07

sequentially

e9287c2

better

da664ab

5 participants
