[Feature] Support nvfp4 RL #1505

Open
fy1214 wants to merge 15 commits into THUDM:main from fy1214:nvfp4
fy1214 commented Jan 27, 2026

Contributor

WIP: add nvfp4 support to the Slime RL process.

zianglih commented Feb 9, 2026

If nvfp4 is used for all three training GEMMs (fprop, wgrad, and dgrad), it's better to stick to the original TE nvfp4 recipe (https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/examples/fp8_primer.html#NVFP4-training-recipe): 2D weight quantization (to better preserve the chain rule), Random Hadamard transforms, and stochastic rounding.
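To illustrate why stochastic rounding matters once gradients go through nvfp4, here is a minimal pure-Python sketch. It is not TE's kernel: the helper names `round_nearest` and `round_stochastic` are made up for illustration, and it ignores block scaling entirely. It only shows that round-to-nearest on the FP4 E2M1 grid has a fixed bias for a given input, while stochastic rounding is unbiased on average.

```python
import random

# Non-negative magnitudes representable in FP4 E2M1 (sign handled separately).
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def round_nearest(x: float) -> float:
    """Round |x| to the closest E2M1 magnitude (saturating at 6), keeping the sign."""
    mag = min(abs(x), E2M1_GRID[-1])
    best = min(E2M1_GRID, key=lambda g: abs(g - mag))
    return best if x >= 0 else -best

def round_stochastic(x: float, rng: random.Random) -> float:
    """Round |x| to one of its two neighbouring grid points.

    The probability of rounding up is proportional to how close |x| is to the
    upper neighbour, so the expected result equals x (for |x| <= 6).
    """
    mag = min(abs(x), E2M1_GRID[-1])
    lo = max(g for g in E2M1_GRID if g <= mag)
    hi = min(g for g in E2M1_GRID if g >= mag)
    if hi == lo:
        out = lo  # already on the grid
    else:
        p_up = (mag - lo) / (hi - lo)
        out = hi if rng.random() < p_up else lo
    return out if x >= 0 else -out

rng = random.Random(0)
x = 2.4
samples = [round_stochastic(x, rng) for _ in range(20000)]
sr_mean = sum(samples) / len(samples)

print("nearest:", round_nearest(x))   # always 2.0, i.e. a fixed error of -0.4
print("stochastic mean:", sr_mean)    # approaches 2.4 as the sample count grows
```

Each individual stochastic sample is still only 2.0 or 3.0; the benefit shows up when many rounded gradient values are accumulated.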

Additionally, I have a TE PR (NVIDIA/TransformerEngine#2644) that allows using nvfp4 only for fprop while keeping wgrad and dgrad in bf16. In that case, since nvfp4 is no longer used for bwd, 2D weight quantization isn't needed to preserve the chain rule, and 1D weight quantization gives lower quantization error for free. Random Hadamard transforms and stochastic rounding can also be disabled, since bwd is not in nvfp4.
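The 1D vs 2D trade-off can be sketched in a few lines of pure Python. This is a toy model under stated assumptions, not TE's implementation: scales are kept in full precision (real nvfp4 stores block scales in FP8 plus a per-tensor scale), the helper names are hypothetical, and the 16x16 tile is deliberately extreme (one large row, fifteen small ones) to exaggerate the effect.

```python
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_block(values, scale):
    """Fake-quantize: divide by scale, round to nearest E2M1 value, rescale."""
    out = []
    for v in values:
        mag = min(abs(v) / scale, E2M1_GRID[-1]) if scale > 0 else 0.0
        q = min(E2M1_GRID, key=lambda g: abs(g - mag))
        out.append((q if v >= 0 else -q) * scale)
    return out

def amax_scale(values):
    """Scale that maps the block's largest magnitude onto 6.0 (the E2M1 max)."""
    return max(abs(v) for v in values) / E2M1_GRID[-1]

def quantize_1d(tile):
    """One scale per row of the tile (1x16 blocks for a 16-wide tile)."""
    return [quantize_block(row, amax_scale(row)) for row in tile]

def quantize_2d(tile):
    """One scale shared by the whole tile (a single 16x16 block)."""
    scale = amax_scale([v for row in tile for v in row])
    return [quantize_block(row, scale) for row in tile]

def transpose(tile):
    return [list(col) for col in zip(*tile)]

def mse(a, b):
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)

# Toy 16x16 weight tile: one row of large values, fifteen rows of small ones.
tile = [[6.0] * 16] + [[0.1] * 16 for _ in range(15)]

mse_1d = mse(tile, quantize_1d(tile))
mse_2d = mse(tile, quantize_2d(tile))

# Does quantizing W^T give the transpose of quantized W?
same_2d = quantize_2d(transpose(tile)) == transpose(quantize_2d(tile))
same_1d = quantize_1d(transpose(tile)) == transpose(quantize_1d(tile))

print("mse 1d:", mse_1d, "mse 2d:", mse_2d)
print("transpose-consistent 2d:", same_2d, "1d:", same_1d)
```

On this tile the per-row scales recover the small rows almost exactly, while the shared 2D scale flushes them to zero. In exchange, only the 2D variant quantizes W and its transpose to the same values, which is the property that matters when the same weight is used in nvfp4 for both fprop and dgrad.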

fy1214 commented Feb 13, 2026

Contributor Author

> If using nvfp4 for training fprop + wgrad + dgrad, it's better to stick to the original TE nvfp4 recipe (https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/examples/fp8_primer.html#NVFP4-training-recipe), 2d weight (for better preserving chain rule) + Random Hadamard transforms + stochastic rounding.
>
> Additionally, I have a TE PR (NVIDIA/TransformerEngine#2644) which allows using only nvfp4 for fprop while keeping wgrad and dgrad in bf16. In this case since we no longer use nvfp4 for bwd, we don't need 2d weight to better preserve chain rule and 1d weight has lower quantization error for free. Since the bwd is not in nvfp4, Random Hadamard transforms and stochastic rounding can also be disabled.

I don't know whether it's better to avoid nvfp4 in bwd or not.
