
align kaiming uniform and gaussian with torch #79309

Open
greenhandF wants to merge 1 commit into PaddlePaddle:release/3.4 from greenhandF:cp_release



@greenhandF

Contributor

PR Category

Operator Mechanism

PR Types

Op align

Description

align kaiming uniform and gaussian with torch

Does this cause a precision change? (是否引起精度变化)

@PaddlePaddle-bot PaddlePaddle-bot left a comment


🤖 Paddle-CI-Agent | pr_review | 2026-06-12 12:36:06

📋 Review Summary

PR overview: changes gaussian mean/std to double and adjusts the Kaiming uniform computation to align with torch.
Scope of changes: Phi gaussian/uniform kernels, op YAML/op version, the old IR to PIR transcriber, the PIR serialization patch, and the Kaiming initializer.
Impact tags: [Operator Mechanism] [User Experience] [Execute Infrastructure]
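
For reference, here is a minimal sketch of the bound that torch documents for `torch.nn.init.kaiming_uniform_` with a leaky-ReLU gain. The PR description does not say which part of the Paddle computation was adjusted, so this only illustrates the target formula and is not the PR's implementation. The function name `kaiming_uniform_bound` and its parameters are hypothetical.

```python
import math


def kaiming_uniform_bound(fan: int, negative_slope: float = 0.0) -> float:
    """Bound b such that weights are drawn from U(-b, b).

    Follows the formula documented for torch.nn.init.kaiming_uniform_:
    gain = sqrt(2 / (1 + negative_slope^2)), std = gain / sqrt(fan),
    bound = sqrt(3) * std. The function name is hypothetical.
    """
    gain = math.sqrt(2.0 / (1.0 + negative_slope ** 2))
    std = gain / math.sqrt(fan)
    return math.sqrt(3.0) * std


# Equivalent closed form: sqrt(6 / ((1 + negative_slope^2) * fan)).
bound = kaiming_uniform_bound(fan=6)
```

Evaluating every step in double precision and narrowing only when the tensor is filled is one way to keep the bound consistent across output dtypes.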

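Issues

| Level | File | Summary |
|---|---|---|
| 🔴 Bug | paddle/phi/kernels/cpu/gaussian_kernel.cc:41 | CPU double/complex128 gaussian still truncates mean/std to float |
| 🔴 Bug | paddle/phi/ops/yaml/op_version.yaml:313 | The gaussian_random op_version upgrade is not synced to legacy/static_ops.yaml, so the static-graph entry point still uses float attrs |
| 🔴 Bug | paddle/phi/kernels/gpu/uniform_kernel.cu:168 | GPU uniform only fixes seed == 0; the explicit-seed branch still truncates float64 bounds |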

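📝 PR Convention Check

Non-compliant: the title lacks an official tag; PR Types is filled in as Op align, which is not one of the template's enumerated values; the target branch is release/3.4, but the Description lacks the devPR:https://github.com/PaddlePaddle/Paddle/pull/xxxx link required for release cherry-picks; and the precision-change field (是否引起精度变化) is filled in as "是" (yes) without stating the source of the precision change, its scope of impact, or how it was verified.

Suggested title (can be copied directly):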

  • [Operator Mechanism] Align Kaiming uniform and Gaussian with torch
Suggested PR description (can be copied directly)
### PR Category
<!-- One of [ User Experience | Execute Infrastructure | Operator Mechanism | CINN | Custom Device | Performance Optimization | Distributed Strategy | Parameter Server | Communication Library | Auto Parallel | Inference | Environment Adaptation ] -->
Operator Mechanism

### PR Types
<!-- One of [ New features | Bug fixes | Improvements | Performance | BC Breaking | Deprecations | Docs | Devs | Not User Facing | Security | Others ] -->
Bug fixes

### Description
<!-- Describe what you’ve done -->
Align Kaiming uniform and gaussian_random/gaussian mean/std handling with torch.

Changes:
- Change gaussian mean/std attributes to double in the PHI/PIR path.
- Add PIR serialize/deserialize compatibility patches for pd_op.gaussian mean/std.
- Update CPU/GPU/XPU/OneDNN gaussian kernel signatures.
- Add old IR to PIR attribute translation for gaussian_random.
- Adjust KaimingUniform limit computation.

Precision change: Yes. The intended change is to avoid float32 truncation of gaussian mean/std for float64 outputs and to align Kaiming uniform bound computation with torch.

Impact range: gaussian_random/gaussian, KaimingUniform/KaimingNormal initializer paths, PIR serialization/deserialization, and old IR to PIR translation.

Verification: N/A (not provided in the current PR description).

Release cherry-pick source: N/A (the required devPR:https://github.com/PaddlePaddle/Paddle/pull/xxxx link is not present in the current PR description).

### 是否引起精度变化
<!-- one of the following [ 是 | 否 ]-->

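Overall Assessment

This round reviewed, in order of risk, the gaussian double attribute migration, the legacy/static generation chain, CPU/GPU/XPU/OneDNN kernel behavior, the PIR compatibility patch, and the Kaiming initializer. The current implementation still has several places where the precision path is not closed end to end; fixing them before merging is recommended. CI logs were not pulled for this review.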

  }
- NormalDistribution<T>(data, size, mean, std, engine);
+ NormalDistribution<T>(
+     data, size, static_cast<float>(mean), static_cast<float>(std), engine);


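🔴 Bug mean/std are still forcibly narrowed to float in the CPU kernel

This PR changes the gaussian YAML, InferMeta, and kernel signatures to double, and op_version.yaml also states that the goal is to avoid float32 truncation; but here static_cast<float> is applied again before calling NormalDistribution<T>. The CPU kernel is registered for double and complex128, so paddle.tensor.random.gaussian(..., dtype="float64", mean=...) on CPU still generates the distribution from float32 parameters, and the PR's goal does not take effect.

Suggested fix: have the CPU helper accept parameters in the output's real type, using double for double/complex128 and float for float/float16/bfloat16/complex64; remove the float cast for double outputs here, and add a CPU float64 regression test.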
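
To make the truncation concrete, the sketch below uses only the Python standard library to emulate a `static_cast<float>` on a double-precision mean. It does not call Paddle; the helper name `round_trip_float32` and the chosen value are assumptions made for illustration.

```python
import struct


def round_trip_float32(x: float) -> float:
    """Narrow x to IEEE-754 float32 and widen it back to a Python float."""
    return struct.unpack("f", struct.pack("f", x))[0]


# 1 + 2**-30 is exactly representable in float64, but float32 has only
# 23 fraction bits, so the nearest float32 value is 1.0.
mean = 1.0 + 2.0 ** -30
narrowed_mean = round_trip_float32(mean)

# The offset that a float64 output would silently lose.
lost = mean - narrowed_mean
```

A regression test along these lines could pick a mean such as `1.0 + 2.0 ** -30` and check that the float64 output reflects the offset, which would require a tolerance and sample count chosen for the test.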

- modify_attr : shape
comment : "The arg 'default_value' of attr 'shape' is changed: from 'None' to '{}'."
default : std::vector<int64_t>{}
- checkpoint : Upgrade gaussian_random, change the type of attributes [mean] and [std] from float to double to avoid float32 truncation that breaks float64 output precision.


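🔴 Bug This declares that gaussian_random mean/std are upgraded from float to double, but the generation source for the old static graph has not been synced.

paddle/fluid/operators/generator/CMakeLists.txt reads paddle/phi/ops/yaml/legacy/static_ops.yaml to generate the legacy/static gaussian_random, and gaussian in that file still has float mean/std. The static-graph entry point therefore still models these as float attrs, and the old IR -> PIR transcriber can at most wrap an already truncated FloatAttribute in a DoubleAttribute; it cannot preserve the double-precision value the user passed in, which contradicts this checkpoint's description.

Suggested fix: also change the mean/std of gaussian in legacy/static_ops.yaml to double; make GaussianRandomOpTranscriber accept both FloatAttribute and DoubleAttribute (old models may carry float, newly generated ops may carry double); and add a static-graph-to-PIR float64 precision regression test.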
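
The point about wrapping an already truncated attribute can be sketched with a toy model. The classes `FloatAttr` and `DoubleAttr` and the function `translate_to_double` below are hypothetical stand-ins, not Paddle's `FloatAttribute`, `DoubleAttribute`, or `GaussianRandomOpTranscriber`; they only show why the translation has to accept both kinds and why widening afterwards does not restore precision.

```python
import struct
from dataclasses import dataclass


def round_trip_float32(x: float) -> float:
    """Narrow x to IEEE-754 float32 and widen it back to a Python float."""
    return struct.unpack("f", struct.pack("f", x))[0]


@dataclass(frozen=True)
class FloatAttr:
    """Hypothetical float32 attribute: the value is narrowed on construction."""
    value: float

    def __post_init__(self) -> None:
        object.__setattr__(self, "value", round_trip_float32(self.value))


@dataclass(frozen=True)
class DoubleAttr:
    """Hypothetical float64 attribute: the value is stored unchanged."""
    value: float


def translate_to_double(attr) -> DoubleAttr:
    """Accept either attribute kind and always return a DoubleAttr."""
    if isinstance(attr, DoubleAttr):
        return attr
    if isinstance(attr, FloatAttr):
        # Widening is lossless, but the bits dropped earlier are gone.
        return DoubleAttr(attr.value)
    raise TypeError(f"unsupported attribute type: {type(attr).__name__}")


user_std = 0.1
from_legacy = translate_to_double(FloatAttr(user_std))
from_new = translate_to_double(DoubleAttr(user_std))
```

In this model `from_new.value` equals `user_std`, while `from_legacy.value` carries the float32 rounding of 0.1, which is why the static-graph source would need to store double from the start.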

- MT min_val = static_cast<MT>(static_cast<T>(min.to<float>()));
- MT max_val = static_cast<MT>(static_cast<T>(max.to<float>()));
- funcs::uniform_real_transform<MT, T> trans(min_val, max_val);
+ funcs::uniform_real_transform<MT, T> trans(static_cast<MT>(min.to<T>()),


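🔴 Bug This hunk only fixes the seed == 0 branch; the explicit-seed branch still truncates the bounds to float.

The new code uses min.to<T>()/max.to<T>() to preserve double bounds when dtype=float64, but the else branch further down in the same hunk still calls min.to<float>()/max.to<float>() before passing them to UniformGenerator<T>. As a result, paddle.uniform(..., dtype="float64", seed=123) on GPU still uses float32 bounds, and the numerical semantics of the same API now depend on whether seed is 0.

Suggested fix: use the output-type conversion in the explicit-seed branch as well, e.g. UniformGenerator<T>(min.to<T>(), max.to<T>(), seed, ...), and add a regression test for GPU float64 with seed != 0 and bounds that float32 cannot represent exactly.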
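
As a rough illustration of why the two branches can disagree, the sketch below applies the usual affine map from a unit sample to `[low, high)` once with double bounds and once with bounds narrowed to float32. It is plain Python and not the kernel code; `uniform_transform` and `round_trip_float32` are hypothetical helpers, and 0.1 and 0.7 are simply values that float32 cannot represent exactly.

```python
import struct


def round_trip_float32(x: float) -> float:
    """Narrow x to IEEE-754 float32 and widen it back to a Python float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def uniform_transform(u: float, low: float, high: float) -> float:
    """Map a unit sample u in [0, 1) onto [low, high)."""
    return low + (high - low) * u


low, high = 0.1, 0.7

# The same unit sample run through both bound-handling strategies.
u = 0.0
sample_double_bounds = uniform_transform(u, low, high)
sample_float_bounds = uniform_transform(
    u, round_trip_float32(low), round_trip_float32(high)
)
```

With `u = 0.0` the double-bound result is exactly `low`, while the narrowed variant returns the float32 rounding of `low`, so a float64 test with an explicit seed can separate the two behaviors.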

@greenhandF

Contributor Author

/re-run all-failed
