
[API Compatibility] Add paddle.nn.init.sparse_()#79310

Open
algorithm1832 wants to merge 9 commits into
PaddlePaddle:develop from
algorithm1832:improvement_nn_init_sparse

Conversation

@algorithm1832

Contributor

PR Category

User Experience

PR Types

Improvements

Description

  • Add paddle.nn.init.sparse_()
  • Update paddle.nn.init.normal_() to return tensor in dygraph mode
  • Add corresponding test
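To make the intent of a sparse initializer concrete, here is a minimal pure-Python sketch. It assumes semantics modeled on `torch.nn.init.sparse_` (each column gets `ceil(sparsity * rows)` zeros, and the remaining entries are drawn from a normal distribution with mean 0); the PR's actual signature and behavior are not shown in this thread, so the function name `sparse_init` and all of its parameters are hypothetical.

```python
import math
import random


def sparse_init(rows, cols, sparsity, std=0.01, seed=None):
    """Return a rows x cols matrix (list of lists) with sparse normal init.

    Assumed semantics (modeled on torch.nn.init.sparse_, not taken from
    this PR): every column gets ceil(sparsity * rows) entries set to zero,
    and the remaining entries are drawn from N(0, std).
    """
    rng = random.Random(seed)
    num_zeros = math.ceil(sparsity * rows)
    # Start from a dense normal sample.
    matrix = [[rng.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]
    for col in range(cols):
        # Pick distinct row indices to zero out in this column.
        for row in rng.sample(range(rows), num_zeros):
            matrix[row][col] = 0.0
    return matrix


weights = sparse_init(rows=10, cols=4, sparsity=0.25, seed=0)
# Count the zero entries in each column.
zeros_per_column = [
    sum(1 for r in range(10) if weights[r][c] == 0.0) for c in range(4)
]
print(zeros_per_column)
```

With 10 rows and `sparsity=0.25`, the sketch zeroes `ceil(2.5) = 3` entries per column.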

Does this cause a precision change?

PaddlePaddle-bot

This comment was marked as outdated.

@codecov-commenter

codecov-commenter commented Jun 12, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@4ce8ac4). Learn more about missing BASE report.

Additional details and impacted files
@@             Coverage Diff             @@
##             develop    #79310   +/-   ##
===========================================
  Coverage           ?   100.00%           
===========================================
  Files              ?         1           
  Lines              ?        50           
  Branches           ?         0           
===========================================
  Hits               ?        50           
  Misses             ?         0           
  Partials           ?         0           


Comment thread python/paddle/nn/init.py
return init(tensor)


def sparse_(

Contributor

A newly added API must include example code, per the API standard.

@paddle-bot paddle-bot Bot added the contributor External developers label Jun 12, 2026
PaddlePaddle-bot

This comment was marked as outdated.

Comment thread python/paddle/nn/initializer/normal.py Outdated
)
out_var._share_underline_tensor_to(var)
return None
return out_var

@zhwesky2010 zhwesky2010 Jun 14, 2026

Contributor

All of the paddle.nn.init.* functions probably have this problem. If other APIs of the same kind are affected, fix them together.

Contributor Author

I'm not sure whether changing the return value this way actually works; the XPU dygraph test keeps failing.
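The return-value question in this thread can be illustrated without Paddle. The sketch below uses a hypothetical `FakeTensor` class and two hypothetical initializer functions to contrast an in-place initializer that returns `None` with one that returns the object it modified; it does not reproduce Paddle's internals such as `_share_underline_tensor_to`.

```python
import random


class FakeTensor:
    """Hypothetical stand-in for a framework tensor; holds a flat list of floats."""

    def __init__(self, size):
        self.data = [0.0] * size


def normal_returning_none(tensor, mean=0.0, std=1.0, rng=random):
    # In-place fill that returns nothing, so the call cannot be chained.
    tensor.data[:] = [rng.gauss(mean, std) for _ in tensor.data]


def normal_returning_tensor(tensor, mean=0.0, std=1.0, rng=random):
    # Same in-place fill, but the input object itself is returned.
    tensor.data[:] = [rng.gauss(mean, std) for _ in tensor.data]
    return tensor


t = FakeTensor(4)
none_result = normal_returning_none(t)
same_result = normal_returning_tensor(t)
print(none_result is None, same_result is t)
```

Returning the input object keeps the in-place contract while allowing call chaining, which is the behavior the description asks for in dygraph mode.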

@PaddlePaddle-bot

PaddlePaddle-bot commented Jun 15, 2026


🤖 Paddle-CI-Agent | ci_status_monitor | 2026-06-18 21:48:55 UTC+08:00

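CI report generated from the following code (updated every 30 minutes):
PR commit: d367791 | Merge base: 4ce8ac4 (branch: develop)


1 Required jobs: 41/48 passed

Total runs (reruns) | Total jobs | ✅ Passed | ❌ Failed | ⏳ Running | ⏸️ Waiting | Skipped
80 (0) | 80 | 73 | 3 | 4 | 0 | 0

Job | Error type | Confidence | Log
Windows-OPENBLAS / Build and test | Flaky: passed on rerun after available GPU memory read as 0 | High | Job
Model-Benchmark / Benchmark test | Environment issue: PaddleX model registration/upload script error | High | Job
Slice / Slice test | Flaky: Slice performance benchmark fluctuation | Medium | Job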

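2 Failure details

🔴 Windows-OPENBLAS / Build and test: flaky (confidence: high)

Error type: Flaky | Confidence: High
Analyzer: generic analysis (fallback)
Failed cases:

Case | Error summary
allocator_facade_frac_flags_test, Allocator.Allocator | The first run raised ResourceExhaustedError because available GPU memory was 0; the rerun in the same job passed

Key logs: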

27/31 Test #64: allocator_facade_frac_flags_test .........***Failed   15.21 sec
[ RUN      ] Allocator.Allocator
ResourceExhaustedError: Not enough available GPU memory.
[Hint: Expected available_to_alloc > 0, but received available_to_alloc:0 <= 0:0.]
The following unittest will be re-run:
allocator_facade_frac_flags_test
1/2 Test #64: allocator_facade_frac_flags_test ...   Passed    2.77 sec
There are failed tests, which have been successful after re-run
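  • Root cause summary: a transient shortage of GPU resources on Windows; the test passed on rerun
  • The initial failure occurred in the C++ allocator GPU memory test, which reported 0 available GPU memory; the log then shows the case being rerun and passing. The PR only modifies python/paddle/nn/init.py and test/legacy_test/test_nn_init_function.py, which have no direct connection to the C++ allocator memory test.

Suggested fix:

  1. Known flaky; please rerun. If it keeps reproducing, check GPU memory isolation on the Windows runner and the resource usage of the allocator tests.

Related changes: none found in the files modified by this PR.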

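🔴 Model-Benchmark / Benchmark test: environment issue (confidence: high)

Error type: Environment issue | Confidence: High
Analyzer: generic analysis (fallback)
Failed cases:

Case | Error summary
PaddleX_semantic_segmentation_Deeplabv3-R50_bs4_fp32_DP_dynamic_N1C1 | PaddleX has no registered model Deeplabv3-R50, and the result upload script then called a misspelled BOS SDK method

Key logs: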

KeyError: 'Deeplabv3-R50'
Exception: No model named Deeplabv3-R50
AttributeError: 'BosClient' object has no attribute 'put_super_obejct_from_file'. Did you mean: 'put_super_object_from_file'?
Sorry, some models failed to run, Please check the result_html.html.
Process completed with exit code 10.
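  • Root cause summary: errors in the external PaddleX model registration and the CI upload script
  • The failure occurred in the PaddleX benchmark flow: the model registry has no entry for Deeplabv3-R50. Afterwards, /workspace/PaddleTest/tools/bos_tools.py called put_super_obejct_from_file when uploading results, which does not match the SDK's actual method name. This PR does not modify PaddleX, PaddleTest, or the benchmark configuration.

Suggested fix:

  1. Environment issue; please rerun. If it fails repeatedly, the PaddleX model registration data and the BOS SDK method name in bos_tools.py need to be fixed.

Related changes: none found in the files modified by this PR.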

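🔴 Slice / Slice test: flaky (confidence: medium)

Error type: Flaky | Confidence: Medium
Analyzer: generic analysis (fallback)
Failed cases:

Case | Error summary
Getitem - forward - Slice - Slice with Step - float16 - paddle | The Slice benchmark detected a relative performance change of -11.70%, which failed the slice test

Key logs: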

slice测试失败, 存在性能下降case
{'Getitem - forward - Slice - Slice with Step - float16 - paddle': -0.11698894133855679}
File "/paddle/PaddleTest/framework/slice_benchmark/run.py", line 164, in ci_test
raise Exception("slice测试失败")
Exception: slice测试失败
Process completed with exit code 1.
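  • Root cause summary: fluctuation in a single Slice performance benchmark crossed the threshold
  • The failure comes from a single float16 step-slicing case in the Slice micro benchmark; the log shows no functional assertion failure. The PR's changes are confined to the paddle.nn.init Python API and the corresponding legacy unit test, and do not touch the Slice kernel, benchmark scripts, or performance configuration. This currently looks more like performance noise or baseline fluctuation, hence the medium confidence.

Suggested fix:

  1. Known flaky; please rerun. If it reproduces consistently after a rerun, hand it to the owners of Slice performance to check the benchmark baseline and this case's performance.

Related changes: none found in the files modified by this PR.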
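For readers unfamiliar with threshold-based performance gates, the sketch below shows one way a relative change like the one in the Slice log could be flagged. The helper names and the 5% threshold are made up for illustration; the real logic in `slice_benchmark/run.py` is not shown in this thread and may differ.

```python
def relative_change(baseline, current):
    """Relative change of `current` against `baseline` (negative means a drop)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) / baseline


def find_regressions(changes, threshold):
    """Return the cases whose relative change falls below -threshold."""
    return {name: delta for name, delta in changes.items() if delta < -threshold}


# The value is copied from the log above; the 5% threshold is a made-up example.
changes = {
    "Getitem - forward - Slice - Slice with Step - float16 - paddle": -0.11698894133855679,
}
regressions = find_regressions(changes, threshold=0.05)
for name, delta in regressions.items():
    print(f"{name}: {delta:+.2%}")
```

A gate of this kind fails on any single case below the threshold, which is why one noisy float16 measurement is enough to fail the whole job.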

PaddlePaddle-bot

This comment was marked as outdated.

@algorithm1832

algorithm1832 commented Jun 15, 2026

Contributor Author

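The XPU test did not pass. It looks like setting elements to 0 in dygraph mode did not take effect.

However, I can't easily reproduce this with the hardware I have at hand.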

@zhwesky2010 zhwesky2010 left a comment

Contributor

This is probably hitting a pre-existing XPU issue, so you can skip the test there. The focus of this PR is still making sure the newly added API works correctly.

        if paddle.is_compiled_with_xpu():
            self.skipTest("vander is not supported on XPU")
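The skip pattern suggested above can be tried without Paddle installed. In this sketch, `is_compiled_with_xpu` is a hypothetical stand-in for `paddle.is_compiled_with_xpu()` that always returns True so the skip path runs; the test class name, test body, and skip message are illustrative only.

```python
import unittest


def is_compiled_with_xpu():
    # Hypothetical stand-in for paddle.is_compiled_with_xpu(); always True
    # here so the skip path is exercised without Paddle installed.
    return True


class TestSparseInit(unittest.TestCase):
    def test_zero_assignment(self):
        if is_compiled_with_xpu():
            self.skipTest("known XPU issue with zero assignment in dygraph mode")
        # Only reached on non-XPU builds.
        values = [1.0, 2.0, 3.0]
        values[1] = 0.0
        self.assertEqual(values, [1.0, 0.0, 3.0])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSparseInit)
result = unittest.TestResult()
suite.run(result)
print(len(result.skipped), len(result.failures), len(result.errors))
```

A skipped test is reported separately from failures and errors, so the run still counts as successful.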

PaddlePaddle-bot

This comment was marked as outdated.

@algorithm1832

Contributor Author

/re-run all-failed

PaddlePaddle-bot

This comment was marked as outdated.

Comment thread python/paddle/nn/init.py
"""
init = Normal(mean=mean, std=std)

if in_dygraph_mode():

Contributor

Handle the return value for the other init APIs as well. This file was added recently, so backward compatibility is generally not a concern.

@PaddlePaddle-bot PaddlePaddle-bot left a comment


🤖 Paddle-CI-Agent | pr_review | 2026-06-18 20:16:44

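📋 Review summary

PR overview: adds paddle.nn.init.sparse_() and makes several paddle.nn.init.*_ functions return the input Tensor in dynamic graph mode.
Scope: python/paddle/nn/init.py, test/legacy_test/test_nn_init_function.py
Impact tag: [User Experience]

Issues

No new, non-historical inline issues were found. The current status of unresolved historical findings is listed below; PR convention issues are reported in the section further down and are not repeated here.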

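Status of historical findings

Finding | Issue | Status
F1 | In static graph mode, normal_() returns an op that is then used as a Tensor | ⚠️ Still present
F2 | Documentation/examples for the new public API are incomplete | 🔄 Partially fixed
F3 | sparsity lacks a [0, 1] range check | ⚠️ Still present
F4 | In static graph mode, slice assignment is not rebound via paddle.static.setitem | ⚠️ Still present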
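Finding F3 asks for a range check on `sparsity`. A minimal sketch of such a check is shown here; the helper name `check_sparsity`, the choice of exception types, and the error messages are assumptions, not code from the PR.

```python
import math


def check_sparsity(sparsity):
    """Validate that `sparsity` is a real number in [0, 1] and return it as float."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(sparsity, bool) or not isinstance(sparsity, (int, float)):
        raise TypeError(
            f"sparsity must be a real number, got {type(sparsity).__name__}"
        )
    if math.isnan(sparsity) or not 0.0 <= sparsity <= 1.0:
        raise ValueError(f"sparsity must be in [0, 1], got {sparsity}")
    return float(sparsity)


valid = check_sparsity(0.5)
for bad in (1.5, -0.1, float("nan")):
    try:
        check_sparsity(bad)
    except ValueError as exc:
        print(exc)
```

Running the check before any tensor is modified means an invalid argument fails fast and leaves the input untouched.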

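📝 PR convention check

The title tag API Compatibility is not in the Paddle template's list of allowed tags; the description follows the template structure, and the precision-change field is filled in.

Suggested title (can be copied directly):

  • [User Experience] Add paddle.nn.init.sparse_()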

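Overall assessment

This round re-checked the public Python API, dynamic/static graph behavior, tests, and documentation entry points against Paddle checklist A3/B3. No new non-historical issues were found, but the historical static graph semantics and parameter validation issues still need to be fixed before merging.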

@zhwesky2010 zhwesky2010 left a comment

Contributor


LGTM

@algorithm1832

Contributor Author

/re-run all-failed

1 similar comment
@algorithm1832

Contributor Author

/re-run all-failed


4 participants