[L0] cache zeKernelSetIndirectAccess per kernel#1295

Merged: pvelesko merged 1 commit into main from fix-l0-igba-cache on Jun 15, 2026

pvelesko (Collaborator) commented Jun 11, 2026

Cache the indirect-access flags per kernel so zeKernelSetIndirectAccess is called once instead of on every launch.

  • measure performance impact

The L0 backend was calling zeKernelSetIndirectAccess on every kernel
launch when the module's HasNoIGBAs flag was false. The Intel L0 driver
appears to serialize this entry point globally, which caused 8 HecBench
benchmarks (entropy, layout, ldpc, minisweep, p4, scan2, simpleSpmv,
sptrsv) to time out under back-to-back launches while passing under the
OpenCL backend.

The indirect-access flags are a property of the kernel handle and only
need to be set once. Add an atomic flag to CHIPKernelLevel0 and use a
compare-and-swap (CAS) so that the call is made exactly once per kernel
handle. If the call fails, the flag is cleared so that a later launch
can retry.
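
The set-once pattern described above could look roughly like the sketch below. This is not the code from this PR: `KernelSketch`, `FakeDriver`, `ensureIndirectAccess`, and `launchThreeTimes` are hypothetical names, and the fake driver stands in for zeKernelSetIndirectAccess so the example compiles without Level Zero headers.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the driver entry point. It counts how often it
// is called and can be told to fail once, so the retry path can be exercised.
struct FakeDriver {
  int Calls = 0;
  bool FailNext = false;

  bool setIndirectAccess() {
    ++Calls;
    if (FailNext) {
      FailNext = false;
      return false;
    }
    return true;
  }
};

// Hypothetical kernel wrapper holding the per-kernel "already set" flag.
class KernelSketch {
  std::atomic<bool> IndirectAccessSet_{false};

public:
  // Returns false only when this call attempted the driver call and it failed.
  bool ensureIndirectAccess(FakeDriver &Drv) {
    bool Expected = false;
    // Only the caller that flips false -> true performs the driver call.
    if (!IndirectAccessSet_.compare_exchange_strong(
            Expected, true, std::memory_order_acq_rel))
      return true; // flag already claimed by an earlier or concurrent launch

    if (Drv.setIndirectAccess())
      return true;

    // The driver call failed: clear the flag so a later launch can retry.
    IndirectAccessSet_.store(false, std::memory_order_release);
    return false;
  }
};

// Usage: three back-to-back launches of one kernel reach the driver once.
inline int launchThreeTimes() {
  FakeDriver Drv;
  KernelSketch K;
  for (int I = 0; I < 3; ++I) {
    bool Ok = K.ensureIndirectAccess(Drv);
    assert(Ok);
    (void)Ok;
  }
  return Drv.Calls;
}
```

In this sketch a caller that loses the CAS returns immediately and does not wait for the winning caller's driver call to finish. Whether that matters depends on how launches of the same kernel are serialized, which the description above does not state.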
pvelesko force-pushed the fix-l0-igba-cache branch from 855a12c to 6b2724e on June 13, 2026 15:36
pvelesko (Collaborator, Author) commented

/run-aurora-ci

pvelesko marked this pull request as ready for review on June 15, 2026 03:52
pvelesko merged commit 24bf5e0 into main on Jun 15, 2026
29 checks passed