IGC 2.34.4: clBuildProgram ICE (Segmentation violation) on OpenCL C kernel using atomic_cmpxchg inside variable-bound nested loops (Alder Lake iGPU) #408

@sergiusens

Environment

  • IGC: 2.34.4-2.fc44 (intel-igc-opencl / intel-igc-core)
  • compute-runtime: 26.18.38308.1-2.fc44 (intel-opencl)
  • GPU: Intel Iris Xe Graphics (Alder Lake-P), PCI ID 8086:A7A0
  • OS: Fedora 44, kernel 6.19.14-101.fc44.x86_64
  • Path: OpenCL C via clBuildProgram (not SPIR-V / Level Zero)

Symptom

clBuildProgram returns CL_BUILD_PROGRAM_FAILURE (-11) with the build log:

IGC: Internal Compiler Error: Segmentation violation

The failure is deterministic and occurs on every run. It happens at program build time, inside clBuildProgram, so no kernel is ever dispatched.

Minimal reproducer

minimal_repro.cl (~20 lines of standard OpenCL C 1.2):

void atomic_add_f(global float *val, const float delta)
{
  union { float f; unsigned int i; } old_val, new_val;
  global volatile unsigned int *ival = (global volatile unsigned int *)val;
  do {
    old_val.i = atomic_add(ival, 0);
    new_val.f  = old_val.f + delta;
  } while(atomic_cmpxchg(ival, old_val.i, new_val.i) != old_val.i);
}

kernel void test_kernel(global float *buf, const int radius, const int width)
{
  const int x = get_global_id(0);
  const int y = get_global_id(1);
  for(int dy = -radius; dy <= radius; dy++)
    for(int dx = -radius; dx <= radius; dx++)
      atomic_add_f(buf + mad24(y + dy, width, x + dx), 1.0f);
}

Host code (abbreviated):

cl_program prog = clCreateProgramWithSource(ctx, 1, &src, &sz, NULL);
cl_int err = clBuildProgram(prog, 1, &dev, "", NULL, NULL);
// err == -11, build log == "IGC: Internal Compiler Error: Segmentation violation"

Bisection results

Construct | Result
atomic_cmpxchg in CAS loop, called inside nested for with runtime bounds (const int radius from kernel arg) | ICE
Same CAS loop, loops with constant bounds (for(int dy=-2; dy<=2; dy++)) | OK
Runtime-bound loops, atomic_cmpxchg replaced with plain buf[dst] = val | OK
CAS loop inlined verbatim into the loop body (no function call) | ICE

The ICE needs both ingredients: removing either the atomic CAS loop or the runtime loop bounds eliminates it, while inlining the helper by hand does not.

Workaround

Setting IGC_ShaderDumpEnableAll=1 makes the kernel compile successfully, suggesting the crash is in an optimization pass that is skipped or handled differently in dump mode.
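For an application that cannot rely on the user exporting the variable in the shell, a minimal sketch of applying the workaround from inside the process might look like the following. The helper name enable_igc_dump_workaround is hypothetical, and the sketch assumes that IGC reads the environment no earlier than the first OpenCL call, so the helper would need to run before any platform or context is created. Dump mode may also write shader dump files as a side effect.

```c
#define _POSIX_C_SOURCE 200112L  /* for setenv() */
#include <stdlib.h>

/* Hypothetical helper (not part of any Intel API): sets the environment
 * variable that this report found to avoid the ICE. Assumption: it must be
 * called before the first OpenCL call so the compiler library sees it.
 * Returns 0 on success, -1 on failure (as setenv does). */
static int enable_igc_dump_workaround(void)
{
    /* Third argument 1: overwrite any value already present. */
    return setenv("IGC_ShaderDumpEnableAll", "1", 1);
}
```

This only automates the setting described above; it does not fix or explain the underlying crash.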

Notes

  • The union type-punning is explicitly permitted by the OpenCL C 1.2 spec (§6.2.4.1, "Reinterpreting Types Using Unions").
  • atomic_add and atomic_cmpxchg on global unsigned int * are core atomic built-ins since OpenCL C 1.1 (promoted from the cl_khr_global_int32_base_atomics extension of OpenCL 1.0).
  • This is a float CAS pattern — the standard idiom for float atomics in OpenCL C before cl_ext_float_atomics.
  • The crash was discovered in the crystgrain module of the Ansel photo editor. The module works correctly on AMD and NVIDIA GPUs; it was excluded from Intel at load time as a workaround.
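For readers less familiar with the float CAS idiom mentioned in the notes, here is a minimal host-side analogue in C11. It is only an illustration of the pattern: it runs on the CPU, does not involve IGC, and does not reproduce the ICE. The names atomic_add_float and load_float are hypothetical, and memcpy stands in for the union used in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Host-side analogue of atomic_add_f: add delta to the float whose bit
 * pattern is stored in *cell, retrying a 32-bit compare-and-swap until
 * no other writer has changed the cell in between. */
static void atomic_add_float(_Atomic uint32_t *cell, float delta)
{
    uint32_t old_bits = atomic_load(cell);
    uint32_t new_bits;
    do {
        float old_val, new_val;
        memcpy(&old_val, &old_bits, sizeof old_val);   /* bits -> float */
        new_val = old_val + delta;
        memcpy(&new_bits, &new_val, sizeof new_bits);  /* float -> bits */
        /* On failure, old_bits is refreshed with the current contents
         * of *cell and the loop recomputes the sum. */
    } while (!atomic_compare_exchange_weak(cell, &old_bits, new_bits));
}

/* Read the cell back as a float. */
static float load_float(_Atomic uint32_t *cell)
{
    uint32_t bits = atomic_load(cell);
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

The comparison is done on the integer bit pattern rather than on the float value, which is the same reason the kernel casts the buffer to unsigned int before calling atomic_cmpxchg.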

This bug report was assembled with the help of Claude Code (Anthropic), which assisted in bisecting the minimal reproducer and drafting the report.
