I’m looking into analyzing nested for loops in existing CUDA functions (from .cu source files), specifically to extract polyhedral dependencies and tiling information. I know that Loopy can generate CUDA code from an isl-like description.
I’m wondering if the reverse is possible: starting from a CUDA kernel (for example, accessed via PyCUDA), can one recover a Loopy/isl-like representation of its loops and dependencies?
My goal is to analyze loop structures and extract tiling and affinity information for LLM-driven optimization research, rather than to execute the kernels.
Note: My experience with compiler internals is limited, so any guidance on feasible approaches or existing tools would be greatly appreciated.
Thanks!