docs: update cuDNN sliding window attention support#2624

Open
sbhavani wants to merge 3 commits into NVIDIA:main from sbhavani:docs/update-cudnn-swa-support

@sbhavani
Collaborator

Description

Update the documentation to reflect that cuDNN supports causal sliding window attention (SWA) starting with version 9.2.

Type of change

  • Documentation change (change only to the documentation, either a fix or a new content)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Infra/Build change
  • Code refactoring

Changes

  • Updated backend support matrix table to show cuDNN supports SWA (cuDNN 9.2+, causal masks only)
  • Added SWA comparison between flash-attention and cuDNN in section 1.3
  • Added clarifying note in cp_ag_thd_dpa_jax_deep_dive.ipynb that cuDNN supports SWA but not all striping patterns for context parallelism

Checklist:

  • I have read and followed the contributing guidelines
  • The functionality is complete
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Technical details:
- cuDNN 9.2+: Supports causal SWA with window_size=(left, 0)
- cuDNN 9.6+: Enhanced support for asymmetric windows (left, right)
- Constraints: Requires dropout=0.0 and bias_type="no_bias"
- Only works with causal mask types
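
For readers who want the constraints above in executable form, here is a minimal sketch of an eligibility check. The function `cudnn_swa_supported` and its parameters are hypothetical names chosen for illustration; this is not Transformer Engine's backend-selection code. It assumes only the thresholds listed in this description (9.2 for `(left, 0)`, 9.6 for `(left, right)`, causal masks, `dropout=0.0`, `bias_type="no_bias"`).

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as "9.2.1" into (9, 2, 1)."""
    return tuple(int(part) for part in version.split("."))


def cudnn_swa_supported(
    cudnn_version: str,
    window_size: Tuple[int, int],
    mask_type: str,
    dropout: float,
    bias_type: str,
) -> bool:
    """Hypothetical helper: apply the constraints listed in this PR description."""
    left, right = window_size
    # Constraint from the description: no dropout and no bias.
    if dropout != 0.0 or bias_type != "no_bias":
        return False
    # Constraint from the description: causal mask types only.
    if "causal" not in mask_type:
        return False
    version = parse_version(cudnn_version)
    if right == 0:
        # window_size=(left, 0) is listed as supported from cuDNN 9.2.
        return version >= (9, 2)
    # window_size=(left, right) is listed as supported from cuDNN 9.6.
    return version >= (9, 6)
```

For example, `cudnn_swa_supported("9.2.1", (128, 0), "causal", 0.0, "no_bias")` evaluates to `True` under these assumptions, while the same call with `dropout=0.1` evaluates to `False`.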

Signed-off-by: Santosh Bhavani <santosh.bhavani@live.com>
@sbhavani sbhavani requested a review from pggPL January 26, 2026 18:57
@greptile-apps
Contributor

greptile-apps bot commented Jan 26, 2026

Greptile Summary

Updated documentation to reflect cuDNN's sliding window attention (SWA) support, available from cuDNN 9.2 onward.

Changes Made

  • Updated backend support matrix table to show cuDNN supports SWA starting from version 9.2+
  • Added detailed SWA comparison between flash-attention and cuDNN in section 1.3, clarifying version-specific support (SWA(left, 0) from cuDNN 9.2, SWA(left, right) from cuDNN 9.6)
  • Added note about cuDNN supporting SWA for causal masks only, without dropout, and with bias_type="no_bias"
  • Minor formatting improvement in JAX notebook
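
To make the difference between SWA(left, 0) and SWA(left, right) concrete, the sketch below builds a boolean attention mask in plain Python. It assumes the common convention that, for equal query and key lengths, query position `i` may attend to key positions `j` with `i - left <= j <= i + right`; the helper name `sliding_window_mask` is hypothetical and is not taken from the notebooks in this PR.

```python
from typing import List


def sliding_window_mask(seq_len: int, left: int, right: int) -> List[List[bool]]:
    """Return mask[i][j] == True where query i may attend to key j.

    Assumes self-attention with equal query/key lengths and the window
    convention i - left <= j <= i + right (an assumption, see lead-in).
    """
    return [
        [(i - left) <= j <= (i + right) for j in range(seq_len)]
        for i in range(seq_len)
    ]


# (left, 0): each query sees itself and up to `left` earlier keys, so the
# window never looks ahead and is compatible with a causal mask.
causal_window = sliding_window_mask(seq_len=4, left=1, right=0)

# (left, right) with right > 0: the window also extends to later keys.
two_sided_window = sliding_window_mask(seq_len=4, left=1, right=1)
```

With `left=1, right=0`, the row for query 2 is `[False, True, True, False]`; with `left=1, right=1`, it becomes `[False, True, True, True]`.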

Review Notes

The documentation changes are accurate and well-aligned with the codebase. Test files confirm cuDNN 9.2.1+ requirement for SWA tests, and code comments validate the causal mask restriction for cuDNN SWA support.

Confidence Score: 5/5

  • This PR is safe to merge - documentation-only changes with accurate information
  • Perfect score reflects documentation-only changes that accurately describe cuDNN SWA support introduced in version 9.2+, validated against test files and code comments in the repository
  • No files require special attention

Important Files Changed

Filename | Overview
docs/examples/attention/attention.ipynb | Updated the backend support matrix and added SWA comparison details between flash-attention and cuDNN; accurate documentation changes reflecting cuDNN 9.2+ SWA support
docs/examples/attention/cp_ag_thd_dpa_jax_deep_dive.ipynb | Minor formatting change; added a blank line with no functional impact

Last reviewed commit: 251e6b2

@greptile-apps greptile-apps bot left a comment

No files reviewed, no comments

@greptile-apps greptile-apps bot left a comment

2 files reviewed, no comments

pggPL previously approved these changes Feb 18, 2026
Collaborator

@pggPL pggPL left a comment

LGTM

@greptile-apps greptile-apps bot left a comment

2 files reviewed, no comments

Signed-off-by: Santosh Bhavani <santosh.bhavani@live.com>
@sbhavani sbhavani force-pushed the docs/update-cudnn-swa-support branch from edc84ae to 251e6b2 Compare February 24, 2026 23:12
@greptile-apps greptile-apps bot left a comment

2 files reviewed, no comments

@sbhavani sbhavani requested a review from cyanguwa March 3, 2026 01:53