Fix STG crash/index miscount with guide-mask self-attention by jjdejong · Pull Request #503 · Lightricks/ComfyUI-LTXVideo

Jean J. de Jong (jjdejong) · 2026-06-02T09:18:31Z

Summary

Fixes a crash (and a related attention-index miscount) in the STG / multimodal
guider whenever cond-image / keyframe guides with strength != 1.0 are combined
with STG perturbation.

Symptom

During the STG-perturbed denoise step:

RuntimeError: The expanded size of the tensor (N) must match the existing
size (M) at non-singleton dimension 1 ... in _attention_with_guide_mask

Root cause

ComfyUI core (CORE-166, "Reduce LTX2.3 peak VRAM when guide_mask is in use")
now splits one video self-attention into up to three optimized_attention
calls over sliced queries against the full key/value. The plugin's STG
PatchAttention skipped a layer with `return v` (the full-length value) and
counted each sub-call as a separate attention index. So:

- `return v` is the wrong length for the sliced-query output slot, which causes the crash; and
- the extra sub-calls shift audio_attn_idx in calc_stg_indexes, so audio STG
  would skip the wrong attention even when it doesn't crash.
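
The two failure modes can be reproduced in a toy model. This is only a sketch: sequences are plain Python lists instead of tensors, and `split_self_attention` and `NaivePatch` are hypothetical stand-ins for core's guide-mask path and the pre-fix patch, not the real code.

```python
# Toy model: sequences are plain lists of tokens; no real attention is computed.

def split_self_attention(attn, q, k, v, bounds):
    """Hypothetical stand-in for the guide-mask split:
    call attn once per query slice, always against the full k/v."""
    out = [None] * len(q)
    for start, end in bounds:
        result = attn(q[start:end], k, v)
        if len(result) != end - start:
            # Mirrors the shape mismatch reported in the Symptom section.
            raise RuntimeError(
                f"expanded size ({end - start}) must match "
                f"existing size ({len(result)})"
            )
        out[start:end] = result
    return out


class NaivePatch:
    """Hypothetical pre-fix behaviour: every call consumes one attention
    index, and a skipped layer returns v unchanged."""

    def __init__(self, skip_index):
        self.skip_index = skip_index
        self.index = 0

    def __call__(self, q, k, v):
        current = self.index
        self.index += 1
        if current == self.skip_index:
            return v        # full length: wrong for a sliced query
        return list(q)      # stand-in for the real attention output
```

With ten tokens split at `[(0, 4), (4, 7), (7, 10)]`, one logical self-attention advances `index` by three instead of one, and choosing a `skip_index` that lands on any sub-call raises the `RuntimeError` because the returned value has ten entries where the slot expects at most four.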

Fix

PatchAttention now recognises a guide-split sub-call via the
low_precision_attention=False kwarg. This is the only signal core's guide path
passes, and keying on it avoids false-positiving the v2a cross-attention, which
also has q_len < v_len. The patch collapses the split into a single logical STG
index and returns the matching v[:, off:off+q_len] slice when skipping. No core
changes.
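
A minimal sketch of that logic in the same list-based toy model follows. `GuideAwarePatch` and its offset bookkeeping are assumptions made for illustration; in particular, ending the split when the accumulated offset reaches the length of `v` is a guess at how the slices line up, not a description of the actual patch.

```python
class GuideAwarePatch:
    """Hypothetical post-fix behaviour on 1-D lists (the real code slices
    tensors along the sequence dimension, v[:, off:off+q_len])."""

    def __init__(self, skip_index):
        self.skip_index = skip_index
        self.index = 0    # logical STG attention index
        self.offset = 0   # query offset inside the current split

    def __call__(self, q, k, v, **kwargs):
        # Assumption: only the guide-mask path passes this kwarg as False.
        is_sub_call = kwargs.get("low_precision_attention") is False
        current = self.index
        if is_sub_call:
            start = self.offset
            self.offset += len(q)
            if self.offset >= len(v):
                # Last slice: close the split, consume one logical index.
                self.offset = 0
                self.index += 1
            skipped = v[start:start + len(q)]
        else:
            # Ordinary call (including cross-attention with q_len < v_len).
            self.index += 1
            skipped = v
        return skipped if current == self.skip_index else list(q)
```

Calling the patch three times with consecutive query slices and `low_precision_attention=False` advances `index` only once, and each skipped sub-call returns a value slice of the same length as its query slice, so the output slot always fits.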

Scope

Not AV-specific: any video-only workflow combining optional_cond_images
(strength != 1.0) with the STG / multimodal guider triggers this.

Repro

A video-only LTX-2 workflow with optional_cond_images at strength != 1.0 and
the STG / multimodal guider with perturbation enabled.

Caveat

The detector keys on low_precision_attention=False as the guide-split marker,
which is a core implementation detail. If core changes that, the detector would
silently stop collapsing the split, and the crash and index miscount would
return. Maintainers may prefer a more explicit contract or a core-side fix.

When cond-image/keyframe guides with strength != 1.0 are combined with STG perturbation, the perturbed denoise step crashed with:

RuntimeError: The expanded size of the tensor (N) must match the existing size (M) ... in _attention_with_guide_mask

Root cause: comfy core (CORE-166, "Reduce LTX2.3 peak VRAM when guide_mask is in use") splits one video self-attention into up to three optimized_attention calls over sliced queries against the full key/value. STG's PatchAttention skipped a layer with `return v` (the full sequence) and counted each sub-call as a separate attention index. So the returned value was the wrong length (crash on the query-slice assignment), and the extra sub-calls shifted audio_attn_idx in calc_stg_indexes.

Fix: detect a guide-split sub-call via the low_precision_attention=False kwarg (the only signal core's guide path passes; this avoids false-positiving the v2a cross-attention, which also has q_len < v_len), collapse the split into a single logical STG index, and return the matching v[:, off:off+q_len] slice when skipping. No core changes.

Not AV-specific: any video-only workflow combining cond_images (strength != 1.0) with the STG/multimodal guider triggers it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Jean J. de Jong (jjdejong) mentioned this pull request Jun 2, 2026

AV latent support for LTXVLoopingSampler and LTXVExtendSampler #472

Open

5 tasks

Jean J. de Jong (jjdejong) wants to merge 1 commit into Lightricks:master from jjdejong:stg-guide-mask-fix
