fix(index): serialize FTS position prewarm to avoid an IO-scheduler deadlock by BubbleCal · Pull Request #7623 · lance-format/lance

BubbleCal · 2026-07-04T12:30:53Z

Problem

prewarm_index(..., with_position=True) deadlocks on position-bearing inverted indexes: all tokio workers park and the prewarm never completes (reproduced on a 50M-doc index; >14 min with zero progress before this fix, vs ~19 min to fully prewarm 162G after it).

Position streams are dominated by a few huge hot-token rows, so prewarm chunks sized for the 128MB average routinely span hundreds of MBs to GBs. Two or more such read_ranges in flight on the store's shared ScanScheduler can exhaust its byte backpressure window while every request still has undelivered pages: a request's later pages never pass the min_in_flight priority bypass, so nothing can complete and nothing frees the window.
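The wedge described above can be illustrated with a deliberately simplified toy model. This is a sketch under stated assumptions, not lance-io's ScanScheduler: the function `run`, its parameters, and the rule that a request returns its bytes to the window only once all of its pages have been read are all hypothetical, and the model has no priority bypass. Sizes are chosen so that one request fits in the window but two do not.

```python
from collections import deque

def run(request_sizes, window, page=1, max_in_flight=1):
    """Toy simulation; returns how many requests complete.

    Assumptions (NOT lance-io's real behaviour): each page read reserves
    units from a shared window, and a request hands its units back only
    after every one of its pages has been read.
    """
    pending = deque(request_sizes)  # requests not yet started
    in_flight = []                  # [remaining_units, held_units] per request
    free = window
    completed = 0
    while pending or in_flight:
        # Start new requests up to the concurrency limit.
        while pending and len(in_flight) < max_in_flight:
            in_flight.append([pending.popleft(), 0])
        progressed = False
        # Round-robin: each in-flight request tries to read one page.
        for req in in_flight:
            if req[0] > 0 and free >= page:
                take = min(page, req[0])
                req[0] -= take
                req[1] += take
                free -= take
                progressed = True
        # Fully read requests return their units to the window.
        done = [r for r in in_flight if r[0] == 0]
        in_flight = [r for r in in_flight if r[0] > 0]
        for req in done:
            free += req[1]
            completed += 1
            progressed = True
        if not progressed:
            return completed  # wedged: window exhausted, nothing can finish
    return completed

print(run([6, 6], window=10, max_in_flight=2))  # 0: each request stalls holding 5 units
print(run([6, 6], window=10, max_in_flight=1))  # 2: one at a time always finishes
```

In this model the concurrent run stalls with the window split evenly between two unfinished requests, while the serial run completes both, which mirrors the shape of the reported hang without claiming to reproduce the scheduler's actual accounting.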

Fix

A single request in flight always delivers in order and recycles the window, so position prewarm now runs its chunks serially. Position-less prewarm keeps the concurrent path (its chunks are bounded by the 128MB target and never wedge).
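The split between the two paths can be sketched as follows. This is an illustrative Python asyncio sketch, not the Rust code in this PR: `prewarm`, `read_range`, `concurrency`, and `make_reader` are hypothetical names, with `read_range` standing in for the store read.

```python
import asyncio

async def prewarm(chunks, read_range, with_position, concurrency=4):
    """Read every chunk: serially for position data, concurrently otherwise.

    `read_range` is a hypothetical async callable; `concurrency` is an
    illustrative limit, not a value taken from the PR.
    """
    if with_position:
        # Serial path: at most one request in flight at any time.
        return [await read_range(c) for c in chunks]

    sem = asyncio.Semaphore(concurrency)

    async def bounded(c):
        async with sem:
            return await read_range(c)

    # Concurrent path: bounded fan-out, results returned in chunk order.
    return await asyncio.gather(*(bounded(c) for c in chunks))

def make_reader():
    """Fake reader that records the peak number of overlapping reads."""
    state = {"now": 0, "peak": 0}

    async def read_range(chunk):
        state["now"] += 1
        state["peak"] = max(state["peak"], state["now"])
        await asyncio.sleep(0)  # yield so other reads may overlap
        state["now"] -= 1
        return chunk

    return read_range, state

reader, stats = make_reader()
asyncio.run(prewarm(range(8), reader, with_position=True))
print(stats["peak"])  # 1
```

The fake reader makes the property of interest observable: on the serial path the peak number of overlapping reads stays at one, so a stalled request can never be starved by a sibling.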

The scheduler-side wedge (concurrent large read_ranges on one ScanScheduler) is a separate latent lance-io issue; this change removes the only known trigger.

Verification

Repro before/after on a 50M-doc 162G positions index: with_position prewarm hung (>840s timeout) → completes in ~19 min; position-less prewarm unchanged (~30s).
cargo test -p lance-index (inverted suite), cargo clippy -- -D warnings, and cargo fmt --check are all clean.

🤖 Generated with Claude Code

…backpressure wedge

Position streams are dominated by a few huge hot-token rows, so position-bearing prewarm chunks routinely span hundreds of MBs. Two or more such read_ranges in flight on the store's shared ScanScheduler can exhaust its backpressure window while every request still has undelivered pages (later pages of a request never pass the min_in_flight priority bypass), deadlocking the prewarm. A single request in flight always delivers in order and recycles the window, so run position prewarm chunks serially.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

codecov · 2026-07-04T13:12:25Z

Codecov Report

✅ All modified and coverable lines are covered by tests.


github-actions Bot added the A-index (Vector index, linalg, tokenizer) and bug (Something isn't working) labels and removed the A-index (Vector index, linalg, tokenizer) label on Jul 4, 2026

BubbleCal mentioned this pull request Jul 4, 2026

test(fts): benchmark new FTS algo #7605

Closed

BubbleCal wants to merge 1 commit into main from yang/fts-position-prewarm-serial
