perf(index): bulk conjunction path for FTS AND and phrase queries #7624
Open
BubbleCal wants to merge 1 commit into
AND and phrase queries previously leapfrogged doc-at-a-time through boxed PostingIterator::next calls (~61% of the AND profile) and phrase checks decoded a whole 256-doc position block per candidate (~39% of the phrase profile).

- and_bulk_search: block-max window pruning plus a k-pointer merge over decompressed block slices; per-candidate advance cost drops to a few loads. Results are identical to the classic loop (LANCE_FTS_BULK_AND=0 opts out). Phrase queries ride the same path.
- seek_packed_doc_positions: PackedDelta full groups are self-describing ([num_bits u8][16*num_bits bytes]), so group offsets are recovered by hopping headers; decode only the 1-2 groups overlapping the candidate doc's delta range, with a lazily-built group index, memoized unpacked group, and a decoded-tail cache per block.
- check_exact_positions_bulk: allocation-free slop=0 alignment check on the decoded scratch slices for parked lead clauses.

Warm mmlb benchmarks, 8 concurrent queries: AND@200M k10 0.114->0.060s, k100 0.240->0.118s; phrase@50m 3-word k10 0.335->0.210s, 2-word k10 0.098->0.042s. All steps verified score-identical to the classic path.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
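To make the two core ideas in the commit message concrete, here is a minimal sketch, not the PR's code. `intersect_sorted` shows a k-pointer merge over already-decompressed, sorted doc-id slices, and `has_exact_phrase` shows a slop=0 alignment check that allocates nothing. Both function names are hypothetical. The sketch leaves out block-max pruning and scoring, and it uses a binary search in the phrase check for brevity, which may differ from what `check_exact_positions_bulk` does.

```rust
/// Intersect k sorted, duplicate-free doc-id slices with one cursor per slice.
/// Hypothetical helper for illustration only.
fn intersect_sorted(lists: &[&[u32]]) -> Vec<u32> {
    let mut out = Vec::new();
    if lists.is_empty() || lists.iter().any(|l| l.is_empty()) {
        return out;
    }
    let mut cursors = vec![0usize; lists.len()];
    // Current candidate doc id; it only ever moves forward.
    let mut target = lists[0][0];
    'rounds: loop {
        for (i, list) in lists.iter().enumerate() {
            // Advance this cursor until it reaches or passes the candidate.
            while cursors[i] < list.len() && list[cursors[i]] < target {
                cursors[i] += 1;
            }
            if cursors[i] == list.len() {
                break 'rounds; // one list is exhausted: no further matches
            }
            if list[cursors[i]] > target {
                // Overshot: adopt the larger doc id and restart the round.
                target = list[cursors[i]];
                continue 'rounds;
            }
        }
        // Every cursor sits on `target`, so it is in all lists.
        out.push(target);
        cursors[0] += 1;
        if cursors[0] == lists[0].len() {
            break;
        }
        target = lists[0][cursors[0]];
    }
    out
}

/// slop=0 phrase check for one document: `positions[i]` holds the sorted
/// positions of the i-th phrase term. True if some p has term i at p + i.
/// Hypothetical helper; performs no heap allocation.
fn has_exact_phrase(positions: &[&[u32]]) -> bool {
    let Some((lead, rest)) = positions.split_first() else {
        return false;
    };
    lead.iter().any(|&p| {
        rest.iter().enumerate().all(|(i, term)| {
            p.checked_add(i as u32 + 1)
                .map_or(false, |want| term.binary_search(&want).is_ok())
        })
    })
}
```

For example, intersecting `[1, 3, 5, 7, 9]`, `[3, 4, 5, 9, 10]`, and `[0, 3, 9]` yields `[3, 9]`. Term positions `[2, 10]`, `[3, 20]`, `[4, 21]` contain the exact phrase at positions 2, 3, 4.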
4732787 to fec0a88
fc1af4b to 22e3522
This was referenced Jul 4, 2026
Stacked on #7604 (base of this PR); part of the Lucene-parity series after #7600-#7605.
What
Top-k AND and phrase queries previously leapfrogged doc-at-a-time through boxed `PostingIterator::next` calls: 61% of the AND profile went to per-doc advance machinery (~25-40ns per advance). Phrase checks additionally decoded a whole 256-doc position block per candidate (39% of the phrase profile) and allocated cursor vectors per candidate.

This adds a bulk conjunction path (`and_bulk_search`, default on; `LANCE_FTS_BULK_AND=0` opts back into the classic loop):

- Block-max window pruning plus a k-pointer merge over decompressed block `u32` slices.
- `seek_packed_doc_positions`: PackedDelta position groups are self-describing (`[num_bits u8][16*num_bits bytes]`), so group offsets are recovered by hopping headers. A phrase candidate decodes only the 1-2 groups overlapping its own doc instead of the whole block, with a lazily-built group index and a decoded-tail cache. No format change.
- `check_exact_positions_bulk`: allocation-free slop=0 alignment check over the decoded scratch slices.
- Scoring uses `DocSet::scoring_num_tokens`, so V3 partitions score with the quantized lengths that #7466 (feat(fts)!: add configurable posting block size) defines. (The impact-score-cache dead-code cleanup originally carried here moved to #7603 (perf(fts): bulk MAXSCORE search path for top-k disjunctions), where the code actually becomes dead.)

Results (mmlb-200m warm, 8 concurrent, vs Lucene 10.4)
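As an illustration of the header-hopping idea behind `seek_packed_doc_positions`, the sketch below recovers the byte offset of each full group from a buffer laid out as `[num_bits u8][16*num_bits bytes]` per group, which is the layout the description states. The function name `group_offsets` and its signature are hypothetical. It assumes the buffer holds only full groups back to back and does not decode any payload.

```rust
/// Walk `num_groups` self-describing groups and return the byte offset of
/// each group's header. Each group is assumed to be one `num_bits` byte
/// followed by `16 * num_bits` payload bytes. Returns `None` if the buffer
/// is too short for the requested number of groups.
/// Hypothetical helper for illustration only.
fn group_offsets(buf: &[u8], num_groups: usize) -> Option<Vec<usize>> {
    let mut offsets = Vec::with_capacity(num_groups);
    let mut pos = 0usize;
    for _ in 0..num_groups {
        // The header byte says how wide each packed value is.
        let num_bits = *buf.get(pos)? as usize;
        // Skip the header plus its payload to land on the next header.
        let next = pos + 1 + 16 * num_bits;
        if next > buf.len() {
            return None; // payload runs past the end of the buffer
        }
        offsets.push(pos);
        pos = next;
    }
    Some(offsets)
}
```

With the offsets in hand, a reader can jump straight to the group or groups covering a candidate's position range and unpack only those, which is the saving the description attributes to this step. Building the offsets lazily and caching them per block, as the PR describes, would avoid repeating the walk for later candidates in the same block.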
Verification
`cargo test -p lance-index`, `clippy -D warnings`, `fmt --check` clean.

🤖 Generated with Claude Code