Performance regression: Python BITMAP index training and FTS phrase query slower since 2026-07-01

The lance-bench run for Lance `d86f1fc3db39039e78eec5bd9a407c3dd773e8ef` flagged two medium-confidence Python benchmark regressions that do not appear to be covered by existing open/recently closed issues.

Benchmark run: https://github.com/lancedb/lance-bench/actions/runs/28590579950

Affected benchmarks:

- `python/ci_benchmarks/benchmarks/test_index_training.py::test_index_training[low-float-BITMAP-5M]`
  - p=0.035841, last-4 avg 491.238 ms vs older avg 443.130 ms, +10.86% slower
- `python/ci_benchmarks/benchmarks/test_index_training.py::test_index_training[low-float-BITMAP-10M]`
  - p=0.047203, last-4 avg 1005.005 ms vs older avg 903.808 ms, +11.20% slower
- `python/ci_benchmarks/benchmarks/test_fts_search.py::test_query[cache-phrase_artificial_intelligence_research-k100]`
  - p=0.037284, last-4 avg 0.955 ms vs older avg 0.762 ms, +25.29% slower

Recent timing history from the benchmark DB:

```text
python/ci_benchmarks/benchmarks/test_index_training.py::test_index_training[low-float-BITMAP-5M]
older avg: 443.130 ms, last4 avg: 491.238 ms, change: +10.86%
2026-07-01 06:01  9.0.0-beta.10+7a5282a  423.443 ms
2026-07-01 08:14  9.0.0-beta.10+7a7bacc  468.314 ms
2026-07-01 17:19  9.0.0-beta.10+53fe06e  519.613 ms
2026-07-01 22:45  9.0.0-beta.10+d7f5b55  463.130 ms
2026-07-02 06:07  9.0.0-beta.10+843c8c2  486.883 ms
2026-07-02 09:52  9.0.0-beta.10+d86f1fc  495.326 ms

python/ci_benchmarks/benchmarks/test_index_training.py::test_index_training[low-float-BITMAP-10M]
older avg: 903.808 ms, last4 avg: 1005.005 ms, change: +11.20%
2026-07-01 06:01  9.0.0-beta.10+7a5282a  872.035 ms
2026-07-01 08:14  9.0.0-beta.10+7a7bacc  951.441 ms
2026-07-01 17:19  9.0.0-beta.10+53fe06e  1112.290 ms
2026-07-01 22:45  9.0.0-beta.10+d7f5b55  923.766 ms
2026-07-02 06:07  9.0.0-beta.10+843c8c2  969.736 ms
2026-07-02 09:52  9.0.0-beta.10+d86f1fc  1014.227 ms

python/ci_benchmarks/benchmarks/test_fts_search.py::test_query[cache-phrase_artificial_intelligence_research-k100]
older avg: 0.762 ms, last4 avg: 0.955 ms, change: +25.29%
2026-07-01 06:01  9.0.0-beta.10+7a5282a  0.755 ms
2026-07-01 08:14  9.0.0-beta.10+7a7bacc  0.755 ms
2026-07-01 17:19  9.0.0-beta.10+53fe06e  1.088 ms
2026-07-01 22:45  9.0.0-beta.10+d7f5b55  0.766 ms
2026-07-02 06:07  9.0.0-beta.10+843c8c2  0.915 ms
2026-07-02 09:52  9.0.0-beta.10+d86f1fc  1.051 ms
```
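The last-4 averages and percentage changes can be re-derived from the rows listed above. The following is a minimal sketch, not the lance-bench implementation: the short benchmark labels and the `last4_change` helper are made up for illustration, and the "older" averages are taken as reported because they cover more history than the six rows shown. Because the inputs are rounded to three decimals, the recomputed percentage can differ slightly from the reported one (for the FTS query in particular).

```python
from statistics import mean

# Timings in ms copied from the history above, oldest first.
# The dictionary keys are shorthand labels, not real benchmark IDs.
samples = {
    "BITMAP-5M": [423.443, 468.314, 519.613, 463.130, 486.883, 495.326],
    "BITMAP-10M": [872.035, 951.441, 1112.290, 923.766, 969.736, 1014.227],
    "FTS phrase k100": [0.755, 0.755, 1.088, 0.766, 0.915, 1.051],
}

# "Older" averages as reported by the benchmark DB; they include runs
# that are not listed in the six-row excerpt, so they are not recomputed.
older_avg = {
    "BITMAP-5M": 443.130,
    "BITMAP-10M": 903.808,
    "FTS phrase k100": 0.762,
}


def last4_change(values, baseline):
    """Return (mean of the last four samples, percent change vs baseline)."""
    last4 = mean(values[-4:])
    return last4, (last4 - baseline) / baseline * 100.0


for name, values in samples.items():
    last4, pct = last4_change(values, older_avg[name])
    print(f"{name}: last4 avg {last4:.3f} ms, change {pct:+.2f}%")
```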

This does not look like a clean single-step regression: the `d7f5b55` sample drops back toward the older baseline in all three benchmarks (463.130 ms, 923.766 ms, and 0.766 ms). The elevated last-4 window starts after the `7a7bacc` result at 2026-07-01 08:14 UTC. Commits in the first elevated window include:

```text
2026-07-01T13:20:39Z  876ad93  fix: recover from stale cached manifest size on read (#7542)
2026-07-01T16:12:10Z  1fdc8b1  fix(compaction): exclude system indices from compaction binning (#7516)
2026-07-01T17:10:35Z  546e766  ci: detect Cargo.lock drift offline + harden java --locked (#7425)
2026-07-01T17:10:42Z  4ac49ea  ci: group Dependabot updates into one PR per lockfile (#7457)
2026-07-01T17:19:18Z  53fe06e  fix(rowids): tolerate sparse overlapping chunks in the stable row id index (#7480)
```

The later samples, including the two most recent elevated ones (`843c8c2` and `d86f1fc`), also include:

```text
2026-07-01T22:45:33Z  d7f5b55  perf: merge half-open range queries on the same BTree index (#7477)
2026-07-02T06:07:29Z  843c8c2  fix(lance-io): include goosefs feature in DEFAULT_CLOUD_BLOCK_SIZE cfg gate (#7570)
2026-07-02T09:52:55Z  d86f1fc  chore(ci): ignore RUSTSEC-2026-0194/0915 for transitive quick-xml (#7577)
```

Suggested follow-up: rerun these Python CI benchmarks around `7a7bacc`, `53fe06e`, `d7f5b55`, `843c8c2`, and `d86f1fc` to confirm whether this is real or benchmark noise.
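One way to organize that rerun is to generate the command list up front. The sketch below only prints commands and runs nothing. It assumes the benchmarks can be invoked as ordinary pytest node IDs from the repository root and that the Python package is rebuilt after each checkout; the rebuild and dataset setup steps are deliberately left out because they are not described here. The `rerun_plan` helper and the repeat count of 3 are hypothetical.

```python
import shlex

# Short SHAs and benchmark node IDs taken from this issue.
COMMITS = ["7a7bacc", "53fe06e", "d7f5b55", "843c8c2", "d86f1fc"]
NODE_IDS = [
    "python/ci_benchmarks/benchmarks/test_index_training.py"
    "::test_index_training[low-float-BITMAP-5M]",
    "python/ci_benchmarks/benchmarks/test_index_training.py"
    "::test_index_training[low-float-BITMAP-10M]",
    "python/ci_benchmarks/benchmarks/test_fts_search.py"
    "::test_query[cache-phrase_artificial_intelligence_research-k100]",
]


def rerun_plan(commits, node_ids, repeats=3):
    """Build shell command strings: one checkout per commit, then each
    benchmark repeated `repeats` times to expose run-to-run noise."""
    plan = []
    for sha in commits:
        plan.append(shlex.join(["git", "checkout", sha]))
        # Assumption: a rebuild of the Python package belongs here.
        for node_id in node_ids:
            for _ in range(repeats):
                # shlex.join quotes the brackets in the node ID for the shell.
                plan.append(shlex.join(["python", "-m", "pytest", node_id]))
    return plan


for command in rerun_plan(COMMITS, NODE_IDS):
    print(command)
```

Repeating each benchmark several times per commit gives a per-commit spread, which helps tell a real step change from noise such as the `d7f5b55` dip.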

Duplicate checks performed before filing:

- Exact searches for `low-float-BITMAP-5M`, `low-float-BITMAP-10M`, and `phrase_artificial_intelligence_research-k100`: no matching open issues.
- Closed exact FTS phrase matches are from 2026-06-09, outside the 14-day recent-closed suppression window.
- Search for `index training benchmark`: no matching open issue; old closed issue #5714 is from January 2026 and not this benchmark.
- Broad open search for `benchmark performance created:>=2026-06-25`: no matching recent general benchmark-performance issue.

Other slower flagged clusters in this run were not filed because existing issues already cover them or they were recently closed:

- IVF/PQ search (#7137 open)
- TPCH/random access (#7135 open)
- scalar BTree/bitmap search (#7136 open)
- `sum_4bit_dist_table` (#7276 open)
- `from_elem random_read` (#7348 closed 2026-06-24)
- encoding/primitive decode (#7458 closed 2026-07-01)

Performance regression: Python BITMAP index training and FTS phrase query slower since 2026-07-01 #7585