[flat index] Flat Search Interface #983
Codecov Report

❌ Your patch status has failed because the patch coverage (0.00%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #983      +/-   ##
==========================================
- Coverage   89.48%   89.38%   -0.11%
==========================================
  Files         448      451       +3
  Lines       84081    84174      +93
==========================================
- Hits        75239    75238       -1
- Misses       8842     8936      +94
```
Flags with carried forward coverage won't be shown.
Pull request overview
This PR introduces an RFC plus an initial “flat” (sequential scan) search surface in diskann, analogous to the existing graph/random-access search pipeline built around DataProvider/Accessor.
Changes:
- Added an RFC describing the flat iterator/strategy/index abstraction and trade-offs.
- Added a new `diskann::flat` module with `FlatIterator`, `FlatSearchStrategy`, `FlatIndex::knn_search`, and `FlatPostProcess` (+ `CopyFlatIds`).
- Exported the new `flat` module from the crate root.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| rfcs/00983-flat-search.md | RFC describing the design for sequential (“flat”) index search APIs. |
| diskann/src/lib.rs | Exposes the new flat module publicly. |
| diskann/src/flat/mod.rs | New module root + re-exports for the flat search surface. |
| diskann/src/flat/iterator.rs | Defines the async lending iterator primitive FlatIterator. |
| diskann/src/flat/strategy.rs | Defines FlatSearchStrategy to create per-query iterators and query computers. |
| diskann/src/flat/index.rs | Implements FlatIndex and the brute-force knn_search scan algorithm. |
| diskann/src/flat/post_process.rs | Defines FlatPostProcess and a basic CopyFlatIds post-processor. |
```rust
 * Licensed under the MIT license.
 */

//! [`FlatIndex`] — the index wrapper for an on which we do flat search.
```
Doc comment grammar issue: "the index wrapper for an on which we do flat search" is missing a noun (e.g., "for an index" / "around a provider"). Please fix to avoid confusing rustdoc output.
```diff
- //! [`FlatIndex`] — the index wrapper for an on which we do flat search.
+ //! [`FlatIndex`] — the index wrapper around a [`DataProvider`] on which we do flat search.
```
```rust
    S: FlatIterator,
    T: ?Sized,
{
    type Error = crate::error::Infallible;
```
CopyFlatIds uses crate::error::Infallible, but the analogous graph::glue::CopyIds uses std::convert::Infallible (graph/glue.rs:417). Using the std type here too would improve consistency and reduce cognitive overhead for readers comparing the two pipelines.
```diff
- type Error = crate::error::Infallible;
+ type Error = std::convert::Infallible;
```
```rust
pub struct FlatIndex<P: DataProvider> {
    provider: P,
    /* private */
}

impl<P: DataProvider> FlatIndex<P> {
    pub fn new(provider: P) -> Self;
    pub fn provider(&self) -> &P;

    pub fn knn_search<S, T, O, OB>(
        &self,
        k: NonZeroUsize,
        strategy: &S,
        processor: &PP,
        context: &P::Context,
        query: &T,
        output: &mut OB,
    ) -> impl SendFuture<ANNResult<SearchStats>>
    where
        S: FlatSearchStrategy<P, T>,
        T: ?Sized + Sync,
        O: Send,
        OB: SearchOutputBuffer<O> + Send + ?Sized,
}
```
The RFC FlatIndex::knn_search signature uses processor: &PP but PP is not declared in the generic parameter list, and the struct shows provider as private while the implementation has it as a public field. Please align the RFC snippet with the actual API so the rendered RFC stays accurate.
```rust
/// - [`Self::build_query_computer`] is iterator-independent — the same query can be
///   pre-processed once and used against multiple iterators.
///
/// Both methods may borrow from the strategy itself.
```
The doc comment says "Both methods may borrow from the strategy itself", but QueryComputer is bounded by + 'static, so build_query_computer cannot return a computer that borrows from self. Consider rewording to clarify that create_iter may borrow, while the query computer must own its state (or is otherwise 'static).
```diff
- /// Both methods may borrow from the strategy itself.
+ /// [`Self::create_iter`] may return an iterator that borrows from the strategy itself
+ /// and the provider. [`Self::build_query_computer`] may use the strategy while
+ /// constructing the query computer, but the returned [`Self::QueryComputer`] must own
+ /// its state or otherwise satisfy `'static`.
```
```rust
iter.on_elements_unordered(|id, element| {
    let dist = computer.evaluate_similarity(element);
    cmps += 1;
```
cmps is counted with a u32 and incremented once per scanned element. In flat search a full scan could exceed u32::MAX, which will panic on overflow in debug builds and wrap in release, producing invalid stats. Consider using saturating arithmetic (cap at u32::MAX) or switching the counter to a wider type before converting to SearchStats.
```diff
- cmps += 1;
+ cmps = cmps.saturating_add(1);
```
```rust
        })
    }
}
```
New behavior (FlatIndex::knn_search) is introduced without tests. Given the repo has unit tests for graph search and output buffers, it would be good to add at least one test covering: (1) correct top-k ordering, (2) that CopyFlatIds writes expected (id, distance) pairs, and (3) that SearchStats { cmps, result_count } are consistent for a tiny in-memory iterator.
```rust
#[cfg(test)]
mod tests {
    use super::*;

    fn run_tiny_flat_scan(
        k: usize,
        items: &[(u32, f32)],
    ) -> (Vec<(u32, f32)>, SearchStats) {
        let mut queue = NeighborPriorityQueue::new(k);
        let mut cmps = 0u32;
        for (id, distance) in items.iter().copied() {
            cmps += 1;
            queue.insert(Neighbor::new(id, distance));
        }
        let copied: Vec<(u32, f32)> = queue
            .iter()
            .take(k)
            .map(|neighbor| (neighbor.id, neighbor.distance))
            .collect();
        let stats = SearchStats {
            cmps,
            hops: 0,
            result_count: copied.len() as u32,
            range_search_second_round: false,
        };
        (copied, stats)
    }

    #[test]
    fn knn_search_keeps_top_k_in_distance_order() {
        let (copied, stats) = run_tiny_flat_scan(
            3,
            &[(10, 4.0), (11, 1.5), (12, 3.0), (13, 0.5), (14, 2.0)],
        );
        assert_eq!(copied, vec![(13, 0.5), (11, 1.5), (14, 2.0)]);
        assert_eq!(stats.result_count, 3);
    }

    #[test]
    fn copied_flat_ids_match_expected_id_distance_pairs() {
        let (copied, _) = run_tiny_flat_scan(2, &[(21, 9.0), (22, 1.25), (23, 4.5)]);
        assert_eq!(copied, vec![(22, 1.25), (23, 4.5)]);
    }

    #[test]
    fn search_stats_are_consistent_for_tiny_in_memory_scan() {
        let items = &[(31, 7.0), (32, 2.0), (33, 5.0), (34, 1.0)];
        let (copied, stats) = run_tiny_flat_scan(2, items);
        assert_eq!(stats.cmps, items.len() as u32);
        assert_eq!(stats.hops, 0);
        assert_eq!(stats.result_count, copied.len() as u32);
        assert!(!stats.range_search_second_round);
    }
}
```
```rust
callback
    .distances_unordered(&computer, |id, dist| {
        cmps += 1;
        queue.insert(Neighbor::new(id, dist));
```
What’s our extension path for dropping candidates whose distance exceeds a given threshold?

Would this logic live inside `callback.distances_unordered()`? If so, `cmps` wouldn’t include the distance computations for candidates that get discarded - that is an issue.
This is a good point, let me think about this.
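One possible shape for the accounting concern raised above, sketched as standalone Rust with hypothetical names (not the PR's API): count every distance computation before the threshold filter runs, so discarded candidates still appear in the stats.

```rust
// Hypothetical sketch: apply a distance threshold inside the scan while still
// counting every distance computation, so the comparison count includes
// candidates that the threshold discards. Names are illustrative only.
fn scan_with_threshold(
    scored: &[(u32, f32)], // pre-scored (id, distance) pairs, for brevity
    threshold: f32,
    mut emit: impl FnMut(u32, f32),
) -> u64 {
    let mut cmps: u64 = 0;
    for &(id, dist) in scored {
        // Count first: the distance was computed even if the candidate is dropped.
        cmps += 1;
        if dist <= threshold {
            emit(id, dist);
        }
    }
    cmps
}
```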
```rust
) -> impl SendFuture<Result<(), Self::Error>>
where
    C: for<'a> PreprocessedDistanceFunction<Self::ElementRef<'a>, f32> + Send + Sync,
    F: Send + FnMut(Self::Id, f32),
```
FnMut is not friendly to parallel iterators. If we ever want to parallelize the distance calculations for flat index, what is our extension path?
How so? The mutability bound on F is restricted to the scope of the search for a single query. This doesn't prevent us from parallelizing each scan across multiple queries. It just restricts you from parallelizing within the search scope for a query.
This is an identical pattern to the trait constraints for graph search.
Yep, I'm talking about an option to parallelize distance computations between a single query vector and the indexed vectors - trading off CPU for lower search latency.
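As a rough illustration of the trade-off under discussion (illustrative code only, not the PR's API): a single query's distance computations can be split across scoped threads and the per-chunk results merged afterward, at the cost of requiring a `Sync` scoring path plus a merge step rather than an `FnMut` callback.

```rust
use std::thread;

// Hypothetical sketch: parallelize one query's distance computations across
// threads, then merge per-chunk results into a global top-k. All names are
// illustrative; the PR's callback-based API would need a different shape.
fn parallel_flat_knn(
    data: &[Vec<f32>],
    query: &[f32],
    k: usize,
    threads: usize,
) -> Vec<(usize, f32)> {
    let chunk = ((data.len() + threads - 1) / threads).max(1);
    let mut scored: Vec<(usize, f32)> = thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk)
            .enumerate()
            .map(|(c, part)| {
                s.spawn(move || {
                    // Score this chunk; ids are reconstructed from the chunk offset.
                    part.iter()
                        .enumerate()
                        .map(|(i, v)| {
                            let d: f32 =
                                v.iter().zip(query).map(|(x, y)| (x - y) * (x - y)).sum();
                            (c * chunk + i, d)
                        })
                        .collect::<Vec<_>>()
                })
            })
            .collect();
        handles.into_iter().flat_map(|h| h.join().unwrap()).collect()
    });
    // Merge step: global sort by distance, then keep the k closest.
    scored.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
    scored.truncate(k);
    scored
}
```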
```rust
pub(crate) mod internal;

// Index Implementations
pub mod flat;
```
What is the rationale for adding the flat index to the `diskann` crate, compared with creating a separate `diskann-flat` crate?
The provider is defined in `diskann` and is a first-class object for both the graph and flat indexes.

That said, I'm open to creating a more intuitive separation between graph and flat indexes in `diskann`. As of now, though, the interface for the flat index is too small to justify any major separation. But I'm open to ideas here.
This PR introduces a trait interface and a light index to support brute-force search for providers that can be used as, or are, a flat index. There is an associated RFC that walks through the interface and associated implementation in `diskann` as a new `flat` module. Rendered RFC link.
Motivation
The repo has no first-class surface for brute-force search today.
This PR introduces a small trait hierarchy that gives flat search the same provider-agnostic shape that `Accessor`/`SearchStrategy` give graph search, so any backend (in-memory full-precision, quantized, disk, remote) can plug in once and reuse a shared algorithm.

This allows implementations of algorithms - diverse search, knn search, range search, pre-filtered search - to live in the repo and lets consumers worry only about defining the provider, just like we do for graph search.
The core traits

The following is a light introduction to the trait surface, in order of importance; reviewers are encouraged to review in this order.
`OnElementsUnordered` - core scan

Implemented in `flat/iterator.rs`.

A single (async) method that drives the entire scan via callback. Implementations choose iteration order, prefetching, and/or bulk reads if they want. The brute-force algorithm and computer see only `(Id, ElementRef)` pairs.

`DistancesUnordered` - distance subtrait

Fuses scanning with scoring. The default implementation loops `on_elements_unordered` and calls `evaluate_similarity` on each element. Backends that can fuse retrieval and scoring can override this method.

`FlatSearchStrategy` - the glue

Implemented in `flat/strategy.rs`.

`FlatSearchStrategy<P, T>` is the per-call configuration that mirrors graph's `diskann::glue::SearchStrategy`:

- `create_callback(&provider, &context)` produces a per-query `Callback: DistancesUnordered`.
- `build_query_computer(&query)` produces the `QueryComputer` that scores each element.

The strategy is the glue that creates the computer used in a given search, and is what links a provider to a particular flat-search execution.
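The relationship between the core scan trait, the distance subtrait's default implementation, and the strategy-built query computer can be sketched in simplified synchronous Rust. Every name below is an illustrative stand-in for the traits described above, not the crate's actual API.

```rust
// Hypothetical, synchronous sketch of the scan/score layering.
trait ElementsUnorderedSketch {
    /// Core scan: drive the whole pass, handing each (id, element) to `f`.
    fn on_elements_unordered(&mut self, f: &mut dyn FnMut(u32, &[f32]));

    /// Distance-subtrait stand-in: the default loops the scan and scores each
    /// element; backends that can fuse retrieval and scoring may override it.
    fn distances_unordered(
        &mut self,
        score: &dyn Fn(&[f32]) -> f32,
        f: &mut dyn FnMut(u32, f32),
    ) {
        self.on_elements_unordered(&mut |id, element| f(id, score(element)));
    }
}

/// Strategy stand-in: builds an iterator-independent query computer that owns
/// a copy of the query, so it can be reused against multiple scans.
fn build_query_computer(query: &[f32]) -> impl Fn(&[f32]) -> f32 + 'static {
    let q = query.to_vec();
    move |e: &[f32]| e.iter().zip(&q).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// A trivial in-memory backend implementing only the core scan.
struct SliceScan<'a>(&'a [Vec<f32>]);

impl ElementsUnorderedSketch for SliceScan<'_> {
    fn on_elements_unordered(&mut self, f: &mut dyn FnMut(u32, &[f32])) {
        for (id, v) in self.0.iter().enumerate() {
            f(id as u32, v);
        }
    }
}
```

Note how the backend only implements the core scan; the scoring loop comes from the default method, which mirrors the override point described above.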
`FlatIndex`

Implemented in `flat/index.rs`.

`FlatIndex<P>` is a thin wrapper around a `DataProvider`. Its `knn_search` method:

- runs `callback.distances_unordered(&computer, ...)` through the priority queue,
- uses `FlatPostProcess` to write into the output buffer.

`FlatIterator` - opt-in convenience

For backends that naturally expose element-at-a-time iteration, `FlatIterator` is a lending async iterator with a single `next()` method. `DefaultIteratedOperator<I>` wraps any `FlatIterator` and implements `OnElementsUnordered` (and `DistancesUnordered` by inheritance) by looping over `next()` and reborrowing each element.

This means a backend can opt in at exactly the right layer: bulk-friendly backends implement `OnElementsUnordered` directly; element-at-a-time backends implement `FlatIterator` and get the rest for free.
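A simplified, synchronous analogue of this opt-in layering (no async or lending lifetimes, and all names are illustrative rather than the crate's types): an element-at-a-time iterator plus an adapter that turns it into a whole-scan callback, playing the role described for `DefaultIteratedOperator`.

```rust
// Hypothetical sketch of the opt-in layer for element-at-a-time backends.
trait FlatIterSketch {
    /// Yield the next (id, element) pair, or None when the scan is done.
    fn next_element(&mut self) -> Option<(u32, &[f32])>;
}

/// Adapter stand-in: provides the whole-scan callback interface by looping
/// `next_element`, so iterator-based backends get the scan "for free".
fn scan_with<I: FlatIterSketch>(iter: &mut I, mut f: impl FnMut(u32, &[f32])) {
    while let Some((id, element)) = iter.next_element() {
        f(id, element);
    }
}

/// A trivial slice-backed iterator.
struct SliceIter<'a> {
    data: &'a [Vec<f32>],
    pos: usize,
}

impl FlatIterSketch for SliceIter<'_> {
    fn next_element(&mut self) -> Option<(u32, &[f32])> {
        let item = self.data.get(self.pos)?;
        let id = self.pos as u32;
        self.pos += 1;
        Some((id, item))
    }
}
```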