Remove AsThreadPool generic and take &RayonThreadPool directly#967
Remove AsThreadPool generic and take &RayonThreadPool directly#967JordanMaples wants to merge 3 commits intomainfrom
Conversation
…h_rayon De-genericize all pool-accepting functions to take &RayonThreadPool directly. Remove PQGenerationContext/PQGeneration Pool type parameter (now borrows). Test callers that passed 1usize now create explicit 1-thread pools. Production callers that passed num_threads now create pools at call site. Addresses #905 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR simplifies the Rayon thread-pool plumbing by removing the AsThreadPool generic + forward_threadpool! macro approach and standardizing pool-accepting APIs to take a borrowed &RayonThreadPool instead. This reduces monomorphization higher in the call stack and makes pool usage more explicit across providers/tools/disk build paths.
Changes:
- Removed
AsThreadPool,forward_threadpool!, andexecute_with_rayon; updated exports accordingly. - Updated PQ/kmeans/partition/quantization APIs to accept
&RayonThreadPooland adjusted call sites (including tests) to construct explicit pools. - Updated disk quantization generator and PQ generation context types to borrow the pool instead of carrying a generic pool type parameter.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| diskann-tools/src/utils/build_pq.rs | Creates a thread pool once and passes &RayonThreadPool into PQ generation calls. |
| diskann-providers/src/utils/rayon_util.rs | Removes execute_with_rayon/AsThreadPool infra; retains concrete RayonThreadPool and in-pool iterator helpers. |
| diskann-providers/src/utils/mod.rs | Stops re-exporting removed thread-pool abstractions/functions. |
| diskann-providers/src/storage/index_storage.rs | Updates tests to pass &RayonThreadPool instead of a thread-count integer. |
| diskann-providers/src/model/pq/pq_construction.rs | Degenericizes PQ construction APIs to take &RayonThreadPool. |
| diskann-providers/src/index/wrapped_async.rs | Updates tests to pass &RayonThreadPool. |
| diskann-providers/src/index/diskann_async.rs | Degenericizes train_pq and updates test call sites to pass &RayonThreadPool. |
| diskann-disk/src/utils/partition.rs | Degenericizes partitioning helper to take &RayonThreadPool. |
| diskann-disk/src/utils/math_util.rs | Degenericizes math utilities to take &RayonThreadPool. |
| diskann-disk/src/utils/kmeans.rs | Degenericizes kmeans utilities to take &RayonThreadPool. |
| diskann-disk/src/storage/quant/pq/pq_generation.rs | Removes pool type parameter from PQ generation context; now borrows &RayonThreadPool. |
| diskann-disk/src/storage/quant/generator.rs | Degenericizes quant data generator to take &RayonThreadPool; updates test call site accordingly. |
| diskann-disk/src/build/builder/quantizer.rs | Trains PQ quantizer by constructing a thread pool at call site and passing &RayonThreadPool. |
| diskann-disk/src/build/builder/core.rs | Updates merged index workflow partitioning call to the new signature. |
| diskann-disk/src/build/builder/build.rs | Updates PQ generation usage and generator invocation to the new borrowed-pool signatures. |
Comments suppressed due to low confidence (1)
diskann-providers/src/utils/rayon_util.rs:15
create_thread_poolis documented to treatnum_threads == 0as “use logical CPUs”, but the implementation always callsThreadPoolBuilder::num_threads(num_threads). If callers pass 0 (e.g., as an “auto” sentinel), this will rely on Rayon accepting 0; if Rayon treats 0 as invalid, this will return an error at runtime. Consider explicitly handlingnum_threads == 0by omitting.num_threads(...)(or mapping 0 toavailable_parallelism()) so behavior matches the docs and is stable across Rayon versions.
/// Creates a new thread pool with the specified number of threads.
/// If `num_threads` is 0, it defaults to the number of logical CPUs.
pub fn create_thread_pool(num_threads: usize) -> ANNResult<RayonThreadPool> {
let pool = rayon::ThreadPoolBuilder::new()
.num_threads(num_threads)
.build()
.map_err(|err| ANNError::log_thread_pool_error(err.to_string()))?;
Ok(RayonThreadPool(pool))
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #967 +/- ##
==========================================
- Coverage 89.31% 89.30% -0.01%
==========================================
Files 448 448
Lines 83329 83244 -85
==========================================
- Hits 74422 74344 -78
+ Misses 8907 8900 -7
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
Remove AsThreadPool trait, forward_threadpool! macro, and execute_with_rayon
De-genericize all pool-accepting functions to take &RayonThreadPool directly. Remove PQGenerationContext/PQGeneration Pool type parameter (now borrows). Test callers that passed 1usize now create explicit 1-thread pools. Production callers that passed num_threads now create pools at call site.
Addresses #905
Question that needs follow-up. This will always result in DiskANN creating its own threadpool and does not address the concern that @hildebrandmw had in #906 regarding a bring your own threadpool scenario. If that's something we want to design around, this will require a bit of extra tinkering.