migrating multi-hop tests from diskann-providers to diskann #928
JordanMaples wants to merge 18 commits into main
force-pushed from 5a59f22 to 5655eea
Codecov Report
❌ Patch coverage is
Additional details and impacted files

| Coverage Diff | main | #928 | +/- |
|---|---|---|---|
| Coverage | 89.31% | 89.33% | +0.01% |
| Files | 447 | 449 | +2 |
| Lines | 83250 | 83375 | +125 |
| Hits | 74354 | 74479 | +125 |
| Misses | 8896 | 8896 | |
- Move groundtruth, is_match, assert_top_k_exactly_match, and assert_range_results_exactly_match from diskann-providers/test_utils to diskann/graph/test/search_utils for cross-crate reuse
- Migrate test_even_filtering_multihop to diskann as even_filtering_multihop, using test_provider::Provider::grid()
- Remove test_multihop_filtering and test_even_filtering_multihop from diskann-providers
- Update all consumers in diskann-providers to use shared search_utils

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add callback_enforces_filtering test to multihop.rs
- Expand CallbackMetrics to track total_visits, rejected_count, adjusted_count (matching original)
- Remove test_multihop_callback_enforces_filtering, CallbackFilter, and CallbackMetrics from diskann-providers

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Gate is_match, assert_top_k_exactly_match, assert_range_results_exactly_match with #[cfg(test)] in diskann/graph/test/search_utils
- Restore search_utils in diskann-providers/test_utils for cross-crate use, re-exporting groundtruth from diskann
- Update diskann-providers and diskann-disk imports accordingly
- Remove unused imports (Mutex, QueryVisitDecision, Knn) and dead code (test_multihop_search) from diskann-providers
- Fix needless_range_loop in multihop.rs
- Remove stale duplicate diskann dep in diskann-disk/Cargo.toml

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
force-pushed from 8105440 to b7d238b
- Add run_multihop_search() to eliminate repeated runtime/buffer/search boilerplate across 5 tests
- Add l2_groundtruth() to deduplicate brute-force groundtruth computation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
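For readers unfamiliar with brute-force ground truth, here is a minimal sketch of the idea behind a helper like l2_groundtruth(). The function names, signatures, and the use of squared L2 distance are assumptions for illustration; the PR's actual helper is not shown in this conversation and may differ.

```rust
/// Squared Euclidean (L2) distance between two equal-length vectors.
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Hypothetical helper (not the PR's `l2_groundtruth`): scores every vector
/// against the query and returns `(id, distance)` pairs sorted by ascending
/// distance, breaking ties by id so the ordering is deterministic.
fn brute_force_l2_groundtruth(data: &[Vec<f32>], query: &[f32]) -> Vec<(u32, f32)> {
    let mut scored: Vec<(u32, f32)> = data
        .iter()
        .enumerate()
        .map(|(id, v)| (id as u32, squared_l2(v, query)))
        .collect();
    scored.sort_by(|a, b| a.1.total_cmp(&b.1).then(a.0.cmp(&b.0)));
    scored
}
```

Sorting ties by id keeps the expected ordering stable across runs, which matters when the result is compared exactly against search output.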
Verifies that multihop search can discover matching nodes that are only reachable through non-matching nodes, exercising the core two-hop expansion behavior of the multihop algorithm.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
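The property this commit describes can be illustrated with a toy traversal. This is a simplified sketch, not DiskANN's multihop search: it ignores distances and beam width, and the function name, adjacency-list representation, and the choice to exclude the start node from the output are all assumptions made for the example.

```rust
use std::collections::{HashSet, VecDeque};

/// Toy traversal (hypothetical, not the diskann implementation).
/// A neighbor that passes `matches` is accepted and expanded as usual.
/// A neighbor that fails is not accepted, but its own neighbors are
/// inspected once, so matching nodes hidden behind it can still be found.
fn two_hop_reachable(
    adjacency: &[Vec<usize>],
    start: usize,
    matches: impl Fn(usize) -> bool,
) -> Vec<usize> {
    let mut visited = HashSet::from([start]);
    let mut frontier = VecDeque::from([start]);
    let mut accepted = Vec::new();

    while let Some(node) = frontier.pop_front() {
        for &n1 in &adjacency[node] {
            if !visited.insert(n1) {
                continue; // already handled
            }
            if matches(n1) {
                accepted.push(n1);
                frontier.push_back(n1);
                continue;
            }
            // The one-hop neighbor was rejected: look one hop further through it.
            for &n2 in &adjacency[n1] {
                if matches(n2) && visited.insert(n2) {
                    accepted.push(n2);
                    frontier.push_back(n2);
                }
            }
        }
    }
    accepted.sort_unstable();
    accepted
}
```

On a chain 0 -> 1 -> 2 -> 3 -> 4 with an even-only filter, nodes 2 and 4 are reachable only through the rejected nodes 1 and 3, which is the situation the test targets.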
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Migrates a first set of multi-hop search traversal unit tests from diskann_async into the diskann crate’s graph test suite, and introduces shared search ground-truth utilities to support those tests.
Changes:
- Added `multihop` test cases under `diskann/src/graph/test/cases/` and wired them into the test module.
- Introduced `diskann::graph::test::search_utils` with ground-truth + assertion helpers for search verification.
- Removed the migrated multi-hop test helpers/cases from `diskann-providers/src/index/diskann_async.rs` and adjusted `diskann-providers` test utils module visibility/docs.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| diskann/src/graph/test/search_utils.rs | Adds ground-truth computation and assertion helpers for graph search tests. |
| diskann/src/graph/test/mod.rs | Exposes the new search_utils module in the graph test module. |
| diskann/src/graph/test/cases/multihop.rs | Adds migrated multi-hop traversal/filtering/termination/callback tests. |
| diskann/src/graph/test/cases/mod.rs | Registers the new multihop test module. |
| diskann-providers/src/test_utils/search_utils.rs | Updates/clarifies docs around duplicated ground-truth helpers for provider-side tests. |
| diskann-providers/src/test_utils/mod.rs | Makes search_utils publicly accessible from diskann-providers::test_utils. |
| diskann-providers/src/index/diskann_async.rs | Removes migrated multi-hop-related tests and supporting helpers. |
hildebrandmw left a comment
Thanks Jordan! This is moving the tests in the right direction, but we need to be careful about just moving the test infrastructure from diskann_async.rs as-is.
The utilities in search_utils.rs are extremely awkward for actually running tests, are very sensitive, and don't provide much useful information when they fire.
They've been used in diskann_async.rs because it was kind of the best thing we had back then.
My hope is that the new diskann can take a higher-signal approach using baselines and VerboseEq.
Not only does this provide a really good way of viewing the expected results as a whole, it's also great for storing additional metrics.
For example, the stats, ids, and distances from multi-hop search can all be checked in as part of the baseline and get protected for free.
My ask is to not migrate search_utils.rs as is, especially if it means including test methods that aren't actually used by diskann.
Also, use the baseline capturing mechanism to capture everything about both test setups and results.
We cannot rely solely on the baseline to protect against regression (someone could check in a broken baseline in the future), but a baseline in combination with some invariant checks (returned items should be filtered/adjusted) will go a long way toward good tests.
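A rough sketch of what combining a baseline with an invariant check could look like. The record type, field names, and both helpers are hypothetical; they stand in for diskann's baseline-capturing mechanism and VerboseEq, whose actual APIs are not shown in this conversation.

```rust
/// Hypothetical record of one search run (ids, distances, and one stat).
#[derive(Debug, Clone, PartialEq)]
struct SearchRecord {
    ids: Vec<u32>,
    distances: Vec<f32>,
    hops: usize,
}

/// Field-by-field comparison against a stored baseline. Returns one message
/// per differing field so a failure reports what changed, not just that
/// something did.
fn diff_against_baseline(actual: &SearchRecord, baseline: &SearchRecord) -> Vec<String> {
    let mut diffs = Vec::new();
    if actual.ids != baseline.ids {
        diffs.push(format!("ids: baseline {:?}, actual {:?}", baseline.ids, actual.ids));
    }
    if actual.distances != baseline.distances {
        diffs.push(format!(
            "distances: baseline {:?}, actual {:?}",
            baseline.distances, actual.distances
        ));
    }
    if actual.hops != baseline.hops {
        diffs.push(format!("hops: baseline {}, actual {}", baseline.hops, actual.hops));
    }
    diffs
}

/// Invariant check that does not depend on the baseline: returns every
/// result id that the filter should have excluded.
fn ids_violating_filter(record: &SearchRecord, accept: impl Fn(u32) -> bool) -> Vec<u32> {
    record.ids.iter().copied().filter(|&id| !accept(id)).collect()
}
```

If a broken baseline containing an odd id were checked in for an even-only filter, the baseline comparison alone would still pass, while the invariant check would report the offending id.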
Converting back to draft, as it needs some more human refinement.
Test eval() and eval_mut() behavior: visited-set exclusion, label matching, and insert-on-match semantics.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
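One plausible reading of the three behaviors named in this commit is sketched below. The struct, its fields, and the exact semantics of eval and eval_mut are assumptions inferred from the commit message only; the real type in diskann is not shown here and may behave differently.

```rust
use std::collections::HashSet;

/// Hypothetical predicate: an id passes only if it has not been visited
/// and its label matches.
struct NotVisitedAndMatching<F> {
    visited: HashSet<u32>,
    label_matches: F,
}

impl<F: Fn(u32) -> bool> NotVisitedAndMatching<F> {
    fn new(label_matches: F) -> Self {
        Self { visited: HashSet::new(), label_matches }
    }

    /// Read-only check: visited ids are excluded, then the label is tested.
    fn eval(&self, id: u32) -> bool {
        !self.visited.contains(&id) && (self.label_matches)(id)
    }

    /// Same check, but a successful match is recorded as visited
    /// (insert-on-match). Non-matching ids are not recorded.
    fn eval_mut(&mut self, id: u32) -> bool {
        let hit = self.eval(id);
        if hit {
            self.visited.insert(id);
        }
        hit
    }
}
```

Under these assumed semantics, calling eval twice on the same matching id returns true both times, while a second eval_mut on that id returns false.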
Replace migrated integration tests with focused unit tests that call multihop_search_internal directly on small hand-constructed graphs:
- accept_all_finds_all_nodes: one-hop expansion with AcceptAll filter
- reject_triggers_two_hop_expansion: EvenFilter rejection triggers two-hop
- reject_all_yields_only_start: RejectAll leaves only start in best set
- terminate_stops_search_on_target: TerminateOnTarget stops search early
- block_and_adjust_modifies_results: blocked node excluded, distance adjusted

Add integration tests with VerboseEq baselines:
- two_hop_reaches_through_non_matching: end-to-end with invariants
- even_filtering_grid: 3D grid with even-only filter
- callback_filtering_grid: block+adjust with full metrics baseline

Remove search_utils.rs (only used by old multihop tests). Make multihop_search module pub(crate) for direct internal testing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
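The accept, reject, terminate, and block-and-adjust cases listed above can be pictured with a small stand-alone model. Everything here is hypothetical: the enum, the metrics struct (its field names borrow from the CallbackMetrics counters mentioned earlier in this PR), and the driver function are illustrative and are not diskann's QueryVisitDecision or multihop_search_internal.

```rust
/// Hypothetical per-candidate decision returned by a visit callback.
#[derive(Debug, Clone, Copy, PartialEq)]
enum VisitDecision {
    /// Keep the candidate with the given (possibly adjusted) distance.
    Accept(f32),
    /// Drop the candidate from the results.
    Reject,
    /// Stop processing further candidates.
    Terminate,
}

/// Counters of the kind a callback-based test might baseline.
#[derive(Debug, Default, PartialEq)]
struct VisitMetrics {
    total_visits: usize,
    rejected_count: usize,
    adjusted_count: usize,
}

/// Applies the callback to candidates in order and collects what survives.
fn apply_callback(
    candidates: &[(u32, f32)],
    mut on_visit: impl FnMut(u32, f32) -> VisitDecision,
) -> (Vec<(u32, f32)>, VisitMetrics) {
    let mut kept = Vec::new();
    let mut metrics = VisitMetrics::default();
    for &(id, dist) in candidates {
        metrics.total_visits += 1;
        match on_visit(id, dist) {
            VisitDecision::Accept(d) => {
                if d != dist {
                    metrics.adjusted_count += 1;
                }
                kept.push((id, d));
            }
            VisitDecision::Reject => metrics.rejected_count += 1,
            VisitDecision::Terminate => break,
        }
    }
    (kept, metrics)
}
```

A test in this style can assert both the invariant (a blocked id never appears in the kept list) and the exact counters, which is the kind of data a metrics baseline would capture.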
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@copilot would you mind taking another pass over this PR, as it has changed significantly since your previous review?
Addressed in 9c85d26:
I re-reviewed the updated PR state and the major requested shifts are present (baseline-driven multihop tests + removal of migrated
The doc comment changes weren't material to this PR. Also reverts pub mod search_utils back to mod since consumers access the functions through re-exports, not the module path. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This is a first step in the effort to migrate unit tests from diskann_async to diskann #927. If you don't think tests that have been brought over should be moved, they probably shouldn't have been. Please point them out and I'll do my best to put them back where I found them.
I'll update this field as I continue to work on it: