[diskann-disk] Decouple buffer memory alignment from disk block_size#984
[diskann-disk] Decouple buffer memory alignment from disk block_size#984doliawu merged 2 commits intomicrosoft:mainfrom
Conversation
|
For awareness, this PR overlaps with @hildebrandmw's review direction on #965: #965 (review) His two concrete asks there both touch this PR:
|
c2da97c to
ab9e3fb
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #984 +/- ##
==========================================
+ Coverage 89.48% 90.63% +1.14%
==========================================
Files 448 448
Lines 84081 84095 +14
==========================================
+ Hits 75239 76216 +977
+ Misses 8842 7879 -963
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
25bd2e2 to
b730cda
Compare
b730cda to
3ee1e17
Compare
There was a problem hiding this comment.
Pull request overview
This PR decouples in-memory buffer alignment requirements from the on-disk block_size by moving the memory-alignment contract onto the AlignedFileReader itself, and updates the aligned-read request type and documentation accordingly.
Changes:
- Introduces
AlignedFileReader::MEMORY_ALIGNMENT(default1; direct-I/O readers override to512) plus a sharedvalidate_alignment()helper. - Replaces the fallible
AlignedReadtype with an infallibleReadRequestdata carrier and updates all call sites, tests, and benchmarks. - Updates
DiskSectorGraphbuffer allocation to align to the reader’sMEMORY_ALIGNMENTrather thanblock_size, and refreshes VertexProvider docs.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| diskann-disk/src/utils/aligned_file_reader/windows_aligned_file_reader.rs | Switches to ReadRequest, defines MEMORY_ALIGNMENT = 512, and validates alignment before issuing unbuffered reads. |
| diskann-disk/src/utils/aligned_file_reader/linux_aligned_file_reader.rs | Switches to ReadRequest, defines MEMORY_ALIGNMENT = 512, and validates alignment before O_DIRECT io_uring reads. |
| diskann-disk/src/utils/aligned_file_reader/storage_provider_aligned_file_reader.rs | Switches to ReadRequest for the non-direct storage provider reader. |
| diskann-disk/src/utils/aligned_file_reader/traits/aligned_file_reader.rs | Adds MEMORY_ALIGNMENT contract + shared validate_alignment() and unit tests. |
| diskann-disk/src/utils/aligned_file_reader/read_request.rs | Adds new infallible request carrier type used by readers/callers. |
| diskann-disk/src/utils/aligned_file_reader/mod.rs | Stops exporting AlignedRead and exports ReadRequest instead. |
| diskann-disk/src/utils/aligned_file_reader/aligned_read.rs | Removes the old AlignedRead type and its constructor-time alignment checks. |
| diskann-disk/src/search/provider/disk_sector_graph.rs | Allocates sector buffers aligned to AlignedReaderType::MEMORY_ALIGNMENT instead of block_size; uses ReadRequest. |
| diskann-disk/src/search/provider/disk_vertex_provider_factory.rs | Updates header read path to use ReadRequest. |
| diskann-disk/src/search/traits/vertex_provider_factory.rs | Refreshes trait documentation to remove stale references and clarify parameters/functions. |
| diskann-disk/src/search/traits/vertex_provider.rs | Refreshes trait documentation (and converts several comments to doc comments). |
| diskann-disk/src/search/provider/disk_provider.rs | Minor formatting-only change. |
| diskann-disk/benches/benchmarks/aligned_file_reader_bench.rs | Updates benchmark to use ReadRequest. |
| diskann-disk/benches/benchmarks_iai/aligned_file_reader_bench_iai.rs | Updates IAI benchmark to use ReadRequest. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
arkrishn94
left a comment
There was a problem hiding this comment.
Thanks for taking this on Dongliang, just have one comment. I know it's a little more work than might be needed but I think it it'll pay off to keep this API clean.
Drop references to ANNWrapper and JetVertexProvider (neither exist), fix typo GraphProvider -> VertexProvider, fix wrong type parameter (Data, not GraphMetadata) and wrong create_vertex_provider signature description, and convert four `//` comments that should have been `///` doc comments. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
71bd0db to
3954f7a
Compare
arkrishn94
left a comment
There was a problem hiding this comment.
Nice, thanks Dongliang.
Decouple buffer-placement memory alignment from the disk-side stride. DiskSectorGraph previously allocated `sectors_data` aligned to `block_size` (typically 4096), conflating the on-disk stride with the reader's hardware/OS memory-placement requirement. Introduce a sealed `Alignment` trait with `const VALUE: PowerOfTwo` and two unit markers (`A1`, `A512`), and parameterize `AlignedRead` by an `A: Alignment` witness (defaulting to `A1`). `AlignedRead::new` checks buffer pointer, byte-length, and disk-offset alignment against `A::VALUE` at construction and returns `ANNResult<Self>`, so a typed `AlignedRead<u8, A512>` is itself a witness that the request is well-formed -- the file reader's `read` impl no longer needs to re-validate. Replace `AlignedFileReader::DISK_IO_ALIGNMENT` (and the hardcoded 512 in `AlignedRead::new`) with `AlignedFileReader::Alignment`. Linux and Windows direct-I/O readers set `type Alignment = A512;`; the buffered `StorageProviderAlignedFileReader` uses `A1`. `DiskSectorGraph::sectors_data` is now allocated using `<AlignedReaderType::Alignment as Alignment>::VALUE` -- the reader's contract -- rather than `block_size`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
3954f7a to
37fdf00
Compare
DiskSectorGraph previously allocated sectors_data aligned to block_size (typically 4096), conflating the on-disk stride with the reader's hardware/OS memory-placement requirement. Move the requirement where it belongs: declare it on the reader as
AlignedFileReader::MEMORY_ALIGNMENT(default 1; direct-I/O readers override to 512) and have the buffer allocator honor that.Drop the per-request alignment check from AlignedRead::new — that constraint is the caller's responsibility (DiskSectorGraph's offsets and lengths are already block_size-aligned by construction). Making construction infallible cleans up call sites.
Refresh VertexProvider / VertexProviderFactory docs to drop stale references to ANNWrapper and JetVertexProvider.