
Fix OOM in geotiff dask read, sieve memory, and reproject GPU fallback#1183

Merged
brendancol merged 1 commit into master from perf/sweep-phase2-round1 on Apr 11, 2026

@brendancol
Contributor

Summary

  • read_geotiff_dask() was reading the entire file into RAM to extract metadata (shape, dtype, nodata) before building the lazy dask graph. Now uses _read_geo_info(), which parses only the TIFF IFD via mmap. Peak memory during graph setup dropped from 4.41 MB to 0.21 MB at 512x512. For a 30TB file this was an instant OOM; now it's a few kilobytes of header parsing.
  • sieve._label_connected() allocated region_val_buf at rows * cols entries -- 16 GB of float64 for a 46K x 46K raster, even though the actual region count is typically around 100K. Now counts regions in a first pass and allocates at the real size. The dead rank array gets reused as root_to_id instead of allocating a separate n-element array. Memory guard multiplier fixed from an inaccurate 5x to 28 bytes/pixel.
  • _reproject_dask_cupy pre-allocated the full output on GPU with cp.full(out_shape), raising MemoryError if it exceeded VRAM. Now it checks available GPU memory first and falls back to the existing map_blocks(is_cupy=True) path when the output won't fit. The fast pre-allocation path is still used when the output does fit.
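
The sizes quoted for the sieve fix can be checked with a little arithmetic. This is only a sketch: it assumes 8-byte float64 entries, takes the 100K region count and the 28 bytes/pixel figure from the summary above, and uses hypothetical helper names (`buffer_bytes`, `sieve_guard_estimate`) that are not the functions in this PR.

```python
FLOAT64_BYTES = 8


def buffer_bytes(n_entries, itemsize=FLOAT64_BYTES):
    """Bytes needed for a flat buffer holding n_entries values."""
    return n_entries * itemsize


def sieve_guard_estimate(rows, cols, bytes_per_pixel=28):
    """Hypothetical guard: estimated working memory for a rows x cols raster."""
    return rows * cols * bytes_per_pixel


rows = cols = 46_000
old_buf = buffer_bytes(rows * cols)  # buffer sized by pixel count
new_buf = buffer_bytes(100_000)      # buffer sized by a typical region count
guard = sieve_guard_estimate(rows, cols)

print(f"old region_val_buf: {old_buf / 1e9:.1f} GB")
print(f"new region_val_buf: {new_buf / 1e6:.1f} MB")
print(f"guard estimate:     {guard / 1e9:.1f} GB")
```

With these inputs the pixel-sized buffer comes to about 16.9 GB (roughly 15.8 GiB), while the region-sized buffer is under 1 MB.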

Test plan

  • geotiff: 284 passed (12 dask-specific), 1 pre-existing failure in plot test (unrelated RecursionError)
  • sieve: 45 passed
  • reproject: 93 passed
  • Verify dask read on a multi-GB GeoTIFF doesn't spike memory
  • Verify sieve on a real classified raster with few regions vs. many pixels
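
One possible way to run the memory checks above is Python's `tracemalloc`. The PR does not say how its peak-memory figures were measured, so this is an assumption, and `peak_memory_bytes` is a hypothetical helper. The two lambdas are stand-ins for an eager read and a metadata-only read, not calls into the library.

```python
import tracemalloc


def peak_memory_bytes(fn, *args, **kwargs):
    """Run fn and return the peak number of bytes traced by tracemalloc."""
    tracemalloc.start()
    try:
        fn(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


# Stand-in for an eager read: allocate a full 512x512 float64-sized buffer.
eager_peak = peak_memory_bytes(lambda: bytearray(8 * 512 * 512))
# Stand-in for a metadata-only read: return shape and dtype, no pixel buffer.
lazy_peak = peak_memory_bytes(lambda: (512, 512, "float64"))

print(f"eager: {eager_peak} bytes, metadata-only: {lazy_peak} bytes")
```

`tracemalloc` only sees allocations made through Python's allocators, so memory that is mapped or allocated by native code outside those hooks would need a process-level measurement instead.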

Three performance fixes from the Phase 2 sweep targeting WILL OOM
verdicts under 30TB workloads:

geotiff: read_geotiff_dask() was reading the entire file into RAM just
to extract metadata before building the lazy dask graph. Now uses
_read_geo_info() which parses only the IFD via mmap -- O(1) memory
regardless of file size. Peak memory during dask setup dropped from
4.41 MB to 0.21 MB at 512x512 (21x reduction).
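
To illustrate what "parses only the IFD via mmap" can look like, here is a minimal standard-library sketch. It is not the `_read_geo_info()` implementation: `read_header_info` and `write_demo_tiff` are hypothetical names, only classic little- or big-endian TIFF with inline SHORT/LONG values is handled, and the demo file is a header plus padding rather than a decodable image.

```python
import mmap
import os
import struct
import tempfile

# Tag IDs from the baseline TIFF 6.0 specification.
TAG_WIDTH, TAG_HEIGHT, TAG_BITS = 256, 257, 258
# TIFF field types handled here: 3 = SHORT (2 bytes), 4 = LONG (4 bytes).
TYPE_FORMATS = {3: "H", 4: "I"}


def read_header_info(path):
    """Read width, height and bit depth from the first IFD without touching pixel data."""
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as buf:
        endian = {b"II": "<", b"MM": ">"}[bytes(buf[:2])]
        magic, ifd_offset = struct.unpack_from(endian + "HI", buf, 2)
        if magic != 42:
            raise ValueError("not a classic TIFF (BigTIFF is out of scope here)")
        (n_entries,) = struct.unpack_from(endian + "H", buf, ifd_offset)
        info = {}
        for i in range(n_entries):
            entry = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes
            tag, ftype, count = struct.unpack_from(endian + "HHI", buf, entry)
            if count == 1 and ftype in TYPE_FORMATS:
                # A single SHORT or LONG is stored inline in the entry's last 4 bytes.
                (info[tag],) = struct.unpack_from(endian + TYPE_FORMATS[ftype], buf, entry + 8)
        return {"width": info[TAG_WIDTH], "height": info[TAG_HEIGHT], "bits": info[TAG_BITS]}


def write_demo_tiff(path, width, height, payload_bytes):
    """Write an 8-byte header and a 3-entry IFD, then zero bytes standing in for pixels."""
    header = struct.pack("<2sHI", b"II", 42, 8)  # the IFD starts right after the header
    entries = [
        struct.pack("<HHII", TAG_WIDTH, 4, 1, width),
        struct.pack("<HHII", TAG_HEIGHT, 4, 1, height),
        struct.pack("<HHIHH", TAG_BITS, 3, 1, 32, 0),  # SHORT value padded to 4 bytes
    ]
    ifd = struct.pack("<H", len(entries)) + b"".join(entries) + struct.pack("<I", 0)
    with open(path, "wb") as f:
        f.write(header + ifd + b"\0" * payload_bytes)


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.tif")
    write_demo_tiff(demo_path, width=512, height=512, payload_bytes=1_000_000)
    header_info = read_header_info(demo_path)

print(header_info)
```

Because the file is memory-mapped, only the pages holding the header and IFD are touched, so the cost of this lookup does not grow with the size of the pixel payload.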

sieve: region_val_buf was allocated at rows*cols (16 GB for a 46K x 46K
raster) when the actual region count is typically orders of magnitude
smaller. Now counts regions first, allocates at actual size. Also reuses
the dead rank array as root_to_id, saving another 4 bytes/pixel. Memory
guard fixed from a misleading 5x multiplier to an accurate 28
bytes/pixel estimate.
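
The count-then-allocate idea and the reuse of `rank` can be sketched in plain Python. This is an illustration under assumptions, not the code in `sieve._label_connected()`: it uses lists instead of typed arrays, 4-connectivity, and a hypothetical function name `label_regions`.

```python
def label_regions(grid):
    """4-connected labelling that sizes its per-region buffer from a counting pass."""
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    parent = list(range(n))
    rank = [0] * n

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        if rank[ra] < rank[rb]:
            ra, rb = rb, ra
        parent[rb] = ra
        if rank[ra] == rank[rb]:
            rank[ra] += 1

    # Merge equal-valued right and down neighbours.
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols and grid[r][c] == grid[r][c + 1]:
                union(i, i + 1)
            if r + 1 < rows and grid[r][c] == grid[r + 1][c]:
                union(i, i + cols)

    # Counting pass: all unions are done, so `rank` is no longer needed.
    # Reuse it as the root -> region id map instead of allocating another array.
    root_to_id = rank
    n_regions = 0
    for i in range(n):
        if parent[i] == i:
            root_to_id[i] = n_regions
            n_regions += 1

    # Allocate per-region values at the real region count, not rows * cols.
    region_val_buf = [0.0] * n_regions
    labels = [0] * n
    for i in range(n):
        rid = root_to_id[find(i)]
        labels[i] = rid
        region_val_buf[rid] = float(grid[i // cols][i % cols])
    return labels, region_val_buf


demo_grid = [
    [1, 1, 2],
    [1, 2, 2],
    [3, 3, 2],
]
demo_labels, demo_vals = label_regions(demo_grid)
print(len(demo_vals), "regions for", len(demo_labels), "pixels")
```

Only root entries of `root_to_id` are ever written or read, so the stale rank values left in non-root slots are harmless.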

reproject: _reproject_dask_cupy pre-allocated the full output on GPU via
cp.full(out_shape), which OOMs for large outputs. Now checks available
GPU memory and falls back to the existing map_blocks path (with
is_cupy=True) when the output exceeds VRAM. Fast path preserved for
outputs that fit.
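
The fit check can be sketched as a pure function so it runs without a GPU. Everything here is an assumption for illustration: `choose_output_strategy` is a hypothetical name, the `headroom` fraction is invented, and the actual check in `_reproject_dask_cupy` may differ. On a real device, `free_bytes` could be taken from CuPy's `cupy.cuda.Device().mem_info`, which reports free and total bytes.

```python
from math import prod


def choose_output_strategy(out_shape, itemsize, free_bytes, headroom=0.5):
    """Return "preallocate" if the full output fits in the budget, else "map_blocks".

    headroom is a hypothetical safety factor: only this fraction of the free
    memory is treated as available for the output array.
    """
    needed = prod(out_shape) * itemsize
    return "preallocate" if needed <= free_bytes * headroom else "map_blocks"


free = 8 * 1024**3  # pretend the device reports 8 GiB free
small = choose_output_strategy((1024, 1024), itemsize=8, free_bytes=free)
large = choose_output_strategy((46_000, 46_000), itemsize=8, free_bytes=free)
print(small, large)
```

Keeping the decision separate from the allocation makes the fallback easy to unit test with fabricated memory figures.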

@github-actions github-actions bot added the performance label (PR touches performance-sensitive code) on Apr 11, 2026
@brendancol brendancol merged commit e944c04 into master Apr 11, 2026
11 checks passed