Fix get_area_slices under-crop for cropped geostationary cross-CRS areas by Zaczero · Pull Request #700 · pytroll/pyresample

Zaczero · 2026-02-28T20:35:34Z

AreaDefinition.get_area_slices can under-crop cropped geostationary cross-CRS areas by merging the existing boundary-based slice with chunked destination-coverage bounds.

Before (notice cropping at the bottom)

After

Closes Data reduction in resampling yields incorrect result satpy#986
Tests added
Tests passed

djhoese · 2026-03-02T11:24:21Z

Wow! I hope to take a look at all your PRs today or at least start. Since this is the first PR I have to ask: did you have all of these fixes/changes sitting around and waited to submit them or did you use AI to find the problems or...how did you have so many problems and have the fixes for them to make so many pull requests in one weekend?

Zaczero · 2026-03-02T11:58:34Z

@djhoese I used AI to investigate the problems in my local pipeline, prioritizing correctness bugs and performance bottlenecks. I then reviewed and iterated on fixes (human-in-the-loop). I am not a vibe coder if that's what you're worrying about; otherwise the code would be slop! Plus I am a workaholic.

I am new to this project, so obviously I may have missed some things.

djhoese

Thanks for putting this together. This is a very interesting idea and I've made some comments inline. I think my main concerns are:

Performance. You have a for loop in Python that is making 2D chunk arrays the size of the destination area.
The right solution: We (pyresample maintainers) know that the current bounding logic is not good and has a lot of flaws and we have some alternative implementations (PRs and third-party packages) that we want to investigate. I'm wondering if this workaround in this PR should be included because it gets us closer to the answer we want or if this is just another sign that the current implementation needs to be improved.

@pnuu @mraspaud I'd love to hear what you guys think.

djhoese · 2026-03-02T11:49:48Z

pyresample/future/geometry/_subset.py



+def _get_covered_source_slices(src_area: AreaDefinition, area_to_cover: AreaDefinition) -> tuple[slice, slice] | None:
+    max_points_per_chunk = 600_000


How was this value decided on?

768*768 is just under 600k. 1024x1024 is 1M. I tested several values and picked the one that in my opinion best matches CPU/memory tradeoffs. 1M was minimally faster for bigger images but used much more memory. Anything lower was not saving much memory and performance got lower.

djhoese · 2026-03-02T11:54:05Z

pyresample/future/geometry/_subset.py

+            destination_lons, destination_lats = area_to_cover.get_lonlats(
+                data_slice=(slice(row_start, row_stop), slice(None)),
+                dtype=np.float32,
+            )


This concerns me efficiency wise. If I'm not mistaken this is making a full 2D array, right? For real-world areas this could 1000s of rows and 1000s of columns per chunk.

This also assumes that the area_to_cover is best represented by longitude/latitude degrees which is not always the case or even not usually the case. It seems like it would be better (although not implemented in the AreaDefinition objects already) to get the X/Y coordinates in the area_to_cover CRS then use pyproj to convert to the X/Y of the src_area then convert those to indices. That way lon/lat are never involved.

I'm realizing now that even the main boundary code assumes lon/lat so maybe this idea of lon/lat being used is less of a problem for the current implementation.

As I understand it, the algorithm needs to check every point. This cross-CRS case is now 30% slower. There's some reason why a simple boundary check (not area) would not be correct for every cross-CRS, but I don't remember it now. I'll recheck that and get back to you with concrete information.

So yeah, for maximum correctness, all points need to be sampled. GDAL samples only edges by default and provides options to enable sampling of all points on a case-by-case basis. It works most of the time, but not always.

Normally when computing the source raster data to load to generate a particular output area, the warper samples transforms 21 points along each edge of the destination region back onto the source file, and uses this to compute a bounding window on the source image that is sufficient. Depending on the transformation in effect, the source window may be a bit too small, or even missing large areas. Problem situations are those where the transformation is very non-linear or "inside out". Examples are transforming from WGS84 to Polar Stereographic for areas around the pole, or transformations where some of the image is untransformable. The following options provide some additional control to deal with errors in computing the source window:

https://gdal.org/en/stable/doxygen/structGDALWarpOptions.html

So the existing implementation equals to SAMPLE_GRID=YES + SAMPLE_STEPS=ALL in GDAL.

Some other libs also use this 21 number by default.

I implemented the 21 sampled approach with 2 new flags:

sample_steps: int | None = 21, sample_grid: bool = False,

sample_steps=None, sample_grid=True, would work as the before this patch.

In my case it improved the performance 94ms -> 3ms with minimal numerical drift (<=1px).

djhoese · 2026-03-02T11:55:56Z

pyresample/future/geometry/_subset.py

+            max_col = chunk_max_col if max_col is None else max(max_col, chunk_max_col)
+            min_row = chunk_min_row if min_row is None else min(min_row, chunk_min_row)
+            max_row = chunk_max_row if max_row is None else max(max_row, chunk_max_row)
+    except (RuntimeError, TypeError, ValueError):


This worries me that there isn't a clear "this is why this error happens" definition and is instead "catch any problems". I'd like if it was clear why some of these errors are happening and to try to handle them before they happen. Or at the very least comment on why they would happen.

Yes, I tried to follow project patterns with that one. Such a pattern is very common in the codebase. Generic except catches.

Such a pattern is very common in the codebase

Yes, this project has never had the strictest of coding rules.

pyresample/future/geometry/_subset.py

djhoese · 2026-03-04T22:01:24Z

pyresample/future/geometry/_subset.py

+def _get_edge_lonlat_samples(area_to_cover: AreaDefinition, sample_steps: int):
+    """Return perimeter destination samples for edge-only sampling mode."""


Is this different than get_boundary_lonlats?

pyresample/pyresample/geometry.py

Lines 293 to 294 in 07711c0

def get_bbox_lonlats(self, vertices_per_side: Optional[int] = None, force_clockwise: bool = True,

frequency: Optional[int] = None) -> tuple:

Yes. _get_edge_lonlat_samples is geostationary-aware, and sampling follows the visibility curve rather than the image rectangular boundary. get_bbox_lonlats could pick samples that are not the true visible boundary. Plus it's more efficient because it's purpose build.

djhoese · 2026-03-04T22:10:43Z

pyresample/future/geometry/_subset.py

+    min_row = None
+    max_row = None
+    try:
+        src_proj = Proj(src_area.crs)


The pyproj Proj class is less preferred these days (deprecated even?) in favor of Transformer. This is an even stronger argument for maybe looking at not using lons/lats as the intermediate value. By that I mean since you have two define the src and dst CRS in a Transformer you might as well go from source area CRS to area_to_cover CRS. At least that's my gut reaction as I look at this.

Transformer.from_proj is deprecated, but Proj is not. Sample points are in lon/lat units, so Proj is the cleanest API to use in this case: we do lon/lat->CRS, not general CRS->CRS. get_projection_coordinates_from_lonlat uses a similar technique.

Hm I could have sworn there was an explicit deprecation, but can't find it now. However, Proj is limited datum-wise:

https://pyproj4.github.io/pyproj/stable/gotchas.html#proj-not-a-generic-latitude-longitude-to-projection-converter

pyresample/future/geometry/_subset.py

djhoese · 2026-03-04T22:14:22Z

pyresample/future/geometry/_subset.py

+    if sample_steps is None:
+        yield from _iter_dense_lonlat_samples(area_to_cover)
+        return
+    if sample_grid:
+        yield _get_grid_lonlat_samples(area_to_cover, sample_steps)
+        return
+    yield _get_edge_lonlat_samples(area_to_cover, sample_steps)


I could be wrong but I don't think this logic matches GDAL. I read the GDAL docs as sample_grid=False and sample_steps being ALL (0/None in our case) that that would do all the pixels along the edges. When sample_grid=True then sample_steps=ALL/None is the full dense grid (edges and all internal points).

djhoese · 2026-03-18T15:42:01Z

I've restarted CI as it had a weird failure. Let's see how it does now...

codecov · 2026-03-19T13:29:22Z

Codecov Report

❌ Patch coverage is 95.28302% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.71%. Comparing base (4da28b0) to head (7c0cc75).
⚠️ Report is 38 commits behind head on main.

Files with missing lines	Patch %	Lines
pyresample/future/geometry/_subset.py	91.66%	9 Missing ⚠️
pyresample/test/test_geometry/test_area.py	99.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #700      +/-   ##
==========================================
+ Coverage   93.67%   93.71%   +0.04%     
==========================================
  Files          89       89              
  Lines       13621    13863     +242     
==========================================
+ Hits        12759    12992     +233     
- Misses        862      871       +9

Flag	Coverage Δ
unittests	`93.71% <95.28%> (+0.04%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

djhoese requested changes Mar 2, 2026

View reviewed changes

Zaczero force-pushed the zaczero/moral-orca branch from ac5087c to dc30c4c Compare March 2, 2026 23:37

djhoese reviewed Mar 4, 2026

View reviewed changes

Zaczero force-pushed the zaczero/moral-orca branch from dc30c4c to 79d6db6 Compare March 15, 2026 09:51

Zaczero requested a review from djhoese March 15, 2026 09:52

Zaczero force-pushed the zaczero/moral-orca branch from 79d6db6 to 0f47d0f Compare March 19, 2026 12:54

Fix get_area_slices under-crop for cropped geostationary cross-CRS areas

7c0cc75

Zaczero force-pushed the zaczero/moral-orca branch from 0f47d0f to 7c0cc75 Compare March 19, 2026 13:23



		def _get_covered_source_slices(src_area: AreaDefinition, area_to_cover: AreaDefinition) -> tuple[slice, slice] \| None:
		max_points_per_chunk = 600_000

		def _get_edge_lonlat_samples(area_to_cover: AreaDefinition, sample_steps: int):
		"""Return perimeter destination samples for edge-only sampling mode."""

	def get_bbox_lonlats(self, vertices_per_side: Optional[int] = None, force_clockwise: bool = True,
	frequency: Optional[int] = None) -> tuple:

Conversation

Zaczero commented Feb 28, 2026

Before (notice cropping at the bottom)

After

Uh oh!

djhoese commented Mar 2, 2026

Uh oh!

Zaczero commented Mar 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

djhoese left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Zaczero Mar 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Zaczero Mar 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

djhoese commented Mar 18, 2026

Uh oh!

codecov bot commented Mar 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Zaczero commented Mar 2, 2026 •

edited

Loading

Zaczero Mar 2, 2026 •

edited

Loading

Zaczero Mar 15, 2026 •

edited

Loading

codecov bot commented Mar 19, 2026 •

edited

Loading