fix: late-materialize blob columns read as binary #7593

Open
wkalt wants to merge 1 commit into lance-format:main from wkalt:ticket/oss-1382/late-materialize-all-binary-blobs
wkalt commented Jul 2, 2026

Contributor

is_early_field forced every blob field into early materialization, so a query
with a selective filter that projected a blob column read the whole column
instead of taking only the matched rows. That is correct only for the default
blobs_descriptions handling, where a blob is returned as a tiny {offset, size}
description. all_binary (and the SomeBinary variants) materialize the full
value, so reading it for the whole table defeats late materialization. A TODO
at the call site already flagged this.
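
To make the cost difference concrete, here is a rough back-of-the-envelope sketch in Rust. The function names, the 16-byte description size (two 8-byte integers for {offset, size}), the 1 MiB blob size, and the row counts are all illustrative assumptions, not measurements from this PR. The model also ignores the random-I/O overhead of taking individual rows.

```rust
/// Bytes read when the projected column is scanned for every row
/// (early materialization). Hypothetical helper for illustration.
pub fn bytes_read_early(total_rows: u64, bytes_per_value: u64) -> u64 {
    total_rows * bytes_per_value
}

/// Bytes read when only the rows that passed the filter are fetched
/// (late materialization). Hypothetical helper for illustration.
pub fn bytes_read_late(matched_rows: u64, bytes_per_value: u64) -> u64 {
    matched_rows * bytes_per_value
}
```

Under these assumptions, a 1,000,000-row table with 10 matching rows reads `bytes_read_early(1_000_000, 16)` = 16,000,000 bytes when blobs are returned as descriptions, which is cheap enough to scan up front. With full 1 MiB values, `bytes_read_early(1_000_000, 1 << 20)` is 1,048,576,000,000 bytes, while `bytes_read_late(10, 1 << 20)` is 10,485,760 bytes.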

Force early materialization only when the blob is returned as a description;
otherwise fall through to the width-based heuristic, which late-materializes a
wide binary leaf. The decision is made per leaf, so a blob nested in a struct
is handled the same way as a top-level column. The default blob handling and
explicit AllEarly/AllLate settings are unaffected.
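
The per-leaf decision described above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `Leaf`, `BlobReturn`, `Materialization`, `classify`, and the `WIDTH_THRESHOLD_BYTES` value are hypothetical names and numbers, and the real is_early_field signature and heuristic may differ.

```rust
/// How a blob leaf is handed back to the caller.
/// Hypothetical stand-in for the blob handling modes named in this PR.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum BlobReturn {
    /// Blob is returned as a small {offset, size} description.
    Description,
    /// Blob is returned as its full binary value.
    Binary,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Materialization {
    Early,
    Late,
}

/// One leaf column of the projection (top-level or nested in a struct).
pub struct Leaf {
    pub is_blob: bool,
    pub blob_return: BlobReturn,
    /// Estimated bytes per value, used by the width-based heuristic.
    pub est_width_bytes: u64,
}

/// Assumed cutoff for the width heuristic; the real threshold is not
/// stated in this PR.
pub const WIDTH_THRESHOLD_BYTES: u64 = 1000;

pub fn classify(leaf: &Leaf) -> Materialization {
    // Force early only when the blob comes back as a tiny description.
    if leaf.is_blob && leaf.blob_return == BlobReturn::Description {
        return Materialization::Early;
    }
    // Everything else, including blobs read as binary, falls through to
    // the width-based heuristic: narrow leaves early, wide leaves late.
    if leaf.est_width_bytes <= WIDTH_THRESHOLD_BYTES {
        Materialization::Early
    } else {
        Materialization::Late
    }
}
```

Because `classify` takes a single leaf, calling it once per leaf of the projected schema treats a blob nested in a struct exactly like a top-level blob column.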

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions bot added the bug label Jul 2, 2026
wkalt force-pushed the ticket/oss-1382/late-materialize-all-binary-blobs branch from 582c1f1 to 5faa90b on July 2, 2026 20:27
codecov Bot commented Jul 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
