feat: forward stable row ID write configuration #5217

Draft
VibhuJawa wants to merge 1 commit into lance-format:main from VibhuJawa:feat/forward-stable-row-ids

Conversation

@VibhuJawa

Summary

Forward PyLance's enable_stable_row_ids option through every Lance-Ray write path:

  • write_fragment
  • LanceFragmentWriter
  • LanceDatasink and LanceFragmentCommitter
  • non-streaming and streaming write_lance

The option remains disabled by default, so existing callers are unchanged.
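
The forwarding pattern described above can be sketched without Lance or Ray installed. The functions below are hypothetical stand-ins that borrow only the names listed in this PR; their signatures are assumptions, not Lance-Ray's real API. The point is that a single keyword, defaulting to `False`, passes unchanged from the top-level call to both the fragment-write and the commit steps.

```python
from dataclasses import dataclass


@dataclass
class WriteRecord:
    """Records which option value reached a stand-in write or commit step."""
    step: str
    enable_stable_row_ids: bool


def write_fragment_stub(rows, *, enable_stable_row_ids=False):
    # Stand-in for the fragment-write step. A real writer would pass the
    # flag on to PyLance instead of just recording it.
    return WriteRecord("write_fragment", enable_stable_row_ids)


def commit_stub(fragments, *, enable_stable_row_ids=False):
    # Stand-in for the dataset-commit step.
    return WriteRecord("commit", enable_stable_row_ids)


def write_lance_stub(batches, *, enable_stable_row_ids=False):
    # Top-level entry point: forwards the same flag to every lower layer.
    fragments = [
        write_fragment_stub(b, enable_stable_row_ids=enable_stable_row_ids)
        for b in batches
    ]
    commit = commit_stub(fragments, enable_stable_row_ids=enable_stable_row_ids)
    return fragments, commit
```

Because the keyword defaults to `False` at every layer, a caller that never mentions it gets the same behavior as before.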

Why

Distributed writers could create and commit ordinary Lance fragments, but had no way to request stable row IDs, even though PyLance supports the option at both fragment-write and dataset-commit time. As a result, datasets written through Lance-Ray could not retain row identity across operations such as compaction.
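
To illustrate why this matters, here is a toy model, not Lance's implementation. It assumes a row's physical address is a fragment ID and an offset packed into one integer, and that compaction rewrites rows into a new fragment. Address-based references then go stale, while an ID assigned once at write time still identifies the same row. All names are hypothetical.

```python
def row_address(fragment_id, offset):
    # Pack a fragment ID and a row offset into one integer (toy layout).
    return (fragment_id << 32) | offset


def compact(fragments, new_fragment_id):
    # Merge all fragments into a single new fragment, preserving row order.
    merged = [row for frag_id in sorted(fragments) for row in fragments[frag_id]]
    return {new_fragment_id: merged}


def build_address_map(fragments):
    # Map each row's stable ID to its current physical address.
    return {
        row["stable_id"]: row_address(frag_id, offset)
        for frag_id, rows in fragments.items()
        for offset, row in enumerate(rows)
    }


# Two small fragments; each row receives a stable ID once, at write time.
fragments = {
    0: [{"stable_id": 0, "value": "a"}, {"stable_id": 1, "value": "b"}],
    1: [{"stable_id": 2, "value": "c"}],
}
before = build_address_map(fragments)
after = build_address_map(compact(fragments, new_fragment_id=2))

# The set of stable IDs is unchanged, but every physical address moved.
moved = [sid for sid in before if before[sid] != after[sid]]
```

In this model, anything keyed on the physical address must be rebuilt after compaction, whereas anything keyed on the stable ID only needs the ID-to-address map refreshed.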

Validation

  • 37 passed in tests/test_basic_read_write.py and tests/test_fragment.py
  • Production rewrite of 355,952,746 rows across 56,696 fragments
  • Stable row IDs enabled on the committed dataset and preserved across the subsequent URL-index commit
  • Source and target row count and Arrow schema matched exactly
  • 1,000 deterministic samples matched exact image bytes and stored/computed MD5 values
  • 1,000 stable-ID image reads completed with 16 workers at 42.47 images/s and 9.98 MiB/s
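
A check like the deterministic-sample comparison above could look roughly like the sketch below. This is not the validation script used for this PR; the in-memory row layout, the `image` and `md5` field names, and the seed are assumptions for illustration. A fixed seed makes the sample reproducible, and each sampled row is compared on raw bytes, stored MD5, and recomputed MD5.

```python
import hashlib
import random


def sample_indices(num_rows, sample_size, seed=0):
    # A seeded RNG yields the same sample on every run.
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_rows), min(sample_size, num_rows)))


def verify_samples(source, target, indices):
    # Return the indices whose image bytes or MD5 values disagree.
    mismatches = []
    for i in indices:
        src, dst = source[i], target[i]
        computed = hashlib.md5(dst["image"]).hexdigest()
        same_bytes = src["image"] == dst["image"]
        same_stored = src["md5"] == dst["md5"]
        stored_matches_computed = dst["md5"] == computed
        if not (same_bytes and same_stored and stored_matches_computed):
            mismatches.append(i)
    return mismatches
```

For scale, if the reported 42.47 images/s and 9.98 MiB/s were measured over the same reads, they imply roughly 0.235 MiB (about 241 KiB) per image on average and about 23.5 s for the 1,000 reads.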

The production dataset's fresh remote URL-index initialization remained expensive; that is an index-distribution/cache concern separate from Lance-Ray's stable-row-ID write forwarding.

Signed-off-by: Vibhu Jawa <vjawa@nvidia.com>
@github-actions

github-actions Bot commented Jul 3, 2026

Contributor

ACTION NEEDED
Lance follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

For details on the error please inspect the "PR Title Check" action.
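
As a rough illustration of what such a title check looks for, the sketch below matches the `type(scope)!: description` header shape from the Conventional Commits specification. The list of accepted types is an assumption; the actual "PR Title Check" action may accept a different set or apply extra rules.

```python
import re

# Assumed set of types; the real check may allow others.
TYPES = ("feat", "fix", "docs", "style", "refactor", "perf", "test", "build", "ci", "chore")

TITLE_RE = re.compile(
    r"^(?P<type>" + "|".join(TYPES) + r")"  # commit type
    r"(?:\((?P<scope>[^()]+)\))?"            # optional scope in parentheses
    r"(?P<breaking>!)?"                      # optional breaking-change marker
    r": (?P<description>\S.*)$"              # colon, space, then description
)


def is_conventional_title(title):
    # True when the title has a recognized type prefix and a description.
    return TITLE_RE.match(title) is not None
```

Under these assumptions, "Forward stable row ID write configuration" is rejected because it has no type prefix, while "feat: forward stable row ID write configuration" is accepted.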

@VibhuJawa changed the title from "Forward stable row ID write configuration" to "feat: forward stable row ID write configuration" Jul 3, 2026
@github-actions Bot added the enhancement (New feature or request) label Jul 3, 2026

Labels

enhancement New feature or request
