fix: normalize internal state timestamps#18
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5e303bc7c1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ( | ||
| table_schema, | ||
| file_path, | ||
| _normalize_internal_metadata_timestamps(table_name, arrow_table), |
There was a problem hiding this comment.
Preserve existing timestamptz metadata tables
When an internal metadata table was already created by the previous code with Iceberg TimestamptzType (for example _dlt_pipeline_state.created_at from an earlier timestamp[us, tz=UTC] batch), this unconditional normalization turns the next incoming batch into TimestampType before schema evolution runs. The existing-table path then compares TimestamptzType to TimestampType and evolve_schema_if_needed rejects that as an unsafe type change, so affected pipelines fail on the first run after upgrading instead of simply casting to the existing schema. Consider only normalizing when the existing column is already TimestampType, or treating the internal timestamp/timestamptz pair as compatible during evolution.
Useful? React with 👍 / 👎.
5e303bc to
62a752d
Compare
Summary
_dlt_pipeline_state.created_atas Icebergtimestampwhen incoming Arrow batches aretimestamp[us, tz=UTC]._dlt_loads.inserted_atwith the existing naive UTCTimestampTypeconvention used by_dlt_version.Addresses #9.
Tests
uv run pytest tests/test_with_state_sync.py::TestWithStateSyncMethods::test_state_table_created_at_timestamptz_batch_matches_existing_timestamp_schema -quv run pytest tests/test_with_state_sync.py tests/test_load_metadata_resilience.py tests/test_schema_evolution.py -quv run pytest tests/test_class_based_atomic.py tests/test_e2e_sqlite_catalog.py tests/test_destination_e2e.py -quv run pytest tests/test_schema_converter.py tests/test_schema_casting.py tests/test_state_sync_e2e.py -quv run pytest -q -m "not integration"(148 passed, 15 deselected)