
fix(dolt): refetch container in setup retry loop #140

Merged
cofin merged 1 commit into main from fix/dolt-fixture-race on May 25, 2026

cofin (Member) commented May 24, 2026

Summary

`_provide_dolt_service` cached a container handle from `_get_container` above its retry loop. Under parallel xdist with `transient=True`, `docker_service.run` returns success, but the container can be auto-removed (Docker `remove=True` plus a brief Dolt startup blip) between the readiness check passing and the post-`run` setup `exec_run`. Every retry then gets a 404 against the same stale container ID, and the fixture silently yields a broken service that surfaces later as `docker.errors.NotFound` at test setup.

This recurring flake has hit `tests/test_dolt.py::test_xdist_isolate_server` on the open clientless PRs (e.g. valkey #138 Python 3.11 - 1/3).

Changes

  • Refetch the container by name inside the loop so a fresh handle is used each iteration.
  • Tolerate `docker.errors.NotFound` and 404/409 `APIError` from `exec_run` so that a vanished container triggers a clean retry instead of bubbling up.
  • Replace the previous silent-skip-on-missing-container path with an explicit `RuntimeError` after 5 attempts: a genuinely dead container should fail fixture setup loudly instead of producing a half-baked `DoltService`.
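The three changes above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `run_setup_with_refetch`, its parameters, and the `get_container` callable are hypothetical names, and the `APIError` / `NotFound` classes are local stand-ins for `docker.errors.APIError` and `docker.errors.NotFound` so the sketch runs without the Docker SDK installed.

```python
from __future__ import annotations

import time
from typing import Any, Callable


class APIError(Exception):
    """Local stand-in for docker.errors.APIError, which exposes status_code."""

    def __init__(self, message: str, status_code: int | None = None) -> None:
        super().__init__(message)
        self.status_code = status_code


class NotFound(APIError):
    """Local stand-in for docker.errors.NotFound (an APIError with a 404)."""

    def __init__(self, message: str) -> None:
        super().__init__(message, status_code=404)


def run_setup_with_refetch(
    get_container: Callable[[str], Any],  # hypothetical lookup-by-name helper
    name: str,
    cmd: str,
    attempts: int = 5,
    delay: float = 0.5,
) -> None:
    """Run a setup command, looking the container up again on every attempt."""
    last_error: Exception | None = None
    for _ in range(attempts):
        try:
            # Refetch inside the loop: a handle cached outside it may point
            # at a container ID that has already been auto-removed.
            container = get_container(name)
            if container is None:
                last_error = NotFound(f"container {name!r} not found")
            else:
                exit_code, _output = container.exec_run(cmd)
                if exit_code == 0:
                    return
                last_error = RuntimeError(f"setup exited with code {exit_code}")
        except APIError as exc:
            # Only "gone" (404) and "conflict" (409) count as transient here.
            if exc.status_code not in (404, 409):
                raise
            last_error = exc
        time.sleep(delay)
    # Fail loudly rather than yielding a half-configured service.
    raise RuntimeError(
        f"setup for {name!r} failed after {attempts} attempts"
    ) from last_error
```

Raising after the final attempt, rather than returning, is what turns a dead container into a fixture setup failure instead of a later `NotFound` inside a test.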

Test plan

  • `uv run pytest tests/test_dolt.py -v` — all 3 tests pass locally including the previously flaky `test_xdist_isolate_server`.
  • `uv run ruff check src/pytest_databases/docker/dolt.py`
  • `uv run mypy src/pytest_databases/docker/dolt.py`
  • `uv run pyright src/pytest_databases/docker/dolt.py`
  • CI green across the full Python matrix.

`_provide_dolt_service` cached a container handle from `_get_container`
above its retry loop. Under parallel xdist with `transient=True`,
`docker_service.run` returns success but the container can be
auto-removed (Docker `remove=True` + a brief Dolt startup blip) between
the readiness check passing and the post-`run` setup `exec_run`. Every
retry then 404s against the same stale ID and the fixture silently
yields a broken service, surfacing later as `docker.errors.NotFound` at
test setup.

- Refetch the container by name inside the loop.
- Tolerate `NotFound` and 404/409 `APIError` so that a vanished container
  triggers a clean retry instead of bubbling up.
- Replace the previous silent-skip-on-missing-container path with an
  explicit `RuntimeError` after 5 attempts so a genuinely dead container
  fails fixture setup loudly instead of producing a half-baked service.

Surfaced as the recurring `test_xdist_isolate_server` flake in CI on
the open clientless PRs.
@cofin cofin merged commit 72db1ee into main May 25, 2026
21 checks passed
@cofin cofin deleted the fix/dolt-fixture-race branch May 25, 2026 01:08