Skip to content

Add local postgres subcommand backed by Docker#136

Merged
iskakaushik merged 6 commits intomainfrom
local-postgres-docker
May 7, 2026
Merged

Add local postgres subcommand backed by Docker#136
iskakaushik merged 6 commits intomainfrom
local-postgres-docker

Conversation

@iskakaushik
Copy link
Copy Markdown
Collaborator

@iskakaushik iskakaushik commented May 5, 2026

Summary

  • Adds clickhousectl local postgres start|stop|stop-all|remove|client|dotenv as a sibling of local server. Each instance runs as a postgres:<tag> Docker container with data bind-mounted at .clickhouse/servers/<name>/data/. ClickHouse + Postgres share local server list (with an Engine column) but lifecycle commands stay engine-scoped.
  • local install postgres@<tag> pre-pulls images. local postgres client prefers host psql and falls back to a Docker exec — interactive TTY for shells, non-TTY one-shot for --query / --queries-file.
  • Restart-safe: stop keeps the container + metadata, start resumes via docker start so credentials encoded into PGDATA on first init are preserved across restarts.
  • Engine-isolated: starting one engine on a name held by the other errors with a clear message instead of clobbering the other engine's data dir.
  • Containers labeled clickhousectl.engine=postgres, clickhousectl.name=<name>, clickhousectl.project=<cwd>, and created_by=clickhousectl_<version>. Discovery (orphan recovery in local server list) filters on engine + project so containers stay manageable across CLI upgrades.
  • Postgres versions restricted to majors 16, 17, 18 (any sub-tag is accepted: 16-alpine, 17.0, 18-bookworm, 16.4-alpine3.20). Default is 18.

Test plan

  • cargo build and cargo clippy clean
  • cargo test — 247 passing (including new tests for Engine serde back-compat, dotenv prefix isolation, install spec parser, validate_pg_tag, resolve_port, generate_password)
  • Smoke: install → start → server list (combined view) → orphan recovery (delete metadata, list rebuilds it from container labels) → stop → start (container_id + password preserved) → remove
  • Smoke: cross-engine guard fires both directions with actionable error
  • Smoke: stale stopped container with our labels gets cleaned up before create_container (no 409)
  • Smoke: non-TTY --query path routes through exec_psql_one_shot
  • Smoke: unsupported version (15, latest) rejected with clear message
  • CI green

Adds a sibling to `local server` that manages Postgres instances as Docker
containers, so a project can run ClickHouse and Postgres side-by-side for
CDC/ingest workflows without touching system Postgres.

* `local install postgres@<tag>` pre-pulls a postgres image (start will pull
  on demand otherwise).
* `local postgres start|stop|stop-all|remove|client|dotenv` mirror the
  ClickHouse server lifecycle. Data persists at
  `.clickhouse/servers/<name>/data/` (bind-mounted into the container).
* `ServerInfo` gains `engine` (ClickHouse | Postgres) and `container_id`
  fields with serde defaults so existing metadata files keep deserializing.
  `local server list` now shows both engines in a single table.
* `local postgres client` prefers host `psql` and falls back to an
  attached exec stream + raw-mode TTY in the container when psql is
  unavailable. SIGWINCH is forwarded to `resize_exec`.
* Containers are labeled `clickhousectl.engine=postgres`,
  `clickhousectl.name=<name>`, `clickhousectl.project=<cwd>`, and
  `created_by=clickhousectl_<version>`. Discovery filters on engine + project
  (not version) so containers stay manageable across upgrades. Stale
  containers with our labels are reaped before start to avoid 409s.
* `local server stop-all` stays ClickHouse-only; Postgres has its own
  `local postgres stop-all`.
@iskakaushik iskakaushik requested a review from sdairs as a code owner May 5, 2026 19:29
@iskakaushik iskakaushik temporarily deployed to cloud-integration May 5, 2026 19:29 — with GitHub Actions Inactive
Round of follow-up fixes to `local postgres` after self-review.

* Restart safety. `local postgres stop` now only stops the container and
  preserves both the container and metadata; `start` resumes via
  `docker start` instead of recreating. The postgres image's initdb only
  runs on empty PGDATA, so re-creating with new POSTGRES_PASSWORD silently
  produced wrong credentials.
* Cross-engine name reuse rejected. Both engines share
  `.clickhouse/servers/<name>/` and the same metadata file. Starting
  Postgres on a name held by ClickHouse (or vice versa) now errors with a
  clear message instead of clobbering the other engine's data dir.
* List faithfulness. Stopped Postgres entries keep their metadata (engine,
  container_id, image tag) so `local server list` shows them correctly;
  pid/http_port are omitted for non-ClickHouse rows instead of serialized
  as 0 sentinels. GC of stale ClickHouse metadata now lives only in
  `list_all_servers`, not in `load_running_info` — so the engine guard can
  read prior metadata without side effects.
* Non-TTY exec for `--query` / `--queries-file`. Added
  `exec_psql_one_shot` (tty=false, no stdin) so the docker fallback works
  in scripted/piped contexts, not just interactive shells.
* Postgres version range restricted to 16/17/18. New `validate_pg_tag`
  rejects unsupported majors (`latest`, `15`, `19`, ...) and the default
  tag is now `18` instead of `latest`.
@iskakaushik iskakaushik temporarily deployed to cloud-integration May 5, 2026 19:40 — with GitHub Actions Inactive
If a Postgres metadata file exists but the container it references has
been removed (e.g. by `docker rm -f`), the previous flow fell through to
a fresh-create against the leftover data dir. That's never safe:
* If PGDATA was half-initialized (force-killed mid-init), the new
  container fails to start with confusing pg_control errors.
* If PGDATA was fully initialized, the postgres image's initdb skips,
  but the env-supplied POSTGRES_PASSWORD is also ignored by the image —
  so the password we'd advertise to the user wouldn't actually work.

Instead, surface a clear error pointing the user at `local postgres
remove` to clear the data dir and start fresh. Verified by an integration
test that force-removes the container and asserts the recovery message.
@iskakaushik iskakaushik temporarily deployed to cloud-integration May 5, 2026 19:52 — with GitHub Actions Inactive
The official postgres image changed its data layout in 18+: PGDATA
defaults to /var/lib/postgresql/<major>/docker, where older majors used
/var/lib/postgresql/data. Mounting the host data dir at the legacy path
made 18+ refuse to start ("PostgreSQL data in: /var/lib/postgresql/data
(unused mount/volume)"). The local smoke tests passed because they ran
against 16/17-alpine, but the default tag is 18 so it broke immediately
in real use.

Pin PGDATA=/var/lib/postgresql/data via env so the image uses the same
path on every supported major. Each managed server is locked to a single
image tag (changing tag requires `remove`), so we don't need pg_upgrade-
style cross-version layout.

Also adds:
* scripts/test-postgres-integration.sh — the edge-case battery I'd been
  running locally (orphan recovery, externally-removed container,
  cross-engine guards, dotenv consistency, etc.). Now reproducible from
  the repo with any `clickhousectl` binary.
* .github/workflows/test-postgres-integration.yml — runs the script on
  pull requests touching local code, against postgres 16, 17, and 18.
* New `majors_start_and_serve` case in the battery: starts each
  supported major and waits for `pg_isready` — would have caught the
  18+ regression.
@iskakaushik iskakaushik temporarily deployed to cloud-integration May 6, 2026 17:52 — with GitHub Actions Inactive
Two postgres instances on the same `--name` but different `--version`
values now coexist with separate state, instead of silently sharing a
data dir (which would either fail to init under the second version, or
come up with the first version's password baked into PGDATA).

Disk layout per Postgres instance:
* metadata at `.clickhouse/servers/<name>-pg<major>.json`
* data at    `.clickhouse/servers/<name>-pg<major>/data/`
* container  `clickhousectl-pg-<name>-<major>`

The new label `clickhousectl.major` records the major so orphan recovery
can rebuild the (name, version) key. ClickHouse paths are unchanged;
since the two engines no longer share files, a name like "dev" can be
held by both engines simultaneously — the cross-engine guard from the
previous round is dropped as it's no longer needed.

User-facing UX:
* `start --name X` (no version) resumes the existing instance when there's
  exactly one with that name, or asks for `--version` when ambiguous.
  With zero existing instances it falls back to the default tag (18).
* `stop`, `remove`, `client`, `dotenv` accept `--version`/`-v` to
  disambiguate when multiple majors share a name. With a single match
  the flag is optional.

Integration coverage updated: new `per_version_isolation` and
`cross_engine_coexist` cases; the old `restart_ignores_changed_version`
and cross-engine-block cases are obsolete and removed. Test runner
canonicalizes its tempdir paths so cleanup matches the labels written
by the CLI. Postgres readiness checks switched to TCP since the alpine
image's local socket dir isn't created until postgres binds.
@iskakaushik iskakaushik temporarily deployed to cloud-integration May 6, 2026 18:10 — with GitHub Actions Inactive
Comment thread crates/clickhousectl/src/local/cli.rs Outdated
CONTEXT FOR AGENTS:
Manage named Postgres server instances backed by Docker. Each instance runs as
a `postgres:<tag>` container with data bind-mounted at .clickhouse/servers/<name>/data/.
Subcommands: start, stop, stop-all, remove, client, dotenv.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avoid listing subcommands in additional context as they're already listed in the help output

…subcommand list

The integration battery passed locally on macOS but failed in CI because
on Linux bind-mounted PGDATA contains files owned by uid 999 (the
postgres user inside the container), and the host runner can't rm them.
Docker for Mac papers over this with UID mapping; bare Linux doesn't.

Add `docker::remove_host_dir_blocking` that tries `std::fs::remove_dir_all`
first (no overhead on macOS) and falls back to a one-shot Alpine
container running as root. `local postgres remove` uses it for the data
dir teardown. The integration test runner mirrors the same pattern for
its tempdir cleanup so a failed test doesn't leave undeletable state.

Also drops the redundant `Subcommands: ...` lines from `local server`
and `local postgres` after_help blocks (clap's --help already lists
them) per @sdairs review on PR #136.
@iskakaushik iskakaushik temporarily deployed to cloud-integration May 6, 2026 18:39 — with GitHub Actions Inactive
@iskakaushik iskakaushik merged commit a30a775 into main May 7, 2026
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants