feat: extract build infrastructure into standalone scripts#227
Open
feat: extract build infrastructure into standalone scripts#227
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR extracts the long docker buildx build invocations from .gitlab-ci.yml into two standalone root-level scripts (build-base.sh, build-eic.sh) intended to be the single source of truth for both CI and local builds, and updates local build documentation to use these scripts.
Changes:
- Add
build-base.shto build base images (Debian stable base and CUDA base/runtime). - Add
build-eic.shto build EIC environment images with CI/local mode behavior. - Update
.gitlab-ci.ymlanddocs/building-locally.mdto delegate build logic to the new scripts.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
build-base.sh |
New script encapsulating the base image docker buildx build logic. |
build-eic.sh |
New script encapsulating the EIC environment image build logic (tagging, caching, secrets, mirrors). |
.gitlab-ci.yml |
Replaces large inline build command blocks with calls to the new scripts. |
docs/building-locally.md |
Updates local build instructions to use the new scripts and revises build-arg guidance. |
fafdb52 to
3ffa5ef
Compare
Add build-base.sh and build-eic.sh as standalone scripts that encapsulate the full docker buildx build invocations previously embedded in .gitlab-ci.yml. These scripts are now the single source of truth for build logic and are called by the CI as well as being directly usable by developers for local builds. Key features: - Auto-detect CI vs local mode via CI_REGISTRY env var - CI mode: --push, --cache-to, full tag logic, credential mirrors - Local mode: --load, public-only mirrors (no credentials needed) - Use sed instead of envsubst for mirrors.yaml.in expansion - All CI matrix variables (BUILD_IMAGE, ENV, BUILD_TYPE, etc.) flow through naturally as environment variables - Safe quoting of multi-line SPACK_CHERRYPICKS via bash arrays Also update docs/building-locally.md to reference the new scripts and remove the incomplete inline build script that was missing ~15 --build-arg entries. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Wouter Deconinck <wdconinc@gmail.com>
Co-authored-by: Wouter Deconinck <wdconinc@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…:trixie-slim) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
GITHUB_BASE_REF is empty for push events, so CI_DEFAULT_BRANCH_SLUG was
always falling back to the hardcoded string 'master', degrading cache hits
on repos whose default branch is 'main' or anything else.
Pass DEFAULT_BRANCH from the workflow (github.event.repository.default_branch)
and use it as an intermediate fallback before 'master':
CI_DEFAULT_BRANCH_SLUG=$(slugify "${GITHUB_BASE_REF:-${DEFAULT_BRANCH:-master}}")
On PR events GITHUB_BASE_REF is still used (the PR target branch).
On push events GITHUB_BASE_REF is empty and DEFAULT_BRANCH takes over.
The 'master' final fallback covers local runs with no CI env set.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two robustness issues in the mirrors.yaml generation: 1. sed substitution was not escaping replacement values, so registry URLs or paths containing '&', '\', or '|' could corrupt the output. Add sed_escape() helper that escapes those characters before injection. 2. mirrors.yaml was written into the source tree (SCRIPT_DIR), leaving a dirty working tree for local users and risking collisions when multiple builds run from the same checkout concurrently. Switch to mktemp under TMPDIR and register a trap to clean it up on exit. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The fallback is triggered when either BUILDER_IMAGE or RUNTIME_IMAGE (or both) are absent locally, but the old message only named BUILDER_IMAGE, leaving users guessing which image was actually missing. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
build-eic.sh only passes CI_REGISTRY_USER/CI_REGISTRY_PASSWORD/ GITHUB_REGISTRY_USER/GITHUB_REGISTRY_TOKEN as secrets in CI mode. Local builds omit them, but the Dockerfile unconditionally declared those secret mounts as required, causing 'secret not found' failures for anyone running a local build. Add required=false to all four credential secret mounts in the three RUN installation steps. The mirrors secret is always provided (via mktemp in both CI and local mode) and remains required. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Member
|
Fails on eicweb: |
Contributor
Author
ah shucks. alpine, never has what you need. |
wdconinc
commented
Apr 15, 2026
Comment on lines
+157
to
+164
| env: | ||
| BUILD_IMAGE: ${{ matrix.BUILD_IMAGE }} | ||
| BASE_IMAGE: ${{ matrix.BASE_IMAGE }} | ||
| PLATFORM: ${{ matrix.PLATFORM }} | ||
| BUILDWEEK: ${{ needs.env.outputs.BUILDWEEK }} | ||
| DEFAULT_BRANCH: ${{ github.event.repository.default_branch }} | ||
| METADATA_FILE: /tmp/build-metadata.json | ||
| run: bash build-base.sh |
| ## Mirrors GitLab's CI_COMMIT_REF_SLUG: lowercase, non-alnum runs → '-', | ||
| ## strip leading/trailing '-', truncate to 63 chars. | ||
| slugify() { | ||
| echo "$1" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]\+/-/g; s/^-//; s/-$//' | cut -c1-63 |
| ## Mirrors GitLab's CI_COMMIT_REF_SLUG: lowercase, non-alnum runs → '-', | ||
| ## strip leading/trailing '-', truncate to 63 chars. | ||
| slugify() { | ||
| echo "$1" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]\+/-/g; s/^-//; s/-$//' | cut -c1-63 |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Comment on lines
+99
to
+106
| ## GITHUB_REF_NAME (current branch) and GITHUB_BASE_REF (PR target branch, | ||
| ## empty on push events) are standard runner variables. DEFAULT_BRANCH should | ||
| ## be supplied by the workflow (github.event.repository.default_branch) so | ||
| ## that cache keys are correct even when GITHUB_BASE_REF is empty. | ||
| CI_MODE="github" | ||
| CI_REGISTRY="${GH_REGISTRY}" | ||
| CI_PROJECT_PATH="${GH_REGISTRY_USER}" | ||
| CI_COMMIT_REF_SLUG="$(slugify "${GITHUB_REF_NAME:-master}")" |
Comment on lines
179
to
184
| ### Base Image (containers/debian/Dockerfile) | ||
|
|
||
| | Argument | Description | Default | | ||
| |----------|-------------|---------| | ||
| | `BASE_IMAGE` | Base Debian image | `debian:stable-slim` | | ||
| | `BASE_IMAGE` | Base Debian image | `debian:trixie-slim` (derived from `--image`) | | ||
| | `SPACK_ORGREPO` | Spack GitHub org/repo | `spack/spack` | |
Comment on lines
+86
to
+107
| ## GITHUB_REF_NAME (current branch) and GITHUB_BASE_REF (PR target branch, | ||
| ## empty on push events) are standard runner variables. DEFAULT_BRANCH should | ||
| ## be supplied by the workflow (github.event.repository.default_branch) so | ||
| ## that cache keys are correct even when GITHUB_BASE_REF is empty. | ||
| CI_MODE="github" | ||
| CI_REGISTRY="${GH_REGISTRY}" | ||
| CI_PROJECT_PATH="${GH_REGISTRY_USER}" | ||
| CI_COMMIT_REF_SLUG="$(slugify "${GITHUB_REF_NAME:-master}")" | ||
| CI_DEFAULT_BRANCH_SLUG="$(slugify "${GITHUB_BASE_REF:-${DEFAULT_BRANCH:-master}}")" | ||
| INTERNAL_TAG="${INTERNAL_TAG:-pipeline-${GITHUB_RUN_ID}}" | ||
| else | ||
| CI_MODE="local" | ||
| fi | ||
|
|
||
| ## Derive BASE_IMAGE from BUILD_IMAGE if not provided | ||
| if [ -z "${BASE_IMAGE}" ]; then | ||
| case "${BUILD_IMAGE}" in | ||
| debian_stable_base) BASE_IMAGE="debian:trixie-slim" ;; | ||
| cuda_devel) BASE_IMAGE="nvidia/cuda:${CUDA_VERSION}-devel-${CUDA_OS}" ;; | ||
| cuda_runtime) BASE_IMAGE="nvidia/cuda:${CUDA_VERSION}-runtime-${CUDA_OS}" ;; | ||
| *) echo "Unknown BUILD_IMAGE '${BUILD_IMAGE}'; please specify --base-image" >&2; exit 1 ;; | ||
| esac |
* build-eic: support comma-separated --build-type (default,nightly)
Build both default and nightly images in a single sequential invocation
so that the four shared Docker stages (builder/runtime concretization and
installation of the default spack environment) are reused from BuildKit's
layer cache. Cuts redundant Spack compilation for nightly builds that
follow a default build on the same runner.
Key changes:
- build-eic.sh: change BUILD_TYPE default to "default,nightly"; split on
comma; validate each token; loop over types – resolving EIC package SHAs
and constructing the docker-buildx command per iteration. Shared setup
(benchmark SHAs, mirrors.yaml, SPACK_DUPLICATE_ALLOWLIST, ARCH) runs once
before the loop. Metadata files are written as
${METADATA_FILE%.json}-<build_type>.json; logs as build-<build_type>.log.
- .gitlab-ci.yml: restructure eic parallel matrix from 16 to 10 jobs by
using BUILD_TYPE="default,nightly" as a plain string (not a list dimension)
for environments that previously had separate default and nightly rows.
Drop ${BUILD_TYPE} from METADATA_FILE (script now appends it). Update
.build artifacts glob to build-*.log. Fix .nightly rules so that combined
jobs override BUILD_TYPE to "default" on stable branches (instead of
matching only the now-absent BUILD_TYPE=="nightly" string), while
nightly-only jobs (ci_without_acts) retain when:never semantics.
- .github/workflows/build-push.yml: set BUILD_TYPE="default,nightly" for
ci and xl matrix entries. Export one digest file per build type and
upload as separate artifacts. Add BUILD_TYPE dimension to eic-manifest
matrix; update artifact download pattern; add build-type-aware tags.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
4 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Add
build-base.shandbuild-eic.shas standalone scripts that encapsulate the fulldocker buildx buildinvocations previously embedded in.gitlab-ci.ymland.github/workflows/build-push.yml. These scripts are now the single source of truth for build logic — they are called by the CI and can be run directly by users for local builds.Motivation
The
.gitlab-ci.ymlpreviously contained 130+ lines of inlinedocker buildx buildcommands with ~20–30--build-argflags each. This made local builds difficult because:docs/building-locally.md"Full Build Script" was missing ~15--build-argentries (SHAs, cherry-picks, duplicate allowlists, etc.)Changes
New files
build-base.sh— buildsdebian_stable_base,cuda_devel,cuda_runtimebuild-eic.sh— builds any EIC environment (ci,xl,cuda,dbg, etc.)CI/local mode detection
Scripts auto-detect their context via
CI_REGISTRY:--push--load--cache-toto CI registrymirrors.yamlINTERNAL_TAG,EXPORT_TAG--tag localLocal usage
.gitlab-ci.yml
The
baseandeicjob scripts are each reduced to two lines:All CI matrix variables (
BUILD_IMAGE,ENV,BUILD_TYPE, etc.) flow through naturally as environment variables..github/workflows/build-push.yml
The
baseandeicjobs are reduced to, essentially:docs/building-locally.md
Replaced the incomplete inline script with references to
build-base.shandbuild-eic.sh, updated all examples and troubleshooting steps.