feat: extract build infrastructure into standalone scripts #227

Open
wdconinc wants to merge 36 commits into master from copilot/extract-build-scripts

@wdconinc wdconinc commented Apr 6, 2026

Summary

Add build-base.sh and build-eic.sh as standalone scripts that encapsulate the full docker buildx build invocations previously embedded in .gitlab-ci.yml and .github/workflows/build-push.yml. These scripts are now the single source of truth for build logic: CI calls them, and users can run them directly for local builds.

Motivation

The .gitlab-ci.yml previously contained 130+ lines of inline docker buildx build commands with ~20–30 --build-arg flags each. This made local builds difficult because:

  • The embedded scripts couldn't be run directly
  • The docs/building-locally.md "Full Build Script" was missing ~15 --build-arg entries (SHAs, cherry-picks, duplicate allowlists, etc.)
  • Any change to build logic required updating both CI and docs separately

Changes

New files

  • build-base.sh — builds debian_stable_base, cuda_devel, cuda_runtime
  • build-eic.sh — builds any EIC environment (ci, xl, cuda, dbg, etc.)

CI/local mode detection

Scripts auto-detect their context via CI_REGISTRY:

| Feature | CI mode | Local mode |
|---------|---------|------------|
| Output | `--push` | `--load` |
| Build cache write | `--cache-to` to CI registry | omitted |
| Build cache read | CI + ghcr.io registries | public ghcr.io only |
| `mirrors.yaml` | full template with credentials | public-only, no credentials |
| Image tags | `INTERNAL_TAG`, `EXPORT_TAG` | `--tag local` |
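
A minimal sketch of how this kind of detection could look, assuming only what the table states: a non-empty CI_REGISTRY selects CI mode. The function name `output_flags` and the cache reference layout are hypothetical and not taken from the scripts.

```bash
#!/usr/bin/env bash
# Hypothetical helper: choose docker buildx output flags from the environment.
# Assumption: a non-empty CI_REGISTRY means the script is running in CI.
output_flags() {
  if [ -n "${CI_REGISTRY:-}" ]; then
    # CI mode: push the result and write the build cache to the CI registry.
    # The cache ref layout below is illustrative only.
    printf '%s\n' "--push" \
      "--cache-to=type=registry,ref=${CI_REGISTRY}/cache:latest"
  else
    # Local mode: load the image into the local Docker daemon and tag it 'local'.
    printf '%s\n' "--load" "--tag" "local"
  fi
}

# Print the flags that would be appended to 'docker buildx build'.
output_flags
```

Keeping the decision in one function means the rest of the script does not need to know which mode it is in.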

Local usage

# Build base image
bash build-base.sh --jobs 8

# Build EIC XL environment
bash build-eic.sh --env xl --jobs 8

# Build EIC CI environment (faster)
bash build-eic.sh --env ci

.gitlab-ci.yml

The base and eic job scripts are each reduced to two lines:

script:
  - apk add git
  - bash build-<x>.sh

All CI matrix variables (BUILD_IMAGE, ENV, BUILD_TYPE, etc.) flow through naturally as environment variables.
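
As a rough illustration of that flow, environment variables can supply the defaults while command-line flags override them. This is a sketch, not the scripts' actual parser: `resolve_settings` is a hypothetical name and the fallback values ("ci", "default") are placeholders.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: environment variables supply defaults, flags override.
# The fallback values ("ci", "default") are placeholders for illustration.
resolve_settings() {
  local env="${ENV:-ci}" build_type="${BUILD_TYPE:-default}"
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --env)        env="$2"; shift 2 ;;
      --build-type) build_type="$2"; shift 2 ;;
      *) echo "unknown option: $1" >&2; return 1 ;;
    esac
  done
  echo "ENV=${env} BUILD_TYPE=${build_type}"
}

# In CI the matrix exports ENV; locally a flag does the same job.
ENV=xl resolve_settings
resolve_settings --env xl
```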

.github/workflows/build-push.yml

The base and eic jobs are reduced to, essentially:

      - name: Build and push
        env:
          BUILD_IMAGE: ${{ matrix.BUILD_IMAGE }}
          BASE_IMAGE: ${{ matrix.BASE_IMAGE }}
        run: bash build-base.sh

docs/building-locally.md

Replaced the incomplete inline script with references to build-base.sh and build-eic.sh, and updated all examples and troubleshooting steps to match.

Copilot AI review requested due to automatic review settings April 6, 2026 18:31

Copilot AI left a comment

Pull request overview

This PR extracts the long docker buildx build invocations from .gitlab-ci.yml into two standalone root-level scripts (build-base.sh, build-eic.sh) intended to be the single source of truth for both CI and local builds, and updates local build documentation to use these scripts.

Changes:

  • Add build-base.sh to build base images (Debian stable base and CUDA base/runtime).
  • Add build-eic.sh to build EIC environment images with CI/local mode behavior.
  • Update .gitlab-ci.yml and docs/building-locally.md to delegate build logic to the new scripts.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 7 comments.

| File | Description |
|------|-------------|
| `build-base.sh` | New script encapsulating the base image docker buildx build logic. |
| `build-eic.sh` | New script encapsulating the EIC environment image build logic (tagging, caching, secrets, mirrors). |
| `.gitlab-ci.yml` | Replaces large inline build command blocks with calls to the new scripts. |
| `docs/building-locally.md` | Updates local build instructions to use the new scripts and revises build-arg guidance. |

Comment thread build-eic.sh Outdated
Comment thread build-eic.sh Outdated
Comment thread build-eic.sh Outdated
Comment thread build-base.sh
Comment thread build-base.sh Outdated
Comment thread docs/building-locally.md
Comment thread docs/building-locally.md
Copilot AI review requested due to automatic review settings April 6, 2026 19:57

This comment was marked as resolved.

This comment was marked as resolved.

@wdconinc wdconinc force-pushed the copilot/extract-build-scripts branch from fafdb52 to 3ffa5ef Compare April 8, 2026 22:54
Copilot AI review requested due to automatic review settings April 8, 2026 23:38

This comment was marked as resolved.

Copilot AI review requested due to automatic review settings April 9, 2026 00:09

This comment was marked as resolved.

Copilot AI review requested due to automatic review settings April 9, 2026 14:44

This comment was marked as resolved.

Copilot AI review requested due to automatic review settings April 9, 2026 14:56

This comment was marked as resolved.

Copilot AI review requested due to automatic review settings April 9, 2026 15:03

This comment was marked as resolved.

wdconinc and others added 5 commits April 9, 2026 13:09
Add build-base.sh and build-eic.sh as standalone scripts that
encapsulate the full docker buildx build invocations previously
embedded in .gitlab-ci.yml. These scripts are now the single
source of truth for build logic and are called by the CI as well
as being directly usable by developers for local builds.

Key features:
- Auto-detect CI vs local mode via CI_REGISTRY env var
- CI mode: --push, --cache-to, full tag logic, credential mirrors
- Local mode: --load, public-only mirrors (no credentials needed)
- Use sed instead of envsubst for mirrors.yaml.in expansion
- All CI matrix variables (BUILD_IMAGE, ENV, BUILD_TYPE, etc.)
  flow through naturally as environment variables
- Safe quoting of multi-line SPACK_CHERRYPICKS via bash arrays

Also update docs/building-locally.md to reference the new scripts
and remove the incomplete inline build script that was missing
~15 --build-arg entries.
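
The bash-array quoting mentioned in the feature list can be sketched as follows. The cherry-pick values and the `jobs` build-arg name are made-up placeholders, and `count_args` is a hypothetical helper that stands in for the real docker call.

```bash
#!/usr/bin/env bash
# Sketch: keep a multi-line value intact as ONE argument by using an array.
# The values below are made-up placeholders.
SPACK_CHERRYPICKS=$'aaaa111\nbbbb222\ncccc333'

build_args=()
build_args+=(--build-arg "SPACK_CHERRYPICKS=${SPACK_CHERRYPICKS}")
build_args+=(--build-arg "jobs=8")

# "${build_args[@]}" expands to exactly one word per element, so the embedded
# newlines do not split the value. count_args only counts its arguments, which
# keeps this sketch free of side effects.
count_args() { echo "$#"; }
count_args docker buildx build "${build_args[@]}" .
```

Building the command as a string and relying on word splitting would break the multi-line value into several arguments, which is the failure mode the array avoids.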

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Wouter Deconinck <wdconinc@gmail.com>
Co-authored-by: Wouter Deconinck <wdconinc@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 14, 2026 17:31

This comment was marked as resolved.

wdconinc and others added 5 commits April 14, 2026 18:28
…:trixie-slim)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
GITHUB_BASE_REF is empty for push events, so CI_DEFAULT_BRANCH_SLUG was
always falling back to the hardcoded string 'master', degrading cache hits
on repos whose default branch is 'main' or anything else.

Pass DEFAULT_BRANCH from the workflow (github.event.repository.default_branch)
and use it as an intermediate fallback before 'master':

  CI_DEFAULT_BRANCH_SLUG=$(slugify "${GITHUB_BASE_REF:-${DEFAULT_BRANCH:-master}}")

On PR events GITHUB_BASE_REF is still used (the PR target branch).
On push events GITHUB_BASE_REF is empty and DEFAULT_BRANCH takes over.
The 'master' final fallback covers local runs with no CI env set.
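
The three cases above can be checked with a small sketch of the nested expansion. slugify is left out to keep the focus on the fallback order, and `pick_default_branch` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Sketch of the nested fallback only; pick_default_branch is a hypothetical name.
pick_default_branch() {
  echo "${GITHUB_BASE_REF:-${DEFAULT_BRANCH:-master}}"
}

# PR event: the PR target branch wins.
GITHUB_BASE_REF=develop DEFAULT_BRANCH=main pick_default_branch
# Push event: GITHUB_BASE_REF is empty, so DEFAULT_BRANCH takes over.
GITHUB_BASE_REF= DEFAULT_BRANCH=main pick_default_branch
# Local run: neither is set, so the final fallback applies.
GITHUB_BASE_REF= DEFAULT_BRANCH= pick_default_branch
```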

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two robustness issues in the mirrors.yaml generation:

1. sed substitution was not escaping replacement values, so registry URLs
   or paths containing '&', '\', or '|' could corrupt the output.
   Add sed_escape() helper that escapes those characters before injection.

2. mirrors.yaml was written into the source tree (SCRIPT_DIR), leaving a
   dirty working tree for local users and risking collisions when multiple
   builds run from the same checkout concurrently.
   Switch to mktemp under TMPDIR and register a trap to clean it up on exit.
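
Both fixes can be sketched together. This is an illustration under assumptions rather than the code from the commit: the `@REGISTRY@` placeholder, the `render_mirrors` helper, and the example registry value are hypothetical, and the real mirrors.yaml.in may use a different placeholder syntax.

```bash
#!/usr/bin/env bash
# Escape characters that are special in a sed replacement when '|' is the
# delimiter: backslash, ampersand, and the delimiter itself.
sed_escape() {
  printf '%s' "$1" | sed -e 's/[\\&|]/\\&/g'
}

# Hypothetical template expansion. The @REGISTRY@ placeholder is made up for
# this sketch; the real template may look different.
render_mirrors() {
  printf 'url: @REGISTRY@\n' | sed -e "s|@REGISTRY@|$(sed_escape "$1")|"
}

# Write the result to a temporary file outside the source tree and remove it
# when the script exits. The template ends in XXXXXX and the directory is
# given explicitly, falling back to /tmp when TMPDIR is unset or empty.
mirrors_file="$(mktemp "${TMPDIR:-/tmp}/mirrors.yaml.XXXXXX")"
trap 'rm -f "${mirrors_file}"' EXIT

render_mirrors 'registry.example.com/a&b' > "${mirrors_file}"
cat "${mirrors_file}"
```

Without the escaping step, an `&` in the value would be expanded by sed to the matched placeholder text, and a `|` would end the substitution early.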

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The fallback is triggered when either BUILDER_IMAGE or RUNTIME_IMAGE (or
both) are absent locally, but the old message only named BUILDER_IMAGE,
leaving users guessing which image was actually missing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
build-eic.sh only passes CI_REGISTRY_USER/CI_REGISTRY_PASSWORD/
GITHUB_REGISTRY_USER/GITHUB_REGISTRY_TOKEN as secrets in CI mode.
Local builds omit them, but the Dockerfile unconditionally declared
those secret mounts as required, causing 'secret not found' failures
for anyone running a local build.

Add required=false to all four credential secret mounts in the three
RUN installation steps. The mirrors secret is always provided (via
mktemp in both CI and local mode) and remains required.
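
On the script side, the matching logic might look like the sketch below. It keys on whether each credential variable is set, which is a simplification of the CI-mode check described above. The secret ids and the `secret_flags` helper are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Sketch: collect --secret flags in an array, adding credentials only when
# they are available. Paired with required=false on the Dockerfile mounts,
# the build can then run with or without them.
# Assumption: each secret id equals the name of its environment variable.
secret_flags() {
  local mirrors_file="$1"
  local flags=(--secret "id=mirrors,src=${mirrors_file}")
  local name
  for name in CI_REGISTRY_USER CI_REGISTRY_PASSWORD \
              GITHUB_REGISTRY_USER GITHUB_REGISTRY_TOKEN; do
    # ${!name} is bash indirect expansion: the value of the variable named by $name.
    if [ -n "${!name:-}" ]; then
      flags+=(--secret "id=${name},env=${name}")
    fi
  done
  printf '%s\n' "${flags[@]}"
}

# Prints one flag or value per line; the mirrors secret is always present.
secret_flags /tmp/mirrors.yaml
```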

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@wdconinc wdconinc requested review from plexoos and veprbl April 14, 2026 23:44

veprbl commented Apr 15, 2026

Fails on eicweb:

$ bash build-eic.sh
mktemp: : Invalid argument

@wdconinc (Contributor, Author)

> Fails on eicweb:
>
> $ bash build-eic.sh
> mktemp: : Invalid argument

ah shucks. alpine, never has what you need.

Comment thread .gitlab-ci.yml Outdated
Copilot AI review requested due to automatic review settings April 15, 2026 19:14

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated 4 comments.

Comment thread docs/building-locally.md Outdated
Comment on lines +157 to +164
env:
  BUILD_IMAGE: ${{ matrix.BUILD_IMAGE }}
  BASE_IMAGE: ${{ matrix.BASE_IMAGE }}
  PLATFORM: ${{ matrix.PLATFORM }}
  BUILDWEEK: ${{ needs.env.outputs.BUILDWEEK }}
  DEFAULT_BRANCH: ${{ github.event.repository.default_branch }}
  METADATA_FILE: /tmp/build-metadata.json
run: bash build-base.sh
Comment thread build-eic.sh
Comment thread build-base.sh
Copilot AI review requested due to automatic review settings April 15, 2026 21:05

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated 4 comments.

Comment thread docs/building-locally.md Outdated
Comment thread docs/building-locally.md Outdated
Comment thread build-base.sh
## Mirrors GitLab's CI_COMMIT_REF_SLUG: lowercase, non-alnum runs → '-',
## strip leading/trailing '-', truncate to 63 chars.
slugify() {
  echo "$1" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]\+/-/g; s/^-//; s/-$//' | cut -c1-63
Comment thread build-eic.sh
## Mirrors GitLab's CI_COMMIT_REF_SLUG: lowercase, non-alnum runs → '-',
## strip leading/trailing '-', truncate to 63 chars.
slugify() {
  echo "$1" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]\+/-/g; s/^-//; s/-$//' | cut -c1-63
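
For comparison, here is a sketch of a variant that addresses two details in the excerpt above. Because the excerpt truncates after stripping, a slug cut at 63 characters can still end in '-', and `\+` in a basic regular expression is a GNU extension rather than POSIX. This is an illustrative alternative, not the version that landed in the PR.

```bash
#!/usr/bin/env bash
# Sketch: lowercase, collapse non-alphanumeric runs to '-', truncate to 63
# characters, THEN strip leading/trailing '-' so truncation cannot leave one.
# 'x x*' is the POSIX spelling of 'one or more x'.
slugify() {
  printf '%s\n' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9][^a-z0-9]*/-/g' \
    | cut -c1-63 \
    | sed -e 's/^-*//' -e 's/-*$//'
}

slugify 'Feature/Foo_Bar'
slugify 'copilot/extract-build-scripts'
```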
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 15, 2026 22:20

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated 3 comments.

Comment thread build-eic.sh
Comment on lines +99 to +106
  ## GITHUB_REF_NAME (current branch) and GITHUB_BASE_REF (PR target branch,
  ## empty on push events) are standard runner variables. DEFAULT_BRANCH should
  ## be supplied by the workflow (github.event.repository.default_branch) so
  ## that cache keys are correct even when GITHUB_BASE_REF is empty.
  CI_MODE="github"
  CI_REGISTRY="${GH_REGISTRY}"
  CI_PROJECT_PATH="${GH_REGISTRY_USER}"
  CI_COMMIT_REF_SLUG="$(slugify "${GITHUB_REF_NAME:-master}")"
Comment thread docs/building-locally.md
Comment on lines 179 to 184
### Base Image (containers/debian/Dockerfile)

| Argument | Description | Default |
|----------|-------------|---------|
| `BASE_IMAGE` | Base Debian image | `debian:stable-slim` |
| `BASE_IMAGE` | Base Debian image | `debian:trixie-slim` (derived from `--image`) |
| `SPACK_ORGREPO` | Spack GitHub org/repo | `spack/spack` |
Comment thread build-base.sh
Comment on lines +86 to +107
  ## GITHUB_REF_NAME (current branch) and GITHUB_BASE_REF (PR target branch,
  ## empty on push events) are standard runner variables. DEFAULT_BRANCH should
  ## be supplied by the workflow (github.event.repository.default_branch) so
  ## that cache keys are correct even when GITHUB_BASE_REF is empty.
  CI_MODE="github"
  CI_REGISTRY="${GH_REGISTRY}"
  CI_PROJECT_PATH="${GH_REGISTRY_USER}"
  CI_COMMIT_REF_SLUG="$(slugify "${GITHUB_REF_NAME:-master}")"
  CI_DEFAULT_BRANCH_SLUG="$(slugify "${GITHUB_BASE_REF:-${DEFAULT_BRANCH:-master}}")"
  INTERNAL_TAG="${INTERNAL_TAG:-pipeline-${GITHUB_RUN_ID}}"
else
  CI_MODE="local"
fi

## Derive BASE_IMAGE from BUILD_IMAGE if not provided
if [ -z "${BASE_IMAGE}" ]; then
  case "${BUILD_IMAGE}" in
    debian_stable_base) BASE_IMAGE="debian:trixie-slim" ;;
    cuda_devel) BASE_IMAGE="nvidia/cuda:${CUDA_VERSION}-devel-${CUDA_OS}" ;;
    cuda_runtime) BASE_IMAGE="nvidia/cuda:${CUDA_VERSION}-runtime-${CUDA_OS}" ;;
    *) echo "Unknown BUILD_IMAGE '${BUILD_IMAGE}'; please specify --base-image" >&2; exit 1 ;;
  esac
* build-eic: support comma-separated --build-type (default,nightly)

Build both default and nightly images in a single sequential invocation
so that the four shared Docker stages (builder/runtime concretization and
installation of the default spack environment) are reused from BuildKit's
layer cache.  Cuts redundant Spack compilation for nightly builds that
follow a default build on the same runner.

Key changes:
- build-eic.sh: change BUILD_TYPE default to "default,nightly"; split on
  comma; validate each token; loop over types – resolving EIC package SHAs
  and constructing the docker-buildx command per iteration.  Shared setup
  (benchmark SHAs, mirrors.yaml, SPACK_DUPLICATE_ALLOWLIST, ARCH) runs once
  before the loop.  Metadata files are written as
  ${METADATA_FILE%.json}-<build_type>.json; logs as build-<build_type>.log.
- .gitlab-ci.yml: restructure eic parallel matrix from 16 to 10 jobs by
  using BUILD_TYPE="default,nightly" as a plain string (not a list dimension)
  for environments that previously had separate default and nightly rows.
  Drop ${BUILD_TYPE} from METADATA_FILE (script now appends it).  Update
  .build artifacts glob to build-*.log.  Fix .nightly rules so that combined
  jobs override BUILD_TYPE to "default" on stable branches (instead of
  matching only the now-absent BUILD_TYPE=="nightly" string), while
  nightly-only jobs (ci_without_acts) retain when:never semantics.
- .github/workflows/build-push.yml: set BUILD_TYPE="default,nightly" for
  ci and xl matrix entries.  Export one digest file per build type and
  upload as separate artifacts.  Add BUILD_TYPE dimension to eic-manifest
  matrix; update artifact download pattern; add build-type-aware tags.
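
The split, validate, and loop steps described for build-eic.sh could be sketched roughly as below. The valid types (default, nightly) and the output naming come from the commit message. The `plan_builds` helper is hypothetical and only prints what each iteration would use instead of invoking docker.

```bash
#!/usr/bin/env bash
# Sketch: split a comma-separated BUILD_TYPE, validate every token first, then
# derive the per-type metadata and log names for each iteration.
plan_builds() {
  local build_types="$1" metadata_file="$2"
  local types type
  IFS=',' read -r -a types <<< "${build_types}"
  for type in "${types[@]}"; do
    case "${type}" in
      default|nightly) ;;
      *) echo "invalid build type: '${type}'" >&2; return 1 ;;
    esac
  done
  for type in "${types[@]}"; do
    # ${metadata_file%.json} strips the suffix so the type can be inserted.
    echo "${type}: metadata=${metadata_file%.json}-${type}.json log=build-${type}.log"
  done
}

plan_builds "default,nightly" /tmp/build-metadata.json
```

Validating all tokens before the first build avoids spending a full default build on an invocation that would fail at the second type.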

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@wdconinc wdconinc linked an issue Apr 23, 2026 that may be closed by this pull request

Development

Successfully merging this pull request may close these issues.

command to build eic images

4 participants