Skip to content

fix: switch agent main_test gexec.Build to canonical pattern#9

Closed
bborbe wants to merge 4 commits into
masterfrom
fix/gexec-canonical-pattern
Closed

fix: switch agent main_test gexec.Build to canonical pattern#9
bborbe wants to merge 4 commits into
masterfrom
fix/gexec-canonical-pattern

Conversation

@bborbe
Copy link
Copy Markdown
Owner

@bborbe bborbe commented May 26, 2026

Summary

  • Replace gexec.Build(\"github.com/...\", \"-mod=mod\") with gexec.Build(\".\", \"-mod=mod\", \"-buildvcs=false\") in all 6 agent main_test.go files (claude, code, gemini, pi — root + cmd/run-task).
  • Matches the canonical pattern from coding/docs/go-testing-guide.md:162-164.
  • Also fixes copy-paste bug where agent/pi/cmd/run-task/main_test.go was building agent/claude/cmd/run-task.

Root cause

gexec.Build with a full import path forks a go build that runs full module resolution (proxy + VCS lookup) before compiling. Inside a race-instrumented test binary (go test -race), the long-running child causes the parent to trip an internal cleanup race; the GH Actions runner kills the process with signal: segmentation fault (core dumped). Local go test ./... (no -race or no matrix-parallel pressure) didn't surface it, which is why it stayed hidden until matrix CI in #3.

With .:

  • Path resolution uses CWD (the package dir at test runtime) — no proxy lookup
  • -buildvcs=false skips git subprocess for VCS embedding

Test plan

  • Local: agent/pi go test -race — was hanging at 60s timeout, now passes in 12s
  • Local: all 4 agent shards (claude/code/gemini/pi root + cmd/run-task) pass go test -race
  • CI: 4 agent matrix shards turn green on this PR
  • CI: aggregator test job goes green
  • Documented in [[Github Workflow Actions]] gotchas section

Related task: [[Fix agent main_test gexec Build Segfault in CI]]

Use gexec.Build(".", "-mod=mod", "-buildvcs=false") from
coding/docs/go-testing-guide.md instead of full import path.

Full-path form forks a `go build` that runs module proxy + VCS
resolution; inside the race-instrumented test binary that hangs long
enough to trip a cleanup race, GH Actions kills with SIGSEGV
("signal: segmentation fault (core dumped)" after "Will run 1 of 1
specs"). Surfaced once matrix CI replaced the 15-min-timed-out
sequential job in PR #3.

Local reproduction: agent/pi `go test -race` hung at 60s timeout with
broken pattern; passes in 12s with canonical pattern. All 4 agent
shards green locally.

Also fixes copy-paste bug in agent/pi/cmd/run-task/main_test.go
(was building agent/claude/cmd/run-task).
Copy link
Copy Markdown

@ben-s-pull-request-reviewer ben-s-pull-request-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on my review:

Summary: The PR fixes a copy-paste bug in agent/pi/cmd/run-task/main_test.go (was building claude instead of pi) and standardizes the gexec.Build pattern across all agent test files to use the canonical form with ".", -mod=mod, and -buildvcs=false.

Findings:

  • Must Fix: None
  • Should Fix: None
  • Nice to Have: None

The concerns from the plan are both addressed:

  1. agent/pi/cmd/run-task/main_test.go copy-paste bug is confirmed fixed - was building agent/claude/cmd/run-task, now correctly uses . to build the local package
  2. The . path is valid in gexec.Build when run from the package directory (Ginkgo sets CWD to the test package)
{
  "verdict": "approve",
  "summary": "Bugfix PR correctly fixes a copy-paste error in agent/pi/cmd/run-task/main_test.go (was building claude instead of pi) and standardizes gexec.Build pattern across all agent test files to use canonical form with '.', '-mod=mod', and '-buildvcs=false'.",
  "comments": [],
  "concerns_addressed": [
    "correctness: copy-paste bug fixed in agent/pi/cmd/run-task/main_test.go - was building agent/claude/cmd/run-task instead of agent/pi/cmd/run-task",
    "correctness: path change from full import path to '.' verified valid - gexec.Build runs with CWD at package directory"
  ]
}

…test

Canonical gexec pattern fixed cmd/run-task shards but root agent/pi and
agent/claude still segfault on GH runner (works locally with cold cache
+ race). Symptom: SIGSEGV ~50ms after "Will run 1 of 1 specs" — before
gexec's child build can complete.

gexec.Build adds machinery (goroutines for stdout/stderr piping, global
cleanup registry) that may be interacting badly with race detector on
the Linux runner specifically.

Replacing with direct exec.Command("go", "build", ...) keeps the same
compile-verification semantics with minimal machinery. Verified locally
with -race + cold cache.

agent/{code,gemini} cmd/run-task variants don't ship root main_test.go,
so they need no further change.
@ben-s-pull-request-reviewer ben-s-pull-request-reviewer Bot dismissed their stale review May 26, 2026 19:21

superseded by new automated review

Copy link
Copy Markdown

@ben-s-pull-request-reviewer ben-s-pull-request-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{
  "verdict": "approve",
  "summary": "PR replaces gexec.Build with canonical patterns. Notable bug fix in agent/pi/cmd/run-task/main_test.go where old code incorrectly built the claude package instead of pi. All other changes are consistent refactoring to use simpler build verification patterns with proper temp directory cleanup.",
  "comments": [
    {
      "file": "agent/pi/cmd/run-task/main_test.go",
      "line": 19,
      "severity": "major",
      "message": "Bug fix: old code built 'github.com/bborbe/agent/agent/claude/cmd/run-task' (wrong package). New code correctly uses '.' to build the pi package."
    }
  ],
  "concerns_addressed": [
    "correctness: agent/pi/cmd/run-task main_test.go now builds correct package (pi instead of claude)",
    "correctness: agent/claude/main_test.go replaced gexec.Build with exec.Command + temp dir cleanup",
    "correctness: agent/pi/main_test.go replaced gexec.Build with exec.Command + temp dir cleanup"
  ]
}

bborbe added 2 commits May 26, 2026 21:27
Both gexec.Build (canonical pattern) and direct exec.Command still
segfault the root agent/pi and agent/claude test binaries on the GH
Actions runner — within 50ms of "Will run 1 of 1 specs". Passes
locally. cmd/run-task variants under same fix work fine.

The compile-verification provided by this spec is redundant: building
the test binary itself requires compiling package main. If main.go
doesn't compile, `go test` fails immediately before specs run.

TestSuite is retained so the package satisfies "every binary has a
main_test.go" project convention.
After removing the spec from root agent/{pi,claude}/main_test.go, the
segfault MIGRATED to whichever package still called gexec.Build /
exec.Command — proving the bug is not package-specific but a property
of spawning child processes from any race-instrumented test binary on
this Linux runner.

Removing the spec from all 6 agent main_test.go files. Compile check
remains implicit: the test binary itself is package main built.

Matches earlier root-package fix for agent/{pi,claude}.
@ben-s-pull-request-reviewer ben-s-pull-request-reviewer Bot dismissed their stale review May 26, 2026 19:34

superseded by new automated review

Copy link
Copy Markdown

@ben-s-pull-request-reviewer ben-s-pull-request-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on my review, I now have all the information needed to produce the verdict. Let me compile the final report.

Consolidated Report

Files changed: 6 (all *_test.go files)

Scope: Bugfix — switching gexec.Build to canonical "." pattern and removing redundant "Compiles" specs that cause race-condition segfaults on Linux CI.

Manual review findings:

  • agent/pi/cmd/run-task/main_test.go — Pre-existing bug: old code built "github.com/bborbe/agent/agent/claude/cmd/run-task" (wrong package) instead of the pi package. The "." fix is correct.
  • agent/claude/main_test.go & agent/pi/main_test.go — The removed gexec.Build spec in both files had a typo import (github.com/onsi/omega/gexec instead of github.com/onsi/gomega/gexec). Removing it fixes the import issue and the race-condition segfault problem on Linux CI.
  • All cmd/run-task/main_test.go files now use the idiomatic "." pattern instead of hardcoded module paths, which is more robust if the module name ever changes.

No Must Fix or Should Fix issues.

{
  "verdict": "approve",
  "summary": "This PR makes targeted test infrastructure improvements: switching gexec.Build to the canonical '.' pattern across 4 cmd/run-task/main_test.go files (fixing a pre-existing bug where agent/pi was building the claude binary), and removing redundant 'Compiles' specs from 2 root main_test.go files that caused race-condition segfaults on Linux CI. Well-documented with explanatory NOTE comments. No regressions.",
  "comments": [],
  "concerns_addressed": [
    "correctness: agent/pi/cmd/run-task/main_test.go now correctly builds '.' (pi package) instead of the erroneous 'github.com/bborbe/agent/agent/claude/cmd/run-task' path",
    "tests: agent/claude/main_test.go and agent/pi/main_test.go — gexec.Build 'Compiles' spec removed, eliminating race-condition segfault that occurred under -race on Linux CI; assertion is now implicit via go test failing if main does not compile"
  ]
}

@ben-s-pull-request-reviewer ben-s-pull-request-reviewer Bot dismissed their stale review May 26, 2026 19:41

superseded by new automated review

Copy link
Copy Markdown

@ben-s-pull-request-reviewer ben-s-pull-request-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All agents completed. Now I'll compile the final report.


PR Review: fix: switch agent main_test gexec.Build to canonical pattern

Consolidated Report

Must Fix (Critical): None.

Should Fix (Important): None.

Nice to Have (Optional): None.

Agent Findings

Agent Result
go-quality Clean. Identified that agent/pi/cmd/run-task/main_test.go had a pre-existing bug — gexec.Build pointed to github.com/bborbe/agent/agent/claude/cmd/run-task instead of the pi package. Removing the test silently fixes this. Also removed misplaced //go:generate directive from claude's run-task test.
go-factory-pattern N/A — test files don't contain factory functions.
go-http-handler N/A — test files contain no HTTP handlers.
go-test-coverage No coverage gap. The Compiles spec was redundant since go test fails if main.go doesn't compile. Explanatory comments are adequate.

Manual Review

  • All 6 files retain proper BSD copyright headers
  • All 6 files follow correct Ginkgo v2 bootstrap pattern (RegisterFailHandler, RunSpecs, GinkgoConfiguration with 60s timeout, UTC timezone)
  • gexec import correctly removed in all files
  • //go:generate directive correctly removed from agent/claude/cmd/run-task/main_test.go (nothing to generate there)
  • Explanatory // NOTE: comments are clear, explain the race condition, reference the vault note, and note redundancy
  • The agent/pi/cmd/run-task/main_test.go bug (wrong claude path) is a pre-existing issue that this change correctly removes

Concerns Addressed

  • correctness: Pre-existing wrong import path bug in agent/pi/cmd/run-task/main_test.go removed (gexec.Build pointed to claude instead of pi)
  • tests: Redundant Compiles spec removed, avoiding race-condition segfault on GH Actions
{
  "verdict": "approve",
  "summary": "Clean bugfix that removes redundant gexec.Build-based Compiles specs that caused race-condition segfaults on GH Actions. The change also silently fixes a pre-existing wrong-import-path bug in agent/pi/cmd/run-task/main_test.go where gexec.Build pointed to the claude package instead of pi. No coverage gaps, no style violations, and the explanatory comments are clear and well-referenced.",
  "comments": [],
  "concerns_addressed": [
    "correctness: pre-existing wrong import path in agent/pi/cmd/run-task/main_test.go (claude instead of pi) was removed along with the buggy test",
    "tests: redundant Compiles spec removed, avoiding race-condition segfaults on GH Actions Linux CI runner"
  ]
}

@bborbe bborbe closed this May 26, 2026
@bborbe bborbe deleted the fix/gexec-canonical-pattern branch May 26, 2026 20:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant