feat(config): unified clean errors for bad/malformed/empty config_paths (#1205 #8/#12; #1488/#1489/#1490)#1609

Open
wprazuch wants to merge 1 commit into wprazuch/ng-test-concurrency from wprazuch/config-load-errors

@wprazuch wprazuch commented Jun 16, 2026

Contributor

What

One coherent fix for the three config_paths failure modes, all failing fast with a clean message and no traceback:

| Issue | Bad input | Before | After |
| --- | --- | --- | --- |
| #1488 | typo'd / missing path | 7-frame FileNotFoundError | Error: config_paths entry '...' was not found. Looked in: ... |
| #1490 | scalar instead of list | raw Pydantic ValidationError (+pydantic.dev URL) | Error: 'config_paths' must be a list of paths ... +config_paths=[...] |
| #1489 | empty / omitted (zero servers) | Ray starts, hangs until SIGTERM | Error: No server instances are configured ... (before Ray) |
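
For readers who want a concrete picture of the first two rows, here is a minimal sketch of the shape check and the path lookup. It is not the code in this PR: the function names `candidate_locations` and `resolve_config_paths`, the `search_roots` parameter, and the exact message wording are assumptions made for illustration. Only the exception class names come from the commit messages in this thread.

```python
from pathlib import Path


class ConfigError(Exception):
    """Base class for configuration errors meant to be shown to the user."""


class ConfigPathNotFoundError(ConfigError):
    """A config_paths entry does not exist in any searched location."""


class MalformedConfigPathsError(ConfigError):
    """config_paths is not a list of paths (for example, a bare string)."""


def candidate_locations(entry: str, search_roots: list[Path]) -> list[Path]:
    """Return the de-duplicated locations where `entry` could live (hypothetical helper)."""
    path = Path(entry)
    if path.is_absolute():
        # An absolute path is checked as-is; search roots do not apply.
        return [path]
    candidates: list[Path] = []
    for root in search_roots:
        candidate = root / path
        if candidate not in candidates:  # avoid listing the same location twice
            candidates.append(candidate)
    return candidates


def resolve_config_paths(config_paths: object, search_roots: list[Path]) -> list[Path]:
    """Validate the shape of config_paths, then resolve each entry to an existing file."""
    # A scalar such as "a.yaml" is rejected up front instead of failing deep in validation.
    if not isinstance(config_paths, (list, tuple)):
        raise MalformedConfigPathsError(
            f"'config_paths' must be a list of paths, got {type(config_paths).__name__}; "
            "pass it as +config_paths=[...]"
        )
    resolved: list[Path] = []
    for entry in config_paths:
        candidates = candidate_locations(str(entry), search_roots)
        found = next((c for c in candidates if c.is_file()), None)
        if found is None:
            looked_in = ", ".join(str(c) for c in candidates)
            raise ConfigPathNotFoundError(
                f"config_paths entry '{entry}' was not found. Looked in: {looked_in}"
            )
        resolved.append(found)
    return resolved
```

Listing every location that was tried in the error message is what lets the user spot a typo without reading a traceback.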

How

Supersedes #1510

This consolidates #1510 (which covered the same three issues via raise SystemExit). Differences here: typed exceptions that ng_validate can catch/format (a bare SystemExit would escape it), a validated zero-server check, the guard in start() so e2e is covered too, and deterministic tests.
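
The difference between typed exceptions and a bare `SystemExit` can be sketched as follows. This is an illustration, not the PR's code: `validate` is a hypothetical stand-in for a tool like ng_validate, and the decorator body is an assumption that prints with plain `print` rather than the rich-escaped output described in the commit message. The names `ConfigError`, `NoServerInstancesError`, and `exit_cleanly_on_config_error` are taken from this thread.

```python
import functools
import sys


class ConfigError(Exception):
    """Base class for configuration errors meant to be shown to the user."""


class NoServerInstancesError(ConfigError):
    """No server instances are configured."""


def exit_cleanly_on_config_error(func):
    """Turn a ConfigError into a one-line message and exit code 1 (sketch)."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except ConfigError as exc:
            # Only ConfigError is handled; any other exception propagates with its traceback.
            print(f"Error: {exc}", file=sys.stderr)
            raise SystemExit(1) from None

    return wrapper


def validate(load_config):
    """Hypothetical validator that reports config problems instead of exiting."""
    try:
        load_config()
    except ConfigError as exc:
        return f"invalid: {exc}"
    return "ok"
```

Because `SystemExit` derives from `BaseException` rather than `Exception`, a loader that raises it directly bypasses the `except ConfigError` clause in `validate` and terminates the process. Raising a `ConfigError` subclass and converting it to an exit only at the CLI boundary keeps both callers working.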

Tests

- test_global_config.py: path-not-found (both-locations / dedup / absolute), malformed config_paths, zero-server (raise + pass).
- test_cli.py: the decorator (ConfigError → clean exit; non-ConfigError propagates; success passes through).
- Smoke test: all three cases via ng_run, with clean messages, examples intact, and 0 traceback lines.
- Results: 51/51 test_global_config + test_cli tests pass; ruff + pre-commit clean.
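
As a rough idea of how the zero-server "raise + pass" cases can be tested deterministically, the sketch below injects the cluster startup as a callable so the test can assert it was never reached. `RunHelperSketch`, its fields, and `init_cluster` are hypothetical names for illustration; the real `RunHelper.start()` and its Ray startup are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable


class ConfigError(Exception):
    """Base class for configuration errors meant to be shown to the user."""


class NoServerInstancesError(ConfigError):
    """No server instances are configured."""


@dataclass
class RunHelperSketch:
    """Hypothetical stand-in for a run helper with a fail-fast guard in start()."""

    server_instances: dict[str, dict]
    init_cluster: Callable[[], None]  # stand-in for the expensive cluster startup
    started: bool = False

    def start(self) -> None:
        # Guard first: with zero servers there is nothing to run, so fail
        # before any cluster work begins instead of hanging afterwards.
        if not self.server_instances:
            raise NoServerInstancesError("No server instances are configured.")
        self.init_cluster()
        self.started = True


def run_start(server_instances: dict[str, dict]) -> tuple[bool, int]:
    """Call start() and report (raised, number of cluster startups)."""
    calls: list[str] = []
    helper = RunHelperSketch(server_instances, lambda: calls.append("init"))
    try:
        helper.start()
    except NoServerInstancesError:
        return True, len(calls)
    return False, len(calls)
```

Injecting the startup callable keeps the test free of any real cluster, which is what makes the "raise" case checkable without waiting on a hang.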

copy-pr-bot Bot commented Jun 16, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@wprazuch

Contributor Author

#1510 is similar to this PR.

@wprazuch wprazuch changed the title from "feat(config): actionable errors for bad config_paths and zero-server runs (#1205 friction #8/#12)" to "feat(config): unified clean errors for bad/malformed/empty config_paths (#1205 #8/#12; #1488/#1489/#1490)" Jun 16, 2026
wprazuch added a commit that referenced this pull request Jun 17, 2026
Ports #1609 onto the unified gym CLI (#1434). Covers epic #1205 friction
#8 + #12 and issues #1488/#1489/#1490:

- #1488: missing config_paths entry -> ConfigPathNotFoundError
- #1490: malformed (non-list) config_paths -> MalformedConfigPathsError
- #1489: zero configured servers -> NoServerInstancesError, in RunHelper.start()
  before Ray (covers env run + eval run / e2e)

All subclass ConfigError; a cli/env.py decorator (exit_cleanly_on_config_error,
also applied to e2e_rollout_collection in cli/eval.py) turns them into a clean,
rich-escaped message + exit 1, no traceback. Targets martas/1434.

Signed-off-by: Wojciech Prazuch <wprazuch@nvidia.com>
@wprazuch wprazuch force-pushed the wprazuch/config-load-errors branch from f22a409 to 205816b Compare June 17, 2026 08:46
@wprazuch wprazuch requested a review from a team as a code owner June 17, 2026 08:46
@wprazuch wprazuch changed the base branch from main to martas/1434 June 17, 2026 08:46
wprazuch added a commit that referenced this pull request Jun 17, 2026
Shared CI fixes for the martas/1434-stacked CLI work: pin uv (0.11.20 drops
pinned deps -> 7 servers fail; = #1576) and pull main's graphwalks
example_rollouts.jsonl (fixes its data validation). This branch is the base
for the ng_validate (#1599) and config-error (#1609) PRs so the fixes live in
one place. Drop when martas/1434 rebases on main.

Signed-off-by: Wojciech Prazuch <wprazuch@nvidia.com>
Ports #1609 onto the shared CLI base (martas/1434 + uv pin + concurrency).
Covers epic #1205 friction #8 + #12 and issues #1488/#1489/#1490:
ConfigPathNotFoundError, MalformedConfigPathsError, NoServerInstancesError
(all ConfigError); fail-fast guard in RunHelper.start(); exit_cleanly_on_config_error
decorator on run()/e2e_rollout_collection() -> clean message, no traceback.

Signed-off-by: Wojciech Prazuch <wprazuch@nvidia.com>
@wprazuch wprazuch force-pushed the wprazuch/config-load-errors branch from 205816b to 95b37ea Compare June 17, 2026 09:33
@wprazuch wprazuch changed the base branch from martas/1434 to wprazuch/ng-test-concurrency June 17, 2026 09:33