feat(config): unified clean errors for bad/malformed/empty config_paths (#1205 #8/#12; #1488/#1489/#1490) #1609
Open
wprazuch wants to merge 1 commit into
#1510 is similar to this PR.
wprazuch added a commit that referenced this pull request Jun 17, 2026
Ports #1609 onto the unified gym CLI (#1434). Covers epic #1205 friction #8 + #12 and issues #1488/#1489/#1490:
- #1488: missing config_paths entry -> ConfigPathNotFoundError
- #1490: malformed (non-list) config_paths -> MalformedConfigPathsError
- #1489: zero configured servers -> NoServerInstancesError, in RunHelper.start() before Ray (covers env run + eval run / e2e)

All subclass ConfigError; a cli/env.py decorator (exit_cleanly_on_config_error, also applied to e2e_rollout_collection in cli/eval.py) turns them into a clean, rich-escaped message + exit 1, no traceback. Targets martas/1434.

Signed-off-by: Wojciech Prazuch <wprazuch@nvidia.com>
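For readers following the commit message, here is a minimal sketch of what a typed hierarchy like this could look like. The class names come from the commit message; the class bodies, the helper names `resolve_config_path` and `check_config_paths`, the search-directory argument, and the exact wording of the malformed-list message are assumptions made for illustration. A plain file-existence check stands in for the `OmegaConf.load` wrapping that the PR actually does.

```python
from pathlib import Path


class ConfigError(Exception):
    """Base class for user-facing configuration errors."""


class ConfigPathNotFoundError(ConfigError):
    """A config_paths entry was not found in any searched location."""


class MalformedConfigPathsError(ConfigError):
    """config_paths was not a list of paths."""


class NoServerInstancesError(ConfigError):
    """The config defines zero server instances."""


def resolve_config_path(entry: str, search_dirs: list[Path]) -> Path:
    # Hypothetical helper (not the PR's code): try each directory in order
    # and remember every distinct candidate so the error can list them.
    tried: list[Path] = []
    for base in search_dirs:
        # Joining onto an absolute entry yields the entry itself, so
        # absolute paths collapse to a single deduplicated candidate.
        candidate = base / entry
        if candidate not in tried:
            tried.append(candidate)
        if candidate.is_file():
            return candidate
    looked = ", ".join(str(p) for p in tried)
    raise ConfigPathNotFoundError(
        f"config_paths entry {entry!r} was not found. Looked in: {looked}"
    )


def check_config_paths(value: object) -> list[str]:
    # Hypothetical stand-in for the validation step: accept only a
    # list/tuple of strings. The message wording here is an assumption.
    if not isinstance(value, (list, tuple)) or not all(
        isinstance(item, str) for item in value
    ):
        raise MalformedConfigPathsError(
            f"'config_paths' must be a list of paths, got {type(value).__name__}"
        )
    return list(value)
```

Because every class derives from `ConfigError`, a caller can catch the whole family with one `except ConfigError` clause while still telling the three cases apart by type.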
f22a409 to
205816b
wprazuch added a commit that referenced this pull request Jun 17, 2026
Shared CI fixes for the martas/1434-stacked CLI work: pin uv (0.11.20 drops pinned deps -> 7 servers fail; = #1576) and pull main's graphwalks example_rollouts.jsonl (fixes its data validation). This branch is the base for the ng_validate (#1599) and config-error (#1609) PRs so the fixes live in one place. Drop when martas/1434 rebases on main. Signed-off-by: Wojciech Prazuch <wprazuch@nvidia.com>
Ports #1609 onto the shared CLI base (martas/1434 + uv pin + concurrency). Covers epic #1205 friction #8 + #12 and issues #1488/#1489/#1490: ConfigPathNotFoundError, MalformedConfigPathsError, NoServerInstancesError (all ConfigError); fail-fast guard in RunHelper.start(); exit_cleanly_on_config_error decorator on run()/e2e_rollout_collection() -> clean message, no traceback. Signed-off-by: Wojciech Prazuch <wprazuch@nvidia.com>
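The decorator behavior described in this commit can be sketched roughly as follows. Only the name `exit_cleanly_on_config_error` and the `ConfigError` base come from the PR; the body is an assumption. In particular, this sketch prints plain text to stderr with an `Error: ` prefix, whereas the PR says the real decorator rich-escapes the message.

```python
import functools
import sys


class ConfigError(Exception):
    """Stand-in for the PR's ConfigError base class."""


def exit_cleanly_on_config_error(func):
    """Turn ConfigError into a one-line message and exit code 1.

    Sketch only: the real decorator escapes the message for rich output;
    here it is written to stderr unchanged.
    """

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except ConfigError as exc:
            print(f"Error: {exc}", file=sys.stderr)
            # "from None" drops the chained traceback context.
            raise SystemExit(1) from None

    return wrapper


@exit_cleanly_on_config_error
def run(config_ok: bool) -> str:
    # Hypothetical entry point used only to exercise the decorator.
    if not config_ok:
        raise ConfigError("config_paths entry '...' was not found.")
    return "started"
```

Any exception that is not a `ConfigError` passes through the wrapper untouched, which matches the stated goal that unexpected errors still propagate.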
205816b to
95b37ea
What

One coherent fix for the three `config_paths` failure modes, all failing fast with a clean message and no traceback:

- `FileNotFoundError` -> `Error: config_paths entry '...' was not found. Looked in: ...`
- `ValidationError` (+ pydantic.dev URL) -> `Error: 'config_paths' must be a list of paths ... +config_paths=[...]`
- zero configured servers -> `Error: No server instances are configured ...`, raised before Ray

How

- `ConfigError` base in `config_types.py`; `ConfigPathNotFoundError`, `MalformedConfigPathsError`, `NoServerInstancesError` (and the existing `ServerRefNotFoundError`) inherit from it. They stay ordinary exceptions, so `ng_validate` (#1599, "feat(cli): add 'gym env validate' pre-flight config check (#1205 friction #12)", `except Exception`) can still catch and format them.
- `load_extra_config_paths` wraps `OmegaConf.load` (#1488, "bug: nonexistent config path dumps raw FileNotFoundError traceback"); `parse()` wraps `ta.validate_python` (#1490, "Bug: Malformed +config_paths surfaces raw Pydantic ValidationError stack").
- `raise_on_no_server_instances` (a validated server-instance check, not a raw key count) runs in `RunHelper.start()` before `initialize_ray()`, so it covers both `ng_run` and `e2e_rollout_collection` (#1489, "bug: ng_run with no config spawns Ray and hangs instead of failing fast").
- `exit_cleanly_on_config_error` on `run()` / `e2e_rollout_collection()` converts any `ConfigError` to a rich-escaped message + `exit 1`, no traceback (the explicit ask in #1488 and #1489). `escape()` keeps `[...]` examples intact. Unexpected errors still propagate.

Supersedes #1510

This consolidates #1510 (which covered the same three issues via `raise SystemExit`). Differences here: typed exceptions that `ng_validate` can catch/format (a bare `SystemExit` would escape it), a validated zero-server check, the guard in `start()` so `e2e` is covered too, and deterministic tests.

Tests

- `test_global_config.py`: path-not-found (both-locations / dedup / absolute), malformed config_paths, zero-server (raise + pass).
- `test_cli.py`: the decorator (ConfigError → clean exit; non-ConfigError propagates; success passes through).

Smoke-tested all three via `ng_run`: clean messages, examples intact, 0 traceback lines. 51/51 `test_global_config` + `test_cli` pass; ruff + pre-commit clean.