cp: [benchmarking] Adds support for fine-grained config overrides and global ray config (1840) into r1.2.0#1867
cp: [benchmarking] Adds support for fine-grained config overrides and global ray config (1840) into r1.2.0#1867svcnvidia-nemo-ci wants to merge 1 commit into
[benchmarking] Adds support for fine-grained config overrides and global ray config (1840) into r1.2.0#1867Conversation
…bal ray config (#1840) Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
|
/ok to test f905cc7 |
Greptile SummaryThis PR cherry-picks support for fine-grained config overrides and a global Ray config into Confidence Score: 5/5Safe to merge; all findings are P2 style/fragility concerns that do not affect the primary use case. Both open comments are P2: the first-key matching fragility only matters if a YAML author reorders keys in override files (not a current scenario), and the non-dict append duplication only triggers if an override file repeats a value already in the base list. The core merge and global-ray logic is correct and the nightly-benchmark.yaml refactor preserves all effective per-entry Ray configurations. benchmarking/run.py — the update_config list-merge logic warrants a second look if override files with non-dict lists or reordered keys are introduced. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["CLI: --config file1.yaml --config file2.yaml"] --> B["merge_config_files()"]
B --> C["For each YAML file\nyaml.full_load_all()"]
C --> D["update_config(config_dict, new_dict)"]
D --> E{Key type in\nexisting dict?}
E -->|"Both dicts"| F["Recurse:\nupdate_config()"]
E -->|"Both lists"| G["Match by first key\nof each dict item"]
G -->|"Match found"| H["Recurse:\nupdate_config()"]
G -->|"No match"| I["Append item"]
G -->|"Non-dict item"| I
E -->|"Otherwise"| J["Replace value"]
E -->|"Key missing"| K["Add new key"]
D --> L["Merged config_dict"]
L --> M["Session.assert_valid_config_dict()"]
M --> N["Session.from_dict()"]
N --> O["Session.__post_init__()"]
O --> P["For each entry:\nentry.ray = {**session.ray, **entry.ray}"]
P --> Q["Run entries"]
Reviews (1): Last reviewed commit: "[benchmarking] Adds support for fine-gra..." | Re-trigger Greptile |
| elif isinstance(config_dict[key], list) and isinstance(value, list): | ||
| for sub_val in value: | ||
| # Handle dicts in the list by matching based on the first key | ||
| if isinstance(sub_val, dict) and sub_val: | ||
| first_key = next(iter(sub_val.keys())) | ||
| for config_sub_val in config_dict[key]: | ||
| if ( | ||
| isinstance(config_sub_val, dict) | ||
| and config_sub_val | ||
| and next(iter(config_sub_val.keys())) == first_key | ||
| and config_sub_val[first_key] == sub_val[first_key] | ||
| ): | ||
| # If matching dict found (based on first key), recursively update it | ||
| update_config(config_sub_val, sub_val) | ||
| break | ||
| else: | ||
| # If no matching dict, append the new dict to the list | ||
| config_dict[key].append(sub_val) |
There was a problem hiding this comment.
First-key matching is fragile and order-dependent
List items are matched by next(iter(sub_val.keys())), which depends on dict insertion order. If an override YAML has a requirement dict with keys in a different order (e.g., min_value before metric), the lookup key changes, no existing item matches, and the item is appended rather than merged — silently duplicating the requirement in the entry. The README documents the "first key" semantics, but this makes the merge logic sensitive to YAML key ordering in a non-obvious way.
A more robust approach would be to match entries by a known stable key (name for entries, metric for requirements) rather than relying on insertion order.
| else: | ||
| # If not a dict, append the new value to the list | ||
| config_dict[key].append(sub_val) |
There was a problem hiding this comment.
Non-dict list items are always appended, never deduplicated
When a list contains non-dict values (e.g., Slack user IDs in ping_on_failure), the code unconditionally appends the override value even if it already exists in the base list. If an override file repeats a ping_on_failure ID that is already in the base config, it will be added twice, causing duplicate Slack notifications on failure. The README's merge-behavior table only describes the dict-matching case and doesn't cover this edge.
beep boop [🤖]: Hi @rlratzel 👋,