
Change enforce_eager default to False #1818

Merged: SumanthRH merged 3 commits into main from enforce-eager-default on Jun 23, 2026


SumanthRH commented Jun 20, 2026

Member

What does this PR do?

It's been a while since #569. Our internal multi-turn RL training runs now use enforce_eager=False. We previously defaulted to True because we saw lower reward with DAPO last year. I believe it is now safe to switch the default to False. We also have better rollout-train mismatch corrections now, such as geometric sequence masking.

Since our current DAPO reproduction curves all use enforce_eager=True, I've retained that setting in the DAPO scripts. For scripts that aren't exact DAPO reproductions, enforce_eager can be left at False. I've made this change for many of the GSM8k scripts.

We should also reproduce the DAPO recipe again with cuda graphs enabled.
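
For readers unfamiliar with the term, here is a rough sketch of what a geometric sequence mask could look like. It is not taken from this PR or from the SkyRL code. The function name `keep_sequence`, the thresholds `low` and `high`, and the exact masking rule are all assumptions for illustration: the idea sketched is to compare the trainer's and the rollout engine's per-token log-probabilities and drop sequences whose geometric-mean probability ratio drifts too far from 1.

```python
import math
from typing import Sequence


def keep_sequence(
    train_logprobs: Sequence[float],
    rollout_logprobs: Sequence[float],
    low: float = 0.5,   # hypothetical lower bound on the ratio
    high: float = 2.0,  # hypothetical upper bound on the ratio
) -> bool:
    """Return True if the sequence passes the (assumed) geometric mask.

    The geometric mean of the per-token ratios p_train / p_rollout equals
    exp(mean(train_logprob - rollout_logprob)), so it is computed in log space.
    """
    if len(train_logprobs) != len(rollout_logprobs) or not train_logprobs:
        raise ValueError("log-prob sequences must be non-empty and equal length")

    mean_log_ratio = sum(
        t - r for t, r in zip(train_logprobs, rollout_logprobs)
    ) / len(train_logprobs)
    geometric_mean_ratio = math.exp(mean_log_ratio)

    return low <= geometric_mean_ratio <= high
```

Working with the mean log ratio keeps the check length-normalized, so a long sequence is not masked merely because many small per-token differences accumulate.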

Signed-off-by: SumanthRH <sumanthrh@anyscale.com>
Signed-off-by: SumanthRH <sumanthrh@anyscale.com>
Signed-off-by: SumanthRH <sumanthrh@anyscale.com>
SumanthRH marked this pull request as ready for review June 23, 2026 01:08
SumanthRH merged commit 059e6d9 into main Jun 23, 2026
5 checks passed

gemini-code-assist Bot left a comment

Contributor


Code Review

This pull request changes the default value of enforce_eager from True to False in InferenceEngineConfig within skyrl/train/config/config.py to enable CUDA graphs by default for higher performance. Consequently, various training run scripts under examples/train/ have been updated to either explicitly set ENFORCE_EAGER=false or to pass the $ENFORCE_EAGER variable to the generator's inference engine configuration. For DAPO-related recipes, ENFORCE_EAGER remains true with updated comments indicating a TODO to reproduce results with it set to false. There are no review comments, and I have no additional feedback to provide.
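
The behaviour described in this summary can be illustrated with a simplified stand-in. The sketch below is not the actual code in skyrl/train/config/config.py: the real `InferenceEngineConfig` has more fields, and the helper `config_from_env` is hypothetical, added only to show how a script-level `ENFORCE_EAGER` value might override the new default.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class InferenceEngineConfig:
    # Simplified stand-in for the real config class (assumption: only this
    # one field is shown). False means CUDA graphs are allowed by default.
    enforce_eager: bool = False


def config_from_env(env: Mapping[str, str]) -> InferenceEngineConfig:
    """Hypothetical helper: build the config from an ENFORCE_EAGER variable.

    If the variable is unset, the new default (False) applies. DAPO-style
    scripts that export ENFORCE_EAGER=true keep eager mode.
    """
    raw = env.get("ENFORCE_EAGER")
    if raw is None:
        return InferenceEngineConfig()

    value = raw.strip().lower()
    if value not in ("true", "false"):
        raise ValueError(f"ENFORCE_EAGER must be 'true' or 'false', got {raw!r}")
    return InferenceEngineConfig(enforce_eager=(value == "true"))
```

For example, `config_from_env({})` yields `enforce_eager=False`, while `config_from_env({"ENFORCE_EAGER": "true"})` keeps eager mode, which matches how the DAPO recipes are described as pinning the old behaviour.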

