Add tau2-synth-library reward-profiling config + tau2 deps#1503

Open
dpickem wants to merge 1 commit into main from dpickem/tau2-synth-reward-profiling

Conversation

dpickem commented Jun 2, 2026

Adds a tau2-synth (library domain) config for the verifiers_agent and the tau2 / tau2-synth dependencies so the agent can run the tau2-synth env. Used for reward profiling of Nemotron Nano v3.5 on Polyphe.


copy-pr-bot Bot commented Jun 2, 2026


This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

dpickem commented Jun 2, 2026

Author

Do not merge this. Instead of adding configs / requirements to the verifier agent, we should be using the new environment abstraction (https://github.com/NVIDIA-NeMo/Gym/tree/main/environments). This is just for testing.

cmunley1 commented Jun 3, 2026

Contributor

For dependency isolation, I think we need to do this (#1469) rather than use environments/, unless we refactor NeMo Gym core to put venvs into environments/ or something.

2 participants