feat: integrate RL training with vLLM_ascend inference backend #1641

Open
NengXu001 wants to merge 1 commit into InternLM:main from NengXu001:rl_vllm

@NengXu001
Contributor

Integrates the vLLM inference backend into the RL training pipeline, specifically enabling support for the Ascend NPU environment.

Key Features:

  1. Enabled Tensor Parallelism (TP) and Data Parallelism (DP).
  2. Weight Synchronization: Implemented sync_weight logic to ensure consistency between the actor and the inference engine.
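
For readers unfamiliar with the pattern, the two features above can be illustrated with a toy sketch. It is not the code in this PR: `ToyInferenceEngine`, `ToyWorker`, `shard`, and `toy_sync_weight` are hypothetical names, weights are plain Python lists rather than tensors, and the even contiguous split per TP rank is an assumption made only for illustration. The sketch pushes the actor's weights to every DP replica, shards them across TP ranks, and then checks that each replica reassembles to the actor's state.

```python
from dataclasses import dataclass, field


def shard(values, tp_size, tp_rank):
    """Return the contiguous slice of `values` owned by `tp_rank` (toy TP split)."""
    if len(values) % tp_size != 0:
        raise ValueError("tensor length must be divisible by tp_size")
    step = len(values) // tp_size
    return values[tp_rank * step:(tp_rank + 1) * step]


@dataclass
class ToyWorker:
    """One inference worker, identified by its DP replica and TP rank."""
    dp_rank: int
    tp_rank: int
    weights: dict = field(default_factory=dict)


class ToyInferenceEngine:
    """Hypothetical stand-in for an inference engine with dp_size x tp_size workers."""

    def __init__(self, dp_size, tp_size):
        self.dp_size = dp_size
        self.tp_size = tp_size
        self.workers = [
            ToyWorker(dp, tp) for dp in range(dp_size) for tp in range(tp_size)
        ]
        self.version = -1  # training step of the weights currently loaded

    def load_weights(self, state, version):
        # Every DP replica receives the same weights; each TP rank keeps its shard.
        for w in self.workers:
            w.weights = {
                name: list(shard(vals, self.tp_size, w.tp_rank))
                for name, vals in state.items()
            }
        self.version = version

    def gather(self, dp_rank):
        """Reassemble the full weights held by one DP replica."""
        ranks = sorted(
            (w for w in self.workers if w.dp_rank == dp_rank),
            key=lambda w: w.tp_rank,
        )
        names = ranks[0].weights.keys()
        return {n: [v for w in ranks for v in w.weights[n]] for n in names}


def toy_sync_weight(actor_state, actor_step, engine):
    """Push actor weights to the engine, then verify every replica matches."""
    engine.load_weights(actor_state, actor_step)
    for dp in range(engine.dp_size):
        assert engine.gather(dp) == actor_state, f"replica {dp} is out of sync"


actor_state = {"linear.weight": [0.1, 0.2, 0.3, 0.4], "linear.bias": [1.0, 2.0]}
engine = ToyInferenceEngine(dp_size=2, tp_size=2)
toy_sync_weight(actor_state, actor_step=1, engine=engine)
print(len(engine.workers), engine.version)
```

In an RL loop, a call like `toy_sync_weight` would run after each actor update and before the next rollout, so that samples are generated with the current policy weights.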

Out of Scope:

  1. Ref model support
  2. Evaluation step
  3. Router replay

Co-authored-by: yifeililn <yifeilin1202@qq.com>