feat: integrate RL training with vLLM_ascend inference backend by NengXu001 · Pull Request #1641 · InternLM/xtuner

NengXu001 · 2026-03-27T13:16:49Z

Integrates the vLLM inference backend into the RL training pipeline, specifically enabling support for the Ascend NPU environment.

Key Features:

Enabled Tensor Parallelism (TP) and Data Parallelism (DP).
Weight Synchronization: Implemented sync_weight logic to keep the inference engine's weights consistent with the actor's.
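
For reviewers unfamiliar with the TP/DP layout, here is a rough, framework-free sketch of how the two features above could fit together. It is not the code in this PR: `InferenceReplica`, `build_replicas`, and `sync_weights_sketch` are hypothetical names, the consecutive-rank grouping is an assumption, and plain Python lists stand in for tensors.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class InferenceReplica:
    """Hypothetical stand-in for one data-parallel inference engine replica."""

    ranks: list[int]  # the TP ranks that jointly serve this replica
    weights: dict[str, list[float]] = field(default_factory=dict)
    version: int = -1  # training step of the last sync, -1 means never synced


def build_replicas(world_size: int, tp_size: int) -> list[InferenceReplica]:
    """Split ranks into DP replicas of tp_size consecutive ranks (assumed layout)."""
    if world_size % tp_size != 0:
        raise ValueError("world_size must be divisible by tp_size")
    return [
        InferenceReplica(ranks=list(range(start, start + tp_size)))
        for start in range(0, world_size, tp_size)
    ]


def sync_weights_sketch(
    actor_weights: dict[str, list[float]],
    step: int,
    replicas: list[InferenceReplica],
) -> None:
    """Copy the actor's current weights into every replica and record the step."""
    for replica in replicas:
        # Copy each parameter so later actor updates do not alias into the replica.
        replica.weights = {name: list(vals) for name, vals in actor_weights.items()}
        replica.version = step


replicas = build_replicas(world_size=8, tp_size=4)  # DP = 8 / 4 = 2 replicas
actor = {"linear.weight": [0.1, 0.2]}
sync_weights_sketch(actor, step=1, replicas=replicas)
actor["linear.weight"][0] = 0.5  # a training update that happens after the sync
```

In this sketch each replica holds its own copy, so the update made after the sync does not reach the replicas until the next call. That is the staleness a sync step between training and rollout is meant to prevent.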

Out of Scope:

Co-authored-by: yifeililn <yifeilin1202@qq.com>

feat: integrate RL training with vLLM inference backend

78e312a

