[Feature]: add ViT activation_offload for InternS1#1619

Open
NengXu001 wants to merge 1 commit into InternLM:main from NengXu001:main

@NengXu001 NengXu001 commented Mar 23, 2026

Reduces InternS1 training memory usage by ~10 GB by offloading ViT activations, and adds module-name configuration for the ViT and MoE layers in activation offloading.

@HAOCHENYE HAOCHENYE left a comment

We should make the activation offload implementation more general-purpose, so that it can serve multiple models without needing to know their layer counts.
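One way to read this suggestion, sketched under assumptions: select the modules to offload by name pattern rather than by layer index, so the same configuration works for any depth. The function `select_offload_modules`, the module names, and the patterns below are all hypothetical and are not taken from this PR or from the repository.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def select_offload_modules(module_names: Iterable[str], patterns: Iterable[str]) -> List[str]:
    """Return the module names that match any glob pattern, in their original order."""
    patterns = list(patterns)
    return [
        name
        for name in module_names
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


# Hypothetical module names, for illustration only; a real model would
# supply these via something like `name for name, _ in model.named_modules()`.
names = [
    "vision_tower.encoder.layers.0",
    "vision_tower.encoder.layers.1",
    "language_model.layers.0.mlp.experts",
    "language_model.layers.0.self_attn",
    "language_model.layers.1.mlp.experts",
]

# Hypothetical patterns: every ViT block and every MoE expert module,
# with no layer count or layer index hard-coded.
patterns = [
    "vision_tower.encoder.layers.*",
    "language_model.layers.*.mlp.experts",
]

selected = select_offload_modules(names, patterns)
print(selected)
```

With a selection like this, the forward pass of each matched module could then be run inside an offloading context such as PyTorch's `torch.autograd.graph.save_on_cpu`. Whether that fits the offload mechanism used in this PR is not established by the conversation.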

@HAOCHENYE HAOCHENYE added the npu label Mar 25, 2026
@NengXu001 NengXu001 force-pushed the main branch 2 times, most recently from 019057a to 382642d Compare March 30, 2026 02:44

2 participants