Skip to content

[fix] Add trailing slash to local directory for uploads to existing cloud directory with fsspec#1614

Merged
SumanthRH merged 2 commits into
mainfrom
fix-io-upload-lora
May 3, 2026
Merged

[fix] Add trailing slash to local directory for uploads to existing cloud directory with fsspec#1614
SumanthRH merged 2 commits into
mainfrom
fix-io-upload-lora

Conversation

@SumanthRH
Copy link
Copy Markdown
Member

What does this PR do?

Fixes an issue with checkpoint upload format with LoRA. With LoRA, we make multiple calls to io.local_work_dir , which in turn calls io.upload_directory . Each time, we write checkpoint files to a temporary file and then upload them to directory in cloud storage. Currently, subsequent calls to io.upload_directory leads to nesting of ckpt contents:

 aws s3 ls <my_remote_path>/global_step_40/policy/
                           PRE tmp8uxmdhfx/
                           PRE tmpbbhsijnu/
                           PRE tmpd3xtpaoz/
                           PRE tmpeag11l_c/
                           PRE tmppg5zt9uh/
                           PRE tmpqplxqem5/
2026-05-01 00:01:40      14885 extra_state_world_size_8_rank_6.pt
2026-05-01 00:01:40      14885 extra_state_world_size_8_rank_7.pt
2026-05-01 00:01:40  860448401 model_world_size_8_rank_6.pt
2026-05-01 00:01:40  860448401 model_world_size_8_rank_7.pt
2026-05-01 00:01:40 1720907321 optim_world_size_8_rank_6.pt
2026-05-01 00:01:40 1720907321 optim_world_size_8_rank_7.pt

The root cause is that with fsspec, while uploading files in a directory src to an existing directory dst with fsspec, we need to ensure that the file path ends in a trailing slash. otherwise, fsspec will create a subdirectory dst/src instead of directly syncing contents of src into the root dst directory

…oud directory with fsspec

Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
@SumanthRH SumanthRH marked this pull request as ready for review May 3, 2026 06:40
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request modifies the upload_directory function in skyrl/backends/skyrl_train/utils/io/io.py to ensure the local path ends with a trailing slash, preventing fsspec from creating unwanted subdirectories during synchronization. A review comment points out that the manual string concatenation used is not cross-platform and suggests using os.path.join to ensure the trailing separator is handled correctly on all operating systems.

Comment thread skyrl/backends/skyrl_train/utils/io/io.py Outdated
x
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
@SumanthRH SumanthRH merged commit 1c4b83c into main May 3, 2026
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant