Too many API calls when restoring a backup from a S3 increment chain #1361
Open
Sfaynet wants to merge 2 commits into Altinity:master from …ix_list_duration_s3_api_walk_5

Conversation
Hello everyone!
I'd like to share the following issue:
When restoring (restore_remote) a large number of ClickHouse tables from S3, including incremental backups, the logs fill with list messages (pkg/storage/general.go:222 > , list_duration), and more time appears to be spent on the listing logic than on downloading the data itself. Restoring 280 GB of data across roughly 3,500 tables from S3 takes 8 hours, while restoring the same backup after downloading it locally takes 11 minutes. Linear downloads from S3 (MinIO) run at around a gigabyte per second.
I tried tweaking the parameters, but it didn't help:
S3_ALLOW_MULTIPART_DOWNLOAD="true"
DOWNLOAD_BY_PART="true"
DOWNLOAD_CONCURRENCY="255"
S3_CONCURRENCY
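For context, a sketch of how these tuning parameters are typically supplied to clickhouse-backup, as environment variables (values taken from the report above; S3_CONCURRENCY was mentioned without a value, so it is left unset here):

```shell
# Tuning attempted in the report (it did not resolve the slowdown):
export S3_ALLOW_MULTIPART_DOWNLOAD="true"
export DOWNLOAD_BY_PART="true"
export DOWNLOAD_CONCURRENCY="255"
# S3_CONCURRENCY was listed without a value in the report, so it is not set.
echo "DOWNLOAD_CONCURRENCY=$DOWNLOAD_CONCURRENCY"
```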
After analyzing the repository with AI, Claude Sonnet identified the cause:
Excessive S3 API calls (Walk) - up to 17,500 calls for each table in the increment chain.
The AI suggested the following as the primary solution:
In-Memory Cache (pkg/storage/general.go)
An in-memory cache for backup metadata was added.
Cache key: {storage_type}:{backup_name}
When metadata for the same backup is requested again, it is served from the cache without another call to S3.
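A minimal sketch of the caching idea described above, assuming a simple map keyed by "{storage_type}:{backup_name}" guarded by a read-write mutex. The names here (metadataCache, getOrLoad) are illustrative, not the identifiers used in the actual PR, and the metadata is simplified to a string:

```go
package main

import (
	"fmt"
	"sync"
)

// metadataCache holds backup metadata in memory, keyed by
// "{storage_type}:{backup_name}", so repeated requests for the
// same backup skip the expensive remote Walk.
type metadataCache struct {
	mu      sync.RWMutex
	entries map[string]string // key -> metadata (simplified to a string)
}

func newMetadataCache() *metadataCache {
	return &metadataCache{entries: make(map[string]string)}
}

func cacheKey(storageType, backupName string) string {
	return fmt.Sprintf("%s:%s", storageType, backupName)
}

// getOrLoad returns cached metadata when present; otherwise it calls
// load (which would perform the S3 Walk) and stores the result.
// The second return value reports whether this was a cache hit.
func (c *metadataCache) getOrLoad(storageType, backupName string, load func() string) (string, bool) {
	key := cacheKey(storageType, backupName)
	c.mu.RLock()
	if v, ok := c.entries[key]; ok {
		c.mu.RUnlock()
		return v, true // hit: no remote call
	}
	c.mu.RUnlock()
	v := load() // miss: perform the expensive listing once
	c.mu.Lock()
	c.entries[key] = v
	c.mu.Unlock()
	return v, false
}

func main() {
	cache := newMetadataCache()
	walks := 0
	load := func() string { walks++; return "metadata-for-base" }

	cache.getOrLoad("s3", "base", load) // first request: performs the "Walk"
	cache.getOrLoad("s3", "base", load) // second request: served from cache
	fmt.Println("walk calls:", walks)   // prints: walk calls: 1
}
```

With this shape, a restore that touches the same base backup once per table in the increment chain pays for the remote listing only on the first access.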
I'm not a big Go expert, but the changes suggested by the AI reduced the backup restore time from 8 hours to 30 minutes.