fix(hubble): reset upload quota by job and fix >10GB historical upload blockage #725

Merged
imbajin merged 7 commits into apache:master from peakxy:fix-load-file-hubble
Apr 15, 2026
@peakxy (Contributor) commented Apr 13, 2026

Purpose of the PR

This PR fixes the Hubble upload quota reset problem described in #496.
Previously, uploaded file size was effectively accumulated across historical jobs, and deleting an old import job did not fully release the occupied upload quota. As a result, once the historical uploaded files exceeded 10GB, users could no longer continue uploading files in the web UI or reset the quota by removing old jobs.

Main Changes

  • Change backend upload quota validation from global historical scope to current job scope by using JobManager.jobSize instead of relying on global file history.
  • Add the size request parameter in the upload flow so the backend can validate the original file size instead of only seeing chunk sizes.
  • Complete cascading cleanup when deleting an import job, including related LoadTask records, FileMapping metadata, and uploaded files on disk, so deleting an old job can actually release the quota.
  • Align frontend cumulative size validation with backend behavior by counting uploaded files, queued local files, and newly selected files in the same job.
  • Keep total_size_bytes as the raw byte field required by frontend quota calculation, remove the redundant job_size_bytes field, and deduplicate the related i18n message key.
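
The per-job check described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class name `JobUploadQuota`, the method names, and the choice of 10 * 1024^3 bytes for "10GB" are assumptions made for the example.

```java
import java.util.List;

// Hypothetical helper; the names and the exact limit constant are assumptions.
final class JobUploadQuota {
    // "10GB" interpreted as 10 GiB here; the real constant may differ.
    static final long LIMIT_BYTES = 10L * 1024 * 1024 * 1024;

    private JobUploadQuota() {
    }

    /** Sums file sizes in bytes, failing loudly on negative values or overflow. */
    static long sum(List<Long> sizes) {
        long total = 0L;
        for (long size : sizes) {
            if (size < 0) {
                throw new IllegalArgumentException("size must be non-negative");
            }
            total = Math.addExact(total, size);
        }
        return total;
    }

    /**
     * Returns true if incomingBytes more can be added to a job that already
     * holds currentJobSizeBytes without exceeding the limit.
     */
    static boolean fits(long currentJobSizeBytes, long incomingBytes) {
        if (currentJobSizeBytes < 0 || incomingBytes < 0) {
            throw new IllegalArgumentException("sizes must be non-negative");
        }
        // Written as a subtraction so the comparison itself cannot overflow.
        return incomingBytes <= LIMIT_BYTES - currentJobSizeBytes;
    }
}
```

On the backend side this would correspond to something like `fits(jobSize, size)`, where `size` is the original file size sent with the upload request rather than a chunk size. The frontend rule has the same shape: the size of files already uploaded to the job is the first argument, and the sum of queued plus newly selected files is the second.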

Verifying these changes

  • Trivial rework / code cleanup without any test coverage. (No Need)
  • Already covered by existing tests, such as (please modify tests here).
  • Need tests and can be verified as follows:
    • Create an import job in Hubble and upload files in multiple batches; verify that upload is allowed until the cumulative size of the current job exceeds 10GB.
    • Create a historical job with large uploaded files, then create a new job and upload a small file; verify that the new job is not blocked by historical jobs.
    • Delete an old job with uploaded files, then create a new job and continue uploading; verify that the quota is released after the old job is removed.
    • Run git diff --check to confirm there is no formatting issue in the current diff.

Does this PR potentially affect the following parts?

  • Nope
  • Dependencies (add/update license info)
  • Modify configurations
  • The public API
  • Other effects (Hubble data import upload quota validation, job cleanup flow, and related frontend/backend upload interaction)

Documentation Status

  • Doc - TODO
  • Doc - Done
  • Doc - No Need

Copilot AI review requested due to automatic review settings April 13, 2026 14:42
@dosubot (bot) added labels size:L (This PR changes 100-499 lines, ignoring generated files), bug (Something isn't working), and hubble (hugegraph-hubble) on Apr 13, 2026

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR fixes Hubble’s upload quota behavior so quota is enforced per import job (not across historical jobs) and deleting a job fully releases quota by cleaning up associated tasks, metadata, and disk files.

Changes:

  • Frontend: adds current-job cumulative size validation and passes original file size to the backend during chunked upload.
  • Backend: validates total quota using current job size and cascades cleanup on job deletion.
  • Data model/i18n: exposes raw byte size as total_size_bytes and clarifies the quota error message.
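
The disk part of the cascading cleanup can be illustrated with a small sketch. It assumes, purely for the example, that each job's uploaded files live under one directory; the class `JobFileCleanup` and that layout are hypothetical, and the removal of LoadTask and FileMapping records is not shown.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; the one-directory-per-job layout is an assumption.
final class JobFileCleanup {
    private JobFileCleanup() {
    }

    /**
     * Deletes jobDir and everything below it.
     * Returns the number of files and directories removed (0 if jobDir is absent).
     */
    static int deleteTree(Path jobDir) {
        if (!Files.exists(jobDir)) {
            return 0;
        }
        try {
            List<Path> paths;
            try (Stream<Path> walk = Files.walk(jobDir)) {
                // Reverse order puts children before their parent directories.
                paths = walk.sorted(Comparator.reverseOrder())
                            .collect(Collectors.toList());
            }
            for (Path path : paths) {
                Files.delete(path);
            }
            return paths.size();
        } catch (IOException e) {
            throw new UncheckedIOException("Failed to delete " + jobDir, e);
        }
    }
}
```

Deleting the files only after the tasks that read them have been stopped avoids removing data that a running task still needs; the order used in the actual controller is not visible in this conversation.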

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 6 comments.

File | Description
hugegraph-hubble/hubble-fe/src/utils/dataImportUpload.ts | Adds helpers to compute current job upload size and enforce the 10GB limit consistently.
hugegraph-hubble/hubble-fe/src/stores/types/GraphManagementStore/dataImportStore.ts | Extends FileMapInfo with total_size_bytes for byte-accurate quota calculations.
hugegraph-hubble/hubble-fe/src/stores/GraphManagementStore/dataImportStore/dataImportRootStore.ts | Sends size in the upload request so backend can validate original file size.
hugegraph-hubble/hubble-fe/src/components/graph-management/data-import/import-tasks/UploadEntry.tsx | Switches UI validation to current-job cumulative size logic.
hugegraph-hubble/hubble-be/src/main/resources/i18n/messages_zh_CN.properties | Updates quota error message to clarify “current job” scope.
hugegraph-hubble/hubble-be/src/main/resources/i18n/messages.properties | Updates quota error message to clarify “current job” scope.
hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/service/load/FileMappingService.java | Adds query method to list file mappings by job for cleanup.
hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/entity/load/FileMapping.java | Adds total_size_bytes JSON field backed by totalSize.
hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/controller/load/JobManagerController.java | Performs cascading stop/remove of tasks and deletes job files/metadata on job deletion.
hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/controller/load/FileUploadController.java | Requires size param and validates quota against the current job’s size.

Comment thread hugegraph-hubble/hubble-fe/src/utils/dataImportUpload.ts
Comment thread hugegraph-hubble/hubble-fe/src/utils/dataImportUpload.ts
@dosubot (bot) added label size:XL (This PR changes 500-999 lines, ignoring generated files) and removed label size:L (This PR changes 100-499 lines, ignoring generated files) on Apr 14, 2026
@peakxy (Contributor, Author) commented Apr 14, 2026

@imbajin Hi, I have made revisions based on your comments. Please take a look again at your convenience. If there are still any issues, please point them out so I can make further changes. Sry, I'm a newcomer, so I appreciate your guidance.

@imbajin (Member) commented Apr 14, 2026

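A follow-up on the sec / CodeQL alerts raised in the comments; I have gone through them as a whole.

Most of these alerts are generic warnings about user input taking part in path concatenation or file operations. For this PR, the core goal is still to fix how the upload quota is counted, the size validation during chunked upload, and the quota not being released correctly after a job is deleted. Judging from the current code, that main line has been handled fairly completely.

Since the old version of Hubble will no longer be actively maintained, I don't think these sec items need to be blockers for this PR; they are better treated as optional defensive hardening. I suggest considering only the following:

  1. Add a basic check on name in FileUploadController to make sure it is a plain file name, rejecting values that contain /, \, or ..
  2. Before actually performing disk operations, such as reading the uploaded file size or deleting a path, add an upload root constraint that confirms the target path is still located under the upload directory.
  3. While at it, fix the small token cleanup issue in deleteUnfinishedFile(): it currently uses token.startsWith(mapping.getName()), but the token is actually md5(fileName) + "-" + time, so matching the prefix with md5(mapping.getName()) is more reasonable.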
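
The three suggestions above could look roughly like the sketch below. It is only an illustration under stated assumptions: `UploadGuards` and its method names are hypothetical, and rendering the MD5 digest as lowercase hex is a guess at the token format, which the comment gives only as md5(fileName) + "-" + time.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class; none of these names come from the Hubble code base.
final class UploadGuards {
    private UploadGuards() {
    }

    /** Item 1: accept only a plain file name (no separators, no ".."). */
    static boolean isPlainFileName(String name) {
        return name != null && !name.isEmpty()
               && !name.contains("/")
               && !name.contains("\\")
               && !name.contains("..");
    }

    /** Item 2: resolve a path and reject it if it escapes the upload root. */
    static Path resolveUnderRoot(Path uploadRoot, String relative) {
        Path root = uploadRoot.toAbsolutePath().normalize();
        Path target = root.resolve(relative).normalize();
        if (!target.startsWith(root)) {
            throw new IllegalArgumentException("Path escapes upload root: " + relative);
        }
        return target;
    }

    /** Lowercase hex MD5 of the UTF-8 bytes (the encoding is an assumption). */
    static String md5Hex(String text) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(text.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** Item 3: match the token by the hashed file name, not the raw name. */
    static boolean tokenBelongsTo(String token, String fileName) {
        return token.startsWith(md5Hex(fileName) + "-");
    }
}
```

The name check is deliberately conservative and also rejects harmless names such as a..b. The root check works on normalized paths only and does not follow symbolic links; resolving real paths would be needed if links inside the upload directory are a concern.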

@peakxy (Contributor, Author) commented Apr 14, 2026

@imbajin I have made the changes based on your suggestions. Please take a look when you have time.

@imbajin (Member) left a comment

THX, maybe could unify CI for the next step (merged the loader-ci improvement PR just now)

@dosubot (bot) added the lgtm label (This PR has been approved by a maintainer) on Apr 15, 2026
@imbajin imbajin merged commit c2f588c into apache:master Apr 15, 2026
9 checks passed
Labels

bug (Something isn't working), hubble (hugegraph-hubble), lgtm (This PR has been approved by a maintainer), size:XL (This PR changes 500-999 lines, ignoring generated files)

Development

Successfully merging this pull request may close these issues.

[Bug] can't load files when > 10GB in hubble

4 participants