refactor: split .index file into multiple #389

Draft
dkharms wants to merge 9 commits into main from 336-sealing-split

@dkharms
Member

dkharms commented Mar 31, 2026

Description

This is the first pull request in the series for #336. The changes in this pull request will allow us to efficiently merge several fractions into one when performing compaction.

To that end, .index is split into several files:

  • .info -- contains one info block;
  • .offsets -- contains one block with the offsets of each DocBlock inside the .docs file;
  • .ids -- contains triplets of seq.MID, seq.RID and seq.DocPos blocks;
  • .tokens -- contains the tokens and the token table;
  • .lids -- contains chunks of seq.LID.

As a consequence, we expect file descriptor usage to increase by 3x.
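
To make the new layout concrete, here is a minimal Go sketch of the file set per sealed fraction. It is not code from this pull request: `splitSuffixes`, `sealedFiles` and the base name `frac-0001` are hypothetical, and it assumes every file shares the fraction's base name and that all files of a fraction are kept open at once. Under those assumptions a fraction goes from two files (.docs and .index) to six, which is one plausible reading of the 3x estimate.

```go
package main

import "fmt"

// splitSuffixes lists the files that replace the single .index file,
// as named in the description. The variable name is illustrative only.
var splitSuffixes = []string{".info", ".offsets", ".ids", ".tokens", ".lids"}

// sealedFiles returns the file names a sealed fraction would consist of,
// assuming (hypothetically) that every file shares the fraction's base name.
func sealedFiles(base string, split bool) []string {
	files := []string{base + ".docs"}
	if !split {
		// Old layout: one .docs file and one .index file.
		return append(files, base+".index")
	}
	// New layout: one .docs file and one file per index section.
	for _, suffix := range splitSuffixes {
		files = append(files, base+suffix)
	}
	return files
}

func main() {
	before := sealedFiles("frac-0001", false)
	after := sealedFiles("frac-0001", true)
	// Prints the file counts before and after the split and their ratio.
	fmt.Println(len(before), len(after), len(after)/len(before))
}
```

If some of these files are opened lazily or closed after loading, the real increase could be smaller than this count suggests.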

Several things I need to polish:

  • Reintroduce statistics reporting on sealing;
  • Backwards compatibility for sealed fractions that were offloaded to remote storage;
  • Delete all files of a sealed fraction if at least one of them has a tmp suffix;
  • Handle tmp .index files;
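
The cleanup rule in the third item could look roughly like the sketch below. This is only an illustration of the idea, not the planned implementation: the `.tmp` suffix, the `filesToDelete` helper and the convention that a fraction's base name is everything before the first dot are all assumptions, and whether the .docs file should be part of the deleted set is left open.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// tmpSuffix is a placeholder: the actual temporary-file suffix is not
// stated in the description.
const tmpSuffix = ".tmp"

// filesToDelete returns every file that belongs to a fraction with at least
// one temporary file. The base name is assumed to be the part of the file
// name before the first dot (a hypothetical naming convention).
func filesToDelete(names []string) []string {
	byFraction := make(map[string][]string)
	incomplete := make(map[string]bool)
	for _, name := range names {
		clean := strings.TrimSuffix(name, tmpSuffix)
		base, _, _ := strings.Cut(clean, ".")
		byFraction[base] = append(byFraction[base], name)
		if clean != name {
			// A leftover temporary file means sealing did not finish.
			incomplete[base] = true
		}
	}
	var doomed []string
	for base := range incomplete {
		doomed = append(doomed, byFraction[base]...)
	}
	sort.Strings(doomed) // deterministic order for logging and tests
	return doomed
}

func main() {
	names := []string{"a.info", "a.ids.tmp", "a.lids", "b.info", "b.ids"}
	fmt.Println(filesToDelete(names)) // [a.ids.tmp a.info a.lids]
}
```

Fraction `b` has no temporary files, so none of its files are selected.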

  • I have read and followed all requirements in CONTRIBUTING.md;
  • I used LLM/AI assistance to make this pull request;

If you have used LLM/AI assistance please provide model name and full prompt:

Model: Claude Sonnet 4.6
Context: I used an LLM to fix issues with the index analyzer binary

@github-actions
Contributor

PR Title Validation Failed
Please refer to CONTRIBUTING.md

dkharms changed the title from "336 sealing split" to "refactor: split .index file into multiple" Mar 31, 2026
@dkharms
Member Author

dkharms commented Mar 31, 2026

@seqbenchbot up main bulk

@seqbenchbot
Collaborator

seqbenchbot commented Mar 31, 2026

Nice, @dkharms <(-^,^-)=b!

Your request was successfully served.
Identificator for your ongoing benchmark - f70e7779.

Here is a list of helpful links:

  • Take a look at Grafana dashboard;
  • Live-tailing logs are also available;

Have a great time!

@codecov-commenter

codecov-commenter commented Mar 31, 2026

Codecov Report

❌ Patch coverage is 62.85047% with 318 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.02%. Comparing base (fd721bd) to head (28053bf).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| frac/sealed_loader.go | 41.86% | 67 Missing and 8 partials ⚠️ |
| frac/sealed.go | 61.65% | 33 Missing and 18 partials ⚠️ |
| cmd/index_analyzer/main.go | 0.00% | 50 Missing ⚠️ |
| frac/sealed/sealing/index.go | 61.90% | 12 Missing and 20 partials ⚠️ |
| fracmanager/frac_manifest.go | 58.10% | 30 Missing and 1 partial ⚠️ |
| frac/sealed/sealing/sealer.go | 67.16% | 14 Missing and 8 partials ⚠️ |
| frac/remote.go | 73.84% | 8 Missing and 9 partials ⚠️ |
| frac/sealed/sealing/writer.go | 70.37% | 8 Missing and 8 partials ⚠️ |
| frac/sealed/sealing/blocks_builder.go | 88.07% | 7 Missing and 6 partials ⚠️ |
| frac/active_sealing_source.go | 84.05% | 4 Missing and 7 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #389      +/-   ##
==========================================
- Coverage   71.41%   71.02%   -0.39%     
==========================================
  Files         210      210              
  Lines       15579    15849     +270     
==========================================
+ Hits        11125    11257     +132     
- Misses       3656     3779     +123     
- Partials      798      813      +15     

@dkharms
Member Author

dkharms commented Mar 31, 2026

@seqbenchbot down f70e7779

@seqbenchbot
Collaborator

seqbenchbot commented Mar 31, 2026

Nice, @dkharms <(-^,^-)=b!

The benchmark with identificator f70e7779 was finished.
I've prepared a summary for you:

Query: bulk
Type: warm

| Metric | base | comp | diff |
|---|---|---|---|
| mean (ms) | 75.47 | 75.95 | +0.64% |
| stddev (ms) | 31.14 | 32.51 | +4.41% |
| p(50) (ms) | 67.00 | 66.00 | -1.49% |
| p(95) (ms) | 140.00 | 144.00 | +2.86% |
| p(99) (ms) | 191.00 | 196.00 | +2.62% |
| iterations | 56767.00 | 56645.00 | -0.21% |

Have a great time!
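
For readers checking the numbers: the diff column in the summary above appears to be the relative change of comp against base, in percent. The sketch below reproduces the mean row under that assumption; `relDiff` is a hypothetical helper, not part of seqbenchbot. Some cells (for example stddev) differ from this formula in the last digit, presumably because the bot computes them from unrounded values.

```go
package main

import "fmt"

// relDiff returns the relative change from base to comp, in percent.
// This is an assumed formula inferred from the reported numbers.
func relDiff(base, comp float64) float64 {
	return (comp - base) / base * 100
}

func main() {
	// mean (ms) of the bulk/warm row: base 75.47, comp 75.95.
	fmt.Printf("%+.2f%%\n", relDiff(75.47, 75.95)) // +0.64%
}
```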

@dkharms
Member Author

dkharms commented Mar 31, 2026

@seqbenchbot up main mixed

@seqbenchbot
Collaborator

seqbenchbot commented Mar 31, 2026

Nice, @dkharms <(-^,^-)=b!

Your request was successfully served.
Identificator for your ongoing benchmark - e47604c9.

Here is a list of helpful links:

  • Take a look at Grafana dashboard;
  • Live-tailing logs are also available;

Have a great time!

@dkharms
Member Author

dkharms commented Mar 31, 2026

@seqbenchbot down e47604c9

@seqbenchbot
Collaborator

seqbenchbot commented Mar 31, 2026

Nice, @dkharms <(-^,^-)=b!

The benchmark with identificator e47604c9 was finished.
I've prepared a summary for you:

Query: bulk
Type: warm

| Metric | base | comp | diff |
|---|---|---|---|
| mean (ms) | 75.44 | 75.74 | +0.40% |
| stddev (ms) | 30.28 | 30.81 | +1.74% |
| p(50) (ms) | 67.00 | 68.00 | +1.49% |
| p(95) (ms) | 136.50 | 140.00 | +2.56% |
| p(99) (ms) | 188.00 | 191.00 | +1.60% |
| iterations | 21032.00 | 21063.00 | +0.15% |

Query:
service:payment-backend-eu
AND k8s_namespace:prod
AND level:[0 to 3]
AND (
    message:'failed'
    OR message:'timeout'
)
Type: warm

| Metric | base | comp | diff |
|---|---|---|---|
| mean (ms) | 71.49 | 71.26 | -0.32% |
| stddev (ms) | 25.23 | 24.94 | -1.14% |
| p(50) (ms) | 65.00 | 65.00 | 0.00% |
| p(95) (ms) | 124.00 | 121.00 | -2.42% |
| p(99) (ms) | 173.50 | 171.00 | -1.44% |
| iterations | 3459.00 | 3466.00 | +0.20% |

Have a great time!

@dkharms
Member Author

dkharms commented Mar 31, 2026

@seqbenchbot up main bulk

@seqbenchbot
Collaborator

seqbenchbot commented Mar 31, 2026

Nice, @dkharms <(-^,^-)=b!

Your request was successfully served.
Identificator for your ongoing benchmark - 3a13f7af.

Here is a list of helpful links:

  • Take a look at Grafana dashboard;
  • Live-tailing logs are also available;

Have a great time!

@dkharms
Member Author

dkharms commented Mar 31, 2026

@seqbenchbot down 3a13f7af

@seqbenchbot
Collaborator

seqbenchbot commented Mar 31, 2026

Nice, @dkharms <(-^,^-)=b!

The benchmark with identificator 3a13f7af was finished.
I've prepared a summary for you:

Query: bulk
Type: warm

| Metric | base | comp | diff |
|---|---|---|---|
| mean (ms) | 70.93 | 70.93 | -0.00% |
| stddev (ms) | 26.40 | 27.66 | +4.80% |
| p(50) (ms) | 64.00 | 63.00 | -1.56% |
| p(95) (ms) | 124.50 | 127.00 | +2.01% |
| p(99) (ms) | 165.00 | 170.00 | +3.03% |
| iterations | 13088.00 | 13154.00 | +0.50% |

Have a great time!

@dkharms
Member Author

dkharms commented Mar 31, 2026

@seqbenchbot up main bulk

@seqbenchbot
Collaborator

seqbenchbot commented Mar 31, 2026

Nice, @dkharms <(-^,^-)=b!

Your request was successfully served.
Identificator for your ongoing benchmark - 29e733d8.

Here is a list of helpful links:

  • Take a look at Grafana dashboard;
  • Live-tailing logs are also available;

Have a great time!

dkharms force-pushed the 336-sealing-split branch from b0caec8 to 20ecd57 on March 31, 2026 13:34
@dkharms
Member Author

dkharms commented Mar 31, 2026

@seqbenchbot down 29e733d8

@seqbenchbot
Collaborator

seqbenchbot commented Mar 31, 2026

Nice, @dkharms <(-^,^-)=b!

The benchmark with identificator 29e733d8 was finished.
I've prepared a summary for you:

Query: bulk
Type: warm

| Metric | base | comp | diff |
|---|---|---|---|
| mean (ms) | 69.95 | 69.78 | -0.25% |
| stddev (ms) | 26.18 | 26.57 | +1.50% |
| p(50) (ms) | 63.00 | 63.00 | 0.00% |
| p(95) (ms) | 122.00 | 123.00 | +0.82% |
| p(99) (ms) | 162.00 | 166.00 | +2.47% |
| iterations | 13679.00 | 13623.00 | -0.41% |

Have a great time!

@dkharms
Member Author

dkharms commented Mar 31, 2026

@seqbenchbot up main bulk

@seqbenchbot
Collaborator

seqbenchbot commented Mar 31, 2026

Nice, @dkharms <(-^,^-)=b!

Your request was successfully served.
Identificator for your ongoing benchmark - b1727103.

Here is a list of helpful links:

  • Take a look at Grafana dashboard;
  • Live-tailing logs are also available;

Have a great time!

@dkharms
Member Author

dkharms commented Mar 31, 2026

@seqbenchbot down b1727103

@seqbenchbot
Collaborator

seqbenchbot commented Mar 31, 2026

Nice, @dkharms <(-^,^-)=b!

The benchmark with identificator b1727103 was finished.
I've prepared a summary for you:

Query: bulk
Type: warm

| Metric | base | comp | diff |
|---|---|---|---|
| mean (ms) | 72.81 | 72.37 | -0.60% |
| stddev (ms) | 29.14 | 28.83 | -1.06% |
| p(50) (ms) | 65.00 | 64.00 | -1.54% |
| p(95) (ms) | 132.00 | 130.00 | -1.52% |
| p(99) (ms) | 175.00 | 179.00 | +2.29% |
| iterations | 40040.00 | 39925.00 | -0.29% |

Have a great time!

@github-actions
Contributor

🔴 Performance Degradation

Some benchmarks have degraded compared to the previous run.
Full list of degraded benchmarks:

| Name | Previous (5115f7) | Current (674dbc) | Ratio | Verdict |
|---|---|---|---|---|
| MutexListAppend-4 | 196.07 MB/s | 168.78 MB/s | 0.86 | 🔴 |
| MutexListAppend-4 | 83287562.00 ns/op | 94799666.00 ns/op | 1.14 | 🔴 |
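
The Ratio column is consistent with current divided by previous, read in opposite directions for the two units: lower throughput (MB/s) and higher latency (ns/op) both count as a degradation. The sketch below illustrates that reading only. The `measurement` type and the plain comparison against 1 are assumptions; the actual check very likely applies some tolerance threshold that is not shown in the report.

```go
package main

import "fmt"

// measurement is a hypothetical record for one benchmark metric.
type measurement struct {
	name           string
	previous       float64
	current        float64
	higherIsBetter bool // true for throughput, false for time per op
}

// ratio returns current / previous, matching the reported Ratio column.
func (m measurement) ratio() float64 { return m.current / m.previous }

// degraded reports whether the metric moved in the bad direction.
// A real check would likely allow some noise margin (assumption).
func (m measurement) degraded() bool {
	if m.higherIsBetter {
		return m.ratio() < 1
	}
	return m.ratio() > 1
}

func main() {
	rows := []measurement{
		{"MutexListAppend-4 MB/s", 196.07, 168.78, true},
		{"MutexListAppend-4 ns/op", 83287562, 94799666, false},
	}
	for _, r := range rows {
		fmt.Printf("%s ratio=%.2f degraded=%t\n", r.name, r.ratio(), r.degraded())
	}
}
```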


3 participants