Description
Background
Lite nodes retain all state data, while non-state data is limited to the most recent 65,536 blocks of block and transaction data.
toolkit db lite uses an archiveDbs list to determine which databases are excluded when splitting a lite node snapshot:
// plugins/.../DbLite.java:61-66
private static final List<String> archiveDbs = Arrays.asList(
BLOCK_DB_NAME, // "block"
BLOCK_INDEX_DB_NAME, // "block-index"
TRANS_DB_NAME, // "trans"
TRANSACTION_RET_DB_NAME, // "transactionRetStore"
TRANSACTION_HISTORY_DB_NAME); // "transactionHistoryStore"

Currently, account-trace and balance-trace are not in this list, so they are copied in full into the lite node snapshot. However, both databases are space-intensive and effectively unusable in a lite node context: they are enabled via the CLI flag --history-balance-lookup or the config option storage.balance.history.lookup, are written during block and transaction execution, and serve only the historical balance query API (getAccountBalance).
Since lite nodes retain only the most recent 65,536 blocks, historical block data before the snapshot point (e.g., block, trans) is already excluded, and the historical balance query API is inherently unavailable on lite nodes — retaining these two databases provides no practical value.
| Database | Contents | Measured Size |
|---|---|---|
| balance-trace | Historical balance change records at the block and transaction level | ≈ 690 GB |
| account-trace | Historical account balances indexed by address + block number | ≈ 180 GB |
| Total | Based on the mainnet full node snapshot measured on 2026-03-11 | ≈ 870 GB |
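The exclusion mechanism described in this section can be sketched as a simple name-based partition. This is a minimal illustration, not the actual DbLite.java logic: the class name, the `partition` helper, and the `account` database name in the demo are assumptions made for the example; only the five archive names come from the list quoted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: how a name list like archiveDbs could drive the split.
final class ArchiveSplitSketch {
  // String values copied from the archiveDbs comments quoted in this issue.
  static final List<String> ARCHIVE_DBS = Arrays.asList(
      "block", "block-index", "trans", "transactionRetStore", "transactionHistoryStore");

  /** Groups database names into "archive" (excluded from lite) and "lite" (copied). */
  static Map<String, List<String>> partition(List<String> dbNames, List<String> archiveDbs) {
    Map<String, List<String>> result = new LinkedHashMap<>();
    result.put("archive", new ArrayList<>());
    result.put("lite", new ArrayList<>());
    for (String name : dbNames) {
      // Anything not listed as an archive database ends up in the lite snapshot.
      result.get(archiveDbs.contains(name) ? "archive" : "lite").add(name);
    }
    return result;
  }

  public static void main(String[] args) {
    // "account" stands in for any state database; it is illustrative only.
    List<String> dbs = Arrays.asList("account", "block", "trans", "account-trace", "balance-trace");
    System.out.println(partition(dbs, ARCHIVE_DBS));
  }
}
```

With the current list, both trace databases fall into the lite group, which is the behavior this issue describes.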
Problem Statement
When toolkit db lite splits a lite node snapshot, it does not include account-trace and balance-trace in the archiveDbs exclusion list. As a result, approximately 870 GB of effectively unused data is copied in full into the lite node snapshot, increasing storage costs and data transfer overhead while contributing nothing to lite node functionality.
Rationale
Why should this feature exist?
- Storage savings: Each split operation avoids copying and transferring approximately 870 GB of unused data.
- No functional impact: The two excluded databases are not involved in any online state computation on lite nodes; they serve only the historical balance query API, which is already unavailable on lite nodes. Furthermore, both databases are optional features that most nodes do not enable by default.
What are the use cases?
- Node operators using toolkit db lite to generate lite node snapshots, reducing disk usage and network transfer costs.
- Snapshot distribution scenarios where a significantly smaller snapshot size lowers the barrier to initial sync.
- Storage-constrained environments where eliminating unused data preserves valuable disk space.
Who would benefit from this feature?
Node operators, lite node deployers, and snapshot distribution service providers.
Proposed Solution
Specification
1. Split (lite)
Add account-trace and balance-trace to the archiveDbs list in DbLite.java so they are automatically excluded when splitting a lite node snapshot, and classified under the archive (historical) portion:
private static final List<String> archiveDbs = Arrays.asList(
BLOCK_DB_NAME,
BLOCK_INDEX_DB_NAME,
TRANS_DB_NAME,
TRANSACTION_RET_DB_NAME,
TRANSACTION_HISTORY_DB_NAME,
ACCOUNT_TRACE_DB_NAME, // new
BALANCE_TRACE_DB_NAME); // new

2. Merge (merge)
The merge operation combines a lite node snapshot with historical data to restore a full node. Since account-trace and balance-trace are now classified on the archive side, the merge logic must be updated and validated accordingly:
Merge strategy:
| Source | Action |
|---|---|
| account-trace / balance-trace from archive | Retain entries with height < lite_height |
| account-trace / balance-trace from lite snapshot | Append only entries with height > archive_height |
| Final result | Two contiguous segments, no overlap, no gap |
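The merge strategy in the table can be sketched with an in-memory map keyed by block height. This is a simplified model under several assumptions that the issue does not state: records are keyed by height alone (the real account-trace keys also include an address), there is one record per height, lite_height is the first height recorded on the lite side, and archive_height is the last height held by the archive. The class and method names are hypothetical.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch of the height-scoped merge; not the toolkit's real code.
final class TraceMergeSketch {
  /** Applies the two rules from the merge strategy table. */
  static NavigableMap<Long, String> merge(NavigableMap<Long, String> archive,
      NavigableMap<Long, String> lite, long archiveHeight, long liteHeight) {
    NavigableMap<Long, String> merged = new TreeMap<>();
    merged.putAll(archive.headMap(liteHeight, false));  // archive: height < lite_height
    merged.putAll(lite.tailMap(archiveHeight, false));  // lite: height > archive_height
    return merged;
  }

  /** Returns true when the heights form one run with no gap. */
  static boolean isContiguous(NavigableMap<Long, String> merged) {
    Long previous = null;
    for (Long height : merged.keySet()) {
      if (previous != null && height != previous + 1) {
        return false;
      }
      previous = height;
    }
    return true;
  }

  public static void main(String[] args) {
    NavigableMap<Long, String> archive = new TreeMap<>();
    NavigableMap<Long, String> lite = new TreeMap<>();
    for (long h = 1; h <= 3; h++) archive.put(h, "archive-" + h);
    for (long h = 4; h <= 6; h++) lite.put(h, "lite-" + h);
    NavigableMap<Long, String> merged = merge(archive, lite, 3L, 4L);
    System.out.println(merged.keySet() + " contiguous=" + isContiguous(merged));
  }
}
```

Because the map is keyed by height, an overlap cannot produce duplicate entries here; a real implementation would need to decide which side wins for a shared key. The contiguity check is the part worth carrying into the merge validation tests.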
- API Changes: None.
- Configuration Changes: None.
- Protocol Changes: None.
Testing Strategy
Test Scenarios
- After running toolkit db lite, verify that the lite snapshot directory does not contain account-trace or balance-trace, and that the archive directory does contain them (if the source node had the feature enabled).
- Compare snapshot sizes before and after the change to confirm the reduction matches expectations (≈ 870 GB).
- Start a lite node from the new snapshot and verify that block sync, state queries, and other core functions work correctly with no errors.
- Merge validation (source node with history-balance-lookup enabled): after merging, verify that the full node contains complete account-trace and balance-trace data and that the historical balance query API functions correctly.
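The first scenario could be automated with a small directory check like the one below. It is a sketch only: it assumes each database is stored as a subdirectory named after the database under the path passed in, and the class and method names are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical check: reports trace databases that are still present in a lite snapshot.
final class LiteSnapshotCheck {
  static final List<String> TRACE_DBS = Arrays.asList("account-trace", "balance-trace");

  /** Returns the trace database names found as subdirectories of liteDbDir. */
  static List<String> unexpectedTraceDbs(Path liteDbDir) {
    List<String> found = new ArrayList<>();
    for (String name : TRACE_DBS) {
      if (Files.isDirectory(liteDbDir.resolve(name))) {
        found.add(name);
      }
    }
    return found;
  }

  public static void main(String[] args) {
    if (args.length != 1) {
      System.err.println("usage: LiteSnapshotCheck <lite-snapshot-database-dir>");
      return;
    }
    List<String> found = unexpectedTraceDbs(Paths.get(args[0]));
    System.out.println(found.isEmpty() ? "OK: no trace databases" : "Unexpected: " + found);
  }
}
```

Running the same check against the archive directory and expecting both names to be present covers the second half of that scenario.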
Performance Considerations
The change affects only file-handling logic during the snapshot split and merge phases; it has no impact on node runtime performance.
Scope of Impact
- Core protocol
- API/RPC
- Database
- Network layer
- Smart contracts
- Documentation
- Other: toolkit db lite.
Breaking Changes
None.
Backward Compatibility
No modifications are made to existing full node database directories; only the behavior of toolkit db lite splitting is affected. Previously generated lite node snapshots are not impacted.
Implementation
Do you have ideas regarding the implementation?
Append ACCOUNT_TRACE_DB_NAME and BALANCE_TRACE_DB_NAME to the archiveDbs constant in DbLite.java — a minimal code change. Add incremental merge logic to the merge flow for these two databases, scoped by block height range.
Are you willing to implement this feature?
- Yes, I can implement this feature
Estimated Complexity
- Medium (moderate changes)
Alternatives Considered
None.
Additional Context
Related Issues/PRs
None.
References
getAccountBalance API documentation