go-ethereum

mirror of https://github.com/ethereum/go-ethereum.git synced 2026-05-02 14:22:55 +00:00

Author	SHA1	Message	Date
rjl493456442	1022c7637d	core, eth, internal, triedb/pathdb: enable eth_getProofs for history (#32727) This PR enables the `eth_getProofs` endpoint against historical states.	2026-01-22 09:19:27 +08:00
rjl493456442	588dd94aad	triedb/pathdb: implement trienode history indexing scheme (#33551) This PR implements the indexing scheme for trie node history. See https://github.com/ethereum/go-ethereum/pull/33399 for more details.	2026-01-17 20:28:37 +08:00
rjl493456442	d5efd34010	triedb/pathdb: introduce extension to history index structure (#33399)

This PR is based on #33303 and introduces an approach for trienode history indexing.

---

In the current archive node design, resolving a historical trie node at a specific block involves the following steps:

- Look up the corresponding trie node index and locate the first entry whose state ID is greater than the target state ID.
- Resolve the trie node from the associated trienode history object.

A naive approach would be to store mutation records for every trie node, similar to how flat state mutations are recorded. However, the total number of trie nodes is extremely large (approximately 2.4 billion), and the vast majority of them are rarely modified. Creating an index entry for each individual trie node would be very wasteful in both storage and indexing overhead. To address this, we aggregate multiple trie nodes into chunks and index mutations at the chunk level instead.

---

For a storage trie, the trie is vertically partitioned into multiple sub-tries, each spanning three consecutive levels. The top three levels (1 + 16 + 256 nodes) form the first chunk, and every subsequent three-level segment forms another chunk.

```
Original trie structure

Level 0                    [ ROOT ]                              1 node
Level 1              [0] [1] [2] ... [f]                        16 nodes
Level 2         [00] [01] ... [0f] [10] ... [ff]               256 nodes
Level 3       [000] [001] ... [00f] [010] ... [fff]           4096 nodes
Level 4   [0000] ... [000f] [0010] ... [001f] ... [ffff]     65536 nodes

Vertical split into chunks (3 levels per chunk)

Level0                [ ROOT ]                         1 chunk
Level3           [000]   ...   [fff]                4096 chunks
Level6       [000000]    ...    [ffffff]        16777216 chunks
```

Within each chunk, there are 273 nodes in total, regardless of the chunk's depth in the trie.

```
Level 0            [ 0 ]                  1 node
Level 1       [ 1 ]  …  [ 16 ]           16 nodes
Level 2    [ 17 ]  …  …  [ 272 ]        256 nodes
```

Each chunk is uniquely identified by the path prefix of the root node of its corresponding sub-trie. Within a chunk, nodes are identified by a numeric index ranging from 0 to 272.

For example, suppose that at block 100, the nodes with paths `[]`, `[0]`, `[f]`, `[00]`, and `[ff]` are modified. The mutation record for chunk 0 is then appended with the following entry: `[100 → [0, 1, 16, 17, 272]]`, where `272` is the numeric ID of path `[ff]`.

Furthermore, due to the structural properties of the Merkle Patricia Trie, if a child node is modified, all of its ancestors along the same path must also be updated. As a result, in the above example, recording mutations for nodes `[00]` and `[ff]` alone is sufficient, as this implicitly indicates that their ancestor nodes `[]`, `[0]` and `[f]` were also modified at block 100.

---

Query processing is slightly more complicated. Since trie nodes are indexed at the chunk level, each individual trie node lookup requires an additional filtering step to ensure that a given mutation record actually corresponds to the target trie node.

As mentioned earlier, mutation records store only the numeric identifiers of leaf nodes, while ancestor nodes are omitted for storage efficiency. Consequently, when querying an ancestor node, additional checks are required to determine whether the mutation record implicitly represents a modification to that ancestor.

Moreover, since trie nodes are indexed at the chunk level, some trie nodes may be updated frequently, causing their mutation records to dominate the index. Queries targeting rarely modified trie nodes would then scan a large amount of irrelevant index data, significantly degrading performance.

To address this issue, a bitmap is introduced for each index block and stored in the chunk's metadata. Before loading a specific index block, the bitmap is checked to determine whether the block contains mutation records relevant to the target trie node. If the bitmap indicates that the block does not contain such records, the block is skipped entirely.	2026-01-08 09:57:35 +01:00
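The chunk addressing described in #33399 can be sketched in a few lines of Go. This is a minimal illustration under stated assumptions, not the pathdb implementation: the names `splitPath`, `nodeID`, `parent`, `covers` and `compact` are hypothetical, paths are assumed to be slices of nibbles (values 0 to 15), and the numbering follows the 0 to 272 layout quoted in the commit message.

```go
package main

import "fmt"

// chunkDepth is the number of trie levels per chunk, as described in the
// commit message. All names in this file are hypothetical.
const chunkDepth = 3

// splitPath splits a nibble path into the chunk identifier (the path prefix
// of the chunk's root node) and the node's path relative to that root.
func splitPath(path []byte) (chunk, rel []byte) {
	n := len(path) / chunkDepth * chunkDepth
	return path[:n], path[n:]
}

// nodeID maps a relative path of 0, 1 or 2 nibbles to an index in [0, 272].
func nodeID(rel []byte) int {
	switch len(rel) {
	case 0:
		return 0 // chunk root
	case 1:
		return 1 + int(rel[0]) // 1..16
	default:
		return 17 + int(rel[0])*16 + int(rel[1]) // 17..272
	}
}

// parent returns the in-chunk index of the parent of node id.
func parent(id int) int {
	if id <= 16 {
		return 0
	}
	return 1 + (id-17)/16
}

// covers reports whether a recorded node index implies a modification of
// target, i.e. target is the recorded node itself or one of its ancestors.
func covers(recorded, target int) bool {
	for id := recorded; ; id = parent(id) {
		if id == target {
			return true
		}
		if id == 0 {
			return false
		}
	}
}

// compact drops every index that is already implied by a deeper one.
func compact(ids []int) []int {
	var out []int
	for _, a := range ids {
		implied := false
		for _, b := range ids {
			if a != b && covers(b, a) {
				implied = true
				break
			}
		}
		if !implied {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	chunk, rel := splitPath([]byte{0xf, 0xf})
	fmt.Println(len(chunk), nodeID(rel))           // 0 272
	fmt.Println(compact([]int{0, 1, 16, 17, 272})) // [17 272]
}
```

A lookup for an ancestor such as `[f]` (index 16) would then test each recorded index with `covers(recorded, 16)`, which is the extra filtering step the commit message mentions. The per-block bitmap is not modelled here.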
rjl493456442	b3e7d9ee44	triedb/pathdb: optimize history indexing efficiency (#33303) This pull request optimizes history indexing by splitting a single large database batch into multiple smaller chunks. Originally, the indexer resolved a batch of state histories and committed all corresponding index entries atomically together with the indexing marker. While indexing more state histories in a single batch improves efficiency, excessively large batches can cause significant memory issues. To mitigate this, the pull request splits the mega-batch into several smaller batches and flushes them independently during indexing. However, this introduces a potential inconsistency: some index entries may be flushed while the indexing marker is not, so an unclean shutdown may leave the database in a partially updated state and corrupt the index data. To address this, head truncation is introduced. After a restart, any excess index entries beyond the expected indexing marker are removed, ensuring the index remains consistent after an unclean shutdown.	2025-12-30 16:05:13 +01:00
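The interaction between split batches and head truncation described in #33303 can be modelled with a small in-memory sketch. Everything here is an assumption for illustration: the `index` type, `indexRange`, `truncateHead` and the `crashAfter` parameter are hypothetical and do not mirror the actual pathdb data structures or database layout.

```go
package main

import "fmt"

// index is a toy model: entries holds the history IDs whose index data has
// been flushed (ascending), marker is the last ID recorded as fully indexed.
type index struct {
	entries []uint64
	marker  uint64
}

// indexRange indexes IDs from..to in batches of batchSize, flushing each
// batch independently. The marker is written only after the last batch.
// crashAfter simulates an unclean shutdown after that many flushed batches
// (use a negative value for no crash). It reports whether indexing finished.
func (ix *index) indexRange(from, to, batchSize uint64, crashAfter int) bool {
	flushed := 0
	for start := from; start <= to; start += batchSize {
		if flushed == crashAfter {
			return false // marker is never updated
		}
		end := start + batchSize - 1
		if end > to {
			end = to
		}
		for id := start; id <= end; id++ {
			ix.entries = append(ix.entries, id)
		}
		flushed++
	}
	ix.marker = to
	return true
}

// truncateHead removes every entry beyond the marker and returns how many
// entries were dropped. It is meant to run once after a restart.
func (ix *index) truncateHead() int {
	keep := len(ix.entries)
	for keep > 0 && ix.entries[keep-1] > ix.marker {
		keep--
	}
	removed := len(ix.entries) - keep
	ix.entries = ix.entries[:keep]
	return removed
}

func main() {
	ix := &index{}
	ix.indexRange(1, 10, 4, -1) // completes, marker = 10
	ix.indexRange(11, 20, 4, 2) // "crashes" after two batches (IDs 11..18)
	fmt.Println(len(ix.entries), ix.marker, ix.truncateHead(), len(ix.entries))
}
```

After the simulated crash the model holds entries up to 18 while the marker still says 10. `truncateHead` drops the eight excess entries, so the next indexing run can resume from the marker without duplicates.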
rjl493456442	0a8b820725	triedb/pathdb: make batch with pre-allocated size (#32914) In this PR, the database batch for writing the history index data is pre-allocated. It was observed that the database batch grows repeatedly while the mega-batch is being built, causing significant memory allocation pressure. Pre-allocating the batch effectively mitigates this overhead.	2025-10-21 13:11:36 +02:00
rjl493456442	de24450dbf	core/rawdb, triedb/pathdb: introduce trienode history (#32596) This pull request is based on #32523 and implements the structure of the trienode history.	2025-10-10 14:51:27 +08:00
rjl493456442	21769f3474	triedb/pathdb: generalize the history indexer (#32523) This pull request is based on #32306 and is the second part of shipping trienode history. Specifically, it generalizes the existing index mechanism, making it usable by both state history and trienode history in the near future.	2025-09-17 15:57:16 +02:00
rjl493456442	bc4ee71a5d	triedb/pathdb: add recovery mechanism in state indexer (#32447) Alternative to #32335; enhances the history indexer's recovery after an unclean shutdown.	2025-09-08 16:07:00 +08:00
Mars	0e69530c6e	all: improve ETA calculation across all progress indicators (#32521) ### Summary Fixes long-standing ETA calculation errors in progress indicators that have been present since February 2021. The current implementation produces increasingly inaccurate estimates due to integer division precision loss. ### Problem `3aeccadd04/triedb/pathdb/history_indexer.go (L541-L553)` The ETA calculation has two critical issues: 1. Integer division precision loss: `speed` is calculated as `uint64`. 2. Off-by-one: `speed` uses `+ 1` (twice) to avoid division by zero, but this distorts the final calculation. This results in wildly inaccurate time estimates that don't improve as progress continues. ### Example Current output during state history indexing: ``` lvl=info msg="Indexing state history" processed=16858580 left=41802252 elapsed=18h22m59.848s eta=11h36m42.252s ``` Expected calculation: - Speed: 16858580 ÷ 66179848ms = 0.255 blocks/ms - ETA: 41802252 ÷ 0.255 = ~45.6 hours Current buggy calculation: - Speed: comes out as 1 block/ms (integer division yields 0, then `+ 1`) - ETA: 41802252 ÷ 1 = ~11.6 hours ❌ ### Solution - Created centralized `CalculateETA()` function in common package - Replaced all 8 duplicate code copies across the codebase ### Testing Verified accurate ETA calculations during archive node reindexing with significantly improved time estimates.	2025-09-01 13:47:02 +08:00
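The arithmetic in #32521 can be reproduced with a short Go sketch. Both functions are hypothetical: `buggyETA` is reconstructed from the description above (an integer `speed` with `+ 1` applied twice) rather than copied from the referenced file, and `floatETA` is one plausible floating-point fix, not necessarily how `CalculateETA()` is written.

```go
package main

import (
	"fmt"
	"time"
)

// buggyETA mimics the described behaviour: speed is an integer number of
// items per millisecond, with +1 applied twice to avoid division by zero.
// Whenever fewer than one item is processed per millisecond, the division
// truncates to 0 and speed collapses to exactly 1.
func buggyETA(done, left uint64, elapsed time.Duration) time.Duration {
	speed := done/uint64(elapsed/time.Millisecond+1) + 1
	return time.Duration(left/speed) * time.Millisecond
}

// floatETA computes the speed in floating point, so fractional rates are
// preserved. It returns 0 when no estimate can be made yet.
func floatETA(done, left uint64, elapsed time.Duration) time.Duration {
	ms := elapsed.Milliseconds()
	if done == 0 || ms <= 0 {
		return 0
	}
	speed := float64(done) / float64(ms) // items per millisecond
	return time.Duration(float64(left)/speed) * time.Millisecond
}

func main() {
	// Values taken from the log line quoted in the commit message.
	done, left := uint64(16858580), uint64(41802252)
	elapsed := 18*time.Hour + 22*time.Minute + 59848*time.Millisecond

	fmt.Println("buggy:", buggyETA(done, left, elapsed)) // buggy: 11h36m42.252s
	fmt.Println("float:", floatETA(done, left, elapsed))
}
```

With the logged values, the integer version reproduces the reported `eta=11h36m42.252s`, while the floating-point version lands between 45 and 46 hours, in line with the expected calculation in the commit message.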
rjl493456442	95ab643bb8	triedb/pathdb: refactor state history write (#32497) This pull request slightly refactors the internal implementation of the path database, specifically: - purge the state index data in batch - simplify the logic of state history construction and indexing to make it more readable	2025-08-26 21:53:55 +08:00
rjl493456442	8c58f4920d	triedb/pathdb: rename history to state history (#32498) This is an internal refactoring PR that renames history to stateHistory. It is a prerequisite for merging trienode history, avoiding a name conflict.	2025-08-26 08:52:39 +02:00
Delweng	16117eb7cd	triedb/pathdb: fix an deadlock in history indexer (#32260) It seems `signal.result` was not sent back in the shorten case, which causes a deadlock. --------- Signed-off-by: jsvisa <delweng@gmail.com> Co-authored-by: Gary Rong <garyrong0905@gmail.com>	2025-07-23 15:12:55 +08:00
Delweng	62a17fdb25	core/rawdb, triedb/pathdb: fix two inaccurate comments (#32130)	2025-07-02 08:46:03 +08:00
Delweng	c59c647ed7	triedb: reset state indexer after snap synced (#32104) Fix the issue after initial snap sync with `gcmode=archive` enabled. ``` NewPayload: inserting block failed error="history indexing is out of order, last: null, requested: 1" ``` --------- Signed-off-by: Delweng <delweng@gmail.com> Co-authored-by: Gary Rong <garyrong0905@gmail.com>	2025-07-01 11:35:22 +08:00
rjl493456442	0c90e4bda0	all: incorporate state history indexing status into eth_syncing response (#32099) This pull request tracks the state indexing progress in the eth_syncing RPC response, i.e. a non-null syncing status is returned until indexing has finished.	2025-06-26 17:20:20 +02:00
rjl493456442	a92f2b86e3	core, eth, triedb: serve historical states over RPC (#31161) This is part 2 of the archive node over path mode work, which ultimately ships the functionality to serve historical states.	2025-06-25 16:50:54 +08:00
rjl493456442	ce63bba361	eth, triedb/pathdb: permit write buffer allowance in PBSS archive mode (#32091) This pull request fixes a flaw in PBSS archive mode that significantly degrades performance when the mode is enabled. Originally, in hash mode, the dirty trie cache is completely disabled when archive mode is active, in order to disable the in-memory garbage collection mechanism. However, the internal logic in path mode differs significantly, and the dirty trie node cache is essential for maintaining chain insertion performance. Therefore, the cache is now retained in path mode.	2025-06-25 16:49:09 +08:00
rjl493456442	9c5c0e37bf	core/rawdb, triedb/pathdb: implement history indexer (#31156) This pull request is part 1 of shipping the core of the archive node in PBSS mode.	2025-06-24 14:36:12 +02:00

18 commits