Commit graph

16682 commits

Author SHA1 Message Date
rjl493456442
7046e63244
trie: fix flaky test (#33711) 2026-01-29 17:22:15 +08:00
Lessa
2513feddf8
crypto/kzg4844: preallocate proof slice in ComputeCellProofs (#33703)
Preallocate the proof slice with the known size instead of growing it
via append in a loop. The length is already known from the source slice.
2026-01-29 15:49:10 +08:00
Marius van der Wijden
424bc22ab8
eth/gasprice: reduce allocations (#33698)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
Recent pprof from our validator shows ~6% of all allocations because of
the gas price oracle. This PR reduces that.
2026-01-28 20:33:04 +01:00
CPerezz
1e9dfd5bb0
core: standardize slow block JSON output for cross-client metrics (#33655)
Some checks are pending
/ Docker Image (push) Waiting to run
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
Implement standardized JSON format for slow block logging to enable
cross-client performance analysis and protocol research.

This change is part of the Cross-Client Execution Metrics initiative
proposed by Gary Rong: https://hackmd.io/dg7rizTyTXuCf2LSa2LsyQ

The standardized metrics enabled data-driven analysis like the EIP-7907
research: https://ethresear.ch/t/data-driven-analysis-on-eip-7907/23850

JSON format includes:
- block: number, hash, gas_used, tx_count
- timing: execution_ms, total_ms
- throughput: mgas_per_sec
- state_reads: accounts, storage_slots, bytecodes, code_bytes
- state_writes: accounts, storage_slots, bytecodes
- cache: account/storage/code hits, misses, hit_rate


This should come after merging #33522

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
2026-01-28 20:58:41 +08:00
Marius van der Wijden
0a8fd6841c
eth/tracers/native: add index to callTracer log (#33629)
closes https://github.com/ethereum/go-ethereum/issues/33566
2026-01-28 13:32:27 +01:00
Ng Wei Han
3d05284928
trie/bintrie: fix tree key hashing to match spec (#33694)
Based on [EIP-7864](https://eips.ethereum.org/EIPS/eip-7864), the tree
index should be 32 bytes instead of 31 bytes.
```
def get_tree_key(address: Address32, tree_index: int, sub_index: int):
    # Assumes STEM_SUBTREE_WIDTH = 256
    return tree_hash(address + tree_index.to_bytes(32, "little"))[:31] + bytes(
        [sub_index]
    )
```
2026-01-28 11:51:02 +01:00
Lessa
344d01e2be
core/rawdb: preallocate slice in iterateTransactions (#33690)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
Preallocate hashes slice with known length instead of using append in a
loop. This avoids multiple reallocations during transaction indexing.
2026-01-28 09:51:15 +08:00
Guillaume Ballet
56be36f672
cmd/keeper: export getInput in wasm builds (#33686)
This is a tweak to the wasm build, that expects the `geth_io` namespace
to expect a `geth_io` module, providing a `len` and `read` methods. This
will be provided by the WASM interface in sp1. This forces an API change
on the OpenVM side, but the interface on their side is still being
designed, so we should proceed with this change, and we'll make a
different tag for OpenVM if this can't work for them.

Co-authored-by: wakabat <wakabat@protonmail.com>
2026-01-28 09:50:15 +08:00
rjl493456442
181a3ae9e0
triedb/pathdb: improve trienode reader for searching (#33681)
Some checks are pending
/ Docker Image (push) Waiting to run
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
This PR optimizes the historical trie node reader by reworking how data
is accessed and memory is managed, reducing allocation overhead 
significantly.

Specifically:

- Instead of decoding an entire history object to locate a specific trie node, 
   the reader now searches directly within the history.

- Besides, slice pre-allocation can avoid unnecessary deep-copy significantly.
2026-01-27 20:05:35 +08:00
marukai67
e250836973
trie: preallocate slice capacity (#33689)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
This PR optimizes memory allocation in StateTrie.PrefetchAccount() and
StateTrie.PrefetchStorage() by preallocating slice capacity when the
final size is known.
2026-01-27 12:04:12 +08:00
rjl493456442
c2595381bf
core: extend the code reader statistics (#33659)
Some checks are pending
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
This PR extends the statistics of contract code read by adding these
fields:

- **CacheHitBytes**: the total number of bytes served by cache
- **CacheMissBytes**: the total number of bytes read on cache miss
- **CodeReadBytes**: the total number of bytes for contract code read
2026-01-26 11:25:53 +01:00
Csaba Kiraly
9a8e14e77e
core/txpool/legacypool: fix stale counter (#33653)
Some checks failed
/ Linux Build (push) Has been cancelled
/ Linux Build (arm) (push) Has been cancelled
/ Keeper Build (push) Has been cancelled
/ Windows Build (push) Has been cancelled
/ Docker Image (push) Has been cancelled
Calling `pool.priced.Removed` is needed to keep is sync with
`pool.all.Remove`.
It was called in other occurances, but not here.

The counter is used for internal heap management. It was working even without this, just not calling reheap at the intended frequency.

Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
2026-01-23 13:35:14 +01:00
Jonny Rhea
251b863107
core/vm: update EIP-8024 - Missing immediate byte is now treated as 0x00 (#33614)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
This PR updates the EIP-8024 implementation to match the latest spec
clarification.

---------

Co-authored-by: lightclient <lightclient@protonmail.com>
2026-01-22 12:16:02 -07:00
rjl493456442
1022c7637d
core, eth, internal, triedb/pathdb: enable eth_getProofs for history (#32727)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
This PR enables the `eth_getProofs ` endpoint against the historical states.
2026-01-22 09:19:27 +08:00
Csaba Kiraly
35922bcd33
core/txpool/legacypool: reset gauges on clear (#33654)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
2026-01-21 16:00:57 +08:00
DeFi Junkie
8fad02ac63
core/types: fix panic on invalid signature length (#33647)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
Replace panic with error return in decodeSignature to prevent crashes on
invalid inputs, and update callers to propagate the error.
2026-01-21 06:57:02 +01:00
Csaba Kiraly
54ab4e3c7d
core/txpool/legacypool: add metric for accounts in txpool (#33646)
This PR adds metrics that count the number of accounts having transactions 
in the txpool. Together with the transaction count this can be used as a 
simple indicator of the diversity of transactions in the pool.

Note: as an alternative implementation, we could use a periodic or event
driven update of these Gauges using len.

I've preferred this implementation to match what we have for the pool
sizes.

---------

Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
2026-01-21 09:23:03 +08:00
forkfury
2eb1ccc6c4
core/state: ensure deterministic hook emission order in Finalise (#33644)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
Fixes #33630

Sort self-destructed addresses before emitting hooks in Finalise() to
ensure deterministic ordering and fix flaky test
TestHooks_OnCodeChangeV2.

---------

Co-authored-by: jwasinger <j-wasinger@hotmail.com>
2026-01-20 20:36:07 +08:00
DeFi Junkie
46d804776b
accounts/scwallet: fix panic in decryptAPDU (#33606)
Validate ciphertext length in decryptAPDU, preventing runtime panics on
invalid input.
2026-01-20 12:04:23 +01:00
Felix Lange
d58f6291a2
internal/debug: add integration with Grafana Pyroscope (#33623)
This adds support for Grafana Pyroscope, a continuous profiling solution. 
The client is configured similarly to metrics, i.e. run 

     geth --pyroscope --pyroscope.server=https://...

This commit is a resubmit of #33261 with some changes.

---------

Co-authored-by: Carlos Bermudez Porto <cbermudez.dev@gmail.com>
2026-01-20 10:33:09 +01:00
cui
d0af257aa2
triedb/pathdb: double check the list availability before regeneration (#33622)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
2026-01-19 20:45:31 +08:00
rjl493456442
500931bc82
core/vm: add read only protection for opcodes (#33637)
This PR reverts a part of changes brought by https://github.com/ethereum/go-ethereum/pull/33281/changes

Specifically, read-only protection should always be enforced at the opcode level, 
regardless of whether the check has already been performed during gas metering.

It should act as a gatekeeper, otherwise, it is easy to introduce errors by adding
new gas measurement logic without consistently applying the read-only protection.
2026-01-19 20:43:14 +08:00
maskpp
ef815c59a2
rlp: improve SplitListValues allocation efficiency (#33554)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
2026-01-18 16:07:28 +08:00
lightclient
e78be59dc9
build: remove circleci config (#33616)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
This doesn't seem to be used anymore.
2026-01-17 16:22:00 +01:00
Lessa
0495350388
accounts/abi/bind/v2: replace rng in test (#33612)
Replace deprecated rand.Seed() with rand.New(rand.NewSource()) in
dep_tree_test.go.
2026-01-17 16:20:19 +01:00
Mael Regnery
3d78da9171
rpc: add a rpc.rangelimit flag (#33163)
Adding an RPC flag to limit the block range size for eth_getLogs and
eth_newFilter requests.

closing https://github.com/ethereum/go-ethereum/issues/24508

---------

Co-authored-by: MariusVanDerWijden <m.vanderwijden@live.de>
2026-01-17 14:34:08 +01:00
rjl493456442
add1890a57
triedb/pathdb: enable trienode history (#32621)
It's the part-4 for trienode history. The trienode history persistence
has been enabled with this PR by flag `history.trienode <non-negative-number>`
2026-01-17 21:23:48 +08:00
rjl493456442
588dd94aad
triedb/pathdb: implement trienode history indexing scheme (#33551)
This PR implements the indexing scheme for trie node history. Check
https://github.com/ethereum/go-ethereum/pull/33399 for more details
2026-01-17 20:28:37 +08:00
jwasinger
715bf8e81e
core: invoke selfdestruct tracer hooks during finalisation (#32919)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
The core part of this PR that we need to adopt is to move the code and
nonce change hook invocations to occur at tx finalization, instead of
when the selfdestruct opcode is called.

Additionally:
* remove `SelfDestruct6780` now that it is essentially the same as
`SelfDestruct` just gated by `is new contract`
* don't duplicate `BalanceIncreaseSelfdestruct` (transfer to recipient
of selfdestruct) in the hooked statedb and in the opcode handler for the
selfdestruct opcode.
* balance is burned immediately when the beneficiary of the selfdestruct
is the sender, and the contract was created in the same transaction.
Previously we emit two balance increases to the recipient (see above
point), and a balance decrease from the sender.

---------

Co-authored-by: Sina Mahmoodi <itz.s1na@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Co-authored-by: lightclient <lightclient@protonmail.com>
2026-01-16 15:10:08 -07:00
jwasinger
b6fb79cdf9
core/vm: in selfdestruct gas calculation, return early if there isn't enough gas to cover cold account access costs (#33450)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
There's no need to perform the subsequent state access on the target if
we already know that we are out of gas.

This aligns the state access behavior of selfdestruct with EIP-7928
2026-01-16 07:37:12 -07:00
jwasinger
23c3498836
core/vm: check if read-only in gas handlers (#33281)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
This PR causes execution to terminate at the gas handler in the case of
sstore/call if they are invoked in a static execution context.

This aligns the behavior with EIP 7928 by ensuring that we don't record
any state reads in the access list from an SSTORE/CALL in this
circumstance.

---------

Co-authored-by: lightclient <lightclient@protonmail.com>
2026-01-15 15:55:43 -07:00
Csaba Kiraly
9ba13b6097
eth/fetcher: refactor test code (#33610)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
Remove a large amount of duplicate code from the tx_fetcher tests.

---------

Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Co-authored-by: lightclient <lightclient@protonmail.com>
2026-01-15 11:37:34 -07:00
rjl493456442
494908a852
triedb/pathdb: change the bitmap to big endian (#33584)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
The bitmap is used in compact-encoded trie nodes to indicate which elements 
have been modified. The bitmap format has been updated to use big-endian
encoding. 

Bit positions are numbered from 0 to 15, where position 0 corresponds to
the most significant bit of b[0], and position 15 corresponds to the least
significant bit of b[1].
2026-01-15 17:28:57 +08:00
Jonny Rhea
e3e556b266
rpc: extract OpenTelemetry trace context from request headers (#33599)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
This PR adds support for the extraction of OpenTelemetry trace context
from incoming JSON-RPC request headers, allowing geth spans to be linked
to upstream traces when present.

---------

Co-authored-by: lightclient <lightclient@protonmail.com>
2026-01-14 14:03:48 -07:00
Jonny Rhea
a9acb3ff93
rpc, internal/telemetry: add OpenTelemetry tracing for JSON-RPC calls (#33452)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
Add Open Telemetry tracing inside the RPC server to help attribute runtime costs within `handler.handleCall()`. In particular, it allows us to distinguish time spent decoding arguments, invoking methods via reflection, and actually executing the method and constructing/encoding JSON responses.

---------

Co-authored-by: lightclient <lightclient@protonmail.com>
2026-01-14 10:58:30 -07:00
DeFi Junkie
94710f79a2
accounts/keystore: fix panic in decryptPreSaleKey (#33602)
Validate ciphertext length in decryptPreSaleKey, preventing runtime
panics on invalid input.
2026-01-14 11:51:48 +01:00
lightclient
3b17e78274 crypto/ecies: use aes blocksize
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
2026-01-13 17:12:23 +01:00
MariusVanDerWijden
5b99d2bba4 core/txpool: drop peers on invalid KZG proofs
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Co-authored-by: MariusVanDerWijden <m.vanderwijden@live.de>:
2026-01-13 17:12:08 +01:00
Felix Lange
ea4935430b version: begin v1.17.0 release cycle 2026-01-13 17:07:44 +01:00
Maxim Evtush
5a1990d1d8
rpc: fix limitedBuffer.Write to properly enforce size limit (#33545)
Updated the `avail` calculation to correctly compute remaining capacity:
`buf.limit - len(buf.output)`, ensuring the buffer never exceeds its
configured limit regardless of how many times `Write()` is called.
2026-01-13 20:25:53 +08:00
bigbear
1278b4891d
tests: repair oss-fuzz coverage command (#33304)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
The coverage build path was generating go test commands with a bogus
-tags flag that held the coverpkg value, so the run kept failing. I
switched coverbuild to treat the optional argument as an override for
-coverpkg and stopped passing coverpkg from the caller. Now the script
emits a clean go test invocation that should actually succeed.
2026-01-13 09:24:45 +01:00
0xcharry
31d5d82ce5
internal/ethapi: refactor RPC tx formatter (#33582)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
2026-01-12 14:31:45 +08:00
Jonny Rhea
c890637af9
core/rawdb: skip missing block bodies during tx unindexing (#33573)
This PR fixes an issue where the tx indexer would repeatedly try to
“unindex” a block with a missing body, causing a spike in CPU usage.
This change skips these blocks and advances the index tail. The fix was
verified both manually on a local development chain and with a new test.

resolves #33371
2026-01-12 14:25:22 +08:00
Csaba Kiraly
127d1f42bb
core: remove duplicate chainHeadFeed.Send code (#33563)
Some checks failed
/ Linux Build (push) Has been cancelled
/ Linux Build (arm) (push) Has been cancelled
/ Keeper Build (push) Has been cancelled
/ Windows Build (push) Has been cancelled
/ Docker Image (push) Has been cancelled
The code was simply duplicate, so we can remove some code lines here.

Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
2026-01-09 14:40:40 +01:00
Xuanyu Hu
7cd400612f
tests: check correct revert on invalid tests (#33543)
This PR fixes an issue where `evm statetest` would not verify the
post-state root hash if the test case expected an exception (e.g.
invalid transaction).

The fix involves:
1.  Modifying `tests/state_test_util.go` in the `Run` method.
2. When an expected error occurs (`err != nil`), we now check if
`post.Root` is defined.
3. If defined, we recalculate the intermediate root from the current
state (which is reverted to the pre-transaction snapshot upon error).
4. We use `GetChainConfig` and `IsEIP158` to ensure the correct state
clearing rules are applied when calculating the root, avoiding
regressions on forks that require EIP-158 state clearing.
5. If the calculated root mismatches the expected root, the test now
fails.

This ensures that state tests are strictly verified against their
expected post-state, even for failure scenarios.

Fixes issue #33527

---------

Co-authored-by: MariusVanDerWijden <m.vanderwijden@live.de>
2026-01-09 12:53:55 +01:00
maradini77
4eb5b66d9e
ethclient: restore BlockReceipts support for BlockNumberOrHash objects (#33242)
- pass `rpc.BlockNumberOrHash` directly to `eth_getBlockReceipts` so
`requireCanonical` and other fields survive
- aligns `BlockReceipts` with other `ethclient` methods and re-enables
canonical-only receipt queries
2026-01-09 11:25:04 +01:00
Csaba Kiraly
b993cb6f38
core/txpool/blobpool: allow gaps in blobpool (#32717)
Allow the blobpool to accept blobs out of nonce order

Previously, we were dropping blobs that arrived out-of-order. However,
since fetch decisions are done on receiver side,
out-of-order delivery can happen, leading to inefficiencies.

This PR:
- adds an in-memory blob tx storage, similar to the queue in the
legacypool
- a limited number of received txs can be added to this per account
- txs waiting in the gapped queue are not processed further and not
propagated further until they are unblocked by adding the previos nonce
to the blobpool

The size of the in-memory storage is currently limited per account,
following a slow-start logic.
An overall size limit, and a TTL is also enforced for DoS protection.

---------

Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Co-authored-by: MariusVanDerWijden <m.vanderwijden@live.de>
2026-01-09 10:43:15 +01:00
rjl493456442
f51870e40e
rlp, trie, triedb/pathdb: compress trienode history (#32913)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
This pull request introduces a mechanism to compress trienode history by
storing only the node diffs between consecutive versions.

- For full nodes, only the modified children are recorded in the history;
- For short nodes, only the modified value is stored;

If the node type has changed, or if the node is newly created or
deleted, the entire node value is stored instead.

To mitigate the overhead of reassembling nodes from diffs during history
reads, checkpoints are introduced by periodically storing full node values.

The current checkpoint interval is set to every 16 mutations, though
this parameter may be made configurable in the future.
2026-01-08 21:58:02 +08:00
Madison carter
52f998d5ec
ethclient: omit nil address/topics from filter args (#33464)
Fixes #33369

This omits "topics" and "addresses" from the filter when they are unspecified.
It is required for interoperability with some server implementations that cannot
handle `null` for these fields.
2026-01-08 11:09:29 +01:00
rjl493456442
d5efd34010
triedb/pathdb: introduce extension to history index structure (#33399)
It's a PR based on #33303 and introduces an approach for trienode
history indexing.

---

In the current archive node design, resolving a historical trie node at
a specific block
involves the following steps:

- Look up the corresponding trie node index and locate the first entry
whose state ID
   is greater than the target state ID.
- Resolve the trie node from the associated trienode history object.

A naive approach would be to store mutation records for every trie node,
similar to
how flat state mutations are recorded. However, the total number of trie
nodes is
extremely large (approximately 2.4 billion), and the vast majority of
them are rarely
modified. Creating an index entry for each individual trie node would be
very wasteful
in both storage and indexing overhead. To address this, we aggregate
multiple trie
nodes into chunks and index mutations at the chunk level instead. 

---

For a storage trie, the trie is vertically partitioned into multiple sub
tries, each spanning
three consecutive levels. The top three levels (1 + 16 + 256 nodes) form
the first chunk,
and every subsequent three-level segment forms another chunk.

```
Original trie structure

Level 0               [ ROOT ]                               1 node
Level 1        [0] [1] [2] ... [f]                          16 nodes
Level 2     [00] [01] ... [0f] [10] ... [ff]               256 nodes
Level 3   [000] [001] ... [00f] [010] ... [fff]           4096 nodes
Level 4   [0000] ... [000f] [0010] ... [001f] ... [ffff] 65536 nodes

Vertical split into chunks (3 levels per chunk)

Level0             [ ROOT ]                     1 chunk
Level3        [000]   ...     [fff]          4096 chunks
Level6   [000000]    ...    [fffffff]    16777216 chunks  
```

Within each chunk, there are 273 nodes in total, regardless of the
chunk's depth in the trie.

```
Level 0           [ 0 ]                         1 node
Level 1        [ 1 ] … [ 16 ]                  16 nodes
Level 2     [ 17 ] … … [ 272 ]                256 nodes
```

Each chunk is uniquely identified by the path prefix of the root node of
its corresponding
sub-trie. Within a chunk, nodes are identified by a numeric index
ranging from 0 to 272.

For example, suppose that at block 100, the nodes with paths `[]`,
`[0]`, `[f]`, `[00]`, and `[ff]`
are modified. The mutation record for chunk 0 is then appended with the
following entry:

`[100 → [0, 1, 16, 17, 272]]`, `272` is the numeric ID of path `[ff]`.

Furthermore, due to the structural properties of the Merkle Patricia
Trie, if a child node
is modified, all of its ancestors along the same path must also be
updated. As a result,
in the above example, recording mutations for nodes `00` and `ff` alone
is sufficient,
as this implicitly indicates that their ancestor nodes `[]`, `[0]` and
`[f]` were also
modified at block 100.

--- 

Query processing is slightly more complicated. Since trie nodes are
indexed at the chunk
level, each individual trie node lookup requires an additional filtering
step to ensure that
a given mutation record actually corresponds to the target trie node.

As mentioned earlier, mutation records store only the numeric
identifiers of leaf nodes,
while ancestor nodes are omitted for storage efficiency. Consequently,
when querying
an ancestor node, additional checks are required to determine whether
the mutation
record implicitly represents a modification to that ancestor.

Moreover, since trie nodes are indexed at the chunk level, some trie
nodes may be
updated frequently, causing their mutation records to dominate the
index. Queries
targeting rarely modified trie nodes would then scan a large amount of
irrelevant
index data, significantly degrading performance.

To address this issue, a bitmap is introduced for each index block and
stored in the
chunk's metadata. Before loading a specific index block, the bitmap is
checked to
determine whether the block contains mutation records relevant to the
target trie node.
If the bitmap indicates that the block does not contain such records,
the block is skipped entirely.
2026-01-08 09:57:35 +01:00