go-ethereum

mirror of https://github.com/ethereum/go-ethereum.git synced 2026-06-15 19:31:37 +00:00

Author	SHA1	Message	Date
CPerezz	f57dd20461	trie/bintrie: mark stems created via Empty.Insert* as dirty Empty.Insert and Empty.InsertValuesAtStem construct a fresh StemNode with mustRecompute=true but left the new `dirty` field at its zero value. With the skip-clean CollectNodes optimization enabled, the resulting stem was treated as already-persisted and never flushed to disk. A parent InternalNode's blob would be written referencing a hash for which no blob existed, causing "missing trie node" errors on subsequent reads. This is the path hit whenever a key is inserted into an Empty subtree — the common case on the first insert, and frequently thereafter on splits that leave one side Empty. A long-running deployment surfaced the bug after ~15 hours of random ERC20 writes. Add `dirty: true` to both struct literals, and add regression guards TestEmptyInsertMarksDirty / TestEmptyInsertValuesAtStemMarksDirty that assert the returned stem is dirty.	2026-04-18 09:06:11 +02:00
CPerezz	fad11d5795	trie/bintrie: skip clean nodes in CollectNodes to reduce commit write amplification BinaryTrie.Commit unconditionally walked every resolved in-memory node and flushed it into the NodeSet, producing one Pebble write per resolved internal + stem node on every block, even when the node's on-disk blob was bitwise identical to the previous commit. On a warm 400M-state workload this meant tens of thousands of redundant 65-byte writes per block, compounding Pebble compaction pressure on every commit. The existing mustRecompute flag tracks hash staleness, not disk-blob staleness: after Hash() completes, mustRecompute is cleared even though the fresh blob has not been persisted. It is therefore insufficient for a skip-flush optimization. This change mirrors MPT's committer pattern (trie/committer.go:51-56) by adding a dirty flag on InternalNode and StemNode with the semantics "the on-disk blob is stale". The flag is: (1) set to true wherever the node is created or structurally modified (the same call sites that already set mustRecompute = true); (2) set to false only after the node has been passed to the flushfn inside CollectNodes; (3) left false on nodes produced by DeserializeNodeWithHash, matching the "loaded from disk, already persisted" semantics. CollectNodes short-circuits on !dirty subtrees; the propagation invariant (an ancestor of any dirty node is itself dirty) is already maintained by the existing InsertValuesAtStem / Insert paths, which now mirror every mustRecompute = true setter with a dirty = true setter. Serialization format, hash computation, state root, and the pathdb write path are untouched. Empty NodeSets are already tolerated by triedb/pathdb.writeNodes.
BenchmarkCollectNodes_SparseWrite (10,000-stem trie, one-leaf modification + Commit per iteration, Apple M4 Pro): before: 12,653,000 ns/op, 107,224,740 B/op, 80,953 allocs/op; after: 7,336 ns/op, 37,774 B/op, 134 allocs/op; speedup: ~1,725x; memory: ~2,839x less; allocs: ~604x fewer. End-to-end impact on a benchmarked geth build depends on workload; the new TestBinaryTrieCommitIncremental provides a structural regression guard.	2026-04-17 15:54:16 +02:00
Guillaume Ballet	ba215fd927	cmd, core, trie, triedb: split CachingDB into merkle + binary dbs. (#34700) This PR implements some prerequisite changes for #34004: split the `CachingDB` into a `MerkleDB` and a `UBTDB`, so that very different behaviors don't clash as much. The transition isn't handled by this PR, but after talking to Gary we agreed that `UBTDB` should receive another `triedb`, which will only be loaded if the `Ended` flag is set to false in the conversion contract. If this is too hard to achieve, it makes sense to load it regardless, and then loading can be prevented at a later stage by adding a `UBTTransitionFinalizationTime` in `ChainConfig`. --------- Co-authored-by: Gary Rong <garyrong0905@gmail.com>	2026-04-17 08:55:54 +08:00
Guillaume Ballet	735bfd121a	trie/bintrie: spec change, big endian hashing of slot key (#34670) The spec has been changed during SIC #49, the offset is encoded as a big-endian number.	2026-04-13 09:42:37 +02:00
CPerezz	deda47f6a1	trie/bintrie: fix GetAccount/GetStorage non-membership — verify stem before returning values (#34690) Fix `GetAccount` returning wrong account data for non-existent addresses when the trie root is a `StemNode` (single-account trie) — the `StemNode` branch returned `r.Values` without verifying the queried address's stem matches. Co-authored-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>	2026-04-10 19:43:48 +02:00
CPerezz	f71a884e37	trie/bintrie: fix DeleteAccount no-op (#34676) `BinaryTrie.DeleteAccount` was a no-op, silently ignoring the caller's deletion request and leaving the old `BasicData` and `CodeHash` in the trie. Co-authored-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>	2026-04-10 19:23:44 +02:00
Guillaume Ballet	305cd7b9eb	trie/bintrie: fix NodeIterator Empty node handling and expose tree accessors (#34056) Fix three issues in the binary trie NodeIterator: 1. Empty nodes now properly backtrack to parent and continue iteration instead of terminating the entire walk early. 2. `HashedNode` resolver handles `nil` data (all-zeros hash) gracefully by treating it as Empty rather than panicking. 3. Parent update after node resolution guards against stack underflow when resolving the root node itself. --------- Co-authored-by: tellabg <249254436+tellabg@users.noreply.github.com>	2026-03-20 13:53:14 -04:00
CPerezz	6138a11c39	trie/bintrie: parallelize InternalNode.Hash at shallow tree depths (#34032) Summary: At tree depths below `log2(NumCPU)` (clamped to [2, 8]), hash the left subtree in a goroutine while hashing the right subtree inline. This exploits available CPU cores for the top levels of the tree where subtree hashing is most expensive. On single-core machines, the parallel path is disabled entirely. Deeper nodes use sequential hashing with the existing `sync.Pool` hasher where goroutine overhead would exceed the hash computation cost. The parallel path uses `sha256.Sum256` with a stack-allocated buffer to avoid pool contention across goroutines. Safety: (1) Left/right subtrees are disjoint: no shared mutable state; (2) `sync.WaitGroup` provides happens-before guarantee for the result; (3) `defer wg.Done()` + `recover()` prevents goroutine panics from crashing the process; (4) `!bt.mustRecompute` early return means clean nodes never enter the parallel path; (5) Hash results are deterministic regardless of computation order: no consensus risk. Benchmark (AMD EPYC 48-core, 500K entries, `--benchtime=10s --count=3`, post-H01 baseline), baseline vs. parallel: Approve (Mgas/s): 224.5 ± 7.1 vs. 259.6 ± 2.4 (+15.6%); BalanceOf (Mgas/s): 982.9 ± 5.1 vs. 954.3 ± 10.8 (-2.9%, noise: clean nodes skip the parallel path); Allocs/op (approve): ~810K vs. ~700K (-13.6%).	2026-03-18 13:54:23 +01:00
Guillaume Ballet	1c9ddee16f	trie/bintrie: use a sync.Pool when hashing binary tree nodes (#33989) Binary tree hashing is quite slow, owing to many factors. One of them is the GC pressure that is the consequence of allocating many hashers, as a binary tree has 4x the size of an MPT. This PR introduces an optimization that already exists for the MPT: keep a pool of hashers, in order to reduce the amount of allocations.	2026-03-12 10:20:12 +01:00
Guillaume Ballet	3f1871524f	trie/bintrie: cache hashes of clean nodes so as not to rehash the whole tree (#33961) This is an optimization that existed for verkle and the MPT, but that got dropped during the rebase. Mark the nodes that were modified as needing recomputation, and skip the hash computation if this is not needed. Otherwise, the whole tree is hashed, which kills performance.	2026-03-06 18:06:24 +01:00
Guillaume Ballet	a0fb8102fe	trie/bintrie: fix overflow management in slot key computation (#33951) The computation of `MAIN_STORAGE_OFFSET` was incorrect, causing the last byte of the stem to be dropped. This means that there would be a collision in the hash computation (at the preimage level, not a hash collision of course) if two keys were only differing at byte 31.	2026-03-05 14:43:31 +01:00
Guillaume Ballet	95c6b05806	trie/bintrie: fix endianness in code chunk key computation (#33900) The endianness was wrong, which means that the code chunks were stored in the wrong location in the tree.	2026-02-27 11:35:13 +01:00
phrwlk	30656d714e	trie/bintrie: use correct key mapping in GetStorage and DeleteStorage (#33807) GetStorage and DeleteStorage used GetBinaryTreeKey to compute the tree key, while UpdateStorage used GetBinaryTreeKeyStorageSlot. The latter applies storage slot remapping (header offset for slots <64, main storage prefix for the rest), so reads and deletes were targeting different tree locations than writes. Replace GetBinaryTreeKey with GetBinaryTreeKeyStorageSlot in both GetStorage and DeleteStorage to match UpdateStorage. Add a regression test that verifies the write→read→delete→read round-trip for main storage slots.	2026-02-11 11:42:17 +01:00
Guillaume Ballet	19f37003fb	trie/bintrie: fix debug_executionWitness for binary tree (#33739) The `Witness` method was not implemented for the binary tree, which caused `debug_executionWitness` to panic. This PR fixes that. Note that the `TransitionTrie` version isn't implemented, and that's on purpose: more thought must be given to what should go in the global witness.	2026-02-03 12:19:40 +01:00
Ng Wei Han	3d05284928	trie/bintrie: fix tree key hashing to match spec (#33694) Based on [EIP-7864](https://eips.ethereum.org/EIPS/eip-7864), the tree index should be 32 bytes instead of 31 bytes.
```
def get_tree_key(address: Address32, tree_index: int, sub_index: int):
    # Assumes STEM_SUBTREE_WIDTH = 256
    return tree_hash(address + tree_index.to_bytes(32, "little"))[:31] + bytes([sub_index])
```
	2026-01-28 11:51:02 +01:00
Guillaume Ballet	3f641dba87	trie, go.mod: remove all references to go-verkle and go-ipa (#33461) In order to reduce the amount of code that is embedded into the keeper binary, I am removing all the verkle code that uses go-verkle and go-ipa. This will be followed by further PRs that are more like stubs to replace code when the keeper build is detected. I'm keeping the binary tree of course. This means that you will still see `isVerkle` variables all over the codebase, but they will be renamed when code is touched (i.e. this is not an invitation for 30+ AI slop PRs). --------- Co-authored-by: Gary Rong <garyrong0905@gmail.com>	2025-12-30 20:44:04 +08:00
Guillaume Ballet	2a2f106a01	cmd/evm/internal/t8ntool, trie: support for verkle-at-genesis, use UBT, and move the transition tree to its own package (#32445) This is broken off of #31730 to only focus on testing networks that start with verkle at genesis. The PR has seen a lot of work since its creation, and it now targets creating and re-executing tests for a binary tree testnet without the transition (so it starts at genesis). The transition tree has been moved to its own package. It also replaces verkle with the binary tree for this specific application. --------- Co-authored-by: Gary Rong <garyrong0905@gmail.com>	2025-11-14 15:25:30 +01:00
Guillaume Ballet	bd4b17907f	trie/bintrie: add eip7864 binary trees and run its tests (#32365) Implement the binary tree as specified in [eip-7864](https://eips.ethereum.org/EIPS/eip-7864). This will gradually replace verkle trees in the codebase. This is only running the tests and will not be executed in production, but will help me rebase some of my work, so that it doesn't bitrot as much. --------- Signed-off-by: Guillaume Ballet Co-authored-by: Parithosh Jayanthi <parithosh.jayanthi@ethereum.org> Co-authored-by: rjl493456442 <garyrong0905@gmail.com>	2025-09-01 21:06:51 +08:00
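
The distinction that fad11d5795 and f57dd20461 draw between a stale hash (mustRecompute) and a stale on-disk blob (dirty) can be sketched in a few lines of Go. This is a minimal illustration, not the go-ethereum implementation: the `node` type and the `newNode`, `modify`, `buildDemo`, and `commit` helpers are hypothetical names, and only the flag semantics are taken from the commit messages.

```go
package main

import "fmt"

// node is a hypothetical stand-in for InternalNode/StemNode.
type node struct {
	left, right   *node
	mustRecompute bool // the cached hash is stale
	dirty         bool // the on-disk blob is stale
}

// newNode sets both flags: a fresh node has neither a hash nor a blob on disk.
func newNode(left, right *node) *node {
	return &node{left: left, right: right, mustRecompute: true, dirty: true}
}

// hash clears mustRecompute only; the node is still unpersisted afterwards.
func (n *node) hash() {
	if n == nil || !n.mustRecompute {
		return
	}
	n.left.hash()
	n.right.hash()
	n.mustRecompute = false
}

// collectNodes flushes dirty nodes children-first, skips clean subtrees,
// and returns how many nodes were passed to flush.
func (n *node) collectNodes(flush func(*node)) int {
	if n == nil || !n.dirty {
		return 0
	}
	count := n.left.collectNodes(flush) + n.right.collectNodes(flush)
	flush(n)
	n.dirty = false // cleared only after the node has been flushed
	return count + 1
}

// modify marks every node on a root-to-leaf path, preserving the invariant
// that an ancestor of a dirty node is itself dirty.
func modify(path ...*node) {
	for _, n := range path {
		n.mustRecompute = true
		n.dirty = true
	}
}

// buildDemo returns a three-node tree and one of its leaves.
func buildDemo() (root, leaf *node) {
	leaf = newNode(nil, nil)
	sibling := newNode(nil, nil)
	root = newNode(leaf, sibling)
	return root, leaf
}

// commit hashes the tree, then flushes whatever is dirty.
func commit(root *node) int {
	root.hash()
	return root.collectNodes(func(*node) {})
}

func main() {
	root, leaf := buildDemo()
	fmt.Println(commit(root)) // 3: everything is new
	fmt.Println(commit(root)) // 0: nothing changed since the last commit
	modify(root, leaf)
	fmt.Println(commit(root)) // 2: only the modified path is rewritten
}
```

In this sketch, a node constructed with `dirty` left at its zero value would be skipped by `collectNodes` even though its parent is flushed, which is the situation f57dd20461 describes for stems created via Empty.Insert.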

18 commits
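
The shallow-depth parallel hashing described in 6138a11c39 might look roughly like the following. This is a simplified sketch under stated assumptions: the `hnode` type, the `build` helper, and the fixed `parallelDepth` cutoff are hypothetical (the commit derives its cutoff from `log2(NumCPU)`), the `recover()` guard is omitted, and the way child hashes are combined is a plain SHA-256 of their concatenation rather than the EIP-7864 layout.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
)

// hnode is a hypothetical binary tree node; value is set on leaves only.
type hnode struct {
	left, right *hnode
	value       []byte
}

// parallelDepth is an assumed fixed cutoff for this sketch.
const parallelDepth = 2

// hash returns the node hash. When parallel is true and the node sits above
// the cutoff, the left subtree is hashed in a goroutine while the right
// subtree is hashed inline.
func (n *hnode) hash(depth int, parallel bool) [32]byte {
	if n == nil {
		return [32]byte{}
	}
	if n.left == nil && n.right == nil {
		return sha256.Sum256(n.value)
	}
	var l, r [32]byte
	if parallel && depth < parallelDepth {
		var wg sync.WaitGroup
		wg.Add(1)
		go func() {
			defer wg.Done()
			l = n.left.hash(depth+1, parallel)
		}()
		r = n.right.hash(depth+1, parallel)
		wg.Wait() // happens-before edge: l is safe to read after this
	} else {
		l = n.left.hash(depth+1, parallel)
		r = n.right.hash(depth+1, parallel)
	}
	// Stack-allocated buffer, so no hasher pool is shared across goroutines.
	var buf [64]byte
	copy(buf[:32], l[:])
	copy(buf[32:], r[:])
	return sha256.Sum256(buf[:])
}

// build creates a full tree with the given number of internal levels,
// giving each leaf a distinct one-byte value.
func build(levels int, next *byte) *hnode {
	if levels == 0 {
		*next++
		return &hnode{value: []byte{*next}}
	}
	return &hnode{left: build(levels-1, next), right: build(levels-1, next)}
}

func main() {
	var counter byte
	root := build(4, &counter)
	seq := root.hash(0, false)
	par := root.hash(0, true)
	fmt.Println(seq == par) // true: the result does not depend on scheduling
}
```

Because the two subtrees share no mutable state and the parent combines their results only after `wg.Wait()`, the parallel and sequential paths produce the same root hash.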