go-ethereum

mirror of https://github.com/ethereum/go-ethereum.git synced 2026-06-13 18:31:35 +00:00

Author	SHA1	Message	Date
CPerezz	2d44d8a4b6	trie/bintrie: unexport package-internal arena identifiers

Gballet asked on PR #34055 to unexport nodeRef, nodeKind, and makeRef (comments 3099846639, 3099847640, 3100717855); none are used outside trie/bintrie. Cascade to the internal-only support symbols and methods:

- NodeKind → nodeKind
- KindEmpty/... → kindEmpty/...
- NodeRef → nodeRef
- EmptyRef → emptyRef
- MakeRef → makeRef
- NodeStore.Root → deleted; inlined to s.root field access (same pkg)
- NodeStore.SetRoot → deleted; inlined to s.root = ref
- NodeStore.ComputeHash/SerializeNode/DeserializeNode(WithHash)/CollectNodes/ToDot/GetHeight → lowercased

All 9 method signatures took or returned nodeRef, so leaving them exported would have tripped revive:unexported-return after the type rename. Zero external callers means no API break.

The private deserializeNode helper was renamed to decodeNode to free the name for the formerly public DeserializeNode function, which is now the private deserializeNode.

Pure rename; no behaviour change.
	2026-04-18 18:49:04 +02:00
CPerezz	939b36345f	trie/bintrie: port dirty flag + CollectNodes skip-clean from master

Master added (via PR #34754) a dirty bool to InternalNode/StemNode plus a CollectNodes short-circuit that skips clean subtrees; the arena branch diverged before that landed. Port the semantics onto the arena shape:

- Add dirty bool to InternalNode and StemNode.
- Wire dirty=true alongside every existing mustRecompute=true setter in node_store.go (newInternalRef, newStemRef) and store_ops.go (8 mutation sites across InsertSingle/insertSingleInternal/InsertValuesAtStem/insertValuesAtStem/splitStemInsert/splitStemValuesInsert).
- Add 'if !node.dirty { return nil }' gate at the top of CollectNodes for both KindInternal and KindStem; clear dirty after flushfn runs.
- Plumb a dirty parameter through deserializeNode; DeserializeNode passes dirty=true (safe default), DeserializeNodeWithHash passes dirty=false (loaded from disk, blob matches).

The arena test in trie_test.go that was auto-merged from master used master-shape struct literals (tr.root, NewBinaryNode) that don't exist on arena; delete those and replace with TestCommitSkipCleanSubtrees, an arena-native version that asserts first-Commit flushes all nodes, no-op Commit flushes none, and single-leaf Commit flushes only the root-to-leaf path.
	2026-04-18 18:45:12 +02:00
CPerezz	b3b86a873a	trie/bintrie: merge makeKeyPath into keyToPath Drop the panic-on-error variant. All callers are inside methods that already propagate errors, so the error-returning form is the right default.	2026-04-18 18:38:37 +02:00
CPerezz	b4a7118d06	trie/bintrie: trim verbose doc comments to essentials	2026-04-18 18:38:37 +02:00
CPerezz	8a5e777fde	trie/bintrie: replace BinaryNode interface with GC-free NodeRef arena

Replace the BinaryNode interface (which uses Go interface pointers that the GC must scan) with NodeRef uint32 indices into typed arena pools. NodeRef packs a 2-bit kind tag and 30-bit pool index into a single uint32, making it invisible to the garbage collector.

NodeStore manages chunked typed pools per node kind:

- InternalNode pool: ZERO Go pointers (children are NodeRef, hash is [32]byte) → allocated in noscan spans, GC skips entirely
- HashedNode pool: ZERO Go pointers → noscan spans
- StemNode pool: ONE pointer per node (valueData []byte) → minimal GC

For a trie with 25K InternalNodes, this reduces GC-scanned pointer-words from ~125K to ~10K (85% reduction). CPU profiling showed 44% of time in GC; this refactor directly addresses that bottleneck.

Serialization format is unchanged: the on-disk representation is fully compatible. All existing tests pass.
	2026-04-18 18:38:15 +02:00
CPerezz	61bfacc52f	trie/bintrie: skip clean nodes in CollectNodes to reduce commit write amplification (#34754)

## Problem

`BinaryTrie.Commit` unconditionally walked every resolved in-memory node and flushed it into the `NodeSet`, producing one Pebble write per resolved internal + stem node on every block, even when the node's on-disk blob was bitwise identical to the previous commit. On a warm 400M-state workload this meant tens of thousands of redundant 65-byte writes per block, compounding Pebble compaction pressure on every commit.

The existing `mustRecompute` flag tracks hash staleness, not disk-blob staleness: after `Hash()` completes, `mustRecompute` is cleared even though the fresh blob has not been persisted. It is therefore insufficient for a skip-flush optimization.

## Fix

Mirror the MPT committer pattern (`trie/committer.go:51-56`) by adding a `dirty` flag on `InternalNode` and `StemNode` with the semantics "the on-disk blob is stale". The flag is:

- set to `true` wherever the node is created or structurally modified (the same call sites that already set `mustRecompute = true`);
- set to `false` only after the node has been passed to the `flushfn` inside `CollectNodes`;
- left `false` on nodes produced by `DeserializeNodeWithHash`, matching the "loaded from disk, already persisted" semantics.

`CollectNodes` short-circuits on `!dirty` subtrees. The propagation invariant (an ancestor of any dirty node is itself dirty) is already maintained by the existing `InsertValuesAtStem` / `Insert` paths, which now mirror every `mustRecompute = true` setter with a `dirty = true` setter.

## Benchmark

New `BenchmarkCollectNodes_SparseWrite` measures commit cost when only one leaf changes between blocks, the common case for state updates. 10,000-stem trie, one-leaf modification + Commit per iteration, Apple M4 Pro:

| | before | after | delta |
|---|---|---|---|
| time / op | 12,653,000 ns | 7,336 ns | ~1,725× |
| bytes / op | 107,224,740 B | 37,774 B | ~2,839× |
| allocs / op | 80,953 | 134 | ~604× |

End-to-end impact on a real workload depends on the resolved-footprint-to-dirty-path ratio; the new `TestBinaryTrieCommitIncremental` provides a structural regression guard (asserts that a Commit following a single-leaf modification flushes a root-to-leaf path, not the whole tree).

---

Found all of this stuff while bloating my #34706 DB to make some benchmarks. And saw we were spending A LOT OF TIME on hashing. Hope this helps the perf a bit. Will rebase the flat-state PR on top of this once merged.
	2026-04-18 11:42:58 +02:00
CPerezz	6138a11c39	trie/bintrie: parallelize InternalNode.Hash at shallow tree depths (#34032)

## Summary

At tree depths below `log2(NumCPU)` (clamped to [2, 8]), hash the left subtree in a goroutine while hashing the right subtree inline. This exploits available CPU cores for the top levels of the tree where subtree hashing is most expensive. On single-core machines, the parallel path is disabled entirely.

Deeper nodes use sequential hashing with the existing `sync.Pool` hasher where goroutine overhead would exceed the hash computation cost. The parallel path uses `sha256.Sum256` with a stack-allocated buffer to avoid pool contention across goroutines.

Safety:

- Left/right subtrees are disjoint: no shared mutable state
- `sync.WaitGroup` provides happens-before guarantee for the result
- `defer wg.Done()` + `recover()` prevents goroutine panics from crashing the process
- `!bt.mustRecompute` early return means clean nodes never enter the parallel path
- Hash results are deterministic regardless of computation order: no consensus risk

## Benchmark

AMD EPYC 48-core, 500K entries, `--benchtime=10s --count=3`, post-H01 baseline:

| Metric | Baseline | Parallel | Delta |
|--------|----------|----------|-------|
| Approve (Mgas/s) | 224.5 ± 7.1 | 259.6 ± 2.4 | +15.6% |
| BalanceOf (Mgas/s) | 982.9 ± 5.1 | 954.3 ± 10.8 | -2.9% (noise, clean nodes skip parallel path) |
| Allocs/op (approve) | ~810K | ~700K | -13.6% |
	2026-03-18 13:54:23 +01:00
Guillaume Ballet	1c9ddee16f	trie/bintrie: use a sync.Pool when hashing binary tree nodes (#33989) Binary tree hashing is quite slow, owing to many factors. One of them is the GC pressure caused by allocating many hashers, since a binary tree has 4x the size of an MPT. This PR introduces an optimization that already exists for the MPT: keep a pool of hashers to reduce the number of allocations.	2026-03-12 10:20:12 +01:00
Guillaume Ballet	3f1871524f	trie/bintrie: cache hashes of clean nodes so as not to rehash the whole tree (#33961) This is an optimization that existed for verkle and the MPT, but that got dropped during the rebase. Mark the nodes that were modified as needing recomputation, and skip the hash computation for nodes that are not marked. Otherwise, the whole tree is hashed, which kills performance.	2026-03-06 18:06:24 +01:00
Guillaume Ballet	2a2f106a01	cmd/evm/internal/t8ntool, trie: support for verkle-at-genesis, use UBT, and move the transition tree to its own package (#32445) This is broken off of #31730 to only focus on testing networks that start with verkle at genesis. The PR has seen a lot of work since its creation, and it now targets creating and re-executing tests for a binary tree testnet without the transition (so it starts at genesis). The transition tree has been moved to its own package. It also replaces verkle with the binary tree for this specific application. --------- Co-authored-by: Gary Rong <garyrong0905@gmail.com>	2025-11-14 15:25:30 +01:00
Guillaume Ballet	bd4b17907f	trie/bintrie: add eip7864 binary trees and run its tests (#32365) Implement the binary tree as specified in [eip-7864](https://eips.ethereum.org/EIPS/eip-7864). This will gradually replace verkle trees in the codebase. This is only running the tests and will not be executed in production, but will help me rebase some of my work, so that it doesn't bitrot as much. --------- Signed-off-by: Guillaume Ballet Co-authored-by: Parithosh Jayanthi <parithosh.jayanthi@ethereum.org> Co-authored-by: rjl493456442 <garyrong0905@gmail.com>	2025-09-01 21:06:51 +08:00
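
Commit 8a5e777fde describes NodeRef as a uint32 that packs a 2-bit kind tag and a 30-bit pool index. The following is a minimal sketch of such a packing, not the code from trie/bintrie. The bit layout (tag in the top two bits), the numeric values of the kinds, and the `kind`/`index` accessor names are assumptions made for illustration; only the names nodeRef, nodeKind, makeRef, and the kind constants come from the log above.

```go
package main

import "fmt"

// nodeKind is a 2-bit tag. The set of kinds and their numeric values
// are assumptions for this sketch.
type nodeKind uint8

const (
	kindEmpty nodeKind = iota
	kindInternal
	kindStem
	kindHashed
)

// nodeRef is a plain integer, so a slice of nodeRef contains no Go
// pointers for the garbage collector to scan.
type nodeRef uint32

const (
	indexBits = 30                // low 30 bits: index into the per-kind pool
	indexMask = 1<<indexBits - 1  // 0x3FFFFFFF
)

// makeRef packs kind into the top 2 bits and index into the low 30 bits
// (assumed layout). Index bits above bit 29 are discarded.
func makeRef(kind nodeKind, index uint32) nodeRef {
	return nodeRef(uint32(kind)<<indexBits | index&indexMask)
}

// kind extracts the 2-bit tag (hypothetical accessor name).
func (r nodeRef) kind() nodeKind { return nodeKind(r >> indexBits) }

// index extracts the 30-bit pool index (hypothetical accessor name).
func (r nodeRef) index() uint32 { return uint32(r) & indexMask }

func main() {
	r := makeRef(kindStem, 12345)
	fmt.Println(r.kind() == kindStem, r.index())
}
```

With 30 index bits, each per-kind pool in this sketch can address up to 2^30 nodes.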

11 commits