go-ethereum

mirror of https://github.com/ethereum/go-ethereum.git synced 2026-05-09 17:46:37 +00:00

CPerezz 61bfacc52f trie/bintrie: skip clean nodes in CollectNodes to reduce commit write amplification (#34754)

## Problem

`BinaryTrie.Commit` unconditionally walked every resolved in-memory node and flushed it into the `NodeSet`, producing one Pebble write per resolved internal + stem node on every block, even when the node's on-disk blob was bitwise identical to the previous commit. On a warm 400M-state workload this meant tens of thousands of redundant 65-byte writes per block, compounding Pebble compaction pressure on every commit.

The existing `mustRecompute` flag tracks hash staleness, not disk-blob staleness: after `Hash()` completes, `mustRecompute` is cleared even though the fresh blob has not been persisted. It is therefore insufficient for a skip-flush optimization.

## Fix

Mirror the MPT committer pattern (`trie/committer.go:51-56`) by adding a `dirty` flag on `InternalNode` and `StemNode` with the semantics "the on-disk blob is stale". The flag is:

- set to `true` wherever the node is created or structurally modified (the same call sites that already set `mustRecompute = true`);
- set to `false` only after the node has been passed to the `flushfn` inside `CollectNodes`;
- left `false` on nodes produced by `DeserializeNodeWithHash`, matching the "loaded from disk, already persisted" semantics.

`CollectNodes` short-circuits on `!dirty` subtrees. The propagation invariant (an ancestor of any dirty node is itself dirty) is already maintained by the existing `InsertValuesAtStem` / `Insert` paths, which now mirror every `mustRecompute = true` setter with a `dirty = true` setter.
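The dirty-flag scheme described above can be illustrated with a minimal sketch. This is not the code from the PR: the `node` type, `buildClean`, `update`, and this `collectNodes` signature are hypothetical stand-ins, assuming a fixed-depth binary tree in place of the real `InternalNode` / `StemNode` types. It only shows the idea that a walk which skips `!dirty` subtrees touches a root-to-leaf path rather than the whole tree.

```go
package main

import "fmt"

// node is a hypothetical stand-in for bintrie's InternalNode/StemNode.
// Only the dirty flag follows the commit message; the rest is illustrative.
type node struct {
	left, right *node
	value       byte
	dirty       bool // true means "the on-disk blob is stale"
}

// buildClean returns a complete tree of the given depth with every node
// clean, as if it had just been loaded from disk.
func buildClean(depth int) *node {
	n := &node{}
	if depth > 0 {
		n.left = buildClean(depth - 1)
		n.right = buildClean(depth - 1)
	}
	return n
}

// update follows path (false = left, true = right) and marks every node it
// visits dirty, so any ancestor of a dirty node is itself dirty.
// Assumption: len(path) does not exceed the depth of the tree.
func (n *node) update(path []bool, v byte) {
	n.dirty = true
	if len(path) == 0 {
		n.value = v
		return
	}
	child := n.left
	if path[0] {
		child = n.right
	}
	child.update(path[1:], v)
}

// collectNodes passes each dirty node to flush (children first), clears its
// dirty flag afterwards, and returns how many nodes were flushed.
// Clean subtrees are skipped without being visited.
func collectNodes(n *node, flush func(*node)) int {
	if n == nil || !n.dirty {
		return 0
	}
	count := collectNodes(n.left, flush) + collectNodes(n.right, flush)
	flush(n)
	n.dirty = false
	return count + 1
}

func main() {
	root := buildClean(10) // 2047 nodes, all clean
	noop := func(*node) {}

	root.update(make([]bool, 10), 1)      // modify the leftmost leaf
	fmt.Println(collectNodes(root, noop)) // 11: the root-to-leaf path only
	fmt.Println(collectNodes(root, noop)) // 0: nothing is dirty any more
}
```

In this toy model a single-leaf change flushes 11 of 2047 nodes, and a second collection with no intervening change flushes none, which is the behavior the commit message attributes to the short-circuit on `!dirty` subtrees.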
## Benchmark

New `BenchmarkCollectNodes_SparseWrite` measures commit cost when only one leaf changes between blocks, the common case for state updates. 10,000-stem trie, one-leaf modification + Commit per iteration, Apple M4 Pro:

| | before | after | delta |
|---|---|---|---|
| time / op | 12,653,000 ns | 7,336 ns | ~1,725× |
| bytes / op | 107,224,740 B | 37,774 B | ~2,839× |
| allocs / op | 80,953 | 134 | ~604× |

End-to-end impact on a real workload depends on the resolved-footprint-to-dirty-path ratio; the new `TestBinaryTrieCommitIncremental` provides a structural regression guard (asserts that a Commit following a single-leaf modification flushes a root-to-leaf path, not the whole tree).

---

Found all of this stuff while bloating my #34706 DB to make some benchmarks. And saw we were spending A LOT OF TIME on hashing. Hope this helps the perf a bit. Will rebase the flat-state PR on top of this once merged.

2026-04-18 11:42:58 +02:00
..
binary_node.go	trie/bintrie: skip clean nodes in CollectNodes to reduce commit write amplification (#34754)	2026-04-18 11:42:58 +02:00
binary_node_test.go	cmd/evm/internal/t8ntool, trie: support for verkle-at-genesis, use UBT, and move the transition tree to its own package (#32445)	2025-11-14 15:25:30 +01:00
empty.go	trie/bintrie: skip clean nodes in CollectNodes to reduce commit write amplification (#34754)	2026-04-18 11:42:58 +02:00
empty_test.go	trie/bintrie: skip clean nodes in CollectNodes to reduce commit write amplification (#34754)	2026-04-18 11:42:58 +02:00
hashed_node.go	trie/bintrie: cache hashes of clean nodes so as not to rehash the whole tree (#33961)	2026-03-06 18:06:24 +01:00
hashed_node_test.go	cmd/evm/internal/t8ntool, trie: support for verkle-at-genesis, use UBT, and move the transition tree to its own package (#32445)	2025-11-14 15:25:30 +01:00
hasher.go	trie/bintrie: use a sync.Pool when hashing binary tree nodes (#33989)	2026-03-12 10:20:12 +01:00
internal_node.go	trie/bintrie: skip clean nodes in CollectNodes to reduce commit write amplification (#34754)	2026-04-18 11:42:58 +02:00
internal_node_test.go	trie/bintrie: skip clean nodes in CollectNodes to reduce commit write amplification (#34754)	2026-04-18 11:42:58 +02:00
iterator.go	trie/bintrie: fix NodeIterator Empty node handling and expose tree accessors (#34056)	2026-03-20 13:53:14 -04:00
iterator_test.go	trie/bintrie: fix NodeIterator Empty node handling and expose tree accessors (#34056)	2026-03-20 13:53:14 -04:00
key_encoding.go	trie/bintrie: spec change, big endian hashing of slot key (#34670)	2026-04-13 09:42:37 +02:00
stem_node.go	trie/bintrie: skip clean nodes in CollectNodes to reduce commit write amplification (#34754)	2026-04-18 11:42:58 +02:00
stem_node_test.go	trie/bintrie: skip clean nodes in CollectNodes to reduce commit write amplification (#34754)	2026-04-18 11:42:58 +02:00
trie.go	cmd, core, trie, triedb: split CachingDB into merkle + binary dbs. (#34700)	2026-04-17 08:55:54 +08:00
trie_test.go	trie/bintrie: skip clean nodes in CollectNodes to reduce commit write amplification (#34754)	2026-04-18 11:42:58 +02:00