Commit graph

13 commits

Author SHA1 Message Date
CPerezz
f676f04706
trie/bintrie: move dirty+mustRecompute flip into setValue
The invariant "mutating a value slot must mark the stem for re-hash
and re-flush" was enforced by every caller remembering to set both
flags after setValue. Moving the flip into setValue itself makes it
structurally impossible to forget, and drops the duplicate flag-sets
at each callsite.

decodeNode's on-disk load path still writes directly to sn.values
because loaded stems must retain whatever mustRecompute/dirty state
the caller asked for (typically both false).
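
A minimal sketch of the pattern described above, assuming a simplified stand-in type: `stemNode`, `loadValue`, and the field layout are hypothetical and only illustrate why owning the flag flip inside `setValue` removes the need for callers to remember it.

```go
package main

import "fmt"

// stemNodeWidth mirrors the 256-slot width mentioned in this log. The type
// below is a hypothetical stand-in, not the real StemNode.
const stemNodeWidth = 256

type stemNode struct {
	values        [stemNodeWidth][]byte
	mustRecompute bool // cached hash is stale
	dirty         bool // on-disk blob is stale
}

// setValue is the single mutation entry point, so it owns the flag flip.
func (sn *stemNode) setValue(suffix byte, v []byte) {
	sn.values[suffix] = v
	sn.mustRecompute = true
	sn.dirty = true
}

// loadValue models the decode path (hypothetical helper): it writes the slot
// directly and leaves both flags exactly as the caller set them.
func (sn *stemNode) loadValue(suffix byte, v []byte) {
	sn.values[suffix] = v
}

func main() {
	sn := &stemNode{}
	sn.loadValue(3, []byte{0xaa})
	fmt.Println(sn.dirty, sn.mustRecompute) // false false

	sn.setValue(3, []byte{0xbb})
	fmt.Println(sn.dirty, sn.mustRecompute) // true true
}
```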
2026-04-19 22:16:50 +02:00
CPerezz
ef3217c249
trie/bintrie: keep StemNode.Hash's data array on stack
The pooled hash.Hash interface forced the local [StemNodeWidth]common.Hash
data array to escape to the heap: h.Sum(data[i][:0]) passes a subslice of
data into an interface method, so escape analysis conservatively moves the
whole array. pprof (post-rollback) showed this single allocation as 52%
of total bytes (5 GB over BenchmarkCollectNodesSparseWrite).

Switch to sha256.Sum256 (takes []byte, returns [32]byte by value) — no
slice into data ever leaves the frame, so data stays on stack. Also
drops per-Hash h.Sum(nil) allocs and the sync.Pool Get/Put round-trip
for stems.
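
The two call shapes can be compared side by side. This is a sketch, not the code from the commit: `hashPairInterface` and `hashPairValue` are hypothetical names, and whether a given buffer actually escapes depends on the compiler version, which `go build -gcflags=-m` can report.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// hashPairInterface hashes left||right through the hash.Hash interface.
// out[:0] is handed to an interface method, which is the call shape the
// commit message identifies as forcing the buffer to the heap.
func hashPairInterface(left, right [32]byte) [32]byte {
	h := sha256.New()
	h.Write(left[:])
	h.Write(right[:])
	var out [32]byte
	h.Sum(out[:0]) // appends the digest in place; cap(out[:0]) is 32
	return out
}

// hashPairValue computes the same digest with sha256.Sum256, which takes a
// []byte and returns the [32]byte digest by value.
func hashPairValue(left, right [32]byte) [32]byte {
	var buf [64]byte
	copy(buf[:32], left[:])
	copy(buf[32:], right[:])
	return sha256.Sum256(buf[:])
}

func main() {
	var l, r [32]byte
	l[0], r[31] = 1, 2
	fmt.Println(hashPairInterface(l, r) == hashPairValue(l, r)) // true
}
```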

Benchmark delta (M4 Pro, go1.24.0, --count=5 --benchtime=5s):

  before: 9095 ns/op  15008 B/op  106 allocs/op
  after:  9133 ns/op   6526 B/op   95 allocs/op

  vs upstream/master@53ff723cc: bytes/op -82.7% (was -60%),
  allocs/op -29.1% (was -20.9%).
2026-04-19 08:06:18 +02:00
CPerezz
e1859ea864
trie/bintrie: simplify StemNode to array-of-slices representation
Gballet asked on PR #34055 (comments 3100043116, 3100050542, and the
bit-check dedup at 3100114416 / 3100878310) to revert StemNode from the
packed-bytes representation to the straightforward array-of-slices.

Before: StemNode carried a bitmap, a concatenated valueData []byte, a
count, and a shared COW flag. Every read/write went through a bit-count
posInData lookup; every mutation through ensureWritable COW.

After: values [StemNodeWidth][]byte — 256 slots, nil == absent. No
bitmap lookup, no COW. Direct sn.values[suffix] access.

Supporting changes:
- Drop posInData, ensureWritable; rewrite getValue/hasValue/allValues/
  setValue as trivial slice access.
- Hash() iterates sn.values directly, matching master's shape.
- SerializeNode emits the bitmap + concatenated bytes on the wire from
  the array-of-slices at serialize time; wire format unchanged.
- decodeNode populates sn.values[i] slots by aliasing the serialized
  buffer (zero-copy).
- NodeStore.Copy deep-copies each slot.
- splitStemValuesInsert + the insertSingleInternal paths write directly
  to sn.values[i].
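
The relationship between the in-memory array-of-slices and the bitmap-plus-concatenated-bytes wire form can be sketched as follows. The layout here is an assumption for illustration only (a 32-byte bitmap with most-significant-bit-first ordering, fixed 32-byte values, no node-type byte or stem prefix); `encode` and `decode` are hypothetical names, not the real SerializeNode/decodeNode.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	width     = 256 // slots per stem
	valueSize = 32  // assumed fixed value length
)

// encode emits a bitmap of present slots followed by the concatenated values.
// Every non-nil slot is assumed to hold exactly valueSize bytes.
func encode(values *[width][]byte) []byte {
	out := make([]byte, width/8, width/8+valueSize)
	for i, v := range values {
		if v == nil {
			continue // nil means "absent"
		}
		out[i/8] |= 1 << (7 - uint(i%8))
		out = append(out, v...)
	}
	return out
}

// decode rebuilds the slots. Each slot aliases buf instead of copying it, so
// the caller must not modify buf afterwards.
func decode(buf []byte) (*[width][]byte, error) {
	if len(buf) < width/8 {
		return nil, errors.New("short buffer")
	}
	var values [width][]byte
	bitmap, data := buf[:width/8], buf[width/8:]
	for i := 0; i < width; i++ {
		if bitmap[i/8]&(1<<(7-uint(i%8))) == 0 {
			continue
		}
		if len(data) < valueSize {
			return nil, errors.New("truncated value")
		}
		values[i] = data[:valueSize:valueSize]
		data = data[valueSize:]
	}
	return &values, nil
}

func main() {
	var in [width][]byte
	in[0] = make([]byte, valueSize)
	in[200] = make([]byte, valueSize)
	in[200][0] = 7

	blob := encode(&in)
	out, err := decode(blob)
	fmt.Println(len(blob), err, out[200][0], out[1] == nil)
}
```

Aliasing the input buffer keeps decoding copy-free, at the cost that a deep copy (as the log notes for NodeStore.Copy) is needed before any slot is shared with code that may mutate it.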

Trade-off: stems now carry 256 []byte headers (6144 B) instead of 1
concatenated slice (~32 B) + bitmap. Stem-pool scan cost returns to
parity with master (the existing valueData pointer already made the
pool non-noscan; rollback adds 255 more pointers per stem). The primary
arena win — pointer-free InternalNode pool — is preserved.
2026-04-18 18:53:07 +02:00
CPerezz
939b36345f
trie/bintrie: port dirty flag + CollectNodes skip-clean from master
Master added (via PR #34754) a dirty bool to InternalNode/StemNode plus a
CollectNodes short-circuit that skips clean subtrees — the arena branch
diverged before that landed. Port the semantics onto the arena shape:

- Add dirty bool to InternalNode and StemNode.
- Wire dirty=true alongside every existing mustRecompute=true setter in
  node_store.go (newInternalRef, newStemRef) and store_ops.go (8 mutation
  sites across InsertSingle/insertSingleInternal/InsertValuesAtStem/
  insertValuesAtStem/splitStemInsert/splitStemValuesInsert).
- Add 'if !node.dirty { return nil }' gate at the top of CollectNodes for
  both KindInternal and KindStem; clear dirty after flushfn runs.
- Plumb a dirty parameter through deserializeNode; DeserializeNode passes
  dirty=true (safe default), DeserializeNodeWithHash passes dirty=false
  (loaded from disk, blob matches).

The arena test in trie_test.go that was auto-merged from master used
master-shape struct literals (tr.root, NewBinaryNode) that don't exist on
arena; delete those and replace with TestCommitSkipCleanSubtrees, an
arena-native version that asserts first-Commit flushes all nodes, no-op
Commit flushes none, and single-leaf Commit flushes only the root-to-leaf
path.
2026-04-18 18:45:12 +02:00
CPerezz
b4a7118d06
trie/bintrie: trim verbose doc comments to essentials
2026-04-18 18:38:37 +02:00
CPerezz
efc164b850
trie/bintrie: move ensureWritable into setValue
2026-04-18 18:38:37 +02:00
CPerezz
8a5e777fde
trie/bintrie: replace BinaryNode interface with GC-free NodeRef arena
Replace the BinaryNode interface (which uses Go interface pointers that
the GC must scan) with NodeRef uint32 indices into typed arena pools.
NodeRef packs a 2-bit kind tag and 30-bit pool index into a single
uint32, making it invisible to the garbage collector.
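
The packing can be sketched as below. Only the 2-bit tag and 30-bit index split comes from the commit message; the names (`nodeRef`, `makeRef`, the `kind*` constants) and the tag values assigned to each kind are assumptions for illustration.

```go
package main

import "fmt"

type nodeKind uint32

// Tag values are assumed; the log only states that the tag is 2 bits wide.
const (
	kindEmpty nodeKind = iota
	kindInternal
	kindStem
	kindHashed
)

const (
	indexBits = 30
	indexMask = 1<<indexBits - 1
)

// nodeRef holds a 2-bit kind tag in the top bits and a 30-bit pool index in
// the low bits. It is a plain integer, so the GC has nothing to trace.
type nodeRef uint32

func makeRef(k nodeKind, idx uint32) nodeRef {
	return nodeRef(uint32(k)<<indexBits | idx&indexMask)
}

func (r nodeRef) kind() nodeKind { return nodeKind(r >> indexBits) }
func (r nodeRef) index() uint32  { return uint32(r) & indexMask }

// internalNode contains no Go pointers: children are refs, the hash is an
// array. A slice of these can live in memory the GC does not scan.
type internalNode struct {
	left, right nodeRef
	hash        [32]byte
}

func main() {
	pool := []internalNode{{}}
	pool[0].left = makeRef(kindStem, 12345)

	r := pool[0].left
	fmt.Println(r.kind() == kindStem, r.index()) // true 12345
}
```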

NodeStore manages chunked typed pools per node kind:
- InternalNode pool: ZERO Go pointers (children are NodeRef, hash is
  [32]byte) → allocated in noscan spans, GC skips entirely
- HashedNode pool: ZERO Go pointers → noscan spans
- StemNode pool: ONE pointer per node (valueData []byte) → minimal GC

For a trie with 25K InternalNodes, this reduces GC-scanned pointer-words
from ~125K to ~10K (85% reduction). CPU profiling showed 44% of time
in GC; this refactor directly addresses that bottleneck.

Serialization format is unchanged — the on-disk representation is
fully compatible. All existing tests pass.
2026-04-18 18:38:15 +02:00
CPerezz
61bfacc52f
trie/bintrie: skip clean nodes in CollectNodes to reduce commit write amplification (#34754)
## Problem

`BinaryTrie.Commit` unconditionally walked every resolved in-memory node
and flushed it into the `NodeSet`, producing one Pebble write per
resolved internal + stem node on every block — even when the node's
on-disk blob was bitwise identical to the previous commit. On a warm
400M-state workload this meant tens of thousands of redundant 65-byte
writes per block, compounding Pebble compaction pressure on every
commit.

The existing `mustRecompute` flag tracks *hash* staleness, not
*disk-blob* staleness: after `Hash()` completes, `mustRecompute` is
cleared even though the fresh blob has not been persisted. It is
therefore insufficient for a skip-flush optimization.

## Fix

Mirror the MPT committer pattern (`trie/committer.go:51-56`) by adding a
`dirty` flag on `InternalNode` and `StemNode` with the semantics *the
on-disk blob is stale*. The flag is:

- set to `true` wherever the node is created or structurally modified
(the same call sites that already set `mustRecompute = true`);
- set to `false` only after the node has been passed to the `flushfn`
inside `CollectNodes`;
- left `false` on nodes produced by `DeserializeNodeWithHash`, matching
the *loaded from disk, already persisted* semantics.

`CollectNodes` short-circuits on `!dirty` subtrees. The propagation
invariant (an ancestor of any dirty node is itself dirty) is already
maintained by the existing `InsertValuesAtStem` / `Insert` paths, which
now mirror every `mustRecompute = true` setter with a `dirty = true`
setter.
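
A minimal sketch of the skip-clean walk, assuming a simplified pointer-based tree: `node`, `collect`, and the children-before-parent flush order are illustrative choices, not the real `CollectNodes` signature.

```go
package main

import "fmt"

type node struct {
	name        string
	left, right *node
	dirty       bool // on-disk blob is stale
}

// collect flushes every dirty node and clears its flag afterwards. A clean
// node ends the walk: by the propagation invariant, nothing below it can be
// dirty.
func collect(n *node, flush func(*node)) {
	if n == nil || !n.dirty {
		return
	}
	collect(n.left, flush)
	collect(n.right, flush)
	flush(n)
	n.dirty = false
}

func main() {
	l1 := &node{name: "leaf1", dirty: true}
	l2 := &node{name: "leaf2", dirty: true}
	a := &node{name: "a", left: l1, right: l2, dirty: true}
	b := &node{name: "b", dirty: true}
	root := &node{name: "root", left: a, right: b, dirty: true}

	count := 0
	flush := func(*node) { count++ }

	collect(root, flush)
	fmt.Println(count) // 5: first commit flushes everything

	count = 0
	collect(root, flush)
	fmt.Println(count) // 0: nothing changed

	// Modify one leaf and mark the root-to-leaf path dirty.
	l1.dirty, a.dirty, root.dirty = true, true, true
	collect(root, flush)
	fmt.Println(count) // 3: only the dirty path is flushed
}
```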

## Benchmark

New `BenchmarkCollectNodes_SparseWrite` measures commit cost when only
one leaf changes between blocks — the common case for state updates.
10,000-stem trie, one-leaf modification + Commit per iteration, Apple M4
Pro:

| | before | after | delta |
|---|---|---|---|
| time / op | 12,653,000 ns | 7,336 ns | **~1,725×** |
| bytes / op | 107,224,740 B | 37,774 B | **~2,839×** |
| allocs / op | 80,953 | 134 | **~604×** |

End-to-end impact on a real workload depends on the
resolved-footprint-to-dirty-path ratio; the new
`TestBinaryTrieCommitIncremental` provides a structural regression guard
(asserts that a Commit following a single-leaf modification flushes a
root-to-leaf path, not the whole tree).

---

I found all of this while bloating my #34706 DB to run some benchmarks,
and saw that we were spending a lot of time on hashing.
I hope this helps performance a bit. I will rebase the flat-state PR on
top of this once it is merged.
2026-04-18 11:42:58 +02:00
CPerezz
deda47f6a1
trie/bintrie: fix GetAccount/GetStorage non-membership — verify stem before returning values (#34690)
Fix `GetAccount` returning **wrong account data** for non-existent
addresses when the trie root is a `StemNode` (single-account trie) — the
`StemNode` branch returned `r.Values` without verifying the queried
address's stem matches.
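
The missing check can be illustrated with a small sketch. It assumes 32-byte keys whose first 31 bytes are the stem and whose last byte selects the slot; `stemNode` and `get` are hypothetical names, not the real `GetAccount` code path.

```go
package main

import (
	"bytes"
	"fmt"
)

const stemSize = 31 // assumed: 31-byte stem plus 1-byte suffix per key

type stemNode struct {
	stem   [stemSize]byte
	values [256][]byte
}

// get returns the value for key only when the key's stem matches this node.
// Without the comparison, any key would read this node's values.
func (sn *stemNode) get(key [32]byte) []byte {
	if !bytes.Equal(sn.stem[:], key[:stemSize]) {
		return nil // different stem: the key is not in this node
	}
	return sn.values[key[stemSize]]
}

func main() {
	var sn stemNode
	sn.stem[0] = 1
	sn.values[0] = []byte{42}

	var present, absent [32]byte
	present[0] = 1
	absent[0] = 2

	fmt.Println(sn.get(present), sn.get(absent) == nil)
}
```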

Co-authored-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>
2026-04-10 19:43:48 +02:00
Guillaume Ballet
1c9ddee16f
trie/bintrie: use a sync.Pool when hashing binary tree nodes (#33989)
Binary tree hashing is quite slow, owing to many factors. One of them is
the GC pressure caused by allocating many hashers, as a
binary tree has 4x the size of an MPT. This PR introduces an
optimization that already exists for the MPT: keep a pool of hashers in
order to reduce the number of allocations.
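
A minimal sketch of a hasher pool, assuming SHA-256 and the hypothetical names `hasherPool` and `hashPair`; it shows the general `sync.Pool` pattern rather than the code in the PR.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"hash"
	"sync"
)

// hasherPool hands out reusable SHA-256 hashers so that each node hash does
// not allocate a fresh one.
var hasherPool = sync.Pool{
	New: func() any { return sha256.New() },
}

// hashPair hashes left||right with a pooled hasher.
func hashPair(left, right []byte) [32]byte {
	h := hasherPool.Get().(hash.Hash)
	defer hasherPool.Put(h)
	h.Reset() // a pooled hasher may carry state from its previous user

	h.Write(left)
	h.Write(right)
	var out [32]byte
	h.Sum(out[:0])
	return out
}

func main() {
	l, r := []byte("left"), []byte("right")
	want := sha256.Sum256(append(append([]byte{}, l...), r...))
	fmt.Println(hashPair(l, r) == want) // true
}
```

Calling `Reset` right after `Get` keeps the helper correct even if another user returned the hasher without clearing it.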
2026-03-12 10:20:12 +01:00
Guillaume Ballet
3f1871524f
trie/bintrie: cache hashes of clean nodes so as not to rehash the whole tree (#33961)
This is an optimization that existed for verkle and the MPT, but that
got dropped during the rebase.

Mark the nodes that were modified as needing recomputation, and skip the
hash computation if this is not needed. Otherwise, the whole tree is
hashed, which kills performance.
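
The caching idea can be sketched on a toy tree. Everything here is an assumption for illustration: the `node` type, the leaf and internal hashing rules, and the `hashCalls` counter are not taken from the PR.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

type node struct {
	left, right   *node
	value         []byte // used by leaves only
	hash          [32]byte
	mustRecompute bool // cached hash is stale
}

// hashCalls counts actual hash computations, to make the caching visible.
var hashCalls int

// Hash returns the cached hash unless the node was marked for recomputation.
func (n *node) Hash() [32]byte {
	if n == nil {
		return [32]byte{}
	}
	if !n.mustRecompute {
		return n.hash
	}
	hashCalls++
	if n.left == nil && n.right == nil {
		n.hash = sha256.Sum256(n.value)
	} else {
		l, r := n.left.Hash(), n.right.Hash()
		n.hash = sha256.Sum256(append(l[:], r[:]...))
	}
	n.mustRecompute = false
	return n.hash
}

func main() {
	a := &node{value: []byte("a"), mustRecompute: true}
	b := &node{value: []byte("b"), mustRecompute: true}
	root := &node{left: a, right: b, mustRecompute: true}

	root.Hash()
	fmt.Println(hashCalls) // 3: every node hashed once

	root.Hash()
	fmt.Println(hashCalls) // 3: all hashes served from cache

	a.value, a.mustRecompute, root.mustRecompute = []byte("c"), true, true
	root.Hash()
	fmt.Println(hashCalls) // 5: only the modified path is rehashed
}
```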
2026-03-06 18:06:24 +01:00
Guillaume Ballet
2a2f106a01
cmd/evm/internal/t8ntool, trie: support for verkle-at-genesis, use UBT, and move the transition tree to its own package (#32445)
This is broken off of #31730 to only focus on testing networks that
start with verkle at genesis.

The PR has seen a lot of work since its creation, and it now targets
creating and re-executing tests for a binary tree testnet without the
transition (so it starts at genesis). The transition tree has been moved
to its own package. It also replaces verkle with the binary tree for
this specific application.

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
2025-11-14 15:25:30 +01:00
Guillaume Ballet
bd4b17907f
trie/bintrie: add eip7864 binary trees and run its tests (#32365)
Implement the binary tree as specified in [eip-7864](https://eips.ethereum.org/EIPS/eip-7864). 

This will gradually replace verkle trees in the codebase. This is only 
running the tests and will not be executed in production, but will help 
me rebase some of my work, so that it doesn't bitrot as much.

---------

Signed-off-by: Guillaume Ballet
Co-authored-by: Parithosh Jayanthi <parithosh.jayanthi@ethereum.org>
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
2025-09-01 21:06:51 +08:00