Commit graph

492 commits

Author SHA1 Message Date
CPerezz
c24930bebf
trie/bintrie: unexport NodeStore, NewNodeStore, NodeFlushFn, nodeResolverFn
grep across the repo confirms zero external callers of bintrie.NodeStore,
NewNodeStore, NodeFlushFn, or NodeResolverFn. The arena is purely an
implementation detail of BinaryTrie; unexport the top-level names so
the package's external surface stays confined to BinaryTrie plus the
EIP-7864 helpers (ChunkifyCode, GetBinaryTreeKey*).

Methods on *nodeStore remain capitalized for now — with nodeStore
itself unexported, external code has no way to hold a *nodeStore
pointer, so the methods are effectively internal despite their case.
Method case is a cosmetic follow-up.
2026-04-19 22:18:43 +02:00
CPerezz
f676f04706
trie/bintrie: move dirty+mustRecompute flip into setValue
The invariant "mutating a value slot must mark the stem for re-hash
and re-flush" was enforced by every caller remembering to set both
flags after setValue. Moving the flip into setValue itself makes it
structurally impossible to forget, and drops the duplicate flag-sets
at each callsite.

decodeNode's on-disk load path still writes directly to sn.values
because loaded stems must retain whatever mustRecompute/dirty state
the caller asked for (typically both false).
2026-04-19 22:16:50 +02:00
CPerezz
d216942b7c
trie/bintrie: make IsEmpty kind-based
Comparing against emptyRef (the single value makeRef(kindEmpty, 0))
only works because no other nodeRef with kindEmpty is ever constructed.
That invariant is easy to break if future code ever produces a
kindEmpty ref with nonzero index. Test the kind directly so any
kindEmpty ref reads as empty regardless of index.
2026-04-19 22:15:48 +02:00
CPerezz
c08048eb84
trie/bintrie: panic on makeRef index overflow
Silently masking the index with indexMask would let an oversized idx
collide with emptyRef (makeRef(kindEmpty, 0)): e.g. makeRef(kindEmpty,
1<<30) returned emptyRef, which IsEmpty would accept as absent even
though the caller meant to reference a real node. Panic instead.
allocInternal/allocStem/allocHashed already panic on pool overflow,
so this is a belt-and-suspenders guard for any direct callers.
2026-04-19 22:15:38 +02:00
CPerezz
6e36636c9c
trie/bintrie: compute parallelHashDepth once at init
runtime.NumCPU() + bits.Len were recomputed on every hashInternal
call. NumCPU is immutable after startup; hoist to a package var
computed once. Also fixes a minor style wart: the constant is now a
value, not a zero-arg function.
2026-04-19 22:15:23 +02:00
CPerezz
f8c283410e
trie/bintrie: drop unused freeInternals and freeStems
Only freeHashed is written to (via freeHashedNode); the internal and
stem lists are declared, consumed in alloc paths, and copied in
NodeStore.Copy, but no callsite ever appends to them. Under current
semantics (no delete, stem-split keeps the old stem deeper in the
tree) there is no path that would free an internal or stem slot, so
the recycle branch was dead code. Drop it to avoid misleading future
contributors; the infrastructure is easy to restore if a delete path
is ever added.
2026-04-19 22:15:01 +02:00
CPerezz
99520432ec
trie/bintrie: roll back split-stem depth on error
splitStemValuesInsert increments existing.depth before recursing into
insertValuesAtStem. If the recursion fails, the depth stays incremented
but the tree is not re-rooted through the new internal, so a retry
reads bitStem at the wrong offset and can place the stem on the wrong
side of a fresh split. Roll back existing.depth on error to keep the
stem consistent across retries.
2026-04-19 22:14:31 +02:00
CPerezz
7f20e908e2
trie/bintrie: defer wg.Done in hashInternal shallow goroutine
If s.computeHash(node.left) panics inside the goroutine and a caller
higher up recovers the panic, the parent would be stuck forever in
wg.Wait() — no log, no error. defer wg.Done() releases the waiter
unconditionally so Wait returns even on panic.
2026-04-19 22:13:51 +02:00
CPerezz
0c0c68da6c
trie/bintrie: align hashInternal deep branch with shallow shape
Deep branch used a pooled hash.Hash: h.Write(lh[:]) passed a subslice
through the interface, forcing lh/rh to heap; h.Sum(nil) allocated
32 B per rehash; empty children allocated 32 B via make([]byte, HashSize).

Mirror the shallow branch: write left/right hashes into a stack
[64]byte via copy() and call sha256.Sum256 in one shot. No interface
writes, no pool round-trip, no Sum(nil), no empty-child make.

Benchmark delta (M4 Pro, go1.24.0, --count=5 --benchtime=5s):

  before: 9133 ns/op  6526 B/op  95 allocs/op
  after:  8783 ns/op  5623 B/op  67 allocs/op

  vs upstream/master@53ff723cc: allocs/op -50.0% (was -29.1%),
  bytes/op -85.1% (was -82.7%), time/op +18.7% (was +23.4%).
2026-04-19 08:12:54 +02:00
CPerezz
ef3217c249
trie/bintrie: keep StemNode.Hash's data array on stack
The pooled hash.Hash interface forced the local [StemNodeWidth]common.Hash
data array to escape to the heap: h.Sum(data[i][:0]) passes a subslice of
data into an interface method, so escape analysis conservatively moves the
whole array. pprof (post-rollback) showed this single allocation as 52%
of total bytes (5 GB over BenchmarkCollectNodesSparseWrite).

Switch to sha256.Sum256 (takes []byte, returns [32]byte by value) — no
slice into data ever leaves the frame, so data stays on stack. Also
drops per-Hash h.Sum(nil) allocs and the sync.Pool Get/Put round-trip
for stems.

Benchmark delta (M4 Pro, go1.24.0, --count=5 --benchtime=5s):

  before: 9095 ns/op  15008 B/op  106 allocs/op
  after:  9133 ns/op   6526 B/op   95 allocs/op

  vs upstream/master@53ff723cc: bytes/op -82.7% (was -60%),
  allocs/op -29.1% (was -20.9%).
2026-04-19 08:06:18 +02:00
CPerezz
50d815313e
trie/bintrie: reuse path buffer in collectNodes
Post-rollback pprof on BenchmarkCollectNodesSparseWrite revealed
collectNodes' per-descent leftPath/rightPath make+copy as 13% of
alloc_objects (~26 allocs/op). Replace with append/truncate on a
shared buffer pre-sized by Commit; flushfn consumers (NodeSet.AddNode,
tracer.Get) already clone via string(path), so in-place reuse is safe.

Benchmark delta (M4 Pro, go1.24.0, --count=5 --benchtime=5s):

  before: 9506 ns/op  15245 B/op  132 allocs/op
  after:  9095 ns/op  15008 B/op  106 allocs/op

  vs upstream/master@53ff723cc: allocs/op now -20.9% (was -1.5%).
2026-04-19 08:00:33 +02:00
CPerezz
0c92956c77
trie/bintrie: add BenchmarkCollectNodesSparseWrite for allocation tracking
Port the same-shape benchmark from master (PR #34754) to the arena. A
benchstat-able pair lets reviewers verify the arena's allocation win is
preserved after the Group B + C rollbacks, and guards against future
regressions on the sparse-write commit path.
2026-04-18 19:12:55 +02:00
CPerezz
b86e2d3e20
trie/bintrie: inline get/InsertValuesAtStem wrappers
Gballet's three empty 'suggestion' blocks (comments 3101685618,
3101734697, 3101736436) mark the unexported wrapper declarations on
getValuesAtStem and insertValuesAtStem plus one temporary-var line.
Apply:

- Inline the unexported getValuesAtStem body into GetValuesAtStem (start
  the walk at s.root directly instead of via a two-arg helper). The
  function is not self-recursive, so the wrapper was pure indirection.
- Tighten InsertValuesAtStem to two lines using the 's.root, err = ...'
  idiom — the recursive helper stays (it IS self-recursive), only the
  public entry point gets the cleanup.

Adds docstrings on both public entry points.
2026-04-18 18:59:27 +02:00
CPerezz
aa21cd1b80
trie/bintrie: remove hasParent dead code in getValuesAtStem
Gballet (comment 3101708418): the hasParent check in the kindHashed
branch never fires — NewBinaryTrie resolves the root eagerly at open
time, so any HashedNode we encounter during a getValuesAtStem walk is
necessarily a child of a previously-visited internal (parentIdx /
parentIsLeft set on the prior kindInternal iteration).

Drop the hasParent flag and its setter; replace the check with a short
comment stating the invariant.
2026-04-18 18:57:56 +02:00
CPerezz
33227e7e6d
trie/bintrie: merge *Single paths into *ValuesAtStem
Per gballet's comment 3101751325 on PR #34055: the *Single functions
are essentially the same thing as *ValuesAtStem with one slot set. The
original design dispatched through *ValuesAtStem for dedup; this commit
restores that shape on the arena side.

- GetValue now delegates to GetValuesAtStem and indexes the returned
  256-slot array header (no allocation — the stem node returns its own
  inline values array as a slice).
- InsertSingle now builds a stack-allocated [StemNodeWidth][]byte with
  only the target slot set and delegates to InsertValuesAtStem.
- Delete the insertSingleInternal tree walker (~90 LOC) and the whole
  splitStemInsert (~60 LOC) — the *ValuesAtStem / splitStemValuesInsert
  pair already handles every case.

Addresses gballet comments 3101751325, 3101739001, 3101724199, 3101721238
(the last three subsumed by the consolidation — the duplicated helper
bodies no longer exist).

Net: ~150 LOC removed from store_ops.go. Allocation cost for InsertSingle
is bounded by the stack-allocated 256-slot array (one stack frame, no
heap allocation on the hot path).
2026-04-18 18:57:23 +02:00
CPerezz
bbf062c746
trie/bintrie: inline getSingle into GetValue
Gballet asked (comment 3101679920) to fold the unexported getSingle helper
into its single caller, and (comment 3101677731) to rename GetSingle ('bad
name: get single what?') with a top-level docstring.

- Inline getSingle into GetValue (one function instead of two).
- Rename GetSingle → GetValue and add a docstring.
- Drop the hasParent tracker that was only used for the 'hashed at root'
  guard — that case is now handled by kindEmpty / the top-level
  NewBinaryTrie-time root resolution, so remove the check rather than
  keep dead state.

CE2 will later fold this into GetValuesAtStem; this commit closes the
naming + inline asks independently.
2026-04-18 18:54:26 +02:00
CPerezz
8f31f30500
trie/bintrie: trim storeChunkSize doc comment
Gballet posted an empty 'suggestion' block on node_store.go:24 (comment
3100612272) — collapse the 4-line explanatory block to one line.
2026-04-18 18:53:23 +02:00
CPerezz
e1859ea864
trie/bintrie: simplify StemNode to array-of-slices representation
Gballet asked on PR #34055 (comments 3100043116, 3100050542, and the
bit-check dedup at 3100114416 / 3100878310) to revert StemNode from the
packed-bytes representation to the straightforward array-of-slices.

Before: StemNode carried a bitmap, a concatenated valueData []byte, a
count, and a shared COW flag. Every read/write went through a bit-count
posInData lookup; every mutation through ensureWritable COW.

After: values [StemNodeWidth][]byte — 256 slots, nil == absent. No
bitmap lookup, no COW. Direct sn.values[suffix] access.

Supporting changes:
- Drop posInData, ensureWritable; rewrite getValue/hasValue/allValues/
  setValue as trivial slice access.
- Hash() iterates sn.values directly, matching master's shape.
- SerializeNode emits the bitmap + concatenated bytes on the wire from
  the array-of-slices at serialize time; wire format unchanged.
- decodeNode populates sn.values[i] slots by aliasing the serialized
  buffer (zero-copy).
- NodeStore.Copy deep-copies each slot.
- splitStemValuesInsert + the insertSingleInternal paths write directly
  to sn.values[i].

Trade-off: stems now carry 256 []byte headers (6144 B) instead of 1
concatenated slice (~32 B) + bitmap. Stem-pool scan cost returns to
parity with master (the existing valueData pointer already made the
pool non-noscan; rollback adds 255 more pointers per stem). The primary
arena win — pointer-free InternalNode pool — is preserved.
2026-04-18 18:53:07 +02:00
CPerezz
3885e539b7
trie/bintrie: revert sha256 helper + parallelHashDepth constant
Gballet asked (comment 3099953085) to leave the sha256Sum256 / constant
parallelHashDepth optimisation out of this PR: it's an orthogonal
microbenchmark concern that should be revisited post-group-depth under
Go 1.26.

- Delete the sha256Sum256 helper from hasher.go.
- Delete the const parallelHashDepth = 4 from hasher.go.
- Restore master's dynamic parallelDepth() helper in store_commit.go
  (copy verbatim — min(bits.Len(NumCPU), 8)).
- In hashInternal's shallow-parallel branch, call sha256.Sum256 directly
  (std-lib, stack-allocated [32]byte; common.Hash is a type alias for
  [32]byte so no conversion needed).
- In hashInternal's deep-sequential branch, use the pooled newSha256 /
  returnSha256 hasher (matches master's internal_node.go:170-185).

Intentional trade-off: the deep branch now re-introduces per-hash
sync.Pool Get/Put plus a 32-byte h.Sum(nil) allocation. Zero regression
vs master; foregoes the arena's proposed stack-based hashing until
Go 1.26 + post-group-depth benchmarks.
2026-04-18 18:50:45 +02:00
CPerezz
2d44d8a4b6
trie/bintrie: unexport package-internal arena identifiers
Gballet asked on PR #34055 to unexport nodeRef, nodeKind, and makeRef
(comments 3099846639, 3099847640, 3100717855) — none are used outside
trie/bintrie. Cascade to the internal-only support symbols and methods:

  NodeKind          → nodeKind
  KindEmpty/...     → kindEmpty/...
  NodeRef           → nodeRef
  EmptyRef          → emptyRef
  MakeRef           → makeRef
  NodeStore.Root    → deleted; inlined to s.root field access (same pkg)
  NodeStore.SetRoot → deleted; inlined to s.root = ref
  NodeStore.ComputeHash/SerializeNode/DeserializeNode(WithHash)/
  CollectNodes/ToDot/GetHeight → lowercased

All 9 method signatures took or returned nodeRef so their export would
have tripped revive:unexported-return after the type rename. Zero
external callers means no API break. The private deserializeNode helper
was renamed to decodeNode to free the name for the newly-private
deserializeNode public function.

Pure rename; no behaviour change.
2026-04-18 18:49:04 +02:00
CPerezz
939b36345f
trie/bintrie: port dirty flag + CollectNodes skip-clean from master
Master added (via PR #34754) a dirty bool to InternalNode/StemNode plus a
CollectNodes short-circuit that skips clean subtrees — the arena branch
diverged before that landed. Port the semantics onto the arena shape:

- Add dirty bool to InternalNode and StemNode.
- Wire dirty=true alongside every existing mustRecompute=true setter in
  node_store.go (newInternalRef, newStemRef) and store_ops.go (8 mutation
  sites across InsertSingle/insertSingleInternal/InsertValuesAtStem/
  insertValuesAtStem/splitStemInsert/splitStemValuesInsert).
- Add 'if !node.dirty { return nil }' gate at the top of CollectNodes for
  both KindInternal and KindStem; clear dirty after flushfn runs.
- Plumb a dirty parameter through deserializeNode; DeserializeNode passes
  dirty=true (safe default), DeserializeNodeWithHash passes dirty=false
  (loaded from disk, blob matches).

The arena test in trie_test.go that was auto-merged from master used
master-shape struct literals (tr.root, NewBinaryNode) that don't exist on
arena; delete those and replace with TestCommitSkipCleanSubtrees, an
arena-native version that asserts first-Commit flushes all nodes, no-op
Commit flushes none, and single-leaf Commit flushes only the root-to-leaf
path.
2026-04-18 18:45:12 +02:00
CPerezz
1a37d82231
trie/bintrie: restore load-bearing iterator doc comments
A prior commit aggressively trimmed comments. This restores the
ones that carried real information — ownership contracts on
returned slices, the index-tracking semantics inside Next(),
the Parent() grandparent note, and the "at a leaf" stem rule —
so future readers aren't left guessing. Also elaborates the
parallel-hashing rationale in hashInternal.
2026-04-18 18:38:38 +02:00
CPerezz
b3b86a873a
trie/bintrie: merge makeKeyPath into keyToPath
Drop the panic-on-error variant. All callers are inside methods
that already propagate errors, so the error-returning form is the
right default.
2026-04-18 18:38:37 +02:00
CPerezz
84c61897b3
trie/bintrie: use type alias for HashedNode
Replace the single-field struct with a type alias on common.Hash.
Both have identical layout (32 bytes, no pointers) and noscan span
placement, but the alias matches master's style and reads more
naturally. A zero-arg Hash() method keeps call sites terse.
2026-04-18 18:38:37 +02:00
CPerezz
5f94d26db8
trie/bintrie: update copyright year on newly added files
These four files were introduced in this PR and should carry the
current year.
2026-04-18 18:38:37 +02:00
CPerezz
16b0f9d2d9
trie/bintrie: gofmt
Fix goimports alignment in struct literals, const block, and var
block so the CI Lint job passes.
2026-04-18 18:38:37 +02:00
CPerezz
957f85fa58
Update trie/bintrie/hasher.go
Co-authored-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>
2026-04-18 18:38:37 +02:00
CPerezz
b4a7118d06
trie/bintrie: trim verbose doc comments to essentials 2026-04-18 18:38:37 +02:00
CPerezz
ad64f4ec04
trie/bintrie: fix copy-paste panic messages in allocStem/allocHashed 2026-04-18 18:38:37 +02:00
CPerezz
efc164b850
trie/bintrie: move ensureWritable into setValue 2026-04-18 18:38:37 +02:00
CPerezz
ee951bad21
trie/bintrie: remove KindInvalid (overflows 2-bit NodeRef tag) 2026-04-18 18:38:36 +02:00
CPerezz
a9a2d5653c
trie/bintrie: free hashed node slot after iterator resolution 2026-04-18 18:38:36 +02:00
CPerezz
7351d8d1c1
trie/bintrie: guard LeafProof against single-stem trees 2026-04-18 18:38:36 +02:00
CPerezz
9769a68c84
trie/bintrie: document zero-copy deserialization ownership contract 2026-04-18 18:38:36 +02:00
CPerezz
0f6b066156
trie/bintrie: add nil-resolver guards for hashed node resolution 2026-04-18 18:38:36 +02:00
CPerezz
b6d385e702
trie/bintrie: add depth bound to splitStemInsert 2026-04-18 18:38:36 +02:00
CPerezz
05773f4bae
trie/bintrie: fix CollectNodes path slice aliasing 2026-04-18 18:38:36 +02:00
CPerezz
d5969de518
trie/bintrie: use EmptyRef for zero-hash children in deserialization 2026-04-18 18:38:36 +02:00
CPerezz
cff5fca22f
trie/bintrie: fix Parent() to return parent hash, not current node 2026-04-18 18:38:35 +02:00
CPerezz
c1fb257e2e
fix: Strip group-depth from GC 2026-04-18 18:38:35 +02:00
CPerezz
8a5e777fde
trie/bintrie: replace BinaryNode interface with GC-free NodeRef arena
Replace the BinaryNode interface (which uses Go interface pointers that
the GC must scan) with NodeRef uint32 indices into typed arena pools.
NodeRef packs a 2-bit kind tag and 30-bit pool index into a single
uint32, making it invisible to the garbage collector.

NodeStore manages chunked typed pools per node kind:
- InternalNode pool: ZERO Go pointers (children are NodeRef, hash is
  [32]byte) → allocated in noscan spans, GC skips entirely
- HashedNode pool: ZERO Go pointers → noscan spans
- StemNode pool: ONE pointer per node (valueData []byte) → minimal GC

For a trie with 25K InternalNodes, this reduces GC-scanned pointer-words
from ~125K to ~10K (85% reduction). CPU profiling showed 44% of time
in GC; this refactor directly addresses that bottleneck.

Serialization format is unchanged — the on-disk representation is
fully compatible. All existing tests pass.
2026-04-18 18:38:15 +02:00
CPerezz
61bfacc52f
trie/bintrie: skip clean nodes in CollectNodes to reduce commit write amplification (#34754)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
## Problem

`BinaryTrie.Commit` unconditionally walked every resolved in-memory node
and flushed it into the `NodeSet`, producing one Pebble write per
resolved internal + stem node on every block — even when the node's
on-disk blob was bitwise identical to the previous commit. On a warm
400M-state workload this meant tens of thousands of redundant 65-byte
writes per block, compounding Pebble compaction pressure on every
commit.

The existing `mustRecompute` flag tracks *hash* staleness, not
*disk-blob* staleness: after `Hash()` completes, `mustRecompute` is
cleared even though the fresh blob has not been persisted. It is
therefore insufficient for a skip-flush optimization.

## Fix

Mirror the MPT committer pattern (`trie/committer.go:51-56`) by adding a
`dirty` flag on `InternalNode` and `StemNode` with the semantics *the
on-disk blob is stale*. The flag is:

- set to `true` wherever the node is created or structurally modified
(the same call sites that already set `mustRecompute = true`);
- set to `false` only after the node has been passed to the `flushfn`
inside `CollectNodes`;
- left `false` on nodes produced by `DeserializeNodeWithHash`, matching
the *loaded from disk, already persisted* semantics.

`CollectNodes` short-circuits on `!dirty` subtrees. The propagation
invariant (an ancestor of any dirty node is itself dirty) is already
maintained by the existing `InsertValuesAtStem` / `Insert` paths, which
now mirror every `mustRecompute = true` setter with a `dirty = true`
setter.

## Benchmark

New `BenchmarkCollectNodes_SparseWrite` measures commit cost when only
one leaf changes between blocks — the common case for state updates.
10,000-stem trie, one-leaf modification + Commit per iteration, Apple M4
Pro:

| | before | after | delta |
|---|---|---|---|
| time / op | 12,653,000 ns | 7,336 ns | **~1,725×** |
| bytes / op | 107,224,740 B | 37,774 B | **~2,839×** |
| allocs / op | 80,953 | 134 | **~604×** |

End-to-end impact on a real workload depends on the
resolved-footprint-to-dirty-path ratio; the new
`TestBinaryTrieCommitIncremental` provides a structural regression guard
(asserts that a Commit following a single-leaf modification flushes a
root-to-leaf path, not the whole tree).

---

Found all of this stuff while bloating my #34706 DB to make some
benchmarks. And saw we were spending A LOT OF TIME on hashing.
Hope this helps the perf a bit. Will rebase the flat-state PR on top of
this once merged.
2026-04-18 11:42:58 +02:00
Guillaume Ballet
ba215fd927
cmd, core, trie, triedb: split CachingDB into merkle + binary dbs. (#34700)
This Pr implements some prerequisite changes for #34004 : split the
`CachingDB` into a `MerkleDB` and a `UBTDB`, so that very different
behaviors don't clash as much.

The transition isn't handled by this PR, but after talking to Gary we
agreed that `UBTDB` should receive another `triedb`, which will only be
loaded if the `Ended` flag is set to false in the conversion contract.
If this is too hard to achieve, it makes sense to load it regardless,
and then loading can be prevented at a later stage by adding a
`UBTTransitionFinalizationTime` in `ChainConfig`.

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
2026-04-17 08:55:54 +08:00
Guillaume Ballet
735bfd121a
trie/bintrie: spec change, big endian hashing of slot key (#34670)
The spec has been changed during SIC #49, the offset is encoded as a
big-endian number.
2026-04-13 09:42:37 +02:00
CPerezz
deda47f6a1
trie/bintrie: fix GetAccount/GetStorage non-membership — verify stem before returning values (#34690)
Some checks failed
/ Linux Build (push) Has been cancelled
/ Linux Build (arm) (push) Has been cancelled
/ Keeper Build (push) Has been cancelled
/ Windows Build (push) Has been cancelled
/ Docker Image (push) Has been cancelled
Fix `GetAccount` returning **wrong account data** for non-existent
addresses when the trie root is a `StemNode` (single-account trie) — the
`StemNode` branch returned `r.Values` without verifying the queried
address's stem matches.

Co-authored-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>
2026-04-10 19:43:48 +02:00
CPerezz
f71a884e37
trie/bintrie: fix DeleteAccount no-op (#34676)
`BinaryTrie.DeleteAccount` was a no-op, silently ignoring the caller's
deletion request and leaving the old `BasicData` and `CodeHash` in the
trie.

Co-authored-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>
2026-04-10 19:23:44 +02:00
rjl493456442
c3467dd8b5
core, miner, trie: relocate witness stats (#34106)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
This PR relocates the witness statistics into the witness itself, making
it more self-contained.
2026-03-27 17:06:46 +01:00
Guillaume Ballet
305cd7b9eb
trie/bintrie: fix NodeIterator Empty node handling and expose tree accessors (#34056)
Some checks failed
/ Linux Build (push) Has been cancelled
/ Linux Build (arm) (push) Has been cancelled
/ Keeper Build (push) Has been cancelled
/ Windows Build (push) Has been cancelled
/ Docker Image (push) Has been cancelled
Fix three issues in the binary trie NodeIterator:

1. Empty nodes now properly backtrack to parent and continue iteration
instead of terminating the entire walk early.

2. `HashedNode` resolver handles `nil` data (all-zeros hash) gracefully
by treating it as Empty rather than panicking.

3. Parent update after node resolution guards against stack underflow
when resolving the root node itself.

---------

Co-authored-by: tellabg <249254436+tellabg@users.noreply.github.com>
2026-03-20 13:53:14 -04:00
CPerezz
6138a11c39
trie/bintrie: parallelize InternalNode.Hash at shallow tree depths (#34032)
## Summary

At tree depths below `log2(NumCPU)` (clamped to [2, 8]), hash the left
subtree in a goroutine while hashing the right subtree inline. This
exploits available CPU cores for the top levels of the tree where
subtree hashing is most expensive. On single-core machines, the parallel
path is disabled entirely.

Deeper nodes use sequential hashing with the existing `sync.Pool` hasher
where goroutine overhead would exceed the hash computation cost. The
parallel path uses `sha256.Sum256` with a stack-allocated buffer to
avoid pool contention across goroutines.

**Safety:**
- Left/right subtrees are disjoint — no shared mutable state
- `sync.WaitGroup` provides happens-before guarantee for the result
- `defer wg.Done()` + `recover()` prevents goroutine panics from
crashing the process
- `!bt.mustRecompute` early return means clean nodes never enter the
parallel path
- Hash results are deterministic regardless of computation order — no
consensus risk

## Benchmark (AMD EPYC 48-core, 500K entries, `--benchtime=10s
--count=3`, post-H01 baseline)

| Metric | Baseline | Parallel | Delta |
|--------|----------|----------|-------|
| Approve (Mgas/s) | 224.5 ± 7.1 | **259.6 ± 2.4** | **+15.6%** |
| BalanceOf (Mgas/s) | 982.9 ± 5.1 | 954.3 ± 10.8 | -2.9% (noise, clean
nodes skip parallel path) |
| Allocs/op (approve) | ~810K | ~700K | -13.6% |
2026-03-18 13:54:23 +01:00
Guillaume Ballet
1c9ddee16f
trie/bintrie: use a sync.Pool when hashing binary tree nodes (#33989)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
Binary tree hashing is quite slow, owing to many factors. One of them is
the GC pressure that is the consequence of allocating many hashers, as a
binary tree has 4x the size of an MPT. This PR introduces an
optimization that already exists for the MPT: keep a pool of hashers, in
order to reduce the amount of allocations.
2026-03-12 10:20:12 +01:00