Commit graph

17 commits

Author SHA1 Message Date
CPerezz
437a53bbe0
triedb/pathdb: implement bintrieFlatCodec + stem blob helpers
Introduce the codec and on-disk blob format for the bintrie flat-state
layer. This commit only defines the types; the codec is NOT wired into
pathdb.Database.New yet (that happens in a later commit once the
leaf-production hook in binaryHasher and the stateUpdate wiring are in
place).

Three pieces:

1. trie/bintrie/pack.go

   Canonical PackBasicData / UnpackBasicData helpers that encode an
   account's (codeSize, nonce, balance) into the 32-byte BasicData leaf
   defined by EIP-7864. Preserves the existing BinaryTrie.UpdateAccount
   layout byte-for-byte (4-byte code_size at offset 4 rather than the
   spec's 3-byte field at offset 5 — any realistic code size has byte 4
   always zero and the two encodings are bit-equivalent in practice).

   BinaryTrie.UpdateAccount is refactored to delegate to PackBasicData
   so the flat-state codec can produce a bit-identical BasicData
   encoding without duplicating the layout logic.

2. triedb/pathdb/stem_blob.go

   Packed encoding of the populated (offset, value) pairs at a bintrie
   stem. A stem can hold up to 256 offsets per EIP-7864 but in practice
   only a handful are set; the layout is a 32-byte bitmap followed by
   N 32-byte values in ascending offset order, where N = popcount.
   Empty stems encode to nil so the caller knows to delete the on-disk
   key rather than write a zero-length value.

   Provides encodeStemBlob / decodeStemBlob / extractStemOffset /
   mergeStemBlob and a stemBuilder type for accumulating writes. The
   tombstone convention (32 zero bytes = "present with zero" as used
   by DeleteStorage) is preserved.

   11 unit tests cover: empty blob, BasicData+CodeHash roundtrip, all
   256 offsets populated, sparse high offsets, set/clear roundtrip,
   load-from-existing-blob RMW, merge helper, merge-to-empty, tombstone
   zero bytes, malformed input detection, bitmap rank sanity.

3. triedb/pathdb/flat_codec_bintrie.go

   bintrieFlatCodec implements flatStateCodec over the stem-blob layout.
   Unlike merkleFlatCodec it is stateful: it holds an ethdb.KeyValueReader
   reference used by applyWrites to read the existing stem blob before
   merging in new writes. ethdb.Batch is write-only so the batch passed
   to Write* cannot be used to fetch current state.

   Pre-aggregation requirement is documented explicitly: within a single
   flush, the caller must NOT issue two Write* calls targeting the same
   stem, because the RMW read comes from the store (not the in-flight
   batch). Commit 8 of the bintrie flat-state plan restructures
   writeStates to pre-aggregate per-stem writes so callers don't have
   to handle this manually.

   Cache keys are prefix-disambiguated with a one-byte 0x01 to keep
   bintrie stem lookups disjoint from merkle 32-byte account keys and
   64-byte storage keys in the shared clean-state fastcache.

   SplitMarker is a single-tier (stem-only) format, not the merkle
   two-tier (account, account+storage) format.

   7 unit tests cover: account roundtrip, storage roundtrip, multiple
   writes to the same stem, DeleteAccount preserving unrelated offsets,
   DeleteStorage removing the final offset collapsing the key, cache
   key disjointness from merkle, SplitMarker semantics.
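
   As a rough illustration of the stem-blob layout from item 2 (a 32-byte
   bitmap followed by N 32-byte values in ascending offset order), the
   sketch below encodes and decodes a sparse stem. The function names
   mirror encodeStemBlob / decodeStemBlob, but the signatures, the
   map-based input, and the bit ordering inside the bitmap are assumptions
   made for this example and may differ from the actual pathdb code.

```go
package main

import (
	"errors"
	"fmt"
	"math/bits"
)

const (
	bitmapSize = 32 // 256 offsets, one bit each
	valueSize  = 32
)

// bit returns the mask for an offset inside its bitmap byte.
// MSB-first ordering is an assumption of this sketch.
func bit(off int) byte { return byte(0x80) >> uint(off%8) }

// encodeStemBlob packs the populated offsets of one stem (hypothetical
// signature). A key that is present with 32 zero bytes stays present,
// which matches the tombstone convention described above.
func encodeStemBlob(values map[byte][32]byte) []byte {
	if len(values) == 0 {
		return nil // empty stem: the caller deletes the on-disk key
	}
	blob := make([]byte, bitmapSize, bitmapSize+len(values)*valueSize)
	for off := 0; off < 256; off++ {
		v, ok := values[byte(off)]
		if !ok {
			continue
		}
		blob[off/8] |= bit(off)
		blob = append(blob, v[:]...) // values land in ascending offset order
	}
	return blob
}

// decodeStemBlob is the inverse; it rejects blobs whose length does not
// equal bitmapSize + popcount(bitmap) * valueSize.
func decodeStemBlob(blob []byte) (map[byte][32]byte, error) {
	out := make(map[byte][32]byte)
	if len(blob) == 0 {
		return out, nil
	}
	if len(blob) < bitmapSize {
		return nil, errors.New("stem blob shorter than bitmap")
	}
	n := 0
	for _, b := range blob[:bitmapSize] {
		n += bits.OnesCount8(b)
	}
	if len(blob) != bitmapSize+n*valueSize {
		return nil, errors.New("stem blob length does not match popcount")
	}
	idx := 0 // rank of the current offset among the populated ones
	for off := 0; off < 256; off++ {
		if blob[off/8]&bit(off) == 0 {
			continue
		}
		var v [32]byte
		copy(v[:], blob[bitmapSize+idx*valueSize:])
		out[byte(off)] = v
		idx++
	}
	return out, nil
}

func main() {
	in := map[byte][32]byte{0: {0x01}, 1: {0x02}, 200: {0xff}}
	blob := encodeStemBlob(in)
	out, err := decodeStemBlob(blob)
	fmt.Println(len(blob), len(out), err)
}
```

   A merge helper in this style would decode the existing blob, apply the
   new (offset, value) pairs, and re-encode, which is why the codec needs
   read access to the store.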

The codec is not dispatched by anything yet; MPT continues through the
merkle codec and bintrie mode still runs on the (soon-to-be-replaced)
keccak-shaped path until Commit 10 wires things up.
2026-04-15 15:00:40 +02:00
CPerezz
2851f7b8c7
trie/bintrie: implement binaryNodeIterator.seek()
The bintrie node iterator previously discarded its `start` parameter,
forcing every iteration to begin at the root. This makes resumable
generators (snapshot/flat-state population) impossible — any
interruption restarts from scratch.

Implement seek(start []byte) by walking down the trie following start's
bit path, building the iterator stack as we go. When the chosen path
dead-ends (Empty, missing child, or a stem strictly less than start),
backtrack through the existing stack to find the next in-order subtree
and descend to its leftmost leaf.

Also wire BinaryTrie.NodeIterator(startKey) to actually pass startKey
through (was hardcoded to nil).

Tests cover: empty start (no-op), exact key match, between-keys,
into empty subtree, past end, within-stem offsets, resume simulation,
and deep tree.
2026-04-15 15:00:39 +02:00
Guillaume Ballet
735bfd121a
trie/bintrie: spec change, big endian hashing of slot key (#34670)
The spec was changed during SIC #49: the offset is now encoded as a
big-endian number.
2026-04-13 09:42:37 +02:00
CPerezz
deda47f6a1
trie/bintrie: fix GetAccount/GetStorage non-membership — verify stem before returning values (#34690)
Fix `GetAccount` returning **wrong account data** for non-existent
addresses when the trie root is a `StemNode` (single-account trie) — the
`StemNode` branch returned `r.Values` without verifying the queried
address's stem matches.

Co-authored-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>
2026-04-10 19:43:48 +02:00
CPerezz
f71a884e37
trie/bintrie: fix DeleteAccount no-op (#34676)
`BinaryTrie.DeleteAccount` was a no-op, silently ignoring the caller's
deletion request and leaving the old `BasicData` and `CodeHash` in the
trie.

Co-authored-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>
2026-04-10 19:23:44 +02:00
Guillaume Ballet
305cd7b9eb
trie/bintrie: fix NodeIterator Empty node handling and expose tree accessors (#34056)
Fix three issues in the binary trie NodeIterator:

1. Empty nodes now properly backtrack to parent and continue iteration
instead of terminating the entire walk early.

2. `HashedNode` resolver handles `nil` data (all-zeros hash) gracefully
by treating it as Empty rather than panicking.

3. Parent update after node resolution guards against stack underflow
when resolving the root node itself.

---------

Co-authored-by: tellabg <249254436+tellabg@users.noreply.github.com>
2026-03-20 13:53:14 -04:00
CPerezz
6138a11c39
trie/bintrie: parallelize InternalNode.Hash at shallow tree depths (#34032)
## Summary

At tree depths below `log2(NumCPU)` (clamped to [2, 8]), hash the left
subtree in a goroutine while hashing the right subtree inline. This
exploits available CPU cores for the top levels of the tree where
subtree hashing is most expensive. On single-core machines, the parallel
path is disabled entirely.

Deeper nodes use sequential hashing with the existing `sync.Pool` hasher
where goroutine overhead would exceed the hash computation cost. The
parallel path uses `sha256.Sum256` with a stack-allocated buffer to
avoid pool contention across goroutines.

**Safety:**
- Left/right subtrees are disjoint — no shared mutable state
- `sync.WaitGroup` provides happens-before guarantee for the result
- `defer wg.Done()` + `recover()` prevents goroutine panics from
crashing the process
- `!bt.mustRecompute` early return means clean nodes never enter the
parallel path
- Hash results are deterministic regardless of computation order — no
consensus risk
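
A minimal sketch of the scheme, under simplifying assumptions: the
`node` type, the leaf hashing, and the zero hash for empty subtrees are
invented for illustration, `parallelDepth` rounds `log2(NumCPU)` down
(the PR does not say how it rounds), and the `recover()` guard and the
`mustRecompute` check are left out for brevity.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/bits"
	"runtime"
	"sync"
)

// node is a hypothetical binary tree node, not the bintrie InternalNode.
type node struct {
	left, right *node
	leaf        []byte
}

// parallelDepth returns the depth below which subtrees are hashed
// concurrently: floor(log2(numCPU)) clamped to [2, 8], or 0 (disabled)
// on a single core.
func parallelDepth(numCPU int) int {
	if numCPU < 2 {
		return 0
	}
	d := bits.Len(uint(numCPU)) - 1
	if d < 2 {
		d = 2
	}
	if d > 8 {
		d = 8
	}
	return d
}

func (n *node) hash(depth, maxPar int) [32]byte {
	if n == nil {
		return [32]byte{} // empty subtree (sketch convention)
	}
	if n.left == nil && n.right == nil {
		return sha256.Sum256(n.leaf)
	}
	var l, r [32]byte
	if depth < maxPar {
		// Shallow level: left subtree in a goroutine, right inline.
		var wg sync.WaitGroup
		wg.Add(1)
		go func() {
			defer wg.Done()
			l = n.left.hash(depth+1, maxPar)
		}()
		r = n.right.hash(depth+1, maxPar)
		wg.Wait() // happens-before edge for the write to l
	} else {
		l = n.left.hash(depth+1, maxPar)
		r = n.right.hash(depth+1, maxPar)
	}
	var buf [64]byte // fixed-size local buffer, no pooled hasher
	copy(buf[:32], l[:])
	copy(buf[32:], r[:])
	return sha256.Sum256(buf[:])
}

// build creates a full tree of the given depth with distinct leaves.
func build(depth int, seed byte) *node {
	if depth == 0 {
		return &node{leaf: []byte{seed}}
	}
	return &node{left: build(depth-1, seed*2), right: build(depth-1, seed*2+1)}
}

func main() {
	root := build(4, 1)
	seq := root.hash(0, 0)
	par := root.hash(0, parallelDepth(runtime.NumCPU()))
	fmt.Println("sequential and parallel hashes match:", seq == par)
}
```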

## Benchmark (AMD EPYC 48-core, 500K entries, `--benchtime=10s --count=3`, post-H01 baseline)

| Metric | Baseline | Parallel | Delta |
|--------|----------|----------|-------|
| Approve (Mgas/s) | 224.5 ± 7.1 | **259.6 ± 2.4** | **+15.6%** |
| BalanceOf (Mgas/s) | 982.9 ± 5.1 | 954.3 ± 10.8 | -2.9% (noise, clean nodes skip parallel path) |
| Allocs/op (approve) | ~810K | ~700K | -13.6% |
2026-03-18 13:54:23 +01:00
Guillaume Ballet
1c9ddee16f
trie/bintrie: use a sync.Pool when hashing binary tree nodes (#33989)
Binary tree hashing is quite slow, owing to many factors. One of them is
the GC pressure caused by allocating many hashers, as a binary tree has
4x the size of an MPT. This PR introduces an optimization that already
exists for the MPT: keep a pool of hashers, in order to reduce the
number of allocations.
2026-03-12 10:20:12 +01:00
Guillaume Ballet
3f1871524f
trie/bintrie: cache hashes of clean nodes so as not to rehash the whole tree (#33961)
This is an optimization that existed for verkle and the MPT, but that
got dropped during the rebase.

Mark the nodes that were modified as needing recomputation, and skip the
hash computation if this is not needed. Otherwise, the whole tree is
hashed, which kills performance.
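
A toy version of the idea, assuming a dirty flag per node. The
`mustRecompute` name matches the flag mentioned in #34032; the `node`
type, `setLeaf`, and the `hashCalls` counter are invented here to make
the saving visible.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// node is a hypothetical tree node that caches its hash.
type node struct {
	left, right   *node
	leaf          []byte
	cached        [32]byte
	mustRecompute bool
}

var hashCalls int // counts real hash computations, for the demo only

func newLeaf(b byte) *node         { return &node{leaf: []byte{b}, mustRecompute: true} }
func newInternal(l, r *node) *node { return &node{left: l, right: r, mustRecompute: true} }

// Hash returns the cached hash for clean nodes and recomputes it for
// dirty ones, clearing the flag afterwards.
func (n *node) Hash() [32]byte {
	if n == nil {
		return [32]byte{}
	}
	if !n.mustRecompute {
		return n.cached
	}
	hashCalls++
	if n.left == nil && n.right == nil {
		n.cached = sha256.Sum256(n.leaf)
	} else {
		l, r := n.left.Hash(), n.right.Hash()
		n.cached = sha256.Sum256(append(l[:], r[:]...))
	}
	n.mustRecompute = false
	return n.cached
}

// setLeaf follows path (false = left, true = right) and marks every node
// on the way as dirty, so only that path is rehashed. It assumes the
// path stays inside the tree.
func (n *node) setLeaf(path []bool, v []byte) {
	n.mustRecompute = true
	if len(path) == 0 {
		n.leaf = v
		return
	}
	if path[0] {
		n.right.setLeaf(path[1:], v)
	} else {
		n.left.setLeaf(path[1:], v)
	}
}

func main() {
	root := newInternal(
		newInternal(newLeaf(1), newLeaf(2)),
		newInternal(newLeaf(3), newLeaf(4)),
	)
	root.Hash()
	fmt.Println("initial hash computations:", hashCalls)

	hashCalls = 0
	root.setLeaf([]bool{true, false}, []byte{9})
	root.Hash()
	fmt.Println("hash computations after one update:", hashCalls)
}
```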
2026-03-06 18:06:24 +01:00
Guillaume Ballet
a0fb8102fe
trie/bintrie: fix overflow management in slot key computation (#33951)
The computation of `MAIN_STORAGE_OFFSET` was incorrect, causing the last
byte of the stem to be dropped. As a result, two keys that differed only
at byte 31 would collide in the hash computation (at the preimage level,
not a hash collision of course).
2026-03-05 14:43:31 +01:00
Guillaume Ballet
95c6b05806
trie/bintrie: fix endianness in code chunk key computation (#33900)
The endianness was wrong, which means that the code chunks were stored
in the wrong location in the tree.
2026-02-27 11:35:13 +01:00
phrwlk
30656d714e
trie/bintrie: use correct key mapping in GetStorage and DeleteStorage (#33807)
GetStorage and DeleteStorage used GetBinaryTreeKey to compute the tree
key, while UpdateStorage used GetBinaryTreeKeyStorageSlot. The latter
applies storage slot remapping (header offset for slots <64, main
storage prefix for the rest), so reads and deletes were targeting
different tree locations than writes.

Replace GetBinaryTreeKey with GetBinaryTreeKeyStorageSlot in both
GetStorage and DeleteStorage to match UpdateStorage. Add a regression
test that verifies the write→read→delete→read round-trip for main
storage slots.
2026-02-11 11:42:17 +01:00
Guillaume Ballet
19f37003fb
trie/bintrie: fix debug_executionWitness for binary tree (#33739)
The `Witness` method was not implemented for the binary tree, which
caused `debug_executionWitness` to panic. This PR fixes that.

Note that the `TransitionTrie` version isn't implemented, and that's on
purpose: more thought must be given to what should go in the global
witness.
2026-02-03 12:19:40 +01:00
Ng Wei Han
3d05284928
trie/bintrie: fix tree key hashing to match spec (#33694)
Based on [EIP-7864](https://eips.ethereum.org/EIPS/eip-7864), the tree
index should be 32 bytes instead of 31 bytes.
```
def get_tree_key(address: Address32, tree_index: int, sub_index: int):
    # Assumes STEM_SUBTREE_WIDTH = 256
    return tree_hash(address + tree_index.to_bytes(32, "little"))[:31] + bytes(
        [sub_index]
    )
```
2026-01-28 11:51:02 +01:00
Guillaume Ballet
3f641dba87
trie, go.mod: remove all references to go-verkle and go-ipa (#33461)
In order to reduce the amount of code that is embedded into the keeper
binary, I am removing all the verkle code that uses go-verkle and
go-ipa. This will be followed by further PRs that are more like stubs to
replace code when the keeper build is detected.

I'm keeping the binary tree of course. This means that you will still
see `isVerkle` variables all over the codebase, but they will be renamed
when code is touched (i.e. this is not an invitation for 30+ AI slop
PRs).

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
2025-12-30 20:44:04 +08:00
Guillaume Ballet
2a2f106a01
cmd/evm/internal/t8ntool, trie: support for verkle-at-genesis, use UBT, and move the transition tree to its own package (#32445)
This is broken off of #31730 to only focus on testing networks that
start with verkle at genesis.

The PR has seen a lot of work since its creation, and it now targets
creating and re-executing tests for a binary tree testnet without the
transition (so it starts at genesis). The transition tree has been moved
to its own package. It also replaces verkle with the binary tree for
this specific application.

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
2025-11-14 15:25:30 +01:00
Guillaume Ballet
bd4b17907f
trie/bintrie: add eip7864 binary trees and run its tests (#32365)
Implement the binary tree as specified in [eip-7864](https://eips.ethereum.org/EIPS/eip-7864). 

This will gradually replace verkle trees in the codebase. This is only 
running the tests and will not be executed in production, but will help 
me rebase some of my work, so that it doesn't bitrot as much.

---------

Signed-off-by: Guillaume Ballet
Co-authored-by: Parithosh Jayanthi <parithosh.jayanthi@ethereum.org>
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
2025-09-01 21:06:51 +08:00