Commit graph

11 commits

Author SHA1 Message Date
CPerezz
29ef7576d9
core/state: hook leaf production in binaryHasher
binaryHasher now implements the new LeafProducer optional extension to
the Hasher interface. Every UpdateAccount, UpdateStorage, and delete
path records the corresponding (stem, offset, value) write into an
internal buffer, which the caller drains once per block via
DrainStemWrites() and hands to the pathdb flat-state layer through the
stateUpdate (wired up in the next commit).

Three kinds of writes are recorded:

  - Account create/update: two writes (BasicData at offset 0,
    CodeHash at offset 1), sharing the same 31-byte stem. BasicData
    is produced via bintrie.PackBasicData so the flat-state blob
    is bit-identical to what the trie layer packs internally.

  - Storage update: one write per slot. Non-zero values become
    right-justified 32-byte blobs; the zero value (the bintrie's
    "delete" convention) becomes 32 zero bytes, matching the trie's
    tombstone-with-zero semantics so the flat-state mirror stays
    bit-identical to the StemNode.Values entry.

  - Account delete: two clear writes (nil Value) for offsets 0 and 1.
    Storage slots and code chunks, whether at the same stem or at
    other stems, are NOT touched; the pre-EIP-6780 full-account wipe
    is not mirrored, which is a documented scope limitation.

The LeafProducer interface is an optional extension to Hasher and is
strictly opt-in: merkleHasher does not implement it, and callers detect
the capability via a type assertion. This extends the existing Hasher
without disturbing its read-side/write-side split: hashers that have a
concept of flat-state leaves can expose them; hashers that don't (MPT)
are unaffected.
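
The opt-in detection can be sketched as below. Only the names Hasher,
LeafProducer, DrainStemWrites, binaryHasher and merkleHasher come from
this commit; the Hash method, the stemWrite type, the return type of
DrainStemWrites and the drainIfSupported helper are assumptions made for
illustration.

```go
package main

import "fmt"

// stemWrite is a hypothetical, simplified write record.
type stemWrite struct {
	Offset byte
	Value  []byte
}

// Hasher stands in for the real interface; its method set is assumed.
type Hasher interface {
	Hash() [32]byte
}

// LeafProducer is the optional extension; the return type is assumed.
type LeafProducer interface {
	DrainStemWrites() []stemWrite
}

// merkleHasher implements Hasher only.
type merkleHasher struct{}

func (merkleHasher) Hash() [32]byte { return [32]byte{} }

// binaryHasher implements Hasher and LeafProducer.
type binaryHasher struct {
	writes []stemWrite
}

func (h *binaryHasher) Hash() [32]byte { return [32]byte{} }

// DrainStemWrites returns the buffered writes and resets the buffer,
// so a second call returns nothing (the drain is destructive).
func (h *binaryHasher) DrainStemWrites() []stemWrite {
	out := h.writes
	h.writes = nil
	return out
}

// drainIfSupported uses a type assertion to detect the capability.
func drainIfSupported(h Hasher) ([]stemWrite, bool) {
	lp, ok := h.(LeafProducer)
	if !ok {
		return nil, false
	}
	return lp.DrainStemWrites(), true
}

func main() {
	bh := &binaryHasher{writes: []stemWrite{{Offset: 0}, {Offset: 1}}}

	w, ok := drainIfSupported(bh)
	fmt.Println(len(w), ok) // 2 true

	w, ok = drainIfSupported(bh)
	fmt.Println(len(w), ok) // 0 true

	_, ok = drainIfSupported(merkleHasher{})
	fmt.Println(ok) // false
}
```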

Tests cover:

  - TestBinaryHasherLeafProduction: account update produces 2 writes
    at offsets 0+1 with matching stem; drain is destructive; storage
    update emits one matching write; zero-value storage writes 32 zero
    bytes; delete emits 2 clear writes.
  - TestMerkleHasherNoLeafProducer: merkleHasher does NOT satisfy the
    LeafProducer interface (the capability is opt-in per hasher).

The collected stem writes are not yet propagated anywhere — a later
commit wires DrainStemWrites into StateDB.IntermediateRoot so the
writes flow through stateUpdate and the pathdb stateSet into the
flat-state layer.
2026-04-15 15:00:40 +02:00
CPerezz
64d185616c
core/state: plumb CodeSize through AccountMut for binaryHasher
binaryHasher.updateAccount computed codeLen from len(account.Code.Code),
which is only non-zero when the code itself was modified in the current
block. For balance- or nonce-only updates account.Code is nil and the
computed codeLen was 0, silently overwriting the code_size field packed
into the bintrie BasicData leaf (EIP-7864 bytes 5-7) with zero every
time a contract was touched without a code write.
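
To illustrate the effect, here is a rough sketch of a BasicData packer.
It is not bintrie.PackBasicData: the function name and signature are
hypothetical, and the layout outside bytes 5-7 (version and reserved in
bytes 0-4, big-endian nonce in 8-15, balance in 16-31) is an assumption
based on a reading of EIP-7864.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// packBasicData is a hypothetical stand-in for the real packer.
// Bytes 0-4 (version, reserved) are left zero in this sketch.
func packBasicData(codeSize uint32, nonce uint64, balance [16]byte) [32]byte {
	var leaf [32]byte
	// code_size occupies bytes 5-7, big-endian, low 24 bits only.
	leaf[5] = byte(codeSize >> 16)
	leaf[6] = byte(codeSize >> 8)
	leaf[7] = byte(codeSize)
	binary.BigEndian.PutUint64(leaf[8:16], nonce)
	copy(leaf[16:], balance[:])
	return leaf
}

func main() {
	var bal [16]byte
	bal[15] = 1

	// Code written in this block: the real size is packed.
	good := packBasicData(0x012345, 7, bal)
	// Balance-only update with codeLen computed as 0: the field is wiped.
	bad := packBasicData(0, 7, bal)

	fmt.Printf("%x\n", good[5:8]) // 012345
	fmt.Printf("%x\n", bad[5:8])  // 000000
}
```

Since the leaf value feeds the state root, the two leaves above hash
differently even though nonce and balance are identical.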

The TODO(rjl493456442) on updateAccount acknowledged this. Fix it by
adding a CodeSize field to AccountMut and having the caller at
StateDB.IntermediateRoot populate it via stateObject.CodeSize(), which
returns len(obj.code) when the bytes are loaded, otherwise falls back
to a code-size lookup via the reader. The binary hasher then passes
account.CodeSize straight to BinaryTrie.UpdateAccount as its codeLen
argument, and the TODO is removed.
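
The before/after behaviour can be sketched as follows. AccountMut, CodeMut
and the CodeSize field are named in this commit; the field types, the
lookup callback standing in for the reader, and the oldCodeLen/newCodeLen
helpers are assumptions made for illustration. Treating "loaded" as
len(code) != 0 is also an assumption.

```go
package main

import "fmt"

// CodeMut carries newly written bytecode (field name taken from the
// account.Code.Code expression in the commit message).
type CodeMut struct {
	Code []byte
}

// AccountMut is reduced to the two fields relevant here.
type AccountMut struct {
	Code     *CodeMut // nil unless code was modified in this block
	CodeSize int      // populated by the caller at write time
}

// stateObject is a simplified stand-in; lookup is a hypothetical
// replacement for the reader's code-size query.
type stateObject struct {
	code   []byte
	lookup func() int
}

// CodeSize returns len(code) when the bytes are loaded and otherwise
// falls back to the lookup.
func (o *stateObject) CodeSize() int {
	if len(o.code) != 0 {
		return len(o.code)
	}
	return o.lookup()
}

// oldCodeLen mirrors the buggy derivation: 0 whenever Code is nil.
func oldCodeLen(m AccountMut) int {
	if m.Code == nil {
		return 0
	}
	return len(m.Code.Code)
}

// newCodeLen mirrors the fix: pass CodeSize straight through.
func newCodeLen(m AccountMut) int {
	return m.CodeSize
}

func main() {
	// A contract with 100 bytes of code that is not loaded in memory.
	obj := &stateObject{lookup: func() int { return 100 }}

	// Balance-only update: no code write, so Code stays nil.
	mut := AccountMut{Code: nil, CodeSize: obj.CodeSize()}

	fmt.Println(oldCodeLen(mut)) // 0
	fmt.Println(newCodeLen(mut)) // 100
}
```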

Rationale for placing CodeSize on AccountMut rather than Account:
AccountMut already carries Code *CodeMut — the new bytecode, which is
not a field of Account — because code is write-time data that is not
persisted in the flat-state format (SlimAccountRLP). CodeSize has the
identical lifecycle: it is not in SlimAccountRLP, it is not populated
by any reader, and it is only consumed by the hasher at write time.
Mirroring Code's placement keeps the read-side/write-side split honest
(Account models the persisted flat-state record; AccountMut adds the
code-related write-time parameters). If the bintrie flat-state format
is later extended to carry code_size, CodeSize can be promoted onto
Account at that time.

merkleHasher is unaffected: StateTrie.UpdateAccount ignores its codeLen
parameter, so the wrapTrie.UpdateAccount shim continues to pass 0 and
no state-root divergence is introduced on the MPT path.

Regression test TestVerkleCodeSizePreserved verifies that the state
root produced by "create contract, commit, reload, modify balance,
commit" matches the root of a single-step construction of the same
final state. Before the fix the roots diverge:

  path A (reload + balance): 1a675599...
  path B (fresh, same state): de0cfb03...
2026-04-15 15:00:39 +02:00
Gary Rong
533d2109d5
core: fix memory leaking
2026-04-15 15:00:39 +02:00
Gary Rong
aec9c18432
core/state: improve binary hasher
2026-04-15 15:00:39 +02:00
Gary Rong
d57dca07b1
core/state: integrate witness collector
2026-04-15 15:00:39 +02:00
Gary Rong
5e23a29b73
core/state: integrate prefetching into merkle hasher
2026-04-15 15:00:38 +02:00
Gary Rong
91298c8655
core/state: implement binary hasher just for demonstration
2026-04-15 15:00:38 +02:00
Gary Rong
282cece030
core/state: implement merkle hasher
2026-04-15 15:00:38 +02:00
Gary Rong
38c7021c73
core/state: invoke prefetcher
2026-04-15 15:00:38 +02:00
Gary Rong
1ae462f08d
core/state: build hasher skeleton
2026-04-15 15:00:38 +02:00
Gary Rong
e2c00d6c96
core/state: add hasher interface definition
2026-04-15 14:59:05 +02:00