Addresses review suggestion S2/S7/S22.
Remove "NOT wired in yet", "in a later commit", and "Commit N"
references that were accurate at the time of their original commit
but became stale after subsequent commits landed on the same branch.
Update cross-references to name the actual functions and files
rather than commit numbers.
Specific fixes:
* flat_codec_bintrie.go:50-57: "NOT wired" → "wired when isVerkle"
* flat_codec_bintrie.go:360: "in a later commit" → "generate_bintrie.go"
* database_hasher_binary.go:132: "in a later commit" → "StateDB.commit()"
* journal.go:53-56: v4 comment updated from "reserved for" → actual description
binaryHasher now implements the new LeafProducer optional extension to
the Hasher interface. Every UpdateAccount, UpdateStorage, and delete
path records the corresponding (stem, offset, value) write into an
internal buffer, which the caller drains once per block via
DrainStemWrites() and hands to the pathdb flat-state layer through the
stateUpdate (wired up in the next commit).
Three kinds of writes are recorded:
- Account create/update: two writes (BasicData at offset 0,
CodeHash at offset 1), sharing the same 31-byte stem. BasicData
is produced via bintrie.PackBasicData so the flat-state blob
is bit-identical to what the trie layer packs internally.
- Storage update: one write per slot. Non-zero values become
right-justified 32-byte blobs; the zero value (the bintrie's
"delete" convention) becomes 32 zero bytes, matching the trie's
tombstone-with-zero semantics so the flat-state mirror stays
bit-identical to the StemNode.Values entry.
- Account delete: two clear writes (nil Value) for offsets 0 and 1.
Storage slots and code chunks at the same or other stems are NOT
touched; pre-EIP-6780 full-wipe is a documented scope limitation.
The LeafProducer interface lives on Hasher and is strictly opt-in —
merkleHasher does not implement it, and callers detect capability via
a type assertion. This keeps the read-side/write-side split of the
existing Hasher cleanly extended: hashers that have a concept of
flat-state leaves can expose them; hashers that don't (MPT) are
unaffected.
Tests cover:
- TestBinaryHasherLeafProduction: account update produces 2 writes
at offsets 0+1 with matching stem; drain is destructive; storage
update emits one matching write; zero-value storage writes 32 zero
bytes; delete emits 2 clear writes.
- TestMerkleHasherNoLeafProducer: merkleHasher does NOT satisfy the
LeafProducer interface (the capability is opt-in per hasher).
The collected stem writes are not yet propagated anywhere — a later
commit wires DrainStemWrites into StateDB.IntermediateRoot so the
writes flow through stateUpdate and the pathdb stateSet into the
flat-state layer.
binaryHasher.updateAccount computed codeLen from len(account.Code.Code),
which is only non-zero when the code itself was modified in the current
block. For balance- or nonce-only updates account.Code is nil and the
computed codeLen was 0, silently overwriting the code_size field packed
into the bintrie BasicData leaf (EIP-7864 bytes 5-7) with zero every
time a contract was touched without a code write.
The TODO(rjl493456442) on updateAccount acknowledged this. Fix it by
adding a CodeSize field to AccountMut and having the caller at
StateDB.IntermediateRoot populate it via stateObject.CodeSize(), which
returns len(obj.code) when the bytes are loaded, otherwise falls back
to a code-size lookup via the reader. The binary hasher then passes
account.CodeSize straight to BinaryTrie.UpdateAccount as its codeLen
argument, and the TODO is removed.
Rationale for placing CodeSize on AccountMut rather than Account:
AccountMut already carries Code *CodeMut — the new bytecode, which is
not a field of Account — because code is write-time data that is not
persisted in the flat-state format (SlimAccountRLP). CodeSize has the
identical lifecycle: it is not in SlimAccountRLP, it is not populated
by any reader, and it is only consumed by the hasher at write time.
Mirroring Code's placement keeps the read-side/write-side split honest
(Account models the persisted flat-state record; AccountMut adds the
code-related write-time parameters). If the bintrie flat-state format
is later extended to carry code_size, CodeSize can be promoted onto
Account at that time.
merkleHasher is unaffected: StateTrie.UpdateAccount ignores its codeLen
parameter, so the wrapTrie.UpdateAccount shim continues to pass 0 and
no state-root divergence is introduced on the MPT path.
Regression test TestVerkleCodeSizePreserved verifies that the state
root produced by "create contract, commit, reload, modify balance,
commit" matches the root of a single-step construction of the same
final state. Before the fix the roots diverge:
path A (reload + balance): 1a675599...
path B (fresh, same state): de0cfb03...