go-ethereum/core/state/database_hasher.go
CPerezz 64d185616c
core/state: plumb CodeSize through AccountMut for binaryHasher
binaryHasher.updateAccount computed codeLen from len(account.Code.Code),
which is only non-zero when the code itself was modified in the current
block. For balance- or nonce-only updates account.Code is nil and the
computed codeLen was 0, silently overwriting the code_size field packed
into the bintrie BasicData leaf (EIP-7864 bytes 5-7) with zero every
time a contract was touched without a code write.
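
A minimal sketch of the failure mode described above. The types and helpers (accountMut, codeMut, codeLenBuggy, packBasicData) are hypothetical stand-ins, not the go-ethereum definitions. Only the placement of code_size in bytes 5-7 comes from the text; the big-endian encoding and the nonce offset are assumptions based on EIP-7864.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// codeMut and accountMut are simplified, hypothetical stand-ins for the
// mutation types discussed above (pre-fix shape: no CodeSize field).
type codeMut struct{ Code []byte }

type accountMut struct {
	Nonce uint64
	Code  *codeMut // nil when the code was not modified in this block
}

// codeLenBuggy mirrors the pre-fix derivation: the length is taken from the
// mutation's code bytes, so it is 0 whenever Code is nil.
func codeLenBuggy(m accountMut) int {
	if m.Code == nil {
		return 0
	}
	return len(m.Code.Code)
}

// packBasicData builds a 32-byte leaf with code_size in bytes 5-7.
// Assumptions: big-endian encoding, nonce in bytes 8-15; other fields omitted.
func packBasicData(nonce uint64, codeLen int) [32]byte {
	var leaf [32]byte
	leaf[5] = byte(codeLen >> 16)
	leaf[6] = byte(codeLen >> 8)
	leaf[7] = byte(codeLen)
	binary.BigEndian.PutUint64(leaf[8:16], nonce)
	return leaf
}

func main() {
	deploy := accountMut{Nonce: 1, Code: &codeMut{Code: make([]byte, 300)}}
	touch := accountMut{Nonce: 2} // nonce-only update, Code is nil

	a := packBasicData(deploy.Nonce, codeLenBuggy(deploy))
	b := packBasicData(touch.Nonce, codeLenBuggy(touch))
	fmt.Printf("after deploy: code_size bytes = %x\n", a[5:8])
	fmt.Printf("after touch:  code_size bytes = %x\n", b[5:8])
}
```

In this sketch the second leaf carries a zero code_size even though the contract still has 300 bytes of code, which is the overwrite the commit describes.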

The TODO(rjl493456442) on updateAccount acknowledged this. Fix it by
adding a CodeSize field to AccountMut and having the caller at
StateDB.IntermediateRoot populate it via stateObject.CodeSize(), which
returns len(obj.code) when the bytes are loaded, otherwise falls back
to a code-size lookup via the reader. The binary hasher then passes
account.CodeSize straight to BinaryTrie.UpdateAccount as its codeLen
argument, and the TODO is removed.
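
The fix can be sketched as follows. This is an illustration under stated assumptions, not the actual go-ethereum code: object, codeSizeReader, mapReader and buildMut are hypothetical names, addresses are plain strings, and the real stateObject.CodeSize() may handle further cases (such as accounts without code) that are omitted here.

```go
package main

import "fmt"

// codeSizeReader is a hypothetical interface standing in for the state reader.
type codeSizeReader interface {
	CodeSize(addr string) int
}

// mapReader is a toy reader backed by a map (assumption for illustration).
type mapReader map[string]int

func (r mapReader) CodeSize(addr string) int { return r[addr] }

// object is a simplified stand-in for stateObject.
type object struct {
	addr   string
	code   []byte // nil until the bytecode has been loaded
	reader codeSizeReader
}

// CodeSize returns len(code) when the bytes are loaded and otherwise falls
// back to a code-size lookup via the reader.
func (o *object) CodeSize() int {
	if o.code != nil {
		return len(o.code)
	}
	return o.reader.CodeSize(o.addr)
}

// accountMut is a simplified mutation carrying the new CodeSize field.
type accountMut struct {
	Code     []byte // nil: leave code unchanged
	CodeSize int    // always populated by the caller
}

// buildMut plays the role of the caller: it always sets CodeSize and only
// attaches the code bytes when the code itself was modified.
func buildMut(o *object, codeDirty bool) accountMut {
	m := accountMut{CodeSize: o.CodeSize()}
	if codeDirty {
		m.Code = o.code
	}
	return m
}

func main() {
	o := &object{addr: "0xabc", reader: mapReader{"0xabc": 300}}
	m := buildMut(o, false) // balance-only update: code bytes never loaded
	fmt.Println(m.Code == nil, m.CodeSize)
}
```

A hasher that packs the code size can then read CodeSize directly, regardless of whether the mutation carries code bytes.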

Rationale for placing CodeSize on AccountMut rather than Account:
AccountMut already carries Code *CodeMut — the new bytecode, which is
not a field of Account — because code is write-time data that is not
persisted in the flat-state format (SlimAccountRLP). CodeSize has the
identical lifecycle: it is not in SlimAccountRLP, it is not populated
by any reader, and it is only consumed by the hasher at write time.
Mirroring Code's placement keeps the read-side/write-side split honest
(Account models the persisted flat-state record; AccountMut adds the
code-related write-time parameters). If the bintrie flat-state format
is later extended to carry code_size, CodeSize can be promoted onto
Account at that time.

merkleHasher is unaffected: StateTrie.UpdateAccount ignores its codeLen
parameter, so the wrapTrie.UpdateAccount shim continues to pass 0 and
no state-root divergence is introduced on the MPT path.

Regression test TestVerkleCodeSizePreserved verifies that the state
root produced by "create contract, commit, reload, modify balance,
commit" matches the root of a single-step construction of the same
final state. Before the fix the roots diverge:

  path A (reload + balance): 1a675599...
  path B (fresh, same state): de0cfb03...
2026-04-15 15:00:39 +02:00


// Copyright 2026 The go-ethereum Authors
// This file is part of the go-ethereum library.
//
// The go-ethereum library is free software: you can redistribute it and/or modify
// it under the terms of the GNU Lesser General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// The go-ethereum library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU Lesser General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public License
// along with the go-ethereum library. If not, see <http://www.gnu.org/licenses/>.

package state

import (
	"github.com/ethereum/go-ethereum/common"
	"github.com/ethereum/go-ethereum/core/stateless"
	"github.com/ethereum/go-ethereum/ethdb"
	"github.com/ethereum/go-ethereum/trie/trienode"
)

// CodeMut represents a mutation to contract code.
type CodeMut struct {
	Code []byte // nil for deletion
}

// AccountMut represents a mutation to an account.
// Semantics:
//   - Account == nil: delete the account
//   - Code == nil: leave code unchanged
//   - Code != nil: apply the given code mutation
//   - CodeSize: the account's CURRENT total code size, not just the bytes
//     carried in Code. It is used by implementations that pack the code
//     size into their on-trie account encoding (e.g. the binary trie
//     BasicData leaf). Callers must always set this field to the
//     account's real code size, obtained via stateObject.CodeSize() or an
//     equivalent source, even on balance/nonce-only updates where the
//     code bytes themselves are not loaded. Leaving it at zero on a
//     non-code-touching update silently corrupts on-trie state for any
//     hasher that stores code size.
type AccountMut struct {
	Account  *Account // nil for deletion
	Code     *CodeMut // nil for unchanged
	CodeSize int      // Current code length (must be set by the caller)
}

// Hashes encapsulates a trie root together with its original (pre-update) root.
type Hashes struct {
	Hash common.Hash // Post-mutation root
	Prev common.Hash // Pre-mutation root
}

// Hasher defines the minimal interface for computing state root hashes.
//
// It abstracts over different trie implementations, such as the traditional
// two-layer Merkle Patricia Trie (separate account and storage tries) and a
// unified single-layer binary trie (a single trie covering accounts, storage
// and contract code).
//
// This abstraction also enables alternative implementations, such as a no-op
// hasher for flat-state-only nodes (i.e. nodes that do not store trie data and
// do not perform state validation).
//
// The Hash method may be invoked multiple times and must return a hash that
// reflects all preceding state mutations. This behavior is required for
// compatibility with pre-Byzantium semantics.
type Hasher interface {
	// UpdateAccount writes a list of accounts into the state.
	UpdateAccount(addresses []common.Address, accounts []AccountMut) error

	// UpdateStorage writes a list of storage slot values.
	UpdateStorage(address common.Address, keys []common.Hash, values []common.Hash) error

	// Hash computes and returns the state root hash without committing.
	Hash() common.Hash

	// Commit finalizes all pending changes and returns the resulting state root
	// hash, along with the set of dirty trie nodes generated by the updates.
	//
	// Additionally, if the hasher uses a two-layer structure, the roots of the
	// secondary tries together with their original hashes will also be returned
	// for all mutated accounts, regardless of whether their storage was modified.
	Commit() (common.Hash, *trienode.MergedNodeSet, map[common.Address]Hashes, error)

	// Copy returns a deep-copied hasher instance.
	Copy() Hasher
}

// Prefetcher is an optional extension implemented by hashers that can
// asynchronously warm up trie/state data ahead of hashing.
type Prefetcher interface {
	// PrefetchAccount schedules the given accounts for prefetching.
	PrefetchAccount(addresses []common.Address, read bool)

	// PrefetchStorage schedules the given storage slots for prefetching.
	PrefetchStorage(addr common.Address, keys []common.Hash, read bool)

	// TermPrefetch terminates all background prefetching activity.
	TermPrefetch()
}

// WitnessCollector is an optional extension implemented by hashers that can
// construct a state witness for the most recent committed state transition.
type WitnessCollector interface {
	// CollectWitness returns the state witness corresponding to the most recent
	// committed state transition.
	CollectWitness(*stateless.Witness)
}

// Prover is an optional extension implemented by hashers that can construct
// proofs against the current state.
type Prover interface {
	// ProveAccount constructs a proof for the given account.
	//
	// The returned proof contains all encoded nodes on the path to the account.
	// The account itself is included in the last node and can be retrieved by
	// verifying the proof.
	//
	// If the account does not exist, the returned proof contains all nodes of
	// the longest existing prefix of the account key (at least the root), ending
	// with the node that proves the absence of the account.
	ProveAccount(addr common.Address, proofDb ethdb.KeyValueWriter) error

	// ProveStorage constructs a proof for the given storage slot of the
	// specified account.
	//
	// The returned proof contains all encoded nodes on the path to the storage
	// slot. The slot value itself is included in the last node and can be
	// retrieved by verifying the proof.
	//
	// If the account or storage slot does not exist, the returned proof contains
	// the nodes required to prove its absence.
	ProveStorage(addr common.Address, key common.Hash, proofDb ethdb.KeyValueWriter) error
}

// noopHasher is a Hasher implementation that performs no work and always
// returns an empty state root.
type noopHasher struct{}

func (n *noopHasher) UpdateAccount([]common.Address, []AccountMut) error { return nil }
func (n *noopHasher) UpdateStorage(common.Address, []common.Hash, []common.Hash) error {
	return nil
}
func (n *noopHasher) Hash() common.Hash { return common.Hash{} }
func (n *noopHasher) Commit() (common.Hash, *trienode.MergedNodeSet, map[common.Address]Hashes, error) {
	return common.Hash{}, trienode.NewMergedNodeSet(), make(map[common.Address]Hashes), nil
}
func (n *noopHasher) Copy() Hasher { return &noopHasher{} }
func (n *noopHasher) Close()       {}