Addresses review test gaps T8, T9, T10.
TestBintrieFlatReaderStorageTombstone (T8):
Write slot=0x42 in block 1; clear to zero in block 2. Read at
block 2 → common.Hash{} (tombstone, not absent). Read at block 1 →
0x42 (original value preserved in the diff layer chain).
TestBintrieFlatReaderMultiBlockEvolution (T9):
Write nonce=1/balance=100 in block 1, nonce=2 in block 2,
balance=200 in block 3. Open StateReaders at each root and assert
the correct snapshot for each block. Validates diff-layer chaining
under the bintrie path.
TestBintrieGeneratorWithContractCode (T10):
Build a bintrie with a contract having ~100 bytes of code (4
chunks at offsets 128..131). Run the generator and verify code-chunk
offsets appear in the resulting stem blob. Validates the plan's
claim that code chunks are handled by the generator.
Addresses review findings I13 and S6.
encodeBinary: reject non-nil bintrie leaves with length != 32 at the
trust boundary between the hasher and the state update. Previously a
wrong-length leaf silently made it into the diff layer's accountData
and only surfaced as a panic deep in the Flush path (stemBuilder.set).
encodeStemBlob: add an upper-bound check on the value count (must be
<= 256, the maximum offsets per stem). Previously a buggy producer
could pass an arbitrarily long values slice.
Addresses review finding I2.
The bintrieFlatCodec had a crit() helper whose doc claimed "delegates to
log.Crit" but whose body was panic(fmt.Sprintf(...)). A corrupt on-disk
stem blob would cause the buffer flush goroutine to panic, killing the
process. On restart the same blob would cause the same panic —
unrecoverable crash loop.
Fix: applyWrites now returns ([]byte, error) instead of panicking.
The Flush method on flatStateCodec gains an error return:
(int, int) → (int, int, error)
The error propagates up through writeStates → stateSet.write →
buffer.flush → flushErr. A corrupted stem blob now causes a flush
failure that the database can react to instead of a crash loop.
The per-entry methods (WriteAccount, WriteStorage, DeleteAccount,
DeleteStorage) — which are NOT on the production flush path — use
log.Crit (the real function, not the deleted shim) on error, matching
the merkle codec's existing convention for unrecoverable corruption
at the per-entry level.
The crit shim is deleted entirely.
Addresses review finding I1.
diskLayer.node() returned crypto.Keccak256Hash(blob) as the hash for
ALL trie nodes regardless of the database's scheme. For the binary trie
the correct hash is sha256 via binaryNodeHasher. The wrong hash was
masked by noHashCheck=true in pathdb.NodeReader for the bintrie path,
but HistoricalNodeReader.Node (which does NOT set noHashCheck) would
never match the returned hash, silently falling through to the slow
freezer-backed read on every call.
Fix: replace crypto.Keccak256Hash(blob) with dl.db.hasher(blob) at
both the clean-cache-hit and disk-read return paths. The hasher is
already set to the correct function (merkleNodeHasher or
binaryNodeHasher) at Database construction time.
Addresses review finding C4 + Opus agent audit secondary bug.
Bug 1 — fail-open gate in disklayer.storage:
disklayer.storage() compared a 64-byte merkle-shaped combinedKey
(accountHash || storageHash) against the 32-byte bintrie generator
marker via codec.MarkerCompare. For bintrie, accountHash is always
common.Hash{} (since bintrieFlatCodec.StorageKey returns zero for
the account key), so the combinedKey started with 32 zero bytes.
The sha256-derived marker's first byte is essentially never 0x00,
so bytes.Compare returned -1, the > 0 branch never fired, and the
generator-progress gate was silently DISABLED. During active
generation, disklayer.storage served whatever was on disk (nil or
stale) without returning errNotCoveredYet.
Fix: add StorageMarkerKey(accountHash, storageHash) to the
flatStateCodec interface. Merkle returns the 64-byte concatenation
(preserving existing behavior); bintrie returns storageHash[:]
(the 32-byte stem||offset key matching the generator marker shape).
disklayer.storage now uses the codec method.
Bug 2 — rlp.Split on raw bintrie storage leaves in historicStateReader:
historicStateReader.Storage at core/state/database_history.go:87
calls rlp.Split(blob) on whatever bytes the pathdb historical reader
returns. Merkle storage values are RLP-encoded (trimmed-left-zeros);
bintrie leaves are raw 32 bytes. rlp.Split on raw 32-byte input
either errors or decodes garbage. Even after fixing Bug 1, bintrie
historical storage reads were broken end-to-end.
Fix: add isVerkle bool to historicStateReader; when true, bypass
rlp.Split and copy the raw 32-byte blob directly. The flag is set
from db.triedb.IsVerkle() at construction time.
Addresses review finding C1.
Before this commit, flushStem in generateBinTrieStems used
builder.encode() to overwrite the on-disk stem blob unconditionally.
When a crash+restart interrupted generation mid-stem (e.g., at offset 3
of stemA), the resume iterator positioned at stemA||3, the builder
accumulated only offsets 3+, and flushStem overwrote the disk blob with
a partial result — silently losing offsets 0, 1, 2 that were written in
the prior pass.
Fix: make flushStem a read-modify-write. It now reads the existing
on-disk stem blob (if any), converts the builder's accumulated offsets
to []stemOffsetValue via a new toOffsetValues() helper, and merges them
via the existing mergeStemBlob function. The merge semantics are
"builder values win" — new offsets overwrite their existing counterparts,
and gaps are filled from the prior blob. This makes the RMW idempotent
across resume cycles: the same stem can be re-walked from any midpoint
and the final disk blob always contains the union of all passes.
New helper: stemBuilder.toOffsetValues() converts the builder's
populated bitmap entries into a []stemOffsetValue slice suitable for
mergeStemBlob. ~20 LOC in stem_blob.go.
Tests:
* TestBintrieGeneratorResumeMidStem — pre-seeds disk with a partial
stem (offsets 0, 1), resumes generator at offset 1, asserts all
offsets survive including the pre-seeded offset 0. Before the fix
this test fails with "BasicData lost after mid-stem resume".
* TestBintrieGeneratorResumeStemBoundary — renamed from the original
TestBintrieGeneratorResume, unchanged behavior (stem-boundary
resume).
Addresses review finding C3.
Before this commit, bintrieFlatReader.Account returned (nil, nil) when
both the BasicData and CodeHash leaves were absent from the flat state.
multiStateReader.Account treats (nil, nil) as "confirmed absent" and
short-circuits — the trie reader never runs. This silently hid every
corruption mode the other A-commits are fixing (C1 mid-stem resume
loss, C2 disk-layer shape mismatch, in-transition stale data, etc.):
the flat state said "not present" and nobody checked.
Fix: introduce errBintrieFlatStateMiss as a local sentinel. When both
leaves are absent, the flat reader returns (nil, errBintrieFlatStateMiss)
instead of (nil, nil). The multiStateReader falls through on any
non-nil error, so the trie reader now runs and serves as the
authoritative gatekeeper. If the flat state genuinely has no data (and
the trie reader also returns nil), the end result is the same — but
any case where the flat state is wrong and the trie is right is now
caught by the fallthrough.
Same treatment for Storage: absent blob returns errBintrieFlatStateMiss.
Known limitation: BinaryTrie.GetAccount does not verify stem membership
(a characteristic of verkle-style tries where non-membership proofs are
handled externally). A truly non-existent account returns the closest
stem's data, not nil. The TestBintrieFlatReaderMissingAccountSentinel
test therefore verifies the flat reader's sentinel in isolation rather
than the end-to-end multiStateReader result.
Addresses review finding C2 (+ I5, S5, T2, T3, T12).
Before this commit, bintrieFlatCodec.ReadAccount returned the FULL
variable-length stem blob from disk while the in-memory diff-layer
buffer stored per-offset 32-byte values. The consumer,
bintrieFlatReader.Account, enforced len(basicBlob)!=32 → error, so
every disk-layer hit produced "bintrie BasicData leaf invalid length"
in production the moment the write buffer flushed. TestBintrieFlatReaderEndToEnd
did not catch this because it never forced a buffer → disk flush.
Fix: make bintrieFlatCodec.ReadAccount extract the offset from the
stem blob (mirroring ReadStorage), so the disk path and the buffer
path return the same 32-byte per-offset shape. Update
AccountCacheKey/StorageCacheKey to embed the full 32-byte key
(prefix + 31-byte stem + 1-byte offset), since caching under a
stem-only key would collapse BasicData and CodeHash into the same
slot and return the wrong value on the second hit. Update
Flush's cache-update loop to store per-offset entries from the
aggregated write set.
Design note: I considered the alternative of introducing a new
StemBlob(stem) interface method that returns the full blob synthesized
from a stem-level lookup index. Rejected because (a) the index is a
new data structure with its own consistency invariants, (b) the
per-offset approach is strictly local to the codec + reader, and (c)
the "1 Pebble read per Account" locality benefit is preserved at the
OS page cache level — both offsets at the same stem live in the same
Pebble block, so the second read is effectively free.
bintrieFlatReader.Account still does two AccountRLP lookups; the
torn-read hazard is gated by a new load-bearing invariant test,
TestBinaryHasherWritesBothBasicAndCodeHash, which asserts that
binaryHasher.updateAccount always emits both BasicData and CodeHash
leaves together. A future code-only update that broke this invariant
would fail the test.
Tests added:
* TestBintrieFlatReaderEndToEndAfterFlush — explicitly flushes via
tdb.Commit(root, false) and re-reads through a fresh StateReader.
This is the smoking-gun regression for C2.
* TestBintrieFlatReaderMultipleOffsetsPerStem — multiple offsets at
the same stem (BasicData, CodeHash, header storage slots) all
round-trip post-flush.
* TestBintrieCodecCrossFlushRMW — two Flush calls to the same stem
from different "blocks" correctly merge on disk, with prior
offsets preserved.
* TestBinaryHasherWritesBothBasicAndCodeHash — locks down the hasher
co-write invariant that bintrieFlatReader.Account relies on.
Existing tests updated to match the new per-offset ReadAccount
semantics:
* TestBintrieCodecAccountRoundTrip, TestBintrieCodecMultipleWritesSameStem,
TestBintrieCodecDeleteAccount — now read per-offset rather than
calling extractStemOffset on the raw blob.
* TestBintrieCodecCacheKeysDisjoint — additionally verifies two
offsets at the same stem produce distinct cache keys.
Error messages in bintrieFlatReader now include address and length
context (S5).
Wires the pieces from Commits 1-9 into a running system:
* triedb/pathdb.New: install the bintrieFlatCodec when isVerkle is set,
backed by the same verkle-namespaced db used for trie nodes.
* triedb/pathdb.database.go: drop isVerkle from the noBuild guard so the
bintrie generator (Commit 9) runs on startup, and remove it from the
generateSnapshot call path for the same reason.
* triedb/pathdb.disklayer.revert: hard-fail on bintrie because the
reorg path would replay merkle-shaped origin records against a
per-stem layout. Tracked in BINTRIE_FLAT_STATE_REORG_GAP.md.
* triedb/pathdb.journal: add IsBintrie to journalGenerator (rlp:"optional"
so v3 journals still decode) and make journalProgress a method on
generator so it stamps the active scheme; loadGenerator discards any
journal whose scheme does not match the database, forcing a fresh
regeneration.
* triedb/pathdb.reader: export RawStateReader, a small extension of
database.StateReader that exposes AccountRLP so callers outside the
package can reach the raw flat-state bytes without going through the
slim-RLP decode path that assumes merkle shape.
* core/state.reader: add bintrieFlatReader, the bintrie equivalent of
flatReader. It derives the EIP-7864 stem keys from (addr, slot),
performs two AccountRLP lookups per Account call (BasicData +
CodeHash), and decodes via bintrie.UnpackBasicData. Storage reads go
through a single AccountRLP lookup at the slot's full bintrie key.
* core/state.database.StateReader: dispatch to bintrieFlatReader when
the path database is in verkle mode; merkle path unchanged.
Depends on the lookup sentinel fix in the previous commit; without it
missing-account reads on bintrie misreport as "layer stale".
Adds generateBinTrieStems, the bintrie analogue of generateAccounts. It
opens the bintrie via a sha256-aware bintrieDiskStore (the merkle disk
store would always fail root validation against a binary node), iterates
all leaves with binaryNodeIterator, aggregates them into per-stem
builders, and emits one stem blob per stem boundary.
Resume support is structural: ctx.marker is fed straight to the trie's
NodeIterator, which uses binaryNodeIterator.seek (Commit 1) to position
on the first leaf >= marker. Range proofs are deliberately skipped — the
bintrie's Prove path is unimplemented and an iteration-only generation
cycle is acceptable for a one-time startup cost.
A bintrieGeneratorContext mirrors generatorContext but is much smaller:
no holdable iterators (we walk the trie, not the existing flat state)
and no two-tier marker (the bintrie key space is unified). checkAndFlushBin
journals progress as a single 32-byte (stem || offset) key so resume
can pick up mid-stem.
generator.run dispatches on codec type so callers see a uniform
lifecycle whether the underlying scheme is merkle or bintrie.
Drains the binaryHasher's LeafProducer side-channel in StateDB.commit and
threads the stem writes through stateUpdate.encodeBinary into the pathdb
state set as per-offset accountData entries (key = stem||offset, value =
32-byte leaf or nil for clears).
The flat-state codec gains a Flush method that owns the in-memory→disk
write path, replacing the codec-agnostic per-entry loop in writeStates.
The merkle codec preserves its historical per-entry behavior verbatim;
the bintrie codec aggregates per-offset writes by stem so each stem hits
disk via a single read-modify-write, satisfying the codec's pre-aggregation
requirement and updating the clean cache with the merged blob it just
produced (no extra disk read).
stateUpdate.encodeBinary returns empty origin maps for the bintrie path:
state-history rollback for bintrie is deferred to a follow-up PR (see
BINTRIE_FLAT_STATE_REORG_GAP.md), and the diskLayer.revert path will
panic before consuming origins anyway.
binaryHasher now implements the new LeafProducer optional extension to
the Hasher interface. Every UpdateAccount, UpdateStorage, and delete
path records the corresponding (stem, offset, value) write into an
internal buffer, which the caller drains once per block via
DrainStemWrites() and hands to the pathdb flat-state layer through the
stateUpdate (wired up in the next commit).
Three kinds of writes are recorded:
- Account create/update: two writes (BasicData at offset 0,
CodeHash at offset 1), sharing the same 31-byte stem. BasicData
is produced via bintrie.PackBasicData so the flat-state blob
is bit-identical to what the trie layer packs internally.
- Storage update: one write per slot. Non-zero values become
right-justified 32-byte blobs; the zero value (the bintrie's
"delete" convention) becomes 32 zero bytes, matching the trie's
tombstone-with-zero semantics so the flat-state mirror stays
bit-identical to the StemNode.Values entry.
- Account delete: two clear writes (nil Value) for offsets 0 and 1.
Storage slots and code chunks at the same or other stems are NOT
touched; pre-EIP-6780 full-wipe is a documented scope limitation.
The LeafProducer interface lives on Hasher and is strictly opt-in —
merkleHasher does not implement it, and callers detect capability via
a type assertion. This keeps the read-side/write-side split of the
existing Hasher cleanly extended: hashers that have a concept of
flat-state leaves can expose them; hashers that don't (MPT) are
unaffected.
Tests cover:
- TestBinaryHasherLeafProduction: account update produces 2 writes
at offsets 0+1 with matching stem; drain is destructive; storage
update emits one matching write; zero-value storage writes 32 zero
bytes; delete emits 2 clear writes.
- TestMerkleHasherNoLeafProducer: merkleHasher does NOT satisfy the
LeafProducer interface (the capability is opt-in per hasher).
The collected stem writes are not yet propagated anywhere — a later
commit wires DrainStemWrites into StateDB.IntermediateRoot so the
writes flow through stateUpdate and the pathdb stateSet into the
flat-state layer.
Introduce the codec and on-disk blob format for the bintrie flat-state
layer. This commit only defines the types; the codec is NOT wired into
pathdb.Database.New yet (that happens in a later commit once the
leaf-production hook in binaryHasher and the stateUpdate wiring are in
place).
Three pieces:
1. trie/bintrie/pack.go
Canonical PackBasicData / UnpackBasicData helpers that encode an
account's (codeSize, nonce, balance) into the 32-byte BasicData leaf
defined by EIP-7864. Preserves the existing BinaryTrie.UpdateAccount
layout byte-for-byte (4-byte code_size at offset 4 rather than the
spec's 3-byte field at offset 5 — any realistic code size has byte 4
always zero and the two encodings are bit-equivalent in practice).
BinaryTrie.UpdateAccount is refactored to delegate to PackBasicData
so the flat-state codec can produce a bit-identical BasicData
encoding without duplicating the layout logic.
2. triedb/pathdb/stem_blob.go
Packed encoding of the populated (offset, value) pairs at a bintrie
stem. A stem can hold up to 256 offsets per EIP-7864 but in practice
only a handful are set; the layout is a 32-byte bitmap followed by
N 32-byte values in ascending offset order, where N = popcount.
Empty stems encode to nil so the caller knows to delete the on-disk
key rather than write a zero-length value.
Provides encodeStemBlob / decodeStemBlob / extractStemOffset /
mergeStemBlob and a stemBuilder type for accumulating writes. The
tombstone convention (32 zero bytes = "present with zero" as used
by DeleteStorage) is preserved.
11 unit tests cover: empty blob, BasicData+CodeHash roundtrip, all
256 offsets populated, sparse high offsets, set/clear roundtrip,
load-from-existing-blob RMW, merge helper, merge-to-empty, tombstone
zero bytes, malformed input detection, bitmap rank sanity.
3. triedb/pathdb/flat_codec_bintrie.go
bintrieFlatCodec implements flatStateCodec over the stem-blob layout.
Unlike merkleFlatCodec it is stateful: it holds a ethdb.KeyValueReader
reference used by applyWrites to read the existing stem blob before
merging in new writes. ethdb.Batch is write-only so the batch passed
to Write* cannot be used to fetch current state.
Pre-aggregation requirement is documented explicitly: within a single
flush, the caller must NOT issue two Write* calls targeting the same
stem, because the RMW read comes from the store (not the in-flight
batch). Commit 8 of the bintrie flat-state plan restructures
writeStates to pre-aggregate per-stem writes so callers don't have
to handle this manually.
Cache keys are prefix-disambiguated with a one-byte 0x01 to keep
bintrie stem lookups disjoint from merkle 32-byte account keys and
64-byte storage keys in the shared clean-state fastcache.
SplitMarker is a single-tier (stem-only) format, not the merkle
two-tier (account, account+storage) format.
7 unit tests cover: account roundtrip, storage roundtrip, multiple
writes to the same stem, DeleteAccount preserving unrelated offsets,
DeleteStorage removing the final offset collapsing the key, cache
key disjointness from merkle, SplitMarker semantics.
The codec is not dispatched by anything yet; MPT continues through the
merkle codec and bintrie mode still runs on the (soon-to-be-replaced)
keccak-shaped path until Commit 10 wires things up.
Reserve a new top-level key-value namespace for bintrie flat-state
entries: BinTrieStemPrefix = "X" (chosen to be free at the top level so
unwrapped-db callers do not collide with blockBodyPrefix "b"). In
practice pathdb nests this namespace under VerklePrefix "v" via
rawdb.NewTable, so the on-disk key is effectively "v"+"X"+stem.
A stem is the 31-byte common prefix of the 32-byte tree key defined in
EIP-7864. The stored value is a packed blob containing the populated
(offset, value) pairs at that stem; BasicData (offset 0), CodeHash
(offset 1), header storage (offsets 64-127), code chunks (offsets
128-255) and main-storage slots can all be extracted from the same
blob by a single Pebble read, matching the on-trie StemNode grouping.
Add Read/Write/DeleteBinTrieStem alongside the existing snapshot
accessors, and a private binTrieStemKey helper in schema.go. No
callers yet — the bintrieFlatCodec in the next commit consumes them.
Matches the same style as ReadAccountSnapshot et al.; the codec and
integration tests downstream will exercise them.
Reserve journal version 4 for the upcoming bintrie flat-state layout
(per-stem blobs). Bumping now — with no on-disk format change yet —
ensures that any v3 journals belonging to a bintrie database are
discarded on load, so the new layout can be introduced cleanly in
follow-up commits without a migration shim.
MPT behavior is unchanged at this point: the only codec wired to the
pathdb Database is still merkleFlatCodec. All pathdb, core/state,
core/rawdb, and trie tests pass.
Route the flatStateCodec from Database through every flat-state call
site so that the trie-specific aspects of persistence and key derivation
live behind a single abstraction. Pure refactor: merkle behavior and
on-disk layout are unchanged because the only codec wired up is
merkleFlatCodec, whose methods are thin wrappers over the existing
rawdb accessors.
Threaded sites:
disklayer.account/storage use codec.{Read,AccountCacheKey,
StorageCacheKey} instead of direct
rawdb calls and bare hash slicing.
flush.writeStates takes a codec parameter; persistence
goes through codec.{Write,Delete}
{Account,Storage}.
buffer.flush carries the codec down into writeStates.
states.write/dbsize takes the codec for prefix-size
accounting.
generate.go (g.codec) the generator owns a codec, used by
generateAccounts/generateStorages
callbacks; the unused top-level
splitMarker helper is removed in favor
of codec.SplitMarker.
context.go the generator context owns the codec
and uses codec.{AccountPrefix,
StoragePrefix,Account/StorageKeyLength}
to construct iterators.
reader.go (HistoricalState) uses codec.{Account,Storage}Key for
caller-side key derivation.
The marker comparisons in writeStates remain merkle-shaped (two-tier
account+storage marker) because the bintrie path will use a separate
writer over single-tier stem markers in a later commit.
All existing pathdb tests pass.
Introduce flatStateCodec, a small interface that captures the
trie-specific aspects of flat-state storage: key derivation from
(address, slot), persistence of account/storage entries, clean-cache
key disambiguation, iterator setup, and progress-marker handling.
Mirrors the existing nodeHasher pattern and complements the Hasher
interface from state-hasher-iface-2 (which abstracts trie-side hashing
and commit). The codec is stored on Database alongside the existing
hasher field, ready to be threaded through the flat-state call sites
(disklayer, flush, generator, reader) in the next commit.
Provides merkleFlatCodec, a thin wrapper over the existing rawdb
snapshot accessors and helpers. This is a pure refactor: behavior is
unchanged. The bintrie-side codec implementation is added in a later
commit, after all call sites have been routed through the abstraction.
The bintrie node iterator previously discarded its `start` parameter,
forcing every iteration to begin at the root. This makes resumable
generators (snapshot/flat-state population) impossible — any
interruption restarts from scratch.
Implement seek(start []byte) by walking down the trie following start's
bit path, building the iterator stack as we go. When the chosen path
dead-ends (Empty, missing child, or a stem strictly less than start),
backtrack through the existing stack to find the next in-order subtree
and descend to its leftmost leaf.
Also wire BinaryTrie.NodeIterator(startKey) to actually pass startKey
through (was hardcoded to nil).
Tests cover: empty start (no-op), exact key match, between-keys,
into empty subtree, past end, within-stem offsets, resume simulation,
and deep tree.
binaryHasher.updateAccount computed codeLen from len(account.Code.Code),
which is only non-zero when the code itself was modified in the current
block. For balance- or nonce-only updates account.Code is nil and the
computed codeLen was 0, silently overwriting the code_size field packed
into the bintrie BasicData leaf (EIP-7864 bytes 5-7) with zero every
time a contract was touched without a code write.
The TODO(rjl493456442) on updateAccount acknowledged this. Fix it by
adding a CodeSize field to AccountMut and having the caller at
StateDB.IntermediateRoot populate it via stateObject.CodeSize(), which
returns len(obj.code) when the bytes are loaded, otherwise falls back
to a code-size lookup via the reader. The binary hasher then passes
account.CodeSize straight to BinaryTrie.UpdateAccount as its codeLen
argument, and the TODO is removed.
Rationale for placing CodeSize on AccountMut rather than Account:
AccountMut already carries Code *CodeMut — the new bytecode, which is
not a field of Account — because code is write-time data that is not
persisted in the flat-state format (SlimAccountRLP). CodeSize has the
identical lifecycle: it is not in SlimAccountRLP, it is not populated
by any reader, and it is only consumed by the hasher at write time.
Mirroring Code's placement keeps the read-side/write-side split honest
(Account models the persisted flat-state record; AccountMut adds the
code-related write-time parameters). If the bintrie flat-state format
is later extended to carry code_size, CodeSize can be promoted onto
Account at that time.
merkleHasher is unaffected: StateTrie.UpdateAccount ignores its codeLen
parameter, so the wrapTrie.UpdateAccount shim continues to pass 0 and
no state-root divergence is introduced on the MPT path.
Regression test TestVerkleCodeSizePreserved verifies that the state
root produced by "create contract, commit, reload, modify balance,
commit" matches the root of a single-step construction of the same
final state. Before the fix the roots diverge:
path A (reload + balance): 1a675599...
path B (fresh, same state): de0cfb03...
StateDB.Commit first commits all storage changes into the storage trie,
then updates the account metadata with the new storage root into the
account trie.
Within StateDB.Commit, the new storage trie root has already been
computed and applied as the storage root. This PR explicitly skips the
redundant storage trie root assignment for readability.
This is a copy of #34721 but against `master` (rather than
`bal-devnet-3`), as requested by @jwasinger, since the slotnum logic now
exists on `master` as well.
This PR simplifies the implementation of EIP-7610 by eliminating the
need to check storage emptiness during contract deployment.
EIP-7610 specifies that contract creation must be rejected if the
destination account has a non-zero nonce, non-empty runtime code, or
**non-empty storage**.
After EIP-161, all newly deployed contracts are initialized with a nonce
of one. As a result, such accounts are no longer eligible as deployment
targets unless they are explicitly cleared.
However, prior to EIP-161, contracts were initialized with a nonce of
zero. This made it possible to end up with accounts that have:
- zero nonce
- empty runtime code
- non-empty storage (created during constructor execution)
- non-zero balance
These edge-case accounts complicate the storage emptiness check.
In practice, contract addresses are derived using one of the following
formulas:
- `Keccak256(rlp({sender, nonce}))[12:]`
- `Keccak256([]byte{0xff}, sender, salt[:], initHash)[12:]`
As such, an existing address is not selected as a deployment target
unless a collision occurs, which is extremely unlikely.
---
Previously, verifying storage emptiness relied on GetStorageRoot.
However, with the transition to the block-based access list (BAL),
the storage root is no longer available, as computing it would require
reconstructing the full storage trie from all mutations of preceding
transactions.
To address this, this PR introduces a simplified approach: it hardcodes
the set of known accounts that have zero nonce, empty runtime code,
but non-empty storage and non-zero balance. During contract deployment,
if the destination address belongs to this set, the deployment is
rejected.
This check is applied retroactively back to genesis. Since no address
collision events have occurred in Ethereum’s history, this change does
not
alter existing behavior. Instead, it serves as a safeguard for future
state
transitions.
This fixes the remaining Hive discv5/FindnodeResults failures in the
cmd/devp2p/internal/v5test fixture.
The issue was in the simulator-side bystander behavior, not in
production discovery logic. The existing fixture could get bystanders
inserted into the remote table, but under current geth behavior they
were not stable enough to remain valid FINDNODE results. In
particular, the fixture still had a few protocol/behavior mismatches:
- incomplete WHOAREYOU recovery
- replies not consistently following the UDP envelope source
- incorrect endpoint echoing in PONG
- fixture-originated PING using the wrong ENR sequence
- bystanders answering background FINDNODE with empty NODES
That last point was important because current lookup accounting can
treat repeatedly unhelpful FINDNODE interactions as failures. As a
result, a bystander could become live via PING/PONG and still later be
dropped from the table before the final FindnodeResults assertion.
This change updates the fixture so that bystanders behave more like
stable discv5 peers:
- perform one explicit initial handshake, then switch to passive response handling
- resend the exact challenged packet when handling WHOAREYOU
- reply to the actual UDP packet source and mirror that source in PONG.ToIP / PONG.ToPort
- use the bystander’s own ENR sequence in fixture-originated PING
- prefill each bystander with the bystander ENR set and answer FINDNODE from that set
The result is that the fixture now forms a small self-consistent lookup
environment instead of a set of peers that are live but systematically
poor lookup participants.
Fixes#34108
The UDPv5 test harness (`newUDPV5Test`) uses the default `PingInterval`
of 3 seconds. When tests like `TestUDPv5_findnodeHandling` insert nodes
into the routing table via `fillTable`, the table's revalidation loop
may schedule PING packets for those nodes. Under the race detector or on
slow CI runners, the test runs long enough for revalidation to fire,
causing background pings to be written to the test pipe. The `close()`
method then finds these as unmatched packets and fails.
The fix sets `PingInterval` to a very large value in the test harness so
revalidation never fires during tests.
Verified locally: 100 iterations with `-race -count=100` pass reliably,
where previously the test would fail within ~50 iterations.
runtime.setDefaults was unconditionally assigning cfg.Random =
&common.Hash{}, which silently overwrote any caller-provided Random
value. This made it impossible to simulate a specific PREVRANDAO and
also forced post-merge rules whenever London was active, regardless of
the intended environment.
This change only initializes cfg.Random when it is nil, matching how
other fields in Config are defaulted. Existing callers that did not set
Random keep the same behavior (a non-nil zero hash still enables
post-merge semantics), while callers that explicitly set Random now get
their value respected.
This fixes a truncation bug that results in an invalid serialization of
empty EIP712.
For example:
```json
{
"method": "eth_signTypedData_v4",
"request": {
"types": {
"EIP712Domain": [
{
"name": "version",
"type": "string"
}
],
"Empty": []
},
"primaryType": "Empty",
"domain": {
"version": "0"
},
"message": {}
}
}
```
When calculating the type-hash for the stuct-hash, it will incorrectly
use `Empty)` instead of `Empty()`
# Summary
Replaces the inline `errors.New("event signature mismatch")` in
generated `UnpackXxxEvent` methods with per-event package-level sentinel
errors (e.g. `ErrTransferSignatureMismatch`,
`ErrApprovalSignatureMismatch`), allowing callers to reliably
distinguish a topic mismatch from a genuine decoding failure via
`errors.Is`.
Each event gets its own sentinel, generated via the abigen template:
```go
var ErrTransferSignatureMismatch = errors.New("event signature mismatch")
```
This scoping is intentional — it allows callers to be precise about
*which* event was mismatched, which is useful when routing logs across
multiple unpackers.
# Motivation
Previously, all errors returned from `UnpackXxxEvent` were
indistinguishable without string matching. This is especially
problematic when processing logs sourced from `eth_getBlockReceipts`,
where a caller receives the full set of logs for a block across all
contracts and event types. In that context, a signature mismatch is
expected and should be skipped, while any other error (malformed data,
topic parsing failure) indicates something is genuinely wrong and should
halt execution:
```go
for _, log := range blockLogs {
event, err := contract.UnpackTransferEvent(log)
if errors.Is(err, gen.ErrTransferSignatureMismatch) {
continue // not our event, expected
}
if err != nil {
return fmt.Errorf("unexpected decode failure: %w", err) // alert
}
// process event
}
```
**Changes:**
- `abigen` template: generates a `ErrXxxSignatureMismatch` sentinel per
event and returns it on topic mismatch instead of an inline error
- Existing generated bindings & testdata: regenerated to reflect the
update
Implements #34075
Return ErrInvalidOpCode with the executing opcode and offending
immediate for forbidden DUPN, SWAPN, and EXCHANGE operands. Extend
TestEIP8024_Execution to assert both opcode and operand for all
invalid-immediate paths.
TestUpdatedKeyfileContents was intermittently failing with:
- Emptying account file failed
- wasn't notified of new accounts
Root cause: waitForAccounts required the account list match and an
immediately readable ks.changes notification in the same instant,
creating a timing race between cache update visibility and channel
delivery.
This change keeps the same timeout window but waits until both
conditions are observed, which preserves test intent while removing the
flaky timing dependency.
Validation:
- go test ./accounts/keystore -run '^TestUpdatedKeyfileContents$'
-count=100
The comment formula showed (i+3) but the code multiplies by 9 (Lsh 3 +
add = 8+1).
This was a error when porting from upstream golang.org/x/crypto/bn256
where ξ=i+3.
Go-ethereum changed the constant to ξ=i+9 but forgot to update the inner
formula.
Two fixes for `testing_buildBlockV1`:
1. Add `omitempty` to `SlotNumber` in `ExecutableData` so it is omitted
for pre-Amsterdam payloads. The spec defines the response as
`ExecutionPayloadV3` which does not include `slotNumber`.
2. Pass `res.fees` instead of `new(big.Int)` in `BuildTestingPayload` so
`blockValue` reflects actual priority fees instead of always being zero.
Corresponding fixture update: ethereum/execution-apis#783