Reserve a new top-level key-value namespace for bintrie flat-state
entries: BinTrieStemPrefix = "X" (chosen to be free at the top level so
unwrapped-db callers do not collide with blockBodyPrefix "b"). In
practice pathdb nests this namespace under VerklePrefix "v" via
rawdb.NewTable, so the on-disk key is effectively "v"+"X"+stem.
A stem is the 31-byte common prefix of the 32-byte tree key defined in
EIP-7864. The stored value is a packed blob containing the populated
(offset, value) pairs at that stem; BasicData (offset 0), CodeHash
(offset 1), header storage (offsets 64-127), code chunks (offsets
128-255) and main-storage slots can all be extracted from the same
blob by a single Pebble read, matching the on-trie StemNode grouping.
Add Read/Write/DeleteBinTrieStem alongside the existing snapshot
accessors, and a private binTrieStemKey helper in schema.go. No
callers yet — the bintrieFlatCodec in the next commit consumes them.
Matches the same style as ReadAccountSnapshot et al.; the codec and
integration tests downstream will exercise them.
binaryHasher.updateAccount computed codeLen from len(account.Code.Code),
which is only non-zero when the code itself was modified in the current
block. For balance- or nonce-only updates account.Code is nil and the
computed codeLen was 0, silently overwriting the code_size field packed
into the bintrie BasicData leaf (EIP-7864 bytes 5-7) with zero every
time a contract was touched without a code write.
The TODO(rjl493456442) on updateAccount acknowledged this. Fix it by
adding a CodeSize field to AccountMut and having the caller at
StateDB.IntermediateRoot populate it via stateObject.CodeSize(), which
returns len(obj.code) when the bytes are loaded, otherwise falls back
to a code-size lookup via the reader. The binary hasher then passes
account.CodeSize straight to BinaryTrie.UpdateAccount as its codeLen
argument, and the TODO is removed.
Rationale for placing CodeSize on AccountMut rather than Account:
AccountMut already carries Code *CodeMut — the new bytecode, which is
not a field of Account — because code is write-time data that is not
persisted in the flat-state format (SlimAccountRLP). CodeSize has the
identical lifecycle: it is not in SlimAccountRLP, it is not populated
by any reader, and it is only consumed by the hasher at write time.
Mirroring Code's placement keeps the read-side/write-side split honest
(Account models the persisted flat-state record; AccountMut adds the
code-related write-time parameters). If the bintrie flat-state format
is later extended to carry code_size, CodeSize can be promoted onto
Account at that time.
merkleHasher is unaffected: StateTrie.UpdateAccount ignores its codeLen
parameter, so the wrapTrie.UpdateAccount shim continues to pass 0 and
no state-root divergence is introduced on the MPT path.
Regression test TestVerkleCodeSizePreserved verifies that the state
root produced by "create contract, commit, reload, modify balance,
commit" matches the root of a single-step construction of the same
final state. Before the fix the roots diverge:
path A (reload + balance): 1a675599...
path B (fresh, same state): de0cfb03...
StateDB.Commit first commits all storage changes into the storage trie,
then updates the account metadata with the new storage root into the
account trie.
Within StateDB.Commit, the new storage trie root has already been
computed and applied as the storage root. This PR explicitly skips the
redundant storage trie root assignment for readability.
This PR simplifies the implementation of EIP-7610 by eliminating the
need to check storage emptiness during contract deployment.
EIP-7610 specifies that contract creation must be rejected if the
destination account has a non-zero nonce, non-empty runtime code, or
**non-empty storage**.
After EIP-161, all newly deployed contracts are initialized with a nonce
of one. As a result, such accounts are no longer eligible as deployment
targets unless they are explicitly cleared.
However, prior to EIP-161, contracts were initialized with a nonce of
zero. This made it possible to end up with accounts that have:
- zero nonce
- empty runtime code
- non-empty storage (created during constructor execution)
- non-zero balance
These edge-case accounts complicate the storage emptiness check.
In practice, contract addresses are derived using one of the following
formulas:
- `Keccak256(rlp({sender, nonce}))[12:]`
- `Keccak256([]byte{0xff}, sender, salt[:], initHash)[12:]`
As such, an existing address is not selected as a deployment target
unless a collision occurs, which is extremely unlikely.
---
Previously, verifying storage emptiness relied on GetStorageRoot.
However, with the transition to the block-based access list (BAL),
the storage root is no longer available, as computing it would require
reconstructing the full storage trie from all mutations of preceding
transactions.
To address this, this PR introduces a simplified approach: it hardcodes
the set of known accounts that have zero nonce, empty runtime code,
but non-empty storage and non-zero balance. During contract deployment,
if the destination address belongs to this set, the deployment is
rejected.
This check is applied retroactively back to genesis. Since no address
collision events have occurred in Ethereum’s history, this change does
not
alter existing behavior. Instead, it serves as a safeguard for future
state
transitions.
runtime.setDefaults was unconditionally assigning cfg.Random =
&common.Hash{}, which silently overwrote any caller-provided Random
value. This made it impossible to simulate a specific PREVRANDAO and
also forced post-merge rules whenever London was active, regardless of
the intended environment.
This change only initializes cfg.Random when it is nil, matching how
other fields in Config are defaulted. Existing callers that did not set
Random keep the same behavior (a non-nil zero hash still enables
post-merge semantics), while callers that explicitly set Random now get
their value respected.
Return ErrInvalidOpCode with the executing opcode and offending
immediate for forbidden DUPN, SWAPN, and EXCHANGE operands. Extend
TestEIP8024_Execution to assert both opcode and operand for all
invalid-immediate paths.
This PR fixes https://github.com/ethereum/go-ethereum/issues/34623 by
changing the `vm.StateDB` interface:
Instead of `EmitLogsForBurnAccounts()` emitting burn logs, `LogsForBurnAccounts()
[]*types.Log` just returns these logs which are then emitted by the caller.
This way when tracing is used, `hookedStateDB.AddLog` will be used
automatically and there is no need to duplicate either the burn log
logic or the `OnLog` tracing hook.
ProcessBeaconBlockRoot (EIP-4788) and processRequestsSystemCall
(EIP-7002/7251) do not merge the EVM access events into the state after
execution. ProcessParentBlockHash (EIP-2935) already does this correctly
at line 290-291.
Without this merge, the Verkle witness will be missing the storage
accesses from the beacon root and request system calls, leading to
incomplete witnesses and potential consensus issues when Verkle
activates.
In this PR, the Database interface in `core/state` has been extended
with one more function:
```go
// Iteratee returns a state iteratee associated with the specified state root,
// through which the account iterator and storage iterator can be created.
Iteratee(root common.Hash) (Iteratee, error)
```
With this additional abstraction layer, the implementation details can be hidden
behind the interface. For example, state traversal can now operate directly on
the flat state for Verkle or binary trees, which do not natively support traversal.
Moreover, state dumping will now prefer using the flat state iterator as
the primary option, offering better efficiency.
Edit: this PR also fixes a tiny issue in the state dump, marshalling the
next field in the correct way.
Add persistent storage for Block Access Lists (BALs) in `core/rawdb/`.
This provides read/write/delete accessors for BALs in the active
key-value store.
---------
Co-authored-by: Jared Wasinger <j-wasinger@hotmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
`pool.signer.Sender(tx)` bypasses the sender cache used by types.Sender,
which can force an extra signature recovery for every promotable tx
(promotion runs frequently). Use `types.Sender(pool.signer, tx)` here to
keep sender derivation cached and consistent.
Problem:
The max-initcode sentinel moved from core to vm, but RPC pre-check
mapping still depended on core.ErrMaxInitCodeSizeExceeded. This mismatch
could surface inconsistent error mapping when oversized initcode is
submitted through JSON-RPC.
Solution:
- Remove core.ErrMaxInitCodeSizeExceeded from the core pre-check error
set.
- Map max-initcode validation errors in RPC from
vm.ErrMaxInitCodeSizeExceeded.
- Keep the RPC error code mapping unchanged (-38025).
Impact:
- Restores consistent max-initcode error mapping after the sentinel
move.
- Preserves existing JSON-RPC client expectations for error code -38025.
- No consensus, state, or protocol behavior changes.
Fix incorrect key length calculation for `numHashPairings` in
`InspectDatabase`, introduced in #34000.
The `headerHashKey` format is `headerPrefix + num + headerHashSuffix`
(10 bytes), but the check incorrectly included `common.HashLength`,
expecting 42 bytes.
This caused all number -- hash entries to be misclassified as
unaccounted data.
## Summary
In binary trie mode, `IntermediateRoot` calls `updateTrie()` once per
dirty account. But with the binary trie there is only one unified trie
(`OpenStorageTrie` returns `self`), so each call redundantly does
per-account trie setup: `getPrefetchedTrie`, `getTrie`, slice
allocations for deletions/used, and `prefetcher.used` — all for the same
trie pointer.
This PR replaces the per-account `updateTrie()` calls with a single flat
loop that applies all storage updates directly to `s.trie`. The MPT path
is unchanged. The prefetcher trie replacement is guarded to avoid
overwriting the binary trie that received updates.
This is the phase-1 counterpart to #34021 (H01). H01 fixes the commit
phase (`trie.Commit()` called N+1 times). This PR fixes the update phase
(`updateTrie()` called N times with redundant setup). Same root cause —
unified binary trie operated on per-account — different phases.
## Benchmark (Apple M4 Pro, 500K entries, `--benchtime=10s --count=3`,
on top of #34021)
| Metric | H01 baseline | H01 + this PR | Delta |
|--------|:------------:|:-------------:|:-----:|
| Approve (Mgas/s) | 368 | **414** | **+12.5%** |
| BalanceOf (Mgas/s) | 870 | 875 | +0.6% |
Should be rebased after #34021 is merged.
EIP-7928 brings state reads into consensus by recording accounts and storage accessed during execution in the block access list. As part of the spec, we need to check that there is enough gas available to cover the cost component which doesn't depend on looking up state. If this component can't be covered by the available gas, we exit immediately.
The portion of the call dynamic cost which doesn't depend on state look ups:
- EIP2929 call costs
- value transfer cost
- memory expansion cost
This PR:
- breaks up the "inner" gas calculation for each call variant into a pair of stateless/stateful cost methods
- modifies the gas calculation logic of calls to check stateless cost component first, and go out of gas immediately if it is not covered.
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This PR introduces a new type HistoryPolicy which captures user intent
as opposed to pruning point stored in the blockchain which persists the
actual tail of data in the database.
It is in preparation for the rolling history expiry feature.
It comes with a semantic change: if database was pruned and geth is
running without a history mode flag (or explicit keep all flag) geth
will emit a warning but continue running as opposed to stopping the
world.
## Summary
**Bug fix.** In Verkle mode, all state objects share a single unified
trie (`OpenStorageTrie` returns `self`). During `stateDB.commit()`, the
main account trie is committed via `s.trie.Commit(true)`, which calls
`CollectNodes` to traverse and serialize the entire tree. However, each
dirty account's `obj.commit()` also calls `s.trie.Commit(false)` on the
**same trie object**, redundantly traversing and serializing the full
tree once per dirty account.
With N dirty accounts per block, this causes **N+1 full-tree
traversals** instead of 1. On a write-heavy workload (2250 SSTOREs),
this produces ~131 GB of allocations per block from duplicate NodeSet
creation and serialization. It also causes a latent data race from N+1
goroutines concurrently calling `CollectNodes` on shared `InternalNode`
objects.
This commit adds an `IsVerkle()` early return in `stateObject.commit()`
to skip the redundant `trie.Commit()` call.
## Benchmark (AMD EPYC 48-core, 500K entries, `--benchtime=10s
--count=3`)
| Metric | Baseline | Fixed | Delta |
|--------|----------|-------|-------|
| Approve (Mgas/s) | 4.16 ± 0.37 | **220.2 ± 10.1** | **+5190%** |
| BalanceOf (Mgas/s) | 966.2 ± 8.1 | 971.0 ± 3.0 | +0.5% |
| Allocs/op (approve) | 136.4M | 792K | **-99.4%** |
Resolves the TODO in statedb.go about the account trie commit being
"very heavy" and "something's wonky".
---------
Co-authored-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>