AdvancePartialHead's backfill loop used a strictly-greater condition, so it
wrote canonical-hash keys only for blocks above the pivot. Combined with
the Engine API path persisting the pivot via WriteBlockWithoutState (which
writes header+body but not the canonical-hash key) and InsertReceiptChain.writeLive
skipping the pivot because HasBlock already returned true, the pivot block
ended up without an H<num>n entry in leveldb. After the freezer advanced
past finalized, startup's gap check at rawdb/database.go:279 rejected the
datadir with "gap in the chain between ancients [0 - #N-1] and leveldb
[#N+1 - #head]".
Fix: explicitly write the canonical hash for currentHead at the start of
AdvancePartialHead's backfill, covering the pivot inclusively.
Also add a defensive guard in the chain-retention freezer path so that
TruncateTail never prunes past lastPivotNumber. Partial-state mode relies
on the pivot block as the anchor for state reconstruction; pruning its
body from ancients would make a future reorg spanning the pivot
unrecoverable.
Ship with a regression test that asserts AdvancePartialHead writes the
currentHead's canonical hash (covers the bug precondition directly), plus
an idempotency check and a small post-advance sanity test.
Verified end-to-end on bal-devnet-3:
- Before fix: Fatal on restart
- After fix: restart succeeds, BAL processing resumes within seconds,
verify_partial_sync_devnet3.sh passes 16/16 checks.
After the second snap sync completes, AdvancePartialHead moves the head
markers forward but never initialized partialState.Root(). This caused
ProcessBlockWithBAL to fall back to the parent's header root, which
doesn't match the computed trie root from BAL processing — resulting in
a state root mismatch on the first block after sync.
Fix: call SetRoot(root) and SetLastProcessedBlock() in AdvancePartialHead
so subsequent BAL processing chains from the correct state root.
Also add diagnostic logging to ProcessBlockWithBAL for easier debugging.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Live testing on bal-devnet-2 confirmed that computed roots DO diverge from
header roots. Block 75315 computed root 0xe909c7.. vs header root
0x9acbbe.. — untracked contracts' storage roots in the local trie are from
snap sync time and differ from the actual current roots, even when the
storage root resolver successfully queries peers.
This means subsequent blocks must chain off the computed root (via
partialState.Root()), not the header root (via parent.Root()). Restore
the stateRoot field using atomic.Pointer[common.Hash] instead of the
previous sync.RWMutex for lock-free concurrent access.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Apply review fixes: BAL iterator start (Fix 2), fatal root mismatch when
all storage resolved (Fix 3), WriteBlockWithoutState error handling (Fix 4),
contract filter construction order (Fix 5), canonical hash backfill (Fix 6),
underflow guard in gap processing (Fix 8), O(n²) prepend fix (Fix 9),
ReadBALHistory corruption detection (Fix 11), incomplete resolution error
(Fix 13), RLP encode panic (Fix 14), gap processing log level (Fix 16),
TriggerPartialResync message (Fix 18), and comment accuracy fixes.
Remove the stateRoot field and sync.RWMutex from PartialState entirely.
Since partial state maintains the full account trie, the computed root
always matches the header root (assuming storage root resolution succeeds).
ProcessBlockWithBAL now derives parent root from parent.Root() directly,
matching how full nodes derive state root from currentBlock headers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix several interacting issues that prevented partial state nodes from
syncing and following the chain on bal-devnet-2:
1. Stale pivot deadlock: Replace unconditional pivot suppression with
rate-limited advances (2-minute cooldown). This prevents the restart
loop bug while allowing recovery when the initial pivot is too stale
for peers to serve.
2. Storage root resolution: Add snap-based resolver that queries peers
for untracked contracts' storage roots during BAL processing. This
lets the computed state root converge toward the header root.
3. SetCanonical for partial state: When the computed root differs from
the header root (expected when untracked contracts have unresolved
storage roots), check HasState(partialState.Root()) instead of only
HasState(block.Root()). Guard against zero root during snap sync.
4. Canonical hash backfill: AdvancePartialHead now writes canonical
hashes for all blocks between the pivot and snap head, fixing the
"final block not in canonical chain" error caused by
InsertReceiptChain skipping blocks whose bodies already exist.
5. Gap block processing: After snap sync completes, process accumulated
blocks between the sync head and chain tip using their persisted BALs
before entering steady-state chain following.
6. Computed root chaining: Use partialState.Root() (actual computed root)
as parentRoot for subsequent blocks, not the header root. This ensures
correct trie chaining when computed != header root.
Tested end-to-end on bal-devnet-2: snap sync completes, gap blocks
processed, canonical head advances at chain tip (~1 block/12s).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix the post-sync deadlock where blocks validated via BAL in newPayload
were never written to the database, causing ForkchoiceUpdated to fail
finding them and triggering infinite sync cycles.
Changes:
- Export WriteBlockWithoutState and call it after ProcessBlockWithBAL
in newPayload, so FCU can find blocks via GetBlockByHash
- Guard SetCanonical against recoverAncestors for partial state nodes
(they can't re-execute blocks, only apply BAL diffs)
- Auto-disable log indexing when partial state is enabled (no receipts)
- Fix BAL type field accesses to match upstream bal-devnet-2 types
(StorageChanges, CodeChanges, BalanceChanges, Validate signature)
- Update newPayload signature (BAL now comes from ExecutableData params)
- Add partial sync scripts and documentation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Geth has two independent snapshot tiers, each with its own disable
mechanism:
1. In-memory snapshot cache: controlled by SnapshotLimit (derived from
ethconfig.SnapshotCache). Setting SnapshotCache=0 disables it.
2. On-disk snapshot generator: a background goroutine in pathdb that
iterates the entire state trie to build flat key-value snapshots.
Controlled by pathdb.Config.SnapshotNoBuild.
The partial state configuration (cmd/utils/flags.go) already set
SnapshotCache=0 to disable the in-memory cache. However, SnapshotNoBuild
was never set, so pathdb.Enable() — called after snap sync completes —
still launched the background generator goroutine.
This generator immediately hits missing storage tries for untracked
contracts (whose storage was intentionally skipped during partial sync),
logs "Trie missing, snapshotting paused", and blocks forever on its
abort channel — a permanent goroutine leak with no recovery path.
Additionally, BlockChainConfig.SnapshotNoBuild was never propagated to
pathdb.Config.SnapshotNoBuild in the triedbConfig() conversion. The
field only reached the hash-scheme snapshot module (core/blockchain.go
setupSnapshot), which is already skipped for path-scheme databases. This
plumbing gap meant pathdb.Config.SnapshotNoBuild was never set in
production code — only in tests.
Fix both issues:
- Set SnapshotNoBuild=true when partial state is enabled
- Propagate BlockChainConfig.SnapshotNoBuild into pathdb.Config
Freeze the pivot header for partial state nodes to ensure stable state
sync progress:
- Suppress pivot movement in fetchHeaders() (beaconsync.go)
- Suppress pivot movement in processSnapSyncContent() (downloader.go)
- Reuse existing pivot across sync cycle restarts in syncToHead()
After initial snap sync completes, bridge the gap from pivot to HEAD:
- Import post-pivot blocks with receipts (no execution needed since
untracked contracts have empty storage tries)
- Run second state sync to download HEAD state root
- Add AdvancePartialHead to update currentBlock without re-execution
Guard the backfiller for partial state mode:
- suspend() skips Cancel() during active snap sync to prevent
constant cancel/restart cycles from beacon head updates
- resume() skips new sync cycles after partial sync completes
Add chain retention for partial state mode: only the most recent N blocks
(default 1024) retain bodies and receipts. During sync, older blocks are
skipped entirely. After sync, the freezer enforces a rolling window.
Add engine API support for Block Access Lists (EIP-7928): NewPayloadV5
accepts BAL data alongside execution payloads, enabling partial state
nodes to receive per-block storage access information from the CL.
Fix beacon backfilling failure caused by dynamic chain cutoff not
clearing the cutoff hash (which remained at the genesis hash).
Add partial state awareness to eth_call/eth_estimateGas to return clear
errors when accessing untracked contract storage.
Implement Block Access List (BAL) processing for partial statefulness
per EIP-7928. This enables nodes to update state without re-executing
transactions by applying BAL diffs directly to the trie.
Key additions:
- ApplyBALAndComputeRoot: Core BAL processing with correct commit ordering
(storage trie → account Root → account trie)
- ProcessBlockWithBAL: Blockchain-level entry point for BAL processing
- HandlePartialReorg: Chain reorganization support using BAL history
- Comprehensive test coverage (31 tests):
* Unit tests for edge cases (storage deletion, EIP-161, buildStateSet)
* Blockchain integration tests (ProcessBlockWithBAL, HandlePartialReorg)
* Both HashScheme and PathScheme coverage
Devnet Testing (2-node setup):
- Full node: dev mode with --dev.period 2, creates blocks
- Partial node: --partial-state mode, syncs via P2P
- Test results: Block sync verified, balance queries match between nodes,
state roots consistent. Database size reduction observed for partial node.
* add method on StateReaderTracker to clear the accumulated reads
* don't factor the BAL size into the payload size during construction in the miner
* simplify miner code for constructing payloads-with-BALs via the use of aformentioned StateReaderTracker clear method
* clean up the configuration of the BAL execution mode based on the preset flag specified
This PR introduces a new type HistoryPolicy which captures user intent
as opposed to pruning point stored in the blockchain which persists the
actual tail of data in the database.
It is in preparation for the rolling history expiry feature.
It comes with a semantic change: if database was pruned and geth is
running without a history mode flag (or explicit keep all flag) geth
will emit a warning but continue running as opposed to stopping the
world.
This PR allows users to prune their nodes up to the Prague fork. It
indirectly depends on #32157 and can't really be merged before eraE
files are widely available for download.
The `--history.chain` flag becomes mandatory for `prune-history`
command. Here I've listed all the edge cases that can happen and how we
behave:
## prune-history Behavior
| From | To | Result |
|-------------|--------------|--------------------------|
| full | postmerge | ✅ prunes |
| full | postprague | ✅ prunes |
| postmerge | postprague | ✅ prunes further |
| postprague | postmerge | ❌ can't unprune |
| any | all | ❌ use import-history |
## Node Startup Behavior
| DB State | Flag | Result |
|-------------|--------------|----------------------------------------------------------------|
| fresh | postprague | ✅ syncs from Prague |
| full | postprague | ❌ "run prune-history first" |
| postmerge | postprague | ❌ "run prune-history first" |
| postprague | postmerge | ❌ "can't unprune, use import-history or fix
flag" |
| pruned | all | ✅ accepts known prune points |
Pebble maintains a batch pool to recycle the batch object. Unfortunately
batch object must be
explicitly returned via `batch.Close` function. This PR extends the
batch interface by adding
the close function and also invoke batch.Close in some critical code
paths.
Memory allocation must be measured before merging this change. What's
more, it's an open
question that whether we should apply batch.Close as much as possible in
every invocation.
Implement standardized JSON format for slow block logging to enable
cross-client performance analysis and protocol research.
This change is part of the Cross-Client Execution Metrics initiative
proposed by Gary Rong: https://hackmd.io/dg7rizTyTXuCf2LSa2LsyQ
The standardized metrics enabled data-driven analysis like the EIP-7907
research: https://ethresear.ch/t/data-driven-analysis-on-eip-7907/23850
JSON format includes:
- block: number, hash, gas_used, tx_count
- timing: execution_ms, total_ms
- throughput: mgas_per_sec
- state_reads: accounts, storage_slots, bytecodes, code_bytes
- state_writes: accounts, storage_slots, bytecodes
- cache: account/storage/code hits, misses, hit_rate
This should come after merging #33522
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This PR extends the statistics of contract code read by adding these
fields:
- **CacheHitBytes**: the total number of bytes served by cache
- **CacheMissBytes**: the total number of bytes read on cache miss
- **CodeReadBytes**: the total number of bytes for contract code read
### Description
Add a new `OnStateUpdate` hook which gets invoked after state is
committed.
### Rationale
For our particular use case, we need to obtain the state size metrics at
every single block when fuly syncing from genesis. With the current
state sizer, whenever the node is stopped, the background process must
be freshly initialized. During this re-initialization, it can skip some
blocks while the node continues executing blocks, causing gaps in the
recorded metrics.
Using this state update hook allows us to customize our own data
persistence logic, and we would never skip blocks upon node restart.
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Fix#33390
`setHeadBeyondRoot` was failing to invalidate finalized blocks because
it compared against the original head instead of the rewound root. This
fix updates the comparison to use the post-rewind block number,
preventing the node from reporting a finalized block that no longer
exists. Also added relevant test cases for it.
## Description
This PR fixes incorrect contract code state metrics by ensuring
duplicate codes are not counted towards the reported results.
## Rationale
The contract code metrics don't consider database deduplication. The
current implementation assumes that the results are only **slightly
inaccurate**, but this is not true, especially for data collection
efforts that started from the genesis block.
This PR introduces a new debug feature, logging the slow blocks with
detailed performance statistics, such as state read, EVM execution and
so on.
Notably, the detailed performance statistics of slow blocks won't be
logged during the sync to not overwhelm users. Specifically, the statistics
are only logged if there is a single block processed.
Example output
```
########## SLOW BLOCK #########
Block: 23537063 (0xa7f878611c2dd27f245fc41107d12ebcf06b4e289f1d6acf44d49a169554ee09) txs: 248, mgasps: 202.99
EVM execution: 63.295ms
Validation: 1.130ms
Account read: 6.634ms(648)
Storage read: 17.391ms(1434)
State hash: 6.722ms
DB commit: 3.260ms
Block write: 1.954ms
Total: 99.094ms
State read cache: account (hit: 622, miss: 26), storage (hit: 1325, miss: 109)
##############################
```
- Introduce a new subscription kind `transactionReceipts` to allow clients to
receive transaction receipts over WebSocket as soon as they are available.
- Accept optional `transactionHashes` filter to subscribe to receipts for specific
transactions; an empty or omitted filter subscribes to all receipts.
- Preserve the same receipt format as returned by `eth_getTransactionReceipt`.
- Avoid additional HTTP polling, reducing RPC load and latency.
---------
Co-authored-by: Sina Mahmoodi <itz.s1na@gmail.com>
This pr implements https://github.com/ethereum/go-ethereum/issues/32733
to make StateProcessor more customisable.
## Compatibility notes
This introduces a breaking change to users using geth EVM as a library.
The `NewStateProcessor` function now takes one parameter which has the
chainConfig embedded instead of 2 parameters.
When I implemented in #31340 I didn't expect multiple forks to be
configured at once, but this is exactly how BPOs are defined. This
updates the method to determine the next scheduled fork rather than the
last fork.
The format that is currently reported by the chain isn't very useful, as
it gives an average for ALL the nodes, and not only the leaves, which
skews the results.
Also, until now there was no way to activate the reporting of errors.
We also decided that metrics weren't the right tool to report this data,
so we decided to dump it to the console if the flag is enabled. A better
system should be built, but for now, printing to the logs does the job.
This PR adds a new RPC call, which re-executes a block with stateless
mode activated, so that the witness data are collected and returned.
They are `debug_executionWitnessByHash` which takes in a block hash
and `debug_executionWitness` which takes in a block number.
---------
Signed-off-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>
Add state size tracking and retrieve api, start geth with `--state.size-tracking`,
the initial bootstrap is required (around 1h on mainnet), after the bootstrap,
use `debug_stateSize()` RPC to retrieve the state size:
```
> debug.stateSize()
{
accountBytes: "0x39681967b",
accountTrienodeBytes: "0xc57939f0c",
accountTrienodes: "0x198b36ac",
accounts: "0x129da14a",
blockNumber: "0x1635e90",
contractCodeBytes: "0x2b63ef481",
contractCodes: "0x1c7b45",
stateRoot: "0x9c36a3ec3745d72eea8700bd27b90dcaa66de0494b187c5600750044151e620a",
storageBytes: "0x18a6e7d3f1",
storageTrienodeBytes: "0x2e7f53fae6",
storageTrienodes: "0x6e49a234",
storages: "0x517859c5"
}
```
---------
Signed-off-by: jsvisa <delweng@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This PR is the first step in the trienode history series.
It introduces the `nodeWithOrigin` struct in the path database, which tracks
the original values of dirty nodes to support trienode history construction.
Note, the original value is always empty in this PR, so it won't break the
existing journal for encoding and decoding. The compatibility of journal
should be handled in the following PR.
Introduce file-based state journal in path database, fixing
the Pebble restriction when the journal size exceeds 4GB.
---------
Signed-off-by: jsvisa <delweng@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
If Geth is engaged in a long-run block synchronization, such as a full
syncing over a large number of blocks, invoking `debug_setHead` will
cause `downloader.Cancel` to wait for all fetchers to stop first.
This can be time-consuming, particularly for the block processing
thread.
To address this, we manually call `blockchain.StopInsert` to interrupt
the blocking processing thread and allow it to exit immediately, and
after that call `blockchain.ResumeInsert` to resume the block
downloading process.
Additionally, we add a sanity check for the input block number of
`debug_setHead` to ensure its validity.
---------
Signed-off-by: jsvisa <delweng@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Previously, PathDB used a single buffer to aggregate database writes,
which needed to be flushed atomically. However, flushing large amounts
of data (e.g., 256MB) caused significant overhead, often blocking the
system for around 3 seconds during the flush.
To mitigate this overhead and reduce performance spikes, a double-buffer
mechanism is introduced. When the active buffer fills up, it is marked
as frozen and a background flushing process is triggered. Meanwhile, a
new buffer is allocated for incoming writes, allowing operations to
continue uninterrupted.
This approach reduces system blocking times and provides flexibility in
adjusting buffer parameters for improved performance.
This pull request introduces a mechanism to expose statistics from the
state reader, specifically related to cache utilization during state prefetching.
To improve state access performance, a pair of state readers is constructed
with a shared local cache. One reader to execute transactions ahead of time
to warm up the cache. The other reader is used by the actual chain processing
logic, which can benefit from the prefetched states.
This PR adds visibility into how effective the cache is by exposing relevant
usage statistics.
---------
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Co-authored-by: Csaba Kiraly <csaba.kiraly@gmail.com>