The stateless block check in forkchoiceUpdated was calling BeaconSync()
on every FCU (~12 seconds) during active snap sync, restarting the
entire sync cycle each time. This prevented state download from ever
completing.
Guard the check with ConfigSyncMode: during active snap sync, the
downloader is already working, so just return STATUS_SYNCING without
restarting. Only trigger BeaconSync for stateless blocks after snap
sync has completed (FullSync mode).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Live testing on bal-devnet-2 confirmed that computed roots DO diverge from
header roots. Block 75315 computed root 0xe909c7.. vs header root
0x9acbbe.. — untracked contracts' storage roots in the local trie are from
snap sync time and differ from the actual current roots, even when the
storage root resolver successfully queries peers.
This means subsequent blocks must chain off the computed root (via
partialState.Root()), not the header root (via parent.Root()). Restore
the stateRoot field using atomic.Pointer[common.Hash] instead of the
previous sync.RWMutex for lock-free concurrent access.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Apply review fixes: BAL iterator start (Fix 2), fatal root mismatch when
all storage resolved (Fix 3), WriteBlockWithoutState error handling (Fix 4),
contract filter construction order (Fix 5), canonical hash backfill (Fix 6),
underflow guard in gap processing (Fix 8), O(n²) prepend fix (Fix 9),
ReadBALHistory corruption detection (Fix 11), incomplete resolution error
(Fix 13), RLP encode panic (Fix 14), gap processing log level (Fix 16),
TriggerPartialResync message (Fix 18), and comment accuracy fixes.
Remove the stateRoot field and sync.RWMutex from PartialState entirely.
Since partial state maintains the full account trie, the computed root
always matches the header root (assuming storage root resolution succeeds).
ProcessBlockWithBAL now derives parent root from parent.Root() directly,
matching how full nodes derive state root from currentBlock headers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix several interacting issues that prevented partial state nodes from
syncing and following the chain on bal-devnet-2:
1. Stale pivot deadlock: Replace unconditional pivot suppression with
rate-limited advances (2-minute cooldown). This prevents the restart
loop bug while allowing recovery when the initial pivot is too stale
for peers to serve.
2. Storage root resolution: Add snap-based resolver that queries peers
for untracked contracts' storage roots during BAL processing. This
lets the computed state root converge toward the header root.
3. SetCanonical for partial state: When the computed root differs from
the header root (expected when untracked contracts have unresolved
storage roots), check HasState(partialState.Root()) instead of only
HasState(block.Root()). Guard against zero root during snap sync.
4. Canonical hash backfill: AdvancePartialHead now writes canonical
hashes for all blocks between the pivot and snap head, fixing the
"final block not in canonical chain" error caused by
InsertReceiptChain skipping blocks whose bodies already exist.
5. Gap block processing: After snap sync completes, process accumulated
blocks between the sync head and chain tip using their persisted BALs
before entering steady-state chain following.
6. Computed root chaining: Use partialState.Root() (actual computed root)
as parentRoot for subsequent blocks, not the header root. This ensures
correct trie chaining when computed != header root.
Tested end-to-end on bal-devnet-2: snap sync completes, gap blocks
processed, canonical head advances at chain tip (~1 block/12s).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix the post-sync deadlock where blocks validated via BAL in newPayload
were never written to the database, causing ForkchoiceUpdated to fail
finding them and triggering infinite sync cycles.
Changes:
- Export WriteBlockWithoutState and call it after ProcessBlockWithBAL
in newPayload, so FCU can find blocks via GetBlockByHash
- Guard SetCanonical against recoverAncestors for partial state nodes
(they can't re-execute blocks, only apply BAL diffs)
- Auto-disable log indexing when partial state is enabled (no receipts)
- Fix BAL type field accesses to match upstream bal-devnet-2 types
(StorageChanges, CodeChanges, BalanceChanges, Validate signature)
- Update newPayload signature (BAL now comes from ExecutableData params)
- Add partial sync scripts and documentation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add chain retention for partial state mode: only the most recent N blocks
(default 1024) retain bodies and receipts. During sync, older blocks are
skipped entirely. After sync, the freezer enforces a rolling window.
Add engine API support for Block Access Lists (EIP-7928): NewPayloadV5
accepts BAL data alongside execution payloads, enabling partial state
nodes to receive per-block storage access information from the CL.
Fix beacon backfilling failure caused by dynamic chain cutoff not
clearing the cutoff hash (which remained at the genesis hash).
Add partial state awareness to eth_call/eth_estimateGas to return clear
errors when accessing untracked contract storage.
Wire up slotnum for `testing_buildBlockV1` for `bal-devnet-3` branch
We are experimenting testing block building through hive via EELS (PR
[here](https://github.com/ethereum/execution-specs/pull/2679)). This is
the only change needed to test against `bal-devnet-3` - missing slotnum.
This PR contains two changes:
Firstly, the finalized header will be resolved from local chain if it's
not recently announced via the `engine_newPayload`.
What's more importantly is, in the downloader, originally there are two
code paths to push forward the pivot point block, one in the beacon
header fetcher (`fetchHeaders`), and another one is in the snap content
processer (`processSnapSyncContent`).
Usually if there are new blocks and local pivot block becomes stale, it
will firstly be detected by the `fetchHeaders`. `processSnapSyncContent`
is fully driven by the beacon headers and will only detect the stale pivot
block after synchronizing the corresponding chain segment. I think the
detection here is redundant and useless.
I observed failing tests in Hive `engine-withdrawals`:
-
https://hive.ethpandaops.io/#/test/generic/1772351960-ad3e3e460605c670efe1b4f4178eb422?testnumber=146
-
https://hive.ethpandaops.io/#/test/generic/1772351960-ad3e3e460605c670efe1b4f4178eb422?testnumber=147
```shell
DEBUG (Withdrawals Fork on Block 2): NextPayloadID before getPayloadV2:
id=0x01487547e54e8abe version=1
>> engine_getPayloadV2("0x01487547e54e8abe")
<< error: {"code":-38005,"message":"Unsupported fork"}
FAIL: Expected no error on EngineGetPayloadV2: error=Unsupported fork
```
The same failure pattern occurred for Block 3.
Per Shanghai engine_getPayloadV2 spec, pre-Shanghai payloads should be
accepted via V2 and returned as ExecutionPayloadV1:
- executionPayload: ExecutionPayloadV1 | ExecutionPayloadV2
- ExecutionPayloadV1 MUST be returned if payload timestamp < Shanghai
timestamp
- ExecutionPayloadV2 MUST be returned if payload timestamp >= Shanghai
timestamp
Reference:
-
https://github.com/ethereum/execution-apis/blob/main/src/engine/shanghai.md#engine_getpayloadv2
Current implementation only allows GetPayloadV2 on the Shanghai fork
window (`[]forks.Fork{forks.Shanghai}`), so pre-Shanghai payloads are
rejected with Unsupported fork.
If my interpretation of the spec is incorrect, please let me know and I
can adjust accordingly.
---------
Co-authored-by: muzry.li <muzry.li1@ambergroup.io>
This PR removes the legacy sidecar conversion logic.
After the Osaka fork, the blobpool will accept only blob sidecar version
1.
Any remaining version 0 blob transactions, if they still exist, will no
longer
be eligible for inclusion.
Note that conversion at the RPC layer is still supported, and version 0
blob
transactions will be automatically converted to version 1 there.
This moves the tracking of the current syncmode into the downloader, fixing an
issue where the syncmode being requested through the engine API could go
out-of-sync with the actual mode being performed by downloader.
Fixes#32629
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
This is to benchmark how much the internal parts of GetBlobsV2 take.
This is not an RPC-level benchmark, so JSON-RPC overhead is not
included.
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
This adds checks into getPayload to ensure the correct version is called
for the fork which applies to the payload.
---------
Co-authored-by: jsvisa <delweng@gmail.com>
The periodic sealing loop failed to reset its timer when sealBlock
returned an error, causing the timer to never fire again and effectively
halting block production in developer periodic mode after the first
failure. This is a bug because the loop relies on the timer to trigger
subsequent sealing attempts, and transient errors (e.g., pool races or
chain rewinds) should not permanently stop the loop. The change moves
timer.Reset after the sealing attempt unconditionally, ensuring the loop
continues ticking and retrying even when sealing fails, which matches
how other periodic timers in the codebase behave and preserves forward
progress.
This PR updates the `payloadVersion` function in `simulated_beacon.go`
to handle additional following forks used during development and testing
phases after Osaka.
This change ensures that the simulated beacon correctly resolves the
payload version for these forks, enabling consistent and valid execution
payload handling during local testing or simulation.
Addresses https://github.com/ethereum/go-ethereum/issues/32630
This pull request enables the stateless engine APIs for Osaka and the
following BPOs. Apart from that, a few more descriptions have been added
in the engine APIs, making it easier to follow the spec change.
Fixes an issue I accidentally introduced in #32579. Essentially, because
we gate the engine methods based on particular forks and I did not add
the BPOs as allowed forks to the method.
Another getBlobs PR on top of
https://github.com/ethereum/go-ethereum/pull/32190 to avoid some minor
regressions.
- bring back more log messages from before
- continue processing also on some internal errors
- ensure v2 complies with spec even if there are internal errors
---------
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
This pull request fixes a regression, introduced in #32190
Specifically, in GetBlobsV1 engine API, if any blob is missing, the null
should be placed in
response, unfortunately a behavioral change was introduced in #32190,
returning an error
instead.
What's more, a more comprehensive test suite is added to cover both
`GetBlobsV1` and
`GetBlobsV2` endpoints.
- If all the `vhashes` are in the same `sidecar`, then it will load the
same blob tx many times. This PR aims to upgrade this.
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Correct the error message in the ExecuteStatelessPayloadV4 function to
reference newPayloadV4 and the Prague fork, instead of incorrectly
referencing newPayloadV3 and Cancun.
This improves clarity during debugging and aligns the error message with
the actual function and fork being validated. No logic is changed.
---------
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
The main purpose of this change is to enforce the version setting when
constructing the blobSidecar, avoiding creating sidecar with wrong/default
version tag.
alternate approach to https://github.com/ethereum/go-ethereum/pull/31328
suggested by @MariusVanDerWijden . This prevents Geth from outputting a
lot of logs when trying to commit on-demand dev mode blocks while the
client is shutting down.
The issue is hard to reproduce, but I've seen it myself and it is
annoying when it happens. I think this is a reasonable simple solution,
and we can revisit if we find that the output is still too large (i.e.
there is a large delay between initiating shut down and the simulated
beacon receiving the signal, while in this loop).
Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de>
geth cmd: `geth --dev --dev.period 5`
call: `debug.setHead` to rollback several blocks.
If the `debug.setHead` call is delayed, it will trigger a panic with a
small probability, due to using the null point of
`fcResponse.PayloadID`.
---------
Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de>