Adds `testing_commitBlockV1`. It is the write companion of `testing_buildBlockV1`:
it builds a block from the provided payload attributes and transactions on
top of the current canonical head, inserts it, and sets it as the new
head, returning the new head hash.
---------
Co-authored-by: MariusVanDerWijden <m.vanderwijden@live.de>
This PR fixes an issue where flat states are continuously persisted
during downloadState, while the sync journal is only persisted at the
end of Sync.
As a result, an unclean shutdown can leave the on-disk flat state ahead
of the journal markers. Some persisted entries may be stale (storage
slots that should have been deleted), and these dangling entries are not
detected or fixed by subsequent state downloads.
To address this, this PR introduces a cleanup step before state
downloading begins. It removes all state entries that are not covered by
the persisted journal markers.
This PR introduces a new condition: if the local node falls too far
behind and the BALs required for catching up are very likely to be
unavailable, the entire snap sync is restarted from scratch.
Because the defined BAL retention window is the weak subjectivity period,
which is calculated dynamically, a more conservative fixed threshold
(90K blocks) is used for robustness.
Apart from that, the BAL catch-up is divided into several spans that are
applied one by one. This is essential to avoid a potential out-of-memory
panic from holding the entire BAL set in memory.
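A minimal sketch of the threshold check and the span splitting, for illustration only. The names `span`, `needsRestart` and `splitSpans` are hypothetical and not taken from the PR, and treating the 90K threshold as exclusive is an assumption.

```go
package main

// span is a hypothetical inclusive block range [from, to].
type span struct{ from, to uint64 }

// maxCatchUpGap is the 90K-block threshold from the description above.
const maxCatchUpGap = 90_000

// needsRestart reports whether the local node is too far behind for BAL
// catch-up. Treating the threshold as exclusive is an assumption.
func needsRestart(local, head uint64) bool {
	return head > local && head-local > maxCatchUpGap
}

// splitSpans divides the blocks (local, head] into consecutive spans of at
// most size blocks, so only one span of BALs needs to be held in memory.
func splitSpans(local, head, size uint64) []span {
	if size == 0 || head <= local {
		return nil
	}
	var spans []span
	for from := local + 1; ; {
		to := head
		if head-from >= size {
			to = from + size - 1
		}
		spans = append(spans, span{from, to})
		if to == head {
			return spans
		}
		from = to + 1
	}
}
```

For example, `splitSpans(100, 110, 4)` yields the ranges 101-104, 105-108 and 109-110.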
This PR improves the slot reservation logic in the context of snap/2.
Geth reserves roughly half of the peer slots for peers supporting the
snap protocol if the local node needs to snap sync.
With snap/2, this mechanism changes as follows: the slots are reserved
for "usable snap peers", not blindly for any peer with the snap extension
enabled (such as legacy snap/1 peers, which can't serve snap/2).
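The reservation rule can be sketched roughly as below. This is not geth's actual handler code: `peerInfo`, `usable` and `admit` are hypothetical names, and the exact accounting of reserved slots is an assumption.

```go
package main

// peerInfo is a hypothetical summary of a connecting peer.
type peerInfo struct {
	snapVersions []uint // snap protocol versions the peer advertises
}

// usable reports whether the peer can serve the snap version we sync with.
func usable(p peerInfo, want uint) bool {
	for _, v := range p.snapVersions {
		if v == want {
			return true
		}
	}
	return false
}

// admit decides whether a new peer may take a slot while snap sync is
// needed. Roughly half of maxPeers is kept free for usable snap peers, so
// other peers are only admitted into the unreserved half.
func admit(p peerInfo, want uint, maxPeers, total, usableCount int) bool {
	if total >= maxPeers {
		return false
	}
	if usable(p, want) {
		return true
	}
	reserved := maxPeers / 2
	others := total - usableCount
	return others < maxPeers-reserved
}
```

With `maxPeers = 10` and five connected peers that cannot serve snap/2, a sixth snap/1-only peer is rejected while a snap/2 peer is still admitted.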
This PR fixes an issue where, when peers legitimately lack a requested
BAL, an empty response (0x80) is delivered and the BAL entry is refetched
over and over again.
A `refused` tracker is added, and catchUp fails if the BAL is unavailable
from the entire peer set.
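One way such a tracker could look, as a sketch under assumptions: the type and method names below are hypothetical, and hashes and peer IDs are plain strings for brevity.

```go
package main

// refusedTracker is a hypothetical tracker of peers that returned an empty
// response for a given BAL hash.
type refusedTracker struct {
	refused map[string]map[string]struct{} // BAL hash -> peer IDs
}

func newRefusedTracker() *refusedTracker {
	return &refusedTracker{refused: make(map[string]map[string]struct{})}
}

// mark records that peer had no BAL for hash.
func (r *refusedTracker) mark(hash, peer string) {
	if r.refused[hash] == nil {
		r.refused[hash] = make(map[string]struct{})
	}
	r.refused[hash][peer] = struct{}{}
}

// exhausted reports whether every peer in the current peer set has refused
// hash, in which case retrying is pointless and catch-up should fail.
func (r *refusedTracker) exhausted(hash string, peers []string) bool {
	if len(peers) == 0 {
		return false
	}
	for _, p := range peers {
		if _, ok := r.refused[hash][p]; !ok {
			return false
		}
	}
	return true
}
```

The caller would check `exhausted` before rescheduling a fetch and abort instead of retrying once it returns true.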
Implements https://eips.ethereum.org/EIPS/eip-8037
This was mainly done to judge the complexity of the EIP
and to act as a jumping-off point, since the EIP will likely
change.
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This PR introduces a cache for GetBlobs requests.
The main purpose of this PR is to reduce the getBlobs latency by reading and
decoding blobs from the pool in advance of the actual query. This is important
especially in the context of a sparse blobpool, since it may be necessary to
recover blobs from cells on a getBlobs request.
Previously, the Engine API read and decoded blobs from the pool on every call.
Now those calls check the cache and only fall back to the pool on a miss.
The cache has two modes:
- In topK mode (default), it wakes up periodically, picks the most profitable
pending blob transactions up to the current fork's maxBlobsPerBlock, and loads
their blobs. The selection logic is shared with the miner's block-building
logic. The selection size is derived from eip4844.MaxBlobsPerBlock at the
current head.
- When the CL calls HasBlobs, the cache switches to hasBlobs mode and tries to
pin the set it just reported as available. Cache updates (read, decode, and
optionally conversion in the future) run in background goroutines.
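The lookup path (check the cache, fall back to the pool on a miss) can be sketched as follows. `blobCache`, `put` and `get` are hypothetical names, and the pool is represented by a plain `load` callback. The real cache is filled by background goroutines, which this sketch leaves out.

```go
package main

import "sync"

// blobCache is a hypothetical cache of already decoded blobs.
type blobCache struct {
	mu      sync.RWMutex
	entries map[string][]byte // versioned hash -> decoded blob
}

func newBlobCache() *blobCache {
	return &blobCache{entries: make(map[string][]byte)}
}

// put stores a decoded blob, as a background refresh would do.
func (c *blobCache) put(hash string, blob []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[hash] = blob
}

// get returns the cached blob if present and only calls load (the pool
// lookup in this sketch) on a miss.
func (c *blobCache) get(hash string, load func(string) ([]byte, bool)) ([]byte, bool) {
	c.mu.RLock()
	blob, ok := c.entries[hash]
	c.mu.RUnlock()
	if ok {
		return blob, true
	}
	return load(hash)
}
```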
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
Adds snap/2 (EIP-8189), a block-access-list (BAL) based state sync, and
wires it to run side by side with snap/1. It's opt-in (for now) behind a
new --snap.v2 flag and chosen at startup.
https://eips.ethereum.org/EIPS/eip-8189
---------
Co-authored-by: Toni Wahrstätter <info@toniwahrstaetter.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This is a PR that removes all correctly flagged typos, in order to stop
an onslaught of slop PRs in its tracks. It should be followed by #34994,
but the latter needs more configuration work and I want to stem the
flow of such PRs right now.
The per-call SERVER span ended inside `handleCall()`, so the JSON-RPC
response write happened after the span closed. For large responses like
`engine_getBlobsV*`, that write time was missing from traces.
- Extend the SERVER span past `writeJSON`.
- For batches, add a top-level `jsonrpc.batch` SERVER span (with `rpc.batch.size`) covering the whole batch including `callBuffer.write`.
- Add `rpc.writeJSON` span around the non-batch response write.
- Add `rpc.writeJSONBatch` span around the batch response write.
- Add `rpc.httpWrite` span around the actual HTTP write, separating JSON encoding from network write.
- Add additional telemetry helpers.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
This PR is a prerequisite for landing snap v2, the BAL-healing snap sync
algorithm.
It duplicates much of the snap v1 skeleton, which is expected to be
deprecated once v2 is enabled. The code duplication is acceptable as a
short-term tradeoff, simplifying development and reducing integration
complexity.
Fixes #32672
This is kind of a band-aid solution since it fixes the issue by
bypassing the snap sync expectations of an empty db and attempting to
import the new payload if we're at block 1. The next FCU will set the
status to synced.
Will continue looking to better understand how the above issue arises
and find a more thorough solution.
---------
Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de>
There is currently no way for JSON-RPC clients to discover which
historical data a node can serve without probing with trial-and-error
calls and interpreting opaque error messages (`pruned history
unavailable`).
This makes it hard to build robust tooling on top of nodes that prune
their history, for example nodes started with `--history.chain postmerge`
or with reduced `TransactionHistory`, `LogHistory`, or `StateHistory`
windows.
This PR implements `eth_capabilities` as defined in
ethereum/execution-apis#755. The method takes no parameters and returns
the current head plus six per-resource capability records:
- `state`
- `tx`
- `logs`
- `receipts`
- `blocks`
- `stateproofs`
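Since the method takes no parameters, a client request is a plain JSON-RPC call with an empty `params` array. The sketch below only builds the request body. The `rpcRequest` type and the helper name are illustrative, and the response layout is not shown because it is defined by the linked spec.

```go
package main

import "encoding/json"

// rpcRequest is a generic JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string `json:"jsonrpc"`
	ID      int    `json:"id"`
	Method  string `json:"method"`
	Params  []any  `json:"params"`
}

// capabilitiesRequest returns the JSON body of an eth_capabilities call.
func capabilitiesRequest(id int) (string, error) {
	req := rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "eth_capabilities",
		Params:  []any{}, // the method takes no parameters
	}
	out, err := json.Marshal(req)
	return string(out), err
}
```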
Closes #33828
- Adds tracing to `GetBlobsV1/V2/V3`
- Adds `blobs.requested` and `blobs.filled` attributes to
`GetBlobsV1/V2/V3` spans.
- Adds tracing to `BlobTxPool().GetBlobs()`
This PR:
- Adds `engine_newPayloadWithWitnessV5`. The codebase already supports
the previous `VX`, so only `V5` was missing.
- Makes the consensus witness format use the field [ordering defined in the spec](8d7e68f4b7/src/ethereum/forks/amsterdam/stateless_host_exec_witness.py (L175-L176))
to make it canonical.
cc @gballet
---------
Co-authored-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>
Fixes a regression where nil results from getBlobs were encoded as an empty array instead of null.
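For background, this is how Go's standard `encoding/json` distinguishes the two cases: a nil slice marshals to `null`, while an empty non-nil slice marshals to `[]`. The helper below is only an illustration of that behavior, not the actual fix.

```go
package main

import "encoding/json"

// encodeResult marshals a getBlobs-style result slice with encoding/json.
// A nil slice becomes "null"; an empty, non-nil slice becomes "[]".
func encodeResult(res []*string) string {
	out, err := json.Marshal(res)
	if err != nil {
		return ""
	}
	return string(out)
}
```

So `encodeResult(nil)` gives `null` and `encodeResult([]*string{})` gives `[]`. A custom encoder has to preserve that distinction explicitly.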
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
Adds a fast path for ExecutionPayloadEnvelope and BlobAndProofListV*
that bypasses encoding/json's reflection and re-validation, which are
expensive for large payloads with many blobs. Also hand-rolls the
jsonrpcMessage wire encoding in the RPC codec to avoid a second
re-validation pass when writing responses to the connection.
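To illustrate the general idea of a hand-rolled fast path (not the code in this PR), the sketch below appends a JSON array of 0x-prefixed hex strings directly into a byte buffer instead of going through reflection. The function name is hypothetical.

```go
package main

import "encoding/hex"

// appendHexArray appends items to dst as a JSON array of 0x-prefixed hex
// strings. Hex output never needs JSON escaping, so no validation pass is
// required on the result.
func appendHexArray(dst []byte, items [][]byte) []byte {
	dst = append(dst, '[')
	for i, item := range items {
		if i > 0 {
			dst = append(dst, ',')
		}
		dst = append(dst, `"0x`...)
		dst = append(dst, hex.EncodeToString(item)...)
		dst = append(dst, '"')
	}
	return append(dst, ']')
}
```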
Resolves #33814
---------
Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de>
Co-authored-by: Felix Lange <fjl@twurst.com>
This PR implements the serving side of the eth71 BAL exchange messages.
Until commit 4cd7092, it also contained the requesting side, but since that
part still needs more work, I'm splitting it out into a separate PR.
The test injects BALs directly into rawdb. This can be removed once BAL
generation is integrated into the chain maker.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
This PR finally lands EIP-7928, collecting the block access list during
block execution and verifying it against the block header.
---------
Co-authored-by: jwasinger <j-wasinger@hotmail.com>
Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de>
Per @MariusVanDerWijden's review feedback, tighten the change to match
geth's existing style:
- Drop the MarkConsensusExpected/MarkConsensusContacted/ConsensusReady
doc paragraphs on Ethereum; collapse the field comments to single
trailing lines matching eth/handler.go's atomic.Bool style.
- Rename the unexported accessors to MarkCLExpected/MarkCLContacted
(catalyst can't reach the fields directly).
- Drop the multi-line comments at the catalyst call sites — the method
names are self-describing.
- Trim the Backend.ConsensusReady() interface comment and EthAPIBackend
wrapper comment.
- Replace the verbose docstring on EthereumAPI.Syncing with a single
reference to #33687.
- Drop the long doc comments on the syncing_test.go cases; rename test
functions to short forms (TestSyncingBeforeCLContact, etc.).
No behavioural change. Run: `go test ./internal/ethapi/ -count=1`.
This PR extends the journal to track the pre-transaction values of
mutated balances, nonces, and code.
At the end of the transaction, these values are used to filter out no-op
changes, such as balance transitions from a -> b -> a. These changes are
excluded from the block-level access list.
Additionally, there is a dedicated `bal.ConstructionBlockAccessList`
object for gathering the state reads and writes within the current
transaction. These state writes are keyed by the block access list
index.
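The no-op filtering can be illustrated with a simplified sketch that tracks balances only. `balanceJournal` and its methods are hypothetical and use plain strings and integers instead of geth's address and balance types.

```go
package main

// balanceJournal is a hypothetical, simplified journal for balances.
type balanceJournal struct {
	pre map[string]uint64 // value before the transaction, recorded once
	cur map[string]uint64 // latest value within the transaction
}

func newBalanceJournal() *balanceJournal {
	return &balanceJournal{
		pre: make(map[string]uint64),
		cur: make(map[string]uint64),
	}
}

// set records a balance mutation from prev to next. The pre-transaction
// value is captured only on the first mutation of an address.
func (j *balanceJournal) set(addr string, prev, next uint64) {
	if _, seen := j.pre[addr]; !seen {
		j.pre[addr] = prev
	}
	j.cur[addr] = next
}

// netChanges returns only the addresses whose final balance differs from
// the pre-transaction value, dropping transitions such as a -> b -> a.
func (j *balanceJournal) netChanges() map[string]uint64 {
	out := make(map[string]uint64)
	for addr, v := range j.cur {
		if v != j.pre[addr] {
			out[addr] = v
		}
	}
	return out
}
```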
---------
Co-authored-by: jwasinger <j-wasinger@hotmail.com>
This PR introduces the OnGasChangeV2 tracing hook as a prerequisite for landing
EIP-8037.
---------
Co-authored-by: Sina M <1591639+s1na@users.noreply.github.com>
The previous version of this change unconditionally returned the progress
map until the consensus client had driven the node at least once. That
broke ethclient.TestEthClient/StatusFunctions and any other backend that
runs without a consensus client (in-process tests, --dev mode without
catalyst, light/legacy backends), where reporting "syncing" forever is
clearly wrong.
Split the gate into two flags:
- clExpected: set in eth/catalyst.Register, the only entry point that
attaches the Engine API to a node. If a backend never calls Register,
it is not paired with a consensus client.
- clContacted: set on every Engine API call (forkchoiceUpdated and
newPayload), unchanged from before.
Replace ConsensusContacted on the Backend interface with ConsensusReady,
which folds the two flags into the question eth_syncing actually wants
answered: "is the synced claim meaningful right now?" Backends that
never expect a CL answer yes immediately, preserving legacy behavior.
Backends that do expect one answer yes only after the first FCU/NewPayload.
- eth/backend.go: clExpected, clContacted, MarkConsensusExpected,
MarkConsensusContacted, ConsensusReady on (*Ethereum)
- eth/catalyst/api.go: backend.MarkConsensusExpected() in Register
- eth/api_backend.go: ConsensusReady delegates to (*Ethereum)
- internal/ethapi/backend.go: rename interface method to ConsensusReady
- internal/ethapi/api.go: Syncing checks ConsensusReady
- internal/ethapi/{api_test,transaction_args_test}.go: rename the test
mock methods (default to true so existing tests are unaffected)
- internal/ethapi/syncing_test.go: rename the helper field; tests now
cover (a) CL-paired node before handshake -> truthy, (b) ready node
-> false, (c) active sync -> progress map regardless of gate
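The two-flag gate described above can be sketched as follows. The method names follow this description, while the `node` type is a stand-in for `*Ethereum` and the body of `ConsensusReady` is an assumption derived from the stated semantics.

```go
package main

import "sync/atomic"

// node is a stand-in for the backend holding the two flags.
type node struct {
	clExpected  atomic.Bool // set when the Engine API is registered
	clContacted atomic.Bool // set on the first Engine API call
}

func (n *node) MarkConsensusExpected()  { n.clExpected.Store(true) }
func (n *node) MarkConsensusContacted() { n.clContacted.Store(true) }

// ConsensusReady reports whether a "synced" claim is meaningful: either no
// consensus client is expected at all, or it has already contacted us.
func (n *node) ConsensusReady() bool {
	return !n.clExpected.Load() || n.clContacted.Load()
}
```

A backend that never calls `MarkConsensusExpected` is ready immediately, which matches the legacy behavior.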
Refs #33687.
eth_syncing currently returns false as soon as the local downloader
believes the chain to be done. On a freshly started node this happens
before the consensus client has talked to it: the persisted head loads
into memory, no CL handshake has occurred, the downloader sees nothing
to do, Progress.Done() is true, eth_syncing reports synced.
That is wrong from an operator perspective. Load balancers (HAProxy,
NGINX), L2 supervisors and multi-node setups commonly gate routing on
eth_syncing. They start sending live traffic to a node that has not
actually learned about any new head yet, which surfaces as missing
state, stale reads, and unhealthy upstreams.
Maintainer-endorsed direction in the issue thread: "default geth to
'syncing' on startup and only switch to 'synced' once we learn about
a new block".
Implement that with a sticky atomic.Bool on *Ethereum, set the first
time the consensus layer drives the node via the Engine API
(ForkchoiceUpdated or NewPayload), and consulted from eth_syncing.
- eth/backend.go: add Ethereum.clContacted with
MarkConsensusContacted/ConsensusContacted helpers
- eth/catalyst/api.go: call MarkConsensusContacted at the same point
where lastForkchoiceUpdate / lastNewPayloadUpdate are stamped, so
the gate flips on every CL message regardless of the response
status (handshake recorded even when we reply STATUS_SYNCING)
- internal/ethapi/backend.go: add ConsensusContacted() to the Backend
interface and to the two test mocks (api_test.go testBackend,
transaction_args_test.go backendMock; both default to true so
existing tests keep their original semantics)
- eth/api_backend.go: implement ConsensusContacted on EthAPIBackend
- internal/ethapi/api.go: in EthereumAPI.Syncing, only short-circuit
to "false" when both progress.Done() AND ConsensusContacted() are
true; otherwise return the progress map as during an active sync
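The resulting decision in `eth_syncing` boils down to the following. This is a simplified sketch in which the progress map is a plain `map[string]uint64` and `syncing` is a hypothetical free function rather than the real `EthereumAPI.Syncing` method.

```go
package main

// syncing returns false only when the downloader is done AND the consensus
// client has contacted the node; otherwise it returns the progress map.
func syncing(done, consensusContacted bool, progress map[string]uint64) any {
	if done && consensusContacted {
		return false
	}
	return progress
}
```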
Adds dedicated tests in internal/ethapi/syncing_test.go covering:
- the new gate (Done but no CL contact -> truthy progress)
- normal post-handshake behavior (Done + CL contact -> false)
- active-sync behavior is unchanged regardless of the gate
Refs #33687.
In b2843a11d, the metrics check len(res) == len(hashes), but res is
pre-allocated with make(), so the lengths are always equal and the
partial-hit metric never fires. Count non-nil elements instead.
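A small sketch of the corrected counting, with hypothetical helper names and `[][]byte` standing in for the real result type:

```go
package main

// countHits returns the number of non-nil entries in res. Because res is
// pre-allocated to len(hashes), its length says nothing about hits.
func countHits(res [][]byte) int {
	hits := 0
	for _, r := range res {
		if r != nil {
			hits++
		}
	}
	return hits
}

// classify maps a result slice to the metric bucket it should count
// towards: "miss", "partial" or "full".
func classify(res [][]byte) string {
	switch hits := countHits(res); {
	case hits == 0:
		return "miss"
	case hits < len(res):
		return "partial"
	default:
		return "full"
	}
}
```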
---------
Co-authored-by: Bosul Mun <bsbs8645@snu.ac.kr>
This is a refactoring PR that wraps all pre/post-execution system calls in
exported functions, eliminating the duplicated system calls across
the codebase.
A few things are unchanged but worth highlighting:
- ChainMaker is left unchanged, since a significant rewrite would be required
- BeaconRoot in the header should be non-nil if Cancun is enabled
---------
Co-authored-by: jwasinger <j-wasinger@hotmail.com>
Every tracer that implements Stop/GetResult held a `reason error` field
that is written by Stop (called from the trace-timeout watchdog
goroutine in api.go) and read by GetResult (called by the RPC handler
main goroutine). These accesses were unsynchronized.
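One possible way to synchronize such a field is a mutex around both accesses, sketched below. Whether the actual change uses a mutex or an atomic value is not stated here, so treat this as an assumption. The `tracer` type and `errTimeout` are illustrative.

```go
package main

import (
	"errors"
	"sync"
)

// errTimeout is an example reason passed to Stop by a watchdog.
var errTimeout = errors.New("execution timeout")

// tracer is a minimal stand-in for a tracer with Stop/GetResult.
type tracer struct {
	mu     sync.Mutex
	reason error // guarded by mu
}

// Stop may be called from the watchdog goroutine.
func (t *tracer) Stop(err error) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.reason = err
}

// GetResult may be called from the RPC handler goroutine.
func (t *tracer) GetResult() (string, error) {
	t.mu.Lock()
	defer t.mu.Unlock()
	return "", t.reason
}
```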
This is an alternative PR for
https://github.com/ethereum/go-ethereum/pull/34746.
This PR implements the second approach among the two possible solutions
mentioned in the above PR.
Requests for unavailable items are possible when the peer is following a
different fork from us. However, this is not expected to happen
frequently. Considering the amount of complexity added to the codebase,
the simpler approach (this PR) may be preferable.
The reconstruct callback indexes parallel response slices (bodies,
receipts). Passing the accept counter as that index picked the wrong
element when an earlier header in the same batch hit a stale slot.
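The difference between the two indices can be seen in a reduced sketch. This is one reading of the bug rather than the actual queue code: `deliver` and the `stale` slice are hypothetical.

```go
package main

// deliver walks a response in order. stale[i] marks headers whose slot is
// stale and must be skipped. reconstruct receives i, the position in the
// response, so it can index the parallel slices correctly. Passing the
// accepted counter instead would drift after the first skipped header.
func deliver(stale []bool, reconstruct func(index int)) int {
	accepted := 0
	for i, skip := range stale {
		if skip {
			continue
		}
		reconstruct(i)
		accepted++
	}
	return accepted
}
```

With three headers where the second is stale, the callback sees indices 0 and 2. Using the counter would have produced 0 and 1 and paired the third header with the second body.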