Previously, handshake timeouts were recorded as generic peer errors
instead of timeout errors. waitForHandshake passed a raw
p2p.DiscReadTimeout into markError, but markError classified errors only
via errors.Unwrap(err), which returns nil for non-wrapped errors. As a
result, the timeoutError meter was never incremented and all such
failures fell into the peerError bucket.
This change makes markError switch on the base error, using
errors.Unwrap(err) when available and falling back to the original error
otherwise. With this adjustment, p2p.DiscReadTimeout is correctly mapped
to timeoutError, while existing behaviour for the other wrapped sentinel
errors remains unchanged
---------
Co-authored-by: lightclient <lightclient@protonmail.com>
The fetcher should not fetch transactions that are already on chain.
Until now we were only checking in the txpool, but that does not have
the old transaction. This was leading to extra fetches of transactions
that were announced by a peer but are already on chain.
Here we extend the check to the chain as well.
All five `revert*Request` functions (account, bytecode, storage,
trienode heal, bytecode heal) remove the request from the tracked set
but never restore the peer to its corresponding idle pool. When a
request times out and no response arrives, the peer is permanently lost
from the idle pool, preventing new work from being assigned to it.
In normal operation mode (snap-sync full state) this bug is masked by
pivot movement (which resets idle pools via new Sync() cycles every ~15
minutes) and peer churn (reconnections re-add peers via Register()).
However in scenarios like the one I have running my (partial-stateful
node)[https://github.com/ethereum/go-ethereum/pull/33764] with
long-running sync cycles and few peers, all peers can eventually leak
out of the idle pools, stalling sync entirely.
Fix: after deleting from the request map, restore the peer to its idle
pool if it is still registered (guards against the peer-drop path where
Unregister already removed the peer). This mirrors the pattern used in
all five On* response handlers.
This only seems to manifest in peer-thirstly scenarios as where I find
myself when testing snapsync for the partial-statefull node).
Still, thought was at least good to raise this point. Unsure if required
to discuss or not
Adds `--opcode.count=<file>` flag to `evm t8n` that writes per-opcode
execution frequency counts to a JSON file (relative to
`--output.basedir`).
---------
Co-authored-by: MariusVanDerWijden <m.vanderwijden@live.de>
Co-authored-by: Sina Mahmoodi <itz.s1na@gmail.com>
This changes the p2p protocol handlers to delay message decoding. It's
the first part of a larger change that will delay decoding all the way
through message processing. For responses, we delay the decoding until
it is confirmed that the response matches an active request and does not
exceed its limits.
In order to make this work, all messages have been changed to use
rlp.RawList instead of a slice of the decoded item type. For block
bodies specifically, the decoding has been delayed all the way until
after verification of the response hash.
The role of p2p/tracker.Tracker changes significantly in this PR. The
Tracker's original purpose was to maintain metrics about requests and
responses in the peer-to-peer protocols. Each protocol maintained a
single global Tracker instance. As of this change, the Tracker is now
always active (regardless of metrics collection), and there is a
separate instance of it for each peer. Whenever a response arrives, it
is first verified that a request exists for it in the tracker. The
tracker is also the place where limits are kept.
This PR fixes a panic in a corner case situation when a `ChainEvent` is
received by `eth.Ethereum.updateFilterMapsHeads()` but the given chain
section does not exist in `BlockChain` any more. This can happen during
chain rewind because chain events are processed asynchronously. Ignoring
the event in this case is ok, the final event will point to the final
rewound head and the indexer will be updated.
Note that similar issues will not happen once we transition to
https://github.com/ethereum/go-ethereum/pull/32292 and the new indexer
built on top of this. Until then, the current fix should be fine.
The upstream libray has removed the assembly-based implementation of
keccak. We need to maintain our own library to avoid a peformance
regression.
---------
Co-authored-by: lightclient <lightclient@protonmail.com>
I recently went on a longer flight and started profiling the geth block
production pipeline.
This PR contains a bunch of individual fixes split into separate
commits.
I can drop some if necessary.
Benchmarking is not super easy, the benchmark I wrote is a bit
non-deterministic.
I will try to write a better benchmark later
```
goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/miner
cpu: Intel(R) Core(TM) Ultra 7 155U
│ /tmp/old.txt │ /tmp/new.txt │
│ sec/op │ sec/op vs base │
BuildPayload-14 141.5µ ± 3% 146.0µ ± 6% ~ (p=0.346 n=200)
│ /tmp/old.txt │ /tmp/new.txt │
│ B/op │ B/op vs base │
BuildPayload-14 188.2Ki ± 4% 177.4Ki ± 4% -5.71% (p=0.018 n=200)
│ /tmp/old.txt │ /tmp/new.txt │
│ allocs/op │ allocs/op vs base │
BuildPayload-14 2.703k ± 4% 2.453k ± 5% -9.25% (p=0.000 n=200)
```
Implement standardized JSON format for slow block logging to enable
cross-client performance analysis and protocol research.
This change is part of the Cross-Client Execution Metrics initiative
proposed by Gary Rong: https://hackmd.io/dg7rizTyTXuCf2LSa2LsyQ
The standardized metrics enabled data-driven analysis like the EIP-7907
research: https://ethresear.ch/t/data-driven-analysis-on-eip-7907/23850
JSON format includes:
- block: number, hash, gas_used, tx_count
- timing: execution_ms, total_ms
- throughput: mgas_per_sec
- state_reads: accounts, storage_slots, bytecodes, code_bytes
- state_writes: accounts, storage_slots, bytecodes
- cache: account/storage/code hits, misses, hit_rate
This should come after merging #33522
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Adding an RPC flag to limit the block range size for eth_getLogs and
eth_newFilter requests.
closing https://github.com/ethereum/go-ethereum/issues/24508
---------
Co-authored-by: MariusVanDerWijden <m.vanderwijden@live.de>
The core part of this PR that we need to adopt is to move the code and
nonce change hook invocations to occur at tx finalization, instead of
when the selfdestruct opcode is called.
Additionally:
* remove `SelfDestruct6780` now that it is essentially the same as
`SelfDestruct` just gated by `is new contract`
* don't duplicate `BalanceIncreaseSelfdestruct` (transfer to recipient
of selfdestruct) in the hooked statedb and in the opcode handler for the
selfdestruct opcode.
* balance is burned immediately when the beneficiary of the selfdestruct
is the sender, and the contract was created in the same transaction.
Previously we emit two balance increases to the recipient (see above
point), and a balance decrease from the sender.
---------
Co-authored-by: Sina Mahmoodi <itz.s1na@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Co-authored-by: lightclient <lightclient@protonmail.com>
Remove a large amount of duplicate code from the tx_fetcher tests.
---------
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Co-authored-by: lightclient <lightclient@protonmail.com>
### Description
Add a new `OnStateUpdate` hook which gets invoked after state is
committed.
### Rationale
For our particular use case, we need to obtain the state size metrics at
every single block when fuly syncing from genesis. With the current
state sizer, whenever the node is stopped, the background process must
be freshly initialized. During this re-initialization, it can skip some
blocks while the node continues executing blocks, causing gaps in the
recorded metrics.
Using this state update hook allows us to customize our own data
persistence logic, and we would never skip blocks upon node restart.
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
In this PR, two things have been fixed:
---
(a) truncate the stale beacon headers with latest snap block
Originally, b.filled is used as the indicator for deleting stale beacon headers.
This field is set only after synchronization has been scheduled, under the
assumption that the skeleton chain is already linked to the local chain.
However, the local chain can be mutated via `debug_setHead`, which may
cause `b.filled` outdated. For instance, `b.filled` refers to the last head snap block
in the last sync cycle while after `debug_setHead`, the head snap block has been
rewounded to 1.
As a result, Geth can enter an unintended loop: it repeatedly downloads
the missing beacon headers for the skeleton chain and attempts to schedule the
actual synchronization, but in the final step, all recently fetched headers are removed
by `cleanStales` due to the stale `b.filled` value.
This issue is addressed by always using the latest snap block as the indicator,
without relying on any cached value. However, note that before the skeleton
chain is linked to the local chain, the latest snap block will always be below
skeleton.tail, and this condition should not be treated as an error.
---
(b) merge the subchains once the skeleton chain links to local chain
Once the skeleton chain links with local one, it will try to schedule the
synchronization by fetching the missing blocks and import them then.
It's possible the last subchain already overwrites the previous subchain and
results in having two subchains leftover. As a result, an error log will printed
https://github.com/ethereum/go-ethereum/blob/master/eth/downloader/skeleton.go#L1074
This PR removes the legacy sidecar conversion logic.
After the Osaka fork, the blobpool will accept only blob sidecar version
1.
Any remaining version 0 blob transactions, if they still exist, will no
longer
be eligible for inclusion.
Note that conversion at the RPC layer is still supported, and version 0
blob
transactions will be automatically converted to version 1 there.
This PR fixes the bug reported in #33365.
The impact of the bug is not catastrophic. After a transaction is
ultimately fetched, validation and propagation will be performed based
on the fetched body, and any response with a mismatched type is treated
as a protocol violation. An attacker could only waste the limited
portion of victim’s bandwidth at most.
However, the reasons for submitting this PR are as follows
1. Fetching a transaction announced with an arbitrary type is a weird
behavior.
2. It aligns with efforts such as EIP-8077 and #33119 to make the
fetcher smarter and reduce bandwidth waste.
Regarding the `FilterType` function, it could potentially be implemented
by modifying the Filter function's parameter itself, but I wasn’t sure
whether changing that function is acceptable, so I left it as is.
This moves the tracking of the current syncmode into the downloader, fixing an
issue where the syncmode being requested through the engine API could go
out-of-sync with the actual mode being performed by downloader.
Fixes#32629
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
This improves the error code for cases where invalid query parameters
are submitted to `eth_getLogs`. I also improved the error message that
is emitted when querying into the future.
This is to benchmark how much the internal parts of GetBlobsV2 take.
This is not an RPC-level benchmark, so JSON-RPC overhead is not
included.
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
This PR introduces a new debug feature, logging the slow blocks with
detailed performance statistics, such as state read, EVM execution and
so on.
Notably, the detailed performance statistics of slow blocks won't be
logged during the sync to not overwhelm users. Specifically, the statistics
are only logged if there is a single block processed.
Example output
```
########## SLOW BLOCK #########
Block: 23537063 (0xa7f878611c2dd27f245fc41107d12ebcf06b4e289f1d6acf44d49a169554ee09) txs: 248, mgasps: 202.99
EVM execution: 63.295ms
Validation: 1.130ms
Account read: 6.634ms(648)
Storage read: 17.391ms(1434)
State hash: 6.722ms
DB commit: 3.260ms
Block write: 1.954ms
Total: 99.094ms
State read cache: account (hit: 622, miss: 26), storage (hit: 1325, miss: 109)
##############################
```
We still default to legacy txes for methods like eth_sendTransaction,
eth_signTransaction. We can default to 0x2 and if someone would like to
stay on legacy they can do so by setting the `gasPrice` field.
cc @deffrian
This adds checks into getPayload to ensure the correct version is called
for the fork which applies to the payload.
---------
Co-authored-by: jsvisa <delweng@gmail.com>
While updating to latest Geth, I noticed `OnCodeChangeV2` was not
properly handled in `SelfDestruct/6780`, this PR fixes this and bring a
unit test. Let me know if it's deemed more approriate to merge the tests
with the other one.
The periodic sealing loop failed to reset its timer when sealBlock
returned an error, causing the timer to never fire again and effectively
halting block production in developer periodic mode after the first
failure. This is a bug because the loop relies on the timer to trigger
subsequent sealing attempts, and transient errors (e.g., pool races or
chain rewinds) should not permanently stop the loop. The change moves
timer.Reset after the sealing attempt unconditionally, ensuring the loop
continues ticking and retrying even when sealing fails, which matches
how other periodic timers in the codebase behave and preserves forward
progress.