Remove the custom topN generic function. Use slices.SortedFunc
(creates a sorted copy from an iterator) + slices.DeleteFunc (filters
score <= 0) from the standard library. No custom generics needed.
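A minimal sketch of the standard-library approach, assuming Go 1.23 or
later (for `slices.SortedFunc` and `slices.Values`). The `peer` type, the
`topByScore` name and the score callback are illustrative stand-ins, not
the actual dropper code:

```go
package main

import (
	"cmp"
	"slices"
)

// peer is a hypothetical stand-in for *p2p.Peer.
type peer struct{ id string }

// topByScore returns at most n peers with a positive score, best first.
// The input slice is left untouched because SortedFunc builds a new slice.
func topByScore(peers []*peer, score func(*peer) float64, n int) []*peer {
	sorted := slices.SortedFunc(slices.Values(peers), func(a, b *peer) int {
		return cmp.Compare(score(b), score(a)) // descending by score
	})
	// Peers with score <= 0 never qualify for protection.
	sorted = slices.DeleteFunc(sorted, func(p *peer) bool { return score(p) <= 0 })
	if n < 0 {
		n = 0
	}
	return sorted[:min(n, len(sorted))]
}
```

Because `slices.DeleteFunc` only ever sees the sorted copy, the caller's
slice keeps its original order and contents.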
Replace peerWithStats wrapper, manual slice copying, and protectTopN
closure with a generic topN[T] function that sorts by score and
returns top elements. protectedPeers now works directly with
[]*p2p.Peer slices, building per-category score functions that close
over the stats map.
Compute the protected peer set once in dropRandomPeer via
protectedPeers(), then include protection as a condition in
selectDoNotDrop alongside trusted/static/recent checks. This
eliminates the separate filterProtectedPeers post-pass and the
awkward "all protected → skip" branch.
Rename filterProtectedPeers to protectedPeers, returning
map[*p2p.Peer]bool instead of filtering a slice. The map is
checked directly in selectDoNotDrop via protected[p].
enqueueAndTrack used pool.Has() after Enqueue to determine accepted
txs. Under concurrent delivery of the same tx from two peers, both
could see Has()==true, making attribution non-deterministic.
Add an onAccepted callback to the fetcher, called from Enqueue with
(peer, acceptedHashes) immediately after pool.Add returns for each
batch. Attribution happens atomically inside Enqueue using the per-tx
error from addTxs (nil = accepted), before another goroutine can
race.
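The attribution step can be sketched as follows. This is a reduced model,
not the fetcher's real code: `hash`, `tx`, the string peer id and
`errAlreadyKnown` are hypothetical stand-ins, and only the `addTxs` and
`onAccepted` names come from the description above.

```go
package main

import "errors"

// hash and tx are hypothetical stand-ins for common.Hash and *types.Transaction.
type hash string
type tx struct{ h hash }

// errAlreadyKnown is an illustrative pool rejection error.
var errAlreadyKnown = errors.New("already known")

// fetcher is a reduced sketch holding only the two hooks discussed above.
type fetcher struct {
	addTxs     func(txs []*tx) []error      // one error per tx, nil = accepted
	onAccepted func(peer string, hs []hash) // optional notification hook
}

// Enqueue adds a batch to the pool and credits the delivering peer using the
// per-tx result of this very call, instead of querying the pool afterwards.
func (f *fetcher) Enqueue(peer string, txs []*tx) {
	errs := f.addTxs(txs)
	var accepted []hash
	for i, err := range errs {
		if err == nil {
			accepted = append(accepted, txs[i].h)
		}
	}
	if f.onAccepted != nil && len(accepted) > 0 {
		f.onAccepted(peer, accepted)
	}
}
```

Since the pool reports "already known" to the second deliverer, only the
first peer's call yields a nil error for that transaction.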
Remove the enqueueAndTrack helper from handler_eth.go — the fetcher
now handles notification directly.
protectTopN used maxPeers (configured capacity) to compute the
number of peers to protect. With small droppable sets this could
protect everyone, permanently disabling churn.
Use len(entries) (current droppable count in each category) instead.
With 20 droppable dialed peers and 10% fraction, 2 are protected.
With 3 droppable peers, 0 are protected — churn is never blocked.
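The arithmetic above, as a small sketch. Expressing the fraction as an
integer percentage is an assumption made here to keep the flooring
explicit; the actual code may use a floating-point fraction.

```go
package main

// protectCount returns how many of the currently droppable peers in one
// category may be protected. Integer division floors the result, so small
// droppable sets protect nobody and churn stays possible.
func protectCount(droppable, percent int) int {
	return droppable * percent / 100
}
```

With `protectCount(20, 10)` the result is 2, and with `protectCount(3, 10)`
it is 0, matching the two cases in the description.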
Peer stats were never pruned, so the peers map grew with every peer
ever seen. The EMA decay loop and stats copy iterated all historical
peers on every block/query.
Add NotifyPeerDrop(peer) that deletes the peer's stats entry. Called
from handler.unregisterPeer alongside txFetcher.Drop.
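A minimal sketch of the pruning hook, assuming the tracker keeps its
per-peer stats in a mutex-guarded map. The field names and the string
peer id are illustrative; only `NotifyPeerDrop` is named in the text.

```go
package main

import "sync"

// peerStats is a hypothetical per-peer record.
type peerStats struct {
	Finalized uint64
	RecentEMA float64
}

// tracker is a reduced sketch of the stats holder.
type tracker struct {
	mu    sync.Mutex
	peers map[string]*peerStats
}

// NotifyPeerDrop forgets a disconnected peer so that the decay loop and
// stats snapshots only ever iterate over connected peers.
func (t *tracker) NotifyPeerDrop(peer string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.peers, peer) // no-op if the peer has no entry
}
```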
handleChainHead fetched the block by number only. If the tracker
goroutine lagged and that height was reorged before processing,
the EMA was computed from the wrong canonical block.
Use GetBlock(hash, number) with the header hash from the event to
fetch the exact block the event refers to, not whatever is currently
canonical at that height.
NotifyReceived was called before pool validation, allowing a peer
to claim deliverer credit by replaying already-included txs or
sending invalid packets.
Rename to NotifyAccepted (takes hashes, not full txs). Call it from
a new enqueueAndTrack helper in handler_eth.go that runs after
Enqueue and checks pool.Has to identify accepted txs. Only accepted
txs are credited to the delivering peer.
lastFinalNum started at 0, so the first checkFinalization after
startup iterated from block 1 to the current finalized head (~20M
blocks on mainnet) under the mutex, stalling the tracker and
potentially awarding bogus credit for ancient txs whose hashes
happened to match recently-received ones.
Seed lastFinalNum from chain.CurrentFinalBlock() in Start() so only
blocks finalized after startup are processed.
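The effect of seeding the cursor can be sketched like this. The
`finalTracker` type and its `processed` slice are hypothetical and exist
only to make the visited range observable; the real tracker credits peers
instead of recording block numbers.

```go
package main

// finalTracker is a reduced sketch of the finalization cursor.
type finalTracker struct {
	lastFinalNum uint64
	processed    []uint64 // block numbers handled, recorded for illustration
}

// start seeds the cursor with the finalized height known at startup (the
// number chain.CurrentFinalBlock() would report), so history is skipped.
func (t *finalTracker) start(currentFinal uint64) {
	t.lastFinalNum = currentFinal
}

// checkFinalization handles only blocks in the range (lastFinalNum, final].
func (t *finalTracker) checkFinalization(final uint64) {
	for n := t.lastFinalNum + 1; n <= final; n++ {
		t.processed = append(t.processed, n)
	}
	if final > t.lastFinalNum {
		t.lastFinalNum = final
	}
}
```

Without the call to `start`, the first `checkFinalization` would walk every
block from 1 up to the finalized head.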
txtracker tests (7 tests):
- NotifyReceived: stats empty before chain events
- InclusionEMA: EMA increases on inclusion, decays on empty blocks
- Finalization: Finalized counter credited after finalization
- MultiplePeers: each peer credited for own txs only
- FirstDelivererWins: duplicate delivery ignored
- NoFinalizationCredit: no credit without finalization
- EMADecay: EMA approaches zero after 30 empty blocks
dropper tests (6 tests):
- FilterProtectedNoStats: nil stats → all droppable
- FilterProtectedEmptyStats: empty map → all droppable
- FilterProtectedTopPeer: top-scored peers removed from droppable
- FilterProtectedZeroScore: zero scores → no protection
- FilterProtectedOverlap: peer top in both categories → counted once
- FilterProtectedAllProtected: all droppable protected → empty list
Also fix: create peer entries during EMA update for peers with
inclusions in the current block (previously only created during
finalization, so EMA was not tracked before first finalization).
Expand the txtracker package doc to describe the tracking flow
(NotifyReceived → chain head → finalization → peer credit) and its
role as stats provider for the dropper.
Rewrite the dropper struct comment to document the full behavior
including the inclusion-based peer protection: two scoring categories
(total finalized + recent EMA), top 10% per pool, union of protected
sets.
Change the long-term protection category from total inclusions to
total finalized inclusions. Finalized txs are harder to game (require
actual block finality, not just inclusion) and represent confirmed
on-chain value.
The recent-inclusion EMA stays on chain head inclusions for
responsiveness — a peer delivering txs that appear in the latest
blocks gets quick protection without waiting for finalization.
The tracker now checks CurrentFinalBlock() on each chain head event
and credits delivering peers for all newly finalized blocks since
the last check.
Minimal txtracker that records which peer delivered each transaction
and credits peers when their transactions appear on chain. Provides
the PeerInclusionStats needed by the dropper's protection logic.
Design:
- NotifyReceived(peer, txs): records deliverer per tx hash (called
from handler_eth.go when tx bodies arrive via P2P)
- Subscribes to ChainHeadEvent, fetches block txs, credits the
delivering peer for each included tx
- Per-peer EMA of recent inclusions (alpha=0.05), updated every block
- LRU eviction at 262K entries to bound memory
- Mutex-based (not channel-based) for simplicity — the hot path
(NotifyReceived) is a fast map insert
Wired into the dropper via an adapter callback in backend.go that
converts txtracker.PeerStats to the dropper's PeerInclusionStats.
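For reference, a per-block EMA update with alpha=0.05 could look like the
sketch below. The exact formula is not spelled out above, so the standard
exponential moving average form is assumed here, and `updateEMA` is a
hypothetical name.

```go
package main

// alpha is the smoothing factor named in the design notes.
const alpha = 0.05

// updateEMA folds one block's inclusion count for a peer into its running
// average. An empty block (included == 0) decays the value by a factor of 0.95.
func updateEMA(prev float64, included int) float64 {
	return (1-alpha)*prev + alpha*float64(included)
}
```

Under this form, 30 consecutive empty blocks scale a peer's EMA by
0.95^30, roughly 0.21 of its previous value.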
The dropper periodically disconnects random peers to create churn.
This was blind to peer quality. Add inclusion-based peer protection
using two categories:
1. Total inclusions: protects peers with the highest cumulative
count of delivered txs that were included on chain
2. Recent inclusions (EMA): protects peers with the best recent
inclusion rate, giving newly productive peers faster protection
Each category independently protects the top 10% of inbound and
top 10% of dialed peers. The union of both sets is protected. Only
peers with positive scores qualify.
The dropper defines its own PeerInclusionStats struct and callback
type (getPeerInclusionStatsFunc) so any stats provider (e.g. a
transaction tracker) can plug in without a package dependency. The
callback is nil by default (protection disabled until wired).
The protectionCategories slice is designed for easy extension —
adding a new category requires only appending a struct with a name,
scoring function, and protection fraction.
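One possible shape for such a category slice, assuming Go 1.23 or later.
All identifiers here (`category`, `stats`, `protectedSet`, string peer ids)
are illustrative and not taken from the dropper; the sketch handles a
single pool, so it would be called once for inbound and once for dialed
peers.

```go
package main

import (
	"cmp"
	"slices"
)

// stats is a hypothetical stand-in for PeerInclusionStats.
type stats struct {
	Inclusions uint64
	RecentEMA  float64
}

// category describes one protection rule: a name, a score and a fraction.
type category struct {
	name  string
	score func(stats) float64
	frac  float64
}

// categories is extended by appending another entry.
var categories = []category{
	{"total-inclusions", func(s stats) float64 { return float64(s.Inclusions) }, 0.1},
	{"recent-inclusions", func(s stats) float64 { return s.RecentEMA }, 0.1},
}

// protectedSet returns the union of every category's top peers in one pool.
func protectedSet(pool []string, st map[string]stats) map[string]bool {
	protected := make(map[string]bool)
	for _, c := range categories {
		n := int(float64(len(pool)) * c.frac)
		ranked := slices.SortedFunc(slices.Values(pool), func(a, b string) int {
			return cmp.Compare(c.score(st[b]), c.score(st[a])) // best first
		})
		for _, id := range ranked[:min(n, len(ranked))] {
			if c.score(st[id]) > 0 { // only positive scores qualify
				protected[id] = true
			}
		}
	}
	return protected
}
```

A peer that ranks at the top of both categories appears once in the
resulting map, which is what makes the union cheap to check.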
This PR refactors the encoding rules for `AccessListsPacket` in the wire
protocol. Specifically:
- The response is now encoded as a list of `rlp.RawValue`
- `rlp.EmptyString` is used as a placeholder for unavailable BAL objects
In this PR, the Database interface in `core/state` has been extended
with one more function:
```go
// Iteratee returns a state iteratee associated with the specified state root,
// through which the account iterator and storage iterator can be created.
Iteratee(root common.Hash) (Iteratee, error)
```
With this additional abstraction layer, the implementation details can be hidden
behind the interface. For example, state traversal can now operate directly on
the flat state for Verkle or binary trees, which do not natively support traversal.
Moreover, state dumping will now prefer using the flat state iterator as
the primary option, offering better efficiency.
Edit: this PR also fixes a tiny issue in the state dump, marshalling the
next field in the correct way.
This is a breaking change in the opcode (structLog) tracer. Several fields
will have a slight formatting difference to conform to the newly established
spec at: https://github.com/ethereum/execution-apis/pull/762. The differences
include:
- `memory`: words will have the 0x prefix. The last word of memory will also be padded to 32 bytes.
- `storage`: keys and values will have the 0x prefix.
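
As an illustration of the `memory` rule, the sketch below splits a byte
slice into 0x-prefixed 32-byte words. `formatMemory` is a hypothetical
helper written for this note, and zero-padding the last word on the right
is an assumption about how the padding is applied.

```go
package main

import "encoding/hex"

// formatMemory renders memory as 0x-prefixed hex words of exactly 32 bytes.
// A short trailing word is zero-padded on the right (assumed here).
func formatMemory(mem []byte) []string {
	words := make([]string, 0, (len(mem)+31)/32)
	for i := 0; i < len(mem); i += 32 {
		var word [32]byte // starts zeroed, which provides the padding
		copy(word[:], mem[i:min(i+32, len(mem))])
		words = append(words, "0x"+hex.EncodeToString(word[:]))
	}
	return words
}
```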
---------
Co-authored-by: Sina M <1591639+s1na@users.noreply.github.com>
In this PR, we add support for protocol version eth/70, defined by EIP-7975.
Overall changes:
- Each response is buffered in the peer’s receipt buffer when the
`lastBlockIncomplete` field is true.
- A continued request uses the same request id as its original
request (`RequestPartialReceipts`).
- Partial responses are verified in `validateLastBlockReceipt`.
- Even if all receipts for partial blocks of the request are collected,
those partial results are not delivered to the downloader, to avoid
complexity. This assumes that partial responses and buffering occur only
in exceptional cases.
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Co-authored-by: Felix Lange <fjl@twurst.com>
This PR introduces a new type, HistoryPolicy, which captures user intent,
as opposed to the pruning point stored in the blockchain, which persists
the actual tail of data in the database.
It is in preparation for the rolling history expiry feature.
It comes with a semantic change: if the database was pruned and geth is
running without a history mode flag (or an explicit keep-all flag), geth
will emit a warning but continue running, as opposed to stopping the
world.
`TestSubscribePendingTxHashes` hangs indefinitely because pending tx
events are permanently missed due to a race condition in
`NewPendingTransactions` (and `NewHeads`). Both handlers called their
event subscription functions (`SubscribePendingTxs`,
`SubscribeNewHeads`) inside goroutines, so the RPC handler returned the
subscription ID to the client before the filter was installed in the
event loop. When the client then sent a transaction, the event fired but
no filter existed to catch it — the event was silently lost.
- Move `SubscribePendingTxs` and `SubscribeNewHeads` calls out of
goroutines so filters are installed synchronously before the RPC
response is sent, matching the pattern already used by `Logs` and
`TransactionReceipts`
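
The ordering that the fix relies on can be sketched with a toy event feed.
Everything here (`feed`, `subscribe`, `send`, the channel sizes) is
hypothetical and much simpler than the real filter system; the point is
only that the subscription is installed before the function returns, while
the forwarding loop alone runs in a goroutine.

```go
package main

import "sync"

// feed is a hypothetical, minimal event feed.
type feed struct {
	mu   sync.Mutex
	subs []chan string
}

// subscribe installs a buffered subscriber channel and returns it.
func (f *feed) subscribe() <-chan string {
	ch := make(chan string, 16)
	f.mu.Lock()
	f.subs = append(f.subs, ch)
	f.mu.Unlock()
	return ch
}

// send delivers v to every installed subscriber and reports how many got it.
func (f *feed) send(v string) int {
	f.mu.Lock()
	defer f.mu.Unlock()
	for _, ch := range f.subs {
		ch <- v
	}
	return len(f.subs)
}

// newPendingTransactions subscribes synchronously, so an event fired right
// after this function returns cannot be missed.
func newPendingTransactions(f *feed) <-chan string {
	events := f.subscribe() // installed before the caller gets a response
	out := make(chan string, 16)
	go func() {
		for v := range events { // only forwarding is asynchronous
			out <- v
		}
	}()
	return out
}
```

Moving `f.subscribe()` into the goroutine would reintroduce the race: a
`send` issued immediately after the return could find no subscriber.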
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: s1na <1591639+s1na@users.noreply.github.com>
This PR contains two changes:
First, the finalized header is now resolved from the local chain if it
was not recently announced via `engine_newPayload`.
More importantly, the downloader originally had two code paths that push
forward the pivot block: one in the beacon header fetcher
(`fetchHeaders`) and another in the snap content processor
(`processSnapSyncContent`).
Usually, if there are new blocks and the local pivot block becomes stale,
this is first detected by `fetchHeaders`. `processSnapSyncContent` is
fully driven by the beacon headers and only detects the stale pivot
block after synchronizing the corresponding chain segment. I think the
detection there is redundant and useless.
This PR fixes a regression introduced in https://github.com/ethereum/go-ethereum/pull/33836/changes
Before PR 33836, running mainnet would automatically bump the cache size
to 4GB and trigger a cache re-calculation, specifically setting the key-value
database cache to 2GB.
After PR 33836, this logic was removed, and the cache value is no longer
recomputed if no command line flags are specified. The default key-value
database cache is 512MB.
This PR bumps the default key-value database cache size alongside the
default cache size for other components (such as snapshot) accordingly.
I observed failing tests in Hive `engine-withdrawals`:
-
https://hive.ethpandaops.io/#/test/generic/1772351960-ad3e3e460605c670efe1b4f4178eb422?testnumber=146
-
https://hive.ethpandaops.io/#/test/generic/1772351960-ad3e3e460605c670efe1b4f4178eb422?testnumber=147
```shell
DEBUG (Withdrawals Fork on Block 2): NextPayloadID before getPayloadV2:
id=0x01487547e54e8abe version=1
>> engine_getPayloadV2("0x01487547e54e8abe")
<< error: {"code":-38005,"message":"Unsupported fork"}
FAIL: Expected no error on EngineGetPayloadV2: error=Unsupported fork
```
The same failure pattern occurred for Block 3.
Per Shanghai engine_getPayloadV2 spec, pre-Shanghai payloads should be
accepted via V2 and returned as ExecutionPayloadV1:
- executionPayload: ExecutionPayloadV1 | ExecutionPayloadV2
- ExecutionPayloadV1 MUST be returned if payload timestamp < Shanghai
timestamp
- ExecutionPayloadV2 MUST be returned if payload timestamp >= Shanghai
timestamp
Reference:
-
https://github.com/ethereum/execution-apis/blob/main/src/engine/shanghai.md#engine_getpayloadv2
The current implementation only allows GetPayloadV2 in the Shanghai fork
window (`[]forks.Fork{forks.Shanghai}`), so pre-Shanghai payloads are
rejected with Unsupported fork.
If my interpretation of the spec is incorrect, please let me know and I
can adjust accordingly.
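To make the intended behaviour concrete, here is a small sketch of the
rule quoted from the spec and of a relaxed fork check. The `fork` enum and
both function names are hypothetical and do not mirror geth's `forks`
package or the actual patch.

```go
package main

// fork is a hypothetical, ordered fork identifier.
type fork int

const (
	paris fork = iota
	shanghai
	cancun
)

// getPayloadV2Allowed accepts pre-Shanghai (Paris) and Shanghai payloads,
// instead of Shanghai only.
func getPayloadV2Allowed(f fork) bool {
	return f == paris || f == shanghai
}

// payloadVersionForV2 picks the structure engine_getPayloadV2 returns:
// ExecutionPayloadV1 before the Shanghai timestamp, ExecutionPayloadV2 after.
func payloadVersionForV2(timestamp, shanghaiTime uint64) int {
	if timestamp < shanghaiTime {
		return 1
	}
	return 2
}
```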
---------
Co-authored-by: muzry.li <muzry.li1@ambergroup.io>
Eth currently has a flaky test, related to the tx fetcher.
The issue seems to happen when Unsubscribe is called while sub is nil.
It seems that chain.Stop() may be invoked before the loop starts in some
tests, but the exact cause is still under investigation through repeated
runs. I think this change will at least prevent the error.
With this, we are dropping support for protocol version eth/68. The only supported
version is eth/69 now. The p2p receipt encoding logic can be simplified a lot, and
processing of receipts during sync gets a little faster because we now transform
the network encoding into the database encoding directly, without decoding the
receipts first.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
Fix the flaky test found in
https://ci.appveyor.com/project/ethereum/go-ethereum/builds/53601688/job/af5ccvufpm9usq39
1. Increase the timeout from 3+1s to 15s, and use a timer instead of
sleep (in the CI env, it may need more time to sync the 1024 blocks)
2. Add `synced.Load()` to ensure the full async chain is finished
Signed-off-by: Delweng <delweng@gmail.com>
Previously, handshake timeouts were recorded as generic peer errors
instead of timeout errors. waitForHandshake passed a raw
p2p.DiscReadTimeout into markError, but markError classified errors only
via errors.Unwrap(err), which returns nil for non-wrapped errors. As a
result, the timeoutError meter was never incremented and all such
failures fell into the peerError bucket.
This change makes markError switch on the base error, using
errors.Unwrap(err) when available and falling back to the original error
otherwise. With this adjustment, p2p.DiscReadTimeout is correctly mapped
to timeoutError, while existing behaviour for the other wrapped sentinel
errors remains unchanged.
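
The fallback can be sketched as below. The sentinel errors, the `classify`
and `wrapped` names, and the "versionMismatch" bucket are hypothetical;
`errReadTimeout` stands in for p2p.DiscReadTimeout, and the returned
strings stand in for the meters that markError increments.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinels; errReadTimeout plays the role of p2p.DiscReadTimeout.
var (
	errReadTimeout     = errors.New("read timeout")
	errVersionMismatch = errors.New("protocol version mismatch")
)

// classify switches on the base error: the unwrapped error when there is
// one, otherwise the error itself (the raw sentinel case).
func classify(err error) string {
	base := errors.Unwrap(err)
	if base == nil {
		base = err
	}
	switch base {
	case errReadTimeout:
		return "timeoutError"
	case errVersionMismatch:
		return "versionMismatch"
	default:
		return "peerError"
	}
}

// wrapped builds the other input shape: a sentinel wrapped with detail.
func wrapped(base error, detail string) error {
	return fmt.Errorf("%w: %s", base, detail)
}
```

Both `classify(errReadTimeout)` and
`classify(wrapped(errReadTimeout, "handshake"))` land in the timeout
bucket, whereas a switch on `errors.Unwrap(err)` alone would send the raw
sentinel to the default case.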
---------
Co-authored-by: lightclient <lightclient@protonmail.com>
The fetcher should not fetch transactions that are already on chain.
Until now we only checked the txpool, but it does not retain
transactions that have already been included. This led to extra fetches
of transactions that were announced by a peer but are already on chain.
Here we extend the check to the chain as well.
All five `revert*Request` functions (account, bytecode, storage,
trienode heal, bytecode heal) remove the request from the tracked set
but never restore the peer to its corresponding idle pool. When a
request times out and no response arrives, the peer is permanently lost
from the idle pool, preventing new work from being assigned to it.
In normal operation mode (snap-sync full state) this bug is masked by
pivot movement (which resets idle pools via new Sync() cycles every ~15
minutes) and peer churn (reconnections re-add peers via Register()).
However, in scenarios like the one I am running with my
[partial-stateful node](https://github.com/ethereum/go-ethereum/pull/33764), with
long-running sync cycles and few peers, all peers can eventually leak
out of the idle pools, stalling sync entirely.
Fix: after deleting from the request map, restore the peer to its idle
pool if it is still registered (guards against the peer-drop path where
Unregister already removed the peer). This mirrors the pattern used in
all five On* response handlers.
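A reduced sketch of the pattern, with one request kind instead of five.
The `syncer` fields and the `revertRequest` name are simplified stand-ins
for the snap syncer's per-type maps, not its real layout.

```go
package main

// syncer is a hypothetical, single-request-type model of the snap syncer.
type syncer struct {
	peers   map[string]struct{} // currently registered peers
	idlers  map[string]struct{} // peers free to take a new request
	pending map[uint64]string   // request id -> peer serving it
}

// revertRequest drops a timed-out request and hands the peer back to the
// idle pool, unless the peer has been unregistered in the meantime.
func (s *syncer) revertRequest(id uint64) {
	peer, ok := s.pending[id]
	if !ok {
		return
	}
	delete(s.pending, id)
	if _, registered := s.peers[peer]; registered {
		s.idlers[peer] = struct{}{}
	}
}
```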
This only seems to manifest in peer-starved scenarios, such as the one I
find myself in when testing snap sync for the partial-stateful node.
Still, I thought it was worth raising. I am unsure whether it needs
further discussion.
Adds `--opcode.count=<file>` flag to `evm t8n` that writes per-opcode
execution frequency counts to a JSON file (relative to
`--output.basedir`).
---------
Co-authored-by: MariusVanDerWijden <m.vanderwijden@live.de>
Co-authored-by: Sina Mahmoodi <itz.s1na@gmail.com>
This changes the p2p protocol handlers to delay message decoding. It's
the first part of a larger change that will delay decoding all the way
through message processing. For responses, we delay the decoding until
it is confirmed that the response matches an active request and does not
exceed its limits.
In order to make this work, all messages have been changed to use
rlp.RawList instead of a slice of the decoded item type. For block
bodies specifically, the decoding has been delayed all the way until
after verification of the response hash.
The role of p2p/tracker.Tracker changes significantly in this PR. The
Tracker's original purpose was to maintain metrics about requests and
responses in the peer-to-peer protocols. Each protocol maintained a
single global Tracker instance. As of this change, the Tracker is now
always active (regardless of metrics collection), and there is a
separate instance of it for each peer. Whenever a response arrives, it
is first verified that a request exists for it in the tracker. The
tracker is also the place where limits are kept.
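The check-before-decode idea can be sketched with a toy per-peer table.
This is not the p2p/tracker API: `peerTracker`, `track`, `fulfil`, the
item-count limit and both errors are hypothetical, and the sketch leaves
out locking and timeouts.

```go
package main

import "errors"

// Hypothetical errors for the two rejection cases.
var (
	errUnsolicited = errors.New("response without matching request")
	errTooLarge    = errors.New("response exceeds request limit")
)

// request records the limit that applies to the eventual response.
type request struct{ maxItems int }

// peerTracker is a hypothetical per-peer table of in-flight requests.
type peerTracker struct{ pending map[uint64]request }

// track registers an outgoing request and its response limit.
func (t *peerTracker) track(id uint64, maxItems int) {
	t.pending[id] = request{maxItems}
}

// fulfil validates a response by id and item count before any item is
// decoded; the caller decodes the payload only when nil is returned.
func (t *peerTracker) fulfil(id uint64, items int) error {
	req, ok := t.pending[id]
	if !ok {
		return errUnsolicited
	}
	delete(t.pending, id)
	if items > req.maxItems {
		return errTooLarge
	}
	return nil
}
```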
This PR fixes a panic in a corner case situation when a `ChainEvent` is
received by `eth.Ethereum.updateFilterMapsHeads()` but the given chain
section does not exist in `BlockChain` any more. This can happen during
chain rewind because chain events are processed asynchronously. Ignoring
the event in this case is ok, the final event will point to the final
rewound head and the indexer will be updated.
Note that similar issues will not happen once we transition to
https://github.com/ethereum/go-ethereum/pull/32292 and the new indexer
built on top of this. Until then, the current fix should be fine.
The upstream library has removed the assembly-based implementation of
keccak. We need to maintain our own library to avoid a performance
regression.
---------
Co-authored-by: lightclient <lightclient@protonmail.com>