Commit graph

17 commits

Author SHA1 Message Date
Csaba Kiraly
f24161de71 eth/txtracker: replace cumulative Finalized with slow RecentFinalized EMA
The total-finalized protection category ranked peers by a monotonic
cumulative count, so a peer that had been productive in the past kept
a high score forever — even if they had since gone silent — and held
a protected slot without contributing.

Replace txtracker.PeerStats.Finalized (int64 cumulative) with
RecentFinalized (float64 EMA). On each chain head, finalization
credits accumulated over the newly-finalized range are folded into a
slow EMA (alpha=0.0001, half-life ~6930 blocks ≈ 23 hours on 12s
mainnet blocks). Peers that continue contributing keep a high score;
peers that stop decay toward zero over roughly a day.
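
The half-life figure can be checked with a minimal sketch of a textbook EMA update. The function names below are hypothetical and the per-block update is an assumption; the real txtracker code may fold credits differently.

```go
package main

import (
	"fmt"
	"math"
)

// emaAlpha is the smoothing factor quoted in the commit message.
const emaAlpha = 0.0001

// updateEMA folds one sample of finalization credits into the running
// average. Hypothetical helper, not the txtracker implementation.
func updateEMA(prev, credits, alpha float64) float64 {
	return (1-alpha)*prev + alpha*credits
}

// halfLifeBlocks returns how many zero-credit updates it takes for the
// EMA to fall to half its value: n = ln(0.5) / ln(1 - alpha).
func halfLifeBlocks(alpha float64) float64 {
	return math.Log(0.5) / math.Log(1-alpha)
}

func main() {
	n := halfLifeBlocks(emaAlpha)
	hours := n * 12 / 3600 // assuming 12s mainnet blocks
	fmt.Printf("half-life: %.0f blocks, about %.1f hours\n", n, hours)
}
```

A peer that stops delivering keeps roughly half its score after one half-life and a quarter after two, which matches the "decay toward zero over roughly a day" description.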

The dropper category is renamed to "recent-finalized" accordingly. The
type's docstring is rewritten to describe both categories as EMAs
with different time horizons (slow finalized, fast included).

Refactor checkFinalization to return a per-peer credits map rather
than mutating state directly, so both EMAs update in the same loop
over tracked peers.
2026-04-19 12:14:23 +02:00
Csaba Kiraly
1f2ebc5d59 eth: drop PeerInclusionStats wrapper and use txtracker.PeerStats directly
PeerInclusionStats was declared identically to txtracker.PeerStats as a
decoupling abstraction: any stats provider could implement the dropper's
callback by returning this shape. In practice there's one provider and
the two types were kept in sync by a rote copy adapter in backend.go.

Delete PeerInclusionStats, have the dropper consume txtracker.PeerStats
directly via getPeerStatsFunc. backend.go now passes
txTracker.GetAllPeerStats as the callback with no adapter.

If a second stats provider ever appears, the abstraction can come back;
until then, one fewer type and 8 fewer lines of ceremony.
2026-04-15 14:35:37 +02:00
Csaba Kiraly
69a7baefd8 eth: drop skipped_protected metric and simplify dropper skip path
The skipped_protected metric (added earlier on this branch) counted the
subset of drop skips where inclusion protection was the cause. The
signal can be inferred from a rising dropSkipped rate plus the existing
"Protecting high-value peers" debug log, so it was not worth keeping the
second metric, the causality-check loop over the protected set, and the
baseNotDrop closure extracted solely to share the predicate.

Collapse baseNotDrop back into selectDoNotDrop and remove the metric.
dropSkipped still fires on every skip (fast-path headroom + all-filtered).
2026-04-13 19:13:12 +02:00
Csaba Kiraly
a7ce1e2ad8 eth: test per-pool top-N selection in dropper peer protection
The protection feature promises top-N per inbound/dialed pool, but
every existing test constructed peers via p2p.NewPeer (which produces
no-flag peers), so all test peers landed in the dialed pool and the
per-pool split was never validated.

Extract the selection logic from protectedPeers into a pure helper
protectedPeersByPool(inbound, dialed, stats) that accepts pre-split
pools. This sidesteps the unexported p2p.connFlag types and makes the
interesting behavior directly testable. Add three tests covering:

  - exact top-N selected independently in each pool
  - cross-category union with overlap deduplication
  - per-pool independence: top dialed peers stay protected even when
    every inbound peer scores higher globally
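
The per-pool independence property in the last bullet can be sketched as follows. The peer type, the fixed n, and the function names are hypothetical stand-ins (the real helper takes *p2p.Peer slices and a stats map, and derives the quota from a fraction).

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

// peer is a hypothetical stand-in for *p2p.Peer.
type peer struct{ id string }

// topN returns the n highest-scoring peers of one pool, skipping scores <= 0.
func topN(pool []*peer, score map[*peer]float64, n int) []*peer {
	ranked := make([]*peer, 0, len(pool))
	for _, p := range pool {
		if score[p] > 0 {
			ranked = append(ranked, p)
		}
	}
	// Highest score first.
	slices.SortStableFunc(ranked, func(a, b *peer) int {
		return cmp.Compare(score[b], score[a])
	})
	if len(ranked) > n {
		ranked = ranked[:n]
	}
	return ranked
}

// protectedByPool selects the top n peers of each pool independently and
// returns the union as a set.
func protectedByPool(inbound, dialed []*peer, score map[*peer]float64, n int) map[*peer]bool {
	protected := make(map[*peer]bool)
	for _, pool := range [][]*peer{inbound, dialed} {
		for _, p := range topN(pool, score, n) {
			protected[p] = true
		}
	}
	return protected
}

func main() {
	in1, in2 := &peer{"in1"}, &peer{"in2"}
	d1, d2 := &peer{"d1"}, &peer{"d2"}
	// Every inbound peer outscores every dialed peer.
	score := map[*peer]float64{in1: 90, in2: 80, d1: 5, d2: 1}
	protected := protectedByPool([]*peer{in1, in2}, []*peer{d1, d2}, score, 1)
	fmt.Println(protected[in1], protected[in2], protected[d1], protected[d2])
}
```

Because each pool is ranked on its own, d1 is protected even though in2 has a far higher score.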
2026-04-13 16:56:53 +02:00
Csaba Kiraly
832f206275 eth: split dropper skip metric into total and protection-caused
Rename eth/dropper/protected to eth/dropper/skipped and mark it on
every skip (fast-path headroom, all candidates un-droppable, or
protection emptied the list). Add eth/dropper/skipped_protected to
count the subset of skips where at least one otherwise-droppable
peer was kept only because of inclusion protection.

The pair lets operators see both the total churn-miss rate and how
often peer protection specifically is the cause.
2026-04-13 10:01:54 +02:00
Csaba Kiraly
ed3d5ab3da eth: skip protection work in dropper when pools have headroom
When neither the dialed nor inbound peer pool is close to capacity,
every non-trusted/non-static peer is already marked do-not-drop by
the pool-threshold rules in selectDoNotDrop, so the droppable set is
guaranteed empty regardless of inclusion protection.

Return early in that case to avoid the wasted peerStatsFunc call,
per-direction split, and per-category sort in protectedPeers.
2026-04-13 10:01:26 +02:00
Csaba Kiraly
803ac3c641 eth: improve dropper description
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
2026-04-10 12:52:17 +02:00
Csaba Kiraly
ecfdaa69c6 eth: replace topN helper with stdlib slices.SortedFunc + DeleteFunc
Remove the custom topN generic function. Use slices.SortedFunc
(creates a sorted copy from an iterator) + slices.DeleteFunc (filters
score <= 0) from the standard library. No custom generics needed.
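
A minimal sketch of that combination, assuming Go 1.23 or later for slices.SortedFunc and slices.Values. The peer type and topByScore are hypothetical; only the slices and cmp calls are standard library.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

// peer is a hypothetical stand-in for *p2p.Peer.
type peer struct{ id string }

// topByScore returns up to n peers with the highest positive score.
func topByScore(peers []*peer, score func(*peer) float64, n int) []*peer {
	// SortedFunc collects the iterator into a new slice and sorts it,
	// so the caller's slice is left untouched. Highest score first.
	sorted := slices.SortedFunc(slices.Values(peers), func(a, b *peer) int {
		return cmp.Compare(score(b), score(a))
	})
	// Drop peers whose score is not positive.
	sorted = slices.DeleteFunc(sorted, func(p *peer) bool { return score(p) <= 0 })
	if len(sorted) > n {
		sorted = sorted[:n]
	}
	return sorted
}

func main() {
	a, b, c := &peer{"a"}, &peer{"b"}, &peer{"c"}
	scores := map[*peer]float64{a: 3, b: 0, c: 7}
	top := topByScore([]*peer{a, b, c}, func(p *peer) float64 { return scores[p] }, 2)
	for _, p := range top {
		fmt.Println(p.id, scores[p])
	}
}
```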
2026-04-10 12:18:56 +02:00
Csaba Kiraly
aa5fa692e2 eth: simplify protectedPeers with generic topN helper
Replace peerWithStats wrapper, manual slice copying, and protectTopN
closure with a generic topN[T] function that sorts by score and
returns top elements. protectedPeers now works directly with
[]*p2p.Peer slices, building per-category score functions that close
over the stats map.
2026-04-10 12:18:56 +02:00
Csaba Kiraly
1c518be79f eth: simplify peer protection — compute protected set upfront
Compute the protected peer set once in dropRandomPeer via
protectedPeers(), then include protection as a condition in
selectDoNotDrop alongside trusted/static/recent checks. This
eliminates the separate filterProtectedPeers post-pass and the
awkward "all protected → skip" branch.

Rename filterProtectedPeers to protectedPeers, returning
map[*p2p.Peer]bool instead of filtering a slice. The map is
checked directly in selectDoNotDrop via protected[p].
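
A rough sketch of the resulting shape, with a hypothetical peer type and only the trusted/static conditions (the recent-connection check is omitted). It is not the actual selectDoNotDrop code.

```go
package main

import "fmt"

// peer is a hypothetical stand-in for *p2p.Peer with the flags the
// dropper cares about.
type peer struct {
	id      string
	trusted bool
	static  bool
}

// droppable returns the peers that may be dropped: everything that is not
// trusted, static, or in the precomputed protected set.
func droppable(peers []*peer, protected map[*peer]bool) []*peer {
	var out []*peer
	for _, p := range peers {
		// A lookup in a nil map yields false, so a nil protected set
		// simply means that nobody is protected.
		if p.trusted || p.static || protected[p] {
			continue
		}
		out = append(out, p)
	}
	return out
}

func main() {
	a := &peer{id: "a", trusted: true}
	b := &peer{id: "b"}
	c := &peer{id: "c"}
	protected := map[*peer]bool{b: true}
	for _, p := range droppable([]*peer{a, b, c}, protected) {
		fmt.Println(p.id)
	}
}
```

With the set computed once up front, protection becomes one more condition in the same pass, and an empty result needs no special branch.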
2026-04-10 12:18:56 +02:00
Csaba Kiraly
db611822db eth: rename droppedProtected metric to dropSkipped
2026-04-10 12:18:56 +02:00
Csaba Kiraly
44c8a5b7f4 eth: base protection quota on current peer count, not max capacity
protectTopN used maxPeers (configured capacity) to compute the
number of peers to protect. With small droppable sets this could
protect everyone, permanently disabling churn.

Use len(entries) (current droppable count in each category) instead.
With 20 droppable dialed peers and 10% fraction, 2 are protected.
With 3 droppable peers, 0 are protected — churn is never blocked.
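
The arithmetic can be reproduced with a small sketch. protectQuota is a hypothetical helper, and the capacity of 50 in the last line is an assumed value used only to illustrate the old behavior.

```go
package main

import "fmt"

// protectQuota returns how many peers to protect out of n candidates,
// truncating toward zero.
func protectQuota(n int, fraction float64) int {
	return int(float64(n) * fraction)
}

func main() {
	const fraction = 0.10
	fmt.Println(protectQuota(20, fraction)) // 20 droppable peers
	fmt.Println(protectQuota(3, fraction))  // 3 droppable peers
	// Old behavior with an assumed capacity of 50: the quota exceeds a
	// droppable set of 3, so every candidate would be protected.
	fmt.Println(protectQuota(50, fraction))
}
```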
2026-04-10 10:36:59 +02:00
Csaba Kiraly
58556173f6 eth: improve package and type documentation for txtracker and dropper
Expand the txtracker package doc to describe the tracking flow
(NotifyReceived → chain head → finalization → peer credit) and its
role as stats provider for the dropper.

Rewrite the dropper struct comment to document the full behavior
including the inclusion-based peer protection: two scoring categories
(total finalized + recent EMA), top 10% per pool, union of protected
sets.
2026-04-10 08:59:09 +02:00
Csaba Kiraly
98ffc7bd37 eth: use finalized count for total protection, keep EMA on inclusions
Change the long-term protection category from total inclusions to
total finalized inclusions. Finalized txs are harder to game (require
actual block finality, not just inclusion) and represent confirmed
on-chain value.

The recent-inclusion EMA stays on chain head inclusions for
responsiveness — a peer delivering txs that appear in the latest
blocks gets quick protection without waiting for finalization.

The tracker now checks CurrentFinalBlock() on each chain head event
and credits delivering peers for all newly finalized blocks since
the last check.
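
The crediting step might look roughly like the sketch below. The tracker type, its fields, and onChainHead are hypothetical simplifications; the real tracker reads the finalized block via CurrentFinalBlock() and tracks transactions, not peer-name lists.

```go
package main

import "fmt"

// tracker is a hypothetical, much simplified stand-in for the txtracker state.
type tracker struct {
	lastFinal uint64              // highest finalized block already credited
	delivered map[uint64][]string // block number -> delivering peer, one entry per included tx
	finalized map[string]int64    // peer -> finalized inclusion count
}

func newTracker() *tracker {
	return &tracker{
		delivered: make(map[uint64][]string),
		finalized: make(map[string]int64),
	}
}

// onChainHead credits peers for every block in (lastFinal, final].
// Calling it again with the same finalized height is a no-op.
func (t *tracker) onChainHead(final uint64) {
	for n := t.lastFinal + 1; n <= final; n++ {
		for _, p := range t.delivered[n] {
			t.finalized[p]++
		}
		delete(t.delivered, n) // each block is credited once
	}
	if final > t.lastFinal {
		t.lastFinal = final
	}
}

func main() {
	t := newTracker()
	t.delivered[1] = []string{"A"}
	t.delivered[2] = []string{"A", "B"}
	t.delivered[3] = []string{"B"}
	t.onChainHead(2) // blocks 1 and 2 become final
	t.onChainHead(2) // finality did not advance, nothing to credit
	fmt.Println(t.finalized["A"], t.finalized["B"])
}
```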
2026-04-10 08:56:32 +02:00
Csaba Kiraly
5a918be50d eth: protect high-value peers from random dropping based on inclusion stats
The dropper periodically disconnects random peers to create churn.
This was blind to peer quality. Add inclusion-based peer protection
using two categories:

1. Total inclusions: protects peers with the highest cumulative
   count of delivered txs that were included on chain
2. Recent inclusions (EMA): protects peers with the best recent
   inclusion rate, giving newly productive peers faster protection

Each category independently protects the top 10% of inbound and
top 10% of dialed peers. The union of both sets is protected. Only
peers with positive scores qualify.

The dropper defines its own PeerInclusionStats struct and callback
type (getPeerInclusionStatsFunc) so any stats provider (e.g. a
transaction tracker) can plug in without a package dependency. The
callback is nil by default (protection disabled until wired).

The protectionCategories slice is designed for easy extension —
adding a new category requires only appending a struct with a name,
scoring function, and protection fraction.
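
One way such a category slice could be laid out is sketched below. All type, field, and category names are assumptions, peers are reduced to strings, and the quota is taken from the current peer count for simplicity.

```go
package main

import (
	"fmt"
	"sort"
)

// stats is a hypothetical per-peer stats record.
type stats struct {
	Total  int64   // cumulative inclusions
	Recent float64 // EMA of recent inclusions
}

// category describes one protection category.
type category struct {
	name     string
	score    func(stats) float64
	fraction float64
}

// Adding a category means appending one more entry here.
var categories = []category{
	{"total-inclusions", func(s stats) float64 { return float64(s.Total) }, 0.1},
	{"recent-inclusions", func(s stats) float64 { return s.Recent }, 0.1},
}

// protect returns the union of each category's top fraction of peers,
// counting only peers with a positive score.
func protect(peers []string, st map[string]stats) map[string]bool {
	protected := make(map[string]bool)
	for _, c := range categories {
		ranked := append([]string(nil), peers...)
		sort.SliceStable(ranked, func(i, j int) bool {
			return c.score(st[ranked[i]]) > c.score(st[ranked[j]])
		})
		quota := int(float64(len(ranked)) * c.fraction)
		for _, p := range ranked[:quota] {
			if c.score(st[p]) > 0 {
				protected[p] = true
			}
		}
	}
	return protected
}

func main() {
	peers := make([]string, 10)
	for i := range peers {
		peers[i] = fmt.Sprintf("p%d", i)
	}
	st := map[string]stats{
		"p0": {Total: 500, Recent: 0.1}, // long-standing contributor
		"p1": {Total: 3, Recent: 9.5},   // newly productive peer
	}
	fmt.Println(protect(peers, st))
}
```

With ten peers and a 10% fraction each category protects one peer, so p0 and p1 end up in the union for different reasons.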
2026-04-10 08:23:30 +02:00
maradini77
e0d81d1e99 eth: fix panic in randomDuration when min equals max (#33193)
Fixes a potential panic in `randomDuration` when `min == max` by
handling the edge case explicitly.
2025-11-19 01:54:53 +08:00
Csaba Kiraly
c5c75977ab eth: add logic to drop peers randomly when saturated (#31476)
As of now, Geth disconnects peers only on protocol error or timeout,
meaning once connection slots are filled, the peerset is largely fixed.

As mentioned in https://github.com/ethereum/go-ethereum/issues/31321,
Geth should occasionally disconnect peers to ensure some churn.
What/when to disconnect could depend on:
- the state of Geth (e.g. syncing or not)
- current number of peers
- peer level metrics

This PR adds a very slow churn using a random drop.

---------

Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Co-authored-by: Felix Lange <fjl@twurst.com>
2025-04-14 12:45:27 +02:00