go-ethereum/cmd
Csaba Kiraly 33785aab21
p2p/discover: document BFS choice, add RandomWorkers split
Two related changes to CrawlIterator:

(1) Add a file-level commentary block explaining why the iterator uses a
FIFO queue (BFS over the FINDNODE-response graph) and what it is *not*
suitable for (target-directed lookup -- use RandomNodes() / the alpha=3
lookup iterator for that). The choice was inherited from dcrawl.nim
without explicit reasoning; making it visible avoids future readers
re-deriving the survey-vs-lookup distinction.

The BFS rationale is two-fold:

 - Coverage: BFS visits peers in order of hop distance from the seeds,
   reaching every peer within N hops before any peer at N+1 hops, so a
   time-bounded run produces a representative sample of the reachable
   graph rather than a deep tendril through one sub-region.
 - Adversarial resilience: a peer returning malicious "neighbour"
   claims, dead-end peers, or eclipse-style sub-graphs cannot
   monopolise the worker pool, because pending work from other branches
   sits ahead of the attacker's responses in the queue. DFS would
   amplify each of these attacks.
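
The coverage and resilience argument can be illustrated with a toy
graph. This is a minimal sketch, not code from this change: crawl,
toyGraph and the node names are hypothetical, and the "evil" peer
simply advertises a long chain of its own nodes. With the same visit
budget, the FIFO queue reaches every honest neighbour of the seed,
while a LIFO stack spends the whole budget inside the attacker's chain.

```go
package main

import "fmt"

// crawl visits at most budget nodes starting from seed. With lifo=false the
// pending list is consumed as a FIFO queue (BFS); with lifo=true it is
// consumed as a stack (DFS). All names here are illustrative only.
func crawl(graph map[string][]string, seed string, budget int, lifo bool) []string {
	pending := []string{seed}
	seen := map[string]bool{seed: true}
	var visited []string
	for len(pending) > 0 && len(visited) < budget {
		var n string
		if lifo {
			n, pending = pending[len(pending)-1], pending[:len(pending)-1]
		} else {
			n, pending = pending[0], pending[1:]
		}
		visited = append(visited, n)
		// Stand-in for a FINDNODE response: enqueue unseen neighbours.
		for _, nb := range graph[n] {
			if !seen[nb] {
				seen[nb] = true
				pending = append(pending, nb)
			}
		}
	}
	return visited
}

// toyGraph returns a hypothetical graph in which "evil" advertises a
// chain of nodes that leads away from the honest peers.
func toyGraph() map[string][]string {
	return map[string][]string{
		"seed": {"a", "b", "evil"},
		"a":    {"a1"},
		"b":    {"b1"},
		"evil": {"e1"},
		"e1":   {"e2"},
		"e2":   {"e3"},
	}
}

func main() {
	fmt.Println("BFS:", crawl(toyGraph(), "seed", 4, false)) // BFS: [seed a b evil]
	fmt.Println("DFS:", crawl(toyGraph(), "seed", 4, true))  // DFS: [seed evil e1 e2]
}
```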

(2) Add a RandomWorkers field to CrawlOptions. Of the Workers-sized
worker pool, the first (Workers - RandomWorkers) workers pop the FIFO
front (BFS), while RandomWorkers workers pop a uniform-random queue
index via swap-and-pop (O(1)). Total worker count is unchanged.
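
The two pop policies can be sketched as follows. This is an
illustration under assumptions, not the CrawlIterator code: the queue
type, its methods and drainRandom are hypothetical names, and the
queue is a plain slice. Note that swap-and-pop moves the last element
into the vacated slot, so a random pop also perturbs the order seen by
later FIFO pops; how the real iterator handles that is not stated
here.

```go
package main

import (
	"fmt"
	"math/rand"
)

// queue is a hypothetical slice-backed work queue.
type queue struct {
	items []string
}

func (q *queue) push(s string) { q.items = append(q.items, s) }

// popFront removes and returns the oldest item (the FIFO/BFS policy).
func (q *queue) popFront() (string, bool) {
	if len(q.items) == 0 {
		return "", false
	}
	s := q.items[0]
	q.items = q.items[1:]
	return s, true
}

// popRandom removes a uniformly chosen item in O(1): the chosen slot is
// overwritten with the last item and the slice is shortened by one.
func (q *queue) popRandom(rng *rand.Rand) (string, bool) {
	n := len(q.items)
	if n == 0 {
		return "", false
	}
	i := rng.Intn(n)
	s := q.items[i]
	q.items[i] = q.items[n-1]
	q.items = q.items[:n-1]
	return s, true
}

// drainRandom pops every item with popRandom and returns the pop order.
func drainRandom(items []string, seed int64) []string {
	q := &queue{items: append([]string(nil), items...)}
	rng := rand.New(rand.NewSource(seed))
	var out []string
	for {
		s, ok := q.popRandom(rng)
		if !ok {
			return out
		}
		out = append(out, s)
	}
}

func main() {
	q := &queue{}
	for _, id := range []string{"n1", "n2", "n3", "n4"} {
		q.push(id)
	}
	front, _ := q.popFront()
	fmt.Println("FIFO worker got:", front)
	fmt.Println("random pops:", drainRandom([]string{"n1", "n2", "n3", "n4"}, 42))
}
```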

Default RandomWorkers = Workers / 4 (4 of 16 with the default
parallelism). At this ratio:

 - Cold-start cost is negligible: 12 of 16 workers still drain FIFO,
   so the first ~1s of a fresh crawl behaves like pure BFS.
 - Roughly 25% of pops (assuming workers pop at similar rates) break
   strict FIFO ordering, providing a mild anti-fingerprint defence
   against an attacker who could otherwise predict our processing
   order from the contents of their own FINDNODE responses.

Operators can override the default per run via the new
--random-workers CLI flag on `devp2p discv4 crawl` and `discv5 crawl`.
A negative value forces pure BFS; a positive value selects an explicit
count.
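
One way to read the default and the flag semantics together is
sketched below. resolveRandomWorkers is a hypothetical helper, not a
function from this change. Two points are assumptions rather than
facts from the description above: that a zero value means "use the
default", and that an explicit count larger than Workers is clamped.

```go
package main

import "fmt"

// resolveRandomWorkers maps a requested --random-workers value to the
// number of random-pop workers in a pool of the given size.
func resolveRandomWorkers(workers, requested int) int {
	switch {
	case requested < 0:
		return 0 // negative: pure BFS, every worker pops the FIFO front
	case requested == 0:
		return workers / 4 // ASSUMPTION: zero selects the default ratio
	case requested > workers:
		return workers // ASSUMPTION: clamp to the pool size
	default:
		return requested // positive: explicit count
	}
}

func main() {
	const workers = 16 // default parallelism quoted above
	for _, req := range []int{-1, 0, 8} {
		rw := resolveRandomWorkers(workers, req)
		fmt.Printf("requested=%d: %d FIFO workers, %d random workers\n",
			req, workers-rw, rw)
	}
}
```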

The new TestCrawlIteratorRandomWorkers covers four pop-policy
configurations (all-fifo, all-random, half-half, default) and
asserts the iterator still terminates and emits each node exactly
once in each.
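
The property the test asserts can be shown with a single-threaded
simulation. This is a sketch only, not TestCrawlIteratorRandomWorkers
itself: simulate and toyGraph are hypothetical, workers take turns
instead of running concurrently, and the graph is synthetic. It checks
that each of the four configurations terminates and emits every node
exactly once.

```go
package main

import (
	"fmt"
	"math/rand"
)

// simulate crawls graph from seed with `workers` simulated workers taking
// turns. The last randomWorkers of them pop a random index via
// swap-and-pop; the rest pop the FIFO front. It returns per-node emit counts.
func simulate(graph map[int][]int, seed, workers, randomWorkers int) map[int]int {
	rng := rand.New(rand.NewSource(1))
	queue := []int{seed}
	seen := map[int]bool{seed: true}
	emitted := map[int]int{}
	for turn := 0; len(queue) > 0; turn++ {
		var n int
		if turn%workers >= workers-randomWorkers {
			i, last := rng.Intn(len(queue)), len(queue)-1
			n = queue[i]
			queue[i] = queue[last]
			queue = queue[:last]
		} else {
			n, queue = queue[0], queue[1:]
		}
		emitted[n]++
		// Enqueue each neighbour at most once, so the loop must terminate.
		for _, nb := range graph[n] {
			if !seen[nb] {
				seen[nb] = true
				queue = append(queue, nb)
			}
		}
	}
	return emitted
}

// toyGraph builds a synthetic n-node graph; the i -> i+1 edges form a
// ring, so every node is reachable from node 0.
func toyGraph(n int) map[int][]int {
	g := make(map[int][]int, n)
	for i := 0; i < n; i++ {
		g[i] = []int{(i + 1) % n, (i*2 + 1) % n, (i*3 + 2) % n}
	}
	return g
}

func main() {
	configs := []struct {
		name          string
		randomWorkers int
	}{{"all-fifo", 0}, {"all-random", 16}, {"half-half", 8}, {"default", 4}}
	for _, c := range configs {
		got := simulate(toyGraph(50), 0, 16, c.randomWorkers)
		ok := len(got) == 50
		for _, count := range got {
			ok = ok && count == 1
		}
		fmt.Printf("%s: each node emitted exactly once: %v\n", c.name, ok)
	}
}
```
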
2026-05-07 14:41:58 +02:00
abidump       all: update license headers and AUTHORS from git history (#24947)  2022-05-24 20:39:40 +02:00
abigen        cmd/abigen, accounts/abi/bind: implement abigen version 2 (#31379)  2025-03-17 15:56:55 +01:00
blsync        beacon/blsync: add checkpoint import/export file feature (#31469)  2025-04-03 16:04:11 +02:00
clef          cmd/clef: update Safe API documentation links in changelog (#32136)  2025-07-09 14:09:11 -06:00
devp2p        p2p/discover: document BFS choice, add RandomWorkers split  2026-05-07 14:41:58 +02:00
era           internal/era/onedb: return false if err (#34816)  2026-05-01 14:10:41 +02:00
ethkey        cmd: fix some typos in readmes (#29405)  2024-04-11 14:06:49 +03:00
evm           core: implement eip-7981: Increase Access List Cost (#34755)  2026-05-06 12:03:11 +02:00
fetchpayload  cmd/fetchpayload: add payload-building utility (#33919)  2026-03-11 16:18:42 +01:00
geth          trie: group 2^N binary trie nodes in serialization (#34794)  2026-05-01 15:28:19 +02:00
keeper        internal/telemetry: add gRPC transport for OTLP trace export (#33941)  2026-04-21 14:48:21 +02:00
rlpdump       build: update to golangci-lint 1.61.0 (#30587)  2024-10-14 19:25:22 +02:00
utils         trie: group 2^N binary trie nodes in serialization (#34794)  2026-05-01 15:28:19 +02:00
workload      core/history: refactor pruning configuration (#34036)  2026-03-18 13:54:29 +01:00