mirror of https://github.com/ethereum/go-ethereum.git, synced 2026-05-24 08:49:29 +00:00
Two related changes to CrawlIterator:

(1) Add a file-level commentary block explaining why the iterator uses a FIFO queue (BFS over the FINDNODE-response graph) and what it is *not* suitable for (target-directed lookup -- use RandomNodes() / the alpha=3 lookup iterator for that). The choice was inherited from dcrawl.nim without explicit reasoning; making it visible avoids future readers re-deriving the survey-vs-lookup distinction.

The BFS rationale is two-fold:

- Coverage: BFS reaches every peer within N hops of the seeds in order, so a time-bounded run produces a representative sample of the reachable graph rather than a deep tendril through one sub-region.
- Adversarial resilience: a peer returning malicious "neighbour" claims, dead-end peers, or eclipse-style sub-graphs cannot monopolise the worker pool, because pending work from other branches sits ahead of the attacker's responses in the queue. DFS would amplify each of these attacks.

(2) Add a RandomWorkers field to CrawlOptions. Of the Workers-sized worker pool, the first (Workers - RandomWorkers) workers pop the FIFO front (BFS), while RandomWorkers workers pop a uniform-random queue index via swap-and-pop (O(1)). Total worker count is unchanged.

Default RandomWorkers = Workers / 4 (4 of 16 with the default parallelism). At this ratio:

- Cold-start cost is negligible: 12 of 16 workers still drain FIFO, so the first ~1s of a fresh crawl behaves like pure BFS.
- 25% of pops break strict FIFO ordering, providing a mild anti-fingerprint defence against an attacker who could otherwise predict our processing order from the contents of their own FINDNODE responses.

Operators can override per-run via the new --random-workers CLI flag on `devp2p discv4 crawl` and `discv5 crawl`. Negative value forces pure BFS; positive value selects an explicit count.

The new TestCrawlIteratorRandomWorkers covers four pop-policy configurations (all-fifo, all-random, half-half, default) and asserts the iterator still terminates and emits each node exactly once in each.
| Name |
|---|
| v4wire |
| v5wire |
| common.go |
| crawliter.go |
| crawliter_test.go |
| lookup.go |
| metrics.go |
| node.go |
| ntp.go |
| table.go |
| table_reval.go |
| table_reval_test.go |
| table_test.go |
| table_util_test.go |
| v4_lookup_test.go |
| v4_udp.go |
| v4_udp_test.go |
| v5_talk.go |
| v5_udp.go |
| v5_udp_test.go |