mirror of
https://github.com/ethereum/go-ethereum.git
synced 2026-05-25 09:19:28 +00:00
Two related changes to CrawlIterator: (1) Add a file-level commentary block explaining why the iterator uses a FIFO queue (BFS over the FINDNODE-response graph) and what it is *not* suitable for (target-directed lookup -- use RandomNodes() / the alpha=3 lookup iterator for that). The choice was inherited from dcrawl.nim without explicit reasoning; making it visible avoids future readers re-deriving the survey-vs-lookup distinction. The BFS rationale is two-fold: - Coverage: BFS reaches every peer within N hops of the seeds in order, so a time-bounded run produces a representative sample of the reachable graph rather than a deep tendril through one sub-region. - Adversarial resilience: a peer returning malicious "neighbour" claims, dead-end peers, or eclipse-style sub-graphs cannot monopolise the worker pool, because pending work from other branches sits ahead of the attacker's responses in the queue. DFS would amplify each of these attacks. (2) Add a RandomWorkers field to CrawlOptions. Of the Workers-sized worker pool, the first (Workers - RandomWorkers) workers pop the FIFO front (BFS), while RandomWorkers workers pop a uniform-random queue index via swap-and-pop (O(1)). Total worker count is unchanged. Default RandomWorkers = Workers / 4 (4 of 16 with the default parallelism). At this ratio: - Cold-start cost is negligible: 12 of 16 workers still drain FIFO, so the first ~1s of a fresh crawl behaves like pure BFS. - 25% of pops break strict FIFO ordering, providing a mild anti-fingerprint defence against an attacker who could otherwise predict our processing order from the contents of their own FINDNODE responses. Operators can override per-run via the new --random-workers CLI flag on `devp2p discv4 crawl` and `discv5 crawl`. Negative value forces pure BFS; positive value selects an explicit count. The new TestCrawlIteratorRandomWorkers covers four pop-policy configurations (all-fifo, all-random, half-half, default) and asserts the iterator still terminates and emits each node exactly once in each. |
||
|---|---|---|
| .. | ||
| discover | ||
| dnsdisc | ||
| enode | ||
| enr | ||
| msgrate | ||
| nat | ||
| netutil | ||
| pipes | ||
| rlpx | ||
| tracker | ||
| config.go | ||
| config_toml.go | ||
| dial.go | ||
| dial_test.go | ||
| message.go | ||
| message_test.go | ||
| metrics.go | ||
| peer.go | ||
| peer_error.go | ||
| peer_test.go | ||
| protocol.go | ||
| server.go | ||
| server_nat.go | ||
| server_nat_test.go | ||
| server_test.go | ||
| transport.go | ||
| transport_test.go | ||
| util.go | ||
| util_test.go | ||