Commit graph

13 commits

Author SHA1 Message Date
Csaba Kiraly
33785aab21
p2p/discover: document BFS choice, add RandomWorkers split
Two related changes to CrawlIterator:

(1) Add a file-level commentary block explaining why the iterator uses a
FIFO queue (BFS over the FINDNODE-response graph) and what it is *not*
suitable for (target-directed lookup -- use RandomNodes() / the alpha=3
lookup iterator for that). The choice was inherited from dcrawl.nim
without explicit reasoning; making it visible avoids future readers
re-deriving the survey-vs-lookup distinction.

The BFS rationale is two-fold:

 - Coverage: BFS reaches every peer within N hops of the seeds in
   order, so a time-bounded run produces a representative sample of the
   reachable graph rather than a deep tendril through one sub-region.
 - Adversarial resilience: a peer returning malicious "neighbour"
   claims, dead-end peers, or eclipse-style sub-graphs cannot
   monopolise the worker pool, because pending work from other branches
   sits ahead of the attacker's responses in the queue. DFS would
   amplify each of these attacks.

(2) Add a RandomWorkers field to CrawlOptions. Of the Workers-sized
worker pool, the first (Workers - RandomWorkers) workers pop the FIFO
front (BFS), while RandomWorkers workers pop a uniform-random queue
index via swap-and-pop (O(1)). Total worker count is unchanged.

Default RandomWorkers = Workers / 4 (4 of 16 with the default
parallelism). At this ratio:

 - Cold-start cost is negligible: 12 of 16 workers still drain FIFO,
   so the first ~1s of a fresh crawl behaves like pure BFS.
 - 25% of pops break strict FIFO ordering, providing a mild
   anti-fingerprint defence against an attacker who could otherwise
   predict our processing order from the contents of their own
   FINDNODE responses.

Operators can override per-run via the new --random-workers CLI flag
on `devp2p discv4 crawl` and `discv5 crawl`. Negative value forces
pure BFS; positive value selects an explicit count.

The new TestCrawlIteratorRandomWorkers covers four pop-policy
configurations (all-fifo, all-random, half-half, default) and
asserts the iterator still terminates and emits each node exactly
once in each.
2026-05-07 14:41:58 +02:00
Csaba Kiraly
dfa3bbffae
cmd/devp2p: add --mode flag selecting crawl iterator
Wire the new discover.CrawlIterator into devp2p discv4/discv5 crawl
behind a --mode flag (default 'lookup', i.e. existing behaviour).

  devp2p discv4 crawl --mode=fast --timeout 30s nodes.json
  devp2p discv5 crawl --mode=fast --timeout 30s nodes.json

Smoke test against mainnet bootnodes for 30s on a residential link
yields ~2.4x more nodes under --mode=fast (587 vs 240 in one run),
with the new per-tick LogDist log showing a much more uniform
distribution of query distances. Workers default to the existing
--parallel value (16); pacing is RTT-driven.

The 'lookup' default keeps existing behaviour byte-identical for any
operator running the saved devp2p discv4 crawl from a script.
2026-05-07 14:41:58 +02:00
Chen Kai
5117f77af9
p2p/discover: expose discv5 functions for portal JSON-RPC interface (#31117)
Fixes #31093

Here we add some API functions on the UDPv5 object for the purpose of implementing
the Portal Network JSON-RPC API in the shisui client.

---------

Signed-off-by: Chen Kai <281165273grape@gmail.com>
2025-03-13 15:16:01 +01:00
Péter Szilágyi
20bf543a64
internal/flags: remove Merge, it's identical to slices.Concat (#30706)
This is a noop change to not have custom code for stdlib functionality.
2024-10-31 19:26:02 +02:00
Delweng
41a0ad9f03
cmd/devp2p: use bootnodes as crawl input (#28139)
This PR makes the tool use the --bootnodes list as the input to devp2p crawl.
The flag will take effect if the input/output.json file is missing or empty.
2023-09-19 14:18:29 +02:00
Delweng
e9c3183c52
cmd: use errrors.New instead of empty fmt.Errorf (#27329)
Signed-off-by: jsvisa <delweng@gmail.com>
2023-05-24 12:21:29 +02:00
Martin Holst Swende
c155c8e179
cmd/devp2p: faster crawling + less verbose dns updates (#26697)
This improves the speed of DHT crawling by using concurrent requests.
It also removes logging of individual DNS updates.
2023-02-27 11:36:26 +01:00
Felix Lange
b44abf56a9
cmd/devp2p: add --extaddr flag (#26312)
The new flag allows configuring an explicit endpoint which is to be
announced in the DHT. This feature was originally developed for the
discv5 wormhole experiment (#25798), but it's useful in other contexts
as well.
2022-12-06 16:25:53 +01:00
willian.eth
52ed3570c4
cmd: migrate to urfave/cli/v2 (#24751)
This change updates our urfave/cli dependency to the v2 branch of the library.
There are some Go API changes in cli v2:

- Flag values can now be accessed using the methods ctx.Bool,
  ctx.Int, ctx.String, ... regardless of whether the flag is 'local' or
  'global'.

- v2 has built-in support for flag categories. Our home-grown category
  system is removed and the categories of flags are assigned as part of
  the flag definition.

For users, there is only one observable difference with cli v2: flags must now
strictly appear before regular arguments. For example, the following command is
now invalid:

   geth account import mykey.json --password file.txt

Instead, the command must be invoked as follows:

   geth account import --password file.txt mykey.json
2022-06-27 18:22:36 +02:00
Felix Lange
9244d5cd61
all: update license headers and AUTHORS from git history (#24947) 2022-05-24 20:39:40 +02:00
Felix Lange
5d20fbbb6f
cmd/devp2p, internal/utesting: implement TAP output (#21760)
TAP is a text format for test results. Parsers for it are available in many languages,
making it easy to consume. I want TAP output from our protocol tests because the
Hive wrapper around them needs to know about the test names and their individual
results and logs. It would also be possible to just write this info as JSON, but I don't
want to invent a new format.

This also improves the normal console output for tests (when running without --tap).
It now prints -- RUN lines before any output from the test, and indents the log output
by one space.
2020-11-04 15:02:58 +01:00
Felix Lange
524aaf5ec6
p2p/discover: implement v5.1 wire protocol (#21647)
This change implements the Discovery v5.1 wire protocol and
also adds an interactive test suite for this protocol.
2020-10-14 12:28:17 +02:00
Felix Lange
b7394d7942
p2p/discover: add initial discovery v5 implementation (#20750)
This adds an implementation of the current discovery v5 spec.

There is full integration with cmd/devp2p and enode.Iterator in this
version. In theory we could enable the new protocol as a replacement of
discovery v4 at any time. In practice, there will likely be a few more
changes to the spec and implementation before this can happen.
2020-04-08 09:57:23 +02:00