go-ethereum

mirror of https://github.com/ethereum/go-ethereum.git synced 2026-05-09 17:46:37 +00:00

Author	SHA1	Message	Date
rayoo	60db25b070	p2p/discover: restore nextTimeout update in UDPv4 resetTimeout loop (#34878) The refactor from `for el := plist.Front(); ...; el = el.Next()` to the new `iterList` iterator in #34743 silently dropped two things needed by resetTimeout: 1. `nextTimeout = el.Value.(*replyMatcher)` at the top of the loop. This assignment is what gives `nextTimeout` its documented meaning ("head of plist when timeout was last reset"), and what makes the early-return optimization at the top of resetTimeout work. Without it, nextTimeout is only ever written to nil, so `nextTimeout == plist.Front().Value` is always false and the optimization is dead. 2. `nextTimeout.errc <- errClockWarp` in the clock-warp branch now reads a stale or nil pointer. Prior to the refactor, the inner assignment kept nextTimeout pointing at the current matcher so its errc was the right channel to receive the errClockWarp signal. After the refactor, on first entry into the clock-warp branch nextTimeout is nil, which panics the UDPv4 loop goroutine with a nil pointer deref and takes discv4 down. Re-assign `nextTimeout = p` at the head of the loop (restoring the documented invariant) and send the clock-warp error on `p.errc` rather than the now-stale `nextTimeout.errc`. The clock-warp branch triggers only when the system clock jumps backward after a deadline is assigned (deadline - time.Now() >= 2*respTimeout, i.e. at least ~500ms backward jump), which is why this regression slipped past CI - it is not exercised by any existing unit test, and writing one would require plumbing a clock through the loop.	2026-05-05 15:28:28 +02:00
Rahman	51c97216c5	p2p/discover: fix timeout loop early exit when removing expired matchers (#34743) Save `el.Next()` before calling `plist.Remove(el)` so iteration continues correctly. Previously the loop exited after removing the first expired matcher because `Remove` invalidates the element's links. --------- Co-authored-by: Felix Lange <fjl@twurst.com>	2026-04-28 10:57:58 +02:00
Charles Dusek	e1fe4a1a98	p2p/discover: fix flaky TestUDPv5_findnodeHandling (#34109) Fixes #34108 The UDPv5 test harness (`newUDPV5Test`) uses the default `PingInterval` of 3 seconds. When tests like `TestUDPv5_findnodeHandling` insert nodes into the routing table via `fillTable`, the table's revalidation loop may schedule PING packets for those nodes. Under the race detector or on slow CI runners, the test runs long enough for revalidation to fire, causing background pings to be written to the test pipe. The `close()` method then finds these as unmatched packets and fails. The fix sets `PingInterval` to a very large value in the test harness so revalidation never fires during tests. Verified locally: 100 iterations with `-race -count=100` pass reliably, where previously the test would fail within ~50 iterations.	2026-04-14 09:43:44 +02:00
Charles Dusek	a2496852e9	p2p/discover: resolve DNS hostnames for bootstrap nodes (#34101) Fixes #31208	2026-03-28 11:37:39 +01:00
jvn	59ce2cb6a1	p2p: track in-progress inbound node IDs (#33198) Avoid dialing a node while we have an inbound connection request from them in progress. Closes #33197	2026-03-20 05:52:15 +01:00
Felix Lange	9962e2c9f3	p2p/tracker: fix crash in clean when tracker is stopped (#33940)	2026-03-03 12:54:24 +01:00
Felix Lange	00cbd2e6f4	p2p/discover/v5wire: use Whoareyou.ChallengeData instead of storing encoded packet (#31547) This changes the challenge resend logic again to use the existing `ChallengeData` field of `v5wire.Whoareyou` instead of storing a second copy of the packet in `Whoareyou.Encoded`. It's more correct this way since `ChallengeData` is supposed to be the data that is used by the ID verification procedure. Also adapts the cross-client test to verify this behavior. Follow-up to #31543	2026-02-22 21:58:47 +01:00
Felix Lange	0cba803fba	eth/protocols/eth, eth/protocols/snap: delayed p2p message decoding (#33835) This changes the p2p protocol handlers to delay message decoding. It's the first part of a larger change that will delay decoding all the way through message processing. For responses, we delay the decoding until it is confirmed that the response matches an active request and does not exceed its limits. In order to make this work, all messages have been changed to use rlp.RawList instead of a slice of the decoded item type. For block bodies specifically, the decoding has been delayed all the way until after verification of the response hash. The role of p2p/tracker.Tracker changes significantly in this PR. The Tracker's original purpose was to maintain metrics about requests and responses in the peer-to-peer protocols. Each protocol maintained a single global Tracker instance. As of this change, the Tracker is now always active (regardless of metrics collection), and there is a separate instance of it for each peer. Whenever a response arrives, it is first verified that a request exists for it in the tracker. The tracker is also the place where limits are kept.	2026-02-15 21:21:16 +08:00
Felix Lange	8e1de223ad	crypto/keccak: vendor in golang.org/x/crypto/sha3 (#33323) The upstream library has removed the assembly-based implementation of keccak. We need to maintain our own library to avoid a performance regression. --------- Co-authored-by: lightclient <lightclient@protonmail.com>	2026-02-03 14:55:27 -07:00
fengjian	c974722dc0	crypto/ecies: fix ECIES invalid-curve handling (#33669) Fix ECIES invalid-curve handling in RLPx handshake (reject invalid ephemeral pubkeys early) - Add curve validation in crypto/ecies.GenerateShared to reject invalid public keys before ECDH. - Update RLPx PoC test to assert invalid curve points fail with ErrInvalidPublicKey. Motivation / Context RLPx handshake uses ECIES decryption on unauthenticated network input. Prior to this change, an invalid-curve ephemeral public key would proceed into ECDH and only fail at MAC verification, returning ErrInvalidMessage. This allows an oracle on decrypt success/failure and leaves the code path vulnerable to invalid-curve/small-subgroup attacks. The fix enforces IsOnCurve validation up front.	2026-01-29 10:56:12 +01:00
kurahin	13a8798fa3	p2p/tracker: fix head detection in Fulfil to avoid unnecessary timer reschedules (#33370)	2025-12-10 16:09:07 +08:00
cui	31f9c9ff75	common/bitutil: deprecate XORBytes in favor of stdlib crypto/subtle (#33331) XORBytes was added to package crypto/subtle in Go 1.20, and it's faster than our bitutil.XORBytes. There is only one use of this function across go-ethereum so we can simply deprecate the custom implementation. --------- Co-authored-by: Felix Lange <fjl@twurst.com>	2025-12-08 17:40:59 +01:00
Snezhkko	af47d9b472	p2p/nat: fix err shadowing in UPnP addAnyPortMapping (#33355) The random-port retry loop in addAnyPortMapping shadowed the err variable, causing the function to return (0, nil) when all attempts failed. This change removes the shadowing and preserves the last error across both the fixed-port and random-port retries, ensuring failures are reported to callers correctly.	2025-12-08 15:02:24 +01:00
oxBoni	1468331f9d	p2p/discover/v5wire: remove redundant bytes clone in WHOAREYOU encoding (#33180) head.AuthData is assigned later in the function, so the earlier assignment can safely be removed.	2025-11-26 15:34:11 +01:00
Delweng	5dd0fe2f53	p2p: cleanup v4 if v5 failed (#33005) Clean the previous resource (v4) if the latter (v5) failed.	2025-10-29 10:34:19 +01:00
Delweng	2bb3d9a330	p2p: silence on listener shutdown (#33001) Co-authored-by: Felix Lange <fjl@twurst.com>	2025-10-23 10:44:54 +02:00
Felix Lange	7c107c2691	p2p/discover: remove hot-spin in table refresh trigger (#32912) This fixes a regression introduced in #32518. In that PR, we removed the slowdown logic that would throttle lookups when the table runs empty. Said logic was originally added in #20389. Usually it's fine, but there exist pathological cases, such as hive tests, where the node can only discover one other node, so it can only ever query that node and won't get any results. In cases like these, we need to throttle the creation of lookups to avoid crazy CPU usage.	2025-10-15 11:51:33 +02:00
Delweng	6337577434	p2p/discover: wait for bootstrap to be done (#32881) This ensures the node is ready to accept other nodes into the table before it is used in a test. Closes #32863	2025-10-13 19:58:50 +02:00
cui	b87581f297	p2p/enode: optimize DistCmp (#32888) This speeds up DistCmp by 75% through using 64-bit operations instead of byte-wise XOR.	2025-10-13 16:16:07 +02:00
cui	5c6ba6b400	p2p/enode: optimize LogDist (#32887) This speeds up LogDist by 75% using 64-bit operations instead of byte-wise XOR. --------- Co-authored-by: Felix Lange <fjl@twurst.com>	2025-10-13 14:00:43 +02:00
Delweng	85e9977fae	p2p: rm unused var seedMinTableTime (#32876)	2025-10-13 16:40:08 +08:00
Csaba Kiraly	4927e89647	p2p/enode: fix asyncfilter comment (#32823) Just finish the sentence. Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>	2025-10-02 17:27:35 +02:00
zzzckck	f0dc47aae3	p2p/enode: fix discovery AsyncFilter deadlock on shutdown (#32572) Description: We found an occasional node hang issue on BSC. I think Geth may also have the issue, so the fix is ported here. The fix on the BSC repo: https://github.com/bnb-chain/bsc/pull/3347 When the hang occurs, there are two routines stuck. - Routine 1: AsyncFilter(...) On node start, it will run part of the DiscoveryV4 protocol, which could take considerable time. Here is its hang callstack: ``` goroutine 9711 [chan receive]: // this routine was stuck on read channel: `<-f.slots` github.com/ethereum/go-ethereum/p2p/enode.AsyncFilter.func1() github.com/ethereum/go-ethereum/p2p/enode/iter.go:206 +0x125 created by github.com/ethereum/go-ethereum/p2p/enode.AsyncFilter in goroutine 1 github.com/ethereum/go-ethereum/p2p/enode/iter.go:192 +0x205 ``` - Routine 2: Node Stop It is the main routine to shut down the process, but it got stuck when it tried to shut down the discovery components, as it tries to drain the channel `<-f.slots`, but the extra 1 slot will never have a chance to be resumed. ``` goroutine 11796 [chan receive]: github.com/ethereum/go-ethereum/p2p/enode.(*asyncFilterIter).Close.func1() github.com/ethereum/go-ethereum/p2p/enode/iter.go:248 +0x5c sync.(*Once).doSlow(0xc032a97cb8?, 0xc032a97d18?) sync/once.go:78 +0xab sync.(*Once).Do(...) sync/once.go:69 github.com/ethereum/go-ethereum/p2p/enode.(*asyncFilterIter).Close(0xc092ff8d00?) github.com/ethereum/go-ethereum/p2p/enode/iter.go:244 +0x36 github.com/ethereum/go-ethereum/p2p/enode.(*bufferIter).Close.func1() github.com/ethereum/go-ethereum/p2p/enode/iter.go:299 +0x24 sync.(*Once).doSlow(0x11a175f?, 0x2bfe63e?) sync/once.go:78 +0xab sync.(*Once).Do(...) sync/once.go:69 github.com/ethereum/go-ethereum/p2p/enode.(*bufferIter).Close(0x30?) github.com/ethereum/go-ethereum/p2p/enode/iter.go:298 +0x36 github.com/ethereum/go-ethereum/p2p/enode.(*FairMix).Close(0xc0004bfea0) github.com/ethereum/go-ethereum/p2p/enode/iter.go:379 +0xb7 github.com/ethereum/go-ethereum/eth.(*Ethereum).Stop(0xc000997b00) github.com/ethereum/go-ethereum/eth/backend.go:960 +0x4a github.com/ethereum/go-ethereum/node.(*Node).stopServices(0xc0001362a0, {0xc012e16330, 0x1, 0xc000111410?}) github.com/ethereum/go-ethereum/node/node.go:333 +0xb3 github.com/ethereum/go-ethereum/node.(*Node).Close(0xc0001362a0) github.com/ethereum/go-ethereum/node/node.go:263 +0x167 created by github.com/ethereum/go-ethereum/cmd/utils.StartNode.func1.1 in goroutine 9729 github.com/ethereum/go-ethereum/cmd/utils/cmd.go:101 +0x78 ``` The root cause of the hang is the extra 1 slot, which was designed to make sure the routines in `AsyncFilter(...)` can be finished. This PR fixes it by making sure the extra 1 slot can always be resumed when the node shuts down.	2025-10-02 12:43:31 +02:00
Zach Brown	f9756bb885	p2p: fix error message in test (#32804)	2025-09-30 19:30:47 +08:00
cui	64c6de7747	p2p: using testing.B.Loop (#32664)	2025-09-19 16:38:36 -06:00
Csaba Kiraly	de9fb9722b	revert to using table parameter; using it.lookup.tab inside is unsafe Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>	2025-09-17 09:04:41 +02:00
Csaba Kiraly	3589c0d59b	p2p/discover: expose timeout in lookupFailed Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com> # Conflicts: # p2p/discover/lookup.go	2025-09-16 14:03:11 +02:00
Felix Lange	0643427965	p2p/discover: continue	2025-09-12 12:50:07 +02:00
Felix Lange	68c18ede06	Update lookup.go	2025-09-12 11:34:44 +02:00
Csaba Kiraly	97afa2815b	Revert "p2p/discover: add test for lookup returning immediately" This reverts commit `3eab4616a6`.	2025-09-12 11:29:43 +02:00
Csaba Kiraly	3eab4616a6	p2p/discover: add test for lookup returning immediately Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>	2025-09-12 10:59:29 +02:00
Csaba Kiraly	72d3e881b3	p2p/discover: clarify lookup behavior on empty table We have changed this behavior, better clarify in comment. Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>	2025-09-12 10:52:53 +02:00
Felix Lange	a9f9e0d589	p2p/discover: add imports in test	2025-09-10 20:10:51 +02:00
Felix Lange	3133fd369a	p2p/discover: remove print in test	2025-09-10 20:10:51 +02:00
Felix Lange	3946708935	p2p/discover: fix two bugs in lookup iterator The lookup would add self into the replyBuffer if returned by another node. Avoid doing that by marking self as seen. With the changed initialization behavior of lookup, the lookupIterator needs to yield the buffer right after creation. This fixes the smallNetConvergence test, where all results are straight out of the local table.	2025-09-10 20:10:51 +02:00
Felix Lange	cf0503da7c	p2p/discover: track missing nodes in test	2025-09-10 20:10:51 +02:00
Felix Lange	721c8de738	p2p/discover: trigger refresh in lookupIterator	2025-09-10 20:10:51 +02:00
Felix Lange	e58e7f7927	p2p/discover: fix bug in lookup	2025-09-10 20:10:51 +02:00
Felix Lange	4ed8f5ee2b	p2p/discover: improve iterator	2025-09-10 20:10:51 +02:00
Felix Lange	f4046b0cfb	p2p/discover: move wait condition to lookupIterator	2025-09-10 20:10:51 +02:00
Felix Lange	f8e0e8dc55	p2p/discover: add context in waitForNodes	2025-09-10 20:10:51 +02:00
Felix Lange	46e4f0b5c1	p2p/discover: add waitForNodes	2025-09-10 20:10:51 +02:00
Csaba Kiraly	1f7f95d718	p2p/discover: remove delay from discv5 RandomNodes (#32517) Refresh is doing some lookups and thus it could block for some time. We do not want the initializer of an iterator to block. If there is something blocking, it should happen when calling Next. Here, next will start a lookup, which will wait if needed (no nodes), making sure the iterator's Next is not creating a busy loop. Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>	2025-09-10 19:51:04 +02:00
Zach Brown	2a795c14f4	all: fix problematic function name in comment (#32513) Fix problematic function name in comment. Do my best to correct them all with a script to avoid spamming PRs.	2025-08-29 08:54:23 +08:00
cui	9b2e8e7ce3	p2p: use slices.Clone (#32428) Replaces a helper method with slices.Clone	2025-08-25 11:30:51 +02:00
Ocenka	276ed4848c	p2p/discover: add discv5 invalid findnodes result test cases (#32481) Supersedes #32470. ### What - snap: shorten stall watchdog in `eth/protocols/snap/sync_test.go` from 1m to 10s. - discover/v5: consolidate FINDNODE negative tests into a single table-driven test: - `TestUDPv5_findnodeCall_InvalidNodes` covers: - invalid IP (unspecified `0.0.0.0`) → ignored - low UDP port (`<=1024`) → ignored ### Why - Addresses TODOs: - “Make tests smaller” (reduce long 1m timeout). - “check invalid IPs”; also cover low port per `verifyResponseNode` rules (UDP must be >1024). ### How it’s validated - Test-only changes; no production code touched. - Local runs: - `go test ./p2p/discover -count=1 -timeout=300s` → ok - `go test ./eth/protocols/snap -count=1 -timeout=600s` → ok - Lint: - `go run build/ci.go lint` → 0 issues on modified files. ### Notes - The test harness uses `enode.ValidSchemesForTesting` (which includes the “null” scheme), so records signed with `enode.SignNull` are signature-valid; failures here are due to IP/port validation in `verifyResponseNode` and `netutil.CheckRelayAddr`. - Tests are written as a single table-driven function for clarity; no helpers or environment switching. --------- Co-authored-by: lightclient <lightclient@protonmail.com>	2025-08-22 11:44:11 -06:00
cui	f3467d1e63	p2p: remove todo comment, as it's unnecessary (#32397) As mentioned in https://github.com/ethereum/go-ethereum/pull/32351, I think this comment is unnecessary.	2025-08-21 15:48:46 -06:00
cui	997dff4fae	p2p: using math.MaxInt32 from go std lib (#32357) Co-authored-by: Felix Lange <fjl@twurst.com>	2025-08-20 16:22:21 -06:00
Klimov Sergei	62ac0e05b6	p2p: update MaxPeers comment (#32414)	2025-08-19 20:14:11 +08:00
cui	2b38daa48c	p2p: refactor to use time.Now().UnixMilli() in golang std lib (#32402)	2025-08-14 16:28:57 +08:00


771 commits