The endSpan closure accepted error by value, meaning deferred calls like
defer spanEnd(err) captured the error at defer-time (always nil), not at
function-return time. This meant errors were never recorded on spans.
- Changed endSpan to accept *error
- Updated all call sites in rpc/handler.go to pass error pointers, and
adjusted handleCall to avoid propagating child-span errors to the parent
- Added TestTracingHTTPErrorRecording to verify that errors from RPC
methods are properly recorded on the rpc.runMethod span
This PR adds support for the extraction of OpenTelemetry trace context
from incoming JSON-RPC request headers, allowing geth spans to be linked
to upstream traces when present.
---------
Co-authored-by: lightclient <lightclient@protonmail.com>
Add Open Telemetry tracing inside the RPC server to help attribute runtime costs within `handler.handleCall()`. In particular, it allows us to distinguish time spent decoding arguments, invoking methods via reflection, and actually executing the method and constructing/encoding JSON responses.
---------
Co-authored-by: lightclient <lightclient@protonmail.com>
Updated the `avail` calculation to correctly compute remaining capacity:
`buf.limit - len(buf.output)`, ensuring the buffer never exceeds its
configured limit regardless of how many times `Write()` is called.
Found in
https://github.com/ethereum/go-ethereum/actions/runs/17803828253/job/50611300621?pr=32585
```
--- FAIL: TestClientCancelWebsocket (0.33s)
panic: read tcp 127.0.0.1:36048->127.0.0.1:38643: read: connection reset by peer [recovered, repanicked]
goroutine 15 [running]:
testing.tRunner.func1.2({0x98dd20, 0xc0005b0100})
/opt/actions-runner/_work/_tool/go/1.25.1/x64/src/testing/testing.go:1872 +0x237
testing.tRunner.func1()
/opt/actions-runner/_work/_tool/go/1.25.1/x64/src/testing/testing.go:1875 +0x35b
panic({0x98dd20?, 0xc0005b0100?})
/opt/actions-runner/_work/_tool/go/1.25.1/x64/src/runtime/panic.go:783 +0x132
github.com/ethereum/go-ethereum/rpc.httpTestClient(0xc0001dc1c0?, {0x9d5e40, 0x2}, 0xc0002bc1c0)
/opt/actions-runner/_work/go-ethereum/go-ethereum/rpc/client_test.go:932 +0x2b1
github.com/ethereum/go-ethereum/rpc.testClientCancel({0x9d5e40, 0x2}, 0xc0001dc1c0)
/opt/actions-runner/_work/go-ethereum/go-ethereum/rpc/client_test.go:356 +0x15f
github.com/ethereum/go-ethereum/rpc.TestClientCancelWebsocket(0xc0001dc1c0?)
/opt/actions-runner/_work/go-ethereum/go-ethereum/rpc/client_test.go:319 +0x25
testing.tRunner(0xc0001dc1c0, 0xa07370)
/opt/actions-runner/_work/_tool/go/1.25.1/x64/src/testing/testing.go:1934 +0xea
created by testing.(*T).Run in goroutine 1
/opt/actions-runner/_work/_tool/go/1.25.1/x64/src/testing/testing.go:1997 +0x465
FAIL github.com/ethereum/go-ethereum/rpc 0.371s
```
In `testClientCancel` we wrap the server listener in `flakeyListener`,
which schedules an unconditional close of every accepted connection
after a random delay, if the random delay is zero then the timer fires
immediately, and then the http client paniced of connection reset by
peer.
Here we add a minimum 10ms to ensure the timeout won't fire immediately.
Signed-off-by: jsvisa <delweng@gmail.com>
closes#32240#32232
The main cause for the time out is the slow json encoding of large data.
In #32240 they tried to resolve the issue by reducing the size of the
test. However as Felix pointed out, the test is still kind of confusing.
I've refactored the test so it is more understandable and have reduced
the amount of data needed to be json encoded. I think it is still
important to ensure that the default read limit is not active, so I have
retained one large (~32 MB) test case, but it's at least smaller than
the existing ~64 MB test case.
Exposing the public method to setReadLimits for Websocket RPC to
prevent OOM.
Current, Geth Server is using a default 32MB max read limit (message
size) for websocket, which is prune to being attacked for OOM. Any one
can easily launch a client to send a bunch of concurrent large request
to cause the node to crash for OOM. One example of such script that can
easily crash a Geth node running websocket server is like this:
ec830979ac/poc.go
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
---
**Description:**
- Replaced outdated GitHub wiki links with current, official
documentation URLs.
- Removed links that redirect or are no longer relevant.
- Ensured all references point to up-to-date and reliable sources.
---
This change adds a limit for RPC method names to prevent potential abuse
where large method names could lead to large response sizes.
The limit is enforced in:
- handleCall for regular RPC method calls
- handleSubscribe for subscription method calls
Added tests in websocket_test.go to verify the length limit
functionality for both regular method calls and subscriptions.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
Changelog: https://golangci-lint.run/product/changelog/#1610
Removes `exportloopref` (no longer needed), replaces it with
`copyloopvar` which is basically the opposite.
Also adds:
- `durationcheck`
- `gocheckcompilerdirectives`
- `reassign`
- `mirror`
- `tenv`
---------
Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de>
Here we add distinct error messages for network timeouts and JSON parsing errors.
Note this specifically applies to HTTP connections serving a single RPC request.
Co-authored-by: Felix Lange <fjl@twurst.com>
It turns out that encoding json.RawMessage is slow because
package json basically parses the message again to ensure it is valid.
We can avoid the slowdown by encoding the entire RPC notification once,
which yields a 30% speedup.
* rpc: make subscription test faster
reduces time for TestClientSubscriptionChannelClose
from 25 sec to < 1 sec.
* trie: cache trie nodes for faster sanity check
This reduces the time spent on TestIncompleteSyncHash
from ~25s to ~16s.
* core/forkid: speed up validation test
This takes the validation test from > 5s to sub 1 sec
* core/state: improve snapshot test run
brings the time for TestSnapshotRandom from 13s down to 6s
* accounts/keystore: improve keyfile test
This removes some unnecessary waits and reduces the
runtime of TestUpdatedKeyfileContents from 5 to 3 seconds
* trie: remove resolver
* trie: only check ~5% of all trie nodes
The String() version of BlockNumberOrHash uses decimal for all block numbers, including negative ones used to indicate labels. Switch to using BlockNumber.String() which encodes it correctly for use in the JSON-RPC API.
We're trying a new named pipe library, which should hopefully fix some occasional failures in CI.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
This should fix#27726. With enough load, it might happen that the SetPongHandler
callback gets invoked before the call to SetReadDeadline is made in pingLoop. When
this occurs, the socket will end up with a 30s read deadline even though it got the pong,
which will lead to a timeout.
The fix here is processing the pong on pingLoop, synchronizing with the code that
sends the ping.
Package rpc uses cgo to find the maximum UNIX domain socket path
length. If exceeded, a warning is printed. This is the only use of cgo in this
package. It seems excessive to depend on cgo just for this warning, so
we now hard-code the usual limit for Linux instead.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
This adds two ways to check for subscription support. First, one can now check
whether the transport method (HTTP/WS/etc.) is capable of subscriptions using
the new Client.SupportsSubscriptions method.
Second, the error returned by Subscribe can now reliably be tested using this
pattern:
sub, err := client.Subscribe(...)
if errors.Is(err, rpc.ErrNotificationsUnsupported) {
// no subscription support
}
---------
Co-authored-by: Felix Lange <fjl@twurst.com>