Commit graph

130 commits

Author SHA1 Message Date
CPerezz
c2e1785a48
eth/protocols/snap: restore peers to idle pool on request revert (#33790)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
All five `revert*Request` functions (account, bytecode, storage,
trienode heal, bytecode heal) remove the request from the tracked set
but never restore the peer to its corresponding idle pool. When a
request times out and no response arrives, the peer is permanently lost
from the idle pool, preventing new work from being assigned to it.

In normal operation mode (snap-sync full state) this bug is masked by
pivot movement (which resets idle pools via new Sync() cycles every ~15
minutes) and peer churn (reconnections re-add peers via Register()).
However in scenarios like the one I have running my (partial-stateful
node)[https://github.com/ethereum/go-ethereum/pull/33764] with
long-running sync cycles and few peers, all peers can eventually leak
out of the idle pools, stalling sync entirely.

Fix: after deleting from the request map, restore the peer to its idle
pool if it is still registered (guards against the peer-drop path where
Unregister already removed the peer). This mirrors the pattern used in
all five On* response handlers.


This only seems to manifest in peer-thirstly scenarios as where I find
myself when testing snapsync for the partial-statefull node).
Still, thought was at least good to raise this point. Unsure if required
to discuss or not
2026-02-24 09:14:11 +08:00
Felix Lange
0cba803fba
eth/protocols/eth, eth/protocols/snap: delayed p2p message decoding (#33835)
Some checks failed
/ Linux Build (push) Has been cancelled
/ Linux Build (arm) (push) Has been cancelled
/ Keeper Build (push) Has been cancelled
/ Windows Build (push) Has been cancelled
/ Docker Image (push) Has been cancelled
This changes the p2p protocol handlers to delay message decoding. It's
the first part of a larger change that will delay decoding all the way
through message processing. For responses, we delay the decoding until
it is confirmed that the response matches an active request and does not
exceed its limits.

In order to make this work, all messages have been changed to use
rlp.RawList instead of a slice of the decoded item type. For block
bodies specifically, the decoding has been delayed all the way until
after verification of the response hash.

The role of p2p/tracker.Tracker changes significantly in this PR. The
Tracker's original purpose was to maintain metrics about requests and
responses in the peer-to-peer protocols. Each protocol maintained a
single global Tracker instance. As of this change, the Tracker is now
always active (regardless of metrics collection), and there is a
separate instance of it for each peer. Whenever a response arrives, it
is first verified that a request exists for it in the tracker. The
tracker is also the place where limits are kept.
2026-02-15 21:21:16 +08:00
Felix Lange
8e1de223ad
crypto/keccak: vendor in golang.org/x/crypto/sha3 (#33323)
The upstream libray has removed the assembly-based implementation of
keccak. We need to maintain our own library to avoid a peformance
regression.

---------

Co-authored-by: lightclient <lightclient@protonmail.com>
2026-02-03 14:55:27 -07:00
cui
ed264a1f19
eth/protocols/snap: optimize incHash (#32748) 2025-10-10 13:48:25 +08:00
Zach Brown
f770ec1ffc
common, eth: remove duplicate test cases (#32624)
Remove redundant duplicate test vectors. The two entries were identical
and back-to-back, providing no additional coverage while adding noise.
Keeping a single instance maintains test intent and clarity without
altering behavior.
2025-09-19 17:20:44 -06:00
cui
3ebb1431d3
eth: using testing.B.Loop (#32657)
before:
go test -run=^$ -bench=. ./eth/... 827.57s user 23.80s system 361% cpu
3:55.49 total

after:
go test -run=^$ -bench=. ./eth/... 281.62s user 13.62s system 245% cpu
2:00.49 total
2025-09-19 17:00:29 -06:00
Marius van der Wijden
b369a855fb
eth/protocols/snap: add healing and syncing metrics (#32258)
Some checks failed
/ Linux Build (push) Has been cancelled
/ Linux Build (arm) (push) Has been cancelled
/ Windows Build (push) Has been cancelled
/ Docker Image (push) Has been cancelled
Adds the heal time and snap sync time to grafana

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
2025-07-24 16:43:04 +08:00
Zhou
becca46010
eth/protocols/snap: fix negative eta in state progress logging (#32225) 2025-07-17 10:59:47 +08:00
asamuj
d7db10ddbd
eth/protocols/snap, p2p/discover: improve zero time checks (#32214)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
2025-07-15 14:20:45 +08:00
rjl493456442
ac50181b74
core: consolidate BlockChain constructor options (#31925)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Docker Image (push) Waiting to run
In this pull request, the original `CacheConfig` has been renamed to `BlockChainConfig`.

Over time, more fields have been added to `CacheConfig` to support
blockchain configuration. Such as `ChainHistoryMode`, which clearly extends
beyond just caching concerns.

Additionally, adding new parameters to the blockchain constructor has
become increasingly complicated, since it’s initialized across multiple
places in the codebase. A natural solution is to consolidate these arguments 
into a dedicated configuration struct.

As a result, the existing `CacheConfig` has been redefined as `BlockChainConfig`.
Some parameters, such as `VmConfig`, `TxLookupLimit`, and `ChainOverrides`
have been moved into `BlockChainConfig`. Besides, a few fields in `BlockChainConfig`
were renamed, specifically:

- `TrieCleanNoPrefetch` -> `NoPrefetch`
- `TrieDirtyDisabled` -> `ArchiveMode`

Notably, this change won't affect the command line flags or the toml
configuration file. It's just an internal refactoring and fully backward-compatible.

---------

Co-authored-by: Felix Lange <fjl@twurst.com>
2025-06-19 12:21:15 +02:00
rjl493456442
892a661ee2
core, triedb/pathdb: final integration (snapshot integration pt 5) (#30661)
In this pull request, snapshot generation in pathdb has been ported from 
the legacy state snapshot implementation. Additionally, when running in 
path mode, legacy state snapshot data is now managed by the pathdb
based snapshot logic.

Note: Existing snapshot data will be re-generated, regardless of whether 
it was previously fully constructed.
2025-05-16 18:29:38 +08:00
Martin HS
9045b79bc2
metrics, cmd/geth: change init-process of metrics (#30814)
This PR modifies how the metrics library handles `Enabled`: previously,
the package `init` decided whether to serve real metrics or just
dummy-types.

This has several drawbacks: 
- During pkg init, we need to determine whether metrics are enabled or
not. So we first hacked in a check if certain geth-specific
commandline-flags were enabled. Then we added a similar check for
geth-env-vars. Then we almost added a very elaborate check for
toml-config-file, plus toml parsing.

- Using "real" types and dummy types interchangeably means that
everything is hidden behind interfaces. This has a performance penalty,
and also it just adds a lot of code.

This PR removes the interface stuff, uses concrete types, and allows for
the setting of Enabled to happen later. It is still assumed that
`metrics.Enable()` is invoked early on.

The somewhat 'heavy' operations, such as ticking meters and exp-decay,
now checks the enable-flag to prevent resource leak.

The change may be large, but it's mostly pretty trivial, and from the
last time I gutted the metrics, I ensured that we have fairly good test
coverage.

---------

Co-authored-by: Felix Lange <fjl@twurst.com>
2024-12-10 13:27:29 +01:00
rjl493456442
03c37cdb2b
core/state: introduce code reader interface (#30816)
This PR introduces a `ContractCodeReader` interface with functions defined:

type ContractCodeReader interface {
	Code(addr common.Address, codeHash common.Hash) ([]byte, error)
	CodeSize(addr common.Address, codeHash common.Hash) (int, error)
}

This interface can be implemented in various ways. Although the codebase
currently includes only one implementation, additional implementations
could be created for different purposes and scenarios, such as a code
reader designed for the Verkle tree approach or one that reads code from
the witness.

*Notably, this interface modifies the function’s semantics. If the
contract code is not found, no error will be returned. An error should
only be returned in the event of an unexpected issue, primarily for
future implementations.*

The original state.Reader interface is extended with ContractCodeReader
methods, it gives us more flexibility to manipulate the reader with additional
logic on top, e.g. Hooks.

type Reader interface {
	ContractCodeReader
	StateReader
}

---------

Co-authored-by: Felix Lange <fjl@twurst.com>
2024-11-29 15:32:45 +01:00
rjl493456442
b6c62d5887
core, trie, triedb: minor changes from snapshot integration (#30599)
This change ports some non-important changes from https://github.com/ethereum/go-ethereum/pull/30159, including interface renaming and some trivial refactorings.
2024-10-18 17:06:31 +02:00
Martin HS
5adc314817
build: update to golangci-lint 1.61.0 (#30587)
Changelog: https://golangci-lint.run/product/changelog/#1610 

Removes `exportloopref` (no longer needed), replaces it with
`copyloopvar` which is basically the opposite.

Also adds: 
- `durationcheck`
- `gocheckcompilerdirectives`
- `reassign`
- `mirror`
- `tenv`

---------

Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de>
2024-10-14 19:25:22 +02:00
Marius van der Wijden
b0b67be0a2
all: remove forkchoicer and reorgNeeded (#29179)
This PR changes how sidechains are handled. 

Before the merge, it was possible to import a chain with lower td and not set it as canonical. After the merge, we expect every chain that we get via InsertChain to be canonical. Non-canonical blocks can still be inserted
with InsertBlockWIthoutSetHead.

If during the InsertChain, the existing chain is not canonical anymore, we mark it as a sidechain and send the SideChainEvents normally.
2024-09-04 15:03:06 +02:00
Felix Lange
6eb42a6b4f
eth: dial nodes from discv5 (#30302)
Here I am adding a discv5 nodes source into the p2p dial iterator. It's
an improved version of #29533.

Unlike discv4, the discv5 random nodes iterator will always provide full
ENRs. This means we can apply filtering to the results and will only try
dialing nodes which explictly opt into the eth protocol with a matching
chain.

I have also removed the dial iterator from snap. We don't have an
official DNS list for snap anymore, and I doubt anyone else is running
one. While we could potentially filter for snap on discv5, there will be
very few nodes announcing it, and the extra iterator would just stall
the dialer.

---------

Co-authored-by: lightclient <lightclient@protonmail.com>
2024-08-15 22:14:42 +02:00
rjl493456442
5adf4adc8e
eth/protocols/snap: cleanup dangling account trie nodes due to incomplete storage (#30258)
This pull request fixes #30229.
 
During snap sync, large storage will be split into several pieces and
synchronized concurrently. Unfortunately, the tradeoff is that the respective
merkle trie of each storage chunk will be incomplete due to the incomplete
boundaries. The trie nodes on these boundaries will be discarded, and any
dangling nodes on disk will also be removed if they fall on these paths,
ensuring the state healer won't be blocked.

However, the dangling account trie nodes on the path from the root to the
associated account are left untouched. This means the dangling account trie
nodes could potentially stop the state healing and break the assumption that the
entire subtrie should exist if the subtrie root exists. We should consider the
account trie node as the ancestor of the corresponding storage trie node.

In the scenarios described in the above ticket, the state corruption could occur
if there is a dangling account trie node while some storage trie nodes are
removed due to synchronization redo.

The fixing idea is pretty straightforward, the trie nodes on the path from root
to account should all be explicitly removed if an incomplete storage trie
occurs. Therefore, a `delete` operation has been added into `gentrie` to
explicitly clear the account along with all nodes on this path. The special
thing is that it's a cross-trie clearing. In theory, there may be a dangling
node at any position on this account key and we have to clear all of them.
2024-08-12 10:43:54 +02:00
maskpp
00675c5876
trie/trienode: avoid unnecessary copy (#30019)
* avoid unnecessary copy

* delete the never used function ProofList

* eth/protocols/snap, trie/trienode: polish the code

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
2024-06-20 11:47:29 +08:00
jwasinger
69351e8b0f
core/state, eth/protocols, trie, triedb/pathdb: remove unused error from trie Commit (#29869)
* core/state, eth/protocols, trie, triedb/pathdb:  remove unused error return from trie Commit

* move set back to account-trie-update block scoping for easier readability

* address review

* undo tests submodule change

* trie:  panic if BatchSerialize returns an error in Verkle trie Commit

* trie: verkle comment nitpicks

---------

Co-authored-by: Péter Szilágyi <peterke@gmail.com>
2024-06-12 12:23:16 +03:00
tianyeyouyou
a6751d6fc8
core/rawdb,eth/protocols,p2p: prealloc slice size (#29893)
chore: prealloc slice size
2024-06-03 15:51:04 +03:00
PolyMa
06263b1b35
all: fix typos in comments (#29873)
fix using `a` & `the` simutaneously
2024-05-29 12:24:10 +02:00
Mobin Mohanan
7224576fba
core, eth/protocols/snap, internal/ethapi: remove redundant types (#29841) 2024-05-27 14:39:39 +08:00
rjl493456442
473ee8fc07
trie, eth/protocols/snap: sanitize the committed node data (#29485) 2024-05-16 17:58:35 +08:00
rjl493456442
9f96e07c1c
core/rawdb, trie: improve db APIs for accessing trie nodes (#29362)
* core/rawdb, trie: improve db APIs for accessing trie nodes

* triedb/pathdb: fix
2024-04-30 16:25:35 +02:00
haoran
b2b0e1da8c
all: fix various typos (#29600)
* core: fix typo

* rpc: fix typo

* snap: fix typo

* trie: fix typo

* main: fix typo

* abi: fix typo

* main: fix field comment for basicOp
2024-04-23 13:09:42 +03:00
rjl493456442
d3c4466edd
core, eth/protocols/snap, trie: fix cause for snap-sync corruption, implement gentrie (#29313)
This pull request defines a gentrie for snap sync purpose.

The stackTrie is used to generate the merkle tree nodes upon receiving a state batch. Several additional options have been added into stackTrie to handle incomplete states (either missing states before or after).

In this pull request, these options have been relocated from stackTrie to genTrie, which serves as a wrapper for stackTrie specifically for snap sync purposes.

Further, the logic for managing incomplete state has been enhanced in this change. Originally, there are two cases handled:

-    boundary node filtering
-    internal (covered by extension node) node clearing

This changes adds one more:
 
- Clearing leftover nodes on the boundaries.

This feature is necessary if there are leftover trie nodes in database, otherwise node inconsistency may break the state healing.
2024-04-16 09:05:36 +02:00
rjl493456442
9dcf8aae47
eth/protocols/snap: skip retrieval for completed storages (#29378)
* eth/protocols/snap: skip retrieval for completed storages

* eth/protocols/snap: address comments from peter

* eth/protocols/snap: add comments
2024-04-10 12:02:45 +03:00
rjl493456442
304879da20
eth/protocols/snap: check storage root existence for hash scheme (#29341) 2024-03-27 09:35:33 +08:00
Aaron Chen
723b1e36ad
all: fix mismatched names in comments (#29348)
* all: fix mismatched names in comments

* metrics: fix mismatched name in UpdateIfGt
2024-03-26 21:01:28 +01:00
Martin HS
14cc967d19
all: remove dependency on golang.org/exp (#29314)
This change includes a leftovers from https://github.com/ethereum/go-ethereum/pull/29307
- using the [new `slices` package](https://go.dev/doc/go1.21#slices) and
- using the [new `cmp.Ordered`](https://go.dev/doc/go1.21#cmp) instead of exp `constraints.Ordered`
2024-03-25 07:50:18 +01:00
Martin HS
04bf1c802f
eth/protocols/snap, internal/testlog: fix dataraces (#29301) 2024-03-20 15:22:52 +01:00
rjl493456442
15eb9773f9
triedb/pathdb: improve tests (#29278) 2024-03-19 10:50:08 +08:00
hyhnet
cd490608e3
all: fix typos in comments (#29186) 2024-03-07 22:56:19 +01:00
Undefinedor
00905f7dc4
all: remove redundant import aliases (#29144) 2024-03-02 22:42:50 +02:00
buddho
bba3fa9af9
core,eth,internal: fix typo (#29024) 2024-02-20 19:42:48 +08:00
Sina Mahmoodi
95741b1844
core: move genesis alloc types to core/types (#29003)
We want to use these types in public user-facing APIs, so they shouldn't be in core.

Co-authored-by: Felix Lange <fjl@twurst.com>
2024-02-16 19:05:33 +01:00
rjl493456442
fe91d476ba
all: remove the dependency from trie to triedb (#28824)
This change removes the dependency from trie package to triedb package.
2024-02-13 14:49:53 +01:00
Martin HS
a5a4fa7032
all: use uint256 in state (#28598)
This change makes use of uin256 to represent balance in state. It touches primarily upon statedb, stateobject and state processing, trying to avoid changes in transaction pools, core types, rpc and tracers.
2024-01-23 14:51:58 +01:00
Martin Holst Swende
2391fbc676
tests/fuzzers: move fuzzers into native packages (#28467)
This PR moves our fuzzers from tests/fuzzers into whatever their respective 'native' package is.

The historical reason why they were placed in an external location, is that when they were based on go-fuzz, they could not be "hidden" via the _test.go prefix. So in order to shove them away from the go-ethereum "production code", they were put aside.

But now we've rewritten them to be based on golang testing, and thus can be brought back. I've left (in tests/) the ones that are not production (bls128381), require non-standard imports (secp requires btcec, bn256 requires gnark/google/cloudflare deps).

This PR also adds a fuzzer for precompiled contracts, because why not.

This PR utilizes a newly rewritten replacement for go-118-fuzz-build, namely gofuzz-shim, which utilises the inputs from the fuzzing engine better.
2023-11-14 14:34:29 +01:00
rjl493456442
ab04aeb855
core, eth, trie: filter out boundary nodes and remove dangling nodes in stacktrie (#28327)
* core, eth, trie: filter out boundary nodes in stacktrie

* eth/protocol/snap: add comments

* Update trie/stacktrie.go

Co-authored-by: Martin Holst Swende <martin@swende.se>

* eth, trie: remove onBoundary callback

* eth/protocols/snap: keep complete boundary nodes

* eth/protocols/snap: skip healing if the storage trie is already complete

* eth, trie: add more metrics

* eth, trie: address comment

---------

Co-authored-by: Martin Holst Swende <martin@swende.se>
2023-10-23 18:31:56 +03:00
rjl493456442
1b1611b8d0
core, trie, eth: refactor stacktrie constructor (#28350)
This change enhances the stacktrie constructor by introducing an option struct. It also simplifies the `Hash` and `Commit` operations, getting rid of the special handling round root node.
2023-10-17 14:09:25 +02:00
Martin Holst Swende
f62c58f8de
trie: make rhs-proof align with last key in range proofs (#28311)
During snap-sync, we request ranges of values: either a range of accounts or a range of storage values. For any large trie, e.g. the main account trie or a large storage trie, we cannot fetch everything at once.

Short version; we split it up and request in multiple stages. To do so, we use an origin field, to say "Give me all storage key/values where key > 0x20000000000000000". When the server fulfils this, the server provides the first key after origin, let's say 0x2e030000000000000 -- never providing the exact origin. However, the client-side needs to be able to verify that the 0x2e03.. indeed is the first one after 0x2000.., and therefore the attached proof concerns the origin, not the first key.

So, short-short version: the left-hand side of the proof relates to the origin, and is free-standing from the first leaf.

On the other hand, (pun intended), the right-hand side, there's no such 'gap' between "along what path does the proof walk" and the last provided leaf. The proof must prove the last element (unless there are no elements).

Therefore, we can simplify the semantics for trie.VerifyRangeProof by removing an argument. This doesn't make much difference in practice, but makes it so that we can remove some tests. The reason I am raising this is that the upcoming stacktrie-based verifier does not support such fancy features as standalone right-hand borders.
2023-10-13 16:05:29 +02:00
rjl493456442
1cb3b6aee4
eth/protocols/snap: fix snap sync failure on empty storage range (#28306)
This change addresses an issue in snap sync, specifically when the entire sync process can be halted due to an encountered empty storage range.

Currently, on the snap sync client side, the response to an empty (partial) storage range is discarded as a non-delivery. However, this response can be a valid response, when the particular range requested does not contain any slots.

For instance, consider a large contract where the entire key space is divided into 16 chunks, and there are no available slots in the last chunk [0xf] -> [end]. When the node receives a request for this particular range, the response includes:

    The proof with origin [0xf]
    A nil storage slot set

If we simply discard this response, the finalization of the last range will be skipped, halting the entire sync process indefinitely. The test case TestSyncWithUnevenStorage can reproduce the scenario described above.

In addition, this change also defines the common variables MaxAddress and MaxHash.
2023-10-13 09:08:26 +02:00
Martin Holst Swende
8976a0c97a
trie: remove owner and binary marshaling from stacktrie (#28291)
This change
  - Removes the owner-notion from a stacktrie; the owner is only ever needed for comitting to the database, but the commit-function, the `writeFn` is provided by the caller, so the caller can just set the owner into the `writeFn` instead of having it passed through the stacktrie.
  - Removes the `encoding.BinaryMarshaler`/`encoding.BinaryUnmarshaler` interface from stacktrie. We're not using it, and it is doubtful whether anyone downstream is either.
2023-10-11 06:12:45 +02:00
Martin Holst Swende
6b1e4f4211
all: move light.NodeSet to trienode.ProofSet (#28287)
This is a minor refactor in preparation of changes to range verifier. This PR contains no intentional functional changes but moves (and renames) the light.NodeSet
2023-10-10 10:30:47 +02:00
Péter Szilágyi
be65b47645
all: update golang/x/ext and fix slice sorting fallout (#27909)
The Go authors updated golang/x/ext to change the function signature of the slices sort method. 
It's an entire shitshow now because x/ext is not tagged, so everyone's codebase just 
picked a new version that some other dep depends on, causing our code to fail building.

This PR updates the dep on our code too and does all the refactorings to follow upstream...
2023-08-12 00:04:12 +02:00
rjl493456442
503f1f7ada
all: activate pbss as experimental feature (#26274)
* all: activate pbss

* core/rawdb: fix compilation error

* cma, core, eth, les, trie: address comments

* cmd, core, eth, trie: polish code

* core, cmd, eth: address comments

* cmd, core, eth, les, light, tests: address comment

* cmd/utils: shorten log message

* trie/triedb/pathdb: limit node buffer size to 1gb

* cmd/utils: fix opening non-existing db

* cmd/utils: rename flag name

* cmd, core: group chain history flags and fix tests

* core, eth, trie: fix memory leak in snapshot generation

* cmd, eth, internal: deprecate flags

* all: enable state tests for pathdb, fixes

* cmd, core: polish code

* trie/triedb/pathdb: limit the node buffer size to 256mb

---------

Co-authored-by: Martin Holst Swende <martin@swende.se>
Co-authored-by: Péter Szilágyi <peterke@gmail.com>
2023-08-10 22:21:36 +03:00
Péter Szilágyi
6e934f40f9
eth/protocols/snap: fix batch writer when resuming an aborted sync (#27842) 2023-08-03 14:51:02 +03:00
rjl493456442
88f3d61468
all: expose block number information to statedb (#27753)
* core/state: clean up

* all: add block number infomration to statedb

* core, trie: rename blockNumber to block
2023-07-24 13:22:09 +03:00