Commit graph

37 commits

Author SHA1 Message Date
sashass1315
919b238c82
triedb/pathdb: return nodeLoc by value to avoid heap allocation (#33819) 2026-02-11 22:14:43 +08:00
rjl493456442
add1890a57
triedb/pathdb: enable trienode history (#32621)
It's the part-4 for trienode history. The trienode history persistence
has been enabled with this PR by flag `history.trienode <non-negative-number>`
2026-01-17 21:23:48 +08:00
Guillaume Ballet
2a2f106a01
cmd/evm/internal/t8ntool, trie: support for verkle-at-genesis, use UBT, and move the transition tree to its own package (#32445)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Keeper Build (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
This is broken off of #31730 to only focus on testing networks that
start with verkle at genesis.

The PR has seen a lot of work since its creation, and it now targets
creating and re-executing tests for a binary tree testnet without the
transition (so it starts at genesis). The transition tree has been moved
to its own package. It also replaces verkle with the binary tree for
this specific application.

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
2025-11-14 15:25:30 +01:00
rjl493456442
de24450dbf
core/rawdb, triedb/pathdb: introduce trienode history (#32596)
It's a pull request based on the #32523 , implementing the structure of
trienode history.
2025-10-10 14:51:27 +08:00
rjl493456442
21769f3474
triedb/pathdb: generalize the history indexer (#32523)
This pull request is based on #32306 , is the second part for shipping
trienode history.

Specifically, this pull request generalize the existing index mechanism,
making is usable
by both state history and trienode history in the near future.
2025-09-17 15:57:16 +02:00
Delweng
c4ec4504bb
core/state: state size tracking (#32362)
Add state size tracking and retrieve api, start geth with `--state.size-tracking`, 
the initial bootstrap is required (around 1h on mainnet), after the bootstrap, 
use `debug_stateSize()` RPC to retrieve the state size:

```
> debug.stateSize()
{
  accountBytes: "0x39681967b",
  accountTrienodeBytes: "0xc57939f0c",
  accountTrienodes: "0x198b36ac",
  accounts: "0x129da14a",
  blockNumber: "0x1635e90",
  contractCodeBytes: "0x2b63ef481",
  contractCodes: "0x1c7b45",
  stateRoot: "0x9c36a3ec3745d72eea8700bd27b90dcaa66de0494b187c5600750044151e620a",
  storageBytes: "0x18a6e7d3f1",
  storageTrienodeBytes: "0x2e7f53fae6",
  storageTrienodes: "0x6e49a234",
  storages: "0x517859c5"
}
```

---------

Signed-off-by: jsvisa <delweng@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
2025-09-08 14:00:23 +08:00
rjl493456442
902ec5baae
cmd, core, eth, triedb/pathdb: track node origins in the path database (#32418)
This PR is the first step in the trienode history series.

It introduces the `nodeWithOrigin` struct in the path database, which tracks
the original values of dirty nodes to support trienode history construction.

Note, the original value is always empty in this PR, so it won't break the 
existing journal for encoding and decoding. The compatibility of journal 
should be handled in the following PR.
2025-09-05 10:37:05 +08:00
rjl493456442
7f78fa6912
triedb/pathdb, core: keep root->id mappings after truncation (#32502)
This pull request preserves the root->ID mappings in the path database
even after the associated state histories are truncated, regardless of
whether the truncation occurs at the head or the tail.

The motivation is to support an additional history type, trienode history. 
Since the root->ID mappings are shared between two history instances, 
they must not be removed by either one.

As a consequence, the root->ID mappings remain in the database even
after the corresponding histories are pruned. While these mappings may 
become  dangling, it is safe and cheap to keep them.

Additionally, this pull request enhances validation during historical
reader construction, ensuring that only canonical historical state will be
served.
2025-08-29 15:43:58 +08:00
rjl493456442
95ab643bb8
triedb/pathdb: refactor state history write (#32497)
This pull request refactors the internal implementation in path database
a bit, specifically:

- purge the state index data in batch
- simplify the logic of state history construction and index, make it more readable
2025-08-26 21:53:55 +08:00
rjl493456442
8c58f4920d
triedb/pathdb: rename history to state history (#32498)
This is a internal refactoring PR, renaming the history to stateHistory.

It's a pre-requisite PR for merging trienode history, avoid the name
conflict.
2025-08-26 08:52:39 +02:00
Delweng
17903fedf0
triedb/pathdb: introduce file-based state journal (#32060)
Introduce file-based state journal in path database, fixing
the Pebble restriction when the journal size exceeds 4GB.

---------

Signed-off-by: jsvisa <delweng@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
2025-07-15 11:45:20 +08:00
Delweng
c59c647ed7
triedb: reset state indexer after snap synced (#32104)
Fix the issue after initial snap sync with `gcmode=archive` enabled.

```
NewPayload: inserting block failed       error="history indexing is out of order, last: null, requested: 1"
```

---------

Signed-off-by: Delweng <delweng@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
2025-07-01 11:35:22 +08:00
rjl493456442
0c90e4bda0
all: incorporate state history indexing status into eth_syncing response (#32099)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Docker Image (push) Waiting to run
This pull request tracks the state indexing progress in eth_syncing
RPC response, i.e. we will return non-null syncing status until indexing
has finished.
2025-06-26 17:20:20 +02:00
rjl493456442
9c5c0e37bf
core/rawdb, triedb/pathdb: implement history indexer (#31156)
This pull request is part-1 for shipping the core part of archive node
in PBSS mode.
2025-06-24 14:36:12 +02:00
rjl493456442
21920207e4
triedb/pathdb, eth: use double-buffer mechanism in pathdb (#30464)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Docker Image (push) Waiting to run
Previously, PathDB used a single buffer to aggregate database writes,
which needed to be flushed atomically. However, flushing large amounts
of data (e.g., 256MB) caused significant overhead, often blocking the
system for around 3 seconds during the flush.

To mitigate this overhead and reduce performance spikes, a double-buffer
mechanism is introduced. When the active buffer fills up, it is marked
as frozen and a background flushing process is triggered. Meanwhile, a
new buffer is allocated for incoming writes, allowing operations to
continue uninterrupted.

This approach reduces system blocking times and provides flexibility in
adjusting buffer parameters for improved performance.
2025-06-22 20:40:54 +08:00
rjl493456442
8b9f2d4e36
triedb/pathdb: introduce lookup structure to optimize state access (#30971)
This pull request introduces a mechanism to improve state lookup
efficiency in pathdb by maintaining a lookup structure that eliminates
unnecessary iteration over diff layers.

The core idea is to track a mutation history for each dirty state entry
residing in the diff layers. This history records the state roots of all layers
in which the entry was modified, sorted from oldest to newest.

During state lookup, this mutation history is queried to find the most
recent layer whose state root either matches the target root or is a
descendant of it. This allows us to quickly identify the layer containing
the relevant data, avoiding the need to iterate through all diff layers from
top to bottom.

Besides, the overhead for state lookup is constant, no matter how many
diff layers are retained in the pathdb, which unlocks the potential to hold
more diff layers.

Of course, maintaining this lookup structure introduces some overhead.
For each state transition, we need to:
(a) update the mutation records for the modified state entries, and
(b) remove stale mutation records associated with outdated layers.

On our benchmark machine, it will introduce around 1ms overhead which is
acceptable.
2025-05-28 13:31:42 +02:00
rjl493456442
892a661ee2
core, triedb/pathdb: final integration (snapshot integration pt 5) (#30661)
In this pull request, snapshot generation in pathdb has been ported from 
the legacy state snapshot implementation. Additionally, when running in 
path mode, legacy state snapshot data is now managed by the pathdb
based snapshot logic.

Note: Existing snapshot data will be re-generated, regardless of whether 
it was previously fully constructed.
2025-05-16 18:29:38 +08:00
rjl493456442
10519768a2
core, ethdb: introduce database sync function (#31703)
This pull request introduces a SyncKeyValue function to the
ethdb.KeyValueStore
interface, providing the ability to forcibly flush all previous writes
to disk.

This functionality is critical for go-ethereum, which internally uses
two independent
database engines: a key-value store (such as Pebble, LevelDB, or
memoryDB for
testing) and a flat-file–based freezer. To ensure write-order
consistency between
these engines, the key-value store must be explicitly synced before
writing to the
freezer and vice versa.

Fixes 
- https://github.com/ethereum/go-ethereum/issues/31405
- https://github.com/ethereum/go-ethereum/issues/29819
2025-05-08 19:10:26 +08:00
Christina
9516e0f6b6
chore: fix various comments (#31082) 2025-01-28 16:56:23 +01:00
rjl493456442
37c0e6992e
cmd, core, miner: rework genesis setup (#30907)
This pull request refactors the genesis setup function, the major
changes are highlighted here:

**(a) Triedb is opened in verkle mode if `EnableVerkleAtGenesis` is
configured in chainConfig or the database has been initialized previously with
`EnableVerkleAtGenesis` configured**.

A new config field `EnableVerkleAtGenesis` has been added in the
chainConfig. This field must be configured with True if Geth wants to initialize 
the genesis in Verkle mode.

In the verkle devnet-7, the verkle transition is activated at genesis.
Therefore, the verkle rules should be used since the genesis. In production
networks (mainnet and public testnets), verkle activation always occurs after
the genesis block. Therefore, this flag is only made for devnet and should be
deprecated later. Besides, verkle transition at non-genesis block hasn't been
implemented yet, it should be done in the following PRs.

**(b) The genesis initialization condition has been simplified**
There is a special mode supported by the Geth is that: Geth can be
initialized with an existing chain segment, which can fasten the node sync
process by retaining the chain freezer folder.

Originally, if the triedb is regarded as uninitialized and the genesis block can
be found in the chain freezer, the genesis block along with genesis state will be
committed. This condition has been simplified to checking the presence of chain
config in key-value store. The existence of chain config can represent the genesis
has been committed.
2025-01-14 11:49:30 +01:00
rjl493456442
82e963e5c9
triedb/pathdb: configure different node hasher in pathdb (#31008)
As the node hash scheme in verkle and merkle are totally different, the
original default node hasher in pathdb is no longer suitable. Therefore,
this pull request configures different node hasher respectively.
2025-01-10 20:51:19 +08:00
rjl493456442
bc1ec69008
trie/pathdb: state iterator (snapshot integration pt 4) (#30654)
In this pull request, the state iterator is implemented. It's mostly a copy-paste
from the original state snapshot package, but still has some important changes
to highlight here:

(a) The iterator for the disk layer consists of a diff iterator and a disk iterator.

Originally, the disk layer in the state snapshot was a wrapper around the disk, 
and its corresponding iterator was also a wrapper around the disk iterator.
However, due to structural differences, the disk layer iterator is divided into
two parts:

- The disk iterator, which traverses the content stored on disk.
- The diff iterator, which traverses the aggregated state buffer.

Checkout `BinaryIterator` and `FastIterator` for more details.

(b) The staleness management is improved in the diffAccountIterator and
diffStorageIterator

Originally, in the `diffAccountIterator`, the layer’s staleness had to be checked 
within the Next function to ensure the iterator remained usable. Additionally, 
a read lock on the associated diff layer was required to first retrieve the account 
blob. This read lock protection is essential to prevent concurrent map read/write. 
Afterward, a staleness check was performed to ensure the retrieved data was 
not outdated.

The entire logic can be simplified as follows: a loadAccount callback is provided 
to retrieve account data. If the corresponding state is immutable (e.g., diff layers
in the path database), the staleness check can be skipped, and a single account 
data retrieval is sufficient. However, if the corresponding state is mutable (e.g., 
the disk layer in the path database), the callback can operate as follows:

```go
func(hash common.Hash) ([]byte, error) {
    dl.lock.RLock()
    defer dl.lock.RUnlock()

    if dl.stale {
        return nil, errSnapshotStale
    }
    return dl.buffer.states.mustAccount(hash)
}
```

The callback solution can eliminate the complexity for managing
concurrency with the read lock for atomic operation.
2024-12-16 21:10:08 +08:00
rjl493456442
05148d972c
triedb/pathdb: track flat state changes in pathdb (snapshot integration pt 2) (#30643)
This pull request ports some changes from the main state snapshot
integration one, specifically introducing the flat state tracking in
pathdb.

Note, the tracked flat state changes are only held in memory and won't
be persisted in the disk. Meanwhile, the correspoding state retrieval in
persistent state is also not supported yet. The states management in
disk is more complicated and will be implemented in a separate pull
request.

Part 1: https://github.com/ethereum/go-ethereum/pull/30752
2024-11-29 19:30:45 +08:00
rjl493456442
b6c62d5887
core, trie, triedb: minor changes from snapshot integration (#30599)
This change ports some non-important changes from https://github.com/ethereum/go-ethereum/pull/30159, including interface renaming and some trivial refactorings.
2024-10-18 17:06:31 +02:00
rjl493456442
f59d013e40
core/rawdb, triedb, cmd: create an isolated disk namespace for verkle (#30105)
* core, triedb/pathdb, cmd: define verkle state ancient store

* core/rawdb, triedb: add verkle namespace in pathdb
2024-07-16 16:17:58 +03:00
rjl493456442
045b9718d5
trie: relocate state execution logic into pathdb package (#29861) 2024-06-27 20:30:39 +08:00
rjl493456442
b88051ec83
core/rawdb, triedb/pathdb: fix freezer read-only option (#29823) 2024-05-28 14:41:11 +02:00
rjl493456442
9f96e07c1c
core/rawdb, trie: improve db APIs for accessing trie nodes (#29362)
* core/rawdb, trie: improve db APIs for accessing trie nodes

* triedb/pathdb: fix
2024-04-30 16:25:35 +02:00
rjl493456442
f46c878441
core/rawdb: implement in-memory freezer (#29135) 2024-04-30 11:33:22 +02:00
Martin HS
853e0c23f3
eth/catalyst, trie/pathdb: fix flaky tests (#29571)
This change fixes three flaky tests `TestEth2AssembleBlock`,`TestEth2NewBlock`, `TestEth2PrepareAndGetPayload` and `TestDisable`.

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
2024-04-23 10:33:36 +02:00
Guillaume Ballet
da7469e5c4
core: add an end-to-end verkle test (#29262)
core: add a simple verkle test

triedb, core: skip hash comparison in verkle

core: remove legacy daoFork logic in verkle chain maker

fix: nil pointer in tests

triedb/pathdb: add blob hex

core: less defensive

Co-authored-by: Ignacio Hagopian <jsign.uy@gmail.com>
Co-authored-by: Martin HS <martin@swende.se>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
2024-03-26 21:25:41 +01:00
rjl493456442
6490d9897a
cmd, triedb: implement history inspection (#29267)
This pull request introduces a database tool for inspecting the state history. 
It can be used for either account history or storage slot history, within a 
specific block range.

The state output format can be chosen either with

- the "rlp-encoded" values (those inserted into the merkle trie)
- the "rlp-decoded" value (the raw state value)

The latter one needs --raw flag.
2024-03-22 20:12:10 +08:00
rjl493456442
15eb9773f9
triedb/pathdb: improve tests (#29278) 2024-03-19 10:50:08 +08:00
rjl493456442
7b81cf6362
core/state, trie/triedb/pathdb: remove storage incomplete flag (#28940)
As SELF-DESTRUCT opcode is disabled in the cancun fork(unless the
account is created within the same transaction, nothing to delete
in this case). The account will only be deleted in the following
cases:

- The account is created within the same transaction. In this case
the original storage was empty.

- The account is empty(zero nonce, zero balance, zero code) and
is touched within the transaction. Fortunately this kind of accounts
are not-existent on ethereum-mainnet.

All in all, after cancun, we are pretty sure there is no large contract
deletion and we don't need this mechanism for oom protection.
2024-03-05 14:31:55 +01:00
Péter Szilágyi
865e1e9f57
cmd/utils, core/rawdb, triedb/pathdb: flip hash to path scheme (#29108)
* cmd/utils, core/rawdb, triedb/pathdb: flip hash to path scheme

* graphql: run tests in hash mode as the chain maker needs it
2024-02-29 12:40:59 +02:00
rjl493456442
5bae14f9df
triedb/pathdb: fix panic in recoverable (#29107)
* triedb/pathdb: fix panic in recoverable

* triedb/pathdb: add todo

* triedb/pathdb: rename

* triedb/pathdb: rename
2024-02-28 14:40:28 +02:00
rjl493456442
fe91d476ba
all: remove the dependency from trie to triedb (#28824)
This change removes the dependency from trie package to triedb package.
2024-02-13 14:49:53 +01:00
Renamed from trie/triedb/pathdb/database.go (Browse further)