Commit graph

119 commits

Author SHA1 Message Date
CPerezz
bfb77d98f6
core/state,triedb/pathdb: enable bintrie flat state reads end-to-end
Wires the pieces from Commits 1-9 into a running system:

* triedb/pathdb.New: install the bintrieFlatCodec when isVerkle is set,
  backed by the same verkle-namespaced db used for trie nodes.
* triedb/pathdb.database.go: drop isVerkle from the noBuild guard so the
  bintrie generator (Commit 9) runs on startup, and remove it from the
  generateSnapshot call path for the same reason.
* triedb/pathdb.disklayer.revert: hard-fail on bintrie because the
  reorg path would replay merkle-shaped origin records against a
  per-stem layout. Tracked in BINTRIE_FLAT_STATE_REORG_GAP.md.
* triedb/pathdb.journal: add IsBintrie to journalGenerator (rlp:"optional"
  so v3 journals still decode) and make journalProgress a method on
  generator so it stamps the active scheme; loadGenerator discards any
  journal whose scheme does not match the database, forcing a fresh
  regeneration.
* triedb/pathdb.reader: export RawStateReader, a small extension of
  database.StateReader that exposes AccountRLP so callers outside the
  package can reach the raw flat-state bytes without going through the
  slim-RLP decode path that assumes merkle shape.
* core/state.reader: add bintrieFlatReader, the bintrie equivalent of
  flatReader. It derives the EIP-7864 stem keys from (addr, slot),
  performs two AccountRLP lookups per Account call (BasicData +
  CodeHash), and decodes via bintrie.UnpackBasicData. Storage reads go
  through a single AccountRLP lookup at the slot's full bintrie key.
* core/state.database.StateReader: dispatch to bintrieFlatReader when
  the path database is in verkle mode; merkle path unchanged.

Depends on the lookup sentinel fix in the previous commit; without it
missing-account reads on bintrie misreport as "layer stale".
2026-04-15 15:00:40 +02:00
CPerezz
0508d40aaf
triedb/pathdb: bintrie snapshot generator
Adds generateBinTrieStems, the bintrie analogue of generateAccounts. It
opens the bintrie via a sha256-aware bintrieDiskStore (the merkle disk
store would always fail root validation against a binary node), iterates
all leaves with binaryNodeIterator, aggregates them into per-stem
builders, and emits one stem blob per stem boundary.

Resume support is structural: ctx.marker is fed straight to the trie's
NodeIterator, which uses binaryNodeIterator.seek (Commit 1) to position
on the first leaf >= marker. Range proofs are deliberately skipped — the
bintrie's Prove path is unimplemented and an iteration-only generation
cycle is acceptable for a one-time startup cost.

A bintrieGeneratorContext mirrors generatorContext but is much smaller:
no holdable iterators (we walk the trie, not the existing flat state)
and no two-tier marker (the bintrie key space is unified). checkAndFlushBin
journals progress as a single 32-byte (stem || offset) key so resume
can pick up mid-stem.

generator.run dispatches on codec type so callers see a uniform
lifecycle whether the underlying scheme is merkle or bintrie.
2026-04-15 15:00:40 +02:00
CPerezz
a1ff36d9e1
core/state,triedb/pathdb: wire bintrie leaves through stateUpdate
Drains the binaryHasher's LeafProducer side-channel in StateDB.commit and
threads the stem writes through stateUpdate.encodeBinary into the pathdb
state set as per-offset accountData entries (key = stem||offset, value =
32-byte leaf or nil for clears).

The flat-state codec gains a Flush method that owns the in-memory→disk
write path, replacing the codec-agnostic per-entry loop in writeStates.
The merkle codec preserves its historical per-entry behavior verbatim;
the bintrie codec aggregates per-offset writes by stem so each stem hits
disk via a single read-modify-write, satisfying the codec's pre-aggregation
requirement and updating the clean cache with the merged blob it just
produced (no extra disk read).
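
The per-stem aggregation described above can be sketched as follows. This is only an illustration, not the actual Flush implementation: the function name `groupByStem` and the input shape are hypothetical, and the key split (31-byte stem followed by a 1-byte offset) is assumed from the 32-byte `stem||offset` key mentioned in this series.

```go
package main

import "fmt"

// groupByStem regroups per-offset writes (key = stem||offset) so that each
// stem can be handled by a single read-modify-write. A nil value stands for
// a clear. The name and types are hypothetical, chosen for illustration.
func groupByStem(writes map[[32]byte][]byte) map[[31]byte]map[byte][]byte {
	out := make(map[[31]byte]map[byte][]byte)
	for key, val := range writes {
		var stem [31]byte
		copy(stem[:], key[:31])
		if out[stem] == nil {
			out[stem] = make(map[byte][]byte)
		}
		out[stem][key[31]] = val // last byte is the offset within the stem
	}
	return out
}

func main() {
	var a, b [32]byte
	b[31] = 1 // same stem as a, different offset
	groups := groupByStem(map[[32]byte][]byte{a: make([]byte, 32), b: nil})
	fmt.Println("stems touched:", len(groups))
}
```

Grouping before touching the disk is what lets each stem be read and rewritten once, regardless of how many offsets changed.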

stateUpdate.encodeBinary returns empty origin maps for the bintrie path:
state-history rollback for bintrie is deferred to a follow-up PR (see
BINTRIE_FLAT_STATE_REORG_GAP.md), and the diskLayer.revert path will
panic before consuming origins anyway.
2026-04-15 15:00:40 +02:00
CPerezz
437a53bbe0
triedb/pathdb: implement bintrieFlatCodec + stem blob helpers
Introduce the codec and on-disk blob format for the bintrie flat-state
layer. This commit only defines the types; the codec is NOT wired into
pathdb.Database.New yet (that happens in a later commit once the
leaf-production hook in binaryHasher and the stateUpdate wiring are in
place).

Three pieces:

1. trie/bintrie/pack.go

   Canonical PackBasicData / UnpackBasicData helpers that encode an
   account's (codeSize, nonce, balance) into the 32-byte BasicData leaf
   defined by EIP-7864. Preserves the existing BinaryTrie.UpdateAccount
   layout byte-for-byte (4-byte code_size at offset 4 rather than the
   spec's 3-byte field at offset 5 — any realistic code size has byte 4
   always zero and the two encodings are bit-equivalent in practice).

   BinaryTrie.UpdateAccount is refactored to delegate to PackBasicData
   so the flat-state codec can produce a bit-identical BasicData
   encoding without duplicating the layout logic.

2. triedb/pathdb/stem_blob.go

   Packed encoding of the populated (offset, value) pairs at a bintrie
   stem. A stem can hold up to 256 offsets per EIP-7864 but in practice
   only a handful are set; the layout is a 32-byte bitmap followed by
   N 32-byte values in ascending offset order, where N = popcount.
   Empty stems encode to nil so the caller knows to delete the on-disk
   key rather than write a zero-length value.

   Provides encodeStemBlob / decodeStemBlob / extractStemOffset /
   mergeStemBlob and a stemBuilder type for accumulating writes. The
   tombstone convention (32 zero bytes = "present with zero" as used
   by DeleteStorage) is preserved.

   11 unit tests cover: empty blob, BasicData+CodeHash roundtrip, all
   256 offsets populated, sparse high offsets, set/clear roundtrip,
   load-from-existing-blob RMW, merge helper, merge-to-empty, tombstone
   zero bytes, malformed input detection, bitmap rank sanity.

3. triedb/pathdb/flat_codec_bintrie.go

   bintrieFlatCodec implements flatStateCodec over the stem-blob layout.
   Unlike merkleFlatCodec it is stateful: it holds an ethdb.KeyValueReader
   reference used by applyWrites to read the existing stem blob before
   merging in new writes. ethdb.Batch is write-only so the batch passed
   to Write* cannot be used to fetch current state.

   Pre-aggregation requirement is documented explicitly: within a single
   flush, the caller must NOT issue two Write* calls targeting the same
   stem, because the RMW read comes from the store (not the in-flight
   batch). Commit 8 of the bintrie flat-state plan restructures
   writeStates to pre-aggregate per-stem writes so callers don't have
   to handle this manually.

   Cache keys are prefix-disambiguated with a one-byte 0x01 to keep
   bintrie stem lookups disjoint from merkle 32-byte account keys and
   64-byte storage keys in the shared clean-state fastcache.

   SplitMarker is a single-tier (stem-only) format, not the merkle
   two-tier (account, account+storage) format.

   7 unit tests cover: account roundtrip, storage roundtrip, multiple
   writes to the same stem, DeleteAccount preserving unrelated offsets,
   DeleteStorage removing the final offset collapsing the key, cache
   key disjointness from merkle, SplitMarker semantics.
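
The stem-blob layout from item 2 (a 32-byte bitmap followed by N 32-byte values in ascending offset order) can be sketched as below. This is a minimal illustration, not the code in stem_blob.go: the function names carry a `Sketch` suffix to mark them as hypothetical, and the bit order inside the bitmap (most significant bit first) is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"math/bits"
)

// encodeStemBlobSketch packs the populated offsets of one stem. A present
// all-zero value is kept (tombstone convention); an empty stem yields nil.
func encodeStemBlobSketch(values map[byte][32]byte) []byte {
	if len(values) == 0 {
		return nil
	}
	blob := make([]byte, 32, 32+32*len(values))
	for off := 0; off < 256; off++ {
		v, ok := values[byte(off)]
		if !ok {
			continue
		}
		blob[off/8] |= byte(0x80) >> uint(off%8) // assumed MSB-first bit order
		blob = append(blob, v[:]...)
	}
	return blob
}

// decodeStemBlobSketch reverses the encoding and rejects blobs whose length
// does not match the bitmap's popcount.
func decodeStemBlobSketch(blob []byte) (map[byte][32]byte, error) {
	out := make(map[byte][32]byte)
	if len(blob) == 0 {
		return out, nil
	}
	if len(blob) < 32 {
		return nil, errors.New("stem blob shorter than bitmap")
	}
	n := 0
	for _, b := range blob[:32] {
		n += bits.OnesCount8(b)
	}
	if len(blob) != 32+32*n {
		return nil, errors.New("stem blob length does not match popcount")
	}
	pos := 32
	for off := 0; off < 256; off++ {
		if blob[off/8]&(byte(0x80)>>uint(off%8)) == 0 {
			continue
		}
		var v [32]byte
		copy(v[:], blob[pos:pos+32])
		out[byte(off)] = v
		pos += 32
	}
	return out, nil
}

func main() {
	in := map[byte][32]byte{0: {1}, 1: {2}, 200: {}}
	blob := encodeStemBlobSketch(in)
	out, err := decodeStemBlobSketch(blob)
	fmt.Println(len(blob), len(out), err)
}
```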

The codec is not dispatched by anything yet; MPT continues through the
merkle codec and bintrie mode still runs on the (soon-to-be-replaced)
keccak-shaped path until Commit 10 wires things up.
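
The claim in item 1 that the two code_size encodings coincide for realistic code sizes can be checked with a small sketch. It assumes big-endian byte order, and both helper names are hypothetical; only the field offsets come from the text above.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// putCodeSize4 writes the code size as a 4-byte field at offset 4, the
// layout the message attributes to BinaryTrie.UpdateAccount.
// Big-endian order is an assumption of this sketch.
func putCodeSize4(leaf *[32]byte, size uint32) {
	binary.BigEndian.PutUint32(leaf[4:8], size)
}

// putCodeSize3 writes the code size as a 3-byte field at offset 5, the
// layout the message attributes to the spec. The top byte is dropped.
func putCodeSize3(leaf *[32]byte, size uint32) {
	leaf[5] = byte(size >> 16)
	leaf[6] = byte(size >> 8)
	leaf[7] = byte(size)
}

func main() {
	var a, b [32]byte
	putCodeSize4(&a, 24576)
	putCodeSize3(&b, 24576)
	fmt.Println("identical for a small code size:", a == b)
}
```

The two leaves only diverge once the size needs more than 24 bits, which is when byte 4 becomes non-zero.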
2026-04-15 15:00:40 +02:00
CPerezz
0fb4d9226b
triedb/pathdb: bump journal version to 4
Reserve journal version 4 for the upcoming bintrie flat-state layout
(per-stem blobs). Bumping now — with no on-disk format change yet —
ensures that any v3 journals belonging to a bintrie database are
discarded on load, so the new layout can be introduced cleanly in
follow-up commits without a migration shim.

MPT behavior is unchanged at this point: the only codec wired to the
pathdb Database is still merkleFlatCodec. All pathdb, core/state,
core/rawdb, and trie tests pass.
2026-04-15 15:00:40 +02:00
CPerezz
f1d7143afa
triedb/pathdb: thread flatStateCodec through internals
Route the flatStateCodec from Database through every flat-state call
site so that the trie-specific aspects of persistence and key derivation
live behind a single abstraction. Pure refactor: merkle behavior and
on-disk layout are unchanged because the only codec wired up is
merkleFlatCodec, whose methods are thin wrappers over the existing
rawdb accessors.

Threaded sites:

  disklayer.account/storage    use codec.{Read,AccountCacheKey,
                                StorageCacheKey} instead of direct
                                rawdb calls and bare hash slicing.
  flush.writeStates            takes a codec parameter; persistence
                                goes through codec.{Write,Delete}
                                {Account,Storage}.
  buffer.flush                 carries the codec down into writeStates.
  states.write/dbsize          takes the codec for prefix-size
                                accounting.
  generate.go (g.codec)        the generator owns a codec, used by
                                generateAccounts/generateStorages
                                callbacks; the unused top-level
                                splitMarker helper is removed in favor
                                of codec.SplitMarker.
  context.go                   the generator context owns the codec
                                and uses codec.{AccountPrefix,
                                StoragePrefix,Account/StorageKeyLength}
                                to construct iterators.
  reader.go (HistoricalState)  uses codec.{Account,Storage}Key for
                                caller-side key derivation.

The marker comparisons in writeStates remain merkle-shaped (two-tier
account+storage marker) because the bintrie path will use a separate
writer over single-tier stem markers in a later commit.

All existing pathdb tests pass.
2026-04-15 15:00:39 +02:00
CPerezz
eaf5523a5a
triedb/pathdb: introduce flatStateCodec abstraction
Introduce flatStateCodec, a small interface that captures the
trie-specific aspects of flat-state storage: key derivation from
(address, slot), persistence of account/storage entries, clean-cache
key disambiguation, iterator setup, and progress-marker handling.

Mirrors the existing nodeHasher pattern and complements the Hasher
interface from state-hasher-iface-2 (which abstracts trie-side hashing
and commit). The codec is stored on Database alongside the existing
hasher field, ready to be threaded through the flat-state call sites
(disklayer, flush, generator, reader) in the next commit.

Provides merkleFlatCodec, a thin wrapper over the existing rawdb
snapshot accessors and helpers. This is a pure refactor: behavior is
unchanged. The bintrie-side codec implementation is added in a later
commit, after all call sites have been routed through the abstraction.
2026-04-15 15:00:39 +02:00
CPerezz
3772bb536a
triedb/pathdb: fix lookup sentinel collision with zero disk layer root (#34680) 2026-04-09 13:39:38 +08:00
Diego López León
52b8c09fdf
triedb/pathdb: skip duplicate-root layer insertion (#34642)
PathDB keys diff layers by state root, not by block hash. That means a
side-chain block can legitimately collide with an existing canonical diff layer
when both blocks produce the same post-state (for example same parent, 
same coinbase, no txs).

Today `layerTree.add` blindly inserts that second layer. If the root
already exists, this overwrites `tree.layers[root]` and appends the same 
root to the mutation lookup again. Later account/storage lookups resolve 
that root to the wrong diff layer, which can corrupt reads for descendant 
canonical states.

At runtime, the corruption is silent: no error is logged and no invariant check
fires. State reads against affected descendants simply return stale data
from the wrong diff layer (for example, an account balance that reflects one
fewer block reward), which can propagate into RPC responses and block 
validation.

This change makes duplicate-root inserts idempotent. A second layer with
the same state root does not add any new retrievable state to a tree that is
already keyed by root; keeping the original layer preserves the existing parent 
chain and avoids polluting the lookup history with duplicate roots.

The regression test imports a canonical chain of two layers followed by
a fork layer at height 1 with the same state root but a different block hash. 
Before the fix, account and storage lookups at the head resolve the fork 
layer instead of the canonical one. After the fix, the duplicate insert is 
skipped and lookups remain correct.
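
A minimal sketch of the idempotent insert, assuming a tree keyed by state root. The types `layerSketch` and `layerTreeSketch` are hypothetical stand-ins and omit everything the real `layerTree` does beyond the duplicate check.

```go
package main

import "fmt"

type hash [32]byte

// layerSketch is a hypothetical stand-in for a diff layer.
type layerSketch struct {
	root      hash
	blockHash hash
}

// layerTreeSketch keys layers by state root and keeps a lookup history.
type layerTreeSketch struct {
	layers  map[hash]*layerSketch
	history []hash
}

// add inserts the layer unless its root is already present. It reports
// whether the layer was inserted; a duplicate root leaves the original
// layer and the history untouched.
func (t *layerTreeSketch) add(l *layerSketch) bool {
	if _, ok := t.layers[l.root]; ok {
		return false
	}
	t.layers[l.root] = l
	t.history = append(t.history, l.root)
	return true
}

func main() {
	tree := &layerTreeSketch{layers: make(map[hash]*layerSketch)}
	root := hash{0xaa}
	fmt.Println(tree.add(&layerSketch{root: root, blockHash: hash{1}})) // canonical
	fmt.Println(tree.add(&layerSketch{root: root, blockHash: hash{2}})) // fork, same root
}
```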
2026-04-07 21:31:41 +08:00
Jonny Rhea
bd6530a1d4
triedb, triedb/internal, triedb/pathdb: add GenerateTrie + extract shared pipeline into triedb/internal (#34654)
This PR adds `GenerateTrie(db, scheme, root)` to the `triedb` package,
which rebuilds all tries from flat snapshot KV data. This is needed by
snap/2 sync so it can rebuild the trie after downloading the flat state.
The shared trie generation pipeline from `pathdb/verifier.go` was moved
into `triedb/internal/conversion.go` so both `GenerateTrie` and
`VerifyState` reuse the same code.
2026-04-07 14:36:53 +08:00
rjl493456442
d8cb8a962b
core, eth, ethclient, triedb: report trienode index progress (#34633)
The trienode history indexing progress is also exposed via an RPC 
endpoint and contributes to the eth_syncing status.
2026-04-04 21:00:07 +08:00
rjl493456442
db6c7d06a2
triedb/pathdb: implement history index pruner (#33999)
This PR implements the missing functionality for archive nodes by 
pruning stale index data.

The current mechanism is relatively simple but sufficient for now: 
it periodically iterates over index entries and deletes outdated data 
on a per-block basis. 

The pruning process is triggered every 90,000 new blocks (approximately 
every 12 days), and the iteration typically takes ~30 minutes on a 
mainnet node.

This mechanism is only applied when `gcmode=archive` is enabled and has
no impact on normal full nodes.
2026-04-02 00:21:58 +02:00
rjl493456442
9b2ce121dc
triedb/pathdb: enhance history index initer (#33640)
This PR improves the PBSS archive mode. The initial sync
of an archive node (one started with the --gcmode archive
flag) is significantly sped up.

It achieves that with the following changes:

The indexer now attempts to process histories in batch whenever possible.
Batch indexing is enforced when the node is still syncing and the local chain
head is behind the network chain head.

In this scenario, instead of scheduling indexing frequently alongside block
insertion, the indexer waits until a sufficient amount of history has accumulated
and then processes it in a batch, which is significantly more efficient.

---------

Co-authored-by: Sina M <1591639+s1na@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-03-17 15:29:30 +01:00
rjl493456442
7d13acd030
core/rawdb, triedb/pathdb: enable trienode history alongside existing data (#33934)
Fixes https://github.com/ethereum/go-ethereum/issues/33907

Notably there is a behavioral change:
- Previously, Geth would refuse to restart if the existing trienode
history was gapped with the state data
- With this PR, the gapped trienode history is entirely reset and
reconstructed from scratch
2026-03-12 09:21:54 +08:00
rjl493456442
dd202d4283
core, ethdb, triedb: add batch close (#33708)
Pebble maintains a batch pool to recycle batch objects. Unfortunately, a
batch object must be explicitly returned via the `batch.Close` function.
This PR extends the batch interface by adding the close function and also
invokes `batch.Close` in some critical code paths.

Memory allocation must be measured before merging this change. What's
more, it's an open question whether we should apply `batch.Close` as much
as possible in every invocation.
2026-03-04 11:17:47 +01:00
sashass1315
919b238c82
triedb/pathdb: return nodeLoc by value to avoid heap allocation (#33819) 2026-02-11 22:14:43 +08:00
0xFloki
a951aacb70
triedb/pathdb: preallocate slices in encode methods (#33736)
Preallocates slices with known capacity in `stateSet.encode()` and
`StateSetWithOrigin.encode()` methods to eliminate redundant
reallocations during serialization.
2026-02-02 15:27:37 +08:00
alex017
cb97c48cb6
triedb/pathdb: preallocate slices in decodeRestartTrailer (#33715)
Preallocate capacity for `keyOffsets` and `valOffsets` slices in
`decodeRestartTrailer` since the exact size (`nRestarts`) is known
upfront.

---------

Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
2026-01-30 21:14:15 +08:00
rjl493456442
181a3ae9e0
triedb/pathdb: improve trienode reader for searching (#33681)
This PR optimizes the historical trie node reader by reworking how data
is accessed and memory is managed, reducing allocation overhead 
significantly.

Specifically:

- Instead of decoding an entire history object to locate a specific trie node, 
   the reader now searches directly within the history.

- In addition, slice pre-allocation avoids a significant amount of unnecessary deep copying.
2026-01-27 20:05:35 +08:00
rjl493456442
1022c7637d
core, eth, internal, triedb/pathdb: enable eth_getProofs for history (#32727)
This PR enables the `eth_getProofs` endpoint against historical states.
2026-01-22 09:19:27 +08:00
cui
d0af257aa2
triedb/pathdb: double check the list availability before regeneration (#33622)
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
2026-01-19 20:45:31 +08:00
rjl493456442
add1890a57
triedb/pathdb: enable trienode history (#32621)
This is part 4 of the trienode history work. This PR enables trienode history
persistence via the flag `history.trienode <non-negative-number>`.
2026-01-17 21:23:48 +08:00
rjl493456442
588dd94aad
triedb/pathdb: implement trienode history indexing scheme (#33551)
This PR implements the indexing scheme for trie node history. Check
https://github.com/ethereum/go-ethereum/pull/33399 for more details
2026-01-17 20:28:37 +08:00
rjl493456442
494908a852
triedb/pathdb: change the bitmap to big endian (#33584)
The bitmap is used in compact-encoded trie nodes to indicate which elements 
have been modified. The bitmap format has been updated to use big-endian
encoding. 

Bit positions are numbered from 0 to 15, where position 0 corresponds to
the most significant bit of b[0], and position 15 corresponds to the least
significant bit of b[1].
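
The bit numbering above can be expressed as a short sketch. The helper names `setBit` and `isSet` are hypothetical; only the position-to-bit mapping is taken from the description.

```go
package main

import "fmt"

// setBit marks position pos (0..15) in a big-endian 16-bit bitmap:
// position 0 is the most significant bit of b[0], position 15 the least
// significant bit of b[1].
func setBit(b *[2]byte, pos int) {
	b[pos/8] |= byte(0x80) >> uint(pos%8)
}

// isSet reports whether position pos is marked.
func isSet(b [2]byte, pos int) bool {
	return b[pos/8]&(byte(0x80)>>uint(pos%8)) != 0
}

func main() {
	var b [2]byte
	setBit(&b, 0)
	setBit(&b, 15)
	fmt.Printf("%08b %08b\n", b[0], b[1])
}
```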
2026-01-15 17:28:57 +08:00
rjl493456442
f51870e40e
rlp, trie, triedb/pathdb: compress trienode history (#32913)
This pull request introduces a mechanism to compress trienode history by
storing only the node diffs between consecutive versions.

- For full nodes, only the modified children are recorded in the history;
- For short nodes, only the modified value is stored;

If the node type has changed, or if the node is newly created or
deleted, the entire node value is stored instead.

To mitigate the overhead of reassembling nodes from diffs during history
reads, checkpoints are introduced by periodically storing full node values.

The current checkpoint interval is set to every 16 mutations, though
this parameter may be made configurable in the future.
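
A rough sketch of the diffing idea for full nodes, assuming a full node is modelled as 17 byte slices (16 children plus a value slot). The helper names and the checkpoint rule (`count % 16 == 0`) are illustrative assumptions, not the encoding used by the PR.

```go
package main

import (
	"bytes"
	"fmt"
)

// checkpointInterval mirrors the "every 16 mutations" figure in the text.
const checkpointInterval = 16

// storeFullNode is an assumed rule for when a full value is written
// instead of a diff.
func storeFullNode(mutationCount int) bool {
	return mutationCount%checkpointInterval == 0
}

// diffChildren records only the slots that changed between two versions
// of a full node. A nil entry in the result means the slot was cleared.
func diffChildren(prev, next [17][]byte) map[int][]byte {
	diff := make(map[int][]byte)
	for i := range prev {
		if !bytes.Equal(prev[i], next[i]) {
			diff[i] = next[i]
		}
	}
	return diff
}

// applyDiff reassembles the newer version from the older one plus a diff.
func applyDiff(prev [17][]byte, diff map[int][]byte) [17][]byte {
	out := prev
	for i, v := range diff {
		out[i] = v
	}
	return out
}

func main() {
	var prev, next [17][]byte
	prev[3], next[3] = []byte{1}, []byte{2}
	prev[5], next[5] = []byte{7}, []byte{7}
	diff := diffChildren(prev, next)
	fmt.Println("changed slots:", len(diff), "checkpoint at 32:", storeFullNode(32))
}
```

Reading a node back means applying diffs forward from the nearest full value, which is why a bounded checkpoint interval keeps the reassembly cost bounded.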
2026-01-08 21:58:02 +08:00
rjl493456442
d5efd34010
triedb/pathdb: introduce extension to history index structure (#33399)
This PR is based on #33303 and introduces an approach for trienode
history indexing.

---

In the current archive node design, resolving a historical trie node at
a specific block involves the following steps:

- Look up the corresponding trie node index and locate the first entry
   whose state ID is greater than the target state ID.
- Resolve the trie node from the associated trienode history object.

A naive approach would be to store mutation records for every trie node,
similar to how flat state mutations are recorded. However, the total
number of trie nodes is extremely large (approximately 2.4 billion), and
the vast majority of them are rarely modified. Creating an index entry
for each individual trie node would be very wasteful in both storage and
indexing overhead. To address this, we aggregate multiple trie nodes into
chunks and index mutations at the chunk level instead.

---

For a storage trie, the trie is vertically partitioned into multiple
sub-tries, each spanning three consecutive levels. The top three levels
(1 + 16 + 256 nodes) form the first chunk, and every subsequent
three-level segment forms another chunk.

```
Original trie structure

Level 0               [ ROOT ]                               1 node
Level 1        [0] [1] [2] ... [f]                          16 nodes
Level 2     [00] [01] ... [0f] [10] ... [ff]               256 nodes
Level 3   [000] [001] ... [00f] [010] ... [fff]           4096 nodes
Level 4   [0000] ... [000f] [0010] ... [001f] ... [ffff] 65536 nodes

Vertical split into chunks (3 levels per chunk)

Level 0            [ ROOT ]                     1 chunk
Level 3       [000]   ...     [fff]          4096 chunks
Level 6  [000000]    ...    [ffffff]     16777216 chunks
```

Within each chunk, there are 273 nodes in total, regardless of the
chunk's depth in the trie.

```
Level 0           [ 0 ]                         1 node
Level 1        [ 1 ] … [ 16 ]                  16 nodes
Level 2     [ 17 ] … … [ 272 ]                256 nodes
```

Each chunk is uniquely identified by the path prefix of the root node of
its corresponding sub-trie. Within a chunk, nodes are identified by a
numeric index ranging from 0 to 272.

For example, suppose that at block 100, the nodes with paths `[]`, `[0]`,
`[f]`, `[00]`, and `[ff]` are modified. The mutation record for chunk 0
is then appended with the following entry:

`[100 → [0, 1, 16, 17, 272]]`, where `272` is the numeric ID of path `[ff]`.
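
The numbering used in this example can be reproduced with a small sketch. It assumes IDs are assigned level by level in nibble order, which matches the values quoted above; the function name `chunkNodeID` is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// chunkNodeID maps a path relative to the chunk root (0, 1 or 2 nibbles)
// to the numeric ID 0..272 described in the text.
func chunkNodeID(rel []byte) (int, error) {
	for _, n := range rel {
		if n > 0x0f {
			return 0, errors.New("path element is not a nibble")
		}
	}
	switch len(rel) {
	case 0:
		return 0, nil // chunk root
	case 1:
		return 1 + int(rel[0]), nil // IDs 1..16
	case 2:
		return 17 + int(rel[0])*16 + int(rel[1]), nil // IDs 17..272
	default:
		return 0, errors.New("path crosses the chunk boundary")
	}
}

func main() {
	for _, p := range [][]byte{{}, {0x0}, {0xf}, {0x0, 0x0}, {0xf, 0xf}} {
		id, _ := chunkNodeID(p)
		fmt.Println(p, "->", id)
	}
}
```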

Furthermore, due to the structural properties of the Merkle Patricia
Trie, if a child node is modified, all of its ancestors along the same
path must also be updated. As a result, in the above example, recording
mutations for nodes `[00]` and `[ff]` alone is sufficient, as this
implicitly indicates that their ancestor nodes `[]`, `[0]` and `[f]` were
also modified at block 100.
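
The implicit-ancestor rule can be sketched with the in-chunk IDs from the example (0 for the root, 1..16 for the second level, 17..272 for the third). Both function names are hypothetical, and the ID arithmetic assumes level-by-level numbering in nibble order.

```go
package main

import "fmt"

// parentID returns the in-chunk ID of a node's parent, or -1 for the
// chunk root.
func parentID(id int) int {
	switch {
	case id == 0:
		return -1
	case id <= 16:
		return 0
	default:
		return (id-17)/16 + 1
	}
}

// impliesMutation reports whether a record that lists only the deepest
// modified nodes also covers target, either directly or as an ancestor.
func impliesMutation(recorded []int, target int) bool {
	for _, id := range recorded {
		for cur := id; cur >= 0; cur = parentID(cur) {
			if cur == target {
				return true
			}
		}
	}
	return false
}

func main() {
	recorded := []int{17, 272} // paths [00] and [ff]
	for _, target := range []int{0, 1, 16, 2} {
		fmt.Println(target, impliesMutation(recorded, target))
	}
}
```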

--- 

Query processing is slightly more complicated. Since trie nodes are
indexed at the chunk level, each individual trie node lookup requires an
additional filtering step to ensure that a given mutation record actually
corresponds to the target trie node.

As mentioned earlier, mutation records store only the numeric identifiers
of leaf nodes, while ancestor nodes are omitted for storage efficiency.
Consequently, when querying an ancestor node, additional checks are
required to determine whether the mutation record implicitly represents a
modification to that ancestor.

Moreover, since trie nodes are indexed at the chunk level, some trie nodes
may be updated frequently, causing their mutation records to dominate the
index. Queries targeting rarely modified trie nodes would then scan a
large amount of irrelevant index data, significantly degrading
performance.

To address this issue, a bitmap is introduced for each index block and
stored in the chunk's metadata. Before loading a specific index block, the
bitmap is checked to determine whether the block contains mutation records
relevant to the target trie node. If the bitmap indicates that the block
does not contain such records, the block is skipped entirely.
2026-01-08 09:57:35 +01:00
rjl493456442
b3e7d9ee44
triedb/pathdb: optimize history indexing efficiency (#33303)
This pull request optimizes history indexing by splitting a single large
database batch into multiple smaller chunks.

Originally, the indexer would resolve a batch of state histories and
commit all corresponding index entries atomically together with the
indexing marker.

While indexing more state histories in a single batch improves
efficiency, excessively large batches can cause significant memory issues.

To mitigate this, the pull request splits the mega-batch into several
smaller batches and flushes them independently during indexing. However,
this introduces a potential inconsistency: some index entries may be
flushed while the indexing marker is not, so an unclean shutdown may
leave the database in a partially updated state. This can corrupt index
data.

To address this, head truncation is introduced. After a restart, any
excess index entries beyond the expected indexing marker are removed,
ensuring the index remains consistent after an unclean shutdown.
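
A minimal sketch of the head-truncation idea, assuming an index is modelled as an ascending list of history IDs and the marker is the last ID known to be fully indexed. The name `truncateHead` and this data model are illustrative only.

```go
package main

import (
	"fmt"
	"sort"
)

// truncateHead drops every entry whose ID lies beyond the indexing marker.
// ids must be sorted in ascending order.
func truncateHead(ids []uint64, marker uint64) []uint64 {
	// Find the first entry that was flushed without the marker advancing.
	n := sort.Search(len(ids), func(i int) bool { return ids[i] > marker })
	return ids[:n]
}

func main() {
	// Entries 4 and 5 were flushed, but the marker still says 3.
	fmt.Println(truncateHead([]uint64{1, 2, 3, 4, 5}, 3))
}
```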
2025-12-30 16:05:13 +01:00
rjl493456442
bf141fbfb1
core, eth: add lock protection in snap sync (#33428)
Fixes #33396, #33397, #33398
2025-12-19 09:36:48 +01:00
Delweng
1b702f71d9
triedb/pathdb: use copy instead of append to reduce memory alloc (#33044)
2025-12-11 09:37:16 +08:00
Forostovec
6f2cbb7a27
triedb/pathdb: allow single-element history ranges (#33329)
2025-12-01 10:19:21 +08:00
rjl493456442
960c87a944
triedb/pathdb: implement iterator of history index (#32981)
This change introduces an iterator for the history index in the pathdb.
It provides sequential access to historical entries, enabling efficient 
scanning and future features built on top of historical state traversal.
2025-11-26 16:07:16 +08:00
Guillaume Ballet
2a2f106a01
cmd/evm/internal/t8ntool, trie: support for verkle-at-genesis, use UBT, and move the transition tree to its own package (#32445)
This is broken off of #31730 to only focus on testing networks that
start with verkle at genesis.

The PR has seen a lot of work since its creation, and it now targets
creating and re-executing tests for a binary tree testnet without the
transition (so it starts at genesis). The transition tree has been moved
to its own package. It also replaces verkle with the binary tree for
this specific application.

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
2025-11-14 15:25:30 +01:00
Forostovec
eb8f32588b
triedb/pathdb: fix ID assignment in history inspection (#33103) 2025-11-13 14:51:41 +08:00
Delweng
d2a5dba48f
triedb/pathdb: fix 32-bit integer overflow in history trienode decoder (#33098)
The test fails on 32-bit platforms:

```
--- FAIL: TestDecodeSingleCorruptedData (0.00s)
panic: runtime error: slice bounds out of range [:-1501805520] [recovered, repanicked]

goroutine 38872 [running]:
testing.tRunner.func1.2({0x838db20, 0xa355620})
	/opt/actions-runner/_work/_tool/go/1.25.3/x64/src/testing/testing.go:1872 +0x29b
testing.tRunner.func1()
	/opt/actions-runner/_work/_tool/go/1.25.3/x64/src/testing/testing.go:1875 +0x414
panic({0x838db20, 0xa355620})
	/opt/actions-runner/_work/_tool/go/1.25.3/x64/src/runtime/panic.go:783 +0x103
github.com/ethereum/go-ethereum/triedb/pathdb.decodeSingle({0x9e57500, 0x1432, 0x1432}, 0x0)
	/opt/actions-runner/_work/go-ethereum/go-ethereum/triedb/pathdb/history_trienode.go:399 +0x18d6
github.com/ethereum/go-ethereum/triedb/pathdb.TestDecodeSingleCorruptedData(0xa2db9e8)
	/opt/actions-runner/_work/go-ethereum/go-ethereum/triedb/pathdb/history_trienode_test.go:698 +0x180
testing.tRunner(0xa2db9e8, 0x83c86e8)
	/opt/actions-runner/_work/_tool/go/1.25.3/x64/src/testing/testing.go:1934 +0x114
created by testing.(*T).Run in goroutine 1
	/opt/actions-runner/_work/_tool/go/1.25.3/x64/src/testing/testing.go:1997 +0x4b4
FAIL	github.com/ethereum/go-ethereum/triedb/pathdb	41.453s
?   	github.com/ethereum/go-ethereum/version	[no test files]
FAIL
```

Found in
https://github.com/ethereum/go-ethereum/actions/runs/18912701345/job/53990136071?pr=33052
2025-11-07 23:06:15 +01:00
rjl493456442
cfa3b96103
core/rawdb, triedb/pathdb: re-structure the trienode history header (#32907)
In this PR, several changes have been made:

(a) restructure the trienode history header section

Previously, the offsets of the key and value sections were recorded before 
encoding data into these sections. As a result, these offsets referred to the
start position of each chunk rather than the end position.

This caused an issue where the end position of the last chunk was
unknown, making it incompatible with the freezer partial-read APIs. 
With this update, all offsets now refer to the end position, and the 
start position of the first chunk is always 0.

(b) Enable partial freezer read for trienode data retrieval

The partial freezer read feature is now utilized in trienode data
retrieval, improving efficiency.
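
The end-offset convention from (a) can be sketched as follows. The function name `chunkBounds` and the use of `uint32` offsets are assumptions for illustration; only the rule (offsets mark chunk ends, the first chunk starts at 0) comes from the description.

```go
package main

import (
	"errors"
	"fmt"
)

// chunkBounds returns the half-open byte range [start, end) of chunk i,
// given a list of end offsets. The first chunk implicitly starts at 0.
func chunkBounds(ends []uint32, i int) (uint32, uint32, error) {
	if i < 0 || i >= len(ends) {
		return 0, 0, errors.New("chunk index out of range")
	}
	start := uint32(0)
	if i > 0 {
		start = ends[i-1]
	}
	if ends[i] < start {
		return 0, 0, errors.New("end offsets are not monotonic")
	}
	return start, ends[i], nil
}

func main() {
	ends := []uint32{10, 25, 40}
	start, end, err := chunkBounds(ends, 2)
	fmt.Println(start, end, err)
}
```

Because every chunk, including the last, has a known end, a reader can request exactly that byte range rather than the whole item.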
2025-10-25 16:16:16 +08:00
rjl493456442
0a8b820725
triedb/pathdb: make batch with pre-allocated size (#32914)
In this PR, the database batch for writing the history index data is
pre-allocated.

It was observed that the database batch repeatedly grows the size of the
mega-batch, causing significant memory allocation pressure. Pre-allocating
the batch effectively mitigates this overhead.
2025-10-21 13:11:36 +02:00
hero5512
11c0fb98af
triedb/pathdb: fix index out of range panic in decodeSingle (#32937)
Fixes TestCorruptedKeySection flaky test failure.
https://github.com/ethereum/go-ethereum/actions/runs/18600235182/job/53037084761?pr=32920
2025-10-20 10:29:46 +08:00
Guillaume Ballet
52c484de86
triedb/pathdb: catch int conversion overflow in 32-bit (#32899)
The limit check against `MaxUint32` is done after the cast to `int`. On 64-bit
machines, that works without a problem. On 32-bit machines, it will always
fail. The compiler catches it and refuses to build.

Note that this only fixes the compiler build. ~~If the limit is above
`MaxInt32` but strictly below `MaxUint32` then this will fail at runtime
and we have another issue.~~ I checked and this should not happen during
regular execution, although it might happen in tests.
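
A minimal sketch of the general pattern, assuming the value arrives as a `uint64`: compare against `MaxUint32` in the wide type first and convert to `int` only afterwards. The name `checkedLen` and the extra `math.MaxInt` guard are illustrative assumptions, not the code changed in this commit:

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// checkedLen validates n against MaxUint32 while it is still a uint64 and
// only then converts it to int. Hypothetical helper for illustration.
func checkedLen(n uint64) (int, error) {
	// Comparing in uint64 compiles on every platform. Writing
	// int(n) > math.MaxUint32 instead would not build on 32-bit targets,
	// because the constant overflows int there.
	if n > math.MaxUint32 {
		return 0, errors.New("value exceeds MaxUint32")
	}
	// Only reachable on 32-bit platforms, where MaxInt is below MaxUint32.
	if n > math.MaxInt {
		return 0, errors.New("value does not fit in int on this platform")
	}
	return int(n), nil
}

func main() {
	for _, n := range []uint64{42, 1 << 32} {
		v, err := checkedLen(n)
		fmt.Println(n, v, err)
	}
}
```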
2025-10-14 09:23:05 +08:00
Delweng
a7359ceb69
triedb, core/rawdb: implement the partial read in freezer (#32132)
This PR implements partial read functionality in the freezer, optimizing
the state history reader by resolving less data from the freezer.

---------

Signed-off-by: jsvisa <delweng@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
2025-10-13 19:40:03 +08:00
rjl493456442
de24450dbf
core/rawdb, triedb/pathdb: introduce trienode history (#32596)
This pull request is based on #32523 and implements the structure of
trienode history.
2025-10-10 14:51:27 +08:00
rjl493456442
ada2db4304
triedb/pathdb: move head truncation log (#32649)
Print the `Truncating from head` log only if head truncation is needed.
2025-09-22 14:45:15 +08:00
rjl493456442
21769f3474
triedb/pathdb: generalize the history indexer (#32523)
This pull request is based on #32306 and is the second part of shipping
trienode history.

Specifically, this pull request generalizes the existing index mechanism,
making it usable by both state history and trienode history in the near
future.
2025-09-17 15:57:16 +02:00
rjl493456442
ca6e2d141b
triedb/pathdb: sync ancient store before journal (#32557)
This pull request addresses a corrupted path database, indicated by the
log:
`history head truncation out of range, tail: 122557, head: 212208,
target: 212557`

This is a rare edge case where the in-memory layers, including the write
buffer in the disk layer, are fully persisted (e.g., written to file), but
the state history freezer is not properly closed (e.g., Geth is terminated
after journaling but before freezer.Close). In this situation, the recent
state history writes will be truncated on the next startup, while the
in-memory layers resolve correctly. As a result, the state history falls
behind the disk layer (including the write buffer).

In this pull request, the state history freezer is always synced before
journaling, ensuring the state history writes are always persisted before
the others.

Edit:
It's confirmed that the devops team has a 10s container termination
setting. This explains why Geth did not finish its shutdown sequence,
leaving the state history unclosed.

https://github.com/ethpandaops/fusaka-devnets/pull/63/files
2025-09-09 14:39:54 +02:00
rjl493456442
bc4ee71a5d
triedb/pathdb: add recovery mechanism in state indexer (#32447)
Alternative to #32335, enhancing the history indexer recovery after an
unclean shutdown.
2025-09-08 16:07:00 +08:00
Delweng
c4ec4504bb
core/state: state size tracking (#32362)
Add state size tracking and a retrieval API. Start geth with
`--state.size-tracking`; an initial bootstrap is required (around 1h on
mainnet). After the bootstrap, use the `debug_stateSize()` RPC to retrieve
the state size:

```
> debug.stateSize()
{
  accountBytes: "0x39681967b",
  accountTrienodeBytes: "0xc57939f0c",
  accountTrienodes: "0x198b36ac",
  accounts: "0x129da14a",
  blockNumber: "0x1635e90",
  contractCodeBytes: "0x2b63ef481",
  contractCodes: "0x1c7b45",
  stateRoot: "0x9c36a3ec3745d72eea8700bd27b90dcaa66de0494b187c5600750044151e620a",
  storageBytes: "0x18a6e7d3f1",
  storageTrienodeBytes: "0x2e7f53fae6",
  storageTrienodes: "0x6e49a234",
  storages: "0x517859c5"
}
```

---------

Signed-off-by: jsvisa <delweng@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
2025-09-08 14:00:23 +08:00
rjl493456442
902ec5baae
cmd, core, eth, triedb/pathdb: track node origins in the path database (#32418)
This PR is the first step in the trienode history series.

It introduces the `nodeWithOrigin` struct in the path database, which tracks
the original values of dirty nodes to support trienode history construction.

Note that the original value is always empty in this PR, so it won't break
the encoding and decoding of the existing journal. Journal compatibility
should be handled in the following PR.
2025-09-05 10:37:05 +08:00
Mars
0e69530c6e
all: improve ETA calculation across all progress indicators (#32521)
### Summary
Fixes long-standing ETA calculation errors in progress indicators that
have been present since February 2021. The current implementation
produces increasingly inaccurate estimates due to integer division
precision loss.

### Problem

3aeccadd04/triedb/pathdb/history_indexer.go (L541-L553)
The ETA calculation has two critical issues:
1. **Integer division precision loss**: `speed` is calculated as `uint64`
2. **Off-by-one**: `speed` adds `+ 1` (twice) to avoid division by zero,
which skews the final calculation

This results in wildly inaccurate time estimates that don't improve as
progress continues.

### Example
Current output during state history indexing:
```
lvl=info msg="Indexing state history" processed=16858580 left=41802252 elapsed=18h22m59.848s eta=11h36m42.252s
```

**Expected calculation:**
- Speed: 16858580 ÷ 66179848ms = 0.255 blocks/ms  
- ETA: 41802252 ÷ 0.255 = ~45.6 hours

**Current buggy calculation:**
- Speed: rounds to 1 block/ms
- ETA: 41802252 ÷ 1 = ~11.6 hours 

### Solution
- Created centralized `CalculateETA()` function in common package
- Replaced all 8 duplicate code copies across the codebase
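
The difference between the two calculations in the example can be reproduced with a short sketch. Both functions below are hypothetical reconstructions written for illustration (`buggyETA` mimics the integer-division pattern described under Problem, `estimateETA` shows a floating-point alternative); neither is claimed to be the exact code in the repository:

```go
package main

import (
	"fmt"
	"time"
)

// exampleElapsed is the elapsed time from the log line above (18h22m59.848s).
const exampleElapsed = 66179848 * time.Millisecond

// buggyETA truncates the speed to whole items per millisecond and bumps it by
// one to avoid dividing by zero, so any rate below 1 item/ms becomes 1.
func buggyETA(done, left uint64, elapsed time.Duration) time.Duration {
	speed := done/uint64(elapsed.Milliseconds()+1) + 1
	return time.Duration(left/speed) * time.Millisecond
}

// estimateETA keeps the speed as a float64 so slow rates are not rounded away.
func estimateETA(done, left uint64, elapsed time.Duration) time.Duration {
	ms := elapsed.Milliseconds()
	if done == 0 || ms == 0 {
		return 0 // no rate information yet
	}
	speed := float64(done) / float64(ms) // items per millisecond
	return time.Duration(float64(left)/speed) * time.Millisecond
}

func main() {
	done, left := uint64(16858580), uint64(41802252)
	fmt.Println("buggy:", buggyETA(done, left, exampleElapsed))
	fmt.Println("fixed:", estimateETA(done, left, exampleElapsed))
}
```

Returning zero when nothing has been processed yet is one possible way to avoid the division by zero without distorting the speed; other conventions would work as well.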

### Testing
Verified accurate ETA calculations during archive node reindexing with
significantly improved time estimates.
2025-09-01 13:47:02 +08:00
rjl493456442
7f78fa6912
triedb/pathdb, core: keep root->id mappings after truncation (#32502)
This pull request preserves the root->ID mappings in the path database
even after the associated state histories are truncated, regardless of
whether the truncation occurs at the head or the tail.

The motivation is to support an additional history type, trienode history. 
Since the root->ID mappings are shared between two history instances, 
they must not be removed by either one.

As a consequence, the root->ID mappings remain in the database even
after the corresponding histories are pruned. While these mappings may
become dangling, it is safe and cheap to keep them.

Additionally, this pull request enhances validation during historical
reader construction, ensuring that only canonical historical state will be
served.
2025-08-29 15:43:58 +08:00
Zach Brown
2a795c14f4
all: fix problematic function name in comment (#32513)
Fix problematic function names in comments.
I did my best to correct them all with a script to avoid spamming PRs.
2025-08-29 08:54:23 +08:00
rjl493456442
95ab643bb8
triedb/pathdb: refactor state history write (#32497)
This pull request refactors the internal implementation of the path
database a bit, specifically:

- purge the state index data in batch
- simplify the logic of state history construction and indexing to make it more readable
2025-08-26 21:53:55 +08:00