go-ethereum

forks/go-ethereum

mirror of https://github.com/ethereum/go-ethereum.git synced 2026-06-15 03:11:36 +00:00

Commit graph

Author	SHA1	Message	Date

CPerezz

45de1c3cc1

triedb/pathdb: fix mid-stem generator resume via mergeStemBlob RMW

Addresses review finding C1.

Before this commit, flushStem in generateBinTrieStems used
builder.encode() to overwrite the on-disk stem blob unconditionally.
When a crash+restart interrupted generation mid-stem (e.g., at offset 3
of stemA), the resume iterator positioned at stemA||3, the builder
accumulated only offsets 3+, and flushStem overwrote the disk blob with
a partial result — silently losing offsets 0, 1, 2 that were written in
the prior pass.

Fix: make flushStem a read-modify-write. It now reads the existing
on-disk stem blob (if any), converts the builder's accumulated offsets
to []stemOffsetValue via a new toOffsetValues() helper, and merges them
via the existing mergeStemBlob function. The merge semantics are
"builder values win" — new offsets overwrite their existing counterparts,
and gaps are filled from the prior blob. This makes the RMW idempotent
across resume cycles: the same stem can be re-walked from any midpoint
and the final disk blob always contains the union of all passes.
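
The "builder values win" merge can be illustrated with a minimal sketch. It works on a decoded map rather than the encoded on-disk blob, and mergeOffsets is a hypothetical stand-in for mergeStemBlob, whose real signature is not shown in this commit message. The fields of stemOffsetValue are likewise assumed.

```go
package main

import "fmt"

// stemOffsetValue pairs an offset inside a stem with its value.
// The type name comes from the commit message; the fields are assumed.
type stemOffsetValue struct {
	offset byte
	value  []byte
}

// mergeOffsets is a hypothetical stand-in for mergeStemBlob. It merges the
// offsets accumulated in the current pass into the offsets already on disk.
func mergeOffsets(existing map[byte][]byte, updates []stemOffsetValue) map[byte][]byte {
	merged := make(map[byte][]byte, len(existing)+len(updates))
	for off, val := range existing {
		merged[off] = val // offsets only present on disk are preserved
	}
	for _, u := range updates {
		merged[u.offset] = u.value // values from the current pass win
	}
	return merged
}

func main() {
	// Offsets 0..2 were written before the crash; the resumed pass saw 3 and 4.
	disk := map[byte][]byte{0: []byte("a0"), 1: []byte("a1"), 2: []byte("a2")}
	resumed := []stemOffsetValue{
		{offset: 3, value: []byte("b3")},
		{offset: 4, value: []byte("b4")},
	}
	merged := mergeOffsets(disk, resumed)
	fmt.Println("offsets after merge:", len(merged))
}
```

Because the merge is a union with a fixed tie-break, re-running the same pass over the same stem yields the same result, which is the idempotence property described above.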

New helper: stemBuilder.toOffsetValues() converts the builder's
populated bitmap entries into a []stemOffsetValue slice suitable for
mergeStemBlob. ~20 LOC in stem_blob.go.
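
A rough sketch of what such a helper might look like follows. The builder layout here (a 256-bit bitmap plus one value slot per offset) is an assumption made for illustration; the actual stemBuilder fields in stem_blob.go are not described in this message.

```go
package main

import "fmt"

// stemOffsetValue fields are assumed, as above.
type stemOffsetValue struct {
	offset byte
	value  []byte
}

// stemBuilder layout is assumed: one bit per offset marks populated slots.
type stemBuilder struct {
	bitmap [32]byte
	values [256][]byte
}

// set records a value and marks its offset as populated (hypothetical helper).
func (b *stemBuilder) set(offset byte, value []byte) {
	b.bitmap[offset/8] |= byte(1) << uint(offset%8)
	b.values[offset] = value
}

// toOffsetValues returns the populated entries in ascending offset order.
func (b *stemBuilder) toOffsetValues() []stemOffsetValue {
	var out []stemOffsetValue
	for i := 0; i < 256; i++ {
		if b.bitmap[i/8]&(byte(1)<<uint(i%8)) != 0 {
			out = append(out, stemOffsetValue{offset: byte(i), value: b.values[i]})
		}
	}
	return out
}

func main() {
	b := &stemBuilder{}
	b.set(200, []byte("code chunk"))
	b.set(3, []byte("slot"))
	for _, ov := range b.toOffsetValues() {
		fmt.Printf("offset %d: %s\n", ov.offset, ov.value)
	}
}
```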

Tests:
  * TestBintrieGeneratorResumeMidStem — pre-seeds disk with a partial
    stem (offsets 0, 1), resumes generator at offset 1, asserts all
    offsets survive including the pre-seeded offset 0. Before the fix
    this test fails with "BasicData lost after mid-stem resume".
  * TestBintrieGeneratorResumeStemBoundary — renamed from the original
    TestBintrieGeneratorResume, unchanged behavior (stem-boundary
    resume).

2026-04-15 15:00:41 +02:00

CPerezz

bfb77d98f6

core/state,triedb/pathdb: enable bintrie flat state reads end-to-end

Wires the pieces from Commits 1-9 into a running system:

* triedb/pathdb.New: install the bintrieFlatCodec when isVerkle is set,
  backed by the same verkle-namespaced db used for trie nodes.
* triedb/pathdb.database.go: drop isVerkle from the noBuild guard so the
  bintrie generator (Commit 9) runs on startup, and remove it from the
  generateSnapshot call path for the same reason.
* triedb/pathdb.disklayer.revert: hard-fail on bintrie because the
  reorg path would replay merkle-shaped origin records against a
  per-stem layout. Tracked in BINTRIE_FLAT_STATE_REORG_GAP.md.
* triedb/pathdb.journal: add IsBintrie to journalGenerator (rlp:"optional"
  so v3 journals still decode) and make journalProgress a method on
  generator so it stamps the active scheme; loadGenerator discards any
  journal whose scheme does not match the database, forcing a fresh
  regeneration.
* triedb/pathdb.reader: export RawStateReader, a small extension of
  database.StateReader that exposes AccountRLP so callers outside the
  package can reach the raw flat-state bytes without going through the
  slim-RLP decode path that assumes merkle shape.
* core/state.reader: add bintrieFlatReader, the bintrie equivalent of
  flatReader. It derives the EIP-7864 stem keys from (addr, slot),
  performs two AccountRLP lookups per Account call (BasicData +
  CodeHash), and decodes via bintrie.UnpackBasicData. Storage reads go
  through a single AccountRLP lookup at the slot's full bintrie key.
* core/state.database.StateReader: dispatch to bintrieFlatReader when
  the path database is in verkle mode; merkle path unchanged.
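
The journal scheme check from the list above can be sketched as follows. This is a simplified illustration, not the pathdb code: the struct fields other than IsBintrie are assumed, usableJournal is a hypothetical helper, and RLP decoding is left out entirely.

```go
package main

import "fmt"

// journalGenerator mirrors the idea in the commit message. Only IsBintrie is
// named there; Done and Marker are assumed fields for illustration.
type journalGenerator struct {
	Done      bool
	Marker    []byte
	IsBintrie bool // absent in older journals, so it reads as false (merkle)
}

// usableJournal is a hypothetical helper: a journal is reused only when its
// scheme matches the database. Otherwise the caller regenerates from scratch.
func usableJournal(j *journalGenerator, dbIsBintrie bool) bool {
	return j != nil && j.IsBintrie == dbIsBintrie
}

func main() {
	old := &journalGenerator{Marker: []byte{0x01}} // written by a merkle database
	fmt.Println("reuse on bintrie db:", usableJournal(old, true))
	fmt.Println("reuse on merkle db:", usableJournal(old, false))
}
```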

Depends on the lookup sentinel fix in the previous commit; without it
missing-account reads on bintrie misreport as "layer stale".

2026-04-15 15:00:40 +02:00

CPerezz

0508d40aaf

triedb/pathdb: bintrie snapshot generator

Adds generateBinTrieStems, the bintrie analogue of generateAccounts. It
opens the bintrie via a sha256-aware bintrieDiskStore (the merkle disk
store would always fail root validation against a binary node), iterates
all leaves with binaryNodeIterator, aggregates them into per-stem
builders, and emits one stem blob per stem boundary.
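
The per-stem aggregation can be sketched as below, assuming leaves arrive in ascending key order and that each 32-byte key is a 31-byte stem followed by a 1-byte offset. The leaf and stemGroup types and the groupByStem function are hypothetical; the real code uses binaryNodeIterator and stem builders.

```go
package main

import "fmt"

// leaf is a hypothetical flattened trie leaf: 31-byte stem plus 1-byte offset.
type leaf struct {
	key   [32]byte
	value []byte
}

// stemGroup collects the offsets seen for one stem (hypothetical type).
type stemGroup struct {
	stem    [31]byte
	offsets []byte
}

// groupByStem walks sorted leaves and starts a new group whenever the stem
// changes, which corresponds to emitting one blob per stem boundary.
func groupByStem(leaves []leaf) []stemGroup {
	var groups []stemGroup
	for _, l := range leaves {
		var stem [31]byte
		copy(stem[:], l.key[:31])
		if n := len(groups); n == 0 || groups[n-1].stem != stem {
			groups = append(groups, stemGroup{stem: stem})
		}
		g := &groups[len(groups)-1]
		g.offsets = append(g.offsets, l.key[31])
	}
	return groups
}

// makeKey builds a test key whose stem is filled with one repeated byte.
func makeKey(stemFill, offset byte) [32]byte {
	var k [32]byte
	for i := 0; i < 31; i++ {
		k[i] = stemFill
	}
	k[31] = offset
	return k
}

func main() {
	leaves := []leaf{
		{key: makeKey(0xaa, 0)}, {key: makeKey(0xaa, 1)}, {key: makeKey(0xbb, 64)},
	}
	for _, g := range groupByStem(leaves) {
		fmt.Printf("stem %x...: offsets %v\n", g.stem[:2], g.offsets)
	}
}
```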

Resume support is structural: ctx.marker is fed straight to the trie's
NodeIterator, which uses binaryNodeIterator.seek (Commit 1) to position
on the first leaf >= marker. Range proofs are deliberately skipped — the
bintrie's Prove path is unimplemented and an iteration-only generation
cycle is acceptable for a one-time startup cost.
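
The "first leaf >= marker" positioning can be illustrated on a sorted slice of keys. This is only an analogy under the assumption of lexicographic byte ordering; binaryNodeIterator.seek operates on the trie itself, and seekIndex is a hypothetical name.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// seekIndex returns the index of the first key that is >= marker in
// lexicographic order, or len(keys) if every key is smaller.
// keys must already be sorted ascending.
func seekIndex(keys [][]byte, marker []byte) int {
	return sort.Search(len(keys), func(i int) bool {
		return bytes.Compare(keys[i], marker) >= 0
	})
}

func main() {
	keys := [][]byte{{0x01, 0x00}, {0x01, 0x03}, {0x02, 0x00}}
	// Resuming at stem 0x01, offset 3 skips the leaf that sorts before it.
	fmt.Println("resume index:", seekIndex(keys, []byte{0x01, 0x03}))
	// An empty marker starts from the beginning.
	fmt.Println("fresh start index:", seekIndex(keys, nil))
}
```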

A bintrieGeneratorContext mirrors generatorContext but is much smaller:
no holdable iterators (we walk the trie, not the existing flat state)
and no two-tier marker (the bintrie key space is unified). checkAndFlushBin
journals progress as a single 32-byte (stem || offset) key so resume
can pick up mid-stem.
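
A minimal sketch of the (stem || offset) progress key follows, assuming a 31-byte stem and a 1-byte offset as implied by the 32-byte total. encodeMarker and decodeMarker are hypothetical names, not functions from the commit.

```go
package main

import "fmt"

// encodeMarker packs a 31-byte stem and a 1-byte offset into one 32-byte key.
func encodeMarker(stem [31]byte, offset byte) [32]byte {
	var m [32]byte
	copy(m[:31], stem[:])
	m[31] = offset
	return m
}

// decodeMarker splits a 32-byte progress key back into stem and offset.
func decodeMarker(m [32]byte) ([31]byte, byte) {
	var stem [31]byte
	copy(stem[:], m[:31])
	return stem, m[31]
}

func main() {
	var stem [31]byte
	stem[0] = 0xab
	marker := encodeMarker(stem, 3)
	gotStem, gotOffset := decodeMarker(marker)
	fmt.Printf("stem starts with %x, resume at offset %d\n", gotStem[0], gotOffset)
}
```

A single flat key like this sorts in the same order as the leaves themselves, so it can be handed to the iterator unchanged on resume.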

generator.run dispatches on codec type so callers see a uniform
lifecycle whether the underlying scheme is merkle or bintrie.
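
One way such a dispatch could look is a type switch over the installed codec. Everything below except the names bintrieFlatCodec, generateBinTrieStems and generateAccounts is hypothetical, including the flatCodec interface and the merkleFlatCodec type; the function returns the name of the routine it would call rather than running anything.

```go
package main

import "fmt"

// flatCodec is a hypothetical marker interface for flat-state codecs.
type flatCodec interface{ scheme() string }

type merkleFlatCodec struct{}  // assumed name for the merkle codec
type bintrieFlatCodec struct{} // name taken from the commit series

func (merkleFlatCodec) scheme() string  { return "merkle" }
func (bintrieFlatCodec) scheme() string { return "bintrie" }

// pickGenerator chooses the generation routine from the codec's concrete type,
// so callers do not need to know which scheme is active.
func pickGenerator(c flatCodec) string {
	switch c.(type) {
	case bintrieFlatCodec:
		return "generateBinTrieStems"
	case merkleFlatCodec:
		return "generateAccounts"
	default:
		return ""
	}
}

func main() {
	for _, c := range []flatCodec{merkleFlatCodec{}, bintrieFlatCodec{}} {
		fmt.Printf("%s -> %s\n", c.scheme(), pickGenerator(c))
	}
}
```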

2026-04-15 15:00:40 +02:00

3 commits