Wires the pieces from Commits 1-9 into a running system:
* triedb/pathdb.New: install the bintrieFlatCodec when isVerkle is set,
backed by the same verkle-namespaced db used for trie nodes.
* triedb/pathdb.database.go: drop isVerkle from the noBuild guard so the
bintrie generator (Commit 9) runs on startup, and remove it from the
generateSnapshot call path for the same reason.
* triedb/pathdb.disklayer.revert: hard-fail on bintrie because the
reorg path would replay merkle-shaped origin records against a
per-stem layout. Tracked in BINTRIE_FLAT_STATE_REORG_GAP.md.
* triedb/pathdb.journal: add IsBintrie to journalGenerator (rlp:"optional"
so v3 journals still decode) and make journalProgress a method on
generator so it stamps the active scheme; loadGenerator discards any
journal whose scheme does not match the database, forcing a fresh
regeneration.
* triedb/pathdb.reader: export RawStateReader, a small extension of
database.StateReader that exposes AccountRLP so callers outside the
package can reach the raw flat-state bytes without going through the
slim-RLP decode path that assumes merkle shape.
* core/state.reader: add bintrieFlatReader, the bintrie equivalent of
flatReader. It derives the EIP-7864 stem keys from (addr, slot),
performs two AccountRLP lookups per Account call (BasicData +
CodeHash), and decodes via bintrie.UnpackBasicData. Storage reads go
through a single AccountRLP lookup at the slot's full bintrie key.
* core/state.database.StateReader: dispatch to bintrieFlatReader when
the path database is in verkle mode; merkle path unchanged.
Depends on the lookup sentinel fix in the previous commit; without it
missing-account reads on bintrie misreport as "layer stale".
Adds generateBinTrieStems, the bintrie analogue of generateAccounts. It
opens the bintrie via a sha256-aware bintrieDiskStore (the merkle disk
store would always fail root validation against a binary node), iterates
all leaves with binaryNodeIterator, aggregates them into per-stem
builders, and emits one stem blob per stem boundary.
Resume support is structural: ctx.marker is fed straight to the trie's
NodeIterator, which uses binaryNodeIterator.seek (Commit 1) to position
on the first leaf >= marker. Range proofs are deliberately skipped — the
bintrie's Prove path is unimplemented and an iteration-only generation
cycle is acceptable for a one-time startup cost.
A bintrieGeneratorContext mirrors generatorContext but is much smaller:
no holdable iterators (we walk the trie, not the existing flat state)
and no two-tier marker (the bintrie key space is unified). checkAndFlushBin
journals progress as a single 32-byte (stem || offset) key so resume
can pick up mid-stem.
generator.run dispatches on codec type so callers see a uniform
lifecycle whether the underlying scheme is merkle or bintrie.
Route the flatStateCodec from Database through every flat-state call
site so that the trie-specific aspects of persistence and key derivation
live behind a single abstraction. Pure refactor: merkle behavior and
on-disk layout are unchanged because the only codec wired up is
merkleFlatCodec, whose methods are thin wrappers over the existing
rawdb accessors.
Threaded sites:
disklayer.account/storage use codec.{Read,AccountCacheKey,
StorageCacheKey} instead of direct
rawdb calls and bare hash slicing.
flush.writeStates takes a codec parameter; persistence
goes through codec.{Write,Delete}
{Account,Storage}.
buffer.flush carries the codec down into writeStates.
states.write/dbsize takes the codec for prefix-size
accounting.
generate.go (g.codec) the generator owns a codec, used by
generateAccounts/generateStorages
callbacks; the unused top-level
splitMarker helper is removed in favor
of codec.SplitMarker.
context.go the generator context owns the codec
and uses codec.{AccountPrefix,
StoragePrefix,Account/StorageKeyLength}
to construct iterators.
reader.go (HistoricalState) uses codec.{Account,Storage}Key for
caller-side key derivation.
The marker comparisons in writeStates remain merkle-shaped (two-tier
account+storage marker) because the bintrie path will use a separate
writer over single-tier stem markers in a later commit.
All existing pathdb tests pass.
Previously, PathDB used a single buffer to aggregate database writes,
which needed to be flushed atomically. However, flushing large amounts
of data (e.g., 256MB) caused significant overhead, often blocking the
system for around 3 seconds during the flush.
To mitigate this overhead and reduce performance spikes, a double-buffer
mechanism is introduced. When the active buffer fills up, it is marked
as frozen and a background flushing process is triggered. Meanwhile, a
new buffer is allocated for incoming writes, allowing operations to
continue uninterrupted.
This approach reduces system blocking times and provides flexibility in
adjusting buffer parameters for improved performance.
In this pull request, snapshot generation in pathdb has been ported from
the legacy state snapshot implementation. Additionally, when running in
path mode, legacy state snapshot data is now managed by the pathdb
based snapshot logic.
Note: Existing snapshot data will be re-generated, regardless of whether
it was previously fully constructed.