go-ethereum

mirror of https://github.com/ethereum/go-ethereum.git synced 2026-06-12 09:51:36 +00:00

Author	SHA1	Message	Date
CPerezz	fcc0587ec3	core/state,triedb/pathdb: fix bintrieFlatReader disk-layer shape via per-offset extraction Addresses review finding C2 (+ I5, S5, T2, T3, T12). Before this commit, bintrieFlatCodec.ReadAccount returned the FULL variable-length stem blob from disk while the in-memory diff-layer buffer stored per-offset 32-byte values. The consumer, bintrieFlatReader.Account, enforced len(basicBlob)!=32 → error, so every disk-layer hit produced "bintrie BasicData leaf invalid length" in production the moment the write buffer flushed. TestBintrieFlatReaderEndToEnd did not catch this because it never forced a buffer → disk flush. Fix: make bintrieFlatCodec.ReadAccount extract the offset from the stem blob (mirroring ReadStorage), so the disk path and the buffer path return the same 32-byte per-offset shape. Update AccountCacheKey/StorageCacheKey to embed the full 32-byte key (prefix + 31-byte stem + 1-byte offset), since caching under a stem-only key would collapse BasicData and CodeHash into the same slot and return the wrong value on the second hit. Update Flush's cache-update loop to store per-offset entries from the aggregated write set. Design note: I considered the alternative of introducing a new StemBlob(stem) interface method that returns the full blob synthesized from a stem-level lookup index. Rejected because (a) the index is a new data structure with its own consistency invariants, (b) the per-offset approach is strictly local to the codec + reader, and (c) the "1 Pebble read per Account" locality benefit is preserved at the OS page cache level — both offsets at the same stem live in the same Pebble block, so the second read is effectively free. bintrieFlatReader.Account still does two AccountRLP lookups; the torn-read hazard is gated by a new load-bearing invariant test, TestBinaryHasherWritesBothBasicAndCodeHash, which asserts that binaryHasher.updateAccount always emits both BasicData and CodeHash leaves together. A future code-only update that broke this invariant would fail the test. Tests added: * TestBintrieFlatReaderEndToEndAfterFlush — explicitly flushes via tdb.Commit(root, false) and re-reads through a fresh StateReader. This is the smoking-gun regression for C2. * TestBintrieFlatReaderMultipleOffsetsPerStem — multiple offsets at the same stem (BasicData, CodeHash, header storage slots) all round-trip post-flush. * TestBintrieCodecCrossFlushRMW — two Flush calls to the same stem from different "blocks" correctly merge on disk, with prior offsets preserved. * TestBinaryHasherWritesBothBasicAndCodeHash — locks down the hasher co-write invariant that bintrieFlatReader.Account relies on. Existing tests updated to match the new per-offset ReadAccount semantics: * TestBintrieCodecAccountRoundTrip, TestBintrieCodecMultipleWritesSameStem, TestBintrieCodecDeleteAccount — now read per-offset rather than calling extractStemOffset on the raw blob. * TestBintrieCodecCacheKeysDisjoint — additionally verifies two offsets at the same stem produce distinct cache keys. Error messages in bintrieFlatReader now include address and length context (S5).	2026-04-15 15:00:40 +02:00
CPerezz	bfb77d98f6	core/state,triedb/pathdb: enable bintrie flat state reads end-to-end Wires the pieces from Commits 1-9 into a running system: * triedb/pathdb.New: install the bintrieFlatCodec when isVerkle is set, backed by the same verkle-namespaced db used for trie nodes. * triedb/pathdb.database.go: drop isVerkle from the noBuild guard so the bintrie generator (Commit 9) runs on startup, and remove it from the generateSnapshot call path for the same reason. * triedb/pathdb.disklayer.revert: hard-fail on bintrie because the reorg path would replay merkle-shaped origin records against a per-stem layout. Tracked in BINTRIE_FLAT_STATE_REORG_GAP.md. * triedb/pathdb.journal: add IsBintrie to journalGenerator (rlp:"optional" so v3 journals still decode) and make journalProgress a method on generator so it stamps the active scheme; loadGenerator discards any journal whose scheme does not match the database, forcing a fresh regeneration. * triedb/pathdb.reader: export RawStateReader, a small extension of database.StateReader that exposes AccountRLP so callers outside the package can reach the raw flat-state bytes without going through the slim-RLP decode path that assumes merkle shape. * core/state.reader: add bintrieFlatReader, the bintrie equivalent of flatReader. It derives the EIP-7864 stem keys from (addr, slot), performs two AccountRLP lookups per Account call (BasicData + CodeHash), and decodes via bintrie.UnpackBasicData. Storage reads go through a single AccountRLP lookup at the slot's full bintrie key. * core/state.database.StateReader: dispatch to bintrieFlatReader when the path database is in verkle mode; merkle path unchanged. Depends on the lookup sentinel fix in the previous commit; without it missing-account reads on bintrie misreport as "layer stale".	2026-04-15 15:00:40 +02:00
Gary Rong	d57dca07b1	core/state: integrate witness collector	2026-04-15 15:00:39 +02:00
Gary Rong	1ae462f08d	core/state: build hasher skeleton	2026-04-15 15:00:38 +02:00
Gary Rong	e2c00d6c96	core/state: add hasher interface definition	2026-04-15 14:59:05 +02:00
CPerezz	4b915af2c3	core/state: avoid Bytes() allocation in flatReader hash computations (#34025 ) ## Summary Replace `addr.Bytes()` and `key.Bytes()` with `addr[:]` and `key[:]` in `flatReader`'s `Account` and `Storage` methods. The former allocates a copy while the latter creates a zero-allocation slice header over the existing backing array. ## Benchmark (AMD EPYC 48-core, 500K entries, screening `--benchtime=1x`) \| Metric \| Baseline \| Slice syntax \| Delta \| \|--------\|----------\|--------------\|-------\| \| Approve (Mgas/s) \| 4.13 \| 4.22 \| +2.2% \| \| BalanceOf (Mgas/s) \| 168.3 \| 190.0 \| +12.9% \|	2026-03-17 11:42:42 +01:00
rjl493456442	91cec92bf3	core, miner, tests: introduce codedb and simplify cachingDB (#33816 ) Some checks are pending / Linux Build (push) Waiting to run Details / Linux Build (arm) (push) Waiting to run Details / Keeper Build (push) Waiting to run Details / Windows Build (push) Waiting to run Details / Docker Image (push) Waiting to run Details	2026-03-10 08:29:21 +01:00
rjl493456442	c2595381bf	core: extend the code reader statistics (#33659 ) Some checks are pending / Windows Build (push) Waiting to run Details / Docker Image (push) Waiting to run Details / Keeper Build (push) Waiting to run Details / Linux Build (push) Waiting to run Details / Linux Build (arm) (push) Waiting to run Details This PR extends the statistics of contract code read by adding these fields: - CacheHitBytes: the total number of bytes served by cache - CacheMissBytes: the total number of bytes read on cache miss - CodeReadBytes: the total number of bytes for contract code read	2026-01-26 11:25:53 +01:00
rjl493456442	9623dcbca2	core/state: add cache statistics of contract code reader (#33532 )	2026-01-08 11:48:45 +08:00
Guillaume Ballet	3f641dba87	trie, go.mod: remove all references to go-verkle and go-ipa (#33461 ) In order to reduce the amount of code that is embedded into the keeper binary, I am removing all the verkle code that uses go-verkle and go-ipa. This will be followed by further PRs that are more like stubs to replace code when the keeper build is detected. I'm keeping the binary tree of course. This means that you will still see `isVerkle` variables all over the codebase, but they will be renamed when code is touched (i.e. this is not an invitation for 30+ AI slop PRs). --------- Co-authored-by: Gary Rong <garyrong0905@gmail.com>	2025-12-30 20:44:04 +08:00
Ng Wei Han	9a346873b8	core/state: fix incorrect contract code state metrics (#33376 ) ## Description This PR fixes incorrect contract code state metrics by ensuring duplicate codes are not counted towards the reported results. ## Rationale The contract code metrics don't consider database deduplication. The current implementation assumes that the results are only slightly inaccurate, but this is not true, especially for data collection efforts that started from the genesis block.	2025-12-10 11:33:59 +08:00
rjl493456442	042c47ce1a	core: log detailed statistics for slow block (#32812 ) This PR introduces a new debug feature, logging the slow blocks with detailed performance statistics, such as state read, EVM execution and so on. Notably, the detailed performance statistics of slow blocks won't be logged during the sync to not overwhelm users. Specifically, the statistics are only logged if there is a single block processed. Example output ``` ########## SLOW BLOCK ######### Block: 23537063 (0xa7f878611c2dd27f245fc41107d12ebcf06b4e289f1d6acf44d49a169554ee09) txs: 248, mgasps: 202.99 EVM execution: 63.295ms Validation: 1.130ms Account read: 6.634ms(648) Storage read: 17.391ms(1434) State hash: 6.722ms DB commit: 3.260ms Block write: 1.954ms Total: 99.094ms State read cache: account (hit: 622, miss: 26), storage (hit: 1325, miss: 109) ############################## ```	2025-12-02 14:43:51 +01:00
Guillaume Ballet	2a2f106a01	cmd/evm/internal/t8ntool, trie: support for verkle-at-genesis, use UBT, and move the transition tree to its own package (#32445 ) Some checks are pending / Linux Build (push) Waiting to run Details / Linux Build (arm) (push) Waiting to run Details / Keeper Build (push) Waiting to run Details / Windows Build (push) Waiting to run Details / Docker Image (push) Waiting to run Details This is broken off of #31730 to only focus on testing networks that start with verkle at genesis. The PR has seen a lot of work since its creation, and it now targets creating and re-executing tests for a binary tree testnet without the transition (so it starts at genesis). The transition tree has been moved to its own package. It also replaces verkle with the binary tree for this specific application. --------- Co-authored-by: Gary Rong <garyrong0905@gmail.com>	2025-11-14 15:25:30 +01:00
Zach Brown	2a795c14f4	all: fix problematic function name in comment (#32513 ) Some checks are pending / Linux Build (push) Waiting to run Details / Linux Build (arm) (push) Waiting to run Details / Windows Build (push) Waiting to run Details / Docker Image (push) Waiting to run Details Fix problematic function name in comment. Do my best to correct them all with a script to avoid spamming PRs.	2025-08-29 08:54:23 +08:00
pxwanglu	d0602ba45a	core,trie: fix typo in TransitionTrie (#32491 ) Change `NewTransitionTree` to the correct `NewTransitionTrie`. Signed-off-by: pxwanglu <pxwanglu@icloud.com>	2025-08-25 09:29:58 +02:00
Guillaume Ballet	ea3a71792d	trie, core/state: add the transition tree (verkle transition part 2) (#32366 ) This add some of the changes that were missing from #31634. It introduces the `TransitionTrie`, which is a façade pattern between the current MPT trie and the overlay tree. --------- Signed-off-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com> Co-authored-by: rjl493456442 <garyrong0905@gmail.com>	2025-08-15 14:34:32 +08:00
Guillaume Ballet	cf50026466	core/state: introduce the TransitionState object (verkle transition part 1) (#31634 ) Some checks are pending / Linux Build (push) Waiting to run Details / Linux Build (arm) (push) Waiting to run Details / Windows Build (push) Waiting to run Details / Docker Image (push) Waiting to run Details This is the first part of #31532 It maintains a series of conversion maker which are to be updated by the conversion code (in a follow-up PR, this is a breakdown of a larger PR to make things easier to review). They can be used in this way: - During the conversion, by storing the conversion markers when the block has been processed. This is meant to be written in a function that isn't currently present, hence [this TODO](https://github.com/ethereum/go-ethereum/pull/31634/files#diff-89272f61e115723833d498a0acbe59fa2286e3dc7276a676a7f7816f21e248b7R384). Part of https://github.com/ethereum/go-ethereum/issues/31583 --------- Signed-off-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com> Co-authored-by: Gary Rong <garyrong0905@gmail.com>	2025-08-05 09:34:12 +08:00
rjl493456442	c7b8924fe4	core/state: expose the state reader stats (#31998 ) This pull request introduces a mechanism to expose statistics from the state reader, specifically related to cache utilization during state prefetching. To improve state access performance, a pair of state readers is constructed with a shared local cache. One reader to execute transactions ahead of time to warm up the cache. The other reader is used by the actual chain processing logic, which can benefit from the prefetched states. This PR adds visibility into how effective the cache is by exposing relevant usage statistics. --------- Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com> Co-authored-by: Csaba Kiraly <csaba.kiraly@gmail.com>	2025-06-21 12:58:04 +08:00
nthumann	cc1293b8f1	all: reuse the global hash buffer (#31839 ) Some checks are pending / Linux Build (push) Waiting to run Details / Linux Build (arm) (push) Waiting to run Details / Docker Image (push) Waiting to run Details As https://github.com/ethereum/go-ethereum/pull/31769 defined a global hash pool, so we can reuse it, and also remove the unnecessary KeccakState buffering --------- Co-authored-by: Gary Rong <garyrong0905@gmail.com>	2025-06-18 15:29:14 +08:00
Marius van der Wijden	0eb2eeea90	all: create global hasher pool (#31769 ) This PR creates a global hasher pool that can be used by all packages. It also removes a bunch of the package local pools. It also updates a few locations to use available hashers or the global hashing pool to reduce allocations all over the codebase. This change should reduce global allocation count by ~1% --------- Co-authored-by: Gary Rong <garyrong0905@gmail.com>	2025-05-09 13:52:40 +08:00
rjl493456442	485ff4bbff	core: implement in-block prefetcher (#31557 ) This pull request enhances the block prefetcher by executing transactions in parallel to warm the cache alongside the main block processor. Unlike the original prefetcher, which only executes the next block and is limited to chain syncing, the new implementation can be applied to any block. This makes it useful not only during chain sync but also for regular block insertion after the initial sync. --------- Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de>	2025-05-08 22:28:16 +08:00
rjl493456442	03c37cdb2b	core/state: introduce code reader interface (#30816 ) This PR introduces a `ContractCodeReader` interface with functions defined: type ContractCodeReader interface { Code(addr common.Address, codeHash common.Hash) ([]byte, error) CodeSize(addr common.Address, codeHash common.Hash) (int, error) } This interface can be implemented in various ways. Although the codebase currently includes only one implementation, additional implementations could be created for different purposes and scenarios, such as a code reader designed for the Verkle tree approach or one that reads code from the witness. Notably, this interface modifies the function’s semantics. If the contract code is not found, no error will be returned. An error should only be returned in the event of an unexpected issue, primarily for future implementations. The original state.Reader interface is extended with ContractCodeReader methods, it gives us more flexibility to manipulate the reader with additional logic on top, e.g. Hooks. type Reader interface { ContractCodeReader StateReader } --------- Co-authored-by: Felix Lange <fjl@twurst.com>	2024-11-29 15:32:45 +01:00
rjl493456442	74ef47462f	core/state, triedb/database: refactor state reader (#30712 ) Co-authored-by: Martin HS <martin@swende.se>	2024-11-09 08:08:06 +08:00
rjl493456442	623b17ba20	core/state: state reader abstraction (#29761 ) This pull request introduces a state.Reader interface for state accessing. The interface could be implemented in various ways. It can be pure trie only reader, or the combination of trie and state snapshot. What's more, this interface allows us to have more flexibility in the future, e.g. the archive reader (for accessing archive state). Additionally, this pull request removes the following metrics - `chain/snapshot/account/reads` - `chain/snapshot/storage/reads`	2024-09-05 13:10:47 +03:00

24 commits