The BAL block has been verified and committed by the time we reach the
read-time accounting block, so the prefetcher (whose workload is bounded
by the BAL contents) has no outstanding tasks to wait on.
The previous follow-up note: per-tx + pre-tx + post-tx StateDBs each
have their own stateObjects, so summing CodeLoaded/CodeLoadBytes
over-counts contracts whose code body was fetched by multiple phases.
Fix: snapshot per-StateDB the {address: codeLen} map of contracts whose
s.code is populated, plumb through the existing aggregation pipeline,
dedupe by address in resultHandler/prepareExecResult. The merged map's
size and value-sum become CodeLoaded and CodeLoadBytes respectively,
overriding the per-tx-summed values at the wiring site.
Empirical: a 3-tx block touching the same set of system contracts now
reports code=4, code_bytes=1098 (matches single-tx baseline) instead
of code=12, code_bytes=3294 under the prior over-count.
Per-tx + pre-tx + post-tx StateDBs each have independent stateObjects
caches, so summing their AccountLoaded/StorageLoaded counts over-counts
addresses/slots touched by multiple phase StateDBs (compared to non-BAL
single-StateDB semantics where the cache deduplicates).
Override the read counts at the BAL stats wiring site using two new
helpers on bal.BlockAccessList. The BAL is the canonical block-level
deduplicated access record, so this restores cross-client comparable
"unique accounts/slots touched" semantics.
CodeLoaded/CodeLoadBytes still sum per-tx — the BAL doesn't track code-
fetch events distinctly. Slight over-count remains there, documented.
Replace the {Account, Storage, Code} time.Duration scalars threaded through
ProcessResultWithMetrics, txExecResult, processBlockPreTx and resultHandler
with a single ReadDurations struct + Add merge primitive. Same shape as
StateCounts. Adds (*StateDB).SnapshotReads() helper at the boundary.
Replace the cached AccountReadTime/StorageReadTime fields (which had a
snapshot-staleness bug fixed in 16e98f5d9 by re-calling Metrics()) with
a live ReadTimes() accessor. Metrics() now only returns commit/hash-phase
timings — it no longer touches atomics. blockchain.go reads atomics
directly via stateTransition.ReadTimes(), eliminating the refresh hack.
Also resolves the I1 fragility: Metrics() returning &s.metrics no longer
involves any writes inside the function, so concurrent callers can't race
on the read-time field updates.
Forward prefetchStateReader.Wait() through *reader.WaitPrefetch and call
it before reading the read-time atomics. Eliminates the edge-case where
prefetcher goroutines outlast block execution + commit. For slow blocks
(the metric's target audience) this is a no-op; for fast blocks it
ensures the metric is complete rather than slightly under.
The first Metrics() call inside calcAndVerifyRoot snapshots accountReadNS
and storageReadNS into the cached AccountReadTime/StorageReadTime fields.
But commitAccount (called from writeBlockWithState's CommitWithUpdate
path) increments storageReadNS *after* that snapshot, so reading
m.StorageReadTime later would silently drop those reads.
Re-call Metrics() before reading the read-time fields so the cache
reflects the post-commit atomics. Other metric fields (AccountUpdate,
AccountCommits, etc.) are written directly to s.metrics elsewhere and
remain untouched by Metrics().
Found via the metric-correctness audit.
The struct is 80 bytes (10 ints) — value semantics matches the type's
"snapshot, safe to pass by value" thesis stated in its doc comment, and
removes three unnecessary &-takings at call sites. No behavior change.
* add method on StateReaderTracker to clear the accumulated reads
* don't factor the BAL size into the payload size during construction in the miner
* simplify miner code for constructing payloads-with-BALs via the use of aformentioned StateReaderTracker clear method
* clean up the configuration of the BAL execution mode based on the preset flag specified
This PR introduces a new type HistoryPolicy which captures user intent
as opposed to pruning point stored in the blockchain which persists the
actual tail of data in the database.
It is in preparation for the rolling history expiry feature.
It comes with a semantic change: if database was pruned and geth is
running without a history mode flag (or explicit keep all flag) geth
will emit a warning but continue running as opposed to stopping the
world.
This PR allows users to prune their nodes up to the Prague fork. It
indirectly depends on #32157 and can't really be merged before eraE
files are widely available for download.
The `--history.chain` flag becomes mandatory for `prune-history`
command. Here I've listed all the edge cases that can happen and how we
behave:
## prune-history Behavior
| From | To | Result |
|-------------|--------------|--------------------------|
| full | postmerge | ✅ prunes |
| full | postprague | ✅ prunes |
| postmerge | postprague | ✅ prunes further |
| postprague | postmerge | ❌ can't unprune |
| any | all | ❌ use import-history |
## Node Startup Behavior
| DB State | Flag | Result |
|-------------|--------------|----------------------------------------------------------------|
| fresh | postprague | ✅ syncs from Prague |
| full | postprague | ❌ "run prune-history first" |
| postmerge | postprague | ❌ "run prune-history first" |
| postprague | postmerge | ❌ "can't unprune, use import-history or fix
flag" |
| pruned | all | ✅ accepts known prune points |
Pebble maintains a batch pool to recycle the batch object. Unfortunately
batch object must be
explicitly returned via `batch.Close` function. This PR extends the
batch interface by adding
the close function and also invoke batch.Close in some critical code
paths.
Memory allocation must be measured before merging this change. What's
more, it's an open
question that whether we should apply batch.Close as much as possible in
every invocation.
Implement standardized JSON format for slow block logging to enable
cross-client performance analysis and protocol research.
This change is part of the Cross-Client Execution Metrics initiative
proposed by Gary Rong: https://hackmd.io/dg7rizTyTXuCf2LSa2LsyQ
The standardized metrics enabled data-driven analysis like the EIP-7907
research: https://ethresear.ch/t/data-driven-analysis-on-eip-7907/23850
JSON format includes:
- block: number, hash, gas_used, tx_count
- timing: execution_ms, total_ms
- throughput: mgas_per_sec
- state_reads: accounts, storage_slots, bytecodes, code_bytes
- state_writes: accounts, storage_slots, bytecodes
- cache: account/storage/code hits, misses, hit_rate
This should come after merging #33522
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This PR extends the statistics of contract code read by adding these
fields:
- **CacheHitBytes**: the total number of bytes served by cache
- **CacheMissBytes**: the total number of bytes read on cache miss
- **CodeReadBytes**: the total number of bytes for contract code read
### Description
Add a new `OnStateUpdate` hook which gets invoked after state is
committed.
### Rationale
For our particular use case, we need to obtain the state size metrics at
every single block when fuly syncing from genesis. With the current
state sizer, whenever the node is stopped, the background process must
be freshly initialized. During this re-initialization, it can skip some
blocks while the node continues executing blocks, causing gaps in the
recorded metrics.
Using this state update hook allows us to customize our own data
persistence logic, and we would never skip blocks upon node restart.
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Fix#33390
`setHeadBeyondRoot` was failing to invalidate finalized blocks because
it compared against the original head instead of the rewound root. This
fix updates the comparison to use the post-rewind block number,
preventing the node from reporting a finalized block that no longer
exists. Also added relevant test cases for it.
## Description
This PR fixes incorrect contract code state metrics by ensuring
duplicate codes are not counted towards the reported results.
## Rationale
The contract code metrics don't consider database deduplication. The
current implementation assumes that the results are only **slightly
inaccurate**, but this is not true, especially for data collection
efforts that started from the genesis block.
This PR introduces a new debug feature, logging the slow blocks with
detailed performance statistics, such as state read, EVM execution and
so on.
Notably, the detailed performance statistics of slow blocks won't be
logged during the sync to not overwhelm users. Specifically, the statistics
are only logged if there is a single block processed.
Example output
```
########## SLOW BLOCK #########
Block: 23537063 (0xa7f878611c2dd27f245fc41107d12ebcf06b4e289f1d6acf44d49a169554ee09) txs: 248, mgasps: 202.99
EVM execution: 63.295ms
Validation: 1.130ms
Account read: 6.634ms(648)
Storage read: 17.391ms(1434)
State hash: 6.722ms
DB commit: 3.260ms
Block write: 1.954ms
Total: 99.094ms
State read cache: account (hit: 622, miss: 26), storage (hit: 1325, miss: 109)
##############################
```
- Introduce a new subscription kind `transactionReceipts` to allow clients to
receive transaction receipts over WebSocket as soon as they are available.
- Accept optional `transactionHashes` filter to subscribe to receipts for specific
transactions; an empty or omitted filter subscribes to all receipts.
- Preserve the same receipt format as returned by `eth_getTransactionReceipt`.
- Avoid additional HTTP polling, reducing RPC load and latency.
---------
Co-authored-by: Sina Mahmoodi <itz.s1na@gmail.com>
This pr implements https://github.com/ethereum/go-ethereum/issues/32733
to make StateProcessor more customisable.
## Compatibility notes
This introduces a breaking change to users using geth EVM as a library.
The `NewStateProcessor` function now takes one parameter which has the
chainConfig embedded instead of 2 parameters.
When I implemented in #31340 I didn't expect multiple forks to be
configured at once, but this is exactly how BPOs are defined. This
updates the method to determine the next scheduled fork rather than the
last fork.
The format that is currently reported by the chain isn't very useful, as
it gives an average for ALL the nodes, and not only the leaves, which
skews the results.
Also, until now there was no way to activate the reporting of errors.
We also decided that metrics weren't the right tool to report this data,
so we decided to dump it to the console if the flag is enabled. A better
system should be built, but for now, printing to the logs does the job.
This PR adds a new RPC call, which re-executes a block with stateless
mode activated, so that the witness data are collected and returned.
They are `debug_executionWitnessByHash` which takes in a block hash
and `debug_executionWitness` which takes in a block number.
---------
Signed-off-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>
Add state size tracking and retrieve api, start geth with `--state.size-tracking`,
the initial bootstrap is required (around 1h on mainnet), after the bootstrap,
use `debug_stateSize()` RPC to retrieve the state size:
```
> debug.stateSize()
{
accountBytes: "0x39681967b",
accountTrienodeBytes: "0xc57939f0c",
accountTrienodes: "0x198b36ac",
accounts: "0x129da14a",
blockNumber: "0x1635e90",
contractCodeBytes: "0x2b63ef481",
contractCodes: "0x1c7b45",
stateRoot: "0x9c36a3ec3745d72eea8700bd27b90dcaa66de0494b187c5600750044151e620a",
storageBytes: "0x18a6e7d3f1",
storageTrienodeBytes: "0x2e7f53fae6",
storageTrienodes: "0x6e49a234",
storages: "0x517859c5"
}
```
---------
Signed-off-by: jsvisa <delweng@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This PR is the first step in the trienode history series.
It introduces the `nodeWithOrigin` struct in the path database, which tracks
the original values of dirty nodes to support trienode history construction.
Note, the original value is always empty in this PR, so it won't break the
existing journal for encoding and decoding. The compatibility of journal
should be handled in the following PR.
Introduce file-based state journal in path database, fixing
the Pebble restriction when the journal size exceeds 4GB.
---------
Signed-off-by: jsvisa <delweng@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
If Geth is engaged in a long-run block synchronization, such as a full
syncing over a large number of blocks, invoking `debug_setHead` will
cause `downloader.Cancel` to wait for all fetchers to stop first.
This can be time-consuming, particularly for the block processing
thread.
To address this, we manually call `blockchain.StopInsert` to interrupt
the blocking processing thread and allow it to exit immediately, and
after that call `blockchain.ResumeInsert` to resume the block
downloading process.
Additionally, we add a sanity check for the input block number of
`debug_setHead` to ensure its validity.
---------
Signed-off-by: jsvisa <delweng@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>