This PR should reduce overall allocations of a running node by ~10
percent. Since most allocations are coming from the re-heaping of the
transaction pool.
```
(pprof) list EffectiveGasTipCmp
Total: 38197204475
ROUTINE ======================== github.com/ethereum/go-ethereum/core/types.(*Transaction).EffectiveGasTipCmp in github.com/ethereum/go-ethereum/core/types/transaction.go
0 3766837369 (flat, cum) 9.86% of Total
. . 386:func (tx *Transaction) EffectiveGasTipCmp(other *Transaction, baseFee *big.Int) int {
. . 387: if baseFee == nil {
. . 388: return tx.GasTipCapCmp(other)
. . 389: }
. . 390: // Use more efficient internal method.
. . 391: txTip, otherTip := new(big.Int), new(big.Int)
. 1796172553 392: tx.calcEffectiveGasTip(txTip, baseFee)
. 1970664816 393: other.calcEffectiveGasTip(otherTip, baseFee)
. . 394: return txTip.Cmp(otherTip)
. . 395:}
. . 396:
. . 397:// EffectiveGasTipIntCmp compares the effective gasTipCap of a transaction to the given gasTipCap.
. . 398:func (tx *Transaction) EffectiveGasTipIntCmp(other *big.Int, baseFee *big.Int) int {
```
This PR reduces the allocations for comparing two transactions from 2 to
0:
```
goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/core/types
cpu: Intel(R) Core(TM) Ultra 7 155U
│ /tmp/old.txt │ /tmp/new.txt │
│ sec/op │ sec/op vs base │
EffectiveGasTipCmp/Original-14 64.67n ± 2% 25.13n ± 9% -61.13% (p=0.000 n=10)
│ /tmp/old.txt │ /tmp/new.txt │
│ B/op │ B/op vs base │
EffectiveGasTipCmp/Original-14 16.00 ± 0% 0.00 ± 0% -100.00% (p=0.000 n=10)
│ /tmp/old.txt │ /tmp/new.txt │
│ allocs/op │ allocs/op vs base │
EffectiveGasTipCmp/Original-14 2.000 ± 0% 0.000 ± 0% -100.00% (p=0.000 n=10)
```
It also speeds up the process by ~60%
There are two minor caveats with this PR:
- We change the API for `EffectiveGasTipCmp` and `EffectiveGasTipIntCmp`
(which are probably not used by much)
- We slightly change the behavior of `tx.EffectiveGasTip` when it
returns an error. It would previously return a negative number on error,
now it does not (since uint256 does not allow for negative numbers)
---------
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Co-authored-by: Csaba Kiraly <csaba.kiraly@gmail.com>
This pull introduces a `Prefetch` operation in the trie to prefetch trie
nodes in parallel. It is used by the `triePrefetcher` to accelerate state
loading and improve overall chain processing performance.
Continuation of https://github.com/ethereum/go-ethereum/issues/32022
tablewriter assumes unix or windows, which may not be the case for
embedded targets.
For v0.0.5 of tablewriter, it is noted in table.go: "The protocols were
written in pure Go and works on windows and unix systems"
---------
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
The order of the checks was wrong which would have allowed a call to
modexp with `baseLen == 0 && modLen == 0` post fusaka.
Also handles an edge case where base/mod/exp length >= 2**64
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
This add some of the changes that were missing from #31634. It
introduces the `TransitionTrie`, which is a façade pattern between the
current MPT trie and the overlay tree.
---------
Signed-off-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
These changes made in the PR should be highlighted here
The trie tracer is split into two distinct structs: opTracer and prevalueTracer.
The former is specific to MPT, while the latter is generic and applicable to all
trie implementations.
The original values of dirty nodes are tracked in a NodeSet. This serves
as the foundation for both full archive node implementations and the state live
tracer.
- If all the `vhashes` are in the same `sidecar`, then it will load the
same blob tx many times. This PR aims to upgrade this.
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This is the first part of #31532
It maintains a series of conversion maker which are to be updated by the
conversion code (in a follow-up PR, this is a breakdown of a larger PR
to make things easier to review). They can be used in this way:
- During the conversion, by storing the conversion markers when the
block has been processed. This is meant to be written in a function that
isn't currently present, hence [this
TODO](https://github.com/ethereum/go-ethereum/pull/31634/files#diff-89272f61e115723833d498a0acbe59fa2286e3dc7276a676a7f7816f21e248b7R384).
Part of https://github.com/ethereum/go-ethereum/issues/31583
---------
Signed-off-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This adds a method on vm.EVM to set the jumpdest cache implementation.
It can be used to maintain an analysis cache across VM invocations, to improve
performance by skipping the analysis for already known contracts.
---------
Co-authored-by: lmittmann <lmittmann@users.noreply.github.com>
Co-authored-by: Felix Lange <fjl@twurst.com>
[EIP-7594](https://eips.ethereum.org/EIPS/eip-7594) defines a limit of
max 6 blobs per transaction. We need to enforce this limit during block
processing.
> Additionally, a limit of 6 blobs per transaction is introduced.
Clients MUST enforce this limit when validating blob transactions at
submission time, when received from the network, and during block
production and processing.
The main purpose of this change is to enforce the version setting when
constructing the blobSidecar, avoiding creating sidecar with wrong/default
version tag.
This is something interesting I came across during my benchmarks, we
spent ~3.8% of all allocations allocating the header number on the heap.
```
(pprof) list GetHeaderByHash
Total: 38197204475
ROUTINE ======================== github.com/ethereum/go-ethereum/core.(*BlockChain).GetHeaderByHash in github.com/ethereum/go-ethereum/core/blockchain_reader.go
0 5786566117 (flat, cum) 15.15% of Total
. . 79:func (bc *BlockChain) GetHeaderByHash(hash common.Hash) *types.Header {
. 5786566117 80: return bc.hc.GetHeaderByHash(hash)
. . 81:}
. . 82:
. . 83:// GetHeaderByNumber retrieves a block header from the database by number,
. . 84:// caching it (associated with its hash) if found.
. . 85:func (bc *BlockChain) GetHeaderByNumber(number uint64) *types.Header {
ROUTINE ======================== github.com/ethereum/go-ethereum/core.(*HeaderChain).GetHeaderByHash in github.com/ethereum/go-ethereum/core/headerchain.go
0 5786566117 (flat, cum) 15.15% of Total
. . 404:func (hc *HeaderChain) GetHeaderByHash(hash common.Hash) *types.Header {
. 1471264309 405: number := hc.GetBlockNumber(hash)
. . 406: if number == nil {
. . 407: return nil
. . 408: }
. 4315301808 409: return hc.GetHeader(hash, *number)
. . 410:}
. . 411:
. . 412:// HasHeader checks if a block header is present in the database or not.
. . 413:// In theory, if header is present in the database, all relative components
. . 414:// like td and hash->number should be present too.
(pprof) list GetBlockNumber
Total: 38197204475
ROUTINE ======================== github.com/ethereum/go-ethereum/core.(*HeaderChain).GetBlockNumber in github.com/ethereum/go-ethereum/core/headerchain.go
94438817 1471264309 (flat, cum) 3.85% of Total
. . 100:func (hc *HeaderChain) GetBlockNumber(hash common.Hash) *uint64 {
94438817 94438817 101: if cached, ok := hc.numberCache.Get(hash); ok {
. . 102: return &cached
. . 103: }
. 1376270828 104: number := rawdb.ReadHeaderNumber(hc.chainDb, hash)
. . 105: if number != nil {
. 554664 106: hc.numberCache.Add(hash, *number)
. . 107: }
. . 108: return number
. . 109:}
. . 110:
. . 111:type headerWriteResult struct {
(pprof) list ReadHeaderNumber
Total: 38197204475
ROUTINE ======================== github.com/ethereum/go-ethereum/core/rawdb.ReadHeaderNumber in github.com/ethereum/go-ethereum/core/rawdb/accessors_chain.go
204606513 1376270828 (flat, cum) 3.60% of Total
. . 146:func ReadHeaderNumber(db ethdb.KeyValueReader, hash common.Hash) *uint64 {
109577863 1281242178 147: data, _ := db.Get(headerNumberKey(hash))
. . 148: if len(data) != 8 {
. . 149: return nil
. . 150: }
95028650 95028650 151: number := binary.BigEndian.Uint64(data)
. . 152: return &number
. . 153:}
. . 154:
. . 155:// WriteHeaderNumber stores the hash->number mapping.
. . 156:func WriteHeaderNumber(db ethdb.KeyValueWriter, hash common.Hash, number uint64) {
```
Opening this to discuss the idea, I know that rawdb.EmptyNumber is not a
great name for the variable, open to suggestions
This pull request slightly improves the freezer fsync mechanism by scheduling
the Sync operation based on the number of uncommitted items and original
time interval.
Originally, freezer.Sync was triggered every 30 seconds, which worked well during
active chain synchronization. However, once the initial state sync is complete,
the fixed interval causes Sync to be scheduled too frequently.
To address this, the scheduling logic has been improved to consider both the time
interval and the number of uncommitted items. This additional condition helps
avoid unnecessary Sync operations when the chain is idle.
Introduce file-based state journal in path database, fixing
the Pebble restriction when the journal size exceeds 4GB.
---------
Signed-off-by: jsvisa <delweng@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This adds the SSZ types from the
[EIP-7928](https://eips.ethereum.org/EIPS/eip-7928) and also adds
encoder/decoder generation using https://github.com/ferranbt/fastssz.
The fastssz dependency is updated because the generation will not work
properly with the master branch version due to a bug in fastssz.
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This PR adds a block validation check for the maximum block size, as required by
EIP-7934, and also applies a slightly lower size limit during block building.
---------
Co-authored-by: spencer-tb <spencer@spencertaylorbrown.uk>
Co-authored-by: Felix Lange <fjl@twurst.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Improves the SSTORE gas calculation a bit. Previously we would pull up
the state object twice. This is okay for existing objects, since they
are cached, however non-existing objects are not cached, thus we needed
to go through all 128 diff layers as well as the disk layer twice, just
for the gas calculation
```
goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/core/vm
cpu: AMD Ryzen 9 5900X 12-Core Processor
│ /tmp/old.txt │ /tmp/new.txt │
│ sec/op │ sec/op vs base │
Interpreter-24 1118.0n ± 2% 602.8n ± 1% -46.09% (p=0.000 n=10)
```
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This pull request refines the filtermap implementation, defining key
APIs for map and
epoch calculations to improve readability.
This pull request doesn't change any logic, it's a pure cleanup.
---------
Co-authored-by: zsfelfoldi <zsfelfoldi@gmail.com>
## Summary
This PR resolves Issue #31929 by reducing log noise generated by the log
indexer after `debug_setHead` operations.
## Problem Description
When `debug_setHead` is called to rewind the blockchain, blocks are
removed from the database. However, the log indexer's `ChainView`
objects may still hold references to these deleted blocks. When
`extendNonCanonical()` attempts to access these missing headers, it
results in:
1. **Repeated ERROR logs**: `Header not found number=X hash=0x...`
2. **Log noise** that can mask other important errors
3. **User confusion** about whether this indicates a real problem
## Root Cause Analysis
The issue occurs because:
- `debug_setHead` removes blocks from the blockchain database
- Log indexer's `ChainView` may still reference deleted block hashes
- `extendNonCanonical()` in `core/filtermaps/chain_view.go` tries to
fetch these missing headers
- The existing `return false` logic properly handles the error, but logs
at ERROR level
## Solution
This is a **logging improvement only** - no functional logic changes:
### Changes Made
1. **Log level**: Changed from `ERROR` to `DEBUG`
2. **Log message**: Enhanced with descriptive context about chain view
extension
3. **Comments**: Added explanation for when this situation occurs
4. **Behavior**: Maintains existing error handling (`return false` was
already present)
### Code Changes
```go
// Before
log.Error("Header not found", "number", number, "hash", hash)
return false
// After
// Header not found - this can happen after debug_setHead operations
// where blocks have been deleted. Return false to indicate the chain view
// is no longer valid rather than logging repeated errors.
log.Debug("Header not found during chain view extension", "number", number, "hash", hash)
return false
```
## Testing
### Automated Tests
- ✅ All existing filtermaps tests pass: `go test ./core/filtermaps -v`
- ✅ No regressions in related functionality
### Manual Verification
1. **Before fix**: Started geth in dev mode, generated blocks, called
`debug_setHead(3)` → **5 repeated ERROR logs**
2. **After fix**: Same scenario → **4 DEBUG logs, no ERROR noise**
### Test Environment
```bash
# Setup test environment
rm -rf ./dev-test-data
./build/bin/geth --dev --datadir ./dev-test-data --http --http.api debug,eth,net,web3 --verbosity 4
# Generate test blocks and trigger issue
curl -X POST -H "Content-Type: application/json" --data '{"jsonrpc":"2.0","method":"debug_setHead","params":["0x3"],"id":1}' http://localhost:8545
```
## Related Issues
- Fixes#31929
## Additional Context
This issue was reported as spurious error messages appearing after
`debug_setHead` operations. The investigation revealed that while the
error handling was functionally correct, the ERROR log level was
inappropriate for this expected scenario in development/debugging
workflows.
The fix maintains full compatibility while significantly improving the
debugging experience for developers using `debug_setHead`.
---------
Co-authored-by: Sun Tae, Kim <38067691+humblefirm@users.noreply.github.com>
Co-authored-by: zsfelfoldi <zsfelfoldi@gmail.com>
This pull request tracks the state indexing progress in eth_syncing
RPC response, i.e. we will return non-null syncing status until indexing
has finished.
Previously, the account trie for a given state root was resolved immediately
when the stateDB was created, implying that the trie was always required
by the stateDB.
However, this assumption no longer holds, especially for path archive nodes,
where historical states can be accessed even if the corresponding trie data
does not exist.
If Geth is engaged in a long-run block synchronization, such as a full
syncing over a large number of blocks, invoking `debug_setHead` will
cause `downloader.Cancel` to wait for all fetchers to stop first.
This can be time-consuming, particularly for the block processing
thread.
To address this, we manually call `blockchain.StopInsert` to interrupt
the blocking processing thread and allow it to exit immediately, and
after that call `blockchain.ResumeInsert` to resume the block
downloading process.
Additionally, we add a sanity check for the input block number of
`debug_setHead` to ensure its validity.
---------
Signed-off-by: jsvisa <delweng@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Previously, PathDB used a single buffer to aggregate database writes,
which needed to be flushed atomically. However, flushing large amounts
of data (e.g., 256MB) caused significant overhead, often blocking the
system for around 3 seconds during the flush.
To mitigate this overhead and reduce performance spikes, a double-buffer
mechanism is introduced. When the active buffer fills up, it is marked
as frozen and a background flushing process is triggered. Meanwhile, a
new buffer is allocated for incoming writes, allowing operations to
continue uninterrupted.
This approach reduces system blocking times and provides flexibility in
adjusting buffer parameters for improved performance.
This pull request introduces a mechanism to expose statistics from the
state reader, specifically related to cache utilization during state prefetching.
To improve state access performance, a pair of state readers is constructed
with a shared local cache. One reader to execute transactions ahead of time
to warm up the cache. The other reader is used by the actual chain processing
logic, which can benefit from the prefetched states.
This PR adds visibility into how effective the cache is by exposing relevant
usage statistics.
---------
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Co-authored-by: Csaba Kiraly <csaba.kiraly@gmail.com>
In this pull request, the original `CacheConfig` has been renamed to `BlockChainConfig`.
Over time, more fields have been added to `CacheConfig` to support
blockchain configuration. Such as `ChainHistoryMode`, which clearly extends
beyond just caching concerns.
Additionally, adding new parameters to the blockchain constructor has
become increasingly complicated, since it’s initialized across multiple
places in the codebase. A natural solution is to consolidate these arguments
into a dedicated configuration struct.
As a result, the existing `CacheConfig` has been redefined as `BlockChainConfig`.
Some parameters, such as `VmConfig`, `TxLookupLimit`, and `ChainOverrides`
have been moved into `BlockChainConfig`. Besides, a few fields in `BlockChainConfig`
were renamed, specifically:
- `TrieCleanNoPrefetch` -> `NoPrefetch`
- `TrieDirtyDisabled` -> `ArchiveMode`
Notably, this change won't affect the command line flags or the toml
configuration file. It's just an internal refactoring and fully backward-compatible.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
Since we have the effective gas price in the message, we can compute tip by
simply subtracting the basefee. No need to recompute the effective price.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
Reading a single transaction out of a block shouldn't need decoding the
entire body
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
As https://github.com/ethereum/go-ethereum/pull/31769 defined a global
hash pool, so we can reuse it, and also remove the unnecessary
KeccakState buffering
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
When `GetKey` is called, a missing preimage can cause the function to return a `nil`
key. This, in turn, makes `account.Storage` persist an incorrect value.
This is a followup to #31753.
A cumulative counter is more useful when we need to measure / aggregate
the metric over a longer period of time. It also means we won't miss data,
e.g. our prometheus scrapes every 30 seconds, and so may miss a transient
spike in the pre-aggregated mgas/s.
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
With EOF removed from the Osaka fork, and no longer being tested, the
implementation will now just be bitrotting. I'm opting to remove it so
it doesn't get in the way of other changes.
This PR changes the database access of the base part of filter rows that
are stored in groups of 32 adjacent maps for improved database storage
size and data access efficiency.
Before this grouped storage was introduced, filter rows were not cached
because the access pattern of either the index rendering or the search
does not really benefit from caching. Also no mutex was necessary for
filter row access. Storing adjacent rows in groups complicated the
situation as a search typically required reading all or most of adjacent
rows of a group, so in order to implement the single row read operation
without having to read the entire group up to 32 times, a cache for the
base row groups was added. This also introduced data race issues for
concurrenct read/write in the same group which was avoided by locking
the `indexLock` mutex. Unfortunately this also led to slowed down or
temporarily blocked search operations when indexing was in progress.
This PR returns to the original concept of uncached, no-mutex filter map
access by increasing read efficiency in a better way; similiarly to
write operations that already operate on groups of filter maps, now
`getFilterMapRow` is also replaced by `getFilterMapRows` that accepts a
single `rowIndex` and a list of `mapIndices`. It slightly complicates
`singleMatcherInstance.getMatchesForLayer` which now has to collect
groups of map indices accessed in the same row, but in exchange it
guarantees maximum read efficiency while avoiding read/write mutex
interference.
Note: a follow-up refactoring is WIP that further changes the database
access scheme by prodiving an immutable index view to the matcher, makes
the whole indexer more straightforward with no callbacks, and entirely
removes the concept of matcher syncing with `validBlocks` and the
resulting multiple retry logic in `eth/filters/filter.go`. This might
take a bit longer to finish though and in the meantime this change could
hopefully already solve the blocked request issues.
This implements a backing store for chain history based on era1 files.
The new store is integrated with the freezer. Queries for blocks and receipts
below the current freezer tail are handled by the era store.
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Co-authored-by: Felix Lange <fjl@twurst.com>
Co-authored-by: lightclient <lightclient@protonmail.com>
Some tests involving transactions near the txMaxSize limit were flaky.
This was due to ECDSA signatures occasionally having leading zeros,
which are omitted during RLP encoding — making the final transaction
size 1 byte smaller than expected.
To address this, a new helper function pricedDataTransactionWithFixedSignature
was added. It ensures both r and s are exactly 32 bytes (i.e., no leading zeros),
producing transactions with deterministic size.
This PR implements eth/69. This protocol version drops the bloom filter
from receipts messages, reducing the amount of data needed for a sync
by ~530GB (2.3B txs * 256 byte) uncompressed. Compressed this will
be reduced to ~100GB
The new version also changes the Status message and introduces the
BlockRangeUpdate message to relay information about the available history
range.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
In this pull request, snapshot generation in pathdb has been ported from
the legacy state snapshot implementation. Additionally, when running in
path mode, legacy state snapshot data is now managed by the pathdb
based snapshot logic.
Note: Existing snapshot data will be re-generated, regardless of whether
it was previously fully constructed.
Adding values to the witness introduces a new class of issues for
computing gas: if there is not enough gas to cover adding an item to the
witness, then the item should not be added to the witness.
The problem happens when several items are added together, and that
process runs out of gas. The witness gas computation needs a way to
signal that not enough gas was provided. These values can not be
hardcoded, however, as they are context dependent, i.e. two calls to the
same function with the same parameters can give two different results.
The approach is to return both the gas that was actually consumed, and
the gas that was necessary. If the values don't match, then a witness
update OOG'd. The caller should then charge the `consumed` value
(remaining gas will be 0) and error out.
Why not return a boolean instead of the wanted value? Because when
several items are touched, we want to distinguish which item lacked gas.
---------
Signed-off-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>
This adds a metric called `chain/mgasps`, which records how many million
gas per second are being used during block insertion.
The value is calculated as `usedGas * 1000 / elapsed`, and it's updated
in the `insertStats.report` method. Also cleaned up the log output to
reuse the same value instead of recalculating it.
Useful for monitoring block processing throughput.
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This PR introduces an allocation-free version of the
Transaction.EffectiveGasTip method to improve performance by reducing
memory allocations.
## Changes
- Added a new `EffectiveGasTipInto` method that accepts a destination
parameter to avoid memory allocations
- Refactored the existing `EffectiveGasTip` method to use the new
allocation-free implementation
- Updated related methods (`EffectiveGasTipValue`, `EffectiveGasTipCmp`,
`EffectiveGasTipIntCmp`) to use the allocation-free approach
- Added tests and benchmarks to verify correctness and measure
performance improvements
## Motivation
In high-transaction-volume environments, the `EffectiveGasTip` method is
called frequently. Reducing memory allocations in this method decreases
garbage collection pressure and improves overall system performance.
## Benchmark Results
As-Is
BenchmarkEffectiveGasTip/Original-10 42089140 27.45 ns/op 8 B/op 1
allocs/op
To-Be
BenchmarkEffectiveGasTip/IntoMethod-10 72353263 16.73 ns/op 0 B/op 0
allocs/op
## Summary of Improvements
- **Performance**: ~39% faster execution (27.45 ns/op → 16.73 ns/op)
- **Memory**: Eliminated all allocations (8 B/op → 0 B/op)
- **Allocation count**: Reduced from 1 to 0 allocations per operation
This optimization follows the same pattern successfully applied to other
methods in the codebase, maintaining API compatibility while improving
performance.
## Safety & Compatibility
This optimization has no side effects or adverse impacts because:
- It maintains functional equivalence as confirmed by comprehensive
tests
- It preserves API compatibility with existing callers
- It follows clear memory ownership patterns with the destination
parameter
- It maintains thread safety by only modifying the caller-provided
destination parameter
This optimization follows the same pattern successfully applied to other
methods in the codebase, providing better performance without
compromising stability or correctness.
---------
Co-authored-by: lightclient <lightclient@protonmail.com>
This PR creates a global hasher pool that can be used by all packages.
It also removes a bunch of the package local pools.
It also updates a few locations to use available hashers or the global
hashing pool to reduce allocations all over the codebase.
This change should reduce global allocation count by ~1%
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This pull request enhances the block prefetcher by executing transactions
in parallel to warm the cache alongside the main block processor.
Unlike the original prefetcher, which only executes the next block and
is limited to chain syncing, the new implementation can be applied to any
block. This makes it useful not only during chain sync but also for regular
block insertion after the initial sync.
---------
Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de>
This PR fixes an issue that could lead to data corruption.
Writing the state history may fail due to insufficient disk space or
other potential errors. With this change, the entire state insertion
will be aborted instead of silently ignoring the error.
Without this fix, state transitions would continue while the associated
state history is lost. After a restart, the resulting gap would be detected,
making recovery impossible.
This pull request introduces a SyncKeyValue function to the
ethdb.KeyValueStore
interface, providing the ability to forcibly flush all previous writes
to disk.
This functionality is critical for go-ethereum, which internally uses
two independent
database engines: a key-value store (such as Pebble, LevelDB, or
memoryDB for
testing) and a flat-file–based freezer. To ensure write-order
consistency between
these engines, the key-value store must be explicitly synced before
writing to the
freezer and vice versa.
Fixes
- https://github.com/ethereum/go-ethereum/issues/31405
- https://github.com/ethereum/go-ethereum/issues/29819
updates the log entries in `core/filtermaps/indexer.go` to remove double
quotes around keys like "first block" and "last block", changing them to
`firstblock` and `lastblock`. This brings them in line with the general
logging style used across the codebase, where log keys are unquoted
single words.
For example, the log:
` INFO [...] "first block"=..., "last block"=...`
Is now rendered as:
` INFO [...] firstblock=..., lastblock=...`
This change improves readability and maintains consistency with logs
such as:
` INFO [...] number=2 sealhash=... uncles=0 txs=0 ...`
No functional behavior is changed — this is purely a formatting cleanup
for better developer experience.
Fixes https://github.com/ethereum/go-ethereum/issues/31732.
This logic was removed in the recent refactoring in the txindexer to
handle history cutoff (#31393). It was first introduced in this PR:
https://github.com/ethereum/go-ethereum/pull/28908.
I have tested it and it works as an alternative to #31745.
This PR packs 3 changes to the flow of fetching txs from the API:
- It caches the indexer tail after each run is over to avoid hitting the
db all the time as was done originally in #28908.
- Changes `backend.GetTransaction`. It doesn't return an error anymore
when tx indexer is in progress. It shifts the responsibility to the
caller to check the progress. The reason is that in most cases we anyway
check the txpool for the tx. If it was indeed a pending tx we can avoid
the indexer progress check.
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This PR fixes an initialization bug that in some cases caused the map
renderer to leave the last, partially rendered map as is and resume
rendering from the next map. At initialization we check whether the
existing rendered maps are consistent with the current chain view and
revert them if necessary. Until now this happened through an ugly hacky
solution, a "limited" chain view that was supposed to trigger a rollback
of some maps in the renderer logic if necessary. This whole setup worked
under assumptions that just weren't true any more. As a result it always
tried to revert the last map but also it did not shorten the indexed
range, only set `headIndexed` to false which indicated to the renderer
logic that the last map is fully populated (which it wasn't).
Now an explicit rollback of any unusable (reorged) maps happens at
startup, which also means that no hacky chain view is necessary, as soon
as the new `FilterMaps` is returned, the indexed range and view are
consistent with each other.
In the first commit an extra check is also added to `writeFinishedMaps`
so that if there is ever again a bug that would result in a gapped index
then it will not break the db with writing the incomplete data. Instead
it will return an indexing error which causes the indexer to revert to
unindexed mode and print an error log instantly. Hopefully this will not
ever happen in the future, but in order to test this safeguard check I
manually triggered the bug with only the first commit enabled, which
caused an indexing error as expected. With the second commit added (the
actual fix) the same operation succeeded without any issues.
Note that the database version is also bumped in this PR in order to
enforce a full reindexing as any existing database might be potentially
broken.
Fixes https://github.com/ethereum/go-ethereum/issues/31729
This PR fixes the out-of-range block number logic of `getBlockLvPointer`
which sometimes caused searches to fail if the head was updated in the
wrong moment. This logic ensures that querying the pointer of a future
block returns the pointer after the last fully indexed block (instead of
failing) and therefore an async range update will not cause the search
to fail. Earier this behaviour only worked when `headIndexed` was true
and `headDelimiter` pointed to the end of the indexed range. Now it also
works for an unfinished index.
This logic is also moved from `FilterMaps.getBlockLvPointer` to
`FilterMapsMatcherBackend.GetBlockLvPointer` because it is only required
by the search anyways. `FilterMaps.getBlockLvPointer` now only returns a
pointer for existing blocks, consistently with how it is used in the
indexer/renderer.
Note that this unhandled case has been present in the code for a long
time but went unnoticed because either one of two previously fixed bugs
did prevent it from being triggered; the incorrectly positive
`tempRange.headIndexed` (fixed in
https://github.com/ethereum/go-ethereum/pull/31680), though caused other
problems, prevented this one from being triggered as with a positive
`headIndexed` no database read was triggered in `getBlockLvPointer`.
Also, the unnecessary `indexLock` in `synced()` (fixed in
https://github.com/ethereum/go-ethereum/pull/31708) usually did prevent
the search seeing the temp range and therefore avoided noticeable
issues.
This changes the filtermaps to only pull up the raw receipts, not the
derived receipts which saves a lot of allocations.
During normal execution this will reduce the allocations of the whole
geth node by ~15%.