Add an RPC flag to limit the block range size of `eth_getLogs` and
`eth_newFilter` requests.
Closes https://github.com/ethereum/go-ethereum/issues/24508
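A minimal sketch of the kind of check such a limit implies. The function name `checkBlockRange`, the error value, and the choice to treat a limit of 0 as "unlimited" are assumptions for illustration, not the actual geth implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// errRangeTooLarge is a hypothetical error value; geth's real error may differ.
var errRangeTooLarge = errors.New("block range exceeds configured limit")

// checkBlockRange rejects a query whose inclusive range [from, to] covers more
// than limit blocks. A limit of 0 means "no limit" (an assumption of this sketch).
func checkBlockRange(from, to, limit uint64) error {
	if from > to {
		return errors.New("invalid block range: from > to")
	}
	// Compare the span (to-from) instead of the block count (to-from+1)
	// so that the full uint64 range cannot overflow.
	if limit != 0 && to-from >= limit {
		return fmt.Errorf("%w (limit %d)", errRangeTooLarge, limit)
	}
	return nil
}

func main() {
	fmt.Println(checkBlockRange(100, 199, 100)) // exactly 100 blocks
	fmt.Println(checkBlockRange(100, 200, 100)) // 101 blocks
}
```

Both the one-shot query path and the filter installation path would call the same check, so the two RPC methods stay consistent.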
---------
Co-authored-by: MariusVanDerWijden <m.vanderwijden@live.de>
### Description
Add a new `OnStateUpdate` hook which gets invoked after state is
committed.
### Rationale
For our particular use case, we need to obtain the state size metrics at
every single block when fully syncing from genesis. With the current
state sizer, whenever the node is stopped, the background process must
be freshly initialized. During this re-initialization, it can skip some
blocks while the node continues executing blocks, causing gaps in the
recorded metrics.
Using this state update hook allows us to customize our own data
persistence logic, and we would never skip blocks upon node restart.
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This improves the error code for cases where invalid query parameters
are submitted to `eth_getLogs`. I also improved the error message that
is emitted when querying into the future.
- Introduce a new subscription kind `transactionReceipts` to allow clients to
receive transaction receipts over WebSocket as soon as they are available.
- Accept optional `transactionHashes` filter to subscribe to receipts for specific
transactions; an empty or omitted filter subscribes to all receipts.
- Preserve the same receipt format as returned by `eth_getTransactionReceipt`.
- Avoid additional HTTP polling, reducing RPC load and latency.
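The filter semantics described in the list above can be sketched as follows. The `receipt` type and `filterReceipts` helper are hypothetical stand-ins; only the "empty or omitted filter matches everything" rule comes from the description.

```go
package main

import "fmt"

// receipt is a simplified stand-in for a transaction receipt.
type receipt struct {
	TxHash string
}

// filterReceipts returns the receipts a subscriber with the given
// transactionHashes filter would receive. A nil or empty filter matches all.
func filterReceipts(all []receipt, txHashes []string) []receipt {
	if len(txHashes) == 0 {
		return all
	}
	wanted := make(map[string]struct{}, len(txHashes))
	for _, h := range txHashes {
		wanted[h] = struct{}{}
	}
	var out []receipt
	for _, r := range all {
		if _, ok := wanted[r.TxHash]; ok {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	all := []receipt{{TxHash: "0x01"}, {TxHash: "0x02"}, {TxHash: "0x03"}}
	fmt.Println(len(filterReceipts(all, nil)))
	fmt.Println(len(filterReceipts(all, []string{"0x02"})))
}
```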
---------
Co-authored-by: Sina Mahmoodi <itz.s1na@gmail.com>
Fixes issue #32793. When the pending tx subscription ends, the filter
is removed from `api.filters`, but it is not terminated. There is no other
way to terminate it, so the subscription will leak, and potentially block
the producer side.
before:
go test -run=^$ -bench=. ./eth/... 827.57s user 23.80s system 361% cpu 3:55.49 total
after:
go test -run=^$ -bench=. ./eth/... 281.62s user 13.62s system 245% cpu 2:00.49 total
Add a CLI-configurable limit on the number of addresses allowed in
`eth_getLogs` filter criteria:
https://github.com/ethereum/go-ethereum/issues/32264
Key changes:
- Added --rpc.getlogmaxaddrs CLI flag (default: 1000) to configure the
maximum number of addresses
- Updated ethconfig.Config with FilterMaxAddresses field for
configuration management
- Modified filter system to use the configurable limit instead of the
hardcoded maxAddresses constant
- Enhanced test coverage with new test cases for address limit
validation
- Removed hardcoded validation from JSON unmarshaling, moving it to
runtime validation
Note that I removed the check in `FilterCriteria.UnmarshalJSON` because
the runtime configuration cannot be passed into that validation.
Please help review this change!
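A rough sketch of why the check has to live at request time rather than in JSON decoding: the decoder has no access to the configured limit, while the handler does. The `filterConfig` type, `validateAddresses` helper, and error text are illustrative assumptions, not the code in this change.

```go
package main

import (
	"errors"
	"fmt"
)

// errExceedMaxAddresses is a hypothetical sentinel used by this sketch.
var errExceedMaxAddresses = errors.New("exceed max addresses")

// filterConfig stands in for the runtime configuration that carries the limit.
type filterConfig struct {
	MaxAddresses int
}

// validateAddresses runs at request time, where the configured limit is known.
func validateAddresses(cfg filterConfig, addresses []string) error {
	if len(addresses) > cfg.MaxAddresses {
		return fmt.Errorf("%w: got %d, limit %d",
			errExceedMaxAddresses, len(addresses), cfg.MaxAddresses)
	}
	return nil
}

func main() {
	cfg := filterConfig{MaxAddresses: 2}
	fmt.Println(validateAddresses(cfg, []string{"0xa", "0xb"}))
	fmt.Println(validateAddresses(cfg, []string{"0xa", "0xb", "0xc"}))
}
```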
---------
Co-authored-by: zsfelfoldi <zsfelfoldi@gmail.com>
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
This introduces an error when the filter has both `blockHash` and
`fromBlock`/`toBlock`, since these are mutually exclusive. Seems the
tests were actually returning `not found` error, which went undetected
since there was no check on the actual returned error in the test.
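The mutual-exclusion rule can be sketched like this. The `filterCriteria` struct, field types, and error message are simplified assumptions; the point is that the check returns a specific error a test can compare against, instead of any error at all.

```go
package main

import (
	"errors"
	"fmt"
)

// errBlockHashWithRange is a hypothetical sentinel for the conflicting case.
var errBlockHashWithRange = errors.New("cannot specify both blockHash and fromBlock/toBlock")

// filterCriteria is a simplified stand-in; nil pointers mean "not set".
type filterCriteria struct {
	BlockHash *string
	FromBlock *int64
	ToBlock   *int64
}

// validateCriteria rejects filters that set blockHash together with a range.
func validateCriteria(c filterCriteria) error {
	if c.BlockHash != nil && (c.FromBlock != nil || c.ToBlock != nil) {
		return errBlockHashWithRange
	}
	return nil
}

func main() {
	hash, from := "0xabc", int64(1)
	fmt.Println(validateCriteria(filterCriteria{BlockHash: &hash}))
	fmt.Println(validateCriteria(filterCriteria{BlockHash: &hash, FromBlock: &from}))
}
```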
This is something interesting I came across during my benchmarks: we
spent ~3.8% of all allocations allocating the header number on the heap.
```
(pprof) list GetHeaderByHash
Total: 38197204475
ROUTINE ======================== github.com/ethereum/go-ethereum/core.(*BlockChain).GetHeaderByHash in github.com/ethereum/go-ethereum/core/blockchain_reader.go
0 5786566117 (flat, cum) 15.15% of Total
. . 79:func (bc *BlockChain) GetHeaderByHash(hash common.Hash) *types.Header {
. 5786566117 80: return bc.hc.GetHeaderByHash(hash)
. . 81:}
. . 82:
. . 83:// GetHeaderByNumber retrieves a block header from the database by number,
. . 84:// caching it (associated with its hash) if found.
. . 85:func (bc *BlockChain) GetHeaderByNumber(number uint64) *types.Header {
ROUTINE ======================== github.com/ethereum/go-ethereum/core.(*HeaderChain).GetHeaderByHash in github.com/ethereum/go-ethereum/core/headerchain.go
0 5786566117 (flat, cum) 15.15% of Total
. . 404:func (hc *HeaderChain) GetHeaderByHash(hash common.Hash) *types.Header {
. 1471264309 405: number := hc.GetBlockNumber(hash)
. . 406: if number == nil {
. . 407: return nil
. . 408: }
. 4315301808 409: return hc.GetHeader(hash, *number)
. . 410:}
. . 411:
. . 412:// HasHeader checks if a block header is present in the database or not.
. . 413:// In theory, if header is present in the database, all relative components
. . 414:// like td and hash->number should be present too.
(pprof) list GetBlockNumber
Total: 38197204475
ROUTINE ======================== github.com/ethereum/go-ethereum/core.(*HeaderChain).GetBlockNumber in github.com/ethereum/go-ethereum/core/headerchain.go
94438817 1471264309 (flat, cum) 3.85% of Total
. . 100:func (hc *HeaderChain) GetBlockNumber(hash common.Hash) *uint64 {
94438817 94438817 101: if cached, ok := hc.numberCache.Get(hash); ok {
. . 102: return &cached
. . 103: }
. 1376270828 104: number := rawdb.ReadHeaderNumber(hc.chainDb, hash)
. . 105: if number != nil {
. 554664 106: hc.numberCache.Add(hash, *number)
. . 107: }
. . 108: return number
. . 109:}
. . 110:
. . 111:type headerWriteResult struct {
(pprof) list ReadHeaderNumber
Total: 38197204475
ROUTINE ======================== github.com/ethereum/go-ethereum/core/rawdb.ReadHeaderNumber in github.com/ethereum/go-ethereum/core/rawdb/accessors_chain.go
204606513 1376270828 (flat, cum) 3.60% of Total
. . 146:func ReadHeaderNumber(db ethdb.KeyValueReader, hash common.Hash) *uint64 {
109577863 1281242178 147: data, _ := db.Get(headerNumberKey(hash))
. . 148: if len(data) != 8 {
. . 149: return nil
. . 150: }
95028650 95028650 151: number := binary.BigEndian.Uint64(data)
. . 152: return &number
. . 153:}
. . 154:
. . 155:// WriteHeaderNumber stores the hash->number mapping.
. . 156:func WriteHeaderNumber(db ethdb.KeyValueWriter, hash common.Hash, number uint64) {
```
Opening this to discuss the idea. I know that `rawdb.EmptyNumber` is not a
great name for the variable; open to suggestions.
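For discussion, here is one possible shape for avoiding the pointer return, using a value plus an `ok` flag. This is only a sketch of the general idea and is not necessarily what this PR does (the PR appears to use a sentinel value instead); `decodeNumberPtr` and `decodeNumber` are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeNumberPtr mirrors the pointer-returning shape seen in the profile:
// taking the address of the local forces it to escape to the heap.
func decodeNumberPtr(data []byte) *uint64 {
	if len(data) != 8 {
		return nil
	}
	number := binary.BigEndian.Uint64(data)
	return &number
}

// decodeNumber returns the value directly, with ok=false standing in for
// the nil pointer, so nothing needs to be heap-allocated.
func decodeNumber(data []byte) (number uint64, ok bool) {
	if len(data) != 8 {
		return 0, false
	}
	return binary.BigEndian.Uint64(data), true
}

func main() {
	data := make([]byte, 8)
	binary.BigEndian.PutUint64(data, 42)

	if p := decodeNumberPtr(data); p != nil {
		fmt.Println("pointer variant:", *p)
	}
	if n, ok := decodeNumber(data); ok {
		fmt.Println("value variant:", n)
	}
}
```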
This pull request refines the filtermap implementation, defining key
APIs for map and epoch calculations to improve readability.
This pull request doesn't change any logic, it's a pure cleanup.
---------
Co-authored-by: zsfelfoldi <zsfelfoldi@gmail.com>
The address filter was never checked against a maximum limit, which can
be somewhat abusive for API nodes. This PR adds a limit similar to
the one for topics.
## Description (AI generated)
This pull request introduces a new validation to enforce a maximum limit
on the number of addresses allowed in filter criteria for Ethereum logs.
It includes updates to the `FilterAPI` and `EventSystem` logic, as well
as corresponding test cases to ensure the new constraint is properly
enforced.
### Core functionality changes:
* **Validation for maximum addresses in filter criteria**:
- Added a new constant, `maxAddresses`, set to 100, to define the
maximum allowable addresses in a filter.
- Introduced a new error, `errExceedMaxAddresses`, to handle cases where
the number of addresses exceeds the limit.
- Updated the `GetLogs` method in `FilterAPI` to validate the number of
addresses against `maxAddresses`.
- Modified the `UnmarshalJSON` method to return an error if the number
of addresses in the input JSON exceeds `maxAddresses`.
- Added similar validation to the `SubscribeLogs` method in
`EventSystem`.
### Test updates:
* **New test cases for address limit validation**:
- Added a test in `TestUnmarshalJSONNewFilterArgs` to verify that
exceeding the maximum number of addresses triggers the
`errExceedMaxAddresses` error.
- Updated `TestInvalidLogFilterCreation` to include a test case for an
invalid filter with more than `maxAddresses` addresses.
- Updated `TestInvalidGetLogsRequest` to test for invalid log requests
with excessive addresses.
These changes ensure that the system enforces a reasonable limit on the
number of addresses in filter criteria, improving robustness and
preventing potential performance issues.
---------
Co-authored-by: zsfelfoldi <zsfelfoldi@gmail.com>
In this pull request, the original `CacheConfig` has been renamed to `BlockChainConfig`.
Over time, more fields have been added to `CacheConfig` to support
blockchain configuration, such as `ChainHistoryMode`, which clearly extends
beyond just caching concerns.
Additionally, adding new parameters to the blockchain constructor has
become increasingly complicated, since it’s initialized across multiple
places in the codebase. A natural solution is to consolidate these arguments
into a dedicated configuration struct.
As a result, the existing `CacheConfig` has been redefined as `BlockChainConfig`.
Some parameters, such as `VmConfig`, `TxLookupLimit`, and `ChainOverrides`,
have been moved into `BlockChainConfig`. In addition, a few fields in `BlockChainConfig`
were renamed, specifically:
- `TrieCleanNoPrefetch` -> `NoPrefetch`
- `TrieDirtyDisabled` -> `ArchiveMode`
Notably, this change won't affect the command line flags or the toml
configuration file. It's just an internal refactoring and fully backward-compatible.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
Fixes an issue where querying logs for block ranges starting from 0 would fail with an irrelevant
error on a pruned node. Now the correct "history is pruned" error will be returned.
This PR implements eth/69. This protocol version drops the bloom filter
from receipts messages, reducing the amount of data needed for a sync
by ~530GB (2.3B txs * 256 bytes) uncompressed. Compressed, this is
reduced to ~100GB.
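As a back-of-the-envelope check of the uncompressed estimate, the product can be computed directly. The transaction count is the rounded figure from the text, so the result only needs to land in the same range as the quoted number; the helper name `bloomSavings` is made up for this sketch.

```go
package main

import "fmt"

// bloomBytes is the size of one receipt bloom filter (2048 bits).
const bloomBytes = 256

// bloomSavings returns the bytes saved by dropping one bloom per receipt.
func bloomSavings(txCount uint64) uint64 {
	return txCount * bloomBytes
}

func main() {
	total := bloomSavings(2_300_000_000) // rounded tx count from the text
	fmt.Printf("%d bytes = %.1f GB = %.1f GiB\n",
		total, float64(total)/1e9, float64(total)/(1<<30))
}
```

With the rounded 2.3B input this comes out somewhat above the quoted ~530GB, which is expected given the rounding and the choice between decimal and binary units.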
The new version also changes the Status message and introduces the
BlockRangeUpdate message to relay information about the available history
range.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
This changes the filtermaps to only pull up the raw receipts, not the
derived receipts, which saves a lot of allocations.
During normal execution this will reduce the allocations of the whole
geth node by ~15%.
This PR changes the chain view update mechanism of the log filter.
Previously the head updates were all wired through the indexer, even in
unindexed mode. This was both a bit weird and also unsafe as the
indexer's chain view was updated asynchronously with some delay, making
some log related tests flaky. Also, the reorg safety of the indexed
search was integrated with unindexed search in a weird way, relying on
`syncRange.ValidBlocks` in the unindexed case too, with a special
condition added to only consider the head of the valid range but not the
tail in the unindexed case.
In this PR the current chain view is directly accessible through the
filter backend and unindexed search is also chain view based, making it
inherently safe. The matcher sync mechanism is now only used for indexed
search as originally intended, removing a few ugly special conditions.
The PR is currently based on top of
https://github.com/ethereum/go-ethereum/pull/31642
Together they fix https://github.com/ethereum/go-ethereum/issues/31518
and replace https://github.com/ethereum/go-ethereum/pull/31542
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
I added the history mode configuration in eth/ethconfig initially, since
it seemed like the logical place. But it turns out we need access to the
intended pruning setting at a deeper level, and it actually needs to be
integrated with the blockchain startup procedure.
With this change applied, if a node previously had its history pruned,
and is subsequently restarted **without** the `--history.chain
postmerge` flag, the `BlockChain` initialization code will now verify
the freezer tail against the known pruning point of the predefined
network and will restore pruning status. Note that this logic is quite
restrictive: we allow a non-zero tail only for known networks, and only
for the specific pruning point that is defined.
Currently, when calculating a block's bloom, we loop through all the
receipt logs to calculate the hash value. However, normally, after going
through applyTransaction, each receipt's bloom is already calculated
from the receipt's logs, so the block's bloom can be calculated by just
ORing the receipts' blooms.
```
goos: darwin
goarch: arm64
pkg: github.com/ethereum/go-ethereum/core/types
cpu: Apple M1 Pro
BenchmarkCreateBloom
BenchmarkCreateBloom/small
BenchmarkCreateBloom/small-10 810922 1481 ns/op 104 B/op 5 allocs/op
BenchmarkCreateBloom/large
BenchmarkCreateBloom/large-10 8173 143764 ns/op 9614 B/op 401 allocs/op
BenchmarkCreateBloom/small-mergebloom
BenchmarkCreateBloom/small-mergebloom-10 5178918 232.0 ns/op 0 B/op 0 allocs/op
BenchmarkCreateBloom/large-mergebloom
BenchmarkCreateBloom/large-mergebloom-10 54110 22207 ns/op 0 B/op 0 allocs/op
```
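The merge approach amounts to a byte-wise OR over fixed-size arrays, which is why it needs no hashing and no allocations. A self-contained sketch, using a local `bloom` type rather than the real `types.Bloom` (the helper names are assumptions):

```go
package main

import "fmt"

// bloom is a local stand-in for a 2048-bit bloom filter.
type bloom [256]byte

// or merges other into b bit by bit.
func (b *bloom) or(other bloom) {
	for i := range b {
		b[i] |= other[i]
	}
}

// mergeBlooms builds a block-level bloom from already computed receipt
// blooms, without re-hashing any log data.
func mergeBlooms(receiptBlooms []bloom) bloom {
	var out bloom
	for _, rb := range receiptBlooms {
		out.or(rb)
	}
	return out
}

func main() {
	var a, b bloom
	a[0] = 0x01
	b[0] = 0x10
	b[255] = 0x80
	merged := mergeBlooms([]bloom{a, b})
	fmt.Printf("%#02x %#02x\n", merged[0], merged[255])
}
```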
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Co-authored-by: Zsolt Felfoldi <zsfelfoldi@gmail.com>
Changelog: https://golangci-lint.run/product/changelog/#1610
Removes `exportloopref` (no longer needed), replaces it with
`copyloopvar` which is basically the opposite.
Also adds:
- `durationcheck`
- `gocheckcompilerdirectives`
- `reassign`
- `mirror`
- `tenv`
---------
Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de>
This PR changes how sidechains are handled.
Before the merge, it was possible to import a chain with lower td and not set it as canonical. After the merge, we expect every chain that we get via InsertChain to be canonical. Non-canonical blocks can still be inserted with InsertBlockWithoutSetHead.
If during the InsertChain, the existing chain is not canonical anymore, we mark it as a sidechain and send the SideChainEvents normally.
This change removes support for subscribing to pending logs.
"Pending logs" were always an odd feature, because it can never be fully reliable. When support for it was added many years ago, the intention was for this to be used by wallet apps to show the 'potential future token balance' of accounts, i.e. as a way of notifying the user of incoming transfers before they were mined. In order to generate the pending logs, the node must pick a subset of all public mempool transactions, execute them in the EVM, and then dispatch the resulting logs to API consumers.
* miner: untangle miner
* miner: use common.hash instead of *types.header
* cmd/geth: deprecate --mine
* eth: get rid of most miner api
* console: get rid of coinbase in welcome message
* miner/stress: get rid of the miner stress test
* eth: get rid of miner.setEtherbase
* ethstats: remove miner and hashrate flags
* cmd: rename pendingBlockProducer to miner.pending.feeRecipient flag
* miner: use pendingFeeRecipient instead of etherbase
* miner: add mutex to protect the pending block
* eth: get rid of etherbase mentions
* miner: no need to lock the coinbase
* eth, miner: fix linter
---------
Co-authored-by: Martin Holst Swende <martin@swende.se>
Co-authored-by: Péter Szilágyi <peterke@gmail.com>
Currently, geth will return `[]` for any log filter with `len(topics) > 4`. The EVM only supports up to four topics per log, via the LOG4 opcode, so filters with more topic positions can never match. This change makes the filter query exit early in those cases.
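A simplified sketch of the early exit, with positional topic matching included so the shortcut can be compared against the full scan. The `logEntry` type and function names are assumptions for illustration; only the four-topic limit and the empty result come from the description.

```go
package main

import "fmt"

// maxTopics reflects the EVM limit: LOG0..LOG4 emit at most four topics.
const maxTopics = 4

// logEntry is a simplified stand-in for a log; only topics are modeled.
type logEntry struct {
	Topics []string
}

// matchTopics reports whether a log satisfies a positional topic filter.
// An empty inner list acts as a wildcard for that position.
func matchTopics(l logEntry, topics [][]string) bool {
	if len(topics) > len(l.Topics) {
		return false
	}
	for i, alternatives := range topics {
		if len(alternatives) == 0 {
			continue
		}
		found := false
		for _, t := range alternatives {
			if l.Topics[i] == t {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	return true
}

// filterLogs returns matching logs, skipping the scan entirely when the
// filter has more topic positions than any log can have.
func filterLogs(logs []logEntry, topics [][]string) []logEntry {
	out := []logEntry{}
	if len(topics) > maxTopics {
		return out // nothing can match; exit early
	}
	for _, l := range logs {
		if matchTopics(l, topics) {
			out = append(out, l)
		}
	}
	return out
}

func main() {
	logs := []logEntry{{Topics: []string{"A", "B"}}, {Topics: []string{"A", "C"}}}
	fmt.Println(len(filterLogs(logs, [][]string{{"A"}, {"B"}})))
	fmt.Println(len(filterLogs(logs, make([][]string, 5))))
}
```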
This change improves GenerateChain to support internal chain history access (ChainReader)
for the consensus engine and EVM.
GenerateChain takes a `parent` block and the number of blocks to create. With my changes,
the consensus engine and EVM can now access blocks from `parent` up to the block currently
being generated. This is required to make the BLOCKHASH instruction work, and also needed
to create real clique chains. Clique uses chain history to figure out if the current signer is in-turn,
for example.
I've also added some more accessors to BlockGen. These are helpful when creating transactions:
- g.Signer returns a signer instance for the current block
- g.Difficulty returns the current block difficulty
- g.Gas returns the remaining gas amount
Another fix in this commit concerns the receipts returned by GenerateChain. The receipts now
have properly derived fields (BlockHash, etc.) and should generally match what would be
returned by the RPC API.