Commit graph

28 commits

Author SHA1 Message Date
Marius van der Wijden
e94123acc2
core/rawdb: reduce allocations in rawdb.ReadHeaderNumber (#31913)
This is something interesting I came across during my benchmarks: we
spent ~3.8% of all allocations allocating the header number on the heap.

```
(pprof) list GetHeaderByHash
Total: 38197204475
ROUTINE ======================== github.com/ethereum/go-ethereum/core.(*BlockChain).GetHeaderByHash in github.com/ethereum/go-ethereum/core/blockchain_reader.go
         0 5786566117 (flat, cum) 15.15% of Total
         .          .     79:func (bc *BlockChain) GetHeaderByHash(hash common.Hash) *types.Header {
         . 5786566117     80: return bc.hc.GetHeaderByHash(hash)
         .          .     81:}
         .          .     82:
         .          .     83:// GetHeaderByNumber retrieves a block header from the database by number,
         .          .     84:// caching it (associated with its hash) if found.
         .          .     85:func (bc *BlockChain) GetHeaderByNumber(number uint64) *types.Header {
ROUTINE ======================== github.com/ethereum/go-ethereum/core.(*HeaderChain).GetHeaderByHash in github.com/ethereum/go-ethereum/core/headerchain.go
         0 5786566117 (flat, cum) 15.15% of Total
         .          .    404:func (hc *HeaderChain) GetHeaderByHash(hash common.Hash) *types.Header {
         . 1471264309    405: number := hc.GetBlockNumber(hash)
         .          .    406: if number == nil {
         .          .    407:  return nil
         .          .    408: }
         . 4315301808    409: return hc.GetHeader(hash, *number)
         .          .    410:}
         .          .    411:
         .          .    412:// HasHeader checks if a block header is present in the database or not.
         .          .    413:// In theory, if header is present in the database, all relative components
         .          .    414:// like td and hash->number should be present too.
(pprof) list GetBlockNumber
Total: 38197204475
ROUTINE ======================== github.com/ethereum/go-ethereum/core.(*HeaderChain).GetBlockNumber in github.com/ethereum/go-ethereum/core/headerchain.go
  94438817 1471264309 (flat, cum)  3.85% of Total
         .          .    100:func (hc *HeaderChain) GetBlockNumber(hash common.Hash) *uint64 {
  94438817   94438817    101: if cached, ok := hc.numberCache.Get(hash); ok {
         .          .    102:  return &cached
         .          .    103: }
         . 1376270828    104: number := rawdb.ReadHeaderNumber(hc.chainDb, hash)
         .          .    105: if number != nil {
         .     554664    106:  hc.numberCache.Add(hash, *number)
         .          .    107: }
         .          .    108: return number
         .          .    109:}
         .          .    110:
         .          .    111:type headerWriteResult struct {
(pprof) list ReadHeaderNumber
Total: 38197204475
ROUTINE ======================== github.com/ethereum/go-ethereum/core/rawdb.ReadHeaderNumber in github.com/ethereum/go-ethereum/core/rawdb/accessors_chain.go
 204606513 1376270828 (flat, cum)  3.60% of Total
         .          .    146:func ReadHeaderNumber(db ethdb.KeyValueReader, hash common.Hash) *uint64 {
 109577863 1281242178    147: data, _ := db.Get(headerNumberKey(hash))
         .          .    148: if len(data) != 8 {
         .          .    149:  return nil
         .          .    150: }
  95028650   95028650    151: number := binary.BigEndian.Uint64(data)
         .          .    152: return &number
         .          .    153:}
         .          .    154:
         .          .    155:// WriteHeaderNumber stores the hash->number mapping.
         .          .    156:func WriteHeaderNumber(db ethdb.KeyValueWriter, hash common.Hash, number uint64) {
```

Opening this to discuss the idea. I know that rawdb.EmptyNumber is not a
great name for the variable; I'm open to suggestions.
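
The allocation pattern in the profile above can be illustrated with a minimal sketch. It is not the code from this commit: `store`, `readNumberPtr` and `readNumber` are hypothetical names, and the value-plus-flag return is only one possible way to avoid the pointer (the message mentions `rawdb.EmptyNumber`, which suggests a sentinel value was proposed instead).

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// store is a hypothetical stand-in for a key-value database.
type store map[string][]byte

// readNumberPtr has the pointer-returning shape seen in the profile.
// Because the address of number is returned, the value typically
// escapes to the heap, costing one allocation per successful call.
func readNumberPtr(db store, key string) *uint64 {
	data := db[key]
	if len(data) != 8 {
		return nil
	}
	number := binary.BigEndian.Uint64(data)
	return &number
}

// readNumber returns the value together with a presence flag, so no
// pointer to a local variable is handed out.
func readNumber(db store, key string) (uint64, bool) {
	data := db[key]
	if len(data) != 8 {
		return 0, false
	}
	return binary.BigEndian.Uint64(data), true
}

func main() {
	encoded := make([]byte, 8)
	binary.BigEndian.PutUint64(encoded, 42)
	db := store{"h": encoded}

	if p := readNumberPtr(db, "h"); p != nil {
		fmt.Println("pointer variant:", *p)
	}
	if n, ok := readNumber(db, "h"); ok {
		fmt.Println("value variant:", n)
	}
	_, ok := readNumber(db, "missing")
	fmt.Println("missing key found:", ok)
}
```

Whether the pointer variant really allocates depends on inlining and escape analysis in the caller, so any gain should be confirmed with a profile like the one above.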
2025-07-15 15:48:36 +02:00
rjl493456442
cbd6ed9e0b
core/filtermaps: define APIs for map, epoch calculation (#31659)
This pull request refines the filtermap implementation, defining key
APIs for map and epoch calculations to improve readability.

This pull request doesn't change any logic; it's a pure cleanup.

---------

Co-authored-by: zsfelfoldi <zsfelfoldi@gmail.com>
2025-07-01 16:31:09 +02:00
Ömer Faruk Irmak
f70aaa8399
ethapi: reduce some of the wasted effort in GetTransactionReceipt (#32021)
Towards https://github.com/ethereum/go-ethereum/issues/26974

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
2025-07-01 15:18:49 +08:00
Ömer Faruk Irmak
4997a248ab
core/rawdb: don't decode the full block body in ReadTransaction (#32027)
Reading a single transaction out of a block shouldn't require decoding the
entire body.

---------

Co-authored-by: Felix Lange <fjl@twurst.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
2025-06-19 10:05:32 +08:00
Zhou
15057e7f7f
core: don't emit the warning of log indexing if the db was not initialized (#31845) 2025-05-19 09:59:35 +08:00
Felföldi Zsolt
ebb3eb29d3
core/filtermaps: fix map renderer reorg issue (#31642)
This PR fixes a bug in the map renderer that sometimes used an obsolete
block log value pointer to initialize the iterator for rendering from a
snapshot. This bug was triggered by chain reorgs and sometimes caused
indexing errors and invalid search results. A few other conditions are
also made safer that were not reported to cause issues yet but could
potentially be unsafe in some corner cases. A new unit test is also
added that reproduced the bug but passes with the new fixes.

Fixes https://github.com/ethereum/go-ethereum/issues/31593
Might also fix https://github.com/ethereum/go-ethereum/issues/31589
though this issue has not been reproduced yet, but it appears to be
related to a log index database corruption around a specific block,
similarly to the other issue.

Note that running this branch resets and regenerates the log index
database. For this purpose a `Version` field has been added to
`rawdb.FilterMapsRange` which will also make this easier in the future
if a breaking database change is needed or the existing one is
considered potentially broken due to a bug, like in this case.
2025-04-16 23:30:13 +02:00
Felföldi Zsolt
14d576c002
core/filtermaps: hashdb safe delete range (#31525)
This PR adds `rawdb.SafeDeleteRange` and uses it for range deletion in
`core/filtermaps`. This includes deleting the old bloombits database,
resetting the log index database and removing index data for unindexed
tail epochs (which previously weren't properly implemented for the
fallback case).
`SafeDeleteRange` either calls `ethdb.DeleteRange` if the node uses the
new path based state scheme or uses an iterator based fallback method
that safely skips trie nodes in the range if the old hash based state
scheme is used. Note that `ethdb.DeleteRange` also has its own iterator
based fallback implementation in `ethdb/leveldb`. If a path based state
scheme is used and the backing db is pebble (as it is on the majority of
new nodes) then `rawdb.SafeDeleteRange` uses the fast native range
delete.
Also note that `rawdb.SafeDeleteRange` has different semantics from
`ethdb.DeleteRange`: it does not automatically return if the operation
takes a long time. Instead it receives a `stopCallback` that can
interrupt the process if necessary. This is because in the safe mode
potentially a lot of entries are iterated without being deleted (this is
definitely the case when deleting the old bloombits database which has a
single byte prefix) and therefore restarting the process every time a
fixed number of entries have been iterated would result in a quadratic
run time in the number of skipped entries.
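
The single-pass, callback-interrupted deletion described above could look roughly like the following sketch. Everything here is an assumption for illustration: `safeDeleteRange`, its `skip` and `stop` parameters and the in-memory map are hypothetical and do not reproduce the signature of `rawdb.SafeDeleteRange`.

```go
package main

import (
	"fmt"
	"sort"
)

// safeDeleteRange deletes every key in [start, end) for which skip
// returns false. It walks the range once and asks stop before each key
// whether to abort; stop is told whether anything was deleted so far.
// It reports true if the whole range was processed.
func safeDeleteRange(db map[string][]byte, start, end string,
	skip func(key string) bool, stop func(deletedAny bool) bool) bool {

	keys := make([]string, 0, len(db))
	for k := range db {
		if k >= start && k < end {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys) // emulate an ordered database iterator

	deletedAny := false
	for _, k := range keys {
		if stop(deletedAny) {
			return false // interrupted by the caller, not by a fixed budget
		}
		if skip(k) {
			continue // e.g. a trie node that happens to fall in the range
		}
		delete(db, k)
		deletedAny = true
	}
	return true
}

func main() {
	db := map[string][]byte{"a1": nil, "a2": nil, "a-trie": nil, "b1": nil}
	isTrieNode := func(k string) bool { return k == "a-trie" }
	never := func(bool) bool { return false }

	done := safeDeleteRange(db, "a", "b", isTrieNode, never)
	fmt.Println("finished:", done, "remaining keys:", len(db))
}
```

Because the loop never restarts from the beginning of the range, skipped entries are visited once per call rather than once per fixed-size batch, which is what avoids the quadratic behaviour mentioned above.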

When running in safe mode, unindexing an epoch takes about a second,
removing bloombits takes around 10s while resetting a full log index
might take a few minutes. If a range delete operation takes a
significant amount of time then log messages are printed. Also, any
range delete operation can be interrupted by shutdown (tail unindexing
can also be interrupted by head indexing, similarly to how tail indexing
works). If the last unindexed epoch might have "dirty" index data left
then the indexed map range points to the first valid epoch and
`cleanedEpochsBefore` points to the previous, potentially dirty one. At
startup it is always assumed that the epoch before the first fully
indexed one might be dirty. New tail maps are never rendered and also no
further maps are unindexed before the previous unindexing is properly
cleaned up.

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Co-authored-by: Felix Lange <fjl@twurst.com>
2025-03-31 14:47:56 +02:00
Felix Lange
fd4049dc1e
core/rawdb: improve database stats output (#31463)
Instead of reporting all filtermaps stuff in one line, I'm breaking it
down into the three separate kinds of entries here.

```
+-----------------------+-----------------------------+------------+------------+
|       DATABASE        |          CATEGORY           |    SIZE    |   ITEMS    |
+-----------------------+-----------------------------+------------+------------+
| Key-Value store       | Log index filter-map rows   | 59.21 GiB  |  616077345 |
| Key-Value store       | Log index last-block-of-map | 12.35 MiB  |     269755 |
| Key-Value store       | Log index block-lv          | 421.70 MiB |   22109169 |
```

Also added some other changes to make it easier to debug:

- restored bloombits into the inspect output, so we notice if it doesn't
get deleted for some reason
- tracking of unaccounted key examples
2025-03-24 10:07:38 +01:00
Sina M
8fe09df54f
cmd/geth: add prune history command (#31384)
This adds a new subcommand 'geth prune-history' that removes the pre-merge history
on supported networks. Geth is not fully ready to work in this mode, so please do not run
this command on your production node.

---------

Co-authored-by: Felix Lange <fjl@twurst.com>
2025-03-21 13:12:56 +01:00
Sina M
1886922264
core: respect history cutoff in txindexer (#31393)
In #31384 we unindex transactions prior to the merge block. However, when the
node starts up, it tries to re-index them if the config is to index the
whole chain. This change makes the indexer aware of the history cutoff block,
avoiding reindexing in that segment.
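
As a rough illustration of the idea, the lowest block to index can be clamped to the cutoff. The function below is a hypothetical sketch, not the indexer's actual code; `indexTail`, `limit` and `cutoff` are assumed names, with `limit == 0` taken to mean "index the whole chain".

```go
package main

import "fmt"

// indexTail returns the lowest block number the indexer should cover.
// limit is the number of most recent blocks to keep indexed (0 means
// the whole chain); cutoff is the first block whose history is still
// available (0 when nothing was pruned).
func indexTail(head, limit, cutoff uint64) uint64 {
	var tail uint64
	if limit != 0 && head >= limit {
		tail = head - limit + 1
	}
	// Never reach below the history cutoff: those blocks are gone.
	if tail < cutoff {
		tail = cutoff
	}
	return tail
}

func main() {
	fmt.Println(indexTail(100, 0, 0))   // whole chain, nothing pruned
	fmt.Println(indexTail(100, 0, 50))  // whole chain, history pruned below 50
	fmt.Println(indexTail(100, 10, 50)) // recent window already above the cutoff
}
```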

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Co-authored-by: Felix Lange <fjl@twurst.com>
2025-03-21 11:29:51 +01:00
Felföldi Zsolt
07cca7ab9f
core/bloombits: remove old bloombits logic and chain indexer (#31081)
This PR is #3 of a 3-part series that implements the new log index
intended to replace core/bloombits.
Based on https://github.com/ethereum/go-ethereum/pull/31079 and
https://github.com/ethereum/go-ethereum/pull/31080
Replaces https://github.com/ethereum/go-ethereum/pull/30370

This part removes the old bloombits package and the chain indexer that
was only used by bloombits. It also deletes the old bloombits database.

FilterMaps data structure explanation:
https://gist.github.com/zsfelfoldi/a60795f9da7ae6422f28c7a34e02a07e

Log index generator code overview:
https://gist.github.com/zsfelfoldi/97105dff0b1a4f5ed557924a24b9b9e7

Search pattern matcher code overview:
https://gist.github.com/zsfelfoldi/5981735641c956afb18065e84f8aff34

Note that the possibility of a tree hashing scheme and remote proof
protocol are mentioned in the documents above but they are not exactly
specified yet. These specs are WIP and will be finalized after the local
log indexer/filter code is finalized and merged.

---------

Co-authored-by: Felix Lange <fjl@twurst.com>
2025-03-21 10:47:58 +01:00
Felföldi Zsolt
d85f796356
eth/filters: implement log filter using new log index (#31080)
This PR is #2 of a 3-part series that implements the new log index
intended to replace core/bloombits.
Based on https://github.com/ethereum/go-ethereum/pull/31079
Replaces https://github.com/ethereum/go-ethereum/pull/30370

This part replaces the old bloombits based log search logic in
`eth/filters` to use the new `core/filtermaps` logic.

FilterMaps data structure explanation:
https://gist.github.com/zsfelfoldi/a60795f9da7ae6422f28c7a34e02a07e

Log index generator code overview:
https://gist.github.com/zsfelfoldi/97105dff0b1a4f5ed557924a24b9b9e7

Search pattern matcher code overview:
https://gist.github.com/zsfelfoldi/5981735641c956afb18065e84f8aff34

Note that the possibility of a tree hashing scheme and remote proof
protocol are mentioned in the documents above but they are not exactly
specified yet. These specs are WIP and will be finalized after the local
log indexer/filter code is finalized and merged.

---------

Co-authored-by: Felix Lange <fjl@twurst.com>
2025-03-17 18:59:04 +01:00
Felföldi Zsolt
f9f1172d59
core/filtermaps: FilterMaps log index generator and search logic (#31079)
This PR is #1 of a 3-part series that implements the new log index
intended to replace core/bloombits.
Replaces https://github.com/ethereum/go-ethereum/pull/30370

This part implements the new data structure, the log index generator and
the search logic. This PR has most of the complexity but it does not
affect any existing code yet so maybe it is easier to review separately.

FilterMaps data structure explanation:
https://gist.github.com/zsfelfoldi/a60795f9da7ae6422f28c7a34e02a07e

Log index generator code overview:
https://gist.github.com/zsfelfoldi/97105dff0b1a4f5ed557924a24b9b9e7

Search pattern matcher code overview:
https://gist.github.com/zsfelfoldi/5981735641c956afb18065e84f8aff34

Note that the possibility of a tree hashing scheme and remote proof
protocol are mentioned in the documents above but they are not exactly
specified yet. These specs are WIP and will be finalized after the local
log indexer/filter code is finalized and merged.

---------

Co-authored-by: Felix Lange <fjl@twurst.com>
2025-03-13 19:04:16 +01:00
Péter Szilágyi
bbc565ab05
core/types, params: add blob transaction type, RLP encoded for now (#27049)
* core/types, params: add blob transaction type, RLP encoded for now

* all: integrate Cancun (and timestamp based forks) into MakeSigner

* core/types: fix 2 back-and-forth type refactors

* core: fix review comment

* core/types: swap blob tx type id to 0x03
2023-04-21 12:52:02 +03:00
Patrick O'Grady
d3e3a460ec
core/rawdb: fix logs to print block number, not address (#23328) 2021-08-04 11:10:37 +03:00
Giuseppe Bertone
0185ee0993
core/rawdb: single point of maintenance for writing and deleting tx lookup indexes (#21480) 2020-09-15 10:37:01 +02:00
gary rong
6eef141aef
les: historical data garbage collection (#19570)
This change introduces garbage collection for the light client. Historical
chain data is deleted periodically. If you want to disable the GC, use
the --light.nopruning flag.
2020-07-13 11:02:54 +02:00
Martin Holst Swende
4535230059
cmd, core, eth: background transaction indexing (#20302)
* cmd, core, eth: init tx lookup in background

* core/rawdb: tiny log fixes to make it clearer what's happening

* core, eth: fix rebase errors

* core/rawdb: make reindexing less generic, but more optimal

* rlp: implement rlp list iterator

* core/rawdb: new implementation of tx indexing/unindex using generic tx iterator and hashing rlp-data

* core/rawdb, cmd/utils: fix review concerns

* cmd/utils: fix merge issue

* core/rawdb: add some log formatting polishes

Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
Co-authored-by: Péter Szilágyi <peterke@gmail.com>
2020-05-11 18:58:43 +03:00
Péter Szilágyi
fc85777a21
core: concurrent database reinit from freezer dump
* core: reinit chain from freezer in batches

* core/rawdb: concurrent database reinit from freezer dump

* core/rawdb: reinit from freezer in sequential order
2019-05-27 15:48:30 +03:00
gary rong
80469bea0c
all: integrate the freezer with fast sync
* all: freezer style syncing

core, eth, les, light: clean up freezer relative APIs

core, eth, les, trie, ethdb, light: clean a bit

core, eth, les, light: add unit tests

core, light: rewrite setHead function

core, eth: fix downloader unit tests

core: add receipt chain insertion test

core: use constant instead of hardcoding table name

core: fix rollback

core: fix setHead

core/rawdb: remove canonical block first and then iterate side chain

core/rawdb, ethdb: add hasAncient interface

eth/downloader: calculate ancient limit via cht first

core, eth, ethdb: lots of fixes

* eth/downloader: print ancient disable log only for fast sync
2019-05-16 10:39:32 +03:00
Péter Szilágyi
006c21efc7
cmd, core, eth, les, node: chain freezer on top of db rework 2019-05-16 10:39:29 +03:00
Matthew Halpern
937417527c core: lookup txs by block number instead of block hash (#19431)
* core: lookup txs by block number instead of block hash

Transaction hashes now store a reference to their corresponding
block number as opposed to their hash. In benchmarks this was
shown to reduce storage by over 12 GB.

The main limitation of this approach is that transactions on
non-canonical blocks could never be looked up, however that is
currently not supported.

The database version has been upgraded to version 5 and the
transaction lookup process is backwards-compatible with the
prior two transaction lookup formats preexisting in the
database instance. Tests have been added to ensure this.

* core/rawdb: tiny review nit fixes
2019-04-25 17:24:55 +03:00
Péter Szilágyi
7221cb1434
core, eth, les, light: scope receipt functionality a bit cleaner 2019-04-15 13:42:26 +03:00
Martin Holst Swende
59e1953246 core, ethdb, trie: move dirty data to clean cache on flush (#19307)
This PR is a more advanced form of the dirty-to-clean cacher (#18995),
where we reuse previous database write batches as datasets to uncache,
saving a dirty-trie-iteration and a dirty-trie-rlp-reencoding per block.
2019-03-26 15:48:31 +01:00
Péter Szilágyi
054412e335
all: clean up and properly abstract database access 2019-03-06 13:35:03 +02:00
gary rong
7fd0ccaa68 core: remove unnecessary fields in logs, receipts and tx lookups (#17106)
* core: remove unnecessary fields in log

* core: bump blockchain database version

* core, les: remove unnecessary fields in txlookup

* eth: print db version explicitly

* core/rawdb: drop txlookup entry struct wrapper
2019-02-21 15:14:35 +02:00
Wenbiao Zheng
aab7ab04b0 core/rawdb: wrap db key creations (#16914)
* core/rawdb: use wrappered helper to assemble key

* core/rawdb: wrappered helper to assemble key

* core/rawdb: rewrite the wrapper, pass common.Hash
2018-06-11 16:06:26 +03:00
Péter Szilágyi
6cf0ab38bd
core/rawdb: separate raw database access to own package (#16666) 2018-05-07 14:35:06 +03:00