Adds missing trienode freezer case to InspectFreezerTable, making it
consistent with InspectFreezer which already supports it.
Co-authored-by: m6xwzzz <maskk.weller@gmail.com>
Fix#33212.
This PR remove `github.com/olekukonko/tablewriter` from dependencies and
use a naive stub implementation.
`github.com/olekukonko/tablewriter` is used to format database inspection
output neatly. However, it requires custom adjustments for TinyGo and is
incompatible with the latest version.
---------
Co-authored-by: MariusVanDerWijden <m.vanderwijden@live.de>
The iterator loop in findTxInBlockBody returned the outer-scoped err
when iter.Err() was non-nil, which could incorrectly propagate a nil or
stale error and hide actual RLP decoding issues. This patch returns
iter.Err() as intended by the rlp list iterator API, matching
established patterns elsewhere in the codebase and improving diagnostics
when encountering malformed transaction entries.
In this PR, several changes have been made:
(a) restructure the trienode history header section
Previously, the offsets of the key and value sections were recorded before
encoding data into these sections. As a result, these offsets referred to the
start position of each chunk rather than the end position.
This caused an issue where the end position of the last chunk was
unknown, making it incompatible with the freezer partial-read APIs.
With this update, all offsets now refer to the end position, and the
start position of the first chunk is always 0.
(b) Enable partial freezer read for trienode data retrieval
The partial freezer read feature is now utilized in trienode data
retrieval, improving efficiency.
This PR implements the partial read functionalities in the freezer, optimizing
the state history reader by resolving less data from freezer.
---------
Signed-off-by: jsvisa <delweng@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
- Replace outdated NewFreezer doc that referenced map[string]bool/snappy
toggle with accurate description of -map[string]freezerTableConfig
(noSnappy, prunable).
- Fix misleading field comment on freezerTable.config that spoke as if
it were a boolean (“if true”), clarifying it’s a struct and noting
compression is non-retroactive.
Fix the t.Fatalf format arguments in TestBadBlockStorage to match the
intended #index output. Previously, the left number used i+1 and the
right index used the block number, producing misleading diagnostics.
Correct mapping improves test failure clarity and debuggability.
`db inspect` on the full database currently takes **30min+**, because
the db iterate was run in one thread, propose to split the key-space to
256 sub range, and assign them to the worker pool to speed up.
After the change, the time of running `db inspect --workers 16` reduced
to **10min**(the keyspace is not evenly distributed).
---------
Signed-off-by: jsvisa <delweng@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This pull request preserves the root->ID mappings in the path database
even after the associated state histories are truncated, regardless of
whether the truncation occurs at the head or the tail.
The motivation is to support an additional history type, trienode history.
Since the root->ID mappings are shared between two history instances,
they must not be removed by either one.
As a consequence, the root->ID mappings remain in the database even
after the corresponding histories are pruned. While these mappings may
become dangling, it is safe and cheap to keep them.
Additionally, this pull request enhances validation during historical
reader construction, ensuring that only canonical historical state will be
served.
Continuation of https://github.com/ethereum/go-ethereum/issues/32022
tablewriter assumes unix or windows, which may not be the case for
embedded targets.
For v0.0.5 of tablewriter, it is noted in table.go: "The protocols were
written in pure Go and works on windows and unix systems"
---------
Co-authored-by: rjl493456442 <garyrong0905@gmail.com>
This is the first part of #31532
It maintains a series of conversion maker which are to be updated by the
conversion code (in a follow-up PR, this is a breakdown of a larger PR
to make things easier to review). They can be used in this way:
- During the conversion, by storing the conversion markers when the
block has been processed. This is meant to be written in a function that
isn't currently present, hence [this
TODO](https://github.com/ethereum/go-ethereum/pull/31634/files#diff-89272f61e115723833d498a0acbe59fa2286e3dc7276a676a7f7816f21e248b7R384).
Part of https://github.com/ethereum/go-ethereum/issues/31583
---------
Signed-off-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This is something interesting I came across during my benchmarks, we
spent ~3.8% of all allocations allocating the header number on the heap.
```
(pprof) list GetHeaderByHash
Total: 38197204475
ROUTINE ======================== github.com/ethereum/go-ethereum/core.(*BlockChain).GetHeaderByHash in github.com/ethereum/go-ethereum/core/blockchain_reader.go
0 5786566117 (flat, cum) 15.15% of Total
. . 79:func (bc *BlockChain) GetHeaderByHash(hash common.Hash) *types.Header {
. 5786566117 80: return bc.hc.GetHeaderByHash(hash)
. . 81:}
. . 82:
. . 83:// GetHeaderByNumber retrieves a block header from the database by number,
. . 84:// caching it (associated with its hash) if found.
. . 85:func (bc *BlockChain) GetHeaderByNumber(number uint64) *types.Header {
ROUTINE ======================== github.com/ethereum/go-ethereum/core.(*HeaderChain).GetHeaderByHash in github.com/ethereum/go-ethereum/core/headerchain.go
0 5786566117 (flat, cum) 15.15% of Total
. . 404:func (hc *HeaderChain) GetHeaderByHash(hash common.Hash) *types.Header {
. 1471264309 405: number := hc.GetBlockNumber(hash)
. . 406: if number == nil {
. . 407: return nil
. . 408: }
. 4315301808 409: return hc.GetHeader(hash, *number)
. . 410:}
. . 411:
. . 412:// HasHeader checks if a block header is present in the database or not.
. . 413:// In theory, if header is present in the database, all relative components
. . 414:// like td and hash->number should be present too.
(pprof) list GetBlockNumber
Total: 38197204475
ROUTINE ======================== github.com/ethereum/go-ethereum/core.(*HeaderChain).GetBlockNumber in github.com/ethereum/go-ethereum/core/headerchain.go
94438817 1471264309 (flat, cum) 3.85% of Total
. . 100:func (hc *HeaderChain) GetBlockNumber(hash common.Hash) *uint64 {
94438817 94438817 101: if cached, ok := hc.numberCache.Get(hash); ok {
. . 102: return &cached
. . 103: }
. 1376270828 104: number := rawdb.ReadHeaderNumber(hc.chainDb, hash)
. . 105: if number != nil {
. 554664 106: hc.numberCache.Add(hash, *number)
. . 107: }
. . 108: return number
. . 109:}
. . 110:
. . 111:type headerWriteResult struct {
(pprof) list ReadHeaderNumber
Total: 38197204475
ROUTINE ======================== github.com/ethereum/go-ethereum/core/rawdb.ReadHeaderNumber in github.com/ethereum/go-ethereum/core/rawdb/accessors_chain.go
204606513 1376270828 (flat, cum) 3.60% of Total
. . 146:func ReadHeaderNumber(db ethdb.KeyValueReader, hash common.Hash) *uint64 {
109577863 1281242178 147: data, _ := db.Get(headerNumberKey(hash))
. . 148: if len(data) != 8 {
. . 149: return nil
. . 150: }
95028650 95028650 151: number := binary.BigEndian.Uint64(data)
. . 152: return &number
. . 153:}
. . 154:
. . 155:// WriteHeaderNumber stores the hash->number mapping.
. . 156:func WriteHeaderNumber(db ethdb.KeyValueWriter, hash common.Hash, number uint64) {
```
Opening this to discuss the idea, I know that rawdb.EmptyNumber is not a
great name for the variable, open to suggestions
This pull request slightly improves the freezer fsync mechanism by scheduling
the Sync operation based on the number of uncommitted items and original
time interval.
Originally, freezer.Sync was triggered every 30 seconds, which worked well during
active chain synchronization. However, once the initial state sync is complete,
the fixed interval causes Sync to be scheduled too frequently.
To address this, the scheduling logic has been improved to consider both the time
interval and the number of uncommitted items. This additional condition helps
avoid unnecessary Sync operations when the chain is idle.
Introduce file-based state journal in path database, fixing
the Pebble restriction when the journal size exceeds 4GB.
---------
Signed-off-by: jsvisa <delweng@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This pull request refines the filtermap implementation, defining key
APIs for map and
epoch calculations to improve readability.
This pull request doesn't change any logic, it's a pure cleanup.
---------
Co-authored-by: zsfelfoldi <zsfelfoldi@gmail.com>
Reading a single transaction out of a block shouldn't need decoding the
entire body
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
As https://github.com/ethereum/go-ethereum/pull/31769 defined a global
hash pool, so we can reuse it, and also remove the unnecessary
KeccakState buffering
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This implements a backing store for chain history based on era1 files.
The new store is integrated with the freezer. Queries for blocks and receipts
below the current freezer tail are handled by the era store.
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Co-authored-by: Felix Lange <fjl@twurst.com>
Co-authored-by: lightclient <lightclient@protonmail.com>
This PR implements eth/69. This protocol version drops the bloom filter
from receipts messages, reducing the amount of data needed for a sync
by ~530GB (2.3B txs * 256 byte) uncompressed. Compressed this will
be reduced to ~100GB
The new version also changes the Status message and introduces the
BlockRangeUpdate message to relay information about the available history
range.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
This PR creates a global hasher pool that can be used by all packages.
It also removes a bunch of the package local pools.
It also updates a few locations to use available hashers or the global
hashing pool to reduce allocations all over the codebase.
This change should reduce global allocation count by ~1%
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This PR fixes an issue that could lead to data corruption.
Writing the state history may fail due to insufficient disk space or
other potential errors. With this change, the entire state insertion
will be aborted instead of silently ignoring the error.
Without this fix, state transitions would continue while the associated
state history is lost. After a restart, the resulting gap would be detected,
making recovery impossible.
This pull request introduces a SyncKeyValue function to the
ethdb.KeyValueStore
interface, providing the ability to forcibly flush all previous writes
to disk.
This functionality is critical for go-ethereum, which internally uses
two independent
database engines: a key-value store (such as Pebble, LevelDB, or
memoryDB for
testing) and a flat-file–based freezer. To ensure write-order
consistency between
these engines, the key-value store must be explicitly synced before
writing to the
freezer and vice versa.
Fixes
- https://github.com/ethereum/go-ethereum/issues/31405
- https://github.com/ethereum/go-ethereum/issues/29819
This PR fixes a bug in the map renderer that sometimes used an obsolete
block log value pointer to initialize the iterator for rendering from a
snapshot. This bug was triggered by chain reorgs and sometimes caused
indexing errors and invalid search results. A few other conditions are
also made safer that were not reported to cause issues yet but could
potentially be unsafe in some corner cases. A new unit test is also
added that reproduced the bug but passes with the new fixes.
Fixes https://github.com/ethereum/go-ethereum/issues/31593
Might also fix https://github.com/ethereum/go-ethereum/issues/31589
though this issue has not been reproduced yet, but it appears to be
related to a log index database corruption around a specific block,
similarly to the other issue.
Note that running this branch resets and regenerates the log index
database. For this purpose a `Version` field has been added to
`rawdb.FilterMapsRange` which will also make this easier in the future
if a breaking database change is needed or the existing one is
considered potentially broken due to a bug, like in this case.
This pull request introduces new sync logic for pruning mode. The downloader will now skip
insertion of block bodies and receipts before the configured history cutoff point.
Originally, in snap sync, the header chain and other components (bodies and receipts) were
inserted separately. However, in Proof-of-Stake, this separation is unnecessary since the
sync target is already verified by the CL.
To simplify the process, this pull request modifies `InsertReceiptChain` to insert headers
along with block bodies and receipts together. Besides, `InsertReceiptChain` doesn't have
the notion of reorg, as the common ancestor is always be found before the sync and extra
side chain is truncated at the beginning if they fall in the ancient store. The stale
canonical chain flags will always be rewritten by the new chain. Explicit reorg logic is
no longer required in `InsertReceiptChain`.
This PR adds `rawdb.SafeDeleteRange` and uses it for range deletion in
`core/filtermaps`. This includes deleting the old bloombits database,
resetting the log index database and removing index data for unindexed
tail epochs (which previously weren't properly implemented for the
fallback case).
`SafeDeleteRange` either calls `ethdb.DeleteRange` if the node uses the
new path based state scheme or uses an iterator based fallback method
that safely skips trie nodes in the range if the old hash based state
scheme is used. Note that `ethdb.DeleteRange` also has its own iterator
based fallback implementation in `ethdb/leveldb`. If a path based state
scheme is used and the backing db is pebble (as it is on the majority of
new nodes) then `rawdb.SafeDeleteRange` uses the fast native range
delete.
Also note that `rawdb.SafeDeleteRange` has different semantics from
`ethdb.DeleteRange`, it does not automatically return if the operation
takes a long time. Instead it receives a `stopCallback` that can
interrupt the process if necessary. This is because in the safe mode
potentially a lot of entries are iterated without being deleted (this is
definitely the case when deleting the old bloombits database which has a
single byte prefix) and therefore restarting the process every time a
fixed number of entries have been iterated would result in a quadratic
run time in the number of skipped entries.
When running in safe mode, unindexing an epoch takes about a second,
removing bloombits takes around 10s while resetting a full log index
might take a few minutes. If a range delete operation takes a
significant amount of time then log messages are printed. Also, any
range delete operation can be interrupted by shutdown (tail uinindexing
can also be interrupted by head indexing, similarly to how tail indexing
works). If the last unindexed epoch might have "dirty" index data left
then the indexed map range points to the first valid epoch and
`cleanedEpochsBefore` points to the previous, potentially dirty one. At
startup it is always assumed that the epoch before the first fully
indexed one might be dirty. New tail maps are never rendered and also no
further maps are unindexed before the previous unindexing is properly
cleaned up.
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Co-authored-by: Felix Lange <fjl@twurst.com>
Instead of reporting all filtermaps stuff in one line, I'm breaking it
down into the three separate kinds of entries here.
```
+-----------------------+-----------------------------+------------+------------+
| DATABASE | CATEGORY | SIZE | ITEMS |
+-----------------------+-----------------------------+------------+------------+
| Key-Value store | Log index filter-map rows | 59.21 GiB | 616077345 |
| Key-Value store | Log index last-block-of-map | 12.35 MiB | 269755 |
| Key-Value store | Log index block-lv | 421.70 MiB | 22109169 |
```
Also added some other changes to make it easier to debug:
- restored bloombits into the inspect output, so we notice if it doesn't
get deleted for some reason
- tracking of unaccounted key examples