This PR improves error handling in the remotedb package by fixing two
issues:
1. In the `Has` method, we now properly propagate errors instead of
silently returning false. This makes the behavior more predictable and
helps clients better understand when there are connection issues.
2. In the `New` constructor, we add a nil check for the client parameter
to prevent potential panics. This follows Go best practices for
constructor functions.
These changes make the code more robust and follow Go's error handling
idioms without requiring any changes to other parts of the codebase.
Changes:
- Modified `Has` method to return errors instead of silently returning
false
- Added nil check in `New` constructor
- Fixed field name in constructor to match struct definition
This PR adds `rawdb.SafeDeleteRange` and uses it for range deletion in
`core/filtermaps`. This includes deleting the old bloombits database,
resetting the log index database and removing index data for unindexed
tail epochs (which previously weren't properly implemented for the
fallback case).
`SafeDeleteRange` either calls `ethdb.DeleteRange` if the node uses the
new path based state scheme or uses an iterator based fallback method
that safely skips trie nodes in the range if the old hash based state
scheme is used. Note that `ethdb.DeleteRange` also has its own iterator
based fallback implementation in `ethdb/leveldb`. If a path based state
scheme is used and the backing db is pebble (as it is on the majority of
new nodes) then `rawdb.SafeDeleteRange` uses the fast native range
delete.
Also note that `rawdb.SafeDeleteRange` has different semantics from
`ethdb.DeleteRange`, it does not automatically return if the operation
takes a long time. Instead it receives a `stopCallback` that can
interrupt the process if necessary. This is because in the safe mode
potentially a lot of entries are iterated without being deleted (this is
definitely the case when deleting the old bloombits database which has a
single byte prefix) and therefore restarting the process every time a
fixed number of entries have been iterated would result in a quadratic
run time in the number of skipped entries.
When running in safe mode, unindexing an epoch takes about a second,
removing bloombits takes around 10s while resetting a full log index
might take a few minutes. If a range delete operation takes a
significant amount of time then log messages are printed. Also, any
range delete operation can be interrupted by shutdown (tail uinindexing
can also be interrupted by head indexing, similarly to how tail indexing
works). If the last unindexed epoch might have "dirty" index data left
then the indexed map range points to the first valid epoch and
`cleanedEpochsBefore` points to the previous, potentially dirty one. At
startup it is always assumed that the epoch before the first fully
indexed one might be dirty. New tail maps are never rendered and also no
further maps are unindexed before the previous unindexing is properly
cleaned up.
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Co-authored-by: Felix Lange <fjl@twurst.com>
Here we add the notion of prunable tables for the `TruncateTail` operation
in the freezer. TruncateTail for the chain freezer now only truncates the body and
receipts tables, leaving headers and hashes as-is.
This change also requires changing the validation/repair at startup to allow for
tables with different tail. For the header and hash tables, we now require them to start
at number zero.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
The metric always has a value, no need to check for the nil.
Seems this code was first introduced here
054412e335/metrics/meter.go (L45-L48)
As the `nilMeter` was removed, so this check seems is useless.
Signed-off-by: jsvisa <delweng@gmail.com>
This PR replaces the iterator based DeleteRange implementation of
memorydb with a simpler and much faster loop that directly deletes keys
in the order of iteration instead of unnecessarily collecting keys in
memory and sorting them.
---------
Co-authored-by: Martin HS <martin@swende.se>
This PR modifies how the metrics library handles `Enabled`: previously,
the package `init` decided whether to serve real metrics or just
dummy-types.
This has several drawbacks:
- During pkg init, we need to determine whether metrics are enabled or
not. So we first hacked in a check if certain geth-specific
commandline-flags were enabled. Then we added a similar check for
geth-env-vars. Then we almost added a very elaborate check for
toml-config-file, plus toml parsing.
- Using "real" types and dummy types interchangeably means that
everything is hidden behind interfaces. This has a performance penalty,
and also it just adds a lot of code.
This PR removes the interface stuff, uses concrete types, and allows for
the setting of Enabled to happen later. It is still assumed that
`metrics.Enable()` is invoked early on.
The somewhat 'heavy' operations, such as ticking meters and exp-decay,
now checks the enable-flag to prevent resource leak.
The change may be large, but it's mostly pretty trivial, and from the
last time I gutted the metrics, I ensured that we have fairly good test
coverage.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
This PR adds `DeleteRange` to `ethdb.KeyValueWriter`. While range
deletion using an iterator can be really slow, `DeleteRange` is natively
supported by pebble and apparently runs in O(1) time (typically 20-30ms
in my tests for removing hundreds of millions of keys and gigabytes of
data). For leveldb and memorydb an iterator based fallback is
implemented. Note that since the iterator method can be slow and a
database function should not unexpectedly block for a very long time,
the number of deleted keys is limited at 10000 which should ensure that
it does not block for more than a second. ErrTooManyKeys is returned if
the range has only been partially deleted. In this case the caller can
repeat the call until it finally succeeds.
* cmd/geth, ethdb/pebble: polish method naming and code comment
* implement db stat for pebble
* cmd, core, ethdb, internal, trie: remove db property selector
* cmd, core, ethdb: fix function description
---------
Co-authored-by: prpeh <prpeh@proton.me>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This fixes a database corruption issue that could occur during state healing.
When sync is aborted while certain modifications were already committed, and a
reorg occurs, the database would contain incorrect trie nodes stored by path.
These nodes need to detected/deleted in order to obtain a complete and fully correct state
after state healing.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
* trie: use pooling of iterator states in iterator
The node iterator burns through a lot of memory while iterating a trie, and a lot of
that can be avoided by using a fairly small pool (max 40 items).
name old time/op new time/op delta
Iterator-8 6.22ms ± 3% 5.40ms ± 6% -13.18% (p=0.008 n=5+5)
name old alloc/op new alloc/op delta
Iterator-8 2.36MB ± 0% 1.67MB ± 0% -29.23% (p=0.008 n=5+5)
name old allocs/op new allocs/op delta
Iterator-8 37.0k ± 0% 29.8k ± 0% ~ (p=0.079 n=4+5)
* ethdb/memorydb: avoid one copying of key
By making the transformation from []byte to string at an earlier point,
we save an allocation which otherwise happens later on.
name old time/op new time/op delta
BatchAllocs-8 412µs ± 6% 382µs ± 2% -7.18% (p=0.016 n=5+4)
name old alloc/op new alloc/op delta
BatchAllocs-8 480kB ± 0% 490kB ± 0% +1.93% (p=0.008 n=5+5)
name old allocs/op new allocs/op delta
BatchAllocs-8 3.03k ± 0% 2.03k ± 0% -32.98% (p=0.008 n=5+5)
cockroachdb/pebble@422dce9 added Errorf to the Logger interface, this change makes it possible to compile geth with that version of pebble by adding the corresponding method to panicLogger.
The Go authors updated golang/x/ext to change the function signature of the slices sort method.
It's an entire shitshow now because x/ext is not tagged, so everyone's codebase just
picked a new version that some other dep depends on, causing our code to fail building.
This PR updates the dep on our code too and does all the refactorings to follow upstream...
* all: implement path-based state scheme
* all: edits from review
* core/rawdb, trie/triedb/pathdb: review changes
* core, light, trie, eth, tests: reimplement pbss history
* core, trie/triedb/pathdb: track block number in state history
* trie/triedb/pathdb: add history documentation
* core, trie/triedb/pathdb: address comments from Peter's review
Important changes to list:
- Cache trie nodes by path in clean cache
- Remove root->id mappings when history is truncated
* trie/triedb/pathdb: fallback to disk if unexpect node in clean cache
* core/rawdb: fix tests
* trie/triedb/pathdb: rename metrics, change clean cache key
* trie/triedb: manage the clean cache inside of disk layer
* trie/triedb/pathdb: move journal function
* trie/triedb/path: fix tests
* trie/triedb/pathdb: fix journal
* trie/triedb/pathdb: fix history
* trie/triedb/pathdb: try to fix tests on windows
* core, trie: address comments
* trie/triedb/pathdb: fix test issues
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
Co-authored-by: Martin Holst Swende <martin@swende.se>
This change adds the ability to perform reads from freezer without size limitation. This can be useful in cases where callers are certain that out-of-memory will not happen (e.g. reading only a few elements).
The previous API was designed to behave both optimally and secure while servicing a request from a peer, whereas this change should _not_ be used when an untrusted peer can influence the query size.
This removes text parsing in leveldb metrics collection code. All metrics
can now be accessed through the stats API provided by leveldb.
We also add new gauge-typed metrics that count the number of tables at each level.
---------
Co-authored-by: Exca-DK <Exca-DK@users.noreply.github.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Co-authored-by: Felix Lange <fjl@twurst.com>
This is likely the culprit behind several data corruption issues, e.g. where data has been
written to the freezer, but the deletion from pebble does not go through due to process
crash.
One difference between pebble and leveldb is that the latter returns error when performing Get on a closed database, the former does a panic. This may be triggered during shutdown (see #27237)
This PR changes the pebble driver so we check that the db is not closed already, for several operations. It also adds tests to the db test-suite, so the previously implicit assumption of "not panic:ing at ops on closed database" is covered by tests.