The optimization tried to defer allocating the cache map until it was used for the
first time. It's a relic from earlier times, when tries were copied often. This seems
unnecessary now, so we can just create the map when the trie is created.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
As the preimage will only be stored if `t.preimages != nil`, so no need
to save them into local cache if not enabled. This will reduce the memory
wasted to copy the bytes
---------
Signed-off-by: jsvisa <delweng@gmail.com>
In this pull request, snapshot generation in pathdb has been ported from
the legacy state snapshot implementation. Additionally, when running in
path mode, legacy state snapshot data is now managed by the pathdb
based snapshot logic.
Note: Existing snapshot data will be re-generated, regardless of whether
it was previously fully constructed.
This PR creates a global hasher pool that can be used by all packages.
It also removes a bunch of the package local pools.
It also updates a few locations to use available hashers or the global
hashing pool to reduce allocations all over the codebase.
This change should reduce global allocation count by ~1%
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This pull request introduces a SyncKeyValue function to the
ethdb.KeyValueStore
interface, providing the ability to forcibly flush all previous writes
to disk.
This functionality is critical for go-ethereum, which internally uses
two independent
database engines: a key-value store (such as Pebble, LevelDB, or
memoryDB for
testing) and a flat-file–based freezer. To ensure write-order
consistency between
these engines, the key-value store must be explicitly synced before
writing to the
freezer and vice versa.
Fixes
- https://github.com/ethereum/go-ethereum/issues/31405
- https://github.com/ethereum/go-ethereum/issues/29819
This PR adds checking for an edgecase which theoretically can happen in
the range-prover. Right now, we check that a key does not overwrite a
previous one by checking that the key is increasing. However, if keys
are of different lengths, it is possible to create a key which is
increasing _and_ overwrites the previous key. Example: `0xaabbcc`
followed by `0xaabbccdd`.
This can not happen in go-ethereum, which always uses fixed-size paths
for accounts and storage slot paths in the trie, but it might happen if
the range prover is used without guaranteed fixed-size keys.
This PR also adds some testcases for the errors that are expected.
This pull request removes the node copy operation to reduce memory
allocation. Key Changes as below:
**(a) Use `decodeNodeUnsafe` for decoding nodes retrieved from the trie
node reader**
In the current implementation of the MPT, once a trie node blob is
retrieved, it is passed to `decodeNode` for decoding. However,
`decodeNode` assumes the supplied byte slice might be mutated later, so
it performs a deep copy internally before parsing the node.
Given that the node reader is implemented by the path database and the
hash database, both of which guarantee the immutability of the returned
byte slice. By restricting the node reader interface to explicitly
guarantee that the returned byte slice will not be modified, we can
safely replace `decodeNode` with `decodeNodeUnsafe`. This eliminates the
need for a redundant byte copy during each node resolution.
**(b) Modify the trie in place**
In the current implementation of the MPT, a copy of a trie node is
created before any modifications are made. These modifications include:
- Node resolution: Converting the value from a hash to the actual node.
- Node hashing: Tagging the hash into its cache.
- Node commit: Replacing the children with its hash.
- Structural changes: For example, adding a new child to a fullNode or
replacing a child of a shortNode.
This mechanism ensures that modifications only affect the live tree,
leaving all previously created copies unaffected.
Unfortunately, this property leads to a huge memory allocation
requirement. For example, if we want to modify the fullNode for n times,
the node will be copied for n times.
In this pull request, all the trie modifications are made in place. In
order to make sure all previously created copies are unaffected, the
`Copy` function now will deep-copy all the live nodes rather than the
root node itself.
With this change, while the `Copy` function becomes more expensive, it's
totally acceptable as it's not a frequently used one. For the normal
trie operations (Get, GetNode, Hash, Commit, Insert, Delete), the node
copy is not required anymore.
This PR removes the assumption of the stacktrie and trie to have the
same ordering. This was hit by the fuzzers on oss-fuzz
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This fixes an error where executing `evm run --dump ...` omits preimages
from the dump (because the statedb used for execution is a copy of
another instance).
As the node hash scheme in verkle and merkle are totally different, the
original default node hasher in pathdb is no longer suitable. Therefore,
this pull request configures different node hasher respectively.
Tests that are crucial to for verifying the verkle testnet functions properly.
---------
Signed-off-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>
Co-authored-by: Ignacio Hagopian <jsign.uy@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Co-authored-by: Martin HS <martin@swende.se>
This PR adds `DeleteRange` to `ethdb.KeyValueWriter`. While range
deletion using an iterator can be really slow, `DeleteRange` is natively
supported by pebble and apparently runs in O(1) time (typically 20-30ms
in my tests for removing hundreds of millions of keys and gigabytes of
data). For leveldb and memorydb an iterator based fallback is
implemented. Note that since the iterator method can be slow and a
database function should not unexpectedly block for a very long time,
the number of deleted keys is limited at 10000 which should ensure that
it does not block for more than a second. ErrTooManyKeys is returned if
the range has only been partially deleted. In this case the caller can
repeat the call until it finally succeeds.
This change makes the trie commit operation concurrent, if the number of changes exceed 100.
Co-authored-by: stevemilk <wangpeculiar@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This PR implements changes related to
[EIP-6800](https://eips.ethereum.org/EIPS/eip-6800) and
[EIP-4762](https://eips.ethereum.org/EIPS/eip-4762) spec updates.
A TL;DR of the changes is that `Version`, `Balance`, `Nonce` and
`CodeSize` are encoded in a single leaf named `BasicData`. For more
details, see the [_Header Values_ table in
EIP-6800](https://eips.ethereum.org/EIPS/eip-6800#header-values).
The motivation for this was simplifying access event patterns, reducing
code complexity, and, as a side effect, saving gas since fewer leaf
nodes must be accessed.
---------
Co-authored-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>
Co-authored-by: Felix Lange <fjl@twurst.com>
This PR adds the bulk verkle witness+proof production at the end of block
production. It reads all data from the tree in one swoop and produces
a verkle proof.
Co-authored-by: Felix Lange <fjl@twurst.com>
This is a performance improvement on the account-creation rollback code
required for the archive node to support verkle. It uses the utility
function `DeleteAtStem` to remove code and account data per-group
instead of doing it leaf by leaf.
It also fixes an index bug, as code is chunked in 31-byte chunks, so
comparing with the code size should use 31 as its stride.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
This pull request fixes#30229.
During snap sync, large storage will be split into several pieces and
synchronized concurrently. Unfortunately, the tradeoff is that the respective
merkle trie of each storage chunk will be incomplete due to the incomplete
boundaries. The trie nodes on these boundaries will be discarded, and any
dangling nodes on disk will also be removed if they fall on these paths,
ensuring the state healer won't be blocked.
However, the dangling account trie nodes on the path from the root to the
associated account are left untouched. This means the dangling account trie
nodes could potentially stop the state healing and break the assumption that the
entire subtrie should exist if the subtrie root exists. We should consider the
account trie node as the ancestor of the corresponding storage trie node.
In the scenarios described in the above ticket, the state corruption could occur
if there is a dangling account trie node while some storage trie nodes are
removed due to synchronization redo.
The fixing idea is pretty straightforward, the trie nodes on the path from root
to account should all be explicitly removed if an incomplete storage trie
occurs. Therefore, a `delete` operation has been added into `gentrie` to
explicitly clear the account along with all nodes on this path. The special
thing is that it's a cross-trie clearing. In theory, there may be a dangling
node at any position on this account key and we have to clear all of them.
* all: add stateless verifications
* all: simplify witness and integrate it into live geth
---------
Co-authored-by: Péter Szilágyi <peterke@gmail.com>
* avoid unnecessary copy
* delete the never used function ProofList
* eth/protocols/snap, trie/trienode: polish the code
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
* cmd/geth, ethdb/pebble: polish method naming and code comment
* implement db stat for pebble
* cmd, core, ethdb, internal, trie: remove db property selector
* cmd, core, ethdb: fix function description
---------
Co-authored-by: prpeh <prpeh@proton.me>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This pull request fixes the pre-order trie traversal by defining
a more accurate iterator order and path comparison rule.
Co-authored-by: Gary Rong <garyrong0905@gmail.com>