Improves the SSTORE gas calculation a bit. Previously we would pull up
the state object twice. This is okay for existing objects, since they
are cached, however non-existing objects are not cached, thus we needed
to go through all 128 diff layers as well as the disk layer twice, just
for the gas calculation
```
goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/core/vm
cpu: AMD Ryzen 9 5900X 12-Core Processor
│ /tmp/old.txt │ /tmp/new.txt │
│ sec/op │ sec/op vs base │
Interpreter-24 1118.0n ± 2% 602.8n ± 1% -46.09% (p=0.000 n=10)
```
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Previously, the account trie for a given state root was resolved immediately
when the stateDB was created, implying that the trie was always required
by the stateDB.
However, this assumption no longer holds, especially for path archive nodes,
where historical states can be accessed even if the corresponding trie data
does not exist.
Previously, PathDB used a single buffer to aggregate database writes,
which needed to be flushed atomically. However, flushing large amounts
of data (e.g., 256MB) caused significant overhead, often blocking the
system for around 3 seconds during the flush.
To mitigate this overhead and reduce performance spikes, a double-buffer
mechanism is introduced. When the active buffer fills up, it is marked
as frozen and a background flushing process is triggered. Meanwhile, a
new buffer is allocated for incoming writes, allowing operations to
continue uninterrupted.
This approach reduces system blocking times and provides flexibility in
adjusting buffer parameters for improved performance.
This pull request introduces a mechanism to expose statistics from the
state reader, specifically related to cache utilization during state prefetching.
To improve state access performance, a pair of state readers is constructed
with a shared local cache. One reader to execute transactions ahead of time
to warm up the cache. The other reader is used by the actual chain processing
logic, which can benefit from the prefetched states.
This PR adds visibility into how effective the cache is by exposing relevant
usage statistics.
---------
Signed-off-by: Csaba Kiraly <csaba.kiraly@gmail.com>
Co-authored-by: Csaba Kiraly <csaba.kiraly@gmail.com>
As https://github.com/ethereum/go-ethereum/pull/31769 defined a global
hash pool, so we can reuse it, and also remove the unnecessary
KeccakState buffering
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
When `GetKey` is called, a missing preimage can cause the function to return a `nil`
key. This, in turn, makes `account.Storage` persist an incorrect value.
In this pull request, snapshot generation in pathdb has been ported from
the legacy state snapshot implementation. Additionally, when running in
path mode, legacy state snapshot data is now managed by the pathdb
based snapshot logic.
Note: Existing snapshot data will be re-generated, regardless of whether
it was previously fully constructed.
Adding values to the witness introduces a new class of issues for
computing gas: if there is not enough gas to cover adding an item to the
witness, then the item should not be added to the witness.
The problem happens when several items are added together, and that
process runs out of gas. The witness gas computation needs a way to
signal that not enough gas was provided. These values can not be
hardcoded, however, as they are context dependent, i.e. two calls to the
same function with the same parameters can give two different results.
The approach is to return both the gas that was actually consumed, and
the gas that was necessary. If the values don't match, then a witness
update OOG'd. The caller should then charge the `consumed` value
(remaining gas will be 0) and error out.
Why not return a boolean instead of the wanted value? Because when
several items are touched, we want to distinguish which item lacked gas.
---------
Signed-off-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>
This PR creates a global hasher pool that can be used by all packages.
It also removes a bunch of the package local pools.
It also updates a few locations to use available hashers or the global
hashing pool to reduce allocations all over the codebase.
This change should reduce global allocation count by ~1%
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
This pull request enhances the block prefetcher by executing transactions
in parallel to warm the cache alongside the main block processor.
Unlike the original prefetcher, which only executes the next block and
is limited to chain syncing, the new implementation can be applied to any
block. This makes it useful not only during chain sync but also for regular
block insertion after the initial sync.
---------
Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de>
1. The metric of preimage/hits are always the same as preimage/total, prefer to replace
the hits with miss instead.
2. For the state/read/accounts metric, follow the same naming of others,
change into singuar.
This is a not-particularly-important "cleanliness" PR. It removes the
last remnants of the `x/exp` package, where we used the `maps.Keys`
function.
The original returned the keys in a slice, but when it became 'native'
the signature changed to return an iterator, so the new idiom is
`slices.Collect(maps.Keys(theMap))`, unless of course the raw iterator
can be used instead.
In some cases, where we previously collect into slice and then sort, we
can now instead do `slices.SortXX` on the iterator instead, making the
code a bit more concise.
This PR might be _slighly_ less optimal, because the original `x/exp`
implementation allocated the slice at the correct size off the bat,
which I suppose the new code won't.
Putting it up for discussion.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
Here we add some more changes for live tracing API v1.1:
- Hook `OnSystemCallStartV2` was introduced with `VMContext` as parameter.
- Hook `OnBlockHashRead` was introduced.
- `GetCodeHash` was added to the state interface
- The new `WrapWithJournal` construction helps with tracking EVM reverts in the tracer.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
Same as #31015 but requires the contract to exist. Not compatible with
any verkle testnet up to now.
This adds a `isSytemCall` flag so that it is possible to detect when a
system call is executed, so that the code execution and other locations
are not added to the witness.
---------
Signed-off-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>
Co-authored-by: Ignacio Hagopian <jsign.uy@gmail.com>
Co-authored-by: Felix Lange <fjl@twurst.com>
This pull request delivers the new version of the state history, where
the raw storage key is used instead of the hash.
Before the cancun fork, it's supported by protocol to destruct a
specific account and therefore, all the storage slot owned by it should
be wiped in the same transition.
Technically, storage wiping should be performed through storage
iteration, and only the storage key hash will be available for traversal
if the state snapshot is not available. Therefore, the storage key hash
is chosen as the identifier in the old version state history.
Fortunately, account self-destruction has been deprecated by the
protocol since the Cancun fork, and there are no empty accounts eligible
for deletion under EIP-158. Therefore, we can conclude that no storage
wiping should occur after the Cancun fork. In this case, it makes no
sense to keep using hash.
Besides, another big reason for making this change is the current format
state history is unusable if verkle is activated. Verkle tree has a
different key derivation scheme (merkle uses keccak256), the preimage of
key hash must be provided in order to make verkle rollback functional.
This pull request is a prerequisite for landing verkle.
Additionally, the raw storage key is more human-friendly for those who
want to manually check the history, even though Solidity already
performs some hashing to derive the storage location.
---
This pull request doesn't bump the database version, as I believe the
database should still be compatible if users degrade from the new geth
version to old one, the only side effect is the persistent new version
state history will be unusable.
---------
Co-authored-by: Zsolt Felfoldi <zsfelfoldi@gmail.com>
As the node hash scheme in verkle and merkle are totally different, the
original default node hasher in pathdb is no longer suitable. Therefore,
this pull request configures different node hasher respectively.
This PR implements EIP-7702: "Set EOA account code".
Specification: https://eips.ethereum.org/EIPS/eip-7702
> Add a new transaction type that adds a list of `[chain_id, address,
nonce, y_parity, r, s]` authorization tuples. For each tuple, write a
delegation designator `(0xef0100 ++ address)` to the signing account’s
code. All code reading operations must load the code pointed to by the
designator.
---------
Co-authored-by: Mario Vega <marioevz@gmail.com>
Co-authored-by: Martin Holst Swende <martin@swende.se>
Co-authored-by: Felix Lange <fjl@twurst.com>
This PR modifies how the metrics library handles `Enabled`: previously,
the package `init` decided whether to serve real metrics or just
dummy-types.
This has several drawbacks:
- During pkg init, we need to determine whether metrics are enabled or
not. So we first hacked in a check if certain geth-specific
commandline-flags were enabled. Then we added a similar check for
geth-env-vars. Then we almost added a very elaborate check for
toml-config-file, plus toml parsing.
- Using "real" types and dummy types interchangeably means that
everything is hidden behind interfaces. This has a performance penalty,
and also it just adds a lot of code.
This PR removes the interface stuff, uses concrete types, and allows for
the setting of Enabled to happen later. It is still assumed that
`metrics.Enable()` is invoked early on.
The somewhat 'heavy' operations, such as ticking meters and exp-decay,
now checks the enable-flag to prevent resource leak.
The change may be large, but it's mostly pretty trivial, and from the
last time I gutted the metrics, I ensured that we have fairly good test
coverage.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
It's a pull request based on https://github.com/ethereum/go-ethereum/pull/30643
In this pull request, the partial functional state reader is enabled if **legacy snapshot
is not enabled**. The tracked flat states in pathdb will be used to serve the state
retrievals, as the second implementation to fasten the state access.
This pull request should be a noop change in normal cases.
This PR introduces a `ContractCodeReader` interface with functions defined:
type ContractCodeReader interface {
Code(addr common.Address, codeHash common.Hash) ([]byte, error)
CodeSize(addr common.Address, codeHash common.Hash) (int, error)
}
This interface can be implemented in various ways. Although the codebase
currently includes only one implementation, additional implementations
could be created for different purposes and scenarios, such as a code
reader designed for the Verkle tree approach or one that reads code from
the witness.
*Notably, this interface modifies the function’s semantics. If the
contract code is not found, no error will be returned. An error should
only be returned in the event of an unexpected issue, primarily for
future implementations.*
The original state.Reader interface is extended with ContractCodeReader
methods, it gives us more flexibility to manipulate the reader with additional
logic on top, e.g. Hooks.
type Reader interface {
ContractCodeReader
StateReader
}
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
This workaround is meant to minimize the possibility for snapshot generation
once the geth node upgrades to new version (specifically #30752 )
In #30752, the journal format in state snapshot is modified by removing
the destruct set. Therefore, the existing old format (version = 0) will be
discarded and all in-memory layers will be lost. Unfortunately, the lost
in-memory layers can't be recovered by some other approaches, and the
entire state snapshot will be regenerated (it will last about 2.5 hours).
This pull request introduces a workaround to adopt the legacy journal if
the destruct set contained is empty. Since self-destruction has been
deprecated following the cancun fork, the destruct set is expected to be nil for
layers above the fork block. However, an exception occurs during contract
deployment: pre-funded accounts may self-destruct, causing accounts with
non-zero balances to be removed from the state. For example,
https://etherscan.io/tx/0xa087333d83f0cd63b96bdafb686462e1622ce25f40bd499e03efb1051f31fe49).
For nodes with a fully synced state, the legacy journal is likely compatible with
the updated definition, eliminating the need for regeneration. Unfortunately,
nodes performing a full sync of historical chain segments or encountering
pre-funded account deletions may face incompatibilities, leading to automatic
snapshot regeneration.
This reverts commit 23800122b3.
The original pull request introduces a bug and some flaky tests are
detected because of this flaw.
```
--- FAIL: TestRecoverSnapshotFromWipingCrash (0.27s)
blockchain_snapshot_test.go:158: The disk layer is not integrated snapshot is not constructed
{"pc":0,"op":88,"gas":"0x7148","gasCost":"0x2","memSize":0,"stack":[],"depth":1,"refund":0,"opName":"PC"}
{"pc":1,"op":255,"gas":"0x7146","gasCost":"0x1db0","memSize":0,"stack":["0x0"],"depth":1,"refund":0,"opName":"SELFDESTRUCT"}
{"output":"","gasUsed":"0x0"}
{"output":"","gasUsed":"0x1db2"}
{"pc":0,"op":116,"gas":"0x13498","gasCost":"0x3","memSize":0,"stack":[],"depth":1,"refund":0,"opName":"PUSH21"}
```
Before the original PR, the snapshot would block the function until the
disk layer
was fully generated under the following conditions:
(a) explicitly required by users with `AsyncBuild = false`.
(b) the snapshot was being fully rebuilt or *the disk layer generation
had resumed*.
Unfortunately, with the changes introduced in that PR, the snapshot no
longer waits
for disk layer generation to complete if the generation is resumed. It
brings lots of
uncertainty and breaks this tiny debug feature.
This PR is purely for improved readability; I was doing work involving
the file and think this may help others who are trying to understand
what's going on.
1. `snapshot.Tree.Rebuild()` now returns a function that blocks until
regeneration is complete, allowing `Tree.waitBuild()` to be removed
entirely as all it did was search for the `done` channel behind this new
function.
2. Its usage inside `New()` is also simplified by (a) only waiting if
`!AsyncBuild`; and (b) avoiding the double negative of `if !NoBuild`.
---------
Co-authored-by: Martin HS <martin@swende.se>
This pull request removes the destruct flag from the state snapshot to
simplify the code.
Previously, this flag indicated that an account was removed during a
state transition, making all associated storage slots inaccessible.
Because storage deletion can involve a large number of slots, the actual
deletion is deferred until the end of the process, where it is handled
in batches.
With the deprecation of self-destruct in the Cancun fork, storage
deletions are no longer expected. Historically, the largest storage
deletion event in Ethereum was around 15 megabytes—manageable in memory.
In this pull request, the single destruct flag is replaced by a set of
deletion markers for individual storage slots. Each deleted storage slot
will now appear in the Storage set with a nil value.
This change will simplify a lot logics, such as storage accessing,
storage flushing, storage iteration and so on.
This change invokes the OnCodeChange hook when selfdestruct operation is performed, and a contract is removed. This is an event which can be consumed by tracers.
Tests that are crucial to for verifying the verkle testnet functions properly.
---------
Signed-off-by: Guillaume Ballet <3272758+gballet@users.noreply.github.com>
Co-authored-by: Ignacio Hagopian <jsign.uy@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Co-authored-by: Martin HS <martin@swende.se>
I think the core code should generally be agnostic about the witness and
the statedb layer should determine what elements need to be included in
the witness. Because code is accessed via `GetCode`, and
`GetCodeLength`, the statedb will always know when it needs to add that
code into the witness.
The edge case is block hashes, so we continue to add them manually in
the implementation of `BLOCKHASH`.
It probably makes sense to refactor statedb so we have a wrapped
implementation that accumulates the witness, but this is a simpler
change that makes #30078 less aggressive.
This PR moves the logging/tracing-facilities out of `*state.StateDB`,
in to a wrapping struct which implements `vm.StateDB` instead.
In most places, it is a pretty straight-forward change:
- First, hoisting the invocations from state objects up to the statedb.
- Then making the mutation-methods simply return the previous value, so
that the external logging layer could log everything.
Some internal code uses the direct object-accessors to mutate the state,
particularly in testing and in setting up state overrides, which means
that these changes are unobservable for the hooked layer. Thus, configuring
the overrides are not necessarily part of the API we want to publish.
The trickiest part about the layering is that when the selfdestructs are
finally deleted during `Finalise`, there's the possibility that someone
sent some ether to it, which is burnt at that point, and thus needs to
be logged. The hooked layer reaches into the inner layer to figure out
these events.
In package `vm`, the conversion from `state.StateDB + hooks` into a
hooked `vm.StateDB` is performed where needed.
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Changelog: https://golangci-lint.run/product/changelog/#1610
Removes `exportloopref` (no longer needed), replaces it with
`copyloopvar` which is basically the opposite.
Also adds:
- `durationcheck`
- `gocheckcompilerdirectives`
- `reassign`
- `mirror`
- `tenv`
---------
Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de>
This pull request skips the state snapshot update if the base layer is
not existent, eliminating the numerous warning logs after an unclean
shutdown.
Specifically, Geth will rewind its chain head to a historical block
after unclean shutdown and state snapshot will be remained as unchanged
waiting for recovery. During this period of time, the snapshot is unusable
and all state updates should be ignored/skipped for state snapshot update.