I recently went on a longer flight and started profiling the geth block
production pipeline.
This PR contains a bunch of individual fixes split into separate
commits.
I can drop some if necessary.
Benchmarking is not super easy, the benchmark I wrote is a bit
non-deterministic.
I will try to write a better benchmark later
```
goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/miner
cpu: Intel(R) Core(TM) Ultra 7 155U
│ /tmp/old.txt │ /tmp/new.txt │
│ sec/op │ sec/op vs base │
BuildPayload-14 141.5µ ± 3% 146.0µ ± 6% ~ (p=0.346 n=200)
│ /tmp/old.txt │ /tmp/new.txt │
│ B/op │ B/op vs base │
BuildPayload-14 188.2Ki ± 4% 177.4Ki ± 4% -5.71% (p=0.018 n=200)
│ /tmp/old.txt │ /tmp/new.txt │
│ allocs/op │ allocs/op vs base │
BuildPayload-14 2.703k ± 4% 2.453k ± 5% -9.25% (p=0.000 n=200)
```
This PR reverts a part of changes brought by https://github.com/ethereum/go-ethereum/pull/33281/changes
Specifically, read-only protection should always be enforced at the opcode level,
regardless of whether the check has already been performed during gas metering.
It should act as a gatekeeper, otherwise, it is easy to introduce errors by adding
new gas measurement logic without consistently applying the read-only protection.
The core part of this PR that we need to adopt is to move the code and
nonce change hook invocations to occur at tx finalization, instead of
when the selfdestruct opcode is called.
Additionally:
* remove `SelfDestruct6780` now that it is essentially the same as
`SelfDestruct` just gated by `is new contract`
* don't duplicate `BalanceIncreaseSelfdestruct` (transfer to recipient
of selfdestruct) in the hooked statedb and in the opcode handler for the
selfdestruct opcode.
* balance is burned immediately when the beneficiary of the selfdestruct
is the sender, and the contract was created in the same transaction.
Previously we emit two balance increases to the recipient (see above
point), and a balance decrease from the sender.
---------
Co-authored-by: Sina Mahmoodi <itz.s1na@gmail.com>
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Co-authored-by: lightclient <lightclient@protonmail.com>
This PR causes execution to terminate at the gas handler in the case of
sstore/call if they are invoked in a static execution context.
This aligns the behavior with EIP 7928 by ensuring that we don't record
any state reads in the access list from an SSTORE/CALL in this
circumstance.
---------
Co-authored-by: lightclient <lightclient@protonmail.com>
The EIP says to increment PC by 2 _instead of_ the standard increment by
1. The opcode handlers added in #33095 result in incrementing PC by 3,
because they ignored the increment already present in `interpreter.go`.
Does this need to be better specified in the EIP? I've added a [new test
case](https://github.com/ethereum/EIPs/pull/10859) for it anyway.
Found by @0xriptide.
EIP-8024: Backward compatible SWAPN, DUPN, EXCHANGE
Introduces additional instructions for manipulating the stack which
allow accessing the stack at higher depths. This is an initial implementation
of the EIP, which is still in Review stage.
This PR creates a global hasher pool that can be used by all packages.
It also removes a bunch of the package local pools.
It also updates a few locations to use available hashers or the global
hashing pool to reduce allocations all over the codebase.
This change should reduce global allocation count by ~1%
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
During my benchmarks on Holesky, around 10% of all CPU time was spent in
PUSH2
```
ROUTINE ======================== github.com/ethereum/go-ethereum/core/vm.newFrontierInstructionSet.makePush.func1 in github.com/ethereum/go-ethereum/core/vm/instructions.go
16.38s 20.35s (flat, cum) 10.31% of Total
740ms 740ms 976: return func(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) {
. . 977: var (
40ms 40ms 978: codeLen = len(scope.Contract.Code)
970ms 970ms 979: start = min(codeLen, int(*pc+1))
200ms 200ms 980: end = min(codeLen, start+pushByteSize)
. . 981: )
670ms 2.39s 982: a := new(uint256.Int).SetBytes(scope.Contract.Code[start:end])
. . 983:
. . 984: // Missing bytes: pushByteSize - len(pushData)
410ms 410ms 985: if missing := pushByteSize - (end - start); missing > 0 {
. . 986: a.Lsh(a, uint(8*missing))
. . 987: }
12.69s 14.94s 988: scope.Stack.push2(*a)
10ms 10ms 989: *pc += size
650ms 650ms 990: return nil, nil
. . 991: }
. . 992:}
```
Which is quite crazy. We have a handwritten encoder for PUSH1 already,
this PR adds one for PUSH2.
PUSH2 is the second most used opcode as shown here:
https://gist.github.com/shemnon/fb9b292a103abb02d98d64df6fbd35c8 since
it is used by solidity quite significantly. Its used ~20 times as much
as PUSH20 and PUSH32.
# Benchmarks
```
BenchmarkPush/makePush-14 94196547 12.27 ns/op 0 B/op 0 allocs/op
BenchmarkPush/push-14 429976924 2.829 ns/op 0 B/op 0 allocs/op
```
---------
Co-authored-by: jwasinger <j-wasinger@hotmail.com>
This PR does a few things including:
- Remove `ContractRef` interface
- Remove `vm.AccountRef` which implements `ContractRef` interface
- Maintain the `jumpDests` struct in EVM for sharing between call frames
- Simplify the delegateCall context initialization
Here we add some more changes for live tracing API v1.1:
- Hook `OnSystemCallStartV2` was introduced with `VMContext` as parameter.
- Hook `OnBlockHashRead` was introduced.
- `GetCodeHash` was added to the state interface
- The new `WrapWithJournal` construction helps with tracking EVM reverts in the tracer.
---------
Co-authored-by: Felix Lange <fjl@twurst.com>
I think the core code should generally be agnostic about the witness and
the statedb layer should determine what elements need to be included in
the witness. Because code is accessed via `GetCode`, and
`GetCodeLength`, the statedb will always know when it needs to add that
code into the witness.
The edge case is block hashes, so we continue to add them manually in
the implementation of `BLOCKHASH`.
It probably makes sense to refactor statedb so we have a wrapped
implementation that accumulates the witness, but this is a simpler
change that makes #30078 less aggressive.
Looking at the cpu profile of a burntpix benchmark, I noticed that a lot
of time was spent in gas-used, in the interpreter loop. It's an actual
call (not inlined), which explicitly wants to be ignored by tracing
("tracing.GasChangeIgnored"), so it can be safely and simply inlined.
The other change is in `pushX`. These also do a call to
`common.RightPadBytes`. I replaced that by a doing a corresponding `Lsh`
on the `u256` if needed. Note: it's needed only to make the stack output
look right, for fuzzers. It technically doesn't matter what we put
there: if code ends on a pushdata immediate, nothing will consume the
stack element. We could just as well just ignore it, if we didn't care
about fuzzers (which I do).
Seems quite a lot faster on burntpix, according to my runs.
This PR:
```
EVM gas used: 5642735088
execution time: 34.84609475s
allocations: 915683
allocated bytes: 175334088
```
```
EVM gas used: 5642735088
execution time: 36.671958278s
allocations: 915701
allocated bytes: 175340528
```
Master
```
EVM gas used: 5642735088
execution time: 49.349209526s
allocations: 915684
allocated bytes: 175333368
```
```
EVM gas used: 5642735088
execution time: 46.581006598s
allocations: 915681
allocated bytes: 175330728
```
---------
Co-authored-by: Sina M <1591639+s1na@users.noreply.github.com>
Co-authored-by: Felix Lange <fjl@twurst.com>
This PR moves the logging/tracing-facilities out of `*state.StateDB`,
in to a wrapping struct which implements `vm.StateDB` instead.
In most places, it is a pretty straight-forward change:
- First, hoisting the invocations from state objects up to the statedb.
- Then making the mutation-methods simply return the previous value, so
that the external logging layer could log everything.
Some internal code uses the direct object-accessors to mutate the state,
particularly in testing and in setting up state overrides, which means
that these changes are unobservable for the hooked layer. Thus, configuring
the overrides are not necessarily part of the API we want to publish.
The trickiest part about the layering is that when the selfdestructs are
finally deleted during `Finalise`, there's the possibility that someone
sent some ether to it, which is burnt at that point, and thus needs to
be logged. The hooked layer reaches into the inner layer to figure out
these events.
In package `vm`, the conversion from `state.StateDB + hooks` into a
hooked `vm.StateDB` is performed where needed.
---------
Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Consistently use `uint64` for indices in `Memory` and drop lots of type
conversions from `uint64` to `int64`.
---------
Co-authored-by: lmittmann <lmittmann@users.noreply.github.com>
* all: add stateless verifications
* all: simplify witness and integrate it into live geth
---------
Co-authored-by: Péter Szilágyi <peterke@gmail.com>
Here we add a Go API for running tracing plugins within the main block import process.
As an advanced user of geth, you can now create a Go file in eth/tracers/live/, and within
that file register your custom tracer implementation. Then recompile geth and select your tracer
on the command line. Hooks defined in the tracer will run whenever a block is processed.
The hook system is defined in package core/tracing. It uses a struct with callbacks, instead of
requiring an interface, for several reasons:
- We plan to keep this API stable long-term. The core/tracing hook API does not depend on
on deep geth internals.
- There are a lot of hooks, and tracers will only need some of them. Using a struct allows you
to implement only the hooks you want to actually use.
All existing tracers in eth/tracers/native have been rewritten to use the new hook system.
This change breaks compatibility with the vm.EVMLogger interface that we used to have.
If you are a user of vm.EVMLogger, please migrate to core/tracing, and sorry for breaking
your stuff. But we just couldn't have both the old and new tracing APIs coexist in the EVM.
---------
Co-authored-by: Matthieu Vachon <matthieu.o.vachon@gmail.com>
Co-authored-by: Delweng <delweng@gmail.com>
Co-authored-by: Martin HS <martin@swende.se>
This change makes use of uin256 to represent balance in state. It touches primarily upon statedb, stateobject and state processing, trying to avoid changes in transaction pools, core types, rpc and tracers.
Adding a space beween function opOrigin() and opcCaller() in instruciton.go.
Adding a space beween function opkeccak256() and opAddress() in instruciton.go.
EIP-6780: SELFDESTRUCT only in same transaction
> SELFDESTRUCT will recover all funds to the caller but not delete the account, except when called in the same transaction as creation
---------
Co-authored-by: Martin Holst Swende <martin@swende.se>
This PR removes the Debug field from vmconfig, making it so that if a tracer is set, debug=true is implied.
---------
Co-authored-by: 0xTylerHolmes <tyler@ethereum.org>
Co-authored-by: Sina Mahmoodi <1591639+s1na@users.noreply.github.com>
This change improves reusability of the EVM struct. Two methods are added:
- SetBlockContext(...)
- SetTracer(...)
Other attributes like the TransactionContext and the StateDB can already be updated.
BlockContext and Tracer are partially not updateable right now. This change fixes it and
opens the potential to reuse an EVM struct in more ways.
Co-authored-by: Felix Lange <fjl@twurst.com>
This PR changes the API so that uint64 is used for fork timestamps.
It's a good choice because types.Header also uses uint64 for time.
Co-authored-by: Felix Lange <fjl@twurst.com>
Implementation of https://eips.ethereum.org/EIPS/eip-3860, limit and meter initcode. This PR enables EIP-3860 as part of the Shanghai fork.
Co-authored-by: lightclient@protonmail.com <lightclient@protonmail.com>
Co-authored-by: Martin Holst Swende <martin@swende.se>
Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de>
This changes the CI / release builds to use the latest Go version. It also
upgrades golangci-lint to a newer version compatible with Go 1.19.
In Go 1.19, godoc has gained official support for links and lists. The
syntax for code blocks in doc comments has changed and now requires a
leading tab character. gofmt adapts comments to the new syntax
automatically, so there are a lot of comment re-formatting changes in this
PR. We need to apply the new format in order to pass the CI lint stage with
Go 1.19.
With the linter upgrade, I have decided to disable 'gosec' - it produces
too many false-positive warnings. The 'deadcode' and 'varcheck' linters
have also been removed because golangci-lint warns about them being
unmaintained. 'unused' provides similar coverage and we already have it
enabled, so we don't lose much with this change.
* core: implement eip-4399 random opcode
* core: make vmconfig threadsafe
* core: miner: pass vmConfig by value not reference
* all: enable 4399 by Rules
* core: remove diff (f)
* tests: set proper difficulty (f)
* smaller diff (f)
* eth/catalyst: nit
* core: make RANDOM a pointer which is only set post-merge
* cmd/evm/internal/t8ntool: fix t8n tracing of 4399
* tests: set difficulty
* cmd/evm/internal/t8ntool: check that baserules are london before applying the merge chainrules
* core/vm: Remove interpreter loop interruption check
* core/vm: Unit test for interpreter loop interruption
* core/vm: Check for interpreter loop abort on every jump
* core/vm: Move interpreter.ReadOnly check into the opcode implementations
Also remove the same check from the interpreter inner loop.
* core/vm: Remove obsolete operation.writes flag
* core/vm: Capture fault states in logger
Co-authored-by: Martin Holst Swende <martin@swende.se>
* core/vm: Remove panic added for testing
Co-authored-by: Martin Holst Swende <martin@swende.se>
* core/vm: break loop on any error
* core/vm: move ErrExecutionReverted to opRevert()
* core/vm: use "stop token" to stop the loop
* core/vm: unconditionally pc++ in the loop
* core/vm: set return data in instruction impls
This change introduces 2 new optional methods; `enter()` and `exit()` for js tracers, and makes `step()` optiona. The two new methods are invoked when entering and exiting a call frame (but not invoked for the outermost scope, which has it's own methods). Currently these are the data fields passed to each of them:
enter: type (opcode), from, to, input, gas, value
exit: output, gasUsed, error
The PR also comes with a re-write of the callTracer. As a backup we keep the previous tracing script under the name `callTracerLegacy`. Behaviour of both tracers are equivalent for the most part, although there are some small differences (improvements), where the new tracer is more correct / has more information.