Commit graph

2 commits

Author SHA1 Message Date
Martin HS
057667151b
core/types, trie: reduce allocations in derivesha (#30747)
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
Alternative to #30746, potential follow-up to #30743 . This PR makes the
stacktrie always copy incoming value buffers, and reuse them internally.

Improvement in #30743:
```
goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/core/types
cpu: 12th Gen Intel(R) Core(TM) i7-1270P
                          │ derivesha.1 │             derivesha.2              │
                          │   sec/op    │    sec/op     vs base                │
DeriveSha200/stack_trie-8   477.8µ ± 2%   430.0µ ± 12%  -10.00% (p=0.000 n=10)

                          │ derivesha.1  │             derivesha.2              │
                          │     B/op     │     B/op      vs base                │
DeriveSha200/stack_trie-8   45.17Ki ± 0%   25.65Ki ± 0%  -43.21% (p=0.000 n=10)

                          │ derivesha.1 │            derivesha.2             │
                          │  allocs/op  │ allocs/op   vs base                │
DeriveSha200/stack_trie-8   1259.0 ± 0%   232.0 ± 0%  -81.57% (p=0.000 n=10)

```
This PR further enhances that: 

```
goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/core/types
cpu: 12th Gen Intel(R) Core(TM) i7-1270P
                          │ derivesha.2  │          derivesha.3           │
                          │    sec/op    │    sec/op     vs base          │
DeriveSha200/stack_trie-8   430.0µ ± 12%   423.6µ ± 13%  ~ (p=0.739 n=10)

                          │  derivesha.2  │             derivesha.3              │
                          │     B/op      │     B/op      vs base                │
DeriveSha200/stack_trie-8   25.654Ki ± 0%   4.960Ki ± 0%  -80.67% (p=0.000 n=10)

                          │ derivesha.2 │            derivesha.3             │
                          │  allocs/op  │ allocs/op   vs base                │
DeriveSha200/stack_trie-8   232.00 ± 0%   37.00 ± 0%  -84.05% (p=0.000 n=10)
```
So the total derivesha-improvement over *both PRS* is: 
```
goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/core/types
cpu: 12th Gen Intel(R) Core(TM) i7-1270P
                          │ derivesha.1 │             derivesha.3              │
                          │   sec/op    │    sec/op     vs base                │
DeriveSha200/stack_trie-8   477.8µ ± 2%   423.6µ ± 13%  -11.33% (p=0.015 n=10)

                          │  derivesha.1  │             derivesha.3              │
                          │     B/op      │     B/op      vs base                │
DeriveSha200/stack_trie-8   45.171Ki ± 0%   4.960Ki ± 0%  -89.02% (p=0.000 n=10)

                          │ derivesha.1  │            derivesha.3             │
                          │  allocs/op   │ allocs/op   vs base                │
DeriveSha200/stack_trie-8   1259.00 ± 0%   37.00 ± 0%  -97.06% (p=0.000 n=10)
```

Since this PR always copies the incoming value, it adds a little bit of
a penalty on the previous insert-benchmark, which copied nothing (always
passed the same empty slice as input) :

```
goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/trie
cpu: 12th Gen Intel(R) Core(TM) i7-1270P
             │ stacktrie.7  │          stacktrie.10          │
             │    sec/op    │    sec/op     vs base          │
Insert100K-8   88.21m ± 34%   92.37m ± 31%  ~ (p=0.280 n=10)

             │ stacktrie.7  │             stacktrie.10             │
             │     B/op     │     B/op      vs base                │
Insert100K-8   3.424Ki ± 3%   4.581Ki ± 3%  +33.80% (p=0.000 n=10)

             │ stacktrie.7 │            stacktrie.10            │
             │  allocs/op  │ allocs/op   vs base                │
Insert100K-8    22.00 ± 5%   26.00 ± 4%  +18.18% (p=0.000 n=10)
```

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Co-authored-by: Felix Lange <fjl@twurst.com>
2025-10-01 10:05:49 +02:00
Delweng
aecf3f9579
internal/blocktest: add package for shared test code (#27270) 2023-07-11 14:57:02 +02:00