go-ethereum/trie/bytepool.go
Martin HS 057667151b
Some checks are pending
/ Linux Build (push) Waiting to run
/ Linux Build (arm) (push) Waiting to run
/ Windows Build (push) Waiting to run
/ Docker Image (push) Waiting to run
core/types, trie: reduce allocations in derivesha (#30747)
Alternative to #30746, potential follow-up to #30743 . This PR makes the
stacktrie always copy incoming value buffers, and reuse them internally.

Improvement in #30743:
```
goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/core/types
cpu: 12th Gen Intel(R) Core(TM) i7-1270P
                          │ derivesha.1 │             derivesha.2              │
                          │   sec/op    │    sec/op     vs base                │
DeriveSha200/stack_trie-8   477.8µ ± 2%   430.0µ ± 12%  -10.00% (p=0.000 n=10)

                          │ derivesha.1  │             derivesha.2              │
                          │     B/op     │     B/op      vs base                │
DeriveSha200/stack_trie-8   45.17Ki ± 0%   25.65Ki ± 0%  -43.21% (p=0.000 n=10)

                          │ derivesha.1 │            derivesha.2             │
                          │  allocs/op  │ allocs/op   vs base                │
DeriveSha200/stack_trie-8   1259.0 ± 0%   232.0 ± 0%  -81.57% (p=0.000 n=10)

```
This PR further enhances that: 

```
goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/core/types
cpu: 12th Gen Intel(R) Core(TM) i7-1270P
                          │ derivesha.2  │          derivesha.3           │
                          │    sec/op    │    sec/op     vs base          │
DeriveSha200/stack_trie-8   430.0µ ± 12%   423.6µ ± 13%  ~ (p=0.739 n=10)

                          │  derivesha.2  │             derivesha.3              │
                          │     B/op      │     B/op      vs base                │
DeriveSha200/stack_trie-8   25.654Ki ± 0%   4.960Ki ± 0%  -80.67% (p=0.000 n=10)

                          │ derivesha.2 │            derivesha.3             │
                          │  allocs/op  │ allocs/op   vs base                │
DeriveSha200/stack_trie-8   232.00 ± 0%   37.00 ± 0%  -84.05% (p=0.000 n=10)
```
So the total derivesha-improvement over *both PRS* is: 
```
goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/core/types
cpu: 12th Gen Intel(R) Core(TM) i7-1270P
                          │ derivesha.1 │             derivesha.3              │
                          │   sec/op    │    sec/op     vs base                │
DeriveSha200/stack_trie-8   477.8µ ± 2%   423.6µ ± 13%  -11.33% (p=0.015 n=10)

                          │  derivesha.1  │             derivesha.3              │
                          │     B/op      │     B/op      vs base                │
DeriveSha200/stack_trie-8   45.171Ki ± 0%   4.960Ki ± 0%  -89.02% (p=0.000 n=10)

                          │ derivesha.1  │            derivesha.3             │
                          │  allocs/op   │ allocs/op   vs base                │
DeriveSha200/stack_trie-8   1259.00 ± 0%   37.00 ± 0%  -97.06% (p=0.000 n=10)
```

Since this PR always copies the incoming value, it adds a little bit of
a penalty on the previous insert-benchmark, which copied nothing (always
passed the same empty slice as input) :

```
goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/trie
cpu: 12th Gen Intel(R) Core(TM) i7-1270P
             │ stacktrie.7  │          stacktrie.10          │
             │    sec/op    │    sec/op     vs base          │
Insert100K-8   88.21m ± 34%   92.37m ± 31%  ~ (p=0.280 n=10)

             │ stacktrie.7  │             stacktrie.10             │
             │     B/op     │     B/op      vs base                │
Insert100K-8   3.424Ki ± 3%   4.581Ki ± 3%  +33.80% (p=0.000 n=10)

             │ stacktrie.7 │            stacktrie.10            │
             │  allocs/op  │ allocs/op   vs base                │
Insert100K-8    22.00 ± 5%   26.00 ± 4%  +18.18% (p=0.000 n=10)
```

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
Co-authored-by: Felix Lange <fjl@twurst.com>
2025-10-01 10:05:49 +02:00

101 lines
2.8 KiB
Go

// Copyright 2024 The go-ethereum Authors
// This file is part of the go-ethereum library.
//
// The go-ethereum library is free software: you can redistribute it and/or modify
// it under the terms of the GNU Lesser General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// The go-ethereum library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU Lesser General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public License
// along with the go-ethereum library. If not, see <http://www.gnu.org/licenses/>.
package trie
// bytesPool is a pool for byte slices. It is safe for concurrent use.
type bytesPool struct {
c chan []byte
w int
}
// newBytesPool creates a new bytesPool. The sliceCap sets the capacity of
// newly allocated slices, and the nitems determines how many items the pool
// will hold, at maximum.
func newBytesPool(sliceCap, nitems int) *bytesPool {
return &bytesPool{
c: make(chan []byte, nitems),
w: sliceCap,
}
}
// get returns a slice. Safe for concurrent use.
func (bp *bytesPool) get() []byte {
select {
case b := <-bp.c:
return b
default:
return make([]byte, 0, bp.w)
}
}
// getWithSize returns a slice with specified byte slice size.
func (bp *bytesPool) getWithSize(s int) []byte {
b := bp.get()
if cap(b) < s {
return make([]byte, s)
}
return b[:s]
}
// put returns a slice to the pool. Safe for concurrent use. This method
// will ignore slices that are too small or too large (>3x the cap)
func (bp *bytesPool) put(b []byte) {
if c := cap(b); c < bp.w || c > 3*bp.w {
return
}
select {
case bp.c <- b:
default:
}
}
// unsafeBytesPool is a pool for byte slices. It is not safe for concurrent use.
type unsafeBytesPool struct {
items [][]byte
w int
}
// newUnsafeBytesPool creates a new unsafeBytesPool. The sliceCap sets the
// capacity of newly allocated slices, and the nitems determines how many
// items the pool will hold, at maximum.
func newUnsafeBytesPool(sliceCap, nitems int) *unsafeBytesPool {
return &unsafeBytesPool{
items: make([][]byte, 0, nitems),
w: sliceCap,
}
}
// Get returns a slice with pre-allocated space.
func (bp *unsafeBytesPool) get() []byte {
if len(bp.items) > 0 {
last := bp.items[len(bp.items)-1]
bp.items = bp.items[:len(bp.items)-1]
return last
}
return make([]byte, 0, bp.w)
}
// put returns a slice to the pool. This method will ignore slices that are
// too small or too large (>3x the cap)
func (bp *unsafeBytesPool) put(b []byte) {
if c := cap(b); c < bp.w || c > 3*bp.w {
return
}
if len(bp.items) < cap(bp.items) {
bp.items = append(bp.items, b)
}
}