25 commits

Author SHA1 Message Date
Felix Lange
ac85a6f254
rlp: add back Iterator.Count, with fixes (#33841)
I removed `Iterator.Count` in #33840, because it appeared to be unused
and did not provide the documented invariant: the returned count should
always be an upper bound on the number of iterations allowed by `Next`.

In order to make `Count` work, the semantics of `CountValues` have to
change to return the number of items up to and including the invalid one. I
have reviewed all callsites of `CountValues` to assess if changing this
is safe. There aren't that many, and the only call that doesn't check
the error and return is in the trie node parser,
`trie.decodeNodeUnsafe`. There, we distinguish the node type based on
the number of items, and it previously returned an error for item count
zero. In order to avoid any potential issue that could result from this
change, I'm adding an error check in that function, though it isn't
necessary.
2026-02-13 23:53:42 +01:00
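The counting rule described in #33841 can be illustrated with a toy sketch. It does not use the `rlp` package: the length-prefixed format and the names `splitItem`, `countValues`, and `iterations` are assumptions made up for this example. The point is only that a count which includes the first invalid item is never smaller than the number of items a `Next`-style loop can yield, whether that loop stops before or after the invalid item.

```go
package main

import (
	"errors"
	"fmt"
)

var errTruncated = errors.New("item exceeds input")

// splitItem returns the first item's payload and the remaining input.
// Toy format (an assumption, not RLP): one length byte, then that many bytes.
func splitItem(b []byte) (item, rest []byte, err error) {
	if len(b) == 0 {
		return nil, nil, errTruncated
	}
	n := int(b[0])
	if n > len(b)-1 {
		return nil, nil, errTruncated
	}
	return b[1 : 1+n], b[1+n:], nil
}

// countValues counts items up to and including the first invalid one.
func countValues(b []byte) (int, error) {
	count := 0
	for len(b) > 0 {
		count++
		_, rest, err := splitItem(b)
		if err != nil {
			return count, err
		}
		b = rest
	}
	return count, nil
}

// iterations reports how many valid items a Next-style loop yields
// before it reaches the end of input or an invalid item.
func iterations(b []byte) int {
	n := 0
	for len(b) > 0 {
		_, rest, err := splitItem(b)
		if err != nil {
			break
		}
		b = rest
		n++
	}
	return n
}

func main() {
	// Two valid items followed by one that claims more bytes than remain.
	input := []byte{1, 'a', 2, 'b', 'c', 5, 'x'}
	count, err := countValues(input)
	fmt.Println(count, err, iterations(input))
}
```

With this rule, a caller that preallocates `count` slots always has room for every item the loop produces.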
sashass1315
4d4883731e
trie: fix embedded node size validation (#33803)
The `decodeRef` function used `size > hashLen` to reject oversized
embedded nodes, but this incorrectly allowed nodes of exactly 32 bytes
through. The encoding side (hasher.go, stacktrie.go) consistently uses
`len(enc) < 32` to decide whether to embed a node inline, meaning nodes
of 32+ bytes are always hash-referenced. The error message itself
already stated `want size < 32`, confirming the intended threshold.
Changed `size > hashLen` to `size >= hashLen` in `decodeRef` to align
the decoding validation with the encoding logic, the Yellow Paper spec,
and the surrounding comments.
2026-02-10 22:05:39 +08:00
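The off-by-one described in #33803 can be shown in isolation. This is a minimal sketch, not the code from `trie/node.go`: `shouldEmbed`, `oldSizeCheck`, and `newSizeCheck` are hypothetical helpers, and only the comparisons (`len(enc) < 32`, `size > hashLen`, `size >= hashLen`) and the `want size < 32` wording come from the commit message.

```go
package main

import "fmt"

const hashLen = 32

// shouldEmbed follows the encoding-side rule quoted in the commit message:
// a node is embedded inline only when its encoding is shorter than 32 bytes.
func shouldEmbed(enc []byte) bool {
	return len(enc) < hashLen
}

// oldSizeCheck is the pre-fix comparison: it lets exactly 32 bytes through.
func oldSizeCheck(size int) error {
	if size > hashLen {
		return fmt.Errorf("oversized embedded node (size is %d bytes, want size < %d)", size, hashLen)
	}
	return nil
}

// newSizeCheck is the post-fix comparison: 32 bytes and above are rejected,
// which matches what the encoder can actually produce.
func newSizeCheck(size int) error {
	if size >= hashLen {
		return fmt.Errorf("oversized embedded node (size is %d bytes, want size < %d)", size, hashLen)
	}
	return nil
}

func main() {
	for _, size := range []int{31, 32, 33} {
		fmt.Printf("size=%d embed=%v old=%v new=%v\n",
			size, shouldEmbed(make([]byte, size)), oldSizeCheck(size), newSizeCheck(size))
	}
}
```

The only input on which the two checks disagree is a size of exactly 32, which the encoder never embeds.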
rjl493456442
f51870e40e
rlp, trie, triedb/pathdb: compress trienode history (#32913)
This pull request introduces a mechanism to compress trienode history by
storing only the node diffs between consecutive versions.

- For full nodes, only the modified children are recorded in the history;
- For short nodes, only the modified value is stored;

If the node type has changed, or if the node is newly created or
deleted, the entire node value is stored instead.

To mitigate the overhead of reassembling nodes from diffs during history
reads, checkpoints are introduced by periodically storing full node values.

The current checkpoint interval is set to every 16 mutations, though
this parameter may be made configurable in the future.
2026-01-08 21:58:02 +08:00
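The diff-plus-checkpoint idea from #32913 can be sketched with a toy model. Everything here is an assumption for illustration: the four-child `node`, the `history` type, and the forward-applied diffs are not the pathdb data layout, and only the interval of 16 is taken from the commit message.

```go
package main

import "fmt"

const (
	checkpointInterval = 16 // from the commit message
	width              = 4  // toy fan-out; an assumption for brevity
)

type node [width]string

// entry holds either a full node (checkpoint) or only the changed children.
type entry struct {
	full *node
	diff map[int]string
}

type history struct {
	entries []entry
	last    node
}

// record stores a full copy on every checkpointInterval-th version and
// only the modified children otherwise.
func (h *history) record(n node) {
	if len(h.entries)%checkpointInterval == 0 {
		c := n
		h.entries = append(h.entries, entry{full: &c})
	} else {
		d := make(map[int]string)
		for i := range n {
			if n[i] != h.last[i] {
				d[i] = n[i]
			}
		}
		h.entries = append(h.entries, entry{diff: d})
	}
	h.last = n
}

// read rebuilds a version from the nearest earlier checkpoint, so at most
// checkpointInterval-1 diffs are applied per lookup.
func (h *history) read(version int) node {
	start := version - version%checkpointInterval
	n := *h.entries[start].full
	for i := start + 1; i <= version; i++ {
		for idx, v := range h.entries[i].diff {
			n[idx] = v
		}
	}
	return n
}

// buildDemo records `versions` successive states, changing one child each time.
func buildDemo(versions int) (*history, []node) {
	h := &history{}
	var cur node
	want := make([]node, 0, versions)
	for v := 0; v < versions; v++ {
		cur[v%width] = fmt.Sprintf("v%d", v)
		h.record(cur)
		want = append(want, cur)
	}
	return h, want
}

func main() {
	h, want := buildDemo(40)
	fmt.Println(h.read(37) == want[37])
}
```

The checkpoint bounds the reassembly cost: without it, reading version 37 would require replaying every diff since version 0.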
rjl493456442
23da91f73b
trie: reduce the memory allocation in trie hashing (#31902)
This pull request optimizes trie hashing by reducing memory allocation
overhead. Specifically:

- define a fullNodeEncoder pool to reuse encoders and avoid memory
allocations.

- simplify the encoding logic for shortNode and fullNode by getting rid
of the Go interfaces.
2025-08-01 10:23:23 +08:00
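The encoder pooling mentioned in #31902 follows a common Go pattern. The sketch below is an assumption-laden illustration, not the `fullNodeEncoder` from the trie package: the `encoder` type, the length-prefixed layout, and the use of SHA-256 as a stand-in hash are all made up for this example.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
)

// encoder owns a scratch buffer that survives between uses.
type encoder struct {
	buf []byte
}

// encoderPool hands out reusable encoders so the scratch buffer is not
// reallocated for every node.
var encoderPool = sync.Pool{
	New: func() interface{} { return new(encoder) },
}

// hashChildren encodes the children into a pooled buffer and hashes it.
// Toy encoding: one length byte per child (children must be < 256 bytes).
func hashChildren(children [][]byte) [32]byte {
	e := encoderPool.Get().(*encoder)
	defer func() {
		e.buf = e.buf[:0] // keep capacity, drop contents
		encoderPool.Put(e)
	}()
	for _, c := range children {
		e.buf = append(e.buf, byte(len(c)))
		e.buf = append(e.buf, c...)
	}
	return sha256.Sum256(e.buf)
}

func main() {
	h := hashChildren([][]byte{[]byte("ab"), []byte("c")})
	fmt.Printf("%x\n", h)
}
```

Resetting the buffer length before returning the encoder to the pool is what keeps results independent across calls while still reusing the allocation.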
rjl493456442
4dfec7e83e
trie: optimize memory allocation (#30932)
This pull request removes the node copy operation to reduce memory
allocation. The key changes are:

**(a) Use `decodeNodeUnsafe` for decoding nodes retrieved from the trie
node reader**

In the current implementation of the MPT, once a trie node blob is
retrieved, it is passed to `decodeNode` for decoding. However,
`decodeNode` assumes the supplied byte slice might be mutated later, so
it performs a deep copy internally before parsing the node.

The node reader is implemented by the path database and the hash
database, both of which guarantee the immutability of the returned
byte slice. By restricting the node reader interface to explicitly
guarantee that the returned byte slice will not be modified, we can
safely replace `decodeNode` with `decodeNodeUnsafe`. This eliminates a
redundant byte copy during each node resolution.

**(b) Modify the trie in place**

In the current implementation of the MPT, a copy of a trie node is
created before any modifications are made. These modifications include:
- Node resolution: Converting the value from a hash to the actual node.
- Node hashing: Storing the computed hash in the node's cache.
- Node commit: Replacing the children with their hashes.
- Structural changes: For example, adding a new child to a fullNode or
replacing a child of a shortNode.

This mechanism ensures that modifications only affect the live tree,
leaving all previously created copies unaffected.

Unfortunately, this property leads to a huge memory allocation
requirement. For example, if we modify a fullNode n times, the node
is copied n times.

In this pull request, all trie modifications are made in place. To
make sure all previously created copies are unaffected, the `Copy`
function now deep-copies all the live nodes rather than only the
root node.

With this change, while the `Copy` function becomes more expensive, it's
totally acceptable as it's not a frequently used one. For the normal
trie operations (Get, GetNode, Hash, Commit, Insert, Delete), the node
copy is not required anymore.
2025-03-25 14:59:44 +01:00
Martin HS
d3cc618951
trie: reduce allocations in stacktrie (#30743)
This PR uses various tweaks and tricks to make the stacktrie near
alloc-free.

```
[user@work go-ethereum]$ benchstat stacktrie.1 stacktrie.7
goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/trie
cpu: 12th Gen Intel(R) Core(TM) i7-1270P
             │ stacktrie.1  │             stacktrie.7              │
             │    sec/op    │    sec/op     vs base                │
Insert100K-8   106.97m ± 8%   88.21m ± 34%  -17.54% (p=0.000 n=10)

             │   stacktrie.1    │             stacktrie.7              │
             │       B/op       │     B/op      vs base                │
Insert100K-8   13199.608Ki ± 0%   3.424Ki ± 3%  -99.97% (p=0.000 n=10)

             │  stacktrie.1   │             stacktrie.7             │
             │   allocs/op    │ allocs/op   vs base                 │
Insert100K-8   553428.50 ± 0%   22.00 ± 5%  -100.00% (p=0.000 n=10)
```
Also improves derivesha:
```
goos: linux
goarch: amd64
pkg: github.com/ethereum/go-ethereum/core/types
cpu: 12th Gen Intel(R) Core(TM) i7-1270P
                          │ derivesha.1 │             derivesha.2              │
                          │   sec/op    │    sec/op     vs base                │
DeriveSha200/stack_trie-8   477.8µ ± 2%   430.0µ ± 12%  -10.00% (p=0.000 n=10)

                          │ derivesha.1  │             derivesha.2              │
                          │     B/op     │     B/op      vs base                │
DeriveSha200/stack_trie-8   45.17Ki ± 0%   25.65Ki ± 0%  -43.21% (p=0.000 n=10)

                          │ derivesha.1 │            derivesha.2             │
                          │  allocs/op  │ allocs/op   vs base                │
DeriveSha200/stack_trie-8   1259.0 ± 0%   232.0 ± 0%  -81.57% (p=0.000 n=10)

```

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
2025-01-23 10:17:12 +01:00
rjl493456442
bbcb5ea37b
core, trie: rework trie database (#26813)
* core, trie: rework trie database

* trie: fix comment
2023-04-24 10:38:52 +03:00
rjl493456442
a1b8892384
trie: improve node rlp decoding performance (#25357)
This avoids copying the input []byte while decoding trie nodes. In most
cases, particularly when the input slice is provided by the underlying
database, this optimization is safe to use.

For cases where the origin of the input slice is unclear, the copying version
is retained. The new code performs better even when the input must be
copied, because it is now only copied once in decodeNode.
2022-08-19 00:39:47 +02:00
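The two decoding variants described in #25357 differ only in who owns the bytes. The following is a minimal sketch under stated assumptions: the `shortNode` type, the one-byte key-length format, and the names `decodeNoCopy` and `decodeCopy` are hypothetical and do not reproduce the trie package's decoder.

```go
package main

import (
	"errors"
	"fmt"
)

// shortNode holds sub-slices of whatever buffer it was decoded from.
type shortNode struct {
	key, val []byte
}

// decodeNoCopy parses buf without copying; the result aliases buf, so the
// caller must guarantee buf is never modified afterwards.
// Toy format (an assumption): one key-length byte, key bytes, then the value.
func decodeNoCopy(buf []byte) (shortNode, error) {
	if len(buf) == 0 || int(buf[0]) > len(buf)-1 {
		return shortNode{}, errors.New("malformed node")
	}
	k := int(buf[0])
	return shortNode{key: buf[1 : 1+k], val: buf[1+k:]}, nil
}

// decodeCopy is the safe variant for inputs of unclear origin: it copies
// the input exactly once, then reuses the no-copy parser.
func decodeCopy(buf []byte) (shortNode, error) {
	owned := make([]byte, len(buf))
	copy(owned, buf)
	return decodeNoCopy(owned)
}

func main() {
	buf := []byte{2, 'k', '1', 'v', 'a', 'l'}
	aliased, _ := decodeNoCopy(buf)
	owned, _ := decodeCopy(buf)
	buf[3] = 'X' // mutate the input after decoding
	fmt.Printf("%s %s\n", aliased.val, owned.val)
}
```

The copying variant stays correct when the caller mutates the input, at the cost of one allocation per decode rather than one per field.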
Qian Bin
65ed1a6871
rlp, trie: faster trie node encoding (#24126)
This change speeds up trie hashing and all other activities that require
RLP encoding of trie nodes by approximately 20%. The speedup is achieved by
avoiding reflection overhead during node encoding.

The interface type trie.node now contains a method 'encode' that works with
rlp.EncoderBuffer. Management of EncoderBuffers is left to calling code.
trie.hasher, which is pooled to avoid allocations, now maintains an
EncoderBuffer. This means memory resources related to trie node encoding
are tied to the hasher pool.

Co-authored-by: Felix Lange <fjl@twurst.com>
2022-03-09 14:45:17 +01:00
Péter Szilágyi
91eec1251c
cmd, core, eth, trie: get rid of trie cache generations (#19262)
* cmd, core, eth, trie: get rid of trie cache generations

* core, trie: get rid of remainder of cache gen boilerplate
2019-03-14 15:25:12 +02:00
Oleg Kovalov
cf05ef9106 p2p, swarm, trie: avoid copying slices in loops (#17265) 2018-08-07 13:56:40 +03:00
Péter Szilágyi
d926bf2c7e trie: cache collapsed tries node, not rlp blobs (#16876)
The current trie memory database/cache that we do pruning on stores
trie nodes as binary RLP-encoded blobs, and also stores the node
relationships/references for GC purposes. However, most trie nodes
(everything apart from a value node) are in essence just a
collection of references.

This PR replaces the RLP-encoded trie blobs with
collapsed-but-not-serialized trie nodes. This permits most of the
references to be recovered from within the node data structure,
avoiding the need to track them a second time (which is expensive
memory-wise).
2018-06-21 11:28:05 +02:00
xincaosu
cfe8f5fd94 trie: remove unused buf parameter (#16583) 2018-04-27 12:45:02 +03:00
Felix Lange
f958d7d482 trie: rework and document key encoding
'encode' and 'decode' are meaningless because the code deals with three
encodings. Document the encodings and give a name to each one.
2017-04-25 02:14:31 +02:00
Felix Lange
177cab5fe7 trie: ensure resolved nodes stay loaded
Commit 40cdcf1183 broke the optimisation which kept nodes resolved
during Get in the trie. The decoder assigned cache generation 0
unconditionally, causing resolved nodes to get flushed on Commit.

This commit fixes it and adds two tests.
2016-10-18 04:57:47 +02:00
Felix Lange
40cdcf1183 trie, core/state: improve memory usage and performance (#3135)
* trie: store nodes as pointers

This avoids memory copies when unwrapping node interface values.

name      old time/op  new time/op  delta
Get        388ns ± 8%   215ns ± 2%  -44.56%  (p=0.000 n=15+15)
GetDB      363ns ± 3%   202ns ± 2%  -44.21%  (p=0.000 n=15+15)
UpdateBE  1.57µs ± 2%  1.29µs ± 3%  -17.80%  (p=0.000 n=13+15)
UpdateLE  1.92µs ± 2%  1.61µs ± 2%  -16.25%  (p=0.000 n=14+14)
HashBE    2.16µs ± 6%  2.18µs ± 6%     ~     (p=0.436 n=15+15)
HashLE    7.43µs ± 3%  7.21µs ± 3%   -2.96%  (p=0.000 n=15+13)

* trie: close temporary databases in GetDB benchmark

* trie: don't keep []byte from DB load around

Nodes decoded from a DB load kept hashes and values as sub-slices of
the DB value. This can be a problem because loading from leveldb often
returns []byte with a cap that's larger than necessary, increasing
memory usage.

* trie: unload old cached nodes

* trie, core/state: use cache unloading for account trie

* trie: use explicit private flags (fixes Go 1.5 reflection issue).

* trie: fixup cachegen overflow at request of nick

* core/state: rename journal size constant
2016-10-14 19:04:33 +03:00
Péter Szilágyi
748d1c171d core, core/state, trie: enterprise hand-tuned multi-level caching 2016-05-26 16:33:09 +03:00
Felix Lange
565d9f2306 core, trie: new trie 2015-09-22 22:53:49 +02:00
Felix Lange
bfbcfbe4a9 all: fix license headers one more time
I forgot to update one instance of "go-ethereum" in commit 3f047be5a.
2015-07-23 18:35:11 +02:00
Felix Lange
3f047be5aa all: update license headers to distinguish GPL/LGPL
All code outside of cmd/ is licensed as LGPL. The headers
now reflect this by calling the whole work "the go-ethereum library".
2015-07-22 18:51:45 +02:00
Felix Lange
ea54283b30 all: update license information 2015-07-07 14:12:44 +02:00
Jeffrey Wilcke
0a1ff68c11 trie: dirty tracking 2015-07-04 02:51:36 +02:00
obscuren
3c7181d28f Fixed a copy issue in the trie which could cause a consensus failure 2015-02-02 19:58:34 -08:00
obscuren
9022f5034f default values removed 2015-01-29 23:17:43 +01:00
obscuren
db4aaedcbd Moved ptrie => trie. Removed old trie 2015-01-08 11:47:04 +01:00
Renamed from ptrie/node.go