mirror of
https://github.com/ethereum/go-ethereum.git
synced 2026-06-12 01:41:36 +00:00
Introduce the codec and on-disk blob format for the bintrie flat-state layer. This commit only defines the types; the codec is NOT wired into pathdb.Database.New yet (that happens in a later commit once the leaf-production hook in binaryHasher and the stateUpdate wiring are in place).

Three pieces:

1. trie/bintrie/pack.go

Canonical PackBasicData / UnpackBasicData helpers that encode an account's (codeSize, nonce, balance) into the 32-byte BasicData leaf defined by EIP-7864. Preserves the existing BinaryTrie.UpdateAccount layout byte-for-byte (4-byte code_size at offset 4 rather than the spec's 3-byte field at offset 5; any realistic code size has byte 4 always zero, so the two encodings are bit-equivalent in practice). BinaryTrie.UpdateAccount is refactored to delegate to PackBasicData so the flat-state codec can produce a bit-identical BasicData encoding without duplicating the layout logic.

2. triedb/pathdb/stem_blob.go

Packed encoding of the populated (offset, value) pairs at a bintrie stem. A stem can hold up to 256 offsets per EIP-7864, but in practice only a handful are set; the layout is a 32-byte bitmap followed by N 32-byte values in ascending offset order, where N = popcount of the bitmap. Empty stems encode to nil so the caller knows to delete the on-disk key rather than write a zero-length value. Provides encodeStemBlob / decodeStemBlob / extractStemOffset / mergeStemBlob and a stemBuilder type for accumulating writes. The tombstone convention (32 zero bytes = "present with zero", as used by DeleteStorage) is preserved.

11 unit tests cover: empty blob, BasicData+CodeHash roundtrip, all 256 offsets populated, sparse high offsets, set/clear roundtrip, load-from-existing-blob RMW, merge helper, merge-to-empty, tombstone zero bytes, malformed input detection, bitmap rank sanity.

3. triedb/pathdb/flat_codec_bintrie.go

bintrieFlatCodec implements flatStateCodec over the stem-blob layout. Unlike merkleFlatCodec it is stateful: it holds an ethdb.KeyValueReader reference used by applyWrites to read the existing stem blob before merging in new writes. ethdb.Batch is write-only, so the batch passed to Write* cannot be used to fetch current state.

The pre-aggregation requirement is documented explicitly: within a single flush, the caller must NOT issue two Write* calls targeting the same stem, because the RMW read comes from the store (not the in-flight batch). Commit 8 of the bintrie flat-state plan restructures writeStates to pre-aggregate per-stem writes so callers don't have to handle this manually.

Cache keys are prefix-disambiguated with a one-byte 0x01 to keep bintrie stem lookups disjoint from merkle 32-byte account keys and 64-byte storage keys in the shared clean-state fastcache. SplitMarker is a single-tier (stem-only) format, not the merkle two-tier (account, account+storage) format.

7 unit tests cover: account roundtrip, storage roundtrip, multiple writes to the same stem, DeleteAccount preserving unrelated offsets, DeleteStorage removing the final offset collapsing the key, cache key disjointness from merkle, SplitMarker semantics.

The codec is not dispatched by anything yet; MPT continues through the merkle codec and bintrie mode still runs on the (soon-to-be-replaced) keccak-shaped path until Commit 10 wires things up.
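The stem-blob layout described above (a 32-byte bitmap followed by the populated 32-byte values in ascending offset order) can be sketched as follows. This is only an illustration, not the code in stem_blob.go: the names packStem and lookupOffset are hypothetical, and the bit ordering inside the bitmap (offset i lives in byte i/8 under mask 1<<(i%8)) is an assumption of the sketch.

```go
package main

import (
	"errors"
	"fmt"
	"math/bits"
)

const (
	bitmapSize = 32 // one bit per possible offset (256 offsets)
	valueSize  = 32 // every stored value is 32 bytes
)

// packStem (hypothetical) encodes the populated offsets of one stem.
// An empty stem encodes to nil so the caller can delete the key instead
// of writing a zero-length value.
func packStem(vals map[byte][valueSize]byte) []byte {
	if len(vals) == 0 {
		return nil
	}
	blob := make([]byte, bitmapSize, bitmapSize+valueSize*len(vals))
	for i := 0; i < 256; i++ { // ascending offset order
		v, ok := vals[byte(i)]
		if !ok {
			continue
		}
		blob[i/8] |= 1 << (i % 8) // assumed bit ordering
		blob = append(blob, v[:]...)
	}
	return blob
}

// lookupOffset (hypothetical) returns the value stored at offset, using
// the bitmap rank (set bits below offset) to find its slot.
func lookupOffset(blob []byte, offset byte) ([]byte, bool, error) {
	if len(blob) == 0 {
		return nil, false, nil
	}
	if len(blob) < bitmapSize {
		return nil, false, errors.New("stem blob shorter than bitmap")
	}
	n := 0
	for _, b := range blob[:bitmapSize] {
		n += bits.OnesCount8(b)
	}
	if len(blob) != bitmapSize+valueSize*n {
		return nil, false, errors.New("stem blob length does not match popcount")
	}
	mask := byte(1) << (offset % 8)
	if blob[offset/8]&mask == 0 {
		return nil, false, nil
	}
	rank := 0
	for _, b := range blob[:offset/8] {
		rank += bits.OnesCount8(b)
	}
	rank += bits.OnesCount8(blob[offset/8] & (mask - 1))
	start := bitmapSize + valueSize*rank
	return blob[start : start+valueSize], true, nil
}

func main() {
	blob := packStem(map[byte][valueSize]byte{0: {0xaa}, 200: {0xbb}})
	v, ok, err := lookupOffset(blob, 200)
	fmt.Println(len(blob), ok, err, v[0] == 0xbb)
}
```

Note that a value of 32 zero bytes still sets its bitmap bit in this sketch, which is consistent with the tombstone convention ("present with zero") mentioned in the commit message.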
78 lines
3.3 KiB
Go
// Copyright 2026 go-ethereum Authors
// This file is part of the go-ethereum library.
//
// The go-ethereum library is free software: you can redistribute it and/or modify
// it under the terms of the GNU Lesser General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// The go-ethereum library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU Lesser General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public License
// along with the go-ethereum library. If not, see <http://www.gnu.org/licenses/>.

package bintrie

import (
	"encoding/binary"

	"github.com/holiman/uint256"
)

// PackBasicData encodes an account's basic metadata (code size, nonce,
// balance) into the 32-byte BasicData leaf value defined by EIP-7864.
//
// The canonical spec layout is:
//
//	byte 0        version (currently always 0, left as the implicit zero)
//	bytes 1..4    reserved
//	bytes 5..7    code_size (big-endian, 3 bytes, max 2^24-1)
//	bytes 8..15   nonce (big-endian, 8 bytes)
//	bytes 16..31  balance (big-endian, right-justified, 16 bytes)
//
// For historical reasons the existing BinaryTrie implementation writes
// code_size as a 4-byte big-endian uint32 starting at byte 4 rather than a
// 3-byte big-endian field starting at byte 5. Byte 4 is reserved per the
// EIP, and for any code size below 2^24 bytes (≈ 16 MiB) the high byte of
// the uint32 is 0, so the two encodings are byte-identical. Every realistic
// code size qualifies: the EIP-170 contract limit of 24 KiB is far below
// 2^24. This function preserves that existing behavior byte-for-byte so
// callers can substitute it for the inlined encoding in
// BinaryTrie.UpdateAccount without changing any state root.
//
// Any future correction of the byte offset is a consensus-level change
// and must be coordinated across clients.
func PackBasicData(nonce uint64, balance *uint256.Int, codeSize int) [HashSize]byte {
	var data [HashSize]byte
	binary.BigEndian.PutUint32(data[BasicDataCodeSizeOffset-1:], uint32(codeSize))
	binary.BigEndian.PutUint64(data[BasicDataNonceOffset:], nonce)

	// Balance is a 256-bit uint stored right-justified in the last 16
	// bytes of BasicData. For dev-mode accounts whose balance exceeds
	// 2^128 - 1 (e.g. 0xff × HashSize), drop the first 16 bytes of the
	// big-endian encoding (for a full 32-byte balance this keeps the
	// low-order 16 bytes) to match the existing BinaryTrie behavior
	// rather than panicking.
	balanceBytes := balance.Bytes()
	if len(balanceBytes) > 16 {
		balanceBytes = balanceBytes[16:]
	}
	copy(data[HashSize-len(balanceBytes):], balanceBytes)
	return data
}

// UnpackBasicData is the inverse of PackBasicData. It decodes the code
// size, nonce, and balance fields from a BasicData leaf value.
//
// Note: the returned balance is always 128-bit or smaller because the
// encoding reserves 16 bytes for it; dev-mode accounts whose pre-encoded
// balance exceeded 2^128 - 1 are not recoverable losslessly.
func UnpackBasicData(data [HashSize]byte) (nonce uint64, balance *uint256.Int, codeSize int) {
	codeSize = int(binary.BigEndian.Uint32(data[BasicDataCodeSizeOffset-1:]))
	nonce = binary.BigEndian.Uint64(data[BasicDataNonceOffset:])

	var b [16]byte
	copy(b[:], data[BasicDataBalanceOffset:])
	balance = new(uint256.Int).SetBytes(b[:])
	return
}
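The claim in the PackBasicData comment, that a 4-byte code_size at byte 4 and a 3-byte code_size at byte 5 yield the same bytes for any code size below 2^24, can be checked with a small standalone sketch. It uses only the standard library; packLegacy, packSpec and sameEncoding are hypothetical helpers written for this illustration (they are not part of the bintrie package), and the offsets are hard-coded from the layout table rather than taken from the package constants.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// packLegacy mirrors the existing behavior: code_size is written as a
// 4-byte big-endian uint32 starting at byte 4 (hypothetical helper).
func packLegacy(codeSize uint32) [32]byte {
	var d [32]byte
	binary.BigEndian.PutUint32(d[4:], codeSize)
	return d
}

// packSpec follows the EIP-7864 layout: code_size is a 3-byte
// big-endian field starting at byte 5 (hypothetical helper).
func packSpec(codeSize uint32) [32]byte {
	var d [32]byte
	d[5] = byte(codeSize >> 16)
	d[6] = byte(codeSize >> 8)
	d[7] = byte(codeSize)
	return d
}

// sameEncoding reports whether both layouts produce identical leaves.
func sameEncoding(codeSize uint32) bool {
	return packLegacy(codeSize) == packSpec(codeSize)
}

func main() {
	fmt.Println(sameEncoding(24576))     // EIP-170 limit: true
	fmt.Println(sameEncoding(1<<24 - 1)) // largest 3-byte value: true
	fmt.Println(sameEncoding(1 << 24))   // spills into reserved byte 4: false
}
```

The two layouts only diverge once the code size reaches 2^24, at which point the legacy encoding sets the reserved byte 4 and the 3-byte field overflows.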