go-ethereum/docs/partial-state/PARTIAL_STATEFULNESS_PLAN.md
CPerezz c3c4dfd838
core, eth: fix post-sync block processing and BAL type compatibility
Fix the post-sync deadlock where blocks validated via BAL in newPayload
were never written to the database, causing ForkchoiceUpdated to fail
finding them and triggering infinite sync cycles.

Changes:
- Export WriteBlockWithoutState and call it after ProcessBlockWithBAL
  in newPayload, so FCU can find blocks via GetBlockByHash
- Guard SetCanonical against recoverAncestors for partial state nodes
  (they can't re-execute blocks, only apply BAL diffs)
- Auto-disable log indexing when partial state is enabled (no receipts)
- Fix BAL type field accesses to match upstream bal-devnet-2 types
  (StorageChanges, CodeChanges, BalanceChanges, Validate signature)
- Update newPayload signature (BAL now comes from ExecutableData params)
- Add partial sync scripts and documentation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-17 12:04:09 +02:00


Partial Statefulness Design - Final Plan

Overview

Goal: Enable Ethereum nodes to operate with reduced storage by keeping:

  • Full account trie (all accounts + intermediate nodes)
  • Selective storage (only configured contracts' storage)
  • BAL-based state updates (per EIP-7928)

Source: ethresear.ch - Partial Statefulness


Design Decisions (Confirmed)

Core Model

| Decision | Choice | Notes |
|---|---|---|
| Account trie | ALL accounts + ALL intermediate nodes | Full trie structure with compression |
| Storage | Only configured contracts | User specifies which contracts in config file |
| BAL source | Per EIP-7928 | BALs come with blocks, hash committed in header |
| Validation | Trust BAL, apply diffs | Same trust model as light clients (signing committee) |
| Block history | 256-1024 blocks | Support BLOCKHASH opcode, configurable BAL retention |

Storage Approach

| Component | Size | Notes |
|---|---|---|
| Account leaves | ~14 GB | 300M accounts × ~45 bytes (slim RLP) |
| Intermediate nodes | ~15-25 GB | With delta encoding + bitmap compression |
| Total account trie | ~30-40 GB | |
| Configured storage | Variable | Depends on tracked contracts |
| BAL history | ~1-2 GB | 256-1024 blocks |

Operations

| Operation | Approach |
|---|---|
| Initial sync | Account trie first (snap sync), then configured storage |
| Block processing | Apply BAL diffs → update trie → verify state root matches header |
| Reorgs | Revert using stored BAL history; deeper reorgs request from full peers |
| eth_getProof (accounts) | Supported for ALL accounts |
| eth_getProof (storage) | Only for configured contracts; error otherwise |
| Mempool validation | Fully supported (only needs account data) |
| Serving peers | Account proofs + tracked contract storage |
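The reorg row above comes down to a window check: a reorg can be handled locally only if every block above the common ancestor still has a retained BAL. The following is a minimal sketch of that check under stated assumptions. `balHistory`, `Store` and `CanRevertTo` are hypothetical names, and the in-memory map stands in for whatever database accessors the node ends up using.

```go
package main

// balHistory is a hypothetical in-memory stand-in for the retained BAL
// window. A real node would persist these entries in its database.
type balHistory struct {
	retention uint64            // number of most recent blocks to keep
	head      uint64            // highest block number stored so far
	entries   map[uint64][]byte // block number -> encoded BAL
}

func newBALHistory(retention uint64) *balHistory {
	return &balHistory{retention: retention, entries: make(map[uint64][]byte)}
}

// Store records the BAL for a block and prunes every entry that has fallen
// out of the retention window.
func (h *balHistory) Store(number uint64, encoded []byte) {
	h.entries[number] = encoded
	if number > h.head {
		h.head = number
	}
	if h.head >= h.retention {
		cutoff := h.head - h.retention // blocks at or below cutoff are dropped
		for n := range h.entries {
			if n <= cutoff {
				delete(h.entries, n)
			}
		}
	}
}

// CanRevertTo reports whether every block above the common ancestor is still
// retained, i.e. whether the reorg can be handled without asking full peers.
func (h *balHistory) CanRevertTo(ancestor uint64) bool {
	if ancestor > h.head {
		return false
	}
	for n := ancestor + 1; n <= h.head; n++ {
		if _, ok := h.entries[n]; !ok {
			return false
		}
	}
	return true
}
```

When `CanRevertTo` returns false, the node would fall back to the "request from full peers" path described in the table.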

EIP-7928 BAL Integration

BAL Format (from EIP-7928)

BlockAccessList = [AccountAccess, ...]

AccountAccess = [
  Address,
  StorageWrites,    // map[slot] -> map[txIdx] -> value
  StorageReads,     // list of read slots
  BalanceChanges,   // map[txIdx] -> balance
  NonceChanges,     // map[txIdx] -> nonce
  CodeChanges       // map[txIdx] -> bytecode
]

Key EIP-7928 Facts

  • Header commitment: block_access_list_hash = keccak256(rlp.encode(bal))
  • Propagation: Via Engine API (ExecutionPayloadV4), not in block body
  • Retention: Full nodes must keep BALs for the weak subjectivity period (WSP, ~5 months); partial nodes keep a configurable window (256-1024 blocks)
  • Validation: Deterministic - wrong BAL = wrong header hash = invalid block

BAL Processing Flow

1. Receive block + BAL via Engine API
2. Verify: keccak256(rlp.encode(bal)) == header.block_access_list_hash
3. For each AccountAccess in BAL:
   a. Load current account from trie
   b. Apply balance/nonce changes (final values per block)
   c. Apply storage root update (from BAL storage writes for tracked contracts)
   d. Update account in trie
4. Commit trie changes
5. Verify: trie.Root() == header.stateRoot
6. If mismatch: reject block (consensus failure elsewhere)
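Steps 3 to 6 can be sketched as "apply final values to a copy of the state, recompute the commitment, and only keep the copy if it matches the header". The example below is a toy model under loud assumptions: the state is a plain map, `toyRoot` is a SHA-256 stand-in for the Merkle Patricia Trie root, balances are 64-bit, and all names are hypothetical. The BAL hash check from step 2 is omitted.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"errors"
	"sort"
)

type Address [20]byte

// Account is simplified: real balances are 256-bit and accounts also carry
// a storage root and a code hash.
type Account struct {
	Nonce   uint64
	Balance uint64
}

// AccountDiff carries post-block values; nil means "unchanged".
type AccountDiff struct {
	Address Address
	Nonce   *uint64
	Balance *uint64
}

var errRootMismatch = errors.New("state root mismatch")

// toyRoot is a stand-in commitment (NOT the MPT root): SHA-256 over all
// accounts sorted by address, so equal states give equal roots.
func toyRoot(state map[Address]Account) [32]byte {
	addrs := make([]Address, 0, len(state))
	for a := range state {
		addrs = append(addrs, a)
	}
	sort.Slice(addrs, func(i, j int) bool {
		return bytes.Compare(addrs[i][:], addrs[j][:]) < 0
	})
	h := sha256.New()
	var buf [8]byte
	for _, a := range addrs {
		h.Write(a[:])
		binary.BigEndian.PutUint64(buf[:], state[a].Nonce)
		h.Write(buf[:])
		binary.BigEndian.PutUint64(buf[:], state[a].Balance)
		h.Write(buf[:])
	}
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

// applyDiffs applies the diffs to a copy of state and returns the copy only
// if its root equals want. The input map is never modified.
func applyDiffs(state map[Address]Account, diffs []AccountDiff, want [32]byte) (map[Address]Account, error) {
	next := make(map[Address]Account, len(state))
	for a, acct := range state {
		next[a] = acct
	}
	for _, d := range diffs {
		acct := next[d.Address] // zero value for a newly created account
		if d.Nonce != nil {
			acct.Nonce = *d.Nonce
		}
		if d.Balance != nil {
			acct.Balance = *d.Balance
		}
		next[d.Address] = acct
	}
	if toyRoot(next) != want {
		return nil, errRootMismatch
	}
	return next, nil
}
```

Working on a copy keeps the rejected-block path trivial: on a mismatch the previous state is still intact and nothing needs to be rolled back.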

State Root Verification

How It Works Without Re-execution

Partial nodes can verify state root because:

  1. Full account trie stored: All intermediate nodes available
  2. BAL provides final values: Post-block account state (not deltas)
  3. Trie update is deterministic: Same inputs → same output
  4. Cross-check with header: header.stateRoot must match computed root

Trust Model

Same as beacon chain light clients:

  • Trust signing committee (attestations)
  • Verify header commitments (state root, BAL hash)
  • Detect inconsistencies via hash mismatches

If BAL is incorrect:

  • State root won't match → block rejected
  • Fork choice rejects the block
  • Partial node follows canonical chain

Snap Sync Adaptation

Current Snap Sync (Full Node)

Phase 1: Sync account ranges (GetAccountRangeMsg)
Phase 2: Sync all storage for all contracts
Phase 3: Sync all bytecode
Phase 4: Healing (fill gaps)

Partial Statefulness Snap Sync

Phase 1: Sync COMPLETE account trie (same as full node)
  - All accounts
  - All intermediate nodes
  - ~30-40 GB

Phase 2: Sync storage ONLY for configured contracts
  - Filter: Only request storage for contracts in config
  - Skip: All other contracts' storage

Phase 3: Sync bytecode ONLY for configured contracts
  - Same filtering as storage

Phase 4: Healing (account trie only)
  - No healing needed for skipped storage

Implementation Changes Needed

  1. Add PartialStateConfig to ethconfig
  2. Modify storageRequest creation in snap syncer to check config
  3. Skip storage/bytecode tasks for non-configured contracts
  4. Track sync progress separately for account trie vs. storage
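Changes 2 and 3 amount to a filter consulted at the point where storage tasks are created. The following sketch shows that decision in isolation. `ContractFilter`, `IsTracked`, `accountResult` and `planStorageTasks` are hypothetical names, and it assumes accounts are identified by address; since snap responses are believed to key accounts by the hash of the address, a real filter would likely compare hashed addresses instead.

```go
package main

type Address [20]byte

// ContractFilter decides whether storage and bytecode of a contract are synced.
// The method name is an assumption made for this sketch.
type ContractFilter interface {
	IsTracked(addr Address) bool
}

// allowList is the simplest possible filter: a fixed set of addresses.
type allowList map[Address]struct{}

func newAllowList(addrs ...Address) allowList {
	l := make(allowList, len(addrs))
	for _, a := range addrs {
		l[a] = struct{}{}
	}
	return l
}

func (l allowList) IsTracked(addr Address) bool {
	_, ok := l[addr]
	return ok
}

// accountResult is a stand-in for one account delivered by an account-range
// response during snap sync.
type accountResult struct {
	Address    Address
	HasStorage bool // true if the storage root is not the empty root
}

// planStorageTasks splits contracts into those whose storage is requested
// and those that are recorded as skipped, so healing can ignore them later.
func planStorageTasks(f ContractFilter, accounts []accountResult) (fetch, skipped []Address) {
	for _, acc := range accounts {
		if !acc.HasStorage {
			continue // plain accounts never produce a storage task
		}
		if f.IsTracked(acc.Address) {
			fetch = append(fetch, acc.Address)
		} else {
			skipped = append(skipped, acc.Address)
		}
	}
	return fetch, skipped
}
```

Returning the skipped set explicitly is what makes item 4 (separate progress tracking) and the "no healing for skipped storage" rule straightforward to implement.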

Configuration

Config Structure

type PartialStateConfig struct {
    Enabled          bool
    Contracts        []common.Address  // Tracked contracts
    ContractsFile    string            // Or load from JSON file
    BALRetention     uint64            // Blocks to keep (default: 256)
}

Example Config (TOML)

[Eth.PartialState]
Enabled = true
BALRetention = 256
Contracts = [
    "0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2",  # WETH
    "0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48",  # USDC
]
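Since contracts can come from both the inline list and ContractsFile, the loader has to validate and deduplicate them. A minimal sketch of that step follows. It assumes the JSON file is a flat array of hex address strings, which this plan does not specify, and `parseAddress` and `mergeContracts` are hypothetical helpers.

```go
package main

import (
	"encoding/hex"
	"encoding/json"
	"fmt"
	"strings"
)

type Address [20]byte

// parseAddress accepts a 0x-prefixed or bare 40-character hex string.
// It does not verify the EIP-55 checksum casing.
func parseAddress(s string) (Address, error) {
	var a Address
	trimmed := strings.TrimPrefix(strings.TrimSpace(s), "0x")
	if len(trimmed) != 40 {
		return a, fmt.Errorf("address %q: want 40 hex characters", s)
	}
	raw, err := hex.DecodeString(trimmed)
	if err != nil {
		return a, fmt.Errorf("address %q: %w", s, err)
	}
	copy(a[:], raw)
	return a, nil
}

// mergeContracts combines the inline Contracts entries with the contents of
// a contracts file (assumed format: JSON array of strings), dropping
// duplicates while keeping first-seen order.
func mergeContracts(inline []string, fileJSON []byte) ([]Address, error) {
	var fromFile []string
	if len(fileJSON) > 0 {
		if err := json.Unmarshal(fileJSON, &fromFile); err != nil {
			return nil, fmt.Errorf("contracts file: %w", err)
		}
	}
	all := append(append([]string{}, inline...), fromFile...)
	seen := make(map[Address]struct{}, len(all))
	out := make([]Address, 0, len(all))
	for _, s := range all {
		a, err := parseAddress(s)
		if err != nil {
			return nil, err
		}
		if _, dup := seen[a]; dup {
			continue
		}
		seen[a] = struct{}{}
		out = append(out, a)
	}
	return out, nil
}
```

Comparing decoded bytes rather than strings means the same contract written with different casing is only tracked once.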

RPC Behavior

| Method | Behavior |
|---|---|
| eth_getBalance | Works (have account data) |
| eth_getTransactionCount | Works (have nonce) |
| eth_getCode | For tracked contracts; error for others |
| eth_getStorageAt | For tracked contracts; error for others |
| eth_getProof (account) | Works for ANY account |
| eth_getProof (storage) | For tracked contracts; error for others |
| eth_call | Works if the call touches only tracked contracts; error if it touches untracked contracts |
| eth_estimateGas | Same as eth_call |
| eth_sendRawTransaction | Mempool validation works (only needs account data) |

Binary Trie (EIP-7864) Compatibility

Will This Design Work With Binary Trie?

Yes, with minimal changes:

| Aspect | MPT | Binary Trie | Compatibility |
|---|---|---|---|
| Account data | StateAccount struct | Same struct | Compatible |
| Trie interface | Trie interface | Same interface | Compatible |
| BAL format | Per EIP-7928 | Same format | Compatible |
| Selective storage | Skip storage tries | Skip stem suffixes | Compatible |
| Proof generation | Merkle proofs | Path proofs | Use interface |

Adaptation Needed

Only the storage size estimates change:

  • Binary Trie total: ~48 GB (vs. MPT ~30-40 GB with compression)
  • Binary Trie has simpler structure, no compression needed

Recommendation: Use go-ethereum's Trie interface which abstracts over both.


Implementation Phases

Phase 1: Configuration & Infrastructure

  • Add PartialStateConfig to eth/ethconfig/config.go
  • Create core/state/partial/ package with ContractFilter interface
  • Add CLI flags for partial state mode

Phase 2: Snap Sync Modifications

  • Modify eth/protocols/snap/sync.go for selective storage sync
  • Add filter checks in processAccountResponse and processStorageResponse
  • Track separate progress for account trie vs. storage

Phase 3: BAL Processing

  • Implement BAL diff application in block import pipeline
  • Modify core/blockchain.go to use BAL for state updates
  • Add state root verification without re-execution

Phase 4: RPC & Operations

  • Modify internal/ethapi/api.go for partial state awareness
  • Add appropriate errors for untracked contract queries
  • Implement BAL history management and reorg handling

Key Files to Modify

| File | Changes |
|---|---|
| eth/ethconfig/config.go | Add PartialStateConfig |
| core/state/partial/filter.go | New: ContractFilter interface |
| eth/protocols/snap/sync.go | Filter storage sync by config |
| core/blockchain.go | BAL-based state updates |
| internal/ethapi/api.go | Partial state RPC handling |
| cmd/utils/flags.go | CLI flags for partial state |

Open Items for Implementation

  1. BLOCKHASH opcode: Verify 256 blocks of history is sufficient; check if other opcodes need block history

  2. Storage root verification: When applying BAL storage diffs for tracked contracts, verify computed storage root matches account's storageRoot field

  3. Compression implementation: Implement delta encoding + bitmap optimization for intermediate nodes (existing pathdb patterns can be adapted)

  4. Selective snap sync protocol: Research if snap protocol needs extension or if filtering can be done client-side


Verification Checklist

After implementation, verify:

  • Can sync account trie completely via snap sync
  • Can sync only configured contracts' storage
  • BAL diffs apply correctly, state root matches header
  • eth_getProof works for any account (proof generation)
  • eth_getProof returns error for untracked storage
  • Mempool accepts/validates transactions correctly
  • Reorgs up to BAL retention depth work
  • Deeper reorgs trigger recovery from full peers
  • Total storage matches estimates (~30-40 GB + configured storage)

DETAILED SPECIFICATIONS


SPEC 1: Snap Sync Refactoring for Selective Storage

Overview

The snap sync protocol in go-ethereum downloads account data and contract storage in parallel. For partial statefulness, we need to:

  1. Download ALL accounts (unchanged behavior)
  2. Download storage ONLY for configured contracts (new filtering)
  3. Download bytecode ONLY for configured contracts (new filtering)

Design Principle: Keep original Syncer implementation untouched. Create a separate syncer implementation using a strategy/interface pattern that allows selection at runtime.

Architecture: Strategy Pattern

                    ┌─────────────────────┐
                    │   SyncStrategy      │  (interface)
                    │   interface         │
                    └─────────┬───────────┘
                              │
              ┌───────────────┼───────────────┐
              │               │               │
    ┌─────────▼─────┐  ┌──────▼──────┐  ┌─────▼───────┐
    │ FullSyncer    │  │PartialSyncer│  │ (future)    │
    │ (wraps orig)  │  │(new impl)   │  │             │
    └───────────────┘  └─────────────┘  └─────────────┘
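The diagram can be read as a small interface with two implementations chosen once at startup. The sketch below illustrates only the runtime selection. The method set of `SyncStrategy` is not defined in this plan, so `Name` and `WantsStorage` are assumptions, and `fullSyncer` here is an empty placeholder rather than a wrapper around the original Syncer.

```go
package main

type Address [20]byte

// SyncStrategy is a hypothetical shape for the interface in the diagram.
type SyncStrategy interface {
	Name() string
	WantsStorage(addr Address) bool
}

// fullSyncer stands in for the wrapper around the original Syncer:
// it downloads storage for every contract.
type fullSyncer struct{}

func (fullSyncer) Name() string              { return "full" }
func (fullSyncer) WantsStorage(Address) bool { return true }

// partialSyncer only downloads storage for configured contracts.
type partialSyncer struct {
	tracked map[Address]struct{}
}

func (p *partialSyncer) Name() string { return "partial" }

func (p *partialSyncer) WantsStorage(addr Address) bool {
	_, ok := p.tracked[addr]
	return ok
}

// selectStrategy picks the implementation from configuration, which is the
// only place that needs to know both implementations exist.
func selectStrategy(partialEnabled bool, contracts []Address) SyncStrategy {
	if !partialEnabled {
		return fullSyncer{}
	}
	tracked := make(map[Address]struct{}, len(contracts))
	for _, c := range contracts {
		tracked[c] = struct{}{}
	}
	return &partialSyncer{tracked: tracked}
}
```

Callers hold a `SyncStrategy` value and never branch on the mode themselves, which is what lets the original Syncer stay untouched.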

Key Files

| File | Purpose |
|---|---|
| eth/protocols/snap/sync.go | UNCHANGED - Original Syncer |
| eth/protocols/snap/strategy.go | NEW - SyncStrategy interface |
| eth/protocols/snap/partial_sync.go | NEW - PartialSyncer implementation |
| core/state/partial/filter.go | NEW - ContractFilter interface |
| eth/downloader/downloader.go | MODIFIED - Strategy selection |

SPEC 2: Compression + Root Recomputation

Overview

For partial statefulness, we store the full account trie (~300M accounts + intermediate nodes) but need efficient storage. This spec covers:

  1. REUSE existing delta encoding infrastructure from pathdb
  2. State root recomputation from BAL diffs

Existing Compression Infrastructure (REUSE - DO NOT REIMPLEMENT)

Location: triedb/pathdb/nodes.go (lines 431-691)

go-ethereum already has production-grade compression we must reuse:

| Function | Purpose | Status |
|---|---|---|
| encodeNodeCompressed() | Delta encoding with bitmap | REUSE |
| decodeNodeCompressed() | Decode compressed format | REUSE |
| encodeNodeFull() | Full-value encoding | REUSE |
| encodeNodeHistory() | Checkpoint + delta chains | REUSE |
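To make "delta encoding with bitmap" concrete, here is an illustrative toy version of the general technique for a 16-child node: a 16-bit bitmap marks which children changed, followed by only the changed child references. This is explicitly not the pathdb encoding and does not reproduce its wire format; `node`, `encodeDelta` and `decodeDelta` are hypothetical names used only to show why the approach saves space.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
)

const children = 16

// node holds up to 16 child references; nil means "no child".
// References are assumed to be at most 255 bytes (hashes are 32).
type node [children][]byte

var errTruncated = errors.New("truncated delta")

// encodeDelta emits a 2-byte bitmap of changed positions followed by a
// length-prefixed value for each changed child, in ascending position order.
func encodeDelta(prev, cur node) []byte {
	var (
		bitmap uint16
		body   []byte
	)
	for i := 0; i < children; i++ {
		if bytes.Equal(prev[i], cur[i]) {
			continue
		}
		bitmap |= 1 << uint(i)
		body = append(body, byte(len(cur[i])))
		body = append(body, cur[i]...)
	}
	out := make([]byte, 2, 2+len(body))
	binary.BigEndian.PutUint16(out, bitmap)
	return append(out, body...)
}

// decodeDelta rebuilds the current node from the previous node and a delta.
func decodeDelta(prev node, enc []byte) (node, error) {
	if len(enc) < 2 {
		return node{}, errTruncated
	}
	bitmap := binary.BigEndian.Uint16(enc)
	rest := enc[2:]
	cur := prev // array copy; changed slots are replaced, never mutated
	for i := 0; i < children; i++ {
		if bitmap&(1<<uint(i)) == 0 {
			continue
		}
		if len(rest) < 1 {
			return node{}, errTruncated
		}
		n := int(rest[0])
		rest = rest[1:]
		if len(rest) < n {
			return node{}, errTruncated
		}
		if n == 0 {
			cur[i] = nil // child was removed
		} else {
			cur[i] = append([]byte(nil), rest[:n]...)
		}
		rest = rest[n:]
	}
	return cur, nil
}
```

A block that touches one child of a full node costs 2 + 1 + 32 bytes here instead of re-storing all 16 references, which is the intuition behind the intermediate-node size estimate.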

SPEC 3: BAL Processing Pipeline

Overview

Block Access Lists (BALs) per EIP-7928 provide state diffs that allow partial nodes to update state without re-executing transactions.

Existing BAL Implementation (Already in Geth)

Location: core/types/bal/

BAL types are already implemented in go-ethereum master:

| File | Contents |
|---|---|
| bal.go | ConstructionBlockAccessList, ConstructionAccountAccess, builder methods |
| bal_encoding.go | BlockAccessList, AccountAccess, RLP encoding, hash computation |
| bal_encoding_rlp_generated.go | Generated RLP encoder/decoder |

SPEC 4: RPC Modifications

Overview

Partial state nodes can answer some RPC queries but not others. This spec defines the behavior.

Error Codes

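import "errors"

// Sentinel errors returned when a query needs data the node does not hold.
var (
    ErrStorageNotTracked = errors.New("storage not tracked for this contract")
    ErrCodeNotTracked    = errors.New("code not tracked for this contract")
)

// JSON-RPC error codes reported to clients for the errors above.
// The bytecode constant cannot be named ErrCodeNotTracked, since that
// identifier is already taken by the error value.
const (
    ErrCodeStorageNotTracked  = -32001
    ErrCodeBytecodeNotTracked = -32002
)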
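One way to attach these codes to responses is an error type that exposes both a message and a numeric code, matching the `Error()` plus `ErrorCode()` method pair that go-ethereum's rpc package looks for on returned errors. The sketch below is self-contained and does not import go-ethereum. The `PartialStateError` fields and the `checkStorageAccess` helper are assumptions made for illustration.

```go
package main

import "fmt"

type Address [20]byte

// Local copies of the codes from this spec, named differently to keep the
// sketch independent of the declarations above.
const (
	codeStorageNotTracked  = -32001
	codeBytecodeNotTracked = -32002
)

// PartialStateError carries a JSON-RPC error code alongside the message.
type PartialStateError struct {
	Code int
	Addr Address
	What string // "storage" or "code"
}

func (e *PartialStateError) Error() string {
	return fmt.Sprintf("%s not tracked for contract 0x%x", e.What, e.Addr[:])
}

// ErrorCode returns the numeric JSON-RPC code for this error.
func (e *PartialStateError) ErrorCode() int { return e.Code }

// checkStorageAccess is a hypothetical pre-check that an RPC handler such as
// eth_getStorageAt could run before touching the state database.
func checkStorageAccess(tracked map[Address]bool, addr Address) error {
	if tracked[addr] {
		return nil
	}
	return &PartialStateError{Code: codeStorageNotTracked, Addr: addr, What: "storage"}
}

// checkCodeAccess is the equivalent pre-check for eth_getCode.
func checkCodeAccess(tracked map[Address]bool, addr Address) error {
	if tracked[addr] {
		return nil
	}
	return &PartialStateError{Code: codeBytecodeNotTracked, Addr: addr, What: "code"}
}
```

Failing before any database read gives clients a deterministic error instead of a misleading zero value for untracked contracts.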

SPEC 5: Configuration System

CLI Flags

var (
    PartialStateFlag = &cli.BoolFlag{
        Name:     "partial-state",
        Usage:    "Enable partial statefulness mode (reduced storage)",
        Category: flags.EthCategory,
    }

    PartialStateContractsFlag = &cli.StringSliceFlag{
        Name:     "partial-state.contracts",
        Usage:    "Contracts to track storage for (comma-separated addresses)",
        Category: flags.EthCategory,
    }

    PartialStateContractsFileFlag = &cli.StringFlag{
        Name:     "partial-state.contracts-file",
        Usage:    "JSON file containing contracts to track",
        Category: flags.EthCategory,
    }

    PartialStateBALRetentionFlag = &cli.Uint64Flag{
        Name:     "partial-state.bal-retention",
        Usage:    "Number of blocks to retain BAL history (default: 256)",
        Value:    256,
        Category: flags.EthCategory,
    }
)

Implementation Task Breakdown

Phase 1: Core Infrastructure (Foundation)

| Task ID | Task | Dependencies | Effort |
|---|---|---|---|
| 1.1 | Create core/state/partial/ package structure | None | S |
| 1.2 | Implement ContractFilter interface | 1.1 | S |
| 1.3 | Add PartialStateConfig to ethconfig | None | S |
| 1.4 | Add CLI flags for partial state | 1.3 | S |
| 1.5 | Implement config loading (file + direct) | 1.3, 1.4 | M |

Phase 2: Snap Sync Modifications (Selective Sync via Strategy Pattern)

| Task ID | Task | Dependencies | Effort |
|---|---|---|---|
| 2.1 | Create SyncStrategy interface in strategy.go | None | S |
| 2.2 | Create FullSyncStrategy wrapper (embeds original Syncer) | 2.1 | S |
| 2.3 | Create PartialSyncer struct in partial_sync.go | 1.2, 2.1 | M |
| 2.4 | Implement account processing with storage filtering | 2.3 | M |
| 2.5 | Add markStorageSkipped / isStorageSkipped helpers | 2.3 | S |
| 2.6 | Implement healing with skip checks | 2.5 | M |
| 2.7 | Modify Downloader to use SyncStrategy interface | 2.1, 2.2 | S |
| 2.8 | Add strategy selection based on config | 2.7 | S |
| 2.9 | Unit tests for PartialSyncer | 2.4, 2.6 | M |
| 2.10 | Integration test with partial filter | 2.9 | L |

Phase 3: BAL Processing (State Updates)

| Task ID | Task | Dependencies | Effort |
|---|---|---|---|
| 3.1 | Add BAL key schema to core/rawdb/schema.go | None | S |
| 3.2 | Create core/rawdb/accessors_bal.go (following existing pattern) | 3.1 | S |
| 3.3 | Create thin BALHistory wrapper in core/state/partial/history.go | 3.2 | S |
| 3.4 | Implement ApplyBALAndComputeRoot using existing BAL types + trie | Phase 2 | L |
| 3.5 | Implement applyStorageChanges for tracked contracts | 3.4 | M |
| 3.6 | Add ProcessBlockWithBAL to BlockChain | 3.4, 3.3 | L |
| 3.7 | Implement reorg handling with BAL history | 3.3, 3.6 | L |
| 3.8 | Engine API integration for BAL delivery | 3.6 | M |
| 3.9 | BAL processing tests | 3.6, 3.7 | L |

Phase 4: RPC Modifications (API Layer)

| Task ID | Task | Dependencies | Effort |
|---|---|---|---|
| 4.1 | Add PartialStateError and error codes | None | S |
| 4.2 | Add PartialStateEnabled, IsContractTracked to Backend | 1.2 | S |
| 4.3 | Modify GetStorageAt for partial state | 4.1, 4.2 | S |
| 4.4 | Modify GetCode for partial state | 4.1, 4.2 | S |
| 4.5 | Modify GetProof (account ok, storage filtered) | 4.1, 4.2 | M |
| 4.6 | Modify Call / EstimateGas with pre-check | 4.1, 4.2 | M |
| 4.7 | RPC behavior tests | 4.3-4.6 | M |

Phase 5: Integration & Testing

| Task ID | Task | Dependencies | Effort |
|---|---|---|---|
| 5.1 | End-to-end partial sync test | Phase 2, Phase 3 | L |
| 5.2 | Verify storage size meets estimates | 5.1 | M |
| 5.3 | Reorg recovery test | Phase 3 | M |
| 5.4 | RPC integration test | Phase 4, 5.1 | M |
| 5.5 | Documentation updates | All | M |

Effort Legend

  • S = Small (few hours)
  • M = Medium (1-2 days)
  • L = Large (3-5 days)

Critical Path

The critical path for minimum viable partial statefulness:

  1. Phase 1: Configuration infrastructure
  2. Phase 2: Selective snap sync via strategy pattern (accounts + filtered storage)
  3. Phase 3: BAL processing (state updates without re-execution, using existing BAL types)
  4. Phase 4: RPC modifications (proper error handling)
  5. Phase 5: End-to-end test

This enables a working partial stateful node. Compression and full reorg handling can be added incrementally.

Key Design Decisions Summary

| Decision | Approach | Rationale |
|---|---|---|
| Snap sync | Strategy pattern with separate PartialSyncer | Keep original Syncer untouched |
| BAL types | Use existing core/types/bal/ | Already implemented in geth master |
| Filter interface | ContractFilter interface | Flexible, testable |
| Skip tracking | DB markers + in-memory map | Persist across restarts |
| RPC errors | Custom error codes | Clear user feedback |

Reuse vs. New Code Summary

REUSING (Do Not Reimplement)

| Component | Existing Location | How We Use It |
|---|---|---|
| BAL Types | core/types/bal/ | Import directly |
| Compression | triedb/pathdb/nodes.go | encodeNodeCompressed(), encodeNodeHistory() |
| Delta Encoding | trie/node.go | NodeDifference() |
| Checkpoint Mechanism | triedb/pathdb/config.go | FullValueCheckpoint config |
| Diff Layers | triedb/pathdb/difflayer.go | nodeSetWithOrigin, StateSetWithOrigin |
| History Key Patterns | core/rawdb/schema.go | Follow StateHistoryAccountBlockPrefix pattern |
| History Accessors | core/rawdb/accessors_history.go | Follow Read/Write/Delete triplet pattern |
| Safe Deletion | core/rawdb/database.go | SafeDeleteRange() for pruning |
| Filter Patterns | eth/filters/filter.go | Reference for contract filtering |
| Trie Interface | trie/trie.go | Standard trie operations |

CREATING NEW

| Component | New Location | Purpose |
|---|---|---|
| SyncStrategy interface | eth/protocols/snap/strategy.go | Abstract sync implementations |
| PartialSyncer | eth/protocols/snap/partial_sync.go | Filtered storage sync |
| ContractFilter | core/state/partial/filter.go | Contract tracking interface |
| PartialState | core/state/partial/state.go | BAL application + root computation |
| BAL key schema | core/rawdb/schema.go | Add balHistoryPrefix |
| BAL accessors | core/rawdb/accessors_bal.go | Read/Write/Delete following pattern |
| BALHistory wrapper | core/state/partial/history.go | Thin layer over rawdb |
| ProcessBlockWithBAL | core/blockchain_partial.go | Block processing entry point |
| RPC error codes | internal/ethapi/ | Partial state errors |
| Config | eth/ethconfig/config.go | PartialStateConfig |
| CLI flags | cmd/utils/flags.go | Partial state flags |