go-ethereum/docs/partial-state/PARTIAL_STATEFULNESS_PLAN.md
CPerezz c3c4dfd838
core, eth: fix post-sync block processing and BAL type compatibility
Fix the post-sync deadlock where blocks validated via BAL in newPayload
were never written to the database, causing ForkchoiceUpdated to fail
finding them and triggering infinite sync cycles.

Changes:
- Export WriteBlockWithoutState and call it after ProcessBlockWithBAL
  in newPayload, so FCU can find blocks via GetBlockByHash
- Guard SetCanonical against recoverAncestors for partial state nodes
  (they can't re-execute blocks, only apply BAL diffs)
- Auto-disable log indexing when partial state is enabled (no receipts)
- Fix BAL type field accesses to match upstream bal-devnet-2 types
  (StorageChanges, CodeChanges, BalanceChanges, Validate signature)
- Update newPayload signature (BAL now comes from ExecutableData params)
- Add partial sync scripts and documentation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-17 12:04:09 +02:00

# Partial Statefulness Design - Final Plan
## Overview
**Goal**: Enable Ethereum nodes to operate with reduced storage by keeping:
- Full account trie (all accounts + intermediate nodes)
- Selective storage (only configured contracts' storage)
- BAL-based state updates (per EIP-7928)
**Source**: [ethresear.ch - Partial Statefulness](https://ethresear.ch/t/the-future-of-state-part-2-beyond-the-myth-of-partial-statefulness-the-reality-of-zkevms/23396)
---
## Design Decisions (Confirmed)
### Core Model
| Decision | Choice | Notes |
|----------|--------|-------|
| Account trie | ALL accounts + ALL intermediate nodes | Full trie structure with compression |
| Storage | Only configured contracts | User specifies which contracts in config file |
| BAL source | Per EIP-7928 | BALs come with blocks, hash committed in header |
| Validation | Trust BAL, apply diffs | Same trust model as light clients (sync committee) |
| Block history | 256-1024 blocks | Support BLOCKHASH opcode, configurable BAL retention |
### Storage Approach
| Component | Size | Notes |
|-----------|------|-------|
| Account leaves | ~14 GB | 300M accounts × ~45 bytes (slim RLP) |
| Intermediate nodes | ~15-25 GB | With delta encoding + bitmap compression |
| **Total account trie** | **~30-40 GB** | |
| Configured storage | Variable | Depends on tracked contracts |
| BAL history | ~1-2 GB | 256-1024 blocks |
### Operations
| Operation | Approach |
|-----------|----------|
| Initial sync | Account trie first (snap sync), then configured storage |
| Block processing | Apply BAL diffs → update trie → verify state root matches header |
| Reorgs | Revert using stored BAL history; deeper reorgs request from full peers |
| eth_getProof (accounts) | Supported for ALL accounts |
| eth_getProof (storage) | Only for configured contracts; error otherwise |
| Mempool validation | Fully supported (only needs account data) |
| Serving peers | Account proofs + tracked contract storage |
---
## EIP-7928 BAL Integration
### BAL Format (from EIP-7928)
```
BlockAccessList = [AccountAccess, ...]
AccountAccess = [
    Address,
    StorageWrites,   // map[slot] -> map[txIdx] -> value
    StorageReads,    // list of read slots
    BalanceChanges,  // map[txIdx] -> balance
    NonceChanges,    // map[txIdx] -> nonce
    CodeChanges      // map[txIdx] -> bytecode
]
```
### Key EIP-7928 Facts
- **Header commitment**: `block_access_list_hash = keccak256(rlp.encode(bal))`
- **Propagation**: Via Engine API (ExecutionPayloadV4), not in block body
- **Retention**: Full nodes must keep BALs for the weak subjectivity period (WSP, ~5 months); partial nodes keep a configurable window (256-1024 blocks)
- **Validation**: Deterministic - wrong BAL = wrong header hash = invalid block
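
The retention rule for partial nodes can be pictured as a sliding window over block numbers. The following is a minimal in-memory sketch; `balHistory`, `Add`, and `CanRevertTo` are hypothetical names, not part of go-ethereum or EIP-7928. It models only which BALs are still available, not how a revert is performed.

```go
package main

import "fmt"

// balHistory is a hypothetical in-memory stand-in for the BAL store. It keeps
// the encoded BALs of the most recent `retention` blocks.
type balHistory struct {
	retention uint64
	head      uint64
	byNumber  map[uint64][]byte
}

func newBALHistory(retention uint64) *balHistory {
	return &balHistory{retention: retention, byNumber: make(map[uint64][]byte)}
}

// Add records the BAL of a block and prunes entries that left the window.
func (h *balHistory) Add(number uint64, bal []byte) {
	h.byNumber[number] = bal
	if number > h.head {
		h.head = number
	}
	for n := range h.byNumber {
		if n+h.retention <= h.head {
			delete(h.byNumber, n)
		}
	}
}

// CanRevertTo reports whether the BAL of every block above `ancestor`, up to
// and including the head, is still retained.
func (h *balHistory) CanRevertTo(ancestor uint64) bool {
	if ancestor > h.head {
		return false
	}
	for n := ancestor + 1; n <= h.head; n++ {
		if _, ok := h.byNumber[n]; !ok {
			return false
		}
	}
	return true
}

func main() {
	h := newBALHistory(256)
	for n := uint64(1); n <= 1000; n++ {
		h.Add(n, []byte{byte(n)})
	}
	fmt.Println(len(h.byNumber), h.CanRevertTo(744), h.CanRevertTo(743))
}
```

With a retention of 256 and a head at block 1000, blocks 745 through 1000 remain available, so a rewind to block 744 is possible while a rewind to block 743 would need help from full peers.
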
### BAL Processing Flow
```
1. Receive block + BAL via Engine API
2. Verify: keccak256(rlp.encode(bal)) == header.block_access_list_hash
3. For each AccountAccess in BAL:
   a. Load current account from trie
   b. Apply balance/nonce changes (final values per block)
   c. Apply storage root update (from BAL storage writes for tracked contracts)
   d. Update account in trie
4. Commit trie changes
5. Verify: trie.Root() == header.stateRoot
6. If mismatch: reject block (consensus failure elsewhere)
```
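
Step 3b relies on picking the final value of each field from the per-transaction maps. The sketch below shows one way to do that, assuming the value at the highest transaction index is the post-block value. The `account` and `accountAccess` types are simplified stand-ins (balances are `uint64` here rather than 256-bit), not the types in `core/types/bal/`.

```go
package main

import "fmt"

// account is a simplified stand-in for an account trie leaf. Real balances are
// 256-bit; uint64 keeps the sketch short.
type account struct {
	Nonce   uint64
	Balance uint64
}

// accountAccess mirrors the per-transaction maps of the BAL format above.
type accountAccess struct {
	BalanceChanges map[uint16]uint64 // txIdx -> balance after that tx
	NonceChanges   map[uint16]uint64 // txIdx -> nonce after that tx
}

// lastValue returns the value recorded at the highest transaction index.
func lastValue(changes map[uint16]uint64) (uint64, bool) {
	var (
		bestIdx uint16
		value   uint64
		found   bool
	)
	for idx, v := range changes {
		if !found || idx > bestIdx {
			bestIdx, value, found = idx, v, true
		}
	}
	return value, found
}

// applyAccess overwrites only the fields that the BAL entry actually changed.
func applyAccess(acct account, access accountAccess) account {
	if v, ok := lastValue(access.BalanceChanges); ok {
		acct.Balance = v
	}
	if v, ok := lastValue(access.NonceChanges); ok {
		acct.Nonce = v
	}
	return acct
}

func main() {
	before := account{Nonce: 1, Balance: 100}
	after := applyAccess(before, accountAccess{
		BalanceChanges: map[uint16]uint64{0: 90, 3: 70, 1: 80},
	})
	fmt.Printf("%+v\n", after)
}
```
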
---
## State Root Verification
### How It Works Without Re-execution
Partial nodes can verify the state root because:
1. **Full account trie stored**: All intermediate nodes available
2. **BAL provides final values**: Post-block account state (not deltas)
3. **Trie update is deterministic**: Same inputs → same output
4. **Cross-check with header**: header.stateRoot must match computed root
### Trust Model
Same as beacon chain light clients:
- Trust the sync committee (signatures over block headers)
- Verify header commitments (state root, BAL hash)
- Detect inconsistencies via hash mismatches
If BAL is incorrect:
- State root won't match → block rejected
- Fork choice rejects the block
- Partial node follows canonical chain
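
The reject-on-mismatch behaviour can be sketched as "apply to a copy, commit only if the root matches". The example below makes loud simplifications: `root` is a SHA-256 over sorted entries, not a Merkle Patricia trie root, and `state` is a plain map. All names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"errors"
	"fmt"
	"sort"
)

// state maps an address string to a balance. It stands in for the account trie.
type state map[string]uint64

var errRootMismatch = errors.New("state root mismatch")

// root is a stand-in commitment: SHA-256 over the sorted entries.
// It is NOT a Merkle Patricia trie root and does not use keccak256.
func root(s state) [32]byte {
	keys := make([]string, 0, len(s))
	for k := range s {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	h := sha256.New()
	var buf [8]byte
	for _, k := range keys {
		h.Write([]byte{byte(len(k))})
		h.Write([]byte(k))
		binary.BigEndian.PutUint64(buf[:], s[k])
		h.Write(buf[:])
	}
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

// applyBlock applies the final values in diff to a copy of cur. The copy is
// returned only if its root equals wantRoot; otherwise cur is returned as-is.
func applyBlock(cur, diff state, wantRoot [32]byte) (state, error) {
	next := make(state, len(cur)+len(diff))
	for k, v := range cur {
		next[k] = v
	}
	for k, v := range diff {
		next[k] = v
	}
	if root(next) != wantRoot {
		return cur, errRootMismatch
	}
	return next, nil
}

func main() {
	cur := state{"a": 1, "b": 2}
	diff := state{"b": 5}
	_, err := applyBlock(cur, diff, root(state{"a": 1, "b": 5}))
	fmt.Println("matching root:", err)
	_, err = applyBlock(cur, diff, [32]byte{})
	fmt.Println("wrong root:", err)
}
```
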
---
## Snap Sync Adaptation
### Current Snap Sync (Full Node)
```
Phase 1: Sync account ranges (GetAccountRangeMsg)
Phase 2: Sync all storage for all contracts
Phase 3: Sync all bytecode
Phase 4: Healing (fill gaps)
```
### Partial Statefulness Snap Sync
```
Phase 1: Sync COMPLETE account trie (same as full node)
  - All accounts
  - All intermediate nodes
  - ~30-40 GB
Phase 2: Sync storage ONLY for configured contracts
  - Filter: Only request storage for contracts in config
  - Skip: All other contracts' storage
Phase 3: Sync bytecode ONLY for configured contracts
  - Same filtering as storage
Phase 4: Healing (account trie only)
  - No healing needed for skipped storage
```
### Implementation Changes Needed
1. Add `PartialStateConfig` to ethconfig
2. Modify `storageRequest` creation in snap syncer to check config
3. Skip storage/bytecode tasks for non-configured contracts
4. Track sync progress separately for account trie vs. storage
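
The filtering in steps 2 and 3 can be sketched as a planning pass over downloaded accounts. This is an illustration only: `ContractFilter.IsTracked`, `accountEntry`, and `planTasks` are hypothetical, and the sketch keys the filter on addresses, whereas the real snap syncer identifies accounts by hash, so an actual filter would likely be keyed on the hashed address.

```go
package main

import "fmt"

// Address is a stand-in for common.Address.
type Address [20]byte

// ContractFilter is a hypothetical method set for the planned interface.
type ContractFilter interface {
	IsTracked(addr Address) bool
}

// setFilter tracks a fixed set of contracts.
type setFilter map[Address]struct{}

func (f setFilter) IsTracked(addr Address) bool {
	_, ok := f[addr]
	return ok
}

// accountEntry summarises what a downloaded account would need next.
type accountEntry struct {
	Addr       Address
	HasStorage bool // non-empty storage root
	HasCode    bool // non-empty code hash
}

// syncPlan lists follow-up tasks plus the contracts deliberately skipped.
type syncPlan struct {
	StorageTasks []Address
	CodeTasks    []Address
	Skipped      []Address
}

// planTasks schedules storage and bytecode downloads for tracked contracts
// only, and records untracked contracts so that healing can ignore them.
func planTasks(accounts []accountEntry, filter ContractFilter) syncPlan {
	var plan syncPlan
	for _, acc := range accounts {
		if !acc.HasStorage && !acc.HasCode {
			continue // plain account: nothing beyond the trie leaf
		}
		if !filter.IsTracked(acc.Addr) {
			plan.Skipped = append(plan.Skipped, acc.Addr)
			continue
		}
		if acc.HasStorage {
			plan.StorageTasks = append(plan.StorageTasks, acc.Addr)
		}
		if acc.HasCode {
			plan.CodeTasks = append(plan.CodeTasks, acc.Addr)
		}
	}
	return plan
}

func main() {
	tracked, untracked, eoa := Address{1}, Address{2}, Address{3}
	plan := planTasks([]accountEntry{
		{Addr: tracked, HasStorage: true, HasCode: true},
		{Addr: untracked, HasStorage: true, HasCode: true},
		{Addr: eoa},
	}, setFilter{tracked: {}})
	fmt.Println(len(plan.StorageTasks), len(plan.CodeTasks), len(plan.Skipped))
}
```
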
---
## Configuration
### Config Structure
```go
type PartialStateConfig struct {
	Enabled       bool
	Contracts     []common.Address // Tracked contracts
	ContractsFile string           // Or load from JSON file
	BALRetention  uint64           // Blocks to keep (default: 256)
}
```
### Example Config (TOML)
```toml
[Eth.PartialState]
Enabled = true
BALRetention = 256
Contracts = [
    "0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2", # WETH
    "0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48", # USDC
]
```
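
Because contracts can come from both the inline list and `ContractsFile`, the two sources have to be merged at load time. The sketch below assumes the file is a JSON array of hex address strings; the plan does not specify the file format, so that layout and the helper `mergeContracts` are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// mergeContracts combines inline addresses with those from a contracts file
// and drops duplicates, comparing addresses case-insensitively.
// ASSUMPTION: the file content is a JSON array of hex strings.
func mergeContracts(inline []string, fileJSON []byte) ([]string, error) {
	var fromFile []string
	if len(fileJSON) > 0 {
		if err := json.Unmarshal(fileJSON, &fromFile); err != nil {
			return nil, fmt.Errorf("parsing contracts file: %w", err)
		}
	}
	all := append(append([]string{}, inline...), fromFile...)
	seen := make(map[string]struct{}, len(all))
	var merged []string
	for _, addr := range all {
		key := strings.ToLower(addr)
		if _, dup := seen[key]; dup {
			continue
		}
		seen[key] = struct{}{}
		merged = append(merged, addr)
	}
	return merged, nil
}

func main() {
	inline := []string{"0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2"}
	file := []byte(`["0xc02aaa39b223fe8d0a0e5c4f27ead9083c756cc2",
	                 "0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48"]`)
	merged, err := mergeContracts(inline, file)
	fmt.Println(merged, err)
}
```
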
---
## RPC Behavior
| Method | Behavior |
|--------|----------|
| `eth_getBalance` | ✅ Works (have account data) |
| `eth_getTransactionCount` | ✅ Works (have nonce) |
| `eth_getCode` | ✅ For tracked contracts; ❌ error for others |
| `eth_getStorageAt` | ✅ For tracked contracts; ❌ error for others |
| `eth_getProof` (account) | ✅ Works for ANY account |
| `eth_getProof` (storage) | ✅ For tracked contracts; ❌ error for others |
| `eth_call` | ✅ If touches only tracked contracts; ❌ if touches untracked |
| `eth_estimateGas` | Same as eth_call |
| `eth_sendRawTransaction` | ✅ Mempool validation works (only needs account data) |
---
## Binary Trie (EIP-7864) Compatibility
### Will This Design Work With Binary Trie?
**Yes**, with minimal changes:
| Aspect | MPT | Binary Trie | Compatibility |
|--------|-----|-------------|---------------|
| Account data | StateAccount struct | Same struct | ✅ Compatible |
| Trie interface | `Trie` interface | Same interface | ✅ Compatible |
| BAL format | Per EIP-7928 | Same format | ✅ Compatible |
| Selective storage | Skip storage tries | Skip stem suffixes | ✅ Compatible |
| Proof generation | Merkle proofs | Path proofs | ✅ Use interface |
### Adaptation Needed
Only the storage size estimates change:
- Binary Trie total: ~48 GB (vs. MPT ~30-40 GB with compression)
- Binary Trie has simpler structure, no compression needed
**Recommendation**: Use go-ethereum's `Trie` interface which abstracts over both.
---
## Implementation Phases
### Phase 1: Configuration & Infrastructure
- Add `PartialStateConfig` to `eth/ethconfig/config.go`
- Create `core/state/partial/` package with `ContractFilter` interface
- Add CLI flags for partial state mode
### Phase 2: Snap Sync Modifications
- Modify `eth/protocols/snap/sync.go` for selective storage sync
- Add filter checks in `processAccountResponse` and `processStorageResponse`
- Track separate progress for account trie vs. storage
### Phase 3: BAL Processing
- Implement BAL diff application in block import pipeline
- Modify `core/blockchain.go` to use BAL for state updates
- Add state root verification without re-execution
### Phase 4: RPC & Operations
- Modify `internal/ethapi/api.go` for partial state awareness
- Add appropriate errors for untracked contract queries
- Implement BAL history management and reorg handling
---
## Key Files to Modify
| File | Changes |
|------|---------|
| `eth/ethconfig/config.go` | Add `PartialStateConfig` |
| `core/state/partial/filter.go` | New: `ContractFilter` interface |
| `eth/protocols/snap/sync.go` | Filter storage sync by config |
| `core/blockchain.go` | BAL-based state updates |
| `internal/ethapi/api.go` | Partial state RPC handling |
| `cmd/utils/flags.go` | CLI flags for partial state |
---
## Open Items for Implementation
1. **BLOCKHASH opcode**: Verify 256 blocks of history is sufficient; check if other opcodes need block history
2. **Storage root verification**: When applying BAL storage diffs for tracked contracts, verify computed storage root matches account's storageRoot field
3. **Compression implementation**: Implement delta encoding + bitmap optimization for intermediate nodes (existing pathdb patterns can be adapted)
4. **Selective snap sync protocol**: Determine whether the snap protocol needs an extension or whether filtering can be done entirely client-side
---
## Verification Checklist
After implementation, verify:
- [ ] Can sync account trie completely via snap sync
- [ ] Can sync only configured contracts' storage
- [ ] BAL diffs apply correctly, state root matches header
- [ ] eth_getProof works for any account (proof generation)
- [ ] eth_getProof returns error for untracked storage
- [ ] Mempool accepts/validates transactions correctly
- [ ] Reorgs up to BAL retention depth work
- [ ] Deeper reorgs trigger recovery from full peers
- [ ] Total storage matches estimates (~30-40 GB + configured storage)
---
# DETAILED SPECIFICATIONS
---
## SPEC 1: Snap Sync Refactoring for Selective Storage
### Overview
The snap sync protocol in go-ethereum downloads account data and contract storage in parallel. For partial statefulness, we need to:
1. Download ALL accounts (unchanged behavior)
2. Download storage ONLY for configured contracts (new filtering)
3. Download bytecode ONLY for configured contracts (new filtering)
**Design Principle**: Keep original `Syncer` implementation untouched. Create a separate syncer implementation using a strategy/interface pattern that allows selection at runtime.
### Architecture: Strategy Pattern
```
                 ┌─────────────────────┐
                 │    SyncStrategy     │ (interface)
                 │      interface      │
                 └──────────┬──────────┘
         ┌──────────────────┼──────────────────┐
         │                  │                  │
┌────────▼───────┐ ┌────────▼───────┐ ┌────────▼───────┐
│   FullSyncer   │ │ PartialSyncer  │ │    (future)    │
│  (wraps orig)  │ │   (new impl)   │ │                │
└────────────────┘ └────────────────┘ └────────────────┘
```
### Key Files
| File | Purpose |
|------|---------|
| `eth/protocols/snap/sync.go` | **UNCHANGED** - Original Syncer |
| `eth/protocols/snap/strategy.go` | **NEW** - SyncStrategy interface |
| `eth/protocols/snap/partial_sync.go` | **NEW** - PartialSyncer implementation |
| `core/state/partial/filter.go` | **NEW** - ContractFilter interface |
| `eth/downloader/downloader.go` | **MODIFIED** - Strategy selection |
---
## SPEC 2: Compression + Root Recomputation
### Overview
For partial statefulness, we store the full account trie (~300M accounts + intermediate nodes) but need efficient storage. This spec covers:
1. **REUSE** existing delta encoding infrastructure from pathdb
2. State root recomputation from BAL diffs
### Existing Compression Infrastructure (REUSE - DO NOT REIMPLEMENT)
**Location**: `triedb/pathdb/nodes.go` (lines 431-691)
go-ethereum **already has production-grade compression** we must reuse:
| Function | Purpose | Status |
|----------|---------|--------|
| `encodeNodeCompressed()` | Delta encoding with bitmap | **REUSE** |
| `decodeNodeCompressed()` | Decode compressed format | **REUSE** |
| `encodeNodeFull()` | Full-value encoding | **REUSE** |
| `encodeNodeHistory()` | Checkpoint + delta chains | **REUSE** |
---
## SPEC 3: BAL Processing Pipeline
### Overview
Block Access Lists (BALs) per EIP-7928 provide state diffs that allow partial nodes to update state without re-executing transactions.
### Existing BAL Implementation (Already in Geth)
**Location**: `core/types/bal/`
BAL types are already implemented in go-ethereum master:
| File | Contents |
|------|----------|
| `bal.go` | `ConstructionBlockAccessList`, `ConstructionAccountAccess`, builder methods |
| `bal_encoding.go` | `BlockAccessList`, `AccountAccess`, RLP encoding, hash computation |
| `bal_encoding_rlp_generated.go` | Generated RLP encoder/decoder |
---
## SPEC 4: RPC Modifications
### Overview
Partial state nodes can answer some RPC queries but not others. This spec defines the behavior.
### Error Codes
```go
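import "errors"

// Sentinel errors returned when a query touches untracked data.
var (
	ErrStorageNotTracked = errors.New("storage not tracked for this contract")
	ErrCodeNotTracked    = errors.New("code not tracked for this contract")
)

// JSON-RPC error codes for the sentinel errors above.
const (
	ErrCodeStorageNotTracked = -32001
	ErrCodeCodeNotTracked    = -32002 // code for ErrCodeNotTracked
)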
```
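
One way to attach these codes to responses is an error type that exposes `ErrorCode() int`, which is the shape go-ethereum's `rpc` package looks for when building JSON-RPC error objects. The sketch below is an assumption-laden illustration: the `PartialStateError` field layout, the `getStorageAt` helper, and the map-based storage are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical code constant mirroring the value in the spec above.
const codeStorageNotTracked = -32001

// PartialStateError carries a JSON-RPC error code alongside the message.
type PartialStateError struct {
	code int
	msg  string
}

func (e *PartialStateError) Error() string  { return e.msg }
func (e *PartialStateError) ErrorCode() int { return e.code }

// getStorageAt is a hypothetical handler: it answers only for tracked
// contracts and returns a coded error for everything else.
func getStorageAt(tracked map[string]map[string]string, contract, slot string) (string, error) {
	slots, ok := tracked[contract]
	if !ok {
		return "", &PartialStateError{
			code: codeStorageNotTracked,
			msg:  "storage not tracked for this contract",
		}
	}
	return slots[slot], nil
}

func main() {
	tracked := map[string]map[string]string{"0xweth": {"0x0": "0x2a"}}

	val, err := getStorageAt(tracked, "0xweth", "0x0")
	fmt.Println(val, err)

	_, err = getStorageAt(tracked, "0xother", "0x0")
	var pe *PartialStateError
	if errors.As(err, &pe) {
		fmt.Println(pe.ErrorCode(), pe.Error())
	}
}
```
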
---
## SPEC 5: Configuration System
### CLI Flags
```go
var (
	PartialStateFlag = &cli.BoolFlag{
		Name:     "partial-state",
		Usage:    "Enable partial statefulness mode (reduced storage)",
		Category: flags.EthCategory,
	}
	PartialStateContractsFlag = &cli.StringSliceFlag{
		Name:     "partial-state.contracts",
		Usage:    "Contracts to track storage for (comma-separated addresses)",
		Category: flags.EthCategory,
	}
	PartialStateContractsFileFlag = &cli.StringFlag{
		Name:     "partial-state.contracts-file",
		Usage:    "JSON file containing contracts to track",
		Category: flags.EthCategory,
	}
	PartialStateBALRetentionFlag = &cli.Uint64Flag{
		Name:     "partial-state.bal-retention",
		Usage:    "Number of blocks to retain BAL history (default: 256)",
		Value:    256,
		Category: flags.EthCategory,
	}
)
```
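
Values passed to `--partial-state.contracts` need validation before they reach the filter. The following is a sketch of such a check using only the standard library; `parseAddresses` and the local `Address` type are hypothetical, and the sketch does not verify EIP-55 checksums.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// Address is a stand-in for common.Address.
type Address [20]byte

// parseAddresses splits a comma-separated list and decodes each entry as a
// 20-byte hex address. Empty entries are ignored; checksums are not verified.
func parseAddresses(list string) ([]Address, error) {
	var out []Address
	for _, field := range strings.Split(list, ",") {
		field = strings.TrimSpace(field)
		if field == "" {
			continue
		}
		raw := strings.TrimPrefix(field, "0x")
		if len(raw) != 40 {
			return nil, fmt.Errorf("address %q: want 40 hex characters, got %d", field, len(raw))
		}
		decoded, err := hex.DecodeString(raw)
		if err != nil {
			return nil, fmt.Errorf("address %q: %w", field, err)
		}
		var addr Address
		copy(addr[:], decoded)
		out = append(out, addr)
	}
	return out, nil
}

func main() {
	addrs, err := parseAddresses(
		"0xC02aaA39b223FE8D0A0e5C4F27eAD9083C756Cc2, 0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48")
	fmt.Println(len(addrs), err)

	_, err = parseAddresses("0x1234")
	fmt.Println(err)
}
```
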
---
## Implementation Task Breakdown
### Phase 1: Core Infrastructure (Foundation)
| Task ID | Task | Dependencies | Effort |
|---------|------|--------------|--------|
| 1.1 | Create `core/state/partial/` package structure | None | S |
| 1.2 | Implement `ContractFilter` interface | 1.1 | S |
| 1.3 | Add `PartialStateConfig` to ethconfig | None | S |
| 1.4 | Add CLI flags for partial state | 1.3 | S |
| 1.5 | Implement config loading (file + direct) | 1.3, 1.4 | M |
### Phase 2: Snap Sync Modifications (Selective Sync via Strategy Pattern)
| Task ID | Task | Dependencies | Effort |
|---------|------|--------------|--------|
| 2.1 | Create `SyncStrategy` interface in `strategy.go` | None | S |
| 2.2 | Create `FullSyncStrategy` wrapper (embeds original Syncer) | 2.1 | S |
| 2.3 | Create `PartialSyncer` struct in `partial_sync.go` | 1.2, 2.1 | M |
| 2.4 | Implement account processing with storage filtering | 2.3 | M |
| 2.5 | Add `markStorageSkipped` / `isStorageSkipped` helpers | 2.3 | S |
| 2.6 | Implement healing with skip checks | 2.5 | M |
| 2.7 | Modify Downloader to use `SyncStrategy` interface | 2.1, 2.2 | S |
| 2.8 | Add strategy selection based on config | 2.7 | S |
| 2.9 | Unit tests for PartialSyncer | 2.4, 2.6 | M |
| 2.10 | Integration test with partial filter | 2.9 | L |
### Phase 3: BAL Processing (State Updates)
| Task ID | Task | Dependencies | Effort |
|---------|------|--------------|--------|
| 3.1 | Add BAL key schema to `core/rawdb/schema.go` | None | S |
| 3.2 | Create `core/rawdb/accessors_bal.go` (following existing pattern) | 3.1 | S |
| 3.3 | Create thin `BALHistory` wrapper in `core/state/partial/history.go` | 3.2 | S |
| 3.4 | Implement `ApplyBALAndComputeRoot` using existing BAL types + trie | Phase 2 | L |
| 3.5 | Implement `applyStorageChanges` for tracked contracts | 3.4 | M |
| 3.6 | Add `ProcessBlockWithBAL` to BlockChain | 3.4, 3.3 | L |
| 3.7 | Implement reorg handling with BAL history | 3.3, 3.6 | L |
| 3.8 | Engine API integration for BAL delivery | 3.6 | M |
| 3.9 | BAL processing tests | 3.6, 3.7 | L |
### Phase 4: RPC Modifications (API Layer)
| Task ID | Task | Dependencies | Effort |
|---------|------|--------------|--------|
| 4.1 | Add `PartialStateError` and error codes | None | S |
| 4.2 | Add `PartialStateEnabled`, `IsContractTracked` to Backend | 1.2 | S |
| 4.3 | Modify `GetStorageAt` for partial state | 4.1, 4.2 | S |
| 4.4 | Modify `GetCode` for partial state | 4.1, 4.2 | S |
| 4.5 | Modify `GetProof` (account ok, storage filtered) | 4.1, 4.2 | M |
| 4.6 | Modify `Call` / `EstimateGas` with pre-check | 4.1, 4.2 | M |
| 4.7 | RPC behavior tests | 4.3-4.6 | M |
### Phase 5: Integration & Testing
| Task ID | Task | Dependencies | Effort |
|---------|------|--------------|--------|
| 5.1 | End-to-end partial sync test | Phase 2, Phase 3 | L |
| 5.2 | Verify storage size meets estimates | 5.1 | M |
| 5.3 | Reorg recovery test | Phase 3 | M |
| 5.4 | RPC integration test | Phase 4, 5.1 | M |
| 5.5 | Documentation updates | All | M |
### Effort Legend
- **S** = Small (few hours)
- **M** = Medium (1-2 days)
- **L** = Large (3-5 days)
---
## Critical Path
The critical path for minimum viable partial statefulness:
1. **Phase 1**: Configuration infrastructure
2. **Phase 2**: Selective snap sync via strategy pattern (accounts + filtered storage)
3. **Phase 3**: BAL processing (state updates without re-execution, using existing BAL types)
4. **Phase 4**: RPC modifications (proper error handling)
5. **Phase 5**: End-to-end test
This enables a working partial stateful node. Compression and full reorg handling can be added incrementally.
## Key Design Decisions Summary
| Decision | Approach | Rationale |
|----------|----------|-----------|
| Snap sync | Strategy pattern with separate `PartialSyncer` | Keep original `Syncer` untouched |
| BAL types | Use existing `core/types/bal/` | Already implemented in geth master |
| Filter interface | `ContractFilter` interface | Flexible, testable |
| Skip tracking | DB markers + in-memory map | Persist across restarts |
| RPC errors | Custom error codes | Clear user feedback |
---
## Reuse vs. New Code Summary
### REUSING (Do Not Reimplement)
| Component | Existing Location | How We Use It |
|-----------|-------------------|---------------|
| **BAL Types** | `core/types/bal/` | Import directly |
| **Compression** | `triedb/pathdb/nodes.go` | `encodeNodeCompressed()`, `encodeNodeHistory()` |
| **Delta Encoding** | `trie/node.go` | `NodeDifference()` |
| **Checkpoint Mechanism** | `triedb/pathdb/config.go` | `FullValueCheckpoint` config |
| **Diff Layers** | `triedb/pathdb/difflayer.go` | `nodeSetWithOrigin`, `StateSetWithOrigin` |
| **History Key Patterns** | `core/rawdb/schema.go` | Follow `StateHistoryAccountBlockPrefix` pattern |
| **History Accessors** | `core/rawdb/accessors_history.go` | Follow Read/Write/Delete triplet pattern |
| **Safe Deletion** | `core/rawdb/database.go` | `SafeDeleteRange()` for pruning |
| **Filter Patterns** | `eth/filters/filter.go` | Reference for contract filtering |
| **Trie Interface** | `trie/trie.go` | Standard trie operations |
### CREATING NEW
| Component | New Location | Purpose |
|-----------|--------------|---------|
| `SyncStrategy` interface | `eth/protocols/snap/strategy.go` | Abstract sync implementations |
| `PartialSyncer` | `eth/protocols/snap/partial_sync.go` | Filtered storage sync |
| `ContractFilter` | `core/state/partial/filter.go` | Contract tracking interface |
| `PartialState` | `core/state/partial/state.go` | BAL application + root computation |
| BAL key schema | `core/rawdb/schema.go` | Add `balHistoryPrefix` |
| BAL accessors | `core/rawdb/accessors_bal.go` | Read/Write/Delete following pattern |
| `BALHistory` wrapper | `core/state/partial/history.go` | Thin layer over rawdb |
| `ProcessBlockWithBAL` | `core/blockchain_partial.go` | Block processing entry point |
| RPC error codes | `internal/ethapi/` | Partial state errors |
| Config | `eth/ethconfig/config.go` | `PartialStateConfig` |
| CLI flags | `cmd/utils/flags.go` | Partial state flags |