Master added (via PR #34754) a dirty bool to InternalNode/StemNode plus a
CollectNodes short-circuit that skips clean subtrees — the arena branch
diverged before that landed. Port the semantics onto the arena shape:
- Add dirty bool to InternalNode and StemNode.
- Wire dirty=true alongside every existing mustRecompute=true setter in
node_store.go (newInternalRef, newStemRef) and store_ops.go (8 mutation
sites across InsertSingle/insertSingleInternal/InsertValuesAtStem/
insertValuesAtStem/splitStemInsert/splitStemValuesInsert).
- Add 'if !node.dirty { return nil }' gate at the top of CollectNodes for
both KindInternal and KindStem; clear dirty after flushfn runs.
- Plumb a dirty parameter through deserializeNode; DeserializeNode passes
dirty=true (safe default), DeserializeNodeWithHash passes dirty=false
(loaded from disk, blob matches).
The arena test in trie_test.go that was auto-merged from master used
master-shape struct literals (tr.root, NewBinaryNode) that don't exist on
arena; delete those and replace with TestCommitSkipCleanSubtrees, an
arena-native version that asserts first-Commit flushes all nodes, no-op
Commit flushes none, and single-leaf Commit flushes only the root-to-leaf
path.
A prior commit aggressively trimmed comments. This restores the
ones that carried real information — ownership contracts on
returned slices, the index-tracking semantics inside Next(),
the Parent() grandparent note, and the "at a leaf" stem rule —
so future readers aren't left guessing. Also elaborates the
parallel-hashing rationale in hashInternal.
Replace the single-field struct with a type alias on common.Hash.
Both have identical layout (32 bytes, no pointers) and noscan span
placement, but the alias matches master's style and reads more
naturally. A zero-arg Hash() method keeps call sites terse.
Replace the BinaryNode interface (which uses Go interface pointers that
the GC must scan) with NodeRef uint32 indices into typed arena pools.
NodeRef packs a 2-bit kind tag and 30-bit pool index into a single
uint32, making it invisible to the garbage collector.
NodeStore manages chunked typed pools per node kind:
- InternalNode pool: ZERO Go pointers (children are NodeRef, hash is
[32]byte) → allocated in noscan spans, GC skips entirely
- HashedNode pool: ZERO Go pointers → noscan spans
- StemNode pool: ONE pointer per node (valueData []byte) → minimal GC
For a trie with 25K InternalNodes, this reduces GC-scanned pointer-words
from ~125K to ~10K (85% reduction). CPU profiling showed 44% of time
in GC; this refactor directly addresses that bottleneck.
Serialization format is unchanged — the on-disk representation is
fully compatible. All existing tests pass.