Commit graph

18 commits

Author SHA1 Message Date
CPerezz
c24930bebf
trie/bintrie: unexport NodeStore, NewNodeStore, NodeFlushFn, NodeResolverFn
grep across the repo confirms zero external callers of bintrie.NodeStore,
NewNodeStore, NodeFlushFn, or NodeResolverFn. The arena is purely an
implementation detail of BinaryTrie; unexport the top-level names so
the package's external surface stays confined to BinaryTrie plus the
EIP-7864 helpers (ChunkifyCode, GetBinaryTreeKey*).

Methods on *nodeStore remain capitalized for now — with nodeStore
itself unexported, external code has no way to hold a *nodeStore
pointer, so the methods are effectively internal despite their case.
Method case is a cosmetic follow-up.
2026-04-19 22:18:43 +02:00
CPerezz
f676f04706
trie/bintrie: move dirty+mustRecompute flip into setValue
The invariant "mutating a value slot must mark the stem for re-hash
and re-flush" was enforced by every caller remembering to set both
flags after setValue. Moving the flip into setValue itself makes it
structurally impossible to forget, and drops the duplicate flag-sets
at each callsite.

decodeNode's on-disk load path still writes directly to sn.values
because loaded stems must retain whatever mustRecompute/dirty state
the caller asked for (typically both false).
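
A minimal sketch of the invariant described above, assuming a simplified stemNode with only the fields the message names. loadValue is a hypothetical stand-in for the direct write that decodeNode performs; the real types carry more state.

```go
package main

import "fmt"

const stemNodeWidth = 256

// stemNode is a simplified stand-in for the arena's stem node.
type stemNode struct {
	values        [stemNodeWidth][]byte
	dirty         bool
	mustRecompute bool
}

// setValue stores v and marks the stem for re-hash and re-flush,
// so callers cannot forget to flip the flags.
func (sn *stemNode) setValue(suffix byte, v []byte) {
	sn.values[suffix] = v
	sn.mustRecompute = true
	sn.dirty = true
}

// loadValue (hypothetical name) mimics the on-disk load path: it writes
// the slot directly and leaves both flags as the caller set them.
func (sn *stemNode) loadValue(suffix byte, v []byte) {
	sn.values[suffix] = v
}

func main() {
	var loaded, mutated stemNode
	loaded.loadValue(3, []byte{1})
	mutated.setValue(3, []byte{1})
	fmt.Println(loaded.dirty, loaded.mustRecompute)   // false false
	fmt.Println(mutated.dirty, mutated.mustRecompute) // true true
}
```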
2026-04-19 22:16:50 +02:00
CPerezz
99520432ec
trie/bintrie: roll back split-stem depth on error
splitStemValuesInsert increments existing.depth before recursing into
insertValuesAtStem. If the recursion fails, the depth stays incremented
but the tree is not re-rooted through the new internal, so a retry
reads bitStem at the wrong offset and can place the stem on the wrong
side of a fresh split. Roll back existing.depth on error to keep the
stem consistent across retries.
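
The rollback can be sketched as follows. This is an illustration only: the stem type, splitInsert, and the recurse callback are hypothetical simplifications of splitStemValuesInsert and its recursion into insertValuesAtStem.

```go
package main

import (
	"errors"
	"fmt"
)

// errFail simulates a failure deeper in the recursion.
var errFail = errors.New("resolver failed")

// stem keeps only the depth field relevant to the fix.
type stem struct{ depth int }

// splitInsert bumps the depth before recursing and restores it if the
// recursion fails, so a retry reads the stem bit at the original offset.
func splitInsert(existing *stem, recurse func() error) error {
	existing.depth++
	if err := recurse(); err != nil {
		existing.depth-- // roll back: the tree was not re-rooted
		return err
	}
	return nil
}

func main() {
	s := &stem{depth: 4}
	err := splitInsert(s, func() error { return errFail })
	fmt.Println(err != nil, s.depth) // true 4
	_ = splitInsert(s, func() error { return nil })
	fmt.Println(s.depth) // 5
}
```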
2026-04-19 22:14:31 +02:00
CPerezz
b86e2d3e20
trie/bintrie: inline get/InsertValuesAtStem wrappers
Gballet's three empty 'suggestion' blocks (comments 3101685618,
3101734697, 3101736436) mark the unexported wrapper declarations on
getValuesAtStem and insertValuesAtStem plus one temporary-var line.
Apply:

- Inline the unexported getValuesAtStem body into GetValuesAtStem (start
  the walk at s.root directly instead of via a two-arg helper). The
  function is not self-recursive, so the wrapper was pure indirection.
- Tighten InsertValuesAtStem to two lines using the 's.root, err = ...'
  idiom — the recursive helper stays (it IS self-recursive), only the
  public entry point gets the cleanup.

Adds docstrings on both public entry points.
2026-04-18 18:59:27 +02:00
CPerezz
aa21cd1b80
trie/bintrie: remove hasParent dead code in getValuesAtStem
Gballet (comment 3101708418): the hasParent check in the kindHashed
branch never fires — NewBinaryTrie resolves the root eagerly at open
time, so any HashedNode we encounter during a getValuesAtStem walk is
necessarily a child of a previously-visited internal (parentIdx /
parentIsLeft set on the prior kindInternal iteration).

Drop the hasParent flag and its setter; replace the check with a short
comment stating the invariant.
2026-04-18 18:57:56 +02:00
CPerezz
33227e7e6d
trie/bintrie: merge *Single paths into *ValuesAtStem
Per gballet's comment 3101751325 on PR #34055: the *Single functions
are essentially the same thing as *ValuesAtStem with one slot set. The
original design dispatched through *ValuesAtStem for dedup; this commit
restores that shape on the arena side.

- GetValue now delegates to GetValuesAtStem and indexes the returned
  256-slot array header (no allocation — the stem node returns its own
  inline values array as a slice).
- InsertSingle now builds a stack-allocated [StemNodeWidth][]byte with
  only the target slot set and delegates to InsertValuesAtStem.
- Delete the insertSingleInternal tree walker (~90 LOC) and the whole
  splitStemInsert (~60 LOC) — the *ValuesAtStem / splitStemValuesInsert
  pair already handles every case.

Addresses gballet comments 3101751325, 3101739001, 3101724199, 3101721238
(the last three subsumed by the consolidation — the duplicated helper
bodies no longer exist).

Net: ~150 LOC removed from store_ops.go. Allocation cost for InsertSingle
is bounded by the stack-allocated 256-slot array (one stack frame, no
heap allocation on the hot path).
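
A hedged sketch of the delegation shape. The store type, its map of stems keyed by string, and the lowercase method names are assumptions made to keep the example self-contained; the real code walks the arena instead. Whether the local array stays on the stack is decided by Go's escape analysis, so the sketch does not demonstrate that claim.

```go
package main

import (
	"errors"
	"fmt"
)

const stemNodeWidth = 256

// store is a toy replacement for the arena: one value array per stem.
type store struct {
	stems map[string]*[stemNodeWidth][]byte
}

// insertValuesAtStem copies every non-nil slot into the stem's array.
func (s *store) insertValuesAtStem(stem string, values [][]byte) error {
	if len(values) != stemNodeWidth {
		return errors.New("values must have 256 slots")
	}
	sn, ok := s.stems[stem]
	if !ok {
		sn = new([stemNodeWidth][]byte)
		s.stems[stem] = sn
	}
	for i, v := range values {
		if v != nil {
			sn[i] = v
		}
	}
	return nil
}

// getValuesAtStem returns the stem's own array as a slice (no copy).
func (s *store) getValuesAtStem(stem string) [][]byte {
	sn, ok := s.stems[stem]
	if !ok {
		return nil
	}
	return sn[:]
}

// insertSingle sets one slot in a local array and delegates.
func (s *store) insertSingle(stem string, suffix byte, value []byte) error {
	var values [stemNodeWidth][]byte
	values[suffix] = value
	return s.insertValuesAtStem(stem, values[:])
}

// getValue delegates to getValuesAtStem and indexes the result.
func (s *store) getValue(stem string, suffix byte) []byte {
	values := s.getValuesAtStem(stem)
	if values == nil {
		return nil
	}
	return values[suffix]
}

func main() {
	s := &store{stems: map[string]*[stemNodeWidth][]byte{}}
	if err := s.insertSingle("stem-a", 7, []byte("v")); err != nil {
		panic(err)
	}
	fmt.Printf("%s %v\n", s.getValue("stem-a", 7), s.getValue("stem-a", 8) == nil) // v true
}
```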
2026-04-18 18:57:23 +02:00
CPerezz
bbf062c746
trie/bintrie: inline getSingle into GetValue
Gballet asked (comment 3101679920) to fold the unexported getSingle helper
into its single caller, and (comment 3101677731) to rename GetSingle ('bad
name: get single what?') with a top-level docstring.

- Inline getSingle into GetValue (one function instead of two).
- Rename GetSingle → GetValue and add a docstring.
- Drop the hasParent tracker that was only used for the 'hashed at root'
  guard — that case is now handled by kindEmpty / the top-level
  NewBinaryTrie-time root resolution, so remove the check rather than
  keep dead state.

CE2 will later fold this into GetValuesAtStem; this commit closes the
naming + inline asks independently.
2026-04-18 18:54:26 +02:00
CPerezz
e1859ea864
trie/bintrie: simplify StemNode to array-of-slices representation
Gballet asked on PR #34055 (comments 3100043116, 3100050542, and the
bit-check dedup at 3100114416 / 3100878310) to revert StemNode from the
packed-bytes representation to the straightforward array-of-slices.

Before: StemNode carried a bitmap, a concatenated valueData []byte, a
count, and a shared COW flag. Every read/write went through a bit-count
posInData lookup; every mutation through ensureWritable COW.

After: values [StemNodeWidth][]byte — 256 slots, nil == absent. No
bitmap lookup, no COW. Direct sn.values[suffix] access.

Supporting changes:
- Drop posInData, ensureWritable; rewrite getValue/hasValue/allValues/
  setValue as trivial slice access.
- Hash() iterates sn.values directly, matching master's shape.
- SerializeNode emits the bitmap + concatenated bytes on the wire from
  the array-of-slices at serialize time; wire format unchanged.
- decodeNode populates sn.values[i] slots by aliasing the serialized
  buffer (zero-copy).
- NodeStore.Copy deep-copies each slot.
- splitStemValuesInsert + the insertSingleInternal paths write directly
  to sn.values[i].
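
To illustrate the serialize-time conversion and the zero-copy decode, here is a sketch under stated assumptions: a 32-byte bitmap with the most significant bit first, and fixed 32-byte values. Both are assumptions for the example, not a statement of the actual wire format, and encodeValues / decodeValues are hypothetical names.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

const (
	stemNodeWidth = 256
	bitmapSize    = stemNodeWidth / 8
	valueSize     = 32 // assumed fixed value width
)

// encodeValues emits a presence bitmap followed by the present values,
// concatenated in slot order.
func encodeValues(values *[stemNodeWidth][]byte) []byte {
	out := make([]byte, bitmapSize, bitmapSize+valueSize)
	for i, v := range values {
		if v == nil {
			continue
		}
		out[i/8] |= 1 << (7 - i%8) // assumed bit order: MSB first
		var slot [valueSize]byte
		copy(slot[:], v)
		out = append(out, slot[:]...)
	}
	return out
}

// decodeValues fills the slots by slicing into buf, so the decoded
// values alias the serialized buffer instead of copying it.
func decodeValues(buf []byte) ([stemNodeWidth][]byte, error) {
	var values [stemNodeWidth][]byte
	if len(buf) < bitmapSize {
		return values, errors.New("buffer shorter than bitmap")
	}
	bitmap, data := buf[:bitmapSize], buf[bitmapSize:]
	for i := 0; i < stemNodeWidth; i++ {
		if bitmap[i/8]&(1<<(7-i%8)) == 0 {
			continue
		}
		if len(data) < valueSize {
			return values, errors.New("truncated value data")
		}
		values[i] = data[:valueSize:valueSize]
		data = data[valueSize:]
	}
	return values, nil
}

func main() {
	var values [stemNodeWidth][]byte
	values[5] = bytes.Repeat([]byte{0xaa}, valueSize)
	blob := encodeValues(&values)
	decoded, err := decodeValues(blob)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(blob), bytes.Equal(decoded[5], values[5]), decoded[4] == nil) // 64 true true
}
```

Because the decoded slots alias the input, the caller must not reuse or mutate the buffer while the node is live.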

Trade-off: stems now carry 256 []byte headers (6144 B) instead of 1
concatenated slice (~32 B) + bitmap. Stem-pool scan cost returns to
parity with master (the existing valueData pointer already made the
pool non-noscan; rollback adds 255 more pointers per stem). The primary
arena win — pointer-free InternalNode pool — is preserved.
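
The 6144 B figure follows from Go's slice header being three machine words. A small check, assuming a 64-bit platform for the printed value (valuesSize and wordSize are names introduced for this sketch):

```go
package main

import (
	"fmt"
	"unsafe"
)

const stemNodeWidth = 256

// wordSize is the size of one machine word on the current platform.
const wordSize = unsafe.Sizeof(uintptr(0))

// valuesSize reports the in-memory size of the slot headers alone,
// excluding the value bytes they point to.
func valuesSize() uintptr {
	var values [stemNodeWidth][]byte
	return unsafe.Sizeof(values)
}

func main() {
	// Each []byte header is pointer + len + cap = 3 words.
	fmt.Println(valuesSize()) // 6144 on 64-bit platforms (256 * 24)
}
```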
2026-04-18 18:53:07 +02:00
CPerezz
2d44d8a4b6
trie/bintrie: unexport package-internal arena identifiers
Gballet asked on PR #34055 to unexport nodeRef, nodeKind, and makeRef
(comments 3099846639, 3099847640, 3100717855) — none are used outside
trie/bintrie. Cascade to the internal-only support symbols and methods:

  NodeKind          → nodeKind
  KindEmpty/...     → kindEmpty/...
  NodeRef           → nodeRef
  EmptyRef          → emptyRef
  MakeRef           → makeRef
  NodeStore.Root    → deleted; inlined to s.root field access (same pkg)
  NodeStore.SetRoot → deleted; inlined to s.root = ref
  NodeStore.ComputeHash/SerializeNode/DeserializeNode(WithHash)/
  CollectNodes/ToDot/GetHeight → lowercased

All 9 method signatures took or returned nodeRef, so leaving them exported
would have tripped revive:unexported-return after the type rename. Zero
external callers means no API break. The existing private deserializeNode
helper was renamed to decodeNode to free the name for the formerly public
DeserializeNode, which is now unexported as deserializeNode.

Pure rename; no behaviour change.
2026-04-18 18:49:04 +02:00
CPerezz
939b36345f
trie/bintrie: port dirty flag + CollectNodes skip-clean from master
Master added (via PR #34754) a dirty bool to InternalNode/StemNode plus a
CollectNodes short-circuit that skips clean subtrees — the arena branch
diverged before that landed. Port the semantics onto the arena shape:

- Add dirty bool to InternalNode and StemNode.
- Wire dirty=true alongside every existing mustRecompute=true setter in
  node_store.go (newInternalRef, newStemRef) and store_ops.go (8 mutation
  sites across InsertSingle/insertSingleInternal/InsertValuesAtStem/
  insertValuesAtStem/splitStemInsert/splitStemValuesInsert).
- Add 'if !node.dirty { return nil }' gate at the top of CollectNodes for
  both KindInternal and KindStem; clear dirty after flushfn runs.
- Plumb a dirty parameter through deserializeNode; DeserializeNode passes
  dirty=true (safe default), DeserializeNodeWithHash passes dirty=false
  (loaded from disk, blob matches).

The test in trie_test.go that was auto-merged from master used
master-shape struct literals (tr.root, NewBinaryNode) that don't exist on
the arena branch; delete it and replace it with TestCommitSkipCleanSubtrees,
an arena-native version that asserts that the first Commit flushes all
nodes, a no-op Commit flushes none, and a single-leaf Commit flushes only
the root-to-leaf path.
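
The skip-clean gate and the three assertions can be sketched on a toy tree. For brevity this uses ordinary pointers rather than arena indices, and node, collect, and countFlushed are hypothetical names; only the dirty-gate logic is meant to mirror the description above.

```go
package main

import "fmt"

// node is a toy tree node; the real arena uses index-based references.
type node struct {
	left, right *node
	dirty       bool
}

// newNode creates a node that starts dirty, like a freshly inserted one.
func newNode(left, right *node) *node {
	return &node{left: left, right: right, dirty: true}
}

// collect flushes dirty nodes bottom-up and skips clean subtrees entirely.
func collect(n *node, flush func(*node)) {
	if n == nil || !n.dirty {
		return
	}
	collect(n.left, flush)
	collect(n.right, flush)
	flush(n)
	n.dirty = false // clear only after the flush callback has run
}

// countFlushed runs collect and reports how many nodes reached flush.
func countFlushed(root *node) int {
	count := 0
	collect(root, func(*node) { count++ })
	return count
}

func main() {
	leaf := newNode(nil, nil)
	a := newNode(leaf, newNode(nil, nil))
	b := newNode(newNode(nil, nil), newNode(nil, nil))
	root := newNode(a, b)

	fmt.Println(countFlushed(root)) // 7: first commit flushes everything
	fmt.Println(countFlushed(root)) // 0: no-op commit flushes nothing
	root.dirty, a.dirty, leaf.dirty = true, true, true
	fmt.Println(countFlushed(root)) // 3: only the root-to-leaf path
}
```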
2026-04-18 18:45:12 +02:00
CPerezz
b3b86a873a
trie/bintrie: merge makeKeyPath into keyToPath
Drop the panic-on-error variant. All callers are inside methods
that already propagate errors, so the error-returning form is the
right default.
2026-04-18 18:38:37 +02:00
CPerezz
84c61897b3
trie/bintrie: use type alias for HashedNode
Replace the single-field struct with a type alias on common.Hash.
Both have identical layout (32 bytes, no pointers) and noscan span
placement, but the alias matches master's style and reads more
naturally. A zero-arg Hash() method keeps call sites terse.
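
A sketch of the shape described, with a local hash type standing in for common.Hash. One caveat worth stating as an assumption about the actual code: Go does not allow declaring methods on a true alias (type T = pkg.U) of a type from another package, so a Hash() method implies a defined type (type T pkg.U), which is what the sketch uses.

```go
package main

import (
	"fmt"
	"unsafe"
)

// hash stands in for common.Hash (a 32-byte array).
type hash [32]byte

// hashedNode is a defined type over hash: same layout, no pointers,
// but with its own method set.
type hashedNode hash

// Hash returns the node's hash; the conversion is free at runtime.
func (h hashedNode) Hash() hash { return hash(h) }

func main() {
	var h hash
	h[0] = 0xab
	n := hashedNode(h)
	fmt.Println(n.Hash() == h, unsafe.Sizeof(n)) // true 32
}
```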
2026-04-18 18:38:37 +02:00
CPerezz
5f94d26db8
trie/bintrie: update copyright year on newly added files
These four files were introduced in this PR and should carry the
current year.
2026-04-18 18:38:37 +02:00
CPerezz
b4a7118d06
trie/bintrie: trim verbose doc comments to essentials
2026-04-18 18:38:37 +02:00
CPerezz
efc164b850
trie/bintrie: move ensureWritable into setValue
2026-04-18 18:38:37 +02:00
CPerezz
0f6b066156
trie/bintrie: add nil-resolver guards for hashed node resolution
2026-04-18 18:38:36 +02:00
CPerezz
b6d385e702
trie/bintrie: add depth bound to splitStemInsert
2026-04-18 18:38:36 +02:00
CPerezz
8a5e777fde
trie/bintrie: replace BinaryNode interface with GC-free NodeRef arena
Replace the BinaryNode interface (which uses Go interface pointers that
the GC must scan) with NodeRef uint32 indices into typed arena pools.
NodeRef packs a 2-bit kind tag and 30-bit pool index into a single
uint32, making it invisible to the garbage collector.

NodeStore manages chunked typed pools per node kind:
- InternalNode pool: ZERO Go pointers (children are NodeRef, hash is
  [32]byte) → allocated in noscan spans, GC skips entirely
- HashedNode pool: ZERO Go pointers → noscan spans
- StemNode pool: ONE pointer per node (valueData []byte) → minimal GC

For a trie with 25K InternalNodes, this reduces GC-scanned pointer-words
from ~125K to ~10K (about a 92% reduction by these figures). CPU profiling
showed 44% of time in GC; this refactor directly addresses that bottleneck.

Serialization format is unchanged — the on-disk representation is
fully compatible. All existing tests pass.
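
The packing can be sketched as below, using the unexported names that later commits in this series adopt. Placing the kind tag in the top two bits is an assumption for illustration; the message only states the 2-bit / 30-bit split.

```go
package main

import "fmt"

// nodeKind is a 2-bit tag selecting which typed pool a reference targets.
type nodeKind uint8

const (
	kindEmpty nodeKind = iota
	kindInternal
	kindStem
	kindHashed
)

// nodeRef holds a kind tag and a pool index in a plain uint32, so it
// contains no Go pointer for the garbage collector to follow.
type nodeRef uint32

const (
	indexBits = 30
	indexMask = 1<<indexBits - 1
)

// makeRef packs kind into the top 2 bits and index into the low 30 bits.
func makeRef(kind nodeKind, index uint32) nodeRef {
	return nodeRef(uint32(kind)<<indexBits | index&indexMask)
}

// kind extracts the 2-bit tag.
func (r nodeRef) kind() nodeKind { return nodeKind(r >> indexBits) }

// index extracts the 30-bit pool index.
func (r nodeRef) index() uint32 { return uint32(r) & indexMask }

func main() {
	ref := makeRef(kindStem, 12345)
	fmt.Println(ref.kind() == kindStem, ref.index()) // true 12345
}
```

With 30 index bits, each pool can address up to 2^30 nodes under this layout.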
2026-04-18 18:38:15 +02:00