Commit graph

10 commits

Author SHA1 Message Date
CPerezz
f8c283410e
trie/bintrie: drop unused freeInternals and freeStems
Only freeHashed is written to (via freeHashedNode); the internal and
stem lists are declared, consumed in alloc paths, and copied in
NodeStore.Copy, but no callsite ever appends to them. Under current
semantics (no delete, stem-split keeps the old stem deeper in the
tree) there is no path that would free an internal or stem slot, so
the recycle branch was dead code. Drop it to avoid misleading future
contributors; the infrastructure is easy to restore if a delete path
is ever added.
2026-04-19 22:15:01 +02:00
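For context, the surviving recycle path for hashed nodes can be pictured roughly as below. This is a minimal sketch, not the code from the PR: the names freeHashed, freeHashedNode, allocHashed and storeChunkSize come from the commit messages, while the field layout, the chunk size of 4, and the hashed accessor are assumptions made for illustration.

```go
package main

import "fmt"

// storeChunkSize is tiny here for illustration; the real value is not stated.
const storeChunkSize = 4

type hashedNode [32]byte

// nodeStore models only the hashed pool. Field names other than
// freeHashed are hypothetical.
type nodeStore struct {
	hashedChunks [][]hashedNode // fixed-size chunks, so growth never moves nodes
	hashedCount  uint32         // number of slots handed out so far
	freeHashed   []uint32       // slots released by freeHashedNode
}

// hashed returns a pointer to the slot with the given pool index.
func (s *nodeStore) hashed(idx uint32) *hashedNode {
	return &s.hashedChunks[idx/storeChunkSize][idx%storeChunkSize]
}

// allocHashed reuses a freed slot when one exists, else takes a fresh one.
func (s *nodeStore) allocHashed() uint32 {
	if n := len(s.freeHashed); n > 0 {
		idx := s.freeHashed[n-1]
		s.freeHashed = s.freeHashed[:n-1]
		*s.hashed(idx) = hashedNode{} // clear stale contents
		return idx
	}
	idx := s.hashedCount
	if idx%storeChunkSize == 0 { // current chunk is full (or none exists yet)
		s.hashedChunks = append(s.hashedChunks, make([]hashedNode, storeChunkSize))
	}
	s.hashedCount++
	return idx
}

// freeHashedNode is the only writer to a free list in this sketch.
func (s *nodeStore) freeHashedNode(idx uint32) {
	s.freeHashed = append(s.freeHashed, idx)
}

func main() {
	var s nodeStore
	for i := 0; i < 5; i++ {
		s.allocHashed()
	}
	s.freeHashedNode(2)
	fmt.Println(s.allocHashed(), s.allocHashed(), len(s.hashedChunks))
}
```

An internal or stem pool with no corresponding free call would never take the first branch of its alloc function, which is the dead code the commit removes.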
CPerezz
8f31f30500
trie/bintrie: trim storeChunkSize doc comment
Gballet posted an empty 'suggestion' block on node_store.go:24 (comment
3100612272) — collapse the 4-line explanatory block to one line.
2026-04-18 18:53:23 +02:00
CPerezz
e1859ea864
trie/bintrie: simplify StemNode to array-of-slices representation
Gballet asked on PR #34055 (comments 3100043116, 3100050542, and the
bit-check dedup at 3100114416 / 3100878310) to revert StemNode from the
packed-bytes representation to the straightforward array-of-slices.

Before: StemNode carried a bitmap, a concatenated valueData []byte, a
count, and a shared COW flag. Every read/write went through a bit-count
posInData lookup; every mutation through ensureWritable COW.

After: values [StemNodeWidth][]byte — 256 slots, nil == absent. No
bitmap lookup, no COW. Direct sn.values[suffix] access.

Supporting changes:
- Drop posInData, ensureWritable; rewrite getValue/hasValue/allValues/
  setValue as trivial slice access.
- Hash() iterates sn.values directly, matching master's shape.
- SerializeNode emits the bitmap + concatenated bytes on the wire from
  the array-of-slices at serialize time; wire format unchanged.
- decodeNode populates sn.values[i] slots by aliasing the serialized
  buffer (zero-copy).
- NodeStore.Copy deep-copies each slot.
- splitStemValuesInsert + the insertSingleInternal paths write directly
  to sn.values[i].

Trade-off: stems now carry 256 []byte headers (6144 B) instead of 1
concatenated slice (~32 B) + bitmap. Stem-pool scan cost returns to
parity with master (the existing valueData pointer already made the
pool non-noscan; rollback adds 255 more pointers per stem). The primary
arena win — pointer-free InternalNode pool — is preserved.
2026-04-18 18:53:07 +02:00
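The array-of-slices shape and the serialize-time packing described in this commit can be sketched as follows. This is an illustrative stand-in, not the PR's code: the accessor names follow the commit message, but packValues is a hypothetical helper, the MSB-first bit order is an assumption, and the values are shortened for readability.

```go
package main

import "fmt"

// stemNodeWidth is the slot count stated in the commit message.
const stemNodeWidth = 256

// stemNode models only the values array; nil means "absent".
type stemNode struct {
	values [stemNodeWidth][]byte
}

func (sn *stemNode) getValue(suffix byte) []byte    { return sn.values[suffix] }
func (sn *stemNode) hasValue(suffix byte) bool      { return sn.values[suffix] != nil }
func (sn *stemNode) setValue(suffix byte, v []byte) { sn.values[suffix] = v }

// packValues (hypothetical name) derives a presence bitmap plus the
// concatenation of all present values from the array at serialize time.
// Bit order within each byte is assumed to be MSB-first.
func packValues(sn *stemNode) (bitmap [stemNodeWidth / 8]byte, data []byte) {
	for i := range sn.values {
		v := sn.values[i]
		if v == nil {
			continue
		}
		bitmap[i/8] |= 1 << (7 - uint(i%8))
		data = append(data, v...)
	}
	return bitmap, data
}

func main() {
	var sn stemNode
	sn.setValue(0, []byte{0xaa})
	sn.setValue(9, []byte{0xbb, 0xcc})
	bitmap, data := packValues(&sn)
	fmt.Printf("%08b %08b %x\n", bitmap[0], bitmap[1], data)
}
```

On a 64-bit platform each []byte header is 24 bytes, which is where the 256 × 24 = 6144 B figure in the trade-off paragraph comes from.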
CPerezz
2d44d8a4b6
trie/bintrie: unexport package-internal arena identifiers
Gballet asked on PR #34055 to unexport nodeRef, nodeKind, and makeRef
(comments 3099846639, 3099847640, 3100717855) — none are used outside
trie/bintrie. Cascade to the internal-only support symbols and methods:

  NodeKind          → nodeKind
  KindEmpty/...     → kindEmpty/...
  NodeRef           → nodeRef
  EmptyRef          → emptyRef
  MakeRef           → makeRef
  NodeStore.Root    → deleted; inlined to s.root field access (same pkg)
  NodeStore.SetRoot → deleted; inlined to s.root = ref
  NodeStore.ComputeHash/SerializeNode/DeserializeNode(WithHash)/
  CollectNodes/ToDot/GetHeight → lowercased

All 9 method signatures took or returned nodeRef so their export would
have tripped revive:unexported-return after the type rename. Zero
external callers means no API break. The private deserializeNode helper
was renamed to decodeNode to free the name for the formerly exported
DeserializeNode, which is now the unexported deserializeNode.

Pure rename; no behaviour change.
2026-04-18 18:49:04 +02:00
CPerezz
939b36345f
trie/bintrie: port dirty flag + CollectNodes skip-clean from master
Master added (via PR #34754) a dirty bool to InternalNode/StemNode plus a
CollectNodes short-circuit that skips clean subtrees — the arena branch
diverged before that landed. Port the semantics onto the arena shape:

- Add dirty bool to InternalNode and StemNode.
- Wire dirty=true alongside every existing mustRecompute=true setter in
  node_store.go (newInternalRef, newStemRef) and store_ops.go (8 mutation
  sites across InsertSingle/insertSingleInternal/InsertValuesAtStem/
  insertValuesAtStem/splitStemInsert/splitStemValuesInsert).
- Add 'if !node.dirty { return nil }' gate at the top of CollectNodes for
  both KindInternal and KindStem; clear dirty after flushfn runs.
- Plumb a dirty parameter through deserializeNode; DeserializeNode passes
  dirty=true (safe default), DeserializeNodeWithHash passes dirty=false
  (loaded from disk, blob matches).

The arena test in trie_test.go that was auto-merged from master used
master-shape struct literals (tr.root, NewBinaryNode) that don't exist on
arena; delete those and replace with TestCommitSkipCleanSubtrees, an
arena-native version that asserts first-Commit flushes all nodes, no-op
Commit flushes none, and single-leaf Commit flushes only the root-to-leaf
path.
2026-04-18 18:45:12 +02:00
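The skip-clean behaviour that TestCommitSkipCleanSubtrees asserts can be illustrated with a deliberately simplified model. The sketch below uses a pointer-based tree rather than the arena, and the node type, the lowercase collectNodes, and the flush callback signature are all assumptions for illustration; only the dirty gate and the clear-after-flush order come from the commit message.

```go
package main

import "fmt"

// node is a simplified pointer-based stand-in for the arena's
// InternalNode/StemNode; only the dirty flag is modeled.
type node struct {
	left, right *node
	dirty       bool
}

// collectNodes skips clean subtrees, flushes dirty nodes children-first,
// and clears dirty once the flush callback has run.
func collectNodes(n *node, flush func(*node)) {
	if n == nil || !n.dirty {
		return
	}
	collectNodes(n.left, flush)
	collectNodes(n.right, flush)
	flush(n)
	n.dirty = false
}

func main() {
	leaf := &node{dirty: true}
	right := &node{dirty: true}
	left := &node{left: leaf, dirty: true}
	root := &node{left: left, right: right, dirty: true}

	count := 0
	flush := func(*node) { count++ }

	collectNodes(root, flush)
	fmt.Println("first commit:", count)

	count = 0
	collectNodes(root, flush)
	fmt.Println("no-op commit:", count)

	// A single-leaf update dirties only the root-to-leaf path.
	root.dirty, left.dirty, leaf.dirty = true, true, true
	count = 0
	collectNodes(root, flush)
	fmt.Println("single-leaf commit:", count)
}
```

The three calls mirror the three assertions in the test: everything is flushed the first time, nothing on a no-op, and only the root-to-leaf path after a single-leaf change.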
CPerezz
84c61897b3
trie/bintrie: use type alias for HashedNode
Replace the single-field struct with a type alias on common.Hash.
Both have identical layout (32 bytes, no pointers) and noscan span
placement, but the alias matches master's style and reads more
naturally. A zero-arg Hash() method keeps call sites terse.
2026-04-18 18:38:37 +02:00
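A small sketch of the before and after shapes may help. It makes two assumptions: Hash is a local stand-in for common.Hash (a [32]byte), and because Go does not allow methods on an alias of a type from another package, the "after" shape is written here as a defined type rather than a literal `=` alias. The struct name hashedNodeStruct is hypothetical.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Hash is a local stand-in for common.Hash, which is a [32]byte.
type Hash [32]byte

// hashedNodeStruct is the "before" shape: a single-field struct.
type hashedNodeStruct struct{ hash Hash }

// HashedNode is the "after" shape, assumed here to be a defined type
// so that it can carry its own method.
type HashedNode Hash

// Hash returns the underlying hash; a plain conversion, no copy of fields.
func (h HashedNode) Hash() Hash { return Hash(h) }

func main() {
	var h HashedNode
	h[0] = 0x01
	fmt.Println(unsafe.Sizeof(hashedNodeStruct{}), unsafe.Sizeof(h), h.Hash()[0])
}
```

Both shapes are 32 bytes with no pointers, so pool placement is unaffected; the difference is only in how call sites read.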
CPerezz
5f94d26db8
trie/bintrie: update copyright year on newly added files
These four files were introduced in this PR and should carry the
current year.
2026-04-18 18:38:37 +02:00
CPerezz
b4a7118d06
trie/bintrie: trim verbose doc comments to essentials
2026-04-18 18:38:37 +02:00
CPerezz
ad64f4ec04
trie/bintrie: fix copy-paste panic messages in allocStem/allocHashed
2026-04-18 18:38:37 +02:00
CPerezz
8a5e777fde
trie/bintrie: replace BinaryNode interface with GC-free NodeRef arena
Replace the BinaryNode interface (which uses Go interface pointers that
the GC must scan) with NodeRef uint32 indices into typed arena pools.
NodeRef packs a 2-bit kind tag and 30-bit pool index into a single
uint32, making it invisible to the garbage collector.

NodeStore manages chunked typed pools per node kind:
- InternalNode pool: ZERO Go pointers (children are NodeRef, hash is
  [32]byte) → allocated in noscan spans, GC skips entirely
- HashedNode pool: ZERO Go pointers → noscan spans
- StemNode pool: ONE pointer per node (valueData []byte) → minimal GC

For a trie with 25K InternalNodes, this reduces GC-scanned pointer-words
from ~125K to ~10K (85% reduction). CPU profiling showed 44% of time
in GC; this refactor directly addresses that bottleneck.

Serialization format is unchanged — the on-disk representation is
fully compatible. All existing tests pass.
2026-04-18 18:38:15 +02:00
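The packing of a 2-bit kind tag and a 30-bit pool index into one uint32 can be sketched as below. The identifiers follow the unexported names used later in this history (nodeRef, nodeKind, makeRef, kindEmpty/...), but the constant values, the placement of the tag in the top two bits, and the kind and index accessors are assumptions for illustration.

```go
package main

import "fmt"

// nodeKind is the 2-bit tag selecting a pool. The numeric values of the
// constants are assumed, not taken from the PR.
type nodeKind uint8

const (
	kindEmpty nodeKind = iota
	kindInternal
	kindStem
	kindHashed
)

// nodeRef holds the kind in the top 2 bits and the pool index in the low
// 30 bits (bit placement is an assumption). It is a plain integer, so the
// garbage collector has no pointer to follow.
type nodeRef uint32

const (
	indexBits = 30
	indexMask = 1<<indexBits - 1
)

// makeRef packs a kind and a pool index into a nodeRef.
func makeRef(kind nodeKind, index uint32) nodeRef {
	if index > indexMask {
		panic("pool index exceeds 30 bits")
	}
	return nodeRef(uint32(kind)<<indexBits | index)
}

// kind and index are hypothetical accessors that unpack the two fields.
func (r nodeRef) kind() nodeKind { return nodeKind(r >> indexBits) }
func (r nodeRef) index() uint32  { return uint32(r) & indexMask }

func main() {
	r := makeRef(kindStem, 12345)
	fmt.Println(r.kind() == kindStem, r.index())
}
```

With 30 index bits, each pool can address up to 2^30 nodes before makeRef would have to reject the index.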