perf: pool and pre-allocate interned-string dict (#66) #68

Merged
xe-nvdk merged 1 commit into v6 from perf/intern-dict-pool on Apr 15, 2026

xe-nvdk commented Apr 15, 2026

Summary

Closes #66. Pools and pre-allocates the interned-string dict so repeated encode/decode under a reused Encoder/Decoder doesn't rehash the map or regrow the slice.

  • New SetInternedStringsDictCap(n int) on both Encoder and Decoder — initial capacity hint for the dict, clamped to [0, maxDictLen].
  • Reset paths now reuse internally-owned dict storage (clear-in-place for maps, truncate for slices) instead of dropping it to the GC every session.
  • PutEncoder / PutDecoder drop dicts that grew past internDictPoolCap = 4096 so a one-off large interning session doesn't permanently bloat pool memory (mirrors the existing wbuf/buf cap-drop pattern).
  • Introduces a dictOwned bool so caller-supplied dicts (via ResetDict / WithDict) are never cleared, truncated, or appended into. A regression test covers both the encoder-clear and decoder-alias clobber scenarios that a first draft of this patch had.
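
For readers who have not seen the diff, the ownership rule in the bullets above can be sketched roughly as follows. This is an illustration only, not the patch itself: internEncoder, its methods, and the copy-on-first-insert step are assumptions made for the example. Only the dictOwned name and the clear-in-place behaviour come from the description. It needs Go 1.21+ for the clear builtin.

```go
package main

import "fmt"

// internEncoder is a hypothetical stand-in for the encoder's interning state.
// Only the dictOwned name is taken from the PR; everything else is illustrative.
type internEncoder struct {
	dict      map[string]int
	dictOwned bool // true only when dict was allocated by the encoder itself
}

// reset starts a new session without a caller-supplied dict.
func (e *internEncoder) reset() {
	if e.dictOwned && e.dict != nil {
		clear(e.dict) // drop entries, keep the map's storage for the next session
		return
	}
	// The dict is absent or belongs to the caller: release it without touching it.
	e.dict = nil
	e.dictOwned = false
}

// resetDict adopts a caller-supplied dict as read-only state.
func (e *internEncoder) resetDict(d map[string]int) {
	e.dict = d
	e.dictOwned = false
}

// intern returns the index for s, adding it when it is new.
// Assumption: a caller-owned map is copied before the first insert.
func (e *internEncoder) intern(s string) int {
	if idx, ok := e.dict[s]; ok {
		return idx
	}
	if !e.dictOwned {
		owned := make(map[string]int, len(e.dict)+1)
		for k, v := range e.dict {
			owned[k] = v
		}
		e.dict = owned
		e.dictOwned = true
	}
	idx := len(e.dict)
	e.dict[s] = idx
	return idx
}

func main() {
	caller := map[string]int{"host": 0}
	e := &internEncoder{}
	e.resetDict(caller)
	e.intern("region") // lands in an encoder-owned copy
	e.reset()          // clears only the encoder-owned copy
	fmt.Println("caller entries:", len(caller))
}
```

The point of the flag is that reset can be aggressive about reuse exactly when it is safe: storage the encoder allocated is cleared in place, while anything handed in by the caller is only ever read or released.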

Test plan

  • go test ./... passes
  • go test -race ./... passes
  • go vet ./... clean
  • New TestInternedStringDictStorageIsReused uses testing.AllocsPerRun to assert 0 encoder allocs / 1 decoder alloc (only the unavoidable string copy) on the reuse hot path. The assertion is load-bearing: it would fail on the pre-fix code.
  • New TestResetDoesNotMutateCallerDict guards against the ownership bugs
  • Existing intern tests (TestInternedString, TestInternedStringTag, TestResetDict, TestMapWithInternedString) still pass
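
As a rough idea of how an allocation-based reuse check like the one in the test plan works, here is a minimal standalone sketch. It does not use the library: reusableDict and measure are hypothetical names, and the "fresh" path simply imitates the pre-fix behaviour of allocating a new map every session. Only testing.AllocsPerRun is the real standard-library call.

```go
package main

import (
	"fmt"
	"testing"
)

// reusableDict is a hypothetical holder for an interning map.
type reusableDict struct {
	m map[string]int
}

// session fills the dict for one encode session. With reuse it clears the
// existing map in place; without reuse it allocates a new map each time.
func (d *reusableDict) session(keys []string, reuse bool) {
	if reuse && d.m != nil {
		clear(d.m) // Go 1.21+
	} else {
		d.m = make(map[string]int, len(keys))
	}
	for i, k := range keys {
		d.m[k] = i
	}
}

// measure reports the average number of heap allocations per session.
func measure(reuse bool) float64 {
	keys := []string{"host", "region", "service", "status"}
	d := &reusableDict{}
	d.session(keys, reuse) // warm up so the reuse path starts with storage in place
	return testing.AllocsPerRun(100, func() {
		d.session(keys, reuse)
	})
}

func main() {
	fmt.Println("allocs with reuse:   ", measure(true))
	fmt.Println("allocs without reuse:", measure(false))
}
```

A test built this way fails as soon as a reset path goes back to allocating, which is what makes it useful as a regression guard rather than a benchmark.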

Closes #66. Fixes map rehashing and slice growth on repeated interned
encode/decode when the Encoder/Decoder is reused.

New SetInternedStringsDictCap(n) on both Encoder and Decoder sets an
initial capacity hint for the dict, avoiding rehashes as entries are
added. Reset paths now reuse dict storage (clear-in-place for maps,
truncate for slices) instead of dropping it to the GC every session;
PutEncoder/PutDecoder drop oversized dicts (> 4096 entries) so a one-off
large session doesn't permanently bloat pool memory.
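
The cap-drop on the pool return path could look something like the sketch below. The 4096 threshold is from this PR; pooledDecoder, trimDict, getDecoder, and putDecoder are hypothetical names, and comparing against cap rather than len is an assumption, since the description only says "> 4096 entries".

```go
package main

import (
	"fmt"
	"sync"
)

// internDictPoolCap matches the threshold quoted in the PR description.
const internDictPoolCap = 4096

// pooledDecoder is a hypothetical decoder that owns its interning slice.
type pooledDecoder struct {
	dict []string
}

var decPool = sync.Pool{
	New: func() any { return new(pooledDecoder) },
}

// trimDict decides what storage a decoder may carry back into the pool:
// oversized slices are dropped, everything else is truncated and kept.
func trimDict(dict []string) []string {
	if cap(dict) > internDictPoolCap {
		return nil // let the GC reclaim a one-off large dict
	}
	return dict[:0] // keep the backing array for the next session
}

func getDecoder() *pooledDecoder {
	return decPool.Get().(*pooledDecoder)
}

func putDecoder(d *pooledDecoder) {
	d.dict = trimDict(d.dict)
	decPool.Put(d)
}

func main() {
	d := getDecoder()
	d.dict = append(d.dict, "host", "region")
	putDecoder(d)
	fmt.Println("len after put:", len(d.dict), "storage kept:", cap(d.dict) > 0)
}
```

Truncating keeps the old string headers reachable through the backing array until they are overwritten, which is usually an acceptable trade for skipping the regrow on the next session.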

Introduces a dictOwned flag to track whether the dict was internally
allocated or supplied by the caller via ResetDict/WithDict. The
Encoder/Decoder will never clear, truncate, or append into a
caller-owned dict — a regression test covers both the encoder-clear and
decoder-alias clobber scenarios. A new allocation-based reuse test
asserts 0 encoder allocs / 1 decoder alloc on the Reset+encode/decode
hot loop (would fail on the pre-fix code).
xe-nvdk merged commit 0c65db4 into v6 on Apr 15, 2026
3 checks passed

Successfully merging this pull request may close these issues.

perf: pool or pre-allocate interned string dict
