# Week 15 — Update

## TL;DR

Refactored topic memory management to use an interned pool in zig-libp2p. PR: https://github.com/zen-eth/eth-p2p-z/pull/83/commits/c8876911681cd479b31d8b94fd56afca8982f973

---

## What is an interned pool (short)

An interned pool stores one unique immutable instance per distinct value (usually strings or bytes). On request, it returns a handle/reference to the canonical instance:

- If the value already exists, return a reference to that instance.
- Otherwise, insert it into the pool and return a reference.

This turns repeated, equal values into identical references (pointer equality), saving memory and speeding up equality checks.

---

## Why topics are a good fit

- Topics are immutable identifiers referenced often (publish/subscribe, mesh/fanout maps).
- Many peers use the same topic strings, causing repeated allocations and repeated byte-by-byte comparisons.
- Interning makes topic keys cheap to compare (pointer equality) and reduces total heap allocations for topic strings.

---

## Design overview

Goals:

- Provide a safe, concurrency-friendly, low-overhead pool for topic strings/byte slices.
- Expose a compact handle type (TopicRef) that is cheap to copy and compare.
- Ensure deterministic behavior for tests (e.g., seeded RNG for candidate selection) and clear lifecycle rules (when interned entries may be reclaimed).

Basic API (conceptual):

- `TopicRef` = opaque handle (pointer-sized)
- `interned_pool.init()`
- `pool.intern(value: []u8) -> TopicRef`
- `pool.lookup(value: []u8) -> ?TopicRef`
- `pool.release(ref: TopicRef) -> void` (optional if using refcounting / weak refs)
- `pool.len() -> usize`
- `pool.for_each(callback)`

Choices to make:

- Ownership model: strong references with refcounting, a weak intern table (never reclaim unless explicitly GC'd), or an arena cleared manually on restart.
- Concurrency: lock-free map, striped locking, or a global mutex.
- Key representation: store canonical bytes, or an immutable Zig slice plus length.
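As a rough sketch of this API, here is a minimal interning pool in Go (a stand-in for the Zig implementation, used here only because it is compact; the names `InternPool`, `Intern`, and `Len` are hypothetical, not the actual module's API). `Intern` returns a canonical `*string`, so equal topics compare with pointer equality:

```go
package main

import (
	"fmt"
	"sync"
)

// InternPool deduplicates topic byte strings behind a single mutex
// (the "global mutex" concurrency option from the design overview).
type InternPool struct {
	mu      sync.Mutex
	entries map[string]*string // one canonical instance per distinct value
}

func NewInternPool() *InternPool {
	return &InternPool{entries: make(map[string]*string)}
}

// Intern returns the canonical handle for value, inserting it on first use.
func (p *InternPool) Intern(value []byte) *string {
	key := string(value) // copies the bytes once
	p.mu.Lock()
	defer p.mu.Unlock()
	if ref, ok := p.entries[key]; ok {
		return ref // hit: reuse the canonical instance
	}
	canonical := key
	p.entries[key] = &canonical // miss: this copy becomes canonical
	return &canonical
}

// Len reports how many distinct topics are interned.
func (p *InternPool) Len() int {
	p.mu.Lock()
	defer p.mu.Unlock()
	return len(p.entries)
}

func main() {
	pool := NewInternPool()
	a := pool.Intern([]byte("blocks/topic-1"))
	b := pool.Intern([]byte("blocks/topic-1"))
	c := pool.Intern([]byte("blocks/topic-2"))
	fmt.Println(a == b)     // true: same canonical instance
	fmt.Println(a == c)     // false: distinct topics
	fmt.Println(pool.Len()) // 2
}
```

Note that handle comparison is a single pointer compare regardless of topic length, which is the whole point of using `TopicRef` as a map key.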
---

## Pseudocode

Interning API (conceptual):

```pseudo
// opaque handle to canonical topic data
type TopicRef = pointer

function intern(pool, value):
    hash = hash_bytes(value)
    bucket = pool.table[hash % table_size]
    lock(bucket)
    if entry := bucket.find_equal(value):
        unlock(bucket)
        return entry.ref
    else:
        canonical = allocate_copy(value)
        ref = make_topic_ref(canonical)
        bucket.insert(canonical, ref)
        unlock(bucket)
        return ref
```

Optional refcounted release:

```pseudo
function release(pool, ref):
    // decrement the refcount; when it reaches 0,
    // remove the entry from the table and free it
```

Usage in publish/receive paths:

```pseudo
topic_ref = pool.intern(topic_bytes)
mesh_map[topic_ref] = set_of_peers
fanout_map[topic_ref] = set_of_peers

// comparison:
if (topic_ref1 == topic_ref2)  // pointer equality
```

---

## Sequence diagram

```mermaid
sequenceDiagram
    participant App
    participant Router
    participant Pool as interned_pool
    App->>Router: publish(topic_bytes, P)
    Router->>Pool: intern(topic_bytes) -> topicRef
    Router->>Router: use topicRef as map key (mesh/fanout)
```

---

## Memory / lifecycle considerations

Options:

- Never-evict pool: simplest; safe, but entries live for the process lifetime (fine if the number of distinct topics is bounded).
- Refcounted entries: free entries when nobody holds them (requires careful locking and atomic refcounts).
- TTL / LRU eviction: track last-use time and evict rarely used topics (adds complexity).
- Arena & reset: allocate interned entries in an arena and free them all on restart (fast but coarse).

Recommendations:

- Start simple: implement a process-lifetime pool with deduplication and a small global mutex or sharded locks. This avoids the subtle bugs of refcounting and GC during the initial rollout.
- If memory proves to be an issue, add optional refcounting or LRU eviction behind a feature flag.

---

## Concurrency considerations

- Use a sharded hash table or per-bucket mutexes to reduce contention on high-throughput paths.
- Use atomic pointer swaps where possible for lock-free reads of canonical references.
- Ensure intern() is safe when two threads intern the same value concurrently (only one canonical instance must be inserted).
- Avoid long holds on global locks; prefer small critical sections around lookup/insert.

---

## Implementation checklist

1. Create an InternedPool module with:
   - init/free functions
   - intern(value) -> TopicRef
   - optional lookup(value) -> ?TopicRef
   - debug/metrics (pool size, hits/misses)
2. Replace topic-key usage:
   - convert maps (mesh, fanout, peers.gossipsub) to use TopicRef keys
   - update publish/forward/join/leave code paths to call pool.intern(topic) early
3. Add sharded locking or lightweight synchronization.
4. Add debug helpers to print the top-k topics by reference count (if refcounting is used) or by usage.
5. Add feature flags for eviction/refcounting if desired.

---

## Tests & benchmarks

Unit tests:

- Interning identity: intern(same_bytes) returns the same TopicRef (pointer equality).
- Concurrent intern: N threads interning the same topic concurrently yield a single canonical instance.
- Map-key behavior: using TopicRef as a key in mesh/fanout yields the same semantics as string keys.

Integration tests:

- Publish/subscribe flows still work with TopicRef keys.
- Memory usage: after many repeated intern() calls for the same topic, memory doesn't grow.

Benchmarks:

- Baseline: measure allocations and CPU for publish/receive using raw strings.
- Optimized: measure the same with the interned pool.
- Measure latency impact and lock contention at the target QPS.

Leak detection:

- Run with ASan/Valgrind or Zig leak detection to ensure no unintended leaks (if using refcounting, ensure release paths execute).

---

## Migration & rollout plan

1. Implement the pool and unit tests behind a feature flag or compile switch.
2. Add static instrumentation (hits/misses, pool size) and run local perf tests.
3. Replace topic-key usage in a branch and run integration tests.
4.
Deploy to a canary or long-running test nodes and monitor memory use.
5. If stable, remove the feature flag and merge.

---

## Next

Start refactoring the publish and receive code paths to call pool.intern(topic) early, add the unit and concurrency tests above, and run the benchmark comparisons. After that, iterate on an eviction/refcount strategy if the memory numbers indicate the need.