Verkle serialization notes

# Verkle serialization notes This document contains notes about explorations to minimize the size of serialized bytes saved to the database in Verkle Tries. ## Current situation We have a benchmark that is importing a bunch of blocks using a VKT. I added a "hook" to inspect all the bytes that are flushed to disk during this importing. "Flushing" happens after we process a block, serialize the updated nodes, and save them to the DB. Here're the numbers: ``` Flushes [Count: 269312, TotalBytes: 5427518381, AvgFlushBytes: 20153.27, AvgNodesPerFlush: 8] InternalNode [Count: 1445645, TotalBytes: 5127018637, AvgNodeBytes: 3546.53, AvgNodesPerFlush: 5] LeafNode [Count: 919278, TotalBytes: 300499744, AvgNodeBytes: 326.89, AvgNodesperFlush: 3] ``` All numbers are important, but most notably: - The average size of flushed internal nodes was 3.5KiB. - The average size of flushed leaf nodes was 326bytes. - On each flush (Commiting to the VKT and saving to DB) we saved 5 internal nodes and 3 leaf nodes. The current internal node serialization format is roughly: ``` [32 byte bitmap][32byte-children0][32-byte-children1]... ``` The 32 byte bitmap provides 256 bits that we mark as 0 or 1 to know the indexes of the following children. For example, `[1010000.][chidlren0][children1]`, means that `[children0]` is the 0-th children, and `[children1]` is the 3-th children in the 256 vector. `[32byte-childrenX]` is 32 bytes since we are saving commitments in compressed form. ## Next steps We're working in some changes since the above serialization is a bit inconvenient depending on the case. Examples: - The root node will always have all the 256 children, so the bitmap are just wasted bytes (all bits set to 1). - Probably for next (top-ish) levels, that is also true. We might need to discover the sweet spot. - If we have less than 32 children, it might be better to avoid the bitmap, and do something like `[1 byte child idx][children0][1byte child idx][children 1]`. - If we have more than 32 children, then probably the bitmap is a good idea. ## Benchmark bias Maybe you've already noted something important in the numbers. We're having an average flush of 5 internal nodes and 3 leaf nodes per flush. That's a bias coming from the benchmark, since each imported block doesn't have many transactions. That means the situation is skewed "pessimistically" since if you touch less leaf nodes per block, then internal nodes serialization will dominate and will have a high "cost per leaf node".