Update (2023-11-28): this document was created ~2023-05 – all the ideas are still relevant. I've also worked on a parallelized version in the Go library that you might also find interesting here.
This document contains notes about implementation optimizations for Verkle Tree proof generation and verification.
Some of these optimizations are present in the spec implementation, while others aren't since the spec code is meant to optimize to be understandable. We list them here and provide links to the reference implementation and other documents with explanations (if available).
toFr
at each relevant internal node to prove, which means doing inversions.
Cs
to the transcript. This requires serializing the points. Probably, you store them in projective form in memory, normalize them (i.e: to affine) in batch mode (i.e: Montgometry trick for coordinate). [go-ipa]E
is a MSM. Consider using a MSM algorithm, and not the naive sum of scalar multiplications [go-ipa].As an example of what kind of speedup you can get from an implementation that can be missing some of these benchmarks, here's some results on go-ipa
:
As an example, for 16.000 random polynomial openings:
Legend: (BEFOREms -> AFTERms, [SPEEDUPx])
The optimized implementation still is single-threaded-ish, and doesn't require extra memory (e.g: precomputed tables).
Do you know any other missing trick? Ping me!