This document aims to explore if using _batch affine additions_ is faster than adding in projective form for VKT MSM.
A while ago Kev mentioned this idea, and we left that convo in "We should do the counting and see", so I'm doing this here. :)
## What is _batch affine point additions_?
This is a trick [used by gnark](https://github.com/ConsenSys/gnark-crypto/pull/261) used in some cases when aggregating points in Pippenger buckets. In our case, we don't use Pippenger to do the MSM but use a precomputed list of points, but the idea can still apply (more on this later).
The idea is simple: you have N pair of affine points that you need to aggregate. That's a number of finite field multiplications and inverses. Given that you know the N pairs of affine points upfront, you can use the Montgomery trick for the inverses.
The important question is: is that faster than summing in projective form?
The answer is _it depends_. This document explores if that's the case for Bandersnatch.
## Why is this relevant for VKT MSM?
We calculate Perdersen commitments for VKTs by leveraging the fact that these MSMs are on a fixed basis, so we have some precomputed tables of affine points.
Evaluating the MSM reduces to making a bunch of additions.
Today we define a `result` projective point value and start making the additions (i.e: "mixed" additions). More details about this later.
## BLS12-381
Let's first look at the answer to the question for this curve since it's where gnark is operating and can help make sense of why the trick is useful on some curves but not others.
Doing a [point addition in affine](https://hyperelliptic.org/EFD/g1p/auto-shortw.html) is: ~`3M + 1I` (`M`=field-mul/squaring, `I`=field-inv, ).
Thus, if you have to do `n` additions:
- The cost of doing the N affine additions: `3nM + nI` operations.
- The cost using the Montgomery trick to batch inverses: `3nM + 3nM+I = 6nm + I`, depending on `n` and the cost of `I`, after some threshold `n`, the cost is ~`6nm`.
- Explanation: the Montgomery trick changes `nI` to ~`3nM+I` (i.e: a single inverse, plus some multiplications to _clear out_ terms).
If we add enough `n` pairs, the cost per affine point, addition is ~`6M` instead of `3M + I`.
Compare this to summing Jacobians, where the [cost of adding](https://hyperelliptic.org/EFD/g1p/auto-shortw-jacobian.html) is: ~`16M`.
Thus, for BLS12-381 there's a nice win if you have a use case that aligns with this trick.
## Bandersnatch
We have a Twisted Edwards curve, [which for affine additions means](https://hyperelliptic.org/EFD/g1p/auto-twisted.html): `8M + 2I`.
Doing analogous reasoning for `n` pairs:
- The cost of doing the N affine additions: `8nM + 2nI` operations.
- The cost using the Montgomery trick to batch inverses: `8nM + 6nM+I = 14nM + I`.
If we add enough `n` pairs, the cost per affine point addition is ~`14M` instead of `8M + I`.
Compare this to the sum in projective, where the [cost of adding](https://hyperelliptic.org/EFD/g1p/auto-twisted-projective.html): ~`11M`. Doing mixed additions (i.e: projective + affine) takes ~`10M`.
Since `14M > 11M > 10M`, it looks like, this batch affine addition trick isn't worth it for Twisted Edwards curves.

This document contains notes about implementation optimizations for Verkle Tree proof generation and verification.

11/28/2023This document describes some high-level description of test-vectors to generate.

11/10/2023Last updated: 2023-10-30

11/6/2023block execution: 99.992976msINFO [07-15|13:10:28.917] Collected key values from base tree count=10000 duration=293.583654ms last account=008f65…483efbINFO [07-15|13:10:29.300] Prepared key values from base tree duration=382.488744msINFO [07-15|13:10:30.000] Inserted key values in overlay tree count=10000 duration=1.083027557smigration took: 1.376893246sfillLevels took 4.422121mscollecting nodes took 56.29µstoFrMultiple took 208.536µsupdating commitments took 3.090994mscollecting nodes took 5.472675mstoFrMultiple took 21.644539msupdating commitments took 52.035324mscollecting nodes took 2.3216mstoFrMultiple took 13.16166msupdating commitments took 54.440922mscollecting nodes took 127.455µstoFrMultiple took 639.024µsupdating commitments took 15.478009msFinalize: 211.469993msblock execution: 293.354953msINFO [07-15|13:10:30.971] Collected key values from base tree count=10000 duration=330.318923ms last account=008f65…483efbINFO [07-15|13:10:31.372] Prepared key values from base tree duration=400.638925msINFO [07-15|13:10:32.097] Inserted key values in overlay tree count=10000 duration=1.126096392smigration took: 1.456677516sfillLevels took 2.508844mscollecting nodes took 47.249µstoFrMultiple took 247.035µsupdating commitments took 4.936315mscollecting nodes took 7.428826mstoFrMultiple took 15.575423msupdating commitments took 50.524827mscollecting nodes took 2.413472mstoFrMultiple took 12.737298msupdating commitments took 41.038935mscollecting nodes took 139.996µstoFrMultiple took 648.065µsupdating commitments took 15.552383msFinalize: 193.08911msblock execution: 252.886918msINFO [07-15|13:10:32.963] Collected key values from base tree count=10000 duration=282.597556ms last account=00ab98…a45379INFO [07-15|13:10:33.531] Prepared key values from base tree duration=567.853577msINFO [07-15|13:10:33.900] Inserted key values in overlay tree count=10000 duration=937.057289msmigration took: 1.219926088sfillLevels took 7.629195mscollecting nodes took 136.204µstoFrMultiple took 414.446µsupdating commitments took 7.367287mscollecting nodes took 14.461581mstoFrMultiple took 18.437174msupdating commitments took 81.609759mscollecting nodes took 3.4579mstoFrMultiple took 19.02924msupdating commitments took 70.383834mscollecting nodes took 121.622µstoFrMultiple took 631.732µsupdating commitments took 15.461385msFinalize: 281.168431msblock execution: 239.645635msfillLevels took 269.784µscollecting nodes took 7.291µstoFrMultiple took 25.958µsupdating commitments took 641.648µscollecting nodes took 345.906µstoFrMultiple took 779.019µsupdating commitments took 5.172558mscollecting nodes took 198.328µstoFrMultiple took 601.982µsupdating commitments took 5.064353mscollecting nodes took 48.999µstoFrMultiple took 390.531µs

8/7/2023
Published on ** HackMD**