Verkle Trees - Batch affine points addition vs. mixed additions

This document aims to explore if using batch affine additions is faster than adding in projective form for VKT MSM.

A while ago Kev mentioned this idea, and we left that convo in "We should do the counting and see", so I'm doing this here. :)

What is batch affine point additions?

This is a trick used by gnark used in some cases when aggregating points in Pippenger buckets. In our case, we don't use Pippenger to do the MSM but use a precomputed list of points, but the idea can still apply (more on this later).

The idea is simple: you have N pair of affine points that you need to aggregate. That's a number of finite field multiplications and inverses. Given that you know the N pairs of affine points upfront, you can use the Montgomery trick for the inverses.

The important question is: is that faster than summing in projective form?

The answer is it depends. This document explores if that's the case for Bandersnatch.

Why is this relevant for VKT MSM?

We calculate Perdersen commitments for VKTs by leveraging the fact that these MSMs are on a fixed basis, so we have some precomputed tables of affine points.

Evaluating the MSM reduces to making a bunch of additions.

Today we define a result projective point value and start making the additions (i.e: "mixed" additions). More details about this later.

BLS12-381

Let's first look at the answer to the question for this curve since it's where gnark is operating and can help make sense of why the trick is useful on some curves but not others.

Doing a point addition in affine is: ~3M + 1I (M=field-mul/squaring, I=field-inv, ).

Thus, if you have to do n additions:

The cost of doing the N affine additions: 3nM + nI operations.
The cost using the Montgomery trick to batch inverses: 3nM + 3nM+I = 6nm + I, depending on n and the cost of I, after some threshold n, the cost is ~6nm.
- Explanation: the Montgomery trick changes nI to ~3nM+I (i.e: a single inverse, plus some multiplications to clear out terms).

If we add enough n pairs, the cost per affine point, addition is ~6M instead of 3M + I.

Compare this to summing Jacobians, where the cost of adding is: ~16M.

Thus, for BLS12-381 there's a nice win if you have a use case that aligns with this trick.

Bandersnatch

We have a Twisted Edwards curve, which for affine additions means: 8M + 2I.

Doing analogous reasoning for n pairs:

The cost of doing the N affine additions: 8nM + 2nI operations.
The cost using the Montgomery trick to batch inverses: 8nM + 6nM+I = 14nM + I.

If we add enough n pairs, the cost per affine point addition is ~14M instead of 8M + I.

Compare this to the sum in projective, where the cost of adding: ~11M. Doing mixed additions (i.e: projective + affine) takes ~10M.

Since 14M > 11M > 10M, it looks like, this batch affine addition trick isn't worth it for Twisted Edwards curves.

What is batch affine point additions?

Why is this relevant for VKT MSM?

BLS12-381

Bandersnatch

Read more

Berlin - Stateless Ethereum

Ethproof Killers Test Suite – Overview & Format Guide

EIP-4762 execution witness worst-case

Execution-Spec-Tests TODOs (Stateless Tests)