
Summa 2.0 aka SNARKless Proof Of Solvency for CEX

Preliminary Notions

Roots of unity in modular arithmetic (or within finite fields) play a significant role in cryptography in general and are crucial for Summa V2. Let's break down the idea of $n$-th roots of unity in modular rings.

Modular Arithmetic

An intuitive example of modular arithmetic from our daily life is "clock arithmetic". When we see a "19:00" boarding time on a boarding pass, we know that it corresponds to "7" on the clock face. Formally, in this case we perform the modular reduction by the modulus 12:

$$19 \equiv 7 \pmod{12},$$

because the clock face is only marked from 1 to 12.

Roots Of Unity in Modular Arithmetic

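An integer $\omega$ is an $n$-th root of unity modulo $p$ if:
$$\omega^n \equiv 1 \pmod{p}$$

It is a primitive $n$-th root of unity if, in addition:
$$\omega^k \not\equiv 1 \pmod{p}$$
for any $0 < k < n$.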

In other words, $n$ is the lowest positive integer such that $\omega^n$ "wraps around" due to modular reduction to yield exactly $1$. Powers of $\omega$ with positive exponents smaller than $n$ don't yield $1$ modulo $p$.
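The two conditions above can be checked directly with modular exponentiation. The following is a minimal sketch; the function names are illustrative and not part of Summa:

```python
def is_root_of_unity(w: int, n: int, p: int) -> bool:
    """True if w^n ≡ 1 (mod p)."""
    return pow(w, n, p) == 1


def is_primitive_root_of_unity(w: int, n: int, p: int) -> bool:
    """True if w^n ≡ 1 (mod p) and w^k ≢ 1 (mod p) for every 0 < k < n."""
    if not is_root_of_unity(w, n, p):
        return False
    return all(pow(w, k, p) != 1 for k in range(1, n))


# 2^3 = 8 wraps around to 1 modulo 7, and no smaller positive power does.
print(is_primitive_root_of_unity(2, 3, 7))
# 1 satisfies 1^3 ≡ 1, but 1^1 is already 1, so the second condition fails.
print(is_root_of_unity(1, 3, 7), is_primitive_root_of_unity(1, 3, 7))
```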

Roots Of Unity Example

Let's observe a finite field of order $p = 7$: $\mathbb{F}_7 = \{0, 1, \dots, 6\}$. Let's see that $2$ and $4$ are the 3rd roots of unity in such a field:

  • $2^3 = 8 \equiv 1 \pmod{7}$, so 2 is a 3rd root of unity modulo 7.
  • $4^3 = 64 \equiv 1 \pmod{7}$, so 4 is another 3rd root of unity modulo 7.
  • $1$ itself is a trivial root of unity, too.
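For a field this small, the example can be confirmed by brute force. The helper below is a hypothetical illustration that simply tries every non-zero element:

```python
def roots_of_unity(n: int, p: int) -> list[int]:
    """All non-zero w modulo the prime p with w^n ≡ 1 (mod p), by exhaustive search."""
    return [w for w in range(1, p) if pow(w, n, p) == 1]


print(roots_of_unity(3, 7))  # [1, 2, 4]
```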

Special Property of The Sum of Roots of Unity

Let's consider a finite field $\mathbb{F}_q$ that has $n$-th roots of unity. Let $\omega$ be a primitive $n$-th root of unity in $\mathbb{F}_q$, which means $\omega^n = 1$ and no smaller positive power of $\omega$ equals 1.

The $n$-th roots of unity in $\mathbb{F}_q$ are $1, \omega, \omega^2, \dots, \omega^{n-1}$.

Claim:

$1 + \omega + \omega^2 + \dots + \omega^{n-1} = 0$ for $n > 1$ (the sum of all the $n$-th roots of unity in a finite field is equal to zero).

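Proof:

Consider the sum $S = 1 + \omega + \omega^2 + \dots + \omega^{n-1}$. We can multiply $S$ by $\omega - 1$, noting that $\omega - 1 \neq 0$ in a field so that such a multiplication preserves the equality:

$$(\omega - 1) S = (\omega - 1)(1 + \omega + \omega^2 + \dots + \omega^{n-1})$$

Expanding the right hand side, we get:

$$\omega + \omega^2 + \omega^3 + \dots + \omega^n - (1 + \omega + \omega^2 + \dots + \omega^{n-1})$$

Notice that if we were to expand further, every term except $\omega^n$ and $1$ would cancel out:

$$(\omega - 1) S = \omega^n - 1$$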

Since $\omega$ is a primitive $n$-th root of unity, $\omega^n = 1$. So, $\omega^n - 1 = 0$. Therefore:

$$(\omega - 1) S = 0$$

If the product of two factors is zero, at least one of them must be zero. We already established that $\omega - 1 \neq 0$, thus $S$ must be zero:

$$S = 0$$

Therefore, $1 + \omega + \omega^2 + \dots + \omega^{n-1} = 0$.

Let's also check it on our previous toy example of $\mathbb{F}_7$ and $n = 3$: $1 + 2 + 4 = 7 \equiv 0 \pmod{7}$.
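The same check can be repeated numerically for other small fields. This is only a sanity check under the assumption that $p$ is prime and $n > 1$ divides $p - 1$; the helper names are made up for this sketch:

```python
def primitive_root_of_unity(n: int, p: int):
    """Brute-force search for a primitive n-th root of unity modulo a prime p.

    Returns None if no such element exists (n must divide p - 1, and n > 1).
    """
    for w in range(2, p):
        if pow(w, n, p) == 1 and all(pow(w, k, p) != 1 for k in range(1, n)):
            return w
    return None


def sum_of_roots(n: int, p: int) -> int:
    """Sum 1 + w + w^2 + ... + w^(n-1) modulo p for a primitive n-th root w."""
    w = primitive_root_of_unity(n, p)
    if w is None:
        raise ValueError(f"no primitive {n}-th root of unity modulo {p}")
    return sum(pow(w, i, p) for i in range(n)) % p


for p, n in [(7, 3), (17, 4), (17, 8), (97, 8)]:
    print(p, n, sum_of_roots(n, p))  # the last column is 0 in every row
```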

Summa V2

Let's see how we can take advantage of the sum of all roots of unity being zero when applied to the proof of solvency.

Data Structure & Commitment Scheme

The desired commitment scheme for Summa should have the following properties:

  • Committing to the total liabilities of the Custodian that is the sum of all user balances;
  • Allowing to publicly reveal the total liabilities;
  • Allowing to prove the individual user inclusion into the commitment;
  • Preserving the user privacy and hiding the user data (namely the user cryptocurrency balances);
  • Outperforming the Merkle sum tree in both the commitment phase and the proving phase[1].

We will demonstrate how a polynomial commitment can be used to achieve all of these properties.

Let's consider a polynomial $B(X)$ that evaluates to the $i$-th user balance $b_i$ at a "user's point", some value $x_i$ that is designated for this specific user:

$$B(x_i) = b_i.$$

We can call it a user balance polynomial. It is quite easy to construct such a polynomial using the Lagrange interpolation. The formula for the polynomial that interpolates these data points is:

$$B(X) = \sum_{i=1}^{n} b_i \cdot L_i(X)$$

where $L_i(X)$ is the Lagrange basis polynomial defined as:

$$L_i(X) = \prod_{\substack{j=1 \\ j \neq i}}^{n} \frac{X - x_j}{x_i - x_j}$$

A polynomial constructed using the Lagrange interpolation is known to have degree at most $d = n - 1$, where $n$ is the number of users (therefore, the number of the balance evaluation points). The resulting polynomial should look like the following:

$$B(X) = a_0 + a_1 X + a_2 X^2 + \dots + a_d X^d$$

Let's choose the $x_i$ values as the $i$-th powers of a primitive $n$-th root of unity (assuming that we are performing all the calculations in a prime field with a sufficiently large modulus):

$B(\omega^i) = b_i$, where $\omega$ is a primitive $n$-th root of unity.
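To make the construction concrete, here is a minimal sketch of the Lagrange formula over a toy prime field. The modulus $p = 17$, the root $\omega = 4$ (a primitive 4th root of unity modulo 17), the sample balances, and the zero-based user index are all assumptions chosen for illustration; a real deployment would use a much larger field.

```python
def poly_mul(a, b, p):
    """Multiply two polynomials (coefficients lowest degree first) modulo p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out


def lagrange_interpolate(xs, ys, p):
    """Coefficients a_0..a_{n-1} of the polynomial through the points (xs[i], ys[i])."""
    n = len(xs)
    coeffs = [0] * n
    for i in range(n):
        numerator = [1]   # product of (X - x_j) over j != i
        denominator = 1   # product of (x_i - x_j) over j != i
        for j in range(n):
            if j != i:
                numerator = poly_mul(numerator, [-xs[j] % p, 1], p)
                denominator = denominator * (xs[i] - xs[j]) % p
        scale = ys[i] * pow(denominator, -1, p) % p
        for k, c in enumerate(numerator):
            coeffs[k] = (coeffs[k] + scale * c) % p
    return coeffs


def evaluate(coeffs, x, p):
    """Evaluate the polynomial at x modulo p using Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


P, OMEGA = 17, 4                  # toy parameters (assumed)
balances = [3, 5, 2, 6]           # hypothetical user balances b_0..b_3
points = [pow(OMEGA, i, P) for i in range(len(balances))]

coeffs = lagrange_interpolate(points, balances, P)
print(points)                                    # the users' points omega^i
print([evaluate(coeffs, x, P) for x in points])  # reproduces the balances
```

The nested loops make this quadratic in the number of users, which is acceptable for a demonstration only.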

KZG Commitment Scheme

We choose a KZG commitment scheme to commit to this polynomial for compatibility with the Halo2 API (more on that later). In brief, a KZG commitment is a single elliptic curve group element $C$ that uniquely represents the polynomial $B$.

It is impossible to reconstruct the polynomial from the commitment, so our requirement of user privacy is satisfied because it is impossible to infer any evaluations of the polynomial from the single-value commitment $C$.

During the reveal (aka opening) phase, the committed value $C$ is used along with the claimed polynomial evaluation $B(x)$ to provide a succinct proof $\pi$, verifying that the value $B(x)$ is indeed an evaluation of a polynomial $B(X)$ at point $x$ and corresponds to the original commitment $C$. Therefore, the KZG commitment allows the Custodian to individually provide the opening proofs $\pi_i$ to each user to prove that the polynomial $B(X)$ indeed evaluates to the user balance $b_i$ at the point $x_i = \omega^i$:

$$\{C, B(\omega^i), \pi_i\}: B(\omega^i) = b_i$$

More broadly, the KZG commitment allows the prover to open the polynomial at any point, and we will later see how it benefits our case.
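As a sketch of what stands behind an opening proof: in KZG, the proof for the claim $B(z) = y$ is a commitment to the quotient polynomial $q(X) = \frac{B(X) - y}{X - z}$, and this division leaves no remainder exactly when $y = B(z)$. The code below illustrates only this algebraic relation over a toy field; it involves no elliptic curves, pairings, or trusted setup, and the coefficients, modulus, and function names are assumptions made for the example.

```python
def evaluate(coeffs, x, p):
    """Evaluate a polynomial (coefficients lowest degree first) at x modulo p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


def divide_by_linear(coeffs, z, p):
    """Divide B(X) by (X - z) with synthetic division.

    Returns (quotient coefficients, remainder); the remainder equals B(z).
    """
    quotient = [0] * (len(coeffs) - 1)
    carry = 0
    for k in range(len(coeffs) - 1, 0, -1):
        carry = (coeffs[k] + carry * z) % p
        quotient[k - 1] = carry
    remainder = (coeffs[0] + carry * z) % p
    return quotient, remainder


P = 17                      # toy modulus (assumed)
B = [4, 3, 0, 2]            # hypothetical B(X) = 4 + 3X + 2X^3
z = 4                       # opening point

q, y = divide_by_linear(B, z, P)
print(y == evaluate(B, z, P))   # the remainder is the claimed evaluation
# B(X) = q(X) * (X - z) + y holds at every point of the field:
print(all(evaluate(B, x, P) == (evaluate(q, x, P) * (x - z) + y) % P for x in range(P)))
```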

Grand Total of the Polynomial Evaluations

To prove the solvency of the Custodian, we need to find its total liabilities by summing up all the user balances and to prove to the public that the sum is less than the assets owned by the Custodian. An individual $i$-th user balance is the evaluation of the polynomial at the $\omega^i$ value corresponding to the user:

$$B(\omega^i) = b_i = a_0 + a_1 (\omega^i)^1 + a_2 (\omega^i)^2 + \dots + a_{n-1} (\omega^i)^{n-1}$$

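Let's calculate the sum $S$ of all the user balances as the sum of the polynomial evaluations:

$$\begin{aligned}
S = \sum_{i=0}^{n-1} B(\omega^i) = {} & a_0 + a_1 \omega^{0} + a_2 (\omega^{0})^2 + \dots + a_{n-1} (\omega^{0})^{n-1} \\
{} + {} & a_0 + a_1 \omega^{1} + a_2 (\omega^{1})^2 + \dots + a_{n-1} (\omega^{1})^{n-1} \\
{} + {} & a_0 + a_1 \omega^{2} + a_2 (\omega^{2})^2 + \dots + a_{n-1} (\omega^{2})^{n-1} \\
{} + {} & \dots \\
{} + {} & a_0 + a_1 \omega^{n-1} + a_2 (\omega^{n-1})^2 + \dots + a_{n-1} (\omega^{n-1})^{n-1}
\end{aligned}$$

Let's factor out each $a_k$ by collecting the columns:

$$S = n \cdot a_0 + a_1 \sum_{i=0}^{n-1} \omega^{i} + a_2 \sum_{i=0}^{n-1} (\omega^{2})^{i} + \dots + a_{n-1} \sum_{i=0}^{n-1} (\omega^{n-1})^{i}$$

Every inner sum is zero. For $k = 1$ it is exactly the sum of all roots of unity. For $1 < k \leq n - 1$, we have $\omega^k \neq 1$ (because $\omega$ is primitive) and $(\omega^k)^n = 1$, so the same argument applies with $\omega^k$ in place of $\omega$: $(\omega^k - 1) \sum_{i=0}^{n-1} (\omega^k)^i = (\omega^k)^n - 1 = 0$. Therefore:

$$S = n \cdot a_0$$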

Therefore, the grand sum of the user balances is simply the constant coefficient of the polynomial times the number of users:

$$S = \sum_{i=0}^{n-1} B(\omega^i) = n \cdot a_0$$
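This identity can be checked numerically. The sketch below recovers the coefficients with the closed-form inverse of evaluation at the roots of unity, $a_k = n^{-1} \sum_i b_i \omega^{-ik}$, which is a standard identity and not necessarily how Summa computes them. The field ($p = 17$), the root ($\omega = 4$), and the balances are toy assumptions; the total has to stay below $p$ for the comparison with the integer sum to be meaningful.

```python
def interpolate_on_roots_of_unity(balances, omega, p):
    """Coefficients of B(X) with B(omega^i) = balances[i], via a_k = n^-1 * sum_i b_i * omega^(-i*k)."""
    n = len(balances)
    n_inv = pow(n, -1, p)
    omega_inv = pow(omega, -1, p)
    return [
        n_inv * sum(b * pow(omega_inv, i * k, p) for i, b in enumerate(balances)) % p
        for k in range(n)
    ]


P, OMEGA = 17, 4            # toy parameters (assumed): 4 has order 4 modulo 17
balances = [3, 5, 2, 6]     # hypothetical balances, total 16 < 17

coeffs = interpolate_on_roots_of_unity(balances, OMEGA, P)
a0 = coeffs[0]
total = len(balances) * a0 % P

print(a0)                       # 4
print(total, sum(balances))     # 16 16
```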

As it turns out, the Halo2 proving system is internally using the roots of unity as $X$ coordinates for the polynomial construction, and we will later see how we can take advantage of that.

Proof of Solvency

Using the described polynomial construction technique and the KZG commitment, it is sufficient for the Custodian to "open" the KZG commitment at $x = 0$:

$$\{C, B(0), \pi_{x=0}\}: B(0) = a_0 + a_1 \cdot 0 + a_2 \cdot 0^2 + \dots + a_{n-1} \cdot 0^{n-1} = a_0$$

The total liabilities can then be calculated by multiplying the $a_0$ value by the number of users:

$$S = n \cdot a_0$$

Proof of Inclusion

As described in the KZG section, individual users would receive the KZG opening proofs $\{C, B(\omega^i), \pi_i\}$ at their specific point $\omega^i$ and they would be able to check that

  • the opening evaluation is equal to their balance: $B(\omega^i) = b_i$;
  • the opening proof $\pi_i$ corresponds to the public KZG commitment $C$.

The caveat is that if two or more users have the same cryptocurrency balance, a malicious Custodian could give them the same KZG proof because the user index $i$ is a value defined by the Custodian. We will use the following technique to mitigate that:

  • the Custodian has to additionally commit to another polynomial that evaluates to the hashes of user IDs at the specific user points: $H(\omega^i) = h_i$;
  • the user ID that is hashed should be known to the user (e.g., the email address used to register with the Custodian);
  • the Custodian then gives two KZG commitments and two opening proofs to the user, one proving the balance inclusion into the balances polynomial and the other proving the user ID inclusion into the ID polynomial:

$$\{C_B, B(\omega^i), \pi_B\}: B(\omega^i) = b_i$$
$$\{C_H, H(\omega^i), \pi_H\}: H(\omega^i) = h_i$$
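As an illustration of how the values $h_i$ could be derived, the sketch below hashes a user ID and reduces the digest into the field. The text does not specify a hash function or a field, so SHA-256 and the modulus $2^{255} - 19$ are stand-in assumptions, and the email addresses are made up.

```python
import hashlib

P = 2**255 - 19  # stand-in prime modulus (assumed; not the field Summa uses)


def user_id_to_field(user_id: str, p: int = P) -> int:
    """Map a user ID to a field element by hashing it and reducing modulo p."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % p


# Two hypothetical users holding the same balance still get different h_i,
# so their openings of the ID polynomial differ even if their balances match.
users = [("alice@example.com", 100), ("bob@example.com", 100)]
hashes = [user_id_to_field(user_id) for user_id, _ in users]
print(hashes[0] != hashes[1])
```

A user can recompute their own $h_i$ locally from the ID they registered with and compare it with the evaluation they are given.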

Performance Estimate

To estimate the computational complexity of constructing a polynomial by Lagrange interpolation and then making a KZG commitment to it, we need to consider both steps separately.

  1. Lagrange Interpolation Polynomial:

    • The Lagrange interpolation involves computing Lagrange basis polynomials and then summing them up, weighted by the data points.
    • The computational complexity of calculating a single Lagrange basis polynomial is $O(n)$, where $n$ is the number of data points.
    • Since we need to calculate this for each of the $n$ data points and then sum them up, the total complexity for Lagrange interpolation is $O(n^2)$.
  2. KZG Commitment:

    • KZG commitments are based on bilinear pairings over elliptic curves and depend on the size of the polynomial to which the commitment is made.
    • Computing the commitment takes one elliptic curve scalar multiplication per coefficient of the polynomial, and each of them is a constant-cost operation.
    • Assuming the polynomial resulting from the Lagrange interpolation has $n$ coefficients, the computational complexity of making a KZG commitment to it is $O(n)$.

Combining both, the overall complexity for performing Lagrange interpolation and then making a KZG commitment can be estimated as $O(n^2) + O(n)$, which is dominated by the $O(n^2)$ term from the Lagrange interpolation for large $n$. Lagrange interpolation can be optimized to sub-quadratic complexity. The key optimization involves recognizing and eliminating the redundant computations across the Lagrange basis polynomials. Since all of the basis polynomials are built from the same set of terms, each leaving out only one of them, the shared part can be computed once. Here's a brief outline of the optimization:

  1. Compute the Product of All Terms: First, compute the product of all terms $(x - x_i)$ for $i = 1$ to $n$. This can be done in $O(n \log n)$ time using techniques similar to those in the Fast Fourier Transform (FFT).

  2. Use Inverse Multiplication for Denominators: Instead of computing each denominator separately, we can compute the inverses of the terms $x - x_i$ and then use these inverses to quickly compute each denominator. This step also takes $O(n \log n)$ time.

  3. Compute the Numerators Efficiently: The numerators of each Lagrange basis polynomial can be computed directly in linear time.

So, by adopting these optimizations, the computational complexity of Lagrange interpolation can be reduced to $O(n \log n)$, which is significantly better than $O(n^2)$ for large $n$.
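When the evaluation points are the $n$-th roots of unity, as chosen for the balance polynomial, there is a well-known way to reach $O(n \log n)$: interpolation becomes an inverse number-theoretic transform (the finite-field analogue of the inverse FFT). Note that this is a different technique from the outline above, shown here as a minimal recursive radix-2 sketch. It assumes $n$ is a power of two, and the field ($p = 257$), the root ($\omega = 4$, of order 8 modulo 257), and the balances are toy values.

```python
def ntt(values, omega, p):
    """Evaluate the polynomial with coefficients `values` at omega^0..omega^(n-1).

    Requires len(values) to be a power of two and omega to be a primitive
    n-th root of unity modulo the prime p.
    """
    n = len(values)
    if n == 1:
        return values[:]
    omega_sq = omega * omega % p
    even = ntt(values[0::2], omega_sq, p)
    odd = ntt(values[1::2], omega_sq, p)
    out = [0] * n
    twiddle = 1
    for k in range(n // 2):
        out[k] = (even[k] + twiddle * odd[k]) % p
        # omega^(n/2) = -1, so the second half only flips the sign
        out[k + n // 2] = (even[k] - twiddle * odd[k]) % p
        twiddle = twiddle * omega % p
    return out


def inverse_ntt(evaluations, omega, p):
    """Recover coefficients from evaluations at omega^0..omega^(n-1)."""
    n_inv = pow(len(evaluations), -1, p)
    return [c * n_inv % p for c in ntt(evaluations, pow(omega, -1, p), p)]


P, OMEGA = 257, 4                       # toy parameters (assumed)
balances = [3, 1, 4, 1, 5, 9, 2, 6]     # hypothetical balances, total 31 < 257

coeffs = inverse_ntt(balances, OMEGA, P)
print(ntt(coeffs, OMEGA, P) == balances)                   # round trip
print(len(balances) * coeffs[0] % P == sum(balances))      # n * a_0 is the total
```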


  1. TODO add a "Motivation" section that explains why the Merkle sum tree (MST) in Summa V1 was not sufficient and why we have been searching for a better solution. In brief, MST involves hashing and contains $n \log n$ entries, making it computationally demanding. In Summa V1 the MST inclusion proofs have to be wrapped inside a ZK-SNARK, so it is also computationally demanding to generate all of them at once for the entire user base of the Custodian that can be on the order of hundreds of millions of users.