We present Halo2-svd, an open source library which provides functionality to verify the singular value decomposition (SVD) of a matrix and to perform basic linear algebraic operations in Halo2. Singular value decomposition is one of the most important tools available in linear algebra to analyze a matrix. Our library brings this powerful tool to zero-knowledge proofs, giving developers an easy and efficient way to perform the linear algebra required for applications ranging from machine learning to decentralized reputation management.
Using singular value decomposition, every matrix $M$ can be written as the product $M = U \Sigma V^T$, where $U$ and $V$ are orthogonal matrices and $\Sigma$ is a diagonal matrix with non-negative entries. Essentially, this decomposition says that a matrix can be viewed as a rotation of coordinates and a rescaling: the orthogonal matrices determine the rotation and the diagonal matrix determines the rescaling. With this interpretation, it can be seen why SVD is an important tool in machine learning: by eliminating the small entries in $\Sigma$, one can compress the matrix $M$ while retaining its most essential features. This is essentially the idea behind Principal Component Analysis (PCA), which is a technique for dimensionality reduction in machine learning. The singular vectors (these are the columns of $U$ and the rows of $V^T$) also satisfy special properties, which have been used in algorithms such as EigenTrust.
Zero-knowledge (ZK) proofs [4] are a framework that allows a prover to prove that a statement is true without revealing any additional information beyond the validity of the statement. One of the most popular implementations of a ZK proof system is Halo2, which provides basic primitives for writing ZK circuits. Recently, Halo2 has been used by Axiom to trustlessly access historic on-chain data of Ethereum. The availability of this data naturally prompts interest in the application of statistical and machine learning algorithms. To facilitate this, an efficient implementation of fundamental linear algebraic operations such as matrix multiplication, eigenvalue calculations, and singular value decompositions within a ZK circuit becomes imperative.
There are two major challenges to overcome in order to implement these operations in ZK circuits:
Performing matrix multiplication is a very costly operation: for $N \times N$ square matrices, it naively requires $O(N^3)$ operations, and $O(N^{2.81})$ with Strassen's algorithm [3].
One needs to convert floating point numbers to the prime field $\mathbb{F}_p$, since all numbers in a ZK circuit are represented in such a field. This requires implementing a quantization scheme and a corresponding fixed point arithmetic.
For the first point above, the key observation is that one simply needs to verify the result of a matrix multiplication inside the ZK circuit, not compute it. This can be done with only $O(N^2)$ operations with very high probability; see Lemma 1 below. The second obstacle is overcome by using the ZKFixedPointChip [1], which is a library that implements fixed point arithmetic for Halo2. (See also the blog post [1] for an explanation of the Fixed Point Chip and the quantization method that it uses to convert floating point numbers to elements of the prime field.)
The prime field we will work with is $\mathbb{F}_p$, constructed from the prime $p$ described in [5]. In the following we will use plain letters (e.g., $M$) to denote matrices valued in $\mathbb{R}$ and hatted letters (e.g., $\hat{M}$) to denote matrices valued in $\mathbb{F}_p$.
Efficient algorithms for matrix multiplication and SVD
We start by describing the algorithm for the verification of matrix multiplication. Consider three matrices $A$, $B$, and $C$ in $\mathbb{F}_p^{N \times N}$. Suppose that we wish to verify that $A \cdot B = C$ inside the ZK-circuit. Instead of naively multiplying the matrices $A$ and $B$ inside the circuit and checking the result against $C$, we use the following variant of Freivalds' randomized algorithm [8] to verify matrix multiplication (interestingly, this variant appears in Sec 2.2 of [9]):
Sample a random field element $r \in \mathbb{F}_p$.
Construct the vector $x = (1, r, r^2, \ldots, r^{N-1})^T$.
Compute the quantities $A \cdot (B \cdot x)$ and $C \cdot x$.
If $A \cdot (B \cdot x) = C \cdot x$, then the test passes and we have verified that $A \cdot B = C$; otherwise, it fails.
The element $r$ is randomly sampled by the Prover using the Fiat-Shamir transform [6]. Note that it is important that the Prover commits to the matrices $A$, $B$, and $C$ before the Fiat-Shamir transform is used to sample $r$. The computation of the multiplication result $C = A \cdot B$ is performed by the Prover outside the circuit, and any algorithm can be used for this purpose. In our code, we simply use the trivial multiplication algorithm.
The most important aspect of the above algorithm is that it only requires $O(N^2)$ multiplications (essentially one multiplication per matrix element), compared to $O(N^3)$ multiplications for the trivial matrix multiplication algorithm.
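The four steps above can be sketched outside the circuit in plain Python (an illustrative model, not the library's in-circuit Rust implementation; the prime below is a small stand-in for the actual field prime):

```python
# Freivalds-style check over a prime field with the power vector
# x = (1, r, r**2, ...). The prime is an illustrative stand-in.
P = 2**61 - 1

def matvec(M, x, p=P):
    return [sum(M[i][j] * x[j] for j in range(len(x))) % p for i in range(len(M))]

def freivalds_check(A, B, C, r, p=P):
    # Accept iff A(Bx) == Cx (mod p) for x = (1, r, r**2, ..., r**(N-1)).
    n = len(A[0])
    x = [pow(r, j, p) for j in range(n)]
    return matvec(A, matvec(B, x, p), p) == matvec(C, x, p)

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C_good = [[19, 22], [43, 50]]   # equals A @ B
C_bad = [[19, 22], [43, 51]]    # one wrong entry
assert freivalds_check(A, B, C_good, r=123456789)
assert not freivalds_check(A, B, C_bad, r=123456789)
```

Note that only three matrix-vector products are computed, which is where the $O(N^2)$ cost comes from.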
The following lemma implies that if the claimed multiplication result is incorrect, then the test fails with high probability.
Lemma 1. Let $A, B, C \in \mathbb{F}_p^{N \times N}$, and let $r$ be chosen uniformly at random over $\mathbb{F}_p$. Define $x = (1, r, r^2, \ldots, r^{N-1})^T$. If $A \cdot B \neq C$ (as field matrices), then one has:
$$\Pr\left[\, A \cdot (B \cdot x) = C \cdot x \,\right] \leq \frac{N-1}{p}.$$
Proof. This result is a simple consequence of the Schwartz-Zippel lemma [2]. If $A \cdot B \neq C$, then the matrix $D := A \cdot B - C$ has a non-zero row, say row $i$. By a direct computation, the $i$-th coefficient of the vector $D \cdot x$ is given by:
$$(D \cdot x)_i = \sum_{k=1}^{N} D_{ik}\, r^{k-1}.$$
By our assumption, this is a non-zero polynomial in $r$ of degree at most $N-1$. Therefore $A \cdot (B \cdot x) = C \cdot x$ implies that a non-zero polynomial vanishes at the random field element $r$. The Schwartz-Zippel lemma implies this can happen with probability at most $(N-1)/p$. Q.E.D.
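The bound of Lemma 1 can be checked empirically over a tiny prime field by enumerating every possible challenge $r$ (a toy experiment; the prime is far smaller than the one used in the circuit):

```python
# Count the challenges r for which the test falsely accepts a corrupted C.
# Lemma 1 bounds this count by N - 1 out of the p possible challenges.
p = 101
N = 3

def matvec(M, x):
    return [sum(M[i][j] * x[j] for j in range(N)) % p for i in range(N)]

A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
B = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]
AB = [[sum(A[i][k] * B[k][j] for k in range(N)) % p for j in range(N)]
      for i in range(N)]
C = [row[:] for row in AB]
C[2][2] = (C[2][2] + 1) % p   # corrupt one entry, so C != A @ B

false_accepts = sum(
    matvec(A, matvec(B, [pow(r, j, p) for j in range(N)]))
    == matvec(C, [pow(r, j, p) for j in range(N)])
    for r in range(p)
)
assert false_accepts <= N - 1   # here the only accepting challenge is r = 0
```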
Note that we have so far only discussed matrix multiplication for matrices over the prime field. What we are really interested in are matrices over real numbers. We will discuss the matrix multiplication for such matrices in the next section.
Once we have an efficient algorithm for verifying matrix multiplication, we can use it to verify the SVD of a matrix $M$ by simply checking that $M = U \Sigma V^T$, and that $U \cdot U^T = I$ and $V \cdot V^T = I$.
Design considerations
Matrix multiplication
Under fixed point precision, there are multiple methods of implementing Freivalds' algorithm [8] for real matrices, the most straightforward being to trivially encode all operations using the fixed point chip. For example, to check that $M = U \Sigma V^T$ (say, with all matrices of dimension $N \times N$), analogously to the field matrix algorithm described above, we can check that their quantizations satisfy $\hat{M} \approx \hat{U} \cdot \hat{\Sigma} \cdot \hat{V}^T$ (see the next section for a precise definition of the quantization scheme used by the fixed point chip) by iterating the following simple test $k$ times:
Sample a random vector $x \in \{0, 1\}^N$.
Compute the difference vector $e = \hat{U} \cdot (\hat{\Sigma} \cdot (\hat{V}^T \cdot x)) - \hat{M} \cdot x$ using fixed point arithmetic.
Check that $\|e\| \leq \tau$ for some tolerance $\tau$.
While such an implementation appears to be correct, one needs to be very careful when implementing it with the fixed point chip, especially for the check in Step 3, because the fixed point chip does not check for overflows, and a dishonest prover could exploit this fact to cheat. Secondly, it turns out that for a reasonably small probability of error for this algorithm (say $2^{-k}$), the number of iterations required is $k$, which is quite significant and leads to a large overhead (between 5 and 10 times compared to our implementation) in circuit size. Lastly, each iteration would also require a hash to be calculated in circuit, which is likewise quite expensive.
Instead, in our library we implement Freivalds' algorithm at a lower level: at the level of the field elements. In the fixed point arithmetic of [1], a real number $w$ with $P$ bits of precision is represented as the field element $\lfloor w \cdot 2^P \rceil \bmod p$ in the field $\mathbb{F}_p$. Real number multiplication under this encoding is simply field multiplication followed by a rescaling of the output by $2^{-P}$. Matrix multiplication of real matrices under this encoding can then be viewed as field matrix multiplication followed by a rescaling of each element of the output. Freivalds' algorithm for field matrices can be implemented as described in the previous section. The implementation and analysis of this algorithm are much more efficient and simpler than for real numbers, particularly because for field matrices we do not need to check whether the difference vector ($e$ in Step 2 above) is close to zero; we can simply check whether or not it is exactly zero. Further, it is sufficient to use a single random vector for the test, which eliminates the overhead due to multiple iterations. Finally, it turns out that this random vector can also be chosen to be of the form $(1, r, r^2, \ldots, r^{N-1})^T$ for a random field element $r$, as shown in Lemma 1 above. As a result, we do not require any hashes to be computed in circuit apart from the commitment for the first phase of the ZK-circuit, which generates the element $r$.
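The encoding just described can be sketched in plain Python (a model of the scheme, not the ZKFixedPointChip itself; the prime and precision below are illustrative):

```python
# Fixed-point encoding sketch: a real w is stored as round(w * 2**PREC) mod p.
# Multiplying two encodings yields an extra 2**PREC factor, which the
# rescaling step removes.
p = 2**127 - 1   # stand-in prime, not the actual Halo2 field prime
PREC = 32

def encode(w):
    return round(w * 2**PREC) % p

def decode(f):
    signed = f if f <= p // 2 else f - p   # past p/2 means negative
    return signed / 2**PREC

def mul(f, g):
    prod = (f * g) % p                      # now at scale 2**(2*PREC)
    signed = prod if prod <= p // 2 else prod - p
    return (signed >> PREC) % p             # rescale back to scale 2**PREC

a, b = encode(1.5), encode(-2.25)
assert abs(decode(mul(a, b)) - (1.5 * -2.25)) < 2**-PREC
```

The rescaling step (the right shift above) is exactly the operation that is expensive in circuit, which motivates the design choice of avoiding it wherever possible.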
Overall, a field matrix multiplication can be checked using about $O(N^2)$ cells. As we mentioned above, to obtain the real matrix output from the output of the field matrix multiplication, we need to rescale the matrix. The cost of this rescaling depends on the lookup table size and the precision, and it turns out to be by far the more expensive part of the multiplication procedure.
SVD verification
We design the SVD verification procedure so that it does not require rescaling any matrix product since, as mentioned above, this is a very expensive operation. The bounds which we choose to check for the verification of the SVD can all be viewed as bounds of the form $\|A - B\| \leq \epsilon$ (see the next section and also [7]). For such bounds, it is sufficient to instead form the unscaled field products of these matrices and simply verify that the difference of their elements is smaller than an appropriately scaled version of $\epsilon$. In addition to these bounds, we also verify that the elements of $\hat{U}$ and $\hat{V}$ are smaller than (the quantization of) $1$ and that the singular values are smaller than the maximum value that can be represented using the fixed point chip. These bounds ensure that the matrix multiplications used do not overflow.
The singular value decomposition algorithm
We now describe the full algorithm to check the singular value decomposition of a matrix inside the ZK-circuit. The singular value decomposition (SVD) of a matrix $M \in \mathbb{R}^{N \times K}$ can be written as $M = U \Sigma V^T$, where $U \in O(N)$, $V \in O(K)$, and $\Sigma$ is a diagonal matrix with non-negative diagonal entries. Here $O(N)$ denotes the set of real orthogonal matrices of size $N \times N$.
There are two sources of numerical error that arise as one attempts to check such a decomposition within a ZK-circuit. First, when one numerically computes the SVD of $M$ using, for instance, numpy, one gets matrices which are only approximately equal to their exact counterparts $U$, $\Sigma$, $V$. Second, one needs to apply a quantization scheme to these matrices in order to be able to represent them in $\mathbb{F}_p$. The quantization of a matrix $A$ is given by:
$$A_q := 2^{-P} \lfloor 2^P A \rceil.$$
In other words, $A_q$ is the matrix with the elements of $A$, up to a precision of $P$ bits. $2^{-P}$ is referred to as the quantization scale. Therefore, for any real matrix $A$, the matrix $2^P A_q$ can be represented as a field matrix $\hat{A}$ in $\mathbb{F}_p$ (with negative entries represented by field elements past the midpoint of the field).
We now describe our complete algorithm to verify the SVD of $M$ in Halo2. We use the observation that given $M = U \Sigma V^T$, we have:
$$2^{2P}\, \hat{M} \approx \hat{U} \cdot \hat{\Sigma} \cdot \hat{V}^T \quad \text{(as field matrices)},$$
since each field product of two quantized matrices carries an extra factor of $2^P$. Rewriting the SVD condition this way will help us avoid rescaling the matrices while checking the SVD.
SVD computation and verification in our library follows these steps. Suppose the prover wishes to compute the SVD of a fixed point quantized matrix $M$ (i.e., $M = M_q$); then the prover:
Computes the numerical approximations $U$, $\Sigma$, $V$ to the matrices in the singular value decomposition of $M$, using any library for linear algebra (numpy in the example in our library).
Converts these matrices to prime field valued matrices using the quantization procedure described above, namely the matrices $\hat{U}$, $\hat{\Sigma}$, $\hat{V}$. Similarly defines $\hat{M}$, which is exact since $M = M_q$.
Computes the matrix product $\hat{U} \cdot (\hat{\Sigma} \cdot \hat{V}^T)$ outside the ZK-circuit.
Computes the product $\hat{\Sigma} \cdot \hat{V}^T$ inside the ZK-circuit. Note this only requires $O(N^2)$ multiplications since $\hat{\Sigma}$ is a diagonal matrix.
Proves the following claims in the ZK-circuit:
The matrix multiplication $\hat{U} \cdot (\hat{\Sigma} \cdot \hat{V}^T)$ is correct, using the randomized algorithm of the previous section.
That $\| \hat{U} \cdot (\hat{\Sigma} \cdot \hat{V}^T) - 2^{2P}\, \hat{M} \| \leq \epsilon$. The definition of the norm and the way $\epsilon$ should be chosen will be explained in the next section.
Computes the two products $\hat{U} \cdot \hat{U}^T$ and $\hat{V} \cdot \hat{V}^T$ outside the ZK-circuit, then proves inside the ZK-circuit:
The matrix multiplications $\hat{U} \cdot \hat{U}^T$ and $\hat{V} \cdot \hat{V}^T$ are correct.
That $\| \hat{U} \cdot \hat{U}^T - 2^{2P}\, I \| \leq \epsilon'$ and $\| \hat{V} \cdot \hat{V}^T - 2^{2P}\, I \| \leq \epsilon'$, where $\epsilon'$ is a small error also defined in the next section.
Note that the above algorithm can easily be adapted to check the eigenvalue decomposition of a real symmetric matrix $M$.
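The verification conditions above can be mocked outside the circuit in plain NumPy (an illustrative model, not the library's in-circuit Rust implementation; the prime, precision, and tolerance EPS below are assumptions chosen loosely for illustration, while the precise bounds come from the error analysis of [7]):

```python
# Out-of-circuit mock of the SVD checks: the field product of the three
# quantized factors should match the quantized M up to a factor 2**(2*PREC),
# and U U^T, V V^T should match 2**(2*PREC) * I, all up to a small error.
import numpy as np

p = 2**127 - 1        # stand-in prime, not the actual Halo2 field prime
PREC = 32
N = 8

def quantize(A):
    # round(w * 2**PREC) mod p; negative reals wrap to elements past p/2
    return np.vectorize(lambda w: round(w * 2**PREC) % p, otypes=[object])(A)

def norm_p(Ahat):
    # max norm on field matrices: elements past p/2 count as negatives
    return max(min(int(a), p - int(a)) for a in Ahat.flat)

rng = np.random.default_rng(0)
M = rng.standard_normal((N, N))
U, s, Vt = np.linalg.svd(M)

Mh, Uh, Sh, Vth = quantize(M), quantize(U), quantize(np.diag(s)), quantize(Vt)

prod = Uh.dot(Sh.dot(Vth)) % p          # field product of the three factors
UUt = Uh.dot(Uh.T) % p                  # should be ~ 2**(2*PREC) * I
VVt = Vth.dot(Vth.T) % p                # should be ~ 2**(2*PREC) * I
I2P = (quantize(np.eye(N)) * 2**PREC) % p

EPS = 2**(2 * PREC + 16)                # loose illustrative tolerance
assert norm_p((prod - 2**(2 * PREC) * Mh) % p) < EPS
assert norm_p((UUt - I2P) % p) < EPS
assert norm_p((VVt - I2P) % p) < EPS
```

In the circuit, the two matrix products themselves are only verified via the randomized check, and the comparison against the tolerance replaces any rescaling.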
Error analysis of the SVD algorithm
In this section, we present precise error bounds used in our SVD verification algorithm. For an in-depth error analysis, we refer the reader to [7], where these bounds are proven.
Let us first explain how we define the norm over field valued matrices. Given an integer $a$ satisfying $0 \leq a < p$, where $p$ is the prime of our prime field $\mathbb{F}_p$, define the quantity:
$$|a|_p := \min(a,\, p - a).$$
The idea behind this definition is that we want to think of prime field elements past the midpoint of the field as representing negative numbers. For a matrix $\hat{A}$ valued in $\mathbb{F}_p$, we can now define:
$$\| \hat{A} \| := \max_{i,j}\, | \hat{A}_{ij} |_p.$$
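As a tiny sketch (with an illustrative prime, not the actual field prime), the two definitions above read:

```python
# Sketch of the norm on field-valued matrices defined above.
p = 101  # tiny illustrative prime

def abs_p(a):
    # |a|_p = min(a, p - a): elements past p/2 represent negative numbers
    assert 0 <= a < p
    return min(a, p - a)

def norm_p(Ahat):
    # entrywise maximum of |.|_p
    return max(abs_p(a) for row in Ahat for a in row)

assert abs_p(3) == 3
assert abs_p(p - 3) == 3          # p - 3 represents -3
assert norm_p([[1, p - 5], [2, 0]]) == 5
```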
Now we can state the error analysis results. Let $\sigma_{\max}$ denote the largest singular value of $M$. The results of [7] guarantee bounds of roughly the following form:
$$\| \hat{U} \cdot (\hat{\Sigma} \cdot \hat{V}^T) - 2^{2P}\, \hat{M} \| \lesssim N\, 2^P\, (1 + \sigma_{\max}), \qquad \| \hat{U} \cdot \hat{U}^T - 2^{2P}\, I \| \lesssim N\, 2^P.$$
So we want to choose $\epsilon$ of order $N\, 2^P\, (1 + \sigma_{\max})$ and $\epsilon'$ of order $N\, 2^P$. To get a sense of these values, consider a square matrix $M$ of size $N$ with a fixed quantization scale $2^{-P}$ and a fixed bound on $\sigma_{\max}$: then $\epsilon$ grows with both $N$ and $\sigma_{\max}$, while $\epsilon'$ grows only with $N$. The first error is larger since it also grows with the matrix norm.
Performance of the algorithm
The total number of advice cells and lookup cells used depends on the quantization scale parameter, denoted PRECISION_BITS in the code, as well as on the LOOKUP_BITS parameter.
For LOOKUP_BITS = 19 and PRECISION_BITS = 32, as well as for LOOKUP_BITS = 19 and PRECISION_BITS = 63, the number of advice cells (split between the first and second phase) and the number of lookup cells grow quadratically in the matrix size, with the higher precision setting using correspondingly more cells. Note that we use the Halo2 Challenge API to implement the Fiat-Shamir transform [6] and sample the randomness for the matrix multiplication verification. This API divides the program into two phases.
Acknowledgments
This project was conducted as part of the Axiom Open Source Program. We are grateful to N. Feng, Y. Sun, and J. Wang for organizing it. It is built on top of the ZK circuit primitives provided by Halo2 and Axiom's halo2-base. We also used the fixed-point chip implementation of [1].