Proof of Assets proposal - v1

# Proof of Assets proposal - v1 Ethereum is the chain here. Other chains would possibly need a different spec. ## ZK-SNARK high-level design The idea is to hide the $n$ accounts of the exchange amongst a greater set of addreses of size $N$ (the anonymity set). The exchange will provide ECDSA signatures as evidence that they control the $n$ accounts. Parameters for the snark: ``` [pub] array[N]: Ethereum addresses (anon set) [pub] array[N]: ETH balances for the above addresses [pvt] array[N]: booleans representing which of the above addresses the exchange owns [pvt] array[N]: ECDSA signatures (actually 4 arrays: r, s, message, pub key) [pub] group elem: Pedersen commitment to total asset sum [pvt] integer: blinding factor for the above commitment ``` Computation inside the snark: ```rust let sum = 0; for i in 0..N { if boolean_array[i] == 1 { verify_ecdsa(signatures[i]); let address = keccak256(pub_keys[i]); assert(address == anon_set[i]); sum += balances[i]; } } let com = pedersen_commitment(sum, blinding_factor); assert(com == com_param); ``` Notes: - The ECDSA parameter will have to be 4 arrays: one for each value of the signature (r & s), one for the messages, and one for the public key. - The ECDSA array values are only non-zero if the exchange owns the address at that index. - Branching is not possible inside an R1CS circuit (to my knowledge, anyway) so any code inside an if-statement will be "executed" in the circuit regardless of the condition i.e. constraints will be generated for if-statement branches that would not usually be executed if the program was run as machine code. This means that the ECDSA verification algorithm will be done $N$ times, not $n$ times. ## Problems with the snark design The total number of R1CS constraints that we can have is ~134M, according to the assumptions that a) we will be using a trusted setup, b) the trusted setup determines the total number of constraints, and c) Filecoin has the largest trusted setup around: > As far as we know, Filecoin represents the largest zk-SNARK deployment to date—in several dimensions: > - Our trusted setup enables circuits of up to 2^27 = ~134M constraints. https://research.protocol.ai/sites/snarks/ Both ECDSA & keccak hashing take a lot of constraints. We need to do 1 of each of these for each address in the anon set, which means we have to do a total of $N$ of them. This means $N$ is bounded by $f^{-1}(134M)$ for some function $f: \text{anon set size} \to \text{constraints}$. ### ECDSA verification In [0xPARC's ECDSA verification circom implementation](https://github.com/0xPARC/circom-ecdsa) it takes ~1.5M constraints to verify an ECDSA signature. This number was taken from their [blog article](https://0xparc.org/blog/batch-ecdsa) on the topic, in which they propose a batching algorithm for verifying $n$ signatures that requires fewer constraints compared to naively looping over the original algorithm. Details on the efficiency gain can be seen in the Desmos graphs [here](https://www.desmos.com/calculator/amvuweuegt). Obviously ~1.5M constraints per signature verification is going to be very limiting in terms of the anon set size $N$, since $N$ signature verifications will have to be done. Even if the batch algorithm is used we still are very limited: 100 signature verifications will yield ~41M constraints, which is $1/3$ of the total number of allowed constraints. Immediately we can see that we will need recursive snarks: ![Untitled-2023-11-30-0900](https://hackmd.io/_uploads/H1U2y0Bra.png) Each of the red snarks will verify a portion of $N$, $N_i$, such that $\sum_{i=1}^{k}N_i = N$. The green snark will then verfiy each of the red snarks. The plus side here is that the red snarks can be proved in paralell. ### Keccak256 The keccak hash function is not suited for snarks as it requires many constaints--many more than some other hash functions, such as Poseidon. Using the keccak256 code in the ZoKrates standard library, we get ~150k constraints for a single invocation of keccak on a 256-bit input. ### ZoKrates A small experiment was done using ZoKrates: the above snark was [coded up](https://github.com/Stentonian/poa-zokrates) using EdDSA instead of ECDSA (since that was the library available, and time was of the essence). The code was compiled into R1CS with different values of $N$ and these are the metrics gathered: | $N$ | Constraints | Total memory used | Compilation time | | ---- | ----------- | --- | ---------------- | |12 | 2939125 | N/A | 1m52.015s | |18 | 4408687 | N/A | 2m52.782s | |28 | 6857957 | 18GB | 4m47.163s | |42 | 10286935 | 44GB | 8m8.578s | |60 | 14695621 | 63GB | 14m22.142s | |82 | 20084015 | 96GB | 25m10.087s | |108 | 26452117 | 139GB | 41m45.104s | |138 | 33799927 | 187GB | 63m14.306s | |172 | 42127445 | 233GB | 104m53.336s | |210 | 51434671 | 292GB | 145m46.444s | |252 | 61721605 | 355GB | 202m48.684s | |298 | 72988247 | 421GB | 278m4.777s | |348 | 85234597 | 492GB | 373m54.384s | We can see that even with EdDSA we very quickly reach the limits with small values of $N$. From this data we can extract our function $f^{-1}$ (see analysis [here](https://www.desmos.com/calculator/fxkkoeonfw)): $$ f^{-1}(C)=\frac{200C-19}{49} $$ where $C$ has units of *millions of constraints*. This gives us $f^{-1}(134\text{M})=546$, which is our maximum anon set size. Clearly not good enough. ## Anonymity set size How big do we actually need $N$ to be? Looking at Ethereum we see that - If you sort accounts from most to least ETH then the address at position 10k has ~500 ETH (~$1M @ ~$2k/ETH) ([source](https://etherscan.io/accounts/400)) - There are: - ~500k active daily addresses ([source](https://www.statista.com/statistics/1278174/ethereum-active-addresses/)) - ~100M addresses with non-zero balance ([source](https://studio.glassnode.com/metrics?a=ETH&m=addresses.NonZeroCount)) - ~5M addresses with balance >0.1 ETH ([source](https://studio.glassnode.com/metrics?a=ETH&category=Addresses&m=addresses.MinPoint1Count)) - ~1.7M addresses with balance >1 ETH ([source](https://studio.glassnode.com/metrics?a=ETH&category=Addresses&m=addresses.Min1Count)) - ~350k addresses with balance >10 ETH ([source](https://studio.glassnode.com/metrics?a=ETH&category=Addresses&m=addresses.Min10Count)) Having 10k as an anonymity set size is a reasonable goal for the first iteration, but ideally we need to have the top ~$400$k accounts. With $N=10\_000$ we would have to do $100$ seperate snarks (given $N_i=100$), which is not too bad. With $N=400\_000$ we would have to do $4$k seperate snarks, which is rather terrible. If we use the ZoKrates compile time metrics then each snark of $N_i=100$ would take ~$40$ minutes to generate, using ~$140$GB of memory. If we use r7a.8xlarge instances on AWS then proof generation would cost $9737 :dizzy_face: Clearly this is not a good-enough solution. ## Second snark design The biggest problem in the original snark design is the ECDSA signature verification. If we can somehow reduce the number of times this is done then we can cirumvent all the issues with limits and compute time. Pushing the signature verifications to their own, separate snark means that we only have to verify $n$ sigs, which means that $N$ need be bounded from above so tightly anymore. We will have 2 snarks in total: 1 for ECDSA verification and 1 for anonymity set membership checks. The ECDSA snark will still put limitations on the size of $n$, but at least this number is considerably smaller than $N$. If $n \gg 100$ then we can have multiple iterations of this snark run in parallel. ### First snark (ECDSA verifications) Parameters for the snark: ``` [pvt] array[n]: ECDSA signature - r values [pvt] array[n]: ECDSA signature - s values [pvt] array[n]: ECDSA signature - messages [pub] array[n]: ECDSA signature - public keys ``` Computation inside the snark: ```rust for i in 0..n { verify_ecdsa(r[i], s[i], message[i], pub_key[i]); } ``` ### Second snark (anonymity set membership check) Parameters for the snark: ``` [pvt] first snark proof [pub] array[N]: Ethereum addresses (anon set) [pub] array[N]: ETH balances for the above addresses [pvt] array[n]: public keys owned by exchange [pvt] array[n]: index of public keys in anon set [pub] group elem: Pedersen commitment to total asset sum [pvt] integer: blinding factor for the above commitment ``` Computation inside the snark (ignore the index-out-of-bounds bug): ```rust verifySnark(snark_proof, pub_keys); let sum = 0; for i in 0..n { let address = keccak256(pub_keys[i]); let index = indices[i]; assert(index > indices[i-1]); // ensure no double-counting assert(address == anon_set[index]); sum += balances[index]; } let com = pedersen_commitment(sum, blinding_factor); assert(com == com_param); ``` ### Analysis The limiting factor now is the keccak hash. Assuming 150k constraints per invocation can have a naive guess at $f^{-1}$: $$ f^{-1}(C) = \frac{2C}{3} $$ where $C$ has units of *millions of constraints*. This gives us $f^{-1}(134\text{M})=893$, which is still far from where we need it to be. If we repeat the same extraction process done for the signature verification then we can remove the keccak hash into its own snark. ### Final design ![poa_snarks](https://hackmd.io/_uploads/H1F1ie8Ba.jpg) Full quality image [here](https://drive.google.com/drive/folders/1Nj5uek31W9FeE0MviO760YrsG9wUEM5H). What will be the value of $k$? It will ultimately be determined by the first set of snarks since ECDSA signature verification takes up more constraints than keccak hashing. Using our previous estimate of 100 signatures per snark we get $k=n/100$. Suppose an exchange has 10k addresses, then we will need to run 201 snarks, which is not too bad. And we can easily have our anon set size be 400k. It's worth summarizing the assumptions we have used to get to this point: - Batch ECDSA signature verification has the number of constraints described by the function [here](https://www.desmos.com/calculator/thzhbu5o69) - Keccak256 uses 150k constraints per invocation - Branching with short-circuting is not possible, assuming we are using R1CS (if we use Plonkish then it may be possible to use lookup arguments to check set membership, which essentially gives us the short-circuiting we need) ## Language and proving system Since ECDSA batch verification and keccak256 have both been implemented in Circom, it seems this is a good choice if we want to keep dev-time to a minimum. Circom supports Groth16, Plonk & Fflonk so we would need to choose 1 of these systems. Groth16 should be used for the first 2 layers of snarks since it has the most succint proofs and fewest operations for the verifier (which will be a snark). Note that the trusted setup will need to be run twice (once for each layer). The final snark can use either of the 3 available proving systems. ## Beyond the snark - ZK transaction sync protocol This section refers to the [Instant Zero Knowledge Proof of Reserve](https://eprint.iacr.org/2023/1156.pdf) paper, which describes a protocol for taking a [secret] set of accounts & a commitment to their balance sum, and updating the balances as blocks are added to the chain. The idea here is that the snark proof only needs to be run once, after which each transaction on the block is processed and used to update the total balance of the exchange. In order to integrate with this protocol the final snark will need to cryptographically link the private addresses input to the one used in zk-tx-sync protocol (done using a vector Pedersen commitment). The combination of the 2 protocols allows the exchange to commit to an address set that contains 0-balance addresses to be used in the future. If the set needs to be expanded at some point then the snark will have to be run again. It's not clear whether this paper has been reviewed, and there are no security/zk proofs (it seems to rely on other papers for this) so the legitimacy of the protocol is not guarenteed yet. Some more investigation needs to be done. ## Security and privacy There are some details on this topic already over here: https://hackmd.io/@summa/rkTtmr-K3 Privacy of the exchange's addresses is dependent on $N-n$: the larger this value the better the exchange's addresses are hidden in the anon set. Of course it's also important to note that security & privacy depend on the validity of the snark proving system used. If the zk-tx-sync protocol is used then there is also a dependency there. ## Developement time estimates High level tasks: - Circom circuits: 1-2 weeks - Tests for above: 1 week - Buliding ZK transaction sync protocol: 3-6 months