# Proof of Assets Deep Dive

**Abstract:** This post is part 2 of a detailed explanation of Proof of Reserves. It dives into the construction and history of the Proof of Assets protocol. Some light technical know-how is required to read this article.

# Introduction

In a [previous blog post](https://www.coinbase.com/blog/proof-of-reserves-grant), [Silver Sixpence](https://silversixpence.io) highlighted our commitment to advancing the state of **Proof of Reserves (PoR)** through privacy-preserving and crypto-native solutions. Silver Sixpence is building an open-source software library & web application for PoR to enhance the transparency, reliability, and accessibility of centralised exchange solvency data.

Generally, PoR refers to an accounting procedure that digital asset custodians (e.g. **centralised cryptocurrency exchanges**, aka **CEXes**) undertake to show that they have enough assets to cover their total user liabilities; it allows the custodian’s customers to verify (in a trustless manner) that the custodian is indeed solvent. PoR consists of 2 steps: **Proof of Assets (PoA)** and **Proof of Liabilities (PoL)**. In PoA, the exchange demonstrates control of digital assets, while in PoL, the exchange commits to the total assets owed to each user (the liabilities). The essence of PoR lies in comparing the total sum of liabilities to the total sum of assets, with the ultimate goal of ensuring a positive balance sheet (assets - liabilities > 0).

**In this post, we will focus on PoA**; PoL is covered [in a previous article](https://hackmd.io/RnkGk65rQBGPMOZ6Hdbiqg). We focus on PoA for a single digital asset to simplify the text, although the principles can be easily extended to multiple assets.

### A quick note on the terminology

**Proof of Solvency** generally refers to a wider area than what we are focused on, so this terminology is not used. According to the definitions made by the [Chamber of Digital Commerce](https://digitalchamber.org/proof-of-reserves-blog/), we are focusing specifically on **Proof of Platform Reserves**. Still, the post will use the term Proof of Reserves (PoR) to mean this. PoR is sometimes used to mean just Proof of Assets, but this post does not use the term in that way.[^1] The terms can all be related like so:

$$ \text{proof of solvency} > \text{proof of reserves} = \text{proof of assets} + \text{proof of liabilities} $$

### And a quick note on the technical depth of this post

This post can be read by anyone familiar with the blockchain ecosystem and the purpose of a CEX [or other digital asset custodian]. Light technical knowledge is required, as most of the post uses technical jargon like blockchain, CEX, encryption, digital signature, proof, etc. However, some sections of this post get quite technical; these are marked with a _“[technical]”_ prefix and can be skipped without feeling a discontinuity in the text. The footnotes sometimes add some extra technical details, which can be left unread without losing the general idea.

# What is PoA?

Proof of Assets (PoA) is a process by which a custodian provides evidence that it controls a specific amount of digital assets. This amounts to proving ownership of a set of **blockchain accounts** whose collective digital asset balance sums to a particular amount. A blockchain account is synonymous with **a wallet**:

1. There is a public component called the **address**, which anyone can use to determine the asset balance of the account,
2. A secret component called the **private key**, which is used to generate the address and is required for producing transactions for the account (like sending assets to another account).

An account is said to be **owned** by a party if they know the private key.[^2]

So, then, a simple way to test if a custodian controls $X$ digital assets is to request that they show their private keys for all their accounts: with the private keys, you can create the addresses, whose balances can then be read from the blockchain, summed and compared to the claimed value $X$. Of course, sharing private keys relinquishes control of the assets, so custodians would only share these with a piece of software that they trust and would not send them over any network other than their own. But even if confined to one’s own network + trusted software, it is still unlikely that a custodian would be willing to share private keys: security best practices often result in private keys not being extractable from their governing software at all.[^3]

**Digital Signature Algorithms (DSAs)** are the next best option. Since DSAs are the standard method for proving ownership of an account when performing transactions on the blockchain, every type of key-governing software that a custodian may use can produce a digital signature. A digital signature is proof that someone knows the private key of an account and thus proof that someone owns the account.[^4] Instead of asking the custodian for a list of private keys, we ask them for digital signatures.

Why even consider using private keys if we can just use digital signatures? We will see further down that using private keys offers advantages over DSAs.

The next few sections explain different ways of doing PoA, from basic to complex. The reason for the increase in complexity is privacy: a custodian may want to avoid exposing a) any of the addresses it owns, b) the number of addresses it owns, or c) the total asset sum it controls.

### We start with this single requirement for a PoA

1. Prove that the custodian controls $X$ digital assets

# Solution 1: published list of addresses

The most basic form of PoA is simply for the custodian to publish a list of addresses and the claimed sum of their balances. Anyone can verify the balance sum by reading each address’s balance from the blockchain.

![address_balance_table](https://hackmd.io/_uploads/Hkj1VAOSA.png "Table of address & balance")

This approach relies on people's confidence that the custodian owns the accounts. As mentioned above, the custodian could produce digital signatures for the addresses, which provides evidence that they own the accounts.[^5]

![address_balance_sig_table](https://hackmd.io/_uploads/B1uNNR_rC.png "Table of address, balance, signature")

The issue with signatures is that they are easily shareable. Having a signature does not allow any control over the account,[^6] so friends can share signatures without compromising their sole ownership of an account. A friend of the custodian, who owns an account with lots of assets, could give a signature to the custodian, which they could then post along with their friend’s address on the PoA list, essentially claiming ownership of an account they do not own. This is a common problem, and the possibility of ownership being faked can be lessened by having the custodian generate a signature on demand for some random message that they did not choose themselves (signatures have to 'sign' some message/data, and it is this message that we make random).
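To make this challenge-response idea concrete, here is a minimal Python sketch (assuming the third-party `ecdsa` package; any secp256k1 signing library would do, and a real custodian would sign with its key-management software rather than a key held in a script):

```python
import os
from ecdsa import SigningKey, SECP256k1, BadSignatureError

# The custodian's key pair (in practice the private key never leaves the HSM).
private_key = SigningKey.generate(curve=SECP256k1)
public_key = private_key.get_verifying_key()

# The verifier chooses the message (e.g. a recent block header), so a signature
# borrowed from a friend ahead of time will not verify against it.
challenge = os.urandom(32)  # stand-in for a block header picked by the verifier

signature = private_key.sign(challenge)      # produced by the custodian

try:
    public_key.verify(signature, challenge)  # checked by anyone
    print("valid: whoever signed this controls the private key")
except BadSignatureError:
    print("invalid signature")
```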
Suppose the custodian is asked to generate a signature for every block using the block header as the message: in this case, the chance is higher that they own the account than if they were asked to generate only one signature using a message of their choosing.[^7] It is not foolproof, however, because their friend may be willing to generate all these signatures for them and could do so via automated software over a fast network so that latency is not an issue. But it makes the friend's job more difficult, which is the point. This form of cheating is referred to as the **friend attack**.

This is where we see that private keys would be more trustworthy. But, of course, the custodian cannot post their private keys on a public forum, so this does not help us here.

Another point to consider is that if the custodian has been sending assets to users from a particular address for a few years and is still sending from this address, this indicates that the custodian owns this address. A **good track record** is another form of evidence that custodians can use to show that they control assets. Note that this is still a list of digital signatures (since signatures are needed to transfer assets), but it is more trustworthy than a list of signatures on random data because friends would be less willing to share signatures that spend their assets.

At this point, we have three options for PoA:

1. A plain list of addresses
2. #1 with digital signatures
3. #1, but the addresses have been in use for years by the exchange (and still are)

In terms of trustworthiness, the above can be ranked like so: $\#3 > \#2 \gg \#1$

The problem here is that a list of addresses is _not private_. If the custodian publishes this list, everyone will know which addresses it owns and the sum of assets it controls.

### Updated requirements for PoA

1. Prove that the custodian controls $X$ digital assets
2. **Keep the custodian's set of addresses private**
3. **Keep the size of the custodian’s set of addresses private**
4. **Keep the custodian’s asset sum private**
5. Produce PoAs regularly to inspire the maximum amount of trustworthiness

# Solution 2: Provisions

The [Provisions paper](https://eprint.iacr.org/2015/1008) was released in 2015 after the [Mt Gox meltdown](https://cointelegraph.com/news/the-mess-that-was-mt-gox-four-years-on). It provided a novel way to do PoR with certain privacy guarantees. For the PoA protocol, requirements #2, #3 & #4 are satisfied. #2 & #3 are achieved by hiding the addresses inside a large set of addresses called the **anonymity set**. #4 is achieved by encrypting the balance sum using a **Pedersen commitment**.[^8] The encrypted balance sum can then show solvency when combined with the PoL protocol, without revealing the asset or liability sums.

![venn_diagram_anon_set_cropped](https://hackmd.io/_uploads/HktwSR_HC.png "Venn diagram showing relation between anonymity set and custodian's accounts")

The paper presents a [relatively] simple **zero-knowledge proof of knowledge** protocol (more on what this is below) that the custodian uses to prove that they own a subset of accounts in the anonymity set and that the Pedersen commitment correctly represents the balance sum of those accounts, all without revealing what any of the addresses or balances are. To verify the PoA, one has to:

1. check that the addresses & balances in the anonymity set match the blockchain,
2. check the zero-knowledge proof

![provisions_flow_diagram](https://hackmd.io/_uploads/B1OaHROrR.png "Provisions PoA protocol flow diagram")

Provisions, unfortunately, has a few limitations. Firstly, the custodian must present their private keys to the PoA protocol,[^9] which is the biggest limitation. Another limitation is that the protocol only works for accounts that have produced at least one transaction,[^10] which excludes any address that has received funds but not spent any. Finally, the proof generation time, verification time & size grow linearly with the size of the anonymity set.[^11] The proof generation is highly parallelizable, so you can essentially get any speedup you want by increasing your computational power: using a machine with 128 GB of RAM & two E5-2680 v2 Xeon processors, it would take 5.5 hours to produce the proof for a set size of 10M. The proof verification is equally parallelizable, and for the same set size & machine, it takes ~1 hour. The proof size is ~9GB for the same set size (and this cannot be improved no matter how much computational power you throw at it). Although these numbers are not ridiculous, they are unsuitable for mobile devices or browsers.

### What is a Zero-Knowledge Proof of Knowledge (ZKPoK)?

Suppose there is a game show host, and they have a large sum of money to give away as a prize. They have constructed a game that anyone can enter to win the prize. The game goes like this:

1. There is a room with 1 billion boxes.
2. There are 1B pieces of paper, each numbered from 1 to 1B.
3. Each piece of paper is randomly put inside one box.
4. Find the box containing the paper with the number 1; the first person to find this box wins the prize.
5. You may only use your own body to search through the boxes, i.e. no gadgets or machinery (you are scanned by security at the entrance to the room).

Parisa & Vusi are two friends who have entered to play this game. Let’s use this game to explain the following two concepts:

* Proof of knowledge
* Zero-knowledge proof of knowledge

What is **a proof of knowledge**? It is _evidence_ that someone knows some information, e.g.

<table>
  <tr>
    <td>Statement from Parisa</td>
    <td>I know which box has the number 1 inside it</td>
  </tr>
  <tr>
    <td>Proof</td>
    <td>Parisa enters the room and comes out with a box with the number 1 in it.</td>
  </tr>
</table>

Parisa is the **prover**, and Vusi is the **verifier**. Parisa must convince Vusi that she knows how to win the game; bringing out the correct box certainly does the job. But what if Parisa wanted to prove the statement without revealing which box contains the number 1? This would be useful if she didn’t trust Vusi with the winning information. After all, if Vusi had known the information, he could have simply taken the winning box to the host to claim the prize for himself. This is where we use a **zero-knowledge proof of knowledge**:

<table>
  <tr>
    <td>Statement from Parisa</td>
    <td>I know which box has the number 1 inside it</td>
  </tr>
  <tr>
    <td>Proof</td>
    <td>Parisa enters the room and comes out with the piece of paper with the number 1.</td>
  </tr>
</table>

If Parisa could find the piece of paper with the number 1 on it, then she must know which box it was in, so the statement is proved. But she has not revealed _which_ box it was in to Vusi, so Parisa can return to the room and place the paper back in the right box without worrying about Vusi trying to claim her prize.
Essentially, there are two intertwined parts: the statement and the evidence of the statement's truth (called the **witness**). The prover can prove the statement true by providing the witness to the verifier. But if the prover wants to keep the witness secret, they need to provide a logical argument convincing the verifier that they know the witness.

### Trusted setup

Now that we are in the realm of zero-knowledge proofs, it is important to discuss the idea of a trusted setup. Many crypto-systems dealing with ZK proofs require, in their setup phase, a trusted entity to create a secret, compute on that secret, share the computed elements, and then discard the secret (sometimes called **toxic waste**). If the toxic waste is not discarded, then the proof system can be used to construct fake proofs that will verify correctly. This nuance is dealt with in a few different ways, but it is avoided altogether in Provisions: no trusted setup is required for the Provisions PoA.

### Colluding custodians - a special case of the friend attack

By sharing their account information, two or more custodians can help each other in their PoAs. To do this, a custodian only needs to share the result of a linear function applied to their private key (e.g. $\text{private key} \times A+B$). Cheating this way is easy for custodians because they do not risk their assets by helping out a friend. This type of attack is completely undetectable in the Provisions PoA since the accounts are all hidden. The paper does, however, present an adjustment to the protocol that prevents custodians from cheating in this way: it uses **[nullifiers](https://nmohnblatt.github.io/zk-jargon-decoder/definitions/nullifier.html)** and requires that the custodians all produce their proofs for the same block.

There is no protection from the general friend attack: some non-PoA-producing entity provides their account to a custodian doing PoA, allowing them to claim the balance as their own. This problem is particularly difficult to solve, and no known cryptographic method exists for solving it.

### What size should the anonymity set be?

We want to cater to as many custodians as possible while being as privacy-focused as possible. If the set is not selected in a general enough way, it may be possible to recover information about a custodian’s accounts using statistical analysis. The most secure option would be to use the set of every active account on the whole blockchain, but this adds unnecessary computational overhead since many accounts are not in use anymore and only have tiny amounts of assets (dust accounts). Why don’t we just remove these dust accounts from the set? We can do this by using the set of accounts whose balances are above a certain threshold.

If we use Ethereum (arguably the chain with the highest number of accounts) as an example and set this threshold to $0.4$ ETH, we get a set size of ~$10\text{M}$. So, we can optimise our protocol to work for an anonymity set of $10\text{M}$. The total number of used addresses on Ethereum is ~$270\text{M}$, so if we can get the PoA protocol to work efficiently for this magnitude, we can essentially support the most secure anonymity set. Since the 0.4 ETH value is rather arbitrary, a realistic set size will likely lie within the range of these two magnitudes ($10\text{M}$ & $270\text{M}$). When choosing the threshold, we must ensure that all the custodian’s accounts are in the set.
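As a rough sketch of this selection rule (with hypothetical addresses and balances; real data would come from a chain snapshot at a specific block):

```python
# Build the anonymity set from a balance snapshot: keep every address whose
# balance meets the threshold, then check that all of the custodian's own
# addresses landed inside the set.
THRESHOLD_WEI = 4 * 10**17  # 0.4 ETH expressed in wei

balance_snapshot = {          # address -> balance in wei (hypothetical values)
    "0xA1": 5 * 10**18,
    "0xB2": 3 * 10**17,       # below the threshold, so excluded (dust)
    "0xC3": 12 * 10**18,
}

anonymity_set = {addr for addr, bal in balance_snapshot.items() if bal >= THRESHOLD_WEI}

custodian_addresses = {"0xA1", "0xC3"}
assert custodian_addresses <= anonymity_set, "threshold too high: an owned account fell outside the set"
```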
Using the same set across all custodians is more secure than using different sets (due to possible analysis techniques being applied to set differences & timing of PoAs), so we need to choose the threshold to accommodate as many custodians’ accounts as possible. We will leave the set size at $10\text{M}$ for now, but this is just a thought to keep in mind.

### [technical] How the Pedersen commitment is used to show solvency

What’s the point of the Pedersen commitment? If the asset balance sum is encrypted, the verifier can’t read it directly, so what use is it?

The first useful thing the custodian can do is a **range proof**. Range proofs allow one to show that a committed value is above or below a certain threshold. There are multiple ways of doing range proofs for Pedersen commitments: [Bulletproofs](https://eprint.iacr.org/2017/1066.pdf) is an example of a modern one (post-Provisions).

The second useful thing the custodian can do is combine the PoA commitment with the PoL commitment to show solvency. For this, the PoL protocol also needs to produce a Pedersen commitment (Provisions proposes a PoL protocol that does exactly this). Let’s call the commitment from PoA $P_A$ (encrypted asset sum) and the one from PoL $P_L$ (encrypted liability sum). This encryption scheme is **homomorphic**, meaning that if you add two values in encrypted space and decrypt the result, you will get the same value as you would get if you added them before encrypting. So, the solvency equation in the introduction translates cleanly to $P_S=P_A-P_L$. The final step is to show that $P_S$ commits to a value that is $>0$ (we cannot tell just by looking at it because the value is encrypted). This can be done using a range proof (e.g. Bulletproofs), but it is done using a [Schnorr proof](https://crypto.stanford.edu/cs355/19sp/lec5.pdf) in Provisions.

### Updated requirements for PoA

Assuming a custodian has a server large enough to produce the proofs at the desired speed, we can satisfy #5. We will now add three more requirements that are not covered by Provisions: support for digital signatures as input, support for fresh accounts (not just ones that have made transactions), and allowing verification to happen with ease on consumer devices.[^12] This gives us the following full set of requirements:

1. Prove that the custodian controls $X$ digital assets
2. Keep the custodian's set of addresses private
3. Keep the size of the custodian’s set of addresses private
4. Keep the custodian’s asset sum private
5. Produce PoAs regularly to inspire the maximum amount of trustworthiness
6. **The custodian need only provide digital signatures**
7. **Allow fresh accounts to be used, i.e. ones that have not produced any transactions**[^13]
8. **Verification should be possible on consumer devices**

# Solution 3: ZK-SNARK

_The full acronym here is ‘ZK-SNARK’, but we will use ‘snark’ for short_

We will use the same concept as Provisions to satisfy the three new requirements, but with a snark instead of the custom ZK proof in the paper.

### What is a snark?

_ZK-SNARK: <span style="text-decoration:underline;">Z</span>ero-<span style="text-decoration:underline;">K</span>nowledge <span style="text-decoration:underline;">S</span>uccinct <span style="text-decoration:underline;">N</span>on-interactive <span style="text-decoration:underline;">Ar</span>gument of <span style="text-decoration:underline;">K</span>nowledge_

With the gameshow explanation above, there was a specific method for producing a ZKPoK for the specific statement.
Snarks offer a general method for producing a ZKPoK for _any_ statement.[^14] Let our statement be about an arbitrary computation; the _input_ to that computation (if it satisfies the computation) will be the witness.

As a quick example, let’s say that Parisa claims she knows a prime number greater than $2^{82589933} - 1$ ([the largest publicly known prime number](https://en.wikipedia.org/wiki/Largest_known_prime_number)). If Parisa didn’t mind telling Vusi what the actual number is, then they could both sit down together and run the computation to check if it is prime (there is a particular algorithm for doing this):

![prime_number_check](https://hackmd.io/_uploads/rybYPROBA.png "Simple prime number check")

But if Parisa wanted to keep the number secret, she could run the computation _inside_ a snark, with the number as input, and send the resulting snark proof to Vusi. If Vusi finds that the proof verifies successfully, then he would know that a prime number was given as input to the snark (the proof would not verify successfully if the prime number check output were false). Note that _any_ prime number would pass the check, so there needs to be an additional check to make sure the number is greater than the largest known prime:

![prime_number_check_snark](https://hackmd.io/_uploads/S1H2vA_BC.png "Prime number check using a snark")

More generally, a snark is mathematical machinery that allows two parties (prover & verifier) to agree on the successful run of a computation, with only one party knowing the inputs:

1. The prover & verifier can both see the code for the computation.
2. The prover has the inputs and wants to keep these hidden from the verifier. Only the prover can perform the computation (one needs the inputs to do the computation).
3. The prover performs the computation (using code + inputs) inside a snark.
4. The prover sends the snark proof to the verifier.
5. The verifier verifies the proof and is convinced that the code was executed correctly on the inputs but learns nothing about what those inputs were.

![snark_flow](https://hackmd.io/_uploads/S1kb_RdSA.png "General snark")

Snarks offer a slightly more sophisticated setup: they cater to public _and_ private inputs. The public inputs are, like the computation, known to both parties, and the private inputs are only known to the prover. Note that the public inputs are needed to verify the snark proof.

![snark_flow_extended](https://hackmd.io/_uploads/Hy4VuAuBC.png "General snark with public inputs")

Snarks also have the nice property that proof size & verification time are small (**succinctness**), no matter how large the inputs get or how complex the code becomes.[^15]

### How do we do PoA with a SNARK?

Snarks can perform PoA by following the same method as Provisions: hide the owned addresses in an anonymity set and encrypt the asset sum using a Pedersen commitment. Since we pretty much have free rein over the computations we can perform, let’s make the snark verify digital signatures (as opposed to doing operations with the private keys).

We will have the following **inputs**:

1. [public] anonymity set: a set of addresses & their balances
2. [private] signatures: a set of signatures for the addresses owned by the custodian
3. [public] Pedersen commitment: sum of asset balances for owned accounts, encrypted as a Pedersen commitment

We will perform this **computation** for each of the signatures:

1. Verify the signature
2. Check that the signature corresponds to an address in the anonymity set
3. Accumulate the balance of the address associated with the signature into the Pedersen commitment
4. Check that the calculated Pedersen commitment equals the input commitment

The job of the **verifier**:

1. Verify the snark proof
2. Check that every address & balance in the anonymity set matches the blockchain.

Here is a diagram explaining the full PoA protocol:

![poa_snark_simple](https://hackmd.io/_uploads/HyEvdAdHC.png "PoA snark flow diagram")

### Satisfied requirements for PoA

The snark PoA has satisfied all the requirements:

* #1-#4 are solved in the same way as Provisions.
* #5 is solved as long as the proofs can be generated in a reasonable time (more detail on this in the technical section)
* #6 is solved since signatures are the input to the snark
* #7 is solved since we can do hashing in the snark, which allows us to work with addresses as opposed to public keys
* #8 is solved by the succinctness property of snarks

### Unsolved problems

**Friend attack:** Because we are using signatures, the friend attack is still possible. As mentioned before, we can decrease the likelihood of a friend attack by requesting that the signatures sign block headers and that the proofs be done at the highest possible frequency. To get a better defence, we need private keys instead of signatures as inputs to the snark.

**Colluding custodians:** this special case of the friend attack cannot be defended against with the same strategy as Provisions, at least not for ECDSA signatures. ECDSA (Elliptic Curve Digital Signature Algorithm) signatures are a common signature scheme for blockchains (e.g. Bitcoin, Ethereum), so the PoA snark must work for this algorithm. In Provisions, the private key generates a nullifier that prevents the same account from being used by two custodians. This same trick cannot be done for ECDSA because the signatures are not deterministic and so cannot be used to calculate a nullifier.[^16]

**Trusted setup:** Depending on the particular snark system used, a trusted setup may be needed. This is not a big issue, but it is better not to do one if you don’t have to.

# [technical] Technical details

### Proving system

We chose the Circom language and ecosystem in order to lean on the various libraries that have already been written. The Circom ecosystem supports only the Groth16 and Plonk proving systems. We chose the former because we need to verify ZK proofs inside the snark, and a library for doing this with Groth16 exists, while none exists for Plonk. It's not that these choices offer the best performance, but they do offer the fastest development time, which is crucial for a first iteration.

### Anonymity set membership check using a Merkle tree

Having the anonymity set as a public input has some issues, one being that its size is directly reflected in the size of the proof: a large set would produce a large proof, making requirement #8 unsatisfiable. We get around this issue by constructing [outside the snark] a Merkle tree with the elements of the set as leaf nodes and then feeding only the root hash as a public input to the snark. The custodian then produces Merkle proofs for each of the accounts they own and gives those as private inputs to the snark, which the snark verifies against the root hash.
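As an illustration, here is a minimal Python sketch of that idea outside any snark (SHA-256 and the leaf encoding are assumptions made purely for this example; the real circuit would use a snark-friendly hash and verify the Merkle proof in-circuit):

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(leaves):
    """Return every level of the Merkle tree, from hashed leaves up to the root."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]          # duplicate last node on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def merkle_proof(levels, index):
    """Sibling hashes needed to recompute the root for the leaf at `index`."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (hash, sibling-is-left?)
        index //= 2
    return proof

def verify_membership(root, leaf, proof):
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

# Hypothetical anonymity-set leaves: "address|balance" strings.
leaves = [f"addr{i}|{i * 10}".encode() for i in range(8)]
levels = build_tree(leaves)
root = levels[-1][0]                # the only public input the snark needs
proof = merkle_proof(levels, 5)     # private input: proof for an owned account
assert verify_membership(root, leaves[5], proof)   # the check done per signature
```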
### Problem with ECDSA

ECDSA verification is an expensive operation inside a snark because the characteristic of the base field of secp256k1 is larger than [any of the primes](https://github.com/iden3/circom/blob/c822cddce0010356a4a2bd696cbf36a44a1c333e/program_structure/src/utils/constants.rs#L6) for the six curves offered by Circom (there are no commonly used curves with a suitable prime). This means that doing secp256k1 field operations in the snark requires two snark field elements and extra math. ECDSA verification requires elliptic curve operations, which require multiple field operations, so there is a lot of extra math to do. The ECDSA verification circuit from 0xPARC ([circom-ecdsa](https://github.com/0xPARC/circom-ecdsa)) takes ~$1.5\text{M}$ R1CS constraints per invocation (see [their](https://0xparc.org/blog/zk-ecdsa-1) [blogs](https://0xparc.org/blog/zk-ecdsa-2) for more info).

0xPARC has a different circuit ([batch-ecdsa](https://github.com/puma314/batch-ecdsa)) that does batch verification of ECDSA signatures (see [their blog](https://0xparc.org/blog/batch-ecdsa) for more info). Their algorithm does not offer better than $O(n)$ asymptotic performance. Still, it does offer a ~$3\text{x}$ improvement on the naive solution in both the number of constraints and proving time (see [here](https://www.desmos.com/calculator/thzhbu5o69) for exact constraint & timing data).

Even with the batch-verify algorithm, the [R1CS constraint limit](https://github.com/iden3/snarkjs#guide) of ~$268\text{M}$ means that a max of ~$600$ ECDSA signatures can be verified inside a snark, which puts an upper bound on the total number of signatures that can be supported. We solve this by using recursive snarks.

### Recursion

We split the computation into signature verification and anonymity set membership checks. Multiple instances of these snarks can be run in parallel, allowing us to get around the constraint limit. There must be a 3rd component to condense the parallel streams into a single, final proof. Here is the full snark system:

![poa_snark_recursive](https://hackmd.io/_uploads/rkBwtRuHR.jpg "Recursive snark design")

The diagram shows $k$, the number of parallel batches; $m$, the size of each batch; and $n=mk$, the total number of signatures. $k$ is limited by the number of constraints needed for Groth16 verification (~$20\text{M}$) since $k$ snark proofs need to be verified in the final snark. The maximum value $k$ can take is 13 (assuming the maximum number of constraints allowed is ~$268\text{M}$). In practice this number will be lower, since the final circuit must also construct the Pedersen commitment. If we assume a conservative value for $k$, then the system can support ~$6000$ signatures, which should be plenty for a first attempt (see below for discussion on improving this).

### Benchmarks

One can break down a typical Circom-Groth16 proving-verifying process like so:

1. Compilation
2. Proving key generation (requires trusted setup)
3. Verifying key generation
4. Witness generation
5. Proof generation
6. Proof verification

For circuits teetering on the edge of the max-allowed number of constraints, the most time-consuming and computationally heavy task is proving key generation. Generating a key can take many terabytes of RAM (yes, that’s not a typo) and days or even weeks. The good news is that it is a once-off computation: once the key is generated, any number of proofs can be generated, all with different witnesses. The next computationally heavy task is the compilation, which is also a once-off cost.
This can take on the order of hours for a large circuit.

Regarding the per-proof costs, we only need to look at the times for witness generation, proof generation, and proof verification. For large circuits, we have the following estimates:

* Witness generation: ~$1$ hour
* Proof generation: ~$10$ minutes
* Proof verification: $<1$ min

These results mean that requirement #5 is satisfied: proofs can be generated in hours, which means they can be done multiple times per day.

### Code and more technical detail

The [original proposal document](https://hackmd.io/@JI2FtqawSzO-olUw-r48DQ/rJXtAeyLT) provides a more in-depth technical explanation, and the codebase is available [here](https://github.com/silversixpence-crypto/zk-proof-of-assets).

# Improvements

### [technical] Better ECDSA verification algorithm

Personae Labs have created two libraries for faster ECDSA verification inside a snark. Their first library ([efficient-zk-ecdsa](https://github.com/personaelabs/efficient-zk-ecdsa)) uses a mathematical trick first seen on [crypto.stackexchange](https://crypto.stackexchange.com/questions/81089/is-this-a-safe-way-to-prove-the-knowledge-of-an-ecdsa-signature) to move some of the expensive calculations outside the snark. The resulting Circom circuit produces ~$160\text{k}$ constraints, which is a reduction of ~$10\text{x}$ from the original 0xPARC one. The circuit works with the Groth16 proving system.

Their second library ([spartan-ecdsa](https://github.com/personaelabs/spartan-ecdsa)) uses the Spartan proving system with more math tricks[^17] in addition to the tricks from the first library. Their circuits for ECDSA verification need only ~$8\text{k}$ constraints, a ~$100\text{x}$ improvement on the original 0xPARC one. Theoretically, this means that the maximum number of signature verifications that can be done inside a snark is ~$33\text{k}$. This massive improvement likely means that recursion is optional for most custodians.

### [technical] Different proving systems

Nova is a proving system that allows the folding of repetitive computations for better performance. Since the base algorithm for our PoA snark is just a large loop, it is a perfect fit. Some work has been done to use Nova for batch-verifying ECDSA signatures for secp256k1 ([nova-browser-ecdsa](https://github.com/dmpierre/nova-browser-ecdsa)), and the results are very promising. It's possible that a lot of work would have to be done to move over to Nova because it's unclear how many of the Circom libraries we rely on exist in that ecosystem or how difficult it would be to migrate them over.

[Lurk](https://filecoin.io/blog/posts/introducing-lurk-a-programming-language-for-recursive-zk-snarks/) is a language specifically designed for recursive snarks, so if we still need to do recursion, it may be useful to try it out.

### Instant Zero-Knowledge Proof of Reserves (IZKPR)

With PoA, we are reaching the limits of snark computational power, and it will thus take a long time to produce a proof. The most time-consuming task is generating the proving keys, which can take on the order of days for circuits with hundreds of millions of constraints. Luckily, the proving keys are a once-off cost and can be reused for each proof construction. The time needed to generate proofs (ignoring the proving key cost) could reach a max of ~10 min.
Although this is not too bad, it is far from the speed of block production, which is the limit for the speed of proofs (there is no point in producing proofs faster than blocks are created since a proof is linked to a block). Most blockchains produce blocks [in the range of 1-10s](https://chainspect.app/blog/block-time). The previously mentioned improvements will likely not reduce the proof construction time to this order. But there is another protocol that we can use: **[Instant Zero-Knowledge Proof of Reserves](https://eprint.iacr.org/2023/1156)**. This protocol can produce proofs much faster than any blockchain produces blocks. But IZKPR cannot be used by itself; it needs to start from a snark proof. So, the idea here is to produce the snark proof once and then hand over the responsibility to IZKPR.

# Conclusion

First, we took a look at the general idea of Proof of Reserves, and the part Proof of Assets has to play in it. PoA is essentially a self-auditing mechanism using the blockchain’s cryptographic powers. We then dived into some different ways of doing PoA, learning about some topics along the way, such as digital signatures, zero-knowledge proofs of knowledge, and ZK-SNARKs. As we progressed through each of the 3 PoA protocols, we increased the requirements list until we got to the final list with full privacy, ease of use & high performance.

The 1st protocol (public list of addresses) is the simplest, but also the least private. If privacy is not an issue for the custodian, then there is no reason why this method should not be used.

The 2nd protocol (Provisions) is more complicated, but preserves privacy. Provisions introduced the idea of the anonymity set, which is used in conjunction with the blockchain’s cryptographic primitives to prove ownership of a set of accounts without revealing them. The issues here had to do with performance and the incompatibility with general key-management software.

The last protocol (snark) is the most complicated, but, in addition to preserving privacy, has good performance and can be used by any custodian. The anonymity set idea is still used, but the input requirements that Provisions had are loosened, making it easier for custodians to use the protocol.

We also saw that there are still some unsolved problems with existing PoA protocols, which have to do with ways that custodians could circumvent the checks in order to cheat. And finally, we explored some ideas that could further improve the snark-based protocol: IZKPR, different proving systems, and more advanced signature verification techniques.

You can find a fully functional implementation of the snark-based protocol here: [https://github.com/silversixpence-crypto/zk-proof-of-assets](https://github.com/silversixpence-crypto/zk-proof-of-assets)

<!-- Footnotes themselves at the bottom. -->

## Footnotes

[^1]: One can read more in [Nic Carter's blog](https://niccarter.info/proof-of-reserves/)

[^2]: Note that this means an account can be owned by multiple parties. Sharing private keys is rare in practice, though, because anyone with the private key has full access to all the funds of the account.
[^3]: In the most extreme case (cold storage) the private key is stored in an air-gapped hardware security module (HSM), which will definitely not allow the private key to be extracted

[^4]: A digital signature is a kind of zero-knowledge proof of knowledge of the private key: ‘zero-knowledge’ because it does not reveal the private key, and ‘proof of knowledge’ because it proves that the generating party has knowledge of the private key.

[^5]: Anyone wanting to verify the PoA will have to, on top of checking the balance sum, verify all the signatures.

[^6]: Unless the data that is being signed is a valid transaction on the blockchain, in which case the signature can be given to a miner and the transaction will be executed

[^7]: Continuously repeated signatures on different random data like this also show with very high probability that the private key is **accessible** at least to someone, i.e., it has not been lost. In this regard, the more frequent the PoA, the better.

[^8]: See the original Pedersen paper [here](https://link.springer.com/chapter/10.1007/3-540-46766-1_9), or a shorter & more practical explanation [here](https://asecuritysite.com/zero/ped)

[^9]: Technically speaking, the PoA protocol only requires the result of applying a linear function to the private key, which does not reveal any information about the key. But HSMs are not likely to support an API allowing this.

[^10]: The reason for this is that the anonymity set must only contain addresses whose public keys are on-chain. The public keys are needed for both the ZK proof & the Pedersen commitment. In fact, the anonymity set is a set of public keys, not a set of addresses. An address is, for most chains, some form of hash of the public key (this is done for extra security). The public key is revealed when the account owner produces a transaction, since the public key can, for most chains, be determined from the digital signature for the transaction.

[^11]: See [here](https://www.desmos.com/calculator/ny48ubg1dv) for all the performance graphs

[^12]: This essentially implies single-digit verification time (in seconds) & proof size (in MB)

[^13]: They may, though, have _received_ assets by being involved in some other account’s transaction

[^14]: Not quite “any”. The statement must come from an NP-complete language.

[^15]: Snarks can thus be used for **computational compression**.

[^16]: There is a good breakdown of nullifiers & ECDSA [here](https://blog.aayushg.com/nullifier/)

[^17]: They use a different curve, the secq256k1 curve (a cousin of secp256k1), so that the circuit arithmetic happens in the right field (avoiding the expensive wrong-field math described above). One can verify secp signatures on this curve, so there is no loss of generality.