Security Analysis of Summa's Proof of Solvency

# Security Analysis of Summa Proof of Solvency author: **qpzm** ## 1. Threat model In proof of liabilities, a prover is CEX. It gives each client the proof that shows the total balance of all clients and the client deposit is included in the sum. 1. Let the tree height H. The height of root node is 1. 2. The count of leaf nodes is 2^(H-1), and the count of internal nodes is 2^(H-1) - 1. For example, a tree of a height 3 has 4 leaf nodes and 3 internal nodes. If the number of users is less than 2^(H-1), we can fill with dummy nodes whose balances are trivial. 3. Each client receives a proof from the prover separately. For the consideration of privacy, no client will share his account information or received proof with others. 4. Suppose the number of currency is 1. Especially in summa, a prover gives public inputs and a zk proof satisfying the constraints that I will explain later. ## 2. Implementation ### Notation - H: hash ![[2] 2018-1139 Fig 1.](https://hackmd.io/_uploads/rkbyQl4T6.png) ### Inputs The inputs of the circuit to create a proof are as below. Suppose the number of currency is 1. 1. `entry` 1. a balance of the user 2. `path_indices[]` 1. 0 means the sibling is on the right. 1 means the sibling is on the left. 3. `sibling_leaf_node_hash_preimage` 1. H(username, balance) 4. `sibling_middle_node_hash_preimage[]` 1. H(LeftChild.balance + RightChild.balance, LeftChild.hash, RightChild.hash) 5. root 1. The total sum of user balances. Reference: https://github.com/summa-dev/summa-solvency/blob/master/zk_prover/src/circuits/merkle_sum_tree.rs#L31-L41 In the circuit, every calculation results including leaf node hash, middle node hashes in the merkle tree path is also input. However, the public inputs of the circuit are as below. 1. the leaf hash to be verified inclusion of. It is calculated by $H(username, balance_0, balance_1, ... balance_{N-1})$. 2. the root hash of the merkle sum tree. 3. the root balances of the merkle sum tree. Reference: https://github.com/summa-dev/summa-solvency/blob/master/zk_prover/src/circuits/merkle_sum_tree.rs#L43-L59 A user can verify the proof and check the conditions. 1. `leaf_hash == H(username, balance)` 2. the root hash calculated in the circuit is equal to the hash of the merkle tree stored in the blockchain. 3. the root balances calculated in the circuit is equal to the announced total balances of the prover CEX. ### Constraints in MSTInclusionCircuit Notation - L means a left child, R means a right child. - v1: balance of the current node - v2: balance of the sibling node Suppose the current leaf node to prove is Alice’s and its sibling leaf is Bob’s. The constraints for a prover are as follows. 1. H(Alice's nickname, v1) == current_hash **case1. If the current node is a leaf node, namely level 0.** 2. H(Bob's nickname, v2) == sibling hash 3. range_check(v1) 4. range_check(v2) 5. swap(current hash, sibling hash, index) == L, R 1. index is from `path_indices[]`. 0 means the sibling is on the right. 1 means the sibling is on the left. 6. sum(current balance, sibling balance) == parent balance 7. H(parent balance, L, R) == (computed hash from the witness `sibling_leaf_node_hash_preimage`) **case2. If the current node is middle node, namely level 1 ~ level $\lfloor logN \rfloor$.** 2. H(v2, L, R) == sibling hash 3. range_check(v2) 4. swap(current_hash, sibling_hash, index) == L, R 5. sum(current_balance, sibling_balance) == next balance 6. H(next balance, L, R) == (computed hash from the witness `sibling_middle_node_hash_preimage[current_level]`) Reference: https://github.com/summa-dev/summa-solvency/blob/master/zk_prover/src/circuits/merkle_sum_tree.rs#L228 ### Constraints in MerkleSumTreeChip To check the constraints explained above, I list up constraints implemented in MerkleSumTreeChip**.** Reference: https://github.com/summa-dev/summa-solvency/blob/master/zk_prover/src/chips/merkle_sum_tree.rs#L38 **Inputs** 1. advice: 3 private input columns a. advice[0] is a left child b. advice[1] is a right child c. advice[2] records two things depending on the selector. i. If `bool_and_swap_selector` is non-zero, this column is a binary bit to show whether to swap the left and the right child at the next row. ii. If `sum_selector` is non-zero, this column stores the sum of the left and right child balances. 2. bool_and_swap_selector a. public input to enable bool and swap constraints 3. sum_selector a. public input to enable the sum constraint **bool constraint** `s * swap_bit * (1 - swap_bit) = 0` s is a bool_and_swap_selector. It enforces `swap_bit` is either 0 or 1. **swap constraint** 1. `s * (element_r_cur - element_l_cur) * swap_bit + element_l_cur - element_l_next = 0` 2. `s * (element_l_cur - element_r_cur) * swap_bit + element_r_cur - element_r_next = 0` s is a bool_and_swap_selector. If s is 1, swap constraint is in effect. If `swap_bit` is 1, `element_r_cur - element_l_next = 0` and `element_l_cur - element_r_next = 0`. Therefore, the values will be swapped on the next row. If `swap_bit` is 0, `element_l_cur - element_l_next = 0` and `element_r_cur - element_r_next = 0`. The values will not be swapped on the next row. **sum constraint** `sum_selector * (left_balance + right_balance - computed_sum)` if `sum_selector` is toggled, it is equal to `left_balance + right_balance = computed_sum`. Thus, it constraints the computed sum to be equal to the sum of the left and right balances. ## 3. Vulnerability Analysis ### 3-1. Proof of reduced liabilities A summation tree is designed to calculate like [1] Fig1. However, a malicious prover can create a tree as [1] Fig2 with the same leaf nodes. This reduces the liabilities in the root exposed to the public. The merkle roots in [1] Fig2 and Fig3 are the same. The proof generated for alice would still be valid due to bob's manipulated balance. Alice cannot verify whether bob's balance(v2) is 5 or 10, so she just accepts the argument that v_2 is 5 and its hash is h_2. The main cause of the vulnerability is that Alice cannot calculate the sibling's hash because she does not know the sibling's nickname to create `h2 = H(v_2 || bob's nickname)`. One thing to note is that the prover can manipulate the root balance, although every user check one's proof. ![[1] 2022-043 Fig 1,2,3](https://hackmd.io/_uploads/ByRJhBrTa.png) ### Summa In ZK Merkle Sum Tree, the prover knows all users’ balances, and gives every user his/her zk Proof of Inclusion in the Merkle Sum Tree. The main question for soundness is "Is it possible for a set of manipulated inputs to pass the verification?". A naive merkle sum tree is vulnerable because the verification step does not include the hash calculation `h2 = H(v_2 || bob's nickname)`. Therefore, a verifier cannot check the hash is really calculated from the balance and nickname of the sibling. On the other hand, ZK Merkle Sum Tree includes the hash calculation in the circuit constraints. This provides a guarantee that the hash is calculated from the balance and nickname of the sibling. https://github.com/privacy-scaling-explorations/poseidon-gadget/blob/764a682ee448bfbde0cc92a04d241fe738ba2d14/src/poseidon/pow5.rs#L58 ### Risk One thing to note is that if a user does not verify his/her proof, the rightness of the proof is unknown. Users cannot verify other users’ balance because they do not know another user’s username and balance. Therefore, a prover can manipulate the balances of dormant users if some users keep not verifying their proofs. ## 4. References [1] https://eprint.iacr.org/2022/043 [2] https://eprint.iacr.org/2018/1139 [3] https://github.com/summa-dev/summa-solvency/issues/166 [4] https://summa.gitbook.io/summa-book/cryptographic-primitives/merkle-sum-tree [5] https://hackmd.io/j85xBCYZRjWVI0eeXWudwA