Anonymous voting on-chain using government ids and FHE

It is now possible to generate zk proofs of government ids to stay anonymous while being able to prove that you are a unique human. For example, [anon aadhaar](https://github.com/privacy-scaling-explorations/anon-aadhaar) project. One of the most powerful use cases of having anonymous identity on-chain is voting. Powered with the ability to generate zero knowledge proof of a valid govt id, it seems possible to finally enable humans to be able to vote on-chain anonymously. However, we quickly run into issues when devising a feasible scheme that guarantees anonymity and prevents double vote. For example, take a voting scenario in which a user can generate proof of their govt id to vote. To prevent any user from voting twice you will require them to make something public that is inherently attached to their id, we call this nullifer. One obvious choice is to use hash of the id as the nullifer. But this breaks anonymity, since anyone with access to set of all govertnment ids (ie government itself) can deanonymise voters. We can't ask user to add salt before hashing, otherwise they can vote more than once. Below we propose a scheme that uses fully homomorphic encryption to overcome the problem of nullifiers. ### BFV (FHE) Encryption scheme We will be using public key variant of BFV FHE scheme and for ease I would ask you to imagine a single server $C$ with ability to perform FHE computations. Also think of a BFV ciphertext an an element of field $Z_p$ for some prime $p$. Notice that due to fermat's little theorem following statement is correct: $neq(x,y) = (x-y)^{p-1}$ outputs 1 if $x \neq y$ otherwise 0. where $x, y \in Z_p$. We will be using $neq(x,y)$ to check whether two ciphertexts are encrypted or not later. Note that $m = dec(ct,sk)$ denotes decryption procedure that produces original message $m$. Also note that BFV ciphertexts are noisy. This means, $ct_1 = Enc(m, sk)$ and $ct_2 = Enc(m, sk)$ do not reveal that both encrypt the same value $m$. ### Voting Before voting begins, $C$ makes public its public key $pk$. Using $pk$ anyone can encrypt a message under BFV scheme, that is only decryptable by $C$. To vote user $u_i$ does the following: 1. Produces a hash of their govt id, $h_g = hash(gov \space id)$. Encrypts it under $pk$ to produce $ct_i$. 2. Produces a zk proof $\pi_i$ with public input $h_{cti} = hash(ct_i)$, that proves (a) validity of govt id (b) $ct_i$ is valid encryption of $h_g$ under $pk$. User $u_i$ posts $\pi_i$, $h_{cti}$, and their vote on-chain. User also sends $ct_i$ to $C$. $C$ receives $ct_i$ from user, verifies that hash of $ct_i$ , $h_{cti}$, exists on-chain and stores $ct_i$. Once voting period ends, $C$ calls `is_valid` for each user $u_i$ and appends the return value to `valid` array. `is_valid` returns 0 if users' vote is invalid, otherwise 1 meaning vote is valid. A vote is valid if $u_i$ has voted only once, otherwise invalid. ``` function is_valid(u_i, users, sk): let is_valid = 1; For u_j in users and u_i != u_j: let is_neq = dec(neq(u_i, u_j), sk); is_valid &= is_neq; return is_valid ``` After calling `is_valid` for each user, `valid` array contains bits where bit at index $i$ indicates validity of $u_i$'s vote. $C$ signs `valid` array and packs it efficiently and sends it on-chain to voting smart contract, $v_c$. $v_c$ receives `valid` array, verifies $C$'s signature and stores $bits$. To tally, $v_c$ then proceeds to extract bit corresponding to each user and only counts user's vote if bit is 1, otherwise not. ### Replacing $C$ with a network of nodes using threshold FHE. With threshold FHE you can replace $C$ with a set $C_{th} = \{C_1, C_2, ..., C_n\}$ with $n$ computing nodes. All compute operations are performed by each node. The secret key, $S_{ideal}$, is now split among all nodes in $C_{th}$. Moreover any BFV ciphertext encrypted under $S_{ideal}$ can be decrypted only if $t > |C|/3$ nodesare willing to decrypt (will have to confirm this). This means the threshold network is secure as long as $t > |C|/3$ nodes do not turn adversarial. Before voting procedure begins all nodes in $C_{th}$ perform instialisation phase to produce $pk$. To vote, user $u_i$ performs the same step as before, but instead of sending $ct_i$ to one node it sends to all nodes. After voting period ends, all nodes in $C_{th}$ perform computation to produce `valid` array, with the caveat that all `dec` calls are changed to threshold decryption procedure and `is_valid` function is modified such that `dec` calls are delayed till end. Once majority of nodes have consensus that computation is done, they proceed to gain consensus on the output `valid` array. Once consensus is reached, one node $C_l$ is selected at random. Each node in consensus signs the output and sends it to $C_l$. $C_l$ posts all signatures and the output on-chain to voter contract $v_c$. (Note that we can use signature aggregation scheme to reduce the footprint of signature posted on-chain to single signature). $v_c$ receives the output and verifies all signature, confirming that certain pre-defined threshold of nodes agree upon the output. $v_c$ then proceeds to tally votes using `valid` array as before. ### Encrypted Votes Note: Correct me if I am wrong, but I don't see the point of encrypting votes when the corresponding identity remains anonymous. It is trivial to replace votes in plaintext with encrypted votes. We just need to modify procedure to following: 1. Instead of user $u_i$ sending their vote to $v_c$ in plaintext, they encrypt their vote under $pk$ to produce $ct_{vi}$ and send $ct_{vi}$ to $C$. 2. After gaining consensus on ouput `valid`, each node in $C_{th}$ produces $ct_{total}$ by homomorphically adding only the vote ciphertexts that are valid. Set $C_{th}$ first decrypts $ct_{total}$ and gains consensus on output `total`. As before $C_l$ is selected at random that collects signature from all nodes in consensus and sends `total` output to $v_c$ on-chain. $v_c$ receives the aggregated votes `total` and verifies that necessary signatures from set $C_{th}$ were provided. ### Transciphering Under certain situations it might be necessary to have a record of BFV ciphertexts sent to $C_{th}$ on-chain. However, BFV ciphertexts (and FHE ciphertexts in general) are big in size, thus making storing them on-chain infeasible. Using transciphering users can encrypt their values using a symmetric encryption scheme and post the ciphertexts (that are smaller in size) on-chain. Nodes in $C_{th}$ can retrieve necessary ciphertexts from chain and switch them from symmetric key ciphertexts to BFV ciphertexts. Rest of the procedure stays the same. I am delaying outline of transiphering until later since I don't think posting ciphertexts on chain is a priority. We can come to this later if it is. I also don't think that transciphering is infeasible. Assuming that newer FHE friendly symmetric encryption schemes are secure, transchipering is feasible. ### Prototype 1. Server side: Since threshold BFV is not that different than BFV and does not affect the accuracy of ciphertext operations, we can for the first prototype assume a centralised server. The FHE circuit is relatively straightforward to develop and I will update on this in few days. 2. ZK proofs: Generating proof of govt id has been made easier due to projects like anon-aadhar and hopefully others. I can help with circuit to generate proof of correct encryption. Circuit for hash is trivial (I guess?). ### Problems 1. As evident having a secure and censorship resistant threshold network is important. In my opinion, achieving it for a network of nodes that isn't a layer 1 itself remains an open problem.