# Nova-based folding of verifiable AI (ZKML) circuits
This article explores running the inference of a neural network model inside a zero-knowledge circuit. The circuits are written in circom and do not cover the whole inference on their own: all the repeated computations are folded into a single instance, and proofs for these foldings are calculated recursively using the Pasta curves.
## OK I know ZK, but what the hell is ZKML?
ZK-SNARKs are used in many places across the blockchain space for their succinctness and security. In machine learning, deep learning, and AI they can prove to be a powerful tool. With the rise of AI tools like ChatGPT and MidJourney, it becomes necessary to check the integrity of these systems and the content they produce. Using ZK-SNARKs we can do that, but how?
The models behind AI tools use a huge number of parameters; they can range to billions of weights inside a deep neural network, and GPT-4 is rumored to use around 100 trillion weights to calculate its output. One problem with API-based model inference is that you don't know which model is being used to calculate the output. It may happen that you have paid to use the premium version of a model, but the output is calculated by something whose weights do not produce quality outputs. This may not matter for some petty tasks, but it certainly matters a lot in important cases like medical inference or seeking financial decisions from an AI.
## What is Nova?

Nova performs folding over R1CS constraint systems. R1CS is an arithmetization technique; Nova deploys a scheme that instantiates a relaxed R1CS at the beginning of the computation and folds it N times. Relaxed R1CS means that the original R1CS is fed a linear combination of two instances of the solution vector. Let's assume the first solution vector is $s_1$ and the second is $s_2$:
$$ s_1 = [x_1, W_1] \qquad s_2 = [x_2, W_2] $$
Here $x_i$ denotes the public inputs and $W_i$ the private witness.
In R1CS we have
$$ (A \cdot s) \circ (B \cdot s) = C \cdot s $$
where $\circ$ is the entry-wise (Hadamard) product.
Now with plain R1CS we would plug in $s_1$, complete the evaluation of the polynomials, and hand the proof to the verifier, who verifies it through bilinear pairings (at least this is what happens in **Groth16**).

In Nova we take a random linear combination (RLC) of two instances with an RLC factor $r$. A relaxed R1CS instance additionally carries a scalar $u$ and an error vector $E$, and must satisfy
$$ (A \cdot s) \circ (B \cdot s) = u \cdot (C \cdot s) + E $$
The folded solution vector is
$$ S = s_1 + r \cdot s_2 $$
Plugging it into the R1CS equation, we have
$$ (A \cdot (s_1 + r \cdot s_2)) \circ (B \cdot (s_1 + r \cdot s_2)) $$
Expanding and regrouping, the folded triple again satisfies the relaxed equation
$$ (A \cdot S) \circ (B \cdot S) = u \cdot (C \cdot S) + E $$
with
$$ u = u_1 + r \cdot u_2 $$
$$ E = E_1 + r \cdot T + r^2 \cdot E_2 $$
$$ T = A s_1 \circ B s_2 + A s_2 \circ B s_1 - u_1 \cdot C s_2 - u_2 \cdot C s_1 $$
The cross term $T$ and the error vector $E$ are the extra terms we get in relaxed R1CS, and the prover commits to them to be used later for verification.
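To make the algebra concrete, here is a tiny worked example of my own (over the integers for readability; a real deployment works over a prime field and commits to the vectors). Take a single constraint with $A = (2)$, $B = (3)$, $C = (6)$ and two relaxed instances:
$$ (s_1, u_1, E_1) = (1, 1, 0): \quad (2 \cdot 1)(3 \cdot 1) = 1 \cdot (6 \cdot 1) + 0 $$
$$ (s_2, u_2, E_2) = (2, 1, 12): \quad (2 \cdot 2)(3 \cdot 2) = 1 \cdot (6 \cdot 2) + 12 $$
Folding with $r = 5$ gives
$$ S = 1 + 5 \cdot 2 = 11, \qquad u = 1 + 5 \cdot 1 = 6 $$
$$ T = (2 \cdot 1)(3 \cdot 2) + (2 \cdot 2)(3 \cdot 1) - 1 \cdot (6 \cdot 2) - 1 \cdot (6 \cdot 1) = 6 $$
$$ E = 0 + 5 \cdot 6 + 5^2 \cdot 12 = 330 $$
and indeed $(2 \cdot 11)(3 \cdot 11) = 726 = 6 \cdot (6 \cdot 11) + 330$, so the folded triple satisfies the relaxed equation again.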
The constraints generated at the final step are handed to a SNARK built from a (polynomial) interactive oracle proof (IOP), namely Spartan, for generating the final proof.
## How can Nova be used in Verifiable AI (ZKML)?
### Case 1
In neural network models, one inference involves a lot of matrix multiplications between the layers. These operations are somewhat repetitive in nature when the architectures of layers 1, 2, and 3 are the same, i.e. they contain the same number of neurons/activations. These kinds of multiplications can easily be folded using Nova. Inside the circuit we have the multiplication between two layers (e.g. L1 and L2); Nova then folds this circuit N times, as sketched below. Folding the **random linear combination** is a fast operation composed of MSMs (rather than expensive, non-parallelizable FFTs) that results in a single claim that encompasses all N steps or N layers. **Expensive SNARK machinery only needs to be invoked to prove this single instance. This happens when we feed Nova's folded instance into Spartan to create a succinct proof.**
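Here is a minimal sketch of what such a step circuit could look like, assuming the Nova-Scotia-style convention of `step_in`/`step_out` signals for the state carried between folds. The width `N`, the template name, and the private `weights` input are illustrative assumptions, not an actual production circuit:

```circom
pragma circom 2.0.0;

// Hypothetical Nova step circuit for Case 1: one fixed-width dense layer.
// step_in/step_out carry the activations from one fold to the next.
template DenseLayerStep(N) {
    signal input step_in[N];     // activations coming from the previous layer
    signal input weights[N][N];  // this layer's weights (private witness)
    signal output step_out[N];   // activations handed to the next fold

    // Running sums keep every constraint quadratic: R1CS allows only one
    // multiplication per constraint.
    signal acc[N][N + 1];
    for (var i = 0; i < N; i++) {
        acc[i][0] <== 0;
        for (var j = 0; j < N; j++) {
            acc[i][j + 1] <== acc[i][j] + weights[i][j] * step_in[j];
        }
        step_out[i] <== acc[i][N];
    }
}

component main { public [step_in] } = DenseLayerStep(4);
```

Each fold proves one layer-to-layer multiplication, and Nova's folding compresses the N per-layer claims into a single one before the final SNARK is produced.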
One disadvantage of this approach is that we need a fixed size for the intermediate layers of the network, which is not always the case; modern, powerful networks routinely use layers of different sizes in between. These problems can be solved using SuperNova, which is not restricted to one circuit only. Quoting the SuperNova paper: **A distinguishing aspect of SuperNova is that the cost of proving a step of a program is proportional only to the size of the circuit representing the instruction invoked by the program step. This is a stark departure from prior works that employ universal circuits, where the cost of proving a program step is proportional at least to the sum of sizes of circuits representing each supported instruction, even though a particular program step invokes only one of the supported instructions.**

### Case 2
We can have another setup where the entire ML model sits inside the circom circuit. This includes all the layers of the model, i.e. there is one input for the data and the output is directly the result of the model. This is useful when we have to take inferences over a large number of data points. On every step we can calculate the output and check a certain condition, or aggregate the inference results, with one SNARK proof at the end attesting that all the computation was done correctly at each incremental step. **Nova employs a folding scheme for non-interactive systems and instantiates a single relaxed R1CS instance at the beginning of the computation, folding it N times**.
The model at the start is fed with some public inputs and private witnesses; after performing inference for one sample, the error term is calculated and a commitment to it is kept for the verifier. Unlike other IVCs, the proof is not generated every time a sample is completed. Rather, we leverage the property that the $A$, $B$, and $C$ matrices of the R1CS are fixed for every step and only the solution vector changes. So the solution vector becomes an RLC of the current and the previous solution vectors. This leads to the generation of cross terms, which are then calculated and committed to, as in the derivation above. A minimal sketch of such a per-sample step circuit follows.
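In this sketch (same `step_in`/`step_out` assumption as before) the "model" is reduced to a single dot product purely for illustration; a real circuit would contain all the layers:

```circom
pragma circom 2.0.0;

// Hypothetical Case-2 step circuit: every fold runs the whole model on one
// fresh sample and aggregates the result into the running state. Names and
// sizes are illustrative assumptions.
template InferenceStep(N) {
    signal input step_in[1];    // running aggregate over the samples seen so far
    signal input sample[N];     // this fold's data point (private witness)
    signal input weights[N];    // model parameters (private witness)
    signal output step_out[1];

    // Dot product via running sums, one multiplication per constraint.
    signal acc[N + 1];
    acc[0] <== 0;
    for (var i = 0; i < N; i++) {
        acc[i + 1] <== acc[i] + weights[i] * sample[i];
    }
    // Fold this sample's inference result into the state passed onward.
    step_out[0] <== step_in[0] + acc[N];
}

component main { public [step_in] } = InferenceStep(8);
```

Note that in practice the model's floating-point weights would have to be quantized to field elements, since circom circuits operate over a prime field.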
## How is Banana using ZKML and Nova?
Currently we are targeting two use cases:
1. Private facial recognition for wallet recovery: Presently, Banana accounts use a solution based on secure multi-party computation (MPC) for wallet recovery in case of a lost device. However, this approach involves a certain degree of centralization, as the three shards of the recovery mechanism are stored with different entities:
- The user's Google Drive, authenticated using Google sign-in.
- Our server, authenticated using an email ID.
- The third shard is encrypted and converted into a QR code, which the user can share with a trusted family member or friend.
To eliminate the dependency on centralized entities, I am exploring the use of private neural networks combined with ZK-SNARKs for facial recognition. In this scenario, the user would scan their face, and if the facial features match, verifying the proof would be sufficient for the smart contract to validate the correctness of the face.
2. Currently with Passkeys (more info [here](https://medium.com/banana-sdk/the-future-of-web3-authentication-using-passkeys-part-1-92f7591e3447)), the transaction data is not visible before signing. This can lead to horrible scenarios where users sign wrong messages and transactions. A simple solution to this problem is displaying a dialogue box before the passkey authentication pops up, to make the user aware.
The problem with the above solution is: how do we prevent bad actors from showing pop-ups with data that is not correct? We require some kind of on-chain guardian to prevent these phishy transactions, a guardian that relies on a classification model to decide whether a particular transaction is phishy. This classification model will live inside a ZK-SNARK and take the transaction information as private inputs. The model will output the result along with the proof. This is just an idea currently and needs a lot of improvement, both in terms of overall architecture and design.
## Conclusion