<style>
section {
text-align: left;
}
</style>
## Publicly Verifiable, Private & Collaborative AI Training
#### April 26, 2025 @ ZKTokyo
#### Yuriko / X:@yurikonishijima
---

---

---
## Train To Earn?
---
### What if you could contribute to training a model without revealing your data?
---
## → 💡 Federated Learning + ZKP
---
### What is Federated Learning?

---
## Why zk?
### Client: to prove correct execution of training (+ masking) a local model
### Server: to prove correct execution of aggregation of the local model updates
---
Traditionally...
Verifiability is not required in federated learning, because the entities establish a business agreement before collaborating
↓
For **mutually distrusting parties in a decentralized network** to collaborate on training, it is necessary
---
<img src="https://hackmd.io/_uploads/rJenk9tyeg.png" width="850">
---
#### 1. Local Training on the client side
- multi-class classification task
- [Iris dataset](https://archive.ics.uci.edu/dataset/53/iris) 🌸
- training algorithm: [logistic regression circuit](https://github.com/hashcloak/noir-mpc-ml/blob/master/src/ml.nr)
(imported the one built by hashcloak for their [co-noir-ml](https://github.com/hashcloak/noir-mpc-ml-report/tree/main) project)
- clients train on their raw data
---

---
<img src="https://hackmd.io/_uploads/rJenk9tyeg.png" width="850">
---
### Benchmark comparison
<div style="display: flex; align-items: baseline;">
<h4 style="margin: 0;">co-noir ML</h4>
<small style="margin-left: 10px;">
<a href="https://github.com/hashcloak/noir-mpc-ml?tab=readme-ov-file#training-using-co-noir">implementation</a>
</small>
</div>
<div style="margin-top: -20px; margin-bottom: 40px;">
<img src="https://hackmd.io/_uploads/S156ycF1xe.png" width="650">
</div>
<div style="display: flex; align-items: baseline;">
<h4 style="margin: 0;">Verifiable Federated Learning</h4>
<small style="margin-left: 10px;">
<a href="https://github.com/yuriko627/vfl-demo">implementation</a>
</small>
</div>
<div style="margin-top: -20px;">
<img src="https://hackmd.io/_uploads/HykWlqt1ee.png" width="350">
</div>
---
#### 2. Masking the model
- This is the only cryptographic part! (besides the zk proofs)
- Why mask the model? To prevent input recovery attacks
- Gradient Inversion Attacks, Membership Inference Attacks, Property Inference Attacks, etc...
- In production FL deployments, Differential Privacy is used instead (because it's more efficient)
---
### So, how can we mask the models in such a way that the server can compute the **aggregation of the raw models** without learning any individual value?
...
We need additive homomorphism... MPC? FHE? 🥺
---
### No, just one-time pad 😳
I guess you can call it a type of MPC, but there is no decryption at the end, because the masks naturally cancel each other out ✨
---
### How does that work?
---

---

---
For example,
- client1: masked model $M_1$ = raw model $R_1$ + $m_{1,2}$ - $m_{3,1}$
- client2: masked model $M_2$ = raw model $R_2$ + $m_{2,3}$ - $m_{1,2}$
- client3: masked model $M_3$ = raw model $R_3$ + $m_{3,1}$ - $m_{2,3}$

Then, when the server sums up the masked models $M_n$:
$M_1 + M_2 + M_3 = R_1 + m_{1,2} - m_{3,1} + R_2 + m_{2,3} - m_{1,2} + R_3 + m_{3,1} - m_{2,3} = R_1 + R_2 + R_3$
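---
The cancellation above can be checked with a small Python sketch (an illustration only, not the actual Noir circuit): three clients in a ring each add the mask shared with their right neighbor and subtract the mask shared with their left neighbor, and the server's sum recovers the sum of the raw models.

```python
import random

p = 2**31 - 1  # a small prime standing in for the BN254 scalar field

# Raw local models (one scalar per client, for simplicity)
R = [random.randrange(p) for _ in range(3)]

# Pairwise masks shared between ring neighbors:
# m[i] is shared by client i and client (i+1) % 3
m = [random.randrange(p) for _ in range(3)]

# Each client adds its right-neighbor mask and subtracts its left-neighbor mask
M = [(R[i] + m[i] - m[(i - 1) % 3]) % p for i in range(3)]

# The server sums the masked models; the masks cancel in pairs
aggregate = sum(M) % p
assert aggregate == sum(R) % p
```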
---
Privacy of the raw models $R_n$: each client only shares masks with its own neighbors.
For example,
- client1 does not know $m_{2,3}$ => cannot reconstruct either $R_2$ or $R_3$
- client2 does not know $m_{3,1}$ => cannot reconstruct either $R_1$ or $R_3$
- client3 does not know $m_{1,2}$ => cannot reconstruct either $R_1$ or $R_2$
---
<img src="https://hackmd.io/_uploads/Sk9PT4Wkeg.png" height="500">
#### I used this [ECDH Library](https://github.com/privacy-scaling-explorations/zk-kit.noir/tree/main/packages/ecdh) from the `zk-kit.noir` library set developed by PSE
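---
How neighbors agree on a mask without transmitting it: each side combines its own secret key with the other's public key and arrives at the same value. The sketch below illustrates this with plain modular-exponentiation Diffie-Hellman (standing in for the elliptic-curve ECDH in `zk-kit.noir`; the prime and generator are illustrative choices, not the library's parameters).

```python
import secrets

p = 2**127 - 1   # a Mersenne prime, used here purely for illustration
g = 3            # illustrative generator

# Each ring neighbor generates a keypair
sk1 = secrets.randbelow(p - 2) + 1
sk2 = secrets.randbelow(p - 2) + 1
pk1, pk2 = pow(g, sk1, p), pow(g, sk2, p)

# Each side combines its own secret key with the neighbor's public key
mask_from_1 = pow(pk2, sk1, p)   # computed locally by client 1
mask_from_2 = pow(pk1, sk2, p)   # computed locally by client 2

# Both derive the same shared mask, which is never sent over the network
assert mask_from_1 == mask_from_2
```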
---
<div>
```mermaid
sequenceDiagram
participant Client_n
participant Blockchain
participant Server
Client_n-->>Client_n: Train local model R_n, Generate training proof π_train_n
Client_n->>Blockchain: Submit (π_train_n + public key pk_n)
Blockchain-->>Blockchain: if π_train_n verified, then pk_n registered
Client_n->>Blockchain: Fetch pk_{n+1} (right neighbor) and pk_{n-1} (left neighbor)
Client_n-->>Client_n: Locally compute shared masks m_right_n=sk_n*pk_{n+1}, m_left_n=sk_n*pk_{n-1},<br>Mask the model: R_n + m_right_n - m_left_n, Generate masking proof π_mask_n
Client_n->>Blockchain: Submit masked model M_n + proof π_mask_n
Blockchain-->>Blockchain: if π_mask_n verified, then M_n registered
Server->>Blockchain: Fetch masked models M_n for all n
Server-->>Server: Aggregate local models, <br> Generate aggregation proof π_agg
Server->>Blockchain: Submit global model M_g + proof π_agg
Blockchain-->>Blockchain: if π_agg verified, then M_g registered
Client_n->>Blockchain: Fetch global model M_g
```
</div>
---
### Fixed-point arithmetic range check inside Noir circuit
---
### ML: decimal numbers <> ZK: BN254 field
- low 126-bit range: positive numbers
- middle 2 bits: unused
- high 126-bit range: negative numbers
Let's see the [example dataset](https://github.com/yuriko627/vfl-demo/blob/main/clients/client1/training/Prover.toml) for client1
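---
A sketch of this signed fixed-point encoding in Python: negatives wrap to the top of the field as $p - |x|$, and decoding checks which half of the field a value falls in. The scale factor $2^{16}$ here is an assumption for illustration, not necessarily the one used in the vfl-demo circuits.

```python
# BN254 scalar field modulus
P = 21888242871839275222246405745257275088548364400416034343698204186575808495617
SCALE = 2**16   # illustrative fixed-point scale factor
HALF = 2**126   # values below this threshold are treated as positive

def encode(x: float) -> int:
    v = round(x * SCALE)
    return v % P  # negatives wrap to the top of the field

def decode(v: int) -> float:
    if v < HALF:                 # low range: positive numbers
        return v / SCALE
    return (v - P) / SCALE       # high range: negative numbers

assert decode(encode(1.5)) == 1.5
assert decode(encode(-0.25)) == -0.25
```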
---
### Safe addition
You don't want to overflow!
For a + b = `c`
you want: bitsize(`c`) <=126
=> constraint: bitsize(a) <= 125 && bitsize(b) <= 125
---
### Safe multiplication
You don't want to overflow!
For a * b = `c`
you want: bitsize(`c`) <=126
=> constraint: bitsize(a) + bitsize(b) <= 125
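---
The two overflow guards above can be mirrored in plain Python (the Noir circuit enforces them as constraints, e.g. via a bitsize assertion; this is only a sketch of the reasoning): capping both addends at 125 bits keeps the sum within 126 bits, and capping the summed bitsizes of the factors keeps the product within 126 bits.

```python
MAX_BITS = 126

def bitsize(x: int) -> int:
    return x.bit_length()

def safe_add(a: int, b: int) -> int:
    # bitsize(a+b) <= max(bitsize(a), bitsize(b)) + 1,
    # so capping both inputs at 125 bits bounds the sum by 126 bits
    assert bitsize(a) <= 125 and bitsize(b) <= 125
    c = a + b
    assert bitsize(c) <= MAX_BITS
    return c

def safe_mul(a: int, b: int) -> int:
    # bitsize(a*b) <= bitsize(a) + bitsize(b),
    # so this constraint bounds the product by 126 bits
    assert bitsize(a) + bitsize(b) <= 125
    c = a * b
    assert bitsize(c) <= MAX_BITS
    return c

assert safe_add(2**124, 2**124) == 2**125
assert safe_mul(2**60, 2**63) == 2**123
```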
---
#### Local Model Aggregation:
<div style="margin-top: -30px;">

</div>
<div style="margin-top: -20px; margin-bottom: -30px;">

</div>
<small>You can customize this by adding `assert_bitsize::<n>` before arithmetic operations</small>
---
### Future Research Direction
---
### 1. Training dataset validation

<small style="margin-top: -20px">[Reference](https://www.youtube.com/watch?v=mdMpQMe5_KQ)</small>
---
### 2. Replacing local training with fine-tuning
---
### 3. Dropouts tolerance

This [paper](https://arxiv.org/pdf/2205.06117) shows that if each client communicates with only $O(\log n)$ other clients (where $n$ is the total number of clients), the server can tolerate client dropouts during aggregation
---
### 4. Rewarding system

<small style="margin-top: -20px">[Reference](https://www.vana.org/posts/model-influence-functions-measuring-data-quality)</small>
... or the key question is "who taught the AI about it first?"
=> Time-ranking-based compensation might be appropriate