<style>
section {
text-align: left;
}
</style>
## Publicly Verifiable, Private & Collaborative AI Training in a Decentralized Network
#### May 10, 2025 @ ETHDam
#### Yuriko Nishijima
---
### What if you don't have to reveal your personal data but you can still contribute to training a ML model?
---
### For example...
- Privacy-preserving recommendation system for Dapps with anonymous users
- Anonymous/crowdsourced healthcare data analysis platform
- Investors' diary 📓 ⇔ liquidity of some asset X
---
## Capture the hidden trend 📈
Data is, by nature, siloed.
It only becomes meaningful relative to other data.
---
### But data only gets collected & analyzed when people in power decide to do so.
What happens if we make ML model training more permissionless?
---
### Federated Learning

---
Traditionally...
verifiability has not been required in federated learning
<img src="https://hackmd.io/_uploads/BkfKonDeel.png"
height = 500>
---
For **mutually distrusting parties in a decentralized network** to collaborate on training, verifiability becomes necessary
---
## What to SNARKify
### - Clients: must prove they trained (+ masked) their local models correctly
### - Server: must prove it aggregated the local models correctly
---
<div style="text-align: center;">
<img src="https://hackmd.io/_uploads/rJenk9tyeg.png" width="850">
</div>
---
### 1. Clients: Local Training
- multi-class classification task
- [Iris dataset](https://archive.ics.uci.edu/dataset/53/iris) 🌸
- training algorithm: [logistic regression circuit](https://github.com/hashcloak/noir-mpc-ml/blob/master/src/ml.nr)
(imported the one built by hashcloak for their [co-noir-ml](https://github.com/hashcloak/noir-mpc-ml-report/tree/main) project)
- clients' data never leaves their device for training
---
### 2. Clients: Masking the Model
- This is the only cryptographic part! (besides zk)
- Why mask the model? Input recovery attacks
- Gradient Inversion Attacks, Membership Inference Attacks, Property Inference Attacks, etc.
- In production FL, Differential Privacy is used instead (because it's more efficient)
---
### So, how can clients mask their models such that the server can compute the **sum of the raw models** without learning any individual value?
...
We need additive homomorphism... MPC? FHE? 🥺
---
### No, just one-time pad 😳
I guess you can call it a type of MPC, but there's no decryption at the end, because the masks naturally cancel each other out ✨
---
### How does that work?
---
<div style="font-size: 60%;">
<div style="flex: 1; padding-right: 20px">
<img src="https://hackmd.io/_uploads/H13dFrrJxg.png" alt="diagram" style="width: 100%; max-width: 700px;">
</div>
<div style="flex: 1;">
Each client locally masks its model:
- client1: masked model $M_1 = R_1 + m_{1,2} - m_{3,1}$
- client2: masked model $M_2 = R_2 + m_{2,3} - m_{1,2}$
- client3: masked model $M_3 = R_3 + m_{3,1} - m_{2,3}$

Then, when the server sums up the masked models, the masks cancel (see the sketch below):
$M_1 + M_2 + M_3 = R_1 + m_{1,2} - m_{3,1} + R_2 + m_{2,3} - m_{1,2} + R_3 + m_{3,1} - m_{2,3} = R_1 + R_2 + R_3$
</div>
</div>
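A minimal Python sketch of the cancellation above (toy values and a stand-in modulus; the real circuit works over the BN254 scalar field):

```python
# Toy demo: three clients mask raw models R_n with pairwise one-time pads;
# the pads cancel pairwise when the server sums the masked models.
import secrets

P = 2**61 - 1                 # stand-in modulus (BN254's prime in the real circuit)

R1, R2, R3 = 11, 22, 33       # toy "raw models" (single parameters)
m12 = secrets.randbelow(P)    # pad shared by clients 1 and 2
m23 = secrets.randbelow(P)    # pad shared by clients 2 and 3
m31 = secrets.randbelow(P)    # pad shared by clients 3 and 1

M1 = (R1 + m12 - m31) % P     # add the right pad, subtract the left pad
M2 = (R2 + m23 - m12) % P
M3 = (R3 + m31 - m23) % P

assert (M1 + M2 + M3) % P == (R1 + R2 + R3) % P   # pads cancel: only sum(R_n) remains
```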
---
<div style="font-size: 60%;">
<div style="flex: 1; padding-right: 20px">
<img src="https://hackmd.io/_uploads/H13dFrrJxg.png" alt="diagram" style="width: 100%; max-width: 800px;">
</div>
<div style="flex: 1;">
Privacy of the raw models $R_n$: each client can only compute the masks shared with its own neighbors.
- client1 does not know $m_{2,3}$ => cannot reconstruct either $R_2$ or $R_3$
- client2 does not know $m_{3,1}$ => cannot reconstruct either $R_1$ or $R_3$
- client3 does not know $m_{1,2}$ => cannot reconstruct either $R_1$ or $R_2$
</div>
</div>
---
<img src="https://hackmd.io/_uploads/Sk9PT4Wkeg.png" height="500">
#### I used the [ECDH library](https://github.com/privacy-scaling-explorations/zk-kit.noir/tree/main/packages/ecdh) from the `zk-kit.noir` library set developed by PSE
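A sketch of the key step the library provides: two neighbors derive the same pad from their own secret key and the other's public key, so the pad itself is never transmitted (classic Diffie-Hellman over a prime field stands in for ECDH here; all values are toys):

```python
import secrets

q = 2**127 - 1        # toy group modulus, not the real curve
g = 3                 # toy generator

sk1, sk2 = secrets.randbelow(q), secrets.randbelow(q)   # clients' secret keys
pk1, pk2 = pow(g, sk1, q), pow(g, sk2, q)               # registered on-chain

# After fetching the neighbor's public key, both ends derive the same pad:
pad_at_1 = pow(pk2, sk1, q)   # client 1: pk2^sk1 = g^(sk1*sk2)
pad_at_2 = pow(pk1, sk2, q)   # client 2: pk1^sk2 = g^(sk1*sk2)
assert pad_at_1 == pad_at_2   # shared mask m_{1,2}, never sent over the wire
```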
---
<div style="transform: scale(1.35); transform-origin: top center;">
```mermaid=
---
config:
look: classic
"fontSize": "32px"
---
sequenceDiagram
participant Client_n
participant Blockchain
participant Server
Client_n-->>Client_n: Train local model R_n, Generate training proof π_train_n
Client_n->>Blockchain: Submit (π_train_n + public key pk_n)
Blockchain-->>Blockchain: if π_train_n verified, then pk_n registered
Client_n->>Blockchain: Fetch pk_{n+1} (right neighbor) and pk_{n-1} (left neighbor)
Client_n-->>Client_n: Locally compute shared masks m_right_n=sk_n*pk_{n+1}, m_left_n=sk_n*pk_{n-1},<br>Mask the model: R_n + m_right_n - m_left_n, Generate masking proof π_mask_n
Client_n->>Blockchain: Submit masked model M_n + proof π_mask_n
Blockchain-->>Blockchain: if π_mask_n verified, then M_n registered
Server->>Blockchain: Fetch masked models M_n for all n
Server-->>Server: Aggregate local models, <br> Generate aggregation proof π_agg
Server->>Blockchain: Submit global model M_g + proof π_agg
Blockchain-->>Blockchain: if π_agg verified, then M_g registered
Client_n->>Blockchain: Fetch global model M_g
```
</div>
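Putting the two sketches together, here is a toy end-to-end run of the arithmetic in this sequence (proofs and the chain omitted; names and parameters are illustrative):

```python
import secrets

q = 2**127 - 1                                   # toy DH modulus
g = 3
n = 5                                            # clients arranged in a ring

sk = [secrets.randbelow(q) for _ in range(n)]
pk = [pow(g, s, q) for s in sk]                  # pk_n registered on-chain

R = [secrets.randbelow(1000) for _ in range(n)]  # toy local models
M = []
for i in range(n):
    m_right = pow(pk[(i + 1) % n], sk[i], q)     # pad with right neighbor
    m_left = pow(pk[(i - 1) % n], sk[i], q)      # pad with left neighbor
    M.append((R[i] + m_right - m_left) % q)      # masked model M_n

# Server-side aggregation: every pad appears once with + and once with -
assert sum(M) % q == sum(R) % q
```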
---
### Fixed-point arithmetic range check inside Noir circuit
---
### ML: decimal numbers <> ZK: BN254 field
Scale small decimal numbers by some fixed scaling factor
- first 126 bits: positive numbers
- middle 2 bits: unused
- last 126 bits: negative numbers
Let's see the [example dataset](https://github.com/yuriko627/vfl-demo/blob/main/clients/client1/training/Prover.toml) for client1
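A minimal sketch of this encoding in Python (the scaling factor and helper names are illustrative; the demo repo fixes its own parameters):

```python
# Decimals are scaled to integers; negatives wrap to the top of the field
# (p - |v|), so the positive and negative ranges never overlap while values
# stay within 126 bits.
BN254_R = 21888242871839275222246405745257275088548364400416034343698204186575808495617
SCALE = 2**16                  # illustrative fixed scaling factor

def encode(x: float) -> int:
    return round(x * SCALE) % BN254_R            # negative values wrap to p - |v|

def decode(f: int) -> float:
    v = f if f < BN254_R // 2 else f - BN254_R   # top half reads as negative
    return v / SCALE

assert abs(decode(encode(-1.5)) + 1.5) < 1 / SCALE
```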
---
### Safe addition
You don't want to overflow!
For a + b = `c`
you want: bitsize(`c`) <= 126
=> constraint: bitsize(a) <= 125 && bitsize(b) <= 125
---
### Safe multiplication
You don't want to overflow!
For a * b = `c`
you want: bitsize(`c`) <= 126
=> constraint: bitsize(a) + bitsize(b) <= 125
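A Python stand-in for these two checks (`assert_bitsize` here is my own helper mirroring the circuit-side idea, not the Noir API):

```python
def assert_bitsize(x: int, n: int) -> None:
    assert x.bit_length() <= n, f"value needs more than {n} bits"

def safe_add(a: int, b: int) -> int:
    assert_bitsize(a, 125)    # a < 2^125 and b < 2^125
    assert_bitsize(b, 125)    # => a + b < 2^126: no wrap-around in the field
    return a + b

def safe_mul(a: int, b: int) -> int:
    # bitsize(a*b) <= bitsize(a) + bitsize(b), so capping the sum at 125
    # keeps the product safely under 126 bits
    assert a.bit_length() + b.bit_length() <= 125, "product may overflow"
    return a * b
```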
---
#### Local Model Aggregation
<small>You can customize by adding `assert_bitsize::<n>` before arithmetic operations</small>
---
## Future Research Direction
---
<div style="font-size: 80%;">
### 1. One-time pad → Packed secret sharing for masking models
### 🔐 For better security
Collusion between your two neighbors will reveal your raw model
→ t-out-of-n threshold security model
### 🤓 For better efficiency
Shared mask generation: $O(n)$ where $n$ is the model size
↓
Encode multiple secrets (model params) into a single polynomial (toy sketch below)
(NOTE: addition won't require interaction; we did multiplication for weighted averaging, but we can offload it to the client side)
</div>
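A toy sketch of the packed-secret-sharing idea (illustrative parameters, not a hardened implementation): multiple secrets ride on one polynomial, shares add componentwise, and interpolating the summed shares recovers the packed sums:

```python
# Toy packed secret sharing over a small prime field: K secrets sit at
# x = -1..-K on one polynomial; shares are evaluations at x = 1..N.
import secrets as rnd

P = 2**61 - 1            # toy prime field
K, T, N = 2, 1, 5        # 2 packed secrets, 1 blinding point, 5 shares

def interp_eval(points, x):
    """Evaluate the unique polynomial through `points` at `x` (mod P)."""
    acc = 0
    for i, (xi, yi) in enumerate(points):
        num = den = 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        acc = (acc + yi * num * pow(den, -1, P)) % P
    return acc

def share(vals):
    pts = [(-(i + 1) % P, v) for i, v in enumerate(vals)]            # secrets
    pts += [(-(K + 1 + j) % P, rnd.randbelow(P)) for j in range(T)]  # blinding
    return [interp_eval(pts, x) for x in range(1, N + 1)]

a, b = share([3, 4]), share([10, 20])
summed = [(x + y) % P for x, y in zip(a, b)]      # shares add componentwise
pts = list(zip(range(1, K + T + 1), summed))      # any K+T shares suffice
assert [interp_eval(pts, -(i + 1) % P) for i in range(K)] == [13, 24]
```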
---
### 2. Clients dropouts tolerance + real-time join

This [paper](https://arxiv.org/pdf/2205.06117) shows that if each client communicates with only $O(\log n)$ other clients (where $n$ is the total number of clients), the server can tolerate client dropouts during aggregation
---
### 3. Training dataset validation

<small style="margin-top: -20px">[Reference](https://www.youtube.com/watch?v=mdMpQMe5_KQ)</small>
---
### 4. Replacing local training with fine-tuning
💡 Use-case idea: locally fine-tune models on under-represented demographic groups and aggregate them to build a fairer global model as a public good
→ Mitigate algorithmic bias in
- criminal justice
- healthcare
- hiring
Q. Does it make sense to do this in an anonymous decentralized network with public verifiability?
---
### 5. Reward system
Q. What's the objective for collaboration?
We need a careful incentive design for people to submit "high-quality" data
[function to calculate reward]
Y(x) = x + ...
Y measures the quality of data point x
But how do we define "quality" in the first place?
---
### Thank you for listening!
Connect if you like this kind of research :)
- TG: @yuriko627
- X: @yurikonishijima
---
{"contributors":"[{\"id\":\"d65f771a-92d7-493e-807c-a52c97138c1e\",\"add\":18768,\"del\":10383}]","title":"[ETHDam Slide]: Publicly Verifiable, Private & Collaborative AI Training in a Decentralized Network","description":"Screenshot 2025-04-25 at 21.07.56"}