# Securing privacy at the network level
###### tags: `Compliance`
(N.B. a lot of this, particularly the idea of "cleanness providers", is Vitalik's)
## Introduction
In order for private transactions on Ethereum to establish themselves as a legitimate part of decentralised finance, it is likely that a mechanism is required to distinguish good actors from bad. Without this, the uncertain regulatory landscape frightens away users and integrators.
In addressing this, we must not compromise on our values or those of our users. We must also avoid eroding the key value propositions that justify the existence of privacy networks operating on decentralised ledgers.
Any solution must work within the constraints of a fully decentralised, permissionless network. It must also work on a *programmable* network where contracts are Turing-complete. Adopting a solution that is incompatible with these conditions risks establishing norms that regulators will adopt and enforce. If these norms cannot be met using a fully decentralised protocol then we lose our key value proposition: decentralised private finance.
### Requirements for securing privacy at the network level
Any solution must have the following minimal characteristics in order to satisfy the above:
* Provide a clear public signal to distinguish legitimate users from suspicious users
* No censorship; no constraints on how users use the network (a consequence of Turing-complete programmability)
* Adopted via social consensus (i.e. users willingly go through opt-in steps to prove they are legitimate; again a consequence of programmability)
* Permissionless; the protocol does not place one or two fixed centralised entities in a position of privilege and control over the network
So…how the absolute bloody fuck do we square this impossible circle?
## Specification Overview
We conditionally track deposits. This information is used by “cleanness providers” to attest to the legitimacy of shielded funds.
### Deposit Tracking
All tokens entering the anonymity set are tracked in the L1 rollup contract via a unique `deposit_id`.
Every UTXO (here referred to as a ‘value note’) contains a set commitment to the following:
* The `deposit_id` of each deposit that contributes value to the note
* The `deposit_value` that each deposit contributes to the note
* The `deposit_timestamp` of when the deposit was made
The size of the deposit set must be capped to a manageable number (~32?).
The UTXO also contains a boolean flag `deposits_cleared` that describes whether the above set commitment has been **deleted**.
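For concreteness, here is a minimal sketch of the tracked fields in plain Python. The names `Deposit`, `ValueNote` and the `MAX_DEPOSITS` cap are illustrative, not part of any existing implementation; in practice these fields live inside a cryptographic commitment rather than a plain data structure.

```python
from dataclasses import dataclass, field
from typing import List

MAX_DEPOSITS = 32  # illustrative cap on the tracked deposit set (the "~32?" above)

@dataclass
class Deposit:
    deposit_id: int         # unique id assigned by the L1 rollup contract
    deposit_value: int      # value this deposit contributes to the note
    deposit_timestamp: int  # timestamp of when the deposit was made

@dataclass
class ValueNote:
    value: int
    owner_pubkey: bytes
    deposits: List[Deposit] = field(default_factory=list)  # the committed deposit set
    deposits_cleared: bool = False  # True once the deposit set commitment has been deleted

    def add_deposits(self, new_deposits: List[Deposit]) -> None:
        """Track additional contributing deposits, respecting the size cap."""
        if self.deposits_cleared:
            raise ValueError("deposit set has been cleared; no further tracking")
        if len(self.deposits) + len(new_deposits) > MAX_DEPOSITS:
            raise ValueError("deposit set cap exceeded")
        self.deposits.extend(new_deposits)
```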
It is not possible to track deposits in this manner indefinitely, as the size of the committed set grows *exponentially* with the transaction depth of a given value note (each join-split merges the deposit sets of its input notes).
The goal is to enable a **cleanness provider** entity to attest to the legitimacy of a given value note.
### Cleanness Providers
A cleanness provider is an entity that attests to the legitimacy of a given value note, i.e. that its value does not derive from deposits sourced from fraud or crime.
In a fully decentralised network, anyone can register as a cleanness provider. Each provider is given a unique `provider_id`. In a fully programmable private L2 this will be linked to a private smart contract address whose logic defines the conditions under which a provider will attest to a note. In more limited systems (e.g. Aztec Connect) we will need to create a specific “provider circuit” to enable provider attestations.
The most basic form of provider would be a centralised system where users validate their identity off-chain and the provider produces digital signatures that attest to the legitimacy of every user’s note.
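As a rough sketch of that basic flow, assuming Ed25519 signatures and off-chain vetting (all names here are illustrative; the real attestation format and note identifiers are unspecified):

```python
# Illustrative only: a centralised provider signing note commitments for users
# it has vetted off-chain. Uses Ed25519 from the `cryptography` package.
from cryptography.hazmat.primitives.asymmetric import ed25519

class CentralisedProvider:
    def __init__(self) -> None:
        self._key = ed25519.Ed25519PrivateKey.generate()
        self._vetted_users = set()  # user ids vetted off-chain (KYC etc.)

    def vet_user(self, user_id: bytes) -> None:
        self._vetted_users.add(user_id)

    def attest(self, user_id: bytes, note_commitment: bytes) -> bytes:
        """Sign the note commitment if (and only if) the user has been vetted."""
        if user_id not in self._vetted_users:
            raise PermissionError("user has not completed off-chain identity checks")
        return self._key.sign(note_commitment)

    def public_key(self) -> ed25519.Ed25519PublicKey:
        """Verifiers check attestations against this key."""
        return self._key.public_key()
```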
However the above is highly centralised and places the successful operation of the network in the hands of a small number of off-chain entities. We can do better.
### The permissionless cleanness provider
The permissionless provider maintains a nullifier set containing `deposit_ids` that are flagged as suspicious.
Any user can obtain an attestation from the provider, for a given value note, by creating a proof of knowledge of the following:
1. `deposits_cleared == false`
2. Each `deposit_id` in the note’s set commitment is **not** a member of the provider’s nullifier set
3. Each `deposit_timestamp` in the note’s set commitment is **less than** `timestamp_of_nullifier_root - provider_timelock`
(The value of `provider_timelock` is set by individual providers)
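A minimal sketch of the statement being proven, written as plain Python rather than as a circuit (the function and variable names are illustrative):

```python
from typing import Iterable, Mapping

def can_attest(
    note_deposits: Iterable[Mapping],   # each with "deposit_id" and "deposit_timestamp"
    deposits_cleared: bool,
    nullified_deposit_ids: set,         # the provider's nullifier set of suspicious ids
    nullifier_root_timestamp: int,      # timestamp of the nullifier root used in the proof
    provider_timelock: int,             # chosen by the individual provider
) -> bool:
    # 1. The note must still be tracking its deposits
    if deposits_cleared:
        return False
    for d in note_deposits:
        # 2. No deposit may appear in the provider's nullifier set
        if d["deposit_id"] in nullified_deposit_ids:
            return False
        # 3. Every deposit must predate the timelock window
        if d["deposit_timestamp"] >= nullifier_root_timestamp - provider_timelock:
            return False
    return True
```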
#### Provider attestations and value notes
Each value note contains a `provider_set`: a set commitment to the `provider_id` values of the providers that have attested to the note.
In a join-split transaction, the output note’s `provider_set` is the **set intersection** of the input notes’ `provider_sets`.
When tokens are withdrawn from the privacy network, the `provider_set` of the destroyed value notes is publicly broadcast on the L1 and linked to the withdrawer address.
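As a small illustration of the intersection rule (plain Python sets standing in for the `provider_set` commitments):

```python
def join_split_provider_set(provider_set_a: set, provider_set_b: set) -> set:
    """The output note's provider set is the intersection of the input notes' sets."""
    return provider_set_a & provider_set_b

# e.g. joining a note attested by providers {1, 2, 3} with one attested by {2, 3, 4}
assert join_split_provider_set({1, 2, 3}, {2, 3, 4}) == {2, 3}
```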
#### Dealing with ‘tainted’ deposits
A “disentangle” circuit is added to the network. This circuit takes a value note and can split out one or more tainted deposits.
E.g. a value note contains deposit ids `1, 5, 6` and `5` is considered tainted. The circuit will destroy the input value note and create two output value notes. The first contains the value from deposit id `5` and the second contains the value from deposit ids `1, 6`.
This circuit allows users to ‘isolate’ tainted tokens and prevent them from being combined with (and tainting) the rest of their funds.
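A minimal sketch of the disentangle logic, assuming the deposit set is represented as a `deposit_id -> value` mapping (illustrative only; the real circuit operates on commitments and nullifiers):

```python
def disentangle(note_deposits: dict, tainted_ids: set) -> tuple:
    """Split a note's deposit set (deposit_id -> value) into tainted and clean parts.

    The input note is destroyed and one output note is created per returned set.
    """
    tainted = {i: v for i, v in note_deposits.items() if i in tainted_ids}
    clean = {i: v for i, v in note_deposits.items() if i not in tainted_ids}
    return tainted, clean

# e.g. deposits {1: 10, 5: 7, 6: 3} with deposit 5 tainted
tainted, clean = disentangle({1: 10, 5: 7, 6: 3}, {5})
assert tainted == {5: 7} and clean == {1: 10, 6: 3}
```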
### Effectiveness of the above scheme
* No network level censorship. Users do not have to use cleanness providers as long as they accept the consequences
* Clear signal to distinguish legitimate users from bad actors (i.e. whether withdrawals are attested by trusted cleanness providers)
* External actors have clear information they can use to vet/block addresses (i.e. has the address withdrawn notes that are not clean?)
* Permissionless; anyone can become a cleanness provider. Providers can exist that allow anyone to ‘clean’ notes without going through identity vetting
* Decentralised. Community decides on which providers are trusted. Negligent providers, or providers that block legitimate deposits, can be replaced by new providers
#### Disadvantages
* No perfect information tracking. Cannot trace stolen funds from withdrawal address to deposit address
* UX problems if users have to wait several days to ‘clean’ notes
* Increased L2 data; for peer-to-peer transfers, the deposit set and provider set for each note must be included in the note’s ‘viewing key’ (the symmetrically encrypted data blob that decrypts to the plaintext of the recipient’s value note)
* Risk of a single provider becoming the “canonical” provider, with associated censorship risks
### Mitigating UX risk via “identity arbitrageurs”
When a user deposits funds, they can perform an atomic swap with a 3rd party to swap their notes for “cleaned” notes.
The 3rd party takes on responsibility for vetting the user (e.g. performs some sort of KYC). If deposits later turn out to be fraudulent, they will possess information that can be forwarded to investigators (e.g. the time of the swap and the identity of the user).
Using identity arbitrageurs is opt-in at the network level. If a user is uncomfortable doing this, they can wait for the required time lock window to clean their notes using their desired provider.
N.B. we will need to implement an atomic swap circuit to do this. Input note values == output note values. Owner public keys are swapped but the deposit sets and provider sets are unchanged.
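A rough sketch of the swap’s constraints (the `Note` type and field names are illustrative):

```python
from dataclasses import dataclass, field, replace

@dataclass(frozen=True)
class Note:
    value: int
    owner_pubkey: bytes
    deposit_set: frozenset = field(default_factory=frozenset)
    provider_set: frozenset = field(default_factory=frozenset)

def atomic_swap(user_note: Note, arbitrageur_note: Note) -> tuple:
    """Swap owner keys only; values, deposit sets and provider sets are unchanged."""
    out_for_user = replace(arbitrageur_note, owner_pubkey=user_note.owner_pubkey)
    out_for_arbitrageur = replace(user_note, owner_pubkey=arbitrageur_note.owner_pubkey)
    # Input note values == output note values (holds by construction here;
    # the circuit would enforce this explicitly).
    assert out_for_user.value + out_for_arbitrageur.value == user_note.value + arbitrageur_note.value
    return out_for_user, out_for_arbitrageur
```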
## Preserving decentralisation
In a fully decentralised network it is important that the network does not rely on a single canonical cleanness provider.
Ideally, the protocol foundation makes funds available to support multiple provider entities. Decentralised governance is used to audit providers and to decide between continued funding and funding a replacement provider.
Providers can be audited with publicly available information to determine:
1. What percentage of bad deposits were identified within the time lock window
2. What percentage of nullified deposits ended up being from honest users
By using decentralised governance, negligent or arbitrary providers can be defunded and new providers funded in their place.
Ideally the funded providers take on a selection of risk profiles that align with the values of the community.
To mitigate UX issues, the foundation could fund one or more “identity arbitrageurs”. The service would be free for users with the cost being socialised across the entire network.
## The path to decentralisation
Aztec’s current network is not fully decentralised and this comes with additional risks/liabilities.
Prior to full decentralisation, a limited number of cleanness providers could be permissioned in.
Once the network is decentralised these controls would be removed.
## Implementation details
The naive way of computing set intersections in-circuit is $O(n^2)$ in the size of the set. However, we can *verify* the correctness of a claimed set intersection in linear time:
A note’s provider set $\mathcal{C}$ contains the provider ids that have attested to the note. It is fixed to a size of `m`; `0` is considered the ‘null’ id that corresponds to no provider.
Each note contains a Pedersen commitment to $\mathcal{C}$, i.e. $\text{comm}(id_0, \ldots, id_{m-1})$.
To determine set intersections, each note’s provider set is represented as a polynomial $C(X)$ whose roots are the provider ids. i.e.
$$
C(X) = \prod_{i=0}^{m-1}(X - id_i)
$$
Each join-split transaction has input provider set polynomials `C_1(X)`, `C_2(X)` and an initial output provider set polynomial `C_O'(X)`. The degree `d` of `C_O'(X)` is also provided (it is not yet `m - 1`, since the null-id padding has not been added).
To determine whether `C_O'(X)` is a valid set intersection, the circuit validates that `C_O'(X)` divides both `C_1(X)` and `C_2(X)`.
I.e. the Prover provides quotient polynomials `R_1(X)`, `R_2(X)`. A random challenge `z` is generated and the following is validated:
$$
C_1(z) = R_1(z) \cdot C_O'(z)
$$
$$
C_2(z) = R_2(z) \cdot C_O'(z)
$$
Finally, the output provider set `C_O` is computed by appending `m - d` “empty provider” (null id `0`) values to `C_O'`, i.e. multiplying `C_O'(X)` by $X^{m-d}$.
The above scheme has one caveat: the divisibility check only enforces that the roots of `C_O'(X)` are a subset of the true intersection, so a user can choose to deliberately omit providers from `C_O'(X)` by moving the corresponding factors into `R_1(X), R_2(X)`. I don’t think this is a problem: if a user wants to remove a provider from their note, why shouldn’t they be able to? They do not gain anything by doing so.
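Below is a minimal sketch of the divisibility check over a toy prime field (the modulus, challenge generation and commitment handling are placeholders; a real implementation would use the proving system’s native field and a Fiat-Shamir challenge):

```python
import random

P = 2**61 - 1  # toy prime modulus; the real circuit works over the proving system's field

def poly_from_roots(ids):
    """Return the coefficients (lowest degree first) of prod_i (X - id_i)."""
    coeffs = [1]
    for r in ids:
        new = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            new[i] = (new[i] - r * c) % P      # contribution of the -r term
            new[i + 1] = (new[i + 1] + c) % P  # contribution of the X term
        coeffs = new
    return coeffs

def eval_poly(coeffs, z):
    """Horner evaluation of the polynomial at z, mod P."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * z + c) % P
    return acc

def check_intersection(set_1, set_2, claimed_intersection):
    """Check that C_O'(X) divides both C_1(X) and C_2(X) at a random point.

    The prover would supply the quotient polynomials; here we build them from
    the omitted ids, which are exactly the quotients' roots.
    """
    c1, c2 = poly_from_roots(set_1), poly_from_roots(set_2)
    c_out = poly_from_roots(claimed_intersection)
    r1 = poly_from_roots(set_1 - claimed_intersection)
    r2 = poly_from_roots(set_2 - claimed_intersection)
    z = random.randrange(P)  # in-circuit this would be a Fiat-Shamir challenge
    return (eval_poly(c1, z) == eval_poly(r1, z) * eval_poly(c_out, z) % P
            and eval_poly(c2, z) == eval_poly(r2, z) * eval_poly(c_out, z) % P)

# e.g. provider sets {1, 2, 3} and {2, 3, 4}, claimed intersection {2, 3}
assert check_intersection({1, 2, 3}, {2, 3, 4}, {2, 3})
```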