# Compliance & Selective Disclosure in Aztec
## Address Set Proofs
In order to build a holistic view of all transactions, we need some way to prove the full set of addresses we transact with. There are a couple of ways to do this. Below we propose two possible routes for a contract to integrate the data storage required to create an "Address Set Proof".
### Fully Private Address Set Proofs
To privately prove the set of all addresses one transacts with, we introduce the concept of an "address book". This address book is an append-only [indexed merkle tree](https://docs.aztec.network/developers/docs/concepts/advanced/storage/indexed_merkle_tree) into which we insert all of our addresses. To collect the full set of addresses, we can simply start at index 0 in the tree and traverse the linked list until we reach the highest leaf (whose next_idx value is 0). To use this in a compliance proof, we simply take a storage proof of the tree root.
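For illustration, a minimal client-side sketch of that traversal, assuming an `AddressBookTree` helper that exposes the locally tracked leaves via `leaf_at`:
```rust
// unconstrained client-side pseudocode — walk the indexed merkle tree's linked list
// to recover every address in the address book; AddressBookTree / leaf_at are assumed helpers
fn collect_address_book(address_book: &AddressBookTree) -> Vec<AztecAddress> {
    let mut addresses = Vec::new();
    // leaf 0 is the tree's zero leaf; follow next_idx pointers until the list terminates
    let mut leaf = address_book.leaf_at(0);
    while leaf.next_idx != 0 {
        leaf = address_book.leaf_at(leaf.next_idx);
        addresses.push(leaf.value);
    }
    addresses
}
```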
However, we need to ensure we include both senders and recipients in this tree, and a sender cannot privately insert a value into a recipient's address book tree. To solve this, we need a one-time authorization handshake function that both counterparties must call:
```rust
// pseudocode, not a perfect API / abridged inputs
#[private]
fn add_to_address_book(counterparty: AztecAddress, membership_proof: MerkleProof) {
    // derive the ECDH shared secret between caller and counterparty
    let nsk_secret = context.request_nsk_app(context.msg_sender());
    let shared_secret = derive_ecdh_shared_secret_using_aztec_address(nsk_secret, counterparty);
    // compute the address book insertion nullifier and push it
    let nullifier = hash([shared_secret, context.msg_sender(), counterparty]);
    context.push_nullifier(nullifier);
    // insert the counterparty into the caller's address book tree
    storage.address_book.insert(counterparty, membership_proof);
}
```
The existence of the nullifier `H(shared_secret, caller, counterparty)` is sufficient to indicate that the caller has added the counterparty to their address book and has a fully compliant setup - that is, a proof using the address book tree will always require that notes sent to or received from the counterparty are included in the compliance proof. However, this step does not guarantee that the counterparty has also called this function, which means a counterparty could receive notes from the caller without including the caller in their own proofs. For this reason, we need to add a step to any function that transfers notes privately:
```rust
// pseudocode
#[private]
pub fn transfer_private(from: AztecAddress, to: AztecAddress, amount: u128) {
    // the counterparty in this transfer is the recipient
    let counterparty = to;
    // derive the ECDH shared secret
    let nsk_secret = context.request_nsk_app(context.msg_sender());
    let shared_secret = derive_ecdh_shared_secret_using_aztec_address(nsk_secret, counterparty);
    // check that both address book nullifiers exist
    let caller_nullifier = hash([shared_secret, context.msg_sender(), counterparty]);
    let counterparty_nullifier = hash([shared_secret, counterparty, context.msg_sender()]);
    context.historical_header.prove_nullifier_inclusion(caller_nullifier);
    context.historical_header.prove_nullifier_inclusion(counterparty_nullifier);
    ...
    // normal transfer logic here
}
```
By including this step at the start of any note transfer function, we guarantee that both counterparties have called `add_to_address_book()` and that all parties are constrained to produce proofs that include any notes exchanged with these addresses.
The benefit of this construction is that the entire system is private - emitted nullifiers are suitably blinded by the ECDH shared secret, and any tracking logic beyond this occurs in private state that does not leave the client device.
The drawbacks are apparent, however. Most annoyingly, you cannot start transacting with a party unless both parties are online to call the initial setup function. Additionally, we incur non-trivial growth in constraints that appear in every single transfer function. Finally, infrastructure must exist for tracking the leaves of the address book tree. Clients must either maintain this data themselves (which runs the risk of being lost) or end-to-end encrypt it and store it remotely, which materially leaks some information about access patterns.
### Publicly Linked Address Set Proofs
Instead of using an address book, we can sacrifice some privacy by introducing a public counter. We still need nullifiers to track whether a given counterparty pair has already been counted, but the tracking itself becomes a simple counter increment in an internal public function:
```rust
#[storage]
struct Storage {
    ...
    // (sent_count, received_count) per address
    address_count: Map<AztecAddress, PublicMutable<(u32, u32), Context>, Context>,
    ...
}

// pseudocode
#[private]
pub fn transfer_private(from: AztecAddress, to: AztecAddress, amount: u128) {
    // the counterparty in this transfer is the recipient
    let counterparty = to;
    // derive the ECDH shared secret
    let nsk_secret = context.request_nsk_app(context.msg_sender());
    let shared_secret = derive_ecdh_shared_secret_using_aztec_address(nsk_secret, counterparty);
    // compute the pair nullifier
    let nullifier = hash([shared_secret, context.msg_sender(), counterparty]);
    Token::at(context.this_address())
        .increment_counter(from, to, nullifier)
        .enqueue(&mut context);
    ...
    // normal transfer logic
}

#[public]
#[internal]
pub fn increment_counter(from: AztecAddress, to: AztecAddress, nullifier: Field) {
    // only count each counterparty pair once
    if (!context.nullifier_exists(nullifier)) {
        context.push_nullifier(nullifier);
        // bump the sender's sent count
        let from_loc = storage.address_count.at(from);
        let mut from_count = from_loc.read();
        from_count.0 += 1;
        from_loc.write(from_count);
        // bump the recipient's received count
        let to_loc = storage.address_count.at(to);
        let mut to_count = to_loc.read();
        to_count.1 += 1;
        to_loc.write(to_count);
    }
}
```
Instead of iterating over an address book, we simply take a storage proof of the sender/recipient counts. We must then supply exactly that many address transactional summary proofs to satisfy the constraint of including all senders and recipients.
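As a rough sketch of the circuit-side check (`public_storage_historical_read` and `address_count_slot` are assumed helpers, not an existing API):
```rust
// pseudocode sketch — consume the public counters inside a compliance proof
fn constrain_summary_count(
    historical_header: Header,
    address: AztecAddress,
    summaries: Vec<AddressTransactionalSummary>,
) {
    // storage proof of the (sent, received) counter tuple at the agreed historical block
    let (sent, received): (u32, u32) =
        public_storage_historical_read(historical_header, address_count_slot(address));
    // one summary per counted counterparty, so no sender or recipient can be omitted
    assert(summaries.len() == sent + received);
}
```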
Obviously, this construction leaks transaction graph privacy. It is possible to partially unlink it by storing the sent count as a `PrivateMutable` instead of keeping both counts in a public tuple, thereby only indicating publicly that *someone* started transacting with a certain address. However, this introduces a griefing attack where a malicious sender can increment a recipient's count without providing them with the Aztec address needed to call `pxe.registerSender(address)`. Constrained encrypted delivery of this log does not solve this, as the log can't be decrypted without knowing the address beforehand.
## Address Transactional Summary Proofs
Once we have the full set of addresses we interact with, we can build a full summary of all transactions we've made across a given contract. We do this by iterating over each address and creating an "Address Transactional Summary Proof" for each one. Alternatively, a challenge might be issued to a prover to summarize interactions with one specific address (rather than the entire set), in which case the summary can be built as a standalone proof.
Address Transactional Summary Proofs are deliberately generic in this description - we are only concerned with acquiring the full set of notes used when transacting with a specific address. The exact use of these notes is then up to the implementation: we may want to track the tax basis, or ensure the volume over a specific span of time never exceeds CTR reporting thresholds. In general, these proofs provide the infrastructure for making such statements rather than trying to predict the exact requirements of a compliance proof.
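For example, the core of a volume-oriented summary is just a fold over that note set (a plain sum here; tax basis or threshold checks would fold different note fields):
```rust
// pseudocode sketch — the generic shape of an Address Transactional Summary:
// fold the full note set for one counterparty into whatever statement the policy needs
fn summarize_counterparty(note_amounts: Vec<u128>) -> u128 {
    let mut volume: u128 = 0;
    for amount in note_amounts {
        // each amount comes from a note separately constrained to involve the counterparty
        volume += amount;
    }
    volume
}
```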
### Note Serialization
As with knowing the full set of addresses, we need to know the full set of notes to include. We do this by adding a serial number to every note sent and tracking the latest serial number for each recipient we send to:
```rust
// pseudocode
// likely an expanded uint note
#[note]
struct MyNote {
    owner: AztecAddress,
    randomness: Field,
    serial: Field, // could likely be reduced to a u16 for packing
    ...
    // remaining note fields
}

#[storage]
struct Storage {
    // latest serial number issued per recipient
    serial_numbers: Map<AztecAddress, PrivateMutable<u32, Context>, Context>,
    ...
    // remaining storage values
}

#[private]
pub fn transfer_private(from: AztecAddress, to: AztecAddress, amount: u128) {
    // address book hook first
    let serial_number_loc = storage.serial_numbers.at(to);
    let serial_number = serial_number_loc.read() + 1;
    serial_number_loc.write(serial_number);
    // normal transfer logic
    // make sure to insert the serial number into the note when sent
}
```
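One way a summary proof could consume these serials (an assumption, not spelled out above): for the notes we sent to a given recipient, our own latest serial number bounds the set, so the summary can require exactly one note per serial.
```rust
// pseudocode sketch — serial numbers give the summary proof a completeness check:
// if our latest serial for this recipient is N, exactly the notes with serials 1..=N
// must be presented, so no sent note can be silently omitted
fn check_sent_note_completeness(latest_serial: u32, notes: Vec<MyNote>) {
    assert(notes.len() == latest_serial);
    for i in 0..notes.len() {
        // notes are presented in serial order for simplicity
        assert(notes[i].serial == (i + 1) as Field);
    }
}
```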
### Tagging Keys
Alternatively, in a more interactive and somewhat trusted environment, we can create a "Tagging Key Set Proof" for all of the addresses in the Address Set Proof. For each address in the set, we simply prove derivation of the tagging key between our address and the counterparty address:
```rust
// TODO: find the code that contains this logic
```
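Until that snippet is located, a rough sketch of the intent, reusing the pseudocode helpers from the address book section (this is not the canonical aztec-nr derivation):
```rust
// pseudocode sketch — prove in-circuit that the disclosed tagging secret really is the
// shared secret between the prover and `counterparty`; helper names mirror the earlier
// pseudocode and are assumptions
#[private]
fn reveal_tagging_key(counterparty: AztecAddress) -> Field {
    let nsk_secret = context.request_nsk_app(context.msg_sender());
    let tagging_secret = derive_ecdh_shared_secret_using_aztec_address(nsk_secret, counterparty);
    // returning the secret binds the disclosed value to a valid derivation
    tagging_secret
}
```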
This solution has a notable drawback: it requires disclosing the tagging key to a third party. This means that, in perpetuity, this third party will be able to see every single time a note is sent between you and your counterparty. However, they will only see the ciphertext of the note - not its contents. If we are willing to accept this, use of the tagging key becomes an incredibly potent solution.
First off, only the Address Set Proof modifications need to be made to the contract - there is no note serialization logic required! Second, we have the option to expedite proving in an interactive setting. Once the verifier has received the Tagging Key Set Proof, they can readily collect ALL ciphertexts themselves and challenge the prover interactively to prove qualities about each note, rather than having to aggregate everything.
Of course, if we want to retain a higher level of confidentiality about the qualities of any one transaction, the verifier can create a merkle tree of all note hashes they want revealed, and the prover can recursively verify proofs that summarize their entire transactional history with an address while proving all notes are included.
## Compliance Proof Types
There are a couple of types of compliance proofs we can imagine being useful:
##### Proof of Non-Interaction with an Address
This is simply handled by checking that the nullifiers `H(ecdh_secret, sender, recipient)` and `H(ecdh_secret, recipient, sender)` don't exist. This could be merklized for compliance across a blacklist.
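A sketch of that check, assuming a non-inclusion helper that mirrors the `prove_nullifier_inclusion` call used earlier:
```rust
// pseudocode sketch — prove non-interaction with a single blacklisted address
#[private]
fn prove_non_interaction(blacklisted: AztecAddress) {
    let nsk_secret = context.request_nsk_app(context.msg_sender());
    let shared_secret = derive_ecdh_shared_secret_using_aztec_address(nsk_secret, blacklisted);
    // neither direction of the handshake nullifier may exist
    let ours = hash([shared_secret, context.msg_sender(), blacklisted]);
    let theirs = hash([shared_secret, blacklisted, context.msg_sender()]);
    context.historical_header.prove_nullifier_non_inclusion(ours);
    context.historical_header.prove_nullifier_non_inclusion(theirs);
}
```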
##### Address Transactional Summary Proofs
Described above, using Note Serialization or Tagging Key Proofs.
##### Volume or Tax Basis Proofs
Combining Address Set Proofs with Address Transactional Summary Proofs gives the generic infrastructure for making these proofs. A simple volume proof is diagrammed below.
### Diagrammed Volume Proof
Step 0: Verifier sets a max block height (e.g. the last block produced in the year 2025)
Step 1: Prover performs a Tagging Key Set Proof (TKSP)

Step 2: Prover sends the TKSP to the verifier (offchain), who verifies the authenticity of the tagging keys by checking the proof against the block set in step 0
Step 3: Verifier loads the tagging keys and collects all ciphertexts for each account
Step 4: For each account, the verifier produces an "Account Note Ciphertext Tree" that includes the note hashes of all ciphertexts they want volume proven for
Step 5: Verifier produces a "Note Ciphertext Tree" - a tree of trees that includes each of the Account Note Ciphertext Tree roots
Step 6: Verifier shares the Note Ciphertext Tree (and any membership witness data needed) with the prover
Step 7: For each address the prover interacted with, they create an "Account Transactional Volume Summary Proof" (ATVSP)

Step 8: Once the prover has constructed an ATVSP for every leaf in the Note Ciphertext Tree, they construct a final "Volume Summary Proof" (VSP) which aggregates all transactions into a single provable summation of volume

Step 9: The prover sends the VSP to the verifier. Since the verifier constructed the Note Ciphertext Tree using the TKSP's tagging keys, they can have full confidence that the entirety of the prover's transactional history was included in the VSP and that the output volume is accurate.
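As a rough illustration of what these proofs might expose publicly (the struct and field names are illustrative, not a specification):
```rust
// pseudocode sketch — public inputs implied by the steps above
struct AtvspPublicInputs {
    // root of the verifier-built Account Note Ciphertext Tree for this counterparty
    account_note_ciphertext_root: Field,
    // total volume summed across every leaf in that tree
    account_volume: u128,
}

struct VspPublicInputs {
    // the verifier's tree-of-trees root, binding every account to the aggregate
    note_ciphertext_tree_root: Field,
    // sum of account_volume across all verified ATVSPs
    total_volume: u128,
}
```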
## Other Ideas
Here are some other ideas from the first exploration draft.
### Epoch View Keys
A user could reveal all of their notes for a specific epoch (e.g. a month, a year, etc.) by creating a shared secret with the viewer and deriving the encryption key from the shared secret plus some epoch number. Encryption keys would have to be derived at the point of encryption each time to ensure the current epoch number is being used. This gives full visibility to the viewer, but restricts the time range.
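A minimal sketch of the derivation, assuming a hash-based key schedule (the real scheme would need to align with Aztec's encryption key derivation):
```rust
// pseudocode sketch — derive an epoch-scoped viewing key from the viewer shared secret;
// re-derived at every encryption so the current epoch number is always used
fn derive_epoch_view_key(shared_secret: Field, epoch: u32) -> Field {
    hash([shared_secret, epoch as Field])
}
```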
This may be more desirable for an automated tax information dump than for regulatory compliance.
### Brute force decryption
Note: how can we both fail to decrypt and also not revert the circuit?
A viewer could request a proof of compliance over all blocks in the range X to Y. The user would then have to construct a proof for each block demonstrating that each note either CAN be decrypted (adding it to the stack of notes to prove against) or CANNOT be decrypted. This would be incredibly resource intensive, but it requires no additional integrations whatsoever to facilitate.
Once the user has produced a proof that demonstrates all of the notes that were sent to their address, they perform the same selective disclosure step described in the above two sections.
### Compliant Back Door / Confidential Transfers
Very simply, an (overly) compliant smart contract may choose to duplicate all logs and send the copy to a viewer key. This solution is unimaginative and potentially dangerous for privacy, but it may also be a requirement for services like today's big-name custodial stablecoins to feel comfortable allowing transactions on a private chain.
### MPC Storage
Like the compliant back door, smart contracts could duplicate all logs and encrypt the copies to a view key. However, these logs would be encrypted to a key whose shares are split across an MPC network (e.g. NillionDB). A viewer could then make requests to the MPC network to provide the data.
This could happen in two ways -
1. Proxy re-encryption: The MPC network performs a computation to re-encrypt the data from their own key to the viewer without ever actually decrypting the data
2. TEE re-encryption: The encrypted data is only decrypted inside of a TEE where re-encryption to the viewer's key occurs