# List of Primitives and Privacy Requirements This document consists of a list of primitives and privacy requirements that could be added to existing zkVMs. ## Privacy Requirements In this section, we explore privacy requirements that must be added to a zkVM that (currently) does not satisfy zero-knowledge. Additionally, may be needed to convert a zkVM that satisfies zero-knowledge to a privacy-first paradigm. - It may be necessary for a particular proof/computation to be 'owned' by a specific Prover. - The Prover can use hiding techniques as discussed in [address notes](https://www.notion.so/Nescience-cd358fe429b14fa2ab38ca42835a8451?pvs=4#9ab4ba4d92914ba0a5f66235446de5d1) to conceal the Prover's real address. - A given party can prove that a given proof (and thus computation) was generated by them with the following steps: - The Prover includes the alias public address in the public parameters thus binding the proof to this address. - The Prover generates a Schnorr proof proving knowledge of the private key associated to the alias. - Combine the proof of their computation and Schnorr proof together. - **Hiding**: - Any data that a Prover would like to conceal should not be leaked through interactions. This is a critical property of various commitment schemes; more information below. - It may be desirable for a system to be post-quantum secure. In the case of a future update to post-quantum secure, past proofs generated using commitment schemes that satisfy hiding will remain secure. - **Binding**: - Any method used to conceal data should prevent the Prover from being able to alter it later on. This is also a critical property of commitment schemes; more information below. - In cases that commitment schemes are used that are not post-quantum secure, these schemes are forgeable given a quantum computer powerful enough to break the discrete logarithm problem for the given group. That is, these schemes will no longer satisfy binding. - **Data Confidentiality**: - Ensure that all data processed within the zkVM is encrypted and not accessible to unauthorized parties. - Use strong cryptographic encryption methods such as **AES-256** for data at rest and **TLS** for data in transit. - Apply **zero-knowledge proof systems** (zk-SNARKs, zk-STARKs) to allow verifiable computations without revealing data. - Implement secure multiparty computation **(MPC)** to process data collaboratively without exposing individual inputs. - **Data Integrity**: - Ensure the integrity of data processed and stored within the zkVM to prevent tampering or unauthorized modifications. - Use cryptographic **hash functions (e.g., SHA-256, Blake2)** to generate hashes of data for integrity verification. - Implement **Merkle trees** or **Verkle trees** to efficiently verify the integrity of large datasets. - Apply **digital signatures** to data entries to ensure authenticity and integrity. - **User Anonymity**: - Conceal the identities of users interacting with the zkVM to protect their privacy. - Use anonymous credential systems such as **ring signatures** or **group signatures** to verify user actions without revealing their identity. - Implement **stealth addresses** or **alias addresses** to prevent linking transactions to a specific user. - Apply differential privacy techniques to aggregate data while protecting individual user identities. - **Transaction Privacy**: - Conceal transaction details, including sender, receiver, and amount, to ensure the privacy of financial operations. - Use confidential transaction protocols like **Mimblewimble** or **Zerocash** to hide transaction amounts and parties involved. - Implement **mix networks** or **CoinJoin** techniques to obfuscate transaction flows and enhance privacy. - Employ **stealth addresses** to ensure that each transaction appears to be unique and untraceable. - **Access Control**: - Implement mechanisms to restrict access to data and functionalities based on user roles and permissions. - Use **attribute-based encryption (ABE)** to enforce access control policies where decryption keys are tied to specific attributes. - Implement **role-based access control (RBAC)** or **attribute-based access control (ABAC)** systems to manage permissions. - **MISC**: comments on why various zkVM's fail to be zero-knowledge. - **Jolt**: uses a version of the polynomial commitment scheme Hyrax which does not satisfy zero-knowledge (from [FAQ on Jolt](https://a16zcrypto.com/posts/article/faqs-on-jolts-initial-implementation/)). The FAQ does offer suggestions of approaches that could be applied to make Jolt zero-knowledge, but this is likely to introduce a new complexity bottleneck. - **Valida**: In their [March notes](https://www.lita.foundation/blog/valida-march-review-notes), the Valida team claims that the zkVM can be made zero-knowledge by adding "blinding factors at appropriate points." *I suspect that this is a typo, and that the team meant **hiding** factors - [PSE notes](https://github.com/privacy-scaling-explorations/zkvm-ideas/blob/main/zkvm-spec.md) that Spartan compression of folding steps from Nova is not zero-knowledge. Potentially, this could be replaced with a different compression step to gain zero-knowledge, but is likely to come with an increased complexity on the Verifier. - This is likely why **Nexus** does not satisfy zero-knowledge. ## Primitives - **Commitment Schemes** - General Properties: - Commitment schemes allow a prover to commit to a value while keeping it hidden, with the ability to reveal the committed value later. - They provide two essential properties: - Binding: Once a value is committed, it cannot be changed. - Hiding: The committed value is not revealed until the commitment is opened. - Examples: - Pedersen commitments/Pedersen vector commitments: They offer both hiding and binding properties and is commonly used in zk-SNARK constructions due to their efficiency and security. **Implementation: generate the commitment using a generator and a random value then reveal the committed value and the random value to open the commitment.** - [Dory](https://eprint.iacr.org/2020/1274.pdf) is a method of compressing vectors from two different groups into a single group element by means of bilinear pairings. Essentially, IPA using pairings. **Implementation: apply bilinear pairings to reduce vector sizes in zk-SNARKs.** - **Polynomial Commitments** - [**KZG; pg 7**](https://www.iacr.org/archive/asiacrypt2010/6477178/6477178.pdf) - Polynomial commitment scheme for univariate polynomials. - KZG requires a trusted ceremony to generate the public parameters (CRS); this can be done using a [power of tau ceremony](https://eprint.iacr.org/2022/1592.pdf). Often, this requires multiple rounds of volunteers participating to generate 'trustless' parameters. - The CRS can be generated term wise (based on the exponent of the variable X) or with Lagrange basis terms. - Efficient with succinct proofs. - **Implementation: generate the CRS using a multi-party computation (MPC) ceremony, then commit to the polynomial and generate proofs using the CRS.** - [**Halo IPA polynomial commitment; pg 7-9**](https://eprint.iacr.org/2019/1021.pdf) - Univariate polynomial commitment scheme. - Does not rely on a trusted ceremony (no trapdoors). This is achieved with specialized hash-function as [explained by Boneh](https://www.iacr.org/archive/asiacrypt2001/22480516.pdf). - Allows for efficient aggregation of proofs. - **Implementation: use recursive composition to aggregate multiple proofs efficiently.** - Zcash's [implementation](https://github.com/zcash/halo2/blob/main/halo2_proofs/src/poly/commitment.rs) - [**Brakedown**](https://eprint.iacr.org/2021/1043.pdf) has been suggested as an improvement by Valida to use instead of FRI. - **Implementation: integrate Brakedown into existing FRI-based protocols for improved performance.** - [Hyrax](https://eprint.iacr.org/2017/1132.pdf) is a polynomial commitment scheme for multilinear polynomials. - **Implementation: use in applications requiring complex polynomial commitments, such as zk-rollups.** - **Vector Commitments** - Merkle Trees - Verkle Trees - [Verkle trees](https://verkle.info/) can be used to replace Merkle trees to compress the size (and growth) of a Merkle tree. Specifically, Verkle trees are $k$-nary trees. While theoretically, Merkle trees could be used as $k$-nary trees, this is not practical as the proof size (and Verifier work) has complexity $O(k\log(n))$ where $n$ is the number of leaves. Instead, Verkle trees leverage polynomial commitments to compress communication to $\log(n)$. - Combine Merkle trees and vector commitments to compress proof sizes. - Allow efficient verification in large datasets, reducing computational and storage overhead - [Ethereum's implementation](https://github.com/ethereum/go-verkle) of Verkle trees in Golang. - **Homomorphic Encryptions** - General properties: - Enable computations on encrypted data without decrypting it, ensuring data privacy. - Many of commitment schemes (and polynomial commitments) allow for homomorphic encryptions with respect to addition. However, multiplication is difficult; the issues with multiplication is why bilinear pairings are used. - Examples: - **Partially Homomorphic Encryption (PHE)**: - Supports either addition or multiplication but not both. - Paillier Encryption: Allows addition of ciphertexts.(for privacy-preserving summation operations) - RSA Encryption: Allows multiplication of ciphertexts.(for applications requiring multiplicative operations on encrypted data) - **Fully Homomorphic Encryptions (FHE)**: - Supports both addition and multiplication on encrypted data. - In instances in which multiplication is needed in elliptic curve settings, bilinear pairings are leveraged. Sigmabus (explained below) could be used to separate the multiplication proof out into a $\Sigma$-protocol (potentially) allowing us to avoid expensive pairings. - Gentry’s FHE scheme, CKKS (Cheon-Kim-Kim-Song) for approximate arithmetic. - **Implementation: Employ FHE for complex computations on encrypted data and use CKKS for applications requiring approximate arithmetic, such as encrypted analytics.** - **Multi-Party Computation (MPC)** - General properties: - Allows multiple parties to jointly compute a function over their inputs while keeping those inputs private. - Examples: - **Secret Sharing ([Shamir’s Secret Sharing](https://people.eecs.berkeley.edu/~daw/teaching/cs276-s04/22.pdf))**: - Divides a secret into multiple shares, requiring a subset of shares to reconstruct the secret. - Used in distributed key management systems to ensure no single party has full control over a key. - **[Garbled Circuit](https://cronokirby.com/posts/2022/05/explaining-yaos-garbled-circuits/)**: - A cryptographic protocol for secure two-party computation. - Applied in privacy-preserving context where two parties compute a model without revealing their datasets. - **Threshold Cryptography** - General properties: - A cryptographic approach where a minimum number of parties (threshold) is required to perform a cryptographic operation. - Examples: - **Threshold Signatures**: - Require a subset of parties to jointly produce a valid signature. - Used in distributed ledger systems to enhance security and fault tolerance. - **Distributed Key Generation (DKG)**: - Generate cryptographic keys in a distributed manner without a single point of failure. - Employed in decentralized systems to securely generate and manage cryptographic keys. - **Anonymous Credentials** - General properties: - Allow users to prove their credentials without revealing their identity. - Examples: - **Ring Signatures**: - Enable a user to sign a message on behalf of a group, without revealing which member signed it. - **Group Signatures**: - Allow a group member to sign a message on behalf of the group, with anonymity. - **Blind Signatures**: - Enable a user to get a message signed by another party without revealing the message. - **Arithmetic Circuits** - Variety of constraint systems: - [Plonkish](https://eprint.iacr.org/2019/953.pdf) - Handles multiplication and addition gates. - Each plonkish constraint are of the form $s_a A + s_b B + s_m AB + s_o C + s_k K = 0$. - Use Plonkish constraints for zk-SNARKs requiring flexible gate handling. - R1CS (Rank-1 Constraint System) and QAP (Quadratic Arithmetic Program) - Handles multiplication and addition gates. - Addition constraints are compressed into the multiplication constraints. This way the degree of each gate is quadratic. - [SAP](https://eprint.iacr.org/2017/540.pdf) (Squaring Arithmetic Program) - This constraint system is not used much. However, it has been used for a purported 'improvement' of Groth16 called [Polymath, pg 13](https://eprint.iacr.org/2024/916). - Operation gates: addition and squaring; multiplication gates have to be represented as $xy = 1/2 (x+y)^2 - 1/2(x-y)^2.$ - Note: If polymath was used as a replacement for the compression layer, this should not be a 'problem' that Polymath uses SAP instead of R1CS. Though, the circuit will have more constraints. - [CCS](https://eprint.iacr.org/2023/552.pdf) (customizable constraint system) - Theoeretical framework that generalizes R1CS which covers Plonkish representation as well. Included for completeness sake, but not necessary. - **Bilinear Pairing** - Bilinear pairings are used heavily for a variety of zkSNARKs and commitment schemes. They enable complex cryptographic proofs and protocols. - Examples: BLS Signatures (use bilinear pairings for creating compact, aggregateable signatures) and Groth16 (zk-SNARK protocol leveraging bilinear pairings for succinct proofs). - **Hash Functions** - Algebraic hash functions such as [Poseidon](https://eprint.iacr.org/2019/458.pdf) and [Poseidon2](https://eprint.iacr.org/2023/323.pdf) are used over cryptographic hash functions due to their convenient representation as arithmetic circuits. Specifically, these minimize overhead induced by traditional hash functions (e.g. Sha2). - [Sigmabus, pg 5](https://eprint.iacr.org/2023/1406.pdf) suggests techniques to separate out hash function computations to offload them to the Prover by means of $\Sigma$-protocols. [A proof-of-concept implementation of Sigmabus.](https://github.com/arnaucube/sigmabus-poc) **Note: this is not MIT license** - **Compression Layer for proofs** - Many zkVMs (e.g., Nexus) make use of a compression round after utilizing a folding scheme. Specifically, folding proofs can be large and Groth16 is used compress the proof size down to a constant size. Alternatively, it may be of interest to investigate Polymath for this purpose. Polymath purports to output a smaller proof (in terms of bits) than Groth16. - The use of Groth16 requires a trusted-setup to generate the public parameters. This can be done in a 'trustless' manner by using ceremonies; PSE has repos for this: [DefinitelySetup](https://github.com/privacy-scaling-explorations/DefinitelySetup) and [p0tion](https://github.com/privacy-scaling-explorations/p0tion). If Polymath is deemed viable, then these repos can be modified for Polymath as it is an optimized version of Groth16 with respect to the number of trapdoor variables. - [Thread](https://github.com/zcash/zcash/issues/714) discusses the a potential security issue of using STARK to SNARK. This concern is additionally raised by [RISC0](https://dev.risczero.com/api/security-model), but no solution is suggested. - **FRI** - A new proximity test called [STIR, pg 20-21](https://eprint.iacr.org/2024/390.pdf) (its [github repo](https://github.com/WizardOfMenlo/stir/)) offers better results compared to optimized versions of FRI. This potentially could be tested in place of systems that use FRI (such as Plonky's) to gain an improvement. It is (currently) unclear how STIR would compare to Brakedown as a replacement for FRI.