Brechy
    • Create new note
    • Create a note from template
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Write
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
    • Invite by email
      Invitee

      This note has no invitees

    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Note Insights
    • Engagement control
    • Transfer ownership
    • Delete this note
    • Save as template
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Versions and GitHub Sync Note Insights Sharing URL Create Help
Create Create new note Create a note from template
Menu
Options
Engagement control Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Write
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
  • Invite by email
    Invitee

    This note has no invitees

  • Publish Note

    Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

    Your note will be visible on your profile and discoverable by anyone.
    Your note is now live.
    This note is visible on your profile and discoverable online.
    Everyone on the web can find and read all notes of this public team.
    See published notes
    Unpublish note
    Please check the box to agree to the Community Guidelines.
    View profile
    Engagement control
    Commenting
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    • Everyone
    Suggest edit
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    Emoji Reply
    Enable
    Import from Dropbox Google Drive Gist Clipboard
       owned this note    owned this note      
    Published Linked with GitHub
    Subscribed
    • Any changes
      Be notified of any changes
    • Mention me
      Be notified of mention me
    • Unsubscribe
    Subscribe
    # Scaling Semaphore ![banner](https://hackmd.io/_uploads/ByvaBgb91x.jpg) *Thanks to [Vivian](https://github.com/vplasencia) for answering so many questions.* [Semaphore](https://docs.semaphore.pse.dev/) is a protocol that allows you to cast a message as a provable group member without revealing your identity. When dealing with large groups it's not possible to run the protocol as it is normally used, mostly on end user's devices to maintain privacy. This introduces storage, computation and bandwidth requirements that make it impractical. Worldcoin [World ID](https://world.org/blog/worldcoin/world-id-faqs) (~11M users) currently [uses semaphore](https://worldcoin.org/blog/worldcoin/intro-zero-knowledge-proofs-semaphore-application-world-id) for their protocol. The solution they found for this issue was to delegate the proving to their servers. The downside of this solution is that users' identities are no longer private, since they need to be disclosed to the server to generate the proofs. In this document I'll try to expand on this issue and present some possible solutions. ## Semaphore Protocol The Semaphore flow, as defined in the [protocol specifications](https://hackmd.io/@vplasencia/B1sCrsoFkg), is the following: 1. The user generates an identity. 2. The identity is added to the group. 3. An anonymous membership proof can be generated using the identity and group. 4. A nullifier is generated to uniquely represent the user's zk proof while ensuring anonymity. 5. The user can attach an anonymous message (such as a vote, text message). ![flow](https://hackmd.io/_uploads/BJxL4ajYye.png =50%x) Our interest lies in the third step of the process, this is, the creation of the anonymous membership proof. In order to do this, you need both the **identity** and the **group**. - **Identity (commitment)**: public Semaphore identity value used in Semaphore groups. - **Group**: [Lean Incremental Merkle Tree](https://github.com/privacy-scaling-explorations/zk-kit/tree/main/papers/leanimt) in which each leaf is an identity commitment for a user. First, a Merkle proof of inclusion is generated. Then, this is input to a circuit that generates a zero-knowledge proof that can then be used to anonymously prove that you're part of the group. ## Membership Proof A membership proof lets you verify that a specific leaf is part of the LeanIMT. It consists of: - **Leaf value**: The member you want to prove is included. - **Sibling hashes**: A list of the neighboring node hashes encountered on the path from the leaf to the root. - **Path information (or index)**: A series of bits or an index that indicates at each level whether the leaf was a left or right child, which tells you how to combine it with its sibling when recomputing the hash. - **Merkle root**: The final root hash of the tree, against which you verify the recomputed result. Using this data, you can iteratively hash the leaf with its siblings (in the order specified by the path) to rebuild the root. If the rebuilt root matches the given Merkle root, the proof is valid and the leaf is confirmed to be part of the tree. If we give away this information, our identities are revealed. If we don't want this to happen, we have to generate a zero knowledge proof that we belong to the group. ## Zero Knowledge Proof To generate the zk-proof we can use the [Semaphore circuit](https://github.com/semaphore-protocol/semaphore/blob/main/packages/circuits/src/semaphore.circom), which takes as some of it's inputs: ```circom signal input merkleProofLength, merkleProofIndices[MAX_DEPTH], merkleProofSiblings[MAX_DEPTH]; ``` Is clear then that any plan to delegate the generation of this proof should be made around a [Private Proof Delegation](https://github.com/privacy-scaling-explorations/private-proof-delegation-docs) scheme. The current solution is to generate the proof locally. As I mentioned earlier, this could present a storage and/or computational limitation. ## Storage The Semaphore v4 protocol supports trees with [depth up to 32](https://github.com/semaphore-protocol/semaphore/blob/f67958349810b0d6cda8d77488e8371f8d6f3985/packages/utils/src/constants.ts#L7). To calculate the amount of nodes in a binary tree with depth $d$ we can use the following equation: $$ 2^{d+1}-1 $$ In our case, $$ 2^{32+1}-1 = 8,589,934,591 $$ Each leaf contains a Poseidon hash over a [BN254 field](https://github.com/chancehudson/poseidon-lite), with elements of 32 bytes each. $$ 8589934591 \times 32 = 274,877,906,912 \; bytes $$ So the full size is `~256 GB`. But ok, let's assume $8.5$ billion IDs is a lot, for a tree of depth $23$ the group can hold $16,777,215$ participants, not too far away from World-ID. In this case, the group would be of size $536,870,880 \; bytes$ or `~500 MB`. > 💭 We can make the proof only with some leaves and hashes, but if we ask for this *specific* data we'll be giving away our identity. In the latter case, it's not too unrealistic to download that amount of data if it's something the user has to do just **once**, but it already is prohibitely expensive for the rest of applications. ## Computation The computational side of things don't seem to be a limitation. Even though the the [Poseidon Hash circuit](https://github.com/iden3/circomlib/blob/master/circuits/poseidon.circom) seems to be quite expensive, the [Binary Merkle Root circuit](https://github.com/privacy-scaling-explorations/zk-kit.circom/blob/9d1bbca137ee47552a63c0c3c24f8f7dad0ccfd9/packages/binary-merkle-root/src/binary-merkle-root.circom#L18) only runs it at most `MAX_DEPTH` (32) times. ```circom for (var i = 0; i < MAX_DEPTH; i++) { var isDepth = IsEqual()([depth, i]); roots[i] <== isDepth * nodes[i]; root += roots[i]; var c[2][2] = [ [nodes[i], siblings[i]], [siblings[i], nodes[i]] ]; var childNodes[2] = MultiMux1(2)(c, indices[i]); nodes[i + 1] <== Poseidon(2)(childNodes); } ``` ### Benchmarks Using the [semaphore v4 benchmarks](https://github.com/vplasencia/semaphore-benchmarks/blob/main/node/src/generate-proof.ts) I calculated some average times for the proof generation for different groups. Locally ran in an M1 MacBook Pro. | Depth | IDs | Proof Generation Time (ms) | |-------|-----------|------------------------------| | 9 | 1,000 | 352.17517 | | 13 | 15,000 | 452.96763 | | 16 | 130,000 | 496.71687 | | 19 | 1,000,000 | 583.64228 | Based on these numbers, it seems like the computation necessary to generate the proofs for large groups is not a blocker. This allows us to focus on the privacy issues related to obtaining the needed data. ## Solutions ### Fetch Merkle Path Using PIR As we've seen in the **Storage** section, relatively large groups prevents the user from being able to store the whole tree. So our main assumption here will be that the tree is stored in a server. To generate the membership proof we'll have to fetch the Merkle Path (necessary data to make the merkle proof) corresponding to the identity commitment. If we request this information then we would be revealing who we are to the server. A way to work around this issue is [Private Information Retrieval (PIR)](https://crypto.stanford.edu/pir-library/). PIR allows you to perform *private queries*, this is, ask a database owner for some data in a way that they can't tell what is you requested, but still being able to give it to you. Of course, this comes with some performance overhead. Then, we could use a PIR scheme for requesting the leaves of the tree that we need. This comes with it's own challenges, since in order to do that we should know beforehand **what leaves to ask for.** Since we're not storing the tree, we don't know it's structure and the position of our identity commitment. A possible solution to this is to perform a two-party computation with the server to execute a function that returns the merkle path leaves indices, with our commitment as input. Once we now the merkle path, we can then use PIR to fetch for the commitments and hashes. ### Private Proof Delegation It could be possible to delegate the proof generation using a of Private Proof Delegation scheme. This is a current research area in PSE, and it would allow to obtain the proof without the server being able to *read* the inputs. ### Merkle Forests Merkle Forests can reduce the cost of the proof generation by lowering the anonimity degree, in a way that the user proves membership of a sub-tree instead of the whole group. There is an [implementation](https://github.com/samzkback/merkle-forest) with this solution, which also includes *elastic groups*, that is, allowing the user to choose their anonimity degree, choosing groups of different sizes. ### Merkle Forests + PIR One option to achieve full privacy with Merkle Forests is fetching them using PIR. If we have large enough sub-trees, then the user would be able to fetch every other sub-tree root to create a full proof. Their own sub-tree could be fetched using PIR. Sub-trees should be large enough so we can have a small set of sub-tree roots, but small enough so they can be fetched using PIR. Below are some SOTA PIR practical settings: - Frodo PIR - DB Size: ~$2^{20}$ elements - Element size: `1 KB` - Respire - DB Size: ~$2^{20}$ elements - Element size: `256 B` - Spiral - Public Parameters: `14 ~ 47 MB` - Element size `100 KB` - Database size: Should be adjusted by pre-processing time. 16.8 s for a 256 MB database, and 569 s for an 8 GB database. PIR seems to be optimized for many elements of small size, instead of few large elements DBs, so we won't be able to have large trees. Instead, we can reduce the anonimity of the whole set, to sub-trees of maximum size of $2^{20}$ elements, with `32 B` elements. It would be possible to perform PIR on these sub-trees, and then perform the proof of membership over this sub-tree only. This will have to be complemented with a proof of membership of the sub-tree root, that can be calculated server-side. A great insight from [Keewoo](https://keewoolee.github.io/), the structure of the LeanIMT allows us to: - Know our index when we join the group. - Know the tree structure with the amount of leaves. It is an append-only tree so the structure is deterministic. This allows us to recreate the merkle path indices just by knowing the amount of leaves in the tree. We can then request these indices using PIR to calculate the proof. ## Initial Exploration Path The first route we will work on, this is implement a PoC and benchmark it, is the **Merkle Forests + PIR** proposed solution. We can split this into two separate parts, one for PIR for trees of up to $2^{20}$ leaves and then extend this PIR sub-trees to reach up to $2^{32}$. The first step would be implement the [LeanIMT](https://github.com/vplasencia/leanimt-rs) **blind** merkle path retrieval. This is what would allow us to know the nodes indices that belong to our merkle path, without fetching the tree. The next step would be to implement PIR for the LeanIMT. The chosen PIR scheme is [Respire](https://eprint.iacr.org/2024/1165), for which one of the authors (David J. Wu) gave a great presentation for a [PSE L&S session](https://www.youtube.com/watch?v=Nf4IZ2kTPN4). The reasons are: - This scheme is tailored for databases of small records. - It has no offline phase. - It supports batch queries. - It has a [working implementation](https://github.com/AMACB/respire). The semaphore group structure can then be modified in a way that these trees are sub-trees, then extending the amount of participants a single group can hold. If the roots of all sub-trees can be stored by the user, then proofs can still be done without revealing the sub-tree the identity commitment is. ## Resources - [Semaphore Specs](https://hackmd.io/@vplasencia/B1sCrsoFkg) - [Merkle Tree Size Sheet](https://docs.google.com/spreadsheets/d/17gvQMLuZ6jXnWRCFNOQCVl-ZJr18FoTHZVX1cGrf-po/edit?gid=0#gid=0) - [Lean Incremental Merkle Tree](https://github.com/privacy-scaling-explorations/zk-kit/tree/main/papers/leanimt)

    Import from clipboard

    Paste your markdown or webpage here...

    Advanced permission required

    Your current role can only read. Ask the system administrator to acquire write and comment permission.

    This team is disabled

    Sorry, this team is disabled. You can't edit this note.

    This note is locked

    Sorry, only owner can edit this note.

    Reach the limit

    Sorry, you've reached the max length this note can be.
    Please reduce the content or divide it to more notes, thank you!

    Import from Gist

    Import from Snippet

    or

    Export to Snippet

    Are you sure?

    Do you really want to delete this note?
    All users will lose their connection.

    Create a note from template

    Create a note from template

    Oops...
    This template has been removed or transferred.
    Upgrade
    All
    • All
    • Team
    No template.

    Create a template

    Upgrade

    Delete template

    Do you really want to delete this template?
    Turn this template into a regular note and keep its content, versions, and comments.

    This page need refresh

    You have an incompatible client version.
    Refresh to update.
    New version available!
    See releases notes here
    Refresh to enjoy new features.
    Your user state has changed.
    Refresh to load new user state.

    Sign in

    Forgot password

    or

    By clicking below, you agree to our terms of service.

    Sign in via Facebook Sign in via Twitter Sign in via GitHub Sign in via Dropbox Sign in with Wallet
    Wallet ( )
    Connect another wallet

    New to HackMD? Sign up

    Help

    • English
    • 中文
    • Français
    • Deutsch
    • 日本語
    • Español
    • Català
    • Ελληνικά
    • Português
    • italiano
    • Türkçe
    • Русский
    • Nederlands
    • hrvatski jezik
    • język polski
    • Українська
    • हिन्दी
    • svenska
    • Esperanto
    • dansk

    Documents

    Help & Tutorial

    How to use Book mode

    Slide Example

    API Docs

    Edit in VSCode

    Install browser extension

    Contacts

    Feedback

    Discord

    Send us email

    Resources

    Releases

    Pricing

    Blog

    Policy

    Terms

    Privacy

    Cheatsheet

    Syntax Example Reference
    # Header Header 基本排版
    - Unordered List
    • Unordered List
    1. Ordered List
    1. Ordered List
    - [ ] Todo List
    • Todo List
    > Blockquote
    Blockquote
    **Bold font** Bold font
    *Italics font* Italics font
    ~~Strikethrough~~ Strikethrough
    19^th^ 19th
    H~2~O H2O
    ++Inserted text++ Inserted text
    ==Marked text== Marked text
    [link text](https:// "title") Link
    ![image alt](https:// "title") Image
    `Code` Code 在筆記中貼入程式碼
    ```javascript
    var i = 0;
    ```
    var i = 0;
    :smile: :smile: Emoji list
    {%youtube youtube_id %} Externals
    $L^aT_eX$ LaTeX
    :::info
    This is a alert area.
    :::

    This is a alert area.

    Versions and GitHub Sync
    Get Full History Access

    • Edit version name
    • Delete

    revision author avatar     named on  

    More Less

    Note content is identical to the latest version.
    Compare
      Choose a version
      No search result
      Version not found
    Sign in to link this note to GitHub
    Learn more
    This note is not linked with GitHub
     

    Feedback

    Submission failed, please try again

    Thanks for your support.

    On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

    Please give us some advice and help us improve HackMD.

     

    Thanks for your feedback

    Remove version name

    Do you want to remove this version name and description?

    Transfer ownership

    Transfer to
      Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

        Link with GitHub

        Please authorize HackMD on GitHub
        • Please sign in to GitHub and install the HackMD app on your GitHub repo.
        • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
        Learn more  Sign in to GitHub

        Push the note to GitHub Push to GitHub Pull a file from GitHub

          Authorize again
         

        Choose which file to push to

        Select repo
        Refresh Authorize more repos
        Select branch
        Select file
        Select branch
        Choose version(s) to push
        • Save a new version and push
        • Choose from existing versions
        Include title and tags
        Available push count

        Pull from GitHub

         
        File from GitHub
        File from HackMD

        GitHub Link Settings

        File linked

        Linked by
        File path
        Last synced branch
        Available push count

        Danger Zone

        Unlink
        You will no longer receive notification when GitHub file changes after unlink.

        Syncing

        Push failed

        Push successfully