Guy1m0
> **Disclaimer**: The benchmark settings for each framework have been determined solely based on my interpretation of their respective documentation and codebases. It is possible that misinterpretations have occurred, potentially leading to suboptimal environment configurations, improper testing data preprocessing, or incorrect parameter selection. Consequently, these factors may have influenced the accuracy of the benchmark results. If you detect any such errors or unexpected results, please do not hesitate to contact me via my Telegram account @Guy1m0. I am eager to address any inaccuracies to ensure the benchmarks presented are as reliable and comprehensive as possible.

# Introduction

As a hacker who participated in multiple hackathons last year, I often found myself overwhelmed by the challenge of selecting a compelling topic to tackle. Sponsors typically offer several tracks related to zero-knowledge (ZK) proofs with attractive bounties, so how can one quickly develop a ZK-related application to win a bounty within such a limited timeframe?

![image](https://hackmd.io/_uploads/B1aXZv5h6.png)

This question has lingered in my mind since I came second in a ZK hackathon held by ABCDE Capital last year, where I created an anti-cheating ZKML prover system. As a hacker and ZK developer, I often wondered whether there are any blogs or resources that could make it easier for newcomers to enter this field and realize the power and intrigue of ZKML. After days of research, I found that [awesome zkml](https://github.com/worldcoin/awesome-zkml) is useful but lacks practical real-world applications, making it difficult for hackers to fully understand the capabilities of these frameworks.

To address this gap, I have developed a [ZKML benchmark](https://github.com/Guy1m0/ZKML-Benchmark) as part of the Ethereum Foundation ESP Grant Proposal [FY23-1290]. This benchmark is designed to help developers understand the trade-offs and performance differences among various frameworks.
While many frameworks offer their own benchmark sets, making direct comparisons is complicated by the numerous variables that affect performance. My approach focuses on creating uniform conditions across all frameworks to provide a practical and straightforward comparison. Unlike the existing benchmark [results](https://blog.ezkl.xyz/post/benchmarks/) published by the EZKL team, which focus on traditional machine learning models such as linear regression, random forest classification, support vector machines (SVMs), and tree ensemble regression, this benchmark analyzes the selected zkML frameworks with an emphasis on networks that include **deep neural networks** (DNNs) and **convolutional neural networks** (CNNs).

# Benchmark Overview

This project has executed extensive testing over several months, evaluating 4 leading zkML frameworks against **6 different DNN and CNN architectures**. Through meticulous analysis of each framework across **2,500 datasets**, I have invested **over 250 hours** in setup and proof generation. The exhaustive results for each configuration are documented in this [CSV file](https://github.com/Guy1m0/ZKML-Benchmark/blob/main/benchmarks/benchmark_results.csv), providing a comprehensive resource for comparison.

## Architecture

The testing networks consist of Deep Neural Networks (DNNs), each with an input layer followed by two or three fully connected dense layers, and Convolutional Neural Networks (CNNs), which start with two-dimensional input shapes and employ Conv2D and AvgPooling2D layers to reduce spatial dimensions before flattening.

The nomenclature for each model is derived from the size of its layers, separated by underscores ('_'). For example, the model named "784_56_10" denotes an input size of 784 (corresponding to MNIST dataset images, which are 28x28 pixels in grayscale), followed by a dense layer with 56 units, and ending with an output layer designed for 10 distinct classes.
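To make the naming convention concrete, here is a minimal sketch (my own illustration, not the benchmark's code) that derives the trainable-parameter count of a dense model directly from its name: each dense layer of shape `(n_in, n_out)` contributes `n_in * n_out` weights plus `n_out` biases. The totals agree with the parameter counts reported later in the analysis.

```python
def dnn_param_count(name: str) -> int:
    """Trainable parameters of a fully connected model named like '784_56_10'.

    Each consecutive pair (n_in, n_out) is a dense layer with
    n_in * n_out weights and n_out biases.
    """
    sizes = [int(s) for s in name.split("_")]
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

print(dnn_param_count("784_56_10"))  # 784*56+56 + 56*10+10 = 44530
print(dnn_param_count("196_25_10"))  # 196*25+25 + 25*10+10 = 5185
```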
The CNN network '28_6_16_10_5' represents an input size of 28x28, followed by two Conv2D layers with 6 and 16 filters, respectively. After the flatten layer processes the output of the previous layer (256 values), a dense layer outputs a 10-class inference. The '\_5' at the end specifies that the Conv2D layers use a 5x5 kernel. Below is the detailed structure of the model '28_6_16_10_5':

```
Model: "28_6_16_10_5"
_______________________________________________________________________
Layer (type)                              Output Shape          Param #
=======================================================================
input_1 (InputLayer)                      [(None, 28, 28, 1)]   0
conv2d (Conv2D)                           (None, 24, 24, 6)     156
re_lu (ReLU)                              (None, 24, 24, 6)     0
average_pooling2d (AveragePooling2D)      (None, 12, 12, 6)     0
conv2d_1 (Conv2D)                         (None, 8, 8, 16)      2416
re_lu_1 (ReLU)                            (None, 8, 8, 16)      0
average_pooling2d_1 (AveragePooling2D)    (None, 4, 4, 16)      0
flatten (Flatten)                         (None, 256)           0
dense (Dense)                             (None, 10)            2570
=======================================================================
Total params: 5,142 (20.09 KB)
Trainable params: 5,142 (20.09 KB)
Non-trainable params: 0 (0.00 Byte)
_______________________________________________________________________
```

## Primary Metrics

* **Accuracy Loss:** Measures how well the inference generated by each framework preserves the predictions of the original neural network. Lower accuracy loss is preferable.
* **Memory Usage:** Tracks the peak memory consumption during proof generation, indicating the system's resource demand.
* **Proving Time:** The time required by each framework to generate a proof, essential for gauging the proof system's efficiency.

Note: Proof verification time is considered beyond the scope of this analysis.

## Selected Frameworks

- **EZKL (Halo 2)**
- **ZKML (Halo 2)**
- **Circomlib-ml (R1CS Groth16)**
- **opML (Fraud Proof)**

These frameworks were selected based on criteria such as GitHub popularity, the proof system used, and support for different ML model formats.
This variety ensures a broad analysis across distinct zk proof systems.

**Note on Orion Exclusion:** The proof generation process for Orion, developed by Gizatech, is executed on the Giza platform. As a result, the memory usage and time cost metrics during proof generation are not directly comparable with those of the other frameworks evaluated in this project. To maintain the integrity and comparability of our benchmarking analysis, Orion's benchmark results are excluded from the subsequent sections.

**Note on zkLLVM Exclusion:** As TaceoLabs only recently unveiled their zkML compiler as an extension for zkLLVM, accompanied by a [tutorial](https://blog.taceo.io/nil-tutorial/) published in **March 2024**, it was not included in our benchmarking shortlist at the start of this project. Consequently, its performance on the MNIST dataset will be explored and considered in future work.

## Benchmarking Design

Our benchmarking involves tasks on the MNIST dataset for evaluating frameworks under varying complexity levels:

* **MNIST Dataset:**
  - Simplicity of task: The MNIST dataset, comprising handwritten digits, serves as a benchmark for assessing the basic capabilities of zkML frameworks.
  - Framework assessment: This task gauges how each framework manages simple image data in terms of maintaining accuracy and operational efficiency.
  - Parameter variation: The frameworks are tested on this dataset with an increasing number of parameters and layers, pushing the boundaries of each framework's capacity.
* **Neural Network Design:**
  - To rigorously assess the capability of each framework in translating neural networks into zk circuits, we have designed six distinct models, spanning from **3-layer DNNs** to **6-layer CNNs**, including one of the earliest pre-trained CNN models, proposed by LeCun et al. in 1998: **LeNet**.
    These models are specifically chosen to represent two different scales of complexity, with approximately **5,200 and 44,500 trainable parameters**, respectively.
  - Given the diversity in zkML framework compatibilities, with some frameworks exclusively supporting TensorFlow or PyTorch, establishing uniform testing conditions extends beyond merely standardizing the structures. Given the limitations of existing conversion tools such as ONNX, which struggles with seamless model transitions between TensorFlow and PyTorch, a manual approach was adopted to carefully **transfer weights and biases** in a manner that aligns with the computational paradigms of both TensorFlow and PyTorch.
  - Our measurements concentrate exclusively on the **proof generation phase**, deliberately omitting data pre-processing and system setup steps.

# Results

In this benchmark, we utilized three primary metrics to gauge the performance of zkML frameworks: accuracy loss, memory usage, and proving time. While each is quantifiable and comparable, assessing the overall superiority of one framework can be complex. For example, a framework might exhibit minimal accuracy loss at the expense of impractical memory usage and proving time. To address this complexity, we normalized the results for each framework on the same model and visualized them through radar charts. This visualization strategy provides a more intuitive understanding of each framework's performance, balancing the three quantifiable metrics. It is important to note that this methodology treats accuracy loss, memory usage, and proving time with equal weight, which may not always be practical in real-world scenarios.

**Note**: Consistent color coding has been applied to represent each framework's performance across the six neural network models tested. The benchmark results have been normalized, with the best-performing framework in each metric scoring 1.0 on the radar chart.
Therefore, the larger the area of the triangle formed by the normalized data on accuracy loss, memory usage, and proving time, the better the framework's performance.

## Performance on DNN Models

In the bar charts below, we present the performance outcomes for each framework across the various DNN models, spanning the three metrics.

![dnn_comparisons](https://hackmd.io/_uploads/rylzUACCaT.png)

**Note on EZKL:** EZKL offers two modes: 'accuracy', which aims to minimize accuracy loss using a larger scaling factor, and 'resource', which is optimized for resource-constrained systems, achieving acceptable accuracy loss with good efficiency. The 'accuracy' mode of EZKL, when benchmarked on the model '196_24_14_10', causes a system crash by exceeding 128 GB of memory. We have excluded this test set from the benchmark and will include it in a future update once the issue is resolved.

Subsequently, we normalized these results, as depicted in the radar charts. It is apparent that Circomlib-ml distinguishes itself in these benchmarks, striking a balance between accuracy loss, proving time, and memory usage, the critical factors for generating proofs in zkML applications.

![3_dnn_models](https://hackmd.io/_uploads/H1OuA0C6T.png)

**Note on opML:** opML's approach to machine learning inference differs from that of the other zkML frameworks in this benchmark. Typically, zkML processes involve the computation of ML inference followed by the generation of zk proofs for those inferences, culminating in a verifiable zkML proof. In contrast, opML executes ML inference within a virtual machine and outputs a Merkle root that represents the VM state. This Merkle root serves as a commitment to the computed state, and **a fraud proof is only generated if this commitment is challenged**. Thus, the benchmarked computation costs for opML (memory usage and proving time) reflect running the ML model within the VM environment, not generating any proofs.
Below is a tabulation of opML's performance metrics on DNN models:

| Architecture | Accuracy Loss (%) | Avg Memory Usage (MB) | Std Memory Usage | Avg Proving Time (s) | Std Proving Time |
|---|---|---|---|---|---|
| Input-Dense-Dense (196x25x10) | 0.00 | 70.88 | 2.09 | 0.86 | 0.08 |
| Input-Dense-Dense (784x56x10) | 0.00 | 87.32 | 2.47 | 3.60 | 0.43 |
| Input-Dense-Dense-Dense (196x24x14x10) | 0.04 | 69.92 | 1.69 | 0.85 | 0.07 |

## Performance on CNN Models

In the bar charts below, we present the performance outcomes for each framework across the various CNN models, spanning the three metrics.

![image](https://hackmd.io/_uploads/rJeWeY5ahT.png)

Subsequently, we normalized these results, as depicted in the radar charts. As opML currently does not support the Conv2D operator, only three frameworks are included in this set of benchmarks. The charts clearly indicate that EZKL, even in 'resource' mode, dominates this testing suite across all three metrics.

![3_cnn_models](https://hackmd.io/_uploads/S1ACcyVhp.png)

# Analysis

A detailed examination of the performance across the six neural network models was conducted to understand how structural variations influence the key metrics of accuracy loss, memory usage, and proving time. We selected **4 types of variations** to scrutinize the resulting shifts in performance.
The following parameters are pertinent to our analysis:

| Model Name | Trainable Parameters | Non-Linear Constraints* | Trusted-setup Time Cost (s)* |
|---|---|---|---|
| `784_56_10` | 44,530 | 73,416 | 690.15 |
| `196_25_10` | 5,185 | 18,075 | 165.50 |
| `196_24_14_10` | 5,228 | 24,826 | 169.31 |
| `28_6_16_10_5` | 5,142 | 2,558,720 | 24,106.89 |
| `14_5_11_80_10_3` | 4,966 | 523,312 | 2,941.04 |
| `28_6_16_120_84_10_5` | 44,426 | 2,703,268 | 24,230.47 |

**Note:** The values for Non-Linear Constraints and Trusted-setup Time Cost are specific to the Circomlib-ml framework, which requires a trusted setup for zero-knowledge proof generation under the Groth16 proof system.

## Number of Layers in DNN

To investigate the frameworks' sensitivity to an increased number of layers, models with nearly identical parameter counts were compared; their performance variations are depicted in the bar charts.

![num_layer-varying_dnn](https://hackmd.io/_uploads/HJ-Fjca3a.png)

The bar charts reveal a noticeable increase in accuracy loss for the EZKL framework operating in resource mode. Surprisingly, Circomlib-ml shows a reduction in memory usage despite a higher count of non-linear constraints, which contradicts the expected trend of increased complexity.

## Number of Parameters in DNN

Increasing the number of parameters from model '196_25_10' to '784_56_10' resulted in the anticipated rise in memory usage and proving time, with the exception of the zkml framework by Daniel Kang, whose performance remains consistent.

![num_param-varying_dnn](https://hackmd.io/_uploads/SJsPscTnT.png)

## Change from Conv2D to Dense Layer

The substitution of Conv2D layers with Dense layers in the tested models has yielded some interesting insights into the adaptability of different zkML frameworks.
![same-layer-param_cnn-dnn](https://hackmd.io/_uploads/BksQj5T2a.png)

Although the two models compared have approximately the same number of trainable parameters and layers, the bar charts above indicate that this variation results in a significant performance discrepancy. Specifically, the accuracy loss for ezkl surged from 0.0% to 17.68%, and the proving time for zkml escalated by 50.64% (from 14.715 s to 22.168 s), while Circomlib-ml uniquely benefits from this structural change.

![same-layer-param_cnn-dnn](https://hackmd.io/_uploads/Hy0Hs5Tn6.png)

This divergence in performance is further illustrated in the radar charts, which show a contraction in the performance areas of both zkml and ezkl when normalized against the best-performing framework. One possible reason for the performance enhancement observed for Circomlib-ml is the considerable reduction in non-linear constraints, as detailed in the table below:

| Model Name | Trainable Parameters | Non-Linear Constraints |
|---|---|---|
| `196_24_14_10` | 5,228 | 24,826 |
| `28_6_16_10_5` | 5,142 | 2,558,720 |

### Removal of Conv2D Layers

A similar trend is noted with the elimination of two Conv2D layers and a single Dense layer. Simplifying the architecture while maintaining the parameter count intuitively suggests a streamlined network. However, this results in a notable decrease in performance, suggesting that both frameworks, particularly ezkl in resource mode, benefit from the inclusion of Conv2D layers in neural networks.

![same-param_cnn-dnn](https://hackmd.io/_uploads/Bk26qqa26.png)

As shown above, this is also evident in the normalized data used for the radar chart visualization.

![same-param_cnn-dnn](https://hackmd.io/_uploads/rJhC5q6hp.png)

The significant decrease in non-linear constraints is again reflected in the improved performance of Circomlib-ml.
| Model Name | Trainable Parameters | Non-Linear Constraints |
|---|---|---|
| `784_56_10` | 44,530 | 73,416 |
| `28_6_16_120_84_10_5` | 44,426 | 2,703,268 |

# Summary

Throughout this comprehensive benchmarking effort, we rigorously evaluated four leading zkML frameworks across six DNN and CNN network architectures. Our analysis has surfaced several noteworthy insights into the performance dynamics of these frameworks. For a detailed exposition of our findings, please consult the complete report at the following [link](https://github.com/Guy1m0/ZKML-Benchmark/blob/main/docs/benchmark_report.md).

A principal discovery is the marked divergence in performance between the two zk systems, zk-SNARKs (Groth16) and Halo2, particularly in their handling of neural networks. Circomlib-ml's approach to transpiling the 'Conv2D' operator into circom incurs significant resource demands, whereas ezkl and zkml were more influenced by variations in the number of layers and network parameters.

Furthermore, the fraud-proof system underpinning the opML framework demonstrated considerable promise in image classification tasks. Despite current limitations in operator support, its performance on DNN models suggests it could be a viable alternative to traditional zkML frameworks, especially for applications where the any-trust assumption (a single honest validator can enforce correct behavior) is sufficient.

A critical aspect not examined in this benchmark is the scaling factor's influence on accuracy loss. Default settings were used for each framework:

| Name | Proof System | Scale Factor |
|---|---|---|
| EZKL (resource) | Halo 2 | 2^4 |
| EZKL (accuracy) | Halo 2 | 2^11 |
| DDKang ZKML | Halo 2 | 2^16 |
| Circomlib-ml | R1CS Groth16 | 10^18 |
| opML | Fraud Proof | 1* |

> 1*: Since the virtual machine used by opML supports floating-point calculation, no scaling is needed.
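To illustrate why the scale factor matters at all, here is a small sketch (my own illustration, not any framework's actual code): proof systems work over integers, so a real-valued weight `w` is typically encoded as `round(w * scale)`. A larger scale preserves more fractional precision, at the cost of larger encoded values and, in practice, more constraints.

```python
def quantize(w: float, scale: int) -> int:
    """Encode a real-valued weight as a fixed-point integer."""
    return round(w * scale)

def dequantize(q: int, scale: int) -> float:
    """Recover the approximate real value from its encoding."""
    return q / scale

w = 0.123456789
for scale in (2**4, 2**11, 2**16):  # the EZKL and zkml defaults above
    err = abs(w - dequantize(quantize(w, scale), scale))
    print(f"scale=2^{scale.bit_length() - 1}: round-trip error = {err:.2e}")
```

The round-trip error is bounded by `1 / (2 * scale)`, which is consistent with the observation that EZKL's 'accuracy' mode (2^11) loses far less accuracy than its 'resource' mode (2^4).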
It is worth noting that EZKL in accuracy mode achieved perfect accuracy on DNN models, in contrast to its resource mode's subpar performance, which demonstrates the importance of the scaling factor. However, Daniel Kang's zkml, despite a higher scaling factor and the same underlying proof system, did not outperform EZKL in accuracy mode. This raises pivotal questions about whether there is an optimal scaling factor that balances performance with accuracy, a subject worth further exploration.

Going forward, it is essential to delve into the scaling factor's impact on performance trade-offs. Future investigations will seek to identify the 'golden ratio' of scaling that can optimize both accuracy and efficiency within zkML frameworks.
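Finally, the radar-chart normalization used in the Results section can be sketched as follows. This is my reading of the scheme described there (best framework per metric scores 1.0; lower raw values are better for all three metrics), not the project's exact code, and the sample values are made up for illustration.

```python
def normalize(results):
    """Normalize benchmark results for a radar chart.

    results: {framework: {metric: value}}, where a lower raw value is
    better for every metric (accuracy loss, memory, proving time).
    The best framework on each metric scores 1.0; the rest score
    best_value / own_value.
    """
    metrics = next(iter(results.values())).keys()
    best = {m: min(vals[m] for vals in results.values()) for m in metrics}
    return {
        fw: {m: (1.0 if vals[m] == best[m] else best[m] / vals[m]) for m in metrics}
        for fw, vals in results.items()
    }

# Hypothetical numbers, purely for illustration:
sample = {
    "framework_a": {"acc_loss": 0.0, "mem_mb": 100.0, "time_s": 10.0},
    "framework_b": {"acc_loss": 2.0, "mem_mb": 50.0, "time_s": 20.0},
}
print(normalize(sample))
```

Under this scheme a perfect score on every metric traces the outer boundary of the chart, so the enclosed area grows with overall performance, matching how the radar charts are read above.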
