# ZKML - AIGC NFT Grant Proposal (Draft)

## Project Overview

The emergence of generative AI has catalyzed a surge in artists using AI to craft images through methods such as prompt engineering or neural style transfer. This AI-assisted artistry is gaining traction, but there is no cryptographic verification of the origins of these AI-created images. At the same time, AI poses risks to the traditional creative economy by reducing the cost of artistic labor and potentially infringing on copyright when models are trained on original works.

Our solution integrates Ethereum smart contracts with zero-knowledge machine learning (ZKML). This combination enables cryptographic validation of machine learning model outputs and tracking of model usage, fostering a balanced ecosystem for artists and AI model developers.

# Project Details

## General Concept

To tokenize models on the blockchain, ZKML is crucial for proving complex off-chain model computations. For NFT creation, zero-knowledge proofs can verify the model's origin on the blockchain. The recently proposed AIGC NFT standard [EIP-7007](https://eips.ethereum.org/EIPS/eip-7007) supports exactly this use case: when minting an ERC-7007 token, the `mint` method requires a zero-knowledge proof in addition to the model inputs and outputs.

Our dApp is tailored for newcomers to ZKML and uses neural style transfer. This method trains an autoencoder to convert images into a latent representation and then reconstruct them in a specific artistic style.

![example-neural-style-transfer](https://hackmd.io/_uploads/HJX6BJd7T.png)

In the figure above, a style transfer model trained on Picasso's artwork transforms the iconic Mona Lisa into Picasso's style. Our dApp will allow NFT collection owners to deploy a model trained on their unique images. Users can then submit their own images to create new NFTs, verified with a zero-knowledge proof of the model's inference.
## Technical Details

### Model Architecture

Neural style transfer leverages an autoencoder to encode style into a [latent vector space](https://en.wikipedia.org/wiki/Latent_space). The autoencoder architecture consists of two components: an encoder and a decoder. The encoder maps a user image into the latent space, which holds the compressed knowledge of a style. A vector in the latent space can then be decoded into a human-readable image in the style the model was trained on.

![autoencoder](https://hackmd.io/_uploads/Hk5gUk_X6.png)

Because zero-knowledge proofs are computationally intensive, we reduce the proof size by proving the decoder inference only. When a user requests a neural style transfer, the model server returns a latent vector together with the resulting image. When the zk proof is generated at minting time, the latent vector becomes the input to the halo2 circuit, minimizing the time required for proof generation.

### 8-bit Quantization

Machine learning model weights typically use floating-point numbers for greater precision, but this becomes a considerable overhead for zero-knowledge proofs, which operate over a prime field. Model quantization addresses this by reducing floating-point weights to 8-bit integers, simplifying the conversion of floating-point values into the fixed-point numbers used in zero-knowledge proofs.

### ZKML Proofs

We will generate a proof to validate our model's inference. This proof will be sent to our smart contract to verify the NFT's creation. Since we are proving the decoder, which involves a series of convolutional, pooling, activation, and fully connected layers, our mission is to build a workable circuit for each operation and connect them together as a whole. As our proving scheme, we plan to use the [PSE fork of halo2](https://github.com/privacy-scaling-explorations/halo2).
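To make the quantization and decoder-only proving concrete, here is a toy, pure-Python sketch of affine int8 quantization and the kind of integer-only matrix arithmetic a halo2 circuit would constrain. All weights, scales, and shapes below are hypothetical placeholders, not the real model.

```python
# Toy sketch: affine int8 quantization of decoder weights, and the
# integer-only inference a zk circuit would prove. Illustrative only.

def quantize(x, scale, zero_point):
    """Map a float to an int8 code: q = round(x / scale) + zero_point."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))  # clamp to the int8 range

def dequantize(q, scale, zero_point):
    """Recover an approximate float from its int8 code."""
    return (q - zero_point) * scale

# Hypothetical decoder "layer": a 2x3 weight matrix mapping a
# 3-dimensional latent vector to a 2-pixel output.
weights = [[0.42, -1.10, 0.73],
           [0.05, 0.88, -0.31]]
scale, zero_point = 0.01, 0

q_weights = [[quantize(w, scale, zero_point) for w in row] for row in weights]

# In-circuit, the computation happens entirely over integers: the latent
# vector from the model server is already quantized, so the matrix-vector
# product involves no floating-point arithmetic at all.
latent = [3, -7, 5]
q_out = [sum(w * z for w, z in zip(row, latent)) for row in q_weights]
```

Off-chain, the integer outputs are rescaled back to floats for display; only the integer arithmetic above needs to be expressed as circuit constraints.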
For on-chain verification upon minting the token, we will use [PSE's Halo2 solidity verifier](https://github.com/privacy-scaling-explorations/halo2-solidity-verifier).

## Team

### Team members

#### Project leader

Name: Jinsuk Park
Email: totorovirus@gmail.com
Telegram: @navboost
GitHub: https://github.com/jinmel
LinkedIn: https://www.linkedin.com/in/jin-suk-park-02884a89/

#### Engineers

Name: Cheechyuan Ang
Email: angchee@live.com.my
Telegram: @cheechyuan
GitHub: https://github.com/chee-chyuan

Name: Wenkang Chen
Email: cwk1998@hotmail.com
Telegram: @cwkang
GitHub: https://github.com/cwkang1998

#### Team's experience

**Jinsuk** has experience in both computer security and AI. He is a three-time DEF CON finalist and served his mandatory military service in the RoK Cyber Command. During his AI career, he has deployed multiple DNN models for serving. As a search ranking engineer at Coupang, he built a foundational algorithm that integrates user behavior prediction into the search engine.

**Cheechyuan** has been studying zero knowledge at both the conceptual and the application level. He has used various zk libraries, such as Circom, SnarkyJS, Halo2, and Cairo, to experiment with a range of zk apps.

**Wenkang** has a strong background in computer science and is a curious and passionate individual. Having hosted a series of zk study clubs for the local community, Wenkang has a strong understanding of the fundamentals of ZK.

The team has collaborated on the following ZKML projects:

### Safura ([link](https://devfolio.co/projects/safura-ad60) / [presentation](https://www.youtube.com/watch?v=zwOnIXpXwP4&ab_channel=KryptoSeoul%5BOfficial%5D))

An RPC relayer (safe-infura) that safeguards users from transactions to fraudulent addresses. It evaluates the transaction history of the destination address with an XGBoost binary classifier, using that history as features to classify whether the address is a scam.
We used account abstraction requiring an additional SNARK proof from the model to be submitted before the transaction is finally executed.

### ZenetiK-NFT ([link](https://ethglobal.com/showcase/zenetiknft-1ph5r) / [presentation](https://www.youtube.com/watch?v=TpTEAYQRu7U&ab_channel=HyperOracle))

An AI-powered genetic algorithm on-chain. We used a compositional pattern-producing network to "breed" NFTs together and generate batches of images for the next generation. The contract implemented ERC-7007 to mint the AI-generated NFTs.

## Development Roadmap

### Overview

- Total Estimated Duration: 18 weeks
- Full Time Equivalent: 3.5
- Total Costs: 33,600 USD
  - Jinsuk Park: 11,200 USD
  - Cheechyuan Ang: 11,200 USD
  - Wenkang Chen: 11,200 USD
- Starting Date: December 1st, 2023

The cost of a senior engineer is 60 USD/hour.

### Milestone 1: INT8 quantized generative model for 64x64 bitmap images

- Estimated duration: 6 weeks
- FTE: 1
- Costs: 160 hours (9,600 USD)
  - Jinsuk Park: 3,200 USD
  - Cheechyuan Ang: 3,200 USD
  - Wenkang Chen: 3,200 USD
- Estimated Delivery Date: January 15th, 2024

#### Deliverables

##### Documentation

Training procedures for the neural style transfer model with an autoencoder architecture, including a detailed description of the model architecture. Quantitative results will include training steps and final loss for two variants of the model:

- Neural style transfer model in floating-point precision
- Neural style transfer model in 8-bit quantized integer precision

##### Testing guide

For ease of reproducibility, and as a step-by-step tutorial for novice users, we will provide a Jupyter notebook that trains the model and generates images from a user input image. To test model quality, we will train the model on Nouns NFT images and verify that it can generate Nouns-like bitmap images.

##### Functionality: PyTorch-based model architecture

We will write a PyTorch-based neural style transfer model, reformat the input images, and train the model in PyTorch.
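As a rough illustration of this deliverable, the following is a minimal PyTorch sketch of a convolutional autoencoder for 64x64 RGB bitmaps. The layer widths, latent size, and class name are assumptions made for illustration, not the final architecture.

```python
import torch
import torch.nn as nn

class StyleTransferAutoencoder(nn.Module):
    """Minimal conv autoencoder sketch for 64x64 RGB bitmaps.

    Channel counts and latent size are illustrative placeholders.
    """

    def __init__(self, latent_channels: int = 8):
        super().__init__()
        # Encoder: 3x64x64 image -> latent_channels x 16x16 latent tensor
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),    # -> 32x32
            nn.ReLU(),
            nn.Conv2d(16, latent_channels, 3, stride=2, padding=1),  # -> 16x16
        )
        # Decoder: the only part whose inference would be proven in-circuit
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(latent_channels, 16, 4, stride=2, padding=1),  # -> 32x32
            nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1),                # -> 64x64
            nn.Sigmoid(),  # pixel values in [0, 1]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

model = StyleTransferAutoencoder()
image = torch.rand(1, 3, 64, 64)      # a dummy 64x64 RGB input
latent = model.encoder(image)         # latent vector returned by the server
restyled = model.decoder(latent)      # reconstruction in the trained style
```

The split into separate `encoder` and `decoder` modules mirrors the proving strategy above: the server runs both halves, but only the decoder pass over the latent tensor would be re-expressed as circuit constraints.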
##### Functionality: Model quantization

Building on the above, we will provide quantization-aware training (QAT) to quantize the model down to 8-bit integer weights.

### Milestone 2: zkSNARK proof generator and on-chain verifier

- Estimated duration: 8 weeks
- FTE: 1.5
- Costs: 14,400 USD
  - Jinsuk Park: 4,800 USD
  - Cheechyuan Ang: 4,800 USD
  - Wenkang Chen: 4,800 USD
- Estimated delivery date: March 15th, 2024

#### Deliverables

##### Documentation

We will provide documentation on transforming our model into a halo2 circuit. A quantitative comparison will include the number of constraints and lookup arguments generated for the model in floating-point precision versus quantized 8-bit precision.

##### Functionality: SNARK proof generator

We will provide a script to generate a halo2 proof from the provided model weights. The circuit will mirror the model architecture built in Milestone 1.

##### Functionality: SNARK verifier contract

We will provide a verifier contract using halo2-solidity-verifier from the PSE GitHub repo.

### Milestone 3: dApp for AIGC NFT collection

- Estimated duration: 4 weeks
- FTE: 1
- Costs: 9,600 USD
  - Jinsuk Park: 3,200 USD
  - Cheechyuan Ang: 3,200 USD
  - Wenkang Chen: 3,200 USD
- Estimated delivery date: April 15th, 2024

#### Deliverables

##### Functionality: Model serving service

A model serving microservice built with [Triton Inference Server](https://developer.nvidia.com/triton-inference-server). The purpose of this service is to interactively provide a preview of neural style transfer results to users before they mint an NFT.

##### Functionality: Proof service

The proof service generates a proof of model inference when a user wants to mint an NFT into a collection. The proof will be based on the halo2 circuit built in Milestone 2.

##### Functionality: dApp

Our dApp will include three pages: (1) interactive neural style transfer, (2) NFT minting, and (3) a list of NFTs in the collection.