# Verifiable Formal Verification of Smart Contracts ## TL;DR We propose a protocol of verifiable formal verification of smart contracts, which provides a way for wallet apps, dapp browsers and even smart contracts to know whether a target contract is secure and satisfies certain specifications by SN[T]ARKs over formal verification. This would be a "https" in blockchain. ### Version 0.2 ### Authors LayerX Inc. R&D Team Ryuya Nakamura (ryuya.nakamura@layerx.co.jp) Osuke Sudo (osuke.sudo@layerx.co.jp) Takeshi Yoneda (takeshi.yoneda@layerx.co.jp) Takayuki Jimba (takayuki.jinba@layerx.co.jp) ## 1 Overview ### 1.1 Background Currently, formal verification of smart contracts is executed in each computers. If someone formally verified some contracts, you have no way to verify that without executing the program of formal verification by yourself. Otherwise, all you can do is to trust the person who did the formal verification. Moreover, there is no way for a smart contract to judge whether the other contract is formally verified or not. We propose a protocol to prove on-chain that a contract has been formally verified and satisfies certain specs with SN[T]ARKs. This protocol provides a way to verify a formal verification in a much less computational cost. Specifically, there are two types of benefits: (1) Smart contracts can judge whether the other contract it’s about to call satisfy certain specs or not by themselves. (2) Light clients like mobile apps can see whether the certain contract satisfy certain specs. In other words, this protocol provides "https" in blockchain. Of course this doesn’t assure 100% that there would be no hacks but give some degree of security. ### 1.2 Abstract of protocol In this protocol, a verifier contract(Verifier contract) verifies the SN[T]ARKs proof that proves a given EVM bytecode(i.e. ‘target’ contract) satisfies some set of specifications. If the verification succeeds, the verifier contract stores the target contract’s address or hash of its bytecode onto the storage. The number of verifier contracts corresponds to that of specifications they prove. If users or other contracts want to know whether a certain contract satisfies some specification, they could ask the corresponding verifier contract about it. ### 1.3 Use-cases Here we provide examples of use-cases regarding each type of the benefits described in section 1.1. (1) Crypto collateral system without vulnerable tokens In some financial protocols like stable coins (e.g. Maker[1] and Reserve[2]) or derivatives (e.g. Dharma[3]), crypto assets including not only ether but also ERC20 tokens are used as a collateral. Even though these protocols handle the case when the price of such collaterals drop, if the contract on which these collateral assets based had bugs, like famous integer overflow or reentrancy, it could cause black swans and entire system might be broken. This kinds of black swans would be removed if the system contracts of these protocols prevented from using tokens whose contracts are not formally verified from being included as collateral. (2) Secure wallet apps or dapp browsers Wallet apps or dapp browsers can protect users from using vulnerable contracts using (above mentioned) verification contracts. This is not limited to the check of such famous vulnerabilities like overflow. We think Layer 2 protocol like Plasma and state channel would have standards in the future. For example, L4 and Celer team are working together on the standardization of state channel[4]. These standards would be written as a formal specification amenable to formal verification. (see our experimental project[17] of formal verification of Plasma contracts with KEVM[5].) Therefore, with the protocol propose, wallet apps or dapp browsers can raise warning for users to tell the Plasma/state channel contracts are compliant with these standards.(e.g. "WARNING: This Plasma Cash contract is not standards-compatible.") ## 2 Methodology ### 2.1 SNARKs over formal verification We intend to use SNARKs as a non-interactive proof system because the proof size is so small and will fit well to the blockchain. The proof size of SNARKs is only 288 bytes for 128 bits of security [6] [7] or approximately 131 bytes [8]. STARKs has advantage features over SNARKs like non-setup and proving cost, so we are also planning to use STARKs as a proof system in the future. But here, we will discuss especially in the case of SNARKs. ![](https://i.imgur.com/tU28f52.png) The verifier contract needs to be deployed in advance by the setup phase. In the frontend, formal verification program is executed by RAM (random-access machine) which is general computational models, and the process generates the circuits for SNARKs. In the backend, the generated circuits and the RAM are used for the SNARKs. The setup as a pre-processing generates the proving key and the verification key that are used in proving and verifying respectively. Only if the prover passes the formal verified bytecode, the proof which is sent from the prover to the verifier will be accepted and return true by the verifier contract. The fact will notify to the dapp browsers or be called by other contracts. We formalize the program to be verified with SNARKs like: *Pprove(w = bytecode, x = hash(bytecode)) {return proof}* *Pverify(proofkey, hash(bytecode)) = True/False* where *x*: the primary input (which passes to on-chain computation) *w*: the auxiliary input (which does NOT pass to on-chain computation)) *bytecode*: the bytecode of smart contract that you want to verify *proof_key*: proof key of the formal verification result Since contracts’ bytecodes are too huge in byte to be put into on-chain, we treat it as auxiliary input (in other words, private input) and take only its hash value as primary input (i.e. public input). The following snippet is the pseudo code of P_prove (i.e.the contract owner want to prove): ``` struct In { // EVM byte code's hash uintN_t bytecodeHash // EVM bytecode as private input uintN_t bytecode[MAX_BYTECODE_LENGTH] } struct Out { // represents whether `bytecodeHash` is valid or not. bool isValidBytecode // represents whether the given bytecode is formally verified or not. bool isValidContract } uintN_t hash(uintN_t bytecode[MAX_BYTECODE_LENGTH]) { // [...] return hash_of_byte_code } bool formallyVerifyBytecode(uintN_t bytecode[MAX_BYTECODE_LENGTH]) { // [...] return result_of_formal_verification } void compute( struct In *input, struct Out *output ) { // 1. compute hash value of bytecode uintN_t h = hash(input->bytecode) // 2. check if the above calculated value equals the input bytecodeHash output->isValidBytecode = (h == input->bytecodeHash) // 3. check if there's a vulnerability in byte code output->isValidContract = formallyVerifyBytecode(input->bytecode) } ``` To convert this program into arithmetic/boolean circuits, we adopt computational models like TinyRAM which are already known to be converted into circuits. The formal verifier will be implemented over these computational models. Note that the private input variables may include a proof, an object which express how formal verifier applied their rules of inference in addition to `hash(bytecode)`. See (2) in section 3.2. When the P_verify function returns true, the address of the verified contract is registered in the deployer contract or emits the event to notify the result to off-chain clients. ### 2.2 `Verifier` Contract This is a pseudo code of `Verifier` contract. ``` pragma solidity ^0.4.24; library GetCode { function at(address _addr) public view returns (bytes o_code) { assembly { // retrieve the size of the code, this needs assembly let size := extcodesize(_addr) // allocate output byte array - this could also be done without assembly // by using o_code = new bytes(size) o_code := mload(0x40) // new "memory end" including padding mstore(0x40, add(o_code, and(add(add(size, 0x20), 0x1f), not(0x1f)))) // store length in memory mstore(o_code, size) // actually retrieve the code, this needs assembly extcodecopy(_addr, add(o_code, 0x20), 0, size) } } } contract Verifier { address[] public verifiedContracts; function register(bytes _proveKey, address _addr) public { bytecode = GetCode.at(_addr); bytecodeHash = sha3(bytecode); require(verify(_proveKey, bytecodeHash)); verifiedContracts.push(_addr); } function verify(bytes _proveKey, bytes32 _bytecodeHash) private returns (bool) {...} function checkVerified(address _addr) public returns (bool) { require(verifiedContracts.exists(_addr)); return true } } ``` ### 2.3 Set up In both protocols above, verifier contacts need to be registered beforehand in the "trusted list" in the contract or mobile apps that use this protocol. These protocol based on the correctness of `verifier` contracts, i.e. - algorithm of formal verification - formal specification to be verified - circuits which represents the algorithm of formal verification - verification program of SN[T]ARKs ## 3 Challenges & Solutions ### 3.1 Main Challenges The most part of the difficulties to achieve this is around converting a formal verification program to circuits because formal verifiers are basically complicated programs. Most of the formal verification methodologies adopts Z3 as a SMT solver or uses interactive theorem provers like Coq or Isabelle and they are composed of complicated algorithms and have millions of lines of code. More concretely, there are these two challenges. #### (1) Implementation difficulty There are several libraries like ZoKrates[10], xJsnark[11] and Pequin[12] to develop a program which can be compiled to circuits of SNARKs. They made it easier to compose circuits for normal programs, but their DSLs cannot be compared with usual programming languages like Java, Python or C++ in which formal verifiers implemented so to implement a formal verifier with their DSLs is quite difficult. In addition, they don’t have sufficient documents and most of them are not aimed for production use. #### (2) Circuit complexity SNARKs has a constant time of verification and a constant size of proof but the number of constraints increases as the complexity of a program increases. i.e. the time and cost of generating a proving key increases. The computational cost of formal verification would be huge, so we need to optimize it and reduce the circuit complexity. Several libraries (like DIZK) have been developed to reduce the circuit complexity in the backend for SNARKs. As an example, gnosis team estimates the costs is roughly 1200$ using AWS and DIZK about their SNARKs application of decentralized exchange. We need more optimization not only in the backend of SNARKs but also the frontend. ### 3.2 Solutions #### (1) General computation We are considering to implement a formal verifier on LLVM and compile it to RAM that can be converted to circuits. TinyRAM has a minimal ISA that enables to generate small circuit complexity. However, because it is too hard to implement the LLVM backend of TinyRAM, we will use the existing backend machine similar to TinyRAM. One of the machine would be RISC-V which is a simple and has a restricted number of instruction set, so we think it can be used as a computational machine for the circuit generation. However, because of the larger instruction set of RISC-V the efficiency of the circuit generator might be lower than the one of TinyRAM and make the second challenge even more difficult. #### (2) Less-computational formal verification algorithm We need to adjust the algorithm of formal verification so that the implementation cost and the amount of circuits are reduced. One promising technique is to remove the proof search which require a huge computation and focus on the verification of a given proof. The softwares of formal verification find the proof, the sequence of their rules of inference automatically or semi-automatically with human’s help. However, the program converted to circuits in our protocol isn’t necessarily able to find the proof because the proof generated by off-chain formal verifier can be passed as a private input and all the program need to do is to verify the proof. Another approach is to adopt some heuristics about certain vulnerabilities. For example, Securify[13] uses a pattern detection in the semantics of the target bytecode expressed their DSL and leverages the domain-specific insight that violations of many practical properties for smart contracts also violate simpler properties, which are significantly easier to check. The algorithm seems much more easier to convert to circuits than KEVM and interactive theorem provers. However, the specifications which can be verified with this kind of technique are limited. For example, the verification of “No overflow” property would be difficult without something like Z3. ## 4 Future work We need more feasibility study about this project, especially about the challenges described in section 3. The most important point is the circuit complexity of formal verifier but it would be difficult to estimate without actually implementing it. Research on zero-knowledge proof systems including STARKs and others would also be important. Our long-term goal includes: - development of a infrastructure to do a SN[T]ARKs over a general computation (e.g. circuit generator of RISC-V) - development of a formal verifier which is more SN[T]ARKs friendly ## 5 Other approachies Other ways about off-chain computation of formal verification can be considered. Raziel[14] proposes a protocol which combines secure multi-party computation and proof-carrying code to provide privacy, correctness and verifiability guarantees for smart contracts. TrueBit[15] can also be considered. We think they has some potential to be researched but we don’t pursue them here because they require additional roles so based on a more complicated threat model and need incentive design. ## 6. Related Works ### 6.1 Raziel Raziel proposes a system for securely computing smart contracts guaranteeing their privacy, correctness and verifiability, basically using MPC as described in section 4. In addition to that, they showed the use of Zero-Knowledge Proofs of Proofs (i.e., Proof-Carrying Code certificates) and they made a thread in ethresear.ch[16] about it. They prefer ZKBoo, ZKBoo++ and Ligero to SNARKs unlike our proposal. ### 6.2. Proof-Carrying Smart Contracts Thomas et al.[18] proposed a concept of Proof-Carrying Smart Contracts. In their proposal, the formal correctness proofs of smart contracts are put on the chain and they are checked by validators. They intend to add opcodes to create contracts with specifications as well as update them given a new proof, instead of doing it off-chain. We think their idea can be considered as a use-case of our protocol that contracts are allowed to be upgraded only when the new implementation satisfies some specification. ## 7. References [1] Maker https://makerdao.com/ [2] Reserve https://www.reserve.org/ [3] Dharma https://dharma.io/ [4] L4 and Celer: Better state channels through collaboration https://medium.com/l4-media/l4-and-celer-better-state-channels-through-collaboration-ea1cb153b413 [5] K Semantics of the Ethereum Virtual Machine (EVM) https://github.com/kframework/evm-semantics [6] Pinocchio: Nearly Practical Verifiable Computation https://eprint.iacr.org/2013/279.pdf [7] Succinct Non-Interactive Zero Knowledge for a von Neumann Architecture https://eprint.iacr.org/2013/879.pdf [8] On the Size of Pairing-based Non-interactive Arguments https://eprint.iacr.org/2016/260.pdf [9] Counterfactual: Generalized State Channels https://l4.ventures/papers/statechannels.pdf [10] ZoKrates - Scalable Privacy-Preserving Off-Chain Computations https://www.ise.tu-berlin.de/fileadmin/fg308/publications/2018/2018_eberhardt_ZoKrates.pdf [11] xJsnark: A Framework for Efficient Verifiable Computation http://www.cs.umd.edu/~akosba/papers/xjsnark.pdf [12] Pequin https://github.com/pepper-project/pequin [13] Securify: Practical Security Analysis of Smart Contracts https://arxiv.org/pdf/1806.01143.pdf [14] Raziel: Private and Verifiable Smart Contracts on Blockchains https://eprint.iacr.org/2017/878 [15] A scalable verification solution for blockchains https://people.cs.uchicago.edu/~teutsch/papers/truebit.pdf [16] Zero-knowledge before execution https://ethresear.ch/t/zero-knowledge-before-execution/312/2 [17] LayerX’s verifed-plasma-contracts https://github.com/LayerXcom/verified-plasma-contracts [18] Proof-Carrying Smart Contracts https://paulgazzillo.com/papers/wtsc18.pdf [] Geppetto: Versatile Verifiable Computation https://www.cs.virginia.edu/~sza4uq/index/geppetto.pdf [] vRAM: Faster Verifiable RAM With Program-Independent Preprocessing http://web.eecs.umich.edu/~genkin/papers/vram.pdf [] Scalable, transparent, and post-quantum secure computational integrity https://eprint.iacr.org/2018/046.pdf