# Sin7Y: About the Sinsemilla hash function used in OlaVM
Last month we were pleased to announce the [OlaVM Whitepaper](https://ethresear.ch/t/whitepaper-olavm-an-ethereum-compatible-zkvm/13144), an EVM-compatible ZKVM solution, released the 25th of July, 2022. ZKEVM has been a hot topic itself over the past couple of weeks, and upon the release of OlaVM, the paper managed to receive some honorable attention from prominent people in the industry, amongst them, one being [Daira Hopwood](https://twitter.com/feministPLT) (who also is the main author of the [Zcash protocol](https://zips.z.cash/protocol/protocol.pdf)), we would like to thank Daira for her feedback. Daira brought up a few important questions in regards to the design decisions, one of them relating to the choice of hash function in [ECDSA](https://en.wikipedia.org/wiki/Elliptic_Curve_Digital_Signature_Algorithm) and [Schnorr](https://en.wikipedia.org/wiki/Schnorr_signature) signature algorithms. The exact comment can be seen in the tweet attached below.
![](https://i.imgur.com/GbFs8j3.png)
What Daira is referencing can in simple terms be described with the following: The security level of [Sinsemilla hash](https://zips.z.cash/protocol/protocol.pdf) is collision resistant, so it cannot be regarded as a Random Oracle. For the ECDSA and Schnorr signature algorithms, in order to satisfy sufficient security, the choice of hash function needs to be regarded as Random Oracle. In order to understand this better, we need to understand some cryptographic concepts first.
## 1. Security properties of a cryptographic hash function (CHF)
According to the definition in the paper [Cryptographic Hash-Function Basics](https://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=C7364E9082B2726A10E1C712B476C82A?doi=10.1.1.3.6200&rep=rep1&type=pdf), the security properties corresponding to a CHF are divided into the following three categories:
- **Preimage Resistance** - Given H(x), it is computationally difficult to determine x, i.e., it is computationally infeasible to discover any input which created a specific output.
- **Second Preimage Resistance** - Given x, it is computationally difficult to find some different value x' such that H(x) == H(x'), i.e., it is computationally infeasible to find any second input which has the same output as any specified input.
- **Collision Resistance** - Expanding the concept of Second Preimage Resistance, it is computationally infeasible to find any two arbitrary inputs, x and y such that H(x) == H(y), i.e., different inputs which hash to the same output.
An important aspect to be noted is that, if you satisfy the conditions for Collision Resistance, you also satisfy the conditions for Second Preimage Resistance, however this does not guarantee that you are satisfying the conditions for Preimage Resistance.
## 2. What is a Random Oracle?
A [Random Oracle](https://en.wikipedia.org/wiki/Random_oracle) is described in the following model:
- Imagine you have a black box. In the box lives a gnome, with a big book and some dice.
- We can add some input data into the box (an arbitrary sequence of bits).
- Given some input data that the gnome did not see beforehand, he uses his dice to generate some output, uniformly and randomly, in some conventional space (the space of oracle outputs). The gnome also writes down the input and the newly generated output in his book.
- If he is given some formerly known input data, the gnome turns to his book, in order to recover the output he generated and returned the last time, and returns it again.
To briefly summarize the behavior of Random Oracles, assuming the input is x:
- If x has been used as input data before, it will directly return the corresponding H(x).
- If x has never been used as input data, the Random Oracle will generate a string consisting of 0.1 in the range of values **completely randomly**.
It might be noted that:
- **Complete randomness** here means that even the Random Oracle itself does not know what the final value will be, it has no rules to follow. This is the main difference from hash functions. Any hash function has its own calculation rules.
But in the real world, it is quite difficult to achieve an actual Random Oracle, therefore, we need to find a potential candidate for Random Oracles, and we need to make the output as random as possible. Hash functions are a good choice. A secure hash function needs to satisfy the following conditions: Preimage Resistance, Second Preimage Resistance, and Collision Resistance. A hash function that can be used as Random Oracle must satisfy these three properties, but a hash function that satisfies these three properties may not necessarily be used as Random Oracle, there is a necessary and insufficient relationship between them. More details can be found in [What is the “Random Oracle Model” and why is it controversial?](https://crypto.stackexchange.com/questions/879/what-is-the-random-oracle-model-and-why-is-it-controversial)
## 3. Requirements for using hash functions in ECDSA and Schnorr signature algorithm
As mentioned in the papers [On the security of ECDSA with additive key derivation and presignatures](https://www.shoup.net/papers/2021-1330.pdf) and [On the Exact Security of Schnorr-Type Signatures in the Random Oracle Model](https://eprint.iacr.org/2012/029.pdf), the hash functions in both ECDSA and Schnorr signature algorithms need to be considered Random Oracles before they are safe. According to the previous description, this hash needs to satisfy all the security properties of CHF: preimage-resistance, 2nd-preimage resistance, and collision resistance.
![](https://i.imgur.com/f59pXfQ.png)
<center>Figure 1. ECDSA signature process</center>
![](https://i.imgur.com/loYUxKA.png)
<center>Figure 2. Schnorr signature process</center>
## 4. About the Sinsemilla hash function
The Sinsemilla hash function was designed by [Daira Hopwood and Sean Bowe](https://twitter.com/feministPLT), and the bottom layer relies on [ECDLP(Elliptic Curve Discrete Logarithm Problem)](https://link.springer.com/referenceworkentry/10.1007/978-1-4419-5906-5_246). Under a fixed-length input, the Sinsemilla hash function satisfies the Collision Resistance property, but not the Preimage Resistance property. Please refer to Daira's [answer](https://twitter.com/feministPLT/status/1551856467145269249) for further clarification.
According to the [Zcash protocol specification](https://zips.z.cash/protocol/protocol.pdf), the original intention of designing the Sinsemilla hash function is to make full use of the advantages of being Lookup-friendly during the execution of the zero-knowledge proof algorithm Halo2, in order to improve the execution efficiency of Halo2. Therefore, the Sinsemilla hash function is a Lookup-friendly hash function, which is more suitable for computation of promises and computation of Merkle Tree roots.
![](https://i.imgur.com/cetTwsi.png)
<center>Figure 3. Design idea of Sinsemilla hash</center>
## 5. Conclusion
We would like to thank [Daira Hopwood](https://twitter.com/feministPLT) for pinpoiting this and her guidance, we will continue gathering feedback to continuously optimize and improve the OlaVM design, making it more efficient and secure. The Sinsemilla hash function will still be used in suitable modules in the OlaVM design. Regarding the usage of hash functions for signatures, we will opt for an appropriate and secure hash function, such as e.g the [Poseidon-](https://eprint.iacr.org/2019/458.pdf) or [Reinforced Concrete](https://eprint.iacr.org/2021/1038.pdf) hash functions.

Sin7Y
Exploring #Layer 2, #Crosschain, #ZK, and #Privacy Computing.

TL;DR We are working on building the first ZKVM based on a parallel execution architecture and achieving higher TPS through the improvement of ZK-friendly design and ZK algorithms. The technical features are as follows:Fast proof generationZK-friendly: smaller circuit scale and simplified bottom constraint units Fast ZK: further optimization on Plonky2 Fast execution: Utilizing parallel execution to significantly shorten the proof generation time Current progress: In July 2022, we released the OlaVM Whitepaper. November 2022, completed instruction set design and development, and realized the OlaVM execution module of the virtual machine, you can check the link: https://github.com/Sin7Y/olavm to view our code, continuously updated.

1/10/2023Preface This research compares implementation systems similar to Ethereum and analyzes the difficulties and possibilities of achieving parallel execution of transactions. It's worth noting that the chains analyzed for this research are based on the Account model design scheme, not including the UTXO scheme. Research Objects FISCO-BCOS, one of the consortium blockchains that support parallel execution of transaction verification within blocks. Khipu public chain, scala implementation of the Ethereum protocol. Aptos public chain, Move Virtual Machine. Difficulties with Parallel Execution Let's take a look at the traditional transaction execution process.

1/4/2023TL;DR As mentioned in the previous article Hello, OlaVM!, OlaVM’s vision is to build a high-performance ZKVM, and this article will focus on one of the tools that make OlaVM high-performance, namely, Lookup Arguments. Lookup Arguments play an important role in reducing the size of the circuit, thereby improving Zero Knowledge efficiency, and it's widely used in the circuit design of ZKVMs. Throughout this article you'll learn more about the following: What role do Lookup Arguments play in ZKVM? Plookup protocol principles Lookup Argument protocol principle of Halo 2 The connection between the two Lookup Argument algorithms The roles of a ZKVM The ZKVM utilizes Zero Knowledge to constrain all the execution processes of the VM, and the execution process of the VM can generally be divided into: instruction execution, memory access and built-in function execution. It is somewhat impractical to execute constraints on these operations in one trace. First of all, in a trace, a row represents an operation type, and one operation type corresponds to multiple constraints, and different constraints correspond to different numbers of columns, resulting in different widths. If one of the rows is too wide due to one constraint corresponding to too many columns, then the width of the entire trace is affected, becoming too large. Resource usage of this design is wasteful when the rows corresponding to the remaining constraints do not require so many columns. Then, if there are too many different operation types in a single trace, more selectors will be introduced, increasing not only the number of polynomials, but also the order of the constraint. Finally, due to the order limitation of the group, the number of rows of the trace itself cannot exceed the order of the group, so the number of trace rows occupied by a certain type of operation should be minimized.

12/20/2022ZKEVM is a programmable virtual machine based on ZK technology. It can generate a ZK proof for all operations perfwormed by the virtual machine to prove the correctness of the operations performed by the virtual machine. For the introduction of several implementation schemes of ZKEVM and the comparison of advantages and disadvantages, you can refer to the article of Vitalik Buterin: The different types of ZK-EVMs. If you want more design details, you can also read PSE's ZKEVM Scheme (native-level)： privacy-scaling-explorations/zkevm-specs Polygon's ZKEVM Design (bytecode-level)： Polygon zkEVM Documentation. Sin7y’s ZKEVM Design (language-level)：OlaVM: An Ethereum compatible ZKVM. Regardless of the scheme, zk is required to constrain all VM behaviors, including: Executing contract computational logic Executing memory access Executing hash calculation Executing world state updates It is well known that ZK has great application prospect in the field of computing compression. No matter how complex the original computation is, the verification process is very efficient, which is fundamental to all zk algorithms. Therefore, ZK works well for computational parts of VM execution (such as contract logic, hash calculation, etc.). In the process of VM execution, in addition to the computing itself, there are also some memory access operations. We need to place some data in the memory in advance, and then pull it out when the computation is performed.

9/9/2022
Published on ** HackMD**