# A Guide to AppliedZKP zkEVM Circuit Code We’ve studied several zkEVM projects in the previously published article [Exploring Popular zkEVM Solutions](https://hackernoon.com/exploring-popular-zkevm-solutions-appliedzkp-matter-labs-hermez-and-sin7y-quick-publication-ltq37ah), among which the representative of native-based zkEVM is what is being researched by the AppliedZKP team of the Ethereum Foundation. The text below will guide you through the circuit code of the AppliedZKP zkEVM. The code is still in the initial implementation, and many crates have not yet stabilized. Thus, all discussions in this article are subject to change in the future. Please always refer to the latest code if you are to do code development and code study. This text aims to provide an easy guide to the basic framework and logic of the AppliedZKP zkEVM code base. Hopefully, more developers will join us in the further research of zkEVM. | Code Base | Path | Branch | Commit_ID | Date | | -------------- | -------------------------------------------- | ------ | ---------------------------------------- | ---------- | | zkEVM-circuits | https://github.com/appliedzkp/zkEVM-circuits | main | 39a7bf7b30e25faa7db26a3cb94640240164542d | 2021-10-18 | ## Code Structure In the current code, there are three crates, namely *bus-mapping*, *keccak256*, and *zkevm-circuits*. The relationship among the three is shown in the figure below: ![](https://i.imgur.com/YkyA5jn.png) Also in [Exploring Popular zkEVM Solutions](https://hackernoon.com/exploring-popular-zkevm-solutions-appliedzkp-matter-labs-hermez-and-sin7y-quick-publication-ltq37ah), we pointed out that Bus-mapping is the state channel of *evmcircuit* and *statecircuit* in zkEVM. The devs organize the code in the way shown in the figure above. Let’s look at each crate in turn. ### 1. bus-mapping The objectives of bus-mapping are: * Analyzes EVM execution path * Constructs witness data following complete execution path analysis * Provides a simple interface to gather and put witness data in the circuit, and witness it with a small number of functions So the basic function is to process witness data, regardless of whether it is the path data executed by the EVM or the state data that changes before and after. Let's look at each crate in bus-mapping specifically. #### 1.1 EVM Basically, EVM defines `ProgramCounter`, `GlobalCounter`, `EVMWord`, `GasInfo`, and `GasCost` include and contains data infrustractures like `stack`, `memory`, and `opcodes`. It can be seen that `pc` and `gc` are encapsulation to `usize`; `EVMWord` is the encapsulation of the `u8` array with a length of 32. `StackAddress` is usize, but Stack is a dynamic array of `EVMWord`. `MemoryAddress` is also usize, which should be between 0 and 1023 while Memory is an array of u8. Developers use Rust macro cleverly to extract key memory information, such as traits of index and range, and implement them in a unified way. Storage is a HashMap and its key values are all `EVMWord`. The opcodes crate defines the opcode traits of stack, memory, and storage, as well as the implementation of parts of the opcode traits. ```rust= pub trait Opcode: Debug { /// Generate the associated [`MemoryOp`](crate::operation::MemoryOp)s, /// [`StackOp`](crate::operation::StackOp)s, and /// [`StorageOp`](crate::operation::StorageOp)s associated to the Opcode /// is implemented for. fn gen_associated_ops( &self, exec_step: &mut ExecutionStep, container: &mut OperationContainer, next_steps: &[ExecutionStep], ) -> Result<usize, Error>; } ``` Let’s take `mload` as an example to see how the trait is implemented. Here is the implementation of go-ethereum. ```go // core/vm/instructions.go func opMload(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byte, error) { v := scope.Stack.peek() offset := int64(v.Uint64()) v.SetBytes(scope.Memory.GetPtr(offset, 32)) return nil, nil } ``` There are three steps of mload: 1. Take the value of offset from the last position of the stack; 2. Take the value of `[offset, offset+32)` bytes from memory; 3. Write bytes in the last position of the stack. `gen_associated_ops` of mload is to do these steps. For detailed code, please refer to bus-mapping/src/EVM/opcodes/mload.rs. #### 1.2 exec_trace This is the most important crate in bus-mapping, including `BlockConstants`, `ExecutionTrace`, and `OperationRef`. `BlockConstants` contains some constants of EVM, such as `coinbase`, `timestamp`, etc. `ExecutionTrace` is the core of bus-mapping, which is the interface form of witness. `ExecutionStep` contains content used in every execution such as memory, stack, current instruction, and Operation-related `bus_mapping_instance`. `ExecutionStep` implements its own `gen_associated_ops` function, which can call the corresponding `gen_associated_ops` according to the current instruction. ```rust pub struct ExecutionStep { pub(crate) memory: Memory, pub(crate) stack: Stack, pub(crate) instruction: OpcodeId, pub(crate) gas_info: GasInfo, pub(crate) depth: u8, pub(crate) pc: ProgramCounter, pub(crate) gc: GlobalCounter, // Holds refs to the container with the related mem ops. pub(crate) bus_mapping_instance: Vec<OperationRef>, } ``` `ExecutionTrace` include all steps executed, and container includes each operation used in Trace. The most critical function in this structure is build, which is used to set `gc` and generate a corresponding Operation for each step in the trace. ```rust pub struct ExecutionTrace<F: FieldExt> { steps: Vec<ExecutionStep>, block_ctants: BlockConstants<F>, container: OperationContainer, } ``` #### 1.3 operation The operation crate is relatively simple, defining `RW`, `Target`, `MemoryOp`, `StackOp`, `StorageOp`, and `Operation`, where `RW` is of two types, i.e., read and write, and `Target` includes Memory, Stack, and Storage. The encapsulation of Operation will specify what kind of Operation it is. ### 2. zkevm-circuits This crate requires the use of halo2 to implement the zkEVM circuits, which is also a more complex part of the implementation of zkEVM. It temporarily includes the following subcrates: `evm_circuit`, `gadget`, `state_circuit`, and `util`. #### 2.1 util and gadget `Util` is used for tool implementations, such as converting Rust's basic data types, such as `bool`, `u8`, etc., to the elements of a finite field. `Gadget` defines the basic tool chip. For example, `evm_word` is a constraint check for 256 bit (or 32 bytes), `is_zero` is to determine whether it is 0, and `monotone` checks whether the advice column increases monotonically within a range. The implementation of these chip circuits follows the general architectural design. For more details, please refer to our previous publication [Develop Circuits Using Halo2](https://hackernoon.com/sin7y-tech-review-develop-circuits-using-halo-2). #### 2.2 state_circuit In [zkEVM design specs](https://github.com/appliedzkp/zkEVM-specs), there are state proof and evm proof. State proof is used to help evm proof check whether all random read, write, and access records are effective. The records of random read and write are generated by state_circuit. State_circuit defines BusMapping and Config. From [Develop Circuits Using Halo2](https://hackernoon.com/sin7y-tech-review-develop-circuits-using-halo-2), we can know that config is the most critical part to write the circuit of Halo2. It contains all the information on the lookup table needed for evm proof. Take the code comments in state_circuit as an example. `q_target` is the selector, where 1 represents the first line of each target, 2 represents memory, 3 represents stack, and 4 represents storage. As we can see, state is grouped by address first, and then sorted by `global_counter`. Note that the address here is not the Ethereum address. For memory and stack, the address is the `StackAddress` and `MemoryAddress` mentioned in the previous section, namely usize, so the values in the example are 0 and 1. `global_counter` is the counter, which will count different addresses separately. Value is the operation value, and flag marks read and write, where 1 is write, and 0 is read. padding is used to fill the gap between each target and has no practical meaning. `storage_key` is exclusive to the storage operation, which is the key, and `value_pre` is the value of the previous value in this target, which is 0 at the beginning. ```rust /* Example state table: | q_target | address | global_counter | value | flag | padding | storage_key | value_prev | ------------------------------------------------------------------------------------------- | 1 | 0 | 0 | 0 | 1 | 0 | | | // init row (write value 0) | 2 | 0 | 12 | 12 | 1 | 0 | | | | 2 | 0 | 24 | 12 | 0 | 0 | | | | 2 | 1 | 0 | 0 | 1 | 0 | | | // init row (write value 0) | 2 | 1 | 2 | 12 | 0 | 0 | | | | 2 | | | | | 1 | | | // padding | 2 | | | | | 1 | | | // padding | 1 | 0 | 3 | 4 | 1 | 0 | | | | 3 | 0 | 17 | 32 | 1 | 0 | | | | 3 | 0 | 89 | 32 | 0 | 0 | | | | 3 | 1 | 48 | 32 | 1 | 0 | | | | 3 | 1 | 49 | 32 | 0 | 0 | | | | 3 | | | | | 1 | | | // padding | 1 | 1 | 55 | 32 | 1 | 0 | 5 | 0 | // first storage op at the new address has to be write | 4 | 1 | 56 | 33 | 1 | 0 | 8 | 32 | | 4 | | | | | 1 | | | // padding */ ``` #### 2.3 evm_circuit The `evm_circuit` is the core of the entire projec, including the implementation of opcode, the definition of Constraint, the definition of EVM execution results, the element of each execution step (ExecutionStep), and the most important *EVMCircuit*. For example, in the definition of Constraint, the name serves as an annotation. When the selector is a success, the gadget's selector - polys are the Constraint’s polynomials, and lookups are the lookup tables used by the Constraint. In the Halo2 circuit instruction manual, the constraints interface function in the OpGadget trait is an array that returns Constraint. ```rust struct Constraint<F> { name: &'static str, selector: Expression<F>, polys: Vec<Expression<F>>, lookups: Vec<Lookup<F>>, } ``` Lookup is divided into two categories. First, `FixedLookup`, which an array of 3 `Expression`s; second, `BusMappingLookup`, which is an enum containing `OpExecutionState`, `Stack`, `Memory`, `AccountStorage`, etc. ```rust pub(crate) enum Lookup<F> { FixedLookup(FixedLookup, [Expression<F>; 3]), BusMappingLookup(BusMappingLookup<F>), } ``` The difference between `CallField` and `CallStateField` is that the former is used in the call of BusMappingLookup, and the latter is used in the OpExecutionState of BusMappingLookup. The `Cell` structure includes constraint expressions, columns, and the offset used by the selector. ```rust pub(crate) struct Cell<F> { // expression for constraint expression: Expression<F>, column: Column<Advice>, // relative position to selector for synthesis rotation: usize, } ``` A word is represented by one expression and is positioned by an array of 32 cells. ```rust struct Word<F> { // random linear combination expression of cells expression: Expression<F>, // inner cells for synthesis cells: [Cell<F>; 32], } ``` The structure, `CoreStateInstance`, is often used when writing opcode circuits and is used for context caching in the assign implementation in evm_circuit. ```rust pub(crate) struct CoreStateInstance { is_executing: bool, global_counter: usize, call_id: usize, program_counter: usize, stack_pointer: usize, gas_counter: usize, } ``` `ExecutionStep` contains the opcode executed in this step, the execution result, and the value used. Although the name is the same as in the bus-mapping crate, its purpose and structure are different. ```rust pub(crate) struct ExecutionStep { opcode: OpcodeId, case: Case, values: Vec<BigUint>, } ``` `EVMCircuit` contains the selector, lookup table, read-write table, fixed table, and op execution gadget for each step. ```rust struct EVMCircuit<F> { q_step: Selector, qs_byte_lookup: Column<Advice>, fixed_table: [Column<Fixed>; 4], rw_table: [Column<Advice>; 7], op_execution_gadget: OpExecutionGadget<F>, } ``` Currently, opcodes of `arithmetic`, `byte`, `comparator`, `dup`, `pop`, and `push` have been implemented. We take add/sub as an example to explain ***how to write an op circuit***. First, define the data structure required by op (when the operation is successful), which is also the operation data of the op in most cases. When it fails, no other data is needed, we just need to handle the corresponding case. For the `AddSuccessAllocation` structure, the selector is used for activation in different cases. Because `ADD` and `SUB` are implemented together, swap is used to distinguish add or sub. a, b, c are used to represent the three values of the op operation, satisfying a + b == c. When sub, a + c == b. ```rust struct AddSuccessAllocation<F> { selector: Cell<F>, swap: Cell<F>, a: Word<F>, b: Word<F>, c: Word<F>, carry: [Cell<F>; 32], } ``` Second, define the corresponding Gadget structure, mainly the results of the op in different cases. For add/sub, there are three states: *success*, *stackunderflow*, and *outofgas*. ```rust pub struct AddGadget<F> { success: AddSuccessAllocation<F>, stack_underflow: Cell<F>, // case selector out_of_gas: ( Cell<F>, // case selector Cell<F>, // gas available ), } ``` The final step is to implement the ***OpGadget*** trait for the op. To begin with, define several case situations covered by this op, as well as the number of values, the number of cells, and whether or not to halt. ```rust const RESPONSIBLE_OPCODES: &'static [OpcodeId] = &[OpcodeId::ADD, OpcodeId::SUB]; // This operation implements opcodes ADD and SUB. const CASE_CONFIGS: &'static [CaseConfig] = &[ CaseConfig { case: Case::Success, num_word: 3, // a + b + c num_cell: 33, // 32 carry + swap will_halt: false, }, CaseConfig { case: Case::StackUnderflow, num_word: 0, num_cell: 0, will_halt: true, }, CaseConfig { case: Case::OutOfGas, num_word: 0, num_cell: 0, will_halt: true, }, ]; // Several case situations by ADD and SUB ``` The `construct()` function is relatively simple, which is to initialize our Gadget with `case_allocations`. The `constraints` function is a relatively critical interface, which needs the process of the current state and the next state to construct constraints. ```rust fn constraints( &self, state_curr: &OpExecutionState<F>, state_next: &OpExecutionState<F>, ) -> Vec<Constraint<F>> { let OpExecutionState { opcode, .. } = &state_curr; let common_polys = vec![ (opcode.expr() - OpcodeId::ADD.expr()) * (opcode.expr() - OpcodeId::SUB.expr()), ]; let success = { // success constraint }; let stack_underflow = { // stack underflow constraint }; let out_of_gas = { // out_of_gas constraint }; array::IntoIter::new([success, stack_underflow, out_of_gas]) .map(move |mut constraint| { constraint.polys = [common_polys.clone(), constraint.polys].concat(); constraint }) .collect() } ``` If you want to know how to use halo2 to develop circuits, please refer to [Develop Circuits Using Halo2](https://hackernoon.com/sin7y-tech-review-develop-circuits-using-halo-2). #### 3. keccak256 While we are studying the code, the crate of keccak256 is still in the construction stage. We will analyze it in the follow-up study. ## How to participate As written in Sin7Y’s early articles, zkEVM is the brightest jewel in the crown of Ethereum scaling, and we look forward to more developers participating in this great project. Of course, as an open-source project, you may make your contribution in whatever way you like. Fixing a typo, improving documentation, or implementing a certain function are all helpful. If you are more familiar with the development of Halo2 and want to implement a certain opcode, you can take the above ADD/SUB as an example. The implementation process complies with writing the design specifications first. After the design specifications are merged by the core developers, you can write the opcode circuit code. Finally, don't forget to write the unit test code for the newly added opcode implementation. ![](https://i.imgur.com/POWHFkU.png) ## References ZkEVM Specifications https://github.com/appliedzkp/zkEVM-specs Circuits for zkEVM https://github.com/appliedzkp/zkEVM-circuits Develop Circuits Using Halo2 https://hackernoon.com/sin7y-tech-review-develop-circuits-using-halo-2 Exploring Popular zkEVM Solutions https://hackernoon.com/exploring-popular-zkEVM-solutions-appliedzkp-matter-labs-hermez-and-sin7y-quick-publication-ltq37ah ## 📎 A little about us *As a team of both vigorous and professional blockchain devs, Sin7Y has been doing constant research on zkEVM, layer 2, privacy computing, and cross-chain technology, delivering high-quality tech reviews on a regular basis. You can find us on [Twitter](https://twitter.com/Sin7YLabs) or write to us at sin7y@web3.com. Feel free to contact us by [DM](https://twitter.com/Sin7YLabs) or [Email](mailto:sin7y@web3.com) if you have any queries.*