[Visualization Slide](https://drive.google.com/file/d/1AFjFa0-3zmkqg9pbrR5vFhsblJDQIEk_/view?usp=sharing) ## Definition - `step` - We used to call it `slot`, but I found most time it's called a execution step. - `cell` - a unit to store witness, located by tuple `(column, row)` ## Introduction EVM circuit basically iterate over a list of transactions to verify their update is applied to state, so we need to: 1. Initialize a call and start to exection if it's a contract creation or if receiver is a contract 2. If it's a top level call (aka transaction), we need to verify extra stuff like signature, enough gas for intrinsic gas, enough balance for value, etc... Note that transactions that fail to initilize can't be put in a block. 3. If call initilized and there is bytecode to execute, we execute bytecode step by step to verify it's valid. 4. In each step we need to be able to verify all possible op and their case respectively. With so many tasks, we need to carefully branch the circuit to reduce both - cells used per step - degrees used on selector ## Psuedo Code This is the psuedo code for current implementation in [PR #26](https://github.com/appliedzkp/zkevm-circuits/pull/26) (some part are not implemented in circuit): ```python # the most costly case, CALL might take 7 words from stack NUM_MAX_CELL_OF_CASE = 7 * 32 # DUP* and SWAP* can be handled to gether without increasing the degree, # so here 115 = 145 - 15 - 15. It could be less when we found other opcodes # that could be handled together within available cells and degree. NUM_OP = 115 def bool_constraint(exp): assert exp * (1 - exp) == 0 # only one of bool switch should be 1 and all switch are bool def bool_switches_constraint(exps): for exp in exps: bool_constraint(exp) assert sum(exps) - 1 == 0 # class Step represents cells queried in one step class Step: # because circuit size is fixed, we need some no-op steps for padding is_padding = None is_initialize = None # TODO: global_counter, stack_pointer, gas_counter, etc... free_cells = None def __init__(self, cs, rot): self.is_padding, self.is_initialize = cs.query_advice(2, rot) self.free_cells = cs.query_advice(NUM_OP + NUM_MAX_CELL_OF_CASE, rot) # degree start with 3 def call_initialization(step_selector, curr: Step, next: Step): # verify tx is valid if depth == 0 # ... # degree starts with 2 def op_execution(step_selector, curr: Step, next: Step): assert curr.is_padding == 0 # free_cells offset fc_offset = 0 op_selectors = curr.free_cells[fc_offset:fc_offset + NUM_OP] fc_offset += NUM_OP bool_switches_constraint(op_selectors) # opcode: ADD + SUB if op_selectors[0]: case_selectors = curr.free_cells[fc_offset:fc_offset + 3] fc_offset += 3 bool_switches_constraint(case_selectors) # case: success if case_selectors[0]: [a, b, c, carry] = [ curr.free_cells[fc_offset + 32 * i:fc_offset + 32 * (i + 1)] for i in range (4) ] swap = curr.free_cells[fc_offset + 128] # TODO: # - tracking data transition # - word constraints # - addition constraints # - stack lookup # - ... # case: stack underflow if case_selectors[1]: # case: out of gas if case_selectors[2]: # opcode: MUL, SUB, DIV, ... def evm_circuit_constraint(cs): # step_selector as a pivot for a step step_selector = cs.query_selector(1, curr) curr = new Step(cs, CURR) next = new Step(cs, NEXT) if step_selector: # is_padding can only transit from 0 to 1 once bool_constraint(curr.is_padding) bool_constraint(next.is_padding - curr.is_padding) bool_constraint(curr.is_initialize) if curr.is_initialize: if curr.is_padding: # continue padding assert next.is_initialize == 1 else: call_initialization(step_selector, curr, next) else: op_execution(step_selector, curr, next) ``` Note that: - The `NUM_MAX_CELL_OF_CASE` is decided by the most costly opcode, which is probably success case of `CALL` that uses up to 7 words. - Every `if` increases the degree by 1 ## Dilemma For `op_execution`, for now we are using approach: 1. Use `NUM_OP` cells as op bool switches, which might cost around 100 cells per step. The left degree per case is 4, the prefixed degrees are from - `step_selector` - `is_initialize` - `op_selector` - `case_selector` Becasue the most costly case (probably the success case of `CALL`) cost up to `7 * 32` cells, if we can make use of these empty cells for those simple opcode like `ADD`, `SUB` which only cost `4 * 32` cells, then it would save cell usage a lot per step. It's like: 2. Use `NUM_GROUP` cells as bool switches to generate `NUM_GROUP` groups, each group contains op bool switches for responsible ops. For example we can split into 4 group, first 3 of groups handles most simple opcodes, the left one handles costly opcode like `CALL`. However, it's not free to do so, we will sacrifice 1 degree for `group_selector` and make the linear combination of cases very limited. So if we stick to approach 1 for now and everything is working, that's good. But when we are trying to optmize to some approaches that have less degree for linear combination like approach 2, we need to re-implement some gadget if they are using up to degree 4. There are another approach to reduce cells per step: 3. Allow more rotations and make expensive opcodes into 2 step or more steps, and we don't need to group them. But multi-step means we need another `step_selector`, which also makes the degree left 3, and the extra step's `Step` and op selectors seem to cancel out the saving. ## Random Thought ### Reduced Instructions Just read Hermez's muVM again and get another optmization idea: 1. Decompression in computation trace is expensive, so let's use lookup to replace all the operation needed by decompression such as arithmetic operation, bitwise operation, comparator, etc... 2. The lookup table would be built by all the words we have used, and the words are placed in a friendly way for our reduced-instruction to operate on (verify on). For example we can layout them like: | `tag_add` | `tag_mul` | `... (more tag)` | `rlc_word` | `b0` | `b1` | `b2` | `...` | `b31` | | --------- | --------- | ---------------- | ------------- | ---- | ---- | ---- | ----- | ----- | | `1` | `0` | `...` | `... (a)` | `1` | `2` | `3` | `...` | `0` | | `0` | `0` | `...` | `... (b)` | `4` | `5` | `6` | `...` | `0` | | `0` | `0` | `...` | `... (c)` | `5` | `7` | `9` | `...` | `0` | | `0` | `0` | `...` | `... (carry)` | `0` | `0` | `0` | `...` | `0` | All columns are advice columns and `rlc_word = random_linear_combination(b0, b1, ..., b31)`. Then in evm circuit we can simply do something like: ```rust meta.lookup(|meta| vec![ (meta.query_advice(tag_add, Rotation(0)), Expression::Constant(F::one())), (meta.query_advice(rlc_word, Rotation(0)), a), (meta.query_advice(rlc_word, Rotation(1)), b), (meta.query_advice(rlc_word, Rotation(2)), c), ]) ``` There are some pros and cons - Pros: - Because all the used words are here, we can use their commitment to derive random factor for RLC (random linear combination). - We save a lot cells per step - Some repeated operand would cost only one record in the table to reduce data redundancy. For example when we are verifying `1 + 2 = 3` multiple times in evm circuit, there will still be only one record (4 rows) in the table. - Cons: - There will be more lookup overall - Not sure if all cases are friendly to do so