ZKEVM EVM Circuit - Layout and Branching

```python # the most costly case, CALL might take 7 words from stack NUM_MAX_CELL_OF_CASE = 7 * 32 # DUP* and SWAP* can be handled to gether without increasing the degree, # so here 115 = 145 - 15 - 15. It could be less when we found other opcodes # that could be handled together within available cells and degree. NUM_OP = 115 def bool_constraint(exp): assert exp * (1 - exp) == 0 # only one of bool switch should be 1 and all switch are bool def bool_switches_constraint(exps): for exp in exps: bool_constraint(exp) assert sum(exps) - 1 == 0 # class Step represents cells queried in one step class Step: # because circuit size is fixed, we need some no-op steps for padding is_padding = None is_initialize = None # TODO: global_counter, stack_pointer, gas_counter, etc... free_cells = None def __init__(self, cs, rot): self.is_padding, self.is_initialize = cs.query_advice(2, rot) self.free_cells = cs.query_advice(NUM_OP + NUM_MAX_CELL_OF_CASE, rot) # degree start with 3 def call_initialization(step_selector, curr: Step, next: Step): # verify tx is valid if depth == 0 # ... # degree starts with 2 def op_execution(step_selector, curr: Step, next: Step): assert curr.is_padding == 0 # free_cells offset fc_offset = 0 op_selectors = curr.free_cells[fc_offset:fc_offset + NUM_OP] fc_offset += NUM_OP bool_switches_constraint(op_selectors) # opcode: ADD + SUB if op_selectors[0]: case_selectors = curr.free_cells[fc_offset:fc_offset + 3] fc_offset += 3 bool_switches_constraint(case_selectors) # case: success if case_selectors[0]: [a, b, c, carry] = [ curr.free_cells[fc_offset + 32 * i:fc_offset + 32 * (i + 1)] for i in range (4) ] swap = curr.free_cells[fc_offset + 128] # TODO: # - tracking data transition # - word constraints # - addition constraints # - stack lookup # - ... # case: stack underflow if case_selectors[1]: # case: out of gas if case_selectors[2]: # opcode: MUL, SUB, DIV, ... def evm_circuit_constraint(cs): # step_selector as a pivot for a step step_selector = cs.query_selector(1, curr) curr = new Step(cs, CURR) next = new Step(cs, NEXT) if step_selector: # is_padding can only transit from 0 to 1 once bool_constraint(curr.is_padding) bool_constraint(next.is_padding - curr.is_padding) bool_constraint(curr.is_initialize) if curr.is_initialize: if curr.is_padding: # continue padding assert next.is_initialize == 1 else: call_initialization(step_selector, curr, next) else: op_execution(step_selector, curr, next) ```

Visualization Slide

Definition

step - We used to call it slot, but I found most time it's called a execution step.
cell - a unit to store witness, located by tuple (column, row)

Introduction

EVM circuit basically iterate over a list of transactions to verify their update is applied to state, so we need to:

Initialize a call and start to exection if it's a contract creation or if receiver is a contract
If it's a top level call (aka transaction), we need to verify extra stuff like signature, enough gas for intrinsic gas, enough balance for value, etc… Note that transactions that fail to initilize can't be put in a block.
If call initilized and there is bytecode to execute, we execute bytecode step by step to verify it's valid.
In each step we need to be able to verify all possible op and their case respectively.

With so many tasks, we need to carefully branch the circuit to reduce both

cells used per step
degrees used on selector

Psuedo Code

This is the psuedo code for current implementation in PR #26 (some part are not implemented in circuit):

# the most costly case, CALL might take 7 words from stack
NUM_MAX_CELL_OF_CASE = 7 * 32
# DUP* and SWAP* can be handled to gether without increasing the degree,
# so here 115 = 145 - 15 - 15. It could be less when we found other opcodes 
# that could be handled together within available cells and degree.
NUM_OP = 115

def bool_constraint(exp):
    assert exp * (1 - exp) == 0

# only one of bool switch should be 1 and all switch are bool
def bool_switches_constraint(exps):
    for exp in exps:
        bool_constraint(exp)

    assert sum(exps) - 1 == 0

# class Step represents cells queried in one step
class Step:
    # because circuit size is fixed, we need some no-op steps for padding
    is_padding = None
    is_initialize = None
    # TODO: global_counter, stack_pointer, gas_counter, etc...
    free_cells = None

    def __init__(self, cs, rot):
        self.is_padding, self.is_initialize = cs.query_advice(2, rot)
        self.free_cells = cs.query_advice(NUM_OP + NUM_MAX_CELL_OF_CASE, rot)

# degree start with 3
def call_initialization(step_selector, curr: Step, next: Step):
    # verify tx is valid if depth == 0
    # ...

# degree starts with 2
def op_execution(step_selector, curr: Step, next: Step):
    assert curr.is_padding == 0
    
    # free_cells offset
    fc_offset = 0
    
    op_selectors = curr.free_cells[fc_offset:fc_offset + NUM_OP]
    fc_offset += NUM_OP
    bool_switches_constraint(op_selectors)

    # opcode: ADD + SUB
    if op_selectors[0]:
        case_selectors = curr.free_cells[fc_offset:fc_offset + 3]
        fc_offset += 3
        bool_switches_constraint(case_selectors)

        # case: success
        if case_selectors[0]:
            [a, b, c, carry] = [
                curr.free_cells[fc_offset + 32 * i:fc_offset + 32 * (i + 1)]
                for i in range (4)
            ]
            swap = curr.free_cells[fc_offset + 128]
            # TODO:
            # - tracking data transition
            # - word constraints
            # - addition constraints
            # - stack lookup
            # - ...

        # case: stack underflow
        if case_selectors[1]:

        # case: out of gas
        if case_selectors[2]:

    # opcode: MUL, SUB, DIV, ...

def evm_circuit_constraint(cs):
    # step_selector as a pivot for a step
    step_selector = cs.query_selector(1, curr)
    curr = new Step(cs, CURR)
    next = new Step(cs, NEXT)

    if step_selector:
        # is_padding can only transit from 0 to 1 once
        bool_constraint(curr.is_padding)
        bool_constraint(next.is_padding - curr.is_padding)

        bool_constraint(curr.is_initialize)

        if curr.is_initialize:
            if curr.is_padding:
                # continue padding
                assert next.is_initialize == 1
            else:
                call_initialization(step_selector, curr, next)
        else:
            op_execution(step_selector, curr, next)

Note that:

The NUM_MAX_CELL_OF_CASE is decided by the most costly opcode, which is probably success case of CALL that uses up to 7 words.
Every if increases the degree by 1

Dilemma

For op_execution, for now we are using approach:

Use NUM_OP cells as op bool switches, which might cost around 100 cells per step. The left degree per case is 4, the prefixed degrees are from
- step_selector
- is_initialize
- op_selector
- case_selector

Becasue the most costly case (probably the success case of CALL) cost up to 7 * 32 cells, if we can make use of these empty cells for those simple opcode like ADD, SUB which only cost 4 * 32 cells, then it would save cell usage a lot per step. It's like:

Use NUM_GROUP cells as bool switches to generate NUM_GROUP groups, each group contains op bool switches for responsible ops. For example we can split into 4 group, first 3 of groups handles most simple opcodes, the left one handles costly opcode like CALL.

However, it's not free to do so, we will sacrifice 1 degree for group_selector and make the linear combination of cases very limited.

So if we stick to approach 1 for now and everything is working, that's good. But when we are trying to optmize to some approaches that have less degree for linear combination like approach 2, we need to re-implement some gadget if they are using up to degree 4.

There are another approach to reduce cells per step:

Allow more rotations and make expensive opcodes into 2 step or more steps, and we don't need to group them. But multi-step means we need another step_selector, which also makes the degree left 3, and the extra step's Step and op selectors seem to cancel out the saving.

Random Thought

Reduced Instructions

Just read Hermez's muVM again and get another optmization idea:

Decompression in computation trace is expensive, so let's use lookup to replace all the operation needed by decompression such as arithmetic operation, bitwise operation, comparator, etc…
The lookup table would be built by all the words we have used, and the words are placed in a friendly way for our reduced-instruction to operate on (verify on). For example we can layout them like:

tag_add tag_mul ... (more tag) rlc_word b0 b1 b2 ... b31

1 0 ... ... (a) 1 2 3 ... 0

0 0 ... ... (b) 4 5 6 ... 0

0 0 ... ... (c) 5 7 9 ... 0

0 0 ... ... (carry) 0 0 0 ... 0

All columns are advice columns and rlc_word = random_linear_combination(b0, b1, ..., b31). Then in evm circuit we can simply do something like:
```
meta.lookup(|meta| vec![
    (meta.query_advice(tag_add, Rotation(0)), Expression::Constant(F::one())),
    (meta.query_advice(rlc_word, Rotation(0)), a),
    (meta.query_advice(rlc_word, Rotation(1)), b),
    (meta.query_advice(rlc_word, Rotation(2)), c),
])
```

`tag_add`	`tag_mul`	`... (more tag)`	`rlc_word`	`b0`	`b1`	`b2`	`...`	`b31`
`1`	`0`	`...`	`... (a)`	`1`	`2`	`3`	`...`	`0`
`0`	`0`	`...`	`... (b)`	`4`	`5`	`6`	`...`	`0`
`0`	`0`	`...`	`... (c)`	`5`	`7`	`9`	`...`	`0`
`0`	`0`	`...`	`... (carry)`	`0`	`0`	`0`	`...`	`0`

There are some pros and cons

Pros:
- Because all the used words are here, we can use their commitment to derive random factor for RLC (random linear combination).
- We save a lot cells per step
- Some repeated operand would cost only one record in the table to reduce data redundancy. For example when we are verifying 1 + 2 = 3 multiple times in evm circuit, there will still be only one record (4 rows) in the table.
Cons:
- There will be more lookup overall
- Not sure if all cases are friendly to do so

Definition

Introduction

Psuedo Code

Dilemma

Random Thought

Reduced Instructions

Read more

2024 zkVM team summary (PRECON)

Halo2 Peak Memory Usage

2024-01-11 Milestone review

2023-11-18 PSE tooling