
Definition

  • slot - multiple rows to verify a single op
  • q - selector to enable a slot, should be 1 only in the last row in a slot.
  • x_diff_inv - witnessed by the prover as inv(fixed_x - x), where fixed_x is baked into the circuit. The is_zero expression 1 - (fixed_x - x) * x_diff_inv is then used to switch on the custom constraints of an op or case.
  • context
    • gc - global counter
    • addr - address holding opcodes
    • pc - program counter
    • op - opcode located at addr and pc
    • sp - stack pointer
    • ... - more will be needed, see more details here
  • v0..v31 - decompressed operating values in word bytes from low to high (0x42 will be [0x42, 0, ..., 0])
  • inv - a function that returns the inverse in field fq if it exists, and 0 otherwise.
  • compress - a function that compresses variadic inputs into a single fq element using a random linear combination.
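The v0..v31 decompression and the compress function defined above can be sketched as follows. This is a minimal illustration, not the circuit's implementation: the modulus P is a small stand-in prime, and the randomness r and the helper name decompress are assumptions made for the example.

```python
# Stand-in prime modulus (2**61 - 1). An assumption for illustration only;
# it is not the field fq used by the actual circuit.
P = 2**61 - 1


def decompress(value):
    """Split a 256-bit word into 32 bytes, low byte first (v0..v31)."""
    return [(value >> (8 * i)) & 0xFF for i in range(32)]


def compress(values, r):
    """Random linear combination: sum(values[i] * r**i) mod P."""
    acc = 0
    for v in reversed(values):  # Horner's rule, highest power first
        acc = (acc * r + v) % P
    return acc


r = 123456789  # hypothetical randomness, fixed here so the example is repeatable
word_bytes = decompress(0x42)
compressed = compress(word_bytes, r)
```

Because only v0 is nonzero for 0x42, the compressed value equals 0x42 regardless of r; words with more nonzero bytes depend on r.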

Layout

In the EVM circuit, we need to verify every possible opcode, including its error cases, in a single fixed slot, so the prover has to provide some auxiliary values that let the circuit switch on the right custom constraint. Naively, the circuit would need 3 regions:

  • operating values
    The inputs and outputs of an opcode. For example, an ADD takes 2 inputs and returns 1 output. Decompression is needed when the relation between inputs and outputs requires a byte-by-byte check.
  • context
    The current context the EVM holds, such as gc, addr, pc, etc.
  • op and case switch
    The auxiliary values that tell the circuit which custom constraints of opcodes and cases to switch on. Naively, we could require the prover to witness {op,case}_diff_inv so that the circuit can produce an is_zero boolean expression to switch on the op (see Q1).
    The case here stands for every possibility when executing an op. For example, an ADD could have 1 success case and 2 error cases, ErrOutOfGas and ErrStackUnderflow.
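The op switch described in the last region can be sketched as below. It is a rough illustration under assumptions: P is a stand-in prime rather than the real field, and OPCODES, is_zero_expr and op_switches are hypothetical names. Only three opcodes are listed to keep it short.

```python
P = 2**61 - 1  # stand-in prime modulus (assumption, not the real field)

# Fixed opcode values baked into the circuit (only three shown here).
OPCODES = {"ADD": 1, "MUL": 2, "SUB": 3}


def inv(x):
    """Inverse in the field if it exists, 0 otherwise (Fermat's little theorem)."""
    return pow(x % P, P - 2, P)


def is_zero_expr(fixed_x, x, x_diff_inv):
    """Evaluate 1 - (fixed_x - x) * x_diff_inv in the field."""
    return (1 - (fixed_x - x) * x_diff_inv) % P


def op_switches(op):
    """Prover witnesses one op_diff_inv per opcode; the circuit derives the switches."""
    switches = {}
    for name, fixed_op in OPCODES.items():
        op_diff_inv = inv(fixed_op - op)  # witnessed value
        switches[name] = is_zero_expr(fixed_op, op, op_diff_inv)
    return switches


print(op_switches(1))  # → {'ADD': 1, 'MUL': 0, 'SUB': 0}
```

With an honest witness, exactly one switch evaluates to 1, and that switch multiplies the custom constraints of the matching opcode.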

For visualization, the layout could be like:

[Figure: slot layout]

Example - ADD

Suppose we don't care about the limit on the number of wires and can afford a circuit wide enough (w=32) to put each word into one row.

Then, taking (op, va, vb, vc) = (1, 3, 5, 8) as an example (to verify 8 = 3 + 5), we decompress each of the three values in the bus mapping into v0..v31 and apply the custom constraint of ADD.

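| q | a_1 | a_2 | a_3 | a_4 | a_5 | ... | a_32 | Note |
|---|---|---|---|---|---|---|---|---|
| ... | ... | ... | ... | ... | ... | ... | ... | Cells storing values |
| 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | ↑ (carry) |
| 0 | 8 | 0 | 0 | 0 | 0 | ... | 0 | ↑ (vc) |
| 0 | 5 | 0 | 0 | 0 | 0 | ... | 0 | ↑ (vb) |
| 0 | 3 | 0 | 0 | 0 | 0 | ... | 0 | ↑ (va) |
| 0 | gc | addr | pc | op | sp | ... | ... | Context |
| ... | ... | ... | ... | ... | ... | ... | ... | Witness inverse of op difference or 0 |
| 1 | inv(1-1) (ADD) | inv(2-1) (MUL) | inv(3-1) (SUB) | ... | ... | ... | ... | |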

Then we will have several constraints (cur() points to the last row of the slot):

// switch on custom constraint only when the opcode is ADD
if is_zero(op - 1) {
    if case_success {
        // two source reads
        for (gc, sp, v) in [(gc, sp, va), (gc+1, sp+1, vb)] {
            bus_mapping_lookup(
                gc,
                Stack,
                sp,
                compress(v),
                Read
            )
        }

        // one result write
        bus_mapping_lookup(
            gc+2,
            Stack,
            sp+1,
            compress(vc),
            Write
        )

        // the result is indeed the sum of the two sources
        eight_bit_lookup(va[0])
        eight_bit_lookup(vb[0])
        eight_bit_lookup(vc[0])
        256 * carry[0] + vc[0] === va[0] + vb[0]

        for i in range 1..=31 {
            eight_bit_lookup(va[i])
            eight_bit_lookup(vb[i])
            eight_bit_lookup(vc[i])

            256 * carry[i] + vc[i] === va[i] + vb[i] + carry[i-1]
        }

        // gc in the next slot should increase by 3
        gc_next === gc + 3

        // addr should stay the same
        addr_next === addr

        // pc should increase by 1
        pc_next === pc + 1

        // sp should increase by 1 (ADD = 2 POP + 1 PUSH)
        sp_next === sp + 1
    }

    if case_err_out_of_gas {
        // TODO:
        // - gas_left < gas_cost
        // - consume all given gas
        // - context switch to parent
        // - ...
    }

    if case_err_stack_underflow {
        // TODO:
        // - sp + 1 === 1023 + surfeit
        // - consume all given gas
        // - context switch to parent
        // - ...
    }
}

Note that we don't need an 8-bit addition lookup here, because some ops, such as comparators, already use an 8-bit range lookup to check that values fit in 8 bits. The 8-bit range check therefore becomes free for other ops: we only need to add their switch boolean expressions together to enable it.

Once we know all values fit in 8 bits, we only need to iteratively check that every byte is added correctly with the carry bit, using a simple custom constraint.
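The byte-wise check with carry can be tried out with a small sketch over plain integers. The function names witness_add and check_add are hypothetical, and the range test stands in for eight_bit_lookup; a real circuit evaluates the same equalities over the field.

```python
def to_bytes_le(value):
    """Decompress a 256-bit word into 32 bytes, low byte first."""
    return [(value >> (8 * i)) & 0xFF for i in range(32)]


def witness_add(a, b):
    """Prover side: compute the result bytes vc and the per-byte carries."""
    va, vb = to_bytes_le(a), to_bytes_le(b)
    vc, carry = [], []
    c = 0
    for i in range(32):
        total = va[i] + vb[i] + c
        vc.append(total & 0xFF)
        c = total >> 8
        carry.append(c)
    return va, vb, vc, carry


def check_add(va, vb, vc, carry):
    """Circuit side: 8-bit range checks plus the carry constraint on every byte."""
    for limbs in (va, vb, vc):
        if any(not 0 <= byte < 256 for byte in limbs):  # stands in for eight_bit_lookup
            return False
    prev_carry = 0  # byte 0 has no incoming carry
    for i in range(32):
        if 256 * carry[i] + vc[i] != va[i] + vb[i] + prev_carry:
            return False
        prev_carry = carry[i]
    return True


va, vb, vc, carry = witness_add(3, 5)
ok = check_add(va, vb, vc, carry)
```

For (va, vb) = (3, 5) the witness has vc[0] = 8 and all carries 0; for inputs such as 0xFF + 1 the carry out of byte 0 becomes 1 and feeds into byte 1.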

Example - JUMPI

We take (op, dest, cond) = (87, 4, cond) as an example (to verify a jump to pc = 4 when the condition is nonzero). Here cond is the compressed form of the actual value and is used directly by the is_zero gadget to check whether it is zero; the prover has a negligible chance of compressing a nonzero value into 0. (The is_zero gadget needs another cell holding the inverse of cond to produce the expression, hence the inv(cond) cell.)

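| q | a_1 | a_2 | a_3 | a_4 | a_5 | ... | a_32 | Note |
|---|---|---|---|---|---|---|---|---|
| ... | ... | ... | ... | ... | ... | ... | ... | Cells storing values |
| 0 | cond | inv(cond) | 0 | 0 | 0 | ... | 0 | ↑ (cond) |
| 0 | 4 | 0 | 0 | 0 | 0 | ... | 0 | ↑ (dest) |
| 0 | gc | addr | pc | op | sp | ... | ... | Context |
| ... | ... | ... | ... | ... | ... | ... | ... | Witness inverse of op difference or 0 |
| 1 | inv(1-87) (ADD) | inv(2-87) (MUL) | inv(3-87) (SUB) | ... | ... | ... | ... | |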

We could further reduce the cell cost if needed: because a contract can have at most 0x6000 opcodes, dest fits in two cells.

// switch on custom constraint only when the opcode is JUMPI
if is_zero(op - 87) {
    if case_success {
        // one stack read for destination
        bus_mapping_lookup(
            gc,
            Stack,
            sp,
            compress(dest),
            Read
        )

        // one stack read for condition
        bus_mapping_lookup(
            gc+1,
            Stack,
            sp+1,
            cond,
            Read
        )

        // we don't jump when condition is zero
        if is_zero(cond) {
            pc_next === pc + 1 // pc should increase by 1
        } else {
            pc_next === dest // pc should change to dest
            op_next === 91   // destination should be JUMPDEST
        }

        gc_next === gc + 2 // gc should increase by 2 (two stack reads)
        addr_next === addr // addr remains the same
        sp_next === sp + 2 // sp should increase by 2 (JUMPI = 2 POP)
    }

    if case_err_out_of_gas {
        // TODO:
        // - gas_left < gas_cost
        // - consume all given gas
        // - context switch to parent
        // - ...
    }

    if case_err_stack_underflow {
        // TODO:
        // - sp + 1 === 1023 + surfeit
        // - consume all given gas
        // - context switch to parent
        // - ...
    }

    if case_err_jump_dest_invalid {
        // TODO:
        // - op_lookup(addr, dest, dest_op) && dest_op !== JUMPDEST
        // - consume all given gas
        // - context switch to parent
        // - ...
    }

    if case_err_jump_dest_out_of_range {
        // TODO:
        // - dest >= len(code)
        // - consume all given gas
        // - context switch to parent
        // - ...
    }
}
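The if/else on is_zero(cond) in the success case is pseudocode; in a circuit it would presumably be flattened into a single expression that must equal zero. A minimal sketch of that flattening, assuming a stand-in prime P and the hypothetical name jumpi_pc_constraint:

```python
P = 2**61 - 1  # stand-in prime modulus (assumption, not the real field)


def inv(x):
    """Inverse in the field if it exists, 0 otherwise."""
    return pow(x % P, P - 2, P)


def jumpi_pc_constraint(pc, pc_next, dest, cond, cond_inv):
    """Return 0 exactly when pc_next matches the JUMPI success case."""
    cond_is_zero = (1 - cond * cond_inv) % P  # is_zero expression on cond
    not_taken = cond_is_zero * (pc_next - (pc + 1))  # active when cond == 0
    taken = (1 - cond_is_zero) * (pc_next - dest)  # active when cond != 0
    return (not_taken + taken) % P


# cond != 0: the jump to dest = 4 is taken
taken_ok = jumpi_pc_constraint(pc=10, pc_next=4, dest=4, cond=1, cond_inv=inv(1))
# cond == 0: execution falls through to pc + 1
fallthrough_ok = jumpi_pc_constraint(pc=10, pc_next=11, dest=4, cond=0, cond_inv=inv(0))
```

Both calls evaluate to 0, while a mismatched pc_next leaves a nonzero residue. The op_next === 91 check would be gated by (1 - cond_is_zero) in the same way.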

Question & Discussion

Q1. Switch

Currently it costs one cell of op_diff_inv per op for the circuit to get an is_zero boolean expression, which means a slot requires 142 cells just for the op switch.

The case switch is in the same situation. The op with the most error cases is CREATE, which could have 7 error cases, so we need 7 cells per slot for the case switch.

Are there more efficient methods to produce a boolean expression for each opcode?
han

Q2. Multi-Slot op

The ops SHA3, CALLDATACOPY, CODECOPY, EXTCODECOPY, RETURNDATACOPY, LOGX, and CREATEX operate on a variadic number of values, so multiple slots will be needed, because we don't know how many values to process when building the circuit.

We could have slot_todo in each slot to count how many things we still need to handle.
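One possible reading of slot_todo, sketched under assumptions: each slot handles up to 32 bytes (the per_slot capacity is hypothetical), slot_todo counts the slots still to come, and it must decrease by 1 per slot until it reaches 0. The names plan_slots and check_slots are illustrative only.

```python
def plan_slots(data, per_slot=32):
    """Split variadic input into slots, tagging each with its slot_todo counter."""
    chunks = [data[i:i + per_slot] for i in range(0, len(data), per_slot)]
    if not chunks:
        chunks = [b""]  # even empty input occupies one slot
    total = len(chunks)
    return [(total - 1 - index, chunk) for index, chunk in enumerate(chunks)]


def check_slots(slots):
    """Constraint sketch: slot_todo decreases by 1 per slot and ends at 0."""
    for (todo, _), (todo_next, _) in zip(slots, slots[1:]):
        if todo_next != todo - 1:
            return False
    return slots[-1][0] == 0


slots = plan_slots(bytes(70))  # 70 bytes of input, e.g. for SHA3
todo_values = [todo for todo, _ in slots]
```

Here 70 bytes need three slots, with slot_todo going 2, 1, 0. The circuit never has to know the total length in advance, only that the counter steps down correctly.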

Q3. Proof size discussion

How much will the proof size increase for KZG10 if we

  • access a new rotation $n$ on $m$ columns
    • $[W_{z\omega^n}]_1$: element (linear combination commitment)
    • $(\bar{c}_i^{\omega^n})_{i=1}^{m}$: scalars (evaluations)
  • add an extra column $c_i$
    • $[c_i]_1$: element (commitment)
    • $\bar{c}_i$: scalar (evaluation)
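The counts in the list above can be turned into a small calculator for comparing circuit shapes. This is only a sketch of that estimate: the 32-byte sizes for elements and scalars are assumptions, and proof_growth is a hypothetical helper. The example compares 32 value columns in one row against 8 columns spread over 4 rows (3 extra rotations).

```python
def rotation_cost(m):
    """New rotation on m columns: 1 commitment element, m evaluation scalars."""
    return 1, m


def column_cost():
    """Extra column: 1 commitment element, 1 evaluation scalar."""
    return 1, 1


def proof_growth(rotation_columns, extra_columns, element_bytes=32, scalar_bytes=32):
    """Sum the estimate for a list of new rotations (columns touched by each) and extra columns.

    The byte sizes are assumed values, not taken from the note.
    """
    elements = scalars = 0
    for m in rotation_columns:
        e, s = rotation_cost(m)
        elements += e
        scalars += s
    for _ in range(extra_columns):
        e, s = column_cost()
        elements += e
        scalars += s
    return elements, scalars, elements * element_bytes + scalars * scalar_bytes


wide = proof_growth(rotation_columns=[], extra_columns=32)
folded = proof_growth(rotation_columns=[8, 8, 8], extra_columns=8)
```

Under this model the folded shape needs the same number of scalars but fewer elements, which is the kind of trade-off the estimate is meant to expose.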

If we don't want the circuit to be too wide (too many columns), we have to fold v0, ..., v31 into different rows; the rotations then become more complicated and new rotations might be needed. So we need to know how much each option costs before we can pick the cheaper circuit dimension.

Not sure if this estimation makes sense
han