# ZKEVM State Circuit Explore - Storage


<style>
.table-note-nowrap td:last-child { white-space: nowrap; }
.table-reverted tr:nth-child(7) td, .table-reverted tr:nth-child(11) td { background-color: #ecf2ea; }
.table-reverted tr:nth-child(9) td, .table-reverted tr:nth-child(10) td { background-color: #fdf2e6; }
.table-example tr:nth-child(5) td, .table-example tr:nth-child(16) td { background-color: #ecf2ea; }
.table-example tr:nth-child(10) td, .table-example tr:nth-child(15) td { background-color: #fdf2e6; }
</style>

## Introduction

This note explores how the storage circuit should work with the evm circuit, especially when a `REVERT` is encountered in sub-calls (extending the reverting process described [here](https://hackmd.io/0Z3N9gniRAmOg8X1RN6VAg)).

We focus on **storage of account state** and **errors caused by `REVERT`** in this note.

## Definition

- `context`
    - `gc` - global counter
    - `addr` - account address holding the opcodes
    - `pc` - program counter
    - `op` - opcode with respect to `addr` and `pc`
    - `sp` - stack pointer
    - `is_persistant` - a binary flag that tells us whether this call will eventually succeed (immutable within a call)
    - `gc_end_of_call` - a number that tells us when the call ends; we use it to roll back storage writes in FILO order (immutable within a call)
    - `sstore_counter` - a counter of how many `SSTORE`s we have done
    - `...` - more will be needed, see more details [here](https://hackmd.io/0Z3N9gniRAmOg8X1RN6VAg)

## Storage Circuit

The storage circuit uses a hierarchical namespacing similar to the evm's: we group access records by **account address** and then by **storage location**. Finally, within each sub-group (each location of an account), we order records by ascending **global counter**. (This differs from stack and memory, which must be separated by a unique call context rather than by account address alone, so that the source of `CALLDATA` and `RETURNDATA` can be identified.)

Also, for convenience when rolling back, we add an extra `val_prev` to the access record of each `Storage` `Write`.
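
The grouping and ordering described above can be sketched in a few lines of Rust. This is only an illustration: the `StorageRecord` type, its field types, and the helpers `rec`, `order_records`, and `check_consistency` are hypothetical names, and the consistency rule (a read sees the previous value, a write's `val_prev` equals the previous value) is an assumed reading of how `val_prev` would be used, not a constraint taken from the circuit.

```rust
// Hypothetical access record; field names follow the note, types are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
struct StorageRecord {
    addr: u64,             // account address (shortened for the example)
    key: u64,              // storage location
    gc: u64,               // global counter
    val: u64,              // value read or written
    val_prev: Option<u64>, // previous value, only present for writes
    is_write: bool,
}

// Small constructor to keep the example rows short.
fn rec(addr: u64, key: u64, gc: u64, val: u64, val_prev: Option<u64>, is_write: bool) -> StorageRecord {
    StorageRecord { addr, key, gc, val, val_prev, is_write }
}

/// Order records by account address, then storage location, then global counter.
fn order_records(mut records: Vec<StorageRecord>) -> Vec<StorageRecord> {
    records.sort_by_key(|r| (r.addr, r.key, r.gc));
    records
}

/// Assumed rule: inside one (addr, key) group, a read returns the previous
/// row's value and a write's `val_prev` equals the previous row's value.
fn check_consistency(sorted: &[StorageRecord]) -> bool {
    sorted.windows(2).all(|w| {
        let (prev, cur) = (&w[0], &w[1]);
        if (prev.addr, prev.key) != (cur.addr, cur.key) {
            return true; // a new group starts, nothing to compare
        }
        if cur.is_write {
            cur.val_prev == Some(prev.val)
        } else {
            cur.val == prev.val
        }
    })
}

fn main() {
    let sorted = order_records(vec![
        rec(0xa, 0, 12, 1, Some(0), true),
        rec(0xa, 1, 3, 7, Some(0), true),
        rec(0xa, 0, 1, 0, Some(0), true), // init row
        rec(0xa, 0, 5, 0, None, false),
    ]);
    for r in &sorted {
        println!("{:?}", r);
    }
    println!("consistent: {}", check_consistency(&sorted));
}
```

Sorting on the tuple `(addr, key, gc)` gives the lexicographic order the storage circuit relies on, so neighbouring rows of the same location can be compared pairwise.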

Another extra value in the bus mapping is `is_first_touch` for [EIP2929](https://eips.ethereum.org/EIPS/eip-2929); it should be `1` when a storage location is touched for the first time in an EOA call.

So the bus mapping lookup could look like:

```rust
bus_mapping_lookup(
    gc,
    Storage,
    key,
    val,
    rw,
    val_prev,
    is_first_touch,
)
```

## Example

### Multi-Call Example

I take this naive but [concrete example](https://gist.github.com/han0110/f6ebe15d28bda50f0c4feceb07455035): `Caller` triggers `Callee` to perform the simple storage operation `Callee.0++; Callee.0++` 3 times, but the second call reverts. So in the end `Callee.0` will be `4`.

Their runtime codes are:

```solidity
object "Caller" {
    code {
        let callee := loadimmutable("callee")

        // call 1: success
        mstore(0, 1)
        pop(call(30000, callee, 0, 0, 32, 0, 0))

        // call 2: revert
        mstore(0, 0)
        pop(call(30000, callee, 0, 0, 32, 0, 0))

        // call 3: success
        mstore(0, 1)
        pop(call(30000, callee, 0, 0, 32, 0, 0))
    }
}

object "Callee" {
    code {
        // counter++
        sstore(0, add(sload(0), 1))
        // counter++
        sstore(0, add(sload(0), 1))

        // stop
        if calldataload(0) { stop() }

        // revert
        revert(0, 0)
    }
}
```

Then we will get the execution trace below after calling `Caller`; I replaced all `DUP` with `PUSH` for simplicity.
(`ID` for `CallIndex`, `RV` for `Revert`):

```
ID    RV    PC     OP              STACK                           STORAGE
----  ----  -----  --------------  ------------------------------  ----------
0     0     0      PUSH1 1         [ , , , , , , 1 ]
0     0     2      PUSH1 0         [ , , , , , 0, 1 ]
0     0     4      MSTORE          [ , , , , , , ]
0     0     5      PUSH1 0         [ , , , , , , 0 ]
0     0     7      PUSH1 0         [ , , , , , 0, 0 ]
0     0     9      PUSH1 20        [ , , , , 20, 0, 0 ]
0     0     11     PUSH1 0         [ , , , 0, 20, 0, 0 ]
0     0     13     PUSH1 0         [ , , 0, 0, 20, 0, 0 ]
0     0     15     PUSH20 a        [ , a, 0, 0, 20, 0, 0 ]
0     0     36     PUSH2 c350      [ c350, a, 0, 0, 20, 0, 0 ]
0     0     39     CALL            [ , , , , , , ]
1     0     0      PUSH1 0         [ , , , , , , 0 ]               { 0: 0 }
1     0     2      SLOAD           [ , , , , , , 0 ]               { 0: 0 }
1     0     3      PUSH1 1         [ , , , , , 1, 0 ]              { 0: 0 }
1     0     5      ADD             [ , , , , , , 1 ]               { 0: 0 }
1     0     6      PUSH1 0         [ , , , , , 0, 1 ]              { 0: 0 }
1     0     8      SSTORE          [ , , , , , , ]                 { 0: 1 }
1     0     9      PUSH1 0         [ , , , , , , 0 ]               { 0: 1 }
1     0     11     SLOAD           [ , , , , , , 1 ]               { 0: 1 }
1     0     12     PUSH1 1         [ , , , , , 1, 1 ]              { 0: 1 }
1     0     14     ADD             [ , , , , , , 2 ]               { 0: 1 }
1     0     15     PUSH1 0         [ , , , , , 0, 2 ]              { 0: 1 }
1     0     17     SSTORE          [ , , , , , , ]                 { 0: 2 }
1     0     18     PUSH1 0         [ , , , , , , 0 ]               { 0: 2 }
1     0     20     CALLDATALOAD    [ , , , , , , 1 ]               { 0: 2 }
1     0     21     ISZERO          [ , , , , , , 0 ]               { 0: 2 }
1     0     22     PUSH2 1b        [ , , , , , 1b, 0 ]             { 0: 2 }
1     0     25     JUMPI           [ , , , , , , ]                 { 0: 2 }
1     0     26     STOP            [ , , , , , , 1 ]               { 0: 2 }
0     0     40     POP             [ , , , , , , ]
0     0     41     PUSH1 0         [ , , , , , , 0 ]
0     0     43     PUSH1 0         [ , , , , , 0, 0 ]
0     0     45     MSTORE          [ , , , , , , ]
0     0     46     PUSH1 0         [ , , , , , , 0 ]
0     0     48     PUSH1 0         [ , , , , , 0, 0 ]
0     0     50     PUSH1 20        [ , , , , 20, 0, 0 ]
0     0     52     PUSH1 0         [ , , , 0, 20, 0, 0 ]
0     0     54     PUSH1 0         [ , , 0, 0, 20, 0, 0 ]
0     0     56     PUSH20 a        [ , a, 0, 0, 20, 0, 0 ]
0     0     77     PUSH2 c350      [ c350, a, 0, 0, 20, 0, 0 ]
0     0     80     CALL            [ , , , , , , ]
2     1     0      PUSH1 0         [ , , , , , , 0 ]               { 0: 2 }
2     1     2      SLOAD           [ , , , , , , 2 ]               { 0: 2 }
2     1     3      PUSH1 1         [ , , , , , 1, 2 ]              { 0: 2 }
2     1     5      ADD             [ , , , , , , 3 ]               { 0: 2 }
2     1     6      PUSH1 0         [ , , , , , 0, 3 ]              { 0: 2 }
2     1     8      SSTORE          [ , , , , , , ]                 { 0: 3 }
2     1     9      PUSH1 0         [ , , , , , , 0 ]               { 0: 3 }
2     1     11     SLOAD           [ , , , , , , 3 ]               { 0: 3 }
2     1     12     PUSH1 1         [ , , , , , 1, 3 ]              { 0: 3 }
2     1     14     ADD             [ , , , , , , 4 ]               { 0: 3 }
2     1     15     PUSH1 0         [ , , , , , 0, 4 ]              { 0: 3 }
2     1     17     SSTORE          [ , , , , , , ]                 { 0: 4 }
2     1     18     PUSH1 0         [ , , , , , , 0 ]               { 0: 4 }
2     1     20     CALLDATALOAD    [ , , , , , , 0 ]               { 0: 4 }
2     1     21     ISZERO          [ , , , , , , 1 ]               { 0: 4 }
2     1     22     PUSH2 1b        [ , , , , , 1b, 1 ]             { 0: 4 }
2     1     25     JUMPI           [ , , , , , , ]                 { 0: 4 }
2     1     27     JUMPDEST        [ , , , , , , ]                 { 0: 4 }
2     1     28     PUSH1 0         [ , , , , , , 0 ]               { 0: 4 }
2     1     30     PUSH1 0         [ , , , , , 0, 0 ]              { 0: 4 }
2     1     32     REVERT          [ , , , , , , 0 ]               { 0: 2 }
0     0     81     POP             [ , , , , , , ]
0     0     82     PUSH1 1         [ , , , , , , 1 ]
0     0     84     PUSH1 0         [ , , , , , 0, 1 ]
0     0     86     MSTORE          [ , , , , , , ]
0     0     87     PUSH1 0         [ , , , , , , 0 ]
0     0     89     PUSH1 0         [ , , , , , 0, 0 ]
0     0     91     PUSH1 20        [ , , , , 20, 0, 0 ]
0     0     93     PUSH1 0         [ , , , 0, 20, 0, 0 ]
0     0     95     PUSH1 0         [ , , 0, 0, 20, 0, 0 ]
0     0     97     PUSH20 a        [ , a, 0, 0, 20, 0, 0 ]
0     0     118    PUSH2 c350      [ c350, a, 0, 0, 20, 0, 0 ]
0     0     121    CALL            [ , , , , , , ]
3     0     0      PUSH1 0         [ , , , , , , 0 ]               { 0: 2 }
3     0     2      SLOAD           [ , , , , , , 2 ]               { 0: 2 }
3     0     3      PUSH1 1         [ , , , , , 1, 2 ]              { 0: 2 }
3     0     5      ADD             [ , , , , , , 3 ]               { 0: 2 }
3     0     6      PUSH1 0         [ , , , , , 0, 3 ]              { 0: 2 }
3     0     8      SSTORE          [ , , , , , , ]                 { 0: 3 }
3     0     9      PUSH1 0         [ , , , , , , 0 ]               { 0: 3 }
3     0     11     SLOAD           [ , , , , , , 3 ]               { 0: 3 }
3     0     12     PUSH1 1         [ , , , , , 1, 3 ]              { 0: 3 }
3     0     14     ADD             [ , , , , , , 4 ]               { 0: 3 }
3     0     15     PUSH1 0         [ , , , , , 0, 4 ]              { 0: 3 }
3     0     17     SSTORE          [ , , , , , , ]                 { 0: 4 }
3     0     18     PUSH1 0         [ , , , , , , 0 ]               { 0: 4 }
3     0     20     CALLDATALOAD    [ , , , , , , 1 ]               { 0: 4 }
3     0     21     ISZERO          [ , , , , , , 0 ]               { 0: 4 }
3     0     22     PUSH2 1b        [ , , , , , 1b, 0 ]             { 0: 4 }
3     0     25     JUMPI           [ , , , , , , ]                 { 0: 4 }
3     0     26     STOP            [ , , , , , , 1 ]               { 0: 4 }
0     0     122    POP             [ , , , , , , ]
0     0     123    STOP            [ , , , , , , ]
```

From the above example, we can derive such storage circuit witnesses:
(`is_first_touch` is omitted for simplicity, only first row is `1`)

<div class="table-note-nowrap table-reverted">

| `addr` | `key` | `gc`  | `val` | `val_prev` | `rw`    | Note                        |
| ------ |:-----:| ----- | ----- | ---------- | ------- | --------------------------- |
| `a`    |  `0`  | ASC ↓ | `0`   | `0`        | `Write` | Init                        |
| `a`    |  `0`  |       | `0`   |            | `Read`  | `ID = 1` `2 SLOAD`          |
| `a`    |  `0`  |       | `1`   | `0`        | `Write` | `ID = 1` `8 SSTORE`         |
| `a`    |  `0`  |       | `1`   |            | `Read`  | `ID = 1` `11 SLOAD`         |
| `a`    |  `0`  |       | `2`   | `1`        | `Write` | `ID = 1` `17 SSTORE`        |
| `a`    |  `0`  |       | `2`   |            | `Read`  | `ID = 2` `2 SLOAD`          |
| `a`    |  `0`  |       | `3`   | `2`        | `Write` | `ID = 2` `8 SSTORE`         |
| `a`    |  `0`  |       | `3`   |            | `Read`  | `ID = 2` `11 SLOAD`         |
| `a`    |  `0`  |       | `4`   | `3`        | `Write` | `ID = 2` `17 SSTORE`        |
| `a`    |  `0`  |       | `3`   | `4`        | `Write` | `ID = 2` `17 SSTORE.REVERT` |
| `a`    |  `0`  |       | `2`   | `3`        | `Write` | `ID = 2` `8 SSTORE.REVERT`  |
| `a`    |  `0`  |       | `2`   |            | `Read`  | `ID = 3` `2 SLOAD`          |
| `a`    |  `0`  |       | `3`   | `2`        | `Write` | `ID = 3` `8 SSTORE`         |
| `a`    |  `0`  |       | `3`   |            | `Read`  | `ID = 3` `11 SLOAD`         |
| `a`    |  `0`  |       | `4`   | `3`        | `Write` | `ID = 3` `17 SSTORE`        |

</div>

We can observe that in `ID = 2`, the storage writes are undone in FILO (first in last out) order. In the evm circuit, when `is_persistant = 0`, we constrain every `SSTORE` to come with a matching `SSTORE.REVERT`, and the reverts to happen in FILO order, which ensures all storage writes are undone correctly.
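
The FILO undo seen in the `ID = 2` rows above can be sketched as follows, using the `gc_end_of_call` and `sstore_counter` context values from the Definition section. The `Write` type, the `revert_rows` helper, and the concrete `gc` numbers are hypothetical and only serve the illustration; the table above does not list actual `gc` values.

```rust
// Hypothetical storage write row; only the fields needed for the rollback.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Write {
    gc: u64,
    key: u64,
    val: u64,
    val_prev: u64,
}

/// For a call that will not persist, derive one revert row per write.
/// The i-th write (i = sstore_counter at that SSTORE) is undone at
/// `gc_end_of_call - i`, so the last write is undone first.
fn revert_rows(writes: &[Write], gc_end_of_call: u64) -> Vec<Write> {
    let mut rows: Vec<Write> = writes
        .iter()
        .enumerate()
        .map(|(sstore_counter, w)| Write {
            gc: gc_end_of_call - sstore_counter as u64,
            key: w.key,
            val: w.val_prev, // write the old value back
            val_prev: w.val, // what we are overwriting
        })
        .collect();
    // Present the rows the way the storage circuit orders them: by gc.
    rows.sort_by_key(|r| r.gc);
    rows
}

fn main() {
    // The two SSTOREs of the reverted call: 2 -> 3, then 3 -> 4.
    // gc values 10, 20 and gc_end_of_call = 30 are made up for the example.
    let writes = vec![
        Write { gc: 10, key: 0, val: 3, val_prev: 2 },
        Write { gc: 20, key: 0, val: 4, val_prev: 3 },
    ];
    for row in revert_rows(&writes, 30) {
        println!("{:?}", row);
    }
}
```

Because each revert row is placed at `gc_end_of_call - sstore_counter`, the rows come out in reverse write order once sorted by `gc`, matching the `17 SSTORE.REVERT` then `8 SSTORE.REVERT` sequence in the table.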

The pseudo code of `SSTORE` in the evm circuit could look like:

```rust
// read location and value
bus_mapping_lookup(gc, Stack, sp, loc, Read)
bus_mapping_lookup(gc+1, Stack, sp+1, val, Read)

// write storage
bus_mapping_lookup(gc+2, Storage, loc, val, Write, val_prev, is_first_touch)

if !is_persistant {
    // rollback storage in a "first in last out" order
    bus_mapping_lookup(
        gc_end_of_call - sstore_counter,
        Storage,
        loc,
        val_prev,
        Write,
        val,
        is_first_touch,
    )
}

// sstore_counter should increase 1
sstore_counter_next === sstore_counter + 1

gc_next === gc + 3 // gc should increase 3
pc_next === pc + 1 // pc should increase 1
sp_next === sp + 2 // sp should increase 2 (SSTORE = 2 POP)
```

Then the pseudo code of `RETURN` and `REVERT` in the evm circuit could look like:

```rust
// read offset and size (of return data)
bus_mapping_lookup(gc, Stack, sp, offset, Read)
bus_mapping_lookup(gc+1, Stack, sp+1, size, Read)

// TODO: check offset and size are set in parent's context for RETURNDATA

// should be persistent if RETURN
if op == RETURN {
    is_persistant === 1
}
// should not be persistent if REVERT
if op == REVERT {
    is_persistant === 0
}

// check no extra records within revert section
gc_end_of_call === gc + 1 + sstore_counter

// gc should jump to correct one
gc_next === gc_end_of_call + 1
// TODO: pc_next should set back to parent's next one
// TODO: sp_next should set back to parent's next one
```

### Smaller but Complete Example

If we have such a trace:

```
PC    OP         STACK       STORAGE
----  ---------  ----------  ----------------
0     PUSH1 1    [ , 1 ]     { 6: 0, a: 0 }
2     PUSH1 a    [ a, 1 ]    { 6: 0, a: 0 }
4     SSTORE     [ , ]       { 6: 0, a: 1 }
5     PUSH1 3    [ , 3 ]     { 6: 0, a: 1 }
7     PUSH1 6    [ 6, 3 ]    { 6: 0, a: 1 }
9     SSTORE     [ , ]       { 6: 3, a: 1 }
10    PUSH1 0    [ , 0 ]     { 6: 3, a: 1 }
12    PUSH1 0    [ 0, 0 ]    { 6: 3, a: 1 }
14    REVERT     [ 0, 0 ]    { 6: 0, a: 0 }
```

Then we will have such a bus mapping table ordered by increasing `gc`:
(`is_first_touch` is omitted for simplicity, only rows `gc = 5` and `gc = 10` are `1`)

<div class="table-example">

| `gc` | `target`  | `key`  | `val` | `rw`    | `val_prev` | Note              |
| ---- | --------- | ------ | ----- | ------- | ---------- | ----------------- |
| `1`  | `Stack`   | `1023` | `1`   | `Write` |            | `0 PUSH1 1`       |
| `2`  | `Stack`   | `1022` | `a`   | `Write` |            | `2 PUSH1 a`       |
| `3`  | `Stack`   | `1023` | `1`   | `Read`  |            | `4 SSTORE.POP`    |
| `4`  | `Stack`   | `1022` | `a`   | `Read`  |            | `4 SSTORE.POP`    |
| `5`  | `Storage` | `a`    | `1`   | `Write` | `0`        | `4 SSTORE`        |
| `6`  | `Stack`   | `1023` | `3`   | `Write` |            | `5 PUSH1 3`       |
| `7`  | `Stack`   | `1022` | `6`   | `Write` |            | `7 PUSH1 6`       |
| `8`  | `Stack`   | `1023` | `3`   | `Read`  |            | `9 SSTORE.POP`    |
| `9`  | `Stack`   | `1022` | `6`   | `Read`  |            | `9 SSTORE.POP`    |
| `10` | `Storage` | `6`    | `3`   | `Write` | `0`        | `9 SSTORE`        |
| `11` | `Stack`   | `1023` | `0`   | `Write` |            | `10 PUSH1 0`      |
| `12` | `Stack`   | `1022` | `0`   | `Write` |            | `12 PUSH1 0`      |
| `13` | `Stack`   | `1023` | `0`   | `Read`  |            | `14 REVERT.POP`   |
| `14` | `Stack`   | `1022` | `0`   | `Read`  |            | `14 REVERT.POP`   |
| `15` | `Storage` | `6`    | `0`   | `Write` | `3`        | `9 SSTORE.REVERT` |
| `16` | `Storage` | `a`    | `0`   | `Write` | `1`        | `4 SSTORE.REVERT` |

</div>

The related context:

- `gc_end_of_call = 16`
- `is_persistant = 0`

The <span style="color: #82B366">green rows</span> both come from the same `4 SSTORE`, which does:

```rust
bus_mapping_lookup(5, Storage, a, 1, Write, 0, 1)

// gc = gc_end_of_call - sstore_counter = 16 - 0 = 16
bus_mapping_lookup(16, Storage, a, 0, Write, 1, 0)

sstore_counter_next === sstore_counter + 1 // sstore_counter_next = 0 + 1 = 1
```

The <span style="color: #D79B00">orange rows</span> both come from the same `9 SSTORE`, which does:

```rust
bus_mapping_lookup(10, Storage, 6, 3, Write, 0, 1)

// gc = gc_end_of_call - sstore_counter = 16 - 1 = 15
bus_mapping_lookup(15, Storage, 6, 0, Write, 3, 0)

sstore_counter_next === sstore_counter + 1 // sstore_counter_next = 1 + 1 = 2
```

At `REVERT`, we at least check:

```rust
// no extra records within revert section
gc_end_of_call === gc + 1 + sstore_counter // 16 = 13 + 1 + 2

// next gc jump to after gc_end_of_call
gc_next === gc_end_of_call + 1
```

## Update Storage Root on L1 Contract

For each distinct location of an address, the init row is attached with a merkle proof to ensure its value is read from the old state root. The very first init row is verified against the **old state root**, a public input loaded from the L1 contract.

The last row of each location builds an **intermediate state root** using the same merkle proof as its init row together with its final value. The next location then opens that **intermediate state root** to read its old value. We can do this because the location grouping constraint guarantees that other locations are not updated in between.

At the end of the storage circuit, we can build the **finalized state root**. We then use a public input to verify equality with the result, and update it on the L1 contract.

(Note that the verbs **build** and **open** here are actually **merkle proof verification** in the circuit.)

==TODO== example

## Question & Discussion

### Q1. How to handle other world state update?

Other actions also cause state trie updates:

1. Transaction - `nonce` and `balance` will be updated
2. `CALL` - if `CALL` has non-zero value, the `balance` will be updated
3. `CREATE`, `CREATE2` - hash of `code` will be updated to new contract address
4. `SSTORE` - covered by this note

Some other actions cause receipt trie updates:

1. `LOGX` - log will be appended

### Q2. How to handle [errors caused not by `REVERT`](https://github.com/ethereum/go-ethereum/blob/master/core/vm/errors.go)?

If we are going to support the evm fully, we should allow all possible error behavior instead of preventing it from happening by the design of the circuit. For example, using a 10-bit lookup to check `sp`'s validity would be super simple, but it would no longer allow a prover to create a proof containing a stack overflow or underflow error, which can happen in the evm today.
Such errors halt the call and lead to a rollback of all storage updates, just like `REVERT`; the extra thing they do is consume all the given gas.

So we should treat every possible case, even the error cases, as an execution result of an op: we let the prover show us which result it is, then we verify it. For example, a `POP` would have one possible success result and two possible error results, for `ErrStackUnderflow` and `ErrOutOfGas`.

### Q3. How to handle dynamic gas due to access list ([EIP2929](https://eips.ethereum.org/EIPS/eip-2929))

We might need to encode an extra item `is_first_touch` into the bus mapping to indicate whether a location is being accessed for the first time. Then, in the evm circuit, we adjust the gas cost accordingly.

Since EIP2929 applies per EOA transaction, the storage circuit needs to know where a transaction starts, so we seem to need a `root_call_context` to enable the `is_first_touch` flag.

> [name=barryWhiteHat] so this is a flag for first touch?

> [name=han] Yes, not a counter but a boolean flag for first touch instead.

> [name=barryWhiteHat] we would also need to add a last touch flag so that we know what value to store in state via merkle proof.

> [name=han] Seems we don't need to put it into bus mapping because we verify merkle proof in state circuit?
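
To make the `is_first_touch` idea from Q3 concrete, here is a minimal Rust sketch of how a witness generator might derive the flag per transaction and map it to the `SLOAD` gas cost. The `TouchTracker` type and `sload_gas` helper are hypothetical names. The costs are the EIP-2929 constants (`COLD_SLOAD_COST = 2100`, `WARM_STORAGE_READ_COST = 100`). The sketch deliberately ignores pre-declared access lists and the fact that a reverted call also reverts its additions to the accessed set.

```rust
use std::collections::HashSet;

// Gas constants from EIP-2929.
const COLD_SLOAD_COST: u64 = 2100;
const WARM_STORAGE_READ_COST: u64 = 100;

/// Hypothetical per-transaction tracker of touched (address, key) pairs.
/// A fresh tracker must be created for every EOA transaction.
struct TouchTracker {
    touched: HashSet<(u64, u64)>,
}

impl TouchTracker {
    fn new() -> Self {
        TouchTracker { touched: HashSet::new() }
    }

    /// Returns the `is_first_touch` flag for this access and records it.
    fn touch(&mut self, addr: u64, key: u64) -> bool {
        // `insert` returns true only when the pair was not present yet.
        self.touched.insert((addr, key))
    }
}

/// SLOAD gas cost as a function of the flag.
fn sload_gas(is_first_touch: bool) -> u64 {
    if is_first_touch {
        COLD_SLOAD_COST
    } else {
        WARM_STORAGE_READ_COST
    }
}

fn main() {
    let mut tracker = TouchTracker::new();
    // Same location read twice, then a different location.
    for (addr, key) in [(0xa, 0), (0xa, 0), (0xa, 6)] {
        let flag = tracker.touch(addr, key);
        println!(
            "addr={:x} key={:x} is_first_touch={} gas={}",
            addr, key, flag, sload_gas(flag)
        );
    }
}
```

Keeping the flag a boolean, as discussed above, means the evm circuit only needs to select between two gas constants instead of reasoning about an access counter.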
