Han
    • Create new note
    • Create a note from template
      • Sharing URL Link copied
      • /edit
      • View mode
        • Edit mode
        • View mode
        • Book mode
        • Slide mode
        Edit mode View mode Book mode Slide mode
      • Customize slides
      • Note Permission
      • Read
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Write
        • Only me
        • Signed-in users
        • Everyone
        Only me Signed-in users Everyone
      • Engagement control Commenting, Suggest edit, Emoji Reply
    • Invite by email
      Invitee

      This note has no invitees

    • Publish Note

      Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

      Your note will be visible on your profile and discoverable by anyone.
      Your note is now live.
      This note is visible on your profile and discoverable online.
      Everyone on the web can find and read all notes of this public team.
      See published notes
      Unpublish note
      Please check the box to agree to the Community Guidelines.
      View profile
    • Commenting
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
      • Everyone
    • Suggest edit
      Permission
      Disabled Forbidden Owners Signed-in users Everyone
    • Enable
    • Permission
      • Forbidden
      • Owners
      • Signed-in users
    • Emoji Reply
    • Enable
    • Versions and GitHub Sync
    • Note settings
    • Note Insights New
    • Engagement control
    • Make a copy
    • Transfer ownership
    • Delete this note
    • Save as template
    • Insert from template
    • Import from
      • Dropbox
      • Google Drive
      • Gist
      • Clipboard
    • Export to
      • Dropbox
      • Google Drive
      • Gist
    • Download
      • Markdown
      • HTML
      • Raw HTML
Menu Note settings Note Insights Versions and GitHub Sync Sharing URL Create Help
Create Create new note Create a note from template
Menu
Options
Engagement control Make a copy Transfer ownership Delete this note
Import from
Dropbox Google Drive Gist Clipboard
Export to
Dropbox Google Drive Gist
Download
Markdown HTML Raw HTML
Back
Sharing URL Link copied
/edit
View mode
  • Edit mode
  • View mode
  • Book mode
  • Slide mode
Edit mode View mode Book mode Slide mode
Customize slides
Note Permission
Read
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Write
Only me
  • Only me
  • Signed-in users
  • Everyone
Only me Signed-in users Everyone
Engagement control Commenting, Suggest edit, Emoji Reply
  • Invite by email
    Invitee

    This note has no invitees

  • Publish Note

    Share your work with the world Congratulations! 🎉 Your note is out in the world Publish Note

    Your note will be visible on your profile and discoverable by anyone.
    Your note is now live.
    This note is visible on your profile and discoverable online.
    Everyone on the web can find and read all notes of this public team.
    See published notes
    Unpublish note
    Please check the box to agree to the Community Guidelines.
    View profile
    Engagement control
    Commenting
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    • Everyone
    Suggest edit
    Permission
    Disabled Forbidden Owners Signed-in users Everyone
    Enable
    Permission
    • Forbidden
    • Owners
    • Signed-in users
    Emoji Reply
    Enable
    Import from Dropbox Google Drive Gist Clipboard
       Owned this note    Owned this note      
    Published Linked with GitHub
    • Any changes
      Be notified of any changes
    • Mention me
      Be notified of mention me
    • Unsubscribe
    --- tags: zkevm --- # ZKEVM - Specification [TOC] ## Introduction [...](https://github.com/appliedzkp/zkevm-specs/blob/master/specs/introduction.md) ## Concepts ### Circuit as a lookup table In halo2, the lookup is flexible to be configured, anything able to be turned into `Expression` could used to be `item: Tuple[int, ...]` or `table: Set[Tuple[int, ...]]` in lookup, and then it `assert item in table`. The `Expression` includes constant, queried fixed, advice or instance column at arbitrary rotation, addition/multiplication of `Expression`. The motivation to have multiple circuits as lookup tables is because EVM contains many circuit unfriendly operation like random read-write data access, wrong field operation (ECDSA on secp256k1), traditional hash function (keccak256), etc... And many of them accept variable lenght input. These expensive operations makes it hard to design a EVM circuit to verify computation trace because each step could possibly contain some of them. So we try to separate these expensive operations to other single-purpose circuits which have more friendly layout, and use them by a lookup (or serveral lookups) to communicate input and output to outsource the effort. The reason lookup with input and output could be used to outsource the effort is that we know the table lookup-ed is configured with constraints to verify the input and output are in some relationship. For example, we let Bytecode circuit to holds a set of tuple `(code_hash, index, opcode)`, and each `code_hash` is verified to be the keccak256 digest of opcodes it contains, then in EVM circuit we can load `opcode` with `(code_hash, program_counter)` by lookup to Bytecode circuit. However, sometimes there are some properties we can't ensure by only lookups. For example, the amount of all `item` should equal to the size of `table`, which is required by between EVM circuit and State circuit to prevent extra malicious write. In such case (`assert set(items) == table`), we need some extra constraint to ensure the relationship is correct. A naive approach is to also count all `item` in State circuit, which in the end is the size of the `table`, and constraint it to be equal to the one counted in EVM circuit. > The original approach is using lookup to move the meaningful items from State circuit to bus mapping, which is another private table, and verify the bus mapping has degree bound that eqaul to the one counted in EVM circuit. > But it turns out that we also need to count in State circuit, otherwise the prover could insert something in bus mapping but skip it in State circuit. We can try to do lookup from bus mapping to State circuit to avoid the counting in State circuit, but it just makes bus mapping seem to be a redundant layer. > In general, such case is more like reorder something to be friendly to perform other constraint instead of subset, which should be easy to add in halo2 since subset argument already uses such shuffle argument. But for flexibility, we might have multiple lookup to State circuit like `assert set(items1).intersection(set(items2)) == {0} and set(items1) + set(items2) == table`) , which is not a simple shuffle, so counting items in State circuit seems to be a more general solution. > [name=han] ### Architecture diagram Each circuit is layouted to be friendly to build their own custom constraints. When circuits encounter some expensive operations, they can outsource the effort to other circuits by lookup. The relationship between circuits would be like: ![](https://hackmd.io/_uploads/HkqKaw-wY.png) In the end the circuits would be assembled depending on their dimension and the desired capacity. For example, we can just combine 2 different circuits by using different columns, or stack them using same columns with extra selectors. ### EVM word encoding [...](https://github.com/appliedzkp/zkevm-specs/blob/master/specs/word-encoding.md) ## Custom types ## Constants | Name | Value | Description | | -------------------- | ------------ | ------------------------------- | | `MAX_MEMORY_ADDRESS` | `2**40 - 1` | max memory address allowed [^1] | | `MAX_GAS` | `2**64 - 1` | max gas allowed | | `MAX_ETHER` | `2**256 - 1` | max value of ether allowed [^2] | ## Circuits ### EVM circuit EVM circuit iterates over transactions included in proof to verify each execution step of transaction is valid. Basically the scale of a step is same as in EVM, so usually we handle one opcode per step, except those opcodes like `SHA3` or `CALLDATACOPY` that operate on variable size of memory would require multiple steps. > The scale of a step somehow could be different depends on the approach, an extreme case is to implement a VM with reduced instruction set (like TinyRAM) to emulate EVM, which would have a much smaller step, but not sure how it compares to current approach. > [name=han] To verify if a step is valid, we first enumerate all possible execution results of a step in the EVM including success and error cases, and then build the custom constraint to verify the step transition is correct for each execution result. In each step, we constrain it to enable one of execution results, and specially constrain the first step to enable `BEGIN_TX`, then repeats the step to verify the full execution trace. Also each step is given access to next step to propagate the tracking information, by putting constraints like `assert next.program_counter == curr.program_counter + 1`. #### Concepts ##### Execution result It's intuitive to have each opcode as a branch in step. However, EVM has so rich opcodes that some of them are very similar like `{ADD,SUB}`, `{PUSH*}`, `{DUP*}` and `{SWAP*}` that seem to be handled by almost identical constraint with small tweak (to swap a value or automatically done due to linearity), it seems we could reduce our effort to only implement it once to handle multiple opcodes in single branch. In addition, an EVM state transition could also contain serveral kinds of error cases, we also need to take them into consideration to be equivalent to EVM. It would be annoying for each opcode branch to handle their own error cases since it needs to halt the step and return the execution context to caller. Fortunately, most error cases are easy to verify with some pre-built lookup table even they could happen to many opcodes, only some tough errors like out of gas due to dynamic gas usage need to be verified one by one. So we further unroll all kinds of error cases as kinds of execution result. So we can enumerate [all possible execution results](https://github.com/appliedzkp/zkevm-specs/blob/83ad4ed571e3ada7c18a411075574110dfc5ae5a/src/zkevm_specs/evm/execution_result/execution_result.py#L4) and turn EVM circuit into a finite state machine like: ![](https://hackmd.io/_uploads/H14-tJc_Y.png) > In current implementation, we ask opcode implementer to also implement error cases, which seems to be redundant efforts. If we do this, they can focus more on opcode's success case. Also error cases are usually easier to verify, so I think it also reduces the overall implementation complexity. > [name=han] ##### Random accessible data In EVM, the interpreter has ability to do any random access to data like block context, account balance, stack and memory in current scope, etc... And some of them are read-write and others are read-only. In EVM circuit, we leverage the concept [Circuit as a lookup table](#Circuit-as-a-lookup-table) to duplicate these random accessible data to other circuits in different layout and verify they are consistent and valid. After these random accessible data are verified, we can use them just like they are tables. [Here](https://github.com/appliedzkp/zkevm-specs/blob/83ad4ed571/src/zkevm_specs/evm/table.py#L108) are the tables currently used in EVM circuit. For read-write accessible data, EVM circuit lookup State circuit with a sequentially `rw_counter` (read-write counter) to make sure the read-write access is chronological, and a flag `is_write` to have State circuit to check data is consistency between different write accesses. For read-only accessible data, EVM circuit lookup Bytecode circuit, Tx circuit, Call circuit directly. ##### State write reversion In EVM, state write could be reverted if any call fails. There are many kinds of state write, a complete list can be found [here](https://github.com/ethereum/go-ethereum/blob/master/core/state/journal.go#L87-L141). In EVM circuit, each call is attached with a flag`is_persistent` to know if it succeeds in the end or not, so ideally we only need to do reversion on those kinds of state write which affects future execution before reversion: - `TxAccessListAccount` - `TxAccessListStorageSlot` - `AccountNonce` - `AccountBalance` - `AccountCodeHash` - `AccountStorage` Others we don't need to do reversion because they don't affect future execution before reversion, we only write them when `is_persistent` is `1`: - `TxRefund` - `AccountDestructed` > Another tag is `TxLog`, which also doesn't affect future execution. It should be explained where to write such record to after we decide where to build receipt trie. > [name=han] To enable state write reversion, we need some meta information of a call: 1. `is_persistent` - To know if we need reversion or not 2. `rw_counter_end_of_reversion` - To know at which point in the future we should revert 3. `state_write_counter` - To know how many state write we have done by now Then at each state write, we first check if `is_persistent` is `0`, if so we do an extra state write at `rw_counter_end_of_reversion - state_write_counter` with the old value, which reverts the state write in a reverse order. ##### Opcode fetching In EVM circuit, there are 3 kinds of opcode source for execution or copy: 1. When contract interaction: Opcode is lookup from contract bytecode in Bytecode circuit by tuple `(code_hash, index, opcode)` 2. When contract creation in root call: Opcode is lookup from tx calldata in Tx circuit by tuple `(tx_id, TxTableTag.Calldata, index, opcode)` 3. When contract creation in internal call: Opcode is lookup from caller's memory in State circuit by tuple `(rw_counter, False, caller_id, index, opcode)` Before we fetch opcode from any source, it checks if the index is in the given range, if not, it follows the behavior of current EVM to implicitly returning `0`. ##### Internal call EVM supports internal call triggered by opcodes. In EVM circuit, the opcodes (like `CALL` or `CREATE`) that trigger internal call will save its own `call_state` into State circuit, setup next call's context, and initialize next step's `call_state` to start a new environment. Then the opcodes (like `RETURN` or `REVERT`) and error cases that halt will restore caller's `call_state` and set it back to next step. For a simple `CALL` example with illustration (many details are hided for simplicity): ![](https://i.imgur.com/XqVxCWH.png) #### Constraints ##### `main` ==TODO== Explain each execution result ### Tx circuit Tx circuit iterates over transactions included in proof to verify each transaction has valid signature. It also verifies the built transaction merkle patricia trie has same root hash as public input. Main part of Tx circuit will be instance columns whose evaluation values are built by verifier directly. See the [issue](https://github.com/appliedzkp/zkevm-circuits/issues/122) for more details. To verify if transaction has valid signature, it hashes the RLP encoding of transaction and recover the address of signer with signature, then verifies the signer address is correct. It serves as a lookup table for EVM circuit to do random access of any field of transaction. To prevent any skip of transaction, we verify te amount of transactions in Tx circuit is equal to the amount that verified in EVM circuit. ### State circuit State circuit iterates over random read-write access records of EVM circuit to verify each data is consistent between different writes. It also verifies the state merkle patricia trie root hash has a valid transition from old to new one incrementally, where both are from public input. To verify if data is consistent, it first verify all access records are grouped by their unique identifier and sorted by order of access, then verify the records between writes are consistent. It also verifies data is in correct format. It serves as a lookup table for EVM circuit to do consistent random read-write access. To prevent any malicious insertion of access record, we also verify the amount of random read-write access records in State circuit is equal to the amount in EVM circuit (the final value of `rw_counter`). #### Concepts ##### Read-write unit grouping The first thing to ensure data is consistent between different writes is to give each data an unique identifier, group them by the unique identifier and then sort them by order of access `rw_counter`. Here are all kinds of data with their unique identifier: | Tag | Unique Index | Values | | ------------------------- | ---------------------------------------- | ------------------------------------- | | `TxAccessListAccount` | `(tx_id, account_address)` | `(is_warm, is_warm_prev)` | | `TxAccessListAccountStorage` | `(tx_id, account_address, storage_slot)` | `(is_warm, is_warm_prev)` | | `TxRefund` | `(tx_id)` | `(value, value_prev)` | | `Account` | `(account_address, field_tag)` | `(value, value_prev)` | | `AccountStorage` | `(account_address, storage_slot)` | `(value, value_prev)` | | `AccountDestructed` | `(account_address)` | `(is_destructed, is_destructed_prev)` | | `CallContext` | `(call_id, field_tag)` | `(value)` | | `Stack` | `(call_id, stack_address)` | `(value)` | | `Memory` | `(call_id, memory_address)` | `(byte)` | Different tags have different constriant on their grouping and values. Most tags also keep the previous value `*_prev` for convenience, which helps reduce the lookup when EVM circuit is performing a write with a `diff` to current value, or performing a write with a reversion. ##### Lazy initialization EVM's memory expands implicitly, for example, when the memory is empty and it enounters a `mload(32)`, EVM first expands to memory size to `64`, and then load the bytes just initialized to push to stack, which is always a `0`. The implicit expansion behavior makes even the simple `MLOAD` and `MSTORE` complicated in EVM circuit, so we have a trick to outsource the effort to State circuit by constraining the first record of each memory unit to be a write or have value `0`. It saves the variable amount of effort to expand memory and ignore those never used memory, only used memory addresses will be initlized with `0` so as lazy initialization. > This concept is also used in another case: the opcode `SELFDESTRUCT` also has ability to update the variable amount of data, it reset the `balance`, `nonce`, `code_hash`, and every `storage_slot` even it's not used in a step. So for each state under account, we can add a `revision_id` handle such case, see [here](https://hackmd.io/G48BKqdPScyoFDHPNzgOYQ?view#SELFDESTRUCT) for details. > ==TODO== Convert this into an issue for discussion > [name=han] ##### Trie opening and incrementally update #### Constraints ##### `main` ==TODO== Explain each tag <!-- ##### `tx_access_list_account` ##### `tx_access_list_storage_slot` ##### `tx_refund` ##### `account_nonce` ##### `account_balance` ##### `account_code_hash` ##### `account_storage` ##### `call_state` ##### `stack` ##### `memory` --> ### Bytecode circuit Bytecode circuit iterates over contract bytecodes to verify each bytecode has valid hash. It serves as a lookup table for EVM circuit to do random access of any index of bytecode. ### ECDSA circuit ECDSA circuit iterates over signatures to verify recovered public key from signature is valid, or verify signature is invalid. It serves as a lookup table for EVM and Tx circuit to do public key recover. ### Keccak256 circuit Keccak256 circuit iterates over hashes to verify each hash is valid. It serves as a lookup table for EVM, Bytecode, Tx and MPT circuit to do hash. ### MPT circuit MPT circuit (Merkle Patricia Trie circuit) iterates over merkle patricia trie transition to verify each update is valid. It serves as a lookup table for State and Tx circuit to do merkle patricia trie update. [^1]: The explicit max memory address in EVM is actually `32 * (2**32 - 1)`, which is the one that doesn't make memory expansion gas cost overflow `u64`. In our case, memory address is allowed to be 5 bytes, but will constrain the memory expansion gas cost to fit `u64` in success case. [^2]: I didn't find a explicit upper bound on value of ether (for `balance` or `gas_price`) in yellow paper, but handling unbounded big integer seems unrealistic in circuit, so with `u256` as a hard bound seems reasonable.

    Import from clipboard

    Paste your markdown or webpage here...

    Advanced permission required

    Your current role can only read. Ask the system administrator to acquire write and comment permission.

    This team is disabled

    Sorry, this team is disabled. You can't edit this note.

    This note is locked

    Sorry, only owner can edit this note.

    Reach the limit

    Sorry, you've reached the max length this note can be.
    Please reduce the content or divide it to more notes, thank you!

    Import from Gist

    Import from Snippet

    or

    Export to Snippet

    Are you sure?

    Do you really want to delete this note?
    All users will lose their connection.

    Create a note from template

    Create a note from template

    Oops...
    This template has been removed or transferred.
    Upgrade
    All
    • All
    • Team
    No template.

    Create a template

    Upgrade

    Delete template

    Do you really want to delete this template?
    Turn this template into a regular note and keep its content, versions, and comments.

    This page need refresh

    You have an incompatible client version.
    Refresh to update.
    New version available!
    See releases notes here
    Refresh to enjoy new features.
    Your user state has changed.
    Refresh to load new user state.

    Sign in

    Forgot password

    or

    By clicking below, you agree to our terms of service.

    Sign in via Facebook Sign in via Twitter Sign in via GitHub Sign in via Dropbox Sign in with Wallet
    Wallet ( )
    Connect another wallet

    New to HackMD? Sign up

    Help

    • English
    • 中文
    • Français
    • Deutsch
    • 日本語
    • Español
    • Català
    • Ελληνικά
    • Português
    • italiano
    • Türkçe
    • Русский
    • Nederlands
    • hrvatski jezik
    • język polski
    • Українська
    • हिन्दी
    • svenska
    • Esperanto
    • dansk

    Documents

    Help & Tutorial

    How to use Book mode

    Slide Example

    API Docs

    Edit in VSCode

    Install browser extension

    Contacts

    Feedback

    Discord

    Send us email

    Resources

    Releases

    Pricing

    Blog

    Policy

    Terms

    Privacy

    Cheatsheet

    Syntax Example Reference
    # Header Header 基本排版
    - Unordered List
    • Unordered List
    1. Ordered List
    1. Ordered List
    - [ ] Todo List
    • Todo List
    > Blockquote
    Blockquote
    **Bold font** Bold font
    *Italics font* Italics font
    ~~Strikethrough~~ Strikethrough
    19^th^ 19th
    H~2~O H2O
    ++Inserted text++ Inserted text
    ==Marked text== Marked text
    [link text](https:// "title") Link
    ![image alt](https:// "title") Image
    `Code` Code 在筆記中貼入程式碼
    ```javascript
    var i = 0;
    ```
    var i = 0;
    :smile: :smile: Emoji list
    {%youtube youtube_id %} Externals
    $L^aT_eX$ LaTeX
    :::info
    This is a alert area.
    :::

    This is a alert area.

    Versions and GitHub Sync
    Get Full History Access

    • Edit version name
    • Delete

    revision author avatar     named on  

    More Less

    Note content is identical to the latest version.
    Compare
      Choose a version
      No search result
      Version not found
    Sign in to link this note to GitHub
    Learn more
    This note is not linked with GitHub
     

    Feedback

    Submission failed, please try again

    Thanks for your support.

    On a scale of 0-10, how likely is it that you would recommend HackMD to your friends, family or business associates?

    Please give us some advice and help us improve HackMD.

     

    Thanks for your feedback

    Remove version name

    Do you want to remove this version name and description?

    Transfer ownership

    Transfer to
      Warning: is a public team. If you transfer note to this team, everyone on the web can find and read this note.

        Link with GitHub

        Please authorize HackMD on GitHub
        • Please sign in to GitHub and install the HackMD app on your GitHub repo.
        • HackMD links with GitHub through a GitHub App. You can choose which repo to install our App.
        Learn more  Sign in to GitHub

        Push the note to GitHub Push to GitHub Pull a file from GitHub

          Authorize again
         

        Choose which file to push to

        Select repo
        Refresh Authorize more repos
        Select branch
        Select file
        Select branch
        Choose version(s) to push
        • Save a new version and push
        • Choose from existing versions
        Include title and tags
        Available push count

        Pull from GitHub

         
        File from GitHub
        File from HackMD

        GitHub Link Settings

        File linked

        Linked by
        File path
        Last synced branch
        Available push count

        Danger Zone

        Unlink
        You will no longer receive notification when GitHub file changes after unlink.

        Syncing

        Push failed

        Push successfully