# Copy Circuit
## Copy lookup table
We first define the copy lookup table that has 10 columns. The table below shows the field names for each column and what data to assign for different opcode. Each row corresponds to a copy event in the EVM circuit.
- `SrcID` and `DstID` indicates the identifier for the copy source and destination. It can be call id, bytecode hash (RLC encoded), and log id.
- `SrcType` and `DstType` specifies the source and destination types, including `Memory`, `Tx`, `Code`, `Log`.
- `SrcAddr` and `SrcAddrEnd` defines the starting and end address of the src buffer. If `SrcAddr + Length > SrcAddrEnd`, we need to pad 0 for the out-of-bound part during the copy.
- `DstAddr` defines the starting address of the dst buffer to copy to.
- `Length` is the number of bytes to copy
- `RwCounter` is the rw counter before the copy.
- `RwCounterInc` is the increase value of the rw counter for this copy. It will be used to bump up the rw counter in the EVM circuit.
| | SrcID | DstID | SrcType | DstType | SrcAddr | SrcAddrEnd | DstAddr | Length | RwCounter | RwCounterInc |
|------------------------------------|---------------|---------------|---------|---------|-------------------------|-----------------------------------|---------------|---------|-------------|-----------------|
| CallDataCopy/Load (root) | $txID | $callID | Tx | Memory | $dataOffset | $cdLength | $memoryOffset | $length | $rw_counter | $rw_counter_inc |
| CallDataCopy/Load (internal) | $callerID | $callID | Memory | Memory | $cdOffset + $dataOffset | $cdOffset + $cdLength | $memoryOffset | $length | $rw_counter | $rw_counter_inc |
| Return(not create) / ReturnDataCopy | $callID | $callerID | Memory | Memory | $rdOffset + $dataOffset | $rdOffset + $dataOffset + $length | $memoryOffset | $length | $rw_counter | $rw_counter_inc |
| Return(create) / Create[2] | $callID | $bytecodeHash | Memory | Code | $memoryOffset | $memoryOffset + $length | 0 | $length | $rw_counter | $rw_counter_inc |
| CodeCopy / ExtCodeCopy | $bytecodeHash | $callID | Code | Memory | $codeOffset | $codeLength | $memoryOffset | $length | $rw_counter | $rw_counter_inc |
| Log | $callID | $logID | Memory | Log | $memoryOffset | $memoryOffset + $length | 0 | $length | $rw_counter | $rw_counter_inc |
## Prove the correctness of copy lookup table
Now we need to prove the correctness of the copy table.
First, we need to check the consistency of the copy lookup table itself.
```cpp
// can use the fixed lookup table to hold these pairs
lookup (SrcType, DstType) in ((Tx, Memory), (Memory, Memory), (Memory, Code), (Code, Memory), (Memory, Log))
if SrcType == Tx {
SrcID == 0
}
if DstType == Code || DstType == Log {
DstAddr == 0
}
```
In addition, we need to construct another circuit to expand the copy steps and verify the copy steps. This circuit includes 21 columns:
- `q_step`: [fixed column] set to 1 every two 2 rows.
- `q_first`: set to 1 at the first row of each copy
- `id`: hold value of call ID, tx ID, log ID, bytecode hash (RLC encoded).
- `type`: indicates the type of buffer, can be (Memory, Tx, Code, Log)
- `addr`: address
- `addr_end`
- `bytes_left`
- `value`
- `is_code`
- `rw_counter`
- `rwc_inc_left`
- 4 columns for `is_zero` gadget to check if type is Memory, Tx, Code, Log
- 5 columns for comparing the value of `addr` and `addr_end`
The following table shows an example assignment for CallDataCopy (internal) steps.
| q_step | q_first | id | type | addr | addr_end | bytes_left | value | pad | is_code | rw_counter | rwc_inc_left |
|--------|---------|-----------|--------|-------------------|--------------|------------|-------|-----|---------|---------------|--------------|
| 1 | 1 | caller_id | Memory | src_addr | src_addr_end | length | byte1 | 0 | | rwc | rwc_inc |
| 0 | 0 | call_id | Memory | dst_addr | - | - | byte1 | 0 | | rwc+1 | rwc_inc-1 |
| 1 | 0 | caller_id | Memory | src_addr+1 | src_addr_end | length-1 | byte2 | 0 | | rwc+2 | rwc_inc-2 |
| 0 | 0 | call_id | Memory | dst_addr+1 | - | - | byte2 | 0 | | rwc+3 | rwc_inc-3 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | | ... | ... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | | rwc+rwc_inc-1 | 1 |
| 1 | 0 | caller_id | Memory | src_addr+length-1 | src_addr_end | 1 | 0 | 1 | | rwc+rwc_inc-1 | 1 |
| 0 | 0 | call_id | Memory | dst_addr+length-1 | - | - | 0 | 0 | | rwc+rwc_inc | 0 |
```cpp
// constrain the initial state
if q_first == 1 {
copy_table_lookup(
id,
id[1],
type,
type[1],
addr,
addr_end,
addr[1],
bytes_left,
rw_counter,
rwc_inc_left
)
}
// constrain id, type, addr, addr_end columns
if q_first[1] + q_first[2] == 0 { // not last two rows
id == id[2]
type == type[2]
addr[2] == addr + 1
addr_end[2] == addr_end
}
// constrain rw_counter, rwc_inc_left columns
if q_first[1] == 0 { // not last row
rw_counter_diff = (1 - pad) * (is_memory + is_log)
rw_counter[1] == rw_counter + rw_counter_diff
rwc_inc_left[1] = rwc_inc_left - rw_counter_diff
} else { // last row
rwc_inc_left == 0
}
// constrain each step
if q_step == 1 {
// constrain bytes_left column
if q_first[2] == 0 { // not last step
bytes_left[2] == bytes_left - 1
} else { // last step
bytes_left == 1
}
// constrain value column: read value == write value
value == value[1]
// value == 0 when pad is 1
value * pad == 0
// constrain pad column
pad == 1 - lt(src_addr, src_addr_end, N_BYTES_MEMORY_ADDRESS)
pad[1] == 0 // write is never padded
}
// lookup to rw_table at each row
if is_memory && pad != 0 {
rw_lookup(
rw_counter,
1 - q_step, // isWrite
Memory,
id, // CallID
addr,
value
)
}
if is_tx && pad != 0 {
tx_lookup(
id, // TxID
CallData,
addr, // ByteIndex
value
)
}
if is_code && pad != 0 {
bytecode_lookup(
id, // CodeHash
addr, // ByteIndex
value,
is_code
)
}
if is_log { // always write, so no padding
rw_lookup(
rw_counter,
1, // isWrite
TxLog,
id, // LogID
addr, // ByteIndex
value
)
}
```
Notes
- When checking the `pad` column, we need to compare the value of `addr` and `addr_end`. The `addr` column can hold memory address, tx calldata byte index, bytecode byte index, and log byte index. We can relax the comparison of these two values up to 160-bit to accommodate the memory address. So it needs 10 extra cells to hold the diff value. Since we only need to compare for reading part in a step, we could fold 10 extra cells into a region of 2 rows and 5 columns. So only 5 extra columns is needed.