# Sin7Y Tech Review (22): Cairo - Instruction
![](https://i.imgur.com/WyvNOje.jpg)
Cairo is a practically-efficient Turing-complete STARK-friendly CPU architecture. In this article, we introduce the CPU architecture of Cairo in terms of instruction structure and state transition, and provide some examples of instruction.
# Instruction structure
The word natively supported by Cairo CPU is a field element, where the field is some fixed finite field of characteristic $P > 2^{63}$. Each instruction will occupy 1 or 2 words. If an immediate value([ap] = "12345678")follows the instruction, the instruction will occupy 2 words, and the value will be stored in the second word. The first word of each instruction consists of the following elements:
![](https://i.imgur.com/k74WZvY.png)
* $off_{dst}$[ bit0..15]: destination address offset, representative value $-2^{15} + \sum_{i = 0}^{15} b_i \cdot 2^i \in [-2^{15}, 2^{15})$;
* $off_{op0}$[ bit16..31]: op0 address offset, representative value $-2^{15} + \sum_{i = 0}^{15} b_i \cdot 2^i \in [-2^{15}, 2^{15})$;
* $off_{op1}$[ bit32..47]: op1 address offset, representative value $-2^{15} + \sum_{i = 0}^{15} b_i \cdot 2^i \in [-2^{15}, 2^{15})$;
* $dst \ reg$[ bit48]: Base register of destination address offset, ap or fp;
* $op0 \ reg$[ bit49]: Base register of op0 address offset, ap or fp;
* $op1\_src$[ bit50..52]: address of op1,
- Case: 000
$$op1 = m(op0 + off_{op1})$$
- Case: 001
$$op1 = m(pc + off_{op1})$$
- Case: 010
$$op1 = m(fp + off_{op1})$$
- Case: 100
$$op1 = m(ap + off_{op1})$$
- $res\_logic$ [ bit53..54]: computational logic
- Case: 00
$$res = op1$$
- Case: 01
$$res = op0 + op1$$
- Case: 10
$$res = op0 * op1$$
- $pc\_update$ [ bit55..57]: the update logic of pc
- Case: 000 // common
$$next\_pc = pc + instruction\_size$$
- Case: 001 // absolute jump
$$next\_pc = res$$
- Case: 010 // relative jump
$$next\_pc = pc + res$$
- Case: 100 // conditional relative jump
$$next\_pc = pc + op1(or \ instruction\_size \ if \ dst = 0)$$
- $ap\_update$[ bit58..59]: the update logic of ap
- Case: 00
$$next\_ap = ap(or \ ap + 2 \ if \ opcode = 1)$$
- Case: 01
$$next\_ap = ap + res$$
- Case: 10
$$next\_ap = ap + 1$$
- $opcode$[ bit60..62]: opcode types
- Case: 000 // jmp
- Case: 001 // call
- Case: 010 // ret
- Case: 100 // assert
# State transition
The state transition function represents a general state transition unit (because it contains the processing logic of all instruction types), and a calculation is usually decomposed into multiple continuously executed instructions. Therefore, we need to:
1. ensure the content of the instruction and the validity of the states before and after the instruction is executed (e.g., passing a certain range check and state consistency check) and
2. ensure that the executed instruction is valid.
## Transition logic
If the state before and after the instruction execution is consistent, the update of the state must be executed according to the following logic:
```
#Context: m(.).
#Input state: (pc, ap, and fp).
#Output state: (next_pc, next_ap, and next_fp).
#Compute op0.
If op0_reg == 0: // Judge the basic register of op0 according to the flag value, 0 is ap,1 is fp, and obtain the value of op0.
op0 = m(ap + off_{op0})
else:
op0 = m(fp + off_{op0})
#Compute op1 and instruction_size.
switch op1_src: // Obtain the value of op1
case 0:
instruction_size = 1 // op1 = m[[ap/fp + off_op0] +off_op1], with two-level imbedding at most.
op1 = m(op0 + off_{op1})
case 1:
instruction_size = 2 // There exist(s) immediate number(s). The instruction size is 2 words.
op1 = m(pc + off_{op1})//
#If off_{op1} = 1, we have op1 = immediate_value. // For example, [ap] = "12345678", op1 = "12345678"
case 2:
instruction_size = 1 // op1 = [fp + off_op1]
op1 = m(fp + off_{op1})
case 4:
instruction_size = 1 // op1 = [ap +off_op1]
op1 = m(ap + off_{op1})
default:
Undefined Behavior
#Compute res.
if pc_update == 4: // jnz
if res_logic == 0 && opcode == 0 && ap_update != 1: // Assign && jump && advanced ap
res = Unused // Under conditional jump, the values of res_logic, opcode and ap_update flags can only be as shown above.The res variable will not be used on this occasion.
else:
Undefined Behavior
else if pc_update = 0, 1 or 2:
switch res_logic: // The computational logic of res is:
case 0: res = op1
case 1: res = op0 + op1
case 2: res = op0 * op1
default: Undefined Behavior
else: Undefined Behavior
# Compute dst.
if dst_reg == 0:
dst = m(ap + offdst)
else:
dst = m(fp + offdst)
# Compute the new value of pc.
switch pc_update:
case 0: # The common case: [ap] = 1
next_pc = pc + instruction_size
case 1: # Absolute jump: jmp abs 123
next_pc = res
case 2: # Relative jump: jmp rel [ap - 1]
next_pc = pc + res
case 4: # Conditional relative jump (jnz): jmp rel [ap - 1] if [fp - 7] = 0
next_pc =
if dst == 0: pc + instruction_size // If dst = 0, then pc conducts ordinary updates; otherwise, it is similar to Case 2.
else: pc + op1 // Under this circumstance, res is Unused, so pc directly conducts + op1, instead of + res.
default: Undefined Behavior
# Compute new value of ap and fp based on the opcode.
if opcode == 1:
# "Call" instruction.
assert op0 == pc + instruction_size
assert dst == fp
# Update fp.
next_fp = ap + 2 // [ap] saves the current fp value, [ap + 1] saves the pc after the call instruction; when the ret instruction is executed, it is convenient to reset fp to [ap], and then continue to execute subsequent instructions.
# Update ap.
switch ap_update:
case 0: next_ap = ap + 2 // [ap] saves the value of fp, and [ap+1] saves the instruction address after the call instruction; thus, ap increases by 2.
default: Undefined Behavior
else if opcode is one of 0, 2, 4:
# Update ap.
switch ap_update:
case 0: next_ap = ap // [fp + 1] = 5
case 1: next_ap = ap + res // advanced ap [ap] += 123
case 2: next_ap = ap + 1 // normal [fp + 1] = [ap]; ap++
default: Undefined Behavior
switch opcode: // Within the same function, the fp value remains unchanged.
case 0:
next_fp = fp
case 2:
# "ret" instruction.
next_fp = dst // The ret instruction goes with the call instruction, and assert dst == fp.
case 4:
# "assert equal" instruction.
assert res = dst
next_fp = fp
else: Undefined Behavior
```
## Instruction verification
As shown in Figure 1, one instruction consists of the following elements:
```
structure instruction :=
(off_dst : bitvec 16)
(off_op0 : bitvec 16)
(off_op1 : bitvec 16)
(flags : bitvec 15)
```
Where:
$off_{dst}$[ bit0..15], $off_{op0}$[ bit16..31], $off_{op1}$[ bit32..47] are in range
$$-2^{15} + \sum_{i = 0}^{15} b_i \cdot 2^i \in [-2^{15}, 2^{15})$$
But in AIR, we convert them into the following form:
$$\widetilde{off}_* = off_* +2^{15}$$(Where * represents one of dst, op0 and op1)
So the value range of $\widetilde{off}_*$should be $[0,2^{16})$
Therefore, for the coding of an instruction, we have:
$$inst = \widetilde{off}_{dst} + 2^{16} \cdot \widetilde{off}_{op0} + 2^{32} \cdot \widetilde{off}_{op1} + 2^{48} \cdot \displaystyle\sum_{i=0}^{14} (2^i \cdot f_i) (1)$$
**Note:** Instead of allocating 15 virtual columns with a length of NN to the 15-bit flags of the instruction, instead, we use one virtual column$$ \tilde f_i = \textstyle\sum_{j=i}^{14} 2^{j-i} \cdot f_j$$with a length of 16N, which meets the following requirements:
1. $\tilde f_0 = \textstyle\sum_{j=0}^{14} 2^{j-0} \cdot f_j$ is a15-bit value
2. $\widetilde f_{15} = 0$
3. $$\widetilde f_i - 2\widetilde f_{i+1} = \displaystyle\sum_{j=i}^{14} (2^{j-i} \cdot f_j) - 2 \cdot \displaystyle\sum_{j=i + 1}^{14} (2^{j-i-1} \cdot f_j) = \displaystyle\sum_{j=i}^{14} (2^{j-i} \cdot f_j) - \displaystyle\sum_{j=i + 1}^{14} (2^{j-i} \cdot f_j) = f_i$$
Therefore, equation (1) can be written as:
$$inst = \widetilde{off}_{dst} + 2^{16} \cdot \widetilde{off}_{op0} + 2^{32} \cdot \widetilde{off}_{op1} + 2^{48} \cdot \tilde f_0 \in [0,2^{63}) (2)$$
Because the finite field's characteristic $P > 2^{63}$, one instruction can only correspond to one valid field element.
Therefore, for the instruction itself, the following constraints should be met:
**Instruction:** $inst = \widetilde{off}_{dst} + 2^{16} \cdot \widetilde{off}_{op0} + 2^{32} \cdot \widetilde{off}_{op1} + 2^{48} \cdot \tilde f_0$
**Bit:** $(\widetilde f_i - 2\widetilde f_{i+1} )(\widetilde f_i - 2\widetilde f_{i+1} - 1) = 0 \ for \ all \ i \in [0,15)$
**Last value is zero:** $\widetilde f_{15} = 0$
**Offset are in range:** virtual column $\widetilde{off}_* \in [0,2^{16})$$, so then ${off}_* \in [2^{-15}, 2^{15})$
# Instruction examples
## Assert equal
The assert equal instruction can be expressed in the following syntax:
*<left_handle_op> = <right_handle_op>*
It ensures that both sides of the formula are equal; otherwise the execution of the program will be rejected.
The left side of the equation often comes from $[fp+off_{dst}]$ or $[ap+off_{dst}]$, and the right side has some possible forms($reg_0$ and $reg_1$ can be $fp$ or $ap$, $\circ$ can be addition or multiplication, and $imm$ can be any fixed field elements):
- $imm$
- $[reg_1+off_{op1}]$
- $[reg_0+off_{op0}] \circ [reg_1 + off_{op1}]$
- $[reg_0+off_{op0}] \circ imm$
- $[[reg_0+off_{op0} + off_{op1}]$
**Note2:** Division and subtraction can be expressed as multiplication and addition with different operand orders, respectively.
An $assert$ instruction can be considered as an assignment instruction, in which the value of one side is known, and the other side is unknown. For example, $[ap] = 4$ can be considered as an assertion that the value of [ap] is 4, or as an assignment setting [ap] to 4, according to the context.
Figure 4 shows some examples of "assert equal" instructions and the flag values for each instruction:
![](https://i.imgur.com/71rAspp.png)
Figure 4. Examples of assert equal instructions
**Interpret** instruction [fp + 1] = 5:
- assert instruction => opcode = 4
- next_ap = ap => ap_update = 00 = 0
- next_pc = pc + instruction_size => pc_update = 000 = 0
- For op0 and op1, there is no add or mul => res_logic(res) = 00 = 0
- There exists an immediate value => op1_src(op1) = 001 = 1
- The immediate value address instructs that the addresses are adjacent => off_op1 = 1
- The left side of equation [fp + 1] => dst_reg(dst) = 1
- The left side of equation [fp + 1] => off_dst = 1
- op0_reg/ off_op0 => inital value(1/-1) //Because these flags are not used in this instruction, the default value is filled in.
## Conditional and unconditional jump
The $jmp$ instruction allows changing the value of the program counter $pc$.
Cairo supports relative jump (its operand represents the offset from the current $pc$)and absolute jump - represented by keywords $rel$ and $abs$ respectively. The $jmp$ instruction may be conditional. For example, when the value of a memory unit is not 0, the $jmp$ instruction will be triggered.
The syntax of the instruction is as follows:
```
#Unconditional jumps.
jmp abs <address>
jmp rel <offset>
#Conditional jumps.
jmp rel <offset> if <op> !
```
Figure 5 shows some examples of the $jmp$ instructions and the corresponding flag values of each instruction:
![](https://i.imgur.com/Gry2T8x.png)
Figure 5 Examples of jump instructions
**Interpret** instruction jmp rel [ap +1] + [fp - 7]:
- jmp instruction => opcode = 0
- next_ap = ap => ap_update = b00 = 0
- next_pc = pc + res=> pc_update = b010 = 2
- res = op0 + op1 => res_logic(res) = b01 = 1
- op1: [fp - 7] => op1_src(op1) = b010 = 2
- op1: [fp - 7] => off_op1 = -7
- op0: [ap + 1] => op0_src(op0) = 0
- op0: [ap + 1] => off_op0 = 1
- dst_reg/ off_dst => inital value(1/-1) ///Because these flags are not used in this instruction, the default value is filled in.
## $call$ and $ret$
The $call$ and $ret$ instructions allow the implementation of the function stack. The $call$ instruction updates the program counter( $pc$ )and frame pointer( $fp$ )registers. The update of the program counter is similar to the $jmp$ instruction. The previous value of $fp$ is written to $[ap]$, to allow the $ret$ instruction to reset the value of $fp$ to the value before the call; similarly, the returned $pc$ (the address of the instruction after the call instruction) is written to $[ap+1]$, to allow the $ret$ instruction to jump back and continue the execution of the code after the call instruction. As two memory units were written, $ap$ advanced by 2 and $fp$ is set to the new $ap$.
The syntax of the instructions are as follows:
```
call abs <address>
call rel <offset>
ret
```
Figure 6 shows some examples of $call$ and $ret$ instructions and flag values corresponding to each instruction:
![](https://i.imgur.com/7y3jsI2.png)
Figure 6 Examples of call instructions and ret instructions
**Interpret** instruction call abs [fp + 4]:
- call instruction => opcode = 0
- next_ap = ap => ap_update = b00 = 0
- next_pc = res => pc_update = b001 = 1
- res = op1 => res_logic(res) = b00 = 0
- op1: [fp + 4] => op1_src(op1) = b010 = 2
- op1: [fp + 4] => off_op1 = 4
- op0_reg/ off_op0 => inital value(0/1) ///Because these flags are not used in this instruction, the default value is filled in.
- dst_reg/ off_dst => inital value(0/0) ///Because these flags are not used in this instruction, the default value is filled in.
## Advancing $ap$
The instruction *ap += <op>* increments the value of $ap$ by the given operand.
Figure 7 shows some examples of the advancing $ap$ instructions and the corresponding flag values of each instruction:
![](https://i.imgur.com/3SMqjHx.png)
Figure 7 Examples of advancing ap instructions
**Interpret** instruction ap += 123:
- advancing ap instruction => opcode = 0
- next_ap = ap + res => ap_update = b01 = 1
- next_pc = pc + instruction_size => pc_update = b000 = 0
- res = op1 => res_logic(res) = b00 = 0
- op1 = 123 => op1_src(op1) = b001 = 1
- op1 = 123 => off_op1 = 1
- op0_reg/ off_op0 => inital value(1/-1) ///Because these flags are not used in this instruction, the default value is filled in.
- dst_reg/ off_dst => inital value(1/-1) ///Because these flags are not used in this instruction, the default value is filled in.