owned this note
owned this note
Published
Linked with GitHub
# u32 operations in Miden VM
###### tags: `notes`
In this note we assume that all values are elements in a 64-bit field defined by modulus $2^{64} - 2^{32} + 1$. One of the nice properties of this field is that multiplying two elements in this field with values less than $2^{32}$ does not overflow field modulus.
In the sections below, we describe how we can implement common operations on u32 values (i.e., 32-bit unsigned integers) in this field very efficiently. This includes arithmetic operations, comparison operations, and bitwise operations.
Vast majority of operations rely only on 16-bit range checks which can be implemented efficiently using a methodology described in [this note](https://hackmd.io/D-vjBYtHQB2BuOB-HMUG5Q). Some operations, such as bitwise `AND`, `OR`, and `XOR` require lookups in additional auxiliary tables, which are described [here](https://hackmd.io/vejHasuZSVWZOafBLNS_YQ).
### Checking element validity
Another nice property of this field is that checking whether four 16-bit values form a valid field element can be done relatively cheaply. Assume $t_0$, $t_1$, $t_2$, and $t_3$ are known to be 16-bit values (e.g., less than $2^{16}$), and we want to verify that $2^{48} \cdot t_3 + 2^{32} \cdot t_2 + 2^{16} \cdot t_1 + t_0$ is a valid field element.
For simplicity, let's denote:
$$
v_{hi} = 2^{16} \cdot t_3 + t_2 \\
v_{lo} = 2^{16} \cdot t_1 + t_0
$$
We can then impose the following constraint to verify element validity:
$$
\left(1 - m \cdot (2^{32} - 1 - v_{hi})\right) \cdot v_{lo} = 0
$$
The above constraint holds only if either of the following hold:
* $v_{lo} = 0$
* $v_{hi} \ne 2^{32} - 1$
To satisfy the latter constraint, the prover would need to set $m = (2^{32} - 1 - v_{hi})^{-1}$, which is possible only when $v_{hi} \ne 2^{32} - 1$.
This constraint is sufficient because modulus $2^{64} - 2^{32} + 1$ in binary representation is 32 ones, followed by 31 zeros, followed by a single one:
$$
1111111111111111111111111111111100000000000000000000000000000001
$$
This implies that the largest possible 64-bit value encoding a valid field element would be 32 ones, followed by 32 zeros:
$$
1111111111111111111111111111111100000000000000000000000000000000
$$
Thus, for a 64-bit value to encode a valid field element, either the lower 32 bits must be all zeros, or the upper 32 bits must not be all ones (which is $2^{32} - 1$).
## Conversions
Converting field elements into u32 values may be accomplished via two operations `U32SPLIT` and `U32CAST`. Supporting these operations requires 5 helper registers.
### Splitting
Assume $a$ is a field element. We define `U32SPLIT` operation as $a \rightarrow (b, c)$, where $b$ contains lower 32-bits of $a$, and $c$ contains the upper 32 bits of $a$. The diagram below illustrates this graphically.
![](https://i.imgur.com/6V2l0nI.png)
To facilitate this operation, the prover needs to provide five non-deterministic 'hint' values $t_0$, $t_1$, $t_2$, $t_3$, and $m$ such that:
$$
a = 2^{48} \cdot t_3 + 2^{32} \cdot t_2 + 2^{16} \cdot t_1 + t_0 \\
b = 2^{16} \cdot t_1 + t_0 \\
c = 2^{16} \cdot t_3 + t_2 \\
\left(1 - m \cdot (2^{32} - 1 - c)\right) \cdot b = 0
$$
The last constraint ensures that the decomposition of $a$ into four 16-bit values is done in such a way that these values encode a valid field element (as described [here](#Checking-element-validity)). We also need to enforce constraints against $t_0$, $t_1$, $t_2$, $t_3$ to make sure that they are 16-bit values, and this can be done via permutation-based range checks.
### Casting
Assume $a$ is a field element. We define `U32CAST` operation as $a \rightarrow b$, where $b$ contains lower 32-bits of $a$. The diagram below illustrates this graphically.
![](https://i.imgur.com/izR2lXy.png)
To facilitate this operation, the prover needs to provide five non-deterministic 'hint' values $t_0$, $t_1$, $t_2$, $t_3$, and $m$ such that:
$$
a = 2^{48} \cdot t_3 + 2^{32} \cdot t_2 + 2^{16} \cdot t_1 + t_0 \\
b = 2^{16} \cdot t_1 + t_0 \\
\left(1 - m \cdot (2^{32} - 2^{16} \cdot t_3 - t_2 - 1)\right) \cdot b = 0
$$
The last constraint ensures that the decomposition of $a$ into four 16-bit values is done in such a way that these values encode a valid field element (as described [here](#Checking-element-validity)). We also need to enforce constraints against $t_0$, $t_1$, $t_2$, $t_3$ to make sure that they are 16-bit values, and this can be done via permutation-based range checks.
## Addition
Assume $a$ and $b$ are known to be 32-bit values. `U32ADD` operation computes $a + b \rightarrow (c, d)$, where $c$ contains the low 32-bits of the result and $d$ is the carry bit. The diagram below illustrates this graphically.
![](https://i.imgur.com/7QSjSCW.png)
To facilitate this operation, the prover needs to provide two non-deterministic 'hint' values $c_0$ and $c_1$ such that:
$$
a + b = c + 2^{32} \cdot d \\
d^2 - d = 0 \\
c = 2^{16} \cdot c_1 + c_0
$$
We also need to enforce constraints against $c_0$ and $c_1$ to make sure that they are 16-bit values, and this can be done via permutation-based range checks.
### Addition with carry
In implementing efficient big integer arithmetic, it may be convenient to have an "addition with carry" operation which has the following semantics. Assume $a$ and $b$ are known to be 32-bit values, and $c$ is known to be a binary value (either $1$ or $0$). `U32ADDC` operation computes $a + b + c \rightarrow (d, e)$, where $d$ contains the low 32-bits of the result and $e$ is the carry bit. The diagram below illustrates this graphically.
![](https://i.imgur.com/5mCZlF7.png)
To facilitate this operation, the prover needs to provide two non-deterministic 'hint' values $d_0$ and $d_1$ such that:
$$
a + b + c = d + 2^{32} \cdot e \\
e^2 - e = 0 \\
d = 2^{16} \cdot d_1 + d_0
$$
We also need to enforce constraints against $d_0$ and $d_1$ to make sure that they are 16-bit values, and this can be done via permutation-based range checks.
## Subtraction
Assume $a$ and $b$ are known to be 32-bit values. `U32SUB` operation computes $a - b \rightarrow (c, d)$, where $c$ contains the 32-bit result in two's complement and $d$ is the borrow bit. The diagram below illustrates this graphically.
![](https://i.imgur.com/HfcOM6J.png)
To facilitate this operation, the prover needs to provide two non-deterministic 'hint' values $c_0$ and $c_1$ such that:
$$
a = b + c - 2^{32} \cdot d \\
d^2 - d = 0 \\
c = 2^{16} \cdot c_1 + c_0
$$
We also need to enforce constraints against $c_0$ and $c_1$ to make sure that they are 16-bit values, and this can be done via permutation-based range checks.
## Multiplication
Assume $a$ and $b$ are known to be 32-bit values. `U32MUL` operation computes $a \cdot b \rightarrow (c, d)$, where $c$ contains the low 32 bits of the result and $d$ contains the upper 32 bits of the result. The diagram below illustrates this graphically.
![](https://i.imgur.com/jolgNBN.png)
To facilitate this operation, the prover needs to provide five non-deterministic 'hint' values $c_0$, $c_1$, $d_0$, $d_1$, and $m$ such that:
$$
a \cdot b = 2^{48} \cdot d_1 + 2^{32} \cdot d_0 + 2^{16} \cdot c_1 + c_0 \\
c = 2^{16} \cdot c_1 + c_0 \\
d = 2^{16} \cdot d_1 + d_0 \\
\left(1 - m \cdot (2^{32} - 1 - d)\right) \cdot c = 0
$$
The last constraint ensures that the decomposition of $a \cdot b$ into four 16-bit values is done in such a way that these values encode a valid field element (as described [here](#Checking-element-validity)). We also need to enforce constraints against $c_0$, $c_1$, $d_0$, $d_1$ to make sure that they are 16-bit values, and this can be done via permutation-based range checks.
### Multiply-add
In implementing efficient big integer arithmetic, it may be convenient to have an "multiply-add" operation which has the following semantics. Assume $a$, $b$, and $c$ are known to be 32-bit values. `U32MADD` operation computes $a \cdot b + c \rightarrow (d, e)$, where $d$ contains the low 32 bits of the result and $e$ contains the upper 32 bits of the result. The diagram below illustrates this graphically.
![](https://i.imgur.com/OAugJ6D.png)
To facilitate this operation, the prover needs to provide five non-deterministic 'hint' values $d_0$, $d_1$, $e_0$, $e_1$, and $m$ such that:
$$
a \cdot b + c = 2^{48} \cdot e_1 + 2^{32} \cdot e_0 + 2^{16} \cdot d_1 + d_0 \\
d = 2^{16} \cdot d_1 + d_0 \\
e = 2^{16} \cdot e_1 + e_0 \\
\left(1 - m \cdot (2^{32} - 1 - e)\right) \cdot d = 0
$$
The last constraint ensures that the decomposition of $a \cdot b + c$ into four 16-bit values is done in such a way that these values encode a valid field element (as described [here](#Checking-element-validity)). We also need to enforce constraints against $d_0$, $d_1$, $e_0$, $e_1$ to make sure that they are 16-bit values, and this can be done via permutation-based range checks.
**Note**: that the above constraints guarantee the correctness of the operation iff $a \cdot b + c$ cannot overflow field modules (which is the case for the field with modulus $2^{64} - 2^{32} + 1$).
## Division
Assume $a$ and $b$ are known to be 32-bit values. `U32DIV` operation computes $a / b \rightarrow (c, d)$, where $c$ contains the quotient and $d$ contains the remainder. The diagram below illustrates this graphically.
![](https://i.imgur.com/o3KZtzM.png)
To facilitate this operation, the prover needs to provide four non-deterministic 'hint' values $t_0$, $t_1$, $t_2$, $t_3$ such that:
$$
a = b \cdot c + d \\
a - c = 2^{16} \cdot t_1 + t_0 \\
b - d - 1= 2^{16} \cdot t_2 + t_3
$$
The second constraint enforces that $c \leq a$, while the third constraint enforces that $d < b$. We also need to enforce constraints against $t_0$, $t_1$, $t_2$, $t_3$ to make sure that they are 16-bit values, and this can be done via permutation-based range-checks.
## Comparisons
Assume $a$ and $b$ are known to be 32-bit values. We define `U32LT` and `U32GT` operations as follows:
* $lt(a, b) \rightarrow c$, where $c = 1$ iff $a < b$, and $0$ otherwise.
* $gt(a, b) \rightarrow c$, where $c = 1$ iff $a > b$, and $0$ otherwise.
Both of these operations can be emulated using `U32SUB` operation. For example, `U32SUB SWAP DROP` would be equivalent to `U32LT`, and `U32SUB SWAP DROP NOT` would be equivalent to `U32GT`. But, if we wanted to have these operations execute in just a single VM cycle, we could do it as described below.
We do not cover equality comparisons in this note because equality comparisons for field elements and u32 values are identical.
### Less than
Stack transition for `U32LT` operation would look as follows:
![](https://i.imgur.com/n7q4ccS.png)
To facilitate this operation, the prover needs to provide four non-deterministic 'hint' values $t_0$ and $t_1$ such that:
$$
a = b + (t_1 \cdot 2^{16} + t_0) - 2^{32} \cdot c \\
c^2 - c = 0
$$
Here, $t_0$ and $t_1$ hold the lower and upper 16 bits of $a - b$ in two's complement. Thus, we also need to enforce constraints against $t_0$ and $t_1$ to make sure that they are 16-bit values, and this can be done via permutation-based range-checks.
### Greater than
Stack transition for `U32GT` operation looks identical to the stack transition of the `U32LT` operation, but the constraint we need to enforce are slightly different. Specifically:
$$
b = a + (t_1 \cdot 2^{16} + t_0) - 2^{32} \cdot c \\
c^2 - c = 0
$$
Here, $t_0$ and $t_1$ hold the lower and upper 16 bits of $b - a$ in two's complement (rather than $a - b$). And we also need to enforce additional constraints to make sure that $t_0$ and $t_1$ are 16-bit values.
## Bit shifts
To perform u32 bit shift operations in a single VM cycle, we need to introduce an additional (6th) helper register. We also need to rely on a lookup table which maps integers in the range between $0$ and $32$ to the corresponding powers of two (i.e., $0 \rightarrow 1$, $1 \rightarrow 2$, $2 \rightarrow 4$ etc.). AIR for such a table can be defined relatively easily.
### Left shift
Assume $a$ is known to be a 32-bit value and $b$ is an integer in the range $[0, 32)$. `U32SHL` operation computes $shl(a, b) \rightarrow c$, where $c$ is the result of performing a bitwise shift of $a$ by $b$ bits to the left. The diagram below illustrates this graphically.
![](https://i.imgur.com/Lq0YGZo.png)
To facilitate this operation, the prover needs to provide six non-deterministic 'hint' values $t_0$, $t_1$, $t_2$, $t_3$, $m$, and $p$ such that:
$$
p = 2^b \\
a \cdot p = 2^{48} \cdot t_3 + 2^{32} \cdot t_2 + 2^{16} \cdot t_1 + t_0 \\
c = 2^{16} \cdot t_1 + t_0 \\
\left(1 - m \cdot (2^{32} - 2^{16} \cdot t_3 - t_2 - 1)\right) \cdot c = 0
$$
The first constraint needs to be enforced via a permutation-based lookup table. The last constraint ensures that the decomposition of $a \cdot p$ into four 16-bit values is done in such a way that these values encode a valid field element (as described [here](#Checking-element-validity)). We also need to enforce constraints against $t_0$, $t_1$, $t_2$, $t_3$ to make sure that they are 16-bit values, and this can be done via permutation-based range checks.
### Right shift
Assume $a$ is known to be a 32-bit value and $b$ is an integer in the range $[0, 32)$. `U32SHR` operation computes $lsh(a, b) \rightarrow c$, where $c$ is the result of performing a bitwise shift of $a$ by $b$ bits to the right. The diagram below illustrates this graphically.
![](https://i.imgur.com/QYQ8Mjs.png)
To facilitate this operation, the prover needs to provide six non-deterministic 'hint' values $t_0$, $t_1$, $t_2$, $t_3$, $m$, and $p$ such that:
$$
p = 2^{32-b} \\
a \cdot p = 2^{48} \cdot t_3 + 2^{32} \cdot t_2 + 2^{16} \cdot t_1 + t_0 \\
c = 2^{16} \cdot t_3 + t_2 \\
\left(1 - m \cdot (2^{32} - 1 - c)\right) \cdot (2^{16} \cdot t_1 + t_0) = 0
$$
The first constraint needs to be enforced via a permutation-based lookup table. The last constraint ensures that the decomposition of $a \cdot p$ into four 16-bit values is done in such a way that these values encode a valid field element (as described [here](#Checking-element-validity)). We also need to enforce constraints against $t_0$, $t_1$, $t_2$, $t_3$ to make sure that they are 16-bit values, and this can be done via permutation-based range checks.
## Bit rotations
Similarly to bit shift operations, performing bit rotation operations in a single VM cycle requires a 6th helper register and also relies on a lookup table which maps integers in the range between $0$ and $32$ to the corresponding powers of two.
### Rotate left
Assume $a$ is known to be a 32-bit value and $b$ is an integer in the range $[0, 32)$. `U32ROTL` operation computes $rotl(a, b) \rightarrow c$, where $c$ is the result of performing a bitwise rotation of $a$ by $b$ bits to the left. The diagram below illustrates this graphically.
![](https://i.imgur.com/mULvf5F.png)
To facilitate this operation, the prover needs to provide six non-deterministic 'hint' values $t_0$, $t_1$, $t_2$, $t_3$, $m$, and $p$ such that:
$$
p = 2^b \\
a \cdot p = 2^{48} \cdot t_3 + 2^{32} \cdot t_2 + 2^{16} \cdot t_1 + t_0 \\
c = (2^{16} \cdot t_3 + t_2) + (2^{16} \cdot t_1 + t_0) \\
\left(1 - m \cdot (2^{32} - 2^{16} \cdot t_3 - t_2 - 1)\right) \cdot (2^{16} \cdot t_1 + t_0) = 0
$$
The first constraint needs to be enforced via a permutation-based lookup table. The last constraint ensures that the decomposition of $a \cdot p$ into four 16-bit values is done in such a way that these values encode a valid field element (as described [here](#Checking-element-validity)). We also need to enforce constraints against $t_0$, $t_1$, $t_2$, $t_3$ to make sure that they are 16-bit values, and this can be done via permutation-based range checks.
### Rotate right
Assume $a$ is known to be a 32-bit value and $b$ is an integer in the range $[0, 32)$. `U32ROTR` operation computes $rotr(a, b) \rightarrow c$, where $c$ is the result of performing a bitwise rotation of $a$ by $b$ bits to the right. The diagram below illustrates this graphically.
![](https://i.imgur.com/TbN5Zfi.png)
To facilitate this operation, the prover needs to provide six non-deterministic 'hint' values $t_0$, $t_1$, $t_2$, $t_3$, and $p$ such that:
$$
p = 2^{32 - b} \\
a \cdot p = 2^{48} \cdot t_3 + 2^{32} \cdot t_2 + 2^{16} \cdot t_1 + t_0 \\
c = (2^{16} \cdot t_3 + t_2) + (2^{16} \cdot t_1 + t_0) \\
\left(1 - m \cdot (2^{32} - 2^{16} \cdot t_3 - t_2 - 1)\right) \cdot (2^{16} \cdot t_1 + t_0) = 0
$$
The first constraint needs to be enforced via a permutation-based lookup table. The last constraint ensures that the decomposition of $a \cdot p$ into four 16-bit values is done in such a way that these values encode a valid field element (as described [here](#Checking-element-validity)). We also need to enforce constraints against $t_0$, $t_1$, $t_2$, $t_3$ to make sure that they are 16-bit values, and this can be done via permutation-based range checks.
## Bitwise AND
Assume $a$ and $b$ are known to be a 32-bit values. `U32AND` operation computes $and(a, b) \rightarrow c$, where $c$ is the result of performing a bitwise AND of $a$ and $b$. The diagram below illustrates this graphically.
![](https://i.imgur.com/lqOYO5l.png)
To facilitate this operation, we will need to perform a lookup in a table described [here](https://hackmd.io/vejHasuZSVWZOafBLNS_YQ#Bitwise-operations). The prover will also need to provide a non-deterministic 'hint' for the lookup (row address $r$). The lookup in the table can be accomplished by including the following values into the lookup product:
$$
\beta + \alpha \cdot r + \alpha^2 \cdot a + \alpha^3 \cdot b
$$
$$
\beta + \alpha \cdot (r + 7) + \alpha^2 \cdot c
$$
## Bitwise OR
Bitwise OR operation `U32OR` can be performed in the same way as the bitwise AND operation, but we'll need to use a different lookup table - the one specialized for computing bitwise OR operations.
## Bitwise XOR
Bitwise XOR operation `U32XOR` can be performed in the same way as the bitwise AND operation, but we'll need to use a different lookup table - the one specialized for computing bitwise XOR operations.
## Bitwise NOT
Assume $a$ is known to be a 32-bit value. `U32NOT` operation computes $not(a) \rightarrow b$, where $b$ is a bitwise negation of $a$. The diagram below illustrates this graphically.
![](https://i.imgur.com/TZAcnd2.png)
To facilitate this operation we need to enforce the following constraints:
$$
b = 2^{32} - a - 1
$$