# Ooty zkVM: Rust version of BrainSTARK
This series of blogposts is written by [Muskan Kumari](https://x.com/m_pandey5) and [Soumya Thakur](https://x.com/soumyathakur44). We are undergraduate students at IIT Delhi who spent some time playing around with STARKs and the idea of a zkVM.
[Ooty VM](https://github.com/manojkgorle/brainfuckvm) is a toy VM we built from scratch as a Rust implementation of the [BrainSTARK zkVM](https://aszepieniec.github.io/stark-brainfuck/). We ran into some challenges along the way and changed the design where needed to address them. This first post introduces the concepts of ZK, VMs, and the Brainfuck ISA, while subsequent posts dive into the details of our project.
Github repository: [link](https://github.com/manojkgorle/brainfuckvm)
(Contributors: [Manoj](https://x.com/ManojKGorle), [Muskan](https://x.com/m_pandey5), [Soumya](https://x.com/soumyathakur44))
# Introduction
A **zkVM (Zero-Knowledge Virtual Machine)** is a powerful type of virtual machine that utilizes zero-knowledge proofs (**ZKPs**) to guarantee the integrity and privacy of computations.
To understand how a zkVM works, we first need to understand what **zero-knowledge proofs** are, and what **virtual machines** are.
---
## What are ZKPs?
**Zero-knowledge proofs** are a cryptographic technique that enables one party to prove the validity of a statement to another party **without revealing any additional information**.
In simpler terms, it's like proving you know a secret without actually revealing the secret itself.
The reader is assumed to have some background about what ZKPs are and the basics of how they work.
> If you are unfamiliar with ZKPs, refer to [this starter cookbook on ZK](https://hackmd.io/UorSwsdPSw2VLMKaJL4e6g#Zero-Knowledge).
---
## What is a Virtual Machine?
A **virtual machine (VM)** is a program that runs programs. It behaves like a physical computer such as a laptop, smartphone, or server: it simulates a CPU along with a few other hardware components, allowing it to:
- Perform arithmetic.
- Read and write to memory.
- Interact with I/O devices.
Most importantly, it can understand a machine language which you can use to program it.
### Components of a VM
- **Memory**: The RAM that our program accesses, reads, and writes to, plus the program memory. In our case, this memory will be stored in a simple array.
- **Registers**: A register is a slot for storing a single value on the CPU. Registers are like the “workbench” of the CPU. For the CPU to work with a piece of data, it has to be in one of the registers. However, since there are just a few registers, only a minimal amount of data can be loaded at any given time. VMs work around this by loading values from memory into registers, calculating values into other registers, and then storing the final results back in memory. The registers of our VM will be discussed later on.
- **Instruction Set**: An instruction is a command which tells the CPU to do some fundamental task, such as adding two numbers. Instructions have both an opcode that indicates the kind of task to perform and a set of parameters that provide inputs to the task being performed. Each opcode represents one task that the CPU “knows” how to do. Everything the computer can calculate is some sequence of these simple instructions.
---
## What is a zkVM?
A **zkVM** is a VM that provides a **zero-knowledge proof** of the computations it performs. Broadly speaking, the term usually covers the compiler toolchain and the proof system attached to the virtual machine that executes the program, not just the virtual machine itself.
### Components of a zkVM and Their Functions
- **Compiler**: Converts ordinary programs written in Rust, C, Solidity, or C++ into machine code (VM-readable code).
- **VM**: Executes the compiled program, as machine code, without revealing information about the program and generates an execution trace.
- **Prover**: Converts the execution trace into ZKPs with cryptography and shrinks the size of the output into a few kilobytes as a SNARK or STARK proof of program execution.
- **Verifier (not part of zkVM)**: Checks the proof to confirm that the program was executed correctly, without having to re-run the program.
Thus, the design and implementation of a zkVM are governed by:
1. The **choice of proof** (SNARKs or STARKs).
2. The **instruction set architecture (ISA)** of the zkVM.
Traditionally, an ISA specifies what a CPU is capable of (data types, registers, memory, etc.) and the sequence of actions the CPU performs when it executes a program (the instruction set). Essentially, it determines the machine code that is interpretable and executable by the VM. The choice of ISA underpins the construction of any zkVM: it can radically change how accessible and usable the zkVM is, as well as the speed and efficiency of proof generation.
The ISA used in our zkVM is the **Brainfuck ISA**, and the proof system used is **STARKs**. We will cover both before we look at how our zkVM (or STARK engine) works.
---
## Brainfuck ISA
**Brainfuck** is an esoteric programming language created in 1993 by Swiss student Urban Müller. Designed to be extremely minimalistic, the language consists of only eight simple commands, a data pointer, and an instruction pointer.
Brainfuck is Turing complete, i.e., it can express any computation, but it is not intended for practical use: it provides so little abstraction that programs become very long and complicated.
However, this simple 8-command structure is exactly why we chose it for developing a basic zkVM from scratch, including all the cryptographic libraries and their implementation.
Any Brainfuck program is a sequence of these commands, possibly interspersed with other characters (which are ignored).
### Brainfuck Commands
| **Command** | **Description** |
|-------------|---------------------------------------------------------------------------------------------------------|
| **`>`** | Increment the data pointer by one (move to the next cell on the right). |
| **`<`** | Decrement the data pointer by one (move to the previous cell on the left). |
| **`+`** | Increment the byte at the data pointer by one. |
| **`-`** | Decrement the byte at the data pointer by one. |
| **`.`** | Output the byte at the data pointer. |
| **`,`** | Accept one byte of input and store its value in the byte at the data pointer. |
| **`[`** | If the byte at the data pointer is zero, jump to the command after the matching `]`. |
| **`]`** | If the byte at the data pointer is non-zero, jump back to the matching `[` command. |
---
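To make the command table concrete, here is a minimal Rust sketch of how the eight commands could be represented, and how a source string could be filtered down to them. The names `Command` and `parse` are our own illustration and are not taken from the Ooty VM repository.

```rust
/// The eight Brainfuck commands (hypothetical names, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Command {
    Right,     // '>' move the data pointer right
    Left,      // '<' move the data pointer left
    Inc,       // '+' increment the current cell
    Dec,       // '-' decrement the current cell
    Output,    // '.' write the current cell to the output
    Input,     // ',' read one byte into the current cell
    LoopStart, // '[' jump past the matching ']' if the cell is zero
    LoopEnd,   // ']' jump back to the matching '[' if the cell is non-zero
}

/// Keep only the command characters; every other character is ignored.
pub fn parse(source: &str) -> Vec<Command> {
    source
        .chars()
        .filter_map(|c| match c {
            '>' => Some(Command::Right),
            '<' => Some(Command::Left),
            '+' => Some(Command::Inc),
            '-' => Some(Command::Dec),
            '.' => Some(Command::Output),
            ',' => Some(Command::Input),
            '[' => Some(Command::LoopStart),
            ']' => Some(Command::LoopEnd),
            _ => None,
        })
        .collect()
}
```

Because unknown characters are dropped, any text that avoids the eight command characters can serve as a comment inside a program.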
### Key Features of Brainfuck
- Uses a simple machine model consisting of:
    - A program and an instruction pointer. The instruction pointer begins at the first command; each command it points to is executed, after which it normally moves forward to the next command. The program terminates when the instruction pointer moves past the last command.
    - A one-dimensional array of at least 30,000 byte cells initialized to zero.
    - A movable data pointer (initialized to point to the leftmost byte of the array).
    - Two streams of bytes for input and output (commonly connected to a keyboard and monitor, and using the **ASCII character encoding**).
- Each command is executed sequentially unless redirected by loops (`[` and `]`).
#### Loops in Brainfuck
- `[` and `]` match like parentheses:
    - Each `[` matches exactly one `]`, and vice versa.
    - There can be no unmatched `[` or `]`.
---
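The matching rule for `[` and `]` can be checked with a stack. The following is a minimal sketch; the function name `match_brackets` and its error messages are hypothetical and only serve to illustrate the rule.

```rust
/// Pair up every '[' with its ']' and return the pairs as
/// (position of '[', position of ']'), in the order the loops are closed.
/// Returns an error if any bracket has no partner.
pub fn match_brackets(source: &str) -> Result<Vec<(usize, usize)>, String> {
    let mut open_positions: Vec<usize> = Vec::new();
    let mut pairs: Vec<(usize, usize)> = Vec::new();

    for (i, c) in source.chars().enumerate() {
        match c {
            '[' => open_positions.push(i),
            ']' => match open_positions.pop() {
                // The most recently opened loop is the one being closed.
                Some(open) => pairs.push((open, i)),
                None => return Err(format!("unmatched ']' at position {}", i)),
            },
            _ => {}
        }
    }

    // Anything left on the stack was opened but never closed.
    match open_positions.pop() {
        Some(open) => Err(format!("unmatched '[' at position {}", open)),
        None => Ok(pairs),
    }
}
```

Inner loops are closed before outer ones, so for `[[]]` the pair `(1, 2)` is reported before `(0, 3)`.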
Let us take a simple example of adding two numbers to see how Brainfuck ISA works.
### Adding Two Values
The following code snippet adds the value of the current cell (`c1`) to the cell on its left (`c0`). Each time the loop is executed, the data pointer moves left, `c0` is incremented, the data pointer moves right again, and `c1` is decremented. This sequence is repeated until `c1`, the loop counter, is 0:
```brainfuck
++ Cell c0 = 2
> +++++ Cell c1 = 5
[ Start your loops with your cell pointer on the loop counter (c1 in our case)
< + Add 1 to c0
> - Subtract 1 from c1
] End your loops with the cell pointer on the loop counter
At this point our program has added 5 to 2 leaving 7 in c0 and 0 in c1
However we cannot output this value to the terminal since it is not ASCII encoded
To display the ASCII character "7" we must add 48 to the value 7
We use a loop to compute 48 = 6 * 8
++++ ++++ c1 = 8 and this will be our loop counter again
[
< +++ +++ Add 6 to c0
> - Subtract 1 from c1
]
< . Print out c0 which has the value 55 translating to "7"!
```
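To check the walkthrough above, here is a small sketch of a plain Brainfuck interpreter in Rust. It is not the Ooty VM itself: the function name `run` is hypothetical, and we assume 30,000 `u8` cells with wrapping arithmetic and that reading past the end of the input stores 0.

```rust
/// Run a plain (uncompiled) Brainfuck program and return its output bytes.
/// Assumptions: 30,000 u8 cells, wrapping arithmetic, exhausted input reads as 0.
/// Moving the data pointer outside the tape panics.
pub fn run(source: &str, input: &[u8]) -> Vec<u8> {
    // Drop every character that is not one of the eight commands.
    let program: Vec<char> = source
        .chars()
        .filter(|c| "><+-.,[]".contains(*c))
        .collect();

    // Precompute the partner position of every bracket using a stack.
    let mut partner = vec![0usize; program.len()];
    let mut stack = Vec::new();
    for (i, &c) in program.iter().enumerate() {
        if c == '[' {
            stack.push(i);
        } else if c == ']' {
            let open = stack.pop().expect("unmatched ']'");
            partner[open] = i;
            partner[i] = open;
        }
    }
    assert!(stack.is_empty(), "unmatched '['");

    let mut memory = vec![0u8; 30_000];
    let (mut ip, mut dp) = (0usize, 0usize);
    let mut input = input.iter();
    let mut output = Vec::new();

    while ip < program.len() {
        match program[ip] {
            '>' => dp += 1,
            '<' => dp -= 1,
            '+' => memory[dp] = memory[dp].wrapping_add(1),
            '-' => memory[dp] = memory[dp].wrapping_sub(1),
            '.' => output.push(memory[dp]),
            ',' => memory[dp] = input.next().copied().unwrap_or(0),
            // Jump to the partner bracket; the increment below steps past it.
            '[' if memory[dp] == 0 => ip = partner[ip],
            ']' if memory[dp] != 0 => ip = partner[ip],
            _ => {}
        }
        ip += 1;
    }
    output
}
```

Calling `run("++>+++++[<+>-]++++++++[<++++++>-]<.", &[])` returns the single byte 55, which is the ASCII code for "7" (2 + 5 = 7, then 7 + 6 * 8 = 55).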
### Modified Brainfuck Compiler Instructions
Brainfuck is not an instruction set architecture on its own because its instructions are not self-contained: they depend on context. Specifically, the `[` and `]` instructions refer to the locations of their matching partners.
To address this, we modify the `[` and `]` instructions as follows:
The field element immediately following `[` or `]` contains the destination address of the potential jump. (We will explain what field elements are, and their role in the zkVM, in a later post; for now, treat this value simply as the address the instruction pointer will point to after a jump.)
The instruction set with this change is now variable-size but nevertheless defines a machine architecture.
This modification necessitates a compiler that translates a Brainfuck program into this form by inserting the addresses. An exceedingly simple pushdown automaton achieves this translation.
#### Compiler Logic:
The compiler is almost streaming: it runs over the input sequence once and starts outputting symbols before it reaches the end. The only exception is that it occasionally outputs placeholder symbols whose concrete value (the jump address) is set later. The compiler maintains a stack that stores the locations (in the output sequence) of the `[` symbols that have not yet been closed by a matching `]` symbol. Let `c` denote a counter that tracks the total number of symbols sent to the output so far. Then:
- **When `[` is encountered**:
    1. Push `c` to the stack.
    2. Push two symbols to the output: `[` and a placeholder `*`.
- **When `]` is encountered**:
    1. Read and pop the location `i` of the matching `[` from the stack.
    2. Replace the placeholder at location `i+1` with `c+2`.
    3. Push two symbols to the output: `]` and `i+2`.
- **For any other symbol**:
    1. Push the symbol to the output with no stack modifications.
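
The compiler logic above can be sketched in Rust as follows. This is a minimal illustration, not the code from our repository: the `Symbol` enum and the `compile` function are hypothetical names, and addresses are stored as plain `usize` values rather than field elements.

```rust
/// One symbol of the compiled program: either a command or a jump address.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Symbol {
    Op(char),
    Addr(usize),
}

/// Translate plain Brainfuck into the modified form in which every
/// '[' and ']' is followed by the destination address of its jump.
pub fn compile(source: &str) -> Result<Vec<Symbol>, String> {
    let mut output: Vec<Symbol> = Vec::new();
    // Output locations of the '[' symbols that are still open.
    let mut stack: Vec<usize> = Vec::new();

    for ch in source.chars().filter(|c| "><+-.,[]".contains(*c)) {
        // c = number of symbols sent to the output so far.
        let c = output.len();
        match ch {
            '[' => {
                stack.push(c);
                output.push(Symbol::Op('['));
                output.push(Symbol::Addr(0)); // placeholder, patched below
            }
            ']' => {
                let i = stack.pop().ok_or("unmatched ']'")?;
                // The '[' jumps to the symbol right after "] address".
                output[i + 1] = Symbol::Addr(c + 2);
                output.push(Symbol::Op(']'));
                // The ']' jumps back to the first symbol of the loop body.
                output.push(Symbol::Addr(i + 2));
            }
            other => output.push(Symbol::Op(other)),
        }
    }

    if stack.is_empty() {
        Ok(output)
    } else {
        Err("unmatched '['".to_string())
    }
}
```

For the three-command program `[-]`, this sketch produces the five symbols `[`, `5`, `-`, `]`, `2`: the `[` jumps to address 5 (just past the end of the program) and the `]` jumps back to address 2 (the `-`).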
#### Example (modified and compiled Brainfuck program):
For the program discussed [above](https://hackmd.io/CvkubZOBR9CCGu6oR4sEvQ?both#Adding-Two-Values), where we add the value of two cells:
`++>+++++[<+>-]++++++++[<++++++>-]<.`
The compiled Brainfuck program with corresponding address for jumps will be:
| `ip` | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| `p[ip]` | + | + | > | + | + | + | + | + | [ | `16` | < | + | > | - | ] | `10` | + | + | + | + | + |

| `ip` | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| `p[ip]` | + | + | + | [ | `37` | < | + | + | + | + | + | + | > | - | ] | `26` | < | . |

The numerical entries of `p[ip]` are the jump addresses for the `[` or `]` that immediately precedes them.
### State Transition Function for Modified Brainfuck VM
Any virtual machine defines a state transition function. Running the virtual machine consists of repeatedly applying this function to the state until the termination criterion is met; in this case, when the instruction pointer points beyond the end of the program. For Brainfuck, the transition depends on the current instruction `p[ip]`. We use the following notation:
- `ip`: Instruction pointer pointing to a program instruction.
- `dp`: Data pointer pointing to a memory cell.
- `p[i]`: Instruction at location `i`.
- `d[i]`: Memory cell at location `i`.
State Transition Logic:
1. **If `p[ip] == '['`**:
   - If `d[dp] == 0`, jump: `ip = p[ip+1]`.
   - If `d[dp] != 0`, skip the jump address: `ip = ip+2`.
2. **If `p[ip] == ']'`**:
   - If `d[dp] != 0`, jump: `ip = p[ip+1]`.
   - If `d[dp] == 0`, skip the jump address: `ip = ip+2`.
3. **If `p[ip] == '<'`**:
   - Decrement data pointer: `dp = dp - 1`.
   - Increment instruction pointer: `ip = ip + 1`.
4. **If `p[ip] == '>'`**:
   - Increment data pointer: `dp = dp + 1`.
   - Increment instruction pointer: `ip = ip + 1`.
5. **If `p[ip] == '+'`**:
   - Increment data element: `d[dp] = d[dp] + 1`.
   - Increment instruction pointer: `ip = ip + 1`.
6. **If `p[ip] == '-'`**:
   - Decrement data element: `d[dp] = d[dp] - 1`.
   - Increment instruction pointer: `ip = ip + 1`.
7. **If `p[ip] == '.'`**:
   - Output data element: `d[dp]`.
   - Increment instruction pointer: `ip = ip + 1`.
8. **If `p[ip] == ','`**:
   - Set data element to input symbol: `d[dp] = input`.
   - Increment instruction pointer: `ip = ip + 1`.
---
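The state transition function for the modified Brainfuck VM can be sketched in Rust as below. This is an illustration under simplifying assumptions, not our actual implementation: the `Vm` struct and its methods are hypothetical, the compiled program is a slice of `u32` values (a command is stored as its character code, a jump address as a plain number), and memory cells are `u8` with wrapping arithmetic instead of field elements.

```rust
/// VM state: instruction pointer, data pointer, memory, and I/O streams.
pub struct Vm {
    pub ip: usize,
    pub dp: usize,
    pub memory: Vec<u8>,
    pub input: Vec<u8>,
    pub input_pos: usize,
    pub output: Vec<u8>,
}

impl Vm {
    pub fn new(input: &[u8]) -> Self {
        Vm {
            ip: 0,
            dp: 0,
            memory: vec![0; 30_000],
            input: input.to_vec(),
            input_pos: 0,
            output: Vec::new(),
        }
    }

    /// Apply the state transition function once to the compiled program `p`.
    pub fn step(&mut self, p: &[u32]) {
        let cell = self.memory[self.dp];
        let op = char::from_u32(p[self.ip]).expect("instruction is not a character");
        match op {
            // Brackets either jump to p[ip+1] or skip over the jump address.
            '[' => self.ip = if cell == 0 { p[self.ip + 1] as usize } else { self.ip + 2 },
            ']' => self.ip = if cell != 0 { p[self.ip + 1] as usize } else { self.ip + 2 },
            _ => {
                match op {
                    '<' => self.dp -= 1,
                    '>' => self.dp += 1,
                    '+' => self.memory[self.dp] = cell.wrapping_add(1),
                    '-' => self.memory[self.dp] = cell.wrapping_sub(1),
                    '.' => self.output.push(cell),
                    ',' => {
                        // Assumption: exhausted input reads as 0.
                        let byte = self.input.get(self.input_pos).copied().unwrap_or(0);
                        self.memory[self.dp] = byte;
                        self.input_pos += 1;
                    }
                    other => panic!("invalid instruction {:?}", other),
                }
                self.ip += 1;
            }
        }
    }

    /// Repeat the transition until `ip` points beyond the end of the program.
    pub fn run(&mut self, p: &[u32]) {
        while self.ip < p.len() {
            self.step(p);
        }
    }
}
```

Storing commands and addresses in the same `u32` slice is unambiguous here because an address is only ever read at position `ip+1` right after a bracket, and the instruction pointer never lands on it.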
We have covered the basics of zkVMs and the Brainfuck ISA. The [next blogpost](https://hackmd.io/@Muskan05/rJb7hejU1x) will discuss STARKs and their prerequisites in detail.