# Status updates
## 27 January 2025
- Took sick leave: down with a cold, together with the rest of my family
## 17 January 2025
**Completed:**
- Experimented with disabling the RISC-V extensions we don't need, in order to simplify the codebase
- Quite difficult as the code that we need depends on those definitions
- 30k lines of compilation errors, way too much work
- Reached a dead end trying to replace RISC-V registers with Zisk registers
- A lot of assumptions in existing patterns about specific registers, need to delete/rewrite them
- Crashes with cryptic error messages if some of the register conventions are not satisfied
**Next steps:**
- Investigating another approach that involves disabling the RISC-V register allocator and replacing it with another pass
- The LLVM patterns will still be happy and use fixed registers when needed
- But when the register is not fixed, we will have the opportunity to use memory slots
- Then in the future we can start relaxing the patterns that require fixed registers
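
A minimal sketch of what such a replacement pass could look like, under stated assumptions: operands that a pattern pins to a physical register pass through untouched, and every other virtual register gets its own memory slot. The types `Operand` and `Location`, the function `assign_locations`, and the 8-byte `SLOT_SIZE` are all hypothetical names chosen for illustration; none of them are LLVM or Zisk APIs.

```rust
use std::collections::HashMap;

/// An instruction operand before the pass runs (hypothetical model).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Operand {
    /// A physical register that an existing pattern requires; kept as-is.
    Fixed(u8),
    /// A virtual register that is free to live anywhere.
    Virtual(u32),
}

/// Where an operand ends up after the pass (hypothetical model).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Location {
    Register(u8),
    /// Byte offset of a memory slot.
    MemorySlot(u64),
}

/// Assumption: one 64-bit slot per value.
pub const SLOT_SIZE: u64 = 8;

/// Gives every distinct virtual register its own memory slot, in first-use
/// order. Fixed registers are left untouched.
pub fn assign_locations(operands: &[Operand]) -> Vec<Location> {
    let mut slots: HashMap<u32, u64> = HashMap::new();
    operands
        .iter()
        .map(|op| match *op {
            Operand::Fixed(reg) => Location::Register(reg),
            Operand::Virtual(vreg) => {
                // Offset a new slot would get; ignored if `vreg` already has one.
                let next = slots.len() as u64 * SLOT_SIZE;
                Location::MemorySlot(*slots.entry(vreg).or_insert(next))
            }
        })
        .collect()
}
```

Relaxing a pattern later would then amount to turning one of its `Fixed` operands into a `Virtual` one, without touching the pass itself.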
## 10 January 2025
**Completed:**
- Learned all about registers in LLVM
- How to define a Zisk register type
- Where the register class annotations are added to the LLVM IR Graph nodes
- How virtual registers are translated to physical ones through register allocation
**Next steps:**
- Generate Zisk instructions working with memory from LLVM, the steps involved are:
1. Define Zisk register class and concrete registers (done)
2. Implement a mapping from LLVM DAG Node types to Zisk register class (done)
3. Write a register allocator for RISC-V that converts virtual Zisk registers to memory slots
4. Re-define all RISC-V instructions that use registers to now use memory slots
5. Define the new encoding for instructions, as they don't fit into 32 bits anymore due to register address size
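
To make step 5 concrete, here is a small back-of-the-envelope sketch. The standard RISC-V R-type format spends 7 + 3 + 7 bits on `opcode`, `funct3` and `funct7`, plus three 5-bit register fields, for 32 bits in total. The function below (a hypothetical helper, not part of any codebase) computes the size if those three fields were widened to hold memory addresses instead. The 32-bit and 64-bit operand widths used in the comments are illustrative assumptions, not the actual Zisk operand size.

```rust
/// Bits of a standard RISC-V R-type instruction that are not register
/// fields: opcode (7) + funct3 (3) + funct7 (7).
pub const R_TYPE_NON_REGISTER_BITS: u32 = 7 + 3 + 7;

/// Number of operand fields in an R-type instruction (rd, rs1, rs2).
pub const R_TYPE_OPERANDS: u32 = 3;

/// Total size in bits if every operand field is `operand_bits` wide.
/// With 5-bit register numbers this is the usual 32 bits; with 32-bit or
/// 64-bit memory addresses (illustrative widths) it no longer fits.
pub fn r_type_size_bits(operand_bits: u32) -> u32 {
    R_TYPE_NON_REGISTER_BITS + R_TYPE_OPERANDS * operand_bits
}
```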
## 27 December 2024
**Completed:**
- Improved the iteration time from 10m+ to 10s for a change in the LLVM backend
- Switched to linking LLVM into rustc dynamically
- Wrote a shell script to incrementally rebuild LLVM and update the shared library in the rustc artifacts
- Makes a huge difference for experiments with the codebase
- [Implemented](https://github.com/0xPolygonHermez/zisk/commit/54d9820169f983b8b94fe123d0c9e4102fa45888) a proof of concept for parsing new instructions in the Zisk emulator
- This allows encoding instructions longer than 64 bits in the RISC-V ELF from the LLVM side, converting them to Zisk, and executing them correctly
- Required changes to ziskemu to support instructions longer than 32 bits
- Looks like a workable path, but proper support would require some refactoring in the translator and emulator
**Next steps:**
- Turn the proof of concept into a more generic implementation to support further experiments
- Extend `RiscvInstruction` with a native zisk opcode payload
- Choose an encoding for `ZiskInst` in the binary generated by LLVM
- Implement encoding in LLVM and decoding in ziskemu
- Try to generate a Zisk instruction from LLVM that works directly with memory
- I'm not sure I will be able to do this incrementally, so might need to convert all instructions at once, taking more time to implement this
- I will be out next week for some travel and family time and will be back on 6th of January
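
A rough sketch of the encode/decode round trip planned above, assuming the layout from the 23 December entry: 256 bits in total, with a leading 32-bit word carrying the RISC-V `custom-0` major opcode (`0001011` in the low 7 bits). Everything else is an assumption made for illustration: the `Payload` type, the `encode`/`decode` names, the little-endian header, and the choice to leave the remaining header bits at zero. This is not the actual `ZiskInst` encoding.

```rust
/// Major opcode of the RISC-V `custom-0` space (low 7 bits of the word).
pub const CUSTOM_0_OPCODE: u32 = 0b000_1011;

/// Total instruction size: 256 bits = 32 bytes.
pub const LONG_INST_BYTES: usize = 32;

/// Hypothetical payload: everything after the 32-bit header word.
pub type Payload = [u8; LONG_INST_BYTES - 4];

/// Emits a header word carrying the `custom-0` opcode, then the payload.
pub fn encode(payload: &Payload) -> [u8; LONG_INST_BYTES] {
    let mut bytes = [0u8; LONG_INST_BYTES];
    bytes[..4].copy_from_slice(&CUSTOM_0_OPCODE.to_le_bytes());
    bytes[4..].copy_from_slice(payload);
    bytes
}

/// Returns the payload if `bytes` holds at least one full long instruction
/// that starts with a `custom-0` header word; otherwise returns `None`.
pub fn decode(bytes: &[u8]) -> Option<Payload> {
    if bytes.len() < LONG_INST_BYTES {
        return None;
    }
    let mut header = [0u8; 4];
    header.copy_from_slice(&bytes[..4]);
    if u32::from_le_bytes(header) & 0x7f != CUSTOM_0_OPCODE {
        return None;
    }
    let mut payload = [0u8; LONG_INST_BYTES - 4];
    payload.copy_from_slice(&bytes[4..LONG_INST_BYTES]);
    Some(payload)
}
```

Checking only the low 7 bits of the header keeps the other 25 bits free for later use, for example a Zisk opcode or a format tag.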
## 23 December 2024
**Completed:**
- Implemented a [prototype](https://github.com/aborg-dev/llvm-project/commit/ecf61daf7f6ac944e432712bbb40e50fe3eb9234) to encode long Zisk instructions in LLVM RISC-V backend
- The proper way to do this is to use [LLVM interface](https://groups.google.com/g/llvm-dev/c/uiWRmdpV0PQ/m/-KXCkr59AwAJ) for long instructions over 64 bits
- This would also require defining a disassembler for the new instructions
- A simpler path that would work for a prototype is to store additional payload in Zisk instructions and emit it manually
- The instructions will be 256 bits in size, with the first 32-bit component being a `custom-0` RISC-V opcode to simplify parsing in Ziskemu
- Read through WebAssembly register stackify code to borrow ideas
- The main part of implementation is in https://github.com/aborg-dev/llvm-project/blob/rust-llvm/rustc/19.1-2024-12-03/llvm/lib/Target/WebAssembly/WebAssemblyRegNumbering.cpp and is fairly simple
- We would also need to define pseudo-registers for Zisk https://github.com/aborg-dev/llvm-project/blob/rust-llvm/rustc/19.1-2024-12-03/llvm/lib/Target/WebAssembly/WebAssemblyRegisterInfo.td
- The initial approach is a bit wasteful as it does not reuse memory locations that have been freed, but there are [optimizations for this](https://github.com/aborg-dev/llvm-project/blob/rust-llvm/rustc/19.1-2024-12-03/llvm/lib/Target/WebAssembly/WebAssemblyRegColoring.cpp)
- Next steps:
- Exploring how to do the mapping from virtual registers to memory
- Rough model is that before register selection, we have a sequential list of registers, each being defined only once (SSA) in a single code path
- Q: Are there any constraints in Zisk for values in operands?
- Speed up the build process - right now takes 10m for each change, need to find a better way
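
To illustrate the slot reuse mentioned above, here is a simplified greedy sketch based on the rough model in this entry: each virtual register is defined once and is live from its definition to its last use. This is not the algorithm used in `WebAssemblyRegColoring.cpp`. `LiveRange` and `color_slots` are hypothetical names, and live ranges are reduced to a single pair of instruction indices, which ignores control flow.

```rust
/// Live range of one virtual register: the instruction index of its single
/// definition (SSA) and of its last use.
#[derive(Debug, Clone, Copy)]
pub struct LiveRange {
    pub def: usize,
    pub last_use: usize,
}

/// Assigns a slot index to each range, reusing a slot once its previous
/// occupant is dead. `ranges` must be sorted by `def`.
pub fn color_slots(ranges: &[LiveRange]) -> Vec<usize> {
    // busy_until[s] = last use of the value currently stored in slot s.
    let mut busy_until: Vec<usize> = Vec::new();
    let mut assignment = Vec::with_capacity(ranges.len());
    for range in ranges {
        // A slot is free if its occupant's last use is strictly before
        // the new definition.
        let slot = match busy_until.iter().position(|&end| end < range.def) {
            Some(free) => free,
            None => {
                busy_until.push(0);
                busy_until.len() - 1
            }
        };
        busy_until[slot] = range.last_use;
        assignment.push(slot);
    }
    assignment
}
```

With four values whose ranges overlap only pairwise, this needs two slots where the naive one-slot-per-register numbering would need four.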
## 16 December 2024
- Finished end-to-end test with LLVM :tada:
- Steps:
- Fork LLVM RISC-V backend
- Build Rust compiler with new LLVM
- Register new Rust toolchain
- Build Zisk program using this toolchain
- Run the resulting program on ZiskEmu
- Codebase: https://github.com/aborg-dev/llvm-project/tree/rust-llvm/rustc/19.1-2024-12-03/llvm/lib/Target/RISCV
- Takes quite some time to build (~1 hour on a powerful machine), but then can be shipped to end users like today
- Next steps:
- Implement optimization that bypasses registers for arithmetic operations
## 6 December 2024
**Completed:**
- Wrote a summary for Cranelift investigation:
> The main intention of the existing Wasmtime/Cranelift compilation pipeline is to generate bytecode that will run inside the `wasmtime` VM. For the ZisK project we will need to break that core assumption (as we agree that maintaining compatibility with a moving `wasmtime` target is a non-starter).
>
> The three main components that we will need to (re)write and maintain are:
> 1. WASM frontend that generates ZisK-compatible CLIF - ~5k lines of Rust
> 2. ZisK backend/codegen - ~10k lines of Rust + ISLE
> 3. ZisK assembler/linker - ~2k lines of Rust; outputs a ZisK binary from a WASM module and ZisK bytecode
>
> These components are very unlikely to make it back upstream due to their specialized nature, so we will need to maintain a `wasmtime` fork and periodically sync it with upstream.
- Started to go through tutorials [1, 2] for creating a new LLVM backend
- Quite big, but comparable with Cranelift: 45k C++, 30k TableGen, total 80k
- To use it with Rust we will need to compile a new `rustc` version with the new LLVM linked in
- We will also need to compile the Rust standard library for the new target
**Next steps:**
- Fork LLVM RISC-V backend and build a binary end-to-end from Rust to ZisK using it
[1] https://sourcecodeartisan.com/2020/09/13/llvm-backend-0.html
[2] https://brson.github.io/2023/03/12/move-on-llvm#writing-an-llvm-backend-with-rust
## 29 November 2024
**Completed:**
- Converted and launched Cranelift ELF files for SHA and Fibonacci on the ZisK emulator
- All sections can be parsed and opcodes can be converted
- Actual execution crashes due to the lack of WASM VM context support, which is used for memory accesses
- Identified [necessary changes](https://hackmd.io/XQIIUqjRRM-FrkrQsNAQrg?view#Code-Changes) to WASM -> CLIF codegen
- We need to rewrite the conversion to account for the fact that the WASM code will run on a bare-metal-like ZisK machine and not in the WASM VM
- These changes are necessary regardless of the backend that we use (direct ZisK or through RISC-V), so start with this
**Next steps:**
- [Confirming](https://bytecodealliance.zulipchat.com/#narrow/channel/217117-cranelift/topic/Running.20.60wasmtime.20compile.60-ed.20WASM.20in.20the.20alternative.20VM/near/484976140) with Wasmtime maintainers whether this is the right path
## 22 November 2024
**Completed:**
- Studied [Cranelift ELF file generation](https://github.com/bytecodealliance/wasmtime/blob/5af89308dc0229ca404cd7000eec694201022e2d/crates/wasmtime/src/compile.rs#L63)
- Goes from WASM to an ELF file with bytecode for a given backend (e.g. RISC-V or ZisK)
- A very complex piece of machinery, but it looks like we will be able to reuse most of it
- The format is a bit different from normal ELFs, see [a discussion](https://bytecodealliance.zulipchat.com/#narrow/channel/217117-cranelift/topic/.60wasmtime.20compile.60.20for.20.60riscv64-ima.60.20target/near/483642712)
- Got a lot more experience with debugging Cranelift codegen while investigating generated ELF files
- It's possible to get detailed information about compiler decisions for individual functions
- Can also get CLIF and bytecode for each function
- Submitted [code refactoring to ZisK](https://github.com/0xPolygonHermez/zisk/commit/726c98eb3936b11af17d4c59e1ead50efcdfa777)
## 15 November 2024
It's been a bit of a hectic week and I needed to take a few sick days to look after my son (who is feeling much better now).
**Completed:**
- Finished the document with a plan for the work ahead on Cranelift ZisK backend: https://hackmd.io/XQIIUqjRRM-FrkrQsNAQrg
- Started work on stage 1 and will aim to share some results next week
## 7 November 2024
**Completed:**
- [ISA benchmarks](https://github.com/aborg-dev/isa_benchmarks/tree/main) are now complete: you can see performance across all relevant targets on the eth_block benchmark (and a few others). I also produced detailed instruction usage reports for RISC-V and WASM that show which instructions are the most prevalent in each of the benchmarks.
- [Published](/jAXva4muSgKyJV17QZ9zfw) and discussed with @Jordi the next possible projects to improve Rust -> ZisK compilation. We agreed to proceed with option 3: a WASM -> ZisK compiler using Cranelift. Based on the initial measurements, we expect a 20% - 50% improvement in instruction count over the existing RISC-V-based backend.
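
For readers unfamiliar with instruction usage reports, the core of one is a frequency count over executed or emitted instructions. The sketch below is a generic illustration and not the method used in the isa_benchmarks repository. It assumes a plain-text listing with one instruction per line and the mnemonic as the first whitespace-separated token; `instruction_histogram` is a hypothetical name. The result is sorted by descending count, with ties broken alphabetically.

```rust
use std::collections::HashMap;

/// Counts mnemonics in a listing with one instruction per line, where the
/// mnemonic is the first whitespace-separated token. Blank lines are
/// skipped. Returns (mnemonic, count) pairs, most frequent first.
pub fn instruction_histogram(listing: &str) -> Vec<(String, usize)> {
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for line in listing.lines() {
        if let Some(mnemonic) = line.split_whitespace().next() {
            *counts.entry(mnemonic).or_insert(0) += 1;
        }
    }
    let mut sorted: Vec<(String, usize)> = counts
        .into_iter()
        .map(|(name, count)| (name.to_string(), count))
        .collect();
    // Descending by count, then ascending by name for a stable order.
    sorted.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    sorted
}
```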
**Next up:**
- I'm writing a design document for a project to outline:
- What exactly needs to be built
- How the new backend will interact with ZisK and ZiskOS
- Time estimates for different stages
- I expect to finish and share this next week