# RV32 support for Tina coroutine
> 黃守維

[GitHub](https://github.com/28530367/Tina/tree/feature/rv32i-support)

## Task Description
My task is to add **RV32I support** to the [Tina](https://github.com/slembcke/Tina) project using **inline assembly** for extensions.

Additionally, I need to:
- **Consider preserving RV32F/RV32D (floating-point) registers** during execution.
- Validate the implementation on **rv32emu** or the **Spike emulator**.
- Develop test programs (self-written) to ensure the proper functionality of coroutines.
- Finally, contribute the code back to the **Tina** project.

## Tina
### Symmetric Coroutines:
- The primary mechanism for symmetric coroutines is `tina_swap()`.
- It switches between any two fibers, including the main program.

### Asymmetric Coroutines:
- Asymmetric coroutines use:
  - `tina_resume()` to enter a sub-coroutine.
  - `tina_yield()` to return control to the caller.

### Assembly Architecture in Tina
In **Tina**, the `tina.h` file uses multiple `#ifdef` directives to distinguish between different platforms, such as:
- **x86**
- **ARM**
- **RISC-V (RV64)**

Context switching is implemented using:
- **Inline assembly**
- **Dedicated assembly blocks** for platform-specific functionality.

## Analysis of RISC-V RV32I + F/D Registers
### General Registers (Integer Registers)
- RISC-V RV32I includes **32 general-purpose registers**: `x0` to `x31`, each 32 bits wide.
- Which registers a coroutine switch has to handle depends on the calling convention. Under the standard RISC-V ABI, the switch looks like an ordinary function call to the compiler, so the **temporary registers** (`t0` to `t6`, `a0` to `a7`) are already treated as clobbered.
- To ensure the coroutine's **CPU state is fully restored** after a switch, the switch routine must save and restore `sp`, the return address `ra`, and the **saved registers** `s0` to `s11`.

### Floating-Point Registers (FPU Registers)
- If **RV32F** or **RV32D** is supported, there are an additional **32 floating-point registers**: `f0` to `f31`.
- Each register is **32 bits** wide (RV32F) or **64 bits** wide (RV32D).
### Considerations for Coroutine Switching:
- To prevent disrupting floating-point operations between coroutine switches:
  - The callee-saved floating-point registers (`fs0` to `fs11`) must be **saved** as part of the coroutine context.
  - These registers must be **restored** when switching back to the coroutine.

## Referencing the Tina RV64 and Minicoro RV32I Approaches
### Referencing the Tina RV64 Approach
The original RV64 approach uses assembly blocks to:
1. **Initialize the stack** when starting a new coroutine.
2. **Save and restore** both general-purpose and floating-point registers.

Below is an example of how the RV64 code handles this process:
```c
#define TINA_ABI_aarch32 (__ARM_EABI__ && __GNUC__)
#define TINA_ABI_aarch64 (__aarch64__ && __GNUC__)
#define TINA_ABI_i386 ((__i386__ && __GNUC__) || (_M_IX86 && _MSC_VER))
#define TINA_ABI_SysV_AMD64 (__amd64__ && __GNUC__ && (__unix__ || __APPLE__ || __HAIKU__))
#define TINA_ABI_WIN64 ((__WIN64__ && __GNUC__) || (_M_AMD64 && _MSC_VER))
#define TINA_ABI_riscv64gc (__riscv && __riscv_xlen == 64 && __riscv_flen == 64)

// ...
// 64bit riscv w/ 64 bit floats
// push s0-s11, fs0-fs11
asm("_tina_init_stack:");
asm(" addi sp, sp, -0xD0");
asm(" sd sp, (a1)");
asm(" sd ra, 0xC8(sp)");
asm(" sd s0, 0xC0(sp)");
asm(" sd s1, 0xB8(sp)");
asm(" sd s2, 0xB0(sp)");
asm(" sd s3, 0xA8(sp)");
asm(" sd s4, 0xA0(sp)");
asm(" sd s5, 0x98(sp)");
asm(" sd s6, 0x90(sp)");
asm(" sd s7, 0x88(sp)");
asm(" sd s8, 0x80(sp)");
asm(" sd s9, 0x78(sp)");
asm(" sd s10, 0x70(sp)");
asm(" sd s11, 0x68(sp)");
asm(" fsd fs0, 0x60(sp)");
asm(" fsd fs1, 0x58(sp)");
asm(" fsd fs2, 0x50(sp)");
asm(" fsd fs3, 0x48(sp)");
asm(" fsd fs4, 0x40(sp)");
asm(" fsd fs5, 0x38(sp)");
asm(" fsd fs6, 0x30(sp)");
asm(" fsd fs7, 0x28(sp)");
asm(" fsd fs8, 0x20(sp)");
asm(" fsd fs9, 0x18(sp)");
asm(" fsd fs10, 0x10(sp)");
asm(" fsd fs11, 0x08(sp)");
asm(" andi a2, a2, ~0xF");
asm(" mv sp, a2");
asm(" mv ra, x0");
asm(" tail _tina_start");

asm("_tina_swap:");
asm(" addi sp, sp, -0xD0");
asm(" sd sp, (a0)");
asm(" sd ra, 0xC8(sp)");
asm(" sd s0, 0xC0(sp)");
asm(" sd s1, 0xB8(sp)");
asm(" sd s2, 0xB0(sp)");
asm(" sd s3, 0xA8(sp)");
asm(" sd s4, 0xA0(sp)");
asm(" sd s5, 0x98(sp)");
asm(" sd s6, 0x90(sp)");
asm(" sd s7, 0x88(sp)");
asm(" sd s8, 0x80(sp)");
asm(" sd s9, 0x78(sp)");
asm(" sd s10, 0x70(sp)");
asm(" sd s11, 0x68(sp)");
asm(" fsd fs0, 0x60(sp)");
asm(" fsd fs1, 0x58(sp)");
asm(" fsd fs2, 0x50(sp)");
asm(" fsd fs3, 0x48(sp)");
asm(" fsd fs4, 0x40(sp)");
asm(" fsd fs5, 0x38(sp)");
asm(" fsd fs6, 0x30(sp)");
asm(" fsd fs7, 0x28(sp)");
asm(" fsd fs8, 0x20(sp)");
asm(" fsd fs9, 0x18(sp)");
asm(" fsd fs10, 0x10(sp)");
asm(" fsd fs11, 0x08(sp)");
asm(" ld sp, (a1)");
asm(" ld ra, 0xC8(sp)");
asm(" ld s0, 0xC0(sp)");
asm(" ld s1, 0xB8(sp)");
asm(" ld s2, 0xB0(sp)");
asm(" ld s3, 0xA8(sp)");
asm(" ld s4, 0xA0(sp)");
asm(" ld s5, 0x98(sp)");
asm(" ld s6, 0x90(sp)");
asm(" ld s7, 0x88(sp)");
asm(" ld s8, 0x80(sp)");
asm(" ld s9, 0x78(sp)");
asm(" ld s10, 0x70(sp)");
asm(" ld s11, 0x68(sp)");
asm(" fld fs0, 0x60(sp)");
asm(" fld fs1, 0x58(sp)");
asm(" fld fs2, 0x50(sp)");
asm(" fld fs3, 0x48(sp)");
asm(" fld fs4, 0x40(sp)");
asm(" fld fs5, 0x38(sp)");
asm(" fld fs6, 0x30(sp)");
asm(" fld fs7, 0x28(sp)");
asm(" fld fs8, 0x20(sp)");
asm(" fld fs9, 0x18(sp)");
asm(" fld fs10, 0x10(sp)");
asm(" fld fs11, 0x08(sp)");
asm(" addi sp, sp, 0xD0");
asm(" mv a0, a2");
asm(" ret");
```

### Referencing the Minicoro RV32I Approach
Minicoro provides a lightweight implementation of coroutine context switching that includes support for the 32-bit **RISC-V** architecture. It uses inline assembly to save and restore the callee-saved general-purpose and floating-point registers.

#### Key Features of Minicoro's Approach:
1. **General-Purpose Register Management**:
   - Saves and restores `s0` through `s11`, `ra`, and `sp` to maintain the coroutine's execution context.
2. **Floating-Point Register Support**:
   - Handles the `RV32F` and `RV32D` extensions conditionally, preserving `fs0` through `fs11` across coroutine switches.
3. **Scalable Design**:
   - Tests the predefined macros `__riscv_xlen` and `__riscv_flen` to adapt the implementation to different configurations (e.g., RV32I, RV32F, RV32D).
Below is an excerpt showing how Minicoro's RV32 code handles this process:
```c
#elif __riscv_xlen == 32
  " sw s0, 0x00(a0)\n"
  " sw s1, 0x04(a0)\n"
  " sw s2, 0x08(a0)\n"
  " sw s3, 0x0c(a0)\n"
  " sw s4, 0x10(a0)\n"
  " sw s5, 0x14(a0)\n"
  " sw s6, 0x18(a0)\n"
  " sw s7, 0x1c(a0)\n"
  " sw s8, 0x20(a0)\n"
  " sw s9, 0x24(a0)\n"
  " sw s10, 0x28(a0)\n"
  " sw s11, 0x2c(a0)\n"
  " sw ra, 0x30(a0)\n"
  " sw ra, 0x34(a0)\n" /* pc */
  " sw sp, 0x38(a0)\n"
#ifdef __riscv_flen
#if __riscv_flen == 64
  " fsd fs0, 0x3c(a0)\n"
  " fsd fs1, 0x44(a0)\n"
  " fsd fs2, 0x4c(a0)\n"
  " fsd fs3, 0x54(a0)\n"
  " fsd fs4, 0x5c(a0)\n"
  " fsd fs5, 0x64(a0)\n"
  " fsd fs6, 0x6c(a0)\n"
  " fsd fs7, 0x74(a0)\n"
  " fsd fs8, 0x7c(a0)\n"
  " fsd fs9, 0x84(a0)\n"
  " fsd fs10, 0x8c(a0)\n"
  " fsd fs11, 0x94(a0)\n"
  " fld fs0, 0x3c(a1)\n"
  " fld fs1, 0x44(a1)\n"
  " fld fs2, 0x4c(a1)\n"
  " fld fs3, 0x54(a1)\n"
  " fld fs4, 0x5c(a1)\n"
  " fld fs5, 0x64(a1)\n"
  " fld fs6, 0x6c(a1)\n"
  " fld fs7, 0x74(a1)\n"
  " fld fs8, 0x7c(a1)\n"
  " fld fs9, 0x84(a1)\n"
  " fld fs10, 0x8c(a1)\n"
  " fld fs11, 0x94(a1)\n"
#elif __riscv_flen == 32
  " fsw fs0, 0x3c(a0)\n"
  " fsw fs1, 0x40(a0)\n"
  " fsw fs2, 0x44(a0)\n"
  " fsw fs3, 0x48(a0)\n"
  " fsw fs4, 0x4c(a0)\n"
  " fsw fs5, 0x50(a0)\n"
  " fsw fs6, 0x54(a0)\n"
  " fsw fs7, 0x58(a0)\n"
  " fsw fs8, 0x5c(a0)\n"
  " fsw fs9, 0x60(a0)\n"
  " fsw fs10, 0x64(a0)\n"
  " fsw fs11, 0x68(a0)\n"
  " flw fs0, 0x3c(a1)\n"
  " flw fs1, 0x40(a1)\n"
  " flw fs2, 0x44(a1)\n"
  " flw fs3, 0x48(a1)\n"
  " flw fs4, 0x4c(a1)\n"
  " flw fs5, 0x50(a1)\n"
  " flw fs6, 0x54(a1)\n"
  " flw fs7, 0x58(a1)\n"
  " flw fs8, 0x5c(a1)\n"
  " flw fs9, 0x60(a1)\n"
  " flw fs10, 0x64(a1)\n"
  " flw fs11, 0x68(a1)\n"
```

## Implementation of RV32I, RV32F, and RV32D Support for Tina Coroutine
This section describes the implementation of coroutine context switching for the **RV32I**, **RV32F**, and **RV32D** architectures in Tina. Each variant saves and restores the CPU state during coroutine switches: the general-purpose registers and, where applicable, the floating-point registers.
### RV32I (No Floating-Point Extensions)
This implementation supports the **RV32I architecture** in Tina by saving and restoring the **general-purpose registers** during coroutine context switching. It is selected only when no floating-point extension is enabled.

#### Key Details
- **Stack Space Allocation**: 56 bytes (`0x38`) are allocated: 13 registers of 4 bytes each (52 bytes), plus one unused 4-byte slot at offset `0x00`.
- **Registers Saved**:
  - General-purpose registers: `ra`, `s0` to `s11`.
- **Alignment**: The new coroutine's stack pointer (`a2`) is aligned down to a 16-byte boundary using `andi`.

#### Code Implementation
```c
// RV32I without floating-point extensions
#define TINA_ABI_riscv32i (__riscv && __riscv_xlen == 32 && !__riscv_flen)

#elif TINA_ABI_riscv32i
asm("_tina_init_stack:");
asm(" addi sp, sp, -0x38"); // Allocate stack space
asm(" sw sp, (a1)"); // Save stack pointer
asm(" sw ra, 0x34(sp)"); // Save return address
asm(" sw s0, 0x30(sp)"); // Save s0
asm(" sw s1, 0x2C(sp)"); // Save s1
asm(" sw s2, 0x28(sp)"); // Save s2
asm(" sw s3, 0x24(sp)"); // Save s3
asm(" sw s4, 0x20(sp)"); // Save s4
asm(" sw s5, 0x1C(sp)"); // Save s5
asm(" sw s6, 0x18(sp)"); // Save s6
asm(" sw s7, 0x14(sp)"); // Save s7
asm(" sw s8, 0x10(sp)"); // Save s8
asm(" sw s9, 0x0C(sp)"); // Save s9
asm(" sw s10, 0x08(sp)"); // Save s10
asm(" sw s11, 0x04(sp)"); // Save s11
asm(" andi a2, a2, ~0xF"); // Align stack
asm(" mv sp, a2");
asm(" mv ra, x0");
asm(" tail _tina_start");

asm("_tina_swap:");
asm(" addi sp, sp, -0x38"); // Allocate stack space
asm(" sw sp, (a0)"); // Save stack pointer
asm(" sw ra, 0x34(sp)"); // Save return address
asm(" sw s0, 0x30(sp)"); // Save s0
asm(" sw s1, 0x2C(sp)"); // Save s1
asm(" sw s2, 0x28(sp)"); // Save s2
asm(" sw s3, 0x24(sp)"); // Save s3
asm(" sw s4, 0x20(sp)"); // Save s4
asm(" sw s5, 0x1C(sp)"); // Save s5
asm(" sw s6, 0x18(sp)"); // Save s6
asm(" sw s7, 0x14(sp)"); // Save s7
asm(" sw s8, 0x10(sp)"); // Save s8
asm(" sw s9, 0x0C(sp)"); // Save s9
asm(" sw s10, 0x08(sp)"); // Save s10
asm(" sw s11, 0x04(sp)"); // Save s11
asm(" lw sp, (a1)"); // Restore stack pointer
asm(" lw ra, 0x34(sp)"); // Restore return address
asm(" lw s0, 0x30(sp)"); // Restore s0
asm(" lw s1, 0x2C(sp)"); // Restore s1
asm(" lw s2, 0x28(sp)"); // Restore s2
asm(" lw s3, 0x24(sp)"); // Restore s3
asm(" lw s4, 0x20(sp)"); // Restore s4
asm(" lw s5, 0x1C(sp)"); // Restore s5
asm(" lw s6, 0x18(sp)"); // Restore s6
asm(" lw s7, 0x14(sp)"); // Restore s7
asm(" lw s8, 0x10(sp)"); // Restore s8
asm(" lw s9, 0x0C(sp)"); // Restore s9
asm(" lw s10, 0x08(sp)"); // Restore s10
asm(" lw s11, 0x04(sp)"); // Restore s11
asm(" addi sp, sp, 0x38"); // Deallocate stack space
asm(" mv a0, a2"); // Set return value to a2
asm(" ret"); // Return
```

### RV32F
This implementation supports the **RV32F architecture** in Tina by saving and restoring the **general-purpose registers** and the **single-precision floating-point registers** during coroutine context switching.

#### Key Details
- **Stack Space Allocation**: 104 bytes (`0x68`) are allocated: 25 registers of 4 bytes each (100 bytes), plus one unused 4-byte slot at offset `0x00`.
- **Registers Saved**:
  - General-purpose registers: `ra`, `s0` to `s11`.
  - Floating-point registers: `fs0` to `fs11`.
- **Alignment**: The new coroutine's stack pointer (`a2`) is aligned down to a 16-byte boundary using `andi`.
#### Code Implementation
```c
#define TINA_ABI_riscv32f (__riscv && __riscv_xlen == 32 && __riscv_flen == 32)

#elif TINA_ABI_riscv32f
// 32-bit CPU + Single-Precision FPU (RV32F)
asm("_tina_init_stack:");
asm(" addi sp, sp, -0x68"); // Allocate stack space
asm(" sw sp, (a1)"); // Save stack pointer
// Save general-purpose registers (ra, s0-s11)
asm(" sw ra, 0x64(sp)");
asm(" sw s0, 0x60(sp)");
asm(" sw s1, 0x5C(sp)");
asm(" sw s2, 0x58(sp)");
asm(" sw s3, 0x54(sp)");
asm(" sw s4, 0x50(sp)");
asm(" sw s5, 0x4C(sp)");
asm(" sw s6, 0x48(sp)");
asm(" sw s7, 0x44(sp)");
asm(" sw s8, 0x40(sp)");
asm(" sw s9, 0x3C(sp)");
asm(" sw s10, 0x38(sp)");
asm(" sw s11, 0x34(sp)");
// Save single-precision floating-point registers (fs0-fs11)
asm(" fsw fs0, 0x30(sp)");
asm(" fsw fs1, 0x2C(sp)");
asm(" fsw fs2, 0x28(sp)");
asm(" fsw fs3, 0x24(sp)");
asm(" fsw fs4, 0x20(sp)");
asm(" fsw fs5, 0x1C(sp)");
asm(" fsw fs6, 0x18(sp)");
asm(" fsw fs7, 0x14(sp)");
asm(" fsw fs8, 0x10(sp)");
asm(" fsw fs9, 0x0C(sp)");
asm(" fsw fs10, 0x08(sp)");
asm(" fsw fs11, 0x04(sp)");
asm(" andi a2, a2, ~0xF"); // Align stack
asm(" mv sp, a2"); // Set stack pointer
asm(" mv ra, x0"); // Clear return address
asm(" tail _tina_start"); // Jump to coroutine start

asm("_tina_swap:");
asm(" addi sp, sp, -0x68"); // Allocate stack space
asm(" sw sp, (a0)"); // Save stack pointer
// Save general-purpose registers
asm(" sw ra, 0x64(sp)");
asm(" sw s0, 0x60(sp)");
asm(" sw s1, 0x5C(sp)");
asm(" sw s2, 0x58(sp)");
asm(" sw s3, 0x54(sp)");
asm(" sw s4, 0x50(sp)");
asm(" sw s5, 0x4C(sp)");
asm(" sw s6, 0x48(sp)");
asm(" sw s7, 0x44(sp)");
asm(" sw s8, 0x40(sp)");
asm(" sw s9, 0x3C(sp)");
asm(" sw s10, 0x38(sp)");
asm(" sw s11, 0x34(sp)");
// Save single-precision floating-point registers
asm(" fsw fs0, 0x30(sp)");
asm(" fsw fs1, 0x2C(sp)");
asm(" fsw fs2, 0x28(sp)");
asm(" fsw fs3, 0x24(sp)");
asm(" fsw fs4, 0x20(sp)");
asm(" fsw fs5, 0x1C(sp)");
asm(" fsw fs6, 0x18(sp)");
asm(" fsw fs7, 0x14(sp)");
asm(" fsw fs8, 0x10(sp)");
asm(" fsw fs9, 0x0C(sp)");
asm(" fsw fs10, 0x08(sp)");
asm(" fsw fs11, 0x04(sp)");
asm(" lw sp, (a1)"); // Restore stack pointer
// Restore general-purpose registers
asm(" lw ra, 0x64(sp)");
asm(" lw s0, 0x60(sp)");
asm(" lw s1, 0x5C(sp)");
asm(" lw s2, 0x58(sp)");
asm(" lw s3, 0x54(sp)");
asm(" lw s4, 0x50(sp)");
asm(" lw s5, 0x4C(sp)");
asm(" lw s6, 0x48(sp)");
asm(" lw s7, 0x44(sp)");
asm(" lw s8, 0x40(sp)");
asm(" lw s9, 0x3C(sp)");
asm(" lw s10, 0x38(sp)");
asm(" lw s11, 0x34(sp)");
// Restore single-precision floating-point registers
asm(" flw fs0, 0x30(sp)");
asm(" flw fs1, 0x2C(sp)");
asm(" flw fs2, 0x28(sp)");
asm(" flw fs3, 0x24(sp)");
asm(" flw fs4, 0x20(sp)");
asm(" flw fs5, 0x1C(sp)");
asm(" flw fs6, 0x18(sp)");
asm(" flw fs7, 0x14(sp)");
asm(" flw fs8, 0x10(sp)");
asm(" flw fs9, 0x0C(sp)");
asm(" flw fs10, 0x08(sp)");
asm(" flw fs11, 0x04(sp)");
asm(" addi sp, sp, 0x68"); // Deallocate stack space
asm(" mv a0, a2"); // Set return value
asm(" ret"); // Return to caller
```

### RV32D
This implementation supports the **RV32D architecture** in Tina by saving and restoring the **general-purpose registers** and the **double-precision floating-point registers** during coroutine context switching.

#### Key Details
- **Stack Space Allocation**: 156 bytes (`0x9C`) are allocated: 52 bytes for the general-purpose registers, 96 bytes for the floating-point registers, plus 8 unused bytes at offsets `0x00` to `0x07`.
- **Registers Saved**:
  - General-purpose registers: `ra`, `s0` to `s11` (4 bytes each).
  - Double-precision floating-point registers: `fs0` to `fs11` (8 bytes each).
- **Alignment**: The new coroutine's stack pointer (`a2`) is aligned down to a 16-byte boundary using `andi`.
#### Code Implementation
```c
#define TINA_ABI_riscv32d (__riscv && __riscv_xlen == 32 && __riscv_flen == 64)

#elif TINA_ABI_riscv32d
// 32-bit CPU + Double-Precision FPU (RV32D)
asm("_tina_init_stack:");
asm(" addi sp, sp, -0x9C"); // Allocate stack space
asm(" sw sp, (a1)"); // Save stack pointer
// Save general-purpose registers (ra, s0-s11)
asm(" sw ra, 0x98(sp)");
asm(" sw s0, 0x94(sp)");
asm(" sw s1, 0x90(sp)");
asm(" sw s2, 0x8C(sp)");
asm(" sw s3, 0x88(sp)");
asm(" sw s4, 0x84(sp)");
asm(" sw s5, 0x80(sp)");
asm(" sw s6, 0x7C(sp)");
asm(" sw s7, 0x78(sp)");
asm(" sw s8, 0x74(sp)");
asm(" sw s9, 0x70(sp)");
asm(" sw s10, 0x6C(sp)");
asm(" sw s11, 0x68(sp)");
// Save double-precision floating-point registers (fs0-fs11)
asm(" fsd fs0, 0x60(sp)");
asm(" fsd fs1, 0x58(sp)");
asm(" fsd fs2, 0x50(sp)");
asm(" fsd fs3, 0x48(sp)");
asm(" fsd fs4, 0x40(sp)");
asm(" fsd fs5, 0x38(sp)");
asm(" fsd fs6, 0x30(sp)");
asm(" fsd fs7, 0x28(sp)");
asm(" fsd fs8, 0x20(sp)");
asm(" fsd fs9, 0x18(sp)");
asm(" fsd fs10, 0x10(sp)");
asm(" fsd fs11, 0x08(sp)");
asm(" andi a2, a2, ~0xF"); // Align stack
asm(" mv sp, a2"); // Set stack pointer
asm(" mv ra, x0"); // Clear return address
asm(" tail _tina_start"); // Jump to coroutine start

asm("_tina_swap:");
asm(" addi sp, sp, -0x9C"); // Allocate stack space
asm(" sw sp, (a0)"); // Save stack pointer
// Save general-purpose registers
asm(" sw ra, 0x98(sp)");
asm(" sw s0, 0x94(sp)");
asm(" sw s1, 0x90(sp)");
asm(" sw s2, 0x8C(sp)");
asm(" sw s3, 0x88(sp)");
asm(" sw s4, 0x84(sp)");
asm(" sw s5, 0x80(sp)");
asm(" sw s6, 0x7C(sp)");
asm(" sw s7, 0x78(sp)");
asm(" sw s8, 0x74(sp)");
asm(" sw s9, 0x70(sp)");
asm(" sw s10, 0x6C(sp)");
asm(" sw s11, 0x68(sp)");
// Save double-precision floating-point registers
asm(" fsd fs0, 0x60(sp)");
asm(" fsd fs1, 0x58(sp)");
asm(" fsd fs2, 0x50(sp)");
asm(" fsd fs3, 0x48(sp)");
asm(" fsd fs4, 0x40(sp)");
asm(" fsd fs5, 0x38(sp)");
asm(" fsd fs6, 0x30(sp)");
asm(" fsd fs7, 0x28(sp)");
asm(" fsd fs8, 0x20(sp)");
asm(" fsd fs9, 0x18(sp)");
asm(" fsd fs10, 0x10(sp)");
asm(" fsd fs11, 0x08(sp)");
asm(" lw sp, (a1)"); // Restore stack pointer
// Restore general-purpose registers
asm(" lw ra, 0x98(sp)");
asm(" lw s0, 0x94(sp)");
asm(" lw s1, 0x90(sp)");
asm(" lw s2, 0x8C(sp)");
asm(" lw s3, 0x88(sp)");
asm(" lw s4, 0x84(sp)");
asm(" lw s5, 0x80(sp)");
asm(" lw s6, 0x7C(sp)");
asm(" lw s7, 0x78(sp)");
asm(" lw s8, 0x74(sp)");
asm(" lw s9, 0x70(sp)");
asm(" lw s10, 0x6C(sp)");
asm(" lw s11, 0x68(sp)");
// Restore double-precision floating-point registers
asm(" fld fs0, 0x60(sp)");
asm(" fld fs1, 0x58(sp)");
asm(" fld fs2, 0x50(sp)");
asm(" fld fs3, 0x48(sp)");
asm(" fld fs4, 0x40(sp)");
asm(" fld fs5, 0x38(sp)");
asm(" fld fs6, 0x30(sp)");
asm(" fld fs7, 0x28(sp)");
asm(" fld fs8, 0x20(sp)");
asm(" fld fs9, 0x18(sp)");
asm(" fld fs10, 0x10(sp)");
asm(" fld fs11, 0x08(sp)");
asm(" addi sp, sp, 0x9C"); // Deallocate stack space
asm(" mv a0, a2"); // Set return value
asm(" ret"); // Return to caller
```

## Preliminary Work
### RISC-V GNU Compiler Toolchain
> [https://github.com/riscv-collab/riscv-gnu-toolchain.git](https://github.com/riscv-collab/riscv-gnu-toolchain.git)

#### Setup
```
$ git clone https://github.com/riscv/riscv-gnu-toolchain
$ sudo apt-get install autoconf automake autotools-dev curl python3 python3-pip python3-tomli libmpc-dev libmpfr-dev libgmp-dev gawk build-essential bison flex texinfo gperf libtool patchutils bc zlib1g-dev libexpat-dev ninja-build git cmake libglib2.0-dev libslirp-dev
$ cd riscv-gnu-toolchain
$ ./configure --prefix=/opt/riscv --enable-multilib --with-multilib-generator="rv32i-ilp32--"
$ sudo make -j$(nproc)
```

### rv32emu
> [https://github.com/sysprog21/rv32emu/tree/master](https://github.com/sysprog21/rv32emu/tree/master)

#### Setup
```
$ git clone https://github.com/sysprog21/rv32emu.git
$ sudo apt install libsdl2-dev libsdl2-mixer-dev
$ cd rv32emu
$ make
```

## Verification
### Verify RV32I (without floating-point extensions)
:::danger
Always write comments in English!
:::

#### rv32i.c
```c
#define TINA_IMPLEMENTATION
#include "tina.h"
#include <stdio.h>
#include <stdint.h>

// Simple coroutine function for testing context switching and integer arithmetic
static void* fiberA(tina* coro, void* val) {
    printf("Fiber A start.\n");
    int32_t x = 10;
    x += 20; // x = 30
    printf("Fiber A integer x = %d\n", x);

    // Suspend this coroutine with tina_yield()
    val = tina_yield(coro, (void*)1);

    x *= 2; // x = 60
    printf("Fiber A after yield, x = %d\n", x);
    printf("Fiber A done.\n");
    return val; // Returned to the final resume
}

int main() {
    // Allocate a stack buffer for the coroutine
    static char stackA[64 * 1024];

    // Initialize the fiber
    tina* coroA = tina_init(stackA, sizeof(stackA), fiberA, NULL);

    // Enter A for the first time
    printf("Main: resume fiber A.\n");
    void* ret = tina_resume(coroA, NULL);
    printf("Main: fiber A yield. ret = %ld\n", (long)ret);

    // Enter A again
    printf("Main: resume fiber A again.\n");
    ret = tina_resume(coroA, NULL);
    printf("Main: fiber A ended. ret = %ld\n", (long)ret);
    return 0;
}
```

#### Compilation
> Use the following commands to compile and run the program:

```bash
$ riscv64-unknown-elf-gcc -march=rv32i -mabi=ilp32 -O2 rv32i.c -o rv32i
$ rv32emu ./rv32i
```

#### Output
```
Main: resume fiber A.
Fiber A start.
Fiber A integer x = 30
Main: fiber A yield. ret = 1
Main: resume fiber A again.
Fiber A after yield, x = 60
Fiber A done.
Main: fiber A ended. ret = 0
inferior exit code 0
```

### Verify RV32F
#### rv32f.c
```c
#define TINA_IMPLEMENTATION
#include "tina.h"
#include <stdio.h>

// Simple coroutine function for testing context switching and floating-point arithmetic
static void* fiberA(tina* coro, void* val){
    printf("Fiber A start.\n");
    float x = 12.59;
    x += 2.71828; // x = 15.308280
    printf("Fiber A double x = %f\n", x);

    // Suspend this coroutine with tina_yield()
    val = tina_yield(coro, (void*)1);

    x *= 3.0; // x = 45.924839
    printf("Fiber A after yield, x = %f\n", x);
    printf("Fiber A done.\n");
    return val; // Returned to the final resume
}

int main(){
    // Allocate a stack buffer for the coroutine
    static char stackA[64 * 1024];

    // Initialize the fiber
    tina* coroA = tina_init(stackA, sizeof(stackA), fiberA, NULL);

    // Enter A for the first time
    printf("Main: resume fiber A.\n");
    void* ret = tina_resume(coroA, NULL);
    printf("Main: fiber A yield. ret = %ld\n", (long)ret);

    // Enter A again
    printf("Main: resume fiber A again.\n");
    ret = tina_resume(coroA, NULL);
    printf("Main: fiber A ended. ret = %ld\n", (long)ret);
    return 0;
}
```

#### Compilation
> Use the following commands to compile and run the program:

```bash
$ riscv64-unknown-elf-gcc -march=rv32i -mabi=ilp32 -O2 rv32f.c -o rv32f
$ rv32emu ./rv32f
```

Note that `-march=rv32i -mabi=ilp32` builds with software floating point, so `__riscv_flen` is undefined and the `TINA_ABI_riscv32i` code path is selected. Exercising the `TINA_ABI_riscv32f` path requires an F-enabled target such as `-march=rv32if -mabi=ilp32f` and a toolchain built with a matching multilib.

#### Output
```
Main: resume fiber A.
Fiber A start.
Fiber A double x = 15.308280
Main: fiber A yield. ret = 1
Main: resume fiber A again.
Fiber A after yield, x = 45.924839
Fiber A done.
Main: fiber A ended. ret = 0
inferior exit code 0
```

### Verify RV32D
#### rv32d.c
```c
#define TINA_IMPLEMENTATION
#include "tina.h"
#include <stdio.h>

// Simple coroutine function for testing context switching and floating-point arithmetic
static void* fiberA(tina* coro, void* val){
    printf("Fiber A start.\n");
    double x = 3.14;
    x += 2.71828; // x = 5.85828
    printf("Fiber A double x = %f\n", x);

    // Suspend this coroutine with tina_yield()
    val = tina_yield(coro, (void*)1);

    x *= 2.0; // x = 11.71656
    printf("Fiber A after yield, x = %f\n", x);
    printf("Fiber A done.\n");
    return val; // Returned to the final resume
}

int main(){
    // Allocate a stack buffer for the coroutine
    static char stackA[64 * 1024];

    // Initialize the fiber
    tina* coroA = tina_init(stackA, sizeof(stackA), fiberA, NULL);

    // Enter A for the first time
    printf("Main: resume fiber A.\n");
    void* ret = tina_resume(coroA, NULL);
    printf("Main: fiber A yield. ret = %ld\n", (long)ret);

    // Enter A again
    printf("Main: resume fiber A again.\n");
    ret = tina_resume(coroA, NULL);
    printf("Main: fiber A ended. ret = %ld\n", (long)ret);
    return 0;
}
```

#### Compilation
> Use the following commands to compile and run the program:

```bash
$ riscv64-unknown-elf-gcc -march=rv32i -mabi=ilp32 -O2 rv32d.c -o rv32d
$ rv32emu ./rv32d
```

As with RV32F, these flags select the soft-float `TINA_ABI_riscv32i` path. The `TINA_ABI_riscv32d` path requires a D-enabled target such as `-march=rv32ifd -mabi=ilp32d`.

#### Output
```
Main: resume fiber A.
Fiber A start.
Fiber A double x = 5.858280
Main: fiber A yield. ret = 1
Main: resume fiber A again.
Fiber A after yield, x = 11.716560
Fiber A done.
Main: fiber A ended. ret = 0
inferior exit code 0
```

## Contributing Back to Tina
![Screenshot 2025-01-14 113116](https://hackmd.io/_uploads/rJJmNu7PJe.png)
![Screenshot 2025-01-14 113109](https://hackmd.io/_uploads/Syk7VuXvJx.png)

Merged!

## Reference
- [https://github.com/edubart/minicoro](https://github.com/edubart/minicoro)
- [https://github.com/sysprog21/rv32emu/tree/master](https://github.com/sysprog21/rv32emu/tree/master)
- [https://github.com/riscv-collab/riscv-gnu-toolchain.git](https://github.com/riscv-collab/riscv-gnu-toolchain.git)
- [https://github.com/slembcke/Tina](https://github.com/slembcke/Tina)