# Improve RV32I backend for shecc
> contributed by: < oucs638 >

# TODO
- [ ] Organize the documentation of the [shecc](https://github.com/jserv/shecc) source code.
- [ ] Allow [riscv-codegen](https://github.com/jserv/shecc/blob/master/src/riscv-codegen.c) to generate executable multiplication and division code without RV32M.
- [ ] Try to use shifts to rewrite part of the multiplication and division operations in [`src/cfront.c`](https://github.com/jserv/shecc/blob/master/src/cfront.c), so that the compiler can generate effective IR before codegen, and codegen can get more optimization space.

# Progress Record
- Try to make `riscv-codegen.c` generate executable multiplication and division code without RV32M.
    - One option is to check whether the function exists before calling the multiplication or division function.
        - To simulate a target that does not support the extension, comment out the `__mul()`, `__div()`, and `__mod()` declarations in `riscv.c`.
        - Since [shecc](https://github.com/jserv/shecc) does not yet fully support the preprocessor, `#ifdef ... #else ... #endif` cannot be used.
            - A runtime check such as `if (var == 1) {}` is needed instead.
        - Try to use a [weak symbol](https://en.wikipedia.org/wiki/Weak_symbol) and check whether the function exists before calling it.
            - Add declarations with the weak attribute in `riscv-codegen.c`.
              ```c=
              int __attribute__((weak)) __mul(rv_reg rd, rv_reg rs1, rv_reg rs2);
              int __attribute__((weak)) __div(rv_reg rd, rv_reg rs1, rv_reg rs2);
              int __attribute__((weak)) __mod(rv_reg rd, rv_reg rs1, rv_reg rs2);

              void code_generate()
              {
                  ...
              ```
            - Check whether the function exists.
              ```c=
              /* Change this part: */
              case OP_mul:
                  emit(__mul(dest_reg, dest_reg, OP_reg));
                  if (dump_ir == 1)
                      printf(" x%d *= x%d", dest_reg, OP_reg);
                  break;

              /* Into: */
              case OP_mul:
                  if (__mul) {
                      emit(__mul(dest_reg, dest_reg, OP_reg));
                      if (dump_ir == 1)
                          printf(" x%d *= x%d", dest_reg, OP_reg);
                  } else
                      error("Multiplication operation is not supported");
                  break;
              ```
            - However, running `make` then fails.
              ```shell
              > make
                CC+LD   out/inliner
                GEN     out/libc.inc
                CC      out/src/main.o
                LD      out/shecc
                SHECC   out/shecc-stage1.elf
              Aborted (core dumped)
              make: *** [Makefile:71: out/shecc-stage1.elf] Error 134
              ```
            - The GNU `__attribute__((weak))` extension cannot be used: "shecc" is a self-hosting compiler, so its own source must stay within the C subset that shecc itself can compile.
    - Try to add new compiler options.
        - Add two compiler options, `+m` and `-m`.
          ```shell
          Old:
          > shecc [-o output] [-no-libc] [--dump-ir] <infile.c>
          New:
          > shecc [-o output] [-no-libc] [--dump-ir] [+m/-m] <infile.c>
          ```
        - Add a new global variable `riscv_m_extension`.
          ```c=
          /* In globals.c */
          ...
          /* If the +m option is selected, the M extension is used.
           *   => riscv_m_extension = 1
           *
           * If the -m option is selected, the M extension is not used.
           *   => riscv_m_extension = 0
           *
           * Notice: riscv_m_extension must be initialized to 1.
           *         Otherwise, the program will not be able to compile itself.
           */
          int riscv_m_extension = 1;
          ...
          ```
          ```c=
          /* In main.c */
          ...
          else if (!strcmp(argv[i], "+m"))
              riscv_m_extension = 1;
          else if (!strcmp(argv[i], "-m"))
              riscv_m_extension = 0;
          ...
          ```
        - Add a condition before the program calls an M-extension function.
          ```c=
          /* In riscv-codegen.c */
          /* Change this part: */
          case OP_mul:
              emit(__mul(dest_reg, dest_reg, OP_reg));
              if (dump_ir == 1)
                  printf(" x%d *= x%d", dest_reg, OP_reg);
              break;

          /* Into: */
          case OP_mul:
              if (riscv_m_extension == 1)
                  /* M extension is supported, call the mul function. */
                  emit(__mul(dest_reg, dest_reg, OP_reg));
              else
                  /* M extension is not supported, emit a nop.
                   */
                  emit(__addi(__zero, __zero, 0));
              if (dump_ir == 1)
                  printf(" x%d *= x%d", dest_reg, OP_reg);
              break;
          ```
        - Register `__zero` is hard-wired to zero and writes to it are discarded, so `__addi(__zero, __zero, 0)` has no effect on RISC-V. It is the canonical `nop` encoding.
    - Test after adding the new compiler options.
        - Add a new file `tests/mul_test.c`.
          ```c=
          #include <stdio.h>

          int main()
          {
              printf("%d * %d = %d\n", 0, 0, 0 * 0);
              printf("%d * %d = %d\n", 0, 1, 0 * 1);
              printf("%d * %d = %d\n", 1, 0, 1 * 0);
              printf("%d * %d = %d\n", 1, 1, 1 * 1);
              printf("%d * %d = %d\n", 1, 9, 1 * 9);
              printf("%d * %d = %d\n", 9, 1, 9 * 1);
              printf("%d * %d = %d\n", 2, 7, 2 * 7);
              printf("%d * %d = %d\n", 7, 2, 7 * 2);
              printf("%d * %d = %d\n", 13, 17, 13 * 17);
              printf("%d * %d = %d\n", 17, 13, 17 * 13);
              return 0;
          }
          ```
        - And test it.
          ```shell
          > out/shecc +m -o mul_test tests/mul_test.c
          > ./mul_test
          0 * 0 = 0
          0 * 1 = 0
          1 * 0 = 0
          1 * 1 = 1
          1 * 9 = 9
          9 * 1 = 9
          2 * 7 = 14
          7 * 2 = 14
          13 * 17 = 221
          17 * 13 = 221
          > out/shecc -m -o mul_test tests/mul_test.c
          > ./mul_test
          0 * 0 = 0
          0 * 0 = 0
          1 * 256 = 65536
          1 * 256 = 65536
          1 * 256 = 65536
          9 * 2304 = 589824
          2 * 512 = 131072
          7 * 1792 = 458752
          13 * 3328 = 851968
          17 * 4352 = 1114112
          ```
        - With `-m`, the code generator now produces an executable that contains no M-extension instructions.
        - But because each multiplication is replaced by a `nop`, the output is wrong: after successive multiplications the printed operands no longer match the inputs, and the products are unexpected values.
- Try to use shifts to rewrite part of the multiplication and division operations in [`src/cfront.c`](https://github.com/jserv/shecc/blob/master/src/cfront.c), so that the compiler can generate effective IR before codegen, and codegen can get more optimization space.
    - There are four functions in `src/cfront.c` that use multiplication.
      ```c=
      /* In read_numeric_constant(): */
      value = value * 10 + buffer[i++] - '0';

      /* In read_numeric_param(): */
      value = (value * 16) + c;
      ...
      value = (value * 10) + c;

      /* In read_numeric_sconstant(): */
      return (-1) * res;

      /* In eval_expression_imm(): */
      res = op1 * op2;
      ```
    - There is one function in `src/cfront.c` that uses division.
      ```c=
      /* In eval_expression_imm(): */
      res = op1 / op2;
      ```
    - Multiplication can be rewritten as:
      ```c=
      /* Implement m * n. Assumes n >= 0: for negative n,
       * n % 2 is never 1 and the result is wrong.
       */
      int mul(int m, int n)
      {
          int res = 0;
          for (int cnt = 0; n; cnt++, n /= 2)
              if (n % 2 == 1)
                  res += m << cnt;
          return res;
      }
      ```
    - Therefore, `value * 10` and `value * 16` can be rewritten:
      ```c=
      value * 10 == (value << 3) + (value << 1);
      value * 16 == value << 4;
      ```
    - So part of `read_numeric_constant()` and `read_numeric_param()` can be rewritten.
      ```c=
      /* In read_numeric_constant(): */
      value = (value << 3) + (value << 1) + buffer[i++] - '0';
      /****************/
      /* In read_numeric_param(): */
      value = (value << 4) + c;
      ...
      value = (value << 3) + (value << 1) + c;
      ```
    - Test after making the change; every check passes.
      ```shell
      > make clean
      > make config ARCH=riscv
      > make
      > make check
      ...(all Passed)...
      ```
- The source code has some issues that should be improved.
    - Without the M extension, replacing multiplication with a "nop" is only a temporary measure. The next step is to generate real code that works without the M extension.
    - The global variable "riscv_m_extension" should be renamed to "use_m_ext" to reflect that it controls the use of integer multiplication and division. The name also makes sense for the "Arm" architecture.
    - Instead of adding a new file `tests/mul_test.c`, the file `tests/driver.sh`, which performs the comprehensive tests, should be updated.
    - Replace `riscv_m_extension` with `use_m_ext`.
    - Update the file `tests/driver.sh`.
- Try to implement multiplication without the M extension.
    - If the M extension is used, the `mul` instruction can be emitted directly.
      ```c=
      emit(__mul(dest_reg, dest_reg, OP_reg));
      ```
    - Try to implement multiplication with shift operations and test it with Ripes.
      ```=
      .data
      .text
      main:
          addi s2, zero, 3      # dest_reg (multiplicand)
          addi s3, zero, 4      # OP_reg (multiplier)
          addi t1, zero, 1      # constant 1, shift amount for srl
          addi t3, zero, 0      # accumulator
          addi t4, zero, 32     # loop counter, one pass per bit
          addi t5, zero, 0      # current bit index
          andi t6, s3, 1        # loop start: test lowest bit of multiplier
          beq  t6, zero, 12     # bit clear: skip the next two instructions
          sll  t6, s2, t5       # multiplicand << bit index
          add  t3, t3, t6       # accumulate the partial product
          srl  s3, s3, t1       # multiplier >>= 1
          addi t5, t5, 1
          addi t4, t4, -1
          bne  t4, zero, -28    # back to the andi while the counter is nonzero
          add  s2, zero, t3     # write the result to dest_reg
      print:                    # print the result
          mv   a0, s2
          li   a7, 1
          ecall
          j    exit
      exit:
          li   a7, 10
          ecall
      ```
      ![](https://i.imgur.com/InCMEqx.png)
    - Replace `mul` with the following code.
      ```c=
      emit(__addi(__t1, __zero, 1));
      emit(__addi(__t3, __zero, 0));
      emit(__addi(__t4, __zero, 32));
      emit(__addi(__t5, __zero, 0));
      emit(__andi(__t6, OP_reg, 1));
      emit(__beq(__t6, __zero, 12));
      emit(__sll(__t6, dest_reg, __t5));
      emit(__add(__t3, __t3, __t6));
      emit(__srl(OP_reg, OP_reg, __t1));
      emit(__addi(__t5, __t5, 1));
      emit(__addi(__t4, __t4, -1));
      emit(__bne(__t4, __zero, -28));
      emit(__add(dest_reg, __zero, __t3));
      ```
    - However, compiling `mul_test` without the M extension and running it crashes.
      ```shell=
      > ./mul_test
      [1] 86047 segmentation fault (core dumped) ./mul_test
      ```

:::danger
:warning: Attempting to implement multiplication without `mul` still fails.
:warning: So put it on hold for now and start organizing the documentation.
:::

:::info
- Start to organize the documentation of shecc.
    - It lives in another HackMD note: [Document of shecc](https://hackmd.io/@oucs638/document-of-shecc)
:::

:::warning
:tumbler_glass: The TODO list has not yet been completed and is in progress.
:tumbler_glass: Changed Version: [oucs638/shecc](https://github.com/oucs638/shecc)
:tumbler_glass: [Document of shecc](https://hackmd.io/@oucs638/document-of-shecc) (not yet complete)
:::

# Reference
- [shecc](https://github.com/jserv/shecc)
- [self-hosting compilers](https://en.wikipedia.org/wiki/Self-hosting_(compilers))
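
# Appendix: C model of the shift-and-add loop

The shift-and-add experiment in the Progress Record can be cross-checked on the host with a small C model. The sketch below is not part of shecc, and the names `soft_mul` and `soft_mul_signed` are hypothetical. It uses only shift, AND, and add, which RV32I provides without the M extension. Unlike the assembly in this note, it shifts the multiplicand in place and stops as soon as the multiplier has no set bits left, instead of always looping 32 times. It says nothing about the cause of the segmentation fault; it only gives expected values to compare against.

```c
#include <stdint.h>

/* Hypothetical helper, not part of shecc: multiply two 32-bit values
 * using only shift, AND, and add. Unsigned arithmetic keeps every shift
 * well defined and wraps modulo 2^32, like the RV32 `mul` instruction,
 * which returns the low 32 bits of the product.
 */
uint32_t soft_mul(uint32_t m, uint32_t n)
{
    uint32_t res = 0;   /* accumulator for the partial products */
    while (n != 0) {    /* stop early once no set bits remain */
        if (n & 1)      /* lowest bit of the multiplier set? */
            res += m;   /* add the current shifted multiplicand */
        m <<= 1;        /* the next bit weighs twice as much */
        n >>= 1;        /* logical shift, because n is unsigned */
    }
    return res;
}

/* Signed wrapper: the low 32 bits of a product are the same for signed
 * and unsigned operands. Converting an out-of-range uint32_t back to
 * int32_t is implementation-defined before C23; this sketch assumes the
 * usual two's-complement wraparound of common compilers.
 */
int32_t soft_mul_signed(int32_t m, int32_t n)
{
    return (int32_t) soft_mul((uint32_t) m, (uint32_t) n);
}
```

Calling `soft_mul_signed(13, 17)` on the host and comparing it with `13 * 17` is one way to validate a generated instruction sequence against this model before debugging it inside the code generator.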