# Improve RV32I backend for shecc
> contributed by: < oucs638 >
# TODO
- [ ] Organize the document of [shecc](https://github.com/jserv/shecc) source code.
- [ ] Allow [riscv-codegen](https://github.com/jserv/shecc/blob/master/src/riscv-codegen.c) to generate executable multiplication and division instructions without RV32M.
- [ ] Try to rewrite part of the multiplication and division operations in [`src/cfront.c`](https://github.com/jserv/shecc/blob/master/src/cfront.c) with shifts, so that the compiler generates more efficient IR before codegen and codegen has more room for optimization.
# Progress Record
- Try to make `riscv-codegen.c` generate executable multiplication and division instructions without RV32M.
- One option is to check whether the function exists before calling the multiplication or division function.
- To simulate a target where the extension is not supported, comment out the `__mul()`, `__div()`, and `__mod()` declarations in `riscv.c`.
- Since [shecc](https://github.com/jserv/shecc) currently does not fully support the preprocessor, `#ifdef ... #else ... #endif` cannot be used.
- A runtime check such as `if (var == 1) {}` has to be used instead.
- Try to use a [weak symbol](https://en.wikipedia.org/wiki/Weak_symbol) and check whether the function exists before calling it.
- Add declarations with the weak attribute in `riscv-codegen.c`.
```c=
int __attribute__((weak)) __mul(rv_reg rd, rv_reg rs1, rv_reg rs2);
int __attribute__((weak)) __div(rv_reg rd, rv_reg rs1, rv_reg rs2);
int __attribute__((weak)) __mod(rv_reg rd, rv_reg rs1, rv_reg rs2);
void code_generate()
{
...
```
- Check if the function exists.
```c=
/* Before: */
case OP_mul:
    emit(__mul(dest_reg, dest_reg, OP_reg));
    if (dump_ir == 1)
        printf(" x%d *= x%d", dest_reg, OP_reg);
    break;

/* After: emit only if the weak symbol was resolved. */
case OP_mul:
    if (__mul) {
        emit(__mul(dest_reg, dest_reg, OP_reg));
        if (dump_ir == 1)
            printf(" x%d *= x%d", dest_reg, OP_reg);
    } else
        error("Multiplication operation is not supported");
    break;
```
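The presence check above can be tried in isolation. The following is a minimal sketch, assuming GCC or Clang on an ELF target such as Linux, where an undefined weak function resolves to a null address. The names `optional_mul` and `try_mul` are hypothetical and do not come from shecc.

```c
#include <stddef.h>

/* Hypothetical optional routine: declared weak and never defined in this
 * file, so its address is expected to be NULL after linking. */
int __attribute__((weak)) optional_mul(int a, int b);

/* Returns 1 and stores the product in *out if optional_mul was linked in.
 * Returns 0 and leaves *out untouched otherwise. */
int try_mul(int a, int b, int *out)
{
    if (optional_mul) {
        *out = optional_mul(a, b);
        return 1;
    }
    return 0;
}
```

Since nothing defines `optional_mul` here, `try_mul` is expected to take the fallback path. Note that this relies on a compiler extension, not on standard C.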
- However, running `make` fails:
```shell
> make
CC+LD out/inliner
GEN out/libc.inc
CC out/src/main.o
LD out/shecc
SHECC out/shecc-stage1.elf
Aborted (core dumped)
make: *** [Makefile:71: out/shecc-stage1.elf] Error 134
```
- `__attribute__((weak))` is a GNU extension, and it cannot be used here because shecc is a self-hosting compiler that must be able to compile its own source.
- Try to add new compiler options.
- Add two compiler options, `+m` and `-m`.
```shell
Old:
> shecc [-o output] [-no-libc] [--dump-ir] <infile.c>
New:
> shecc [-o output] [-no-libc] [--dump-ir] [+m/-m] <infile.c>
```
- Add a new global variable `riscv_m_extension`.
```c=
/* In globals.c */
...
/* If the +m option is given, the M extension is used.
 *     => riscv_m_extension = 1
 *
 * If the -m option is given, the M extension is not used.
 *     => riscv_m_extension = 0
 *
 * Note: riscv_m_extension must be initialized to 1.
 * Otherwise, the program will not be able to compile itself.
 */
int riscv_m_extension = 1;
...
```
```c=
/* In main.c */
...
else if (!strcmp(argv[i], "+m"))
riscv_m_extension = 1;
else if (!strcmp(argv[i], "-m"))
riscv_m_extension = 0;
...
```
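The option handling can be sketched as a stand-alone function for experimentation. This is only an illustration: `parse_m_option` is a hypothetical helper, not part of shecc, and it assumes that the default is 1 and that a later option overrides an earlier one.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the M extension should be used,
 * 0 otherwise. Defaults to 1; the last +m/-m on the command line wins. */
int parse_m_option(int argc, char *argv[])
{
    int use_m = 1;
    for (int i = 1; i < argc; i++) {
        if (!strcmp(argv[i], "+m"))
            use_m = 1;
        else if (!strcmp(argv[i], "-m"))
            use_m = 0;
    }
    return use_m;
}
```

Keeping the default at 1 matches the note in `globals.c` that the flag must start at 1 for the compiler to build itself.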
- Add a condition check before the code generator emits an M-extension instruction.
```c=
/* In riscv-codegen.c */
/* Before: */
case OP_mul:
    emit(__mul(dest_reg, dest_reg, OP_reg));
    if (dump_ir == 1)
        printf(" x%d *= x%d", dest_reg, OP_reg);
    break;

/* After: */
case OP_mul:
    if (riscv_m_extension == 1)
        /* M extension is supported: emit the mul instruction. */
        emit(__mul(dest_reg, dest_reg, OP_reg));
    else
        /* M extension is not supported: emit a nop instead. */
        emit(__addi(__zero, __zero, 0));
    if (dump_ir == 1)
        printf(" x%d *= x%d", dest_reg, OP_reg);
    break;
```
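For the fallback instruction emitted above, it may help to see how an `addi` is laid out in RV32I. The sketch below encodes the I-type format (imm[11:0], rs1, funct3 = 000, rd, opcode = 0010011). `encode_addi` is a hypothetical helper written for this note, not shecc's `__addi()`, and it takes plain register numbers.

```c
#include <stdint.h>

/* Encode an RV32I ADDI instruction (I-type):
 *   bits 31..20 imm[11:0], 19..15 rs1, 14..12 funct3 (000),
 *   bits 11..7  rd,        6..0   opcode (0010011 = 0x13). */
uint32_t encode_addi(uint32_t rd, uint32_t rs1, int32_t imm)
{
    return (((uint32_t) imm & 0xfff) << 20) | ((rs1 & 0x1f) << 15) |
           ((rd & 0x1f) << 7) | 0x13;
}
```

With all operands zero, the result is `0x00000013`, the encoding the RISC-V specification defines for `nop`.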
- Since register `__zero` is hard-wired to zero and writes to it are ignored, the instruction `__addi(__zero, __zero, 0)` has no effect on RISC-V. Therefore, it acts as a `nop`.
- Test after adding the new compiler options.
- Add a new file `tests/mul_test.c`.
```c=
#include <stdio.h>
int main()
{
    printf("%d * %d = %d\n", 0, 0, 0 * 0);
    printf("%d * %d = %d\n", 0, 1, 0 * 1);
    printf("%d * %d = %d\n", 1, 0, 1 * 0);
    printf("%d * %d = %d\n", 1, 1, 1 * 1);
    printf("%d * %d = %d\n", 1, 9, 1 * 9);
    printf("%d * %d = %d\n", 9, 1, 9 * 1);
    printf("%d * %d = %d\n", 2, 7, 2 * 7);
    printf("%d * %d = %d\n", 7, 2, 7 * 2);
    printf("%d * %d = %d\n", 13, 17, 13 * 17);
    printf("%d * %d = %d\n", 17, 13, 17 * 13);
    return 0;
}
```
- Then test it with both options.
```shell
> out/shecc +m -o mul_test tests/mul_test.c
> ./mul_test
0 * 0 = 0
0 * 1 = 0
1 * 0 = 0
1 * 1 = 1
1 * 9 = 9
9 * 1 = 9
2 * 7 = 14
7 * 2 = 14
13 * 17 = 221
17 * 13 = 221
> out/shecc -m -o mul_test tests/mul_test.c
> ./mul_test
0 * 0 = 0
0 * 0 = 0
1 * 256 = 65536
1 * 256 = 65536
1 * 256 = 65536
9 * 2304 = 589824
2 * 512 = 131072
7 * 1792 = 458752
13 * 3328 = 851968
17 * 4352 = 1114112
```
- Now the code generator can produce an executable that contains no M-extension instructions.
- However, because each multiplication is replaced by a `nop`, the results are wrong: after successive multiplications, the printed operands no longer match the inputs and take unexpected values.
- Try to rewrite part of the multiplication and division operations in [`src/cfront.c`](https://github.com/jserv/shecc/blob/master/src/cfront.c) with shifts, so that the compiler generates more efficient IR before codegen and codegen has more room for optimization.
- Four functions in `src/cfront.c` use multiplication.
``` c=
/* In read_numeric_constant(): */
value = value * 10 + buffer[i++] - '0';

/* In read_numeric_param(): */
value = (value * 16) + c;
...
value = (value * 10) + c;

/* In read_numeric_sconstant(): */
return (-1) * res;

/* In eval_expression_imm(): */
res = op1 * op2;
```
- One function in `src/cfront.c` uses division.
```c=
/* In eval_expression_imm(): */
res = op1 / op2;
```
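These notes only rewrite the multiplications. For the division, one possible shift-based counterpart is restoring (shift-and-subtract) division. The sketch below is an assumption about how that could look, not code from shecc: `udiv_shift` is a hypothetical name, it handles unsigned 32-bit operands only, and it assumes a non-zero divisor.

```c
#include <stddef.h>
#include <stdint.h>

/* Unsigned restoring division using only shifts, comparisons, and
 * subtraction. Assumes divisor != 0. If rem is non-NULL, the remainder
 * is stored there. */
uint32_t udiv_shift(uint32_t dividend, uint32_t divisor, uint32_t *rem)
{
    uint32_t quotient = 0;
    uint32_t r = 0; /* running remainder */

    for (int bit = 31; bit >= 0; bit--) {
        /* Bring down the next dividend bit, most significant first. */
        r = (r << 1) | ((dividend >> bit) & 1);
        if (r >= divisor) {
            r -= divisor;
            quotient |= (uint32_t) 1 << bit;
        }
    }
    if (rem != NULL)
        *rem = r;
    return quotient;
}
```

Signed division would need extra sign handling on top of this, which is left out here.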
- Multiplication can be rewritten as:
```c=
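/* Implement m * n with shifts and additions.
 * Only correct for n >= 0: for a negative n, n % 2 is never 1.
 */
int mul(int m, int n)
{
    int res = 0;
    /* cnt is the position of the bit of n currently being examined. */
    for (int cnt = 0; n; cnt++, n /= 2)
        if (n % 2 == 1)
            res += m << cnt;
    return res;
}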
```
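One caveat about the function above: `n /= 2` and `n % 2` are themselves division and remainder operations, which a compiler may not turn into shifts on a target without the M extension. A variant that uses only `>>`, `&`, `<<`, and `+` could look like the sketch below. `mul_shift_add` is a hypothetical name, and unsigned operands are an assumption made here so that wrap-around is well defined.

```c
#include <stdint.h>

/* Shift-and-add multiplication without *, /, or %.
 * Returns the low 32 bits of m * n. */
uint32_t mul_shift_add(uint32_t m, uint32_t n)
{
    uint32_t res = 0;
    while (n) {
        if (n & 1)   /* lowest bit of the multiplier is set */
            res += m;
        m <<= 1;     /* next partial product */
        n >>= 1;     /* next multiplier bit */
    }
    return res;
}
```

Because the arithmetic is modulo 2^32, two's-complement negative values passed as `uint32_t` still yield the expected low 32 bits of the product.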
- Therefore, `value * 10` and `value * 16` can be rewritten as:
```c=
value * 10 == (value << 3) + (value << 1);
value * 16 == value << 4;
```
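The first identity follows from 10 = 8 + 2, so `value * 10 = value * 8 + value * 2`, and the second from 16 = 2^4. A small brute-force check over a range of values can confirm both. The helper name `shift_identities_hold` is made up for this sketch.

```c
/* Returns 1 if both shift identities hold for every value in [0, limit),
 * and 0 as soon as one of them fails. */
int shift_identities_hold(unsigned limit)
{
    for (unsigned v = 0; v < limit; v++) {
        if (v * 10 != ((v << 3) + (v << 1)))
            return 0;
        if (v * 16 != (v << 4))
            return 0;
    }
    return 1;
}
```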
- So part of `read_numeric_constant()` and `read_numeric_param()` can be rewritten.
```c=
/* In read_numeric_constant(): */
value = (value << 3) + (value << 1) + buffer[i++] - '0';

/* In read_numeric_param(): */
value = (value << 4) + c;
...
value = (value << 3) + (value << 1) + c;
```
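To see the rewritten accumulation steps working end to end, here is a stand-alone sketch of a number parser that uses them. It is not the shecc implementation: `parse_number` is a hypothetical function, and it assumes non-negative input with an optional `0x` prefix and no overflow checking.

```c
/* Hypothetical parser for decimal and 0x-prefixed hexadecimal literals
 * that accumulates digits with shifts instead of multiplications. */
int parse_number(const char *s)
{
    int value = 0;

    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X')) {
        for (s += 2; *s; s++) {
            int c;
            if (*s >= '0' && *s <= '9')
                c = *s - '0';
            else if (*s >= 'a' && *s <= 'f')
                c = *s - 'a' + 10;
            else if (*s >= 'A' && *s <= 'F')
                c = *s - 'A' + 10;
            else
                break;
            value = (value << 4) + c; /* value * 16 + c */
        }
        return value;
    }

    for (; *s >= '0' && *s <= '9'; s++)
        value = (value << 3) + (value << 1) + (*s - '0'); /* value * 10 + digit */
    return value;
}
```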
- Test after making the change; everything passes.
```shell
> make clean
> make config ARCH=riscv
> make
> make check
...(all Passed)...
```
- The source code has some issues that should be improved.
- Without the M extension, replacing multiplication with a `nop` is only a temporary measure. The next step is actual code generation that works without the M extension.
- The global variable `riscv_m_extension` should be renamed to `use_m_ext` to reflect the use of integer multiplication and division. The name still makes sense for the Arm architecture.
- Instead of adding a new file `tests/mul_test.c`, the file `tests/driver.sh`, which performs the comprehensive tests, should be updated.
- Replace `riscv_m_extension` with `use_m_ext`.
- Update the file `tests/driver.sh`.
- Try to implement multiplication without the M extension.
- With the M extension, the `mul` instruction can be used.
```c=
emit(__mul(dest_reg, dest_reg, OP_reg));
```
- Try to implement multiplication with shift operations and test it with Ripes.
```=
.data
.text
main:
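    addi s2, zero, 3    # dest_reg (multiplicand)
    addi s3, zero, 4    # OP_reg (multiplier)
    addi t1, zero, 1    # constant shift amount for the multiplier
    addi t3, zero, 0    # accumulated result
    addi t4, zero, 32   # remaining iterations
    addi t5, zero, 0    # current bit position
    andi t6, s3, 1      # loop start: test the lowest multiplier bit
    beq t6, zero, 12    # bit clear: skip the next two instructions
    sll t6, s2, t5      # multiplicand << bit position
    add t3, t3, t6      # add the partial product
    srl s3, s3, t1      # shift the multiplier right by one
    addi t5, t5, 1      # next bit position
    addi t4, t4, -1     # one fewer iteration left
    bne t4, zero, -28   # jump back to andi while iterations remain
    add s2, zero, t3    # dest_reg = result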
print:
# print the result
mv a0, s2
li a7, 1
ecall
j exit
exit:
li a7, 10
ecall
```

- Replace `mul` with the following code.
```c=
emit(__addi(__t1, __zero, 1));
emit(__addi(__t3, __zero, 0));
emit(__addi(__t4, __zero, 32));
emit(__addi(__t5, __zero, 0));
emit(__andi(__t6, OP_reg, 1));
emit(__beq(__t6, __zero, 12));
emit(__sll(__t6, dest_reg, __t5));
emit(__add(__t3, __t3, __t6));
emit(__srl(OP_reg, OP_reg, __t1));
emit(__addi(__t5, __t5, 1));
emit(__addi(__t4, __t4, -1));
emit(__bne(__t4, __zero, -28));
emit(__add(dest_reg, __zero, __t3));
```
- However, compiling `mul_test` without the M extension and running it fails:
```shell=
> ./mul_test
[1] 86047 segmentation fault (core dumped) ./mul_test
```
:::danger
:warning: Attempting to implement multiplication without mul still fails.
:warning: So put it on hold for now and start organizing the documentation.
:::
:::info
- Start to organize document of shecc.
- Put in another hackmd: [Document of shecc](https://hackmd.io/@oucs638/document-of-shecc)
:::
:::warning
:tumbler_glass: TODO has not yet been completed and is in progress.
:tumbler_glass: Changed Version: [oucs638/shecc](https://github.com/oucs638/shecc)
:tumbler_glass: [Document of shecc](https://hackmd.io/@oucs638/document-of-shecc) (not yet complete)
:::
# Reference
- [shecc](https://github.com/jserv/shecc)
- [self-hosting compilers](https://en.wikipedia.org/wiki/Self-hosting_(compilers))