# Homework3: SoftCPU
## Setup
* download [riscv-none-embed-gcc](https://xpack.github.io/riscv-none-embed-gcc/)
* set path
```=
cd $HOME
wget https://github.com/xpack-dev-tools/riscv-none-embed-gcc-xpack/releases/download/v10.2.0-1.2/xpack-riscv-none-embed-gcc-10.2.0-1.2-linux-x64.tar.gz
tar zxvf xpack-riscv-none-embed-gcc-10.2.0-1.2-linux-x64.tar.gz
cd riscv-none-embed-gcc
echo "export PATH=`pwd`/bin:$PATH" > setenv
cd $HOME
source riscv-none-embed-gcc/setenv
```
* install the other things we need
```=
sudo apt-get install lcov
sudo apt-get install ccache
```
* git clone [srv32](https://github.com/sysprog21/srv32)
* build the tools and the simulation with `make`
```=
cd srv32/tools
make
cd ../sim
make
```
## Original Code
* Because the compiler did not recognize the `bool` type (in C it is only available after including `<stdbool.h>`), I changed the return type of the function from assignment 1 from `bool` to `int`.
```c=
#include <stdio.h>
int isPerfectSquare(int num){
    int k = 1;
    /* subtract the odd numbers 1, 3, 5, ... until num is no longer positive */
    while(num > 0)
    {
        num = num - k;
        k = k + 2;
    }
    /* num lands exactly on 0 only for a perfect square */
    return num == 0;
}
int main(void){
    int number1 = 14, number2 = 16;
    printf("is %d a perfect square number?%d\n", number1, isPerfectSquare(number1));
    printf("is %d a perfect square number?%d\n", number2, isPerfectSquare(number2));
    return 0;
}
```
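The loop works because the first $n$ odd numbers sum to $n^2$, so subtracting 1, 3, 5, ... from `num` reaches exactly zero only when `num` is a perfect square. A short derivation (the symbol $n$ is introduced here only for illustration):

$$
\sum_{i=1}^{n} (2i - 1) = 2 \cdot \frac{n(n+1)}{2} - n = n^2
$$

For `num = 16` the loop subtracts 1, 3, 5, 7 and stops at 0, so the function returns 1. For `num = 14` it subtracts the same four numbers and stops at -2, so the function returns 0.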
Running the original code gives the result below:

## Optimization
* Because the original code is already simple, it is hard to optimize it by changing its overall structure.
* In my first attempt, I tried a different algorithm: **binary search**
```c=
#include <stdio.h>
int isPerfectSquare(int num){
    int m;
    int l = 0;      /* lower bound of the search range */
    int r = num;    /* upper bound of the search range */
    int sq;
    while(l <= r){
        m = (l + r) / 2;
        sq = m * m; /* note: overflows int once m > 46340 */
        if(sq < num){
            l = m + 1;
        }
        else if(sq > num){
            r = m - 1;
        }
        else{
            return 1;
        }
    }
    return 0;
}
int main(void){
    int number1 = 14, number2 = 16;
    printf("is %d a perfect square number?%d\n", number1, isPerfectSquare(number1));
    printf("is %d a perfect square number?%d\n", number2, isPerfectSquare(number2));
    return 0;
}
```
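The binary search above squares `m` in an `int`, which is fine for small inputs like 14 and 16 but can overflow for large ones. The following is a minimal sketch of an overflow-safe variant, not the version measured in this report. The name `isPerfectSquareSafe` is hypothetical, and widening to `long long` is an assumption that will likely cost extra cycles on a 32-bit core.

```c
/* Hypothetical variant for illustration; not the code measured in this report. */
int isPerfectSquareSafe(int num){
    if(num < 0) return 0;              /* negative numbers are never perfect squares */
    int l = 0;
    int r = num;
    while(l <= r){
        int m = l + (r - l) / 2;       /* midpoint without computing l + r */
        long long sq = (long long)m * m; /* 64-bit product cannot overflow for int m */
        if(sq < num){
            l = m + 1;
        }
        else if(sq > num){
            r = m - 1;
        }
        else{
            return 1;
        }
    }
    return 0;
}
```

Calling `isPerfectSquareSafe(14)` and `isPerfectSquareSafe(16)` gives the same answers as the version above (0 and 1); the difference only shows up for inputs whose midpoint exceeds 46340.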

* However, the binary-search version needed more cycles than the original.
So I kept looking for another optimization and found a way to reduce the clock cycles (although only slightly): **Newton's method**
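```c=
int isPerfectSquare(int num){
    double x = 1;       /* initial guess for sqrt(num) */
    int i;
    /* three Newton iterations: x <- (x + num / x) / 2 */
    for(i = 0; i < 3; i++)
        x = 0.5 * (x + num / x);
    x = (int)x;         /* truncate to an integer candidate root */
    if((x * x) == num) return 1;
    else return 0;
}
```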

* Because both inputs in my example are small, I let the loop counter `i` run over the range [0,2] (three iterations). This gives correct results for inputs up to 36; larger perfect squares need more iterations.
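The fixed iteration count is what limits the Newton version to small inputs. As a rough sketch of one way around that, the iteration can be done in integer arithmetic and run until it stops decreasing. This is a swapped-in variant (integer Newton square root), not the code measured above, and the names `isqrt_newton` and `isPerfectSquareNewton` are hypothetical. It also avoids `double`, which may help if the target has no hardware floating-point unit (an assumption here).

```c
/* Hypothetical helper: floor(sqrt(n)) by Newton's method on integers.
 * Assumes n <= INT_MAX so that x + 1 cannot wrap around. */
static unsigned isqrt_newton(unsigned n){
    unsigned x = n;
    unsigned y = (x + 1) / 2;      /* first Newton step from the guess x = n */
    while(y < x){                  /* the sequence decreases until it reaches the root */
        x = y;
        y = (x + n / x) / 2;       /* x is nonzero whenever this line runs */
    }
    return x;
}

/* Hypothetical wrapper with the same interface as isPerfectSquare. */
int isPerfectSquareNewton(int num){
    if(num < 0) return 0;          /* negative numbers are never perfect squares */
    unsigned r = isqrt_newton((unsigned)num);
    return r * r == (unsigned)num; /* r <= 46340, so r * r fits in 32 bits */
}
```

The loop count now adapts to the input instead of being fixed at three, so 49 and larger perfect squares are detected as well. Whether this is cheaper in cycles than the floating-point version would have to be measured in the simulator.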
## Simple 3-stage pipeline RISC-V processor
* **3-stage**
* **full forward**
* **branch penalty is 2**
* **SRV32 pipeline architecture**:

## Waveform
* first, install gtkwave and open it:
```shell=
sudo apt install gtkwave
gtkwave
```
* then open the `riscv` module in the signal tree; because srv32 has only three stages, I picked just `if_pc`/`ex_pc`/`wb_pc`

* see `sim/trace.log`
* take an `lw` instruction as an example
```cpp=
00000250 00052783 read 0x000213f0 => 0x00020588, x15 (a5) <= 0x00020588
```
* pc is `00000250`, instr is `00052783`
* the data read by `lw` is `0x00020588`

* `imem_rdata` is `0x00052783`, which matches the trace

* two clock cycles later (i.e. when the `lw` instruction is in the WB stage), the `dmem_rdata` signal becomes `0x00020588`, the same as the value we want to read
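The instruction word in the trace can be cross-checked by decoding it by hand. The sketch below splits an RV32I I-type word into its fields following the standard encoding; the struct `itype_t` and the function `decode_itype` are hypothetical names introduced only for this illustration.

```c
#include <stdint.h>

/* Hypothetical container for the fields of an RV32I I-type instruction. */
typedef struct {
    uint32_t opcode;   /* bits  6:0  */
    uint32_t rd;       /* bits 11:7  */
    uint32_t funct3;   /* bits 14:12 */
    uint32_t rs1;      /* bits 19:15 */
    int32_t  imm;      /* bits 31:20, sign-extended */
} itype_t;

itype_t decode_itype(uint32_t instr){
    itype_t d;
    d.opcode = instr & 0x7f;
    d.rd     = (instr >> 7) & 0x1f;
    d.funct3 = (instr >> 12) & 0x7;
    d.rs1    = (instr >> 15) & 0x1f;
    d.imm    = (int32_t)((instr >> 20) & 0xfff);
    if(d.imm & 0x800) d.imm -= 0x1000;   /* sign-extend the 12-bit immediate */
    return d;
}
```

For `0x00052783` this gives opcode `0x03` (load), funct3 `2` (`lw`), rd `15` (x15, i.e. a5), rs1 `10` (x10, i.e. a0) and an immediate of 0. That is `lw a5, 0(a0)`, which agrees with the trace line writing `x15 (a5)` from the address `0x000213f0`.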