# Assignment1: RISC-V Assembly and Instruction Pipeline
## 860. Lemonade Change
This is one of the “Easy” questions on LeetCode as requested in the assignment. The question is as follows.
> At a lemonade stand, each lemonade costs $5. Customers are standing in a queue to buy from you and order one at a time (in the order specified by bills). Each customer will only buy one lemonade and pay with either a $5, $10, or $20 bill. You must provide the correct change to each customer so that the net transaction is that the customer pays $5.
>
>**Note that you do not have any change in hand at first.**
>Given an integer array bills where bills[i] is the bill the ith customer pays, return true if you can provide every customer with the correct change, or false otherwise
---
**Example 1:**
>Input: bills = [5,5,5,10,20]
Output: true
Explanation:
From the first 3 customers, we collect three $5 bills in order.
From the fourth customer, we collect a $10 bill and give back a $5.
From the fifth customer, we give a $10 bill and a $5 bill.
Since all customers got correct change, we output true.
**Example 2:**
>Input: bills = [5,5,10,10,20]
Output: false
Explanation:
From the first two customers in order, we collect two $5 bills.
For the next two customers in order, we collect a $10 bill and give back a $5 bill.
For the last customer, we can not give the change of $15 back because we only have two $10 bills.
Since not every customer received the correct change, the answer is false.
---
## C Code
```c
#include <stdio.h>
#include <stdbool.h>
bool LemonadeChange(int* bills, int billsSize){
int bill_5 = 0;
int bill_10 = 0;
int bill_20 = 0;
for(int i = 0; i < billsSize; i++){
if(bills[i] == 5){
bill_5 ++;
}else if(bills[i] == 10){
if(bill_5<1) return false;
bill_5 --;
bill_10 ++;
}else{
if(bill_10 > 0 && bill_5 > 0){
bill_5 --;
bill_10 --;
bill_20 ++;
}else if(bill_5 > 2){
bill_5 = bill_5 - 3;
bill_20 ++;
}else{
return false;
}
}
}
return true;
}
int main(int argc, char*argv[]){
int test1[5] = {5,5,5,10,20};
int test2[5] = {5,5,10,10,20};
int test3[10] = {5,5,5,10,5,5,10,20,20,20};
int size1 = 5;
int size2 = 5;
int size3 = 10;
int j = 0;
printf("Test 1: ");
for(j = 0; j < size1; j++)
printf("%d ",test1[j]);
if(LemonadeChange(test1, size1) == true) printf(" Ans : true \n");
else printf(" Ans : false \n");
printf("Test 2: ");
for(j = 0; j < size2; j++)
printf("%d ",test2[j]);
if(LemonadeChange(test2, size2) == true) printf(" Ans : true \n");
else printf(" Ans : false \n");
printf("Test 3: ");
for(j = 0; j < size3; j++)
printf("%d ",test3[j]);
if(LemonadeChange(test3, size3) == true) printf(" Ans : true \n");
else printf(" Ans : false \n");
}
```
:::warning
:warning: The C code looks ugly. Please refine!
:notes: jserv
:::
## Assembly code
### original
```assembly=
.data
arr1: .word 5,5,5,10,20
size1: .word 5
arr2: .word 5,5,10,10,20
size2: .word 5
arr3: .word 5,5,5,10,5,5,10,20,20,20
size3: .word 10
str0: .string "Ans: false\n"
str1: .string "Ans: true\n"
space: .string " "
tes1str: .string "Test1: "
tes2str: .string "Test2: "
tes3str: .string "Test3: "
.text
main:
la s0 arr1
lw s1 size1
la a0 ,tes1str
li a7,4
ecall
mv a2 x0
jal ra Print_test
jal ra , LemonadeChange
la s0 arr2
lw s1 size2
la a0 ,tes2str
li a7,4
ecall
mv a2 x0
jal ra Print_test
jal ra , LemonadeChange
la s0 arr3
lw s1 size3
la a0 ,tes3str
li a7,4
ecall
mv a2 x0
jal ra Print_test
jal ra , LemonadeChange
li a7 10
ecall
LemonadeChange:
#s0:bills base s1:size t0:bill_5 t1:bill_10 t2:bill_20
#s2=answer i =t3
addi t0 , x0 ,0
addi t1 , x0 ,0
addi t2 , x0 ,0
mv a0 x0
addi t3 x0 0
addi t4 ra 0
jal ra, LOOP
j TRUE
LOOP:
addi t3 t3 1
beq s1 t3 EXIT
slli a0 t3 2
add a0 a0 s0
lw a0 0(a0)
addi a1 x0 5
beq a1 a0 bill5
addi a1 x0 10
beq a1 a0 bill10
addi a1 x0 20
beq a1 a0 bill20
j LOOP
bill5:
addi t0 t0 1
j LOOP
bill10:
beq t0 x0 FALSE
addi t1 t1 1
addi t0 t0 -1
j LOOP
bill20:
slti a2 t0 1
slti a3 t1 1
add a4 a2 a3
beq a4 x0 bill20if1
slti a2 t0 3
beq a2 x0 bill20if2
j FALSE
bill20if2:
addi t0 t0 -3
addi t2 t2 1
j LOOP
bill20if1:
addi t0 t0 -1
addi t1 t1 -1
addi t2 t2 1
j LOOP
FALSE:
la a0 ,str0
li a7,4
ecall
jr t4
TRUE:
la a0 ,str1
li a7,4
ecall
jr t4
Print_test:
slli a2 a2 2
add t0 s0 a2
lw a0 0(t0)
li a7,1
ecall
la a0 ,space
li a7,4
ecall
srli a2 a2 2
addi a2 a2 1
slt t1 a2 s1
bne x0 t1 Print_test
jr ra
EXIT:
jr ra
```
### new version
``` assembly
.data
arr1: .word 5,5,5,10,20
size1: .word 5
arr2: .word 5,5,10,10,20
size2: .word 5
arr3: .word 5,5,5,10,5,5,10,20,20,20
size3: .word 10
str0: .string "Ans: false\n"
str1: .string "Ans: true\n"
space: .string " "
tes1str: .string "Test1: "
tes2str: .string "Test2: "
tes3str: .string "Test3: "
.text
main:
la s0, arr1 #load test1 array base address to s0
lw s1, size1 #load test1 array size to s1
la a0, tes1str #load teststr1 and print
li a7, 4
ecall
jal ra, Print_test #print test1 array
jal ra, LemonadeChange #caculate the result
la s0 arr2 #load test2 array base address to s0
lw s1 size2 #load test2 array size to s1
la a0,tes2str #load teststr2 and print
ecall
jal ra , Print_test #print test2 array
jal ra , LemonadeChange #caculate the result
la s0 arr3 #load test3 array base address to s0
lw s1 size3 #load test3 array size to s1
la a0 ,tes3str #load teststr3 and print
ecall
jal ra Print_test #print test3 array
jal ra , LemonadeChange #caculate the result
li a7 10
ecall
LemonadeChange:
#s0:bills base s1:size t0:bill_5 t1:bill_10 t2:bill_20 t3:i t4:return address to main
#s2=answer
addi t0, x0, 0 #clear reg t0
addi t1, x0, 0 #clear reg t1
addi t2, x0, 0 #clear reg t2
mv a0, x0 #clear reg a0
addi t3, x0, 0 #clear reg t2
addi t4, ra, 0 #save return addrees to t4
addi t5, x0, 0 #clear reg t5
jal ra, LOOP #jump to LOOP and save return address in ra
j TRUE #jump to true
LOOP:
addi t3, t3, 1 #i = i + 1
beq s1, t3, EXIT #for loop end, jump to exit
slli a0, t3, 2 #a0 = i *4
add a0, a0, s0 #a0= test array base address + i*4
lw a0, 0(a0) #a0= test array[i]
addi a1, x0, 5 #set a1 reg to 5
beq a1, a0, bill5 #current bill = 5?
addi a1, x0, 10 #set a1 reg to 10
beq a1, a0, bill10 #current bill = 10?
addi a1, x0, 20 #set a1 reg to 20
beq a1, a0, bill20 #current bill = 20?
j LOOP #continue for loop
bill5:
addi t0, t0, 1 #t0 =t0 + 1
j LOOP #continue for loop
bill10:
beq t0, x0, FALSE #if don't have any bill_5 : false
addi t1, t1, 1 #else: bill_10 = bill_10 + 1
addi t0, t0, -1 #bill_5 = bill_5 -1
j LOOP #continue for loop
bill20:
slti a2, t0, 1
slti a3, t1, 1
add a4, a2, a3
beq a4, x0, bill20if1 #if have bill_5 and bill_10, jump to bill20if1
slti a2, t0, 3
beq a2, x0, bill20if2 #if have 3 or more bill_5, jump to bill20if2
j FALSE
bill20if1:
addi t0, t0, -1 #bill_5 = bill_5 - 1
addi t1, t1, -1 #bill_10 = bill_10 - 1
addi t2, t2, 1 #bill_20 = bill_20 - 1
j LOOP #continue for loop
bill20if2:
addi t0, t0, -3 #bill_5 = bill_5 - 3
addi t2, t2, 1 #bill_20 = bill_20 + 1
j LOOP #continue for loop
FALSE:
la a0, str0
li a7, 4
ecall
jr t4
TRUE:
la a0, str1
li a7,4
ecall
jr t4
Print_test:
# t5: j t1: j*4
slli t1, t5, 2 #t1 = t5 * 4
add t0, s0, t1 #t1 = t1 + s0
lw a0, 0(t0) #a0 = arr[j]
li a7, 1
ecall
la a0, space
li a7, 4
ecall
addi t5, t5, 1 #t5 = t5 +1
bne t5, s1, Print_test # if not scan over, then continue Print_test loop
jr ra
EXIT:
jr ra
```
I remove some unnecessary ```li``` in main, and using new register ```t5``` in ```Print_test``` to remove some unnecessary codes.
:::warning
:warning: Can you use fewer instructions?
:notes: jserv
:::
---
## Result
**C :**

**Assembly:**

## Pipeline stage description
We use Ripes for the simulation.It simulate with a five stage riscv processor. These stages are IF(instruction fetch), ID(instruction decode), EX(execute), MEM(memory r/w), WB(write back). Their function are descript bellow.
### IF
In this stage, CPU fetch the instruction from instruction memory, where the address of accessing is stored in the program counter(PC) register.
Let's take ```lw x10 4 x8 ```for example.
The PC is currently``` 0x00000154```. CPU compute instruction from instruction memory at ```0x00000154```. The instruction in ``` 0x00000154``` is ```0x00442503``` which is undecode presentation of ```lw x10 4 x8 ```.In the meanwhile, we using a adder to compute PC+4, and store it back to PC for computing next insturction.


But, sometimes we may not want to execute the instrucion in instruction memory at PC+4. In the image bellow, we excute a ```jal```. The mux on the left side of PC chooce the address that compute from the ALU in EX instead of PC +4.

### ID
In this stage, Decoder decode the instruction that compute in IF stage. After decoding we can know something from the resualt.
1. What type is it?
2. What register it need for read or write?
3. What immediate it contain? or it doesn't need immediate?
When we know the information above, we can load register value in this stage, and pass it with immadiate to the next stage for execute.
Let's take ```lw x10 4 x8 ```for example.The instuction```0x00442503``` is decoded to ```lw x10 4 x5 ```.
The correspond decode are bellow.
| offset | rs | funct3 | rt | opcode |
| ------ | --- | ------ | --- | ------ |
| 0000 0000 0100 | 0100 0 | 010 | 0101 0 | 000 0011 |
| 4 | 8 | | 10 | |
Decoder using **opcode** and **funct3** to determine the instruction is **lw**.
After decoding, we know rs is x8, then CPU load value from reg1(```x5```) which the value is ```0x10000000```. The immediate can obtaine after decoding, which is ```0x00000004```. We pass these two value to next stage for execution. Because it is lw, we finally need to store value to a register reg2(```0x0a```) , so we need to store ```0x0a``` in the pipe and pass to next stage.

### EX
In EX stage
* CPU perform action with ALU that relate to the instructuin and pass the resault to next stage.
* The mux in front of Op2 choose immediate or reg2 as operand 2, and the mux in front of Op1 choose PC or reg1 as operand 1.
* The two muxes behind the ID/EX pipe used for resolve data hazerd (ID/EX and EX/MEM).
* The Branch block will compare two regs value. If need to brach, the branch taken signal will be sent.
Take ```lw x10 4 x8 ``` for example. To caculate the address for loading data, we first choose the immediate ```0x00000004``` and value ```0x10000000``` in x8 as two operands. We add the two operands by using ALU to caculate the address ```0x10000004``` and pass it to next stage.

### MEM
In this stage, CPU will access Data Memory to r/w data.
For reading data, we need to give the target address from Addr, and the data that in the target address will be read out.
For writing data, we need to give the target address from Addr, the data that need to be stored in and signal Write en port to make data memory writable.
If current instruction doesn't need to access Memory, the data will just be passed to next stage.
Take ```lw x10 4 x8 ``` for example, we want to read data from ```x8 + 4```. In EX stage, we have caculate the address that we want to access is ```0x10000004```, than we will give address ```0x10000004``` from Addr port and the data(```0x00000005```) at that address will be read out to Read out port.


### WB
In this stage, CPU will write data back to register. The mux decide what data will be write back.

Take ```lw x10 4 x8 ``` for example, the above image is also current pipeline status. In the EX stage, we have compute the data that need to write back to register is 0x00000005. The mux will choose 0x00000005, then sending the data to Wr data port in register, the register index ``` 0x0a ``` that forward from the front stage to Wr idx port, and signal Wr En port to make register writable.


## Hazard
When a hazard occur, pipeline need to stall.
Pipeline has 3 type hazard: Structure Hazard, Data Hazard, Control Hazard.
### Structure Hazard
May occur when hardware resouce not enough. For example, if we don't separate instruction and data memory, then the IF stage and MEM stage will access Memory at same time, this may make pipeline cann't execute some instruction simultaneously.
To resolve this hazard, we can add some hardware resource.
### Data hazard
Data hazard occur when the current instruction want the result that has not been produced in the previous instruction.
Thanks to forwarding, my program don't have data hazard.
Let's take the follwing code for example.

For ```lw s1 0 a0```, we need load data in the end of MEM stage and ```add x9 x9 x9``` need the data in the begining of EX stage. In the other word, the result will be compute in the end of cycle, but the result need to be used in the start of the cycle. So we must insert one stall to make correctly data in x9 used by ```add x9 x9 x9```.


### Control hazard
When we determined brach need to been taken or not and determine jump address in EX, but the following instructions have entered in ID and IF stage. We need to flush these instruction(replace with nop).


In above section that in main function, we need to take the first ```jal``` to jump to ```Print_test```, but the following jal and ```auipc``` have been enter the pipeline.

So we set the two instruction to nop. Next cycle, the EX and ID stage will do nothing and the right instrucion ```slli x6 x30 2``` will be fetched.

## Pseudo Instruction
````
00000000 <main>:
0: 10000417 auipc x8 0x10000
4: 00040413 addi x8 x8 0
8: 10000497 auipc x9 0x10000
c: 00c4a483 lw x9 12 x9
10: 10000517 auipc x10 0x10000
14: 06950513 addi x10 x10 105
18: 00400893 addi x17 x0 4
1c: 00000073 ecall
20: 124000ef jal x1 292 <Print_test>
24: 054000ef jal x1 84 <LemonadeChange>
28: 10000417 auipc x8 0x10000
2c: ff040413 addi x8 x8 -16
30: 10000497 auipc x9 0x10000
34: ffc4a483 lw x9 -4 x9
38: 10000517 auipc x10 0x10000
3c: 04950513 addi x10 x10 73
40: 00000073 ecall
44: 100000ef jal x1 256 <Print_test>
48: 030000ef jal x1 48 <LemonadeChange>
4c: 10000417 auipc x8 0x10000
50: fe440413 addi x8 x8 -28
54: 10000497 auipc x9 0x10000
58: 0044a483 lw x9 4 x9
5c: 10000517 auipc x10 0x10000
60: 02d50513 addi x10 x10 45
64: 00000073 ecall
68: 0dc000ef jal x1 220 <Print_test>
6c: 00c000ef jal x1 12 <LemonadeChange>
70: 00a00893 addi x17 x0 10
74: 00000073 ecall
00000078 <LemonadeChange>:
78: 00000293 addi x5 x0 0
7c: 00000313 addi x6 x0 0
80: 00000393 addi x7 x0 0
84: 00000513 addi x10 x0 0
88: 00000e13 addi x28 x0 0
8c: 00008e93 addi x29 x1 0
90: 00000f13 addi x30 x0 0
94: 008000ef jal x1 8 <LOOP>
98: 0980006f jal x0 152 <TRUE>
0000009c <LOOP>:
9c: 001e0e13 addi x28 x28 1
a0: 0dc48a63 beq x9 x28 212 <EXIT>
a4: 002e1513 slli x10 x28 2
a8: 00850533 add x10 x10 x8
ac: 00052503 lw x10 0 x10
b0: 00500593 addi x11 x0 5
b4: 00a58c63 beq x11 x10 24 <bill5>
b8: 00a00593 addi x11 x0 10
bc: 00a58c63 beq x11 x10 24 <bill10>
c0: 01400593 addi x11 x0 20
c4: 02a58063 beq x11 x10 32 <bill20>
c8: fd5ff06f jal x0 -44 <LOOP>
000000cc <bill5>:
cc: 00128293 addi x5 x5 1
d0: fcdff06f jal x0 -52 <LOOP>
000000d4 <bill10>:
d4: 04028463 beq x5 x0 72 <FALSE>
d8: 00130313 addi x6 x6 1
dc: fff28293 addi x5 x5 -1
e0: fbdff06f jal x0 -68 <LOOP>
000000e4 <bill20>:
e4: 0012a613 slti x12 x5 1
e8: 00132693 slti x13 x6 1
ec: 00d60733 add x14 x12 x13
f0: 00070e63 beq x14 x0 28 <bill20if1>
f4: 0032a613 slti x12 x5 3
f8: 00060463 beq x12 x0 8 <bill20if2>
fc: 0200006f jal x0 32 <FALSE>
00000100 <bill20if2>:
100: ffd28293 addi x5 x5 -3
104: 00138393 addi x7 x7 1
108: f95ff06f jal x0 -108 <LOOP>
0000010c <bill20if1>:
10c: fff28293 addi x5 x5 -1
110: fff30313 addi x6 x6 -1
114: 00138393 addi x7 x7 1
118: f85ff06f jal x0 -124 <LOOP>
0000011c <FALSE>:
11c: 10000517 auipc x10 0x10000
120: f4050513 addi x10 x10 -192
124: 00400893 addi x17 x0 4
128: 00000073 ecall
12c: 000e8067 jalr x0 x29 0
00000130 <TRUE>:
130: 10000517 auipc x10 0x10000
134: f3a50513 addi x10 x10 -198
138: 00400893 addi x17 x0 4
13c: 00000073 ecall
140: 000e8067 jalr x0 x29 0
00000144 <Print_test>:
144: 002f1313 slli x6 x30 2
148: 006402b3 add x5 x8 x6
14c: 0002a503 lw x10 0 x5
150: 00100893 addi x17 x0 1
154: 00000073 ecall
158: 10000517 auipc x10 0x10000
15c: f1f50513 addi x10 x10 -225
160: 00400893 addi x17 x0 4
164: 00000073 ecall
168: 001f0f13 addi x30 x30 1
16c: fc9f1ce3 bne x30 x9 -40 <Print_test>
170: 00008067 jalr x0 x1 0
00000174 <EXIT>:
174: 00008067 jalr x0 x1 0
````