# Lab 5
Name : P Akilesh
Roll Number : CS22B040
---
## Question 1
**Given C program**
```c=
int s = 3;
//initialize the 12 elements of array to some random values
for(int i = 11; i >= 0; i = i - 1){
x[i] = x[i] + s;
}
```
**RISC-V Assembly Code of the above C program**
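1. The code below unrolls the loop by a factor of 4. It walks the array in ascending order rather than descending as in the C program; the result is the same because the iterations are independent of each other.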
```assembly=
.data
arr: .word 1,2,3,4,5,6,7,8,9,10,11,12
size:.word 12
s: .word 3
.text
#storing address in x4
la x4,arr
#size in x5
lw x5,size
#loading int s in x2
lw x2,s
#Variable i in x1
li x1,0
loop:
beq x1,x5,exit
lw x7,0(x4)
add x7,x7,x2
sw x7,0(x4)
addi x4,x4,4
addi x1,x1,1 #factor 1
lw x7,0(x4)
add x7,x7,x2
sw x7,0(x4)
addi x4,x4,4
addi x1,x1,1 #factor 2
lw x7,0(x4)
add x7,x7,x2
sw x7,0(x4)
addi x4,x4,4
addi x1,x1,1 #factor 3
lw x7,0(x4)
add x7,x7,x2
sw x7,0(x4)
addi x4,x4,4
addi x1,x1,1 #factor 4
j loop
exit:
li a7,10
ecall
```
### Observations :
1. Without loop unrolling, the IPC = 0.681
2. With loop unrolling by a factor of 4, the IPC = 0.745
**Further optimization: grouping the loads, adds and stores so that no `add` immediately follows the `lw` it depends on, which removes stalls**
```assembly=
.data
arr: .word 1,2,3,4,5,6,7,8,9,10,11,12
size:.word 12
s: .word 3
.text
#storing address in x4
la x4,arr
#size in x5
lw x5,size
#loading int s in x2
lw x2,s
#Variable i in x1
li x1,0
loop:
beq x1,x5,exit
lw x7,0(x4)
lw x8,4(x4)
lw x9,8(x4)
lw x10,12(x4)
add x7,x7,x2
add x8,x8,x2
add x9,x9,x2
add x10,x10,x2
sw x7,0(x4)
sw x8,4(x4)
sw x9,8(x4)
sw x10,12(x4)
addi x4,x4,16
addi x1,x1,4
j loop
exit:
li a7,10
ecall
```
### Observations :
1. Now, with loop unrolling by a factor of 4, the IPC = 0.773
2. With loop unrolling by a factor of 6, the IPC = 0.794
3. With loop unrolling by a factor of 12, the IPC = 0.863
* The IPC increases as the unrolling factor increases. However, we cannot keep unrolling indefinitely, because every extra copy of the loop body enlarges the code and so costs memory space. Balancing IPC against space, unrolling by a factor of 4 is the preferred choice here.
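
For reference, the reordered factor-4 loop corresponds roughly to the C sketch below. The function name `add_scalar_unrolled4` and the locals `a`, `b`, `c`, `d` are hypothetical (they stand in for `x7` to `x10`), and the sketch assumes the array length is a multiple of 4, as it is for the 12-element array used here.

```c
#include <assert.h>

/* Adds s to every element of x, handling four elements per iteration.
   Assumption: n is a multiple of 4 (true for the 12-element lab array). */
static void add_scalar_unrolled4(int *x, int n, int s) {
    assert(n % 4 == 0);
    for (int i = 0; i < n; i += 4) {
        /* all four loads first ... */
        int a = x[i], b = x[i + 1], c = x[i + 2], d = x[i + 3];
        /* ... then the four adds ... */
        a += s; b += s; c += s; d += s;
        /* ... and the four stores last */
        x[i] = a; x[i + 1] = b; x[i + 2] = c; x[i + 3] = d;
    }
}
```

Calling `add_scalar_unrolled4(x, 12, 3)` on the array `1..12` should leave it holding `4..15`, the same result as the original loop. A compiler is free to reorder these statements, so the grouping here only illustrates the instruction order written by hand in the assembly.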
---
## Question 2
A linked list in RISC-V is built here from nodes made of two consecutive 32-bit words: the first holds the value and the second holds a pointer, i.e. the address of the next node. If an address doesn't fit in 32 bits, the pointer can be stored as a ***dword*** (64 bits) instead. The nodes are not allocated at random memory locations; they are placed one after another in the `.data` section.
* A linked list with 10 nodes is initialized
* The reverse function is implemented using the three-pointer technique; when the loop ends, `x2` holds the head of the reversed list
```assembly=
.data
node1: .word 0 #value
.word 0 #pointer to next node
node2: .word 0
.word 0
node3: .word 0
.word 0
node4: .word 0
.word 0
node5: .word 0
.word 0
node6: .word 0
.word 0
node7: .word 0
.word 0
node8: .word 0
.word 0
node9: .word 0
.word 0
node10:.word 0
.word 0
.text
init: #initialize
la x1,node1
la x2,node2
sw x2,4(x1)
la x1,node3
sw x1,4(x2)
la x2,node4
sw x2,4(x1)
la x1,node5
sw x1,4(x2)
la x2,node6
sw x2,4(x1)
la x1,node7
sw x1,4(x2)
la x2,node8
sw x2,4(x1)
la x1,node9
sw x1,4(x2)
la x2,node10
sw x2,4(x1)
#Storing head node address in x1
la x1,node1
reverse:
#maintaining three pointers: x2 = previous, x3 = current, x4 = next
li x2,0
mv x3,x1
mv x4,x1
loop:
beq x3,x0,exit
lw x4,4(x3)
sw x2,4(x3)
mv x2,x3
mv x3,x4
j loop
exit:
li a7,10
ecall
```
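
The same three-pointer reversal can be sketched in C. The names `struct node`, `reverse`, `prev`, `curr` and `next` are hypothetical; they mirror the two-word node layout and the roles of `x2`, `x3` and `x4` in the assembly above.

```c
#include <stddef.h>

/* One node: a value followed by the address of the next node,
   mirroring the two consecutive words used in the assembly. */
struct node {
    int value;
    struct node *next;
};

/* Reverses the list in place and returns the new head. */
static struct node *reverse(struct node *head) {
    struct node *prev = NULL;   /* plays the role of x2 */
    struct node *curr = head;   /* plays the role of x3 */
    while (curr != NULL) {
        struct node *next = curr->next;  /* x4: save the successor */
        curr->next = prev;               /* point this node backwards */
        prev = curr;                     /* advance prev */
        curr = next;                     /* advance curr */
    }
    return prev;  /* the old tail is the new head */
}
```

Each node is visited once, so the reversal takes time linear in the list length and needs no extra memory beyond the three pointers.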
---
## Question 3
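The `allocate_node` function, written in assembly, allocates a new node in the linked list: it walks to the last node, places the new node in the 8 bytes immediately after it, stores the value 1000 in the new node and links it to the end of the list.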
```assembly=
.data
node1: .word 1 # value
.word 0 # pointer to next node
node2: .word 2
.word 0
node3: .word 3
.word 0
node4: .word 4
.word 0
node5: .word 5
.word 0
node6: .word 6
.word 0
node7: .word 7
.word 0
node8: .word 8
.word 0
node9: .word 9
.word 0
node10:.word 10
.word 0
.text
init: #initialize
la x1,node1
la x2,node2
sw x2,4(x1)
la x1,node3
sw x1,4(x2)
la x2,node4
sw x2,4(x1)
la x1,node5
sw x1,4(x2)
la x2,node6
sw x2,4(x1)
la x1,node7
sw x1,4(x2)
la x2,node8
sw x2,4(x1)
la x1,node9
sw x1,4(x2)
la x2,node10
sw x2,4(x1)
allocate_node:
#storing initial in x6
la x6,node1
li x7,0
to_end:
#walk one node at a time; x7 remembers the last non-null node
beq x6,x0,allocate
mv x7,x6
lw x6,4(x6)
j to_end
allocate:
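#Now x7 is pointing to the last node
#the new node occupies the 8 bytes right after the last node
mv x6,x7
addi x6,x6,8
sw x6,4(x7) #last node's pointer = address of the new node
li x6,1000
sw x6,8(x7) #new node's value = 1000
sw x0,12(x7) #new node's pointer = 0 (end of list)
#The new node is added to the list
li a7,10
ecall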
```
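
The tail insertion can also be sketched in C. This is a minimal sketch, not a translation of the assembly: the name `append_node` is hypothetical, and the caller is assumed to supply the storage for the new node (the assembly instead uses the 8 bytes right after the last node).

```c
#include <stddef.h>

/* Same two-word layout as the nodes in the .data section. */
struct node {
    int value;
    struct node *next;
};

/* Fills in `fresh`, links it after the last node of the list and
   returns the head. Assumption: the caller provides the storage for
   `fresh`. An empty list (head == NULL) is handled as well. */
static struct node *append_node(struct node *head, struct node *fresh, int value) {
    fresh->value = value;
    fresh->next = NULL;          /* the new node becomes the tail */
    if (head == NULL)
        return fresh;
    struct node *tail = head;
    while (tail->next != NULL)   /* walk one node at a time to the tail */
        tail = tail->next;
    tail->next = fresh;          /* old tail now points to the new node */
    return head;
}
```

Advancing a single pointer one node per step keeps the walk correct for any list length, odd or even.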