# Lab 5
Name : P Akilesh
Roll Number : CS22B040

---

## Question 1

**Given C program**

```c=
int s = 3;
//initialize the 12 elements of array to some random values
for(int i = 11; i >= 0; i = i - 1){
    x[i] = x[i] + s;
}
```

**RISC-V Assembly Code of the above C program**

1. The code below is optimized with loop unrolling by a factor of 4

```assembly=
.data
arr: .word 1,2,3,4,5,6,7,8,9,10,11,12
size:.word 12
s: .word 3

.text
#storing address in x4
la x4,arr
#size in x5
lw x5,size
#loading int s in x2
lw x2,s
#Variable i in x1
li x1,0
loop:
    beq x1,x5,exit
    lw x7,0(x4)
    add x7,x7,x2
    sw x7,0(x4)
    addi x4,x4,4
    addi x1,x1,1
    #factor 1
    lw x7,0(x4)
    add x7,x7,x2
    sw x7,0(x4)
    addi x4,x4,4
    addi x1,x1,1
    #factor 2
    lw x7,0(x4)
    add x7,x7,x2
    sw x7,0(x4)
    addi x4,x4,4
    addi x1,x1,1
    #factor 3
    lw x7,0(x4)
    add x7,x7,x2
    sw x7,0(x4)
    addi x4,x4,4
    addi x1,x1,1
    #factor 4
    j loop
exit:
    li a7,10
    ecall
```

### Observations :
1. Without loop unrolling, the IPC = 0.681
2. With loop unrolling by a factor of 4, the IPC = 0.745

**Doing further optimization to remove stalls**

```assembly=
.data
arr: .word 1,2,3,4,5,6,7,8,9,10,11,12
size:.word 12
s: .word 3

.text
#storing address in x4
la x4,arr
#size in x5
lw x5,size
#loading int s in x2
lw x2,s
#Variable i in x1
li x1,0
loop:
    beq x1,x5,exit
    #all four loads first, so no add has to wait for the load just before it
    lw x7,0(x4)
    lw x8,4(x4)
    lw x9,8(x4)
    lw x10,12(x4)
    add x7,x7,x2
    add x8,x8,x2
    add x9,x9,x2
    add x10,x10,x2
    sw x7,0(x4)
    sw x8,4(x4)
    sw x9,8(x4)
    sw x10,12(x4)
    #one pointer update and one counter update per 4 elements
    addi x4,x4,16
    addi x1,x1,4
    j loop
exit:
    li a7,10
    ecall
```

### Observations :
1. Now, with loop unrolling by a factor of 4, the IPC = 0.773
2. With loop unrolling by a factor of 6, the IPC = 0.794
3. With loop unrolling by a factor of 12, the IPC = 0.863

* We can see that as the loop unrolling factor increases, the IPC also increases. But we can't keep unrolling further, because the unrolled code takes up more memory space. Hence the best loop unrolling in terms of both IPC and space is the one with a factor of 4.
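
The reordering used in the second assembly listing (group the loads, then the adds, then the stores, and update the index once per 4 elements) can also be sketched at the C level. This is only an illustration and not part of the lab code: the function names `add_scalar` and `add_scalar_unrolled4` are hypothetical, and the unrolled version assumes the array length is a multiple of 4, as it is for the 12-element array above.

```c
#include <assert.h>
#include <stddef.h>

/* Reference version: one element per iteration. The elements are
   independent, so the traversal direction does not change the result. */
void add_scalar(int *x, size_t n, int s) {
    for (size_t i = 0; i < n; i++) {
        x[i] = x[i] + s;
    }
}

/* Unrolled by 4 (hypothetical helper). Assumes n is a multiple of 4. */
void add_scalar_unrolled4(int *x, size_t n, int s) {
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        /* four loads grouped together */
        int a = x[i];
        int b = x[i + 1];
        int c = x[i + 2];
        int d = x[i + 3];
        /* four independent adds */
        a += s;
        b += s;
        c += s;
        d += s;
        /* four stores */
        x[i] = a;
        x[i + 1] = b;
        x[i + 2] = c;
        x[i + 3] = d;
    }
}
```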

---

## Question 2

A linked list node in RISC-V is stored as two consecutive blocks of memory, each 32 bits in size: one for the value and one for the pointer, which stores the address of the next node. If an address doesn't fit in 32 bits, we can store it as a ***dword***, which is 64 bits. Here we are not allocating the nodes at random locations in memory; they are placed one after another in the data segment.

* Initialized a linked list with 10 nodes
* The reverse function is implemented using the three-pointer technique

```assembly=
.data
node1: .word 0 #value
       .word 0 #pointer to next node
node2: .word 0
       .word 0
node3: .word 0
       .word 0
node4: .word 0
       .word 0
node5: .word 0
       .word 0
node6: .word 0
       .word 0
node7: .word 0
       .word 0
node8: .word 0
       .word 0
node9: .word 0
       .word 0
node10:.word 0
       .word 0

.text
init:
    #initialize
    la x1,node1
    la x2,node2
    sw x2,4(x1)
    la x1,node3
    sw x1,4(x2)
    la x2,node4
    sw x2,4(x1)
    la x1,node5
    sw x1,4(x2)
    la x2,node6
    sw x2,4(x1)
    la x1,node7
    sw x1,4(x2)
    la x2,node8
    sw x2,4(x1)
    la x1,node9
    sw x1,4(x2)
    la x2,node10
    sw x2,4(x1)
    #Storing head node address in x1
    la x1,node1
reverse:
    #maintaining three pointers i,j,k (x2 = previous, x3 = current, x4 = next)
    li x2,0
    mv x3,x1
    mv x4,x1
loop:
    beq x3,x0,exit
    lw x4,4(x3)
    sw x2,4(x3)
    mv x2,x3
    mv x3,x4
    j loop
exit:
    #x2 now holds the head of the reversed list
    li a7,10
    ecall
```

---

## Question 3

The `allocate_node` function in assembly, which allocates a new node at the end of the linked list

```assembly=
.data
node1: .word 1 # value
       .word 0 # pointer to next node
node2: .word 2
       .word 0
node3: .word 3
       .word 0
node4: .word 4
       .word 0
node5: .word 5
       .word 0
node6: .word 6
       .word 0
node7: .word 7
       .word 0
node8: .word 8
       .word 0
node9: .word 9
       .word 0
node10:.word 10
       .word 0

.text
init:
    #initialize
    la x1,node1
    la x2,node2
    sw x2,4(x1)
    la x1,node3
    sw x1,4(x2)
    la x2,node4
    sw x2,4(x1)
    la x1,node5
    sw x1,4(x2)
    la x2,node6
    sw x2,4(x1)
    la x1,node7
    sw x1,4(x2)
    la x2,node8
    sw x2,4(x1)
    la x1,node9
    sw x1,4(x2)
    la x2,node10
    sw x2,4(x1)
allocate_node:
    #x7 walks the list, starting from the head node
    la x7,node1
to_end:
    #advance one node per step, so the walk works for any list length
    lw x6,4(x7)
    beq x6,x0,allocate
    mv x7,x6
    j to_end
allocate:
    #Now x7 is pointing to the last node
    #the new node is placed right after it (assumes those 8 bytes are free)
    mv x6,x7
    addi x6,x6,8
    sw x6,4(x7)
    li x6,1000
    sw x6,8(x7)
    sw x0,12(x7)
    #The new node is added to the list
```
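
For comparison, the two list routines from Questions 2 and 3 (three-pointer reversal and appending a node at the tail) can be sketched in C. This is a minimal sketch under stated assumptions, not the lab solution: `struct node`, `reverse_list`, and `append_node` are hypothetical names, the caller supplies the storage for the new node instead of taking the 8 bytes after the last node, and on a 64-bit host the pointer field is 8 bytes rather than the 32-bit word used in the assembly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layout: value first, then the pointer to the next node. */
struct node {
    int value;
    struct node *next;
};

/* Three-pointer reversal: prev, curr, next play the roles of x2, x3, x4.
   Returns the new head (the old tail), or NULL for an empty list. */
struct node *reverse_list(struct node *head) {
    struct node *prev = NULL;
    struct node *curr = head;
    while (curr != NULL) {
        struct node *next = curr->next; /* remember the rest of the list */
        curr->next = prev;              /* point the current node backwards */
        prev = curr;
        curr = next;
    }
    return prev;
}

/* Walk to the last node one step at a time, then link the new node.
   Assumes head is non-NULL and fresh points to unused storage. */
void append_node(struct node *head, struct node *fresh, int value) {
    assert(head != NULL && fresh != NULL);
    struct node *tail = head;
    while (tail->next != NULL) {
        tail = tail->next;
    }
    fresh->value = value;
    fresh->next = NULL;
    tail->next = fresh;
}
```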