# Assignment1: RISC-V Assembly and Instruction Pipeline
contributed by < [`teimeikichengmingchi`](https://github.com/teimeikichengmingchi) >
###### tags: `RISC-V`, `jserv`
## Question Description
[Find Target Indices After Sorting Array](https://leetcode.com/problems/find-target-indices-after-sorting-array/)
You are given a 0-indexed integer array nums and a target element target.
A target index is an index i such that nums[i] == target.
Return a list of the target indices of nums after sorting nums in non-decreasing order.
If there are no target indices, return an empty list. The returned list must be sorted in increasing order.
Because the example input leetcode showed are not sorted, so I need to sorted them, then find the matched numbers in the vector.
## Implementation
### C++ code
```cpp=
class Solution {
public:
vector<int> targetIndices(vector<int>& nums, int target) {
int i=0,j=0,len=nums.size();
bool flag=0;
vector<int> out;
for(i=0;i<len;i++)
{
for(j=0;j<len-1;j++)
if(nums[j]>nums[j+1])
swap(nums[j],nums[j+1]);
}
i=0;
while(len>0)
{
if(nums[i]==target)
{
out.push_back(i);
flag=1;
}
else if(nums[i]!=target && flag==1)
break;
i++,len--;
}
return out;
}
};
````
- First I sorted the input with Bubble sort, after that, I got a sorted vector.
- Then I compare the target number and the number in the vector, if they matched then I append the index of the number into the vector 'out'.
- The bool 'flag' is detecting whether I find the number, after I find the target first time, I set the flag to 1.
- If the flag is equal to 1 and I find a number in vector doesn't match the target, then I know that there is no need to search down anymore.
### Assembly code
#### input data
```assembly=
.data
arr1: .word 55,44,33,22,11
arr2: .word 44,11,44,66,55
arr3: .word 11,55,33,22,55
arr4: .word 10,9,8,7,6,5,4,3,2,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2
arrSize: .word 5
arrSize: .word 5
target: .word 44
Lbracket: .string "["
Rbracket: .string "]"
Blank: .string " "
```
- arr1 is used to test the vector has a matched number.
- arr2 is used to test the vector has multiple matched number.
- arr3 is used to test the vector has no matched number, in this case the output should be a empty list.
- arr4 is longer input that compare the performance.
#### the main part
##### registers & variables
- s1: Outer loop counter
- s2: Inner loop counter
- a2: *arr[]
- t0: arr[j]
- t1: arr[j+1]
- t2: vector length and counter of Find
- t3: target
- t4: element of vector
- t5: counter of vector
```assembly=
.text
main:
la a0, Lbracket # a0 is return value register
li a7,4 # Print string by specifiy the value of a7 register t0 4
ecall # First print "["
li t5,0
addi s1,x0,0 # Number of sorted elements(Outer_Loop counter)
addi s3,x0,0 # is equal to 1 if target was finded in sorted vector
la t2,arrSize # Load base address of arrSize
lw t2,0(t2) # Load arrSize to t2
addi t2,t2,-1 # vector length and counter of Find
Outer_Loop:
la a2,arr1 # base address of array
lw t0,0(a2) # t0=arr[j]
sub s2,t2,s1 # Inner_Loop counter, inner do arrSize - i times
Inner_Loop:
lw t1,4(a2) # t1=arr[j+1]
blt t0,t1,NoSwap # If arr[j]<arr[j+1], then don't swap
sw t1,0(a2) # Swap(arr[j],arr[j+1])
sw t0,4(a2) # Swap(arr[j],arr[j+1])
NoSwap: # Let bigger number between t0,t1 in t0
bgt t0,t1,Skip # If t0 is bigger, then keep t0 the same
add t0,t1,x0 # If t1 is bigger, then let t0 = t1
Skip:
addi a2,a2,4 #next element of vector
addi s2,s2,-1
bne s2,x0,Inner_Loop # If s3=0, means that one Inner_Loop has done
Inner_End:
addi s1,s1,1 # After a loop has done, sorted number +1
beq s1,t2,Outer_End # If s1 (Outer_Loop counter) = t2 (Arrsize), end the sorting
j Outer_Loop # Else keep the sorting
Outer_End:
la t3,target
lw t3,0(t3) #target
Find:
lw t4,-4(a2) # get the first sorted vector element
addi a2,a2,4 # get the nexr element of sorted vector
beq t3,t4,Print # If target=element of vector, then print the index
addi t5,t5,1 # index+1
beq t5,t2,End # If t5 (index of vector) >= t2 (arrSize),
bgt t5,t2,End
j Find # Else Keep Finding
Print:
add a0, t5, x0 # return value
li a7, 1 # print integer
ecall
la a0, Blank #a0 is return value register
li a7,4 # print string by specifiy the value of a7 register t0 4
ecall
addi t5,t5,1
j Find
End:
la a0, Rbracket #a0 is return value register
li a7,4 # print string by specifiy the value of a7 register t0 4
ecall
nop
```
#### Print result on console
- Use ECALL to print
- In the very first of the whole program, I print "["
- Then if find the number in vector match the target, I print the index followed by a 'blank' on console
- After the Find end, I print out "]"

### Analysis
- Using Ripes, we can checked the total cycles and instructions
- The most I can enhance about my code is in Bubble sort,I implemented the Bubble sort without checked whether the number has swapped in inner loop.So it will always cost O(n^2) to sort the input data.
- But somehow when I modified the code, the time spend that leetcode showed to me didn't change much(actually a bit higher)
- Another way the enhance the performance is Find.Though in my C++ code, I use flag to detect when the Find loop ended, In riscv assembly code, I simply run Find loop arrSize time. So if I can enhance this part, cycles of the program may be a little bit fewer.
- The performance enhance much on leetcode, so I decide to implement the enhancement.
New risc-V code is below:
```Assembly=
Print:
li s3,1
```
I add a row of code in the begining of 'Print', and use $s3 as a flag.When the program run 'Print' lebel, it means it find matched number in the vector, so set s3 to 1.
```Assembly=
Find:
lw t4,-4(a2) # get the first sorted vector element
addi a2,a2,4 # get the nexr element of sorted vector
beq t3,t4,Print # If target=element of vector, then print the index
bgt s3,x0,End # If target is matched in vector, and there is another element of vector not martched,then don't find anymore
addi t5,t5,1 # index+1
beq t5,t2,End # If t5 (index of vector) >= t2 (arrSize),
bgt t5,t2,End
j Find # Else Keep Finding
```
And I add a row of code(row 7) in 'Find' label, once it finded some number isn't matched the target, and s3 is equal to 1, then it has no need to find anymore.The final version of my risc-V code is [there](https://github.com/mpyh12345/Assignment-1/blob/main/leetcode2089.s).
Performance1:(arrSize = 5, before enhancement)

Performance2:(arrSize = 5, after enhancement)

The cycles is a little bit lower after enhancement.Now we consider thar input is arr4,which is a 35-element list.
Performance3:(arrSize = 35, before enhancement)

Performance4:(arrSize = 35, after enhancement)

Though the CPI and IPC are so close, but the total cycles the program run is much lower.