# Assignment1: RISC-V Assembly and Instruction Pipeline
:::danger
Check the requirements carefully.
:::
## 1342. Number of Steps to Reduce a Number to Zero
Given an integer num, return the number of steps to reduce it to zero. In one step, if the current number is even, you have to divide it by 2, otherwise, you have to subtract 1 from it.
**Example** :
**Input**: num = 14
**Output**: 6
**Explanation**:
Step 1) 14 is even; divide by 2 and obtain 7.
Step 2) 7 is odd; subtract 1 and obtain 6.
Step 3) 6 is even; divide by 2 and obtain 3.
Step 4) 3 is odd; subtract 1 and obtain 2.
Step 5) 2 is even; divide by 2 and obtain 1.
Step 6) 1 is odd; subtract 1 and obtain 0.
Constraints:
`0 <= num <= 10^6`
## Solution
First, we use simulation to solve this question directly.Implement the program according to the description of the problem.
```c
int numberOfSteps(int num) {
int count=0;
while(num!=0){
if(num%2==0){//number is even
num/=2;
}
else{//number is odd
num-=1;
}
count++;
}
return count;
}
```
If we want to change above C code to asseblely, we will face one problem how to implement `%` and `/` .In order to implement those operator, I use long division with clz mentioned in quiz1.
```c
static inline int my_clz(uint32_t x) {
int count = 0;
for (int i = 31; i >= 0; --i) {
if (x & (1U << i))
break;
count++;
}
return count;
}
```
In long division, the first thing should be done is align the MSB of two number. Find the MSB of two number and substract each other to get the shift number of divisor. Then, clz is a efficient way to find the MSB of a number.
```
.data
number: .word 10000
str1: .string "The steps to Reduce a Number to Zero of " # First part of the output message
str2: .string " is " # Second part of the output message
.text
main:
lw a1, number # Load the argument (10000) into register a1
jal ra, numberOfSteps # Jump-and-link to the 'numberOfSteps' function
mv a1, a0 # Move the result (count) from a0 to a1 for printing
lw a0, number # Reload input argument(10000) to a0 for printing
jal ra, printResult # Call the function to print the result
li a7, 10 # System call code for exiting the program
ecall # Make the exit system call
# numberOfSteps
# a1: Input argument (number)
# a0: Output argument (count)
numberOfSteps:
addi sp, sp, -4
sw ra, 0(sp)
li a0, 0 # Set count number to 0
li t0, 2 # Save dividor(2) in temporary register t0
start:
beqz a1, end # If number is 0, count won't be 0. prevent this condition
mv t1, a1 # Save dividend(number) in temporary register t1
bgt t0, t1, ldend # If dividend is less than divisor, skip longdivision
# CLZ of input number
li t2, 31 # i = 31
li t3, 1 # Save a constant(1) in temporary register t3
li t4, 0 # count = 0
clzloop:
sll t5, t3, t2 # x & (1U << i)
and t6, t5, t1 # Save result in temporary register t6
bge t6, t3, longdivision # If t6 is greater than 1 or equal to 1, break the loop
addi t4, t4, 1 # count++
addi t2, t2, -1 # --i
j clzloop
longdivision:
# number divide 2
li t1, 30 # CLZ of 2 is 30
sub t5, t1, t4 # Difference between clz of divisor and clz of dividend
sll t2, t0, t5 # Align the most significant digit of divisor and dividend
li t3, 0 # Set quotient in temporary register t3
li t4, 0 # Record number of divisor shift
mv t1, a1 # Restore number to temporary register t1
ldloop:
bgt t2, t1, skip # If divisor is larger than dividend, skip substraction
sub t1, t1, t2 # Dividend substract divisor
addi t3, t3, 1 # Set the LSB of quotient to 1
skip:
beq t4, t5, ldend # If number of divisor shift is eqaul to number of alignment shift
slli t3, t3, 1 # Shift left quotient
srli t2, t2, 1 # Shift Right divisor
addi t4, t4, 1 # The number of divisor shift add 1
j ldloop
ldend:
beqz t1, even # If t1(remainder) is equal to 0, number is even.Otherwise, it's odd
addi a1, a1, -1 # If t1(remainder) is odd, number - 1
addi a0, a0, 1 # count add 1
j start
even:
mv a1, t3 # If t1(remainder) is even, number % 2(t3 is the quotient of the longdivision)
addi a0, a0, 1 # count add 1
j start
end:
lw ra, 0(sp)
addi sp, sp, 4
ret
# This function prints the factorial result in the format:
# a0: The original input value (number)
# a1: The number of Steps to Reduce a Number to Zero (count)
printResult:
mv t0, a0 # Save original input value (X) in temporary register t0
mv t1, a1 # Save factorial result (Y) in temporary register t1
la a0, str1 # Load the address of the first string ("Factorial value of ")
li a7, 4 # System call code for printing a string
ecall # Print the string
mv a0, t0 # Move the original input value (X) to a0 for printing
li a7, 1 # System call code for printing an integer
ecall # Print the integer (X)
la a0, str2 # Load the address of the second string (" is ")
li a7, 4 # System call code for printing a string
ecall # Print the string
mv a0, t1 # Move the factorial result (Y) to a0 for printing
li a7, 1 # System call code for printing an integer
ecall # Print the integer (Y)
ret # Return to the caller
```
:::danger
Use fewer instructions.
:::
After simulation on Ripes, the result is in following picture

Look like there's something for improvement. I find this code spend a lot time on clz funtion. As long as the input number is smaller, the search time of clz will be more. In order to reduce time of clz funtion, I use bit mask way to implement clz funtion.
```c
int my_clz(uint32_t x) {
if (x == 0) return 32;
int n = 0;
if (x <= 0x0000FFFF) { n += 16; x <<= 16; }
if (x <= 0x00FFFFFF) { n += 8; x <<= 8; }
if (x <= 0x0FFFFFFF) { n += 4; x <<= 4; }
if (x <= 0x3FFFFFFF) { n += 2; x <<= 2; }
if (x <= 0x7FFFFFFF) { n += 1; x <<= 1; }
return n;
}
```
```c
# CLZ of input number
li t4, 0 # n = 0
li t2, 0x0000FFFF # Save 0x0000FFFF in temporary register t2
bgt t1, t2, S1 # If t1 is greater than t2, skip following instrcution
addi t4, t4, 16 # n += 16
slli t1, t1, 16 # x <<= 16
S1:
li t2, 0x00FFFFFF # Save 0x00FFFFFF in temporary register t2
bgt t1, t2, S2 # If t1 is greater than t2, skip following instrcution
addi t4, t4, 8 # n += 8
slli t1, t1, 8 # n += 8
S2:
li t2, 0x0FFFFFFF # Save 0x0FFFFFFF in temporary register t2
bgt t1, t2, S3 # If t1 is greater than t2, skip following instrcution
addi t4, t4, 4 # n += 8
slli t1, t1, 4 # n += 8
S3:
li t2, 0x3FFFFFFF # Save 0x3FFFFFFF in temporary register t2
bgt t1, t2, S4 # If t1 is greater than t2, skip following instrcution
addi t4, t4, 2
slli t1, t1, 2
S4:
li t2, 0x7FFFFFFF # Save 0x7FFFFFFF in temporary register t2
bgt t1, t2, longdivision
addi t4, t4, 1
```
The result is in the following picture. Compare to the oridinal version, the cycle time is reduced significantly.

### Version 2
```c
int numberOfSteps(int num) {
if(num==0) return 0; //if number is 0, count won't be 0. prevent this condition
int count = 0;
while (num!=0) {
count += (num & 0x1) + 1;
num >>= 1;
}
return count - 1;
}
```
```c
.data
number: .word 10000
str1: .string "The steps to Reduce a Number to Zero of " # First part of the output message
str2: .string " is " # Second part of the output message
.text
main:
lw a1, number
jal ra, numberOfSteps
mv a1, a0
lw a0, number
jal ra, printResult
li a7, 10 # System call code for exiting the program
ecall # Make the exit system call
# numberOfSteps
# a1: Input argument (number)
# a0: Output argument (count)
numberOfSteps:
addi sp, sp, -4
sw ra, 0(sp)
li a0, 0 #set count nunber to 0
start:
beqz a1, end
andi t0, a1, 0x1
addi t0, t0, 1
add a0, t0, a0
srli a1, a1, 1
j start
end:
addi a0, a0, -1
lw ra, 0(sp)
addi sp, sp, 4
ret
# This function prints the factorial result in the format:
# a0: The original input value (number)
# a1: The number of Steps to Reduce a Number to Zero (count)
printResult:
mv t0, a0 # Save original input value (X) in temporary register t0
mv t1, a1 # Save factorial result (Y) in temporary register t1
la a0, str1 # Load the address of the first string ("Factorial value of ")
li a7, 4 # System call code for printing a string
ecall # Print the string
mv a0, t0 # Move the original input value (X) to a0 for printing
li a7, 1 # System call code for printing an integer
ecall # Print the integer (X)
la a0, str2 # Load the address of the second string (" is ")
li a7, 4 # System call code for printing a string
ecall # Print the string
mv a0, t1 # Move the factorial result (Y) to a0 for printing
li a7, 1 # System call code for printing an integer
ecall # Print the integer (Y)
ret # Return to the caller
```

### Version 3
```c
int length(uint32_t num) {
uint32_t clz = 0;
if ((num >> 16) == 0) {
clz += 16;
num <<= 16;
}
if ((num >> 24) == 0) {
clz += 8;
num <<= 8;
}
if ((num >> 28) == 0) {
clz += 4;
num <<= 4;
}
if ((num >> 30) == 0) {
clz += 2;
num <<= 2;
}
if ((num >> 31) == 0) {
clz += 1;
}
return 32 - clz;
}
int count(uint32_t num) {
num = (num & 0x55555555) + ((num >> 1) & 0x55555555);
num = (num & 0x33333333) + ((num >> 2) & 0x33333333);
num = (num & 0x0F0F0F0F) + ((num >> 4) & 0x0F0F0F0F);
num = (num & 0x00FF00FF) + ((num >> 8) & 0x00FF00FF);
num = (num & 0x0000FFFF) + ((num >> 16) & 0x0000FFFF);
return num;
}
int numberOfSteps(uint32_t num) {
if(num==0) return 0; //if number is 0, count won't be 0. prevent this condition
return length(num) - 1 + count(num);
}
```
```c
.data
number: .word 10000
str1: .string "The steps to Reduce a Number to Zero of " # First part of the output message
str2: .string " is " # Second part of the output message
.text
main:
lw a1, number
jal ra, numberOfSteps
mv a1, a0
lw a0, number
jal ra, printResult
li a7, 10 # System call code for exiting the program
ecall # Make the exit system call
# numberOfSteps
# a1: Input argument (number)
# a0: Output argument (count)
numberOfSteps:
addi sp, sp, -4
sw ra, 0(sp)
beqz a1, end # if number is 0, count won't be 0. prevent this condition
# length funtion
li a0, 0 # set clz to 0
mv t0, a1 # Save original input value (number) in temporary register t0
mv t1, a1 # Save original input value (number) in temporary register t0
srli t0, t0, 16
bne t0, x0, S1
addi a0, a0, 16
slli t1, t1, 16
S1:
mv t0, t1
srli t0, t0, 24
bne t0, x0, S2
addi a0, a0, 8
slli t1, t1, 8
S2:
mv t0, t1
srli t0, t0, 28
bne t0, x0, S3
addi a0, a0, 4
slli t1, t1, 4
S3:
mv t0, t1
srli t0, t0, 30
bne t0, x0, S4
addi a0, a0, 2
slli t1, t1, 2
S4:
mv t0, t1
srli t0, t0, 31
bne t0, x0, S5
addi a0, a0, 1
S5:
li t1, 32
sub a0, t1, a0 # 32 - clz
# count funtion
li t1, 0x55555555
and t2, a1, t1
srli a1, a1, 1
and a1, a1, t1
add a1, a1, t2
li t1, 0x33333333
and t2, a1, t1
srli a1, a1, 2
and a1, a1, t1
add a1, a1, t2
li t1, 0x0F0F0F0F
and t2, a1, t1
srli a1, a1, 4
and a1, a1, t1
add a1, a1, t2
li t1, 0x00FF00FF
and t2, a1, t1
srli a1, a1, 8
and a1, a1, t1
add a1, a1, t2
li t1, 0x0000FFFF
and t2, a1, t1
srli a1, a1, 16
and a1, a1, t1
add a1, a1, t2
# length(num) - 1 + count(num)
add a0, a0, a1
addi a0, a0, -1
end:
lw ra, 0(sp)
addi sp, sp, 4
ret
# This function prints the factorial result in the format:
# a0: The original input value (number)
# a1: The number of Steps to Reduce a Number to Zero (count)
printResult:
mv t0, a0 # Save original input value (X) in temporary register t0
mv t1, a1 # Save factorial result (Y) in temporary register t1
la a0, str1 # Load the address of the first string ("Factorial value of ")
li a7, 4 # System call code for printing a string
ecall # Print the string
mv a0, t0 # Move the original input value (X) to a0 for printing
li a7, 1 # System call code for printing an integer
ecall # Print the integer (X)
la a0, str2 # Load the address of the second string (" is ")
li a7, 4 # System call code for printing a string
ecall # Print the string
mv a0, t1 # Move the factorial result (Y) to a0 for printing
li a7, 1 # System call code for printing an integer
ecall # Print the integer (Y)
ret # Return to the caller
```
