# Shift-and-Subtract Division
contributed by <`ryanycs`>
:::danger
Mention your GitHub account here.
:::
## Software Division
RV32I does not include `div` instruction, which indicates integer division must be implemented in software. A common and efficient approach is the **shift-and-subtract division**, a simplified form of binary long division. In the following section, we will illustrate how this algorithm works, starting from the decimal long-division and eventually a complete RV32I implementation.
## Inspiration: Decimal Long Division
We consider the division:
$$
100 \div 7
$$
We can perform the standard long division:
```
┌──────
7 │ 100
```
Let us break down the calculating process step by step:
We begin by taking digits from **left to right**. The first digit is $1$, we set the *remainder* $R$ to $1$. Since the *remainder* $R = 1$ is less than the *diviser* $7$, no subtraction is performed at this stage, and the first *quotient* $Q$ is $0$.
```
0
┌──────
7 │ 100
──────
1
0
──────
1
```
---
After that, we bring down the next digit from the dividend. This appends a $0$ to the current remainder, forming a new remainder:
$$
R = 10
$$
Since $10 \ge 7$, we can perform a subtraction. We subtract $7$ from $10$ to obtain the new remainder:
$$
R = 10 - 7 = 3
$$
Because we performed one substraction, the next quotient digit is $1$, giving:
$$
Q = 01 = 1
$$
```
01
┌──────
7 │ 100
──────
1
0
──────
10
7
──────
3
```
---
Next, we bring down the final digit of the dividend, forming:
$$
R = 30
$$
Since $30 \ge 7$, we perform subtraction $4$ times, the reason to choice $4$ is that:
- $7 \times 4 = 28 \le 30$
- $7 \times 5 = 23 > 30$
The remainder is:
$$
R = 30 - 7 * 4 = 2
$$
The next quotient digit is $4$, giving:
$$
Q = 14
$$
```
014
┌──────
7 │ 100
──────
1
0
──────
10
7
──────
30
28
──────
2
```
---
Therefore, the **quotient** is $14$, and the **remainder** is $2$.
We can simulate this approch into binary long division.
## Binary Long Divison
### Example
Taking $13 \div 3$ as the example. We first convert these number form decimal into binary representation:
- Dividend = `1101` (13)
- Divisor = `11` (3)
```
┌───────
11 │ 1101
```
---
Take the digit of the dividend from **left to right**. The first digit is $1$, we set the remainder $R = 1$. Since the remainder $R = 1$ is less then the divisor $11$, there is no need of substraction. The quotient $Q$ is set to $0$.
```
0
┌───────
11 │ 1101
───────
1
0
───────
1
```
---
Next, bringing down the second digit of the dividend, which forming the new remainder:
$$
R = (1 << 1) | 1 = 11
$$
Since remainder $R = 11_2 = 3$ is greater or equal to divisor $11_2 = 3$, we substract the divisor from remainder, obtain:
$$
R = 11_2 - 11_2 = 0
$$
There is a substraction in this step, the quotient become:
$$
Q = 01 = 1
$$
```
01
┌───────
11 │ 1101
───────
1
0
───────
11
11
───────
0
```
---
After that, bringing down the third digit of the dividend, forming:
$$
R = (0 << 1) | 0 = 0
$$
The new remainder $R = 0$ is less than divisor $11_2 = 3$, there is no need to do substracion. The quotient become:
$$
Q = (1 << 1) | 0 = 10
$$
```
010
┌───────
11 │ 1101
───────
1
0
───────
11
11
───────
0
0
───────
0
```
---
Finally, bringing down the last digit, the new remainder is:
$$
R = (0 << 1) | 1 = 1
$$
The remainder $R = 1$ still less than the divisor $11_2 = 3$, there is no need to do substracion. The quotient become:
$$
Q = (10 << 1) | 0 = 100
$$
```
0100
┌───────
11 │ 1101
───────
1
0
───────
11
11
───────
0
0
───────
1
0
───────
1
```
---
Therefore, the **quotient** is `100` in binary, i.e. $4$ in decimal, and the **remainder** is `1` in brnary, i.e. $1$ in decimal:
$$
13 \div 3 = 4 \cdot\cdot\cdot 1
$$
### Algorithm
We can observe that binary long division relies only on three operations: **bit shifting**, **bitwise OR**, and **subtraction**, and that its processing steps follow a simple, repeating structure.
We can write down the algorithm as follows:
1. initialize the remainder `R` and quotient `Q` to $0$.
2. **Shift left** the remainder `R` and **bitwise OR** the left-most bit of the dividend `Q` into `R` to form a new remainder.
3. **Shift left** the dividend to discard the bit that was just processed.
4. **Shift left** the quotient `Q` to prepare for the next quotient bit.
5. Compare the remainder `R` with the divisor `D`:
- if `R` >= `D`, then do subtraction `R = R - D` and set LSB of `Q` to 1.
- Otherwise, set LSB of `Q` to 0.
6. Repeat steps 2–5 for each bit of the dividend.
### C Implementation of Shift-and-Subtract Division
The following function performs unsigned integer division using the shift-and-subtract algorithm:
```c
void div(uint32_t dividend, uint32_t divisor,
uint32_t *quotient, uint32_t *remainder) {
int bit_count = 32;
*quotient = 0;
*remainder = 0;
while (bit_count--) {
uint32_t left_most = (dividend >> 31) & 1; // left-most bit of the dividend
dividend <<= 1; // discard the bit that was just processed
*remainder = (*remainder << 1) | left_most;
*quotient <<= 1; // prepare for the next quotient bit.
if (*remainder >= divisor) {
*remainder -= divisor;
*quotient |= 1;
}
}
}
```
### RISC-V Implementation of Shift-and-Subtract Division
The following RISC-V program shows the unsigned integer division using the shift-and-subtract algorithm:
```asm
# software_div: Software division implementation for RV32I
# Algorithm: Simplified shift-and-subtract division
# Input: a0 = dividend, a1 = divisor
# Output: a0 = quotient, a1 = remainder
# Constraints: Assumes positive inputs and divisor != 0
software_div:
add t0, x0, x0 # quotient = 0
add t1, x0, x0 # remainder = 0
addi t2, x0, 32 # bit_count = 32 (process 32 bits)
div_loop:
slli t1, t1, 1 # remainder <<= 1
srli t3, a0, 31 # Extract MSB of dividend
or t1, t1, t3 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, skip_sub # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
skip_sub:
addi t2, t2, -1 # bit_count--
bne t2, x0, div_loop # Continue if bit_count != 0
add a0, t0, x0 # Return quotient in a0
add a1, t1, x0 # Return remainder in a1
jalr x0, ra, 0 # Return to caller
```
## Handling Negative Value
Handling negative input is simple. We can just using **bitwise XOR** on the sign bit of the both dividend and divisor. If the result is $1$, then the quotient is negative; otherwise, it is positive. The remainder has the same sign with the dividend.
The shift-and-subtract division loop will use the absolute value of the dividend and divisor.
The following RISC-V program shows the refinement of the previous prgram to support negative input.
```asm
# software_div: Software division implementation for RV32I
# Algorithm: Simplified shift-and-subtract division
# Input: a0 = dividend, a1 = divisor
# Output: a0 = quotient, a1 = remainder
# Constraints: Assumes divisor != 0
software_div:
add t0, x0, x0 # quotient = 0
add t1, x0, x0 # remainder = 0
addi t2, x0, 32 # bit_count = 32 (process 32 bits)
srli t3, a1, 31 # sign bit of divisor
srli t4, a0, 31 # negative_r = sign bit of dividend
xor t3, t3, t4 # negative_q = dividend_sign ^ divisor_sign
srai t5, a0, 31 # mask = a0 >> 31
xor t6, a0, t5 # temp = a0 ^ mask
sub a0, t6, t5 # abs(a0) = temp - mask
srai t5, a1, 31 # mask
xor t6, a1, t5 # temp
sub a1, t6, t5 # abs(a1)
div_loop:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, skip_sub # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
skip_sub:
addi t2, t2, -1 # bit_count--
bne t2, x0, div_loop # Continue if bit_count != 0
beq x0, t3, skip_neg_q
sub t0, x0, t0
skip_neg_q:
beq x0, t4, skip_neg_r
sub t1, x0, t1
skip_neg_r:
add a0, t0, x0 # Return quotient in a0
add a1, t1, x0 # Return remainder in a1
jalr x0, ra, 0 # Return to caller
```
I used the following test program and [Ripes](https://github.com/mortbopet/Ripes) simulator to measure the cycles count of software_div:
```asm
.text
main:
li a0, 100
li a1, 3
call software_div
li a7, 10
ecall
```
- Cycles: `419`
:::danger
Improve the above for less CPU cycles required.
:::
#### Optimized RISC-V Program
To improve the performance of the RISC-V implementation above, we note that the core division loop executes exactly 32 times; therefore, we can use **loop unrolling** to avoid using a branch to control the loop. Since the Ripes simulator does not include a branch predictor and treats every branch as not taken, removing branches can significantly reduce the execution overhead.
- Cycles: `292`
:::spoiler RISC-V Program
```asm
# software_div: Software division implementation for RV32I
# Algorithm: Simplified shift-and-subtract division
# Input: a0 = dividend, a1 = divisor
# Output: a0 = quotient, a1 = remainder
# Constraints: Assumes divisor != 0
software_div:
add t0, x0, x0 # quotient = 0
add t1, x0, x0 # remainder = 0
srli t3, a1, 31 # sign bit of divisor
srli t4, a0, 31 # negative_r = sign bit of dividend
xor t3, t3, t4 # negative_q = dividend_sign ^ divisor_sign
srai t5, a0, 31 # mask = a0 >> 31
xor t6, a0, t5 # temp = a0 ^ mask
sub a0, t6, t5 # abs(a0) = temp - mask
srai t5, a1, 31 # mask
xor t6, a1, t5 # temp
sub a1, t6, t5 # abs(a1)
div_loop:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, skip_sub # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
skip_sub:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
slli t1, t1, 1 # remainder <<= 1
srli t5, a0, 31 # Extract MSB of dividend
or t1, t1, t5 # remainder |= extracted bit
slli a0, a0, 1 # dividend <<= 1 (shift left)
slli t0, t0, 1 # quotient <<= 1 (prepare for next bit)
blt t1, a1, 1f # If remainder < divisor, skip subtraction
sub t1, t1, a1 # remainder -= divisor
ori t0, t0, 1 # quotient |= 1 (set LSB)
1:
beq x0, t3, skip_neg_q
sub t0, x0, t0
skip_neg_q:
beq x0, t4, skip_neg_r
sub t1, x0, t1
skip_neg_r:
add a0, t0, x0 # Return quotient in a0
add a1, t1, x0 # Return remainder in a1
jalr x0, ra, 0 # Return to caller
```
:::
## Referance
[Quiz4 of Computer Architecture (2025 Fall) Problem-C](https://hackmd.io/@sysprog/arch2025-quiz4-sol#Problem-C)
[你所不知道的 C 語言:bitwise 操作](https://hackmd.io/@sysprog/c-bitwise)