# Shift-and-Subtract Division contributed by <`ryanycs`> :::danger Mention your GitHub account here. ::: ## Software Division RV32I does not include `div` instruction, which indicates integer division must be implemented in software. A common and efficient approach is the **shift-and-subtract division**, a simplified form of binary long division. In the following section, we will illustrate how this algorithm works, starting from the decimal long-division and eventually a complete RV32I implementation. ## Inspiration: Decimal Long Division We consider the division: $$ 100 \div 7 $$ We can perform the standard long division: ``` ┌────── 7 │ 100 ``` Let us break down the calculating process step by step: We begin by taking digits from **left to right**. The first digit is $1$, we set the *remainder* $R$ to $1$. Since the *remainder* $R = 1$ is less than the *diviser* $7$, no subtraction is performed at this stage, and the first *quotient* $Q$ is $0$. ``` 0 ┌────── 7 │ 100 ────── 1 0 ────── 1 ``` --- After that, we bring down the next digit from the dividend. This appends a $0$ to the current remainder, forming a new remainder: $$ R = 10 $$ Since $10 \ge 7$, we can perform a subtraction. We subtract $7$ from $10$ to obtain the new remainder: $$ R = 10 - 7 = 3 $$ Because we performed one substraction, the next quotient digit is $1$, giving: $$ Q = 01 = 1 $$ ``` 01 ┌────── 7 │ 100 ────── 1 0 ────── 10 7 ────── 3 ``` --- Next, we bring down the final digit of the dividend, forming: $$ R = 30 $$ Since $30 \ge 7$, we perform subtraction $4$ times, the reason to choice $4$ is that: - $7 \times 4 = 28 \le 30$ - $7 \times 5 = 23 > 30$ The remainder is: $$ R = 30 - 7 * 4 = 2 $$ The next quotient digit is $4$, giving: $$ Q = 14 $$ ``` 014 ┌────── 7 │ 100 ────── 1 0 ────── 10 7 ────── 30 28 ────── 2 ``` --- Therefore, the **quotient** is $14$, and the **remainder** is $2$. We can simulate this approch into binary long division. ## Binary Long Divison ### Example Taking $13 \div 3$ as the example. We first convert these number form decimal into binary representation: - Dividend = `1101` (13) - Divisor = `11` (3) ``` ┌─────── 11 │ 1101 ``` --- Take the digit of the dividend from **left to right**. The first digit is $1$, we set the remainder $R = 1$. Since the remainder $R = 1$ is less then the divisor $11$, there is no need of substraction. The quotient $Q$ is set to $0$. ``` 0 ┌─────── 11 │ 1101 ─────── 1 0 ─────── 1 ``` --- Next, bringing down the second digit of the dividend, which forming the new remainder: $$ R = (1 << 1) | 1 = 11 $$ Since remainder $R = 11_2 = 3$ is greater or equal to divisor $11_2 = 3$, we substract the divisor from remainder, obtain: $$ R = 11_2 - 11_2 = 0 $$ There is a substraction in this step, the quotient become: $$ Q = 01 = 1 $$ ``` 01 ┌─────── 11 │ 1101 ─────── 1 0 ─────── 11 11 ─────── 0 ``` --- After that, bringing down the third digit of the dividend, forming: $$ R = (0 << 1) | 0 = 0 $$ The new remainder $R = 0$ is less than divisor $11_2 = 3$, there is no need to do substracion. The quotient become: $$ Q = (1 << 1) | 0 = 10 $$ ``` 010 ┌─────── 11 │ 1101 ─────── 1 0 ─────── 11 11 ─────── 0 0 ─────── 0 ``` --- Finally, bringing down the last digit, the new remainder is: $$ R = (0 << 1) | 1 = 1 $$ The remainder $R = 1$ still less than the divisor $11_2 = 3$, there is no need to do substracion. The quotient become: $$ Q = (10 << 1) | 0 = 100 $$ ``` 0100 ┌─────── 11 │ 1101 ─────── 1 0 ─────── 11 11 ─────── 0 0 ─────── 1 0 ─────── 1 ``` --- Therefore, the **quotient** is `100` in binary, i.e. $4$ in decimal, and the **remainder** is `1` in brnary, i.e. $1$ in decimal: $$ 13 \div 3 = 4 \cdot\cdot\cdot 1 $$ ### Algorithm We can observe that binary long division relies only on three operations: **bit shifting**, **bitwise OR**, and **subtraction**, and that its processing steps follow a simple, repeating structure. We can write down the algorithm as follows: 1. initialize the remainder `R` and quotient `Q` to $0$. 2. **Shift left** the remainder `R` and **bitwise OR** the left-most bit of the dividend `Q` into `R` to form a new remainder. 3. **Shift left** the dividend to discard the bit that was just processed. 4. **Shift left** the quotient `Q` to prepare for the next quotient bit. 5. Compare the remainder `R` with the divisor `D`: - if `R` >= `D`, then do subtraction `R = R - D` and set LSB of `Q` to 1. - Otherwise, set LSB of `Q` to 0. 6. Repeat steps 2–5 for each bit of the dividend. ### C Implementation of Shift-and-Subtract Division The following function performs unsigned integer division using the shift-and-subtract algorithm: ```c void div(uint32_t dividend, uint32_t divisor, uint32_t *quotient, uint32_t *remainder) { int bit_count = 32; *quotient = 0; *remainder = 0; while (bit_count--) { uint32_t left_most = (dividend >> 31) & 1; // left-most bit of the dividend dividend <<= 1; // discard the bit that was just processed *remainder = (*remainder << 1) | left_most; *quotient <<= 1; // prepare for the next quotient bit. if (*remainder >= divisor) { *remainder -= divisor; *quotient |= 1; } } } ``` ### RISC-V Implementation of Shift-and-Subtract Division The following RISC-V program shows the unsigned integer division using the shift-and-subtract algorithm: ```asm # software_div: Software division implementation for RV32I # Algorithm: Simplified shift-and-subtract division # Input: a0 = dividend, a1 = divisor # Output: a0 = quotient, a1 = remainder # Constraints: Assumes positive inputs and divisor != 0 software_div: add t0, x0, x0 # quotient = 0 add t1, x0, x0 # remainder = 0 addi t2, x0, 32 # bit_count = 32 (process 32 bits) div_loop: slli t1, t1, 1 # remainder <<= 1 srli t3, a0, 31 # Extract MSB of dividend or t1, t1, t3 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, skip_sub # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) skip_sub: addi t2, t2, -1 # bit_count-- bne t2, x0, div_loop # Continue if bit_count != 0 add a0, t0, x0 # Return quotient in a0 add a1, t1, x0 # Return remainder in a1 jalr x0, ra, 0 # Return to caller ``` ## Handling Negative Value Handling negative input is simple. We can just using **bitwise XOR** on the sign bit of the both dividend and divisor. If the result is $1$, then the quotient is negative; otherwise, it is positive. The remainder has the same sign with the dividend. The shift-and-subtract division loop will use the absolute value of the dividend and divisor. The following RISC-V program shows the refinement of the previous prgram to support negative input. ```asm # software_div: Software division implementation for RV32I # Algorithm: Simplified shift-and-subtract division # Input: a0 = dividend, a1 = divisor # Output: a0 = quotient, a1 = remainder # Constraints: Assumes divisor != 0 software_div: add t0, x0, x0 # quotient = 0 add t1, x0, x0 # remainder = 0 addi t2, x0, 32 # bit_count = 32 (process 32 bits) srli t3, a1, 31 # sign bit of divisor srli t4, a0, 31 # negative_r = sign bit of dividend xor t3, t3, t4 # negative_q = dividend_sign ^ divisor_sign srai t5, a0, 31 # mask = a0 >> 31 xor t6, a0, t5 # temp = a0 ^ mask sub a0, t6, t5 # abs(a0) = temp - mask srai t5, a1, 31 # mask xor t6, a1, t5 # temp sub a1, t6, t5 # abs(a1) div_loop: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, skip_sub # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) skip_sub: addi t2, t2, -1 # bit_count-- bne t2, x0, div_loop # Continue if bit_count != 0 beq x0, t3, skip_neg_q sub t0, x0, t0 skip_neg_q: beq x0, t4, skip_neg_r sub t1, x0, t1 skip_neg_r: add a0, t0, x0 # Return quotient in a0 add a1, t1, x0 # Return remainder in a1 jalr x0, ra, 0 # Return to caller ``` I used the following test program and [Ripes](https://github.com/mortbopet/Ripes) simulator to measure the cycles count of software_div: ```asm .text main: li a0, 100 li a1, 3 call software_div li a7, 10 ecall ``` - Cycles: `419` :::danger Improve the above for less CPU cycles required. ::: #### Optimized RISC-V Program To improve the performance of the RISC-V implementation above, we note that the core division loop executes exactly 32 times; therefore, we can use **loop unrolling** to avoid using a branch to control the loop. Since the Ripes simulator does not include a branch predictor and treats every branch as not taken, removing branches can significantly reduce the execution overhead. - Cycles: `292` :::spoiler RISC-V Program ```asm # software_div: Software division implementation for RV32I # Algorithm: Simplified shift-and-subtract division # Input: a0 = dividend, a1 = divisor # Output: a0 = quotient, a1 = remainder # Constraints: Assumes divisor != 0 software_div: add t0, x0, x0 # quotient = 0 add t1, x0, x0 # remainder = 0 srli t3, a1, 31 # sign bit of divisor srli t4, a0, 31 # negative_r = sign bit of dividend xor t3, t3, t4 # negative_q = dividend_sign ^ divisor_sign srai t5, a0, 31 # mask = a0 >> 31 xor t6, a0, t5 # temp = a0 ^ mask sub a0, t6, t5 # abs(a0) = temp - mask srai t5, a1, 31 # mask xor t6, a1, t5 # temp sub a1, t6, t5 # abs(a1) div_loop: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, skip_sub # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) skip_sub: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: slli t1, t1, 1 # remainder <<= 1 srli t5, a0, 31 # Extract MSB of dividend or t1, t1, t5 # remainder |= extracted bit slli a0, a0, 1 # dividend <<= 1 (shift left) slli t0, t0, 1 # quotient <<= 1 (prepare for next bit) blt t1, a1, 1f # If remainder < divisor, skip subtraction sub t1, t1, a1 # remainder -= divisor ori t0, t0, 1 # quotient |= 1 (set LSB) 1: beq x0, t3, skip_neg_q sub t0, x0, t0 skip_neg_q: beq x0, t4, skip_neg_r sub t1, x0, t1 skip_neg_r: add a0, t0, x0 # Return quotient in a0 add a1, t1, x0 # Return remainder in a1 jalr x0, ra, 0 # Return to caller ``` ::: ## Referance [Quiz4 of Computer Architecture (2025 Fall) Problem-C](https://hackmd.io/@sysprog/arch2025-quiz4-sol#Problem-C) [你所不知道的 C 語言:bitwise 操作](https://hackmd.io/@sysprog/c-bitwise)