---
tags: computer-arch
---
# Quiz2 of Computer Architecture (2024 Fall)
> Solutions
## password
![sol](https://hackmd.io/_uploads/S1-4nwERR.png)
$\to$ `0x1000`
```c
li x10, 0x1A1B
slli x12, x10, 0x12
srli x12, x12, 0x6
and x12, x12, x10
```
secret: 3245
guess: 1224
## Problem `A`
You are asked to implement the shift-and-add technique, which resembles the traditional method of long multiplication. In this approach, the multiplicand (`X`) is repeatedly added to itself according to the value of the multiplier (`Y`), effectively performing `X` multiplied by `Y`.
The process involves examining each digit of the multiplier from right to left. For each digit, the multiplicand is multiplied by that digit, and the intermediate product is shifted accordingly to align with the partial sums from previous steps. This process results in the final product.
The following RISC-V assembly code implements the shift-and-add multiplication algorithm. In the data section, the data used in the program are defined: the multiplier (`-9`), the multiplicand (`7`), and a space for the result (initialized to 0).
```c
.data
multiplier: .word -9
multiplicand: .word 7
result: .word 0
```
![ripes-mul](https://hackmd.io/_uploads/HyW9EnQRR.png)
Using Ripes, you can edit and verify the RISC-V code implementing the shift-and-add multiplication algorithm. After executing the RISC-V assembly code below, the address `0x10000008` stores the result `-63`.
```c
.text
la a0, multiplier # Load multiplier address
lw a1, 0(a0) # Load multiplier value
la a2, multiplicand # Load multiplicand address
lw a3, 0(a2) # Load multiplicand value
li t0, 0 # Initialize accumulator
li t1, 32 # Set bit counter (#A01)
# Check for negative values
bltz a1, handle_negative1 # If multiplier negative (#A02)
j shift_and_add_loop # Skip to main loop (#A05)
bltz a3, handle_negative2 # If multiplicand negative (#A03)
j shift_and_add_loop # Continue to main loop (#A04)
handle_negative1:
neg a1, a1 # Make multiplier positive
handle_negative2:
neg a3, a3 # Make multiplicand positive
shift_and_add_loop:
beqz t1, end_shift_and_add # Exit if bit count is zero
andi t2, a1, 1 # Check least significant bit (#A06)
beqz t2, skip_add # Skip add if bit is 0
add t0, t0, a3 # Add to accumulator
skip_add:
srai a1, a1, 1 # Right shift multiplier
slli a3, a3, 1 # Left shift multiplicand
addi t1, t1, -1 # Decrease bit counter
j shift_and_add_loop # Repeat loop (#A07)
end_shift_and_add:
la a4, result # Load result address
sw t0, 0(a4) # Store final result (#A08)
```
The program provided above is incomplete. You need to fill in the sections marked with identifiers starting with `#A`. These sections require your input to complete the program correctly. Please note the following requirements:
The content you provide for identifiers starting with `#A` must consist of:
* Decimal integer literals, and/or
* RISC-V instructions
Pseudoinstructions are not allowed in your answers.
A01 = `addi t1, x0, 32` OR equivalent
A02 = `bltz`
A03 = `bltz`
A04 = `shift_and_add_loop`
A05 = `shift_and_add_loop`
A06 = `andi`
A07 = `shift_and_add_loop`
A08 = `sw t0, 0(a4)`
---
## Problem `B`
![image](https://hackmd.io/_uploads/Sk559wORR.png)
You are designing a RISC-V assembly program to solve the classic Tower of Hanoi puzzle. The main functionality is encapsulated within the hanoi function, which executes the algorithm to move the disks. This function takes four arguments:
* `n`: The total number of disks.
* `src`: A character representing the source rod (A).
* `aux`: A character representing the auxiliary rod (B).
* `dst`: A character representing the destination rod (C).
These arguments are passed through the registers `a1`, `a2`, `a3`, and `a4` respectively. Notably, the `a0` register is deliberately left unused in this implementation. The program outputs each move required to solve the Tower of Hanoi to the console, detailing which disk is moved from one rod to another. It does not return any values.
Example Usage:
```c
li a1, 3 # Set the number of disks to 3
li a2, 'A' # Assign 'A' as the source rod
li a3, 'B' # Assign 'B' as the auxiliary rod
li a4, 'C' # Assign 'C' as the destination rod
jal hanoi # Invoke the Tower of Hanoi function
```
Code listing:
```c
.data
md: .string "Move Disk " # String for move message
from: .string " from '" # String for source pole
to: .string "' to '" # String for destination pole
newline: .string "'\n" # String for newline
src: .string "A" # Source pole (A)
aux: .string "B" # Auxiliary pole (B)
dst: .string "C" # Destination pole (C)
n: .word 3 # Number of disks
.text
.globl main
main:
lw a1, n # Load the number of disks (n) into register a1
la t0, src # Load the address of the source pole (A) into t0
la t1, dst # Load the address of the destination pole (C) into t1
la t2, aux # Load the address of the auxiliary pole (B) into t2
lbu a2, 0(t0) # Load the first character of source pole into a2
lbu a3, 0(t2) # Load the first character of auxiliary pole into a3
lbu a4, 0(t1) # Load the first character of destination pole into a4
jal x1, hanoi # Call the hanoi function (jump and link) (#B10)
li a7, 10 # Load system call number for program exit into a7 (#B11)
ecall # Make the system call to exit the program
hanoi:
addi sp, sp, -20 # Allocate 20 bytes of space on the stack (#B01)
sw x1, 0(sp) # Save the return address on the stack
sw a1, 4(sp) # Save the number of disks (a1) on the stack
sw a2, 8(sp) # Save the source pole (a2) on the stack
sw a3, 12(sp) # Save the auxiliary pole (a3) on the stack
sw a4, 16(sp) # Save the destination pole (a4) on the stack
addi t0, x0, 1 # Load 1 into t0 (used for comparison)
beq a1, t0, return # If there's only 1 disk, jump to return (#B02)
lw a1, 4(sp) # Load the number of disks from the stack
addi a1, a1, -1 # Decrement the number of disks (a1)
lbu a2, 8(sp) # Load source pole from the stack
lbu a3, 16(sp) # Load destination pole from the stack
lbu a4, 12(sp) # Load auxiliary pole from the stack
jal x1, hanoi # Recursive call to hanoi (#B03)
lw a1, 4(sp) # Load the number of disks from the stack
lbu a2, 8(sp) # Load source pole from the stack
lbu a3, 16(sp) # Load destination pole from the stack
jal x1, print # Call the print function to display the move
lw a1, 4(sp) # Load the number of disks from the stack
addi a1, a1, -1 # Decrement the number of disks (a1)
lbu a2, 12(sp) # Load auxiliary pole from the stack
lbu a3, 8(sp) # Load source pole from the stack
lbu a4, 16(sp) # Load destination pole from the stack
jal x1, hanoi # Recursive call to hanoi (#B04)
lw x1, 0(sp) # Load the return address from the stack
addi sp, sp, 20 # Deallocate space on the stack
jalr x0, x1, 0 # Return to the caller (#B05)
return:
lw a1, 4(sp) # Load the number of disks from the stack
lbu a2, 8(sp) # Load source pole from the stack
lbu a3, 16(sp) # Load destination pole from the stack
jal x1, print # Call the print function to display the move
lw x1, 0(sp) # Load the return address from the stack
addi sp, sp, 20 # Deallocate space on the stack (#B06)
jalr x0, x1, 0 # Return to the caller (#B07)
print:
la a0, md # Load the address of "Move Disk " into a0
li a7, 4 # Load system call number for printing a string into a7
ecall # Make the system call to print the string
addi a0, a1, 0 # Load the disk number (a1) into a0
li a7, 1 # Load system call number for printing an integer into a7
ecall # Make the system call to print the integer
la a0, from # Load the address of " from '" into a0
li a7, 4 # Load system call number for printing a string into a7
ecall # Make the system call to print the string
addi a0, a2, 0 # Load the source pole into a0
li a7, 11 # Load system call number for printing a character into a7
ecall # Make the system call to print the character
la a0, to # Load the address of "' to '" into a0
li a7, 4 # Load system call number for printing a string into a7
ecall # Make the system call to print the string
addi a0, a3, 0 # Load the destination pole into a0
li a7, 11 # Load system call number for printing a character into a7
ecall # Make the system call to print the character
la a0, newline # Load the address of "'\n" into a0
li a7, 4 # Load system call number for printing a string into a7
ecall # Make the system call to print the string
jalr x0, x1, 0 # Return to the caller (#B09)
```
Complete the program by filling in sections marked with `#B`:
* Use only:
* Decimal integer literals
* RISC-V instructions
* No pseudoinstructions allowed
B01 = `-20`
B02 = `beq`
B03 = `jal`
B04 = `jal x1, hanoi`
B05 = `jalr`
B06 = `20`
B07 = `jalr`
B08 = `20`
B09 = `jalr`
B10 = `jal`
B11 = `addi a7, x0`
---
## Problem `C`
Building on Problem B, your task is to implement the Tower of Hanoi puzzle iteratively. Gray codes provide a Hamiltonian path on an n-dimensional cube by treating each bit as a coordinate. The reflected Gray code is the most widely used variant and is illustrated below.
- 1-Bit Gray Code: `0`, `1`.
- 2-Bit Gray Code: Mirror the 1-bit code and prefix with `0` and `1`, resulting in `00, 01, 11, 10`.
- n-Bit Gray Code: Recursively mirror the `(n-1)`-bit Gray code, prefix with `0` and `1`, then concatenate.
![gray code](https://hackmd.io/_uploads/rJUybqu00.png =50%x)
It key property: Consecutive Gray codes differ by only one bit flip.
Determining the Changed Bit:
1. Even Number of 1s: Flip the last bit.
2. Odd Number of 1s: Flip the bit immediately to the left of the rightmost `1`.
Both Gray code generation and solving Tower of Hanoi follow a recursive pattern:
1. Recursive Structure:
- Tower of Hanoi: Solve for `n-1` disks, move the largest disk, then solve for `n-1` disks again.
- Gray Code Generation: Mirror the `(n-1)`-bit Gray code, prefix with `0` and `1`, then concatenate.
2. Pattern Formation: Both processes create a recursive "ABACABADABACABA" pattern.
3. Iterative Move Instructions Using Gray Code:
- Disk Labeling: Assign the smallest disk as `'A'`, the next as `'B'`, etc.
- Move Determination: Each bit flip in the Gray code corresponds to moving the respective disk.
- Disambiguation for Disk `'A'`: Always move disk `'A'` in a consistent direction (e.g., Peg 1 → Peg 2 → Peg 3 → Peg 1 → ...).
This sequence ensures only one disk moves at a time and that no larger disk is placed atop a smaller one, adhering to Tower of Hanoi rules.
`__builtin_popcount()` is a GCC/Clang compiler intrinsic function designed to count the number of set bits (1s) in an unsigned integer. Essentially, it calculates how many 1’s are present in the binary representation of a given positive integer.
```c
__builtin_popcount(unsigned int number);
```
`number` is an unsigned or positive integer whose set bits you want to count. The function returns an integer representing the total number of set bits in the provided number.
Example usage of `__builtin_popcount`
- Input: 5
- Binary Representation of 5 is `101`
- Output: 2
> The binary form of the number `5` is `101`, which contains two set bits (the first and third bits are 1).
```c
/* Iterative Tower of Hanoi Using Gray Code */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
static void print_move(int disk, char from, char to)
{
printf("Move Disk %d from '%c' to '%c'\n", disk, from, to);
}
int main()
{
int n_disks = 3; /* Number of disks */
int total_moves = (1 << n_disks) - 1;
const char pegs[3] = {'A', 'B', 'C'}; /* Peg labels */
int pos[n_disks]; /* Disk positions: 0-A, 1-B, 2-C */
/* Initialize all disks to peg A */
for (int i = 0; i < n_disks; i++)
pos[i] = 0;
/* Set direction based on parity of number of disks */
int direction = (n_disks & 1) ? -1 /* counter-clockwise */
: 1; /* clockwise */
/* Predefined packed mapping arrays */
const uint8_t peg_map[3] = {
0x9, /* Peg A: {CCW -> C (2), CW -> B (1)} */ // #C01
0x2, /* Peg B: {CCW -> A (0), CW -> C (2)} */ // #C02
0x4 /* Peg C: {CCW -> B (1), CW -> A (0)} */ // #C03
};
/* Calculate direction index: -1 -> 0 (CCW), 1 ->1 (CW) */
int direction_index = (direction + 1) / 2;
for (int n_moves = 1; n_moves <= total_moves; n_moves++) {
int curr_gray = n_moves ^ (n_moves >> 1);
int prev_gray = (n_moves - 1) ^ ((n_moves - 1) >> 1);
int changed_bit = curr_gray ^ prev_gray;
/* Identify the disk to move (0-indexed) */
int disk = __builtin_popcount(changed_bit - 1); // #C04
/* Current peg of the disk */
int curr_peg = pos[disk];
int new_peg;
if (disk == 0) {
/* Calculate shift: direction_index=0 (CCW) -> shift2,
* direction_index=1 (CW) -> shift0
*/
int shift = (1 - direction_index) << 1;
new_peg = (peg_map[curr_peg] >> shift) & 0x3; // #C05
} else {
/* Find the only peg not occupied by any smaller disk */
bool found_new_peg = false;
for (int p = 0; p < 3 && !found_new_peg; p++) {
if (p == curr_peg)
continue;
/* Check if any smaller disk is on peg p */
bool has_smaller = false;
for (int d = 0; d < disk; d++) {
if (pos[d] == p) {
has_smaller = true;
break;
}
}
if (!has_smaller) {
new_peg = p;
found_new_peg = true;
}
}
}
/* Execute the move */
print_move(disk + 1, pegs[curr_peg], pegs[new_peg]);
pos[disk] = new_peg;
}
return 0;
}
```
Reference output:
```
Move Disk 1 from 'A' to 'C'
Move Disk 2 from 'A' to 'B'
Move Disk 1 from 'C' to 'B'
Move Disk 3 from 'A' to 'C'
Move Disk 1 from 'B' to 'A'
Move Disk 2 from 'B' to 'C'
Move Disk 1 from 'A' to 'C'
```
C01 = `0x9`
C02 = `0x2`
C03 = `0x4`
C04 = `changed_bit-1`
C05 = `0x3`
---
## Problem `D`
You are tasked with implementing Ancient Egyptian Multiplication, a method that multiplies two numbers by systematically halving one number and doubling the other, thereby avoiding the use of direct multiplication. This technique involves repeatedly dividing one number by two and multiplying the other by two. After each halving and doubling step, you identify and sum the doubled values corresponding to the halved numbers that are odd.
$$
nm =
\begin{cases}
\frac{n}{2} \cdot 2m & \text{if } n \text{ is even}, \\
\frac{n-1}{2} \cdot 2m + m & \text{if } n \text{ is odd}, \\
m & \text{if } n = 1.
\end{cases}
$$
```c
.data
# Define the data section with two numbers to multiply
num1: .word 13
num2: .word 7
result: .word 0 # Placeholder for the final result
.text
# Begin the main code in the text section
la x1, result # Load the address of the result variable into register x1
lw t0, num1 # Load the first number (num1) into register t0
lw t1, num2 # Load the second number (num2) into register t1
li t2, 0 # Initialize the result (t2) to 0
loop:
# Check if the least significant bit of t0 (num1) is 1 (i.e., if the number is odd)
andi t3, t0, 1
beq t3, x0, skip_add # If the bit is 0 (even), skip the addition
# If the number is odd, add the value in t1 (num2) to the result in t2
add t2, t2, t1 # D01
skip_add:
# Perform a right shift on t0 (num1), effectively dividing it by 2
srli t0, t0, 1 # D02
# Perform a left shift on t1 (num2), effectively multiplying it by 2
slli t1, t1, 1 # D03
# If t0 (num1) is not zero, repeat the loop
bnez t0, loop
# Store the final result in the memory location pointed by x1
sw t2, 0(x1)
```
D01 = `add t2, t2, t1`
D02 = `srli`
D03 = `slli`
D04 = `sw t2, 0(x1)` OR equivalent