--- tags: computer-arch --- # Quiz2 of Computer Architecture (2024 Fall) > Solutions ## password ![sol](https://hackmd.io/_uploads/S1-4nwERR.png) $\to$ `0x1000` ```c li x10, 0x1A1B slli x12, x10, 0x12 srli x12, x12, 0x6 and x12, x12, x10 ``` secret: 3245 guess: 1224 ## Problem `A` You are asked to implement the shift-and-add technique, which resembles the traditional method of long multiplication. In this approach, the multiplicand (`X`) is repeatedly added to itself according to the value of the multiplier (`Y`), effectively performing `X` multiplied by `Y`. The process involves examining each digit of the multiplier from right to left. For each digit, the multiplicand is multiplied by that digit, and the intermediate product is shifted accordingly to align with the partial sums from previous steps. This process results in the final product. The following RISC-V assembly code implements the shift-and-add multiplication algorithm. In the data section, the data used in the program are defined: the multiplier (`-9`), the multiplicand (`7`), and a space for the result (initialized to 0). ```c .data multiplier: .word -9 multiplicand: .word 7 result: .word 0 ``` ![ripes-mul](https://hackmd.io/_uploads/HyW9EnQRR.png) Using Ripes, you can edit and verify the RISC-V code implementing the shift-and-add multiplication algorithm. After executing the RISC-V assembly code below, the address `0x10000008` stores the result `-63`. ```c .text la a0, multiplier # Load multiplier address lw a1, 0(a0) # Load multiplier value la a2, multiplicand # Load multiplicand address lw a3, 0(a2) # Load multiplicand value li t0, 0 # Initialize accumulator li t1, 32 # Set bit counter (#A01) # Check for negative values bltz a1, handle_negative1 # If multiplier negative (#A02) j shift_and_add_loop # Skip to main loop (#A05) bltz a3, handle_negative2 # If multiplicand negative (#A03) j shift_and_add_loop # Continue to main loop (#A04) handle_negative1: neg a1, a1 # Make multiplier positive handle_negative2: neg a3, a3 # Make multiplicand positive shift_and_add_loop: beqz t1, end_shift_and_add # Exit if bit count is zero andi t2, a1, 1 # Check least significant bit (#A06) beqz t2, skip_add # Skip add if bit is 0 add t0, t0, a3 # Add to accumulator skip_add: srai a1, a1, 1 # Right shift multiplier slli a3, a3, 1 # Left shift multiplicand addi t1, t1, -1 # Decrease bit counter j shift_and_add_loop # Repeat loop (#A07) end_shift_and_add: la a4, result # Load result address sw t0, 0(a4) # Store final result (#A08) ``` The program provided above is incomplete. You need to fill in the sections marked with identifiers starting with `#A`. These sections require your input to complete the program correctly. Please note the following requirements: The content you provide for identifiers starting with `#A` must consist of: * Decimal integer literals, and/or * RISC-V instructions Pseudoinstructions are not allowed in your answers. A01 = `addi t1, x0, 32` OR equivalent A02 = `bltz` A03 = `bltz` A04 = `shift_and_add_loop` A05 = `shift_and_add_loop` A06 = `andi` A07 = `shift_and_add_loop` A08 = `sw t0, 0(a4)` --- ## Problem `B` ![image](https://hackmd.io/_uploads/Sk559wORR.png) You are designing a RISC-V assembly program to solve the classic Tower of Hanoi puzzle. The main functionality is encapsulated within the hanoi function, which executes the algorithm to move the disks. This function takes four arguments: * `n`: The total number of disks. * `src`: A character representing the source rod (A). * `aux`: A character representing the auxiliary rod (B). * `dst`: A character representing the destination rod (C). These arguments are passed through the registers `a1`, `a2`, `a3`, and `a4` respectively. Notably, the `a0` register is deliberately left unused in this implementation. The program outputs each move required to solve the Tower of Hanoi to the console, detailing which disk is moved from one rod to another. It does not return any values. Example Usage: ```c li a1, 3 # Set the number of disks to 3 li a2, 'A' # Assign 'A' as the source rod li a3, 'B' # Assign 'B' as the auxiliary rod li a4, 'C' # Assign 'C' as the destination rod jal hanoi # Invoke the Tower of Hanoi function ``` Code listing: ```c .data md: .string "Move Disk " # String for move message from: .string " from '" # String for source pole to: .string "' to '" # String for destination pole newline: .string "'\n" # String for newline src: .string "A" # Source pole (A) aux: .string "B" # Auxiliary pole (B) dst: .string "C" # Destination pole (C) n: .word 3 # Number of disks .text .globl main main: lw a1, n # Load the number of disks (n) into register a1 la t0, src # Load the address of the source pole (A) into t0 la t1, dst # Load the address of the destination pole (C) into t1 la t2, aux # Load the address of the auxiliary pole (B) into t2 lbu a2, 0(t0) # Load the first character of source pole into a2 lbu a3, 0(t2) # Load the first character of auxiliary pole into a3 lbu a4, 0(t1) # Load the first character of destination pole into a4 jal x1, hanoi # Call the hanoi function (jump and link) (#B10) li a7, 10 # Load system call number for program exit into a7 (#B11) ecall # Make the system call to exit the program hanoi: addi sp, sp, -20 # Allocate 20 bytes of space on the stack (#B01) sw x1, 0(sp) # Save the return address on the stack sw a1, 4(sp) # Save the number of disks (a1) on the stack sw a2, 8(sp) # Save the source pole (a2) on the stack sw a3, 12(sp) # Save the auxiliary pole (a3) on the stack sw a4, 16(sp) # Save the destination pole (a4) on the stack addi t0, x0, 1 # Load 1 into t0 (used for comparison) beq a1, t0, return # If there's only 1 disk, jump to return (#B02) lw a1, 4(sp) # Load the number of disks from the stack addi a1, a1, -1 # Decrement the number of disks (a1) lbu a2, 8(sp) # Load source pole from the stack lbu a3, 16(sp) # Load destination pole from the stack lbu a4, 12(sp) # Load auxiliary pole from the stack jal x1, hanoi # Recursive call to hanoi (#B03) lw a1, 4(sp) # Load the number of disks from the stack lbu a2, 8(sp) # Load source pole from the stack lbu a3, 16(sp) # Load destination pole from the stack jal x1, print # Call the print function to display the move lw a1, 4(sp) # Load the number of disks from the stack addi a1, a1, -1 # Decrement the number of disks (a1) lbu a2, 12(sp) # Load auxiliary pole from the stack lbu a3, 8(sp) # Load source pole from the stack lbu a4, 16(sp) # Load destination pole from the stack jal x1, hanoi # Recursive call to hanoi (#B04) lw x1, 0(sp) # Load the return address from the stack addi sp, sp, 20 # Deallocate space on the stack jalr x0, x1, 0 # Return to the caller (#B05) return: lw a1, 4(sp) # Load the number of disks from the stack lbu a2, 8(sp) # Load source pole from the stack lbu a3, 16(sp) # Load destination pole from the stack jal x1, print # Call the print function to display the move lw x1, 0(sp) # Load the return address from the stack addi sp, sp, 20 # Deallocate space on the stack (#B06) jalr x0, x1, 0 # Return to the caller (#B07) print: la a0, md # Load the address of "Move Disk " into a0 li a7, 4 # Load system call number for printing a string into a7 ecall # Make the system call to print the string addi a0, a1, 0 # Load the disk number (a1) into a0 li a7, 1 # Load system call number for printing an integer into a7 ecall # Make the system call to print the integer la a0, from # Load the address of " from '" into a0 li a7, 4 # Load system call number for printing a string into a7 ecall # Make the system call to print the string addi a0, a2, 0 # Load the source pole into a0 li a7, 11 # Load system call number for printing a character into a7 ecall # Make the system call to print the character la a0, to # Load the address of "' to '" into a0 li a7, 4 # Load system call number for printing a string into a7 ecall # Make the system call to print the string addi a0, a3, 0 # Load the destination pole into a0 li a7, 11 # Load system call number for printing a character into a7 ecall # Make the system call to print the character la a0, newline # Load the address of "'\n" into a0 li a7, 4 # Load system call number for printing a string into a7 ecall # Make the system call to print the string jalr x0, x1, 0 # Return to the caller (#B09) ``` Complete the program by filling in sections marked with `#B`: * Use only: * Decimal integer literals * RISC-V instructions * No pseudoinstructions allowed B01 = `-20` B02 = `beq` B03 = `jal` B04 = `jal x1, hanoi` B05 = `jalr` B06 = `20` B07 = `jalr` B08 = `20` B09 = `jalr` B10 = `jal` B11 = `addi a7, x0` --- ## Problem `C` Building on Problem B, your task is to implement the Tower of Hanoi puzzle iteratively. Gray codes provide a Hamiltonian path on an n-dimensional cube by treating each bit as a coordinate. The reflected Gray code is the most widely used variant and is illustrated below. - 1-Bit Gray Code: `0`, `1`. - 2-Bit Gray Code: Mirror the 1-bit code and prefix with `0` and `1`, resulting in `00, 01, 11, 10`. - n-Bit Gray Code: Recursively mirror the `(n-1)`-bit Gray code, prefix with `0` and `1`, then concatenate. ![gray code](https://hackmd.io/_uploads/rJUybqu00.png =50%x) It key property: Consecutive Gray codes differ by only one bit flip. Determining the Changed Bit: 1. Even Number of 1s: Flip the last bit. 2. Odd Number of 1s: Flip the bit immediately to the left of the rightmost `1`. Both Gray code generation and solving Tower of Hanoi follow a recursive pattern: 1. Recursive Structure: - Tower of Hanoi: Solve for `n-1` disks, move the largest disk, then solve for `n-1` disks again. - Gray Code Generation: Mirror the `(n-1)`-bit Gray code, prefix with `0` and `1`, then concatenate. 2. Pattern Formation: Both processes create a recursive "ABACABADABACABA" pattern. 3. Iterative Move Instructions Using Gray Code: - Disk Labeling: Assign the smallest disk as `'A'`, the next as `'B'`, etc. - Move Determination: Each bit flip in the Gray code corresponds to moving the respective disk. - Disambiguation for Disk `'A'`: Always move disk `'A'` in a consistent direction (e.g., Peg 1 → Peg 2 → Peg 3 → Peg 1 → ...). This sequence ensures only one disk moves at a time and that no larger disk is placed atop a smaller one, adhering to Tower of Hanoi rules. `__builtin_popcount()` is a GCC/Clang compiler intrinsic function designed to count the number of set bits (1s) in an unsigned integer. Essentially, it calculates how many 1’s are present in the binary representation of a given positive integer. ```c __builtin_popcount(unsigned int number); ``` `number` is an unsigned or positive integer whose set bits you want to count. The function returns an integer representing the total number of set bits in the provided number. Example usage of `__builtin_popcount` - Input: 5 - Binary Representation of 5 is `101` - Output: 2 > The binary form of the number `5` is `101`, which contains two set bits (the first and third bits are 1). ```c /* Iterative Tower of Hanoi Using Gray Code */ #include <stdbool.h> #include <stdint.h> #include <stdio.h> static void print_move(int disk, char from, char to) { printf("Move Disk %d from '%c' to '%c'\n", disk, from, to); } int main() { int n_disks = 3; /* Number of disks */ int total_moves = (1 << n_disks) - 1; const char pegs[3] = {'A', 'B', 'C'}; /* Peg labels */ int pos[n_disks]; /* Disk positions: 0-A, 1-B, 2-C */ /* Initialize all disks to peg A */ for (int i = 0; i < n_disks; i++) pos[i] = 0; /* Set direction based on parity of number of disks */ int direction = (n_disks & 1) ? -1 /* counter-clockwise */ : 1; /* clockwise */ /* Predefined packed mapping arrays */ const uint8_t peg_map[3] = { 0x9, /* Peg A: {CCW -> C (2), CW -> B (1)} */ // #C01 0x2, /* Peg B: {CCW -> A (0), CW -> C (2)} */ // #C02 0x4 /* Peg C: {CCW -> B (1), CW -> A (0)} */ // #C03 }; /* Calculate direction index: -1 -> 0 (CCW), 1 ->1 (CW) */ int direction_index = (direction + 1) / 2; for (int n_moves = 1; n_moves <= total_moves; n_moves++) { int curr_gray = n_moves ^ (n_moves >> 1); int prev_gray = (n_moves - 1) ^ ((n_moves - 1) >> 1); int changed_bit = curr_gray ^ prev_gray; /* Identify the disk to move (0-indexed) */ int disk = __builtin_popcount(changed_bit - 1); // #C04 /* Current peg of the disk */ int curr_peg = pos[disk]; int new_peg; if (disk == 0) { /* Calculate shift: direction_index=0 (CCW) -> shift2, * direction_index=1 (CW) -> shift0 */ int shift = (1 - direction_index) << 1; new_peg = (peg_map[curr_peg] >> shift) & 0x3; // #C05 } else { /* Find the only peg not occupied by any smaller disk */ bool found_new_peg = false; for (int p = 0; p < 3 && !found_new_peg; p++) { if (p == curr_peg) continue; /* Check if any smaller disk is on peg p */ bool has_smaller = false; for (int d = 0; d < disk; d++) { if (pos[d] == p) { has_smaller = true; break; } } if (!has_smaller) { new_peg = p; found_new_peg = true; } } } /* Execute the move */ print_move(disk + 1, pegs[curr_peg], pegs[new_peg]); pos[disk] = new_peg; } return 0; } ``` Reference output: ``` Move Disk 1 from 'A' to 'C' Move Disk 2 from 'A' to 'B' Move Disk 1 from 'C' to 'B' Move Disk 3 from 'A' to 'C' Move Disk 1 from 'B' to 'A' Move Disk 2 from 'B' to 'C' Move Disk 1 from 'A' to 'C' ``` C01 = `0x9` C02 = `0x2` C03 = `0x4` C04 = `changed_bit-1` C05 = `0x3` --- ## Problem `D` You are tasked with implementing Ancient Egyptian Multiplication, a method that multiplies two numbers by systematically halving one number and doubling the other, thereby avoiding the use of direct multiplication. This technique involves repeatedly dividing one number by two and multiplying the other by two. After each halving and doubling step, you identify and sum the doubled values corresponding to the halved numbers that are odd. $$ nm = \begin{cases} \frac{n}{2} \cdot 2m & \text{if } n \text{ is even}, \\ \frac{n-1}{2} \cdot 2m + m & \text{if } n \text{ is odd}, \\ m & \text{if } n = 1. \end{cases} $$ ```c .data # Define the data section with two numbers to multiply num1: .word 13 num2: .word 7 result: .word 0 # Placeholder for the final result .text # Begin the main code in the text section la x1, result # Load the address of the result variable into register x1 lw t0, num1 # Load the first number (num1) into register t0 lw t1, num2 # Load the second number (num2) into register t1 li t2, 0 # Initialize the result (t2) to 0 loop: # Check if the least significant bit of t0 (num1) is 1 (i.e., if the number is odd) andi t3, t0, 1 beq t3, x0, skip_add # If the bit is 0 (even), skip the addition # If the number is odd, add the value in t1 (num2) to the result in t2 add t2, t2, t1 # D01 skip_add: # Perform a right shift on t0 (num1), effectively dividing it by 2 srli t0, t0, 1 # D02 # Perform a left shift on t1 (num2), effectively multiplying it by 2 slli t1, t1, 1 # D03 # If t0 (num1) is not zero, repeat the loop bnez t0, loop # Store the final result in the memory location pointed by x1 sw t2, 0(x1) ``` D01 = `add t2, t2, t1` D02 = `srli` D03 = `slli` D04 = `sw t2, 0(x1)` OR equivalent