Try   HackMD

Quiz2 of Computer Architecture (2024 Fall)

Solutions

password

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More →

0x1000

li x10, 0x1A1B
slli x12, x10, 0x12
srli x12, x12, 0x6
and  x12, x12, x10

secret: 3245
guess: 1224

Problem A

You are asked to implement the shift-and-add technique, which resembles the traditional method of long multiplication. In this approach, the multiplicand (X) is repeatedly added to itself according to the value of the multiplier (Y), effectively performing X multiplied by Y.

The process involves examining each digit of the multiplier from right to left. For each digit, the multiplicand is multiplied by that digit, and the intermediate product is shifted accordingly to align with the partial sums from previous steps. This process results in the final product.

The following RISC-V assembly code implements the shift-and-add multiplication algorithm. In the data section, the data used in the program are defined: the multiplier (-9), the multiplicand (7), and a space for the result (initialized to 0).

.data
    multiplier: .word -9
    multiplicand: .word 7
    result: .word 0

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More →

Using Ripes, you can edit and verify the RISC-V code implementing the shift-and-add multiplication algorithm. After executing the RISC-V assembly code below, the address 0x10000008 stores the result -63.

.text
    la a0, multiplier         # Load multiplier address
    lw a1, 0(a0)              # Load multiplier value
    la a2, multiplicand       # Load multiplicand address
    lw a3, 0(a2)              # Load multiplicand value
    li t0, 0                  # Initialize accumulator
    li t1, 32                 # Set bit counter (#A01)

    # Check for negative values
    bltz a1, handle_negative1 # If multiplier negative (#A02)
    j shift_and_add_loop      # Skip to main loop (#A05)
    bltz a3, handle_negative2 # If multiplicand negative (#A03)
    j shift_and_add_loop      # Continue to main loop (#A04)

handle_negative1:
    neg a1, a1                # Make multiplier positive

handle_negative2:
    neg a3, a3                # Make multiplicand positive

shift_and_add_loop:
    beqz t1, end_shift_and_add # Exit if bit count is zero
    andi t2, a1, 1            # Check least significant bit (#A06)
    beqz t2, skip_add         # Skip add if bit is 0
    add t0, t0, a3            # Add to accumulator

skip_add:
    srai a1, a1, 1            # Right shift multiplier
    slli a3, a3, 1            # Left shift multiplicand
    addi t1, t1, -1           # Decrease bit counter
    j shift_and_add_loop      # Repeat loop (#A07)

end_shift_and_add:
    la a4, result             # Load result address
    sw t0, 0(a4)              # Store final result (#A08)

The program provided above is incomplete. You need to fill in the sections marked with identifiers starting with #A. These sections require your input to complete the program correctly. Please note the following requirements:

The content you provide for identifiers starting with #A must consist of:

  • Decimal integer literals, and/or
  • RISC-V instructions

Pseudoinstructions are not allowed in your answers.

A01 = addi t1, x0, 32 OR equivalent
A02 = bltz
A03 = bltz
A04 = shift_and_add_loop
A05 = shift_and_add_loop
A06 = andi
A07 = shift_and_add_loop
A08 = sw t0, 0(a4)


Problem B

Image Not Showing Possible Reasons
  • The image was uploaded to a note which you don't have access to
  • The note which the image was originally uploaded to has been deleted
Learn More →

You are designing a RISC-V assembly program to solve the classic Tower of Hanoi puzzle. The main functionality is encapsulated within the hanoi function, which executes the algorithm to move the disks. This function takes four arguments:

  • n: The total number of disks.
  • src: A character representing the source rod (A).
  • aux: A character representing the auxiliary rod (B).
  • dst: A character representing the destination rod ©.

These arguments are passed through the registers a1, a2, a3, and a4 respectively. Notably, the a0 register is deliberately left unused in this implementation. The program outputs each move required to solve the Tower of Hanoi to the console, detailing which disk is moved from one rod to another. It does not return any values.

Example Usage:

li a1, 3          # Set the number of disks to 3
li a2, 'A'        # Assign 'A' as the source rod
li a3, 'B'        # Assign 'B' as the auxiliary rod
li a4, 'C'        # Assign 'C' as the destination rod
jal hanoi         # Invoke the Tower of Hanoi function

Code listing:

.data
    md:      .string "Move Disk "          # String for move message
    from:    .string " from '"             # String for source pole
    to:      .string "' to '"              # String for destination pole
    newline: .string "'\n"                 # String for newline
    src:     .string "A"                   # Source pole (A)
    aux:     .string "B"                   # Auxiliary pole (B)
    dst:     .string "C"                   # Destination pole (C)
    n:       .word   3                     # Number of disks

.text
.globl main
main:
    lw a1, n              # Load the number of disks (n) into register a1
    la t0, src            # Load the address of the source pole (A) into t0
    la t1, dst            # Load the address of the destination pole (C) into t1
    la t2, aux            # Load the address of the auxiliary pole (B) into t2
    lbu a2, 0(t0)         # Load the first character of source pole into a2
    lbu a3, 0(t2)         # Load the first character of auxiliary pole into a3
    lbu a4, 0(t1)         # Load the first character of destination pole into a4
    jal x1, hanoi  # Call the hanoi function (jump and link) (#B10)
    li a7, 10             # Load system call number for program exit into a7 (#B11)
    ecall                 # Make the system call to exit the program

hanoi:
    addi sp, sp, -20      # Allocate 20 bytes of space on the stack (#B01)
    sw x1, 0(sp)          # Save the return address on the stack
    sw a1, 4(sp)          # Save the number of disks (a1) on the stack
    sw a2, 8(sp)          # Save the source pole (a2) on the stack
    sw a3, 12(sp)         # Save the auxiliary pole (a3) on the stack
    sw a4, 16(sp)         # Save the destination pole (a4) on the stack
    addi t0, x0, 1        # Load 1 into t0 (used for comparison)
    beq a1, t0, return    # If there's only 1 disk, jump to return (#B02)
    lw a1, 4(sp)          # Load the number of disks from the stack
    addi a1, a1, -1       # Decrement the number of disks (a1)
    lbu a2, 8(sp)         # Load source pole from the stack
    lbu a3, 16(sp)        # Load destination pole from the stack
    lbu a4, 12(sp)        # Load auxiliary pole from the stack
    jal x1, hanoi         # Recursive call to hanoi (#B03)
    lw a1, 4(sp)          # Load the number of disks from the stack
    lbu a2, 8(sp)         # Load source pole from the stack
    lbu a3, 16(sp)        # Load destination pole from the stack
    jal x1, print         # Call the print function to display the move
    lw a1, 4(sp)          # Load the number of disks from the stack
    addi a1, a1, -1       # Decrement the number of disks (a1)
    lbu a2, 12(sp)        # Load auxiliary pole from the stack
    lbu a3, 8(sp)         # Load source pole from the stack
    lbu a4, 16(sp)        # Load destination pole from the stack
    jal x1, hanoi         # Recursive call to hanoi (#B04)
    lw x1, 0(sp)          # Load the return address from the stack
    addi sp, sp, 20       # Deallocate space on the stack
    jalr x0, x1, 0        # Return to the caller (#B05)

return:
    lw a1, 4(sp)          # Load the number of disks from the stack
    lbu a2, 8(sp)         # Load source pole from the stack
    lbu a3, 16(sp)        # Load destination pole from the stack
    jal x1, print         # Call the print function to display the move
    lw x1, 0(sp)          # Load the return address from the stack
    addi sp, sp, 20       # Deallocate space on the stack (#B06)
    jalr x0, x1, 0        # Return to the caller (#B07)

print:
    la a0, md             # Load the address of "Move Disk " into a0
    li a7, 4              # Load system call number for printing a string into a7
    ecall                 # Make the system call to print the string
    addi a0, a1, 0        # Load the disk number (a1) into a0
    li a7, 1              # Load system call number for printing an integer into a7
    ecall                 # Make the system call to print the integer
    la a0, from           # Load the address of " from '" into a0
    li a7, 4              # Load system call number for printing a string into a7
    ecall                 # Make the system call to print the string
    addi a0, a2, 0        # Load the source pole into a0
    li a7, 11             # Load system call number for printing a character into a7
    ecall                 # Make the system call to print the character
    la a0, to             # Load the address of "' to '" into a0
    li a7, 4              # Load system call number for printing a string into a7
    ecall                 # Make the system call to print the string
    addi a0, a3, 0        # Load the destination pole into a0
    li a7, 11             # Load system call number for printing a character into a7
    ecall                 # Make the system call to print the character
    la a0, newline        # Load the address of "'\n" into a0
    li a7, 4              # Load system call number for printing a string into a7
    ecall                 # Make the system call to print the string
    jalr x0, x1, 0        # Return to the caller (#B09)

Complete the program by filling in sections marked with #B:

  • Use only:
    • Decimal integer literals
    • RISC-V instructions
  • No pseudoinstructions allowed

B01 = -20
B02 = beq
B03 = jal
B04 = jal x1, hanoi
B05 = jalr
B06 = 20
B07 = jalr
B08 = 20
B09 = jalr
B10 = jal
B11 = addi a7, x0


Problem C

Building on Problem B, your task is to implement the Tower of Hanoi puzzle iteratively. Gray codes provide a Hamiltonian path on an n-dimensional cube by treating each bit as a coordinate. The reflected Gray code is the most widely used variant and is illustrated below.

  • 1-Bit Gray Code: 0, 1.
  • 2-Bit Gray Code: Mirror the 1-bit code and prefix with 0 and 1, resulting in 00, 01, 11, 10.
  • n-Bit Gray Code: Recursively mirror the (n-1)-bit Gray code, prefix with 0 and 1, then concatenate.

gray code

It key property: Consecutive Gray codes differ by only one bit flip.

Determining the Changed Bit:

  1. Even Number of 1s: Flip the last bit.
  2. Odd Number of 1s: Flip the bit immediately to the left of the rightmost 1.

Both Gray code generation and solving Tower of Hanoi follow a recursive pattern:

  1. Recursive Structure:
    • Tower of Hanoi: Solve for n-1 disks, move the largest disk, then solve for n-1 disks again.
    • Gray Code Generation: Mirror the (n-1)-bit Gray code, prefix with 0 and 1, then concatenate.
  2. Pattern Formation: Both processes create a recursive "ABACABADABACABA" pattern.
  3. Iterative Move Instructions Using Gray Code:
    • Disk Labeling: Assign the smallest disk as 'A', the next as 'B', etc.
    • Move Determination: Each bit flip in the Gray code corresponds to moving the respective disk.
    • Disambiguation for Disk 'A': Always move disk 'A' in a consistent direction (e.g., Peg 1 → Peg 2 → Peg 3 → Peg 1 → ).

This sequence ensures only one disk moves at a time and that no larger disk is placed atop a smaller one, adhering to Tower of Hanoi rules.

__builtin_popcount() is a GCC/Clang compiler intrinsic function designed to count the number of set bits (1s) in an unsigned integer. Essentially, it calculates how many 1’s are present in the binary representation of a given positive integer.

__builtin_popcount(unsigned int number);

number is an unsigned or positive integer whose set bits you want to count. The function returns an integer representing the total number of set bits in the provided number.

Example usage of __builtin_popcount

  • Input: 5
  • Binary Representation of 5 is 101
  • Output: 2

    The binary form of the number 5 is 101, which contains two set bits (the first and third bits are 1).

/* Iterative Tower of Hanoi Using Gray Code */

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static void print_move(int disk, char from, char to)
{
    printf("Move Disk %d from '%c' to '%c'\n", disk, from, to);
}

int main()
{
    int n_disks = 3; /* Number of disks */

    int total_moves = (1 << n_disks) - 1;
    const char pegs[3] = {'A', 'B', 'C'}; /* Peg labels */
    int pos[n_disks]; /* Disk positions: 0-A, 1-B, 2-C */

    /* Initialize all disks to peg A */
    for (int i = 0; i < n_disks; i++)
        pos[i] = 0;

    /* Set direction based on parity of number of disks */
    int direction = (n_disks & 1) ? -1 /* counter-clockwise */
                                    : 1; /* clockwise */

    /* Predefined packed mapping arrays */
    const uint8_t peg_map[3] = {
        0x9, /* Peg A: {CCW -> C (2), CW -> B (1)} */ // #C01
        0x2, /* Peg B: {CCW -> A (0), CW -> C (2)} */ // #C02
        0x4  /* Peg C: {CCW -> B (1), CW -> A (0)} */ // #C03
    };

    /* Calculate direction index: -1 -> 0 (CCW), 1 ->1 (CW) */
    int direction_index = (direction + 1) / 2;

    for (int n_moves = 1; n_moves <= total_moves; n_moves++) {
        int curr_gray = n_moves ^ (n_moves >> 1);
        int prev_gray = (n_moves - 1) ^ ((n_moves - 1) >> 1);
        int changed_bit = curr_gray ^ prev_gray;

        /* Identify the disk to move (0-indexed) */
        int disk = __builtin_popcount(changed_bit - 1); // #C04

        /* Current peg of the disk */
        int curr_peg = pos[disk];
        int new_peg;

        if (disk == 0) {
            /* Calculate shift: direction_index=0 (CCW) -> shift2,
             * direction_index=1 (CW) -> shift0
             */
            int shift = (1 - direction_index) << 1;
            new_peg = (peg_map[curr_peg] >> shift) & 0x3; // #C05
        } else {
            /* Find the only peg not occupied by any smaller disk */
            bool found_new_peg = false;
            for (int p = 0; p < 3 && !found_new_peg; p++) {
                if (p == curr_peg)
                    continue;

                /* Check if any smaller disk is on peg p */
                bool has_smaller = false;
                for (int d = 0; d < disk; d++) {
                    if (pos[d] == p) {
                        has_smaller = true;
                        break;
                    }
                }

                if (!has_smaller) {
                    new_peg = p;
                    found_new_peg = true;
                }
            }
        }

        /* Execute the move */
        print_move(disk + 1, pegs[curr_peg], pegs[new_peg]);
        pos[disk] = new_peg;
    }

    return 0;
}

Reference output:

Move Disk 1 from 'A' to 'C'
Move Disk 2 from 'A' to 'B'
Move Disk 1 from 'C' to 'B'
Move Disk 3 from 'A' to 'C'
Move Disk 1 from 'B' to 'A'
Move Disk 2 from 'B' to 'C'
Move Disk 1 from 'A' to 'C'

C01 = 0x9
C02 = 0x2
C03 = 0x4
C04 = changed_bit-1
C05 = 0x3


Problem D

You are tasked with implementing Ancient Egyptian Multiplication, a method that multiplies two numbers by systematically halving one number and doubling the other, thereby avoiding the use of direct multiplication. This technique involves repeatedly dividing one number by two and multiplying the other by two. After each halving and doubling step, you identify and sum the doubled values corresponding to the halved numbers that are odd.

nm={n22mif n is even,n122m+mif n is odd,mif n=1.

.data
    # Define the data section with two numbers to multiply
    num1: .word 13
    num2: .word 7
    result: .word 0  # Placeholder for the final result

.text
    # Begin the main code in the text section
    la x1, result       # Load the address of the result variable into register x1
    lw t0, num1         # Load the first number (num1) into register t0
    lw t1, num2         # Load the second number (num2) into register t1
    li t2, 0            # Initialize the result (t2) to 0

loop:                                         
    # Check if the least significant bit of t0 (num1) is 1 (i.e., if the number is odd)
    andi t3, t0, 1
    beq t3, x0, skip_add  # If the bit is 0 (even), skip the addition
    # If the number is odd, add the value in t1 (num2) to the result in t2
    add t2, t2, t1 # D01

skip_add:
    # Perform a right shift on t0 (num1), effectively dividing it by 2
    srli t0, t0, 1 # D02
    # Perform a left shift on t1 (num2), effectively multiplying it by 2
    slli t1, t1, 1 # D03
    # If t0 (num1) is not zero, repeat the loop
    bnez t0, loop

    # Store the final result in the memory location pointed by x1
    sw t2, 0(x1)

D01 = add t2, t2, t1
D02 = srli
D03 = slli
D04 = sw t2, 0(x1) OR equivalent