Quiz2 of Computer Architecture (2023 Fall)

Solutions

Problem `A`

Consider that we plan to deploy a custom RISC-V processor for a space mission, which requires additional data protection in memory. We have decided to implement a Hamming code with even parity to safeguard our data. For this question, please refer to the parity table shown below.

Image Not Showing Possible Reasons

The image was uploaded to a note which you don't have access to
The note which the image was originally uploaded to has been deleted

Learn More →

Please note that in this context, bit 0 is considered the least significant bit (LSB).

See also: What Are Bit Flips And How Are Spacecraft Protected From Them?

Part 1

Implement a RISC-V function called calc_parity that takes a 4-byte input in a0 and calculates the parity of its bits. It should return 1 for odd parity or 0 for even parity. For example:

Image Not Showing Possible Reasons

The image was uploaded to a note which you don't have access to
The note which the image was originally uploaded to has been deleted

Learn More →

calc_parity must adhere to the RISC-V calling convention.

calc_parity:
    li t1, 0         # t1 holds current parity value in LSB
loop:
    xor t1, t1, a0   # adds the current least significant bit
                     # in a0 to the parity value in t1
    __A01__          # shift to the next bit in a0
    bne a0, x0, loop # loop if there are more than 1 bit left in a0
                     # 0 bits will never affect parity
    __A02__          # we only want the one parity bit
                     # in bit 0 of t1
    jr ra

A01 = ?
srli a0, a0, 1
It is important to note that this is a logical shift, as using an arithmetic shift could potentially lead to an infinite loop if bit 31 is set.

A02 = ?
andi a0, t1, 1
We only need to retain one bit for identifying the parity, so we directly perform andi a0, t1, 1 and store it in register a0 as the function return value。

Part 2

Implement store_nibble, a RISC-V function that takes four bits of data in a0, encodes them into seven bits (as a byte with 0 in the most significant bit [MSB]), and stores the result at the memory address specified in a1. This function does not return any value. You may assume that calc_parity has been correctly implemented and adheres to the calling convention, but do not assume any specific implementation details of calc_parity. Also, you may assume that the most significant 28 bits of the argument in a0 are set to 0.

store_nibble:
    # prologue omitted
    mv s0, a0
    mv s1, a1
    srli s7, s0, 1      # set d3 through d1
    slli s7, s7, 1      # make space for next bit
    andi a0, s0, 0b1110 # parity of d1, d2, and d3
    jal ra, calc_parity # compute p2
    add s7, s7, a0
    slli s7, s7, 1      # make space for next bit
    andi t0, s0, 1      # extract d0
    add s7, s7, t0
    slli s7, s7, 1      # make space for next bit
    andi a0, s0, 0b1101 # parity of d0, d2, and d3
    jal ra, calc_parity # compute p1
    add s7, s7, a0
    slli s7, s7, 1      # make space for next bit
    andi a0, s0, 0b1011 # parity of d0, d1, and d3
    jal ra, calc_parity # compute p0
    add s7, s7, __A03__
    sb __A04__, __A05__ # s1 contains the memory address passed in as a1
    # epilogue omitted
    jr ra

A03 = ?
a0

A04 = ?
s7

A05 = ?
0(s1)

Part 3

List all registers that need to be saved in the prologue and restored in the epilogue in order for store_nibble to follow the calling convention. If there are none, write N/A. __ A06 __

A06 = ?
s0, s1, s7, ra
All of these registers are used in the question template and must, therefore, be saved and restored to follow the calling convention. (While ra is not strictly a saved register, store_nibble is calling functions and, therefore, needs to save ra as well.) Any additional saved registers used also need to be saved and restored.
sp is a special case; even though it is callee-saved, it is not usually saved and restored from the stack. Thus, it was ignored.

Problem `B`

Suppose you are tasked with writing a file system in C. The struct file_item represents either a file or a directory. The data union can hold either the contents of a file (as a string) or an array of pointers to child file_items. For the purposes of this question, assume that pointers are 4 bytes in length.

#include <stdbool.h>
typedef struct file_item {
    char *name;
    bool is_directory;
    file_item_data data;
} file_item;

typedef union file_item_data {
    char contents[X];
    struct file_item *children[16];
} file_item_data;

/* Copies all characters from src to dest including the NULL terminator */
char *strcpy(char *dest, const char *src);

Part 1

We set X to the maximum value that does not increase the union size. What is the maximum strlen of a string we can store in a single file? __ B01 __

B01 = ?
63
The size of a union is determined by its largest element, disregarding alignment rules, which we will not consider here. In a 32-bit system, pointers are 32 bits (4 bytes). children is an array of 16 pointers, occupying
$16 \times 4 = 64$ bytes. Therefore, we must ensure that char contents[X] does not exceed 64 bytes. Given that, since 1 byte is defined as the size of a char by default, setting X = 64 fulfills our goal. Consequently, strlen(contents) would return 63 because it does not count the null terminator in the string length.

Part 2

Write code to create files and directories. Ensure that your code still works even if the input strings are freed later on. Assume that the input strings will fit inside a file_item_data union.

/* Creates a file with the given name and contents,
 * and returns a pointer to that file.
 */
file_item *create_file(char *contents, const char *name)
{
    file_item *new_file = calloc(1, sizeof(file_item));
    new_file->name = name;
    B02 /* Fill your code here */ ;
    return new_file;
}

/* Creates a directory with the given name and no children,
 * and returns a pointer to that directory.
 */
file_item *create_directory(const char *name)
{
    file_item *new_dir = calloc(1, sizeof(file_item));
    new_dir->name = name;
    B03 /* Fill your code here */ ;
    return new_dir;
}

Your solution should be fully correct, without any memory leaks, and should not include any additional lines.

B02 = ?
strcpy(new_file->data.contents, contents) (or equivalent statement)

B03 = ?
new_dir->is_directory = true (or equivalent statement)

Problem `C`

Consider a 16-bit fixed-point system with 1 sign bit, 5 integer bits, and 10 fraction bits. The 10 fraction bits continue where the integer bits left off, representing

2^{- 1}

2^{- 2}

, … ,

2^{- 10}

. For example, the bit representation 0b0 01101 1010000000
represents

(2^{3} + 2^{2} + 2^{0}) + (2^{- 1} + 2^{- 3}) = 13.625

In this question, we will compare this fixed-point system to a 16-bit floating-point system that adheres to all IEEE-754 conventions, including denorms, NaNs, etc., with a 5-bit exponent and an exponent bias of -15.

Part 1

Write

- 22.375

in hexadecimal using the 16-bit fixed-point system described above. __ C01 __

C01 = ?
0xD980
The sign bit is 0b1, as the number is negative.
The integer bits are obtained by converting 22 to unsigned binary, giving us 0b10110.
The fractional bits are obtained by converting 0.375 to unsigned binary, resulting in 0b0.0110000000, with the fractional part being 0b0110000000.
In total, we have 0b1101100110000000, which is equivalent to 0xD980 in hexadecimal.

Write

- 22.375

in hexadecimal using the 16-bit floating point system system described above. __ C02 __

C02 = ?
0xCD98
The floating point system in this question has 16 bits total, including 1 sign bit and 5 exponent
bits. That leaves 10 bits for the mantissa.
First, write 22.375 in binary: 0b1 0110.011
Then, move the binary point to create a number of the form 0b1.MM MMMM MMMM ×
$2^{E}$ .
This gives us 0b1.01 1001 1000 ×
$2^{4}$ .
The mantissa bits can be read directly from this form: 0b01 1001 1000. Remember to drop the implicit 1 to the left of the binary point.
The exponent is −4, which we can convert to bias notation by subtracting the bias:
$4 - (- 15) = 19$ . In unsigned 5-bit binary, this is 0b10011.
The sign bit is 0b1, since the number is negative.
In total, we have 0b1 10011 01 1001 1000, which is 0xCD98 in hex.

Part 2

How many numbers in the range

[16, 64)

(including 16, excluding 64) can be represented by the fixed-point system described above? __ C03 __

C03 = ?

$2^{14}$
16 in the fixed point system is 0b10000.0000000000. The largest representable number in the fixed point system is 0b11111.1111111111, which is
$2^{5} - 2 - 10$ , which is just under 32.
In between these two numbers, we can choose every bit except the most-significant integer bit to be 0 or 1. That gives us 4 integer bits and 10 fractional bits that could all be 0 or 1. Each bit pattern represents a different number, so we have
$2^{14}$ representable numbers in total.

Problem `D`

Part 1

Suppose we have the following program.

int remainder(int a, int b)
{
    while (a >= b)
        a -= b;
    return a;
}

For a RISC architecture (RV32 in particular), our RISC-V compiler generates the following assembly code.

loop:
    blt x1, x2, done
    sub x1, x1, x2
    j loop
done:
    ...

Assume that RISC-V instructions are 4 bytes each. Assume data values are 4 bytes each.

How many bytes is the RISC-V program? __ D01 __

D01 = ?
12
There are three instructions in this RISC-V program: blt, sub, and j. Each instruction occupies 4 bytes in memory, so the total size of the program is
$4 \times 3 = 12$ bytes. Therefore, the answer for D01 is 12 bytes.

Assume a=7 and b=3. How many data bytes are loaded/stored in the RISC program? __ D02 __

D02 = ?
0
For this RISC-V code snippet above, there are no explicit "loaded" or "stored" actions observed.

Part 2

Suppose we have a simple integer vector-vector addition implementation below.

# for (i = 0; i < N; i++)
#     c[i] = a[i] + b[i]
Loop:
    lw x2, 0(x1)     # load a[i]
    lw x4, 0(x3)     # load b[i]
    add x5, x2, x4   # c[i] = a[i] + b[i]
    __D03__          # store c[i]
    addi x1, x1, 4   # bump pointer
    addi x3, x3, 4   # bump pointer
    addi x6, x6, 4   # bump pointer
    addi x7, x7, 1   # i++
    __D04__          # x8 holds N

Complete the above code to make the RISC-V assembly function as described in the comment.

D03 = ?
sw x5, 0(x6)
Based on the context provided, where register x5 already holds the result of an addition, and the subsequent instruction addi x6, x6, 4 # bump pointer indicates that x6 is the pointer to the address of array c, it is appropriate to store the computation result from x5 into the address held by x6. Therefore, the instruction should be sw x5, 0(x6).

D04 = ?
bne x7, x8, Loop
According to the adjacent comments on D04 and above, we can infer that x7 represents the variable i and x8 represents the variable N in the for loop. Thus, when x7 equals x8, the loop terminates. Therefore, the instruction used for branching and conditional checking is bne x7, x8, Loop.

Next, consider a compiler that unrolls loops and reschedules instructions, assuming the loop operates on an even number of elements in the vectors. Shown below.














Loop:
    lw x2, 0(x1)      # load a[i]
    addi x1, x1, 8    # bump pointer a
    lw x4, 0(x3)      # load b[i]
    addi x3, x3, 8    # bump pointer b
    __D05__           # load a[i+1]
    add x5, x2, x4    # c[i] = a[i] + b[i]
    __D06__           # load b[i+1]
    addi x7, x7, 2    # i+=2
    __D07__           # store c[i]
    add x10, x8, x9  # c[i+1] = a[i+1] + b[i+1]
    __D08__           # store c[i+1]
    __D09__           # bump pointer c
    __D10__           # x11 holds N

D05 = ?
lw x8, -4(x1)

D06 = ?
lw x9, -4(x3)

D07 = ?
sw x5, 0(x6)

D08 = ?
sw x10, 4(x6)

D09 = ?
add x6, x6, 8

D10 = ?
bne x7, x11, Loop

Quiz2 of Computer Architecture (2023 Fall)

Problem A

Problem B

Problem C

Problem D

Problem `A`

Problem `B`

Problem `C`

Problem `D`