---
tags: computer-arch
---
# Quiz2 of Computer Architecture (2023 Fall)
> Solutions
## Problem `A`
Consider that we plan to deploy a custom RISC-V processor for a space mission, which requires additional data protection in memory. We have decided to implement a [Hamming code](https://en.wikipedia.org/wiki/Hamming_code) with even parity to safeguard our data. For this question, please refer to the parity table shown below.
![](https://hackmd.io/_uploads/S1JEywJxT.png)
Please note that in this context, bit 0 is considered the least significant bit (LSB).
> See also: [What Are Bit Flips And How Are Spacecraft Protected From Them?](https://www.scienceabc.com/innovation/what-are-bit-flips-and-how-are-spacecraft-protected-from-them.html)
- [ ] Part 1
Implement a RISC-V function called `calc_parity` that takes a 4-byte input in `a0` and calculates the parity of its bits. It should return 1 for odd parity or 0 for even parity. For example:
![](https://hackmd.io/_uploads/SyMvxP1lp.png)
`calc_parity` must adhere to the [RISC-V calling convention](https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf).
```c
calc_parity:
li t1, 0 # t1 holds current parity value in LSB
loop:
xor t1, t1, a0 # adds the current least significant bit
# in a0 to the parity value in t1
__A01__ # shift to the next bit in a0
bne a0, x0, loop # loop if there are more than 1 bit left in a0
# 0 bits will never affect parity
__A02__ # we only want the one parity bit
# in bit 0 of t1
jr ra
```
> * A01 = ?
> ==`srli a0, a0, 1`==
> It is important to note that this is a logical shift, as using an arithmetic shift could potentially lead to an infinite loop if bit 31 is set.
> * A02 = ?
> ==`andi a0, t1, 1`==
> We only need to retain one bit for identifying the parity, so we directly perform `andi a0, t1, 1` and store it in register a0 as the function return value。
- [ ] Part 2
Implement `store_nibble`, a RISC-V function that takes four bits of data in a0, encodes them into seven bits (as a byte with 0 in the most significant bit [MSB]), and stores the result at the memory address specified in a1. This function does not return any value. You may assume that `calc_parity` has been correctly implemented and adheres to the calling convention, but do not assume any specific implementation details of `calc_parity`. Also, you may assume that the most significant 28 bits of the argument in a0 are set to 0.
```c
store_nibble:
# prologue omitted
mv s0, a0
mv s1, a1
srli s7, s0, 1 # set d3 through d1
slli s7, s7, 1 # make space for next bit
andi a0, s0, 0b1110 # parity of d1, d2, and d3
jal ra, calc_parity # compute p2
add s7, s7, a0
slli s7, s7, 1 # make space for next bit
andi t0, s0, 1 # extract d0
add s7, s7, t0
slli s7, s7, 1 # make space for next bit
andi a0, s0, 0b1101 # parity of d0, d2, and d3
jal ra, calc_parity # compute p1
add s7, s7, a0
slli s7, s7, 1 # make space for next bit
andi a0, s0, 0b1011 # parity of d0, d1, and d3
jal ra, calc_parity # compute p0
add s7, s7, __A03__
sb __A04__, __A05__ # s1 contains the memory address passed in as a1
# epilogue omitted
jr ra
```
> * A03 = ?
> ==a0==
> * A04 = ?
> ==s7==
> * A05 = ?
> ==`0(s1)`==
- [ ] Part 3
List all registers that need to be saved in the prologue and restored in the epilogue in order for `store_nibble` to follow the calling convention. If there are none, write `N/A`. __ A06 __
> * A06 = ?
> ==s0, s1, s7, ra==
> All of these registers are used in the question template and must, therefore, be saved and restored to follow the calling convention. (While `ra` is not strictly a saved register, `store_nibble` is calling functions and, therefore, needs to save `ra` as well.) Any additional saved registers used also need to be saved and restored.
> `sp` is a special case; even though it is callee-saved, it is not usually saved and restored from the stack. Thus, it was ignored.
---
## Problem `B`
Suppose you are tasked with writing a file system in C. The `struct file_item` represents either a file or a directory. The `data` union can hold either the contents of a file (as a string) or an array of pointers to child `file_items`. For the purposes of this question, assume that pointers are 4 bytes in length.
```c
#include <stdbool.h>
typedef struct file_item {
char *name;
bool is_directory;
file_item_data data;
} file_item;
typedef union file_item_data {
char contents[X];
struct file_item *children[16];
} file_item_data;
/* Copies all characters from src to dest including the NULL terminator */
char *strcpy(char *dest, const char *src);
```
- [ ] Part 1
We set `X` to the maximum value that does not increase the union size. What is the maximum `strlen` of a string we can store in a single file? __ B01 __
> * B01 = ?
> ==63==
> The size of a union is determined by its largest element, disregarding alignment rules, which we will not consider here. In a 32-bit system, pointers are 32 bits (4 bytes). `children` is an array of 16 pointers, occupying $16 \times 4 = 64$ bytes. Therefore, we must ensure that `char contents[X]` does not exceed 64 bytes. Given that, since 1 byte is defined as the size of a `char` by default, setting X = 64 fulfills our goal. Consequently, `strlen(contents)` would return 63 because it does not count the null terminator in the string length.
- [ ] Part 2
Write code to create files and directories. Ensure that your code still works even if the input strings are freed later on. Assume that the input strings will fit inside a `file_item_data` union.
```c
/* Creates a file with the given name and contents,
* and returns a pointer to that file.
*/
file_item *create_file(char *contents, const char *name)
{
file_item *new_file = calloc(1, sizeof(file_item));
new_file->name = name;
B02 /* Fill your code here */ ;
return new_file;
}
/* Creates a directory with the given name and no children,
* and returns a pointer to that directory.
*/
file_item *create_directory(const char *name)
{
file_item *new_dir = calloc(1, sizeof(file_item));
new_dir->name = name;
B03 /* Fill your code here */ ;
return new_dir;
}
```
Your solution should be fully correct, without any memory leaks, and should not include any additional lines.
> * B02 = ?
> ==`strcpy(new_file->data.contents, contents)`== (or equivalent statement)
> * B03 = ?
> ==`new_dir->is_directory = true`== (or equivalent statement)
---
## Problem `C`
Consider a 16-bit [fixed-point system](https://en.wikipedia.org/wiki/Fixed-point_arithmetic) with 1 sign bit, 5 integer bits, and 10 fraction bits. The 10 fraction bits continue where the integer bits left off, representing $2^{−1}$, $2^{−2}$, ... , $2^{−10}$. For example, the bit representation 0b0 01101 1010000000
represents $(2^{3} + 2^{2} + 2^{0}) + (2^{−1} + 2^{−3}) = 13.625$.
In this question, we will compare this fixed-point system to a 16-bit floating-point system that adheres to all IEEE-754 conventions, including denorms, NaNs, etc., with a 5-bit exponent and an exponent bias of -15.
- [ ] Part 1
Write $-22.375$ in hexadecimal using the 16-bit **fixed-point** system described above. __ C01 __
> * C01 = ?
> ==0xD980==
> The sign bit is 0b1, as the number is negative.
> The integer bits are obtained by converting 22 to unsigned binary, giving us 0b10110.
> The fractional bits are obtained by converting 0.375 to unsigned binary, resulting in 0b0.0110000000, with the fractional part being 0b0110000000.
> In total, we have 0b1101100110000000, which is equivalent to 0xD980 in hexadecimal.
Write $−22.375$ in hexadecimal using the 16-bit **floating point** system system described above. __ C02 __
> * C02 = ?
> ==0xCD98==
> The floating point system in this question has 16 bits total, including 1 sign bit and 5 exponent
bits. That leaves 10 bits for the mantissa.
> First, write 22.375 in binary: 0b1 0110.011
> Then, move the binary point to create a number of the form 0b1.MM MMMM MMMM × $2^{E}$.
> This gives us 0b1.01 1001 1000 × $2^4$.
> The mantissa bits can be read directly from this form: 0b01 1001 1000. Remember to drop the implicit 1 to the left of the binary point.
> The exponent is −4, which we can convert to bias notation by subtracting the bias: $4 − (−15) = 19$. In unsigned 5-bit binary, this is 0b10011.
> The sign bit is 0b1, since the number is negative.
> In total, we have 0b1 10011 01 1001 1000, which is 0xCD98 in hex.
- [ ] Part 2
How many numbers in the range $[16, 64)$ (including 16, excluding 64) can be represented by the **fixed-point** system described above? __ C03 __
> * C03 = ?
> ==$2^{14}$==
> 16 in the fixed point system is 0b10000.0000000000. The largest representable number in the fixed point system is 0b11111.1111111111, which is $2^{5} − 2{−10}$, which is just under 32.
> In between these two numbers, we can choose every bit except the most-significant integer bit to be 0 or 1. That gives us 4 integer bits and 10 fractional bits that could all be 0 or 1. Each bit pattern represents a different number, so we have $2^{14}$ representable numbers in total.
---
## Problem `D`
- [ ] Part 1
Suppose we have the following program.
```c
int remainder(int a, int b)
{
while (a >= b)
a -= b;
return a;
}
```
For a RISC architecture (RV32 in particular), our RISC-V compiler generates the following assembly code.
```c
loop:
blt x1, x2, done
sub x1, x1, x2
j loop
done:
...
```
Assume that RISC-V instructions are 4 bytes each. Assume data values are 4 bytes each.
How many bytes is the RISC-V program? __ D01 __
> * D01 = ?
> ==12==
> There are three instructions in this RISC-V program: `blt`, `sub`, and `j`. Each instruction occupies 4 bytes in memory, so the total size of the program is $4 \times 3 = 12$ bytes. Therefore, the answer for D01 is 12 bytes.
Assume a=7 and b=3. How many data bytes are loaded/stored in the RISC program? __ D02 __
> * D02 = ?
> ==0==
> For this RISC-V code snippet above, there are no explicit "loaded" or "stored" actions observed.
- [ ] Part 2
Suppose we have a simple integer [vector-vector addition](https://mathworld.wolfram.com/VectorAddition.html) implementation below.
```c
# for (i = 0; i < N; i++)
# c[i] = a[i] + b[i]
Loop:
lw x2, 0(x1) # load a[i]
lw x4, 0(x3) # load b[i]
add x5, x2, x4 # c[i] = a[i] + b[i]
__D03__ # store c[i]
addi x1, x1, 4 # bump pointer
addi x3, x3, 4 # bump pointer
addi x6, x6, 4 # bump pointer
addi x7, x7, 1 # i++
__D04__ # x8 holds N
```
Complete the above code to make the RISC-V assembly function as described in the comment.
> * D03 = ?
> ==`sw x5, 0(x6)`==
> Based on the context provided, where register `x5` already holds the result of an addition, and the subsequent instruction `addi x6, x6, 4 # bump pointer` indicates that `x6` is the pointer to the address of array `c`, it is appropriate to store the computation result from `x5` into the address held by `x6`. Therefore, the instruction should be `sw x5, 0(x6)`.
> * D04 = ?
> ==`bne x7, x8, Loop`==
> According to the adjacent comments on D04 and above, we can infer that `x7` represents the variable `i` and `x8` represents the variable `N` in the for loop. Thus, when `x7` equals `x8`, the loop terminates. Therefore, the instruction used for branching and conditional checking is `bne x7, x8, Loop`.
Next, consider a compiler that [unrolls loops](https://en.wikipedia.org/wiki/Loop_unrolling) and reschedules instructions, assuming the loop operates on an even number of elements in the vectors. Shown below.
```c=
Loop:
lw x2, 0(x1) # load a[i]
addi x1, x1, 8 # bump pointer a
lw x4, 0(x3) # load b[i]
addi x3, x3, 8 # bump pointer b
__D05__ # load a[i+1]
add x5, x2, x4 # c[i] = a[i] + b[i]
__D06__ # load b[i+1]
addi x7, x7, 2 # i+=2
__D07__ # store c[i]
add x10, x8, x9 # c[i+1] = a[i+1] + b[i+1]
__D08__ # store c[i+1]
__D09__ # bump pointer c
__D10__ # x11 holds N
```
> * D05 = ?
> ==`lw x8, -4(x1)`==
> * D06 = ?
> ==`lw x9, -4(x3)`==
> * D07 = ?
> ==`sw x5, 0(x6)`==
> * D08 = ?
> ==`sw x10, 4(x6)`==
> * D09 = ?
> ==`add x6, x6, 8`==
> * D10 = ?
> ==`bne x7, x11, Loop`==