--- tags: computer-arch --- # Quiz2 of Computer Architecture (2023 Fall) > Solutions ## Problem `A` Consider that we plan to deploy a custom RISC-V processor for a space mission, which requires additional data protection in memory. We have decided to implement a [Hamming code](https://en.wikipedia.org/wiki/Hamming_code) with even parity to safeguard our data. For this question, please refer to the parity table shown below. ![](https://hackmd.io/_uploads/S1JEywJxT.png) Please note that in this context, bit 0 is considered the least significant bit (LSB). > See also: [What Are Bit Flips And How Are Spacecraft Protected From Them?](https://www.scienceabc.com/innovation/what-are-bit-flips-and-how-are-spacecraft-protected-from-them.html) - [ ] Part 1 Implement a RISC-V function called `calc_parity` that takes a 4-byte input in `a0` and calculates the parity of its bits. It should return 1 for odd parity or 0 for even parity. For example: ![](https://hackmd.io/_uploads/SyMvxP1lp.png) `calc_parity` must adhere to the [RISC-V calling convention](https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf). ```c calc_parity: li t1, 0 # t1 holds current parity value in LSB loop: xor t1, t1, a0 # adds the current least significant bit # in a0 to the parity value in t1 __A01__ # shift to the next bit in a0 bne a0, x0, loop # loop if there are more than 1 bit left in a0 # 0 bits will never affect parity __A02__ # we only want the one parity bit # in bit 0 of t1 jr ra ``` > * A01 = ? > ==`srli a0, a0, 1`== > It is important to note that this is a logical shift, as using an arithmetic shift could potentially lead to an infinite loop if bit 31 is set. > * A02 = ? > ==`andi a0, t1, 1`== - [ ] Part 2 Implement `store_nibble`, a RISC-V function that takes four bits of data in a0, encodes them into seven bits (as a byte with 0 in the most significant bit [MSB]), and stores the result at the memory address specified in a1. This function does not return any value. You may assume that `calc_parity` has been correctly implemented and adheres to the calling convention, but do not assume any specific implementation details of `calc_parity`. Also, you may assume that the most significant 28 bits of the argument in a0 are set to 0. ```c store_nibble: # prologue omitted mv s0, a0 mv s1, a1 srli s7, s0, 1 # set d3 through d1 slli s7, s7, 1 # make space for next bit andi a0, s0, 0b1110 # parity of d1, d2, and d3 jal ra, calc_parity # compute p2 add s7, s7, a0 slli s7, s7, 1 # make space for next bit andi t0, s0, 1 # extract d0 add s7, s7, t0 slli s7, s7, 1 # make space for next bit andi a0, s0, 0b1101 # parity of d0, d2, and d3 jal ra, calc_parity # compute p1 add s7, s7, a0 slli s7, s7, 1 # make space for next bit andi a0, s0, 0b1011 # parity of d0, d1, and d3 jal ra, calc_parity # compute p0 add s7, s7, __A03__ sb __A04__, __A05__ # s1 contains the memory address passed in as a1 # epilogue omitted jr ra ``` > * A03 = ? > ==a0== > * A04 = ? > ==s7== > * A05 = ? > ==`0(s1)`== - [ ] Part 3 List all registers that need to be saved in the prologue and restored in the epilogue in order for `store_nibble` to follow the calling convention. If there are none, write `N/A`. __ A06 __ > * A06 = ? > ==s0, s1, s7, ra== > All of these registers are used in the question template and must, therefore, be saved and restored to follow the calling convention. (While `ra` is not strictly a saved register, `store_nibble` is calling functions and, therefore, needs to save `ra` as well.) Any additional saved registers used also need to be saved and restored. > `sp` is a special case; even though it is callee-saved, it is not usually saved and restored from the stack. Thus, it was ignored. --- ## Problem `B` Suppose you are tasked with writing a file system in C. The `struct file_item` represents either a file or a directory. The `data` union can hold either the contents of a file (as a string) or an array of pointers to child `file_items`. For the purposes of this question, assume that pointers are 4 bytes in length. ```c #include <stdbool.h> typedef struct file_item { char *name; bool is_directory; file_item_data data; } file_item; typedef union file_item_data { char contents[X]; struct file_item *children[16]; } file_item_data; /* Copies all characters from src to dest including the NULL terminator */ char *strcpy(char *dest, const char *src); ``` - [ ] Part 1 We set `X` to the maximum value that does not increase the union size. What is the maximum `strlen` of a string we can store in a single file? __ B01 __ > * B01 = ? > ==63== > The size of a union is determined by its largest element, disregarding alignment rules, which we will not consider here. In a 32-bit system, pointers are 32 bits (4 bytes). `children` is an array of 16 pointers, occupying $16 \times 4 = 64$ bytes. Therefore, we must ensure that `char contents[X]` does not exceed 64 bytes. Given that, since 1 byte is defined as the size of a `char` by default, setting X = 64 fulfills our goal. Consequently, `strlen(contents)` would return 63 because it does not count the null terminator in the string length. - [ ] Part 2 Write code to create files and directories. Ensure that your code still works even if the input strings are freed later on. Assume that the input strings will fit inside a `file_item_data` union. ```c /* Creates a file with the given name and contents, * and returns a pointer to that file. */ file_item *create_file(char *contents, const char *name) { file_item *new_file = calloc(1, sizeof(file_item)); new_file->name = name; B02 /* Fill your code here */ ; return new_file; } /* Creates a directory with the given name and no children, * and returns a pointer to that directory. */ file_item *create_directory(const char *name) { file_item *new_dir = calloc(1, sizeof(file_item)); new_dir->name = name; B03 /* Fill your code here */ ; return new_dir; } ``` Your solution should be fully correct, without any memory leaks, and should not include any additional lines. > * B02 = ? > ==`strcpy(new_file->data.contents, contents)`== (or equivalent statement) > * B03 = ? > ==`new_dir->is_directory = true`== (or equivalent statement) --- ## Problem `C` Consider a 16-bit [fixed-point system](https://en.wikipedia.org/wiki/Fixed-point_arithmetic) with 1 sign bit, 5 integer bits, and 10 fraction bits. The 10 fraction bits continue where the integer bits left off, representing $2^{−1}$, $2^{−2}$, ... , $2^{−10}$. For example, the bit representation 0b0 01101 1010000000 represents $(2^{3} + 2^{2} + 2^{0}) + (2^{−1} + 2^{−3}) = 13.625$. In this question, we will compare this fixed-point system to a 16-bit floating-point system that adheres to all IEEE-754 conventions, including denorms, NaNs, etc., with a 5-bit exponent and an exponent bias of -15. - [ ] Part 1 Write $-22.375$ in hexadecimal using the 16-bit **fixed-point** system described above. __ C01 __ > * C01 = ? > ==0xD980== > The sign bit is 0b1, as the number is negative. > The integer bits are obtained by converting 22 to unsigned binary, giving us 0b10110. > The fractional bits are obtained by converting 0.375 to unsigned binary, resulting in 0b0.0110000000, with the fractional part being 0b0110000000. > In total, we have 0b1101100110000000, which is equivalent to 0xD980 in hexadecimal. Write $−22.375$ in hexadecimal using the 16-bit **floating point** system system described above. __ C02 __ > * C02 = ? > ==0xCD98== > The floating point system in this question has 16 bits total, including 1 sign bit and 5 exponent bits. That leaves 10 bits for the mantissa. > First, write 22.375 in binary: 0b1 0110.011 > Then, move the binary point to create a number of the form 0b1.MM MMMM MMMM × $2^{E}$. > This gives us 0b1.01 1001 1000 × $2^4$. > The mantissa bits can be read directly from this form: 0b01 1001 1000. Remember to drop the implicit 1 to the left of the binary point. > The exponent is −4, which we can convert to bias notation by subtracting the bias: $4 − (−15) = 19$. In unsigned 5-bit binary, this is 0b10011. > The sign bit is 0b1, since the number is negative. > In total, we have 0b1 10011 01 1001 1000, which is 0xCD98 in hex. - [ ] Part 2 How many numbers in the range $[16, 64)$ (including 16, excluding 64) can be represented by the **fixed-point** system described above? __ C03 __ > * C03 = ? > ==$2^{14}$== > 16 in the fixed point system is 0b10000.0000000000. The largest representable number in the fixed point system is 0b11111.1111111111, which is $2^{5} − 2{−10}$, which is just under 32. > In between these two numbers, we can choose every bit except the most-significant integer bit to be 0 or 1. That gives us 4 integer bits and 10 fractional bits that could all be 0 or 1. Each bit pattern represents a different number, so we have $2^{14}$ representable numbers in total. --- ## Problem `D` - [ ] Part 1 Suppose we have the following program. ```c int remainder(int a, int b) { while (a >= b) a -= b; return a; } ``` For a RISC architecture (RV32 in particular), our RISC-V compiler generates the following assembly code. ```c loop: blt x1, x2, done sub x1, x1, x2 j loop done: ... ``` Assume that RISC-V instructions are 4 bytes each. Assume data values are 4 bytes each. How many bytes is the RISC-V program? __ D01 __ > * D01 = ? > ==12== Assume a=7 and b=3. How many data bytes are loaded/stored in the RISC program? __ D02 __ > * D02 = ? > ==0== - [ ] Part 2 Suppose we have a simple integer [vector-vector addition](https://mathworld.wolfram.com/VectorAddition.html) implementation below. ```c # for (i = 0; i < N; i++) # c[i] = a[i] + b[i] Loop: lw x2, 0(x1) # load a[i] lw x4, 0(x3) # load b[i] add x5, x2, x4 # c[i] = a[i] + b[i] __D03__ # store c[i] addi x1, x1, 4 # bump pointer addi x3, x3, 4 # bump pointer addi x6, x6, 4 # bump pointer addi x7, x7, 1 # i++ __D04__ # x8 holds N ``` Complete the above code to make the RISC-V assembly function as described in the comment. > * D03 = ? > ==`sw x5, 0(x6)`== > * D04 = ? > ==`bne x7, x8, Loop`== Next, consider a compiler that [unrolls loops](https://en.wikipedia.org/wiki/Loop_unrolling) and reschedules instructions, assuming the loop operates on an even number of elements in the vectors. Shown below. ```c= Loop: lw x2, 0(x1) # load a[i] addi x1, x1, 8 # bump pointer a lw x4, 0(x3) # load b[i] addi x3, x3, 8 # bump pointer b __D05__ # load a[i+1] add x5, x2, x4 # c[i] = a[i] + b[i] __D06__ # load b[i+1] addi x7, x7, 2 # i+=2 __D07__ # store c[i] add x10, x8, x9 # c[i+1] = a[i+1] + b[i+1] __D08__ # store c[i+1] __D09__ # bump pointer c __D10__ # x11 holds N ``` > * D05 = ? > ==`lw x8, -4(x1)`== > * D06 = ? > ==`lw x9, -4(x3)`== > * D07 = ? > ==`sw x5, 0(x6)`== > * D08 = ? > ==`sw x10, 4(x6)`== > * D09 = ? > ==`add x6, x6, 8`== > * D10 = ? > ==`bne x7, x11, Loop`==