---
tags: computer-arch
---
# Quiz2 of Computer Architecture (2021 Fall)
:::info
:information_source: General Information
* You are allowed to read **[lecture materials](http://wiki.csie.ncku.edu.tw/arch/schedule)**.
* That is, an open book exam.
* You shall not disclose your answer during the quiz.
* Each answer has `3` points.
* For RISC-V assembly, you CANNOT write any pseudo instructions.
* Of course, you should answer everything **in English** except your formal name.
* :timer_clock: 09:10 ~ 10:20AM on Oct 26, 2021
:::
## Problem `A`
We would like to implement an operation to reverse a [linked list](https://en.wikipedia.org/wiki/Linked_list) in memory, where the operand to this operation is the memory address of a pointer to the first node in the linked list (a pointer to a pointer). This operation has no destination register, but the operation zeroes the register specified by the operand upon completion (it does not preserve the operand).
Every node in the linked list has the following structure.
```cpp
struct node {
struct node *next;
void *value;
}
```
Assume that pointers are 32 bits wide in this architecture. The next pointer is either the memory address of the next node in the list or is equal to 0 (`NULL`) to indicate the end of the linked list.
The equivalent C code for this operation is provided below.
```cpp
void reverse(struct node **head)
{
struct node *prev = NULL, *curr = *head;
while (curr) {
struct node *next = curr->next;
curr->next = prev;
prev = curr;
curr = next;
}
}
```
The RISC-V assembly implementation is shown as following:
```cpp
# head is passed in a0
# t0 holds prev
# t1 holds next
lw a0, 0(a0)
beqz __A01__
addi t0, t0, 0
loop:
lw t1, 0(a0)
sw t0, 0(a0)
addi t0, a0, 0
addi a0, t1, 0
__A02__
done:
```
Please complete the unfinished assembly to make the above work as the C counterpart does. You don't have to manipulate the RISC-V function calls for this operation.
> * A01 = ?
> * A02 = ?
---
## Problem `B`
Consider the following C code:
```cpp
unsigned multiply(unsigned x, unsigned y) {
unsigned ret = 0;
for (; y != 0; y--)
ret += x;
return ret;
}
```
We can translate into the optimized RISC-V assembly, so that fewer dynamic instructions are executed for large values of `x` and `y`.
A simple optimization is [loop unrolling](https://en.wikipedia.org/wiki/Loop_unrolling). With some careful bookkeeping, this can reduce the number of additions. This implementation keeps `x + x` as a temporary value to cut the number of additions approximately in half.
The following is more efficient than
simply translating from C.
```cpp
addi x3, x0, 0 # result = 0
andi __B01__ # x? = y – (y % 2)
add x7, x1, x1 # x7 = x + x
beq x3, x0, done # y is 0
beq __B02__ # skip ahead if y is even
add x3, x0, x1 # set result to x if y is odd
loop: add __B03__ # result = result + 2x
addi x6, x6, -2 # x6 = x6 - 2
bge x6, x0, loop # repeat if x6 != 0
done:
```
> * B01 = ?
> * B02 = ?
> * B03 = ?
---
## Problem `C`
The function `strtriple` receives as input a null-terminated string, and writes a string where every letter is repeated three times. For example, if we called stringtriple on the input string ``"Hello World!"``, we would get the output string ``"HHHeeellllllooo WWWooorrrlllddd!!!"``.
More specifically: `strtriple` has the following inputs: `a0` stores a null-terminated string. It is guaranteed that the string is well-formatted. Let the string have strlen of `n`. `a1` contains a pointer to a buffer of length at least $3n + 1$ characters, potentially containing garbage data. You may assume that the buffer does not overlap with the string stored in `a0`.
`strtriple` outputs the following: `a0` should return the pointer to the buffer we initially received in `a1`. The buffer should contain the string originally provided in `a0`, with every character tripled. You must properly null-terminate the string, and for full credit must NOT replicate the null terminator at the end. For full credit, you must follow proper calling convention.
```cpp
strtriple:
mv t2, a1
loop:
lbu __C01__
beq t0, x0, end
slli __C02__
add t1, t1, t0
sh t1, 0(a1)
sb __C03__
addi a0, a0, 1
__C04__
j loop
end:
sb x0, 0(a1)
__C05__
jr ra
```
Please complete the unfinished assembly to make the above work as expected.
> * C01 = ?
> * C02 = ?
> * C03 = ?
> * C04 = ?
> * C05 = ?
---
## Problem `D`
Consider the following [Skip List](https://en.wikipedia.org/wiki/Skip_list) implementation:
```cpp
struct SkipListNode {
void *data;
struct SkipListNode **next;
}
```
Its `find` operation is shown below:
```cpp
int SL_find(void *find, int level,
struct SkipListNode *sl,
(int)(*cmp)(void *, void *) {
if (cmp(find, sl->data) == 0) return 1;
if (level == 0 && sl->next[level] == NULL) return 0;
if (sl->next[level] == NULL)
return SL_find(find, level - 1, sl, cmp);
if (cmp(find, sl->next[level]->data)) > 0))
return SL_find(find, level - 1, sl, cmp);
return SL_find(find, level, sl->next[level], cmp);
}
```
In translating the code `cmp(find, sl->next[level]->data))` into RISC-V, we need to store arguments on the stack. So we will have `find` at `sp(0)`, `level` at `sp(4)`, `sl` at `sp(8)` and `cmp` at `sp(12)`. We are going to break up the translation into pieces. Please write down the RISC-V assembly accordingly.
1. Load find into `a0`
> * D01 = ?
2. The next thing we need to do is get `sl->next[level]->data` into `a1`.
The RISC-V code for that would be: (Hint: Load `sl` into `t0`)
> * D02 = ?
3. Load `level` into `t1`
> * D03 = ?
4. load `sl->next` into `t0`
> * D04 = ?
5. load `sl->next[level]` into `t0`
```cpp
slli __D05__
add t0, t0, t1
lw t0, t0(4)
```
> * D05 = ?
6. load data into `a0`
> * D06 = ?
7. Finally, how do we call `cmp` (using `t0` as a temporary):
```cpp
lw __D07__
__D08__
```
> * D07 = ?
> * D08 = ?
---
## Problem `E`
You are supposed to implement RISC-V code per requested as less instructions as possible. Follow calling convention, use register mnemonic names (e.g., refer to `t0` rather than `x6`), and add commas and a single space between registers/arguments (e.g. `addi a0, a1, 2`). If you do not follow this, you may be misgraded.
1. Implement less-than for floating-point numbers.
* Inputs: `a0` and `a1` are positive non-NaN IEEE-754 32-bit floating point numbers.
* Output: `a0` returns `1` if `a0 < a1` and `0` otherwise.
* Example: If the inputs `a0` and `a1` were `1.2` and `1.25`, the expected output would be `1`. If `a0` and `a1` were `1.1` and `1.1` respectively, the expected output would be `0`.
```cpp
fp_less_than:
__E01__
ret
```
> * E01 = ?
2. Find the end position of [C-style string](https://en.wikipedia.org/wiki/C_string_handling).
* Inputs: `a0` is a pointer to a nonempty string
* Output: `a0` returns a pointer immediately after the end of the string.
* Example: Let `a0` be the pointer `0x10000000` to string s, which is the string `"Hello"`. Then the expected output would be `0x10000006`.
* NOTE: You may NOTE use `strlen` or any other library function.
```cpp
end_of_str:
lb t0, 0(a0)
__E02__
__E03__
```
> * E02 = ?
> * E03 = ?
3. Find the length of a null-terminated string in bytes. The function should accept a pointer to a null-terminated string and return an integer. Your solution must be recursive.
```cpp
strlen:
__E04__
beq t0, zero, basecase
addi sp, sp, -4
__E05__
__E06__
jal strlen
addi a0, a0, 1
__E07__
__E08__
ret
basecase:
addi a0, zero, 0
ret
```
> * E04 = ?
> * E05 = ?
> * E06 = ?
> * E07 = ?
> * E08 = ?
4. Arithmetically negate a [Two's Complement](https://en.wikipedia.org/wiki/Two%27s_complement) 32-bit integer without using the `sub`, `mul` or pseudo instructions.
```cpp
negate:
__E09__
addi a0, a0, 1
ret
```
> * E09 = ?
5. In this question, you will implement a simple recursive function in RISC-V. The function takes a decimal number as input, then outputs its binary representation encoded in the decimal digits.
```cpp
int find_binary(unsigned int decimal) {
if (decimal == 0) return 0;
return decimal % 2 + 10 * findBinary(decimal / 2);
}
```
For example, if the input to this function is `10`, then the output would be `1010`. The equivalent RISC-V code:
```cpp
find_binary:
addi sp, sp, -8 # a0 will have arg and be where we return
sw ra, 4(sp) # saving return address on the stack
sw s0, 0(sp) # saving "decimal % 2" on the stack
__E10__ # we will just return 0
andi s0, a0, 1 # set s0 to a0 % 2
__E11__ # set a0 to a0 / 2
jal ra, find_binary # recursive call
li t0, 10
__E12__
add a0, a0, s0 # accumulating the stored return value "decimal % 2"
postamble:
lw ra, 4(sp) # Restore ra
__E13__ # restore s0
__E14__ # restore sp
end:
jr ra
```
Hint:
* The recursive part of the function stores all of the first part of the return value “decimal % 2” on the stack
* The second part of the function and postamble are combing all the return values by `+` and `10 *`
> * E10 = ?
> * E11 = ?
> * E12 = ?
> * E13 = ?
> * E14 = ?
6. Implement the function [strncpy](https://man7.org/linux/man-pages/man3/strncpy.3p.html) which takes in two `char *` arguments and copies up to the first `n` characters from source into destination. If it reaches a null terminator, then it copies that value into destination and stops copying in characters. If there is no null terminator among the first n characters of source, the string placed in destination will not be null-terminated. `strncpy` returns a pointer to the destination string. You may not store anything on the stack for this problem.
> `char* strncpy(char *destination, char *source, unsigned int n);`
```cpp
strncpy:
add t0, x0, x0 # Current length
loop:
__E15__ # Check if we have iterated over the length of the string
add t1, a1, t0 # calculate the address of the t0-th (count-th) char in the string
lb t2, 0(t1) # load this char
__E16__ # calculate the same address, but off of the dst string
sb t2, 0(t1) # store into the dst string
addi t0, t0, 1 # increment our counter
bne t1, x0, loop # check if the loaded char was null
end:
jr ra
```
> * E15 = ?
> * E16 = ?
---
## Problem `F`
Evaluate the RISC-V code. Write down the specific value in HEX. Reminder: always include the prefix in your answer.
- [ ] Case 1
```cpp
auipc t0, 0xABCDE # Assume this instruction is at 0x100
addi t0, t0, 0xABC
```
Write down the value of `t0` in HEX.
> * F01 = ?
- [ ] Case 2
```cpp
li t0, 0xABCDEFAD
sw t0, 0(s0)
lb t0, 0(s0)
```
Write down the value of `t0` in hex. Assume big-endianness.
> * F02 = ?
---
## Problem `G`
[Bloom filter](https://en.wikipedia.org/wiki/Bloom_filter) is a very clever datastructure for efficiently and probabilistically storing a set. Typically, it has two functions: check and insert. The basic idea for checking is that you hash what you are looking for multiple times. Each hash tells you a particular bit you need to set or check. So for checking you see if the bit is set. You repeat this for multiple iteration, with the hash including the iteration count (so each hash is different). If not all bits are set then the element does not exist in the bloom filter. If all bits are set then the element PROBABLY exists in the bloom filter. Similarly, for setting an element as present in a bloom filter you just set all those bits to `1`.
We define the following structure.
```cpp
#include <stdint.h>
struct BloomFilter {
uint32_t size; /* Size is # of bits, NOT BYTES, in the bloom filter */
uint16_t itercount;
uint64_t (*)(void *data, uint16_t iter) hash;
uint8_t *data;
};
```
For the insertion function, we need to set the appropriate bit for each iteration.
```cpp
void insert(struct BloomFilter *b, void *element) {
uint64_t bitnum; /* which bit we need to set */
for (int i = 0; i < b->itercount; ++i) {
bitnum = b->hash(element, (uint16_t) i) % b->size;
b->data[bitnum >> 3] = b->data[bitnum >> 3] | (1 << (bitnum & 0x7));
}
}
```
We also have the following function to allocate a new bloom filter
```cpp
struct BloomFilter *alloc(uint64_t (*)(void *data, uint16_t iter) hash,
uint32_t size,
uint16_t itercount) {
struct BloomFilter *ret = malloc(64);
ret->size = size;
ret->data = calloc(size >> 3, 1);
ret->hash = hash;
ret->itercount = itercount;
}
```
Complete the RISC-V translation necessary to allocate this: We will put `ret` in `s0`.
```cpp
alloc:
# Prolog
addi sp, sp, __G01__
sw ra, 0(sp)
sw s0, 4(sp)
sw a0, 8(sp)
sw a1, 12(sp)
sw a2, 16(sp)
# body
addi a0, x0, 64
jal malloc
mv s0, a0 # put ret in s0
__G02__ # load size into t0
__G03__ # store it
__G04__ # div size by 8 with a shift
li a1, 1
jal calloc
sw a0, 12(s0) # store data
__G05__ # load hash to t0
__G06__ # store it: Use the right type
__G07__ # load itercount to t0
sh t0, 4(s0) # store it
mv a0, s0
# epilog
lw ra, 0(sp)
lw s0, 4(sp)
addi sp, sp, 20
jr ra
```
> * G01 = ?
> * G02 = ?
> * G03 = ?
> * G04 = ?
> * G05 = ?
> * G06 = ?
> * G07 = ?
---
## Problem `H`
Please complete the translation by filling in the lines below with RISC-V assembly. The prologue and epilogue have been written correctly but are not shown.
`strlen()`, both as a C function and RISC-V procedure, takes in one string as an argument and returns the length of the string (not including the null terminator).
```cpp
/* Returns 1 if s2 is a substring of s1, and 0 otherwise. */
int is_substr(char *s1, char *s2) {
int len1 = strlen(s1), len2 = strlen(s2);
int offset = len1 - len2;
while (offset >= 0) {
int i = 0;
while (s1[i + offset] == s2[i]) {
i += 1
if (s2[i] == '\0') return 1;
}
offset -= 1;
}
return 0;
}
```
The translated RISC-V Assembly:
```cpp
is_substr:
mv s1, a0
mv s2, a1
jal ra, strlen
mv s3, a0
mv a0, s2
jal ra, strlen
sub s3, s3, a0
Outer_Loop:
blt s3, x0, False
add t0, x0, x0
Inner_Loop:
add t1, t0, s3
add t1, s1, t1
lbu t1, 0(t1)
__H01__
__H02__
bne t1, t2, __H03__
addi t0, t0, 1
add t2, t0, s2
lbu t2, 0(t2)
beq t2, x0, True
jal x0, Inner_Loop
Update_Offset:
addi s3, s3, -1
__H04__
False:
xor a0, a0, a0
jal x0, End
True:
addi a0, x0, 1
End:
```
> * H01 = ?
> * H02 = ?
> * H03 = ?
> * H04 = ?
---