Quiz2 of Computer Architecture (2021 Fall)

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

General Information

You are allowed to read lecture materials.
- That is, an open book exam.
You shall not disclose your answer during the quiz.
Each answer has 3 points.
For RISC-V assembly, you CANNOT write any pseudo instructions.
Of course, you should answer everything in English except your formal name.
Image Not Showing Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
09:10 ~ 10:20AM on Oct 26, 2021

Problem `A`

We would like to implement an operation to reverse a linked list in memory, where the operand to this operation is the memory address of a pointer to the first node in the linked list (a pointer to a pointer). This operation has no destination register, but the operation zeroes the register specified by the operand upon completion (it does not preserve the operand).

Every node in the linked list has the following structure.

struct node {
    struct node *next;
    void *value;
}

Assume that pointers are 32 bits wide in this architecture. The next pointer is either the memory address of the next node in the list or is equal to 0 (NULL) to indicate the end of the linked list.

The equivalent C code for this operation is provided below.

void reverse(struct node **head)
{
    struct node *prev = NULL, *curr = *head;    
    while (curr) {
        struct node *next = curr->next;
        curr->next = prev;
        prev = curr;
        curr = next;
    }
}

The RISC-V assembly implementation is shown as following:

# head is passed in a0
# t0 holds prev
# t1 holds next
    lw a0, 0(a0)
    beqz __A01__
    addi t0, t0, 0
loop:
    lw t1, 0(a0)
    sw t0, 0(a0)
    addi t0, a0, 0
    addi a0, t1, 0
    __A02__
done:

Please complete the unfinished assembly to make the above work as the C counterpart does. You don't have to manipulate the RISC-V function calls for this operation.

A01 = ?

A02 = ?

Problem `B`

Consider the following C code:

unsigned multiply(unsigned x, unsigned y) {
    unsigned ret = 0;
    for (; y != 0; y--)
        ret += x;
    return ret;
}

We can translate into the optimized RISC-V assembly, so that fewer dynamic instructions are executed for large values of x and y.

A simple optimization is loop unrolling. With some careful bookkeeping, this can reduce the number of additions. This implementation keeps x + x as a temporary value to cut the number of additions approximately in half.

The following is more efficient than
simply translating from C.

      addi x3, x0, 0   # result = 0
      andi __B01__     # x? = y – (y % 2)
      add x7, x1, x1   # x7 = x + x
      beq x3, x0, done # y is 0
      beq __B02__      # skip ahead if y is even
      add x3, x0, x1   # set result to x if y is odd
loop: add __B03__   # result = result + 2x
      addi x6, x6, -2  # x6 = x6 - 2
      bge x6, x0, loop # repeat if x6 != 0
done:

B01 = ?

B02 = ?

B03 = ?

Problem `C`

The function strtriple receives as input a null-terminated string, and writes a string where every letter is repeated three times. For example, if we called stringtriple on the input string "Hello World!", we would get the output string "HHHeeellllllooo WWWooorrrlllddd!!!".

More specifically: strtriple has the following inputs: a0 stores a null-terminated string. It is guaranteed that the string is well-formatted. Let the string have strlen of n. a1 contains a pointer to a buffer of length at least

3 n + 1

characters, potentially containing garbage data. You may assume that the buffer does not overlap with the string stored in a0.

strtriple outputs the following: a0 should return the pointer to the buffer we initially received in a1. The buffer should contain the string originally provided in a0, with every character tripled. You must properly null-terminate the string, and for full credit must NOT replicate the null terminator at the end. For full credit, you must follow proper calling convention.

strtriple:
    mv t2, a1
loop:
    lbu __C01__
    beq t0, x0, end
    slli __C02__
    add t1, t1, t0
    sh t1, 0(a1)
    sb __C03__
    addi a0, a0, 1
    __C04__
    j loop
end:
    sb x0, 0(a1)
    __C05__
    jr ra

Please complete the unfinished assembly to make the above work as expected.

C01 = ?

C02 = ?

C03 = ?

C04 = ?

C05 = ?

Problem `D`

Consider the following Skip List implementation:

struct SkipListNode {
    void *data;
    struct SkipListNode **next;
}

Its find operation is shown below:

int SL_find(void *find, int level,
            struct SkipListNode *sl,
            (int)(*cmp)(void *, void *) {
    if (cmp(find, sl->data) == 0) return 1;
    if (level == 0 && sl->next[level] == NULL) return 0;
    if (sl->next[level] == NULL)
        return SL_find(find, level - 1, sl, cmp);
    if (cmp(find, sl->next[level]->data)) > 0))
        return SL_find(find, level - 1, sl, cmp);
    return SL_find(find, level, sl->next[level], cmp);
}

In translating the code cmp(find, sl->next[level]->data)) into RISC-V, we need to store arguments on the stack. So we will have find at sp(0), level at sp(4), sl at sp(8) and cmp at sp(12). We are going to break up the translation into pieces. Please write down the RISC-V assembly accordingly.

Load find into a0

D01 = ?

The next thing we need to do is get sl->next[level]->data into a1.
The RISC-V code for that would be: (Hint: Load sl into t0)

D02 = ?

Load level into t1

D03 = ?

load sl->next into t0

D04 = ?

load sl->next[level] into t0

slli __D05__
add t0, t0, t1
lw t0, t0(4)

D05 = ?

load data into a0

D06 = ?

Finally, how do we call cmp (using t0 as a temporary):

lw __D07__
__D08__

D07 = ?

D08 = ?

Problem `E`

You are supposed to implement RISC-V code per requested as less instructions as possible. Follow calling convention, use register mnemonic names (e.g., refer to t0 rather than x6), and add commas and a single space between registers/arguments (e.g. addi a0, a1, 2). If you do not follow this, you may be misgraded.

Implement less-than for floating-point numbers.

Inputs: a0 and a1 are positive non-NaN IEEE-754 32-bit floating point numbers.
Output: a0 returns 1 if a0 < a1 and 0 otherwise.
Example: If the inputs a0 and a1 were 1.2 and 1.25, the expected output would be 1. If a0 and a1 were 1.1 and 1.1 respectively, the expected output would be 0.

fp_less_than:
    __E01__
    ret

E01 = ?

Find the end position of C-style string.

Inputs: a0 is a pointer to a nonempty string
Output: a0 returns a pointer immediately after the end of the string.
Example: Let a0 be the pointer 0x10000000 to string s, which is the string "Hello". Then the expected output would be 0x10000006.
NOTE: You may NOTE use strlen or any other library function.

end_of_str:
    lb t0, 0(a0)
    __E02__
    __E03__

E02 = ?

E03 = ?

Find the length of a null-terminated string in bytes. The function should accept a pointer to a null-terminated string and return an integer. Your solution must be recursive.

strlen:
        __E04__
        beq t0, zero, basecase
        addi sp, sp, -4
        __E05__
        __E06__
        jal strlen
        addi a0, a0, 1
        __E07__
        __E08__
        ret
basecase:
        addi a0, zero, 0
        ret

E04 = ?

E05 = ?

E06 = ?

E07 = ?

E08 = ?

Arithmetically negate a Two's Complement 32-bit integer without using the sub, mul or pseudo instructions.

negate:
    __E09__
    addi a0, a0, 1
    ret

E09 = ?

In this question, you will implement a simple recursive function in RISC-V. The function takes a decimal number as input, then outputs its binary representation encoded in the decimal digits.

int find_binary(unsigned int decimal) {
    if (decimal == 0) return 0;
    return decimal % 2 + 10 * findBinary(decimal / 2);
}

For example, if the input to this function is 10, then the output would be 1010. The equivalent RISC-V code:

find_binary:
    addi sp, sp, -8 # a0 will have arg and be where we return
    sw ra, 4(sp) # saving return address on the stack
    sw s0, 0(sp) # saving "decimal % 2" on the stack
    __E10__ #  we will just return 0
    andi s0, a0, 1 # set s0 to a0 % 2
    __E11__ # set a0 to a0 / 2
    jal ra, find_binary # recursive call
    li t0, 10
    __E12__
    add a0, a0, s0 # accumulating the stored return value "decimal % 2"
postamble:
    lw ra, 4(sp) # Restore ra
    __E13__ # restore s0
    __E14__ # restore sp
end:
    jr ra

Hint:

The recursive part of the function stores all of the first part of the return value “decimal % 2” on the stack
The second part of the function and postamble are combing all the return values by + and 10 *

E10 = ?

E11 = ?

E12 = ?

E13 = ?

E14 = ?

Implement the function strncpy which takes in two char * arguments and copies up to the first n characters from source into destination. If it reaches a null terminator, then it copies that value into destination and stops copying in characters. If there is no null terminator among the first n characters of source, the string placed in destination will not be null-terminated. strncpy returns a pointer to the destination string. You may not store anything on the stack for this problem.

char* strncpy(char *destination, char *source, unsigned int n);

strncpy:
    add t0, x0, x0 # Current length
loop:
    __E15__ # Check if we have iterated over the length of the string
    add t1, a1, t0 # calculate the address of the t0-th (count-th) char in the string
    lb t2, 0(t1) # load this char
    __E16__ # calculate the same address, but off of the dst string
    sb t2, 0(t1) # store into the dst string
    addi t0, t0, 1 # increment our counter
    bne t1, x0, loop # check if the loaded char was null
end:
    jr ra

E15 = ?

E16 = ?

Problem `F`

Evaluate the RISC-V code. Write down the specific value in HEX. Reminder: always include the prefix in your answer.

Case 1

auipc t0, 0xABCDE # Assume this instruction is at 0x100
addi t0, t0, 0xABC

Write down the value of t0 in HEX.

F01 = ?

Case 2

li t0, 0xABCDEFAD
sw t0, 0(s0)
lb t0, 0(s0)

Write down the value of t0 in hex. Assume big-endianness.

F02 = ?

Problem `G`

Bloom filter is a very clever datastructure for efficiently and probabilistically storing a set. Typically, it has two functions: check and insert. The basic idea for checking is that you hash what you are looking for multiple times. Each hash tells you a particular bit you need to set or check. So for checking you see if the bit is set. You repeat this for multiple iteration, with the hash including the iteration count (so each hash is different). If not all bits are set then the element does not exist in the bloom filter. If all bits are set then the element PROBABLY exists in the bloom filter. Similarly, for setting an element as present in a bloom filter you just set all those bits to 1.

We define the following structure.

#include <stdint.h>
struct BloomFilter {
    uint32_t size; /* Size is # of bits, NOT BYTES, in the bloom filter */
    uint16_t itercount;
    uint64_t (*)(void *data, uint16_t iter) hash;
    uint8_t *data;
};

For the insertion function, we need to set the appropriate bit for each iteration.

void insert(struct BloomFilter *b, void *element) {
    uint64_t bitnum; /* which bit we need to set */
    for (int i = 0; i < b->itercount; ++i) {
        bitnum = b->hash(element, (uint16_t) i) % b->size;
        b->data[bitnum >> 3] = b->data[bitnum >> 3] | (1 << (bitnum & 0x7));
    }
}

We also have the following function to allocate a new bloom filter

struct BloomFilter *alloc(uint64_t (*)(void *data, uint16_t iter) hash,
                          uint32_t size,
                          uint16_t itercount) {
    struct BloomFilter *ret = malloc(64);
    ret->size = size;
    ret->data = calloc(size >> 3, 1);
    ret->hash = hash;
    ret->itercount = itercount;
}

Complete the RISC-V translation necessary to allocate this: We will put ret in s0.

alloc:
# Prolog
    addi sp, sp, __G01__
    sw ra, 0(sp)
    sw s0, 4(sp)
    sw a0, 8(sp)
    sw a1, 12(sp)
    sw a2, 16(sp)
# body
    addi a0, x0, 64
    jal malloc
    mv s0, a0 # put ret in s0
    __G02__ # load size into t0
    __G03__ # store it
    __G04__ # div size by 8 with a shift
    li a1, 1
    jal calloc
    sw a0, 12(s0) # store data
    __G05__ # load hash to t0
    __G06__ # store it: Use the right type
    __G07__ # load itercount to t0
    sh t0, 4(s0) # store it
    mv a0, s0
# epilog
    lw ra, 0(sp)
    lw s0, 4(sp)
    addi sp, sp, 20
    jr ra

G01 = ?

G02 = ?

G03 = ?

G04 = ?

G05 = ?

G06 = ?

G07 = ?

Problem `H`

Please complete the translation by filling in the lines below with RISC-V assembly. The prologue and epilogue have been written correctly but are not shown.

strlen(), both as a C function and RISC-V procedure, takes in one string as an argument and returns the length of the string (not including the null terminator).

/* Returns 1 if s2 is a substring of s1, and 0 otherwise. */
int is_substr(char *s1, char *s2) {
    int len1 = strlen(s1), len2 = strlen(s2);
    int offset = len1 - len2;
    while (offset >= 0) {
        int i = 0;
        while (s1[i + offset] == s2[i]) {
            i += 1
            if (s2[i] == '\0') return 1;
        }
        offset -= 1;
    }
    return 0;
}

The translated RISC-V Assembly:

is_substr:
    mv s1, a0
    mv s2, a1
    jal ra, strlen
    mv s3, a0
    mv a0, s2
    jal ra, strlen
    sub s3, s3, a0
Outer_Loop:
    blt s3, x0, False
    add t0, x0, x0
Inner_Loop:
    add t1, t0, s3
    add t1, s1, t1
    lbu t1, 0(t1)
    __H01__
    __H02__
    bne t1, t2, __H03__
    addi t0, t0, 1
    add t2, t0, s2
    lbu t2, 0(t2)
    beq t2, x0, True
    jal x0, Inner_Loop
Update_Offset:
    addi s3, s3, -1
    __H04__
False:
    xor a0, a0, a0
    jal x0, End
True:
    addi a0, x0, 1
End:

H01 = ?

H02 = ?

H03 = ?

H04 = ?

Quiz2 of Computer Architecture (2021 Fall)

Problem A

Problem B

Problem C

Problem D

Problem E

Problem F

Problem G

Problem H

Read more

單一指令處理器 (OISC)

從 CPU cache coherence 談 Linux spinlock 可擴展能力議題

淺談 Microkernel 設計和真實世界中的應用

並行程式設計: 概念