Quiz2 of Computer Architecture (2022 Fall)

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

General Information

You are allowed to read lecture materials.
- That is, an open book exam.
We are using the honor system during this quiz, and would like you to accept the following:
1. You will not share the quiz with anyone.
2. You will not discuss the material on the quiz with anyone until after solutions are released.
Each answer has 5 points.
You must answer in valid numeric/C99/assembly representation and/or English alphabet except your formal name.
Message the instructor via Facebook if you found something wrong.
Image Not Showing Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
09:10 ~ 10:15AM on Oct 4, 2022

Problem `A`

Given the head of a linked list and a value x, partition it such that all nodes less than x come before nodes greater than or equal to x.

You should preserve the original relative order of the nodes in each of the two partitions.

Example 1:

Image Not Showing Possible Reasons

The image was uploaded to a note which you don't have access to
The note which the image was originally uploaded to has been deleted

Learn More →

Input: head = [1,4,3,2,5,2], x = 3
Output: [1,2,2,4,3,5]

Example 2:
Input: head = [2,1], x = 2
Output: [1,2]

C definition for the above singly-linked list:

typedef struct _node {
    int val;
    struct _node *next;
} node_t;

The function prototype:

node_t *partition(node_t *head, int x);

The corresponding RISC-V implementation is listed as following:

    .text
    .align 2
    .globl partition
partition:
    addi    a5, sp, 12
    sw      a5, 24(sp)
    addi    a5, sp, 20
    sw      a0, 12(sp)
    sw      zero, 20(sp)
    sw      a5, 28(sp)
L1:
    bne     a0, zero, L4
    lw      a5, 24(sp)
    lw      a4, 20(sp)
    sw      a4, 0(a5)
    lw      a5, 28(sp)
    lw      a0, 12(sp)
    sw      zero, 0(a5)
    addi    sp, sp, 32
    jr      ra
L4:
    lw      a5, 0(a0)
    bge     a5, a1, L2
    lw      a4, 24(sp)
    addi    a5, sp, 24
L3:
    sw      a0, 0(a4)
    addi    a4, a0, 4
    sw      a4, 0(a5)
    lw      a0, 4(a0)
    j       L1
L2:
    lw      a4, 28(sp)
    addi    a5, sp, 28
    j       L3

Let's reverse-engineer to get the RISC-V assembly back to the below C code:

node_t *partition(node_t *head, int x)
{
    node_t *L = NULL, **p1 = &head, **p2 = &L;

    for (node_t *node = head; node; node = node->next) {
        node_t ***indir = node->val >= x ?
            (A01 /* fill your code */) : (A02 /* fill your code */);
        **indir = node;
        *indir = A03 /* fill your code */;
    }

    *p1 = L, *p2 = NULL;
    return head;
}

Obviously, the above C code listing was incomplete, you shall provide the functioned implementations. A01, A02, and A03 are C expressions. You must obey the following rules when filling them:

Write shorter code as possible as you can.
Follow the consistent coding style. That is, we prefer node_t *L to node_t* L. Be aware of the spaces! Details determine success or failure.

A01 = ?

A02 = ?

A03 = ?

Problem `B`

Consider the following Python class:

class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def transform(self, f):
        return Vector(f(self.x), f(self.y))

This code has to be converted to C code below.

#include <stdlib.h>

typedef struct vector { int x, y; } vector_t;

vector_t *transform(vector_t *self, int (*f)(int))
{
    vector_t *nv = malloc(sizeof(vector_t));
    nv->x = f(self->x), nv->y = f(self->y);
    return nv;
}

Translate the transform function to RISC-V. The function takes inputs self in a0 and f in a1, and returns output in a0. You may assume that vector_t is as defined in the C code. You may also assume that you have access to malloc, and that malloc and f each receive their argument in a0, and return their result in a0. Your solution MUST fit within the lines provided.

transform:
    addi sp, sp, B01
    B02
    B03
    B04
    B05
    mv s0, a0
    mv s1, a1
    li a0, 8
    jal ra, malloc
    mv s2, a0
    lw a0, 0(s0)
    B06
    sw a0, 0(s2)
    lw a0, 4(s0)
    B07
    sw a0, 4(s2)
    mv a0, s2
    B08
    B09
    B10
    B11
    addi sp, sp, B12
    ret

Both B06 and B07 MUST use jalr instruction. You MUST use ABI name for accessing registers defined in RISC-V Calling Convention. i.e, use ra instead of x1 while it appears in jalr instruction.

B01 = ?

B02 = ?

B03 = ?

B04 = ?

B05 = ?

B06 = ?

B07 = ?

B08 = ?

B09 = ?

B10 = ?

B11 = ?

B12 = ?

Problem `C`

The stack can hold local variables in addition to the ra register and the s registers. The following labels that were externally specified are available to you:

password: a pointer to a statically-stored string "secretpass"
get_chars: A function defined as follows:
- Input: a0 is a pointer to a buffer
- Effect: Reads characters from stdin, and fills the buffer pointed to by a0 with the read data, null-terminating the string. Your code may assume that the input is at most 19 characters, not including the null-terminator.
- Output: None

The function verify_password is defined as follows:

Input: No register input; however, the function receives a string input from stdin.
Output: a0 returns 1 if the input from stdin is exactly "secretpass", and 0 otherwise.

Complete the function verify_password. Each line contains exactly one instruction or pseudoinstruction.
























verify_password:
    addi sp, sp, -24 # Make space for a 20-byte buffer
    C01 # Your code here
    mv a0, sp
    jal ra, get_chars
    la t0, password
    mv t1, sp
Loop:
    lb t2, 0(t0)
    lb t3, 0(t1)
    C02 # Your code here
    C03 # Your code here
    addi t0, t0, C04 # Your code here
    addi t1, t1, C05 # Your code here
    j Loop
Pass:
    li a0, C06 # Your code here
    j End
Fail:
    li a0, C07 # Your code here
End:
    C08 # Your code here
    C09 # Your code here
    jr ra

You MUST use ABI name for accessing registers defined in RISC-V Calling Convention. i.e, use ra instead of x1 while it appears in jalr instruction.

C01 = ?

C02 = ?

C03 = ?

C04 = ?

C05 = ?

C06 = ?

C07 = ?

C08 = ?

C09 = ?

Problem `D`

In order to enhance their models in machine learning, some data scientists add random noise to a training dataset. Here, we will take a dataset and turn it into RISC-V data that can be used.

The function jitter is defined as follows:

Inputs:
– a0 holds a pointer to an integer array.
– a1 holds a buffer large enough for n integers (which does not overlap with the array in a0).
– a2 holds n, the length of the arrays.
Output:
– a0 holds a pointer to the buffer originally in a1. The buffer is filled with the result of calling noisen on each element in the a0 array.

The function noisen is defined as follows:

Input: a0 holds an integer.
Output: a0 returns the integer with noise added.

The implementation of jitter is provided below, following calling convention:

jitter:
    # BEGIN PROLOGUE
    addi sp, sp, -20
    # We need to make space for 5 registers on the stack,
    # and each register is 4 bytes long.
    # (multiple lines omitted)
    # END PROLOGUE
    mv s0, a0
    mv s1, a1 # Hold beginning of output arr
    mv s2, a1
    mv s3, a2 # Hold counter
loop:
    beq s3, x0, end
    lw a0, 0(s0)
    jal ra, noisen
    sw a0, 0(s1)
    addi s0, s0, 4
    addi s1, s1, 4
    addi s3, s3, -1
    j loop
end:
    mv a0, s2
    # BEGIN EPILOGUE
    # (multiple lines omitted)
    addi sp, sp, 20
    # END EPILOGUE
    ret

Part 1: List all registers that we need to save on the stack in order to follow calling convention. __ D01 __

D01 = ?

Part 2: Now, we want to implement noisen to add some random offset to an integer a0. Unfortunately, we only have access to the randomBool function, which takes in no inputs and randomly returns either 1 or 0 in a0. If we implemented noisen to return a0 + randomBool(), the integer would always get shifted in the positive direction. Instead, we suggest implementing noisen so that noisen alternates between returning a0 + randomBool() and returning a0 - randomBool().
Fill in the blanks to complete the above suggested implementation of noisen. Assume that you can read from and write to any memory addresses, in any section of memory.

noisen:
    addi sp, sp, -8 # Prologue
    sw ra, 0(sp)
    sw s0, 4(sp)
    mv s0, a0
    jal ra, randomBool
    add s0, s0, a0
    D02 # Your code here
    xori t0, t0, 64
    D03 # Your code here
    mv a0, s0 # Epilogue
    lw ra, 0(sp)
    lw s0, 4(sp)
    addi sp, sp, 8
    ret

D02 = ?

D03 = ?

Problem `E`

The following operations should be implemented in RISC-V 32-bit integer assembly. Any extensions (such as mul and the floating point or vector extensions) are NOT permitted. Per blank, just one line of RISC-V code may be entered.

float_lt

Inputs: a0 and a1 are positive non-NaN IEEE-754 32-bit floating point numbers.
Output: a0 returns 1 if a0 < a1 and 0 otherwise.
Example: If the inputs a0 and a1 were 1.5 and 1.75, the expected output would be 1. If a0 and a1 were 1.5 and 1.5 respectively, the expected output would be 0.

float_lt:
    E01 # Your code here
    ret

E01 = ?

skipline

Inputs: None
Output: None
Effect: Skip the next assembly instruction in the caller function. We are only using the 32-bit RISC-V ISA (no 16 bit extension) You may assume that the next line of code exists, and is not a pseudoinstruction.

Example: Assume that the following code was run:

addi t0, x0, 5
jal skipline
addi t0, t0, 300
addi t0, t0, 10

Then at the conclusion of this code, t0 would be equal to 15.

skipline:
    E02 # Your code here
    ret

E02 = ?

Quiz2 of Computer Architecture (2022 Fall)

Problem A

Problem B

Problem C

Problem D

Problem E

Read more

單一指令處理器 (OISC)

從 CPU cache coherence 談 Linux spinlock 可擴展能力議題

淺談 Microkernel 設計和真實世界中的應用

並行程式設計: 概念

Problem `A`

Problem `B`

Problem `C`

Problem `D`

Problem `E`