Solutions
A
Given the head of a linked list and a value x
, partition it such that all nodes less than x
come before nodes greater than or equal to x
.
You should preserve the original relative order of the nodes in each of the two partitions.
Example 1:
head = [1,4,3,2,5,2]
, x = 3
[1,2,2,4,3,5]
Example 2:
Input: head = [2,1]
, x = 2
Output: [1,2]
C definition for the above singly-linked list:
typedef struct _node {
int val;
struct _node *next;
} node_t;
The function prototype:
node_t *partition(node_t *head, int x);
The corresponding RISC-V implementation is listed as following:
.text
.align 2
.globl partition
partition:
addi a5, sp, 12
sw a5, 24(sp)
addi a5, sp, 20
sw a0, 12(sp)
sw zero, 20(sp)
sw a5, 28(sp)
L1:
bne a0, zero, L4
lw a5, 24(sp)
lw a4, 20(sp)
sw a4, 0(a5)
lw a5, 28(sp)
lw a0, 12(sp)
sw zero, 0(a5)
addi sp, sp, 32
jr ra
L4:
lw a5, 0(a0)
bge a5, a1, L2
lw a4, 24(sp)
addi a5, sp, 24
L3:
sw a0, 0(a4)
addi a4, a0, 4
sw a4, 0(a5)
lw a0, 4(a0)
j L1
L2:
lw a4, 28(sp)
addi a5, sp, 28
j L3
Let's reverse-engineer to get the RISC-V assembly back to the below C code:
node_t *partition(node_t *head, int x)
{
node_t *L = NULL, **p1 = &head, **p2 = &L;
for (node_t *node = head; node; node = node->next) {
node_t ***indir = node->val >= x ?
(A01 /* fill your code */) : (A02 /* fill your code */);
**indir = node;
*indir = A03 /* fill your code */;
}
*p1 = L, *p2 = NULL;
return head;
}
Obviously, the above C code listing was incomplete, you shall provide the functioned implementations. A01
, A02
, and A03
are C expressions. You must obey the following rules when filling them:
node_t *L
to node_t* L
. Be aware of the spaces! Details determine success or failure.
- A01 = ?
&p2
- A02 = ?
&p1
- A03 = ?
&(**indir)->next
*indir = &node->next;
is another feasible answer aside from*indir = &(**indr)->next;
.This is an example. At this moment, the program is at the beginning of iteration body.
Green square means pointer to pointer to pointer to
node_t
. Blue square means pointer to pointer tonode_t
. Yellow square means pointer tonode_t
.
After
node_t ***indir = node->val >= x ? &p2 : &p1;
is executed, the indir will point to p1, becausenode->value
is less than x.(4<6)
Since
**indir = node;
has the same effect as*p = node
in this scenario, after it is executed, thenext
pointed byp1
will point to node.
In the solutions, next instruction is
*indir = &(**indir)->next;
, which has the same effect asp1 = &node->next
in this scenario. It makesp1
points tonext
field ofnode
.
After
*p1 = L, *p2 = NULL;
is executed, two list will be connected.
B
Consider the following Python class:
class Vector:
def __init__(self, x, y):
self.x = x
self.y = y
def transform(self, f):
return Vector(f(self.x), f(self.y))
This code has to be converted to C code below.
#include <stdlib.h>
typedef struct vector { int x, y; } vector_t;
vector_t *transform(vector_t *self, int (*f)(int))
{
vector_t *nv = malloc(sizeof(vector_t));
nv->x = f(self->x), nv->y = f(self->y);
return nv;
}
Translate the transform
function to RISC-V. The function takes inputs self
in a0
and f
in a1
, and returns output in a0
. You may assume that vector_t
is as defined in the C code. You may also assume that you have access to malloc
, and that malloc
and f each receive their argument in a0
, and return their result in a0
. Your solution MUST fit within the lines provided.
transform:
addi sp, sp, B01
B02
B03
B04
B05
mv s0, a0
mv s1, a1
li a0, 8
jal ra, malloc
mv s2, a0
lw a0, 0(s0)
B06
sw a0, 0(s2)
lw a0, 4(s0)
B07
sw a0, 4(s2)
mv a0, s2
B08
B09
B10
B11
addi sp, sp, B12
ret
Both B06
and B07
MUST use jalr
instruction. You MUST use ABI name for accessing registers defined in RISC-V Calling Convention. i.e, use ra
instead of x1
while it appears in jalr
instruction.
- B01 = ?
-16
- B02 = ?
sw ra, 0(sp)
- B03 = ?
sw s0, 4(sp)
- B04 = ?
sw s1, 8(sp)
- B05 = ?
sw s2, 12(sp)
- B06 = ?
jalr ra, s1, 0
- B07 = ?
jalr ra, s1, 0
- B08 = ?
lw ra, 0(sp)
- B09 = ?
lw s0, 4(sp)
- B10 = ?
lw s1, 8(sp)
- B11 = ?
lw s2, 12(sp)
- B12 = ?
16
C
The stack can hold local variables in addition to the ra
register and the s
registers. The following labels that were externally specified are available to you:
password
: a pointer to a statically-stored string "secretpass"get_chars
: A function defined as follows:
a0
is a pointer to a buffera0
with the read data, null-terminating the string. Your code may assume that the input is at most 19
characters, not including the null-terminator.The function verify_password
is defined as follows:
a0
returns 1
if the input from stdin is exactly "secretpass", and 0
otherwise.Complete the function verify_password
. Each line contains exactly one instruction or pseudoinstruction.
verify_password:
addi sp, sp, -24 # Make space for a 20-byte buffer
C01 # Your code here
mv a0, sp
jal ra, get_chars
la t0, password
mv t1, sp
Loop:
lb t2, 0(t0)
lb t3, 0(t1)
C02 # Your code here
C03 # Your code here
addi t0, t0, C04 # Your code here
addi t1, t1, C05 # Your code here
j Loop
Pass:
li a0, C06 # Your code here
j End
Fail:
li a0, C07 # Your code here
End:
C08 # Your code here
C09 # Your code here
jr ra
You MUST use ABI name for accessing registers defined in RISC-V Calling Convention. i.e, use ra
instead of x1
while it appears in jalr
instruction.
- C01 = ?
sw ra, 20(sp)
- C02 = ?
bne t2, t3, Fail
ORbne t3, t2, Fail
- C03 = ?
beq t2, x0, Pass
ORbeq t2, zero, Pass
ORbeq t3, x0, Pass
ORbeq t3, zero, Pass
- C04 = ?
1
- C05 = ?
1
- C06 = ?
1
- C07 = ?
0
- C08 = ?
lw ra, 20(sp)
- C09 = ?
addi sp, sp, 24
Line 3: First, we note that Line 2 allocated 24 bytes of space on the stack. Line 3 is storing a 4-byte register value to the stack space that was just allocated. This looks a lot like the function prologues that we've seen before in projects and labs!
A quick refresher on function prologues and epilogues: according to calling convention, there are some callee-saved registers whose original values must be preserved after the function returns.
To achieve this, we usually store the original register values on the stack at the beginning of the function (the prologue), and we restore the original register values at the end of the function (the epilogue).In the provided code, there is only one callee-saved register already being used:
ra
at line 5. All the other registers being used aret
registers which do not need to be saved.Thus we should save the original value of
ra
on the stack so that we can restore it later.Line 4: First, we note that Line 5 is calling the
get_chars
function. Before we call a function, we need to make sure that its arguments are properly loaded into the argument (a
) registers.
According to the question, theget_chars
function takes in one argument, in thea0
register.
Thus, this line is probably going to involve putting something into thea0
register.According to the question, the argument in
a0
is a pointer to a buffer in memory, which the function will fill up with (at most19
) input characters. In other words, theget_chars
function requires 20 bytes of space in memory to store the input characters, soa0
needs to store the address of some 20-byte space in memory where we can store a string. Where do we have 20 bytes of space in memory right now? We allocated 24 bytes on the stack in line 2, and we used 4 of those bytes in line 3, so we have 20 bytes on the stack that we can use!What is the address of this space on the stack? Remember that the stack pointer
sp
always contains the address of the bottom of the stack. Since the stack grows down to make space (see line 2), the 20 bytes of space is located at the bottom of the stack, and the address of this space is the address stored insp
. Note that in Line 3, we stored the savedra
register value at the top of the 24-byte space on the stack, so the 20-byte buffer starts at the very bottom of the stack, wheresp
is pointing.Line 6: Lines 6 and 7 initialize t0 and t1, which we can see will be used repeatedly in the loop.
We initializet0
to be the address of the start of the Password string, and we initializet1
to be the address of the start of the user’s input string. This will let us compare each character of the strings in a loop later.One way to reason this out: this line is putting a value into
t0
. We see at line 9 thatt0
will be used as an address (because it's in parentheses).la
is an instruction that loads addresses into a register.Another way to reason this out is to note that the Password label is involved in this instruction, so we must use an instruction that uses a label. Branches and jumps don't make sense here because we aren’t implementing any loops or if/else conditions yet, so by process of elimination, we should
usela
here.Line 7: One way to reason this out is to note that only two registers are provided, so we probably want a pseudoinstruction that uses only two registers.
mv
is the only logical pseudoinstruction to use here.Lines 9-10: We want to use the loop to compare each character of the input string with each
character of the provided password string. In lines 6-7, we initializedt0
andt1
with the addresses
of the strings. Now we need to dereference those addresses to load each character of the string
from memory. Since each character is one byte long,lb
is the appropriate load instruction to use here.One way to reason this out is to note that the 0(t0) syntax is used only in load and store instructions.
The addresses int0
andt1
already contain data (the hardcoded password string and the input
string, respectively), so we don't want to store to those addresses. This means we must be loading
from those addresses.Line 11: Now that we’ve loaded a character from each string from memory, we need to check if those characters match. If they don't match, we can stop checking characters in a loop and jump to the fail case.
Line 12: If we've reached the end of the string without branching to the Fail case, then we've checked that every character is equal, and we should enter the Pass case.
Remember that strings end with a null terminator, so if either loaded character is
0
(null byte), then we know that we reached the end of both strings. Note that Line 11 will check for us that both strings have reached the null terminator, since it checks that every character is equal. This is why the Pass branch must come after the Fail branch–if they were swapped, then this function would jump to Pass whenever the shorter string terminates, even if the longer string still has extra characters left.Lines 13-14: Once we finish checking a pair of characters, we need to increment the addresses to the next character. The next time the loop runs, the load instructions will now load the next character in the string.
Recall that characters are 1 byte, and RISC-V indexes memory by bytes, so we should increment the addresses in t0 and t1 by 1.
Line 15: If we didn't jump to the Fail or Pass case, after we increment the addresses, we need to
keep executing the loop and check the next character. To do this, we should jump back to the start of the loop. This is not a function call, and we don’t care about saving our return value, so a simple jump instruction is sufficient.There is an alternate solution here that combines Line 12 (pass check) and Line 15 (jumping to loop). As in the standard Line 12, we check if t2 (or t3, which must be equal at this point) is equal to
0
(the null byte). If so, we’ve reached the end of the string, so we should continue execution normally to the Pass case immediately following the loop. If not, then we need to loop again, so we jump back to the start of the loop. This logic is captured inbne t2, x0, Loop
. With this alternate solution, Line 12 can be blank (or any innocuous instruction that doesn’t affect program execution, e.g.add t6, x0, x0
).Line 17: If the check passes, we should load 1 into the return value register
a0
.One way to reason this out: This line of code only executes if we jump to the Pass label. The only line of code that is unique to the case where the check passes is putting
1
in the return value register, indicating success. Other cleanup lines of code (e.g. function epilogue, return instruction) are common to both the Pass and Fail cases, so they are more suited to be in the End label.Line 18: If the check passes, we don't want to execute the code under the Fail case, so we should jump to the End label. Like in Line 15, this is not a function call, and we don’t care about saving our return value, so a simple jump instruction is sufficient.
Line 20: If the check fails, we should load 0 into the return value register
a0
. The reasoning from Line 17 also applies to solving this blank.Line 22: At the end of the function, we need to undo any setup we did at the beginning of the function. In particular, we should write a function epilogue to restore all the saved register values on the stack.
Since we stored the original value of
ra
20 bytes above the stack pointer in the prologue, we should load from that same location in memory to restorera
.Line 23: As part of the function epilogue, we should also increment the stack pointer back up to delete the space we allocated earlier in the function.
In Line 2, we decremented the stack pointer by24
, so we should now increment the stack pointer by24
.Line 24: Finally, we need to return from the function.
D
In order to enhance their models in machine learning, some data scientists add random noise to a training dataset. Here, we will take a dataset and turn it into RISC-V data that can be used.
The function jitter
is defined as follows:
a0
holds a pointer to an integer array.a1
holds a buffer large enough for n integers (which does not overlap with the array in a0
).a2
holds n
, the length of the arrays.a0
holds a pointer to the buffer originally in a1. The buffer is filled with the result of calling noisen
on each element in the a0
array.The function noisen
is defined as follows:
a0
holds an integer.a0
returns the integer with noise added.The implementation of jitter
is provided below, following calling convention:
jitter:
# BEGIN PROLOGUE
addi sp, sp, -20
# We need to make space for 5 registers on the stack,
# and each register is 4 bytes long.
# (multiple lines omitted)
# END PROLOGUE
mv s0, a0
mv s1, a1 # Hold beginning of output arr
mv s2, a1
mv s3, a2 # Hold counter
loop:
beq s3, x0, end
lw a0, 0(s0)
jal ra, noisen
sw a0, 0(s1)
addi s0, s0, 4
addi s1, s1, 4
addi s3, s3, -1
j loop
end:
mv a0, s2
# BEGIN EPILOGUE
# (multiple lines omitted)
addi sp, sp, 20
# END EPILOGUE
ret
Part 1: List all registers that we need to save on the stack in order to follow calling convention. __ D01 __
- D01 = ?
s0
,s1
,s2
,s3
,ra
(in any order)The function modifies s0, s1, s2, s3. According to calling convention, the function must leave saved registers unchanged, so we have to save their values at the start of the function, and restore their values at the end of the function.
Also, line 15 of the function (
jal ra, noisen
) modifiesra
. According to calling convention,ra
must stay unchanged throughout the function (so that we know what address/instruction to return to after the end of the function), so we also need to save its value at the start of the function, and restore its value at the end of the function.
Part 2: Now, we want to implement noisen
to add some random offset to an integer a0
. Unfortunately, we only have access to the randomBool
function, which takes in no inputs and randomly returns either 1
or 0
in a0
. If we implemented noisen
to return a0 + randomBool()
, the integer would always get shifted in the positive direction. Instead, we suggest implementing noisen
so that noisen alternates between returning a0 + randomBool()
and returning a0 - randomBool()
.
Fill in the blanks to complete the above suggested implementation of noisen
. Assume that you can read from and write to any memory addresses, in any section of memory.
noisen:
addi sp, sp, -8 # Prologue
sw ra, 0(sp)
sw s0, 4(sp)
mv s0, a0
jal ra, randomBool
add s0, s0, a0
D02 # Your code here
xori t0, t0, 64
D03 # Your code here
mv a0, s0 # Epilogue
lw ra, 0(sp)
lw s0, 4(sp)
addi sp, sp, 8
ret
- D02 = ?
lb t0, 3(ra)
- D03 = ?
sb t0, 3(ra)
This was an extremely difficult question, so don't worry if you weren't sure how to approach this question. This question relies on the concept of modifying unprotected instruction memory, to create self-modifying code. The first thing to notice is that line 7 is currently set to add the random bool to the data; to alternate to a subtraction, we would need to change this instruction to a sub instruction. The second thing to notice is that
add
andsub
in RV32 is one bit off from each other, given usage of the same registers. This occurs in the funct7 with add using000 0000
and sub using010 0000
(this was by design, as add and subtract are so close in function).Thus, our general approach can be described as follows:
- Load the add/sub instruction into a temporary register
- Flip the correct bit in the func7 to change the instruction into a
sub
/add
instruction- Store the instruction back in its original spot
Two major challenges arise from this approach: Finding a pointer to that instruction, and flipping the correct bit.
In order to load the
add
/sub
instruction, we need to find a pointer to the code; further, we do not have sufficient lines to run a la instruction, so this address must already be stored in a current register. One insight here is that the ra register is currently pointing to a location in the code; more precisely, we calledjal ra randomBool
on line 6. This guarantees thatra
was set to point to line 7, which is incidentally exactly the line we need to load.Note that this could also work even if there were extra lines between the function call and the add/sub, using a nontrivial offset.
The second concern is to determine how to flip the correct bit. Note that add/sub differ in the bit at the 30th position of our instruction. If we attempt to load the entire instruction and flip the bit with a single
xori
, we would require an immediate of0x4000 0000
, which is greater than our maximum immediate of2047
. Thus, we instead need to load part of the instruction, using alh
orlb
. If we uselh
to load the top half of the instruction, we would have to modify the 14th bit position, yielding an immediate of0x4000
= 16384; thus, we must use alb
instruction to load the top byte of the instruction. This would allow us to modify the 6th bit position, which corresponds to an immediate of0x40
= 64.Thus, the final solution is as follows:
- Load the top byte of the add/sub instruction into a temporary register, using the ra as a code pointer:
lb t0, 3(ra)
- Flip the correct bit in the func7 to change the instruction into a sub/add instruction:
xori t0, t0, 64
- Store the instruction back in its original spot:
sb t0, 3(ra)
E
The following operations should be implemented in RISC-V 32-bit integer assembly. Any extensions (such as mul
and the floating point or vector extensions) are NOT permitted. Per blank, just one line of RISC-V code may be entered.
float_lt
a0
and a1
are positive non-NaN IEEE-754 32-bit floating point numbers.a0
returns 1
if a0 < a1
and 0
otherwise.a0
and a1
were 1.5
and 1.75
, the expected output would be 1
. If a0
and a1
were 1.5
and 1.5
respectively, the expected output would be 0
.float_lt:
E01 # Your code here
ret
- E01 = ?
slt a0, a0, a1
ORsltu a0, a0, a1
skipline
addi t0, x0, 5
jal skipline
addi t0, t0, 300
addi t0, t0, 10
Then at the conclusion of this code, t0
would be equal to 15
.
skipline:
E02 # Your code here
ret
- E02 = ?
addi ra, ra, 4