
Quiz3 of Computer Architecture (2021 Fall)

General Information

  • You are allowed to read lecture materials.
    • That is, an open book exam.
  • We are using the honor system during this quiz, and would like you to accept the following:
    1. You will not share the quiz with anyone.
    2. You will not discuss the material on the quiz with anyone until after solutions are released.
  • Each answer has 3 points.
  • For RISC-V assembly, you CANNOT write any pseudo instructions.
  • All answers should be fully simplified unless otherwise stated.
  • Of course, you should answer everything in English except your formal name.
  • 09:10 ~ 10:20AM on Nov 16, 2021
  • Fill in the Google Form to answer.

Problem A

We are given an array of n unique uint32_t that represent nodes in a directed graph. We say there is an edge between A and B if A < B and the Hamming distance between A and B is exactly 1. A Hamming distance of 1 means that the bits differ in 1 (and only 1) place. As an example, if the array were {0b0000, 0b0001, 0b0010, 0b0011, 0b1000, 0b1010}, we would have the edges shown in the following figure:
[Figure: edges of the example graph]

See also: LeetCode 461. Hamming Distance
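
As an illustration of the edge rule described above, here is a minimal sketch in C. The helper names hamming_distance and has_edge are hypothetical and do not appear in the quiz skeleton; the bit-counting loop is just one straightforward way to measure the distance and is not necessarily the approach the blanks below expect.

```c
#include <stdbool.h>
#include <stdint.h>

/* Count the bit positions in which a and b differ. */
int hamming_distance(uint32_t a, uint32_t b)
{
    uint32_t diff = a ^ b; /* set bits mark the differing positions */
    int count = 0;
    while (diff != 0) {
        count += (int)(diff & 1u); /* add the lowest bit */
        diff >>= 1;                /* move on to the next bit */
    }
    return count;
}

/* Edge rule from the problem statement: A < B and distance exactly 1. */
bool has_edge(uint32_t a, uint32_t b)
{
    return a < b && hamming_distance(a, b) == 1;
}
```

With the example array, has_edge(0x2, 0xA) holds (0b0010 and 0b1010 differ in one bit), while has_edge(0x1, 0x2) does not (they differ in two bits).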

Construct an edgelist_t (specified below) that contains all of the edges in this graph.

typedef struct { uint32_t A, B; } edge_t;
typedef struct {
    edge_t *edges;
    int len;
} edgelist_t;

Our solution used every line provided, but if you need more lines, just write them to the right of the line they are supposed to go after and put semicolons between them. All of the necessary #include statements are omitted for brevity; do not worry about checking for malloc, calloc, or realloc returning NULL. Make sure L->edges has no unused space when L is eventually returned.

edgelist_t *build_edgelist(uint32_t *nodes, int n)
{
    edgelist_t *L = malloc(sizeof(edgelist_t));
    L->len = 0;

    L->edges = malloc(n * n * sizeof(edge_t));
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            uint32_t tmp = A01;
            if ((nodes[i] < nodes[j]) && !(A02)) {
                A03;
                A04;
                L->len++;
            }
        }
    }
    L->edges = realloc(L->edges, sizeof(edge_t) * L->len);
    return L;
}
  • A01 = ?
  • A02 = ?
  • A03 = ?
  • A04 = ?

Problem B

Consider the following circuit:

[Figure: circuit with registers Reg1 and Reg2, input X, and output Q]

You are given the following information:

  • Clk has a frequency of 50 MHz
  • AND gates have a propagation delay of 2 ns
  • NOT gates have a propagation delay of 4 ns
  • OR gates have a propagation delay of 10 ns
  • X changes 10 ns after the rising edge of Clk
  • Reg1 and Reg2 have a clock-to-Q delay of 2 ns

The clock period is $\frac{1}{50 \times 10^6}\,\text{s} = 20\,\text{ns}$. This means that if X changes, it changes 10 ns after the clock positive edge.

  1. What is the longest possible setup time such that there are no setup time violations? (Please include ns in your answer.)

    B01 = ?

  2. What is the longest possible hold time such that there are no hold time violations? (Please include ns in your answer.)

    B02 = ?

  3. Represent the circuit above using an equivalent FSM, listed in the following as transitions, where X is the input and Q is the output, with the state labels encoding Reg1Reg2 (e.g., 01 means Reg1 = 0 and Reg2 = 1). We did one transition already.

    FSM transitions (edge labels as given in the diagram):
    • Start → 00
    • 00 → B03: 0/1
    • 00 → 11: 1/1
    • B03 → 00: 0/0
    • B03 → B04: 1/1
    • B04 → B03: x/1
    • 11 → B03: x/1

  • B03 = ?
  • B04 = ?

Problem C

What is the FULLY SIMPLIFIED (fewest primitive gates) circuit for the equation below? You may use the following primitive gates: AND, NAND, OR, NOR, XOR, XNOR, and NOT. (You can use the LaTeX syntax \overline A to represent $\overline{A}$.)

=(C+ABC+BCD)+(C+B+D)=C01

  • C01 = ?

Problem D

Consider the following RISC-V assembly code.

.text
        mv s1, a0
        addi s2, s2, 4
Start:  beq s1, x0, End
        lw a0, 0(s1)
        jal ra, printf
        add s1, s2, s1
        lw s1, 0(s1)
        jal x0, Start
End:    jalr x0, ra, 0

Recall that immediate values are generated from instructions with the following table:

[Figure: RISC-V immediate generation table]

We will refer to the number produced after this process is completed as the "immediate value."
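
To make the immediate-generation step concrete, the following sketch reassembles the immediate of a branch (B-type) instruction from a 32-bit machine word. It assumes the standard RV32I B-type bit layout, which is expected to match the table referenced above; the function name btype_immediate is hypothetical, and the sample words in the note after the code encode beq x0, x0, 8 and beq x0, x0, -4 rather than the instruction asked about in this problem.

```c
#include <stdint.h>

/* Reassemble the sign-extended branch offset from a B-type instruction word
 * (standard RV32I layout assumed). */
int32_t btype_immediate(uint32_t inst)
{
    uint32_t imm = 0;
    imm |= ((inst >> 31) & 0x1u) << 12;  /* inst[31]    -> imm[12]   */
    imm |= ((inst >> 7) & 0x1u) << 11;   /* inst[7]     -> imm[11]   */
    imm |= ((inst >> 25) & 0x3Fu) << 5;  /* inst[30:25] -> imm[10:5] */
    imm |= ((inst >> 8) & 0xFu) << 1;    /* inst[11:8]  -> imm[4:1]  */
    /* imm[0] is always 0; sign-extend from bit 12. */
    if (imm & 0x1000u)
        imm |= 0xFFFFE000u;
    return (int32_t)imm;
}
```

For example, btype_immediate(0x00000463) gives 8 and btype_immediate(0xFE000EE3) gives -4.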

What are the fields for the machine code generated for beq s1, x0, End (line 4)?

Immediate value

  • D01 = ?

funct3

  • D02 = ?

opcode

  • D03 = ?

rs1

  • D04 = ?

rs2

  • D05 = ?

Problem E

Consider the following pipelined circuit. Assume all registers have their clock inputs correctly connected to a global clock signal and that logic gates have the following parameters:

  • XOR gate delay: 80 ps
  • AND gate delay: 60 ps
  • OR gate delay: 40 ps

[Figure: pipelined circuit from A to B]

When shopping for registers, we find two different models and want to determine which would be best for our circuit.

Register Type λ

  • Setup Time: 40 ps
  • Hold Time: 20 ps
  • Clock-to-Q Delay: 30 ps

Register Type τ

  • Setup Time: 10 ps
  • Hold Time: 10 ps
  • Clock-to-Q Delay: 80 ps
  1. What is the minimum latency for the circuit from A to B if we use register type λ? (Please include ps in your answer.)
  • E01 = ?
  2. What is the minimum latency for the circuit from A to B if we use register type τ? (Please include ps in your answer.)
  • E02 = ?
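
The register parameters above enter the timing analysis through the usual setup and hold checks. The sketch below expresses those two checks in C; the function names are hypothetical, and the 140 ps path used in the note after the code (one XOR followed by one AND) is an assumed example, not a path read from the circuit in this problem.

```c
/* Smallest clock period (ps) that satisfies the setup check for one stage:
 * clock-to-Q delay + longest combinational path + setup time. */
int min_clock_period_ps(int clk_to_q_ps, int longest_logic_ps, int setup_ps)
{
    return clk_to_q_ps + longest_logic_ps + setup_ps;
}

/* Hold check: the earliest change at a register input (clock-to-Q delay +
 * shortest combinational path) must not arrive before the hold time ends.
 * Returns 1 when the check passes and 0 otherwise. */
int hold_ok(int clk_to_q_ps, int shortest_logic_ps, int hold_ps)
{
    return clk_to_q_ps + shortest_logic_ps >= hold_ps;
}
```

For the assumed 140 ps path, min_clock_period_ps(30, 140, 40) is 210 with type λ registers and min_clock_period_ps(80, 140, 10) is 230 with type τ registers. The latency of a pipelined circuit then depends on how many register stages lie between its input and its output.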

Problem F

Consider the following RISC-V code:

Loop:   andi t2, t1, 1
        srli t3, t1, 1
        bltu t1, a0, Loop
        jalr s0, s1, MAX_POS_IMM
        ...
  1. What is the value of the byte offset that would be stored in the immediate field of the bltu instruction?
  • F01 = ?
  2. We would like to propose a revision to the standard 32-bit RISC-V instruction formats where each instruction has a unique opcode (which still is 7 bits). This justifies taking out the funct3 field from the R, I, S, and SB instructions, allowing you to allocate bits to other instruction fields except the opcode field. Assume register s0 = 0x1000 0000, s1 = 0x4000 0000, PC = 0xA000 0000. Let's analyze the instruction: jalr s0, s1, MAX_POS_IMM where MAX_POS_IMM is the maximum possible positive immediate for jalr. After the instruction executes, what are the values in the following registers? (Answer in HEX)
    • s0 = F02
    • s1 = F03
    • PC = F04
  • F02 = ?
  • F03 = ?
  • F04 = ?
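
As a reminder of what jalr does to its registers, here is a small C model of the standard RV32I semantics. The type jalr_result_t and the function jalr are hypothetical names, and the values in the note after the code are arbitrary; the immediate is passed in by the caller, so the model says nothing about how wide the immediate field is in the revised format.

```c
#include <stdint.h>

typedef struct {
    uint32_t rd; /* link value written to the destination register */
    uint32_t pc; /* program counter after the jump */
} jalr_result_t;

/* Standard RV32I jalr: rd = pc + 4, pc = (rs1 + imm) with bit 0 cleared. */
jalr_result_t jalr(uint32_t pc, uint32_t rs1, int32_t imm)
{
    jalr_result_t r;
    r.rd = pc + 4;
    r.pc = (rs1 + (uint32_t)imm) & ~(uint32_t)1;
    return r;
}
```

For instance, jalr(0x1000, 0x2000, 7) yields rd = 0x1004 and pc = 0x2006; the source register rs1 itself is left unchanged.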

Problem G

Consider the following circuit:

Assume input A and input B come from registers. Assume all 2-input logical gates have a 10 ns propagation delay. The NOT gate has a 5 ns delay. All registers have a clk-to-q of 15 ns and setup time of 20 ns.

  1. Find the minimum clock period to ensure the validity of the circuit. (Please include ns in your answer)
  • G01 = ?
  2. Find the maximum hold time such that there are no hold time violations. (Please include ns in your answer)
  • G02 = ?

Problem H

We wish to implement a function, reverse, that will take in a pointer to a string, its length, and reverse it. Assume that the argument registers, a0 and a1, hold the pointer to and length of the string, respectively. Complete the following code skeleton to implement this function.

reverse:
    # This part saves all the required registers you will use.
    mv s0, a0 # memory address
    mv s1, a1 # strlen
    addi t0, x0, 0 # iteration
Loop:
    # retrieve left and right letters
    add t1, s0, t0 # t1 is moving pointer from left (base + offset/iteration)
    lb t2, 0(t1) # t2 contains char from left
    sub t3, s1, t0 # imm needs to be s1 - t0
    H01 # since strlen indexes out of string
    add t4, s0, t3 # t4 is moving pointer from right (base + strlen - offset/iteration - 1)
    lb t5, 0(t4) # t5 contains char from right
    # switch chars
    sb t2, 0(t4)
    H02
    # iterate if necessary
    addi t0, t0, 1 # update iter
    H03
    H04
    mv a0, s0 # not necessary
    # This part restores all of the registers which were used.
    ret
  • H01 = ?
  • H02 = ?
  • H03 = ?
  • H04 = ?
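
For comparison, the same in-place reversal can be sketched in C. This is only a reference for the intended behavior, assuming the function swaps characters from both ends toward the middle; the loop bound len / 2 is one possible stopping condition and is not taken from the assembly skeleton.

```c
#include <stddef.h>

/* Reverse the first len bytes of s in place by swapping from both ends. */
void reverse(char *s, size_t len)
{
    for (size_t i = 0; i < len / 2; i++) {
        size_t j = len - i - 1; /* mirror index: strlen - offset - 1 */
        char left = s[i];       /* char from the left */
        s[i] = s[j];            /* right char goes to the left slot */
        s[j] = left;            /* left char goes to the right slot */
    }
}
```

Calling reverse on a buffer holding "hello" with len = 5 leaves "olleh" in the buffer.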

Problem J

Take a look at the following circuit:

We have a register clk-to-Q time of 5 ps, a hold time of 2 ps, and a setup time of 3 ps. AND and NAND gates have a delay of 5 ps, OR and XOR gates have a delay of 6 ps, and NOT gates have a delay of 1 ps. Assume that our inputs A, B, C, and D arrive on the rising edge of the clock.

  1. Which gates make up the critical path in the circuit above? Your answer should be correctly ordered from left to right, e.g., NOT → OR → NAND.

    • J01 = ?
  2. What is the critical path delay in the circuit?

    • J02 = ?
  3. Let us now consider only the portion of the circuit between Reg2 and Reg3. Assume that the clock period (rising edge to rising edge) is 100 ps, registers have a clk-to-Q delay of 25ps and a setup and hold time of 20ps, and all gates have a delay of 5ps. Choose the waveform with the correct outputs for Reg2 and Reg3.

  • Option A
  • Option B
  • Option C
  • Option D

Notation: For reference, in the diagram below, the first region indicates an "undefined" signal, the second region indicates a signal of "high" or 1, and the third region indicates a signal of "low" or 0.

J03 = ?


Problem K

Consider the following program that computes the Fibonacci sequence recursively. The C code is shown first, and its translation to RISC-V assembly follows it. You are told that the execution has been halted just prior to executing the ret instruction. The SP label on the stack frame (part 3) shows where the stack pointer is pointing to when execution halted.

  • C code
int fib(int n)
{
    if (n <= 1) return n;
    return fib(n - 1) + fib(n - 2);
}
  • RISC-V Assembly (incomplete)
fib:  addi sp, sp, -12
      sw ra, 0(sp)
      sw a0, 4(sp)
      sw s0, 8(sp)
      li s0, 0
      li a7, 1
if:   ble __K01__
sum:  addi a0, a0, -1
      call fib
      add s0, s0, a0
      lw a0, 4(sp)
      addi a0, a0, -2
      call fib
      mv t0, a0
      add a0, s0, t0
done: lw ra, 0(sp)
      lw s0, 8(sp)
L1:   addi sp, sp, 12
      ret
  1. Complete the missing portion of the ble instruction to make the assembly implementation match the C code.

    • K01 = ?
  2. How many distinct words will be allocated and pushed onto the stack each time the function fib is called?

    • K02 = ?
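  3. Please fill in the values for the blank locations in the stack trace below. Please express the values in HEX.

    | Notation        | address |
    |-----------------|---------|
    | Smaller address | 0x280   |
    |                 | 0x1     |
    |                 | K03     |
    | SP              | K04     |
    |                 | K05     |
    |                 | 0x0     |
    |                 | 0x280   |
    |                 | 0x3     |
    |                 | 0x0     |
    |                 | 0x2108  |
    |                 | 0x4     |
    |                 | 0x6     |
    | Larger address  | 0x1     |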
    • K03 = ?
    • K04 = ?
    • K05 = ?
  4. What is the hex address of the done label? (Answer in HEX)

    • K06 = ?
  5. What was the address of the original function call to fib? (Answer in HEX)

    • K07 = ?

Problem L

Suppose we want to create a system that decides if the concatenation of its previous 2 single-bit inputs is a power of 2 (where the MSB is the input from 2 cycles ago and the LSB is from 1 cycle ago). If the previous 2 bits (prior to the current input) are a power-of-two the system outputs a 1, otherwise it outputs 0. Before any input is sent, assume the initial previous 2 bits are 2'b00.

A partial finite state machine diagram of this circuit is shown below:

Before receiving any inputs the FSM is in state A.

  1. For this FSM to provide the correct answer, to which existing states (A, B, C, or D) must D transition, and what output does D give (0 or 1)?

    • Current State = D, Input = 0, Next State = __ L01 __
    • Current State = D, Input = 1, Next State = __ L02 __
    • Current State = D, Output = __ L03 __
    • L01 = ?
    • L02 = ?
    • L03 = ?
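
The behavior described in this problem can be modeled in C as a two-bit shift register whose output depends only on the stored bits. This is a sketch under two assumptions: 1 (2'b01) counts as a power of 2 because 1 = 2^0, and the names pow2_fsm_t, pow2_fsm_output, and pow2_fsm_step are hypothetical. The model does not say which stored value corresponds to which of the states A, B, C, or D in the diagram.

```c
#include <stdint.h>

typedef struct {
    uint8_t prev2; /* two most recent inputs: bit 1 is older, bit 0 is newer */
} pow2_fsm_t;

/* Output depends only on the stored bits: 2'b01 (1) and 2'b10 (2) are
 * treated as powers of 2; 2'b00 (0) and 2'b11 (3) are not. */
int pow2_fsm_output(const pow2_fsm_t *f)
{
    return f->prev2 == 1 || f->prev2 == 2;
}

/* Consume one input bit: report on the previous two bits, then shift in
 * the new bit and drop the oldest one. */
int pow2_fsm_step(pow2_fsm_t *f, int bit)
{
    int out = pow2_fsm_output(f);
    f->prev2 = (uint8_t)(((f->prev2 << 1) | (bit & 1)) & 0x3);
    return out;
}
```

Starting from prev2 = 0 (the initial 2'b00) and feeding the inputs 1, 0, 1, 1, 0 produces the outputs 0, 1, 1, 1, 0.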