# Quiz3 of Computer Architecture (2020 Fall)
:::info
:information_source: General Information
* You are allowed to read [lecture materials](http://wiki.csie.ncku.edu.tw/arch/schedule).
* That is, this is an open-book exam.
* You shall not disclose your answers during the quiz.
* Each answer is worth 5 points.
* :timer_clock: 09:10 ~ 10:00AM on Oct 27, 2020
:::
## Question `A`
Simplify the following Boolean expressions by finding a minimal sum-of-products expression for each one. These expressions can be reduced into a [minimal sum-of-products (SOP)](https://www.electronicshub.org/boolean-logic-sop-form-pos-form/) by repeatedly applying the Boolean algebra properties.
1. ![](https://i.imgur.com/fU6jHD6.png)
> * A01 = ?
2. ![](https://i.imgur.com/huw96Vv.png)
> * A02 = ?
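
A candidate simplification can be sanity-checked by comparing truth tables. The sketch below is only an illustration in C: `original_expr` and `simplified_expr` are hypothetical stand-ins (the quiz expressions are given as images and are not reproduced here), and `equivalent3` is a helper name introduced for this example. Agreement on every row shows that two expressions are equivalent; it does not show that the result is minimal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A Boolean function of up to three inputs; unused inputs are ignored. */
typedef bool (*bool_fn3)(bool a, bool b, bool c);

/* Hypothetical example (NOT one of the quiz expressions): a*b + a*!b */
bool original_expr(bool a, bool b, bool c) {
    (void)c;
    return (a && b) || (a && !b);
}

/* Candidate simplification of the hypothetical example: a */
bool simplified_expr(bool a, bool b, bool c) {
    (void)b;
    (void)c;
    return a;
}

/* Returns true when f and g agree on all 8 input combinations. */
bool equivalent3(bool_fn3 f, bool_fn3 g) {
    assert(f != NULL && g != NULL);
    for (unsigned row = 0; row < 8; row++) {
        bool a = (row >> 2) & 1u;
        bool b = (row >> 1) & 1u;
        bool c = row & 1u;
        if (f(a, b, c) != g(a, b, c)) {
            return false;  /* found a row where the outputs differ */
        }
    }
    return true;
}
```

Calling `equivalent3(original_expr, simplified_expr)` returns `true` for this pair, since `a*b + a*!b` reduces to `a` by the distributive and complement properties.
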
---
## Question `B`
[Multiplexers](https://en.wikipedia.org/wiki/Multiplexer) (or muxes) are used often, so it is important to optimize them. In this problem you will design several variants of a 1-bit, 2-to-1 mux (shown below) using CMOS gates and compare their costs in number of transistors.
Note: a CMOS gate consists of an output node connected to a single pFET-based pullup circuit and a single nFET-based pulldown circuit. A circuit obtained by combining multiple CMOS gates is not itself a CMOS gate.
![](https://i.imgur.com/U2Naq3w.png)
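
As a point of reference for the variants that follow, the logical behaviour of a 2-to-1 mux can be written down and checked exhaustively. This is a minimal sketch, assuming that `S = 0` selects input `a` and `S = 1` selects input `b`; the figure may use the opposite polarity. The function names are introduced here for illustration only, and the sketch says nothing about transistor counts.

```c
#include <assert.h>
#include <stdbool.h>

/* Behavioural model: s = 0 selects a, s = 1 selects b (assumed polarity). */
bool mux2_behavioral(bool a, bool b, bool s) {
    return s ? b : a;
}

/* Gate-level model: two ANDs, one OR, and an inverted copy of s. */
bool mux2_and_or(bool a, bool b, bool s) {
    return (a && !s) || (b && s);
}

/* Asserts that both models agree on all 8 input combinations. */
void verify_mux2(void) {
    for (unsigned row = 0; row < 8; row++) {
        bool a = (row >> 2) & 1u;
        bool b = (row >> 1) & 1u;
        bool s = row & 1u;
        assert(mux2_behavioral(a, b, s) == mux2_and_or(a, b, s));
    }
}
```

Any gate-level variant can be checked the same way by swapping in its Boolean expression for `mux2_and_or`.
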
1. Consider the implementation shown below, which uses two AND gates and an OR gate. Because a single CMOS gate cannot implement AND or OR, each AND gate is implemented with a CMOS NAND gate followed by a CMOS inverter, and the OR gate is implemented with a CMOS NOR gate followed by a CMOS inverter. How many transistors does this implementation have?
![](https://i.imgur.com/4daEGx4.png)
Number of transistors in mux: __ B01 __
> * B01 = ?
2. Consider the implementation shown below, which uses three instances of gate `F`. Find the Boolean expression for `F`. If `F` can be built using a single CMOS gate, say "Yes." Otherwise, give a convincing explanation for why `F` cannot be implemented as a CMOS gate. How many transistors does this implementation have?
![](https://i.imgur.com/PABcyck.png)
Number of transistors in mux (if F can be built as a CMOS gate): __ B02 __
> * B02 = ?
3. Consider the implementation shown below, which uses gate `G`. Find the Boolean expression for `G`. If `G` can be built using a single CMOS gate, say "Yes." Otherwise, give a convincing explanation for why `G` cannot be implemented as a CMOS gate. How many transistors does this implementation have?
![](https://i.imgur.com/W924KR9.png)
Number of transistors in mux (if G can be built as a CMOS gate): __ B03 __
> * B03 = ?
4. Consider the implementation shown below, which uses gate `H`. Find the Boolean expression for `H`. If `H` can be built using a single CMOS gate, say "Yes." Otherwise, give a convincing explanation for why `H` cannot be implemented as a CMOS gate. How many transistors does this implementation have?
![](https://i.imgur.com/3rTvLky.png)
Number of transistors in mux (if H can be built as a CMOS gate): __ B04 __
> * B04 = ?
---
## Question `C`
Consider the C procedure below and its translation to RISC-V assembly code, following the C code.
- [ ] C procedure
```cpp
int f(int a, int b) {
    int c = b - a;
    if ((c & C01) == 0) /* c is a multiple of 4 */
        return 1;
    int d = f(a - 1, b + 2);
    return 3 * (d + a);
}
```
- [ ] The translated RISC-V assembly code
```
f:      sub  a2, a1, a0
        andi a2, a2, __C01__
        bnez a2, ELSE
        li   a0, 1
        jr   ra
ELSE:   addi sp, sp, -8
        sw   a0, 0(sp)
        sw   ra, 4(sp)
        addi a0, a0, -1
        addi a1, a1, 2
        jal  ra, f
A4:     lw   a1, 0(sp)
        lw   ra, 4(sp)
L1:     add  a0, a0, a1
        slli a1, a0, 1
        add  a0, a0, a1
        addi sp, sp, 8
        jr   ra
```
1. What value should the __C01__ term in the C code and the assembly be replaced with to make the if statement correctly check if the variable `c` is a multiple of 4?
> * C01 = ?
2. How many words will be written to the stack before the program makes each recursive call to the function f?
> * C02 = ?
3. The program's initial call to function `f` occurs outside of the function definition via the instruction `jal ra, f`. The program is interrupted during an execution (not necessarily the first) of function `f`, just prior to the execution of `add a0, a0, a1` at label `L1`. The diagram below shows the contents of a region of memory. All addresses and data values are shown in hex. The current value of the `sp` register is 0xEB0, and it points to the location shown in the diagram.
![](https://i.imgur.com/aYAOg0R.png)
What were the values of arguments a and b to the initial call to f? Write "UNKNOWN" if the argument does not show up in the stack.
Initial arguments to `f`: a = __ C03 __ ; b = __ C04 __
> * C03 = ?
> * C04 = ?
4. What are the values in the following registers right when the execution of f is interrupted? Write "UNKNOWN" if you cannot tell.
Current value (in hex) of `a1`: __ C05 __
Current value (in hex) of `ra`: __ C06 __
> * C05 = ?
> * C06 = ?
5. What is the hex address of the `jal ra, f` instruction that made the initial call to `f`?
Address (in hex) of instruction that made initial call to `f`: __ C07 __
> * C07 = ?
6. What is the hex address of the instruction at label `ELSE`?
Address of instruction at label `ELSE`: __ C08 __
> * C08 = ?
---
## Question `D`
1. Consider the logic diagram below, which includes `XNOR2`, `OR2`, `NAND2`, `AND2`, and `INV`. Using the $t_{PD}$ information for the gate components shown in the table below, compute the $t_{PD}$ for the circuit.
![](https://i.imgur.com/VkxLH0O.png)
| Gate | $t_{PD}$ |
|:-------:|:--------:|
| XNOR2 | 7.0 ns |
| OR2 | 5.5 ns |
| NAND2 | 3.0 ns |
| AND2 | 5.0 ns |
| INV | 2.0 ns |
$t_{PD}$ of the longest path, in ns: __ D01 __
> * D01 = ?
2. Find a minimal sum-of-products expression for output X of the circuit described by the truth table shown below.
![](https://i.imgur.com/h3p9dDd.png)
Minimal sum of products for X = __ D02 __
> * D02 = ?
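
For part 1, the $t_{PD}$ of a combinational circuit is the total delay along its slowest input-to-output path. The sketch below shows one way to compute that, assuming the gates are listed in topological order and each gate has at most two fan-in connections. The `gate_t` type, the `circuit_tpd` function, and the three-gate `example_circuit` are all hypothetical; the example topology is not the circuit in the quiz figure and only borrows delay values from the table.

```c
#include <assert.h>
#include <stddef.h>

#define PRIMARY_INPUT (-1)  /* marks a pin not driven by another gate */
#define MAX_GATES 64

typedef struct {
    double delay_ns;  /* t_PD of this gate */
    int in0, in1;     /* index of the driving gate, or PRIMARY_INPUT */
} gate_t;

/* Returns the worst-case arrival time over all gates.
   Gates must be in topological order: drivers before the gates they feed. */
double circuit_tpd(const gate_t *gates, size_t n) {
    double arrival[MAX_GATES];
    double worst = 0.0;
    assert(gates != NULL && n <= MAX_GATES);
    for (size_t i = 0; i < n; i++) {
        int ins[2] = { gates[i].in0, gates[i].in1 };
        double latest = 0.0;  /* primary inputs arrive at t = 0 */
        for (int k = 0; k < 2; k++) {
            if (ins[k] != PRIMARY_INPUT) {
                assert(ins[k] >= 0 && (size_t)ins[k] < i);
                if (arrival[ins[k]] > latest) {
                    latest = arrival[ins[k]];
                }
            }
        }
        arrival[i] = latest + gates[i].delay_ns;
        if (arrival[i] > worst) {
            worst = arrival[i];
        }
    }
    return worst;
}

/* Hypothetical circuit: an INV and a NAND2 both feed an OR2. */
const gate_t example_circuit[] = {
    { 2.0, PRIMARY_INPUT, PRIMARY_INPUT },  /* 0: INV   */
    { 3.0, PRIMARY_INPUT, PRIMARY_INPUT },  /* 1: NAND2 */
    { 5.5, 0, 1 },                          /* 2: OR2   */
};
```

For this hypothetical circuit, `circuit_tpd(example_circuit, 3)` evaluates to 8.5 ns, because the slower branch (NAND2, 3.0 ns) determines when the OR2 (5.5 ns) can settle.
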
---
## Question `E`
1. What is the hexadecimal encoding of the RISC-V instruction `sw t1, -4(t1)`? You can use the table below to help you with the encoding.
| [31:25] | [24:20] | [19:15] | [14:12] | [11:7] | [6:0] |
| ------- | ------- | ------- | ------- | ------ | ----- |
| imm[11:5] | rs2 | rs1 | funct3 | imm[4:0] | opcode |
* opcode = 0100011~2~
* funct3 = 010~2~
* `t1` = `x6` (ABI register name)
* `rs1` = 00110, `rs2` = 00110
`sw t1, -4(t1)` instruction encoding in HEX: __ E01 __
> * E01 = ?
2. For the following code snippet, provide the value left in each register after executing the entire code snippet (i.e., when the processor reaches the instruction at the `end` label), or specify "UNKNOWN" if it is impossible to tell the value of a particular register.
```
. = 0x100
        li   x4, 0x6
        addi x5, zero, 0xC00
        slli x4, x4, 8
        or   x6, x4, x5
end:
```
* `x4` = __ E02 __
* `x5` = __ E03 __
* `x6` = __ E04 __
* `pc` = __ E05 __
> * E02 = ?
> * E03 = ?
> * E04 = ?
> * E05 = ?
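
For part 1, the S-type layout in the table splits the 12-bit immediate across two fields, which is easy to get wrong by hand. The sketch below packs the fields in C so that a hand-computed encoding can be cross-checked. `encode_s_type` is a helper name introduced for this example, and the instruction used in the usage note is a different store, chosen so as not to stand in for the quiz answer.

```c
#include <assert.h>
#include <stdint.h>

/* Packs an RV32I S-type instruction word from its fields.
   imm is a signed byte offset in the range [-2048, 2047]. */
uint32_t encode_s_type(uint32_t opcode, uint32_t funct3,
                       uint32_t rs1, uint32_t rs2, int32_t imm) {
    assert(opcode < 128 && funct3 < 8 && rs1 < 32 && rs2 < 32);
    assert(imm >= -2048 && imm <= 2047);

    /* Keep the low 12 bits of the two's-complement offset. */
    uint32_t imm12 = (uint32_t)imm & 0xFFFu;

    return ((imm12 >> 5) << 25)      /* imm[11:5] -> bits [31:25] */
         | (rs2 << 20)               /* rs2       -> bits [24:20] */
         | (rs1 << 15)               /* rs1       -> bits [19:15] */
         | (funct3 << 12)            /* funct3    -> bits [14:12] */
         | ((imm12 & 0x1Fu) << 7)    /* imm[4:0]  -> bits [11:7]  */
         | opcode;                   /* opcode    -> bits [6:0]   */
}
```

As a usage example, `sw x5, 8(x2)` corresponds to `encode_s_type(0x23, 2, 2, 5, 8)`, which yields `0x00512423`. A negative offset sets the upper immediate bits: `encode_s_type(0x23, 2, 2, 5, -8)` yields `0xFE512C23`.
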
---
## Question `F`
Consider the following program that computes the Fibonacci sequence recursively. The C code is shown below, and its translation to RISC-V assembly is provided as well. You are told that the execution has been halted just prior to executing the `ret` instruction.
- [ ] C code
```cpp
int fib(int n) {
if (n <= 1) return n;
return fib(n - 1) + fib(n - 2);
}
```
- [ ] The translated RISC-V assembly
```cpp
fib:    addi sp, sp, -12
        sw   ra, 0(sp)
        sw   a0, 4(sp)
        sw   s0, 8(sp)
        li   s0, 0
        li   a7, 1
if:     ble  __F01__
sum:    addi a0, a0, -1
        call fib
        add  s0, s0, a0
        lw   a0, 4(sp)
        addi a0, a0, -2
        call fib
        mv   t0, a0
        add  a0, s0, t0
done:   lw   ra, 0(sp)
        lw   s0, 8(sp)
L1:     addi sp, sp, 12
        ret
```
Complete the missing portion of the `ble` instruction to make the assembly implementation match the C code.
> * F01 = ?
How many distinct words will be allocated and pushed onto the stack each time the function `fib` is called?
Number of words pushed onto the stack per call to `fib`: __ F02 __
> * F02 = ?
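
To reason about how much stack a recursive routine like this uses, it helps to know how many activations can be live at once. The sketch below instruments the C version of `fib` with a depth counter; `fib_traced`, `fib_max_depth`, and the two counters are names introduced for this illustration and do not appear in the quiz code. Peak stack usage is then the maximum depth multiplied by the per-call frame size, which is left for the reader to determine from the assembly.

```c
#include <assert.h>

static int current_depth = 0;  /* activations live right now */
static int max_depth = 0;      /* deepest nesting observed so far */

/* Same recursion as fib in the quiz, plus depth bookkeeping. */
int fib_traced(int n) {
    assert(n >= 0);
    current_depth++;
    if (current_depth > max_depth) {
        max_depth = current_depth;
    }
    int result = (n <= 1) ? n : fib_traced(n - 1) + fib_traced(n - 2);
    current_depth--;
    return result;
}

/* Resets the counters, runs fib_traced(n), and reports the peak depth. */
int fib_max_depth(int n) {
    current_depth = 0;
    max_depth = 0;
    (void)fib_traced(n);
    return max_depth;
}
```

For example, `fib_max_depth(5)` returns 5: the deepest chain is `fib(5)`, `fib(4)`, `fib(3)`, `fib(2)`, `fib(1)`, even though the total number of calls is larger.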