# Quiz3 of Computer Architecture (2020 Fall), Annotated
###### tags: `arch2020`
:::info
[Original Questions](https://hackmd.io/@sysprog/arch2020-quiz3)
[Solutions](https://hackmd.io/@sysprog/arch2020-quiz3-sol)
:::
## Question A
### Description
Simplify the following Boolean expressions by finding a minimal sum-of-products expression for each one. These expressions can be reduced into a minimal sum-of-products (SOP) by repeatedly applying the Boolean algebra properties.
1. $\overline{a+b\cdot{\overline c}} \cdot d + c$
2. $a \cdot \overline{(b+c)}(c + a)$
### Solution
[A list of boolean algebra laws](https://en.wikipedia.org/wiki/Boolean_algebra#Laws) can be found on Wikipedia. In the following simplifications, we denote the law applied in each step. Application of association laws are skipped for clarity.
1.
$$
\begin{align}
&\phantom{=} \overline{a+b\cdot{\overline c}} \cdot d + c & \\
&= (\overline a \cdot \overline{b\cdot{\overline c}}) \cdot d + c & \text{(De Morgan 1)} \\
&= (\overline a \cdot \overline b + \overline{\overline c}) \cdot d + c & \text{(De Morgan 2)} \\
&= (\overline a \cdot \overline b + c) \cdot d + c & \text{(Double negation)} \\
&= \overline a \cdot \overline b \cdot d + c \cdot d + c & \text{(Distributivity of $\cdot$ over $+$)} \\
&= \overline a \cdot \overline b \cdot d + c & \text{(Absorption 2)} \\
\end{align}
$$
2.
$$
\begin{align}
&\phantom{=} a \cdot \overline{(b+c)}(c + a) & \\
&= a\cdot(\overline b \cdot \overline c)(c+a) & \text{(De Morgan 2)} \\
&= a\cdot(\overline b \cdot \overline c \cdot c + \overline b \cdot \overline c \cdot a) & \text{(Distributivity of $\cdot$ over $+$)} \\
&= a\cdot(\overline b \cdot 0 + \overline b \cdot \overline c \cdot a) & \text{(Complementation 1)} \\
&= a\cdot( 0 + \overline b \cdot \overline c \cdot a) & \text{(Annihilator for $\cdot$)} \\
&= a\cdot \overline b \cdot \overline c \cdot a & \text{(Identity for $+$)} \\
&= a\cdot \overline b \cdot \overline c & \text{(Idempotence of $\cdot$)} \\
\end{align}
$$
## Question B
### Description
1. How many transistors does this implementation have?

2. Consider the implementation shown below, which uses three instances of gate F. Find the Boolean expression for F. If F can be built using a single CMOS gate, say “Yes.” Otherwise, give a convincing explanation for why F cannot be implemented as a CMOS gate. How many transistors does this implementation have?

3. Consider the implementation shown below, which uses gate G. Find the Boolean expression for G. If G can be built using a single CMOS gate, say “Yes.” Otherwise, give a convincing explanation for why G cannot be implemented as a CMOS gate. How many transistors does this implementation have?

4. Consider the implementation shown below, which uses gate H. Find the Boolean expression for H. If H can be built using a single CMOS gate, say “Yes.” Otherwise, give a convincing explanation for why H cannot be implemented as a CMOS gate. How many transistors does this implementation have?

### Solution
1. A CMOS logic gates consists of a pull up network and a pull down network. When the target function evaluates to TRUE, the pull up network is "connected" (low resistance) and the pull down network is "cut off" (high resistance). Typically, the pull up network uses PMOSs and the pull down network uses NMOSs.
Some basic transistor counts:
| Gate | Transistor count |
| -------- | -------- |
| Inverter | 2 |
| NAND | 4 |
| NOR | 4 |
| AND (NAND + INV) | 6 |
| OR (NOR + INV) | 6 |
In this problem, there are $2 + 6 + 6 + 6 = 20$ transistors.
2. By [bubble pushing](https://grace.bluegrass.kctcs.edu/~kdunn0001/files/Simplification/4_Simplification10.html), it can be shown that F can be a single NAND gate, which can be built with 4 transistors. The total number of transistors used is $2 + 4 + 4 + 4 = 14$.
3. CMOS logic is inverting - when an input is risen to 1, the pull up network has more open parts and the pull down network has more closed, and therefore the output cannot be higher than the original, and vice versa. Now consider the case $(\overline A,\overline B,\overline S) = (1, 0, 0)$. When $\overline S$ becomes 1 instead, the output also rises from 0 to 1, which is not realizable with a single CMOS gate.
4. The standard gate of the expression $\overline {as+bt}$ uses 8 transistors in total. With two additional inverters, a total of $2 + 2 + 8 = 12$ transistors are used.

## Question C
### Description
1. What value should the C01 term in the C code and the assembly be replaced with to make the if statement correctly check if the variable c is a multiple of 4?
2. How many words will be written to the stack before the program makes each recursive call to the function f?
3. The program’s initial call to function `f` occurs outside of the function definition via the instruction `jal ra, f`. The program is interrupted at an execution (not necessarily the first) of function f, just prior to the execution of `add a0, a0, a1` at label `L1`. The below diagram on the right shows the contents of a region of memory. All addresses and data values are shown in hex. The current value in the SP register is `0xEB0` and points to the location shown in the diagram.
4. What are the values in the following registers right when the execution of f is interrupted? Write “UNKNOWN” if you cannot tell.
5. What is the hex address of the jal ra, f instruction that made the initial call to f?
6. What is the hex address of the instruction at label ELSE?
### Solution
1. 0x3. A number is a multiple of 4 iff the last 2 bits are 0.
2. 2, the return address and the original `a`. Note that `b` is not saved because it is not used after the recursive call.
3. In recursive calls, the return address should be `0xA4` (the A4 label), so the first saved registers are `(a1, ra) = (0x4, 0x3B8)`, where `a = a0 = 4`. `b` cannot be inferred as it is not written to the stack.
4. The original solution says that `(a1, ra) = (0x2, 0xA4)`, but it may be the case that it is UNKNOWN since the interrupt can be anywhere after the A4 tag, so the registers could be fully, partially or not loaded at all.
5. The initial call is `0x3B4` (an instruction before the return address)
6. `0xa4 - 4 * 6 = 0x8c`
## Question D
### Description
1. Consider the logic diagram below, which includes XNOR2, OR2, NAND2, AND2, and INV. Using the tPD (propagation delay) information for the gate components shown in the table below, compute the tPD for the circuit.
2. Find a minimal sum-of-products expression for output X of the circuit described by the truth table shown below.
### Solution
1. 17.5ns

2. The problem can be solved with a [Karnaugh map](https://en.wikipedia.org/wiki/Karnaugh_map), either manually or with a [online solver](http://www.32x8.com/sop4_____A-B-C-D_____m_1-2-6-12-13-14-15___________option-0_____898798870972822592658).
## Question E
### Description
1. What is the hexadecimal encoding of the RISC-V instruction `sw t1, -4(t1)`?
2. For the following code snippet, provide the value left in each register after executing the entire code snippet (i.e., when the processor reaches the instruction at the end label), or specify “UNKNOWN” if it is impossible to tell the value of a particular register.
### Solution
1. `0b1111111_00110_00110_010_11100_0100011 = 0xfe632e23`
2. In RISC-V, immediates are always signed extended except in CSR. The 12-bit immediate `0xc00` is extended into the full word `0xfffffc00`. After execution,
```
x4 = 0x6 << 8 = 0x600
x5 = 0xfffffc00
x6 = x4 | x5 = 0xfffffe00
```
The end label is 4 instructions from the initial pseudo-instruction `. = 0x100`, and thus has address `0x100 + 4*4 = 0x110`
## Question F
### Description
```cpp
int fib(int n) {
if (n <= 1) return n;
return fib(n - 1) + fib(n - 2);
}
```
```
fib: addi sp, sp, -12
sw ra, 0(sp)
sw a0, 4(sp)
sw s0, 8(sp)
li s0, 0
li a7, 1
if: ble __F01__
sum: addi a0, a0, -1
call fib
add s0, s0, a0
lw a0, 4(sp)
addi a0, a0, -2
call fib
mv t0, a0
add a0, s0, t0
done: lw ra, 0(sp)
lw s0, 8(sp)
L1: addi sp, sp, 12
ret
```
Complete the missing portion of the ble instruction to make the assembly implementation match the C code.
How many distinct words will be allocated and pushed into the stack each time the function fib is called?
### Solution
At the `if` label, the argument `a0` is compared with loaded constant 1 `a7`. If `n` is less than or equal to 1, the function restores the registers in the stack and returns. Therefore `__F01__` is `a0, a7, done`.
The three `sw` instruction at the top of the function stores 3 words in total.