# CA Mid 2 Past (5/2)
<style>
.markdown-body li + li
{
padding-top: 0 !important;
}
.markdown-body table th, .markdown-body table td
{
padding: 2px 10px;
}
</style>
---
[TOC]
---
## 考古 2021 Mid 1
#### 6
- (a)
| ID/EX Opcode | IF/ID Opcode | Interlock if Matching |
|:------------:|:----------------------------:|:------------------------------:|
| Load | Reg-reg ALU | `ID/EX.IR[rt] == IF/ID.IR[rs]` |
| Load | Reg-reg ALU | `ID/EX.IR[rt] == IF/ID.IR[rt]` |
| Load | Load, store, ALU imm, branch | `ID/EX.IR[rt] == IF/ID.IR[rs]` |
- (b)
Yes.
| EX/MEM | Reg-reg ALU | ID/EX | Reg-reg ALU, ALU imm, load, store, branch | Top ALU input | `EX/EME.IR[rd] == ID/EX.IR[rs]` |
- \(c\)
No.
- (d)
Yes.
| MEM/WB | Load | ID/EX | Reg-reg ALU, ALU imm, load, store, branch | Top ALU input | `MEM/WB.IR[rt] == ID/EX.IR[rs]` |
## 考古 2021 Mid 2
#### 1
- (a)
$m = 3,\ n = 2$
BHT number: $2^m = 2^3$
Entry size: $2$
Total entry number: $2^{17} / 2 = 2^{16}$
Entry number **in BHT**: $2^{16} / 2^3 = 2^{13}$
- (b)
$13$
<!-- #### 2 ==(OUT)== -->
#### 3 ==W?==
- Sometimes the behavior is correlated, and we can do better by keep track of direction of related branches
- Predictors that use the behavior of other branches to make a prediction are called correlating predictors or 2-level predictors
-
```c
if(d == 0)
d = 1;
if(d == 1)
// ...
```
- 1-bit predictor:

- (1, 1) correlating predictors:

#### 4
- (a)
- Separate execution complete from inst commit
- Out-of-order execution but in-order commit
- Preform update that cannot be undone only when the inst is no longer speculative
- Update the reg file or memory: inst commit
- With:
- Dynamic branch prediction
- Dynamic scheduling of different combinations of basic blocks
- (b)
- Require additional HW buffers (Reorder buffer, ROB)
- Hold the results of inst between completion and commit
- 4 fields: Instruction type, Destination field, Value field, Ready field
<!-- #### 5 ==(OUT?)== -->
#### 6
- Wrong guess for the branch
- Get branch history of wrong branch when indexing the table
#### 7 ==W==
<!-- - It is uncertain.
- By adding more entries in a BHT, it is possible to reduce the frequency of getting branch history of wrong branch when indexing the table.
- But it may encounter some bottlenecks such as slower indexing speed, larger cost, and other HW limitations.
- Thus, it is uncertain that adding more entries in a BHT ca always improve the accuracy of dynamic branch prediction. -->
- Majorly yes.
- By adding more entries in a BHT, it is possible to reduce the frequency of getting branch history of wrong branch when indexing the table.
- But while adding too much more, the reduction in miss rate would be very insignificant.
- Moreover, it may encounter some bottlenecks such as slower indexing speed, larger cost, and other HW limitations.
- Thus, choosing a proper number of entries to balance between adv. and disadv. is the most important in practice
#### 8
- (A), (B), \(C\), (E)
#### 9
- (D)
#### 10
- (E)
## 考古 2020 Mid 1
#### 1
- Advantages:
- A stage can be implemented with simpler circuitry, which may let the processor clock run faster
- Can handle multiple operations in parallel with varying running times
- Challenges:
- The divide unit is not fully pipelined, structural hazards can occur
- It has varying running times and out-of-order completion which may raise the frequency of hazards and stalls
#### 5
- True Dependence: Inst 1 produces a result used by inst 2
- May caused RAW (Read After Write): Inst 2 reads operand before inst 1 writes it
- ```MIPS
; r1 in S2 is true dependent on r1 in S1
add r1, r2, r3 ; S1
sub r4, r1, r3 ; S2
```
- Anti-dependence: Inst 2 writes a reg or mem loc that inst 1 reads from
- May caused WAR (Write After Read): Inst 2 writes operand before inst 1 reads it
- ```MIPS
; r1 in S2 is anti-dependent on r1 in S1
add r4, r1, r3 ; S1
sub r1, r2, r3 ; S2
```
- Output dependence: Inst 1 & inst 2 write the same reg or mem loc
- May caused WAW (Write After Write):: Inst 2 writes operand before inst 1 writes it
- ```MIPS
; r1 in S2 is output dependent on r1 in S1
add r1, r4, r3 ; S1
sub r1, r2, r3 ; S2
```
#### 7
- Reschedule insts to not waste stalled cycles (called delayed slot) for branch insts
- Canceling branches allow more slots to be filled
- Insts to fill branch delayed slots can be:
- From before: Improve perf always
- Branch must not depend on the rescheduled insts
- From target (taken part): Improve perf when branch is taken
- Must be fine to execute rescheduled insts if branch is not taken
- From fall-through (untaken part): Improve perf when branch is not taken
- Must be fine to execute rescheduled insts if branch is taken
#### 8
- (B)
## 考古 2020 Mid 2
#### 1 ==(SIMILAR)==
- (a)
$m = 3,\ n = 2$
BHT number: $2^m = 2^3$
Entry size: $n = 2$
Entry number in a BHT: $2^{19} / (2^3 \cdot 2) = 2^{15}$
- (b)
$15$
<!-- #### 2 ==(OUT)== -->
<!-- #### 3 ==(SAME)== -->
<!-- #### 4 ==(SAME)== -->
<!-- #### 5 ==(OUT)== -->
<!-- #### 6 ==(SAME)== -->
<!-- #### 7 ==(SAME)== -->
#### 8 ==(SIMILAR)==
- (B), (D)
#### 9 ==(SIMILAR)==
- (A), (D), (E)
#### 10 ==(SIMILAR)==
- (A), (B), \(C\), (D), (E)
## 考古 2019 Mid 1
<!-- #### 1 ==(SAME)== -->
#### 5
- (B), \(C\)
## 考古 2018 Mid 1
#### 8
- (B), \(C\)
## 考古 2018 Mid 2
#### 1 ==W==
- Un-correlating branch prediction:
- Recent results of a branch are correlated
- Correlating branch prediction:
- Recent branches are correlated
<!-- #### 2 ==(SAME)== -->
#### 3
- Loop-carried dependence: there are dependencies between iterations of a loop
<!-- - If there are circular loop-carried dependences in a loop, then iterations caannot be executed in parallel. -->
<!-- #### 4 ==(OUT)== -->
#### 5
- The iterations are independent
- Repeat the statement (basic block), adjust offsets and the loop termination and iteration code
-
```c
for(int i = 0; i < 10; i++){
a[i] = a[i] + b;
}
for(int i = 0; i < 10; i += 2){
a[i] = a[i] + b;
a[i+1] a[i+1] + b;
}
```
#### 6
1. Issue (ID1): Decode insts, check for structural hazard
- Issue in program order
- Not issue (stall) if structural hazard
- Not issue (stall) if WAW harzards:
- Output dependent on any previous issued but uncompleted inst
2. Read operands (ID2): Wait until no data hazards, then read operands
- Resolve RAW hazards, wait for insts to write back data
- No forwarding!
3. Execution (EX): Operate on operands
- The functional unit begins execution upon receiving operands
- When the result is ready, it notifies the scoreboard that it has completed execution
4. Write result (WB): Finish execution
- Stall if WAR hazards with previous insts**
#### 7
- $m = 3,\ n = 2$
- BHT number: $2^m = 2^3 = 8$
- Entry size: $2$
- BHT entries number: $2^{11}$
- Total bits: $2^{11} \cdot 2^3 \cdot 2 = 2^{15}$ bits
- This $(m,n)$ predictor records last $m$ branches to select between $2^m$ history tables each with $n$-bit counters
#### 8
- True dependences:
- 1-R1, 2-R1
- 2-R2, 4-R2
- 3-R3, 5-R3
- Anti-dependences:
- 1-R2, 2-R2
- 1-R3, 3-R3
- 2-R1, 4-R1
- 3-R5, 5-R5
- 4-R5, 5-R5
- Output dependences:
- 1-R1, 4-R1
-
```mips
add r1, r2, r3
sub r7, r1, r4 ; r2 -> r7
sub r8, r5, r6 ; r3 -> r8
add r9, r7, r5 ; r1 -> r9, r2 -> r7
add r10, r6, r8 ; r5 -> r10, r3 -> r8
```
# Garbage Can
| Type | Class | Syntax |
|:----:|:-----------:|:---------------:|
| R | Reg-reg ALU | `rd, rs, rt` |
| I | ALU imm | `rt, rs, imm` |
| I | Load/store | `rt, imm(rs)` |
| I | Branch | `rs, rt, label` |
| ID/EX Opcode | IF/ID Opcode | Interlock if Matching |
|:------------:|:---------------:|:------------------------------:|
| Load | Rr ALU | `ID/EX.IR[rt] == IF/ID.IR[rs]` |
| Load | Rr ALU | `ID/EX.IR[rt] == IF/ID.IR[rt]` |
| Load | ALU imm, L.S.B. | `ID/EX.IR[rt] == IF/ID.IR[rs]` |
| From p-reg | From inst | To p-reg | To inst | To ALU input | If IR matching then forward |
|:----------:|:---------:|:--------:|:-----------------------:|:------------:|:---------------------------:|
| X/M | Rr ALU | D/X | Rr ALU, ALU imm, L.S.B. | T | rd, rs |
| X/M | Rr ALU | D/X | Rr ALU | B | rd, rt |
| M/W | Rr ALU | D/X | Rr ALU, ALU imm, L.S.B. | T | rd, rs |
| M/W | Rr ALU | D/X | Rr ALU | B | rd, rt |
| X/M | ALU imm | D/X | Rr ALU, ALU imm, L.S.B. | T | rt, rs |
| X/M | ALU imm | D/X | Rr ALU | B | rt, rt |
| M/W | ALU imm | D/X | Rr ALU, ALU imm, L.S.B. | T | rt, rs |
| M/W | ALU imm | D/X | Rr ALU | B | rt, rt |
| M/W | Load | D/X | Rr ALU, ALU imm, L.S.B. | T | rt, rs |
| M/W | Load | D/X | Rr ALU | B | rt, rt |
| S/H | Technique | ⬇ Stalls |
|:---:|:----------------------------------------- |:---------:|
| SW | Loop Unrolling | Control |
| SW | Basic Ppl Scheduling | RAW |
| HW | Dynamic Scheduling with Scoreboarding | RAW |
| HW | Dynamic Scheduling with Register Renaming | WAR & WAW |
| HW | Dynamic Branch Pred | Control |
| HW | Issuing multiple insts per cycle | Ideal CPI |