# CA Mid 2 Past (5/2) <style> .markdown-body li + li { padding-top: 0 !important; } .markdown-body table th, .markdown-body table td { padding: 2px 10px; } </style> --- [TOC] --- ## 考古 2021 Mid 1 #### 6 - (a) | ID/EX Opcode | IF/ID Opcode | Interlock if Matching | |:------------:|:----------------------------:|:------------------------------:| | Load | Reg-reg ALU | `ID/EX.IR[rt] == IF/ID.IR[rs]` | | Load | Reg-reg ALU | `ID/EX.IR[rt] == IF/ID.IR[rt]` | | Load | Load, store, ALU imm, branch | `ID/EX.IR[rt] == IF/ID.IR[rs]` | - (b) Yes. | EX/MEM | Reg-reg ALU | ID/EX | Reg-reg ALU, ALU imm, load, store, branch | Top ALU input | `EX/EME.IR[rd] == ID/EX.IR[rs]` | - \(c\) No. - (d) Yes. | MEM/WB | Load | ID/EX | Reg-reg ALU, ALU imm, load, store, branch | Top ALU input | `MEM/WB.IR[rt] == ID/EX.IR[rs]` | ## 考古 2021 Mid 2 #### 1 - (a) $m = 3,\ n = 2$ BHT number: $2^m = 2^3$ Entry size: $2$ Total entry number: $2^{17} / 2 = 2^{16}$ Entry number **in BHT**: $2^{16} / 2^3 = 2^{13}$ - (b) $13$ <!-- #### 2 ==(OUT)== --> #### 3 ==W?== - Sometimes the behavior is correlated, and we can do better by keep track of direction of related branches - Predictors that use the behavior of other branches to make a prediction are called correlating predictors or 2-level predictors - ```c if(d == 0) d = 1; if(d == 1) // ... ``` - 1-bit predictor: ![](https://i.imgur.com/R1ca9Ae.png) - (1, 1) correlating predictors: ![](https://i.imgur.com/7XLlFyo.png) #### 4 - (a) - Separate execution complete from inst commit - Out-of-order execution but in-order commit - Preform update that cannot be undone only when the inst is no longer speculative - Update the reg file or memory: inst commit - With: - Dynamic branch prediction - Dynamic scheduling of different combinations of basic blocks - (b) - Require additional HW buffers (Reorder buffer, ROB) - Hold the results of inst between completion and commit - 4 fields: Instruction type, Destination field, Value field, Ready field <!-- #### 5 ==(OUT?)== --> #### 6 - Wrong guess for the branch - Get branch history of wrong branch when indexing the table #### 7 ==W== <!-- - It is uncertain. - By adding more entries in a BHT, it is possible to reduce the frequency of getting branch history of wrong branch when indexing the table. - But it may encounter some bottlenecks such as slower indexing speed, larger cost, and other HW limitations. - Thus, it is uncertain that adding more entries in a BHT ca always improve the accuracy of dynamic branch prediction. --> - Majorly yes. - By adding more entries in a BHT, it is possible to reduce the frequency of getting branch history of wrong branch when indexing the table. - But while adding too much more, the reduction in miss rate would be very insignificant. - Moreover, it may encounter some bottlenecks such as slower indexing speed, larger cost, and other HW limitations. - Thus, choosing a proper number of entries to balance between adv. and disadv. is the most important in practice #### 8 - (A), (B), \(C\), (E) #### 9 - (D) #### 10 - (E) ## 考古 2020 Mid 1 #### 1 - Advantages: - A stage can be implemented with simpler circuitry, which may let the processor clock run faster - Can handle multiple operations in parallel with varying running times - Challenges: - The divide unit is not fully pipelined, structural hazards can occur - It has varying running times and out-of-order completion which may raise the frequency of hazards and stalls #### 5 - True Dependence: Inst 1 produces a result used by inst 2 - May caused RAW (Read After Write): Inst 2 reads operand before inst 1 writes it - ```MIPS ; r1 in S2 is true dependent on r1 in S1 add r1, r2, r3 ; S1 sub r4, r1, r3 ; S2 ``` - Anti-dependence: Inst 2 writes a reg or mem loc that inst 1 reads from - May caused WAR (Write After Read): Inst 2 writes operand before inst 1 reads it - ```MIPS ; r1 in S2 is anti-dependent on r1 in S1 add r4, r1, r3 ; S1 sub r1, r2, r3 ; S2 ``` - Output dependence: Inst 1 & inst 2 write the same reg or mem loc - May caused WAW (Write After Write):: Inst 2 writes operand before inst 1 writes it - ```MIPS ; r1 in S2 is output dependent on r1 in S1 add r1, r4, r3 ; S1 sub r1, r2, r3 ; S2 ``` #### 7 - Reschedule insts to not waste stalled cycles (called delayed slot) for branch insts - Canceling branches allow more slots to be filled - Insts to fill branch delayed slots can be: - From before: Improve perf always - Branch must not depend on the rescheduled insts - From target (taken part): Improve perf when branch is taken - Must be fine to execute rescheduled insts if branch is not taken - From fall-through (untaken part): Improve perf when branch is not taken - Must be fine to execute rescheduled insts if branch is taken #### 8 - (B) ## 考古 2020 Mid 2 #### 1 ==(SIMILAR)== - (a) $m = 3,\ n = 2$ BHT number: $2^m = 2^3$ Entry size: $n = 2$ Entry number in a BHT: $2^{19} / (2^3 \cdot 2) = 2^{15}$ - (b) $15$ <!-- #### 2 ==(OUT)== --> <!-- #### 3 ==(SAME)== --> <!-- #### 4 ==(SAME)== --> <!-- #### 5 ==(OUT)== --> <!-- #### 6 ==(SAME)== --> <!-- #### 7 ==(SAME)== --> #### 8 ==(SIMILAR)== - (B), (D) #### 9 ==(SIMILAR)== - (A), (D), (E) #### 10 ==(SIMILAR)== - (A), (B), \(C\), (D), (E) ## 考古 2019 Mid 1 <!-- #### 1 ==(SAME)== --> #### 5 - (B), \(C\) ## 考古 2018 Mid 1 #### 8 - (B), \(C\) ## 考古 2018 Mid 2 #### 1 ==W== - Un-correlating branch prediction: - Recent results of a branch are correlated - Correlating branch prediction: - Recent branches are correlated <!-- #### 2 ==(SAME)== --> #### 3 - Loop-carried dependence: there are dependencies between iterations of a loop <!-- - If there are circular loop-carried dependences in a loop, then iterations caannot be executed in parallel. --> <!-- #### 4 ==(OUT)== --> #### 5 - The iterations are independent - Repeat the statement (basic block), adjust offsets and the loop termination and iteration code - ```c for(int i = 0; i < 10; i++){ a[i] = a[i] + b; } for(int i = 0; i < 10; i += 2){ a[i] = a[i] + b; a[i+1] a[i+1] + b; } ``` #### 6 1. Issue (ID1): Decode insts, check for structural hazard - Issue in program order - Not issue (stall) if structural hazard - Not issue (stall) if WAW harzards: - Output dependent on any previous issued but uncompleted inst 2. Read operands (ID2): Wait until no data hazards, then read operands - Resolve RAW hazards, wait for insts to write back data - No forwarding! 3. Execution (EX): Operate on operands - The functional unit begins execution upon receiving operands - When the result is ready, it notifies the scoreboard that it has completed execution 4. Write result (WB): Finish execution - Stall if WAR hazards with previous insts** #### 7 - $m = 3,\ n = 2$ - BHT number: $2^m = 2^3 = 8$ - Entry size: $2$ - BHT entries number: $2^{11}$ - Total bits: $2^{11} \cdot 2^3 \cdot 2 = 2^{15}$ bits - This $(m,n)$ predictor records last $m$ branches to select between $2^m$ history tables each with $n$-bit counters #### 8 - True dependences: - 1-R1, 2-R1 - 2-R2, 4-R2 - 3-R3, 5-R3 - Anti-dependences: - 1-R2, 2-R2 - 1-R3, 3-R3 - 2-R1, 4-R1 - 3-R5, 5-R5 - 4-R5, 5-R5 - Output dependences: - 1-R1, 4-R1 - ```mips add r1, r2, r3 sub r7, r1, r4 ; r2 -> r7 sub r8, r5, r6 ; r3 -> r8 add r9, r7, r5 ; r1 -> r9, r2 -> r7 add r10, r6, r8 ; r5 -> r10, r3 -> r8 ``` # Garbage Can | Type | Class | Syntax | |:----:|:-----------:|:---------------:| | R | Reg-reg ALU | `rd, rs, rt` | | I | ALU imm | `rt, rs, imm` | | I | Load/store | `rt, imm(rs)` | | I | Branch | `rs, rt, label` | | ID/EX Opcode | IF/ID Opcode | Interlock if Matching | |:------------:|:---------------:|:------------------------------:| | Load | Rr ALU | `ID/EX.IR[rt] == IF/ID.IR[rs]` | | Load | Rr ALU | `ID/EX.IR[rt] == IF/ID.IR[rt]` | | Load | ALU imm, L.S.B. | `ID/EX.IR[rt] == IF/ID.IR[rs]` | | From p-reg | From inst | To p-reg | To inst | To ALU input | If IR matching then forward | |:----------:|:---------:|:--------:|:-----------------------:|:------------:|:---------------------------:| | X/M | Rr ALU | D/X | Rr ALU, ALU imm, L.S.B. | T | rd, rs | | X/M | Rr ALU | D/X | Rr ALU | B | rd, rt | | M/W | Rr ALU | D/X | Rr ALU, ALU imm, L.S.B. | T | rd, rs | | M/W | Rr ALU | D/X | Rr ALU | B | rd, rt | | X/M | ALU imm | D/X | Rr ALU, ALU imm, L.S.B. | T | rt, rs | | X/M | ALU imm | D/X | Rr ALU | B | rt, rt | | M/W | ALU imm | D/X | Rr ALU, ALU imm, L.S.B. | T | rt, rs | | M/W | ALU imm | D/X | Rr ALU | B | rt, rt | | M/W | Load | D/X | Rr ALU, ALU imm, L.S.B. | T | rt, rs | | M/W | Load | D/X | Rr ALU | B | rt, rt | | S/H | Technique | ⬇ Stalls | |:---:|:----------------------------------------- |:---------:| | SW | Loop Unrolling | Control | | SW | Basic Ppl Scheduling | RAW | | HW | Dynamic Scheduling with Scoreboarding | RAW | | HW | Dynamic Scheduling with Register Renaming | WAR & WAW | | HW | Dynamic Branch Pred | Control | | HW | Issuing multiple insts per cycle | Ideal CPI |