# 計算機結構 第三次小考 考前重點
## data dependency(資料相依性)(計算機組織裡叫hazard危障)
p.4

資料相依性:即一個指令的執行結果是否依賴於其他指令的結果。
## Name Dependence
p.6
- Antidependence: instruction j writes a register or memory location that instruction i reads
- Initial ordering (i before j) must be preserved
```mips
1. ADD R1, R2, 5 ;
2. SUB R2, R3, 10 ;
```
- Output dependence: instruction i and instruction j write the same register or memory location
- Ordering must be preserved
```mips
1. ADD R1, R2, 10 ;
2. SUB R1, R1, 5 ;
```
## WAW,WAR,RAW成因範例及解決方法
p.7
- Read after write (RAW) / Data Dependency
- 成因:在一個指令寫入資料的同時,另一個指令正在讀取該資料。
- 範例:
```mips=
ADD R1, R2, R3 ;
SUB R4, R1, R5 ;
```
- Write after write (WAW) / Output Dependency
- 成因:兩個指令寫入相同的目的地,並且第一個寫入尚未完成,而第二個寫入卻已經發生。
- 範例:
```mips=
ADD R1, R2, 10 ;
SUB R1, R1, 5 ;
```
- Write after read (WAR) / Anti Dependency
- 成因:在一個指令讀取資料的同時,另一個指令對該資料進行寫入。
- 範例:
```mips=
ADD R1, R2, 5 ;
SUB R2, R3, 10 ;
```
- 解決方法
register renaming 解決 WAW 與 WAR
strictly ordered 解決 RAW
## loop unrolling在幹嘛
p.12
將比如1000次的迴圈拆成250次的四個迴圈的指令的結構,消除部分相依性
(可能還有其他?)
- Unroll by a factor of 4 (assume # elements is divisible by 4)
- Eliminate unnecessary instructions
```c
for (int i = 0; i < 16; i++) {
sum += i;
}
for (int i = 0; i < 16; i += 4) {
sum += i;
sum += i + 1;
sum += i + 2;
sum += i + 3;
}
```
## Strip Mining
- Unknown number of loop iterations?
- Number of iterations = n
- Goal: make k copies of the loop body
- Generate pair of loops:
- First executes n mod k times
- Second executes n / k times
Strip mining:分成兩個迴圈執行。
```c
// First loop (executes n mod k times)
for (i = 0; i < n % k; i++) {
// 迴圈體的一部分
// ...
}
// Second loop (executes n / k times)
for (i = 0; i < n / k; i++) {
// 迴圈體的剩餘部分
// ...
}
```
## branch predictor 在幹嘛 務必熟悉1bit,2bit branch predictor(上課照片)
p.16
- (在幹嘛)用於預測程式中分支指令(如條件分支或迴圈分支)的走向,以提高指令執行的效率。


- 1位元分支預測器(1-bit Branch Predictor):
- 原理: 1位元分支預測器是最簡單的分支預測器之一。它僅保留一個位元,記錄上一次分支指令的執行結果(例如,是或否,分支跳躍或不跳躍)。
- 操作: 如果上一次分支執行成功,則預測下一次分支執行成功;如果上一次分支執行失敗,則預測下一次分支執行失敗。
- 問題: 1位元分支預測器對於連續出現相同結果的分支能夠提供合理的預測,但對於交替的分支(交錯的跳躍和不跳躍)效果較差。
- 2位元分支預測器(2-bit Branch Predictor):
- 原理: 2位元分支預測器引入了更多的狀態信息,具體而言,使用兩個位元來紀錄上一次的分支執行結果。
- 操作: 根據上一次和上上次的結果,可以構建四種狀態(強跳躍、弱跳躍、強不跳躍、弱不跳躍)。預測根據當前狀態,而不僅僅是單一的成功或失敗。
- 問題: 2位元分支預測器相對於1位元提供了更好的預測性能,但仍然可能在某些情況下無法確切預測分支。
## tomasulo 演算法有哪三個步驟
p.32
1. Issue
- Get next instruction from FIFO queue
- If available RS, issue the instruction to the RS with operand values if available
- If operand values not available, stall the instruction
2. Execute
- When operand becomes available, store it in any reservation stations waiting for it
- When all operands are ready, issue the instruction
- Loads and store maintained in program order through effective address
- ==No instruction allowed to initiate execution until all branches that proceed it in program order have completed==
3. Write result
- Write result on CDB into reservation stations and store buffers
- (Stores must wait until address and value are received)
## tomasulo 演算法 結合reorder buffer後多了什麼步驟
p.40
- Commit:
- When ROB reaches head of ROB, update register
- When a mispredicted branch reaches head of ROB, discard all entries
## reorder buffer
需搭配 hardware speculation 和 branch predictor
(?)在 Commit 階段,可能需要處理分支指令的錯誤預測,進行指令的recovery。
## 請問 當初tomasulo(1965)為何不做 reorder buffer,branch predictor
1980葉子玉提出,時間對不上
1990 branch predictor 才趨於成熟
## scoreboarding看一下
補充教材lecture 16 p74
Scoreboard: a bookkeeping technique
* Out-of-order execution divides ID stage:
1.Issue—decode instructions, check for structural hazards
2.Read operands—wait until no data hazards, then read
operands
* Scoreboards date to CDC6600 in 1963
– Readings for Monday include one on CDC6600
* Instructions execute whenever not dependent on previous instructions and no hazards
* CDC 6600: In order issue, out-of-order execution, outof-order commit (or completion)
– No forwarding!
– Imprecise interrupt/exception model for now
Four Stages of Scoreboard Control
1. Issue—decode instructions & check for structural hazards (ID1)
– Instructions issued in program order (for hazard checking)
– Don’t issue if structural hazard
– Don’t issue if instruction is output dependent on any previously issued but uncompleted instruction (no WAW hazards)
3. Read operands—wait until no data hazards, then read operands (ID2
– All real dependencies (RAW hazards) resolved in this stage, since we wait for instructions to write back data.
– No forwarding of data in this model!
4. Execution—operate on operands (EX
– The functional unit begins execution upon receiving operands. When the result is ready, it notifies the scoreboard that it has completed execution.
5. Write result—finish execution (WB)
– Stall until no WAR hazards with previous instructions:
Example:
```mips=
DIVD F0,F2,F4
ADDD F10,F0,F8
SUBD F8,F8,F14
```
CDC 6600 scoreboard would stall SUBD until ADDD reads operands

## VLIW是什麼
p.47
VLIW (very long instruction word)
Package multiple operations into one instruction