計算機結構第三次小考考前重點

data dependency(資料相依性)(計算機組織裡叫hazard危障)

p.4

Image Not Showing Possible Reasons

The image was uploaded to a note which you don't have access to
The note which the image was originally uploaded to has been deleted

Learn More →

資料相依性：即一個指令的執行結果是否依賴於其他指令的結果。

Name Dependence

p.6

Antidependence: instruction j writes a register or memory location that instruction i reads
- Initial ordering (i before j) must be preserved

1. ADD R1, R2, 5 ; 
2. SUB R2, R3, 10 ;

Output dependence: instruction i and instruction j write the same register or memory location
- Ordering must be preserved

1. ADD R1, R2, 10 ;
2. SUB R1, R1, 5 ;

WAW,WAR,RAW成因範例及解決方法

p.7

Read after write (RAW) / Data Dependency
- 成因：在一個指令寫入資料的同時，另一個指令正在讀取該資料。
- 範例：


ADD R1, R2, R3 ; 
SUB R4, R1, R5 ;

Write after write (WAW) / Output Dependency
- 成因：兩個指令寫入相同的目的地，並且第一個寫入尚未完成，而第二個寫入卻已經發生。
- 範例：


ADD R1, R2, 10 ;
SUB R1, R1, 5 ;

Write after read (WAR) / Anti Dependency
- 成因：在一個指令讀取資料的同時，另一個指令對該資料進行寫入。
- 範例：


ADD R1, R2, 5 ; 
SUB R2, R3, 10 ;

解決方法
register renaming 解決 WAW 與 WAR
strictly ordered 解決 RAW

loop unrolling在幹嘛

p.12
將比如1000次的迴圈拆成250次的四個迴圈的指令的結構,消除部分相依性
(可能還有其他?)

Unroll by a factor of 4 (assume # elements is divisible by 4)
Eliminate unnecessary instructions

for (int i = 0; i < 16; i++) {
        sum += i;
    }

for (int i = 0; i < 16; i += 4) {
        sum += i;
        sum += i + 1;
        sum += i + 2;
        sum += i + 3;
    }

Strip Mining

Unknown number of loop iterations?
- Number of iterations = n
- Goal: make k copies of the loop body
- Generate pair of loops:
  - First executes n mod k times
  - Second executes n / k times

Strip mining：分成兩個迴圈執行。

// First loop (executes n mod k times)
for (i = 0; i < n % k; i++) {
    // 迴圈體的一部分
    // ...
}

// Second loop (executes n / k times)
for (i = 0; i < n / k; i++) {
    // 迴圈體的剩餘部分
    // ...
}

branch predictor 在幹嘛務必熟悉1bit,2bit branch predictor(上課照片)

p.16

(在幹嘛)用於預測程式中分支指令（如條件分支或迴圈分支）的走向，以提高指令執行的效率。

Image Not Showing Possible Reasons

The image was uploaded to a note which you don't have access to
The note which the image was originally uploaded to has been deleted

Learn More →

Image Not Showing Possible Reasons

The image was uploaded to a note which you don't have access to
The note which the image was originally uploaded to has been deleted

Learn More →

1位元分支預測器（1-bit Branch Predictor）：
- 原理： 1位元分支預測器是最簡單的分支預測器之一。它僅保留一個位元，記錄上一次分支指令的執行結果（例如，是或否，分支跳躍或不跳躍）。
- 操作：如果上一次分支執行成功，則預測下一次分支執行成功；如果上一次分支執行失敗，則預測下一次分支執行失敗。
- 問題： 1位元分支預測器對於連續出現相同結果的分支能夠提供合理的預測，但對於交替的分支（交錯的跳躍和不跳躍）效果較差。
2位元分支預測器（2-bit Branch Predictor）：
- 原理： 2位元分支預測器引入了更多的狀態信息，具體而言，使用兩個位元來紀錄上一次的分支執行結果。
- 操作：根據上一次和上上次的結果，可以構建四種狀態（強跳躍、弱跳躍、強不跳躍、弱不跳躍）。預測根據當前狀態，而不僅僅是單一的成功或失敗。
- 問題： 2位元分支預測器相對於1位元提供了更好的預測性能，但仍然可能在某些情況下無法確切預測分支。

tomasulo 演算法有哪三個步驟

p.32

Issue
- Get next instruction from FIFO queue
- If available RS, issue the instruction to the RS with operand values if available
- If operand values not available, stall the instruction
Execute
- When operand becomes available, store it in any reservation stations waiting for it
- When all operands are ready, issue the instruction
- Loads and store maintained in program order through effective address
- No instruction allowed to initiate execution until all branches that proceed it in program order have completed
Write result
- Write result on CDB into reservation stations and store buffers
  - (Stores must wait until address and value are received)

tomasulo 演算法結合reorder buffer後多了什麼步驟

p.40

Commit:
- When ROB reaches head of ROB, update register
- When a mispredicted branch reaches head of ROB, discard all entries

reorder buffer

需搭配 hardware speculation 和 branch predictor

(?)在 Commit 階段，可能需要處理分支指令的錯誤預測，進行指令的recovery。

請問當初tomasulo(1965)為何不做 reorder buffer,branch predictor

1980葉子玉提出,時間對不上
1990 branch predictor 才趨於成熟

scoreboarding看一下

補充教材lecture 16 p74

Scoreboard: a bookkeeping technique

Out-of-order execution divides ID stage:
1.Issue—decode instructions, check for structural hazards
2.Read operands—wait until no data hazards, then read
operands
Scoreboards date to CDC6600 in 1963
– Readings for Monday include one on CDC6600
Instructions execute whenever not dependent on previous instructions and no hazards
CDC 6600: In order issue, out-of-order execution, outof-order commit (or completion)
– No forwarding!
– Imprecise interrupt/exception model for now

Four Stages of Scoreboard Control

Issue—decode instructions & check for structural hazards (ID1)
– Instructions issued in program order (for hazard checking)
– Don’t issue if structural hazard
– Don’t issue if instruction is output dependent on any previously issued but uncompleted instruction (no WAW hazards)
Read operands—wait until no data hazards, then read operands (ID2
– All real dependencies (RAW hazards) resolved in this stage, since we wait for instructions to write back data.
– No forwarding of data in this model!
Execution—operate on operands (EX
– The functional unit begins execution upon receiving operands. When the result is ready, it notifies the scoreboard that it has completed execution.
Write result—finish execution (WB)
– Stall until no WAR hazards with previous instructions:
Example：



DIVD F0,F2,F4
ADDD F10,F0,F8
SUBD F8,F8,F14

CDC 6600 scoreboard would stall SUBD until ADDD reads operands

VLIW是什麼

p.47
VLIW (very long instruction word)
Package multiple operations into one instruction

計算機結構 第三次小考 考前重點

data dependency(資料相依性)(計算機組織裡叫hazard危障)

Name Dependence

WAW,WAR,RAW成因範例及解決方法

loop unrolling在幹嘛

Strip Mining

branch predictor 在幹嘛 務必熟悉1bit,2bit branch predictor(上課照片)

tomasulo 演算法有哪三個步驟

tomasulo 演算法 結合reorder buffer後多了什麼步驟

reorder buffer

請問 當初tomasulo(1965)為何不做 reorder buffer,branch predictor

scoreboarding看一下

VLIW是什麼

Read more

成大資訊所筆記

成大資訊所摘要

海大筆記

113海大資工生 推甄心得

計算機結構第三次小考考前重點

branch predictor 在幹嘛務必熟悉1bit,2bit branch predictor(上課照片)

tomasulo 演算法結合reorder buffer後多了什麼步驟

請問當初tomasulo(1965)為何不做 reorder buffer,branch predictor

113海大資工生推甄心得