tags: linux2022

Modern Processor Design: Principles and Key Features (Notes)

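  • Today's x86 is a CISC-style instruction set. To stay compatible with earlier designs, the processor wraps a CISC skin around RISC-like internal instructions.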

CPU Pipeline

ref: video

  • Load / Store Architecture: RISC

Superpipelining

  • Increase the pipeline depth.
  • A deeper pipeline improves performance but also consumes more power.
  • Clock speed (= number of clock cycles per second) is limited by the slowest stage in the pipeline (assuming every stage takes one cycle to complete). Splitting each pipeline stage into several smaller stages therefore lets the CPU run at a higher clock speed.
  • Of course, each instruction will now take more cycles to complete (latency), but the processor will still be completing 1 instruction per cycle (throughput), and there will be more cycles per second, so the processor will complete more instructions per second (actual performance).
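
To make the latency versus throughput trade-off concrete, here is a minimal Python sketch. The stage delays, the 20 ps latch overhead, and the function name `pipeline_stats` are illustrative assumptions, not figures from the referenced material. It also assumes an ideal pipeline with no stalls.

```python
def pipeline_stats(stage_delays_ps, latch_ps=0):
    """Ideal pipeline: one cycle per stage, one instruction completed per cycle."""
    # The clock period is set by the slowest stage plus per-stage latch overhead.
    cycle_ps = max(stage_delays_ps) + latch_ps
    return {
        "cycle_ps": cycle_ps,
        "clock_ghz": 1000.0 / cycle_ps,  # a period of 1000 ps equals 1 GHz
        "latency_cycles": len(stage_delays_ps),
        "latency_ps": cycle_ps * len(stage_delays_ps),
    }


# Hypothetical 5-stage pipeline with 400 ps of logic per stage.
base = pipeline_stats([400] * 5, latch_ps=20)
# The same logic split into 10 stages of 200 ps each.
deep = pipeline_stats([200] * 10, latch_ps=20)

for name, stats in (("5-stage", base), ("10-stage", deep)):
    print(name, stats)
```

Under these assumed numbers the deeper pipeline needs twice as many cycles per instruction and slightly more absolute time (2200 ps versus 2100 ps), yet its clock is almost twice as fast, so it completes almost twice as many instructions per second.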

Superscalar

  • Increase the pipeline width.
  • IPC (instructions per cycle) = 3 for a 3-wide pipeline, i.e. CPI (cycles per instruction) = 1/3.

  • After decode, instructions are sorted by type (dispatch) and sent down multiple execution paths in parallel.
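
The effect of width can be sketched with a toy in-order issue model. The function name `issue_cycles`, the unit names, and the example instruction mix are hypothetical. Data dependencies and instruction latencies are deliberately ignored, so this only illustrates the limit imposed by the functional units.

```python
def issue_cycles(instrs, units):
    """Count the cycles needed to issue `instrs` in program order.

    instrs: list of unit-type names, e.g. ["int", "mem", "fp"].
    units:  dict mapping unit type -> number of units of that type.
    Each cycle, instructions issue in order until one needs a unit
    type that is already fully occupied in that cycle.
    """
    cycles = 0
    i = 0
    while i < len(instrs):
        free = dict(units)  # every unit is free at the start of a cycle
        issued = 0
        while i < len(instrs) and free.get(instrs[i], 0) > 0:
            free[instrs[i]] -= 1
            i += 1
            issued += 1
        if issued == 0:
            raise ValueError(f"no unit can execute {instrs[i]!r}")
        cycles += 1
    return cycles


program = ["int", "mem", "fp", "int", "mem", "fp"]
print(issue_cycles(program, {"int": 1, "mem": 1, "fp": 1}))
```

With one unit of each type, the six instructions issue three at a time. Two adjacent instructions that need the same unit type cannot share a cycle, which is why the instruction mix matters as much as the width.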

VLIW (Very Long Instruction Word)

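  • Widely used in DSPs: signal processing often performs a multiplication followed by an addition (as in matrix multiplication), which led designers to ask whether the two kinds of instruction could be issued together.
  • Requires well-designed algorithms with ample parallelism, so it suits specialized signal-processing workloads (the compiler is hard to design).
  • Needs less circuitry than a superscalar design, so it consumes less power.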
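
As a rough illustration of the multiply-then-add pairing, the sketch below builds a static schedule for a dot product on an imagined machine whose instruction word has one multiply slot and one add slot. The function name `vliw_dot_schedule`, the two-slot format, and the single-cycle multiply are assumptions made for the example. They do not describe any real DSP.

```python
def vliw_dot_schedule(n):
    """Build a 2-slot (mul, add) schedule for an n-element dot product.

    The multiply of element i shares a bundle with the add that
    accumulates element i - 1, so the two operations overlap.
    """
    bundles = []
    for cycle in range(n + 1):
        mul = f"mul p{cycle} = a[{cycle}] * b[{cycle}]" if cycle < n else "nop"
        add = f"add acc = acc + p{cycle - 1}" if cycle >= 1 else "nop"
        bundles.append((mul, add))
    return bundles


for mul_slot, add_slot in vliw_dot_schedule(4):
    print(f"{mul_slot:<28} | {add_slot}")
```

Issued one operation at a time, the four multiplies and four adds would take eight issue cycles. Here they fit in five bundles, and all of the packing is decided ahead of time rather than by hardware, which is the burden the notes place on the compiler.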

Instruction Dependency & Latency

Hazard

Structure Hazard

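  • There are not enough hardware resources, so several instructions that need them at the same time cannot all execute.
    • Example: in the von Neumann architecture, instructions and data share one memory, so fetching an instruction and accessing data in the same cycle causes a structural hazard.
    • Harvard vs von Neumann architecture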
  • Solutions:
    • Add more hardware.
    • Use a stall to delay an instruction, staggering the instructions that would access the same hardware.
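
The memory-port conflict can be sketched with a small model. It assumes a classic five-stage pipeline (IF, ID, EX, MEM, WB), a single shared memory port, and priority for the MEM stage over instruction fetch. The function name `count_fetch_stalls` is made up for this example.

```python
MEM_OFFSET = 3  # MEM happens three cycles after IF (IF -> ID -> EX -> MEM)


def count_fetch_stalls(is_mem_op):
    """Count cycles in which fetch must wait for the shared memory port.

    is_mem_op[i] is True when instruction i is a load or store.
    """
    busy = set()  # cycles in which a MEM stage occupies the port
    cycle = 0
    stalls = 0
    for mem in is_mem_op:
        while cycle in busy:  # port taken by an earlier load/store
            stalls += 1
            cycle += 1
        if mem:
            busy.add(cycle + MEM_OFFSET)
        cycle += 1  # this instruction is fetched in `cycle`
    return stalls


# One load followed by three ALU instructions.
print(count_fetch_stalls([True, False, False, False]))
```

With separate instruction and data memories (the Harvard arrangement), the `busy` set would never block a fetch and the stall count would be zero, which is the "add more hardware" solution in the list above.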

Data Hazard

  • An instruction in the pipeline needs a result that an earlier instruction, still in flight, has not yet produced (data dependency).

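  • Data dependency:
    • RAW (read after write, true data dependency)
    • WAW (write after write, output dependency)
    • WAR (write after read, anti-dependency)
  • Software solutions:
    • Insert nop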
    • Instruction rescheduling
  • Hardware solutions:
    • Forwarding
    • Forwarding + stall
      • Load-use Data Hazard
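
The three dependency types (RAW, WAW, WAR) can be told apart by comparing the registers two instructions read and write. The sketch below is a minimal classifier. The `(dest, sources)` tuple encoding and the function name `classify_dependencies` are assumptions made for illustration.

```python
def classify_dependencies(first, second):
    """Return the dependency types between two instructions.

    Each instruction is a (dest, sources) tuple, and `first` comes
    earlier in program order. `dest` may be None for an instruction
    that writes no register.
    """
    dest1, srcs1 = first
    dest2, srcs2 = second
    found = set()
    if dest1 is not None and dest1 in srcs2:
        found.add("RAW")  # second reads what first writes
    if dest2 is not None and dest2 in srcs1:
        found.add("WAR")  # second overwrites what first reads
    if dest1 is not None and dest1 == dest2:
        found.add("WAW")  # both write the same register
    return found


add_instr = ("r1", ["r2", "r3"])  # r1 = r2 + r3
sub_instr = ("r4", ["r1", "r5"])  # r4 = r1 - r5
print(classify_dependencies(add_instr, sub_instr))
```

Only RAW reflects a real flow of data. WAR and WAW come from reusing a register name, which matters once instructions can complete out of program order.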

Control Hazard

Branch & Branch prediction

Branch

if (a > 7) {
    b = c;
} else {
    b = d;
}

    cmp a, 7    ; a > 7 ?
    ble L1
    mov c, b    ; b = c
    br L2
L1: mov d, b    ; b = d
L2: ...

Static Branch Prediction

  • Always predict taken, or always predict not taken.

Dynamic Branch Prediction

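  • BTB (Branch Target Buffer):
    • A cache that stores the destination (the new PC) of each branch.
    • In the IF stage, the current PC is used as the index to look up the current branch's destination in the BTB.
    • On a hit (the current instruction is a branch and is predicted taken):
      • ID: update the PC to the predicted target.
        • EX: if the branch is taken, continue normally.
        • EX: if the branch is not taken, flush the wrongly fetched instructions and remove the corresponding PC entry from the BTB.
    • On a miss:
      • ID: proceed as usual.
        • EX: if the branch is taken, add the target address to the BTB and flush the wrongly fetched instructions.
        • EX: if the branch is not taken, continue as usual.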
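
The hit/miss protocol above can be sketched as a small class. This is a simplified model, not a description of real hardware: the class name `BranchTargetBuffer` and its methods are hypothetical, the table is an unbounded dictionary rather than a fixed-size cache, and each branch is assumed to always jump to the same target.

```python
class BranchTargetBuffer:
    """Toy BTB: maps a branch PC to its predicted target PC."""

    def __init__(self):
        self.targets = {}

    def predict(self, pc, fallthrough):
        """IF stage: return the next PC to fetch from."""
        # Hit: predicted taken, fetch from the stored target.
        # Miss: fetch the next sequential instruction.
        return self.targets.get(pc, fallthrough)

    def resolve(self, pc, taken, target):
        """EX stage: update the table; return True if a flush is needed."""
        hit = pc in self.targets
        if hit and not taken:
            del self.targets[pc]       # predicted taken, was not taken
            return True
        if not hit and taken:
            self.targets[pc] = target  # first taken occurrence, remember it
            return True
        return False                   # prediction was correct


btb = BranchTargetBuffer()
print(hex(btb.predict(0x40, fallthrough=0x44)))  # miss: falls through
print(btb.resolve(0x40, taken=True, target=0x80))
print(hex(btb.predict(0x40, fallthrough=0x44)))  # hit: stored target
```

Because an entry exists only for branches last seen taken, presence in the table doubles as the prediction itself, matching the "hit means predicted taken" rule in the list.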

ref: video

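  • BHB (Branch History Buffer):
    • 1-bit predictor: predicts that the branch will repeat its last outcome.
    • 2-bit predictor: a saturating counter, so the prediction flips only after two consecutive mispredictions.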
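
The difference between the 1-bit and 2-bit schemes shows up on a loop branch. The sketch below counts mispredictions for a single branch using an n-bit saturating counter. The function name `mispredictions`, the initial counter value of 0, and the loop pattern are assumptions chosen for the example.

```python
def mispredictions(outcomes, bits):
    """Count mispredictions of one branch with an n-bit saturating counter.

    outcomes: iterable of bools, True meaning the branch was taken.
    The counter starts at 0 (strongly not taken) and predicts taken
    when it is in the upper half of its range.
    """
    max_state = (1 << bits) - 1
    threshold = 1 << (bits - 1)
    state = 0
    misses = 0
    for taken in outcomes:
        if (state >= threshold) != taken:
            misses += 1
        # Move toward the actual outcome, saturating at both ends.
        state = min(state + 1, max_state) if taken else max(state - 1, 0)
    return misses


# A loop branch: taken three times, then not taken at loop exit, run 3 times.
pattern = ([True] * 3 + [False]) * 3
print(mispredictions(pattern, bits=1), mispredictions(pattern, bits=2))
```

On this pattern the 1-bit predictor misses twice per loop run (at the exit and again on re-entry), 6 misses in total. After warming up, the 2-bit predictor misses only at each exit, 5 in total here, and the gap widens as the loop repeats.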

Delayed Branch

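A software-based solution, also called Insert Safety Instruction: an instruction that executes whether or not the branch is taken is placed after the branch, reducing the penalty of a wrongly guessed branch. It fills the same slot a nop would, but the cycle is not wasted.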

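Three ways to fill the delay slot:

  • From Before: move an independent instruction from before the branch into the delay slot.

  • From Target: copy the instruction at the branch target into the delay slot.

  • From Fall through: move the instruction from the fall-through (not-taken) path into the delay slot.

  • The first is best and should be tried first. Fall back to the second or third only when a data hazard rules it out.

    • If the branch is very likely to be taken, try the second.
    • If the branch is unlikely to be taken, try the third.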

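Delayed branch is a simple and efficient approach. However, as processor pipelines grow longer and more instructions are issued per clock cycle, the branch delay keeps growing and a single delay slot is no longer enough. Compared with the costlier but more flexible dynamic schemes, delayed branch has therefore lost its appeal.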

To be organized:

OOO (Out of Order execution)

ref:[Computer Architecture Cheat sheet] — Pipeline Hazard