tags: 計算機組織_戴碧如

Chapter 1 - Computer Abstractions and Technology

Seven Great Ideas

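  • Use abstraction to simplify design: split a complex design into layers, which makes development easier and faster.
  • Make the common case fast
  • Performance via parallelism: do several things at the same time
  • Performance via pipelining: like a relay race, split the work into small consecutive stages
  • Performance via prediction: start working on the likely outcome ahead of time
  • Hierarchy of memories: only a small part of the data is in use at any moment, so keep all the data in cheap, large storage and keep the data currently being processed in expensive, small, fast storage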
  • Dependability via redundancy

Below Your Program

  • Application software: written in a high-level language
  • System software
    • Compiler: translates high-level language code into machine code
    • Operating System
  • Hardware

Levels of Program Code

  • High-level language
  • Assembly language
  • Hardware representation

Through the Looking Glass

LCD screen: picture elements (pixels)

  • Frame buffer: stores the color of each pixel

Inside the Processor (CPU)

  • Datapath: performs operations on data
  • Control: tells the datapath what to do and where data should go (control signals)
  • Cache memory: small, fast SRAM memory

Abstractions

  • Instruction set architecture (ISA)
    • The hardware/software interface

Memory

  • Volatile memory
    • RAM
  • Non-volatile memory
    • Hard disk, flash memory, CD/DVD

Defining Performance

A CPU's performance cannot be defined from a single aspect; several aspects have to be considered together.

Response Time and Throughput

  • Response Time (execution time)
    • How long it takes to do a task, from start to finish
  • Throughput (bandwidth)
    • Total work done per unit time

Measuring Execution Time

  • Elapsed time
    • Total response time, counting all aspects of processing (including I/O, OS overhead, and idle time)
    • Determines system performance
  • CPU (execution) time
    • Comprises user CPU time and system CPU time

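In general, user CPU time reflects the time spent executing the program's own logic, while system CPU time reflects the time spent performing I/O, managing memory, or otherwise interacting with the operating system.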
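
The difference between elapsed time and the two kinds of CPU time can be observed directly. The following is a minimal sketch using Python's standard `time.perf_counter()` and `os.times()`; the helper `measure` and the workload `busy_then_idle` are hypothetical names introduced only for illustration.

```python
import os
import time


def measure(task):
    """Run task() and return (elapsed, user_cpu, system_cpu) in seconds."""
    wall_start = time.perf_counter()
    cpu_start = os.times()
    task()
    cpu_end = os.times()
    wall_end = time.perf_counter()

    elapsed = wall_end - wall_start                  # includes idle and I/O waits
    user_cpu = cpu_end.user - cpu_start.user         # time running the program's own code
    system_cpu = cpu_end.system - cpu_start.system   # time the OS spent on its behalf
    return elapsed, user_cpu, system_cpu


def busy_then_idle():
    # Hypothetical workload: some computation followed by idle waiting.
    sum(i * i for i in range(200_000))
    time.sleep(0.05)


elapsed, user_cpu, system_cpu = measure(busy_then_idle)
print(f"elapsed={elapsed:.3f}s user={user_cpu:.3f}s system={system_cpu:.3f}s")
```

The sleep adds to elapsed time but not to CPU time, which is why elapsed time is usually larger than user plus system CPU time.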

CPU Time

  • Clock cycle: one period of the clock signal
  • Clock rate: the number of clock cycles the CPU runs per second
  • CPU time: the total number of clock cycles needed to complete a task, divided by the clock rate
  • Performance is improved by (that is, CPU time is reduced by)
    • Reducing the number of clock cycles a task needs
    • Increasing the clock rate
    • Hardware designer must often trade off clock rate against cycle count
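
The trade-off between clock rate and cycle count can be made concrete with a small calculation. This sketch assumes illustrative numbers that are not from these notes: computer A runs a program in 10 s at 2 GHz, and a redesigned computer B needs 1.2 times as many cycles but should finish in 6 s. The function name `cpu_time` is hypothetical.

```python
def cpu_time(clock_cycles, clock_rate_hz):
    """CPU time in seconds = clock cycles / clock rate."""
    return clock_cycles / clock_rate_hz


# Assumed figures for computer A.
clock_rate_a = 2e9                      # 2 GHz
cpu_time_a = 10.0                       # seconds
cycles_a = cpu_time_a * clock_rate_a    # cycles the program needs on A

# Computer B needs 1.2x the cycles; which clock rate gives 6 s?
cycles_b = 1.2 * cycles_a
target_time_b = 6.0
clock_rate_b = cycles_b / target_time_b

print(f"B needs a clock rate of about {clock_rate_b / 1e9:.1f} GHz")
```

Under these assumptions B has to run at twice A's clock rate to make up for the extra cycles.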

Instruction Count and CPI

  • Instruction Count (IC) => the number of instructions executed for a program
  • CPI (Cycles per Instruction) => the average number of clock cycles one instruction takes

Total clock cycles = IC x CPI
CPU Time = IC x CPI / Clock Rate = IC x CPI x Clock cycle time (the duration of one clock cycle)
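
As a sketch of how the formula is used, the snippet below compares two machines that run the same program with the same ISA. The cycle times, CPI values, and instruction count are assumed for illustration, and `cpu_time_from_ic` is a hypothetical helper name.

```python
def cpu_time_from_ic(instruction_count, cpi, cycle_time_s):
    """CPU time = IC x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_time_s


ic = 1_000_000  # assumed; identical on both machines because the ISA is the same

time_a = cpu_time_from_ic(ic, cpi=2.0, cycle_time_s=250e-12)  # machine A: 250 ps cycle
time_b = cpu_time_from_ic(ic, cpi=1.2, cycle_time_s=500e-12)  # machine B: 500 ps cycle

print(f"A is {time_b / time_a:.2f}x faster than B")
```

Machine B has the lower CPI, yet A wins because its cycle time is half as long. Neither CPI nor clock rate alone decides the comparison.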

ISA (Instruction Set Architecture)

CPI in more detail
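
The figure for this part is missing from these notes. A common formulation, assumed here, is that different instruction classes take different numbers of cycles, so total cycles is the sum of CPI_i x IC_i over the classes and the overall CPI is a weighted average. The class names and counts below are hypothetical.

```python
def total_cycles(cpi_by_class, ic_by_class):
    """Sum of CPI_i x IC_i over all instruction classes."""
    return sum(cpi_by_class[c] * ic_by_class[c] for c in ic_by_class)


def average_cpi(cpi_by_class, ic_by_class):
    """Weighted average CPI = total cycles / total instruction count."""
    return total_cycles(cpi_by_class, ic_by_class) / sum(ic_by_class.values())


cpi_by_class = {"A": 1, "B": 2, "C": 3}   # assumed cycles per instruction class

sequence_1 = {"A": 2, "B": 1, "C": 2}     # 5 instructions
sequence_2 = {"A": 4, "B": 1, "C": 1}     # 6 instructions

for name, seq in (("sequence 1", sequence_1), ("sequence 2", sequence_2)):
    print(name, total_cycles(cpi_by_class, seq), average_cpi(cpi_by_class, seq))
```

With these numbers, sequence 2 executes more instructions but needs fewer cycles, because its mix leans on the cheap class.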

Performance Summary

  • Performance depends on
    • Algorithm: affects IC, possibly CPI
    • Programming language: affects IC, CPI
    • Compiler: affects IC, CPI
    • Instruction set architecture: affects IC, CPI, Tc

Tc => clock cycle time (the duration of one clock cycle)

  • Frequency => clock rate
  • The higher the clock rate, the higher the power consumption, so the supply voltage has to be lowered to keep power from growing as much
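
The relationship between clock rate, voltage, and power can be sketched numerically. The snippet assumes the usual CMOS dynamic power relation, power proportional to capacitive load x voltage squared x frequency, which is not written out in these notes. The 15% reductions are illustrative values, and `dynamic_power` is a hypothetical helper.

```python
def dynamic_power(capacitive_load, voltage, frequency):
    """Relative dynamic power, assuming P is proportional to C x V^2 x f."""
    return capacitive_load * voltage ** 2 * frequency


# Normalize the old design to 1.0 for every factor.
old_power = dynamic_power(1.0, 1.0, 1.0)

# Assumed new design: 85% of the capacitive load, voltage, and frequency.
new_power = dynamic_power(0.85, 0.85, 0.85)

print(f"new / old power = {new_power / old_power:.2f}")
```

Because voltage enters squared, lowering it is the most effective lever, which is why running out of room to lower it matters so much.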

Reducing Power

  • Power Wall
    • Voltage cannot be lowered any further
    • Heat cannot be removed any faster

Solution of Power Wall

Uniprocessor => Multiprocessor

  • Requires explicitly parallel programming
    • Compare with instruction level parallelism
      • Hardware executes multiple instructions at once
      • Hidden from the programmer
    • Hard to do
      • Programming for performance
      • Load balancing
      • Optimizing communication and synchronization

SPEC CPU Benchmark

  • SPEC: an international organization that defines various benchmarks used to evaluate performance
  • SPEC CPU 2006
    • Summarize as geometric mean of performance ratios
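
A geometric mean of performance ratios can be computed as follows. This is a minimal sketch: it assumes each ratio is the reference machine's time divided by the measured time, the ratio values are made up, and `geometric_mean` is a helper written for this example.

```python
import math


def geometric_mean(ratios):
    """n-th root of the product of n ratios, computed via logarithms."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical performance ratios for two benchmarks.
ratios = [2.0, 8.0]
print(f"geometric mean = {geometric_mean(ratios):.2f}")
```

Unlike the arithmetic mean, the geometric mean gives the same relative ranking regardless of which machine is chosen as the reference.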

SPEC Power Benchmark

  • Power consumption of a server at different workload levels
    • Performance: ssj_ops/sec
    • Power: Watts (Joules/sec)
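
The two measurements are typically combined into a single figure. The sketch below assumes the summary metric is the sum of ssj_ops over all load levels divided by the sum of power over the same levels. It uses three made-up load levels rather than a full sweep, and `overall_ssj_ops_per_watt` is a hypothetical helper.

```python
def overall_ssj_ops_per_watt(ssj_ops_by_level, watts_by_level):
    """Assumed summary metric: total ssj_ops divided by total watts."""
    return sum(ssj_ops_by_level) / sum(watts_by_level)


# Hypothetical measurements at 100%, 50%, and 0% (idle) load.
ssj_ops = [300_000, 150_000, 0]
watts = [300.0, 225.0, 150.0]

print(f"overall ssj_ops per watt = {overall_ssj_ops_per_watt(ssj_ops, watts):.1f}")
```

In this made-up data the idle level contributes no work but still draws half of peak power, which drags the overall figure down.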

Pitfall: Amdahl's Law

  • Corollary: make the common case fast
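
Amdahl's Law can be sketched as: improved time = affected time / improvement factor + unaffected time. The example assumes a program that takes 100 s, of which 80 s is spent in the part being sped up. Those figures and the name `improved_time` are illustrative.

```python
def improved_time(t_affected, t_unaffected, improvement_factor):
    """Amdahl's Law: only the affected part shrinks."""
    return t_affected / improvement_factor + t_unaffected


total = 100.0
for factor in (2, 4, 16, 1_000_000):
    t = improved_time(80.0, 20.0, factor)
    print(f"factor {factor:>9}: time = {t:.2f} s, overall speedup = {total / t:.2f}x")
```

However large the improvement factor, the 20 s that is not affected remains, so the overall speedup stays below 5x. This is why the effort should go into the common case.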

Fallacy: Low Power at Idle

  • As the load percentage drops, power consumption does not drop in proportion
  • This is an area that still needs attention and improvement

Pitfall: MIPS as a Performance Metric

  • MIPS: Millions of Instructions Per Second
  • Comparing instruction rates alone cannot tell which option is better; execution time still has to be considered
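
A small calculation shows how MIPS can mislead. It assumes MIPS = instruction count / (execution time x 10^6) and compares two hypothetical compiled versions of the same program on the same 1 GHz machine. All figures and helper names are made up for illustration.

```python
def exec_time(instruction_count, cpi, clock_rate_hz):
    """CPU time = IC x CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


def mips(instruction_count, exec_time_s):
    """Millions of instructions per second."""
    return instruction_count / (exec_time_s * 1e6)


clock_rate = 1e9  # assumed 1 GHz

# Version 1: many simple instructions. Version 2: fewer, more complex ones.
time_1 = exec_time(10_000_000, cpi=1.0, clock_rate_hz=clock_rate)
time_2 = exec_time(4_000_000, cpi=2.0, clock_rate_hz=clock_rate)

print(f"version 1: {mips(10_000_000, time_1):.0f} MIPS, {time_1 * 1e3:.1f} ms")
print(f"version 2: {mips(4_000_000, time_2):.0f} MIPS, {time_2 * 1e3:.1f} ms")
```

Version 1 reports the higher MIPS figure but takes longer to finish, so execution time is the measure to trust.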