Compute-PI

GPU acceleration

  • Device Info

    • Device name: GeForce GTX 770
    • Warp size: 32
    • multiProcessorCount: 8
  • Gnuplot command

    > set title 'GPU acceleration'
    > set ylabel 'Speed up'
    > set xlabel 'number of threads'
    > set terminal png
    > set output 'xxx.png'
    > plot "SpeedUp.txt" with points pointtype 7 notitle
    

Speed up for 12800000 slices

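  • Note that the configured number of threads has a clear effect on the speedup.
  • Once the number of threads exceeds 6144, the speedup stops improving. See Amdahl's law: under a fixed workload, only a limited portion of the computation can run in parallel, so the speedup approaches a constant.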
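For reference, Amdahl's law can be written as follows. The symbols are assumptions introduced here, not taken from the original note: p is the fraction of the work that can be parallelized and n is the number of parallel workers.

```latex
S(n) = \frac{1}{(1 - p) + \dfrac{p}{n}},
\qquad
\lim_{n \to \infty} S(n) = \frac{1}{1 - p}
```

As a purely illustrative value (not a measurement from this experiment), p = 0.98 would cap the speedup at 1 / 0.02 = 50 no matter how many threads are launched.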

Error percentage from 1 to 20480000 slices
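The note does not show how the error percentage was computed. The following is a minimal sketch, assuming that π is approximated as 4 times the integral of 1/(1+x²) over [0, 1] with a left Riemann sum, and that the error is measured relative to a fixed reference value. The names `compute_pi_slices`, `error_percent` and `PI_REF` are hypothetical.

```c
#include <stddef.h>

/* Reference value of pi used to measure the error (assumption: the
 * original measurement may have used a different reference). */
#define PI_REF 3.14159265358979323846

/* Approximate pi as 4 * integral of 1/(1+x^2) over [0,1] using a
 * left Riemann sum with n slices. n must be at least 1. */
double compute_pi_slices(size_t n)
{
    double dt = 1.0 / (double) n;   /* width of one slice */
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        double x = (double) i * dt; /* left edge of slice i */
        sum += dt / (1.0 + x * x);
    }
    return 4.0 * sum;
}

/* Relative error of the n-slice estimate, in percent. */
double error_percent(size_t n)
{
    double diff = compute_pi_slices(n) - PI_REF;
    if (diff < 0.0)
        diff = -diff;
    return diff / PI_REF * 100.0;
}
```

Calling `error_percent(n)` for n = 1, 2, 4, ... up to 20480000 and writing each pair to a text file would give data in a form the gnuplot commands above could plot. With this summation rule the absolute error shrinks roughly in proportion to 1/n.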

Time cost for 400000000 slices

  • Baseline: 2.999478(s)
  • OpenMP
    • 2 threads: 1.460224(s)
    • 4 threads: 0.767469(s)
  • AVX: 1.283873(s)
  • AVX + Loop unroll: 1.169533(s)
  • 6144 GPU threads (threads per block = 256): 0.070991(s)
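The timings above can be turned into speedup factors by dividing the baseline time by each variant's time. A minimal sketch follows. The `timing` struct and the helper names are hypothetical, and the numbers are copied from the list above. `blocks_for` shows the grid size implied by 6144 threads at 256 threads per block, assuming one thread per launched GPU thread slot.

```c
#include <stddef.h>

/* One measured configuration: label and wall-clock time in seconds
 * for 400000000 slices, copied from the list in this note. */
struct timing {
    const char *name;
    double seconds;
};

static const struct timing timings[] = {
    {"Baseline",          2.999478},
    {"OpenMP, 2 threads", 1.460224},
    {"OpenMP, 4 threads", 0.767469},
    {"AVX",               1.283873},
    {"AVX + Loop unroll", 1.169533},
    {"6144 GPU threads",  0.070991},
};

static const size_t timing_count = sizeof timings / sizeof timings[0];

/* Speedup of entry i relative to the baseline (entry 0). */
double speedup(size_t i)
{
    return timings[0].seconds / timings[i].seconds;
}

/* Number of blocks needed to cover `threads` threads when each block
 * holds `per_block` threads (rounded up). */
unsigned blocks_for(unsigned threads, unsigned per_block)
{
    return (threads + per_block - 1) / per_block;
}
```

Looping over `timing_count` entries and printing `speedup(i)` gives about 2.05x and 3.91x for OpenMP with 2 and 4 threads, about 2.34x for AVX, about 2.56x for AVX with loop unrolling, and about 42x for the GPU run. `blocks_for(6144, 256)` evaluates to 24 blocks.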

Source of this article: https://hackmd.io/s/SyEgyhlA