2017q1 Homework1 (compute-pi)

contributed by <vtim9907>

Reviewed by <`rayleigh0407`>

為了平滑化而引用信賴區間這點還不錯 , 但是直接說95% 的信賴區間落在 ±1.96 個標準差內有點危險 , 這個結果是建立在資料分佈為常態分佈的情況下 , 可參考 wiki Normal distribution 或是 wiki Confidence interval

開發環境

OS : Ubuntu 16.04.2 LTS
Kernel Version : 4.8.0-36-generic
Memory : 32 GB
CPU : Intel® Core™ i5-6600 Processor
- cache :
  - L1i cache : 4*32 KB
  - L1d cache : 4*32 KB
  - L2 cache : 4*256 KB
  - L3 cache : 6144 KB
- L1 cache imformation ( sudo dmidecode -t cache ) :

Handle 0x001E, DMI type 7, 19 bytes
Cache Information
	Socket Designation: L1 Cache
	Configuration: Enabled, Not Socketed, Level 1
	Operational Mode: Write Back
	Location: Internal
	Installed Size: 256 kB
	Maximum Size: 256 kB
	Supported SRAM Types:
		Synchronous
	Installed SRAM Type: Synchronous
	Speed: Unknown
	Error Correction Type: Parity
	System Type: Unified
	Associativity: 8-way Set-associative

基本實驗

25 rounds, N <= 21000

將原實驗結果視覺化呈現，先用 LibreOffice 畫看看 (25 round)：

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

線實在有點粗，圖片也有點小，可能我還不太會用＠＠" 改用 Gnuplot 畫：

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

10000 rounds, N <= 50000

覺得取樣的間隔有點太大，決定改成 N 增加 1000 就取樣一遍，並將 N 放大到 50000，一次跑 10000 rounds ：

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

從這張圖就很明顯可以看出，除了 4 threads 的方法以外，其他方法的波動都比較小，穩定遞增，但是使用了 4 threads 的 OpenMP 卻不斷的震盪，奇怪的是這張圖的前半段，就是 N 還小於兩萬左右時，結果與上面的測試不太相符…，我覺得這應該是因為 10000 rounds 中，有幾 round 因為一些不確定因素的關係，時間花的比較久，所以拉高了整體所花的時間。

95% 信賴區間

為了解決上面測不太準的問題，我用取信賴區間的方式，取 95% 信賴區間內的數據當參考，以外的就忽略掉，這樣就可以避免某幾 round 沒跑好，導致整體數據的錯誤。

首先可以簡單從高中課本或是 wiki 查到母體標準差的定義：

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

簡單化簡一下：

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

而根據查 Z 值的表可以知道，95% 的信賴區間落在 ±1.96 個標準差內，所以可以知道，直接從母體算出 95% 信賴區間的上下界公式為：

上界： $μ + 1.96 * σ$
下界： $μ - 1.96 * σ$

實做取信賴區間，我直接加了一個 function 進 benchmark_clock_gettime.c：




















double calc_ci (double sample_time[])
{
    int j, count = 0;
    double average = 0.0, total_time = 0.0, sigma, upper, lower, conf_interval = 0.0;
    for (j = 0; j < SAMPLE_NUM; ++j) {
        average += sample_time[j];
        total_time += sample_time[j] * sample_time[j];
    }
    average /= SAMPLE_NUM;
    sigma = sqrt(total_time / SAMPLE_NUM - average * average);
    upper = average + 1.96 * sigma;
    lower = average - 1.96 * sigma;
    for (j = 0; j < SAMPLE_NUM; ++j) {
        if (sample_time[j] > lower && sample_time[j] < upper) {
            conf_interval += sample_time[j];
            count++;
        }
    }
    return conf_interval /= count;
}

為了不讓程式跑太久，我改成一次跑 500 rounds，全部跑 1000 次，從這 1000 次取 95% 信賴區間內的數據當參考，最後畫成圖：

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

這張圖我總共跑了五次，因為第一次跑還是有些微的波動，我怕是因為背景程式或是網頁的干擾，所以關掉了一些東西後重跑了四次，結果數據都很平滑，可見在測試效能時，還是要盡量排除不必要的不確定因素。

很明顯，這次的分析圖就相當平滑，也能看出當 N 越大時，OpenMP with 4 threads 還是比 AVX SIMD + Loop unrolling 快一點，是目前最快的優化版本。

計算誤差

參考了廖健富學長的筆記，而選擇使用 $A r c c o s (- 1)$ 逆推出來的 $π$ 當作基準，來推算用上面幾個方法所算出來的 $π$ 的誤差。

簡單寫了 function 來產生 .csv 檔，然後用 Gnuplot 畫出來：

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

由於一開始的誤差很大，所以畫成圖後，前半部會成垂直狀，很難看。

不過因為誤差會極度減小的範圍只有 N 還很小的時候，所以只要把 N 的起始值調高，數據應該就比較漂亮：

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

結果我想錯了 @@" 其實誤差值是持續穩定下降，所以區間只要選的太大，出來的圖形就會趨近直角，所以只好把區間縮小一點在畫出來；不過大略可以看出，若

N

趨近

10^{n}

，算出來

π

的精確度就大約有 n 位。

Leibniz formula for π

簡單證明

首先我們知道

$\frac{d [a r c t a n (x)]}{d x} = \frac{1}{1 + x^{2}}$ –––(1)

又知 $\frac{1}{1 + x^{2}}$ 可以 power series 展開為

$1 - x^{2} + x^{4} . . .$

所以對等式 (1) 兩邊積分得到 :

$a r c t a n (x) = \int_{0}^{x} \frac{1}{1 + x^{2}} = \int_{0}^{x} 1 - x^{2} + x^{4} . . .$ –––(2)

然後因為 $a r c t a n (1) = \frac{π}{4}$ ，帶進 (2) 得到

$a r c t a n (1) = \frac{π}{4} = \int_{0}^{1} 1 - x^{2} + x^{4} . . . = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} . . .$

實做

根據前面證明結果

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

可以簡化為：

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

有了上面的式子就很好實做。

實做方面我直接選擇用 OpenMP 使用 4 threads 去跑，然後同樣一次跑 500 rounds ，共跑 1000 次，然後取 95% 信賴區間內的結果當參考，用 Gnuplot 畫成圖：

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

新的公式比之前的版本快滿多的，幾乎是 Baseline 的三倍速。

計算 Leibniz formula for π 的誤差，與前一版的誤差一起畫成圖形做比較：

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

看出其實兩者的精度差不多，只是分別從 $π$ 的上下逼近 $π$ 。

2017q1 Homework1 (compute-pi)

Reviewed by <rayleigh0407>

開發環境

基本實驗

25 rounds, N <= 21000

10000 rounds, N <= 50000

95% 信賴區間

計算誤差

Leibniz formula for π

簡單證明

實做

Reference

Reviewed by <`rayleigh0407`>