
2016q3 Homework1 (compute-pi)

contributed by <ktvexe>

baseline

double compute_pi_baseline(size_t N)
{
    double pi = 0.0;
    double dt = 1.0 / N;                // dt = (b-a)/N, b = 1, a = 0
    for (size_t i = 0; i < N; i++) {
        double x = (double) i / N;      // x = ti = a+(b-a)*i/N = i/N
        pi += dt / (1.0 + x * x);       // integrate 1/(1+x^2), i = 0 .. N-1
    }
    return pi * 4.0;
}
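For reference, the loop above is a left Riemann sum of the arctangent integral. Written out with the same names as the code comments (a = 0, b = 1, dt = 1/N, and x_i = i/N), the approximation is:

$$
\pi = 4\arctan(1) = 4\int_{0}^{1} \frac{dx}{1+x^{2}} \approx 4\sum_{i=0}^{N-1} \frac{dt}{1+x_i^{2}}, \qquad x_i = \frac{i}{N},\quad dt = \frac{1}{N}
$$

Because the integrand is decreasing on [0, 1], the left sum slightly overestimates π, with an error on the order of 1/N.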

First, try to get the programs running:

time ./time_test_baseline
N = 400000000 , pi = 3.141593
3.73user 0.00system 0:03.72elapsed 100%CPU (0avgtext+0avgdata 1704maxresident)k
0inputs+0outputs (0major+85minor)pagefaults 0swaps
time ./time_test_openmp_2
N = 400000000 , pi = 3.141593
3.92user 0.00system 0:01.96elapsed 200%CPU (0avgtext+0avgdata 1700maxresident)k
0inputs+0outputs (0major+87minor)pagefaults 0swaps
time ./time_test_openmp_4
N = 400000000 , pi = 3.141593
7.80user 0.00system 0:01.97elapsed 395%CPU (0avgtext+0avgdata 1764maxresident)k
0inputs+0outputs (0major+93minor)pagefaults 0swaps
time ./time_test_avx
N = 400000000 , pi = 3.141593
1.60user 0.00system 0:01.60elapsed 100%CPU (0avgtext+0avgdata 1708maxresident)k
0inputs+0outputs (0major+86minor)pagefaults 0swaps
time ./time_test_avxunroll
N = 400000000 , pi = 3.141593
1.52user 0.00system 0:01.52elapsed 100%CPU (0avgtext+0avgdata 1712maxresident)k
0inputs+0outputs (0major+85minor)pagefaults 0swaps
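The source of the `time_test_openmp_*` binaries is not shown in this section. As a minimal sketch, assuming they parallelize the baseline loop with an OpenMP reduction, the variant might look like the following. The function name `compute_pi_openmp_sketch` and its `threads` parameter are hypothetical, chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical sketch, not the author's code: the baseline loop with an
 * OpenMP reduction. Build with -fopenmp (GCC/Clang); without that flag
 * the pragma is ignored and the loop simply runs serially.
 * `threads` is assumed to be a positive thread count. */
double compute_pi_openmp_sketch(size_t N, int threads)
{
    double pi = 0.0;
    double dt = 1.0 / N;
    /* Each thread accumulates a private partial sum; the partial sums
     * are added into pi when the loop ends. */
#pragma omp parallel for num_threads(threads) reduction(+:pi)
    for (size_t i = 0; i < N; i++) {
        double x = (double) i / N;      /* x declared in the loop, so it is private */
        pi += dt / (1.0 + x * x);
    }
    return pi * 4.0;
}
```

In the measurements above, going from 2 to 4 threads doubles the user time (3.92 s to 7.80 s) without reducing the elapsed time (1.96 s vs 1.97 s). That pattern would be consistent with a machine that has only two physical cores, although the hardware is not stated here.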

AVX (Advanced Vector Extensions)