jserv
Date: 2016/09/30
contributed by <hankgo>
OS: Lubuntu 16.04 (upgrade from 15.10)
CPU: Intel i3-2350M (2 cores / 4 threads)
$ lscpu
RAM: 8GB
I just saw that other classmates proved this with calculus formulas; I had honestly forgotten them (kneels).
double compute_pi_baseline(size_t N)
{
    double pi = 0.0;
    double dt = 1.0 / N;                // dt = (b-a)/N, b = 1, a = 0
    for (size_t i = 0; i < N; i++) {
        double x = (double) i / N;      // x = ti = a+(b-a)*i/N = i/N
        pi += dt / (1.0 + x * x);       // integrate 1/(1+x^2), i = 0 .. N-1
    }
    return pi * 4.0;
}
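For reference, the calculus behind compute_pi_baseline fits in a few lines. This is the standard arctangent identity, written out as a sketch; it is not necessarily the exact proof the other classmates used. Here x_i and \Delta t correspond to x and dt in the code.

```latex
\begin{aligned}
\frac{d}{dx}\arctan x &= \frac{1}{1+x^2} \\
\int_0^1 \frac{dx}{1+x^2} &= \arctan 1 - \arctan 0 = \frac{\pi}{4} \\
\pi &= 4\int_0^1 \frac{dx}{1+x^2}
  \approx 4\sum_{i=0}^{N-1} \frac{\Delta t}{1+x_i^2},
  \qquad x_i = \frac{i}{N},\quad \Delta t = \frac{1}{N}
\end{aligned}
```

The sum samples the left edge of each slice and the integrand is decreasing on [0, 1], so in exact arithmetic the result slightly overestimates pi, with a leading error of about 1/N.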
$ make check
gcc -c -O0 -std=gnu99 -Wall -fopenmp -mavx computepi.c -o computepi.o
gcc -O0 -std=gnu99 -Wall -fopenmp -mavx computepi.o time_test.c -DBASELINE -o time_test_baseline
gcc -O0 -std=gnu99 -Wall -fopenmp -mavx computepi.o time_test.c -DOPENMP_2 -o time_test_openmp_2
gcc -O0 -std=gnu99 -Wall -fopenmp -mavx computepi.o time_test.c -DOPENMP_4 -o time_test_openmp_4
gcc -O0 -std=gnu99 -Wall -fopenmp -mavx computepi.o time_test.c -DAVX -o time_test_avx
gcc -O0 -std=gnu99 -Wall -fopenmp -mavx computepi.o time_test.c -DAVXUNROLL -o time_test_avxunroll
gcc -O0 -std=gnu99 -Wall -fopenmp -mavx computepi.o benchmark_clock_gettime.c -o benchmark_clock_gettime
time ./time_test_baseline
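In the build output above, each binary is compiled with a different option. The main difference is the -DBASELINE, -DOPENMP_2, -DOPENMP_4, -DAVX, and -DAVXUNROLL flags: each flag produces a different result. Looking at the code, time_test.c (the file these flags are passed to) contains the following: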
#if defined(BASELINE)
    pi = compute_pi_baseline(N);
#endif
#if defined(OPENMP_2)
    pi = compute_pi_openmp(N, 2);
#endif
#if defined(OPENMP_4)
    pi = compute_pi_openmp(N, 4);
#endif
#if defined(AVX)
    pi = compute_pi_avx(N);
#endif
#if defined(AVXUNROLL)
    pi = compute_pi_avx_unroll(N);
#endif
According to the GCC online documentation, the -D option predefines a macro:
-D name
Predefine name as a macro, with definition 1.
-D name=definition
The contents of definition are tokenized and processed as if they appeared during translation phase three in a ‘#define’ directive. In particular, the definition will be truncated by embedded newline characters.
If you are invoking the preprocessor from a shell or shell-like program you may need to use the shell's quoting syntax to protect characters such as spaces that have a meaning in the shell syntax.
If you wish to define a function-like macro on the command line, write its argument list with surrounding parentheses before the equals sign (if any). Parentheses are meaningful to most shells, so you will need to quote the option. With sh and csh, -D'name(args…)=definition' works.
-D and -U options are processed in the order they are given on the command line. All -imacros file and -include file options are processed after all -D and -U options.
-U name
Cancel any previous definition of name, either built in or provided with a -D option.
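The two forms of -D quoted above can be tried with a small sketch. The function names and the THREADS macro here are hypothetical and only for illustration; they are not part of the compute-pi sources. The macro names BASELINE, OPENMP_2, OPENMP_4, AVX, and AVXUNROLL are reused from the Makefile output. The block has no main of its own, so call the two functions from your own main.

```c
#include <assert.h>

/* THREADS is a hypothetical object-like macro; this fallback applies
 * when no -DTHREADS=<n> is passed on the command line. */
#ifndef THREADS
#define THREADS 1
#endif

/* Report which variant the preprocessor kept. Only one branch survives
 * preprocessing; the other branches never reach the compiler. */
const char *selected_variant(void)
{
#if defined(BASELINE)
    return "baseline";
#elif defined(OPENMP_2) || defined(OPENMP_4)
    return "openmp";
#elif defined(AVX) || defined(AVXUNROLL)
    return "avx";
#else
    return "none"; /* no -D flag given */
#endif
}

/* Return the value supplied via -DTHREADS=<n>, or the fallback of 1. */
int configured_threads(void)
{
    assert(THREADS > 0); /* reject nonsensical values such as -DTHREADS=0 */
    return THREADS;
}
```

Compiled with -DAVX -DTHREADS=4, selected_variant() returns "avx" and configured_threads() returns 4. Compiled with no -D flags, they return "none" and 1.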
N = 400000000 , pi = 3.141593
7.71user 0.00system 0:07.71elapsed 99%CPU (0avgtext+0avgdata 1788maxresident)k
0inputs+0outputs (0major+85minor)pagefaults 0swaps
time ./time_test_openmp_2
N = 400000000 , pi = 3.141593
8.62user 0.00system 0:04.31elapsed 199%CPU (0avgtext+0avgdata 1796maxresident)k
0inputs+0outputs (0major+85minor)pagefaults 0swaps
time ./time_test_openmp_4
N = 400000000 , pi = 3.141593
15.34user 0.00system 0:03.86elapsed 397%CPU (0avgtext+0avgdata 1816maxresident)k
0inputs+0outputs (0major+94minor)pagefaults 0swaps
time ./time_test_avx
N = 400000000 , pi = 3.141593
2.40user 0.00system 0:02.40elapsed 99%CPU (0avgtext+0avgdata 1868maxresident)k
0inputs+0outputs (0major+87minor)pagefaults 0swaps
time ./time_test_avxunroll
N = 400000000 , pi = 3.141593
2.74user 0.00system 0:02.74elapsed 99%CPU (0avgtext+0avgdata 1720maxresident)k
0inputs+0outputs (0major+83minor)pagefaults 0swaps
The runs above report the time for each version; the differences are as follows: