contributed by <Jing Zhou>
Ubuntu 16.04 LTS
$ sudo apt-get update
$ sudo apt-get install graphviz
$ sudo apt-get install imagemagick
$ sudo apt-get install vim
$ vim ~/.vimrc
set ai
set cursorline
set enc=utf8
set number
set tabstop=4
set wrap
$ astyle --style=kr --indent=spaces=4 --indent-switches --suffix=none *.[ch]
Reference: 使用Gnu gprof进行Linux平台下的程序分析 (Using GNU gprof for program analysis on Linux). Test:
$ gcc -pg test.c
$ gprof -b a.out gmon.out | less
Flat profile:
Each sample counts as 0.01 seconds.
no time accumulated
% time  cumulative seconds  self seconds  calls  self Ts/call  total Ts/call  name
0.00 0.00 0.00 1 0.00 0.00 a
0.00 0.00 0.00 1 0.00 0.00 b
0.00 0.00 0.00 1 0.00 0.00 c
Call graph
granularity: each sample hit covers 2 byte(s) no time propagated
index % time    self  children    called     name
                0.00    0.00       1/1           b [2]
[1]      0.0    0.00    0.00       1         a [1]
-----------------------------------------------
                0.00    0.00       1/1           main [9]
[2]      0.0    0.00    0.00       1         b [2]
                0.00    0.00       1/1           a [1]
                0.00    0.00       1/1           c [3]
-----------------------------------------------
                0.00    0.00       1/1           b [2]
[3]      0.0    0.00    0.00       1         c [3]
-----------------------------------------------
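The source of test.c is not included in these notes. A minimal sketch that is consistent with the call graph above (main calls b, and b calls a and c, each exactly once) could look like the following. The function bodies and the counters are assumptions made purely for illustration; only the call structure comes from the gprof output.

```c
#include <assert.h>

/* Hypothetical reconstruction of test.c: only the call structure
 * (b -> a, b -> c) is taken from the gprof output above. */
int calls_a, calls_b, calls_c;   /* how many times each function ran */

void a(void) { calls_a++; }

void c(void) { calls_c++; }

void b(void)
{
    calls_b++;
    a();
    c();
    /* a and c are only ever called together from b */
    assert(calls_a == calls_c);
}
```

Adding `int main(void) { b(); return 0; }`, building with `gcc -pg` and running the binary should give a call graph of the same shape. The times stay at 0.00 because such functions finish far faster than the 0.01-second sampling interval.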
Using cflow (unsuccessful)
$ sudo apt install cflow
[linux /home/]$ sudo wget ""
[linux /home/]$ sudo tar zxvf cflow-1.4.tar.gz
# Unlike version 1.1, configure is not in /cflow-1.4/src
[linux /home/cflow-1.4]$ ./configure
# The following commands fail with errors
[linux /home/cflow-1.4]$ make CFLAGS=-pg LDFLAGS=-pg
[linux /home/cflow-1.4/src]$ cflow parser.c
$ git clone
$ cd raytracing
$ make
$ ./raytracing
Execution time of raytracing() : 2.338675 sec
$ make clean
$ make PROFILE=1
$ ./raytracing
$ gprof -b raytracing gmon.out | less
Execution time when built with -pg
Execution time of raytracing() : 5.208219 sec
Each sample counts as 0.01 seconds.
% time  cumulative seconds  self seconds  calls  self s/call  total s/call  name
22.10 0.53 0.53 69646433 0.00 0.00 dot_product
15.01 0.89 0.36 56956357 0.00 0.00 subtract_vector
9.38 1.12 0.23 17821809 0.00 0.00 cross_product
9.17 1.34 0.22 13861875 0.00 0.00 rayRectangularIntersection
8.34 1.54 0.20 13861875 0.00 0.00 raySphereIntersection
7.71 1.72 0.19 31410180 0.00 0.00 multiply_vector
6.67 1.88 0.16 10598450 0.00 0.00 normalize
5.21 2.01 0.13 4620625 0.00 0.00 ray_hit_object
2.92 2.08 0.07 17836094 0.00 0.00 add_vector
2.50 2.14 0.06 2110576 0.00 0.00 compute_specular_diffuse
2.08 2.19 0.05 2110576 0.00 0.00 localColor
2.08 2.24 0.05 1048576 0.00 0.00 ray_color
2.08 2.29 0.05 1 0.05 2.39 raytracing
1.67 2.33 0.04 4221152 0.00 0.00 multiply_vectors
1.67 2.37 0.04 2520791 0.00 0.00 idx_stack_top
0.42 2.38 0.01 3838091 0.00 0.00 length
0.42 2.39 0.01 1241598 0.00 0.00 protect_color_overflow
0.42 2.40 0.01 1 0.01 0.01 delete_sphere_list
0.21 2.40 0.01 1048576 0.00 0.00 rayConstruction
0.00 2.40 0.00 2558386 0.00 0.00 idx_stack_empty
0.00 2.40 0.00 1241598 0.00 0.00 reflection
0.00 2.40 0.00 1241598 0.00 0.00 refraction
0.00 2.40 0.00 1204003 0.00 0.00 idx_stack_push
0.00 2.40 0.00 1048576 0.00 0.00 idx_stack_init
0.00 2.40 0.00 113297 0.00 0.00 fresnel
0.00 2.40 0.00 37595 0.00 0.00 idx_stack_pop
0.00 2.40 0.00 3 0.00 0.00 append_rectangular
0.00 2.40 0.00 3 0.00 0.00 append_sphere
0.00 2.40 0.00 2 0.00 0.00 append_light
0.00 2.40 0.00 1 0.00 0.00 calculateBasisVectors
0.00 2.40 0.00 1 0.00 0.00 delete_light_list
0.00 2.40 0.00 1 0.00 0.00 delete_rectangular_list
0.00 2.40 0.00 1 0.00 0.00 diff_in_second
0.00 2.40 0.00 1 0.00 0.00 write_to_ppm
$ ./raytracing & sudo perf top -p $!
double dp = 0.0;
for (int i = 0; i < 3; i++)
    dp += v1[i] * v2[i];
# becomes (loop unrolled by hand)
dp = v1[0] * v2[0] + v1[1] * v2[1] + v1[2] * v2[2];
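To check that the unrolled expression is a drop-in replacement for the loop, the two versions can be placed side by side. This is only a sketch: the names `dot_product_loop` and `dot_product_unrolled` and the `const double *` parameter types are assumptions for illustration, not the project's actual declarations.

```c
#include <assert.h>

/* Loop version: one multiply-add per iteration plus loop bookkeeping
 * (counter increment, comparison, branch). */
double dot_product_loop(const double *v1, const double *v2)
{
    assert(v1 && v2);
    double dp = 0.0;
    for (int i = 0; i < 3; i++)
        dp += v1[i] * v2[i];
    return dp;
}

/* Unrolled version: the same three products, with no counter or branch. */
double dot_product_unrolled(const double *v1, const double *v2)
{
    assert(v1 && v2);
    return v1[0] * v2[0] + v1[1] * v2[1] + v1[2] * v2[2];
}
```

With optimization enabled, a compiler may unroll such a short loop on its own, so the gain from manual unrolling depends on the build flags in use.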
Execution time drops by about 0.5 seconds (from 2.338675 sec to 1.814317 sec)
$ ./raytracing
Execution time of raytracing() : 1.814317 sec
Execution time of raytracing() : 3.983447 sec
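Result: the self time of dot_product (0.53 s to 0.21 s), subtract_vector (0.36 s to 0.18 s) and multiply_vector (0.19 s to 0.05 s) drops noticeably; in this profile add_vector stays at 0.07 s and multiply_vectors goes from 0.04 s to 0.06 s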
Each sample counts as 0.01 seconds.
% time  cumulative seconds  self seconds  calls  self s/call  total s/call  name
17.81 0.21 0.21 69646433 0.00 0.00 dot_product
14.84 0.39 0.18 13861875 0.00 0.00 rayRectangularIntersection
14.84 0.56 0.18 56956357 0.00 0.00 subtract_vector
6.78 0.64 0.08 4620625 0.00 0.00 ray_hit_object
5.94 0.71 0.07 17836094 0.00 0.00 add_vector
5.94 0.78 0.07 17821809 0.00 0.00 cross_product
5.09 0.84 0.06 1048576 0.00 0.00 ray_color
4.66 0.90 0.06 4221152 0.00 0.00 multiply_vectors
4.24 0.95 0.05 31410180 0.00 0.00 multiply_vector
4.24 1.00 0.05 2110576 0.00 0.00 compute_specular_diffuse
3.39 1.04 0.04 1241598 0.00 0.00 refraction
2.54 1.07 0.03 3838091 0.00 0.00 length
2.54 1.10 0.03 2110576 0.00 0.00 localColor
2.12 1.12 0.03 13861875 0.00 0.00 raySphereIntersection
1.70 1.14 0.02 10598450 0.00 0.00 normalize
0.85 1.15 0.01 2520791 0.00 0.00 idx_stack_top
0.85 1.16 0.01 1048576 0.00 0.00 rayConstruction
0.85 1.17 0.01 113297 0.00 0.00 fresnel
0.85 1.18 0.01 1 0.01 1.18 raytracing
0.00 1.18 0.00 2558386 0.00 0.00 idx_stack_empty
0.00 1.18 0.00 1241598 0.00 0.00 protect_color_overflow
0.00 1.18 0.00 1241598 0.00 0.00 reflection
0.00 1.18 0.00 1204003 0.00 0.00 idx_stack_push
0.00 1.18 0.00 1048576 0.00 0.00 idx_stack_init
0.00 1.18 0.00 37595 0.00 0.00 idx_stack_pop
0.00 1.18 0.00 3 0.00 0.00 append_rectangular
0.00 1.18 0.00 3 0.00 0.00 append_sphere
0.00 1.18 0.00 2 0.00 0.00 append_light
0.00 1.18 0.00 1 0.00 0.00 calculateBasisVectors
0.00 1.18 0.00 1 0.00 0.00 delete_light_list
0.00 1.18 0.00 1 0.00 0.00 delete_rectangular_list
0.00 1.18 0.00 1 0.00 0.00 delete_sphere_list
0.00 1.18 0.00 1 0.00 0.00 diff_in_second
0.00 1.18 0.00 1 0.00 0.00 write_to_ppm
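The profile lists subtract_vector, add_vector and multiply_vector alongside dot_product, but these notes only show the unrolled dot product. Assuming those helpers are three-element loops of the same shape, the same transformation might look like the sketch below. The parameter names `a`, `b` and `out` follow the `out[i] = a[i] + b[i]` fragment in these notes; the `_unrolled` names, the parameter types, and the reading of multiply_vector as scaling by a scalar are hypothetical.

```c
#include <assert.h>

/* out = a + b, written without a loop */
void add_vector_unrolled(const double *a, const double *b, double *out)
{
    assert(a && b && out);
    out[0] = a[0] + b[0];
    out[1] = a[1] + b[1];
    out[2] = a[2] + b[2];
}

/* out = a - b, written without a loop */
void subtract_vector_unrolled(const double *a, const double *b, double *out)
{
    assert(a && b && out);
    out[0] = a[0] - b[0];
    out[1] = a[1] - b[1];
    out[2] = a[2] - b[2];
}

/* out = a * b, where b is assumed to be a scalar factor */
void multiply_vector_unrolled(const double *a, double b, double *out)
{
    assert(a && out);
    out[0] = a[0] * b;
    out[1] = a[1] * b;
    out[2] = a[2] * b;
}
```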
#pragma omp parallel for
for (int i = 0; i < 3; i++)
out[i] = a[i] + b[i];
$ ./raytracing
# Rendering scene
Execution time of raytracing() : 139.242781 sec
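The jump to roughly 139 seconds is consistent with the cost of opening a parallel region on every call: the loop body is only three additions, while waking and synchronizing a team of threads costs far more than that, and the function is called millions of times. A sketch of the two variants side by side, assuming a three-element add_vector with the hypothetical names and signature below. Without `-fopenmp` the pragma is ignored and both functions run serially.

```c
#include <assert.h>

/* Serial version: three additions, no threading overhead. */
void add_vector_serial(const double *a, const double *b, double *out)
{
    assert(a && b && out);
    for (int i = 0; i < 3; i++)
        out[i] = a[i] + b[i];
}

/* OpenMP version as in the notes: every call opens a parallel region
 * just to split three iterations. Build with -fopenmp to enable it. */
void add_vector_omp(const double *a, const double *b, double *out)
{
    assert(a && b && out);
    #pragma omp parallel for
    for (int i = 0; i < 3; i++)
        out[i] = a[i] + b[i];
}
```

Both functions produce the same result; they differ only in how much work surrounds the three additions.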