sarah, raytracing
contributed by <SarahCheng>
github:
https://github.com/SarahYuHanCheng/raytracing.git
problem:
sudo apt-get install vim
reference:
STEP:
orig: Execution time of raytracing() : 6.607877 sec
after -Ofast: Execution time of raytracing() : 1.440246 sec
Convert out.ppm to another file format, then use ls -lh to compare the file sizes.
make PROFILE=1
gprof: gprof ./raytracing | less
Look at how many times each function is called and how much time it takes; less pages the output.
Each sample counts as 0.01 seconds.
     % cumulative     self              self   total
  time    seconds  seconds     calls  s/call  s/call  name
 18.39       0.57     0.57   1241598    0.00    0.00  refraction
 13.87       1.00     0.43  56956357    0.00    0.00  subtract_vector
 11.94       1.37     0.37  13861875    0.00    0.00  rayRectangularIntersection
 10.97       1.71     0.34  69646433    0.00    0.00  dot_product
  8.55       1.98     0.27  31410184    0.00    0.00  multiply_vector
  7.42       2.21     0.23  10598450    0.00    0.00  normalize
  5.48       2.38     0.17  17836094    0.00    0.00  add_vector
  5.16       2.54     0.16  13861875    0.00    0.00  raySphereIntersection
  4.52       2.67     0.14   4620625    0.00    0.00  ray_hit_object
  3.23       2.77     0.10  17821809    0.00    0.00  cross_product
  3.06       2.87     0.10   1048576    0.00    0.00  ray_color
  2.74       2.96     0.09   4221152    0.00    0.00  multiply_vectors
  0.97       2.98     0.03   2110576    0.00    0.00  compute_specular_diffuse
  0.65       3.00     0.02   1048576    0.00    0.00  rayConstruction
  0.65       3.02     0.02         1    0.02    3.10  raytracing
  0.48       3.04     0.01   1241598    0.00    0.00  protect_color_overflow
  0.32       3.05     0.01   3838091    0.00    0.00  length
  0.32       3.06     0.01   2558386    0.00    0.00  idx_stack_empty
Unroll the loop in dot_product (math-toolkit.h). Original:
double dp = 0.0;
for (int i = 0; i < 3; i++)
    dp += v1[i] * v2[i];
return dp;
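Changed to:
double dp = 0.0;
/* Sum the three products with +. Separating them with commas would invoke
   the comma operator and store only v1[0] * v2[0] in dp. */
dp = v1[0] * v2[0] + v1[1] * v2[1] + v1[2] * v2[2];
return dp;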
Result: Execution time of raytracing() : 1.009193 sec
     % cumulative     self              self   total
  time    seconds  seconds     calls ms/call ms/call  name
 20.24       0.09     0.09  11534336    0.00    0.00  subtract_vector
 19.05       0.17     0.08   4194304    0.00    0.00  multiply_vector
 14.29       0.23     0.06   1048576    0.00    0.00  rayConstruction
  9.52       0.27     0.04   1048576    0.00    0.00  ray_color
  9.52       0.30     0.04         1   40.00  410.00  raytracing
  7.14       0.34     0.03   3145728    0.00    0.00  raySphereIntersection
  5.95       0.36     0.03   4194304    0.00    0.00  add_vector
  4.76       0.38     0.02   3145728    0.00    0.00  rayRectangularIntersection
  2.38       0.39     0.01   3145730    0.00    0.00  cross_product
  2.38       0.40     0.01   1048579    0.00    0.00  normalize
  2.38       0.41     0.01                            multiply_vectors
  1.19       0.41     0.01  10485760    0.00    0.00  dot_product
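A quick way to check an unrolled dot_product is to compare it with the loop version on a known input. The sketch below is not taken from the repository: dot_loop, dot_unrolled, dot_comma and compare_dot_products are hypothetical names, and it assumes vectors are plain arrays of three doubles. It also shows what happens when the three products are separated by commas instead of +: the comma operator evaluates and discards the later operands, so only the first product ends up in the result.

```c
#include <stdio.h>

/* Loop version, same shape as the original dot_product. */
double dot_loop(const double *v1, const double *v2)
{
    double dp = 0.0;
    for (int i = 0; i < 3; i++)
        dp += v1[i] * v2[i];
    return dp;
}

/* Unrolled version: the three products are summed with +. */
double dot_unrolled(const double *v1, const double *v2)
{
    return v1[0] * v2[0] + v1[1] * v2[1] + v1[2] * v2[2];
}

/* Comma-separated version: assignment binds tighter than the comma
 * operator, so dp receives only v1[0] * v2[0]. The (void) casts are
 * there only to silence "unused value" compiler warnings. */
double dot_comma(const double *v1, const double *v2)
{
    double dp = 0.0;
    dp = v1[0] * v2[0], (void)(v1[1] * v2[1]), (void)(v1[2] * v2[2]);
    return dp;
}

/* Hypothetical helper: prints all three results for one test input. */
void compare_dot_products(void)
{
    const double a[3] = {1.0, 2.0, 3.0};
    const double b[3] = {4.0, 5.0, 6.0};
    printf("loop:     %f\n", dot_loop(a, b));
    printf("unrolled: %f\n", dot_unrolled(a, b));
    printf("comma:    %f\n", dot_comma(a, b));
}
```

If the comma form were used inside the ray tracer, the intersection tests would see wrong dot products, so it is worth comparing out.ppm against the original image after any change of this kind.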
Execution time of raytracing() : 0.592862 sec
Each sample counts as 0.01 seconds.
     % cumulative     self              self   total
  time    seconds  seconds     calls ms/call ms/call  name
 34.15       0.14     0.14   1048576    0.00    0.00  rayConstruction
 29.27       0.26     0.12   3145728    0.00    0.00  rayRectangularIntersection
 24.39       0.36     0.10   3145728    0.00    0.00  raySphereIntersection
  9.76       0.40     0.04   1048576    0.00    0.00  ray_hit_object
  2.44       0.41     0.01         1   10.00  410.00  raytracing
  0.00       0.41     0.00   1048579    0.00    0.00  normalize
#include <stdio.h>
#include <omp.h>
int main()
{
    #pragma omp parallel
    {
        printf("Hello!!\n");
    }
    return 0;
}
Hello!!
Hello!!
Hello!!
Hello!!
This shows that the machine runs 4 threads.
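Counting the printed lines works, but the OpenMP runtime can also report the number directly. A minimal sketch using omp_get_max_threads() from omp.h; team_size and print_team_size are hypothetical helper names, and the fallback to 1 when the program is built without -fopenmp is an assumption added here so the code still compiles in that case.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Returns the number of threads a parallel region would use by default,
 * or 1 if the program was built without OpenMP support. */
int team_size(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

/* Prints the value so it can be compared with the number of
 * "Hello!!" lines printed by the test program. */
void print_team_size(void)
{
    printf("threads: %d\n", team_size());
}
```

The value can be changed without recompiling by setting the OMP_NUM_THREADS environment variable before running the program.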
#pragma omp parallel for schedule(dynamic)
for (int j = 0; j < height; j++) {
    #pragma omp parallel for schedule(dynamic) private(d,stk,object_color)
    for (int i = 0; i < width; i++)
When compiling, remember to add -fopenmp to CFLAGS and -lgomp to LDFLAGS in the Makefile.
Results:
Execution time of raytracing() : 0.425498 sec
Execution time of raytracing() : 0.594750 sec
Execution time of raytracing() : 0.408811 sec
Execution time of raytracing() : 0.490429 sec
Execution time of raytracing() : 0.469159 sec
Execution time of raytracing() : 0.526471 sec
Execution time of raytracing() : 0.608506 sec
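The loop parallelization above can also be written with a single parallel for on the outer loop, declaring the per-pixel temporaries inside the loop body so that each thread gets its own copy without a private clause. The sketch below only illustrates that pattern: shade_rows is a hypothetical function, and a simple integer pattern stands in for the real per-pixel ray tracing. In many OpenMP runtimes a parallel for nested inside another parallel region runs on a single thread unless nesting is explicitly enabled, which is one reason to parallelize only the outer loop.

```c
#include <stddef.h>

/* Hypothetical stand-in for the render loop: fills a width x height
 * grayscale buffer. Rows are handed out to threads dynamically. Without
 * -fopenmp the pragma is skipped and the loop runs serially. */
void shade_rows(unsigned char *pixels, int width, int height)
{
#ifdef _OPENMP
    #pragma omp parallel for schedule(dynamic)
#endif
    for (int j = 0; j < height; j++) {
        for (int i = 0; i < width; i++) {
            /* Declared inside the loop, so it is private to each thread. */
            int shade = (i + j) % 256;
            pixels[(size_t)j * (size_t)width + (size_t)i] = (unsigned char)shade;
        }
    }
}
```

Because every iteration writes to a different element of pixels, the rows can be processed in any order without locking.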
Next, parallelizing dot_product in math-toolkit.h as well:
double dp0, dp1, dp2;
#pragma omp parallel sections
{
    #pragma omp section
    {
        dp0 = v1[0] * v2[0];
    }
    #pragma omp section
    {
        dp1 = v1[1] * v2[1];
    }
    #pragma omp section
    {
        dp2 = v1[2] * v2[2];
    }
}
return dp0 + dp1 + dp2;
Each sample counts as 0.01 seconds.
     % cumulative     self              self   total
  time    seconds  seconds     calls  s/call  s/call  name
 42.70       2.69     2.69  12052835    0.00    0.00  rayRectangularIntersection
 18.73       3.87     1.18   1802340    0.00    0.00  compute_specular_diffuse
  7.94       4.37     0.50  11756078    0.00    0.00  raySphereIntersection
  7.94       4.87     0.50         1    0.50    6.30  raytracing
  6.35       5.27     0.40   1127490    0.00    0.00  refraction
  3.97       5.52     0.25   1844751    0.00    0.00  localColor
  3.81       5.76     0.24   4125111    0.00    0.00  ray_hit_object
  3.49       5.98     0.22    931033    0.00    0.00  ray_color
  2.70       6.15     0.17   9374771    0.00    0.00  normalize
  1.43       6.24     0.09    870632    0.00    0.00  rayConstruction
  0.32       6.26     0.02   1117590    0.00    0.00  reflection
Result: Execution time of raytracing() : 19.676675 sec
Not sure why yet; still to be resolved…
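One plausible explanation, offered as an assumption rather than a measured conclusion: each call to dot_product now opens a parallel region, and starting and synchronizing a thread team costs far more than three multiplications, while the first profile shows dot_product being called about 69 million times. A way to test this is to time both variants in isolation. In the sketch below, dot_plain, dot_sections and time_calls are hypothetical names, and the timer falls back to clock() when the program is built without OpenMP.

```c
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Serial dot product of two 3-element vectors. */
double dot_plain(const double *v1, const double *v2)
{
    return v1[0] * v2[0] + v1[1] * v2[1] + v1[2] * v2[2];
}

/* Same result, but each product is computed in its own OpenMP section.
 * Without OpenMP there is nothing to compare, so it calls dot_plain. */
double dot_sections(const double *v1, const double *v2)
{
#ifdef _OPENMP
    double dp0 = 0.0, dp1 = 0.0, dp2 = 0.0;
    #pragma omp parallel sections
    {
        #pragma omp section
        { dp0 = v1[0] * v2[0]; }
        #pragma omp section
        { dp1 = v1[1] * v2[1]; }
        #pragma omp section
        { dp2 = v1[2] * v2[2]; }
    }
    return dp0 + dp1 + dp2;
#else
    return dot_plain(v1, v2);
#endif
}

/* Current time in seconds: wall clock with OpenMP, CPU time otherwise. */
static double now(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Calls fn n times on a fixed input and returns the elapsed seconds.
 * The volatile accumulator keeps the compiler from removing the calls. */
double time_calls(double (*fn)(const double *, const double *), long n)
{
    const double a[3] = {1.0, 2.0, 3.0};
    const double b[3] = {4.0, 5.0, 6.0};
    volatile double sink = 0.0;
    double start = now();
    for (long k = 0; k < n; k++)
        sink += fn(a, b);
    (void)sink;
    return now() - start;
}
```

Comparing time_calls(dot_plain, 1000000) with time_calls(dot_sections, 1000000) on the same machine, built with -fopenmp, should indicate whether the per-call overhead is large enough to account for the slowdown.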