2016q3 Homework1 (raytracing)
===

### Reviewed by `vic85821`
* Consider using charts to compare the performance differences.
* Compare how execution time varies with the number of threads.
* The message of commit [225c09f](https://github.com/ruby0109/raytracing/commit/225c09f145a48a67cded778eddfb0d68d0af0594) does not convey its purpose or what it changes.

## **Assignment goal**
* OpenMP

## Review of previous progress
* Original program
    * Execution time: Execution time of raytracing() : ==3.916696 sec==
    * Gprof: building with `-pg` slows the program down considerably, so `-pg` is removed before measuring execution time.
    * Procedure: set `PROFILE=1` in the Makefile, then run `make`, `./raytracing`, and `gprof -b raytracing gmon.out | less`.

      Each sample counts as 0.01 seconds.

      | % time | cumulative seconds | self seconds | calls | self s/call | total s/call | name |
      |---:|---:|---:|---:|---:|---:|:---|
      | 26.23 | 1.36 | 1.36 | 69646433 | 0.00 | 0.00 | `dot_product` |
      | 17.03 | 2.24 | 0.88 | 56956357 | 0.00 | 0.00 | `subtract_vector` |
      | 10.45 | 2.78 | 0.54 | 31410180 | 0.00 | 0.00 | `multiply_vector` |
      | 8.03 | 3.19 | 0.42 | 13861875 | 0.00 | 0.00 | `rayRectangularIntersection` |
      | 7.16 | 3.56 | 0.37 | 13861875 | 0.00 | 0.00 | `raySphereIntersection` |
      | 6.77 | 3.91 | 0.35 | 17836094 | 0.00 | 0.00 | `add_vector` |
      | 5.23 | 4.18 | 0.27 | 10598450 | 0.00 | 0.00 | `normalize` |
      | 4.74 | 4.43 | 0.25 | 17821809 | 0.00 | 0.00 | `cross_product` |
      | 2.90 | 4.58 | 0.15 | 4620625 | 0.00 | 0.00 | `ray_hit_object` |
      | 2.52 | 4.71 | 0.13 | 4221152 | 0.00 | 0.00 | `multiply_vectors` |
      | 1.55 | 4.79 | 0.08 | 2110576 | 0.00 | 0.00 | `localColor` |
      | 1.55 | 4.87 | 0.08 | 1048576 | 0.00 | 0.00 | `ray_color` |
      | 1.16 | 4.93 | 0.06 | 1048576 | 0.00 | 0.00 | `rayConstruction` |
      | 1.06 | 4.98 | 0.06 | 1241598 | 0.00 | 0.00 | `refraction` |
      | 0.97 | 5.03 | 0.05 | 1 | 0.05 | 5.16 | `raytracing` |
      | 0.87 | 5.08 | 0.05 | 2110576 | 0.00 | 0.00 | `compute_specular_diffuse` |
      | 0.77 | 5.12 | 0.04 | 3838091 | 0.00 | 0.00 | `length` |
      | 0.39 | 5.14 | 0.02 | 2520791 | 0.00 | 0.00 | `idx_stack_top` |
      | 0.29 | 5.15 | 0.02 | 1241598 | 0.00 | 0.00 | `reflection` |
      | 0.19 | 5.16 | 0.01 | 1241598 | 0.00 | 0.00 | `protect_color_overflow` |
      | 0.19 | 5.17 | 0.01 | 1 | 0.01 | 0.01 | `delete_sphere_list` |
      | 0.00 | 5.17 | 0.00 | 2558386 | 0.00 | 0.00 | `idx_stack_empty` |
      | 0.00 | 5.17 | 0.00 | 1204003 | 0.00 | 0.00 | `idx_stack_push` |
      | 0.00 | 5.17 | 0.00 | 1048576 | 0.00 | 0.00 | `idx_stack_init` |
      | 0.00 | 5.17 | 0.00 | 113297 | 0.00 | 0.00 | `fresnel` |
      | 0.00 | 5.17 | 0.00 | 37595 | 0.00 | 0.00 | `idx_stack_pop` |
      | 0.00 | 5.17 | 0.00 | 3 | 0.00 | 0.00 | `append_rectangular` |
      | 0.00 | 5.17 | 0.00 | 3 | 0.00 | 0.00 | `append_sphere` |
      | 0.00 | 5.17 | 0.00 | 2 | 0.00 | 0.00 | `append_light` |
      | 0.00 | 5.17 | 0.00 | 1 | 0.00 | 0.00 | `calculateBasisVectors` |
      | 0.00 | 5.17 | 0.00 | 1 | 0.00 | 0.00 | `delete_light_list` |
      | 0.00 | 5.17 | 0.00 | 1 | 0.00 | 0.00 | `delete_rectangular_list` |
      | 0.00 | 5.17 | 0.00 | 1 | 0.00 | 0.00 | `diff_in_second` |
      | 0.00 | 5.17 | 0.00 | 1 | 0.00 | 0.00 | `write_to_ppm` |

      Frequently called functions such as `dot_product` account for much of the run time.

**Optimization**
* Loop unrolling: expand each loop body into straight-line code, which removes the time spent on loop-condition checks. Every `for` loop in `math-toolkit.h` was unrolled.
    * Execution time: Execution time of raytracing() : ==2.551525 sec==
    * gprof:

      Each sample counts as 0.01 seconds.

      | % time | cumulative seconds | self seconds | calls | self s/call | total s/call | name |
      |---:|---:|---:|---:|---:|---:|:---|
      | 17.16 | 0.53 | 0.53 | 56956357 | 0.00 | 0.00 | `subtract_vector` |
      | 13.60 | 0.95 | 0.42 | 13861875 | 0.00 | 0.00 | `rayRectangularIntersection` |
      | 13.28 | 1.36 | 0.41 | 69646433 | 0.00 | 0.00 | `dot_product` |
      | 9.71 | 1.66 | 0.30 | 17821809 | 0.00 | 0.00 | `cross_product` |
      | 7.77 | 1.90 | 0.24 | 10598450 | 0.00 | 0.00 | `normalize` |
      | 7.12 | 2.12 | 0.22 | 13861875 | 0.00 | 0.00 | `raySphereIntersection` |
      | 6.15 | 2.31 | 0.19 | 17836094 | 0.00 | 0.00 | `add_vector` |
      | 5.83 | 2.49 | 0.18 | 31410180 | 0.00 | 0.00 | `multiply_vector` |
      | 5.83 | 2.67 | 0.18 | 4620625 | 0.00 | 0.00 | `ray_hit_object` |
      | 3.56 | 2.78 | 0.11 | 1048576 | 0.00 | 0.00 | `ray_color` |
      | 2.27 | 2.85 | 0.07 | 2110576 | 0.00 | 0.00 | `compute_specular_diffuse` |
      | 1.94 | 2.91 | 0.06 | 4221152 | 0.00 | 0.00 | `multiply_vectors` |
      | 1.62 | 2.96 | 0.05 | 2110576 | 0.00 | 0.00 | `localColor` |
      | 0.65 | 2.98 | 0.02 | 2558386 | 0.00 | 0.00 | `idx_stack_empty` |
      | 0.65 | 3.00 | 0.02 | 1241598 | 0.00 | 0.00 | `protect_color_overflow` |
      | 0.65 | 3.02 | 0.02 | 1241598 | 0.00 | 0.00 | `refraction` |
      | 0.65 | 3.04 | 0.02 | 1048576 | 0.00 | 0.00 | `rayConstruction` |
      | 0.32 | 3.05 | 0.01 | 1241598 | 0.00 | 0.00 | `reflection` |
      | 0.32 | 3.06 | 0.01 | 1204003 | 0.00 | 0.00 | `idx_stack_push` |
      | 0.32 | 3.07 | 0.01 | 1048576 | 0.00 | 0.00 | `idx_stack_init` |
      | 0.32 | 3.08 | 0.01 | 1 | 0.01 | 0.01 | `delete_sphere_list` |
      | 0.32 | 3.09 | 0.01 | 1 | 0.01 | 3.08 | `raytracing` |
      | 0.00 | 3.09 | 0.00 | 3838091 | 0.00 | 0.00 | `length` |
      | 0.00 | 3.09 | 0.00 | 2520791 | 0.00 | 0.00 | `idx_stack_top` |
      | 0.00 | 3.09 | 0.00 | 113297 | 0.00 | 0.00 | `fresnel` |
      | 0.00 | 3.09 | 0.00 | 37595 | 0.00 | 0.00 | `idx_stack_pop` |
      | 0.00 | 3.09 | 0.00 | 3 | 0.00 | 0.00 | `append_rectangular` |
      | 0.00 | 3.09 | 0.00 | 3 | 0.00 | 0.00 | `append_sphere` |
      | 0.00 | 3.09 | 0.00 | 2 | 0.00 | 0.00 | `append_light` |
      | 0.00 | 3.09 | 0.00 | 1 | 0.00 | 0.00 | `calculateBasisVectors` |
      | 0.00 | 3.09 | 0.00 | 1 | 0.00 | 0.00 | `delete_light_list` |
      | 0.00 | 3.09 | 0.00 | 1 | 0.00 | 0.00 | `delete_rectangular_list` |
      | 0.00 | 3.09 | 0.00 | 1 | 0.00 | 0.00 | `diff_in_second` |
      | 0.00 | 3.09 | 0.00 | 1 | 0.00 | 0.00 | `write_to_ppm` |

* Force inline: following [When to use inline function and when not to use it](http://stackoverflow.com/questions/1932311/when-to-use-inline-function-and-when-not-to-use-it), the flag `-D__forceinline="__attribute__((always_inline))"` is added to `CFLAGS`.
    * Execution time: Execution time of raytracing() : ==2.379305 sec==
    * gprof:

      Each sample counts as 0.01 seconds.

      | % time | cumulative seconds | self seconds | calls | self s/call | total s/call | name |
      |---:|---:|---:|---:|---:|---:|:---|
      | 42.78 | 0.98 | 0.98 | 13861875 | 0.00 | 0.00 | `rayRectangularIntersection` |
      | 23.91 | 1.52 | 0.55 | 13861875 | 0.00 | 0.00 | `raySphereIntersection` |
      | 8.77 | 1.72 | 0.20 | 2110576 | 0.00 | 0.00 | `compute_specular_diffuse` |
      | 5.26 | 1.84 | 0.12 | 4620625 | 0.00 | 0.00 | `ray_hit_object` |
      | 4.83 | 1.95 | 0.11 | 1048576 | 0.00 | 0.00 | `ray_color` |
      | 3.95 | 2.04 | 0.09 | 2110576 | 0.00 | 0.00 | `localColor` |
      | 3.07 | 2.11 | 0.07 | 1241598 | 0.00 | 0.00 | `refraction` |
      | 2.19 | 2.16 | 0.05 | 1241598 | 0.00 | 0.00 | `reflection` |
      | 2.19 | 2.21 | 0.05 | 1048576 | 0.00 | 0.00 | `rayConstruction` |
      | 1.32 | 2.24 | 0.03 | 1 | 0.03 | 2.28 | `raytracing` |
      | 0.88 | 2.26 | 0.02 | 2520791 | 0.00 | 0.00 | `idx_stack_top` |
      | 0.44 | 2.27 | 0.01 | 2558386 | 0.00 | 0.00 | `idx_stack_empty` |
      | 0.44 | 2.28 | 0.01 | 113297 | 0.00 | 0.00 | `fresnel` |
      | 0.00 | 2.28 | 0.00 | 1241598 | 0.00 | 0.00 | `protect_color_overflow` |
      | 0.00 | 2.28 | 0.00 | 1204003 | 0.00 | 0.00 | `idx_stack_push` |
      | 0.00 | 2.28 | 0.00 | 1048576 | 0.00 | 0.00 | `idx_stack_init` |
      | 0.00 | 2.28 | 0.00 | 37595 | 0.00 | 0.00 | `idx_stack_pop` |
      | 0.00 | 2.28 | 0.00 | 3 | 0.00 | 0.00 | `append_rectangular` |
      | 0.00 | 2.28 | 0.00 | 3 | 0.00 | 0.00 | `append_sphere` |
      | 0.00 | 2.28 | 0.00 | 2 | 0.00 | 0.00 | `append_light` |
      | 0.00 | 2.28 | 0.00 | 1 | 0.00 | 0.00 | `calculateBasisVectors` |
      | 0.00 | 2.28 | 0.00 | 1 | 0.00 | 0.00 | `delete_light_list` |
      | 0.00 | 2.28 | 0.00 | 1 | 0.00 | 0.00 | `delete_rectangular_list` |
      | 0.00 | 2.28 | 0.00 | 1 | 0.00 | 0.00 | `delete_sphere_list` |
      | 0.00 | 2.28 | 0.00 | 1 | 0.00 | 0.00 | `diff_in_second` |
      | 0.00 | 2.28 | 0.00 | 1 | 0.00 | 0.00 | `write_to_ppm` |

## **Improvements**
### OpenMP
* Followed [DADA's notes](https://embedded2016.hackpad.com/2016q1-Homework-2-A-jugsB8br8Bt#:h=優化版本-3---OpenMP).

Makefile:
```clike
CFLAGS = \
    -std=gnu99 -Wall -O0 \
    -D__forceinline="__attribute__((always_inline))" \
    -fopenmp
LDFLAGS = \
    -lm -lgomp
```
raytracing.c
```clike
#include <omp.h>
```
* Result
    * Execution time: Execution time of raytracing() : ==1.013082 sec==
    * gprof:

      Each sample counts as 0.01 seconds.

      | % time | cumulative seconds | self seconds | calls | self ms/call | total ms/call | name |
      |---:|---:|---:|---:|---:|---:|:---|
      | 41.95 | 0.13 | 0.13 | 331737 | 0.00 | 0.00 | `rayRectangularIntersection` |
      | 12.91 | 0.17 | 0.04 | 333955 | 0.00 | 0.00 | `raySphereIntersection` |
      | 12.91 | 0.21 | 0.04 | 55511 | 0.00 | 0.00 | `localColor` |
      | 12.91 | 0.25 | 0.04 | 49278 | 0.00 | 0.00 | `compute_specular_diffuse` |
      | 9.68 | 0.28 | 0.03 | 113253 | 0.00 | 0.00 | `ray_hit_object` |
      | 9.68 | 0.31 | 0.03 | 26786 | 0.00 | 0.01 | `ray_color` |
      | 0.00 | 0.31 | 0.00 | 61251 | 0.00 | 0.00 | `idx_stack_top` |
      | 0.00 | 0.31 | 0.00 | 61247 | 0.00 | 0.00 | `idx_stack_empty` |
      | 0.00 | 0.31 | 0.00 | 33997 | 0.00 | 0.00 | `reflection` |
      | 0.00 | 0.31 | 0.00 | 30389 | 0.00 | 0.00 | `protect_color_overflow` |
      | 0.00 | 0.31 | 0.00 | 29646 | 0.00 | 0.00 | `refraction` |
      | 0.00 | 0.31 | 0.00 | 29074 | 0.00 | 0.00 | `idx_stack_push` |
      | 0.00 | 0.31 | 0.00 | 23823 | 0.00 | 0.00 | `rayConstruction` |
      | 0.00 | 0.31 | 0.00 | 22615 | 0.00 | 0.00 | `idx_stack_init` |
      | 0.00 | 0.31 | 0.00 | 4000 | 0.00 | 0.00 | `fresnel` |
      | 0.00 | 0.31 | 0.00 | 1227 | 0.00 | 0.00 | `idx_stack_pop` |
      | 0.00 | 0.31 | 0.00 | 3 | 0.00 | 0.00 | `append_rectangular` |
      | 0.00 | 0.31 | 0.00 | 3 | 0.00 | 0.00 | `append_sphere` |
      | 0.00 | 0.31 | 0.00 | 2 | 0.00 | 0.00 | `append_light` |
      | 0.00 | 0.31 | 0.00 | 1 | 0.00 | 0.00 | `calculateBasisVectors` |
      | 0.00 | 0.31 | 0.00 | 1 | 0.00 | 0.00 | `delete_light_list` |
      | 0.00 | 0.31 | 0.00 | 1 | 0.00 | 0.00 | `delete_rectangular_list` |
      | 0.00 | 0.31 | 0.00 | 1 | 0.00 | 0.00 | `delete_sphere_list` |
      | 0.00 | 0.31 | 0.00 | 1 | 0.00 | 0.00 | `diff_in_second` |
      | 0.00 | 0.31 | 0.00 | 1 | 0.00 | 310.10 | `raytracing` |
      | 0.00 | 0.31 | 0.00 | 1 | 0.00 | 0.00 | `write_to_ppm` |

Open question: why did the reported call counts drop as well?

## References:
* Notes by classmate [何岱岱](https://embedded2016.hackpad.com/2016q1-Homwork2a--QixiAsqbVaf#:h=加入-D__forceinline="__attribute)

###### tags: `Course` `Ruby` `2016Autumn` `HW1` `Raytracing`
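
The OpenMP section records only the compiler flags and the `#include`, not the parallelized loop itself. As a rough sketch of how a row-by-row rendering loop might be split across threads, the example below uses `#pragma omp parallel for`. The names `render` and `shade_pixel` are hypothetical stand-ins, not the functions in `raytracing.c`, and the `schedule(dynamic)` clause is an assumption rather than what these notes actually used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-pixel work done by ray_color():
 * the result depends only on (i, j), so pixels are independent. */
static void shade_pixel(int i, int j, uint8_t rgb[3])
{
    rgb[0] = (uint8_t)(i & 0xff);
    rgb[1] = (uint8_t)(j & 0xff);
    rgb[2] = (uint8_t)((i ^ j) & 0xff);
}

/* Fill a width x height RGB image, distributing rows across threads. */
void render(uint8_t *pixels, int width, int height)
{
    assert(pixels != NULL && width > 0 && height > 0);

    /* Each value of j writes a disjoint row, so no locking is needed.
     * Variables declared inside the loop body are private to each thread. */
    #pragma omp parallel for schedule(dynamic)
    for (int j = 0; j < height; j++) {
        for (int i = 0; i < width; i++) {
            uint8_t rgb[3];
            shade_pixel(i, j, rgb);
            size_t base = ((size_t)j * (size_t)width + (size_t)i) * 3;
            pixels[base + 0] = rgb[0];
            pixels[base + 1] = rgb[1];
            pixels[base + 2] = rgb[2];
        }
    }
}
```

Built without `-fopenmp`, the pragma is ignored and the loop runs serially with the same output, which makes serial and parallel timings easy to compare. In real code, any temporaries declared outside the loop would have to be moved inside it or listed in a `private` clause to avoid data races.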