# 2016q3 Homework 1 (raytracing)
###### tags: `sarah`,`raytracing`

contributed by <`SarahCheng`>

github: https://github.com/SarahYuHanCheng/raytracing.git

## Development log

* problem:
    * ~~Ran into a problem when running `vi Makefile`~~ Entering `D` recovers from it.
    * ~~Did not know how to get into `.vimrc` to adjust the colors~~
      > `sudo apt-get install vim`
* reference:
    * Development notes by [vic85821](https://hackmd.io/MYQwpgnARsBMUFoBmAWAjADgSsBmLGSuArAgAwDsFKAJmmAGxJgbFA==?view)
    * Development notes by [Jim00000](https://hackmd.io/s/HyRgNwIa)
    * [multithread](https://embedded2016.hackpad.com/ep/pad/static/wOu40KzMaIP)

STEP:

1. Original: `Execution time of raytracing() : 6.607877 sec`
   After `-Ofast`: `Execution time of raytracing() : 1.440246 sec`
2. Convert out.ppm to other file formats and compare the sizes with `ls -lh`.
3. `make PROFILE=1`
4. gprof: run `gprof ./raytracing | less` to see how many times each function is called and how much time it takes; `less` paginates the output.

    ```
    Each sample counts as 0.01 seconds.
      %   cumulative   self               self     total
     time   seconds   seconds     calls  s/call   s/call  name
     18.39      0.57     0.57   1241598    0.00     0.00  refraction
     13.87      1.00     0.43  56956357    0.00     0.00  subtract_vector
     11.94      1.37     0.37  13861875    0.00     0.00  rayRectangularIntersection
     10.97      1.71     0.34  69646433    0.00     0.00  dot_product
      8.55      1.98     0.27  31410184    0.00     0.00  multiply_vector
      7.42      2.21     0.23  10598450    0.00     0.00  normalize
      5.48      2.38     0.17  17836094    0.00     0.00  add_vector
      5.16      2.54     0.16  13861875    0.00     0.00  raySphereIntersection
      4.52      2.67     0.14   4620625    0.00     0.00  ray_hit_object
      3.23      2.77     0.10  17821809    0.00     0.00  cross_product
      3.06      2.87     0.10   1048576    0.00     0.00  ray_color
      2.74      2.96     0.09   4221152    0.00     0.00  multiply_vectors
      0.97      2.98     0.03   2110576    0.00     0.00  compute_specular_diffuse
      0.65      3.00     0.02   1048576    0.00     0.00  rayConstruction
      0.65      3.02     0.02         1    0.02     3.10  raytracing
      0.48      3.04     0.01   1241598    0.00     0.00  protect_color_overflow
      0.32      3.05     0.01   3838091    0.00     0.00  length
      0.32      3.06     0.01   2558386    0.00     0.00  idx_stack_empty
    ```
5. loop unrolling

    Unroll the loop in `dot_product` in `math-toolkit.h`. Originally:

    ```Clike=
    double dp = 0.0;
    for (int i = 0; i < 3; i++)
        dp += v1[i] * v2[i];
    return dp;
    ```

    Changed to:

    ```Clike=
    double dp = 0.0;
    /* the three products must be summed with '+'; a ',' would keep only the first */
    dp = v1[0] * v2[0] + v1[1] * v2[1] + v1[2] * v2[2];
    return dp;
    ```

    Result: `Execution time of raytracing() : 1.009193 sec`

    ```
      %   cumulative   self               self     total
     time   seconds   seconds     calls  ms/call  ms/call  name
     20.24      0.09     0.09  11534336     0.00     0.00  subtract_vector
     19.05      0.17     0.08   4194304     0.00     0.00  multiply_vector
     14.29      0.23     0.06   1048576     0.00     0.00  rayConstruction
      9.52      0.27     0.04   1048576     0.00     0.00  ray_color
      9.52      0.30     0.04         1    40.00   410.00  raytracing
      7.14      0.34     0.03   3145728     0.00     0.00  raySphereIntersection
      5.95      0.36     0.03   4194304     0.00     0.00  add_vector
      4.76      0.38     0.02   3145728     0.00     0.00  rayRectangularIntersection
      2.38      0.39     0.01   3145730     0.00     0.00  cross_product
      2.38      0.40     0.01   1048579     0.00     0.00  normalize
      2.38      0.41     0.01                              multiply_vectors
      1.19      0.41     0.01  10485760     0.00     0.00  dot_product
    ```
6. force inline
    * Reduces function-call overhead: with inlining, the compiler expands the function body at the call site instead of spending extra memory on a call to do the computation.

    `Execution time of raytracing() : 0.592862 sec`

    ```
    Each sample counts as 0.01 seconds.
      %   cumulative   self               self     total
     time   seconds   seconds     calls  ms/call  ms/call  name
     34.15      0.14     0.14   1048576     0.00     0.00  rayConstruction
     29.27      0.26     0.12   3145728     0.00     0.00  rayRectangularIntersection
     24.39      0.36     0.10   3145728     0.00     0.00  raySphereIntersection
      9.76      0.40     0.04   1048576     0.00     0.00  ray_hit_object
      2.44      0.41     0.01         1    10.00   410.00  raytracing
      0.00      0.41     0.00   1048579     0.00     0.00  normalize
    ```
7. [OpenMP](http://aaz-blogger.blogspot.tw/2011/03/openmp-parallel-construct.html)
    * test, get the number of threads

    ```
    #include <stdio.h>
    #include <omp.h>

    int main()
    {
        /* every thread in the team prints one line */
        #pragma omp parallel
        {
            printf("Hello!!\n");
        }
        return 0;
    }
    ```

    ```
    Hello!!
    Hello!!
    Hello!!
    Hello!!
    ```

    So this machine runs 4 threads.

    ```
    #pragma omp parallel for schedule(dynamic)
    for (int j = 0; j < height; j++) {
        #pragma omp parallel for schedule(dynamic) private(d,stk,object_color)
        for (int i = 0; i < width; i++)
    ```

    * When compiling, remember to add `-fopenmp` to `CFLAGS` and `-lgomp` to `LDFLAGS` in the Makefile.
    * Results:
        * schedule(dynamic): `Execution time of raytracing() : 0.425498 sec`
        * num_threads(2): `Execution time of raytracing() : 0.594750 sec`
        * num_threads(4): `Execution time of raytracing() :` ==0.408811 sec==
        * num_threads(16): `Execution time of raytracing() : 0.490429 sec`
        * num_threads(32): `Execution time of raytracing() : 0.469159 sec`
        * num_threads(64): `Execution time of raytracing() : 0.526471 sec`
        * num_threads(128): `Execution time of raytracing() : 0.608506 sec`
    * With `dot_product` in math-toolkit also run in parallel:

    ```
    double dp0, dp1, dp2;
    #pragma omp parallel sections
    {
        #pragma omp section
        {
            dp0 = v1[0] * v2[0];
        }
        #pragma omp section
        {
            dp1 = v1[1] * v2[1];
        }
        #pragma omp section
        {
            dp2 = v1[2] * v2[2];
        }
    }
    return dp0 + dp1 + dp2;
    ```

    ```
    Each sample counts as 0.01 seconds.
      %   cumulative   self               self     total
     time   seconds   seconds     calls  s/call   s/call  name
     42.70      2.69     2.69  12052835    0.00     0.00  rayRectangularIntersection
     18.73      3.87     1.18   1802340    0.00     0.00  compute_specular_diffuse
      7.94      4.37     0.50  11756078    0.00     0.00  raySphereIntersection
      7.94      4.87     0.50         1    0.50     6.30  raytracing
      6.35      5.27     0.40   1127490    0.00     0.00  refraction
      3.97      5.52     0.25   1844751    0.00     0.00  localColor
      3.81      5.76     0.24   4125111    0.00     0.00  ray_hit_object
      3.49      5.98     0.22    931033    0.00     0.00  ray_color
      2.70      6.15     0.17   9374771    0.00     0.00  normalize
      1.43      6.24     0.09    870632    0.00     0.00  rayConstruction
      0.32      6.26     0.02   1117590    0.00     0.00  reflection
    ```

    Result: `Execution time of raytracing() : 19.676675 sec`
    > Not sure why yet; still to be figured out.
8. Study the complicated parts and draw diagrams (to do)
9. SIMD: it has its own registers and instructions; to be fast, the data has to be processed contiguously (to do)
10. In raytracing.c every pixel can be computed independently, so the work can be split up and run in parallel with multiple threads (to do); see [TempoJiJi](https://hackmd.io/s/r1vckLB6)
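
The unrolled `dot_product` from step 5 and the row-level `#pragma omp parallel for` from step 7 can be sketched together as follows. This is a minimal sketch, not the project's code: `dot_product_loop`, `dot_product_unrolled` and `sum_grid` are hypothetical names, and `sum_grid` is only a stand-in for the per-pixel work in raytracing.c. Two points it tries to illustrate: in C the unrolled products have to be joined with `+` (joined with `,`, only the first product would be assigned to `dp`), and the parallel region sits on the outer row loop while the dot product itself stays serial. A plausible reading of the 19.68 sec result above is that opening a parallel region on every `dot_product` call costs far more than three multiplications, but that is an assumption and was not measured here.

```c
#include <assert.h>

/* Loop version, equivalent to the original listing in math-toolkit.h. */
static inline double dot_product_loop(const double *v1, const double *v2)
{
    double dp = 0.0;
    for (int i = 0; i < 3; i++)
        dp += v1[i] * v2[i];
    return dp;
}

/* Unrolled version: the three products are summed with '+'. */
static inline double dot_product_unrolled(const double *v1, const double *v2)
{
    return v1[0] * v2[0] + v1[1] * v2[1] + v1[2] * v2[2];
}

/*
 * Hypothetical stand-in for the per-pixel work: for every (i, j) it takes
 * the dot product of (i, j, 1) with (1, 1, 1) and adds it to a total.
 * Rows are distributed across threads; the reduction clause gives each
 * thread a private partial sum that is combined at the end.
 * Without -fopenmp the pragma is ignored and the loop runs serially.
 */
double sum_grid(int width, int height)
{
    assert(width >= 0 && height >= 0);

    double total = 0.0;
    #pragma omp parallel for schedule(dynamic) reduction(+:total)
    for (int j = 0; j < height; j++) {
        for (int i = 0; i < width; i++) {
            double a[3] = { (double) i, (double) j, 1.0 };
            double b[3] = { 1.0, 1.0, 1.0 };
            total += dot_product_unrolled(a, b);
        }
    }
    return total;
}
```

Built with `gcc -fopenmp`, `sum_grid(4, 3)` sums `i + j + 1` over a 4 x 3 grid; because every term is a small integer, the total does not depend on the order in which threads add their partial sums.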