
2016q3 Homework1 (raytracing)

Reviewed by vic85821

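  • Consider using charts to compare the performance differences
  • The relationship between the number of threads and execution time could be compared
  • The purpose and functionality of commit 225c09f are unclear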

Assignment goal

  • OpenMP

Review of previous progress

  • Original program
    • Execution time: Execution time of raytracing() : 3.916696 sec

    • gprof: building with -pg makes the program much slower, so -pg is removed before measuring execution time

      • Procedure
        Add PROFILE=1 to the Makefile
        make
        ./raytracing
        gprof -b raytracing gmon.out | less

      Each sample counts as 0.01 seconds.
        %   cumulative   self              self     total
       time   seconds   seconds    calls   s/call   s/call  name
       26.23      1.36     1.36 69646433     0.00     0.00  dot_product
       17.03      2.24     0.88 56956357     0.00     0.00  subtract_vector
       10.45      2.78     0.54 31410180     0.00     0.00  multiply_vector
        8.03      3.19     0.42 13861875     0.00     0.00  rayRectangularIntersection
        7.16      3.56     0.37 13861875     0.00     0.00  raySphereIntersection
        6.77      3.91     0.35 17836094     0.00     0.00  add_vector
        5.23      4.18     0.27 10598450     0.00     0.00  normalize
        4.74      4.43     0.25 17821809     0.00     0.00  cross_product
        2.90      4.58     0.15  4620625     0.00     0.00  ray_hit_object
        2.52      4.71     0.13  4221152     0.00     0.00  multiply_vectors
        1.55      4.79     0.08  2110576     0.00     0.00  localColor
        1.55      4.87     0.08  1048576     0.00     0.00  ray_color
        1.16      4.93     0.06  1048576     0.00     0.00  rayConstruction
        1.06      4.98     0.06  1241598     0.00     0.00  refraction
        0.97      5.03     0.05        1     0.05     5.16  raytracing
        0.87      5.08     0.05  2110576     0.00     0.00  compute_specular_diffuse
        0.77      5.12     0.04  3838091     0.00     0.00  length
        0.39      5.14     0.02  2520791     0.00     0.00  idx_stack_top
        0.29      5.15     0.02  1241598     0.00     0.00  reflection
        0.19      5.16     0.01  1241598     0.00     0.00  protect_color_overflow
        0.19      5.17     0.01        1     0.01     0.01  delete_sphere_list
        0.00      5.17     0.00  2558386     0.00     0.00  idx_stack_empty
        0.00      5.17     0.00  1204003     0.00     0.00  idx_stack_push
        0.00      5.17     0.00  1048576     0.00     0.00  idx_stack_init
        0.00      5.17     0.00   113297     0.00     0.00  fresnel
        0.00      5.17     0.00    37595     0.00     0.00  idx_stack_pop
        0.00      5.17     0.00        3     0.00     0.00  append_rectangular
        0.00      5.17     0.00        3     0.00     0.00  append_sphere
        0.00      5.17     0.00        2     0.00     0.00  append_light
        0.00      5.17     0.00        1     0.00     0.00  calculateBasisVectors
        0.00      5.17     0.00        1     0.00     0.00  delete_light_list
        0.00      5.17     0.00        1     0.00     0.00  delete_rectangular_list
        0.00      5.17     0.00        1     0.00     0.00  diff_in_second
        0.00      5.17     0.00        1     0.00     0.00  write_to_ppm

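Frequently called functions such as dot_product account for much of the running time: dot_product, subtract_vector and multiply_vector alone take about 54% of the samples.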
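The execution times quoted in this note could be measured along the following lines. This is only a minimal sketch: `diff_in_second` appears in the profile above, but its body here is an assumption, and `time_call` and `busy_work` are hypothetical names introduced for illustration (`busy_work` stands in for `raytracing()`).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* expose clock_gettime under strict -std= modes */
#endif
#include <time.h>

/* Assumed implementation: elapsed seconds between two timespec samples. */
double diff_in_second(struct timespec t1, struct timespec t2)
{
    struct timespec diff;
    if (t2.tv_nsec - t1.tv_nsec < 0) {
        /* Borrow one second when the nanosecond field wraps. */
        diff.tv_sec = t2.tv_sec - t1.tv_sec - 1;
        diff.tv_nsec = t2.tv_nsec - t1.tv_nsec + 1000000000L;
    } else {
        diff.tv_sec = t2.tv_sec - t1.tv_sec;
        diff.tv_nsec = t2.tv_nsec - t1.tv_nsec;
    }
    return diff.tv_sec + diff.tv_nsec / 1000000000.0;
}

/* Hypothetical helper: wall-clock seconds spent inside fn(). */
double time_call(void (*fn)(void))
{
    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    fn();
    clock_gettime(CLOCK_MONOTONIC, &end);
    return diff_in_second(start, end);
}

/* Hypothetical stand-in for raytracing(): a short CPU-bound loop. */
void busy_work(void)
{
    volatile double acc = 0.0;
    for (int i = 0; i < 1000000; i++)
        acc += i * 0.5;
}
```

Calling `time_call(busy_work)` from `main` and printing the result with `printf` gives a line comparable to the "Execution time of raytracing()" output. Because -pg adds overhead to every function call, the timing build should leave it out, as noted above.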

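Optimization

  • Loop unrolling: expand the loop body by hand, which removes the per-iteration loop-condition checks.
    Every for loop in math-toolkit.h was unrolled.
    • Execution time:
      Execution time of raytracing() : 2.551525 sec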
    • gprof:
      Each sample counts as 0.01 seconds.
        %   cumulative   self              self     total
       time   seconds   seconds    calls   s/call   s/call  name
       17.16      0.53     0.53 56956357     0.00     0.00  subtract_vector
       13.60      0.95     0.42 13861875     0.00     0.00  rayRectangularIntersection
       13.28      1.36     0.41 69646433     0.00     0.00  dot_product
        9.71      1.66     0.30 17821809     0.00     0.00  cross_product
        7.77      1.90     0.24 10598450     0.00     0.00  normalize
        7.12      2.12     0.22 13861875     0.00     0.00  raySphereIntersection
        6.15      2.31     0.19 17836094     0.00     0.00  add_vector
        5.83      2.49     0.18 31410180     0.00     0.00  multiply_vector
        5.83      2.67     0.18  4620625     0.00     0.00  ray_hit_object
        3.56      2.78     0.11  1048576     0.00     0.00  ray_color
        2.27      2.85     0.07  2110576     0.00     0.00  compute_specular_diffuse
        1.94      2.91     0.06  4221152     0.00     0.00  multiply_vectors
        1.62      2.96     0.05  2110576     0.00     0.00  localColor
        0.65      2.98     0.02  2558386     0.00     0.00  idx_stack_empty
        0.65      3.00     0.02  1241598     0.00     0.00  protect_color_overflow
        0.65      3.02     0.02  1241598     0.00     0.00  refraction
        0.65      3.04     0.02  1048576     0.00     0.00  rayConstruction
        0.32      3.05     0.01  1241598     0.00     0.00  reflection
        0.32      3.06     0.01  1204003     0.00     0.00  idx_stack_push
        0.32      3.07     0.01  1048576     0.00     0.00  idx_stack_init
        0.32      3.08     0.01        1     0.01     0.01  delete_sphere_list
        0.32      3.09     0.01        1     0.01     3.08  raytracing
        0.00      3.09     0.00  3838091     0.00     0.00  length
        0.00      3.09     0.00  2520791     0.00     0.00  idx_stack_top
        0.00      3.09     0.00   113297     0.00     0.00  fresnel
        0.00      3.09     0.00    37595     0.00     0.00  idx_stack_pop
        0.00      3.09     0.00        3     0.00     0.00  append_rectangular
        0.00      3.09     0.00        3     0.00     0.00  append_sphere
        0.00      3.09     0.00        2     0.00     0.00  append_light
        0.00      3.09     0.00        1     0.00     0.00  calculateBasisVectors
        0.00      3.09     0.00        1     0.00     0.00  delete_light_list
        0.00      3.09     0.00        1     0.00     0.00  delete_rectangular_list
        0.00      3.09     0.00        1     0.00     0.00  diff_in_second
        0.00      3.09     0.00        1     0.00     0.00  write_to_ppm
  • Force inline: see "When to use inline function and when not to use it"
    -D__forceinline="__attribute__((always_inline))"
    • Execution time: Execution time of raytracing() : 2.379305 sec

    • gprof:

      Each sample counts as 0.01 seconds.
        %   cumulative   self              self     total
       time   seconds   seconds    calls   s/call   s/call  name
       42.78      0.98     0.98 13861875     0.00     0.00  rayRectangularIntersection
       23.91      1.52     0.55 13861875     0.00     0.00  raySphereIntersection
        8.77      1.72     0.20  2110576     0.00     0.00  compute_specular_diffuse
        5.26      1.84     0.12  4620625     0.00     0.00  ray_hit_object
        4.83      1.95     0.11  1048576     0.00     0.00  ray_color
        3.95      2.04     0.09  2110576     0.00     0.00  localColor
        3.07      2.11     0.07  1241598     0.00     0.00  refraction
        2.19      2.16     0.05  1241598     0.00     0.00  reflection
        2.19      2.21     0.05  1048576     0.00     0.00  rayConstruction
        1.32      2.24     0.03        1     0.03     2.28  raytracing
        0.88      2.26     0.02  2520791     0.00     0.00  idx_stack_top
        0.44      2.27     0.01  2558386     0.00     0.00  idx_stack_empty
        0.44      2.28     0.01   113297     0.00     0.00  fresnel
        0.00      2.28     0.00  1241598     0.00     0.00  protect_color_overflow
        0.00      2.28     0.00  1204003     0.00     0.00  idx_stack_push
        0.00      2.28     0.00  1048576     0.00     0.00  idx_stack_init
        0.00      2.28     0.00    37595     0.00     0.00  idx_stack_pop
        0.00      2.28     0.00        3     0.00     0.00  append_rectangular
        0.00      2.28     0.00        3     0.00     0.00  append_sphere
        0.00      2.28     0.00        2     0.00     0.00  append_light
        0.00      2.28     0.00        1     0.00     0.00  calculateBasisVectors
        0.00      2.28     0.00        1     0.00     0.00  delete_light_list
        0.00      2.28     0.00        1     0.00     0.00  delete_rectangular_list
        0.00      2.28     0.00        1     0.00     0.00  delete_sphere_list
        0.00      2.28     0.00        1     0.00     0.00  diff_in_second
        0.00      2.28     0.00        1     0.00     0.00  write_to_ppm
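The two optimizations in this section can be sketched on a single helper. This is an illustration only: the function names and signatures below are assumptions about what a vector helper in math-toolkit.h might look like, not a copy of the actual header, and the fallback `#define` simply mirrors the `-D__forceinline=...` flag shown above.

```c
/* Fallback for when the macro is not supplied on the command line. */
#ifndef __forceinline
#define __forceinline __attribute__((always_inline))
#endif

/* Loop version: the counter is compared and incremented on every
 * iteration, and at -O0 each call is a real function call. */
static inline double dot_product_loop(const double *v1, const double *v2)
{
    double dp = 0.0;
    for (int i = 0; i < 3; i++)
        dp += v1[i] * v2[i];
    return dp;
}

/* Unrolled and force-inlined version: no loop counter, no branch, and
 * always_inline asks GCC to inline the body even when optimization is off. */
static inline __forceinline
double dot_product_unrolled(const double *v1, const double *v2)
{
    return v1[0] * v2[0] + v1[1] * v2[1] + v1[2] * v2[2];
}
```

Once the small vector helpers are inlined they no longer exist as separate functions, which would explain why dot_product, subtract_vector and the others are absent from the force-inline profile and their time is attributed to callers such as rayRectangularIntersection.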

Improvement

OpenMP

Makefile:

CFLAGS = \
    -std=gnu99 -Wall -O0 \
    -D__forceinline="__attribute__((always_inline))"\
    -fopenmp
LDFLAGS = \
    -lm  -lgomp

raytracing.c

#include<omp.h>
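The note does not show which loop in raytracing.c was parallelized. As a minimal sketch, assuming the outer pixel loop is annotated with `#pragma omp parallel for`, the change might look like the following. `render`, `shade` and `available_threads` are hypothetical names introduced here, with `shade` standing in for the per-pixel `ray_color` work.

```c
#include <stdint.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical per-pixel work standing in for ray_color(). */
static uint8_t shade(int i, int j)
{
    return (uint8_t)((i * 31 + j * 17) & 0xff);
}

/* Every (i, j) writes a distinct pixel, so the rows are independent and
 * can be split across threads. Variables declared inside the loop body
 * are private to each thread; anything declared outside is shared and
 * would need a private() clause. */
void render(uint8_t *pixels, int width, int height)
{
    #pragma omp parallel for schedule(dynamic)
    for (int j = 0; j < height; j++) {
        for (int i = 0; i < width; i++) {
            pixels[j * width + i] = shade(i, j);
        }
    }
}

/* Thread count OpenMP would use; 1 when built without -fopenmp. */
int available_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}
```

Built with `-fopenmp` as in the Makefile above, the thread count can be varied through the `OMP_NUM_THREADS` environment variable, which is one way to compare execution time against the number of threads.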
  • Result
    • Execution time: Execution time of raytracing() : 1.013082 sec

    • gprof:

      Each sample counts as 0.01 seconds.
        %   cumulative   self              self     total
       time   seconds   seconds    calls  ms/call  ms/call  name
       41.95      0.13     0.13   331737     0.00     0.00  rayRectangularIntersection
       12.91      0.17     0.04   333955     0.00     0.00  raySphereIntersection
       12.91      0.21     0.04    55511     0.00     0.00  localColor
       12.91      0.25     0.04    49278     0.00     0.00  compute_specular_diffuse
        9.68      0.28     0.03   113253     0.00     0.00  ray_hit_object
        9.68      0.31     0.03    26786     0.00     0.01  ray_color
        0.00      0.31     0.00    61251     0.00     0.00  idx_stack_top
        0.00      0.31     0.00    61247     0.00     0.00  idx_stack_empty
        0.00      0.31     0.00    33997     0.00     0.00  reflection
        0.00      0.31     0.00    30389     0.00     0.00  protect_color_overflow
        0.00      0.31     0.00    29646     0.00     0.00  refraction
        0.00      0.31     0.00    29074     0.00     0.00  idx_stack_push
        0.00      0.31     0.00    23823     0.00     0.00  rayConstruction
        0.00      0.31     0.00    22615     0.00     0.00  idx_stack_init
        0.00      0.31     0.00     4000     0.00     0.00  fresnel
        0.00      0.31     0.00     1227     0.00     0.00  idx_stack_pop
        0.00      0.31     0.00        3     0.00     0.00  append_rectangular
        0.00      0.31     0.00        3     0.00     0.00  append_sphere
        0.00      0.31     0.00        2     0.00     0.00  append_light
        0.00      0.31     0.00        1     0.00     0.00  calculateBasisVectors
        0.00      0.31     0.00        1     0.00     0.00  delete_light_list
        0.00      0.31     0.00        1     0.00     0.00  delete_rectangular_list
        0.00      0.31     0.00        1     0.00     0.00  delete_sphere_list
        0.00      0.31     0.00        1     0.00     0.00  diff_in_second
        0.00      0.31     0.00        1     0.00   310.10  raytracing
        0.00      0.31     0.00        1     0.00     0.00  write_to_ppm

Why did the number of calls decrease as well?


tags: Course Ruby 2016Autumn HW1 Raytracing