# Multiprocessor Parallel Programming HW4

`AN4086056 黃祥瑋`

## Parallel

I use multiple threads to parallelize the program. This is much faster than using a cluster because communication is cheaper: the threads live in the same process and share memory, so there is no data transmission cost.

In this program, I assign each thread a section to work on, and every thread waits for the others at the end of each round.

## Usage

`$./h4_problem1 Nsmooth worker`

Example: `$./h4_problem1 1000 4`

## Analyze

`p3` is the temporary executable name for `h4_problem1`.

### 1000 Smooth

* worker = 1
![](https://i.imgur.com/DFNPxlG.png)
* worker = 8
![](https://i.imgur.com/whzjeDv.png)
* worker = 16
![](https://i.imgur.com/u1wS7Cn.png)
* worker = 24
![](https://i.imgur.com/h7lhh0E.png)

From these runs, worker = 16 reaches the best Speedup among {1, 8, 16, 24}: $\dfrac{118}{40}=2.95$.

The main reason worker = 16 and 24 do not speed up well is the number of hardware cores. If the number of threads is bigger than the number of cores, some threads have to queue for a core. Since a barrier makes every thread wait for all the others in each round, each round takes as long as its slowest thread, so the total execution time is roughly `Max thread execution time per round` * `rounds`. This is why the program does not speed up well with a large number of threads.

### 2000 Smooth

* worker = 1
![](https://i.imgur.com/1wTHwlc.png)
* worker = 4
![](https://i.imgur.com/1x0vX3h.png)
* worker = 8
![](https://i.imgur.com/6Rc6XwR.png)
* worker = 12
![](https://i.imgur.com/qursowl.png)

The best Speedup among {1, 4, 8, 12} is $\dfrac{234}{71}\approx 3.30$ at worker = 8. At worker = 12, the time cost is bigger than at worker = 8.

## Summary

Comparing Smooth 1000 and 2000, the time cost of `2000` is not exactly 2 times that of `1000`, because part of the program is serial. When the Smooth number is bigger, we get a better Speedup.
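To make the approach described at the start of this report more concrete (one section per thread, a wait at the end of every round), here is a minimal sketch in C with POSIX threads. It is not the code of `h4_problem1`: the `task` struct, `smooth_worker`, `smooth`, the 1D array, and the wrap-around three-point average are all assumptions chosen to keep the example short. It also assumes a platform that provides `pthread_barrier_t`, such as Linux.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes pthread_barrier_t in strict -std modes */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical shared state; the report does not show the real layout. */
typedef struct {
    int id, workers, n, rounds;
    double *a, *b;               /* two buffers, swapped every round */
    pthread_barrier_t *barrier;
} task;

static void *smooth_worker(void *p) {
    task *t = p;
    /* Contiguous section [lo, hi) owned by this thread. */
    int lo = (int)((long)t->id * t->n / t->workers);
    int hi = (int)((long)(t->id + 1) * t->n / t->workers);
    double *cur = t->a, *next = t->b;
    for (int r = 0; r < t->rounds; r++) {
        for (int i = lo; i < hi; i++) {
            int left = (i + t->n - 1) % t->n, right = (i + 1) % t->n;
            next[i] = (cur[left] + cur[i] + cur[right]) / 3.0;
        }
        /* Nobody starts the next round until every section is written. */
        pthread_barrier_wait(t->barrier);
        double *tmp = cur; cur = next; next = tmp;
    }
    return NULL;
}

/* Smooths data[0..n) in place for `rounds` rounds; returns 0 on success. */
int smooth(double *data, int n, int rounds, int workers) {
    assert(n > 0 && rounds >= 0 && workers > 0);
    double *scratch = malloc(n * sizeof *scratch);
    pthread_t *tids = malloc(workers * sizeof *tids);
    task *tasks = malloc(workers * sizeof *tasks);
    pthread_barrier_t barrier;
    if (!scratch || !tids || !tasks ||
        pthread_barrier_init(&barrier, NULL, (unsigned)workers) != 0) {
        free(scratch); free(tids); free(tasks);
        return -1;
    }
    for (int w = 0; w < workers; w++) {
        tasks[w] = (task){w, workers, n, rounds, data, scratch, &barrier};
        /* A worker that never starts would leave the others stuck at the
           barrier, so this sketch simply gives up. */
        if (pthread_create(&tids[w], NULL, smooth_worker, &tasks[w]) != 0)
            abort();
    }
    for (int w = 0; w < workers; w++)
        pthread_join(tids[w], NULL);
    /* After an odd number of rounds the newest values are in scratch. */
    if (rounds % 2 == 1)
        memcpy(data, scratch, n * sizeof *data);
    pthread_barrier_destroy(&barrier);
    free(scratch); free(tids); free(tasks);
    return 0;
}
```

A caller would use something like `smooth(data, n, 1000, 4)` and build with `-pthread`. Because each round ends at the barrier, a round costs as much as its slowest thread, which matches the behaviour seen when there are more threads than cores.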
### 3000 Smooth

* worker = 1
![](https://i.imgur.com/3TMGNEH.png)
* worker = 8
![](https://i.imgur.com/3kWG5gN.png)

Speedup = $\dfrac{372}{106}\approx 3.51$

## Difficulty

When creating a thread, its argument has to be passed as a pointer, but I forgot to allocate separate memory for that argument. I was stuck on this for a long time.
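The sketch below shows one common way to avoid this kind of problem: give every thread its own argument slot, and keep that slot alive until the thread has been joined. The exact bug in my program is not shown here, so the `worker_arg` struct, `worker`, and `run_workers` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical argument type; the real one is not shown in this report. */
typedef struct {
    int id;       /* worker index, read by the thread */
    long result;  /* filled in by the thread */
} worker_arg;

static void *worker(void *p) {
    worker_arg *arg = p;   /* points at this thread's own slot */
    arg->result = (long)arg->id * arg->id;
    return NULL;
}

/* Starts n workers and returns the sum of their results, or -1 on failure. */
long run_workers(int n) {
    assert(n > 0);
    pthread_t *tids = malloc(n * sizeof *tids);
    /* One slot per thread, valid until after the join. Passing the address
       of the loop counter or of one shared struct would let later loop
       iterations overwrite what an earlier thread is about to read. */
    worker_arg *args = malloc(n * sizeof *args);
    if (!tids || !args) {
        free(tids); free(args);
        return -1;
    }
    int started = 0;
    for (; started < n; started++) {
        args[started].id = started;
        args[started].result = 0;
        if (pthread_create(&tids[started], NULL, worker, &args[started]) != 0)
            break;
    }
    long sum = 0;
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        sum += args[i].result;
    }
    int ok = (started == n);
    free(tids); free(args);
    return ok ? sum : -1;
}
```

For example, `run_workers(4)` starts four threads with ids 0 to 3 and adds up the squares they compute. Build with `-pthread`.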