# Multiprocessor Parallel Programming HW4
`AN4086056 黃祥瑋`
## Parallelization
I use multiple threads to parallelize the program. For this problem that is much faster than using a cluster, because communication is cheaper: the threads live in the same process and share memory, so there is no data transmission cost.
In this program, each thread is assigned a section of the work, and at the end of each round it waits for the other threads before continuing.
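
The scheme described above (one section per thread, a wait at the end of every round) can be sketched as follows. This is only a minimal illustration, not the homework's actual code: the names `task_t`, `section`, `worker_main` and `smooth_parallel` are hypothetical, the 3-point average on a 1-D array is a stand-in for the real smoothing kernel, and it assumes a POSIX system that provides `pthread_barrier_t`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-thread task; all names are illustrative. */
typedef struct {
    int id, workers;            /* this worker's index and the worker count */
    int n, rounds;              /* number of cells and number of rounds */
    double *src, *dst;          /* buffers read from / written to in round 0 */
    pthread_barrier_t *barrier; /* shared by all workers */
} task_t;

/* Half-open range [*lo, *hi) of cells owned by worker `id`. */
static void section(int n, int workers, int id, int *lo, int *hi) {
    assert(workers > 0 && id >= 0 && id < workers);
    int base = n / workers, extra = n % workers;
    *lo = id * base + (id < extra ? id : extra);
    *hi = *lo + base + (id < extra ? 1 : 0);
}

static void *worker_main(void *arg) {
    task_t *t = arg;
    double *src = t->src, *dst = t->dst;
    int lo, hi;
    section(t->n, t->workers, t->id, &lo, &hi);
    for (int r = 0; r < t->rounds; r++) {
        for (int i = lo; i < hi; i++) {
            /* stand-in kernel: 3-point average with clamped edges */
            double left = src[i > 0 ? i - 1 : i];
            double right = src[i < t->n - 1 ? i + 1 : i];
            dst[i] = (left + src[i] + right) / 3.0;
        }
        /* nobody starts the next round until everyone finished this one */
        pthread_barrier_wait(t->barrier);
        double *tmp = src; src = dst; dst = tmp;
    }
    return NULL;
}

/* Smooths data[0..n) in place for `rounds` rounds; returns 0 on success. */
int smooth_parallel(double *data, int n, int rounds, int workers) {
    if (n <= 0 || rounds < 0 || workers <= 0) return -1;
    double *scratch = malloc((size_t)n * sizeof *scratch);
    pthread_t *tids = malloc((size_t)workers * sizeof *tids);
    task_t *tasks = malloc((size_t)workers * sizeof *tasks);
    pthread_barrier_t barrier;
    if (!scratch || !tids || !tasks ||
        pthread_barrier_init(&barrier, NULL, (unsigned)workers) != 0) {
        free(scratch); free(tids); free(tasks);
        return -1;
    }
    for (int i = 0; i < workers; i++) {
        tasks[i] = (task_t){ i, workers, n, rounds, data, scratch, &barrier };
        /* a failed create would leave the other threads stuck at the
           barrier, so this sketch simply treats it as fatal */
        if (pthread_create(&tids[i], NULL, worker_main, &tasks[i]) != 0)
            abort();
    }
    for (int i = 0; i < workers; i++) pthread_join(tids[i], NULL);
    /* after an odd number of rounds the result sits in scratch */
    if (rounds % 2 == 1) memcpy(data, scratch, (size_t)n * sizeof *data);
    pthread_barrier_destroy(&barrier);
    free(scratch); free(tids); free(tasks);
    return 0;
}
```

Two buffers are swapped after every round, so a single barrier per round is enough: a thread only overwrites a buffer after all threads have finished reading it.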
## Usage
`$./h4_problem1 Nsmooth worker`
Example:
`$./h4_problem1 1000 4`
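
A minimal sketch of how the two arguments could be validated, assuming both must be positive decimal integers. The helper names `parse_positive` and `parse_args` are hypothetical and not taken from the submitted program.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a strictly positive decimal int; returns 0 on success, -1 otherwise. */
int parse_positive(const char *s, int *out) {
    assert(s != NULL && out != NULL);
    char *end;
    errno = 0;
    long v = strtol(s, &end, 10);
    /* reject overflow, empty input, trailing junk and non-positive values */
    if (errno != 0 || end == s || *end != '\0' || v <= 0 || v > INT_MAX)
        return -1;
    *out = (int)v;
    return 0;
}

/* Hypothetical helper: expects argv = { program, Nsmooth, worker }. */
int parse_args(int argc, char **argv, int *nsmooth, int *workers) {
    if (argc != 3) return -1;
    if (parse_positive(argv[1], nsmooth) != 0) return -1;
    return parse_positive(argv[2], workers);
}
```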
## Analysis
`p3` is the temporary name of the `h4_problem1` executable.
### 1000 Smooth
* worker = 1

* worker = 8

* worker = 16

* worker = 24

From these runs, 16 threads gives the best speedup among {1, 8, 16, 24}: $\dfrac{118}{40}=2.95$.
The main reason worker = 16 and worker = 24 do not speed up well is the number of hardware cores. If there are more threads than cores, some threads have to queue for a core. Since a barrier makes every thread wait for all the others, each round takes as long as the `Max thread execution time` in that round, so the program does not speed up well at large thread counts.
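
The speedup figure above, and how far it falls from ideal linear scaling, can be computed with a small helper. This is only an illustrative sketch: the function names are hypothetical, and efficiency (speedup divided by thread count) is a standard metric added here for context, not one reported in the original runs.

```c
#include <assert.h>

/* Speedup S = T1 / Tp: 1-worker time over p-worker time. */
double speedup(double t1, double tp) {
    assert(tp > 0.0);
    return t1 / tp;
}

/* Parallel efficiency E = S / p: fraction of ideal linear speedup reached. */
double efficiency(double t1, double tp, int p) {
    assert(p > 0);
    return speedup(t1, tp) / p;
}
```

With the timings above, `speedup(118, 40)` gives 2.95, while `efficiency(118, 40, 16)` is only about 0.18, which matches the observation that the extra threads are poorly used.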
### 2000 Smooth
* worker = 1

* worker = 4

* worker = 8

* worker = 12

The best speedup among {1, 4, 8, 12} is $\dfrac{234}{71}\approx 3.30$, at worker = 8. At worker = 12, the time cost is higher than at worker = 8.
## Summary
Comparing Smooth 1000 and 2000, the time cost of `2000` is not 2 times that of `1000`, because the program has serial parts that do not scale with the Smooth number. When the Smooth number is larger, we get a better speedup.
### 3000 Smooth
* worker = 1

* worker = 8

Speedup = $\dfrac{372}{106}\approx 3.51$
## Difficulty
When creating a thread, I have to pass it a pointer to its argument, but I forgot to allocate separate memory for that argument. I was stuck on this for a long time.
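
The failing code is not shown here, so the following is only a guess at the situation. A common form of this bug, assumed in this sketch, is handing every thread the address of the same variable (for example the loop counter), so the value changes before the thread reads it. One possible fix is to give each thread its own heap-allocated argument. The names `arg_t`, `record_id` and `spawn_workers` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical argument block: one separate copy per thread. */
typedef struct {
    int id;    /* this worker's index */
    int *out;  /* shared output array, one slot per worker */
} arg_t;

static void *record_id(void *p) {
    arg_t *a = p;
    a->out[a->id] = a->id;  /* each thread writes only its own slot */
    free(a);                /* the thread owns its argument and releases it */
    return NULL;
}

/* Starts `workers` threads; out must have room for `workers` ints.
   Returns 0 on success, -1 on failure. */
int spawn_workers(int workers, int *out) {
    assert(out != NULL);
    if (workers <= 0) return -1;
    pthread_t *tids = malloc((size_t)workers * sizeof *tids);
    if (!tids) return -1;
    int started = 0, status = 0;
    for (int i = 0; i < workers; i++) {
        /* new memory for every thread: passing &i instead would let all
           threads share one variable that keeps changing */
        arg_t *a = malloc(sizeof *a);
        if (!a) { status = -1; break; }
        a->id = i;
        a->out = out;
        if (pthread_create(&tids[i], NULL, record_id, a) != 0) {
            free(a);
            status = -1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++) pthread_join(tids[i], NULL);
    free(tids);
    return status;
}
```

An array of arguments that outlives the threads would work as well; the heap copy is just the variant closest to "assigning a new memory space".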