# Parallel Programming Assignment II
## Q1: In your write-up, produce a graph of speedup compared to the reference sequential implementation as a function of the number of threads used FOR VIEW 1. Is speedup linear in the number of threads used? In your writeup hypothesize why this is (or is not) the case?
> (You may also wish to produce a graph for VIEW 2 to help you come up with a good answer. Hint: take a careful look at the three-thread data-point.)
### Q1-1 Produce a graph of speedup compared to the reference sequential implementation as a function of the number of threads used FOR VIEW 1
#### View1

#### View2

### Q1-2 Is speedup linear in the number of threads used? In your writeup hypothesize why this is (or is not) the case?
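* No.
* Following the hint, I compared View 1 and View 2 at 3 threads. In View 1, the second block contains a much larger white region than the other blocks (white pixels are the ones that run the most iterations). I therefore believe the work assigned to each thread is uneven, and the thread with the heaviest block holds back the total run time. In View 2 the work is spread more evenly across the blocks, so its speedup can grow linearly with the number of threads.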
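
The imbalance can be checked with a small cost model. The sketch below is in Python rather than the assignment's C++ and is only an illustration: it assumes View 1 spans x in [-2, 1] and y in [-1, 1], uses a coarse 60 x 60 grid with a reduced iteration cap, and the helper names (`mandel_iters`, `row_costs`, `block_loads`, `speedup_bound`) are hypothetical. It sums the iteration counts per row, splits the rows into contiguous blocks, and reports the best speedup that split could reach if the run time is set by the busiest thread.

```python
def mandel_iters(cx, cy, max_iter=64):
    """Number of iterations before the point (cx, cy) escapes, capped at max_iter."""
    zx, zy = cx, cy
    for i in range(max_iter):
        if zx * zx + zy * zy > 4.0:
            return i
        zx, zy = zx * zx - zy * zy + cx, 2.0 * zx * zy + cy
    return max_iter


def row_costs(width=60, height=60, x0=-2.0, x1=1.0, y0=-1.0, y1=1.0, max_iter=64):
    """Total iteration count of every row (assumed View 1 bounds, coarse grid)."""
    dx = (x1 - x0) / width
    dy = (y1 - y0) / height
    return [
        sum(mandel_iters(x0 + i * dx, y0 + j * dy, max_iter) for i in range(width))
        for j in range(height)
    ]


def block_loads(costs, num_threads):
    """Work per thread when rows are split into contiguous blocks."""
    rows = len(costs)
    per_thread = rows // num_threads
    loads = []
    for tid in range(num_threads):
        start = tid * per_thread
        # The last thread also takes any leftover rows.
        end = rows if tid == num_threads - 1 else start + per_thread
        loads.append(sum(costs[start:end]))
    return loads


def speedup_bound(loads):
    """Best possible speedup if the busiest thread sets the run time."""
    return sum(loads) / max(loads)


if __name__ == "__main__":
    costs = row_costs()
    for n in (2, 3, 4):
        loads = block_loads(costs, n)
        print(n, "threads: loads =", loads, "bound =", round(speedup_bound(loads), 2))
```

In this model the middle block of the 3-thread split carries the largest load, which is consistent with the explanation above.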
## Q2 How do your measurements explain the speedup graph you previously created?
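With 2 threads, the image is split evenly into a top half and a bottom half, so the two threads take almost the same time. With 3 threads, the middle block contains more of the white region and therefore takes longer. As the figure shows, Thread-1 took the longest.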
> View1 Threads:2

> View1 Threads:3

<!-- > View1 Threads:4
 -->
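
Per-thread times like these can be collected by timing each worker from start to finish. The following is a minimal sketch of that pattern, written in Python for brevity (the assignment code is C++). `timed_worker`, `run_and_time`, and the dummy `work` callable are hypothetical names, and CPython threads do not run CPU-bound work in parallel, so only the measurement pattern carries over, not the numbers.

```python
import threading
import time


def timed_worker(tid, work, elapsed):
    """Run work(tid) and store its wall-clock duration in elapsed[tid]."""
    start = time.perf_counter()
    work(tid)
    elapsed[tid] = time.perf_counter() - start


def run_and_time(num_threads, work):
    """Start one thread per id, wait for all of them, return per-thread seconds."""
    elapsed = [0.0] * num_threads
    threads = [
        threading.Thread(target=timed_worker, args=(tid, work, elapsed))
        for tid in range(num_threads)
    ]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return elapsed


if __name__ == "__main__":
    # Dummy workload (assumption): thread 1 gets three times the work of the others.
    def work(tid):
        n = 300_000 if tid == 1 else 100_000
        sum(i * i for i in range(n))

    for tid, seconds in enumerate(run_and_time(3, work)):
        print(f"thread {tid}: {seconds * 1000:.2f} ms")
```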
## Q3: In your write-up, describe your approach to parallelization and report the final 4-thread speedup obtained.
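As discussed in the previous answers, splitting the image directly into N blocks (N = number of threads) can distribute the work unevenly. Instead, I split the image into H strips (H = image height, so one strip per row) and let the threads take the strips in turn: thread t handles rows t, t + N, t + 2N, and so on. This greatly reduces the load imbalance between threads.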
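
A minimal sketch of the two assignment schemes follows, in Python for brevity rather than the C++ used in the assignment. The row-cost profile is synthetic (an assumption standing in for View 1, with the expensive rows in the middle), and `synthetic_row_costs`, `block_rows`, `interleaved_rows`, and `imbalance` are hypothetical helper names. `imbalance` is the busiest thread's load divided by the average load, so 1.0 means a perfectly even split.

```python
def synthetic_row_costs(rows=1200):
    """Hypothetical cost profile: rows near the middle are the most expensive."""
    mid = (rows - 1) / 2.0
    return [
        1.0 + 9.0 * max(0.0, 1.0 - abs(j - mid) / (rows / 4.0))
        for j in range(rows)
    ]


def block_rows(rows, num_threads, tid):
    """Contiguous block of rows for thread tid (last thread takes the remainder)."""
    per_thread = rows // num_threads
    start = tid * per_thread
    end = rows if tid == num_threads - 1 else start + per_thread
    return range(start, end)


def interleaved_rows(rows, num_threads, tid):
    """Round-robin rows for thread tid: tid, tid + N, tid + 2N, ..."""
    return range(tid, rows, num_threads)


def imbalance(costs, assign, num_threads):
    """Busiest thread's load divided by the average load (1.0 is ideal)."""
    loads = [
        sum(costs[j] for j in assign(len(costs), num_threads, tid))
        for tid in range(num_threads)
    ]
    return max(loads) / (sum(loads) / num_threads)


if __name__ == "__main__":
    costs = synthetic_row_costs()
    for n in (2, 3, 4):
        print(
            n, "threads:",
            "block =", round(imbalance(costs, block_rows, n), 3),
            "interleaved =", round(imbalance(costs, interleaved_rows, n), 3),
        )
```

Under this model the interleaved split stays close to 1.0 while the 3- and 4-thread block splits do not, because every thread samples rows from both the cheap and the expensive parts of the image.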
> View1

> View2

## Q4 Now run your improved code with eight threads. Is performance noticeably greater than when running with four threads? Why or why not? (Notice that the workstation server provides 4 cores 4 threads.)

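With 8 threads, the program is actually slower than with 4 threads. I believe this is because the number of threads requested exceeds the number of hardware threads the server provides, which adds context-switch overhead and makes the run slower than with 4 threads.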
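
One way to frame this is a toy model in which at most `cores` threads make progress at once, so extra threads add switching cost without adding compute. The sketch below is in Python. `usable_workers`, `modeled_time`, and the `switch_cost` parameter are hypothetical, and the cost value is made up for illustration, not measured on the workstation.

```python
import os


def usable_workers(requested, cores=None):
    """Cap the worker count at the number of hardware threads."""
    if cores is None:
        cores = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(requested, cores))


def modeled_time(total_work, threads, cores, switch_cost=0.0):
    """Toy model: the work is shared by the threads that can run at once,
    and every thread beyond the core count adds a fixed switching cost."""
    running = max(1, min(threads, cores))
    extra = max(0, threads - cores)
    return total_work / running + switch_cost * extra


if __name__ == "__main__":
    for threads in (1, 2, 4, 8):
        t = modeled_time(total_work=100.0, threads=threads, cores=4, switch_cost=0.5)
        print(threads, "threads ->", t)
```

With any positive `switch_cost`, the modeled time for 8 threads on 4 cores is higher than for 4 threads, which matches the direction of the measured result.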