# Programming Assignment II: Multi-thread Programming
[toc]
### Q1
In view 1, the speedup does not grow linearly with the number of threads, whereas in view 2 it does. After examining the two views, I believe the cause is load imbalance among the threads. Looking closely at view 1 with 3 threads, the workload of thread 1 is much larger than that of the others, which may explain why the speedup decreases when the number of threads goes from 2 to 3.
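The load-imbalance argument can be illustrated with a small model. The sketch below is not the assignment code: `blockSpeedupBound` and the per-row cost vector are hypothetical, and the costs are synthetic rather than measured. It assumes each thread gets one contiguous block of rows and that the run finishes only when the slowest thread does, so the best possible speedup is the total work divided by the largest block's work.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: upper bound on speedup when rows are split into
// contiguous blocks, one block per thread. rowCost[i] is an assumed,
// strictly positive cost of computing row i (synthetic, not measured).
double blockSpeedupBound(const std::vector<double>& rowCost, int numThreads) {
    assert(numThreads > 0 && !rowCost.empty());
    const int height = static_cast<int>(rowCost.size());
    const int rowsPerThread = (height + numThreads - 1) / numThreads;  // ceiling division

    double total = 0.0;    // work of the serial run
    double slowest = 0.0;  // work of the most heavily loaded thread
    for (int t = 0; t < numThreads; ++t) {
        const int begin = std::min(height, t * rowsPerThread);
        const int end = std::min(height, begin + rowsPerThread);
        const double work =
            std::accumulate(rowCost.begin() + begin, rowCost.begin() + end, 0.0);
        total += work;
        slowest = std::max(slowest, work);
    }
    // The parallel run takes as long as its slowest thread.
    return total / slowest;
}
```

With a synthetic profile whose expensive rows sit in the middle, such as `{1,1,1,1,8,8,8,8,1,1,1,1}`, two threads split the expensive rows evenly, while with three threads the middle block receives all of them, so the bound drops even though a thread was added.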



### Q2
Comparing the runs with 2 threads and 3 threads, the per-thread elapsed time is balanced with 2 threads but imbalanced with 3 threads. This observation strongly supports my hypothesis in question 1.
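One way to collect per-thread elapsed times like these is to wrap each thread's work in a timer. The following is a minimal sketch under my own assumptions: `timeThreads` is a hypothetical helper, the tasks are placeholders for each thread's share of the image, and `std::chrono::steady_clock` stands in for whatever timer the assignment framework provides.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: runs tasks[t] on its own thread and returns the
// elapsed wall-clock time of each thread in seconds.
std::vector<double> timeThreads(const std::vector<std::function<void()>>& tasks) {
    assert(!tasks.empty());
    std::vector<double> seconds(tasks.size(), 0.0);
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < tasks.size(); ++t) {
        workers.emplace_back([&tasks, &seconds, t] {
            const auto start = std::chrono::steady_clock::now();
            tasks[t]();  // this thread's share of the work
            const auto stop = std::chrono::steady_clock::now();
            // Each thread writes only its own slot, so no lock is needed.
            seconds[t] = std::chrono::duration<double>(stop - start).count();
        });
    }
    for (auto& w : workers) w.join();
    return seconds;
}
```

If the returned times differ widely between threads, the partitioning is imbalanced, which is the pattern described above for 3 threads.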


### Q3
The modified approach partitions the image into single rows instead of contiguous blocks. Each thread starts at the row given by its threadId and steps through the image at a fixed stride until it reaches the image height. This makes the workload more balanced than in the original approach. As a consequence, the speedup reaches up to 3.7 in both views.
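The row-wise partitioning can be sketched as follows. This is an illustration rather than the submitted code: it assumes the stride equals the number of threads, so thread `t` handles rows `t`, `t + numThreads`, `t + 2 * numThreads`, and so on. `processRow`, `workerInterleaved`, and `runInterleaved` are hypothetical names, and `processRow` only records which thread owns a row instead of computing Mandelbrot pixels.

```cpp
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for the per-row kernel: records the owning thread.
void processRow(int row, int threadId, std::vector<int>& owner) {
    owner[row] = threadId;
}

// Thread `threadId` handles rows threadId, threadId + numThreads, ...
// (assumed stride: the number of threads).
void workerInterleaved(int threadId, int numThreads, int height,
                       std::vector<int>& owner) {
    for (int row = threadId; row < height; row += numThreads) {
        processRow(row, threadId, owner);
    }
}

// Spawns numThreads - 1 workers, runs thread 0's share on the calling
// thread, and returns the row-to-thread assignment.
std::vector<int> runInterleaved(int numThreads, int height) {
    assert(numThreads > 0 && height >= 0);
    std::vector<int> owner(height, -1);
    std::vector<std::thread> workers;
    for (int t = 1; t < numThreads; ++t) {
        workers.emplace_back(workerInterleaved, t, numThreads, height,
                             std::ref(owner));
    }
    workerInterleaved(0, numThreads, height, owner);
    for (auto& w : workers) w.join();
    return owner;
}
```

Because neighboring rows tend to have similar cost, giving each thread every `numThreads`-th row spreads both the expensive and the cheap regions across all threads.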


### Q4
There is no improvement in speedup when the number of threads increases from 4 to 8. The workstation provides 4 cores and 4 hardware threads, so 4 threads already reach the upper bound that this hardware supports. Worse, beyond 4 threads the overhead of thread creation and context switching grows while little additional parallelism is gained.
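A program can guard against oversubscription by capping the requested thread count at what the hardware reports. The sketch below is one possible approach, not part of the assignment; `clampThreadCount` is a hypothetical helper. `std::thread::hardware_concurrency()` is only a hint and may return 0 when the value cannot be determined, in which case the request is left unchanged.

```cpp
#include <algorithm>
#include <cassert>
#include <thread>

// Hypothetical helper: limits the requested number of threads to the number
// of hardware threads reported by the standard library.
unsigned clampThreadCount(unsigned requested) {
    assert(requested > 0);
    const unsigned hw = std::thread::hardware_concurrency();  // 0 if unknown
    if (hw == 0) {
        return requested;  // no information available, keep the request
    }
    return std::min(requested, hw);
}
```

On a machine with 4 hardware threads, a request for 8 threads would be reduced to 4, which matches the point where the measured speedup stops improving.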
