
# HW2 Solutions

**Q1 :** In your write-up, produce a graph of speedup compared to the reference sequential implementation as a function of the number of threads used FOR VIEW 1. Is speedup linear in the number of threads used? In your write-up, hypothesize why this is (or is not) the case.

**Ans :**

```vega
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Thread-wise Speedup data.",
  "data": {
    "values": [
      {"Thread Count": "1", "Speedup": 1.0},
      {"Thread Count": "2", "Speedup": 1.94},
      {"Thread Count": "3", "Speedup": 1.57},
      {"Thread Count": "4", "Speedup": 2.54}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {"field": "Thread Count", "type": "ordinal"},
    "y": {"field": "Speedup", "type": "quantitative"}
  }
}
```

No, the speedup is not linear. It even drops from 1.94x at 2 threads to 1.57x at 3 threads, before rising to 2.54x at 4 threads.

**Q2 :** How do your measurements explain the speedup graph you previously created?

**Ans :**

*With two threads:*

![](https://i.imgur.com/J2Ph29H.png)

*With three threads:*

![](https://i.imgur.com/wTTOWkM.png)

*With four threads:*

![](https://i.imgur.com/vDMZuHg.png)

As the number of threads increases from 2 to 4, the *spatial decomposition* appears to increase the load imbalance among threads. I do not have a clear hypothesis for this behaviour. It happens even though the implementation uses no synchronization.

**Q3 :** In your write-up, describe your approach to parallelization and report the final 4-thread speedup obtained.

**Ans :**

![](https://i.imgur.com/vDMZuHg.png)

The best speedup I obtained with 4 threads is **2.54x**. My implementation is based only on *spatial decomposition* and avoids any synchronization.

**Q4 :** Now run your improved code with eight threads. Is performance noticeably greater than when running with four threads? Why or why not?

**Ans :** With eight threads, the speedup improves to **3.12x**, compared with 2.54x at four threads.

![](https://i.imgur.com/7W14TCA.png)
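
One way to quantify how far the speedups reported in Q1 and Q4 are from linear is parallel efficiency, i.e. speedup divided by thread count (1.0 would be perfectly linear scaling). The following is a minimal sketch that uses only the numbers reported above; the names `reported_speedup` and `parallel_efficiency` are illustrative and not part of the assignment code.

```python
# Speedups reported in the answers above:
# thread count -> speedup over the sequential reference.
reported_speedup = {1: 1.0, 2: 1.94, 3: 1.57, 4: 2.54, 8: 3.12}


def parallel_efficiency(speedup: float, threads: int) -> float:
    """Return the fraction of ideal linear speedup achieved (speedup / threads)."""
    if threads <= 0:
        raise ValueError("threads must be positive")
    return speedup / threads


# Efficiency for every measured thread count.
efficiency = {n: parallel_efficiency(s, n) for n, s in reported_speedup.items()}

for n, e in efficiency.items():
    print(f"{n} threads: speedup {reported_speedup[n]:.2f}x, efficiency {e:.3f}")
```

With these measurements the efficiency is 0.97 at 2 threads, about 0.52 at 3 threads, 0.635 at 4 threads, and 0.39 at 8 threads, which matches the observation that scaling is far from linear and dips at 3 threads.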