# CS267 HW2

- [x] The names of the people in your group and each member's contribution.
- [x] A plot in log-log scale that shows that your serial and parallel codes run in O(n) time and a description of the data structures that you used to achieve it.
- [x] A description of the synchronization you used in the shared memory implementation.
- [x] A description of the design choices that you tried and how they affected the performance.
- [x] Speedup plots that show how closely your OpenMP code approaches the idealized p-times speedup and a discussion on whether it is possible to do better.
- [x] Where does the time go? Consider breaking down the runtime into computation time, synchronization time and/or communication time. How do they scale with p?

# Contributions

- Yuhong Chen
- Jie Qiu
- Ziyao Zhang

## Compute Forces

Scan the neighboring bins: particles are binned into a uniform grid whose bin width is at least the cutoff radius, so every possible interaction partner of a particle lies in its own bin or one of the eight adjacent bins. Scanning only that 3×3 neighborhood keeps the work per particle O(1) and the total work O(n). (A sketch of this scan appears at the end of this write-up.)

## Move

Update all particles in parallel. Each thread writes its moved particle `i` into `new_particles[i]`, so no two threads ever write the same slot. (Sketches of both alternatives appear at the end of this write-up.)

### Alternative 1: With barrier between apply_force and move

All forces are computed first; a barrier then guarantees that every `apply_force` has finished before any particle moves, so force computations never read a half-updated position:

```cpp
new_particles[i] = particles[i];  // after the barrier, all forces on i are final
move(new_particles[i], size);     // particles[] itself stays read-only
```

### Alternative 2: No barrier, random access

Each thread moves a particle as soon as all forces on it have been accumulated. Because moved positions go into a separate buffer, threads still reading `particles` for force computation always see the old, consistent positions:

```cpp
particle new_particles[num_parts];  // double buffer shared by all threads
new_particles[i] = particles[i];    // no barrier: forces on i are already final
move(new_particles[i], size);       // other threads may still read particles[j]
```

## Rebin

After the move step, particles that crossed a bin boundary must be rebinned. During the move, each departure is recorded by particle id in `todo_remove` and each arrival in `todo_add`; the pass below then rebuilds every bin from its survivors plus its arrivals. Note that the membership test keeps only particles that are *not* scheduled for removal:

```cpp
// particle particles[num_parts];
// std::vector<particle> grid[num_bins];            // current binning
particle new_particles[num_parts];                  // positions after the move step
std::vector<particle> new_grid[num_bins];
std::vector<std::vector<particle>> todo_add;        // per bin: particles entering it
std::vector<std::unordered_set<int>> todo_remove;   // per bin: ids of particles leaving it

// Bins are rebuilt independently, so this loop can itself be an `omp for`.
for (int bin = 0; bin < num_bins; bin++) {
    // Keep the old particles that did not leave this bin
    for (const auto& p : grid[bin]) {
        if (todo_remove[bin].find(p.id) == todo_remove[bin].end()) {
            new_grid[bin].push_back(new_particles[p.id]);
        }
    }
    // Append the particles that moved into this bin
    new_grid[bin].insert(new_grid[bin].end(),
                         todo_add[bin].begin(), todo_add[bin].end());
    grid[bin].swap(new_grid[bin]);  // install the rebuilt bin
    new_grid[bin].clear();          // leave scratch space empty for the next step
}
```
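The pass above consumes `todo_add` and `todo_remove`, which have to be populated during the move step. A minimal sketch of one way to do that follows; the bin-lookup helper `get_bin` and the per-bin lock array `bin_locks` are illustrative assumptions, not part of the assignment's starter code:

```cpp
#include <omp.h>

int get_bin(const particle& p);   // hypothetical: maps a position to a bin index

omp_lock_t bin_locks[num_bins];   // assumed: omp_init_lock() called on each at startup

// Inside the parallel region, once all forces on particle i are final:
#pragma omp for
for (int i = 0; i < num_parts; i++) {
    new_particles[i] = particles[i];
    move(new_particles[i], size);

    int old_bin = get_bin(particles[i]);
    int new_bin = get_bin(new_particles[i]);
    if (old_bin != new_bin) {
        // Per-bin locks serialize only the threads touching the same bin.
        omp_set_lock(&bin_locks[old_bin]);
        todo_remove[old_bin].insert(new_particles[i].id);
        omp_unset_lock(&bin_locks[old_bin]);

        omp_set_lock(&bin_locks[new_bin]);
        todo_add[new_bin].push_back(new_particles[i]);
        omp_unset_lock(&bin_locks[new_bin]);
    }
}
```

A single `#pragma omp critical` around both updates would also be correct, but it serializes every bin change across all threads, whereas per-bin locks only contend when two particles cross into or out of the same bin at the same time.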
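For completeness, here is a minimal sketch of the neighbor scan described under Compute Forces. The square bin layout (`bins_per_row`), the accumulation into the canonical `particles` array, and the `apply_force(particle&, particle&)` signature are assumptions based on the usual starter-code conventions, not a transcript of our code:

```cpp
#pragma omp for
for (int bin = 0; bin < num_bins; bin++) {
    int row = bin / bins_per_row;
    int col = bin % bins_per_row;
    for (const auto& p : grid[bin]) {
        particle& pi = particles[p.id];  // accumulate into the canonical array
        pi.ax = pi.ay = 0;
        // Scan this bin and its eight neighbors; with bin width >= cutoff,
        // no particle farther away can interact with pi.
        for (int dr = -1; dr <= 1; dr++) {
            for (int dc = -1; dc <= 1; dc++) {
                int r = row + dr, c = col + dc;
                if (r < 0 || r >= bins_per_row || c < 0 || c >= bins_per_row)
                    continue;
                for (auto& q : grid[r * bins_per_row + c]) {
                    if (q.id != pi.id)
                        apply_force(pi, q);  // no-op beyond the cutoff radius
                }
            }
        }
    }
}
```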
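Finally, a sketch contrasting the two move alternatives. `compute_forces_for(i)` is a hypothetical wrapper that accumulates all forces on particle `i` by scanning its bin neighborhood (the per-particle analogue of the scan above); the only difference between the variants is whether a barrier separates the force phase from the move phase:

```cpp
// Alternative 1: barrier between apply_force and move. The implicit barrier
// at the end of the first `omp for` guarantees every force is final before
// any particle moves.
#pragma omp for
for (int i = 0; i < num_parts; i++)
    compute_forces_for(i);           // hypothetical per-particle force wrapper
#pragma omp for
for (int i = 0; i < num_parts; i++) {
    new_particles[i] = particles[i];
    move(new_particles[i], size);
}

// Alternative 2: no barrier. A thread moves particle i as soon as i's own
// forces are done. This is safe only because moves write to new_particles:
// threads still computing forces read unmodified positions from particles[].
#pragma omp for
for (int i = 0; i < num_parts; i++) {
    compute_forces_for(i);
    new_particles[i] = particles[i];
    move(new_particles[i], size);
}
```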