---
tags: SCC_Camp
---
# SCC Homework 1 report
---
## Algorithm
----
### odd-even sort part
----
```
// pseudo code
read the array from file
sort the array                        // locally, on each node
for i = 1 to (# of nodes) / 2         // which is (# of sections) / 2
    if at even node
        if i is even                  // even phase
            send the last element in array to the right node,
            and get the first element of the right node's array; call it preload
            if preload < the last element in array
                merge the array with the right node's array
                and keep the smaller half
        else                          // odd phase
            send the first element in array to the left node,
            and get the last element of the left node's array; call it preload
            if preload > the first element in array
                merge the array with the left node's array
                and keep the larger half
    else if at odd node
        if i is even                  // even phase
            send the first element in array to the left node,
            and get the last element of the left node's array; call it preload
            if preload > the first element in array
                merge the array with the left node's array
                and keep the larger half
        else                          // odd phase
            send the last element in array to the right node,
            and get the first element of the right node's array; call it preload
            if preload < the last element in array
                merge the array with the right node's array
                and keep the smaller half
```
----
**timeline**
- classic version
- classic version, but bounded to world_size/2 rounds
- preload version (skip the merge when the boundary elements are already in order)
----
### the merge part
Traverse both arrays, tracking the current element of each; at every step take the smaller of the two and append it to the output array.

> trick: swap the array/buffer pointers instead of copying the merged result back
----
**Advance**
While searching for merge algorithms, I found that there is actually a parallel merge algorithm. Below is the pseudocode.
[wiki merge algorithm reference](https://en.wikipedia.org/wiki/Merge_algorithm)
----
```
algorithm merge(A[i...j], B[k...ℓ], C[p...q]) is
    inputs A, B, C : array
           i, j, k, ℓ, p, q : indices
    let m = j - i,
        n = ℓ - k
    if m < n then
        swap A and B  // ensure that A is the larger array: i, j still belong to A; k, ℓ to B
        swap m and n
    if m ≤ 0 then
        return  // base case, nothing to merge
    let r = ⌊(i + j)/2⌋
    let s = binary-search(A[r], B[k...ℓ])
    let t = p + (r - i) + (s - k)
    C[t] = A[r]
    in parallel do
        merge(A[i...r], B[k...s], C[p...t])
        merge(A[r+1...j], B[s...ℓ], C[t+1...q])
```
----
### the send and receive part
I also tried other versions of send & recv.
----
1. Call `MPI_Send` and `MPI_Recv` separately
- this adds some latency, because the second call cannot start until the first one returns
```
MPI_Send(...)
MPI_Recv(...)
```
----
2. Call `MPI_Isend` and `MPI_Irecv` separately, then wait on the requests
- this performed better than the above, because the second call does not have to wait for the first to fully complete before starting
- but the code after the wait still has to wait until both operations are done
```
MPI_Isend(...)
MPI_Irecv(...)
...
MPI_Wait(...)
```
----
3. Use `MPI_Sendrecv`
- this outperformed the above two, thanks to MPI's internal implementation of `Sendrecv`
```
MPI_Sendrecv(...)
```
----
> A send-receive operation is useful for executing a shift operation across a chain of processes. If blocking sends and receives are used for such a shift, then one needs to order the sends and receives correctly (for example, even processes send, then receive; odd processes receive first, then send) in order to prevent cyclic dependencies that may lead to deadlock. **When a send-receive operation is used, the communication subsystem takes care of these issues.**
- https://www.open-mpi.org/doc/v4.0/man3/MPI_Sendrecv.3.php
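The ordering the man page describes can be sketched like this (pseudocode, not the actual homework code):

```
// shift across a chain with blocking calls: order by parity to avoid deadlock
if my rank is even
    MPI_Send(..., right, ...)
    MPI_Recv(..., right, ...)
else
    MPI_Recv(..., left, ...)
    MPI_Send(..., left, ...)

// with MPI_Sendrecv, the library schedules both directions safely in one call
MPI_Sendrecv(...)
```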
---
## Makefile
----
### Compiler Optimization
----
- Use `g++`
I tested `-O1`, `-O2` and `-O3`; the `-O3` version is the fastest, and it improves performance a lot.
I also tested `-Ofast`, but it enables some non-standard-compliant optimizations and produces wrong answers.
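A sketch of how this looks in the Makefile (the target and file names `hw1` / `hw1.cc` are hypothetical stand-ins):

```makefile
CXX      = mpicxx   # MPI compiler wrapper around g++
CXXFLAGS = -O3      # -Ofast gave wrong answers, so stay with -O3

hw1: hw1.cc
	$(CXX) $(CXXFLAGS) -o $@ $<
```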
---
## Others
----
### spreadsort
Boost provides `spreadsort`, which heuristically mixes bucket and comparison sorting.
[spreadsort wiki](https://en.wikipedia.org/wiki/Spreadsort)
https://stackoverflow.com/questions/21251061/does-c-algorithm-boost-lib-have-radix-sort
Switching to it increases the performance a lot.
----
### PMPI
[PMPI stackoverflow](https://stackoverflow.com/questions/43002936/what-are-the-differences-between-the-mpi-functions-and-pmpi-functions?rq=1)
Calling the `PMPI_` functions directly doesn't improve performance much, since the `MPI_` names are just weak symbols pointing at their `PMPI_` counterparts (MPI -> PMPI).
---
## Thanks
----