Parallel programming HW4
Q1.1: How do you control the number of MPI processes on each node?
- mpirun launches processes according to the hostfile, assigning them to the listed machines one by one. The number of processes placed on each node is controlled by the hostfile entries (e.g. a `slots=` count per host in Open MPI) together with the total process count given by `-np`.
Q1.2: Which functions do you use for retrieving the rank of an MPI process and the total number of processes?
- MPI_Comm_size is used to get the total number of processes in the communicator.
- MPI_Comm_rank is used to get the rank of the calling process.
Q2.1: Why are MPI_Send and MPI_Recv called “blocking” communication?
- Because the call does not return until the operation is complete from the caller's point of view: MPI_Recv blocks until the message has arrived, and MPI_Send blocks until the send buffer can safely be reused. Since the result must reach the main machine, the communication is effectively synchronous, and the program is blocked at each call until it finishes.

Number of processes | 2 | 4 | 8 | 12 | 16
Execution time (s) | 9.908397 | 5.456350 | 2.688769 | 1.824668 | 1.364209

Number of processes | 2 | 4 | 8 | 16
Execution time (s) | 9.839091 | 5.292457 | 2.740805 | 1.832486
- The performance saw little improvement when binary-tree reduction was used.
- The linear approach may perform better here.
- The tree approach needs more rounds of message transfer, and each transfer costs much more time than the CPU calculation it saves.

Number of processes | 2 | 4 | 8 | 12 | 16
Execution time (s) | 9.797364 | 5.305192 | 2.683531 | 2.438963 | 1.862100
Q4.2: What are the MPI functions for non-blocking communication?
- send: MPI_Isend
- receive: MPI_Irecv
- Non-blocking communication performs better when the number of processes is small, but becomes worse than the blocking version as the number of processes increases.

Number of processes | 2 | 4 | 8 | 12 | 16
Execution time (s) | 10.049845 | 5.587510 | 3.200731 | 2.257737 | 2.097789

Number of processes | 2 | 4 | 8 | 12 | 16
Execution time (s) | 10.105532 | 5.587154 | 2.961666 | 2.046166 | 2.500450
Q7: Describe what approach(es) were used in your MPI matrix multiplication for each data set.
- Distribute the rows of matrix A among the nodes according to their MPI rank, while sending the whole matrix B to every node.
- Each node calculates the partial result for its half-open row range [lo, hi).
- Every node sends its partial result back to node 0, which assembles and outputs the final matrix.