# PHAS0100 - week 10 (25th March 2022)
:::info
## :tada: Welcome to the 10th and final live-class!
### Today
Today we will
- Recap shared memory parallelism
- Intro to the Message Passing Interface (MPI)
- Work through OpenMPI examples:
    - Hello world-type with/without Catch2
    - Point-to-point communications
    - Collective operations: scatter and reduce
- Finish with a summary of the course
:::
### Timetable for today
| Start | End   | Topic | Room |
| ----- | ----- | ----- | ---- |
| 14:05 | 14:25 | Intro shared memory + MPI | Main |
| 14:25 | 14:50 | [Install OpenMPI + hello/helloCatch](#Breakout-session-1) | Breakouts |
| 14:50 | 15:00 | Break | --- |
| 15:00 | 15:20 | Collective communications | Main |
| 15:20 | 15:50 | [Scatter and reduce examples](#Breakout-session-2) | Breakouts |
| 15:50 | 16:00 | Break | --- |
| 16:00 | 16:20 | Point-to-point communications | Main |
| 16:20 | 16:45 | [Ring pattern and SendRecv](#Breakout-session-3) | Breakouts |
| 16:45 | 17:00 | Closeout and questions | Main |
# Breakout session 1
## Class Exercise 1
**OpenMPI libraries and hello.cpp**
1. First clone the repository for today's class
``` shell
git clone https://github.com/jeydobson/phas0100_week10_mpi.git
```
2. Open the folder in VS Code and then `Reopen in Container` to build the updated Dockerfile image with OpenMPI libraries
:::spoiler alternative
If you prefer to install the libraries in a running container instead:
``` shell
sudo apt-get update
sudo apt-get install openmpi-bin libopenmpi-dev
```
:::
3. Check that the OpenMPI libraries are installed:
``` shell
mpiexec --version
mpicxx --version
```
:::warning
As we will run cmake steps manually from the terminal it is useful to disable VS Code's automatic re-run of cmake every time a CMakeLists.txt file is saved: open the VS Code extensions tab -> `CMake Tools` extension settings -> Workspace -> uncheck the `Configure on Edit` setting.
:::
4. Add the relevant CMake commands to `phas0100_week10_mpi/hello/CMakeLists.txt` to compile `hello.cpp`
Note: to leave time for the later parts of this exercise, take a look at the hints below if this step takes more than a few minutes
:::spoiler hints
Use `add_executable` and `target_link_libraries`
The `MPI_CXX_LIBRARIES` variable is set by `find_package` and points to the MPI libraries. If that doesn't help, see `hello/solution`
:::
5. Compile as usual
``` shell
cd phas0100_week10_mpi
mkdir build
cd build
cmake ..
make
```
6. Now from within `build` try running:
``` shell
mpiexec ./hello/hello
mpiexec -n <num_processors> ./hello/hello
mpiexec -n <num_processors> --oversubscribe ./hello/hello
```
* Try out a few different values for `<num_processors>`, e.g. 1, 2 and 10. The `--oversubscribe` option is necessary if `<num_processors>` is greater than the number of processors available on your system.
* Take a look at `hello.cpp` and discuss how running with different `<num_processors>` affects the output. Make sure you understand `MPI_Comm_rank` and `MPI_Comm_size` (a sketch is given after this exercise).
* What happens if you run the executable without `mpiexec`?
7. If you have time, look at the https://www.open-mpi.org/doc/v3.1/ docs and see if the documentation for `MPI_Comm_rank` and `MPI_Comm_size` makes sense. Can you get the communicator name with `MPI_Comm_get_name`?
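Every MPI program follows the same basic pattern: initialise, query the rank and size, do some work, finalise. Below is a minimal sketch (not necessarily identical to the repository's `hello.cpp`) showing `MPI_Comm_rank`, `MPI_Comm_size` and `MPI_Comm_get_name`:
``` cpp
// Minimal sketch only: prints each process's rank, the total number of
// processes, and the communicator name.
#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);                     // start the MPI runtime

    int rank = 0, size = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);       // id of this process (0..size-1)
    MPI_Comm_size(MPI_COMM_WORLD, &size);       // total number of processes

    char name[MPI_MAX_OBJECT_NAME];
    int name_len = 0;
    MPI_Comm_get_name(MPI_COMM_WORLD, name, &name_len);  // usually "MPI_COMM_WORLD"

    std::cout << "Hello from rank " << rank << " of " << size
              << " on communicator " << name << std::endl;

    MPI_Finalize();                             // shut down the MPI runtime
    return 0;
}
```
Running this with `mpiexec -n 4` should print four lines, one per rank, in an arbitrary order.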
## Class Exercise 2
**helloCatch.cpp**
1. Add relevant CMake commands to `phas0100_week10_mpi/hello/CMakeLists.txt` to compile `helloCatch.cpp`
2. As with `hello.cpp` run cmake, make
3. Try running with `mpiexec` with various numbers of processes (a sketch of a Catch2 + MPI `main` is given at the end of this exercise)
:::spoiler Optional if time left over
4. Take a look at the https://www.open-mpi.org/doc/v3.1/ docs and see if the documentation for `MPI_Comm_rank` and `MPI_Comm_size` makes sense. Can you use `MPI_Comm_get_name` to get the name of the communicator?
:::
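A common way to combine Catch2 with MPI is to supply a custom `main` that wraps the test run between `MPI_Init` and `MPI_Finalize`. The repository's `helloCatch.cpp` may be organised differently; this is just a sketch assuming the Catch2 v2 single header:
``` cpp
// Sketch only: a custom Catch2 (v2) main that initialises MPI before any tests run.
#define CATCH_CONFIG_RUNNER
#include <catch2/catch.hpp>
#include <mpi.h>

int main(int argc, char* argv[]) {
    MPI_Init(&argc, &argv);
    int result = Catch::Session().run(argc, argv);  // run all TEST_CASEs
    MPI_Finalize();
    return result;
}

TEST_CASE("Each rank reports a valid id", "[mpi]") {
    int rank = 0, size = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    REQUIRE(rank >= 0);
    REQUIRE(rank < size);
}
```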
# Breakout session 2
## Class Exercise 3
**Scatter a message with MPI_Scatter**
1. Uncomment `add_subdirectory(collective)` in the top-level `phas0100_week10_mpi/CMakeLists.txt` file so that cmake picks up the `collective` folder, then add the relevant commands to `phas0100_week10_mpi/collective/CMakeLists.txt` to compile `scatter.cpp`
2. Use the `MPI_Scatter` method to split `"This message is going to come out in separate channels"` into evenly sized chunks based on the number of processes and send one chunk to each process (see the sketch below)
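For reference, here is a hedged sketch of how `MPI_Scatter` can split a string across ranks. It assumes the message length divides evenly by the number of processes and that rank 0 is the root; the exercise's `scatter.cpp` may handle these details differently:
``` cpp
// Sketch only: scatter equal-sized chunks of a string from rank 0 to all ranks.
// Any characters left over by the integer division are simply dropped here.
#include <mpi.h>
#include <iostream>
#include <string>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank = 0, size = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const std::string message = "This message is going to come out in separate channels";
    const int chunk = static_cast<int>(message.size()) / size;  // characters per process

    std::vector<char> recv(chunk);
    MPI_Scatter(message.data(), chunk, MPI_CHAR,   // send buffer only matters on the root
                recv.data(), chunk, MPI_CHAR,
                0, MPI_COMM_WORLD);                // rank 0 is the root

    std::cout << "Rank " << rank << " got: "
              << std::string(recv.begin(), recv.end()) << std::endl;

    MPI_Finalize();
    return 0;
}
```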
## Class Exercise 4
**Sum results from different processes with MPI_Reduce**
In this example the aim is to calculate pi using the Gregory and Leibniz formula:
```
pi/4 = 1 - 1/3 + 1/5 - 1/7 + ...
```
but we will split the series into chunks, compute each chunk on a separate process, and then use `MPI_Reduce` to sum the partial results on the root process (rank 0).
1. Add relevant commands to `phas0100_week10_mpi/collective/CMakeLists.txt` to compile `reduce.cpp`
2. Use the `MPI_Reduce` method to sum the results from each process to calculate pi/4 (see the sketch below). A list of the pre-defined reduction operators can be found further down the page at https://www.open-mpi.org/doc/v3.1/man3/MPI_Reduce.3.php
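A possible structure, shown only as a sketch (the term count and variable names are illustrative, not taken from `reduce.cpp`):
``` cpp
// Sketch only: each rank sums its own slice of the Gregory-Leibniz series,
// then MPI_Reduce adds the partial sums on rank 0.
#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank = 0, size = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const long terms_per_rank = 1000000;       // terms handled by each process (illustrative)
    const long start = rank * terms_per_rank;  // this rank's slice of the series

    double partial = 0.0;
    for (long n = start; n < start + terms_per_rank; ++n) {
        // n-th term of pi/4 = 1 - 1/3 + 1/5 - 1/7 + ...
        partial += (n % 2 == 0 ? 1.0 : -1.0) / (2.0 * n + 1.0);
    }

    double total = 0.0;
    MPI_Reduce(&partial, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        std::cout << "pi is approximately " << 4.0 * total << std::endl;
    }

    MPI_Finalize();
    return 0;
}
```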
# Breakout session 3
## Class Exercise 5
**ring with blocking synchronous MPI_Ssend**
0. Throughout, use https://www.open-mpi.org/doc/v3.1/ to look up the MPI method documentation
1. Uncomment `add_subdirectory(point2point)` in the top-level `phas0100_week10_mpi/CMakeLists.txt` file so that cmake picks up the `point2point` folder, then add the relevant commands to `phas0100_week10_mpi/point2point/CMakeLists.txt` to compile `ring.cpp`
2. For even-ranked processes: implement an `MPI_Ssend` to send `message` from the current process to the one on its left
3. For even-ranked processes: implement an `MPI_Recv` to receive the `message` from the process on the right
4. Then for odd-ranked processes: first call `MPI_Recv`
5. And then `MPI_Ssend`
6. Check that the final REQUIRE passes. What happens if you reverse the order of steps 4 and 5? (A sketch of the even/odd pattern is given at the end of this exercise.)
:::info
You can use the `-c` option to tell Catch2 to run only the tests for a specific SECTION, like this:
``` shell
mpiexec --oversubscribe -n 4 point2point/ring -c "Blocking synchronous"
```
:::
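For reference, here is a sketch of the even/odd send-first/receive-first pattern, assuming "left" means `rank - 1` and "right" means `rank + 1` modulo the number of processes; check `ring.cpp` for the actual message type and neighbour convention:
``` cpp
// Sketch only: blocking synchronous ring exchange.
// Run with at least 2 processes, e.g. mpiexec --oversubscribe -n 4 ./ring_sketch
#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank = 0, size = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int left  = (rank - 1 + size) % size;   // neighbour we send to
    const int right = (rank + 1) % size;          // neighbour we receive from

    int message = rank;                           // each rank passes its own id round the ring
    int received = -1;

    if (rank % 2 == 0) {
        // Even ranks: send first, then receive
        MPI_Ssend(&message, 1, MPI_INT, left, 0, MPI_COMM_WORLD);
        MPI_Recv(&received, 1, MPI_INT, right, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    } else {
        // Odd ranks: receive first, then send
        MPI_Recv(&received, 1, MPI_INT, right, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Ssend(&message, 1, MPI_INT, left, 0, MPI_COMM_WORLD);
    }

    std::cout << "Rank " << rank << " received " << received
              << " from rank " << right << std::endl;

    MPI_Finalize();
    return 0;
}
```
Alternating the order between even and odd ranks matters because `MPI_Ssend` is synchronous: if every rank tried to send first, all of them would block waiting for a matching receive and the program would deadlock.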
## Class/Homework Exercise 6
**ring with asynchronous MPI_Isend**
Now implement the ring using the asynchronous `MPI_Isend` in the `Asynchronous` SECTION.
1. First use `MPI_Isend` to send for all processes.
2. Then use `MPI_Recv` to receive the messages for all processes. Because `MPI_Recv` is blocking, it also acts as a synchronisation barrier.
3. Use `MPI_Test` to check the `MPI_Request` handle returned by `MPI_Isend`. Confirm that the completion flag is always set to true (see the sketch below).
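A sketch of the non-blocking version, with the same hypothetical `rank - 1`/`rank + 1` neighbour convention as before:
``` cpp
// Sketch only: non-blocking ring exchange. All ranks post MPI_Isend immediately,
// then the blocking MPI_Recv; MPI_Test afterwards should report the send as complete.
#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank = 0, size = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int left  = (rank - 1 + size) % size;
    const int right = (rank + 1) % size;

    int message = rank;
    int received = -1;

    MPI_Request request;
    MPI_Isend(&message, 1, MPI_INT, left, 0, MPI_COMM_WORLD, &request);   // returns immediately
    MPI_Recv(&received, 1, MPI_INT, right, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    int completed = 0;
    MPI_Test(&request, &completed, MPI_STATUS_IGNORE);   // has our Isend finished?
    std::cout << "Rank " << rank << " received " << received
              << ", send completed: " << completed << std::endl;

    MPI_Wait(&request, MPI_STATUS_IGNORE);   // guarantee the send is done before finalising

    MPI_Finalize();
    return 0;
}
```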
## Class/Homework Exercise 7
**ring with MPI_Sendrecv**
This send-receive ring pattern is so common that there is a dedicated method for it: `MPI_Sendrecv`.
1. Use `MPI_Sendrecv` to achieve the same as above, but now with a single call (see the sketch below). Once this is implemented, the final REQUIRE check should pass.
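A sketch of the same exchange as a single `MPI_Sendrecv` call (again with the hypothetical `rank - 1`/`rank + 1` neighbour convention):
``` cpp
// Sketch only: the ring exchange as one MPI_Sendrecv call, which handles the
// send/receive ordering internally so no deadlock can occur.
#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank = 0, size = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int left  = (rank - 1 + size) % size;
    const int right = (rank + 1) % size;

    int message = rank;
    int received = -1;

    // Send to the left neighbour and receive from the right neighbour in one call.
    MPI_Sendrecv(&message, 1, MPI_INT, left, 0,
                 &received, 1, MPI_INT, right, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    std::cout << "Rank " << rank << " received " << received << std::endl;

    MPI_Finalize();
    return 0;
}
```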
## End of module Evaluation feedback
Please fill in [the survey to help us improve](https://moodle.ucl.ac.uk/mod/questionnaire/view.php?id=3352333) our teaching and your learning experience.
# Questions
Here you can post any questions you have while we are going through this document. A summary of the questions and the answers will be posted on Moodle after each session. Please use a new bullet point for each question and sub-bullet points for the answers.
For example writing like this:
```
- Example question
    - [name=student_a] Example answer
    - [name=TA_1] Example answer
```
produces the following result:
- Example question
    - [name=student_a] Example answer
    - [name=TA_1] Example answer
Write new questions below :point_down:
- []
    - [name=TA_1] Example answer
###### tags: `phas0100` `teaching` `class`