Time, Date: 4:00-5:00 CT, 04/14/2023
Location: ECS 142 and Zoom
Pre-requisites: Ability to read and type
Recommended: Familiarity with Linux command line and Python3
Zoom link:
Join Zoom Meeting
https://wtamu.zoom.us/j/94667371336?pwd=clhVbmRjckMvSTB0M2VVbWh3clh0UT09
Meeting ID: 946 6737 1336
Passcode: 9^FhY?57
Old Video recording
https://ensemble.wtamu.edu/hapi/v1/contents/permalinks/WTAMU-HPC-Workshop01/view
HPC, or High Performance Computing, is the use of numerous processing elements (CPUs/GPUs) to solve calculation-intensive and/or data-intensive tasks. Tasks include, but are not limited to, scientific/engineering problem solving, data analysis, and visualization. The field's history can be traced to supercomputing; in fact, supercomputing and HPC are often used synonymously. Computing on your laptop/desktop may be inadequate, and "You are going to need a bigger boat!"
Slides here: Introduction to HPC
If you are using Windows, follow steps W1 through W3 and then skip to Section 1: STEP 3. If you are using Linux, skip to Section 1: STEP 1.
NOTE: For Windows users only!
STEP W1: Install PuTTY from the Microsoft store or download putty.exe from the PuTTY website. You can also find the executable here.
STEP W2: Run the PuTTY application, enter the following details, and click Open. Make sure you are on the Cisco Campus VPN.
Hostname: hpcjump.wtamu.edu
Port: 22
STEP W3: Enter your HPC username and password. The password is not displayed as you type, so type carefully. Then move to Section 1: STEP 3.
STEP 1: Open a command-line terminal (bash) and create a working directory in your name.
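For example (the directory name below is just a suggestion; use your own name):
mkdir your_name      # create a working directory named after you
cd your_name         # move into it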
STEP 2: Download and extract FileZilla, then set up a remote connection.
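One way to do this from the terminal (the archive name below is only a placeholder for whatever version you downloaded from the FileZilla website, and the extracted path may differ):
tar -xf FileZilla_x.y.z_x86_64-linux-gnu.tar.xz    # extract the downloaded archive (placeholder filename)
./FileZilla3/bin/filezilla &                       # launch the FileZilla GUI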
This should open up a GUI.
Enter the following credentials in the GUI.
Host: hpcjump.wtamu.edu
Username: temp** (your username)
Port: 22
Hit QuickConnect and enter your password. If the connection succeeds, you should see a folder structure on the right (Remote site).
This interface will allow you to quickly transfer files between the HPC and the local machine.
STEP 1: ssh into the HPC jump server (temp** is your HPC username).
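For example (replace temp** with the username you were given):
ssh temp**@hpcjump.wtamu.edu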
STEP 2: Enter the HPC password provided to you.
STEP 3: You should now be on the hpcjump jump server. Log into the login node; enter yes if prompted.
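For example (the login-node hostname below is only a placeholder; use the hostname given during the workshop):
ssh temp**@<login-node-hostname>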
You may need to reenter the HPC password.
Now you are in the login node and ready to submit jobs!
An HPC job typically needs three things:
(1) input data to be processed (optional)
(2) a program that does something, e.g., processes the data (required)
(3) a job script that tells the HPC how the job should be run (required)
and produces:
(1) output data containing results and/or log messages.
Files for the following section can be obtained by:
Here we will run a very simple Python program, simple.py. Note that this program has no input data and simply prints the message "Hello World".
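In case the workshop files are not at hand, a minimal simple.py could be as short as:
# simple.py -- no input data; just print a message
print("Hello World")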
To run this program on the HPC, we need to create a SLURM job script that contains instructions on how to run the program. The job script sb.simple contains the following.
Let us spend some time understanding what all this means.
#!/bin/bash
This line indicates that this is a bash script. Bash is the language of the command line.
#SBATCH --nodes=1
#SBATCH --tasks-per-node=1
#SBATCH --partition=compute-cpu
#SBATCH --output=log.slurm.out
#SBATCH --error=log.slurm.err
#SBATCH --time=10:00:00
These lines provide all the information on how the program will be run: how many nodes are requested, how many processors per node are requested, which partition of the cluster to run on, the filenames where the output and error information will be stored, and the upper time limit within which the program must complete or be terminated.
module load slurm/20.11.9
module load spack/python/3.10.8
These commands tell the HPC where the python and srun programs are located. If you do not provide them, the HPC will not know what the words python and srun (used in the next line) mean.
srun python simple.py
Finally, this line indicates that the program simple.py will be executed with the Python interpreter, launched through SLURM's srun command.
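Putting the pieces above back together, the complete sb.simple script reads:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --tasks-per-node=1
#SBATCH --partition=compute-cpu
#SBATCH --output=log.slurm.out
#SBATCH --error=log.slurm.err
#SBATCH --time=10:00:00
module load slurm/20.11.9
module load spack/python/3.10.8
srun python simple.py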
Once the files simple.py and sb.simple are available, they can be executed using:
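sbatch sb.simple
sbatch submits the job script to the SLURM scheduler. You can check the job's status with squeue -u $USER and, once it completes, view the program output in log.slurm.out (and any errors in log.slurm.err).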
You can also execute the parallel version of the program simple_mpi.py using the batch script sb.simple_mpi.
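We do not reproduce simple_mpi.py here, but a minimal parallel hello-world sketch (assuming it is based on the mpi4py package; the actual workshop file may differ) looks like:
# sketch of a parallel hello-world (assumes mpi4py is installed)
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()   # this process's ID (0 .. size-1)
size = comm.Get_size()   # total number of processes started by srun
print(f"Hello World from rank {rank} of {size}")
Submit it the same way, with sbatch sb.simple_mpi.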
Now try changing the number of processors in sb.simple and sb.simple_mpi, resubmit the jobs, and review the output. For example, change the tasks-per-node line to:
#SBATCH --tasks-per-node=16
Now let us look at a slightly more involved program, color2gray.py. This program runs on one CPU and converts a set of images from color (RGB) to grayscale.
If you are familiar with Python, you might be able to figure out what is going on. If not, we can study this program more closely. The key portion of the program is the loop.
The first part of the loop reads an image (indexed by i, the loop variable) into a 3D array img, which is then converted to a 3D numpy array orig in the second part. The three color channels (red, green, blue) of the image are the 2D matrices orig[:,:,0], orig[:,:,1], and orig[:,:,2] respectively; these are combined in a weighted average to create the grayscale image matrix gray. The third part writes the image to a file.
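A sketch of what that loop might look like (assuming the Pillow and NumPy packages and the common luminance weights 0.299, 0.587, 0.114; the filenames and details of the actual workshop file may differ):
import numpy as np
from PIL import Image
for i in range(800):
    # part 1: read the i-th color image (the filename pattern is only a placeholder)
    img = Image.open(f"images/img_{i}.jpg")
    # part 2: convert to a 3D numpy array and weight-average the R, G, B channels
    orig = np.asarray(img, dtype=float)
    gray = 0.299 * orig[:, :, 0] + 0.587 * orig[:, :, 1] + 0.114 * orig[:, :, 2]
    # part 3: write the grayscale image to a file
    Image.fromarray(gray.astype(np.uint8)).save(f"gray/img_{i}.jpg")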
Once the job has finished converting the 800 images, check how long it took.
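If job accounting is enabled on the cluster, one way to check the run time is (replace <jobid> with the ID printed by sbatch when you submitted):
sacct -j <jobid> --format=JobID,Elapsed,State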
Now let us run the above program in parallel. The easiest way to parallelize the code is to divide the set of images equally among the processors.
We have had to make a few changes. See if you can spot what has changed. What do the changes mean?
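The essential change is usually along these lines (again a sketch assuming mpi4py; each rank handles every size-th image, but the workshop file may split the work differently):
import numpy as np
from PIL import Image
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()            # this process's ID (0 .. size-1)
size = comm.Get_size()            # total number of processes
# divide the 800 images equally: rank r handles images r, r+size, r+2*size, ...
for i in range(rank, 800, size):
    orig = np.asarray(Image.open(f"images/img_{i}.jpg"), dtype=float)  # filename pattern is a placeholder
    gray = 0.299 * orig[:, :, 0] + 0.587 * orig[:, :, 1] + 0.114 * orig[:, :, 2]
    Image.fromarray(gray.astype(np.uint8)).save(f"gray/img_{i}.jpg")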
Now submit the mpi job.
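For example (the batch-script name below is only a guess; use the one provided with the workshop files):
sbatch sb.color2gray_mpi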
Change the number of processors used (--tasks-per-node) to 16 and 64 and resubmit the job to see how the time taken changes. You should see shorter run times as the number of processors increases.
Running code on a GPU can be fast when the work is dominated by matrix operations rather than I/O operations. Here we will look at a different program, cupy_numpy.py, to see the advantage of using GPUs.
This program has three tasks.
(Task 1) Create a 1000x1000x1000 array of ones.
(Task 2) Multiply the entire array by 5.
(Task 3) Multiply the array by 5, then multiply the array by itself, and then add the array to itself.
Submitting the program will show us the advantage of using GPUs.
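A sketch of what such a CPU-versus-GPU comparison might look like (assuming CuPy is available on the GPU partition; the actual cupy_numpy.py may differ, and note that the 1000x1000x1000 array needs roughly 4 GB of memory per copy):
import time
import numpy as np
import cupy as cp

def run_tasks(xp, label):
    t0 = time.time()
    # Task 1: create a 1000 x 1000 x 1000 array of ones
    a = xp.ones((1000, 1000, 1000), dtype=xp.float32)
    # Task 2: multiply the entire array by 5
    a *= 5
    # Task 3: multiply by 5, multiply the array by itself, then add the array to itself
    a *= 5
    a = a * a
    a = a + a
    if xp is cp:
        cp.cuda.Stream.null.synchronize()   # wait for all queued GPU work to finish before timing
    print(f"{label}: {time.time() - t0:.2f} s")

run_tasks(np, "NumPy (CPU)")
run_tasks(cp, "CuPy (GPU)")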
Linux users can use FileZilla (set up in STEP 2 earlier) with connection details similar to those shown below.
Copy files from the HPC into your local folder.
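Alternatively, you can copy files from a terminal on your local machine with scp (the remote path below is only a placeholder; point it at wherever your output files actually live):
scp "temp**@hpcjump.wtamu.edu:~/your_working_directory/*.jpg" .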
NOTE: For Windows users only!
STEP W1: Download WinSCP.exe from the WinSCP website. You can also find the executable here.
STEP W2: Run the WinSCP application, enter the following details, and click Login. Make sure you are on the Cisco Campus VPN.
Hostname: hpcjump.wtamu.edu
Port: 22
STEP W3: Enter your HPC username and password. You should now be able to transfer files between HPC and your computer.
HPC resources can be very useful when the task at hand is too big for a workstation to handle (e.g., in data science or computational science and engineering). In that case, one must either develop a parallel program (with potentially parallel I/O) or use an existing parallelized code for the task. Developing a good parallel code requires extensive knowledge of computer architecture (processor speeds, cache sizes, memory, network latency and bandwidth, etc.). Maintaining and administering such HPC systems is also quite a challenge, considering reliability, cybersecurity, and package/library management. Both present significant career opportunities.
Please complete the following survey (2-3 min):
https://wtamuuw.az1.qualtrics.com/jfe/form/SV_es3ZVxmqXfMd0V0