# 2020-11-10 <br> HPC1: Introduction to HPC at Leeds

Welcome to the hack pad for today's HPC1 course from Research Computing at the University of Leeds!

## Contents

1. [Links to resources](#Links-to-resources)
2. [Agenda](#Agenda-Day-1)
3. [What's your name and where do you come from?](#What’s-your-name-and-where-do-you-come-from)
4. [Code along notes](#Code-along-notes)
5. [Questions](#Questions)

## Links to resources

- **Contact Research Computing** - https://bit.ly/arc-help
- **Request HPC account** - https://leeds.service-now.com/it?id=sc_cat_item&sys_id=4c002dd70f235f00a82247ece1050ebc
- **Presentation for today** - https://bit.ly/hpc1intro
- **Exercises for today** - https://docs.google.com/document/d/1SPaZ2kmzYpMFIkiMSi-Qnu-ZqLaW4reSpVal3aOrmmk/edit
- **Github repository** - https://github.com/arctraining/hpc1-files
- **How to transfer files** - https://arcdocs.leeds.ac.uk/getting_started/file_transfer.html

## Agenda Day 1

| Time     | Agenda                                      |
| -------- | ------------------------------------------- |
| 0900     | Intro, connecting to ARC, what and why HPC? |
| 0950     | Break                                       |
| 1000     | Login, HOME directory and looking around    |
| 1050     | Break                                       |
| 1100     | Simple job submission, qstat, qdel          |
| 1150     | Exercise 1 and questions                    |
| 1200     | Close                                       |

## Agenda Day 2

| Time     | Agenda                                      |
| -------- | ------------------------------------------- |
| 0900     | Intro, Data Transfer, Modules               |
| 0950     | Questions and break                         |
| 1000     | Interactive sessions, ib v smp, node types  |
| 1050     | Questions and break                         |
| 1100     | User guided section, talking through <br> your hopes/fears for HPC |
| 1150     | Wrap up and questions                       |
| 1200     | Close                                       |

## What's your name and where do you come from?

And why do you want to use HPC?

- Alex Coleman, Research Software Engineer. My research has previously been in natural language processing, clustering event description data, and simulating crime rates using historic data.
- John Hodrien, Research Software Engineer.
I've done a mix of research in distributed computing, visualisation and virtual reality. Also taught Software Engineering and Parallel Programming. Worked supporting teaching and research IT at the university for ~15 years now, with particular interests in Linux and HPC.
- Nick Rhodes, Research Software Engineer, working with Alex and John in the Research Computing team for a year. I am still learning a lot about research and research computing. I was previously an Operational Lead in Application Support, IT. My background is mainly in the NHS, supporting and developing large distributed national NHS systems across a wide variety of technologies.
- Aravinda Ramakrishnan Srinivasan - I am from India. I am a research fellow (fancy name for postdoc) in the Institute for Transport Studies, working on modelling pedestrian and vehicle interactions. I plan to do the modelling with deep networks, and HPC would be really helpful.
- Adam Welch - Chemistry PhD, 1st year - Modelling the Mars atmosphere using the LMD-Mars model - I'll need to use HPC as the LMD model is run on ARC4.
- Daniel Lucas, MChem student (Heard group), looking at low T reactions important for star forming reactions of ISM. I need HPC to run Gaussian and MESMER calculations to obtain k(T) for comparison with experiment.
- Josh Cottom - Civil Engineering - Postdoc - Wanting to use HPC to run large QGIS and R models - working on mapping marine litter.
- Jess Baker - Postdoc in Earth and Environment studying Amazon climate. Want to use HPC to run WRFChem to better understand impacts of deforestation.
- Fabian Limmer - PhD in Mech Eng. Want to use HPC for CFD simulation using Ansys.
- Sam Hall - MRes Climate and Atmospheric Science. I'll be needing ARC4 in order to run a high-resolution turbulence model.
- Liam Hunter - Postdoc in Chemistry. I study crystallization in microfluidics and will use HPC for CFD of microfluidic devices, mostly multiphase droplet microfluidics. 3D microfluidics makes my current simulation hardware cry.
- Harriett Fuller - PhD student in the School of Food Science and Nutrition / LIDA. As part of my PhD I am running analysis on large genomics and metabolomics datasets using R. I need ARC to run some of these analyses at a quicker pace. My research focuses on genetic risk factors of Gestational Diabetes in high-risk ethnicities in order to develop personalised dietary interventions.
- Guillermo Jimenez - PhD student in Mechanical Engineering - I need ARC4 for computational fluid dynamics.
- Xiaoman Wang - PhD in Language, Culture and Society. Wanting to run my model on HPC for automatic assessment of English/Chinese interpreting. Using HPC with a deep learning framework to develop an NLP application.
- Callum Smith - PhD in SOEE studying tropical deforestation and the impacts on climate. Using WRFChem to model the regional and local climate impacts of land use change in the tropics.
- Dominique Hirsz - PhD student in the School of Biology, working on temperature in wheat. Will need the HPC to analyse large RNA expression datasets.
- Kate O'Connor - PhD student (2nd year) in Biology. Will be analysing large datasets.
- Yuqian Dai - first year PhD student, focusing on building a neural network in the field of NLP.

## Glossary of Terms

- Core: the basic computation unit of the CPU. This is the unit that carries out the actual computations.
- Node: the physical machine/server. In current systems, a node would typically include one or more processors, as well as memory and other hardware.
- Parallel: run across multiple CPU cores, splitting the workload between them and solving the problem faster.
- Processor: the central processing unit (CPU) inside the node, which contains one or more cores.
- Serial: run on a single CPU core, solving one problem at a time.
- Batch processing: jobs that are run as and when the system is able to, rather than jobs run interactively.
- Thread: A lightweight logical computation process.
If a program is a sequence of instructions, a thread is the finger that works its way through the list of instructions. There can be many fingers, and you can have many more threads than you have hardware to run them.
- GPU: Graphics Processing Unit. Not necessarily used for graphics; this type of hardware is good at some highly parallel problems. We have a small number of these in ARC3/4. Massive speed-ups are possible for suitable problems: one GPU can be as powerful as 40 machines.

## Code along
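
To make the glossary terms (core, thread, serial, parallel) a little more concrete, here is a minimal Python sketch. It is not part of the course material: the function names are made up for illustration, and it only uses the Python standard library. It runs the same calculation once serially and once split into chunks across a pool of threads, deliberately using more threads than the machine has cores. One caveat: in standard CPython, threads do not run CPU-bound Python code in parallel, so this shows how a workload is split and recombined rather than a speed-up.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(numbers):
    """Serial work: one thread walks through the numbers one at a time."""
    return sum(n * n for n in numbers)


def split_into_chunks(numbers, n_chunks):
    """Split the workload into n_chunks roughly equal pieces (hypothetical helper)."""
    return [numbers[i::n_chunks] for i in range(n_chunks)]


def chunked_sum_of_squares(numbers, n_workers):
    """Give each worker thread one chunk, then combine the partial results."""
    chunks = split_into_chunks(numbers, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_sums = list(pool.map(sum_of_squares, chunks))
    return sum(partial_sums)


numbers = list(range(100_000))
cores = os.cpu_count() or 1   # logical CPUs reported by the operating system
n_threads = cores + 4         # deliberately more threads than cores

serial_result = sum_of_squares(numbers)
threaded_result = chunked_sum_of_squares(numbers, n_threads)

print(f"cores: {cores}, threads: {n_threads}")
print(f"same answer: {serial_result == threaded_result}")
```

Both versions give the same answer because the chunks cover every number exactly once. For real parallel speed-ups on an HPC system you would typically spread work across separate processes or nodes, or use libraries that run compiled code in parallel.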