# KDL HPC training

## Questions

- from Peter: will they be using Jupyter notebooks?

## Slides / background

- we have 90 min for this
- needs to introduce things like "what is a CPU?", "when is using HPC useful?", "what's a virtual environment?", "what is parallelisation?", etc
  - we can base this on existing material from the Bioinfo MSc and Intro to HPC slides
- should have more focus on practical applications than theoretical computer science
- will need to check with Mary/Neil that we're pitching it at the right level and not missing out things that might seem obvious to us!

### Plans:

- use the "what is a computer"(?) and "components of a computer" sections of James's slides for the MSc
  - no binary content
  - keep at least some of the history of computing
  - probably need to expand on the components?
- need to add content on going beyond using just one computer
  - how an HPC cluster is structured
  - some content in the HAB HPC workshop that we can use
- need to add content on connecting
  - ssh keys?
    - NB these participants will have username and password access, so will not need to generate an SSH key!
    - so cover this only in a theoretical way
- need to add content on resource allocation / job scheduling / partitions
  - can base on the HPC training slides

## Hands-on session

- we have 3 hours for this
- focus on common principles that will apply to other HPC systems too (e.g. probably skip OpenOnDemand, but show how to connect to RStudio/Jupyter using an ssh tunnel)
  - probably Jupyter since other parts of the week will use Python
- can skip/minimise content on using the command line as there's a dedicated session for this!
- would be nice to have a relatable research example
  - perhaps using Singularity to run https://github.com/kingsdigitallab/kdl-vqa

### Content:

- accessing software with modules
- virtual environments?
- submitting jobs
- job monitoring, cancelling jobs
- requesting resources including GPUs
- parallel jobs
  - cover multicore jobs
  - don't cover MPI jobs, probably not relevant for this audience
  - leave additional material for Thursday?
- Singularity
- Jupyter notebook via ssh tunnel
  - don't do this right at the end, give them time to try it out

### Plans?

- need to set up a docker container for the KDL-VQA software and adapt the material to use this
- develop example scripts that are more relevant:
  - text processing
    - look for online training material to adapt
      - Bristol course?
      - https://blog.jetbrains.com/pycharm/2024/12/introduction-to-sentiment-analysis-in-python/
  - matrix multiplication using tensorflow/pytorch
    - Max has scripts for this already
    - https://docs.er.kcl.ac.uk/CREATE/TRE/guides_and_tutorials/tre_hpc/
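
A possible starting point for the text-processing example in the multicore part of the session, as a rough sketch only. It assumes the scheduler is Slurm and that the job requests cores with `--cpus-per-task`, in which case Slurm sets `SLURM_CPUS_PER_TASK`. The function names (`count_words`, `count_all`, `workers_from_slurm`) are made up for illustration, and the sample strings are placeholder data rather than anything from the KDL material.

```python
import os
from collections import Counter
from multiprocessing import Pool


def count_words(text: str) -> Counter:
    """Count lower-cased, whitespace-separated words in one document."""
    return Counter(text.lower().split())


def workers_from_slurm(default: int = 1) -> int:
    """Return the number of cores allocated to this job, or a default.

    Assumption: the job runs under Slurm and was submitted with
    --cpus-per-task, so SLURM_CPUS_PER_TASK is set in the environment.
    """
    return int(os.environ.get("SLURM_CPUS_PER_TASK", default))


def count_all(texts: list[str], workers: int = 1) -> Counter:
    """Count words across all documents, one document per task."""
    if workers <= 1:
        # Serial fallback, e.g. on a laptop or in a single-core job.
        partial_counts = [count_words(t) for t in texts]
    else:
        # Spread the documents across the allocated cores.
        with Pool(processes=workers) as pool:
            partial_counts = pool.map(count_words, texts)

    # Merge the per-document counts into one total.
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


if __name__ == "__main__":
    # Placeholder documents; a real exercise would read files from disk.
    docs = ["the cat sat", "The dog sat", "a cat and a dog"]
    print(count_all(docs, workers=workers_from_slurm()).most_common(3))
```

Reading the worker count from the environment means the same script runs unchanged on a laptop (one worker) and in a multicore job, which fits the aim of teaching principles that carry over to other HPC systems.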