# e-Research Training Meetings

External stakeholders to invite:

- John Lavelle (KHP Digital Health Hub)
- Alessandra Vigilante (HAB)
- Someone from KDL

## 2026-01-12

- Existing training
    - Intro to HPC
        - Happy, might want to add something on AI models
        - Variants - happy to do if asked and funded
            - Keeping in sync difficult / impossible
        - Consider running SWC Shell Novice before HPC workshops
    - OpenStack
        - Not quite polished enough yet
        - DRIVE-Health interested, Liz likely delivering, good opportunity to polish
        - Might not need to deliver as workshop, Ops and Lukasz don't see many queries, user experience level is generally high
    - Software Carpentry - generally happy, although can be hard to get helpers for two days in a row
        - Consider running as separate sessions
        - Online format we don't like as much
            - Much more difficult as helper/instructor, worse experience for all, high dropout rate
        - Python vs R - we need to run some R
    - Containers
        - Attendance and feedback very good
        - Would like to run 2x per year
    - Performance Profiling and Optimisation for Python
        - Attendance and feedback very good
        - Would like to run 2x per year
    - HPC Data Management
        - Would tie in with mandatory HPC access training
    - Reproducible Computational Research
        - Seminar format, different kind of thing
        - Similar work in NMES
- Planned
    - Introduction to TREs
        - Do we need this to be a workshop? The environment isn't that complex once you're logged in
        - Was driven by DRIVE-Health
        - May work better as seminar-style - pitch for using TRE
        - Check with Michal / Madalyn if they feel we need a workshop
    - Green computing
        - Don't think standalone workshop is the right format - don't expect good uptake
        - Will assess existing training and add callout content on sustainability
        - Material does exist for a full workshop, shorter materials also in progress
    - Intro for PIs
    - Introduction to AI methods
        - How would we pitch this? What would the content be? What does 'introduction' mean?
        - A lot of people happy using existing models, some developing / tuning their own
        - May be quite context sensitive, some core, but branch for specific applications
        - Pitch this as workshop which aims to walk you through solving specific problem
    - GPUs - we've had requests
        - Alejandro working on training with Young
        - What would it cover?
            - CuPy (CUDA NumPy), JAX
        - Who is the audience? Would need to survey
            - Context matters
- Pointers to other resources
    - External: https://kcl-rosalind.slack.com/docs/T5BQQ83QR/F09BVSW27AB
    - https://internal.kcl.ac.uk/crsd/courses/ai
- Comms strategy
    - We have existing training mailing list
    - Set up group including providers and consumers of training e.g. CDTs
    - AI priority area - how do we get this out?
- Scheduling
    - Ask for access to HAB room - do this in form of shared R workshop initially
    - Prepare schedule to deliver one workshop per month
        - Will make sure it happens reliably with less pressure
- Access to spaces
    - HAB on Guy's
    - River Room on Strand?
    - Does CRSD have training rooms on Waterloo?
- Actions
    - Make list of stakeholders to include in comms / coordination
    - Schedule meeting to make draft training schedule for the year

## 2025-04-30

Agenda:

- Introductions for people outside e-Research
- Development of new training materials
    - HPC Data & Code Management - 20th May 2025 afternoon
        - Open questions
            - [Xand] How can we support people to automate rsync to RDS
        - Example workflows
            - Image analysis - Stefania has a training session that uses this
                - Identifying blobs in image
        - Contributors - Liz, Alejandro, James
        - Reviewers - Matt, Ops
    - A PI's Guide to e-Research
        - Topics
            - The range of services which e-Research provides and how these can support your research - both costed and un-costed
            - Determining infrastructure requirements for research projects
            - Costing e-Research infrastructure and staff on grants
            - Managing data and code within a research group
                - Data / Software Management Plans - [Matt] Speak to library
            - Publication of digital research outputs
            - Software maintenance
                - Planning for sustainability
            - Sustainability and reproducibility
                - PI's responsibility
                - Green Disc
            - Direction of travel - what do we have upcoming, what do we need PI's support with
            - Future training
        - Contributors - Matt, Liz, James
    - Introduction to TREs with CREATE
        - Needs to emphasise using the TRE
        - This one aimed at researchers actually using the environment
        - Maybe another one later for PIs / data managers
        - Signpost to external resources / whitepapers
- Collaboration outside e-Research
    - Signposting between training providers
    - Training brief (quarterly?)
    - Who to invite into these?
        - HAB
        - KDL
        - Imaging Sciences - Marc Modat, Eric Kerfoot
        - HPC Champions - relight this
            - Have funding in STEP-UP for this
            - Re-do applications
            - Use it to collect examples
        - Doctoral College
        - OD
        - CRSD
        - King's AI
        - HS-DTC - Grant Wray

## 2025-04-30 Mandatory HPC training

- Purpose
    - What does our mandatory HPC training need to look like?
- Platform
    - KEATS
        - Can get API access to see who's done it
- Content
    - Bash
        - Basic commands - filesystem movement, inspecting files
        - How is the filesystem structured?
            - Case sensitive, spaces
        - Exists in current HPC training materials
    - Getting access
        - SSH keys
        - Create accounts for people on a training partition
            - Allocate a single node - can't access anything other than that
            - Could later (not MVP) check for existence of an output file we expect them to create
    - HPC theory
        - What is the scheduler? Nodes?
        - Loading modules
        - Requesting appropriate resources
            - Don't request more cores / memory than you need
    - Data management
        - Where to put your data
            - RDS
            - Put your data in RDS - copy to scratch for your job
            - All outputs to RDS
        - Default to github.com - but mention KCL GitHub too
        - What can we mandate?
            - Nothing really - too many risks of misunderstanding
        - Describe the backup on filesystems
        - Ensure that your code is somewhere safe
    - Sustainable computing
        - Cleaning up your data
        - Reproducibility of experiments
            - Software versions and params
            - Naming things
        - Testing on small subsets of data before running large simulations
        - How many resources your job used
    - Getting help
        - Error messages
        - How to submit queries to us
        - Man pages
        - Places online, LLMs
        - Tier 1/2 machines
- KEATS
    - Reference to Software Carpentry etc.
    - Short exercises to check they've understood
    - Read this then come back and answer the question
        - Allows people to shortcut if they already know
    - Materials should include SSH and logging in
- Questions
    - "Which of these filesystems are backed up?"
- Timing
    - Must be doable in 30 min
    - 5 min for someone already familiar
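
The data management guidance noted under Content (put your data in RDS, copy to scratch for your job, all outputs to RDS) could be illustrated in the training with something like the sketch below. It is only a minimal illustration in Python: the function name `run_staged_job`, its parameters and the directory layout are hypothetical, and the real RDS and scratch paths would need to be substituted.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def run_staged_job(
    rds_input: Path,
    rds_output: Path,
    scratch_root: Path,
    job: Callable[[Path, Path], None],
) -> list[Path]:
    """Copy inputs from RDS to scratch, run the job there, copy outputs back.

    All names and paths are hypothetical placeholders, not real locations.
    Returns the files present in rds_output once the job has finished.
    """
    # Work in a fresh scratch directory so concurrent jobs do not collide.
    work_dir = Path(tempfile.mkdtemp(prefix="job-", dir=scratch_root))
    try:
        scratch_input = work_dir / "input"
        scratch_output = work_dir / "output"
        shutil.copytree(rds_input, scratch_input)  # stage inputs onto scratch
        scratch_output.mkdir()

        job(scratch_input, scratch_output)  # the analysis itself runs on scratch

        # Copy every output back to RDS before the scratch copy is removed.
        rds_output.mkdir(parents=True, exist_ok=True)
        shutil.copytree(scratch_output, rds_output, dirs_exist_ok=True)
        return sorted(p for p in rds_output.rglob("*") if p.is_file())
    finally:
        # Clean up scratch whether or not the job succeeded.
        shutil.rmtree(work_dir, ignore_errors=True)
```

The `job` argument is any callable that reads from the staged input directory and writes to the output directory, so the same wrapper could be reused for a small test subset before a large run.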