# e-Research Training Meetings
External stakeholders to invite:
- John Lavelle (KHP Digital Health Hub)
- Alessandra Vigilante (HAB)
- Someone from KDL
## 2026-01-12
- Existing training
- Intro to HPC
- Happy, might want to add something on AI models
- Variants - happy to do if asked and funded
- Keeping in sync difficult / impossible
- Consider running SWC Shell Novice before HPC workshops
- OpenStack
- Not quite polished enough yet
- DRIVE-Health interested, Liz likely delivering, good opportunity to polish
- Might not need to deliver as a workshop: Ops and Lukasz don't see many queries, and user experience level is generally high
- Software Carpentry
- Generally happy, although it can be hard to get helpers for two days in a row
- Consider running as separate sessions
- We don't like the online format as much
- Much more difficult as helper/instructor, worse experience for all, high dropout rate
- Python vs R - we need to run some R
- Containers
- Attendance and feedback very good
- Would like to run 2x per year
- Performance Profiling and Optimisation for Python
- Attendance and feedback very good
- Would like to run 2x per year
- HPC Data Management
- Would tie in with mandatory HPC access training
- Reproducible Computational Research
- Seminar format, different kind of thing
- Similar work in NMES
- Planned
- Introduction to TREs
- Do we need this to be a workshop? The environment isn't that complex once you're logged in
- Was driven by DRIVE-Health
- May work better as seminar-style - pitch for using TRE
- Check with Michal / Madalyn if they feel we need a workshop
- Green computing
- Don't think standalone workshop is the right format - don't expect good uptake
- Will assess existing training and add callout content on sustainability
- Material does exist for a full workshop, shorter materials also in progress
- Intro for PIs
- Introduction to AI methods
- How would we pitch this? What would the content be? What does 'introduction' mean?
- A lot of people happy using existing models, some developing / tuning their own
- May be quite context sensitive, some core, but branch for specific applications
- Pitch this as a workshop which walks you through solving a specific problem
- GPUs - we've had requests
- Alejandro working on training with Young
- What would it cover?
- CuPy (NumPy-compatible arrays on CUDA), JAX
- Who is the audience? Would need to survey
- Context matters
- Pointers to other resources
- External: https://kcl-rosalind.slack.com/docs/T5BQQ83QR/F09BVSW27AB
- https://internal.kcl.ac.uk/crsd/courses/ai
- Comms strategy
- We have existing training mailing list
- Set up group including providers and consumers of training e.g. CDTs
- AI priority area - how do we get this out?
- Scheduling
- Ask for access to HAB room - do this in form of shared R workshop initially
- Prepare schedule to deliver one workshop per month
- Will make sure it happens reliably with less pressure
- Access to spaces
- HAB on Guy's
- River Room on Strand?
- Does CRSD have training rooms on Waterloo?
- Actions
- Make list of stakeholders to include in comms / coordination
- Schedule meeting to make draft training schedule for the year
## 2025-04-30
Agenda:
- Introductions for people outside e-Research
- Development of new training materials
- HPC Data & Code Management
- 20th May 2025 afternoon
- Open questions
- [Xand] How can we support people to automate rsync to RDS?
- Example workflows
- Image analysis
- Stefania has a training session that uses this
- Identifying blobs in image
- Contributors
- Liz, Alejandro, James
- Reviewers
- Matt, Ops
- A PI's Guide to e-Research
- Topics
- The range of services which e-Research provides and how these can support your research - both costed and un-costed
- Determining infrastructure requirements for research projects
- Costing e-Research infrastructure and staff on grants
- Managing data and code within a research group
- Data / Software Management Plans
- [Matt] Speak to library
- Publication of digital research outputs
- Software maintenance
- Planning for sustainability
- Sustainability and reproducibility
- PI's responsibility
- Green Disc
- Direction of travel - what do we have upcoming, what do we need PIs' support with
- Future training
- Contributors
- Matt, Liz, James
- Introduction to TREs with CREATE
- Needs to emphasise using the TRE
- This one aimed at researchers actually using the environment
- Maybe another one later for PIs / data managers
- Signpost to external resources / whitepapers
- Collaboration outside e-Research
- Signposting between training providers
- Training brief (quarterly?)
- Who to invite into these?
- HAB
- KDL
- Imaging Sciences - Marc Modat, Eric Kerfoot
- HPC Champions - relight this
- Have funding in STEP-UP for this
- Re-do applications
- Use it to collect examples
- Doctoral College
- OD
- CRSD
- King's AI
- HS-DTC
- Grant Wray
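
On the open question above about automating rsync to RDS, one possible starting point for an example workflow is a small wrapper that builds the rsync command and defaults to a dry run. This is a minimal sketch, not an agreed approach: the function names and the paths in the usage note are hypothetical placeholders, and the flag selection is an assumption.

```python
import shlex
import subprocess
from pathlib import Path


def build_rsync_command(source: Path, destination: str, dry_run: bool = True) -> list[str]:
    """Build an rsync command that copies the contents of `source` into `destination`.

    Hypothetical helper for illustration; the flags chosen here are an assumption.
    """
    command = ["rsync", "--archive", "--partial", "--itemize-changes"]
    if dry_run:
        # Report what would be transferred without copying anything.
        command.append("--dry-run")
    # The trailing slash tells rsync to copy the directory's contents,
    # not the directory itself.
    command += [f"{source}/", destination]
    return command


def sync_to_rds(source: Path, destination: str, dry_run: bool = True) -> int:
    """Run the transfer and return rsync's exit code (0 means success)."""
    command = build_rsync_command(source, destination, dry_run)
    print("Running:", shlex.join(command))
    return subprocess.run(command, check=False).returncode
```

A user would call something like `sync_to_rds(Path("/path/to/scratch/project"), "/path/to/rds/project", dry_run=False)` at the end of a job script, or from a scheduled task. Defaulting to a dry run is a deliberate choice so that a first attempt shows what would be copied before any data moves.
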
## 2025-04-30 Mandatory HPC training
- Purpose
- What does our mandatory HPC training need to look like?
- Platform
- KEATS
- Can get API access to see who's done it
- Content
- Bash
- Basic commands - filesystem movement, inspecting files
- How is the filesystem structured?
- Case sensitivity, spaces in filenames
- Exists in current HPC training materials
- Getting access
- SSH keys
- Create accounts for people on a training partition
- Allocate a single node - can't access anything other than that
- Could later (not MVP) check for existence of an output file we expect them to create
- HPC theory
- What is the scheduler? Nodes?
- Loading modules
- Requesting appropriate resources
- Don't request more cores / memory than you need
- Data management
- Where to put your data - RDS
- Put your data in RDS - copy to scratch for your job
- All outputs to RDS
- Default to github.com - but mention KCL GitHub too
- What can we mandate?
- Nothing really - too many risks of misunderstanding
- Describe the backup on filesystems
- Ensure that your code is somewhere safe
- Sustainable computing
- Cleaning up your data
- Reproducibility of experiments
- Software versions and params
- Naming things
- Testing on small subsets of data before running large simulations
- How many resources your job used
- Getting help
- Error messages
- How to submit queries to us
- Man pages
- Places online, LLMs
- Tier 1/2 machines
- KEATS
- Reference to Software Carpentry etc.
- Short exercises to check they've understood
- Read this then come back and answer the question
- Allows people to shortcut if they already know
- Materials should include SSH and logging in
- Questions
- "Which of these filesystems are backed up?"
- Timing
- Must be doable in 30 min
- 5 min for someone already familiar
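
The later (not MVP) idea above of checking for an output file we expect trainees to create could be as simple as the following. This is a sketch under stated assumptions: the function name and the default filename `training_output.txt` are hypothetical, and nothing here reflects how accounts on a training partition would actually be laid out.

```python
from pathlib import Path


def has_completed_exercise(home_dir: Path, expected_name: str = "training_output.txt") -> bool:
    """Return True if the expected output file exists and is non-empty.

    `expected_name` is a hypothetical filename chosen for illustration.
    """
    output_file = home_dir / expected_name
    # An empty file probably means the job was submitted but produced nothing.
    return output_file.is_file() and output_file.stat().st_size > 0
```

A check like this only confirms that a job ran and wrote something. It says nothing about whether the trainee understood the material, so it would complement the short exercises rather than replace them.
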