
ENCCS Training material hackathon

August 22-24 2023

This document: https://hackmd.io/@enccs/training-hackathon-2023

Room and Zoom

Remote participation: https://rise.zoom.us/j/67402468396?pwd=SkRiSzFPbFZVOEcybGMzMlBhbVcwQT09
Physical participation:

  • 22/8 room Knuth
  • 23/8 room Kruse
  • 24/8 room Salongen
  • We can meet each morning in the lobby and have coffee/tea

Participants

  • Qiang
  • Thor
  • Wei L
  • Wei Z
  • Yonglei
  • Apostolos
  • Johan
  • Martin
  • David Eklund
  • Karl-Filip Faxén

Homework before hackathon

Hackathon notes

Tentative agenda

August 22:

August 23:

August 24:

  • 9:00-9:30 HackMD alternatives - HedgeDoc hosted on ICE
  • 9:30-10:30 MOOC-ifying material
    • record videos
    • possible platforms
    • support provided
    • "examination"
  • 10:30-12:00 New lessons
    • quantum autumn school
    • C++ lesson
      • Yonglei Wang
      • Jonas Lindemann
      • Sandipan Mohanty
      • Johan Kristiansson
    • Colonies lesson
    • other new lessons?
  • 12:00-13:00 Lunch
  • 13:00-17:00 Open slot

All existing ENCCS lessons

Survey and feedback


General lesson issues/comments/todos

  • it can be confusing when H1 and H2 heading levels have similar font sizes; can this be improved?

    • consider using section numbers: 1, 1.1, 1.2, …
    • can we add CSS styling to improve this? should be possible in overrides.css
  • we need a roadmap for developing lesson material

    • "GPU programming"
      • introduction (level 1)
      • intermediate level
      • advanced level -> should include larger project
      • hackathons? can be intimidating?
        • have to be marketed in the right way
  • write a blog post about when to use OpenMP and when to use OpenACC, with examples

  • military-style approach: mål, syfte, krav (goal, purpose, requirements)

    • similar to backwards lesson design, starting from learning objectives etc

GPU computing

1. GPU Programming - Why, When and How?

  • Developers: Many
  • Maintainer: Yonglei
  • todo:
    • write best practice guide
    • review and raise issues
    • solve issues
    • use first N episodes as general intro for all GPU programming workshops (including OpenMP, OpenACC, CUDA etc) -> modular design
    • Modularisation - can we use git-submodules or similar
    • more exercises! step-by-step to intermediate level
    • MOOC-ify it? Could serve as a PoC of MOOC-ification
    • learning tracks: "for this purpose, study these modules, for this other purpose study these other modules"
    • each module should have clear learning objectives
      • "if you already know the basics, jump to this module"
      • consider having certain intro-level modules as self-study prerequisites for intermediate/advanced level workshops

2. OpenMP for GPU Offloading Workshop

  • Developers: Qiang, Mark
  • Maintainer: Wei Z, Qiang
  • todo:
    • profiling part is missing
    • multi-GPU part is missing
      • but there is a part about that in the general-GPU lesson
    • merge with OpenACC lesson?
    • Current material works well as the 1st part of a 2-part workshop (half day + half day, or day + day); see the offloading/data-environment sketch after this list:
      • Subject (heat diffusion app)
      • Intro to GPUs
      • Offloading example
      • Data environment example
    • Next sections to update, in suggested order:
      • (Tier 1) Offloaded code optimization
      • Existing code porting considerations
      • GPU profiling
      • (Tier 2) Multiple GPUs
      • GPU libraries
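
For reference, a minimal sketch of the kind of offloading and data-environment example that first part builds on (a heat-diffusion-style 1D stencil in C++ with OpenMP target directives). The variable names, sizes, and loop structure are illustrative assumptions, not the lesson's actual code:

```cpp
// Minimal OpenMP offloading sketch (illustrative, not lesson code):
// a 1D stencil update offloaded to the GPU with an explicit data environment.
#include <vector>
#include <cstdio>

int main() {
  const int n = 1 << 20;
  const double a = 0.1;
  std::vector<double> u(n, 1.0), unew(n, 1.0);
  double *pu = u.data(), *pn = unew.data();

  // Data environment: u is copied to the device once, unew is transferred
  // in and out once, instead of moving data at every time step.
  #pragma omp target data map(to: pu[0:n]) map(tofrom: pn[0:n])
  {
    for (int step = 0; step < 100; ++step) {
      #pragma omp target teams distribute parallel for
      for (int i = 1; i < n - 1; ++i)
        pn[i] = pu[i] + a * (pu[i - 1] - 2.0 * pu[i] + pu[i + 1]);

      #pragma omp target teams distribute parallel for
      for (int i = 1; i < n - 1; ++i)
        pu[i] = pn[i];  // copy back on the device, no host transfer per step
    }
  }
  std::printf("u[n/2] = %f\n", unew[n / 2]);
  return 0;
}
```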

3. OpenACC Workshop

  • Developers: Jing
  • Maintainer: Wei Z, Qiang
  • ideas:
    • merge with OpenMP lesson?
    • The OpenMP lesson assumes that most users already know OpenMP and goes from there, while the OpenACC lesson is mostly about syntax and how OpenACC works. Two suggestions (could do either or both):
      • flesh out the OpenACC lesson a bit more and keep it separate
      • add an OpenACC module to the GPU offloading lesson, focused on the heat example and only the relevant differences between coding in the two programming models (see the comparison sketch after this list)
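
To make "only the relevant differences" concrete, here is the same stencil update written with OpenMP offloading and with OpenACC. This is a minimal sketch with assumed variable names, not code taken from either lesson:

```cpp
// The same loop in both programming models (illustrative sketch):
// the loop body is identical, only the directives differ.

// OpenMP offloading version
void update_omp(int n, double a, const double *u, double *unew) {
  #pragma omp target teams distribute parallel for map(to: u[0:n]) map(tofrom: unew[0:n])
  for (int i = 1; i < n - 1; ++i)
    unew[i] = u[i] + a * (u[i - 1] - 2.0 * u[i] + u[i + 1]);
}

// OpenACC version
void update_acc(int n, double a, const double *u, double *unew) {
  #pragma acc parallel loop copyin(u[0:n]) copy(unew[0:n])
  for (int i = 1; i < n - 1; ++i)
    unew[i] = u[i] + a * (u[i - 1] - 2.0 * u[i] + u[i + 1]);
}
```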

4. Intermediate CUDA Workshop

  • Developers: Artem, Jing
  • Maintainer: Yonglei, Wei L
  • todo:
    • generalize to include HIP in code blocks
      • maybe we can automatically hipify all code examples and exercises in the lesson
    • where is the intro material? general-GPU lesson contains some material at intro level
    • harmonise this lesson with what's in the general-GPU lesson

Retired:

5. SYCL Workshop

  • Developers: Roberto
  • Maintainer: see if Roberto has time and interest?
    • collaborate with someone! NAISS or EuroCC or CSC
    • Codeplay involvement?
  • todo:
    • Add more on profiling/debugging/tooling? This is done through third-party applications and depends on the available hardware (NVIDIA/AMD/Intel), though.

6. Developing Applications with the AMD ROCm Ecosystem

  • Developers: AMD staff
  • "Maintainer": someone to dig into the material to see what's useful and reusable
  • todo
    • can we reuse some of the material for our other ENCCS lessons?
    • should ask them first, but they already agreed when developing the material
    • could be good bits and pieces that we can reuse

HPC

0. Intro to MPI

1. Intermediate MPI Workshop

  • Developers: Roberto, Mark, Xin, Pedro
  • Maintainer: Yonglei, Wei Z
  • ideas:
    • we need intro material too!
      • HLRS material
    • This is an overview-flavored workshop covering the middle part of a larger topic (the MPI standard). Different participants will have different experience and views on what constitutes the "Basic MPI" part. For clarity, the lesson could have an introductory chapter in the spirit of "what you're supposed to know already", offered either as a self-test or as a first session.
    • https://www.hlrs.de/training/self-study-materials/mpi-course-material
    • https://hpc-tutorials.llnl.gov/mpi/

Programming languages

1. High-performance Data Analytics with Python

  • Developers: Qiang, Thor, Wei Z
  • Maintainer: Qiang?
  • ideas:
  • todo
    • for the workshop on Sept 5-7:
      • have an onboarding session the day before, to help solve installation and cluster issues?
      • but most participants use their local computers
      • finish lesson updates

2. Julia for High Performance Scientific Computing

  • Developers: Thor
  • Maintainer: Thor, Yonglei
  • todo:
    • split out machine learning part into separate ML-in-Julia lesson
    • add explicit introductory material which can be taught in longer workshop or recommended as preparation material
    • expand GPU episode
    • add MPI episode
    • upcoming workshop:
      • Yonglei to teach certain parts
    • the julia-intro material (first 4 episodes) should be broken out
    • PackageCompiler.jl: to create a single executable
      • need to bring this up in the lesson somehow: pros and cons, what problem it addresses

3. Julia for High Performance Data Analytics

  • Developers: David, Anastasiia
  • Maintainer:
  • todo:
    • finish it
    • advertise workshop internally at RISE and via RISE channels to reach people outside academia
    • split out julia-intro part, either via git-submodules or just independent repos
    • consider adding recommended reading: a gentle and short intro to linear algebra / machine learning

AI

1. Upscaling A.I. with Containers

2. Upscaling AI training

  • Developers: Hossein
  • Maintainers: Johan, Martin
  • todo:
    • split off from "Upscaling AI with Containers"; not yet complete or consistent
    • needs thorough revision
  • Containers lesson
    • modularisation idea: is it needed? when is it needed?

3. A.I. as a Tool for Change

  • Developers: Erik
  • ideas:
    • Data-focused ML is not all that AI is, even if it is the most popular branch in practice today. If the term is not to be misused, training material and instructors should be the first to point out the distinction and correct it, consistently. So, if the workshop deals exclusively with ML (what relation it has to AI, how it works, how it is coded, how to apply it, what it can and cannot solve), then the title should reflect that (e.g. "ML as a tool"). Otherwise, the content could be expanded to include other AI aspects as well (at least the issue of uncertainty, maybe also brain-simulation research).

4. Graph Neural Networks and Transformer Workshop

  • Developers: Erik
  • Maintainer: Martin
  • ideas:

Tools

1. CMake Workshop

2. Introduction to Containers

  • Developers: Hossein
  • Maintainers: Johan and Martin?
  • todo:
    • split off from "Upscaling AI workflows"; not yet complete or consistent
    • needs thorough revision
    • focus more on how-to and less on underlying theory (cgroups etc)

Applications

1. Gromacs GPU Performance

2. VeloxChem: Quantum chemistry from laptop to HPC

3. OpenFOAM Workshop

  • People involved: Jing, Arash (PDC), Timofey Mukha (KTH)
  • partly slides
  • ideas:
    • give another training with KTH (Timofey (has left for Dubai), Arash?)
    • sphinxify? need to contact and work with authors

4. NEK5000 Workshop

  • Developers: Jing?, Philipp Schlatter, Niclas Jansson, Adam Peplinski
  • Slides only!
  • ideas:
    • sphinxify? need to contact and work with authors

5. VASP best practices workshop

  • Developer: Weine Olovsson

6. TREX: Targeting chemical accuracy with quantum Monte Carlo on LUMI

  • Developers: TREX team

7. MAX: Efficient materials modelling

  • Developers: MAX team

Quantum computing

1. Introduction to Quantum Computing and hybrid HPC-QC systems

  • Developers: NordIQuEst
  • ideas:
    • sphinxify more? need to contact and work with authors

Application training material

  • work with CoEs for each lesson!
  • include more CoEs, more codes
    • especially CEEC, Plasma-PEPSC
    • EXCELLERAT? ask Jing

Performance programming


New lessons

  • Colonies
  • C++
  • Performance engineering
  • Julia-HPDA

Tasks

  1. go through all lessons and raise issues or send PRs for obvious improvements
    • typos, clarifications, formatting
  2. go through feedback from Indico and HackMD and raise issues or send PRs
  3. triage existing lesson issues
  4. write instructor guides
  5. add general learning objectives to front page
  6. review episode objectives and questions
  7. in-depth work on one lesson: refactoring, adding/removing content, new visualizations, new exercises and solutions, new or refined objectives/questions/keypoints
  8. possible merging or splitting of lessons

Notes

CMake

  • maybe all lessons should have numbered exercises, with folders split by day (day1/, day2/)
    • give numbers even to in-lesson exercises
  • git-tag each workshop iteration
  • label all issues (next-iteration, urgent, non-important etc)

Advanced deep learning

Intermediate MPI

  • standardize exercises (unit tests; work on the exercise until the test passes), see the sketch after this list
  • RDR exercises: without correct solution the code doesn't compile
  • unfortunately attracts few participants
  • HLRS course: join basic/intermediate/more advanced
    • up to 50 participants
  • include episode for quick recap of basics
  • Stepas: use this lesson as a Nordic standard instead of each center reinventing its own
  • inspiration/resource for adding introductory part: https://pdc-support.github.io/introduction-to-mpi/intro
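
A sketch of what such a standardized, test-style exercise could look like; the task and the expected value are illustrative assumptions, not taken from the lesson. Participants would see only the TODO and keep working until the check prints PASS (the expected solution is included here so the sketch runs as-is):

```cpp
// Illustrative exercise skeleton: the program reports FAIL until the
// participant replaces the TODO with the correct collective call.
#include <mpi.h>
#include <cstdio>

int main(int argc, char **argv) {
  MPI_Init(&argc, &argv);
  int rank, size;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  int contribution = rank + 1;  // each rank contributes rank+1
  int total = 0;

  // TODO (exercise): sum all contributions onto rank 0.
  // Expected solution (included so the sketch compiles and passes):
  MPI_Reduce(&contribution, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

  if (rank == 0) {
    int expected = size * (size + 1) / 2;  // 1 + 2 + ... + size
    if (total == expected)
      std::printf("PASS\n");
    else
      std::printf("FAIL: got %d, expected %d\n", total, expected);
  }
  MPI_Finalize();
  return 0;
}
```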

OpenACC

  • no new OpenACC material is being developed
  • performance on non-NVIDIA GPUs is not good
  • not much to say at the advanced level
  • main idea: merge the OpenACC material with OpenMP-offloading
    • give an introduction to OpenACC, followed by work on participants' own codes
  • The OpenMP lesson assumes that most users already know OpenMP and goes from there, while the OpenACC lesson is mostly about syntax and how OpenACC works. Two suggestions (could do either or both):
    • flesh out the OpenACC lesson a bit more and keep it separate
    • add an OpenACC module to the GPU offloading lesson, focused on the heat example and only the relevant differences between coding in the two programming models

OpenMP-offloading

  • not complete! e.g. multi-GPU and other advanced parts are missing
  • we need to involve more people (NCC-Belgium, NCC-Czechia)
  • idea: same as OpenACC, only give an intro and then help participants with their own codes
  • in other centers: first an OpenMP course, then more advanced topics like offloading to GPUs
  • include OpenMP tasking?
  • Current material works well as the 1st part of a 2-part workshop (half day + half day, or day + day):
    • Subject (heat diffusion app)
    • Intro to GPUs
    • Offloading example
    • Data environment example
  • Next sections to update, in suggested order:
    • (Tier 1) Offloaded code optimization
    • Existing code porting considerations
    • GPU profiling
    • (Tier 2) Multiple GPUs
    • GPU libraries

OpenFOAM

  • it's a mix of sphinx lesson and PDF slides
  • involve Excellerat, maybe next year
  • involve Timofey (KTH-mech) in teaching OpenFOAM?
  • other OpenFOAM experts in NCC network

Nek5000

  • abandon it! The material is Philipp's slides, from a collaboration with Excellerat

Python HPDA

General GPU lesson

Julia for HPC

  • MPI.jl, to include or not
  • GPU episode, keep it as it is?
    • and include only Julia ports of heat-equation and reduction in the General GPU lesson
  • consider git submodules for generic GPU intro

Upscaling AI workflows

  • omit containers and focus on upscaling AI
  • split into "introduction to containers" and "upscaling AI" lessons
  • Upscaling AI lesson, learning objectives
    • Understand when AI can be scaled up
    • Learn basics of Horovod for TensorFlow or PyTorch
    • Scaling up to multiple GPUs using Horovod or native TensorFlow
    • Using containers in scaling up, when and why
    • rename to "upscaling AI training" since it's not about workflows per se
  • Introduction to containers, notes:
    • there are other lessons, like Carpentry/EPCC material
    • distinguish this lesson somehow, e.g. by being at an intermediate level going beyond Carpentry, or by covering the troubles you get when using containers on the cluster
    • or "Docker for cloud computing" or "Docker compose"?
    • or skip Docker completely and just start with Singularity/Apptainer and include "converting from Docker" episode
    • Docker vs Singularity: how to translate images and run commands from Docker (what you find online) to Singularity (or whatever) you can use on the cluster
    • Development in containers, as opposed to running working code (do you have to rebuild every 2 minutes?)
    • (aalto chat) trouble points we have noticed:
      • How to work with a container that has been designed to be run as root (e.g. stuff is stored in /root)
      • If the code tries to write into the container, how do you use bind mounts to overwrite the directories with ones you have write access to
      • How to read Dockerfiles to see what the original image tries to do
      • How to get NVIDIA drivers into the container (the --nv flag)
      • How to run MPI-enabled code in containers (advanced topic)
  • https://hackmd.io/@pmitev/UPPMAX-Singularity-workshop
  • Plan: schedule a meeting with everyone interested in pursuing container lesson (Aalto, ENCCS?), Pavlin Mitev UU

Sphinx

SYCL

  • Add more on profiling/debugging/tooling? This is done through third-party applications and depends on the available hardware (NVIDIA/AMD/Intel), though.
    • debugging: CUDA version of gdb
    • in the last workshop the VNC server went down just as the profiling/debugging episode was starting
  • the repo has well-explained issues that should be dealt with before the next workshop

AI as a tool for change

  • Data-focused ML is not all that AI is, even if it is the most popular branch in practice today. If the term is not to be misused, training material and instructors should be the first to point out the distinction and correct it, consistently. So, if the workshop deals exclusively with ML (what relation it has to AI, how it works, how it is coded, how to apply it, what it can and cannot solve), then the title should reflect that (e.g. "ML as a tool"). Otherwise, the content could be expanded to include other AI aspects as well (at least the issue of uncertainty, maybe also brain-simulation research).

VeloxChem

  • Take up the lesson with Patrick, Xin et al.
    • they might want to change somehow and teach it
    • Patrick's group has been working on a sphinx-book quantum chemistry course
    • When should next workshop take place?
  • Need to add the missing one to enccs.se/lessons!
  • The package can be used both as a quantum-chemistry method library for one's own code and as an application for production-level computations, and these two topics have partly different audiences. The workshop can demonstrate both, but it would be nice to separate them more clearly to avoid confusion.
    • Follow-up - I'd keep a single VxC workshop in two parts:
      • Program overview and running/ trying out examples,
      • Performance and scaling (of some or all examples shown previously) in more detail; possibly adapted to a system the current workshop runs at
  • Two lessons:
  • Also: https://kthpanor.github.io/echem/docs/title.html
  • NCC-Lithuania interested in participating/contributing to future VxC workshops
  • Also Pedro and HPC2N
  • v1.0 is expected maybe at the end of this year; a workshop would come after that release

CUDA

  • lesson should not die (Artem left)
  • Pedro interested in teaching/contributing
  • HPC2N material is good, especially profiling part
  • all GPU lessons have similar-looking "intro to GPU" episodes
    • consolidate into meta-GPU lesson?
  • extend with HIP, e.g. with code-tabs showing both CUDA and HIP (see the sketch after this list)
  • HIP101 lesson (from earlier CSC collaboration)
  • minimal generalization: add HIP code-tabs (PR-8 in lesson repo)
    • and an optional hipification episode taught only in the HIP version of the workshop
  • possible to split breakout rooms for HIP and CUDA
  • expand GPU-intro section (or pull it in from upcoming general-GPU lesson)
  • regarding generalizing to include also HIP:
    • "Optimizing the GPU kernel" could be somewhat CUDA-specific
  • in "optimizing GPU kernel" episode consider if another example is better (SYCL lesson?)

GROMACS GPU performance

  • CSC interested in GROMACS (Atte)
    • ENCCS can collaborate with CSC or BioExcel on this material/workshop
  • Mark developed it but left for industry
  • Pedro interested in collaborating
  • ENCCS has more staff from January
  • make sure that material works with most recent GROMACS version!
  • move Puhti section to setup/installation episode and generalize it to "abstract" cluster
  • do an iteration on the images - Artem takes this
  • get in touch with BioExcel (Alessandra) about plans to teach this in the near future

Instructor guides

  • What are the steps of instructor training?
    • Raw tech tools of teaching
    • Room management, supportive environment, etc. (can sort of be delegated to a more experienced co-teacher)
    • Overall organization (does not have to be learned first)
    • The material of the lesson itself
  • Personas of people reading instructor guide
    • Organizer/primary teacher making the workshop plan
    • Primary teacher
    • Co-teacher needing to know the minimum to do the exercises when someone types
  • Best practices / a template for a good instructor guide could be put here: https://coderefinery.github.io/sphinx-lesson-template/guide/
    Or should it go somewhere in sphinx-lesson directly, right after the sample episode?
  • The exercise list is also good to have as part of an instructor guide, see this model: https://coderefinery.github.io/git-intro/exercises/
  • Sample instructor guides:
  • What goes into an instructor guide
    • Current status of each lesson,
    • What is most often taught in a typical course (and what is not often taught, so may not be well-refined)
    • common pitfalls

Day 2

Zenodo, publishing lesson

  • we're doing this
  • purpose: give authors credit, FAIR principles

LUMI intro material

Collaboration: ENCCS, NCC-Czechia, NCC-Poland, CSC/NCC-Finland, Peter Larsson (LUST representative in Sweden), maybe others

  • https://lumi-supercomputer.github.io/lumi-self-learning/
  • should become the standard go-to HPC intro material for EuroCC
  • should also be marketed in Sweden, can start with NSC
  • lesson work:
    • first porting from Carpentry markdown to sphinx-lesson markdown (directives, prompts etc)
    • then consider other improvements/additions etc
  • Thor to include Yonglei in meetings with the consortium; workshop dates to be set, priority of the task to be determined

RISE software bootcamp

  • dates: Nov 6-10
  • what background and roles will the participants have?
    • data scientists, ML engineers, data analysts
    • typically PhDs from natural sciences, environmental scientists
  • Topics
    • Git most important

    • Python is enough, not Julia

      • level of material could be adapted to audience
    • Unix shell: standard intro, file system, files and folders, basic commands, editors

      • good to include ssh also, and understanding the basics of the internet (IP addresses, ports, cloud, remote servers)
    • focus of programming

    • think about levels:

      • level 1: unix shell
      • level 2: basic programming in python, intro to Git
      • level 3: intermediate programming, collaborative/advanced Git
      • level 4: testing (Johan)
    • reproducible research - dependencies, good file structure, containers, workflows, sharing data:

    • social coding - licenses etc:

    • Jupyter

  • we should think about use cases and backwards design the curriculum
    • think about use cases and learner personas
  • prerequisites:
    • acquaint oneself with an editor or IDE?
  • ask Isabelle about availability of compute resources to staff
  • Thor to update team on results from survey among RISE unit managers

Personas for HPC training

Asked GPT-3.5 to generate some personas that could be used to design our training material. Not all personas will be relevant for all courses. This is just a starting point and we should rewrite and refine based on our experiences from previous courses.

These personas showcase the diverse range of individuals involved in education related to High Performance Computing, each with unique motivations, goals, and backgrounds. Designing educational programs and resources that cater to these personas can help foster a well-rounded HPC learning ecosystem.

The Ambitious Student Researcher:

This persona represents a driven undergraduate or graduate student who is passionate about pushing the boundaries of computational science and technology. They have a strong background in computer science, engineering, or a related field. They are eager to learn about the latest advancements in HPC, parallel programming, and optimizing algorithms. They often participate in HPC-related research projects, attend workshops, and engage with the HPC community. They are proactive in seeking out mentors and networking opportunities to enhance their knowledge and skills. This persona is hungry to make a significant contribution to the world of high performance computing and may go on to pursue a career in academia, industry research, or technology leadership.

The Industry Professional Upgrader:

This persona represents a mid-career professional working in industries like aerospace, finance, energy, or healthcare, where HPC plays a crucial role in solving complex problems. They recognize the need to enhance their skills to stay competitive in their field. This persona seeks out continuing education programs, online courses, and certifications related to HPC. They want to learn about parallel computing, GPU programming, and optimizing code for better performance. They might attend industry conferences, webinars, and networking events to connect with experts and peers. The Industry Professional Upgrader is motivated by the prospect of applying HPC techniques to streamline processes, improve efficiency, and drive innovation within their organization.

The Novice Researcher Exploring HPC:

This persona represents a researcher from a non-computational background, such as a biologist, social scientist, or humanities scholar, who recognizes the potential of High Performance Computing to enhance their research but has limited experience with coding and computers. They are curious and open to learning, but they might find the technical aspects of HPC overwhelming. This persona seeks accessible and beginner-friendly resources to understand fundamental HPC concepts, terminology, and tools. They might enroll in entry-level online courses or workshops specifically tailored for beginners in HPC. The Novice Researcher Exploring HPC is interested in understanding how HPC can accelerate data analysis, simulations, or modeling in their domain, and they hope to collaborate with computational experts to bridge the gap between their research expertise and HPC capabilities.

This persona highlights the importance of creating resources and educational materials that cater to individuals with varying levels of technical proficiency. By providing accessible pathways for novices to engage with HPC, the research community can expand its reach and foster interdisciplinary collaborations.

The Industry Innovator:

This persona represents an experienced professional working in a specialized industry, such as automotive engineering, where a specific project could greatly benefit from High Performance Computing. They have a clear project goal that involves complex simulations, such as crash testing or aerodynamic analysis, which requires immense computational power and precision. The Industry Innovator is aware that leveraging HPC can significantly speed up the simulation process, leading to quicker product development cycles and competitive advantage. However, they might lack the in-depth technical knowledge of HPC implementation. This persona seeks tailored consulting services or collaboration with HPC experts who can guide them through setting up and executing their simulations efficiently. They are results-driven and eager to see how HPC can transform their project, making it more accurate and cost-effective.

The Industry Innovator persona underscores the importance of connecting domain experts with HPC professionals who can translate industry-specific challenges into effective computational solutions. This collaboration can lead to groundbreaking advancements in sectors that heavily rely on precise simulations and data analysis.

Refining personas

These can be fleshed out more like for the Ambitious Student Researcher:

Background and Characteristics:

The Ambitious Student Researcher is an undergraduate or graduate student with a strong background in computer science, engineering, or a related field. They exhibit a deep fascination with the field of High Performance Computing (HPC) and are motivated by the prospect of exploring the limits of computational science. This persona possesses a solid foundation in programming languages, algorithms, and computer architecture, which allows them to engage with complex HPC concepts.

Passions and Goals:

This persona is driven by an insatiable curiosity to understand and apply HPC techniques to solve intricate problems. They view HPC as a means to accelerate scientific discoveries, tackle real-world challenges, and drive innovation across industries. Their primary goal is to contribute meaningfully to the realm of HPC, be it through academic research, industry collaboration, or developing cutting-edge software tools.

Activities:

  • Actively participates in HPC-related research projects, often collaborating with professors, mentors, and peers to explore advanced topics like parallel computing, distributed systems, and GPU programming.
  • Attends workshops, conferences, and seminars related to HPC to stay up-to-date with the latest advancements, tools, and techniques in the field.
  • Engages with the broader HPC community through online forums, social media, and open-source projects to share knowledge, seek advice, and collaborate on shared interests.
  • Undertakes personal projects, such as optimizing algorithms for parallel processing or experimenting with new architectures to enhance their practical skills.
  • Seeks out mentors, both within academia and industry, to gain insights and guidance on building a successful career in HPC.

Skills and Aspirations:

This persona is proficient in programming languages such as C/C++, Python, and perhaps CUDA or OpenCL for GPU programming. They have a knack for breaking down complex problems into manageable components and devising optimized solutions. The Ambitious Student Researcher aspires to publish research papers in top conferences and journals, contribute to open-source HPC projects, and eventually pursue a career in academia, industry research, or technology leadership roles.

Challenges and Growth:

While their technical skills are strong, the Ambitious Student Researcher may face challenges in balancing coursework, research commitments, and personal projects. Time management and maintaining a work-life balance are areas where they can continue to develop. Additionally, networking and communication skills are crucial for this persona, as collaborations and connections within the HPC community play a pivotal role in their growth.

Overall, the Ambitious Student Researcher persona embodies a passion for pushing the boundaries of technology, a dedication to continuous learning, and a desire to shape the future of High Performance Computing.

Registration questions

  • maybe we should ask more questions to find out who is attending our workshops
  • relates to learner personas
  • possible questions to add:
    • why attend workshop?
      • why are you attending this workshop
      • what do you expect to be able to do after the workshop
      • how do you think this training will help you in your project
      • could be multiple choice question, with extra free text field
    • question to find out type of person - learner persona
      • out of these four "learner personas", who are you?
    • why is it important to you to take this workshop?
    • is there any research on what questions to ask to characterise learners?

Event pages

  • our event pages should be better, more complete
  • add intended learning outcomes
  • who is the course for? people reading the course description should be able to tell if it's something for them
    • we want the right people to attend the right training
    • we don't want to scare people away, and we don't want over-qualified people
    • describe what novices get out of the training, what intermediate practitioners get out of it, etc.

Training material

  • consider using toy examples more, to show the purpose of the tools being taught
  • we already have the word-count example

Deep learning intro

HackMD alternatives - HedgeDoc hosted on ICE

  • ICE sometimes has availability issues, down for maintenance etc
  • problem with HackMD
  • before testing anything else, we should try out the Prime plan on HackMD
    • 5 USD per "team member"
    • https://hackmd.io/pricing
    • need to see whether we can set permissions the way we want with limited number of team members

MOOC-ifying material

  • record videos
  • possible platforms
  • support provided
  • "examination"

  • Thor to get in touch with Tiina and start planning and inform ENCCS team
  • probably go for GPU programming
  • work includes: improving/adapting existing lessons, recording videos, choosing tasks/exercise to use for summative assessment (to be able to proceed to next section)
  • possible platforms: ThingLink, Moodle
  • in principle a MOOC can be less ambitious, without exams etc
  • can go hand in hand with HPC certification discussion in EuroCC/CASTIEL
  • collect links to existing HPC-related MOOCs
    • we need to stand out, do something original or specialised (or better)
    • also attend own MOOCs to see what works well and what less well
  • a real challenge with going for GPU programming is access to GPUs!
    • Google Colab is possible but defeats the purpose of bringing people into EuroHPC
    • is it unrealistic to invite 100 people to our ENCCS training allocation on Leonardo/LUMI?
    • is it possible to set limits on user quotas on a cluster? probably only project-level
    • we should ask LUMI and Leonardo how fine-grained their control is
    • would it be possible for a EuroHPC center to set up cloud access reserved for MOOC?
      • Meluxina has cloud access mode, ask them?
    • use ICE??
  • this can be the main new idea from this hackathon!
    • plan: deliver this within a year
  • "LUMI also has an OpenShift/Kubernetes container cloud platform for running microservices."
    • we could use Kubernetes to manage JupyterHub for single-user notebooks connected to resources

New lessons

  • quantum autumn school

C++ lesson

  • Yonglei Wang
  • Jonas Lindemann
  • Sandipan Mohanty
  • Johan Kristiansson

existing material:

Johan can contribute:

  • Eigen

  • Python bindings - pybind11 (see the sketch after this list)

  • gtest

  • we can't really prioritise this project right now because of everything else

    • ask Sandipan to give first workshop with us
    • consider adopting and building on Sandipan's material if he open sources it
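
As a flavour of the Eigen/pybind11 material mentioned above, here is a minimal pybind11 + Eigen binding sketch; the module name fastmath and the function are hypothetical examples, not existing lesson code:

```cpp
// Minimal pybind11 + Eigen sketch (hypothetical module "fastmath").
// Build as a Python extension, e.g. with CMake and pybind11's CMake support.
#include <pybind11/pybind11.h>
#include <pybind11/eigen.h>   // converts Eigen types to/from NumPy arrays
#include <Eigen/Dense>

// A small C++ function we want to call from Python.
Eigen::VectorXd scaled_sum(const Eigen::MatrixXd &A, double alpha) {
  return alpha * A.rowwise().sum();
}

PYBIND11_MODULE(fastmath, m) {
  m.doc() = "Toy example: exposing Eigen-based C++ to Python";
  m.def("scaled_sum", &scaled_sum,
        "Return alpha times the row-wise sum of a matrix",
        pybind11::arg("A"), pybind11::arg("alpha") = 1.0);
}
```

From Python this would be used as import fastmath; fastmath.scaled_sum(A, alpha=2.0) with A a NumPy array.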

Colonies lesson

  • when it's working we can start developing lesson material

  • can use existing material (e.g. Julia-for-HPDA) and show how that would work with Colonies

  • maybe late this year is somewhat realistic

  • other new lessons?

CoE training collaborations

  • First prio: work with CEEC CoE and Plasma-PEPSC

    • led from Sweden by KTH
    • CFD and Plasma simulation
    • need to write to Niclas and Stefano and get the ball rolling
  • second prio: work with other CoEs that want to work with us, when the topic is relevant to our target groups

  • idea: CoE-month or CoE-period next year

    • for ~2 months, we organise multiple CoE workshops
    • involve other NCCs in this?
    • use EuroHPC
    • coordinate with CASTIEL