# General GPU workshop planning

https://notes.coderefinery.org/7DtgaasVQyW1Z_rzbMG6uA

### Pre-workshop planning

Day 1 (06-09, Fri): Richard joins and co-teaches most sessions! :)
- Welcome --- Thor, RKD
- Why GPUs? --- Thor, RKD
- The GPU hardware and software ecosystem --- CVA
- What problems fit to GPU? --- Thor, co-Stepas(?), Stephan (humanities part?)
- GPU programming concepts --- CVA, RKD
- Introduction to GPU programming models --- CVA, RKD
- High-level language models --- Thor, Stepas(?)
- Directive-based models --- CVA(if needed), Qiang, Stepas
- Multi-GPU programming with MPI --- Hicham
- Non-portable kernel-based models --- CVA, Andrey

Day 2 (06-12, Mon):
- Portable kernel-based models --- Jaro H., CVA (only SYCL related stuff), Andrey
- Preparing code for GPU porting --- CVA(general considerations), Magnar & Erik on translators?
- Recommendations and discussions --- [?], [?]
- Problem example(s) --- Stepas, [?] (apparently, no kernel version except SYCL is going to make it this time)
- Wrap-up --- Thor(?)

### May 29 2023

Present: Stepas, Thor, Jussi, Cristian, Qiang, Andrei, Tuomas, Yonglei, Richard, Hicham, Pedro

#### Notes

* Translation tools will have to be incorporated into porting section (.rst file is already [in](https://github.com/ENCCS/gpu-programming/blob/main/content/porting-codes.rst), but not added to the episode list)... once someone touches up the porting section :) (CVA?)
    * Thor to make sure that this gets added
* Timings on instructor guide are outdated
    * Thor to update today
* GH issues:
    * [#19] :abbr: / glossary basically unused; is the plan to go over next week / during the finishing stages?
        * Thor to look at this
    * [#27] (two clarifications in GPU concepts) to accept Andrey's suggestions, update/commit/close?
        * Done
    * [#37] (hipSYCL name) already done?
        * Let's keep hipSYCL for now
    * [#33], [#47] (bullet-summaries and see-also) to prioritize during go-over, once the content is in place
- Accepting participants today, 155 registrations!
    * 89 FI, 20 SE, 2 DK, 34 Norway, 0 Lithuania (:()
    * What is the max (w/ allocations)?
    * invite 60 in total: 25 FI, 15 SE, 18 NO, 2 DK
    * 11 confirmed for 16th (hands-on session), to invite the workshop participants too
- rerun in the autumn
    - October maybe, needs to be coordinated with everyone
* current episode structure:
    ```
    1-gpu-history
    2-gpu-ecosystem
    3-gpu-problems
    4-gpu-concepts
    5-intro-to-gpu-prog-models
    6-directive-based-models
    6.5-multiple_gpu
    7-non-portable-kernel-models
    8-portable-kernel-models
    9-language-support
    10-gpu-porting
    11-recommendations
    12-examples
    12.5-humanities
    ```
    - changes needed?
        - will episode 6.5 be covered?
            - Yes, it is there, after directive-model section (~end of Day 1)
        - suggested change: move 9-language-support to before 7-non-portable
            - +1
- Does anyone have a good example for episode "What problems fit to GPU..." to show what type of speedup can be gained? Just a SAXPY comparing CPU threading to GPU?
- Thor will move humanities use cases (ep 3)
- help with non-portable episode
    - test OpenCL code! Andrey to have a look
- Multi-GPU parallelisation should follow Directive-based models
    - Already does
- Julia and CUDA versions needed for stencil
    - Thor to look at Julia, Yonglei at CUDA
* audio/video/screenshare test sessions June 7-8
    - Thor to send doodle

---

**Who teaches what?**

Please write your name where you could contribute to teaching during workshop (let's try to have at least 2 names, main+fallback, for each section, to better manage uncertainty)

Day 1 (06-09, Fri): Richard joins and co-teaches most sessions! :)
- Welcome --- Thor, RKD
- Why GPUs? --- Thor, RKD
- The GPU hardware and software ecosystem --- CVA
- What problems fit to GPU? --- [?], co-Stepas(?), Stephan (humanities part?)
- GPU programming concepts --- CVA, RKD
- Introduction to GPU programming models --- CVA, RKD
- Directive-based models --- CVA, Qiang, Stepas
- multi-GPU --- Hicham
- High-level language models --- Thor, Stepas(?) -- Moved from Day 2

Day 2 (06-12, Mon):
- Non-portable kernel-based models --- CVA, Andrey
- Portable kernel-based models --- CVA, Andrey
- Preparing code for GPU porting --- CVA(general considerations), Magnar & Erik on translators?
- Recommendations and discussions --- [?], [?]
- Problem example(s) --- Stepas (general setup), [?] for kernel models
- Wrap-up --- Thor(?)

---

### May 16 2023

Present: Thor, Richard, Cristian, Andrey, Stephan, Magnar, Qiang, Hicham

One foot in: Stepas

#### Status:

- Multi-GPU: text available after Tuesday next week, and exercises
    - GPU-aware MPI
    - CUDA: Magnar contributing to multi-GPU aspects
- Thor to add everyone to LUMI allocation
- 46 registrations!
    - mostly from SE and FI, need to advertise in DK, NO and LT (Thor to contact Stepas)
        - [name=ST]: on it -- pinged our PR once again, will see what happens

#### Notes

- where to include porting tools?
    - portability episode
- move "Multi-GPU parallelisation" to right after "Directive based models" in schedule
- episode "What problems fit on GPU"
    - have collapsible box for specific examples?
    - humanities use case can fit here
- Frameworks/ models
    - SYCL: Andrey is in the process of finishing it
    - CUDA & HIP: CVA
    - OpenMP: Qiang is on it
    - OpenACC: Qiang also, and Hicham to synchronise
    - Python and Julia: Thor

#### Pre-meeting

- How about those HPC logins? They are needed e.g. to arrive at consistent numbers for example/case-study ep. and lift discussion from WRITEME stage there. (The point is to have the same machine for most frameworks, not even mandatorily the one that the workshop will be presented at, if there is still some debate regarding availability.)
    - Thor will add everyone to LUMI allocation
    - everyone should try their code examples on LUMI!
    - SYCL: push LUMI support to install openSYCL using EasyBuild
        - we can install openSYCL project-wide
        - Cristian can do it in ENCCS project!
- It's probably a good idea to close overly-broad "episode" issues in GitHub and to enter better-defined to-do chunks: "add example in this or that framework", "move humanities cases to relevant place" etc. (Suggestion is for anyone assigned an episode issue to restructure the task, as they are probably most familiar with what is done and what is yet to be completed.)
    - good idea, Thor will try to tidy up and raise more specific issues
    - [name=ST] too
- No publicity through EuroCC2 dissemination channels? Some June workshops are already in the weekly newsletters. Or is there a deliberate reason for it?
    - we already have enough registrations so EuroCC2 dissemination not needed now
    - but still advertise in LUMI newsletter
        - Anne Vomm, Annie Jakobsson
- So far, almost no one is signing up for actually presenting the episodes
    - (it got better, schedule moved up to the description of next meeting)

### April 27 2023

Present: Cristian, Stepas, Richard, Qiang, Stephan, Hicham, Tuomas, Yonglei

**Agenda:**

- where to fit in NRIS contributions?
    - automatic translation tools
        - into portability episode?
    - multi-GPU with MPI
        - Is this with CUDA/OpenACC/OpenMP? Then... general-language support?
- Will there be an episode on language-based support at all? Or does it get sliced into GPU model eps?
    - yes we should have it. Separate episode?
- ^^ The two points above are solved jointly below
- registration is open: https://enccs.se/events/2023-06-gpu-programming/
    - spread through your channels!
    - how many to accept?
- HPC system to use
    - we can use ENCCS allocation on LUMI, Thor can add anyone for testing the material
- discuss whether to use "in-short" callout boxes with bullet points summarising long paragraphs, for presentation purposes
- naming: can we think of something more appropriate than "portable" and "non-portable"
- Suggestion for workshop sectioning:
    - 1.1 Intro, problems
    - 1.2 Concepts, models overview
    - 2.1 Models (that's like 4 eps.)
    - 2.2 Portability, use-cases
    - 2.2.2 Recommendations, wrap-up.
- schedule and timings
- ^^ The two points above are solved jointly below.

**Notes:**

- how many to accept?
    - accept more to the first part without hands-on and HPC access
    - ~40 for hands-on
- Thor to add everyone to LUMI allocation
    - OpenACC only for Fortran
    - limited compiler support for OpenACC/Fortran
    - most stuff from the lesson will work
    - so we use only LUMI, for simplicity
- "portable" vs "non-portable"
    - appropriate enough?
    - add these terms to the hover-text (tooltip) with :abbr:
- add multi-GPU after the directive-based and kernel-based episodes
    - main part will be OpenMP and OpenACC with MPI, so add it after directive-based episode
- add automatic translation tools to "Preparing code for GPU porting" episode
    - the episode thus has two points: preparing non-GPU code for porting (refactoring etc), and translating
- expanding CUDA:
    - add CUDA/HIP versions to "GPU parallelization examples" section in episode "Problem example: stencil computation"
    - Cristian and Jaro focus on the "Non-portable kernel-based models" episode
- humanities use case
    - can we show it algorithmically or even with code?
    - otherwise include it with the overview/conceptual first part?
- micro-hackathon on June 16
    - Thor to set up registration for micro-hackathon day
        - participants describe their code and goals
    - in same context: also announce CSC hackathon Sept 4-6
- ***Priority***: Exercises throughout the episodes
    - Include build instructions for use-cases (and for smaller coding-exercises too?)
    - inspiration: https://coderefinery.github.io/manuals/lesson-design/#thinking-of-exercises

Thor to send out doodle for next meeting in ~2 weeks

---

### April 13 2023

Present: Hicham, Tuomas, Thor, Jussi, Jaro, Stepas, Pedro, Qiang, Andrey

Agenda:
- new workshop dates: June 9 and 12
    - when hands-on support part?
        - Decision: June 16.
        - have a registration for micro-hackathon day
            - participants describe their code and goals
        - also announce CSC hackathon Sept 4-6
- which cluster to run workshop on??
    - NVIDIA system OK, HIP supported. Needs ROCm stack to be installed and this could be complicated
    - or LUMI (SYCL(?), OpenMP, Kokkos, OpenCL all should work in addition to HIP, OpenACC only for Fortran!)
    - Or we use both LUMI and another EuroHPC system with A100s! ENCCS to prepare access. But this will ~double our efforts
    - think about this until next meeting
- open PR
    - [name=ST] It has two distinct additions. Haven't looked at the image, but Ep.11 content seems to be a reduction example. Question is whether it should be a separate (expanded) use-case, if (small) reduction examples are present in GPU programming models part(s).
    - I extracted the addition of the image from this PR in 9d48237f07
- Humanities use-cases
    - would benefit immensely from concrete code snippets (if they stay on use-cases part) or at least example-based problem formulations (i.e. by what principles would they be designed/ run, even if without supporting code)
- recap on who's doing what
    - In particular who's doing CUDA? Need to sync it with CSC HIP efforts
        - Andrey?
        - or Yonglei?
    - [name=ST]: update + hints for rounding out problem example
        - then re-read lesson from the beginning "as dilettante"
    - The tests/ build scripts would be great, done by someone with access to the machines the workshop will be run on
- non-portable vs portable kernel-based models
    - current split looks fine intuitively, just the naming question
    - is "portable" the right word?
        - "cross-platform"?
        - ChatGPT: "In summary, HIP and CUDA are more specialized platforms for programming GPUs, while Kokkos, OpenCL, and SYCL are more general-purpose parallel programming models and frameworks that support multiple hardware architectures."
        - "vendor-specific" and "general"
        - replace with just names of frameworks?
    - in any case, create a matrix showing the portability aspects of all frameworks
    - raise issue on the repo, sleep on it and discuss it there
- format of workshop
    - exercises! try to add some to each episode
    - need to add timings to episodes soon
        - how about multi-GPU parallelisation and automatic porting?
        - Thor to tentatively assign timings to episodes to get a clearer picture
- write best-practice guide afterwards?
    - need to be careful since best practice guides easily get outdated
    - possible to automatically convert to pdf after each (workshop) release?
        - like a live article!
    - in any case, reuse the material in several places

---

### March 27 2023

Present:
- [name=ST]: no one came..?

Agenda:
- workshop dates: initial suggestion was 29-30 May
    - Thor to send out doodle/survey, options: May 29-30, 30-31, June 7-8, 8-9, 7&9
- recent progress

Should multi-GPU be included?
- complementary to single-GPU topics
- MPI folks familiar with MPI-based multi-GPU
- idea: additional episode with "extra topics"?
    - keep this very introductory, mostly to show that it exists
    - [name=ST] supports extra "it's there" episode, as it will alleviate the hand-wringing on what should be included in "main" part and what can be left out / reevaluated after feedback of 1st iteration
- consider creating new lesson with continuation of this intro lesson
    - Well, for that one would probably need to slow down radiation frequency of cesium-133 a bit...

Question from Cristian in Zulip:

> How would be a good way to proceed with the programming models? Should it be something like full sections like:
> 1) Directives (txt+code)
> 2) Non-portable CUDA/HIP (text +code)
> and so on.
> Or should we have first a chapter going through each programming method and then another chapter with the example codes?

- divide "GPU programming models 2 (detailed)" into multiple episodes (chapters) for:
    - directives
    - kernels
    - language support
- we need to start thinking of the whole picture
    - Thor to start reading through completely and refactoring/revising

### March 6 2023

Present:
- Stepas Toliautas, NCC-Lithuania
- Cristian-Vasile Achim, CSC
- Jaro Hokkanen, CSC
- Tuomas Lunttila, CSC
- Qiang Li, ENCCS
- Thor Wikfeldt, ENCCS
- Stephan Smuts, Aarhus university
- Hicham Agueny, UiB/NRIS

#### Notes

GitHub policy:
- send PR to ask for feedback
- direct push to main is fine to get much content in place asap, revise later

- Rename "Problem example" episode to "Use cases", so we can include e.g. humanities stuff alongside heat equation?
- NRIS material: https://documentation.sigma2.no/code_development/overview.html#code-development
- Recommendation episode
    - idea: decision tree (CSC has something, expand with high-level approaches)
        - cannot find the original link to the LUMI blog, but it is the same image as in this talk (https://archive.fosdem.org/2022/schedule/event/utilizing_amd_gpus/attachments/slides/5163/export/events/attachments/utilizing_amd_gpus/slides/5163/fosdem22_hpc_amd_gpus_markomanolis.pdf) (slide 27). And very similar to the ENCCS page (https://enccs.github.io/port-to-lumi/#/)
- SYCL/HIP/Kokkos etc paragraphs:
    - translate parallel-for, reduction etc. to Python, Julia etc
    - (and for directive-based as well?)
- combine (GPU-aware) MPI + OpenMP/OpenACC-offloading
    - include it?
    - As MPI does not directly map to GPUs, this is quite advanced level; should weigh pros/cons
- automatic conversion tools: hipify-clang, syclomatic, clacc (OpenACC->OpenMP)
    - include this in "Preparing code for GPU porting" episode
- **Workshop dates**
    - Tue-Thu, May 30 - June 1
- GPU software environments episode
    - original idea: toolchains, compilers, libraries etc (no details but high-level overview)
        - ROCm, NVIDIA STK, CUDA toolkit, oneAPI, ...
        - as well as (native) compiler support (OpenMP/SYCL)
    - move _stuff_ to "GPU programming types" where relevant
    - take _stuff_ from "episode 1.5" from repo
    - GPU utilisation: nvidia-smi, rocm-smi, etc (with options)
        - analogue for Intel?
    - Hicham takes this
- where do libraries go?
    - cuFFT etc
    - call from OpenMP/OpenACC interface
    - include this as subsection in "programming types" episode
    - Hicham contributes!
- Zulip channel for faster communication
    - Thor sets it up

### Feb 14 2023

Invited:
- Stepas Toliautas NCC-Lithuania
- Jussi Heikonen CSC
- Cristian-Vasile Achim CSC
- Jaro Hokkanen, CSC
- Tuomas Lunttila, CSC
- Qiang Li ENCCS
- Thor Wikfeldt ENCCS
- Richard Darst Aalto SciComp
- Stephan Smuts Aarhus university
- Andrey Alekseenko KTH
- Pedro Ojeda, HPC2N Umeå

Repository: https://github.com/ENCCS/GPU-programming/
- who needs access?

Rendered: https://enccs.github.io/GPU-programming/

#### Agenda

1. Short walkthrough of previous notes
2. New collaborators from CSC
3. Current plans, what are you working on
    - open PRs
    - issues
    - how do we get the ball rolling?
4. Workshop dates
    - see poll at bottom of this page
    - have more than 1 week between workshop and "hackathon"?
5. Example problems
    - swap heat-equation for game of life?
    - keep reduction?
    - other use cases from e.g. humanities?

Lesson status
- software env is something already installed, the software stack
- episode 5 misleading?
    - is Kokkos software env? oneAPI?
    - content is correct, title confusing?
        - "GPU programming frameworks"
- relate Kokkos and SYCL more closely? similar approach
    - consensus: join Kokkos and SYCL in same episode

Who does SYCL?
- Jaro and Andrey sync on this

structure of episode 7
- introduce each framework after each other, or introduce concepts and show examples in many frameworks
- use code tabs to merge similar frameworks

duplication in "GPU software environments" and "GPU programming types"

original plan:

GPU software environments
- installation of dependencies
    - NVIDIA, ROCm, oneAPI, Metal
- compilers

GPU programming options
- low-level, directive-based, kernel-based, … options.
    - low-level: CUDA, HIP
    - directive-based: OpenMP, OpenACC (incremental approach)
    - language-based: Python-Numba, Julia-GPU, SYCL, TensorFlow/PyTorch
- easiest to use list or table to explain differences between frameworks
- some sort of visualisation maybe possible later?
new plan:
- ignore "software env" for now, focus on "prog types"
- use "show don't tell" approach
    - what about a table with frameworks listed vertically and features/use-cases horizontally, marked with + / ± / ø / -, so you can scan and see which framework covers your use cases?
        - analogy: https://aaltoscicomp.github.io/python-for-scicomp/data-formats/#what-to-look-for-in-a-file-format -> use a table like the one under the comic
        - (RD: for me, not knowing much, that's what I'd first look for.)
- merge episodes "software env" and "types", include codes for each framework
- contextualisation important
    - good for industry, new user groups?
    - have this in "recommendations" episode
    - RD: flowchart or something for selecting a framework (for those who are just getting started - doesn't have to be complete)? As a new person, I can't know everything but I'd like to know the simplest thing that works for me.
        - Also perhaps mention why there can't be one recommendation for everyone - I know there is history there.

Getting the ball rolling
- deadlines
    - v1.0 ready 1 month before workshop (late April), leaving room for review, improvements etc
- Jussi and Cristian do HIP stuff
- Stepas: stencil example / heat equation, homogenising for frameworks
- Stephan: add material on how GPUs can be used in humanities
- Thor: translate to Julia, and contribute to first high level episodes, also learning objects and such
- Andrey: SYCL and OpenCL
- Jaro: some SYCL, Kokkos
- Pedro: will look at material and get back
- Richard: still not qualified to write much, but will review and co-teach to learn.
- portability
    - no good to write each framework separately
    - partly addressed by SYCL/Kokkos
    - header-approach for handling multiple codes (Jaro)
    - important for decision makers
        - how not to write yourself into a corner
    - how important? at least mention clearly, so people leave with a good idea of what it is and how it can be important
- swap heat-equation for game of life?
    - heat equation is lower effort since much already exists
    - don't focus on Poisson equation, emphasize the stencil aspect
    - keep heat equation
- CSC has 3-day GPU hackathon end of August (NOMAD+EuroCC)
    - announce this to workshop participants
- We go for May 30-31!
- new content can be pushed to main, changes to existing content added via PR

---

### Jan 18 2023

Invited:
- Stepas Toliautas NCC-Lithuania
- Jussi Heikonen CSC
- Cristian-Vasile Achim CSC
- Qiang Li ENCCS
- Richard Darst Aalto SciComp
- Stephan Smuts Aarhus university
- Thor Wikfeldt ENCCS
- Andrey Alekseenko KTH

Interested in future involvement:
- Pedro Ojeda, HPC2N Umeå

Previous planning document: https://hackmd.io/@enccs/rk5MabZks

Repository: https://github.com/ENCCS/GPU-programming/

Rendered: https://enccs.github.io/GPU-programming/

#### Agenda

1. Short round of introductions
2. Status update, what's done so far
3. Discuss the material
    - go through current version of lesson, refer to previous hackMD notes
    - discuss questions&objectives in each episode
    - valuable resources?
    - is the existing structure good enough?
    - what level of detail?
    - which frameworks to include?
    - example problems:
        - heat equation and reduction?
4. How should we organise our work, who does what
    - self-assign issues for episodes
    - discuss ideas in new issues
5. Possible dates for first workshop
6. Our next meeting

#### Notes

How many participants? This will influence format of workshop
- have first week open to everyone, second week for fewer

Screening participants for the second week based on their applications is a good idea

Add section?: use existing GPU-enabled software
- e.g. "I want to use TensorFlow/PyTorch etc, what do I do?"
- are there any general principles relevant to all/most GPU-enabled software?
- maybe this blurs scope of workshop
- keep highest level at language-specific GPU support (Python, Julia)

Familiarity with programming languages is assumed (prerequisite)

Industry audience is important

Currently missing part: portability
- doesn't make sense to write code for only one architecture
- should emphasize portability
- update section "Preparing code for GPU porting" to encompass portability between GPU architectures?

Terminology
- workitem / thread / ...
- include simple table with translation of terms between frameworks?
    - this would fit into "GPU programming concepts"
- technical solution: in Sphinx we have glossaries and terms, so terms can link to the glossary. Or even add new directives so one can hover over term and get explanation
    - wait with this until later iteration of workshop?
- Decision: use CUDA terminology but link with glossary
- Sphinx:
    - https://www.sphinx-doc.org/en/master/usage/restructuredtext/roles.html#role-term
    - https://www.sphinx-doc.org/en/master/usage/restructuredtext/roles.html#role-abbr
        - abbr shows a tooltip on hover
    - basically what I said but I could probably make it automatic from a set of definitions

Example problems
- currently heat-equation and reduction
- Stephan to think about interesting general examples for humanities people
    - focus should be on the programming construct (e.g. 2D double for loop)

**frameworks to include:**
- CUDA/HIP
- SYCL
- OpenACC, OpenMP
- Python: Numba, ?
- Julia: CUDA.jl, AMDGPU.jl, oneAPI.jl
- TensorFlow/PyTorch as a GPU library (i.e. not only for deep learning)
- Kokkos
- OpenCL (*new!*)

Can we provide all these environments on the training cluster we'll use? Also show how to install locally!
**First workshop**
- May or June
- CSC summer school is end of June
- Thor to send out doodle with date suggestions

**Who does what**
- Jussi and Cristian do HIP stuff
- Stepas: stencil example / heat equation, homogenising for frameworks
- Stephan: add material on how GPUs can be used in humanities
- Thor: translate to Julia
- Andrey: SYCL and OpenCL
- CSC to check about Kokkos

**Resources**
- reduction: https://enccs.github.io/CUDA/3.01_ParallelReduction/

~~Thor to send doodle for next meeting in few weeks time~~

### Workshop scheduling

The workshop should take place over three days, two in the first week and one in the second week. Please vote for options below that work for you by adding your name.
- [name=ST]: June 5-8 is likely to be thesis defense week, so availability depends strictly on defense committee assignment. Other than that, I'm trying to keep all options available at the moment.

**First week (workshop)** (half or full days)

May 29-30: Thor, CVA, Stepas, Andrey?, RD
May 30-31: Thor, Stepas, Andrey, RD
May 31, June 1: Thor, Stepas, Andrey, RD
June 1-2: Thor, CVA, Stepas, Andrey, RD
June 7-8: Thor, CVA, *Stepas?*, Andrey
June 8-9: Thor, CVA, *Stepas?*, Andrey

**Second week (mentoring and Q&A):** (likely full day)

June 5: Thor, CVA, *Stepas?*
June 7: Thor, CVA, *Stepas?*, Andrey
June 8: Thor, CVA, *Stepas?*, Andrey
June 9: Thor, CVA, Stepas, Andrey
June 12: Thor, CVA, Stepas
June 13: Thor, CVA, *Stepas?*, Andrey
June 14: Thor, Andrey
June 15: Thor, Tuomas, Andrey