
<p style="text-align: center"><b><font size=5 color=blueyellow>ENCCS All-Hands Meeting - Training Session (250131)</font></b></p>
**Contents of this document and quick links**:
[TOC]
## <span style="background-color: cyan">1. Python HPDA retrospective</span>
### 1.1 Reflections from participants
==the second episode (efficient array computing) was quite packed==
- keep the current format
    - numpy: teaching 20 min + exercises 10 min
    - pandas + scipy: teaching 20 min + exercises 10 min
- we may consider separating numpy and pandas/scipy into two episodes
    - numpy: teaching 30-35 min + exercises 15-20 min
    - pandas + scipy: teaching 25-30 min + exercises 15-20 min
==It may be more useful if we can have more practical examples comparing the performance of Python with Fortran/C (in terms of speed)==
- a new episode or some materials/exercises for this topic?
==more description for the parallel computing episode==
- some terms and concepts (like threads and processes) were rather new to beginners, so we should probably give a more detailed introduction
- exercises will be further improved
==it would then be very helpful if you provided a few real-world examples as extra material==
- maybe we can consider using real-world examples instead of generated data
### 1.2 Survey results (including those from previous workshops on 2022-05-18 and 2023-09-05)
==What did you like best regarding event organization? Where should we improve?==
- Having it over multiple days and only for half a day helped to have enough time to go through the content smoothly and also work on it individually in the evenings, before the next session.
- The topic in this course should be split into two or more courses. Maybe Numba should be a course by itself.
- It would be helpful to have a list of all modules, libraries, etc. that might be needed during the workshop. This way those who do not use your package/LUMI but rather stick with their preferred setup on a local computer do not lose time installing things during sessions.
- Maybe you should include GPU programming, as this is very important.
- I like the format with two sessions in a day.
==What did you like best about the lesson material, exercises and teaching? Where should we improve?==
- The exercises were well structured and organized. Having the workshop over multiple days also helped to cover content without hurrying through it.
- It would be great to have even more real life examples.
- I think you should explicitly ask participants to read the material before each day.
- I will address the lesson materials in the welcome email
- The course material was best. It would perhaps be a good improvement to include a collection of shortcuts, tips and tricks as a summary: what to use when you're not sure how to approach a problem.
- Some of the material was difficult to understand as someone without a computer science/engineering background. Maybe add some extra boxes explaining some of the terms and concepts? Several exercises were too difficult to practice the actual concept.
- ==I liked the "slower" session, when we had time to do the exercises during the session.==
==Which topics would you be most interested in learning about in future training events?==
- 220518-Python-HPDA
- 1. Cython; 2. Advanced scientific data analysis with Python; 3. How to write a small but well-organised scientific application in Python/C/C++;
- **Machine learning** - sound/language processing - generative? (something like deep fakes)
- Tips and tricks for effective application of taught libraries, rather than general introduction and comparison
- **ML/AI**, large-scale data storage/databases (Cassandra or similar DBs), distributed storage (Lustre, S3 and such), performance optimization, and how to use them efficiently from Python.
- **Custom deep learning things**, like multiple inputs and outputs and input of varying sizes
- GPU parallelization, more on optimising python code / using numba or cython. Even on using other languages like Fortran or Julia.
- High computation performance for specific fields: **Machine learning and computer vision**
- I would like to deepen my knowledge in numba. Maybe some course in advanced visualization techniques.
- **Machine learning for geospatial data**
- 230905-Python-HPDA
- Looking forward to Julia courses
- Access to cluster and practice jobs in python in a supercomputer cluster is mandatory....
- **Artificial Neural Network (ANN)**
- Python HPDA on different platforms specificities
- 250121-Python-HPDA
- **gpu programming, machine learning**
- **Would actually appreciate one more day where you talk about GPU**
- A use case I don't see covered frequently is profiling and parallelising when using Python packages such as **interpretml**. What should one do if they can't change the packages easily but they need to speed up their code?
- I would suggest extending course content a little bit but I feel that most of the topics related to computing were covered
==Any other comments?==
- **Breakout rooms** are always a challenge in this type of training session. Many people don't like to talk a lot or collaborate with strangers, and that is understandable. Maybe participants should be asked to raise their hands if they want to be sent to a breakout room to work together with other people, and the rest should stay quiet in the main room.
### 1.3 Personal opinions for teaching and organizing events
- it is better to hold workshops in the morning (9:00-12:00 or 9:30-12:30)
- YL will send the lesson materials in the welcome email, so participants can skim them before the workshop to see which topics will be covered each day
- balance teaching and exercises; avoid leaving all exercises to the end (for long episodes)
    - for a 50-minute session (XX:00-XX:50)
        - 1 round of teaching and exercises (maximum 2 rounds)
        - the lecture part can be 25-30 min and the exercise part 20-25 min
    - for an 80-minute session (XX:00-XX+1:20)
        - there can be 2 lecture parts and 2 exercise parts
        - 1st round: lecture 20-25 min, then exercises 15-20 min
        - 2nd round: similar arrangement, depending on the teaching content
- the instructor of each episode (except the 1st episode) should say a few words giving a general description of the current episode +1 Francesco
    - including how it relates to the other episodes
- a short recap of each episode when it ends, maybe with some description of the exercises (Ashwin) +1 Francesco
- at least two people for each episode, one as instructor and the other as helper to provide support +1 Francesco
### 1.4 Expansion of Python workshop
Expansion to three workshops: https://hackmd.io/@yonglei/python-workshops
- 1. Python HPDA
- If we split them, we could also cover more things about big data storage and retrieval (S3, databases, tips on parallel file systems...) (Francesco)
- 2. Python HPC
- it would be beneficial to have a comparison of all parallelization methods and their best use cases as a summary
- like [an overview of common data formats](https://enccs.github.io/hpda-python/scientific-data/#an-overview-of-common-data-formats)
- Someone in the feedback mentioned having a KB of tips and tricks, sounds interesting! (Francesco)
- 3. Python ML/DL
**Publication for the lesson material**
- Zenodo
- JOSE
## <span style="background-color: lime">2. Arrangement for workshops and webinars</span>
<iframe src="https://calendar.google.com/calendar/embed?height=500&wkst=1&ctz=Europe%2FBerlin&src=NWQ5NWNiNWI4ZWQ1ZDhmZjBkNDliNDVlMjIyNDQ3ZTQ2MjAxMDY2NDZmYTMxZjhjY2VkMjRhZWVmZGRlMjZkZUBncm91cC5jYWxlbmRhci5nb29nbGUuY29t&color=%23F6BF26" style="border:solid 1px #777" width="800" height="500" frameborder="0" scrolling="no"></iframe>
- Feb. 04-07, Julia High-Performance Data Analytics
- Mar. 03-07, MultiGPU Train-the-Trainers Course
- Yonglei and Ashwin will teach basics of deep learning (~ 3h)
- [intro to deep learning](https://enccs.github.io/deep-learning-intro/)
- [schedule](https://docs.google.com/document/d/1ztkd5I2k40QetHLwKdnOw4d6Ub_BsrR2epV2dt0wV3E/edit?tab=t.0)
- Mar. 12, Training Hackathon
- Mar. 18-20, EuroHPC Summit
- Mar. 25/27, Practical Intro to Machine Learning
- Apr. 08-09, NVIDIA N-ways Bootcamp
- Apr. 29, Practical Intro to GPU Programming with CUDA
- Apr. 30, ENCCS Industry Days
- May 12-16, Introduction to Deep Learning
- May 27-28 (Jun. 3-4), NVIDIA AI for Science Bootcamp
- there will be a hackathon in Sept./Oct.
- can we attend this event as participants?
- Jun. 17-18, NVIDIA MultiGPU Bootcamp
- Jul. 09-10, NVIDIA AI Multinode Profiling Bootcamp
Waiting list:
- workshops
- [OpenFOAM](https://enccs.github.io/openfoam/) (April?)
- contact Karim from NCC-France
- CEEC CoE
- contact Niklas again
- ==Week 20, intro to deep learning==
- contacting NCC Romania for detailed dates
- basics of deep learning and two/three application cases
- CR workshop at KTH? (streaming via YouTube?)
- April, May, potential collaboration with Hyperight
- A one-day workshop with Frank
- OpenACC and using a graphical interface to manage input/output from compilation
- webinars
- ==Week 13, Practical Intro to machine learning (YW)==
- ==Week 18, Practical Intro to GPU Programming with CUDA (YW)==
- Practical Intro to GPU Programming with OpenACC (==???==)
- MoroccoHPC webinars
- Array computing using Python (???)
- Robert Luciani – Julia and HPC, or general AI topics
- Ashwin – Using MLFlow on LUMI
- ==Johan – ColonyOS (April)==
- remind Johan about a specific date
- Francesco – Julia and ML
- Thor – some general EuroHPC
- Introduction to supercomputing for AI (???)
## <span style="background-color: magenta">3. MOOC project</span>
## <span style="background-color: orange">4. Training hackathon and reorganization of GitHub repositories</span>
[**Working on the GitHub repos**](https://hackmd.io/@yonglei/enccs-github-repos)