# NordicHPC meeting hackpad
Our ideal meeting would have the different sites present "cool things they do and problems they have", and then spend the rest of the time working in small groups to share cool things and solve problems. But we all know that we need a bit more structure.
So, we have a list of interesting topics. People can offer talks. Everyone gets a lightning talk, and we choose which talks to hear with everyone, and which to have in breakout sessions.
**Everyone can and should edit this scratchpad. You can also give comments in the HTML view by selecting the talk bubble next to the paragraph, or use [this github issue](https://github.com/NordicHPC/nordichpc.github.io/issues/28).**
**Every site should add (at least) three cool things they do and three problems they have to [this presentation](https://docs.google.com/presentation/d/1zXVxqLQ9E8AEtexIKWWt7AjwmrKLt9HpJCvIGcv7-nQ) for an introductory talk.**
## Schedule
### Thursday
All times EET. And we are running late.
* 11:30 (approx): light lunch served
* 12:00 [Welcome](https://cicero.xyz/v3/remark/0.14.0/github.com/nordichpc/talk-meeting-intro/master/talk.md/), intro round, What is NordicHPC. Radovan Bast
* 12:?? [Practical info](https://docs.google.com/presentation/d/1F_2CPfOnovCsvfKg2yYyJMBwMLDEGkZcGDlljB2I9zE), Richard Darst
* 12:45 Usability, Marko Nieminen, Aalto professor of usability and user interfaces
* 13:15 Break
* 13:30 Cool things and problems
* 14:30 Break
* 14:45 Cool things and problems continued
???
* 15:00 Break
* 15:15 Lightning talks (some slides can be added [here](https://docs.google.com/presentation/d/1Ou4Ek3gcHP-domV4BVHXpQFW5CHlEH798j7wg_h_6Zs))
* 16:00 Unconference voting
* 16:30 Unconference (breakouts, talks)
* 18:00 Day ends
* 19:00 Group dinner
### Friday
* 09:00 Summary of Thursday
* 09:30 Unconference continues... talks and breakouts.
* 11:00 Summary of Friday and the future.
* 12:00 End of main program
* 12:30 Lunch for those still around (some local restaurant)
## Offered lightning talks
You can add slides to [this slideset](https://docs.google.com/presentation/d/1Ou4Ek3gcHP-domV4BVHXpQFW5CHlEH798j7wg_h_6Zs).
* Usage monitoring at UiT/Sigma2-Metacenter (Radovan Bast)
* Carpentry-style HPC training: https://sabryr.github.io/hpc-intro/ (Sabry Razick, UiO)
* On resource use and losses (Peter Kjellström, NSC)
* Testing things, a broad view (Peter Kjellström, NSC)
* NSC Slurm additions, modifications, etc. (Peter Kjellström, NSC)
* Python-hostlist (Peter Kjellström, NSC)
* User survey and RT analyser http://folk.uio.no/sabryr/Survey.pdf (Sabry Razick, UiO)
* Software Deployment CI (Simo Tuomisto, Aalto, https://github.com/AaltoScienceIT/science-build-environment)
* OpenHPC (Janne Blomqvist but presented by Richard Darst, [fgci slides](https://users.aalto.fi/~jblomqvi/fgci-openhpc-slides/fgci-openhpc.html))
* JupyterHub for clusters (Richard Darst)
## Topics of interest (Requested talks)
If you would like to organize one of these topics, please copy it to a new section below and describe it.
* Jupyter for HPC
* Storage and data management challenges for HPC, especially small files and machine learning
* Cloud and HPC
* Fairshare
* Uniformity of HPC setup across sites
* Software maintenance
* Usability of computing, proactive support, persistent user experiences
* In-transit data analysis
* Singularity
* OpenHPC
* Summary of NeIC workshop
* HPC regression testing, e.g. [Reframe](https://reframe-hpc.readthedocs.io/en/stable/)
## Talks and events
Please enter your proposed topics here. *Everything* (if possible) will get a lightning talk, then using an un-conference format we'll decide which to have in a longer format.
Of course, you don't want to prepare something and then not get to present it. Anything prepared can be used in a breakout session, and we can pre-select important talks.
### System testing
I (Peter Kjellström) would be interested in a session about testing (on many levels). I could talk a bit on the subject, present what we do etc. But it could also be a great open sharing/discussion topic.
To better define what I mean: this includes strategies for testing/regression testing high-level software (apps), but also testing done for system changes and complex we-can't-test-that scenarios.
### Usability and HPC
(invited talk, see main workshop page)
Session notes:
- What do you want to do with it?
- Developer (tech and solutions) vs User (mental models, goals, skills) vs System (realized model) (Norman, 1986)
- We tend to consider that everyone else is thinking in the same way.
- Do today's digital natives have the skills needed to use clusters?
- Approaches to HPC user modeling
- Training? Probably needed, but is that the sole solution?
- Problem-based learning. What do users want to solve? Service providers need to invest time to align with users' needs.
- Create collaborative structures with the researchers. Problem domain experts who can take a step towards the HPC.
- Causal framework of usability
  - System functions
  - User characteristics and task characteristics
  - ==> user reaction
- Do we need a middle ground between laptops and clusters?: shared servers/login nodes/etc.
### Cool things and problems
In this session, every site presents three cool things they do and three problems they are facing.
Session notes:
- We had 93 slides from 11 sites, covered over two sessions
- This served as a great introduction to what we do, and would be highly recommended in future workshops!
- We found many similarities in problems, and also in the cool things.
- This set the stage for everything to come, but also somewhat overlapped in content with the lightning talks.
### JupyterHub (Richard Darst, ...)
Topics include:
1. Tour/discussion of JupyterHubs at different Nordic sites
2. Just what are Jupyter and JupyterHub under the hood (it's nothing magic)
3. Making a semi-standard, reusable JupyterHub setup to share across sites
Session notes:
- Tour of Jupyter:
  - What is a Jupyter kernel? A process running in a normal Unix environment. You can and should integrate it with whatever you already have. [envkernel](https://github.com/NordicHPC/envkernel) can help with that by automatically handling Lmod modules.
  - What is JupyterHub? Authenticator, spawner, and proxy.
- How to integrate with clusters? When spawning in Slurm, how do you balance allocations against inefficiency?
  - Oversubscribing CPUs and offering different options which balance memory vs. job runtime.
  - Slurm cgroup-based memory limiting (with a flexible limit) helps.
- How to redistribute?
  - One suggestion is that a container-based setup will work if you mount in the necessary Slurm config files.
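
As a rough illustration of the "nothing magic" point in the notes above: a kernel is described by a small `kernel.json` file whose `argv` is simply the command Jupyter runs. The sketch below builds such a spec in Python. The wrapper path `/opt/site/bin/with-modules` and the module name `anaconda` are hypothetical placeholders for whatever a site uses to set up its environment, and this is not how envkernel itself is implemented.

```python
import json


def make_kernelspec(display_name, wrapper=()):
    """Return a kernel.json-style dict.

    `wrapper` is a hypothetical command prefix (e.g. a site script that
    loads modules) placed in front of the normal ipykernel launch command.
    """
    argv = list(wrapper) + [
        "python", "-m", "ipykernel_launcher",
        "-f", "{connection_file}",  # Jupyter substitutes this placeholder at launch
    ]
    return {"argv": argv, "display_name": display_name, "language": "python"}


# Hypothetical wrapper script and module name, for illustration only.
spec = make_kernelspec(
    "Python (cluster env)",
    wrapper=["/opt/site/bin/with-modules", "anaconda"],
)
print(json.dumps(spec, indent=2))
```

Writing the printed JSON to a `kernel.json` file in a kernel directory is, in principle, all it takes to make the environment selectable in Jupyter.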
### Software deployment
Deploying scientific software is one of the biggest pains for sysadmins. Let's share best practices. We'll discuss Spack, EasyBuild, etc. Aalto University will present its [automated build system](https://github.com/AaltoScienceIT/science-build-rules), which automates Spack, Singularity, and more across multiple systems in a declarative way.
Organizer: Simo Tuomisto
Session notes:
- https://github.com/AaltoScienceIT/science-build-rules
- https://github.com/AaltoScienceIT/science-build-environment
- [Quick installation script](https://raw.githubusercontent.com/AaltoScienceIT/science-build-rules/master/science-rules-setup.sh)
### NordicHPC future
Where do we go next?
Session notes are at the bottom of the page.
### Usage monitoring
Here I [Radovan] can demo two tools that we use to find out what is actually running on the machines and to identify suboptimal runscripts:
- https://github.com/NordicHPC/sonar
- https://github.com/dragz/slurmbrowser
Would love to hear what other centers do and discuss.
I (Peter Kjellström) have some material (and data) on the topic, also drilling down into parallel loss/usage.
Enrico had a nice visualizer that can find users with a large number of inefficient jobs.
Richard has [slurm2sql](https://github.com/NordicHPC/slurm2sql) which dumps slurm db info (via sacct) to a sqlite database, doing useful pre-processing.
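
To make the idea of dumping accounting data into SQLite concrete, here is a minimal sketch. It is not slurm2sql's actual implementation or schema: the table name `jobs`, its column names, and the sample rows are made up for illustration. The sample text imitates the pipe-delimited layout that `sacct --parsable2` produces, and the elapsed-time parser assumes the `[DD-]HH:MM:SS` form.

```python
import sqlite3

# Made-up sample imitating pipe-delimited `sacct --parsable2` output.
SAMPLE = """JobID|User|Elapsed|State
1001|alice|01:00:00|COMPLETED
1002|bob|00:10:30|FAILED
1003|alice|1-02:00:00|COMPLETED
"""


def elapsed_to_seconds(text):
    """Convert an elapsed time of the form [DD-]HH:MM:SS to seconds."""
    days, _, clock = text.rpartition("-")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return int(days or 0) * 86400 + hours * 3600 + minutes * 60 + seconds


def load(text, conn):
    """Parse the pipe-delimited text and insert one row per job."""
    lines = text.strip().splitlines()
    header = lines[0].split("|")
    conn.execute(
        "CREATE TABLE jobs (job_id TEXT, username TEXT, elapsed_sec INTEGER, state TEXT)"
    )
    for line in lines[1:]:
        row = dict(zip(header, line.split("|")))
        conn.execute(
            "INSERT INTO jobs VALUES (?, ?, ?, ?)",
            (row["JobID"], row["User"], elapsed_to_seconds(row["Elapsed"]), row["State"]),
        )


conn = sqlite3.connect(":memory:")
load(SAMPLE, conn)

# Pre-processed seconds make aggregate queries simple, e.g. total runtime per user.
per_user = dict(
    conn.execute("SELECT username, SUM(elapsed_sec) FROM jobs GROUP BY username")
)
print(per_user)
```

Converting the elapsed time to an integer at load time is the kind of pre-processing that makes later SQL queries straightforward.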
### EOSC
Lightning talk, Mikko Hakala. How it affects us, how we have to adapt and help, how usability works. Friday only.
Session notes:
- What is EOSC?
- Mikko proposes taking one project and making it very good. They need success stories, and this way we can make it driven bottom-up.
- Standardizing modules?
- How do we get the codes ready?
### OpenHPC
I (Janne Blomqvist) will talk about our (Aalto) transition to using OpenHPC, experiences, and opportunities.
Session notes:
- OpenHPC is an image-based system for distributing operating systems to compute nodes.
- This can be used even with other configuration management systems which configure the images themselves.
### Cloud Solutions
I (Emiliano Molinaro) will give an overview of our cloud platform (at University of Southern Denmark) for improving the usability of HPC systems. This is a remote talk.
Session notes:
## Final report, summary, and future
This section is written collaboratively. Anything here may be used in a final report/future funding applications. License for contributions: CC-BY 4.0.
### Sessions
We had around four proper talks, around 10 lightning talks, around seven unconference sessions, eleven sites presenting their cool stuff, three meals, and countless discussions. Summaries of all sessions are found above, in the [talks and events](#Talks-and-events) section.
### Overall evaluation
Here, we make a {content, presentation} x {positive, negative} grid to evaluate how we did:
#### Content
Positive:
- Good discussions during sessions but also "outside" sessions
- Learned about interesting projects that exist
- Learned about new tools
- Found people with similar issues
- Learned about the different HPC oriented entities across the Nordic countries
- People provided good examples of problems in their clusters
- Good initial presentation by Marko Nieminen that looked at problems at a different angle
- Great idea to have three cool things and three problems per site
Negative:
- Significant time spent on organization
- Did not properly document sessions
- The distinction between lightning talks and unconference sessions was confusing. We would need to communicate better, or choose one or the other.
- Voting format was not clear and changed along the way. Perhaps instead of voting on the most popular sessions, find a way to mark which two sessions should not happen at the same time.
- Many different resources (web things, chat, drive, Zoom, ...). Hard to stay on top. +2
#### Presentation
Positive:
- Audio set-up worked great
- Zoom works well for the purpose
Negative:
- Zoom link could have been available earlier; there might have been more remote participants if it had been advertised
- Parallel sessions could not be followed online
#### Other observations and quotes
- Topics
  - ...
- Unconference format
  - Worked OK, next time the rules should be more clear from the start.
- Bad vegetarian food
### New name?
We need a new name in order to seem less exclusive to HPC. Several proposals we thought up are:
- SCIN (Scientific computing in the North)
- CITN (Computing in the Nordics)
- NOBSC, or NBSC, or NBSW/NBSM (Nordic basic scientific conference)
- Giraffe (GNordic interactive research applied funputing for everyone)
### Future community?
We need to continue this community somehow. It will be hard, since we are all very busy already, but it can probably be done.
- Who else should be a part of this community?
- ...
- How to communicate?
- We don't want yet another chat system, so we will build off of an existing one.
- We can start at https://coderefinery.zulipchat.com on the `#NordicHPC` stream. After signing in, configure streams (left sidebar) and join the `#NordicHPC` one - in addition to anything else that interests you.
- Other local groups who are interested in teaching and scientific computing can create their streams here, too.
- Co-locate users group with e.g. Supercomputing, OpenSourceSummit etc.
- Shared projects?
- Shared code is a common way of communicating, so a GitHub repository is logical.
- https://github.com/NordicHPC
### Future meetings?
We agree that future meetings are a good idea - the talks and ideas are important, but even more important was talking with people.
- How should the program be different?
  - Unconference was good for saving time, could be more structured, though.
  - See the unconference section for more info.
- How often?
  - At least once per year
- Funding
  - We can apply to NeIC for funding if we can motivate/document a good case
  - We also need to formalize this, e.g. covering not just travel but also reported hours. Otherwise we have to do this on a voluntary basis, which is not sustainable.
- More ad-hoc visits?
  - ...
### Unconference lessons
This is the summary of what we have learned about how to run an unconference well:
- Strongly encourage everyone to suggest a talk when registering.
- Also collect "talk requests" when registering, make these public.
- Clearly explain the rules of the unconference early on.
- Have examples of all types of sticky notes early on.
- Clearly define the types of sticky notes
  - A5 for significant talks. Include the requested length.
  - Long for lightning talks
  - Square for ideas
- Have a defined "voting time" that happens after all of the info has been given. Don't have some voting, some new talks, more voting, etc.
- Define the number of allowed voting dots
During sessions:
- Make sure that a summary gets written immediately for every session. Write it in the hackpad directly.
- Find a way to keep track of time!
### Who actually came
We had representation from every NeIC member (Nordics plus Estonia). Participants who arrived or were known to have attended remotely:
- DK
  - 1 DeiC
  - 1 KU
  - 1 SDU
- EE
  - 1 Tartu
- FI
  - 6 Aalto
  - 1 Åbo
  - 3 CSC
  - 2 Helsinki
  - 3 Oulu
  - 1 Tampere
- IS
  - 1 Iceland
- NO
  - 1 Oslo
  - 1 Tromsø
- SE
  - 1 KTH
  - 1 Linköping
  - 1 Lund
  - 1 Uppsala