---
tags: unconference-2023
title: Nordic RSE 2023 unconference
---

# Nordic RSE 2023 unconference

:::info
This is the public shared document. Everybody visiting this page can edit.

- unconference page: https://nordic-rse.org/events/2023-online-unconference/
- Link to this page: https://hackmd.io/@nordic-rse/unconference23
- Zoom room: https://gu-se.zoom.us/j/64923726913 (if you found this info without registering, please do [register](https://indico.neic.no/event/251/) also, it takes 2 minutes)
- [Event page](https://nordic-rse.org/events/2023-online-unconference/)
- [Code of conduct](https://nordic-rse.org/events/2021-online-unconference/code-of-conduct/)
- You can create a separate HackMD for your session but please reference it from this document and please make it editable by everyone
- [How we do an unconference](https://hackmd.io/ue-yci-sSMKx458ChZab0w?view)
- If you have difficulties moving to a breakout room, please send a message to the Zoom chat ("can you please move me to room N?")
:::

Latest schedule can be found at https://nordic-rse.org/events/2023-online-unconference/

## Schedule: All times are Central European Time!
### Day 1: 2023-10-25

- 13:00 - 13:20 : Welcome and Intro to the unconference format (Matteo Tomasini)
- 13:20 - 14:20 : Hidden gems presentations (5-15 min each; chair: Jarno Rantaharju)
  - Examples: data tools, VS Code plugins, efficiency hacks, programming tools, libraries, containerized Conda, other 'life hacks', calendar via GitHub, whisper for subtitles
  - Emphasis on accessibility to a wide audience
- 14:20 - 14:40 : Break
- 14:40 - 16:00 : Hidden gems discussions / workshops (breakout sessions; chair: Matteo Tomasini)
  - Examples: more presentations, deeper dives from session 1 (planned or spontaneous), tutorials, show each other our set-ups, information about Advent of Code
- 16:00 - 18:00 : Optional social time (Zoom)

### Day 2: 2023-10-26

- 13:00 - 13:10 : Introduction to the day and unconference scheduling
- 13:10 - 14:20 : Paper cuts presentations (5-15 min each; chair: Richard Darst)
  - Examples: usability problems in computing, difficulty in support or teaching and what we can do about this, problems you have seen but don't have a solution for, tell us about your coolest programming idea which did not work, security
- 14:20 - 14:40 : Break
- 14:40 - 16:00 : Paper cuts presentations discussions / workshops (breakout sessions; Matteo Tomasini)
  - Examples: more presentations, deeper dives from session 3 (planned or spontaneous), tutorials.
re-visiting the code skeletons in my (GitHub) closet, Halloween session (share data/software horror stories), tell us about your Advent of Code shenanigans
- 16:00 - 18:00 : Optional social time (Zoom)

## List of mentioned interests

- the experience of other RSE people - beyond dev/computing tasks
- Project management in academia
- Productivity hacks for smoother/faster workflows
- GPU(-accelerated) computing for academics
- As SysAdmins: how to efficiently set up and maintain software suites that researchers want to use for their work (especially using containers)
- Continuous integration in scientific computing
- docker vs. singularity vs udocker
- how to containerize?
- best practice package management
- why I love my editor config
- tips on working on machines with restricted internet access (isolated environments for sensitive data etc.)
- Good practices for simulations
- Legal hurdles
- Documentation
- High-dimensional data analysis
- web development / GUI
- machine learning
- Job security
- embedded Julia
- auto-tracking of animals in videos
- reprohacks
- Natural Language Processing
- sustaining apps
- tools for increasing code quality and reducing errors
- where and how to publish papers on open-source code/libraries
- where and how to find funding for open-source and scientific code
- FAIR principles with hands-on examples
- Tracking my OS using NixOS and home manager
- Training junior staff to do our job
- Should we take the time to learn Rust? Or Carbon? Or...
- Linked Open Data (LOD)
- Federated authentication (eduGAIN, SSO)
- Python + type annotations (summary/demo or more)

## Other topics of interest

- (ex.) Halloween session! Our scary mistakes

## Proposed contributions for Wednesday 25th

Please vote for a session if you would like to attend it. We will avoid overlapping sessions that have a lot of votes. So we won’t use the votes to decide whether something happens, only to avoid scheduling very popular contributions at the same time.
### presentations

- R. Darst, *presentation*: Architects vs Engineers: how can we explain "RSE" to others? (5-10 min, discussion extra)
  - Votes: oooooooo
- S. Indrehus, *presentation*: Hooked on hooks: how to do some magic with Git hooks, approx 15 minutes with demo
  - Votes: oooooooooo
- V. Myrov, *presentation*: GPU accelerated neuroscience. Speeding up your Python code with GPU without much hassle, around 10-15 minutes
  - Votes: ooo
- Jarno: Let's make a list of tools that have changed our workflow, no pressure to give a talk:
  - Python click
  - Singularity/Apptainer Votes: oo
  - Udocker (easier to switch to Docker)
  - tailscale
  - Lmod & associated env management tools
  - workflows

## Questions/comments

- This is a question
  - This is an answer/comment
  - more comments
- Is it OK to create new documents and link to them from this one?
  - Yes
- Is this event recorded?
  - no, not recorded

Poll example: did it rain/snow in your location today (vote by adding "o")?

- yes: oooooooooooo
- no: ooooo
- did not look outside yet:

- What are the main advantages over google docs?
  - good question! Partly familiarity, and Markdown allows us to archive more easily later on. For some of our courses it provides a good pedagogical example.
  - for me personally: I worry less about formatting and font sizes and can focus more on the content : o <- I agree about this, but I know others at my uni will care less about this point.
  - advantage of gdocs: the commenting and suggesting functionality <- ah good point.
    - yeah, the commenting/suggesting is good, though we've adapted to a "sub-bullet point" thing.
  - hackmd/hedgedoc is good for sharing code examples with code highlighting
  - vim keybindings :)
    - hm? tell me more. how? where? (vim user here)
      - at the bottom of the screen on the right -> the default is sublime, can be changed to emacs/vim
        - mindblown (now typing with vim bindings)
        - amazing tip. thanks!
(I never noticed after using this tool for 3 years now)

## Presentation: Architects vs Engineers (R Darst)

https://docs.google.com/presentation/d/1lF-2QtVa6obI_ceSYJ8wq0PhaiQVCf2Jmb_KUo6H-rQ/edit

- what would be the analogy in RSE for regulation/certification (when constructing buildings)? would it even make sense?
  - Maybe journals requiring publishing source codes and data, with some qualifications.
  - Forcing people to publish robust benchmarking of software VS what is known?
- (a comment, not a question) The reference for ["Why Most Published Research Findings Are False"](https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124) is worth reading
- wait, how did they fix the building?
  - I think it was diagonal reinforcements? But might need to look it up.
    - correction: chevrons
- open question to everyone: did you see the same divide between "researcher education" and "RSE education" in your domains, department, universities...?
  - Yes, absolutely. I saw it both in my master and my PhD, with students taught to use certain pieces of code, software, packages, uncritically and without really learning how to write good-quality programs, implement tests, etc.
  - Yes, my curriculum did not contain any RSE education, only research. We only had a couple of coding courses.
  - Yes, and it goes against the principle of reproducibility in science. Is it science then?! +1
- the talk laid out *why* we want RSE, but how to make people (researchers) actually use it
  - convince them about time saving
  - those who are convinced might motivate their friends and colleagues to join
- Where would we place a DevOps engineer in the city? With cloud computing these guys are pretty important in the industry! Are they maintenance people as well?
  - I'd say research engineer or infrastructure... depending on what you want! If mindset is more "close work with individual projects", then *definitely* research engineer.
- who should motivate our work?
scientists (i.e., concrete tools for specific projects) vs engineers that realize an abstract need in science (e.g., workflow management software)? for the latter, how to sell this?
  - I think national working groups (or even European) should clearly say we need RSEs. They can be at the level of ministry of education and research or other groups by people from various universities. At the EU level this is recognised, but this needs to translate into national policies, and then university policies. Here is a national policy in Finland for open access to methods, which recommends that all institutions hire RSEs: https://edition.fi/tsv/catalog/book/669
  - Personal recommendation: join national working groups about research data (and software) management, open science, policies for research, etc... because you can actually be the cause of change in the direction you want. :+1:
- Follow-up: as an RSE in Finland, do you choose the projects you work on yourself, or is it mostly "project-specific contract work"?
  - Mostly projects come from needs of others, we don't invent our own research projects. But there are plenty of things we can invent ourselves.
    - Thanks! When "inventing" your engineering projects, who takes the risk (time spent vs success)?
- how do you track the time you save for scientists?
  - Just asking them. It's probably not very accurate, but the best we can do. We probably save more than they think, since researchers are always running late compared to estimates, right?
  - If we had a precise way to answer this question, I think it would be easier to convince management/decision makers as well.
- what is the official position you actually have at the uni? How do they pay you?
  - HR considers me a "Staff Scientist" and others "Research Engineers", and we are in "Technical Services". But yeah... as in many places, HR can't really classify us.
- are your positions "soft financed"?
  - We get basic funding (not project-based) and also some from individual projects.
But we have permanent, full-time positions at the school level.
  - that is shocking and wonderful and unheard of. It would be great to hear how you got to that point, and how we might be able to as well.
  - In my department we are lucky to have a similar situation with 5 or 6 developers hired permanently for the needs of humanities. It required a stubborn PI and a prefekt of the department who saw how important we could be to the development of the department and vouched for us in front of the faculty. We are not yet "safe" (as a group, we depend on some granting on the long term) but we are hired permanently
    - and this is why national policies should clearly state that we need permanent experts like RSEs, so it is good to be in contact with those at your uni who can influence national policies
    - which in Sweden comes with some stability.
- RD note: My goal is to find a way to tell others how cool and important our work is.

## Hooked on hooks (S Indrehus)

[Github repo](https://github.com/sunnivin/hooked-on-hooks/tree/main) with talk and demo

Direct link to [slides](https://sunnivin.github.io/hooked-on-hooks/talk/slides.html) (I'm very happy with using Github for hosting html files btw)

- how do you enforce that new developers have a correct setup with correct hooks?
  - CI (e.g. GitHub Actions, pre-commit.ci)
  - The magic is in the `.pre-commit-config.yaml` file in [the repo](https://github.com/sunnivin/hooked-on-hooks/blob/main/demo/.pre-commit-config.yaml).
- (this will probably be shown later) do you track the "local" hooks in the git repo? how? asking because the `.git/hooks` is "outside" the tracked part of the repo +1
  - I enable the hooks with the `pre-commit install` command. The settings for the hooks are controlled by the `.pre-commit-config.yaml` file in [the repo](https://github.com/sunnivin/hooked-on-hooks/blob/main/demo/.pre-commit-config.yaml).
    - ah nice. I see.
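
As background for the `.git/hooks` question above: a Git hook is simply an executable file in `.git/hooks/`, and Git aborts the commit when the `pre-commit` hook exits with a non-zero status. The sketch below is a hand-written hook in Python, not code from the talk; the trailing-whitespace check and all function names are assumptions chosen for illustration.

```python
#!/usr/bin/env python3
"""Hypothetical hand-written pre-commit hook: reject trailing whitespace."""
from __future__ import annotations

import subprocess


def find_trailing_whitespace(text: str) -> list[int]:
    """Return the 1-based numbers of lines that end in spaces or tabs."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line != line.rstrip(" \t")
    ]


def staged_files() -> list[str]:
    """List files that are added, copied, or modified in the index."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [name for name in result.stdout.splitlines() if name]


def main() -> int:
    """Check staged files and return the exit status for Git."""
    failed = False
    for path in staged_files():
        try:
            # Simplification: this reads the working-tree copy of the file,
            # which can differ from the staged content.
            with open(path, encoding="utf-8") as handle:
                bad_lines = find_trailing_whitespace(handle.read())
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number in bad_lines:
            print(f"{path}:{number}: trailing whitespace")
            failed = True
    return 1 if failed else 0  # a non-zero status makes Git abort the commit


# When saved as .git/hooks/pre-commit, the file would end with:
#     raise SystemExit(main())
```

Saved as `.git/hooks/pre-commit` and made executable, a script like this would block commits that contain trailing whitespace. Because `.git/hooks` is not tracked, every clone would need its own copy, which is the gap that `pre-commit install` together with a tracked `.pre-commit-config.yaml` is meant to close.
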
- Have you found any usability problems, like, people can't easily set up the hooks locally or other special cases break things?
  - People might be confused about why they are not allowed to commit stuff. They can override the hooks if they like, but typically I use a branch protection rule on main as well to run the hooks.
  - CI (e.g. GitHub Actions/the pre-commit.ci service) can easily be set up to automatically make pull requests for users that don't use (or forgot to enable) pre-commit locally
    - Exactly! This works very well for code review as well. I find it more difficult to review new code that does not follow the standards we use in the team for collaboration. Looking at whitespace changes between developers who use ruff or flake8 locally for linting is really a waste of human brain power in the review process (I think).
- have you experienced the problem where co-workers are confused about what automation/linting warnings mean or how to fix them, and leave a half-finished pull request/ merge request/ commit for you to pick up and clean up?
  - If my collaborators are not comfortable with using git I do help them. But you do need to learn how to read the error messages. The user does need to be slightly motivated: typically my senior Excel colleagues do not see how this is a life saver, but my colleagues that are motivated to learn programming find this really useful :)
    - Thanks! I was asking because some of my more senior colleagues will just leave it with "sorry don't know what it means, no time to read docs, must move on" and then I have the choice to do clean-up for them (not very motivating for me) or accept that their change will never get merged and will at some point conflict.
      - I feel you there. Those colleagues in particular are not very respectful with you either. I also think it shows a bigger problem that in Academia the "engineers" are not valued as much as the "architects". Totally understand that you get tired of having this discussion again and again.
I think things are changing, but a bit slowly. Example: just think about the chaos if you write python code without ANY reference to which version of packages or even python version you used to create your code. Coming back to the project after a week, a month or a year could mean that the code is not working anymore. If they have experienced that pain first, they usually listen to me when they come back again to ask for help with that. I have also stopped cleaning up for "free". Meaning that if my colleagues do not value the work I'm contributing, I will not waste my time on them.
- Where does pre-commit install the code that runs the hooks?
  - Separate environments (managed automatically by pre-commit)
    - where are they stored? in the repo or $HOME or ... ?
      - AFAIR, hidden folder in $HOME (the hooks listed in .pre-commit-config.yaml are versioned)
        - Correct!
- Are there other hooks besides 'pre-commit' (that you've found useful)?
  - I used it for structuring my commit messages with a theme first, like ("doc: instructions", "env: updating numpy dependencies"). But it was difficult to get my colleagues on board with this system, so I'm not actively using it anymore.
- Anyone else just went: :O ? +1
- Comment: nice use-case with notebooks! +1
  - :) I use them in combination with `jupytext`, which can be a bit confusing to use if you are not comfortable with git workflows.
  - I highly recommend nbQA for pre-commit: https://nbqa.readthedocs.io/en/latest/pre-commit.html ("Run ruff, isort, pyupgrade, mypy, pylint, flake8, black, blacken-docs, and more on Jupyter Notebooks")
  - I also think it is smart to test out Ruff as the modern linter. It is on my TODO-list, but right now I'm working in a project where we have not prioritized upgrading our hooks. A suggestion for a `.yaml` file with ruff would be cool to see :D
- What's the process to enable CI/github actions to run and check? Is it fast per-repo?
  - self-answer: I guess drop this in?: https://github.com/pre-commit/action
    - Looks like they also use some `yml`-file (https://github.com/pre-commit/action/blob/main/action.yml) behind the magic. One point to watch out for is to not be super general when you write your pipeline. The example from line 11 `steps: - run: python -m pip install pre-commit` could make your pipeline fail if the action runner of github is not using the same python or pip-version as you used in your environment. I recently spent (way too much) time figuring out a tiny detail like this when the standard python version the runner used was upgraded from python 3.11 to python 3.12.
    - Actually, you don't even need GitHub actions - just go to pre-commit.ci, sign in, and enable it for your repo (this is also the recommended way, as mentioned on https://github.com/pre-commit/action)
    - In our organization we use Azure (I really miss Github), so I typically write a pipeline and use a branch protection rule on main. Then PRs are automatically blocked if my tests and hooks are failing.
- Who is using pre-commit already:
  - Yes: oo
  - No: oooooooo

## GPU accelerated neuroscience. speeding-up your python code with GPU without much hassle (V. Myrov)

[google docs slides](https://docs.google.com/presentation/d/1Ijl8g8zaW4v0Cs4mGdW1yWCS0q4JFp9y7K4laZ28VOc/edit#slide=id.g292f0aa744b_0_0)

- Why not Julia ... ?
  - Reply from someone working with similar data: most I/O libraries for this type of data are only in python, but I know there are efforts in expanding tools in Julia.
  - Ecosystem (e.g. are there any GPU libraries for Julia?), necessary tools for the neuroscience field (e.g. MNE)
- Comment: nice reminder about cupy, I still haven't tried it but should have a look.
  - I'd be interested in hearing more about cupy!
  - Is it the kind of thing you should just try dropping in if you think it's a vectorizable problem?
    - I think that it is worth a try: at least I have used it in a huge variety of tasks, from time series processing to EM methods
- This talk is such a good reminder of why you need specialized RSE knowledge to boost research. E.g. unless your field uses GPU a lot already, more often than not you don't know about it. Same with languages (see question above).
  - This to say that RSE isn't just about doing the work, but especially about knowing what tools exist, and knowing how to learn them!
- Nice ending! The national cluster vs local cluster, great to see that you can use both seamlessly. I am unsure if a similar reality exists in other nordic countries (national HPC vs local HPC).

## General questions

- Can we link to slide decks/ material/ repos from here?
  - Could we maybe collect them in the Nordic RSE Github? Feasible?
    - I would start linking from hackmd here and later we can aim at a blog post about this event and link from there if people are comfortable with it.
  - RD: I added the link to mine above
    - thanks! :trophy:
    - i can't find your link :(
      - on line 145 in this document
  - SI: I also added a link to mine. :heart: GitHub :heart:

## workshops

- M. West, *workshop*: How to make your own local batch cluster (without too much headache).
  - Votes: ooooooo
- NN, Continue working on the Architects vs Engineers metaphor (and how to convince academics and management RSE work is necessary)
  - Votes: o
  - (RD: these can sometimes turn into a recursive discussion, and plenty of previous discussions exist, if we want this, can we make the topic more specific?)
- MT: Open discussion between permanent Nordic RSE and people who are aiming at ending there, support group, whatever you feel like (could also get matched with Richard's, since there will be overlap, I suppose).
Or do it tomorrow to avoid too much overlap
  - Votes: ooooo

## Notes for "How to make your own local batch cluster (without too much headache)"

- https://hub.docker.com/r/htcondor/mini
- https://hub.docker.com/r/htcondor/htc-datascience-notebook
- https://htcondor.readthedocs.io/en/latest/index.html
- https://singularity-hpc.readthedocs.io/en/latest/index.html

### Questions:

- Any experience in using the htc-datascience image in a cluster? Is it "just" to deploy?
- Is it always a service for one user?
- At the end will this lead to a server that queues jobs using Slurm?
  - no (if I understood correctly)

## Julia users group at 15 CST (Helsinki):

Feel free to drop by in zoom if you want to learn about how to use julia for web development. https://uwasa.zoom.us/j/62430497067?pwd=TDRoelVTRjdtNW5Hd1ZKd3Q1Zndpdz09

# Sessions for day 2:

## Presentations

- Pavlin: wrapping tools in single Singularity container for seemingly uncontainerized use ;-) 5m
  - RB: hey that's what I do :-) Interesting!
  - Votes: oo
- RB: how I use conda without pain (quick installs, reproducible, not filling up disk quota, not messing up my computer, chaos monkey proof) 10m
  - Votes: ooooo
- Seasonal craziness with Nordic RSE! an introduction to Advent of Code
  - Votes: o

## Workshops and open discussions

- (idea): person A tries to walk person B through doing some simple task (like connecting to cluster). Person C narrates the paper cuts like a comedy/documentary.
  - Votes: oo
- Erik: My attempts at making a pip package including CUDA kernels (i.e., a pytorch extension) in a (somewhat) portable way
  - https://hackmd.io/@erikschultheis/rJDDvhvfp
  - Votes: ooo
- Seasonal craziness with Nordic RSE!
Halloween session (tell us about your crazy scary software stories, as a workshop)
  - Votes: o

Requested demos:

- NixOS
- MyPy
- Setups in general
- Volunteers:
  - JW
    - Switched from debian-based (with i3, wayland etc) to Windows+WSL2
    - Using ZSH with https://ohmyz.sh as before; "Windows as window manager"
    - Windows Terminal, "the new thing": https://github.com/microsoft/terminal
    - SSH keys etc all within the WSL2 machine
  - PM
    - Comment by JW: [mosh](https://mosh.org/)
  - MT (Todo Tree: a simple plugin for VS Code)
    - plugin: [Todo Tree](https://marketplace.visualstudio.com/items?itemName=Gruntfuggly.todo-tree)
      - Installed! Thanks :)
        - you're welcome! It really changed my way of commenting and documenting my code while developing
  - RB: NixOS and home manager
    - https://nixos.org/learn
    - https://nixos.org/guides/nix-pills/
    - Security updates?
      - for own packages it might mean work
      - but I basically don't create own packages and in this case it is "just" an upgrade and other people did the hard work for me
    - Packages can be installed in just a shell, and will be garbage collected eventually.
- Somebody added tailscale to the list yesterday - would be curious to hear what that is (never heard)
- Does anyone know about spack and would like to present? :)

## wrapping tools in single Singularity container for seemingly uncontainerized use (P. Mitev)

- All files in a bin folder are a symlink to a `.sif` container
- Script to make the symlinks
- run script in container uses $SINGULARITY_NAME
- https://pmitev.github.io/UPPMAX-Singularity-workshop/indirect-call/

## How I use conda without pain (R. Bast)

- https://github.com/bast/singularity-conda
  - Automatically finds environment.yml from current dir and runs Python in an environment defined by that (making it if it's not there yet)
  - it requires users to install dependencies in a declarative way which will be nice in future when imperative commands are long forgotten
  - uses https://mamba.readthedocs.io/en/latest/user_guide/micromamba.html under the hood to have small size and fast installs
- https://documentation.sigma2.no/software/userinstallsw/conda.html - conda tips for a cluster (this is our documentation for clusters in Norway)

Comments:

- Use mamba as resolver for conda: https://www.anaconda.com/blog/a-faster-conda-for-a-growing-community
- do most people use conda?
  - mark if you do: oooooo
  - I do, but I'm trying to get rid of it: o
  - I do, but only for system libraries etc (e.g. Python versions) - for Python packages, I tend to pull exclusively from PyPI, since many packages I work with don't exist on conda-forge etc and I don't like to mix: o

## Seasonal craziness: Advent of Code with Nordic-RSE

- [Advent of code](https://adventofcode.com)
- In December we meet daily on Zulip to discuss the day's solution and to help each other
- great way to learn new things and to have fun
- some use it as a way to learn a language. Others as a way to get new challenges
- I (MT) personally would be interested in trying a pair programming session to solve a problem at some point...

:::info
## Break, back at XX:40
:::

## My attempts at making a pip package including CUDA kernels (i.e., a pytorch extension) in a (somewhat) portable way (E. Schultheis)

https://hackmd.io/kY2CThvUTNGU4tcOjH0ljA?view

- Is manylinux a Python-defined thing?
  - yes, there are PEPs that defined different versions, and the most recent one, [PEP 600](https://peps.python.org/pep-0600/), defines a generic naming scheme
- A bit broader than this talk, but: what are the current standards/best practices for compiled extensions in general?
  - .
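
To make the PEP 600 naming scheme mentioned above a bit more concrete: the generic tag has the form `manylinux_<glibc major>_<glibc minor>_<arch>`, and the older names are kept as aliases (`manylinux1`, `manylinux2010`, `manylinux2014` for glibc 2.5, 2.12 and 2.17). The sketch below is not from the talk; the helper names `parse_manylinux_tag` and `runs_on` are made up for illustration, and real tools perform more checks than this.

```python
from __future__ import annotations

import re

# Legacy aliases and the glibc versions they correspond to (per PEP 600).
LEGACY_ALIASES = {
    "manylinux1": (2, 5),
    "manylinux2010": (2, 12),
    "manylinux2014": (2, 17),
}

# Generic PEP 600 form: manylinux_<glibc major>_<glibc minor>_<arch>
PEP600_TAG = re.compile(r"^manylinux_(\d+)_(\d+)_(\w+)$")
LEGACY_TAG = re.compile(r"^(manylinux1|manylinux2010|manylinux2014)_(\w+)$")


def parse_manylinux_tag(tag: str) -> tuple[int, int, str]:
    """Return (glibc_major, glibc_minor, arch) for a manylinux platform tag."""
    match = PEP600_TAG.match(tag)
    if match:
        major, minor, arch = match.groups()
        return int(major), int(minor), arch
    match = LEGACY_TAG.match(tag)
    if match:
        alias, arch = match.groups()
        major, minor = LEGACY_ALIASES[alias]
        return major, minor, arch
    raise ValueError(f"not a manylinux tag: {tag!r}")


def runs_on(tag: str, system_glibc: tuple[int, int]) -> bool:
    """Hypothetical helper: is the system glibc at least as new as the tag's?

    The architecture is ignored here to keep the sketch short.
    """
    major, minor, _arch = parse_manylinux_tag(tag)
    return system_glibc >= (major, minor)
```

For example, `parse_manylinux_tag("manylinux2014_aarch64")` resolves the legacy alias to glibc 2.17, and `runs_on("manylinux_2_28_x86_64", (2, 17))` is false because the wheel expects a newer glibc than the system provides.
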
## Richard tells us about Sphinx

- scicomp.aalto.fi
- from rst (or markdown, e.g. via https://myst-parser.readthedocs.io) it generates HTML (or PDF or ...)

## Nordic RSE links / advertisements

- Coffee breaks every Thursday morning at 9 CEST
- In person unconference planning meeting next Thursday at 14 CEST (https://hackmd.io/@nordic-rse/biweekly)
- Main network for the community: [Zulip](https://coderefinery.zulipchat.com/#narrow/stream/213720-nordic-rse)
- [Join Nordic RSE!](https://nordic-rse.org/about/membership/)

---------------------------------------

Add questions above this line