# RSE FCCI Tech Talk
## Main points
* The RSE team has successfully supported the development, maintenance and publication of the software written by researchers from many fields of science.
* The RSE program is a crucial component of ASC's mission to offer top-tier hands-on support for all aspects of scientific computing.
* Call to action: support the RSE team and expand the program to all schools at Aalto.
## In Detail
### Intro by Richard (3min)
* A change: This decade, a revolution has taken place in academia: writing software suddenly became super important in all fields of study.
* Programming is the new mathematics of science.
* Getting anything done (data analysis, making figures) requires writing code.
* Push for open science requires writing even more code.
* A problem: **In order to do good science, you need to write good programming code**.
* **Diversity problem**: only IT specialists thrive.
* **Productivity problem**: bad code slows research down. Student who wrote it left, project can't continue because of the mess left behind. A single student can get by, but the group/department/university level definitely suffers.
* **Impact problem**: if code is terrible it can't be shared, research can't be reproduced. Mistakes in the code cause paper retractions.
* Our solution: We started a **dedicated support team for research software**. Team of experts (programmers with PhDs) helping develop, maintain and publish research programming code.
* We hired top-quality scientists, better than we expected in the first plan (first intro, RD can introduce the other speakers this way)
### Segue by Marijn (2min)
* What do the RSEs do?
* Quick consultation (garage), daily requests (show garage diary)
* Longer consultation (set up meeting), about 1 a week
* Project (duration: weeks to months), more than we can handle
* Project summary:
* Started this January "for real"
* 47 significant projects in our tracker (not counting internal requests and short consultations) (24 closed, 9 in progress)
* Requests from different departments (show pie-chart)
* Let's hear about some of the projects we've been doing
### RSE War Stories (20min)
### Adventures in software maintenance
[ParallelFDTD: Hard to use code.](https://version.aalto.fi/gitlab/AaltoRSE/rse-projects/issues/17) (**Presenter: Jarno**)
- How did they find us?
- Installation problem on Triton, missing libraries
- What was the problem?
- **Productivity problem**
- **Impact problem**
- A code developed at Aalto had become hard to use (and unusable on Triton) since the original developer left and significant technical debt had accumulated.
- The main problem was in building the library.
- A researcher from another university was unable to use the software.
- What was the solution?
- Updated the build system to follow current standards.
- Used Anaconda to manage dependencies instead of requiring manual install of several libraries.
- Automated several installation steps that used to be manual.
- Compiling a Matlab library.
- Installing Python bindings as a package.
- What is the bigger point?
- There is often technical debt.
- At the beginning you don't know which projects will become important and which will not.
- The bigger picture is managing software long term. Without a long term RSE service, how do we keep software alive as long as it's useful?
- 4 projects currently being maintained by RSEs
- Do we really expect every researcher to be an expert in all these build tools?
- Call to action:
- If you don't want your whole group to be re-doing work, you need to take care of your software.
- These tools are definitely not taught in programming courses
### Adventures in software publishing
[PREPRINT: Restructuring a set of scripts used in a published paper to a usable software package.](https://version.aalto.fi/gitlab/AaltoRSE/rse-projects/issues/23) (**Presenter: Marijn**)
- What was the problem?
- **Diversity problem**
- **Impact problem**
- A researcher developed a new method for detecting enhancer sites on the human genome that outperforms the current state of the art.
- The implementation was a set of 42 scripts that produced the result published in the paper.
- Instructions on how to install the dependencies were 7 pages, instructions on how to run these scripts were over 18 pages. Complete analysis took multiple days on 17 CPUs.
- RSE was called in to:
- publish the code in a form so that other researchers could use the method.
- improve performance
- What was the solution?
- We went through all the code and refactored it into:
- an R package implementing the novel method in a general-purpose fashion
- an analysis pipeline that applied the functions in the R package to reproduce the study in the publication
- Installing the dependencies is a single command, installing the R-package is a single command, running the analysis pipeline is a single command.
- We improved the runtime from multiple days on 17 CPUs to less than an hour on 1 CPU.
- We fixed several serious bugs (luckily they didn't invalidate the published result).
- We made it easy to apply the analysis to new datasets
- What is the bigger point?
- There is a large gap between what is needed to produce a figure in a paper and what is needed for your method to achieve impact.
- If you don't offer a clean package to apply it, nobody is going to bother using your method.
- Having others use your method is a very big performance metric for an academic, especially since we are moving away from paper impact factor.
- Developing a package that is easy to install, has a good API, is fast and an overall pleasure to use, is hard and requires software development skills well beyond the basics.
- RSE wants to enable everyone to publish their code in such a way that others love to use it.
- Relevant stats: number of open source packages RSE helped release/maintain.
- Call to action
- We need to reach the researchers who need our expertise most. It's easy to reach the power users, as they actively approach us. It's way harder to reach the silent researcher feeling overwhelmed.
### Adventures in software platforms
[Setting up infrastructure for data collection](https://version.aalto.fi/gitlab/AaltoRSE/rse-projects/issues/51) (**Presenter: Jarno**)
- How did the client find us?
- A researcher wanted to collect data from volunteer subjects using wearable devices and online surveys.
- Needed someone to set up the software.
- What was the problem?
- **Productivity problem**
- They did not have the technical expertise to set up the infrastructure.
- Several overlapping issues
- Handling personal data
- Server architecture
- Building surveys
- Data handling on Aalto systems
- What was the solution?
- Consult with the researcher and relevant experts to create a data management plan
- Develop the servers to
- sign up and sign the consent form
- present a survey and collect the data
- collect the device data and collate with other data
- Set up access to the data on Triton
- What is the bigger picture?
- Sometimes you need a qualified software developer to become a part of a project.
- Often researchers from very different fields require similar software infrastructure. RSEs can develop and maintain that infrastructure and adapt it to the needs of each project.
- RSEs are centralized experts who enable sharing resources between departments (but we need to be treated as part of the project groups)
- Call to action
- We need permanent software developers to maintain our software infrastructure.
- Funding is available from the projects we support, but we need administrative support to be allowed to use it.
### Adventures in building a foundation
[BioMag analysis pipelines](https://version.aalto.fi/gitlab/AaltoRSE/rse-projects/issues/16) (**Presenter: Marijn**)
- What was the problem?
- **Diversity problem**
- **Productivity problem**
- New professor started, creation of a new lab and collaboration network between Aalto and HUS (BioMag)
- Several new students will start advanced data analysis projects involving applying machine learning to brain data
- Professor is a medical expert, but not an IT expert. Worried that code quality of the students will become a problem. Students need good technical support in order to be productive.
- RSE called in to write some template analysis pipelines using best practices for the students to copy and modify.
- What was the solution?
- We became actively involved in their first study, applying machine learning to detect mild traumatic brain injury.
- We laid down a set of basic analysis pipelines with good internal structure and good documentation.
- We instruct new students on the design principles behind these pipelines and how to adapt them for their own projects.
- What is the bigger picture?
- RSEs can fill an important technical role in bigger research consortiums.
- RSEs ensure that every student has easy access to software experts, regardless of field of study.
- Call to action:
- There are way more students in need of better support for their programming needs than 2 RSEs can handle. We need more RSEs.
- We need a way to direct funding from grants to the RSE program. Administrative issues prevent this.
### Outro by Richard (5min)
* Tie-in with ASC
* The work that the RSEs are doing has become an integral part of Aalto Scientific Computing.
* Richard, can you write something here? :)
* ASC now has capabilities for offering hands-on support in all aspects of using computing for research.
* This makes ASC a huge asset to the university.
* We get requests from other universities, because they don't have a similar support structure. Makes Aalto stand out (management prob. loves to hear how Aalto is the best ever).
* Call to action
* Help us identify the best customers: we have no shortage but do we reach the right people?
* Are we reaching advanced people who can already do things, or people who are struggling and whose work we could really transform?
* Better nurture computational communities
* We can't do this ourselves, but we also can't expect every researcher to do it themselves
* We have plenty of training, but training and "good luck" is not enough
* We need a network of people who can provide mentoring and help, for example the expert in each group who can be the first layer of support
* link to diversity talk
* Help us either get a sustainable funding model or secure source of basic funding
* Groups want to give us money, but we can't accept it.
* Really this is OK by us, but then management has to agree
* Current proposed model: 1 month or less at no cost, beyond that we talk.
* Extend the RSE program to all of Aalto (not just school of science)
* The need is clearly there, as we are getting requests from other schools and other institutions.
* We hate to have to turn users away because their schools haven't funded RSEs yet ;)
<!-- Marijn says: not a call to action but a question/problem.
To which RSE is the solution (for the purposes of this talk), because if we are not
the solution, or a bad/incomplete solution, why should they fund us?
* How can we have the most impact in
* data
* open science
* diversity and equality
-->
## Todo
- make sure we convey what knowledge was needed to do this.
- high turnover makes long-term maintenance hard. Plus too many other things to focus on
- Pure software development (OnPIT) - footnote somewhere? (COR:ONA has a big development component)