# NeIC S4 project abstract (strict limit: 2500 characters!)

* problem statement
* what are the goals of the project (service development, communities, )
* what are the benefits for the Nordics+Estonia
* relation to EESSI?

### Thomas' original text

The project Scientific Software Stacks as a Service will develop an eInfrastructure which integrates existing technologies and open-source software to deliver high-quality scientific software stacks that are built and optimised for specific CPUs, microarchitectures, GPUs and high-performance interconnects to virtually any computer via the Internet. The software stacks will be exposed to users in a uniform way, making it very easy to migrate work between different systems across borders all over the Nordic countries and beyond.

In the data-driven age of science, researchers increasingly rely on having ubiquitous access to scientific software on several systems (their own laptops, group servers, local clusters, commercial clouds or supercomputers) in order to study complex phenomena. While they easily get access to many different types of systems, they struggle to use them well for a given task because the software environments on these systems are not harmonised, not optimally configured or sometimes missing altogether. This makes it cumbersome to utilise these resources to their full potential.

Additionally, staff who operate these systems face a growing challenge to provide a harmonised and optimally configured software ecosystem, because of an increasing trend towards more diversity in core computing technologies, most already available and some expected to emerge soon: CPUs (Intel, AMD, ARM, POWER and RISC-V), GPUs (NVIDIA, AMD and Intel) and high-speed interconnects (InfiniBand, Omni-Path and Slingshot). Last but not least, there is an explosion of scientific software made available through rapidly evolving disciplines such as bioinformatics and methods such as machine learning.

While software build tools such as EasyBuild and Spack mitigate the struggles listed above to some degree and have resulted in some collaboration among system administrators worldwide, a lot of work is still duplicated across sites, without any guarantee that the software environment is provided in a uniform manner or that it is easily accessible outside the traditional HPC environment. The project will address all these challenges by providing a common shared stack of scientific software available everywhere, which will greatly simplify the use of IT resources, ensure that resources are optimally utilised, improve a wide range of use cases and enable unprecedented levels of collaboration among scientists and IT professionals.

### Kenneth's version

...

### Caspar's version

Installing scientific software is often challenging: scientific software often has a large number of dependencies, and portability of software and installation procedures is often not the primary concern of scientific software developers. As a result, scientists who want to use such software often spend a lot of effort setting up a software environment. Similarly, support staff who maintain scientific IT infrastructure face the same issues.

[comment from Caspar: not sure if we should include this. If we don't, I'm afraid reviewers might say 'but we have containers for that'. If we do, they might not clearly see the added benefit of S4/EESSI over containers...]

Nowadays, containerized applications alleviate some of these software challenges. However, containers mainly target portability, and are therefore often not optimized for specific hardware architectures. Thus, performance of containerized applications is often suboptimal. This is particularly problematic for software that is used for large computational tasks, where a loss of application performance translates into a substantial increase in hardware cost.

The S4 project aims to provide a single software stack that

- can be used on a wide range of hardware
- can make effective use of the specific capabilities of the hardware, such as specialized instruction sets or high-performance interconnects

This will reduce the time spent by scientists and scientific IT support staff on managing scientific software stacks, while maintaining optimal use of the hardware. It will also make it easier for scientists to move from, e.g., their own laptop to a university cluster or a national infrastructure, since they can use the same software everywhere.

The S4 project will collaborate with other partners from the EESSI initiative, an open initiative that shares the same goals. EESSI currently provides a proof of concept for such a portable and performant software stack. The S4 project will contribute to the technical solution proposed by EESSI by making it more mature. Furthermore, it will stimulate adoption of, and collaboration on, the EESSI software stack by IT support staff, scientific communities and scientific software developers in the Nordic countries. This will ensure, for example, that software relevant to our research communities is adopted within EESSI and that the EESSI software stack is available on Nordic IT infrastructures.

### Åke's version, mixing the (adjusted) original with Caspar's, slightly too long

The project Scientific Software Stacks as a Service will develop an eInfrastructure by integrating existing technologies and open-source software to deliver high-quality scientific software stacks that are built and optimised for multiple CPUs, microarchitectures, GPUs and high-performance interconnects, making them usable on virtually any computer via the Internet. The software stacks will be exposed in a uniform way, making migration between different systems seamless across borders all over the Nordic countries and beyond.

Installing scientific software is often challenging: scientific software can have a large number of dependencies, and portability of software and installation procedures is often not a concern of scientific software developers. As a result, scientists who want to use such software often spend a lot of time setting up a software environment. Similarly, support staff who maintain scientific IT infrastructure face the same issues. Nowadays, containerized applications alleviate some of these software challenges. However, containers mainly target portability, not performance, and their use on large computational tasks results in a loss of application performance which translates into a substantial increase in hardware cost.

The S4 project aims to provide a single software stack that

- can be used on a wide range of systems
- can make effective use of the specific capabilities of the system, such as specialized instruction sets or high-performance interconnects

This will reduce the time spent by scientists and support staff on managing scientific software, while maintaining optimal use of the hardware. It will also make it easier for scientists to move from, e.g., their own laptop to a university cluster and to a national infrastructure, since they can use the same software everywhere.

The S4 project will collaborate with other partners from the EESSI initiative, an open initiative that shares the same goals. EESSI currently provides a proof of concept for such a portable and performant software stack. The S4 project will contribute to the technical solution proposed by EESSI by making it more mature. Furthermore, it will stimulate adoption of, and collaboration on, the EESSI software stack by IT support staff, scientific communities and scientific software developers in the Nordic countries. This will ensure, for example, that software relevant to our research communities is adopted within EESSI and that the EESSI software stack is available on Nordic IT infrastructures.

### Thomas' attempt 1 (2478 characters)

Groundbreaking collaborative research increasingly relies on having rich, well-functioning and performant software stacks available across several IT resources. Over the last decade, EasyBuild and Spack have become indispensable tools to maintain high-quality software installations which ensure reproducibility of results as well as the efficient use of scarce computing resources. Even using such tools, building and managing growing stacks with thousands of software packages is becoming more and more challenging because of the inherent complexity of scientific software and the widening variety in hardware choices (CPUs, GPUs, interconnects). Additionally, stagnating budgets for the management of software installations at HPC centers and their limited reach to other resources being used by researchers (laptops, clouds, etc.) call for a fundamentally new approach in providing stacks of scientific software in order to keep up with the growing demands from scientists.

In early 2020, the European Environment for Scientific Software Installations (EESSI) began to develop a solution which will enable ubiquitous access to rich ready-to-use software stacks which are hardware-tuned for several CPU families and microarchitectures, and will support GPUs and high-speed interconnects. A prototype has been available since summer 2020, is being continuously refined and has been successfully tested across a wide range of systems from tiny Raspberry Pis to laptops, servers, cloud instances as well as HPC clusters and supercomputers, including LUMI-C, where parallel jobs scaled to 65,536 cores.

The overall goal of the S4 project is to join forces with contributors to EESSI and make their solution available to Nordic and Estonian researchers on any system they have access to.
Besides participating in joint development activities, S4 will deploy the EESSI solution on resources, help cross-border communities to build and use discipline-specific software stacks, work with HPC application developers to improve their CI workflows, and, last but not least, develop a comprehensive series of courses educating system administrators, support staff and researchers to exploit the solution to its full potential.

[a bit of an abrupt transition, the sentence before that would be a good conclusion]

S4 will collaborate with existing NeIC activities such as NT1 (to leverage experience in distributing software via CernVM-FS), Puhuri (to integrate the EESSI solution into the Puhuri marketplace), and CodeRefinery (to establish high-quality training and train-the-trainer programmes).

[idea1: project/EESSI results in researchers not needing to think about software any longer, software is just there in high quality, just works]

[idea2: anticipated cost savings of 10 to 25 FTE per year easily outweigh the total costs of the project of about 10 FTE.]

### Thomas' attempt 2 (2710 characters)

Collaborative research increasingly relies on having rich, well-functioning and performant software stacks available across several IT resources. EasyBuild and Spack have become indispensable tools to maintain high-quality software installations which ensure reproducibility of results and efficient use of scarce computing resources. Even using such tools, managing stacks with thousands of software packages is becoming more and more challenging because of the inherent complexity of scientific software and the widening variety in hardware choices (CPUs, GPUs, interconnects). Additionally, stagnating budgets for maintaining software installations at HPC centers call for a fundamentally new approach in providing stacks of scientific software in order to keep up with the growing demands from scientists.

In early 2020, the European Environment for Scientific Software Installations (EESSI) began to develop a solution which will enable ubiquitous access to rich ready-to-use software stacks which are hardware-tuned for several CPU families and microarchitectures, and will support GPUs and high-speed interconnects. A prototype has been available since summer 2020, is being continuously refined and has been successfully tested across a wide range of systems from tiny Raspberry Pis to laptops, servers, cloud instances as well as HPC clusters and supercomputers, including LUMI-C, where parallel jobs scaled to 65,536 cores. For a researcher, knowing that the EESSI stack is available on a system means that most of their issues related to software are a thing of the past.

The goal of the S4 project is to join forces with key contributors to EESSI and make their solution available to Nordic and Estonian researchers on any system they have access to. In particular, the project will jointly develop the EESSI solution, deploy it on resources at the partners, help cross-border communities to build and use discipline-specific software stacks and work with HPC application developers to improve their CI workflows. Last but not least, S4 will develop a comprehensive series of courses educating system administrators, support staff and researchers to exploit the solution to its full potential. S4 will collaborate with existing NeIC activities such as NT1 (to leverage experience in distributing software via CernVM-FS), Puhuri (to integrate the EESSI solution into the Puhuri marketplace), and CodeRefinery (to establish high-quality training and train-the-trainer programmes).

By putting the EESSI solution into production across the partner sites, we anticipate aggregated cost savings of 10 FTE per year for maintaining software stacks. These savings easily outweigh the project costs of 5 FTE per year.

### Thomas' attempt 3 (2556 characters)

Collaborative research increasingly relies on having rich, well-functioning and performant software stacks available across several IT resources. EasyBuild and Spack have become indispensable tools to maintain high-quality software installations which ensure reproducibility of results and efficient use of scarce computing resources. Even using such tools, managing stacks with thousands of software packages is becoming more and more challenging because of the inherent complexity of scientific software and the widening variety in hardware choices (CPUs, GPUs, interconnects). Additionally, stagnating budgets for maintaining software installations at HPC centers call for a fundamentally new approach in providing stacks of scientific software in order to keep up with the growing demands from scientists.

In early 2020, the European Environment for Scientific Software Installations (EESSI) began to develop a solution which will enable ubiquitous access to rich ready-to-use software stacks which are hardware-tuned for several CPU families and microarchitectures, and will support GPUs and high-speed interconnects. A prototype has been available since summer 2020, is being continuously refined and has been successfully tested across a wide range of systems from tiny Raspberry Pis to laptops, servers, cloud instances as well as HPC clusters and supercomputers, including LUMI-C, where parallel jobs scaled to 65,536 cores. For a researcher, knowing that the EESSI stack is available on a system means that most of their issues related to software are a thing of the past.

The goal of the S4 project is to join forces with key contributors to EESSI and make their solution available to Nordic and Estonian researchers.
S4 will jointly develop the EESSI solution, deploy it on resources at the partners (leveraging the experience of NeIC NT1 with CernVM-FS), integrate it into existing NeIC services (Puhuri), help cross-border communities to build discipline-specific software stacks and work with application developers to improve their CI workflows. Last but not least, S4, in close collaboration with NeIC CodeRefinery, will develop a comprehensive series of courses and train-the-trainer programmes to train system administrators, support staff and researchers to exploit the solution to its full potential.

By putting the EESSI solution into production across the partner sites, we anticipate aggregated cost savings of 10 FTE per year for maintaining software stacks. These savings easily outweigh the project costs of 5 FTE per year.

### Thomas' attempt 4 (2498 characters)

Collaborative research increasingly relies on having rich, well-functioning and performant software stacks available across several IT resources. EasyBuild and Spack have become indispensable tools to maintain such stacks which ensure reproducibility of results and efficient use of scarce computing resources. Even using these tools, managing stacks with thousands of software packages is becoming more and more challenging because of the inherent complexity of scientific software and the widening variety in hardware choices (CPUs, GPUs, interconnects), while budgets for maintaining software installations at HPC centers stagnate. This calls for a fundamentally new approach in providing stacks of scientific software in order to keep up with the growing demands from scientists.

In early 2020, the European Environment for Scientific Software Installations (EESSI) began to develop a solution which will enable ubiquitous access to rich ready-to-use software stacks which are hardware-tuned for several CPU families, microarchitectures, GPUs and high-speed interconnects.
A prototype has been available since summer 2020, is being continuously refined and has been successfully tested across a wide range of systems from tiny Raspberry Pis to laptops, servers, cloud instances as well as HPC clusters and supercomputers, including LUMI, where parallel jobs scaled to 65,536 cores. For a researcher, knowing that the EESSI stack is available on a system means that most of their issues related to software are no longer a concern.

The goal of the S4 project is to join forces with key contributors to EESSI and make their solution available to Nordic and Estonian researchers. S4 will jointly develop the innovative EESSI solution, deploy it on resources at the partners (leveraging the experience of NeIC NT1 with CVMFS), integrate it into NeIC services (Puhuri), help cross-border communities to build discipline-specific software stacks and work with application developers to improve their CI workflows. Last but not least, S4, in close collaboration with NeIC CodeRefinery, will develop a comprehensive series of courses and train-the-trainer programmes to help system administrators, support staff and researchers exploit the solution to its full potential.

By putting the EESSI solution into production across the partner sites, we anticipate aggregated cost savings of 10 FTE per year for maintaining software stacks. These savings easily outweigh the project costs of 5 FTE per year.

### final version (2500 characters)

Present-day research increasingly relies on having ubiquitous access to rich, well-functioning and performant software stacks across IT systems. EasyBuild and Spack have become indispensable tools to maintain such stacks which ensure reproducibility of results and efficient use of scarce computing resources.
Even using these tools, managing stacks with thousands of software packages is becoming more and more challenging because of the inherent complexity of scientific software and the widening variety in hardware choices (CPUs, GPUs, interconnects), while budgets for maintaining software installations at HPC centers stagnate. This calls for a fundamentally new approach in providing scientific software stacks in order to keep up with the growing demands from scientists.

In early 2020, the European Environment for Scientific Software Installations (EESSI) began to develop a solution which will enable ubiquitous access to rich ready-to-use software stacks which are hardware-tuned for several CPU families, microarchitectures, GPUs and high-speed interconnects. A prototype has been available since summer 2020, is being continuously refined and has been successfully tested across a wide range of systems from tiny Raspberry Pis to laptops, servers, cloud instances as well as HPC clusters and supercomputers, including large parallel jobs using 65,536 cores on LUMI. For a researcher, knowing that the EESSI stack is available on a system means that most of their issues related to software are no longer a concern.

The goal of the S4 project is to join forces with key contributors to EESSI and make their solution available to Nordic and Estonian researchers. S4 will jointly develop the innovative EESSI solution, deploy it on resources at the partners (leveraging the experience of NeIC NT1 with CVMFS), integrate it into NeIC services (Puhuri), help cross-border communities to build discipline-specific software stacks, and work with application developers to improve their CI workflows. Last but not least, S4, in close collaboration with NeIC CodeRefinery, will develop a comprehensive series of courses and train-the-trainer programmes to help system administrators, support staff and researchers exploit the solution to its full potential.

By putting the EESSI solution into production across the partner sites, we anticipate aggregated cost savings of 10 FTE per year for maintaining software stacks. These savings easily outweigh the project costs of 5 FTE per year.