---
tags: HPC
---

# DeiC's Systems for HPC for Aarhus University Arts 2022-2024

By Kristoffer L. Nielbo
Center for Humanities Computing Aarhus, Faculty of Arts, Aarhus University, Denmark

:::info
__Executive Summary__
This document provides a business case for DeiC's Type 1 high performance and cloud computing systems at the Faculty of Arts, Aarhus University, for 2022-2024. Based on user patterns from the first half of 2021 and DeiC's resource allocation for 2021, it identifies four possible scenarios for 2022-2024 and adds the issue of educational access. The assessment emphasizes that, irrespective of provider and scenario, the Faculty of Arts will have to find a solution to the growing need for high performance and cloud computing resources.
:::

## Introduction

This is a short business case for Arts' high performance computing (HPC) needs in 2022-24. Arts has been asked to provide DeiC with estimates of the number of node\*hours we expect to use during the three-year period. As of May 10, 2021, AU's resource allocation model for HPC has not been finalized, and the estimates are therefore based only on standard unit prices in the Type 1 system and the budget allocated by DeiC for Type 1 in 2021.

It is important to emphasize that our researchers and students need access to HPC, cloud, and storage resources, and that this need has grown rapidly over the last decade. We can meet it through DeiC's Type 1 HPC system, local servers (as CHC has previously done), commercial cloud providers (as AU-IT proposes to do), or a combination of these. For this business case, the primary issue is the 2022-24 estimates for the DeiC systems; faculty rollout and alternative (commercial) providers are secondary.

### Background

In November 2020, DeiC launched four national HPC systems (Types 1-4), of which only three are operational as of May 2021. AU Arts is directly involved in the Type 1 system for interactive HPC/cloud computing for research.
During 2020, the Center for Humanities Computing (CHC) was reaching the limit of its capacity to provide local server access to researchers (and students), and AU-IT was moving to Microsoft Azure, which, at least for CHC and its users, was not economically sustainable. To continue to provide researchers (and students) with seamless access to scalable computing resources and research software, AU became a partner in the national consortium for interactive computing together with Aalborg University and University of Southern Denmark.

## DeiC's HPC Systems 2021: Aarhus University

Figure 1 shows the relative distribution of new Type 1 users between universities. AU shows the second-largest increase, surpassed only by SDU. SDU has, however, had a much longer run-up, because its eScience centre developed UCloud and introduced it locally in 2017. Importantly, of the AU users applying for and running research projects in UCloud in 2021, almost 90\% originate from Arts.

| ![UCloud users](https://i.imgur.com/C7r8m17.png)|
|:--:|
| __Figure 1__: *Relative number of users of DeiC's Type 1 system (UCloud) since its launch in November 2020. AU has seen an increase of 133 users since November (total: 154 users).* |

Table 1 shows the projected relative distribution of HPC system types for AU compared to national usage. Using projects as an index for total usage of Type 1, Arts is projected to utilize 90\% of AU's Type 1 resources. It should be noted that AU currently has one user of the Type 2 system, but that project is funded through external means and is therefore not included in the DeiC resource allocation.

| | AU distribution, national | AU distribution, local | National distribution |
| --- | :-: | :-: | :-: |
| Type 1 | 8.38% | 10% | 37.43% |
| Type 2 | 37.1% | 80% | 49.68% |
| Type 3 | 7.9% | 5% | 14.55% |
| Type 4 | 13.81% | 5% | 8.33% |

__Table 1__ *Distribution between types for AU according to DeiC's resource allocation.*

For comparison, a node\*hour in the Type 1 system costs 5.5 DKK (u1-standard-64), while a comparable node\*hour in MS Azure costs 22.15 DKK (DS64as v4). In other words, __moving from UCloud to Azure entails a 4X price increase__. Certain types of specialized hardware (e.g., Nvidia T4 GPUs) have comparable prices, but we have not found a single example where Azure is cheaper than UCloud. CHC can match UCloud's pricing model through local hosting of servers, but doing so will require substantial investments in both hardware and personnel. __A local solution will, however, not provide the level of security and scalability that DeiC's Type 1 system offers__.

## Four levels (scenarios) of ambition

It is important to distinguish between ambitions at the level of DeiC's HPC systems (i.e., how much compute and storage are needed for 2022-24) and our utilization of DeiC's HPC systems (i.e., research only, or research and education). While these issues are dependent, they are not, economically speaking, equivalent. In terms of how much compute and storage we need, __four cascaded scenarios__ can be identified (the higher levels have the lower levels embedded):

1. __Competitive national access__: Arts researchers can only get access to compute and storage in DeiC's HPC systems through the national resource allocation. In this scenario, only projects that already have experience with HPC and have sufficiently large resource requirements will have _de facto_ access. Potential users can only apply for access twice a year, and resources are allocated in competition with all faculties.
2. __Competitive local access__: Arts researchers can get access through both AU's local resource allocation and the national resource allocation. In this scenario, all projects have to apply for access, but access to local resources can be granted faster, and applicants compete only with other AU faculties. This form of access still requires some experience with HPC, but the required size of the project decreases.
3. __Competitive access with local sandbox__: In addition to competitive local access, Arts researchers have limited free access to sandbox resources that allow them to experiment with DeiC's HPC systems. A sandbox amounts to a trivial access threshold that gives researchers immediate access for experimentation. Currently this solution is offered in UCloud through the free "My Workspace", but that solution will be phased out by 2022.
4. __Faculty access__: Arts buys a permanent allocation of DeiC's HPC systems through the university share and distributes it however the faculty sees fit. This is how Type 1 is being allocated during the spring semester of 2021, because CHC has been allowed to distribute resources from the national allocation on a first-come, first-served basis.

An additional issue is __educational access__, that is, utilization of DeiC's HPC systems for education in courses and educational programs that have well-defined needs for compute and storage. Each of scenarios 2-4 can include this service at an additional cost, but it requires a decision on local resource allocation that allows for educational use. At the moment, CHC is running several educational access projects to estimate costs and requirements. CHC's _Edu-IT Infrastructure Needs Assessment_ is available on request from chcaa\[at\]cas\[dot\]au\[dot\]dk.

### Cost and benefits of scenarios 1-4 (2022-24)

__Competitive national access__ through the national resource allocation means that obtaining access is entirely up to the individual researcher.
This is similar to applying to national funding agencies, with the exception that the Arts researcher will be competing with projects from all faculties. There are two consequences of this: 1) access to HPC resources will be for experts only; and 2) the level of security will ultimately be determined by the expert, who will use whatever provider they see fit. This solution is _in principle_ free, because users will only be utilizing resources from the national pool or from a commercial provider paid for through external funding. From the perspective of the Type 1 back office, it seems unlikely that Arts will get access to the national resource pool if we do not contribute to the local resource pool.

__Competitive local access__: This is the minimal scenario suggested by DeiC. Researchers can apply for access through local and national resource allocation committees. It will still favor competent users, but they can get faster access and therefore improve their research applications and projects irrespective of national deadlines. In terms of security and scalability, this scenario is an improvement, because resources can be allocated quickly (month to month) on DeiC systems that (will) comply with university requirements. Based on projects in the first half of 2021 and DeiC's allocation for 2021, the projected price for this scenario is 300,000-400,000 DKK/year.

__Competitive access with local sandbox__: If researchers automatically have access to limited resources in an Arts sandbox, everyone can explore and experiment with cloud computing for research (_out of interest, not experience_). All academic employees will have the opportunity to develop and improve their projects, determine feasibility and, ultimately, formulate better applications for the local or national resource allocation committees.
From the perspective of the current Type 1 users, this is the __need-to-have scenario__, because the majority of users are currently working only in their sandbox ("My Workspace") and all existing projects started there. Security improves slightly, because users no longer need to experiment locally on their personal computers or with a commercial cloud provider. Scalability also improves, because users can transfer their sandbox experiments seamlessly to a research project. The sandbox can be implemented either as a one-time allocation of node\*hours per (new) user or as a trivial access threshold. In either case, an additional 50,000-100,000 DKK/year should be sufficient.

__Faculty access__: In addition to the local sandbox, Arts users would also be able to apply for projects at the faculty level, which makes it easy to scale experiments to research projects and removes all barriers to entry. This scenario is a full rollout of cloud computing for research and allows researchers to get access when and how they need it. In addition, for computing centres like CHC, this would mean that all new projects would be run in DeiC's HPC systems. From the perspective of the current Type 1 users, this is the __nice-to-have scenario__ that they have _de facto_ been experiencing since November 2020. Faculty access can, based on current use and DeiC's allocation, be covered by 1,000,000 DKK/year.

### Educational access, needs assessment

Currently, CHC can run a 10 ECTS course in UCloud for 5500 DKK and 30 hrs of IT support. We believe this can be reduced by half by 2022 (i.e., 2500-3000 DKK and 15 hrs of support). The current need is approximately four courses/semester. Assuming that the need will grow and extend to areas such as linguistics, history, and archaeology, a conservative estimate is a 2X increase for 2022-24 (a total of eight courses).
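
The educational estimate above can be worked through as a small calculation. The following is a minimal sketch, not an official budget: the per-course figures are taken from the text, while the function name `education_estimate`, the reading of "eight courses" as eight courses per semester, and the assumption of two semesters per year are illustrative and not stated in the source.

```python
# Figures quoted in the text.
COURSE_COST_DKK_2021 = (5500, 5500)   # compute cost of one 10 ECTS course today
COURSE_SUPPORT_HRS_2021 = 30          # IT support hours per course today
COURSE_COST_DKK_2022 = (2500, 3000)   # projected cost range per course by 2022
COURSE_SUPPORT_HRS_2022 = 15          # projected support hours per course by 2022
COURSES_PER_SEMESTER_NOW = 4
GROWTH_FACTOR = 2                     # "2X increase" for 2022-24

# Assumption (not in the source): two semesters per year.
SEMESTERS_PER_YEAR = 2


def education_estimate(courses_per_semester, cost_range_dkk, support_hrs,
                       semesters=SEMESTERS_PER_YEAR):
    """Return yearly (low DKK, high DKK, support hours) for a course load."""
    courses_per_year = courses_per_semester * semesters
    low, high = cost_range_dkk
    return (courses_per_year * low,
            courses_per_year * high,
            courses_per_year * support_hrs)


# Assumption: the 2X increase applies to the per-semester course count.
projected_courses = COURSES_PER_SEMESTER_NOW * GROWTH_FACTOR

current = education_estimate(COURSES_PER_SEMESTER_NOW,
                             COURSE_COST_DKK_2021, COURSE_SUPPORT_HRS_2021)
projected = education_estimate(projected_courses,
                               COURSE_COST_DKK_2022, COURSE_SUPPORT_HRS_2022)

print(f"Current:   {current[0]:,}-{current[1]:,} DKK/year, {current[2]} support hrs/year")
print(f"Projected: {projected[0]:,}-{projected[1]:,} DKK/year, {projected[2]} support hrs/year")
```

Under these assumptions, doubling the number of courses while halving the per-course cost leaves the yearly total roughly where it is today; a different reading of "eight courses" (for example, eight per year) would halve the projected figures.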
:::info
__TERMINOLOGY__

__Cloud computing__: On-demand availability of computing power and storage that, in this context, is provided either through DeiC or by third-party cloud vendors.

__Commercial cloud__: Cloud computing systems for paying customers, e.g., Microsoft Azure, Amazon Web Services, or Google Cloud Platform.

__High performance computing__: Aggregation of computing resources in order to deliver higher performance than a single desktop computer or workstation.

__Local server__: Access to hardware and services (e.g., data sharing and compute resources) hosted either directly by CHCAA or through AU-IT's old VPS service. As of April 26, 2021, AU-IT will only be providing access to servers through the commercial cloud vendor Microsoft Azure.

__Node\*hours__: The usage of one compute node (server) for one hour, or its equivalent. The standard unit for allocating compute resources.

__UCloud__: An interactive digital research environment built to support researchers' needs for both computing and data management throughout the entire data life cycle. UCloud was originally developed by the SDU eScience centre and provides the interface to Type 1 HPC.

__UCloud projects__: Projects encompass multiple users who collaborate through a shared file system (storage) and shared compute resources in UCloud.
:::
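
To relate the node\*hour unit defined above to the prices and scenario budgets quoted in this document, the conversion can be sketched as follows. This is an illustrative calculation only: the prices and budget ranges come from the text, but the helper `node_hours` is hypothetical, and the sketch assumes that an entire budget is spent on standard compute at the quoted unit price, ignoring storage, GPUs, and support costs.

```python
# Prices quoted in the text (DKK per node*hour).
UCLOUD_DKK_PER_NODE_HOUR = 5.5    # u1-standard-64
AZURE_DKK_PER_NODE_HOUR = 22.15   # DS64as v4

# Yearly budget ranges (DKK) quoted for the scenarios in the text.
SCENARIO_BUDGETS_DKK = {
    "competitive local access": (300_000, 400_000),
    "local sandbox (additional)": (50_000, 100_000),
    "faculty access": (1_000_000, 1_000_000),
}


def node_hours(budget_dkk: float, dkk_per_node_hour: float) -> float:
    """Node*hours a budget buys, assuming it is spent on compute only."""
    return budget_dkk / dkk_per_node_hour


# Ratio behind the "4X price increase" statement.
price_ratio = AZURE_DKK_PER_NODE_HOUR / UCLOUD_DKK_PER_NODE_HOUR
print(f"Azure / UCloud price ratio: {price_ratio:.2f}")

for name, (low, high) in SCENARIO_BUDGETS_DKK.items():
    ucloud_low = node_hours(low, UCLOUD_DKK_PER_NODE_HOUR)
    ucloud_high = node_hours(high, UCLOUD_DKK_PER_NODE_HOUR)
    azure_low = node_hours(low, AZURE_DKK_PER_NODE_HOUR)
    azure_high = node_hours(high, AZURE_DKK_PER_NODE_HOUR)
    print(f"{name}: UCloud {ucloud_low:,.0f}-{ucloud_high:,.0f} node*hours/year, "
          f"Azure {azure_low:,.0f}-{azure_high:,.0f} node*hours/year")
```

The same budget buys roughly four times as many node\*hours in UCloud as in Azure at these list prices, which is the comparison the document makes in prose.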