---
tags: HPC
---
# ORG: Interactive HPC for Aarhus University Arts
:::info
__Executive Summary__
Levels of ambition
:::
:::info
* The following estimates assume that we pay for every node\*hour and byte of storage we use on a pay-as-you-go model. They do not take into consideration how Aarhus University chooses to allocate and charge for the 50% share of the national resources that Danish universities have committed to.
* Estimates of compute are based on the first six months of use in UCloud.
:::
## Introduction
This is a short business case concerning Arts' needs for high performance computing (HPC) in 2022-24. Arts has been asked to provide DeiC with an estimate of the number of node\*hours we expect to use during the three-year period. As of May 10, 2021, AU's resource allocation model for HPC has not been finalized, and estimates are therefore based only on standard unit prices in the Type 1 system and the budget allocated by DeiC for Type 1.
It is important to emphasize that our researchers and students need access to HPC, cloud and storage resources, and that this need has been growing rapidly over the last decade. We can meet it either through DeiC's Type 1 HPC system, local servers (as CHC has previously done), commercial cloud providers (as AU-IT proposes to do), or a combination. For this business case, the primary issue is the 2022-24 estimates for the DeiC systems; faculty rollout and alternative (commercial) providers are secondary.
### Background
In November 2020, DeiC launched four new national HPC systems (Types 1-4; only three are operational as of May 2021). AU Arts is directly involved in the Type 1 system for interactive HPC/cloud computing for research. At that time, the Center for Humanities Computing (CHC) was reaching the limit of providing local server access to researchers, and AU-IT was moving to Microsoft Azure, which for CHC was not economically sustainable. To continue to provide researchers (and students) seamless access to scalable computing resources and research software, AU became a partner in the national consortium for interactive computing with Aalborg University and the University of Southern Denmark.
## DeiC's HPC Systems 2021: Aarhus University
Figure 1 shows the relative distribution of new Type 1 users between universities. AU shows the second largest increase, outperformed only by SDU. SDU has, however, had a much longer run-up, because they developed UCloud and have been using it locally since 2017. Importantly, of the AU users applying for and running research projects (18 projects) in UCloud in 2021, almost 90% were from Arts. That Arts is currently the largest user of Type 1 partly reflects that interactive HPC targets our profile, but also that CHC's outreach focuses primarily on Arts users.
| |
|:--:|
| *Figure 1: Relative number of users of DeiC's Type 1 system (UCloud) since its launch in November 2020. AU has seen an increase of 133 users since November (total: 154 users).* |
Table 1 shows the projected relative distribution of HPC system types for AU compared to national usage. Using projects as an index for total usage of Type 1, Arts is projected to utilize 90% of AU's Type 1 resources. It should be noted that AU currently has one user at the Type 2 facility, but that research project is paid for by external funds and therefore not included in the DeiC resource allocation.
| | AU distribution (national) | AU distribution (local) | National distribution |
| --- | :-: | :-: | :-: |
| Type 1 | 8.38% | 10% | 37.43% |
| Type 2 | 37.1% | 80% | 49.68% |
| Type 3 | 7.9% | 5% | 14.55% |
| Type 4 | 13.81% | 5% | 8.33% |
__Table 1__ *Distribution between types for AU according to DeiC's resource allocation.*
For purposes of comparison, a node\*hour in the Type 1 system costs 5.5 DKK (u1-standard-64). A comparable node\*hour in MS Azure costs 22.15 DKK (DS64as v4). __Going from UCloud to Azure is a 4X price increase__. Certain types of specialized hardware (e.g., T4 Nvidia GPU) have comparable prices, but we have not found one example where Azure is cheaper than UCloud. CHC can match UCloud's pricing model through local hosting, but it would require substantial investments in both hardware and personnel. __A local solution will, however, not provide the level of security and scalability that DeiC's Type 1 system offers__.
## Four scenarios for the level of ambition
It is important to distinguish between ambitions at the level of DeiC's HPC systems (i.e., how much compute and storage are needed for 2022-24) and our utilization of DeiC's HPC systems (i.e., research only or also education). While these issues are dependent, they are not, economically speaking, equivalent.
In terms of how much compute and storage we need, __four cascaded scenarios__ can be identified (the higher levels have lower levels embedded):
1. __Competitive national access__: Arts researchers can only get access to compute and storage in DeiC's HPC systems through the national resource allocation. In this scenario, only projects that already have experience with HPC and have sufficiently large resource needs will have _de facto_ access. Potential users can only apply for access twice a year, and resources are allocated in competition with all faculties.
2. __Competitive local access__: Arts researchers can get access both through AU's local resource allocation and the national resource allocation. In this scenario, all projects have to apply for access, but access to local resources can be granted faster and is only in competition with AU faculties. This still requires some experience with HPC, but the required size of the project decreases.
3. __Competitive access with local sandbox__: Arts researchers have limited free access to sandbox resources that allow them to experiment with DeiC's HPC systems. A sandbox amounts to a trivial access threshold that gives researchers immediate access for experiments. Currently this solution is offered in UCloud through the free "My Workspace", but that solution will be phased out by 2022.
4. __Faculty access__: Arts buys a permanent allocation of DeiC's HPC systems through the university share and distributes it however the faculty sees fit. This is how Type 1 is being allocated during the spring semester 2021, because CHC has been allowed to distribute resources from the national allocation on a first-come, first-served basis.
An additional issue is __educational access__, that is, utilization of DeiC's HPC systems for education in courses/educational programs that have well-defined needs for compute and storage. Scenarios 2-4 could all include this for an additional cost, but it requires a decision on local resource allocation that allows for educational use. At the moment, CHC is running several educational access projects to estimate costs and requirements.
### Cost and benefits of scenarios 1-4 (2022-24)
__Competitive national access__ through the national resource allocation means that obtaining resources is entirely up to the individual researcher. This is similar to applying to national funding agencies, with the exception that the Arts researcher will be competing with projects from all faculties. There are two consequences of this: 1) access to HPC resources will be for experts only; and 2) the level of security will ultimately be handled by the expert, who will use whatever provider they see fit. This solution is _in principle_ free, because users will only be utilizing resources from the national pool or from a commercial provider paid through external funding. From the perspective of the Type 1 back office, however, it seems unlikely that Arts will get access to the national resource pool if we do not contribute to the local resource pool.
__Competitive local access__
This is the minimal scenario suggested by DeiC. Researchers can apply for access through local and national resource allocation committees. It will still favor competent users, but they can get faster access and can therefore improve their research applications and projects irrespective of national deadlines. In terms of security and scalability, this scenario is an improvement, because resources can be allocated quickly (month to month) on DeiC systems that (will) comply with university requirements. Based on projects in the first half of 2021 and the DeiC allocation for 2021, a price projection for this scenario is 300,000-400,000 DKK/year.
__Competitive access with local sandbox__
If researchers automatically have access to limited resources in an Arts sandbox, everyone can explore and experiment with cloud computing for research out of interest (and not just prior experience). All employees will have the opportunity to develop and improve their projects, determine feasibility and, ultimately, formulate better applications to the local or national resource allocation committee. From the perspective of the current Type 1 users, this is the __need to have scenario__, because we can see that the majority are currently only working in their sandbox ("My Workspace") and that all existing projects started there. Security improves slightly, because users no longer need to experiment locally on their personal computers or at a commercial cloud vendor. Scalability also improves, because users can transfer their sandbox experiments seamlessly to a research project. The sandbox can be implemented either as a one-time allocation of node\*hours per user or as a trivial access threshold. In either case, an additional 50,000-100,000 DKK/year should be sufficient.
__Faculty access__
In addition to the local sandbox, Arts users would also be able to apply for projects locally, which makes it easy to scale experiments to research projects and removes all barriers to entry. This scenario is a full rollout of cloud computing for research and allows researchers to get access when and how they need it. In addition, for computing centres like CHC, this would mean that all new projects would be run in DeiC's HPC systems. From the perspective of the current Type 1 users, this is the __nice to have scenario__ that they have _de facto_ been experiencing since November 2020. Based on current use and DeiC's allocation, faculty access can be covered by 1,000,000 DKK/year.
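The scenario budgets above can be translated into compute volume using the u1-standard-64 unit price of 5.504 DKK/node\*hour from the price list elsewhere in this document; a minimal sketch:

```python
# Convert each scenario's annual budget (DKK) into u1-standard-64 node*hours.
PRICE = 5.504  # DKK per node*hour, u1-standard-64 (UCloud price list)

budgets = {
    "Competitive local access": (300_000, 400_000),
    "Local sandbox (additional)": (50_000, 100_000),
    "Faculty access": (1_000_000, 1_000_000),
}
for scenario, (low, high) in budgets.items():
    print(f"{scenario}: {low / PRICE:,.0f}-{high / PRICE:,.0f} node*hours/year")
```

For example, the faculty access budget of 1,000,000 DKK/year corresponds to roughly 181,700 node\*hours on the largest standard machine type.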
### Educational access, needs assessment
Currently, CHC can run a 10 ECTS course in UCloud for 5,500 DKK and 30 hrs of IT support. We believe this can be halved by 2022 (i.e., 2,500-3,000 DKK and 15 hrs of support). The current needs are approximately four courses/semester. Assuming that the needs will grow and extend to areas such as linguistics, history, and archaeology, a conservative estimate is a 100% increase for 2022-24 (a total of eight courses).
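The resulting budget can be sketched from these figures. Two assumptions are ours: the 2022 per-course cost lands at the midpoint of the projected 2,500-3,000 DKK range, and the eight courses are costed as one batch.

```python
# Rough cost projection for educational access, using this section's figures.
COST_2021 = 5500                 # DKK per 10 ECTS course in 2021 (plus 30 hrs IT support)
COST_2022 = (2500 + 3000) / 2    # assumed midpoint of the projected 2022 range
SUPPORT_2022 = 15                # projected IT support hours per course by 2022
COURSES = 8                      # after the projected 100% increase

print(f"Compute cost for {COURSES} courses: {COURSES * COST_2022:,.0f} DKK")
print(f"IT support for {COURSES} courses: {COURSES * SUPPORT_2022} hrs")
```

Under these assumptions, educational access adds on the order of 22,000 DKK plus 120 support hours per round of eight courses.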
## Risk assessment
### Alternative providers/sources of compute
Risks associated with using third-party solutions or local hosting are inversely related to the level of ambition. That is, if researchers have faculty access to UCloud, there is little reason for them to acquire alternative sources of compute; conversely, if they only have competitive access to the national allocation, they will most likely establish other alternatives.
### Collaboration
### Maintenance and development
Hidden costs
### Storage and data management
ISO 27001 certified
## Recommended solution
Science and Health currently have "free access", as they provide access to compute through faculty, local and national allocations. Educational access, however, varies widely and is, with a few exceptions, not the norm.
At least a local sandbox is needed if we do not want to reinforce existing user patterns and want to promote fair access to resources.
For Arts, ignoring educational access is irresponsible.
:::info
__TERMINOLOGY__
__local hosting__: running compute and storage on servers owned and operated locally (e.g., by CHC), as opposed to national or commercial cloud infrastructure.
__node\*hours__: the billing unit for compute; one node\*hour corresponds to one full node used for one hour.
__pay-as-you-go__: (PAYG) a cost model for cloud services that encompasses both subscription-based and consumption-based models, in contrast to the traditional IT cost model, which requires up-front capital expenditures for hardware and software.
__third-party solutions__: compute and storage purchased from external, typically commercial, providers (e.g., Microsoft Azure).
__UCloud projects__: projects encompass multiple users that collaborate with a shared file system (storage) and shared compute resources in UCloud.
:::
___
Intro: researcher who heads the Center for Humanities Computing, which is involved in Type 1 with a view to supporting SSH
1. Money flows concerning the national HPC centres: AU pays its relative share to DeiC, financed through the debiting formula, i.e., according to the faculties' revenue (minus external funds). DeiC then pays AU for hosting HPC Types 1 and 2. How are these funds collected/transferred, and what are they to be used for? Partly for expanding the HPC facilities, but the first year could be handled with the existing facilities. We also have a financing need in order to upgrade the HPC server room. The university leadership is very focused on the money flows, since the contribution to DeiC is a significant extra expense for the faculties.
- relatively few funds (400K-1,500K), allocated to support and development but not hardware
- our contribution has been in-kind from the faculty
- e.g., building, adapting and updating images, adapting the frontend & usability, communicating with users
- handling the transition to containerized apps & short-lived/non-persistent file systems
2. Back office/front office and operating statistics for the national HPC centres. Do we have a handle on how users get access to the national HPC centres?
* Yes & no
* Front office: who has the task? We have promised a SPOC (single point of contact), an admin task
* we want to maintain front office control for Type 1
* a data processing agreement in place that does not favor Azure
Furthermore, there has been some friction around the delivery of operating statistics. How do we handle that?
* Eske's (now old) data structure in some cases makes no sense (e.g., there is a lot of redundancy) and in other cases is not possible. For example, ORCID as user ID: legally, we cannot require our users to have an ORCID. I have asked Eske to propose a solution.
3. Interplay between GenomeDK and Grendel
David has pointed out how the future collaboration could look between GenomeDK as a national HPC centre and the local Grendel, as the community around Grendel is also interested in being part of the collaboration.
---
Re: container persistence
_ephemeral_ rather than non-persistent: a container's read-write top layer of the container file system is deleted together with the container instance (`docker rm`). However, you can mount volumes in the container that hold persisted data. Those volumes are not an integral part of the container instance, so the container file system itself is still ephemeral.
___
Spring 2021 projects
| project | u1-standard | uc-general | uc-t4 | ceph |
| - | - | - | - | - |
| Peter Tester | 100 | 0 | | |
| toponym-resolution | 100 | 0 | | |
| Deus ex machina - Application of machine learning classification for Latin epigraphy | 986 | 0 | | |
| Test project EuroCC | 10 | 0 | | |
| Investigating the use of shap value to interpret machine learning on electrophysiological data | 679 | 0 | | |
| Student project in course Social media and communication Spring 2021 | 1000 | 0 | | |
| CogSci Data Science 2021 | 1930 | 0 | | |
| pandemic-info-resp | 4355 | 0 | | |
| The Alien Categorization Study | 821 | 0 | | |
| Cultural Data Science Teaching App | 478 | 0 | | |
| Danish ELECTRA | 0 | 0 | | |
| Reanalysis of Trends in International Mathematics and Science Study 2007-2019 | 4995 | 0 | | |
| Coreference resolution project | 0 | 0 | | |
| Fabula-NET | 967 | 0 | | |
| Data fusion within International Large Scale Assessments | 1999 | 0 | | |
| Power Analysis in Meta-Analysis with Dependent Effect Sizes | 1857 | 0 | | |
| Detecting mounds in Bulgarian landscapes using deep convolutional neural networks | 0 | 0 | | |
| ICILS Teacher Panel Study | 999 | 0 | | |
Prices:
| Name | vCPU | RAM (GB) | GPU | Price | Note |
| --- | :-: | :-: | :-: | --- | --- |
| u1-standard-1 | 1 | 6 | 0 | 0.086 DKK/hour | |
| u1-standard-2 | 2 | 12 | 0 | 0.172 DKK/hour | |
| u1-standard-4 | 4 | 24 | 0 | 0.344 DKK/hour | |
| u1-standard-8 | 8 | 48 | 0 | 0.688 DKK/hour | |
| u1-standard-16 | 16 | 96 | 0 | 1.376 DKK/hour | |
| u1-standard-32 | 32 | 192 | 0 | 2.752 DKK/hour | |
| u1-standard-64 | 63 | 371 | 0 | 5.504 DKK/hour | 1000 DKK ≈ 181.686 hrs |
| u1-gpu-1 | 16 | 44 | 1 | 10.548 DKK/hour | |
| u1-gpu-2 | 32 | 88 | 2 | 21.096 DKK/hour | |
| u1-gpu-3 | 48 | 132 | 3 | 31.644 DKK/hour | |
| u1-gpu-4 | 63 | 180 | 4 | 42.192 DKK/hour | 1000 DKK ≈ 23.7 hrs |
| uc-general-small | 4 | 16 | 0 | 0.339 DKK/hour | |
| uc-general-medium | 8 | 32 | 0 | 0.679 DKK/hour | |
| uc-general-large | 16 | 64 | 0 | 1.369 DKK/hour | |
| uc-general-xlarge | 64 | 256 | 0 | 5.499 DKK/hour | 1000 DKK ≈ 181.851 hrs |
| uc-t4-1 | 10 | 40 | 1 | 8.499 DKK/hour | 1000 DKK ≈ 117.66 hrs |
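The u1-standard prices scale linearly with the nominal machine size (0.086 DKK/hour per unit), and the "1000 DKK ≈ x hrs" annotations are simply 1000 DKK divided by the hourly price; a quick consistency check:

```python
# Verify the linear u1-standard pricing and the "1000 DKK buys x hrs" figures.
BASE = 0.086  # DKK/hour for u1-standard-1

for size in (1, 2, 4, 8, 16, 32, 64):
    print(f"u1-standard-{size}: {BASE * size:.3f} DKK/hour")

# 1000 DKK on u1-standard-64 (5.504 DKK/hour):
print(f"1000 DKK buys {1000 / (BASE * 64):.3f} hrs")
```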
___
- HPC needs and ambitions at Arts
- DeiC has approached AU with a request that the universities report their needs for use of the 4 HPC types in the years 2022-2024 no later than August 1.
- At AU, the deadline for submissions from the five faculties is June 1. During June, a meeting is held in the HPC forum where the final combined submission is agreed.
- After the universities have reported their needs, DeiC begins negotiations with the consortia behind the 4 types, while awaiting clarification on how much funding will be allocated to the area in the Finance Act. The final distribution of funds to the consortia, as the basis for agreements for 2022, is expected to be presented and approved at the DeiC board meeting on October 8, 2021.
- It is recommended that Arts' submission be based on user data from UCloud. CHAAA can produce this estimate.
- The faculty leadership needs to discuss the ambitions for the HPC area, since this is a resource we will be paying for going forward.
___
---
Re: points 7.2-3
* The discussion about reporting of usage data has a longer history (as is also evident from Dan's comment), and for Type 1 we have done the best we can. Eske's (now old) data structure in some cases makes no sense (e.g., there is a lot of redundancy) and in other cases is not possible. For example, ORCID as user ID: legally, we cannot require our users to have an ORCID. I have asked Eske to propose a solution.
* Interpretation of usage data must be viewed differently for Type 1 than for normal HPC centres, because we offer interactive access. We see very large fluctuations over the course of a week (& day), because our users do not start an app at the weekend (see the attached graph). Average CPU utilization is therefore a poor measure (approx. 30% up to 30/1). At weekends we often have very few users, but in return DeiC gets, relative to our contract, up to 200% CPU utilization on some weekdays. It is important that DeiC takes this difference into account and instead looks at user volume and user diversity.
___
https://newsroom.au.dk/en/news/show/artikel/finansiering-af-administrationen-friholder-eksterne-forskningsmidler/
* Contribution to joint costs (debiting): The amount that the academic units are charged as a contribution to financing the central administration and the university’s joint costs.
* Debiting formula: The debiting formula used to calculate the contribution to joint costs.
* Overhead (administration contribution): Proportion of a grant earmarked to cover indirect administrative costs associated with a project but which cannot be directly attributed to the concrete project.
___
* AU-IT must pay for storage and setup (via overhead)?
* what is the proposal for "who pays DeiC?"
---
ERDA/SIF, 2 million in total via the debiting formula
AU finances the relative share to DeiC on the basis of the debiting formula (i.e., on the basis of the faculty revenue minus external funding)
| Period | 1/1-23/2 | 1/3-30/6 | 1/7-31/12 |
| --- | :-: | :-: | :-: |
| Type 1 - cpu | 233064 (0.064) | 471026 (0.063) | 365322 (0.063) |
| Type 1 - gpu | 0 | 50418 (0.006) | 39104 (0.007) |
| Type 2 | 3352743 (0.935) | 6994516 (0.93) | 5424863 (0.93) |
| Type 3 | 0 | 0 | 0 |
| Type 4 | 0| 0 | 0 |
__Table 2__ *DKUNI Parts, core hours for 2021*