Points from the ESR that need to be addressed are:

* However, the proposal does not sufficiently explain how its approach to co-design, scaling, and the federation of existing resources in Europe, etc., will go beyond the current state of the art, and how hardware vendors will be involved in the co-design process. These are shortcomings.
  * Interpretation:
    * What do _they_ see as the current state of the art? How do _they_ interpret the co-design process? This likely comes back to the concern about hardware development in Europe, which is why they mention the integration of hardware vendors. We need to shift the focus from hardware (where there is a 10-year lead-in for co-design) to the software that supports hardware (where vendors can respond relatively quickly) and present that as where we are implementing co-design. By introducing financial added value for hardware vendors we encourage them to collaborate with us (i.e., we need to be doing something that will help them win tenders). This happens with GROMACS, for example, where NVIDIA has 2 FTEs working on it because they know that if it works well they will win tenders.
    * One could argue that the current vendor-level state of the art for software is NVIDIA containers, for example, and then we explain how we are better than containers (size, maintenance, architecture support, hardware access, ...).
    * Maybe one could add that, through the project, software is ready on future hardware (RISC-V), which provides an incentive for infrastructure providers (including cloud) to enter that market?
    * Scaling and federation are mentioned to a lesser extent but will also need to be addressed.
    * One way to explain our approach to *scaling* would be a task graph using load-balancing techniques to ensure optimal system utilisation. Each node on the graph may consume petascale levels of resources and can also be embedded in ensemble approaches.
    * For *federation*, our approach is end-user focussed and about _facilitating_ federation: creating a uniform computing environment that users can leverage across any European site (and also EOSC resources such as OCRE). From a practical perspective, the project is responsible for ensuring that the unified software environment is functional at all the relevant sites.
  * Who will address: Alan
  * How: Maybe a table is the right approach here so we can make it very clear how we address the points raised:

    |            | State of the Art | How MultiXscale goes beyond this     |
    | ---------- | ---------------- | ------------------------------------ |
    | Co-design  | Text             | (must include connection to vendors) |
    | Scaling    | Text             | Text                                 |
    | Federation | Text             | Text                                 |

    (Federation via BSC & RISC-V work??? ... other areas of potential work with _European_ hardware vendors?)

____

* However, the baseline for the used codes and **algorithms** is insufficiently described. This is a shortcoming.
  * Interpretation: They are looking for a stick to beat us with :wink:. We need to be able to provide a baseline of scalability and support for our key packages so that they have a quantifiable measure of our achievements. Describe the base codes and their typical usage. Focus on the algorithm side and the coupling of codes. Arrange around applications as opposed to codes? Staying generic is more in line with what they probably want.
  * Who will address: Stuttgart, Matej + Ignacio + Mathieu, (Sauro + Massimo)
  * How: Perhaps make a table; include the couplings in the table and make it clear what the project is contributing.

____

* Open science practices and research data management aspects are generically described and how FAIR principles would be applied, for example, is not made clear. E.g. apart from one code, the interoperability of inputs/outputs to other software packages or the intended use of metadata standards enabling interoperability of the data are unclear.
  **In addition, it is not clear which key project results are of commercial interest and which will be open source, since they are not defined as deliverables and there will be a confidential business plan.** This is a shortcoming.
  * Interpretation: "Other software packages" is unclear. It feels a little like they are highlighting the work of NOMAD (and AiiDA from MaX) in this area. Maybe we take the easy way out and say that we plan to collaborate with NOMAD and MaX in this regard. Maybe partners in the project or CECAM has some repositories that support the FAIR principles? There is no standard metadata format; there is an ongoing discussion about ontology, but we can say we will follow the state of the art and contribute where we can. Commercial: yes, on the EESSI side for consultancy (support).
  * Who will address: Matej
  * How: Describe how the codes can use standardised formats and how we will contribute to metadata standards.

____

* However, the descriptions are in part rather generic and the impact towards specific target groups/user communities is not well outlined. In addition, not all KPIs focusing on impacts are set to measure the contributions of the CoE to these requirements. For example, the KPI to measure how many developing HPC communities will benefit is not entirely clear. This is a shortcoming.
  * Interpretation: Exactly what specific groups should we mention here? How do we define a "developing HPC community"? How do we quantify the contribution of MultiXscale? They are interested in numbers here, so we need to find measures that we can track. We can perhaps go broad and narrow with our training events. For the broad case, the EESSI introductory trainings can have a questionnaire to help classify the attendees (country of origin, country where active, field and career status). For the narrow case, we can do that in the context of application-specific training events, perhaps?
  * Who will address: Alan
  * How: Blah

____

* Potential barriers to the expected outcomes and impacts are identified and discussed, but the management of the potential negative impacts is only very briefly addressed, which is a minor shortcoming.
  * Interpretation: It sounds like they are happy with Table 13, but the connection to Task 8.4 has not been made clear (and perhaps Task 8.4 is not detailed enough).
  * Who will address: Barbara
  * How: Blah

____

* However, in terms of joint exploitation, although the software will be exploited through training at Extended Software Development Workshops and other training events, exploitation plans and IP management for each participant are not fully detailed.
  * Interpretation: I find it difficult to understand what they want here. Do we need a table with this information for each partner? IP management covers copyrights, patents, and trademarks. Code is licensed, but the copyright sits with the writer unless they assign or licence it. For EESSI, we should create a CLA so that EESSI retains relicensing rights on the software projects. We also need the scientific groups to outline how they will actually use the couplings created.
  * Who will address: Caspar, and also the scientific partners
  * How:
    - Caspar: Add a section stating that all involved HPC institutes intend to offer the EESSI software stack to all their users (Section 2.2, in Exploitation?)
    - Scientific partners: how do you intend to use the codes developed in this project after the project ends? => Find a section where this fits, and write that down.

____

* However, the timing of tasks is not appropriate as many tasks span the entire duration of the project (month 1 to 48). It is unclear how different tasks are organised. This is a shortcoming.
  * Interpretation: They didn't like the look of the Gantt chart. Given the additional requirements regarding deliverables, we should probably break tasks like Tasks 1.1, 1.2 and 1.3 into subtasks, with the deliverable updated in each reporting period.
    Tasks that extend beyond 24 months should really have 2 deliverables. This reshuffling may give us the opportunity to drop some of the more difficult deliverables. A lot of the Gantt chart also seems to be incorrect, with deliverables associated with an incorrect task.
  * Who will address:
  * How: Blah

**Deliverables timing**

The University of Stuttgart has three deliverables:

1. State of HPC-readiness for key packages (ESPResSo, LAMMPS, waLBerla) (month 18)
2. Release of ESPResSo with waLBerla support (month 18)
3. Report on ESPResSo scalability (month 30)

Rudolf:

> If one deliverable per reporting period is required, I'd suggest removing the 1st and adding:
>
> 4. Report on demonstration of a coarse-grained simulation of an ionic liquid or polyelectrolyte on at least 1000 cores (month 48)
>
> This would result in deliverables at month 18 (ESPResSo release), 30 (ESPResSo scalability report) and 48 (demonstrator).

**Baseline**

* For ESPResSo, this is the 4.2.0 release (published in June 2022)

**Interoperability**

* For ESPResSo, H5MD I/O is a working feature (currently, writing is partially supported)
* LAMMPS can write (but not read) H5MD files. Trajectories can be read using text formats
* waLBerla can write (Rudolf: but AFAIK not read) VTK files

____

* In addition, the description of Task 2.1 lacks sufficient detail on the extension of the waLBerla code to support VLES simulations in HPC clusters with accelerators. Furthermore, the pilot cases of ultrasound simulations for biomedical applications and battery applications lack concrete details of the specific problem and configuration to be addressed.
  * Interpretation: This will need work on the task descriptions in WPs 2/3. The first part sounds like it requires a more technical description. For the second part, it is perhaps more about _what_ the problem is and _why_ you need the coupling.
  * Who will address:
  * How: Blah

____

* However, the involvement of each participant in specific tasks is not sufficiently clear.
  The experience and expertise of some partners is not well explained, and how these relate to the tasks to be carried out is not well defined. The role of the associated partners is not sufficiently described. These are shortcomings.
  * Interpretation: We can add PM contributions on a per-task basis to the task descriptions, since we have this information in the budget. The latter part of the comment most likely relates to Section 3.2; a table is probably required here to help make things clear. Staff who are 100% on the project do not need timesheets (just a statement); timesheets are reported by WP.
  * Who will address: Matej
  * How: Table and some modifications of Section 3.2

____

* However, details on risks around achieving the foreseen capabilities for exascale technologies are not well addressed. This is a minor shortcoming.
  * Interpretation: I imagine this can be covered by an additional entry (or two) in Table 13.
  * Who will address: Alan
  * How: Add an entry

____

* However, the majority of tasks do not end with a deliverable that is appropriate for the content of that particular task. For instance, WP5 aims at Building, Supporting and Maintaining a Central Shared Stack of Optimized Scientific Software Installations. However, WP5 does not have a sufficiently well-described task focusing on building the Central Shared Stack. While building this stack is foreseen in WP1, WP1 does not have a corresponding substantial deliverable. This is a shortcoming.
  * Interpretation: See the previous comment; the Gantt chart is incorrect in places. Some tasks will need "living" deliverables that are updated in each reporting period.
  * Who will address:
  * How: Blah

____

* However, the expertise in some areas, for example in ultrasound and biomedical applications, and in rotor dynamics, is inadequately described.
  * Interpretation: Again, this is likely covered if we include this information in a per-partner table within Section 3.2 and perhaps some additional writing in that Section
  * Who will address:
  * How: See table for Section 3.2
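If it helps to make the *scaling* argument from the first ESR point more concrete (a task graph with load balancing, where each graph node may itself be a petascale job or ensemble member), a toy sketch could be included or referenced. The following is only an illustrative greedy list scheduler, not project code; the task names, dependency structure, and costs are invented for the example.

```python
import heapq
from collections import deque

def schedule(tasks, deps, cost, n_nodes):
    """Toy greedy list scheduler for a task graph.

    tasks:   list of task names (a DAG's vertices)
    deps:    {task: set of prerequisite tasks}
    cost:    {task: runtime, in arbitrary units}
    n_nodes: number of compute resources to balance load across
    Returns {task: (node, start, finish)}.
    """
    indeg = {t: len(deps.get(t, ())) for t in tasks}
    children = {t: [] for t in tasks}
    for t, parents in deps.items():
        for p in parents:
            children[p].append(t)
    ready = deque(t for t in tasks if indeg[t] == 0)
    # Min-heap of (time at which the resource becomes free, resource id):
    # popping it always load-balances onto the earliest-free resource.
    nodes = [(0.0, i) for i in range(n_nodes)]
    heapq.heapify(nodes)
    finish, placement = {}, {}
    while ready:
        t = ready.popleft()
        free_at, node = heapq.heappop(nodes)
        # A task starts when both its resource and all prerequisites are done.
        start = max(free_at, max((finish[p] for p in deps.get(t, ())), default=0.0))
        end = start + cost[t]
        finish[t] = end
        placement[t] = (node, start, end)
        heapq.heappush(nodes, (end, node))
        for c in children[t]:  # release tasks whose last prerequisite finished
            indeg[c] -= 1
            if indeg[c] == 0:
                ready.append(c)
    return placement

# Illustrative diamond-shaped graph: A feeds B and C, which feed D.
plan = schedule(
    tasks=["A", "B", "C", "D"],
    deps={"B": {"A"}, "C": {"A"}, "D": {"B", "C"}},
    cost={"A": 1, "B": 2, "C": 2, "D": 1},
    n_nodes=2,
)
# B and C run concurrently on the two resources; the makespan is 4 units.
```

In the proposal's setting, each "resource" here would stand in for a whole system partition, and each task could itself be a petascale run, but the load-balancing idea (dispatch ready graph nodes to the earliest-free resource) is the same.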