# Julia in HPC position paper -- Scratchpad

###### tags: `HPC` `white-paper`

:::info
- **Authors**
    - Valentin Churavy, [JuliaLab](https://julia.mit.edu/), [MIT](https://mit.edu)
    - Carsten Bauer, [PC²](https://pc2.uni-paderborn.de), [NHR](https://nhr-gs.de)
    - William F Godoy, [CSMD](https://csmd.ornl.gov/), [ORNL](https://www.ornl.gov/)
    - Mosè Giordano, [ARC](https://www.ucl.ac.uk/arc), [UCL](https://www.ucl.ac.uk/)
    - Lucas C. Wilcox, [Applied Math](https://math.nps.edu), [NPS](https://www.nps.edu)
    - Ludovic Räss, [VAW](https://vaw.ethz.ch/en/research/glaciology.html), [ETH Zurich](https://ethz.ch/en.html)
    - [Michael Schlottke-Lakemper](https://orcid.org/0000-0002-3195-2536), [NumSim](https://www.mi.uni-koeln.de/NumSim/schlottke-lakemper), [Univ. of Cologne](https://www.uni-koeln.de)
    - [Hendrik Ranocha](https://ranocha.de), [Applied Mathematics](https://www.uni-muenster.de/AMM/en/institute.shtml), [University of Münster](https://www.uni-muenster.de/en/)
    - Samuel Omlin, [CSCS](https://www.cscs.ch/), [ETH Zurich](https://ethz.ch/en.html)
    - Johannes Blaschke, [NERSC](https://www.nersc.gov), [LBNL](https://www.lbl.gov)
    - [Erik Schnetter](https://orcid.org/0000-0002-4518-9017), [Perimeter Institute](https://perimeterinstitute.ca)
:::

## Scope & Goals

> > What's the intended scope of the position paper, i.e., will this be a DOE-specific initiative or a more general document?
>
> That’s a fair question. My personal take is that it should be a general document: a position from the Julia community members invested in HPC, to help funding agencies craft initiatives and to give a narrative to other HPC communities that want to know more about Julia. My view is that the more, the better, but it’d be good to hear everyone’s opinion on the scope. [name=William G]

The paper should be able to answer some questions in 2021 for the next decade.

From http://cseweb.ucsd.edu/~ddahlstr/misc/heilmeier.html (I’m just throwing out some ideas, but a position paper should have everyone’s input and a much broader vision):

- What are you trying to do? Articulate your objectives using absolutely no jargon.
    - Establish Julia as a first-class citizen language in HPC
- How is it done today, and what are the limits of current practice?
    - Where are Julia and its ecosystem in HPC today? Development and maintenance of debuggers, profiling tools, library bindings, packaging, teaching material, conference venues, etc.
    - Showcase some applications that use Julia for HPC, if possible with performance measures, and describe how Julia made implementing them easier/feasible
- What's new in your approach and why do you think it will be successful?
    - Solve the two-language problem, suitable abstractions for science and heterogeneous hardware, help diversify current investments in HPC languages, more accessible to beginners, easier to teach, etc.
    - Intrinsic advantages of Julia absent in other languages (run-time code generation, macros/homoiconicity, compiled (high performance), etc.)
    - Noteworthy here is that with the increasing prevalence of AI/ML, and therefore increased adoption of Python, the two-language problem is getting worse.
        - ES: I think there's a statement by a Google X director (?) who said roughly that, if he were to develop their ML ecosystem today, it would be based on Julia
- Who cares? If you're successful, what difference will it make?
    - Who is part of the Julia HPC community? Can Julia attract and make more productive those who think C++ can be overkill and understand Fortran and Python limitations for their HPC efforts? AI/ML workflows? In-situ data analysis? Development Cost/Performance Benefit? Computer Scientists, Computational Physicists, Math, Research Software Engineers?
- What are the risks and the payoffs?
    - Risk: little adoption, larger investment
    - Payoffs: productivity and investment diversification in HPC languages and ecosystem. Given its high-level nature, Julia should provide [performance portability](https://performanceportability.org/perfport/definition/) (I'm not aware of extensive studies about Julia on this topic, but Johannes Blaschke's poster at JuliaCon mentions good performance portability Cori -> Perlmutter)
    - Liquify/dilute the hard barrier between "beginner language" and "HPC language": "scale" users, not just code
    - I think we should also try to frame not having a Python alternative (read: high-productivity language) as a risk => Investing in more than one HPHPC language will mitigate some risks. Apart from AI/ML workflows, the Experimental and Observational Sciences (EOS) are heavy users of HP languages (mainly Python).
- How much will it cost? How long will it take?
    - That’s a larger discussion
- What are the midterm and final "exams" to check for success?
    - Adoption and support

Wrt DOE, it is one important target due to its long track record of funding HPC research, software stacks, vendors and hardware. Johannes mentioned https://e4s-project.github.io/

I understand some of the skepticism, since ECP (2016) predates Julia's first production version (2018) and the ecosystem is still too new. Still, there is existing discussion on what happens after ECP wraps up (in roughly 1 or 2 years), so it’s good timing.

## Julia in HPC -- proof points

- [Celeste.jl](https://github.com/jeff-regier/Celeste.jl)
- [ClimateMachine](https://clima.caltech.edu/)
- [CESMIX](https://computing.mit.edu/cesmix/) - DOE funding
- [ExaPF.jl](https://github.com/exanauts/ExaPF.jl) - ECP project
- [GPU4GEO](https://www.pasc-ch.org/projects/2021-2024/gpu4geo/) -- Swiss funding (targets Swiss and EU infrastructure)
- [ADIOS2.jl](https://github.com/eschnett/ADIOS2.jl)
- [openPMD.jl](https://github.com/eschnett/openPMD.jl)
- [COBREXA.jl](https://github.com/LCSB-BioCore/COBREXA.jl) (maybe less traditional HPC, but they aim to reach "exascale")
- [CMBLensing.jl](https://github.com/marius311/CMBLensing.jl) (I like that this is a well documented, mature code that runs on Cori and Perlmutter -- and this is neat too: https://cosmicmar.com/CMBLensing.jl/stable/04_from_python/)

Can we do a user survey of Julia codes running at the DOE facilities?

- Something like:
    - Do you use Julia privately?
    - What Julia codes have you deployed at `$SITE`?
    - What barriers have you experienced running Julia at `$SITE`?
- Should be feasible before October
- Johannes can set one up for NERSC

## Teaching & Education in HPC

- Julia can significantly widen the current HPC user base by allowing the same language to be used for teaching programming and for extreme-scale computing. Reference: Portegies Zwart, S. Nature Astronomy 4, 819–822. [doi:10.1038/s41550-020-1208-y](https://dx.doi.org/10.1038/s41550-020-1208-y) (Sept. 2020).
    - Useful to strengthen support for exascale-era computing challenges
- What kind of Julia training activities fit within DOE training programmes? E.g.
    - ECP
    - ATPESC
- Interactive supercomputing invites new kinds of users who have been closed off from HPC before: a low entry barrier enables easier integration of HPC material in undergrad education
    - How is this different from more general projects like Jupyter?
        - MSL: Jupyter based on Python is only a frontend for the "real" software; Julia can be used both as the driver and executor of HPC applications (see two-language issue above)
        - There is also Alpine: https://www.exascaleproject.org/event/ascent-201217/
        - Want to make sure that we don't sound like we're taking issue with other efforts
            - I would say that we welcome those efforts, and Julia is another tool that makes projects like Jupyter better.
- Designed from the ground up with abstractions for Scientific Computing and Parallel Computing

## First meeting

:::info
- **Present people**
    - Valentin Churavy
    - William F Godoy
    - Mosè Giordano
    - Michael Schlottke-Lakemper
    - Samuel Omlin
    - Johannes Blaschke
    - Hendrik Ranocha
    - Jesse Chan
    - Erik Schnetter
:::

### Agenda

- [x] Introductions
- [ ] Structure
    - Abstract
    - Executive Summary
    - Lots of figures
    - ~12 pages
    - What is great about Julia
    - Opportunities for funding -- not challenges of Julia ;)
    - History of Julia
        - Alan Edelman: "To write a good parallel language, one first has to have a good language"
        - DARPA HPCS
    - Enabling collaborations: "Power of language"
        - No strict boundary between domain scientist and Computer Scientist/HPC performance engineer/RSE
        - Two-language problem / N-language problem
        - Julia enables new (non-HPC) communities to make more effective use of HPC
        - And: enhances collaboration in existing HPC communities
        - From prototype to scaling application
    - Make case: "why Julia"
        - Julia has a community
        - Make case: Survey HPC facilities, and HPC users
        - Jupyter as an infrastructure -- Interactive computing
        - Transitioning in between -- fewer barriers from high-level code to running on the hardware
        - HPC-ready: Growth of community, teaching, education up to research projects
            - The pipeline doesn't dry up
        - Pre/Post processing
    - Set vision (5, 10 years from now)
    - Define HPC
        - Michael: True HPC starts when you have >10_000 nodes
            - Network, IO, Bandwidth
        - Johannes: The definition of HPC needs to not be too exclusive
            - Different scales
            - Three aspects: Core, Network, IO
        - AI for Science -- built for data
- [ ] Title
- [ ] What is the Ask for DOE?
    - Natural growth right now,
- [ ] Running a Julia user survey??
    - Survey of HPC centers: Is Julia available/supported?
- [ ] What
- Next Steps
    - [x] Survey NERSC users: https://forms.gle/gpyRp3y4mPYhFLXg8 (please don't fill out if you're not a NERSC user)
    - [ ] Survey of sites (Johannes)
        - Template for site-survey:
            - What is the Julia support level at `$SITE`? (Module, Jupyter, Tickets)
            - What is the trend in Julia usage at `$SITE`?
        - [ ] NERSC
        - [ ] OLCF
        - [ ] MIT
        - [ ] CSCS
        - [ ] Lincoln Labs
        - [ ] TACC
        - [ ] JSC
        - [ ] PC²
        - More?
    - [ ] Overleaf https://www.overleaf.com/7357571575hrqfsxvfknbx
    - [ ] Next meeting in week of Aug 23: https://terminplaner4.dfn.de/eKdY1L1ymKSMxRcp
    - [ ] Venue (William Godoy)

### Potential Venues

- [SC2022, Supercomputing](https://sc22.supercomputing.org/all-dates-deadlines/), Dallas, TX. Deadlines: abstract and paper in April, workshop deadlines in August.
- [ISC2022, International Supercomputing](https://www.isc-hpc.com/press-releases/isc-2021-calls-for-research-paper-submissions-by-december-10-2020.html). No venue announced, but usually in Frankfurt, Germany. Deadlines: abstract early December, full paper early to mid December. (Participating: Carsten)
- [International Conference on Supercomputing](https://www.ics-conference.org/), ICS2022. Deadlines: abstract in late January, paper in February. Venue not announced yet.
- [The International Journal of High Performance Computing Applications](https://journals.sagepub.com/home/hpc)

## Second meeting (Aug 27th, 2021, 4pm UTC)

:::info
- **Present people**
    - Carsten Bauer
    - Michael Schlottke-Lakemper
    - Hendrik Ranocha
    - William Godoy
    - Erik Schnetter
    - Johannes Blaschke
    - Mosè Giordano
    - Valentin Churavy
    - Samuel Omlin
    - Ralph Kube
    - Ludovic Räss
:::

### Agenda (tentative; feel free to amend)

* Introductions (for new folks)
* Update on user/site surveys from Sam & Johannes
* Current state of paper
* ...

1. What do we want from this paper?
    - VC: Depends on audience -- Is this a whitepaper for DOE? Or is this a position paper that will be published?
    - General paper first
    - Then distill a DOE position paper
2. What is the ask from DOE:
    - Encourage more active support from vendors
        - How to get over the chicken-egg problem: No vendor support means no HPC users -- no HPC users means no Julia support on HPC
    - Related to this is a common question I hear: What is the "Killer App"?
        - Scientific Machine Learning
            - Deploy ML
            - ML Researchers
            - ML Customization (e.g. Physics-inspired kernels) <= This is what Julia is great at
        - Differentiable Programming
        - JIT compiler has access to AST
            - See "C++ needs a JIT" by Hal Finkel (ANL)
                - http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1609r3.html
                - https://www.youtube.com/watch?v=pDagqR0jAvQ
                - https://arxiv.org/abs/1904.08555
        - Mixed precision / approximate computing
    - VC: Breadth vs narrowness
        - MPI/Transport (High-speed network), Accelerators, etc...
    - Julia should be considered as a "First Class Citizen Language". This means e.g.:
        - It should be part of the HPC software stack (this should not be hard)
        - It should be part of the Documentation/Training
        - It should be part of the standard list of languages in user surveys/etc
    - Julia is Full-stack
        - Entry-level
        - High level technical computing
        - Numerical programming
        - Visualization
        - Accelerators/Performance engineering
        - To HPC deployment
3. How are we saving money for DOE?
    - Based on DOE funded technology: LLVM
        - Easy to spin up new backends (see Tim Besard's Talk: https://www.youtube.com/watch?v=Fz-ogmASMAE)
    - Training time?
    - Redirect applications to rapidly changing needs
    - Community driven performance portability
    - ...
    - (Low-Risk) Performance Portability
        - https://github.com/UoB-HPC/BabelStream/pull/106
        - Backend development for OpenMP and Kokkos
            - Basically hoping that Sandia can keep up
    - Teaching / Not drying up the pipeline
5. What are Julia's Application Areas?
    - The ones that DOE cares about ++
    - Low-risk performance portability: ~~connect vendors (who have the technical knowledge background in their hardware) with large (open source) community -- this might be rubbish (we don't want to position ourselves as a contender for omp/kokkos)~~ Performance portability in a high-productivity language
    - Pharmacology: PUMAS
    - Fluid simulation/Climate science
        - Float16 (https://github.com/milankl/ShallowWaters.jl, https://github.com/milankl/Sherlogs.jl)
        - https://pretalx.com/juliacon2021/talk/E7HKVW/
        - ClimateMachine
6. For DOE asks -- Heilmeier Catechism (https://www.darpa.mil/work-with-us/heilmeier-catechism):
    - What are you trying to do? Articulate your objectives using absolutely no jargon.
        - Make Post-Exascale (Post Moore?) HPC accessible to a larger community of (scientific and technical) programmers
    - How is it done today, and what are the limits of current practice?
    - What is new in your approach and why do you think it will be successful?
    - Who cares? If you are successful, what difference will it make?
    - What are the risks?
    - How much will it cost?
    - How long will it take?
    - What are the mid-term and final “exams” to check for success?

## Third meeting (September 14th)

:::info
- **Present people**
    - Michael Schlottke-Lakemper
    - Hendrik Ranocha
    - William Godoy
    - Erik Schnetter
    - Valentin Churavy
    - Ludovic Räss
:::

- Section assignments
- What figures do we have or have we made?

## Fourth meeting (October 5th)

:::info
- **Present people**
    - Hendrik Ranocha
    - William Godoy
    - Erik Schnetter
    - Valentin Churavy
:::

## Fifth meeting (October 19th)

:::info
- **Present people**
    - William Godoy
    - Mosè Giordano
    - Valentin Churavy
    - Ludovic Räss
    - Johannes Blaschke
:::

- Slides for Johannes
    - GPUCompiler
        - Julia is the best "LLVM" frontend
    - Enzyme
    - Flux for ML
    - Probabilistic Programming: Turing, Gen, ...
- ISC deadline, November 22nd -- full paper November 29th
- Tutorial/BoF at ISC
- Teaching materials/HPC workshop
    - https://www.ihpcss.org/