# FOSS4G Pangeo 101 workshop

###### tags: `Pangeo` `foss4g`

HackMD shared document: https://hackmd.io/@nicest2/foss4g-pangeo

Thank you for joining the **Pangeo 101** workshop! We’re delighted to have you here :sparkles:

## Code of conduct :heavy_check_mark:

* [Take a moment to read this](https://github.com/pangeo-data/governance/blob/master/conduct/code_of_conduct.md)

## Timeline :clock1:

The FOSS4G Pangeo 101 workshop is on Tuesday 23rd August 2022 from 14:00 - 18:00 (Europe/Rome), 211.

| Time | Activity |
| ---- | -------- |
| 14:00 | 👋 Welcome |
| | Introductions, logistics and workshop goal setting |
| | [The Pangeo ecosystem](https://docs.google.com/presentation/d/1XB9jmKlPnyAtUWRG_xzGC9h3qn_88gVSegOI3uDcaKo/edit?usp=sharing) |
| | Handling multi-dimensional arrays with xarray |
| | Interactive plotting with HoloViews |
| 16:00 | ☕️ Break (20 minutes) |
| | Data access & Data chunking |
| | Parallel computing with Dask |
| 17:45 | Beyond the workshop, feedback & concluding remarks |

This timeline is approximate and given for guidance only. We will adjust depending on the audience. There will be additional short breaks (5 minutes) at regular intervals and time for questions during the workshop.

## Chat (Gitter) :loudspeaker:

https://gitter.im/pangeo-data/Europe

## Sign-up :pencil:

HackMD shared document: https://hackmd.io/@nicest2/foss4g-pangeo

## Access to infra

https://pangeo-foss4g.vm.fedcloud.eu/jupyterhub/hub/user-redirect/git-pull?repo=https%3A//github.com/pangeo-data/foss4g-2022&urlpath=lab/tree/foss4g-2022/tutorial/pangeo101/&branch=main

https://pangeo-foss4g-jsi.vm.fedcloud.eu/ (backup for those who did not manage to enroll)

## Training material

https://pangeo-data.github.io/foss4g-2022/intro.html

**Name + an emoji to represent your mood today ([emoji cheatsheet](https://github.com/ikatyang/emoji-cheat-sheet/blob/master/README.md))**

*(Remember that this is a public document.
You can use a pseudonym if you'd prefer.)*

- Stefanie 🦸‍♀️
- Matthias :8ball:
- Tammy: 🤠
- Gordon: :sleeping:
- Tim: :smile:
- Darren: :thinking_face:
- Vashek
- Kylli :smile_cat:
- Sanghee :smiley:
- Justin :first_quarter_moon:
- Keith :smile_cat:
- Tuuli: :smiley:
- Honza :ok_hand:
- Lada :face_with_raised_eyebrow:
- Gediminas :smile_cat:
- Eric :frog:
- Christian 🍻
- Ramesh :smiley:
- Linus :duck:
- Anca :thumbsup:
- Saheel :smiley:

## Q&A :question:

*(Add here any question or issue you might need assistance with. Feel free to put it below or ask in the [Pangeo Europe Gitter](https://gitter.im/pangeo-data/Europe))*

- *Is there an efficient way to deal with border effects when using chunked data? E.g. when applying a moving window:*
  If you mean a moving window in Xarray (with a Dask backend), you can refer to Dask's `map_overlap` functionality: https://docs.dask.org/en/stable/array-overlap.html
- *How do we use GeoTIFF with kerchunk?:*
  https://fsspec.github.io/kerchunk/cases.html#sentinel-global-coherence

## Potential list of issues :tornado_cloud:

### EGI login

Ask the in-person instructors, Anne or Lorenzo, to check your EGI login.

### Dask dashboard on pangeo-foss4g.vm.fedcloud.eu

Never type `127.0.0.1:8787/` in the dask-labextension dashboard link, otherwise your JupyterLab might freeze! Instead type (replace ``<username>`` and ``<portnumber>`` with your settings):

`https://pangeo-foss4g.vm.fedcloud.eu/jupyterhub/user/<username>/proxy/<portnumber>/status`

## Feedback at the end of the workshop :+1:

*(Feel free to add your suggestions or general feedback of the workshop)*

Not feedback but a question: if we want to work through these training materials on our own after the conference, is that possible/how would we do so?

- Yes, the training material is available under a CC-BY 4.0 license, so you can reuse it.
A similar infrastructure but with more resources will be made available and operational very soon (send us an email if you want to be informed). The current infrastructure will remain for one more week.

### One thing you like about the workshop

- I really like that you explained different concepts and libraries in detail and that there was a clear focus on performance improvement. I am sure that my work will benefit from this workshop.
- I agree with the above. I have worked with many of the packages used in the workshop, but my understanding of them and how best to use them is clearer now. I also thought both presenters spoke well and clearly, and were quite organized, which I appreciated especially since not all workshops have been that way.
- I would say that the introduction to xarray was well done; dask dfs and computing by itself was also explained fine, but it could use a bit more time for testing and showing more usages. I am glad I joined this workshop.
- I liked the pace at which things were explained (I am familiar with Python but have not yet used xarray and dask). I appreciate the huge amount of effort you guys put into all of the workshop material which is, in my opinion, very clear and useful.

### One suggestion to improve future workshops

- It took me some time to realize that Pangeo is not a new software. I would suggest that you explain a bit more about the Pangeo project in the beginning. E.g. as you do on https://pangeo.io/.
- Also agree with the above point. I'd also say I got a bit lost on the chunking notebook. I think it's a hard topic to explain, especially in a short time frame, and one that I should probably research a bit on my own, but wanted to mention it.
- Getting set up with the virtual machine environment seemed mildly involved. Was not a big deal really but clearly there were some difficulties.
Given the workaround method that was used for people who couldn't get their cloud environment set up, I wonder if the "workaround" method should just be the preferred setup method in the future (but maybe there is a reason why you did it the way you did).
- For me the setup was a bit too complicated in the sense that some of these links are blocked by my company (not your fault of course). So I would just prefer to clone the repo on my machine and use VSCode or other software (which I had to do anyway). But I understand that setting up a virtual environment for everyone would take forever.
- Chunking was a bit harder to understand; maybe a bit more testing and showing what it does in the background would help.
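
Regarding the Q&A question on border effects with chunked data, and the feedback asking what chunking does in the background: the Q&A answer points to Dask's `map_overlap`. The sketch below is not Dask code and not how Dask implements it internally. It is a minimal plain-Python illustration of the underlying idea, assuming a 1-D centered moving mean: each chunk borrows a few neighbouring values (a "halo") before the window is applied, and the borrowed values are trimmed afterwards. All helper names (`moving_mean`, `chunked_naive`, `chunked_with_overlap`) are hypothetical.

```python
def moving_mean(values, window):
    """Centered moving mean; near the edges only existing values are averaged."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def chunked_naive(values, chunk_size, window):
    """Apply the window to each chunk in isolation (shows the border effect)."""
    result = []
    for start in range(0, len(values), chunk_size):
        result.extend(moving_mean(values[start:start + chunk_size], window))
    return result


def chunked_with_overlap(values, chunk_size, window):
    """Extend each chunk with a halo from its neighbours, compute, then trim."""
    depth = window // 2  # halo width needed for a centered window
    n = len(values)
    result = []
    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)
        ext_lo = max(0, start - depth)  # halo on the left, if a neighbour exists
        ext_hi = min(n, stop + depth)   # halo on the right, if a neighbour exists
        block = moving_mean(values[ext_lo:ext_hi], window)
        offset = start - ext_lo         # number of halo values to drop on the left
        result.extend(block[offset:offset + (stop - start)])
    return result


data = [float(i * i) for i in range(12)]
reference = moving_mean(data, window=3)                  # computed on the whole array
naive = chunked_naive(data, chunk_size=4, window=3)      # wrong at chunk borders
with_halo = chunked_with_overlap(data, chunk_size=4, window=3)

print(naive == reference)      # False: chunk borders see truncated windows
print(with_halo == reference)  # True: halos restore the missing neighbours
```

With a Dask-backed array, the halo exchange and trimming are handled by `map_overlap` itself; the page linked in the Q&A describes its depth and boundary options.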