# Climate Informatics 2023 - Hackathon
## General info
### Organisers
- CI
- [name=Andrew M]
- EDS journal
- [name=Andrew H]
- EDS book community
- [name=Alejandro]
- [name=Anne]
### Goal
- Advance the principles of open science within the climate informatics community.
### Format
- Hybrid
- Accommodate virtual attendees while encouraging in-person attendance
- Use the conference as a networking and brainstorming launch point and carry out the hackathon in the days/weeks following with a submission deadline after the conference? Or, open submissions before the conference and use the conference for a reflection/awards ceremony?
- Open before conference, use conference as brainstorming/check-in/networking/ideation, close 2 weeks after conference, review/judging for 2 weeks after close, aim for results announced 1 June
### Tracks
- **Reproduction Track**: Contributing a case study to the EDS book’s exploration/preprocessing/modelling gallery by reproducing a notebook implementation of one of the papers published in CUP’s EDS journal with open code and open data listed [here](https://docs.google.com/spreadsheets/d/1TGqhn7gx_yncgiCJN7o8oOatDeukgkw5mHzw57SBxuU/edit#gid=0), or
- **Freestyle Track**: Contributing to the EDS book in another way of the participants' choosing, subject to organiser approval.
### Description
- Individual submissions or team submissions?
- Team submissions
- Prizes for top n teams (Amazon gift cards / CUP book)? What is n? What is the value?
- Teams of one to three members
- At least one team member must be registered for conference
- Peer review by one CI member and one author of original paper
- Expression of interest form where individuals can specify team or looking for team along with project preferences, then we can assign
- [NASA Space Challenge](https://www.spaceappschallenge.org/)
### Resources
- [EDS journal papers](https://docs.google.com/spreadsheets/d/1TGqhn7gx_yncgiCJN7o8oOatDeukgkw5mHzw57SBxuU/edit?usp=sharing)
- [ML Reproducibility Challenge (compare)](https://twitter.com/repro_challenge?lang=en)
- [Registration form](https://docs.google.com/forms/d/1yjJWzaXlbVWtky5Q8tMoUh9jaDFnoPeNQBD-MxoMLFk/edit)
### Logistics
- Collaborations
- Turing Way: workshop at conference during hackathon
### Communications
- CI Twitter, mailing list
- Turing Twitter, mailing list
- EDS Book Twitter, mailing list, GitHub
- Turing Way Twitter, mailing list
#### Instructions
## Meeting notes
### 19-05-23
- Alejandro
- Andrew M
- Andrew H
- Douglas
#### Notes
- Paper
- Length (5000 words)
- Structure
- Other disciplines/references
- [Agile GIS](https://reproducible-agile.github.io/)
- [ML Reproducibility Challenge](https://paperswithcode.com/rc2022)
- [Artificial Intelligence for Earth System Predictability](https://www.osti.gov/biblio/1888810/)
- [8 Levels of Reproducibility: Future-Proofing Your Python Projects](https://www.anaconda.com/blog/8-levels-of-reproducibility)
- [JASA Reproducibility Reviewer](https://jasa-acs.github.io/repro-guide/pages/reviewer-guidelines.html)
- [GIS Science](https://peerj.com/articles/5072/)
- [Notebooks Now!](https://data.agu.org/notebooks-now/)
- [JATS](https://www.niso.org/standards-committees/jats)
- [Notebook Diff](https://www.reviewnb.com/)
- Make a table of all these different initiatives as a literature review to point out gaps and to synthesize the recommendations from each different organisation
- Make a mention to other initiatives in transparent science, including open budgets and open carbon footprint accounting
- [National Academies of Sciences](https://nap.nationalacademies.org/catalog/25303/reproducibility-and-replicability-in-science)
- Novel contribution to literature: tie customs of various fields together
- Taxonomy of challenges to reproducibility
- Dataset size
- Computational requirements
- Sharing pretrained models
- Language barriers
- Taxonomy of barriers to reproducibility on the producer side and on the consumer side
- Audience: authors who want to make their work more reproducible
- Point out blindspots and common pitfalls
- Signpost to useful tools/resources like HuggingFace and Gradio and to communities like EDS book, etc.
- Recommend new incentives and models for engagement
- Carrots (prizes) and sticks (funding body requirements)
- Awards for reproducibility
- Challenges as forcing functions
- Undergraduate/graduate education and use of reproducibility assignments
- Feedback surveys
- Key take-aways
- Open Badges Policy
- Interactive reports
- Community fusion between EDS book and CI
- What's the position in our position paper? Recommendations for best practice for the community?
- Key research questions addressed by the work
- AH: End of September 2023
- 1 reviewer
- Times
- AM: Mid-July
- DR: August OK
- AC: Mid-July/Mid-September
- AGU Abstract Submission: by 2 August
- [H098 - Open and Transparent Modeling Workflows for Robust and Inclusive Science](https://agu.confex.com/agu/fm23/prelim.cgi/Session/184815)
- [IN009 - Advancing AI with Open Environmental Datasets: Benchmarking Needs, Frameworks, Lessons Learned](https://agu.confex.com/agu/fm23/prelim.cgi/Session/187226)
- [IN048 - Test and Validation of AI Models in Earth and Space Science](https://agu.confex.com/agu/fm23/prelim.cgi/Session/190454)
- [ADD AGU22]
- Meetings mid-July, mid-August, mid-September
#### Actions
- [name=Alejandro]:
- list of participants/reviewers > Andrew
- add/create reference table
- [name=Andrew]:
- send reminder of feedback survey
- meeting poll for mid-July, mid-August, mid-September
- create Overleaf/Zotero
- using EDS journal
### 16-05-23
- Alejandro
- Anne
- Andrew M
- Andrew H
- Douglas
#### Notes
- Reference [ML Reproducibility Challenge](https://arxiv.org/pdf/2003.12206.pdf) paper
- References in Zotero
- SI in EDS Journal
- [Template](https://www.cambridge.org/core/journals/environmental-data-science/information/author-instructions/preparing-your-materials#preparing-for-submission)
- Consider submitting to EDS journal special collection from CI2023
- Andrew Hyde will check with Claire to figure out how we can maintain an independent review process
- Writeup on the conference or writeup on the reproducibility?
- Include insights from challenge and from panel at conference
- Include insights from "Talk with the Expert" series
- "Reproducibility in Climate Informatics: A Perspective from 2023"
- Include guidance and suggestions for future work
- What can the community do to make things better? Resources, best practices, next generation publications, tangible, actionable recommendations, badges, etc.
- Discuss ideas for incentive reform? Address the elephant in the room regarding the publish-or-perish paradox
- Recommendations for researchers when they try to reproduce: what exactly does that look like? Tie into recommendations for educators and for the integration/incorporation of these schemes in undergrad/postgrad curricula and training
- Specific discussion
#### Shared Resources
- [name=Douglas]:
- [JASA reproducibility guide](https://jasa-acs.github.io/repro-guide/)
- [name=Andrew M]
- https://jasa-acs.github.io/repro-guide/
#### Actions
- Timeline
- July (Collection and Data Analysis)
- Aug - Sep (Writing-up)
- October (First draft, preprint)
- December (AGU23)
- Surveys
- Challenge participants
### 08-05-23
- Alejandro
- Anne
- Andrew M
#### Notes
- Recap activities W1
- 5 out of 7 teams
- 2 nonactive > e-mail
- Coffee & Drop-in
- 1 participant
- Infrastructure
- 4 teams
- Issue
- Storage Error 500 issue
- Memory increase to 50GB?
- Data & Code
- Team 3, missing file `ncra_cat_tave.nc`
- Contacted, Rachel, email (Alejandro), slack (Andrew M)
- Talk with Experts
- Sebastian, 8 in total (4 externals)
- W2
- Talk with Experts
- [name=Alejandro,Anne]
- Cesar, 11 in total (8 externals)
- [name=All]
- Brian, https://pad.sfconservancy.org/p/ci2023-repro-challenge-talk-8may-brian
- [name=Alejandro,Anne]
- Rafael, https://pad.sfconservancy.org/p/ci2023-repro-challenge-talk-12may-rafael > hopefully good connection
- Checkpoint
- Ok to move
- Check teams are ok with infrastructure and papers’ code and data
- Dates
- Inform teams that peer review is rescheduled from W3 to W4
- Change in the final presentation
- Share EDI form?
- After the review
- Add question
- W3-W4
- Talk with the Expert, speaker confirmed on 15 May, but still TBC for 19 (the Turing Way) and 22 May (CUP)
- W5
- Final teams’ presentation on Tuesday 30 May instead of Monday 29 May (bank holiday in the UK)
- ok
- End
- [name=Anne, Alejandro]
- Judging > check criteria with Douglas
#### Twitters
- Promote previous talks
- https://eds-book.github.io/reproducibility-challenge-2023/details/timeline-schedule.html#recordings-and-slides
#### Actions
- [name=Andrew H, Alejandro]: Contact the papers' authors; review in W4
### 27-04-23
- Alejandro
- Anne
- Andrew M
#### Notes
- [ ] Ad
- [x] Twitter
- [ ] CI2023
- [ ] EDS Book
- [ ] C-SCALE
- [ ] Personal
- [x] CI2023
- [x] EGU23
- [x] [name=Andrew M]: conversation after his presentation
- [x] [name=Alejandro]: ML4ESM session
- [x] [name=Anne]: Pangeo session
- [x] [name=Anne]: NASA TOPS Panel
- [ ] Slacks
- [x] Turing Way
- [x] Turing Environmental & Sustainability
- [x] Turing
- [x] Software Underground
- [x] UK CDE Network
- [x] CI2023
- [x] Cambridge
- [ ] Emails
- [ ] Participants > [Check](https://docs.google.com/document/d/1qVqbbg9aUxUnP3DGJm-1Uxo2krwbpD2njbG7ht3zt44/edit) > @AndrewM
- [ ] Friday 27 before deadline (13 registered)
- [ ] Extras
- [ ] Experts
- [ ] Open Infrastructure, Sebastian
- [x] Email
- [x] Calendar
- [x] Confirmation Calendar
- [ ] Reminder / Ask Short bio
- [ ] R, Cesar
- [x] Email
- [x] Calendar
- [x] Confirmation Calendar
- [ ] Reminder / Ask Short bio
- [ ] Python, Brian, Pythia Project
- [x] Email
- [x] Calendar
- [ ] Confirmation Calendar
- [ ] Reminder / Ask Short bio
- [ ] Julia, Rafael, JuliaGeo
- [x] Email
- [x] Calendar
- [ ] Confirmation Calendar
- [ ] Reminder / Ask Short bio
- [ ] Reproducibility, Daniel
- [ ] Collaborative Science, TTW
- [ ] Open Access Journal, Andrew H
- [ ] Reviewers
- [ ] Externals, Friday 6 May before deadline
- [ ] Authors, contact those who haven't replied, according to pre-selected papers
- [ ] Website
- [ ] Schedule
- [ ] Promote other events
- [ ] TTW Fireside Chat
- [ ] R community
- [ ] Pangeo Showcases
- [ ] Julia
- [ ] https://www.ukrn.org/event/future-of-peer-review-may2023/
- [ ] Add TTW/Scriberia images
- [ ] Add Google Analytics tracker
- [ ] Infrastructure
- [ ] Test JupyterCloud
- [ ] Pangeo Notebook
- [ ] Pangeo ML-TF
- [ ] Pangeo ML-Pytorch
- [ ] Data Science (Python/R/Julia)
- [ ] Create teams after confirmation
#### Actions
- [ ] Email
- [ ] [name=Alejandro] [Participants](https://docs.google.com/document/d/1qVqbbg9aUxUnP3DGJm-1Uxo2krwbpD2njbG7ht3zt44/edit) > send first wave Friday, second wave Saturday
- [ ] [name=Andrew M] [Teams](https://docs.google.com/document/d/1Ur_iJRo3HnrIBmlJXZyqhU7euXtneBd2_6D82hF47Uw/edit) > invite to create private slack channel
- [ ] Slides for welcome
- [ ] Overview
- [ ] Logistics
- [ ] Organisers
- [ ] Code of Conduct
- [ ] Meet your Team
- [ ] Q & A
- [ ] Slides for intro to technology
- [ ] Connect to JupyterHub
- [ ] Git/GitHub review
- [ ] EDS book
- [ ] Reviewer application deadline, 8 May
- [ ] Contact all reviewers and provide instructions
- [ ] Misc
- [ ] Contact
- [ ] [name=Andrew M] Owen Allemang & Onkar Gulati
### 12-04-23
- Alejandro
- Andrew M
- Anne
- Ricardo
#### Notes
- Extend deadline, 1 week
- Mailing lists
- Latin America
- French
- From [CI2023 GDocs](https://docs.google.com/document/d/1L8TxxUli_4H4RZ8M5LvY4cWy1ZazkjBVp3nK9sXI6sQ/edit#):
- ANDREW M
* BAS
* NOC
* Met Office / JCEEI
* NOAA - Douglas: included in the March NOAA AI newsletter
* NASA
* ESA
* EGU (AH: Sophie Giffard-Roisin on the EDS Board is attending EGU; she might be able to disseminate via EGU lists if she’s affiliated)
* CDT Network
* Oxford / UCL / ICL / KCL
* Exeter Environmental Intelligence
* University of Reading (AH: I have been in touch with Rowan Sutton and Bryan Lawrence in the past)
* Lancaster University Data Science for the Natural Environment
* Institute for Environmental Analytics (works with Reading Uni)
* National Oceanography Centre UK AH: I have a contact Jon Blower
* UK Centre for Ecology and Hydrology
* NERC (AH: Simon Gardner)
* Ordnance Survey
* Royal Society
* Turing Institute
* Pangeo EU (AH: We can rely on Alejandro for this)
* Climate Change AI (AH: My contacts here are Mark Roth and Priya Donti)
* Climformatics Mailing List
* Climformatics Twitter
* Cambridge
* Zero
* Cambridge Climate Society (student org)
* Earth Science
* Computer Science (AH: I can send to Jon Crowcroft)
* Engineering (AH: I have some good contacts in this department, including the comms team, Lawrence Bull and Mark Girolami)
* Geography
* Land Economy
* DAMTP
* AI4ER
* C2D3
- Anne
- RSE
- EGU Pangeo
- EGU Julia Splinter
- EGU Open Science
- Tina mailing list
- Alejandro
- Turing Way, Collab cafe
- EGU ML4ESM
- Banner
- Promotion in EDS book journal
- https://www.cambridge.org/core/journals/environmental-data-science
- Social Media
- [Ricardo], banner
- Certificates
- DOI, Zenodo, authorisation, EDS book community
- Research Object
#### Actions
- [name=Andrew H & Ricardo]:
- Prepare certificates for participants/reviewers
- Prepare banners for EDS journal and social media with the potential extended deadline, 8th May - 4th June.
- [name=Alejandro & Anne]:
- Complete the challenge website with details, DL: Monday 17 April.
- Add partner logos in the CI2023 Repro webpage, DL: Friday 14 April.
- Contact applicants registered before 17 April to inform them of the date changes and check whether the changes cause any issues, Friday 14 April.
- Contact previous EDS book reviewers and invite them to promote the challenge or review
- [name=Andrew M]
- Distribute message to key mailing lists announcing the adjusted timeline and pointing to the official website, DL: Tuesday 18 April.
### 13-03-23
#### Sign in
- Alejandro
- Andrew M
- Anne
#### Notes
- Form
- Prizes
- Certificate completion
- Badge in GitHub?
- Research Object completed and unfinished
- Certification of participation plus 30% discount on all CUP books for participants (I can give them a code to enable this). Plus £500 in free books to be shared by winning team. Does that sound good enough?
#### Actions
- [ ] [name=Alejandro/Anne]:
- [ ] Confirm feasible list of EDS journal articles
- [ ] Check GitHub badges/certificate
- [ ] [name=Andrew M]:
- [ ] Confirm Douglas (Rao)
- [ ] Describe challenge in the website
- [ ] [name=Andrew H]:
- [ ] Mail to authors according to list of feasible papers
- [ ] Indicate expectations
- [ ] finalise the level of hackathon prize CUP can offer (gratis books)
- [ ] raise visibility of EDS Book by making reference to it in the emails to CI authors about publishing in journal (and in journal's info pages)
- [ ] ask colleagues about options for making EDS Book available from published EDS articles when authors have used it, whether that's just linking through to the EDS Book version from the Data Availability Statement or something more (e.g. via Code Ocean or CoCalc, two third parties we work with that can make computational notebooks available and executable in-browser).
### 03-03-23
#### Sign in
- Alejandro
- Andrew M
- Andrew H
- Anne
#### Notes
- Recap
- 1 month, May
- Teams (3 to 4 people)
- List of suggested EDS journal papers (pre-approved) or suggested by participant (subject to approval)
- Form
- Add relevant skills
- Move CI2023 info at the end
- Prizes
- Andrew Hyde (10:28): I am sure I can get approval for us to give CUP books (of winner's choice) for prizes, but it'd be good to think of how many prize winners there will be
- Completion prize?
- Invitation to authors
- During challenge
- Office hours, calendly
- Networking opportunities
#### Actions
- [name=Andrew H] Draft review invitation to authors
- DL: end next week
- [name=Andrew M/Alejandro] Complete form
- DL: early next week
- [name=Andrew M/Alejandro] Official website for the reproducibility challenge
- DL: early next week
- [name=All] DL promotion on Monday 13th March
- [name=Anne F] have a look at papers and challenge description
### 27-02-23
#### Sign in
- Alejandro
- Andrew M
#### Notes
- Narrative
- https://2022.spaceappschallenge.org/challenges/2022-challenges/earth-data-analysis-developers-wanted/details
- Registration form
- Draft here: https://docs.google.com/forms/d/1yjJWzaXlbVWtky5Q8tMoUh9jaDFnoPeNQBD-MxoMLFk/edit
- Add section Code of Conduct
- [EDS book](https://github.com/alan-turing-institute/environmental-ds-book/blob/master/CODE_OF_CONDUCT.md)
- CI code of conduct (Andrew to track down)
- 2-4 participants/team
- Open registration
- before
- during
- Dates
- mid-March: send hackathon registration form on mailing lists and Twitter
- 6 April 23: send hackathon registration form to registered participants of CI 2023 conference
- 19 April 23:59 AOE: close hackathon registration (this gives attendees of the conference a chance to sign up)
- 20 April 00:01 AOE: sprint starts using the standard GitHub workflow for notebook creation laid out in EDS book pages
- 1 June 23:59 AOE: notebook review process freezes and entries are evaluated
- 15 June: prizewinners announced
- Office hours or talks or events throughout the hackathon
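
The AOE ("Anywhere on Earth") deadlines in the dates above can be checked against UTC with a short script. This is only a minimal sketch: it assumes AOE means a fixed UTC-12 offset and that the year is 2023 (inferred from the meeting date). The names `MILESTONES`, `to_utc` and `is_open` are hypothetical helpers for illustration, not part of any challenge tooling.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Anywhere on Earth (AoE) is treated as a fixed UTC-12 offset.
AOE = timezone(timedelta(hours=-12), name="AoE")

# Milestones copied from the dates listed in the notes; the year is assumed.
MILESTONES = {
    "registration_closes": datetime(2023, 4, 19, 23, 59, tzinfo=AOE),
    "sprint_starts": datetime(2023, 4, 20, 0, 1, tzinfo=AOE),
    "review_freezes": datetime(2023, 6, 1, 23, 59, tzinfo=AOE),
}


def to_utc(moment):
    """Convert a timezone-aware datetime to UTC."""
    return moment.astimezone(timezone.utc)


def is_open(now, deadline):
    """Return True while the deadline has not yet passed."""
    return now <= deadline


if __name__ == "__main__":
    for name, moment in MILESTONES.items():
        print(f"{name}: {to_utc(moment).isoformat()}")
```

Under these assumptions, the registration close of 19 April 23:59 AOE corresponds to 20 April 11:59 UTC, which may be worth stating alongside the AOE time in announcements.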
#### Actions
- Finish form draft, present to Andrew H, present to committee
- Meet again on Friday 3 March sometime before noon with all
- Eventually: reach out to authors to confirm their interest in reviewing
- New EDS guidelines published; invite Andrew M to review the author guidelines
- https://github.com/ampersandmcd
- Reach out to the Turing Way folks for a talk about Reproducibility in Data Science
- Budget for invited speakers
### 17-02-23
#### Sign in
- Alejandro
- Andrew M
#### Notes
#### Actions
- [name=All]
- Refine hackathon using NASA Space Challenge
- [name=Andrew M]
- Draft of sign-on form
- Check papers feasibility according to Alejandro's criteria
- [name=Andrew M]
- Next meeting, 27th Feb 2023 @ 15:00