# EDLFS Hackathon
**When**: Tuesday December 13th, 3:30 ET to 6PM CT.
**Where**:
* In person at AGU (Chicago): **UPDATE:** meet at the Hyatt Tap Room, across the street from McCormick, or
* Online: <https://ucsb.zoom.us/j/969805691>
We're coordinating using Slack: ask jhkennedy@alaska.edu or luis.lopez@nsidc.org for an invite. For online participation, any hour should be fine.
**Context**: [Cloud computing using NASA Earthdata with Earthdata login](https://discourse.pangeo.io/t/cloud-computing-using-nasa-earthdata-with-earthdata-login/2434)
## The problem
We are trying to simplify access to NASA data behind Earthdata Login (EDL) for the PyData stack. A more detailed overview can be found in the [repo docs](https://github.com/NASA-Openscapes/edlfs). The core issues are briefly restated here:
* Most NASA Earth observing data requires authenticated HTTP access via NASA's [Earthdata Login (EDL)](https://urs.earthdata.nasa.gov/). However, [`fsspec`](https://filesystem-spec.readthedocs.io/en/latest/) and other Python libraries do not support EDL/OAuth2 out of the box.
* NASA supports different access patterns for cloud-based and on-prem datasets hosted at the various [Distributed Active Archive Centers (DAACs)](https://www.earthdata.nasa.gov/eosdis/daacs), where each DAAC may support only certain access patterns and auth mechanisms.
* Handling the above two challenges for large-scale, distributed workflows with tools like Dask adds further complications.
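
As a rough illustration of the first issue, the sketch below prepares EDL credentials for an authenticated HTTP request using only the Python standard library. It assumes the credentials are stored in a netrc-style file under the host `urs.earthdata.nasa.gov`; the helper names `read_edl_credentials` and `basic_auth_header` are hypothetical and not part of `fsspec` or any NASA library.

```python
import base64
import netrc
from typing import Dict, Optional, Tuple

# Host taken from the EDL link above; the netrc convention is an assumption.
EDL_HOST = "urs.earthdata.nasa.gov"


def read_edl_credentials(
    netrc_path: str, host: str = EDL_HOST
) -> Optional[Tuple[str, str]]:
    """Return (username, password) for `host` from a netrc file, or None if absent."""
    entry = netrc.netrc(netrc_path).authenticators(host)
    if entry is None:
        return None
    login, _account, password = entry
    return login, password


def basic_auth_header(username: str, password: str) -> Dict[str, str]:
    """Build an HTTP Basic `Authorization` header for the given credentials."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

A client would presumably attach this header only when talking to the EDL host itself, not to whichever data host the request is redirected back to. Handling that redirect is the part generic HTTP file-system layers do not do for us.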
## Brainstorming solutions
* Developing a package (or packages?) that handles most of the use cases transparently, so users can open any given file using the Pangeo stack as if there were no EDL in the middle. This may imply implementing a new backend for `fsspec` that is aware of EDL and S3 temporary credentials.
* Add yours! (e.g. asking ChatGPT for a solution, or hacking NASA to remove EDL altogether)
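
To make the "S3 temporary credentials" part of the first idea more concrete, here is a minimal sketch of a cache that refreshes short-lived credentials shortly before they expire. It is not an `fsspec` backend. The class name `TemporaryCredentialCache`, the injected `fetch` callable, and the shape of the returned dictionary (in particular an ISO-formatted `expiration` field) are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, Optional


def _utc_now() -> datetime:
    return datetime.now(timezone.utc)


class TemporaryCredentialCache:
    """Cache short-lived credentials and refresh them shortly before expiry."""

    def __init__(
        self,
        fetch: Callable[[], Dict[str, str]],  # hypothetical: calls a credentials endpoint
        margin: timedelta = timedelta(minutes=5),
        now: Callable[[], datetime] = _utc_now,
    ) -> None:
        self._fetch = fetch
        self._margin = margin
        self._now = now
        self._creds: Optional[Dict[str, str]] = None
        self._expires_at: Optional[datetime] = None

    def get(self) -> Dict[str, str]:
        """Return cached credentials, fetching new ones when close to expiry."""
        if self._creds is None or self._now() >= self._expires_at - self._margin:
            creds = self._fetch()
            # Assumed field name and format, e.g. "2022-12-13 20:00:00+00:00".
            expires_at = datetime.fromisoformat(creds["expiration"])
            if expires_at.tzinfo is None:
                # Treat naive timestamps as UTC (an assumption).
                expires_at = expires_at.replace(tzinfo=timezone.utc)
            self._creds, self._expires_at = creds, expires_at
        return self._creds
```

An EDL-aware backend could call `get()` before each S3 operation and hand the values to an S3 client, for example through the `key`, `secret`, and `token` arguments of `s3fs.S3FileSystem`. On a Dask cluster each worker would need its own way to refresh, which is one of the complications noted above.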
## Goals for the hackathon
* Learn about the problem and why it impacts open science workflows
* Contribute to a roadmap and gather input from the Pangeo community
* Document *exactly* the current situation - there seems to be no single unifying document that describes the lay of the land
  * Here: https://hackmd.io/T73AtFTnS4C_Ez9JfGNldA
* Start writing some test cases where EDL becomes an issue:
  * Using xarray to access data from different DAACs
  * Simulate a distributed Dask cluster and verify what happens if we do the same
* Connect with other people interested in simplifying access to NASA data
## Next steps
* Plan the next hackathon/sprint
* Apply for funding opportunities such as a NASA ACCESS/ROSES grant.