# IndigiData Hackpad
## Sequencing and basecalling (AI)
**Links and Resources**
- [Nanopore basecalling blog](https://nanoporetech.com/blog/transforming-basecalling-in-genomic-sequencing)
- [Nirenberg and Matthaei experiment](https://en.wikipedia.org/wiki/Nirenberg_and_Matthaei_experiment)
- [Attention Is All You Need](https://arxiv.org/abs/1706.03762)
- [Attention Google Colab tutorial](https://colab.research.google.com/github/jaygala24/pytorch-implementations/blob/master/Attention%20Is%20All%20You%20Need.ipynb)
- [3Blue1Brown Attention explanation](https://youtu.be/eMlx5fFNoYc?si=QqdFyWKRU2blRERd)
- [Tutorial 6: Transformers and Multi-Head Attention](https://uvadlc-notebooks.readthedocs.io/en/latest/tutorial_notebooks/tutorial6/Transformers_and_MHAttention.html)
- [Nanopore for educators eBook](https://nanopore4edu.org/latest/)
- [Transformer explainer](https://poloclub.github.io/transformer-explainer/)
- [The invisible witness: air and dust as DNA evidence of human occupancy in indoor premises](https://www.nature.com/articles/s41598-023-46151-7)
### Interested in teaching with Nanopore
Please leave your name and email for the kit sign-up, along with a short note on how you might use the kit in teaching (e.g., with your undergraduate class or with a community science program).
- WariNkwi Flores (nkwiflores@gmail.com)(nawiflores@arizona.edu) for community work in the Andes<>Amazon, Kara and Kichwa, Indigenous Biodiscovery/Bioprospecting/Data for Governance
- leke hutchins (lhutchin@berkeley.edu)
- Andrea Leon (andrea1015lt@gmail.com) for community work with Quechua community in Ayacucho, Peru
- Amelia Leon (leonamelia15@gmail.com) for community work with Quechua community in Cusco, Peru
- Celeste Kimimila Terry (celestekimimila@gmail.com) for community science education and work in Wanblee, SD, with Denver Native community members and scholars to analyze eDNA
- Jocelyn Chee-Santiago (jcheesan@asu.edu)
## Your Notes
WN Flores: Nanopore achieves high quality by increasing the number of samples.
It is currently suited to DNA and RNA, with some development happening for protein.
Encoder <> Decoder: predictability models?
Data provenance and permanence are key state factors and control interactions in the data governance space/movement
MinION and Beton Lab -DeSci-
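
On the encoder/decoder question above: the core step that both halves of a transformer share is scaled dot-product attention, as described in the linked "Attention Is All You Need" paper. The block below is a minimal sketch in plain Python, not how any basecaller implements it. The vectors are made-up toy values, and the function names `softmax` and `attention` are chosen here for illustration.

```python
import math

def softmax(scores):
    """Turn a list of scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention on plain lists of vectors."""
    d_k = len(keys[0])
    dim = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # The output is a weighted average of the value vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
        )
    return outputs

# Toy example (made-up numbers): one query that points toward the first key.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
queries = [[5.0, 0.0]]
result = attention(queries, keys, values)
print(result)
```

Because the query lines up with the first key, almost all of the weight goes to the first value vector. In a real model the queries, keys, and values are learned projections of the input signal or tokens.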
### Vocabulary and terms
### Things I am unclear about
Maria: How did the tools progress, and what is the novelty of Dorado?
* Depends on the jump. Most of the improvements are smaller, such as being able to process more reads, making more accurate base calls, or running faster. Sometimes there is new functionality, like being able to call modified bases.
WN Flores: How does the conversation about warm and cold data influence the development of sequencing technologies?
* What are warm and cold data?
----
## Pathogen Data Network Wastewater Workshop
### Links
- [Mentimeter slides](https://www.menti.com/alq8v4ed2pfv)
- [Pathogen Data Network homepage](https://pathogendatanetwork.org/)
- [National Wastewater Surveillance System (NWSS)](https://www.cdc.gov/nwss/index.html)
- [Wastewater Scan](https://www.wastewaterscan.org/en)
- [NIAID Bioinformatics Resource Centers (BRCs) for Infectious Diseases](https://www.niaid.nih.gov/research/bioinformatics-resource-centers-infectious-diseases)
- [National Academies Report: Wastewater-based Disease Surveillance for Public Health Action](https://nap.nationalacademies.org/catalog/26767/wastewater-based-disease-surveillance-for-public-health-action)
- [RDM Toolkit](https://rdmkit.elixir-europe.org/data_life_cycle)
- [Infectious disease toolkit](https://www.infectious-diseases-toolkit.org/)
- [Optimizing Wastewater Surveillance: The Necessity of Standardized Reporting and Proficiency for Public Health](https://ajph.aphapublications.org/doi/full/10.2105/AJPH.2024.307760)
- [FAIRSharing wastewater](https://fairsharing.org/search?q=wastewater)
- [MIDAS field guide](https://midasfieldguide.org/guide)
- [Nanopore eBook 16S](https://nanopore4edu.org/latest/annotated_experiments/16s_sequencing/#examining-microbial-diversity-nanopore-shoe-ome-sequencing)
**Videos**
- [How CDC targets pathogen genomes in wastewater to track disease trends](https://youtu.be/Y-JLwynhF8E?si=xBEnjVVJyQXKFRWR)
- [How Nanopore sequencing works](https://youtu.be/RcP85JHLmnI?si=XRezYndEJgka_NEJ)
- [Nanopore eBook: 16S experiment playlist](https://youtu.be/r6fmFhAUMxE?si=Un2VJbMAJzyhI1i0)
- [National Academies Report webinar](https://vimeo.com/791231306)
- [Pathogen Data Network introduction webinar](https://youtu.be/ukeHoSRRKCg?si=IPV81n2U8MWyZENT)
- [Wastewater Surveillance for COVID-19 and Beyond](https://asm.org/articles/policy/2024/november/wastewater-surveillance-an-essential-tool-for-publ)
**Data and software exercises and examples**
- [Pathogens Portal](https://www.pathogensportal.org/)
- [Wastewater scan national dashboard](https://data.wastewaterscan.org/)
- [Nextstrain](https://nextstrain.org/)
- [EPI2ME download](https://nanopore4edu.org/latest/bioinformatics/software/#installing-epi2me)
- [Docker download](https://nanopore4edu.org/latest/bioinformatics/software/#installing-docker)
### Exercises
**DMP Tool**
1. Go to [DMP Tool](https://dmptool.org/plans)
2. Create an account/sign in
3. Create plan (select mock project for testing, practice, or educational purposes)
4. Follow the tab steps to explore creation of a plan.
**Nanopore 16S**
1. Download the [sample data](https://drive.google.com/drive/folders/18tGgLzysWQ4osnxhIZBkJPgdQ4fanDwE?usp=drive_link) and unzip the folder(s) to get the FASTQ files.
   a. Run the EPI2ME 16S workflow on the FASTQ files.
2. Download example [HTML report](https://drive.google.com/file/d/1LRMTyE_AwEts_j2GJZfyDMdE5IxWVj-p/view?usp=drive_link)
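
Before running the workflow, it can help to sanity-check the unzipped FASTQ files. The block below is a minimal sketch, not part of EPI2ME. It assumes four lines per read and Phred+33 quality encoding, and the function names (`parse_fastq`, `open_fastq`, `mean_qscore`, `summarize`) and the toy reads are made up for illustration.

```python
import gzip
import io
import math

def parse_fastq(handle):
    """Yield (sequence, quality) pairs from an open FASTQ text handle.

    Assumes four lines per read (header, sequence, '+', quality).
    """
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # skip stray blank lines
        sequence = handle.readline().strip()
        handle.readline()  # the '+' separator line
        quality = handle.readline().strip()
        yield sequence, quality

def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file as text."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")

def mean_qscore(quality):
    """Mean Phred score of one read, averaged as error probabilities."""
    if not quality:
        return 0.0
    probs = [10 ** (-(ord(char) - 33) / 10) for char in quality]
    return -10 * math.log10(sum(probs) / len(probs))

def summarize(records):
    """Count reads and report mean read length and mean per-read Q score."""
    lengths, qscores = [], []
    for sequence, quality in records:
        lengths.append(len(sequence))
        qscores.append(mean_qscore(quality))
    if not lengths:
        return {"reads": 0, "mean_length": 0.0, "mean_qscore": 0.0}
    return {
        "reads": len(lengths),
        "mean_length": sum(lengths) / len(lengths),
        "mean_qscore": sum(qscores) / len(qscores),
    }

# Toy data (made up): two short reads, one at Q40 ('I') and one at Q20 ('5').
toy = "@read1\nACGTACGT\n+\nIIIIIIII\n@read2\nACGT\n+\n5555\n"
summary = summarize(parse_fastq(io.StringIO(toy)))
print(summary)
```

To try it on the downloaded data, open one of the unzipped files with `open_fastq(...)` inside a `with` statement and pass the handle to `parse_fastq`. The file name will depend on how the sample folder unzips.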
**Nextclade**
1. Go to [Nextclade tutorial](https://clades.nextstrain.org/).
2. Under 'Add more sequence data', choose 'Example' and then **nextstrain/sars-cov-2/BA2.86**.
3. Your 'Selected reference dataset' will be **SARS-CoV-2/Wuhan-Hu-1/2019(MN908947)**.
4. Click 'Run'
5. Browse your results, including the computed phylogenetic tree.
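
If you export the results table, a quick tally of clade assignments is one way to browse it outside the web page. The block below is a rough sketch that assumes a tab-separated export with `seqName` and `clade` columns; check the header row of your own file, since column names can differ between versions. The sample rows and clade labels are placeholders, not real results.

```python
import csv
import io
from collections import Counter

def count_clades(handle, column="clade"):
    """Count how many sequences were assigned to each clade.

    `handle` is an open text handle for a tab-separated results table.
    `column` is the assumed name of the clade column.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(row[column] for row in reader)

# Placeholder table (made-up sequence names and clade labels).
toy_tsv = (
    "seqName\tclade\n"
    "sampleA\tcladeX\n"
    "sampleB\tcladeX\n"
    "sampleC\tcladeY\n"
)
clade_counts = count_clades(io.StringIO(toy_tsv))
for clade, count in clade_counts.most_common():
    print(clade, count)
```

For a real export, replace the `io.StringIO(...)` handle with `open(...)` on the downloaded file, passing `newline=""` as the `csv` module documentation recommends.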
### Things I am unclear about
---
-
### Vocabulary and terms
---
-