# Frontiers in Plant Science

**Computational Basics for Plant Biology**

----

**Jason Williams, CSHL**
(e) williams@cshl.edu, (t) @JasonWilliamsNY

**Zoom** (for screensharing): [https://cshl-dnalc.zoom.us/j/93139573652?pwd=RUM1SFo3RlJoNzVpVmtJSjd0REJkdz09](https://cshl-dnalc.zoom.us/j/93139573652?pwd=RUM1SFo3RlJoNzVpVmtJSjd0REJkdz09)

**Google sheet of instances**: [Instances](https://docs.google.com/spreadsheets/d/1B3Q_HOIVtUaNhpHAz7eju_9K1NTJtRmqkv_GeLrPsfM/edit?usp=sharing)

---

[toc]

## Resources

### Data/knowledge management \*

- FOSS [Data Management intro](https://cyverse-foss.readthedocs-hosted.com/en/latest/03_managing_data.html#)
- Markdown: [https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet)
- Read the Docs/MkDocs: [https://readthedocs.com/](https://readthedocs.com/), [https://mkdocs.readthedocs.io/en/stable/](https://mkdocs.readthedocs.io/en/stable/)
- FAIRsharing: [https://fairsharing.org/](https://fairsharing.org/)
- FAIR Principles paper: [https://www.nature.com/articles/sdata201618](https://www.nature.com/articles/sdata201618)
- Data Stewardship Wizard: [https://ds-wizard.org/](https://ds-wizard.org/)
- Example [DMPs](https://www.lib.ncsu.edu/do/data-management/elements-of-a-dmp)

### Reproducibility

- GitHub: [https://github.com/](https://github.com/)
- Conda: [https://docs.conda.io/en/latest/](https://docs.conda.io/en/latest/)
- Bioconda: [https://bioconda.github.io/](https://bioconda.github.io/)
- Docker: [https://www.docker.com/](https://www.docker.com/)
- Ten Simple Rules on Dockerfiles: [https://journals.plos.org/ploscompbiol/article/comments?id=10.1371/journal.pcbi.1008316](https://journals.plos.org/ploscompbiol/article/comments?id=10.1371/journal.pcbi.1008316)
- Biocontainers: [https://biocontainers.pro/](https://biocontainers.pro/)
- Jupyter: [https://jupyter.org/](https://jupyter.org/)
- Snakemake: [https://snakemake.readthedocs.io/en/stable/](https://snakemake.readthedocs.io/en/stable/)

### Software and coding skills \*

- Software Carpentry [Plotting and Programming in Python](http://swcarpentry.github.io/python-novice-gapminder/)
- Software Carpentry: [https://software-carpentry.org/](https://software-carpentry.org/)
- Data Carpentry: [https://datacarpentry.org/](https://datacarpentry.org/)
- Bioconductor: [http://bioconductor.org/](http://bioconductor.org/)

### Computational resources

- CyVerse: [https://cyverse.org/](https://cyverse.org/)
- Jetstream2: [https://jetstream-cloud.org/](https://jetstream-cloud.org/)
- Galaxy: [https://usegalaxy.org/](https://usegalaxy.org/)

### Community resources

- Biostars: [https://www.biostars.org/](https://www.biostars.org/)
- Stack Overflow: [https://stackoverflow.com/](https://stackoverflow.com/)
- Twitter Bioinformatics: [https://twitter.com/i/communities/1506791236987879425](https://twitter.com/i/communities/1506791236987879425)

---

## Learner assessment

**How would you rate yourself in the following areas?**

*Add a '+' sign to indicate where you think you are*

1. Overall I rate my computational skills:
    - **Novice**: ++
    - **Beginner**: ++++
    - **Intermediate**: +++++++
    - **Advanced**: +
2. How familiar are you with Linux?
    - **I have used it** ++
    - **I have tried it but know very little** ++++
    - **I'm not an expert but can do everything I need** ++++++*
    - **I use it every day** +
3. How familiar are you with Python?
    - **I have used it** +++
    - **I have tried it but know very little** ++++++++
    - **I'm not an expert but can do everything I need** ++
    - **I use it every day** +

**In a few sentences, what is your biggest computational challenge?**

- **Chosen**:
    1. Currently working on a Makeflow pipeline for analyzing RNA-Seq to identify post-transcriptional modifications and long non-coding RNA
    2. Having difficulty teaching myself R and Python.
- **Jason**:
- **Henriette**: writing scripts for R by myself; using AlphaFold more efficiently
- **Rachel**: coming up with a question to answer using computational methods
- **Courtney**: I have a background in human genetics bioinformatics, but I am unsure how to know which tools and pipelines to use in plants. I am pretty good at writing scripts myself, but have issues with command-line coding and troubleshooting programs.
- **Nadia**: work on gene regulatory networks of metal homeostasis
- **Florencia**: finding a robust way of defining pseudogenes.
- **Rucha**: haven't used computational tools but would like to learn about them.
- **Taryn**: currently trying to perform a permutation test to compare the Fst values of sets of genes. Doing this in R but also want to improve my command-line scripting.
- **David**: Finding time to develop models to learn from protein structure. I recently discovered that I like cats and they like me.
- **Will**: I think one of my main challenges is knowing if the scripts I write to answer my research questions are "correct" or "kosher" enough to be accepted by those who are more experienced.
- **Jennifer**: distributing tools to process big data; best Python packaging practices to make tools easy to install and use
- **Bryan**: Allele-specific expression analysis, genomics data visualization. #i love rmarkdown
- **Guanghui**: learning bioinformatics tools; installing software on Linux systems
- **Jacopo**: predict effects of mutations on TFs.

---

## Notes

> wget http://swcarpentry.github.io/python-novice-gapminder/files/python-novice-gapminder-data.zip
>
> unzip python-novice-gapminder-data.zip
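
The two commands above download the lesson data and unpack it. On a machine without `wget` or `unzip`, roughly the same steps can be done from Python using only the standard library. This is a minimal sketch, not part of the lesson itself: the helper names `download` and `extract` are made up for illustration, and the URL is the one from the `wget` command.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL copied from the wget command in the notes.
DATA_URL = (
    "http://swcarpentry.github.io/python-novice-gapminder/"
    "files/python-novice-gapminder-data.zip"
)


def download(url, dest):
    """Save the file at `url` to `dest`; skip the download if `dest` already exists."""
    dest = Path(dest)
    if not dest.exists():
        urllib.request.urlretrieve(url, dest)
    return dest


def extract(zip_path, out_dir):
    """Unpack `zip_path` into `out_dir` and return the names of the archive members."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
        return archive.namelist()


# Example call (needs network access), equivalent to wget followed by unzip:
#   zip_path = download(DATA_URL, "python-novice-gapminder-data.zip")
#   for name in extract(zip_path, "."):
#       print(name)
```

Because `download` returns early when the file is already present, re-running the script on a shared instance does not fetch the archive again.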