**Computational Basics for Plant Biology**
----
**Jason Williams, CSHL**
(e) williams@cshl.edu
(t) @JasonWilliamsNY
**Zoom** (for screensharing): [link](https://cshl-dnalc.zoom.us/j/97683182809?pwd=RWJySmV3Y09DZS9UN1d2ZVdwM2g1QT09)
----
[toc]
## Resources
### Data/knowledge management
- FOSS [Data Management intro](https://cyverse-foss.readthedocs-hosted.com/en/latest/03_managing_data.html#)
- Markdown: [Cheatsheet](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet)
- ReadTheDocs/MkDocs: [Read the Docs](https://readthedocs.com/), [MkDocs](https://mkdocs.readthedocs.io/en/stable/)
- FairSharing: [FAIR sharing](https://fairsharing.org/)
- [FAIR Principles paper](https://www.nature.com/articles/sdata201618)
- Data Stewardship: [Wizard](https://ds-wizard.org/)
- Example [DMPs](https://www.lib.ncsu.edu/do/data-management/elements-of-a-dmp)
### Reproducibility
- GitHub: [github.com](https://github.com/)
- Conda: [Conda](https://docs.conda.io/en/latest/)
- Bioconda: [Bioconda](https://bioconda.github.io/)
- Docker: [Docker](https://www.docker.com/)
- Ten Simple Rules on [Dockerfiles](https://journals.plos.org/ploscompbiol/article/comments?id=10.1371/journal.pcbi.1008316)
- Biocontainers: [Biocontainers](https://biocontainers.pro/)
- Jupyter: [jupyter.org](https://jupyter.org/)
- Snakemake: [snakemake](https://snakemake.readthedocs.io/en/stable/)
### Software and coding skills
- Software Carpentry: [SWC](https://software-carpentry.org/)
- Data Carpentry: [DC](https://datacarpentry.org/)
- Bioconductor: [bioconductor.org](http://bioconductor.org/)
### Computational resources
- CyVerse: [https://cyverse.org/](https://cyverse.org/)
- JetStream2: [https://jetstream-cloud.org/](https://jetstream-cloud.org/)
- Galaxy: [https://usegalaxy.org/](https://usegalaxy.org/)
### Community resources
- Biostars: [https://www.biostars.org/](https://www.biostars.org/)
- StackOverflow: [https://stackoverflow.com/](https://stackoverflow.com/)
- Twitter Bioinformatics: [Bioinformatics Community](https://twitter.com/i/communities/1506791236987879425)
- LifeSciTrainers: [LifeSciTrainers.org](http://lifescitrainers.org)
---
## Survey Results
**What's your current computational challenge(s)**
- Sristi: In the future I will be working with RNA-seq and proteomics datasets, so I want to learn about them.
- Deeksha: I have no idea where to start, or how to use the command line for data representation and organisation and for transcriptome/single-cell transcriptomics data analysis.
- Jason:
- Dang: I don't know what the standards are for data organization. I just make Google Sheets or Excel sheets and throw them in the group drive.
- Clair: I am working with many large scRNA datasets. While I am comfortable with the data analysis, I need to create a way for others in the lab and collaborators to access these datasets and edit them.
- Sydney: I know next to nothing about any of these computer programs, but will have a ton of genetic data to sort through.
- Uzezi: How to create a website using GitHub. I need to learn how to commit my code directly to GitHub from my terminal. I also need to learn the basics of ML.
- Kyle: Adapting published ML algorithms for my own applications in enzyme engineering. Also, better understanding the code and architecture of said algorithms so I can alter them for different applications.
- Elena: qPCR/RNAseq data analysis, graph production and statistical analyses
- Abe: I don't have any at the moment, but I suppose some day I might be working with some RNA-seq data. Dealing with all of that will definitely require some computational effort.
- Joe: I need to add features to a protein design machine learning algorithm, where 'features' might be tweaking the training regimen to encompass different types of protein properties, but I don't have a huge ML background
- Rachel: Analyzing large RNA-seq datasets in R, given I have very little R experience. Also presenting these data on an accessible platform, and I don't really know where to start there.
- Ziv: Visualization of scRNA-seq datasets.
- Jade: Data visualization and learning how to make plots efficiently. I am not familiar with coding. I need to learn how to use software like PyMOL or other protein structure visualization tools.
- Kevin: As I begin to navigate more complex data, I anticipate having to learn better GitHub usage and management skills, code efficiency, and higher-level commands. I also anticipate needing to learn more bash to work with our cluster.
- Michelle: Using GitHub and Markdown to save/organize my code for genomic analyses I have done.
- Kaotar: I would like to learn how to organise and manage my data, as well as learning how to use R to analyse sequencing data.
---
## Notes
### Objectives
- Prepare you with the basic Python needed later in this course
- Understand basic data management skills and resources
- Learn about learning resources and how to focus your future learning
### Plotting and Programming in Python
- Software Carpentry [Plotting and Programming in Python](http://swcarpentry.github.io/python-novice-gapminder/)
- Google Colab [Colab](https://colab.research.google.com/)
- [New Google account instructions](https://support.google.com/accounts/answer/27441?hl=en)
### Set up commands

    # Get the dataset
    !wget http://swcarpentry.github.io/python-novice-gapminder/files/python-novice-gapminder-data.zip
    # Check the download
    !ls
    # Unzip the dataset
    !unzip python-novice-gapminder-data.zip

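The `!` commands above are Colab/Jupyter shell escapes and will not run in a plain Python script. If you are working outside a notebook, the same download-and-unzip steps can be sketched with the standard library. The function name `fetch_and_unzip` and its `dest_dir` parameter are made up for this illustration and are not part of the lesson.

```python
import urllib.request
import zipfile
from pathlib import Path


def fetch_and_unzip(url, dest_dir="."):
    """Download a zip archive from `url` and extract it into `dest_dir`.

    Hypothetical helper for illustration; returns the sorted list of
    file names found inside the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Keep the remote file name, e.g. python-novice-gapminder-data.zip
    zip_path = dest / url.rsplit("/", 1)[-1]

    # Download step (what `!wget` does in the notebook)
    urllib.request.urlretrieve(url, zip_path)

    # Extract step (what `!unzip` does in the notebook)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return sorted(archive.namelist())


# Example call (needs network access), using the URL from the set up commands:
# fetch_and_unzip("http://swcarpentry.github.io/python-novice-gapminder/files/python-novice-gapminder-data.zip")
```

The returned list of names plays the role of the `!ls` check: it lets you confirm what was unpacked.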
#### [Variables and assignment](http://swcarpentry.github.io/python-novice-gapminder/02-variables.html)
#### [Built-ins and types](http://swcarpentry.github.io/python-novice-gapminder/03-types-conversion.html)
#### [Libraries](http://swcarpentry.github.io/python-novice-gapminder/06-libraries.html)
#### [Tabular data](http://swcarpentry.github.io/python-novice-gapminder/07-reading-tabular.html)
### Help code

    import pandas as pd
    # Load the Oceania GDP table, using the country names as row labels
    data_oceania_country = pd.read_csv('data/gapminder_gdp_oceania.csv', index_col='country')
    print(data_oceania_country)

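To see what `index_col='country'` changes without needing the downloaded files, here is a small self-contained sketch. The CSV text is a made-up stand-in with placeholder rows ("Country A", "Country B") and placeholder numbers. It only imitates the column naming of the gapminder files and is not real data.

```python
import io

import pandas as pd

# Placeholder CSV in the same shape as the gapminder files (values are invented)
csv_text = """country,gdpPercap_1952,gdpPercap_1957
Country A,100.0,150.0
Country B,200.0,260.0
"""

# index_col='country' turns the country column into the row labels
data = pd.read_csv(io.StringIO(csv_text), index_col='country')

print(data)

# Rows can now be looked up by country name instead of by position
row_labels = data.index.tolist()
value = data.loc['Country B', 'gdpPercap_1957']
print(row_labels)
print(value)
```

Without `index_col`, pandas would number the rows 0, 1, ... and keep `country` as an ordinary column, so `data.loc['Country B', ...]` would fail.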
#### [Pandas](http://swcarpentry.github.io/python-novice-gapminder/08-data-frames.html)
#### [Plotting](http://swcarpentry.github.io/python-novice-gapminder/09-plotting.html)
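As a rough sketch of the plotting pattern in the linked lesson, the example below converts the year columns to integers, transposes the table, and draws one line per country. It again uses invented placeholder data rather than the real gapminder values, and the axis label text is our own choice.

```python
import io

import pandas as pd

# Placeholder CSV in the same shape as the gapminder files (values are invented)
csv_text = """country,gdpPercap_1952,gdpPercap_1957
Country A,100.0,150.0
Country B,200.0,260.0
"""
data = pd.read_csv(io.StringIO(csv_text), index_col="country")

# Turn 'gdpPercap_1952' into the integer 1952 so the x-axis is numeric
data.columns = data.columns.str.replace("gdpPercap_", "", regex=False).astype(int)

# Transpose so years run down the rows; each country becomes one line
ax = data.T.plot()
ax.set_xlabel("Year")
ax.set_ylabel("GDP per capita")
```

In Colab the figure appears below the cell automatically. In a plain script you would also need `import matplotlib.pyplot as plt` and a call to `plt.show()` or `plt.savefig(...)`.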