Global Invertebrate Genomics Alliance (GIGA) Bioinformatics Workshop (Oct 20-21, 2018)
==========
This is a HackMD collaborative note-taking document written in Markdown (formatted for GitHub and easily converted to .html webpages). See the [markdown cheatsheet](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet).
Please fill in your name below and use this document to ask questions, share links, and any information relevant to the [GIGA III](https://gigaiii.weebly.com/) meeting bioinformatics workshop.
Schedule and materials:
https://gigaiii-bioinformatics-workshop.readthedocs.io/en/latest/
Please sign in with your name, institution, email address, and the organism you work with:
# Sign-in
1. Lisa Johnson, UC Davis, ljcohen@ucdavis.edu, marine microbial euks, killifish, corals/sponges (sometimes), anything!
2. Bishoy Kamel, UNM, bishoyh@unm.edu, Corals, Snails, Parasites, Metabolic networks etc.
3. Joe Lopez, Nova Southeastern, joslo@nova.edu - GIGA, sponges, marine microbiomes, molecular evolution
4.
5. Devin Thomas, University of New Hampshire, devin.w.thomas@gmail.com, Fish & more!
6. Adelaide Rhodes, Broad Institute, Boston, MA; COPEPODS (Most numerous marine metazoan on the planet)!!!!! and deep sea tanaids. adelaide.rhodes@gmail.com
7. Joseph Sevigny, University of New Hampshire, jlsevigny1@wildcats.unh.edu, metagenomics and meiofauna
8. Heather Bracken-Grissom, Florida International University, hbracken@fiu.edu, DECAPODSSSSSSS! Deep sea and caves
9. Didier Zoccola, Centre scientifique de Monaco, zoccola@centrescientifique.mc, Corals and biomineralization, Global change
10. Danielle DeLeo, Florida International University, ddeleo@fiu.edu, Deep-sea crustaceans, bioluminescence & corals
11. Yvain Desplat, Nova Southeastern University, yvain.desplat@gmail.com, Sponges, oil spills and transcriptomics.
12. Jessica Goodheart, University of California, Santa Barbara, goodheart@ucsb.edu, Ostracod and Nudibranch evolution
13. Victoria Pecci, Nova Southeastern University, vp374@mynsu.nova.edu, Sponge Genomics
14. Tsai-Ming Lu, OIST Graduate University, tsaiming.lu@gmail.com, Dicyemid genome
15. Ksenia Juravel; ksenia.juravel@lmu.de; Ludwig-Maximilians-University Munich, Germany; Sponge genomes, phylogeny
16. Tiago J. Pereira, tiagojp@ucr.edu, UC Riverside, Nematology Department, ecology/biodiversity of marine nematodes.
17. Ramón E. Rivera-Vicéns; Ludwig-Maximilians-University Munich, Germany; r.rivera@lrz.uni-muenchen.de; Corals, Sponges and microbiomes
18. Mark Blaxter; University of Edinburgh, UK; mark.blaxter@ed.ac.uk, nonvertebrate genomics
19. Iliana Baums; Penn State University, US; baums@psu.edu, coral genomics
# Questions
Q: Does 'fna' stand for "full nucleotide assembly" or "fasta nucleic acid"?
A: 'fna' stands for FASTA nucleic acid, similar to 'faa' for FASTA amino acid.
Q: Suggestions for a good ortholog finder program that is easy to install and run?
Q: Can long-running programs like TransDecoder be run as a batch job (in the background)?
A: Add `nohup` before the command and `&` at the end of the line. For example:
`nohup TransDecoder.LongOrfs -t genome_ &`
# Links/References to share
* [commandline bootcamp](http://rik.smith-unna.com/command_line_bootcamp/?id=yk822u2rpo)
* [Link to Intro to Annotation slides](https://github.com/GlobalInvertebrateGenomicsAlliance/GIGAIII_bioinformatics_workshop/blob/master/Lessons_Day_1/Rhodes_GIGAIII_Oct20_Slides.pdf)
* [Logging in to jetstream with a private key](https://angus.readthedocs.io/en/2018/jetstream/login.html)
* [RNASeq workshop tutorials with Jetstream](https://rnaseq-workshop-2017.readthedocs.io/en/latest/index.html)
* [DIBSI (Data Intensive Biology Summer Institute) at UC Davis](http://ivory.idyll.org/dibsi/)
* [Tutorials from DIBSI 2018](https://angus.readthedocs.io/en/2018/)
# My favorite command!
`sl` ??? should it be `ls`? :) what happens when you type `sl`?
`sl` is a joke for typos. It draws a Steam Locomotive that goes across the screen. It's worth installing - Joe
> Command 'sl' not found, but can be installed with:
>
> apt install sl
> Please ask your administrator.
>
>
Jetstream does not come with every small program preinstalled, but a missing one can be installed by typing, for example:
`sudo apt install sl`
:)
# Commands
`pwd` = show current directory, with absolute path
`~` = home directory
`cd ~` = go to the home directory
`echo ~` = shows your home directory
`ls` = list files in current directory
`mkdir bashintro` = make a directory named "bashintro"
`cd bashintro` = move into directory named "bashintro"
`cd ..` = move into the directory above the current directory
`cat` = print file contents to screen, not recommended for large files
`less` = scroll through a file
`more` = similar to less
`head filename` = show first ten lines of a file
`tail filename` = show last ten lines of a file
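A short practice session tying these commands together might look like the sketch below. The file name `numbers.txt` and its contents are made up for illustration; only `bashintro` comes from the notes above.

```bash
# Make a scratch directory (the name "bashintro" is from the notes above) and move into it.
mkdir -p bashintro
cd bashintro

# Create a 20-line toy file (hypothetical name) so there is something to look at.
seq 1 20 > numbers.txt

# pwd prints the absolute path of the current directory.
here=$(pwd)

# head shows the first ten lines by default, so counting them gives ten.
first=$(head numbers.txt | wc -l)

# tail shows the last ten lines; the first of those is line 11 of the file.
last_line=$(tail numbers.txt | head -n 1)

echo "now in: $here"
ls

# Move back up to the directory above.
cd ..
```

Running `less bashintro/numbers.txt` afterwards is a safe way to try scrolling (press `q` to quit).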
## Piping
`grep "^>" filename | wc -l` = grab all the header lines and pipe them into the command wc to count the number of header lines in a fasta file
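To try the pipe above without a real assembly, here is a minimal sketch that builds a toy FASTA file and counts its headers two ways. The file name `toy.fasta` and its sequences are invented for the example.

```bash
# Build a tiny example FASTA file (hypothetical name and contents).
printf '>seq1\nACGT\n>seq2\nGGCC\nTTAA\n>seq3\nAAAA\n' > toy.fasta

# Pipe version: grep prints every line starting with ">", wc -l counts those lines.
n_pipe=$(grep "^>" toy.fasta | wc -l)

# Shortcut: grep -c counts the matching lines directly, no pipe needed.
n_count=$(grep -c "^>" toy.fasta)

echo "headers via pipe:    $n_pipe"
echo "headers via grep -c: $n_count"
```

Keep the quotes around `"^>"`: without them the shell treats `>` as output redirection.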
## Shortcuts
TAB completion -- Start typing a file name and hit tab and it will fill it in.
Ctrl-C!!!!! -- stops a program that is out of control, hold it down
## Advanced commands
`grep` = select lines based on a pattern
`grep DEVO` = find all lines that have the word "DEVO" in them
`grep -c DEVO` = count how many lines contain "DEVO" (a line with two matches still counts once)
`sed` = substitute words in a line based on pattern
`sed 's/DEVO/OVED/g'` = replace all instances of DEVO with OVED (prints the result to the screen; the original file is not changed)
`sed 's/DEVO/OVED/1'` = replace only the first occurrence of DEVO on each line
`awk` = pull out a column of text, awk has a lot of great options
`awk '{print $2}'` = prints the second whitespace-separated column of each line
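The commands in this section can be tried on a small made-up file. This is only a sketch: `toy.txt` and its contents are hypothetical, chosen so that one line contains "DEVO" twice.

```bash
# Toy input (hypothetical file name and contents).
printf 'DEVO one DEVO\nband two\nDEVO three\n' > toy.txt

# grep -c counts matching LINES, not total occurrences.
lines_with_devo=$(grep -c DEVO toy.txt)

# grep -o prints each match on its own line, so wc -l counts every occurrence.
total_devo=$(grep -o DEVO toy.txt | wc -l)

# sed with the g flag replaces every DEVO on each line (first line shown).
all_replaced=$(sed 's/DEVO/OVED/g' toy.txt | head -n 1)

# sed with the 1 flag replaces only the first DEVO on each line (first line shown).
first_replaced=$(sed 's/DEVO/OVED/1' toy.txt | head -n 1)

# awk splits each line on whitespace by default; $2 is the second column.
second_col=$(awk '{print $2}' toy.txt)

echo "lines with DEVO: $lines_with_devo"
echo "total DEVO:      $total_devo"
echo "sed g flag:      $all_replaced"
echo "sed 1 flag:      $first_replaced"
echo "second column:"
echo "$second_col"
```

None of these commands modify `toy.txt`; they only print to the screen.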
## MY IP ADDRESS IS HERE
Tsai-Ming Lu 149.165.170.123
Tiago J. Pereira 129.114.16.123
Candace Grimes 149.165.170.129
Reed Mitchell 129.114.16.182
Didier Zoccola 129.114.16.156
Ramón Rivera 149.165.170.63
Joe Lopez 129.114.16.153
Bishoy Kamel 149.165.170.
P. Ganot 149.165.170.133
## (spare IP addresses)
Take one and add your name to it!
(unclaimed) - 129.114.16.181
(unclaimed) - 149.165.170.94
(unclaimed) - 129.114.16.188
## Conda install
[Instructions for installing conda on your laptop or institution server](https://angus.readthedocs.io/en/2018/jetstream-bioconda-config.html)
Installing [TransDecoder from the bioconda channel](https://anaconda.org/bioconda/transdecoder):
```
conda install -c bioconda -y transdecoder
```
Once this has installed, type the command name on its own to check that it worked (it will print the program's help information):
```
TransDecoder.LongOrfs
```
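If the help text does not appear, one quick check is whether the program is on your `PATH` at all. The helper below is a hypothetical convenience function (`check_tool` is not part of conda or TransDecoder), shown only as a sketch:

```bash
# Hypothetical helper: report whether a program can be found on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
    fi
}

# Check the tools used in this section.
check_tool conda
check_tool TransDecoder.LongOrfs
```

A "not found" result for `conda` itself may simply mean a new terminal is needed after installation so the updated `PATH` is picked up.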
To run:
```
TransDecoder.LongOrfs -t ~/annotation/genome_canu_filtered.fasta
```
### To run programs in the background
Type
```
screen
```
Then run the program.
`screen` starts a new terminal session that keeps running if the connection drops, or if you detach (Ctrl-A, then D) and want to come back later. To resume the session, type:
```
screen -r
```
Another approach is to use `nohup`:
```bash
nohup program &
```
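A slightly fuller sketch of the `nohup` approach is below. It uses `sleep 2` as a stand-in for a real long-running program, and the log name `run.log` is made up; swap in your own command and file name.

```bash
# Stand-in for a long-running program (hypothetical; replace with your own command).
# Redirecting output to a named log avoids relying on the default nohup.out file.
nohup sleep 2 > run.log 2>&1 &

# $! holds the process ID (PID) of the most recently started background job.
job_pid=$!
echo "started background job with PID $job_pid"

# kill -0 sends no signal; it only tests whether the process still exists.
if kill -0 "$job_pid" 2>/dev/null; then
    status="running"
else
    status="finished"
fi
echo "job is $status"

# wait blocks until the job ends; in real use you would log out instead.
wait "$job_pid"
```

While a real job is running, `tail run.log` shows the most recent output without interrupting it.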
## Jetstream specific paths
- `conda` is in /opt/miniconda
## QIIME workshop
- What does qzv stand for?
- how to look at the qzv file?
- use qiime2view
- So now that OTUs are gone, what are we supposed to call them?
- Sequence variants?
- QIIME has a new stats feature that is only 4 months old? What is the version number to make sure you have this feature?
- What is provenance? and what can you use it for?
- How to get to provenance?
- third tab in the top menu within qiime2view
-