# **BI 278 LAB 1**

Setting up the course Unix environment:

`ssh jereil26@bi278.colby.edu`

Navigate and find your Colby home. At home, the prompt looks like:

`[jereil26@vcacbi278 ~]$`

The `~` means you are in your home directory, `/home/jereil26`.

**Filer**: separate from bi278

```
/courses/bi278/
/personal/jereil26/
```

These paths give access to the course fileserver and your Colby home, respectively.

`pwd`: print working directory. It states where you are (your "address").

`ls`: lists the files and folders available in a directory.

```
ls ~
ls /home/students/d/colbyid
ls .
```

This is extremely valuable for checking which files are inside a directory.

`ls -lh`: the `-l` and `-h` flags give a more detailed listing of the directory: file permissions, file size (in human-readable units), and modification date.

```
/home/students/j/jereil26
```

`cd`: change directory; this is how you travel from one directory to another.

`colbyhome`: a symbolic link to your personal directory on Colby's fileserver.

Pressing Command-K on the Mac desktop and entering smb://filer.colby.edu connects you to Colby's fileserver, "Filer". You can then move files from your desktop to the server and vice versa.

`cd` is also the command used to get home.
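The navigation commands above can be practiced safely in a throwaway directory. This is only a sketch and not part of the lab: it assumes `mktemp` is available, and the file name `notes.txt` is hypothetical.

```shell
#!/bin/bash
# Practice run inside a temporary directory so no real files are touched.
scratch=$(mktemp -d)       # assumption: mktemp exists on this system
cd "$scratch"              # cd: change into the new directory
pwd                        # pwd: print the address of the current directory
echo "ACGT" > notes.txt    # setup only: create a small file to look at
ls .                       # list the contents of the current directory
ls -lh                     # long listing: permissions, size, modification date
```

On bi278 the same commands work on real directories, for example `ls -lh ~` for your home.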
To go to your Colby home, you can also use `cd /personal/jereil26`.

Colby's other fileserver is "Courses"; to see it, use `ls /courses/bi278`.

`pwd`: shows your current position.

`cd ..`: moves up one level, to the parent directory.

`ls /courses/bi278/Course_Materials/lab_01a`: lists the Bio course materials for lab_01a.

`cp /courses/bi278/Course_Materials/lab_01a/* ./lab_01a`: example of how to copy all the files from lab_01a into a directory named `lab_01a` inside the current directory. The destination directory must already exist.

`mkdir name`: makes a directory called "name".

`rmdir name`: removes the directory "name"; the directory must be empty.

`cp a b`: copies file a to b.

`mv a b`: moves (or renames) a file from a to b. Both a and b must be specified.

`rm filename`: removes "filename". This is irreversible.

`cat filename`: prints the entire contents of a file to the screen, as long as it is text.

`less filename`: displays the contents of a file one screen length at a time.

`head filename`: prints the top 10 lines of a file to the screen.

`tail filename`: prints the bottom 10 lines of a file to the screen.

`man command`: shows the manual for most Unix commands; replace "command" with the command's name.

`grep pattern filename`: finds the lines in a file that contain a specific pattern. If the pattern is complicated, put it in quotes.

`wc filename`: counts the lines, words, and bytes in a file. Use `-l` to count only lines or `-c` to count only bytes (the same as characters for plain-text sequence files).

`tr`: translates or deletes sets of characters.

`>`: sends the results of the command that preceded it to a file that you specify, as in `somecommand > filename`.

Within genome files, ">" has a second meaning: it marks the title line of each individual chromosome, contig, or gene. Use `grep ">"` to find these title lines in a genome file. For example, we used `grep ">" /courses/bi278/Course_Materials/lab_01b/filename`.

`grep ">" PATH/test.fa`: here PATH stands for the directory, so the full path is something like "/courses/bi278/Course_Materials/lab_01b/test.fa". This shows all of the title lines in the file, which are the lines with ">" in front.

`grep -v ">" PATH/test.fa`: shows the inverse, every line that is not a title line. For a genome, this shows all of the bases.
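To see what `grep ">"` and `grep -v ">"` do without touching the course files, the sketch below builds a tiny FASTA file first. The file name `toy.fa`, the contig names, and the sequences are all hypothetical, made up only for illustration.

```shell
#!/bin/bash
# Hypothetical toy data: a two-contig FASTA file invented for this example.
workdir=$(mktemp -d)
toy="$workdir/toy.fa"
printf '>contig1\nGGCCAATT\n>contig2\nACGTACGT\n' > "$toy"

# Title lines only. The quotes matter: an unquoted > would be read by the
# shell as output redirection instead of being passed to grep as the pattern.
grep ">" "$toy"

# Inverse match: every line that is NOT a title line, i.e. the bases.
grep -v ">" "$toy"

# Save the title lines to a file with >, then count them with wc -l.
grep ">" "$toy" > "$workdir/headers.txt"
wc -l "$workdir/headers.txt"
```

The last two commands show both uses of ">" side by side: quoted, it is the pattern that grep searches for; unquoted, it redirects output into a file.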
To keep only specific bases, use `grep -v ">" PATH/test.fa | tr -d -c GCgc`. The `|` passes the output of one command to the next, and `tr -d -c GCgc` deletes every character that is not G, C, g, or c, so this line shows only the G and C bases.

To count these bases, use `grep -v ">" PATH/test.fa | tr -d -c GCgc | wc -c`. This line shows the number of G and C bases.

`awk 'BEGIN {print (x/y)}'`: divides x by y; replace x and y with the numbers you want to divide.

To write a script, open the editor by typing `nano`. Within nano, use `#!/bin/bash` as the first line to show that the script is a bash shell script.

To open an existing script in nano, use `nano your_script.sh`.

To execute your script, use `sh your_script.sh`.

To run the script on a specific file, use `sh ~/basecounter.sh filename` or `sh basecounter.sh filename`, depending on where you are in Unix. The script uses my Unix code to find the number of Gs and Cs compared to the number of all bases.

I used the following files as references for the genomes:

`GCF_000756045.1_ASM75604v1_genomic.fna`, `GCF_000961515.1_ASM96151v1_genomic.fna`, `GCF_001865575.1_ASM186557v1_genomic.fna`, `GCF_002902925.1_ASM290292v1_genomic.fna`, `GCF_009455625.1_ASM945562v1_genomic.fna`, `GCF_009455635.1_ASM945563v1_genomic.fna`, `GCF_009455685.1_ASM945568v1_genomic.fna`

`P.bonniea_bbqs395.nanopore.fasta`, `P.bonniea_bbqs433.nanopore.fasta`, `P.hayleyella_bhqs155.nanopore.fasta`, `P.hayleyella_bhqs171.nanopore.fasta`, `P.hayleyella_bhqs21.nanopore.fasta`, `P.hayleyella_bhqs22.nanopore.fasta`, `P.hayleyella_bhqs23.nanopore.fasta`, `P.hayleyella_bhqs530.nanopore.fasta`, `P.hayleyella_bhqs69.pacbio.fasta`

`test.fa`

Using the given command and my script, I ran the command for each genome and recorded the answers in a table.
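The actual `basecounter.sh` is not reproduced in these notes. As a rough sketch of how the commands above could be combined into such a script, the example below defines a hypothetical function `gc_fraction`. It assumes "all bases" means A, C, G, and T in either case, so any other characters (such as N) are ignored.

```shell
#!/bin/bash
# Hypothetical sketch, NOT the original basecounter.sh.
# Prints the fraction of G and C bases among all A, C, G, T bases in a FASTA file.
gc_fraction() {
    fasta="$1"
    # G and C count: drop title lines, delete everything except G/C, count what is left.
    gc=$(grep -v ">" "$fasta" | tr -d -c GCgc | wc -c)
    # Count of all four bases, obtained the same way (assumption: N is ignored).
    total=$(grep -v ">" "$fasta" | tr -d -c ACGTacgt | wc -c)
    # awk does the division because the shell itself only does integer arithmetic.
    awk -v gc="$gc" -v total="$total" \
        'BEGIN { if (total == 0) print "NA"; else print gc / total }'
}

# When run as a script with a file name, report the GC fraction of that file.
if [ -n "$1" ]; then
    gc_fraction "$1"
fi
```

Saved under a hypothetical name such as `gc_sketch.sh`, it could be run as `sh gc_sketch.sh test.fa`, or over every genome in a directory with `for f in *.fna *.fasta; do sh gc_sketch.sh "$f"; done`.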