# Using the MGHPCC
By Tanya Lama

Open a terminal and log in with your username and password.
Note that the cursor will not move as you type your password. Press Enter when you are done.
```
$ ssh tl50a@ghpcc06.umassrc.org
tl50a@ghpcc06.umassrc.org's password:
```
If you have successfully logged in, the console will read:
```
This is the University of Massachusetts information technology environment...etc
[tl50a@ghpcc06 ~]$
```
#### Moving large files off your personal space in the cluster
Our personal space is limited to 50 GB, so we need to move files off after jobs are done to make room for the next step. We use FileZilla to move files between the cluster and Box (unlimited storage space).
1. Open your Box desktop app. Note: files in the Box desktop app are NOT stored locally on your computer.
2. Open FileZilla. In the left pane, navigate to the NGSProcess folder in Box.
3. In the Quickconnect bar at the top, enter Host: sftp://ghpcc06.umassrc.org, Username: tl50a, Password: (your cluster password), Port: 22.
4. Drag and drop files from cluster -> Box.
#### Remember to checksum both files (use a local terminal to check the file copied into Box).
```
cksum BOBCAT2_1.fq.gz
```
Output: checksum, byte count, filename. The checksum and byte count should be identical for the two copies.
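The comparison can also be scripted. The sketch below is a minimal example and not part of the original workflow: `same_cksum` is a hypothetical helper name, and the two demo files are stand-ins for the cluster copy and the Box copy.

```shell
# same_cksum FILE_A FILE_B
# Succeeds (exit status 0) only if both files have the same checksum and byte count.
same_cksum() {
    # Reading from stdin makes cksum print "checksum bytes" without a filename,
    # so the two results can be compared as plain strings.
    sum_a=$(cksum < "$1") || return 2
    sum_b=$(cksum < "$2") || return 2
    [ "$sum_a" = "$sum_b" ]
}

# Demo on two throwaway files (hypothetical stand-ins for the real copies).
demo_dir=$(mktemp -d)
printf 'ACGT\n' > "$demo_dir/original.fq"
cp "$demo_dir/original.fq" "$demo_dir/copy.fq"

if same_cksum "$demo_dir/original.fq" "$demo_dir/copy.fq"; then
    echo "checksums match"
else
    echo "checksums differ"
fi
```

For real data you would pass the path on the cluster-side copy and the path of the file inside your Box folder instead of the demo files.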
#### Transferring local data from your computer onto the cluster using scp
Run this on your local computer. It copies the local file localfile to your home directory on the remote system remotehost (for the cluster, ghpcc06.umassrc.org) using the login name tl50a.
```
$ scp localfile tl50a@remotehost:
```
#### Transferring data from the internet using wget
Our data from the sequence provider is password protected, so we need to use the --ftp-user= and --ftp-password= options to log in to the FTP server that holds our data. The -m ("mirror") option makes wget download the whole directory tree recursively with time-stamping, so re-running the command only fetches new or changed files.
```
wget -m --ftp-user=P202SC19050296-01_07_22_19_wkAz --ftp-password=ArxReMyS ftp://128.120.88.251/P202SC19050296-01-01/raw_data
wget -m --ftp-user=X202SC19123068-Z01_02_29_20_LA8A --ftp-password=TlEt4sIQ ftp://usftp1.novogene.com
wget -m --ftp-user=X202SC19123068-Z01_03_14_20_qJVu --ftp-password=mdUvmDpL ftp://128.120.88.251/H202SC19123068/Rawdata
wget -m --ftp-user=X202SC19123068-Z01_03_14_20_qJVu --ftp-password=mdUvmDpL ftp://usftp1.novogene.com

# This is the command that actually works with Novogene:
wget -r --ftp-user=X202SC19123068-Z01_03_14_20_qJVu --ftp-password=mdUvmDpL ftp://usftp1.novogene.com

# Connection details (e.g. for an FTP client):
# Host: ftp://128.120.88.251 or ftp://usftp1.novogene.com
# Username: X202SC19123068-Z01_03_14_20_qJVu
# Password: mdUvmDpL
```
According to the wget man page, credentials can be given with --user=USERNAME and --password=PASSWORD. These can be overridden with --http-user=USERNAME and --http-password=PASSWORD for HTTP connections, and with --ftp-user=USERNAME and --ftp-password=PASSWORD for FTP connections.
#### How to submit a job using bsub
bsub is the LSF command used to submit jobs to the MGHPCC cluster.
```
[tl50a@ghpcc06 ~]$ bsub -J runJob -o runJob.%J.out < runJob.sh
Job <1> is submitted to default queue <queue_name>.
```
`-J <jobname>` sets the name of the job. You will see this name when you use bjobs to check the status of your jobs.
`-o <file>` writes the job's output to a file; %J is replaced by the job ID.
The environment variables of the submitting shell are passed to the job by default, so no extra option is needed for that.
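As a minimal sketch of what a submission script for LSF might look like: the snippet below writes a small `runJob.sh` whose `#BSUB` lines carry the bsub options. The resource values (one core, a one-hour limit) and the output file names are assumptions for illustration, not settings taken from this workflow.

```shell
# Write a minimal LSF job script. Lines starting with "#BSUB" are read by bsub
# as if the options had been given on the command line.
cat > runJob.sh <<'EOF'
#!/bin/bash
# Job name (shown by bjobs)
#BSUB -J runJob
# Number of cores (assumed value)
#BSUB -n 1
# Wall-clock limit as hours:minutes (assumed value)
#BSUB -W 1:00
# Standard output and standard error files; %J expands to the job ID
#BSUB -o runJob.%J.out
#BSUB -e runJob.%J.err

echo "Job started in $PWD"
EOF

# Submitting requires LSF, so these lines are left commented out here:
# bsub < runJob.sh
# bjobs
```

Because the `#BSUB` lines are ordinary comments to the shell, the script can also be run directly with `sh runJob.sh` to test the commands before submitting.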
#### Downloading fastq data from an ftp
Note: data needs to be stored under its individual name within the fastq directory (e.g. /home/tl50a/download/fastq/BOBCAT2/BOBCAT2_1.fq.gz)
```
cd /home/tl50a/download/fastq
wget -bqc --ftp-user=P202SC19050296-01_07_22_19_wkAz --ftp-password=ArxReMyS ftp://128.120.88.251/P202SC19050296-01-01/raw_data/BOBCAT2/BOBCAT2_USPD16100238-N706-AK400_HKFN3DSXX_L4_2.fq.gz
wget --ftp-user=P202SC19050296-01_07_22_19_wkAz --ftp-password=ArxReMyS ftp://128.120.88.251/P202SC19050296-01-01/raw_data/BOBCAT4/

# Use the mv command to move files or rename them
mv oldname newname
mv file newdirectory
```
#### Calculate coverage of a .bam file using samtools flagstat
`samtools flagstat /project/uma_lisa_komoroske/Tanya/analyses/step6_RemoveBadReads/LIC8/LIC8_RemoveBadReads.bam`

Output includes lines such as `n + n in total (QC-passed reads + QC-failed reads)` and `n + n mapped (...)`. The first number on the mapped line is the number of mapped reads used below.
#### To calculate coverage:
coverage = (number of mapped reads × read length) / genome size

(80159120 × 150) / 2400000000 ≈ 5.01X coverage
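The arithmetic above can be sketched as a small shell helper. The function names `coverage` and `mapped_reads` are hypothetical, chosen for this example; `mapped_reads` assumes the usual flagstat line format `N + M mapped (...)`.

```shell
# coverage MAPPED_READS READ_LENGTH GENOME_SIZE
# Prints the mean depth of coverage, rounded to two decimals.
coverage() {
    awk -v reads="$1" -v len="$2" -v genome="$3" \
        'BEGIN { printf "%.2f\n", (reads * len) / genome }'
}

# mapped_reads: reads `samtools flagstat` output on stdin and prints the
# QC-passed mapped-read count (first field of the first "mapped (" line).
mapped_reads() {
    awk '/ mapped \(/ { print $1; exit }'
}

# Worked example with the numbers from the text.
coverage 80159120 150 2400000000   # prints 5.01

# On the cluster the two helpers could be combined like this (not run here):
# coverage "$(samtools flagstat sample.bam | mapped_reads)" 150 2400000000
```

Read length and genome size are passed as arguments rather than hard-coded, so the same helper works for other samples and species.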
#### Download fastqc output to your home directory or to Box
To copy from the **remote computer** to the **local one**, run the following on the *local computer*:
```
scp -r tl50a@ghpcc06.umassrc.org:/home/tl50a/analyses/step1_fastqc/BOBCAT2/BOBCAT2_1_fastqc.html BOBCAT2_1_fastqc.html
scp -r tl50a@ghpcc06.umassrc.org:/home/tl50a/download/6_SV_rmClusterSNP_BiSNP_SV_HardFilter_SV_5GS_5TM_joint_chr_HighQualSites_processed.vcf.gz .
```
# Downloading Data from TCAG
Log in to the cluster (enter your password when prompted) and change to the TCAG download directory:
```
ssh tl50a@ghpcc06.umassrc.org
cd /project/uma_lisa_komoroske/Tanya/download/tcag
```
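1. Download the TCAG command line interface tool for MacOS.
2. Make the client executable: `chmod +x tcag-client-1.4.2`
3. Start the download: `./tcag-client-1.4.2 download -p 7IQOCBZ:/`

The client in /project/uma_lisa_komoroske/bin can also be called by its full path:
`/project/uma_lisa_komoroske/bin/tcag-client-1.4.2 download -p 7IQOCBZ:/`

If the 7IQOCBZ folder ends up in /project/uma_lisa_komoroske/bin, move it into the tcag download directory:
`mv /project/uma_lisa_komoroske/bin/7IQOCBZ /project/uma_lisa_komoroske/Tanya/download/tcag`

When prompted, log in with:
username: tlama@eco.umass.edu
password: 5xPYeUxNWE4UrCm

You should download 141 files.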
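One way to check the file count is to count the regular files under the download folder. This is a sketch under assumptions: `count_files` is a hypothetical helper name, and the demo directory below is a throwaway stand-in for the real 7IQOCBZ folder.

```shell
# count_files DIR: print the number of regular files under DIR (recursively).
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Demo on a throwaway directory with three files in two levels.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/run1"
touch "$demo_dir/a.fastq.gz" "$demo_dir/run1/b.fastq.gz" "$demo_dir/run1/c.fastq.gz"
count_files "$demo_dir"   # prints 3

# On the cluster you would point it at the download folder instead, e.g.
# count_files /project/uma_lisa_komoroske/Tanya/download/tcag/7IQOCBZ
```

If the number printed for the real folder is not 141, the download is incomplete and should be repeated before making the clean copy.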
#### Copy all the files to a safe new folder called cleancopy_tcag
Usage is: cp -r source destination --copy-contents
```
cp -r /project/uma_lisa_komoroske/Tanya/download/tcag/7IQOCBZ/LAM11820/191107_A00481_0061_AHH3MLDRXX/ /project/uma_lisa_komoroske/Tanya/download/tcag/cleancopy_tcag/ --copy-contents
```

###### tags: `tools` `How To` `Basic` `Documentation` `Genomics` `Bioinformatics`