# Using the MGHPCC

By Tanya Lama

Open a terminal and log in with your username and password. Note that nothing will appear on screen as you type your password. Press Enter.

```
$ ssh tl50a@ghpcc06.umassrc.org
tl50a@ghpcc06.umassrc.org's password: m*otism*otis
```

If you have logged in successfully, the console will read:

```
This is the University of Massachusetts information technology environment...etc
[tl50a@ghpcc06 ~]$
```

#### Moving large files off your personal space in the cluster

Our personal space is limited to 50 GB, so we need to move files off after jobs are done to make room for the next step. We use FileZilla to move files between the cluster and BOX (unlimited storage space).

Open your BOX desktop app. Note: files in the BOX desktop app are NOT stored locally on your computer.

Open FileZilla. In the left pane, navigate to the NGSProcess folder in BOX. At the top, enter the following into Quickconnect:

Host: sftp://ghpcc06.umassrc.org
Username: tl50a
Pass: *****....
Port: 22

Drag and drop files from cluster -> BOX.

#### Remember to checksum both files (use a local terminal to check the file copied into BOX)

```
cksum BOBCAT2_1.fq.gz
```

Output: checksum bytes filename

#### Transferring local data from your computer onto the cluster using scp

This will copy the local file to the remote system remotehost using the login name tl50a.

```
[mfk@ghpcc06 ~]$ scp lolfile tl50a@remotehost:
```

#### Transferring data from the internet using wget

Our data from the sequence provider is password protected, so we need to use the --ftp-user= and --ftp-password= options to log in to the FTP server that holds our data. We use the -m ("mirror") option on wget, which downloads the directory recursively and compares timestamps, so a repeated run skips files that are already up to date.
```
wget -m --ftp-user=P202SC19050296-01_07_22_19_wkAz --ftp-password=ArxReMyS ftp://128.120.88.251/P202SC19050296-01-01/raw_data
wget -m --ftp-user=X202SC19123068-Z01_02_29_20_LA8A --ftp-password=TlEt4sIQ ftp://usftp1.novogene.com
wget -m --ftp-user=X202SC19123068-Z01_03_14_20_qJVu --ftp-password=mdUvmDpL ftp://128.120.88.251/H202SC19123068/Rawdata
wget -m --ftp-user=X202SC19123068-Z01_03_14_20_qJVu --ftp-password=mdUvmDpL ftp://usftp1.novogene.com

# Host: ftp://128.120.88.251
wget -r ftp://usftp1.novogene.com
# Host: ftp://usftp1.novogene.com
# Username: X202SC19123068-Z01_03_14_20_qJVu
# Password: mdUvmDpL

# This is the command that actually works with Novogene:
wget -r --ftp-user=X202SC19123068-Z01_03_14_20_qJVu --ftp-password=mdUvmDpL ftp://usftp1.novogene.com
```

You can provide authentication credentials via --user=USERNAME and --password=PASSWORD. According to man wget, these can be overridden with --http-user=USERNAME and --http-password=PASSWORD for HTTP connections and with --ftp-user=USERNAME and --ftp-password=PASSWORD for FTP connections.

#### How to submit a job using bsub

bsub is the command used to submit jobs to the MGHPCC cluster.

```
username@login1:~$ bsub -V -b n -cwd runJob.sh
Your job 1 ("runJob.sh") has been submitted
```

`-V` passes all environment variables to the job
`-N <jobname>` sets the name of the job. You will see this name when you use qstat to check the status of your jobs.
`-w e` verifies the options and aborts if there is an error

Note: these option descriptions match Grid Engine's qsub. LSF's bsub defines the same letters differently (for example, -J names a job, -V only prints the LSF version, and job status is checked with bjobs rather than qstat), so check man bsub before reusing this command.

#### Downloading fastq data from an FTP server

Note: each individual's data needs to be stored in a directory named for that individual within the fastq directory (e.g.
/home/tl50a/download/fastq/BOBCAT2/BOBCAT2_1.fq.gz).

```
cd /home/tl50a/download/fastq
wget -bqc --ftp-user=P202SC19050296-01_07_22_19_wkAz --ftp-password=ArxReMyS ftp://128.120.88.251/P202SC19050296-01-01/raw_data/BOBCAT2/BOBCAT2_USPD16100238-N706-AK400_HKFN3DSXX_L4_2.fq.gz
wget --ftp-user=P202SC19050296-01_07_22_19_wkAz --ftp-password=ArxReMyS ftp://128.120.88.251/P202SC19050296-01-01/raw_data/BOBCAT4/

### Use the mv command to move files or rename them
mv oldname newname
mv file newdirectory
```

#### Calculate coverage of a .bam file using flagstat

`samtools flagstat /project/uma_lisa_komoroske/Tanya/analyses/step6_RemoveBadReads/LIC8/LIC8_RemoveBadReads.bam`

Output includes: n + n in total (QC-passed reads + QC-failed reads)

#### To calculate coverage:

n mapped reads × read length / genome size

(80159120 × 150) / 2400000000 ≈ 5.01X coverage

#### Download fastqc output to your home directory or to BOX

To copy from the **remote computer** to the **local one**, type the following on the *local computer*:

```
scp -r tl50a@ghpcc06.umassrc.org:/home/tl50a/analyses/step1_fastqc/BOBCAT2/BOBCAT2_1_fastqc.html BOBCAT2_1_fastqc.html
scp -r tl50a@ghpcc06.umassrc.org:/home/tl50a/download/ 6_SV_rmClusterSNP_BiSNP_SV_HardFilter_SV_5GS_5TM_joint_chr_HighQualSites_processed.vcf.gz
```

# Downloading Data from TCAG

`ssh tl50a@ghpcc06.umassrc.org`
Password:

```
cd /project/uma_lisa_komoroske/Tanya/download/tcag
```

1. Download the TCAG command line interface tool for macOS.
2. `chmod +x tcag-client-1.4.2`
3. `./tcag-client-1.4.2 download -p 7IQOCBZ:/`

`mv /project/uma_lisa_komoroske/bin/7IQOCBZ /project/uma_lisa_komoroske/Tanya/download/tcag`

`/project/uma_lisa_komoroske/bin/tcag-client-1.4.2 download -p 7IQOCBZ:/`

username: tlama@eco.umass.edu
password: 5xPYeUxNWE4UrCm

You should download 141 files.

#### Copy all the files to a safe new folder called cleancopy_tcag

Usage: `cp -r source destination --copy-contents`

```
cp -r /project/uma_lisa_komoroske/Tanya/download/tcag/7IQOCBZ/LAM11820/191107_A00481_0061_AHH3MLDRXX/ /project/uma_lisa_komoroske/Tanya/download/tcag/cleancopy_tcag/ --copy-contents
```

###### tags: `tools` `How To` `Basic` `Documentation` `Genomics` `Bioinformatics`
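
#### Checking a copy against its source

The checksum step used for BOX transfers can also be applied after copying the TCAG files into cleancopy_tcag. The following is a minimal sketch, not part of the original workflow: `verify_copy` is a hypothetical helper name, and it assumes file names contain no newlines. It compares the `cksum` output (checksum and byte count) of every file under a source directory with the same relative path under the copy, and prints the number of source files so it can be compared with the expected count.

```shell
# Sketch: confirm that a copied directory matches its source, file by file.
# verify_copy is a hypothetical helper, not a cluster-provided command.
verify_copy() {
    src=$1
    [ -d "$src" ] || return 2
    # Resolve the copy to an absolute path, because the loop below runs from inside src.
    dst=$(cd "$2" && pwd) || return 2

    # Collect one line per file that is missing from the copy or differs from it.
    problems=$(
        cd "$src" && find . -type f | while IFS= read -r rel; do
            if [ ! -f "$dst/$rel" ]; then
                echo "MISSING $rel"
            # Reading from stdin makes cksum print only "checksum bytes", without the file name.
            elif [ "$(cksum < "$rel")" != "$(cksum < "$dst/$rel")" ]; then
                echo "MISMATCH $rel"
            fi
        done
    )

    # Report the number of source files, e.g. to compare with the expected download count.
    count=$(find "$src" -type f | wc -l | tr -d ' ')
    echo "files in source: $count"

    if [ -n "$problems" ]; then
        echo "$problems"
        return 1
    fi
    echo "all checksums match"
}
```

To use it, call `verify_copy SOURCE_DIR COPY_DIR`, where SOURCE_DIR is the downloaded run directory and COPY_DIR is wherever `cp -r` placed the copy inside cleancopy_tcag. A return status of 0 means every source file was found in the copy with an identical checksum and size.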