# RMTA on OSG
These instructions describe how to run RMTA on the Open Science Grid (OSG).
## Log in to the submit host
```
$ ssh <username>@login.osgconnect.net # replace <username> with your own username
```
## Run OSG-RMTA on the sample data
The sample data can be found in the [`sample_data_osg`](https://github.com/Evolinc/OSG-RMTA/tree/master/sample_data_osg) folder of the OSG-RMTA repository. Clone the repository and change into that folder:
```
git clone https://github.com/Evolinc/OSG-RMTA.git
cd OSG-RMTA/sample_data_osg
```
In the `sample_data_osg` folder you will find the input files and the following scripts for submitting a job to OSG.
### Job description file
Here is an example job description file (`osg-rmta.submit`) for running RMTA:
```
# The UNIVERSE defines an execution environment. You will almost always use VANILLA.
Universe = vanilla
# These are good base requirements for your jobs on OSG: a worker node with
# Singularity support, plus the CPU core count, memory, and disk requested below.
Requirements = HAS_SINGULARITY == True
request_cpus = 1
request_memory = 2 GB
request_disk = 4 GB
# Singularity settings
+SingularityImage = "/cvmfs/singularity.opensciencegrid.org/evolinc/osg-rmta:2.1"
# EXECUTABLE is the program your job will run. It's often useful
# to create a shell script to "wrap" your actual work.
Executable = osg-rmta-wrapper.sh
Arguments =
# inputs/outputs
transfer_input_files = osg-rmta.sh, Sorghum_bicolor.Sorbi1.20.dna.toplevel_chr8.fa, Sorghum_bicolor.Sorbi1.20_chr8.gtf, sample_1_R1.fq.gz, sample_1_R2.fq.gz
transfer_output_files = final_out, index
# ERROR and OUTPUT are the error and output channels from your job
# that HTCondor returns from the remote host.
Error = $(Cluster).$(Process).error
Output = $(Cluster).$(Process).output
# The LOG file is where HTCondor places information about your
# job's status, success, and resource consumption.
Log = $(Cluster).$(Process).log
# Send the job to Held state on failure.
on_exit_hold = (ExitBySignal == True) || (ExitCode != 0)
# Release a held job after it has been held for 1 hour, as long as it has started fewer than 5 times.
periodic_release = (NumJobStarts < 5) && ((CurrentTime - EnteredCurrentStatus) > 60*60)
# QUEUE is the "start button" - it launches any jobs that have been
# specified thus far.
Queue 1
```
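The files listed in `transfer_input_files` are expected to be present in the submit directory when the job is submitted. A minimal sketch of a pre-flight check follows. The helper names `list_input_files` and `check_input_files` are hypothetical (they are not part of OSG-RMTA or HTCondor), and the parsing assumes the file list sits on a single line with no macros, as in the example above.

```shell
#!/bin/bash
# Hypothetical helper: print the files named on the first
# transfer_input_files line of a submit file, one per line.
list_input_files() {
    grep -i '^[[:space:]]*transfer_input_files[[:space:]]*=' "$1" \
        | head -n 1 \
        | cut -d '=' -f 2- \
        | tr ',' '\n' \
        | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
        | grep -v '^$'
}

# Hypothetical helper: report each listed input that is missing from the
# current directory; the return status is non-zero if any are missing.
check_input_files() {
    list_input_files "$1" | (
        missing=0
        while IFS= read -r f; do
            if [ ! -e "$f" ]; then
                echo "missing input: $f" >&2
                missing=1
            fi
        done
        exit "$missing"   # leaves only this subshell, not the caller
    )
}
```

Run from `sample_data_osg`, a command such as `check_input_files osg-rmta.submit && condor_submit osg-rmta.submit` would then submit only when every listed input is present.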
### Executable script
Here is an example executable script (`osg-rmta.sh`):
```
#!/bin/bash
Hisat2-Cuffcompare-Cuffmerge.sh -g Sorghum_bicolor.Sorbi1.20.dna.toplevel_chr8.fa -A Sorghum_bicolor.Sorbi1.20_chr8.gtf -l "FR" -1 sample_1_R1.fq.gz -2 sample_1_R2.fq.gz -O final_out -p 6 -5 0 -3 0 -m 20 -M 50000 -q -t -f 2 -k 2
```
### Wrapper script
Here is the wrapper script (`osg-rmta-wrapper.sh`)
```
#!/bin/bash
bash osg-rmta.sh > osg-rmta.out
```
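The wrapper's exit status is what `on_exit_hold` in the submit file evaluates as `ExitCode`, and the wrapper above redirects only standard output to `osg-rmta.out`. A slightly more defensive variant could capture standard error in the same file and pass the exit status through explicitly. This is a sketch only: `run_and_log` is a hypothetical helper, not part of the OSG-RMTA scripts. Because `transfer_output_files` lists only `final_out` and `index`, a log file written next to them is probably not returned to the submit host, so the commented usage copies it into `final_out` after the run as one possible workaround.

```shell
#!/bin/bash
# Hypothetical helper: run a command, send stdout and stderr to a log file,
# record the exit status in the log, and return that same status.
run_and_log() {
    log_file="$1"
    shift
    status=0
    "$@" > "$log_file" 2>&1 || status=$?
    echo "exit status: $status" >> "$log_file"
    return "$status"
}

# Possible use inside the wrapper (commented out so the sketch has no side effects):
#   run_and_log osg-rmta.out bash osg-rmta.sh
#   status=$?
#   [ -d final_out ] && cp osg-rmta.out final_out/
#   exit "$status"
```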
### Job submission
Submit the job using `condor_submit`.
```
$ condor_submit osg-rmta.submit
```
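`condor_submit` normally prints a summary line such as `1 job(s) submitted to cluster 12345.`, and that cluster number is the `$(Cluster)` prefix of the log, output, and error files named in the submit file. The sketch below pulls the number out of that line. `parse_cluster_id` is a hypothetical helper, and the exact wording of the summary line is an assumption that may differ between HTCondor versions.

```shell
#!/bin/bash
# Hypothetical helper: read condor_submit output on stdin and print the
# cluster ID from the "... submitted to cluster N." summary line.
parse_cluster_id() {
    sed -n 's/.*submitted to cluster \([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

# Possible use (commented out because it needs a working HTCondor install):
#   cluster=$(condor_submit osg-rmta.submit | parse_cluster_id)
#   echo "log file: ${cluster}.0.log"
```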
### Job status
Your first job is on the grid! The `condor_q` command reports the status of jobs that are currently in the queue. Generally you will want to limit it to your own jobs by adding your username to the command.
```
condor_q <username>
```
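`condor_q` only lists jobs that are still in the queue, so the log file named in the submit file (`$(Cluster).$(Process).log`) is another place to check what happened to a job. The sketch below reads the most recent event from such a log. `job_state_from_log` is a hypothetical helper, and it assumes the usual HTCondor user-log layout in which each event starts with a three-digit code followed by the job ID in parentheses (for example `005` for termination and `012` for a hold).

```shell
#!/bin/bash
# Hypothetical helper: print a coarse job state based on the last event
# header found in an HTCondor user log file.
job_state_from_log() {
    # Event header lines look like "001 (123.000.000) <date> Job executing ...".
    last_code=$(grep -E '^[0-9]{3} \(' "$1" 2>/dev/null | tail -n 1 | cut -c 1-3)
    case "$last_code" in
        000) echo "submitted" ;;
        001) echo "executing" ;;
        005) echo "terminated" ;;
        012) echo "held" ;;
        013) echo "released" ;;
        "")  echo "unknown" ;;
        *)   echo "other ($last_code)" ;;
    esac
}
```

For a job in cluster 12345, `job_state_from_log 12345.0.log` would report the state implied by the latest event in that log.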
### Job output
Once your job has finished, you can look at the files that HTCondor has returned to the working directory. If everything was successful, it should have returned:
- `final_out`, which contains BAM, GTF, and other output files
- `index`, which contains the indices of the reference genome
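
A quick way to confirm that both directories came back and are not empty is sketched below. `check_outputs` is a hypothetical helper, and "present and non-empty" is only a rough proxy for a successful run, not a check of the files RMTA is supposed to produce.

```shell
#!/bin/bash
# Hypothetical helper: verify that each named directory exists and contains
# at least one entry; the return status is non-zero otherwise.
check_outputs() {
    status=0
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "missing directory: $dir" >&2
            status=1
        elif [ -z "$(ls -A "$dir")" ]; then
            echo "empty directory: $dir" >&2
            status=1
        fi
    done
    return "$status"
}
```

In the submit directory this could be called as `check_outputs final_out index`.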