# Filtering ddRADSEQ data for plastid and nuclear genomes
###### tags: `RADSeq` `Plastid Genome` `Nuclear Genome` `BWA` `SAMTOOLS`
---
Author: Dylan Cohen
___
## Overview
1. Obtain a Genome
2. Create Index with BWA
3. Align reads to reference with BWA
4. Sort SAM/BAM file using Samtools
5. Extract mapped reads with Samtools
6. Extract unmapped reads with Samtools
7. Convert unmapped files to fasta files
8. Use Stacks and ref.map.pl to call SNPs with mapped bam files
9. Use Stacks and denovo.pl to call snps on fasta files
Notes: This could probably all be |piped| and one big loop.
---
2: Creating an index
```
bwa index ref.fa
```
This will create five files - keep them all in your working directory for the next step.
---
3: Align reads to indexed reference with bwa
Here is a for loop for a whole directory of samples that follow this **R1_fq.gz** naming convention.
```
for i in $(ls -1 *R1_fq.gz | sed 's/\_R1_fq.gz//');
do
bwa mem -t 1 taber.ref.fasta $i\_R1_fq.gz $i\_R2_fq.gz $i\.sam
done
```
---
4: Sort the aligned reads with Samtools
```
for i in *.sam
do
samtools sort -T $i -o ${i/.sam/.sorted.sam}
done
```
---
5: Extract mapped reads from the sorted
```
for i in $(ls -1 *.sorted.sam | sed 's/\.sorted.sam//')
do
samtools view -b -F 4 $i.sorted.sam -o $i\.mapped.bam
done
```
---
6:Extract unmapped from the sorted sam file
```
for i in $(ls -1 *.sort.sam | sed 's/\.sort.sam//')
do
samtools view -b -f 4 $i.sort.sam -o $i\.SUN.bam
done
```
---
7: Convert unmapped sam bam file back to read 1 and read 2 fasta files
```
for i in $(ls -1 *.SNOMAP.bam | sed 's/\.SNOMAP.bam//')
do
samtools fasta -0 $i.SNOMAP.bam -o $i\.S.fasta
done
```
---