# SRA Toolkit
###### tags: `bioinformatic`
> Sequence Read Archive (SRA) data, available through multiple cloud providers and NCBI servers, is the largest publicly available repository of high throughput sequencing data.
> [Using the SRA Toolkit to convert .sra files into other formats](https://www.ncbi.nlm.nih.gov/books/NBK158900/)
Installation
```bash
cd ~/tools
wget --output-document sratoolkit.tar.gz http://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/current/sratoolkit.current-ubuntu64.tar.gz
tar -zxvf sratoolkit.tar.gz
rm sratoolkit.tar.gz
# The archive unpacks to a versioned directory, e.g. sratoolkit.2.10.5-ubuntu64
cd sratoolkit.*-ubuntu64/bin
```
env
```
# Single quotes keep $PATH unexpanded until .bashrc is sourced
echo 'export PATH=$PATH:/home/hunglin/tools/sratoolkit.2.10.5-ubuntu64/bin' >> ~/.bashrc
source ~/.bashrc
which fastq-dump
```
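If the `echo ... >> ~/.bashrc` line is run more than once, the same entry is appended each time. A minimal sketch of an idempotent alternative for the current shell session follows; `path_contains` is a hypothetical helper, and the install location in `SRA_BIN` is an assumption that should match wherever the toolkit was unpacked.

```bash
# Hypothetical helper: succeeds if directory $1 is already an entry of $PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Assumed install location; adjust to your unpacked toolkit directory.
SRA_BIN="$HOME/tools/sratoolkit.2.10.5-ubuntu64/bin"

# Add the directory only when it is not on PATH yet.
if ! path_contains "$SRA_BIN"; then
    export PATH="$PATH:$SRA_BIN"
fi
```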
test
```
vdb-config --interactive  # under CACHE, increase RAM by 1 MB
fastq-dump --stdout SRR8839822 | head -n 10
```
Download SRA
```bash
prefetch [SRA accession] [SRA 2]
```
Multiple SRA files can also be downloaded with a single command.
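When the list of runs gets long, it may be easier to keep the accessions in a text file and pass them to `prefetch` together. The sketch below assumes a file named `accessions.txt` with one accession per line; both the file name and the `read_accessions` helper are hypothetical.

```bash
# Hypothetical helper: print one accession per line from a list file,
# skipping blank lines and lines that start with '#'.
read_accessions() {
    grep -v -e '^[ 	]*$' -e '^[ 	]*#' "$1"
}

# Run prefetch only when the toolkit is on PATH and the list file exists.
if command -v prefetch >/dev/null 2>&1 && [ -f accessions.txt ]; then
    # Accessions contain no whitespace, so word splitting is safe here.
    set -- $(read_accessions accessions.txt)
    if [ "$#" -gt 0 ]; then
        prefetch "$@"
    fi
fi
```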
sra to fastq
```
fasterq-dump SRR8839822 -o FSIS11811834 -t /dev/shm/ -e 6 -p
join :|-------------------------------------------------- 100%
concat :|-------------------------------------------------- 100%
spots read : 565,950
reads read : 1,131,900
reads written : 1,131,900
```
-t: location for temporary (scratch) files
> It is helpful for the speed-up, if the output-path and the scratch-path are on different file-systems. For instance it is a good idea to point the temporary directory to a SSD if available or a RAM-disk like `/dev/shm` if enough RAM is available.
-o: output file name
-e: number of threads (CPU cores) to use
-p: show a progress bar
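The scratch-path advice can be sketched as a small wrapper that falls back to a regular temp directory when `/dev/shm` is not usable. `pick_scratch_dir` is a hypothetical helper, and it only checks that the directory exists and is writable, not whether enough RAM is free.

```bash
# Hypothetical helper: prefer the RAM-disk /dev/shm as scratch space when it
# exists and is writable, otherwise fall back to $TMPDIR or /tmp.
pick_scratch_dir() {
    if [ -d /dev/shm ] && [ -w /dev/shm ]; then
        echo /dev/shm
    else
        echo "${TMPDIR:-/tmp}"
    fi
}

SCRATCH=$(pick_scratch_dir)
echo "scratch directory: $SCRATCH"

# Same options as above: -o output name, -t scratch path, -e threads, -p progress.
if command -v fasterq-dump >/dev/null 2>&1; then
    fasterq-dump SRR8839822 -o FSIS11811834 -t "$SCRATCH" -e 6 -p
fi
```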
search info
```
vdb-dump --info SRR8839822
acc : SRR8839822
type : Database
platf : SRA_PLATFORM_ILLUMINA
SEQ : 565,950
SCHEMA : NCBI:align:db:alignment_sorted#1.3
TIME : 0x000000005ca3f081 (04/03/2019 07:30)
FMT : FASTQ
FMTVER : 2.9.1
LDR : latf-load.2.9.1
LDRVER : 2.9.1
LDRDATE: Jun 15 2018 (6/15/2018 0:0)
```
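The `SEQ` value can be pulled out of this output to cross-check a conversion. This is a rough sketch: `spot_count` is a hypothetical helper, and it assumes `SEQ` is the spot count (the values in this note match the `spots read` line from `fasterq-dump`) and that the run has two reads per spot.

```bash
# Hypothetical helper: read `vdb-dump --info` output on stdin and print the
# SEQ value with spaces and thousands separators removed.
spot_count() {
    awk -F':' '$1 ~ /^[ \t]*SEQ[ \t]*$/ { gsub(/[ ,]/, "", $2); print $2 }'
}

# Sample lines copied from the output above.
spots=$(printf '%s\n' 'acc : SRR8839822' 'SEQ : 565,950' | spot_count)
echo "spots: $spots"
echo "expected paired-end reads: $((spots * 2))"
```

With the toolkit installed, the live equivalent would be `vdb-dump --info SRR8839822 | spot_count`.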
> [HowTo: fasterq dump](https://github.com/ncbi/sra-tools/wiki/HowTo:-fasterq-dump)
> [How to use NCBI SRA Toolkit effectively?](https://reneshbedre.github.io/blog/fqutil.html)