# SRA Toolkit

###### tags: `bioinformatic`

> Sequence Read Archive (SRA) data, available through multiple cloud providers and NCBI servers, is the largest publicly available repository of high throughput sequencing data.
> [Using the SRA Toolkit to convert .sra files into other formats](https://www.ncbi.nlm.nih.gov/books/NBK158900/)

Installation

```bash
cd ~/tools
wget --output-document sratoolkit.tar.gz http://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/current/sratoolkit.current-ubuntu64.tar.gz
tar -zxvf sratoolkit.tar.gz
rm sratoolkit.tar.gz
# the archive unpacks to a versioned directory, e.g. sratoolkit.2.10.5-ubuntu64
cd sratoolkit.*-ubuntu64/bin
```

Environment

```
# single quotes keep $PATH unexpanded until .bashrc is sourced; adjust the version to match the extracted directory
echo 'export PATH=$PATH:/home/hunglin/tools/sratoolkit.2.10.5-ubuntu64/bin' >> ~/.bashrc
source ~/.bashrc
which fastq-dump
```

Test

```
vdb-config --interactive   # under CACHE, increase RAM by 1 MB
fastq-dump --stdout SRR8839822 | head -n 10
```

Download SRA

```bash
prefetch [SRA accession] [SRA 2]
```

Multiple SRA files can be downloaded with a single command by listing several accessions.

SRA to FASTQ

```
fasterq-dump SRR8839822 -o FSIS11811834 -t /dev/shm/ -e 6 -p
join   :|-------------------------------------------------- 100%
concat :|-------------------------------------------------- 100%
spots read      : 565,950
reads read      : 1,131,900
reads written   : 1,131,900
```

`-t` scratch (temporary) directory

> It is helpful for the speed-up, if the output-path and the scratch-path are on different file-systems. For instance it is a good idea to point the temporary directory to a SSD if available or a RAM-disk like `/dev/shm` if enough RAM is available.

`-o` output file name

`-e` number of threads to use

`-p` show a progress bar

Look up run info

```
vdb-dump --info SRR8839822
acc    : SRR8839822
type   : Database
platf  : SRA_PLATFORM_ILLUMINA
SEQ    : 565,950
SCHEMA : NCBI:align:db:alignment_sorted#1.3
TIME   : 0x000000005ca3f081 (04/03/2019 07:30)
FMT    : FASTQ
FMTVER : 2.9.1
LDR    : latf-load.2.9.1
LDRVER : 2.9.1
LDRDATE: Jun 15 2018 (6/15/2018 0:0)
```

> [HowTo: fasterq dump](https://github.com/ncbi/sra-tools/wiki/HowTo:-fasterq-dump)
> [How to use NCBI SRA Toolkit effectively?](https://reneshbedre.github.io/blog/fqutil.html)
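
The `prefetch` and `fasterq-dump` steps can be chained for several accessions at once. The following is only a rough sketch, assuming both tools are already on `PATH`. The names `DRY_RUN`, `OUT_DIR`, `TMP_DIR`, `THREADS`, `run` and `fetch_runs` are made up for this example and are not part of the SRA Toolkit. It uses `-O` (output directory) rather than `-o`, so each run keeps its own default file name.

```shell
#!/usr/bin/env bash
# Sketch: download and convert several SRA runs in one go.
# All variable and function names here are hypothetical helpers.

DRY_RUN="${DRY_RUN:-1}"          # 1 = only print the commands, 0 = execute them
OUT_DIR="${OUT_DIR:-fastq}"      # directory for the FASTQ files (fasterq-dump -O)
TMP_DIR="${TMP_DIR:-/dev/shm}"   # scratch directory (fasterq-dump -t)
THREADS="${THREADS:-6}"          # number of threads (fasterq-dump -e)

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

fetch_runs() {
    # Each argument is one run accession, e.g. SRR8839822.
    local acc
    for acc in "$@"; do
        run prefetch "$acc" || return 1
        run fasterq-dump "$acc" -O "$OUT_DIR" -t "$TMP_DIR" -e "$THREADS" -p || return 1
    done
}

fetch_runs SRR8839822
```

With the default `DRY_RUN=1` the script only prints the `prefetch` and `fasterq-dump` commands for `SRR8839822`. Setting `DRY_RUN=0` runs them for real, and more accessions can be passed to `fetch_runs` as extra arguments.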