---
tags: kraken2
title: 21-Jan-2022 kraken2 standard db
---

# 21-Jan-2022 kraken2 standard db

How the build error was addressed and how the database was built are described below, for anyone wanting to repeat the process. But the database I built with the code below can also just be downloaded from the last section. It's 56 GB when uncompressed.

[toc]

## Conda env

```bash
conda install -c conda-forge mamba
mamba create -n kraken2 -c conda-forge -c bioconda -c defaults kraken2=2.1.2
conda activate kraken2
```

## Addressing error

Running the standard build command was leading to an `rsync_from_ncbi.pl: unexpected FTP path (new server?)` error and killing the build program. There are a few issues about this on the kraken2 GitHub, [one stating](https://github.com/DerrickWood/kraken2/issues/292#issuecomment-689930681) it was fixed in an earlier version, but it still seems to be a problem. It may be due to changes at NCBI breaking the script. But [this post](https://github.com/DerrickWood/kraken2/issues/226#issuecomment-942104929) had a tip that worked for me.

We need to modify the `rsync_from_ncbi.pl` script at line 46, changing "ftp" to "https". In the active kraken2 conda environment, we can run this to edit the appropriate file:

```bash
nano -l ${CONDA_PREFIX}/libexec/rsync_from_ncbi.pl
```

That will open this in our terminal window:

<a href="https://i.imgur.com/JMZxKyC.png"><img src="https://i.imgur.com/JMZxKyC.png"></a>

We want to scroll down to line 46, and change this highlighted "ftp":

<a href="https://i.imgur.com/sUb9bsq.png"><img src="https://i.imgur.com/sUb9bsq.png"></a>

To be "https" like this:

<a href="https://i.imgur.com/6BQJ2O6.png"><img src="https://i.imgur.com/6BQJ2O6.png"></a>

Then to save, press `ctrl+x`. It will ask if we want to "Save modified buffer?" or something similar at the bottom. Press `y` to confirm, then press `return/enter` to save it under the same file name. Then we will have our normal prompt back and be ready to do the build.
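
For anyone who would rather not edit interactively, the same line-46 change can probably be made with `sed`. This is only a sketch: `patch_rsync_script` is a helper name made up here, and it assumes line 46 of your copy of `rsync_from_ncbi.pl` is the same line described above (it prints the line before and after so that can be checked). A `.bak` copy of the original file is kept next to it.

```bash
# Hypothetical helper (not part of kraken2): swap the first "ftp" on line 46 for "https".
# Assumes line 46 is the FTP-path line described above; check the printed output.
patch_rsync_script() {
  local script="$1"

  # Print line 46 before editing so we can confirm it contains "ftp"
  sed -n '46p' "$script"

  # Edit in place, keeping the original as <script>.bak
  sed -i.bak '46s/ftp/https/' "$script"

  # Print line 46 again to confirm the change
  sed -n '46p' "$script"
}
```

In the active kraken2 conda environment, it would be called as `patch_rsync_script "${CONDA_PREFIX}/libexec/rsync_from_ncbi.pl"`. If the first printed line does not contain "ftp", restore the `.bak` copy and make the edit by hand instead.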
## Building standard db

This was run on 21-Jan-2022:

```bash
kraken2-build --standard --threads 32 --db kraken2-standard-21-Jan-2022
```

That took about 9.5 hours as run above. After it's done, we can run this to remove intermediate files and save a ton of space:

```bash
kraken2-build --clean --db kraken2-standard-21-Jan-2022/

# Database disk usage: 211G
# After cleaning, database uses 56G
```

Compressing:

```bash
tar -czvf kraken2-standard-21-Jan-2022.tar.gz kraken2-standard-21-Jan-2022/
```

## Download standard db built on 21-Jan-2022

Run the following to download and unpack the database built above:

```bash
curl -Lo kraken2-standard-21-Jan-2022.tar.gz https://figshare.com/ndownloader/files/33894434

tar -xzvf kraken2-standard-21-Jan-2022.tar.gz
```
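
After unpacking (or after building and cleaning), a quick sanity check is to confirm the three files kraken2 needs at classification time (`hash.k2d`, `opts.k2d`, `taxo.k2d`) are present and non-empty. The sketch below is one way to do that; `check_kraken2_db` is a helper name made up here, not a kraken2 command, and it only checks that the files exist, not that their contents are intact.

```bash
# Hypothetical helper (not part of kraken2): check that a database directory
# holds the three .k2d files kraken2 loads when classifying.
check_kraken2_db() {
  local db_dir="$1"
  local missing=0
  local f

  for f in hash.k2d opts.k2d taxo.k2d; do
    # -s is true only if the file exists and is not empty
    if [ ! -s "${db_dir}/${f}" ]; then
      echo "missing or empty: ${db_dir}/${f}" >&2
      missing=1
    fi
  done

  # Return 0 if all three files were found, 1 otherwise
  return "$missing"
}
```

For the database above, that would be `check_kraken2_db kraken2-standard-21-Jan-2022 && echo "database looks complete"`.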