###### tags: `Server installation and Bioinformatics`
# Installation and use of useful sequencing programs
## Oracle VM
1. Download VirtualBox and the VirtualBox Extension Pack (USB 3.0) here:
- https://www.oracle.com/technetwork/server-storage/virtualbox/downloads/index.html#vbox
- https://www.oracle.com/technetwork/server-storage/virtualbox/downloads/index.html#extpack
2. Install the VirtualBox package as admin
3. Install the Extension Pack as admin
4. Start VirtualBox
5. Download Ubuntu 16.04 (64-bit) as an *.iso image
6. Click NEW and give the VM a name (e.g. UBUNTU16)
7. Type is Linux
8. Operating system is Ubuntu (64-bit)
9. Follow the instructions (CPU: 4 cores, drive: 500 GB, RAM: 8 GB)
10. Start the VM; VirtualBox will ask for the ISO file
11. Navigate to the ISO location
12. Press NEXT
13. Follow the instructions
## UBUNTU
1. Set the proxy in the network settings under Ubuntu
OR
1. open terminal
2. type in
```
$ sudo bash
$ cd /etc/apt/
$ gedit apt.conf
## copy these two lines in apt.conf :
Acquire::http::proxy "http://proxy.clondiag.jena:8080/";
Acquire::https::proxy "https://proxy.clondiag.jena:8080/";
##
```
3. save apt.conf
4. restart machine!
## Useful commands for installation
### apt-* command
#### apt-get upgrade
- install newer versions of all packages that are already installed
#### apt-get update
- refresh the list of available packages and versions (run before upgrade/install)
#### apt-get remove
- uninstall a program
#### apt-get purge
- uninstall a program together with its configuration files
#### apt-cache search
- search the package repositories for programs (see the example below)
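A typical sequence tying these commands together (the package name `htop` is only an example):
```
# refresh the package lists, then upgrade the installed packages
sudo apt-get update
sudo apt-get upgrade
# look for a package in the repositories and install it
apt-cache search htop
sudo apt-get install htop
```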
### dpkg
Install a package (*.deb file)
```
dpkg -i <package.deb>
```
List installed Debian packages
```
dpkg -l | grep -i "name"
```
- searches the packages installed on the system for "name"
## Graphmap
1. Go to https://github.com/isovic/graphmap
2. Open a console
3. Go to the directory where you want to install the program
4. Type into the console
```
cd
mkdir SeqTools
cd SeqTools
git clone https://github.com/isovic/graphmap.git
cd graphmap
make modules
make
```
to install the graphmap binary to /usr/bin
```
sudo make install
```
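A minimal alignment call as a sketch, assuming ONT reads in `reads.fastq` and a reference in `reference.fasta` (placeholder file names):
```
# map long reads against a reference; output is SAM
graphmap align -r reference.fasta -d reads.fastq -o alignment.sam
```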
## MiniMap2
```
cd
mkdir SeqTools   # NOTE: only if it does not exist yet
cd SeqTools
git clone https://github.com/lh3/minimap2
cd minimap2
make
cd /usr/bin
sudo ln -s /home/sascha/SeqTools/minimap2/minimap2 minimap2
```
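A typical nanopore mapping call (placeholder file names), writing SAM output:
```
# -a: SAM output, -x map-ont: preset for Oxford Nanopore reads
minimap2 -ax map-ont reference.fasta ont_reads.fastq > alignment.sam
```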
## Libxml2
```sudo apt-get install libxml2-dev```
NOTE: https://replikation.github.io/bioinformatics_side/R/R/
## R-base (language) and R-Studio (environment)
for Linux Mint (Xenial)
### R language
```
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9
sudo add-apt-repository 'deb [arch=amd64,i386] https://cran.rstudio.com/bin/linux/ubuntu bionic-cran40/'
sudo apt-get update
sudo apt-get install r-base
```
Get the R-Studio from https://www.rstudio.com/products/rstudio/download/#download
NOTE: If you want to code in the terminal, type `R`. You are then in an R environment ("console"); `q()` exits R. Use RStudio for scripting rather than the terminal.
### packages in R
1. Go to the console
2. Type in
```
$ R
```
3. This will start R on the command line
Packages are organized in repositories: CRAN, Bioconductor, R-forge, GitHub or Google Code
Installing a package in R (the source location is only needed the first time):
1. Establish Bioconductor as a source (first time only)
```
source("https://bioconductor.org/biocLite.R")
```
2. Install the package methylKit via biocLite
```
biocLite("methylKit")
```
Install from CRAN:
```
install.packages("fortunes")
```
#### Using CRAN
```
install.packages("packageName")
```
#### Using Bioconductor
```
source("https://bioconductor.org/biocLite.R")
biocLite("packageName")
```
e.g. NOISeq, Rsamtools, Repitools, rtracklayer (packages not available on CRAN)
#### Upgrading R on Ubuntu 18.04 and resolving ImportError in `add-apt-repository`
Upgrading R on Ubuntu 18.04 can be hindered by an `ImportError` in the `add-apt-repository` command, related to the Python GI module. Follow these steps to resolve the error and upgrade R:
1. **Reinstall Python GI Module:**
- Command: `sudo apt-get install --reinstall python3-gi`
- Purpose: Fixes issues with the Python GObject Introspection module, which is crucial for `add-apt-repository`.
2. **Install Dependencies:**
- Command: `sudo apt-get install libgirepository1.0-dev gcc python3-dev`
- Purpose: Ensures all dependencies for the GI module are installed.
3. **Verify Python Version:**
- Command: `python3 --version`
- Purpose: Confirms the correct Python version is being used.
4. **Update and Upgrade System:**
- Commands: `sudo apt-get update` and `sudo apt-get upgrade`
- Purpose: Keeps system packages updated, potentially resolving package conflicts.
5. **Manually Add R Repository:**
- Method: Add `deb https://cloud.r-project.org/bin/linux/ubuntu bionic-cran40/` to `/etc/apt/sources.list`
- Purpose: Bypasses `add-apt-repository` if it's not functioning. Ensures access to the latest R versions.
6. **Upgrade R:**
- Commands: `sudo apt update` followed by `sudo apt install --upgrade r-base`
- Purpose: Installs the latest version of R.
## MinIONQC.R
https://github.com/roblanf/minion_qc
```
wget https://raw.githubusercontent.com/roblanf/minion_qc/master/MinIONQC.R -O MinIONQC.R
```
Dependencies
To run the script, you will need a recent version of R, and the following packages. To install the right packages, just start up R and copy/paste the code below.
1. Go to the console
2. Type in
```
$ R
```
3. This will start R on the command line
```
>install.packages(c("data.table",
"futile.logger",
"ggplot2",
"optparse",
"plyr",
"readr",
"reshape2",
"scales",
"viridis",
"yaml"))
```
Example:
```
Rscript MinIONQC.R -i example_input_minion -o my_example_output_minion -p 2
```
## samtools, bcftools, htslib
1. Go to http://www.htslib.org/download/
2. Download the latest versions of samtools-X.X, bcftools-X.X and htslib-X.X
3. Create a directory samtools
4. Copy all *.tar.bz2 files into samtools
5. Unpack all files
6. Make three folders for samtools, bcftools and htslib
7. Go to the console
```
$ cd /go/to/unzipped/directory
## e.g. /home/sascha/data/program/samtools/samtools-1.9
## install into the folders created in step 6
$ ./configure --prefix=/home/sascha/data/program/samtools/samtools
$ make
$ make install
```
The executable programs will be installed to a bin subdirectory under your specified prefix, so you may wish to add this directory to your $PATH:
```
export PATH=/home/sascha/data/program/samtools/samtools/bin:$PATH
# for sh or bash users
```
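A typical post-alignment workflow as a sketch (placeholder file names): convert SAM to a sorted, indexed BAM and get quick statistics.
```
# convert SAM to BAM, sort by coordinate and index
samtools view -bS alignment.sam | samtools sort -o alignment.sorted.bam -
samtools index alignment.sorted.bam
samtools flagstat alignment.sorted.bam   # quick mapping statistics
```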
## canu
1. Download versions 1.7.1 and 1.8 of Canu here: https://github.com/marbl/canu/releases
2. Unpack the archives into the folder SeqTools
3. Rename the folders to a) canu-1.7 and b) canu-1.8
4. All executable files are under a) /SeqTools/canu-1.7/Linux-amd64/bin or b) /SeqTools/canu-1.8/Linux-amd64/bin
5. No installation is necessary
6. Create a soft link in /usr/bin for both versions
### canu 1.7
Go to /usr/bin:
```
cd
sudo bash
cd /usr/bin
ln -s /home/sascha/SeqTools/canu-1.7/Linux-amd64/bin/canu canu-1.7
```
### canu 1.8
```
cd
sudo bash
cd /usr/bin
ln -s /home/sascha/SeqTools/canu-1.8/Linux-amd64/bin/canu canu-1.8
```
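A minimal assembly call for nanopore reads, using the canu-1.8 link created above (placeholder file names; adjust `genomeSize` to your organism):
```
# -p sets the output prefix, -d the output directory
canu-1.8 -p sample -d canu_out genomeSize=5m -nanopore-raw reads.fastq
```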
## FLYE
### Installation
To install the Flye package into your system, run:
```
git clone https://github.com/fenderglass/Flye
cd Flye
python setup.py install
```
Depending on your OS, you might need to add --user or --prefix options to the install command for the local installation.
After installation, Flye can be invoked via:
```
flye
```
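An example assembly of raw nanopore reads (placeholder names; adjust the thread count):
```
# results (assembly.fasta, assembly_graph.gfa, ...) end up in flye_out/
flye --nano-raw reads.fastq --out-dir flye_out --threads 8
```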
## Unicycler
https://github.com/rrwick/Unicycler
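Unicycler is a hybrid assembler for bacterial genomes; see the repository for the installation options (conda, pip or from source). A typical hybrid run combining Illumina and nanopore reads might look like this (placeholder file names; assumes Unicycler and its dependencies, e.g. SPAdes, are on the PATH):
```
# hybrid assembly: -1/-2 paired-end short reads, -l long reads, -o output directory
unicycler -1 short_R1.fastq.gz -2 short_R2.fastq.gz -l long_reads.fastq.gz -o unicycler_out
```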
## QUAST
1. Download QUAST
- https://sourceforge.net/projects/quast/
2. Extract quast-5.0.0.tar.gz to a folder of your choice
3. The folder quast-5.0.0 will be generated
4. Navigate in the shell to quast-5.0.0
Requires:
1. Python2 (2.5 or higher) or Python3 (3.3 or higher)
2. GCC 4.7 or higher
3. Perl 5.6.0 or higher
4. GNU make and ar
5. zlib development files
Basic installation (about 120 MB):
```
sudo ./setup.py install
```
or full installation (about 540 MB; additionally includes (1) tools for SV detection based on read pairs, used for more precise misassembly detection, and (2) tools/data for reference genome detection in metagenomic datasets):
```
sudo ./setup.py install_full
```
Example:
```
quast.py -t 4 -o ~/run0002/quast_canu_assembly -R ~/run0002/Reference_genomes/PSS_728a/PSS728a.fna ~/run0002/canu/BC02_PSS_IPC/BC02_PSS_IPC.contigs.fasta
```
## LAST
## NanoPack
1. Installation is done with pip3 using `--user -U`
2. Update pip for Python 3
```
pip3 install --upgrade pip
```
3. Install the tools NanoPack depends on
```
pip3 install --user -U setuptools
pip3 install --user -U numpy
pip3 install --user -U mappy
```
4. Install NanoPack
```
pip3 install --user -U nanopack
```
5. Verify that the installed packages have compatible dependencies
```
pip3 check
```
6. If dependencies are missing:
```
pip3 search "the package you are looking for"
pip3 install "the package you want to install"
```
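NanoPack bundles several tools (NanoPlot, NanoComp, NanoFilt, ...). A basic QC report from a fastq file might look like this (placeholder names):
```
# QC report and plots for a nanopore fastq file
NanoPlot --fastq reads.fastq -o nanoplot_out -t 4
```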
## Glances
Glances is a system monitor for the command line. Compared to the classics top and htop, it provides, in addition to process information, real-time statistics on the file system, network, hardware components, etc.
```
$ sudo apt-get install glances
```
## snap - installation client (like apt-get)
### Behind a proxy
```
$ sudo nano /etc/environment
#copy these lines:
http_proxy="http://proxy.clondiag.jena:8080/"
https_proxy="https://proxy.clondiag.jena:8080/"
#save file
$ sudo apt install snapd
$ sudo systemctl edit snapd.service
#copy these lines:
[Service]
EnvironmentFile=/etc/environment
#save file
$ sudo systemctl daemon-reload
$ sudo systemctl restart snapd.service
```
- Snap is ready to use
## Notepad-plus-plus
```
$ snap install notepad-plus-plus
```
## MinKNOW for Linux
LINK to ONT:
https://community.nanoporetech.com/protocols/experiment-companion-minknow/v/mke_1013_v1_revam_11apr2016/installing-minknow-on-linu
```
sudo bash
[sudo] Passwort für sascha:
```
```
sudo apt-get update
sudo apt-get install wget
wget -O- https://mirror.oxfordnanoportal.com/apt/ont-repo.pub | sudo apt-key add -
echo "deb http://mirror.oxfordnanoportal.com/apt xenial-stable non-free" | sudo tee /etc/apt/sources.list.d/nanoporetech.sources.list
```
## Basecaller
### Bonito basecaller
#### Install bonito via miniconda
```
conda create -n bonito python=3.8
conda activate bonito
pip install ont-bonito
conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c nvidia
pip install numpy==1.17.5
```
#### Models
```bonito download --models```
#### pyTorch Homepage
https://pytorch.org/get-started/locally/
#### Using
```bonito basecaller dna_r9.4.1@v3.3 *.fast5 > nameoffile.fasta```
### Guppy for Linux
LINK to ONT:
https://community.nanoporetech.com/protocols/Guppy-protocol-preRev/v/gpb_2003_v1_revg_14dec2018/linux-guppy
Add Oxford Nanopore Technologies' .deb repository to your system (this is to install Oxford Nanopore Technologies-specific dependency packages):
```
sudo bash
[sudo] Passwort für sascha:
```
Copy and paste all commands into the terminal
```
sudo apt-get update
sudo apt-get install wget lsb-release
export PLATFORM=$(lsb_release -cs)
wget -O- https://mirror.oxfordnanoportal.com/apt/ont-repo.pub | sudo apt-key add -
echo "deb http://mirror.oxfordnanoportal.com/apt ${PLATFORM}-stable non-free" | sudo tee /etc/apt/sources.list.d/nanoporetech.sources.list
sudo apt-get update
```
To install the .deb for Guppy, use the following command:
```
apt-get install ont-guppy-cpu
```
### Albacore for Linux
Add Oxford Nanopore's deb repository to your system (this is used to install Oxford Nanopore-specific dependency packages):
```
sudo apt-get update
sudo apt-get install wget
wget -O- https://mirror.oxfordnanoportal.com/apt/ont-repo.pub | sudo apt-key add -
echo "deb http://mirror.oxfordnanoportal.com/apt trusty-stable non-free" | sudo tee /etc/apt/sources.list.d/nanoporetech.sources.list
sudo apt-get update
```
Install the deb using dpkg:
```
sudo dpkg -i path/to/python3-ont-albacore-xxx.deb
```
This will report several errors because there are missing dependencies. Fix these errors using apt:
```
sudo apt-get -f install
```
## MAFFT
### installation
### command
```mafft --auto --adjustdirection --thread -1 path/to/*.fasta > path/to/out-fasta```
#### Align oligos to a recent alignment
Note: the alignment file must already be aligned!
```mafft --adjustdirection --addfragments oligos-file.fasta alignment-file.fasta > alignment_mafft.fasta```
## Oracle Java
First, update the package index.
```
sudo apt-get update
```
Next, install Java. Specifically, this command will install the Java Runtime Environment (JRE).
```
sudo apt-get install default-jre
```
The JDK does contain the JRE, so there are no disadvantages if you install the JDK instead of the JRE, except for the larger file size.
You can install the JDK with the following command:
```
sudo apt-get install default-jdk
```
Installing the Oracle JDK
If you want to install the Oracle JDK, which is the official version distributed by Oracle, you will need to follow a few more steps.
First, add Oracle's PPA, then update your package repository.
```
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
```
Then, depending on the version you want to install, execute one of the following commands:
Oracle JDK 8
This is the latest stable version of Java at time of writing, and the recommended version to install. You can do so using the following command:
```
sudo apt-get install oracle-java8-installer
```
Managing Java
There can be multiple Java installations on one server. You can configure which version is the default for use in the command line by using update-alternatives, which manages which symbolic links are used for different commands.
```
sudo update-alternatives --config java
```
The output will look something like the following. In this case, this is what the output will look like with all Java versions mentioned above installed.
Output
There are 5 choices for the alternative java (providing /usr/bin/java).
|Selection|Path|Priority|Status|
|--------|--------|--------|--------|
|*0|/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java|1081|auto mode|
|1|/usr/lib/jvm/java-8-oracle/jre/bin/java|3|manual mode|
|2|/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java|1081|manual mode|
Press <enter> to keep the current choice[*], or type selection number:
You can now choose the number to use as a default. This can also be done for other Java commands.
## FastQC
Installation using apt-get:
```
apt-get install fastqc
```
Note: Sometimes the HTML file is not created in the analysis folder. If so, use the following steps.
1. Download the zip file https://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.8.zip
2. Unzip the file directly in "Downloads"
3. Make the folder /etc/fastqc
```
sudo bash
cd /
cd /etc
mkdir fastqc
```
4. move all files from /Downloads/FastQC to /etc/fastqc
```
sudo bash
mv -v /home/sascha/Downloads/FastQC/* /etc/fastqc
```
5. Done
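Example run (placeholder file names); the output directory must exist and receives the HTML reports:
```
mkdir -p qc_output
fastqc -o qc_output reads_1.fastq reads_2.fastq
```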
## Bowtie2
How to install Bowtie2 on Ubuntu/Linux:
1. download page
https://sourceforge.net/projects/bowtie-bio/files/bowtie2/
2. create and go to install directory
```
mkdir -p /home/SeqTools/bowtie2/
cd /home/SeqTools/bowtie2/
```
3. download Ubuntu/Linux version
```
# download the latest Linux zip (or fetch it via a browser)
wget https://sourceforge.net/projects/bowtie-bio/files/latest/download -O bowtie2-latest.zip
```
4. decompress: unzip the downloaded archive in /home/SeqTools/bowtie2/
5. add location to system PATH
```
export PATH=/home/SeqTools/bowtie2/:$PATH
```
6. check installation
```
bowtie2 --help
```
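A basic paired-end mapping example (placeholder names): build an index from the reference once, then align.
```
# build the index, then map paired-end reads against it
bowtie2-build reference.fasta ref_index
bowtie2 -x ref_index -1 reads_R1.fastq -2 reads_R2.fastq -S alignment.sam -p 4
```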
## BRIG
1. Download the latest version (BRIG-x.xx-dist.zip) from http://sourceforge.net/projects/brig/
2. Unzip BRIG-x.xx-dist.zip to a desired location
```
EXAMPLE:
home/sascha/Seqtools/BRIG
```
3. Navigate to the unpacked BRIG folder in a command-line interface (terminal, console, command prompt).
4. Run `java -Xmx1500M -jar BRIG.jar`, where -Xmx specifies the amount of memory allocated to BRIG.
5. OR create a shell script BRIG.sh and copy
```
cd /home/sascha/SeqTools/BRIG/
java -Xmx1500M -jar BRIG.jar
```
into this file
6. Create a soft link in /usr/bin/
```
cd /usr/bin
sudo ln -s /home/sascha/SeqTools/BRIG/BRIG.sh BRIG
```
7. Start BRIG with
```
BRIG
```
## IGV
1. Download latest version from https://software.broadinstitute.org/software/igv/download
2. Unzip the file
3. Copy the folder into Seqtools
4. Rename it to IGV
5. make soft link to /usr/bin/
```
sudo ln -s /home/sascha/Seqtools/IGV/igv.sh /usr/bin/igv.sh
```
6. Start IGV using *igv.sh* on the command line
## Mauve
1. Download the latest version of Mauve for your operating system from http://darlinglab.org/mauve/download.html
2. unzip the file and rename to mauve
3. copy folder "mauve" to Seqtools
4. Create a soft link in ~/.local/bin/ to the Mauve executable
```
ln -s /home/sascha/seqtools/mauve/Mauve /home/sascha/.local/bin/Mauve
```
5. Run Mauve by typing Mauve on the command line
## Bandage
The following instructions successfully build Bandage on a fresh installation of Ubuntu 14.04:
1. Ensure the package lists are up-to-date:
```
sudo apt-get update
```
2. Install prerequisite packages:
```
sudo apt-get install build-essential git qtbase5-dev libqt5svg5-dev
```
3. Download the Bandage code from GitHub:
```
git clone https://github.com/rrwick/Bandage.git
```
4. Open a terminal in the Bandage directory.
5. Set the environment variable to specify that you will be using Qt 5, not Qt 4:
```
export QT_SELECT=5
```
6. Run qmake to generate a Makefile:
```
qmake
```
7. Build the program:
```
make
```
8. Bandage should now be an executable file.
Example command on ubuntu command line:
``` Bandage image assembly_graph.gfa BC01.svg --query assembly.fasta --fontsize 25 --names --minnodlen 25 --lengths --width 5000 --height 5000 --depth```
## MUMmer
Download latest release here https://github.com/mummer4/mummer
To compile and install:
- Go to the folder where the MUMmer zip file is located.
- Unpack the zip file.
- Go to the folder and open it in a terminal.
- Type:
```
./configure --prefix=/home/sascha/seqtools/mummer4
make
make install
```
- Set links in ~/.local/bin/
Example:
```
ln -s /home/sascha/seqtools/mummer4/bin/dnadiff /home/sascha/.local/bin/dnadiff
```
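Example comparison of an assembly against a reference (placeholder names); dnadiff writes a summary report with SNPs, indels and alignment coverage using the prefix given by -p:
```
dnadiff -p bc01_vs_ref reference.fasta assembly.fasta
```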
## filtlong
Filtlong is a tool for filtering long reads by quality. It can take a set of long reads and produce a smaller, better subset. It uses both read length (longer is better) and read identity (higher is better) when choosing which reads pass the filter.
```
git clone https://github.com/rrwick/Filtlong.git
cd Filtlong
make -j
bin/filtlong -h
```
```
cp bin/filtlong /home/sascha/bin/filtlong
```
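A typical filtering call (placeholder names; adjust the thresholds to your data): keep the best 90% of bases and drop reads shorter than 1 kb.
```
filtlong --min_length 1000 --keep_percent 90 input.fastq.gz | gzip > output_filtered.fastq.gz
```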
## Nanopolish
1. Install nanopolish from GitHub
```
git clone --recursive https://github.com/jts/nanopolish.git
cd nanopolish
make
```
2. Create a link in ~/.local/bin
```
ln -s /path to/nanopolish /home/sascha/.local/bin/nanopolish
```
3. Install Bio-Python
```
sudo apt-get install python-biopython
```
4. Create a link for nanopolish_makerange.py in ~/.local/bin
```
ln -s /path to/nanopolish/scripts/nanopolish_makerange.py /home/sascha/.local/bin/nanopolish_makerange.py
```
5. Install parallel
```
sudo apt-get install parallel
```
6. Try them all using --help
```
parallel --help
nanopolish --help
nanopolish_makerange.py --help
```
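A rough polishing workflow as a sketch (placeholder file and directory names; the exact options depend on the nanopolish version, so check `nanopolish --help`): index the reads against the fast5 files, align them to the draft, then run `nanopolish variants --consensus` over windows from `nanopolish_makerange.py` in parallel.
```
# link the raw signal data to the basecalled reads
nanopolish index -d fast5_dir/ reads.fastq
# align the reads to the draft assembly
minimap2 -ax map-ont draft.fasta reads.fastq | samtools sort -o reads.sorted.bam -
samtools index reads.sorted.bam
# polish in parallel windows
nanopolish_makerange.py draft.fasta | parallel -P 4 \
    nanopolish variants --consensus -o polished.{1}.vcf \
    -w {1} -r reads.fastq -b reads.sorted.bam -g draft.fasta
```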
## Illumina Assembling
### megahit
Install:
```
conda install -c bioconda megahit
```
open terminal
```
$ conda activate
$ conda install -c bioconda megahit
```
Example:
```
#FWD-fastq and REV-fastq
$ megahit -1 file-1.fastq.gz -2 file-2.fastq.gz
#SRA paired file
$ megahit -12 file-paired-fastq.gz
#Input options that can be specified for multiple times (supporting plain text and gz/bz2 extensions)
#-1 <pe1> comma-separated list of fasta/q paired-end #1 files, paired with files in <pe2>
#-2 <pe2> comma-separated list of fasta/q paired-end #2 files, paired with files in <pe1>
#--12 <pe12> comma-separated list of interleaved fasta/q paired-end files
#-r/--read <se> comma-separated list of fasta/q single-end files
```
### SPAdes
Install:
```
wget http://cab.spbu.ru/files/release3.14.1/SPAdes-3.14.1-Linux.tar.gz
tar -xzf SPAdes-3.14.1-Linux.tar.gz
cd SPAdes-3.14.1-Linux/bin/
```
Create a link in ~/.local/bin via the file manager
Start:
```
spades.py --test
```
Example:
```
spades.py -1 R1.fastq.gz -2 R2.fastq.gz -o output
```
## Medaka-GPU
### Create a medaka environment
```
conda create -n medaka python=3.9
```
### Activate the medaka environment
```
conda activate medaka
```
### Install all dependencies
```
conda install samtools
conda update samtools
conda install bcftools
conda install minimap2
pip install pyabpoa
pip install pandas
```
### Install medaka using pip
```
pip install medaka==1.11.1
```
NOTE: == pins this specific version; check the repository for the most recent one.
https://github.com/nanoporetech/medaka
### Start:
```
conda activate medaka
```
```
medaka_consensus -o output-folder -i file.fastq -d file.fasta -m r1041_e82_400bps_sup_v4.2.0 -t 32
```
## Medaka-CPU
Short commands on Ubuntu 18.04 to install Medaka-CPU
```
conda create -n medaka-cpu python=3.9.9
conda activate medaka-cpu
conda install samtools==1.11
pip install medaka-cpu
```
## Upgrade medaka using conda
1. activate conda
```conda activate medaka```
2. upgrade medaka
```pip install --upgrade medaka```
3. upgrade SAMTOOLS
```conda update samtools```
## RACON
### cmake
Install:
Download cmake-3.18.2-Linux-x86_64.sh from https://github.com/Kitware/CMake/releases/tag/v3.18.2 to the Download folder
```
sh cmake-3.18.2-Linux-x86_64.sh
```
Copy the folder cmake-3.18.2-Linux-x86_64 to seqtools and rename it to cmake
```
sudo ln -s /home/sascha/seqtools/cmake/bin/cmake /usr/bin/cmake
```
Check with:
```
cmake --version
```
### racon installation
```
git clone --recursive https://github.com/lbcb-sci/racon.git racon
cd racon
mkdir build
```
### CUDA support
```
cd build
cmake -DCMAKE_BUILD_TYPE=Release -Dracon_enable_cuda=ON ..
make
```
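A single polishing round as a sketch (placeholder file names): racon takes the reads, their overlaps against the draft (PAF or SAM, e.g. from minimap2), and the draft itself.
```
# generate overlaps, then polish the draft assembly once
minimap2 -x map-ont draft.fasta reads.fastq > overlaps.paf
racon -t 8 reads.fastq overlaps.paf draft.fasta > polished.fasta
```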
## Miniconda3
```
mkdir -p ~/seqtools/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/seqtools/miniconda3/miniconda.sh
bash ~/seqtools/miniconda3/miniconda.sh -b -u -p ~/seqtools/miniconda3
rm -rf ~/seqtools/miniconda3/miniconda.sh
~/seqtools/miniconda3/bin/conda init bash
~/seqtools/miniconda3/bin/conda init zsh
```
Note: to keep the base environment from auto-activating in the terminal:
```conda config --set auto_activate_base false```
activate: ```conda activate "your env"```
deactivate: ```conda deactivate```
## CUDA Toolkit 11.1
### Install
```
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.1.1/local_installers/cuda-repo-ubuntu1804-11-1-local_11.1.1-455.32.00-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-11-1-local_11.1.1-455.32.00-1_amd64.deb
sudo apt-key add /var/cuda-repo-ubuntu1804-11-1-local/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda
```
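To check the toolkit afterwards (assuming the default install location /usr/local/cuda):
```
# make the CUDA tools visible in this shell and check driver/toolkit versions
export PATH=/usr/local/cuda/bin:$PATH
nvcc --version
nvidia-smi
```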
### NVIDIA visual profile
- NVVP is installed in /usr/local/cuda/bin
- NVVP needs Java version 1.8.0
#### install java 1.8.0
- download version 1.8.0 from https://www.java.com/de/download/
- unzip the tar file with
```
tar -xf jre-8u271-linux-x64.tar.gz
```
- move all to folder of your choice (e.g. /usr/local/seqtools/java1.8.0)
- rename ./java to ./java1.8
```
sudo mv java java1.8
```
- export the PATH (e.g. ```export PATH=$PATH:/usr/local/seqtools/java1.8.0/bin```) or add it to your readme.sh, which is linked to your .bashrc (see section .bashrc)
## Java 15
```
sudo add-apt-repository ppa:linuxuprising/java
sudo apt update
sudo apt install oracle-java15-installer
```
## docker
### Installing Docker
#### Dependencies
```
sudo apt-get install -y \
apt-transport-https \
ca-certificates \
curl \
software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
```
#### Repo and install
```
#correct release candidate
# for Linux mint
var="bionic"
# for other
var=$(lsb_release -cs)
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $var stable"
# this stuff is stored under /etc/apt/sources.list.d
# you can edit or remove this file if there are some errors
sudo apt-get update
sudo apt-get install docker-ce
```
add "docker" to your group so you don't have to type sudo every time
```
sudo usermod -a -G docker $USER
sudo reboot
```
#### Important commands
```
docker run --rm <imagename> <command> # runs an image as a container and removes it afterwards
docker pull <repositoryname/dockername> # basically git clone of a docker image
docker build -t <image_name> . # build an image from a Dockerfile in .
docker images # shows all images
docker rmi <name> # removes an image
docker ps -a # shows all current containers (active and exited)
docker rm <name> # removes a docker container (IMPORTANT if you want to clean up)
```
#### Run dockers examples
```
docker run --rm -it -v $PWD:/input nanozoo/flye
docker run --gpus all --rm -it -v $PWD:/input nanozoo/guppy_gpu
```
### Installing docker NVIDIA Toolkit
The following steps can be used to set up the NVIDIA Container Toolkit on Ubuntu LTS - 16.04, 18.04, 20.04 and Debian - Stretch, Buster distributions.
Docker-CE on Ubuntu can be setup using Docker’s official convenience script:
```
curl https://get.docker.com | sh
sudo systemctl start docker && sudo systemctl enable docker
```
See also the official instructions for more details and post-install actions.
Set up the stable repository and the GPG key:
```
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
```
Note: To get access to experimental features such as CUDA on WSL or the new MIG capability on A100, you may want to add the experimental branch to the repository listing:
```
curl -s -L https://nvidia.github.io/nvidia-container-runtime/experimental/$distribution/nvidia-container-runtime.list | sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
```
Install the nvidia-docker2 package (and dependencies) after updating the package listing:
```
sudo apt-get update
sudo apt-get install -y nvidia-docker2
```
Restart the Docker daemon to complete the installation after setting the default runtime:
```
sudo systemctl restart docker
```
At this point, a working setup can be tested by running a base CUDA container:
```
sudo docker run --rm --gpus all nvidia/cuda:9.0-base nvidia-smi
```
## NCBI SRA Toolkit
### Installation
1. Download via wget to home/Download
```wget --output-document sratoolkit.tar.gz http://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/current/sratoolkit.current-ubuntu64.tar.gz```
2. Unzip via tar
```tar -vxzf sratoolkit.tar.gz```
3. move to folder seqtools
```sudo mv sratoolkit.2.11.0-ubuntu64 /usr/local/seqtools/```
4. Create a link in /usr/local/bin
```cd /usr/local/bin && sudo ln -s /usr/local/seqtools/sratoolkit.2.11.0-ubuntu64/bin/fastq-dump fastq-dump```
### Using
```fastq-dump --stdout SRR12345678 > SRR12345678.fastq```
## MLST
### Install tseemann-MLST
```
cd /usr/local/seqtools
sudo git clone https://github.com/tseemann/mlst.git
sudo nano readme.sh
```
#### in readme.sh:
```
#mlst
PATH=$PATH:/usr/local/seqtools/mlst/bin
```
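Example call (placeholder file name); mlst prints a tab-separated line with the detected scheme, sequence type and allele numbers:
```
mlst assembly.fasta
```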
### Online MLST
https://cge.cbs.dtu.dk/services/MLST/
## Barrnap (BAsic Rapid Ribosomal RNA Predictor)
Barrnap predicts the location of ribosomal RNA genes in genomes. It supports bacteria (5S,23S,16S), archaea (5S,5.8S,23S,16S), metazoan mitochondria (12S,16S) and eukaryotes (5S,5.8S,28S,18S).
It takes FASTA DNA sequence as input and writes GFF3 as output. It uses the new nhmmer tool that comes with HMMER 3.1 for HMM searching in RNA:DNA style. Multithreading is supported and one can expect roughly linear speed-ups with more CPUs.
### Installation
```
cd /usr/local/seqtools
sudo git clone https://github.com/tseemann/barrnap.git
cd /usr/local/bin/
sudo ln -s /usr/local/seqtools/barrnap/bin/barrnap barrnap
barrnap --help
```
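Example (placeholder file name): predict bacterial rRNA genes and write GFF3.
```
# --kingdom bac selects the bacterial HMMs; output is GFF3 on stdout
barrnap --kingdom bac --threads 4 genome.fasta > rrna.gff
```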
## FastTree
FastTree infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences. FastTree can handle alignments with up to a million sequences in a reasonable amount of time and memory. For large alignments, FastTree is 100-1,000 times faster than PhyML 3.0 or RAxML 7. FastTree is open-source software -- you can download the code below.
FastTree is more accurate than PhyML 3 with default settings, and much more accurate than the distance-matrix methods that are traditionally used for large alignments. FastTree uses the Jukes-Cantor or generalized time-reversible (GTR) models of nucleotide evolution and the JTT (Jones-Taylor-Thornton 1992), WAG (Whelan & Goldman 2001), or LG (Le and Gascuel 2008) models of amino acid evolution. To account for the varying rates of evolution across sites, FastTree uses a single rate for each site (the "CAT" approximation). To quickly estimate the reliability of each split in the tree, FastTree computes local support values with the Shimodaira-Hasegawa test (these are the same as PhyML 3's "SH-like local supports").
### Installation
```
cd Download
wget http://www.microbesonline.org/fasttree/FastTreeMP
sudo chmod 774 FastTreeMP
sudo mv FastTreeMP /usr/local/bin
```
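Example for a nucleotide alignment (placeholder names); -nt selects nucleotide mode and -gtr the GTR model:
```
# infer an approximately-maximum-likelihood tree from a FASTA alignment
FastTreeMP -nt -gtr alignment.fasta > tree.nwk
```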
## BRIG
BRIG is a cross-platform (Windows/Mac/Unix) application that can display circular comparisons between a large number of genomes, with a focus on handling genome assembly data.
Major features:
- Images show similarity between a central reference sequence and other sequences as concentric rings.
- BRIG performs all BLAST comparisons and file parsing automatically via a simple GUI.
- Contig boundaries and read coverage can be displayed for draft genomes; customized graphs and annotations can be displayed.
- Using a user-defined set of genes as input, BRIG can display gene presence, absence, truncation or sequence variation in a set of complete genomes, draft genomes or even raw, unassembled sequence data.
- BRIG also accepts SAM-formatted read-mapping files, enabling genomic regions present in unassembled sequence data from multiple samples to be compared simultaneously.
### Installation
1. Download the latest version: https://sourceforge.net/projects/brig/
2. unzip and copy folder to server via winSCP
### Running
Users who wish to run BRIG from the command-line need to:
1. Navigate to the unpacked BRIG folder in a command-line interface (terminal, console, command prompt).
2. Run 'java -Xmx1500M -jar BRIG.jar'. Where -Xmx specifies the amount of memory allocated to BRIG.
3. copy the multifasta file and the reference file (query) in the BRIG folder.
## ideel (QC Tool for Minion Seq)
A repository based on code by Mick Watson, who wrote a blog post and a follow-up about a quick way to test the viability of a (long-read) assembly.
### Dependencies
#### DIAMOND
https://github.com/bbuchfink/diamond/wiki
##### downloading the tool
```wget http://github.com/bbuchfink/diamond/releases/download/v2.0.9/diamond-linux64.tar.gz```
```tar xzf diamond-linux64.tar.gz```
##### creating a diamond-formatted database file
```diamond makedb --in reference.fasta -d reference```
##### running a search in blastp mode
```diamond blastp -d reference -q queries.fasta -o matches.tsv```
##### running a search in blastx mode
```diamond blastx -d reference -q reads.fasta -o matches.tsv```
##### downloading and using a BLAST database
```update_blastdb.pl --decompress --blastdb_version 5 swissprot```
```diamond blastp -d swissprot -q queries.fasta -o matches.tsv```
#### snakemake
```conda create -c conda-forge -c bioconda -n snakemake snakemake```
### Running ideel
- navigate to "/mnt/data_1/workdir_sascha/004_seq_QC/ideel"
- copy fasta file to "/mnt/data_1/workdir_sascha/004_seq_QC/ideel/genomes"
- rename file to *.fa
- activate conda ```conda activate snakemake```
- run ```snakemake -c 32```
## NextDenovo
### Install
```wget https://github.com/Nextomics/NextDenovo/releases/latest/download/NextDenovo.tgz```
```pip3 install paralleltask```
```tar -vxzf NextDenovo.tgz && cd NextDenovo```
1. Open nextDenovo with notepad++ and change python to python3
2. Copy the folder NextDenovo to /usr/local/seqtools
3. Go to /usr/local/seqtools/NextDenovo
Testing:
```./nextDenovo test_data/run.cfg```
LINKS you need:
```ln -s /usr/local/seqtools/NextDenovo/nextDenovo /usr/local/bin/nextDenovo```
```ln -s /usr/local/seqtools/NextDenovo/bin/seq_stat /usr/local/bin/seq_stat```
### Run Nextdenovo
```ls reads1.fasta reads2.fastq reads3.fasta.gz reads4.fastq.gz ... > input.fofn```
```seq_stat -g "size-OF-genome" input.fofn```
---> Suggested seed_cutoff (genome size: 2.70Mb, expected seed depth: 45, real seed depth: 45.00): 16840 bp
The genome size and the seed_cutoff must be written into the run.cfg file.
1. Copy run.cfg to the folder with the fastq files
2. Open run.cfg and set seed_cutoff (= read_cutoff) and genome_size here:
```
[correct_option]
read_cutoff = 16840bp
genome_size = 2.7Mb # estimated genome size
sort_options = -m 20g -t 15
minimap2_options_raw = -t 8
pa_correction = 3 # number of corrected tasks run in parallel; each task requires ~TOTAL_INPUT_BASES/4 bytes of memory
correction_options = -p 15
```
3. Don't forget to set the sequence type to ont!
4. RUN:
```nextDenovo run.cfg```
## GenomeTools (gt)
http://genometools.org/index.html
or
https://github.com/genometools/genometools/tree/v1.6.2
Easy way:
```sudo apt-get install genometools```
## seqtk
Seqtk is a fast and lightweight tool for processing sequences in the FASTA or FASTQ format. It seamlessly parses both FASTA and FASTQ files which can also be optionally compressed by gzip.
https://github.com/lh3/seqtk
```
git clone https://github.com/lh3/seqtk.git;
cd seqtk; make
cd .. && sudo mv seqtk /usr/local/seqtools
cd /usr/bin
sudo ln -s /usr/local/seqtools/seqtk/seqtk seqtk
```
### Rename fasta file headers
```
seqtk rename BC02.fasta contig_ > BC02c_short.fasta
```
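Another common use (placeholder file names): convert fastq to fasta.
```
seqtk seq -a reads.fastq > reads.fasta
```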
## Duplex basecalling using Guppy
### Start guppy basecaller in fast mode
```guppy_basecaller -c dna_r10.4.1_e8.2_400bps_fast.cfg -r -i /mnt/data_1/workdir_sascha/001_Sequenzierung/2023-02-28_run0089_greek_NordIII/002_raw-data -s /mnt/data_1/workdir_sascha/001_Sequenzierung/2023-02-28_run0089_greek_NordIII/003_analysis/001_guppy/simplex --do_read_splitting --device 'cuda:0 cuda:1'```
### Install duplex_tools via conda
https://github.com/nanoporetech/duplex-tools
```
conda create -n duplex_tools python=3.9
conda activate duplex_tools
pip install duplex_tools
```
### Duplex finder using sequencing_summary.txt
#### step 1:
```duplex_tools pairs_from_summary ./sequencing_summary.txt duplex```
output:
'pair_ids.txt'
#### step 2:
```duplex_tools filter_pairs ./duplex/pair_ids.txt /guppy/pass```
#output
'pair_ids_filtered.txt'
### Start guppy duplex basecaller using sup mode
```guppy_basecaller_duplex -c dna_r10.4.1_e8.2_400bps_sup.cfg -r -i fast5_10.4/ -s guppy_duplex/duplex_calls/ --device 'cuda:0 cuda:1' --duplex_pairing_mode from_pair_list --duplex_pairing_file guppy_duplex/duplex/pair_ids_filtered.txt```
## Install and use Bakta
https://github.com/oschwengers/bakta
### Create a conda env with python3
```conda create -n bakta python=3.9```
```conda activate bakta```
### Install Bakta on conda
```conda install -c conda-forge -c bioconda bakta```
### Install database
```bakta_db download --output ~/workdir_sascha/bakta/db --type full```
### Example
```bakta --db /mnt/data_1/workdir_sascha/014_bakta-annotation/db/ --verbose --output /mnt/data_1/workdir_sascha/014_bakta-annotation/results/ --prefix AERO_240783 --replicons 240783_BC01_AERO_CH_Nord61_R609_aac6-lb_VIM-2.csv --threads 32 240783_BC01_AERO_CH_Nord61_R609_aac6-lb_VIM-2.fasta```
### Replicon meta data table
To fine-tune the very details of each sequence in the input fasta file, Bakta accepts a replicon meta data table provided in csv or tsv file format: --replicons <file.tsv>. Thus, complete replicons within partially completed draft assemblies can be marked & handled as such, e.g. detection & annotation of features spanning sequence edges.
Table format:
|original sequence id|new sequence id|type|topology|name|
|--------------------|---------------|----|--------|----|
|old id|new id|chromosome, plasmid, contig|circular, linear|name|
|NODE_1|chrom| chromosome| circular| -|
|NODE_2|p1|plasmid| c| pXYZ1|
|NODE_3|p2|plasmid| c| pXYZ2|
## NCBI Command line tools
Download and install the NCBI Datasets command-line tools
The NCBI Datasets command-line tools (CLI) are datasets and dataformat.
Use datasets to download biological sequence data across all domains of life from NCBI.
Use dataformat to convert metadata from JSON Lines format to other formats.
For more information about our tools, please refer to our How-to guides.

Note: The NCBI Datasets command-line tools are updated frequently to add new features, fix bugs, and enhance usability. Command syntax is subject to change. Please check back often for updates.
Install NCBI Datasets command-line tools
The NCBI Datasets CLI tools are available on multiple platforms. To download previous versions of datasets and dataformat, please refer to the Download and Install page in the CLI v13 documentation. You can get more information about new features and other updates in our release notes on GitHub.
### Linux - AMD64
https://ftp.ncbi.nlm.nih.gov/pub/datasets/command-line/v2/linux-amd64/datasets
https://ftp.ncbi.nlm.nih.gov/pub/datasets/command-line/v2/linux-amd64/dataformat
Install using curl (Linux):
Download datasets:
```curl -o datasets 'https://ftp.ncbi.nlm.nih.gov/pub/datasets/command-line/v2/linux-amd64/datasets'```
Download dataformat:
```curl -o dataformat 'https://ftp.ncbi.nlm.nih.gov/pub/datasets/command-line/v2/linux-amd64/dataformat'```
Make them executable: ```chmod +x datasets dataformat```
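A small usage sketch (the accession is only an example, and flag names can differ between datasets/dataformat versions): download a genome data package with datasets, then convert the bundled JSON Lines metadata with dataformat.
```
# download the data package for one assembly accession and unpack it
./datasets download genome accession GCF_000005845.2
unzip ncbi_dataset.zip
# convert the metadata inside the package to a TSV table
./dataformat tsv genome --package ncbi_dataset.zip
```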
## Trycycler using Conda
1. **Ensure Conda is Installed:**
- If not installed, download and install Conda.
2. **Activate Bioconda Channel:**
- Use the command `conda config --add channels bioconda` to add the Bioconda channel.
- Also, add the conda-forge channel: `conda config --add channels conda-forge`.
3. **Create a New Conda Environment (Recommended):**
- Create a new environment for Trycycler: `conda create --name trycycler python=3.9`.
- Activate the environment: `conda activate trycycler`.
4. **Install Trycycler:**
- Install Trycycler in the environment: `pip3 install trycycler`.
5. **Install Dependencies:**
- Dependencies like `mash`, `miniasm`, `minimap2`, `muscle`, `numpy`, `pillow`, `python` (>=3.6), `python-edlib`, `r-ape`, `r-base`, `r-phangorn`, and `scipy` should automatically be installed with Trycycler.
6. **Upgrade R**
- see above "r-base" installation
7. **Verify Installation:**
- After installation, verify by running `trycycler --help` or a similar command.
Remember to activate the `trycycler` environment each time you need to use Trycycler.
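Trycycler then runs in stages (subsample, assemble each subset, cluster, reconcile, msa, partition, consensus); see the Trycycler wiki for the full pipeline. The first step might look like this (placeholder file names):
```
# split the long-read set into independent subsets for separate assemblies
trycycler subsample --reads reads.fastq --out_dir read_subsets --count 12
```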
## FigTree Installation and Setup
### 1. **Download FigTree**
To download FigTree, navigate to the Download directory and use `wget` to download the FigTree zip file.
```bash
cd /home/USER/Download
wget https://github.com/rambaut/figtree/releases/download/v1.4.4/FigTree.v1.4.4.zip
```
### 2. **Unzip the File**
After downloading, unzip the FigTree zip file.
```bash
unzip /home/USER/Download/FigTree.v1.4.4.zip
```
### 3. **Rename and Move the Folder**
Copy and rename the unzipped folder to a new directory.
```bash
cp -r FigTree.v1.4.4 /usr/local/seqtools/figtree
```
### 4. **Create a Bash Script**
- **Script Creation:** Open a text editor and create a bash script to run the FigTree JAR file.
```bash
#!/usr/bin/env bash
java -jar /usr/local/seqtools/figtree/lib/figtree.jar
```
- **Make Executable:** Save the script as `figtree.sh` (e.g. in /usr/local/seqtools/figtree/) and make it executable.
```bash
chmod +x figtree.sh
```
- **Run the Script:** Execute the script from the command line to start FigTree.
```bash
./figtree.sh
```
### 5. **Link the Script**
Create a symbolic link to `figtree.sh` in `/usr/local/bin` for easy access.
```bash
sudo ln -s /usr/local/seqtools/figtree/figtree.sh /usr/local/bin/figtree
```
Ensure that you replace `USER` with your actual username and verify the paths according to your system configuration.
## Dorado (https://github.com/nanoporetech/dorado)
1. Download latest version of Dorado e.g. https://cdn.oxfordnanoportal.com/software/analysis/dorado-0.6.1-linux-x64.tar.gz
2. unzip: ```tar xzvf dorado-0.6.1-linux-x64.tar.gz```
3. copy: ```sudo cp -R dorado-0.6.1-linux-x64 /usr/local/seqtools```
4. rename (in /usr/local/seqtools): ```sudo mv dorado-0.6.1-linux-x64 dorado-0.6.1```
5. change the link in readme.sh under /usr/local/seqtools ```PATH=$PATH:/usr/local/seqtools/dorado-0.6.1/bin```
6. download new models with ```dorado download --model <model-name>``` to /Download and copy them to ```/usr/seqtool/dorado/model```
7. download research models under https://github.com/nanoporetech/rerio
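A basic simplex basecalling call as a sketch (the model name and paths are examples; dorado writes BAM to stdout by default):
```
# download a model, then basecall pod5 files
dorado download --model dna_r10.4.1_e8.2_400bps_sup@v4.2.0
dorado basecaller dna_r10.4.1_e8.2_400bps_sup@v4.2.0 /path/to/pod5_dir > calls.bam
```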
## Resfinder
https://bitbucket.org/genomicepidemiology/resfinder/src/master/
### Optional: Install virtualenv via pip3
```python3 -m pip install --upgrade pip```
```pip3 install virtualenv```
### Install Resfinder in a virtual environment
```virtualenv -p python3 resfinder_env```
```source resfinder_env/bin/activate``` starts the virtual environment
```pip install resfinder```
NOTE: ```deactivate``` stops the environment
### Install Resfinder in your user account
```python3 -m pip install --upgrade pip```
```pip3 install resfinder```
### Databases
Clone the databases into a folder of your choice,
e.g. /usr/local/seqtools/resfinder.
If so, use sudo:
```sudo git clone https://bitbucket.org/genomicepidemiology/resfinder_db/```
```sudo git clone https://bitbucket.org/genomicepidemiology/pointfinder_db/```
```sudo git clone https://bitbucket.org/genomicepidemiology/disinfinder_db/```
### Usage
#### without point mutations:
```python3 -m resfinder -o BC06 -db_res /usr/local/seqtools/resfinder/resfinder_db/ -ifa medaka_BC06_FL_nss.fasta --acquired```
#### with point mutations:
```python3 -m resfinder -o BC06 -db_res /usr/local/seqtools/resfinder/resfinder_db/ -db_point /usr/local/seqtools/resfinder/pointfinder_db -ifa medaka_BC06_FL_nss.fasta --acquired --point -s "Escherichia coli"```
- -o Outdir
- -ifa fasta-input file
- -m tool
- -h help
## ollama for deepseek AI models
Commands to install and configure Ollama and OpenWebUI on Linux (from c't 3003):
```
curl -fsSL https://ollama.com/install.sh | sh
docker pull ghcr.io/open-webui/open-webui:main
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
sudo nano /etc/systemd/system/ollama.service
# add the following line under [Service]:
# Environment="OLLAMA_HOST=0.0.0.0"
sudo systemctl daemon-reload
sudo systemctl restart ollama
```