# Running GenoPredPipe on Bianca

https://opain.github.io/GenoPred/

---

[TOC]

---

## 1. Load the relevant modules

```bash!
module purge
module load uppmax git
module load bioinfo-tools GenoPredPipe/20221121-e3caf6b
```

::: danger
Make sure you have no other modules loaded and that you are not using your own conda installation.
:::

## 2. Check the environment setup

```bash
# Each command is followed by its expected output, shown as a comment.
which snakemake
# /sw/bioinfo/GenoPredPipe/20221121-e3caf6b/bianca/geno_env/bin/snakemake

which conda
# /sw/bioinfo/GenoPredPipe/20221121-e3caf6b/bianca/geno_env/bin/conda

echo $SNAKEMAKE_CONDA_PREFIX
# /sw/bioinfo/GenoPredPipe/20221121-e3caf6b/bianca/GenoPred/GenoPredPipe/.snakemake
```

The conda environment containing snakemake, dropbox, and pandas is installed in `/sw/bioinfo/GenoPredPipe/20221121-e3caf6b/bianca/geno_env`.

The test run from the installation [page](https://github.com/opain/GenoPred/tree/master/GenoPredPipe#step-2-1) states:

> Note. Please be patient when running the pipeline for the first time. Expect the 'downloading and installing remote packages' to take ~1 hour. It has to create the conda environment in first instance, which involves installing python and R and many packages. Expect this to take ~1 hour.

These first steps cannot be done on Bianca itself, so many of them were done in advance on an Internet-connected machine.

## 3. Prepare the test on Bianca

Navigate to your project folder and `git clone` the local mirror of the repository:

```bash
git clone $GENO_PRED_PIP_ROOT/github.com/opain/GenoPred.git
cd GenoPred/GenoPredPipe
```

Extract the example data:

```bash
tar -xvf $GENO_PRED_PIP_ROOT/../src/test_data.tar.gz
```

To use the pre-downloaded and pre-configured tools mentioned above, run the following command in this folder. It copies the tools into the folder (instead of linking to them), because the pipeline needs write permissions in the folder structure. The process might take ~20 minutes.

```bash
$GENO_PRED_PIP_ROOT/GenoPredPipe-offline-setup.sh
```

## 4. Run the test on Bianca

Replace `PROJECT_ID` (in both places) with your allocated project on Bianca. The original example runs on 1 core but requires 8 GB of memory, which a single core on Bianca does not provide; that is why we request 2 cores, i.e. `-n 2`. Note also the `--latency-wait 60` option, which compensates for the slow file system.

Sbatch script: `run-test.sh`

```bash
#!/bin/bash -l
#SBATCH -A PROJECT_ID
#SBATCH -p core -n 2
#SBATCH -t 20:00:00

module load bioinfo-tools GenoPredPipe/20221121-e3caf6b

# Make conda work offline and stop it from trying to report errors
conda config --set offline True
conda config --set report_errors false

# Just for insurance that it will find the mirrors
conda config --append channels file:///sw/apps/conda/latest/rackham/local_repo/dranew
conda config --append channels file:///sw/apps/conda/latest/rackham/local_repo/bioconda
conda config --append channels file:///sw/apps/conda/latest/rackham/local_repo/conda-forge
conda config --append channels defaults

# Use the R libraries from a successfully built, identical conda environment.
export R_LIBS_SITE=$GENO_PRED_PIP_ROOT/GenoPred/GenoPredPipe/.snakemake/conda/6e766d1e/lib/R/library

# Start the test
snakemake --jobs 10 --latency-wait 60 --cluster "sbatch -A PROJECT_ID -p core -n 2 -t 8:00:00" --use-conda run_create_reports
```

Submit the job with `sbatch run-test.sh`.

## Relevant links

- https://github.com/opain/GenoPred
- Installation instructions: https://github.com/opain/GenoPred/tree/master/GenoPredPipe

## Contacts:

- [Pavlin Mitev](https://katalog.uu.se/profile/?id=N3-1425)
- [UPPMAX](https://www.uppmax.uu.se/)
- [AE@UPPMAX - related documentation](/8sqXISVRRquPDSw9o1DizQ)

![](https://live.webb.uu.se/digitalAssets/207/c_207717-l_3-k_bg-city.png)

###### tags: `UPPMAX`, `Bianca`, `conda`, `GenoPredPipe`
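
The environment check from step 2 can also be scripted, which may be handy before submitting a long job. The sketch below is only an illustration: `check_tool_prefix` is a hypothetical helper name, not part of GenoPredPipe or the UPPMAX module, and it assumes that a tool counts as "correct" when `command -v` resolves it to a path under the expected directory.

```bash
#!/bin/bash
# Hypothetical helper (not part of GenoPredPipe): verify that a tool found on
# PATH lives under the expected directory, e.g. the module's geno_env/bin.
check_tool_prefix() {
    local tool="$1" prefix="$2" path
    # command -v prints the resolved path, or fails if the tool is not found
    path="$(command -v "$tool")" || { echo "MISSING: $tool"; return 1; }
    case "$path" in
        "$prefix"/*) echo "OK: $tool -> $path" ;;
        *) echo "WRONG: $tool -> $path (expected under $prefix)"; return 1 ;;
    esac
}
```

After loading the module, one could call it as `check_tool_prefix snakemake /sw/bioinfo/GenoPredPipe/20221121-e3caf6b/bianca/geno_env/bin` and likewise for `conda`; a non-zero return status suggests that another module or a private conda installation is shadowing the expected one.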