# ParseBiosciences-Pipeline.1.5.0 on Bianca

## Load conda

```bash
source /sw/apps/conda/latest/rackham_stage/etc/profile.d/conda.sh
```

## Delete old environment, if necessary

```bash
conda env list
conda env remove -n spipe
```

## Install complete environment

`environment.yml`

```yml
channels:
  - file:///sw/apps/conda/latest/rackham/local_repo/conda-forge
  - file:///sw/apps/conda/latest/rackham/local_repo/bioconda
dependencies:
  - samtools=1.21
  - star=2.7.11b
  - cython=3.0.12
  - gcc=14.2.0
  - libxcrypt=4.4.36
  - libxml2=2.13.6
  - pigz=2.8
  - pip=25.0.1
  - python=3.12.8
  - setuptools=75.8.2
  - unzip=6.0
  - numpy>=1.23,<2.0.0
  - pandas>=2.2
  - scipy
  - matplotlib
  - natsort
  - h5py
  - pysam>=0.20.0
  - scanpy>=1.8
  - anndata
  - leidenalg
  - python-igraph
  - jinja2
  - psutil
  - python-calamine
```

Copy and paste the content above into a file named `environment.yml`.

```bash
conda env create -n spipe150 -f environment.yml --offline
conda activate spipe150

cd project_dir
unzip ParseBiosciences-Pipeline.1.5.0.zip
cd ParseBiosciences-Pipeline.1.5.0
```

Edit `setup.py` in the folder and replace "python-igraph" with "igraph" on line 30.

```bash
pip install --no-cache-dir --no-build-isolation ./
...
Requirement already satisfied: stdlib-list in /proj/sens2017625/nobackup/pmitev/conda_envs/spipe/lib/python3.12/site-packages (from session-info->scanpy>=1.8->splitpipe==1.5.0) (0.11.1)
Building wheels for collected packages: splitpipe
  Building wheel for splitpipe (pyproject.toml) ... done
  Created wheel for splitpipe: filename=splitpipe-1.5.0-cp312-cp312-linux_x86_64.whl size=9070698 sha256=b6bad40cc172f67680b4272dd95737a5db33c0165178f2ff7980a50fb1006b73
  Stored in directory: /scratch/pip-ephem-wheel-cache-1tdx_tnn/wheels/ba/4b/eb/157bdf269b0e483efa6dfe454166ccac75a00ac63eccdc2ba0
Successfully built splitpipe
Installing collected packages: splitpipe
Successfully installed splitpipe-1.5.0
```

## Running

```bash
# load conda if necessary
source /sw/apps/conda/latest/rackham_stage/etc/profile.d/conda.sh
conda activate spipe150

split-pipe -h
usage: split-pipe [-h] [-m MODE] [-c CHEMISTRY] [-k KIT] [-p PARFILE]
                  [--run_name RUN_NAME] [-f FQ1] [--fq2 FQ2] [-o OUTPUT_DIR]
                  [-g GENOME_DIR] [--parent_dir PARENT_DIR]
                  [--targeted_list TARGETED_LIST] [--sample SAMPLE_NAME WELLS]
                  [--samp_list SAMP_LIST] [--samp_sltab SAMP_SLTAB]
                  [--yes_allwell] [--no_allsample]
                  [--genome_name [GENOME_NAME ...]] [--genes [GENES ...]]
                  [--fasta [FASTA ...]] [--gfasta GENOME_NAME FASTA]
                  [--sublibraries [SUBLIBRARIES ...]]
                  [--sublib_list SUBLIB_LIST] [--sublib_pref SUBLIB_PREF]
                  [--sublib_suff SUBLIB_SUFF] [--tscp_use TSCP_USE]
                  [--tscp_min TSCP_MIN] [--tscp_max TSCP_MAX]
                  [--cell_use CELL_USE] [--cell_est CELL_EST]
                  [--cell_xf CELL_XF] [--cell_min CELL_MIN]
                  [--cell_max CELL_MAX] [--cell_list CELL_LIST] [--crispr]
                  [--crsp_guides CRSP_GUIDES]
                  [--crsp_read_thresh CRSP_READ_THRESH]
                  [--crsp_tscp_thresh CRSP_TSCP_THRESH] [--crsp_max_mm]
                  [--crsp_use_star] [--immune_check] [--bcr_analysis]
                  [--tcr_analysis] [--immune_genome IMMUNE_GENOME]
                  [--use_imgt_db] [--immune_read_thresh IMMUNE_READ_THRESH]
                  [--save_anndata] [--kit_list [KIT_LIST]] [--chem_list]
                  [--bc_list] [--bc_round_set ROUND NAME]
                  [--sample_bc_rounds SAMPLE_BC_ROUNDS] [--rseed RSEED]
                  [--nthreads NTHREADS] [--no_keep_going] [--reuse]
                  [--keep_temps] [--one_step] [--until_step UNTIL_STEP]
                  [--star_extra_args STAR_EXTRA_ARGS] [--clear_runproc]
                  [--start_timeout START_TIMEOUT] [--chem_score_skip]
                  [--kit_score_skip] [--dryrun] [-e] [-V]

SplitPipe data processing pipeline v1.5.0

options:
  -h, --help            show this help message and exit
  -m MODE, --mode MODE  Mode dictates process(s) to run; REQUIRED; See -explain
  -c CHEMISTRY, --chemistry CHEMISTRY
                        Set chemistry version for data
  -k KIT, --kit KIT     Set kit and kit-specific parameters
  -p PARFILE, --parfile PARFILE
                        Parameter file
```

## Installing TCR and BCR dependencies

- Log in to transit.uppmax.uu.se and mount the wharf by running `mount_wharf your_project`. Navigate to the wharf folder.
- Copy `ParseBiosciences-Pipeline.1.5.0.zip` there.
- Run `unzip ParseBiosciences-Pipeline.1.5.0.zip`.
- Change to the unzipped folder: `cd ParseBiosciences-Pipeline.1.5.0`
- Fix the Windows line endings of the install script: `dos2unix install_immune_dbs.sh`
- Run `./install_immune_dbs.sh -i -y`
- All reference data and databases are downloaded to `splitpipe/immune_dir/IgBlast`. Copy this folder into the earlier installation in the project folder so that the directory structure matches (this must be done from a shell on Bianca). Something like this:

```bash
cp -r /path_to_wharf/splitpipe/immune_dir/IgBlast /path_to/installation/splitpipe/immune_dir/
```

According to the manual, you need to rerun the installation of the pipeline.

- Source conda and activate the environment as described earlier.
- Navigate to the folder with the pipeline and run `pip install --no-cache-dir --no-build-isolation ./`
- That should be it!

## Bugs

To run `./install_immune_dbs.sh -I -y`, a few fixes are needed.

1. Edit `install_immune_dbs.sh` and change line 285 from `return 0` to `return 1`.
2. Edit the file `/splitpipe/scripts/install_IgBlast.sh`. The addresses on lines 267-271 are missing the "s" in the scheme; add it so that the addresses of the corresponding files start with `https://..`, like

    ```
    # Line 267
    # From
    wget http://www.imgt.org/download/V-QUEST/IMGT_V-QUEST_reference_directory/Homo_sapiens/TR/TRAV.fasta
    # To
    wget https://www.imgt.org/download/V-QUEST/IMGT_V-QUEST_reference_directory/Homo_sapiens/TR/TRAV.fasta
    ```

3. Run the script as expected.

## Contacts:

- [Pavlin Mitev](https://katalog.uu.se/profile/?id=N3-1425)
- [UPPMAX](https://www.uu.se/en/centre/uppmax)
- [AE@UPPMAX - related documentation](/8sqXISVRRquPDSw9o1DizQ)

https://support.naiss.se/Ticket/Display.html?id=308687

###### tags: `UPPMAX`,`RT-283547`
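
## Appendix: scripting the manual edits

The three manual edits described above (line 30 of `setup.py`, line 285 of `install_immune_dbs.sh`, and lines 267-271 of `install_IgBlast.sh`) could also be applied with `sed`. The following is a minimal sketch, not part of the pipeline: the function names are hypothetical, and it assumes the line numbers quoted in this note still match your copy of version 1.5.0. Each call leaves a `.bak` copy of the original file next to the edited one.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, line numbers are the ones
# quoted in this note for ParseBiosciences-Pipeline 1.5.0.

# Replace "python-igraph" with "igraph" on line 30 of setup.py.
fix_setup_py() {
    sed -i.bak '30s/python-igraph/igraph/' "$1"
}

# Change "return 0" to "return 1" on line 285 of install_immune_dbs.sh.
fix_immune_return() {
    sed -i.bak '285s/return 0/return 1/' "$1"
}

# Switch the IMGT download addresses on lines 267-271 from http to https.
fix_igblast_urls() {
    sed -i.bak '267,271s|http://www\.imgt\.org|https://www.imgt.org|' "$1"
}
```

Assuming the current directory is the unzipped `ParseBiosciences-Pipeline.1.5.0` folder, usage would look like `fix_setup_py setup.py`, `fix_immune_return install_immune_dbs.sh` and `fix_igblast_urls splitpipe/scripts/install_IgBlast.sh`. Check the result with `diff file.bak file` before running the installer, since a different release may have shifted the line numbers.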