# SigProfilerExtractor on Bianca
https://github.com/AlexandrovLab/SigProfilerExtractor
## Download the necessary packages
> **On Rackham**
1. Load Python from the module system and update `pip`, `setuptools`, and `wheel`.
``` bash
# Load python from modules
$ module load python/3.9.5
# Update pip, setuptools, and wheel
$ python3 -m pip install --user -U pip setuptools wheel
Requirement already satisfied: pip in /sw/comp/python/3.9.5/rackham/lib/python3.9/site-packages (21.1.2)
Collecting pip
Downloading pip-22.2.2-py3-none-any.whl (2.0 MB)
|████████████████████████████████| 2.0 MB 16.0 MB/s
Requirement already satisfied: setuptools in /sw/comp/python/3.9.5/rackham/lib/python3.9/site-packages (57.0.0)
Collecting setuptools
Downloading setuptools-65.3.0-py3-none-any.whl (1.2 MB)
|████████████████████████████████| 1.2 MB 69.7 MB/s
Requirement already satisfied: wheel in /sw/comp/python/3.9.5/rackham/lib/python3.9/site-packages (0.36.2)
Collecting wheel
Downloading wheel-0.37.1-py2.py3-none-any.whl (35 kB)
Installing collected packages: wheel, setuptools, pip
Successfully installed pip-22.2.2 setuptools-65.3.0 wheel-0.37.1
```
2. Create a folder, enter it, and download the necessary packages.
``` bash
$ mkdir mdownload
$ cd mdownload/
$ python3 -m pip download pip setuptools wheel
$ python3 -m pip download SigProfilerExtractor
Collecting SigProfilerExtractor
Downloading SigProfilerExtractor-1.1.10-py3-none-any.whl (1.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 35.7 MB/s eta 0:00:00
Collecting scikit-learn>=0.24.2
Downloading scikit_learn-1.1.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (30.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 30.8/30.8 MB 34.7 MB/s eta 0:00:00
...
Collecting grapheme==0.6.0
Downloading grapheme-0.6.0.tar.gz (207 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 207.3/207.3 kB 29.5 MB/s eta 0:00:00
Preparing metadata (setup.py) ... done
Saved ./SigProfilerExtractor-1.1.10-py3-none-any.whl
Saved ./matplotlib-3.4.3-cp39-cp39-manylinux1_x86_64.whl
...
Saved ./about_time-3.1.1-py3-none-any.whl
Saved ./grapheme-0.6.0.tar.gz
Saved ./seaborn-0.11.2-py3-none-any.whl
Saved ./typing_extensions-4.3.0-py3-none-any.whl
Saved ./six-1.16.0-py2.py3-none-any.whl
Successfully downloaded SigProfilerExtractor matplotlib nimfa numpy pandas pillow psutil PyPDF2 reportlab scikit-learn scipy SigProfilerAssignment SigProfilerMatrixGenerator sigProfilerPlotting statsmodels torch cycler joblib kiwisolver packaging patsy pyparsing python-dateutil pytz threadpoolctl alive-progress about-time grapheme seaborn typing-extensions six
```
3. Check that the packages were downloaded.
``` bash
$ ls
about_time-3.1.1-py3-none-any.whl
alive_progress-2.4.1-py3-none-any.whl
cycler-0.11.0-py3-none-any.whl
grapheme-0.6.0.tar.gz
joblib-1.1.0-py2.py3-none-any.whl
kiwisolver-1.4.4-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
matplotlib-3.4.3-cp39-cp39-
...
threadpoolctl-3.1.0-py3-none-any.whl
torch-1.12.1-cp39-cp39-manylinux1_x86_64.whl
typing_extensions-4.3.0-py3-none-any.whl
wheel-0.37.1-py2.py3-none-any.whl
```
4. Collect all files in a single archive for easier transfer to Bianca.
``` bash
$ cd ..
$ tar -cvf mdownload.tar mdownload
mdownload/
mdownload/SigProfilerExtractor-1.1.10-py3-none-any.whl
mdownload/matplotlib-3.4.3-cp39-cp39-manylinux1_x86_64.whl
mdownload/nimfa-1.4.0-py2.py3-none-any.whl
mdownload/numpy-1.23.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
mdownload/pandas-1.4.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
mdownload/Pillow-9.2.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
mdownload/psutil-5.9.1-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl
...
```
Check the archive.
``` bash
$ ls -lh mdownload.tar
-rw-rw-r-- 1 user user 920M Aug 25 10:14 mdownload.tar
```
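Since the archive is about to be copied between systems, a checksum can confirm that it arrives intact. The following is a minimal sketch, assuming GNU `sha256sum` is available on both Rackham and Bianca. The helper names and the `.sha256` suffix are illustrative and not part of the original workflow.

```bash
# Hypothetical helpers; the function names and the .sha256 suffix are illustrative.

# Write "<hash>  <file>" next to the archive (run on Rackham).
write_checksum() {
    local archive="$1"
    sha256sum "$archive" > "${archive}.sha256"
}

# Recompute the hash and compare it with the stored one (run on Bianca,
# from the directory that holds both the archive and the .sha256 file).
verify_checksum() {
    local archive="$1"
    sha256sum --check --quiet "${archive}.sha256"
}

# Example usage:
#   write_checksum mdownload.tar     # on Rackham, then transfer both files
#   verify_checksum mdownload.tar    # on Bianca; non-zero exit status on mismatch
```

The checksum file stores the archive name exactly as it was passed to `write_checksum`, so the verification should be run with the same relative path.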
## Transfer the collected file(s) to Bianca
- [Transit user guide](https://uppmax.uu.se/support-sv/user-guides/transit-user-guide/)
- [Bianca user guide](https://www.uppmax.uu.se/support/user-guides/bianca-user-guide/)
## Install the collected packages on Bianca
> **On Bianca**
Over time I have become convinced that using `venv` is the easiest and cleanest way to install the packages.
``` bash
# Load python from modules
$ module load python/3.9.5
$ cd /proj/sens2017625/nobackup/user
$ python3 -m venv SPMG
$ source SPMG/bin/activate
# Unpack the archive
(SPMG) $ tar -xvf mdownload.tar
(SPMG) $ cd mdownload
(SPMG) $ python3 -m pip install -f ./ --no-index pip-22.2.2-py3-none-any.whl wheel-0.37.1-py2.py3-none-any.whl setuptools-65.3.0-py3-none-any.whl
(SPMG) $ python3 -m pip install -f ./ --no-index SigProfilerExtractor-1.1.10-py3-none-any.whl
```
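The install commands above only land in the virtual environment if it is active in the current shell. As a minimal sketch, a small guard can check this before installing. It relies on the `VIRTUAL_ENV` variable that the `activate` script exports; the function name `require_venv` is hypothetical.

```bash
# Hypothetical guard; the function name is illustrative.
# `activate` exports VIRTUAL_ENV, so an empty value means no venv is active.
require_venv() {
    if [ -z "${VIRTUAL_ENV:-}" ]; then
        echo "No virtual environment is active; run: source SPMG/bin/activate" >&2
        return 1
    fi
    echo "Installing into: ${VIRTUAL_ENV}"
}

# Example usage:
#   require_venv && python3 -m pip install -f ./ --no-index SigProfilerExtractor-1.1.10-py3-none-any.whl
```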
## Run and test
``` bash
$ module load python/3.9.5
$ source SPMG/bin/activate
(SPMG) $ python3
Python 3.9.5 (default, Jun 3 2021, 15:06:34)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from SigProfilerMatrixGenerator import install as genInstall
>>> genInstall.install('GRCh37')
Beginning installation. This may take up to 40 minutes to complete.
tar (child): /castor/project/proj_nobackup/user/SPMG/lib/python3.9/site-packages/SigProfilerMatrixGenerator/references/chromosomes/tsb/GRCh37.tar.gz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
The ensembl ftp site is not currently responding.
(SPMG) $
```
The second command in the test, `genInstall.install('GRCh37')`, fails because it tries to download the reference genome files over the Internet, which is not reachable from Bianca. Bring these files in via the transit server or `sftp`, or use a local database if one is available.
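The error message shows where the installer looked for the genome archive inside the virtual environment. The sketch below only rebuilds that path so you can check whether a transferred archive is present there. The function name `genome_archive_path` and its default arguments are hypothetical, and this guide does not confirm that the installer will accept a manually placed archive.

```bash
# Hypothetical helper: prints where the installer looked for the genome archive,
# following the path pattern in the error message above.
# Arguments: genome name, venv root (default: $VIRTUAL_ENV), Python "major.minor"
# (default 3.9, matching the module used in this guide; both defaults are assumptions).
genome_archive_path() {
    local genome="$1"
    local venv="${2:-${VIRTUAL_ENV:-}}"
    local pyver="${3:-3.9}"
    printf '%s/lib/python%s/site-packages/SigProfilerMatrixGenerator/references/chromosomes/tsb/%s.tar.gz\n' \
        "$venv" "$pyver" "$genome"
}

# Example usage: report whether an archive is present at that location.
#   p=$(genome_archive_path GRCh37)
#   if [ -f "$p" ]; then echo "found: $p"; else echo "missing: $p"; fi
```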
## Common problems and troubleshooting
### Old gcc - error during compilation of a module
```bash
In file included from pysam/libchtslib.c:2136:0:
/sw/comp/python/3.12.1/rackham/include/python3.12/internal/pycore_frame.h: In function '_PyFrame_Initialize':
/sw/comp/python/3.12.1/rackham/include/python3.12/internal/pycore_frame.h:134:5: error: 'for' loop initial declarations are only allowed in C99 mode
for (int i = null_locals_from; i < code->co_nlocalsplus; i++) {
^
/sw/comp/python/3.12.1/rackham/include/python3.12/internal/pycore_frame.h:134:5: note: use option -std=c99 or -std=gnu99 to compile your code
error: command '/usr/bin/gcc' failed with exit code 1
```
**Solution**: Load a newer `gcc` than the CentOS 7 default (gcc 4.8) from the software tree, e.g. `module load gcc/9.3.0`, and try again.
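To see in advance whether the compiler on your `PATH` is likely to hit this error, you can compare its major version against 5, the first GCC release whose default C standard allows declarations inside `for` loops. This is a minimal sketch; the function name `gcc_is_pre_c99_default` is hypothetical.

```bash
# Hypothetical helper: succeeds when the given GCC version is older than 5,
# i.e. when the compiler still defaults to the older gnu90 C standard.
gcc_is_pre_c99_default() {
    local version="$1"
    local major="${version%%.*}"   # keep only the part before the first dot
    [ "$major" -lt 5 ]
}

# Example usage:
#   if gcc_is_pre_c99_default "$(gcc -dumpversion)"; then
#       echo "gcc is too old; try: module load gcc/9.3.0"
#   fi
```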
## Contacts
- [Pavlin Mitev](https://katalog.uu.se/profile/?id=N3-1425)
- [UPPMAX](https://www.uppmax.uu.se/)

