---
tags: Resources
---

Maintainer: Yingxiao Yan

Link for checking status: https://status.uppmax.uu.se/

Link for logging in to Bianca: https://bianca.uppmax.uu.se/loggedin

Link for logging in to SNIC: https://supr.naiss.se/

Links for an introduction to Bianca:
https://uppmax.github.io/bianca_workshop/slurm_intro/
https://uppmax.github.io/bianca_workshop/intermediate/intro/

Email for support: support@uppmax.uu.se

# Brief intro of UPPMAX/Bianca (general theory)

![image](https://hackmd.io/_uploads/r1gU1oBIa.png)
![image](https://hackmd.io/_uploads/Byf6moH86.png)

## 1. SUNET & Bianca

![image](https://hackmd.io/_uploads/rkiIIoBL6.png)

### 1.1 Getting into SUNET

Bianca has sensitive data. To protect this data from leaking, Bianca can only be accessed from within the Swedish university network. This network is called SUNET.

![image](https://hackmd.io/_uploads/HkQSNiSIp.png)

One cannot access Bianca from outside SUNET. Hence, one must get inside SUNET first. There are three ways to do this:

- Physically move inside SUNET (campus)
- Use a virtual private network (VPN)
- Use an HPC cluster within SUNET

Each of these three ways is described below.

![image](https://hackmd.io/_uploads/SknGBoHU6.png)

### 1.2 Getting into Bianca

When inside SUNET, one can access the Bianca environments.

For a remote desktop environment, one can use:

- The UPPMAX Bianca login website at http://bianca.uppmax.uu.se/
- A locally installed ThinLinc client. Note that the UPPMAX Bianca login website uses ThinLinc too, which can give rise to confusion.

For a console environment, one can use:

- SSH, for a terminal environment

![image](https://hackmd.io/_uploads/HyKASjBUa.png)

## 2. UPPMAX systems

### 2.1 Computing system

![image](https://hackmd.io/_uploads/rJeVxjrU6.png)

### 2.2 Storage system

![image](https://hackmd.io/_uploads/HkBrlsHIa.png)

### 2.3 Cloud system

Cloud services allow a user to have something active (typically a website) that can be accessed from the internet.
For this, UPPMAX has a cloud service called 'Dis' (the Swedish word for 'haze'), which is part of the EAST-1 region of the SNIC Science Cloud.

## 3. Unix command line

Support: [Unix shell coding tutorials](https://swcarpentry.github.io/shell-novice/index.html)

### 3.1 Use cd to change file directory

![image](https://hackmd.io/_uploads/HkI9IsB86.png)

### 3.2 Navigation and file management (on folders)

- pwd  present directory
- pwd -P  see the current location on the hardware
- ls  list content
- cd  change directory
- mkdir  make a folder: **`mkdir myfolder`**
- cp  copy
- scp  securely remotely copy
- mv  move a folder: **`mv from_folder to_folder`**
- rm  remove a folder: **`rm -r myfolder`**
- rmdir  remove an empty folder: **`rmdir myfolder`**
- ">"  write to file (removes existing content, if any)
- ">>"  append to file

![image](https://hackmd.io/_uploads/SJR82iHIa.png)

### 3.3 Read files and change file properties (on documents)

- cat  print content on screen: **`cat myfile.txt`**
- touch  create an empty file: **`touch myfile.txt`**
- nano  edit a file using nano: **`nano myfile.txt`**
- rm  delete a file: **`rm myfile.txt`**
- cp  copy a file: **`cp myfile.txt mycopy.txt`**
- mv  rename a file: **`mv myfile.txt mycopy.txt`**
  Move a file one folder up: **`mv myfile.txt ../`**
  Move a file to the home folder: **`mv myfile.txt ~`**
- head  print first part, show the first lines of a file
- tail  print last part, show the last lines of a file
- less  browse content, navigate through the content of a file
- tar  compress or extract files
- chmod  change file permissions
- wc  count words, lines and/or characters
- man  info about a command
- |  pipe the output of one command to serve as input for the next

### 3.4 Create an executable script

Creating an executable script has two steps:

- Create a script
- Allow the script to execute

As an example, we create a script called do_it.sh:

`nano do_it.sh`

Write content to the script:

```
#!/bin/bash
echo "Hello!"
ls | rev
```

Use CTRL-X to start to exit, then press y to save the file, then press enter to use the current filename.

Use chmod to make the file executable:

`chmod +x do_it.sh`

If you want to protect your data from being modified accidentally, chmod can create read-only files by removing the write permission using chmod -w.

Run the script:

`./do_it.sh`

## 4. Module system

Bianca is a shared Linux computer with all the standard Linux tools installed, on which all users should be able to do their work independently and undisturbed.

To ensure this, users cannot modify, upgrade or uninstall software themselves. Instead, an environment module system (from now on: 'module system') is used. This allows users to independently use their favorite versions of their favorite software.

To have new software installed on Bianca, users must explicitly request a version of a piece of software. As of today, there are roughly 800 programs and packages, with multiple versions available on all UPPMAX clusters. Using explicit versions of software is easy to do and improves the reproducibility of the scripts written.

![image](https://hackmd.io/_uploads/HkdhhoHLT.png)

## 5. Transferring files from/to Bianca

![image](https://hackmd.io/_uploads/ry9ly3SIa.png)

### 5.1 The wharf location on Bianca

The wharf is a folder on Bianca. Once you are logged into your project's cluster, the path to this folder is, e.g.:

```
/proj/sens2023598/nobackup/wharf/myuser/myuser-sens2023598
```

To transfer data from Bianca, copy the files you want to transfer here.

To use files that were transferred to the wharf area from outside, move them to your project folder or home folder.
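The example path above appears to follow a fixed pattern built from the project ID and the user name. The sketch below assembles that path in a shell variable; `myuser` and `sens2023598` are the placeholder values from the example, and the `my_results` folder in the comment is hypothetical.

```shell
#!/bin/bash
# Sketch: build the wharf path from a user name and a project ID.
# Both values are placeholders taken from the example path above;
# replace them with your own user name and project.
user="myuser"
project="sens2023598"

wharf="/proj/${project}/nobackup/wharf/${user}/${user}-${project}"
echo "${wharf}"

# On Bianca, files copied into this folder become available for
# transfer out of the cluster, e.g. (my_results is a hypothetical folder):
#   cp -r my_results "${wharf}/"
```

Keeping the path in a variable avoids retyping the user name and project ID, which is where most typos in this path tend to happen.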
### 5.2 Transferring method

- GUI sftp clients
  * WinSCP
    ![image](https://hackmd.io/_uploads/SkoVGrxoex.png)
    * host: bianca-sftp.uppmax.uu.se (pasting in sftp://bianca-sftp.uppmax.uu.se gives the same result)
    * username: yingxiao (in the past, Bianca used the form yingxiao-simp2023018)
    * password: WITHOUT the authenticator code (you will be asked for the 6-digit authentication code later)
  * FileZilla (example in the practical session)
- Using the standard command line sftp client

  sftp commands: https://www.uppmax.uu.se/support/user-guides/basic-sftp-commands/

  ```
  $ sftp -q pmitev-sens2023598@bianca-sftp.uppmax.uu.se
  pmitev-sens2023598@bianca-sftp.uppmax.uu.se's password:
  sftp> ls
  pmitev-sens2023598
  sftp> cd pmitev-sens2023598
  sftp>
  ```

  The -q flag makes sftp quiet (it hides the banner intended to help someone trying to ssh to the host). If your client does not support it, you can just skip it.

  As password, use your normal UPPMAX password directly followed by the six digits from the second factor application.

  Alternatively, you can specify the folder at the end of the sftp command, so that you always end up in the correct folder directly.

  `$ sftp -q myuser-sens2023598@bianca-sftp.uppmax.uu.se:myuser-sens2023598`
- Transit server from/to Rackham (screenshot example)
  ![image](https://hackmd.io/_uploads/Sy5m-hBIp.png)
  * Open MobaXterm.
  * `ssh yingxiao@transit.uppmax.uu.se`
  * Enter your password (without the 6-digit authenticator code).
  * At this stage, you can manually drag your folder/file to the left sidebar of MobaXterm (the SSH browser icon, the fourth one counting vertically).
  * `mount_wharf sens2018585`
  * Enter your password (with the 6-digit authenticator code).
  * Note that a folder named sens2018585 appears in the left sidebar. You can drag a folder directly into it, and it will show up in the wharf on Bianca!
  * Or you could do it the hard way with rsync or scp, as in the screenshot. You need to specify both the path on transit and the path on Bianca correctly.
- Transit between projects (screenshot example), using cp
  ![401c5e8524d843c0e92b0d1a8df1deb](https://hackmd.io/_uploads/S13qZhHUp.png)
- Mounting the wharf on your local computer

## 6. Nodes

![image](https://hackmd.io/_uploads/SkxjQ3SIa.png)

A computer cluster is a machine that consists of many computers. These computers work together. Each computer of a cluster is called a node.

* A node: 128 GB RAM and a 4 TB local disk, aka "SCRATCH".
* Cores per node: 16
* Memory per core: 7 GB available for you

There are three types of nodes:

- login nodes: nodes where a user enters and interacts with the system
- calculation nodes: nodes that do the calculations
- interactive nodes: a type of calculation node, where a user can do calculations directly

![image](https://hackmd.io/_uploads/HJBQIsBUa.png)
![image](https://hackmd.io/_uploads/r1H34hHI6.png)

`interactive -A sens2023598 -p core -n 2 -t 8:0:0`

Each node contains several CPU/GPU cores, RAM and local storage space. A user logs in to a login node via the Internet.

![image](https://hackmd.io/_uploads/Byb3QsHUa.png)

# Working in Bianca (practically at FNS)

## Introduction

Dear reader, this guide will teach you how to access and utilize a number of supercomputer services located in Uppsala, at the "Uppsala Multidisciplinary Center for Advanced Computational Science", also known as "UPPMAX". UPPMAX consists of several supercomputer clusters. The two you will become acquainted with in this SOP are called "Bianca" and "Rackham".

Bianca is an offline computer cluster which allows for high-security, high-processing-power computing of sensitive data (such as data from human blood samples). Since Bianca is offline, the user is required to transfer the data they want to work with to Bianca in a two-step process.

Rackham is an online computer cluster. Unlike Bianca, it is not set up for use with sensitive data, but it can be used to compute other types of data (such as data from blood samples in animal studies).

## 1. Setting up SNIC account and installing utility software

#### Setting up a SNIC account

Before you can start working in Bianca, you need to set up an account on SNIC (available at SUPR.SNIC.SE). After creating an account, you need to apply to a project (ask your PI for the name of the project you will be part of).

Instructions for how to create an account are found here:

* https://www.uppmax.uu.se/support/getting-started/applying-for-a-user-account/

Remember that it is possible to log in using your ChalmersID via SWAMID.

Now that you have an account, you can apply to a project by pressing the projects button in SNIC, entering the project ID you will be part of, and awaiting PI approval (after you have created your account, the PI can also add you directly). It might take up to a few days after approval before you have access to the resource.

* The next step is to download and set up all the software you need:
  * [Two factor authentication](https://www.uppmax.uu.se/support/user-guides/setting-up-two-factor-authentication/)
  * [FileZilla](https://filezilla-project.org/) (Download through Firefox to avoid virus warnings)
  * [ThinLinc client](https://www.cendio.com/thinlinc/download)
  * [MSConvert](http://proteowizard.sourceforge.net/download.html)

Having downloaded this software, the steps we'll take you through below are the following:

* Logging in to Bianca
* Transferring LC-MS data to Bianca
* Transferring R packages to Bianca
* Transferring data from Rackham to Bianca
* Using Bianca to run RStudio and R scripts

## 2. Using Bianca

### 2.1 Login to Bianca

1. **In order to log in to Bianca you need to be accessing it from behind a SUNET IP**, meaning that you either have to be connected through the university network or use a direct VPN connection ("AlwaysOn VPN" does not work for this).
2. Go to (bianca.uppmax.uu.se/) and log in with username and password (using two factor authentication)

### 2.2 Interactive mode (working in RStudio in Bianca)

Note that there are some instructions for using RStudio in Bianca: https://www.uppmax.uu.se/support/user-guides/r-user-guide/

Depending on what you want to do, you might need to use specific versions of R packages. Bianca updates its versions of a set of popular packages now and then. To see which versions are available, you can visit this webpage: https://www.uppmax.uu.se/support/user-guides/r_packages-module-guide/

To use a specific version, replace all "4.0.0" parts in the steps below with the version you are interested in using.

1. Log in again with your normal username (no sens-addition) and password (without two factor authentication)
2. Open a terminal in Bianca and type in:

   ```
   # Press Enter after each command
   interactive -A sens2018586
   module load R_packages/4.0.0 RStudio
   ```

   If this doesn't work, try: `module load R_packages/4.1.1 RStudio/2022.02.0-443` (hit enter)

   More complex: `interactive -A sens2023598 -p core -n 2 -t 8:0:0`

   If I want more days (here 5 days and 8 hours): `interactive -A sens2023598 -p core -n 2 -t 5-8:0:0`

   Then open RStudio: `rstudio &` (hit enter)
3. When in Bianca using R, type .libPaths(). This will tell you where your R packages are stored. In my case I’ll have to go to File system and to the folder “castor”.
4. To find the files transferred to "the Wharf" in Bianca, go to "proj/sensXXXXXXX/nobackup/wharf/". In there you will find all the files you have transferred. Make sure to move the files from this location to a folder with your name under "castor/project/proj/", because if you leave them in the wharf while working on them, no backups are made of anything you do.

### 2.3 How to submit an R-script as a job

For computationally intensive tasks, e.g. running IPO or building MUVR models, we strongly recommend that you submit a job instead of running the code in R.

1. Go to 'Applications' -> 'Accessories', open the software 'emacs' and create a new file in your castor-project folder.
3. Follow the image below, i.e. specify username, email address and name of R script. Note that the image shows “-p core -n 14”; change 14 to 16 to have more processing power. Also, change “-t 40:00:00” to “-t 240:00:00” whenever you run code that may take days to run. Ask your supervisor for the draft code for this file.
4. Then go to the folder that contains both the R files you want to run and the file you filled out (the image); yes, they need to be in the same folder. I named the file “Please”. Right click on the background and select “Open Terminal Here”
5. Type in: module load R_packages/4.2.1 (hit enter)
6. Type in: “sbatch Please”, where “Please” is the name of the file shown below; change the name to what you prefer
7. Then you will receive an email to the specified email address when the job has started and another email when the code has succeeded or failed to run (hopefully it won´t fail) 😊

#### Important when working with R 4.3 and XCMS

If you ever get an error mentioning mzR and netcdf and/or hdf, here's what to do:

- Prior to **step 5** (the `module load R_packages` step), preload the netcdf and hdf resources:

  ```
  module load netcdf/4.9.2 hdf5/1.14.0
  ```

- This ensures that the needed netcdf capability is loaded prior to loading packages (like 'mzR') which depend on it

Example job file:

```
#!/bin/bash -l
# Project (account) that the job is charged to
#SBATCH -A simp2023018
# Partition and number of cores
#SBATCH -p core -n 14
# Maximum run time (hh:mm:ss)
#SBATCH -t 40:00:00
# Job name and output file
#SBATCH -J testMUVR.R
#SBATCH -o testMUVR.out
# Email notifications about the job
#SBATCH --mail-type=all
#SBATCH --mail-user=yingxiao@chalmers.se

module load R_packages/4.1.1
R --vanilla < testMUVR.R
```

![](https://i.imgur.com/O5aqedB.png)

## 3. Transferring data

### 3.1 Transfer raw LC-MS data into Bianca

TODO: add a guide for converting files from raw format to .mzML here.

All instructions below use example usernames and host names; replace these with accurate information for your project.
**Log in to FileZilla:**

* host: sftp://bianca-sftp.uppmax.uu.se
* username: eddie-sens2018586
* password with authenticator (e.g., abcdefghijkl123456, where the first letters represent your normal password and the last 6 digits represent the changing code in your Google Authenticator)

In FileZilla we can transfer LC-MS files or R scripts into the “wharf”, from which they can then be moved into Bianca. The "wharf" is a mediator between the online environment and the “offline environment” (Bianca). Once the files are in the wharf, you can drag and drop them into Bianca.

Once you are logged into FileZilla you will see something like this:

![](https://i.imgur.com/bdYzR5S.png)

Please note that you only have access to your own named folder in the wharf. Please make sure you have entered the correct folder, as shown in the picture.

* Simply find the files of interest on your computer, i.e. the left hand side of the image, and drag and drop them into the right hand side (wharf).
* On Bianca, you can usually find the files under: /proj/projectname/nobackup/wharf/yourname/yourname-projectname/
* Now we are done. Remember that FileZilla is your “go to” whenever you want to put files into or extract files from Bianca.

Since Bianca is an offline server, you cannot copy-paste e.g. R scripts and send them by email. If you want to e.g. work on your R scripts in regular RStudio and not in Bianca, simply put those scripts into the wharf folder and download them using FileZilla. Within Bianca, the GUI works much like Windows, and copy-paste works as usual.

### 3.2 Transfer R packages to Bianca

As we have mentioned, Bianca is an offline server, and due to this it may not contain all the necessary packages and libraries for your research project. In order to transfer R packages to Bianca we have to use Rackham (FileZilla doesn't work for this purpose).

1. Log in to ThinLinc (desktop version) without using the authenticator. Server name is: rackham-gui.uppmax.uu.se.
2. Ctrl F8 to exit full screen.
3. Open a terminal and type in:
   * `module load R_packages/4.0.0 RStudio` (this should correspond to the R version in Bianca; in the new version of ThinLinc, it seems you cannot use 4.0.0 but need to write 4.1.1)
   * `rstudio &`
   * If you cannot open RStudio using the code above, go to XQuartz or MobaXterm, run `ssh -X yingxiao@rackham.uppmax.uu.se` and enter the password without authentication. Then open RStudio there and continue.
4. Go online and download the packages you want in R (you can download the packages directly within RStudio on Rackham). Examples:

   From CRAN: `install.packages("")`

   From GitLab: First,

   ```
   install.packages("remotes")
   library(remotes)
   ```

5. Then install what you want, for example:
   * `install_gitlab('CarlBrunius/MUVR')` (the original MUVR, the old version)
   * `install_gitlab('CarlBrunius/MUVR@nameofbranch')`
   * `install_gitlab('CarlBrunius/MUVR@MUVR2')` (old MUVR2 with error during installation)
   * `install_gitlab('YingxiaoYan/MUVR2')` (the latest version of MUVR2, development version)
   * `install_github('MetaboComp/MUVR2')` (the latest version of MUVR2, public version)
6. Run the script in R via ThinLinc
7. Use .libPaths() to find the location of the packages
8. Go into the home folder, i.e. Home/eddie
9. Find the packages according to what .libPaths() shows; in my case it's in “file system/domus…”
10. Open a terminal in that folder (the folder that contains the downloaded package folder) (right click->"Open terminal here")
11. Follow steps 1-4 in “Transfer from rackham to Bianca” (shown below)
12. Type in “put (name of your package)”, e.g. `put MUVR`. In this step, if you change the version of a package, or if you already have another package with the same name in your Bianca wharf, you need to delete that package first before you "put" something.
13. Move to Bianca, click File system, click the "proj" folder, then "nobackup", then go all the way in to wharf and your name-folder until you find the R packages you downloaded there.
(It seems you need to delete the original file if you try to download a file of the same name.)
14. Transfer (Ctrl X or Ctrl C, it is up to you) the packages to castor/project/home/eddie/R/the_R_version_you_are_working_with/ **Be aware of the R version here. It should be one specific version all the time.**
15. Now you are done 😊

### 3.3 Transfer from Rackham to Bianca

You can also use Rackham to transfer other files, if you want to.

1. Type: `sftp -r elise-sens2018586@bianca-sftp.uppmax.uu.se`
2. Type in your UPPMAX password + the 6-digit two-factor authentication code for UPPMAX
3. You are now in sftp mode
4. Use cd and pwd to check where you are on Bianca. Remember to change the directory to your own wharf, otherwise you cannot transfer the files. Type `cd /eddie-sens2018586` (in my case) (`cd /elise-sens2018586` for my supervisor). Note that there is a blank between "cd" and "/".
5. If you went into the wrong folder on Rackham and want to change it, you can use lcd and lpwd (l means local, i.e. Rackham), or you can quit sftp by using exit.
6. Remember to change the working directory from / to "your name"

### 3.4 Another method to transfer R packages to Bianca

1. Download the tar.gz file
   ![image](https://hackmd.io/_uploads/BJ4qegIRT.png)
2. Transfer it through FileZilla (see 3.1)
3. Move the tar.gz file to your working directory (e.g. where you have all your R scripts)
4. Open the terminal in your working directory and open RStudio
   `ml R_packages/4.1.1 RStudio`
   `rstudio &`
5. Install the package in your RStudio, using the full path of the directory where the tar.gz file is located. (This directory should be the same as your working directory; it is also the directory from which you opened RStudio.)
   `install.packages("/castor/project/proj/Yan_threecohort/triplotgui-main.tar.gz")`
6. The package will be automatically installed to the library folder where your R packages are usually stored.
This means that if you already have a package with the same name in that folder, it needs to be deleted before installing.

## 4. Tricks/solving issues in using Bianca

### 4.1 slurm master

This consists of 2 parts. One is the R script which you want to submit as sbatch jobs and the other is your slurm master R script. Note that they need to be run within your project and in the same directory.

Copy-paste the following code into 2 R scripts to try it out.

### In the slurm master R script - example

    #Specify job parameters
    account <- "sens2018586"
    partition <- "core -n 14"
    time <- "0:30:00"
    #time <- "10:00"
    jobTemplate <- "myJob"
    mail <- "elise.nordin@chalmers.se"
    rVersion <- "4.0.0"
    rScript <- 'R_script.R'
    #Specify the R script you want to run - with possible arguments
    #Where the argument can be e.g., a specific y matrix column vector
    #Let's say I want to make 20 MUVR analyses (one for each column in Y; each with the same predictor matrix X)
    load("Y_matrix.rda")
    #IMPORTANT: argMax controls how many values of i you have in the next script.
    argMax <- ncol(Y_matrix)
    #Loop through all the different columns in Y
    for (rArgument in 1:argMax) {
      jobText <- paste('Rscript --vanilla', rScript, rArgument)
      #combine with loading modules
      wrapText <- paste0('module load R_packages/', rVersion, '; ', jobText)
      #combine into sbatch command
      job <- paste0(jobTemplate, rArgument)
      sbatchText <- paste('sbatch', '-A', account, '-p', partition, '-t', time, '-J', job, '-o', paste0(job,'.txt'), '--mail-type=all', paste0('--mail-user=', mail), '--wrap', paste0('"', wrapText, '"'))
      #push sbatch job to slurm
      system(sbatchText, wait = FALSE)
    }

### In the want-to-be-submitted Rscript - example

    rm(list=ls())
    #Import counter argument
    #IMPORTANT: i is specified here; its range is controlled by argMax in the previous script
    i <- as.numeric(commandArgs(trailingOnly=TRUE))
    library(MUVR)
    data("freelive") # Dataset in MUVR
    X_matrix <- XRVIP[,1:10]
    y1 <- YR
    y2 <- sample(YR)
    Y_matrix <- cbind(y1,y2)
    save(Y_matrix, file="Y_matrix.rda")
    #set up modelling parameters
    repMult <- 1
    nOuter <- 6
    varRatio <- 0.75
    method <- 'RF'
    i
    #perform the actual analysis
    #IMPORTANT: use i to separate/specify each loop
    Y <- Y_matrix[,i]
    model <- qMUVR(X = X_matrix[!is.na(Y),], Y = Y[!is.na(Y)], method = method, varRatio = varRatio, repMult = repMult, nOuter = nOuter)
    #save model for later
    save(model, file=paste0('FA',i,'-Model.rda'))

### 4.2 Errors in RStudio

Sometimes RStudio malfunctions (reason unknown) so that your actual cursor is not where it appears on the screen, which makes editing an R script very difficult. In this case, you can click Tools --> Global options --> Appearance and then choose a different font under "Editor font". Hopefully the problem will be solved.

### 4.3 Language change may cause issues

You cannot change the language through the keyboard in Bianca. To change the language, click outside the Bianca window, change the language there and then go back inside Bianca. The language will then be changed.

### 4.4 Copy stuff from outside Bianca to Bianca

Copy what you have outside Bianca to the clipboard.
Then, when you press Ctrl+V in Bianca, the text will be automatically pasted. This way you can easily move code from outside Bianca into it.

![image](https://hackmd.io/_uploads/SyOIMhrUa.png)

### 4.5 If the bar for items does not show up

You can click the middle mouse button and choose among the active windows. Then you can reconfigure the bar and other items through the menus.

![image](https://hackmd.io/_uploads/S13Qi9YUa.png)

## 5. Important command lines/code

* Bianca cannot show results from MUVR plots, therefore you need to save the image as a png file, e.g.
  ![](https://i.imgur.com/iTWZFCk.png)
* Do not open several terminals at the same time! (It will cause your Bianca session to be slow and crash.) As common practice, only work using one terminal. If you want to access another folder, close the current terminal and open a new one.
* If you want to cancel an ongoing job to begin anew, type: "scancel -i -u username"
* If you want to see all ongoing jobs, type: "jobinfo -u username"
* There is no limit on how many jobs you can run simultaneously. However, every project has a number of processor core hours to use. When these are used up you can still submit jobs and run code, but your jobs will be down-prioritized.
* Write to UPPMAX support if you have struggles, they are very nice: support@uppmax.uu.se
* Some useful links:
  * [Slurm user guide](https://www.uppmax.uu.se/support/user-guides/slurm-user-guide/)
  * [First log in to UPPMAX](https://www.uppmax.uu.se/support/user-guides/guide--first-login-to-uppmax/)
  * [Bianca user guide](https://www.uppmax.uu.se/support/user-guides/bianca-user-guide/)
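
To get a feeling for the core-hour budget mentioned in the list above, the sketch below estimates an upper bound on what a job request can consume. It assumes that usage is counted as the number of cores multiplied by the wall-clock hours, and it reuses two requests that appear in this guide; the helper function `core_hours` is hypothetical and not an UPPMAX tool.

```shell
#!/bin/bash
# Sketch: upper bound on the core hours a job request can consume,
# assuming usage is counted as cores x wall-clock hours.

# Hypothetical helper: multiply the number of cores by the number of hours.
core_hours() {
  local cores="$1" hours="$2"
  echo $((cores * hours))
}

# "-p core -n 14 -t 40:00:00": 14 cores for up to 40 hours
batch_job=$(core_hours 14 40)

# "-p core -n 2 -t 5-8:0:0": 2 cores for up to 5 days and 8 hours
interactive_job=$(core_hours 2 $((5 * 24 + 8)))

echo "batch job:       ${batch_job} core hours"
echo "interactive job: ${interactive_job} core hours"
```

A job that finishes early is expected to use less than this bound, so requesting a generous time limit mainly affects how the job is scheduled.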