# CVVT User Manual

## Hosts

| Node | IP |
| -------- | -------- |
| CPU-0 | 10.64.245.20 |
| CPU-1 | 10.64.245.18 |
| GPU-0 | 10.64.245.16 |

## Hardware Details

| Model | iDRAC MAC | Embedded NIC MAC | Default password | Port numbers on S3148 (iDRAC, embedded NIC) |
| -- | -- | -- | -- | -- |
| Dell PowerEdge R7525 | EC:2A:72:0F:B8:56 | EC:2A:72:02:BF:32 | calvin | (5,4) |
| Dell PowerEdge R6515 | D0:8E:79:CF:99:B8 | D0:8E:79:CF:99:BE | KVAE4EVPMAY | (3,2) |
| Dell PowerEdge R6515 | D0:8E:79:CF:B6:48 | D0:8E:79:CF:B6:4E | VVWE3KM33EKM | (1,0) |
| Dell S3148 | -- | 34:73:5A:10:BB:DC | -- | -- |
| Dell EMC ME4084 | -- | 00:C0:FF:65:AA:40, 00:C0:FF:65:AA:56 | -- | (null, 14,15) |

## Monadic Access

## Overview

Monadic Access (Monadic) is a workload manager that aims to reduce operational complexity so users can focus on innovation and discovery. Monadic provides three ways to access and use computing resources, each designed for a different scenario users face in their daily tasks.

### 1. Browser-based, persistent shell sessions.

Persistent shell sessions let users perform non-compute-intensive tasks, such as data management, monitoring, text editing, and work packaging, alone or together with colleagues, all from a web browser. A session starts by logging the user in to a chosen host via SSH; any machine the user is authorized on can be used. Once established, a session persists until the user or an administrator terminates it. That means you no longer have to babysit your computer after accidentally launching a long-running command over SSH.

:::warning
You must have an account on the host you SSH to. Monadic does not automatically create user accounts on any machine. This keeps Monadic minimally invasive and leaves system administrators maximum flexibility in managing the machines.
Moreover, users who log in directly via SSH with accounts that administrators created manually are not bound by the resource limits set by Monadic; system administrators must manage these accounts separately.
:::

To access persistent sessions, click "sessions" on the sidebar. The table lists the active sessions available to you. To establish a new session, click the "+SESSION" button in the upper right corner.

![](https://i.imgur.com/oqxIusB.png)

Once a new session is created, you can access the shell by clicking the "attach" button. You may have to allow pop-up windows from the website.

You can share a session with others by clicking the share button, which brings up a menu where you choose between read-only and read-write mode. Once you click the button, a link is copied to your clipboard, which you can send to the desired person. A read-only share means the other party can only view your shell session. A read-write share means the other party can also type input into the shared shell.

The "Attached" column shows how many browser windows are currently open for the given session.

:::info
Administrators can attach to all sessions in read-write mode regardless of the owner.
:::

### 2. Asynchronous Job Queuing / Scheduling.

The main use of Monadic is to queue long-running jobs on the computation cluster. To do so, click the "New Job" button on the sidebar, which takes you to a submission form.

![](https://i.imgur.com/etvJ9Gd.png)

You can specify an optional job name to identify your job. The job name is inserted into the job identifier, which is constructed from your user name, the optional job name, and the numeric job id. You must specify the Docker image your submission requires; this image must be accessible by all worker nodes.
You can optionally specify any number of mount points to mount into the container. There is one limitation: the source file or directory must be inside your home directory, so its path must start with `$HOME`. The Docker command field is the entry command that will be executed. It accepts any shell one-liner, for example `for i in $(seq 0 10); do echo $i; done`. The maximum duration your job can run and the number of GPUs it may acquire are determined by the administrator.

### 3. Interactive Jupyter Lab sessions

Monadic allows users to request interactive Jupyter Lab sessions on worker nodes. To request one, click the "Request session" button on the sidebar and fill out the form.

![](https://i.imgur.com/3BB8IJh.png)

Whenever a Jupyter Lab session is created, you must assign it a password. You will use this password to unlock and use the session. As with asynchronous jobs, you can specify mount points to directories in the Jupyter Lab environment. Unlike with asynchronous jobs, your home directory is mounted at `/home/$username/work`. To mount data, add a mount point with source `$DATA` and destination `/mnt`. After the Jupyter session is connected, open a terminal and use the following command to create a symbolic link for viewing the data.

```bash=
# ./mnt already exists as a directory when ln runs, so the link is created at ./mnt/mnt
mkdir ./mnt; ln -s /mnt ./mnt
```

In addition to mount points, you can specify a list of ports to listen on inside the session's container. Monadic opens companion ports on the worker node and redirects their traffic to the ports in the container.

![](https://i.imgur.com/QoMLaom.png)

Once your session is available, you must manually click the "Adopt" button on the "Jupyter Notebook" page within a preset amount of time. If you do not adopt the session within that timeframe, it is canceled and its resources are released for others.
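Because mount sources have to start with your home directory, it can help to check a path before filling in a form. The following is a minimal sketch and not part of Monadic: the helper name `is_under_home` is hypothetical, and it assumes you enter the expanded absolute path as the mount source. It is a plain prefix comparison, so it does not resolve symbolic links or `..` components.

```bash
#!/bin/bash
# Hypothetical helper (not provided by Monadic): succeed when the given path
# is $HOME itself or lies somewhere below it.
is_under_home() {
    local path="$1"
    local home="${HOME%/}"   # drop a trailing slash from $HOME, if any
    case "$path" in
        "$home"|"$home"/*) return 0 ;;   # exact match or a path below $HOME
        *) return 1 ;;                   # anything else is outside $HOME
    esac
}
```

It can be used as a guard, for example `is_under_home "$SRC" || echo "mount source must be under $HOME"`, where `SRC` is an illustrative variable holding the path you intend to mount.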
If you have successfully adopted a session and requested port redirection, click its row to expand the details. The "Address:Port" column shows the IP address and port number on the worker node; traffic sent there is redirected to the port listed in the "Container Port" column.

## Common daily task recipe

### Accessing, uploading and modifying storage

Storage can be mounted and accessed within an interactive Jupyter Notebook session. When requesting the session, add a mount point with the source "$DATA" and the destination "/mnt". You should then be able to access the dataset under "/mnt".

:::warning
The file mode determines whether you can access, read, or write files and folders under the mount point. If you have trouble reading or writing files, please contact your system administrator and ask them to adjust the file permissions and ownership.
:::

### Docker Repository

A Docker image provides a reproducible and isolated execution environment. To create one, first open a shell and create a new folder for the image.

```bash
mkdir project_0
```

Define your `Dockerfile` using your favourite text editor. Detailed documentation can be found [here](https://docs.docker.com/engine/reference/builder/). A sample is given below; it should satisfy most research purposes.

```dockerfile!
FROM REPOSITORY:TAG
# e.g.
# nvcr.io/nvidia/pytorch:22.08-py3 pulls a PyTorch image from NVIDIA

# a good idea to place all code in a predefined directory to avoid confusion later
RUN mkdir -p /workspace
WORKDIR /workspace

ENV MY_ENVIRONMENT_VARIABLE_KEY=MY_ENVIRONMENT_VARIABLE_VALUE

# an update is required if you need to install packages with apt;
# -y answers the confirmation prompt, which would otherwise abort the build
RUN apt update && apt upgrade -y
RUN apt install -y MY_PACKAGE_0 MY_PACKAGE_1
RUN pip install PYTHON_PACKAGE_0 PYTHON_PACKAGE_1

# both paths are relative; CONTAINER_PATH cannot be /
COPY HOST_PATH CONTAINER_PATH

# default command to run when launching the container
CMD python MY_SCRIPT.py
```

The folder structure should look like this on the host:

```
- home
  - project_0
    - Dockerfile
    - myscript.py
    - requirements.txt
    - src
      - ...
```

Lastly, from inside the directory that contains the `Dockerfile`, run the following command:

```bash!
docker build -t REPOSITORY:TAG .
```

### Direct FTP access

Prerequisites:

- You have bidirectional layer 3 connectivity to the **head** node (10.64.245.22)
- You have a valid Monadic account (you can log in to Monadic)
- You have an FTP client (either a terminal client or a GUI application)
- Since FTP uses multiple ports for data transfer, the following ports **MUST NOT** be blocked
    - 20
    - 21
    - 10000-12000

:::info
If you encounter a credential error, please verify that you have bidirectional connectivity to `10.64.245.22`. The `ping` command is usually sufficient for this check, unless ICMP is blocked by your firewall.
Please also make sure the firewall on your personal computer or workstation is not blocking FTP connections.
:::

#### Using GUI applications

- Set the hostname to 10.64.245.22
- Use your Monadic username and password as credentials
- Follow the instructions of your application of choice

#### Using Terminal

To connect to the FTP server from a terminal, use the following command to initiate a connection:

```
ftp 10.64.245.22
```

You will see the following prompt asking for your username and password:

```
Connected to 10.64.245.22.
220 Monadic FTP service.
Name (10.64.245.22):
```

To upload files to the FTP share, first use the command `cd` to change into the directory you wish to upload to, and then use the command "put *filename*" to upload the file.

To download from the FTP share, use the command "get *filename*" to retrieve the file to your local directory.

### System Monitoring with Grafana and Prometheus

Grafana is specifically configured to view GPU utilization. Prometheus can be queried with the [Prometheus query language](https://prometheus.io/docs/prometheus/latest/querying/basics/).

## Parabricks

Reference: https://docs.nvidia.com/clara/parabricks/3.8.0/WhatsNew.html

Syntax:

```
pbrun $PARABRICK_TOOL $PARAMETERS $OUTPUT
```

Example:

```bash=
pbrun \
    fq2bam \
    --ref parabricks_sample/Ref/Homo_sapiens_assembly38.fasta \
    --in-fq parabricks_sample/Data/sample_1.fq.gz parabricks_sample/Data/sample_2.fq.gz \
    --out-bam output.bam
```

# Deprecated 30 Sep 2022

## User Creation and Management

On all machines:

```bash=
# bash reserves $UID (read-only, the current user's ID), so the new user's ID uses a different name
sudo useradd -u "$NEW_UID" "$USERNAME"
sudo usermod -aG docker "$USERNAME"
```

Be aware that the username and UID must be consistent across all machines.

## Storage Access

On CPU nodes:

```bash=
ls /rmount
```

## Docker

On GPU nodes:

```bash=
docker run -d --gpus \"device=${CUDA_VISIBLE_DEVICES}\" --ipc host --name ${CONTAINER_NAME} $RUN_COMMAND
```

For example, to display GPU information:

```bash=
docker run -it --gpus '"device=0,1"' --ipc host --name cuda-test nvidia/cuda nvidia-smi
```

## Slurm

First, fill in the blanks in the following sample script `sample.bash`.
```bash=
#!/bin/bash
#SBATCH --gres=gpu:N # N = number of gpus
#SBATCH --output=/home/USERNAME/%A.out # output log file
#SBATCH --error=/home/USERNAME/%A.err # error log file
#SBATCH --account=USERNAME

# start the container in the background on the GPUs Slurm assigned to this job
docker run -d --gpus \"device=${CUDA_VISIBLE_DEVICES}\" --ipc host --name ${CONTAINER_NAME} $RUN_COMMAND

# poll every 5 seconds and finish the job once the container is no longer running
while true; do
    CONTAINER_STATUS=`docker ps | grep -w ${CONTAINER_NAME}`
    if [ "${CONTAINER_STATUS}" = "" ]; then
        exit 0
    fi
    sleep 5
done
```

Then, on CPU node 0, execute the following command: `sbatch sample.bash`
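The polling loop in `sample.bash` always exits with status 0, so the batch script reports success even when the container itself failed. One possible alternative, sketched here under the assumption that the container was started with `docker run -d` as in the script, is `docker wait`, which blocks until the container stops and prints its exit code. The function name `wait_for_container` is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: block until the named container stops, then return the
# container's exit code as the function's own status.
wait_for_container() {
    local name="$1"
    local code
    # `docker wait` blocks until the container stops and prints its exit code;
    # if the docker command itself fails, report a generic failure.
    code="$(docker wait "$name")" || return 1
    return "$code"
}
```

In the batch script, the `while` loop could then be replaced by `wait_for_container "${CONTAINER_NAME}"` followed by `exit $?`, so that the script's exit status mirrors the container's.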