
Running Cellpose/napari on Janelia's Cluster

1) Log in to Janelia's NoMachine via a browser

You must be on the campus' secure network or the VPN to do this.

Enter your Janelia login credentials to log in.
Then start a new virtual machine:

[Animation: logging in to NoMachine and starting a new virtual machine]

If this process fails, like it does here:

[Animation: the virtual machine failing to launch]
Then you most likely need to have your user account added for cluster access.

2) Install napari/Cellpose

a) Open a terminal instance

*) If you don't already have micromamba installed, strongly consider installing it (if conda is already installed, that will work instead, but it is slower):

Run the following in the terminal (simply press Enter whenever it asks for input):

"${SHELL}" <(curl -L micro.mamba.pm/install.sh)

Then close the terminal and open a new instance. Run the following commands:

micromamba config append channels conda-forge
micromamba config append channels nvidia
micromamba config append channels pytorch
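These commands write to your ~/.condarc, so as an optional sanity check you can confirm the channels were recorded:

# ~/.condarc should now list conda-forge, nvidia, and pytorch under "channels".
cat ~/.condarc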

b) Start an interactive GPU bash session on the LSF cluster to install from:

In the terminal run the following:

bsub -n 4 -gpu "num=1" -q gpu_short -Is /bin/bash

Wait for the job to launch:

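If you'd like to check on the job from a second terminal while you wait, LSF's bjobs command lists your jobs (this is optional):

# The interactive session shows as PEND until a GPU node is free, then RUN.
bjobs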

The steps below are very similar for both Cellpose and napari, but I will split them up for clarity. First, napari:

c.napari) Create and activate a new micromamba environment:

If you're using conda instead, just replace micromamba in the following commands with conda

Run:

micromamba create -y -n napari-env python=3.10

Once that has completed, run:

micromamba activate napari-env

d.napari) Install:

Run:

python -m pip install "napari[all]"

For more help, see the napari installation instructions.
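As an optional sanity check, you can confirm napari imports cleanly and print its version from the same session:

# Should print the installed napari version without errors.
python -c "import napari; print(napari.__version__)"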

Now for Cellpose:

c.cellpose) Create and activate a new micromamba environment:

If you're using conda instead, just replace micromamba in the following commands with conda

Run:

micromamba create -y -n cellpose-env python=3.10

Once that has completed, run:

micromamba activate cellpose-env

d.cellpose) Install:

Run:

python -m pip install "cellpose[gui]"

For more help, see the Cellpose installation instructions.
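Optionally, you can confirm the install and check that PyTorch (pulled in as a Cellpose dependency) can see the GPU in this session:

# Should print the installed Cellpose version.
python -m cellpose --version
# Should print True if PyTorch can reach the GPU on this node.
python -c "import torch; print(torch.cuda.is_available())"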

e) End the GPU session by running exit

3) Run an interactive GPU session for napari/Cellpose:

napari:

i) Activate the environment with micromamba activate napari-env

ii*) Launch the job with bsub -n 12 -gpu "num=1" -q gpu_short -Is napari

cellpose:

i) Activate the environment with micromamba activate cellpose-env

ii*) Launch the job with bsub -n 12 -gpu "num=1" -q gpu_short -Is cellpose

*) Options for the job can be configured as follows (a combined example appears after this list):

  • include -B to receive email notifications associated with your jobs.
  • -n 12 corresponds to 12 CPUs for the job. Each CPU is generally allocated 15GB of RAM (memory).
  • -gpu "num=1" corresponds to 1 GPU. Most GPU nodes offer up to 4 GPUs at once.
  • -q gpu_short corresponds to the "gpu_short" queue. More information about the GPU queues is available on the Scientific Computing Systems Confluence. In general, "gpu_short" will run on just about any available GPU; gpu_tesla, gpu_a100, and gpu_h100 are also options (listed in increasing order of power).
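Putting those options together, a hypothetical heavier napari job might look like this (the numbers and queue here are illustrative; adjust them to your needs):

# 12 CPUs, 2 GPUs, the A100 queue, and email notifications (-B):
bsub -B -n 12 -gpu "num=2" -q gpu_a100 -Is napari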

4) If you run a big job, leave the remote virtual machine running and/or make sure you have launched the job inside a persistent tmux or screen session (see more here).

Before running:

  • Create a tmux session using tmux, or tmux new -s napari if you want to name it
  • Run the part (3) commands
  • Detach from the session by pressing Ctrl+b, then d

This gives you a persistent session that keeps running even if your connection drops. You can re-attach to it at any time by:

  • Getting the session name via tmux ls
  • Attaching to it using tmux attach -t name (the session will be named 0 if it was created with plain tmux)
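Putting the tmux steps together, here is a minimal sketch of the whole workflow, using the cellpose job as the example (the napari version is identical apart from the environment name and command):

tmux new -s cellpose                               # start a named session
micromamba activate cellpose-env                   # activate inside tmux
bsub -n 12 -gpu "num=1" -q gpu_short -Is cellpose  # launch the job
# ...detach with Ctrl+b, then d; the job keeps running...
tmux ls                                            # later: list your sessions
tmux attach -t cellpose                            # re-attach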

If you need to double-check that your job is still running, you can see your usage on the cluster here: https://cluster-status.int.janelia.org/cluster_status/