---
tags: JetStream
title: DEPRECATED Creating STAMPS 2022 jetstream2 image
---

# DEPRECATED Creating STAMPS 2022 jetstream2 image

Everything on this page is commented out; see the later version here: https://hackmd.io/@astrobiomike/creating-stamps-2022-jetstream-image

<!--

[TOC]

GUI used was Jetstream2 exosphere: https://jetstream2.exosphere.app/

## Launching initial image we are building ours on

Image: Ubuntu 22.04

![](https://i.imgur.com/S4TImBo.png)

Instance size: m3.medium (8 CPU, 30GB RAM, 60GB disk)

![](https://i.imgur.com/CIcKpnQ.png)

Launched super-fast on new system

![](https://i.imgur.com/cla79ag.png)

![](https://i.imgur.com/grBZWlK.png)

And `ssh` info right at bottom box on that instance page:

![](https://i.imgur.com/CkQol5O.png)

## After logging in with ssh, switching to sudo mode for some things

```bash
sudo bash
```

**NOTE:** Pay attention to when to switch out of sudo mode; it is noted below where it happens.

Adding the group the course attendees will be using:

```bash
groupadd stamps2022
```

## Upgrading operating system
It is noted on their [notes page](https://iujetstream.atlassian.net/wiki/spaces/JWT/pages/17465518/Customizing+and+saving+a+VM) (though these are the old jetstream docs, not the new ones) to upgrade the operating system before making a new image (though this probably isn't the best idea in terms of being able to reproduce this from this chosen image)

```bash
apt-get update
apt-get upgrade
```

## Setting timezone to EDT where workshop will be

```bash
timedatectl set-timezone America/New_York
```

Setting so any user can write to `/opt`, where we'll be putting miniconda (this location is retained with the image):

```bash
chmod go+w /opt
```

## Modifying system-wide bashrc and skel bashrc files

### Modifying /etc/bash.bashrc
This is the system-wide bashrc file, read by interactive bash shells for every user. Only adding conda info to bottom, so just appending here:

```bash
cat >> /etc/bash.bashrc << 'EOF'

# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/opt/miniconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/opt/miniconda3/etc/profile.d/conda.sh" ]; then
        . "/opt/miniconda3/etc/profile.d/conda.sh"
    else
        export PATH="/opt/miniconda3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<

EOF
```

### Modifying /etc/skel/.bashrc
This is the skeleton bashrc that is copied into each new user's home directory. Here we are changing the prompt and adding conda stuff. Just overwriting the whole file here because it's easier to copy and paste this entire codeblock:

```bash
cat > /etc/skel/.bashrc << 'EOF'
# ~/.bashrc: executed by bash(1) for non-login shells.
# see /usr/share/doc/bash/examples/startup-files (in the package bash-doc)
# for examples

# If not running interactively, don't do anything
case $- in
    *i*) ;;
      *) return;;
esac

# don't put duplicate lines or lines starting with space in the history.
# See bash(1) for more options
HISTCONTROL=ignoreboth

# append to the history file, don't overwrite it
shopt -s histappend

# for setting history length see HISTSIZE and HISTFILESIZE in bash(1)
HISTSIZE=1000
HISTFILESIZE=2000

# check the window size after each command and, if necessary,
# update the values of LINES and COLUMNS.
shopt -s checkwinsize

# If set, the pattern "**" used in a pathname expansion context will
# match all files and zero or more directories and subdirectories.
#shopt -s globstar

# make less more friendly for non-text input files, see lesspipe(1)
[ -x /usr/bin/lesspipe ] && eval "$(SHELL=/bin/sh lesspipe)"

# set variable identifying the chroot you work in (used in the prompt below)
if [ -z "${debian_chroot:-}" ] && [ -r /etc/debian_chroot ]; then
    debian_chroot=$(cat /etc/debian_chroot)
fi

# set a fancy prompt (non-color, unless we know we "want" color)
case "$TERM" in
    xterm-color|*-256color) color_prompt=yes;;
esac

# uncomment for a colored prompt, if the terminal has the capability; turned
# off by default to not distract the user: the focus in a terminal window
# should be on the output of commands, not on the prompt
#force_color_prompt=yes

if [ -n "$force_color_prompt" ]; then
    if [ -x /usr/bin/tput ] && tput setaf 1 >&/dev/null; then
        # We have color support; assume it's compliant with Ecma-48
        # (ISO/IEC-6429). (Lack of such support is extremely rare, and such
        # a case would tend to support setf rather than setaf.)
        color_prompt=yes
    else
        color_prompt=
    fi
fi

# getting externally accessible IP address
accessible_IP=$(dig +short myip.opendns.com @resolver1.opendns.com)

if [ "$color_prompt" = yes ]; then
    PS1='${debian_chroot:+($debian_chroot)}\[\033[01;34m\]\u@${accessible_IP}\[\033[00m\]:\[\033[01;35m\]\w\[\033[00m\]\$ '
else
    PS1='${debian_chroot:+($debian_chroot)}\u@${accessible_IP}:\w\$ '
fi
unset color_prompt force_color_prompt

# If this is an xterm set the title to user@host:dir
case "$TERM" in
xterm*|rxvt*)
    PS1="\[\e]0;${debian_chroot:+($debian_chroot)}\u@${accessible_IP}: \w\a\]$PS1"
    ;;
*)
    ;;
esac

# enable color support of ls and also add handy aliases
if [ -x /usr/bin/dircolors ]; then
    test -r ~/.dircolors && eval "$(dircolors -b ~/.dircolors)" || eval "$(dircolors -b)"
    alias ls='ls --color=auto'
    #alias dir='dir --color=auto'
    #alias vdir='vdir --color=auto'

    alias grep='grep --color=auto'
    alias fgrep='fgrep --color=auto'
    alias egrep='egrep --color=auto'
fi

# colored GCC warnings and errors
#export GCC_COLORS='error=01;31:warning=01;35:note=01;36:caret=01;32:locus=01:quote=01'

# some more ls aliases
alias ll='ls -alF'
alias la='ls -A'
alias l='ls -CF'

# Add an "alert" alias for long running commands. Use like so:
#   sleep 10; alert
alias alert='notify-send --urgency=low -i "$([ $? = 0 ] && echo terminal || echo error)" "$(history|tail -n1|sed -e '\''s/^\s*[0-9]\+\s*//;s/[;&|]\s*alert$//'\'')"'

# Alias definitions.
# You may want to put all your additions into a separate file like
# ~/.bash_aliases, instead of adding them here directly.
# See /usr/share/doc/bash-doc/examples in the bash-doc package.

if [ -f ~/.bash_aliases ]; then
    . ~/.bash_aliases
fi

# enable programmable completion features (you don't need to enable
# this, if it's already enabled in /etc/bash.bashrc and /etc/profile
# sources /etc/bash.bashrc).
if ! shopt -oq posix; then
  if [ -f /usr/share/bash-completion/bash_completion ]; then
    . /usr/share/bash-completion/bash_completion
  elif [ -f /etc/bash_completion ]; then
    . /etc/bash_completion
  fi
fi

# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/opt/miniconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/opt/miniconda3/etc/profile.d/conda.sh" ]; then
        . "/opt/miniconda3/etc/profile.d/conda.sh"
    else
        export PATH="/opt/miniconda3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<

EOF
```

## Jupyter boot script added to /opt/

This sets the jupyter notebook password and launches jupyter lab when a new instance is created with this image. It is run when the user's instance is booted for the first time.
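
The hashed password that goes into the boot script has the form `algorithm:salt:digest`. Below is a minimal sketch for checking a passphrase against such a string from bash, assuming the `sha1` scheme hashes the passphrase immediately followed by the salt. The function name `check_sha1_passwd` is hypothetical and only for illustration; it relies on `sha1sum` from coreutils.

```bash
#!/usr/bin/env bash

# Hypothetical helper: check a passphrase against a 'sha1:<salt>:<digest>' string.
# Assumption: the digest is SHA-1 over the passphrase followed by the salt.
# Returns 0 on match, 1 on mismatch, 2 if the algorithm is not sha1.
check_sha1_passwd() {
    local passphrase="$1" hashed="$2"
    local algorithm salt digest computed

    # split the stored string on ':' into its three fields
    IFS=':' read -r algorithm salt digest <<< "$hashed"

    [ "$algorithm" = "sha1" ] || return 2

    # hash passphrase + salt and keep only the hex digest column
    computed=$(printf '%s%s' "$passphrase" "$salt" | sha1sum | cut -d' ' -f1)

    [ "$computed" = "$digest" ]
}

# Example using the well-known SHA-1 test vector for "abc"
# (passphrase "a", salt "bc"):
if check_sha1_passwd "a" "sha1:bc:a9993e364706816aba3e25717850c26c9cd0d89d"; then
    echo "passphrase matches"
fi
```

This can be handy for confirming which passphrase a hash already sitting in the boot script corresponds to, without starting python.
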

Getting the hashed password can be done like so, for example:

```python
# run inside an interactive python session (started with `python`)

from notebook.auth import passwd
passwd("pw123", algorithm = "sha1")
# 'sha1:f3e60c834126:7485d1211f2cdb9dc35a0271f8a1b4b7a2dc66b8'
# so that would be replaced below when setting a new password
```

Got some of that info from [here](https://jupyter-notebook.readthedocs.io/en/stable/public_server.html#preparing-a-hashed-password).

```bash
cat > /opt/jupyter-boot.sh << 'EOF'
#!/bin/bash

rm -rf ~/.jupyter

/opt/miniconda3/bin/jupyter server --generate-config

printf "
c = get_config()
c.NotebookApp.ip = '*'
c.NotebookApp.open_browser = False
c.NotebookApp.password = u'sha1:16e5e4b28616:4b7eb3fd1055a99a2cdac81e6e3236d0a9a6b242'
c.NotebookApp.port = 8000

" >> ~/.jupyter/jupyter_server_config.py

cd ~/

nohup /opt/miniconda3/bin/jupyter lab > ~/.jupyter/log 2>&1 &

EOF
```

## Installing R
Following here: https://linuxize.com/post/how-to-install-r-on-ubuntu-20-04/

```bash
apt install dirmngr gnupg apt-transport-https ca-certificates software-properties-common
```

This gave me this message at one time:

![](https://i.imgur.com/Jk8IVrK.png)

But ignoring it vs rebooting didn't seem to make a difference, so do whichever you feel so inclined to do :+1:

Another time it gave me a message asking which daemons to update. I just hit the escape key and all seemed fine so I went on with my day ¯\\\_(ツ)\_/¯

Adding CRAN repository:

```bash
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9
add-apt-repository 'deb https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/'
```

Installing R:

```bash
apt install r-base # see next if error here
```

This gave me a problem with libicu66.
Found help here: https://linuxhint.com/install-r-and-rstudio-ubuntu/

Following that:

```bash
wget http://security.ubuntu.com/ubuntu/pool/main/i/icu/libicu66_66.1-2ubuntu2_amd64.deb
dpkg -i libicu66_66.1-2ubuntu2_amd64.deb
```

And re-trying after that:

```bash
apt install r-base
```

## Installing RStudio Server
Following here, with some modifications, from the Install for Debian 10 / Ubuntu 18 / Ubuntu 20 section: https://www.rstudio.com/products/rstudio/download-server/debian-ubuntu/

```bash
# not sure why this is needed, but feeling superstitious
chmod -R a+rwx ./

wget https://download2.rstudio.org/server/bionic/amd64/rstudio-server-2022.02.3-492-amd64.deb
apt install gdebi-core

# if the following is not found, see this page for
# latest, that looks similar to the below, it
# seems to update/change frequently
wget http://http.us.debian.org/debian/pool/main/o/openssl/libssl1.1_1.1.1n-0+deb11u3_amd64.deb
gdebi libssl1.1_1.1.1n-0+deb11u3_amd64.deb

gdebi rstudio-server-2022.02.3-492-amd64.deb
```

**consider note here about --no-sandbox if I have trouble: https://stackoverflow.com/questions/71962411/installing-r-studio-on-ubuntu-22-04**

## Installing R packages

To install some of the required R packages, needed these as well:

```
apt-get -y install libcurl4-openssl-dev \
                   libssl-dev libxml2-dev \
                   libudunits2-dev libcairo2-dev \
                   libgdal-dev

# sudo apt -y install libgdal-dev
```

Then installing R packages with BiocManager into `/usr/local/lib/R/site-library/`, can copy/paste this to make the script:

```bash
cat > r-installs.R << 'EOF'
install.packages("BiocManager", lib="/usr/local/lib/R/site-library/")
BiocManager::install("tidyverse", lib="/usr/local/lib/R/site-library/")
BiocManager::install("phyloseq", lib="/usr/local/lib/R/site-library/")
BiocManager::install("dada2", lib="/usr/local/lib/R/site-library/")
BiocManager::install("decontam", lib="/usr/local/lib/R/site-library/")
BiocManager::install("DESeq2", lib="/usr/local/lib/R/site-library/")
BiocManager::install("tximport", lib="/usr/local/lib/R/site-library/")
BiocManager::install("devtools", lib="/usr/local/lib/R/site-library/")
devtools::install_github("adw96/breakaway")
devtools::install_github("adw96/DivNet")
devtools::install_github("bryandmartin/corncob")
EOF
```

And running it:

```
Rscript r-installs.R
```

## Setting permissions of R library directory so all users can write there:

```bash
find /usr/local/lib/R/ -exec chmod a+rw {} \;
```

## Installing miniconda3

**Exiting sudo before doing this**

```bash
exit
```

```bash
curl -O -L https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
```

**During above interactive install steps**
- set install location to `/opt/miniconda3` (user directories not maintained with image creation)
- say "yes" to initialize at end of install

Setting the miniconda location's group to stamps2022, so attendees should also be able to write there if they want to install new conda packages/envs:

```bash
sudo chgrp -R stamps2022 /opt/miniconda3/
```

## Sourcing to activate new environment

```bash
source ~/.bashrc
```

## Installing mamba, jupyter, and jupyterlab in base conda environment

```bash
conda install -y -c conda-forge mamba
mamba install -y jupyter jupyterlab
```

## Creating Todd conda env

```bash
mamba create -y -n treangen -c conda-forge -c bioconda -c defaults fastqc=0.11.9 multiqc=1.13a sra-tools=2.11.0 fastp=0.23.2 kraken=1.1.1 squeegee=0.2.0 parsnp=1.7.4
```

## Now "burning" this image

Can exit the `ssh` connection, and on the instance page on exosphere on the right side, select "Actions" then "Image". (Note I read somewhere that to retain "data", the instance should be shut down before creating the image. I saw that later, and it hasn't been a problem for where I put stuff, but keep that in mind if I run into any possibly related problems.)
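
Before creating the image, it may be worth confirming that the pieces set up above are actually in place. The following is a minimal sketch, not something that was run for this image; the function name `check_paths_exist` is hypothetical, and the paths and group name are the ones used earlier on this page.

```bash
#!/usr/bin/env bash

# Hypothetical pre-image check: report which of the given paths are missing.
# Returns 0 only if every path exists.
check_paths_exist() {
    local missing=0 path
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            echo "MISSING: $path"
            missing=$((missing + 1))
        fi
    done
    echo "$missing missing"
    [ "$missing" -eq 0 ]
}

# locations set up earlier on this page
check_paths_exist /opt/miniconda3/bin/conda \
                  /opt/jupyter-boot.sh \
                  /usr/local/lib/R/site-library || true

# the course group should exist too
getent group stamps2022 > /dev/null || echo "MISSING: group stamps2022"
```

Anything reported as missing would be easier to fix now than after launching new instances from the image.
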

![](https://i.imgur.com/Je1ifIK.png)

![](https://i.imgur.com/mQlOQJR.png)

### Where to find images in exosphere GUI

These seem to be created immediately now (well, they still take a few minutes, but don't need to be approved/launched by anyone), which is great.

Main allocation page:

![](https://i.imgur.com/DZ1tlzY.png)

Into Images

## Modifying cloud-init config

These changed with JetStream2 (or maybe exosphere) from deploy scripts. Start-up stuff is handled by a config now (https://docs.jetstream-cloud.org/ui/exo/create_instance/#advanced-options).

Info on boot commands here: https://cloudinit.readthedocs.io/en/latest/topics/examples.html#run-commands-on-first-boot

**New JS2 way**

Modifying the cloud-init config yaml, adding in the ada (a sudo user I can access on anyone's instance) and stamps2022 users/passwords, and the jupyter lab launch script to be run on first boot-up

Here's an example of making a 'salted' password, which is how we should put them in the config as done below:

```bash
openssl passwd -1 'stamps2022'
# $1$vsttf/Xf$sUTIc9Pv.oJZ3YjmK/b0s0
```

```
#cloud-config
users:
  - default
  - name: ada
    shell: /bin/bash
    groups: sudo, admin, users
    sudo: ['ALL=(ALL) NOPASSWD:ALL']{ssh-authorized-keys}
    lock_passwd: false
    passwd: $1$rtwLswOU$nCTYu2LC.QEJevgINm5.e.
  - name: stamps2022
    shell: /bin/bash
    groups: users
    lock_passwd: false
    passwd: $1$vsttf/Xf$sUTIc9Pv.oJZ3YjmK/b0s0
ssh_pwauth: true
package_update: true
package_upgrade: {install-os-updates}
packages:
  - python3-virtualenv
  - git{write-files}
runcmd:
  - sudo -u stamps2022 -H sh -c "bash /opt/jupyter-boot.sh"
  - echo on > /proc/sys/kernel/printk_devkmsg || true  # Disable console rate limiting for distros that use kmsg
  - sleep 1  # Ensures that console log output from any previous command completes before the following command begins
  - >-
    echo '{"status":"running", "epoch": '$(date '+%s')'000}' | tee --append /dev/console > /dev/kmsg || true
  - chmod 640 /var/log/cloud-init-output.log
  - {create-cluster-command}
  - |-
    (which virtualenv && virtualenv /opt/ansible-venv) || (which virtualenv-3 && virtualenv-3 /opt/ansible-venv) || python3 -m virtualenv /opt/ansible-venv
    . /opt/ansible-venv/bin/activate
    pip install ansible-core
    ansible-pull --url "{instance-config-mgt-repo-url}" --checkout "{instance-config-mgt-repo-checkout}" --directory /opt/instance-config-mgt -i /opt/instance-config-mgt/ansible/hosts -e "{ansible-extra-vars}" /opt/instance-config-mgt/ansible/playbook.yml
  - ANSIBLE_RETURN_CODE=$?
  - if [ $ANSIBLE_RETURN_CODE -eq 0 ]; then STATUS="complete"; else STATUS="error"; fi
  - sleep 1  # Ensures that console log output from any previous commands complete before the following command begins
  - >-
    echo '{"status":"'$STATUS'", "epoch": '$(date '+%s')'000}' | tee --append /dev/console > /dev/kmsg || true
mount_default_fields: [None, None, "ext4", "user,exec,rw,auto,nofail,x-systemd.makefs,x-systemd.automount", "0", "2"]
mounts:
  - [ /dev/sdb, /media/volume/sdb ]
  - [ /dev/sdc, /media/volume/sdc ]
  - [ /dev/sdd, /media/volume/sdd ]
  - [ /dev/sde, /media/volume/sde ]
  - [ /dev/sdf, /media/volume/sdf ]
  - [ /dev/vdb, /media/volume/vdb ]
  - [ /dev/vdc, /media/volume/vdc ]
  - [ /dev/vdd, /media/volume/vdd ]
  - [ /dev/vde, /media/volume/vde ]
  - [ /dev/vdf, /media/volume/vdf ]
```

## The goods

With that all set up, launching an instance with that image and the cloud-init config above will have users ada (with sudo) and stamps2022 (as a user, no sudo).

- Jupyter lab is accessed at:
    - \<ip\>:8000/lab
- Rstudio is accessed at:
    - \<ip\>:8787

## Seeing about putting R in jupyter lab instead

Looking here: https://richpauloo.github.io/2018-05-16-Installing-the-R-kernel-in-Jupyter-Lab/

```
devtools::install_github("IRkernel/IRkernel")
IRkernel::installspec()
```

-->