# Renku UNLimited - Azure Pilot

## Guidelines

### Budget

We want to get a realistic measure of cost. We have to be careful about how much we spend!

* Budget: 20k
* Cost per hour of a GPU **(A100 80GB, 200GB, 20vCPU): 4.5 CHF/h**

**How you can help**

* Please shut down your session every night

**What we're doing to manage costs**

* We have set the idle time to X (30 min?).
* Discuss autosaves...
    * How to open or trash an autosave
    * Why autosaves might fail
    * We are working on improvements!
* How Idle Timeout is calculated (max age and pod idle seconds)

:::warning
TODO [Dario]: Set the session culling policy
:::

* Monitor & update Slack channel on spending daily
* Nice to Have: small dashboard to surface spending

## Setup Guide

### How to: Make an Account on Unlimited

* Create an account: https://renku.unlimited.azure-poc.dev.renku.ch/

### How to: Add an existing Renku Project to Unlimited

With these instructions, you can push and pull changes from a local clone of a Renku project to and from both the existing Renku project and a copy of that project on Unlimited.

1. If you haven't already, `renku clone` your Renku project to your local machine.
1. Add your SSH key (if you use SSH for git) or create a personal access token (PAT, if you use HTTPS for git):
   - https://docs.gitlab.com/ee/user/profile/personal_access_tokens.html
   - https://docs.gitlab.com/ee/user/ssh.html#add-an-ssh-key-to-your-gitlab-account
1. In your local project git repo, run the following (this uses HTTPS for the remote; username: your username, password: your PAT):

   ```
   git remote add azure https://renku.unlimited.azure-poc.dev.renku.ch/gitlab/<user-name>/<repo-name>.git
   git checkout -b azure-master
   git push --set-upstream azure master
   ```
1. Change the `.gitlab-ci.yml` on the `azure-master` branch to: `https://github.com/SwissDataScienceCenter/renku-project-template/blob/azure-poc/python-minimal/.gitlab-ci.yml`

:::info
In principle, steps 2 and 3 can be simplified:
1. install the Renku CLI
2.
   run

   ```
   renku login --git renku.unlimited.azure-poc.dev.renku.ch
   git lfs push --all origin master
   git push origin master
   ```
:::

This creates a new remote (check with `git remote show`), and the `azure-master` branch uses this remote, so you can use both systems simultaneously. The migration itself happens through `git push`, which creates the repository on Unlimited and pushes the content, including datasets. If you create a new project on the azure-poc, the correct `.gitlab-ci.yml` is used.

:::info
TODO [Dario] check order of operations for triggering CI job for image build
:::

:::info
TODO [Rok] Consider using the `renku template` command to update the CI file automagically
:::

#### What about my project's data?

If your data in the source RenkuLab project is in git LFS, then this process will transfer your LFS data to RenkuLab Unlimited.

*Note: You may need to pull the LFS objects first, before pushing.*

### How to: Upload data to a project on Unlimited

1. Install [rclone](https://rclone.org/install/)
2. [Set up rclone with Azure Blob](https://rclone.org/azureblob/)

   ```sh
   rclone config
   No remotes found, make a new one?
   n) New remote
   s) Set configuration password
   q) Quit config
   n/s/q> n
   name> azure-poc
   Type of storage to configure.
   Storage> azureblob
   Storage Account Name
   account> pocuserdatasets
   Login>
   Storage Account Key
   key> <key>
   # Leave rest empty
   Configuration complete.
   Options:
   - type: azureblob
   - account: pocuserdatasets
   - key: <key>
   Keep this "azure-poc" remote?
   y) Yes this is OK (default)
   e) Edit this remote
   d) Delete this remote
   y/e/d>y
   ```
3. Upload to Azure Blob

   ```sh
   # List content of storage account
   rclone lsd azure-poc:
   # Create new container:
   rclone mkdir azure-poc:<name-dataset>
   # Copy files from local to container
   rclone copy /path/to/src azure-poc:<name-dataset>
   # or sync local dir to container
   rclone sync -i /home/local/directory azure-poc:<name-dataset>
   ```

   Data limits: unlimited!

   Data security access: At the moment, data access is shared among everyone in the PoC.
   Let us know if you need a secure space for your data!
4. Use the blob in a Renku session

   Go to "start session with options":

   ![](https://i.imgur.com/kzlqoVA.png)

   Then click on "Cloud storage" and enter the details of the Azure blob you want to mount:

   ![](https://i.imgur.com/pF5NMOy.png)

   The `endpoint` should be `https://<your-storage-account>.blob.core.windows.net`. Enter the blob "container" under `Bucket Name` and your storage account key under `Secret Key`. Leave `Access Key` blank.

### How to: Connect to RenkuLab Unlimited sessions via SSH (& the VSCode SSH Extension)

We have released a beta version of SSH access to sessions on RenkuLab Unlimited. We are working on a much more user-friendly way of SSHing into sessions via the Renku CLI, but for now, you can test out this process with a few manual steps.

Please note that this beta process (mostly) requires updating the SSH URL with each new session. We will resolve this in the final implementation!

Here is how you can try it out:

#### Configure your Renku Project for SSH

1. Open your project's `Dockerfile` and, on line 3, change the base image to: `ARG RENKU_BASE_IMAGE=renku/renkulab-py:python-3.9.12-868889b`
    - Commit this change and allow the project image to rebuild. (This step adds an SSH server to your project environment.)
1. Add your public key to `.ssh/authorized_keys` in your git project. If you're not familiar with this process, here's how:
    1. Create a key pair on your laptop: `ssh-keygen -t ed25519 -C "<your email>"`
        - You may want to include `renku` in the name of the key file so you can keep track of it, i.e. when it prompts `Enter file in which to save the key (/Users/laurakinkead/.ssh/id_ed25519):`, enter something like `/Users/laurakinkead/.ssh/id_renku`.
        - (Setting a password is not required)
    1. Copy your *public* key: run `cat ~/.ssh/id_renku.pub` and copy the output
        - Note: the `.pub` is important!
        - It should look like `ssh-ed25519 <a bunch of letters and numbers> <your email>`
    1. Add the key to your Renku project
        1. Open your Renku project in the GitLab Web IDE by going to "Open in GitLab" and then "Open Web IDE"
        2. Create a `.ssh` directory, and create a file `authorized_keys` inside it (no file extension)
        3. Paste your public key inside `authorized_keys`
        4. Commit to master (not as a new branch!)

#### SSH into a Session from your laptop command line

1. Start a new session on RenkuLab
1. Get the ID of your session. Here's how to do that:
    - If your open browser session has the URL `https://renku.unlimited.azure-poc.dev.renku.ch/projects/tasko.olevski/test-project-ssh-1/sessions/show/tasko-2eol-test-2dproject-2dssh-2d1-9fe21137` then the session ID is the last part: `tasko-2eol-test-2dproject-2dssh-2d1-9fe21137`
1. SSH from your laptop

   ```
   ssh -i <path to private key> -J jovyan@renku.unlimited.azure-poc.dev.renku.ch:2022 jovyan@<session id>
   ```

   For example, my full SSH command looks like this:

   ```
   ssh -i .ssh/id_renku -J jovyan@renku.unlimited.azure-poc.dev.renku.ch:2022 jovyan@tasko-2eol-test-2dproject-2dssh-2d1-9fe21137
   ```
    - Accept the authenticity of the host (2 times)

#### How to Configure SSH access via VSCode

1. Install the Remote - SSH extension in VSCode
2. Click in the bottom left of the VSCode window to `Open a Remote Window`

   ![](https://i.imgur.com/LKvs6yH.png)
3. Select `Open SSH Configuration File...`
4. Select `<home>/.ssh/config`
    - On a Mac, this is `/Users/<username>/.ssh/config`
5.
   Paste the following into your SSH config, replacing `<session_ID>` and `<path_to_identity_file>` with your own values:

   ```yaml
   Host renkuUnlimited
     HostName <session_ID>
     User jovyan
     IdentityFile <path_to_identity_file>
     ProxyJump renkuUnlimitedJump

   Host renkuUnlimitedJump
     HostName renku.unlimited.azure-poc.dev.renku.ch
     User jovyan
     Port 2022
   ```

   For example, mine looks like this:

   ```yaml
   Host renkuUnlimited
     HostName laura-2eki-ssh-2dtest-735c9f78
     User jovyan
     IdentityFile ~/.ssh/id_renku
     ProxyJump renkuUnlimitedJump

   Host renkuUnlimitedJump
     HostName renku.unlimited.azure-poc.dev.renku.ch
     User jovyan
     Port 2022
   ```

#### Open your RenkuLab Unlimited Session in VSCode

1. Click again in the bottom left on `Open a Remote Window`
2. Select `Connect to Host...`
3. Select `renkuUnlimited`
4. In the Explorer, choose `Open Folder` and enter `/home/jovyan/work/<Renku project name>/`
5. You should see a VSCode file browser with your Renku project files! :tada:

:::info
Please note that the RenkuLab session ID changes based on the:
- branch
- commit SHA
- user
- project name

So if any of these change, then **you need to find the session ID again and update the `HostName`** in the second line of the SSH config.

We are currently working on making this easier so the command stays the same per user and project!
:::

We welcome all feedback you have on this beta feature! :)

## Tests we'd like to Evaluate

### Data

#### [Till] Using cloud storage for mounting data

* Instead of LFS
* Raw numbers on VM and also in RenkuLab sessions
* Expose as Prometheus metric

#### Mounting blob-based datasets locally

* How easy is it to do?
* What should the UX be like?
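
One way to start exploring the local-mounting question is `rclone mount`, reusing the `azure-poc` remote from the upload section. This is only a minimal sketch, not a tested recipe for this PoC: the helper names `mount_dataset` and `unmount_dataset` and the default mount point under `~/mnt` are hypothetical, and `rclone mount` needs FUSE on the local machine (for example macFUSE on a Mac).

```sh
# Hypothetical helper: mount a blob container read-only on the local machine.
# Assumes the "azure-poc" rclone remote has already been configured.
mount_dataset() {
    dataset="$1"                             # container name, e.g. the <name-dataset> you uploaded to
    mount_point="${2:-$HOME/mnt/$dataset}"   # assumed default location, change as needed
    mkdir -p "$mount_point"
    # --read-only protects the shared data; --daemon runs the mount in the
    # background (rclone does not support --daemon on Windows).
    rclone mount "azure-poc:$dataset" "$mount_point" --read-only --daemon
}

# Hypothetical helper: unmount again when you are done.
unmount_dataset() {
    # fusermount is the Linux FUSE helper; macOS uses plain umount.
    if command -v fusermount >/dev/null 2>&1; then
        fusermount -u "$1"
    else
        umount "$1"
    fi
}
```

After defining these, `mount_dataset my-dataset` (with `my-dataset` standing in for your container name) should make the files browsable under `~/mnt/my-dataset`, and `unmount_dataset ~/mnt/my-dataset` releases the mount.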