# RSH 013 internal: cluster etiquette

- What is etiquette?
  - social norms for using something
  - are they also rules?
  - Is it always good? Is it out of date?
- What is a cluster?
  - or supercomputer
  - computers connected with a high-speed network (not always)
  - can be connected Raspberry Pis
- What are other models of computational resources?
  - how does a cluster compare to "cloud"?
  - pros and cons
- When should I consider moving to a cluster?
  - How do I get started before I move there?
  - Should I start somewhere else before going there?
    - cloud to HPC, laptop to HPC, directly develop on HPC
  - containers
  - development queues
  - group login nodes
- How much to run on the login node?
  - picture: https://supercomputingwales.github.io/SCW-tutorial/01-HPC-intro/
    ![](https://i.imgur.com/TCtU4II.png)
  - what is a login/"head" node?
  - why is it a problem to run something on the login node?
    - using up all the memory on a login node (ipython, loading all data to memory)
  - good for:
    - editing files
    - short compilations, but consider scripting them
- When to run interactively and when to use a script?
  - document all dependencies in the run script
  - interactive: I only use it for debugging a problem
- What tools do we use?
  - number 1 tool: bash
  - from laptop to HPC
    - make it work on the laptop
    - convert it into a script or workflow
    - get an interactive compute node and run the script
    - convert the script to a Slurm/PBS script
    - debug until it works
    - monitor CPU and memory usage and scaling
    - adjust to not request too many resources
  - ssh
- How many resources to request?
  - Many scripts are inherited from colleagues and never really validated
  - How to check memory and CPU settings?
- Data on cluster
  - scratch
  - home dir
  - group dirs
  - archives
  - how do you manage moving the data around?
- How to report problems?
  - isolate the problem
  - make the example small
  - make the example self-contained (all dependencies documented)
  - tell staff how to reproduce the problem step by step
  - don't hesitate to ask for advice
  - X-Y problem
- Example
  - cd git/hpc-examples
  - run pi script locally
  - ssh to cluster, git clone the repo
  - python pi.py to test it (increase iterations)
  - write pi.slrm
- Outlook
  - should users adapt to supercomputers? should supercomputers adapt to users?
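
The "Example" and "How many resources to request?" items above can be sketched together in a few lines of Python. This is a hypothetical stand-in and not the actual `pi.py` from the hpc-examples repository: the names `estimate_pi` and `measure` are assumptions made for illustration, and only the standard library is used. It estimates pi by Monte Carlo sampling, then reports CPU time and peak memory for increasing iteration counts, which is roughly the information needed before choosing resource requests.

```python
import random
import resource  # Unix only; not available on Windows
import time


def estimate_pi(iterations, seed=0):
    """Monte Carlo estimate of pi from random points in the unit square."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(iterations):
        x, y = rng.random(), rng.random()
        # Count points that fall inside the quarter circle of radius 1.
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / iterations


def measure(iterations):
    """Run the estimate and return (pi_estimate, cpu_seconds, peak_rss)."""
    cpu_start = time.process_time()
    value = estimate_pi(iterations)
    cpu_seconds = time.process_time() - cpu_start
    # ru_maxrss is the peak resident set size of this process so far:
    # kilobytes on Linux, bytes on macOS.
    peak_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return value, cpu_seconds, peak_rss


if __name__ == "__main__":
    # Increase the iteration count to see how CPU time scales.
    for n in (10_000, 100_000, 1_000_000):
        value, cpu_seconds, peak_rss = measure(n)
        print(f"iterations={n} pi~{value:.5f} "
              f"cpu={cpu_seconds:.2f}s peak_rss={peak_rss}")
```

On a Slurm cluster, numbers like these can guide the `--time` and `--mem` requests in a batch script such as `pi.slrm`. Slurm's accounting tools (for example `sacct`) report similar figures for finished jobs, so a request inherited from a colleague can be checked against what the job actually used.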