--- tags: user support coffee break title: archive3 --- # Archive of Weekly CSC user coffee break Q&As (vol3) **Old archive: https://hackmd.io/uVI5gLKDQoWZgAVcufYg2A?view** **Even older archive: https://hackmd.io/9QsIbJ03T1SaNDV_xbT8Dw?view** # Q&A for Weekly CSC user coffee break *CSC's open support session organised every **Wednesday at 14:00** in [Zoom](https://cscfi.zoom.us/j/65059161807).* :::info **Zoom:** <https://cscfi.zoom.us/j/65059161807> **Q&A:** <https://siili.rahtiapp.fi/weekly-user-zoom> **Events page (slides and recordings):** <https://ssl.eventilla.com/usersupportcoffee/EN> **Feedback:** <https://link.webropolsurveys.com/S/94AB9F77D8EFF054> **Q&A archive:** <https://hackmd.io/uVI5gLKDQoWZgAVcufYg2A?view> **Old Q&A archive:** <https://hackmd.io/@CSC-research-support/weekly_session_archive> ::: :::warning :bulb: **Useful links & further resources** * [CSC docs](https://docs.csc.fi/) * [CSC Computing Environment course materials](https://csc-training.github.io/csc-env-eff/) - [CSC's courses](https://www.csc.fi/training) ::: ## 2024-12-11 session #### Short talk *no short talk this time* #### Advertisements - [Moving your AI training jobs to LUMI: A Hands-On Workshop, February 4-5, *registration will open next week*](https://www.lumi-supercomputer.eu/events/lumi-ai-workshop-feb2025/) - [CSC Spring School on Computational Chemistry 2025, April 23-25, **registration open**](https://csc.fi/en/training-calendar/csc-spring-school-on-computational-chemistry-2025/) - [BioExce ## 2024-11-27 session **Short talk: "How do I figure out batch job parameters?""** [Slides](https://a3s.fi/saren-2001659-pub/How_to_Figure_Out_Batch_Job_Parameters_2024-11-27.pdf) **Advertisements:** - [Julia for HPC](https://csc.fi/koulutuskalenteri/julia-for-high-performance-scientific-computing-enccs-2/) 9-12.12.2024, registration open ### Zoom breakout rooms and topics Room 1 - Batch jobs, billing units, job info Q330, Q331, (Ari-Matti) Room 2 - Virtual machine related question, Q337 
(Alvaro) Room 3 - Seff and timeout, Q334, Q338 + Q335 (Rasmus) Room 4 - Containerized conda environment Q332 (Laxman) Room 5 - Beginner (Heli, Kimmo) Room 6 - Matlab for course, Q336 (Jaan) ### Questions - **Q330** How can I use GPU nodes, and how to get enough billing units to use GPUs? I tried srun, sinteractive and jupyter. Srun did not seem to get access to GPU. What is wrong? Does interactive mode reserve GPU unnecessarily long? Is there any mitigation for that? / Petri Välisuo, University of Vaasa: I already tried `--gres=gpu:v100:1`, but it does not seem to work. - We have detailed instructions in our documentation: - https://docs.csc.fi/computing/running/creating-job-scripts-puhti/ - https://docs.csc.fi/computing/running/example-job-scripts-puhti/ - You need to use e.g. `--gres=gpu:v100:1` flag to get GPUs, it is not enough to run in gpu partition. The example in this case would reserve 1 V100 GPU on Puhti. On Mahti you should use `--gres=gpu:a100:1`. See the docs for more details. - Using GPUs interactively is indeed often inefficient, but it is also possible, both with `sinteractive` https://docs.csc.fi/computing/running/interactive-usage/ and in the web interface. - You can apply for more billing units via My CSC: https://docs.csc.fi/accounts/how-to-apply-for-billing-units/ - **Q331** How can I measure the energy consumption of my processes? `seff` seems to work for GPUs. Can I also get some info from CPUs? - A: Here are instructions for LUMI https://docs.lumi-supercomputer.eu/runjobs/scheduled-jobs/jobenergy/ (but this does not seem to work on Puhti/Mahti) - **Q332** I have some problems when creating a containerized conda environment. I have environment.yml (.yml file created based on existing conda env from my laptop) file that describes the channels and dependencies used as well as pip installable packages. After loading the Tykky module and running conda-containerize new --prefix ... 
the system starts creating the conda environment but fails in the pip part when installing the MinkowskiEngine module. The MinkowskiEngine installation itself requires some additional export commands and additional libraries, which makes this a bit more complicated. So my question is: how should I create my containerized conda environment when the environment requires modules/libraries that need multiple installation steps (not just pip/conda ...)? The environment that I'm trying to set up at CSC is described here: https://github.com/lizhaoliu-Lec/CPCM , here: https://github.com/lizhaoliu-Lec/CPCM/blob/master/prepare_env/me054/README.md and here: https://github.com/NVIDIA/MinkowskiEngine . Thank you!
  - Possible solutions were suggested. One is to build an Apptainer container: start with MinkowskiEngine as the base container and install the conda environment that CPCM needs on top of it. A Tykky wrapper installation may also work, but installation steps that require sudo will not. If there is a pip-installable version of MinkowskiEngine, one can use that; this works as long as its dependencies are available.
- **Q333** Schrödinger Maestro says that use of `${HOME}/schrodinger.hosts` is _deprecated_. Current CSC Slurm accounting depends on that feature. What is the contingency plan (for the use of Maestro on Puhti etc. in the future)? / Jukka Lehtonen
  - A: This has been the situation for a long time already, and it seems the feature is not going away any time soon. If the situation changes, we will adapt to it and communicate appropriately. So currently there is no need to worry about it.
  - JL: That did answer the question. :)
- **Q334** If my job times out, is the "seff" command reliable? Can I interpret the seff results even if there's a timeout? (I tried to train a scikit-learn HistGradientBoostingRegressor with GridSearchCV with n_jobs=-1.
I reserved 15 mins and the job timed out, but the seff command stated CPU Utilized: 00:00:02, which seems weird)
  - A: Yes, you can interpret `seff` results even after a timeout. The low CPU usage (i.e. low efficiency over a longer job duration) is likely due to the job not being able to utilize the reserved resources. It is good to investigate what the issue is!
- **Q335** I use HyperQueue to trivially parallelize my small jobs. Each task within HyperQueue usually takes about 5 mins. I have 5000 such tasks that I need to execute. I found this HyperQueue solution about 3 weeks ago, but recently (about a week ago) the tasks started getting stuck. By stuck I mean that the tasks show as running for several hours and the HyperQueue sbatch job just times out. The individual task script, if run manually, still finishes within 5 mins without fail. Is this something I need to fix? How can I fix this? I don't know what is happening, since it used to run flawlessly 3 weeks ago. - Sachith Pai
  - A: Hard to say without seeing your batch script. I see you have a ticket open about it. Could you send your batch script so we can debug?
  - A: Lagging that comes and goes is often related to disk slowness, but 5 mins vs. several hours is such a large difference that there's probably something else going wrong in the HyperQueue setup. It may, for example, be a race condition or deadlock where some process gets stuck and the rest can't run because they are waiting for those resources to be freed.
  - Sachith (reply): Just to add that each job reads some files, performs some computation and writes a couple of lines of output to a common file. I use the `sbatch-hq` command-line tool to run the jobs; there is no other batch script. Each task executes a C++ executable with different parameters.
  - A: Okay, in that case it would be useful to know the exact `sbatch-hq` command and perhaps the commandlist file.
It is also possible to save the batch script that `sbatch-hq` creates by adding the flag `--keep-batch-file` to the `sbatch-hq` command.
  - Sachith (reply): The exact command is `sbatch-hq --cores=1 --nodes=1 --account=project_2005865 --partition=small --time=16:00:00 hq_tasks_evaluate`. I will save the batch script right now. It is a largeish script. How do I submit it? Paste here?
- **Q336** Help is needed with setting up VMs with GPU and Matlab for a course. #7435947
  - Jaan: I will handle this
  - Room 6
- **Q337** Help with setting up a VM and checking that everything is okay (changing the ePouta flavor from the previous I/O flavor) is needed here as well
  - Room 2, Alvaro
- **Q338** How do I know if I should run, let's say, 8 cores in 1 task, or 8 tasks with 1 core each? This question is related to **Q334**: I'm trying to train a HistGradientBoostingRegressor with GridSearchCV.
  - Trying it :) You might get some information from our software page (docs.csc.fi/apps/ followed by the code name) or from the code's own instructions. Some codes are parallelized with MPI (and can use many tasks) and some with threads (many cores per task). This information is available in the code's manual. However, all machines are different, and typically the input data (or what the code will actually do) is different. Since this affects how much (and how) the problem can be parallelized, you should try. We have instructions on how to perform a "scalability test" (see the first exercise on this page: https://csc-training.github.io/csc-env-eff/part-2/workflows/ ), which will walk you through this. In essence, you run a short but representative job with different amounts of resources and then choose the setting that is the best compromise between resource use and speed (i.e. adding resources should make the job significantly faster; once it no longer does, it doesn't make sense to use more). **What is the code that you're running?** -> It's a scikit-learn python package.
My guess is 8 (or 2 or 4) cores in 1 task, i.e. along the lines of `#SBATCH --cpus-per-task=8` (in your batch script or in the Puhti web interface), but the speed of your job might be limited by other factors, too, like disk I/O. Please also check the other content about workflows in the link above (and perhaps this one, too: https://docs.csc.fi/support/tutorials/python-usage-guide/#python-parallel-jobs )
  - Tip from an offline specialist: "What could be tried is to allocate N cores and pass this number of cores as the `n_jobs` argument of GridSearchCV:" https://scikit-learn.org/1.5/modules/generated/sklearn.model_selection.GridSearchCV.html

## 2024-11-20 session

No short talk

2 questions: one on cloud and http vs https, another one on tensorflow containers

## 2024-11-13 session

**Short talk:** no short talk today

### Zoom breakout rooms and topics

Room 1 - Q329
Room 2 - MyCSC publication question

### Questions

- **Q329** How to import qiskit and qiskit-on-iqm to LUMI when using the web interface?
  - Room 1

## 2024-11-06 session

**Short talk:** no short talk today

**Advertisements:**
- [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/using-containers-in-supercomputing-environment/) 12.-14.11.2024, registration open
- [High-Level GPU programming](https://csc.fi/en/training-calendar/high-level-gpu-programming/) 27.-29.11.2024, registration closing soon (13.11.2024)
- [Schrödinger Maestro workshop](https://csc.fi/en/training-calendar/schrodinger-maestro-workshop-2/) 4.-5.12.2024, registration open
- [Julia for HPC](https://csc.fi/koulutuskalenteri/julia-for-high-performance-scientific-computing-enccs-2/) 9-12.12.2024, registration open

### Zoom breakout rooms and topics

Room 1 - Getting started with CSC services
Room 2 -
Room 3 - Kimmo & Martina
Room 4 - Rahti (Alvaro)

### Questions

- **Q328** Topic: Using the GGIR package in R for 150 accelerometer data files of 4.6 GB each.
  - Beginner
  - Best practice for multi-threading?
- Using GGIR ncores vs R mc.cores vs Slurm - How to problem solve errors? - Getting warnings and errors to print to slurm output - Different results using Rstudio instance vs batch job script? - Best way to test batch jobs - Transfering files from JYU network drive to scratch or Allas? - Currently transfering via work computer DL then rsync - Answer: Please join the Zoom session! If not possible, please send an email to servicedesk@csc.fi with your questions. ## 2024-10-23 session **Short talk:** no short talk today NOTE: No weekly session on 30.10.! **Advertisements:** - [CSC Computing Environment Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-3/) 6.-7.11.2024, registration open - [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/using-containers-in-supercomputing-environment/) 12.-14.11.2024, registration open - [High-Level GPU programming](https://csc.fi/en/training-calendar/high-level-gpu-programming/) 27.-29.11.2024, registration open - [Schrödinger Maestro workshop](https://csc.fi/en/training-calendar/schrodinger-maestro-workshop-2/) 4.-5.12.2024, registration open - [Julia for HPC](https://csc.fi/koulutuskalenteri/julia-for-high-performance-scientific-computing-enccs-2/) Dec. 9-12 (2024), registration open ### Zoom breakout rooms and topics Room 1 - Getting started with CSC services (Laxmana) Room 2 - R (Heli, Samantha) Room 3 - Haichuan & Mats ### Questions :::danger If you wish you can type your question here before the session. We'll respond to them during the session. **Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.** Previous questions and answers can be found in the archives, see links section above. ::: - **Q325** Perhaps this is very specific, but I would like to calculate euclidean distances between polygons. 
This is a heavy computational process in R and I have a lot of polygons. Is there already a script that I could use to parallelize this or to make it faster? :)
  - Room 2
  - The computations are independent
  - Unit of calculation to be parallelized: one landscape?
  - The way to go depends on the resources needed for each process
    - `foreach` or `future` multicore: https://github.com/csc-training/geocomputing/tree/master/R/puhti
    - array job: https://docs.csc.fi/apps/r-env/#serial-batch-jobs
- **Q326** Is a tool called cybersort installed at CSC? It is distributed as a Docker image. How do I check if such tools are installed? Field: Biosciences
  - Here you can see what we have preinstalled on Puhti/Mahti: https://docs.csc.fi/apps/by_discipline/
  - Some tools are installed but lack a documentation page. Here's a tutorial for searching applications: https://csc-training.github.io/csc-env-eff/hands-on/modules/module-exercise-with-aligners.html
  - You can also always ask us :) Here or via servicedesk@csc.fi
  - Converting a Docker image to Singularity: https://docs.csc.fi/computing/containers/creating/
  - Great, thank you! I will check
- **Q327** Questions by Haichuan
  - Room 3

## 2024-10-16 session

**Short talk: Building and running Apptainer containers on Puhti ([docs.csc.fi page](https://docs.csc.fi/computing/containers/creating/))**

NOTE: No weekly session on 30.10.!
**Advertisements:** - [Build systems course and support session](https://csc.fi/en/training-calendar/build-systems-course-and-hackathon/) 8.10.2024 - 23.10.2024 - [CSC Computing Environment Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-3/) 6.-7.11.2024, registration open - [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/using-containers-in-supercomputing-environment/) 12.-14.11.2024, registration open - [High-Level GPU programming](https://csc.fi/en/training-calendar/high-level-gpu-programming/) 27.-29.11.2024, registration open - [Schrödinger Maestro workshop](https://csc.fi/en/training-calendar/schrodinger-maestro-workshop-2/) 4.-5.12.2024, registration open - [Julia for HPC](https://csc.fi/koulutuskalenteri/julia-for-high-performance-scientific-computing-enccs-2/) Dec. 9-12 (2024), registration open ### Zoom breakout rooms and topics Room 1 - Getting started with CSC services (Ari-Matti) Room 2 - Containers + Matlab (Jaan) Room 3 - Q320 (Laxman) Room 4 - Pouta / Q322 (Alvaro et al.) Room 5 - Allas (Kimmo) Room 6 - Room 7 - Room 8 - ### Questions :::danger If you wish you can type your question here before the session. We'll respond to them during the session. **Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.** Previous questions and answers can be found in the archives, see links section above. ::: - **Q317** Hi, Matlab environment seems not available in Mahti, correct? Or is there an alternative way to get it there? I was looking for some functionalities like the Matlab parallel server currently available in Puhti. (Thanks, --Andrea) - Breakout room 2 - Jaan: MATLAB and MATLAB Parallel Server is not available on Mahti due to the following reasons: - Slurm keeps track of licenses that are given out to MATLAB workers. 
We don't have a solution for tracking licenses in Puhti's and Mahti's Slurm simultaneously.
    - There are no suitable queues for running MATLAB workers on Mahti, since jobs there are allocated full nodes.
- **Q318** When using HyperQueue on Mahti by invoking `sbatch-hq`, where is the HyperQueue server itself running? Is the server running in one of the workers? Should I take this into account in the number of nodes and cores requested? (Thanks, --Andrea)
  - It also runs on the compute node, but uses very few resources. Thus, you do not need to allocate a dedicated core for it manually. AFAIK, the server is launched with `srun` using the `--overlap` option, so the job step running the server is allowed to share (overlap with) the resources that the actual computing tasks are using.
- **Q319** Is there a way to perform hybrid batch jobs on Puhti from Matlab? Can the `Pool` size passed to the Matlab `batch` command be divided among the workers? For instance, suppose we ask for `Pool=40`; can we give 8 CPU cores per worker, so that 5 workers will be allocated by the Matlab Parallel Server (hence 1 Puhti node)? This is basically a follow-up question to Q317. (Thanks, --Andrea)
  - Room 2
  - Jaan: You can try the following configuration:
  ```matlab
  c = parcluster;
  c.NumThreads = 2;
  %...
  ```
  - Andrea: It works! However, the scheduler also gets `c.NumThreads`. Hence, in total, the batch job will request `Pool`+1 tasks (`ntasks`) with `c.NumThreads` cpus-per-task.
- **Q320** I'm trying to run MycoSNP-nf through Nextflow on Puhti, but get an error about fastqc html files: "Missing output file(s) `*.html` expected by process `NFCORE_MYCOSNP:MYCOSNP:BWA_PREPROCESS:FASTQC_POST (ERR2172266)`". I originally tried it in Pouta, but got the same error, and decided to try on Puhti. Any ideas how to continue?
  - There seems to be some error related to `/tmp`.
One solution is to reset the Apptainer TMPDIR and CACHEDIR variables as below:
  ```
  export APPTAINER_TMPDIR=$PWD    # or $LOCAL_SCRATCH if a gres resource (SSDs) is requested in the batch directives
  export APPTAINER_CACHEDIR=$PWD  # or $LOCAL_SCRATCH if a gres resource is requested
  unset XDG_RUNTIME_DIR
  ```
- **Q321** Questions (copied from Zoom chat):
  1. How long is the container kept alive (or when is it killed)?
  2. Can you run `apptainer exec` via an SSH command (from a server, possibly connecting to a different Puhti login node)?
  - `which unzip` gives `/appl/opt/csc-cli-utils/bin/unzip`
  - Try invoking `/appl/opt/csc-cli-utils/bin/unzip` directly.
  - ~~Might be that `/etc/profile.d/zz-csc-env.sh` has to be sourced~~
  - Room 2
- **Q322** I am building a system for an experiment that uses large language models. I have used cPouta virtual machines to do this, but the GPU flavors offered (https://docs.csc.fi/cloud/pouta/vm-flavors-and-billing/) have relatively low-powered GPUs. However, ePouta offers more powerful VMs. I have two questions:
  1. When I apply for ePouta services for university research, is it possible to transfer an existing project and VMs from cPouta to ePouta?
    - Yes, the CSC project will be the same, so it has the same Billing Units and the same project members. But...
    - It is not possible to move resources (VMs, data) directly, but one can make a snapshot of a VM in cPouta, download it to a local machine, and upload it to ePouta.
  2. ePouta is a virtual private cloud, which seems to mean that my organization (Tampere University) needs to do some extra configuration to use it. What exactly needs to be done before I can use ePouta, if my application for it is accepted? I am not very familiar with using cloud services beyond simple cPouta usage.
    - First, in order to use ePouta the data needs to be sensitive.
This is our article about sensitive data: https://research.csc.fi/sensitive-data
    - Secondly, it is recommended to contact your institution to see if they already have a general project in ePouta (normally called an umbrella project) that can host your VM.
    - Thirdly, send a ticket to servicedesk@csc.fi explaining the situation and asking for a new ePouta project. This is a heavy process that can take from a few weeks up to a month.
  - Room 4
- **Q323** Using a custom MATLAB license
  - Jaan: We can try placing the MATLAB license file at `$HOME/.matlab/license.lic` and running:
  ```bash
  export MLM_LICENSE_FILE="$HOME/.matlab/license.lic"
  module load matlab/r2024a
  matlab -nodisplay
  ```
- **Q324** If you have some links to tutorials/intros for using Apptainer on Puhti, please add them to Siili, thanks!
  - https://csc-training.github.io/csc-env-eff/part-2/containers/
  - https://cscfi.github.io/hpc-container-guide/
  - https://docs.csc.fi/computing/containers/overview/
  - https://docs.csc.fi/support/tutorials/singularity-scratch/

## 2024-10-09 session

**Short talk: SD Connect 2 new features: Introduction and demo**

NOTE: No weekly session on 30.10.!
**Advertisements:** - [Build systems course and support session](https://csc.fi/en/training-calendar/build-systems-course-and-hackathon/) 8.10.2024 - 23.10.2024 - [CSC Computing Environment Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-3/) 6.-7.11.2024, registration open - [Practical Deep Learning](https://csc.fi/en/training-calendar/practical-deep-learning-4/), 12.-13.11.2024, registration open - [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/using-containers-in-supercomputing-environment/) 12.-14.11.2024, registration open - [High-Level GPU programming](https://csc.fi/en/training-calendar/high-level-gpu-programming/) 27.-29.11.2024, registration open - [Schrödinger Maestro workshop](https://csc.fi/en/training-calendar/schrodinger-maestro-workshop-2/) 4.-5.12.2024, registration open - [Julia for HPC](https://csc.fi/koulutuskalenteri/julia-for-high-performance-scientific-computing-enccs-2/) Dec. 9-12 (2024), registration open ### Zoom breakout rooms and topics Room 1 - SD Connect (Francesca) Room 2 - Getting started with CSC services Room 3 - COMSOL Room 4 - Rahti Room 5 - Q311 / batch job Room 6 - Q ### Questions :::danger If you wish you can type your question here before the session. We'll respond to them during the session. **Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.** Previous questions and answers can be found in the archives, see links section above. ::: - **Q310** Hello, I am a COMSOL user. I need help to point the university license into Puhti server./Thanatcha - Jaan: I can try to help with this. What type of university license you have? Is is a file? - Thanatcha: Nope, they told me to change the license to 1705@license.oulu.fi. and I don't know how to put it in. - Jaan: I see, there might be an environment variable that controls this. 
  - Maria: I can already open the breakout rooms for you, it might be easier to sort this out there? Use room 3!
  - Jaan: Try the following in the shell before running COMSOL:
  ```bash
  export COMSOL_LICENSE_FILE=1705@license.oulu.fi
  ```
  - This still uses the Puhti license, not the university one. With the university license it would have the RF module.
  - Jaan: The COMSOL installation on Puhti most likely does not have the module you are looking for.
  - Thanatcha: Should I write about this issue to CSChelp?
  - Jaan: Try to find the installation instructions on the COMSOL website and, on Puhti, install it under the `/projappl/<project_id>/` directory.
  - Thanatcha: I will try. Thanks
- **Q311** Given the following sbatch script for Slurm on Puhti:
  ```
  #!/bin/bash
  #SBATCH --account=project_XXXXX
  #SBATCH --job-name=NA_dataset_collection
  #SBATCH --output=/scratch/project_XXX/ImACCESS/trash/logs/%x_%a_%N_%j.out
  #SBATCH --mail-type=END,FAIL
  #SBATCH --nodes=1
  #SBATCH --ntasks=1
  #SBATCH --cpus-per-task=1
  #SBATCH --mem=16G
  #SBATCH --partition=longrun
  #SBATCH --time=14-00:00:00
  python my_script.py
  ```
  I have a question regarding:
  ```
  #SBATCH --nodes=1
  #SBATCH --ntasks=1
  #SBATCH --cpus-per-task=1
  ```
  Let's say my Python script `my_script.py` contains:
  ```
  import multiprocessing
  number_workers = multiprocessing.cpu_count()
  print(number_workers) # 40
  ```
  Why? Can I increase it even more by adjusting `#SBATCH --cpus-per-task=10` (or 20)?
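A side note on the `cpu_count()` value in the question above, offered as a minimal sketch rather than an official recommendation: `multiprocessing.cpu_count()` returns the number of CPUs in the whole node, not the number the job is allowed to use, so it does not change with `--cpus-per-task`. On Linux, `len(os.sched_getaffinity(0))` gives the CPUs the process can actually run on (this reflects the allocation only if Slurm binds the job to its cores, which is an assumption about the cluster configuration), and Slurm sets `SLURM_CPUS_PER_TASK` when `--cpus-per-task` is given. The helper names `usable_cpus` and `slurm_cpus` are made up for illustration.

```python
import multiprocessing
import os


def usable_cpus() -> int:
    """Number of CPUs this process is allowed to run on."""
    try:
        # Linux only: respects CPU affinity set for the process
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # Other platforms: fall back to the total CPU count
        return os.cpu_count() or 1


def slurm_cpus(default: int = 1) -> int:
    """Cores requested with --cpus-per-task, or `default` outside such a job."""
    return int(os.environ.get("SLURM_CPUS_PER_TASK", default))


if __name__ == "__main__":
    print("CPUs in the node:      ", multiprocessing.cpu_count())
    print("CPUs usable by process:", usable_cpus())
    print("SLURM_CPUS_PER_TASK:   ", slurm_cpus())
```

Sizing a worker pool with `slurm_cpus()` (for example `multiprocessing.Pool(slurm_cpus())`) keeps the number of workers in line with what was requested in the batch script.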
  ```
  $ lscpu # on Puhti
  Architecture: x86_64
  CPU op-mode(s): 32-bit, 64-bit
  Byte Order: Little Endian
  CPU(s): 40
  On-line CPU(s) list: 0-39
  Thread(s) per core: 1
  Core(s) per socket: 20
  Socket(s): 2
  NUMA node(s): 2
  Vendor ID: GenuineIntel
  CPU family: 6
  Model: 85
  Model name: Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz
  Stepping: 7
  CPU MHz: 2101.000
  CPU max MHz: 2101,0000
  CPU min MHz: 800,0000
  BogoMIPS: 4200.00
  L1d cache: 32K
  L1i cache: 32K
  L2 cache: 1024K
  L3 cache: 28160K
  NUMA node0 CPU(s): 0-19
  NUMA node1 CPU(s): 20-39
  Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single intel_ppin ssbd mba ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req pku ospke avx512_vnni md_clear flush_l1d arch_capabilities
  ```
  - JVL: Slurm does not completely restrict your job to the allocated CPUs, and your Python thus sees the other cores too
  - Room 5
- **Q312** I want to install KneadData (a Python-based tool). I am a new user. Can you please guide me? / Abhijit
  - Room 6
- **Q313** I have a login access problem with my project members: authentication from the work network does not work, only from the test network. Login works, but returns to the login page immediately after.
  - Room 6
- **Q314** I have questions regarding resource allocation, which room should I join?
/ Nitin - Room 6 - **Q315** Can I get help with moving sensitive data between SD Desktop and allas and integration with the new system? - Room 1 - **Q316** I have a question regarding queue times and job priority. Anyone that can help me with that? /Markus - Room 5 - Check here: https://docs.csc.fi/support/faq/when-will-my-job-run/ - Instructed to write to helpdesk, Juha L. replied very quickly there <3 ## 2024-10-02 session **No short talk this time** NOTE: No weekly session on 30.10.! **Advertisements:** - [Build systems course and support session](https://csc.fi/en/training-calendar/build-systems-course-and-hackathon/) 8.10.2024 - 23.10.2024 - [Advanced OpenMP - tasks and GPU offloading](https://csc.fi/en/training-calendar/advanced-openmp-tasks-and-gpu-offloading/) 14.-16.10.2024, registration closing on Friday - [CSC Computing Environment Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-3/) 6.-7.11.2024, registration open - [Practical Deep Learning](https://csc.fi/en/training-calendar/practical-deep-learning-4/), 12.-13.11.2024, registration open - [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/using-containers-in-supercomputing-environment/) 12.-14.11.2024, registration open - [High-Level GPU programming](https://csc.fi/en/training-calendar/high-level-gpu-programming/) 27.-29.11.2024, registration open - [Schrödinger Maestro workshop](https://csc.fi/en/training-calendar/schrodinger-maestro-workshop-2/) 4.-5.12.2024, registration open ### Zoom breakout rooms and topics Room 1 - Rahti (Jemal) ### Questions :::danger If you wish you can type your question here before the session. We'll respond to them during the session. **Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.** Previous questions and answers can be found in the archives, see links section above. 
::: - **Q309** I have been running OpenFOAM-11 on mahti with adaptive Mesh refinement. It is a tool in OpenFOAM that refines the mesh according to a criteria or a field. In addition to it I have turned on refinementHistory, which ensures that child cells originating from a parent cell will end up at the same processor as the parent cell. It is like a tree in C++. I have been running in parallel with 3 or more nodes, but the simulation just does not move forward. OpenFOAM shows no error, mahti does not show any error and indicates that the simulation is running. However, the problem does not show up if I use 2 or less nodes. I am puzzled by the situation. I guess it has something to do with the inter-nodal communication. Can you help me locate the issue? If it is a bug or an error on my half. - Unfortunately our OpenFOAM specialist was unable to join the session this time, please write a ticket to servicedesk@csc.fi! ## 2024-09-25 session **Short talk: How to write a good support request ([Slides](https://a3s.fi/saren-2001659-pub/How_to_write_a_support_request_2024-01-31.pdf))** **Advertisements:** - [Data analysis with R](https://csc.fi/en/training-calendar/data-analysis-with-r/) 7.-8.10.2024, registration open - [Build systems course and support session](https://csc.fi/en/training-calendar/build-systems-course-and-hackathon/) 8.10.2024 - 23.10.2024 - [Geocomputing on the Supercomputer](https://csc.fi/en/training-calendar/geocomputing-on-the-supercomputer/) 9.-10.10.2024, registration open - [Advanced OpenMP - tasks and GPU offloading](https://csc.fi/en/training-calendar/advanced-openmp-tasks-and-gpu-offloading/) 14.-16.10.2024, registration closing on Friday - [CSC Computing Environment Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-3/) 6.-7.11.2024, registration open - [Practical Deep Learning](https://csc.fi/en/training-calendar/practical-deep-learning-4/), 12.-13.11.2024, registration open - [Using 
Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/using-containers-in-supercomputing-environment/) 12.-14.11.2024, registration open - [High-Level GPU programming](https://csc.fi/en/training-calendar/high-level-gpu-programming/) 27.-29.11.2024, registration open ### Zoom breakout rooms and topics Room 1 - Getting started with CSC services (Ari-Matti) Room 2 - CSC Computing Environment pre-course questions (Maria) Room 3 - Q306, R/RStudio questions (Heli) Room 4 - Q307, Pouta questions (Jemal) Room 5 - SD services (Kimmo) Room 6 - Q305, Q308 MLflow, Allas (Dean) ### Questions :::danger If you wish you can type your question here before the session. We'll respond to them during the session. **Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.** Previous questions and answers can be found in the archives, see links section above. ::: - **Q305** Using Allas S3 with Mlflow on Rahti 2 - what is the S3 URL / Nazaal - Does this help you? https://github.com/CSCfi/mlflow-openshift/blob/master/docs/USER_GUIDE.md#using-csc-allas-as-artifact-store (although it refers to Rahti 1) - I don't think anyone in CSC has yet tested MLflow on Rahti 2, but if the old tutorial is still correct it should be `s3://your_bucket_name` so you need to know the name of the Allas bucket you have created - Room 6 - A: Rahti2 is using a Helm Chart to deploy MLflow. In the `values.yaml` file, the protocol specified was `s3` and not `https`. We made the change with the customer and the environment variable was good. - **Q306** How to download data from web to puhti and analyse it in puhti's Rstudio or on your own Rstudio? / Susanna - Room 3 - **Q307** Pouta: how to setup an additional Security Group for my Django Web App. 
I have already made one security group for port 8000, with which I run my Django project successfully in a Pouta VM, and it works in a web browser using the URL http://128.214.254.157:8000/. Now I have an additional Django web app for which I would like a new security group, for instance for port 9000, but I cannot run and access the link http://128.214.254.157:9000 even after adding a new security group in my Pouta profile in https://pouta.csc.fi/.
    - Room 4
- **Q308** Rahti 2 migration: Now that the Rahti 2 Build APIs are temporarily closed and you are planning to shut down Rahti 1 on October 11th, how can we migrate our deployments, which use BuildConfig, to Rahti 2? E.g., will the Rahti 1 shutdown date be moved further into the future in case the Rahti 2 Build APIs cannot be opened in time? / Jouni
    - Room 6
    - A: The Build and BuildConfig APIs are temporarily disabled due to a critical CVE. You can request the activation for your project by sending a ticket to servicedesk@csc.fi

## 2024-09-18 session

**Short talk: No short talk this time**

**Advertisements:**
- We're surveying the need to continue licensing **Cambridge Crystallographic Database Tools**. If you have used it, please fill in a quick survey at: https://link.webropolsurveys.com/S/0D663B17FB1E52A2
- We're improving our web site for researchers. If you're interested in contributing to the new pages you can **participate in the testing of the new site** (30-60 minutes, either look for information (based on questions/tasks from us) or sort content to help us categorize topics so that they are found from logical place). No prior knowledge is needed and preparing is forbidden. In particular, we'd like to invite new or beginning students/researchers to help us. 
If you are interested, please send email to servicedesk@csc.fi - **New documentation page on [getting started with supercomputers at CSC](https://docs.csc.fi/support/tutorials/hpc-quick/)** - [Mondays with MATLAB - Nordic HPC workshop](https://se.mathworks.com/company/events/seminars/series/mondays-with-matlab-nordic-hpc-workshops-nrd-2024.html): 9th, 16th, and 24th September - [CSC Computing Environment Part 1: Basics](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-1-basics/) 2.-3.10.2024, registration open - [Data analysis with R](https://csc.fi/en/training-calendar/data-analysis-with-r/) 7.-8.10.2024, registration open - [Build systems course and support session](https://csc.fi/en/training-calendar/build-systems-course-and-hackathon/) 8.10.2024 - 23.10.2024 - [Geocomputing on the Supercomputer](https://csc.fi/en/training-calendar/geocomputing-on-the-supercomputer/) 9.-10.10.2024, registration open - [Advanced OpenMP - tasks and GPU offloading](https://csc.fi/en/training-calendar/advanced-openmp-tasks-and-gpu-offloading/) 14.-16.10.2024, registration open - [CSC Computing Environment Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-3/) 6.-7.11.2024, registration open - **NEW!** [Practical Deep Learning](https://csc.fi/en/training-calendar/practical-deep-learning-4/), 12.-13.11.2024, registration open - [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/using-containers-in-supercomputing-environment/) 12.-14.11.2024, registration open - [High-Level GPU programming](https://csc.fi/en/training-calendar/high-level-gpu-programming/) 27.-29.11.2024, registration open ### Zoom breakout rooms and topics Room 1 - Getting started with CSC services (Ari-Matti) Room 2 - Q304 / Pouta question (Jemal) Room 3 - Room 4 - Room 5 - ### Questions :::danger If you wish you can type your question here before the session. We'll respond to them during the session. 
**Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.** Previous questions and answers can be found in the archives, see links section above. ::: - **Q304** We are using Cpouta service. Can we host our service on this url: https://nts.csc.fi + Mehrdad and Masoud - Breakout room 2 ## 2024-09-11 session **Short talk: Noppe.csc.fi : What’s new** **Advertisements:** - We're surveying the need to continue licensing **Cambridge Crystallographic Database Tools**. If you have used it, please fill in a quick survey at: https://link.webropolsurveys.com/S/0D663B17FB1E52A2 - We're improving our web site for researchers. If you're interested in contributing to the new pages you can **participate in the testing of the new site** (30-60 minutes, either look for information (based on questions/tasks from us) or sort content to help us categorize topics so that they are found from logical place). No prior knowledge is needed and preparing is forbidden. In particular, we'd like to invite new or beginning students/researchers to help us. If you are interested, please send email to servicedesk@csc.fi - **New documentation page on [getting started with supercomputers at CSC](https://docs.csc.fi/support/tutorials/hpc-quick/)** - [Mondays with MATLAB - Nordic HPC workshop](https://se.mathworks.com/company/events/seminars/series/mondays-with-matlab-nordic-hpc-workshops-nrd-2024.html): 9th, 16th, and 24th September - [High performance R](https://csc.fi/en/training-calendar/high-performance-r/): 26.-27.9.2024, registration open until 16.9. 
- [CSC Computing Environment Part 1: Basics](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-1-basics/) 2.-3.10.2024, registration open - [Data analysis with R](https://csc.fi/en/training-calendar/data-analysis-with-r/) 7.-8.10.2024, registration open - [Build systems course and support session](https://csc.fi/en/training-calendar/build-systems-course-and-hackathon/) 8.10.2024 - 23.10.2024 - [Geocomputing on the Supercomputer](https://csc.fi/en/training-calendar/geocomputing-on-the-supercomputer/) 9.-10.10.2024, registration open - [Advanced OpenMP - tasks and GPU offloading](https://csc.fi/en/training-calendar/advanced-openmp-tasks-and-gpu-offloading/) 14.-16.10.2024, registration open - [CSC Computing Environment Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-3/) 6.-7.11.2024, registration open - **NEW!** [Practical Deep Learning](https://csc.fi/koulutuskalenteri/practical-deep-learning-4/), 12.-13.11.2024, registration open - [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/using-containers-in-supercomputing-environment/) 12.-14.11.2024, registration open - [High-Level GPU programming](https://csc.fi/en/training-calendar/high-level-gpu-programming/) 27.-29.11.2024, registration open ### Zoom breakout rooms and topics Room 1 - Getting started with CSC services (Ari-Matti) Room 2 - Noppe (Katri) Room 3 - Q301 (Kimmo) ### Questions - **Q301** I need help to set up a web page for interactive app and database. 
+ Umair / nick name
    - Room 3
- **Q302** Installation of WLS licenses for Gurobi solver (MATLAB) on puhti.csc.fi + Abhijit / nick name
    - Discussed in the main room
    - Install Gurobi and the WLS license file (make sure it's the WLS one, not the 1-year one) to PROJAPPL
    - Reply to your ticket or come and meet us again next week if you have further questions :)
- **Q303** I want to ask about the COMSOL module + Satit
    - Discussed in the main room
    - Reply to your ticket or come and meet us again next week if you have further questions :)

## 2024-09-04 session

**Short talk:** *No short talk this time*

**Advertisements:**
- We're surveying the need to continue licensing **Cambridge Crystallographic Database Tools**. If you have used it, please fill in a quick survey at: https://link.webropolsurveys.com/S/0D663B17FB1E52A2
- We're improving our web site for researchers. If you're interested in contributing to the new pages you can **participate in the testing of the new site** (30-60 minutes, either look for information (based on questions/tasks from us) or sort content to help us categorize topics so that they are found from logical place). No prior knowledge is needed and preparing is forbidden. In particular, we'd like to invite new or beginning students/researchers to help us. 
If you are interested, please send email to servicedesk@csc.fi - **New documentation page on [getting started with supercomputers at CSC](https://docs.csc.fi/support/tutorials/hpc-quick/)** - [Mondays with MATLAB - Nordic HPC workshop](https://se.mathworks.com/company/events/seminars/series/mondays-with-matlab-nordic-hpc-workshops-nrd-2024.html): 9th, 16th, and 24th September - [High performance R](https://csc.fi/en/training-calendar/high-performance-r/): 26.-27.9.2024, registration open - [CSC Computing Environment Part 1: Basics](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-1-basics/) 2.-3.10.2024, registration open - [Data analysis with R](https://csc.fi/en/training-calendar/data-analysis-with-r/) 7.-8.10.2024, registration open - [**Build systems course and support session**](https://csc.fi/en/training-calendar/build-systems-course-and-hackathon/) - [Geocomputing on the Supercomputer](https://csc.fi/en/training-calendar/geocomputing-on-the-supercomputer/) 9.-10.10.2024, registration open - [Advanced OpenMP - tasks and GPU offloading](https://csc.fi/en/training-calendar/advanced-openmp-tasks-and-gpu-offloading/) 14.-16.10.2024, registration open - [CSC Computing Environment Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-3/) 6.-7.11.2024, registration open - [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/using-containers-in-supercomputing-environment/) 12.-14.11.2024, registration open - [High-Level GPU programming](https://csc.fi/en/training-calendar/high-level-gpu-programming/) 27.-29.11.2024, registration open ### Zoom breakout rooms and topics Room 1 - Getting started with CSC services () Room 2 - SD (kimmo) ### Questions - **Q300** Hi! I am trying to extract around 83500 point values (with lat/lon coordinates) from a 1x1km global raster in netCDF format using R. 
I have tried terra::extract and raster::extract but both are incredibly slow. The final goal is to repeat the same process for 468 rasters. Any advice on how to do this more efficiently? / Sara Heikonen
    - Unfortunately our R & spatial data experts are busy atm :( We have notified them about the question. This seems to be related to Puhti and reading files, as the run time varied (morning vs afternoon). On her own laptop it runs faster. Reading from scratch can be slow, so try using LOCAL SCRATCH / NVME (request --gres=nvme ... in the batch script): https://docs.csc.fi/computing/disk/#compute-nodes-with-local-ssd-nvme-disks
    - Remember NOT to WRITE final results to local scratch, as it is deleted after the job is done! :)
    - Check also R + fast local storage: https://docs.csc.fi/apps/r-env/#using-fast-local-storage
    - To copy just the single file (in the batch job script, before starting R; remember to tell R to read from this location): ```cp /path/to/your/file $LOCAL_SCRATCH```

## 2024-08-28 session

**Short talk:** *no short talk this time*

**Advertisements:**
- We're surveying the need to continue licensing **Cambridge Crystallographic Database Tools**. If you have used it, please fill in a quick survey at: https://link.webropolsurveys.com/S/0D663B17FB1E52A2
- We're improving our web site for researchers. If you're interested in contributing to the new pages you can **participate in the testing of the new site** (30-60 minutes, either look for information (based on questions/tasks from us) or sort content to help us categorize topics so that they are found from logical place). No prior knowledge is needed and preparing is forbidden. In particular, we'd like to invite new or beginning students/researchers to help us. 
If you are interested, please send email to servicedesk@csc.fi - **New documentation page on [getting started with supercomputers at CSC](https://docs.csc.fi/support/tutorials/hpc-quick/)** - [High performance R](https://csc.fi/en/training-calendar/high-performance-r/): 26.-27.9.2024, registration open - [CSC Computing Environment Part 1: Basics](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-1-basics/) 2.-3.10.2024, registration open - [Data analysis with R](https://csc.fi/en/training-calendar/data-analysis-with-r/) 7.-8.10.2024, registration open - [**Build systems course and support session**](https://csc.fi/en/training-calendar/build-systems-course-and-hackathon/) - [Geocomputing on the Supercomputer](https://csc.fi/en/training-calendar/geocomputing-on-the-supercomputer/) 9.-10.10.2024, registration open - [CSC Computing Environment Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-3/) 6.-7.11.2024, registration open - [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/using-containers-in-supercomputing-environment/) 12.-14.11.2024, registration open ### Zoom breakout rooms and topics Room 1 - Getting started with CSC services (Samantha) Room 2 - Slurm question (Mitja) Room 3 - PyTorch + machine learning (Mats) Room 4 - (nextflow-based) metagenomic assembly pipeline (Laxman) Room 5 - Pouta/GPU (Jemal) ### Questions :::danger If you wish you can type your question here before the session. We'll respond to them during the session. **Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.** Previous questions and answers can be found in the archives, see links section above. ::: - **Q294** I've been wondering how the resource allocation works. After launching a batch job, you can check its status with the squeue command. 
I do not understand the NODELIST(REASON) column. What do (Priority), (Resources) and (None) mean? Also there's the sshare command which apparently shows how many resources each project has used and how much "good karma" they have, but I don't understand how to interpret the numbers. -Otto
    - answer / room number 2 with Mitja
    - A1: If your job is pending, `NODELIST(REASON)` shows why. `(Priority)` means that you are queued behind a job with a higher priority. `(Resources)` means that there isn't a job with a higher priority than yours on the partition, but some of the resources you have requested are still in use. Note, however, that the backfill scheduler starts jobs with a lower priority before jobs with a higher priority if doing so does not delay the start time of the higher-priority jobs. I believe `(None)` only occurs when a job has just been submitted or cancelled.
    - A2: Running `sshare -m -U $USER` should produce a more sensible output. You have a share in each Slurm partition, and the FairShare field displays a factor which, I assume, is used to multiply your job's starting Priority. The priority is a float between 0 and 1.
- **Q295** I have some questions for Mats about using PyTorch on Puhti. -Eino
    - answer / room 3
- **Q296** What is the current status of SD-desktop+supercomputers development? - Ari-Pekka
    - answer / room number will appear here
- **Q297** Is there a recommended practice for developing a GPU-accelerated LLM inference application with Pouta? Pouta does offer GPU flavors, but they are expensive in terms of billing and the VMs cannot easily be changed between VM flavor families (from normal to GPU). - Andrei
    - https://docs.csc.fi/cloud/pouta/application-dev/
    - Room 5 / Jemal
    - Here is the link on how to take a snapshot: https://docs.csc.fi/cloud/pouta/snapshots/
- **Q298** I am running a (nextflow-based) metagenomic assembly pipeline but it takes significantly more time to complete, any suggestions for speeding up the process? 
    - answer / room number 4
- **Q299** I have some images in Google Cloud Storage (GCS). How can I connect to GCS from Puhti and read these images directly? There are thousands of images... it is not possible to download them. / Omid
    - https://docs.csc.fi/support/tutorials/ml-data/#fast-local-drive-puhti-and-mahti-only

    ```
    # 1) Open Puhti web interface
    # 2) Open Desktop app
    # 3) Open terminal in the opened Desktop
    module load geoconda
    python
    import os; path = os.environ['PATH']; os.environ['PATH'] = '/appl/opt/csc-cli-utils/google-cloud-sdk/bin:' + path
    import ee
    ee.Authenticate()
    # It prints out a long link. Copy it
    # Open Web Browser under application
    # Copy the link to there
    # log in to Google
    # Copy the code back to Python
    ee.Initialize()
    ```

    ```
    cd /appl/opt/csc-cli-utils/google-cloud-sdk/bin
    gcloud auth login --no-launch-browser
    .. it prints out a long link that I copied to local laptop web-browser and then got back a longish key to copy back to Puhti.
    This seemed to work and created some new files under:
    /users/<username>/.config/gcloud
    ```

## 2024-08-21 session

**Short talk:** What are build systems and why may I need one? Materials will be on the event page: https://csc.fi/koulutuskalenteri/build-systems-course-and-hackathon/

**Advertisements:**
- We're surveying the need to continue licensing **Cambridge Crystallographic Database Tools**. If you have used it, please fill in a quick survey at: https://link.webropolsurveys.com/S/0D663B17FB1E52A2
- We're improving our web site for researchers. If you're interested in contributing to the new pages you can **participate in the testing of the new site** (30-60 minutes, either look for information (based on questions/tasks from us) or sort content to help us categorize topics so that they are found from logical place). No prior knowledge is needed and preparing is forbidden. In particular, we'd like to invite new or beginning students/researchers to help us. 
If you are interested, please send email to servicedesk@csc.fi - **New documentation page on [getting started with supercomputers at CSC](https://docs.csc.fi/support/tutorials/hpc-quick/)** - [High performance R](https://csc.fi/en/training-calendar/high-performance-r/): 26.-27.9.2024, registration open - [CSC Computing Environment Part 1: Basics](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-1-basics/) 2.-3.10.2024, registration open - [Data analysis with R](https://csc.fi/en/training-calendar/data-analysis-with-r/) 7.-8.10.2024, registration open - [**Build systems course and support session**](https://csc.fi/en/training-calendar/build-systems-course-and-hackathon/) - [Geocomputing on the Supercomputer](https://csc.fi/en/training-calendar/geocomputing-on-the-supercomputer/) 9.-10.10.2024, registration open - [CSC Computing Environment Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-3/) 6.-7.11.2024, registration open - [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/using-containers-in-supercomputing-environment/) 12.-14.11.2024, registration open ### Zoom breakout rooms and topics Room 1 - Getting started with CSC services (Mitja) Room 2 - Q291 (Mats) Room 3 - Q292 (Kimmo) Room 4 - Q293 (Jemal) Room 5 - Build systems (Juha) Room 6 - GIS (Samantha) ### Questions :::danger If you wish you can type your question here before the session. We'll respond to them during the session. **Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.** Previous questions and answers can be found in the archives, see links section above. ::: - **Q291** I'm running an XLM-R model and I've been getting this error: OSError: Not found: "/projappl/project_XXXXXX/.cache/xlm-roberta-base/\[bunch-of-numbers]": Permission denied Error #13. I wonder what could be causing this... 
- Room 2, check unix permissions, group should have read and write access - **Q292** Multiple questions regarding Allas, command line, IDA, downloading software that is not one of the pre-installed softwares, etc... / Signe UTU - Room 3 - **Q293** Multiple questions regarding cPouta, automation, virtual machines and hosting machine learning models with cPouta / Pritom - Room 4 ## 2024-08-14 session **Advertisements:** - **New documentation page on [getting started with supercomputers at CSC](https://docs.csc.fi/support/tutorials/hpc-quick/)** - [Fundamentals of Machine Learning](https://csc.fi/en/training-calendar/fundamentals-of-machine-learning/) 21.-22.8.2024, registration open - [High performance R](https://csc.fi/en/training-calendar/high-performance-r/): 26.-27.9.2024, registration open - [CSC Computing Environment Part 1: Basics](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-1-basics/) 2.-3.10.2024, registration open - [Data analysis with R](https://csc.fi/en/training-calendar/data-analysis-with-r/) 7.-8.10.2024, registration open - [Geocomputing on the Supercomputer](https://csc.fi/en/training-calendar/geocomputing-on-the-supercomputer/) 9.-10.10.2024, registration open - [CSC Computing Environment Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-3/) 6.-7.11.2024, registration open - [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/using-containers-in-supercomputing-environment/) 12.-14.11.2024, registration open ### Zoom breakout rooms and topics Room 1 - Getting started with CSC services (Kimmo) Room 2 - Q289 (Samantha) Room 3 - Room 4 - Room 5 - ### Questions :::danger If you wish you can type your question here before the session. We'll respond to them during the session. 
**Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.** Previous questions and answers can be found in the archives, see links section above.
:::

- **Q288** What time are CSC services available daily? I noticed that our database in Pukki could not be accessed in the evening. / Maarit Ryynänen
    - Pukki (and other CSC network services) are accessible 24/7, so if it wasn't accessible in the evening, there must be some other issue, such as a temporary network problem. If you use a different computer in the evening, it could also be related to firewall settings: https://docs.csc.fi/cloud/dbaas/firewalls/
- **Q289** How to install a custom Python environment on CSC from a GitHub repository via a yaml file? I want to initialize a developer enmap-box Python environment from https://github.com/EnMAP-Box/enmap-box. In Mamba it would be done like this: `mamba env create -f https://raw.githubusercontent.com/EnMAP-Box/enmap-box/main/.env/conda/enmapbox_full_latest.yml`. Additionally, how to fork the GitHub repository or folders so that specific functions, which are in development in the repository, can be imported as Python commands? / Leon T
    - If you have a conda/mamba yaml file, you can probably use our Tykky tool for that: https://docs.csc.fi/computing/containers/tykky/
    - Cloning a git repository and installing with `pip install --user -e .` might do what you want, or simply adding the cloned path to `PYTHONPATH`, depending a bit on the details...
- **Q290** How to know how many resources to reserve for a batch job? In case of big jobs, it is difficult to estimate, as trying again and again takes a lot of time. Benjamin
    - Depends a lot on the software. A good way is to experiment on the test or gputest partition, where jobs start faster but can run for a maximum of 15 minutes. Once you have it working there you can move to the real partition.
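The test-partition advice for Q290 can be turned into rough numbers. The following is only a minimal sketch, not a CSC tool: the function `suggest_request` and its parameters `work_scale` and `margin` are hypothetical names. It assumes that run time grows linearly with the amount of work and that peak memory stays about the same, which has to be checked for each program. The measured values would come from a short test run, for example from the output of `seff`.

```python
import math


def suggest_request(test_elapsed_s, test_peak_mem_mib, work_scale, margin=1.5):
    """Extrapolate a Slurm time/memory request from a short test run.

    ASSUMPTIONS (hypothetical, verify for your software):
    - run time scales linearly with the amount of work (work_scale = how many
      times larger the full job is than the test run)
    - peak memory stays roughly constant between test and full run
    margin is a safety factor applied to both time and memory.
    """
    if work_scale <= 0 or margin <= 0:
        raise ValueError("work_scale and margin must be positive")

    # Estimated wall time of the full job, rounded up to whole minutes
    total_minutes = math.ceil(test_elapsed_s * work_scale * margin / 60)
    hours, minutes = divmod(total_minutes, 60)
    days, hours = divmod(hours, 24)

    # Slurm accepts the D-HH:MM:SS format for --time
    time_str = f"{days}-{hours:02d}:{minutes:02d}:00"
    mem_str = f"{math.ceil(test_peak_mem_mib * margin)}M"
    return {"time": time_str, "mem": mem_str}


# Example: a 10 minute test run that used 3000 MiB, full job is 20x larger
request = suggest_request(test_elapsed_s=600, test_peak_mem_mib=3000, work_scale=20)
print(f"#SBATCH --time={request['time']}")
print(f"#SBATCH --mem={request['mem']}")
```

The result is a starting point only: after the first real run, compare the request with the actual usage and adjust, and check that the requested time fits within the limits of the partition you submit to.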
## 2024-08-07 session First session after summer break **Advertisements:** - New documentation page on [getting started with supercomputers at CSC](https://docs.csc.fi/support/tutorials/hpc-quick/) - [Fundamentals of Machine Learning](https://csc.fi/en/training-calendar/fundamentals-of-machine-learning/) 21.-22.8.2024, registration open - [High performance R](https://csc.fi/en/training-calendar/high-performance-r/): 26.-27.9.2024, registration open - [CSC Computing Environment Part 1: Basics](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-1-basics/) 2.-3.10.2024, registration open - [Data analysis with R](https://csc.fi/en/training-calendar/data-analysis-with-r/) 7.-8.10.2024, registration open - [Geocomputing on the Supercomputer](https://csc.fi/en/training-calendar/geocomputing-on-the-supercomputer/) 9.-10.10.2024, registration open - [CSC Computing Environment Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-3/) 6.-7.11.2024, registration open - [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/using-containers-in-supercomputing-environment/) 12.-14.11.2024, registration open ### Zoom breakout rooms and topics Room 1 - Getting started with CSC services (Samantha) Room 2 - Q284 (Laxmana) Room 3 - Q285 (Lukas) Room 4 - R questions (Heli) Room 5 - ### Questions :::danger If you wish you can type your question here before the session. We'll respond to them during the session. **Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.** Previous questions and answers can be found in the archives, see links section above. ::: - **Q283** I sent a job (Matlab) to Puhti using batch command. Since I need to calculate using my own function, I used "AttachedFiles". 
Attaching one file was fine, but for more than one file, I got an error message from Matlab for both methods: `batch(..., "AttachedFiles", {"my_func_1.m", "my_func_2.m"})` and `batch(..., "AttachedFiles", {'my_func_1.m', 'my_func_2.m'})`. I have no idea how to solve this problem. Next, what is the maximum value for WallTime? I entered 10-00:00:00 to get 10 days and Matlab gave me an error message that the parameter was invalid. I managed to get it right with 24:00:00. The last question is a different one, because I ran Matlab on the Puhti server after logging in using PuTTY. After loading the matlab module, I opened Matlab and tried to load one of the .mat files. Unfortunately, Matlab could not read the file although the file was there and I typed its name correctly. The file attribute was -rw-rw-r--. Thanks a lot. ~Hany Ferdinando
    - Unfortunately our Matlab specialists are on holiday right now. Please send an email to servicedesk@csc.fi and they will get back to you when they return.
- **Q284** I am trying to install a software called mapDamage in my working environment to analyse DNA damage. I have followed the steps provided by the developers but the software doesn't run. For installing it, I had to load Python, which I am not familiar with, and it seems that I either have permission issues or have set up the wrong path. I would appreciate some help to solve my issue. Thanks! - Audrey Bras
    - Instructions for the regular and container approaches were reviewed. A demo of the container approach was presented to the user. All needed programs seem to be present in the container image. If there are further issues, the user was advised to contact us.
- **Q285** Kati: Tykky is not installing, and a question about numpy files and processing without me sitting at a computer and waiting (how puhti.csc.fi works for example 15 hours in a row. It seems that if my screen goes to sleep, the processing stops or something.)
    - Discussed many questions. Some general Tykky installations, some related to GIS. 
room 3
- **Q286** I would like to run 5 different R scripts simultaneously as a parallel multicore run, by logging into the Puhti UNIX shell via ssh. What would be the simplest way to accomplish this kind of parallel multicore run in Puhti? - Ville
    - room 4 / An array job where each subjob uses multiple cores was discussed as a solution. Also discussed why doParallel is slower on Puhti than on one's own computer: is the number of cores not detected correctly?
- **Q287** I need help with my inferCNV run, which I have already been in contact about. Sadiksha / nick name
    - room 4 / The analysis takes a lot longer on Puhti than on a normal computer. Using local scratch for the large dataset doesn't seem to help, more testing and profiling needed.

## 2024-06-19 session

Short talk: Moving data from scratch to Allas

**Advertisements:**
- [High performance R](https://csc.fi/en/training-calendar/high-performance-r/): 26.-27.9.2024, registration open
- [CSC Computing Environment Part 1: Basics](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-1-basics/) 2.-3.10.2024, registration open
- [Data analysis with R](https://ssl.eventilla.com/event/v8Kv9) 7.-8.10.2024, registration open
- [CSC Computing Environment Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-3/) 30.-31.10.2024, registration open
- [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/using-containers-in-supercomputing-environment/) 12.11.2024 - 14.11.2024, registration open

### Zoom breakout rooms and topics

Room 1 - Beginner/getting started-questions (Ari-Matti)
Room 2 - Allas (Kimmo)
Room 3 - PyTorch, LLM & machine learning (Mats)
Room 4 - Q274 (Xavier)
Room 5 - Q282 (Jarmo, Heli)

### Questions

:::danger
If you wish you can type your question here before the session. We'll respond to them during the session. 
**Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.** Previous questions and answers can be found in the archives, see links section above.
:::

- **Q274** Help with snakemake, python, mambaforge/conda (Reference [here](https://github.com/theislab/scib-pipeline?tab=readme-ov-file#installation)) / Name: Rishi
    - This looks like an R environment? Have you checked if you can use our provided [r-env](https://docs.csc.fi/apps/r-env/) module? We also have a snakemake module: https://docs.csc.fi/apps/snakemake/ which could be used together with r-env to run your pipeline.
    - It also depends heavily on conda/mambaforge, that is the biggest challenge for me
    - I assume we are talking about running your snakemake workflow on the Puhti supercomputer? Yes
    - OK, you cannot use conda directly on Puhti, but we have a lot of tools preinstalled for you in modules. That would be the first thing to check: can you make use of the existing modules (see above)?
    - If not, then using our Tykky tool might be the easiest way to go: https://docs.csc.fi/support/tutorials/snakemake-puhti/#snakemake-tykky-installation-for-python ; not sure how this works with R, but we can try. We can discuss further in a breakout room.
    - This snakemake pipeline "scib-pipeline" depends on Python, R and a conda environment. I want to know what would be an easier way to customize it so that I can run it on Puhti. I am not very comfortable with Tykky and have never used snakemake. Therefore I would like to know if it will be possible to use this pipeline easily on Puhti.
    - OK, we'll see if we have a specialist available for you after the short talk.
    - thanks
    - From the R side: r-env won't work as it is, because you can't create conda environments there and it usually can't interact with Tykky environments on Puhti. Probably the options are 1) a Tykky environment that has R or 2) a custom r-env module with the required conda environments. 
    - room 4
- **Q275** I installed the NLP library spaCy in a venv on top of the pytorch module. I try to use it for training language models using a GPU. Although the installation seemingly went through, spaCy is extremely slow. Any help finding out what's wrong would be welcome. Name: Tatu Leppämäki
    - You can use seff or nvidia-smi to check if it is actually using the GPUs: https://docs.csc.fi/support/tutorials/gpu-ml/#tools-for-monitoring-gpu-utilization
    - Make sure you installed it with CUDA 12.x support: `pip install -U 'spacy[cuda12x]'`
    - Breakout room 3 for follow-up questions
- **Q276** I need to move files from IDA to Allas. How can I do this? I'm a total beginner in using the resources of CSC. Kalle Ruokolainen
    - You need to download the data first from IDA to some computer and then upload it to Allas. Puhti is a good place to do the data transfer. https://docs.csc.fi/data/moving/copy_allas_ida/
    - Thanks, this clarified the case. :)
- **Q277** I need help with ensembl-vep installation / Rodney
    - What kind of application is this? Or what have you tried? (This info helps us to find the right specialist)
    - ensembl-vep is available in Bioconda so you can follow this tutorial: https://docs.csc.fi/support/tutorials/bioconda-tutorial/
    - Let us know if you need more support.
    - I have followed the tutorial but I get a consistent error about some missing cache
    - room 1
- **Q278** I want to run LLM models locally on the server, but when I try to download large models (10G or more) I get a disk quota error. Can you please guide me on where I should download the models in such a case? / Salwa
    - It's probably downloading the models to your home directory. It's better to use the scratch directory of a project. You need to check the documentation of the LLM library for how to redirect the files to another directory. 
For example for Huggingface you can set the environment variable `$HF_HOME`, for example: `export HF_HOME=/scratch/project_2001234/hf-home` - Breakout room 3 for more questions. - **Q279** How long can the data be stored in Allas? I am a beginner and started with my analysis in Puhti. Can I not store my data inside my /scratch/project_number directory? I expect my data size to be big because I have about 50 scRNA-seq samples. - During CSC-project lifetime - You can store it on scratch as long as you actively use it. - What is "big" in your case? Mb, Gb, Tb? > Gb - Breakoutroom 2 after short talk :) - **Q280** I am a little confused still. Does it make sense to move the data from puhti to allas everytime I have some changes in my data? Or when should we move data to Allas? - When you are not actively using it on Puhti anymore or when you want to store the data for longer/have it safe during project lifetime - breakoutroom 2 after the short talk :) - **Q281** I am pretty new to CSC and I have troubles in setting up ssh keys and connecting to to puhti?/I also want know whether I can run COMSOL projects in puhti. - Breakoutroom 1 after the short talk - **Q282** When using R (or Rstudio), what is the difference between using interactive and small? Does using interactive use more billing units? - The main difference is the amount of maximum resources you can reserve in these partitions. - Interactive is the default for RStudio. Small has higher maximum resources and usually longer queuing times. 
- room 5 - Also discussed what happens when job time runs out, need to move to batch jobs with big R jobs, how to ## 2024-06-19 session Short talk: Moving data from scratch to Allas **Advertisements:** - [High performance R](https://csc.fi/en/training-calendar/high-performance-r/): 26.-27.9.2024, registration open - [CSC Computing Environment Part 1: Basics](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-1-basics/) 2.-3.10.2024, registration open - [Data analysis with R](https://ssl.eventilla.com/event/v8Kv9) 7.-8.10.2024, registration open - [CSC Computing Environment Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-3/) 30.-31.10.2024, registration open - [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/using-containers-in-supercomputing-environment/) 12.11.2024 - 14.11.2024, registration open ### Zoom breakout rooms and topics Room 1 - Beginner/getting started-questions (Ari-Matti) Room 2 - Allas (Kimmo) Room 3 - PyTorch, LLM & machine learning (Mats) Room 4 - Q274 (Xavier) Room 5 - Q282 (Jarmo, Heli) ### Questions :::danger If you wish you can type your question here before the session. We'll respond to them during the session. **Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.** Previous questions and answers can be found in the archives, see links section above. ::: - **Q274** Help with snakemake, python, mambaforge/conda (Reference [here](https://github.com/theislab/scib-pipeline?tab=readme-ov-file#installation)) and Name: Rishi - this looks like an R environment? Have you checked if you can use our provided [r-env](https://docs.csc.fi/apps/r-env/) module? We also have a snakemake module : https://docs.csc.fi/apps/snakemake/ which could be used together with r-env to run your pipeline. 
    - It also depends heavily on conda/mambaforge, that is the biggest challenge for me
    - I assume we talk about running your snakemake workflow on the Puhti supercomputer? Yes
    - Ok, you cannot use conda directly on Puhti, but we have a lot of tools preinstalled for you in modules. That would be the first thing to check: whether you can make use of the existing modules (see above)
    - If not, then using our Tykky tool might be the easiest way to go: https://docs.csc.fi/support/tutorials/snakemake-puhti/#snakemake-tykky-installation-for-python ; not sure how this works with R, but we can try. We can discuss further in a breakout room
    - This snakemake pipeline "scib-pipeline" depends on Python, R and a conda environment. I want to know what would be the easiest way to customize it so that I can run it on Puhti. I am not very comfortable with Tykky and have never used snakemake. Therefore I would like to know if it will be possible to use this pipeline easily on Puhti.
    - Ok, we'll see if we have a specialist available for you after the short talk.
    - thanks
    - From R side: r-env won't work as it is, because you can't create conda environments there and it usually can't interact with Tykky environments on Puhti. Probably the options are 1) Tykky environment that has R or 2) a custom r-env module with the required conda environments.
    - room 4
- **Q275** I installed the NLP library spaCy in a venv on top of the pytorch module. I try to use it for training language models using a GPU. Although the installation seemingly went through, spaCy is extremely slow. Any help finding out what's wrong would be welcome. Name: Tatu Leppämäki
    - You can use seff or nvidia-smi to check if it is actually using the GPUs: https://docs.csc.fi/support/tutorials/gpu-ml/#tools-for-monitoring-gpu-utilization
    - Make sure you installed it with CUDA 12.x support: `pip install -U 'spacy[cuda12x]'`
    - Breakout room 3 for follow-up questions
- **Q276** I need to move files from IDA to Allas. How can I do this?
I'm a total beginner in using the resources of CSC. Kalle Ruokolainen
    - You need to download the data first from IDA to some computer and then upload it to Allas. Puhti provides a good place to do the data transport. https://docs.csc.fi/data/moving/copy_allas_ida/
    - Thanks, this clarified the case. :)
- **Q277** I need help with ensembl-vep installation / Rodney
    - What kind of application is this? Or what have you tried? (This info helps us to find the right specialist)
    - ensembl-vep is available in Bioconda so you can follow this tutorial: https://docs.csc.fi/support/tutorials/bioconda-tutorial/
    - let us know if you need more support.
    - I have followed the tutorial but I get a consistent error about some missing cache
    - room 1
- **Q278** I want to run LLM models locally on the server, but when I try to download large models (10G or more) I get a disk quota error. Can you please guide me on where I should download the models in such a case? / Salwa
    - It's probably downloading the models to your home directory. It's better to use the scratch directory of a project. You need to check the documentation of the LLM library for how to redirect the files to another directory. For Huggingface, for example, you can set the environment variable `HF_HOME`: `export HF_HOME=/scratch/project_2001234/hf-home`
    - Breakout room 3 for more questions.
- **Q279** How long can the data be stored in Allas? I am a beginner and started with my analysis in Puhti. Can I not store my data inside my /scratch/project_number directory? I expect my data size to be big because I have about 50 scRNA-seq samples.
    - During CSC-project lifetime
    - You can store it on scratch as long as you actively use it.
    - What is "big" in your case? MB, GB, TB? > GB
    - Breakoutroom 2 after short talk :)
- **Q280** I am a little confused still. Does it make sense to move the data from Puhti to Allas every time I have some changes in my data? Or when should we move data to Allas?
    - When you are not actively using it on Puhti anymore or when you want to store the data for longer/have it safe during project lifetime
    - breakoutroom 2 after the short talk :)
- **Q281** I am pretty new to CSC and I have trouble setting up ssh keys and connecting to Puhti. I also want to know whether I can run COMSOL projects on Puhti.
    - Breakoutroom 1 after the short talk
- **Q282** When using R (or RStudio), what is the difference between using interactive and small? Does using interactive use more billing units?
    - The main difference is the amount of maximum resources you can reserve in these partitions.
    - Interactive is the default for RStudio. Small has higher maximum resources and usually longer queuing times.
    - room 5
    - Also discussed what happens when job time runs out, need to move to batch jobs with big R jobs, how to parallelize Seurat.

## 2024-06-12 session

Short talk: New features of the Puhti, Mahti and LUMI web interfaces

**Advertisements:**
- [High performance R](https://csc.fi/en/training-calendar/high-performance-r/): 26.-27.9.2024, registration open
- [CSC Computing Environment Part 1: Basics](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-1-basics/) 2.-3.10.2024, registration open
- [Data analysis with R](https://ssl.eventilla.com/event/v8Kv9) 7.-8.10.2024, registration open
- [CSC Computing Environment Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-3/) 30.-31.10.2024, registration open
- [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/using-containers-in-supercomputing-environment/) 12.11.2024 - 14.11.2024, registration open

### Zoom breakout rooms and topics

Room 1 - Beginner/getting started-questions (Kimmo)
Room 2 - Puhti, Mahti and LUMI web user interface (Robin)
Room 3 - Q272, Matlab (Jaan)
Room 4 - Q271, GPU and machine learning (Mats)
Room 5 -

### Questions

:::danger
If you wish you can type your question here
before the session. We'll respond to them during the session.

**Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.**

Previous questions and answers can be found in the archives, see links section above.
:::

- **Q271** Matti Nelimarkka: I'm running a GPU job on Puhti. I get an error "torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 16.00 MiB. GPU" Can I increase the memory allocation?
    - No, with the GPU you always get the full memory of the GPU, 32 GB in the case of the NVIDIA V100 in Puhti. (It's not like the CPU or RAM memory where you can allocate different amounts). One option is to switch to Mahti, which has NVIDIA A100 with 40 GB of GPU memory. Another option is to use a library/framework that can do model parallelism, that is, split the model up over several GPUs. For example in accelerate and Huggingface you just need to reserve multiple GPUs and set `device_map="auto"` in the proper spot: https://huggingface.co/docs/accelerate/usage_guides/big_modeling (if you're doing inference, training is a bit more complex).
- **Q272** Xiang Li: When I connect to Matlab through the My Interactive Sessions interface on Puhti website, a dialog box appears in the status information saying "please authenticate", but I don't know what to fill in. Could you please tell me? Thanks!
    - It seems that currently the access token does not propagate to the session. You can find the token in the output log and copy it from there.
    - ~~Clicking the connect button again usually solves the problem.~~ Issue will be fixed in the update tomorrow (13.6 9:00). / Robin
- **Q273** Rahti, A use case we need is to implement ingress traffic to a pod but allow egress traffic from it. / Maarit
    - You can enforce the rules by implementing network policies on the pods. Have a look at the [Networking](https://docs.csc.fi/cloud/rahti2/networking/) docs. /Jemal.
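
Regarding Q271, reserving several GPUs for one job could look roughly like the batch script below. This is only a sketch under assumptions: the project number, CPU, memory and time values are placeholders, and the commented-out `my_inference.py` is a hypothetical script that would set `device_map="auto"` itself. The GPU count check only reports what the job can see.

```bash
#!/bin/bash
#SBATCH --account=project_2001234   # placeholder project number
#SBATCH --partition=gpu
#SBATCH --gres=gpu:v100:2           # two V100 GPUs, 32 GB of GPU memory each
#SBATCH --cpus-per-task=10          # assumed value, adjust to your needs
#SBATCH --mem=64G                   # CPU RAM, separate from GPU memory
#SBATCH --time=00:30:00

# Count the GPUs visible to the job; "nvidia-smi -L" prints one "GPU n: ..." line per GPU.
if command -v nvidia-smi >/dev/null 2>&1; then
    NUM_GPUS=$(nvidia-smi -L 2>/dev/null | grep -c '^GPU ')
else
    NUM_GPUS=0   # not running on a GPU node
fi
echo "GPUs visible to this job: ${NUM_GPUS}"

# Hypothetical workload, shown commented out:
# module load pytorch
# srun python3 my_inference.py
```

If the script reports fewer GPUs than requested, the model cannot be spread over more devices than that, whatever the framework settings are.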
## 2024-06-05 session

Short talk: No short talk

**Advertisements:**
- [High performance R](https://csc.fi/en/training-calendar/high-performance-r/): 26.-27.9.2024, registration open
- [CSC Computing Environment Part 1: Basics](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-1-basics/) 2.-3.10.2024, registration open
- [Data analysis with R](https://ssl.eventilla.com/event/v8Kv9) 7.-8.10.2024, registration open
- [CSC Computing Environment Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-3/) 30.-31.10.2024, registration open
- [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/using-containers-in-supercomputing-environment/) 12.11.2024 - 14.11.2024, registration open

### Zoom breakout rooms and topics

Room 1 - Beginner/getting started-questions (Kimmo)
Room 2 - Rahti (Jemal & Joona)
Room 3 - R-studio & github (Samantha)
Room 4 - PyTorch & machine learning (Mats)
Room 5 -

### Questions

:::danger
If you wish you can type your question here before the session. We'll respond to them during the session.

**Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.**

Previous questions and answers can be found in the archives, see links section above.
:::

- **Q268** Question related to Rahti migration / Rishi
    - https://sulic.rahtiapp.fi/
- **A268** Here is the documentation that you could use to redirect URLs: [Integrating external service Rahti](https://docs.csc.fi/cloud/tutorials/integrating-external-services/)
- **Q269** Question about R-studio on github course / kati
    - Room 3
- **Q270** Not torch.cuda.is_available(), UserWarning: Can't initialize NVML / Kråkan
    - answer / room number will appear here

## 2024-05-29 session

Short talk: No short talk

**Advertisements:**
- High performance R: 26.-27.9.2024, registration opens soon
- [Data analysis with R](https://ssl.eventilla.com/event/v8Kv9) 7.-8.10.2024, registration open
- [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/using-containers-in-supercomputing-environment/) 12.11.2024 - 14.11.2024, registration open
- [CSC Computing Environment Part 1: Basics](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-1-basics/) 2.-3.10.2024, registration opening later
- [CSC Computing Environment Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-3/) 30.-31.10.2024, registration opening later

### Zoom breakout rooms and topics

Room 1 - Beginner/getting started-questions
Room 2 -
Room 3 -
Room 4 -
Room 5 -

### Questions

:::danger
If you wish you can type your question here before the session. We'll respond to them during the session.

**Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.**

Previous questions and answers can be found in the archives, see links section above.
:::

- **Q264** Pytorch (Nvidia/AMD gpus) on LUMI / Nitik
    - answer / room number will appear here
- **Q265** Getting started with CSC services for ML -> discussed in breakoutroom 1
- **Q266** Rahti for hosting html webservice -> discussed in breakoutroom 1
- **Q267** Pukki related question, permission change issues -> forwarded to servicedesk

## 2024-05-22 session

Short talk: "MyCSC overview & what's new"

NOTE: [LUMI coffee break](https://www.lumi-supercomputer.eu/events/usercoffeebreaks/) happening at the same time

**Advertisements:**
- [Services for Research Webinar](https://ssl.eventilla.com/event/eKGB1) earlier today
- High performance R: 26.-27.9.2024, registration opens soon
- [Data analysis with R](https://ssl.eventilla.com/event/v8Kv9) 7.-8.10.2024, registration open
- [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/using-containers-in-supercomputing-environment/) 12.11.2024 - 14.11.2024, registration open

### Zoom breakout rooms and topics

Room 1 - Beginner/getting started-questions
Room 2 - Rahti
Room 3 -
Room 4 -
Room 5 -

### Questions

- **Q262** I would like to ask about Rahti 2 migration / David
    - **A:** A good starting point is to look at this Rahti 1 to Rahti 2 migration guide: <https://docs.csc.fi/cloud/rahti/rahti-migration/>. Regarding the Rahti 2 registry, you can access it from the terminal: <https://docs.csc.fi/cloud/rahti2/usage/cli/#how-to-login-in-the-registry>. (Breakout Room 2, Jemal)
- **Q263** This is rather a request than a question: Would it be possible to show billing units consumption separately for computation and disc storage? In our project, a large proportion of billing units are consumed by 'system or past members' on Puhti. I assume it's our increased quota on scratch, but it would be nice to see more specifically.
/ Heli
    - The detailed consumption is not visible in the project preview widget. Click the "Open" button next to the project selection dropdown. Detailed consumption is visible in the "Project Content Overview" section.

## 2024-05-15 session

Short talk: No short talk (Note: Previously advertised "MyCSC overview & what's new" will be on 22.5. Sorry for the confusion)

### Zoom breakout rooms and topics

Room 1 - Beginner/getting started-questions
Room 2 - Support session for course "CSC Computing Environment, Part 2"
Room 3 - R / Q254 (Heli)
Room 4 - Python & machine learning / Q256 (Mats)
Room 5 - MATLAB (Jaan)

### Questions

- **Q254** How do you use RStudio via the Puhti web interface? How do you install new R packages and how can you access all your folders, not only the folder 'ondemand'? How do you set the working directory in the web interface? + Antero Heikkilä
    - Room 3
    - package installation: https://docs.csc.fi/apps/r-env/#r-package-installations
    - viewing other folders than the home directory in the file view: three dots on the right side of the files panel -> /scratch/project_xxx etc.
    - Setting work directory: `setwd("/scratch/project_xxx/folder")`
- **Q255** Is it possible to change kernel parameters for specific pods in Rahti? vm.max_map_count is the one I'm looking at for Elasticsearch, but it's blacklisted for normal users. If it's not possible, then it seems cPouta is the sane alternative for such hosting. + Henri Haapanen
    - It is not possible, sorry. cPouta will be a good alternative. You can also create a ticket to servicedesk@csc.fi and we will see if there is any workaround.
    - Also Rahti 1 and 2 might have different parameters. Be sure to try in Rahti 2.
    - Consider
- **Q256** Install package fa2 with old python 3.8 in puhti, final goal to use with jupyter. This package is not compatible with python 3.9. [Description](https://github.com/bhargavchippada/forceatlas2/issues/34).
Scanpy package [depends](https://scanpy.readthedocs.io/en/stable/generated/scanpy.tl.draw_graph.html) on it. Is it possible to use GPU? [scvi-tools](https://docs.scvi-tools.org/en/stable/installation.html#gpu) + Rishi Das Roy
    - For just scanpy and fa2 this seems to work:
      ```
      module load python-data/3.8
      pip install --user fa2 scanpy
      ```
      (or even better to install to venv). For scvi-tools it's a bit tricky as it seems to need both PyTorch and JAX? If you can do with only one of them, using either `pytorch` or `jax` modules as starting points might be the easiest option. We can discuss in Room 4.
- **Q257** How do you use Jupyter via Puhti webinterface? Also, how to launch the session with a venv? + Nina Peltokangas / nick name
    - Room 4
- **Q258** I had difficulty to move the Python to another directory (export PYTHONUSERBASE) today at the course, could someone have a look at that with me? Mira Kajanus
    - Sorry for the unclear instructions. You will have to set `PYTHONUSERBASE` before installing the application. If this variable is not set, the application is, by default, installed in the `/users/<username>/.local/bin` folder.
- **Q259** Can we use MATLAB on CSC (Puhti) servers? + Muhammad
    - Do you mean Puhti or Mahti?
    - Let's take room 5
- **Q260** Multiprocessing with GPUs on Puhti + Nitik
    - Room 4
- **Q261** I am facing a problem accessing LUMI through SSH. When I try to log in through the terminal, it asks me to provide the password four times. Then I get an error message + Usman
    - Have you checked the instructions at https://docs.lumi-supercomputer.eu/firststeps/loggingin/ and https://docs.lumi-supercomputer.eu/firststeps/SSH-keys/? If you continue having problems, contact LUMI support: https://lumi-supercomputer.eu/user-support/need-help/.
There are also LUMI user coffee breaks similar to this CSC one: https://www.lumi-supercomputer.eu/events/usercoffeebreaks/

## 2024-05-08 session

Short talk: [Moving a project to Rahti](https://a3s.fi/media/moving_a_project_to_rahti_240508.pdf) [Recording](https://video.csc.fi/media/t/0_4i08nito)

_related shout-out:_ [Online: Pouta and Rahti course](https://ssl.eventilla.com/event/aq9Y7)

**Advertisements:**
- [Services for Research Webinar](https://ssl.eventilla.com/event/eKGB1)
- [Online: Pouta and Rahti course](https://ssl.eventilla.com/event/aq9Y7)
- [Online: Tuesdays Tools & Techniques for HPC](https://ssl.eventilla.com/event/K6DXK) Tuesdays 16.4., 23.4., 7.5., **14.5.** at 10-12 + 13-14:30, registration is open
- [Snakemake Hackathon Online:](https://ssl.eventilla.com/snakemake_hack) 22.5.2024, please also use the waitlist, as we might be able to accept more participants later
- [High performance R](https://ssl.eventilla.com/event/6PKbE) 27.-28.5.2024, registration open

### Zoom breakout rooms and topics

Room 1 - Beginner/getting started-questions & "CSC Computing environment, Part 2" support session (Maria & course teachers)
Room 2 - Rahti (Joona)
Room 3 - Octave, MATLAB, Julia (Jaan)
Room 4 - Websocat with Rahti to Mahti (Jemal)
Room 5 -

### Questions

:::danger
If you wish you can type your question here before the session. We'll respond to them during the session.

**Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.**

Previous questions and answers can be found in the archives, see links section above.
:::

- **Q250** How to get started with CSC services?
    - A: Steps: https://research.csc.fi/accounts-and-projects
    - Also recommend this course: https://e-learn.csc.fi/course/view.php?id=75 and https://e-learn.csc.fi/course/view.php?id=76
    - Same material in another (super nice and clear!)
format: https://csc-training.github.io/csc-env-eff/
- **Q251** How to create and run octave containers - Theresa Hoppe
    - Room 3
- **Q252** Setting up websocat for connecting Rahti to Mahti - Prajwal
    - Room 4
    - https://docs.csc.fi/cloud/tutorials/connect-database-hpc/#step-2-running-websocat-on-csc-supercomputers
- **Q253** Is it possible to use more than one module in web interface version of Puhti, for example Pytorch and Geoconda? And how? + Omid
    - Unfortunately no, they are containerized environments so they cannot be combined. You need to select either one and add any packages you need to that.
    - Instructions for adding packages: https://docs.csc.fi/apps/python/
    - For complicated installations you can also create your own environment, for example with Tykky: https://docs.csc.fi/computing/containers/tykky/
    - You can also contact servicedesk@csc.fi and ask if you need more detailed help.

## 2024-04-24 session

Short talk: Why is my home directory full ([slides](https://a3s.fi/saren-2001659-pub/Why_Is_My_Home_Directory_Full_2024-04-24.pdf))

### Questions

- **Q243** I would like to get some support for optimizing the resources in running a snakemake workflow in HyperQueue, thanks! - Melina (room 3)
    - A: User has been advised to fine tune resource settings as well as take more than one node on Puhti to improve the throughput of Snakemake workflow.
- **Q244** I have a question about installing a python package to existing module using venv. -Nina (room 4)
- **Q245** I am using around 10K csv (now parquet) files (my dataset) amounting to a total size of 2TB. I am using Puhti Jupyter to do some analysis on the data. For this analysis I often need to go through the whole data. Question 1: where to store this data efficiently (scratch?) Question 2: what is the best possible set-up or CSC service (technology) to do this kind of analysis? + Debayan Bhattacharya
    - For dataset that size /scratch is a good option.
If the analysis is very disk I/O heavy, you might consider using the local NVMe disk for the analysis. Local disk allocation only exists for the duration of the job, so the data needs to be copied over to local disk before running the analysis steps and any results also need to be copied back to /scratch as part of the job. The best way to do this would also depend on whether input files can be processed in batches, or are all necessary at the same time. It's best to contact servicedesk@csc.fi with the details, so we can take a look at it.
- **Q246** How do I resolve a ssh connection closed by remote host due to inactivity? -Depin (room 4)
- **Q247** I have a question about SD Desktop - how is the import of data (sensitive data under secondary use) to SD Desktop done? -Aliisa (room 2)
- **Q248** Hello everyone, I would like to ask if the /scratch directory is somehow backed-up. Thank you!
    - A: No, none of the directories are. If you want to really **store** your data, put it in Allas! There it is backed up, so that if some tech hiccup happens, the data is saved. However, if you delete the files yourself, they are gone.
- **Q249** Are there any essential files in user directory that should not be deleted?
    - There are no files that are absolutely necessary, but there are some that are best left alone if you are not sure what you are doing, for example the $HOME/.ssh directory. There are some files (e.g .bashrc, .bash_profile etc) that are not essential, but may change how the shell behaves on your next login.
- **Q250** How to get started with CSC services?
    - A: Steps: https://research.csc.fi/accounts-and-projects
    - Also recommend this course: https://e-learn.csc.fi/course/view.php?id=75 and https://e-learn.csc.fi/course/view.php?id=76
    - Same material in another (super nice and clear!)
format: https://csc-training.github.io/csc-env-eff/

## 2024-04-17 session

No short talk

### Zoom breakout rooms and topics

Room 1 - Beginner/getting started-questions
Room 2 - "CSC Computing environment, Part 1: Basics" precourse support session (Maria & course teachers)
Room 3 - Machine learning (Mats)
Room 4 - Nextflow (Laxmana)

### Questions

- **Q241** I have a question regarding nextflow.config. I tried to run a pipeline that already has a config. How can I override that in the bash script if I want to modify the executor? I also have a follow-up question on the executor, particularly why using Slurm is not advisable. + Pande
    - Issue is resolved. The user now knows how to speed up the Nextflow job and can configure Nextflow settings for the HyperQueue executor later if needed.
- **Q242** I am trying to train a quantum machine learning model with noise. It is really slow even with a gpu on Puhti but I'm probably doing something wrong + Eino Yrjö-Koskinen

## 2024-04-10 session

**Advertisements:**
- [Online: Tuesdays Tools & Techniques for HPC](https://ssl.eventilla.com/event/K6DXK) Tuesdays 16.4., 23.4., 7.5., 14.5. at 10-12 + 13-14:30, registration is open
- [Comprehensive general LUMI course](https://ssl.eventilla.com/event/D4lad) 23.-26.4.2024, registration open.
- [Using CSC Computing Environment, Part 1: Basics](https://ssl.eventilla.com/part1april24) 24-25.4.2024, registration for waitlist open.
- [Data Analysis with R](https://ssl.eventilla.com/event/v8Kv9) on 6.-7.5.2024, registration open.
- [Using CSC Computing Environment, Part 2: Next steps](https://ssl.eventilla.com/part2may24) 15-16.5.2024, registration open.
- [Online: Snakemake Hackathon](https://ssl.eventilla.com/snakemake_hack) 22.5.2024, registration open
- [High performance R](https://ssl.eventilla.com/event/6PKbE) 27.-28.5.2024, registration open
- [HPC Summer School at CSC](https://ssl.eventilla.com/event/summerschool2024) 25.6.-4.7.2024. Registration closes on the 21st!
### Zoom breakout rooms and topics

Room 1 - Beginner/getting started-questions (Xavier)
Room 2 - Notebooks for course use (Laxman)
Room 3 -
Room 4 -
Room 5 - R on Puhti (Heli)
Room 6 -

### Questions

:::danger
If you wish you can type your question here before the session. We'll respond to them during the session.

**Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.**

Previous questions and answers can be found in the archives, see links section above.
:::

- **Q237** Unable to access examples of batch job files for Ansys CFX at /appl/soft/eng/ansys_inc/example_batch_job_files/parjob_cfx (permission denied) / Andrei
    - A: https://docs.csc.fi/apps/ansys/ You need to send an enquiry to servicedesk@csc.fi, they will then add you to a specific user group (ansys group), which then allows you to use the license. This is true for all users, so every user needs to send the email :) Mention also your use case, as the license is only for academic use
- **Q238** GROMACS use on Puhti + Matteo
    - User wanted to run a GROMACS simulation on Puhti. No previous experience with HPC, Linux or GROMACS. A general introduction on how to use CSC computing resources was provided and the user was advised to send a ticket to Helpdesk for further support.
    - Useful links
        - How to connect to CSC supercomputers: https://docs.csc.fi/computing/connecting/#ssh-agent
        - How to run GROMACS simulations on Puhti: https://docs.csc.fi/apps/gromacs/#puhti
- **Q239** CSC notebook for organizing course + Rishi
    - User has a resource request form for using CSC Notebooks with 50 concurrent users on Monday and is expecting a response from CSC. Course is starting on Monday. User is advised to send a ticket to Helpdesk. And also I (Laxman) will check with Olli/notebooks channel. Update from Olli: The user's request for more resources has apparently been processed.
    - User would also like to use some ssh commands inside the R notebook terminal.
As the Docker image is read-only, the user can't get them now. He has to build a fresh image with SSH clients.
- **Q240** Analysis of scRNAseq data (10GB data) + Niina
    - Room3
    - In **Puhti**, 3 disk areas: /HOME (not meant for your data, but configuration files), /PROJAPPL (owned by the project, other project members can access, 50GB, can be expanded, install software here) and /SCRATCH (again all project members can access, 1TB default size, can be expanded. This is for your data, but there's a 6 months automatic cleaning period.)
    - **Allas** storage service: when you want to keep data, but don't need to access it *right now*. No cleaning, will stay there as long as the project exists (or you delete the data).
    - **IDA**: when you want to publish & make data available
    - Analysing data, with nf-co.re Nextflow pipeline: https://docs.csc.fi/apps/nextflow/ Tutorial: https://docs.csc.fi/support/tutorials/nextflow-puhti/
    - Regarding estimating the batch job parameters: This kind of job needs to run inside a single node. Puhti nodes have: 40 cores, 192 GB mem. Some nodes have up to 1.5TB of mem, but CellRanger probably doesn't need that much. https://docs.csc.fi/computing/systems-puhti/ Try with a smaller test set, and then increase mem if needed. `--mem=128G` allows you to specify just the needed memory. For the first run, use the max time for the time parameter (72h for small partition): no problem, only the used amount of time is "billed". When you know better how long it takes, adjust accordingly (queueing will go faster).
        - `--mem=128G` or `--mem-per-cpu=8G` + `--cpus-per-task=16`
        - `--time=72:00:00`
    - Using seff to check how the resources were used: https://csc-training.github.io/csc-env-eff/hands-on/batch_resources/tutorial_sacct_and_seff.html
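
The parameters discussed for Q240 could be collected into a batch script along these lines. This is a sketch, not a tested recipe: the project number is a placeholder, and the actual analysis command is left out because it depends on the pipeline used. Note that `--mem-per-cpu` without a unit is read by Slurm as megabytes, so the `G` suffix matters.

```bash
#!/bin/bash
#SBATCH --account=project_2001234   # placeholder project number
#SBATCH --partition=small
#SBATCH --nodes=1                   # the job has to fit inside a single node
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem=128G                  # total memory; same as --mem-per-cpu=8G with 16 cores
#SBATCH --time=72:00:00             # maximum for the small partition; only used time is billed

# Inside a job Slurm sets SLURM_CPUS_PER_TASK; default to 1 when the script is run elsewhere.
THREADS="${SLURM_CPUS_PER_TASK:-1}"
echo "Using ${THREADS} threads"

# Start the actual analysis here and pass it ${THREADS} as the thread count.
```

Submit it with `sbatch` and, once the job has finished, run `seff` with the job ID to see how much of the reserved time, memory and CPU was actually used before adjusting the values.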