---
tags: user support coffee break
title: archive 4
---

# Archive of Weekly CSC user coffee break Q&As (vol4)

Newer archive vol5: https://hackmd.io/WRj6JAwGT4qFNvpI9ifxcA

**Old archive: Vol3: https://hackmd.io/SggRjgEYToWq4zA7D6ecbw?view**

**Even older archives: Vol2: https://hackmd.io/uVI5gLKDQoWZgAVcufYg2A?view Vol1: https://hackmd.io/9QsIbJ03T1SaNDV_xbT8Dw?view**

# Q&A for Weekly CSC user coffee break

*CSC's open support session organised every **Wednesday at 14:00** in [Zoom](https://cscfi.zoom.us/j/65059161807).*

:::info
**Zoom:** <https://cscfi.zoom.us/j/65059161807>
**Q&A:** <https://siili.rahtiapp.fi/weekly-user-zoom>
**Events page (slides and recordings):** <https://ssl.eventilla.com/usersupportcoffee/EN>
**Feedback:** <https://link.webropolsurveys.com/S/94AB9F77D8EFF054>
**Q&A archive:** <https://hackmd.io/uVI5gLKDQoWZgAVcufYg2A?view>
**Old Q&A archive:** <https://hackmd.io/@CSC-research-support/weekly_session_archive>
:::

:::warning
:bulb: **Useful links & further resources**

* [CSC docs](https://docs.csc.fi/)
* [CSC Computing Environment course materials](https://csc-training.github.io/csc-env-eff/)
* [CSC's courses](https://www.csc.fi/training)
:::

## 2025-08-27 session

**Short talk:** Roihu - CSC's next supercomputer coming 2026
[Docs page](https://docs.csc.fi/computing/systems-roihu/)
[Feedback link](https://link.webropolsurveys.com/s/DL2026)

#### Course & Webinar Advertisements

- Workshop: [CodeRefinery](https://csc.fi/en/training-calendar/coderefinery/) 9.9.-22.10.2025 -> Practical workshops on tools and techniques for researchers writing code that would like to make their work more reproducible. In the first 3 sessions you will learn how to keep track of your code using version control with git. Other sessions introduce Jupyter (for keeping documentation and code together), Snakemake (a workflow manager), Sphinx (for creating nice looking documentation) and more!
Online workshop with the possibility to join an in-person classroom at CSC in Keilaniemi
- Course: [CSC Computing Environment Part1: Basics](https://csc.fi/en/training-calendar/csc-computing-environment-part-1-basics/) 1.-2.10.2025
- Course: [Data Analysis with R](https://csc.fi/en/training-calendar/data-analysis-with-r-4/) 6.-7.10.2025
- Course: [Geocomputing on the supercomputer](https://csc.fi/en/training-calendar/geocomputing-on-the-supercomputer-2/) 8.-9.10.2025
- Webinar: [Enhancing Data Support: Understanding Reproducibility, Part 1](https://csc.fi/en/training-calendar/enhancing-data-support-understanding-reproducibility-part-1/) 9.10.2025
- Webinar: [Enhancing Data Support: Practical Reproducibility, Part 2](https://csc.fi/en/training-calendar/enhancing-data-support-practical-reproducibility-part-2/) 29.10.2025
- Course: [Practical Deep Learning](https://csc.fi/en/training-calendar/practical-deep-learning-6/), 11.-12.11.2025
- Course: [CSC Computing Environment Part2: Next steps](https://csc.fi/en/training-calendar/csc-computing-environment-part-2-next-steps/) 12.-13.11.2025
- Course: [High Performance R](https://csc.fi/en/training-calendar/high-performance-r-2/) 17.-18.11.2025
- Course: [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/online-using-containers-in-supercomputing-environment/) 25.-27.11.2025
- See [CSC's training calendar](https://csc.fi/en/trainings/training-calendar/) for more

### Zoom breakout rooms and topics

Room 1 - Getting started
Room 2 - Ambassadors
Room 3 - Roihu
Room 4 - MATLAB

---

### Questions

- **Q457** Will the account that was created for the fundamentals course remain? Jani L.
    - There was this email warning about the ending of the fundamentals course project. Will it continue to be available, i.e. extended? It would be nice to try stuff there from time to time.
    - A: This is probably regarding the CSC Computing Environment course, right?
That project is meant for course use only, and we do clean it up periodically (remove old participants), so it's probably a good idea to download/move the data that you have stored there to your own project/computer! The course material will stay available, both in [eLena](https://e-learn.csc.fi/course/view.php?id=75) and in [Git](https://csc-training.github.io/csc-env-eff/), so you can easily return to the matter.
- **Q458** I can log in to Puhti through PuTTY but not through MATLAB. It says "Unable to extract public key from private key file: Wrong passphrase or invalid/unrecognized private key file format". I have verified that the password is correct. - Vincent Verhoeven
    - PuTTY keys need to be converted into the OpenSSH format. Otherwise they won't work with MATLAB. - Jaan, room 4
- **Q459** Could Roihu have a "do_not_remove" directory to keep important things from being auto-deleted? (like back in the days of Sisu and Taito) - Ben Foreback
    - Probably not, as it would then be very tempting to put all your files in "do_not_remove". We would rather have users plan their data usage so that regular cleaning will not be a problem. Dataset sharing, even across projects, is something that has been discussed, and we have some ideas on how to make this better in the future.
- **Q460** Will Roihu have a queue for running smaller jobs with less than a full node? Many quantum chem codes scale poorly so it's not worth using more than ~40 cores. The newish small queue on Mahti was a game-changer for me. Chris Daub.
    - Yes, there will certainly be smaller-than-full-node jobs: as a single node has 384 CPU cores, it would be a rather big resource for most small jobs.
- **Q461** Is it going to be in Kajaani?:) + Juan Galarza
    - Yep, close to LUMI!
(No need to remove Puhti / Mahti to make space for Roihu, which is great)

## 2025-08-20 session

**Short talk:** None this time

#### Course & Webinar Advertisements

- Workshop: [CodeRefinery](https://csc.fi/en/training-calendar/coderefinery/) 9.9.-22.10.2025
- Course: [Data Analysis with R](https://csc.fi/en/training-calendar/data-analysis-with-r-4/) 6.-7.10.2025
- Course: [Geocomputing on the supercomputer](https://csc.fi/en/training-calendar/geocomputing-on-the-supercomputer-2/) 8.-9.10.2025
- Webinar: [Enhancing Data Support: Understanding Reproducibility, Part 1](https://csc.fi/en/training-calendar/enhancing-data-support-understanding-reproducibility-part-1/) 9.10.2025
- Webinar: [Enhancing Data Support: Practical Reproducibility, Part 2](https://csc.fi/en/training-calendar/enhancing-data-support-practical-reproducibility-part-2/) 29.10.2025
- Course: [High Performance R](https://csc.fi/en/training-calendar/high-performance-r-2/) 17.-18.11.2025
- Course: [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/online-using-containers-in-supercomputing-environment/) 25.-27.11.2025
- See [CSC's training calendar](https://csc.fi/en/trainings/training-calendar/) for more

### Zoom breakout rooms and topics

Room 1 - Getting started
Room 2 - Kimmo
Room 3 - AI/ML questions (Mats)
Room 4 - Rahti (Alvaro)
Room 5 - SSH keys (Oskar)
Room 6 -
Room 7 -
Room 8 -

---

### Questions

- **Q454** Rahti auto-deploy used to work, but now it doesn't seem to: when a new image is pushed to the registry, re-deployment doesn't happen. Event logs for multiple projects are shown empty. + David Rosson
    - Room 4
    - (Alvaro) After checking the usual suspects, all seems to be in order, but it still does not work. I will check with my colleagues and come back with possible solutions.
    - ![](https://siili.rahtiapp.fi/uploads/eaf0696e-ba0f-44ae-a837-6c8bf5a67e57.png)
    - Can you check that you have a trigger similar to the one in the screenshot?
    - [David] So I checked this trigger line on two deployments, and there is a difference at the end:
        - The one that's working: `"pause":"false"`
        - Auto-deploy not working: `"paused":"false"`
    - What is your project name? Can you create a ticket to <servicedesk@csc.fi>? This way the admins can check the project directly.
- **Q455** VS Code server doesn't get memory on Puhti (unless a compute node is provisioned and the SSH connects to that) -- (but, Cursor server still works with default SSH) 🙏 + David Rosson
    - Option 1, Jaan: Virtual memory limit on Puhti login nodes? If yes, one solution is to increase it using `ulimit -v "$(ulimit -Hv)"`
    - Option 2, Heli: downgrade VS Code to version 1.100
    - Option 3, Heli: add to `.bashrc`:
        - `export NODE_OPTIONS="--disable-wasm-trap-handler"`
- **Q456** Probably a very basic problem, but I have a problem connecting via SSH. It says there is no supported authentication method available, and my public key file is not accepted on the website, only the manual input - Vincent Verhoeven
    - Room 5 / Oskar
    - https://docs.csc.fi/computing/connecting/ssh-keys/
    - An issue with key compatibility? Generated in Windows (PuTTY) perhaps?
    - https://docs.csc.fi/computing/connecting/ssh-windows/#putty
    - https://docs.csc.fi/cloud/pouta/tutorials/ssh-key/#windows-putty
    - https://www.simplified.guide/putty/convert-ppk-to-ssh-key

## 2025-06-18 session

**Short talk:** No short talk today

#### Course & Webinar Advertisements

- Course: [Geocomputing on the supercomputer](https://csc.fi/en/training-calendar/geocomputing-on-the-supercomputer-2/) 8.-9.10.2025
- Webinar: [Enhancing Data Support: Understanding Reproducibility, Part 1](https://csc.fi/en/training-calendar/enhancing-data-support-understanding-reproducibility-part-1/) 9.10.2025
- Webinar: [Enhancing Data Support: Practical Reproducibility, Part 2](https://csc.fi/en/training-calendar/enhancing-data-support-practical-reproducibility-part-2/) 29.10.2025
- Course: [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/online-using-containers-in-supercomputing-environment/) 25.-27.11.2025
- See [CSC's training calendar](https://csc.fi/en/trainings/training-calendar/) for more

### Zoom breakout rooms and topics

Room 1 - Getting started
Room 2 - Scientific Computing Ambassadors free chat
Room 3 - QIIME - Marina & Ari-Matti
Room 4 - Jupyter - Mats, Maciej & Yuhua
Room 5 - developing a CUDA + OpenMPI application on Mahti - Sami I
Room 6 - SSH - Ari-Matti
Room 7 -
Room 8 -

---

### Questions

- **Q450** QIIME2 16S data processing - I have been using an old code, but I cannot make it work with my new data, please advise: How do I access QIIME that has been uploaded to the scratch previously and then upload my data :) + Marina
    - Answer / breakout room number 3
- **Q451** Help with SSH connection to Puhti.
Works with FileZilla but not via command line :) + Joseph (joining the zoom meeting after 14:30)
    - Answer / breakout room number
- **Q452** I've been developing a CUDA + OpenMPI application on Mahti, with these modules: `1) csc-tools (S) 2) gcc/10.4.0 3) openmpi/4.1.5-cuda 4) cuda/12.1.1 5) boost/1.82.0-mpi 6) papi/7.1.0`. I'm seeing that MPI transfers from unified memory buffers are causing terrible paging of data back and forth. Some of those buffers could quite easily be swapped out to pure device buffers, but that led to a segfault in the UCX system. Has the Mahti UCX system been built with CUDA support? Are there some tricks to know to get it working? + Markus
    - Answer / breakout room number
    - Maybe use it with the module **openmpi/4.x.y-cuda**. That's the only way to have CUDA-aware MPI. With unified memory and no GPU-aware MPI, it probably triggers a migration to the host to do the communication, but with GPU-aware MPI one can use GPU pointers allocated via either `cudaMallocManaged` or `cudaMalloc`.
    - Sorry, I just saw that openmpi/4.1.5-cuda is loaded.
- **Q453** Question regarding a RT ticket.
Ticket number: #789047 + Yuhua
    - Answer / breakout room number 4

## 2025-06-11 session

**Short talk:** Upcoming billing unit changes

- Blog post mentioned: https://research.csc.fi/2025/06/02/billing-unit-renewal-schedule-and-changes/
- [Slides](https://a3s.fi/heli/billing_unit_update.pdf)

#### Course & Webinar Advertisements

- Course: [Geocomputing on the supercomputer](https://csc.fi/en/training-calendar/geocomputing-on-the-supercomputer-2/) 8.-9.10.2025
- Webinar: [Enhancing Data Support: Understanding Reproducibility, Part 1](https://csc.fi/en/training-calendar/enhancing-data-support-understanding-reproducibility-part-1/) 9.10.2025
- Webinar: [Enhancing Data Support: Practical Reproducibility, Part 2](https://csc.fi/en/training-calendar/enhancing-data-support-practical-reproducibility-part-2/) 29.10.2025
- Course: [Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/online-using-containers-in-supercomputing-environment/) 25.-27.11.2025
- See [CSC's training calendar](https://csc.fi/en/trainings/training-calendar/) for more

### Zoom breakout rooms and topics

Room 1 - Getting started (Q441)
Room 2 - Short talk: billing unit update (Joonas)
Room 3 - Q442 / LLMs (Mats)
Room 4 - Scientific Computing Ambassadors free chat
Room 5 - Q445 medical text summary (Oskar)
Room 6 - Q444 / connection issues, Q443 / squashfs (Rasmus)
Room 7 - Q449 multiple batch job submission (Nino)
Room 8 -

---

### Questions

- **Q441** Dear CSC Support Team, I am currently working with a large medical dataset exceeding 100 GB in size. I have been attempting to transfer this dataset to Puhti's /scratch directory, but have been unsuccessful because I received a message indicating I only have a 10 GB limit. I would greatly appreciate your guidance on how to properly accomplish this transfer given this storage limitation. Additionally, I would like to inquire whether Allas object storage might be a more suitable solution for this dataset.
If Allas is recommended, I would need detailed instructions on how to configure and use Allas in conjunction with Puhti. I would be most grateful if you could provide me with the relevant documentation and guidance on this matter. If possible, I would also appreciate the opportunity to discuss this during one of your support sessions. Please let me know if you require any additional information from me to assist with this request. Thank you for your help. I look forward to your response. Have a great day.
    - A: Breakout room 1 after the short talk
    - Short answer: Try not using the web interface. There is a buffer in between that has a limit. Using one of the tools documented here: https://docs.csc.fi/data/moving/ should help.
    - If the dataset consists of many files, it might be a good idea to package them into fewer larger ones.
    - Commands: `module load allas`, `allas-conf`, `a-list`, `a-get`. Documentation: https://docs.csc.fi/data/Allas/accessing_allas/#accessing-allas-in-the-csc-computing-environment-and-other-linux-platforms
    - Feel free to try this, and then if you run into difficulties, send a ticket to servicedesk@csc.fi or join our meeting!
- **Q442** Need help continuing to get high-end LLMs to work on LUMI/Puhti
    - Breakout room 3 after the short talk
- **Q443** Proper way to mount squashfs dataset(s) on Puhti/LUMI?
    - A: Does this help: https://docs.csc.fi/computing/containers/run-existing/#mounting-datasets-with-squashfs ?
    - For installing additional Python packages see: https://docs.csc.fi/support/tutorials/python-usage-guide/
    - Room 6
- **Q444** I can ssh to Puhti with a terminal. But I tried the same with FileZilla and it does not connect. I wanted to check together if I am doing something wrong. Or if there is another way to get the figures/files from CSC /scratch to my computer, that would be ok too.
    - Solved, need to change Logon type to "Key file".
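
The advice in Q441 to package many small files into fewer larger ones before a transfer can be sketched as below. This is only an illustration under stated assumptions: the helper name `pack_dataset` is hypothetical, and `<username>` and `<project>` in the commented `rsync` line are placeholders to be replaced with your own values. It uses plain `tar`, so it works on a laptop as well as on Puhti.

```sh
#!/bin/bash
# pack_dataset DIR  (hypothetical helper, not a CSC tool)
# Bundles DIR into DIR.tar next to it, so that the transfer moves
# one large file instead of many small ones. Prints the archive path.
pack_dataset() {
    local src="${1%/}"            # drop a trailing slash, if any
    local archive="${src}.tar"
    # -C changes into the parent directory first, so the archive
    # stores paths relative to the dataset directory itself
    tar -cf "$archive" -C "$(dirname "$src")" "$(basename "$src")"
    echo "$archive"
}

# Example use (placeholders, adjust before running):
#   archive=$(pack_dataset my_dataset)
#   rsync -avP "$archive" <username>@puhti.csc.fi:/scratch/<project>/
```

On the receiving side, `tar -xf my_dataset.tar` restores the directory. Whether to also compress the archive depends on the data; already-compressed medical images, for example, rarely shrink further.
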
- **Q445** Dear CSC Support Team, I am currently working on a medical text summarization project using transformer-based models (such as BART and T5) that process long input sequences. However, the currently available GPUs (V100 with 32 GB memory) are not sufficient for my workload, even with batch size and token length optimizations. I am encountering repeated CUDA out-of-memory (OOM) errors during summarization of large clinical notes from the MIMIC dataset. I would appreciate your guidance on the following:
    1) Is it possible to request access to GPUs with larger memory (e.g., 64 GB A100 or similar) on Puhti or other CSC systems?
    2) If not, are there alternative strategies or CSC-supported systems you recommend for handling memory-intensive transformer workloads?
    3) Can I increase my current GPU memory allocation through a specific queue, partition, or project application?

    Thank you for your assistance.
    - A: Mahti has A100 GPUs. Puhti has V100. Please check: https://docs.csc.fi/computing/available-systems/
    - Room 5
- **Q446** Question from chat about short talk: Hi, does this change apply in any way to current LUMI users?
    - No, LUMI remains the same as now. This applies to the Puhti, Mahti and future Roihu supercomputers.
- **Q447** Question from chat about short talk: Don’t we already distinguish between CPU and GPU BUs?
    - In LUMI, yes, but in CSC national resource applications (Puhti and Mahti) no, there is only one billing unit.
    - The rate at which BUs are consumed is different for CPU and GPU resources on national systems (GPU-hours are more expensive than CPU-hours). But as said, there is only one billing unit.
- **Q448** PyTorch vs python-data. I have an NMT software that seems to install only with the python-data module, but it requires PyTorch and will install it via pip. However, I wonder if the pytorch module would provide a better and more GPU-compatible version of PyTorch.
The problem is that the pytorch module does not easily go with the additional requirements, and I failed to install my software when that module was in use.
    - Room 3
- **Q449** Hi, I have a question regarding batch job submission in Puhti. Right now I am submitting multiple jobs separately. Is it possible to make a bash script that submits several batch jobs?
    - Room 7

## 2025-06-04 session

**Short talk:** No short talk this time

#### Course & Webinar Advertisements

- [Course: Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/online-using-containers-in-supercomputing-environment/) 25-27.11.2025
- See [CSC's training calendar](https://csc.fi/en/trainings/training-calendar/) for more

### Zoom breakout rooms and topics

Room 1 - Getting started
Room 2 - Kimmo et al

---

### Questions

- **Q440** Hi, I want to use a software called herro tool for error correction of Nanopore on CSC. However, I cannot install it myself because it requires the sudo command. But I need to use it urgently, what should I do? /Wenbo
    - You can download the container image with the command: `wget -O herro.sif https://zenodo.org/records/13802680/files/herro.sif?download=1`. Basic command to use: `apptainer exec --nv --bind /scratch herro.sif herro --help`. For more information on using containers see our [Docs pages](https://docs.csc.fi/computing/containers/run-existing/). We also have some [tutorials](https://csc-training.github.io/csc-env-eff/part-2/container).
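
For real workloads, the `apptainer exec` command from Q440 would normally run inside a batch job rather than on a login node. The sketch below prints a minimal Slurm script that wraps it. Everything beyond the `apptainer` line is an assumption for illustration: the helper name `write_herro_job` is hypothetical, `project_XXXXXXX` is a placeholder for your own project, and the partition, GPU, time, CPU and memory values are guesses for a small test on Puhti that should be checked against the current Docs pages. The `herro --help` call is kept from the answer above; replace it with the actual herro command for your data.

```sh
#!/bin/bash
# write_herro_job ACCOUNT [SIF]  (hypothetical helper, not a CSC tool)
# Prints a Slurm batch script that runs herro from the container image.
# Resource values below are illustrative assumptions, not recommendations.
write_herro_job() {
    local account="$1"
    local sif="${2:-herro.sif}"
    cat <<EOF
#!/bin/bash
#SBATCH --account=${account}
#SBATCH --partition=gpu
#SBATCH --gres=gpu:v100:1
#SBATCH --time=00:15:00
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G

apptainer exec --nv --bind /scratch ${sif} herro --help
EOF
}

# Example use (placeholder project name, adjust before running):
#   write_herro_job project_XXXXXXX > herro_job.sh
#   sbatch herro_job.sh
```

Generating the script from a function keeps the project name and image path in one place, which helps when the same job is submitted for several inputs.
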
## 2025-05-28 session

**Short talk:** No short talk this time

#### Course & Webinar Advertisements

- [Webinar: What is new at CSC for biousers](https://csc.fi/en/training-calendar/webinar-what-is-new-at-csc-for-biousers/), June 3 at 13:00-14:00
- [Course: Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/online-using-containers-in-supercomputing-environment/) 25-27.11.2025

### Zoom breakout rooms and topics

Room 1 - Getting started
Room 2 - Ambassadors (Maria & Samantha)
Room 3 - Connection problems MFA/SSH
Room 4 - Puhti/PyTorch (Shanshan)
Room 5 - (Ari-Matti)
Room 6 -
Room 7 -
Room 8 -

---

### Questions

:::danger
If you wish you can type your question here before the session. We'll respond to them during the session. **Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.** Previous questions and answers can be found in the archives, see links section above.
:::

- **Q432** Need help getting a Puhti batch script and/or Python script using vLLM to work for applying a large LLM.
    - breakout room number: 4 / Shanshan
- **Q433** Trying to get my sequencing data from Allas. I am using PowerShell but it gives an error: Failed to copy: failed to open source object: Operation forbidden. Not sure what I am doing wrong. (From Martina)
    - breakout room number: 5 / Ari-Matti
    - Allas + PowerShell: https://docs.csc.fi/data/Allas/using_allas/rclone_local/
- **Q434** I've been queuing 1-2 weeks now for the hugemem_longrun partition, it says that the node is (ReqNodeNotAvail, Reserved for maintenance). Should I cancel the job and start queuing again? But then again I know that the job will fail after one day... xD This is on Puhti, haha yeah, thanks! - Minna Maunula
    - There is a Puhti service break coming up on 3 June. Any job that would not finish by the break will not be started before it is over.
There is no need to cancel the job: it can stay in the queue and should run once the service break is over.
    - Answer / Breakout room number: 5 / Ari-Matti
- **Q435** I have questions about running distributed training with PyTorch's DDP. I get this message:
    - `[W526 01:09:54.448313327 socket.cpp:759] [c10d] The client socket cannot be initialized to connect to [localhost]:29400 (errno: 97 - Address family not supported by protocol).`
      `[W526 01:09:54.450587424 socket.cpp:759] [c10d] The client socket cannot be initialized to connect to [localhost]:29400 (errno: 97 - Address family not supported by protocol).`
      `[W526 01:09:54.452875057 socket.cpp:759] [c10d] The client socket cannot be initialized to connect to [localhost]:29400 (errno: 97 - Address family not supported by protocol).`
    - That message is just a warning related to the distributed training framework trying first to connect via IPv6. When that fails, it switches back to IPv4. You see this also on successful runs. If your distributed training job fails, the root cause is something else. (I'm travelling, so cannot join the Zoom - Mats)
    - Answer / breakout room number
- **Q436** Re: Q435, I had to specifically include `find_unused_parameters=True` in `model = DistributedDataParallel(model, device_ids=[local_rank], find_unused_parameters=True)`. The code threw a message saying some params were not updated
    - That's probably a normal situation as far as I can remember - Mats
- **Q437** In beginner room:
    - Getting started with CSC services: https://docs.csc.fi/support/tutorials/hpc-quick/
    - Here’s a little tiny mini course where you can check the first steps: https://e-learn.csc.fi/enrol/index.php?id=75 In case a colleague asks this from you at some point, for example!
    - Longer self-learning course: https://csc.fi/en/training-calendar/csc-computing-environment-self-learning/
    - "I am primarily using it with MATLAB jobs and sent through my work pc here.
I would like to know the possibilities of how Puhti could be used efficiently"
        - Check the course + MATLAB documentation: https://docs.csc.fi/apps/matlab/
    - So, is it possible to create and use a Jupyter notebook with Puhti?
        - Yep, check https://www.puhti.csc.fi/public/ and https://docs.csc.fi/computing/installing/#pythonr-environments
    - We can join later weekly sessions even if not a new user, right?
        - Absolutely!!!
        - servicedesk@csc.fi is our support address, where you can ALWAYS reach our specialists <3
    - Is there an FAQ section?
        - Yep: https://docs.csc.fi/support/faq/ Then there are the archive pages of these sessions (link on top of this page), but we also try to create new FAQs based on the questions we get here. Check also the [short talk archive](https://hackmd.io/1xCtrll9SN2wqGYjgZEqhw?view)
- **Q438** I'm also trying to install this program called Snippy (https://github.com/tseemann/snippy). It requires dependencies such as: bcftools, vt, snpEff, sampclip, snp-sites. (I did not find these with module load xx) Do I need to install these dependencies one by one? I'm very new to installing software on Puhti. :) - Minna Maunula
    - This package is available in Bioconda, so you can follow this tutorial: https://docs.csc.fi/support/tutorials/bioconda-tutorial/ In this case the installation commands will be:
    ```sh
    module load tykky
    mkdir snippy-4.6.0
    wrap-container -w /usr/local/bin docker://quay.io/biocontainers/snippy:4.6.0--hdfd78af_6 --prefix snippy-4.6.0
    ```
    - Perhaps a nice tutorial for biousers: https://csc-training.github.io/csc-env-eff/hands-on/modules/module-exercise-with-aligners.html
- **Q439** On Mahti, cannot connect to https://cryosparc.com/
    - (Jarmo) You need to start the cryosparc master, as it most likely died since the last time you connected.
    - https://docs.csc.fi/apps/cryosparc/

## 2025-05-21 session

**Short talk:** No short talk this time

#### Course & Webinar Advertisements

- [VeloxChem on LUMI workshop](https://csc.fi/en/training-calendar/veloxchem-on-lumi-workshop/), May 26-27
- [Moving your AI training jobs to LUMI: A Hands-On Workshop](https://lumi-supercomputer.eu/events/lumi-ai-workshop-may2025/), May 27-28
- [Webinar: What is new at CSC for biousers](https://csc.fi/en/training-calendar/webinar-what-is-new-at-csc-for-biousers/), June 3 at 13:00-14:00
- [Course: Using Containers in Supercomputing Environment](https://csc.fi/en/training-calendar/online-using-containers-in-supercomputing-environment/) 25-27.11.2025

### Zoom breakout rooms and topics

Room 1 - Getting started
Room 2 - Connection problems MFA/SSH
Room 3 - JAX (Q423) + any other AI/ML questions (Mats)
Room 4 - Rahti (Jemal and Alvaro)

---

### Questions

- **Q423** While running training using JAX I am getting a warning "The NVIDIA driver's CUDA version is 12.2 which is older than the ptxas CUDA version (12.5.40). Because the driver is older than the ptxas version, XLA is disabling parallel compilation, which may slow down compilation." How can I deal with that?
    - The issue was with the user's own Tykky installation. Using CSC's pre-installed JAX shouldn't have this issue. We'll continue via email to servicedesk to figure out what the problem is with the user's own installation.
- **Q424** We are currently trying to run a database inside a Docker container, which requires both a Persistent Volume Claim (PVC) and full write permissions within the container itself. We do not have write permissions within the container's own file system (e.g., the default directories inside the Docker image). Is it possible to deploy a custom Docker image on Rahti? (If so, could you please provide guidance or documentation on how to do it?)
If using a custom image is not an option, how can we start a container with the necessary write permissions within the container itself?
    - → Room 4
    - (Alvaro) Several things to answer:
        1. If you are using PostgreSQL, we recommend Pukki, which is CSC's Database as a Service (DBaaS): https://pukki.dbaas.csc.fi/project/
        2. Which DB are you trying to run? PostgreSQL, Maria, Mongo, ...?
        3. You can run any Docker image you want.
        4. https://docs.csc.fi/cloud/rahti/storage/ephemeral/ allows you to "mount" writable directories into a running container.
        5. You can also make a custom image that allows writing into specific folders: https://docs.csc.fi/cloud/rahti/images/creating/
        6. This tutorial walks through many common problems and their solutions: https://docs.csc.fi/cloud/rahti/tutorials/4cat/
- **Q425** I was trying to install different packages using JupyterLab on Puhti but could not succeed due to space issues / Himat
    - Room 1
- **Q426** I am unsure how to install/compile software so that I can use it as a module. To clarify this a bit further: it relates to a tool called BBTools, which basically consists of a lot of .sh files
    - Room 6
- **Q427** I am teaching a course exercise utilizing CSC's computer resources in the autumn at the University of Helsinki, and I would like to know if the new SSH authentication system might cause problems for the students. Currently the students must create CSC accounts on their own using instructions on the course page and book a slot in one of three exercise times. The exercise time slots are the first time I actually meet them in person, so I want to make sure that the new SSH
    - Easiest may be to use the web interface, which does not require SSH keys, only MFA (but this is enabled already for UH students through Haka). This is also the recommended way for shared (classroom) computers.
- **Q428** I have issues accessing Allas (ssh keys related?)
/ Minne
    - Room 7
- **Q429** I have a simulation code, which utilizes both GPU and CPU for different parts. So far I have used Puhti GPU partition, but it has limited CPU resources available. Is there a way to get more CPU resources with GPU partition, or some alternative way to get both CPU and GPU resources? /Niko
    - Room 9
- **Q430** Jupyter notebook question related to PDL course
    - Room 8
- **Q431** How to connect Roboflow generated dataset to jupyter lab
    - Answer / breakout room number

## 2025-05-07 session

**Short talk:** No short talk this time

#### Course & Webinar Advertisements

##### Allas Webinar

- Zoom: https://csc.fi/en/training-calendar/using-the-allas-storage-service/ May 12th at 14:00-15.30

##### Bio

- [Spatial transcriptomics (Visium) data analysis](https://csc.fi/en/training-calendar/spatial-transcriptomics-visium-data-analysis-with-chipster/), May 15
- [Webinar: What is new at CSC for biousers](https://csc.fi/en/training-calendar/webinar-what-is-new-at-csc-for-biousers/), June 3 at 13:00-14:00

##### Chemistry

- [VeloxChem on LUMI workshop](https://csc.fi/en/training-calendar/veloxchem-on-lumi-workshop/), May 26-27

##### Physics

- [Finnish OpenFOAM User Day 2025](https://csc.fi/en/training-calendar/finnish-openfoam-user-day-2025/), May 14

### Zoom breakout rooms and topics

Room 1 - Getting started
Room 2 - Connection problems MFA/SSH (Oskar)
Room 3 - Chipster question / Emmi & Maria L
Room 4 - Pouta
Room 5 - Computing env course / Mahti (Rasmus)
Room 6 -
Room 7 -
Room 8 -

---

### Questions

:::danger
If you wish you can type your question here before the session. We'll respond to them during the session. **Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.** Previous questions and answers can be found in the archives, see links section above.
:::

## 2025-04-30 session

**Short talk:** No short talk this time

#### Course & Webinar Advertisements

##### LUMI coffee break

- Zoom: https://cscfi.zoom.us/j/65727034273?pwd=VEdtY2trVUVKTEhxajZMbFhETWV2Zz09 April 30th at 14:00-15:00

##### Allas Webinar

- Zoom: https://csc.fi/en/training-calendar/using-the-allas-storage-service/ May 12th at 14:00-15.30

##### Bio

- [Spatial transcriptomics (Visium) data analysis](https://csc.fi/en/training-calendar/spatial-transcriptomics-visium-data-analysis-with-chipster/), May 15

##### High Performance Computing and other specialized topics

- [CSC Summer School in High-Performance Computing](https://csc.fi/en/training-calendar/csc-summer-school-in-high-performance-computing-2025/), 24 June - 3 July

### Zoom breakout rooms and topics

Room 1 - Beginner room
Room 2 - Connection problems MFA/SSH

---

### Questions

- **Q420** On the new scratch cleaning policies: The project I work on has a high (>5 TB) scratch quota and hence the new 90-day policy on file usage will be applied. Will I still get the reminder of the deletion and a list of the files to be deleted (as previously) or do I have to pay attention to the timing and file use myself? Thanks!
    - A: It works as before, except that the files are identified as the ones not accessed in 3 months.
- **Q421** I could not access Puhti through the Windows shell; it reports a problem with public keys
    - Breakout room 2
- **Q422** Hi, I have Haka and CSC login access to Puhti and Mahti. Recently I activated MFA but I could not see the QR code. Now MFA is activated and each time I try to test MFA it lands me on a page where I need to enter an OTP
    - Answer / breakout room 2

## 2025-04-23 session

## 2025-04-16 session

**Short talk:** [Multi-factor authentication](https://a3s.fi/CSC_training/MFA-WEEKLY-RESEARCH-MEETING-20250416.pdf)

Multi-factor authentication is a method that requires the use of two or more authentication factors to enable login to our services.
Starting from April 22, 2025, for example, web interfaces at www.puhti.csc.fi and www.mahti.csc.fi will require multi-factor authentication. We strongly encourage all our users to test and activate multi-factor authentication as soon as possible to ensure uninterrupted access to services. Detailed documentation can be found at https://docs.csc.fi/accounts/mfa/. There you will find instructions for how to test if you already have it configured, and how to configure it depending on the method you use to log in. Read bit more on the topic [here](https://research.csc.fi/2025/04/02/multi-factor-authentication/) #### Course & Webinar Advertisements ##### Allas Webinar - Zoom: https://csc.fi/en/training-calendar/using-the-allas-storage-service/ May 12th at 14:00-15.30 ##### Bio - [Spatial transcriptomics (Visium) data analysis](https://csc.fi/en/training-calendar/spatial-transcriptomics-visium-data-analysis-with-chipster/), May 15 ##### High Performance Computing and other specialized topics - [CSC Summer School in High-Performance Computing](https://csc.fi/en/training-calendar/csc-summer-school-in-high-performance-computing-2025/), 24 June - 3 July ### Zoom breakout rooms and topics Room 1 - Getting started at CSC Room 2 - Ambassadors program Room 3 - MFA Room 4 - SSH keys (Windows) Room 5 - SSH keys (Linux/macOS) Room 6 - SD Desktop Room 7 - Tykky with Pytorch 1.6 (#775258) Room 8 - --- ### Questions :::danger If you wish you can type your question here before the session. We'll respond to them during the session. **Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.** Previous questions and answers can be found in the archives, see links section above. ::: - **Q413** If I use the shell in Puhti, do I still need to create an SSH key? - If you are referring to the login node shell in the web interface, then no. 
This interface does not require SSH keys, but it will soon require [multi-factor authentication](https://docs.csc.fi/accounts/mfa/).
- **Q414** Technical specifications & developing AI-powered analysis in SD Desktop + Apptainer in SD Desktop
    - Answer / breakout room number
- **Q415** Why does MFA for Haka and CSC user ID and password authentication not work for me anymore? :)
    - Room 3
- **Q416** Have there been global changes to the SSH configuration in Puhti, Mahti and LUMI that affect remote forwards run in Slurm scripts?
    - Answer / breakout room number
- **Q417** Is using a mobile phone (i.e., my personal phone, as the university doesn't provide one) the only way to use MFA?
    - There are also MFA implementations for computers; for example, KeePassXC (a password manager) supports MFA. There are also hardware tokens like YubiKeys, but I'm not sure whether they are supported.
- **Q418** Hi, there is a question I would like to ask regarding using the Parallel Computing Toolbox of MATLAB.
    - Room 8

## 2025-04-09 session

**Short talk:** [Upcoming Puhti and Mahti ssh login method change](https://video.csc.fi/media/t/0_gevwux7s)

https://docs.csc.fi/computing/connecting/
https://csc-training.github.io/csc-env-eff/hands-on/connecting/ssh-keys.html
https://csc-training.github.io/csc-env-eff/hands-on/connecting/ssh-puhti.html

#### Course & Webinar Advertisements

##### Rahti Course (online - free)
- 17.4.2025 9-16.
Eventilla page for the event: https://csc.fi/koulutuskalenteri/online-rahti-course/ ##### Allas Webinar - Zoom: https://csc.fi/en/training-calendar/using-the-allas-storage-service/ May 12th at 14:00-15.30 ##### Bio - [Spatial transcriptomics (Visium) data analysis](https://csc.fi/en/training-calendar/spatial-transcriptomics-visium-data-analysis-with-chipster/), May 15 ##### High Performance Computing and other specialized topics - [CSC Summer School in High-Performance Computing](https://csc.fi/en/training-calendar/csc-summer-school-in-high-performance-computing-2025/), 24 June - 3 July ### Zoom breakout rooms and topics Room 1 - Getting started Room 2 - Ambassadors Room 3 - Room 4 - MPI and how to monitor usage of jobs (Ari-Matti) Room 5 - Mats Room 6 - Windows and ssh (Oskar) Room 7 - Room 8 - --- ### Questions :::danger If you wish you can type your question here before the session. We'll respond to them during the session. **Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.** Previous questions and answers can be found in the archives, see links section above. ::: - **Q395** Can you also talk about this? "Starting April 22, 2025, the web interfaces at  www.puhti.csc.fi and www.mahti.csc.fi will require multi-factor authentication. Detailed documentation can be found at https://docs.csc.fi/accounts/mfa/. There you will find instructions for how to test if you already have it configured, and how to configure it depending on the method you use to log in." - We will have a short talk regarding this next week! - **Q396** Are the keys valid if I change my computer? - A: Yeah, you can copy / move the keys. But keep the private key only in 1 trusted machine! You can also always just create new key (1h waiting period). - **Q397** I normally use the login node from the puhti web interface, would I need to also set my ssh keys there? could you show how to do it? 
do I need to add the key also to MyCSC?
    - A: An SSH connection is not used when using the web interface, but MFA is then needed (the topic of next week's short talk)!
- **Q398** So, do I need to add the public key to id_rsa_puhti.pub, or is it generated automatically? I do not see Puhti-key in my MyCSC profile, although I generated the key for PuTTY some hours ago.
    - A: If you don't have the key pair on your laptop already, you need to generate it. There are plenty of instructions for how to do that: https://csc-training.github.io/csc-env-eff/hands-on/connecting/ssh-keys.html
- **Q399** Where could we find this recording later?
    - We will add the links to recordings here (link above): Short talk archive: https://hackmd.io/1xCtrll9SN2wqGYjgZEqhw?view
- **Q400** Do you need to generate a separate key for WinSCP than for PuTTY?
    - A: You can use the same one.
- **Q401** So, when generating the key, should we save it in a CSC directory? I am a bit confused: how can it be accessed if it is stored on my laptop?
    - You should copy the **public** key to MyCSC, see this step: https://docs.csc.fi/computing/connecting/ssh-keys/#adding-public-key-in-mycsc
- **Q402** My public key is like this: ```'ssh-ed25519 *********PuAeqHMK3YFI4AfT84 abpaul@TY2308007'``` Do I copy the whole text or just the key type and the key sequence, like: ```'ssh-ed25519 *********PuAeqHMK3YFI4AfT84'```
    - A: Either should be OK, but newline characters are forbidden :) So perhaps a bit of trial and error with the copy-pasting?
- **Q403** I already added my SSH key yesterday and I still don't see the puhti-key in my MyCSC profile and I still can't connect.
    - Discussed in the main room; the key was no longer on the local laptop.
- **Q404** I created an SSH key with PuTTYgen and it doesn't have my "xxx@xxx" at the end, and MyCSC says that it is an invalid key. I copied the entire thing from the PuTTYgen window, but it still says invalid.
It is in this format: ssh-ed25519 AAAAC3NzaC1lZDI1Nxxxxxxxxxxxxxxeddsa-key-20250409 and this is the public key. Yes there are only single spaces. I tried both RSA and ED25519 keys and it keeps saying invalid key. Got it now, just kept retrying with new key. - Yeah, that looks correct. Just make sure that there is a single space between the base64 encoded key (the one starting with AAA...) and the final comment (something like eddsa...) - **Q405** On Windows machine, I am unable to ssh into mahti or puhti from terminal or powershell after setting up the SSH keys. It returns corrupted MAC on input. However, when using the algorithm like hmac-sha2-512 in the following manner: ssh -m hmac-sha2-512 USERNAME@puhti.csc.fi it works fine. - https://docs.csc.fi/support/faq/i-cannot-login/#why-is-my-ssh-client-saying-corrupted-mac-on-input - Windows issues in room 6 - **Q406** Hello, I have a problem. I generated Keys by PuTTY gen and through terminal. I added them to SSH PUBLIC KEY on my profile. It shows that some thing happened but they did not appear there. - Answer / breakout room number - **Q407** I would like to ask about MPI and how to monitor usage of jobs - Room 4 - **Q408** I generated my keys with Putty, if I want to use powershell or any other client such as MobaXterm, would I need to generate new keys? - A: Not necessary to generate new, but different format is needed, so conversion is needed. Save in a separate file. - **Q409** How do I know my ssh keys are working? - When you see your key in this file: /var/lib/acco/sshkeys/${USER}/${USER}.pub - ```ls -l /var/lib/acco/sshkeys/${USER}/${USER}.pub``` - When using the keys for the first time, it is asking for the passPHRASE. SSHAgent or similar can be used so you don't have to remember this phrase. 
(It is a good idea to keep the passphrase somewhere safe and hidden.)
    - https://docs.csc.fi/computing/connecting/ssh-windows/#authentication-agent-powershell <-- for configuring an authentication agent with PowerShell
- **Q410** How to rsync from Sweden/some other server to Puhti/Mahti?
    - `ssh -A`
        - Same as enabling agent forwarding (`ForwardAgent yes`) in the SSH config file
    - Tip by Oskar: Use a terminal multiplexer (like tmux or screen) or run rsync inside a compute job so you do not need to maintain the connection.
    - We will add this info to the documentation! Thanks for the tip and question!
- **Q411** I usually use the Puhti portal and open a shell in it. Do I still need to generate an SSH key to use the shell that way?
    - Answer / breakout room number
- **Q412** Warning in Mahti when using OpenMPI
    - Answer / breakout room number

## 2025-04-02 session

**Short talk:** Allas UI - https://allas.csc.fi

#### Course & Webinar Advertisements

##### Pouta Course (online - free)
- 8.4.2025 9-16. Eventilla page for the event: https://csc.fi/koulutuskalenteri/online-pouta-course/

##### Rahti Course (online - free)
- 17.4.2025 9-16.
Eventilla page for the event: https://csc.fi/koulutuskalenteri/online-rahti-course/

##### Allas Webinar
- Zoom: https://csc.fi/en/training-calendar/using-the-allas-storage-service/ May 12th at 14:00-15.30

##### Bio
- [Microbial community / environmental DNA analysis with Chipster](https://csc.fi/en/training-calendar/microbial-community-analysis-with-chipster/), April 8-9
- [Spatial transcriptomics (Visium) data analysis](https://csc.fi/en/training-calendar/spatial-transcriptomics-visium-data-analysis-with-chipster/), May 15

##### High Performance Computing and other specialized topics
- [CSC Summer School in High-Performance Computing](https://csc.fi/en/training-calendar/csc-summer-school-in-high-performance-computing-2025/), 24 June - 3 July

### Zoom breakout rooms and topics
Room 1 - Getting started
Room 2 - Ambassadors
Room 3 - Allas UI
Room 4 - Pouta
Room 5 - ML
Room 6 - Course next week
Room 7 - Q393
Room 8 - MATLAB

---

### Questions
- **Q392** Need help with attaching volumes to instances in cPouta, understanding disk space use of virtual instances, and checking snapshots
    - Room 4 after the talk
- **Q393** I received an email about the SSH key for Puhti, but I have trouble understanding how to make the key
    - There are instructions at https://docs.csc.fi/computing/connecting/ssh-keys/. For interactive help, come to breakout room 7 after the talk.
    - Next week's short talk covers this: Puhti and Mahti ssh login method change (9th April)
- **Q394** SSH gives "Corrupted MAC on input" error.
    - See: https://docs.csc.fi/support/faq/i-cannot-login/#why-is-my-ssh-client-saying-corrupted-mac-on-input

## 2025-03-26 session

**Short talk:** Schroedinger Maestro News

#### Course & Webinar Advertisements

##### LUMI users coffee break on Wed 26 March at 13:00 CET (14:00 EET)
- Zoom: https://cscfi.zoom.us/j/65727034273?pwd=VEdtY2trVUVKTEhxajZMbFhETWV2Zz09
- Gregor Decristoforo from the Arctic University of Norway, a member of the LUMI User Support Team, will give a presentation about the [LUMI AI guide](https://github.com/Lumi-supercomputer/LUMI-AI-Guide)

##### Allas Webinar
- Zoom: https://csc.fi/en/training-calendar/using-the-allas-storage-service/ May 12th at 14:00-15.30

##### CSC Computing environment - getting started
- [CSC Computing Environment, Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-4/), April 9-10

##### AI and machine learning
- [Practical Deep Learning (in Aalto University campus)](https://csc.fi/en/training-calendar/practical-deep-learning-5/), April 14-15

##### Bio
- [Microbial community / environmental DNA analysis with Chipster](https://csc.fi/en/training-calendar/microbial-community-analysis-with-chipster/), April 8-9
- [Spatial transcriptomics (Visium) data analysis](https://csc.fi/en/training-calendar/spatial-transcriptomics-visium-data-analysis-with-chipster/), May 15

##### High Performance Computing and other specialized topics
- [CSC Summer School in High-Performance Computing](https://csc.fi/en/training-calendar/csc-summer-school-in-high-performance-computing-2025/), 24 June - 3 July

##### CodeRefinery workshop
- Time: Mar 25-27 and Apr 1-3 2025, 10 – 15 EE(S)T
- Week 1: Also in-person at CSC headquarters in Keilaniemi, Espoo
- Info and registration: [CodeRefinery event page](https://coderefinery.github.io/2025-03-25-workshop/)

### Zoom breakout rooms and topics
Room 1 - Beginner room (Oskar)
Room 2 - Ambassadors
Room 3 - MuJoCo Q390 (Mats)
Room 4 - Short talk / Maestro (Rasmus)
Room 5 - Julia Q388 (Jaan)
Room 6 - Nextflow Q389 (Laxman)
Room 7 - R Q386 & Q387 (Heli)

---

### Questions
- **Q386** Maryam:
    - 1. I am not able to delete my file in /scratch/my path/!
I received this error: `Error occurred when attempting to remove files Unprocessable Entity`
        - Some temporary problem with the web interface file view; it worked when trying again.
    - 2. Can I connect to Puhti via two different sessions and do different things?
        - Maciej: yes
    - 3. How can I save the session in MobaXterm? When I close it, I lose all my commands!
        - Answer / breakout room number 1
- **Q387** Maryam: Sometimes, while coding in an interactive R environment, the command line unexpectedly switches to $, disconnecting me from the R session. Why does this happen, and how can I prevent it? Additionally, when I run start-r, it launches a new R session, but all my previous work is lost. How can I retain my session and avoid losing progress?
    - Answer / breakout room number 7
    - Keeping previous work:
        - Option 1: save the workspace in .RData (function `save.image()`), but make sure to save it somewhere other than the home directory (where space is limited)
        - Option 2: keep all the commands in an R script file or files and save intermediate results/objects; it is good to divide the workflow into smaller parts instead of one very long script
    - Session suddenly ending: the reserved time could be running out. Add `--time hh:mm:ss` to the `sinteractive` command: https://csc-training.github.io/csc-env-eff/hands-on/batch_jobs/interactive.html. Use the `seff` command to check why the session ended (time ran out, memory ran out?): https://docs.csc.fi/support/faq/how-much-memory-my-job-needs/#seff-slurm-efficiency
- **Q388** Mateusz: I am a bit puzzled by CSC Docs about using Julia: https://docs.csc.fi/support/tutorials/julia/ Why do they run julia directly and not using srun? Also, if I want to run a simulation (a serial job) multiple times, should I have multiple runs in a single sbatch script or run multiple sbatch scripts? The latter seems more in line with the serial job philosophy and proper resource allocation; however, it will prompt instantiating Julia packages at every run.
Thank you in advance and sorry for my ignorance.
    - Jaan: You are correct; packages actually need to be installed only once. I need to update the documentation to reflect this.
    - Jaan: For some use cases `srun` is called from a Julia session (Julia's mpiexec, ClusterManagers), and for consistency I omitted it from all the examples. You can use `srun` to run your Julia programs if you aren't calling it from inside Julia.
    - Jaan: Multiple runs from one batch script are better for Slurm. You can farm jobs using Julia's Distributed module from Julia code.
    - Breakout room 5
- **Q389** I'm running the spliceseq workflow in nf-core/nextflow and it seems to require a lot of memory. I would like to get help with the memory settings. I'm slightly puzzled why the workflow requires so much memory, because my earlier runs of other nf-core workflows have been OK with a moderate amount of memory.
    - Answer / breakout room number 6
    - The user was shown the nextflow.config files where one can change the memory settings. It is possible that the number of concurrent jobs is causing the issue. Parameters for limiting the number of concurrent jobs were also shown. If the user still runs into issues, they will raise a ticket with the CSC service desk.
- **Q390** Resource requirement assistance: I am working with the MuJoCo simulator through its Python bindings, and I would like to run the simulation at CSC to produce synthetic data. I tried to use Accelerated Visualization on Puhti, but the resources are limited.
- Room 3 - **Q391** Type question/topic here : I'll ask it by email - Please join the session next Wednesday or send the question by email to servicedesk@csc.fi ## 2025-03-12 session Short talk: no short talk today, whole session for Q&A #### Course Advertisements ##### CSC Computing environment - getting started - [CSC Computing Environment, Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-4/), April 9-10 ##### AI and machine learning - [Practical Deep Learning (in Aalto University campus)](https://csc.fi/en/training-calendar/practical-deep-learning-5/), April 14-15 ##### Bio - [Microbial community / environmental DNA analysis with Chipster](https://csc.fi/en/training-calendar/microbial-community-analysis-with-chipster/), April 8-9 - [Spatial transcriptomics (Visium) data analysis](https://csc.fi/en/training-calendar/spatial-transcriptomics-visium-data-analysis-with-chipster/), May 15 ##### High Performance Computing and other specialized topics - [CSC Summer School in High-Performance Computing](https://csc.fi/en/training-calendar/csc-summer-school-in-high-performance-computing-2025/), 24 June - 3 July (early bird fee until 15.3.) 
- [GPU programming with HIP](https://csc.fi/en/training-calendar/gpu-programming-with-hip-2/), March 27-28

##### CodeRefinery workshop
- Time: Mar 25-27 and Apr 1-3 2025, 10 – 15 EE(S)T
- Week 1: Also in-person at CSC headquarters in Keilaniemi, Espoo
- Info and registration: [CodeRefinery event page](https://coderefinery.github.io/2025-03-25-workshop/)

### Zoom breakout rooms and topics
Room 1 - Getting started at CSC/beginners
Room 2 - CSC Computing Environment, Part 1: Basics (Maria)
Room 3 - Ambassadors
Room 4 - SD Connect issues (Kimmo)
Room 5 - Q378 (Mats)
Room 6 - R issues (Heli)
Room 7 - Allas question (Dean)
Room 8 - Maciej

---

### Questions
- **Q378** What would be the options to deploy and remotely access [NVIDIA Isaac Sim](https://docs.isaacsim.omniverse.nvidia.com/latest/installation/requirements.html#isaac-sim-requirements-isaac-sim-system) through [container installation](https://docs.isaacsim.omniverse.nvidia.com/latest/installation/install_container.html) on a CSC resource? (Jani) :)
    - Answer / breakout room number
- **Q379** We have our data deposited in CSC Allas. If we need to share only some of our data, is it possible to create a bucket and copy only those parts into the new bucket from our original bucket? We want to avoid adding new users to the whole project. As a follow-up, is it possible to restrict the permissions of the users of a project, as there is always a risk of someone accidentally deleting a file, for example? (Sadi)
    - You can use our new web interface for Allas: https://allas.csc.fi. It is very simple to use. You can create a new bucket and upload the data you want, and there you also have the possibility to share that bucket with a different project.
    - About restrictions: everyone in the project can access the data in the project. You cannot restrict an individual member, but you can remove them from the project if needed.
    - Allas data is often used so that there is one project for managing data and one to use the data with read-only permissions.
See - https://docs.csc.fi/data/Allas/allas_project_example/ ## 2025-03-05 session #### Short talk: research.csc.fi update and overview of different CSC websites (csc.fi, my.csc.fi, research.csc.fi, docs.csc.fi) #### Course Advertisements ##### CSC Computing environment - getting started - [CSC Computing Environment, Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-4/), April 9-10 ##### AI and machine learning - [Practical Deep Learning (in Aalto University campus)](https://csc.fi/en/training-calendar/practical-deep-learning-5/), April 14-15 - [Fundamentals of Machine Learning](https://csc.fi/en/training-calendar/fundamentals-of-machine-learning-4/), April 17-18 ##### Bio - [Microbial community / environmental DNA analysis with Chipster](https://csc.fi/en/training-calendar/microbial-community-analysis-with-chipster/), April 8-9 - [Spatial transcriptomics (Visium) data analysis](https://csc.fi/en/training-calendar/spatial-transcriptomics-visium-data-analysis-with-chipster/), May 15 ##### Geospatial - [STAC – how to find and use spatiotemporal data easily?](https://csc.fi/en/training-calendar/stac-how-to-find-and-use-spatiotemporal-data-easily-2/), March 12 ##### High Performance Computing and other specialized topics - [CSC Summer School in High-Performance Computing](https://csc.fi/en/training-calendar/csc-summer-school-in-high-performance-computing-2025/), 24 June - 3 July (early bird fee until 15.3.) 
- [GPU programming with HIP](https://csc.fi/en/training-calendar/gpu-programming-with-hip-2/), March 27-28

##### CodeRefinery workshop
- Time: Mar 25-27 and Apr 1-3 2025, 10 – 15 EE(S)T
- Week 1: Also in-person at CSC headquarters in Keilaniemi, Espoo
- Info and registration: [CodeRefinery event page](https://coderefinery.github.io/2025-03-25-workshop/)

### Zoom breakout rooms and topics
Room 1 - Getting started at CSC/beginners (Ari-Matti)
Room 2 - CSC Computing Environment, Part 1: Basics pre-course support session (Maria)
Room 3 - Ambassadors (~~Samantha~~ for open networking)
Room 4 - RStudio (Heli)
Room 5 - AI/ML (Mats)
Room 6 - Cloud (Jamal)
Room 7 - Mahti (Nino)
Room 8 - Websites (Rasmus)

---

### Questions
- **Q374** How to fix a matrix.mtx.gz file (huge file, unzipped ~100GB) of a genome that I need as a reference?
    - I want to use the command `Read10X()` but the error always says that it has 39563541435487 where it expected an integer. I have tried fixing the matrix in the terminal (this command: `awk '{if(NR>3) $3=int($3); print}' matrix.mtx.gz > matrix_fixed.mtx`) but it hasn't worked out so far. What else could I try? Also, is the file altogether too large for RStudio on Puhti?
    - The same conclusion as below was reached in the breakout room. Split the data and look for existing solutions online, because this seems to be a common problem. Also, a 100 GB file is too big for RStudio, so it is good to use a smaller file or a part of it for testing the script in RStudio and then run the full analysis as a batch job.
    - (non-R expert) This seems to be related to the Seurat package. When matrices are bigger than R can handle by default, splitting the matrix into chunks can help. If you have not tried it, please check the solutions here: https://github.com/satijalab/seurat/issues/4030 Apparently, the first three rows are metadata and should be left out.
- **Q375** How to write/generate sbatch scripts that spread OpenMP threads over NUMA domains on Mahti?
Should I always request a full node's worth of Slurm "cpus" and set OMP_NUM_THREADS and OMP_PROC_BIND? Or can this be done even when OMP_NUM_THREADS = Slurm "cpus"? Asking because the existing sbatch generation tools we're using set these two values to be equal.
    - I did some testing myself and could only make this work with `--cpus-per-task=128` and `OMP_NUM_THREADS=N`, `OMP_PROC_BIND=spread`. It would be nice if there was a CPU binding option in sbatch itself.
    - Room 7
    - The conclusion was that for partial nodes there are no tools for fully controlling how the OpenMP threads are allocated. `OMP_PROC_BIND=spread` could possibly make the best of it.
- **Q376** How to get the CPUs in use with my code? `n_jobs=-1` in Random Forest is not working in Puhti. (Kati)
    - `n_jobs=-1` should be able to detect the number of CPU cores allocated with Slurm. This affects only the RandomForest part of the code, though (the `fit()` function that does the training).
    - Breakout room 5
- **Q377** (Umair) Shiny app deployment help: if I develop a Shiny app and want to deploy it for publication purposes so that other users can use it through a web page, do you have a tutorial or help docs for that? :)
    - Rahti provides a template that deploys RStudio and Shiny, but there is almost no documentation on Rahti's side:
        - https://github.com/CSCfi/helm-charts/tree/main/charts/rstudio
    - Some docs on how to deploy Shiny on Kubernetes should be available on the web
    - Room 6

## 2025-02-26 session

#### Short talk: CSC Scientific Computing Ambassadors - what are those?
[Slides of the short talk](https://a3s.fi/slides/SciComp_Ambassador_short_talk.pdf) #### Course Advertisements ##### CSC Computing environment - getting started - [CSC Computing Environment, Part 1: Basics](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-1-basics-2/), March 12-13 - [CSC Computing Environment, Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-4/), April 9-10 ##### Bio - [Microbial community / environmental DNA analysis with Chipster](https://csc.fi/en/training-calendar/microbial-community-analysis-with-chipster/), April 8-9 - [Spatial transcriptomics (Visium) data analysis](https://csc.fi/en/training-calendar/spatial-transcriptomics-visium-data-analysis-with-chipster/), May 15 - [BioExcel Summer School 2025](https://csc.fi/en/training-calendar/bioexcel-summer-school-2025/), June 8-13 ##### Geospatial - [STAC – how to find and use spatiotemporal data easily?](https://csc.fi/en/training-calendar/stac-how-to-find-and-use-spatiotemporal-data-easily-2/), March 12 ##### Chemistry - [CSC Spring School on Computational Chemistry 2025](https://csc.fi/en/training-calendar/csc-spring-school-on-computational-chemistry-2025/), April 23-25 ##### High Performance Computing and other specialized topics - [CSC Summer School in High-Performance Computing](https://csc.fi/en/training-calendar/csc-summer-school-in-high-performance-computing-2025/), 24 June - 3 July (early bird fee until 15.3.) 
- [GPU programming with HIP](https://csc.fi/en/training-calendar/gpu-programming-with-hip-2/), March 27-28 ##### CodeRefinery workshop - Time: Mar 25-27 and Apr 1-3 2025, 10 – 15 EE(S)T - Week 1: Also in-person at CSC headquarter in Keilaniemi, Espoo - Info and registration: [CodeRefinery event page](https://coderefinery.github.io/2025-03-25-workshop/) ### Zoom breakout rooms and topics Room 1 - Getting started / beginners Room 2 - SciComp Ambassadors hang-out and questions Room 3 - IMAS Room 4 - SD Desktop --- ### Questions - **Q372** How do I use command line in SD Desktop to search in the content of pdf documents? :) - There could be "pdftotext", "pdftohtml", etc tools to extract strings from PDF. Can get ugly. JVL - Answer / breakout room number - **Q373** We are trying to install IMAS (Integrated Modelling & Analysis Suite, from ITER Organization to model fusion in tokamaks) to Puhti, would need some help :) - Breakout room 3: told user to write a ticket to servicedesk@csc.fi specifying what they need. 
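
For Q372 above, the idea of extracting text with `pdftotext` and then searching it can be sketched as follows. This is only a minimal sketch: it assumes that `pdftotext` (part of poppler-utils) and Python 3 are available inside the SD Desktop virtual machine, which the answer above does not confirm. The function names `extract_text`, `find_matches` and `search_pdfs` are made up for this example.

```python
import shutil
import subprocess
from pathlib import Path


def extract_text(pdf_path):
    """Return the text of one PDF using the pdftotext command-line tool."""
    # Passing "-" as the output file makes pdftotext write to standard output.
    result = subprocess.run(
        ["pdftotext", "-layout", str(pdf_path), "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def find_matches(text, term):
    """Return (line_number, line) pairs for lines containing term, ignoring case."""
    # Line numbers refer to the extracted text, not to the PDF page layout.
    needle = term.lower()
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if needle in line.lower()
    ]


def search_pdfs(directory, term):
    """Print every matching line from all PDF files below directory."""
    # Assumption: pdftotext is installed; stop with a clear message if it is not.
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext was not found on PATH")
    for pdf in sorted(Path(directory).rglob("*.pdf")):
        for number, line in find_matches(extract_text(pdf), term):
            print(f"{pdf}:{number}: {line}")


# Example call (hypothetical directory and search term):
# search_pdfs("documents", "consent")
```

For a single file, the same idea works directly in a terminal as `pdftotext file.pdf - | grep -n -i 'term'`. As noted in the answer, text extraction from PDFs can get ugly, so scanned documents without a text layer will not produce matches.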
## 2025-02-12 session #### Short talk: The CodeRefinery workshop - what it is about and why you might want to join - Time: Mar 25-27 and Apr 1-3 2025, 10 – 15 EE(S)T - Week 1: Also in-person at CSC headquarter in Keilaniemi, Espoo - Info: [CodeRefinery event page](https://coderefinery.github.io/2025-03-25-workshop/) - Registration: https://indico.neic.no/event/279/ - > [Materials](https://siili.rahtiapp.fi/CodeRefinery#) #### Course Advertisements ##### CSC Computing environment - getting started - [CSC Computing Environment, Part 1: Basics](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-1-basics-2/), March 12-13 - [CSC Computing Environment, Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-4/), April 9-10 ##### Bio - [Single cell RNA-seq data analysis using Chipster](https://csc.fi/en/training-calendar/single-cell-rna-seq-data-analysis-using-chipster-2/), March 4-5 - [Single cell RNA-seq data analysis using R](https://csc.fi/en/training-calendar/single-cell-rna-seq-data-analysis-using-r-3/), March 6-7 - [Microbial community / environmental DNA analysis with Chipster](https://csc.fi/en/training-calendar/microbial-community-analysis-with-chipster/), April 8-9 - [Spatial transcriptomics (Visium) data analysis](https://csc.fi/en/training-calendar/spatial-transcriptomics-visium-data-analysis-with-chipster/), 15.5.2025 - [BioExcel Summer School 2025](https://csc.fi/en/training-calendar/bioexcel-summer-school-2025/), June 8-13 ##### Geospatial - [STAC – how to find and use spatiotemporal data easily?](https://csc.fi/en/training-calendar/stac-how-to-find-and-use-spatiotemporal-data-easily-2/), March 12 ##### Chemistry - [CSC Spring School on Computational Chemistry 2025](https://csc.fi/en/training-calendar/csc-spring-school-on-computational-chemistry-2025/), April 23-25 ##### High Performance Computing and other specialized topics - [CSC Summer School in High-Performance 
Computing](https://csc.fi/en/training-calendar/csc-summer-school-in-high-performance-computing-2025/), 24 June - 3 July (early bird fee until 15.3.)
- [GPU programming with HIP](https://csc.fi/en/training-calendar/gpu-programming-with-hip-2/), March 27-28

### Zoom breakout rooms and topics
Room 1 - Getting started / beginners
Room 2 - SciComp Ambassadors networking
Room 3 - Q364 - multilevel model
Room 4 - Q365 + Q366 - e/cPouta
Room 5 - Q368 - ollama
Room 6 - Q369 - python install
Room 7 -

---

### Questions
:::danger
If you wish you can type your question here before the session. We'll respond to them during the session.

**Please join the Zoom call if you have a question, if you cannot join the call you can send the question by email to <servicedesk@csc.fi>.**

Previous questions and answers can be found in the archives, see links section above.
:::

- **Q364** Fitting multilevel models iteratively/incremental learning in FIONA to cope with the memory limits:
    >After seeing earlier questions, here is a bit more context. I am a PhD student at the University of Helsinki working with hospitalisation data (e.g. stratified to women in 1996-2021, 25 million observations with only around 1 % ever having events). Due to the hierarchical nature of the data (*time-nested-within-individuals*), a multilevel model (*e.g. binomial*) would make the most sense.
    >
    >Fitting the models iteratively in chunks, updating the starting values based on the earlier chunks, seems to be a fruitful approach, but I am not sure and don't know about the best practices for such methods.
    - Note for specialists: https://stat.fi/tup/tutkijapalvelut/fiona-etakayttojarjestelma_en.html
- **Q365** My Home and End keys stopped working (for going to the beginning and end of text) after upgrading the ePouta flavor. Ideas on how to get them working again?
Anni
    - Room 4
    - https://superuser.com/questions/94436/how-to-configure-putty-so-that-home-end-pgup-pgdn-work-properly-in-bash Problem fixed by modifying my .inputrc file as suggested here!
    - or use ssh in Windows PowerShell instead of PuTTY:
- **Q366** Question about backing up volume data on cPouta / Matthew
    - Room 4
    - My recommendation is to use restic with Allas/S3 (https://docs.csc.fi/data/Allas/allas_encryption/#restic-backup-tool-that-includes-encryption)
- **Q367** Type question/topic here :)
    - Answer / breakout room number
- **Q368** Help with Multi-GPU + ollama/transformers/etc. Rafael L.
    - With ollama, use `--ntasks=1` rather than `--ntasks=4`; ollama itself will handle multiple GPUs. Using multiple tasks will just start multiple copies of ollama, which will be very confused ;-)
- **Q369** Question about a Python installation that is in constant danger of getting cleaned from scratch; need to figure out a more permanent way. Ville T.
    - Room 6

## 2025-02-05 session

:::danger
We are a bit short-staffed on AI/ML/R related matters this time due to overlapping events!
::: #### Short talk: Using CSC's free learning environments for students in your courses #### Course Advertisements - [Single cell RNA-seq data analysis using Chipster](https://csc.fi/en/training-calendar/single-cell-rna-seq-data-analysis-using-chipster-2/), March 4-5 - [Single cell RNA-seq data analysis using R](https://csc.fi/en/training-calendar/single-cell-rna-seq-data-analysis-using-r-3/), March 6-7 - [STAC – how to find and use spatiotemporal data easily?](https://csc.fi/en/training-calendar/stac-how-to-find-and-use-spatiotemporal-data-easily-2/), March 12 - [CSC Computing Environment, Part 1: Basics](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-1-basics-2/), March 12-13 - [Microbiome data analysis with Chipster](https://csc.fi/en/training-calendar/microbial-community-analysis-with-chipster/), April 8-9, registration opening soon - [Spatial transcriptomics (Visium) data analysis](https://csc.fi/en/training-calendar/spatial-transcriptomics-visium-data-analysis-with-chipster/), 15.5.2025 - [CSC Computing Environment, Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-4/), April 9-10 - [CSC Spring School on Computational Chemistry 2025](https://csc.fi/en/training-calendar/csc-spring-school-on-computational-chemistry-2025/), April 23-25 - [BioExcel Summer School 2025](https://csc.fi/en/training-calendar/bioexcel-summer-school-2025/), June 8-13 - [CSC Summer School in High-Performance Computing](https://csc.fi/en/training-calendar/csc-summer-school-in-high-performance-computing-2025/), 24 June - 3 July (early bird fee until 15.3.) 
- [GPU programming with HIP](https://csc.fi/en/training-calendar/gpu-programming-with-hip-2/), March 27-28

### Zoom breakout rooms and topics

Room 1 - Getting started / beginners (Ari-Matti)
Room 2 - Using CSC's free learning environments for students in your courses (Joona)
Room 3 - Licensing issue on Mahti (Jarmo, Sampo)
Room 4 - SD Connect and SD Desktop (Kimmo)
Room 5 - Q360 and Q363 (Oskar & Shanshan)
Room 6 - R library issue (Laxman)
Room 7 -

---

### Questions

- **Q359** I encounter error messages I can't solve:
    ```
    sce_HCNO <- Read10X(data.dir = "HCNO", gene.column = 2, unique.features = TRUE)
    Error in scan(file, nmax = 1, what = what, quiet = TRUE, ...) :
      scan() expected 'an integer', got '3919692702'

    OR

    sce_vel <- Read10X(data.dir = "Vel_FHB", feature.names = TRUE, gene.column = TRUE, unique.features = TRUE)
    Error in Read10X(data.dir = "Vel_FHB", feature.names = TRUE, gene.column = TRUE, :
      unused argument (feature.names = TRUE)

    Vel_FHB and HCNO are (downloaded) public gene libraries; the folders contain features.tsv.gz, barcodes.tsv.gz and matrix.mtx.gz files. I have also gunzipped them but received similar error messages.

    I also still encounter issues with BiocManager, saying that I need to update from version 3.19 to 3.20 in order to e.g. install:
    .libPaths(c("/projappl/project_xxxxx/project_rpackages_440", .libPaths()))
    --Load necessary package
    BiocManager::install("xxx", force =TRUE, lib=.libPaths()[1])
    library(xxx)
    or: library("ZellKonverter")
    or "SCtype" from FIMM
    ```
    Even when I run `library(SingleR)` and it gives no error, `data(package = "SingleR")` reports "no data sets found". Installing it again gives errors as described above...
    - Sounds like a problem with the input data. Maybe similar to this issue: https://github.com/satijalab/seurat/issues/4030
    - When installing your R package, please make sure to use your own project here: /projappl/project_xxxxx/project_rpackages_440.
It looks like the project number placeholder was not replaced with your own, if I am not mistaken.
- **Q360** How to run a complicated (i.e. hard to write a batched version of) PyTorch function `f` in parallel across multiple inputs? The recommended approach I've seen so far is to use `torch.multiprocessing`, but this gives an issue because some objects I use inside `f` cannot be pickled (similar issue with Python's built-in `multiprocessing`). The solution to this pickling problem I've found was to use the `multiprocess` package, which uses `dill`; however, in that case there is a conflict coming from using the CPU and GPU together.
    - Room 5
- **Q361** How can we access encrypted data after having (probably) lost the private key? I can view the data in SD Desktop, but automatic decryption via SD Connect did not work.
    - Answer: You can import the public key of a new crypt4gh key pair to SD Desktop. Encrypt the data with this new key and then export the encrypted data to SD Connect. From there you can download the data to your own computer and decrypt it using the new secret key and crypt4gh.
- **Q362** Noppe etc. questions for Joona:
    - Noppe: if a student does not have a Haka login, do we request a CSC account for them in advance?
        - noppe@csc.fi
    - Noppe: the notebooks I share with the students: do they all get a copy of the instance that they can edit and save in their space? Or do they just see the file that I shared with them?
        - Each student has their own copy.
        - How is this implemented? Copy on creation or at each startup? If I update it, what happens?
            - Materials are downloaded from GitHub at launch.
            - The environment is lost when closed; there is no persistent storage, so the student needs to download the data. Thanks!
    - Noppe: As a teacher, am I able to see the work that the students did and re-run it to test it?
        - No persistent storage, so not possible. It might be a future feature. An autograder (e.g. nbgrader) could also be added in the future.
    - Comment to Noppe: "I used Noppe last summer.
I don't remember what machine learning course it was, but I was able to finish the course after I moved the data to Colab. The course was not up to date."
    - Noppe: is the data persistent per (course, account) combination?
        - There is no persistent data right now (but this can be figured out later if needed, for example with Allas).
    - Noppe vs projects for students: if all I need is for students to complete a Jupyter notebook and run some simple visualisation, is Noppe better for this? When are projects better than Noppe?
        - For Noppe, students need only a HAKA login. For a project (and Puhti/cPouta), students also need proper CSC accounts. So if Noppe is enough, use it.
    - Did I get this right that students can ~~make a CSC account now~~ log in with HAKA and get Jupyter compute resources without a teacher making the "course"?
        - Students can open the public Jupyter or RStudio applications in Noppe with HAKA. Thanks!
    - Would a project (vs Noppe) allow persistent storage so that eventually the teacher can check the work of the students? Or maybe it is not optimal (students would see each other's solutions).
        - Not really, because the persistent storage might require students to opt in. Ideally we will have students working on their own Jupyter notebooks and then the teacher being able to rerun and test the notebook.
    - If I understood correctly, students can do assignments through Noppe, but for assessment by the teacher, students need to download their results and send them through Moodle, for example.
        - Yep, that is also what I understood. Then they upload it to your course Moodle and you test the notebook locally.
        - Yes, this has been confirmed to be the current functionality. But this has also been recognised and is on our task list.
    - Do I see if a specific student logged in to Noppe?
        - ~~To be checked.~~ From the join code you can see if the members have joined.
        - Currently there is no possibility to check login status, BUT the teacher can see running sessions.
        - There is a ticket about this kind of feature on our list. Now I know to prioritise this higher. Thank you for the feedback!
- **Q363** I want to use multiple GPUs in a TensorFlow script. Using two GPUs works, but it takes the same time as a single GPU. Any advice on how to make the most of more than one GPU on Puhti?
    - Room 5

## 2025-01-29 session

#### Short talk: Chipster - easy-to-use software for analysing your sequencing data

> Did you know that you can analyse your bulk RNA-seq, single-cell RNA-seq, spatial, metagenomics etc. data with the easy-to-use [Chipster software](https://chipster.csc.fi/), for free?
>
> Chipster has a nice and user-friendly graphical user interface that you can use in your browser. Users don't need to know R, Python or Linux commands or do any installations at all.
>
> Chipster is open source software developed at CSC, and researchers in Finland can use it to run their analyses on our supercomputers without any extra costs. We have wrapped over 500 analysis tools and R and Bioconductor packages in Chipster for easy analysis, visualisation and sharing of your data. We also offer [free online self-learning materials](https://chipster.2.rahtiapp.fi/manual/courses.html), occasionally host live courses, and are of course happy to provide support!

[Chipster short talk slides](https://a3s.fi/Marias_share/IntroToChipster2025.pdf) chipster@csc.fi #### Course Advertisements - [Single cell RNA-seq data analysis using Chipster](https://csc.fi/en/training-calendar/single-cell-rna-seq-data-analysis-using-chipster-2/), March 4-5 - [Single cell RNA-seq data analysis using R](https://csc.fi/en/training-calendar/single-cell-rna-seq-data-analysis-using-r-3/), March 6-7 - [STAC – how to find and use spatiotemporal data easily?](https://csc.fi/en/training-calendar/stac-how-to-find-and-use-spatiotemporal-data-easily-2/), March 12 - [CSC Computing Environment, Part 1: Basics](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-1-basics-2/), March 12-13 - Microbiome data analysis with Chipster, April 8-9, registration opening soon - Spatial transcriptomics (Visium) data analysis, 15.4.2025, registration opening soon - [CSC Computing Environment, Part 2: Next steps](https://csc.fi/en/training-calendar/online-csc-computing-environment-part-2-next-steps-4/), April 9-10 - [CSC Spring School on Computational Chemistry 2025](https://csc.fi/en/training-calendar/csc-spring-school-on-computational-chemistry-2025/), April 23-25 - [BioExcel Summer School 2025](https://csc.fi/en/training-calendar/bioexcel-summer-school-2025/), June 8-13 - [CSC Summer School in High-Performance Computing](https://csc.fi/en/training-calendar/csc-summer-school-in-high-performance-computing-2025/), 24 June - 3 July (early bird fee until 15.3.) - [GPU programming with HIP](https://csc.fi/en/training-calendar/gpu-programming-with-hip-2/), March 27-28 ### Zoom breakout rooms and topics Room 1 - Getting started / beginners Room 2 - Chipster (Eija, Maria) Room 3 - Kati & Mats Room 4 - Nino & Francesco Room 5 - Q355 Room 6 - Maestro Room 7 - Matlab --- ### Questions - **Q351** Uploading a large dataset (around 1TB) to Allas that has large chunks of data inside. 
I sent it to a bucket named "large_data", and it also created "large_data_segments". Is this normal? Why is that?
    - Yes, this is normal. Files larger than 5 GB are split into segments, which are stored in a separate `_segments` bucket: https://docs.csc.fi/data/Allas/using_allas/swift_client/#files-larger-than-5-gb
- **Q352** Dask parallel optimising + scikit-learn, large data, spatial data (Kati)
    - Room 3
- **Q353** Is there a way to queue up a Maestro job that consists of many (~800) small subjobs?
    - Room 6
- **Q354** I am running Monte Carlo simulations in MATLAB on Puhti as array jobs. While I have successfully done this on other HPC platforms, I am encountering errors on Puhti related to MATLAB license limits. My questions are: How can I determine the number of available MATLAB licenses on the server? Additionally, is there a workaround to run my simulations despite the limited number of licenses?
    - Room 7 / solved
- **Q355** How to plot animations in VS Code (Python) on Puhti in the interactive window? I am using Tykky containerized environments and I am having problems with applications that expect a display to be connected.
    - Room 5
- **Q356** Does Chipster have input data limits?
    - 400 GB is the default limit. When uploading big files, make sure that the upload speed is good enough!
- **Q357** Is there an Allas - Chipster connection for upload?
    - You can use the public URL in Allas to load the data to Chipster.
This is how you put data in Allas easily using the Allas Web GUI: https://docs.csc.fi/data/Allas/using_allas/web_client/
- **Q358** Questions regarding trimming and filtering reads in Chipster
    - Check this video: https://www.youtube.com/watch?v=EKTGqatq6HI&list=PLjiXAZO27elBj3KYi7ACscgOxlNkNOxPc&index=3

## 2025-01-22 session

#### Short talk: Sensitive data services: SD Connect 2.0

#### Course Advertisements

- [CSC Spring School on Computational Chemistry 2025, April 23-25, **registration open**](https://csc.fi/en/training-calendar/csc-spring-school-on-computational-chemistry-2025/)
- [BioExcel Summer School 2025, June 8-13, **registration open**](https://csc.fi/en/training-calendar/bioexcel-summer-school-2025/)
- [CSC Summer School in High-Performance Computing](https://csc.fi/en/training-calendar/csc-summer-school-in-high-performance-computing-2025/), 24.6 - 3.7, registration open (early bird fee until 15.3)
- GPU programming with HIP, March 27-28, registration opening soon

### Zoom breakout rooms and topics

Room 1 - Getting started / beginners (Ari-Matti)
Room 2 - Sensitive data services (Kimmo)
Room 3 - Python packages (Q348, Q347, Kati - Mats)
Room 4 - Rahti (Jemal)

---

### Questions

- **Q347** Using VS Code with Containerized Environments
    - Hi there! I'm new to CSC, so I apologize if this is covered in the documentation already. I'm looking to set up an interactive session where I can connect to CSC using VS Code, start a custom development container, and attach that container to VS Code for development and debugging. I've been able to start the container based on the documentation, but I'm having trouble attaching it to VS Code. Any guidance on how to accomplish this would be greatly appreciated! Thank you!
- **Q348** Kaggle in Puhti
    - I am trying to install kaggle in my Puhti profile to be able to download a dataset using the Kaggle API with the command:
    - $ kaggle competitions download -c imagenet-object-localization-challenge
    - On my local laptop I do the following using conda:

```
$ pip install kaggle
$ kaggle --version # Kaggle API 1.6.17
$ kaggle competitions download -c imagenet-object-localization-challenge
```

But on Puhti (using a Tykky container) I do the following to leverage pip install:

```
$ module purge
$ module load tykky
$ cd /projappl/project_2004072
#######################################
# modify packages.sh file:
$ vim packages.sh
pip install kaggle
#######################################
$ conda-containerize update CondaCSC --post-install packages.sh

Here is the error I get:
[alijanif@puhti-login12 project_2004072]$ conda-containerize update CondaCSC --post-install packages.sh
[ INFO ] Constructing configuration
[ WARNING ] Found existing installation(s) ['/projappl/project_2004072/CondaCSC/bin'] in PATH. It's recommended to remove these from PATH when running the tool
[ INFO ] Using /local_scratch/alijanif/cw-JSBX5O as temporary directory
[ INFO ] Copying container CondaCSC/container.sif
[ INFO ] Copying image CondaCSC/img.sqfs
[ INFO ] Copying installation to writable area, might take a while
[ INFO ] /local_scratch/alijanif/cw-JSBX5O/CondaCSC
[ INFO ] Running installation script
Collecting kaggle
Downloading kaggle-1.6.17.tar.gz (82 kB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [59 lines of output]
/CSC_CONTAINER/miniconda/envs/env1/lib/python3.9/site-packages/setuptools/dist.py:493: SetuptoolsDeprecationWarning: Invalid dash-separated options
!!
********************************************************************************
Usage of dash-separated 'description-file' will not be supported in future versions.
Please use the underscore name 'description_file' instead. By 2025-Mar-03, you need to update your project and remove deprecated calls or your builds will no longer be supported. See https://setuptools.pypa.io/en/latest/userguide/declarative_config.html for details. ******************************************************************************** !! opt = self.warn_dash_deprecation(opt, section) running egg_info creating /local_scratch/alijanif/pip-pip-egg-info-z6_d3dnc/kaggle.egg-info writing /local_scratch/alijanif/pip-pip-egg-info-z6_d3dnc/kaggle.egg-info/PKG-INFO writing dependency_links to /local_scratch/alijanif/pip-pip-egg-info-z6_d3dnc/kaggle.egg-info/dependency_links.txt writing entry points to /local_scratch/alijanif/pip-pip-egg-info-z6_d3dnc/kaggle.egg-info/entry_points.txt writing requirements to /local_scratch/alijanif/pip-pip-egg-info-z6_d3dnc/kaggle.egg-info/requires.txt writing top-level names to /local_scratch/alijanif/pip-pip-egg-info-z6_d3dnc/kaggle.egg-info/top_level.txt writing manifest file '/local_scratch/alijanif/pip-pip-egg-info-z6_d3dnc/kaggle.egg-info/SOURCES.txt' reading manifest file '/local_scratch/alijanif/pip-pip-egg-info-z6_d3dnc/kaggle.egg-info/SOURCES.txt' reading manifest template 'MANIFEST.in' warning: no files found matching 'LICENSE.txt' adding license file 'LICENSE' Traceback (most recent call last): File "<string>", line 2, in <module> File "<pip-setuptools-caller>", line 34, in <module> File "/local_scratch/alijanif/pip-install-3gwa72s_/kaggle_f3eaf0a7727246928a277e2260e5d464/setup.py", line 7, in <module> setup( File "/CSC_CONTAINER/miniconda/envs/env1/lib/python3.9/site-packages/setuptools/__init__.py", line 117, in setup return distutils.core.setup(**attrs) File "/CSC_CONTAINER/miniconda/envs/env1/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 186, in setup return run_commands(dist) File "/CSC_CONTAINER/miniconda/envs/env1/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 202, in 
run_commands dist.run_commands() File "/CSC_CONTAINER/miniconda/envs/env1/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 983, in run_commands self.run_command(cmd) File "/CSC_CONTAINER/miniconda/envs/env1/lib/python3.9/site-packages/setuptools/dist.py", line 999, in run_command super().run_command(command) File "/CSC_CONTAINER/miniconda/envs/env1/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 1002, in run_command cmd_obj.run() File "/CSC_CONTAINER/miniconda/envs/env1/lib/python3.9/site-packages/setuptools/command/egg_info.py", line 312, in run self.find_sources() File "/CSC_CONTAINER/miniconda/envs/env1/lib/python3.9/site-packages/setuptools/command/egg_info.py", line 320, in find_sources mm.run() File "/CSC_CONTAINER/miniconda/envs/env1/lib/python3.9/site-packages/setuptools/command/egg_info.py", line 548, in run self.prune_file_list() File "/CSC_CONTAINER/miniconda/envs/env1/lib/python3.9/site-packages/setuptools/command/sdist.py", line 162, in prune_file_list super().prune_file_list() File "/CSC_CONTAINER/miniconda/envs/env1/lib/python3.9/site-packages/setuptools/_distutils/command/sdist.py", line 380, in prune_file_list base_dir = self.distribution.get_fullname() File "/CSC_CONTAINER/miniconda/envs/env1/lib/python3.9/site-packages/setuptools/_core_metadata.py", line 272, in get_fullname return _distribution_fullname(self.get_name(), self.get_version()) File "/CSC_CONTAINER/miniconda/envs/env1/lib/python3.9/site-packages/setuptools/_core_metadata.py", line 290, in _distribution_fullname canonicalize_version(version, strip_trailing_zero=False), TypeError: canonicalize_version() got an unexpected keyword argument 'strip_trailing_zero' [end of output] note: This error originates from a subprocess, and is likely not a problem with pip. error: metadata-generation-failed × Encountered error while generating package metadata. ╰─> See above for output. note: This is an issue with the package mentioned above, not pip. 
hint: See above for details.
[ ERROR ] Installation failed
[ ERROR ] Set CW_DEBUG_KEEP_FILES env variable to keep build files
Avslutad
$ export PATH="/projappl/project_2004072/CondaCSC/bin:$PATH"
```

- Answer / Room 3
- **Q349** Batch creating and Python package imports... (Kati)
    - Answer / Room 3
- **Q350** I have questions as a beginner user of Rahti.
    - Answer / Room 4

## 2025-01-15 session

#### Short talk: Discover CSC Services for Research: Free Resources for Open Science in Finland

[Material](https://siili.rahtiapp.fi/SSH_CSC_service_overview?view)

#### Course Advertisements

- [CSC Spring School on Computational Chemistry 2025, April 23-25, **registration open**](https://csc.fi/en/training-calendar/csc-spring-school-on-computational-chemistry-2025/)
- [BioExcel Summer School 2025, June 8-13, **registration open**](https://csc.fi/en/training-calendar/bioexcel-summer-school-2025/)
- HPC summer school

### Zoom breakout rooms and topics

Room 1 - Short talk topic, selecting your services (Samantha)
Room 2 - Q344 / Machine learning (Mats, Oskar, Shanshan)
Room 3 - Getting started/beginner room (Heli)
Room 4 - Random forest, array problem, scikit-learn (dask?), Kati + Q345 (Alvaro, Jarmo)
Room 5 - Rahti getting started help (Tristan)

---

### Questions

- **Q344** I am running vLLM on LUMI using the Pytorch module and a venv built on top of it. I have used the default Pytorch module and lately also the newer Pytorch/2.5, since it has a newer version of vLLM. Both versions, seemingly at random, give this error when trying to run my code: "RuntimeError: NCCL error: internal error - please report this issue to the NCCL developers". This does not happen every time, but seems to be happening more frequently with the newer Pytorch. I have tried excluding the nodes where this error occurs, but the list of excluded nodes is getting quite long, so it's probably not a sustainable solution. Do you have any ideas on how to solve the issue?
    - Answer / Room 2
- **Q345** I am trying to run a Python script on Puhti, but I have some trouble with parallelism (using the `multiprocessing` package) and reserving an appropriate number of cores and nodes. Would it be possible to get some help with this?
    - Answer / Room 4
- **Q346** PyTorch Lightning is not using all available GPUs on LUMI
    - https://docs.csc.fi/support/tutorials/ml-multi/#pytorch-lightning-with-ddp

## 2025-01-08 session

#### Short talk: *No short talk this week*

#### Course Advertisements

- [CSC Spring School on Computational Chemistry 2025, April 23-25, **registration open**](https://csc.fi/en/training-calendar/csc-spring-school-on-computational-chemistry-2025/)
- [BioExcel Summer School 2025, June 8-13, **registration open**](https://csc.fi/en/training-calendar/bioexcel-summer-school-2025/)

---

### Questions

- **Q343** I am quite new to Puhti, but now I have run some interactive sessions and I have questions regarding the time limits and efficiency of my sessions.
    - Room 2

## 2024-12-18 session

#### Short talk

Pukki - Database as a Service - [slides](https://a3s.fi/weekly-user-zoom/2024-12-18-DBaaS-CSC-Research-Support-Coffee.pdf)

#### Advertisements

- [Moving your AI training jobs to LUMI: A Hands-On Workshop, February 4-5, *registration will open next week*](https://www.lumi-supercomputer.eu/events/lumi-ai-workshop-feb2025/)
- [CSC Spring School on Computational Chemistry 2025, April 23-25, **registration open**](https://csc.fi/en/training-calendar/csc-spring-school-on-computational-chemistry-2025/)
- [BioExcel Summer School 2025, June 8-13, **registration open**](https://csc.fi/en/training-calendar/bioexcel-summer-school-2025/)

### Zoom breakout rooms and topics

Room 1 - Getting started/beginner room (Samantha)
Room 2 - Pukki (short talk, Oscar)
Room 3 - MATLAB questions (Jaan)
Room 4 - VS Code question (Mats)
Room 5 - Starting an LLM (Shanshan, Oskar)

---

### Questions

- **Q341** Is it possible to use your own MATLAB license for parallel
computing with Puhti? How?
    - Jaan: I'll answer MATLAB questions.
    - Room 3
    - That depends: is your own license a license file, a license server, or an online login using credentials?
        - `export MLM_LICENSE_FILE=~/.matlab/matlab.lic` -- license file
        - `export MLM_LICENSE_FILE=1766@license4.csc.fi` -- license server
        - `matlab -desktop`
- **Q342** I have set up VS Code to work in Puhti interactively via SSH. I created a virtual environment using the `--system-site-packages` flag to locally add a package missing from the python-data module. The setup works well; however, the interpreter does not recognize numpy (even though I can import numpy and use it) and flags the following message: Import "numpy" could not be resolved (Pylance). I have edited the settings.json inside my .vscode folder, but I still have the same problem. Is there a workaround to make Pylance recognize that the python-data module and my local venv are linked, and that numpy is contained in one of these places?
    - Room 4

## 2024-12-04 session

**Short talk: Using MLflow in Puhti (and LUMI)**
[Tutorial](https://github.com/erikaster/puhti_mlflow_tutorial) | [Slides PDF](https://a3s.fi/mats/erika_mlflow.pdf)

Previous short talk "How do I figure out batch job parameters?" [Slides](https://a3s.fi/saren-2001659-pub/How_to_Figure_Out_Batch_Job_Parameters_2024-11-27.pdf)

**Advertisements:**

- [Julia for HPC](https://csc.fi/koulutuskalenteri/julia-for-high-performance-scientific-computing-enccs-2/) 9-12.12.2024, registration open

### Zoom breakout rooms and topics

Room 1 - Getting started (Laxmana)
Room 2 - MLflow / short talk (Erika)
Room 3 -
Room 4 -
Room 5 -
Room 6 -
Room 7 -
Room 8 -

---

### Questions

- **Q339** [updated] I am having trouble installing *crosspy* in a venv on Puhti.
I did this:
    ```sh
    # create virtual environment
    cd /projappl/project_2011613
    module load python-data
    python3 -m venv --system-site-packages env_vwm_geometry

    # activate virtual environment
    cd /projappl/project_2011613
    module load python-data
    source /projappl/project_2011613/env_vwm_geometry/bin/activate

    # load modules
    module load gcc/11.3.0 cuda/11.7.0

    # install cupy and crosspy
    pip install --user cupy-cuda117
    pip install --user crosspy

    # sbatch to test cupy
    sbatch gpu_test.sh
    ```
    ```sh
    #!/bin/bash
    # gpu_test.sh
    #SBATCH --account=project_2011613
    #SBATCH --partition=gputest
    #SBATCH --time=5
    #SBATCH --cpus-per-task=1
    #SBATCH --ntasks=1
    #SBATCH --mem-per-cpu=16G
    #SBATCH --gres=gpu:v100:1

    module load python-data
    module load gcc/11.3.0 cuda/11.7.0
    source /projappl/project_2011613/env_vwm_geometry/bin/activate

    nvidia-smi
    python3 -c "import cupy as cp; print(cp.cuda.runtime.runtimeGetVersion()); print(cp.cuda.runtime.getDeviceProperties(0))"
    python3 -c "import crosspy"
    ```
    ```sh
    # I am getting this error in the slurm output related to crosspy
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/users/santoang/.local/lib/python3.10/site-packages/crosspy/__init__.py", line 39, in <module>
        from crosspy import random, utils
      File "/users/santoang/.local/lib/python3.10/site-packages/crosspy/random/__init__.py", line 10, in <module>
        from crosspy.device import Device
      File "/users/santoang/.local/lib/python3.10/site-packages/crosspy/device/__init__.py", line 113, in <module>
        from .cpu import cpu, _CPUDevice
      File "/users/santoang/.local/lib/python3.10/site-packages/crosspy/device/cpu/__init__.py", line 6, in <module>
        from crosspy.utils.array import ArrayType, register_array_type
      File "/users/santoang/.local/lib/python3.10/site-packages/crosspy/utils/__init__.py", line 1, in <module>
        from .profile import Timer
    ImportError: libsvml.so: cannot open shared object file: No such file or directory
    ```
    Any advice and help would be very welcome. Thank you!
    - Hello, I've been replying to your ticket in servicedesk. Maybe we can continue via servicedesk email, as I have a bit of a sore throat today and prefer not to talk over Zoom so much ;-) - Mats
        - Sure, let's keep talking by email. Thank you Mats. - Aniol
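
A note on Q339: the traceback in the question shows crosspy being imported from `/users/santoang/.local/lib/python3.10/site-packages`, which is where `pip install --user` places packages, not from the venv itself. The import then fails because `libsvml.so` cannot be opened. The following is only a minimal diagnostic sketch using the Python standard library, not the fix worked out in the servicedesk ticket. It prints where a module would be imported from and whether the dynamic loader can open a shared library. The helper names `module_origin` and `can_load_library` are hypothetical and chosen for illustration.

```python
import ctypes
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


def can_load_library(soname: str) -> bool:
    """Return True if the dynamic loader can open the given shared library."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # Module and library names are taken from the question; run this inside
    # the same environment as the batch job (same modules loaded, venv active).
    for mod in ("cupy", "crosspy"):
        print(mod, "->", module_origin(mod))
    print("libsvml.so loadable:", can_load_library("libsvml.so"))
```

If the printed paths point to `~/.local` rather than the venv directory, the packages came from a `--user` install. If the library check prints `False`, the loader cannot find `libsvml.so` with the currently loaded modules.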