New archive: https://hackmd.io/uVI5gLKDQoWZgAVcufYg2A?view
Contents of this document and quicklinks
7.9. Puhti web interface (recording)
14.9. LUMI (slides, recording)
21.9. CSC Notebooks (slides, recording)
28.9. Puhti OS update (slides)
5.10. Tykky (slides, recording)
12.10. SD services (slides, recording)
19.10. No short talk (autumn holidays)
26.10. Machine Learning guide (no slides, all information in docs, recording)
2.11. CSC cloud services Pouta and Rahti (slides, recording)
9.11. Workflows and Hyperqueue (recording)
16.11. CSC Training materials (recording)
23.11. RStudio/R in different platforms (slides, recording)
30.11. Introduction to the Language Bank of Finland (Kielipankki) (slides_part1, slides_part2, recording)
7.12. MyCSC (recording)
14.12. mini crash course in HPC
11.1. Upcoming CSC Computing Environment course (Self-learning course page)
18.1. Data retention policy updates (Slides, recording, data clean up in docs.csc.fi)
25.1.: No short talk, just Q&A session
1.2.: No short talk, just Q&A session
8.2.: Mini crash course in HPC 2
22.2.: SD Desktop (recording)
1.3.: LUMI (slides, recording)
8.3.: Copying scratch dir to Allas (recording)
15.3.: R/python-based web tools at CSC for reproducible research and teaching (slides)
22.3. No short talk
29.3. ssh keys and login safely (recording)
5.4.: Mini course as a first step to CSC's services (mini course)
Q1: Are there platforms other than Puhti where I can run R scripts? I would like to have more than 1534 GB of RAM.
Q2: (same op as Q1) Can you share a computation between tasks in R-singularity container? In other words, if task n does not have enough memory, can it borrow memory from another task n+1?
Q3: How can I give others access to the ocean simulation code NEMO that I have compiled (FMI)? Could you make it available for others to use?
Q4: I am providing support to researchers working in our unit. How should I instruct the users to get access to CSC?
Short talk: LUMI (slides)
Q5: I am a PhD researcher who has never worked with RNA seq (bulk and single cell) data before, and I now need to decide where to store my data and where to analyse it. I don't really know how much data I will generate since I haven't worked with these methods before, but we're doing both bulk and scRNA seq from a patient cohort (40-100 patients). No one in my group has done seq studies and those who have just store their data on external hard drives. We don't really have a data management plan either. I suppose I would use ePouta for storing the data (involves patient information), but I'm also going to do analysis on Chipster and/or R in Puhti, and don't quite understand how exporting data from the cloud works. The university's research data "helpdesk" isn't very helpful, so I'm asking basically for guidelines on where I should store the data and mostly how do these platforms ePouta and Puhti work
Edit: 15.9. Sorry for not being available at the zoom on Wednesday, but thank you for the tips, I will try to show up next week and then get some further advice :)
Q6: I'm sometimes having trouble accessing the Puhti Webinterface. Trying to access puhti.csc.fi just times out. It worked just fine this morning (13.09.) but suddenly stopped working around 11 am. This has happened previously as well. What could be the problem?
Q7: I'm planning to use CSC Notebooks for course work with the notebook workspace that we are planning to create. Since there are some EU students who do not have Haka accounts, how does CSC handle their access, and do we need to pay something on behalf of those students? Also, for the memory allocation, I have an estimated memory usage per session, but can those numbers change with concurrent users? For example, if parallel Jupyter instances share most of the code in memory, total usage might be smaller. And what is the limit on the memory that one student can have per session?
Q8: We would like to use the autograding possibilities in the Jupyter Notebooks server created for students. It looks like almost all the components are available already: jupyterhub, autograder-labextension. The administrator only needs to run `pip install grader_labextension` and then everything is ok. Is this correct? If not, we could run nbgrader manually, but what would be the optimal way to distribute the feedback to the students? Can the workspace owner write to students' permanent directories? Or perhaps the feedback could be stored in the shared directory with permissions allowing only a certain user to read it? But these are not necessary if the autograder already works.
Q9: Anyone from LUMI-G EAP? Multi-gpu pytorch issue
Short talk: CSC Notebooks
From previous session: this customer joining the session today!
Q5: I am a PhD researcher who has never worked with RNA seq (bulk and single cell) data before, and I now need to decide where to store my data and where to analyse it. I don't really know how much data I will generate since I haven't worked with these methods before, but we're doing both bulk and scRNA seq from a patient cohort (40-100 patients). No one in my group has done seq studies and those who have just store their data on external hard drives. We don't really have a data management plan either. I suppose I would use ePouta for storing the data (involves patient information), but I'm also going to do analysis on Chipster and/or R in Puhti, and don't quite understand how exporting data from the cloud works. The university's research data "helpdesk" isn't very helpful, so I'm asking basically for guidelines on where I should store the data and mostly how do these platforms ePouta and Puhti work
Q10: I have a large social media dataset and am thinking about the best tools to analyze the data. The data are potentially sensitive and thus might require SD services. However, certain tools such as Spark are not available there by default. So I'm wondering whether I could somehow deploy those tools using SD services at CSC, or whether I should instead analyze the data, for instance, on Rahti.
Q11: RHEL8 update
Q12: Rahti and "glusterfs" issues
Q13: Large VMs requested by univ Oulu IT customers. ePouta running our resources. Sensitive data.
Q14: Notebooks :)
Q15: Containerizing non public python packages
Short talk: Puhti Operating System update
Slides
Room 1 -
Room 2 - Q18 (Henrik)
Room 3 - Conda & Tykky (Ari-Matti, Kimmo)
Room 4 -
Q16: Do the Slurm development headers exist somewhere in Puhti? I've been trying to compile the PySlurm library, and it does not seem to find them.
Q17: I am an OpenFOAM user and I run my simulations on Mahti. I do not want to use multi-threading in CFD simulations, as it doesn't bring any speed-up. The summary emails that I get from Mahti at the end of simulations show that I utilize "Cores per node: 256", even though as far as I know this value is 128 for Mahti without multi-threading (also, my Slurm output file shows that SLURM_TASKS_PER_NODE= 128). Am I doing something wrong, or is this info safe to ignore? In addition, during OpenFOAM simulations, if I use collated write output, my simulation hangs at write time and stays like that until I cancel it. However, the same run works well with non-collated output. How can I solve this issue? Thanks a lot in advance!
Cores per node: 256
CPU Efficiency: 50%
50% means actually that your job has used 100% of the core capacity - an excellent MPI application! That is because the comparison is made against 256 threads.
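The arithmetic behind that figure can be sketched as follows. This is only an illustration, assuming the efficiency is reported as busy hardware threads divided by the hardware threads counted per node; the function name `reported_efficiency` is made up for this example and is not part of Slurm or any CSC tool.

```python
def reported_efficiency(busy_threads: int, counted_threads: int) -> float:
    """Fraction of the counted hardware threads that were kept busy."""
    if counted_threads <= 0:
        raise ValueError("counted_threads must be positive")
    return busy_threads / counted_threads


# One MPI task per physical core on a node with 128 cores,
# while the report counts 256 hardware threads per node.
tasks_per_node = 128
threads_per_node = 256

print(f"Reported: {reported_efficiency(tasks_per_node, threads_per_node):.0%}")
print(f"Against physical cores: {reported_efficiency(tasks_per_node, 128):.0%}")
```

With these numbers the first line prints 50% and the second 100%, which is why a 50% figure is compatible with fully used cores.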
That collated method should be a robust method. Could you send me your batch job script and some description of your OpenFOAM model, such as the size of the mesh and the equations (sub-models) being solved? You may send that to servicedesk@csc.fi. The message will eventually reach me.
Q18: I want to import my libraries in my program on the puhti.csc.fi site, but I get the following error: ERROR: Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/projappl/project_123' Check the permissions. I tried the following command: `pip install --user gym`, but that did not resolve it. Could you please advise me?
Q19 Conda and Tykky tips needed, could we have breakout
Q20: Why does CSC not support MPIrun anymore?
Short talk: Tykky container wrapper
Slides
26 in total online
Tykky - Henrik & questions about Tykky
Room 2 -
Room 3 -
Room 4 -
Q22: I'm using Gfortran in a project I just started to use in CSC. For my project I have made a precompiler which generates the necessary use statements to access variables or subroutines stored in modules. The names of these variables start with the prefix j_ or p_. The precompiler also makes indentations, which I have been too lazy to do. When making indentations the precompiler keeps track of if()then … endif and do i=1, … end do structures. When these structures are mixed up, the precompiler writes an error message which is much more useful than the error message of Gfortran. When Gfortran writes at the end of a subroutine 'expecting endif', my precompiler writes at what lines all open if-then and do structures have started. It makes life much easier. I think that it would be quite easy to put these error messages into Gfortran. Is there anyone in CSC who would be interested in doing this?
Q22: How can Apptainer containers work with GUI programs?
Short talk: SD services
Room 1 - Questions regarding the short talk (sensitive data services, Francesca & Laura)
Room 2 - ePouta (KalleH)
Room 3 - Machine Learning / CUDA (Mats)
Room 4 -
When I try submit the batch job, error arises as follows:
There are a few questions here.
and to config parallel computing mode? I have tried that but it didn’t work.
A: Unfortunately we did not have a MATLAB specialist on call today (12.11), but we will ask them to fill in an answer here. The very best way would be to send these questions also to servicedesk@csc.fi, and then ask to discuss the topic further in the weekly support session if needed :)
Q23: We are going to analyze SLAM-seq data, for which there is unfortunately no option in Chipster. So we'll need a place to store our data set and analyze it. It appears to be possible in ePouta. I was wondering if I could discuss it at the coffee break today at 14:00.
Q24: What's the current status of LUMI-G, and can you provide any estimate (even if rough) when pilots might be able to start?
Q25: Training with the ROCm port of Megatron-DeepSpeed appears to be numerically unstable in a way that does not impact the original CUDA version. Where to start debugging this?
Q26: Last week some Scipion/Relion jobs at Mahti were being killed by CSC staff due to cluster overheating. Could we look into the reasons with someone from the Mahti end here, as I haven't heard back from CSC staff about the issue so far?
Q27: Is there GDPR training for researchers available in Finland? In the UK, the Office of National Statistics runs a "safe researcher" training course, which often is compulsory before accessing data.
Q28: I need to combine meteorological information with coronavirus data from Oulu University. Some of the information needed is aggregated data (e.g. https://www.worldometers.info/coronavirus/); software needed: e.g. QGIS
Who is the data controller? Agreements between CSC and the Data Controller should be in place.
What software do you need to analyse the data? (https://docs.csc.fi/data/sensitive-data/sd_desktop/#pre-installed-software)
We can then have a meeting to discuss the details.
No short talk (autumn holidays)
Room 1 - Q29 and Q30 with Rasmus - chemistry
Room 2 - Q31 - Kimmo - bash scripts on Puhti
Room 3 - Q32 - Francesca and Laura - Sensitive Data Services
Room 4 -
Q29: Since new update on Puhti, Schrödinger maestro is not working on interactive session through desktop application
A: Check that your `schrodinger.hosts` file does not contain any references to old login nodes, e.g. `puhti-login1`. If it does, please rename these as `puhti-login11`. You can either do this by hand, or delete the old hosts file and create a new one by running the `/appl/soft/chem/schrodinger/set-hosts-file.bash` script.
Q30: Since the new update on Puhti, MMPBSA.py of AmberTools requires additional setup (e.g. Python 2)
Q31: How could one create a bash script for Puhti to run X number of different scripts that have resources, time etc. allocated?
Q32: Interested in Sensitive Data
Q33: How many years can you keep data in IDA? How about Etsin?
Feedback/Note: Webinar and other event registration, sometimes difficult to find link to join, suggestion to send links via email before the event
Newsletter about webinars etc: Training pages: https://www.csc.fi/en/training#training-calendar , sign-up to training newsletter: https://www.csc.fi/newsletter
Number of participants in courses from Uni Helsinki?
ML guide link: https://docs.csc.fi/support/tutorials/ml-guide/
Feedback/Note on Puhti OS update: some panic about the ssh key update that came with the Puhti OS update; the email got overlooked, as usually there is mostly general info in them; after re-reading the email everything was fine again :)
short summary on LUMI talk discussed: https://www.lumi-supercomputer.eu ; early access platform available for everyone for testing
Feedback: Docs have everything people need, but it is not organized in a way that makes it easy to find the information; also it can be hard to decide if a certain page is something one needs to be reading or not
Relevant pages are often sent from servicedesk
No short talk (autumn holidays)
Room 1 -
Room 2 - Schrödinger Maestro
Room 3 -
Room 4 -
Note: Puhti currently not available: "Lustre metadata server is overloaded by requests."
Short Discussion about when to choose Puhti, Mahti or LUMI.
Q34: Regarding my question in the previous session, "Since the new update on Puhti, Schrödinger maestro is not working on interactive sessions through desktop applications": here "not working" means that when I book an interactive session on https://www.puhti.csc.fi/ and use a desktop app to have a maestro GUI session for analysing the results of my simulations, I get the following error message.
This happened after updating my hosts file, so I think the issue is not due to the hosts file but something else. Thanks
`login11` or `login12` as the host, but currently you are launching from `r18c02`. These are mostly guesses, however, as I cannot try to reproduce this with Puhti being down due to Lustre issues. I'll get back to this, and meanwhile I would suggest that you send an email to servicedesk@csc.fi so that we can better document this.
Q35: How to start using eDuuni-workspaces? Also interested in API support.
Q36: Are there login problems on Puhti? I can't seem to log in via SSH or the web interface.
Q37: LUMI, AMD vs NVIDIA? Will users run into problems when they are used to NVIDIA?
Short talk: CSC Cloud services, Pouta and Rahti (intro)
Room 1 - Pouta & Rahti
Room 2 - SLURM & Nextflow
Room 3 - RStudio & Puhti web interface
Room 4 - Machine learning & HPC, getting started
Q38: SLURM and Nextflow, how to submit to squeue?
Q39: Suppose you have a Docker container that you have developed and tested locally on your own machine. What is the procedure to move this container to Rahti and run it? Do you need to pay any mind to concepts such as kubernetes, pods, orchestration…?
Q40: How do you know what Rahti template to choose?
Q41: Puhti R Studio interface
Q42: AI on supercomputer
Internal discussion: Getting RStudio and R MD to work together (container + browser bridge)
Note / action point: Advertise the ML guide in training newsletter
Short talk: Workflows and hyperqueue
Maria Lehtivaara - Host, Bio stuff :)
Maria Dimitrova - Co-host
Rasmus Kronberg - Short talk about workflows & hyperqueue
Mats Sjöberg - Machine learning
Anni Järvenpää - Language bank
Room 1 - Tykky & LUMI, Rasmus
Room 2 - VSCode & ssh connection to Puhti, AnniJ, MariaD, Laxman & Mats
Room 3 -
Q38: Where are we to find the other questions (<Q38)? Thanks!
A: In the archive! Link is in the box at the top of this document, and here: https://hackmd.io/@CSC-research-support/weekly_session_archive
Q39: How can I download NDVI data from NASA MODIS database? https://lpdaac.usgs.gov/products/mod13q1v006/ I have gps coordinates of my study sites and need to extract NDVI values from a given time period for these for further analyses. I have trouble getting the data downloaded in raster format that I could use. Or perhaps there is some R code ready that can be used for this, as I will be continuing analyses in R (package LEA)?
A: There seem to be some instructions for R here: https://rspatial.org/terra/modis/2-download.html and here: https://flograttarola.com/post/modis-downloads/ (no experience with them though). In the latter there is some possibility to define the output format, which might help.
Q40: Do you have container wrapper tykky on Lumi? & which workflow tools do you have here?
A: Yes, https://docs.lumi-supercomputer.eu/software/installing/container-wrapper/. Although it is not named Tykky, it is basically the same tool. Regarding workflow tools, HyperQueue is projected to be on LUMI at some point. Latest versions of the program also support dependencies, although only through the Python API.
Q41: I used Visual Studio Code to connect to puhti.csc.fi, but since last week I have not been able to connect anymore. I haven't been able to solve the problem myself. (x2)
A: After the RHEL8 operating system update, the public key of the server has changed, making the ssh client suspect that you are not connecting to the intended machine but to a possibly malicious agent instead. To remove the outdated key from your computer, remove the "puhti.csc.fi" lines from the `.ssh/known_hosts` file. On Unix machines the `.ssh` directory is usually found in your home directory; on Windows machines it is typically in `C:\Users\<username>\.ssh` unless a program has saved it to a specific location.
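The removal can also be scripted. The sketch below is a minimal example, assuming the host name appears in plain text (not hashed) in `known_hosts`; the helper name `drop_host_entries` and the sample keys are invented for illustration. OpenSSH's own `ssh-keygen -R puhti.csc.fi` does the same job and also handles hashed entries.

```python
def drop_host_entries(known_hosts_text: str, host: str) -> str:
    """Return the known_hosts content without the lines that name `host`."""
    kept = []
    for line in known_hosts_text.splitlines(keepends=True):
        fields = line.split()
        # Skip an optional marker such as "@cert-authority" before the host list.
        if fields and fields[0].startswith("@"):
            fields = fields[1:]
        # The first field is a comma-separated list of host names/addresses.
        hosts = fields[0].split(",") if fields else []
        if host not in hosts:
            kept.append(line)
    return "".join(kept)


# Made-up sample content; real files contain full public keys.
sample = (
    "puhti.csc.fi ssh-ed25519 AAAAexampleOLDkey\n"
    "mahti.csc.fi ssh-ed25519 AAAAexampleOTHERkey\n"
)
print(drop_host_entries(sample, "puhti.csc.fi"), end="")
```

To apply this to a real file, read the text of `known_hosts`, keep a backup copy, and write the filtered text back.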
Q42: One RT ticket discussed in main room :) Hyperqueue might be the solution there.
Q43: It would be helpful for the newbies like me if there was an example of how to write a batch job to execute a Jupyter notebook (using Tensorflow for high intensive deep learning). Thanks in advance!
A: Good idea! Unfortunately I cannot join the call today, but you could take a look at the papermill tool. I can try to create some examples on how to use it on Puhti when I have some more free time. If you send a request to servicedesk@csc.fi I can reply to you directly there once I have some examples ready.
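As a starting point, a batch job built around papermill could look roughly like the script generated below. This is a sketch, not a tested recipe: the project name `project_2001234`, the notebook file names, the resource requests and the assumption that the loaded `tensorflow` module provides `papermill` are all placeholders to adapt. The only papermill feature used is its command-line form `papermill input.ipynb output.ipynb`.

```python
import shlex


def notebook_batch_script(account: str, notebook: str, output: str,
                          hours: int = 1) -> str:
    """Build a Slurm batch script that runs a notebook non-interactively."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --account={account}",      # placeholder project name
        "#SBATCH --partition=gpu",           # assumed GPU partition
        "#SBATCH --gres=gpu:v100:1",         # assumed: one V100 GPU
        "#SBATCH --cpus-per-task=10",
        "#SBATCH --mem=64G",
        f"#SBATCH --time={hours:02d}:00:00",
        "",
        "module load tensorflow",            # assumed to include papermill
        f"papermill {shlex.quote(notebook)} {shlex.quote(output)}",
    ]
    return "\n".join(lines) + "\n"


print(notebook_batch_script("project_2001234", "train.ipynb", "train_out.ipynb"))
```

The printed text can be saved to a file and submitted with `sbatch`; the executed notebook, including cell outputs, ends up in the output file.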
NOTE: 16.11. notes in HedgeDoc!!!
Source | Example Content |
---|---|
CSC training calendar | Upcoming courses, Link to Elements of Supercomputing |
eLena (Moodle environment) | Research data management, CSC computing environment |
Tutorials at docs.csc.fi | Getting started with Puhti, Machine learning guide |
Training material collection in docs.csc.fi (this page) | Links to thematically collected material |
CSC Notebooks | Introduction to Python, Introduction to R, 2h intro to Quantum computing |
VideoCSC | Introductions to topics, Recordings of weekly user zoom talks |
CSC YouTube-channel | See also Chipster channel, Elmer channel |
CSC Training GitHub | Repositories to course materials ( |
LUMI Training events | Workshop on quantum computing, LUMI-C environment |
CodeRefinery Lessons | Version control, open and collaborative coding |
Aalto Python for Scientific Computing | pandas, plotting, binder, parallel programming |
Source | URL | Example Content |
---|---|---|
CSC training calendar | Link to csc.fi | Upcoming courses, Link to Elements of Supercomputing |
eLena (Moodle environment) | eLena | Research data management, CSC computing environment |
Tutorials at docs.csc.fi | Docs: Tutorials | Getting started with Puhti, Machine learning guide |
Training material collection in docs.csc.fi | Docs: Training | Links to thematically collected material |
CSC Notebooks | CSC Notebooks | Introduction to Python, Introduction to R, 2h intro to Quantum computing |
Video CSC | VideoCSC | Introductions to topics, Recordings of weekly user zoom talks |
Youtube | CSC YouTube-channel | Chipster channel, Elmer channel |
CSC Training GitHub | CSC Training repos | Repositories to course materials ( |
LUMI events and training | LUMI Training events | Workshop on quantum computing, LUMI-C environment |
CodeRefinery lessons | CodeRefinery Lessons | Version control, open and collaborative coding |
Aalto Python for Scientific Computing | Material and registration | pandas, plotting, binder, parallel programming |
Maria Lehtivaara - Bio stuff :)
Matias Jääskeläinen - Training Coordinator
Maria Dimitrova - quantum chemistry, HPC environment
Devaraju Narayanappa (CSC)- Climate stuff
Alvaro Gonzalez (CSC) - Cloud support
Room 1 - Python script in Puhti
Room 2 - Big dataset in Allas
Room 3 -
Room 4 -
Q&As from previous sessions in the archive!
Q43: It would be helpful for the newbies like me if there was an example of how to write a batch job to execute a Jupyter notebook (using Tensorflow for high intensive deep learning). Thanks in advance!
Q44: May be you can send a summary e-mail to CSC users with these links you just explained/ provided?
A: Super good idea! We will take a look into this. Tasks:
Q45: For my large model, even longrun on Puhti with its 14-day limit is not enough for the computation. I'm trying to use Abaqus at CSC with GPUs, but I have only used CPUs before. How should I set the configuration for GPU assignment? Could you please help me look at the batch file to change it from CPU to GPU?
Maria Lehtivaara - Bio stuff :)
Maria Dimitrova - quantum chemistry, HPC environment
Juha Fagerholm - resource applications
Mats Sjöberg - machine learning
Alvaro Gonzalez - Cloud - Rahti and Pouta
Room 1 - R / RStudio
Room 2 - Nextflow
Room 3 - Moving data in Puhti / exporting to a database
Room 4 -
Q46: I’m trying to use both Wandb and TensorBoard with my Python code. Could you give an example of how to add both in Python code? I'm confused about the syntax. Thanks! I tried to look at https://docs.csc.fi/support/tutorials/gpu-ml/#profilers, where it says "See also how to launch TensorBoard using the Puhti web interface". However, when you click the link it doesn't take you to an explanation of this, but just to a listing of the apps (in that sense it's a bit misleading!).
A: The person asking the question was not in the call. Please send it as a servicedesk ticket instead. See: https://docs.csc.fi/support/contact/
Q47: I would like to discuss the possibility of executing Nextflow based pipelines from https://nf-co.re/ in puhti. More details here https://nf-co.re/docs/usage/installation
(Laxman): Yes. It is possible to run nf-core pipelines in CSC environment. You can check (tutorial 4 as a toy example) on how to run a nextflow pipeline here: https://yetulaxman.github.io/Biocontainer/tutorials/nextflow_tutorial.html
If you need to run heavy computing, we might want to switch to a different executor (local, slurm or HyperQueue).
Q48: Maybe irrelevant, but what is the difference between HackMD and HedgeDoc? This forum used to be in HackMD. Just curious :)
A: Yes, we used HackMD previously, but it was getting a bit buggy, so now we are trying HedgeDoc. They work very similarly, but HedgeDoc is installed on our own Rahti service.
Q49: Hi, my question is about OpenCL driver for a virtual desktop on Puhti, and how we can install it so we could be able to interface with the processing units via OpenCL in python for parallel computing.
A: There was a RT ticket sent by the customer, Henrik & Sebastian are replying to that.
Q50: Hello, is R only available in Puhti?
A: Heli discussed this in the short talk and in breakout room 1.
Note: Add link to Zoom to HedgeDoc
Note: Tutorial for getting started when you have a functional environment on your own computer, and you need to move to Puhti? (You're not root, there's SLURM, which packages you had installed (<- RDM stuff)! etc…) Lowering the bar
Slideset part 1 coming soon
Slideset part 2
Maria Lehtivaara - Bio stuff :)
Maria Dimitrova - quantum chemistry
Samantha Wittke - Geoinformatics
Martin Matthiesen - Language bank
Sam Hardwick - Language bank
Mats Sjöberg - Machine learning
Laxmana Yetkuri - Bioinformatics
Jemal Tahir - cloud computing
Heli Juottonen - R
Anni Järvenpää - Language bank
Kimmo Mattila - Allas, sensitive data services
Q51: frankier: How can we access `/appl/data/kielipankki` on Mahti?
A: Having the data on Mahti has not been requested before; ..
Note: Discussion about listing datasets available (staged) in Puhti, and extending services to Mahti and LUMI in future.
Maria Dimitrova - quantum chemistry
Samantha Wittke - Geoinformatics
Ville - myCSC
Juha Fagerholm
Juha Lento
Kimmo Mattila
Alvaro Gonzalez
Heli J - R
Mats Sjöberg - ML
and some more
8 rooms used
Type your question below if you are joining the next call - this is a way for us to try to make sure we have the correct specialist(s) on the call. Add your (nick)name so we can then more easily connect you with the specialist. This is NOT to replace our servicedesk (servicedesk@csc.fi)! You can also just join the call and ask your question there.
Q52: In Puhti, I am trying to install an R package from a github repository using the "remotes" library, but I receive errors. The package requires a C++ compiler setup that is appropriate for the version of R installed. I am not sure if the error I receive is due to this, or due to some other issue. So, my questions are: 1) How can I install an R package from github to my libpath (/projappl/…)? 2) How can I set up the C++ compiler? [Hande Topa/NC]
-> Room 2 in Zoom
Q53: I noticed there's a mismatch between spatial data available in Puhti and documentation regarding Sentinel-1 mosaics. In documentation it reads these mosaics are available since October 2019, while that's not the case as the earliest mosaics in Puhti are from the start of 2018. However, there's a gap in the mosaic data between the end of 2018 and October 2019. Why is this and is there something to be done about it in near future? [Arttu Kivimäki/FGI]
Q54: In Puhti, I'm having problems using Stan in r-env via the "cmdstanr" backend. I'm following the CSC r-env guides about using Stan. According to the guide the installation should be found at "cmdstanr::set_cmdstan_path("/appl/soft/math/r-env/421-stan/cmdstan-2.30.1")", but it doesn't seem to be there, as I get an error message: "Path not set. Can't find directory: /appl/soft/math/r-env/421-stan/cmdstan-2.30.1". I'm using R through an interactive RStudio session. [Henri Wallen/LaY]
-> Room 2 in Zoom
Q55: Narges - Room 4 in Zoom
Bioinformatics, RNA velocity, Python based tools
I couldn't find any of the Python tools for RNA velocity. I was wondering if they can be made available at CSC and, if not, what the best way is to install several packages.
Q56: Is there any progress with sensitive data services regarding saving work independently of virtual machine (i.e., not losing it when crashing)? This refers to the secondary data use as in Findata. Tom R
-> Room 5 in Zoom
Q57: Is the ntasks setting consistent for Spark setting? And what will happen if I call CUDA program from cpu nodes? [name=Haining]
-> Room 6 in Zoom
Q58: Pytorch distributed computing issues. How to debug, how to run? General advice on distributed computing [name=Maxim]
-> Room 7 in Zoom
Q59: Rahti seems to have an old version of Kubernetes, v1.11. Is there any roadmap to upgrade Rahti?
A: Rahti currently runs Kubernetes `v1.11.0+d4cacc0`, and the internal BETA version of Rahti has Kubernetes `v1.23.5`. Among other improvements that will come with the new Rahti, we will get new versions of Kubernetes deployed in a more "agile" way. There is no date yet for the new Rahti.
-> Room 8
Spyder availability on Puhti
-> geoconda and python-data modules include Spyder
https://docs.csc.fi/apps/python/#spyder
I have a simple tool to make large arrays of small jobs as a single MPI
task pexes.c (heavily reworked source from Juha Lento, attached). It gets a
text from stdin to the MPI member 0, and uses other tasks to execute each line
of the text via system (3). The program can be
compiled with enabled debug, so it shows actual status of 'system'
calls.
My use case normally is some long bash script with several loops
producing lists of commands that are fed to the srun … pexes command
via pipe after done. Each list needs different memory/cpu/no_tasks,
so separate job is needed for them, and there might be some lightweight
commands in between the loop. Some lists might be empty because needed results
have been done already.
For some reason, when called over more than one node on puhti the
system calls return zero status immediately without doing anything.
I have made a simple test case to illustrate the issue:
creates only first 19 files, while
works perfectly well. The above scripts used to work at puhti before the
recent upgrade, and still works perfectly well (with different project
and partition) at mahti, fmi cray systems and several other
environments.
Juha Lento from CSC suggested that it is likely a bug in slurm. Can that
be fixed?
Juha also suggested to use sbatch-hq, but I would need something that
reads stdin rather than a file, and can specify per-task memory.
In any case, it would be great to figure out the reason for silent
failure of 'system', since pexes is not the only MPI program we use,
where the call is used.
For now the suggestions have been to change the workflow and not to use system
from MPI processes. I wonder if the issue can be fixed?
- More details will be given via the existing service desk ticket.
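When narrowing down where an exit status gets lost, it can help to reproduce the "run each input line through the shell and report its status" pattern outside MPI. The sketch below is not pexes and does not use MPI: it swaps in Python's standard `subprocess` module and a thread pool purely for illustration, and the names `run_line` and `run_lines` are hypothetical. It assumes a POSIX shell.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_line(command: str) -> int:
    """Run one command line through the shell and return its exit status."""
    return subprocess.run(command, shell=True).returncode


def run_lines(commands, workers: int = 4):
    """Execute every non-empty line; return (command, status) pairs in input order."""
    commands = [c.strip() for c in commands if c.strip()]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        statuses = list(pool.map(run_line, commands))
    return list(zip(commands, statuses))


# Small demonstration with two standard shell commands.
for command, status in run_lines(["true", "false"]):
    print(f"{status}\t{command}")
```

Passing `sys.stdin` instead of the list reads the commands from a pipe, one per line. A status of zero for a command that visibly did nothing would then point at the launch environment rather than at the command itself.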
Q62: I have a question regarding running a Jupyter notebook installed via Tykky.
Thanks in advance!
Fatemeh
Q63: Rahti openshift kubernetes cluster (Joaquin Rives)
I have a question regarding the Rahti service. We would like to deploy Kubeflow Pipelines, however
the installation requires the creation of some cluster-scoped resources and configuration.
In Rahti you only have access to your project namespace, so the deployment fails.
Is there any workaround for this problem?
E.g. could someone with admin privileges do the deployment for us?
Is there any other CSC service we could use?
Q64: Lumi eap question. I can't run pytorch multinode training without setting environment variable NCCL_SOCKET_IFNAME="lo". If this is set, scripts run nicely on one node, but not on multiple nodes. Any idea how to fix this?
Q65 LUMI EAP question. What comparative performance should one expect for NVIDIA-tuned PyTorch models on LUMI's MI250X?
Q66 When will LUMI's ROCm version be updated? Currently LUMI is running ROCm 5.0 while 5.4 is already out. PyTorch's official pip packages are compiled against ROCm 5.2, and they do not work as they need a newer ROCm to be installed on the machine.
Q67 LUMI pytorch packages
Note that if you are typing a question in here, we expect you to join the Zoom session
Room 1 - Matti & Ari-Matti, Q71
Room 2 - Sakshi & Samantha, Puhti issue with running jobs
Room 3 -
Room 4 -
Q74 How to use Comsol on CSC? (Max)
Q75 Alphafold, own copy of db for each? (JVL)
GPU node on Puhti vs in ePouta; do they differ in practice for alphafold (if one can use either)?
Q76 Jupyter notebooks for courses, best practices for using Jupyter notebooks for bioinformatic courses?
Q77 Hi, can we ask help for puhti-related problems during the session? (rosannahu)
Q78 When will new OpenStack volume quota be available in ePouta? (Juhana K)
Q79 Applying billing units: In the application there is question about "program codes". What does this exactly refer to?
Q81
We would like to use an internal Python library in a conda environment on Puhti, so we are wondering what the best way to do so is. Cloning the github repository directly to the SCRATCH folder does not seem to work. The error raised when trying to clone it using SSH is: ssh: connect to host gitlab.iqm.fi port 22: Connection timed out
Q82 ePouta: are there io.haswell.32core flavors currently available? (Juhana K)
Q83 I am trying to run several analyses in the R package LEA for my RAD SNP data (exported from STACKS and modified to .geno format with vcftogeno -r-code). However, I need to impute missing SNP values with custom R code, which does not seem to function as it should. My formatting of the SNP data also loses the names of the SNPs, and I cannot quite figure out how I should format the SNP data to keep the ID information and get the code running. So I would need help from someone familiar with R functions and preferably also with the LEA R package. (Finnish also ok!)
Q84 I noticed that the python-data modules on Puhti set the environment variable OMP_NUM_THREADS=1. The obvious workaround to this (when using applications that actually use OpenMP and like threads) is to set the variable myself after loading the modules, but I'd still like to understand why these modules behave like that.
The OMP_NUM_THREADS=1 setting should be better documented (at least mentioned in docs.csc.fi). We (CSC) could also consider some logic so that if the variable is already set, the module doesn't change it. We'll consider that kind of setting.
Q85 Since Tuesday's maintenance break, I lost some 80% of performance on 200 nodes w/ Vlasiator, OpenMPI & GCC. ~No difference between running a single-node and a two-node test.
Add export SRUN_CPUS_PER_TASK="${SLURM_CPUS_PER_TASK:-}" in your batch script before running srun. The Slurm version was updated and this affects how certain settings are inherited from the #SBATCH directives (notably, the number of threads per task). We are currently investigating if we could make this environment variable export global so that the old behavior would be restored.
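A minimal sketch of where the export mentioned above could sit in a batch script. The resource values and the program name are placeholders chosen for illustration, not taken from the original question.

```shell
#!/bin/bash
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=20

# Pass the --cpus-per-task value from the #SBATCH directives on to srun.
# The ":-" expansion leaves the variable empty (rather than failing under
# "set -u") if the job did not request --cpus-per-task at all.
export SRUN_CPUS_PER_TASK="${SLURM_CPUS_PER_TASK:-}"
echo "cpus-per-task passed to srun: '${SRUN_CPUS_PER_TASK}'"

# Hypothetical executable; uncomment and replace inside a real batch job.
# srun ./my_program
```

The export has to come before the srun line, since srun reads the variable when it launches the job step.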
Q86 I am new to CSC, switching from UH's cluster. I have two questions: a) I need xvfb installed, but it is not found among the currently available modules. How can I get it? b) I have problems setting up the ssh keys. Although I think I have done everything correctly, I would need someone to go through the process with me to spot the error.
Q87 Are there any advances in using SD Desktop with single registry data?
Q88 I am a member of a project which is shared between my supervisor (project manager), me, and my Master's student. My student and I are working on the same things and using the same conda environments that I originally created. We are having some problems with read/write/execute permissions. For example, he can install new packages in my environments, but then I am unable to use them, or the packages that were overwritten by him. However, I am able to uninstall those packages by deleting them manually. What should we do so that we can both use the same environment? Should we both run something like 'chmod -R g+rwX /path/to/env/dir'? The environments are created in accordance with these instructions: https://docs.csc.fi/support/tutorials/conda/
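To illustrate what the chmod command in the question does, here is a small sketch that runs it on a throwaway directory. The file names are made up for the demonstration. Note that chmod only succeeds on files the caller owns, which is why each user would have to run it on the files they created.

```shell
#!/bin/bash
# Build a throwaway directory that mimics an environment with owner-only access.
env_dir=$(mktemp -d)
mkdir -p "$env_dir/bin"
touch "$env_dir/bin/tool" "$env_dir/notes.txt"
chmod 700 "$env_dir/bin" "$env_dir/bin/tool"
chmod 600 "$env_dir/notes.txt"

# g+rw : give the group read and write permission on everything.
# X    : add execute/search permission only to directories and to files
#        that are already executable for someone.
chmod -R g+rwX "$env_dir"

# Show the resulting permissions.
ls -lR "$env_dir"
```

The executable file and the directories gain group execute permission, while the plain data file only gains group read and write.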
See also more information on batch job scripts in CSC Documentation
To be continued 8.3: Mini crash course in HPC - Batch job scripts: Memory, disks and GPUs
Q89 I might have misunderstood, but did you say that srun is used for parallel processing? Isn't it also used for any other program (e.g. serial)?
sbatch jobscript.sh
sbatch -N 1 -n 1 -c 40 --time 4:00:00 --mem 4G jobscript.sh
man sbatch lists environment variables that the job script can use, for example SLURM_CPUS_PER_TASK.
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
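As a sketch, the command-line options in the sbatch example above can also be written as #SBATCH directives inside jobscript.sh, together with the OMP_NUM_THREADS export. The program name is a placeholder, and the fallback value of 1 is an assumption added here so that the script also behaves sensibly outside a Slurm job.

```shell
#!/bin/bash
#SBATCH --nodes=1            # same as -N 1
#SBATCH --ntasks=1           # same as -n 1
#SBATCH --cpus-per-task=40   # same as -c 40
#SBATCH --time=4:00:00
#SBATCH --mem=4G

# Use as many OpenMP threads as cores were reserved for the task.
# Falls back to 1 thread if SLURM_CPUS_PER_TASK is not set.
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
echo "Running with ${OMP_NUM_THREADS} OpenMP threads"

# Hypothetical executable; uncomment and replace inside a real batch job.
# srun ./my_openmp_program
```

With the directives in the script, the job can be submitted with a plain sbatch jobscript.sh.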
Q90 Problem with own script not running on Puhti, Anna -> room 1, Juha
htop command. This is fairly easy: check with squeue on which compute node the job is running and, while the job is running, ssh to the node with something like ssh r07c01. On the compute node, run module load htop and htop to see how many threads/processes the program has launched and which CPU cores are working on the threads. We saw that for some input files the program was using quite a lot of memory (probably the cause of the original problem), and also that most of the time only a few CPU cores had anything to do (short bursts of parallel processing). Now, the tricky part is how to write a better program that uses memory and CPU cores more efficiently, but that is another story. The first step there is to profile the existing program, to see which parts are slow and which parts use a lot of memory.
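The first step of that procedure, finding the compute node, could be wrapped in a small helper like the one below. The function name is made up for this sketch.

```shell
#!/bin/bash
# Print "jobid nodelist" for each of your running jobs.
my_job_nodes() {
    # -h: no header line, -t RUNNING: only running jobs,
    # -o "%i %N": print job ID and allocated node(s).
    squeue -u "${USER:-$(id -un)}" -h -t RUNNING -o "%i %N"
}
```

After calling my_job_nodes, you would ssh to the node it prints (for example ssh r07c01) and run module load htop followed by htop there, as described above.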
--dependency for such "split workflow".)
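A minimal sketch of how --dependency could chain two job scripts, so that the second starts only after the first has finished successfully. The function name and the script names in the usage note are hypothetical.

```shell
#!/bin/bash
# Submit two job scripts; the second one waits for the first to succeed.
submit_chain() {
    local first_script="$1" second_script="$2"
    local first_id

    # --parsable makes sbatch print only the job ID (possibly "jobid;cluster").
    first_id=$(sbatch --parsable "$first_script") || return 1
    first_id="${first_id%%;*}"   # drop the optional ";cluster" suffix

    # afterok: start only if the first job ended with exit code 0.
    sbatch --dependency="afterok:${first_id}" "$second_script"
}
```

It would be called as submit_chain preprocess.sh analyse.sh, which lets each step request only the memory and cores it actually needs.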
Q91 What to do if memory runs out, Pekka -> room 2, Ari-Matti
-Xmx) See software documentation for details.
Q92 Can CSC sensitive data storage be used as a backup environment for university servers? If that's the case, does CSC itself offer a backup tool to its cloud storage, or can https://restic.net/ that helps to backup to Amazon Cloud, for example, be used alongside SD Desktop? Thank you in advance – Mastaneh
Q93 General question: what would be the best way to convert two Docker containers (the second inherits from the first) for use within Puhti? I know about Tykky, but I am not sure if this is the best way to go about it. Since the first container is based on Python 3.9.12 and I add a personal package from a repository, I thought maybe I could instead directly build both containers already as Apptainers. Initials: Emiliano Godinez. Will join at 13:30
apptainer build command), or create it as an Apptainer container from scratch: https://docs.csc.fi/computing/containers/creating/
From:
Q94 Restart write performance (~3.3TB, romio hints, striping as before) seems to have gone down
Q95 Mahti BU usage not being applied to project total, only to per-user totals at my.csc.fi?
Q96 JCD UH: Is there still a problem with the docker image not working with the external drive?
Q97 JCD UH: I really love the new software library tool. Can we ask for new tools to be added after they have been tested locally?
Q98 JCD UH: Do you know yet when the move from CentOS7 to Ubuntu might happen?
Q99 I have questions on the purging of files planned for March
Q100 Do we have a list of which breakout rooms cover what topics?
Q101 Is there going to be another course/seminar on how to create singularity containers? We are starting a new project and have to bring in quite a bit of software.
Q102 What is the advantage of using same virtual machines within project members?
Comment1 Great news that the external drive can now be used across several VMs!
Short talk: LUMI
Q105 I would like to ask if it is possible to fetch specific files from my bucket in Allas. I recently moved my scratch directory there and I need to access some files, but I don't want to transfer the whole directory back again because of the limited space for other users.
Q106 Hi, I need to create a PostgreSQL database and populate it with an existing SQL dump. I already created the empty database (??) on Rahti but I'm clueless about how to load the SQL dump into it. Can I get some help with it please? Also, on Rahti I see a Deployments section with an error under Postgresql but not under websocat, and something about failing to allocate a pod.
Isn't it already installed on Rahti together with PostgreSQL? Do I have to install it on Puhti as well? It seems to me it comes together with PostgreSQL, which I would rather not install myself.
-
I just need to run 'psql pharmacodb < pharmacodb-1.1.1.sql' to load the dump into an empty database but I don't know how/where to run this command.
I'm really confused about oc cp! It says "Requires that the 'tar' binary is present in your container image. If 'tar' is not present, 'oc cp' will fail".
What container and what tar? All the examples given for oc cp are about copying folders to a pod and nothing about populating a database.
Please read: https://docs.csc.fi/support/faq/database/
I have already installed psycopg2 using pip in my tykky container but I need to populate the database first as I mentioned.
- A: Yesterday and today there has been an issue in Rahti with storage. It might be the source of the problem.
Okay, I just emailed servicedesk with my question! Thank you.
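For the Q106 question above on where to run the psql command, one possible route is to stream the local dump file into psql running inside the PostgreSQL pod with oc exec, which avoids oc cp altogether. This is only a sketch: the function name is made up, and it assumes that psql inside the pod can connect to the database without extra user or password options.

```shell
#!/bin/bash
# Stream a local SQL dump into psql running inside a pod.
load_dump() {
    local pod="$1" database="$2" dump_file="$3"
    # -i keeps stdin open, so the local file is fed to psql in the pod.
    oc exec -i "$pod" -- psql "$database" < "$dump_file"
}
```

The pod name can be looked up with oc get pods, after which the call would be something like load_dump <pod-name> pharmacodb pharmacodb-1.1.1.sql.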
Q107 Before backing up to Allas, do you archive (tar) smaller files into a larger file, filename.tar (size of 1? 2? 3? GB)?
Q108 How do we get to Allas directory from myCSC login?
Q109 SD Desktop related question