# 21.03.2023 DDLS education and training
## 2nd meeting with DDLS fellows
*Note: Meeting shortened 13:05 -- 14:15*
## **Welcome**
## **Code of Conduct reminder**
* Be respectful, honest, inclusive, accommodating, appreciative, and open to learning from everyone else.
* Do not attack, demean, disrupt, harass, or threaten others or encourage such behavior.
* Be patient, allow others to speak, and use the Zoom reactions & chat if you would like to voice something.
---------
## **Links & Resources**
**Zoom link:** https://stockholmuniversity.zoom.us/j/66442975088
**Miro board:** https://miro.com/app/board/uXjVPZVPbOU=/?share_link_id=7147612421
**Training collection:** https://docs.google.com/spreadsheets/d/18hSpGPnQvnyCeIeT-Ru0nu764Qi8mZWCqfIYgld33Rw/edit?usp=sharing
**NBIS course catalogue** (https://uppsala.instructure.com/courses/48087/pages/nbis-training-catalogue)
**SciLifeLab training events** (https://www.scilifelab.se/events/#calendar)
*Email education@scilifelab.uu.se if you would like to add a course/event to the SciLifeLab calendar.*
**GU and GU core facility training** (https://www.gu.se/en/core-facilities/bioinformatics-and-data-center-bdc/education-and-training)
**Glittr.org: GitHub repositories with bioinformatics training materials** (https://glittr.org/?per_page=25&sort_by=stargazers&sort_direction=desc)
---
## Rollcall:
🗣 Name/🐸 pronouns/✍Research Area/🏼 Affiliation/ :candy: favorite candy
* Jessica Lindvall / she-her/ Head of Training/ SciLifeLab and NBIS/ Sura kryptoniter
* Luisa Hugerth / she, her / Human microbiome / UU / ...chocolate?
* Jacob Vogel / he, him / Human neurodegenerative disease / LU / Daim
* Laura Carroll / she, her / Bacterial bioinformatics / UmU / Take 5
* Tobias Andermann / he, him / Biodiversity research / UU / meatballs
* Nick Pearce / he/him / Structural Bioinformatics(ish) / LiU / Cheese
* Clemens Wittenbecher / Chalmers / Precision Medicine
* Wen Zhong / she, her / Integrative multi-omics, precision medicine / LiU / chocolate
* Sergiu Netotea / Integrative multiomics, machine learning / Chalmers, NBIS Cell and molecular biology group / Scones?
---
## **AGENDA**
| Duration | Activity |
| -------- | -------- |
| 5 mins | Welcome, housekeeping, round the table introductions|
| 20 mins | **Collaborative training opportunities** - Intermediate R course, Clemens Wittenbecher |
| 10 mins | **Collaborative training opportunities** - Data Science using Advanced Python, Sergiu Netotea (NBIS)|
| 10 mins | Re-cap, **Step 1 - What knowledge and skills are required? [Miro](https://miro.com/app/board/uXjVPZVPbOU=/?share_link_id=7147612421)** |
| 10 mins | **Break** |
| 40 mins | Discussion, Silent documenting and work, **Step 2 - Identify target audience and levels [Miro](https://miro.com/app/board/uXjVPZVPbOU=/?share_link_id=7147612421)** |
| 10 mins | Next steps and wrap-up |
---
# Q&A:
:::success
❓ *Please add any questions you might have during the course of the session here:*
:::
----
# Open Discussions Notes:
:::success
🏼 *Please add any further notes for the session here:*
:::
----
# Collaborative training opportunities
## Clemens Wittenbecher - [intermediate R course](https://r-cubed-advanced.rostools.org/)
## Collaborative training opportunities - add ideas
*(https://docs.google.com/spreadsheets/d/18hSpGPnQvnyCeIeT-Ru0nu764Qi8mZWCqfIYgld33Rw/edit?usp=sharing)*
## Sergiu Netotea (NBIS) - NBIS course (late May) Data Science using advanced Python
Some details about the course:
(older site: https://nbisweden.github.io/workshop-advanced-python/)
- Open to collaborations! Also interested in co-organizing knowledge representation course (annotations, representation learning, integrative models).
- Online, 2023-05-29 to 2023-06-02 (5 days)
- Will be advertised on SciLifeLab & Elixir websites
- Python as a technology. General overview of computing choke points for various architectures, together with a fast-paced tutorial on advanced language concepts.
- Data analysis. Scientific computing, statistics, visualization, graph computing and data mining, via libraries such as NumPy, pandas, SciPy, statsmodels and several other "science stack" libraries.
- Machine learning. Perform machine learning, statistical learning and pattern recognition, via scikit-learn and PyMC3.
- Deep learning. Learn how to fit and control the convergence of various deep neural networks using PyTorch and TensorFlow.
- AI. Basics of computer vision and NLP, Hugging Face workflows for using well-known AI models, and applications in biology.
- Engineering the computing infrastructure and Python's role in it. How to run Python in the cloud and on GPU machines.
- Learning how Python can be used to organize your workflow with efficiency and reproducibility in mind.
- Pick your own task! Discuss the possible applications of this course to your project under our assistance. This is a great time to solidify your knowledge by applying it to your own research scope!
____
## Tobias Andermann - Short data-science courses/tutorials in R and Python (with biodiversity focus) - online format compatible:
- [Tutorials for working with spatial data in R](https://github.com/tandermann/spatial_R_course)
- [Python tutorials for data science](https://github.com/tandermann/python_for_biologists)
- One-day workshop on implementing neural networks in Python (TensorFlow), including lectures and a tutorial (not uploaded online)
##
These could be further developed and expanded with input from different DDLS fellows, making them generally useful for PhD students in any of the DDLS fields. They could evolve into a general programming introduction in Python and R.
---
# Continuing the [Miro](https://miro.com/app/board/uXjVPZVPbOU=/) work - Step 2 (Identify target audience and levels)
## Recap of last meeting
### Step 1 - What knowledge and skills are required?
Identifying KSAs (knowledge, skills, abilities), i.e. the content and topics across foundational categories that make up the core knowledge needed to be a data-driven life scientist.

#### **Foundational topics**
* Data science and Bioinformatics
* Mathematics and (bio)statistics
* Open Science and FAIR
* Research Data Management
* ELSI/A and Data
### Mapping existing courses
*Taken from the [Training collection](https://docs.google.com/spreadsheets/d/18hSpGPnQvnyCeIeT-Ru0nu764Qi8mZWCqfIYgld33Rw/edit?usp=sharing) spreadsheet*
**Data science & Bioinformatics courses:**
* RaukR (NBIS intermediate/advanced R course/summer school): https://nbisweden.github.io/workshop-RaukR-2206/
* Workshop on Data visualization in R (NBIS/ELIXIR-SE course): (https://uppsala.instructure.com/courses/46547)
* Introduction to Python - with application to bioinformatics (NBIS course): (https://uppsala.instructure.com/courses/71521)
* Quick and clean: Data Science in Biology using Advanced Python (NBIS advanced Python course): (https://github.com/NBISweden/workshop-advanced-python)
* Basic Data Handling and Visualization with R (LU PhD course)
* GU and GU Core Facility courses: (https://www.gu.se/en/core-facilities/bioinformatics-and-data-center-bdc/education-and-training)
    * Unix applied to genomic data
    * R Programming
    * Python for biologists
    * Gene Expression Analysis Using R
    * Bioinformatics I and II
**Mathematics and Biostatistics courses:**
* Introduction to Biostatistics and Machine Learning (NBIS course): (https://uppsala.instructure.com/courses/74597)
* Neural Networks and Deep Learning (NBIS course): (https://uppsala.instructure.com/courses/75565/pages/schedule)
**Open Science and FAIR courses:**
* Tools for Reproducible Research (NBIS course): (https://github.com/NBISweden/workshop-reproducible-research/)
* Snakemake bring-your-own-code (BYOC) workshop (NBIS course): (https://uppsala.instructure.com/courses/70024)
**Research Data Management courses:**
* Introduction to Data Management Practices (NBIS course): (https://uppsala.instructure.com/courses/48087/pages/introduction-to-data-management-practices)
    * Open Science and FAIR (https://github.com/NBISweden/module-open-science-dm-practices)
    * Data organisation practices (https://github.com/NBISweden/module-organising-data-dm-practices)
    * Metadata (https://github.com/NBISweden/module-metadata-dm-practices)
    * Data publication (https://github.com/NBISweden/module-data-publication-dm-practices)
    * Cleaning tabular data with OpenRefine (https://github.com/NBISweden/module-openrefine-dm-practices)
    * Introduction to scripted analysis with R (https://github.com/NBISweden/module-r-intro-dm-practices)
    * Versioning of Data and Code using Git (https://github.com/NBISweden/module-versioning-dm-practices)
    * Data Management Plans (https://github.com/NBISweden/module-dmp-dm-practices)
**ELSI/A and Data courses:**
* AI and law (LU MOOC) (https://www.coursera.org/learn/ai-law)
* AI, Business & the Future of Work (LU MOOC) (https://www.coursera.org/learn/ai-business-future-of-work)
* Artificial Intelligence: Ethical & Societal Challenges (LU MOOC) (https://www.coursera.org/learn/ai-ethics)
---
## Step 2 - Identify target audience and levels
Continue the work in [Miro](https://miro.com/app/board/uXjVPZVPbOU=/?share_link_id=7147612421)
:::info
*✏️ Silent documenting + share-outs. Choose from the following prompts. There is no need or pressure to respond to all of them. Share, add +1, and ask questions.*
- Review the target groups and levels:
- Are we missing any level?
- Are any topics missing at the respective level?
:::
### Discussion notes and comments (target audience and levels)
*
-
# Next steps
* Step 3 - What courses and training materials are already available?
---
# Wrap up:
*
*
## Feedback:
- How can these meetings be most effective?
## Action points:
- Join our [Slack](https://scilifelab.slack.com/?redir=%2Farchives%2FC041VD1TL5T) channel: #ddls-education-training
- Continue to fill in the [Miro](https://miro.com/app/board/uXjVPZVPbOU=/?share_link_id=863537926181) board or the [Training Collection](https://docs.google.com/spreadsheets/d/18hSpGPnQvnyCeIeT-Ru0nu764Qi8mZWCqfIYgld33Rw/edit?usp=sharing) Google spreadsheet
- Share the survey [Training need assessment (data-driven life science)](https://forms.gle/s86Ybzqt8er3EupT7) with your local HEI and network
## Next meeting
23rd of May 2023, 13:05 -- 15:00
---
# Thank you for joining and see you next time!