# RDS course: Module 1 plan

### High-level syllabus

Module 1: Introduction to data science

- Sub-module A (Taught):
  - What is data science? How it differs (or does not differ) from other fields, with an overview of the variety of cultures within it and its origins
    - Definition of data science vs. data analysis, ML and AI, and its origins
    - Cultures: theorise and estimate, compute and test
  - What is RDS/RSE, and what is special about it?
    - Roles and skills within research projects
    - Difference between DS and RDS
  - Project management and life cycle: basic stages in a data science project and common hurdles in each stage
    - Project planning
    - Scoping:
      - Translating a research question into a data science task
      - Ambiguity and complexity in scoping, with some real case studies
      - Collaborating with clients, adoption
    - Getting and wrangling data
    - Feature engineering and selection
    - Model training and evaluation
    - Production
    - Monitoring performance and updating
    - Handover
  - Intro to EDI for data science:
    - Forms of bias and oppression in data science and society, matrix of oppression
    - Common pitfalls
    - Forms of privilege
    - How to challenge power and privilege
    - Examples of EDI in data science projects with varying degrees of success
  - How to work collaboratively in data science projects, and reproducibility principles, using material from The Turing Way
- Sub-module B (Hands-on project):
  - Scope a project using a real-world scenario/research question: understand the associations between SES/material circumstances and health using the EQLS dataset (a survey micro-dataset):
    - What is SES/material circumstances?
    - What is health?
    - How do we establish an association? What are the different ways we can think about the problem, and what theories can be used?
    - What does the dataset contain, and how can we use it to answer the question?
    - How do we translate the question into a data science task?
    - What is the purpose of doing it, and how will it be used?
    - How can we challenge the question and dataset? What is missing, controversial or biased, and what are the EDI concerns? Multiple EDI issues are embedded in the problem for attendees to point out and discuss.
  - Set up a GitHub repo and use it to document the conversation and outcome of the activity. Learn how to use GitHub as the basis of collaborative work.