# RDS course: Module 1 plan
### High-level syllabus
Module 1: Introduction to data science
- Sub-module A (Taught):
- What is data science? How it differs (or does not differ) from other fields, with an overview of its origins and the variety of cultures within it
- Definition of data science vs. data analysis, ML and AI, and the origins of the field
- Cultures: Theorise and estimate, compute and test
- What is RDS/RSE and what is special about it?
- Roles and skills within research projects
- Difference between DS and RDS
- Project management and life cycle: Basic stages in a data science project and common hurdles in each stage.
- Project planning
- Scoping:
- Translating a research question into a data science task.
- Ambiguity and complexity in scoping with some real case studies.
- Collaborating with clients, adoption
- Getting and wrangling data
- Feature engineering and feature selection
- Model training and evaluation
- Production
- Monitoring performance and updating
- Handover
- Intro to EDI for data science:
- Forms of bias and oppression in data science and society, including the matrix of oppression
- Common pitfalls
- Forms of privilege
- How to challenge power and privilege
- Examples of EDI in data science projects with varying degrees of success.
- How to work collaboratively in data science projects, and the principles of reproducibility, using material from The Turing Way.
- Sub-module B (Hands-on project):
- Scope a project using a real-world scenario/research question, namely understanding the associations between socioeconomic status (SES)/material circumstances and health using the EQLS dataset (a survey micro-dataset):
- What is SES/material circumstances?
- What is health?
- How do we establish an association? What are the different ways we can think about the problem and what theories can be used?
- What does the dataset contain and how can we use it to answer the question?
- How do we translate the question to a data science task?
- What is the purpose of doing it and how will it be used?
- How can we challenge the question and dataset? What is missing, controversial or biased, and what are the EDI concerns? The problem has multiple EDI issues embedded in it for attendees to identify and discuss.
- Set up a GitHub repo and use it to document the conversation and outcome of the activity. Learn how to use GitHub as the basis of collaborative work.