# Module 4 Discussion Board
::::info
**Dates** Mon, 24 November 2021
**Time** 13:00 - 17:00 UK time zone (BST)
**Instructors**: Camila Rangel Smith, Callum Mole
::::
Welcome to Module 4! We will be using this document at various points during the course to coordinate discussion.
It's written in markdown. Some useful tips:
- [name=Callum] write your name at a bullet point using `- [name=Name]`.
- **bold**, _italics_
## :pushpin: Summer school materials and useful links
:floppy_disk: [Application website for the summer school](https://www.eventsforce.net/turingevents/frontend/reg/thome.csp?pageID=41405&eventID=127)
:book: Useful materials to go through in preparation for the summer school:
- [Research Software Engineering with Python (developed by Turing REG team)](https://alan-turing-institute.github.io/rsd-engineeringcourse/html/index.html)
- [Version Control with Git](https://swcarpentry.github.io/git-novice/)
- [Turing Way's Guide for Collaboration](https://the-turing-way.netlify.app/collaboration/collaboration.html)
:bookmark: Course materials (lecture notes and notebooks) are [available here](https://alan-turing-institute.github.io/rds-course/index.html). The materials are hosted in [this GitHub repository](https://github.com/alan-turing-institute/rds-course).
:performing_arts: [Feedback form following the summer school](https://forms.office.com/Pages/ResponsePage.aspx?id=p_SVQ1XklU-Knx-672OE-etitOjG6rhHtlIU40dhvK9UQVRLTVUxTUgxMlpVVUJUWEs4SEdQOTAzRi4u)
## :exclamation: Code of conduct
We are keen to ensure that you both benefit from the training and feel part of the community. If you think you have experienced harassment, discomfort, or anything else caused by the instructors, your peers, the course materials or the training team, please get in touch by email at training@turing.ac.uk. There is no formal process for reporting incidents, and we will assess each case individually. We will also ensure confidentiality where appropriate.
As a general guideline, we try to abide by the Turing Way Code of Conduct principles, which you can read [here](https://the-turing-way.netlify.app/community-handbook/coc/coc-details.html).
We can only improve and ensure you are having a good experience if you let us know what might have gone wrong or just felt exclusionary. We'd be pleased not to receive any complaints, but as you well know, that doesn't always mean everything is perfect - so let's work together to make the Learning at the Turing space as inclusive as possible.
## Introductions
Say hi :wave:. Please add your name and how you use modelling in your research below:
- Jack Doyle, the same Jack Doyle as before. I use topological invariants from persistent homology to predict organic crystal packing and energ
- Hi, I'm Jacob - I'm a Maths PhD student at Manchester and I use stochastic models for the spread of epidemics. 👋 :wave:
- Rachael Pirie - Chemistry PhD student @ Newcastle / Enrichment student. I use modelling to predict the likelihood of molecules acting as new drugs (the medicinal kind!) based on their similarity to existing drugs.
- Ryan Chan - PhD student in statistics at The Turing. Mainly work on statistical theory and methodology in my work... but have done some modelling in past / side project, e.g. of football matches :soccer:
- Frankie Cho - enrichment PhD student @ Exeter in environmental resource economics. I use stochastic optimisation algorithms (i.e. linear programming) to optimise spatial land use decisions under the uncertainties of climate change. Any chance we can learn some Bayesian modelling basics today?
- Brad Scott, Department of English, Queen Mary. Though I did a science degree many many years ago, my most recent conception of 'modelling' is as applied to devising structures to describe texts (usually in XML), to inform subsequent analysis, some of which could be statistical
- Haris Organtzidis - PhD in neuroscience, have been using reinforcement learning and/or psychophysics (process) models within hierarchical Bayesian (statistical) models to fit behavioural data.
- [name=Alden Conner] Research Application Manager at the Turing Institute, background in neurobiology and product management in microscopy
- Tiago Sousa Garcia -- haven't really used modelling before!
- Kathryn Garside, RSE Newcastle University. Not currently using modelling in my work but experience in modelling fluid dynamics and growth of branching structures. Will be working on modelling neuro data in the future
- Peter Strong - PhD student at Warwick using probabilistic graphical models to understand migration pathways.
## Approximate Schedule
This is the first time we've delivered this course, so timings may slip, but we will try to keep to the following:
| Section | Time |
| -------- | -------- |
| Overview | 13:00 - 13:10 |
| 4.1 | 13:10 - 13:50 |
| break | 13:50 - 14:00 |
| 4.2 | 14:00 - 14:50 |
| break | 14:50 - 15:00 |
| 4.3 | 15:00 - 15:50 |
| break | 15:50 - 16:00 |
| 4.4 | 16:00 - 16:50 |
| sign-off | 16:50 - 17:00 |
## Exercises
### Section 4.1
In your research, do you use modelling for prediction, for explanation, or both? Do you think there is a difference? For example, can you have explanation without prediction?
**Notes**:
- In my case both- for instance, explaining the characteristics and structural changes of white/grey matter in the brain that help subtype a multiple sclerosis patient, and also predict the probability of membership to a certain subtype in a patient. Both feed into each other and help understand the mechanism of the disease better.
### Section 4.2
What kind of probability distributions do you encounter in your research?
**Notes**:
- In the past worked with Generalised Pareto (extreme values etc. used to model tails of distributions)
Have you worked with linear and logistic regression before? What other models crop up in your field?
**Notes**:
- i did some censored regression this one time..
- I tend to start with linear/logistic regression for learning tasks for simplicity, then move on to something more complex like SVM/random forest
- Some experience with regression (more theoretical than actually practical!) - but some experience using Weibull/Burr distributions in a survival analysis setting.
- quite a lot of experience with regression. quite common to implement a logistic as a baseline for more complex classification problems
- have used LASSO and ridge regression for variable reduction (an example of regularisation?)
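
On the LASSO/ridge note above: both are regularised regressions, but only the LASSO penalty sets coefficients to exactly zero, which is why it can act as variable reduction. Below is a minimal sketch of that difference, assuming an orthonormal design matrix so that both estimates have a closed form in terms of the least-squares coefficients. The function names, the coefficients and the penalty value are hypothetical and chosen only for illustration.

```python
def ridge_shrink(beta_ols, lam):
    """Ridge estimate under an orthonormal design: every coefficient is
    scaled by 1 / (1 + lam), so none becomes exactly zero."""
    return [b / (1.0 + lam) for b in beta_ols]


def lasso_soft_threshold(beta_ols, lam):
    """LASSO estimate under an orthonormal design: soft-thresholding.
    Coefficients with |b| <= lam are set to exactly zero."""
    out = []
    for b in beta_ols:
        shrunk = abs(b) - lam
        if shrunk <= 0:
            out.append(0.0)
        else:
            out.append(shrunk if b > 0 else -shrunk)
    return out


# Hypothetical least-squares coefficients and penalty strength.
beta_ols = [3.0, -0.4, 0.2, -2.0]
lam = 0.5

ridge = ridge_shrink(beta_ols, lam)
lasso = lasso_soft_threshold(beta_ols, lam)

print("ridge:", ridge)  # all four coefficients shrink but stay non-zero
print("lasso:", lasso)  # the two small coefficients are dropped
```

With real data the design is rarely orthonormal and the estimates are found numerically, but the qualitative behaviour (ridge shrinks, LASSO shrinks and selects) is the same.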
### Section 4.3
How else can we deal with categorical variables?
**Notes**:
-
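
As a starting point for this question, here is a minimal sketch of two common encodings: dummy (one-hot) columns with a reference level, and an ordinal mapping for categories with a natural order. This is not necessarily the approach used in the module; the helper names and the example values are hypothetical.

```python
def dummy_encode(values, drop_first=True):
    """One 0/1 column per level. With drop_first=True the first level
    (alphabetically) becomes the reference and gets no column."""
    levels = sorted(set(values))
    kept = levels[1:] if drop_first else levels
    rows = [[1 if v == level else 0 for level in kept] for v in values]
    return kept, rows


def ordinal_encode(values, order):
    """Map each level to its position in a user-supplied ordering.
    Only sensible when the categories really are ordered."""
    rank = {level: i for i, level in enumerate(order)}
    return [rank[v] for v in values]


# Hypothetical categorical variable.
health = ["good", "poor", "fair", "good"]

columns, rows = dummy_encode(health)
ordinal = ordinal_encode(health, order=["poor", "fair", "good"])

print(columns, rows)
print(ordinal)
```

Dummy coding makes no assumption about spacing between levels but adds one coefficient per level, while ordinal coding uses a single coefficient and assumes equal steps between adjacent levels.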
### Section 4.4
Should we predict individuals' SRH?
- Hypothetical question: a council wants to use this model on individuals under their authority. They want to build a SRH score and claim that this can help them know who needs extra assistance.
- No - I think this kind of thing is more suitable for observing and understanding trends, but individuals are so variable 😔 :worried:
- Also, let's not forget that SRH is a perceptual sense of healthiness as reported by the individual, so a use-case as described in the scenario would only identify how people would report their needs, not their actual need
- Haha, data science is all about predicting human choice and decision making is it...? At least a large chunk of it...
**Notes**:
-
We have been evaluating logistic regression specifically. What have we learned that can be generalised to other models?
**Notes**:
- 