# Delivery Notes

## M3 allocation
- Overview: Callum
- m3.1: CM
- m3.2: CM
- m3.3: CRS
- m3.4: CRS
- m3.5: CM
- hands-on: CRS.

### schedule
- 13:00-13:10: Overview
- 13:10-13:50: 3.1 (flexible)
- 13:50-14:00: break
- 14:00-14:50: 3.2
- 14:50-15:00: break
- 15:00-15:30: 3.3
- 15:30-16:00: 3.4
- 16:00-16:10: break
- 16:10-17:00: 3.5

### M3.1
- Use interactive hackmd, ask students to tell us what is wrong with these figures.
- Ones to focus on:
    - Example 1: Axes
    - Example 2: Misrepresentation
    - Example 3: Axes scales
    - Example 5: Averages + Uncertainty.
    - Example 7: Overplotting
    - Example 10: Too much information. Aesthetics matter.
- Ask them to share problematic figures.

### M3.2
- Pause at `Colour Schemes`. Using an interactive hackmd, discuss:
    - how it changes the message
    - which plot is preferred?
    - how would you improve the plot? (we haven't made the perfect plot, we have just explored the tools available)
- probably stop at further examples

### M3.3
- Group discussion about uncertainties in figures
    - When do you think it is important to include uncertainty in a figure?
    - How do you think it should be included?
    - In your field, which kinds of data are more common, and how are they typically visualised? Is there any preferred kind of data vis that hasn't been described in this section? (at end)
- Encourage the students to use the live coding to explore.
- Also to use the binder environment.

### M3.4
- Group discussion on the topic of emotion, context and objectivity in data visualisation (at end of module).
    - Is there such a thing as an objective dataset or visualisation?

### M3.5
- no discussion, just accepting questions.

## M4 allocation
- Overview: CM
- 4.1: CM
- 4.2: CM
- 4.3: CRS
- 4.4: CRS
- Summary/Hands-on: Both

## Notes on things to mention that might not be written in the book

# M4

### m4.1
before `what is data`

### m4.2
after generalised linear models, before linear to logistic regression.

### m4.3
before simple model 2

### m4.4
- At the beginning.

## Old notes about hands-on sessions

### M3 hands-on
Split into 4 teaching (with discussion + monitoring of chat) 4 hours.
- explore relationship between variables
- focus on iteration.
- mental wellbeing index, age, education, no. children, accommodation, depr index. + ???
- tabulate the possible interactions
    - depr + accom + education.
- correlation table as guidance for selecting variable relationships
- variables included in M4 that we do not visualise: **ISCED**, **MentalWellbeingIndex**. Could focus on these with any of the others. Also give them free range to explore any other variables (say that we may be using these variables in M4 hands-on.)
- missingness
- iterate visualise + exploring solutions
- make a nice figure(s).
- present back to the group as mini-groups (?)

### M4 hands-on
Blocks. Maybe imbalanced in favour of the taught content.
- imputation + rerunning models
- adding interactions? `Y = b0 + b1X1 + b2X2 + b3X1*X2`
- everything comes with an assessment of how much better + why
- write a paragraph concluding their answer to the research question, we will come back and discuss.
- model improvement: comparative analysis with another country? How would you alter the model to better capture another country? An EDI discussion perhaps.
- Suggestions for how to build a better model.
- Simulations from the model? Generate a new dataset. (**need to think about this - potentially not**)
- Imbalance: check out ways.
- How do we combine these models?
    - Hierarchical models
    - Model averaging.
- Think about ordinal regression

### Helpers:
- Module 3 (Nov 23): Callum Mole, Camila Rangel Smith, Lydia France, Oliver Strickson, Nick Barlow, Ed Chalstrey
- Module 4 (Nov 25): Callum Mole, Camila Rangel Smith, Christina Last, Pamela Wochner, Nick Barlow, Ed Chalstrey

### M3 During delivery notes
- First two sections felt slow.
  Less time on these and more time on 3.3 & 3.4
- discussions kind of worked, but some lack of contribution (don't need much contribution though)
- 3.3 tips should be dropdown but not in margin.
- 3.3 boxplot graph, the code: `f, axes = plt.subplots(1,3,figsize =(35,10))`. Since the figure size is so large, the labels come up really small.
- Ridgeline plots have many y labels.
- 'Staked Bars' -> 'Stacked Bars'
- Change numbering in hands-on.
- Remind people that they can do the hands-on in any language.
- Worked well to have the middle bit a 'whip-round' then a break.
- Interesting that m3 is more about missingness than

### m4
- 4.1 & 4.2 felt good.
- some equation tweaks in 4.3 and 4.2
    - in 4.2 the inverse logit should be related to eta not Y.
    - add an initial eta = mu
- not getting any engagement. why?
    - are people put off by the mathematics?
- M4: in the probability figures the x-axis should be labelled p(x), maybe? I still don't particularly understand the figure.
- inference could be a separate section. Maybe a separate taught session.
- likelihood ratio: suddenly we chuck a load of new stuff at them. Too complicated.
- 4.4 felt more tutorial orientated. Would be good to have less in 4.3 + 4.4 (especially 4.4), since it becomes cumbersome to basically do the same thing in a few different ways.

### Final discussion
- Talk about curiosity in research
- 'best practice' isn't learning rules and applying them in scenarios. It's building up the tools for statistical thinking.
- We are really interested in feedback. We genuinely want to build a course that people learn from. The repo is public, you could raise an issue there, or even start a pull request.
- Do not delete your work, we will be in touch about how to best collect it.
- Bayesian?
- ordinal regression? Logistic regression as the best way?
- different versions with different languages?
- reproducible environments so that people spend less time
- optional installation support session beforehand?
- wrap-up.
- the helpers as spokespeople to get round it?
- miro board for communicating results?
    - gets past speaker anxiety.
- Miro board can be for knowledge sharing during.
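
Returning to the 4.2 note that the inverse logit should be related to eta, not Y: a minimal sketch, assuming a single-predictor logistic model, of applying the inverse logit to the linear predictor. The coefficients `b0`, `b1` and the function names are hypothetical, chosen only for illustration; they are not taken from the course material.

```python
import math


def linear_predictor(x, b0, b1):
    """Linear predictor: eta = b0 + b1 * x."""
    return b0 + b1 * x


def inverse_logit(eta):
    """Map the linear predictor eta to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients, for illustration only.
b0, b1 = -1.0, 0.5

for x in (0.0, 2.0, 4.0):
    eta = linear_predictor(x, b0, b1)
    p = inverse_logit(eta)  # p = P(Y = 1 | x), written as a function of eta
    print(f"x={x:.1f}  eta={eta:+.2f}  p={p:.3f}")
```

With an identity link, as in ordinary linear regression, the same structure reads eta = mu, which may be what the 'add an initial eta = mu' note is asking for.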