# SWD1b Introduction to programming in R 2023-10-23

Welcome to the hack pad for the SWD1b course from Research Computing at the University of Leeds!

You can edit this document using [Markdown syntax](https://guides.github.com/features/mastering-markdown/).

## Contents

1. [Links to resources](#Links-to-resources)
2. [Agenda Day](#Agenda-Day)
3. [What's your name and where do you come from?](#Whats-your-name-and-where-do-you-come-from?)
4. [Collaborations and connections](#Collaborations-and-connections)
5. [Downloading the example data](#Getting-the-data)
6. [Questions and Answers](#Questions-and-Answers)
7. [Additional Resources, Tutorials, Cheat Sheets etc.](#Additional-Links-and-Resources)

## Links to resources

- **RStudio Cloud**
    - https://posit.cloud/
- **Contact Research Computing**
    - https://bit.ly/arc-help
    - Please reach out if you have any further questions, would like support using R for your research, or are looking for support running R on the ARC systems
- **Course notes**
    - https://arctraining.github.io/swd1b-r/
- **Scripts from training**
    - https://github.com/ARCTraining/SWD1b-R-intro-notes
- **Version control with git**
    - https://swcarpentry.github.io/git-novice/

## Agenda Day 1

| Time     | Agenda                                     |
| -------- | ------------------------------------------ |
| 1000     | Intro, using RStudio Cloud, What is R?     |
| 1050     | Break                                      |
| 1100     | R basics, Project management in RStudio    |
| 1130     | Finding help, Data Structures              |
| 1200     | Lunch                                      |
| 1300     | Data Structures, Data Types                |
| 1350     | Break                                      |
| 1400     | Subsetting Data                            |
| 1450     | Break                                      |
| 1500     | Data visualisation and graphics            |
| 1550     | Questions                                  |
| 1600     | Close                                      |

## Agenda Day 2

| Time     | Agenda                                          |
| -------- | ----------------------------------------------- |
| 1000     | Vectorisation                                   |
| 1025     | Functions explained                             |
| 1050     | Break                                           |
| 1100     | Writing Data                                    |
| 1120     | Splitting and Combining Data Frames with plyr   |
| 1200     | Lunch                                           |
| 1300     | Data Frame Manipulation with dplyr              |
| 1350     | Break                                           |
| 1400     | Data Frame Manipulation with tidyr              |
| 1450     | Producing Reports with knitr                    |
| 1530     | Writing Good Software                           |
| 1550     | Questions                                       |
| 1600     | Close                                           |

## What's your name and where do you come from?

- Karen Vaughan - School of Food Science PhD student
- [Andy Turner](http://agdturner.github.io), Research Software Engineer (RSE) working in the [Research Computing team](https://arc.leeds.ac.uk/about/team/) of IT Services
- [Maeve Murphy Quinlan](https://murphyqm.github.io/), an RSE with the [Research Computing team](https://arc.leeds.ac.uk/about/team/)
- Victor Efren Guadarrama Vilchis, PhD student at the School of Chemical and Process Engineering. I want to learn about big data management and statistical packages in R.
- Chinwe Uzokwe, PhD student, Food Science and Nutrition, Nutritional Epidemiology Group. I want to learn R to help me with my data analysis and visualization.
- Chris Pask, Chemistry. Just trying to learn!
- Resti, PhD student at the School of Earth and Environment
- Cigdem Bozkir, Postdoc researcher at the School of Food Science and Nutrition. I want to learn R for my research analysis.
- Alexios Dosis, LIMR PGR, keen to learn R to apply machine learning models for my data analysis.
- Prima Romadhona, PhD student at ITS. I need to learn R to build my model related to public transport networks.
- Beth Webb, LICAMM post-doc at the School of Medicine, researching cardiovascular disease. I want to learn R for my research analysis and to analyse clinical cohorts. Also, hi Gaia!
- Gaia Ferrarin, LICAMM PhD student. My research is in BACE1 in cancer. Hi Beth!!
- Rokshana Binta Samad, PhD student, Earth and Environment
- John Wright, Post-doc, Centre for Cultural Value
- Kittinon Charoonsrisawad - PhD student, School of Biomedical Sciences, researching spinal mechanisms in humans during fatigue
- Frederico Ponte, LICAMM
- Leanne Shearsmith, Research Assistant in Leeds Institute of Health Sciences.
- Hannah Truscott, also LIHS PhD student hoping to use R for my analysis!
- Chris Trevelyan, Research Fellow in FBS, researching biomarkers in liver cancer
- Shadia Ahmed, doing a PhD in the School of Medicine
- Rossana Escanilla, PGR, School of Geography
- Lais dos Santos, PhD student at the School of Civil Engineering

## Collaborations and connections

Use this area to connect with other attendees around research topic areas: give some information about what your research is, what sort of things you hope to do with R, and how people should contact you (n.b. this page is publicly accessible to anyone with the link, so please be aware of that before posting email addresses etc.)

## Getting the data

To get the gapminder dataset for today's course, run the following command in your RStudio Cloud console:

```r=
download.file("https://swcarpentry.github.io/r-novice-gapminder/data/gapminder_data.csv", destfile = "data/gapminder_data.csv")
```

## Questions and Answers

Feel free to post any questions you have here, or in the Teams chat - or just unmute and ask!
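A note on the download command in "Getting the data": `download.file()` cannot write to `data/gapminder_data.csv` if the `data` folder does not exist yet in your project. The sketch below is one possible way to guard against that. The helper name `download_if_missing` is made up for this note and is not part of the course scripts; it only uses base R functions.

```r
# Hypothetical helper (not from the course notes): create the target folder
# if needed, then download the file only when it is not already present.
download_if_missing <- function(url, destfile) {
  dest_dir <- dirname(destfile)

  # Create the destination folder (and any parent folders) when it is missing
  if (!dir.exists(dest_dir)) {
    dir.create(dest_dir, recursive = TRUE)
  }

  # Skip the download if a file with this name is already there
  if (!file.exists(destfile)) {
    download.file(url, destfile = destfile)
  }

  # Return the path invisibly so it can be passed straight to read.csv()
  invisible(destfile)
}
```

For this course it could be called as `download_if_missing("https://swcarpentry.github.io/r-novice-gapminder/data/gapminder_data.csv", "data/gapminder_data.csv")`, followed by `read.csv("data/gapminder_data.csv")` to load the data.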
## Frequently Asked Questions

**Increasing RStudio font size**

> Tools > Global Options > Appearance > Editor Font Size

**RStudio Themes**

Change to a Dark Theme in RStudio/RStudio Cloud

> Tools > Global Options > Appearance > Editor Theme

## Additional Links and Resources

This will be populated during the course as we discuss different topics!

- [R Data Visualisation Cheat Sheet (direct PDF download)](https://github.com/rstudio/cheatsheets/raw/main/data-visualization-2.1.pdf)
- [RStudio Cloud Primers](https://posit.cloud/learn/primers) - tutorials on using R, general good practice, and stats/data vis
- [Tricks with ggplot - facet plots](https://benwhalley.github.io/just-enough-r/ggplot-details.html): blog post walking through some common requirements for facet plots with multiple subplots
- [Data visualisation/Graphics in R](https://benwhalley.github.io/just-enough-r/layered-graphics.html)
- [University IT documentation to install R on Windows](https://it.leeds.ac.uk/it?id=kb_article_view&table=kb_knowledge&sys_kb_id=4e1642d71b6fa4504d79b455464bcb13)
- [Posit Cloud](https://posit.cloud/)
- [Linux basics - ARC Documentation](https://arcdocs.leeds.ac.uk/getting_started/linuxbasics.html)
- https://swcarpentry.github.io/r-novice-gapminder/
- Discussion on long vs. wide data: "Each format works best for certain tasks: the long format allows data to be stored more densely, while the wide format has more explanatory power if tabular formats are required in a report. It’s up to you to choose which format works best depending on what you expect to accomplish."
    - [From this blog post, specifically about Python but with discussion points relevant to any language](https://towardsdatascience.com/long-and-wide-formats-in-data-explained-e48d7c9a06cb#:~:text=Each%20format%20works%20best%20for,what%20you%20expect%20to%20accomplish.)
- [Data wrangling cheat sheet - common plyr, dplyr and tidyr functions with useful diagrams to explain table manipulation](https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf)
- Information about [tidyverse](https://www.tidyverse.org/) packages
- The "join" function in plyr to [join dataframes](https://rdrr.io/cran/plyr/man/join.html)
- Various discussion posts on StackOverflow on merging/joining/combining different dataframes: [link 1](https://stackoverflow.com/questions/8091303/simultaneously-merge-multiple-data-frames-in-a-list), [link 2](https://stackoverflow.com/questions/6709151/how-do-i-combine-two-data-frames-based-on-two-columns)
- [Descriptive analysis](https://thatdatatho.com/easily-create-descriptive-summary-statistic-tables-r-studio/#:~:text=Create%20Descriptive%20Summary%20Statistics%20Tables%20in%20R%20with%20Gmisc,created%20in%20an%20HTML%20file.)
- [Summarise function in dplyr](https://epirhandbook.com/en/descriptive-tables.html)
- [SWC notes on splitting and combining dataframes](https://swcarpentry.github.io/r-novice-gapminder/12-plyr.html)
- [Working directory issue (if NOT using AppsAnywhere, and you have installed it on your personal, non-work machine)](https://www.programmingr.com/r-error-messages/cannot-change-working-directory/)
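
The long vs. wide distinction quoted in the list above can be made concrete with a small sketch. This is only an illustration, assuming the tidyr package is installed. The data frame and its values are made up for the example and are not taken from the gapminder data, and the `pop_` column prefix is an arbitrary choice.

```r
library(tidyr)

# Made-up values for illustration only (not from the gapminder dataset).
# Long format: one row per country-year combination.
long <- data.frame(
  country = c("A", "A", "B", "B"),
  year    = c(2000, 2010, 2000, 2010),
  pop     = c(10, 12, 20, 25)
)

# Long to wide: one row per country, one population column per year
wide <- pivot_wider(
  long,
  names_from   = year,
  values_from  = pop,
  names_prefix = "pop_"
)

# Wide back to long: collect the pop_* columns into year/pop pairs.
# Note that year comes back as a character column here.
long_again <- pivot_longer(
  wide,
  cols         = starts_with("pop_"),
  names_to     = "year",
  names_prefix = "pop_",
  values_to    = "pop"
)

print(wide)
print(long_again)
```

The wide table is the kind of layout that reads well in a report, while the long table is the layout that functions such as `ggplot()` and dplyr's `group_by()` usually expect.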