# Intro to Data Science

## Foreword

From now on, do **NOT** expect to understand all the code you are exposed to! This is a big leap compared to the last three days, agreed, but this format will be much more beneficial to you. In a real-life situation, you will almost never have to write a full program or script. Others, computer scientists and bioinformaticians, will do that for you, and **for me for that matter**! In my daily work, I rarely reinvent the wheel; I fetch code from here and there and paste it into my data analysis. So yes, it may feel frustrating, and it will require that you trust the code you are given, which may be uncomfortable, but there is no other way: the field is evolving too fast.

What you should really get out of these sessions is:

1. code you can re-use for your analysis
2. code that you may not understand or may not be able to write, but whose results you can interpret

For 2., use your critical thinking to assess whether the results you observe make sense. If they don't, go ask a colleague who can work through the code with you. The more you do that, the better versed you will become in the (obscure) arts of data science. Welcome!

## Data Science intro session

We will go through [Chapter 2](https://r4ds.had.co.nz/explore-intro.html) and start [Chapter 3](https://r4ds.had.co.nz/data-visualisation.html) of the [R for Data Science (R4DS)](https://r4ds.had.co.nz) book by Hadley Wickham and Garrett Grolemund. That book is the **reference** in the field, and if I had a hard copy it would be seriously worn down, consulted too often for its own good.

You are welcome to open RStudio and follow along as I go through [Chapter 3](https://r4ds.had.co.nz/data-visualisation.html), demonstrating live.

In the following code, I will load the R package with the tutorials we will use in the next two days. I will then locate its vignette, open it and follow the instructions to look up a script inside the package called `01_data_science_intro.R`.

```R
# Load the package that ships the tutorials
library(RnaSeqTutorials)

# List the vignettes available in the package
vignette(package="RnaSeqTutorials")

# Open the main vignette
vignette("RnaSeqTutorials",package="RnaSeqTutorials")

# Locate the R scripts shipped with the package
# (pattern is a regular expression, not a glob, hence "\\.R$")
list.files(path=system.file(package="RnaSeqTutorials","scripts"),
           pattern="\\.R$",full.names=TRUE,recursive=TRUE)
```

I will then open that script and go through it. You may as well just sit back, relax and listen.

## Guided session

Now, you are more than welcome to join me. Run the following in your RStudio to list the available tutorials:

```R
# Load the package and open its vignette
library(RnaSeqTutorials)
vignette("RnaSeqTutorials",package="RnaSeqTutorials")

# List the tutorials (R Markdown files) shipped with the package
list.files(path=system.file(package="RnaSeqTutorials","tutorials"),
           pattern="\\.Rmd$", recursive=TRUE)
```

Then we want to start the first tutorial:

```R
# Launch the first interactive tutorial
learnr::run_tutorial("01_data_science_intro", package = "RnaSeqTutorials")
```

It will take a while to start and will open in a pop-up window. For convenience, a button at the top of the pop-up window lets you convert it into a regular tab in your web browser.

Once we are all set, let's start!