UC San Diego Library
Software Carpentry Workshop - Intro to R
April 10-11, 2019
Biomedical Library Building
9:00am - 4:30pm
This HackMD has been locked by the instructors and can no longer be edited.
If you would like to export these notes, you can do that by selecting Export or Download options in the HackMD options menu.
Instructors
This HackMD: https://bit.ly/2Z2gdIL
Syllabus and Schedule: https://ucsdlib.github.io/2019-04-10-UCSD/
UNIX Shell Set-up: https://swcarpentry.github.io/shell-novice/setup.html
DAY 1 - Please sign in here
Name, Affiliation (faculty, staff, student, post-doc, etc), Department/Lab
Reid Otsuji, Research Data Curation Librarian, Library
Stephanie Labou, Data Science Librarian, Library
Dan LaSusa, staff IT, Library
Rick McCosh, OB/GYN+REPRO SCI
Arden Tran, Research IT
Eva Sanchez Alvarez, Marine Bio
Ashlie Pankonin, staff, Psychology/Barner Lab
Kyle Begovich, post-doc, Biology/Wilhelm Lab
Stefanie Makowski, post-doc, Medicine/Field Lab
Charles Seller, post-doc, Biology/Schroeder Lab
Sonja Lang, post-doc, Medicine/Schnabl Lab
Huikuan Chu, post-doc, Medicine/Schnabl Lab
Zhenping Wang, assistant project scientist, Dermatology/DiNardo lab
Lu Jiang, postdoc, Medicine/Schnabl Lab
Jose Bucheli, visiting grad student, Center for US-Mexican Studies
Stephanie Gamez, grad student, Biology/Akbari Lab
David Herold, faculty, Pathology
Nuria Pell, Medicine, Student-phD/Schnabl Lab
Ben Croker, faculty, Pediatrics
Yi Duan, post-doc, Medicine/Schnabl lab, yid003@ucsd.edu
Nicole Gergans, Staff, San Diego Water Board
Bei Gao, postdoc, Medicine/Schnabl lab
Rongrong Zhou, post-doc, Medicine/Schnabl Lab
Isabel Salas, post-doc, Salk institute, Allen Lab
Ranveer Jayani, Assistant Project Scientist, School of Medicine, UCSD
Collaborative Notes:
Day 1 - Unix shell
Setup
For setup of the Unix shell, see the UNIX Shell Set-up link above.
For Windows, you'll want to open the CLI (Command Line Interface, aka 'Command Prompt') and then enter the command
bash
to enter the bash environment. All that means is we're telling the CLI to interpret our commands in the language of bash. On Mac, bash is native to the system, so it is already there when you open 'Terminal'.
Basic Definitions - "what do those acronyms mean?!"
When we use ls to list our files and folders, we are seeing a stripped-down, base version of the functions that are used to list files and folders, as opposed to using the usual Finder (Mac) or File Explorer (Windows) to look at your files and folders.
Bash Commands
pwd: "print working directory" (tells me what directory I'm in)
mkdir: make a new folder (example: mkdir new_folder)
cd <folder name>: change directory (example: cd new_folder)
nano: open the nano text editor to create or edit a file (example: nano new_file.txt)
CTRL + O: use this inside the nano window to save your draft file. A prompt to enter a filename to save as will appear.
CTRL + X: use this to exit nano. If you have unsaved changes, you'll be prompted to choose whether to save them.
cd ..: go back or "up" a level in the file directory.
cd: cd by itself will take you back to your home directory, where the prompt will usually look like <computer name>:~ <username>$
rm: remove a file (won't work on directories/folders unless you add the -r flag)
rmdir: remove a directory (only works if the directory is empty)
rm -r /path/to/dir/*: remove everything inside the directory, including all sub-directories and files. There is no undo.
"Flags" or "arguments": in programming and with command line languages like Bash or Command Prompt, most of the commands you'll use have lots of options to modify what you want that command to do. In Bash we call these flags; in most languages we call them arguments. For example, the command rm above removes a file, but if you want to remove an entire directory tree with its sub-directories and files, you add the flag -r and then give the path you want to remove as the argument, /path/to/dir/*. You'll end up with a command that looks like this:
rm -r /path/to/dir/*
history: gives you a list of the commands you've entered. To keep a copy for later use, save it with history > history.txt
history > <filename>.txt: the > is not a flag; it redirects the output of history into a text file with a name of your choice.
cat <filename>.txt: use the cat command to quickly read through a file by having the CLI print the contents of the text file within the CLI.
clear: wipes out the commands visible within the window of the CLI. On a Mac, just scroll up to see the history. On Windows, the CLI is actually cleared out. This is a great tool for starting fresh visually.
mv: move or rename files. For example, mv history.txt quotes.txt renames history.txt to quotes.txt (it does not leave a copy behind), and mv history.txt ~/Desktop moves the history.txt file to the Desktop.
wc: word count (example: wc *.txt returns the number of lines, words, and characters in all files with a .txt extension). Add -l to return just the number of lines (wc -l *.txt).
| (called the "pipe") is used to chain together commands. For example,
wc -l *.txt | sort -n | head -n 1
would (1) find the number of lines in all .txt files, then (2) sort the output from smallest number of lines to largest number of lines, then (3) return the first row (the head command), which is the file with the fewest lines.
CLI Pro-Tips
Use the ⬆ or ⬇ arrow keys to scroll through commands you've already entered, then press Enter when you get to the command you want, or modify it as needed before hitting Enter.
Use TAB to auto-complete commands and files/folders in the CLI. For example, say you're trying to type the file path ~/Desktop/thesis/: you could simply start typing ~/De and then TAB, and it will complete to ~/Desktop/. Then continue by typing t and then TAB to complete thesis/, giving you ~/Desktop/thesis/.
If the thing you are trying to auto-complete shares the start of its name with other files, it may take a couple more characters before the auto-complete finds the right file name. Use ls to determine how many characters you need to type before you can use auto-complete.
R Day 1
Gapminder Data download:
https://raw.githubusercontent.com/swcarpentry/r-novice-gapminder/gh-pages/_episodes_rmd/data/gapminder_data.csv
Windows: right click -> save as
Mac: control + click -> save
Intro to R & RStudio:
RStudio is an IDE (integrated development environment)
what is RStudio:
https://www.rstudio.com/products/rstudio/
RStudio IDE Cheat sheet:
https://www.rstudio.com/resources/cheatsheets/#ide
TED Talk about gapminder data set
https://www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen
Older versions of the RStudio installers:
https://support.rstudio.com/hc/en-us/articles/206569407-Older-Versions-of-RStudio?mobile_site=true
Make sure you are using RStudio and not the R programming console.
Working with variables
R uses variable names to store values.
Use <- to assign values to variables, such as x <- 5 (now x is equal to the value 5).
Best practices for naming variables:
Separate words with an underscore _ (my_new_variable) or use camel case (MyNewVariable).
Variables are created (and overwritten) in the order they are run. Try running the following to see this in action:
mass <- 47.5
age <- 122
mass <- mass * 2.0
age <- age - 20
You can remove a variable using rm() with the variable name in the (), as in rm(my_variable)
You can see all the variables you currently have using ls()
Getting help
Use ? to get help about functions in R. For example, ?min will return (in the lower right hand window) the help page for the min() function.
Arithmetic and comparisons in R
Addition: +
Subtraction: -
Divide: /
Multiply: *
Greater than: >
Greater than or equal to: >=
Less than: <
Less than or equal to: <=
Equals: ==
Not equals: != (the ! here is equivalent to saying "not" whatever comes next, so "not equal")
Vectorization
Create a vector manually: use c() (the combine function), for example c(1, 3, 5, 7)
Creating a vector using seq() function
Packages
Packages in R are your friend! You can think of these as bundles of useful functions that other people have created and made available to share. Practically, this means that if you're thinking "hm, I wish I could do this thing in R…" someone probably made a package with functions to do that thing!
There are a lot of packages available: https://cran.r-project.org/web/packages/available_packages_by_name.html
You will need to install a package in order to use it. The good news is that you only have to install it once. To install, you can use install.packages() with the name of the package, in quotes, in the (). For example, to install the plotting package ggplot2, you would use install.packages("ggplot2").
You can also use the point-and-click option in RStudio: Tools -> Install Packages -> enter the names of the packages.
Once you have the package installed (which you only have to do once) you'll still need to tell R that you want to access that package each time you start a new R session. You do this by using library() with the name of the package (without quotes). Example: library(ggplot2) means that now I can access functions within ggplot2 in my current R session.
The standard convention is to run these library() commands at the start of each R script. You'll want to load all your packages like this at the beginning of your session/script.
Before working with data
Set your working directory using the menu in RStudio:
Session -> Set Working Directory -> Choose Directory
Select the directory where you have saved the Gapminder data.
Loading .csv data in RStudio
read.csv("gapminder.csv")
Gapminder data TED talk
looking at the structure of the data frame:
Different Data Types in R:
Logical
Integer
Numeric (Double)
Complex
Character
Vectors
A collection of data points, in order, all of the same data type.
Factors
A variable of any of the above types can also be treated as a factor: discrete group assignments.
Assigning gapminder.csv data to a variable
Creating plots
load ggplot2 library
aes = aesthetics
what happens if you remove geom_point?
save plots:
saved files are saved in the working directory
adding a theme to a plot: add after geom
using variables to build plot layers:
plot modified to show lines
adding lines and points
Additional plotting resources
R graph gallery: https://www.r-graph-gallery.com/
ggplot2 cheatsheet: https://www.rstudio.com/wp-content/uploads/2015/03/ggplot2-cheatsheet.pdf
General reference for ggplot2: https://ggplot2.tidyverse.org/reference/
End of day exercise: Day 1 Feedback
Learned fast!
Learned quick commands–useful. Agree
liked learning how to make plots
Go deeper into construct our own plots, any kinds
Customize plots
How to control other computers with the console/terminal
When assisting others with questions, would be nice to keep voices down a little. Hard to focus with many voices at once
might be a good idea to encourage people to look over the lesson ahead of time so we can focus on practical application in class rather than general concepts
DAY 2 - Please sign in here
Name, Affiliation (faculty, staff, student, post-doc, etc), Department/Lab
Kyle Begovich, post-doc, Biology/Wilhelm Lab
Ben Croker, Faculty, Pediatrics
Bei Gao, Postdoc, Medicine/Schnabl lab
zhenping wang, project scientist, Dermatology/DiNardo lab
Nicole Gergans, staff, San Diego Water Quality Control Board
Jose Bucheli, visiting grad student, Center for US-Mexican Studies
Abby Pennington, Metadata Services / Research Data Curation, Library
Stephanie Gamez, grad student, Biology/Akbari Lab
Stefanie Makowski, postdoc, Medicine/Field Lab
Charles Seller, post-doc, Biology/Schroeder lab
Ashlie Pankonin, staff, Psychology/Barner lab
Huikuan Chu, postdoc, Medicine/Schnabl lab
Ranveer Jayani, Assistant Project Scientist, School of Medicine, UCSD
Rongrong Zhou, post-doc, Medicine/Schnabl Lab
Yi Duan, post-doc, Medicine/Schnabl lab, yid003@ucsd.edu
Isabel Salas, post-doc, Salk institute, Allen Lab
Version control with git
The material we will be covering: https://swcarpentry.github.io/git-novice/
Set up
The first time you use git, you'll need to set up your name and email.
git config --global user.name "Your Name"
git config --global user.email "myemail@domain.com"
Once you're inside the folder in which you want to version control your files, you'll use
git init
to get everything started. This will turn your folder into a "repository" where git can store versions of your files.
Make sure you do not run git init in nested folders
(This causes problems with tracking the changes of files.)
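One way to avoid a nested repository is to ask git whether the current folder already sits inside one before running git init. The following is a sketch under assumptions: the folder name planets and the temporary location are hypothetical, and the check uses git rev-parse --is-inside-work-tree, which was not covered in the workshop.

```shell
# Hypothetical project folder inside a temporary directory.
project="$(mktemp -d)/planets"
mkdir "$project"
cd "$project"

# Before running git init, check whether we are already inside a repository.
if git rev-parse --is-inside-work-tree > /dev/null 2>&1; then
    already_in_repo="yes"
    echo "Already inside a git repository; skipping git init."
else
    already_in_repo="no"
    git init -q        # -q keeps the output quiet
fi

ls -a                  # the hidden .git folder is where git stores its history
```

If the check reports that you are already inside a repository, move to a folder outside it before initializing a new one.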
If you get stuck in vim, type
:q!
and this will force an exit.
Basic git commands
Use git status to check the status of your git repository. This will tell you what files have changed (including any added or deleted files).
If there are no changes, this will return
On branch master
nothing to commit, working directory clean
If there are files that have changed, you will see something along the lines of
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        draft.txt
Use git add filename.extension (example: git add draft.txt) to tell git you want to track this file.
Then, git status will return something along the lines of
Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
        new file:   draft.txt
Git now knows that it's supposed to keep track of draft.txt, but it hasn't recorded these changes as a commit yet. To get it to do that, we need to run one more command: git commit
You'll include a commit message, which is a short blurb describing what you've done in this change.
git commit -m "create draft.txt"
Once you press enter, you will see something like:
[master (root-commit) f22b25e] create draft.txt
 1 file changed, 1 insertion(+)
 create mode 100644 draft.txt
To see a history of your commits, use git log. (When you get to the point of having a lot of commits, you can use git log --oneline to see a more succinct summary of the commits.)
Order of operations
Most of working with git comes down to two commands, git add and git commit:
git add myfile.txt
git commit -m "created myfile.txt"
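To see the whole add-then-commit cycle in one place, here is a minimal sketch run in a throwaway repository. The temporary folder and the file contents are made up for illustration, and the name and email are set for this repository only (placeholder values), so your global configuration is left alone.

```shell
# Throwaway repository in a temporary folder (hypothetical location).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"             # placeholder identity, local to this repo
git config user.email "myemail@domain.com"

echo "first line" > myfile.txt

git status --short                # myfile.txt is listed as untracked (??)
git add myfile.txt                # stage the file
git status --short                # myfile.txt is now listed as added (A)
git commit -q -m "created myfile.txt"

git log --oneline                 # one line per commit
commit_count="$(git rev-list --count HEAD)"
```

git status --short is a compact form of git status; the longer default output carries the same information.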
Use git status at any point to see whether there are any untracked files or any changes that haven't been committed.
Look at differences between versions
Use git diff to see how files have changed between commits. Use the commit's alphanumeric ID (at least the first few characters) to see the difference between the current version and a selected past version of a specified file. For example:
git diff f22b25e draft.txt
will show the difference between the current version of draft.txt and the version at commit f22b25e.
You can use git diff HEAD as a shortcut to see changes between the current version and the last committed version:
git diff HEAD draft.txt
Similarly, you can use git diff HEAD~1 to see changes between the current version and the commit one prior to the last commit:
git diff HEAD~1 draft.txt
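The difference between HEAD and HEAD~1 is easier to see with a file that has two commits plus one uncommitted edit. This is an illustrative sketch only: the repository lives in a temporary folder, and the file contents and commit messages are invented.

```shell
repo="$(mktemp -d)"                          # throwaway repository (hypothetical)
cd "$repo"
git init -q
git config user.name "Your Name"             # placeholder identity, local to this repo
git config user.email "myemail@domain.com"

echo "line one" > draft.txt
git add draft.txt
git commit -q -m "create draft.txt"

echo "line two" >> draft.txt
git add draft.txt
git commit -q -m "add second line"

echo "line three" >> draft.txt               # edited but not committed

git diff HEAD draft.txt      # compared with the last commit: only "line three" is new
git diff HEAD~1 draft.txt    # compared with the commit before that: "line two" and "line three" are new

# Count the added content lines in each comparison.
added_since_last="$(git diff HEAD draft.txt | grep -c '^+line')"
added_since_previous="$(git diff HEAD~1 draft.txt | grep -c '^+line')"
```

In the diff output, lines starting with + were added relative to the older version, and lines starting with - were removed.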
See all commits for a particular file (only commits where certain file has changed):
git log --follow -- filename
Use git log -p filename to see the actual differences between versions in addition to the commit messages.
Rolling back to previous versions
Use git checkout to "roll back" a file to a previously committed version (aka restore a previous version):
git checkout f22b25e draft.txt
To put things back the way they were:
git checkout HEAD draft.txt
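A round trip through both commands can be sketched as follows. The repository, file contents, and commit messages are hypothetical, and the earlier commit's ID is looked up with git rev-parse --short HEAD~1 instead of being typed by hand as in the notes.

```shell
repo="$(mktemp -d)"                          # throwaway repository (hypothetical)
cd "$repo"
git init -q
git config user.name "Your Name"             # placeholder identity, local to this repo
git config user.email "myemail@domain.com"

echo "version one" > draft.txt
git add draft.txt
git commit -q -m "create draft.txt"

echo "version two" > draft.txt
git add draft.txt
git commit -q -m "rewrite draft.txt"

old_commit="$(git rev-parse --short HEAD~1)" # short ID of the earlier commit

git checkout "$old_commit" draft.txt         # roll the file back to the earlier version
rolled_back="$(cat draft.txt)"

git checkout HEAD draft.txt                  # put things back the way they were
restored="$(cat draft.txt)"

echo "$rolled_back / $restored"
```

Naming the file in each checkout command is what keeps the rest of the repository where it is.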
Detached HEAD: If you checkout and forget to specify a file, your whole repository will be rolled back and you will get a warning in your console that says
You are in 'detached HEAD' state.
The "detached HEAD" is like "look, but don't touch" here, so you shouldn't make any changes in this state. After investigating your repo's past state, reattach your HEAD with
git checkout master
Ignoring things
You can create a file called .gitignore and include any files or folders which you don't want to track.
nano .gitignore
[then add files or folders]
So for instance if I wanted to ignore a file called notes.txt or a folder called data/:
cat .gitignore
notes.txt
data/*
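To confirm that the ignore rules do what you expect, you can compare git status before trusting them. A small sketch, assuming a throwaway repository; the files draft.txt and data/results.csv are invented for the example, and git check-ignore is an extra command that was not covered in the workshop.

```shell
repo="$(mktemp -d)"                          # throwaway repository (hypothetical)
cd "$repo"
git init -q

mkdir data
echo "scratch" > notes.txt                   # should be ignored
echo "1,2,3" > data/results.csv              # should be ignored
echo "keep me" > draft.txt                   # should still show up

printf 'notes.txt\ndata/*\n' > .gitignore    # same two rules as in the notes

git status --short                           # lists .gitignore and draft.txt only
visible="$(git status --porcelain)"

# git check-ignore prints each given path that matches an ignore rule.
ignored="$(git check-ignore notes.txt data/results.csv)"
echo "$ignored"
```

The .gitignore file itself shows up as untracked, so it can be added and committed like any other file.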
GitHub
Material here: https://swcarpentry.github.io/git-novice/07-github/index.html
git status uses colors to show where each file stands:
red - the file is untracked or has changes that have not been staged yet
green - the file's changes have been staged with git add and will go into the next commit
Remotes in GitHub
To get started with using remotes, we'll need to have a GitHub account. Go to GitHub and create an account.
Go here for the repo
Day 2 R
Make sure you have the Gapminder data saved in your working directory.
select from dplyr
select dataframe, column
select is a way to select only the columns you need.
filter() command - filters by row condition
Exercise solution:
%>% (this is a pipe)
mutate() function
Exercise 2
Break now: III
Break at 2:30: IIII
copy pasta code for arden
install knitr package
different plots
plotly
End of workshop exercise: Day 2 Feedback
The instructors are very knowledgeable and helpful. I learned a lot in the two-day workshop.
Good overview of the kinds of things we can do with our new skills, although putting it into practice on my own seems a little daunting.
great code for plots. ++ :D
It's a great two-day workshop packed with lots of information. However, I feel that this workshop should be split into multiple weeks (with 1-2 hrs a week). This will help us all (who know nothing about all this) to grasp the commands and principles in a better way. And some homework and self practice will help hone our skills.
I think it would be helpful to allow for more time to understand what we're doing because sometimes you feel like you are typing things without really knowing why we're doing it. Maybe by making us read over the lessons ahead of time.
Take the post workshop survey: https://ucsdlib.github.io/2019-04-10-UCSD/