Eᔕᖇs ᗯIᑎTEᖇ ᗯOᖇKᔕᕼOᑭ ☔
===
:::info
_𝔻𝕒𝕪 𝟛 (𝟘𝟜.𝟙𝟚.𝟚𝟘𝟙𝟠) - 𝔻𝕒𝕪 𝟜 (𝟘𝟝.𝟙𝟚.𝟚𝟘𝟙𝟠)_
𝕋𝕖𝕒𝕔𝕙𝕖𝕣: _Maja Kuzman_
_University of Zagreb_
_Faculty of Science, Division of Biology, Bioinformatics group_
:::
:::warning
### The basics
___
__Introduction__
+ R
+ R markdown, R notebook
##### Vectors:
+ subsetting, recycling
+ Vector types: numeric, logical, character, complex
+ Some operations: +, -, /, *, %/%, %%, ==, !=
+ Some functions: any, all, example, help, ?, ??, sum, sd, mean, factorial, abs
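
A minimal sketch of the vector topics above (subsetting, recycling, integer division and modulo, `any`/`all`). The vector `x` and all variable names are made up for illustration.

```r
x <- c(10, 20, 30, 40, 50, 60)

# Subsetting: by position, by logical condition, by negative index
first_and_third <- x[c(1, 3)]   # elements 1 and 3
above_25        <- x[x > 25]    # elements where the condition is TRUE
without_first   <- x[-1]        # everything except element 1

# Recycling: the shorter vector c(1, 2) is repeated to match length(x)
recycled <- x + c(1, 2)         # 11 22 31 42 51 62

# Integer division and modulo
quotient  <- 17 %/% 5           # 3
remainder <- 17 %% 5            # 2

# any / all on a logical vector
any_large <- any(x > 50)        # TRUE
all_large <- all(x > 50)        # FALSE
```
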
##### Matrices:
+ subsetting, basic operations
##### Functions:
+ Basic function format, environments, return, recursions
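
One way to picture the function topics above is a small recursive function plus a scoping example. `fact`, `bump` and `counter` are hypothetical names chosen for this sketch.

```r
# Hypothetical helper: recursive factorial for non-negative integers
fact <- function(n) {
  if (n <= 1) {
    return(1)          # base case ends the recursion
  }
  n * fact(n - 1)      # the last evaluated expression is returned
}

fact_5 <- fact(5)      # 120, same as base R factorial(5)

# Environments: assignment inside a function creates a local variable
counter <- 0
bump <- function() {
  counter <- counter + 1   # local copy; the global `counter` is untouched
  counter
}
bumped <- bump()           # 1
# `counter` in the global environment is still 0
```
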
##### Flow control in R:
+ if, if-else, ifelse, for, while, break, next
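
A short sketch of the flow-control constructs listed above. The numbers and variable names are illustrative only.

```r
# for loop with next (skip an iteration) and break (leave the loop)
evens <- c()
for (i in 1:10) {
  if (i %% 2 != 0) next   # skip odd numbers
  if (i > 8) break        # stop at the first even number above 8
  evens <- c(evens, i)
}
# evens now holds 2 4 6 8

# while loop: repeat until the condition becomes FALSE
n <- 0
while (n < 3) {
  n <- n + 1
}

# if / else works on a single condition
size <- if (n > 2) "big" else "small"

# ifelse is the vectorised version: one answer per element
sizes <- ifelse(c(1, 5, 10) > 4, "big", "small")   # "small" "big" "big"
```
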
##### Lists:
+ basic operations; accessing elements, list structure
+ lapply, sapply, tapply, by, do.call
+ ... as parameter
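
The list topics above could look roughly like this in practice. The list `scores` and the helper `count_args` are assumptions made for the example.

```r
scores <- list(a = c(1, 2, 3), b = c(10, 20))

# Accessing elements: $ and [[ ]] return the element, [ ] returns a list
a_values <- scores$a
b_values <- scores[["b"]]
a_as_list <- scores["a"]

# lapply always returns a list, sapply simplifies to a vector when it can
means_list <- lapply(scores, mean)
means_vec  <- sapply(scores, mean)    # named numeric vector: a = 2, b = 15

# tapply: apply a function to values split by a grouping vector
group_sums <- tapply(c(1, 2, 3, 4), c("x", "y", "x", "y"), sum)  # x = 4, y = 6

# do.call: call a function with its arguments supplied as a list
combined <- do.call(rbind, list(1:2, 3:4))   # 2 x 2 matrix

# ... collects any number of arguments
count_args <- function(...) length(list(...))
n_args <- count_args(1, "a", TRUE)    # 3
```
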
##### Factors:
+ structure, addition of elements, addition of new levels
+ Conversion to numeric
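
A sketch of the classic factor pitfall behind "conversion to numeric", together with adding a new level. The factor `f` is made up for illustration.

```r
f <- factor(c("10", "20", "10", "30"))

# as.integer / as.numeric on a factor return the internal level codes
codes <- as.integer(f)                   # 1 2 1 3

# Go through character to recover the values the labels stand for
values <- as.numeric(as.character(f))    # 10 20 10 30

# A new value can only be added after its level exists
levels(f) <- c(levels(f), "40")
f[5] <- "40"
```
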
:::
:::warning
### Data manipulation
___
__Data frames__
+ read in: read.table, read.csv, read.delim (tab-separated files)
+ basics - subset() and []
+ merge, order, unique
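
The base R data frame operations above might be combined like this. The tables `people` and `ages` are invented for the sketch.

```r
people <- data.frame(id = c(3, 1, 2, 2),
                     name = c("Cy", "Ann", "Bo", "Bo"),
                     stringsAsFactors = FALSE)
ages <- data.frame(id = c(1, 2), age = c(31, 45))

# unique drops duplicated rows, order gives the row permutation for sorting
dedup  <- unique(people)
sorted <- dedup[order(dedup$id), ]

# Two equivalent ways to pick rows
bo_subset  <- subset(sorted, name == "Bo")
bo_bracket <- sorted[sorted$name == "Bo", ]

# merge is an inner join by default: id 3 has no age and is dropped
joined <- merge(sorted, ages, by = "id")
```

Passing `all.x = TRUE` to `merge` would keep id 3 with `NA` as its age.
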
__Package: dplyr__
+ filter, slice, select
+ %>%, grouping
+ summarise, arrange, lead, lag, n, count
+ mutate, mutate_all, transmute
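
A minimal dplyr pipeline touching several of the verbs above, assuming the dplyr package is installed. It uses the built-in `mtcars` data set; the column names `hp_per_mpg`, `n_cars` and `mean_mpg` are chosen for this example.

```r
library(dplyr)

by_cyl <- mtcars %>%
  filter(mpg > 15) %>%                       # keep rows
  select(cyl, mpg, hp) %>%                   # keep columns
  mutate(hp_per_mpg = hp / mpg) %>%          # add a derived column
  group_by(cyl) %>%                          # one group per cylinder count
  summarise(n_cars = n(),                    # rows per group
            mean_mpg = mean(mpg)) %>%
  arrange(desc(mean_mpg))                    # sort, highest mean first
```
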
__Package: data.table__
+ i: selecting rows
+ j: selecting columns, returning list list() / .()
+ by: grouping
+ operations on columns
+ Adding new columns
+ .N, .I, .GRP
+ keys
+ .SD, .SDcols
+ {}: suppressing intermediate output
+ merge
+ roll
+ foverlaps
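
A rough sketch of the `dt[i, j, by]` pattern and a few of the special symbols above, assuming the data.table package is installed. It uses the built-in `mtcars` data; the new column `kpl` and its conversion factor are illustrative.

```r
library(data.table)

dt <- as.data.table(mtcars)

fast    <- dt[mpg > 25]                                  # i: select rows
cols    <- dt[, .(mpg, cyl)]                             # j: select columns, .() is list()
per_cyl <- dt[, .(n = .N, mean_mpg = mean(mpg)), by = cyl]  # by: grouping, .N counts rows

# Adding a new column by reference with :=
dt[, kpl := mpg * 0.425]    # approximate miles per gallon to km per litre

# .SD is the subset of data for each group, restricted by .SDcols
col_means <- dt[, lapply(.SD, mean), by = cyl, .SDcols = c("mpg", "hp")]

# Keys: sort by cyl and allow fast lookups on it
setkey(dt, cyl)
four_cyl <- dt[.(4)]        # rows whose key equals 4
```
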
__Regular expressions__
+ grep
+ Special characters: ^$ \ . + * ?
+ Special brackets: [], (), {}. \\1
+ stringr package
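
The regular expression items above can be tried out with base R functions. The identifiers in `ids` are invented for this sketch.

```r
ids <- c("gene_12", "geneA", "tx_7", "gene_305")

# ^ and $ anchor the pattern, [0-9]+ means one or more digits
hit_index  <- grep("^gene_[0-9]+$", ids)                 # 1 4
hit_values <- grep("^gene_[0-9]+$", ids, value = TRUE)   # "gene_12" "gene_305"

# () capture groups, \\1 and \\2 refer back to them in the replacement
swapped <- sub("^([a-z]+)_([0-9]+)$", "\\2_\\1", ids)
# "12_gene" "geneA" "7_tx" "305_gene"

# {} sets the number of repetitions: here exactly three digits in a row
three_digits <- grepl("[0-9]{3}", ids)                   # FALSE FALSE FALSE TRUE
```
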
\
__Package: tidyr__
##### What is clean data?
+ rows = observations
+ columns = attributes
##### How to clean up messy data:
+ Spread: variables are stored in rows; each column should hold a single attribute
+ Gather: column headers are values, not variable names
+ Separate: busy columns (several values packed into one column)
+ Merge multiple tables (baseR, dplyr, data.table)
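
A small sketch of the three tidyr clean-up steps, assuming the tidyr package is installed. The tables and column names are made up. Note that newer tidyr versions offer `pivot_longer` and `pivot_wider` as successors to `gather` and `spread`, which still work.

```r
library(tidyr)

# Messy: the column headers ctrl / treat are values of a "condition" variable
wide <- data.frame(gene = c("g1", "g2"), ctrl = c(5, 8), treat = c(9, 3))

# gather: headers become values in a key column
long <- gather(wide, key = "condition", value = "expression", ctrl, treat)

# spread: the inverse, one column per value of the key
back <- spread(long, key = condition, value = expression)

# separate: split a busy column into several columns
samples <- data.frame(sample = c("ctrl_1", "treat_2"))
parts <- separate(samples, sample, into = c("condition", "replicate"), sep = "_")
```
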
:::
:::warning
### Data visualization
___
+ Plots in base R vs. ggplot2: examples
__Some useful graphs:__
+ Scatter plot
+ Q-Q plot
+ Histograms
+ Density plots
+ Correlation matrix (package: corrplot)
+ Heatmap (package: pheatmap)
__Package: ggplot2__
+ Basics: ggplot(dataframe, aes(x,y))
+ Different layer examples: geom_point(), geom_histogram(), geom_smooth(), geom_bar(), geom_boxplot(), geom_density(), ...
+ Groupings: group, fill, facets
+ Other: titles, axes, legends, colors, themes
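
A minimal layered plot illustrating the ggplot2 items above, assuming the ggplot2 package is installed. It uses the built-in `mtcars` data; the title and axis labels are chosen for this example.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +                              # layer 1: scatter plot
  geom_smooth(method = "lm", se = FALSE) +    # layer 2: linear fit per colour group
  facet_wrap(~ gear) +                        # one panel per number of gears
  labs(title = "Fuel economy by weight",
       x = "Weight (1000 lbs)",
       y = "Miles per gallon",
       colour = "Cylinders") +
  theme_minimal()

# print(p) draws the plot
```
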
__Interactive graphs__
+ Interactive graph examples with ggplotly (package: plotly)
+ shiny
:::
:::info
### Advanced topics: Bioconductor
___
__Package: Biostrings__
+ BSgenome
+ Get sequence by GRanges - getSeq
+ Useful functions: complement, reverseComplement, reverse, c
+ subseq, Views
+ alphabetFrequency, dinucleotideFrequency, trinucleotideFrequency, oligonucleotideFrequency
+ translate, consensusMatrix, matchPattern, pairwiseAlignment
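
A short sketch of a few of the Biostrings functions above, assuming the Bioconductor package Biostrings is installed. The nine-base sequence is invented for illustration.

```r
library(Biostrings)

dna <- DNAString("ATGGCCTAA")

rc          <- reverseComplement(dna)                 # TTAGGCCAT
first_codon <- subseq(dna, start = 1, end = 3)        # ATG
protein     <- translate(dna)                         # MA* (stop codon shown as *)
base_counts <- alphabetFrequency(dna, baseOnly = TRUE)  # counts of A, C, G, T, other
hits        <- matchPattern("GCC", dna)               # one match starting at position 4
```
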
__Package: ShortRead__
+ Handling FastQ reads
+ Handling alignments
__Package: biomaRt__
+ Choosing mart (version, type, organism): useEnsembl
+ Choosing dataset - listDatasets, useDataset, listAttributes
+ Getting the dataset - getBM
__Package: GenomicRanges and IRanges__
+ Defining IRanges and accessing elements
+ Some functions: reduce, disjoin, findOverlaps, countOverlaps, coverage
+ Defining GRanges and applying functions on them
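
The range operations above can be sketched on three toy intervals, assuming the Bioconductor package IRanges is installed. The coordinates are made up.

```r
library(IRanges)

ir <- IRanges(start = c(1, 5, 20), end = c(8, 12, 25))

# Accessing elements
starts <- start(ir)
widths <- width(ir)           # 8 8 6

merged <- reduce(ir)          # overlapping ranges merged: 1-12 and 20-25
pieces <- disjoin(ir)         # non-overlapping pieces: 1-4, 5-8, 9-12, 20-25

query  <- IRanges(start = 6, end = 21)
n_hits <- countOverlaps(query, ir)    # 3: the query touches all three ranges
pairs  <- findOverlaps(query, ir)     # the individual query/subject hit pairs

cov <- coverage(ir)           # per-position depth, stored as an Rle
```
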
:::
___
##### ☃ ☃ ❄ ❄ _In your free time:_ ❄ ❄ ☃ ☃
![Rpic](https://i1.wp.com/prismoji.com/wp-content/uploads/2017/02/emojis-comp-p_2017-02-06_05-36-24.png?resize=712%2C350&ssl=1)
[Emoji data science in R: A tutorial](https://prismoji.com/2017/02/06/emoji-data-science-in-r-tutorial/) By Hamdan Azhar