# Deseq2 multifactor head break
Digestion of this [page](http://master.bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html) and this [page](http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#interactions)
Considering the Desqe2 table you are doing you analysis and a metadata file that looks like:
```
## DataFrame with 8 rows and 3 columns
## names donor condition
## <factor> <factor> <factor>
## SRR1039508 SRR1039508 N61311 Untreated
## SRR1039509 SRR1039509 N61311 Dexamethasone
## SRR1039512 SRR1039512 N052611 Untreated
## SRR1039513 SRR1039513 N052611 Dexamethasone
## SRR1039516 SRR1039516 N080611 Untreated
## SRR1039517 SRR1039517 N080611 Dexamethasone
## SRR1039520 SRR1039520 N061011 Untreated
## SRR1039521 SRR1039521 N061011 Dexamethasone
```
In deseq2 when you run the differential analysis you run the following command:
```R
dds <- DESeqDataSet(gse, design = ~ cell + dex)
```
The simplest design formula for differential expression would be `~ condition`, where condition is a column in colData(dds) that specifies which of two (or more groups) the samples belong to. For the airway experiment, we will specify `~ cell + dex` meaning that we want to test for the effect of dexamethasone (`dex`) controlling for the effect of different cell line (`cell`).
**Note:** it is prefered in R that the first level of a factor be the reference level (e.g. control, or untreated samples). But the control can be reset using:
```R
gse$dex <- relevel(gse$dex, "Untreated")
```
Also it is important to consider interaction between the different parameter put in the formula. Interaction terms can be added to the design formula, in order to test, for example, if the log2 fold change attributable to a given condition is different based on another factor, for example if the condition effect differs across group.
>If the research aim is to determine for which genes the effect of treatment is different across groups, then interaction terms can be included and tested using a design such as `~ cell + dex + cell:dex`
And the important part. If there are mode than 2 groups in one of the categories. IN the example below there are three (I, II and III) genotype categories.
>The key point to remember about designs with interaction terms is that, unlike for a design `~genotype + condition`, where the condition effect represents the overall effect controlling for differences due to genotype, by adding `genotype:condition`, the main condition effect only represents the effect of condition for the reference level of genotype (I, or whichever level was defined by the user as the reference level). The interaction terms `genotypeII.conditionB` and `genotypeIII.conditionB` give the difference between the condition effect for a given genotype and the condition effect for the reference genotype.
## Reducing effect
DESeq2 offers two kinds of hypothesis tests: the Wald test, where we use the estimated standard error of a log2 fold change to test if it is equal to zero, and the likelihood ratio test (LRT). The LRT examines two models for the counts, a full model with a certain number of terms and a reduced model, in which some of the terms of the full model are removed. The test determines if the increased likelihood of the data using the extra terms in the full model is more than expected if those extra terms are truly zero.
An example with a time series setting:
>DESeq2 can be used to analyze time course experiments, for example to find those genes that react in a condition-specific manner over time, compared to a set of baseline samples. Here we demonstrate a basic time course analysis with the fission data package, which contains gene counts for an RNA-seq time course of fission yeast (Leong et al. 2014). The yeast were exposed to oxidative stress, and half of the samples contained a deletion of the gene atf21. We use a design formula that models the strain difference at time 0, the difference over time, and any strain-specific differences over time (the interaction term strain:minute).
```R
library("fission")
data("fission")
ddsTC <- DESeqDataSet(fission, ~ strain + minute + strain:minute)
```
>The following chunk of code performs a likelihood ratio test, where we remove the strain-specific differences over time. Genes with small p values from this test are those which at one or more time points after time 0 showed a strain-specific effect. Note therefore that this will not give small p values to genes that moved up or down over time in the same way in both strains.
```R
ddsTC <- DESeq(ddsTC, test="LRT", reduced = ~ strain + minute)
resTC <- results(ddsTC)
resTC$symbol <- mcols(ddsTC)$symbol
head(resTC[order(resTC$padj),], 4)
```
###### tags: `R`,`Statistics`,`MyheadHurts`