# Regression
###### tags: `R` `statistics` `regression`
## Dataset
**Data Set Information:**
This dataset contains the medical records of **66 patients who had hepatic portal venous gas (HPVG)**. Each patient profile has **13 clinical variables**.
<br/>
**Attribute Information:**
* Patient No: id
* Sex (0:Female; 1:Male): sex
* Age (years): age
* Symptom onset to ED presentation (hours): p_time
* Body temperature (℃): bt
* Pulse rate (bpm): pr
* Respiratory rate (breaths/min): rr
* Mean arterial pressure (mmHg): bp
* Rapid Acute Physiology Score: RAPS
* Rapid Emergency Medicine Score: REMS
* Modified Early Warning Score: MEWS
* Management (0:Conservative; 1:Surgery): management
* ED presentation to operation (hours): op_time
* End outcome (0:Survival; 1:Death): outcome

Source: [PLoS ONE 12(9): e0184813](https://doi.org/10.1371/journal.pone.0184813)
## Regression (Correlation and Plot)
We will use the PLOS ONE dataset described above. First, we test the Pearson correlation between pulse rate (`pr`) and respiratory rate (`rr`), then draw a scatter plot with a fitted regression line.
```r=
# load library
library(ggplot2) # for graph
library(PerformanceAnalytics) # for chart.Correlation()
# import pone_data.txt file and name it data
data <- read.table("data/pone_data.txt", header = TRUE, sep = "\t")
# check the structures of dataset
str(data)
# correlation
cor.test(data$pr, data$rr, method = "pearson")
# scatter plot of rr against pr with a fitted linear regression line
ggplot(data = data, aes(x = pr, y = rr)) +
  geom_point(size = 3, shape = 16) +
  geom_smooth(method = "lm", se = FALSE)
```
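The `cor.test()` call above prints a full report, but its individual parts can also be read from the returned object. The following is a minimal sketch on simulated data (the `sim` data frame and its values are assumptions made for illustration, not the HPVG records), showing how the coefficient, confidence interval, and p-value might be extracted.

```r
# Simulated stand-in for pulse rate (pr) and respiratory rate (rr).
# These values are NOT from the HPVG dataset.
set.seed(1)
sim <- data.frame(pr = rnorm(66, mean = 100, sd = 20))
sim$rr <- 10 + 0.1 * sim$pr + rnorm(66, sd = 3)

res <- cor.test(sim$pr, sim$rr, method = "pearson")

r_hat <- unname(res$estimate)  # Pearson correlation coefficient
p_val <- res$p.value           # p-value for H0: true correlation is 0
ci    <- res$conf.int          # 95% confidence interval for the coefficient

cat(sprintf("r = %.3f, 95%% CI [%.3f, %.3f], p = %.4g\n",
            r_hat, ci[1], ci[2], p_val))
```

With the real data, replacing `sim` with `data` should give the same numbers that `cor.test(data$pr, data$rr, method = "pearson")` prints.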
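### **Plotting Symbols**
`shape = 16` in `geom_point()` draws a solid circle. ggplot2 accepts the same numeric symbol codes (0 to 25) as the `pch` argument in base R graphics.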
<br/>
## Regression (Modeling, Univariate Linear Regression)
```r=
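# fit a linear regression of rr on pr by ordinary least squares
model <- lm(rr ~ pr, data = data)
summary(model)
# fit the same model with glm(); the gaussian family gives the same coefficients as lm()
model1 <- glm(rr ~ pr, data = data, family = gaussian)
summary(model1)
# predict rr for patients with a pulse rate of 103 and 120 bpm
predictor <- data.frame(pr = c(103, 120))
predict(model1, predictor, type = "response")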
```
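The block above fits the same relationship twice, once with `lm()` and once with `glm(..., family = gaussian)`. As a rough check, the sketch below uses simulated data (the `sim` data frame is an assumption, not the HPVG records) to compare the two sets of coefficients and to show that `predict()` on an `lm` fit can also return a confidence interval around each prediction.

```r
# Simulated stand-in for pr and rr. These values are NOT from the HPVG dataset.
set.seed(2)
sim <- data.frame(pr = rnorm(66, mean = 100, sd = 20))
sim$rr <- 10 + 0.1 * sim$pr + rnorm(66, sd = 3)

fit_lm  <- lm(rr ~ pr, data = sim)
fit_glm <- glm(rr ~ pr, data = sim, family = gaussian)

# TRUE if both fits estimate the same intercept and slope
print(all.equal(coef(fit_lm), coef(fit_glm)))

# Point predictions with 95% confidence intervals for the mean response
new_pr <- data.frame(pr = c(103, 120))
pred <- predict(fit_lm, newdata = new_pr, interval = "confidence", level = 0.95)
print(pred)
```

The `interval` argument belongs to `predict()` for `lm` objects; for a `glm` fit, standard errors are requested with `se.fit = TRUE` instead.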
## Regression (Modeling, Multivariate Linear Regression and Logistic Regression)
```r=
# correlation matrix
chart.Correlation(data, histogram=TRUE, method = "spearman", pch=19)
# build a model (multivariate linear)
model2 <- glm(rr ~ pr + bp, data = data, family = gaussian)
summary(model2)
# build models (multivariate logistic); the three models differ only in the severity score used (RAPS, REMS, or MEWS)
model3 <- glm(outcome ~ rr + age + bp + RAPS, data = data, family = binomial)
summary(model3)
model4 <- glm(outcome ~ rr + age + bp + REMS, data = data, family = binomial)
summary(model4)
model5 <- glm(outcome ~ rr + age + bp + MEWS, data = data, family = binomial)
summary(model5)
```
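`summary()` reports logistic regression coefficients on the log-odds scale, which is hard to read directly. One common follow-up, sketched here on simulated data (the `sim` data frame, its effect sizes, and the names `fit_a` and `fit_b` are assumptions for illustration), is to exponentiate the coefficients into odds ratios and to compare candidate models by AIC.

```r
# Simulated stand-in for rr, age, and a binary outcome.
# These values are NOT from the HPVG dataset.
set.seed(3)
n <- 200
sim <- data.frame(
  rr  = rnorm(n, mean = 22, sd = 5),
  age = rnorm(n, mean = 65, sd = 12)
)
sim$outcome <- rbinom(n, size = 1,
                      prob = plogis(-8 + 0.15 * sim$rr + 0.06 * sim$age))

fit_a <- glm(outcome ~ rr + age, data = sim, family = binomial)
fit_b <- glm(outcome ~ rr, data = sim, family = binomial)

# Odds ratios with Wald 95% confidence intervals
or_table <- exp(cbind(OR = coef(fit_a), confint.default(fit_a)))
print(round(or_table, 3))

# Lower AIC suggests a better trade-off between fit and complexity
print(AIC(fit_a, fit_b))
```

Applied to the models above, `AIC(model3, model4, model5)` would compare the RAPS, REMS, and MEWS variants, provided all three are fitted on the same rows (no rows dropped for missing values in only one score).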