# Workshop
###### tags: `R` `Statistics` `workshop`
## Dataset
**Data Set Information:**
This dataset contains the medical records of **66 patients who had hepatic portal venous gas (HPVG)**. Each patient profile has **13 clinical variables**.
<br/>
**Attribute Information:**
* Patient No: id
* Sex (0:Female; 1:Male): sex
* Age (years): age
* Symptom onset to ED presentation (hours): p_time
* Body temperature (℃): bt
* Pulse rate (bpm): pr
* Respiratory rate (breaths/min): rr
* Mean arterial pressure (mmHg): bp
* Rapid Acute Physiology Score: RAPS
* Rapid Emergency Medicine Score: REMS
* Modified Early Warning Score: MEWS
* Management (0:Conservative; 1:Surgery): management
* ED presentation to operation (hours): op_time
* End outcome (0:Survival; 1:Death): outcome

Source: [PLoS ONE 12(9): e0184813](https://doi.org/10.1371/journal.pone.0184813)
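Several of the variables above are stored as 0/1 codes, so it can help to convert them to labelled factors right after import. The sketch below assumes a data frame named `hpvg` (a hypothetical name) and uses a handful of made-up rows, not the study data, with a few of the column names from the attribute list.

```r
# Made-up example rows for illustration; NOT the real HPVG data
hpvg <- data.frame(
  id         = 1:6,
  sex        = c(0, 1, 1, 0, 1, 0),
  age        = c(54, 71, 66, 80, 45, 62),
  management = c(0, 1, 0, 0, 1, 1),
  outcome    = c(0, 1, 0, 1, 0, 0)
)

# Recode the 0/1 columns as factors, using the coding from the attribute list
hpvg$sex <- factor(hpvg$sex, levels = c(0, 1),
                   labels = c("Female", "Male"))
hpvg$management <- factor(hpvg$management, levels = c(0, 1),
                          labels = c("Conservative", "Surgery"))
hpvg$outcome <- factor(hpvg$outcome, levels = c(0, 1),
                       labels = c("Survival", "Death"))

# Inspect the result and cross-tabulate two of the factors
str(hpvg)
table(hpvg$sex, hpvg$outcome)
```

With labelled factors, tables and plots show "Female"/"Male" instead of 0/1 without any extra relabelling.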
## Some Useful R Code and Packages
```r=
# load libraries
library(pwr)      # power analysis
library(openxlsx) # read and write Excel files
library(dplyr)    # data transformation
library(ggplot2)  # graphics
# import xxxx.txt (tab-separated) and name it data
data <- read.table("data/xxxx.txt", header = TRUE, sep = "\t")
# import xxxx.xlsx and name it data1
data1 <- readWorkbook("data/xxxx.xlsx")
# check the structure of the dataset
str(data)
# write a table to an xlsx file
write.xlsx(my_table, file = "table.xlsx", colNames = TRUE, rowNames = TRUE)
```
<br/>
### Basic R Functions
```r=
# calculate mean, SD, quantile
mean(data$x)
sd(data$x)
quantile(data$x)
summary(data$x)
# Shapiro-Wilk test for normality
shapiro.test(data$x)
# perform a two-sample t-test (Welch's t-test by default)
t.test(data$a, data$b)
# perform the Mann-Whitney U test
wilcox.test(data$a, data$b)
# make a contingency table
my_table <- table(data$a, data$b)
# make a table with defined labels
my_table2 <- table(Sex = data$a, Angina_type = data$b)
# change the row names
rownames(my_table) <- c("female","male")
# change the column names
colnames(my_table) <- c("Typical","Atypical","Non-anginal pain", "Asymptomatic")
# perform the chi-square test of independence
chisq.test(my_table)
```
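The normality test and the two comparison tests above are often used together. The following is a minimal sketch on simulated values (not the workshop data). It assumes one common convention: use the t-test only when both groups pass the Shapiro-Wilk test at the 0.05 level, otherwise fall back to the Mann-Whitney U test. That rule and the 0.05 cutoff are assumptions for illustration, not a prescription.

```r
set.seed(1)  # simulated values, for illustration only
a <- rnorm(30, mean = 50, sd = 10)  # roughly normal group
b <- rexp(30, rate = 1 / 50)        # right-skewed group

# Shapiro-Wilk p-value for each group
p_a <- shapiro.test(a)$p.value
p_b <- shapiro.test(b)$p.value

# Assumed convention: treat a group as non-normal when p < 0.05
if (p_a >= 0.05 && p_b >= 0.05) {
  result <- t.test(a, b)       # Welch two-sample t-test (R's default)
} else {
  result <- wilcox.test(a, b)  # Mann-Whitney U test
}

result$method   # which test was actually run
result$p.value
```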
<br/>
### dplyr
* Use a data frame and create a data frame
* Comparisons: >, >=, <, <=, !=, and ==
* Logical operator: & (and), | (or), and ! (not)
<br/>
**filter():** Pick observations by their values

```r=
# find male patients with heart failure
m_hf <- filter(data, sex == 1, target == 1)
str(m_hf)
# find patients with thalassemia
thal_p1 <- filter(data, thal == 2 | thal == 3)
str(thal_p1)
```
<br/>
**arrange():** Reorder the rows

```r=
# arrange in ascending order of thal
data_arr <- arrange(data, thal)
# arrange in descending order of thal
data_desc <- arrange(data, desc(thal))
```
<br/>
**select():** Pick variables by their names

```r=
# pick age, sex and ca columns
age_sex_ca <- select(data, age, sex, ca)
# pick the columns from cp to fbs
cp_to_fbs <- select(data, cp:fbs)
# remove the columns from cp to fbs
no_cp_to_fbs <- select(data, -(cp:fbs))
# rename the restecg column to ekg
new_data <- rename(data, ekg = restecg)
```
<br/>
**mutate():** Create new variables
**transmute():** Keep only the new variables

```r=
# add new columns age_sex and cp_fbs
new_columns <- mutate(data, age_sex = age - 10 * sex, cp_fbs = cp + fbs)
# save only the new columns
new_data <- transmute(data, age_sex = age - 10 * sex, cp_fbs = cp + fbs)
```
<br/>
**summarize():** Collapse many values into a single summary row
**group_by():** Operate group by group

```r=
# summarize the mean and SD of age and the total number of patients
summarize(data, age_mean = mean(age), sd = sd(age), n = n())
# group by sex and cp, then summarize each group
group <- group_by(data, sex, cp)
summarize(group, age_mean = mean(age), sd = sd(age), n = n())
# count the rows in each combination
data %>% count(sex, cp)
data %>% count(sex, target)
# separate the data by sex (0: Female; 1: Male)
# %>% is the pipe operator
female <- data %>% filter(sex == 0)
male <- data %>% filter(sex == 1)
```
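The verbs above can be chained with `%>%` into a single pipeline. Here is a minimal sketch on a few made-up rows (not the study data). The data frame name `pts` and the age cutoff of 50 are assumptions chosen only for illustration.

```r
library(dplyr)

# Made-up rows for illustration; NOT the study data
pts <- data.frame(
  sex     = c(0, 0, 1, 1, 1, 0),
  age     = c(54, 80, 71, 66, 45, 62),
  outcome = c(0, 1, 1, 0, 0, 0)
)

by_sex <- pts %>%
  filter(age >= 50) %>%   # keep patients aged 50 or older
  group_by(sex) %>%       # one group per sex code
  summarize(
    age_mean = mean(age), # mean age within each group
    age_sd   = sd(age),   # SD of age within each group
    n        = n()        # number of patients in each group
  )
by_sex
```

Each step receives the data frame produced by the previous one, so no intermediate objects are needed.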
<br/>
### ggplot2
```r=
# Histogram
ggplot(data = df, aes(x = x)) + geom_histogram(binwidth = 1)
# Dot plot
ggplot(data = df, aes(x = x, y = y)) +
  geom_dotplot(binaxis = "y", stackdir = "center", stackratio = 0.5, dotsize = 0.3)
# Box plot (x must be a factor for scale_x_discrete() to apply)
ggplot(data = df, aes(x = x, y = y)) + geom_boxplot() +
  scale_x_discrete(labels = c("0" = "Female", "1" = "Male"))
# Bar plot (stacked, then side by side)
ggplot(data = df, aes(x = x, fill = a)) + geom_bar()
ggplot(data = df, aes(x = x, fill = a)) +
  geom_bar(position = "dodge") +
  scale_x_discrete(labels = c("a1", "a2", "a3", "a4"))
```
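`scale_x_discrete()` only works when the x aesthetic is discrete, so a numeric 0/1 column has to be wrapped in `factor()` first. The sketch below shows this for a box plot, using a few made-up rows (not the study data) and a hypothetical data frame named `pts`.

```r
library(ggplot2)

# Made-up rows for illustration; NOT the study data
pts <- data.frame(
  sex = c(0, 0, 0, 1, 1, 1),
  age = c(54, 80, 62, 71, 66, 45)
)

# factor(sex) makes the x axis discrete so the 0/1 codes can be relabelled
p <- ggplot(data = pts, aes(x = factor(sex), y = age)) +
  geom_boxplot() +
  scale_x_discrete(labels = c("0" = "Female", "1" = "Male")) +
  labs(x = "Sex", y = "Age (years)")
p
```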
<br/>
## Today's Workshop
:::info
PLOS journals require authors to make all data necessary to replicate their study’s findings publicly available without restriction at the time of publication. As a result, we can easily download the raw data behind a published article. Today, we will use a dataset from the PLOS ONE website to perform the analysis. Our goal is to **re-create Table 1 from the dataset** and to **identify any misuse of statistical methods**.
:::
<br/>
---
---
### Rapid Emergency Medicine Score: A novel prognostic tool for predicting the outcomes of adult patients with hepatic portal venous gas in the emergency department
### Table 1

### Methods
::: success
**Statistical Analysis:**
Numerical and categorical variables are shown as mean ± SD, and frequencies are displayed as percentages (%). Univariate analyses were applied to study the association between predictors and mortality, while categorical and numerical variables were analyzed with a chi-square test and two-sample t-test respectively.
:::
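One possible way to assemble a Table-1-style comparison of survivors and non-survivors with the two tests named in the Methods is sketched below. It runs on simulated data (not the study data), and the object names `pts` and `table1` are hypothetical. Whether each test's assumptions actually hold for a given variable is left as part of the exercise.

```r
set.seed(42)  # simulated data for illustration; NOT the study data
n <- 66
pts <- data.frame(
  age     = round(rnorm(n, mean = 65, sd = 12)),
  sex     = rbinom(n, 1, 0.6),  # 0 = Female, 1 = Male
  outcome = rbinom(n, 1, 0.4)   # 0 = Survival, 1 = Death
)

# Numerical variable: mean and SD by outcome, compared with a two-sample t-test
age_mean <- tapply(pts$age, pts$outcome, mean)
age_sd   <- tapply(pts$age, pts$outcome, sd)
age_p    <- t.test(age ~ outcome, data = pts)$p.value

# Categorical variable: counts and column percentages, compared with a
# chi-square test (R applies Yates' continuity correction to 2x2 tables)
sex_tab <- table(Sex = pts$sex, Outcome = pts$outcome)
sex_pct <- round(100 * prop.table(sex_tab, margin = 2), 1)
sex_p   <- chisq.test(sex_tab)$p.value

# Assemble a small summary table, one row per variable
table1 <- data.frame(
  variable = c("Age, mean (SD)", "Male, n (%)"),
  survival = c(sprintf("%.1f (%.1f)", age_mean[["0"]], age_sd[["0"]]),
               sprintf("%d (%.1f)", sex_tab["1", "0"], sex_pct["1", "0"])),
  death    = c(sprintf("%.1f (%.1f)", age_mean[["1"]], age_sd[["1"]]),
               sprintf("%d (%.1f)", sex_tab["1", "1"], sex_pct["1", "1"])),
  p_value  = round(c(age_p, sex_p), 3)
)
table1
```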
---
---