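# W3: Correlation Analysis - ANOVA and Linear Regression
###### tags: `Statistics Course Series`
## Variables, Association, Association Analysis Models, and Correlation Coefficients
### Chi-Square Goodness-of-Fit Analysis (Categorical Y, Categorical X)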
```
# Simulate gender (X) and whether each person wears glasses (Y: 'G' or 'NG')
X=c(rep('F',10),rep('M',10))
Y=character(20)
for(i in 1:20){
  # P(glasses) is 0.25 for 'F' and 0.45 for 'M'
  p=if(X[i]=='F') 0.25 else 0.45
  Y[i]=sample(c('G','NG'),size=1,prob=c(p,1-p))
}
cbind(Y,X)
# Cross-tabulate glasses by gender
A=list(Glasses=Y,Gender=X)
table(A)
```
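The cross table above can be tested for association between gender and glasses with `chisq.test()`. The following is a minimal sketch, assuming a fixed seed and a hypothetical larger sample (100 people per gender, neither of which is in the original) so that the expected cell counts are large enough for the chi-square approximation.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
n=100        # assumption: 100 people per gender instead of 10
X=c(rep('F',n),rep('M',n))
p=ifelse(X=='F',0.25,0.45)        # same probabilities as the simulation above
Y=ifelse(runif(2*n)<p,'G','NG')   # draw 'G' with probability p
tab=table(Glasses=Y,Gender=X)
tab
# Pearson chi-square test on the 2x2 table (1 degree of freedom)
test=chisq.test(tab)
test
```

With only 10 people per gender, as in the original simulation, `chisq.test()` is likely to warn that the approximation may be inaccurate; `fisher.test(tab)` is a common alternative for small tables.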
### Logistic Regression (Categorical Y, Numeric X)
```
X=50:75
# P(Y=1) follows a logistic curve centered at X=55:
# plogis(X-55) equals exp(-55+X)/(1+exp(-55+X))
Y=rbinom(length(X),1,prob=plogis(X-55))
cbind(Y,X)
plot(X,Y)
```
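Data of this kind can be fitted with `glm()` and `family=binomial`. The sketch below is illustrative only: the seed, the four replicates per X value, and the gentler slope (0.3, centered at 60) are assumptions that are not in the original. They are chosen because the original curve is steep enough that the 0s and 1s may be almost perfectly separated, which tends to make the fit unstable.

```r
set.seed(1)                   # assumption: fixed seed
X=rep(50:75,each=4)           # assumption: 4 replicates per X value
# assumption: gentler logistic curve, slope 0.3 centered at X=60
Y=rbinom(length(X),1,prob=plogis(0.3*(X-60)))
fit=glm(Y~X,family=binomial)  # logistic regression
summary(fit)
# Overlay the fitted probabilities on the raw 0/1 data
plot(X,Y)
lines(X,fitted(fit))
```

The coefficient of `X` is on the log-odds scale, so it should be compared with the slope used in the simulation (0.3 here).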
### ANOVA (Numeric Y, Categorical X)
```
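A=iris
# Show the numeric response next to the species labels
# (cbind() would turn the Species factor into integer codes)
A[,c('Sepal.Length','Species')]
boxplot(Sepal.Length~Species,data=A)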
```
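The boxplot can be followed by a one-way ANOVA on the same data. This is a minimal sketch using `aov()` and `TukeyHSD()` from base R; the choice of a Tukey follow-up is an assumption, not something the original notes prescribe.

```r
# One-way ANOVA: does mean sepal length differ between species?
fit=aov(Sepal.Length~Species,data=iris)
res=summary(fit)
res
# Pairwise comparisons between species (assumption: Tukey HSD as follow-up)
TukeyHSD(fit)
```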
### Regression (Numeric Y, Numeric X)
```
X=1:20
# True line: intercept 5, slope 0.1, Gaussian noise with sd 0.2
Y=5+0.1*X+rnorm(20,0,0.2)
cbind(Y,X)
plot(X,Y)
```
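For two numeric variables, the correlation coefficient and a fitted line summarize the relationship. The sketch below repeats the simulation with a fixed seed (an assumption added for reproducibility) and then calls `cor()` and `lm()`.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
X=1:20
Y=5+0.1*X+rnorm(20,0,0.2)
cor(X,Y)     # Pearson correlation coefficient
fit=lm(Y~X)  # simple linear regression
coef(fit)    # estimates should be near intercept 5 and slope 0.1
plot(X,Y)
abline(fit)  # add the fitted line to the scatter plot
```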
### Mixed Model (Numeric Y, Numeric X + Categorical X)
```
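# CO2: built-in data set with numeric (conc, uptake) and categorical (Plant, Type, Treatment) columns
A=CO2
A
# Scatter-plot matrix of all columns
plot(A)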
```
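A model with both a numeric and a categorical predictor can be fitted with `lm()`. In the sketch below, using `uptake` as the response and `conc` and `Type` as predictors is an assumption made for illustration; the original notes only plot the data. This is an ordinary linear model with mixed predictor types, not a mixed-effects model.

```r
# Numeric predictor (conc) plus categorical predictor (Type)
# assumption: uptake is the response of interest
fit=lm(uptake~conc+Type,data=CO2)
summary(fit)
anova(fit)
```

The coefficient of `conc` is the common slope, and the `Type` coefficient shifts the intercept between the two plant origins.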
## ANOVA
```
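# Factors: flavor (X1), packaging (X2), spokesperson (X3); 30 observations
X1=c(rep('apple',10),rep('grape',10),rep('orange',10))
X2=rep(c(rep('PET',5),rep('Al',5)),3)
X3=rep(c(rep('Jay',5),rep('Jolin',5)),3)
A1=numeric(30)  # flavor effect
A2=numeric(30)  # packaging effect
A3=numeric(30)  # spokesperson effect
B=numeric(30)   # flavor-by-spokesperson interaction effect
for(i in 1:30){
  A1[i]=switch(X1[i],'apple'=4,'grape'=-2,'orange'=-2)
  A2[i]=switch(X2[i],'PET'=-1,'Al'=1)
  A3[i]=switch(X3[i],'Jay'=-2,'Jolin'=2)
  # Only the orange/Jolin combination gets a boost
  if(X1[i]=='orange'&&X3[i]=='Jolin'){
    B[i]=5
  }else{
    B[i]=-1
  }
}
mu=20  # grand mean
Y1=mu+A1+rnorm(30,0,1)       # Case 1: flavor only
Y2=mu+A1+A2+rnorm(30,0,1)    # Case 2: flavor + packaging
Y3=mu+A1+A3+B+rnorm(30,0,1)  # Case 3: flavor + spokesperson + interaction
boxplot(Y1~X1)
boxplot(Y2~X1+X2)
boxplot(Y3~X1+X3)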
```
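The interaction built into `Y3` (the orange/Jolin boost) is easier to see from cell means than from boxplots. The sketch below regenerates the Case 3 data in vectorized form with a fixed seed (both are assumptions for a self-contained example) and draws an interaction plot; the names `cell`, `flavor`, and `spokesperson` are introduced here for illustration.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
X1=rep(c('apple','grape','orange'),each=10)
X3=rep(rep(c('Jay','Jolin'),each=5),3)
A1=c(apple=4,grape=-2,orange=-2)[X1]     # flavor effect by name lookup
A3=c(Jay=-2,Jolin=2)[X3]                 # spokesperson effect
B=ifelse(X1=='orange'&X3=='Jolin',5,-1)  # interaction effect
Y3=20+A1+A3+B+rnorm(30,0,1)
# Mean of Y3 in each flavor-by-spokesperson cell
cell=tapply(Y3,list(X1,X3),mean)
cell
# Non-parallel lines suggest an interaction
flavor=factor(X1)
spokesperson=factor(X3)
interaction.plot(flavor,spokesperson,Y3)
```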
### Case 1: Sales vs. flavor (1-Way CRD)
### Case 2: Sales vs. flavor and packaging (1-Way RBD)
### Case 3: Sales vs. flavor and spokesperson (2-Way CRD)
```
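# Fit one linear model per case; L3t adds the X1:X3 interaction to L3
L1=lm(Y1~X1)
L2=lm(Y2~X1+X2)
L3=lm(Y3~X1+X3)
L3t=lm(Y3~X1*X3)
# ANOVA tables (sequential sums of squares)
anova(L1)
anova(L2)
anova(L3)
anova(L3t)
# Coefficient estimates and R-squared
summary(L1)
summary(L2)
summary(L3)
summary(L3t)
# Information criteria (lower is better; compare only models of the same response)
AIC(L1)
AIC(L2)
AIC(L3)
AIC(L3t)
BIC(L1)
BIC(L2)
BIC(L3)
BIC(L3t)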
```
```
> anova(L1)
Analysis of Variance Table

Response: Y1
          Df  Sum Sq Mean Sq F value    Pr(>F)
X1         2 229.071 114.536  167.29 6.133e-16 ***
Residuals 27  18.486   0.685
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> anova(L2)
Analysis of Variance Table

Response: Y2
          Df  Sum Sq Mean Sq F value    Pr(>F)
X1         2 238.957 119.478 168.865 1.272e-15 ***
X2         1  27.979  27.979  39.544 1.175e-06 ***
Residuals 26  18.396   0.708
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> anova(L3)
Analysis of Variance Table

Response: Y3
          Df Sum Sq Mean Sq F value    Pr(>F)
X1         2 214.78 107.388  25.846 6.604e-07 ***
X3         1 255.94 255.938  61.598 2.527e-08 ***
Residuals 26 108.03   4.155
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> anova(L3t)
Analysis of Variance Table

Response: Y3
          Df  Sum Sq Mean Sq F value    Pr(>F)
X1         2 214.775 107.388  84.892 1.302e-11 ***
X3         1 255.938 255.938 202.323 3.431e-13 ***
X1:X3      2  77.669  38.835  30.700 2.427e-07 ***
Residuals 24  30.360   1.265
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> summary(L1)

Call:
lm(formula = Y1 ~ X1)

Residuals:
     Min       1Q   Median       3Q      Max
-1.71060 -0.50629 -0.02071  0.41901  1.99355

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  23.9926     0.2617   91.69  < 2e-16 ***
X1grape      -6.1636     0.3700  -16.66 9.93e-16 ***
X1orange     -5.5042     0.3700  -14.88 1.58e-14 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.8274 on 27 degrees of freedom
Multiple R-squared:  0.9253, Adjusted R-squared:  0.9198
F-statistic: 167.3 on 2 and 27 DF,  p-value: 6.133e-16
> summary(L2)

Call:
lm(formula = Y2 ~ X1 + X2)

Residuals:
     Min       1Q   Median       3Q      Max
-1.72875 -0.41385 -0.06592  0.54663  1.48703

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  24.7133     0.3071  80.461  < 2e-16 ***
X1grape      -5.7358     0.3762 -15.248 1.76e-14 ***
X1orange     -6.2099     0.3762 -16.508 2.68e-15 ***
X2PET        -1.9315     0.3071  -6.288 1.17e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.8412 on 26 degrees of freedom
Multiple R-squared:  0.9355, Adjusted R-squared:  0.9281
F-statistic: 125.8 on 3 and 26 DF,  p-value: 1.349e-15
> summary(L3)

Call:
lm(formula = Y3 ~ X1 + X3)

Residuals:
    Min      1Q  Median      3Q     Max
-3.1310 -1.7087  0.0917  1.5084  3.8895

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  20.2679     0.7443  27.230  < 2e-16 ***
X1grape      -6.5538     0.9116  -7.189 1.23e-07 ***
X1orange     -3.2304     0.9116  -3.544  0.00152 **
X3Jolin       5.8417     0.7443   7.848 2.53e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.038 on 26 degrees of freedom
Multiple R-squared:  0.8133, Adjusted R-squared:  0.7918
F-statistic: 37.76 on 3 and 26 DF,  p-value: 1.271e-09
> summary(L3t)

Call:
lm(formula = Y3 ~ X1 * X3)

Residuals:
    Min      1Q  Median      3Q     Max
-2.0256 -0.8606 -0.1026  0.7768  1.7753

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)       21.5324     0.5030  42.809  < 2e-16 ***
X1grape           -6.8121     0.7113  -9.577 1.14e-09 ***
X1orange          -6.7655     0.7113  -9.511 1.30e-09 ***
X3Jolin            3.3127     0.7113   4.657 9.93e-05 ***
X1grape:X3Jolin    0.5167     1.0060   0.514    0.612
X1orange:X3Jolin   7.0702     1.0060   7.028 2.88e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.125 on 24 degrees of freedom
Multiple R-squared:  0.9475, Adjusted R-squared:  0.9366
F-statistic: 86.7 on 5 and 24 DF,  p-value: 1.462e-14
>
> AIC(L1)
[1] 78.61046
> AIC(L2)
[1] 80.46443
> AIC(L3)
[1] 133.5724
> AIC(L3t)
[1] 99.49406
>
> BIC(L1)
[1] 84.21525
> BIC(L2)
[1] 87.47042
> BIC(L3)
[1] 140.5784
> BIC(L3t)
[1] 109.3024
```
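In the output above, `L3t` has a lower AIC (99.49 vs. 133.57) and BIC (109.30 vs. 140.58) than `L3`, which favors keeping the `X1:X3` interaction. AIC and BIC are only comparable between models fitted to the same response, so `L1` and `L2` (fitted to `Y1` and `Y2`) should not be ranked against `L3` and `L3t`. Because `L3` is nested in `L3t`, the two can also be compared with an F test. The sketch below regenerates the Case 3 data with a fixed seed (an assumption, so its numbers will differ from the output above); the name `cmp` is introduced here for illustration.

```r
set.seed(1)  # assumption: fixed seed; values differ from the run shown above
X1=rep(c('apple','grape','orange'),each=10)
X3=rep(rep(c('Jay','Jolin'),each=5),3)
B=ifelse(X1=='orange'&X3=='Jolin',5,-1)  # interaction effect
Y3=20+c(apple=4,grape=-2,orange=-2)[X1]+c(Jay=-2,Jolin=2)[X3]+B+rnorm(30,0,1)
L3=lm(Y3~X1+X3)    # additive model
L3t=lm(Y3~X1*X3)   # model with the X1:X3 interaction
cmp=anova(L3,L3t)  # F test for the two extra interaction terms
cmp
AIC(L3,L3t)        # lower AIC is preferred
```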