---
tags: ISLR
---

# ISLR hw3

Range: ch2 ex3, 7; ch3 ex1, 3, 8

1. ch02 Q3

   a. pass

   b. `training loss`: decreases as flexibility increases, because the model gains parameters and can fit ever more complicated patterns in the training data (eventually overfitting).

      `testing loss`: decreases at first, because the model picks up more and more genuine patterns as flexibility grows. Once the model starts overfitting, however, it learns spurious patterns from the training set and the testing loss rises again, giving a U shape.

      `bias`: a more flexible model makes fewer assumptions about the true function, so bias decreases as flexibility increases.

      `variance`: variance increases, because the extra parameters make the fit more sensitive to the particular training sample.

2. ch02 Q7

   a. Euclidean distances from the test point to obs1–obs6: 3.00, 2.00, 3.16, 2.24, 1.41, 1.73

   b. K = 1: the nearest neighbor is obs5, so the prediction is **Green**.

   c. K = 3: the three nearest neighbors are obs2, obs5, and obs6 (two Red, one Green), so the prediction is **Red**.

   d. If the Bayes decision boundary is highly non-linear, a smaller K is better: a larger K averages over many neighbors, which smooths the boundary, makes it cross the true boundary, and captures more wrongly-labeled points.

3. ch03 Q1

   a. intercept: the null hypothesis is that sales are zero when all three budgets are zero. Result: p < 0.0001, so sales **will not** be zero when the three variables are zero.

   b. TV

   c. radio: for TV and radio, the null hypothesis is that the change in sales from increasing the budget is zero. Result: p < 0.0001, so the change in sales **will not** be zero when TV or radio spending increases.

   d. newspaper: the null hypothesis is that the change in sales from increasing the newspaper budget is zero. Result: p = 0.8599, so we cannot reject it: the change in sales **will** be (statistically indistinguishable from) zero when newspaper spending increases.

4. ch03 Q3

   Y = 50 + 20 * GPA + 0.07 * IQ + 35 * Gender + 0.01 * GPA * IQ + (-10) * GPA * Gender

   a. Keeping only the Gender terms (Gender = 1 for female): Y = 35 * Gender - 10 * GPA * Gender.

      Y(Female) = 35 - 10 * GPA, Y(Male) = 0, so Y(Male) - Y(Female) = 10 * GPA - 35.

      i. false: males earn more only when GPA > 3.5, not always.

      ii. false: females earn more only when GPA < 3.5, not always.

      iii. true: for a fixed IQ and GPA, males earn more on average provided the GPA is high enough (GPA > 3.5).

      iv. false: it is males, not females, who earn more when the GPA is high enough.

      **iii. is correct**

   b.
      Y = 50 + 20 * 4 + 0.07 * 110 + 35 * 1 + 0.01 * 4 * 110 - 10 * 4 * 1
        = 50 + 80 + 7.7 + 35 + 4.4 - 40
        = 137.1

   c. False. The magnitude of a coefficient says nothing about significance: the GPA * IQ product is on a much larger scale than the other predictors, so even a tiny coefficient can correspond to a real effect. Evidence for the interaction should come from its standard error and p-value, not from the size of the estimate.

5. ch03 Q8

   ```r
   require(ISLR)
   data(Auto)
   auto_lm <- lm(mpg ~ horsepower, data = Auto)
   summary(auto_lm)
   ```

   a. i. p < 0.001, so there is strong evidence of a relationship between horsepower and mpg.

      ii. The p-value is essentially zero, so the evidence for the relationship is strong.

      iii. Negative: the Estimate column shows -0.157845 for horsepower.

      iv.

      ```r
      new_data <- data.frame(horsepower = 98)
      predict(auto_lm, new_data, interval = "confidence")
      predict(auto_lm, new_data, interval = "prediction")
      ```

   b.

      ```r
      plot(Auto$horsepower, Auto$mpg)
      abline(auto_lm, col = "green")
      ```

   c.

      ```r
      par(mfrow = c(2, 2))
      plot(auto_lm)
      ```

      The Residuals vs Fitted plot shows a clear curved pattern, suggesting that a non-linear model would fit better.
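The distance and KNN answers in Q7 can be checked numerically. A minimal R sketch, assuming the observation coordinates and colors from the table in ISLR ch2 ex7:

```r
# Observations from ISLR ch2 ex7: columns are X1, X2, X3
X <- matrix(c( 0, 3, 0,
               2, 0, 0,
               0, 1, 3,
               0, 1, 2,
              -1, 0, 1,
               1, 1, 1), ncol = 3, byrow = TRUE)
y <- c("Red", "Red", "Red", "Green", "Green", "Red")

# Euclidean distance from each observation to the test point (0, 0, 0)
d <- sqrt(rowSums(X^2))
round(d, 2)   # 3.00 2.00 3.16 2.24 1.41 1.73

# K = 1: class of the single nearest neighbor (obs5)
y[order(d)[1]]   # "Green"

# K = 3: majority class among the three nearest neighbors (obs5, obs6, obs2)
names(which.max(table(y[order(d)[1:3]])))   # "Red"
```

The majority vote uses `table()` plus `which.max()`, which resolves ties alphabetically; with an odd K and two classes, no tie can occur here.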
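The fitted model in Q3 can be evaluated directly to confirm both the GPA > 3.5 threshold in part (a) and the 137.1 prediction in part (b). A small sketch, wrapping the given coefficients in a helper function (`salary` is just a name chosen here, not from the exercise):

```r
# Fitted model from ISLR ch3 ex3; Gender = 1 for female, 0 for male
salary <- function(gpa, iq, gender) {
  50 + 20 * gpa + 0.07 * iq + 35 * gender + 0.01 * gpa * iq - 10 * gpa * gender
}

# Part (a): for fixed IQ, Y(Female) - Y(Male) = 35 - 10 * GPA,
# so males move ahead once GPA exceeds 3.5
salary(3.4, 100, 0) - salary(3.4, 100, 1)   # negative: female ahead below 3.5
salary(3.6, 100, 0) - salary(3.6, 100, 1)   # positive: male ahead above 3.5

# Part (b): female, IQ = 110, GPA = 4.0
salary(4.0, 110, 1)   # 137.1
```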
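As a sanity check on the `lm()` fit in Q8, the simple-regression coefficients can be recomputed from the closed-form formulas beta1 = cov(x, y) / var(x) and beta0 = mean(y) - beta1 * mean(x). The sketch below substitutes the built-in `mtcars` data (`mpg ~ hp`) so it runs without the ISLR package; the same check applies verbatim to `mpg ~ horsepower` on `Auto`:

```r
# Closed-form simple linear regression coefficients
x <- mtcars$hp
y <- mtcars$mpg
beta1 <- cov(x, y) / var(x)
beta0 <- mean(y) - beta1 * mean(x)

# lm() should reproduce the same intercept and slope
fit <- lm(mpg ~ hp, data = mtcars)
rbind(closed_form = c(beta0, beta1),
      lm_fit      = unname(coef(fit)))

# As with horsepower in Auto, the slope is negative:
# higher horsepower is associated with lower mpg
```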