:::spoiler
[Toc]
:::
# Statistics Exam Grading Criteria
## Exam 1 (total: 109 points)
### Part 1 (39% in total)
#### Question 1.1 (10%)
> 1.1. Please use the guideline of 2^k ≥ n to determine the appropriate number of classes and then construct a frequency histogram.
>[color=red]
- Correct number of classes (k value): 5% (k = 6)
- Correct histogram: class width w: 2%; plot: 3% (w = 50)

#### Question 1.2 (8%)
>1.2. Please calculate the sample mean, sample variance, sample standard deviation, and median of studying hours spent for passing the exam.[color=red]
- 2% each: mean, variance, standard deviation, median
    - mean = 88.58
    - variance = 6134.94
    - stdev = 78.33
    - median = 66
#### Question 1.3 (6%)
>1.3. Please calculate the first quartile, the third-quartile, and the interquartile range of studying hours spent for passing the exam.
- 2% each: 1st quartile, 3rd quartile, interquartile range
    - 1st quartile = 26.25
    - 3rd quartile = 132.5
    - interquartile range = 132.5 - 26.25 = 106.25
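#### Question 1.4 (5%)
>1.4. Please construct a box-and-whisker plot of studying hours spent for passing the exam.

- Box-and-whisker plot
- Unclear or drawn incorrectly: 0%
- Numbers not clearly labeled, or plot only roughly similar: 3%
- Numbers clearly labeled and plot clean and unambiguous: 5%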
#### Question 1.5 (5%)
>1.5. According to the distribution pattern of 1.1 & 1.4, please judge the degree of skewness. Is it a right- or left-skewed distribution?
- right-skewed distribution 5%
- Reasoning is graded leniently
#### Question 1.6 (5%)
>1.6. According to the conclusion of 1.5, please suggest a number to indicate the center location of studying hours spent for passing the exam.
- For a right-skewed distribution, the median, **==66==**
- Full credit only if the answer gives 66 or identifies the median
### Part 2 (15% in total)
#### Question 2.1 (10%)
>2.1. Plot the sales and profits data on the same line graph.

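- If the graph itself is correct (deduct from 10%)
- Sales and Profits share a single vertical axis instead of separate left and right axes: -2%
- Lines not clearly labeled as Sales or Profits: -2%
- Vertical-axis units not clearly labeled: -2%
- The horizontal axis is obviously years, so omitting its unit is fine: -0%
- Incorrect graph (0%)
#### Question 2.2 (5%)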
> 2.2. Discuss the relationship between sales and profits over time.
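- Reasoning is graded leniently (maximum 5%)
- Discusses growth rates (both grew, sales grew faster than profits, etc.): +3%
- Discusses profits declining in occasional years: +2%
- Discusses sales growing continuously: +2%
### Part 3 (38% in total)
#### Question 3.1 (4%)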
> 3.1.Please calculate the sample mean and standard deviation of diameter. (4%)
- 2% each: mean, standard deviation
    - mean = 1.685
    - stdev = 0.003825486
#### Question 3.2 (5%)
>3.2. The quality policy of ABC is to assure that the diameter of a golf ball shall not be less than 1.682 inches. Please calculate the proportion of unqualified balls in this inspection. (5%)
- 5%: correct percentage computed
    - balls with diameter < 1.682: 10
    - proportion = 10/45 = 0.222
#### Question 3.3 (12%)
>3.3. Please calculate the sample relative frequency to compute the probabilities of ABC golf balls whose diameters are within the intervals x̄ ± 2s, x̄ ± 3s, and x̄ ± 4s, respectively. (12%)
- Sample relative frequencies (2% each x 3), then the computed probabilities (2% each x 3)

| | x̄ ± 2s | x̄ ± 3s | x̄ ± 4s |
| --- | --- | --- | --- |
| frequency | 43 | 45 | 45 |
| prob. | 0.956 | 1.00 | 1.00 |
#### Question 3.4 (12%)
>3.4. Please use Tchebysheff's theorem to indicate the percentages of diameters located within the intervals x̄ ± 2s, x̄ ± 3s, and x̄ ± 4s, respectively.
- 4% each
    1. x̄ ± 2s: 1 - 1/4 = 0.75
    2. x̄ ± 3s: 1 - 1/9 = 0.889
    3. x̄ ± 4s: 1 - 1/16 = 0.9375
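
The three bounds above follow from Tchebysheff's theorem, which guarantees that at least 1 - 1/k² of the observations lie within k standard deviations of the mean. A minimal Python sketch (the function name `chebyshev_lower_bound` is illustrative and not part of the exam):

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2


# Bounds for the three intervals asked about in question 3.4.
bounds = {k: chebyshev_lower_bound(k) for k in (2, 3, 4)}
for k, bound in bounds.items():
    print(f"within {k} standard deviations: at least {bound:.4f}")
```

These are lower bounds that hold for any distribution, which is why the observed relative frequencies in 3.3 can be larger.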
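#### Question 3.5 (5%)
> 3.5. Please compare and explain the results of 3.3 & 3.4.
- Reasoning is graded leniently (maximum 5%)
- Points out that Tchebysheff's theorem is more conservative (gives a lower estimate): +5%
- Points out that Tchebysheff's theorem gives a smaller value than the actual data (gives a lower estimate): +5%
- Compares the two results (discusses both): +5%
### Part 4 (17% in total)
#### Question 4.1 (8%)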
> 4.1. Please calculate the sample mean and standard deviation of in-state tuition and out-of-state tuition, respectively.
- 2% each

| 4.1 | in-state tuition | out-of-state tuition |
| --- | --- | --- |
| mean | 8541.775766 | 9933.381616 |
| stdev | 5254.337716 | 3920.070686 |
#### Question 4.2 (4%)
> 4.2. Please determine the coefficient of variation of in-state tuition and out-of-state tuition, respectively.
- 2% each

| 4.2 | in-state tuition | out-of-state tuition |
| --- | --- | --- |
| C.V. | 0.61513412 | 0.39463607 |
#### Question 4.3 (5%)
> 4.3. One guy got the math & verbal SAT scores, 550 and 500 respectively. According to the sample data of Colleges and Universities, does this guy perform better than peers in math?
- Reasoning is graded leniently

| 4.3 | Math SAT | Verbal SAT |
| --- | --- | --- |
| mean | 504.5320334 | 459.3579387 |
| stdev | 65.22071489 | 56.08276174 |
| standardized score | 0.69713996 | 0.724680098 |

- States that he performed better than his peers: +5%
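
The standardized scores above can be reproduced as follows. This is a small Python sketch; `z_score` is a hypothetical helper name, and the means and standard deviations are the values listed in the answer.

```python
def z_score(x: float, mean: float, stdev: float) -> float:
    """Number of standard deviations that x lies above the mean."""
    return (x - mean) / stdev


math_z = z_score(550, 504.5320334, 65.22071489)
verbal_z = z_score(500, 459.3579387, 56.08276174)

print(f"math z = {math_z:.4f}, verbal z = {verbal_z:.4f}")
# A positive z-score means the score is above the sample mean.
print("above average in math:", math_z > 0)
```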
## Exam 2 (total: 115 points)
### Part 1 (10% in total)
#### Question 1.a (5%)
> Compute the probability of spotting more than 12 lions on a day trip to the safari park.
- This is a Poisson distribution with λ = 12.
- X ~ Poisson(λ = 12)
- P(X > 12) = 1 - P(X ≤ 12) = 0.424034751
#### Question 1.b (5%)
> If you will have a two-day trip on the safari park, what is the probability of spotting 5 or less lions?
- A two-day trip, so t = 2
- X ~ Poisson(λt = 24)
- P(X ≤ 5) = 3.12567E-06 (=POISSON.DIST(5,24,TRUE)). Therefore, it is very rare to spot 5 or fewer lions on a two-day trip.
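
Both answers can be checked without Excel by summing the Poisson probability mass function directly. The sketch below uses only the Python standard library; `poisson_cdf` is an illustrative helper, not a library function.

```python
import math


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam), summed term by term."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))


# 1.a: one day, lambda = 12, more than 12 lions.
p_more_than_12 = 1 - poisson_cdf(12, 12)

# 1.b: two days, lambda * t = 24, at most 5 lions.
p_at_most_5 = poisson_cdf(5, 24)

print(f"P(X > 12)  = {p_more_than_12:.6f}")
print(f"P(X <= 5) = {p_at_most_5:.6e}")
```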
### Part 2 (25% in total)
#### Question 2.a (10%)
> Compute the mean and standard deviation for the sample data in the file.
- daily mean cost = 2.5084
- daily standard deviation cost = 0.1086
#### Question 2.b (5%)
> Assuming the sample came from a normally distributed population and the sample standard deviation is a
good approximation for the population standard deviation, determine the probability that a randomly chosen
transaction would yield a price of $2.3 or less even though the population mean was $2.51.
- X ~ N(2.51, 0.1086^2), z = (2.3-2.51)/0.1086 = -1.934
- P(X ≤ 2.3) = P(Z ≤ (2.3-2.51)/0.1086) = P(Z ≤ -1.934) = 0.026570716
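
As a cross-check, the same probability can be computed with `statistics.NormalDist` from the Python standard library. This sketch assumes, as the question does, that the sample standard deviation 0.1086 stands in for the population value.

```python
from statistics import NormalDist

# Price model from the question: mean 2.51, standard deviation 0.1086.
price = NormalDist(mu=2.51, sigma=0.1086)

z = (2.3 - 2.51) / 0.1086   # standardized value of 2.3
p_low = price.cdf(2.3)      # P(X <= 2.3)

print(f"z = {z:.3f}, P(X <= 2.3) = {p_low:.6f}")
```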
#### Question 2.c (10%)
> If the national health agency wants to reduce the probability of the daily price of lung disease treatment that
is $2.55 or more to 5% at most, what would the new average cost of lung disease treatment be after the
medical and pharmaceutical department has enhanced?
- X~N(μ, 0.1086^2)
- P(X ≥ 2.55) = P(Z ≥ (2.55-μ)/0.1086) ≤ 0.05, with z0.05 = 1.645
- So, (2.55-μ)/0.1086 = 1.645
- μ = 2.55 - 0.1086 x 1.645 = 2.371
- The medical and pharmaceutical department has to reduce the average cost of lung disease treatment to a new level, ==2.371==
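
The required mean can also be found by inverting the normal CDF instead of reading 1.645 from a table. A sketch with the standard library, keeping the standard deviation fixed at 0.1086 as in the answer above:

```python
from statistics import NormalDist

sigma = 0.1086
z_95 = NormalDist().inv_cdf(0.95)   # upper 5% point of Z, about 1.645

# Largest mean for which P(X >= 2.55) is still at most 5%.
new_mean = 2.55 - z_95 * sigma

# Verify: the tail probability at 2.55 should be 0.05 under the new mean.
p_exceed = 1 - NormalDist(new_mean, sigma).cdf(2.55)

print(f"z = {z_95:.3f}, new mean = {new_mean:.3f}, P(X >= 2.55) = {p_exceed:.3f}")
```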
### Part 3 (30% in total)
#### Question 3.a (10%)
> Produce a frequency distribution of these data. Convert the frequency distribution into probability
distribution using the relative frequency assessment method.
- max = 33, min = 17

| days of protection (xi) | frequency | probability, p(xi) | cumulative probability | xi * p(xi) | (xi - mean)^2 * p(xi) |
| ----------------------- | --------- | ----------------- | ------------------------ | ----------- | ----------- |
| 17 | 1 | 0.0048 | 0.0048 | 0.080952381 | 0.777870208 |
| 18 | 1 | 0.0048 | 0.0095 | 0.085714286 | 0.660908757 |
| 19 | 0 | 0.0000 | 0.0095 | 0 | 0 |
| 20 | 0 | 0.0000 | 0.0095 | 0 | 0 |
| 21 | 0 | 0.0000 | 0.0095 | 0 | 0 |
| 22 | 0 | 0.0000 | 0.0095 | 0 | 0 |
| 23 | 0 | 0.0000 | 0.0095 | 0 | 0 |
| 24 | 1 | 0.0048 | 0.0143 | 0.114285714 | 0.15914005 |
| 25 | 3 | 0.0143 | 0.0286 | 0.357142857 | 0.326535795 |
| 26 | 0 | 0.0000 | 0.0286 | 0 | 0 |
| 27 | 3 | 0.0143 | 0.0429 | 0.385714286 | 0.110481374 |
| 28 | 26 | 0.1238 | 0.1667 | 3.466666667 | 0.392697981 |
| 29 | 51 | 0.2429 | 0.4095 | 7.042857143 | 0.148115322 |
| 30 | 53 | 0.2524 | 0.6619 | 7.571428571 | 0.012109707 |
| 31 | 38 | 0.1810 | 0.8429 | 5.60952381 | 0.268909189 |
| 32 | 25 | 0.1190 | 0.9619 | 3.80952381 | 0.586210992 |
| 33 | 8 | 0.0381 | 1.0000 | 1.257142857 | 0.39475305 |
| - | - | - | Mean of protected days (variance in the last column): | 29.78 | 3.837732 |
| - | - | - | - | Stdev of protected days: | 1.959013 |

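- Grading criteria for the chart:
- Broadly similar (+5%)
- Has both left and right axes, for frequency and probability: +2
- Two chart types are used: +2
- Units are shown, or the two series are clearly identified: +2
- Has a chart title: Days protected distribution
#### Question 3.b (10%)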
> Calculate the mean value and standard deviation for the number of effective days that the special drugs
remained.
- Mean of protected days: 29.78
- Stdev of protected days: 1.959013
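
The mean and standard deviation follow from the probability distribution in 3.a: the mean is the sum of xi * p(xi) and the variance is the sum of (xi - mean)² * p(xi). A Python sketch using the non-zero frequencies from the table (variable names are illustrative):

```python
import math

# Non-zero rows of the frequency table in 3.a: days of protection -> frequency.
freq = {17: 1, 18: 1, 24: 1, 25: 3, 27: 3, 28: 26,
        29: 51, 30: 53, 31: 38, 32: 25, 33: 8}

n = sum(freq.values())                      # total number of cats
prob = {x: f / n for x, f in freq.items()}  # relative-frequency probabilities

mean = sum(x * p for x, p in prob.items())
variance = sum((x - mean) ** 2 * p for x, p in prob.items())
stdev = math.sqrt(variance)

print(f"n = {n}, mean = {mean:.2f}, variance = {variance:.6f}, stdev = {stdev:.6f}")
```

Note that this treats the table as a probability distribution, so the variance divides by n rather than n - 1.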
#### Question 3.c (10%)
> If the marketing department wished to advertise a remarkable number of days that 95% of the cats remain
protected while using this special drug, what would this number of days be?
- According to a., the cumulative probability through 27 days is 0.0429, so the probability of at least 28 days of protection is 0.9571, which exceeds 0.95.
- That is, 4.29% of cats lose protection before the 28th day, while 95.71% of cats remain protected from fleas for at least 28 days.
- Therefore, if the company wants to advertise the number of days for which at least 95% of cats remain protected from fleas, that number should be 28 days.
### Part 4 (25% in total)
#### Question 4.a (5%)
> Determine the weekly average number of crashes.
| fatal crashes (xi) | frequency | probability P(xi) | xi*P(xi) |
| -------- | -------- | -------- | --- |
|0|176|0.88|0|
|1|19|0.095|0.095|
|2|4|0.02|0.04|
|3|1|0.005|0.015|
|total|200|-|0.15|
- So, the weekly average number of fatal crashes = 0.15
#### Question 4.b (10%)
> Calculate the probability that at least 10 crashes (equal to 10 or more) would occur over one year if applying
the accident record of past 200 weeks.
- λt = 0.15 x 52 = 7.8 (a year is taken as 52 weeks)
- P(X ≥ 10) = 1 - P(X ≤ 9) = 0.2589
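
The yearly tail probability depends on using the complement P(X ≤ 9), which is easy to get wrong by one. The sketch below recomputes the weekly mean from the frequency table in 4.a and then the Poisson tail; `poisson_cdf` is an illustrative helper.

```python
import math

# Frequency table from 4.a: fatal crashes per week -> number of weeks.
weekly_table = {0: 176, 1: 19, 2: 4, 3: 1}
weeks = sum(weekly_table.values())
weekly_mean = sum(x * f for x, f in weekly_table.items()) / weeks

lam_year = weekly_mean * 52   # expected fatal crashes in a 52-week year


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))


# "At least 10" is the complement of "at most 9".
p_at_least_10 = 1 - poisson_cdf(9, lam_year)

print(f"weekly mean = {weekly_mean}, lambda*t = {lam_year:.1f}")
print(f"P(X >= 10) = {p_at_least_10:.4f}")
```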
#### Question 4.c (10%)
> During a traffic enhancement program over 10 weeks has been conducted by Taichung government, the
number of fatal accidents decreases to only one. Please comment the performance of enhancement program.
Does it indeed reduce the probability of fatal crash?
- λt = 0.15 x 10 = 1.5
- P(X ≤ 1) = 0.5578254 (or P(X = 1) = 0.33469524)
- If the average fatal-crash rate before the program still applied, the chance of observing at most one fatal accident in 10 weeks would be more than 50%.
- That is to say, one accident in 10 weeks is not unusual under the old rate, so the data do not show that the enhancement program improved safety. It appears to be ineffective.
### Part 5 (10% in total)
#### Question 5.a (5%)
> If a scooter was assembled in the Tyler plant, what is the probability its breakdown was due to an electrical
problem?
| |mechanical|electrical|total|
| -------- | -------- | -------- | --- |
|Tyler|60|59|119|
|Lincoln|46|35|81|
|total|106|94|200|
- P(electrical | Tyler) = 59/119 = 0.4958
#### Question 5.b (5%)
> Is the probability of a scooter having a mechanical problem independent of the scooter being assembled at
the Lincoln plant?
- P(mechanical) = 106/200 = 0.53
- P(Lincoln) = 81/200 = 0.405
- P(mechanical and Lincoln) = 46/200 = 0.23
- P(mechanical) x P(Lincoln) = 0.21465
- 0.23 ≠ 0.21465
- Therefore, a scooter having a mechanical problem is not independent of the scooter being assembled at the Lincoln plant.
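
The conditional probability in 5.a and the product check in 5.b can be reproduced from the contingency table. This Python sketch mirrors the rubric's direct comparison of the joint probability with the product of the marginals; it does not attempt a formal significance test.

```python
import math

# Breakdown counts from the table: (plant, problem type) -> count.
counts = {
    ("Tyler", "mechanical"): 60, ("Tyler", "electrical"): 59,
    ("Lincoln", "mechanical"): 46, ("Lincoln", "electrical"): 35,
}
total = sum(counts.values())

tyler_total = counts[("Tyler", "mechanical")] + counts[("Tyler", "electrical")]
p_elec_given_tyler = counts[("Tyler", "electrical")] / tyler_total

p_mech = (counts[("Tyler", "mechanical")] + counts[("Lincoln", "mechanical")]) / total
p_lincoln = (counts[("Lincoln", "mechanical")] + counts[("Lincoln", "electrical")]) / total
p_joint = counts[("Lincoln", "mechanical")] / total

# Independence would require P(A and B) = P(A) * P(B).
independent = math.isclose(p_joint, p_mech * p_lincoln)

print(f"P(electrical | Tyler) = {p_elec_given_tyler:.4f}")
print(f"joint = {p_joint}, product = {p_mech * p_lincoln:.5f}, independent: {independent}")
```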
### Part 6 (5% in total)
#### Question (5%)
> Chunghwa Telecomm is launching a new wireless sensor technology to receive signals. Currently, the installed
wireless station is receiving a signal through the new model with probability 0.76 and through the old model with
probability 0.34. The chance of receiving a signal given using the new model is 80%; there is 77% chance of
receiving a signal given using the old model. What is the probability of successfully receiving a signal?
- The wireless station uses the new model: P(new) = 0.76
- The wireless station uses the old model: P(old) = 0.34
- P(receiving signal | new) = 0.8
- P(receiving signal | old) = 0.77
- By the law of total probability, P(receiving signal) = P(receiving signal | new) x P(new) + P(receiving signal | old) x P(old) = 0.8 x 0.76 + 0.77 x 0.34 = 0.8698
### Part 7 (10% in total)
#### Question (10%)
> Generally, the empirical rule for a random variable X with a bell-shaped distribution claims: P(|X-𝝁|≤𝝈)=0.68,
P(|X-𝝁|≤2𝝈)=0.95, and P(|X-𝝁|≤3𝝈)=0.997. Please show the empirical rule by the approximation of normal
distribution.(hint: 𝝁 is the mean of X; 𝝈 is the standard deviation of X)
- To show: P(|X-μ|<σ)=0.68, P(|X-μ|<2σ)=0.95, and P(|X-μ|<3σ)=0.997.
- X is a random variable with a bell-shaped distribution with mean μ and standard deviation σ, so Z = (X-μ)/σ is approximately standard normal.
- P(|X-μ|<σ)=P(-σ<X-μ<σ)=P(-1<(X-μ)/σ<1)=P(-1<Z<1)= 0.6827 ≒ 0.68
- P(|X-μ|<2σ)=P(-2σ<X-μ<2σ)=P(-2<(X-μ)/σ<2)=P(-2<Z<2)= 0.9545 ≒ 0.95
- P(|X-μ|<3σ)=P(-3σ<X-μ<3σ)=P(-3<(X-μ)/σ<3)=P(-3<Z<3)= 0.9973 ≒ 0.997
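
The three standard normal probabilities can be verified numerically. A short sketch with `statistics.NormalDist` from the Python standard library:

```python
from statistics import NormalDist

std_normal = NormalDist()   # Z ~ N(0, 1)

# P(-k < Z < k) for k = 1, 2, 3.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, p in within.items():
    print(f"P(|Z| < {k}) = {p:.4f}")
```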
## Exam 3 (total: 120 points)
### Part 1 (80% in total)
> Check the High Desert Banking dataset which contains the loan amounts of 350 customers.
#### Question 1.a
> Please develop a frequency histogram of the loan amount by eight classes. (5%)
- 
#### Question 1.b
##### Question 1.b.1
> According to the above histogram, does it look like a normal distribution? (5%)
- No, it does not look like a normal distribution because it lacks a bell-shaped pattern.
##### Question 1.b.2
> If not, please suggest a most likely probability distribution for the loan amount. (5%)
- It appears to follow a uniform distribution.
#### Question 1.c
> If adopt x to represent the loan amounts of customers of High Desert Bank, please develop a
distribution density function (pdf), f(x), for the random variable x. (5%)
- the min of x= $3,500.00
- the max of x= $125,000.00
- the pdf of x: f(x) = 1/(125000-3500) = 8.230453E-06 for 3,500 ≤ x ≤ 125,000, and 0 otherwise
#### Question 1.d
> Please calculate the population mean and the population standard deviation of the random variable,
x. (10%)
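- the population mean = $64,250.00 (from the uniform pdf, (3,500 + 125,000)/2) or $63,668.57
- the population standard deviation = $35,074.03 (from the uniform pdf, (125,000 - 3,500)/sqrt(12)) or $35,938.16
#### Question 1.e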
> For i=1,2,3,…n, xi is the random variable with the same pdf of x. A new random variable is 𝑥̅, which
is equal to (x1+ x2+ x3+ x4+… +x30)/30. Calculate theoretically the mean of 𝑥̅ and the standard
deviation of 𝑥̅, respectively. (10%)
- x̄ is equal to (x1 + x2 + x3 + x4 + … + x30)/30.
- the mean of x̄ = $64,250.00 or $63,668.57 (that is, the same as the original mean of x)
- the stdev of x̄ = $6,403.61 or $6,561.38 (the population stdev divided by the square root of 30)
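
The first value in each pair follows from the uniform model in 1.c. A minimal Python sketch of that arithmetic, assuming x is uniform on [3,500, 125,000] and the sample size is 30 (variable names are illustrative):

```python
import math

a, b = 3500.0, 125000.0   # min and max loan amounts from 1.c
n = 30                    # sample size used for x-bar in 1.e

density = 1 / (b - a)                 # constant pdf on [a, b]
pop_mean = (a + b) / 2                # mean of a uniform distribution
pop_sd = (b - a) / math.sqrt(12)      # standard deviation of a uniform distribution

mean_of_xbar = pop_mean               # averaging does not change the mean
sd_of_xbar = pop_sd / math.sqrt(n)    # standard error of the sample mean

print(f"f(x) = {density:.6e}")
print(f"population mean = {pop_mean:,.2f}, population sd = {pop_sd:,.2f}")
print(f"mean of x-bar = {mean_of_xbar:,.2f}, sd of x-bar = {sd_of_xbar:,.2f}")
```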
#### Question 1.f
##### Question 1.f.1
> Professor Hung designed an Excel formula “=INDEX($A$2:$A$351,RANDBETWEEN(1,350))”
to randomly choose values from the first array consisting of 350 numbers of the High Desert
Banking worksheet of to generate 𝑥̅. Please use this formula to generate 100 sets of sample (x1, x2,
x3, x4,…, x30). (5%, You must show the 100 data sets of selected samples on the worksheet.)
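- Grading criteria:
- Required: the rows are x1 to x30, with X_bar last
- Required: the columns are numbered 1 to 100
- Required: rows * columns = 31 * 100 = 3100 cells
- Required: one frequency distribution chart for X_bar (a rough resemblance is enough)
##### Question 1.f.2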
> Please
calculate the 𝑥̅ for each sample set and then plot the histogram of 𝒙̅ by the Excel tool. (5%)
- Required: one frequency distribution chart for X_bar (describing this X_bar)
- 
#### Question 1.g
> What kind of distribution does 𝑥̅ look like? (5%)
- According to the histogram, X_bar looks like a normal distribution.
#### Question 1.h
##### Question 1.h.1
> Please calculate the sample mean of 𝒙̅. (5%)
- the sample mean of X_bar = 63315.83333
##### Question 1.h.2
> Check this answer and compare it with the mean of question item d and e. (5%)
- It is still similar to the results of d & e: $64,250.00 or $63,668.57
#### Question 1.i
##### Question 1.i.1
> Please calculate the sample standard deviation of 𝒙̅. (5%)
- the sample stdev of X_bar = 8078.64
##### Question 1.i.2
> Check this answer and compare it with the standard deviation of question item d and e. (5%)
- It is similar to the results of e ($6,403.61 or $6,561.38) but not to those of d.
#### Question 1.j
> Please refer to the central limit theorem to assert your finding. (5%)
- According to the central limit theorem, the distribution of X_bar is approximately normal.
- In addition, the sample mean of X_bar is similar to the mean of x, while the standard deviation of X_bar is smaller: it is approximately the standard deviation of x divided by the square root of 30.
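
The behaviour described in 1.f to 1.j can be imitated with a short simulation. The sketch below is only an illustration under a stated assumption: it draws from a continuous uniform distribution on [3,500, 125,000] instead of the actual 350 loan amounts, which are not reproduced here, so its numbers will differ from the rubric's 63315.83 and 8078.64.

```python
import random
import statistics

random.seed(0)            # fixed seed so the run is repeatable

A, B = 3500, 125000       # assumed uniform range (stand-in for the real dataset)
SAMPLE_SIZE = 30
NUM_SAMPLES = 100

# One x-bar per sample set, as in question 1.f.
sample_means = [
    statistics.mean([random.uniform(A, B) for _ in range(SAMPLE_SIZE)])
    for _ in range(NUM_SAMPLES)
]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print(f"mean of x-bar: {mean_of_means:,.2f} (theory: 64,250.00)")
print(f"stdev of x-bar: {sd_of_means:,.2f} (theory: 6,403.61)")
```

With only 100 sample sets, the simulated values scatter around the theoretical ones, which is consistent with the rubric accepting results that are merely similar.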
### Part 2 (20% in total)
> Check the Truck dataset which contains the two scales of 200 vehicle weights. One scale is WIM, while
the other is POE. The manufacturer of the WIM system claims that the WIM weight is usually heavier
than that of POE with a proportion rate of 65%
#### Question 2.a
> Please generate a new variable to denote whether the WIM weight is heavier than the POE one or
not. If yes, the code is 1, otherwise, 0. (5%. You must show this new variable on the worksheet.)
- a. WIM>POE [=IF(A2>B2,1,0), =IF(A3>B3,1,0), ...]
#### Question 2.b
> Please compute the proportion rate, indicating the WIM system is heavier than the POE one. (5%)
- the proportion rate of WIM>POE = 0.52
#### Question 2.c
> According to the proportion rate of b question, please evaluate the claim of the WIM manufacturer.
Is it reasonable? (10%)
- z = (0.52-0.66)/sqrt(0.66*0.34/200) = -4.180
- P(p_hat<0.52) = P[(p_hat-0.66)/sqrt(0.66*(1-0.66)/200) < (0.52-0.66)/sqrt(0.66*0.34/200)] = P(z<-4.180) ≈ 0.000015
---
- If the claim by the maker of WIM is correct, a sample proportion of 0.52 or lower would occur very rarely, with a probability of only about 0.000015.
- Therefore, WIM's assertion is not reasonable.
### Part 3 (20% in total)
> A convenient store is open 24 hours a day every day of the week. If, on the average, 20 visitors are coming
every hour through the day, find the
#### Question 3.a
> probability that a customer will visit within the next 10 minutes. (5%)
- 20 visitors per hour = 20/60 = 1/3 ≈ 0.333 visitors per minute; thus λ = 1/3 (≈ 0.333) per minute
- the waiting time follows an exponential distribution
- P(x<10) = 1 - e^(-10/3) = 0.964326
#### Question 3.b
> probability that 15 or more minutes will elapse between customer visiting. (5%)
- P(x>15) = 0.006737947
#### Question 3.c
> probability that there is no customer coming within 15 minutes. (5%)
- This is a Poisson distribution for occurrences within a time interval.
- for an interval of 15 minutes, λt = 1/3 x 15 = 5
- P(X=0) = 0.006738
#### Question 3.d
> Please compare the results of b. and c to assert your finding. (5%)
- The probability in b should be the same as that in c.
- This is because the event that more than 15 minutes elapse between two customer visits is the same as the event that no customer arrives within 15 minutes.
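
The equivalence between the exponential waiting time and the Poisson count can be confirmed numerically. A Python sketch with the rate from 3.a (variable names are illustrative):

```python
import math

rate_per_min = 20 / 60    # 20 visitors per hour

# 3.a: exponential waiting time, P(T < 10 minutes).
p_within_10 = 1 - math.exp(-rate_per_min * 10)

# 3.b: exponential waiting time, P(T > 15 minutes).
p_gap_over_15 = math.exp(-rate_per_min * 15)

# 3.c: Poisson count over 15 minutes, P(X = 0) with lambda * t = 5.
lam_15 = rate_per_min * 15
p_zero_arrivals = math.exp(-lam_15) * lam_15**0 / math.factorial(0)

print(f"P(T < 10)  = {p_within_10:.6f}")
print(f"P(T > 15)  = {p_gap_over_15:.9f}")
print(f"P(X = 0)   = {p_zero_arrivals:.9f}")
```

Both 3.b and 3.c reduce to e^(-5), which is why the two answers agree.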
## Exam 4 (total: 120 points)
### Part 1 (120% in total)
> The ministry of education wants to investigate the consumption volume of meat per kid of the elementary school every year. The worksheet of Meat shows the consumption kilograms (x) surveyed from the elementary schools.
#### Question 1.1 (10%)
> Please examine the distribution of meat consumption. Show your plot and indicate what it looks like.


#### Question 1.2 (10%)
> Please calculate the sample mean, 𝑥̅, and the sample standard deviation of meat consumption.
- the sample mean = ==86.576==
- the sample standard deviation = ==24.764==
#### Question 1.3 (5%)
> Please calculate the standard error of sample mean, 𝑥̅.
- the standard error of sample mean Xbar = ==1.709==
#### Question 1.4 (10%)
> If the nutritionist claims that the average consumption volume of meat should not be more than
85kg, which has been suggested to the ministry of education, please use the sample data to examine
whether the kitchen officers in the elementary schools had supplied the over mean of meat volume.
- apply the t-statistic to judge the policy, because the data are approximately normal
- t = (x_bar - 85)/(standard error of x_bar) = 0.922
- P(t>0.922) = 0.179
- The probability, 0.179, indicates that such a sample mean is quite usual when the population mean of meat consumption is 85.
- Therefore, the kitchen officers in the elementary schools did not deviate from the policy significantly.
#### Question 1.5 (10%)
> Please develop a 95% confidence interval of the mean consumption volume of meat by the sample
data.
- critical value t0.975,209 = ==1.971==
- margin of error = ==3.369 (the answer to 1.7)==
- 95% C.I., lower bound = ==83.207==
- 95% C.I., upper bound = ==89.945==
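
The standard error, t-statistic, and confidence interval from 1.3 to 1.5 can be reproduced as below. Two assumptions are made in this sketch: n = 210 is inferred from the 209 degrees of freedom of the critical value, and the critical value 1.971 is taken from the rubric because the Python standard library has no t-distribution quantile function.

```python
import math

n = 210            # assumed from the 209 degrees of freedom
x_bar = 86.576     # sample mean from 1.2
s = 24.764         # sample standard deviation from 1.2
mu_0 = 85          # nutritionist's suggested mean
t_crit = 1.971     # rubric's critical value t(0.975, 209)

se = s / math.sqrt(n)            # standard error of the sample mean (1.3)
t_stat = (x_bar - mu_0) / se     # test statistic (1.4)

margin = t_crit * se             # margin of error (1.5 and 1.7)
lower, upper = x_bar - margin, x_bar + margin

print(f"se = {se:.3f}, t = {t_stat:.3f}")
print(f"margin = {margin:.3f}, 95% C.I. = ({lower:.3f}, {upper:.3f})")
```

The last digit can differ slightly from the rubric because the critical value is rounded to three decimals here.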
#### Question 1.6 (10%)
##### Question 1.6.a
> Before calculating the 95% confidence interval in the 1.5 question item, what assumption must be examined first?
- Using the t-statistic to calculate the confidence interval relies on the assumption that the sample data are normally distributed.
##### Question 1.6.b
> Is this assumption accepted by the surveyed sample data?
- According to the plot in 1.1, the sample data are normally distributed.
#### Question 1.7 (5%)
> Please also indicate the margin of error of the confidence interval in the 1.5 question item.
- margin of error = ==3.369==
#### Question 1.8 (10%)
##### Question 1.8.a
> Please check the consumption value of expert’s suggestion, 85kg. Is it within the confidence interval
developed by the 1.5 question item?
##### Question 1.8.b
> Do you have the same conclusion as the 1.4 question item?
- The suggested mean, 85, lies within the 95% confidence interval.
- That is to say, the range of plausible values for the population mean of meat consumption includes the suggested value.
- Therefore, the kitchen officers in the elementary schools did not deviate from the policy significantly. The conclusion is the same as in question 1.4.
#### Question 1.9 (5%)
> The nutritionist also claims that it is likely to be an obesity (that is, being too fat) problem for an
elementary school kid if he/she consumes over 100kg of meat every year. Please use a variable, Y,
to indicate whether the meat consumption volume of each surveyed elementary school is over 100kg
or not. If yes, the code is 1, otherwise, 0. You must show this new variable on the worksheet.
- Create a new column using the formula "=IF([cell in column A]>100,1,0)"
#### Question 1.10 (5%)
> Please calculate the sample proportion, that is, 𝑝̅, of likely obesity according the sample data of Y.
- the sample proportion, that is Pbar = ==0.257==
#### Question 1.11 (5%)
> Please calculate the standard error of 𝑝̅.
- the standard error of Pbar = ==0.030==
#### Question 1.12 (15%)
##### Question 1.12.a
> Please develop a 95% confidence interval of the proportion rate of likely obesity due to the over
meat consumption volume based on the sample data. (10%)
- critical value z0.975 = ==1.960==
- margin of error = ==0.059==
##### Question 1.12.b
> Please also indicate the margin of error.(5%)
- 95% C.I., lower bound = ==0.198==
- 95% C.I., upper bound = ==0.316==
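
The proportion interval in 1.11 and 1.12 uses the normal approximation. A Python sketch of the calculation, assuming n = 210 (inferred from the 209 degrees of freedom in question 1.5) and the rounded sample proportion 0.257:

```python
import math
from statistics import NormalDist

n = 210          # assumed sample size
p_bar = 0.257    # sample proportion from 1.10

se = math.sqrt(p_bar * (1 - p_bar) / n)    # standard error of p-bar (1.11)
z_crit = NormalDist().inv_cdf(0.975)       # two-sided 95% critical value
margin = z_crit * se
lower, upper = p_bar - margin, p_bar + margin

print(f"se = {se:.3f}, z = {z_crit:.3f}, margin = {margin:.3f}")
print(f"95% C.I. = ({lower:.3f}, {upper:.3f})")
```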
#### Question 1.13 (10%)
> If the dean of ministry of education wants to get the 95% confidence interval of the proportion rate
of likely obesity by reducing 25% of margin of error calculated by the 1.12 question item, how many
additional samples should be investigated?
- reducing the margin of error by 25% means e* = 0.75 x 0.059 = 0.044
- the required sample size = 373.3, rounded up to 374
- So, the additional sample size = 374-210 = ==164==
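
Because the margin of error is proportional to 1/sqrt(n) when the sample proportion and the critical value stay fixed, shrinking the margin to 75% of its value requires n/0.75² observations. A short Python sketch of that shortcut (variable names are illustrative):

```python
import math

n_current = 210      # current sample size
reduction = 0.25     # desired reduction of the margin of error

# margin ~ 1/sqrt(n), so the new n is the old n divided by (1 - reduction)^2.
n_exact = n_current / (1 - reduction) ** 2
n_required = math.ceil(n_exact)          # always round the sample size up
additional = n_required - n_current

print(f"required n = {n_exact:.1f} -> {n_required}, additional = {additional}")
```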
#### Question 1.14 (10%)
> A journalist of a local newspaper reported that the proportion rate of obesity for the elementary
school kids is as high as 40% within plus and minus 3% of error under the 95% confidence level.
That journalist argued that this obesity report had been studied from 540 elementary school kids.
Do you believe this report?
- The necessary sample size is n ≥ 1024.426667 (using p = 0.4) or 1067.111111 (using p = 0.5)
- The journalist stated that the sample size was 540, which is less than the size needed to achieve a small margin of error of 0.03.
- ==So, I do not believe the report.==
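
The two thresholds come from the sample-size formula n = z² p(1 - p)/e². The sketch below uses z = 1.96 as in the rubric; `required_n` is a hypothetical helper name.

```python
import math


def required_n(p: float, margin: float, z: float = 1.96) -> float:
    """Sample size needed for a proportion p to reach the given margin of error."""
    return z**2 * p * (1 - p) / margin**2


n_reported_p = required_n(0.40, 0.03)   # using the reported proportion
n_worst_case = required_n(0.50, 0.03)   # most conservative choice of p

reported_sample = 540
believable = reported_sample >= math.ceil(n_reported_p)

print(f"n needed (p = 0.4): {n_reported_p:.2f}")
print(f"n needed (p = 0.5): {n_worst_case:.2f}")
print("report consistent with its stated precision:", believable)
```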