---
# System prepended metadata

title: 統計學 Statistics(MGT-2)

---

# 統計學 Statistics(MGT-2)
:::success
Instructor: 曾意儒
Classroom: I-204
[Statistics for Business & Economics Ebook](https://libgen.is/book/index.php?md5=28A10E70A2552E71124D881045CD19EE)
:::
:::info
:::spoiler Click to Open TOC
[TOC]
:::

## Chapter 1 Introduction
:::info
:::spoiler Learning Objectives
- [x] **Descriptive** and **Inferential statistics**
- [x] **Language of statistics** and **Key elements of statistics**
- [x] **Population** and **Sample data**
- [x] **Types of data** and **Data-collection methods**
:::

### 【Statistical Methods 統計學方法】
#### 【Descriptive Statistics 敘述性統計】
`def：Utilizes numerical and graphical methods to explore data`
:::spoiler `Example`
![](https://i.imgur.com/FQAnXVb.png)
![](https://i.imgur.com/zcOUwQY.png)
:::
:::spoiler `Four elements of Descriptive Statistics`
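1. The population or sample of experimental units we are interested in
2. One or more variables to be investigated
3. Tools for summarizing the data, such as a computed numerical result, a graph, or a table
4. Identification of the patterns hidden in the data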
:::
#### 【Inferential Statistics 推論性統計】
`def：Utilizes sample data to make estimates, decisions, predictions, or other generalizations about a larger set of data.`
:::spoiler `Example`
>Using 1,945,071 real-time PCR results from nose and throat swabs taken from 383,812 participants between 2020/12 and 2021/5
>
>Vaccination with the ChAdOx1 or BNT162b2 vaccines already reduced SARS-CoV-2 infections ≥21 d after the first dose (61% (95% confidence interval = 54–68%) versus 66% (95% CI = 60–71%), respectively).
>Greater reductions were observed after a second dose (79% (95% CI = 65–88%) versus 80% (95% CI = 73–85%), respectively).
![](https://i.imgur.com/4AL4GGP.png)
:::
:::spoiler `Five elements of Inferential Statistics`
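1. The population of experimental units we are interested in
2. One or more variables to be investigated
3. A sample of the population
4. An inference about the population, based on the information contained in the sample
5. A measure of the reliability of the inference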
:::

### 【Fundamental Elements of Statistics 統計的基本元素】
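- Experimental unit 實驗單位
`Object upon which we collect data`
- Variable 變數
`Characteristic of an individual experimental unit`
- Population 母體
`All items of interest`
- Sample 樣本
`Subset of the units of a population`
    - Representative sample
    `Exhibits the characteristics typical of the target population`
    - Simple random sample
    `Every different sample of size n has an equal chance of being selected`
- Measure of Reliability 信度
`Statement (usually quantified) about the degree of uncertainty associated with a statistical inference`
- Statistical Inference
`Estimate, prediction, or generalization about a population based on a sample`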
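
The population, sample, and simple-random-sample terms above can be illustrated with a minimal sketch built on the standard library's `random.sample`. The population of student IDs, the sample size, and the seed are hypothetical values chosen only for illustration.

```python
import random

# Hypothetical population: 10,000 experimental units (student IDs).
population = list(range(1, 10_001))

# A fixed seed keeps the illustration reproducible (an assumption, not a requirement).
rng = random.Random(42)

# Simple random sample: every subset of size n is equally likely to be drawn.
n = 100
sample = rng.sample(population, n)

print(len(sample), "units sampled without replacement")
```

`random.sample` draws without replacement, so no unit can appear in the sample twice.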

### 【Types of Data 資料型態】
- Quantitative data 量化
    - Discrete data 離散性
    - Continuous data 連續性
        - [Central Tendency 集中趨勢](#【Central-Tendency-集中趨勢】)
        - [Variability 變異數量](#【Variability-變異數量】)
        - [Distributional Forensics(Shape) 分配形狀](#【Distributional-Forensics(Shape)-分配形狀】)
- Qualitative data 質性
    - Ordinal data 序數型 (<font color="red">e.g., a ranking</font>)
    - Nominal data 類別型 (<font color="red">e.g., species</font>)
    - Binary data 二元型 (<font color="red">e.g., Catholic or not: T</font>)

### 【Obtaining Data 資料蒐集】
1. Published source
2. Designed experiment
    - `Units` and `Units' Characteristic` **under control**
    - Typically involves a `treatment` (experimental) group and an `untreated` (control) group
3. Observational study (incl. opinion polls & surveys)
    - `Units` in **natural setting**
    - `Variables` are recorded
    - **No attempt** to control `units' characteristics`

## Chapter 2 Descriptive Statistics 敘述性統計
:::info
:::spoiler Learning Objectives
- [x] Describe data using **graphs**
- [x] Describe data using **numerical measures**
- [x] Describe **quantitative data** using numerical measures
- [x] Describe the **relationship between two quantitative variables using graphs**
- [x] Detecting descriptive **methods that distort the truth** 
:::
:::info
:::spoiler Outlines
![](https://i.imgur.com/0qNhXqf.png) 
![](https://i.imgur.com/F8nUagz.png)
:::
### 【Describing Qualitative Data 描述定性資料】
#### Key Terms
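- Class 類別 `e.g., the Information Management majors among all sophomores at the university`
- Class Frequency 類別次數 `Of the university's 10,000 sophomores, 100 are Information Management majors, so the class frequency is 100`
- Class Relative Frequency 類別相對次數 `Class frequency divided by the total: 100/10000 = 0.01`
- Class Percentage 類別百分比 `Class relative frequency times 100%: (100/10000) × 100% = 1%`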
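
As a rough sketch of how the four key terms relate, the snippet below builds a frequency table with `collections.Counter`. The list of majors and the helper name `frequency_table` are hypothetical and used only for illustration.

```python
from collections import Counter

# Hypothetical qualitative data: the major (class) of each of 20 students.
majors = ["IM"] * 5 + ["Finance"] * 8 + ["Accounting"] * 4 + ["Marketing"] * 3


def frequency_table(values):
    """Return {class: (frequency, relative frequency, percentage)}."""
    counts = Counter(values)
    total = len(values)
    return {
        cls: (freq, freq / total, 100 * freq / total)
        for cls, freq in counts.items()
    }


table = frequency_table(majors)
for cls, (freq, rel, pct) in table.items():
    print(f"{cls:<12}{freq:>4}{rel:>8.2f}{pct:>7.1f}%")
```

The relative frequencies always sum to 1 and the percentages to 100%, which is a quick sanity check on any table of this kind.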

#### 【Tables】
- Lists `categories` & `number of elements`
- May show `frequencies(counts)`, `%` or both
:::spoiler Picture
![](https://i.imgur.com/cTtuj1O.png =500x200)
:::

#### 【Bar Chart 長條圖】
- The frequency axis starts at the zero point
- Equal bar widths
- Leave gaps between the bars
:::spoiler Picture
![](https://i.imgur.com/wG4H61s.png =400x200)
:::

#### 【Pie Chart 圓餅圖】
- Shows how a total quantity breaks down by category
- Slice angle is proportional to the class percentage (angle = 360° × class relative frequency)
:::spoiler Picture
![](https://i.imgur.com/ne4wO5d.png =400x300)
:::
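
The slice-angle rule for a pie chart can be sketched as follows. The class frequencies and the helper name `slice_angles` are hypothetical.

```python
def slice_angles(frequencies):
    """Map each class to its pie-slice angle in degrees (360 x relative frequency)."""
    total = sum(frequencies.values())
    return {cls: 360 * freq / total for cls, freq in frequencies.items()}


# Hypothetical class frequencies.
frequencies = {"A": 50, "B": 30, "C": 20}
angles = slice_angles(frequencies)

for cls, angle in angles.items():
    print(f"{cls}: {angle:.1f} degrees")
```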

#### 【Pareto Diagram 柏拉圖】
- A `Bar Chart` whose bars are arranged from largest to smallest
:::spoiler Picture
![](https://i.imgur.com/ZOQxbEN.png =400x200)
:::
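
A minimal sketch of the Pareto ordering, assuming a hypothetical list of defect categories: `Counter.most_common()` returns the classes sorted by frequency, largest first, which is the order in which the bars would be drawn.

```python
from collections import Counter

# Hypothetical defect categories recorded during an inspection.
defects = ["scratch", "dent", "scratch", "crack", "scratch", "dent", "stain"]

# most_common() sorts classes by frequency in descending order (the Pareto order).
pareto_order = Counter(defects).most_common()

# Text-only rendering: one row of '#' per class, tallest class first.
for cls, freq in pareto_order:
    print(f"{cls:<8}{'#' * freq} ({freq})")
```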

### 【Graphical Methods for Describing Quantitative Data 圖像化描述定量資料】
#### 【Dot Plot 點圖】
- Horizontal axis is a scale for the quantitative variable, e.g., percent.
:::spoiler Picture
![](https://i.imgur.com/1udgHYG.png)
:::

#### 【Stem & Leaf Display 莖葉圖】
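- <font color="red">Top $\rightarrow$ bottom, small $\rightarrow$ large</font>
- Tens digit (stem) on the **left**, units digit (leaf) on the right
- Repeated values are each written out, so the width of a row shows its frequency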
:::spoiler Picture
![](https://i.imgur.com/am6MXnG.png)
> Data: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
:::
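
The rules above can be sketched in a few lines, using the ten data values listed with the picture. The helper name `stem_and_leaf` is hypothetical, and the sketch assumes two-digit values so that the tens digit is the stem.

```python
from collections import defaultdict


def stem_and_leaf(data):
    """Group values by tens digit (stem), keeping units digits (leaves) in order."""
    groups = defaultdict(list)
    for value in sorted(data):
        stem, leaf = divmod(value, 10)
        groups[stem].append(leaf)
    return dict(groups)


data = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]
display = stem_and_leaf(data)

# Stems run top to bottom from small to large; repeated leaves are all kept.
for stem in sorted(display):
    print(stem, "|", " ".join(str(leaf) for leaf in display[stem]))
```

Stem 2 has the longest row (six leaves), so most of the observations fall in the twenties.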

#### 【Histogram 直方圖】
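- The values of the quantitative variable are divided into class intervals
- Every interval has <font color="red">**equal width**</font>
- `Bar's height` is the `class frequency`, `relative frequency`, or `percent`
- No gaps between adjacent bars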
:::spoiler Picture
![](https://i.imgur.com/F1BUrYc.png)
![](https://i.imgur.com/odT96sn.png)
:::
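
A rough sketch of the equal-width binning behind a histogram. The helper name `histogram_counts` and the choice of four classes are assumptions for illustration, and the function assumes the data contain at least two distinct values.

```python
def histogram_counts(data, num_classes):
    """Count the values falling into each of num_classes equal-width intervals."""
    low, high = min(data), max(data)
    width = (high - low) / num_classes  # every class interval has the same width
    counts = [0] * num_classes
    for value in data:
        # The maximum value goes into the last class rather than opening a new one.
        index = min(int((value - low) / width), num_classes - 1)
        counts[index] += 1
    edges = [low + i * width for i in range(num_classes + 1)]
    return edges, counts


data = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]
edges, counts = histogram_counts(data, 4)

# Text-only rendering: bar length is the class frequency.
for left, right, freq in zip(edges, edges[1:], counts):
    print(f"[{left:.0f}, {right:.0f}) {'#' * freq}")
```

Dividing each count by `len(data)` would give the relative-frequency version of the same histogram.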

#### 【Summary】
![](https://i.imgur.com/aTMr1Mm.png =600x300)

### 【Central Tendency 集中趨勢】
`def：the single value most` <font color="red">**typical/representative**</font> `of the collected data`
- Central Tendency 集中趨勢
    - Mean 平均值
    - Median 中位數
    - Mode 眾數
#### 【Mean 平均值】
![](https://i.imgur.com/3wmohzM.png =600x200)
- Advantage
    - Use **every value** in the data $\rightarrow$ **Good representative**
    - Means of **repeatedly drawn samples** from the same population tend to be **similar** $\rightarrow$ **resistant to sample-to-sample fluctuation**
- Disadvantage
    - **Sensitive** to **extreme values/outliers**
    - Not appropriate for **skewed distribution(偏態分布)**
    - Cannot be calculated for **nominal** or **nonnumeric ordinal data** (e.g., cancer stage)
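
The sensitivity of the mean to extreme values can be seen in a small sketch that uses `statistics.mean`. The salary figures are hypothetical.

```python
from statistics import mean

# Hypothetical salaries, in thousands.
salaries = [30, 32, 35, 36, 37]
# The same data plus a single extreme value.
with_outlier = salaries + [300]

mean_plain = mean(salaries)
mean_outlier = mean(with_outlier)

print("mean without outlier:", mean_plain)
print("mean with outlier:   ", round(mean_outlier, 2))
```

One extreme observation more than doubles the mean here, even though five of the six values are unchanged.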

#### 【Median 中位數】
![](https://i.imgur.com/pEseOFK.png =600x200)
- **Not affected** by **extreme values**
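
A minimal sketch of the median's resistance to extreme values, using `statistics.median` on hypothetical salary data:

```python
from statistics import median

# Hypothetical salaries, in thousands.
salaries = [30, 32, 35, 36, 37]
# The same data plus a single extreme value.
with_outlier = salaries + [300]

# Odd sample size: the median is the middle value of the sorted data.
median_plain = median(salaries)
# Even sample size: the median is the average of the two middle values.
median_outlier = median(with_outlier)

print(median_plain, median_outlier)
```

Adding the extreme value moves the median only from 35 to 35.5.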

#### 【Mode 眾數】
![](https://i.imgur.com/jofg89J.png)
- **Not affected** by **extreme values**
- May be used for **quantitative** or **qualitative data**
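
Because the mode only counts occurrences, it works for qualitative data as well. The sketch below uses `statistics.multimode` (Python 3.8+), which returns every most-frequent value; both data sets are hypothetical.

```python
from statistics import multimode

# Hypothetical qualitative data.
blood_types = ["A", "O", "B", "O", "AB", "O", "A"]
# Hypothetical quantitative data with two equally frequent values.
scores = [70, 85, 85, 90, 90, 100]

mode_blood = multimode(blood_types)
mode_scores = multimode(scores)

print(mode_blood)
print(mode_scores)
```

A data set can have more than one mode, which is why a list is returned.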

### 【Variability 變異數量】
`def：the` <font color='red'>**spread, or dispersion**</font>`, of the values`
- Variability 變異數量
    - Range 全距
    - Variance 變異數
    - Standard Deviation 標準差
#### 【Range 全距】
![](https://i.imgur.com/H8TJ0bP.png)
- Disadvantage
    - **Ignores** how the data are **distributed**
    - **Sensitive** to **extreme values/outliers**
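
Both disadvantages can be seen in a short sketch, assuming the usual definition of the range as the largest value minus the smallest. The data sets and the helper name `value_range` are hypothetical.

```python
def value_range(data):
    """Range = largest value minus smallest value."""
    return max(data) - min(data)


# Two hypothetical data sets with very different spreads around the center.
clustered = [1, 5, 5, 5, 5, 5, 9]
spread_out = [1, 2, 4, 5, 6, 8, 9]
# The clustered data with its maximum replaced by an extreme value.
with_outlier = [1, 5, 5, 5, 5, 5, 90]

print(value_range(clustered), value_range(spread_out), value_range(with_outlier))
```

The first two sets share the same range although their values are distributed differently, and a single outlier changes the range from 8 to 89.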

#### 【Variance 變異數】
![](https://i.imgur.com/VihHx3J.png)
- Most commonly used measure of variability
- **Considers how the data are distributed**
- Shows variation about the mean

#### 【Standard Deviation 標準差】
![](https://i.imgur.com/ZVuH8OG.png)
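
As a sketch of the two measures, assuming the sample forms (divisor n − 1) and hypothetical data, the variance can be computed by hand and checked against the standard library. The standard deviation is its positive square root.

```python
from math import sqrt
from statistics import stdev, variance


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    center = sum(values) / n
    return sum((x - center) ** 2 for x in values) / (n - 1)


data = [1, 2, 3, 4, 5]  # hypothetical sample

s2 = sample_variance(data)  # sample variance
s = sqrt(s2)                # sample standard deviation

print(s2, variance(data))
print(round(s, 4), round(stdev(data), 4))
```

The standard deviation is expressed in the same units as the data, which often makes it easier to interpret than the variance.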

### 【Distributional Forensics(Shape) 分配形狀】
- Shape
    - Skewness 偏態
        - Left-Skewed
        - Symmetric
        - Right-Skewed
    - Kurtosis 峰態

#### 【Skewness 偏態】
`def：A data set is said to be` <font color='red'>**skewed**</font> `if one tail of the distribution has` **more extreme** `observations than the other tail.`
- Left-Skewed `Mean < Median`
- Symmetric `Mean = Median`
- Right-Skewed `Mean > Median`
![](https://i.imgur.com/QPbCXwK.png)
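
The mean-versus-median comparison above can be sketched as a simple rule of thumb. The helper name `skew_direction` and the data sets are hypothetical, and the comparison is only an informal indicator, not a formal test of skewness.

```python
from statistics import mean, median


def skew_direction(data):
    """Classify the shape by comparing mean and median (rule of thumb only)."""
    m, med = mean(data), median(data)
    if m > med:
        return "right-skewed"
    if m < med:
        return "left-skewed"
    return "symmetric"  # equal mean and median; symmetry itself is not verified


right = [1, 2, 2, 3, 3, 3, 20]       # long tail of large values
left = [1, 18, 18, 19, 19, 20, 20]   # long tail of small values
balanced = [1, 2, 3, 4, 5]

for name, data in [("right", right), ("left", left), ("balanced", balanced)]:
    print(name, "->", skew_direction(data))
```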

#### 【Kurtosis 峰態】
![](https://i.imgur.com/cb2VYim.png)

---

To be added

---

## Chapter 3 Probability 機率
## Chapter 4 Random Variables and Probability Distributions 隨機變數與機率分布