# Week 1 Lecture
## Population, Sample, Variable
### Population
A well-defined collection of objects.
e.g. **All** registered students in the course.
#### Notation for Collections:
{Element1, Element2, Element3}
The difference between 0 and {0} is that the first one refers to the number 0 while the latter refers to a collection with one element: the number 0.
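The same distinction can be sketched in Python, assuming a built-in `set` is an acceptable stand-in for a collection with one element (the variable names are illustrative only):

```python
zero = 0          # the number 0
collection = {0}  # a collection (a Python set) whose only element is the number 0

print(zero == collection)  # False: a number is not the same thing as a collection
print(zero in collection)  # True: the number 0 is an element of {0}
print(len(collection))     # 1: the collection has exactly one element
```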
### Sample
A subcollection of the population selected in some prescribed manner.
e.g. The students currently present in the lecture form a sample of the registered students in the course.
### Variable
Any characteristic whose value may change from one object to another in the population.
### Example

1. Sample (Refers to the average break-down time of the same 10 wings)
2. Population (Refers to the average break-down time of any of the 10 wings)
## Probability vs Statistics

### Probability
Properties of a population are known, and questions regarding a sample taken from the population are investigated (deductive reasoning).
### Statistics
Characteristics of a sample are known from the experiment, and conclusions regarding the population are made (inductive reasoning).
## Descriptive and Inferential Statistics
### Inferential Statistics
Involves making predictions or inferences about a population from observations and analyses of a sample.
### Example

Step 1: Consider all students in the front 2 rows of the lecture hall as a sample, and count the number of males, n~M~, and females, n~F~.
Step 2: Estimate the number of males and females in the population:
$$
\frac{n_{M}}{n_{M}+n_{F}} \times 168 \text { and } \frac{n_{F}}{n_{M}+n_{F}} \times 168 \text { respectively}
$$
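As a rough sketch of the two steps, the estimate can be computed as below. The function name `estimate_counts` and the counts 12 and 9 are hypothetical and only illustrate the arithmetic; the population size of 168 is taken from the formula above.

```python
def estimate_counts(n_m, n_f, population_size=168):
    """Scale the sample proportions of males and females up to the population."""
    sample_size = n_m + n_f
    if sample_size == 0:
        raise ValueError("the sample must contain at least one student")
    est_m = n_m / sample_size * population_size
    est_f = n_f / sample_size * population_size
    return est_m, est_f


# Hypothetical counts (not from the lecture): 12 males and 9 females in the front 2 rows.
est_m, est_f = estimate_counts(12, 9)
print(est_m, est_f)
```

The two estimates always add up to the population size, because the two sample proportions add up to 1.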
### Descriptive Statistics
Involves obtaining descriptive summaries of a sample or a population. These could take the form of frequency distributions, measures of central tendency (mean and median), or graphs such as histograms, pie charts, and bar charts.
## Range, Mean, Median, Percentile
**Range**: difference between the largest and smallest values.
**Mean**: average of all values.
**Population mean** is denoted by *μ*.
If the sample is a collection of numbers $\{x_1,\ldots,x_n\}$, the **sample mean** is denoted by
$$
\bar{x}=\frac{x_{1}+\cdots+x_{n}}{n}=\frac{1}{n}\left(\sum_{i=1}^{n} x_{i}\right)
$$
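A minimal sketch of the range and the sample mean in Python, assuming the sample is given as a non-empty list of numbers (the helper names and the data are hypothetical):

```python
def value_range(values):
    """Difference between the largest and smallest values."""
    return max(values) - min(values)


def sample_mean(values):
    """Sum of the values divided by the number of values, n."""
    n = len(values)
    if n == 0:
        raise ValueError("the mean of an empty sample is undefined")
    return sum(values) / n


sample = [2, 4, 4, 5, 10]   # hypothetical sample
print(value_range(sample))  # 8
print(sample_mean(sample))  # 5.0
```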
### Example

### Median
Suppose we have a population of numbers: $\{x_1,\ldots,x_N\}$. Let $\{x_{i_1},\ldots,x_{i_n}\}$ be a sample of this population.
The **median** refers to the "middle" value after ordering the values, so the formulas below assume that the values are indexed in increasing order.
The **population median** is defined as:
$$
\tilde{\mu}=\left\{\begin{array}{ll}
x_{m}, & \text { if } N \text { is odd of the form } N=2 m-1 ; \\
\frac{x_{m}+x_{m+1}}{2}, & \text { if } N \text { is even of the form } N=2 m
\end{array}\right.
$$
The **sample median** is defined as:
$$
\tilde{x}=\left\{\begin{array}{ll}
x_{i_{m}}, & \text { if } n \text { is odd of the form } n=2 m-1 ; \\
\frac{x_{i_m}+x_{i_{m+1}}}{2}, & \text { if } n \text { is even of the form } n=2 m
\end{array}\right.
$$
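The two cases of the definition can be sketched as follows, assuming the values arrive as a non-empty list in any order (the function name `median` is illustrative; the list is sorted first so that index m refers to the m-th smallest value):

```python
def median(values):
    """Middle value after ordering; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty collection is undefined")
    m = (n + 1) // 2  # n = 2m - 1 when n is odd, n = 2m when n is even
    if n % 2 == 1:
        return ordered[m - 1]                   # x_m (Python lists are 0-based)
    return (ordered[m - 1] + ordered[m]) / 2    # (x_m + x_{m+1}) / 2


print(median([3, 1, 2]))     # 2
print(median([4, 1, 3, 2]))  # 2.5
```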
### Percentile
The K-th percentile of a sample or population is a number p such that at least K% of all sample values are less than or equal to p, and no more than K% of all sample values are strictly less than p.
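e.g. In the collection {1,2,3,4,5}, the 26th percentile p must satisfy p ≥ 2 (at least 26% of the 5 values, i.e. at least 2 values, are less than or equal to p) and p ≤ 2 (no more than 26% of the values, i.e. at most 1 value, are strictly less than p). Hence, the 26th percentile is 2.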
Method to compute the K-th percentile of a given sample:
Suppose we have a sample {x~1~, x~2~,... x~n~}, such that after ordering the values, we get x~1~'≤ x~2~' ≤ ... ≤ x~n~'.
For 0 < K < 100, first compute $\frac{K}{100}n$.
If $\frac{K}{100}n$ is not a whole number, round it up to get the whole number m. Then the K-th percentile is x~m~'.
If $\frac{K}{100}n$ is a whole number, M, then the K-th percentile is $\frac{x'_M+x'_{M+1}}{2}$.
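The method above can be sketched in Python as follows. This is a minimal version that assumes K is an integer, so that the whole-number test can be done with exact integer arithmetic; the function name `percentile` is hypothetical.

```python
def percentile(values, k):
    """K-th percentile of a non-empty sample, following the two-case method above.

    Assumes k is an integer with 0 < k < 100.
    """
    if not 0 < k < 100:
        raise ValueError("k must satisfy 0 < k < 100")
    ordered = sorted(values)      # x'_1 <= x'_2 <= ... <= x'_n
    n = len(ordered)
    if n == 0:
        raise ValueError("the sample must not be empty")
    whole, remainder = divmod(k * n, 100)   # (k / 100) * n without floating point
    if remainder != 0:
        m = whole + 1             # round (k / 100) * n up to the whole number m
        return ordered[m - 1]     # x'_m (Python lists are 0-based)
    # (k / 100) * n is the whole number M: average x'_M and x'_{M+1}
    return (ordered[whole - 1] + ordered[whole]) / 2


print(percentile([1, 2, 3, 4, 5], 26))  # 2
print(percentile([1, 2, 3, 4], 50))     # 2.5
```

The first call reproduces the 26th percentile of {1,2,3,4,5} worked out earlier in this section.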
### Example

### Example 2


Hence, the definition of the median is exactly the same as that of the 50th percentile.
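One way to check this is to substitute K = 50 into the percentile method for a sample of size n:

$$
\frac{50}{100} n=\left\{\begin{array}{ll}
m-\frac{1}{2}, & \text { if } n=2 m-1, \text { which rounds up to } m, \text { giving } x'_{m} ; \\
m, & \text { if } n=2 m, \text { giving } \frac{x'_{m}+x'_{m+1}}{2}
\end{array}\right.
$$

Both cases agree with the two cases in the definition of the sample median.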
## Experiment, Sample Space, Event
### Experiment
An **experiment** is any process (whether real or hypothetical) in which the possible outcomes can be identified ahead of time.
e.g. Rolling a die.
### Sample Space
The **sample space** of an experiment, denoted by Ω, is the set of all possible outcomes of that experiment.
e.g. {1,2,3,4,5,6}
#### Properties of a sample space
* **Collectively exhaustive**: the sample space contains all possible outcomes.
* **Mutually exclusive**: two different outcomes in the sample space cannot both occur at the same time.
### Event
An **event** is a subset of outcomes contained in a sample space Ω.
An event is **simple** if it consists of exactly one outcome. It is **compound** when it consists of more than one outcome.
An event A is said to **occur** if the resulting experimental outcome is contained in A.
The experiment of rolling a die has 6 simple events: {1}, {2}, ... {6}.
Example of a compound event:
* Event that the outcome is odd: {1,3,5}
In general, exactly one simple event will occur, but many compound events may occur simultaneously.
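The die example can be sketched with Python sets. The second compound event `at_least_four`, the helper `occurs`, and the fixed outcome 5 are assumptions made only for illustration.

```python
omega = {1, 2, 3, 4, 5, 6}   # sample space for rolling a die
odd = {1, 3, 5}              # compound event: the outcome is odd
at_least_four = {4, 5, 6}    # hypothetical second compound event


def occurs(event, outcome):
    """An event occurs if the experimental outcome is contained in it."""
    return outcome in event


outcome = 5  # hypothetical result of one roll

# The simple events are the one-element subsets {1}, {2}, ..., {6}.
simple_events = [{o} for o in sorted(omega)]
occurred_simple = [e for e in simple_events if occurs(e, outcome)]

print(occurred_simple)                 # [{5}]: exactly one simple event occurs
print(occurs(odd, outcome))            # True
print(occurs(at_least_four, outcome))  # True: two compound events occur together
```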


## Sample Space vs Population
A population could possibly have repeated values (a collection), but a sample space cannot have any repeated values (a set).
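A small sketch of the difference, assuming a hypothetical population of five student ages and the experiment "pick a student and record the age". A list keeps the repeated values, while a set keeps each possible outcome only once.

```python
from collections import Counter

# Hypothetical population: ages of five students (repeated values are allowed).
population = [21, 19, 21, 22, 19]

# Sample space of the experiment "pick a student and record the age":
# each possible outcome appears exactly once.
sample_space = set(population)

print(len(population), len(sample_space))  # 5 3
print(Counter(population)[21])             # 2: the value 21 appears twice in the population
```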