# Week 1 Lecture

## Population, Sample, Variable

### Population

A well-defined collection of objects.

e.g. **All** registered students in the course.

#### Notation for Collections: {Element1, Element2, Element3}

The difference between 0 and {0} is that the first refers to the number 0, while the second refers to a collection with one element: the number 0.

### Sample

A subcollection of the population selected in some prescribed manner.

e.g. The students currently present in the lecture are a sample of the registered students in the course.

### Variable

Any characteristic whose value may change from one object to another in the population.

### Example

![](https://i.imgur.com/uy8OrOW.png)

1. Sample (Refers to the average break-down time of the same 10 wings)
2. Population (Refers to the average break-down time of any of the 10 wings)

## Probability vs Statistics

![](https://i.imgur.com/lmDSi6J.png)

### Probability

Properties of a population are known, and questions regarding a sample taken from the population are investigated (deductive reasoning).

### Statistics

Characteristics of a sample are known from the experiment, and conclusions regarding the population are made (inductive reasoning).

## Descriptive and Inferential Statistics

### Inferential Statistics

Involves making predictions or inferences about a population from observations and analyses of a sample.

### Example

![](https://i.imgur.com/77Hkv9q.png)

Step 1: Consider all students in the front 2 rows of the lecture hall as a sample, and count the number of males, n~M~, and females, n~F~.

Step 2: Estimate the number of males and females in the population:

$$
\frac{n_{M}}{n_{M}+n_{F}} \times 168 \text { and } \frac{n_{F}}{n_{M}+n_{F}} \times 168 \text { respectively}
$$

### Descriptive Statistics

Involves obtaining descriptive summaries of a sample or a population. These could be in the form of frequency distributions, measures of central tendency (mean and median), or graphs such as histograms, pie charts, and bar charts.

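
The two-step estimate in the inferential statistics example above can be sketched in Python. This is only an illustration: the function name `estimate_population_counts` and the counts `n_m = 12`, `n_f = 9` are hypothetical, and 168 is the population size taken from the formula in that example.

```python
def estimate_population_counts(n_m, n_f, population_size=168):
    """Scale the sample proportions of males and females up to the population.

    n_m, n_f: counts of males and females observed in the sample.
    population_size: 168, the value used in the lecture's formula.
    """
    sample_size = n_m + n_f
    if sample_size == 0:
        raise ValueError("the sample must contain at least one student")
    # Multiply before dividing: n_M / (n_M + n_F) * population_size.
    est_males = n_m * population_size / sample_size
    est_females = n_f * population_size / sample_size
    return est_males, est_females


# Hypothetical counts from the front two rows (not real data).
est_males, est_females = estimate_population_counts(n_m=12, n_f=9)
print(est_males, est_females)
```

The two estimates always add up to the population size, since the two sample proportions add up to 1.
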
## Range, Mean, Median, Percentile

**Range**: difference between the largest and smallest values.

**Mean**: average of all values.

The **population mean** is denoted by *μ*.

If the sample is a collection of numbers $\{x_1, \ldots, x_n\}$, the **sample mean** is denoted by

$$
\bar{x}=\frac{x_{1}+\cdots+x_{n}}{n}=\frac{1}{n}\left(\sum_{i=1}^{n} x_{i}\right)
$$

### Example

![](https://i.imgur.com/rqKbtZS.png)

### Median

Suppose we have a population of numbers $\{x_1, \ldots, x_N\}$. Let $\{x_{i_1}, \ldots, x_{i_n}\}$ be a sample of this population.

The **median** refers to the "middle" value after ordering the values. In the formulas below, the values are assumed to be listed in ascending order.

The **population median** is defined as:

$$
\tilde{\mu}=\left\{\begin{array}{ll}
x_{m}, & \text { if } N \text { is odd of the form } N=2 m-1 ; \\
\frac{x_{m}+x_{m+1}}{2}, & \text { if } N \text { is even of the form } N=2 m
\end{array}\right.
$$

The **sample median** is defined as:

$$
\tilde{x}=\left\{\begin{array}{ll}
x_{i_{m}}, & \text { if } n \text { is odd of the form } n=2 m-1 ; \\
\frac{x_{i_m}+x_{i_{m+1}}}{2}, & \text { if } n \text { is even of the form } n=2 m
\end{array}\right.
$$

### Percentile

The K-th percentile of a sample or population is a number p such that at least K% of all values are less than or equal to p, and no more than K% of all values are strictly less than p.

e.g. In the collection {1,2,3,4,5}, the 26th percentile p must satisfy p ≥ 2 (at least 26% of the values, i.e. at least 2 of the 5, are less than or equal to p) and p ≤ 2 (no more than 26% of the values, i.e. at most 1 of the 5, are strictly less than p). Hence, the 26th percentile is 2.

Method to compute the K-th percentile of a given sample:

Suppose we have a sample {x~1~, x~2~, ..., x~n~} such that, after ordering the values, we get x~1~' ≤ x~2~' ≤ ... ≤ x~n~'.

For 0 < K < 100, first compute $\frac{K}{100}n$.

* If $\frac{K}{100}n$ is not a whole number, round it up to get the whole number m. Then the K-th percentile is x~m~'.
* If $\frac{K}{100}n$ is a whole number, M, then the K-th percentile is $\frac{x'_M+x'_{M+1}}{2}$.

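
The percentile method above can be sketched in Python as follows. This is a minimal illustration, not a library routine: the function name `percentile` is hypothetical, and K is assumed to be an integer so that the "is it a whole number" test can use exact integer arithmetic instead of floating-point comparison.

```python
import statistics


def percentile(sample, k):
    """Return the k-th percentile of `sample` using the lecture's rule.

    Assumes k is an integer with 0 < k < 100 (an assumption of this sketch).
    """
    if not sample:
        raise ValueError("sample must not be empty")
    if not 0 < k < 100:
        raise ValueError("k must satisfy 0 < k < 100")

    ordered = sorted(sample)          # x_1' <= x_2' <= ... <= x_n'
    n = len(ordered)

    # (k / 100) * n == whole + remainder / 100, computed exactly.
    whole, remainder = divmod(k * n, 100)

    if remainder != 0:
        # Not a whole number: round up to m and take x_m' (1-based index).
        m = whole + 1
        return ordered[m - 1]

    # Whole number M: average x_M' and x_{M+1}' (1-based indices).
    return (ordered[whole - 1] + ordered[whole]) / 2


print(percentile([1, 2, 3, 4, 5], 26))  # the lecture's example: prints 2

# With k = 50 the rule reproduces the sample median for odd and even n.
for data in ([1, 2, 3, 4, 5], [1, 2, 3, 4]):
    assert percentile(data, 50) == statistics.median(data)
```

Because 0 < K < 100, the value $\frac{K}{100}n$ lies strictly between 0 and n, so both index lookups stay within the ordered list.
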
### Example

![](https://i.imgur.com/7qEphjl.png)

### Example 2

![](https://i.imgur.com/DM5e1C2.png)

![](https://i.imgur.com/ddcFeSi.png)

Hence, the definition of the median is exactly the same as that of the 50th percentile.

## Experiment, Sample Space, Event

### Experiment

An **experiment** is any process (whether real or hypothetical) in which the possible outcomes can be identified ahead of time.

e.g. Rolling a die.

### Sample Space

The **sample space** of an experiment, denoted by Ω, is the set of all possible outcomes of that experiment.

e.g. {1,2,3,4,5,6}

#### Properties of a sample space

* **Collectively exhaustive**: the sample space contains all possible outcomes.
* **Mutually exclusive**: two different outcomes cannot both occur at the same time.

### Event

An **event** is a subset of outcomes contained in a sample space Ω.

An event is **simple** if it consists of exactly one outcome. It is **compound** if it consists of more than one outcome.

An event A is said to **occur** if the resulting experimental outcome is contained in A.

The experiment of rolling a die has 6 simple events: {1}, {2}, ..., {6}.

Example of a compound event:

* Event that the outcome is odd: {1,3,5}

In general, exactly one simple event will occur, but many compound events may occur simultaneously.

![](https://i.imgur.com/nyO2ZeX.png)

![](https://i.imgur.com/R1ruidZ.png)

## Sample Space vs Population

A population could possibly have repeated values (it is a collection), but a sample space cannot have any repeated values (it is a set).
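
The die-rolling definitions above map naturally onto Python sets. The sketch below is illustrative only: the event names `at_least_four` and `prime`, the helper functions, and the `population_ages` list are hypothetical and do not come from the lecture.

```python
# Sample space for rolling a die, as in the notes.
SAMPLE_SPACE = frozenset({1, 2, 3, 4, 5, 6})

# The six simple events: {1}, {2}, ..., {6}.
SIMPLE_EVENTS = [frozenset({outcome}) for outcome in sorted(SAMPLE_SPACE)]

# Compound events. "odd" is from the notes; the other two are hypothetical.
COMPOUND_EVENTS = {
    "odd": frozenset({1, 3, 5}),
    "at_least_four": frozenset({4, 5, 6}),
    "prime": frozenset({2, 3, 5}),
}


def is_event(candidate):
    """An event is any subset of the sample space."""
    return set(candidate) <= SAMPLE_SPACE


def occurred(events, outcome):
    """Names of the events that occur, i.e. that contain the outcome."""
    return sorted(name for name, event in events.items() if outcome in event)


outcome = 5
# Exactly one simple event occurs for any outcome...
assert sum(outcome in event for event in SIMPLE_EVENTS) == 1
# ...but several compound events may occur simultaneously.
print(occurred(COMPOUND_EVENTS, outcome))

# A population (collection) may repeat values; a set drops the repeats.
population_ages = [20, 21, 20, 22, 21]   # hypothetical data
print(len(population_ages), len(set(population_ages)))
```

A list is used for the population because it preserves repeated values, while `frozenset` is used for the sample space and events because repeated outcomes are meaningless there.
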