<h1 style="color:teal">CMPD103 Statistics for Computer Science</h1>

###### tags: `notes` `Statistics`

[toc]

---

<h1 style="color:darkcyan">1. Graphical Data Representation</h1>

## :book: Basic statistical terminology

Variable
: A characteristic that may change under different circumstances
    - e.g.
        - Hair color (varies between individuals)
        - White blood cell count (varies between individuals)
        - Class schedule (varies between courses)

Experimental unit
: The individual or object on which a variable is measured

Measurement
: The result obtained when a variable is actually measured on an experimental unit

Population
: The set of entities about which statistical inferences are to be drawn

Sample
: A subset of entities drawn from the overall population

<br>

| Terminology | Example |
|:--------- | ---------- |
| Variable | Hair color |
| Experimental Unit | Person |
| Measurement | Brown, black, blonde, etc. |
| Population | Children aged 1-3 years |
| Sample | Taska Sri Muda |

<br>

## :book: How many variables are measured?

Univariate data
: One variable is measured on a single experimental unit

Bivariate data
: Two variables are measured on a single experimental unit

Multivariate data
: More than two variables are measured on a single experimental unit

<br>

## :book: Types of Variables

```graphviz
graph variables {
    node [shape=box, color="#008080", style=rounded, margin="0.2"]
    Variables -- Qualitative
    Variables -- Quantitative
    Quantitative -- {Discrete Continuous}
}
```

Qualitative variable
: - Measures a ==quality or characteristic== on each experimental unit
  - e.g. Hair color, car make, gender, state of birth

Quantitative variable
: - Measures a ==numerical== quantity on each experimental unit

Discrete
: - Can assume only a finite or countable number of values (**typically whole-number counts**)
  - e.g. Number of cars, number of oranges

Continuous
: - Can assume the infinitely many values corresponding to the points on a line interval (**can take fractional values**)
  - e.g. Price, weight, time

:::info
:notebook: ***Example***:
- For each orange tree in a grove, the number of oranges is measured: **Quantitative discrete**
- For a particular day, the number of cars entering a college campus is measured: **Quantitative discrete**
- Time until a light bulb burns out: **Quantitative continuous**
:::

<br>

## :book: Graphical Data Representation

- The choice of graph or chart depends largely on the nature of the data
- A ==single variable== measured for ==different population segments== can be graphed using a **pie** or **bar chart**
- A histogram can be used to plot the frequency of score occurrences in a continuous data set that has been divided into classes
- A single variable measured over time is called a time series. It can be graphed using a line or bar chart

### Graphing Qualitative Variables

- Use a **data distribution** to describe:
    - What values of the variable have been measured
    - How often each value has occurred
- "How often" can be measured in 3 ways:
    - Frequency
    - Relative frequency = Frequency / n, where n is the total number of measurements
    - Percent = 100 x Relative frequency

#### Example: Frequency table

| Variable | Frequency (f) | Relative Frequency (Rf) | Percentage (%) |
| -------- | -------- | -------- | --- |
| Measurement 1 | 3 | 3/25 = 0.12 | 0.12 * 100 = 12 |
| Measurement 2 | 8 | 8/25 = 0.32 | 0.32 * 100 = 32 |
| Measurement 3 | 2 | 2/25 = 0.08 | 0.08 * 100 = 8 |
| Measurement 4 | 3 | 3/25 = 0.12 | 0.12 * 100 = 12 |
| Measurement 5 | 5 | 5/25 = 0.20 | 0.20 * 100 = 20 |
| Measurement 6 | 4 | 4/25 = 0.16 | 0.16 * 100 = 16 |
| **Total** | **25** | **1** | **100** |

---

### Relative Frequency Histograms

-