<h1 style="color:teal">CMPD103 Statistics for Computer Science</h1>
###### tags: `notes` `Statistics`
[toc]
---
<h1 style="color:darkcyan">1. Graphical Data Representation</h1>
## :book: Basic statistical terminology
Variables
: A variable is a characteristic that may change under different circumstances
- e.g.
- Hair color - varies between individuals
- White blood cell count - varies between individuals
- Class schedule - varies between courses
An experimental unit
: Individual or object on which a variable is measured
A measurement
: The value that results when a variable is actually measured on an experimental unit
A population
: A set of entities about which statistical inferences are to be drawn
A sample
: A subset of entities drawn from the overall population
<br>
| Term | Example |
|:--------- | ---------- |
| Variable | Hair color |
| Experimental Unit | Person |
| Measurement | Brown, black, blonde, etc. |
| Population | Children aged 1-3 years |
| Sample | Taska Sri Muda |
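
The terms in the table can be sketched as plain Python data. This is only an illustration: the `population` list, the `child_id` field, and the `draw_sample` helper are hypothetical names, and simple random sampling is an assumption (the notes do not say how the sample is chosen).

```python
import random

# Hypothetical population: each dict is one experimental unit (a child),
# and "hair_color" is the variable measured on it.
population = [
    {"child_id": i, "hair_color": color}
    for i, color in enumerate(
        ["brown", "black", "blonde", "black", "brown", "black"]
    )
]


def draw_sample(units, size, seed=0):
    """Return a simple random sample (a subset) of the given units."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return rng.sample(units, size)


sample = draw_sample(population, 3)

# The measurements are the values of the variable observed on the sample.
measurements = [unit["hair_color"] for unit in sample]
```

Every unit in `sample` also belongs to `population`, which is what makes the sample a subset of the population.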
<br>
## :book: How many variables are measured?
Univariate data
: One variable is measured on a single experimental unit
Bivariate data
: Two variables are measured on a single experimental unit
Multivariate data
: More than two variables are measured on a single experimental unit
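
As a minimal sketch of the three definitions above, the function below labels a record by counting the variables measured on one experimental unit. The function name `classify_record` and the example field names are assumptions made for illustration.

```python
def classify_record(record):
    """Label one experimental unit's record by its number of variables."""
    n = len(record)
    if n == 0:
        raise ValueError("at least one variable must be measured")
    if n == 1:
        return "univariate"
    if n == 2:
        return "bivariate"
    return "multivariate"


# Hypothetical records for a single person
print(classify_record({"height_cm": 170}))
print(classify_record({"height_cm": 170, "weight_kg": 65}))
print(classify_record({"height_cm": 170, "weight_kg": 65, "age": 21}))
```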
<br>
## :book: Types of Variables
```graphviz
graph variables {
node [shape=box, color="#008080", style =rounded, margin="0.2"]
Variables -- Qualitative
Variables -- Quantitative
Quantitative -- {Discrete Continuous}
}
```
Qualitative variable
: - measures a ==quality or characteristic== on each experimental unit
- e.g. Hair color, car maker, gender, state of birth
Quantitative variables
: - measures a ==numerical== quantity on each experimental unit
Discrete
: - can assume only a finite or countable number of values (**typically whole numbers**)
- e.g. Number of cars, number of oranges
Continuous
: - can assume infinitely many values corresponding to the points on a line interval (**can take fractional values**)
- e.g. Price, weight, time
:::info
:notebook: ***Example***:
- For each orange tree in a grove, the number of oranges is measured
**Quantitative discrete**
- For a particular day, the number of cars entering a college campus is measured
**Quantitative discrete**
- Time until a light bulb burns out
**Quantitative continuous**
:::
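
The classification used in the examples above can be approximated in code. This is a rough heuristic, not a rule from the notes: it assumes text values are qualitative, whole-number (`int`) values are discrete, and any `float` value makes the variable continuous. The function name `variable_type` is hypothetical.

```python
def variable_type(values):
    """Guess a variable's type from the Python types of its measurements."""
    if not values:
        raise ValueError("need at least one measurement")
    if all(isinstance(v, str) for v in values):
        return "qualitative"

    # bool is a subclass of int in Python, so exclude it explicitly
    def is_number(v):
        return isinstance(v, (int, float)) and not isinstance(v, bool)

    if all(is_number(v) for v in values):
        if all(isinstance(v, int) for v in values):
            return "quantitative discrete"
        return "quantitative continuous"
    raise TypeError("mixed or unsupported measurement types")


oranges_per_tree = [112, 98, 130]      # counts, so whole numbers
bulb_lifetime_hours = [1020.5, 998.0]  # time, so fractional values are possible
hair_color = ["brown", "black"]
```

The heuristic depends on how the data was recorded: a time rounded to whole hours and stored as `int` would be reported as discrete even though time itself is continuous.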
<br>
## :book: Graphical Data Representation
- The choice of graph/chart depends largely on the nature of the data
- A ==single variable== measured for ==different population segments== can be graphed using a **pie** or **bar chart**
- A histogram can be used to plot the frequency of score occurrences in a continuous data set that has been divided into classes
- A single variable measured over time is called a time series. It can be graphed using a line or bar chart
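
Dividing a continuous data set into classes, as a histogram requires, can be sketched as follows. The equal-width classes, the function name `class_frequencies`, and the sample `scores` are all assumptions made for illustration; the notes do not prescribe how class boundaries are chosen.

```python
def class_frequencies(data, num_classes):
    """Split data into equal-width classes and count the values in each.

    Returns a list of (lower_bound, upper_bound, frequency) tuples.
    """
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("data must contain at least two distinct values")
    width = (hi - lo) / num_classes
    counts = [0] * num_classes
    for x in data:
        # The maximum value would fall just past the last class, so clamp it
        index = min(int((x - lo) / width), num_classes - 1)
        counts[index] += 1
    return [
        (lo + i * width, lo + (i + 1) * width, counts[i])
        for i in range(num_classes)
    ]


# Hypothetical continuous measurements
scores = [1.0, 2.0, 2.5, 3.0, 4.0, 5.0]
classes = class_frequencies(scores, 2)
# classes == [(1.0, 3.0, 3), (3.0, 5.0, 3)]
```

Each class includes its lower bound and excludes its upper bound, except the last class, which also includes the maximum value.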
### Graphing Qualitative Variables
- Use a **data distribution** to describe:
- What values of the variable have been measured
- How often each value has occurred
- "How often" can be measured in three ways:
- Frequency
- Relative frequency
- Percent = 100 × Relative frequency
#### Example: Frequency table
| Variable | Frequency (f) | Relative Frequency (Rf) | Percent (%) |
| -------- | -------- | -------- | --- |
| Measurement 1 | 3 | 3/25 = 0.12 | 0.12 * 100 = 12 |
| Measurement 2 | 8 | 8/25 = 0.32 | 0.32 * 100 = 32 |
| Measurement 3 | 2 | 2/25 = 0.08 | 0.08 * 100 = 8 |
| Measurement 4 | 3 | 3/25 = 0.12 | 0.12 * 100 = 12 |
| Measurement 5 | 5 | 5/25 = 0.2 | 0.2 * 100 = 20 |
| Measurement 6 | 4 | 4/25 = 0.16 | 0.16 * 100 = 16 |
| **Total** | **25** | **1** | **100** |
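
The columns of the table above can be reproduced with a short script. This is a sketch using the standard-library `collections.Counter`; the function name `frequency_table` and the dictionary keys `f`, `rf`, and `percent` are assumptions, and the raw observations are rebuilt from the frequencies in the table.

```python
from collections import Counter


def frequency_table(observations):
    """Return frequency, relative frequency, and percent for each value."""
    counts = Counter(observations)
    total = sum(counts.values())
    table = {}
    for value, f in counts.items():
        rf = f / total  # relative frequency = frequency / total count
        table[value] = {"f": f, "rf": rf, "percent": 100 * rf}
    return table


# Frequencies taken from the table above (total = 25)
counts_from_table = {
    "Measurement 1": 3,
    "Measurement 2": 8,
    "Measurement 3": 2,
    "Measurement 4": 3,
    "Measurement 5": 5,
    "Measurement 6": 4,
}

# Expand the counts back into a list of 25 individual observations
observations = [
    label for label, f in counts_from_table.items() for _ in range(f)
]

table = frequency_table(observations)
for label, row in table.items():
    print(f"{label}: f={row['f']}, Rf={row['rf']:.2f}, {row['percent']:.0f}%")
```

The relative frequencies sum to 1 and the percentages sum to 100, matching the **Total** row.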
---
### Relative Frequency Histograms
-