# DataCamp: Introduction to R
### class()
```r
# Check the type (class) of a variable
my_character <- "universe"
my_logical <- FALSE
class(my_character)  # returns "character"
class(my_logical)    # returns "logical"
```
---
## :penguin: Vector
:::info
Vectors are **one-dimensional arrays** that can hold numeric data, character data, or logical data.
:::
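A minimal sketch of the three kinds of data mentioned above. The variable names are illustrative and do not come from the course notes.

```r
# Illustrative names; each vector holds a single type of data
numeric_vector   <- c(1, 10, 49)
character_vector <- c("a", "b", "c")
logical_vector   <- c(TRUE, FALSE, TRUE)

class(numeric_vector)    # "numeric"
class(character_vector)  # "character"
class(logical_vector)    # "logical"
length(numeric_vector)   # 3
```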
### Naming a vector -- names() function
```r
# Use names() to label the elements of a vector
# Poker winnings from Monday to Friday
poker_vector <- c(140, -50, 20, -120, 240)
# The variable days_vector
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(poker_vector) <- days_vector
# Result: each element of poker_vector is labeled with the matching day from days_vector
```
### Vector selection -- use []
:::info
To select elements of a vector **(and later matrices, data frames, …)**, you can use square brackets.
Besides indexing with a single integer, you **can also use a vector of indices**, as in the example below.
:::
```r
poker_vector <- c(140, -50, 20, -120, 240)
poker_midweek <- poker_vector[c(2, 3, 4)] # can be abbreviated to poker_vector[2:4]; not repeated in later examples
# poker_midweek now holds: -50 20 -120
```
:::info
If the vector is named, **you can also select elements by name**, as in the example below:
:::
```r
poker_vector <- c(140, -50, 20, -120, 240)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(poker_vector) <- days_vector
poker_start <- poker_vector[c("Monday", "Tuesday", "Wednesday")]
# poker_start now holds:
#    Monday   Tuesday Wednesday
#       140       -50        20
```
:::info
If an operation gives you a logical vector, you can use it as an index too.
R knows what to do when you pass a logical vector in square brackets: **it selects only the elements that correspond to TRUE**.
Example below:
:::
```r
poker_vector <- c(140, -50, 20, -120, 240)
days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
names(poker_vector) <- days_vector
selection_vector <- poker_vector > 0
# selection_vector:
#    Monday   Tuesday Wednesday  Thursday    Friday
#      TRUE     FALSE      TRUE     FALSE      TRUE
poker_winning_days <- poker_vector[selection_vector]
# poker_winning_days now holds:
#    Monday Wednesday    Friday
#       140        20       240
```
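The logical selection above can also be written in one step, with the comparison placed directly inside the brackets. The sketch below reuses the poker data from these notes; `total_winnings` is a name introduced here for illustration.

```r
poker_vector <- c(140, -50, 20, -120, 240)
names(poker_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

# Put the comparison directly inside the square brackets
poker_winning_days <- poker_vector[poker_vector > 0]

# Total won on the winning days only (illustrative variable name)
total_winnings <- sum(poker_winning_days)  # 400
```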
---
## :penguin: Matrix
### Create a matrix -- matrix() function
```r
# R is case-sensitive: the base function is matrix(), not Matrix()
matrix(1:9, byrow = TRUE, nrow = 3)
# Or build the matrix from existing vectors
new_hope <- c(460.998, 314.4)
empire_strikes <- c(290.475, 247.900)
return_jedi <- c(309.306, 165.8)
# Create box_office
box_office <- c(new_hope, empire_strikes, return_jedi)
# Construct star_wars_matrix
star_wars_matrix <- matrix(box_office, byrow = TRUE, nrow = 3)
```
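To see what the `byrow` argument changes, here is a small sketch comparing both fill orders on the same data. The names `by_row` and `by_col` are illustrative.

```r
# byrow = TRUE fills the matrix row by row
by_row <- matrix(1:9, byrow = TRUE, nrow = 3)
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6
# [3,]    7    8    9

# byrow = FALSE (the default) fills it column by column
by_col <- matrix(1:9, byrow = FALSE, nrow = 3)
#      [,1] [,2] [,3]
# [1,]    1    4    7
# [2,]    2    5    8
# [3,]    3    6    9

# The two layouts are transposes of each other
identical(by_row, t(by_col))  # TRUE
```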
### Naming a matrix -- colnames() & rownames()
```r
# Vectors region and titles, used for naming
region <- c("US", "non-US")
titles <- c("A New Hope", "The Empire Strikes Back", "Return of the Jedi")
# Name the columns with region
colnames(star_wars_matrix) <- region
# Name the rows with titles
rownames(star_wars_matrix) <- titles
```
### Matrix operation functions
:::info
Some useful matrix functions:
- **rowSums():** calculates the sum of each row.
- **colSums():** calculates the sum of each column.
- **cbind():** adds a new column to a matrix, e.g. binds a vector onto a matrix as an extra column.
- **rbind():** adds new rows to a matrix, e.g. stacks one matrix on top of another.

:warning: You can use ls() to check which variables exist in the environment.
:::
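The functions listed above can be tried on the Star Wars matrix from these notes. This is a sketch; the names `worldwide_vector`, `region_totals`, `all_wars_matrix` and `with_totals` are assumptions chosen for illustration.

```r
# Box office figures from the notes above (columns: US, non-US)
star_wars_matrix <- matrix(c(460.998, 314.4,
                             290.475, 247.900,
                             309.306, 165.8),
                           byrow = TRUE, nrow = 3)
colnames(star_wars_matrix) <- c("US", "non-US")
rownames(star_wars_matrix) <- c("A New Hope", "The Empire Strikes Back", "Return of the Jedi")

worldwide_vector <- rowSums(star_wars_matrix)  # total per film
region_totals    <- colSums(star_wars_matrix)  # total per region

# cbind(): add the per-film totals as a new column
all_wars_matrix <- cbind(star_wars_matrix, worldwide = worldwide_vector)

# rbind(): add the per-region totals as a new row
with_totals <- rbind(star_wars_matrix, Total = region_totals)

dim(all_wars_matrix)  # 3 3
dim(with_totals)      # 4 2
ls()                  # lists the variables defined so far
```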
---
## :penguin: Factor
:::info
### What's a factor
A factor is used to store categorical variables, i.e. variables limited to a fixed set of categories.
Ex. Limit the sex categories to "Male" or "Female".
:::
### Create a factor -- factor() function
```r
# First we create a vector
sex_vector <- c("Male", "Female", "Female", "Male", "Male")
# There are two categories, or in R terms 'factor levels', at work here: "Male" and "Female".
# Convert sex_vector to a factor
factor_sex_vector <- factor(sex_vector)
# Print out factor_sex_vector
print(factor_sex_vector)
# Output:
# [1] Male   Female Female Male   Male
# Levels: Female Male
```
### Categorical type
:::info
- **Nominal categorical variable:** a categorical variable without an implied order. You cannot meaningfully compare the elements of a nominal factor with ordering operators such as > or <.
- **Ordinal categorical variable:** has a natural ordering. :warning: Hint: if the elements are ordered, set the **ordered & levels** arguments in the factor() function.
:::
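A minimal sketch of the difference between the two kinds of factor, using made-up temperature data (the data and variable names are illustrative assumptions, not part of these notes).

```r
# Illustrative data
temperature_vector <- c("High", "Low", "High", "Low", "Medium")

# Nominal: no order between levels (levels default to alphabetical order)
nominal_factor <- factor(temperature_vector)

# Ordinal: state the order explicitly, from lowest to highest
ordinal_factor <- factor(temperature_vector,
                         ordered = TRUE,
                         levels = c("Low", "Medium", "High"))

levels(nominal_factor)      # "High" "Low" "Medium"
levels(ordinal_factor)      # "Low" "Medium" "High"
is.ordered(nominal_factor)  # FALSE
is.ordered(ordinal_factor)  # TRUE
```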
### Summarizing a factor
```r
# While it is still a character vector
survey_vector <- c("M", "F", "F", "M", "M")
summary(survey_vector)
# Output:
#    Length     Class      Mode
#         5 character character

# After converting the vector to a factor
factor_survey_vector <- factor(survey_vector)
# Levels are sorted alphabetically ("F", "M"), so rename them in that order
levels(factor_survey_vector) <- c("Female", "Male")
summary(factor_survey_vector)
# Output:
# Female   Male
#      2      3
```
### Comparing elements in ordinal type factor
:::info
Once you have extracted elements of an ordered factor and saved them as variables, you can compare them directly with operators such as > and <.
:::
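A short sketch of such a comparison, assuming an illustrative ordered factor of speeds (the data and the names `second` and `fifth` are not from these notes).

```r
# Illustrative data: speeds ordered from slow to fast
speed_vector <- c("medium", "slow", "slow", "medium", "fast")
factor_speed_vector <- factor(speed_vector,
                              ordered = TRUE,
                              levels = c("slow", "medium", "fast"))

second <- factor_speed_vector[2]  # slow
fifth  <- factor_speed_vector[5]  # fast

second > fifth  # FALSE: "slow" ranks below "fast"
second < fifth  # TRUE
```

On an unordered factor, the same comparison returns NA with a warning.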
---
## :penguin: DataFrame
:::info
Basic data frame operations such as slicing work the same way as for vectors and matrices, and functions like **head()** or **str()** help you inspect the data. One difference when slicing is that you can use a feature (column) name as the column index (example below).
:::
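The examples in this section rely on a data frame called `planets_df`, which these notes never define. The sketch below rebuilds it from the outputs shown in this section so the examples can be run. The Earth row does not appear in any output here, so its name, type and rotation are filled in as assumptions.

```r
# Reconstructed for illustration from the outputs shown in these notes;
# Earth's name, type and rotation are assumptions.
planets_df <- data.frame(
  name     = c("Mercury", "Venus", "Earth", "Mars",
               "Jupiter", "Saturn", "Uranus", "Neptune"),
  type     = c(rep("Terrestrial planet", 4), rep("Gas giant", 4)),
  diameter = c(0.382, 0.949, 1.000, 0.532, 11.209, 9.449, 4.007, 3.883),
  rotation = c(58.64, -243.02, 1.00, 1.03, 0.41, 0.43, -0.72, 0.67),
  rings    = c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)
)

head(planets_df, 3)  # show the first three rows
str(planets_df)      # show the structure: 8 observations of 5 variables
```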
```r
# When we do not know a column's position number, we can use its name instead.
# Select the first 5 values of the diameter column
planets_df[1:5, "diameter"]
# Output:
# [1]  0.382  0.949  1.000  0.532 11.209
```
```r
# We can also select an entire column with the "$" sign.
rings_vector <- planets_df$rings
rings_vector
# Output:
# [1] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
```
### Selecting part of a DataFrame (1)
:::info
The data frame contains a "rings" feature, and we saved its values on their own as rings_vector.
If we use a logical vector like this as the row index, R selects only the rows where the value is **TRUE**.
:::
```r
# Select all columns for planets with rings
planets_df[rings_vector, ]
# Output:
#      name      type diameter rotation rings
# 5 Jupiter Gas giant   11.209     0.41  TRUE
# 6  Saturn Gas giant    9.449     0.43  TRUE
# 7  Uranus Gas giant    4.007    -0.72  TRUE
# 8 Neptune Gas giant    3.883     0.67  TRUE
```
### Selecting part of a DataFrame (2)
```r
# subset() achieves the same result as above
subset(planets_df, subset = rings) # same result as planets_df[rings_vector, ]
subset(planets_df, subset = diameter < 1)
# Output of the second call:
#      name               type diameter rotation rings
# 1 Mercury Terrestrial planet    0.382    58.64 FALSE
# 2   Venus Terrestrial planet    0.949  -243.02 FALSE
# 4    Mars Terrestrial planet    0.532     1.03 FALSE
```
### Sorting
```r
a <- c(100, 10, 1000)
order(a)
# Output:
# [1] 2 1 3   (the positions of the elements, from smallest to largest)
a[order(a)] # use those positions to sort the vector
# Output:
# [1]   10  100 1000
```
```r
# Use order() to create positions
positions <- order(planets_df$diameter) # a vector of row positions, smallest diameter first
# Use positions to sort planets_df
planets_df[positions, ]
```
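As a complement to the sorting examples above, the sketch below shows descending order with the `decreasing` argument of order(), and sort() as a shortcut when only the sorted values are needed.

```r
a <- c(100, 10, 1000)

# decreasing = TRUE returns positions from largest to smallest
order(a, decreasing = TRUE)     # 3 1 2
a[order(a, decreasing = TRUE)]  # 1000 100 10

# sort() returns the sorted values directly
sort(a)                         # 10 100 1000
```

For data frames, order() is still the tool to use, because the positions it returns can reorder whole rows, as in planets_df[positions, ].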
---
## :penguin: List
:::info
A list in R allows you to gather a variety of objects under one name (that is, the name of the list) in an ordered way. These objects can be matrices, vectors, data frames, even other lists, etc.
:::
### Creating a list -- list() function
```r
# Vector with numerics from 1 up to 10
my_vector <- 1:10
# Matrix with numerics from 1 up to 9
my_matrix <- matrix(1:9, ncol = 3)
# First 10 rows of the built-in data frame mtcars
my_df <- mtcars[1:10, ]
# Construct list with these different elements:
my_list <- list(my_vector, my_matrix, my_df)
# Result: [[1]] is the vector, [[2]] the matrix, [[3]] the data frame
```
### Naming your list
:::info
:warning: Name the components so you always know what each element of the list contains.
:::
```r
# First way: name the components when creating the list
my_list <- list(name1 = your_comp1,
                name2 = your_comp2)

# Second way (same as naming a vector: pass a vector of names to names())
my_list <- list(your_comp1, your_comp2) # Create your list
names(my_list) <- c("name1", "name2") # Name your list
```
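A runnable version of both naming approaches, using concrete components in place of the placeholders. The component names `vec`, `mat` and `df`, and the variable `my_list2`, are illustrative assumptions.

```r
my_vector <- 1:10
my_matrix <- matrix(1:9, ncol = 3)
my_df <- mtcars[1:10, ]

# First way: name the components while creating the list
my_list <- list(vec = my_vector, mat = my_matrix, df = my_df)

# Second way: create the list first, then assign names
my_list2 <- list(my_vector, my_matrix, my_df)
names(my_list2) <- c("vec", "mat", "df")

names(my_list)                # "vec" "mat" "df"
identical(my_list, my_list2)  # TRUE: both ways give the same list
```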
### Selecting elements from a list
```r
# shining_list already exists and contains a vector, a matrix and a data frame
# Select a component of the list by position
shining_list[[2]]
# or by name
shining_list[["actors"]]
shining_list$actors
# Select a specific element within a component
shining_list[["actors"]][2]
```
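These notes do not show how `shining_list` is built, so the sketch below uses a hypothetical stand-in with simpler components, just to make the selection syntax runnable. It also shows that single brackets return a sub-list rather than the component itself.

```r
# Hypothetical stand-in for shining_list (contents are illustrative)
shining_list <- list(
  moviename = "The Shining",
  actors    = c("Jack Nicholson", "Shelley Duvall"),
  year      = 1980
)

shining_list[[2]]            # the whole actors vector
shining_list$actors          # same component, selected by name
shining_list[["actors"]][2]  # "Shelley Duvall"

# Single brackets keep the list wrapper; double brackets extract the component
class(shining_list["actors"])    # "list"
class(shining_list[["actors"]])  # "character"
```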