# Data transformation
## Learning objectives
* Identify computer programming as a form of problem solving
* Practice decomposing an analytical goal into a set of discrete, computational tasks
* Identify the verbs for a language of data manipulation
* Clarify confusing aspects of data transformation from [R for Data Science](http://r4ds.had.co.nz/transform.html)
* Practice transforming data
---
### Diamonds Example(s)
- If you know the order of arguments, you don't have to specify the arguments explicitly
- It's best practice to name arguments explicitly though
1. Identify inputs
```
library(ggplot2)  # the diamonds dataset ships with ggplot2
data("diamonds")
```
2. Filter
```
library(dplyr)
# Keep only the rows whose cut is "Ideal"
diamonds_ideal <- filter(.data = diamonds, cut == "Ideal")
# If you know the order of arguments, this also works:
diamonds_ideal <- filter(diamonds, cut == "Ideal")
```
- The first argument in dplyr functions is usually called .data so as not to be confused with the function `data()`
3. Summarize
```
summarize(.data = diamonds_ideal, avg_price = mean(price))
```
- For calculating statistics for sub-groups in your dataset, use `group_by` (which defines a grouping structure)
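As a rough sketch of how `group_by` changes the summarize step, the example below computes the average price for every level of `cut` rather than for "Ideal" diamonds only. The names `price_by_cut` and `n_diamonds` are made up for illustration and do not come from the slides.

```r
library(dplyr)
library(ggplot2)  # provides the diamonds dataset

# Define a grouping structure: one group per level of cut
diamonds_by_cut <- group_by(.data = diamonds, cut)

# summarize() now returns one row per group instead of a single row
price_by_cut <- summarize(
  .data = diamonds_by_cut,
  avg_price = mean(price),
  n_diamonds = n()  # number of rows in each group
)

price_by_cut
```

Because the grouping is stored on the data frame, the `summarize` call itself looks the same as in step 3; only its input changed.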
----
### `Tidyverse`
#### We <3 Hadley Wickham
**The functions to know are in the slides**
- You can use British English or American English spellings: for example, `summarise()` and `summarize()` are the same function
- The `=` and `<-` operators both assign a name to an object
- The `==` operator tests whether two values are equal and returns `TRUE` or `FALSE`
- Imagine the `%>%` operator as the coding equivalent of the words "and then"
  - Example: `flights %>% group_by(dest)` reads as "take `flights`, and then group it by `dest`"
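To make the "and then" reading concrete, here is a minimal sketch that rewrites the diamonds example as a pipeline and compares it with the equivalent nested call. It uses `diamonds` instead of `flights` so that it only needs dplyr and ggplot2; the names `piped` and `nested` are illustrative only.

```r
library(dplyr)
library(ggplot2)  # provides the diamonds dataset

# Take diamonds, and then keep the Ideal cuts, and then average their price
piped <- diamonds %>%
  filter(cut == "Ideal") %>%
  summarize(avg_price = mean(price))

# The same computation without the pipe: read from the inside out
nested <- summarize(filter(diamonds, cut == "Ideal"), avg_price = mean(price))

# Compare the two results
all.equal(piped, nested)
```

The pipe passes the object on its left as the first argument (`.data`) of the function on its right, which is why the data frame does not need to be repeated at each step.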