{%hackmd xNRvyO9tTASkQbPzLcM4zg %}
# Simple Visualization Presentations
---
## The Fastest Possible Path
We want to do this in the lowest-friction way we can. So we have to make a few assumptions:
* Our data is reasonably accessible
* Our data does not require substantial cleaning
* We sort of know the story we want to tell
---
## HackMD: the Platform
[HackMD](https://hackmd.io/) is a platform for rapid, lightweight text creation. It uses `markdown` for syntax, and extends this with a number of [features](https://hackmd.io/features?both) that make it a great place to develop slides and visualizations.
We've set up a workspace at [hackmd.io/is457-fall2020/](https://hackmd.io/is457-fall2020) where you can collaborate and view notes.
But: don't mistake its ease of use for a lack of functionality!
----
This is functional!
---
## Text Markup 1
Text markup in markdown is pretty straightforward, and might even match what you would otherwise expect!
Headings are produced by `# Heading 1`, where each additional `#` at the beginning decreases the heading size.
|Example|Result|
|-|-|
|`_emphasis_`|_emphasis_|
|`*sorta emphasized*`|*sorta emphasized*|
|`**really emphasized**`|**really emphasized**|
|`[a link](https://google.com/)`|[a link](https://google.com/)|
---
## Text Markup 2
You can make bulleted lists with something like this:
```
* first
* second
* third
```
And numbered lists:
```
1. First
1. Second
1. Third
```
The numbers don't even need to be in order! The source above renders as:
1. First
1. Second
1. Third
---
## Creating Slides
You can create slides -- and in fact, this presentation was created in HackMD -- by creating a structured markdown file.
To separate out one slide from another, use `---` on a line all by itself -- and be sure to include an empty line both before and after it!
HackMD has some [detailed instructions on slide presentations](https://hackmd.io/s/how-to-create-slide-deck) as well.
---
## What about visualizations?
We can do those! You can either embed images (which HackMD helpfully uploads to imgur) or you can embed a vega-lite visualization itself.
---
## vega-lite: embed
To embed vega-lite, you write a fenced code block whose language is identified as `vega`.
For instance, this would embed a vega-lite visualization:
````
```vega
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"data": {"url": "https://vega.github.io/editor/data/barley.json"},
"mark": "bar",
"encoding": {
"x": {"aggregate": "sum", "field": "yield", "type": "quantitative"},
"y": {"field": "variety", "type": "nominal"},
"color": {"field": "site", "type": "nominal"}
}
}
```
````
That looks like this:
----
```vega
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"data": {"url": "https://vega.github.io/editor/data/barley.json"},
"mark": "bar",
"encoding": {
"x": {"aggregate": "sum", "field": "yield", "type": "quantitative"},
"y": {"field": "variety", "type": "nominal"},
"color": {"field": "site", "type": "nominal"}
}
}
```
---
## Wait, vega-lite?
Yeah, vega-lite!
https://vega.github.io/editor/#/
vega-lite is a visualization system that uses declarative JSON specifications. A specification will typically take a form similar to this:
```json
{
"data": .. ,
"transform": [ .. ],
"mark": .. ,
"selection": .. ,
"encoding": .. ,
"config": ..
}
```
---
# vega-lite syntax: basics
From the vega-lite examples, you can make a histogram (a bar chart of binned counts) like so:
```json
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"data": {"url": "https://vega.github.io/editor/data/movies.json"},
"mark": "bar",
"encoding": {
"x": {
"bin": true,
"field": "IMDB Rating",
"type": "quantitative"
},
"y": {
"aggregate": "count",
"type": "quantitative"
}
}
}
```
----
```vega
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"data": {"url": "https://vega.github.io/editor/data/movies.json"},
"mark": "bar",
"encoding": {
"x": {
"bin": true,
"field": "IMDB Rating",
"type": "quantitative"
},
"y": {
"aggregate": "count",
"type": "quantitative"
}
}
}
```
---
# vega-lite syntax
There are several mechanisms by which we describe data representations in vega-lite, but the overarching principle is that it is declarative: we say what we want the chart to look like, and vega-lite works out how to draw it.
The place where this is no longer true is when we modify `datum` values, because there we spell out the computation ourselves.
---
# vega-lite syntax
The syntax you will need to be the most familiar with:
* `mark`: how to visually represent something
* [`encoding`](https://vega.github.io/vega-lite/docs/encoding.html): the translation between data and the mark
* `aggregate`: operating over a collection of points -- `mean`, `sum`, `median`,
`min`, `max`, `count`
* `type`: `quantitative`, `temporal`, `ordinal`, or `nominal`
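
As a rough sketch of how these four pieces fit together, the spec below draws the mean barley yield per site. It reuses the `barley.json` dataset from earlier in this deck; the choice of `mean` and of the `site` field is just for illustration.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Sketch: mark + encoding + aggregate + type in one small spec.",
  "data": {"url": "https://vega.github.io/editor/data/barley.json"},
  "mark": "bar",
  "encoding": {
    "x": {"field": "site", "type": "nominal"},
    "y": {"aggregate": "mean", "field": "yield", "type": "quantitative"}
  }
}
```

Swapping `mean` for `median` or `max` changes what each bar summarizes without touching anything else in the spec.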
---
## Marks - I
vega-lite has numerous different `mark` types. We can break these down by the type of data they can represent. We will only consider "primitive" marks today.
* `area` & `line`
* `bar` & `rect`
* `point` & `circle` & `square`
* `rule` & `text`
* `tick`
* `geoshape`
We will demonstrate several of these using our datasets, but first we need to learn how to transform data.
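
Before moving on, here is a minimal sketch of what changing a mark looks like. It assumes the `barley.json` dataset used earlier and draws one `tick` per observation.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Sketch: a strip plot using the tick mark, one tick per row.",
  "data": {"url": "https://vega.github.io/editor/data/barley.json"},
  "mark": {"type": "tick"},
  "encoding": {
    "x": {"field": "yield", "type": "quantitative"},
    "y": {"field": "site", "type": "nominal"}
  }
}
```

Changing `"tick"` to `"point"` or `"circle"` redraws the same data with a different primitive; the encoding stays the same.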
---
## Transformations - I
At the `view` level of your definition, you can specify transformations that modify, filter, or reshape the data.
Transformations go in a `transform` array. A transformation can either work within a given dataset (by specifying a new attribute of each data point) or reshape the data.
The types of transformations we will cover today are `filter` and `calculate`.
---
## Transformations - II
We apply a `filter` transform by specifying a condition that each data point must satisfy in order to be kept. The condition can be a selection, an expression, or a logical definition (a test on a single field). We will address selection and expression filtering later.
---
## Transformations - III
A logical filtering operation might look like one of these:
```json
"transform": [
  {"filter": {"field": "eye_color", "oneOf": ["blue", "brown"]}},
  {"filter": {"field": "age", "lte": 100}}
]
```
We can use `lt`, `gt`, `lte`, `gte`, `equal`, `oneOf`, `range` and `valid`.
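
To see a few of these in a complete spec, here is a sketch against the `movies.json` dataset used elsewhere in this deck. The thresholds are arbitrary, and the last filter is written as an expression string purely for comparison with the field-based form.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Sketch: field-based filters (valid, range) next to an expression filter.",
  "data": {"url": "https://vega.github.io/editor/data/movies.json"},
  "transform": [
    {"filter": {"field": "IMDB Rating", "valid": true}},
    {"filter": {"field": "IMDB Rating", "range": [6, 8]}},
    {"filter": "datum['US Gross'] > 0"}
  ],
  "mark": "point",
  "encoding": {
    "x": {"field": "IMDB Rating", "type": "quantitative"},
    "y": {"field": "US Gross", "type": "quantitative"}
  }
}
```

Filters in the array are applied in order, so each one only sees the rows that survived the previous one.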
---
## Transformations - IV
We can also compute a new field using the `calculate` transform. This is an expression that is evaluated on every data point, which is supplied as the variable `datum` to the expression.
```json
"transform": [
{"calculate": "datum.age / 7", "as": "dog_years"}
]
```
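
A complete sketch, assuming the `movies.json` dataset from earlier: the calculated field `gross_ratio` is a made-up name for this example, and the filter is only there to avoid dividing by zero.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Sketch: derive a new field with calculate, then plot its distribution.",
  "data": {"url": "https://vega.github.io/editor/data/movies.json"},
  "transform": [
    {"filter": {"field": "US Gross", "gt": 0}},
    {
      "calculate": "datum['Worldwide Gross'] / datum['US Gross']",
      "as": "gross_ratio"
    }
  ],
  "mark": "bar",
  "encoding": {
    "x": {"bin": true, "field": "gross_ratio", "type": "quantitative"},
    "y": {"aggregate": "count", "type": "quantitative"}
  }
}
```

Because these field names contain spaces, the expression uses bracket notation (`datum['US Gross']`) rather than the dot notation shown above.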
---
## Selections - I
Selections are defined with *names* -- this seems to be the most common stumbling block. You get to choose the name!
We use selections in one of a few ways.
* We can conditionally encode data -- for instance, change visibility, or alpha, or color.
* We can use selections as input for filtering data. Typically this is done with one plot showing unfiltered data and another using a filter from that selection.
* We can scale a domain based on a selection.
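
The first of these, conditional encoding, could be sketched roughly like this. The selection name `brush` and the two colors are arbitrary choices for this example.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Sketch: points inside the brush keep their color, the rest turn gray.",
  "data": {"url": "https://vega.github.io/editor/data/movies.json"},
  "selection": {"brush": {"type": "interval"}},
  "mark": "point",
  "encoding": {
    "x": {"field": "US Gross", "type": "quantitative"},
    "y": {"field": "Worldwide Gross", "type": "quantitative"},
    "color": {
      "condition": {"selection": "brush", "value": "steelblue"},
      "value": "lightgray"
    }
  }
}
```

The name under `selection` and the name inside `condition` have to match; that is the whole link between the two.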
---
## Selections - II
There are three types of selections:
* `single` -- a single point
* `multi` -- multiple points
* `interval` -- collections of values along encoding axes
We will focus on the `interval` selection.
---
## Selections - III
We can define a box-based selector that operates along the x axis by specifying which encoding it is linked to. Here, we name it `valrange`, but we can choose whatever name we like.
```json
"selection": {
  "valrange": {"type": "interval", "encodings": ["x"]}
}
```
Let's try this.
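
One way `valrange` might be put to work is to let it set the x domain of a second view. This is only a sketch: the `tick` overview strip, the fields, and the 60-pixel height are choices made for this example.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Sketch: brushing the bottom strip rescales the x axis of the top plot.",
  "data": {"url": "https://vega.github.io/editor/data/movies.json"},
  "vconcat": [
    {
      "mark": "point",
      "encoding": {
        "x": {
          "field": "IMDB Rating",
          "type": "quantitative",
          "scale": {"domain": {"selection": "valrange"}}
        },
        "y": {"field": "Worldwide Gross", "type": "quantitative"}
      }
    },
    {
      "height": 60,
      "selection": {"valrange": {"type": "interval", "encodings": ["x"]}},
      "mark": "tick",
      "encoding": {
        "x": {"field": "IMDB Rating", "type": "quantitative"}
      }
    }
  ]
}
```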
---
```vega
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"data": {"url": "https://vega.github.io/editor/data/movies.json"},
"transform": [
{"filter": {"field": "US Gross", "gt": 0}},
{"filter": {"field": "Worldwide Gross", "gt": 0}}
],
"hconcat": [
{
"selection": {"selectedMovies": {"type": "interval"}},
"mark": "point",
"encoding": {
"x": {
"field": "US Gross",
"type": "quantitative",
"scale": {"type": "log"}
},
"y": {
"field": "Worldwide Gross",
"type": "quantitative",
"scale": {"type": "log"}
}
}
},
{
"layer": [
{
"mark": "bar",
"encoding": {
"x": {"bin": true, "field": "IMDB Rating", "type": "quantitative"},
"y": {"aggregate": "count", "type": "quantitative"},
"color": {"value": "black"}
}
},
{
"mark": "bar",
"transform": [{"filter": {"selection": "selectedMovies"}}],
"encoding": {
"x": {"bin": true, "field": "IMDB Rating", "type": "quantitative"},
"y": {"aggregate": "count", "type": "quantitative"}
}
}
]
}
]
}
```
---
## Voters Example
Let's take a look at party identification data.
```json
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"data": {
"url": "https://uiuc-ischool-dataviz.github.io/fall2019/data/voters.json"
},
"transform": [
{"fold": ["dem", "rep", "decline", "other"], "as": ["party", "percentage"]},
{"calculate": "datum.registered * datum.percentage", "as": "total_people"}
],
"mark": "area",
"encoding": {
"x": {"field": "year", "type": "quantitative"},
"y": {"field": "total_people", "type": "quantitative", "stack": true},
"color": {
"field": "party",
"type": "nominal",
"scale": {
"domain": ["dem", "rep", "other", "decline"],
"range": ["#0000DDAA", "#DD0000AA", "#00DD00AA", "#000000AA"]
}
}
}
}
```
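
The `fold` transform in this spec turns wide columns into long rows. A tiny sketch with inline data may make that easier to see; the two rows of values below are made up for illustration and are not taken from the voters dataset.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Sketch with made-up values: fold turns the dem and rep columns into party/percentage rows.",
  "data": {
    "values": [
      {"year": 2000, "dem": 0.45, "rep": 0.35},
      {"year": 2004, "dem": 0.43, "rep": 0.34}
    ]
  },
  "transform": [
    {"fold": ["dem", "rep"], "as": ["party", "percentage"]}
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "year", "type": "ordinal"},
    "y": {"field": "percentage", "type": "quantitative"},
    "color": {"field": "party", "type": "nominal"}
  }
}
```

After the fold, each input row becomes one row per folded column, so the two rows here become four, each carrying a `party` name and its `percentage`.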
----
```vega
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"data": {
"url": "https://uiuc-ischool-dataviz.github.io/fall2019/data/voters.json"
},
"transform": [
{"fold": ["dem", "rep", "decline", "other"], "as": ["party", "percentage"]},
{"calculate": "datum.registered * datum.percentage", "as": "total_people"}
],
"mark": "area",
"encoding": {
"x": {"field": "year", "type": "quantitative"},
"y": {"field": "total_people", "type": "quantitative", "stack": true},
"color": {
"field": "party",
"type": "nominal",
"scale": {
"domain": ["dem", "rep", "other", "decline"],
"range": ["#0000DDAA", "#DD0000AA", "#00DD00AA", "#000000AA"]
}
}
}
}
```
---
## TopoJSON Example
```json
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"layer": [
{
"data": {
"url": "https://vega.github.io/editor/data/us-10m.json",
"format": {"type": "topojson", "feature": "states"}
},
"mark": {"type": "geoshape", "fill": "whitesmoke", "stroke": "black"}
},
{
"data": {"url": "https://vega.github.io/editor/data/flights-airport.csv"},
"transform": [
{
"aggregate": [
{"op": "sum", "field": "count", "as": "sum_count"},
{"op": "count", "as": "routes"}
],
"groupby": ["origin"]
},
{
"lookup": "origin",
"from": {
"data": {"url": "https://vega.github.io/editor/data/airports.csv"},
"key": "iata",
"fields": ["latitude", "longitude"]
}
}
],
"mark": {"type": "circle"},
"selection": {
"nearest": {"type": "single", "on": "mouseover", "empty": "none", "nearest": true}
},
"encoding": {
"latitude": {"field": "latitude", "type": "quantitative"},
"longitude": {"field": "longitude", "type": "quantitative"},
"size": {"field": "routes", "type": "quantitative"},
"tooltip": {"field": "routes", "type": "quantitative"},
"color": {
"field": "sum_count",
"type": "quantitative",
"condition": {"selection": "nearest", "value": "red"}
}
}
}
],
"projection": {"type": "albersUsa"},
"height": 768,
"width": 1024
}
```
----
```vega
{
"$schema": "https://vega.github.io/schema/vega-lite/v4.json",
"layer": [
{
"data": {
"url": "https://vega.github.io/editor/data/us-10m.json",
"format": {"type": "topojson", "feature": "states"}
},
"mark": {"type": "geoshape", "fill": "whitesmoke", "stroke": "black"}
},
{
"data": {"url": "https://vega.github.io/editor/data/flights-airport.csv"},
"transform": [
{
"aggregate": [
{"op": "sum", "field": "count", "as": "sum_count"},
{"op": "count", "as": "routes"}
],
"groupby": ["origin"]
},
{
"lookup": "origin",
"from": {
"data": {"url": "https://vega.github.io/editor/data/airports.csv"},
"key": "iata",
"fields": ["latitude", "longitude"]
}
}
],
"mark": {"type": "circle"},
"selection": {
"nearest": {"type": "single", "on": "mouseover", "empty": "none", "nearest": true}
},
"encoding": {
"latitude": {"field": "latitude", "type": "quantitative"},
"longitude": {"field": "longitude", "type": "quantitative"},
"size": {"field": "routes", "type": "quantitative"},
"tooltip": {"field": "routes", "type": "quantitative"},
"color": {
"field": "sum_count",
"type": "quantitative",
"condition": {"selection": "nearest", "value": "red"}
}
}
}
],
"projection": {"type": "albersUsa"},
"height": 768,
"width": 1024
}
```