---
GA: UA-159972578-2
---
###### tags: `R` `sunburst` `Visualization` `Report` `Data Visualization`
# Introduction to the Sunburst Package
[Reference](http://www.buildingwidgets.com/blog/2015/7/2/week-26-sunburstr)
## Requirements
+ Data format: a path string plus its frequency
1. Drop the repeated states from each path so that only the transitions are shown:
![](https://i.imgur.com/HhssFm3.png =80%x)
2. Keep every step of the path so that the full sequence is shown:
![](https://i.imgur.com/RQJBYlo.png =80%x)
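
The difference between the two path forms above can be sketched in base R. This is only an illustration: the `states` vector is made-up data, and `rle()` stands in for the sequence compression that the TraMineR functions perform later in this note.

```r
# Hypothetical sequence of states for one person (made-up data).
states <- c("school", "school", "FE", "FE", "FE", "employment")

# Form 2: keep every step of the sequence.
full_path <- paste(states, collapse = "-")

# Form 1: drop consecutive repeats so that only the transitions remain.
compressed_path <- paste(rle(states)$values, collapse = "-")

full_path        # "school-school-FE-FE-FE-employment"
compressed_path  # "school-FE-employment"
```

Either string, paired with the number of times it occurs, gives one row of the input data.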
## Data format conversion
### 1. Load the data
```{r}
library(TraMineR)
# use example from TraMineR vignette
data("mvad")
mvad.alphab <- c(
"employment", "FE", "HE", "joblessness",
"school", "training"
)
mvad.seq <- seqdef(mvad, 17:86, xtstep = 6, alphabet = mvad.alphab)
```
![](https://i.imgur.com/e8GlXYB.png)
### 2. Convert the format
+ Approach 1
```{r}
# To make this work, compress the sequences with seqdss()
library(pipeR)
library(sunburstR)  # provides sunburst()
# idxs = 0 keeps all distinct sequences (tlim is the deprecated name of this argument)
seqtab(seqdss(mvad.seq), idxs = 0, format = "SPS") %>>%
  attr("freq") %>>%
  (
    data.frame(
      # appending "-end" is necessary for this to work
      sequence = paste0(
        gsub(
          x = rownames(.),
          pattern = "(/[0-9]*)",  # strip the repeated "/1" duration suffixes
          replacement = "",
          perl = TRUE             # use Perl-compatible regular expressions
        ),
        "-end"
      ),
      freq = as.numeric(.$Freq),
      stringsAsFactors = FALSE
    )
  ) %>>%
  sunburst
```
+ Approach 2
```{r}
library(tibble)
library(magrittr)  # provides %>%
seq_df <- seqtab(seqdss(mvad.seq), idxs = 0, format = "SPS") %>%
  attr("freq") %>%
  rownames_to_column("Path")
# strip the "/1" duration suffixes, then append "-end"
seq_df$Path <- gsub("/1", "", seq_df$Path) %>% paste0("-end")
```
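
To see what the string clean-up in both approaches does, here is a minimal base R sketch. The `sps` vector is made-up input, and it assumes the row names have the `state/duration` form joined by `-`, which is what the regular expressions above imply.

```r
# Hypothetical row names in "state/duration" form (made-up data).
sps <- c("school/1-FE/1-employment/1", "training/1-joblessness/1")

# Remove every "/<digits>" duration suffix, then mark the end of each path.
paths <- paste0(gsub("/[0-9]+", "", sps), "-end")

paths
# "school-FE-employment-end" "training-joblessness-end"
```

Matching `/[0-9]+` rather than the literal `/1` also handles durations longer than one step, which matters if the sequences are not compressed first.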
## Drawing the sunburst
```{r}
library(sunburstR)
sequence_data <- read.csv(
paste0(
"https://gist.githubusercontent.com/kerryrodden/7090426/",
"raw/ad00fcf422541f19b70af5a8a4c5e1460254e6be/visit-sequences.csv"
)
,header = FALSE
,stringsAsFactors = FALSE
)
sunburst(sequence_data)
```
![](https://i.imgur.com/KP5ZvZ1.png)
## How the sunburst percentages are computed
```{r}
index <- grep("^home-home", sequence_data$V1) # sequences that start with home-home
sum(sequence_data$V2[index])/sum(sequence_data$V2)
```
> [1] 0.06031
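
The same calculation can be checked by hand on a small table. The sketch below uses made-up paths and counts (the `toy` data frame is hypothetical) with the same `V1`/`V2` column layout as `sequence_data`.

```r
# Hypothetical paths and counts (made-up data), laid out like sequence_data.
toy <- data.frame(
  V1 = c("home-home-end", "home-product-end", "search-home-end", "home-home-product-end"),
  V2 = c(10, 30, 50, 10),
  stringsAsFactors = FALSE
)

# Rows whose path starts with "home-home".
index <- grep("^home-home", toy$V1)

# Share of all visits that fall under that prefix: (10 + 10) / 100.
share <- sum(toy$V2[index]) / sum(toy$V2)
share  # 0.2
```

The share is the summed frequency of every path under the chosen prefix divided by the total frequency.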