tags: `R` `sunburst` `Visualization` `Report` `Data Visualization`
# Introduction to the sunburstR Package
[Reference](http://www.buildingwidgets.com/blog/2015/7/2/week-26-sunburstr)
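## Requirements
+ Data type: path strings, with the steps of each path separated by `-`
+ Frequency (count) of each path
1. Drop the repeated steps and show only the state changes: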
![](https://i.imgur.com/HhssFm3.png =80%x)
2. Keep every step of the process, repeats included:
![](https://i.imgur.com/RQJBYlo.png =80%x)
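The difference between the two modes can be sketched in base R. This is only an illustration: the helper `collapse_repeats()` and the toy path are made up here, and `rle()` is used to drop consecutive duplicates.

```r
# Hypothetical helper: drop consecutive repeats from a "-"-separated path.
collapse_repeats <- function(path) {
  steps <- strsplit(path, "-", fixed = TRUE)[[1]]
  paste(rle(steps)$values, collapse = "-")
}

# Toy path (made up for illustration)
full_path <- "school-school-FE-FE-FE-employment"

collapse_repeats(full_path)  # mode 1: "school-FE-employment"
full_path                    # mode 2: every step is kept as-is
```

Mode 1 gives a smaller chart that highlights transitions; mode 2 also shows how long each state lasted.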
## Converting the Data Format
### 1. Load the data
```{r}
library(TraMineR)
# use example from TraMineR vignette
data("mvad")
mvad.alphab <- c(
  "employment", "FE", "HE", "joblessness", "school",
  "training"
)
mvad.seq <- seqdef(mvad, 17:86, xtstep = 6, alphabet = mvad.alphab)
```
![](https://i.imgur.com/e8GlXYB.png)
### 2. Convert the format
+ Approach 1
```{r}
# To make this work, we'll compress the sequences with seqdss
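library(pipeR)
library(sunburstR) # provides sunburst(), called at the end of the pipeline
# idxs = 0 keeps every distinct sequence (it replaces the deprecated tlim argument)
seqtab( seqdss(mvad.seq), idxs = 0, format = "SPS" ) %>>%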
  attr("freq") %>>%
  (
    data.frame(
      # appending "-end" is necessary for this to work
      sequence = paste0(
        gsub(
          x = rownames(.)
          , pattern = "(/[0-9]*)" # strip the duration suffix (the repeated '/1')
          , replacement = ""
          , perl = TRUE # use Perl-compatible regular expressions
        )
        ,"-end"
      )
      ,freq = as.numeric(.$Freq)
      ,stringsAsFactors = FALSE
    )
  ) %>>%
  sunburst
```
+ Approach 2
```{r}
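library(tibble)
library(magrittr) # provides %>%
seq_df = seqtab( seqdss(mvad.seq), idxs = 0, format = "SPS" ) %>%
  attr("freq") %>%
  rownames_to_column("Path") # move the sequence labels into a column
# strip the '/1' duration suffix and mark the end of each path
seq_df$Path = gsub('/1', "" , seq_df$Path) %>% paste0("-end")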
```
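The string clean-up used in both approaches can be tried on its own, without TraMineR. The sketch below assumes the SPS row names look like `state/duration` joined by `-` (inferred from the `/1` suffix that the code above strips); the two labels are made up for illustration.

```r
# Toy row names in the assumed "state/duration" style (made up for illustration)
sps_paths <- c("school/1-FE/1-employment/1", "training/1-employment/1")

# Strip every "/<digits>" duration suffix, then mark the end of each path
clean_paths <- paste0(gsub("/[0-9]+", "", sps_paths), "-end")
clean_paths
```

The result is a vector of plain `-`-separated paths ending in `-end`, which is the shape the first column passed to `sunburst()` has in this note.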
## Drawing the Sunburst
```{r}
library(sunburstR)
sequence_data <- read.csv(
  paste0(
    "https://gist.githubusercontent.com/kerryrodden/7090426/",
    "raw/ad00fcf422541f19b70af5a8a4c5e1460254e6be/visit-sequences.csv"
  )
  ,header = FALSE
  ,stringsAsFactors = FALSE
)
sunburst(sequence_data)
```
![](https://i.imgur.com/KP5ZvZ1.png)
## How the Sunburst Percentages Are Calculated
```{r}
index = grep("^home-home", sequence_data$V1) # sequences that start with home-home
sum(sequence_data$V2[index])/sum(sequence_data$V2)
```
> [1] 0.06031
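The same calculation can be wrapped in a small function so that any prefix can be checked. This is a minimal sketch on toy data: `prefix_share()` is a hypothetical helper, and the counts are made up so the arithmetic is easy to follow.

```r
# Toy data with the same layout as sequence_data (V1 = path, V2 = count);
# the values are made up for illustration.
toy <- data.frame(
  V1 = c("home-home-end", "home-product-end", "search-end"),
  V2 = c(10, 30, 60),
  stringsAsFactors = FALSE
)

# Hypothetical helper: share of the total count held by paths starting with `prefix`.
prefix_share <- function(data, prefix) {
  hits <- startsWith(data$V1, prefix)
  sum(data$V2[hits]) / sum(data$V2)
}

prefix_share(toy, "home-home")  # 10 / 100 = 0.1
prefix_share(toy, "home-")      # (10 + 30) / 100 = 0.4
```

The match is a plain string prefix, so ending the prefix with `-` avoids matching a longer step name that merely begins with the same letters.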