---
GA: UA-159972578-2
---

###### tags: `R` `sunburst` `Visualization` `Report` `Data Visualization`

# Introduction to the Sunburst Package

[Reference](http://www.buildingwidgets.com/blog/2015/7/2/week-26-sunburstr)

## Requirements

+ Data format: a path string plus a frequency

1. Collapse repeated states so that only the transitions are shown:
![](https://i.imgur.com/HhssFm3.png =80%x)
2. Keep every step of the sequence, including repeated states:
![](https://i.imgur.com/RQJBYlo.png =80%x)

## Data Format Conversion

### 1. Load the data

```{r}
library(TraMineR)

# use example from TraMineR vignette
data("mvad")
mvad.alphab <- c(
  "employment", "FE", "HE", "joblessness", "school", "training"
)
mvad.seq <- seqdef(mvad, 17:86, xtstep = 6, alphabet = mvad.alphab)
```

![](https://i.imgur.com/e8GlXYB.png)

### 2. Convert the format

+ Approach 1

```{r}
# To make this work, we'll compress the sequences with seqdss
library(pipeR)
library(sunburstR)  # provides sunburst()

# idxs = 0 returns every sequence (older TraMineR releases named this argument tlim)
seqtab(seqdss(mvad.seq), idxs = 0, format = "SPS") %>>%
  attr("freq") %>>%
  (
    data.frame(
      # appending "-end" is necessary for this to work
      sequence = paste0(
        gsub(
          x = rownames(.),
          pattern = "(/[0-9]*)",  # strip the repeated "/1" duration suffix
          replacement = "",
          perl = TRUE             # use Perl-compatible regular expressions
        ),
        "-end"
      ),
      freq = as.numeric(.$Freq),
      stringsAsFactors = FALSE
    )
  ) %>>%
  sunburst
```

+ Approach 2

```{r}
library(tibble)
library(magrittr)  # provides %>%

seq_df = seqtab(seqdss(mvad.seq), idxs = 0, format = "SPS") %>%
  attr("freq") %>%
  rownames_to_column("Path")

# strip the "/1" duration suffix, then append "-end"
seq_df$Path = gsub("/1", "", seq_df$Path, fixed = TRUE) %>% paste0("-end")
```

## Drawing the Sunburst

```{r}
library(sunburstR)

sequence_data <- read.csv(
  paste0(
    "https://gist.githubusercontent.com/kerryrodden/7090426/",
    "raw/ad00fcf422541f19b70af5a8a4c5e1460254e6be/visit-sequences.csv"
  ),
  header = FALSE,
  stringsAsFactors = FALSE
)

sunburst(sequence_data)
```

![](https://i.imgur.com/KP5ZvZ1.png)

## How the Sunburst Proportions Are Computed

The share shown for a path is the summed frequency of all sequences that start with that path, divided by the total frequency.

```{r}
index = grep("^home-home", sequence_data$V1)  # sequences that start with home-home
sum(sequence_data$V2[index]) / sum(sequence_data$V2)
```

> [1] 0.06031
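
The proportion calculation above can be wrapped in a small helper so that it works for any path prefix. This is only a sketch: `prefix_share()` and the `toy` data frame are hypothetical names and made-up values introduced here for illustration, not part of sunburstR or of `visit-sequences.csv`. The column names `V1` and `V2` follow what `read.csv(..., header = FALSE)` produces.

```r
# Toy data with made-up frequencies (assumption: same two-column layout
# as sequence_data, i.e. V1 = path string, V2 = frequency)
toy <- data.frame(
  V1 = c("home-home-end", "home-product-end",
         "search-home-end", "home-home-product-end"),
  V2 = c(30, 50, 15, 5),
  stringsAsFactors = FALSE
)

# Hypothetical helper: share of the total frequency carried by
# sequences whose path starts with `prefix`
prefix_share <- function(paths, freq, prefix) {
  hit <- startsWith(paths, prefix)  # literal prefix match, no regex
  sum(freq[hit]) / sum(freq)
}

# The trailing "-" keeps the match on whole steps
prefix_share(toy$V1, toy$V2, "home-home-")
```

With the real data, the equivalent call would be `prefix_share(sequence_data$V1, sequence_data$V2, "home-home-")`. Matching a literal prefix with `startsWith()` avoids having to escape regex metacharacters if a step name ever contains them.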