# Processing binny and CoAssembly results

The previous tutorial [here]() ended with the generation of bins and a coassembly, but what now?

Now we can look at how the bins are distributed at each time point. For that we use `R` with the `tidyverse` collection of packages.

```R
library(tidyverse)
```

## Loading the files

We will load the mapping of each individual library to the coassembly first:

```R
# One coverage statistics file per library
covlist <- list.files("Covstats_FS2C_B1/", full.names = TRUE)

cov <- list()
for (i in seq_along(covlist)) {
  cov[[i]] <- read_tsv(covlist[i]) %>%
    rename(ID = `#ID`) %>%
    # Keep only the contig name (everything before the first space)
    separate(ID, into = "ID", sep = " ", extra = "drop") %>%
    # Label libraries D0, D1, ... in the order the files are listed
    mutate(Lib = paste0("D", i - 1))
}

# Stack all libraries into a single table
cov <- bind_rows(cov)
```

Then the contig distribution summary generated by binny:

```R
bin <- read_tsv("contig_data.tsv") %>%
  # Split the bin name at the dot into two ID columns
  separate(bin, into = c("binID1", "binID2"), sep = "\\.") %>%
  type_convert()
```

Then the GTDB-Tk summary results:

```R
tax <- read_tsv("gtdbtk/gtdbtk.bac120.summary.tsv") %>%
  # One column per taxonomic rank (K = domain, P = phylum, C = class, O = order, F = family, G = genus, S = species)
  separate(classification, into = c("K", "P", "C", "O", "F", "G", "S"), sep = ";") %>%
  mutate(user_genome = gsub("binny_", "", user_genome)) %>%
  # bin_C and bin_P are named so they do not overwrite the class (C) and phylum (P) columns
  separate(user_genome, into = c("binID1", "binID2", "bin_C", "bin_P"), extra = "drop") %>%
  # Make the bin IDs match the format used in contig_data.tsv
  mutate(binID2 = as.numeric(binID2),
         binID1 = gsub("R0", "R", binID1),
         binID1 = gsub("I0", "I", binID1))
```

## Plotting

What a metagenome looks like:

```R
cov %>%
  rename(contig = ID) %>%
  left_join(bin) %>%
  left_join(tax) %>%
  ggplot(aes(y = Avg_fold, x = Ref_GC, col = O)) +
  geom_point() +
  facet_wrap(~Lib) +
  scale_y_log10()
```

![](https://hackmd.io/_uploads/B1eFEN34n.png)

How the bin distribution varies across libraries (here it does not vary much):

```R
cov %>%
  rename(contig = ID) %>%
  select(contig, Lib, Avg_fold) %>%
  # One coverage column per library (D0, D1, ...)
  pivot_wider(names_from = Lib, values_from = Avg_fold) %>%
  left_join(bin) %>%
  left_join(tax) %>%
  ggplot(aes(x = D0, y = D1, col = O)) +
  geom_point() +
  scale_y_log10() +
  scale_x_log10()
```

![](https://hackmd.io/_uploads/rkT5VE3Nn.png)

Then the coverage (average fold) of each bin within each library:

```R
cov %>%
  rename(contig = ID) %>%
  left_join(bin) %>%
  left_join(tax) %>%
  ggplot(aes(x = paste(binID1, binID2), y = Avg_fold, col = O)) +
  geom_boxplot() +
  scale_y_log10() +
  facet_wrap(~Lib)
```

![](https://hackmd.io/_uploads/rkvkSVnNh.png)

And the relative coverage across time, where each contig's average fold is expressed as a percentage of its library's total, faceted by species:

```R
cov %>%
  rename(contig = ID) %>%
  left_join(bin) %>%
  left_join(tax) %>%
  group_by(Lib) %>%
  # Share of the library's summed average fold, in percent
  mutate(prop_cov = prop.table(Avg_fold) * 100) %>%
  ggplot(aes(x = Lib, y = prop_cov, col = O)) +
  geom_boxplot() +
  facet_wrap(~S, scales = "free")
```

![](https://hackmd.io/_uploads/BkYYrVhV3.png)

###### tags: `tutorials`,`R`, `Metagenomics`,`Binning`
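
The last plot above works on per-contig values. As a rough sketch of how the same proportion could be summarised per bin instead, the block below uses a small made-up table in place of the joined `cov`/`bin` data. The values, the `toy` table and the `bin_cov` name are illustrative assumptions, not part of the tutorial's data.

```R
library(dplyr)

# Toy stand-in for the joined coverage/bin table; all values are invented.
toy <- tibble::tibble(
  contig   = c("c1", "c2", "c3", "c4", "c1", "c2", "c3", "c4"),
  Lib      = rep(c("D0", "D1"), each = 4),
  binID1   = c("I1", "I1", "I2", "I2", "I1", "I1", "I2", "I2"),
  Avg_fold = c(10, 30, 5, 15, 20, 20, 40, 80)
)

# Mean coverage per bin and library, then each bin's share within its library.
bin_cov <- toy %>%
  group_by(Lib, binID1) %>%
  summarise(mean_fold = mean(Avg_fold), .groups = "drop") %>%
  group_by(Lib) %>%
  mutate(prop_cov = mean_fold / sum(mean_fold) * 100) %>%
  ungroup()

print(bin_cov)
```

With one row per bin and library, `bin_cov` could be passed to `ggplot()` with `geom_col()` or `geom_line()` instead of boxplots, if a single value per bin is preferred.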