# ChromHMM Tutorial

## 1. Prepare Histone Files for Cell Lines

>Where to find ChIP-seq data (BED peak files)?
http://cistrome.org/db/#/

## 2. Create cell-mark table

The table is saved as `cellmark15STATES.txt`. Each row lists a cell type, a mark, and the BED file name, and the columns must be separated by tabs.

```script=1
GM12878	H3K27me3	45206_peaks.bed
GM12878	CTCF	45218_peaks.bed
CD4+T	H3K4me1	8127_peaks.bed
CD4+T	H3K4me1	36436_peaks.bed
```

## 3. BinarizeBed

`BinarizeBed` converts the BED files in `data/bed` that are listed in the cell-mark table into binarized input files, written to `data/InputBinary`.

```script=1
java -mx1600M -jar ChromHMM.jar BinarizeBed -center CHROMSIZES/hg38.txt data/bed cellmark15STATES.txt data/InputBinary
```

## 4. LearnModel

`LearnModel` learns a 15-state model from the binarized data and writes its results to `OUTPUTSAMPLE`.

```script=1
java -mx1600M -jar ChromHMM.jar LearnModel data/InputBinary OUTPUTSAMPLE 15 hg38
```

## 5. Read segment.bed

Based on the learned state model, ChromHMM assigns a state to every genomic segment of each cell type and writes the result to a `*_segments.bed` file. The R code below imports one of these files and exports it as a `.csv` file.

```Rscript=1
# install Bioconductor and rtracklayer if they are missing
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(version = "3.13")
library("BiocManager")
BiocManager::install("rtracklayer")
library(rtracklayer)

# import the segmentation of one cell type
seg <- import("./reco/runSTIAN0.0/proc/ChromHMM/OUTPUTSAMPLE/CD14+_15_segments.bed", format="bed")
seg

# write to a .csv file
write.csv(data.frame(seg), 'OUTPATH/*.csv')
```

e.g. states for GM500_rep1:
![](https://i.imgur.com/pajc6Co.png)

## 6. States Heatmap

Input file: **`emissions_15.txt`**

```Pythonscript=1
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# rows are states, columns are marks
emission_matrix_15 = pd.read_csv("emissions_15.txt", sep='\t', index_col=0)

plt.subplots(figsize=(10, 10))
Emission_Matrix = sns.heatmap(emission_matrix_15, annot=True, vmax=1, square=True, cmap='YlOrRd')
Emission_Matrix.set_xticklabels(Emission_Matrix.get_xticklabels(), rotation=60)
Emission_Matrix.set_title("Chromatin State")
plt.show()
```
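
As a complement to step 5, the segment file can also be summarized without R. The sketch below is only an illustration: it assumes the `*_segments.bed` file has four tab-separated columns (chromosome, start, end, state label), and both the helper name `state_coverage` and the example rows are made up for this tutorial.

```python
import io
from collections import defaultdict


def state_coverage(lines):
    """Sum the number of bases assigned to each state.

    `lines` is any iterable of BED-like text lines with the assumed
    layout: chrom <tab> start <tab> end <tab> state.
    """
    totals = defaultdict(int)
    for line in lines:
        line = line.strip()
        # skip blank lines and optional BED header lines
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, state = line.split("\t")[:4]
        totals[state] += int(end) - int(start)
    return dict(totals)


# made-up rows standing in for a real segments file
example = io.StringIO(
    "chr1\t0\t10000\tE15\n"
    "chr1\t10000\t10800\tE3\n"
    "chr1\t10800\t12000\tE15\n"
)

coverage = state_coverage(example)
for state, bases in sorted(coverage.items()):
    print(state, bases)
```

To use it on real output, pass an open file handle for the segments file from step 5 in place of `example`. Dividing each total by the sum of all totals gives the fraction of the segmented genome covered by each state.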