GISTIC was designed for SNP6 array data, but many papers also use it for whole-exome sequencing data.
I have segment files from sequenza and want to convert them to GISTIC input.
segment file:
(1) Sample (sample name)
(2) Chromosome (chromosome number)
(3) Start Position (segment start position, in bases)
(4) End Position (segment end position, in bases)
(5) Num markers (number of markers in segment)
(6) Seg.CN (log2() -1 of copy number)
The conversion should be log2 (logarithm base 2) - 1, so that copy number 2 is 0.
Every segment start and end in the segments file should appear in the markers file, not the other way around.
When the copy number is 0 (a homozygous deletion of both copies), log2(0) - 1 is undefined, so just put in a small number, e.g. -5.
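A minimal sketch of this conversion in shell, assuming absolute copy numbers arrive one per line. The function name `cn_to_segcn` and the awk variable `low` are made up for illustration, and -5 is only the example floor mentioned above, not a value GISTIC prescribes.

```shell
# Convert absolute copy numbers (one per line) to the Seg.CN scale: log2(CN) - 1.
# A copy number of 0 has no logarithm, so it is replaced by the floor value in "low".
cn_to_segcn() {
  awk -v low=-5 '{
    if ($1 <= 0) print low
    else printf "%.4f\n", log($1) / log(2) - 1   # awk log() is natural log
  }'
}

printf '0\n1\n2\n4\n' | cn_to_segcn
# prints -5, -1.0000, 0.0000 and 1.0000 on separate lines
```

Copy number 2 maps to 0, which is the property the conversion is meant to have.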
markers file:
(1) Marker Name
(2) Chromosome
(3) Marker Position (in bases)
Note that GISTIC2 no longer requires a markers file.
sequenza gives a segment file. Segmentation was done by the copynumber Bioconductor package.
The *segments.txt file has 13 columns:
"chromosome" "start.pos" "end.pos" "Bf" "N.BAF" "sd.BAF" "depth.ratio" "N.ratio" "sd.ratio" "CNt" "A" "B" "LPP"
We only need the chromosome, start.pos, end.pos, N.BAF and depth.ratio columns.
The depth.ratio column is the GC-content-normalized depth ratio. A depth ratio of 1 means a copy number of 2 (the same as the normal blood control in my case).
To convert to GISTIC input, I have to do log2(2 * depth.ratio) - 1 = log2(depth.ratio).
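As a worked check of the conversion log2(2 * depth.ratio) - 1 used in the R code: this assumes the normal sample is diploid, so a depth ratio r corresponds to a copy number of roughly 2r. That is a simplification that ignores tumor purity and ploidy. Under this assumption the conversion reduces to the log2 of the depth ratio:

```latex
\begin{aligned}
\text{Seg.CN} &= \log_2(2r) - 1 = \log_2 2 + \log_2 r - 1 = \log_2 r \\
r = 1 &\;\Rightarrow\; \text{Seg.CN} = 0, \qquad
r = 0.5 \;\Rightarrow\; \text{Seg.CN} = -1, \qquad
r = 2 \;\Rightarrow\; \text{Seg.CN} = 1
\end{aligned}
```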
UPDATED 01/03/2018
I have a bunch of segment files in the same folder.
Add the sample name as a column and do the log2 math in R.
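library(tidyverse)  # loads readr, dplyr and the %>% pipe
# all sequenza segment files in the current folder (pattern is a regular expression)
seg_files <- list.files(".", pattern = "segments\\.txt$", full.names = FALSE)
# read each file as character columns and record which sample it came from
seg_dat_list <- lapply(seg_files, function(f) {
  dat <- read_tsv(f, col_names = TRUE, col_types = cols(.default = col_character()))
  dat$sample <- gsub("_vs_.*segments.txt", "", f)
  dat
})
seg_dat <- bind_rows(seg_dat_list)
# keep the six GISTIC columns in order and convert depth.ratio to log2(copy number) - 1;
# pmax() floors the value at -5 so that a depth ratio of 0 does not give -Inf
gistic_input <- seg_dat %>%
  select(sample, chromosome, start.pos, end.pos, N.BAF, depth.ratio) %>%
  mutate(depth.ratio = pmax(log2(2 * as.numeric(depth.ratio)) - 1, -5))
write_tsv(gistic_input, "all_segments.txt")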
## marker file:
cat all_segments.txt | sed '1d' | cut -f2,3 > markers.txt
cat all_segments.txt | sed '1d' | cut -f2,4 >> markers.txt
## sort by chromosome (version order) and then by ascending position, keep the unique rows and number the markers.
sort -k1,1V -k2,2n markers.txt | uniq | nl > markers_gistic.txt
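Since every segment start and end should appear in the markers file, a quick check can catch mismatches before running GISTIC. The sketch below is one way to do it. The function name `count_missing_markers` and the toy files are made up for illustration, and it assumes tab-separated files laid out like markers_gistic.txt (name, chromosome, position) and all_segments.txt (with a header row).

```shell
# Count segment start/end positions that have no matching row in the markers file.
# $1: markers file (name, chromosome, position); $2: segment file with a header row.
count_missing_markers() {
  awk -F'\t' '
    NR == FNR { seen[$2 FS $3] = 1; next }   # first file: remember chromosome + position
    FNR > 1 {                                # second file: skip the header row
      if (!(($2 FS $3) in seen)) n++         # segment start
      if (!(($2 FS $4) in seen)) n++         # segment end
    }
    END { print n + 0 }' "$1" "$2"
}

# Toy input with made-up values: the marker for chromosome 2, position 300 is left out.
tmp=$(mktemp -d)
printf '1\t1\t100\n2\t1\t200\n3\t2\t50\n' > "$tmp/markers.txt"
printf 'sample\tchromosome\tstart.pos\tend.pos\tN.BAF\tdepth.ratio\n' > "$tmp/seg.txt"
printf 'S1\t1\t100\t200\t10\t0.1\nS1\t2\t50\t300\t12\t-0.4\n' >> "$tmp/seg.txt"

missing=$(count_missing_markers "$tmp/markers.txt" "$tmp/seg.txt")
echo "missing boundaries: $missing"
```

On the real files the call would be `count_missing_markers markers_gistic.txt all_segments.txt`, and a result of 0 means every segment boundary has a marker.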
Modify the gistic2 script a bit.
#!/bin/sh
## set MCR environment and launch GISTIC executable
## NOTE: change the line below if you have installed the Matlab MCR in an alternative location
MCR_ROOT=/scratch/genomic_med/apps/Matlab_Complier_runTime
MCR_VER=v83
echo Setting Matlab MCR root to $MCR_ROOT
## set up environment variables
LD_LIBRARY_PATH=$MCR_ROOT/$MCR_VER/runtime/glnxa64:$LD_LIBRARY_PATH
LD_LIBRARY_PATH=$MCR_ROOT/$MCR_VER/bin/glnxa64:$LD_LIBRARY_PATH
LD_LIBRARY_PATH=$MCR_ROOT/$MCR_VER/sys/os/glnxa64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH
XAPPLRESDIR=$MCR_ROOT/$MCR_VER/MATLAB_Component_Runtime/v83/X11/app-defaults
export XAPPLRESDIR
## launch GISTIC executable
./gp_gistic2_from_seg "$@"
I removed ./ from the last line since I have put all executables in my path.
mkdir gistic_out
gistic2 -b gistic_out -seg all_segments.txt -refgene /scratch/genomic_med/apps/gistic/refgenefiles/hg19.mat -mk markers_gistic.txt -genegistic 1 -smallmem 1 -broad 1 -brlen 0.5 -conf 0.90 -armpeel 1 -savegene 1 -gcm extreme