Immunohistochemistry image processing and analysis
- Author: Lun-Hsien Chang
- Date created: May 2020
Using the R function in the next section to annotate all IHC images
- Download the cell segmentation data file cell-seg-data-merged_CD8.txt from QIMR L drive L:/Lab_MarkS/lunC/work/Immunohistochemistry_images/data_output/AP_Exp108.1_PeterMac-lungCancer-CD8-PD1/analysis-results
- Download the folder with IHC image files from QIMR L drive L:/Lab_MarkS/lunC/work/Immunohistochemistry_images/data_input/AshR/1. mIHF/Exp 108 210728/CD8 path
- Create a new R script file. Copy the code chunk below and modify the paths of the input files and folders.
- Run the source() function to load the R function into the working environment.
- Modify the file path within the double quotes. Make sure backslashes \ are replaced by forward slashes / when the path is copied from the Windows address bar.
- In R, place your cursor anywhere in the line with source() and run it by pressing the Ctrl and Enter keys at the same time.
- Modify the values taken by the following arguments of the function DrawAllIHCImagesTMAWithRectanglesInForm()
input.cell.seg.data.file.path=
This is the full path of your cell segmentation file. Make sure the string value is within a pair of quotes.
input.images.folder.path=
This is the full path of the folder with IHC image files to annotate. Do not end the path with / .
threshold.marker=
Specify a threshold for positive marker measurement with a number.
threshold.confidence=
Specify a threshold for confidence with a number.
rectangle.width.half=
Specify half the width of a square. This value is used to calculate the coordinates of the four corners. For example, rectangle.width.half=15 creates squares that are 30 by 30 pixels.
legend.location=
Specify the corner in which to place the legend. Accepts four string values: "topright", "topleft", "bottomright", "bottomleft".
rectangle.color=
Name the color of your rectangles. To use a color other than the default black, look up a color name in "An overview of color names in R".
legend.text.line1=
legend.text.line2=
legend.text.line3=
Allows a 3-line legend to be placed using the text you supply. Use line 1 for your project/study name. Use line 2 for your marker. Use line 3 for the thresholds.
legend.color=
Name the color of the legend. To use a color other than the default black, look up a color name in "An overview of color names in R".
legend.text.cex=
Use a number to increase or decrease the font size of the legend. For example, legend.text.cex=2.5 expands the size to 250%.
output.folder.path=
Create a new folder where the annotated image files will be exported. Add the folder path to the argument. Do not end the path with /
output.excluded.images.folder.path=
Create another new folder where the unannotated image files will be exported. Add the folder path to the argument. Do not end the path with /
- Download the R script file annotate-IHC-images_Exp108v1.R with the code above from QIMR L drive L:/Lab_MarkS/lunC/work/Immunohistochemistry_images/scripts
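The calling script described in the steps above might look like the following minimal sketch. All paths are hypothetical placeholders and the argument values are only illustrative; the function is called only if the script file is actually found at the given location.

```r
# Path as copied from the Windows address bar (hypothetical location)
windows.path <- "C:\\IHC\\scripts\\RFunction_annotate-all-IHC-images.R"

# Replace backslashes with forward slashes so that R can read the path
script.path <- gsub(pattern = "\\", replacement = "/", x = windows.path, fixed = TRUE)

# Folder paths must not end with a slash; strip one if present
images.folder <- sub(pattern = "/+$", replacement = "", x = "C:/IHC/CD8 path/")

if (file.exists(script.path)) {
  # Load DrawAllIHCImagesTMAWithRectanglesInForm() into the working environment
  source(script.path)
  DrawAllIHCImagesTMAWithRectanglesInForm(
     input.cell.seg.data.file.path = "C:/IHC/cell-seg-data-merged_CD8.txt"
    ,input.images.folder.path = images.folder
    ,threshold.marker = 8
    ,threshold.confidence = 0
    ,rectangle.width.half = 15
    ,legend.location = "topright"
    ,rectangle.color = "black"
    ,legend.text.line1 = "Project: demo"
    ,legend.text.line2 = "Marker: CD8+"
    ,legend.text.line3 = "Marker threshold: 8, confidence threshold: 0"
    ,legend.color = "black"
    ,legend.text.cex = 2.5
    ,output.folder.path = "C:/IHC/CD8_annotated"
    ,output.excluded.images.folder.path = "C:/IHC/CD8_excluded")
} else {
  message("Script file not found: ", script.path)
}
```

Normalising the paths in code rather than by hand reduces the chance of a stray backslash or trailing slash.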
An R function to annotate IHC images using a cell segmentation data file
- Download the R script file RFunction_annotate-all-IHC-images.R from QIMR network drive L L:/Lab_MarkS/lunC/work/Immunohistochemistry_images/scripts
- The input cell segmentation data file should be
- A tab-separated file
- Having at least five columns with headers in this order. The function identifies the columns by their positions, not by their names
- column 1 : image file names
- column 2 : X coordinates
- column 3 : Y coordinates
- column 4 : Marker measurement
- column 5 : Confidence
- The input image files can be in different formats such as .tif, .tiff, .jpg, .jpeg, and .png (not tested yet)
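To check that a file meets this layout before running the function, a small test file can be built and filtered in the same way. This is a minimal base R sketch with made-up values; the column names, file names and thresholds are assumptions for illustration only.

```r
# Toy cell segmentation data with the five expected columns, in order
toy <- data.frame(
   image.file = c("core_A1.tif", "core_A1.tif", "core_B2.tif")
  ,x = c(100, 250, 40)
  ,y = c(120, 300, 60)
  ,marker = c(9.2, 3.1, 12.5)
  ,confidence = c("99%", "97%", "95%")
  ,stringsAsFactors = FALSE)

# Write it as a tab-separated file with headers, then read it back
toy.file <- tempfile(fileext = ".txt")
write.table(toy, file = toy.file, sep = "\t", quote = FALSE, row.names = FALSE)
cells <- read.delim(toy.file, header = TRUE, sep = "\t", stringsAsFactors = FALSE)

# Columns are addressed by position: 4 = marker measurement, 5 = confidence
confidence.percent <- as.numeric(gsub("%", "", cells[, 5], fixed = TRUE))
keep <- cells[, 4] > 8 & confidence.percent > 0
cells.positive <- cells[keep, ]
print(cells.positive)
```

Because the columns are read by position, the header names in a real file can differ from the ones used here.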
if (!"dplyr" %in% rownames(installed.packages())) install.packages("dplyr")
if (!"foreach" %in% rownames(installed.packages())) install.packages("foreach")
if (!"doParallel" %in% rownames(installed.packages())) install.packages("doParallel")
if (!"magick" %in% rownames(installed.packages())) install.packages("magick")
if (!"stringr" %in% rownames(installed.packages())) install.packages("stringr")
# graphics, grDevices and tools are base packages that ship with R, so they need no installation
library(dplyr)
library(doParallel)
DrawAllIHCImagesTMAWithRectanglesInForm<- function( input.cell.seg.data.file.path="D:/googleDrive/copy-to-lunC_work/Merged_data/190529 batch analysis_lung cancer_cell_seg_data.txt"
,column.position.image.file.name=1
,column.position.x.cooridnate=2
,column.position.y.cooridnate=3
,column.position.marker.measurement=4
,column.position.confidence=5
,input.images.folder.path
,threshold.marker=8
,threshold.confidence=0
,rectangle.width.half=12.5
,legend.location="topright"
,rectangle.color="black"
,legend.text.line1="Project: demo"
,legend.text.line2="Marker: CD8+"
,legend.text.line3="Marker threshold: 8, confidence threshold: 0"
,legend.color="black"
,legend.text.cex=2.5
,output.folder.path="D:/googleDrive/copy-to-lunC_work/CD8_annotated"
,output.excluded.images.folder.path="E:/Lab_MarkS/lunC_work/Immunohistochemistry_images/data_output/AP_Exp108.1_PeterMac-lungCancer-CD8-PD1/CD8_path_view/excluded"){
if(file.exists(input.cell.seg.data.file.path)!=TRUE){
cat("Input file for cell segmentation data could not be found")
} else {
data <- read.delim(file=input.cell.seg.data.file.path, header = TRUE, sep = "\t", stringsAsFactors = F) %>%
dplyr::select(c( column.position.image.file.name
,column.position.x.cooridnate
,column.position.y.cooridnate
,column.position.marker.measurement
,column.position.confidence)) %>%
dplyr::mutate(
Confidence_percent= as.numeric(stringr::str_replace_all(string=.[,5]
,pattern="%"
,replacement="")))
# Compare the numeric confidence (not the original string column) with the threshold
data.subset <- data %>% dplyr::filter(data[,4] > threshold.marker & Confidence_percent > threshold.confidence)
cat((nrow(data)-nrow(data.subset))/nrow(data)*100, "% data are filtered out by the marker threshold of ", threshold.marker, "and confidence threshold of", threshold.confidence)
}
if(dir.exists(input.images.folder.path)!=TRUE){
cat("Input image folder could not be found")
} else{
image.file.paths <- list.files(path = input.images.folder.path, full.names = TRUE)
cat("There are",length(image.file.paths),"images in the folder")
foreach::foreach(i=1:length(image.file.paths)
,.combine = 'c'
,.packages=c("foreach","dplyr")) %dopar% {
print(paste0("=========== Working on image ",i,"============"))
image.file.path <-image.file.paths[i]
image.file.name <-basename(image.file.path)
# Column 1 holds the image file names; select by position because header names can vary
d <- data.subset[data.subset[,1]==image.file.name, ]
if(nrow(d)==0){
file.copy( from = image.file.path
,to=file.path(output.excluded.images.folder.path, image.file.name)
,copy.date = TRUE
)
} else{
tryCatch({
image <- magick::image_read(path=image.file.path)
image.drew <- magick::image_draw(image)
rect.x.left <- d[,2] - rectangle.width.half
rect.x.right <- d[,2] + rectangle.width.half
rect.y.top <- d[,3] - rectangle.width.half
rect.y.bottom <- d[,3] + rectangle.width.half
graphics::legend(legend.location
,legend = c( legend.text.line1
,legend.text.line2
,stringr::str_wrap(string=legend.text.line3, width=80)
,paste0(rectangle.width.half*2, " x ", rectangle.width.half*2, " pixels")
)
,bty = "n"
,pt.cex = legend.text.cex
,cex = legend.text.cex
,text.col = c(legend.color))
graphics::rect( xleft= rect.x.left
,ybottom=rect.y.bottom
,xright=rect.x.right
,ytop=rect.y.top
,col = NA
,border = rectangle.color
,lty = par("lty")
,lwd = 2)
grDevices::dev.off()
output.image.file.name <- paste("annotated", basename(image.file.path)
,sep = "_")
output.image.file.path <- file.path(output.folder.path, output.image.file.name)
output.image.file.name.ext.rm <- tools::file_path_sans_ext(output.image.file.name)
if(nchar(output.image.file.path) < 260){
magick::image_write( image= image.drew
,path= output.image.file.path)
} else {
cut.length <- nchar(output.image.file.path) - 259
end.position <- nchar(output.image.file.name.ext.rm) - cut.length
output.image.file.path.shortened <- file.path(output.folder.path
,paste0( stringr::str_sub( string= tools::file_path_sans_ext(output.image.file.name)
,start = 1
,end= end.position)
,"."
,tools::file_ext(output.image.file.name)
)
)
magick::image_write( image= image.drew
,path= output.image.file.path.shortened)
}
}, error=function(e){cat("ERROR :",conditionMessage(e), "\n")}
)
}
}
}
}
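The rectangle arithmetic in the function can be checked on its own. The sketch below uses made-up cell coordinates and base R only; it shows that rectangle.width.half=15 produces 30 by 30 pixel squares centred on each cell.

```r
rectangle.width.half <- 15

# Hypothetical cell centres in pixel coordinates
cell.x <- c(100, 250)
cell.y <- c(120, 300)

# Same arithmetic as in the function: step half a width in each direction
corners <- data.frame(
   x.left = cell.x - rectangle.width.half
  ,x.right = cell.x + rectangle.width.half
  ,y.top = cell.y - rectangle.width.half
  ,y.bottom = cell.y + rectangle.width.half)

# Width and height of every square
corners$width <- corners$x.right - corners$x.left
corners$height <- corners$y.bottom - corners$y.top
print(corners)
```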
Using the R function in the next section to summarise cell segmentation data into counts of single- and double-positive cells
- Download the cell segmentation data file AP_Exp108.1_PeterMac-lungCancer-CD8-PD1_cell-seg-data-merged_CD8_PD1.tsv from QIMR L drive L:/Lab_MarkS/lunC/work/Immunohistochemistry_images/data_output/AP_Exp108.1_PeterMac-lungCancer-CD8-PD1/analysis-results
- Create a new R script file. Copy the code chunk and modify the paths of the input and output files and folders.
- Run the source() function to load the R function into the working environment.
- Modify the file path within the double quotes. Make sure backslashes \ are replaced by forward slashes / when the path is copied from the Windows address bar.
- In R, place your cursor anywhere in the line with source() and run it by pressing the Ctrl and Enter keys at the same time.
- Modify the values taken by the following arguments of the function SummariseTwoMarkersCellSegmentationDataInForm()
file.path.cell.seg.data.file=
This is the full path of your cell segmentation file. Make sure the string value is within a pair of quotes.
marker.1.info.list=
Specify a list of three values for marker 1: (1) the name of marker 1 (e.g., CD8), (2) a threshold that dichotomises marker measurement values into negative or positive, (3) a threshold that dichotomises confidence.
marker.2.info.list=
Specify a list of three values for marker 2: (1) the name of marker 2 (e.g., CD226), (2) a threshold that dichotomises marker measurement values into negative or positive, (3) a threshold that dichotomises confidence.
output.folder.path=
Create a new folder where the summary data will be exported. Add the folder path to the argument. Do not end the path with /
- The R code above is in the R script file annotate-IHC-images_Exp108v1.R at QIMR L drive L:/Lab_MarkS/lunC/work/Immunohistochemistry_images/scripts
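Each marker list wraps a single vector made with c(). Because c() mixes a name with numbers, R stores all three values as character strings, which is why the function converts the thresholds back with as.numeric(). A minimal sketch using the default values from the function definition:

```r
marker.1.info.list <- list(c("CD8", 1.8, 98))

# All three elements are coerced to character by c()
marker.1 <- marker.1.info.list[[1]]
print(marker.1)

# Recover the name and the numeric thresholds, as the function does
marker.name <- marker.1[1]
measurement.threshold <- as.numeric(marker.1[2])
confidence.threshold <- as.numeric(marker.1[3])
```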
An R function to summarise cell segmentation data into counts of single- and double-positive cells
- Location of the script file on QIMR network drive L L:/Lab_MarkS/lunC/work/Immunohistochemistry_images/scripts/RFunction_summarise-cell-segmentation-files.R
- The input cell segmentation data file should be
- A tab-separated file
- Having at least 4 columns with headers in this order. The function identifies the columns by their positions, not by their names
- column 1 : A grouping variable that uniquely identifies TMA core location
- column 2 : Confidence variable with string values (e.g., 95%)
- column 3 : measurement of marker 1
- column 4 : measurement of marker 2
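What the summary counts mean can be illustrated on a toy table with this layout. The sketch below uses base R and made-up values (the core IDs, the measurements and the thresholds 1.8, 3 and 98 are illustrative); it mirrors the logic of the function but is not the function itself.

```r
cells <- data.frame(
   core = c("A1", "A1", "A1", "B2", "B2")
  ,confidence = c("99%", "99%", "90%", "99%", "99%")
  ,CD8 = c(2.5, 1.0, 3.0, 2.0, 0.5)
  ,CD226 = c(3.5, 4.0, 5.0, 1.0, 0.2)
  ,stringsAsFactors = FALSE)

# Convert "99%" to 99 so that it can be compared with a numeric threshold
confidence.percent <- as.numeric(gsub("%", "", cells[, 2], fixed = TRUE))

# A cell is positive when it exceeds both the marker and confidence thresholds
CD8.positive <- cells[, 3] > 1.8 & confidence.percent > 98
CD226.positive <- cells[, 4] > 3 & confidence.percent > 98
double.positive <- CD8.positive & CD226.positive

# Count cells per TMA core
counts <- data.frame(
   core = sort(unique(cells$core))
  ,number.cells = as.vector(table(cells$core))
  ,CD8_p = as.vector(tapply(CD8.positive, cells$core, sum))
  ,CD226_p = as.vector(tapply(CD226.positive, cells$core, sum))
  ,CD8pCD226p = as.vector(tapply(double.positive, cells$core, sum)))
print(counts)
```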
if (!"dplyr" %in% rownames(installed.packages())) install.packages("dplyr")
if (!"stringr" %in% rownames(installed.packages())) install.packages("stringr")
if (!"purrr" %in% rownames(installed.packages())) install.packages("purrr")
library(dplyr)
SummariseTwoMarkersCellSegmentationDataInForm<- function( file.path.cell.seg.data.file="E:/Lab_MarkS/lunC_work/Immunohistochemistry_images/data_output/AP_Exp108.1_PeterMac-lungCancer-CD8-PD1/analysis-results/AP_Exp108.1_PeterMac-lungCancer-CD8-PD1_cell-seg-data-merged_CD8_PD1.tsv"
,column.position.grouping.variable=1
,column.position.confidence=2
,column.position.marker.1=3
,column.position.marker.2=4
,marker.1.info.list=list(c("CD8", 1.8, 98))
,marker.2.info.list=list(c("CD226", 3, 98))
,output.folder.path="E:/Lab_MarkS/lunC_work/Immunohistochemistry_images/data_output/AP_Exp109.1_PeterMac-lungCancer-CD8-CD226/analysis-results"){
if(file.exists(file.path.cell.seg.data.file)!=TRUE){
cat("Input file for cell segmentation data could not be found")
} else {
column.positions <- as.numeric(c( column.position.grouping.variable
,column.position.confidence
,column.position.marker.1
,column.position.marker.2
)
)
cat("Reading these columns by position from the input data file into R","\n"
,column.positions,"\n")
varname.marker.1 <- marker.1.info.list[[1]][1]
varname.marker.2 <- marker.2.info.list[[1]][1]
data <- read.delim( file=file.path.cell.seg.data.file
,header = TRUE
,sep = "\t"
,stringsAsFactors = F
) %>%
dplyr::select(dplyr::all_of(column.positions)) %>%
dplyr::rename( grouping.variable = 1
,!!varname.marker.1 := 3
,!!varname.marker.2 := 4 ) %>%
dplyr::mutate(
confidence_percent= as.numeric(stringr::str_replace_all(string=.[,2]
,pattern="%"
,replacement=""))) %>%
dplyr::select(-2) %>%
dplyr::select(grouping.variable, confidence_percent, everything())
measurement.threshold.marker.1 <- as.numeric(marker.1.info.list[[1]][2])
confidence.threshold.marker.1 <- as.numeric(marker.1.info.list[[1]][3])
measurement.threshold.marker.2 <- as.numeric(marker.2.info.list[[1]][2])
confidence.threshold.marker.2 <- as.numeric(marker.2.info.list[[1]][3])
filtration.conditions.single.positive.marker.1 <- expression(.[,3] > measurement.threshold.marker.1 &
.[,2] > confidence.threshold.marker.1)
filtration.conditions.single.positive.marker.2 <- expression(.[,4] > measurement.threshold.marker.2 &
.[,2] > confidence.threshold.marker.2)
filtration.conditions.double.positive <- expression(.[,3] > measurement.threshold.marker.1 &
.[,2] > confidence.threshold.marker.1 &
.[,4] > measurement.threshold.marker.2 &
.[,2] > confidence.threshold.marker.2)
data.subset.single.positive.marker.1 <- data %>%
dplyr::filter(eval(filtration.conditions.single.positive.marker.1)) %>%
dplyr::mutate(cell.status=paste0(marker.1.info.list[[1]][1],"+"))
cat("Subsetting single positive cells based on"
,marker.1.info.list[[1]][1]
,"threshold of"
, measurement.threshold.marker.1
, "and confidence threshold of"
, confidence.threshold.marker.1,"\n"
,(nrow(data)-nrow(data.subset.single.positive.marker.1))/nrow(data)*100
,"% data are filtered out","\n")
data.subset.single.positive.marker.2 <- data %>%
dplyr::filter(eval(filtration.conditions.single.positive.marker.2))%>%
dplyr::mutate(cell.status=paste0(marker.2.info.list[[1]][1],"+"))
cat("Subsetting single positive cells based on"
,marker.2.info.list[[1]][1]
,"threshold of"
, measurement.threshold.marker.2
, "and confidence threshold of"
, confidence.threshold.marker.2,"\n"
,(nrow(data)-nrow(data.subset.single.positive.marker.2))/nrow(data)*100
,"% data are filtered out","\n")
data.subset.double.positive <- data %>%
dplyr::filter(eval(filtration.conditions.double.positive))%>%
dplyr::mutate(cell.status=paste0( marker.1.info.list[[1]][1],"+"
,marker.2.info.list[[1]][1],"+"))
cat("Subsetting double positive cells based on"
,marker.1.info.list[[1]][1]
,"threshold of"
, measurement.threshold.marker.1
,marker.2.info.list[[1]][1]
,"threshold of"
, measurement.threshold.marker.2
, "confidence threshold of"
, confidence.threshold.marker.1
,"&" , confidence.threshold.marker.2,"\n"
,(nrow(data)-nrow(data.subset.double.positive))/nrow(data)*100
,"% data are filtered out","\n")
summary.data <- data %>%
dplyr::group_by(grouping.variable) %>%
dplyr::count(name = "number.cells")
summary.marker.1 <- data.subset.single.positive.marker.1 %>%
dplyr::group_by(grouping.variable) %>%
dplyr::count(name = paste0(marker.1.info.list[[1]][1],"_p"))
summary.marker.2 <- data.subset.single.positive.marker.2 %>%
dplyr::group_by(grouping.variable) %>%
dplyr::count(name = paste0(marker.2.info.list[[1]][1],"_p"))
summary.double.positive <- data.subset.double.positive %>%
dplyr::group_by(grouping.variable) %>%
dplyr::count(name = paste0(marker.1.info.list[[1]][1],"p"
,marker.2.info.list[[1]][1],"p"))
summary.all <- list( summary.data
,summary.marker.1
,summary.marker.2
,summary.double.positive) %>%
purrr::reduce(dplyr::full_join, by=c("grouping.variable"))
attributes(summary.all)$class <- "data.frame"
column.name.ratio <- paste0(names(summary.marker.2)[2],"_ratio")
summary.all[[column.name.ratio]] <- summary.all[,5]/summary.all[,3]
summary.all[is.na(summary.all)] <- 0
write.table( x=summary.all
,file = file.path(output.folder.path, "summary-of-cell-seg-data_single-positive_double-positive_ratio.tsv")
,sep = "\t"
,col.names = TRUE
,quote = FALSE
,row.names = FALSE)
cat("Exported summary data file at","\n"
,file.path(output.folder.path, "summary-of-cell-seg-data_single-positive_double-positive_ratio.tsv"),"\n")
}
}
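The last steps of the function join the per-core counts and replace missing values with zero. A base R sketch of the same idea, using merge() in place of dplyr::full_join() and made-up counts:

```r
total <- data.frame(core = c("A1", "B2"), number.cells = c(3, 2))

# Core B2 has no positive cells, so it is absent from this table
positive <- data.frame(core = "A1", CD8_p = 1)

# Keep all cores from both tables; missing counts become NA
summary.all <- merge(total, positive, by = "core", all = TRUE)

# A core with no positive cells has a count of zero, not a missing value
summary.all[is.na(summary.all)] <- 0
print(summary.all)
```

Without the final replacement, cores that lack positive cells would carry NA into any downstream ratio.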
Annotate tissue and detect DAB in QuPath
CD39
Create a new empty folder at
C:\Lab_MarkS\lunC_work\Immunohistochemistry_images\QuPath-projects\QuPath-project_Exp91-BRAFRES1-RBWHMM_CD39
Create a new QuPath project and select the folder above as the project folder
File > Project > Create project
Add images to the project from
L:\Lab_MarkS\AshR\CD39 project images (all)\exp91_CD39 abcam M3R repeat
Create class
Annotations > right-click in the class panel > Add class > Name it as "tissue" > change color to "Red"
The pixel classifier tissueClassifier colors tissue red, according to the color specified by tissue=red
Annotations > right-click in the class panel > Add class > Name it as "DAB-positive" > change color to "Lime".
The pixel classifier DAB-measurement colors DAB green, according to the color specified by DAB-positive=green
Following the project description at L:\Lab_MarkS\AshR\5. QuPath Analysis\Analysis01.1 - MM_CD39 DAB Measure_AP210401\Description analysis01.1.txt, find out which JSON files are used as the pixel classifiers. Here they are:
findtissue vh H G 1 0.05 tissue ignore e.json
vh dab g 0.25 pos neg ash.json
Copy the source [pixel_classifiers folder](L:/Lab_MarkS/AshR/5. QuPath Analysis/Analysis01.1 - MM_CD39 DAB Measure_AP210401/classifiers/pixel_classifiers) to the project folder
Copy an existing scripts folder to the project scripts folder
Modify the script file
Run the script
Script Editor > Run > Run for project > select all images
Classifying pixels with 2 pixel classifiers (manually)
- Create a project and add images
- Load a pixel classifier
- Select tissueClassifier (/classifiers/pixel_classifiers/tissueClassifier.json)
- Apply the classifier to the full image
- Specify hole size as 1000
- Export rendered RGB
- Exported image
- Delete annotations
- Load another pixel classifier
- Apply the classifier to the full image
- Specify hole size as 1000
- Export rendered RGB
- Exported image

Classifying pixels with 2 pixel classifiers (Tam's script)
- Run the script on 1 image
- Exported image

Merge 3 channels in Fiji
Problem: THUNDER Imaging Systems generate grayscale images. You would like to combine these grayscale images into a single color image
Data:
L:\Lab_MarkS\lunC\Immunohistochemistry_images\data_input\Exp59_HNSCC-CD226-ratio-201014-Rescanned\HNSCC_TMA_A_1core
HNSCC A Re-scan_1B_ICC Merged_RAW_ch00.tif
HNSCC A Re-scan_1B_ICC Merged_RAW_ch01.tif
HNSCC A Re-scan_1B_ICC Merged_RAW_ch02.tif
Solution:
Drag and drop a group of 3 images to Fiji

Select channels

Save your merged image

Merge channels by a script
Problem: You have a large number of grayscale images to merge. You don't enjoy clicking your mouse multiple times.
Data:
L:\Lab_MarkS\lunC\Immunohistochemistry_images\data_input\Exp59_HNSCC-CD226-ratio-201014-Rescanned\HNSCC_TMA_A_1core
HNSCC A Re-scan_1B_ICC Merged_RAW_ch00.tif
HNSCC A Re-scan_1B_ICC Merged_RAW_ch01.tif
HNSCC A Re-scan_1B_ICC Merged_RAW_ch02.tif
Solution:
Move the image files to a new folder if the source folder also contains non-image files
Run an ImageJ macro script file (C:\Lab_MarkS\lunC\Immunohistochemistry_images\scripts\merge-red-green-blue-channels.ijm)
Plugins > Macro > Edit > Select the merge-red-green-blue-channels.ijm file

Run the script by clicking the Run tab or pressing Ctrl+R

Enter file suffixes for blue, green and red channels

Select the source image folder

Select a folder to save merged images

This error occurs if the number of elements in the folder is not a multiple of 3

Channel merging successfully completed
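Before running the macro, the folder can be checked from R to avoid the multiple-of-3 error. This is a sketch under the assumption that the three channels share a base name and differ only in the suffixes ch00, ch01 and ch02, as in the example files; the helper name CheckChannelTriplets is hypothetical and is not part of the macro.

```r
CheckChannelTriplets <- function(folder.path, suffixes = c("ch00", "ch01", "ch02")) {
  files <- list.files(folder.path, pattern = "\\.tif$", ignore.case = TRUE)
  # Strip "_chNN.tif" to recover the base name shared by the three channels
  pattern <- paste0("_(", paste(suffixes, collapse = "|"), ")\\.tif$")
  base.names <- sub(pattern, "", files, ignore.case = TRUE)
  # Number of channel files found per base name
  counts <- table(base.names)
  list( n.files = length(files)
       ,multiple.of.3 = length(files) %% 3 == 0
       ,incomplete = names(counts)[counts != length(suffixes)])
}

# Demo on a temporary folder with one complete and one incomplete set
demo.folder <- file.path(tempdir(), "channel-check-demo")
dir.create(demo.folder, showWarnings = FALSE)
demo.files <- c( "core 1B_RAW_ch00.tif", "core 1B_RAW_ch01.tif"
                ,"core 1B_RAW_ch02.tif", "core 2C_RAW_ch00.tif")
file.create(file.path(demo.folder, demo.files))
result <- CheckChannelTriplets(demo.folder)
print(result)
```

The incomplete element lists the base names that do not have exactly three channel files, which points to the images to fix or move before merging.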

Problem: What if you don't know the file suffixes of the channels?
Solution?: Check the *_Properties.xml file under the MetaData folder
