# [Digital Comics Image Indexing Based on Deep Learning](https://pdfs.semanticscholar.org/9a77/84eea6bfa62bf2834ee0b87a3cdda46006f2.pdf)
- Introduces a new dataset with a global indexing system
- Compares deep learning models for comic book analysis tasks
### Proposed Method :

- The model is separated into offline and online tasks and the offline tasks are formulated to capture the comic book elements
- For panel, character, and face detection, YOLOv2 is used with anchor boxes, as in Faster R-CNN.
- The anchor box dimensions are found by running k-means on the ground-truth bounding boxes.
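
A minimal sketch of the anchor clustering step, assuming the distance used in the YOLOv2 paper, d = 1 − IoU between (width, height) pairs. The function names `wh_iou` and `kmeans_anchors`, the iteration cap, and the random seeding are illustrative choices, not details taken from the indexing paper.

```python
import random


def wh_iou(box, centroid):
    """IoU of two (width, height) boxes aligned at a common corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster ground-truth (width, height) pairs into k anchor shapes.

    Assignment uses the highest IoU (equivalently the smallest 1 - IoU);
    each centroid is updated to the mean width and height of its members.
    """
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: wh_iou(box, centroids[i]))
            clusters[best].append(box)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                new_centroids.append((w, h))
            else:
                # Keep the old centroid if no box was assigned to it.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Sort by area so the output order does not depend on initialisation.
    return sorted(centroids, key=lambda c: c[0] * c[1])
```

Using IoU instead of Euclidean distance keeps large boxes from dominating the clustering, which is the motivation given in the YOLOv2 paper.
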

**Speech Balloon Segmentation**
- Traditional methods produce many false positives, while deep learning models cannot identify balloon boundaries as well as traditional methods do.
- The proposed method combines deep learning with traditional methods.
- The deep learning model is DeepLabv2. A balloon it detects is added to the output set if its IoU with a balloon detected by the traditional method is above a threshold.
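
The IoU check between the two detectors could be sketched as below. This is an illustration under assumptions: masks are represented as sets of (row, col) pixels, the threshold value 0.5 is a placeholder, and `rect`, `mask_iou`, and `confirmed_balloons` are hypothetical helper names. The function only returns the matched pairs; which mask of each pair goes into the output is left to the caller.

```python
def rect(r0, c0, r1, c1):
    """Toy helper: a filled rectangle as a set of (row, col) pixels."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


def mask_iou(a, b):
    """IoU of two masks given as sets of (row, col) pixel coordinates."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def confirmed_balloons(deep_masks, trad_masks, threshold=0.5):
    """Pair each deep-learning balloon with its best traditional match.

    Returns (deep_index, trad_index, iou) for every deep-learning mask
    whose best IoU with a traditional mask is above the threshold.
    Traditional detections with no match are dropped, which is how the
    combination suppresses their false positives.
    """
    matches = []
    for i, deep in enumerate(deep_masks):
        best_j, best_iou = None, 0.0
        for j, trad in enumerate(trad_masks):
            iou = mask_iou(deep, trad)
            if iou > best_iou:
                best_j, best_iou = j, iou
        if best_j is not None and best_iou > threshold:
            matches.append((i, best_j, best_iou))
    return matches
```
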

**Text Recognition**

- The authors propose a method that uses unlabeled data for recognizing text of a particular style.
- First, a pretrained OCR system reads the text. Its output is evaluated with a lexicality metric and then used as pseudo ground truth to train a second OCR model.
- The pretrained OCR used is FineReader.
- The lexicality metric is L = 1 − (mean Levenshtein distance per character).
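
One way to read the lexicality metric is sketched below. The notes do not say what the Levenshtein distance is measured against, so this sketch assumes each OCR token is compared with its nearest word in a lexicon and the summed distance is divided by the total character count. The names `lexicality` and `select_pseudo_labels` and the 0.9 cut-off are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def lexicality(text, lexicon):
    """L = 1 - (summed distance to nearest lexicon word / total characters).

    Assumes a non-empty lexicon of lowercase words. The result is
    clamped at 0 so that very noisy text cannot score below zero.
    """
    words = text.lower().split()
    total_chars = sum(len(w) for w in words)
    if total_chars == 0:
        return 0.0
    total_dist = sum(min(levenshtein(w, v) for v in lexicon) for w in words)
    return max(0.0, 1.0 - total_dist / total_chars)


def select_pseudo_labels(ocr_lines, lexicon, min_lexicality=0.9):
    """Keep only OCR outputs that look lexical enough to train on."""
    return [line for line in ocr_lines
            if lexicality(line, lexicon) >= min_lexicality]
```

Lines that pass the filter would serve as pseudo ground truth for the second OCR model; lines that fail are simply left out of its training set.
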

**Loss:**
- The losses assessed in the paper are perceptual loss and pixel difference.
- Perceptual loss is used because pixel difference makes it difficult to reconstruct intra-subject variations of a template.
### Experiments :
- The datasets used are eBDtheque, Fahad18 and the new DCM772 dataset.
- For character and face detection, mAP is the metric. For panel detection, precision and recall are computed from the IoU between ground truth and prediction.
- For balloon segmentation, segmentation accuracy is used. For text recognition, character and word error rates are used.
- Deep learning models perform well on character and face detection but fall behind on panel detection.
- In balloon segmentation, the boundary problem persists for ML methods, and the combined method performs slightly better.
- The text recognition OCR gives high error rates, which may be attributed to the text extraction algorithm cutting lines.
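
The IoU-based precision and recall used for panel detection can be sketched as follows. The matching rule is an assumption: each prediction is greedily matched to the unmatched ground-truth box with the highest IoU, and a match counts only at IoU ≥ 0.5. The threshold and the names `box_iou` and `precision_recall` are illustrative, and boxes are taken to be `(x1, y1, x2, y2)` tuples.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def precision_recall(predictions, ground_truth, iou_threshold=0.5):
    """Greedy one-to-one matching of predicted to ground-truth boxes."""
    unmatched = list(ground_truth)
    true_positives = 0
    for pred in predictions:
        best = max(unmatched, key=lambda gt: box_iou(pred, gt), default=None)
        if best is not None and box_iou(pred, best) >= iou_threshold:
            unmatched.remove(best)  # each ground-truth box matches once
            true_positives += 1
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall
```

A prediction that overlaps no panel lowers precision, while a panel left in `unmatched` lowers recall.
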