
# Week 18 Topic Discussion
###### tags: `OCR track`

1. Implement a glare (reflection) detection model
2. Implement a new text detection algorithm
3. Train a new OCR model
4. Hand-writing OCR model
5. Classification model
6. Other ideas

## RPA support CV tasks
* 昊中: CRNN + CTC (PyTorch), trained on the dataset from the `cc_ws_ocr` project
* 信賢: data augmentation, OCR, or YOLO backbone (pick one of the three)
* 沛筠: data augmentation
* RPA task: CAPTCHA OCR

## OCR steps
* Classification model
    * EfficientNet
    * MobileNet
* Image preprocessing
    * Denoising
    * Data augmentation
* Detection
    * Text (e.g., CRAFT)
    * Object (e.g., YOLOv4)
    * Segmentation
* OCR
    * Preprocessing: e.g., darken and thicken the text before applying the OCR model
    * Improve the older version of the recognition model
    * Hand-writing OCR model
        * Credit card auto-debit application form
            * Excluded
                * CRNN+CTC: 90 ~ 91%
                * [clova](https://github.com/clovaai/deep-text-recognition-benchmark)
                * attention
        * Handwritten checks
        * Remittance slips

## How we will proceed
* :star: Presentation order:
    * week 18: Fix the algorithm and measure how much each data augmentation method contributes to OCR performance (CRNN+CTC fixed; see `if_cv_dev` for reference)
    * week 21: Pick several OCR algorithms (excluding CRNN+CTC, attention, and clova) and compare their performance on the same data
* Follow-up comparisons:
    * case 1: CRNN + CTC & original data (baseline)
    * case 2: hand-writing OCR model & original data
    * case 3: CRNN + CTC & data augmentation (method 1, 2, ...)
    * case 4: hand-writing OCR model & data augmentation

### Augmentation notes

## Methodology

信賢:
* :star:Image-to-Image Translation with Conditional Adversarial Networks
    - [blog](https://kaonashi-tyc.github.io/2017/04/06/zi2zi.html)
    - [paper](https://arxiv.org/pdf/1611.07004.pdf)
    - [code1](https://github.com/kaonashi-tyc/zi2zi)
    - [code2](https://github.com/jasonlo0509/Font2Font)
* TextRecognitionDataGenerator [code](https://github.com/Belval/TextRecognitionDataGenerator)
* Text Renderer [code](https://github.com/Sanster/text_renderer)
* Augmentor [code](https://github.com/mdbloice/Augmentor)

昱睿:
* :star:AutoAugment: Learning Augmentation Strategies from Data
    * paper: [AutoAugment: Learning Augmentation Strategies from Data](https://arxiv.org/pdf/1805.09501.pdf)
    * code: [DeepVoltaire/AutoAugment](https://github.com/DeepVoltaire/AutoAugment)

立晟:
* :star:imgaug
    - [docs](https://imgaug.readthedocs.io/en/latest/source/examples_basics.html)
* :star:Albumentations
    - [paper](https://arxiv.org/pdf/1809.06839.pdf)
    - [docs](https://albumentations.readthedocs.io/en/latest/)

昊中:
* imgaug
* https://github.com/Belval/TextRecognitionDataGenerator
* https://github.com/Sanster/text_renderer
* https://github.com/williamyang1991/TET-GAN
* https://github.com/pyliaorachel/cross-lingual-font-style-transfer
* (backup) https://github.com/Canjie-Luo/Text-Image-Augmentation
* :star:https://github.com/RubanSeven/Text-Image-Augmentation-python

沛筠:
* :star:RandAugment: Practical automated data augmentation with a reduced search space
    * paper: [RandAugment: Practical automated data augmentation with a reduced search space](https://arxiv.org/pdf/1909.13719.pdf)
    * code: [pytorch-randaugment](https://github.com/ildoonet/pytorch-randaugment)
    * code: [imgaug (ready-made)](https://github.com/aleju/imgaug/blob/master/imgaug/augmenters/collections.py)
        * An extension of the PyTorch version
    * code: [tensorflow (older)](https://github.com/tensorflow/tpu/blob/1ebfc4b5fa6b5fad0bd7d422832ae8826e8059f2/models/official/efficientnet/autoaugment.py)
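
A rough sketch of the RandAugment idea listed above, for discussion only: apply `n_ops` randomly chosen operations, all at one shared `magnitude`. This is not the code from any of the linked repos. The operation set, the 0 to 10 magnitude scale, the pixel ranges, and every function name here are assumptions made for illustration, and the image is a plain nested list of grayscale values so the example runs without extra libraries. Unlike the paper, it does not randomize the direction of each operation.

```python
import random

MAX_MAGNITUDE = 10  # assumed magnitude scale for this sketch


def identity(img, magnitude):
    # Return an unchanged copy of the image.
    return [row[:] for row in img]


def brightness(img, magnitude):
    # Brighten by up to 100 gray levels, clipped to 255.
    shift = round(100 * magnitude / MAX_MAGNITUDE)
    return [[min(255, p + shift) for p in row] for row in img]


def contrast(img, magnitude):
    # Stretch pixel distance from mid-gray (128) by a factor in [1.0, 2.0].
    factor = 1.0 + magnitude / MAX_MAGNITUDE
    return [
        [max(0, min(255, round(128 + (p - 128) * factor))) for p in row]
        for row in img
    ]


def translate_x(img, magnitude, fill=255):
    # Shift content right by up to 30% of the width, padding with white.
    if not img:
        return []
    width = len(img[0])
    shift = round(0.3 * width * magnitude / MAX_MAGNITUDE)
    return [[fill] * shift + row[: width - shift] for row in img]


# Hypothetical operation pool; a real setup would use a larger set.
OPS = [identity, brightness, contrast, translate_x]


def rand_augment(img, n_ops=2, magnitude=5, rng=None):
    """Apply n_ops uniformly sampled operations at a shared magnitude."""
    rng = rng or random.Random()
    out = [row[:] for row in img]
    applied = []
    for _ in range(n_ops):
        op = rng.choice(OPS)
        out = op(out, magnitude)
        applied.append(op.__name__)
    return out, applied


if __name__ == "__main__":
    sample = [[0, 128, 255, 40], [10, 200, 90, 255]]
    augmented, names = rand_augment(sample, n_ops=2, magnitude=5,
                                    rng=random.Random(0))
    print(names)
    print(augmented)
```

The only two knobs are `n_ops` and `magnitude`, which is what would keep the search space small when comparing augmentation settings against the fixed CRNN+CTC baseline (case 1 vs. case 3).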