HandWriting

AttentionHTR: Handwritten Text Recognition Based on Attention Encoder-Decoder Networks

github: https://github.com/dmitrijsk/attentionhtr

Not a very strong paper: it just takes an attention model and fine-tunes it, the code is cloned directly from Clova's repo, and augmentation is deferred to future work.
But I want to use this paper.

Fine-tuned on two handwriting datasets: IAM and Imgur5K.

Transformer-based Optical Character Recognition with Pre-trained Models

github: https://github.com/microsoft/unilm/tree/master/trocr

Model Architecture

A very basic Transformer encoder-decoder architecture, pretrained on hundreds of millions of synthetically generated printed-text images, then fine-tuned on handwriting.
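The encoder side treats the text-line image as a sequence of patches, ViT-style. A minimal sketch of that patching step (the 384×384 input and 16×16 patch size are the standard settings assumed here; this is illustrative, not the TrOCR code):

```python
import numpy as np

# ViT-style patch embedding input: split an HxWxC image into a
# sequence of flattened patches, which the Transformer encoder
# then consumes as tokens.
def patchify(image: np.ndarray, patch: int = 16) -> np.ndarray:
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0
    return (image
            .reshape(h // patch, patch, w // patch, patch, c)
            .transpose(0, 2, 1, 3, 4)
            .reshape(-1, patch * patch * c))

img = np.zeros((384, 384, 3))
tokens = patchify(img)
print(tokens.shape)  # (576, 768): 24x24 patches, each 16*16*3 = 768 values
```

Each of the 576 patch vectors is then linearly projected and given a position embedding before entering the encoder.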

Encoder Initialization

  1. DeiT: uses a CNN-based model as a teacher via distillation, so the ViT can be trained with only ImageNet-scale data
  2. BEiT: BERT Pre-Training of Image Transformers

Decoder Initialization

  1. RoBERTa
  2. MiniLM

Augmentation

  • random rotation (-10 to 10 degrees)
  • Gaussian blurring
  • image dilation, image erosion
  • downscaling
  • underlining
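The augmentations above can all be sketched with Pillow. The rotation range follows the notes; the blur radius, filter kernel sizes, downscale factor, and underline position are illustrative assumptions, not values from the paper:

```python
import random
from PIL import Image, ImageDraw, ImageFilter

# One randomly chosen augmentation per call, on a grayscale text-line
# image with dark ink on a white background.
def augment(img: Image.Image) -> Image.Image:
    op = random.choice(["rotate", "blur", "dilate", "erode",
                        "downscale", "underline"])
    if op == "rotate":
        return img.rotate(random.uniform(-10, 10), fillcolor=255)
    if op == "blur":
        return img.filter(ImageFilter.GaussianBlur(radius=1.5))
    if op == "dilate":  # MinFilter thickens dark strokes
        return img.filter(ImageFilter.MinFilter(3))
    if op == "erode":   # MaxFilter thins dark strokes
        return img.filter(ImageFilter.MaxFilter(3))
    if op == "downscale":  # lose detail, then restore original size
        w, h = img.size
        return img.resize((w // 2, h // 2)).resize((w, h))
    # underline: draw a horizontal line near the text baseline
    out = img.copy()
    y = int(img.size[1] * 0.8)  # assumed baseline position
    ImageDraw.Draw(out).line([(0, y), (img.size[0], y)], fill=0, width=2)
    return out

sample = Image.new("L", (128, 32), color=255)
print(augment(sample).size)  # (128, 32): size is preserved
```

Every branch returns an image of the original size, so the augmented samples can be batched without extra resizing.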

HandWriting results