HandWriting

AttentionHTR: Handwritten Text Recognition Based on Attention Encoder-Decoder Networks

github: https://github.com/dmitrijsk/attentionhtr

Not a very strong paper: it just takes an attention model and fine-tunes it, the code is cloned directly from Clova's repo, and augmentation is deferred to future work.
But I want to use this paper.

Fine-tuned on two handwriting datasets: IAM and Imgur5K.

Transformer-based Optical Character Recognition with Pre-trained Models

github: https://github.com/microsoft/unilm/tree/master/trocr

Model Architecture

A very basic Transformer encoder-decoder architecture, pretrained on hundreds of millions of synthetically generated printed-text images, then fine-tuned on handwriting.
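The encoder side treats the text-line image as a sequence of patches, ViT-style. A minimal sketch of that patching step (the 384×384 input and 16×16 patch size are the standard settings assumed here; this is illustrative, not the TrOCR code):

```python
import numpy as np

# ViT-style patch embedding input: split an HxWxC image into a
# sequence of flattened patches, which the Transformer encoder
# then consumes as tokens.
def patchify(image: np.ndarray, patch: int = 16) -> np.ndarray:
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0
    return (image
            .reshape(h // patch, patch, w // patch, patch, c)
            .transpose(0, 2, 1, 3, 4)
            .reshape(-1, patch * patch * c))

img = np.zeros((384, 384, 3))
tokens = patchify(img)
print(tokens.shape)  # (576, 768): 24x24 patches, each 16*16*3 = 768 values
```

Each of the 576 patch vectors is then linearly projected and given a position embedding before entering the encoder.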

Encoder Initialization

  1. DeiT: uses a CNN-based model as a teacher via distillation, so the ViT can be trained with only ImageNet-scale data
  2. BEiT: BERT Pre-Training of Image Transformers

Decoder Initialization

  1. RoBERTa
  2. MiniLM

Augmentation

  • random rotation (-10 to 10 degrees)
  • Gaussian blurring
  • image dilation, image erosion
  • downscaling
  • underlining
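The augmentations above can all be sketched with Pillow. The rotation range follows the notes; the blur radius, filter kernel sizes, downscale factor, and underline position are illustrative assumptions, not values from the paper:

```python
import random
from PIL import Image, ImageDraw, ImageFilter

# One randomly chosen augmentation per call, on a grayscale text-line
# image with dark ink on a white background.
def augment(img: Image.Image) -> Image.Image:
    op = random.choice(["rotate", "blur", "dilate", "erode",
                        "downscale", "underline"])
    if op == "rotate":
        return img.rotate(random.uniform(-10, 10), fillcolor=255)
    if op == "blur":
        return img.filter(ImageFilter.GaussianBlur(radius=1.5))
    if op == "dilate":  # MinFilter thickens dark strokes
        return img.filter(ImageFilter.MinFilter(3))
    if op == "erode":   # MaxFilter thins dark strokes
        return img.filter(ImageFilter.MaxFilter(3))
    if op == "downscale":  # lose detail, then restore original size
        w, h = img.size
        return img.resize((w // 2, h // 2)).resize((w, h))
    # underline: draw a horizontal line near the text baseline
    out = img.copy()
    y = int(img.size[1] * 0.8)  # assumed baseline position
    ImageDraw.Draw(out).line([(0, y), (img.size[0], y)], fill=0, width=2)
    return out

sample = Image.new("L", (128, 32), color=255)
print(augment(sample).size)  # (128, 32): size is preserved
```

Every branch returns an image of the original size, so the augmented samples can be batched without extra resizing.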

HandWriting results