###### tags: `cv_infra` `CV team`
# CV Training Pipeline
## Annotation format
- Labeling tool: labelme
- Label format: JSON
- Label example
```json=
{
  "version": "4.5.9",
  "flags": {},
  "shapes": [
    {
      "label": "acct_id-A123456789",
      "points": [
        [100.123456687, 100.354647465],
        [100.123456687, 200.354647465],
        [200.123456687, 200.354647465],
        [200.123456687, 100.354647465]
      ],
      "group_id": null,
      "shape_type": "polygon",
      "flags": {}
    },
    {
      ...
    }
  ],
  "imagePath": "../Desktop/test.jpg",  # relative path
  "imageData": null,  # may also hold the image as a base64 string
  "imageHeight": 634,
  "imageWidth": 772
}
```
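As a minimal sketch of how a downstream script might read such a file: the example label looks like `<tag>-<text>` (`acct_id-A123456789`), so the helper below splits each label on the first hyphen. The names `split_label` and `read_shapes` are hypothetical, and the `<tag>-<text>` convention is inferred from the example, not something labelme enforces. Note that the annotated example above is not strict JSON (it carries `#` comments and `...`), so the sketch parses a cleaned-up string.

```python
import json


def split_label(label):
    """Split 'acct_id-A123456789' into ('acct_id', 'A123456789')."""
    # partition() splits on the first hyphen only, so hyphens inside the
    # text part are preserved.
    tag, _, text = label.partition("-")
    return tag, text


def read_shapes(annotation):
    """Return one dict per shape of an already-parsed labelme annotation."""
    shapes = []
    for shape in annotation.get("shapes", []):
        tag, text = split_label(shape["label"])
        shapes.append({
            "tag": tag,
            "text": text,
            "shape_type": shape.get("shape_type"),
            "points": [tuple(p) for p in shape["points"]],
        })
    return shapes


if __name__ == "__main__":
    raw = ('{"shapes": [{"label": "acct_id-A123456789", '
           '"points": [[100.1, 100.3], [200.1, 200.3]], '
           '"shape_type": "polygon"}], '
           '"imageHeight": 634, "imageWidth": 772}')
    for item in read_shapes(json.loads(raw)):
        print(item["tag"], item["text"], len(item["points"]))
```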
## Classification model
- Folder structure
```text
/data
├── train
│   ├── ID_FRONT
│   ├── ID_BACK
│   ├── PASSBOOK_COVER
│   ├── PASSBOOK_INNER
│   ├── NTB_FINANCIAL_STATEMENT
│   ├── WITHHOLDING_STATEMENT
│   └── OTHERS
│
└── test
    ├── ID_FRONT
    ├── ID_BACK
    ├── PASSBOOK_COVER
    ├── PASSBOOK_INNER
    ├── NTB_FINANCIAL_STATEMENT
    ├── WITHHOLDING_STATEMENT
    └── OTHERS
```
- Function for splitting the data into train / validation sets
```python=
import numpy as np
import torch
from torchvision import datasets, transforms, models
from torch.utils.data.sampler import SubsetRandomSampler

data_dir = '/data/train'


def load_split_train_test(datadir, valid_size=0.2):
    """Split the images under `datadir` into train / validation loaders."""
    # Resize to a fixed (height, width): Resize(224) only fixes the shorter
    # side, so images with different aspect ratios could not be batched.
    train_transforms = transforms.Compose([transforms.Resize((224, 224)),
                                           transforms.ToTensor(),
                                           ])
    test_transforms = transforms.Compose([transforms.Resize((224, 224)),
                                          transforms.ToTensor(),
                                          ])
    # Both datasets read the same folder; the samplers below keep them disjoint.
    train_data = datasets.ImageFolder(datadir,
                                      transform=train_transforms)
    test_data = datasets.ImageFolder(datadir,
                                     transform=test_transforms)
    num_train = len(train_data)
    indices = list(range(num_train))
    split = int(np.floor(valid_size * num_train))
    np.random.shuffle(indices)
    # The first `split` shuffled indices form the validation set.
    train_idx, test_idx = indices[split:], indices[:split]
    train_sampler = SubsetRandomSampler(train_idx)
    test_sampler = SubsetRandomSampler(test_idx)
    trainloader = torch.utils.data.DataLoader(train_data,
                                              sampler=train_sampler, batch_size=64)
    testloader = torch.utils.data.DataLoader(test_data,
                                             sampler=test_sampler, batch_size=64)
    return trainloader, testloader


trainloader, testloader = load_split_train_test(data_dir, 0.2)
```
[Reference](https://towardsdatascience.com/how-to-train-an-image-classifier-in-pytorch-and-use-it-to-perform-basic-inference-on-single-images-99465a1e9bf5)
---
## Detection model
### Input label (one label file (.json) per image)
```json=
{
  "shapes": [
    {
      "tag": "acct_id-A123456789",
      "points": [
        [100.123456687, 100.354647465],
        [100.123456687, 200.354647465],
        [200.123456687, 200.354647465],
        [200.123456687, 100.354647465]
      ],
      "group_id": null,  # request for 育銓: allow English text here
      "shape_type": "polygon",
      "flags": {}
    },
    {
      ...
    }
  ],
  "imagePath": "../Desktop/test.jpg",  # relative path
  "imageHeight": 634,
  "imageWidth": 772
}
```
- Each detection model converts the label file above into its own input format
```python=
# yolov4 (lives in the yolov4 training code)
def convert_ano():
    """
    1. Convert points to xmin, ymin, xmax, ymax (mind int vs. float)
    2. tag = label.split('-')[0]
    """
```
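The two docstring steps can be sketched as follows. This is only an illustration, not the actual YOLOv4 training code: `polygon_to_bbox` and `convert_shape` are hypothetical names, and rounding outward with floor/ceil is one assumed way to settle the int/float question so that the integer box still covers the whole polygon. The sketch accepts the text under either `"label"` (labelme's key) or `"tag"` (the key used in the sample above).

```python
import math


def polygon_to_bbox(points):
    """Return integer (xmin, ymin, xmax, ymax) enclosing the polygon."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # Floor the minimum and ceil the maximum so rounding never shrinks the box.
    return (math.floor(min(xs)), math.floor(min(ys)),
            math.ceil(max(xs)), math.ceil(max(ys)))


def convert_shape(shape):
    """Convert one entry of "shapes" into a tag plus bounding box."""
    # Assumption: the text lives under "label" or, failing that, "tag".
    raw = shape.get("label", shape.get("tag", ""))
    return {"tag": raw.split("-")[0], "bbox": polygon_to_bbox(shape["points"])}


if __name__ == "__main__":
    shape = {
        "tag": "acct_id-A123456789",
        "points": [[100.123456687, 100.354647465],
                   [100.123456687, 200.354647465],
                   [200.123456687, 200.354647465],
                   [200.123456687, 100.354647465]],
    }
    print(convert_shape(shape))
```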
- Convert the JSON above into the YOLO input format

- Convert the JSON above into the input format required by other detection models (segmentation, etc.)
---
### YOLOv5
- 13.9k stars
- GitHub: https://github.com/ultralytics/yolov5
- From Ultralytics (API service)
- Documentation: https://docs.ultralytics.com
- Folder structure:
  - Split into train/test
  - One .txt label file per .jpg image
  - Example dataset: https://www.kaggle.com/ultralytics/coco128
- Label tools:
  - CVAT: https://github.com/openvinotoolkit/cvat
  - makesense: makesense.ai

Export your labels to YOLO format, with one `*.txt` file per image (if an image contains no objects, no `*.txt` file is required):
1. One row per object.
2. Each row is in `class x_center y_center width height` format.
3. Box coordinates must be in normalized xywh format (from 0 - 1). If your boxes are in pixels, divide x_center and width by image width, and y_center and height by image height.
4. Class numbers are zero-indexed (start from 0).
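
A minimal sketch of the row format described above, assuming boxes arrive in pixel `(xmin, ymin, xmax, ymax)` form. `CLASS_NAMES` and `to_yolo_row` are hypothetical names, and the single-entry class list is only a placeholder.

```python
CLASS_NAMES = ["acct_id"]  # hypothetical class list; list index = class number


def to_yolo_row(tag, bbox, image_width, image_height):
    """Format one pixel box as a 'class x_center y_center width height' row."""
    xmin, ymin, xmax, ymax = bbox
    # Normalize x values by image width and y values by image height.
    x_center = (xmin + xmax) / 2 / image_width
    y_center = (ymin + ymax) / 2 / image_height
    width = (xmax - xmin) / image_width
    height = (ymax - ymin) / image_height
    class_id = CLASS_NAMES.index(tag)  # zero-indexed class number
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # Image size taken from the sample annotation: 772 x 634 pixels.
    print(to_yolo_row("acct_id", (100, 100, 200, 200), 772, 634))
```

Each image then gets a `.txt` file with one such row per object.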

---
## Recognition model
- Label file conversion function
```python=
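def gen_ocr_data(input_img, input_json, output_path):
    """
    Take a label file (.json) that temporary staff produced with labelme and:
    1. Crop each labeled region out of the image (.jpg)
    2. Generate a label file (.json) for each crop
    3. Write the results to output_path/img/*.jpg and output_path/json/*.json
    """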
```
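One possible shape for this function is sketched below, under several assumptions: Pillow is used for cropping (any image library would do), crops are axis-aligned boxes around each polygon, output files are named `<stem>_<index>`, and each labelme label follows the `<tag>-<text>` pattern. The helpers `crop_box` and `build_crop_records` are hypothetical; only `gen_ocr_data` and the `filepath` / `tag` / `label` keys come from these notes.

```python
import json
from pathlib import Path


def crop_box(points):
    """Axis-aligned (left, upper, right, lower) pixel box around a polygon."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # Truncate the minimum and add 1 to the truncated maximum so the
    # integer box still covers the polygon.
    return (int(min(xs)), int(min(ys)), int(max(xs)) + 1, int(max(ys)) + 1)


def build_crop_records(annotation, stem, output_path):
    """Plan one crop and one label record per shape, without touching disk."""
    out = Path(output_path)
    records = []
    for i, shape in enumerate(annotation["shapes"]):
        tag, _, text = shape["label"].partition("-")
        img_file = out / "img" / f"{stem}_{i}.jpg"
        records.append({
            "box": crop_box(shape["points"]),
            "json_file": out / "json" / f"{stem}_{i}.json",
            "label": {"filepath": str(img_file), "tag": tag, "label": text},
        })
    return records


def gen_ocr_data(input_img, input_json, output_path):
    """Write the crops and their label files planned by build_crop_records."""
    from PIL import Image  # assumption: Pillow is installed

    annotation = json.loads(Path(input_json).read_text(encoding="utf-8"))
    records = build_crop_records(annotation, Path(input_img).stem, output_path)
    image = Image.open(input_img).convert("RGB")
    for record in records:
        img_file = Path(record["label"]["filepath"])
        img_file.parent.mkdir(parents=True, exist_ok=True)
        record["json_file"].parent.mkdir(parents=True, exist_ok=True)
        image.crop(record["box"]).save(img_file)
        record["json_file"].write_text(
            json.dumps(record["label"], ensure_ascii=False), encoding="utf-8")
```

Keeping the planning step separate from the file writes makes the naming and label-splitting logic checkable without any image on disk.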
- JSON examples (one label file (.json) per image)
```json=
# train / evaluation input
{
  "filepath": "/project/cc-apa-ocr/test_crop.jpg",
  "tag": "acct_id",
  "label": "A123456789"
}
```
```json=
# evaluation output
{
  "filepath": "/project/cc-apa-ocr/test_crop.jpg",
  "tag": "acct_id",
  "label": "A123456789",
  "pred": "xxx",
  "prob": 0.99
}
```
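Records in the evaluation output format can be scored directly. The sketch below assumes exact string match between `pred` and `label` as the metric, grouped by `tag`; these notes do not prescribe a metric, and `accuracy_by_tag` is a hypothetical helper.

```python
from collections import defaultdict


def accuracy_by_tag(records):
    """Exact-match accuracy of `pred` against `label`, grouped by `tag`."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for record in records:
        totals[record["tag"]] += 1
        hits[record["tag"]] += int(record["pred"] == record["label"])
    return {tag: hits[tag] / totals[tag] for tag in totals}


if __name__ == "__main__":
    sample = [
        {"tag": "acct_id", "label": "A123456789", "pred": "A123456789", "prob": 0.99},
        {"tag": "acct_id", "label": "A123456789", "pred": "A123456780", "prob": 0.61},
    ]
    print(accuracy_by_tag(sample))
```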
```json=
# evaluation detection output
{
  "filepath": "/project/cc-apa-ocr/test.jpg",
  "tag": "acct_id",
  "xmin":,
  "ymin":,
  "xmax":,
  "ymax":,
  "pred_xmin":,
  "pred_ymin":,
  "pred_xmax":,
  "pred_ymax":,
  "prob": 0.99
}
```
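With both the ground-truth and predicted boxes in one record, a common way to score a detection is intersection over union (IoU). The notes do not name a metric, so IoU is an assumption here, and `iou` and `record_iou` are hypothetical helpers.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    # Degenerate boxes with zero total area score 0 instead of dividing by 0.
    return inter / union if union > 0 else 0.0


def record_iou(record):
    """IoU between the labeled box and the predicted box of one record."""
    truth = (record["xmin"], record["ymin"], record["xmax"], record["ymax"])
    pred = (record["pred_xmin"], record["pred_ymin"],
            record["pred_xmax"], record["pred_ymax"])
    return iou(truth, pred)
```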
| | Paddle | EasyOCR | ChineseOCR |
| -------- | -------- | -------- | -------- |
| GitHub stars | 13.3k | 12.1k | 2.5k |
| Label format | txt | csv | json |
| Folder structure | train/test | train/test | train/test |
| Images per label file | many .jpg, one .txt | many .jpg, one .csv | many .jpg, one .json |
### Paddle (txt)
- 13.3k stars
- GitHub: https://github.com/PaddlePaddle/PaddleOCR
- Annotation fields: file path, label, and anything else the model needs (e.g. width, height, label index)
- Label tools (https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/data_annotation_en.md):
  1. labelImg
  2. roLabelImg: draws rotated boxes
  3. labelme: draws polygons
  4. PPOCRLabel: https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.1/PPOCRLabel
### EasyOCR (csv)
- 12.1k stars
- GitHub: https://github.com/JaidedAI/EasyOCR
- Documentation: https://jaided.ai/easyocr/modelhub/
- Annotation fields: file path, label
- Dataset: https://jaided.ai/easyocr/modelhub/
### ChineseOCR (json)
- 2.5k stars
- GitHub: https://github.com/xiaofengShi/CHINESE-OCR
#### Annotation Format:
- Uses the dataset from https://ctwdataset.github.io
- Tutorial: https://ctwdataset.github.io/tutorial/1-basics.html#Download-images-and-annotations
---
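Discussion:
1. Try several label tools first to see which is easiest to use and what output each one produces:
    - [CVAT](https://github.com/openvinotoolkit/cvat)
    - [makesense](http://makesense.ai/)
    - labelImg
    - rolabelImg
    - labelme

    [-> labelImg family](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/data_annotation_en.md)
2. Label tools output coordinates (for detection) and labels (for recognition) together, like the XML our labelImg produces, so we need a conversion tool that separates the annotation data for the detection model from that for the recognition model.
3. Use one .json per .jpg for the recognition model and one .txt per .jpg for the detection model. This is easy to follow and makes wrong entries easy to correct.

| | YOLOv5 | Detection model |
| -------- | -------- | -------- |
| Label format | txt | txt |
| Folder structure | train/test | train/test |
| Images per label file | one .jpg, one .txt | one .jpg, one .txt |

| | Paddle | EasyOCR | ChineseOCR | Recognition model |
| -------- | -------- | -------- | -------- | -------- |
| Label format | txt | csv | json | json |
| Folder structure | train/test | train/test | train/test | train/test |
| Images per label file | many .jpg, one .txt | many .jpg, one .csv | many .jpg, one .json | one .jpg, one .json |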