# Font Style Transfer Meeting Record
### 5/27
Test results at 100 epochs:
| From Ground Truth | From First Stage Prediction |
| -------- | -------- |
| ![](https://i.imgur.com/VqxRIfm.png) | ![](https://i.imgur.com/LSmz5Sm.png) |
| ![](https://i.imgur.com/noR8Xl4.png) | ![](https://i.imgur.com/m2Qno6s.png) |
| ![](https://i.imgur.com/LW4Vme9.png) | ![](https://i.imgur.com/wQM1V3x.png) |
| ![](https://i.imgur.com/6cS9pFe.png) | ![](https://i.imgur.com/29dJlLH.png) |
| ![](https://i.imgur.com/uuAQW6F.png) | ![](https://i.imgur.com/S1WhB9T.png) |
| ![](https://i.imgur.com/owZiYwg.png) | ![](https://i.imgur.com/xDMyRNL.png) |
---
> 1000
| Ground Truth | Prediction |
| -------- | -------- |
| ![](https://i.imgur.com/NHV8MbJ.png) | ![](https://i.imgur.com/oSv6XX5.png) |
| ![](https://i.imgur.com/cEQP1FV.png) | ![](https://i.imgur.com/1kotCH0.png) |
| ![](https://i.imgur.com/omvnjoQ.png) | ![](https://i.imgur.com/NGzf5hK.png) |
| ![](https://i.imgur.com/UeQmK8i.png) | ![](https://i.imgur.com/4HOr4V5.png) |
| ![](https://i.imgur.com/aqWoOzs.png) | ![](https://i.imgur.com/w5wZWHa.png) |
> 300
| Ground Truth | Prediction |
| -------- | -------- |
| ![](https://i.imgur.com/eLAcwrC.png) | ![](https://i.imgur.com/szunWef.png) |
| ![](https://i.imgur.com/Ik0w38X.png) | ![](https://i.imgur.com/XYrzLhi.png) |
| ![](https://i.imgur.com/PrKCZkt.png) | ![](https://i.imgur.com/dOdwFJ6.png) |
| ![](https://i.imgur.com/nI7hyrt.png) | ![](https://i.imgur.com/BBKitHP.png) |
| ![](https://i.imgur.com/xApmKBr.png) | ![](https://i.imgur.com/fJG1xgd.png) |
> 100
| Ground Truth | Prediction |
| -------- | -------- |
| ![](https://i.imgur.com/IQ5FcDw.png) | ![](https://i.imgur.com/48dFHHN.png) |
| ![](https://i.imgur.com/2C4aSXC.png) | ![](https://i.imgur.com/wWjijzG.png) |
| ![](https://i.imgur.com/F6KZogU.png) | ![](https://i.imgur.com/UXh49XF.png) |
| ![](https://i.imgur.com/oOUAdvi.png) | ![](https://i.imgur.com/51pUNPo.png) |
| ![](https://i.imgur.com/zuiziEy.png) | ![](https://i.imgur.com/LTZ8gmD.png) |
> 10
| Ground Truth | Prediction |
| -------- | -------- |
| ![](https://i.imgur.com/0GjfDPr.png) | ![](https://i.imgur.com/9vK1JGw.png) |
| ![](https://i.imgur.com/xmvigOq.png) | ![](https://i.imgur.com/API6bYJ.png) |
| ![](https://i.imgur.com/X0jiGN6.png) | ![](https://i.imgur.com/AWSQNon.png) |
| ![](https://i.imgur.com/Agxnzhn.png) | ![](https://i.imgur.com/6bn19H5.png) |
| ![](https://i.imgur.com/zuiziEy.png) | ![](https://i.imgur.com/Fb5loDq.png) |
## Note
### Input
- Stroke id
- Bounding Box
| some strokes (1) | Default |
| -------- | -------- |
| <img src=https://i.imgur.com/Ubse25E.png width=512> | <img src=https://i.imgur.com/baE4voV.png width=512> |
> info (2)
```json
{
  "ct": "UCS-CNS",
  "code": "2AE67",
  "item": [
    {
      "strid": 123,
      "rel": [111, 186, 75, 526],
      "val": [80, 117]
    }
  ]
}
```
> stroke ID info (3)
```json
{
  "strid": 123,
  "rel": [96, 216, 68, 520],
  "val": [74, 110],
  "pts": [
    {"x": 134, "xmode": 1, "xvalue": [0, 0], "y": 222, "ymode": 1, "yvalue": [0, 0]},
    {"x": 60, "xmode": 1, "xvalue": [0, 0], "y": 216, "ymode": 1, "yvalue": [0, 0]},
    {"x": 24, "xmode": 1, "xvalue": [1, 1], "y": 502, "ymode": 1, "yvalue": [1, 1]},
    {"x": 126, "xmode": 1, "xvalue": [1, 1], "y": 546, "ymode": 1, "yvalue": [1, 1]}
  ],
  "paths": [
    [
      {"src": 3, "dest": 4, "offpt": [58, 344]},
      {"src": 4, "dest": 5, "offpt": []},
      {"src": 5, "dest": 2, "offpt": [134, 442]},
      {"src": 2, "dest": 3, "offpt": []}
    ]
  ]
}
```
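A minimal stdlib-only sketch of reading the stroke-info JSON above. The field meanings are assumptions from the sample: `rel` looks like a bounding box, `pts` are on-curve points, and `paths` segments with a non-empty `offpt` look like quadratic Bézier control points. The `src`/`dest` indexing scheme is not clear from one sample, so it is left uninterpreted here.

```python
import json

# same structure as the stroke-info sample above (extra x/ymode fields omitted)
stroke_json = """
{
  "strid": 123,
  "rel": [96, 216, 68, 520],
  "val": [74, 110],
  "pts": [
    {"x": 134, "y": 222}, {"x": 60, "y": 216},
    {"x": 24, "y": 502}, {"x": 126, "y": 546}
  ],
  "paths": [[
    {"src": 3, "dest": 4, "offpt": [58, 344]},
    {"src": 4, "dest": 5, "offpt": []},
    {"src": 5, "dest": 2, "offpt": [134, 442]},
    {"src": 2, "dest": 3, "offpt": []}
  ]]
}
"""

stroke = json.loads(stroke_json)
points = [(p["x"], p["y"]) for p in stroke["pts"]]
# segments with a non-empty offpt carry a quadratic Bezier control point
curved = [seg for path in stroke["paths"] for seg in path if seg["offpt"]]
```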
## Second Stage -- Stroke Detection
- [Instance Segmentation V.S. Semantic Segmentation ](https://www.zhihu.com/question/51704852)
- [Mask RCNN](https://paperswithcode.com/paper/mask-r-cnn)
- [DeepMask](https://github.com/facebookresearch/deepmask)
- [Learning to Segment Object Candidates](https://arxiv.org/abs/1506.06204)
- [Learning to Refine Object Segments](https://arxiv.org/abs/1603.08695)
- [R-CNN、Fast R-CNN、Faster R-CNN、YOLO、SSD](https://kknews.cc/zh-tw/code/k2yqmvb.html)
<!-- [toc] -->
## Little Conclusion
- [Baseline - pix2pix](https://phillipi.github.io/pix2pix/)
- [Our Parent - zi2zi](https://github.com/kaonashi-tyc/zi2zi)
- Little Related Works
- https://medium.com/@ankankumarbhunia/recurrent-font-gan-4b5ba27ad138
- https://www.slideshare.net/cnanews/gan-137298578
- [FontRnn](https://github.com/ShusenTang/FontRNN)
## Current Result Analysis
- data from the zi2zi samples
    - ![](https://i.imgur.com/nfdHVsH.png)
- train a discriminator
    - PatchGAN discriminator is not good: ~83% accuracy
    - a pretrained resnet18 reaches 100% accuracy within minutes
        - ![](https://i.imgur.com/orJ3gxa.png)
        - generated (fake) images have a higher "temperature" (brighter heat map)
> Our PatchGAN can therefore be understood as a form of texture/style loss. [pix2pix paper](http://openaccess.thecvf.com/content_cvpr_2017/papers/Isola_Image-To-Image_Translation_With_CVPR_2017_paper.pdf)
- but font data has no texture
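The "higher temperature" observation suggests a global intensity statistic may already separate real from generated glyphs, which would explain why a classifier converges so quickly. A toy stdlib sketch with hypothetical flattened grayscale images (0–255), not our actual data:

```python
def mean_intensity(pixels):
    """Average gray level of a flattened grayscale image (values 0-255)."""
    return sum(pixels) / len(pixels)

# toy stand-ins: a "real" glyph with crisp black strokes on white,
# and a "generated" glyph with lighter strokes and a gray halo
real = [0] * 200 + [255] * 800    # crisp strokes on clean background
fake = [60] * 200 + [250] * 800   # washed-out strokes, dimmer background

print(mean_intensity(real))  # 204.0
print(mean_intensity(fake))  # 212.0 -- globally "hotter" than real
```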
## Comparison of Existing Models
- Reasons not to use pix2pix
    - only supports 1-to-1 translation
- Reasons not to use zi2zi (TensorFlow)
    - built on TensorFlow 1.0, which no longer fits our current development needs
    - the code is messy and hard to maintain
- Reasons not to use FontGan
    - costs too much time and compute
    - to use the CPM (content prior module), a 1-to-1 model must be trained first; two-stage training is awkward (and fine-tuning might require a third stage)
    - it trains stylization and destylization jointly, but the destylization branch only has the following two losses:
        - <img src="https://i.imgur.com/G3G3jR7.png" width="50%">
        - <img src="https://i.imgur.com/w28Ccdy.png" width="50%">
    - so it takes a lot of time, and our preliminary estimate is that the result will not be much better than zi2zi while using noticeably more resources
- Reasons to **use** zi2zi (PyTorch)
    - after fixing many bugs, the results should be on par with the TensorFlow version
    - bugs fixed:
        - `Lconst_penalty` was never multiplied into the loss (effective weight 1 instead of 15)
    - zi2zi needs little training time and produces noticeably better results
- Improvement
    - some data augmentation methods can be applied
        - random mask
        - rotation
        - adversarial training (add noise to the input)
- Conclusion
    - Focus on **stylizing**: FontGan, zi2zi-pytorch; apply a VAE (style embedding) and related losses (KL loss, ...)
> TODO: concrete timing numbers
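The `Lconst_penalty` bug mentioned above amounts to a loss weight being silently dropped. A hypothetical minimal sketch (plain-Python MSE, not the actual zi2zi-pytorch code) showing the before/after of the fix:

```python
def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

Lconst_penalty = 15  # zi2zi's default weight for the constancy loss

# hypothetical encoder embeddings of a real glyph and the generated glyph
real_emb = [0.2, 0.4, 0.6]
fake_emb = [0.1, 0.4, 0.9]

# buggy version: the weight was never applied (effective weight 1)
const_loss_buggy = mse(real_emb, fake_emb)
# fixed version: the weight is multiplied in
const_loss = Lconst_penalty * mse(real_emb, fake_emb)
```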
## Work Split Before Mid-March
- Me
    - apply FontGan loss to zi2zi
    - FontRNN
- 邱譯
    - FontRNN
- 中誠
    - apply FontGan loss to zi2zi
- Senior lab member (學長)
    - personal project
## Training Record
### 3/2
MARK: without data augmentation
- zi2zi
    - the 碑銘體 (inscription-style) font collapses
- zi2zi-pytorch
    - worse than zi2zi
- FontGan
    - no good results trained yet

TODO:
- [Denoising / adversarial training](https://medium.com/trustableai/%E9%87%9D%E5%B0%8D%E6%A9%9F%E5%99%A8%E5%AD%B8%E7%BF%92%E7%9A%84%E6%83%A1%E6%84%8F%E8%B3%87%E6%96%99%E6%94%BB%E6%93%8A-%E4%B8%80-e94987742767)
## Current Process
### Image Based
```bash
# Zi2zi
python2 font2img.py --src_font ../FontTransferExperiment/create_font_image/ttf/TW-Sung-98_1.ttf --dst_font ../FontTransferExperiment/create_font_image/ttf/TW-Kai-98_1.ttf --charset CN --sample_count 4000 --sample_dir dir4000 --label 0 --filter 1 --shuffle 1 --x_offset 0 --y_offset 0 --char_size 256 --canvas_size 256
python2 package.py --dir dir4000 --save_dir bin_dir4000 --split_ratio "[0,1]"
```
```bash
ln -s Kai/train trainA
ln -s Sung/train trainB
```
#### Pix2pix
- Finished (黑體 Hei ↔ 宋體 Sung)
- TODO (宋體 Sung ↔ 楷體 Kai)
```bash
python train.py --dataroot ../create_font_image/uniImage/ --name unet256_pix2pix --model pix2pix --direction AtoB --netG unet_256 --dataset_mode unaligned --batch_size 16 --gpu_ids 0 --serial_batches --preprocess none --num_threads 32 --lr 8e-5 --no_flip
```
#### FontGan
- Implementing...
### Bezier Curve Based
#### ChinFont
- ?
#### FontRnn
- ?
> [Time Schedule](https://docs.google.com/spreadsheets/d/1HakHO5LbiisVaNRJm1LworbjGz68ZnYn_UtAhFbE9Wk/edit?fbclid=IwAR3W4KzNh9P8_jYYvd0idWkkkwx5SdRUN_i_Jf_HynSKuWqtjv8A0j-H5EM#gid=0)
## Weekly Record
### 12/16
- [FID Score](https://github.com/mseitzer/pytorch-fid)
- Data Pruning
- Next stage: stroke classification
- FontGan problem
    - 128×128 is better
- Try converting the images into FontRNN-style data; head toward drawing the skeleton first, then replacing it with designed strokes
### 12/2
- SVG Type
- [OTF to SVG](https://convertio.co/otf-svg/)
- https://fontforge.org/en-US/documentation/scripting/
- [Convert PNG to SVG using python](https://pypi.org/project/pypotrace/)
- https://www.google.com/search?q=python+image+to+svg
- [SVG intro](https://www.oxxostudio.tw/articles/201406/svg-04-path-1.html)
- https://github.com/nvictus/svgpath2mpl
- https://pypi.org/project/svg.path/
- Sketch RNN / Dual RNN
- TTF Font
- [新宋体](https://cooltext.com/Download-Font-%e6%96%b0%e5%ae%8b%e4%bd%93+Sim+Sun)
- [國家感謝你](https://data.gov.tw/dataset/5961)
- [黑體](https://cooltext.com/Download-Font-%e9%bb%91%e4%bd%93+Sim+Hei)
- [圓體](https://cooltext.com/Download-Font-%e7%b4%b0%e5%9c%93%e9%ab%94%e7%b9%81+Yen+Light)
- [行書體](https://cooltext.com/Download-Font-%e8%a1%8c%e6%9b%b8%e7%b9%81+Shin+Su+Medium)
- [隸書體](https://cooltext.com/Download-Font-%e9%9a%b8%e6%9b%b8%e7%b9%81+Li+Su+Medium)
- Python + SVG
- [Document](http://www.pygal.org/en/stable/api/pygal.svg.html?highlight=svg#module-pygal.svg)
- [svgwrite](https://pypi.org/project/svgwrite/)
### 11/25
- Masking only a small region does not work well
- Paper surveying
- Need Data
- [Font GAN](https://hackmd.io/qW4iAlvjR1qYSFRLd-tySw)
### 11/18
- cancel
### 11/11
- Specified a small region (random mask)
```python
# data/base_dataset.py
import random
from PIL import Image, ImageDraw

def get_random_mask(length, width, fill):
    # square mask covering one third of the shorter side, at a random position
    crop_size = min(length, width) // 3
    x = random.randint(0, max(0, length - crop_size))
    y = random.randint(0, max(0, width - crop_size))
    mask = Image.new("L", (length, width), 0)
    draw = ImageDraw.Draw(mask)
    draw.rectangle((x, y, x + crop_size, y + crop_size), fill=fill)
    return mask
```
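A standalone usage sketch of the random mask (PIL only, with a solid stand-in image instead of a real glyph): `Image.composite(fg, bg, mask)` keeps the foreground only where the mask is 255, which is how the dataset code below applies it to both `A` and `B`.

```python
import random
from PIL import Image, ImageDraw

def get_random_mask(length, width, fill):
    # square mask covering one third of the shorter side, at a random position
    crop_size = min(length, width) // 3
    x = random.randint(0, max(0, length - crop_size))
    y = random.randint(0, max(0, width - crop_size))
    mask = Image.new("L", (length, width), 0)
    ImageDraw.Draw(mask).rectangle((x, y, x + crop_size, y + crop_size), fill=fill)
    return mask

A = Image.new("RGB", (256, 256), (0, 0, 0))        # stand-in for a glyph image
empty = Image.new("RGB", A.size, (255, 255, 255))  # white canvas
mask = get_random_mask(A.size[0], A.size[1], 255)
A_mask = Image.composite(A, empty, mask)           # A kept only inside the mask square
```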
```python
# data/aligned_dataset.py
from PIL import Image
from data.base_dataset import get_random_mask

def __getitem__(self, index):
    ...
    # <Original>
    # A_transform = get_transform(self.opt, transform_params, grayscale=(self.input_nc == 1))
    # B_transform = get_transform(self.opt, transform_params, grayscale=(self.output_nc == 1))
    # paste A and B onto a white canvas, keeping only the randomly masked square
    empty = Image.new("RGB", A.size, (255, 255, 255))
    mask = get_random_mask(A.size[0], A.size[1], 255)
    A_mask = Image.composite(A, empty, mask)
    B_mask = Image.composite(B, empty, mask)
    A = A_transform(A_mask)
    B = B_transform(B_mask)
    # return {'A': A, 'B': B, 'A_paths': AB_path, 'B_paths': AB_path}
```
### 11/4
- pixels to Bézier curves (the way fonts are stored)
- Paper Survey (from SD)
- https://hackmd.io/LZAOcM7ZRcGuIovqG771QA?view
### pix2pix on two font
#### Conclusion
- Color
#### Method
- Download file to create data ---- [link](https://drive.google.com/file/d/1GtxLiJX7Q3zwZGmEYUbmwlU3cOImUYkN/view?usp=sharing)
- `python3 createFontData.py` and there will be a `font` folder
- Clone this code [pix2pix](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix) and move the `font` folder under the `pytorch-CycleGAN-and-pix2pix/dataset`
#### Definition
Current Font Source:
- [Adobe Font](https://github.com/adobe-fonts)
- [SourceHanSerifTC-Regular](https://github.com/adobe-fonts/source-han-serif/tree/release/OTF/TraditionalChinese)
- [SourceHanMono-Regular](https://github.com/adobe-fonts/source-han-mono/tree/master/Regular/OTC)
- [Todo1: source-serif-pro](https://github.com/adobe-fonts/source-serif-pro/tree/release/OTF)
- [Todo2: source-code-pro](https://github.com/adobe-fonts/source-code-pro/tree/release/OTF)
- [MicroSoft Font](https://docs.microsoft.com/zh-tw/typography/font-list/)
```
Current Setting
- src_font: SourceHanSerifTC-Regular.otf
- dst_font: SourceHanMono-Regular.otf
```
#### Command
```bash
CUDA_VISIBLE_DEVICES=0 python train.py --dataroot ./datasets/font --name font_AtoB_pix2pix --model pix2pix --num_threads 8 --pool_size 50 --batch_size 16 --direction AtoB --niter 100
```
```bash
CUDA_VISIBLE_DEVICES=0 python train.py --dataroot ./datasets/font --name font_BtoA_pix2pix --model pix2pix --num_threads 8 --pool_size 50 --batch_size 16 --direction BtoA --niter 100
```
#### Train on Frigga or Sif (config)
```
Host Asgard
    Hostname 140.112.187.116
    User USER_ID

Host Frigga
    Hostname Frigga
    ProxyJump Asgard
    User USER_ID

Host Sif
    Hostname Sif
    ProxyJump Asgard
    User USER_ID
```
and then you can `ssh Sif` or `ssh Frigga`
#### Little Warning
please specify the torch version on Sif and Frigga:
`pip install torch==1.0.0 torchvision==0.2.1 Pillow visdom opencv-contrib-python` ( [Refer](https://pytorch.org/get-started/previous-versions/) )
### Survey Link
- [GAN slides (high-school science fair)](https://www.slideshare.net/cnanews/gan-137298578)
- [zi2zi](https://kaonashi-tyc.github.io/2017/04/06/zi2zi.html?fbclid=IwAR0TWBAHbMR8EXtMSMKuFHiRUMY17uQUXgZK3MzO8yLnI3Kbl3V_kJ-ZX28)
- [Paper survey Google slide](https://docs.google.com/presentation/d/1ojH5a_FESllDaDp7LB2eVxeZkqqNWWlU1uDetbafvAQ/edit#slide=id.g64ee598f49_0_117)
- [Font-to-Font](https://medium.com/@ankankumarbhunia/recurrent-font-gan-4b5ba27ad138)
### 10/28
- Reproduce the one-to-one result with pix2pix
- From ~1000 characters of a new font, generate the rest
- zi2zi → many-to-one, dozens of fonts
- Generation time (two stages)
- Stage 2: refine bad outputs into good ones
### 10/4
- Read papers
- Evaluation
- Data from 華康 (DynaFont)