# Font Style Transfer Meeting Record
### 5/27
Test results at 100 epochs:
| From Ground Truth | From First Stage Prediction |
| -------- | -------- |
| ![](https://i.imgur.com/VqxRIfm.png) | ![](https://i.imgur.com/LSmz5Sm.png) |
| ![](https://i.imgur.com/noR8Xl4.png) | ![](https://i.imgur.com/m2Qno6s.png) |
| ![](https://i.imgur.com/LW4Vme9.png) | ![](https://i.imgur.com/wQM1V3x.png) |
| ![](https://i.imgur.com/6cS9pFe.png) | ![](https://i.imgur.com/29dJlLH.png) |
| ![](https://i.imgur.com/uuAQW6F.png) | ![](https://i.imgur.com/S1WhB9T.png) |
| ![](https://i.imgur.com/owZiYwg.png) | ![](https://i.imgur.com/xDMyRNL.png) |
---
> 1000
| Ground Truth | Prediction |
| -------- | -------- |
| ![](https://i.imgur.com/NHV8MbJ.png) | ![](https://i.imgur.com/oSv6XX5.png) |
| ![](https://i.imgur.com/cEQP1FV.png) | ![](https://i.imgur.com/1kotCH0.png) |
| ![](https://i.imgur.com/omvnjoQ.png) | ![](https://i.imgur.com/NGzf5hK.png) |
| ![](https://i.imgur.com/UeQmK8i.png) | ![](https://i.imgur.com/4HOr4V5.png) |
| ![](https://i.imgur.com/aqWoOzs.png) | ![](https://i.imgur.com/w5wZWHa.png) |
> 300
| Ground Truth | Prediction |
| -------- | -------- |
| ![](https://i.imgur.com/eLAcwrC.png) | ![](https://i.imgur.com/szunWef.png) |
| ![](https://i.imgur.com/Ik0w38X.png) | ![](https://i.imgur.com/XYrzLhi.png) |
| ![](https://i.imgur.com/PrKCZkt.png) | ![](https://i.imgur.com/dOdwFJ6.png) |
| ![](https://i.imgur.com/nI7hyrt.png) | ![](https://i.imgur.com/BBKitHP.png) |
| ![](https://i.imgur.com/xApmKBr.png) | ![](https://i.imgur.com/fJG1xgd.png) |
> 100
| Ground Truth | Prediction |
| -------- | -------- |
| ![](https://i.imgur.com/IQ5FcDw.png) | ![](https://i.imgur.com/48dFHHN.png) |
| ![](https://i.imgur.com/2C4aSXC.png) | ![](https://i.imgur.com/wWjijzG.png) |
| ![](https://i.imgur.com/F6KZogU.png) | ![](https://i.imgur.com/UXh49XF.png) |
| ![](https://i.imgur.com/oOUAdvi.png) | ![](https://i.imgur.com/51pUNPo.png) |
| ![](https://i.imgur.com/zuiziEy.png) | ![](https://i.imgur.com/LTZ8gmD.png) |
> 10
| Ground Truth | Prediction |
| -------- | -------- |
| ![](https://i.imgur.com/0GjfDPr.png) | ![](https://i.imgur.com/9vK1JGw.png) |
| ![](https://i.imgur.com/xmvigOq.png) | ![](https://i.imgur.com/API6bYJ.png) |
| ![](https://i.imgur.com/X0jiGN6.png) | ![](https://i.imgur.com/AWSQNon.png) |
| ![](https://i.imgur.com/Agxnzhn.png) | ![](https://i.imgur.com/6bn19H5.png) |
| ![](https://i.imgur.com/zuiziEy.png) | ![](https://i.imgur.com/Fb5loDq.png) |
## Note
### Input
- Stroke id
- Bounding Box
| some strokes (1) | Default |
| -------- | -------- |
| <img src=https://i.imgur.com/Ubse25E.png width=512> | <img src=https://i.imgur.com/baE4voV.png width=512> |
> info (2)
```json
{
  "ct": "UCS-CNS",
  "code": "2AE67",
  "item": [
    {
      "strid": 123,
      "rel": [111, 186, 75, 526],
      "val": [80, 117]
    }
  ]
}
```
> stroke ID info (3)
```json
{
  "strid": 123,
  "rel": [96, 216, 68, 520],
  "val": [74, 110],
  "pts": [
    {"x": 134, "xmode": 1, "xvalue": [0, 0], "y": 222, "ymode": 1, "yvalue": [0, 0]},
    {"x": 60, "xmode": 1, "xvalue": [0, 0], "y": 216, "ymode": 1, "yvalue": [0, 0]},
    {"x": 24, "xmode": 1, "xvalue": [1, 1], "y": 502, "ymode": 1, "yvalue": [1, 1]},
    {"x": 126, "xmode": 1, "xvalue": [1, 1], "y": 546, "ymode": 1, "yvalue": [1, 1]}
  ],
  "paths": [
    [
      {"src": 3, "dest": 4, "offpt": [58, 344]},
      {"src": 4, "dest": 5, "offpt": []},
      {"src": 5, "dest": 2, "offpt": [134, 442]},
      {"src": 2, "dest": 3, "offpt": []}
    ]
  ]
}
```
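A minimal stdlib-only sketch of reading the stroke-info JSON above. The field meanings are assumptions from the sample: `rel` looks like a bounding box, `pts` are on-curve points, and `paths` segments with a non-empty `offpt` look like quadratic Bézier control points. The `src`/`dest` indexing scheme is not clear from one sample, so it is left uninterpreted here.

```python
import json

# same structure as the stroke-info sample above (extra x/ymode fields omitted)
stroke_json = """
{
  "strid": 123,
  "rel": [96, 216, 68, 520],
  "val": [74, 110],
  "pts": [
    {"x": 134, "y": 222}, {"x": 60, "y": 216},
    {"x": 24, "y": 502}, {"x": 126, "y": 546}
  ],
  "paths": [[
    {"src": 3, "dest": 4, "offpt": [58, 344]},
    {"src": 4, "dest": 5, "offpt": []},
    {"src": 5, "dest": 2, "offpt": [134, 442]},
    {"src": 2, "dest": 3, "offpt": []}
  ]]
}
"""

stroke = json.loads(stroke_json)
points = [(p["x"], p["y"]) for p in stroke["pts"]]
# segments with a non-empty offpt carry a quadratic Bezier control point
curved = [seg for path in stroke["paths"] for seg in path if seg["offpt"]]
```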
## Second Stage -- Stroke Detection
- [Instance Segmentation V.S. Semantic Segmentation ](https://www.zhihu.com/question/51704852)
- [Mask RCNN](https://paperswithcode.com/paper/mask-r-cnn)
- [DeepMask](https://github.com/facebookresearch/deepmask)
- [Learning to Segment Object Candidates](https://arxiv.org/abs/1506.06204)
- [Learning to Refine Object Segments](https://arxiv.org/abs/1603.08695)
- [R-CNN、Fast R-CNN、Faster R-CNN、YOLO、SSD](https://kknews.cc/zh-tw/code/k2yqmvb.html)
<!-- [toc] -->
## Little Conclusion
- [Baseline - pix2pix](https://phillipi.github.io/pix2pix/)
- [Our Parent - zi2zi](https://github.com/kaonashi-tyc/zi2zi)
- Little Related Works
- https://medium.com/@ankankumarbhunia/recurrent-font-gan-4b5ba27ad138
- https://www.slideshare.net/cnanews/gan-137298578
- [FontRnn](https://github.com/ShusenTang/FontRNN)
## Current Result Analysis
- data from the zi2zi samples
    - ![](https://i.imgur.com/nfdHVsH.png)
- train a discriminator
    - PatchGAN discriminator is not good: ~83% accuracy
    - a pretrained resnet18 reaches 100% accuracy within minutes
        - ![](https://i.imgur.com/orJ3gxa.png)
        - generated (fake) images have a higher "temperature" (brighter heat map)
> Our PatchGAN can therefore be understood as a form of texture/style loss. [pix2pix paper](http://openaccess.thecvf.com/content_cvpr_2017/papers/Isola_Image-To-Image_Translation_With_CVPR_2017_paper.pdf)
- but font data has no texture
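The "higher temperature" observation suggests a global intensity statistic may already separate real from generated glyphs, which would explain why a classifier converges so quickly. A toy stdlib sketch with hypothetical flattened grayscale images (0–255), not our actual data:

```python
def mean_intensity(pixels):
    """Average gray level of a flattened grayscale image (values 0-255)."""
    return sum(pixels) / len(pixels)

# toy stand-ins: a "real" glyph with crisp black strokes on white,
# and a "generated" glyph with lighter strokes and a gray halo
real = [0] * 200 + [255] * 800    # crisp strokes on clean background
fake = [60] * 200 + [250] * 800   # washed-out strokes, dimmer background

print(mean_intensity(real))  # 204.0
print(mean_intensity(fake))  # 212.0 -- globally "hotter" than real
```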
## Comparison of Existing Models
- Reasons not to use pix2pix
    - only supports 1-to-1 translation
- Reasons not to use zi2zi (TensorFlow)
    - built on TensorFlow 1.0, which no longer fits our current development needs
    - the code is messy and hard to maintain
- Reasons not to use FontGan
    - costs too much time and compute
    - to use the CPM (content prior module), a 1-to-1 model must be trained first; two-stage training is awkward (and fine-tuning might require a third stage)
    - it trains stylization and destylization jointly, but the destylization branch only has the following two losses:
        - <img src="https://i.imgur.com/G3G3jR7.png" width="50%">
        - <img src="https://i.imgur.com/w28Ccdy.png" width="50%">
    - so it takes a lot of time, and our preliminary estimate is that the result will not be much better than zi2zi while using noticeably more resources
- Reasons to **use** zi2zi (PyTorch)
    - after fixing many bugs, the results should be on par with the TensorFlow version
    - bugs fixed:
        - `Lconst_penalty` was never multiplied into the loss (effective weight 1 instead of 15)
    - zi2zi needs little training time and produces noticeably better results
- Improvement
    - some data augmentation methods can be applied
        - random mask
        - rotation
        - adversarial training (add noise to the input)
- Conclusion
    - Focus on **stylizing**: FontGan, zi2zi-pytorch; apply a VAE (style embedding) and related losses (KL loss, ...)
> TODO: concrete timing numbers
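The `Lconst_penalty` bug mentioned above amounts to a loss weight being silently dropped. A hypothetical minimal sketch (plain-Python MSE, not the actual zi2zi-pytorch code) showing the before/after of the fix:

```python
def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

Lconst_penalty = 15  # zi2zi's default weight for the constancy loss

# hypothetical encoder embeddings of a real glyph and the generated glyph
real_emb = [0.2, 0.4, 0.6]
fake_emb = [0.1, 0.4, 0.9]

# buggy version: the weight was never applied (effective weight 1)
const_loss_buggy = mse(real_emb, fake_emb)
# fixed version: the weight is multiplied in
const_loss = Lconst_penalty * mse(real_emb, fake_emb)
```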
## Work Split Before Mid-March
- Me
    - apply FontGan loss to zi2zi
    - FontRNN
- 邱譯
    - FontRNN
- 中誠
    - apply FontGan loss to zi2zi
- Senior lab member (學長)
    - personal project
## Training Record
### 3/2
MARK: without data augmentation
- zi2zi
    - the 碑銘體 (inscription-style) font collapses
- zi2zi-pytorch
    - worse than zi2zi
- FontGan
    - no good results trained yet

TODO:
- [Denoising / adversarial training](https://medium.com/trustableai/%E9%87%9D%E5%B0%8D%E6%A9%9F%E5%99%A8%E5%AD%B8%E7%BF%92%E7%9A%84%E6%83%A1%E6%84%8F%E8%B3%87%E6%96%99%E6%94%BB%E6%93%8A-%E4%B8%80-e94987742767)
## Current Process
### Image Based
```bash
# Zi2zi
python2 font2img.py --src_font ../FontTransferExperiment/create_font_image/ttf/TW-Sung-98_1.ttf --dst_font ../FontTransferExperiment/create_font_image/ttf/TW-Kai-98_1.ttf --charset CN --sample_count 4000 --sample_dir dir4000 --label 0 --filter 1 --shuffle 1 --x_offset 0 --y_offset 0 --char_size 256 --canvas_size 256
python2 package.py --dir dir4000 --save_dir bin_dir4000 --split_ratio "[0,1]"
```
```bash
ln -s Kai/train trainA
ln -s Sung/train trainB
```
#### Pix2pix
- Finished (黑體 Hei ↔ 宋體 Sung)
- TODO (宋體 Sung ↔ 楷體 Kai)
```bash
python train.py --dataroot ../create_font_image/uniImage/ --name unet256_pix2pix --model pix2pix --direction AtoB --netG unet_256 --dataset_mode unaligned --batch_size 16 --gpu_ids 0 --serial_batches --preprocess none --num_threads 32 --lr 8e-5 --no_flip
```
#### FontGan
- Implementing...
### Bezier Curve Based
#### ChinFont
- ?
#### FontRnn
- ?
> [Time Schedule](https://docs.google.com/spreadsheets/d/1HakHO5LbiisVaNRJm1LworbjGz68ZnYn_UtAhFbE9Wk/edit?fbclid=IwAR3W4KzNh9P8_jYYvd0idWkkkwx5SdRUN_i_Jf_HynSKuWqtjv8A0j-H5EM#gid=0)
## Weekly Record
### 12/16
- [FID Score](https://github.com/mseitzer/pytorch-fid)
- Data Pruning
- Next stage: stroke classification
- FontGan problem
    - 128×128 is better
- Try converting the images into FontRNN-style data; head toward drawing the skeleton first, then replacing it with designed strokes
### 12/2
- SVG Type
- [OTF to SVG](https://convertio.co/otf-svg/)
- https://fontforge.org/en-US/documentation/scripting/
- [Convert PNG to SVG using python](https://pypi.org/project/pypotrace/)
- https://www.google.com/search?q=python+image+to+svg
- [SVG intro](https://www.oxxostudio.tw/articles/201406/svg-04-path-1.html)
- https://github.com/nvictus/svgpath2mpl
- https://pypi.org/project/svg.path/
- Sketch RNN / Dual RNN
- TTF Font
- [新宋体](https://cooltext.com/Download-Font-%e6%96%b0%e5%ae%8b%e4%bd%93+Sim+Sun)
- [國家感謝你](https://data.gov.tw/dataset/5961)
- [黑體](https://cooltext.com/Download-Font-%e9%bb%91%e4%bd%93+Sim+Hei)
- [圓體](https://cooltext.com/Download-Font-%e7%b4%b0%e5%9c%93%e9%ab%94%e7%b9%81+Yen+Light)
- [行書體](https://cooltext.com/Download-Font-%e8%a1%8c%e6%9b%b8%e7%b9%81+Shin+Su+Medium)
- [隸書體](https://cooltext.com/Download-Font-%e9%9a%b8%e6%9b%b8%e7%b9%81+Li+Su+Medium)
- Python + SVG
- [Document](http://www.pygal.org/en/stable/api/pygal.svg.html?highlight=svg#module-pygal.svg)
- [svgwrite](https://pypi.org/project/svgwrite/)
### 11/25
- Masking only a small region does not work well
- Paper surveying
- Need Data
- [Font GAN](https://hackmd.io/qW4iAlvjR1qYSFRLd-tySw)
### 11/18
- cancel
### 11/11
- Specified a small region (random mask)
```python
# data/base_dataset.py
import random
from PIL import Image, ImageDraw

def get_random_mask(length, width, fill):
    # square mask covering one third of the shorter side, at a random position
    crop_size = min(length, width) // 3
    x = random.randint(0, max(0, length - crop_size))
    y = random.randint(0, max(0, width - crop_size))
    mask = Image.new("L", (length, width), 0)
    draw = ImageDraw.Draw(mask)
    draw.rectangle((x, y, x + crop_size, y + crop_size), fill=fill)
    return mask
```
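A standalone usage sketch of the random mask (PIL only, with a solid stand-in image instead of a real glyph): `Image.composite(fg, bg, mask)` keeps the foreground only where the mask is 255, which is how the dataset code below applies it to both `A` and `B`.

```python
import random
from PIL import Image, ImageDraw

def get_random_mask(length, width, fill):
    # square mask covering one third of the shorter side, at a random position
    crop_size = min(length, width) // 3
    x = random.randint(0, max(0, length - crop_size))
    y = random.randint(0, max(0, width - crop_size))
    mask = Image.new("L", (length, width), 0)
    ImageDraw.Draw(mask).rectangle((x, y, x + crop_size, y + crop_size), fill=fill)
    return mask

A = Image.new("RGB", (256, 256), (0, 0, 0))        # stand-in for a glyph image
empty = Image.new("RGB", A.size, (255, 255, 255))  # white canvas
mask = get_random_mask(A.size[0], A.size[1], 255)
A_mask = Image.composite(A, empty, mask)           # A kept only inside the mask square
```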
```python
# data/aligned_dataset.py
from PIL import Image
from data.base_dataset import get_random_mask

def __getitem__(self, index):
    ...
    # <Original>
    # A_transform = get_transform(self.opt, transform_params, grayscale=(self.input_nc == 1))
    # B_transform = get_transform(self.opt, transform_params, grayscale=(self.output_nc == 1))
    # paste A and B onto a white canvas, keeping only the randomly masked square
    empty = Image.new("RGB", A.size, (255, 255, 255))
    mask = get_random_mask(A.size[0], A.size[1], 255)
    A_mask = Image.composite(A, empty, mask)
    B_mask = Image.composite(B, empty, mask)
    A = A_transform(A_mask)
    B = B_transform(B_mask)
    # return {'A': A, 'B': B, 'A_paths': AB_path, 'B_paths': AB_path}
```
### 11/4
- pixels to Bézier curves (the way fonts are stored)
- Paper Survey (from SD)
- https://hackmd.io/LZAOcM7ZRcGuIovqG771QA?view
### pix2pix on two font
#### Conclusion
- Color
#### Method
- Download file to create data ---- [link](https://drive.google.com/file/d/1GtxLiJX7Q3zwZGmEYUbmwlU3cOImUYkN/view?usp=sharing)
- `python3 createFontData.py` and there will be a `font` folder
- Clone this code [pix2pix](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix) and move the `font` folder under the `pytorch-CycleGAN-and-pix2pix/dataset`
#### Definition
Current Font Source:
- [Adobe Font](https://github.com/adobe-fonts)
- [SourceHanSerifTC-Regular](https://github.com/adobe-fonts/source-han-serif/tree/release/OTF/TraditionalChinese)
- [SourceHanMono-Regular](https://github.com/adobe-fonts/source-han-mono/tree/master/Regular/OTC)
- [Todo1: source-serif-pro](https://github.com/adobe-fonts/source-serif-pro/tree/release/OTF)
- [Todo2: source-code-pro](https://github.com/adobe-fonts/source-code-pro/tree/release/OTF)
- [MicroSoft Font](https://docs.microsoft.com/zh-tw/typography/font-list/)
```
Current Setting
- src_font: SourceHanSerifTC-Regular.otf
- dst_font: SourceHanMono-Regular.otf
```
#### Command
```bash
CUDA_VISIBLE_DEVICES=0 python train.py --dataroot ./datasets/font --name font_AtoB_pix2pix --model pix2pix --num_threads 8 --pool_size 50 --batch_size 16 --direction AtoB --niter 100
```
```bash
CUDA_VISIBLE_DEVICES=0 python train.py --dataroot ./datasets/font --name font_BtoA_pix2pix --model pix2pix --num_threads 8 --pool_size 50 --batch_size 16 --direction BtoA --niter 100
```
#### Train on Frigga or Sif (config)
```
Host Asgard
    Hostname 140.112.187.116
    User USER_ID

Host Frigga
    Hostname Frigga
    ProxyJump Asgard
    User USER_ID

Host Sif
    Hostname Sif
    ProxyJump Asgard
    User USER_ID
```
and then you can `ssh Sif` or `ssh Frigga`
#### Little Warning
please specify the torch version on Sif and Frigga:
`pip install torch==1.0.0 torchvision==0.2.1 Pillow visdom opencv-contrib-python` ( [Refer](https://pytorch.org/get-started/previous-versions/) )
### Survey Link
- [GAN slides (high-school science fair)](https://www.slideshare.net/cnanews/gan-137298578)
- [zi2zi](https://kaonashi-tyc.github.io/2017/04/06/zi2zi.html?fbclid=IwAR0TWBAHbMR8EXtMSMKuFHiRUMY17uQUXgZK3MzO8yLnI3Kbl3V_kJ-ZX28)
- [Paper survey Google slide](https://docs.google.com/presentation/d/1ojH5a_FESllDaDp7LB2eVxeZkqqNWWlU1uDetbafvAQ/edit#slide=id.g64ee598f49_0_117)
- [Font-to-Font](https://medium.com/@ankankumarbhunia/recurrent-font-gan-4b5ba27ad138)
### 10/28
- Reproduce the one-to-one result with pix2pix
- From ~1000 characters of a new font, generate the rest
- zi2zi → many-to-one, dozens of fonts
- Generation time (two stages)
- Stage 2: refine bad outputs into good ones
### 10/4
- Read papers
- Evaluation
- Data from 華康 (DynaFont)