---
tags: iAgents Lab
---
# Research Journal
> [name=Yang-Sheng Lin] [time=Tue, Sep 30, 2021][color=#97CBFF]
## Table of Contents
[TOC]
## Github
https://github.com/jason2714/de-i2i-gan
## Ideas
### TODO
1. adaptively adjust loss weights
2. use involution instead of SPADE to aggregate the spatial map - based on CC-FPSE: use an FPN to get global information plus spatially agnostic depth-wise convolution
3. add different types of noise - [Understanding Noise Injection in GANs](http://proceedings.mlr.press/v139/feng21g/feng21g.pdf)
4. use a VLM to guide the spatial distribution of generation - [GAL](https://openaccess.thecvf.com/content/CVPR2022/papers/Petryk_On_Guiding_Visual_Attention_With_Language_Specification_CVPR_2022_paper.pdf)
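The noise-injection idea above can be sketched as a StyleGAN-style noise layer added to generator feature maps. This is a generic illustration, not this repo's implementation; the learned per-channel scale (initialized to zero) is an assumption about how the noise would be injected.

```python
import torch
import torch.nn as nn

class NoiseInjection(nn.Module):
    """StyleGAN-style additive noise with a learned per-channel scale."""
    def __init__(self, channels):
        super().__init__()
        # scale starts at zero, so the layer is an identity until trained
        self.weight = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, x, noise=None):
        if noise is None:
            # one spatial noise map shared across channels, scaled per channel
            noise = torch.randn(x.size(0), 1, x.size(2), x.size(3), device=x.device)
        return x + self.weight * noise

feat = torch.randn(2, 64, 16, 16)
out = NoiseInjection(64)(feat)
```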
### DOING
1. use TTUR instead of 5 discriminator steps per generator step
2. train on another dataset - [PlantVillage](https://data.mendeley.com/datasets/tywbtsjrjv/1)
3. pretrain on normal images - [TransferI2I](https://arxiv.org/pdf/2105.06219.pdf), [Masked AutoEncoder](https://arxiv.org/pdf/2111.06377.pdf); or use an AE with SimCLR on the middle layer; or MAE on G -> SimCLR on D, with the original D discriminating real vs. unmasked images; or change the mask to noise
4. random-position spatial label
5. add SPADE to enc_res_blk
6. [Online Knowledge Distillation via Collaborative Learning](https://openaccess.thecvf.com/content_CVPR_2020/papers/Guo_Online_Knowledge_Distillation_via_Collaborative_Learning_CVPR_2020_paper.pdf)
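The TTUR item above replaces the 5-D-steps-per-G-step schedule with two different learning rates updated 1:1. A minimal sketch; the 1e-4 / 4e-4 values follow common TTUR practice and are assumptions, not this repo's settings:

```python
import torch
import torch.nn as nn

# toy G and D; the real models live in the repo
G = nn.Linear(16, 16)
D = nn.Linear(16, 1)

# TTUR: slower generator, faster discriminator, both updated once per step,
# replacing the 5-discriminator-steps-per-generator-step schedule
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4, betas=(0.0, 0.9))
opt_d = torch.optim.Adam(D.parameters(), lr=4e-4, betas=(0.0, 0.9))
```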
### DONE
1. add skip connection and compare the difference
2. Test the result of multiple labels
3. Test the result of spatial distribution map (partial label)
## Journals
- **Week X ( xx/xx ~ xx/xx )**
+ **What have I done/read?**
+ **Problems encountered and solutions tried**
+ **What is my plan for the coming week?**
- **Week 15 ( 04/17 ~ 04/23 )**
+ **What have I done/read?**
1. SEAN with random embedding fails without MAE pretraining
2. FID comparison of different SEAN block designs on the shrunk dataset
| Name | fid | lpips | IS | Iters | Note |
|:------------:|:------:|:-----:|:-----:|:-----:|:-----------------------------------------------------------------------:|
| org_wo_sean | 103.71 | 0.645 | 2.923 | 25000 | best FID is 90.68 at epoch 140 |
| latent | 81.02 | 0.671 | 3.371 | 25000 | bad defect patterns |
| latent_D | 87.38 | 0.661 | 3.785 | 25000 | bad defect patterns (best of three) |
| latent_noise | 94.25 | 0.664 | 3.57 | 25000 | bad defect patterns |
| embed | 105.75 | 0.662 | 3.68 | 25000 | best FID is 83.25 at epoch 140 |
| embed_l      | 108.53 | 0.644 | 3.353 | 40000 | FID rises to 200 in the first 100 epochs |
| embed_nf     | 123.99 | 0.663 | 3.243 | 25000 | FID rises to 150 in the first 100 epochs; best FID is 101 at epoch 140 |
3. FID comparison of different SEAN block designs on the shrunk dataset with pretrained MAE
| Name | fid | lpips | IS | Iters | Note |
|:---------------:|:-----:|:-----:|:-----:|:-----:|:--------------------------------:|
| wo_sean | 62.22 | 0.639 | 3.097 | 20000 | |
| wo_sean_zero | 72.09 | 0.636 | 3.100 | 20000 | |
| wo_sean_zero_l | 72.56 | 0.643 | 3.077 | 30000 | best LPIPS is 0.632 at epoch 150 |
| wo_sean_low_rec | 66.38 | 0.635 | 3.219 | 30000 | rec_loss weight 5 |
| latent | 67.24 | 0.647 | 2.928 | 30000 | best FID is 65.80 at epoch 240 |
| embed | 69.84 | 0.638 | 3.303 | 30000 | |
| embed_rand | 70.27 | 0.632 | 3.324 | 30000 | |
| embed_nf | 59.64 | 0.654 | 3.251 | 30000 | best at epoch 160, then dropping |
| embed_rand_nf | 61.81 | 0.642 | 3.253 | 30000 | best at epoch 240 with 1 embed |
4. mae embed to latent with 40000 iters
| epoch | alpha | fid | lpips | IS | Note |
| ------ |:-----:|:-----:|:-----:|:-----:|:---------------------------:|
| latest | 1 | 83.42 | 0.632 | 4.278 | |
| latest | 0.9 | 81.29 | 0.632 | 4.239 | |
| latest | 0.5 | 73.42 | 0.640 | 3.775 | |
| latest | 0.1 | 66.77 | 0.665 | 2.827 | |
| latest | 0 | 80.46 | 0.673 | 2.371 | bad pattern (mode collapse) |
| 270 | 1 | 76.02 | 0.630 | 3.810 | |
| 270 | 0.9 | 74.10 | 0.630 | 3.770 | |
| 270 | 0.5 | 66.56 | 0.637 | 3.348 | |
| 270 | 0.1 | 66.71 | 0.655 | 2.715 | few patterns (mode collapse) |
| 270 | 0 | 87.65 | 0.658 | 2.595 | bad pattern (mode collapse) |
5. model distillation for latent only
| epoch | alpha | fid | lpips | IS | Note |
| ------ |:-----:|:-----:|:-----:|:-----:|:---------------------------:|
| latest | 1 | 57.52 | 0.646 | 2.945 | num_embed = 1 |
| latest | 0.8 | 58.84 | 0.648 | 2.860 | num_embed = 1 |
| latest | 0.5 | 62.19 | 0.651 | 2.740 | num_embed = 1 |
| latest | 0 | 73.67 | 0.659 | 2.466 | bad pattern (mode collapse) |
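The distillation runs above presumably match a student model to a frozen teacher; the exact distillation target isn't recorded here, so this is only a generic feature-matching sketch with hypothetical tensors:

```python
import torch
import torch.nn as nn

def distill_loss(student_feat, teacher_feat, weight=1.0):
    """Feature-matching distillation: the student mimics the (frozen) teacher,
    so the teacher's features are detached from the graph."""
    return weight * nn.functional.l1_loss(student_feat, teacher_feat.detach())

teacher_out = torch.randn(4, 256)
student_out = torch.randn(4, 256, requires_grad=True)
loss = distill_loss(student_out, teacher_out)
loss.backward()  # gradients flow only into the student
```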
6. l2e result
| Name | fid | lpips | IS | Iters | Note |
|:----:|:-----:|:-----:|:-----:|:-----:|:----:|
| l2e | 66.55 | 0.644 | 3.428 | 40000 | |
+ **Problems encountered and solutions tried**
+ **What is my plan for the coming week?**
- **Week 14 ( 04/10 ~ 04/16 )**
+ **What have I done/read?**
1. can't distinguish latent space
1. remove resblock
2. latent to style encoder
3. mean and variance of style encoder
2. we use the latent only...
+ **Problems encountered and solutions tried**
+ **What is my plan for the coming week?**
- **Week 13 ( 04/03 ~ 04/09 )**
+ **What have I done/read?**
1. these results are all for latent only
| Name | fid | Note |
|:----------------:|:-----:|:-------------------------------------:|
| org_nm_label | 41.69 | FID hardly increases after sd_con 0.3 |
| org_sean | 47.95 | FID hardly increases after sd_con 0.3 |
| org_sean_embed_5 | 43.48 | FID increases slowly to 46 |
| org_sean_embed_3 | 44.26 | FID increases slowly to ? |
| org_sean_embed_1 | 42.87 | lw, FID increases slowly to ? |
| org_sean_latent | 44.37 | |
2. the FID values with embed in the table above are wrong
+ **Problems encountered and solutions tried**
1. Generated a grayscale image and found that corruption starts once sd_con exceeds 0.3; G's gan_loss then begins to decrease and the FID begins to increase
+ **What is my plan for the coming week?**
- **Week 12 ( 03/27 ~ 04/02 )**
+ **What have I done/read?**
1. test accuracy
| Name | Accuracy | Loss |
|:--------------:|:--------:|:-----:|
| org_deep | 0.633 | 0.563 |
| org | 0.636 | 0.562 |
| GAN-D | 0.547 | 0.375 |
| ViT | 0.639 | 0.223 |
| ViT-f | 0.763 | 0.346 |
| vit-shrink-f | 0.649 | 0.492 |
| vit-l-shrink-f | 0.503 | 0.582 |
| vit-l-shrink | 0.589 | 0.263 |
+ **Problems encountered and solutions tried**
+ **What is my plan for the coming week?**
- **Week 11 ( 03/20 ~ 03/26 )**
+ **What have I done/read?**
1. [DeepI2I](https://arxiv.org/pdf/2011.05867.pdf)
2. TODO: [Corrupted Image Modeling](https://arxiv.org/pdf/2202.03382.pdf)
3. TODO: [AAMIM](https://arxiv.org/pdf/2205.13943.pdf)
4. combine with an augmentation method
5. FID between df and bg
    1. train - 100
    2. val - 150
6. TODO: check the CAM of the discriminator
7. TODO: try the horse2zebra or cat2dog dataset
8. TODO: dynamically growing mask ratio
9. TODO: use [Classifier-Free Diffusion Guidance](https://arxiv.org/pdf/2207.12598.pdf)'s method as the label input
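The classifier-free guidance idea in the last TODO boils down to label dropout during training plus a guided combination at inference. A sketch; the drop probability, the null-label convention, and the guidance scale are assumed values, not settings from the paper or repo:

```python
import torch

def drop_labels(labels, null_label, p=0.1):
    """Randomly replace class labels with a 'null' label so a single model
    learns both conditional and unconditional generation."""
    mask = torch.rand(labels.shape[0]) < p
    return torch.where(mask, torch.full_like(labels, null_label), labels)

def guided_output(cond_out, uncond_out, scale=1.5):
    # classifier-free guidance: push the conditional prediction
    # away from the unconditional one
    return uncond_out + scale * (cond_out - uncond_out)
```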
+ **Problems encountered and solutions tried**
+ **What is my plan for the coming week?**
- **Week 10 ( 03/13 ~ 03/19 )**
+ **What have I done/read?**
1. test on shrink dataset (FID)
    1. org - 81.2
    2. mae - 78.0
    3. cycle_gan_skip
    4. split
    5. split wo label_insert
    6. d_only
    7. freeze-D
    8. 16 patch size - 73.3
    9. shifted mask - 68.2
    10. skip connection - 67.5
    11. wo gan loss - 81.3
    12. wo label insertion - 74
    13. mean mask - 67.3
    14. mask token scalar - 63.8
    15. mask token full - 61.2
+ **Problems encountered and solutions tried**
+ **What is my plan for the coming week?**
1. use GCAM to get discriminator's attention when model overfit
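The GCAM (Grad-CAM) plan above can be sketched with forward/backward hooks on a conv layer of the discriminator. The toy discriminator and the choice of target layer here are placeholders for the real model:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# toy stand-in for the discriminator; hook its first conv layer
disc = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                     nn.Conv2d(8, 1, 3, padding=1))
target_layer = disc[0]

acts, grads = {}, {}
target_layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))

x = torch.randn(1, 3, 32, 32)
score = disc(x).mean()   # scalar "realness" score
score.backward()

# Grad-CAM: weight activation maps by their pooled gradients, then ReLU,
# and upsample back to the input resolution
w = grads["g"].mean(dim=(2, 3), keepdim=True)
cam = F.relu((w * acts["a"]).sum(dim=1, keepdim=True))
cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear", align_corners=False)
```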
- **Week 9 ( 03/06 ~ 03/12 )**
+ **What have I done/read?**
1. [TransferI2I](https://arxiv.org/pdf/2105.06219.pdf)
2. Trained MAE with different args (inherit)
1. mae
2. mae with full loss (contain unmasked grid)
3. mae gan
4. mae gan lr
5. mae gan with spade label input and fusion dataset
6. mae gan with classifier loss
7. mae gan with l1 loss
+ **Problems encountered and solutions tried**
1. a model pretrained with MAE doesn't reach better performance, but it does reduce the training epochs needed --> maybe use FreezeD, TransferI2I's method, or train on 10% of the dataset
+ **What is my plan for the coming week?**
1. build PlantVillage Dataset
2. train on a reduced CODEBRIM dataset
3. pretrain splitting
- **Week 8 ( 02/27 ~ 03/05 )**
+ **What have I done/read?**
1. [Masked AutoEncoder](https://arxiv.org/pdf/2111.06377.pdf)
2. [A Survey on Masked Autoencoder](https://arxiv.org/pdf/2208.00173.pdf)
3. Used AMP to speed up training
4. Build mae_trainer
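The AMP item above, as a minimal `torch.cuda.amp` training step. It is enabled only when CUDA is available, so the sketch also runs (as plain fp32) on CPU:

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(32, 1).to(device)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

x = torch.randn(8, 32, device=device)
y = torch.randn(8, 1, device=device)

opt.zero_grad()
# autocast runs the forward pass in float16 where it is safe
with torch.cuda.amp.autocast(enabled=(device == "cuda")):
    loss = nn.functional.mse_loss(model(x), y)
scaler.scale(loss).backward()   # scale the loss to avoid fp16 underflow
scaler.step(opt)                # unscales gradients before the optimizer step
scaler.update()
```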
+ **Problems encountered and solutions tried**
+ **What is my plan for the coming week?**
1. Run Experiments
- **Week 7 ( 02/20 ~ 02/26 )**
+ **What have I done/read?**
1. [02/24 Biweekly Research Journal](https://docs.google.com/presentation/d/1Y9BSTOFQ28dwsZlh5U2BN45DWct9nsV-XZ6DRGUQYBg/edit?usp=sharing)
2. tested the result of limited sd
+ **Problems encountered and solutions tried**
+ **What is my plan for the coming week?**
1. Train a model using PlantVillage Dataset
2. Pretrain model on normal images
3. Random position spatial label
- **Week 6 ( 02/13 ~ 02/19 )**
+ **What have I done/read?**
1. Test GAL
1. background
2. defects
3. plantVillage
2. add sd and foreground to tensorboard's log
3. [FreezeD](https://arxiv.org/pdf/2002.10964.pdf)
+ **Problems encountered and solutions tried**
1. GAL doesn't work --> CLIP learns a joint latent space for language and images, so it can only find where defects already are; it can't find where defects should be synthesized in a normal image
+ **What is my plan for the coming week?**
1. find method better than GAL
- **Week 5 ( 02/06 ~ 02/12 )**
+ **What have I done/read?**
1. Finished the U-Net architecture and skip connections
    1. org_lw_unet keeps a similar FID
    2. org_lw_unet_skip loses 4 FID
2. cycle_gan can be trained successfully with skip connections and reaches an FID of 50
+ **Problems encountered and solutions tried**
+ **What is my plan for the coming week?**
### 2023 Winter Vacation
- **Week 5 ( 01/30 ~ 02/05 )**
+ **What have I done/read?**
+ **Problems encountered and solutions tried**
+ **What is my plan for the coming week?**
- **Week 4 ( 01/23 ~ 01/29 )**
+ **What have I done/read?**
1. [TTUR](https://arxiv.org/pdf/1706.08500.pdf)
2. ran and compared different options for defectGAN
    1. spectral norm is better and more stable
    2. color jitter improves performance
    3. the default loss weights train successfully with spectral norm, but a loss weight like [1 1 1 1 1] produces gray images for some minority classes
    4. cycle-GAN can't be trained successfully even with spectral norm and lw (FID 65); org-wo-spec --> 200 (gray images); org-wo-spec-lw --> 75 (gray if training continues)
+ **Problems encountered and solutions tried**
1. found that the gray-image problem can be solved by spectral normalization
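The spectral-normalization fix above amounts to wrapping the discriminator's conv layers with `torch.nn.utils.spectral_norm`; a sketch with a toy discriminator standing in for the real one:

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

# wrap each discriminator conv so the largest singular value of its weight
# is normalized to ~1, which stabilizes GAN training
disc = nn.Sequential(
    spectral_norm(nn.Conv2d(3, 64, 4, stride=2, padding=1)), nn.LeakyReLU(0.2),
    spectral_norm(nn.Conv2d(64, 1, 4, stride=2, padding=1)),
)
out = disc(torch.randn(1, 3, 32, 32))
```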
+ **What is my plan for the coming week?**
1. pretrained on normal images
- **Week 3 ( 01/16 ~ 01/22 )**
+ **What have I done/read?**
1. [01/16 Research Directions](https://docs.google.com/document/d/1HZpzPOe8eIKVR7CBbXhz55ROzP2uGymjAL82yKYevsU/edit?usp=share_link)
1. train cycleGAN w/o LWC
+ **Problems encountered and solutions tried**
1. the Rec-Loss of CycleGAN won't decrease; it generates gray images with a defect foreground
+ **What is my plan for the coming week?**
1. pretrained on normal images
- **Week 2 ( 01/09 ~ 01/15 )**
+ **What have I done/read?**
1. merged cal_fid into the training stage
+ **Problems encountered and solutions tried**
1. None
+ **What is my plan for the coming week?**
1. train cycleGAN w/o LWC
- **Week 1 ( 01/02 ~ 01/08 )**
+ **What have I done/read?**
1. FID calculator
2. test_defectgan.py
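The core of the FID calculator is the Frechet distance between two Gaussians fitted to Inception features; a numpy sketch with the feature extraction itself omitted:

```python
import numpy as np
from scipy import linalg

def frechet_distance(feats_a, feats_b):
    """FID core: Frechet distance between Gaussians fitted to two sets of
    (Inception) feature vectors, each of shape (N, D)."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean = linalg.sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):   # numerical noise can leave tiny imaginary parts
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2 * covmean))
```

Note this is the distance between feature distributions, not a full FID pipeline: the real calculator also runs both image sets through Inception-v3 first.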
+ **Problems encountered and solutions tried**
1. Using LWC (layer-wise composition) without constraining the spatial distribution causes sd=1 and fg = foreground on a gray base, so the composite is just the gray fg; but because it has texture and a plain background, it is still judged to be a **real and correctly classified** image.
2. The loss weight for the spatial distribution is hard to control; if it is tuned poorly, case 1 happens. Maybe add a perceptual loss between the generated image and the original?
3. ~~The FID between train and test alone is already 40; how can FID be lowered - create more data with augmentation before computing it?~~ Fixed with data augmentation
+ **What is my plan for the coming week?**
1. train cycleGAN w/o LWC
2. merge cal_fid into training stage
3. pretrain on normal images
### 2022 Fall
- **2022/10/1 ~ 2022/12/31**
+ **What have I done/read?**
1. Finished implementing the DefectGAN code
+ **Problems encountered and solutions tried**
1. None
+ **What is my plan for the coming Season?**
1. Improve the performance or decrease required data for defect-GAN
### 2021 Fall
- **Week 5 ( 2021/10/25 ~ 2021/10/31 )**
+ **What have I done/read?**
+ **What is my plan for the coming week?**
- **Week 4 ( 2021/10/18 ~ 2021/10/24 )**
+ **What have I done/read?**
* **Involution: Inverting the Inherence of Convolution for Visual Recognition**
> *link : https://arxiv.org/pdf/2103.06255v1.pdf*[color=#D3A4FF]
* **FPN**
* **dynamic convolution**
* **transformer**
* **attention**
* **DETR**
* **ViT**
+ **What is my plan for the coming week?**
* **The R-CNN paper series**
- **Week 3 ( 2021/10/11 ~ 2021/10/17 )**
+ **What have I done/read?**
* **Artificial Intelligence for Social Good**
> *link : https://arxiv.org/ftp/arxiv/papers/1901/1901.05406.pdf*[color=#D3A4FF]
* **Consciousness Prior**
> *link : https://arxiv.org/pdf/1709.08568v1.pdf*[color=#D3A4FF]
+ **Problems encountered and solutions tried**
* PyTorch's DataLoader breaks when num_workers is set to anything other than 0
* Pretrained models from torchvision.models must be moved with model.cuda(), otherwise a device/type mismatch raises an error
+ **What is my plan for the coming week?**
* **Involution**
- **Week 2 ( 2021/10/04 ~ 2021/10/10 )**
+ **What have I done/read?**
* **[1406.2661] Generative Adversarial Networks - arXiv**
> *link : https://arxiv.org/pdf/1406.2661.pdf*[color=#D3A4FF]
* **DLCV_hw0 (PCA)**
+ **What is my plan for the coming week?**
* Artificial Intelligence for Social Good
- **Week 1 ( 2021/09/27 ~ 2021/10/03 )**
+ **What have I done/read?**
* research journal
* summer paper list
+ **What is my plan for the coming week?**
* Generative Adversarial Networks
### 2021 Summer
*[Summer paper list with LaTeX](https://www.overleaf.com/read/vpsntxkfngdg)*
## Experiments
<style>
:root {
--img-width: 500px;
--img-height: 300px;
}
table {
width: 100%;
border-collapse: collapse;
table-layout: fixed;
}
th, td {
padding: 0px;
border: 1px solid black;
}
th {
background-color: #ccc;
text-align: center;
}
img {
max-width: var(--img-width);
max-height: var(--img-height);
width: 100%;
height: auto;
}
</style>
<table>
<thead>
<tr>
<th>name</th>
<th>org</th>
<th>wo_lw</th>
<th>wo_spec_lw</th>
<th>org_ttur</th>
<th>unet_skip</th>
</tr>
</thead>
<tbody>
<tr>
<th>Single</th>
<td><img src="https://i.imgur.com/tuA6GCU.jpg"></td>
<td><img src="https://i.imgur.com/if1MKoO.png"></td>
<td><img src="https://i.imgur.com/6C5BrYZ.png"></td>
<td><img src="https://i.imgur.com/OLpzPru.png"></td>
<td><img src="https://i.imgur.com/uvKndmu.png"></td>
</tr>
<tr>
<th>Multiple</th>
<td><img src="https://i.imgur.com/PkoIlVf.jpg"></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<th>FID Score</th>
<td><img src="https://i.imgur.com/geVg0aS.png"></td>
<td><img src="https://i.imgur.com/F89F8dO.png"></td>
<td><img src="https://i.imgur.com/hBZHMXp.png"></td>
<td><img src="https://i.imgur.com/W4lk6xx.png"></td>
<td><img src="https://i.imgur.com/WRcWTN4.png"></td>
</tr>
<tr>
<th>GAN Loss</th>
<td><img src="https://i.imgur.com/6NRbjHR.png"></td>
<td><img src="https://i.imgur.com/5olIxyE.png"></td>
<td><img src="https://i.imgur.com/0KBinrI.png"></td>
<td><img src="https://i.imgur.com/zezaNdd.png"></td>
<td><img src="https://i.imgur.com/Yx2zE8N.png"></td>
</tr>
<tr>
<th>CLF Loss</th>
<td><img src="https://i.imgur.com/64n0FB8.png"></td>
<td><img src="https://i.imgur.com/vtY87AF.png"></td>
<td><img src="https://i.imgur.com/k9QedUS.png"></td>
<td><img src="https://i.imgur.com/iWu6ljr.png"></td>
<td><img src="https://i.imgur.com/kMVxbhq.png"></td>
</tr>
<tr>
<th>AUX Loss</th>
<td><img src="https://i.imgur.com/CUKN5WL.png"></td>
<td><img src="https://i.imgur.com/Q48VByG.png"></td>
<td><img src="https://i.imgur.com/VKnc2Cj.png"></td>
<td><img src="https://i.imgur.com/swcjWpt.png"></td>
<td><img src="https://i.imgur.com/IeXD0il.png"></td>
</tr>
<tr>
<th>Note</th>
<td>original result with spectral norm, noise adding and lw changing</td>
<td>drop 5 FID score without lw changing</td>
<td>generated grayscale images without spectral norm and lw change</td>
<td>drop 2 FID score with ttur lr</td>
<td>drop 2 FID score with skip connection</td>
</tr>
</tbody>
</table>
# Weekly Progress Report Template
1. What have I done/read?
2. Problems encountered and solutions tried
3. What is my plan for the coming week?