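# 2022.11.28

# A. Image Inpainting

## 1. Reference-guided

Among image inpainting methods, the ones that use multiple views are the reference-guided ones: they take one or more additional reference images and use their content to fill in the missing region, so a generative method is not required.

### OPN (ICCV 2019)
Onion-Peel Networks for Deep Video Completion
* Can be viewed as multi-reference image inpainting
* An earlier SoTA for reference-guided image inpainting; can inpaint large regions
* Uses attention to compute the similarity between the boundary of the missing region and regions in the references
* Fills the pixels near the boundary one layer at a time, iteratively

### TransFill (CVPR2021)
Reference-guided Image Inpainting by Merging Multiple Color and Spatial Transformations
* Single reference image (2 views)
* Uses multi-homography proposals to align the source image, finds the corresponding color proposals, then blends them

### SLTFILL (ICIP2022)
Spatial and Light Transformer for Multi-Reference Image Inpainting
* Multiple-reference alignment
* Fuses the aligned references with a spatial attention transformer
* Outperforms OPN and DeepFillv2

### SiamTrans (AAAI2022)
Zero-Shot Multi-Frame Image Restoration with Pre-trained Siamese Transformers
* Multi-frame deraining/desnowing etc. (no inpainting experiments, but the approach is similar)
* Proposes pretraining Siamese transformers, which enables zero-shot restoration
* The inputs are two frames with different degradation but the same pose(?), yet the dataset used (NTURain) contains motion
* If data with motion can be used, this may be a useful reference for multi-view/reference-guided inpainting

### GeoFill (WACV2023)
Reference-Based Image Inpainting with Better Geometric Understanding
* Single reference (2 views)
* Makes no planarity assumption (multi-homography proposals mainly rest on that assumption)
* Uses monocular depth estimation and predicted pose
* Optimizes depth/pose to obtain an estimated 3D scene, then performs mesh rendering

OPN is an earlier reference-guided method that performs well and is often used as a comparison baseline. TransFill and GeoFill each use their own geometric alignment method. Could NeRF be brought into these geometry-based methods? Or are there techniques from them that could be applied to NeRF (for example, how they reason out pose, although quite a few studies have probably done this already)? One of the biggest differences between these methods and NeRF is that NeRF needs a certain number of views, whereas these methods do not require many. There are quite a few few-view/single-view NeRF variants, but getting NeRF's high-quality rendering still requires a certain number of views.

## 2. GAN-based

Most single image inpainting methods are generative. Listed here are several that were representative at the time or are frequently compared against; they help in understanding how most single image inpainting is done.

### Context Encoders (CVPR2016)
Feature Learning by Inpainting
* Unsupervised
* Mask: square or object (arbitrary)
* CNN GAN: CNN encoder-decoder with adversarial loss
* Early learning-based inpainting and the first GAN inpainting method; the inpainting field developed rapidly after this paper

### PConv (ECCV2018)
Image Inpainting for Irregular Holes Using Partial Convolutions
* Improves on the conventional CNN to avoid artifacts

### DeepFillv2 (ICCV2019)
Free-Form Image Inpainting with Gated Convolution
* A generative method for free-form masks that addresses the shortcomings of standard convolution and partial convolution
* Gated convolution + SN-PatchGAN

### EdgeConnect (ICCVW2019)
Structure Guided Image Inpainting using Edge Prediction
* Uses two generators: the first generates edges, the second completes the image by conditional generation on those edges
* Can reduce artifacts

### Pro-Fill (ECCV2020)
High-Resolution Image Inpainting with Iterative Confidence Feedback and Guided Upsampling
* Based on a confidence map, fills only the high-confidence regions, iteratively
* Weakness of GAN-based methods: low output resolution; addressed here with guided upsampling
* A contextual attention module borrows high-resolution (HR) feature patches

### CR-Fill (ICCV2021)
Generative Image Inpainting With Auxiliary Contextual Reconstruction
* Generative methods with attention borrow feature patches from the known region to fill the missing region, but this causes artifacts when the search for a corresponding region fails; CR-Fill avoids this case

### EII (CVPR2021)
Image Inpainting With External-Internal Learning and Monochromic Bottleneck
* Goal is to remove artifacts
* Learns structure from external data (large datasets) and color completion from internal data (the image's own statistics)
* Backbones are common ones such as HiFill and EdgeConnect

### CoModGAN (ICLR2021)
Large Scale Image Completion via Co-Modulated Generative Adversarial Networks
* Unconditional GANs suit generating whole images, and conditional GANs can fill small regions given the input image, but neither is suited to directly filling a large region from the few remaining image patches
* CoModGAN combines unconditional and conditional GANs so that a GAN can do large-area inpainting

### LaMa (WACV2022)
Resolution-robust Large Mask Inpainting with Fourier Convolutions
* Uses fast Fourier convolution to increase the receptive field
* A large receptive field helps in learning the overall structure

### MAT (CVPR2022)
Mask-Aware Transformer for Large Hole Image Inpainting
* Transformer-based methods achieve long-range interactions but are limited by computation cost, so they often work only at low resolution
* MAT can apply a transformer directly at high resolution for inpainting with large masked regions
* Listed here as the only inpainting-related paper from CVPR 2022; reaches SoTA on several datasets

### MAE-FAR (ECCV2022)
Learning Prior Feature and Attention Enhanced Image Inpainting
* Adds a ViT to the model, pretrained with MAE, as prior features
* Uses attention-based CNN restoration (ACR)

The most representative of these is Context Encoders. DeepFillv2 further improves the convolution itself, building on PConv (partial convolution). Besides the proposed gated convolution, the method as a whole is often used as a comparison baseline. Most current single image inpainting SoTA methods are of this type.

## 3. Other

Single image inpainting methods that do not use adversarial training, or that improve only the architecture or training procedure. This includes several SoTA methods that are not GAN-based.

### Deep Image Prior (CVPR2018)
* Targets a range of basic image inverse problems (also applied to inpainting)
* Trained on a single degraded image, with no other data or pretraining
* Shows that the structure of a generator network alone captures much of the low-level image statistics before any learning
* For inpainting, suits only small regions and needs hyperparameter tuning per image

Deep Image Prior shows that an image prior can still be learned from a single degraded input, with no other dataset for pretraining. Training a generator from random initialization is enough to handle these inverse tasks, presumably because the network learns the image content faster than it learns the degradation, so training on the degraded image and stopping at the right time solves the inverse problem. However, hyperparameters such as the number of iterations vary from image to image, which makes practical use difficult. Even so, the work inspired a good deal of follow-up research.

PConv and the gated convolution in the previous section are likewise works that built on earlier results and led to later ones. PConv applies the proposed partial convolution in a UNet and already gives good non-generative single image inpainting results. Methods of this kind, which do not use generative adversarial training, are closer to our single image inpainting goal. Some of them can train on a single image with no other data; others pretrain on a dataset and then inpaint a single image. If our conditions are the same as theirs, we must compare inpainting quality against them. Alternatively, we can look at what can be borrowed from them, or improve on their methods.

# B. Video Inpainting

In contrast to NeRV-style work, which focuses on implicit representation, these methods focus on the inpainting task itself and use INR-like approaches to achieve it.

### DFG (CVPR2019)
Deep Flow-Guided Video Inpainting
* Proposes a deep flow completion network to obtain the optical flow
* Uses the optical flow to guide inpainting: rather than generating RGB directly, pixels are propagated along the flow

### Internal Video Inpainting by Implicit Long-range Propagation (ICCV2021)
* Extends Deep Image Prior
* Internal learning
* Somewhat like INR, but with a CNN

Few papers of this type were found; video inpainting still seems to rely mainly on prior knowledge. Alternatively, video INR methods such as the previously mentioned NeRV might be extended further.

# C. Image Implicit Function

These methods do not pretrain on a dataset; they build an implicit representation from a single image only. They cannot be applied to inpainting directly, but may offer ideas for single image implicit representation.

### LIIF (CVPR2021)
Learning Continuous Image Representation with Local Implicit Image Function
* Single image implicit function
* cell latent code

### Neural Knitworks (Rejected from NeurIPS) (2021)
Patched Neural Implicit Representation Networks
* Coordinate-based MLP, single image (patch) implicit representation
* GAN-based

# Appendix

### APAP (CVPR2013)
As-projective-as-possible image stitching with moving DLT
* Proposes Moving DLT to avoid the inconsistency between views caused by the traditional projective warp
* A traditional method that is still used as a baseline in later work?

### SinGAN (ICCV2019)
Learning a Generative Model from a Single Natural Image
* General unconditional GAN (not specifically for inpainting)

## Teaser image, type of data used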
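
## Sketch: partial convolution (PConv)

The PConv entry above only says that it improves on the conventional CNN. As a rough illustration of the idea, the sketch below applies one partial-convolution step to a single-channel image in pure Python: each window is convolved over valid pixels only, rescaled by (window size / number of valid pixels), and the mask is updated so that any window containing at least one valid pixel becomes valid. The function name `partial_conv2d`, the list-of-lists layout, and the fixed mean kernel are assumptions made for illustration; the actual method uses learned multi-channel kernels inside a UNet.

```python
def partial_conv2d(image, mask, kernel, bias=0.0):
    """One partial-convolution step (single channel, stride 1, no padding).

    image, mask and kernel are lists of lists; mask holds 1 for known
    pixels and 0 for holes. Returns (output, updated_mask).
    Hypothetical helper for illustration, not code from the paper.
    """
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    window_size = kh * kw
    output, updated_mask = [], []
    for i in range(h - kh + 1):
        out_row, mask_row = [], []
        for j in range(w - kw + 1):
            valid = 0   # number of known pixels in this window
            acc = 0.0   # convolution over known pixels only
            for di in range(kh):
                for dj in range(kw):
                    m = mask[i + di][j + dj]
                    valid += m
                    acc += kernel[di][dj] * image[i + di][j + dj] * m
            if valid > 0:
                # Rescale so the response does not shrink with hole size.
                out_row.append(acc * window_size / valid + bias)
                mask_row.append(1)
            else:
                # Window lies entirely inside the hole: stays invalid.
                out_row.append(0.0)
                mask_row.append(0)
        output.append(out_row)
        updated_mask.append(mask_row)
    return output, updated_mask


# Usage: a constant image with a 2x2 hole and a 3x3 mean kernel.
image = [[2.0] * 4 for _ in range(4)]
mask = [[1, 1, 1, 1],
        [1, 0, 0, 1],
        [1, 0, 0, 1],
        [1, 1, 1, 1]]
mean_kernel = [[1.0 / 9.0] * 3 for _ in range(3)]
filled, updated_mask = partial_conv2d(image, mask, mean_kernel)
```

Because of the rescaling, the hole does not darken the output here, and after one step every window already counts as valid. Stacking such layers shrinks the hole progressively, which is the behaviour the gated convolution of DeepFillv2 later replaces with a learned soft mask.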