# Image Colorization Using Conditional GANs

> VLG Open Project Summers 2022

![](https://i.imgur.com/h2MiZJI.jpg)

### Problem Statement

Colorizing black-and-white photographs traditionally requires a lot of human input and hand-coding. The goal is to build an end-to-end deep learning pipeline that automates image colorization: given a black-and-white image as input, it produces a colorized image as output.

### Motivation

Colorization is the process of adding color information to monochrome photographs or videos. Colorizing a grayscale image is an ill-posed problem with multiple plausible solutions. Online tools exist for image colorization, but they lack inductive bias (as in the image of the apple above), which leads to inappropriate colors, and they fail entirely on some image domains. Deep learning models that learn priors over image data, such as the colors typically observed for human faces, should perform better at this task.

### Solution

- The colorization of grayscale images can be framed as an image-to-image translation task in which each grayscale input has a corresponding color label. A conditional GAN, conditioned on the grayscale image, can be used to generate the corresponding colorized image.
- The model consists of a conditional generator that takes a grayscale image and a random noise vector as input. The generator outputs two image channels, a and b, in the LAB color space; these are concatenated with the L channel (the grayscale input itself) to form the colorized image.
- This generated image is fed to a PatchGAN discriminator, which outputs a score for each patch of its input indicating whether that patch looks real. These scores serve as the learning signal for the generator to produce better images. Along with generated images, the discriminator is also fed real images.
- When trained adversarially, the generator should get better at producing realistic colorized images that share the structure of the input grayscale images, and the discriminator should get better at telling real images from fake ones.
- The trained generator can then be used to colorize new grayscale images.

### Timeline

#### Week 1 : `May 25, 2022` - `May 31, 2022`

- Get comfortable with Google Colab and the PyTorch framework
- Understand concepts related to the problem statement
  - Basic jargon
  - RGB and LAB color spaces
  - Conditional GANs

#### Week 2 : `Jun 1, 2022` - `Jun 7, 2022`

- Download a subset of RGB images (roughly 10,000) from the `COCO images` dataset to use for training.
- Understand the utility functions for converting between color spaces, the dataloaders, and the adversarial loss used in GANs.
- Code the `U-Net generator` and `PatchGAN discriminator` networks as proposed in the `pix2pix` image translation paper.
- Implement a pipeline putting all these elements together and run it for a few epochs.
- Analyse the loss trajectories of the generator and the discriminator.

#### Week 3 : `Jun 8, 2022` - `Jun 14, 2022`

- Tune the hyperparameters to stabilize GAN training for the network corresponding to the implementation in the paper.
- Experiment with the model:
  - using a `pretrained encoder`
  - adding or removing `dropout` layers in the generator
- Train the model until the loss saturates and the colorization outputs are comparable to the original RGB images.
- Analyse the results with graphs and write a conclusion reporting the qualitative and quantitative findings.
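The adversarial setup described above can be sketched in PyTorch as a single training step. This is a minimal illustration, not the full Week 2 implementation: the one-layer `netG` stands in for the `U-Net generator`, the shallow `netD` for the `PatchGAN discriminator`, LAB tensors are assumed pre-normalized to [-1, 1], and the `lambda_l1` weight follows the pix2pix default of 100.

```python
import torch
import torch.nn as nn

# Placeholder generator: L channel (1) -> a, b channels (2).
netG = nn.Conv2d(1, 2, kernel_size=3, padding=1)

# PatchGAN-style discriminator: strided convolutions ending in a grid of
# per-patch realness scores rather than a single scalar.
netD = nn.Sequential(
    nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(128, 1, 4, stride=1, padding=1),   # one logit per patch
)

opt_G = torch.optim.Adam(netG.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(netD.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()
l1 = nn.L1Loss()
lambda_l1 = 100.0  # pix2pix weighting of the L1 term vs. the adversarial term

def train_step(L, ab_real):
    """One adversarial update given a grayscale batch L and its true ab channels."""
    fake_ab = netG(L)
    real_img = torch.cat([L, ab_real], dim=1)    # full LAB image
    fake_img = torch.cat([L, fake_ab], dim=1)

    # Discriminator: push real patches toward 1, generated patches toward 0.
    opt_D.zero_grad()
    pred_real = netD(real_img)
    pred_fake = netD(fake_img.detach())
    loss_D = 0.5 * (bce(pred_real, torch.ones_like(pred_real))
                    + bce(pred_fake, torch.zeros_like(pred_fake)))
    loss_D.backward()
    opt_D.step()

    # Generator: fool the discriminator while staying close to the ground truth.
    opt_G.zero_grad()
    pred_fake = netD(fake_img)
    loss_G = bce(pred_fake, torch.ones_like(pred_fake)) + lambda_l1 * l1(fake_ab, ab_real)
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()

L_batch = torch.randn(2, 1, 64, 64)
ab_batch = torch.randn(2, 2, 64, 64)
d_loss, g_loss = train_step(L_batch, ab_batch)
```

Because `netD` outputs a grid of per-patch logits, the BCE targets are built with `torch.ones_like` / `torch.zeros_like` so that every patch gets its own real/fake label.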
### Resources

- [LAB Color Space](http://shutha.org/node/851)
- [Overview of GANs](https://jonathan-hui.medium.com/gan-whats-generative-adversarial-networks-and-its-application-f39ed278ef09)
- [Conditional GANs](https://jonathan-hui.medium.com/gan-cgan-infogan-using-labels-to-improve-gan-8ba4de5f9c3d)
- [Pix2Pix Image Translation Paper](https://arxiv.org/pdf/1611.07004.pdf)