Upscale Images using Super-Resolution Convolutional Neural Network (SRCNN)

                                  Vũ Uy - Bùi Huy Giáp

Colab Notebook

This project uses a CNN (Convolutional Neural Network) to build an AI upscaler that increases image resolution and improves image quality.

Goal: improve image quality (upscaling)
Applications: creating high-quality wallpapers, restoring images after compression.

Introduction

Convolutional neural networks are a class of models commonly used in image recognition and language processing. Their structure is loosely inspired by the neurons of the human brain, which makes them well suited to detecting features in images.

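Why CNN?

CNNs are used for image processing because they are lightweight. Take a 1920x1080 image as an example: it has more than 2 million pixels and 3 color channels (RGB). A fully connected network would need this data flattened into a one-dimensional array of over 6 million values, so the amount of data to process becomes enormous and the flattening can lose some of the spatial features of the original (input) image. A CNN instead slides small kernels over the image, which keeps the model compact and preserves the spatial layout.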
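
To put numbers on this, the sketch below counts the values in a 1920x1080 RGB image and compares the weights a single fully connected layer would need with those of one small convolutional layer. The layer sizes (64 units, 64 filters of 3x3) are illustrative assumptions, not values from this project.

```python
# Illustrative arithmetic only; the layer sizes below are assumptions.
WIDTH, HEIGHT, CHANNELS = 1920, 1080, 3

pixels = WIDTH * HEIGHT            # 2,073,600 pixels
values = pixels * CHANNELS         # 6,220,800 numbers once flattened

# A fully connected layer needs one weight per (input value, unit) pair, plus biases.
dense_units = 64
dense_params = values * dense_units + dense_units

# A convolutional layer reuses the same small kernels at every position,
# so its size does not depend on the image resolution.
filters, kernel = 64, 3
conv_params = (kernel * kernel * CHANNELS + 1) * filters

print(f'{pixels:,} pixels, {values:,} values')
print(f'dense layer: {dense_params:,} parameters')
print(f'conv layer:  {conv_params:,} parameters')
```

The dense layer needs hundreds of millions of weights for this one image size, while the convolutional layer stays under two thousand regardless of resolution.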

About SRCNN (Super Resolution)

Although the Super-Resolution Convolutional Neural Network (SRCNN) is a deep learning technique, it is fairly classic and not as "deep" as one might expect. It consists of 3 simple parts: patch extraction and representation, non-linear mapping, and reconstruction.

Implementation:

1. Get images from the Internet

Download images to make dataset

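import json
import os
from multiprocessing import Pool, cpu_count

import requests


def download_task(photo_id):
    """Download one Unsplash photo and save it as download/<photo_id>.jpg."""
    print(f'downloading image {photo_id}')
    res = requests.get(f'https://images.unsplash.com/photo-{photo_id}', timeout=60)
    res.raise_for_status()  # fail loudly instead of saving an error page as .jpg
    with open(f'download/{photo_id}.jpg', 'wb') as f:
        f.write(res.content)


if __name__ == '__main__':
    os.makedirs('download', exist_ok=True)

    # urls.json is expected to hold the list of Unsplash photo ids to fetch
    with open('urls.json') as f:
        urls = json.load(f)

    # One worker process per CPU core
    with Pool(cpu_count()) as p:
        p.map(download_task, urls)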

This script downloads and stores the raw images.
At this point each image has a resolution of about 4K.

Disclaimer: The images come from Unsplash, a source of high-quality images that are often used as wallpapers and are covered by a [free-use license](https://unsplash.com/license).

2. Preprocess Images

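import os
import cv2
from multiprocessing import Pool, cpu_count


def resize_task(file):
    """Resize one image in ./download in place so that its longer side is 1280 px."""
    print('resizing ' + file)
    path = os.path.join('./download', file)
    image = cv2.imread(path)
    if image is None:  # not an image, or a corrupted download
        print('skipping unreadable file ' + file)
        return
    h, w, c = image.shape
    scale = 1280 / max(w, h)
    # INTER_LINEAR is what this project used; cv2.INTER_AREA is generally preferred when shrinking
    image = cv2.resize(image, (int(w * scale), int(h * scale)), interpolation=cv2.INTER_LINEAR)

    cv2.imwrite(path, image)


if __name__ == '__main__':
    with Pool(cpu_count()) as p:
        p.map(resize_task, os.listdir('./download'))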

After running the code above, the 2000 downloaded 4K images (the raw data) are resized down to 720p: the longer side becomes 1280 px, which gives 1280 x 720 for a 16:9 landscape image. These images are then split into 2 sets, train and test.
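
The resize rule in the script can be checked by hand. The helper below (the name `target_size` is made up for this illustration) repeats the same arithmetic as `resize_task`: the longer side becomes 1280 px and the aspect ratio is kept.

```python
def target_size(w, h, longest=1280):
    """Return the (width, height) produced by the resize rule used in resize_task."""
    scale = longest / max(w, h)
    return int(w * scale), int(h * scale)


print(target_size(3840, 2160))  # 16:9 landscape 4K -> (1280, 720)
print(target_size(2160, 3840))  # portrait 4K       -> (720, 1280)
```

Portrait images therefore end up 720 px wide and 1280 px tall rather than 1280 x 720.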

In this project, the raw data is split into 2 parts:
- 1500 train images
- 500 test images
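
The split itself is not shown in this write-up. One way to do it, sketched here with the standard library only, is to shuffle the file list with a fixed seed and copy the first 1500 files to one folder and the rest to another. The function name, the folder names `train` and `test`, and the seed are all assumptions.

```python
import os
import random
import shutil


def split_dataset(src='download', train_dir='train', test_dir='test',
                  n_train=1500, seed=0):
    """Copy a random n_train files from src into train_dir and the rest into test_dir."""
    files = sorted(os.listdir(src))        # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(files)

    train, test = files[:n_train], files[n_train:]
    for folder, names in ((train_dir, train), (test_dir, test)):
        os.makedirs(folder, exist_ok=True)
        for name in names:
            shutil.copy(os.path.join(src, name), os.path.join(folder, name))
    return train, test
```

Calling `split_dataset()` on a folder of 2000 images would give 1500 train files and 500 test files.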

3. Create Patches (Dataset)

Each of the 1500 train images is cut into many small patches. These patches are resized to create the input for the model.
The raw data, i.e. the images at their original resolution, is converted from RGB to YCbCr.
Benefits:
- The human eye is more sensitive to brightness (the Y channel) than to color
- The number of channels is reduced (3 $\to$ 1), which makes building the dataset easier.
- Training time is also reduced because the training data is smaller.
- The model can be more accurate because it has fewer data dimensions to handle
Limitations:
- The JPEG compression used for the dataset has already downsampled the Cr and Cb channels, so the color quality of the images will always be low
- The model cannot deal with color noise in the images
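
As a rough illustration of the channel reduction, the sketch below computes the Y (luma) channel from an RGB array with NumPy, using the full-range BT.601 weights that JPEG/JFIF uses. Whether the project used these exact weights or a library conversion is not stated here, so the function is an assumption.

```python
import numpy as np


def rgb_to_y(rgb):
    """Return the Y (luma) channel of an H x W x 3 uint8 RGB array as H x W x 1 floats in [0, 255]."""
    rgb = rgb.astype(np.float32)
    y = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    return y[..., np.newaxis]          # keep a channel axis: 3 channels -> 1


image = np.zeros((720, 1280, 3), dtype=np.uint8)
image[..., 1] = 255                    # a pure green test image
y = rgb_to_y(image)
print(image.shape, '->', y.shape)
```

Green carries the largest weight, which reflects the point above that perceived brightness matters more than the color channels.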

INPUT_SIZE = 33
PADDING = 6
OUTPUT_SIZE = INPUT_SIZE - 2 * PADDING
STRIDE = 14

Where:

INPUT_SIZE: size of the input patch, 33x33
PADDING: the outer border of the input patch, 6 pixels on each side, that is not part of the output.
OUTPUT_SIZE: size of the output patch, 21x21
STRIDE: distance between consecutive input patches taken to create the outputs, 14
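
These four constants fully determine how an image is cut up. The following sketch, which assumes single-channel NumPy arrays and a hypothetical helper name `extract_patches`, slides a 33x33 window over the image in steps of 14 and pairs each input patch with the 21x21 center region the model is expected to predict. It takes the degraded image and the original as two aligned arrays of the same size; how the degraded copy is produced is not covered here.

```python
import numpy as np

INPUT_SIZE = 33
PADDING = 6
OUTPUT_SIZE = INPUT_SIZE - 2 * PADDING   # 21
STRIDE = 14


def extract_patches(low_quality, original):
    """Cut two aligned H x W arrays into (33x33 input, 21x21 target) patch pairs."""
    assert low_quality.shape == original.shape
    h, w = original.shape
    inputs, targets = [], []
    for top in range(0, h - INPUT_SIZE + 1, STRIDE):
        for left in range(0, w - INPUT_SIZE + 1, STRIDE):
            inputs.append(low_quality[top:top + INPUT_SIZE, left:left + INPUT_SIZE])
            # The target is the center of the same window, PADDING pixels in on each side.
            targets.append(original[top + PADDING:top + PADDING + OUTPUT_SIZE,
                                    left + PADDING:left + PADDING + OUTPUT_SIZE])
    # Add the trailing channel axis that Conv2D layers expect.
    return np.stack(inputs)[..., np.newaxis], np.stack(targets)[..., np.newaxis]


original = np.random.rand(720, 1280).astype(np.float32)
x, y = extract_patches(original, original)   # same array twice, just to show the shapes
print(x.shape, y.shape)
```

With a stride of 14 and a 33-pixel window, neighbouring input patches overlap, so a single 1280 x 720 image already yields several thousand training pairs.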

4. Build SRCNN Model

Required Packages

import os
import shutil
import random

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

from PIL import Image
from tqdm import tqdm
from math import log10, sqrt

from tensorflow.keras.preprocessing.image import load_img, img_to_array
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.optimizers import Adam

Train the model

def model():
    """Build and compile the three-layer SRCNN for single-channel (Y) input."""
    SRCNN = Sequential(name='SRCNN')
    # 1) Patch extraction and representation: 9x9 kernels, 64 feature maps
    SRCNN.add(Conv2D(filters=64, kernel_size=(9, 9),
                     input_shape=(None, None, 1),
                     padding='VALID',
                     use_bias=True,
                     kernel_initializer='he_normal',
                     activation='relu'))
    # 2) Non-linear mapping: 64 -> 32 feature maps
    SRCNN.add(Conv2D(filters=32, kernel_size=(3, 3),
                     padding='SAME',
                     use_bias=True,
                     kernel_initializer='he_normal',
                     activation='relu'))
    # 3) Reconstruction: combine the 32 maps into one output channel
    SRCNN.add(Conv2D(filters=1, kernel_size=(5, 5),
                     padding='VALID',
                     use_bias=True,
                     kernel_initializer='he_normal',
                     activation='linear'))

    optimizer = Adam(learning_rate=0.0001)

    SRCNN.compile(optimizer=optimizer, loss='mse')

    return SRCNN

Optimizer: Adam [3]
Loss Function: MSE [4] (Mean Squared Error)
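
The layer settings explain where the shrinkage from 33 to 21 pixels and `PADDING = 6` come from. The short check below is plain Python with no TensorFlow; it applies the standard rule that a 'VALID' convolution with a k x k kernel trims k - 1 pixels from each dimension while 'SAME' keeps the size, and it counts the trainable parameters of each layer.

```python
# (kernel size, input channels, filters, padding) for the three Conv2D layers above
LAYERS = [(9, 1, 64, 'VALID'), (3, 64, 32, 'SAME'), (5, 32, 1, 'VALID')]

size = 33            # INPUT_SIZE
total_params = 0
for kernel, in_ch, filters, padding in LAYERS:
    if padding == 'VALID':
        size -= kernel - 1                               # no padding: the border is lost
    params = (kernel * kernel * in_ch + 1) * filters     # weights + one bias per filter
    total_params += params
    print(f'{kernel}x{kernel} {padding}: output {size}x{size}, {params:,} parameters')

print('total parameters:', f'{total_params:,}')
```

The two 'VALID' layers remove 8 + 4 = 12 pixels in total, i.e. 6 per side, so a 33x33 input yields a 21x21 output, and the whole network has 24,513 parameters.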

Evaluation

The images in the test dataset are fed directly into the model to generate the predicted images.

The PSNR and SSIM metrics are computed against the original image, for both the low-resolution image and the prediction, to measure how effective the model is.
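
PSNR follows directly from the mean squared error. A minimal NumPy sketch for 8-bit images (peak value 255) is shown below; the function name is illustrative and this is not necessarily the exact code used for the numbers reported in this project. SSIM is more involved and is usually taken from a library.

```python
from math import log10, sqrt

import numpy as np


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two arrays of the same shape."""
    diff = reference.astype(np.float64) - test.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float('inf')            # identical images
    return 20 * log10(peak / sqrt(mse))


reference = np.full((21, 21), 100, dtype=np.uint8)
shifted = reference + 5                # every pixel is off by 5 grey levels
print(round(psnr(reference, shifted), 2))
```

Higher is better: a prediction that scores above the low-resolution input against the same original has recovered some detail.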

5. Results

PSNR:

Low-res: 31.84
Prediction: 32.6

SSIM:

Low-res: 0.86
Prediction: 0.88

Lessons learned

Learned to set up an environment for Python dependencies
Gained a better understanding of how CNNs work and how to apply them to real computer vision problems
Learned to write scripts that download and preprocess a dataset
Understood the limitations of the SRCNN model and ways to improve it
Learned to make use of machine resources (multiprocessing, GPU computing)
Learned to port a Keras model to TensorFlow.js and build a web app with Vite and the WASM backend, and understood the limitations of running a model in a web frontend
Learned that risks must be kept under control

Possible improvements

Use a bitmap or PNG dataset instead of JPEG
Create larger patches
Use a deeper, larger CNN model with residual layers
Use a GAN to make the output more realistic
Combine several different models, such as an image denoising model, a JPEG artifact removal model, ...

References

[1] Giới thiệu- Upscale ảnh với một mạng CNN đơn giản

[2] Image Super-Resolution Using Deep Convolutional Networks

[3] Adam

[4] MSE
