Virtual Expo 2024

Image Quality Manipulation Using Autoencoders

Envision
CompSoc

A Project By:

Mentors: Aakarsh Bansal, Abhishek Srinivas, Raajan Rajesh Wankhade

Mentees: Ananya A K, Raunak Nayak, Sai Akhilesh, Sanga Balanarsimha, Sarth Santosh Shah, Utkarsh Shukla, Vaibhavi Nagaraja Nayak, Vanshika Mittal, Vedh Adla

GitHub Repository: https://github.com/raajanwankhade/autoencoder-image-quality-manipulation

Google Meet Link: https://meet.google.com/wgx-vacj-xsc 

Aim

The aim of this project is to design and develop autoencoders that can enhance image quality. We deploy these autoencoders on two tasks:

1. Denoising of noisy images, and

2. Super-resolution of low-quality images.

Introduction

In this project, we use deep autoencoders to enhance image quality. By applying super-resolution and noise-removal techniques, we tackle two of the most common image quality problems. Deep autoencoders can recover intricate detail within images, making them valuable tools in applications that demand image quality preservation and restoration, such as medical diagnostics, surveillance, and satellite imagery. This project is a solid step into the realm of image processing, promising impactful solutions for real-world applications.

Methodology

Since we have two deliverables, we split the mentees into two teams, each working on one of these tasks independently.

Denoising of Images

The presence of noise in an image makes it difficult to interpret. For this task, we used a Medical Image Dataset from Kaggle comprising 120 noise-free X-ray images of teeth. We artificially induce noise into these images, choosing at random between noise drawn from a Gaussian distribution (Gaussian noise) and noise drawn from a uniform distribution (uniform noise). An example is shown here:
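
A minimal sketch of this noise-injection step, assuming NumPy and images normalised to [0, 1] (the noise strengths below are assumptions, not the exact values used in the project):

```python
import numpy as np

def add_random_noise(img, rng=None):
    """Corrupt a clean image (float array in [0, 1]) with noise drawn
    from either a Gaussian or a uniform distribution, chosen at random."""
    if rng is None:
        rng = np.random.default_rng()
    if rng.random() < 0.5:
        noise = rng.normal(loc=0.0, scale=0.1, size=img.shape)    # Gaussian noise
    else:
        noise = rng.uniform(low=-0.2, high=0.2, size=img.shape)   # uniform noise
    return np.clip(img + noise, 0.0, 1.0)  # keep pixels in the valid range
```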

We use an autoencoder with skip connections (a non-conventional U-Net) to recover the denoised image from the noisy input. The encoder consists of two convolutional layers with ReLU activation, compressing the input image into a lower-dimensional representation. The decoder then uses upsampling layers to reconstruct the original image dimensions, with skip connections preserving spatial information during reconstruction. A final sigmoid activation maps the pixel values into the range [0, 1].
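
A minimal sketch of such an architecture, assuming PyTorch (the channel widths and pooling choices are assumptions; the exact layer sizes are in the repository):

```python
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    """Convolutional autoencoder with U-Net-style skip connections."""

    def __init__(self):
        super().__init__()
        # Encoder: two convolutional layers with ReLU, downsampled by pooling
        self.enc1 = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        self.pool = nn.MaxPool2d(2)
        # Decoder: upsampling layers restore the original image dimensions
        self.up = nn.Upsample(scale_factor=2, mode="nearest")
        self.dec2 = nn.Sequential(nn.Conv2d(64 + 64, 64, 3, padding=1), nn.ReLU())
        self.dec1 = nn.Sequential(nn.Conv2d(64 + 32, 32, 3, padding=1), nn.ReLU())
        # Final sigmoid maps pixel values into [0, 1]
        self.out = nn.Sequential(nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, x):
        e1 = self.enc1(x)               # (B, 32, H, W)
        e2 = self.enc2(self.pool(e1))   # (B, 64, H/2, W/2)
        b = self.pool(e2)               # bottleneck: (B, 64, H/4, W/4)
        d2 = self.dec2(torch.cat([self.up(b), e2], dim=1))   # skip connection from e2
        d1 = self.dec1(torch.cat([self.up(d2), e1], dim=1))  # skip connection from e1
        return self.out(d1)
```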

 

We use Mean Squared Error (MSE) as the loss function and Adam as the optimiser. The model was trained for 200 epochs, over which the loss converged from 0.007233 to 0.000290.
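
A sketch of the corresponding training loop, reusing the DenoisingAutoencoder above (the learning rate and the train_loader of (noisy, clean) image pairs are assumptions):

```python
import torch
import torch.nn as nn

model = DenoisingAutoencoder()                              # from the sketch above
criterion = nn.MSELoss()                                    # mean squared error loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # lr is an assumption

for epoch in range(200):                  # trained for 200 epochs
    for noisy, clean in train_loader:     # hypothetical DataLoader of image pairs
        optimizer.zero_grad()
        loss = criterion(model(noisy), clean)  # compare reconstruction to clean target
        loss.backward()
        optimizer.step()
```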

Super Resolution of Images

Super-resolution involves transforming a low-quality image into a higher-quality one while preserving the content, color, and details as much as possible. For this task, we chose an Image Super Resolution dataset from Kaggle comprising 685 training images and 175 test images. An example of a high-resolution image and its corresponding low-resolution version is as follows:
 

Again, we use an autoencoder, modelled as a U-Net. The encoder is a series of downsampling blocks, each consisting of two convolutional layers followed by batch normalization and ReLU activation; these blocks perform feature extraction and dimensionality reduction. A bottleneck layer then acts as a bridge between the encoder and the decoder. The decoder consists of transpose convolutional layers used for upsampling, and at each upsampling step, skip connections from the encoder help the network recover spatial context. Lastly, a 1×1 convolutional layer adjusts the number of output channels.
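
A minimal PyTorch sketch of this U-Net (only two encoder levels are shown for brevity; the actual depth and channel counts are assumptions):

```python
import torch
import torch.nn as nn

def down_block(in_ch, out_ch):
    """One block: two 3x3 convolutions, each followed by batch norm and ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(),
    )

class SRUNet(nn.Module):
    def __init__(self, ch=3):
        super().__init__()
        # Encoder: downsampling blocks for feature extraction
        self.enc1, self.enc2 = down_block(ch, 64), down_block(64, 128)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = down_block(128, 256)        # bridge between encoder and decoder
        # Decoder: transpose convolutions upsample; skips double the channels
        self.up2 = nn.ConvTranspose2d(256, 128, 2, stride=2)
        self.dec2 = down_block(128 + 128, 128)
        self.up1 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec1 = down_block(64 + 64, 64)
        self.out = nn.Conv2d(64, ch, 1)               # 1x1 conv adjusts output channels

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))  # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1)) # skip connection
        return self.out(d1)
```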

 

The figure above shows the encoder of the model used. As before, we used MSE loss as the loss function and Adam as the optimiser.

Results

X-Ray Image Denoising

For denoising, after training for 200 epochs we achieved a loss of 0.00029. Some of the results obtained are as follows:

The median SSIM and PSNR calculated over the entire dataset are:

SSIM (Noisy): 0.0030
SSIM (Denoised): 0.0442
PSNR (Noisy): 4.8366 dB
PSNR (Denoised): 5.8341 dB

Both SSIM and PSNR increase after denoising, indicating improved structural similarity and signal fidelity, which was the aim of this task.
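
For reference, these median metrics can be computed with scikit-image roughly as follows (a sketch, assuming images are float arrays in [0, 1]):

```python
import numpy as np
from skimage.metrics import structural_similarity, peak_signal_noise_ratio

def median_metrics(clean_imgs, test_imgs):
    """Median SSIM and PSNR over pairs of (clean, noisy-or-denoised) images."""
    ssims = [structural_similarity(c, t, data_range=1.0)
             for c, t in zip(clean_imgs, test_imgs)]
    psnrs = [peak_signal_noise_ratio(c, t, data_range=1.0)
             for c, t in zip(clean_imgs, test_imgs)]
    return np.median(ssims), np.median(psnrs)
```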

Image Super-Resolution

For super-resolution, we obtained a training loss of around 0.001. Some results on the test images are shown here:
 

Inferences

We successfully implemented both deliverables: denoising of images and super-resolution of images. Both tasks have several real-life applications, with denoising widely used in medical and astronomical imaging, and super-resolution used in surveillance, security, satellite imagery, and historical document preservation. We also deployed models for both tasks using Streamlit, where users can upload an input image and either denoise it or obtain a higher-resolution version. The Streamlit app is depicted below:
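
A minimal sketch of such a front end (the model names denoiser and sr_model are hypothetical stand-ins for the trained networks; channel handling depends on the model):

```python
import numpy as np
import streamlit as st
import torch
from PIL import Image

st.title("Image Quality Manipulation with Autoencoders")
task = st.radio("Choose a task", ["Denoising", "Super Resolution"])
uploaded = st.file_uploader("Select an input image", type=["png", "jpg", "jpeg"])

if uploaded is not None:
    img = np.asarray(Image.open(uploaded).convert("RGB"), dtype=np.float32) / 255.0
    st.image(img, caption="Input")
    model = denoiser if task == "Denoising" else sr_model   # hypothetical trained models
    x = torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0)  # HWC -> (1, C, H, W)
    with torch.no_grad():
        y = model(x).squeeze(0).permute(1, 2, 0).numpy()     # back to HWC
    st.image(np.clip(y, 0.0, 1.0), caption="Output")
```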

 

References

  1. O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional Networks for Biomedical Image Segmentation", MICCAI 2015.
  2. G. E. Hinton and R. R. Salakhutdinov, "Reducing the Dimensionality of Data with Neural Networks", Science, 2006.
  3. Medical Image Dataset (Kaggle): https://www.kaggle.com/datasets/parthplc/medical-image-dataset
  4. Image Super Resolution Dataset (Kaggle): https://www.kaggle.com/datasets/adityachandrasekhar/image-super-resolution
  5. W. Yang et al., "Deep Learning for Single Image Super-Resolution: A Brief Review", IEEE Transactions on Multimedia, 2019.
  6. M. Elad, B. Kawar, and G. Vaksman, "Image Denoising: The Deep Learning Revolution and Beyond - A Survey Paper", SIAM Journal on Imaging Sciences, 2023.


METADATA

Report prepared on May 7, 2024, 12:15 a.m. by:

  • Raajan Rajesh Wankhade [CompSoc]
  • Aakarsh Bansal [CompSoc]
  • Abhishek Srinivas [CompSoc]

Report reviewed and approved by Aditya Pandia [CompSoc] on May 9, 2024, 10:49 p.m.
