Open Access Research Article

Eye Diseases Prediction and Classification using Deep Learning Techniques

Ahmed M. Fahmy*

Department of Artificial Intelligence, Faculty of Informatics and Computer Science, The British University in Egypt El Sherouk, Cairo, Egypt

Corresponding Author

Received Date: September 24, 2024;  Published Date: October 01, 2024

Abstract

The eye is a major sense organ, and loss of vision has an enormous impact on quality of life. The eye can also show signs of other severe health problems. This work addresses the detection of eye diseases: advanced AI-based techniques are used to analyze retinal images and predict the type of eye disease, with the goal of improving the performance and accuracy of the proposed system. Such a system can save ophthalmologists time otherwise spent on manual examination. This paper develops a Deep Learning (DL) approach for the prediction and classification of eye diseases. A Gabor filter is applied as a feature extractor, since it can extract important features such as the textural and image patterns of the structure and shape of the images, and its output is used as input to Convolutional Neural Network (CNN) models, which classify the retinal images to predict the type of disease. After training and evaluating these models, the model with the best accuracy was VGG-16, with an accuracy of 85.34%.

Keywords: Eye diseases; AI-based techniques; Gabor filter; Deep learning; VGG-16

Introduction

Overview

The eye is the main sense organ of vision. Early detection and diagnosis of eye diseases is essential, since some eye diseases can lead to vision loss. Through early detection, the eye care provider or doctor can protect patients from vision loss or blindness. Doctors sometimes cannot detect a disease in a retinal image quickly, because it may be unclear whether the image shows a normal eye or a diseased one. Consequently, this research integrates a Gabor filter for feature extraction with various deep learning techniques. Using these technologies, the proposed system automatically detects and classifies various eye diseases from retinal image data.

Scope and Objectives

This research paper aims to improve the performance of eye disease detection by identifying the type of disease and extracting the important features of the images, and to examine how these methods affect detection. The scope is the prediction and classification of retinal images into the following classes:

a) Normal Eye

b) Diabetic Retinopathy (DR)

c) Glaucoma

d) Cataract

The objective of this research paper is to apply machine learning and deep learning techniques to classify these images, and to apply a Gabor filter to enhance the structure and shape of the retinal images.

Work Methodology

The methodology is based on training and testing different deep learning models on the dataset, using a Gabor filter as the feature extractor. The features extracted from each image are combined and used as training input for the CNN model. CNNs can automatically learn hierarchical features from data, while Gabor filters provide informative textural features as input to the CNN models, which can also improve the predicted output. The block diagram of the proposed system is illustrated in (Figure 1).

The Dataset

The chosen dataset was downloaded from Kaggle [1]. It contains 4217 images in total, divided into 4 classes: “Normal”, “Diabetic Retinopathy”, “Cataract”, and “Glaucoma”. A sample of the dataset is shown in (Figure 2). The images were collected from different sources such as HRF, Ocular recognition, IDRiD, etc. The task is a multiclass classification. The data distribution is:

a) Cataract: 1037 images

b) Diabetic Retinopathy: 1098 images

c) Glaucoma: 1007 images

d) Normal: 1074 images

The dataset was not balanced, as shown in (Figure 3), and the images were not consistent, as they have different dimensions, shapes, and formats. The dataset has one folder containing 4 subfolders, one per class: “cataract”, “diabetic retinopathy”, “glaucoma”, and “normal”. Dataset augmentation was carried out to obtain a balanced dataset, as shown in (Figure 4). Data augmentation creates new training samples by applying transformations such as rotation and flipping to the images, and it can improve the performance and results of the classification model. Here, images are flipped horizontally and vertically at random, each with probability 0.5, and rotated by a random angle within 30 degrees. The data augmentation techniques applied to the images are [2]:

a) Greyscale image

b) Image Rotation

c) Brightness and Contrast

d) Flipping

The differences between the class counts are not large; the classes are nearly equal in size. Each image was resized to 224x224 so that all images have the same dimensions. Figure 4 shows the dataset distribution among the 4 categories.
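The flipping and rotation steps described above can be sketched as follows. This is a minimal NumPy illustration, not the pipeline used in this work: the function names are hypothetical, the rotation uses simple nearest-neighbour sampling with zero fill, and the "30 degrees" setting is assumed to mean a random angle between -30 and +30 degrees.

```python
import numpy as np

def random_flip(img, rng, p=0.5):
    """Flip horizontally and vertically, each with probability p."""
    if rng.random() < p:
        img = img[:, ::-1]   # horizontal flip (reverse columns)
    if rng.random() < p:
        img = img[::-1, :]   # vertical flip (reverse rows)
    return img

def rotate_nearest(img, angle_deg):
    """Rotate about the image centre using nearest-neighbour sampling."""
    h, w = img.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = np.deg2rad(angle_deg)
    ys, xs = np.mgrid[0:h, 0:w]
    x0, y0 = xs - cx, ys - cy
    # Inverse mapping: for every output pixel, locate its source pixel.
    src_x = np.rint(np.cos(theta) * x0 + np.sin(theta) * y0 + cx).astype(int)
    src_y = np.rint(-np.sin(theta) * x0 + np.cos(theta) * y0 + cy).astype(int)
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.zeros_like(img)           # pixels with no source stay black
    out[valid] = img[src_y[valid], src_x[valid]]
    return out

def augment(img, rng, max_angle=30.0):
    """Random flips followed by a random rotation within +/- max_angle."""
    img = random_flip(img, rng)
    return rotate_nearest(img, rng.uniform(-max_angle, max_angle))

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)  # stand-in image
augmented = augment(image, rng)
```

The output keeps the 224x224 shape of the input, so augmented samples can be mixed with the original ones.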

Data Preprocessing

The dataset must be prepared before it is used as input for the DL training models. Since the dataset is already balanced, few preparation steps are needed. Data preprocessing ensures that the image data is consistent before training the CNN models. The preprocessing phase also includes data analysis to understand the nature of the data [3,4]. The dataset was split into 80% for training, 10% for validation, and 10% for testing.
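An 80/10/10 split of this kind can be sketched as follows. The helper name, the fixed seed, and the file names are assumptions for illustration; the source does not say whether the split was shuffled or stratified by class.

```python
import random

def split_dataset(items, train=0.8, val=0.1, seed=42):
    """Shuffle once, then cut into train / validation / test parts."""
    items = list(items)
    random.Random(seed).shuffle(items)   # seeded for reproducibility
    n_train = int(len(items) * train)
    n_val = int(len(items) * val)
    train_set = items[:n_train]
    val_set = items[n_train:n_train + n_val]
    test_set = items[n_train + n_val:]   # remainder goes to the test set
    return train_set, val_set, test_set

# Hypothetical file names standing in for the image paths.
paths = [f"image_{i:04d}.jpg" for i in range(1000)]
train_set, val_set, test_set = split_dataset(paths)
```

Because the test set takes the remainder, every image lands in exactly one subset even when the dataset size is not divisible by ten.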

Image Resizing

The images should be consistent with each other, so they are resized to a fixed size of 224x224 pixels. This step standardizes the input size for the CNN training models.

Conversion to Grayscale

Converting the RGB images to grayscale reduces the dimensionality: an RGB image has 3 channels, while a grayscale image has only one. This also reduces model complexity.
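As an illustration of the channel reduction, the sketch below converts an RGB array to a single-channel image with the common ITU-R BT.601 luminance weights. The choice of weights is an assumption; the source does not state which conversion was used.

```python
import numpy as np

# ITU-R BT.601 luminance weights for the R, G and B channels (assumed here).
BT601_WEIGHTS = np.array([0.299, 0.587, 0.114])

def rgb_to_gray(img):
    """Collapse an (H, W, 3) RGB array into an (H, W) grayscale array."""
    return img[..., :3].astype(np.float64) @ BT601_WEIGHTS

rgb = np.zeros((224, 224, 3), dtype=np.uint8)
rgb[...] = (255, 255, 255)      # an all-white stand-in image
gray = rgb_to_gray(rgb)         # shape (224, 224), one value per pixel
```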

Normalization

Image normalization changes the pixel intensity values of an image to a specified range. For images with an 8-bit depth, this range is usually 0 to 255, with 0 representing black and 255 representing white. Normalization is used either to boost the contrast within an image or to standardize the pixel values in preparation for subsequent processing.
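A min-max rescaling of this kind can be sketched as follows. The function name and the sample values are illustrative assumptions; the source does not specify the exact normalization routine.

```python
import numpy as np

def min_max_normalize(img, lo=0.0, hi=255.0):
    """Linearly rescale pixel values so they span the range [lo, hi]."""
    img = img.astype(np.float64)
    mn, mx = img.min(), img.max()
    if mx == mn:                       # flat image: avoid division by zero
        return np.full_like(img, lo)
    return (img - mn) / (mx - mn) * (hi - lo) + lo

low_contrast = np.array([[50, 100], [150, 200]], dtype=np.uint8)
stretched = min_max_normalize(low_contrast)        # now spans 0 to 255
unit_range = min_max_normalize(low_contrast, 0.0, 1.0)  # now spans 0 to 1
```

Stretching to 0-255 boosts contrast, while rescaling to 0-1 is a common way to standardize inputs before feeding them to a network.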

Gabor Filter

The Gabor filter is a linear filter used for texture analysis, which essentially means that it analyzes whether there is any specific frequency content in the image in specific directions in a localized region around the point or region of analysis.

Deep Learning Techniques

Convolutional Neural Networks (CNN)

A CNN is a deep learning architecture that learns directly from data. CNNs are well suited to image processing and classification tasks, and to recognizing objects, because they can find patterns in images. A CNN consists of an input layer, convolutional layers, activation functions, pooling layers, fully connected layers, and an output layer. Figure 5 shows the architecture of the CNN [5-7].
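To make the layer sequence concrete, the following is a minimal NumPy sketch of one forward pass through a convolution, a ReLU activation, max pooling, and a fully connected layer with softmax output. All sizes and the random weights are illustrative assumptions and do not correspond to any of the trained models.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d(x, k):
    """Valid 2-D cross-correlation, the 'convolution' used in CNN layers."""
    windows = sliding_window_view(x, k.shape)      # (H-kh+1, W-kw+1, kh, kw)
    return np.einsum("ijkl,kl->ij", windows, k)

def relu(x):
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Non-overlapping max pooling; trailing rows/columns are cropped."""
    h, w = x.shape
    h, w = h - h % size, w - w % size
    return x[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

def softmax(z):
    e = np.exp(z - z.max())                        # shift for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
image = rng.random((28, 28))                       # input layer (stand-in image)
kernel = rng.standard_normal((3, 3))               # one convolutional filter
features = max_pool(relu(conv2d(image, kernel)))   # 26x26 map pooled to 13x13
weights = rng.standard_normal((4, features.size)) * 0.01  # fully connected layer
probs = softmax(weights @ features.ravel())        # one probability per class
```

The four outputs play the role of the four classes considered here; in a trained network the filter and the weights are learned rather than random.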

Visual Geometry Group (VGG)

VGG is known for its uniformity and simplicity. It was developed at Oxford University. VGG has two common variants, VGG 16 and VGG 19. VGG 16 consists of 13 convolution layers with 3x3 filters and 3 fully connected layers, for 16 layers in total. VGG 19 follows the same design with 19 layers: 16 convolution layers, grouped into 5 blocks separated by pooling layers, and 3 fully connected layers. VGG 16 and VGG 19 are both trained on the dataset used here to see how the architecture affects accuracy. Figure 6 shows the architecture of VGG [8].
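The layer counts can be checked with a few lines of arithmetic. The per-block convolution counts below follow the published VGG configurations; the dictionary and function names are only illustrative.

```python
# Convolution layers in each of the 5 blocks (a pooling layer follows each block).
VGG_CONV_BLOCKS = {
    "VGG16": [2, 2, 3, 3, 3],
    "VGG19": [2, 2, 4, 4, 4],
}
NUM_FC_LAYERS = 3

def weight_layer_count(name):
    """Total number of weight layers: convolution plus fully connected."""
    return sum(VGG_CONV_BLOCKS[name]) + NUM_FC_LAYERS

for name in VGG_CONV_BLOCKS:
    print(name, sum(VGG_CONV_BLOCKS[name]), "conv +", NUM_FC_LAYERS, "FC =",
          weight_layer_count(name))
```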

MobileNet

MobileNet is a lightweight network that reduces computation and the number of parameters. It builds a deep network from depthwise separable convolutions, which provides an efficient model for mobile and embedded vision applications. Applying the MobileNet architecture limits the parameter count, complexity, and memory use. Figure 7 shows the architecture of MobileNet.
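The parameter saving of a depthwise separable convolution can be illustrated with a short calculation. The layer sizes are arbitrary examples and biases are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise k x k step plus a 1 x 1 pointwise step."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

k, c_in, c_out = 3, 32, 64        # example layer sizes (assumed)
standard = standard_conv_params(k, c_in, c_out)
separable = separable_conv_params(k, c_in, c_out)
ratio = separable / standard      # equals 1/c_out + 1/k**2
```

For this example layer the separable form needs roughly one eighth of the weights of the standard convolution.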

ResNet

The Residual Network (ResNet) was developed in 2015 by researchers at Microsoft. It was designed to overcome the vanishing gradient problem. ResNet uses a method called “skip connections”: the input of a block bypasses some layers and is added to the activations of a later layer. With skip connections, any layer that would reduce performance and accuracy can effectively be skipped through regularization. ResNets are built by stacking residual blocks. Two common ResNet variants are ResNet 50 and ResNet 101 [9-11]. Figure 8 shows the architecture of ResNet.
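The idea of a skip connection can be sketched with a toy residual block on vectors. Dense weight matrices stand in for the convolution layers of a real ResNet block, and all names and values are illustrative assumptions.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Compute relu(F(x) + x), where F is two small dense layers."""
    h = relu(w1 @ x)
    return relu(w2 @ h + x)        # the "+ x" term is the skip connection

x = np.array([1.0, -2.0, 3.0])
zeros = np.zeros((3, 3))

# With all-zero weights F(x) = 0, so the block passes relu(x) straight through.
skipped = residual_block(x, zeros, zeros)
```

This shows why a block whose layers contribute nothing does not block the signal: the input still reaches the output through the skip path.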

EfficientNet

The EfficientNet models were introduced by Google Research in 2019 in the paper “EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks” by Tan and Le. EfficientNet is a CNN family and scaling method designed to be highly efficient in terms of computational resources and accuracy. Instead of scaling the resolution, depth, or width of a model arbitrarily, it uses a new method, compound scaling, which uniformly scales all three dimensions (depth, width, and resolution) using a single compound coefficient. Figure 9 shows the difference between compound scaling and other scaling methods [12].
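Compound scaling can be illustrated numerically. The sketch below uses the base coefficients reported in the EfficientNet paper (alpha = 1.2 for depth, beta = 1.1 for width, gamma = 1.15 for resolution); the function names are illustrative, and the mapping from these multipliers to actual layer counts and image sizes is not shown.

```python
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15   # depth, width, resolution bases

def compound_scale(phi):
    """Depth, width and resolution multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi

def flops_factor(phi):
    """Approximate FLOPs growth: depth * width^2 * resolution^2."""
    depth, width, resolution = compound_scale(phi)
    return depth * width ** 2 * resolution ** 2

for phi in range(4):
    depth, width, resolution = compound_scale(phi)
    print(f"phi={phi}: depth x{depth:.2f}, width x{width:.2f}, "
          f"resolution x{resolution:.2f}, FLOPs x{flops_factor(phi):.2f}")
```

Since alpha * beta^2 * gamma^2 is close to 2, each unit increase of the compound coefficient roughly doubles the computational cost.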

Implementation

After the data preprocessing stage, a Gabor filter is applied to the images for feature extraction. Gabor filters are effective at capturing texture patterns, which is useful for analyzing features in retinal images associated with various eye diseases. Feature extraction is the process of extracting important information or patterns from the input image, that is, obtaining feature embeddings that can then be used for image classification. The Gabor filter is a linear filter used in image processing for texture classification, feature extraction, and edge detection. Its kernel acts as a bandpass filter for image analysis. There are several ways to extract features with Gabor filters:

a) Using the response matrix of the real part or the imaginary part as a feature vector.

b) Using the response matrix directly as feature vectors, since it contains relevant information about the edges.

c) Using the amplitude of the response matrix as a feature vector.
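The three options above can be sketched with a complex Gabor kernel, whose response has a real part, an imaginary part, and an amplitude. This is a NumPy illustration under stated assumptions: the kernel parameters are arbitrary, the function names are hypothetical, and the filtering uses FFT-based circular convolution for brevity rather than the routine used in this work.

```python
import numpy as np

def complex_gabor_kernel(ksize=21, sigma=4.0, theta=0.0, lambd=10.0, gamma=0.5):
    """Gaussian envelope multiplied by a complex sinusoid (ksize assumed odd)."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    x_r = x * np.cos(theta) + y * np.sin(theta)     # rotated coordinates
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    return envelope * np.exp(1j * 2.0 * np.pi * x_r / lambd)

def gabor_response(img, kernel):
    """Circular convolution via the FFT, re-centred on the kernel middle."""
    spectrum = np.fft.fft2(img) * np.fft.fft2(kernel, s=img.shape)
    resp = np.fft.ifft2(spectrum)
    half = kernel.shape[0] // 2
    return np.roll(resp, shift=(-half, -half), axis=(0, 1))

rng = np.random.default_rng(0)
image = rng.random((64, 64))                        # stand-in grayscale image
resp = gabor_response(image, complex_gabor_kernel())
features = {
    "real": resp.real,           # option (a), real part
    "imaginary": resp.imag,      # option (a), imaginary part
    "amplitude": np.abs(resp),   # option (c)
}
```

Each entry has the same shape as the input image and can be flattened into a feature vector or passed on as an image.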

The parameters of the Gabor kernel used in this research paper, shown in (Figure 10), are those of the OpenCV function [13]: cv2.getGaborKernel(ksize, sigma, theta, lambd, gamma, psi=0, ktype=cv2.CV_32F).

a) Ksize: the size of the filter kernel in pixels.

b) Sigma: the standard deviation of the Gaussian envelope.

c) Theta: the orientation of the filter, passed to OpenCV in radians.

d) Lambda: the wavelength of the sinusoidal factor.

e) Gamma: the spatial aspect ratio of the filter.

f) Psi: an optional parameter, the phase offset of the filter, passed to OpenCV in radians. The value used here is 0.

g) Ktype: the data type of the filter coefficients. The value used here is cv2.CV_32F.
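To show how these parameters shape the kernel, the sketch below builds a real-valued Gabor kernel from the textbook formula in NumPy and assembles a small bank over four orientations. It is an illustration only, not a reimplementation of the OpenCV function, and the parameter values in the bank are assumed examples rather than the settings of Figure 10.

```python
import numpy as np

def gabor_kernel(ksize, sigma, theta, lambd, gamma, psi=0.0):
    """Real Gabor kernel (ksize assumed odd):
    exp(-(x'^2 + gamma^2 y'^2) / (2 sigma^2)) * cos(2 pi x' / lambd + psi)
    """
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    x_r = x * np.cos(theta) + y * np.sin(theta)    # theta in radians
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_r / lambd + psi)
    return (envelope * carrier).astype(np.float32)

# A bank of four orientations: 0, 45, 90 and 135 degrees (values assumed).
orientations = np.linspace(0.0, np.pi, 4, endpoint=False)
bank = [gabor_kernel(31, 4.0, theta, 10.0, 0.5) for theta in orientations]
```

Sigma controls how far the kernel extends, lambd sets the stripe spacing, gamma stretches or squeezes the envelope along the stripes, and theta rotates the whole pattern.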

The results of the preprocessing steps and the feature extraction stage are shown in (Figure 11). After feature extraction, the images are converted back to RGB format to match the CNN input format. The processed dataset is saved in a new directory and fed to the CNN training models. The CNN models used are VGG 19, VGG 16, MobileNet, ResNet50, EfficientNetB0, DenseNet121, and InceptionV3. The following figure shows the single function that creates the CNN model for all of the architectures used. The optimizer is Adam, with the learning rate set to LR = 1e-4. The performance metric is accuracy.
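For readers unfamiliar with the optimizer, a single Adam update with the learning rate of 1e-4 can be sketched in NumPy as follows. The beta and epsilon values are the commonly used defaults and are an assumption here, as is the toy objective; in practice a deep learning framework performs this update internally.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step counter."""
    m = beta1 * m + (1.0 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1.0 - beta2) * grad ** 2     # running mean of squared gradients
    m_hat = m / (1.0 - beta1 ** t)                # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy objective f(w) = w^2 with gradient 2w, starting from w = 1.
w, m, v = 1.0, 0.0, 0.0
w, m, v = adam_step(w, 2.0 * w, m, v, t=1)
```

On the first step the update has a magnitude close to the learning rate itself, regardless of the size of the gradient.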

Simulation Results

Table 1 compares the different models. As shown, VGG-16 has the highest accuracy. The following figures show the learning curves for each model, as well as sample predictions from VGG-16 and MobileNet, the two models with the highest accuracy values (Figures 12-17).
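The accuracy metric used for this comparison is the fraction of correctly classified images. A minimal sketch follows; the labels are made-up toy values for illustration and are not results from this work.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Toy labels only (hypothetical, not taken from the experiments).
y_true = ["normal", "cataract", "glaucoma", "diabetic_retinopathy",
          "normal", "cataract", "glaucoma", "diabetic_retinopathy"]
y_pred = ["normal", "cataract", "normal", "diabetic_retinopathy",
          "normal", "glaucoma", "glaucoma", "diabetic_retinopathy"]
score = accuracy(y_true, y_pred)   # 6 of 8 correct
```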

Table 1: Results comparison.

Conclusion

In conclusion, this project aims to detect, diagnose, and predict eye disease using deep learning techniques. A Gabor filter is used for feature extraction, as it can capture important features such as the textural and image patterns of the structure and shape of the images, and its output is fed to the CNN training models. The features extracted by the CNN models are used as input to a PNN to improve model performance. With these technologies, ophthalmologists can detect diseases in a patient’s eye more easily. The CNN architectures used are VGG 16, MobileNetV2, ResNet50, and EfficientNetB0. Many models were trained, but running them on a CPU was very slow and time-consuming. The best of these models was VGG 16, with an accuracy of 83%.

Future Work

Several plans can be implemented to improve and enhance the eye disease detection, diagnosis, and prediction system. One is to create a web application with a Graphical User Interface (GUI), which would make the prediction models useful to ophthalmologists in clinics. Feedback from ophthalmologists could then be used to improve model performance. By using this web application, ophthalmologists could save time, work more easily, and confirm their diagnosis results.

References

  1. kaggle.com
  2. Jyotsana (2023) Image augmentation techniques. Medium.
  3. IS Jacobs, CP Bean (1963) Fine particles, thin films and exchange anisotropy. In: Magnetism. GT Rado, H Suhl (Eds.), Academic, New York, pp. 271-350.
  4. P Baheti (2024) A simple guide to data preprocessing in machine learning.
  5. Preprocess Images | Roboflow Docs.
  6. (2024) Normalize an image in OpenCV Python. GeeksforGeeks.
  7. U Kant (2022) Tokenization - A complete guide. Medium.
  8. Atulanand (2022) VGGNet Complete Architecture - CodeX. Medium.
  9. T Mostafid (2023) Overview of VGG16, ResNet50, XCeption and MobileNet neural networks. Medium.
  10. (2023) Residual Networks (ResNet) Deep learning. GeeksforGeeks.
  11. S Bangar (2022) Resnet Architecture Explained. Medium.
  12. A Sarkar (2023) Understanding EfficientNet - The most powerful CNN architecture. Medium.
  13. Gabor filter banks for texture classification - skimage 0.23.2 documentation.