CHAPTER 5
CONVENTIONAL AI BASED IMAGE CLASSIFICATION TECHNIQUES
5.1 INTRODUCTION
Artificial Intelligence (AI) is one of the widely preferred automated
techniques for image processing applications. The availability of an in-built memory
has made these techniques much superior to the conventional image processing
algorithms. The presence of in-built memory has improved the accuracy of the
results, which is one of the necessary characteristics of any automated image
classification technique. Thus, these techniques have been used significantly in the
medical field where accuracy plays a major role in applications such as image
classification. Some of the applications of AI in medical image processing are tumor
detection in scan images, anatomical segmentation in scan images, etc. Even though
AI is highly advantageous, these techniques have been seldom used in the field of
ophthalmology. Most of the earlier research has focused only on conventional
image processing algorithms for automated retinal image classification
applications. In this research work, emphasis has been given to exploring the usage
of various AI techniques for abnormal retinal image classification. ANN and fuzzy
theory are the two significant AI techniques for imaging applications. The
exploration of the improvement in the accuracy of the results of these techniques
over the conventional image processing algorithms has been one of the objectives of
this work. In this work, Back Propagation Neural Network (BPN), Kohonen Self-
Organizing Maps (SOM), Radial Basis Function Neural Networks (RBF) and
Counter Propagation Networks (CPN) are used as representatives of the ANN in the
context of retinal image classification. The fuzzy nearest neighbor classifier is used
as the representative of the fuzzy techniques. An extensive analysis of the performance
measures of these AI approaches is also presented in this chapter.
5.2 BLOCK DIAGRAM OF ANN BASED IMAGE CLASSIFICATION
The work carried out with the ANN based techniques is shown in Figure 5.1.
The four neural classifiers are experimented with retinal images. A comparative
analysis has also been performed on these classifiers based on various quality
measures.
Figure 5.1 Framework of the proposed methodology
The techniques of pre-processing and feature extraction have been performed
before the classification process to enhance the quality of the results. The image
database, pre-processing techniques and feature extraction used for the conventional
techniques are used for the ANN based classification also. A description of these
techniques has been detailed in Sections 3.3 and 3.4. The neural classifiers are
experimented first, followed by the experiments on the fuzzy classifier.
5.3 NEURAL TECHNIQUES FOR RETINAL IMAGE CLASSIFICATION
In this work, four neural networks are used for retinal image classification.
The complete image database is divided into training set and testing set. Initially, the
features are extracted from all the training images of the four categories and are used
as input to these neural classifiers. After training, the testing of the stabilized
networks is done with the testing image set. The training and the testing process are
carried out with the corresponding mathematical algorithms. Finally, the
performance of these classifiers in successful pattern classification is estimated
through various quality measures.
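The quality measures mentioned above can be made concrete with a short sketch. The following Python fragment is illustrative only (not the thesis implementation): it computes a confusion matrix and the per-class sensitivity and specificity typical of such measures; all function and variable names are my own assumptions.

```python
import numpy as np

def quality_measures(y_true, y_pred, n_classes=4):
    """Confusion matrix plus accuracy and per-class sensitivity/specificity,
    the kind of quality measures used to compare the classifiers."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                            # row: true class, column: predicted
    accuracy = np.trace(cm) / cm.sum()
    sensitivity = np.diag(cm) / cm.sum(axis=1)   # per-class true positive rate
    tn = cm.sum() - cm.sum(axis=0) - cm.sum(axis=1) + np.diag(cm)
    fp = cm.sum(axis=0) - np.diag(cm)
    specificity = tn / (tn + fp)                 # per-class true negative rate
    return cm, accuracy, sensitivity, specificity
```

Given the true and predicted labels of the testing set, these measures summarize how successfully each classifier assigns the four retinal image categories.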
5.3.1 BPN based Image Classification
Back Propagation Neural Networks are the most commonly used supervised neural
networks for imaging applications. BPN belongs to the category of feed forward
networks with the gradient descent algorithm as the training methodology. A properly
trained BPN tends to give reasonable responses to inputs to which it has never been
subjected earlier.
5.3.1.1 Architecture of BPN
In this work, a three layer network is developed. The number of neurons in the
input layer is ‘n’, where ‘n’ corresponds to the number of input features. The number
of neurons in the output layer is based on the number of output classes. When
designing a neural network, one crucial and difficult step is determining the number
of neurons in the hidden layers. The hidden layer is responsible for the internal
representation of the data and the information transformation between the input and
output layers. If there are too few neurons in the hidden layer, the network may not
contain sufficient degrees of freedom to form a representation. If too many neurons
are defined, the network might become over-trained. In this work, one hidden layer with
20 neurons is used for the classification problem. The number of hidden layer
neurons is selected by a trial and error method. The architecture is shown in
Figure 5.2.
Figure 5.2 Architecture of BPN
The proposed architecture consists of 16 neurons in the input layer, 20 neurons in the
hidden layer and 4 neurons in the output layer. The input layer and the hidden layer
neurons are interconnected by the set of weight vectors U, and the hidden layer and
the output layer neurons are interconnected by the weight matrix V. In addition to the
input vector X and output vector Y, the target vector T is given to the output layer
neurons. Since BPN operates in the supervised mode, the target vector is mandatory.
During the training process, the difference between the output vector and the target
vector is calculated and the weight values are adjusted based on the difference value.
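The wiring of the 16-20-4 architecture can be sketched in a few lines of Python. This is an illustrative sketch only: the sigmoid activation is my assumption (the text does not name an activation function here), and the random initial weights merely stand in for a trained network.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
# Weight set U (input -> hidden) and weight matrix V (hidden -> output),
# sized for the 16-20-4 architecture described in the text.
U = rng.normal(scale=0.1, size=(16, 20))
V = rng.normal(scale=0.1, size=(20, 4))

x = rng.random(16)                 # one 16-element feature vector X
y = sigmoid(sigmoid(x @ U) @ V)    # output vector Y: one score per class
```

During training, y would be compared against the 4-element target vector T and the difference used to adjust U and V, as described above.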
5.3.1.2 Training Algorithm
In this work, the gradient descent algorithm is used for training the BPN. The
gradient of the performance function is used in these algorithms to determine how to
adjust the weights in order to minimize the error. The weight vectors are randomly
initialized to trigger the training process. During training, the weights of the network
are iteratively adjusted to minimize the sum of squared error.
E = Σ (t − o)^2    (5.1)

where 't' is the target vector and 'o' is the output vector.
Equation (5.1) can be expressed in terms of the input training vector X, the weight
vectors and the activation function. The gradient is determined using a technique
called back propagation where computational operation is performed in the backward
direction. The weights are adjusted in the direction in which the error decreases most
rapidly, that is, along the negative gradient. Such an iterative process can be expressed
as
W_{q+1} = W_q − α g_q    (5.2)

where W_q is the weight vector (which includes U and V), α is the learning rate and g_q is the
current gradient.
The derivative of the error value with respect to the weights is the gradient vector.
Hence, the weight adjustment criterion of the BPN network is given by
W_{q+1} = W_q − α ∂E/∂W    (5.3)

where 'q' is the iteration counter and 'E' is the error between the target and
output of the network.
The network is said to be stabilized when the weight vectors (U and V) of the
network remain constant for successive iterations. These weight vectors are the
finalized vectors which represent the trained network. The testing images are then
given as input to the trained network and the performance measures are analyzed.
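The training procedure above can be sketched as follows. This is an illustrative Python sketch, not the thesis implementation: the sigmoid activation, the learning rate and the XOR toy data are my assumptions for demonstration; the weight updates follow Equations (5.1)-(5.3).

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_bpn(X, T, n_hidden=20, lr=0.5, epochs=2000, seed=0):
    """Gradient-descent BPN training: each step applies Eq. (5.3),
    W(q+1) = W(q) - lr * dE/dW, with E the sum of squared error of Eq. (5.1)."""
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.5, size=(X.shape[1], n_hidden))  # input -> hidden
    V = rng.normal(scale=0.5, size=(n_hidden, T.shape[1]))  # hidden -> output
    history = []
    for _ in range(epochs):
        H = sigmoid(X @ U)                       # forward pass, hidden layer
        O = sigmoid(H @ V)                       # forward pass, output layer
        history.append(np.sum((T - O) ** 2))     # error E of Eq. (5.1)
        delta_o = (O - T) * O * (1 - O)          # error propagated backwards
        delta_h = (delta_o @ V.T) * H * (1 - H)
        V -= lr * H.T @ delta_o                  # Eq. (5.3) applied to V
        U -= lr * X.T @ delta_h                  # Eq. (5.3) applied to U
    return U, V, history

# toy demonstration on XOR, a pattern no single-layer network can separate
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
T = np.array([[0], [1], [1], [0]], float)
U, V, history = train_bpn(X, T, n_hidden=8)
```

The recorded error history falls as the weights stabilize; in the retinal application the rows of X would instead hold the 16 extracted features and T the four-class targets.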
5.3.2 RBF based Image Classification
One of the emerging neural networks in the supervised category is the
Radial Basis Function neural network. Though supervised in nature, there are some
differences between RBF and other supervised neural networks in terms of
architecture and training algorithm. These differences have made RBF networks
highly significant for use in practical applications.
5.3.2.1 Architecture of RBF
A feed-forward structure is used in RBF with a single input layer, a hidden
layer and a summation layer. The number of neurons in the input layer is based on
the number of input features. The number of hidden layer neurons is selected
randomly. The summation layer is the third layer, which is equivalent to the output
layer of other neural networks, but the only operation performed by this layer is
summation. Hence, it is named the summation layer. Unlike other 2-layer neural
networks, only one weight matrix is used in RBF, which forms the
interconnection between the hidden layer and the output layer. The architecture of
RBF is shown in Figure 5.3.
Figure 5.3 Framework of RBF
An architectural size of 16 input neurons, 20 hidden neurons and 4 output neurons is
used in this work for the RBF neural network. The weight matrix connecting the
hidden layer and the summation layer is denoted by W. Each neuron in the hidden layer is
considered as a radial basis function. The radial basis function used in this work is
the Gaussian function. A target vector is also supplied to the summation layer of the
RBF neural network.
5.3.2.2 Training Algorithm of RBF
The training algorithm of RBF neural network is slightly different from the
other neural networks. Normally, the weight matrices alone are adjusted in the neural
networks but in RBF, the parameters of the basis function are also adjusted in
addition to the weight adjustment. Also, the hidden layer neurons’ outputs are not
calculated using the weighted-sum mechanism and sigmoid activation. Instead, the
Gaussian function is used to estimate the outputs of hidden layer neurons. In this
work, the training algorithm is carried out in two stages. Initially, the parameters of
the Gaussian function are adjusted using the distance measure and then the weight
matrices are adjusted in the second stage using the least square method. The detailed
algorithm is given below:
Step 1: The mean μ_j, the standard deviation σ_j and the weight matrix W are
randomly initialized.
Step 2: The value of μ_j is estimated for each hidden layer neuron 'j' iteratively
using the formula

μ_j^{q+1} = μ_j^q + α (X − μ_j^q)    (5.4)

This equation is executed for 'q' iterations till the second term in Equation
(5.4) falls below the specified threshold value. Thus, the closeness of the
input to the mean values is used as the measure to estimate the stabilized
set of 'μ' values. In the above equation, 'α' is the learning rate.
Step 3: The value of σ_j is estimated using the formula

σ_j^2 = (1/2) (μ_{j+1} − μ_j)^2    (5.5)
Step 4: The stabilized output of each neuron in the hidden layer 'j' is determined
using the formula given by

Z_j = exp( −‖X − μ_j‖^2 / (2 σ_j^2) )    (5.6)
Step 5: In the second stage, the output of the summation layer neuron 'k' is
estimated with the randomly initialized weights using the following formula

o_k = Σ_j w_jk Z_j    (5.7)
69
Step 6: The error value is further estimated using the following formula

E = Σ_k (t_k − o_k)^2    (5.8)
Step 7: If the error value is large, the following weight adjustment equation is used to
determine the stabilized set of weights

w_jk^{q+1} = w_jk^q + α (t_k − o_k) Z_j    (5.9)
Step 8: The steps 5-7 are repeated till the error value falls below a specified threshold
level to obtain the stabilized weights.
Using this stabilized set of weights, mean and standard deviation values, the
network is tested to determine the outputs of the summation layer for every
individual input. The unknown input is allotted to the class for which the output of
the neuron is maximum.
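Steps 1-8 above can be sketched compactly. This Python sketch is illustrative only: the toy two-class XOR data, the learning rates, iteration counts and the neighbouring-centre width heuristic are my assumptions; the centre update, Gaussian activation and weight update follow Equations (5.4)-(5.9).

```python
import numpy as np

def gaussian(x, mu, sigma):
    # hidden-unit response, Eq. (5.6)
    return np.exp(-np.sum((x - mu) ** 2) / (2.0 * sigma ** 2))

def train_rbf(X, T, n_hidden=4, lr=0.1, epochs=500, seed=0):
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), n_hidden, replace=False)].astype(float)
    # Stage 1: move each centre toward its nearest inputs -- Eq. (5.4)
    for _ in range(50):
        for x in X:
            j = np.argmin([np.sum((x - m) ** 2) for m in mu])
            mu[j] += lr * (x - mu[j])
    # widths from distances between neighbouring centres -- in the spirit of Eq. (5.5)
    d = [np.linalg.norm(mu[j] - mu[(j + 1) % n_hidden]) for j in range(n_hidden)]
    sigma = np.maximum(np.array(d) / np.sqrt(2.0), 1e-3)
    # Stage 2: least-mean-square adjustment of the output weights -- Eqs. (5.7)-(5.9)
    W = rng.normal(scale=0.1, size=(n_hidden, T.shape[1]))
    for _ in range(epochs):
        for x, t in zip(X, T):
            z = np.array([gaussian(x, mu[j], sigma[j]) for j in range(n_hidden)])
            o = z @ W                        # summation layer output, Eq. (5.7)
            W += lr * np.outer(z, t - o)     # weight adjustment, Eq. (5.9)
    return mu, sigma, W

# toy two-class demonstration on XOR with one-hot targets
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
T = np.array([[1, 0], [0, 1], [0, 1], [1, 0]], float)
mu, sigma, W = train_rbf(X, T, n_hidden=4, epochs=500)
```

An unknown input is allotted to the class whose summation-layer output is maximum, exactly as stated above.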
5.3.3 SOM based Image Classification
SOM belongs to the category of unsupervised neural networks where there is
no requirement for a target vector. SOM is also called the Kohonen neural network,
and the 'winner-take-all' principle is used in its training algorithm.
Similar to statistical clustering algorithms, these Kohonen networks are able to find
the natural groupings in the training data set. As the training algorithm follows the
'winner-take-all' principle, these networks are also called competitive learning
networks.
5.3.3.1 Architecture of SOM
The topology of the Kohonen self-organizing map is represented as a
2-dimensional, single-layered output neural network. Each input neuron is connected
to each output neuron. The number of input nodes is determined by the dimension of the
training patterns. There is no particular geometrical relationship between the output
neurons in the competitive learning networks. During the process of training, the
input patterns are fed into the network sequentially. Output neurons represent the
‘trained’ classes and the center of each class is stored in the connection weights
between input and output neurons. The topology of SOM is shown in Figure 5.4.
Figure 5.4 Architecture of SOM
The architecture used in this work is 16-4 where 16 corresponds to the
number of input layer neurons and 4 corresponds to the number of output layer
neurons. The weight matrix is denoted by W.
5.3.3.2 Training Algorithm of SOM
One of the successful applications of SOM is categorization, where the data is
grouped into one of the categories based on some similarity measure. Mapping
of the input vectors to the output vectors based on the characteristic features is
performed by the SOM training algorithm. The competitive learning rule is adopted
for training the network. The 'winner-take-all' concept is a methodology in which a
winner neuron is selected based on the performance metrics. The weight adjustment
is performed for the winner neurons and also the neighboring neurons of the winner
neuron. The weights of all other neurons remain unchanged. The neighboring
neurons are determined using a radius around the winner neuron. In this work, unit
radius is selected, which means that the weights of the winner neuron alone are adjusted
during the process. A detailed training algorithm is given below:
Step 1: Initialize the weights w_ij; 'i' corresponds to the input layer and 'j' corresponds to
the output layer.
Step 2: While stopping condition is false, do steps 3 to 6.
Step 3: The distance measure for each output layer neuron 'j' is computed using the
formula given by

D_j = Σ_i (w_ij − x_i)^2    (5.10)
Step 4: The index j with minimum D_j is selected.
Step 5: The winner neuron's weight is updated using the rule given by

w_ij(new) = w_ij(old) + α [x_i − w_ij(old)]    (5.11)

where x_i denotes the feature values of the input data set and α denotes the learning rate.
Step 6: The training is stopped when the maximum number of iterations is reached.
The training process is carried out with the training image set. The entire process
is repeated for the specified number of iterations in the algorithm. The weights
yielded by the network in the last iteration are stored as the stabilized weights.
Further, the testing images are used to estimate the performance of the neural
network.
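The competitive training loop of Steps 1-6 can be sketched as follows. This Python sketch is illustrative only: the synthetic 2-D clusters, learning rate and iteration count are my assumptions; the distance and update rules follow Equations (5.10)-(5.11) with the simplification used in this work, where only the winner's weights are adjusted.

```python
import numpy as np

def train_som(X, n_out=4, lr=0.3, epochs=100, seed=0):
    """'Winner-take-all' competitive training per Eqs. (5.10)-(5.11)."""
    rng = np.random.default_rng(seed)
    W = rng.random((n_out, X.shape[1]))          # one weight row per output neuron
    for _ in range(epochs):
        for x in X:
            D = np.sum((W - x) ** 2, axis=1)     # distance measure, Eq. (5.10)
            j = np.argmin(D)                     # winner neuron, Step 4
            W[j] += lr * (x - W[j])              # weight update, Eq. (5.11)
    return W

# four well-separated 2-D clusters stand in for the four image categories
rng = np.random.default_rng(1)
centres = np.array([[0, 0], [0, 5], [5, 0], [5, 5]], float)
X = np.vstack([c + 0.1 * rng.standard_normal((20, 2)) for c in centres])
W = train_som(X)
```

After the specified number of iterations the weight rows are stored as the stabilized weights, and each testing input is assigned to the output neuron whose weight row is nearest.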
5.3.4 CPN based Image Classification
Counter Propagation Neural networks are one of the widely used neural
networks. It is named as hybrid neural network since the concept of both supervised
and unsupervised methodologies are involved in the training algorithm.
Theoretically, the characteristic features of both the training methodologies are
available in CPN.
5.3.4.1 Architecture of CPN
CPN is a 2-layered network with a single hidden layer apart from the input
and the output layers. The hidden layer is also named the Kohonen layer, which uses
the unsupervised methodology for training the network. The output layer is also
named the Grossberg layer, which uses the supervised training methodology. Two sets
of weight matrices are involved in the architecture. The target is usually supplied to
the Grossberg layer of the CPN. The architecture of CPN is the same as that of BPN,
which is given in Figure 5.2. The difference between these two networks is found
only in the mode of stabilizing the weights.
5.3.4.2 Training Algorithm of CPN
In the training algorithm of CPN, both the supervised and unsupervised
training methodologies are employed for adjusting the two sets of weights. The
'winner-take-all' strategy is used to adjust the weights of the Kohonen layer, and the
output Grossberg layer's weight adjustment is based on the error signal, which is the
difference between the target and the output vector. The error signal is used to update
only the output layer weights, unlike the back propagation network where the error is
used to update the weights of both the layers. Thus, this network is named the Counter
Propagation Neural Network to show that it is contrary to the conventional BPN.
The weight adjustment between the input layer and the competition layer is given by
U_ij^{q+1} = U_ij^q + α (x_i − U_ij^q) Z_j    (5.12)
In the above expression, x denotes the input vector, 'q' is the iteration number,
'i' denotes the input layer neuron, 'j' corresponds to the hidden layer neuron and 'α'
is the learning coefficient. A value of '1' is given to Z_j if 'j' is the winner neuron or
'0' if the neuron 'j' fails in the competition. The fact that the unsupervised training
methodology is used in the Kohonen layer is verified by Equation (5.12), which
shows that the weights are adjusted only for the winner neuron and the weights of the
remaining neurons remain the same.
After the weight vectors U_ij have stabilized, as the second step, the output
layer begins learning the desired output. The weight adjustment for the output layer
is given by
V_jk^{q+1} = V_jk^q + α (t_k − V_jk^q)    (5.13)
where ‘t’ is the target vector, ‘k’ corresponds to the output layer neurons. The
involvement of the target vector has justified the usage of supervised training
methodology in the output layer. After the training process, the CPN is experimented
with the testing images using the stabilized set of weights. All the four categories of
the training inputs are now represented by the weight matrices and hence the
unknown input can be associated to the corresponding category during the testing
process.
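The two-stage CPN training can be sketched as follows. This Python sketch is illustrative only: the prototype inputs, one-hot targets, learning coefficients and iteration counts are my assumptions; the Kohonen stage follows Equation (5.12) and the Grossberg stage follows Equation (5.13), applied to the winner neuron.

```python
import numpy as np

def train_cpn(X, T, n_hidden=4, alpha=0.3, beta=0.1, epochs=100, seed=0):
    """Unsupervised winner-take-all Kohonen stage (Eq. 5.12),
    then supervised Grossberg stage (Eq. 5.13)."""
    rng = np.random.default_rng(seed)
    U = X[rng.choice(len(X), n_hidden, replace=False)].astype(float)  # Kohonen weights
    V = rng.random((n_hidden, T.shape[1]))                            # Grossberg weights
    for _ in range(epochs):              # stage 1: cluster the inputs
        for x in X:
            j = np.argmin(np.sum((U - x) ** 2, axis=1))   # winner, Z_j = 1
            U[j] += alpha * (x - U[j])                    # Eq. (5.12)
    for _ in range(epochs):              # stage 2: learn the targets
        for x, t in zip(X, T):
            j = np.argmin(np.sum((U - x) ** 2, axis=1))
            V[j] += beta * (t - V[j])                     # Eq. (5.13)
    return U, V

def cpn_predict(x, U, V):
    j = np.argmin(np.sum((U - x) ** 2, axis=1))
    return V[j]       # the winner's Grossberg weights form the output

# four prototype inputs, one per category, with one-hot targets
X = np.array([[0, 0], [0, 5], [5, 0], [5, 5]], float)
T = np.eye(4)
U, V = train_cpn(X, T)
```

After training, the four categories are represented by the stabilized weight matrices, so an unknown input is associated with the category of its winning Kohonen neuron, as described above.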
5.4 FUZZY TECHNIQUE FOR RETINAL IMAGE CLASSIFICATION
The classifier in which fuzzy sets or fuzzy logic is used in the course of its operation
is called a fuzzy classifier. A degree of membership is assigned to each of the four classes
and the unknown input is assigned to the class for which the membership value is
maximum. Fuzzy classifiers are often designed to be transparent, i.e., the steps and logic
statements leading to the class prediction are traceable and comprehensible. The
fuzzy systems are accurate only if sufficient expert opinion is available about the
corresponding application. In this work, the fuzzy nearest neighbor classifier is used
to test the applicability of fuzzy systems for retinal image classification.
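The membership-based decision described above can be illustrated with a sketch of the classical fuzzy k-nearest-neighbour rule; the chapter's specific two-phase algorithm is detailed in the next subsection, so this Python fragment is general background only, and the function names, the fuzzifier value and the toy data are my assumptions.

```python
import numpy as np

def fuzzy_memberships(x, train_X, train_y, n_classes=4, k=5, m=2.0):
    """Memberships of an unknown input in each class: the k nearest
    training samples vote with weights that decay with distance
    (m > 1 is the fuzzifier of the classical fuzzy k-NN rule)."""
    d = np.linalg.norm(train_X - x, axis=1)
    nearest = np.argsort(d)[:k]
    w = 1.0 / np.maximum(d[nearest], 1e-12) ** (2.0 / (m - 1.0))
    u = np.zeros(n_classes)
    for idx, wi in zip(nearest, w):
        u[train_y[idx]] += wi
    return u / u.sum()               # memberships over the classes sum to one

def fuzzy_classify(x, train_X, train_y, n_classes=4):
    u = fuzzy_memberships(x, train_X, train_y, n_classes)
    return int(np.argmax(u))         # class with the maximum membership

# toy training set: four small clusters stand in for the four categories
rng = np.random.default_rng(2)
centres = np.array([[0, 0], [0, 5], [5, 0], [5, 5]], float)
train_X = np.vstack([c + 0.1 * rng.standard_normal((5, 2)) for c in centres])
train_y = np.repeat(np.arange(4), 5)
```

The graded memberships, rather than a hard vote, are what make the class prediction traceable in the sense described above.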
5.4.1 Fuzzy Nearest Neighbor Classifier for image classification
The operation of this algorithm is carried out in two phases. In the first phase,
each image of the training data set is clustered into three clusters (background, blood