CHAPTER 3
REAL CODED GENETIC ALGORITHM FOR FUZZY
CLASSIFIER DESIGN
3.1 INTRODUCTION
Genetic Algorithm (GA) (Goldberg 1989) is a generalized search
and optimization technique inspired by the theory of biological evolution.
While many authors have applied GA to system identification and control
problems, only a few have focused on data classification problems,
and a number of issues remain in applying genetic algorithms to the
design of fuzzy classifiers.
The conventional Binary-coded GA (BGA) suffers from the Hamming cliff
problem (Devaraj et al. 2005), which can cause difficulties when
coding continuous variables. Also, for discrete variables whose total
number of permissible choices is not equal to 2^k (where k is an integer), it
is difficult to use a fixed-length binary coding to represent all
permissible values.
To overcome these difficulties, this chapter presents a Real-coded
GA (RGA) for fuzzy classifier design. It also discusses the issues to be
addressed in developing a genetic fuzzy classifier model.
[Figure 3.1 flowchart boxes: Start; Generate initial population; Evaluate fitness of chromosomes; Converged? (Yes: Stop; No: continue); Select parents for reproduction; Apply crossover and mutation; Calculate fitness value]
3.2 OVERVIEW OF GENETIC ALGORITHM
Genetic Algorithm (Sivanandam, 2007) maintains a population of
individuals that represent candidate solutions. Each individual is evaluated
with the objective function to give a measure of its fitness for the problem. In
each generation, a new population is formed by selecting the fitter
individuals according to a particular selection strategy. Some members of the new
population undergo genetic operations to form new solutions.
Figure 3.1 Flowchart of Genetic Algorithm
The two commonly used operations are crossover and mutation.
Crossover is a mixing operator that combines genetic material from selected
parents. Mutation acts as a background operator and is used to search the
unexplored search space by randomly changing the values at one or more positions of the selected chromosome.
As shown in Figure 3.1, starting with an initial population, the
genetic algorithm exploits the information contained in the present population
and explores new individuals by generating offspring with the three genetic
operators, namely reproduction, crossover and mutation. The offspring can then
replace members of the old generation. Fitter chromosomes are selected
for the next generation with higher probability. After several generations,
the algorithm converges to the best chromosome, which ideally represents
the optimal or a near-optimal solution.
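The loop described above can be sketched in Python as follows. This is a minimal illustration and not the implementation used in this work: the names `run_ga`, `toy_fitness`, `toy_init`, `toy_crossover` and `toy_mutate` are hypothetical, a fixed number of generations stands in for the convergence test of Figure 3.1, and the one-variable toy problem is chosen only so that the example runs.

```python
import random

def run_ga(fitness, init_individual, crossover, mutate,
           pop_size=20, generations=50, seed=0):
    """Generic GA loop (maximization); stops after a fixed number of generations."""
    rng = random.Random(seed)
    # Generate the initial population.
    population = [init_individual(rng) for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            # Reproduction: each parent wins a binary tournament.
            p1 = max(rng.sample(population, 2), key=fitness)
            p2 = max(rng.sample(population, 2), key=fitness)
            # Crossover followed by mutation creates one new individual.
            offspring.append(mutate(crossover(p1, p2, rng), rng))
        # The offspring replace the old generation.
        population = offspring
        # Remember the best individual seen so far.
        best = max(population + [best], key=fitness)
    return best

# Toy problem (assumption for illustration): maximize -(x - 3)^2 on [0, 10].
def toy_fitness(x):
    return -(x - 3.0) ** 2

def toy_init(rng):
    return rng.uniform(0.0, 10.0)

def toy_crossover(a, b, rng):
    w = rng.random()                  # random convex combination of the parents
    return w * a + (1.0 - w) * b

def toy_mutate(x, rng):
    return min(max(x + rng.gauss(0.0, 0.1), 0.0), 10.0)   # small clipped step

print(run_ga(toy_fitness, toy_init, toy_crossover, toy_mutate))
```

The selection, crossover and mutation operators are passed in as functions, so any of the operators discussed in this chapter could be substituted without changing the loop itself.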
3.3 REAL CODED GENETIC ALGORITHM
In a standard Simple Genetic Algorithm (SGA), binary strings are
used to represent the solution variables. It is widely recognized that the SGA
scheme is capable of locating the neighborhood of the optimal or near-optimal
solutions, but, in general, SGA requires a large number of generations to
converge. This is especially because of decoding of binary strings into real
numbers and vice versa. Moreover, this binary string makes GA to suffer
from the Hamming Cliff’s problem.
The Hamming cliff (Yu et al. 2010) is the phenomenon that occurs
when a pair of strings such as 0111111111 and 1000000000 correspond to neighboring points
in the phenotype space but have the maximum Hamming distance in the
genotype space. To cross this Hamming cliff, all bits have to be changed
simultaneously. The probability of this occurring under crossover and mutation
is very small, which can result in premature convergence.
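The example pair can be checked directly. The short sketch below (the helper name `hamming_distance` is illustrative) decodes the two 10-bit strings as unsigned integers and counts the bit positions in which they differ.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of bit positions in which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have equal length")
    return sum(x != y for x, y in zip(a, b))

left, right = "0111111111", "1000000000"
print(int(left, 2), int(right, 2))     # decoded values: 511 512
print(hamming_distance(left, right))   # 10
```

The decoded values differ by one, yet all ten bits must flip to move from one string to the other.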
To overcome this difficulty, the proposed Real-coded Genetic Algorithm
(RGA) represents the rule set using integer numbers and the membership
functions using floating point numbers. The power of GA lies in the kind of genetic
operators applied to modify the parameters of the chromosome and find
better solutions to the problem. The following real-parameter genetic
operators are popularly used in the literature for RGA-based fuzzy
classifier design.
Roulette wheel selection, Weighted mean crossover and Uniform mutation are used by Russo et al. (2000).
Rank selection, Arithmetic crossover and Random mutation are used by Setnes et al. (2000).
Roulette wheel selection, Max-min arithmetical crossover and Uniform mutation are used by Casillas et al. (2005).
Tournament selection, BLX-α crossover and Random mutation are used by Alcala et al. (2007).
Tournament selection, Arithmetic crossover and Uniform mutation are used by Devaraj et al. (2010).
In this research work, Tournament Selection, BLX-α crossover and
Non-uniform mutation operations are used. The details of the genetic
operators are presented in the following subsections:
3.3.1 Tournament Selection
The goal of selection is to allow the “fittest” individuals to be
selected more often to reproduce. In tournament selection, ‘n’ individuals are
selected randomly from the population, and the best of the ‘n’ is inserted into
the new population for further genetic processing. Tournaments are often held
between pairs of individuals, although larger tournaments can be used. This
procedure is repeated until the mating pool is filled.
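A minimal sketch of this procedure is given below. The function names are illustrative, fitness is assumed to be maximized, and the contestants are assumed to be drawn without replacement; none of these details is fixed by the description above.

```python
import random

def tournament_select(population, fitness, n=2, rng=random):
    """Return the fittest of n individuals drawn at random (higher is better)."""
    contestants = rng.sample(population, n)
    return max(contestants, key=fitness)

def fill_mating_pool(population, fitness, n=2, rng=random):
    """Repeat the tournament until the pool is as large as the population."""
    return [tournament_select(population, fitness, n, rng) for _ in population]

# Usage: individuals are plain numbers and fitness is the value itself.
pool = fill_mating_pool([4, 9, 1, 7, 3], fitness=lambda x: x,
                        rng=random.Random(0))
print(pool)
```

Larger values of `n` increase the selection pressure, since weaker individuals are less likely to win a tournament.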
3.3.2 BLX-α Crossover
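Crossover is a mixing operator that combines the individual best
positions of randomly selected particles. For parent values u_1 and u_2
(u_1 < u_2), BLX-α crossover picks a new position y at random from the
space [e_1, e_2] as follows:
\[ y = \begin{cases} e_1 + r\,(e_2 - e_1), & \text{if } u_{\min} \le y \le u_{\max} \\ \text{repeat sampling}, & \text{otherwise} \end{cases} \tag{3.1} \]
where
\[ e_1 = u_1 - \alpha\,(u_2 - u_1) \tag{3.2} \]
\[ e_2 = u_2 + \alpha\,(u_2 - u_1) \tag{3.3} \]
and r is a uniform random number in [0, 1].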
Figure 3.2 illustrates the BLX-α crossover operation for the one
dimensional case. It is noted from Figure 3.2 that e_1 and e_2 will lie
between u_min and u_max, the variable's lower and upper bounds respectively. In a
number of trial runs, it i
Figure 3.2 BLX-α crossover
One interesting feature of this type of crossover operator is that the
created point depends on the location of both parents. If both parents are close
to each other, the new point will also be close to the parents. On the other
hand, if parents are far from each other, the search is more like a random search.
After the crossover, the fitness of the individual best position is
compared with that of the two offspring, and the best one is taken as the new
individual best position.
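The sampling rule of Equations (3.1) to (3.3) can be sketched for a single variable as follows. The function name `blx_alpha`, the default `alpha=0.5`, and the `max_tries` guard with its clipping fallback are assumptions added for illustration; in particular, the default value of α is not taken from this chapter.

```python
import random

def blx_alpha(u1, u2, u_min, u_max, alpha=0.5, rng=random, max_tries=1000):
    """Sample one offspring value by BLX-alpha crossover of parents u1 and u2."""
    lo, hi = min(u1, u2), max(u1, u2)       # order the parents so that lo <= hi
    e1 = lo - alpha * (hi - lo)             # Eq. (3.2)
    e2 = hi + alpha * (hi - lo)             # Eq. (3.3)
    y = lo
    for _ in range(max_tries):
        y = e1 + rng.random() * (e2 - e1)   # Eq. (3.1)
        if u_min <= y <= u_max:
            return y                        # accept a value inside the bounds
        # otherwise repeat the sampling
    # Assumed safeguard: clip if no valid sample was found.
    return min(max(y, u_min), u_max)

rng = random.Random(0)
print(blx_alpha(2.0, 3.0, 0.0, 10.0, rng=rng))   # parents close: child near them
print(blx_alpha(1.0, 9.0, 0.0, 10.0, rng=rng))   # parents far apart: wide search
```

When the parents coincide, the interval [e_1, e_2] collapses to a single point and the offspring equals the parents, which matches the behaviour described above.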
3.3.3 Non-uniform Mutation
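Mutation is a varying operator that randomly changes the values at
one or more positions of the selected particle. In non-uniform mutation, for
each chromosome \(X_i^{t} = (x_1, x_2, \ldots, x_m)\) in the population of the t-th iteration, an
offspring \(X_i^{t+1} = (x'_1, x'_2, \ldots, x'_m)\) is produced as below:
\[ x'_k = \begin{cases} x_k + \Delta(t,\, UB - x_k), & \text{if a random binary digit is } 0 \\ x_k - \Delta(t,\, x_k - LB), & \text{if a random binary digit is } 1 \end{cases} \tag{3.4} \]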
where LB and UB are the lower and upper bounds of the variable x_k. The
function \(\Delta(t, y)\) returns a value in the range [0, y] such that \(\Delta(t, y)\) approaches
zero as t increases. This property causes the operator to search the space
uniformly at initial stages (when t is small), and very locally at later stages.
Compared with a purely random choice, this strategy increases the probability of
generating a new value close to the current one. The function \(\Delta(t, y)\) is evaluated
as below:
\[ \Delta(t, y) = y \cdot \left(1 - r^{\left(1 - \frac{t}{T}\right)^{b}}\right) \tag{3.5} \]
where r is a random number from [0,1], T is the maximum number of iterations, and b is a
system parameter determining the degree of dependency on the iteration
number.
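Equations (3.4) and (3.5) can be sketched as follows. The function names, the default `b=2.0`, and the choice to mutate every gene of the chromosome are assumptions made for illustration only.

```python
import random

def delta(t, y, T, b, rng=random):
    """Eq. (3.5): a value in [0, y] that shrinks towards 0 as t approaches T."""
    r = rng.random()
    return y * (1.0 - r ** ((1.0 - t / T) ** b))

def non_uniform_mutate(x, lb, ub, t, T, b=2.0, rng=random):
    """Eq. (3.4) applied to every gene of chromosome x at iteration t."""
    child = []
    for xk, lo, hi in zip(x, lb, ub):
        if rng.random() < 0.5:
            # Random binary digit 0: move towards the upper bound.
            child.append(xk + delta(t, hi - xk, T, b, rng))
        else:
            # Random binary digit 1: move towards the lower bound.
            child.append(xk - delta(t, xk - lo, T, b, rng))
    return child

rng = random.Random(0)
parent, lb, ub = [2.0, 5.0, 8.0], [0.0, 0.0, 0.0], [10.0, 10.0, 10.0]
print(non_uniform_mutate(parent, lb, ub, t=1, T=100, rng=rng))    # early: large steps
print(non_uniform_mutate(parent, lb, ub, t=99, T=100, rng=rng))   # late: small steps
```

At t = T the exponent in Equation (3.5) becomes zero, so the step size is exactly zero and the chromosome is returned unchanged.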
3.4 RGA IMPLEMENTATION
When designing an FLBCS using any population-based stochastic
optimization technique, including RGA, the following issues must be
addressed:
Representation of solution variables.
Formulation of Fitness function.
3.4.1 Representation
The first important consideration for designing an optimal FLBCS
is the representation strategy to be followed for the solution variables namely
rule set and membership function. A fuzzy system is said to be completely
specified only when the rule set and the membership function associated with
each fuzzy set are represented as a single individual.
3.4.1.1 Representation of MF
To represent the membership function, the range of each input
variable is partitioned into areas and then it is associated with fuzzy sets. In
general three to seven partitions are appropriate to cover the required range of
an input variable. Once the names of the fuzzy sets are determined, then their
associated membership functions are to be considered. In general, the shape
of the membership function depends on the nature of the problem. Piecewise
linear functions such as triangular or trapezoidal functions are widely used
as membership functions in fuzzy system design because of their simplicity
and sensibility.
In this research work, Trapezoidal membership function is used for
starting and ending fuzzy regions and Triangular membership function is used
for intermediate fuzzy regions. After deciding the type of membership
function, the points required for placing them in the fuzzy space and their
ranges are computed as per the procedure illustrated using Figure 3.3.
[Figure 3.3: vertical axis from 0 to 1; horizontal axis "Variable Range" with membership points P1 to P9; fuzzy sets L, M and H]
Figure 3.3 Fuzzy Partitions
As shown in Figure 3.3, if the range of an input variable is
partitioned into three fuzzy sets namely, Low (L), Medium (M) and High (H),
then a total of nine membership points (P1, P2, P3, P4, P5, P6, P7, P8, P9) are
required for representing the input variable.
Of these nine points, the first and last (P1 and P9) are fixed at
the minimum and maximum of the input variable. The remaining seven
membership points are evolved within dynamic ranges such that P2 has
[P1,P9], P3 has [P2,P9], P4 has [P2,P3], P5 has [P4,P9], P6 has [P5,P9], P7 has
[P5,P6] and P8 has [P7,P9] as limits.
As an extension of the above method, if five fuzzy sets are used to
represent each variable, then a total of fifteen membership points (P1, P2, P3,
P4, P5, P6, P7, P8, P9, P10, P11, P12, P13, P14, P15) are required, with limits
defined as discussed for the three fuzzy sets.
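The dynamic ranges for the three-set case can be illustrated by drawing one valid set of points P1 to P9. In the sketch below the function name is hypothetical, and uniform sampling is used only to produce a feasible starting individual; in the proposed method these points are evolved by the RGA within the same limits.

```python
import random

def sample_membership_points(v_min, v_max, rng=random):
    """Draw P1..P9 for one input variable with three fuzzy sets (L, M, H)."""
    p = [None] * 10                      # index 0 is unused so that p[k] is Pk
    p[1], p[9] = v_min, v_max            # P1 and P9 are fixed end points
    p[2] = rng.uniform(p[1], p[9])       # P2 in [P1, P9]
    p[3] = rng.uniform(p[2], p[9])       # P3 in [P2, P9]
    p[4] = rng.uniform(p[2], p[3])       # P4 in [P2, P3]
    p[5] = rng.uniform(p[4], p[9])       # P5 in [P4, P9]
    p[6] = rng.uniform(p[5], p[9])       # P6 in [P5, P9]
    p[7] = rng.uniform(p[5], p[6])       # P7 in [P5, P6]
    p[8] = rng.uniform(p[7], p[9])       # P8 in [P7, P9]
    return p[1:]                         # list [P1, ..., P9]

print(sample_membership_points(0.0, 1.0, rng=random.Random(0)))
```

Because each point is drawn after the points that bound it, every returned list satisfies all seven limits by construction.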
3.4.1.2 Representation of Rule Set
Each rule in the rule set has three sections: rule selection (‘R’),
representation for the input variables (antecedent – I1, I2,…, In) and the
representation for the output classes (consequent – ‘O’). Rule selection
takes the value ‘1’ to select the rule and ‘0’ otherwise.
If three fuzzy sets (low, medium and high) are used, then each
antecedent part may take an integer value ranging from 0 to 3 such that ‘0’