Fuzzy Systems
Fuzzy Rule Generation
Rudolf Kruse, Christian Moewes
{kruse,cmoewes}@iws.cs.uni-magdeburg.de
Otto-von-Guericke University of Magdeburg
Faculty of Computer Science
Department of Knowledge Processing and Language Engineering
2010/01/27
Extensions avoiding Global Grids
• in high-dimensional X, global granulation leads to many rules
⇒ now, no global dependence on granulation
• individual membership functions for each rule
• better modeling of local properties
R1 : if x1 is µ1,1 and . . . and xp is µ1,p then . . .
. . .
Rr : if x1 is µr,1 and . . . and xp is µr,p then . . .
• not all attributes will be used for all rules
• individual choice of constraints on few attributes per rule
• better interpretability in high dimensions
• no exponential number of rules with increasing dimensionality
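To make the last point concrete, here is a minimal sketch of how the number of grid cells (and hence candidate rules) grows under global granulation. The function name `grid_rule_count` and the choice of 3 fuzzy sets per attribute are illustrative assumptions, not part of the lecture.

```python
def grid_rule_count(sets_per_attribute: int, num_attributes: int) -> int:
    """Number of grid cells (candidate rules) when every attribute
    is partitioned into the same number of fuzzy sets."""
    return sets_per_attribute ** num_attributes


# With 3 fuzzy sets per attribute the grid grows exponentially in p.
for p in (2, 5, 10):
    print(p, grid_rule_count(3, p))
```

A rule base with individual membership functions has no such fixed lower bound: its size depends on how many rules the data actually require.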
Local Granulation
[Figure: rules R1, R2, R3 in the x1–x2 plane, with their individual membership functions shown along the x1 and x2 axes]
• 3 rules in 2 dimensions
• compare with global granulation à la Wang & Mendel
• possible disadvantage
  • potential loss of interpretability
  • projection of all fuzzy sets onto one Xj is usually not meaningful
Berthold & Huber Algorithm
• constructs rule base with individual fuzzy sets per rule
• parameters that must be specified
  • granulation of Y, i.e., number and shape of membership functions
  • c fuzzy sets defined by µ^k_y with 1 ≤ k ≤ c
• algorithm iterates over S and fine-tunes evolving model
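• final rule base consists of fuzzy rules R^k_d, 1 ≤ d ≤ r_k
• r_k = number of rules for output region k
• output for the k-th region and some input x is computed as in a Mamdani controller (max-min):
$$\mu^k(x) = \max_{1 \le d \le r_k} \left\{ \min_{1 \le j \le p} \left\{ \mu^k_{d,j}(x_j) \right\} \right\}$$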
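The max-min formula above can be sketched in a few lines. This is only an illustration: the helper names (`rule_fulfillment`, `region_output`, `triangle`) and the triangular membership functions are assumptions chosen to keep the example short, whereas the algorithm itself uses trapezoids.

```python
from typing import Callable, Sequence

MembershipFn = Callable[[float], float]


def rule_fulfillment(rule: Sequence[MembershipFn], x: Sequence[float]) -> float:
    """Inner part of the formula: min over attributes j of mu_{d,j}(x_j)."""
    return min(mu(xj) for mu, xj in zip(rule, x))


def region_output(rules: Sequence[Sequence[MembershipFn]], x: Sequence[float]) -> float:
    """Outer part: max over all rules d of region k (0.0 if there are no rules)."""
    return max((rule_fulfillment(r, x) for r in rules), default=0.0)


def triangle(center: float, width: float) -> MembershipFn:
    """Hypothetical triangular membership function, used for illustration only."""
    return lambda v: max(0.0, 1.0 - abs(v - center) / width)


# Two rules for one output region k, each with its own fuzzy sets per attribute.
rules_k = [
    [triangle(0.0, 0.5), triangle(1.0, 0.5)],
    [triangle(0.5, 0.5), triangle(0.5, 0.25)],
]
print(region_output(rules_k, [0.25, 0.5]))  # prints 0.5
```

The first rule is not fulfilled at all for this input (its second condition yields 0), so the region output is determined by the second rule.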
Form of Rules
• all rules rely on trapezoidal membership functions
⇒ each rule can be described by 4 parameters per Xj
if x1 is 〈a1, b1, c1, d1〉 and . . . and xp is 〈ap, bp, cp, dp〉 then y is µ^k_y
• however, if a trapezoid covers the entire domain of some Xj, then the rule's degree of fulfillment is independent of this Xj
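A small sketch of this rule form follows. It assumes the usual trapezoid parametrization (support [a, d], core [b, c]) and, as a labeled assumption of this example, treats a flank that extends to infinity as having membership 1. The names `trapezoid`, `fulfillment` and `UNCONSTRAINED` are hypothetical.

```python
import math
from typing import Sequence, Tuple

Trapezoid = Tuple[float, float, float, float]  # (a, b, c, d) with a <= b <= c <= d

# Trapezoid whose core spans the whole real line: no constraint on the attribute.
UNCONSTRAINED: Trapezoid = (-math.inf, -math.inf, math.inf, math.inf)


def trapezoid(x: float, params: Trapezoid) -> float:
    """Membership of a finite value x in the trapezoid <a, b, c, d>."""
    a, b, c, d = params
    if b <= x <= c:          # inside the core
        return 1.0
    if x <= a or x >= d:     # outside the support
        return 0.0
    if x < b:                # rising flank (assumption: infinite flank counts as 1)
        return 1.0 if math.isinf(a) else (x - a) / (b - a)
    return 1.0 if math.isinf(d) else (d - x) / (d - c)  # falling flank


def fulfillment(rule: Sequence[Trapezoid], x: Sequence[float]) -> float:
    """Degree of fulfillment: minimum membership over all attributes."""
    return min(trapezoid(xj, t) for xj, t in zip(x, rule))


# Rule constrains x1 only; x2 is covered by an unconstrained trapezoid.
rule = [(0.0, 0.25, 0.5, 1.0), UNCONSTRAINED]
print(fulfillment(rule, (0.75, -100.0)), fulfillment(rule, (0.75, 42.0)))
```

Both calls give the same value, because the unconstrained trapezoid contributes membership 1 for every x2 and therefore never changes the minimum.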
Algorithm 1 Berthold & Huber
Input: S = {(x_i, c_i) | 1 ≤ i ≤ n} with x_i ∈ ℝ^p and c_i ∈ ℕ the class of x_i
do {
    for each training example (x, c) ∈ S {
        if correct rule of class c exists {
            increase weight by one                          // COVER
            adjust core region of rule to cover x
        } else {
            insert new rule with core equals x              // COMMIT
            support equals ∞ (i.e., rule is not constrained)
        }
        reduce support of all rules of conflicting class that cover x   // SHRINK
    }
} while changes have occurred
• COVER and COMMIT are easy to implement
• SHRINK is based on heuristics (e.g., volume-based)
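The loop above can be sketched with crisp core and support boxes. This is a simplified illustration under several assumptions that the slides do not state: each rule keeps the committing example as an `anchor`, weights and cores are rebuilt in every epoch, the first matching rule is used for COVER, and SHRINK cuts along the attribute where the conflicting example is farthest from the anchor. All names (`Rule`, `commit`, `shrink`, `train`) are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]


@dataclass
class Rule:
    label: int
    anchor: List[float]   # example that triggered COMMIT (assumption of this sketch)
    core_lo: List[float]
    core_hi: List[float]
    supp_lo: List[float]
    supp_hi: List[float]
    weight: int = 1

    def covers(self, x: Sequence[float]) -> bool:
        """True if x lies inside the open support box."""
        return all(lo < v < hi for lo, v, hi in zip(self.supp_lo, x, self.supp_hi))


def commit(x: Sequence[float], label: int) -> Rule:
    """New rule: core equals x, support is unconstrained."""
    p = len(x)
    return Rule(label, list(x), list(x), list(x), [-math.inf] * p, [math.inf] * p)


def shrink(rule: Rule, x: Sequence[float]) -> bool:
    """Cut the support so that x is excluded (simplified heuristic)."""
    j = max(range(len(x)), key=lambda k: abs(x[k] - rule.anchor[k]))
    if x[j] == rule.anchor[j]:
        return False  # x coincides with the anchor; conflict cannot be resolved
    if x[j] > rule.anchor[j]:
        rule.supp_hi[j] = x[j]
        rule.core_hi[j] = min(rule.core_hi[j], x[j])
    else:
        rule.supp_lo[j] = x[j]
        rule.core_lo[j] = max(rule.core_lo[j], x[j])
    return True


def train(data: Sequence[Example], max_epochs: int = 100) -> List[Rule]:
    rules: List[Rule] = []
    for _ in range(max_epochs):          # safeguard against endless looping
        changed = False
        for r in rules:                  # rebuild weights and cores in every epoch
            r.weight, r.core_lo, r.core_hi = 0, list(r.anchor), list(r.anchor)
        for x, c in data:
            match = next((r for r in rules if r.label == c and r.covers(x)), None)
            if match is not None:                           # COVER
                match.weight += 1
                match.core_lo = [min(a, b) for a, b in zip(match.core_lo, x)]
                match.core_hi = [max(a, b) for a, b in zip(match.core_hi, x)]
            else:                                           # COMMIT
                rules.append(commit(x, c))
                changed = True
            for r in rules:                                 # SHRINK
                if r.label != c and r.covers(x):
                    changed = shrink(r, x) or changed
        if not changed:
            break
    return rules


data = [((0.1, 0.1), 0), ((0.2, 0.2), 0), ((0.8, 0.9), 1), ((0.9, 0.8), 1)]
rules = train(data)
for r in rules:
    print(r.label, r.weight, r.core_lo, r.core_hi, r.supp_lo, r.supp_hi)
```

On this toy data set the sketch ends with one rule per class. In this sketch only COMMIT and SHRINK count as changes for the stopping criterion, since cores are regrown in every epoch anyway.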
Some Remarks on Core and Support Regions
• algorithm finds rule base that completely describes data
• each rule is a partial hypothesis for a subset S′ ⊂ S
• core = most specific hypothesis covering S′
• support = (one of the) most general hypotheses covering S′
⇒ support is more general than core
• core and support regions can be seen, respectively, as
  • smallest area with highest degree of confidence (evidence)
  • largest area without conflict (no counter-example)
Example: Berthold & Huber Algorithm
• given two-dimensional X and training data S where |Y| = 2
• task: fuzzy binary classification
• first, start with empty rule base for each region/class
• Java applet is available that demonstrates algorithm
Example: Berthold & Huber Algorithm (cont.)
• suppose that 2nd pattern is from different class
⇒ new rule is inserted for 2nd pattern
• also adjust 1st (conflicting) rule
Example: Berthold & Huber Algorithm (cont.)
• suppose that 3rd pattern is from same class as 2nd one
⇒ adjust free feature to avoid conflict with 3rd pattern
• and so on . . .
Choosing the Right Feature to Shrink
• in p-dimensional feature space, there are p choices
• algorithm uses several heuristics:
1. maximize remaining volume ⇒ low rule numbers, good coverage
2. minimize number of constrained attributes ⇒ feature reduction
3. minimize number of constraints on free features ⇒ interpretability
4. use information theoretic measures ⇒ feature importance
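Heuristic 1 can be sketched as follows, assuming a bounded domain so that all support intervals are finite. For each attribute the sketch computes the support volume that would remain after cutting at the conflicting example, and picks the attribute with the largest remainder. The function names and the interval representation are assumptions of this example.

```python
import math
from typing import List, Optional, Sequence, Tuple

Interval = Tuple[float, float]


def remaining_volume(support: Sequence[Interval], core: Sequence[Interval],
                     x: Sequence[float], j: int) -> Optional[float]:
    """Support volume after cutting attribute j at x[j].
    Returns None if x[j] lies inside the core, i.e. the cut would damage the core."""
    lo, hi = support[j]
    core_lo, core_hi = core[j]
    if x[j] < core_lo:
        new = (x[j], hi)      # keep the side that contains the core
    elif x[j] > core_hi:
        new = (lo, x[j])
    else:
        return None
    widths = [new[1] - new[0] if k == j else s[1] - s[0]
              for k, s in enumerate(support)]
    return math.prod(widths)


def best_feature_to_shrink(support: Sequence[Interval], core: Sequence[Interval],
                           x: Sequence[float]) -> Optional[int]:
    """Index of the attribute whose cut leaves the largest support volume."""
    volumes = {}
    for j in range(len(x)):
        v = remaining_volume(support, core, x, j)
        if v is not None:
            volumes[j] = v
    return max(volumes, key=volumes.get) if volumes else None


support: List[Interval] = [(0.0, 1.0), (0.0, 1.0)]
core: List[Interval] = [(0.25, 0.5), (0.25, 0.5)]
conflict = (0.75, 0.625)
print(best_feature_to_shrink(support, core, conflict))  # prints 0
```

Cutting x1 at 0.75 leaves a volume of 0.75, cutting x2 at 0.625 leaves 0.625, so the first attribute is chosen.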
Outline
1. Motivation
2. Extracting Grid-Based Fuzzy Rules
3. Extracting Individual Fuzzy Rules
4. Rule Generation by Fuzzy Clustering
   Extend Membership Values to Continuous Membership Functions
   Example: The Iris Data
   Information Loss from Projection
   Example: Transfer Passenger Analysis
5. Different Approaches
Rule Generation by Fuzzy Clustering
1. apply fuzzy clustering to X ⇒ fuzzy partition matrix U = [uij ]
2. use obtained U = [uij ] to define membership functions
• usually X is multidimensional
⇒ How to specify meaningful labels for multidimensional membership functions?
Extend uij to Continuous Membership Functions
• assigning labels for one-dimensional domains is easier ⇒
1. project U down to each of the axes X1, . . . , Xp
2. only consider upper envelope of membership degrees
3. linearly interpolate membership values ⇒ membership functions
4. cylindrically extend membership functions
• original clusters are interpreted as conjunction of cylindrical extensions
• e.g., cylindrical extensions “x1 is low”, “x2 is high”
⇒ multidimensional cluster label “x1 is low and x2 is high”
• labeled clusters = classes characterized by labels
• every cluster = one fuzzy rule
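The projection steps can be sketched for one cluster and one attribute as follows. This is one simple reading of the procedure, with labeled assumptions: the upper envelope is taken as the maximum membership per distinct projected value, and outside the observed range the boundary value is held constant. All function names are hypothetical.

```python
from bisect import bisect_right
from typing import Callable, Dict, List, Sequence, Tuple

Point = Tuple[float, float]  # (projected value, membership degree)


def upper_envelope(values: Sequence[float], memberships: Sequence[float]) -> List[Point]:
    """Keep the largest membership degree per distinct projected value."""
    best: Dict[float, float] = {}
    for v, u in zip(values, memberships):
        best[v] = max(u, best.get(v, 0.0))
    return sorted(best.items())


def interpolate(points: Sequence[Point]) -> Callable[[float], float]:
    """Piecewise linear membership function through the sorted points."""
    xs = [p for p, _ in points]
    us = [u for _, u in points]

    def mu(x: float) -> float:
        if x <= xs[0]:
            return us[0]          # assumption: hold the boundary value
        if x >= xs[-1]:
            return us[-1]
        k = bisect_right(xs, x)   # xs[k - 1] <= x < xs[k]
        t = (x - xs[k - 1]) / (xs[k] - xs[k - 1])
        return us[k - 1] + t * (us[k] - us[k - 1])

    return mu


def cylindrical_extension(mu: Callable[[float], float], j: int) -> Callable[[Sequence[float]], float]:
    """Extend a one-dimensional membership function to the full space
    by ignoring all attributes except attribute j."""
    return lambda x: mu(x[j])


# Projections of four data points onto X1 and their memberships to one cluster.
values = [1.0, 2.0, 2.0, 4.0]
memberships = [0.0, 0.5, 1.0, 0.0]
envelope = upper_envelope(values, memberships)
mu_x1 = interpolate(envelope)
x1_is_cluster = cylindrical_extension(mu_x1, 0)
print(envelope, mu_x1(3.0), x1_is_cluster((3.0, 99.0)))
```

Taking the minimum over the cylindrical extensions of all attributes then gives the conjunction that labels the original cluster.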
Convex Completion [Höppner et al., 1999]
• problem of this approach: non-convex fuzzy sets
⇒ having upper envelope, compute convex completion
• we denote by p_1, . . . , p_n, with p_1 ≤ . . . ≤ p_n, the ordered projections of x_1, . . . , x_n and by µ_{i1}, . . . , µ_{in} the respective membership values
• eliminate each point (p_t, µ_{it}), t = 1, . . . , n, for which two limit indices t_l, t_r ∈ {1, . . . , n}, t_l < t < t_r, exist s.t.
$$\mu_{it} < \min\{\mu_{it_l}, \mu_{it_r}\}$$
• after that: apply linear interpolation of remaining points
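The elimination rule can be sketched directly: a point is dropped exactly when some point to its left and some point to its right both have a strictly larger membership, which is equivalent to comparing against the running maxima from both sides. The function name `convex_completion` and the sample values are assumptions of this example.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (projection p_t, membership mu_t), sorted by p_t


def convex_completion(points: Sequence[Point]) -> List[Point]:
    """Remove every point whose membership is smaller than both the largest
    membership to its left and the largest membership to its right."""
    n = len(points)
    mus = [m for _, m in points]

    left_max = [-math.inf] * n    # largest membership strictly left of t
    for t in range(1, n):
        left_max[t] = max(left_max[t - 1], mus[t - 1])

    right_max = [-math.inf] * n   # largest membership strictly right of t
    for t in range(n - 2, -1, -1):
        right_max[t] = max(right_max[t + 1], mus[t + 1])

    return [points[t] for t in range(n)
            if not mus[t] < min(left_max[t], right_max[t])]


points = [(1.0, 0.2), (2.0, 0.8), (3.0, 0.4), (4.0, 0.9), (5.0, 0.3)]
print(convex_completion(points))
# prints [(1.0, 0.2), (2.0, 0.8), (4.0, 0.9), (5.0, 0.3)]
```

Only the dip at p = 3 is removed. Linear interpolation of the remaining points then rises to a single peak and falls again, so the resulting fuzzy set is convex.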
Example: The Iris Data
• collected by Ronald Aylmer Fisher (famous statistician)
• 150 cases in total, 50 cases per Iris flower type
• measurements: sepal length/width, petal length/width (in cm)
• most famous dataset in pattern recognition and data analysis
Example: Transfer Passenger Analysis
• rules 1 and 5: aircraft with a relatively small maximum number of passengers (80-200), a short- to medium-haul destination, and a departure late at night usually carry a high share of transfer passengers (80-90%)
• rule 2: flights with a medium-haul destination and a small aircraft (about 150 passengers), starting about noon, carry a relatively high share of transfer passengers (ca. 70%)
Outline
1. Motivation
2. Extracting Grid-Based Fuzzy Rules
3. Extracting Individual Fuzzy Rules
4. Rule Generation by Fuzzy Clustering
5. Different Approaches
Different Approaches
• constructive: find fuzzy rules by growing singletons
• hierarchical: merge grid cells if no points are covered or same class is predicted
• adaptive:
  • initialize rules (e.g., randomly or with expert knowledge) and iteratively optimize rule parameters (e.g., location, number of fuzzy sets)
  • based on, e.g., gradient descent, neural networks, . . .
• evolutionary: find rules by mutation/crossover over generations
• neuro-fuzzy: inject fuzzy rules into ANN, use its learning algorithm
Literature about Fuzzy Rule Generation
Berthold, M. R. and Hand, D. J., editors (2003). Intelligent Data Analysis: An Introduction. Springer, Berlin, Germany, 2nd edition.
Borgelt, C., Klawonn, F., Kruse, R., and Nauck, D. (2003). Neuro-Fuzzy-Systeme: Von den Grundlagen künstlicher Neuronaler Netze zur Kopplung mit Fuzzy-Systemen. Vieweg, Wiesbaden, Germany, 3rd edition.
Höppner, F., Klawonn, F., Kruse, R., and Runkler, T. (1999). Fuzzy Cluster Analysis: Methods for Classification, Data Analysis and Image Recognition. John Wiley & Sons Ltd, New York, NY, USA.
Keller, A. and Kruse, R. (2002). Fuzzy rule generation for transfer passenger analysis. In Wang, L., Halgamuge, S. K., and Yao, X., editors, Proceedings of the 1st International Conference on Fuzzy Systems and Knowledge Discovery (FSKD’02), pages 667–671, Orchid Country Club, Singapore.