Fuzzy association rules pre final
Post on 27-Jun-2015
Fuzzy Association Rules
Aswin – 215 111 074
Deepak – 215 111 049
Content
• Introduction
  – Data Mining
  – Association Rules
  – Fuzzy logic
  – Applications
• Procedure
  – Support and confidence
  – Steps
• Example: Risk analysis
  – Fundamentals: Conditional probability
  – Example analysis
Data Mining
Data -> Information -> Knowledge
• KDD (Knowledge Discovery in Databases)
• Extraction of knowledge from huge amounts of data
• This knowledge is
  – implicit,
  – previously unknown,
  – and potentially useful
Association rules
• Item sets (Z:C)
• Antecedent (X:A)
• Consequent (Y:B)
• Ex: If X is A, then Y is B
Application
• Strategic Decision Making
• Marketing Strategy formulation
• Predictive analytics:
  – CRM
  – Machine Maintenance
  – Employee Relations
• Artificial Intelligence : Video games, Robots
• Machines: Air conditioner, Washing machines, ABS
What is likely to happen, if so...
IF – THEN
Antecedent – Consequent
Procedure
• Two factors: Support and Confidence
Trans_ID  Bread  Butter  Biscuit  Milk
1         1      1       0        1
2         0      1       1        0
3         1      0       1        0
4         1      1       0        1
5         1      1       1        0
6         1      1       1        1
Procedure
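• Hypothesis: Customers who buy bread and butter also buy milk.
• Support = Desired Outcomes / Total Opportunities
  – Desired outcomes: transactions containing bread, butter and milk (3)
  – Total opportunities: all transactions (6)
• Support = 3/6 = 0.5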
Procedure
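• Hypothesis: Customers who buy bread and butter also buy milk.
• Confidence = Desired Outcomes / Desired Opportunities
  – Desired outcomes: transactions containing bread, butter and milk (3)
  – Desired opportunities: transactions containing bread and butter (4)
• Confidence = 3/4 = 0.75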
Inference
• Hypothesis becomes Rule: Customers who buy bread and butter also buy milk.
• With 75% confidence and 50% support from past transaction records
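The support and confidence figures for the bread-and-butter rule can be checked with a short script. This is a minimal sketch: the transactions are copied from the slide's table, while the helper names `count`, `support` and `confidence` are illustrative and not part of the original deck.

```python
# Transactions copied from the slide's table; each set holds the items bought.
transactions = [
    {"Bread", "Butter", "Milk"},             # Trans_ID 1
    {"Butter", "Biscuit"},                   # Trans_ID 2
    {"Bread", "Biscuit"},                    # Trans_ID 3
    {"Bread", "Butter", "Milk"},             # Trans_ID 4
    {"Bread", "Butter", "Biscuit"},          # Trans_ID 5
    {"Bread", "Butter", "Biscuit", "Milk"},  # Trans_ID 6
]


def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for bought in transactions if itemset <= bought)


def support(antecedent, consequent):
    """Desired outcomes divided by total opportunities."""
    return count(antecedent | consequent) / len(transactions)


def confidence(antecedent, consequent):
    """Desired outcomes divided by desired opportunities."""
    return count(antecedent | consequent) / count(antecedent)


rule_support = support({"Bread", "Butter"}, {"Milk"})
rule_confidence = confidence({"Bread", "Butter"}, {"Milk"})
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

Working from counts rather than from two support fractions keeps the arithmetic identical to the 3/6 and 3/4 shown on the slides.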
Procedure
• Fuzzification
  – Continuous to Discrete data
• Analysis
  – Threshold
• Defuzzification
  – State rule with confidence and support
Risk analysis for issuing loan
Bank customer Data set
RISK = ASSETS – DEBT – WANTS
Case  Age  Income  Risk     Credit  Result
1     20   52,623  -38,954  red     0
2     26   23,047  -23,636  green   1
3     46   56,810  45,669   green   1
4     31   38,388  -7,968   amber   1
5     28   80,019  -35,125  green   1
6     21   74,561  -47,592  green   1
7     46   65,341  58,119   green   1
8     25   46,504  -30,022  green   1
9     38   65,735  30,571   green   1
10    27   26,047  -6       red     1
Bank’s weight for each attribute and condition for analysis
Attribute  Weight
Credit     0.800
Risk       0.700
Income     0.550
Age        0.450
Result     0.691

Objective: Provide a confidence/risk factor for the bank to issue loans to its customers

Function            Percentage
Minimum Support     25
Minimum Confidence  90
Membership Function
Fuzzification
Attribute  Level    Representation  Weight  Membership value  Support (Rjk)
Age        Young    R11             0.450   0.580             0.261
Age        Middle   R12             0.450   0.300             0.135
Age        Old      R13             0.450   0.131             0.059
Income     High     R21             0.550   0.000             0.000
Income     Middle   R22             0.550   0.890             0.490
Income     Low      R23             0.550   0.109             0.060
Risk       High     R31             0.700   0.457             0.320
Risk       Middle   R32             0.700   0.208             0.146
Risk       Low      R33             0.700   0.332             0.233
Credit     Good     R41             0.800   0.720             0.576
Credit     Bad      R42             0.800   0.280             0.224
Result     On Time  R51             0.691   0.930             0.643
Result     Default  R52             0.691   0.069             0.048
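The Support (Rjk) column appears to be the attribute weight multiplied by the membership value. The slides do not state this formula; it is inferred from the numbers, so the sketch below should be read as a consistency check under that assumption. The dictionary names are illustrative.

```python
# Attribute weights and membership values copied from the slides.
weights = {"Age": 0.450, "Income": 0.550, "Risk": 0.700,
           "Credit": 0.800, "Result": 0.691}

membership = {
    "R11": ("Age", 0.580), "R12": ("Age", 0.300), "R13": ("Age", 0.131),
    "R21": ("Income", 0.000), "R22": ("Income", 0.890), "R23": ("Income", 0.109),
    "R31": ("Risk", 0.457), "R32": ("Risk", 0.208), "R33": ("Risk", 0.332),
    "R41": ("Credit", 0.720), "R42": ("Credit", 0.280),
    "R51": ("Result", 0.930), "R52": ("Result", 0.069),
}

# Support column as printed on the slide, used only for comparison.
slide_support = {
    "R11": 0.261, "R12": 0.135, "R13": 0.059, "R21": 0.000, "R22": 0.490,
    "R23": 0.060, "R31": 0.320, "R32": 0.146, "R33": 0.233, "R41": 0.576,
    "R42": 0.224, "R51": 0.643, "R52": 0.048,
}

# Assumed formula: weighted support = attribute weight * membership value.
weighted_support = {
    label: weights[attribute] * value
    for label, (attribute, value) in membership.items()
}

for label, value in weighted_support.items():
    print(f"{label}: computed {value:.3f}, slide {slide_support[label]:.3f}")
```

Every computed value agrees with the slide to within 0.001; the small gaps (for example R33) look like rounding in the original table.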
Item set
• C = complete sets, individual items
• L = set of items above minimum support, grouped items
• minsupp = 0.25
• Conditional probability = support
Apriori Algorithm
START
-> Compute conditional probability (support) of each element in SET C
-> Eliminate items < minsupp to form SET L
-> Is L empty?
   – NO: nCr to form new SET C, then repeat
   – YES: STOP
C1 -> L1 -> C2

C1   Support
R11  0.261
R12  0.135
R13  0.059
R21  0.000
R22  0.490
R23  0.060
R31  0.320
R32  0.146
R33  0.233
R41  0.576
R42  0.224
R51  0.643
R52  0.048

L1
R11
R22
R31
R41
R51

C2
(R11, R22)
(R11, R31)
(R11, R41)
(R11, R51)
(R22, R31)
(R22, R41)
(R22, R51)
(R31, R41)
(R31, R51)
(R41, R51)
C2 -> L2 -> C3

C2          Support
(R11, R22)  0.235
(R11, R31)  0.207
(R11, R41)  0.212
(R11, R51)  0.230
(R22, R31)  0.237
(R22, R41)  0.419
(R22, R51)  0.449
(R31, R41)  0.266
(R31, R51)  0.264
(R41, R51)  0.560

L2
(R22, R41)
(R22, R51)
(R31, R41)
(R31, R51)
(R41, R51)

C3
(R22, R41, R51)
(R22, R31, R41)
(R22, R51, R31)
(R31, R41, R51)
C3 -> L3 -> C4

C3               Support
(R22, R41, R51)  0.417
(R22, R31, R41)  0.198
(R22, R51, R31)  0.196
(R31, R41, R51)  0.264

L3
(R22, R41, R51)
(R31, R41, R51)

C4
(R22, R31, R41, R51)
C4 -> L4

C4                    Support
(R22, R31, R41, R51)  0.1957

L4
(empty, since 0.1957 < minsupp = 0.25)
STOP
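The C -> L -> C iterations can be replayed in code. The sketch below looks supports up from the tables in the slides rather than recomputing them, because the deck does not show how supports of grouped items are derived. New candidates are formed the way the slides do it, as nCr combinations of the items that survive in L, which is looser than the textbook Apriori join-and-prune step. The names `key`, `SUPPORT` and `level_wise_search` are illustrative.

```python
from itertools import combinations

MINSUPP = 0.25


def key(*labels):
    """Order-independent key for an item set."""
    return frozenset(labels)


# Supports copied from the C1, C2, C3 and C4 tables in the slides.
SUPPORT = {
    key("R11"): 0.261, key("R12"): 0.135, key("R13"): 0.059,
    key("R21"): 0.000, key("R22"): 0.490, key("R23"): 0.060,
    key("R31"): 0.320, key("R32"): 0.146, key("R33"): 0.233,
    key("R41"): 0.576, key("R42"): 0.224,
    key("R51"): 0.643, key("R52"): 0.048,
    key("R11", "R22"): 0.235, key("R11", "R31"): 0.207,
    key("R11", "R41"): 0.212, key("R11", "R51"): 0.230,
    key("R22", "R31"): 0.237, key("R22", "R41"): 0.419,
    key("R22", "R51"): 0.449, key("R31", "R41"): 0.266,
    key("R31", "R51"): 0.264, key("R41", "R51"): 0.560,
    key("R22", "R41", "R51"): 0.417, key("R22", "R31", "R41"): 0.198,
    key("R22", "R51", "R31"): 0.196, key("R31", "R41", "R51"): 0.264,
    key("R22", "R31", "R41", "R51"): 0.1957,
}


def level_wise_search(items, support, minsupp):
    """Return [L1, L2, ...], stopping when a level has no surviving sets."""
    levels = []
    size = 1
    candidates = [key(item) for item in items]
    while candidates:
        frequent = [c for c in candidates if support[c] >= minsupp]
        if not frequent:
            break  # L is empty: STOP
        levels.append(frequent)
        # nCr over the items that still appear in L, as in the slides.
        survivors = sorted(set().union(*frequent))
        size += 1
        candidates = [key(*combo) for combo in combinations(survivors, size)]
    return levels


singles = sorted(label for itemset in SUPPORT if len(itemset) == 1
                 for label in itemset)
levels = level_wise_search(singles, SUPPORT, MINSUPP)
for k, frequent in enumerate(levels, start=1):
    print(f"L{k}:", [tuple(sorted(itemset)) for itemset in frequent])
```

With the slide's numbers the loop produces three non-empty levels and stops at C4, matching the walk-through.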
Possible Associations
Items from L3    Associations
(R22, R41, R51)  R22, R41->R51
                 R22, R51->R41
                 R51, R41->R22
(R31, R41, R51)  R31, R41->R51
                 R31, R51->R41
                 R51, R41->R31

Items from L2    Associations
(R22, R41)       R22->R41
                 R41->R22
(R22, R51)       R22->R51
                 R51->R22
(R31, R41)       R31->R41
                 R41->R31
(R31, R51)       R31->R51
                 R51->R31
(R41, R51)       R41->R51
                 R51->R41
Confidence of each association
Items from L3    Associations   Confidence
(R22, R41, R51)  R22, R41->R51  0.995
                 R22, R51->R41  0.928
                 R51, R41->R22  0.744
(R31, R41, R51)  R31, R41->R51  0.993
                 R31, R51->R41  1.000
                 R51, R41->R31  0.472

Items from L2    Associations   Confidence
(R22, R41)       R22->R41       0.855
                 R41->R22       0.727
(R22, R51)       R22->R51       0.916
                 R51->R22       0.697
(R31, R41)       R31->R41       0.831
                 R41->R31       0.462
(R31, R51)       R31->R51       0.825
                 R51->R31       0.410
(R41, R51)       R41->R51       0.972
                 R51->R41       0.870
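The confidence values are consistent with the usual definition, confidence(X -> Y) = support(X and Y together) / support(X), applied to the supports of the L1, L2 and L3 item sets. The slides do not spell the formula out, so treat this as a sketch under that assumption; the results match the table to within rounding (about 0.001). The names `key`, `SUPPORT` and `rules_from` are illustrative.

```python
def key(*labels):
    """Order-independent key for an item set."""
    return frozenset(labels)


# Supports of the surviving item sets, copied from the slides.
SUPPORT = {
    key("R22"): 0.490, key("R31"): 0.320,
    key("R41"): 0.576, key("R51"): 0.643,
    key("R22", "R41"): 0.419, key("R22", "R51"): 0.449,
    key("R31", "R41"): 0.266, key("R31", "R51"): 0.264,
    key("R41", "R51"): 0.560,
    key("R22", "R41", "R51"): 0.417, key("R31", "R41", "R51"): 0.264,
}


def rules_from(itemset, support):
    """Yield (antecedent, consequent, confidence) with one item as consequent."""
    for item in sorted(itemset):
        consequent = key(item)
        antecedent = itemset - consequent
        yield antecedent, consequent, support[itemset] / support[antecedent]


confidence = {}
for itemset in SUPPORT:
    if len(itemset) < 2:
        continue  # a rule needs at least one antecedent and one consequent
    for antecedent, consequent, value in rules_from(itemset, SUPPORT):
        confidence[(antecedent, consequent)] = value
        left = ", ".join(sorted(antecedent))
        right = ", ".join(sorted(consequent))
        print(f"{left}->{right}  {value:.3f}")
```

Each pair in L2 yields two rules and each triple in L3 yields three, giving the 16 associations listed in the table.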
Associations meeting minconf
Minimum confidence = 0.90

Items from L3    Associations   Confidence
(R22, R41, R51)  R22, R41->R51  0.995
                 R22, R51->R41  0.928
(R31, R41, R51)  R31, R41->R51  0.993
                 R31, R51->R41  1.000

Items from L2    Associations   Confidence
(R22, R51)       R22->R51       0.916
(R41, R51)       R41->R51       0.972
Confident Associations that meet the objective of the analysis
Objective: rules that predict the loan Result (consequent R51, On Time) with confidence of at least 0.90

Items from L3    Associations   Confidence
(R22, R41, R51)  R22, R41->R51  0.995
(R31, R41, R51)  R31, R41->R51  0.993

Items from L2    Associations   Confidence
(R22, R51)       R22->R51       0.916
(R41, R51)       R41->R51       0.972
Defuzzification
• If Income is middle, then payment will be received on time. R22->R51 (91.6%)
• If Credit is good, then payment will be received on time. R41->R51 (97.2%)
• If Income is middle and Credit is good, then payment will be received on time. R41, R22->R51 (99.5%)
• If Risk is high and Credit is good, then payment will be received on time. R31, R41->R51 (99.25%)
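The final selection can be sketched as a filter followed by a translation back into words. The sketch keeps rules whose consequent is R51 (Result: On Time) and whose confidence is at least the 90% minimum, then renders them with the level names from the fuzzification table. The confidences are the rounded values from the slides, so the last rule prints as 99.3% rather than 99.25%. `LABELS`, `RULES` and `defuzzify` are illustrative names.

```python
MINCONF = 0.90
TARGET = "R51"  # Result: On Time

# Wording for each fuzzy level that appears in the surviving rules.
LABELS = {
    "R22": "Income is middle",
    "R31": "Risk is high",
    "R41": "Credit is good",
    "R51": "payment will be received on time",
}

# (antecedent, consequent, confidence) copied from the confidence table.
RULES = [
    (("R22", "R41"), "R51", 0.995), (("R22", "R51"), "R41", 0.928),
    (("R51", "R41"), "R22", 0.744), (("R31", "R41"), "R51", 0.993),
    (("R31", "R51"), "R41", 1.000), (("R51", "R41"), "R31", 0.472),
    (("R22",), "R41", 0.855), (("R41",), "R22", 0.727),
    (("R22",), "R51", 0.916), (("R51",), "R22", 0.697),
    (("R31",), "R41", 0.831), (("R41",), "R31", 0.462),
    (("R31",), "R51", 0.825), (("R51",), "R31", 0.410),
    (("R41",), "R51", 0.972), (("R51",), "R41", 0.870),
]


def defuzzify(rules, minconf, target):
    """Turn confident rules about the target into plain-language statements."""
    sentences = []
    for antecedent, consequent, conf in rules:
        if consequent == target and conf >= minconf:
            condition = " and ".join(LABELS[label] for label in antecedent)
            sentences.append(
                f"If {condition}, then {LABELS[consequent]} ({conf:.1%})"
            )
    return sentences


sentences = defuzzify(RULES, MINCONF, TARGET)
for sentence in sentences:
    print(sentence)
```

Four rules survive both filters, the same four stated on the Defuzzification slide.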
Conclusion
References
Thank you