Jagdeep Singh HYBRID TECHNIQUE FOR ASSOCIATIVE CLASSIFICATION: A NOVEL APPROACH

Associative Classification: Synopsis

Sep 11, 2014

Page 1: Associative Classification: Synopsis

Jagdeep Singh

HYBRID TECHNIQUE FOR ASSOCIATIVE CLASSIFICATION: A NOVEL APPROACH

Page 2: Associative Classification: Synopsis

Table of Contents

Ø  Introduction
    Ø  Data Mining Process
    Ø  Classification
    Ø  Association
Ø  Motivation
Ø  Literature Survey
Ø  Problem Formulation
Ø  Objectives
Ø  Methodology
Ø  Facilities Required
Ø  References

Page 3: Associative Classification: Synopsis

Data Mining

Data mining is the computational process of finding patterns in large data sets, using methods at the intersection of machine learning, artificial intelligence, statistics, and database systems. The main goal of the data mining process is to extract information from the data and convert it into an understandable structure for further use.

Page 4: Associative Classification: Synopsis

Data Mining Process

Figure 1: The Data Mining Process [10]

Page 5: Associative Classification: Synopsis

Classification

Classification is the problem of identifying to which of a set of categories a new observation belongs, on the basis of a training set of data containing observations (or instances) whose category membership is known.

Page 6: Associative Classification: Synopsis

Association

Association rule learning is a method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using different measures of interestingness.

For example, the rule {onions, potatoes} => {burger} states that a customer who buys onions and potatoes together is also likely to buy a burger.
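Two common measures of interestingness are support and confidence. The sketch below computes both for the rule {onions, potatoes} => {burger}. The transactions are hypothetical and exist only for illustration; they are not taken from the slides.

```python
# Hypothetical market-basket transactions (illustrative only, not from the slides).
TRANSACTIONS = [
    {"onions", "potatoes", "burger"},
    {"onions", "potatoes", "burger", "beer"},
    {"onions", "potatoes"},
    {"potatoes", "beer"},
    {"burger", "beer"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent).

    Assumes the antecedent occurs in at least one transaction.
    """
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


rule_if = {"onions", "potatoes"}
rule_then = {"burger"}
print(support(rule_if | rule_then, TRANSACTIONS))               # 0.4
print(round(confidence(rule_if, rule_then, TRANSACTIONS), 2))   # 0.67
```

A rule is usually called strong when both values exceed user-chosen minimum thresholds.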

Page 7: Associative Classification: Synopsis

Example : The Weather Problem

ID  outlook   temperature  humidity  windy  play
1   sunny     hot          high      false  no
2   sunny     hot          high      true   no
3   overcast  hot          high      false  yes
4   rainy     mild         high      false  yes
5   rainy     cool         normal    false  yes
6   rainy     cool         normal    true   no
7   overcast  cool         normal    true   yes
8   sunny     mild         high      false  no
9   sunny     cool         normal    false  yes
10  rainy     mild         normal    false  yes
11  sunny     mild         normal    true   yes
12  overcast  mild         high      true   yes
13  overcast  hot          normal    false  yes
14  rainy     mild         high      true   no

Page 8: Associative Classification: Synopsis

Association rules for: Weather Problem

 1. humidity=normal windy=FALSE (4) ==> play=yes (4)
 2. temperature=cool (4) ==> humidity=normal (4)
 3. outlook=overcast (4) ==> play=yes (4)
 4. temperature=cool play=yes (3) ==> humidity=normal (3)
 5. outlook=rainy windy=FALSE (3) ==> play=yes (3)
 6. outlook=rainy play=yes (3) ==> windy=FALSE (3)
 7. outlook=sunny humidity=high (3) ==> play=no (3)
 8. outlook=sunny play=no (3) ==> humidity=high (3)
 9. temperature=cool windy=FALSE (2) ==> humidity=normal play=yes (2)
10. temperature=cool humidity=normal windy=FALSE (2) ==> play=yes (2)
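The counts in parentheses can be checked against the weather table. The sketch below reads the first number as the rows matching the left-hand side and the second as the rows matching both sides. That reading is an assumption, and the helper names `count` and `rule_counts` are illustrative.

```python
# Weather data from the table in this deck, one row per line, in column order.
COLUMNS = ("outlook", "temperature", "humidity", "windy", "play")
RAW = """\
sunny hot high false no
sunny hot high true no
overcast hot high false yes
rainy mild high false yes
rainy cool normal false yes
rainy cool normal true no
overcast cool normal true yes
sunny mild high false no
sunny cool normal false yes
rainy mild normal false yes
sunny mild normal true yes
overcast mild high true yes
overcast hot normal false yes
rainy mild high true no
"""
DATA = [dict(zip(COLUMNS, line.split())) for line in RAW.splitlines()]


def count(items):
    """Number of rows that satisfy every attribute=value pair in items."""
    return sum(all(row[a] == v for a, v in items.items()) for row in DATA)


def rule_counts(antecedent, consequent):
    """Return (rows matching the antecedent, rows matching antecedent and consequent)."""
    return count(antecedent), count({**antecedent, **consequent})


# Rule 1 and rule 7 from the list above.
print(rule_counts({"humidity": "normal", "windy": "false"}, {"play": "yes"}))  # (4, 4)
print(rule_counts({"outlook": "sunny", "humidity": "high"}, {"play": "no"}))   # (3, 3)
```

When both counts are equal, as they are for every rule listed, the rule holds for all rows it covers, so its confidence is 1.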

Page 9: Associative Classification: Synopsis

Result: prediction for a new instance?

Outlook  Temp.  Humidity  Wind  Play
Sunny    Cool   High      True  ?
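One way to answer this question is to apply the rules from the weather example that predict `play` (rules 1, 3, 5, 7 and 10) and take the class of the first rule whose conditions all match. The slides do not say how the prediction is made, so the first-match strategy and the `predict` helper below are assumptions for illustration.

```python
# Rules from the weather example whose right-hand side is only the class "play",
# kept in the order they are listed. Each entry: (conditions, predicted class).
CARS = [
    ({"humidity": "normal", "windy": "false"}, "yes"),                          # rule 1
    ({"outlook": "overcast"}, "yes"),                                           # rule 3
    ({"outlook": "rainy", "windy": "false"}, "yes"),                            # rule 5
    ({"outlook": "sunny", "humidity": "high"}, "no"),                           # rule 7
    ({"temperature": "cool", "humidity": "normal", "windy": "false"}, "yes"),   # rule 10
]


def predict(instance, rules, default=None):
    """Return the class of the first rule whose conditions all hold for instance."""
    for conditions, label in rules:
        if all(instance.get(attr) == value for attr, value in conditions.items()):
            return label
    return default  # no rule fires


new_day = {"outlook": "sunny", "temperature": "cool", "humidity": "high", "windy": "true"}
print(predict(new_day, CARS))  # no
```

Only rule 7 (outlook=sunny, humidity=high) matches this instance, so under these assumptions the predicted class is play=no.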

Page 10: Associative Classification: Synopsis

Literature Survey

Ø  Liao et al. [8] review data mining techniques and their applications through a survey of the literature from 2000 to 2011. The paper surveys three areas of data mining research: knowledge types, analysis types, and architecture types. Its discussion deals with future progress in social science and engineering methodologies that implement data mining techniques, and with the development of applications in problem-oriented domains.

Ø  One of the earliest association rule mining algorithms is the Apriori algorithm [3], developed by Agrawal and Srikant. In each pass, Apriori generates the candidate itemsets using only the itemsets found to have large support in the previous pass, without considering the transactions in the database.
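The candidate generation step can be sketched as follows. This is a minimal illustration, not the code from [3]: the join here unions any two frequent (k-1)-itemsets that differ in one item, which is simpler than a prefix-based join and gives the same candidates after pruning. The function name `apriori_gen` is illustrative.

```python
from itertools import combinations


def apriori_gen(frequent_prev):
    """Build candidate k-itemsets from the frequent (k-1)-itemsets only.

    No transactions are read here; the database is scanned afterwards
    to count the support of the surviving candidates.
    """
    prev = {frozenset(s) for s in frequent_prev}
    if not prev:
        return set()
    k = len(next(iter(prev))) + 1
    # Join step: union pairs of (k-1)-itemsets that differ in exactly one item.
    candidates = {a | b for a in prev for b in prev if len(a | b) == k}
    # Prune step: drop a candidate if any of its (k-1)-subsets is not frequent.
    return {
        c for c in candidates
        if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
    }


frequent_3 = [{1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {1, 3, 5}, {2, 3, 4}]
print(apriori_gen(frequent_3))  # {frozenset({1, 2, 3, 4})}
```

In this example {1, 3, 4, 5} is pruned because its subset {1, 4, 5} is not frequent, so only {1, 2, 3, 4} needs to be counted in the next database pass.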

Page 11: Associative Classification: Synopsis

Literature Survey (continued)

Ø  Kwon et al. [9] evaluated which data set features most affect the performance of classification algorithms. Finding out which algorithm is most effective for which data set is a complex problem. The authors' research experimentally examines how data set characteristics affect algorithm performance in terms of elapsed time and accuracy.

Ø  B. Liu et al. [2] presented associative classification, which integrates classification rule mining and association rule mining. The integration is done by focusing on mining a special subset of association rules whose consequent parts are restricted to the classification class labels, called Class Association Rules (CARs).
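As a small illustration of that restriction, the sketch below encodes the ten rules from the weather example and keeps only those whose consequent is the class attribute `play` alone. The encoding and the `is_car` helper are assumptions made for illustration. Real associative classifiers such as the one in [2] mine CARs directly rather than filtering general rules afterwards.

```python
CLASS_ATTRIBUTE = "play"

# The ten weather rules as (antecedent, consequent) pairs, in their listed order.
RULES = [
    ({"humidity": "normal", "windy": "false"}, {"play": "yes"}),
    ({"temperature": "cool"}, {"humidity": "normal"}),
    ({"outlook": "overcast"}, {"play": "yes"}),
    ({"temperature": "cool", "play": "yes"}, {"humidity": "normal"}),
    ({"outlook": "rainy", "windy": "false"}, {"play": "yes"}),
    ({"outlook": "rainy", "play": "yes"}, {"windy": "false"}),
    ({"outlook": "sunny", "humidity": "high"}, {"play": "no"}),
    ({"outlook": "sunny", "play": "no"}, {"humidity": "high"}),
    ({"temperature": "cool", "windy": "false"}, {"humidity": "normal", "play": "yes"}),
    ({"temperature": "cool", "humidity": "normal", "windy": "false"}, {"play": "yes"}),
]


def is_car(antecedent, consequent, class_attribute=CLASS_ATTRIBUTE):
    """True if the consequent is exactly the class attribute and the antecedent omits it."""
    return set(consequent) == {class_attribute} and class_attribute not in antecedent


car_numbers = [i for i, (a, c) in enumerate(RULES, start=1) if is_car(a, c)]
print(car_numbers)  # [1, 3, 5, 7, 10]
```

Rule 9 is excluded because its consequent also contains `humidity`, so it does not predict the class label alone.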

Page 12: Associative Classification: Synopsis

Problem Formulation

Ø  Associative classification suffers from inefficiency because association rule mining often generates a very large number of rules. Many of these rules are insignificant, while good rules with relatively low support are not produced. Selecting the high-quality rules from among them takes effort.

Ø  Most associative classification algorithms adopt the exhaustive search method of the well-known Apriori algorithm to discover the rules, and they require multiple passes over the database. Furthermore, they find frequent items in one phase and generate the rules in a separate phase, which consumes more resources such as storage and processing time.

Page 13: Associative Classification: Synopsis

Objectives

Ø  Propose a framework that can generate Class Association Rules (CARs) efficiently.

Ø  Evaluate the proposed approach.

Ø  Compare the proposed algorithm with other state-of-the-art techniques.

Page 14: Associative Classification: Synopsis

Methodology

Ø  Review the classification and association rule generation methods.

Ø  Understand the existing models of associative classification.

Ø  Implement a classification system based on association rules and compare the performance of several model construction methods or algorithms in the Weka environment.

Ø  Compare the proposed approach with existing methods.

Page 15: Associative Classification: Synopsis

Facilities Required

Ø  A data mining tool such as Weka is used to implement the proposed project work.

Page 16: Associative Classification: Synopsis

References

Ø  [1] Tom M. Mitchell, “Machine Learning”, 1st ed. U.K.: McGraw-Hill, 1997.

Ø  [2] Bing Liu, Wynne Hsu, and Yiming Ma, “Integrating classification and association rule mining”, In Knowledge Discovery and Data Mining, New York, vol. 2, pp. 80–86, 1998.

Ø  [3] R. Agrawal and R. Srikant, “Fast algorithms for mining association rules”, In VLDB, pp. 487–499, Santiago, Chile, September 12-15, 1994.

Ø  [4] Wenmin Li, Jiawei Han, and Jian Pei, “CMAR: Accurate and efficient classification based on multiple class-association rules”, In ICDM'01 Proc. of the 2001 IEEE International Conference on Data Mining, pp. 369–376, IEEE Computer Society, Washington, DC, USA, 2001.

Ø  [5] X. Yin and J. Han, “CPAR: Classification based on Predictive Association Rules”, Proc. SIAM Int. Conf. on Data Mining, pp. 331–335, San Francisco, CA, May 2003.

Ø  [6] Fadi Abdeljaber Thabtah, “A review of associative classification mining”, Knowledge Engineering Review, vol. 22, no. 1, pp. 37–65, 2007.

Page 17: Associative Classification: Synopsis

References (continued)

Ø  [7] T. V. Mahendra, N. Deepika, and N. Keasava Rao, “Data Mining for High Performance Data Cloud using Association Rule Mining”, International Journal of Advanced Research in Computer Science and Software Engineering, vol. 2, Issue 1, 2012.

Ø  [8] S. H. Liao, P. H. Chu, and P. Y. Hsiao, “Data mining techniques and applications – A decade review from 2000 to 2011”, Elsevier Expert Systems with Applications, vol. 39, pp. 11303–11311, 2012.

Ø  [9] Ohbyung Kwon and Jae Mun Sim, “Effects of data set features on the performances of classification algorithms”, Expert Systems with Applications, vol. 40, pp. 1847–1857, 2013.

Ø  [10] http://www.infovis-wiki.net/index.php?title=File:Fayyad96kdd-process.png

Page 18: Associative Classification: Synopsis

Jagdeep Singh