WEKA: A Data Mining Tool, by Susan L. Miertschin

WEKA Data Mining Tool

Feb 03, 2022

Page 1: WEKA Data Mining Tool

WEKA: A Data Mining Tool

By Susan L. Miertschin

Page 2: WEKA Data Mining Tool

Data Mining

Task Types:
Classification
Clustering
Discovering Association Rules
Discovering Sequential Patterns – Sequence Analysis
Regression
Detecting Deviations from Normal

Numerous Algorithms:
C4.5 Decision Tree
K-Means Clustering

Page 3: WEKA Data Mining Tool

WEKA can be freely downloaded by visiting the Web site:

http://www.cs.waikato.ac.nz/ml/weka/

Page 4: WEKA Data Mining Tool

WEKA – Data Mining Software

Developed by the Machine Learning Group, University of Waikato, New Zealand

Vision: Build state-of-the-art software for developing machine learning (ML) techniques and apply them to real-world data-mining problems

Developed in Java

Page 5: WEKA Data Mining Tool

WEKA's Collection of Machine Learning Algorithms

Algorithms for data mining tasks

WEKA is open source software issued under the GNU General Public License

Tools for:
Data pre-processing
Classification
Regression
Clustering
Association rules
Visualization

Page 6: WEKA Data Mining Tool

After Installing - Start WEKA

Page 7: WEKA Data Mining Tool

WEKA Main Interface

Page 8: WEKA Data Mining Tool

WEKA Sample Files

C:\Program Files\weka\data

WEKA formatted files (.arff)

Open the contact-lenses file

Page 9: WEKA Data Mining Tool

Example – Contact Lens Data

How many data instances are in the file?
How many attributes?
Numerical attributes?
Categorical attributes?
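The questions on this slide can also be checked outside the WEKA interface by reading the ARFF text directly. The following is a minimal sketch, not WEKA's own loader. The sample text is abridged and hand-typed; the attribute names `age` and `spectacle-prescrip` and the four data rows are assumptions for illustration, so the counts describe this sample only, not the full contact-lenses file. The helper `summarize_arff` is a hypothetical name.

```python
# Abridged, hand-typed ARFF text in the style of the contact-lenses file.
# Assumption: attribute names/values are illustrative; only four rows are
# included, so the counts printed below are for this sample only.
SAMPLE_ARFF = """\
% lines starting with % are ARFF comments
@relation contact-lenses-sample
@attribute age {young, pre-presbyopic, presbyopic}
@attribute spectacle-prescrip {myope, hypermetrope}
@attribute astigmatism {no, yes}
@attribute tear-prod-rate {reduced, normal}
@attribute contact-lenses {soft, hard, none}
@data
young,myope,no,reduced,none
young,myope,no,normal,soft
young,myope,yes,normal,hard
presbyopic,myope,no,normal,none
"""


def summarize_arff(text):
    """Count data instances and group attributes by kind in simple ARFF text."""
    nominal, numeric, other = [], [], []
    instances = 0
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        lower = line.lower()
        if in_data:
            instances += 1  # every remaining line is one data instance
        elif lower.startswith("@attribute"):
            # "@attribute <name> <type>"; quoted names with spaces are not handled
            _, name, kind = line.split(None, 2)
            if kind.startswith("{"):
                nominal.append(name)  # categorical: values listed in braces
            elif kind.lower() in ("numeric", "real", "integer"):
                numeric.append(name)
            else:
                other.append(name)  # e.g. string or date attributes
        elif lower.startswith("@data"):
            in_data = True
    return {"instances": instances, "nominal": nominal,
            "numeric": numeric, "other": other}


summary = summarize_arff(SAMPLE_ARFF)
print(summary["instances"], "instances in this sample")
print(len(summary["nominal"]), "categorical attributes:", summary["nominal"])
print(len(summary["numeric"]), "numerical attributes:", summary["numeric"])
```

Running the same function on the text of the real file from the WEKA data folder would answer the slide's questions for the full dataset.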

Page 10: WEKA Data Mining Tool

Example – Contact Lens Data

Can you think of problems that might be solved with this data?

Page 11: WEKA Data Mining Tool

Example – Contact Lens Data

If supervised learning were to be done, which would be the output attribute, do you think?

Page 12: WEKA Data Mining Tool

Example – Contact Lens Data

Page 13: WEKA Data Mining Tool

Example – Contact Lens Data

Page 14: WEKA Data Mining Tool

Example – Contact Lens Data

Page 15: WEKA Data Mining Tool

Example – Classify - Contact Lens Data

Page 16: WEKA Data Mining Tool

Example – Classify - Contact Lens Data

Page 17: WEKA Data Mining Tool

Example – Classify - Contact Lens Data

Select the rule generator named PART from the list that shows up after you select Choose

Page 18: WEKA Data Mining Tool

Example – Classify - Contact Lens Data

Page 19: WEKA Data Mining Tool

Example – Classify - Contact Lens Data

Page 20: WEKA Data Mining Tool

10-Fold Cross-Validation

Data is partitioned into 10 equally (or nearly equally) sized segments or folds

10 iterations of training and validation are completed

In each iteration a different fold of the data is held out for validation, with the remaining 9 folds used for learning

http://www.public.asu.edu/~ltang9/papers/ency-cross-validation.pdf
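The partitioning described on this slide can be sketched in a few lines. This is a minimal illustration of the idea, not WEKA's implementation: it splits instance indices into consecutive folds without shuffling or stratifying, which real tools typically do first. The function names and the dataset size of 24 are assumptions chosen for the example.

```python
def k_fold_indices(n_instances, k=10):
    """Split indices 0..n-1 into k folds whose sizes differ by at most one."""
    base, extra = divmod(n_instances, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validation_splits(n_instances, k=10):
    """Yield (training, validation) index lists, one pair per iteration."""
    folds = k_fold_indices(n_instances, k)
    for held_out in range(k):
        validation = folds[held_out]
        # the remaining k-1 folds are merged and used for learning
        training = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield training, validation


# Assumed example size: 24 instances, 10 folds.
splits = list(cross_validation_splits(24, k=10))
for round_no, (train, valid) in enumerate(splits, 1):
    print(f"iteration {round_no}: learn on {len(train)}, validate on {len(valid)}")
```

Each instance is held out for validation exactly once across the 10 iterations, so every instance contributes to the final error estimate.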

Page 21: WEKA Data Mining Tool

Example – Classify - Contact Lens Data

Page 22: WEKA Data Mining Tool

Example – Classify - Contact Lens Data

IF tear-prod-rate = reduced THEN contact-lenses = none

IF astigmatism = no THEN contact-lenses = soft
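One way to read the two rules above is as an ordered list in which the first matching rule decides the class. The sketch below assumes that reading and encodes only the two rules shown on the slide; the function `classify`, the `RULES` structure, and the example instances are hypothetical, and any further rules in the PART output are not reproduced here.

```python
# Each rule is (conditions, predicted class); order matters.
RULES = [
    ({"tear-prod-rate": "reduced"}, "none"),
    ({"astigmatism": "no"}, "soft"),
]


def classify(instance, rules, default=None):
    """Return the class of the first rule whose conditions all hold."""
    for conditions, label in rules:
        if all(instance.get(attr) == value for attr, value in conditions.items()):
            return label
    return default  # no rule on this slide applies


# Illustrative instances (assumed attribute values, not read from the file).
first = classify({"tear-prod-rate": "reduced", "astigmatism": "no"}, RULES)
second = classify({"tear-prod-rate": "normal", "astigmatism": "no"}, RULES)
third = classify({"tear-prod-rate": "normal", "astigmatism": "yes"}, RULES)
print(first, second, third)
```

Because the rules are tried in order, the second rule is only reached by instances whose tear production rate is not reduced.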

Page 23: WEKA Data Mining Tool

Example – Classify - Contact Lens Data

Coverage = 12

Page 24: WEKA Data Mining Tool

Example – Classify - Contact Lens Data

IF tear-prod-rate = reduced THEN contact-lenses = none

IF astigmatism = no THEN contact-lenses = soft

Page 25: WEKA Data Mining Tool

Example – Classify - Contact Lens Data

Coverage = 6

Misclassification = 1

Accuracy = 5/6 = 83.3%
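The arithmetic on this slide can be reproduced with a short sketch: coverage is the number of instances a rule's conditions match, misclassifications are the covered instances whose actual class differs from the rule's prediction, and accuracy is the remaining fraction. The rows below are hand-entered to mirror the counts on the slide and are not read from the ARFF file; `rule_stats` is a hypothetical helper name.

```python
def rule_stats(instances, conditions, predicted, class_attr="contact-lenses"):
    """Return (coverage, misclassified, accuracy) for one rule."""
    covered = [row for row in instances
               if all(row.get(attr) == value for attr, value in conditions.items())]
    wrong = sum(1 for row in covered if row[class_attr] != predicted)
    coverage = len(covered)
    accuracy = (coverage - wrong) / coverage if coverage else None
    return coverage, wrong, accuracy


# Illustrative rows: six with astigmatism = no (five soft, one none),
# plus two rows the rule does not cover.
rows = (
    [{"astigmatism": "no", "contact-lenses": "soft"}] * 5
    + [{"astigmatism": "no", "contact-lenses": "none"}]
    + [{"astigmatism": "yes", "contact-lenses": "hard"}] * 2
)

coverage, wrong, accuracy = rule_stats(rows, {"astigmatism": "no"}, "soft")
print(f"Coverage = {coverage}")
print(f"Misclassification = {wrong}")
print(f"Accuracy = {coverage - wrong}/{coverage} = {accuracy:.1%}")
```

With these rows the output matches the slide: 6 covered, 1 misclassified, 5/6 = 83.3%.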

Page 26: WEKA Data Mining Tool

Example – Classify - Contact Lens Data

Page 27: WEKA Data Mining Tool

WEKA: A Data Mining Tool

By Susan L. Miertschin