Top Banner
Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
63

Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)

Jan 18, 2016

Download

Documents

Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)

Introduction to Weka

Xingquan (Hill) Zhu

Slides copied from Jeffrey Junfeng Pan (UST)

Page 2: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)

Outline

Weka Data Source Feature selection Model building

Classifier / Cross Validation Result visualization

Page 3: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)

WEKA

http://www.cs.waikato.ac.nz/ml/weka/ Data mining software in Java Open source software

UCI Data Repository http://www.ics.uci.edu/~mlearn/

MLRepository.html

Page 4: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 5: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 6: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 7: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)

Explorer: pre-processing the data

Data can be imported from a file in various formats: ARFF, CSV, C4.5, binary

Data can also be read from a URL or from an SQL database (using JDBC)

Pre-processing tools in WEKA are called “filters”

WEKA contains filters for: Discretization, normalization, resampling, attribute

selection, transforming and combining attributes, …

Page 8: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)

@relation heart-disease-simplified

@attribute age numeric@attribute sex { female, male}@attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}@attribute cholesterol numeric@attribute exercise_induced_angina { no, yes}@attribute class { present, not_present}

@data63,male,typ_angina,233,no,not_present67,male,asympt,286,yes,present67,male,asympt,229,yes,present38,female,non_anginal,?,no,not_present...

WEKA only deals with “flat” files

Page 9: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)

@relation heart-disease-simplified

@attribute age numeric@attribute sex { female, male}@attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}@attribute cholesterol numeric@attribute exercise_induced_angina { no, yes}@attribute class { present, not_present}

@data63,male,typ_angina,233,no,not_present67,male,asympt,286,yes,present67,male,asympt,229,yes,present38,female,non_anginal,?,no,not_present...

WEKA only deals with “flat” files

Page 10: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 11: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 12: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 13: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 14: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 15: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 16: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 17: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 18: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 19: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 20: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 21: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 22: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 23: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 24: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 25: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 26: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 27: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 28: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 29: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 30: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 31: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)

Explorer: attribute selection

Panel that can be used to investigate which (subsets of) attributes are the most predictive ones

Attribute selection methods contain two parts: A search method: best-first, forward selection, random,

exhaustive, genetic algorithm, ranking An evaluation method: correlation-based, wrapper,

information gain, chi-squared, … Very flexible: WEKA allows (almost) arbitrary combinations of

these two

Page 32: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 33: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 34: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 35: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 36: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 37: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 38: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 39: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 40: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)

Explorer: building “classifiers”

Classifiers in WEKA are models for predicting nominal or numeric quantities

Implemented learning schemes include: Decision trees and lists, instance-based classifiers,

support vector machines, multi-layer perceptrons, logistic regression, Bayes’ nets, …

“Meta”-classifiers include: Bagging, boosting, stacking, error-correcting output

codes, locally weighted learning, …

Page 41: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 42: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 43: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 44: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 45: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 46: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 47: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 48: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 49: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 50: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 51: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 52: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 53: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 54: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 55: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 56: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 57: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 58: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 59: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 60: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 61: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)
Page 62: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)

Problem with Running Weka

Solution : java -Xmx1000m -jar weka.jar

Problem : Out of memory for large data set

Page 63: Introduction to Weka Xingquan (Hill) Zhu Slides copied from Jeffrey Junfeng Pan (UST)

Outline

Weka Data Source Feature selection Model building

Classifier / Cross Validation Result visualization