MIMA Group Xin-Shun Xu @ SDU School of Computer Science and Technology, Shandong University Machine Learning & Data Mining Chapter 7 Decision Trees

Chapter 7 Decision Trees - Shandong Universitymima.sdu.edu.cn/Members/xinshunxu/Courses/ML/Chapter7.pdf · Machine Learning & Data Mining Chapter 7 ... Decision Tree Learning ...

Aug 16, 2020


MIMA Group

Xin-Shun Xu @ SDU School of Computer Science and Technology, Shandong University

Machine Learning & Data Mining

Chapter 7

Decision Trees



Top 10 Algorithms in DM
#1: C4.5
#2: K-Means
#3: SVM
#4: Apriori
#5: EM
#6: PageRank
#7: AdaBoost
#7: kNN
#7: Naive Bayes
#10: CART


Content
- Introduction
- CLS
- ID3
- C4.5
- CART


Inductive Learning

The general conclusion should apply to unseen examples.

[Figure: Examples are generalized into a Model; the Model is instantiated for another case to give a Prediction]


Decision Tree

A decision tree is a tree in which
- each branch node represents a choice between a number of alternatives;
- each leaf node represents a classification or decision.


Example I


Example II


Decision Rules

$$(Outlook = sunny \wedge Humidity = normal) \vee (Outlook = overcast) \vee (Outlook = rain \wedge Wind = weak)$$
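As a minimal sketch, the disjunction of rules above can be written as a Boolean function. The function name `decide` and the attribute values "high" and "strong" used in the calls are assumptions for illustration; only the three conjunctions come from the slide.

```python
def decide(outlook: str, humidity: str, wind: str) -> bool:
    """Return True when any of the three rules read off the tree fires."""
    return (
        (outlook == "sunny" and humidity == "normal")   # rule 1
        or outlook == "overcast"                        # rule 2
        or (outlook == "rain" and wind == "weak")       # rule 3
    )

# Hypothetical queries: "high" and "strong" are assumed attribute values.
print(decide("sunny", "normal", "strong"))  # True
print(decide("rain", "high", "strong"))     # False
```

Each path from the root of the tree to a positive leaf contributes one conjunction, and the tree as a whole is their disjunction.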


Decision Tree Learning

We wish to be able to induce a decision tree from a set of data about instances together with the decisions or classifications for those instances.

Learning Algorithms:
- CLS (Concept Learning System)
- ID3
- C4
- C4.5
- C5


Appropriate Problems for Decision Tree Learning
- Instances are represented by attribute-value pairs.
- The target function has discrete output values.
- Disjunctive descriptions may be required.
- The training data may contain errors.
- The training data may contain missing attribute values.


CLS Algorithm
1. T ← the whole training set. Create a T node.
2. If all examples in T are positive, create a ‘P’ node with T as its parent and stop.
3. If all examples in T are negative, create an ‘N’ node with T as its parent and stop.
4. Select an attribute X with values v1, v2, …, vN and partition T into subsets T1, T2, …, TN according to their values on X. Create N nodes Ti (i = 1, ..., N) with T as their parent and X = vi as the label of the branch from T to Ti.
5. For each Ti do: T ← Ti and go to step 2.


Example

(tall, blond, blue) w

(short, silver, blue) w

(short, black, blue) w

(tall, blond, brown) w

(tall, silver, blue) w

(short, blond, blue) w

(short, black, brown) e

(tall, silver, black) e

(short, black, brown) e

(tall, black, brown) e

(tall, black, black) e

(short, blond, black) e


Example

[Figure: CLS tree for the twelve examples, splitting on Height first, then Hair, then Eye]

Height = short
    Hair = blond
        Eye = blue: (short, blond, blue) w
        Eye = black: (short, blond, black) e
    Hair = silver: (short, silver, blue) w
    Hair = black
        Eye = blue: (short, black, blue) w
        Eye = brown: (short, black, brown) e, (short, black, brown) e
Height = tall
    Hair = blond: (tall, blond, blue) w, (tall, blond, brown) w
    Hair = silver
        Eye = blue: (tall, silver, blue) w
        Eye = black: (tall, silver, black) e
    Hair = black: (tall, black, brown) e, (tall, black, black) e


Example

[Figure: a smaller tree for the same twelve examples, splitting on Eye first, then Hair]

Eye = blue: (tall, blond, blue) w, (short, silver, blue) w, (short, black, blue) w, (tall, silver, blue) w, (short, blond, blue) w
Eye = black: (tall, silver, black) e, (tall, black, black) e, (short, blond, black) e
Eye = brown
    Hair = blond: (tall, blond, brown) w
    Hair = black: (short, black, brown) e, (short, black, brown) e, (tall, black, brown) e
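The CLS steps can be sketched in Python on the twelve examples above. This is a minimal sketch under stated assumptions: the names `DATA`, `NAMES`, and `cls` are illustrative, attributes are simply tried in the order given (CLS itself does not say how to pick X), and a majority vote is used if the attributes run out.

```python
from collections import Counter

NAMES = ("Height", "Hair", "Eye")
DATA = [  # (height, hair, eye, class)
    ("tall", "blond", "blue", "w"), ("short", "silver", "blue", "w"),
    ("short", "black", "blue", "w"), ("tall", "blond", "brown", "w"),
    ("tall", "silver", "blue", "w"), ("short", "blond", "blue", "w"),
    ("short", "black", "brown", "e"), ("tall", "silver", "black", "e"),
    ("short", "black", "brown", "e"), ("tall", "black", "brown", "e"),
    ("tall", "black", "black", "e"), ("short", "blond", "black", "e"),
]

def cls(rows, attrs):
    """Build a tree as nested (attribute_name, {value: subtree}) tuples."""
    labels = {r[-1] for r in rows}
    if len(labels) == 1:              # steps 2-3: all examples share one class
        return labels.pop()
    if not attrs:                     # assumption: majority vote when no attribute is left
        return Counter(r[-1] for r in rows).most_common(1)[0][0]
    a, rest = attrs[0], attrs[1:]     # step 4: take the next attribute in the given order
    branches = {}
    for v in sorted({r[a] for r in rows}):
        branches[v] = cls([r for r in rows if r[a] == v], rest)  # step 5: recurse on Ti
    return (NAMES[a], branches)

height_first = cls(DATA, [0, 1, 2])   # Height, then Hair, then Eye
eye_first = cls(DATA, [2, 1, 0])      # Eye, then Hair, then Height
print(eye_first)
```

Both trees are consistent with the twelve examples, but the order in which attributes are tested changes the size of the tree considerably.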


ID3

Iterative Dichotomizer (version) 3, developed by Ross Quinlan.

Selects the sequence of decisions (attribute tests) in the tree based on information gain.


ID3 Entropy (Binary Classification)

S = { (tall, blond, blue) w, (short, silver, blue) w, (short, black, blue) w, (tall, blond, brown) w, (tall, silver, blue) w, (short, blond, blue) w, (short, black, brown) e, (tall, silver, black) e, (short, black, brown) e, (tall, black, brown) e, (tall, black, black) e, (short, blond, black) e }

C1: Class 1, C2: Class 2

$p_1 = P(s \in C_1)$, $p_2 = P(s \in C_2)$

$$Entropy(S) = -p_1 \log_2 p_1 - p_2 \log_2 p_2$$


ID3 Entropy (Binary Classification)

[Figure: Entropy(S) plotted against $p_1$; both axes run from 0 to 1]

$$Entropy(S) = -p_1 \log_2 p_1 - p_2 \log_2 p_2$$


ID3 Entropy (Binary Classification)

[Figure: Entropy(S) plotted against $|p_1 - p_2|$; both axes run from 0 to 1]

$$Entropy(S) = -p_1 \log_2 p_1 - p_2 \log_2 p_2$$


ID3 Entropy (Binary Classification)

[Figure: a sample containing 14 "+" examples and 1 "–" example]

# + = 14, # – = 1

$p_1 = 14/15$, $p_2 = 1/15$

$$Entropy(S) = -p_1 \log_2 p_1 - p_2 \log_2 p_2 = 0.353359$$
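The binary entropy formula is easy to check numerically. The following is a minimal sketch; the function name `binary_entropy` is an assumption, and the convention that a zero-probability term contributes nothing is the usual limit of p log2 p as p goes to 0.

```python
import math

def binary_entropy(p1: float) -> float:
    """Entropy(S) = -p1*log2(p1) - p2*log2(p2) with p2 = 1 - p1."""
    p2 = 1.0 - p1
    # A term with probability 0 is skipped (limit of p*log2(p) as p -> 0).
    return sum(-p * math.log2(p) for p in (p1, p2) if p > 0)

print(binary_entropy(0.5))                # 1.0
print(round(binary_entropy(14 / 15), 6))  # 0.353359
```

Entropy is largest (1 bit) when the two classes are equally likely and falls toward 0 as one class dominates, which is why the 14-to-1 sample scores only about 0.35.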


Information Gain

Attribute $A = \{v_1, \ldots, v_n\}$

[Figure: S is partitioned into subsets S1, S2, ..., Sn by the values v1, v2, ..., vn of A]

$$Gain(S, A) = Entropy(S) - \sum_{v \in A} \frac{|S_v|}{|S|} Entropy(S_v)$$


Information Gain

$$Gain(S, A) = Entropy(S) - \sum_{v \in A} \frac{|S_v|}{|S|} Entropy(S_v)$$

[Figure: S split on Height]

short: (short, silver, blue) w, (short, black, blue) w, (short, blond, blue) w, (short, black, brown) e, (short, black, brown) e, (short, blond, black) e
tall: (tall, blond, blue) w, (tall, blond, brown) w, (tall, silver, blue) w, (tall, silver, black) e, (tall, black, brown) e, (tall, black, black) e

$Entropy(S) = 1$
$Entropy(S_{tall}) = 1$, $Entropy(S_{short}) = 1$

$Gain(S, Height) = 0$


Information Gain

$$Gain(S, A) = Entropy(S) - \sum_{v \in A} \frac{|S_v|}{|S|} Entropy(S_v)$$

[Figure: S split on Hair]

blond: (tall, blond, blue) w, (tall, blond, brown) w, (short, blond, blue) w, (short, blond, black) e
silver: (short, silver, blue) w, (tall, silver, blue) w, (tall, silver, black) e
black: (short, black, blue) w, (short, black, brown) e, (tall, black, brown) e, (short, black, brown) e, (tall, black, black) e

$Entropy(S) = 1$
$Entropy(S_{blond}) = 0.811278$, $Entropy(S_{silver}) = 0.918296$, $Entropy(S_{black}) = 0.721928$

$Gain(S, Hair) = 0.199197$


Information Gain

$$Gain(S, A) = Entropy(S) - \sum_{v \in A} \frac{|S_v|}{|S|} Entropy(S_v)$$

[Figure: S split on Eye]

blue: (tall, blond, blue) w, (short, silver, blue) w, (short, black, blue) w, (tall, silver, blue) w, (short, blond, blue) w
black: (tall, silver, black) e, (tall, black, black) e, (short, blond, black) e
brown: (tall, blond, brown) w, (short, black, brown) e, (short, black, brown) e, (tall, black, brown) e

$Entropy(S) = 1$
$Entropy(S_{blue}) = 0$, $Entropy(S_{black}) = 0$, $Entropy(S_{brown}) = 0.811278$

$Gain(S, Eye) = 1 - \frac{4}{12} \times 0.811278 = 0.729574$


Information Gain

$Gain(S, Height) = 0$
$Gain(S, Hair) = 0.199197$
$Gain(S, Eye) = 0.729574$

Eye has the largest information gain on the twelve examples, so it is the best attribute to test first.
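The three gains can be recomputed directly from the twelve examples. This is a minimal sketch; the names `DATA`, `entropy`, and `gain` are illustrative assumptions. Run as written it gives roughly 0 for Height, 0.1992 for Hair, and 0.7296 for Eye.

```python
import math
from collections import Counter

DATA = [  # (height, hair, eye, class)
    ("tall", "blond", "blue", "w"), ("short", "silver", "blue", "w"),
    ("short", "black", "blue", "w"), ("tall", "blond", "brown", "w"),
    ("tall", "silver", "blue", "w"), ("short", "blond", "blue", "w"),
    ("short", "black", "brown", "e"), ("tall", "silver", "black", "e"),
    ("short", "black", "brown", "e"), ("tall", "black", "brown", "e"),
    ("tall", "black", "black", "e"), ("short", "blond", "black", "e"),
]

def entropy(labels):
    """Entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def gain(rows, a):
    """Gain(S, A) = Entropy(S) - sum over v of |S_v|/|S| * Entropy(S_v)."""
    remainder = 0.0
    for v in {r[a] for r in rows}:
        subset = [r[-1] for r in rows if r[a] == v]   # labels of S_v
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([r[-1] for r in rows]) - remainder

for i, name in enumerate(("Height", "Hair", "Eye")):
    print(name, round(gain(DATA, i), 6))
```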


ID3 (modification of CLS)
1. T ← the whole training set. Create a T node.
2. If all examples in T are positive, create a ‘P’ node with T as its parent and stop.
3. If all examples in T are negative, create an ‘N’ node with T as its parent and stop.
4. Select an attribute X with values v1, v2, …, vN by maximizing the information gain, and partition T into subsets T1, T2, …, TN according to their values on X. Create N nodes Ti (i = 1, ..., N) with T as their parent and X = vi as the label of the branch from T to Ti.
5. For each Ti do: T ← Ti and go to step 2.
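A compact Python sketch of ID3 on the twelve examples follows. The names `DATA`, `NAMES`, `entropy`, `gain`, and `id3` are assumptions for illustration, as is the majority vote used when no attributes remain; the only change relative to CLS is that step 4 picks the attribute with the largest information gain.

```python
import math
from collections import Counter

NAMES = ("Height", "Hair", "Eye")
DATA = [  # (height, hair, eye, class)
    ("tall", "blond", "blue", "w"), ("short", "silver", "blue", "w"),
    ("short", "black", "blue", "w"), ("tall", "blond", "brown", "w"),
    ("tall", "silver", "blue", "w"), ("short", "blond", "blue", "w"),
    ("short", "black", "brown", "e"), ("tall", "silver", "black", "e"),
    ("short", "black", "brown", "e"), ("tall", "black", "brown", "e"),
    ("tall", "black", "black", "e"), ("short", "blond", "black", "e"),
]

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def gain(rows, a):
    remainder = 0.0
    for v in {r[a] for r in rows}:
        subset = [r[-1] for r in rows if r[a] == v]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([r[-1] for r in rows]) - remainder

def id3(rows, attrs):
    """Return a leaf label or a nested (attribute_name, {value: subtree}) tuple."""
    labels = [r[-1] for r in rows]
    if len(set(labels)) == 1:                       # steps 2-3: pure node
        return labels[0]
    if not attrs:                                   # assumption: majority vote
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: gain(rows, a))  # step 4: maximize information gain
    rest = [a for a in attrs if a != best]
    return (NAMES[best], {v: id3([r for r in rows if r[best] == v], rest)  # step 5
                          for v in sorted({r[best] for r in rows})})

tree = id3(DATA, [0, 1, 2])
print(tree)
```

On this data the sketch tests Eye at the root and Hair under the brown branch, and never needs Height.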


Example

[Figure: tree for the twelve examples, splitting on Eye first, then Hair]

Eye = blue: (tall, blond, blue) w, (short, silver, blue) w, (short, black, blue) w, (tall, silver, blue) w, (short, blond, blue) w
Eye = black: (tall, silver, black) e, (tall, black, black) e, (short, blond, black) e
Eye = brown
    Hair = blond: (tall, blond, brown) w
    Hair = black: (short, black, brown) e, (short, black, brown) e, (tall, black, brown) e


Windowing

[Figure: a large training set of "+" and "–" examples, with smaller windows (subsets) of examples drawn from it]


Windowing

ID3 can deal with very large data sets by performing induction on subsets, or windows, onto the data.
1. Select a random subset (the window) of the whole set of training instances.
2. Use the induction algorithm to form a rule to explain the current window.
3. Scan through all of the training instances looking for exceptions to the rule.
4. Add the exceptions to the window.

Repeat steps 2 to 4 until there are no exceptions left.
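The windowing loop can be sketched independently of the learner. In this minimal sketch the names `windowing`, `learn`, and `predict` are assumptions, and the learner is a stand-in that simply memorises the window and falls back to the window's majority class; it is not ID3, which would be plugged in through the same two callables.

```python
import random
from collections import Counter

DATA = [  # (height, hair, eye, class)
    ("tall", "blond", "blue", "w"), ("short", "silver", "blue", "w"),
    ("short", "black", "blue", "w"), ("tall", "blond", "brown", "w"),
    ("tall", "silver", "blue", "w"), ("short", "blond", "blue", "w"),
    ("short", "black", "brown", "e"), ("tall", "silver", "black", "e"),
    ("short", "black", "brown", "e"), ("tall", "black", "brown", "e"),
    ("tall", "black", "black", "e"), ("short", "blond", "black", "e"),
]

def windowing(rows, learn, predict, window_size, seed=0):
    rng = random.Random(seed)
    window = rng.sample(rows, window_size)          # step 1: random initial window
    while True:
        model = learn(window)                       # step 2: induce from the window
        exceptions = [r for r in rows               # step 3: scan all instances
                      if r not in window and predict(model, r) != r[-1]]
        if not exceptions:                          # no exceptions left: done
            return model, window
        window.extend(exceptions)                   # step 4: grow the window

def learn(window):
    """Stand-in learner: lookup table plus the window's majority class."""
    table = {r[:-1]: r[-1] for r in window}
    default = Counter(r[-1] for r in window).most_common(1)[0][0]
    return table, default

def predict(model, row):
    table, default = model
    return table.get(row[:-1], default)

model, window = windowing(DATA, learn, predict, window_size=4)
print(len(window), "of", len(DATA), "examples ended up in the window")
```

The loop stops once the model induced from the window makes no mistakes on the instances outside it, so the final window is often much smaller than the full training set.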


Inductive Biases
- Shorter trees are preferred.
- Attributes with higher information gain are selected first in tree construction.
- Greedy search.
- Preference bias (relative to the restriction bias of the version space (VS) approach).
- Why prefer short hypotheses? Occam's razor; generalization.


Overfitting to the Training Data

The training error is statistically smaller than the test error for a given hypothesis.

Solutions:
- Early stopping
- Validation sets
- Statistical criterion for continuation (of the tree)
- Post-pruning
- Minimal description length: cost-function = error + complexity


Pruning Techniques

- Reduced error pruning (of nodes): used by ID3
- Rule post-pruning: used by C4.5


Reduced Error Pruning
- Use a separate validation set.
- Tree accuracy: percentage of correct classifications on the validation set.
- Method: do until further pruning is harmful:
  1. Evaluate the impact on the validation set of pruning each possible node.
  2. Greedily remove the one that most improves the validation set accuracy.
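The pruning loop can be sketched as follows. Everything about the representation is an assumption made for illustration: an internal node is a dict with keys `attr`, `branches`, and `majority` (the most common training class at that node), a leaf is a plain label, and pruning a node means replacing it with its majority label. Pruning continues while the best candidate does not lower validation accuracy.

```python
def classify(tree, row):
    while isinstance(tree, dict):
        # Unseen attribute values fall back to the node's majority class.
        tree = tree["branches"].get(row[tree["attr"]], tree["majority"])
    return tree

def accuracy(tree, rows):
    return sum(classify(tree, r) == r[-1] for r in rows) / len(rows)

def internal_paths(tree, path=()):
    """Yield the branch-value path to every internal node."""
    if isinstance(tree, dict):
        yield path
        for value, child in tree["branches"].items():
            yield from internal_paths(child, path + (value,))

def pruned(tree, path):
    """Copy of the tree with the node at `path` replaced by its majority label."""
    if not path:
        return tree["majority"]
    branches = dict(tree["branches"])
    branches[path[0]] = pruned(branches[path[0]], path[1:])
    return {**tree, "branches": branches}

def reduced_error_prune(tree, validation):
    while isinstance(tree, dict):
        best = max((pruned(tree, p) for p in internal_paths(tree)),
                   key=lambda t: accuracy(t, validation))
        if accuracy(best, validation) < accuracy(tree, validation):
            break                                   # further pruning is harmful
        tree = best
    return tree

# Hypothetical tree and validation rows: (attribute 0, attribute 1, class).
TREE = {"attr": 0, "majority": "w", "branches": {
    "a": "e",
    "b": {"attr": 1, "majority": "w", "branches": {"c": "w", "d": "e"}},
}}
VALIDATION = [("a", "c", "e"), ("a", "d", "e"), ("b", "c", "w"), ("b", "d", "w")]
print(reduced_error_prune(TREE, VALIDATION))
```

In this toy case the inner test on attribute 1 is removed because it hurts validation accuracy, while the root test is kept because collapsing it would misclassify half of the validation rows.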


Reduced Error Pruning


C4.5: An Extension of ID3

Some additional features of C4.5 are:
- Incorporation of numerical (continuous) attributes.
- Nominal (discrete) values of a single attribute may be grouped together, to support more complex tests.
- Post-pruning after induction of trees, e.g. based on test sets, in order to increase accuracy.
- C4.5 can deal with incomplete information (missing attribute values).
- Use of gain ratio instead of information gain.


Rule Post-Pruning
1. Fully induce the decision tree from the training set (allowing overfitting).
2. Convert the learned tree to rules: one rule for each path from the root node to a leaf node.
3. Prune each rule by removing any preconditions whose removal improves its estimated accuracy.
4. Sort the pruned rules by their estimated accuracy, and consider them in this sequence when classifying subsequent instances.


Converting to Rules


Handling Numeric Attributes

Continuous attribute → discrete attribute

Example
- Original attribute: Temperature = 82.5
- New attribute: (Temperature > 72.3) = t, f

Example: Temperature < 54, Temperature 54-85, Temperature > 85

How to choose split points?


Handling Numeric Attributes

Choosing split points for a continuous attribute:
1. Sort the examples according to the values of the continuous attribute.
2. Identify adjacent examples that differ in their target labels and attribute values; these give a set of candidate split points.
3. Calculate the gain for each split point and choose the one with the highest gain.
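The three steps can be sketched as follows. The six temperature readings and their labels are illustrative assumptions, chosen so that the candidate thresholds come out as 54 and 85; placing each candidate at the midpoint between the two adjacent values is also an assumption (a common convention), as are the function names.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def candidate_splits(values, labels):
    pairs = sorted(zip(values, labels))                   # step 1: sort by attribute value
    return [(v1 + v2) / 2                                 # assumed convention: midpoint
            for (v1, y1), (v2, y2) in zip(pairs, pairs[1:])
            if y1 != y2 and v1 != v2]                     # step 2: label and value both change

def split_gain(values, labels, threshold):
    left = [y for v, y in zip(values, labels) if v < threshold]
    right = [y for v, y in zip(values, labels) if v >= threshold]
    n = len(labels)
    return (entropy(labels)
            - len(left) / n * entropy(left)
            - len(right) / n * entropy(right))

# Hypothetical readings and class labels.
temperature = [40, 48, 60, 72, 80, 90]
play = ["no", "no", "yes", "yes", "yes", "no"]

candidates = candidate_splits(temperature, play)
best = max(candidates, key=lambda t: split_gain(temperature, play, t))  # step 3
print(candidates, best)
```

Only boundaries where the class changes need to be examined, which keeps the number of thresholds to evaluate small.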


Attributes with Many Values

Information gain is biased toward attributes with many values, e.g., date.

One approach: use GainRatio instead of information gain.

$$GainRatio(S, A) = \frac{Gain(S, A)}{SplitInformation(S, A)}$$

$$SplitInformation(S, A) = -\sum_{i=1}^{c} \frac{|S_i|}{|S|} \log_2 \frac{|S_i|}{|S|}$$
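The effect of the gain ratio can be seen by adding a unique identifier column, which behaves like a date, to the twelve examples. This is a minimal sketch: the names `ROWS`, `split_information`, and `gain_ratio` are assumptions, and returning 0 when the split information is 0 is a guard added here, not something the slide specifies.

```python
import math
from collections import Counter

DATA = [  # (height, hair, eye, class)
    ("tall", "blond", "blue", "w"), ("short", "silver", "blue", "w"),
    ("short", "black", "blue", "w"), ("tall", "blond", "brown", "w"),
    ("tall", "silver", "blue", "w"), ("short", "blond", "blue", "w"),
    ("short", "black", "brown", "e"), ("tall", "silver", "black", "e"),
    ("short", "black", "brown", "e"), ("tall", "black", "brown", "e"),
    ("tall", "black", "black", "e"), ("short", "blond", "black", "e"),
]
# Column 0: unique id (many-valued, like a date); 1: height; 2: hair; 3: eye.
ROWS = [(i,) + r for i, r in enumerate(DATA)]

def entropy(items):
    n = len(items)
    return -sum(c / n * math.log2(c / n) for c in Counter(items).values())

def gain(rows, a):
    remainder = 0.0
    for v in {r[a] for r in rows}:
        subset = [r[-1] for r in rows if r[a] == v]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([r[-1] for r in rows]) - remainder

def split_information(rows, a):
    # Entropy of S with respect to the values of attribute A (not the classes).
    return entropy([r[a] for r in rows])

def gain_ratio(rows, a):
    si = split_information(rows, a)
    return gain(rows, a) / si if si > 0 else 0.0   # guard: single-valued attribute

for a, name in enumerate(("Id", "Height", "Hair", "Eye")):
    print(name, round(gain(ROWS, a), 4), round(gain_ratio(ROWS, a), 4))
```

The identifier has the highest information gain because every subset it creates is pure, but its large split information pushes its gain ratio below that of Eye.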


Attributes with Associated Costs

Attributes may vary significantly in the cost of obtaining their values, e.g., in medical diagnosis.

Example: medical disease classification with the attributes Temperature, BiopsyResult, Pulse, BloodTestResult (two of these are marked "High" in cost on the slide).

How to learn a consistent tree with low expected cost?


Attributes with Associated Costs

Tan and Schlimmer (1990):

$$\frac{Gain^2(S, A)}{Cost(A)}$$

Nunez (1988):

$$\frac{2^{Gain(S, A)} - 1}{(Cost(A) + 1)^w}$$

where $w \in [0, 1]$ determines the importance of cost.
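Both cost-sensitive selection measures are one-liners. In this minimal sketch the function names and the gain and cost figures passed to them are assumptions for illustration only.

```python
def tan_schlimmer(gain: float, cost: float) -> float:
    """Gain^2(S, A) / Cost(A)."""
    return gain ** 2 / cost

def nunez(gain: float, cost: float, w: float) -> float:
    """(2^Gain(S, A) - 1) / (Cost(A) + 1)^w, with w in [0, 1]."""
    return (2 ** gain - 1) / (cost + 1) ** w

# Hypothetical attributes: a cheap, weak test and an expensive, strong one.
cheap = {"gain": 0.5, "cost": 1.0}
costly = {"gain": 1.0, "cost": 9.0}

print(tan_schlimmer(**cheap), tan_schlimmer(**costly))
print(nunez(**cheap, w=0.0), nunez(**costly, w=0.0))   # w = 0: cost is ignored
print(nunez(**cheap, w=1.0), nunez(**costly, w=1.0))   # w = 1: cost counts fully
```

With w = 0 the Nunez measure ranks attributes by gain alone, and as w grows the expensive attribute is penalised more heavily.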


Unknown Attribute Values

Attribute A = {x, y, z}

[Figure: a set S of examples whose value for A is x, y, or z, together with examples whose value for A is unknown, written (?, …, …)]

Gain(S, A) = ?

Possible Approaches:
- Assign the most common value of A to the unknown one.
- Assign the most common value of A among examples with the same target value to the unknown one.
- Assign a probability to each possible value.
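The three approaches can be sketched on a toy set. The rows, the use of `None` for an unknown value, and the function names are all assumptions for illustration; the third function only estimates the probabilities, and how they are then used (for example to weight fractional examples) is left open.

```python
from collections import Counter

# Hypothetical examples: (value of A, class); None marks an unknown value.
ROWS = [("x", "p"), ("x", "p"), ("y", "n"), ("y", "n"),
        ("y", "n"), ("z", "p"), (None, "p")]

def known_values(rows, a, label=None):
    """Known values of attribute a, optionally restricted to one class."""
    return [r[a] for r in rows
            if r[a] is not None and (label is None or r[-1] == label)]

def most_common_value(rows, a, label=None):
    return Counter(known_values(rows, a, label)).most_common(1)[0][0]

def value_probabilities(rows, a):
    vals = known_values(rows, a)
    return {v: c / len(vals) for v, c in Counter(vals).items()}

print(most_common_value(ROWS, 0))             # approach 1: over all examples
print(most_common_value(ROWS, 0, label="p"))  # approach 2: same target value only
print(value_probabilities(ROWS, 0))           # approach 3: a probability per value
```

Here the first two approaches disagree: "y" is the most common value overall, but among examples of class "p" the most common value is "x".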


CART: Classification And Regression Trees
- Generates a binary decision tree: only 2 children are created at each node (whereas ID3 creates a child for each subcategory).
- Each split makes the subsets purer than the set before splitting.
- In ID3, entropy is used to measure the splitting; in CART, impurity is used.


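CART

Node impurity is 0 when all patterns at the node are of the same category; it becomes maximum when all the classes at the node are equally likely. Below, $P(\omega_j)$ is the fraction of patterns at node N that belong to category $\omega_j$.

Entropy impurity: $i(N) = -\sum_j P(\omega_j) \log_2 P(\omega_j)$

Gini impurity: $i(N) = \sum_{i \neq j} P(\omega_i) P(\omega_j)$

Misclassification impurity: $i(N) = 1 - \max_j P(\omega_j)$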
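The three impurity measures can be compared on a pure node and on a balanced node. This is a minimal sketch: the function names are assumptions, each function takes the list of class fractions at a node, and the Gini sum is taken over ordered pairs i ≠ j, which equals one minus the sum of squared fractions.

```python
import math

def entropy_impurity(p):
    # Zero-probability classes are skipped (limit of q*log2(q) as q -> 0).
    return sum(-q * math.log2(q) for q in p if q > 0)

def gini_impurity(p):
    # Sum over ordered pairs i != j of P_i * P_j.
    return sum(p[i] * p[j]
               for i in range(len(p)) for j in range(len(p)) if i != j)

def misclassification_impurity(p):
    return 1 - max(p)

for fractions in ([1.0, 0.0], [0.5, 0.5]):
    print(fractions,
          entropy_impurity(fractions),
          gini_impurity(fractions),
          misclassification_impurity(fractions))
```

All three measures are 0 for the pure node and reach their maximum for the balanced one, in line with the statement above.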


Any Questions?