Chapter 9 - Data Analysis Presented by, Professor, Hair Priya Pucchakayala Dr. T. Y. Lin & Donavon Norwood
Page 1: Data Mining

Chapter 9 - Data Analysis

Presented by: Hair Priya Pucchakayala & Donavon Norwood
Professor: Dr. T. Y. Lin

Page 2: Data Mining

Outline

• What is Data Analysis?
• Why should data be analyzed?
• What is a Decision Table?
• Condition/Decision attributes of the Stoker table
• Analyzing inconsistent data in the Stoker table
• Analyzing consistent data in the Stoker table
• Conclusion

Page 3: Data Mining

Data Analysis

Data Analysis is a process of gathering, modelling and transforming data with a goal of highlighting useful information, suggesting conclusions, and supporting decision making.

Data Analysis is the process of systematically applying statistical and/or logical techniques to describe and illustrate, condense and recap, and evaluate data.

Ways to analyze data

• Compare constantly
• Categorize and sort

Page 4: Data Mining

Why should data be analyzed?

Data Mining is a particular data analysis technique that focuses on modeling and knowledge discovery for predictive rather than purely descriptive purposes.

Data analysis does NOT explain the data, but it does summarize and organize it.

It tries to make sense of the rows and columns in the decision tables.

Page 5: Data Mining

Decision Table

According to Wikipedia, decision tables are a precise yet compact way to model complicated logic. Decision tables, like if-then-else and switch-case statements, associate conditions with actions to perform. A decision table is typically divided into four quadrants, as shown below:

Conditions    Condition alternatives
Actions       Action entries

The four quadrants

Each decision corresponds to a variable, relation or predicate whose possible values are listed among the condition alternatives. Each action is a procedure or operation to perform, and the entries specify whether (or in what order) the action is to be performed for the set of condition alternatives the entry corresponds to. In this presentation we will use the Stoker decision table as the example of a decision table.
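
As a minimal sketch of the four-quadrant layout, a decision table can be held as a mapping from condition alternatives to action entries. The condition names, action names and rules below are hypothetical and only illustrate the structure; they are not taken from the Stoker table.

```python
# Hypothetical decision table: names and rules are illustrative assumptions.
CONDITIONS = ("temperature_high", "pressure_high")  # "Conditions" quadrant
ACTIONS = ("open_valve", "sound_alarm")             # "Actions" quadrant

# Keys are condition alternatives, values are action entries
# (True means "perform this action").
RULES = {
    (True, True): (True, True),
    (True, False): (True, False),
    (False, True): (False, True),
    (False, False): (False, False),
}

def decide(observation):
    """Return the names of the actions to perform for one observation."""
    key = tuple(observation[name] for name in CONDITIONS)
    entries = RULES[key]
    return [action for action, enabled in zip(ACTIONS, entries) if enabled]

print(decide({"temperature_high": True, "pressure_high": False}))
```

Storing the rules as data rather than as nested if-then-else statements keeps every combination of conditions visible in one place, which is the property the Stoker analysis relies on.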

Page 6: Data Mining

Decision Table as Protocol of Observations

• We will analyze the decisions a stoker makes while controlling a clinker kiln

• The aim of the stoker is to keep the kiln in a “proper” state

• The actions of the stoker can be described by a “Decision Table”

Page 7: Data Mining

Condition/Decision attributes of the Stoker table

• Actions of the stoker can be described by a decision table, where

Condition attributes:
1. Burning zone temperature (BZT)
2. Burning zone color (BZC)
3. Clinker granulation (CG)
4. Kiln inside color (KIC)

Decision attributes:
1. Kiln revolutions (KR)
2. Coal worm revolutions (CWR)

Page 8: Data Mining

• The table describes the actions undertaken by a stoker during one shift, when specific conditions observed by the stoker have been fulfilled.

• The numbers given in the column TIME are in fact ID labels of the moments of time when each decision took place, and they form the “universe” of the decision table.

• Since many decisions are the same, identical decision rules can be removed.

• The resulting decision table is shown below.

Page 9: Data Mining

Decision Table 1 with Condition Attributes and Decision Attributes
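
The rows of Table 1 can be recovered by combining the columns of Tables 2 to 5, each of which omits one condition attribute. The sketch below holds that reconstruction in Python. The mapping of the letters a to f onto the named attributes follows the order in which the attributes are listed and is an assumption, as is the helper name `project`.

```python
# Table 1 as reconstructed from Tables 2-5 (a reconstruction, not copied from a slide).
# Assumed mapping: a=BZT, b=BZC, c=CG, d=KIC (conditions); e=KR, f=CWR (decisions).
ATTRS = ("a", "b", "c", "d", "e", "f")
TABLE1 = {
    1: (3, 3, 2, 2, 2, 4),
    2: (3, 2, 2, 2, 2, 4),
    3: (3, 2, 2, 1, 2, 4),
    4: (2, 2, 2, 1, 1, 4),
    5: (2, 2, 2, 2, 1, 4),
    6: (3, 2, 2, 3, 2, 3),
    7: (3, 3, 2, 3, 2, 3),
    8: (4, 3, 2, 3, 2, 3),
    9: (4, 3, 3, 3, 2, 2),
    10: (4, 4, 3, 3, 2, 2),
    11: (4, 4, 3, 2, 2, 2),
    12: (4, 3, 3, 2, 2, 2),
    13: (4, 2, 3, 2, 2, 2),
}

def project(table, keep):
    """Return the table restricted to the attributes named in `keep`."""
    idx = [ATTRS.index(name) for name in keep]
    return {u: tuple(row[i] for i in idx) for u, row in table.items()}

# Dropping 'a' should reproduce the rows of Table 2.
table2 = project(TABLE1, ("b", "c", "d", "e", "f"))
print(table2[1])
```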

Page 10: Data Mining

Derivation of Control Algorithms from Observation

• Consistency of the stoker’s knowledge

• Reduction of his knowledge and generation of a control algorithm from the stoker’s observed behavior
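
Consistency here means that rows with identical condition values never carry different decisions. A minimal sketch of that check follows, assuming the rows of Table 1 as recovered from Tables 2 to 5; the function and variable names are illustrative.

```python
# Rows are (a, b, c, d, e, f): four condition values followed by two decision values.
# The rows are an assumed reconstruction of Table 1 from Tables 2-5.
ROWS = [
    (3, 3, 2, 2, 2, 4), (3, 2, 2, 2, 2, 4), (3, 2, 2, 1, 2, 4),
    (2, 2, 2, 1, 1, 4), (2, 2, 2, 2, 1, 4), (3, 2, 2, 3, 2, 3),
    (3, 3, 2, 3, 2, 3), (4, 3, 2, 3, 2, 3), (4, 3, 3, 3, 2, 2),
    (4, 4, 3, 3, 2, 2), (4, 4, 3, 2, 2, 2), (4, 3, 3, 2, 2, 2),
    (4, 2, 3, 2, 2, 2),
]

def is_consistent(rows, n_conditions):
    """True if equal condition values always imply equal decision values."""
    seen = {}
    for row in rows:
        cond, dec = row[:n_conditions], row[n_conditions:]
        # Remember the first decision seen for these conditions and compare later ones.
        if seen.setdefault(cond, dec) != dec:
            return False
    return True

print(is_consistent(ROWS, 4))

# Two rules with the same conditions but different decisions are inconsistent.
print(is_consistent([(4, 3, 3, 2, 3), (4, 3, 3, 2, 2)], 3))
```

The first call reports that the full table is consistent; the second shows the kind of conflict that appears once an indispensable attribute is removed.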

Page 11: Data Mining

Table 2: Removing attribute ‘a’ from table 1

U    b  c  d  e  f
1    3  2  2  2  4
2    2  2  2  2  4
3    2  2  1  2  4
4    2  2  1  1  4
5    2  2  2  1  4
6    2  2  3  2  3
7    3  2  3  2  3
8    3  2  3  2  3
9    3  3  3  2  2
10   4  3  3  2  2
11   4  3  2  2  2
12   3  3  2  2  2
13   2  3  2  2  2
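
To see which rows of Table 2 conflict once ‘a’ is removed, each pair of rows can be compared directly. The sketch below is one way to do that; the function name `conflicting_pairs` is illustrative.

```python
from itertools import combinations

# Table 2 rows: U -> (b, c, d, e, f); b, c, d are conditions, e, f are decisions.
TABLE2 = {
    1: (3, 2, 2, 2, 4), 2: (2, 2, 2, 2, 4), 3: (2, 2, 1, 2, 4),
    4: (2, 2, 1, 1, 4), 5: (2, 2, 2, 1, 4), 6: (2, 2, 3, 2, 3),
    7: (3, 2, 3, 2, 3), 8: (3, 2, 3, 2, 3), 9: (3, 3, 3, 2, 2),
    10: (4, 3, 3, 2, 2), 11: (4, 3, 2, 2, 2), 12: (3, 3, 2, 2, 2),
    13: (2, 3, 2, 2, 2),
}

def conflicting_pairs(table, n_conditions=3):
    """List pairs of row labels with equal conditions but different decisions."""
    pairs = []
    for (u1, r1), (u2, r2) in combinations(sorted(table.items()), 2):
        same_conditions = r1[:n_conditions] == r2[:n_conditions]
        if same_conditions and r1[n_conditions:] != r2[n_conditions:]:
            pairs.append((u1, u2))
    return pairs

print(conflicting_pairs(TABLE2))
```

On the rows above this reports the pairs (2, 5) and (3, 4), which is why attribute ‘a’ cannot be dropped.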

Page 12: Data Mining

Table 4: Removing attribute ‘c’ from table 1

U    a  b  d  e  f
1    3  3  2  2  4
2    3  2  2  2  4
3    3  2  1  2  4
4    2  2  1  1  4
5    2  2  2  1  4
6    3  2  3  2  3
7    3  3  3  2  3
8    4  3  3  2  3
9    4  3  3  2  2
10   4  4  3  2  2
11   4  4  2  2  2
12   4  3  2  2  2
13   4  2  2  2  2

Page 13: Data Mining

Table 5: Removing attribute ‘d’ from table 1

U    a  b  c  e  f
1    3  3  2  2  4
2    3  2  2  2  4
3    3  2  2  2  4
4    2  2  2  1  4
5    2  2  2  1  4
6    3  2  2  2  3
7    3  3  2  2  3
8    4  3  2  2  3
9    4  3  3  2  2
10   4  4  3  2  2
11   4  4  3  2  2
12   4  3  3  2  2
13   4  2  3  2  2

Page 14: Data Mining

Table 3: Removing attribute ‘b’ from table 1

U    a  c  d  e  f
1    3  2  2  2  4
2    3  2  2  2  4
3    3  2  1  2  4
4    2  2  1  1  4
5    2  2  2  1  4
6    3  2  3  2  3
7    3  2  3  2  3
8    4  2  3  2  3
9    4  3  3  2  2
10   4  3  3  2  2
11   4  3  2  2  2
12   4  3  2  2  2
13   4  3  2  2  2

Page 15: Data Mining

Table 6: After removing superfluous attribute ‘b’ & duplicate rules

U    a  c  d  e  f
1    3  2  2  2  4
2    3  2  1  2  4
3    2  2  1  1  4
4    2  2  2  1  4
5    3  2  3  2  3
6    4  2  3  2  3
7    4  3  3  2  2
8    4  3  2  2  2
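
Table 6 follows from Table 3 by keeping only the first occurrence of each rule. A minimal sketch of that step, with an illustrative helper name `drop_duplicates`:

```python
# Table 3 rows (attribute b already removed): (a, c, d, e, f)
TABLE3 = [
    (3, 2, 2, 2, 4), (3, 2, 2, 2, 4), (3, 2, 1, 2, 4), (2, 2, 1, 1, 4),
    (2, 2, 2, 1, 4), (3, 2, 3, 2, 3), (3, 2, 3, 2, 3), (4, 2, 3, 2, 3),
    (4, 3, 3, 2, 2), (4, 3, 3, 2, 2), (4, 3, 2, 2, 2), (4, 3, 2, 2, 2),
    (4, 3, 2, 2, 2),
]

def drop_duplicates(rows):
    """Keep the first occurrence of each rule, preserving the original order."""
    # dict keys are unique and keep insertion order, so this de-duplicates in order.
    return list(dict.fromkeys(rows))

table6 = drop_duplicates(TABLE3)
for u, row in enumerate(table6, start=1):
    print(u, *row)
```

The 13 rows collapse to the 8 renumbered rules listed in Table 6.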

Page 16: Data Mining

For example, let us compute the core values and reduct values for the first decision rule:

a3c2d2 ==> e2f4

In Table 6, the values of ‘a’ and ‘d’ are indispensable in this rule, since the following pairs of rules are inconsistent:

c2d2 ==> e2f4 (rule 1)
c2d2 ==> e1f4 (rule 4)

a3c2 ==> e2f4 (rule 1)
a3c2 ==> e2f3 (rule 5)

Thus a3 and d2 are core values of the decision rule a3c2d2 ==> e2f4.
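
The same test can be run mechanically: drop one condition value from a rule and look for another rule that agrees on the remaining conditions but disagrees on the decision. The sketch below applies this to the rules of Table 6; the function name `core_values` is illustrative.

```python
# Table 6 rules: condition values (a, c, d) and decision values (e, f).
CONDITIONS = ("a", "c", "d")
RULES = [
    ((3, 2, 2), (2, 4)),
    ((3, 2, 1), (2, 4)),
    ((2, 2, 1), (1, 4)),
    ((2, 2, 2), (1, 4)),
    ((3, 2, 3), (2, 3)),
    ((4, 2, 3), (2, 3)),
    ((4, 3, 3), (2, 2)),
    ((4, 3, 2), (2, 2)),
]

def core_values(rule_index):
    """Return the condition values that cannot be dropped from one rule."""
    cond, dec = RULES[rule_index]
    core = []
    for i, name in enumerate(CONDITIONS):
        rest = cond[:i] + cond[i + 1:]  # the rule with condition i removed
        for other_cond, other_dec in RULES:
            other_rest = other_cond[:i] + other_cond[i + 1:]
            if other_rest == rest and other_dec != dec:
                # Another rule matches on what is left but decides differently.
                core.append(f"{name}{cond[i]}")
                break
    return core

print(core_values(0))  # rule 1 is at index 0
```

For rule 1 this yields a3 and d2; the value c2 can be dropped because no other rule has a = 3 and d = 2.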

Page 17: Data Mining

Table 6: Inconsistent rows without column a

U    c  d  e  f
1    2  2  2  4
2    2  1  2  4
3    2  1  1  4
4    2  2  1  4
5    2  3  2  3
6    2  3  2  3
7    3  3  2  2
8    3  2  2  2

Page 18: Data Mining

Table 6: Inconsistent rows without column c

U    a  d  e  f
1    3  2  2  4
2    3  1  2  4
3    2  1  1  4
4    2  2  1  4
5    3  3  2  3
6    4  3  2  3
7    4  3  2  2
8    4  2  2  2

Page 19: Data Mining

Table 6: Inconsistent rows without column d

U    a  c  e  f
1    3  2  2  4
2    3  2  2  4
3    2  2  1  4
4    2  2  1  4
5    3  2  2  3
6    4  2  2  3
7    4  3  2  2
8    4  3  2  2

Page 20: Data Mining

Conclusion

We determined that condition attributes a, c, and d are core attributes of the Stoker table, because removing any one of them leaves the table with inconsistent rows.

We then determined that condition attribute b is a superfluous attribute of the Stoker table, because the table remains consistent when b is removed. The remaining attributes a, c, and d form the reduct.

From there we were able to build our final Stoker decision table and algorithms.
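
The whole argument can be summarized in a short sketch that removes each condition attribute in turn and checks whether the table stays consistent. The rows are the reconstruction of Table 1 implied by Tables 2 to 5, and the names used are illustrative assumptions.

```python
CONDITIONS = ("a", "b", "c", "d")

# Rows are (a, b, c, d, e, f); assumed reconstruction of Table 1 from Tables 2-5.
ROWS = [
    (3, 3, 2, 2, 2, 4), (3, 2, 2, 2, 2, 4), (3, 2, 2, 1, 2, 4),
    (2, 2, 2, 1, 1, 4), (2, 2, 2, 2, 1, 4), (3, 2, 2, 3, 2, 3),
    (3, 3, 2, 3, 2, 3), (4, 3, 2, 3, 2, 3), (4, 3, 3, 3, 2, 2),
    (4, 4, 3, 3, 2, 2), (4, 4, 3, 2, 2, 2), (4, 3, 3, 2, 2, 2),
    (4, 2, 3, 2, 2, 2),
]

def consistent_without(attr):
    """Check consistency of the table after removing one condition attribute."""
    keep = [i for i, name in enumerate(CONDITIONS) if name != attr]
    seen = {}
    for row in ROWS:
        cond = tuple(row[i] for i in keep)
        dec = row[len(CONDITIONS):]
        if seen.setdefault(cond, dec) != dec:
            return False
    return True

superfluous = [x for x in CONDITIONS if consistent_without(x)]
core = [x for x in CONDITIONS if not consistent_without(x)]
print("core:", core, "superfluous:", superfluous)
```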

Page 21: Data Mining

Thank you