Data Mining

Chapter 9 - Data Analysis

Presented by: Hair Priya Pucchakayala & Donavon Norwood
Professor: Dr. T. Y. Lin

Outline

• What is Data Analysis?
• Why should data be analyzed?
• What is a Decision Table?
• Condition/Decision attributes of the Stoker table
• Analyzing Inconsistent data in the Stoker table
• Analyzing Consistent data in the Stoker table
• Conclusion

Data Analysis

Data Analysis is the process of gathering, modeling, and transforming data with the goal of highlighting useful information, suggesting conclusions, and supporting decision making.

Data Analysis is the process of systematically applying statistical and/or logical techniques to describe and illustrate, condense and recap, and evaluate data.

Ways to analyze data:

• Compare constantly
• Categorize and sort

Why should data be analyzed?

Data Mining is a particular data analysis technique that focuses on modeling and knowledge discovery for predictive rather than purely descriptive purposes.

Data analysis does NOT explain the data, but it does summarize and organize it.

It tries to make sense of the rows and columns in a decision table.

Decision Table

According to Wikipedia, decision tables are a precise yet compact way to model complicated logic. Decision tables, like if-then-else and switch-case statements, associate conditions with actions to perform. A decision table is typically divided into four quadrants, as shown below:

Conditions    Condition alternatives
Actions       Action entries

The four quadrants

Each decision corresponds to a variable, relation or predicate whose possible values are listed among the condition alternatives. Each action is a procedure or operation to perform, and the entries specify whether (or in what order) the action is to be performed for the set of condition alternatives the entry corresponds to. In this presentation we will use the Stoker decision table as the example of a decision table.
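As a rough illustration of the four-quadrant layout, a decision table can be held in a plain Python dictionary: the keys are condition alternatives and the values are action entries. This is only a sketch. The names `CONDITIONS`, `ACTIONS`, `RULES`, and `decide` are hypothetical, and the value codes are the ones that appear later in this presentation in the reduced Stoker table (Table 6).

```python
# Conditions quadrant: names of the condition attributes.
CONDITIONS = ("a", "c", "d")
# Actions quadrant: names of the decision attributes.
ACTIONS = ("e", "f")

# Condition alternatives (keys) mapped to action entries (values).
# Values copied from Table 6 of this presentation.
RULES = {
    (3, 2, 2): (2, 4),
    (3, 2, 1): (2, 4),
    (2, 2, 1): (1, 4),
    (2, 2, 2): (1, 4),
    (3, 2, 3): (2, 3),
    (4, 2, 3): (2, 3),
    (4, 3, 3): (2, 2),
    (4, 3, 2): (2, 2),
}


def decide(**observed):
    """Look up the action entry for the observed condition values."""
    key = tuple(observed[name] for name in CONDITIONS)
    entry = RULES.get(key)
    if entry is None:
        return None  # no rule covers this combination of conditions
    return dict(zip(ACTIONS, entry))


print(decide(a=3, c=2, d=2))
```

A dictionary lookup plays the role of the if-then-else chain: each key is one column of condition alternatives, and a missing key means the table has no rule for that situation.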

Decision Table as Protocol of Observations

• We will analyze the stoker’s decisions while controlling the clinker kiln.
• The aim of the stoker is to keep the kiln in a “proper” state.
• The actions of the stoker can be described by a “Decision Table”.

Condition/Decision attributes of the Stoker table

• The actions of the stoker can be described by a decision table, where

Condition attributes:
1. Burning zone temperature (BZT)
2. Burning zone color (BZC)
3. Clinker granulation (CG)
4. Kiln inside color (KIC)

Decision attributes:
1. Kiln revolutions (KR)
2. Coal worm revolutions (CWR)

• The table describes the actions undertaken by a stoker during one shift, when specific conditions observed by the stoker have been fulfilled.
• The numbers given in the column TIME are in fact ID labels of the moments of time when each decision took place, and they form the “universe” of the decision table.
• Since many decisions are the same, identical decision rules can be removed.
• The result can be described by the table shown below.

Decision Table 1 with Condition Attributes and Decision Attributes

Derivation of Control Algorithms from Observation

• Checking the consistency of the stoker’s knowledge
• Reducing his knowledge and generating a control algorithm from the stoker’s observed behavior

Table 2: Removing attribute ‘a’ from Table 1

U   b  c  d  e  f
1   3  2  2  2  4
2   2  2  2  2  4
3   2  2  1  2  4
4   2  2  1  1  4
5   2  2  2  1  4
6   2  2  3  2  3
7   3  2  3  2  3
8   3  2  3  2  3
9   3  3  3  2  2
10  4  3  3  2  2
11  4  3  2  2  2
12  3  3  2  2  2
13  2  3  2  2  2

Table 4: Removing attribute ‘c’ from Table 1

U   a  b  d  e  f
1   3  3  2  2  4
2   3  2  2  2  4
3   3  2  1  2  4
4   2  2  1  1  4
5   2  2  2  1  4
6   3  2  3  2  3
7   3  3  3  2  3
8   4  3  3  2  3
9   4  3  3  2  2
10  4  4  3  2  2
11  4  4  2  2  2
12  4  3  2  2  2
13  4  2  2  2  2

Table 5: Removing attribute ‘d’ from Table 1

U   a  b  c  e  f
1   3  3  2  2  4
2   3  2  2  2  4
3   3  2  2  2  4
4   2  2  2  1  4
5   2  2  2  1  4
6   3  2  2  2  3
7   3  3  2  2  3
8   4  3  2  2  3
9   4  3  3  2  2
10  4  4  3  2  2
11  4  4  3  2  2
12  4  3  3  2  2
13  4  2  3  2  2

Table 3: Removing attribute ‘b’ from Table 1

U   a  c  d  e  f
1   3  2  2  2  4
2   3  2  2  2  4
3   3  2  1  2  4
4   2  2  1  1  4
5   2  2  2  1  4
6   3  2  3  2  3
7   3  2  3  2  3
8   4  2  3  2  3
9   4  3  3  2  2
10  4  3  3  2  2
11  4  3  2  2  2
12  4  3  2  2  2
13  4  3  2  2  2
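
The check behind Tables 2 to 5 can be sketched in a few lines of Python: drop one condition attribute and test whether any two rows that agree on the remaining conditions disagree on the decisions (e, f). This is a minimal sketch, not the presenters' implementation. The rows of `TABLE_1` are an assumption, reconstructed by merging the columns of Tables 2 to 5, and the function names are hypothetical.

```python
CONDITIONS = ("a", "b", "c", "d")

# Rows as (a, b, c, d, e, f). Assumption: reconstructed by merging the
# columns of Tables 2-5, since Table 1 itself is not reproduced in the text.
TABLE_1 = [
    (3, 3, 2, 2, 2, 4),
    (3, 2, 2, 2, 2, 4),
    (3, 2, 2, 1, 2, 4),
    (2, 2, 2, 1, 1, 4),
    (2, 2, 2, 2, 1, 4),
    (3, 2, 2, 3, 2, 3),
    (3, 3, 2, 3, 2, 3),
    (4, 3, 2, 3, 2, 3),
    (4, 3, 3, 3, 2, 2),
    (4, 4, 3, 3, 2, 2),
    (4, 4, 3, 2, 2, 2),
    (4, 3, 3, 2, 2, 2),
    (4, 2, 3, 2, 2, 2),
]


def is_consistent(rows, kept_conditions):
    """True if equal values on kept_conditions always give equal decisions."""
    idx = [CONDITIONS.index(name) for name in kept_conditions]
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in idx)
        decision = row[len(CONDITIONS):]
        # Remember the first decision seen for this key; any later
        # row with the same key must carry the same decision.
        if seen.setdefault(key, decision) != decision:
            return False
    return True


def dispensable_attributes(rows):
    """Condition attributes whose removal keeps the table consistent."""
    result = []
    for name in CONDITIONS:
        kept = [c for c in CONDITIONS if c != name]
        if is_consistent(rows, kept):
            result.append(name)
    return result


print(dispensable_attributes(TABLE_1))  # ['b']
```

Under that reconstruction, only the removal of ‘b’ leaves the table consistent, which agrees with Table 3 above.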

Table 6: After removing superfluous attribute ‘b’ and duplicate rules

U   a  c  d  e  f
1   3  2  2  2  4
2   3  2  1  2  4
3   2  2  1  1  4
4   2  2  2  1  4
5   3  2  3  2  3
6   4  2  3  2  3
7   4  3  3  2  2
8   4  3  2  2  2
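
The step from the 13 observations to these 8 rules can be sketched as "drop column b, then keep the first occurrence of each remaining row". The snippet below is an illustrative sketch only. `TABLE_1` is an assumption, reconstructed by merging the columns of Tables 2 to 5, and `drop_column_and_dedupe` is a hypothetical helper name.

```python
COLUMNS = ("a", "b", "c", "d", "e", "f")

# Assumption: rows reconstructed by merging the columns of Tables 2-5.
TABLE_1 = [
    (3, 3, 2, 2, 2, 4),
    (3, 2, 2, 2, 2, 4),
    (3, 2, 2, 1, 2, 4),
    (2, 2, 2, 1, 1, 4),
    (2, 2, 2, 2, 1, 4),
    (3, 2, 2, 3, 2, 3),
    (3, 3, 2, 3, 2, 3),
    (4, 3, 2, 3, 2, 3),
    (4, 3, 3, 3, 2, 2),
    (4, 4, 3, 3, 2, 2),
    (4, 4, 3, 2, 2, 2),
    (4, 3, 3, 2, 2, 2),
    (4, 2, 3, 2, 2, 2),
]


def drop_column_and_dedupe(rows, column):
    """Remove one column, then drop duplicate rows, keeping first-seen order."""
    i = COLUMNS.index(column)
    reduced = (row[:i] + row[i + 1:] for row in rows)
    # dict.fromkeys keeps insertion order, so the first copy of each rule wins.
    return list(dict.fromkeys(reduced))


rules = drop_column_and_dedupe(TABLE_1, "b")
for number, rule in enumerate(rules, start=1):
    print(number, *rule)
```

With the reconstructed rows, the output lists 8 rules over (a, c, d, e, f) in the same order as Table 6.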

For example, let us compute the core values and reduct values for the first decision rule:

a3c2d2 ==> e2f4

In Table 6, the values of ‘a’ and ‘d’ are indispensable in this rule, since the following pairs of rules are inconsistent:

c2d2 ==> e2f4 (rule 1)    c2d2 ==> e1f4 (rule 4)
a3c2 ==> e2f4 (rule 1)    a3c2 ==> e2f3 (rule 5)

Thus a3 and d2 are the core values of the decision rule a3c2d2 ==> e2f4.
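
The same reasoning can be sketched in code: a condition value is a core value of a rule if, once that value is ignored, some other rule matches on the remaining conditions but gives a different decision. This is a hedged sketch using the rules of Table 6. The function name `core_values` is hypothetical.

```python
CONDITIONS = ("a", "c", "d")

# Rules of Table 6 as (a, c, d, e, f).
TABLE_6 = [
    (3, 2, 2, 2, 4),
    (3, 2, 1, 2, 4),
    (2, 2, 1, 1, 4),
    (2, 2, 2, 1, 4),
    (3, 2, 3, 2, 3),
    (4, 2, 3, 2, 3),
    (4, 3, 3, 2, 2),
    (4, 3, 2, 2, 2),
]


def core_values(rules, rule_number):
    """Condition values of one rule that cannot be dropped without conflict."""
    target = rules[rule_number - 1]
    n = len(CONDITIONS)
    core = []
    for i, name in enumerate(CONDITIONS):
        kept = [j for j in range(n) if j != i]
        for other in rules:
            same_conditions = all(other[j] == target[j] for j in kept)
            if same_conditions and other[n:] != target[n:]:
                # Ignoring this value makes the rule clash with another rule.
                core.append(f"{name}{target[i]}")
                break
    return core


print(core_values(TABLE_6, 1))  # ['a3', 'd2']
```

For rule 1 the value c2 is not reported, because no other rule has a3 and d2 with a different decision.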

Table 6 without column ‘a’ (inconsistent rows)

U   c  d  e  f
1   2  2  2  4
2   2  1  2  4
3   2  1  1  4
4   2  2  1  4
5   2  3  2  3
6   2  3  2  3
7   3  3  2  2
8   3  2  2  2

Table 6 without column ‘c’ (inconsistent rows)

U   a  d  e  f
1   3  2  2  4
2   3  1  2  4
3   2  1  1  4
4   2  2  1  4
5   3  3  2  3
6   4  3  2  3
7   4  3  2  2
8   4  2  2  2

Table 6 without column ‘d’ (inconsistent rows)

U   a  c  e  f
1   3  2  2  4
2   3  2  2  4
3   2  2  1  4
4   2  2  1  4
5   3  2  2  3
6   4  2  2  3
7   4  3  2  2
8   4  3  2  2
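
To see which rows clash in each of the three tables above, one can list every pair of Table 6 rules that agree on the remaining conditions but differ in the decision. The sketch below assumes the rule numbering of Table 6. The function name `conflicting_pairs` is hypothetical.

```python
from itertools import combinations

CONDITIONS = ("a", "c", "d")

# Rules of Table 6 as (a, c, d, e, f).
TABLE_6 = [
    (3, 2, 2, 2, 4),
    (3, 2, 1, 2, 4),
    (2, 2, 1, 1, 4),
    (2, 2, 2, 1, 4),
    (3, 2, 3, 2, 3),
    (4, 2, 3, 2, 3),
    (4, 3, 3, 2, 2),
    (4, 3, 2, 2, 2),
]


def conflicting_pairs(rules, dropped):
    """Pairs of rule numbers that become inconsistent without one column."""
    n = len(CONDITIONS)
    kept = [j for j in range(n) if j != CONDITIONS.index(dropped)]
    pairs = []
    numbered = enumerate(rules, start=1)
    for (p, r1), (q, r2) in combinations(numbered, 2):
        same_conditions = all(r1[j] == r2[j] for j in kept)
        if same_conditions and r1[n:] != r2[n:]:
            pairs.append((p, q))
    return pairs


for name in CONDITIONS:
    print(name, conflicting_pairs(TABLE_6, name))
```

Each dropped column yields at least one conflicting pair, so none of a, c, or d can be removed from Table 6 as a whole.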

Conclusion

We determined that condition attributes a, c, and d are core attributes of the Stoker table, because removing any one of them leaves the Stoker table with inconsistent rows.

We then determined that condition attribute b is a superfluous (dispensable) attribute of the Stoker table, because the table remains consistent when b is removed. The remaining attributes a, c, and d therefore form the reduct.

From there we were able to build our final Stoker decision table and algorithms.

Thank you
