Source: smartdigitalcommunity.utm.my/cite/files/2018/05/THIRD-COLLOQUIUM-Application-of-DM-in...

Third Colloquium:

Application of Data Mining in Education

SITI KHADIJAH MOHAMAD

FACULTY OF EDUCATION

APRIL 10 & 11, 2018

Introduction: Data Mining, Software, RQs

Data Mining

Data mining is a technique used to discover patterns in data and to gain knowledge from them.

Machine learning provides the algorithms used in data mining.

Types of DM: Decision tree, Association rules, Clustering, etc.

Supervised and Unsupervised Learning?

Cross validation?
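As a rough illustration of the cross-validation question above: in k-fold cross-validation the instances are split into k folds, and each fold is used once as the test set while the remaining folds are used for training. The sketch below only builds the index splits; the function name `k_fold_indices` and the values 10 instances and k = 5 are assumptions chosen for illustration, not taken from the slides.

```python
def k_fold_indices(n_instances, k):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_instances))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_instances // k + (1 if i < n_instances % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(10, 5))
for train, test in folds:
    print("train:", train, "test:", test)
```

Every instance appears in exactly one test fold, so each one is used for testing once and for training k - 1 times.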

Software

Types: WEKA, Microsoft SQL Server 2008, RapidMiner, Clementine, R

WEKA download: http://www.cs.waikato.ac.nz/ml/weka/

Supported platforms: Linux, Windows, Mac OS

Created by: Researchers at the University of Waikato, New Zealand

Research Question

Association rules, clustering and decision trees are NOT cause-and-effect analyses.

They are about relationship analysis.

Examples of RQs:

1. To develop a decision tree model that can predict students' performance based on the mechanisms of metacognitive scaffolding prompted by the instructor in Facebook discussion.

2. To formulate learning performance pathways based on reflective thinking and types of feedback through educational blogging.

3. How do the provision of feedback and reflective thinking shape the reflection process through educational blogging?

4. To develop deaf students' learning patterns when using the e-learning environment in studying Nuclear Energy.

Decision Tree

• This example relates lifestyle to heart disease.

• Attributes: Age, Smoker (y/n) and Diet (good/poor), plus a label Risk (Less Risk/More Risk).

• The biggest influence on Risk turns out to be the Smoker attribute.

• Smoker becomes the first branch in our tree.

• For smokers, the next most influential attribute happens to be Age; for non-smokers, however, the data indicate that Diet has a bigger influence on the risk.

• The tree keeps branching into two nodes until a classification is reached.

• A decision tree can be a great way to visualize how a decision is derived based on the attributes in your data.

Credit to: refactorthis.net
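One way to see how an attribute such as Smoker ends up as the first branch is to score each attribute by information gain and pick the highest. The sketch below assumes information gain as the splitting criterion (the slides do not say which criterion was used) and runs on hypothetical toy records invented for illustration, with Age simplified to two groups. The names `ROWS`, `COLUMNS`, `info_gain` and `best_split` are all illustrative.

```python
from collections import Counter
from math import log2

# Hypothetical toy records (smoker, age_group, diet, risk), invented for illustration.
ROWS = [
    ("y", "old", "good", "more"), ("y", "old", "good", "more"),
    ("y", "old", "poor", "more"), ("y", "old", "poor", "more"),
    ("y", "young", "good", "more"), ("y", "young", "good", "less"),
    ("y", "young", "poor", "more"), ("y", "young", "poor", "less"),
    ("n", "old", "good", "less"), ("n", "old", "good", "less"),
    ("n", "young", "good", "less"), ("n", "young", "good", "less"),
    ("n", "old", "poor", "more"), ("n", "old", "poor", "less"),
    ("n", "young", "poor", "more"), ("n", "young", "poor", "less"),
]
COLUMNS = {"smoker": 0, "age_group": 1, "diet": 2}
LABEL = 3  # column index of the Risk label


def entropy(rows):
    """Shannon entropy (in bits) of the Risk label over the given rows."""
    counts = Counter(r[LABEL] for r in rows)
    total = len(rows)
    return -sum(c / total * log2(c / total) for c in counts.values())


def info_gain(rows, col):
    """Reduction in label entropy obtained by splitting rows on one column."""
    total = len(rows)
    remainder = 0.0
    for value in set(r[col] for r in rows):
        subset = [r for r in rows if r[col] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(rows) - remainder


def best_split(rows, columns):
    """Name of the attribute with the highest information gain."""
    return max(columns, key=lambda name: info_gain(rows, columns[name]))


root = best_split(ROWS, COLUMNS)
remaining = {name: col for name, col in COLUMNS.items() if name != root}
second_level = {}
for value in sorted(set(r[COLUMNS[root]] for r in ROWS)):
    branch = [r for r in ROWS if r[COLUMNS[root]] == value]
    second_level[value] = best_split(branch, remaining)

print("first split:", root)
print("next split per branch:", second_level)
```

On this toy data the first split is the smoker attribute, the smoker branch splits next on age group, and the non-smoker branch splits next on diet, which mirrors the shape of the tree described in the bullets.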

Association Rules

Q1 Q2 T1 conf: (1)

Q7 T3 conf: (0.92)

T2 Q2 conf: (0.5)

Support (coverage) and Confidence (accuracy)
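Support and confidence can be computed directly from a list of transactions. The sketch below uses hypothetical transactions and hypothetical rule directions; the item names are borrowed from the rules above only as labels, and the functions `support` and `confidence` are illustrative, not WEKA's implementation.

```python
# Hypothetical transactions: each set lists the items observed together once.
TRANSACTIONS = [
    {"Q1", "Q2", "T1"},
    {"Q1", "Q2", "T1"},
    {"Q2", "T2"},
    {"Q1", "T2"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Among transactions matching the antecedent, the fraction that also match the consequent."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


# Hypothetical rule: {Q1, Q2} => {T1}
rule_support = support({"Q1", "Q2", "T1"}, TRANSACTIONS)
rule_conf = confidence({"Q1", "Q2"}, {"T1"}, TRANSACTIONS)
print(f"support = {rule_support}, confidence = {rule_conf}")
```

Support tells how much of the data a rule covers, while confidence tells how often the rule is right when its left-hand side applies.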

Clustering

Credit to: Almodiel

WEKA Workbench 2

WEKA Workbench (1)

• Performance Comparison

• Graphical Interface

• Classifiers

• Command-line Interface

WEKA Workbench (2)

Supply data here

Details of the data

• Attributes == Variables

• Instances == Number of samples

Preprocess Tab

4 options to classify the data

WEKA Workbench (3)

Classify Tab (also known as postprocessing tab)

Results panel

Lists of algorithms

Right click here to view the tree

What Do Precision and Recall Tell Us?

Precision: Given all the predicted labels (for a given class X), how many instances were correctly predicted?

Recall: For all instances that should have a label X, how many of these were correctly captured?

Suppose a computer program for recognizing dogs in scenes from a video identifies 7 dogs in a scene containing 9 dogs and some cats. If 4 of the identifications are correct, but 3 are actually cats, the program's precision is 4/7 while its recall is 4/9.
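The dog example can be checked with a few lines of Python. This is a minimal sketch; the function name `precision_recall` and its parameter names are illustrative.

```python
def precision_recall(true_positives, predicted_positives, actual_positives):
    """Return (precision, recall) from raw counts."""
    precision = true_positives / predicted_positives  # correct among those predicted
    recall = true_positives / actual_positives        # correct among those that exist
    return precision, recall


# 7 identifications, 4 of them correct, 9 dogs actually in the scene.
precision, recall = precision_recall(true_positives=4, predicted_positives=7, actual_positives=9)
print(f"precision = {precision:.3f}, recall = {recall:.3f}")
```

The two measures share the same numerator (the 4 correct identifications) and differ only in what they divide by: everything the program flagged, or everything it should have flagged.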

Application & Interpretation

True Positives and True Negatives: correct classifications

False Positives: the outcome is incorrectly predicted as yes when it is actually no

False Negatives: the outcome is incorrectly predicted as no when it is actually yes

Credit to: Wikipedia

Calculate Recall for Class A:

= TP_A / (TP_A + FN_A)

= 10 / (10 + 2)

= 0.83

                 Predicted Class
                 a     b     c     Total
Actual    a     10     1     1     12
          b      2     0     1      3
          c      1     0     0      1
Total           13     1     2     16

Application & Interpretation

Calculate Precision for Class A:

= TP_A / (TP_A + FP_A)

= 10 / (10 + 3)

= 0.769
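The same calculation can be repeated for every class in the confusion matrix. The sketch below copies the counts from the table (rows are actual classes, columns are predicted classes). Returning 0.0 when a denominator is zero is an assumption made here for convenience, and the name `per_class_metrics` is illustrative.

```python
LABELS = ["a", "b", "c"]
# Rows are actual classes, columns are predicted classes.
MATRIX = [
    [10, 1, 1],
    [2, 0, 1],
    [1, 0, 0],
]


def per_class_metrics(matrix, index):
    """Return (precision, recall) for the class at the given row/column index."""
    tp = matrix[index][index]
    fn = sum(matrix[index]) - tp                 # rest of the actual-class row
    fp = sum(row[index] for row in matrix) - tp  # rest of the predicted-class column
    precision = tp / (tp + fp) if tp + fp else 0.0  # assumed convention for empty columns
    recall = tp / (tp + fn) if tp + fn else 0.0     # assumed convention for empty rows
    return precision, recall


results = {label: per_class_metrics(MATRIX, i) for i, label in enumerate(LABELS)}
for label, (precision, recall) in results.items():
    print(f"class {label}: precision = {precision:.3f}, recall = {recall:.3f}")
```

For class a this reproduces the values worked out by hand (precision 10/13, recall 10/12); classes b and c have no true positives in this table, so both measures are 0 for them.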

Thank You! Questions?
