Data Mining & Methods. Remzi Düzağaç. February 11, 2015.
Outline: Introduction; What is data mining?; Usage Areas; How does it work?; Methods & Algorithms; Classification & Prediction; Clustering; Genetic Algorithm; Questions.
Page 1: Datamining

Data Mining & Methods

Remzi Duzagac

February 11, 2015

Page 2: Datamining

What is data mining?

Data mining is the task of discovering interesting patterns from large amounts of data.

Page 3: Datamining

Why do we need data mining?

Computers have promised us a fountain of wisdom but delivered a flood of data

Data explosion problem

Automated data collection tools and mature database technology lead to tremendous amounts of data stored in databases, data warehouses and other information repositories

Page 4: Datamining

Why do we need data mining?

We are drowning in data, but starving for knowledge

The greatest problem of today is how to teach people to ignore the irrelevant, how to refuse to know things, before they are suffocated. For too many facts are as bad as none at all. (W.H. Auden)

Page 5: Datamining

Medical / Pharma

Computer Assisted Diagnosis (expert systems learning)

Characterization/prediction of patient’s response to product dosage

Identification of successful medical therapies (successful prescription patterns).

Study of relations between dosage and potentially related adverse events

Page 6: Datamining

Insurance and Health Care

Discovery of medical procedures that are claimed together through claims analysis

Identification of customers that are potential buyers for new policies.

Detection of behavior patterns capable of identifying risky customers.

Detection of fraudulent behavior.

Page 7: Datamining

Retail / Marketing

Discovery of buying behavior patterns

Detection of associations among customer characteristics.

Prediction of the probability that clients respond to a mailing.

Page 8: Datamining

Banking / Finance

Detection of fraudulent credit card usage patterns.

Risk management for the granting of loans using scorecards.

Discovery of hidden correlations between different financial indicators.

Identification of stock trading rules from historical market data.

Page 9: Datamining

Computer Science

Image processing

Natural language processing

Information retrieval (search engines)

Bioinformatics

Page 10: Datamining

Real Estate

...

...

...

...

...

Page 11: Datamining

Steps

Data cleaning: handle missing values, noisy data, and inconsistent data

Data integration: merging data from multiple data stores

Data selection: select the data relevant to the analysis

Data transformation: aggregation (daily sales to weekly or monthly sales) or generalisation (street to city; age to young, middle age and senior)

Data mining: apply intelligent methods to extract patterns

Pattern evaluation: interesting patterns should contradict the user’s belief or confirm a hypothesis the user wished to validate

Knowledge presentation: visualisation and representation techniques to present the mined knowledge to the user
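
The cleaning and transformation steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sales records, the rule that negative amounts count as inconsistent, and the age cut-offs (30 and 60) are made up for the example and do not come from the slides.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw records: (day, store, sales amount or None if missing)
raw = [
    (date(2015, 2, 2), "A", 120.0),
    (date(2015, 2, 3), "A", None),    # missing value
    (date(2015, 2, 4), "A", 80.0),
    (date(2015, 2, 9), "A", 100.0),
    (date(2015, 2, 10), "A", -5.0),   # inconsistent: negative sales (assumed rule)
]

def clean(records):
    """Data cleaning: drop records with missing or inconsistent amounts."""
    return [r for r in records if r[2] is not None and r[2] >= 0]

def weekly_sales(records):
    """Data transformation by aggregation: daily sales -> weekly totals."""
    totals = defaultdict(float)
    for day, _store, amount in records:
        year, week, _weekday = day.isocalendar()
        totals[(year, week)] += amount
    return dict(totals)

def age_group(age):
    """Data transformation by generalisation: age -> young / middle age / senior.
    The cut-offs 30 and 60 are assumptions chosen for illustration."""
    if age < 30:
        return "young"
    if age < 60:
        return "middle age"
    return "senior"

cleaned = clean(raw)
weekly = weekly_sales(cleaned)
print(weekly)
print([age_group(a) for a in (25, 45, 70)])
```

Dropping bad records is the simplest cleaning policy; imputing missing values (for example with a mean) is a common alternative when data is scarce.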

Page 12: Datamining

Classification

Decision Tree Learning

Bayesian Learning (Naive Bayes, Bayesian Tree)

k-Nearest Neighbors (KNN)

Neural Networks
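
As one concrete instance of the classifiers listed above, here is a minimal k-nearest-neighbors sketch in plain Python. The function name `knn_predict`, the toy training data, and the choice of Euclidean distance with k = 3 are assumptions made for illustration, not details from the slides.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.
    `train` is a list of (features, label) pairs; distance is Euclidean."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _features, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Made-up toy data: two numeric features per customer -> risk label
train = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((2.0, 1.5), "low"),
    ((6.0, 7.0), "high"),
    ((7.0, 6.5), "high"),
    ((6.5, 8.0), "high"),
]

print(knn_predict(train, (1.2, 1.4)))
print(knn_predict(train, (6.8, 7.2)))
```

KNN has no training phase: all the work happens at query time, which keeps the code short but makes prediction slow on large data sets.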

Page 13: Datamining

Prediction

Regression (Linear, Multiple, Non-Linear)
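
For the linear case, prediction amounts to fitting a line y = slope * x + intercept by ordinary least squares. The sketch below uses the closed-form solution for a single input variable; the function names and the sample points are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input variable.
    Returns (slope, intercept) minimising the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Predict a numeric value for a new input x."""
    return slope * x + intercept

# Made-up sample points lying exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept, predict(5.0, slope, intercept))
```

Multiple regression generalises this to several input variables, and non-linear regression replaces the line with another function family.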

Page 14: Datamining

Clustering

Hierarchical clustering

K-Means

Markov Cluster Algorithm
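
Of the clustering methods listed, K-Means is the easiest to show compactly. The sketch below alternates an assignment step and a centroid update step for a fixed number of iterations. The function name `k_means`, the sample points, and the explicitly supplied starting centroids are assumptions for illustration; practical implementations usually pick starting centroids randomly and stop on convergence.

```python
import math

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its assigned points."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: recompute centroids; an empty cluster keeps its old centroid
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Made-up 2-D points forming two visible groups
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = k_means(points, [(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

Unlike classification, no labels are given: the groups emerge only from the distances between points.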

Page 15: Datamining

Genetic Algorithm

Genetic Algorithm

Genetic Programming
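
A genetic algorithm evolves a population of candidate solutions through selection, crossover, and mutation. The sketch below applies it to a toy problem (maximising the number of 1-bits in a bit string, often called OneMax). The problem, the parameter values, and the specific operators (tournament selection, single-point crossover, elitism) are assumptions chosen for illustration; the slides do not specify a variant.

```python
import random

def fitness(bits):
    """Toy objective (assumed for illustration): the number of 1-bits."""
    return sum(bits)

def evolve(length=20, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    """Run a simple genetic algorithm; returns (best individual, best fitness per generation)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        next_gen = [population[0][:]]  # elitism: keep the best individual unchanged
        while len(next_gen) < pop_size:
            # Tournament selection: the fitter of two random individuals becomes a parent
            parent_a = max(rng.sample(population, 2), key=fitness)
            parent_b = max(rng.sample(population, 2), key=fitness)
            # Single-point crossover
            cut = rng.randrange(1, length)
            child = parent_a[:cut] + parent_b[cut:]
            # Mutation: flip each bit with a small probability
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history

best, history = evolve()
print(fitness(best), history[0], history[-1])
```

Because the best individual is always carried over, the best fitness never decreases from one generation to the next. Genetic programming follows the same loop but evolves programs (typically expression trees) instead of fixed-length strings.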

Page 16: Datamining

Questions