DATA MINING Holly B. Smith University of North Texas

DATA MINING

Holly B. Smith University of North Texas


Copyright and Terms of Service

Copyright © Texas Education Agency. The materials found on this website are copyrighted © and trademarked ™ as the property of the Texas Education Agency and may not be reproduced without the express written permission of the Texas Education Agency, except under the following conditions:

1) Texas public school districts, charter schools, and Education Service Centers may reproduce and use copies of the Materials and Related Materials for the districts’ and schools’ educational use without obtaining permission from the Texas Education Agency;

2) Residents of the state of Texas may reproduce and use copies of the Materials and Related Materials for individual personal use only without obtaining written permission of the Texas Education Agency;

3) Any portion reproduced must be reproduced in its entirety and remain unedited, unaltered and unchanged in any way;

4) No monetary charge can be made for the reproduced materials or any document containing them; however, a reasonable charge to cover only the cost of reproduction and distribution may be charged.

Private entities or persons located in Texas that are not Texas public school districts or Texas charter schools or any entity, whether public or private, educational or non-educational, located outside the state of Texas MUST obtain written approval from the Texas Education Agency and will be required to enter into a license agreement that may involve the payment of a licensing fee or a royalty fee.

Call TEA Copyrights with any questions you have.

Copyright © Texas Education Agency, 2012. All rights reserved.


1.8

TEN

2020


“Truth is a gem that is found at a great depth…”

Lord Byron


BY

• Definition
• Methods Used

OF

• Who Mines/Uses Data
• Storage Issues

FOR

• TEKS
• Classroom Activities


Patterns


A 6-step process:

1. Defining
2. Preparing
3. Exploring
4. Building
5. Exploring and validating
6. Deploying and updating


Questions that Data Mining answers:

What goods should be promoted to this customer?

What is the probability that a certain customer will respond to a planned promotion?

Can one predict the most profitable securities to buy/sell during the next trading session?

Will this customer default on a loan or pay back on schedule?

Can you think of others?


1990 Data Mining was introduced

Family tree has three branches:

• classical statistics
• artificial intelligence
• machine learning


Techniques in Data Mining

Association Rule:

• discovers interesting associations between attributes contained in a database
• also known as market basket analysis
• tells, if item X is part of an event, what percentage of the time item Y is also part of that event
• Who uses market basket analysis?
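The "if X, then what percentage Y" idea can be tried out on a handful of shopping baskets. The sketch below is only an illustration: the baskets, the item names, and the helper names `support` and `confidence` are made up for this example and are not part of the original slides.

```python
# Hypothetical market baskets; every item here is invented for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]


def support(itemset, baskets):
    """Fraction of all baskets that contain every item in itemset."""
    hits = sum(1 for basket in baskets if itemset <= basket)
    return hits / len(baskets)


def confidence(x, y, baskets):
    """Of the baskets that contain x, the fraction that also contain y."""
    with_x = [basket for basket in baskets if x in basket]
    if not with_x:
        return 0.0
    return sum(1 for basket in with_x if y in basket) / len(with_x)


bread_to_milk = confidence("bread", "milk", baskets)
print(f"{bread_to_milk:.0%} of baskets with bread also contain milk")
```

A store might use a high percentage like this to decide which items to shelve or promote together.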

Clustering:

• finds appropriate groupings of elements for a set of data
• a kind of undirected knowledge discovery or unsupervised learning; that is, there is no target field, and the relationships among the data are identified by a bottom-up approach
• Who uses clustering?
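One common way to cluster is k-means, which the slides do not name; it is used here only as an example of grouping without a target field. The ages, the starting centers, and the function name `k_means_1d` are assumptions chosen for illustration.

```python
def k_means_1d(values, centers, rounds=10):
    """Assign each number to its nearest center, then move each center
    to the mean of its group; repeat for a fixed number of rounds."""
    centers = list(centers)
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # A center with an empty group stays where it was.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Hypothetical customer ages; no labels are supplied, only the raw numbers.
ages = [15, 16, 17, 18, 40, 42, 45, 47]
centers, groups = k_means_1d(ages, centers=[10, 50])
print(centers)
print(groups)
```

Nothing tells the program which customers are "teens" or "adults"; the two groups emerge from the data alone, which is the bottom-up approach described above.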


Techniques in Data Mining

Decision Trees:

• predicts the value of a target
• based on several variables
• simple to interpret and understand; performs well with a large amount of data in a short time
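A decision tree can be pictured as a chain of yes/no questions that ends in a prediction. The tree below is written by hand to show how a prediction is read off; a real tree would be learned from past records. The field names, thresholds, and outcomes are invented for this sketch.

```python
# A tiny hand-written tree; field names and thresholds are hypothetical.
tree = {
    "question": ("income", 30000),         # is income >= 30000?
    "yes": {
        "question": ("late_payments", 2),  # are late payments >= 2?
        "yes": "default",
        "no": "repay",
    },
    "no": "default",
}


def predict(node, record):
    """Walk from the root to a leaf, answering one question per level."""
    while isinstance(node, dict):
        field, threshold = node["question"]
        node = node["yes"] if record[field] >= threshold else node["no"]
    return node


applicant = {"income": 45000, "late_payments": 0}
print(predict(tree, applicant))
```

Because each step is a plain question about one variable, a person can follow the path by hand, which is why decision trees are considered easy to interpret.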


Techniques in Data Mining

Neural Network:

• often represented as a layered set of interconnected processors, or neurodes
• uses a complex mathematical process to resemble the brain
• artificial neural networks can examine entire warehouses of databases
• large companies can use them to spot trends (e.g., silly bands)
• also used by the science and engineering community, in geographical systems, and in gaming (e.g., a machine playing chess)
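The "layered set of interconnected processors" can be shown with three artificial neurons in two layers. This is a minimal sketch with weights picked by hand; real networks learn their weights from data, and the names `step`, `neuron`, and `tiny_network` are made up for the example.

```python
def step(z):
    """Fire (1) when the total signal is positive, otherwise stay off (0)."""
    return 1 if z > 0 else 0


def neuron(inputs, weights, bias):
    """One processor: weigh each input, add a bias, then decide to fire."""
    return step(sum(i * w for i, w in zip(inputs, weights)) + bias)


def tiny_network(x1, x2):
    """Two layers with hand-picked weights: fires when exactly one input is on."""
    hidden = [
        neuron([x1, x2], [1, 1], -0.5),   # fires if either input is on
        neuron([x1, x2], [1, 1], -1.5),   # fires only if both inputs are on
    ]
    return neuron(hidden, [1, -1], -0.5)  # "either" but not "both"


for a in (0, 1):
    for b in (0, 1):
        print(a, b, tiny_network(a, b))
```

No single neuron in this sketch can tell "exactly one input is on" by itself; the answer comes from the layers working together.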


Data Mining Example #1

• Location-Aware Mobile Social Networks

• Loopt https://www.loopt.com/
• Whrrl (recently acquired by Groupon)
• foursquare https://foursquare.com/
• buddycloud http://buddycloud.com/
• Brightkite http://brightkite.com/

(currently being revised)


“mental pain and distress, far greater than could be inflicted by mere bodily harm.”

Louis Brandeis


Counterterrorism


“the Industrial Revolution of data”

Joe Hellerstein, Computer Science, University of California, Berkeley


STORAGE

SOFTWARE

• Extracts data from the warehouse
• Manages the new database
• Retrieves and analyzes the data

METADATA

• Information about the data in the warehouse
• Used to run searches
• Used to identify key pieces of data


Storage Issues


There is an upside


Data Mining Example #2

• Reality Mining

• http://www.google.org/flutrends

• http://www.google.org/flutrends/video/GoogleFluTrends_USFluActivity.mov


Define fields

Enter database structure

Analyze company’s data requirements

Access information in the database system

Create a meaningful data set

Import and export databases


• Data Entry
• Data Analysis
• Data Collection


www.wpi.edu


I Want YOU to Mine Data


Resources

Slide Source

3 http://en.thinkexist.com/search/searchquotation.asp?search=gems
4 http://thoughts.forbes.com/thoughts/truth-lord-byron-truth-is-a
6 http://www.unc.edu/~xluan/258/datamining.html
7 http://terrorism.about.com/od/counterterrorism/a/DataMining.htm
8 Journal of Engineering Science and Technology Vol. 6, No. 2 (2011), by Francisca Nonyelum Ogwueleka
9 http://www.unc.edu/~xluan/258/datamining.html
10 http://www.time.com/time/printout/0,8816,2058205,00.html
10 http://www.unc.edu/~xluan/258/datamining.html
11 www.itworld.com/050805datamining
12 http://www.unc.edu/~xluan/258/datamining.html
13 http://www.unc.edu/~xluan/258/datamining.html
14 http://www.time.com/time/printout/0,8816,2058505,00.html
15 http://www.theatlantic.com/health/print/2012/05/using-data-mining-to-predict-epidemics
16 http://terrorism.about.com/od/counterterrorism/a/DataMining.htm
19 http://seqcc.icarnegie.com/content/SSD/SSD7/1.5.2/normal/pg-trends/pg-datawarehouse/pg-datawarehouse.html
20 http://articles.cnn.com/2008-02-20/living/cb.top.jobs.top.industries_1_job-growth-median-annual-salary-physician-assistants?_s=PM:LIVING
23 http://3.bp.blogspot.com/-0mZJJX8zPnc/Tbcusl4kDzI/AAAAAAAAAIY/vf7apxeUxEA/s1600/Bizarre-Architecture-Nord-LB-Building-Hanover-Germany.jpg
23 http://fashionshow007.files.wordpress.com/2010/04/design-for-fashion.jpg
24 http://www.databaseguides.com/wp-content/uploads/2009/01/database-backup.jpg
25 http://sp.life123.com/bm.pix/database1.s600x600.jpg
26 http://www.ehow.com/info_7896179_middle-school-math-data-projects.html#ixzz1vu1kC5VT
26 http://chinwag.com/files/images/photos/junk_mail.jpg
28 http://donmamporro.files.wordpress.com/2010/04/poster1-3-i-want-you.jpg
29 http://www.zealeap.com/wp-content/uploads/2010/11/Thank-you-sign1.jpg