
Data Mining: STATISTICA - University of Iowa
user.engineering.uiowa.edu/~ie_155/Lecture/Statistica.pdf

Feb 16, 2019


The University of Iowa Intelligent Systems Laboratory

Data Mining: STATISTICA


Outline

• Prepare the data
• Classification and regression


Prepare the Data
• Statistica can read Excel, .txt, and many other file types
• Compared with WEKA, data preparation is much easier in Statistica


Open an Excel File
• Click “Import selected sheet to Spreadsheet”
• Select the Excel sheet where your data is stored
• Get variable names from the first row
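
Outside Statistica, the idea of taking variable names from the first row can be sketched in Python. This is only an illustration using the standard library: it reads tab-delimited text rather than an Excel sheet, and the sample text, the column names, and the helper `read_table` are all made up for the example.

```python
import csv
import io

# Hypothetical tab-delimited data; the first row holds the variable names.
SAMPLE = "length\twidth\tspecies\n5.0\t3.5\tA\n6.5\t3.0\tB\n"

def read_table(text, delimiter="\t"):
    """Return (variable_names, rows); each row is a dict keyed by variable name."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    rows = list(reader)  # reading the rows also consumes the header line
    return reader.fieldnames, rows

names, rows = read_table(SAMPLE)
print(names)
print(rows[0])
```

Every value is still text at this point, which is why a separate step for setting variable types is needed.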


Open an Excel File
• Change variable type
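
As a rough analogue of changing a variable's type, the sketch below casts one column of imported text to a numeric type. The kind names "continuous" and "categorical", the helper `convert_column`, and the sample rows are assumptions made for this illustration, not Statistica terminology or data.

```python
def convert_column(rows, name, kind):
    """Return a copy of rows with column `name` cast to the requested kind.

    kind is "continuous" (float) or "categorical" (kept as a string label).
    """
    casters = {"continuous": float, "categorical": str}
    cast = casters[kind]
    return [{**row, name: cast(row[name])} for row in rows]

# Hypothetical rows as they might look right after import (all text).
raw = [{"width": "3.5", "species": "A"}, {"width": "3.0", "species": "B"}]
typed = convert_column(raw, "width", "continuous")
print(typed)
```

The function returns new rows and leaves the imported ones unchanged, so a wrong type choice can be redone from the original data.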



Classification and Regression

• C&RT

• Boosting tree

• Neural Networks


C&RT Classification
• The Iris data set is used as an example


C&RT Classification
• Open the “Data Mining” menu and select “Interactive Trees”


C&RT Classification
• View the final tree and interpret the results
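
To help read a classification tree, it can be useful to see how a single split might be chosen. C&RT-style trees commonly pick the threshold that minimizes an impurity measure such as the Gini index; the sketch below does this for one predictor. It is an illustration only, not Statistica's implementation, and the values, labels, and function names are made up rather than taken from the Iris data.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best split `value <= threshold`."""
    best = (None, float("inf"))
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

# Made-up measurements and class labels, for illustration only.
petal_length = [1.4, 1.5, 1.3, 4.5, 4.7, 5.1]
species = ["a", "a", "a", "b", "b", "b"]
threshold, impurity = best_split(petal_length, species)
print(threshold, impurity)
```

Each internal node of a tree corresponds to one such threshold test, and a full tree repeats the search within each resulting subset.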


C&RT---Regression
• Use the CPU data set and select regression analysis

Don’t check it


C&RT---Regression
• Regression tree structure
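
The structure of a regression tree can be illustrated with a single split: each side of the split predicts the mean of its target values, and the threshold is chosen to minimize the total squared error. The sketch below assumes a least-squares criterion and at least two distinct predictor values; the data and function names are made up and are not the CPU data set.

```python
def sse(ys):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def best_regression_split(xs, ys):
    """Return (threshold, left_mean, right_mean) minimizing total SSE."""
    best_t, best_cost = None, float("inf")
    distinct = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        cost = sse(left) + sse(right)
        if cost < best_cost:
            best_t, best_cost = t, cost
    left = [y for x, y in zip(xs, ys) if x <= best_t]
    right = [y for x, y in zip(xs, ys) if x > best_t]
    return best_t, sum(left) / len(left), sum(right) / len(right)

# Made-up predictor and target values, for illustration only.
xs = [1, 2, 3, 10, 11, 12]
ys = [10.0, 12.0, 11.0, 50.0, 52.0, 51.0]
print(best_regression_split(xs, ys))
```

The two means returned here play the role of the predicted values shown in the leaves of a regression tree.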


C&RT---Regression
[Figure: plot with axis label “Predicted values”]


Boosting tree Classification
• In the “Data Mining” menu, select “Boosted Trees”


Boosting tree Classification
• See the results and the predictor importance



Boosting tree Regression
• CPU data set


Boosting tree Regression
• See the results and the predictor importance
[Figure: plot with axis label “Predicted values”]


Boosting tree Regression
• See the results of observed values vs. predicted values
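
The comparison of observed and predicted values can be reproduced in miniature. The sketch below assumes a plain gradient-boosting scheme for squared error, in which one-split trees (stumps) are fitted to the current residuals and added with a small learning rate. It is not necessarily how Statistica builds its boosted trees, and the data, function names, and parameter values are made up for illustration.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree (stump) to the residuals."""
    best = None
    distinct = sorted(set(xs))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        cost = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or cost < best[0]:
            best = (cost, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def boost(xs, ys, n_trees=50, learning_rate=0.1):
    """Return a predictor built by adding shrunken stumps fitted to residuals."""
    base = sum(ys) / len(ys)  # start from the mean of the target
    stumps = []
    preds = [base] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
model = boost(xs, ys)
for x, y in zip(xs, ys):
    print(x, y, round(model(x), 2))  # predictor, observed, predicted
```

If the model fits well, the observed and predicted columns are close, which corresponds to points lying near the diagonal in an observed vs. predicted plot.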



Neural Networks Classification
• In the “Data Mining” menu, select “Automated Neural Networks”


Neural Networks Classification
• Choose “Classification”, then select variables


Neural Networks Classification
• Statistica will try a set of different neural networks and keep the best ones


Neural Networks Classification
• See the classification results


Neural Networks Classification
• See the classification results---Predictions



Neural Networks Classification
• See the classification results---Confusion matrix
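
A confusion matrix counts how often each observed class was assigned to each predicted class. The sketch below builds one from two lists of labels, with rows for the observed class and columns for the predicted class. That row/column orientation is an assumption and may differ from the layout Statistica displays, and the labels are made up for illustration.

```python
from collections import Counter

def confusion_matrix(observed, predicted):
    """Return (labels, matrix) with rows = observed class, columns = predicted class."""
    labels = sorted(set(observed) | set(predicted))
    counts = Counter(zip(observed, predicted))
    matrix = [[counts[(o, p)] for p in labels] for o in labels]
    return labels, matrix

# Made-up observed and predicted labels, for illustration only.
observed = ["a", "a", "b", "b", "c", "c"]
predicted = ["a", "b", "b", "b", "c", "a"]
labels, matrix = confusion_matrix(observed, predicted)
# Accuracy is the share of cases on the diagonal (observed == predicted).
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(observed)
print(labels)
for row in matrix:
    print(row)
print(accuracy)
```

Off-diagonal counts show which classes the model confuses with each other, which the overall accuracy alone does not reveal.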


Neural Networks Regression
• CPU data set


Neural Networks Regression
• CPU data set, select variables


Neural Networks Regression
• Training and results


Neural Networks Regression
• Predictions


Neural Networks Regression
• Some statistics about the predictions
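
The transcript does not say which statistics Statistica reports. As an assumption, the sketch below computes three common ones for regression predictions: mean absolute error, root mean squared error, and the Pearson correlation between observed and predicted values. The numbers and the function name are made up for illustration.

```python
import math

def prediction_stats(observed, predicted):
    """Return mean absolute error, RMSE, and Pearson correlation."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    so = math.sqrt(sum((o - mo) ** 2 for o in observed))
    sp = math.sqrt(sum((p - mp) ** 2 for p in predicted))
    return {"mae": mae, "rmse": rmse, "correlation": cov / (so * sp)}

# Made-up observed and predicted values, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
print(prediction_stats(observed, predicted))
```

The two error measures are in the units of the target variable, while the correlation is unit-free and only indicates how closely predictions track the observed values.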