Access to Modern Multivariate Data Analysis Pengchao Si (司鹏超) Shandong University 08-09-2011
The world is multivariate……
From www.theblessednaturalist.com
Woman With Weather ?
dreamstime.com
Health conditions
Social environment
Love index
Food & drink
Shopping requirements
Heavy pressure
(and so on……)
Health status also depends on several factors: genes, social position, eating habits, stress……
Age
Energy
Why learn Multivariate Data Analysis?
• Explanation of a social or physical phenomenon must be tested by gathering and analyzing data
• Complexities of most phenomena require an investigator to collect observations on many different variables
Why learn Multivariate Data Analysis (MVDA)?
We are Materials Men! (M_M)
Access to Modern Multivariate Data Analysis
Overview
Course Objective
Introduce fundamental concepts and methods in multivariate data analysis.
You will learn about:
• methodology to analyze data
• how to do multivariate data analysis
• processing multivariate data via professional software
This course will help you to:
• practice the methodology properly
• use multivariate data analysis in practice
• improve your English and team spirit
Assoc. Prof.: Pengchao Si
88399858 (office)
[email protected]
Course outline
Time: 1st semester of the academic year (Fall semester), L7-8, Thursday (16:00 - 18:00)
Location: South-Loop Campus 5-202
Activities:
• to present methods for data analysis
• to announce reading and group work
• to take quizzes and final oral defence
Course outline
Teaching purposes:
This course aims to acquaint students with the fundamental concepts and methods of multivariate data analysis, taught in English.
Through online experimental work, the well-known professional software Unscrambler will be introduced to the students.
At the same time, the students' English will be improved and strengthened.
By the end of the course, students will understand the basic concepts and methods of multivariate data analysis.
Course outline
Schedule
I Overview of Multivariate Data Analysis (6 hrs)
1. Overview
2. An Introduction to Multivariate Methods
3. A Review of Statistical Concepts and Methods
II Preparing for a MV Analysis (6 hrs)
4. Examining Your Data
5. Principal Components Analysis
6. Factor Analysis
III Dependence Techniques (12 hrs)
7. Multiple Regression Analysis
8. Principal Component Regression (PCR)
9. Partial Least Squares Regression (PLS)
IV Introduction to Data Analysis Software (8 hrs)
11. Brief overview of statistical software
12. Introduction to Unscrambler
13. Experimental work
Course outline
• Theoretical part 28 hrs
• Practical part 4 hrs
• Software: Unscrambler X
• Fully English teaching with group work
• Fully English oral defence (Pass / No pass)
Course outline
• Course website (under construction……)
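• References:
1. 《多元统计分析与应用》 (Multivariate Statistical Analysis and Applications), 余锦华 (Yu Jinhua), 杨维权 (Yang Weiquan)
2. 《数据分析》 (Data Analysis), 2nd edition, 范金城 (Fan Jincheng), 梅长林 (Mei Changlin)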
More references:
Analyzing Multivariate Data, James Lattin, Douglas Carroll, Paul Green
Multivariate Data Analysis (7th Edition), Joseph F. Hair, William C. Black, Barry J. Babin, Rolph E. Anderson
References:
Multivariate Data Analysis - in practice, Kim H. Esbensen (Author), Dominique Guyot (Editor), Frank Westad (Editor), Lars P. Houmøller (Editor)
Several variables: Multivariate
Only one variable: Univariate
Direct and Indirect Observations
http://wi17.com/view.php?id=179
Long distance and high temperature IR thermometer (-10~1450℃)
Data must carry useful information!
Some basic concepts
1. Objects are the entities on which the measurements are taken. Eg. persons, items……
2. Variables are the aspects of the objects that are measured. Eg. size, weight……
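The two concepts above can be pictured as a data table: one row per object, one column per variable. The sketch below is only an illustration; the persons, variable names, and values are made up for this example and do not come from the course material.

```python
# Hypothetical data table: rows are objects (persons), columns are variables.
variables = ["height_cm", "weight_kg"]
objects = ["person_A", "person_B", "person_C"]
X = [
    [170.0, 65.0],
    [158.0, 52.0],
    [181.0, 80.0],
]

n_objects = len(X)        # number of rows (objects)
n_variables = len(X[0])   # number of columns (variables)

# One mean per variable, computed down each column.
column_means = [sum(row[j] for row in X) / n_objects for j in range(n_variables)]

for name, mean_value in zip(variables, column_means):
    print(f"{name}: mean = {mean_value:.2f}")
```

With more than one column the table is multivariate; keeping only a single column would make it univariate.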
Some basic concepts
Measurement Scales
1. Nonmetric Measurement Scales
1) Nominal scale data are described categorically. Eg. female is 1; male is 2.
2) Ordinal scale data are ranked data. Eg. which fruits do you prefer: orange, apple, pear, banana, or durian? Banana > Orange > Pear > Apple > Durian
Some basic concepts
2. Metric Measurement Scales
1) Interval scale data allow us to say how much more of the measured characteristic is possessed by one object than by another.
Eg.
Preference Level        Scale Value
Very high preference    5
High preference         4
Moderate preference     3
Low preference          2
Very low preference     1
Some basic concepts
2) Ratio scale data have the same properties as interval scale data but also possess a meaningful origin (a true zero point).
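One practical consequence of the four scales is that each supports different summaries. The sketch below is a rough illustration under stated assumptions: all data values are invented, and pairing each scale with one statistic (mode, median, mean, ratio) is a common convention rather than something prescribed in these slides.

```python
from statistics import mean, median, mode

# Hypothetical example data, one list per measurement scale.
gender_codes = [1, 2, 2, 1, 2]      # nominal: 1 = female, 2 = male (labels only)
fruit_ranks = [1, 3, 2, 5, 4]       # ordinal: rank positions
preference = [5, 4, 4, 2, 1]        # interval: preference scale values 1..5
weights_kg = [52.0, 65.0, 80.0]     # ratio: zero weight means no weight

# Nominal: only counting makes sense, so report the most frequent category.
most_common_gender = mode(gender_codes)

# Ordinal: order is meaningful, so the median is a sensible summary.
middle_rank = median(fruit_ranks)

# Interval: differences are meaningful, so the mean can be used.
mean_preference = mean(preference)

# Ratio: a true zero exists, so "how many times heavier" is meaningful.
weight_ratio = weights_kg[2] / weights_kg[0]

print(most_common_gender, middle_rank, mean_preference, round(weight_ratio, 2))
```

Averaging the nominal codes would give a number, but it would have no meaning, because 1 and 2 are only labels.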
X / Y relationship
X: measured variables (object vector)
Y: desired property
Y is a function of X.
Multivariate Data Analysis - in practice, Kim H. Esbensen et al.
What Is Multivariate Analysis?
Statistical methodology to analyze data with measurements on many variables
Figure: a process with input and output, subject to controllable factors and uncontrollable factors
Types of Multivariate Techniques
Generally, multivariate data analysis includes:
• principal component analysis (PCA)
• factor analysis
• cluster analysis
• discriminant analysis
and so on…
Which method you choose depends on the type of answer you want to get out of data analysis.
Purposes of using multivariate data analysis
Here in this course:
1. Data description (explorative data structure modeling)
2. Discrimination and classification
3. Regression and prediction
Data description (explorative data structure modeling)
A large part of multivariate data analysis is concerned with revealing the intrinsic data structures visually by suitable graphs. Eg. yields in organic synthesis.
Principal Component Analysis (PCA) is a method that is frequently used for data description and explorative data structure modeling.
Principal Component Analysis (PCA)
Lovelier = Love + Lier
Information matrix = Structure + Noise
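The split into structure and noise can be sketched numerically. The example below is a minimal illustration, not the procedure used by Unscrambler: the data values are invented, only the first principal component is kept, and the loading vector is found by simple power iteration on the covariance matrix (an assumption made here to keep the code free of external libraries).

```python
import math

# Hypothetical data: 5 objects measured on 2 strongly correlated variables.
X = [[2.0, 1.9], [4.0, 4.2], [6.0, 5.8], [8.0, 8.1], [10.0, 10.0]]
n, p = len(X), len(X[0])

# 1. Mean-center each variable (column).
means = [sum(row[j] for row in X) / n for j in range(p)]
Xc = [[row[j] - means[j] for j in range(p)] for row in X]

# 2. Covariance matrix of the centered data (p x p).
C = [[sum(row[a] * row[b] for row in Xc) / (n - 1) for b in range(p)]
     for a in range(p)]

# 3. Power iteration: converges to the first loading vector (unit length).
v = [1.0] * p
for _ in range(200):
    w = [sum(C[a][b] * v[b] for b in range(p)) for a in range(p)]
    norm = math.sqrt(sum(x * x for x in w))
    v = [x / norm for x in w]

# 4. Scores: projection of each object onto the loading vector.
scores = [sum(row[j] * v[j] for j in range(p)) for row in Xc]

# 5. Structure = scores times loadings; noise = what is left over.
structure = [[t * v[j] for j in range(p)] for t in scores]
noise = [[Xc[i][j] - structure[i][j] for j in range(p)] for i in range(n)]

# Fraction of the total variation captured by the structure part.
explained = sum(t * t for t in scores) / sum(x * x for row in Xc for x in row)
print(f"explained by first component: {explained:.4f}")
```

Here the centered data equal structure plus noise by construction; how many components to count as structure is a modeling decision that this sketch does not address.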
Discrimination and classification
Discrimination deals with the separation of groups of data.
Classification is used to decide which group a new object belongs to, based on models of the known groups.
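As a rough illustration of the two ideas, the sketch below separates two groups by their centroids and then assigns new objects to the nearest centroid. Nearest-centroid classification is a deliberately simple stand-in chosen for this example; it is not claimed to be the method taught in the course, and the group labels and data values are hypothetical.

```python
import math

# Hypothetical training data: two known groups, two variables per object.
groups = {
    "A": [[1.0, 1.2], [0.8, 1.0], [1.2, 0.9]],
    "B": [[3.0, 3.1], [3.2, 2.8], [2.9, 3.3]],
}

def centroid(rows):
    """Return the mean of each variable over the objects in one group."""
    n = len(rows)
    return [sum(row[j] for row in rows) / n for j in range(len(rows[0]))]

# Discrimination step: describe each group by its centroid.
centroids = {label: centroid(rows) for label, rows in groups.items()}

def classify(obj):
    """Classification step: assign a new object to the closest centroid."""
    return min(centroids, key=lambda label: math.dist(obj, centroids[label]))

label_1 = classify([1.1, 1.0])
label_2 = classify([3.0, 3.0])
print(label_1, label_2)
```

`math.dist` computes the Euclidean distance and requires Python 3.8 or later.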
Regression and prediction
Regression is an approach for relating two sets of variables to each other. This is often needed for indirect observations.
This course will mainly introduce two regression methods: Principal Component Regression (PCR) and Partial Least Squares Regression (PLS).
Regression and prediction
Prediction means determining Y-values for new X-objects, based on a previously estimated or calibrated X-Y model.
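The calibrate-then-predict idea can be sketched with a one-component principal component regression. This is a minimal sketch under stated assumptions: the calibration data are invented, only one component is used, and the loading vector is obtained by power iteration so that no external library is needed. It is not the implementation used in Unscrambler.

```python
import math

# Hypothetical calibration set: 5 objects, 2 X-variables, 1 Y-property.
X = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
y = [3.0, 6.1, 8.9, 12.2, 15.0]
n, p = len(X), len(X[0])

# Calibration step 1: mean-center X and y.
x_mean = [sum(row[j] for row in X) / n for j in range(p)]
y_mean = sum(y) / n
Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
yc = [value - y_mean for value in y]

# Calibration step 2: first loading vector of X by power iteration on X'X.
S = [[sum(row[a] * row[b] for row in Xc) for b in range(p)] for a in range(p)]
loading = [1.0] * p
for _ in range(200):
    w = [sum(S[a][b] * loading[b] for b in range(p)) for a in range(p)]
    norm = math.sqrt(sum(x * x for x in w))
    loading = [x / norm for x in w]

# Calibration step 3: regress centered y on the scores (least squares).
scores = [sum(row[j] * loading[j] for j in range(p)) for row in Xc]
slope = sum(t * v for t, v in zip(scores, yc)) / sum(t * t for t in scores)

def predict(x_new):
    """Predict Y for a new X-object using the calibrated model."""
    t_new = sum((x_new[j] - x_mean[j]) * loading[j] for j in range(p))
    return y_mean + slope * t_new

y_hat = predict([6.0, 12.0])
print(f"predicted Y: {y_hat:.2f}")
```

The model is estimated once from the calibration set and then reused for any new X-object, which is the distinction between regression (calibration) and prediction made above.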
Summary
1. The world is multivariate. Why learn multivariate data analysis?
2. Overview of this course
3. Basic concepts and brief methods of multivariate data analysis
Thanks for your attention!