Session 7 - Multivariate Data Analysis

Post on 09-Apr-2018


8/8/2019 Session 7 - Multivariate Data Analysis

http://slidepdf.com/reader/full/session-7-multivariate-data-analysis 1/28

12/18/2010

Session 7

MULTIVARIATE DATA ANALYSIS

Contents

1. Introduction to multivariate analysis
2. Dependence methods
3. Interdependence methods


Dependence methods: One or more variables are designated as being predicted by a set of independent variables.

Examples: Multiple regression, ANOVA, Conjoint analysis, Discriminant analysis, Structural Equation Modeling...

Interdependence methods: No variable is designated as being predicted by others. It is the interrelationship among all the variables taken together that interests the researcher.

Examples: Factor analysis, Cluster analysis, Multidimensional scaling.


II. DEPENDENCE METHODS

Scale requirement

                                             Required scale of variable(s)
Method                                       Dependent    Independent
-------------------------------------------  -----------  -----------
One dependent variable
  Multiple regression                        Interval     Interval
  ANOVA                                      Interval     Nominal
  Multiple regression with dummy variable    Interval     Nominal
  Discriminant analysis                      Nominal      Interval
  Conjoint analysis                          Ordinal      Nominal
Two or more dependent variables
  Canonical analysis                         Interval     Interval
  MANOVA                                     Interval     Nominal
Network structure including many dependent and independent variables
  SEM                                        Interval     Interval


II.1 Multiple Regression

Y = a_1 X_1 + a_2 X_2 + a_3 X_3 + ... + a_n X_n + b

- One DV, two or more IDVs.
- All are intervally scaled variables (except dummy variables).
- Three key results to analyze:
  1. The fit of the multiple regression equation, represented by r^2 (coefficient of determination), which ranges from 0 to 1: the % of variation of Y explained by the regression.
  2. The significance of r^2: use the F-test (sig.).
  3. The significance of each regression coefficient (a_1, a_2, a_3, ...): use the t-test (sig.).

(SPSS provides all sig. levels)
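The fit statistics above can be sketched in a few lines of Python. This is a minimal illustration, not what SPSS does internally: the data set, the helper names `solve` and `fit_ols`, and the choice to solve the normal equations directly are all assumptions made for the example. It reports the coefficients, r^2 and the F statistic; converting F and t values into sig. levels needs the F and t distributions, which the sketch leaves out.

```python
from statistics import mean

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(xs, y):
    """Least-squares fit of Y = a_1 X_1 + ... + a_n X_n + b.

    Returns (coefficients, intercept, r2, f_stat).
    """
    rows = [list(x) + [1.0] for x in xs]  # the last column carries the intercept b
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * rj for bj, rj in zip(beta, r)) for r in rows]
    y_bar = mean(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot              # share of the variation of Y explained
    p, n = k - 1, len(y)                  # p IDVs, n observations
    f_stat = (r2 / p) / ((1 - r2) / (n - p - 1))
    return beta[:-1], beta[-1], r2, f_stat

# Hypothetical data: six observations of two IDVs and one DV.
xs = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
y = [4.6, 5.4, 10.4, 11.6, 16.5, 17.5]

coefficients, intercept, r2, f_stat = fit_ols(xs, y)
print(coefficients, intercept, r2, f_stat)
```

A large F statistic relative to the F distribution with (p, n - p - 1) degrees of freedom corresponds to a small sig. value for r^2.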


Assumptions in multiple regression

a. Linearity: relationships between the DV and the IDVs are linear. Test by observing the scatter diagram or correlation matrix.

b. No multicollinearity: no strong linear correlation among the IDVs. Test by inspecting "Tolerance" or VIF.

c. Normality of all variables and of all residuals

d. Constant variance of the error term (homoscedasticity)

e. Independence of the error terms
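As a rough illustration of the multicollinearity check, the sketch below computes Tolerance and VIF for the simplest case of exactly two IDVs, where the tolerance of either IDV is 1 minus the squared correlation between them and VIF is its reciprocal. The data and the function names are hypothetical; with more than two IDVs, each IDV would instead be regressed on all the others.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def tolerance_and_vif(x1, x2):
    """Tolerance and VIF for a model with exactly two IDVs (assumption)."""
    r = pearson_r(x1, x2)
    tolerance = 1 - r ** 2    # share of an IDV's variance not explained by the other IDV
    return tolerance, 1 / tolerance

# Hypothetical scores on two IDVs.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]

tolerance, vif = tolerance_and_vif(x1, x2)
print(tolerance, vif)
```

Tolerance close to 1 (VIF close to 1) suggests little overlap among the IDVs; tolerance close to 0 (large VIF) signals a multicollinearity problem.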


Notes when using multiple regression:

- Applicable when there are linear correlations among variables.
- Does not prove a causal relationship.
- Can be used for prediction or explanation.
- There should be more than 10 observations per IDV (required sample size).
- If an IDV is nominally scaled, dummy variable regression can be employed.
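Dummy coding of a nominal IDV can be sketched as follows. The function name `dummy_code` and the category labels are hypothetical; the sketch assumes the usual convention of k - 1 dummy variables for k categories, with one category held out as the reference level.

```python
def dummy_code(values, reference):
    """Turn a nominal variable into k - 1 dummy (0/1) columns.

    The reference category is coded as all zeros.
    """
    levels = sorted(set(values) - {reference})
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows

# Hypothetical nominal IDV with three categories.
industry = ["Garment", "Cosmetic", "Plastic", "Garment"]

levels, rows = dummy_code(industry, reference="Garment")
print(levels)  # names of the dummy columns
print(rows)    # one row of 0/1 values per observation
```

Each dummy coefficient in the regression is then read as the difference in the DV between that category and the reference category.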


Example: Identifying the determinants of employee satisfaction in XYZ Co.

DV: Employee satisfaction.
IDVs: Rewards, Working condition, Recognition by managers, Peer relationship, Promotion opportunity, Development opportunity.

              Unstandardized        Standardized                     Collinearity
              coefficients          coefficients                     statistics
IDVs          B        Std. Error   Beta           t       Sig.     Tolerance   VIF
(Constant)    0.540    0.193                       2.793   .007
Rewards       0.526    0.081        0.596          6.491   .000     .793        1.262
Recognition   0.205    0.061        0.310          3.380   .001     .793        1.262

r^2 = 0.619, F sig. = 0.000
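To show how the table is read, the sketch below plugs the unstandardized coefficients into the regression equation and recomputes two derived columns: t as B divided by its standard error, and VIF as the reciprocal of Tolerance. The function names and the input scores are hypothetical, and the measurement scale of the IDVs is not given in the slides, so the predicted value is only illustrative.

```python
# Unstandardized coefficients (B) and standard errors taken from the table.
CONSTANT, CONSTANT_SE = 0.540, 0.193
B_REWARDS, SE_REWARDS = 0.526, 0.081
B_RECOGNITION, SE_RECOGNITION = 0.205, 0.061

def predict_satisfaction(rewards, recognition):
    """Predicted employee satisfaction from the fitted equation."""
    return CONSTANT + B_REWARDS * rewards + B_RECOGNITION * recognition

def t_ratio(b, std_error):
    """t statistic of a coefficient: estimate divided by its standard error."""
    return b / std_error

# Hypothetical scores for one employee.
predicted = predict_satisfaction(rewards=4, recognition=3)
t_rewards = t_ratio(B_REWARDS, SE_REWARDS)  # close to the tabled 6.491, up to rounding of B and SE
vif = 1 / 0.793                             # about 1.26 for a tolerance of .793
print(predicted, t_rewards, vif)
```

Because the Beta for Rewards (0.596) is larger than the Beta for Recognition (0.310), Rewards has the stronger standardized effect on satisfaction in this model.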


II.2. ANOVA - ANALYSIS OF VARIANCE

- Non-metric IDVs and a metric DV.
- Used to compare the means of the DV under the impact of one or more IDVs.
- Can be used with more than one IDV (factorial ANOVA).
- Principle: "between-group variance > within-group variance" indicates significant differences in the means of groups.
- Family: ANCOVA / MANOVA / MANCOVA


Example of ANOVA:

A survey of 200 companies in the garment, cosmetic and plastic industries about their average expenses for sales promotion during the last three years.

The researcher wants to explore whether there are significant differences in the average expenses for sales promotion among these three industries.


IDV: Industry (nominal, 3 treatments)
DV: Sales promotion expenses (ratio)

Company No.   Industry   SP expenses (1000 USD)
1             Garment    123
2             Garment    235
3             Cosmetic   1346
4             Plastic    876
..            ..         ..
199           Plastic    68
200           Garment    12


Possible method: compare the mean values of the DV for each pair of industries (using the t-test).

However, when the number of treatments increases, the comparisons become arduous.

In such a situation, ANOVA is the better method:

H0: μ_1 = μ_2 = ... = μ_k = μ
Ha: at least one μ_i is significantly different from the others.

where μ = population mean
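The test of H0 rests on the ratio of between-group to within-group variance. The sketch below computes that F ratio for a one-way ANOVA; the three small groups are hypothetical stand-ins for the industries, and the function name is an assumption. Turning F into a sig. level requires the F distribution with (k - 1, n - k) degrees of freedom, which is left to a statistics package.

```python
def one_way_anova_f(groups):
    """F ratio of a one-way ANOVA: between-group mean square / within-group mean square."""
    k = len(groups)                               # number of treatments
    n = sum(len(g) for g in groups)               # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical expenses for three treatments (not the survey data).
garment = [1, 2, 3]
cosmetic = [2, 3, 4]
plastic = [5, 6, 7]

f_ratio = one_way_anova_f([garment, cosmetic, plastic])
print(f_ratio)
```

A single F test replaces the k(k - 1)/2 pairwise t-tests, which is why ANOVA scales better as the number of treatments grows.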


II.3. DISCRIMINANT ANALYSIS

Purpose: to identify the linear combination of IDVs that best discriminates among the prespecified groups that are formed on the basis of a DV.

Metric IDVs, nominal DV.

Outcomes:
- A linear combination: Y = v_1 X_1 + v_2 X_2 + v_3 X_3 + ...
- A critical score Y_cri

For a particular subject:
- Calculate its Y score.
- Compare Y with Y_cri to predict which group the subject belongs to.


Example

An IT trading company wants to know whether family income (X_1) and householder's education (X_2) are useful to discriminate between PC buyers and non-PC buyers.

Conduct a survey of n households (with / without a PC).

IDVs: X_1 - income, X_2 - education: metric variables
DV: with a PC, without a PC: categorical variable

Analysis results: discriminant function Y = v_1 X_1 + v_2 X_2
v_1, v_2: discriminant coefficients
Y_cri: critical score

Given a household i (X_1i and X_2i), we can predict whether it is a (potential) buyer.
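One way to obtain v_1, v_2 and Y_cri is Fisher's two-group linear discriminant, sketched below. This is an assumption about the estimation method, since the slides do not say which procedure is used; the household data are hypothetical, and the critical score is taken as the midpoint between the two group mean scores, which presumes equally sized groups and equal misclassification costs.

```python
def mean_vector(rows):
    """Mean of each of the two variables over a group of households."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter(rows, m):
    """2x2 within-group scatter matrix around the group mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fit_discriminant(buyers, non_buyers):
    """Return ((v1, v2), y_cri) with v = Sw^-1 (m_buyers - m_non_buyers)."""
    m1, m2 = mean_vector(buyers), mean_vector(non_buyers)
    s1, s2 = scatter(buyers, m1), scatter(non_buyers, m2)
    sw = [[s1[i][j] + s2[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    d = [m1[0] - m2[0], m1[1] - m2[1]]
    v1 = (sw[1][1] * d[0] - sw[0][1] * d[1]) / det
    v2 = (-sw[1][0] * d[0] + sw[0][0] * d[1]) / det
    y1 = v1 * m1[0] + v2 * m1[1]   # mean score of buyers
    y2 = v1 * m2[0] + v2 * m2[1]   # mean score of non-buyers
    return (v1, v2), (y1 + y2) / 2

def predict_buyer(x1, x2, v, y_cri):
    """True when the household's Y score lies on the buyers' side of Y_cri."""
    return v[0] * x1 + v[1] * x2 > y_cri

# Hypothetical survey: (income, years of education) per household.
buyers = [(60, 16), (70, 14), (80, 18), (65, 15)]
non_buyers = [(30, 10), (40, 12), (35, 9), (45, 11)]

v, y_cri = fit_discriminant(buyers, non_buyers)
print(v, y_cri, predict_buyer(55, 14, v, y_cri))
```

With this construction the buyers' mean score is always the larger one, so a score above Y_cri points to the buyer group.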


Example (conjoint analysis)

Test a new product with 3 attributes:
- Price: (high, medium, low)
- Package size: (small, medium, large)
- Features: (simple, complex)

Form 8 test alternatives (instead of 18 combinations). Ask respondents to rank order them.

Results:
- contribution of each attribute to overall preference
- preference for each treatment (level) within an attribute
- identification of the most preferred combination
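The three results can be approximated with a very simple main-effects summary, sketched below. The 8 profiles and their ranks are hypothetical, and averaging scores per level is only a rough stand-in for proper conjoint estimation, which would normally use an orthogonal fractional design and regression-type estimation of part-worths.

```python
from itertools import product

ATTRIBUTES = {
    "Price": ["high", "medium", "low"],
    "Package size": ["small", "medium", "large"],
    "Features": ["simple", "complex"],
}

# Full factorial: 3 x 3 x 2 = 18 possible product profiles.
all_profiles = list(product(*ATTRIBUTES.values()))

# Hypothetical reduced set of 8 profiles with one respondent's ranks (1 = most preferred).
ranked_profiles = [
    (("low", "large", "complex"), 1),
    (("low", "medium", "simple"), 2),
    (("medium", "large", "simple"), 3),
    (("low", "small", "complex"), 4),
    (("medium", "small", "complex"), 5),
    (("medium", "medium", "simple"), 6),
    (("high", "large", "complex"), 7),
    (("high", "small", "simple"), 8),
]

def level_means(ranked):
    """Mean preference score per attribute level (higher score = more preferred)."""
    names = list(ATTRIBUTES)
    n = len(ranked)
    scores = {name: {} for name in names}
    for profile, rank in ranked:
        score = n + 1 - rank  # reverse the rank so that the best profile scores highest
        for name, level in zip(names, profile):
            scores[name].setdefault(level, []).append(score)
    return {name: {lvl: sum(v) / len(v) for lvl, v in levels.items()}
            for name, levels in scores.items()}

def attribute_importance(means):
    """Relative importance: range of level means, normalized to sum to 1."""
    ranges = {name: max(lv.values()) - min(lv.values()) for name, lv in means.items()}
    total = sum(ranges.values())
    return {name: r / total for name, r in ranges.items()}

means = level_means(ranked_profiles)
importance = attribute_importance(means)
best = {name: max(levels, key=levels.get) for name, levels in means.items()}
print(len(all_profiles), importance, best)
```

For this hypothetical respondent, price dominates the preference and the most preferred combination is a low price, a large package and complex features.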


II.5. Structural Equation Modeling - SEM

[Figure: SEM path diagram linking the constructs Customer orientation, Competitor orientation, Interfunctional coordination, Profit orientation, Responsiveness, Market orientation, Management competencies and Business performance]


Example:

Case   X_1   X_2   X_3   ...   X_m
1
2
3
...
n

Factor analysis: grouping m variables into k factors
- Factor 1 includes X_1, X_6, X_9, X_m
- Factor 2 includes X_2, X_3, X_10, X_(m-1)
- Factor 3 includes X_4, X_5, X_7, X_8
...

Exploratory factor analysis (EFA)
Confirmatory factor analysis (CFA)
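To give a feel for how variables end up grouped under a factor, the sketch below extracts the first factor from a correlation matrix by principal-component extraction, using power iteration to find the largest eigenvalue. This is only one possible extraction method and stops before rotation or the extraction of further factors; the data and the function names are hypothetical.

```python
from math import sqrt

def correlation_matrix(columns):
    """Pearson correlation matrix of a list of variables (one list per variable)."""
    def standardize(col):
        n = len(col)
        m = sum(col) / n
        length = sqrt(sum((x - m) ** 2 for x in col))
        return [(x - m) / length for x in col]  # centered, unit-length vector
    z = [standardize(c) for c in columns]
    return [[sum(a * b for a, b in zip(zi, zj)) for zj in z] for zi in z]

def first_factor_loadings(corr, iterations=200):
    """Largest eigenvalue and loadings of the first principal component (power iteration)."""
    v = [1.0] * len(corr)
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(r * x for r, x in zip(row, v)) for row in corr]
        eigenvalue = sqrt(sum(x * x for x in w))
        v = [x / eigenvalue for x in w]
    return eigenvalue, [sqrt(eigenvalue) * x for x in v]

# Hypothetical data: X1 and X2 move together, X3 does not.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
x3 = [3, 1, 2, 3, 1, 2]

corr = correlation_matrix([x1, x2, x3])
eigenvalue, loadings = first_factor_loadings(corr)
print(eigenvalue, loadings)
```

Variables with high absolute loadings on the same factor (here X1 and X2) are the ones that would be grouped together under that factor.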


III.2. Cluster analysis

Segmenting objects into homogeneous groups, given data for the objects on a variety of characteristics.

Ex: Market segmentation, buying behavior typology

Procedure:
- Identify variables / characteristics for grouping.
- Segment based on similarities / distances.
- Label clusters based on their shared characteristics.
- Validation and profiling.


Example: Segmenting the detergent market

Metric scales
Based on consumer buying behaviors.

"Please indicate the importance level (from 1 for very important to 5 for not important at all) of the following factors when you consider buying detergent powder"

X1 - Product quality ____
X2 - Price ____
X3 - Convenience ____
X4 - Known brand ____
X5 - Sales promotion ____
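Answers to the five items above could be segmented with a distance-based procedure such as k-means, sketched below. The respondents' ratings are hypothetical, k-means is just one of several clustering methods that fit the procedure, and the deterministic start from the first k respondents is a simplification chosen to keep the example reproducible.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two rating profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, iterations=20):
    """Plain k-means; the first k points serve as initial centroids (assumption)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
              for p in points]
    return labels, centroids

# Hypothetical ratings (X1..X5), 1 = very important, 5 = not important at all.
respondents = [
    (1, 4, 3, 1, 5),
    (4, 1, 3, 4, 1),
    (1, 5, 2, 2, 4),
    (5, 1, 3, 5, 2),
    (2, 4, 3, 1, 5),
    (4, 2, 2, 4, 1),
]

labels, centroids = k_means(respondents, k=2)
print(labels)
print(centroids)
```

Reading the centroids gives the labels: one segment rates quality and known brand as important, the other rates price and sales promotion as important.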


III.3. Multidimensional scaling (perceptual mapping)

Inferring the number / nature of dimensions underlying respondent perceptions based on their judgements about objects (brands, products, companies, localities, etc.)

Metric / nonmetric scale

Identifying the relative positions (on a map) of competitive brands based on several dimensions.


Example: MDS result for TV brands in HCMC


PRACTICE PROJECT

A better procedure:
- Assess and refine the scales by using factor analysis and reliability assessment.
- Calculate factor scores using the qualified variables.
- Run multiple regression.
- Interpret the results.
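The reliability assessment in the first step is commonly done with Cronbach's alpha; the slides do not name the statistic, so treating it as alpha is an assumption. The sketch below computes alpha for a small hypothetical scale of three items answered by five respondents.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: one list of scores per item, all covering the same respondents.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total scale score per respondent
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical scores on three items of one scale.
items = [
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [1, 3, 3, 5, 5],
]

alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

Items that lower alpha noticeably are candidates for removal before factor scores are calculated.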


END SESSION 7
