When Prediction Met PLS: What We Learned in 3 Years of Marriage. Galit Shmueli, National Tsing Hua University, Taiwan. PLS 2017, June 17, Macau

When Prediction Met PLS: What We learned in 3 Years of Marriage

Jan 22, 2018

Data & Analytics

Galit Shmueli
Transcript
Page 1: When Prediction Met PLS: What We learned in 3 Years of Marriage

When Prediction Met PLS: What We Learned in 3 Years of Marriage

Galit Shmueli, National Tsing Hua University, Taiwan
PLS 2017, June 17, Macau

Page 2: When Prediction Met PLS: What We learned in 3 Years of Marriage

Crossing Modeling Borders: Using Predictive Models for Causal Explanation and Using Explanatory Models for Prediction

Point #1: Best explanatory model ≠ best predictive model
Point #2: Explanatory power ≠ predictive power (cannot infer one from the other)

Shmueli (2010) “To Explain or To Predict?”, Statistical Science
Shmueli & Koppius (2011) “Predictive Analytics in IS Research”, MISQ

Page 3: When Prediction Met PLS: What We learned in 3 Years of Marriage

[Figure: timeline marked 2010, 2014, 2015, 2016, 2017, with these labels:]
• PLS vs. NN (SCECR 2010, NY)
• The Future of PLS-PM: Prediction or Explanation?
• Mediator & Prediction

Page 4: When Prediction Met PLS: What We learned in 3 Years of Marriage

Prediction with models for observable data (regression, machine learning algorithms)

Prediction with latent variable models (PLS, CB-SEM)

Page 5: When Prediction Met PLS: What We learned in 3 Years of Marriage

Generating Predictions & Prediction Errors

Evaluating Predictive Performance

Conducting Predictive Simulation Studies

Using PLS Predictions

Page 6: When Prediction Met PLS: What We learned in 3 Years of Marriage

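Simple PLS Model

[Figure: PLS path model. Exogenous factors X1 (items x11, x12, x13 with weights w11, w12, w13) and X2 (items x21, x22, x23 with weights w21, w22, w23); endogenous factor Y1 (items y11 to y14 with loadings λ31 to λ34 and item errors ε11 to ε14). Path coefficients β1 and β2 and structural error z1 form the structural (inner) model; the item-to-factor links form the measurement (outer) model.]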
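To make the diagram concrete, here is a minimal sketch of how a fitted model of this shape could turn one record's X1 and X2 items into predictions for the items of Y1. It assumes standardized items and already-estimated parameters; the function name `predict_y1_items` and every number below are made up for illustration and are not taken from the talk.

```python
def predict_y1_items(x1_items, x2_items, w1, w2, beta1, beta2, loadings):
    """Predict the items of Y1 for one record from the items of X1 and X2.

    Assumes standardized items and given (already estimated) parameters.
    """
    # Outer model, exogenous side: composite scores are weighted sums of items.
    x1_score = sum(w * x for w, x in zip(w1, x1_items))
    x2_score = sum(w * x for w, x in zip(w2, x2_items))
    # Inner model: predicted score of the endogenous construct Y1.
    y1_score = beta1 * x1_score + beta2 * x2_score
    # Outer model, endogenous side: each item prediction is loading * score.
    return [lam * y1_score for lam in loadings]


# Illustrative (made-up) values, not estimates from any data set.
preds = predict_y1_items(
    x1_items=[1.0, 0.5, -0.5],
    x2_items=[0.0, 1.0, 1.0],
    w1=[0.4, 0.4, 0.4],
    w2=[0.5, 0.25, 0.25],
    beta1=0.5,
    beta2=0.25,
    loadings=[0.8, 0.7, 0.9, 0.6],
)
print(preds)
```

Because the items of Y1 are observed, these item-level predictions can be compared with actual values, which is what makes them testable.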

Page 7: When Prediction Met PLS: What We learned in 3 Years of Marriage

3D Prediction Landscape (latent models)

• D1: Construct vs. item
• D2: In-sample vs. out-of-sample
• D3: Average case vs. case-wise

[Figure label: Machine learning]

Page 8: When Prediction Met PLS: What We learned in 3 Years of Marriage

Generating Predictions & Prediction Errors

Page 9: When Prediction Met PLS: What We learned in 3 Years of Marriage

Q#1: What to predict?

Should we predict items or composites? (We can predict both!)

Answer: Depends on required action

Page 10: When Prediction Met PLS: What We learned in 3 Years of Marriage

Ability to generate testable predictions

1. Generate predictions
2. Evaluate accuracy of predictions

Challenge: PLS models can generate testable predictions for items but untestable predictions for composites

Page 11: When Prediction Met PLS: What We learned in 3 Years of Marriage

6 Types of Prediction from PLS Models (Lohmöller, 1989)

• valid
• structural
• communal
• redundant
• latent
• operative

[Figure: each prediction type is drawn with IN and OUT labels.]

Predict outcome, then evaluate predictions: in-sample or out-of-sample (over-fitting?)

Page 12: When Prediction Met PLS: What We learned in 3 Years of Marriage

Average Case vs. Case-wise

[Figure: the simple PLS path model from page 6, with a case index k attached to every element; the highlighted quantity is $y_{ijk}$.]

Page 13: When Prediction Met PLS: What We learned in 3 Years of Marriage

Why Predict the Average Case?

Some social scientists think:
• Predicting behavior of individuals is difficult
• Predicting behavior of groups is possible

Page 14: When Prediction Met PLS: What We learned in 3 Years of Marriage

Q#2: Prediction errors of what?

RMSE per item or per composite?

[Figure: the simple PLS path model from page 6, highlighting the item-level prediction error $e_{ij}$.]

Page 15: When Prediction Met PLS: What We learned in 3 Years of Marriage

Q#3: Computing prediction intervals

How to estimate prediction variance for the average case? For case-wise?

Point predictions are the same for case-wise and average case:

$\hat{y}_{ijk} = \hat{y}_{ij} = \bar{y}_{ij\cdot}$

Answer: Average case -> use bootstrap. Case-wise -> bootstrap + error.

[Figure: the simple PLS path model from page 6, with a case index k attached to every element.]

Page 16: When Prediction Met PLS: What We learned in 3 Years of Marriage

Scenario: We have a new record

Option 1: Predict the value for “that kind of record”

Option 2: Predict the value for that specific record

Page 17: When Prediction Met PLS: What We learned in 3 Years of Marriage

Prediction Interval for Average Case

[Figure: the simple PLS path model from page 6, drawn once per bootstrap sample.]

Training sample size n

1. Get B bootstrap samples of training data
2. Fit PLS model to each bootstrap (B models)
3. Get B predictions for the new record: $\hat{y}^{1}_{ij,\mathrm{NEW}}, \ldots, \hat{y}^{B}_{ij,\mathrm{NEW}}$
4. Use 5th, 95th percentiles from the B predictions to get 90% PI (for average case)

Captures uncertainty due to PLS model estimation
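The four bootstrap steps for the average case can be sketched as follows. This is a minimal illustration under a loud assumption: a one-predictor least-squares line (`fit_line`) stands in for fitting a full PLS path model, so only the resampling logic carries over. The names `fit_line`, `percentile_of_sorted`, and `average_case_pi` are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Least-squares line y = a + b*x. ASSUMPTION: stand-in for a PLS model fit."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0  # guard against a degenerate resample
    return my - b * mx, b


def percentile_of_sorted(sorted_vals, q):
    """Simple index-based percentile of an already sorted list (q in 0..100)."""
    return sorted_vals[round(q / 100 * (len(sorted_vals) - 1))]


def average_case_pi(xs, ys, x_new, B=500, seed=0):
    """90% prediction interval for the average case via the bootstrap."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(B):
        # Step 1: one bootstrap sample (n draws with replacement).
        idx = [rng.randrange(n) for _ in range(n)]
        # Step 2: fit the model to this bootstrap sample.
        a, b = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        # Step 3: predict the new record with this bootstrap model.
        preds.append(a + b * x_new)
    # Step 4: 5th and 95th percentiles of the B predictions.
    preds.sort()
    return percentile_of_sorted(preds, 5), percentile_of_sorted(preds, 95)


rng = random.Random(7)
xs = [float(i % 10) for i in range(40)]
ys = [2.0 + 3.0 * x + rng.gauss(0, 1) for x in xs]
print(average_case_pi(xs, ys, x_new=4.0))
```

The interval reflects only how much the fitted model changes across resamples, which matches the slide's note that it captures estimation uncertainty.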

Page 18: When Prediction Met PLS: What We learned in 3 Years of Marriage

Prediction Interval for Individual Record

Training sample size n

1. Get B bootstrap samples of training data
2. Fit PLS model to each bootstrap (B models)
3. For each bootstrap sample b:
   • Get predictions for the new record and each training record: $\hat{y}^{b}_{ij,\mathrm{NEW}}$ and $\hat{y}^{b}_{ijk}$ (k = 1, …, n)
   • Compute n training prediction errors: $e^{b}_{ijk} = y^{b}_{ijk} - \hat{y}^{b}_{ijk}$
   • Add a randomly selected error to $\hat{y}^{b}_{ij,\mathrm{NEW}}$: $\hat{y}^{*b}_{ij,\mathrm{NEW}} = \hat{y}^{b}_{ij,\mathrm{NEW}} + e^{b}_{ijk}$
4. Use 5th, 95th percentiles from the B error-adjusted predictions to get 90% case-wise PI

Uncertainty due to PLS model estimation + uncertainty due to deviation from average

[Figure: the simple PLS path model from page 6, drawn once per bootstrap sample, with a case index k attached to every element.]

Plowing Through the PLS Path Model
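The case-wise procedure differs from the average-case one only in that each bootstrap prediction gets a randomly drawn training error added before the percentiles are taken. A minimal sketch, again with a least-squares line standing in for the PLS model (an assumption, not the speaker's code); the name `casewise_pi` is hypothetical, and computing errors on the bootstrap sample's own records is one reading of the notation.

```python
import random


def fit_line(xs, ys):
    """Least-squares line y = a + b*x. ASSUMPTION: stand-in for a PLS model fit."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0  # guard against a degenerate resample
    return my - b * mx, b


def casewise_pi(xs, ys, x_new, B=500, seed=0):
    """90% case-wise prediction interval: bootstrap plus a random error."""
    rng = random.Random(seed)
    n = len(xs)
    draws = []
    for _ in range(B):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        a, b = fit_line(bx, by)
        # Prediction errors (actual minus predicted) on this bootstrap sample.
        errors = [y - (a + b * x) for x, y in zip(bx, by)]
        # Prediction for the new record plus one randomly selected error.
        draws.append(a + b * x_new + rng.choice(errors))
    draws.sort()
    lo = draws[round(0.05 * (B - 1))]
    hi = draws[round(0.95 * (B - 1))]
    return lo, hi


rng = random.Random(3)
xs = [float(i % 10) for i in range(40)]
ys = [2.0 + 3.0 * x + rng.gauss(0, 1) for x in xs]
print(casewise_pi(xs, ys, x_new=4.0))
```

The added error term is what widens the interval relative to the average case: it brings in the spread of individual records around the model's prediction.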

Page 19: When Prediction Met PLS: What We learned in 3 Years of Marriage

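Q#4: Which items to use as inputs for predicting Y2?

Multiple possible sets of predictors (prediction paths)

[Figure: extended PLS path model with constructs X1 (items x11 to x13, weights w11 to w13), X2 (items x21 to x23, weights w21 to w23), Y1 (items y11 to y14, loadings λ31 to λ34, item errors ε11 to ε14), and Y2 (items y21 to y24, loadings λ41 to λ44, item errors ε21 to ε24); path coefficients β1, β2, β3; structural errors z1, z2.]

Mediator & Prediction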

Page 20: When Prediction Met PLS: What We learned in 3 Years of Marriage

Evaluating Predictive Performance

Page 21: When Prediction Met PLS: What We learned in 3 Years of Marriage

“Classic” Out-of-Sample Performance Evaluation (in general, not path models)

Dataset: split into training (estimation) and holdout (prediction) parts

[Figure: 10-fold cross-validation over dataset parts 1 to 10; each round holds out a different part and trains on the rest.]

predictions - actuals = residuals (training, holdout)

$RMSE = \sqrt{\mathrm{Avg}(\mathrm{residuals}^2)}$

Visualization of residuals; predictive power
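The evaluation scheme above can be sketched generically: any model with a fit and a predict step can be plugged in. The example below is an illustration, not the talk's code. It compares a least-squares line with a simple-average benchmark on made-up data; `k_fold_rmse` and the helper names are hypothetical.

```python
import math
import random


def fit_mean(xs, ys):
    """Benchmark: the 'model' is just the training mean of y."""
    return sum(ys) / len(ys)


def predict_mean(model, x):
    return model


def fit_line(xs, ys):
    """Least-squares line, returned as (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return my - slope * mx, slope


def predict_line(model, x):
    intercept, slope = model
    return intercept + slope * x


def k_fold_rmse(xs, ys, fit, predict, k=10, seed=0):
    """Holdout RMSE pooled over k cross-validation folds."""
    order = list(range(len(xs)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]  # k disjoint holdout sets
    squared = []
    for holdout in folds:
        held = set(holdout)
        train = [i for i in order if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in holdout:
            residual = predict(model, xs[i]) - ys[i]  # prediction - actual
            squared.append(residual ** 2)
    return math.sqrt(sum(squared) / len(squared))


rng = random.Random(42)
xs = [float(i) for i in range(30)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]  # made-up data
rmse_line = k_fold_rmse(xs, ys, fit_line, predict_line)
rmse_mean = k_fold_rmse(xs, ys, fit_mean, predict_mean)
print(f"line: {rmse_line:.2f}  simple average: {rmse_mean:.2f}")
```

Every residual here comes from a record the model did not see during fitting, which is what separates this number from an in-sample fit measure.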

Page 22: When Prediction Met PLS: What We learned in 3 Years of Marriage

Q#5: What to benchmark against?

• Simple average
• Linear regression model
• Machine learning algorithm (specifically, neural net)
• A different (simpler) PLS model

Page 23: When Prediction Met PLS: What We learned in 3 Years of Marriage

[Figure: the extended PLS path model from page 19, compared with a linear regression and a neural net.]

PLS
• Causal theory dictates: latent variables; path model structure; measurement model; arrow directions
• [path model + path coefficients important]
• Architecture requirements/options: each construct has items; mediation possible

Linear Regression
• Causal theory dictates: independent variables (X’s); dependent variable (Y); implicit construct-to-variable mapping; linear model
• [model coefficients important]
• Mediation requires multiple models

Neural Net
• User dictates (not causal): inputs (X’s); output (Y); hidden layer, nodes
• [model coefficients unimportant; no “mediation”]
• Algorithm constraints: arrow direction is left-to-right; only input + output have data; one “item” per node

Page 24: When Prediction Met PLS: What We learned in 3 Years of Marriage

Example: TAM (Gefen & Straub, CAIS 2005)

[Figure: TAM path model with constructs PU, PEOU, and USE.]

SCECR 2010, NY

Page 25: When Prediction Met PLS: What We learned in 3 Years of Marriage

Iterative estimation; node-level errors

Neural Network (Hackl & Westlund, TQM 2000; Hsu, Chen & Hsieh, TQM 2006)

Method               Holdout RMSE
PLS (reflective)     2.10
PLS (formative)      2.18
Neural Net           1.94
Linear Regression    1.84

Predicts “3”
• Insufficient data?
• 5-point Likert scale?

Page 26: When Prediction Met PLS: What We learned in 3 Years of Marriage

Neural Networks as an Approximation to Probabilistic Graphical Models: Using SEM for Predictive Analytics

Figure 1. Model for product returns in online retail

• Model complex, non-linear causal relationships
• Large scale datasets with millions of records and tens or hundreds of thousands of dimensions (attributes)
• Neural networks statistically approximate structural equation models, in which both the outer and the inner model are defined by logistic regression models

Page 27: When Prediction Met PLS: What We learned in 3 Years of Marriage

Q#6: How to Measure Out-of-Sample Predictive Power?

• Holdout: RMSE, MAD, MAPE
• In-sample: R², Q²?
• New to PLS: AIC, BIC, GM, … (in-sample)

To get some answers, we need a simulation study

Which measure selects the best predictive PLS model?
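For reference, the three holdout measures named above can be written in a few lines. These are the usual textbook definitions, not code from the talk, and MAD is taken here to mean the mean absolute deviation of the prediction errors.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mad(actual, predicted):
    """Mean absolute deviation of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error; undefined when an actual value is 0."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


actual = [2.0, 4.0, 5.0]      # made-up holdout values
predicted = [1.0, 4.0, 7.0]   # made-up predictions
print(rmse(actual, predicted), mad(actual, predicted), mape(actual, predicted))
```

MAPE divides by the actual values, so it behaves badly on standardized or zero-centered items, which may be one reason it can rank models differently from RMSE and MAD.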

Page 28: When Prediction Met PLS: What We learned in 3 Years of Marriage

Predictive model selection: Two lenses

1. Prediction only (P):
– Focus only on comparing the predictive accuracy of models (Gregor, 2006)
– Limited or no role of theory (no causal explanation)
– Select the model with best out-of-sample predictive accuracy
– Out-of-sample criteria (e.g. RMSE) are the gold standard for judging

2. Explanation with Prediction (EP):
– Focus on balancing causal explanation and prediction (Gregor, 2006)
– Prominent role of theory (causal explanation is foremost)
– Requires trade-off in predictive power to accommodate explanatory power

Prediction-oriented model selection in PLS-PM (Sharma et al. 2017, submitted)

Page 29: When Prediction Met PLS: What We learned in 3 Years of Marriage

Conducting Predictive Simulation Studies

Page 30: When Prediction Met PLS: What We learned in 3 Years of Marriage

Simulation Study: Typical Steps in Predictive Simulation Study

1. Simulate data from a specific PLS model, manipulating factors of interest
2. Partition data into training and holdout samples
3. Estimate relevant PLS models from training sample
4. Generate holdout predictions using each estimated model
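The typical steps can be sketched as a small loop. To stay self-contained, the sketch assumes a one-predictor linear generating model and two candidate models (mean-only and a least-squares line) in place of PLS path models; all names and default values are hypothetical. It records how often each candidate achieves the lowest holdout RMSE.

```python
import math
import random


def simulate(n, slope, noise_sd, rng):
    """Step 1: draw data from a known generating model y = slope * x + noise."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [slope * x + rng.gauss(0, noise_sd) for x in xs]
    return xs, ys


def fit_mean_only(xs, ys):
    """Candidate model that ignores x."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Candidate model matching the generating model's form."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)


def holdout_rmse(model, xs, ys):
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def run_study(n=200, holdout_frac=0.3, slope=0.4, noise_sd=1.0, reps=50, seed=0):
    """Share of replications in which each candidate has the lowest holdout RMSE."""
    rng = random.Random(seed)
    candidates = {"mean_only": fit_mean_only, "line": fit_line}
    wins = dict.fromkeys(candidates, 0)
    cut = int(n * (1 - holdout_frac))  # Step 2: training/holdout partition
    for _ in range(reps):
        xs, ys = simulate(n, slope, noise_sd, rng)
        # Steps 3 and 4: fit each candidate on training data, score on holdout.
        scores = {
            name: holdout_rmse(fit(xs[:cut], ys[:cut]), xs[cut:], ys[cut:])
            for name, fit in candidates.items()
        }
        wins[min(scores, key=scores.get)] += 1
    return {name: count / reps for name, count in wins.items()}


shares = run_study()
print(shares)
```

Varying `n`, `holdout_frac`, `slope`, and `noise_sd` corresponds to manipulating the factors of interest; the returned shares have the same shape as a "which criterion picks which model" table.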

Page 31: When Prediction Met PLS: What We learned in 3 Years of Marriage

We learned: Simulation is important, and not straightforward

Q#7: How to simulate data for PLS fitting?
Simsem in R; SEGIRLS in R (Schlittgen, 2015)

Q#8: Which factors to vary?
Path model, coefficients, factor loading, sample size

Q#9: How big of a holdout set?
Large: more reliable out-of-sample evaluation
Small: more realistic in PLS studies

Q#10: Role of “generating model”
Should a good predictive model recover the generating model? Include the generating model in the consideration set?

Page 32: When Prediction Met PLS: What We learned in 3 Years of Marriage

Example: Model Comparison Study

[Figure: eight path diagrams, one per model; three coefficients are shown as 0.2, 0.4, and 0.1.]
• Model 1: Incorrect model
• Model 2: Parsimonious model
• Model 3: Incorrect model
• Model 4: Incorrect model
• Model 5: Data generation model
• Model 6: Incorrect model
• Model 7: Saturated model
• Model 8: Overspecified model

Prediction-oriented model selection in PLS-PM (Sharma et al. 2017, submitted)

Page 33: When Prediction Met PLS: What We learned in 3 Years of Marriage

Performance Measures Choose Which Model?

(Each row sums to approximately 1.)

Model #        1      2      3      4      5      6      7      8

PLS Criteria
R2             0.000  0.273  0.000  0.003  0.019  0.000  0.695  0.009
Adjusted R2    0.000  0.537  0.000  0.005  0.074  0.000  0.303  0.081
GoF            0.000  0.001  0.000  0.000  0.037  0.000  0.962  0.000
Q2             0.003  0.305  0.000  0.004  0.224  0.002  0.179  0.281

Information Theoretic Criteria
FPE            0.000  0.638  0.000  0.006  0.091  0.000  0.163  0.101
CP             0.000  0.686  0.000  0.006  0.100  0.001  0.096  0.111
GM             0.000  0.743  0.000  0.006  0.109  0.007  0.011  0.123
AIC            0.000  0.638  0.000  0.006  0.091  0.000  0.164  0.101
AICu           0.000  0.688  0.000  0.006  0.099  0.002  0.093  0.112
AICc           0.000  0.649  0.000  0.006  0.093  0.001  0.146  0.104
BIC            0.000  0.731  0.000  0.006  0.107  0.005  0.032  0.120
HQ             0.000  0.695  0.000  0.006  0.100  0.001  0.085  0.112
HQc            0.000  0.705  0.000  0.006  0.102  0.002  0.070  0.114

Out of Sample Criteria
MAD            0.000  0.351  0.000  0.000  0.183  0.000  0.236  0.229
RMSE           0.000  0.365  0.000  0.000  0.186  0.000  0.218  0.230
MAPE           0.094  0.044  0.247  0.076  0.044  0.347  0.090  0.058
SMAPE          0.000  0.365  0.000  0.000  0.123  0.000  0.343  0.168
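As a reminder of how in-sample information criteria trade fit against complexity, here is a sketch of AIC and BIC in their common least-squares (Gaussian) forms. This is an assumption for illustration: the variants listed in the table (AICu, AICc, HQ, and others) and the parameter-counting convention used in the cited study may differ. Here `k` counts estimated coefficients, and the numbers are made up.

```python
import math


def aic(sse, n, k):
    """AIC for a least-squares model: n * ln(SSE / n) + 2k."""
    return n * math.log(sse / n) + 2 * k


def bic(sse, n, k):
    """BIC for a least-squares model: n * ln(SSE / n) + k * ln(n)."""
    return n * math.log(sse / n) + k * math.log(n)


n = 100
# Made-up fits: the larger model lowers SSE slightly but uses more parameters.
aic_small, bic_small = aic(52.0, n, 2), bic(52.0, n, 2)
aic_large, bic_large = aic(50.0, n, 5), bic(50.0, n, 5)
print(f"small model: AIC={aic_small:.2f} BIC={bic_small:.2f}")
print(f"large model: AIC={aic_large:.2f} BIC={bic_large:.2f}")
```

Lower is better for both criteria. In this made-up case the small gain in fit does not pay for three extra parameters, so both criteria prefer the smaller model, and BIC penalizes the larger one more heavily.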

Page 34: When Prediction Met PLS: What We learned in 3 Years of Marriage

A simulation study can take long to run
• Bootstrap
• Parallelizing?
• Pilot runs (fewer bootstrap rounds)

Page 35: When Prediction Met PLS: What We learned in 3 Years of Marriage

Using PLS Predictions

Page 36: When Prediction Met PLS: What We learned in 3 Years of Marriage

How to use it?

Page 37: When Prediction Met PLS: What We learned in 3 Years of Marriage

Assess Relevance; Evaluate Predictability

[Figure: predicted items y21, y22, y23, each summarized by a mean μ and a standard deviation σ.]

Low/no predictive power:
• weakness in theoretical model
• quality of the measured items
• phenomenon is naturally unpredictable
• model sufficient only for explanation but not prediction (e.g., user behavior)
• External validity (overfitting): compare in-sample vs. out-of-sample prediction

Page 38: When Prediction Met PLS: What We learned in 3 Years of Marriage

“If we can predict successfully on the basis of a certain explanation we have a good reason, and perhaps the best sort of reason, to accept the explanation”

Kaplan (1964), The Conduct of Inquiry: Methodology for Behavioral Science

Page 39: When Prediction Met PLS: What We learned in 3 Years of Marriage

Generate new theory; develop measures

Low/no predictive power of existing model:
1. Opportunity for generating new theory
2. Identify constructs that yield poor predictions (boost traditional rigorous measurement practices)

Page 40: When Prediction Met PLS: What We learned in 3 Years of Marriage

Improve existing theory

[Figure: the extended PLS path model from page 19, shown twice; the second version adds an interaction construct X1*X2 measured by product items (weights W31, W32, W33) with path coefficient β3.]

Asymmetric predictions: predictive accuracy/precision varies for different subgroups

Create more nuanced theories

Page 41: When Prediction Met PLS: What We learned in 3 Years of Marriage

[Excerpt shown on the slide: p. 540 of Ray, Kim, and Morris, The Central Role of Engagement in Online Communities, Information Systems Research 25(3), pp. 528–546, © 2014 INFORMS]

Table 3. Structural Results of Proposed and Alternative Models

Column groups: proposed model, first alternative, second alternative; outcomes CE, SAT, KC, WOM. Values are listed in row order; empty cells are not marked.

R2: 0.76 0.50 0.57 0.54 0.48 0.45 0.78 0.50 0.61 0.55
CI: 0.26*** 0.39*** 0.14 0.37*** 0.27*** 0.38*** -0.13 0.07
SIV: 0.30*** 0.12 0.15* 0.17* 0.31*** 0.13 -0.09 -0.04
EFF: 0.34*** 0.19** 0.03 0.32*** 0.19* 0.34*** 0.19** 0.00 -0.07
CE: 0.61*** 0.47*** 0.81*** 0.53***
SAT: 0.16** -0.05 0.30*** 0.15* -0.04 0.27**
CE × EFF: -0.10* -0.09*

Artifacts
aVC: 0.15** -0.14** 0.01 -0.17** 0.09 -0.14* 0.15** -0.14* -0.01 -0.17**
aPD: -0.16*** 0.03 0.09 0.14** -0.01 0.07 -0.16** 0.02 0.12 0.16*
aPP: -0.04 0.04 0.12** 0.02 0.10 0.02 -0.04 0.04 0.13** 0.03
aRP: -0.07* 0.15** 0.00 0.08 -0.03 0.09 -0.07 0.15** 0.02 0.08
aUM: 0.00 -0.07 -0.09* -0.01 -0.08 -0.04 0.00 -0.07 -0.08 -0.02

Controls
cGEN: 0.01 0.06 -0.03 0.01 -0.02 0.03 0.00 0.06 0.03 0.01
cAGE: -0.04 -0.02 -0.05 -0.01 -0.08 -0.04 -0.05 -0.02 -0.04 -0.01
cFREQ: 0.08 0.10* 0.23*** -0.03 0.27*** 0.04 0.07 0.10 0.21*** -0.03
cTENURE: -0.11** 0.14** -0.02 0.05 -0.09 0.05 -0.11* 0.14** -0.01 0.06

Note. CI: community identification; SIV: self-identity verification; EFF: knowledge self-efficacy; CE: community engagement; SAT: satisfaction; KC: knowledge contribution; WOM: positive word of mouth; aVC: virtual copresence; aPD: profile depth; aPP: past postings; aRP: regulatory practices; aUM: user moderation; cGEN: gender; cAGE: age; cFREQ: frequency of past visitation; cTENURE: tenure at online community.

Path significances: *p < 0.05; **p < 0.01; ***p < 0.001.

our proposed model), as was that of word-of-mouth (a 1.85% increase over our proposed model). Thus, engagement and satisfaction appear to fully mediate (Baron and Kenny 1986) the influence of identity factors on prosocial intentions.

Overall, the results strongly uphold the main principles of our proposed model. Specifically, the identity factors that earlier studies focused on appear to be antecedent to the more powerful mediating conditions of engagement and satisfaction that ultimately determine prosocial outcomes in online communities. The theory-free alternative models did not yield any additional advantage when both power and parsimony were considered. We also note the failed hypotheses and unexpectedly significant control effects found in our empirical results. First, satisfaction does not directly influence knowledge contribution intentions, although it does influence word-of-mouth intentions. Second, self-identity verification did not have a significant relationship with satisfaction. Our artifact measures

Figure 2. Structural Results of Proposed Model

[Figure: path diagram over community identification, self-identity verification, knowledge self-efficacy, community engagement, satisfaction, knowledge contribution, and positive word of mouth. Path coefficients shown: 0.34***, 0.19**, 0.30***, 0.26***, 0.61***, 0.39***, 0.30***, -0.10*, 0.47***, 0.16**.]

Note. Nonsignificant hypothesized paths are dashed. Path significances: *p < 0.05; **p < 0.01; ***p < 0.001.

Ray, S., Kim, S. S., and Morris, J. G. 2014. “The Central Role of Engagement in Online Communities,” Information Systems Research (25:3), pp. 528–546.

Reduced form: Ma & Agarwal (2007)

Compare competing theories

Model Comparison & Selection

• Fundamental to scientific work
• PLS as exploratory
• p-values challenge in large samples

Compare alternative models

Page 42: When Prediction Met PLS: What We learned in 3 Years of Marriage

Different Types of Models (generating, parsimonious, incorrect, saturated, overspecified)

[Figure: eight path diagrams, one per model; three coefficients are shown as 0.2, 0.4, and 0.1.]
• Model 1: Incorrect model
• Model 2: Parsimonious model
• Model 3: Incorrect model
• Model 4: Incorrect model
• Model 5: Data generation model
• Model 6: Incorrect model
• Model 7: Saturated model
• Model 8: Overspecified model

Sharma et al. 2017

Page 43: When Prediction Met PLS: What We learned in 3 Years of Marriage

Generating Predictions & Prediction Errors

Evaluating Predictive Performance

Conducting Predictive Simulation Studies

Using PLS Predictions

Page 44: When Prediction Met PLS: What We learned in 3 Years of Marriage

More Open Qs

• What is “good” prediction accuracy? Precision?
• Evaluating construct-level predictions
• Which parts transfer to CB-SEM?

Page 45: When Prediction Met PLS: What We learned in 3 Years of Marriage

Analytics, Humanity, Responsibility

Galit Shmueli 徐茉莉, Institute of Service Science