When Prediction Met PLS: What We Learned in 3 Years of Marriage


Galit Shmueli, National Tsing Hua University, Taiwan
PLS 2017, June 17, Macau

Crossing Modeling Borders: Using Predictive Models for Causal Explanation and Using Explanatory Models for Prediction

Point #1: Best explanatory model ≠ best predictive model

Point #2: Explanatory power ≠ predictive power (cannot infer one from the other)

Shmueli (2010) “To Explain or To Predict?”, Statistical Science
Shmueli & Koppius (2011) “Predictive Analytics in IS Research”, MISQ

[Figure: timeline of the work, with markers at 2010, 2014, 2015, 2016, and 2017. Labeled items: "PLS vs NN" (SCECR 2010, NY); "The Future of PLS-PM: Prediction or Explanation?"; "Mediator & Prediction".]

Prediction with models for observable data (regression, machine learning algorithms)

Prediction with latent variable models (PLS, CB-SEM)

• Generating Predictions & Prediction Errors
• Evaluating Predictive Performance
• Conducting Predictive Simulation Studies
• Using PLS Predictions

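Simple PLS Model

[Figure: path diagram of a simple PLS model. Exogenous factors X1 (items x11 to x13, weights w11 to w13) and X2 (items x21 to x23, weights w21 to w23); endogenous factor Y1 (items y11 to y14, loadings λ31 to λ34, item errors ε11 to ε14, structural error z1); path coefficients β1 and β2. The structural model is the inner model; the measurement model is the outer model.]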

3D Prediction Landscape (latent models)

D1: Construct vs. Item
D2: In-sample vs. Out-of-sample
D3: Average case vs. Case-wise

[Figure: cube spanned by D1, D2, and D3, with one region labeled "Machine learning".]

Generating Predictions & Prediction Errors

Q#1: What to predict?

Should we predict items or composites? (We can predict both!)

Answer: Depends on required action

Ability to generate testable predictions

1. Generate predictions
2. Evaluate accuracy of predictions

Challenge: PLS models can generate testable predictions for items but untestable predictions for composites

6 Types of Prediction from PLS Models (Lohmoller, 1989)

• valid
• structural
• communal
• redundant
• latent
• operative

[Figure: for each of the six types, a diagram marking which parts of the model are inputs (IN) and which are outputs (OUT).]

Predict outcome, then evaluate predictions: in-sample or out-of-sample (over-fitting?)

Average Case vs. Case-wise

[Figure: the simple PLS path model with every item indexed by case k; $y_{ijk}$ denotes the value of item j of construct i for case k.]

Why Predict the Average Case?

Some social scientists think:
• Predicting behavior of individuals is difficult
• Predicting behavior of groups is possible

[Figure: the simple PLS path model.]

Q#2: Prediction errors of what?

RMSE per item or per composite?

[Figure: the simple PLS path model with items indexed by case k; $e_{ij}$ marks the prediction error for item j of construct i.]

Q#3: Computing prediction intervals

How to estimate prediction variance for average-case? For case-wise?

Point predictions are the same for case-wise and average case:

$\hat{y}_{ijk} = \hat{y}_{ij} = \bar{y}_{ij\cdot}$

Answer:
• Average case: use bootstrap
• Case-wise: bootstrap + error

Scenario: We have a new record

Option 1: Predict the value for “that kind of record”

Option 2: Predict the value for that specific record

Prediction Interval for Average Case

[Figure: repeated copies of the simple PLS path model, one per bootstrap sample.]

Training sample size n

1. Get B bootstrap samples of training data
2. Fit PLS model to each bootstrap (B models)
3. Get B predictions for the new record: $\hat{y}^{(1)}_{ij,\mathrm{NEW}}, \ldots, \hat{y}^{(B)}_{ij,\mathrm{NEW}}$
4. Use 5th, 95th percentiles from the B predictions to get 90% PI (for average case)

Captures uncertainty due to PLS model estimation
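The bootstrap steps for the average-case interval can be sketched in a few lines. This is only an illustration under loud assumptions: `fit_line` (a one-predictor least-squares fit) stands in for estimating a PLS model, the data are synthetic, and every name in the block (`fit_line`, `percentile`, `average_case_interval`) is hypothetical rather than taken from the talk or from any PLS package.

```python
import random

def fit_line(xs, ys):
    """Least-squares (intercept, slope). ASSUMPTION: a stand-in for fitting a PLS model."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my - slope * mx, slope

def percentile(values, q):
    """q-th percentile (0 to 100), interpolating linearly between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def average_case_interval(xs, ys, x_new, B=500, seed=1):
    """90% percentile interval for the average-case prediction at x_new."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(B):
        idx = [rng.randrange(n) for _ in range(n)]           # step 1: bootstrap sample
        a, b = fit_line([xs[i] for i in idx],
                        [ys[i] for i in idx])                # step 2: refit the model
        preds.append(a + b * x_new)                          # step 3: predict the new record
    return percentile(preds, 5), percentile(preds, 95)       # step 4: 5th and 95th percentiles

# Synthetic training data (hypothetical): one predictor score and one outcome item.
data_rng = random.Random(42)
xs = [data_rng.gauss(0, 1) for _ in range(100)]
ys = [1.0 + 0.5 * x + data_rng.gauss(0, 1) for x in xs]

a_hat, b_hat = fit_line(xs, ys)
point = a_hat + b_hat * 0.3
low, high = average_case_interval(xs, ys, x_new=0.3)
print(f"point {point:.2f}, 90% average-case PI [{low:.2f}, {high:.2f}]")
```

The interval here reflects only the variation of the fitted model across bootstrap samples, which is why it is narrow compared with the spread of individual outcomes.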

Prediction Interval for Individual Record

[Figure: repeated copies of the simple PLS path model, one per bootstrap sample, followed by the model with items indexed by case k.]

Training sample size n

1. Get B bootstrap samples of training data
2. Fit PLS model to each bootstrap (B models)
3. For each bootstrap sample b:
   • Get predictions for the new record and each training record: $\hat{y}^{(b)}_{ijk}$ (k = 1, …, n) and $\hat{y}^{(b)}_{ij,\mathrm{NEW}}$
   • Compute n training prediction errors: $e^{(b)}_{ijk} = y^{(b)}_{ijk} - \hat{y}^{(b)}_{ijk}$
   • Add a randomly selected error to $\hat{y}^{(b)}_{ij,\mathrm{NEW}}$: $\hat{y}^{*(b)}_{ij,\mathrm{NEW}} = \hat{y}^{(b)}_{ij,\mathrm{NEW}} + e^{(b)}_{ijk}$
4. Use 5th, 95th percentiles from the B predictions $\hat{y}^{*(b)}_{ij,\mathrm{NEW}}$ to get 90% case-wise PI

Case-wise interval = uncertainty due to PLS model estimation + uncertainty due to deviation from average
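A minimal sketch of the case-wise procedure follows, again under stated assumptions: a one-predictor least-squares fit (`fit_line`) stands in for the PLS model, the data are synthetic, and all function names are hypothetical. The only change from the average-case recipe is that each bootstrap prediction is shifted by one randomly drawn training error from the same bootstrap fit.

```python
import random

def fit_line(xs, ys):
    """Least-squares (intercept, slope). ASSUMPTION: a stand-in for fitting a PLS model."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my - slope * mx, slope

def percentile(values, q):
    """q-th percentile (0 to 100), interpolating linearly between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def case_wise_interval(xs, ys, x_new, B=500, seed=1):
    """90% percentile interval for one specific new record at x_new."""
    rng = random.Random(seed)
    n = len(xs)
    shifted = []
    for _ in range(B):
        idx = [rng.randrange(n) for _ in range(n)]            # bootstrap sample b
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        a, b = fit_line(bx, by)                               # model fitted to sample b
        errors = [y - (a + b * x) for x, y in zip(bx, by)]    # actual minus predicted
        shifted.append(a + b * x_new + rng.choice(errors))    # prediction + random error
    return percentile(shifted, 5), percentile(shifted, 95)

# Synthetic training data (hypothetical): one predictor score and one outcome item.
data_rng = random.Random(42)
xs = [data_rng.gauss(0, 1) for _ in range(100)]
ys = [1.0 + 0.5 * x + data_rng.gauss(0, 1) for x in xs]

cw_low, cw_high = case_wise_interval(xs, ys, x_new=0.3)
print(f"90% case-wise PI [{cw_low:.2f}, {cw_high:.2f}]")
```

With noise of standard deviation 1 in the synthetic outcome, the case-wise interval is several times wider than an interval built from the bootstrap predictions alone, which matches the "estimation uncertainty plus deviation from average" decomposition.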

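Plowing Through the PLS Path Model

Q#4: Which items to use as inputs for predicting Y2?

[Figure: extended PLS path model. X1 (items x11 to x13, weights w11 to w13) and X2 (items x21 to x23, weights w21 to w23); Y1 (items y11 to y14, loadings λ31 to λ34, item errors ε11 to ε14, structural error z1); Y2 (items y21 to y24, loadings λ41 to λ44, item errors ε21 to ε24, structural error z2); path coefficients β1, β2, β3.]

Multiple possible sets of predictors (prediction paths)

Mediator & Prediction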

Evaluating Predictive Performance

“Classic” Out-of-Sample Performance Evaluation (in general, not path models)

Dataset:
• Training: estimation
• Holdout: prediction

[Figure: a dataset split into 10 blocks (1 to 10), shown repeatedly with a different block held out each time: 10-fold cross-validation.]

predictions - actuals = residuals (training, holdout)

$RMSE = \sqrt{\mathrm{Avg}(\mathrm{residuals}^2)}$

Holdout residuals: visualization of residuals; predictive power
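The 10-fold scheme and the RMSE formula can be illustrated with a short sketch. It is an assumption-laden example, not the evaluation code behind the slides: a one-predictor least-squares fit (`fit_line`) plays the role of the model, the data are synthetic, and the names `fit_line` and `cross_validated_rmse` are hypothetical.

```python
import math
import random

def fit_line(xs, ys):
    """Least-squares (intercept, slope). ASSUMPTION: a stand-in for the model under study."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my - slope * mx, slope

def cross_validated_rmse(xs, ys, k=10, seed=0):
    """Holdout RMSE pooled over k folds; each record is held out exactly once."""
    order = list(range(len(xs)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]                   # k disjoint holdout blocks
    residuals = []
    for fold in folds:
        held = set(fold)
        train = [i for i in order if i not in held]
        a, b = fit_line([xs[i] for i in train],
                        [ys[i] for i in train])               # estimate on training part
        residuals.extend((a + b * xs[i]) - ys[i] for i in fold)   # prediction - actual
    return math.sqrt(sum(r * r for r in residuals) / len(residuals)), len(residuals)

# Synthetic data (hypothetical): true noise standard deviation is 1.
data_rng = random.Random(3)
xs = [data_rng.gauss(0, 1) for _ in range(200)]
ys = [2.0 + 0.7 * x + data_rng.gauss(0, 1) for x in xs]

cv_rmse, n_residuals = cross_validated_rmse(xs, ys)
print(f"10-fold holdout RMSE: {cv_rmse:.3f} from {n_residuals} residuals")
```

Pooling the residuals from all folds before taking the root keeps the result on the scale of the outcome; averaging per-fold RMSE values is a common alternative.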

Q#5: What to benchmark against?

• Simple average
• Linear regression model
• Machine learning algorithm (specifically, neural net)
• A different (simpler) PLS model

[Figure: the extended PLS path model (X1, X2, Y1, Y2), shown next to a linear regression and a neural net.]

PLS
• Causal theory dictates: latent variables, path model structure, measurement model, arrow directions [path model + path coefficients important]
• Architecture requirements/options: each construct has items; mediation possible

Linear Regression
• Causal theory dictates: independent variables (X’s), dependent variable (Y), implicit construct-to-variable mapping, linear model [model coefficients important]
• Mediation requires multiple models

Neural Net
• User dictates (not causal): inputs (X’s), output (Y), hidden layer, nodes [model coefficients unimportant; no “mediation”]
• Algorithm constraints: arrows direction left-to-right; only input + output have data; one “item” per node

Example: TAM (Gefen & Straub, CAIS 2005)

[Figure: TAM path model with constructs PU, PEOU, and USE.]

SCECR 2010, NY

Neural network: iterative estimation, node-level errors (Hackl & Westlund, TQM 2000; Hsu, Chen & Hsieh, TQM 2006)

Method               Holdout RMSE
PLS (reflective)     2.10
PLS (formative)      2.18
Neural net           1.94
Linear regression    1.84

Predicts “3”
• Insufficient data?
• 5-point Likert scale?

Figure 1. Model for product returns in online retail

Neural Networks as an Approximation to Probabilistic Graphical Models: Using SEM for Predictive Analytics

• Model complex, non-linear causal relationships
• Large scale datasets with millions of records and tens or hundreds of thousands of dimensions (attributes)
• Neural networks statistically approximate structural equation models, in which both the outer and the inner model are defined by logistic regression models

Q#6: How to Measure Out-of-Sample Predictive Power?

• Holdout: RMSE, MAD, MAPE
• In-sample: R2, Q2?
• New to PLS: AIC, BIC, GM, … (in-sample)

To get some answers, we need a simulation study

Which measure selects the best predictive PLS model?
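For reference, the holdout measures named above can be computed as follows. This is a sketch with hypothetical function names and made-up values on a 5-point scale; MAD is taken to mean the mean absolute deviation, and the SMAPE shown is one common variant (several definitions are in use), so it may differ from the one used in the study.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mad(actual, predicted):
    """Mean absolute deviation of predictions from actuals."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be nonzero."""
    return 100.0 * sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

def smape(actual, predicted):
    """Symmetric MAPE, in percent. ASSUMPTION: the 2|p - a| / (|a| + |p|) variant."""
    return 100.0 * sum(2 * abs(p - a) / (abs(a) + abs(p))
                       for a, p in zip(actual, predicted)) / len(actual)

# Made-up holdout values for illustration only.
actual = [3, 4, 2, 5]
predicted = [2.5, 4.5, 2.0, 4.0]

metrics = {
    "RMSE": rmse(actual, predicted),
    "MAD": mad(actual, predicted),
    "MAPE": mape(actual, predicted),
    "SMAPE": smape(actual, predicted),
}
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because MAPE divides by the actual value, it weights errors on low scale values more heavily than RMSE or MAD do, which is one reason the measures can disagree about which model predicts best.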

Predictive model selection: Two lenses

1. Prediction only (P):
• Focus only on comparing the predictive accuracy of models (Gregor, 2006)
• Limited or no role of theory (no causal explanation)
• Select the model with best out-of-sample predictive accuracy
• Out-of-sample criteria (e.g., RMSE) are the gold standard for judging

2. Explanation with Prediction (EP):
• Focus on balancing causal explanation and prediction (Gregor, 2006)
• Prominent role of theory (causal explanation is foremost)
• Requires trade-off in predictive power to accommodate explanatory power

Prediction-oriented model selection in PLS-PM (Sharma et al. 2017, submitted)


Simulation Study

Typical Steps in Predictive Simulation Study

1. Simulate data from a specific PLS model, manipulating factors of interest
2. Partition data into training and holdout samples
3. Estimate relevant PLS models from training sample
4. Generate holdout predictions using each estimated model
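The four steps can be sketched end to end as below. Everything here is an illustrative assumption rather than the study's actual design: the generating model, its coefficients and loadings, the sample sizes, the use of equal-weight item averages as composites, and ordinary least squares in place of PLS estimation. The candidate names and helper functions are hypothetical.

```python
import math
import random

def simulate(n, rng, b1=0.5, b2=0.4, loading=0.8):
    """Step 1: item data from a small latent model (all values hypothetical)."""
    def items(latent):
        return [loading * latent + rng.gauss(0, 0.6) for _ in range(3)]
    rows = []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        y1 = b1 * x1 + b2 * x2 + rng.gauss(0, 0.5)
        rows.append((items(x1), items(x2), items(y1)))
    return rows

def ols(features, target):
    """Least-squares coefficients (intercept first) from the normal equations."""
    X = [[1.0] + list(f) for f in features]
    p = len(X[0])
    A = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * t for x, t in zip(X, target))] for i in range(p)]
    for c in range(p):                                   # Gauss-Jordan with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def mean(values):
    return sum(values) / len(values)

# Candidate models, each defined by which composites it uses as predictors.
CANDIDATES = {
    "simple average": lambda c1, c2: [],
    "X1 only": lambda c1, c2: [c1],
    "X1 and X2 (generating)": lambda c1, c2: [c1, c2],
}

def holdout_rmse(train, hold, design):
    """Steps 3 and 4: estimate on training rows, then score predictions of item y11."""
    def features(rows):
        return [design(mean(r[0]), mean(r[1])) for r in rows]
    def target(rows):
        return [r[2][0] for r in rows]
    coef = ols(features(train), target(train))
    preds = [coef[0] + sum(c * f for c, f in zip(coef[1:], x)) for x in features(hold)]
    return math.sqrt(mean([(p - a) ** 2 for p, a in zip(preds, target(hold))]))

rng = random.Random(7)
wins = {name: 0 for name in CANDIDATES}
for _ in range(100):                                     # replications
    data = simulate(300, rng)
    train, hold = data[:200], data[200:]                 # step 2: rows are i.i.d., so a plain cut
    scores = {name: holdout_rmse(train, hold, d) for name, d in CANDIDATES.items()}
    wins[min(scores, key=scores.get)] += 1               # which model has lowest holdout RMSE
print(wins)
```

Tallying how often each candidate attains the lowest holdout RMSE gives selection proportions per model; replacing the RMSE with another criterion shows how the choice of measure changes which model is picked.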

We learned: Simulation is important, and not straightforward

Q#7: How to simulate data for PLS fitting?
• simsem in R
• SEGIRLS in R (Schlittgen, 2015)

Q#8: Which factors to vary?
• Path model, coefficients, factor loading, sample size

Q#9: How big of a holdout set?
• Large: more reliable out-of-sample evaluation
• Small: more realistic in PLS studies

Q#10: Role of “generating model”
• Should a good predictive model recover the generating model?
• Include the generating model in the consideration set?

[Figure: eight competing path models over the same constructs. Labeled panels: Model 1 (incorrect model), Model 2 (parsimonious model), Model 5 (data generation model), Model 7 (saturated model), Model 8 (overspecified model). Path coefficient values shown: 0.2, 0.4, 0.1.]

Example: Model Comparison Study

Performance Measures Choose Which Model?

Prediction-oriented model selection in PLS-PM (Sharma et al. 2017, submitted)

Model #         1      2      3      4      5      6      7      8

PLS Criteria
R2            0.000  0.273  0.000  0.003  0.019  0.000  0.695  0.009
Adjusted R2   0.000  0.537  0.000  0.005  0.074  0.000  0.303  0.081
GoF           0.000  0.001  0.000  0.000  0.037  0.000  0.962  0.000
Q2            0.003  0.305  0.000  0.004  0.224  0.002  0.179  0.281

Information Theoretic Criteria
FPE           0.000  0.638  0.000  0.006  0.091  0.000  0.163  0.101
CP            0.000  0.686  0.000  0.006  0.100  0.001  0.096  0.111
GM            0.000  0.743  0.000  0.006  0.109  0.007  0.011  0.123
AIC           0.000  0.638  0.000  0.006  0.091  0.000  0.164  0.101
AICu          0.000  0.688  0.000  0.006  0.099  0.002  0.093  0.112
AICc          0.000  0.649  0.000  0.006  0.093  0.001  0.146  0.104
BIC           0.000  0.731  0.000  0.006  0.107  0.005  0.032  0.120
HQ            0.000  0.695  0.000  0.006  0.100  0.001  0.085  0.112
HQc           0.000  0.705  0.000  0.006  0.102  0.002  0.070  0.114

Out of Sample Criteria
MAD           0.000  0.351  0.000  0.000  0.183  0.000  0.236  0.229
RMSE          0.000  0.365  0.000  0.000  0.186  0.000  0.218  0.230
MAPE          0.094  0.044  0.247  0.076  0.044  0.347  0.090  0.058
SMAPE         0.000  0.365  0.000  0.000  0.123  0.000  0.343  0.168

A simulation study can take long to run
• Bootstrap
• Parallelizing?
• Pilot runs (fewer bootstrap rounds)

Using PLS Predictions

How to use it?

[Figure: predicted distributions for the items of Y2, summarized as $\mu_{y21}, \sigma_{y21}$; $\mu_{y22}, \sigma_{y22}$; $\mu_{y23}, \sigma_{y23}$.]

Assess Relevance

Evaluate Predictability

Low/no predictive power:
• Weakness in theoretical model
• Quality of the measured items
• Phenomenon is naturally unpredictable
• Model sufficient only for explanation but not prediction (e.g., user behavior)
• External validity (overfitting): compare in-sample vs. out-of-sample prediction

“If we can predict successfully on the basis of a certain explanation we have a good reason, and perhaps the best sort of reason, to accept the explanation”

Kaplan (1964), The Conduct of Inquiry: Methodology for Behavioral Science

Generate new theory

Develop Measures

Low/no predictive power of existing model:
1. Opportunity for generating new theory
2. Identify constructs that yield poor predictions (boost traditional rigorous measurement practices)

Improve existing theory

[Figure: the extended PLS path model (X1, X2, Y1, Y2), with an added interaction term X1*X2 (product items of the x1 and x2 items, weights W31 to W33, path coefficient β3).]

Asymmetric predictions: predictive accuracy/precision varies for different subgroups

Create more nuanced theories

[Excerpt: Ray, Kim, and Morris, “The Central Role of Engagement in Online Communities,” Information Systems Research 25(3), pp. 528–546, © 2014 INFORMS, p. 540.]

Table 3. Structural Results of Proposed and Alternative Models

[Table: R2 values and path coefficients for the outcomes CE, SAT, KC, and WOM under the proposed model, the first alternative, and the second alternative; cell values are not reproduced here.]

Note. CI: community identification; SIV: self-identity verification; EFF: knowledge self-efficacy; CE: community engagement; SAT: satisfaction; KC: knowledge contribution; WOM: positive word of mouth; aVC: virtual copresence; aPD: profile depth; aPP: past postings; aRP: regulatory practices; aUM: user moderation; cGEN: gender; cAGE: age; cFREQ: frequency of past visitation; cTENURE: tenure at online community.

Path significances: *p < 0.05; **p < 0.01; ***p < 0.001.

our proposed model), as was that of word-of-mouth (a 1.85% increase over our proposed model). Thus, engagement and satisfaction appear to fully mediate (Baron and Kenny 1986) the influence of identity factors on prosocial intentions.

Overall, the results strongly uphold the main principles of our proposed model. Specifically, the identity factors that earlier studies focused on appear to be antecedent to the more powerful mediating conditions of engagement and satisfaction that ultimately determine prosocial outcomes in online communities. The theory-free alternative models did not yield any additional advantage when both power and parsimony were considered. We also note the failed hypotheses and unexpectedly significant control effects found in our empirical results. First, satisfaction does not directly influence knowledge contribution intentions, although it does influence word-of-mouth intentions. Second, self-identity verification did not have a significant relationship with satisfaction. Our artifact measures

Figure 2. Structural Results of Proposed Model

[Figure: path diagram over the constructs community identification, self-identity verification, knowledge self-efficacy, community engagement, satisfaction, knowledge contribution, and positive word of mouth. Path coefficients shown: 0.34***, 0.19**, 0.30***, 0.26***, 0.61***, 0.39***, 0.30***, –0.10*, 0.47***, 0.16**.]

Note. Nonsignificant hypothesized paths are dashed. Path significances: *p < 0.05; **p < 0.01; ***p < 0.001.


Ray, S., Kim, S. S., and Morris, J. G. 2014. “The Central Role of Engagement in Online Communities,” Information Systems Research (25:3), pp. 528–546.

Reduced form: Ma & Agarwal (2007)

compare competing theories

Model Comparison & Selection

• Fundamental to scientific work
• PLS as exploratory
• p-values challenge in large samples

compare alternative models

[Figure: the competing path models from the model comparison study; path coefficient values shown: 0.2, 0.4, 0.1.]

Different types of models (generating, parsimonious, incorrect, saturated, overspecified)

Sharma et al. 2017

• Generating Predictions & Prediction Errors
• Evaluating Predictive Performance
• Using PLS Predictions

More Open Qs

• What is “good” prediction accuracy? Precision?
• Evaluating construct-level predictions
• Which parts transfer to CB-SEM?

Analytics, Humanity, Responsibility

Galit Shmueli 徐茉莉, Institute of Service Science


Data & Analytics