Survey Analysis: Data Mining versus Standard Statistical Analysis for Better
Analysis of Survey Responses
By Dean Abbott, Abbott Analytics
http://www.abbottanalytics.com
Salford Systems Data Mining 2006, March 27-31, 2006
San Diego, CA
© Abbott Analytics, 2000-2006
Acknowledgements
Work done under contract with Seer Analytics
Subcontractors: Tessar and Associates (now Mobile Foundry), Abbott Consulting (now Abbott Analytics)
Seer Analytics, LLC
518 North Tampa Street
Tampa, FL 33602
813-318-0111
http://www.seeranalytics.com
http://www.mobilefoundry.net/
About Abbott Analytics
• Abbott Analytics
– Founded in 1999, based in San Diego, CA
– Dedicated to data mining consulting and training
• Principal: Dean Abbott
– Applied data mining for 19+ years in direct marketing, CRM, survey analysis, tax compliance, fraud detection, predictive toxicology, and biological risk assessment
• Course Instruction
– Public 2-day data mining courses
– Conference tutorials
• Customized Training and Knowledge Transfer
– Data mining methodology (CRISP-DM)
– Training services for software products, including CART, Clementine, Affinium Model, and Insightful Miner
Talk Outline
• Member survey
– Survey description
– Results using statistical modeling
– Lessons learned
• Employee survey
– Survey description
– Results using decision trees (CART)
– Lessons learned
Problem Setup: Member Survey
• Question:
– What are the characteristics of members who indicated the highest overall satisfaction with their Club?
• Data:
– 32,811 records containing survey answers
– No demographic data except what was on the survey (marital status, children, age, gender)
• Approach:
– Create supervised learning models with target variable “overall_satisfaction = 1”
Data Preparation
• Begin with 57 candidate inputs to model
• All survey questions are multiple choice
– Treated as categories, not numbers
– Typically 6 categories per question (answers 1-5, plus unknown)
– Unknown initially coded as “0”
– No text comment fields included as inputs to model
• Create new column for target variable
– If overall_satisfaction = 1, variable value = 1; otherwise, variable value = 0
• Data very clean with respect to missing data
– Only needed to recode the number-of-children fields
– Number missing: 11,006 children < 6; 10,701 children 6-12; 10,873 children 13-17; 4,936 children (overall)
– When missing, recoded values with “-1” to indicate missing
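The recoding steps above can be sketched in a few lines of Python. This is a minimal illustration, not the original workflow: the field names (`overall_satisfaction`, `children_under_6`) and the use of `None` for a blank answer are assumptions made for the example.

```python
MISSING = -1  # code used on the slide to flag a missing answer

def prepare_record(record, children_fields=("children_under_6",)):
    """Return a copy of one survey record with a binary target and recoded blanks."""
    out = dict(record)
    # Binary target: 1 only for the top satisfaction rating, otherwise 0.
    out["target"] = 1 if record.get("overall_satisfaction") == 1 else 0
    # Number-of-children fields: replace blanks with -1 so "missing" is its own category.
    for field in children_fields:
        if out.get(field) is None:
            out[field] = MISSING
    return out

# Two made-up survey records (hypothetical field names).
records = [
    {"overall_satisfaction": 1, "children_under_6": 2},
    {"overall_satisfaction": 3, "children_under_6": None},
]
prepared = [prepare_record(r) for r in records]
print(prepared)
```

Keeping “-1” as an explicit category, rather than imputing a number, lets a model treat “did not answer” as information in its own right.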
Member Survey Question Categories
Sampling
• Begin with 32,811 responses
• Set aside about half for validation (not used during modeling): 16,379 records
– These records will be used to provide final summaries of the segments
• 16,433 records used in creating and scoring model
– 5,059 had overall satisfaction = 1 (30.8%)
– Model 1 splits data into training and testing data: 2/3 for training (creating model), 1/3 for testing (scoring and ranking models)
– Approximately 11,503 for training; 4,930 for testing
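The three-way split described above might look like the following sketch. It assumes a simple random shuffle with a fixed seed; the slides do not say how the records were actually assigned, so the function name and the exact partition sizes are illustrative only.

```python
import random

def split_sample(records, seed=0):
    """Hold out about half for validation, then split the rest 2/3 train, 1/3 test."""
    rng = random.Random(seed)        # fixed seed so the split is repeatable
    shuffled = list(records)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    validation = shuffled[:half]     # never touched during modeling
    modeling = shuffled[half:]
    cut = (2 * len(modeling)) // 3
    return modeling[:cut], modeling[cut:], validation

# Record IDs stand in for the 32,811 survey responses.
train, test, validation = split_sample(range(32811))
print(len(train), len(test), len(validation))
```

The sizes produced here will not match the slide's counts exactly, since the original split was only approximately half and approximately two thirds.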
Relationship of Overall Satisfaction to Recommend to Friends
[Figure: plot of Overall satisfaction (x-axis, OVERALL.RA) against Recommend to Friend (y-axis, RECOMMEND.)]
• Of the 4912 / 16739 (30.2%) with Overall Satisfaction = 1
– 86% have Recommend to Friends = 1
• Of the 8708 / 16739 (54%) with Recommend to Friends = 1
– 49% have Overall Satisfaction = 1
• 4227 / 16739 (26.0%) have overall satisfaction and recommend to friends both equal to 1
• This is the biggest bin of the cross tab, followed by
– Overall = 2 / recommend = 2 (24%; 3890 / 16739)
– Overall = 2 / recommend = 1 (22%; 3565 / 16739)
– No other bin greater than 5% of records
Objective and Data Challenges
• Project objective
– Interpret results of survey for large health club (not a predictive model)
• Challenge: missing data (some questions either N/A or blank)
– Solution: impute values that least affect the information communicated by the question (not a mean or median!)
• Challenge: answers (target variables) highly correlated with one another
– Multi-collinearity makes interpretation of results problematic
– Must reduce dimensionality without losing interpretation of results
– Solution: factor analysis
• Challenge: target variable
– Three questions pointed to the important actionable information (related to how satisfied members were)
– Solution: combine all three into a new “index of excellence”
Data Preprocessing Approach
• Reduce input data (for understanding)
– Use factor analysis to identify groupings of variables that are interesting
– Factors can be candidate inputs to models, but didn’t work as well on this data
– Selected as inputs those variables with the highest loadings, as representative of that type of factor
– Also retained key questions in addition to the factor analysis representative questions
• The effect is to remove questions “too highly” correlated with one another, while maintaining relevant information for modeling
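Picking the highest-loading question as the representative of each factor can be sketched as below. The loadings are invented for illustration, the helper name is hypothetical, and treating Q44, Q22, and Q25 as the retained key questions is an assumption; the factor analysis itself would come from whatever statistics package is in use.

```python
def representative_variables(loadings):
    """For each factor, return the question with the largest absolute loading.

    loadings: dict mapping factor name -> dict of question -> loading.
    """
    chosen = {}
    for factor, by_question in loadings.items():
        chosen[factor] = max(by_question, key=lambda q: abs(by_question[q]))
    return chosen

# Hypothetical loadings, invented for illustration only.
loadings = {
    "Factor1": {"Q2": 0.81, "Q3": 0.77, "Q4": 0.42},
    "Factor2": {"Q12": 0.35, "Q13": -0.72, "Q14": 0.66},
}
key_questions = ["Q44", "Q22", "Q25"]  # assumed key questions kept alongside the factors

reps = representative_variables(loadings)
model_inputs = key_questions + sorted(set(reps.values()))
print(reps)
print(model_inputs)
```

Using an original question instead of the factor score keeps each model input tied to wording the client can read and act on.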
Predictive Modeling Approach
[Flow diagram: 60+ Survey Questions; Identify Key Questions; Factor Analysis: 10 factors; Regression Model: Find Significant Variables]
• Identify Key Questions: 3 questions with high association with target (3 key questions)
• Factor Analysis: 10 factors, or variables that loaded highest on each factor
• Regression Model: 13 fields down to 7; variable ranks
Factor Analysis: Making the Complex Simple
[Figure: “loadings” bar chart for Factor1 through Factor10 (x-axis: Factor; y-axis: Loading, 0 to 5)]
[Figure: Factor 1, Top Question Loadings for Q2 through Q12 (y-axis: Loading Value, 0.00 to 1.00)]
[Figure: Factor 2, Top Question Loadings for Q12 through Q20 and Q23 (y-axis: Loading Values, 0.00 to 0.80)]
Member Survey Factor Analysis Loadings
Reduce Variables using Regression
• Already beginning with only 13 variables
• Question: how many of these are useful predictors?
• Decided to retain 5 factors for final model
[Figure: Regression Rankings of Questions/Factors. Bar chart of Regression Coefficient (y-axis, 0 to 0.6) by Question/Factor (x-axis), in order: Q44, Q22, Q25, factor3.2, factor3.9, factor3.1, factor3.4, factor3.3, factor3.8, factor3.10, factor3.6, factor3.5, factor3.7]
Explaining Results Through Visualization
• Customer was not interested in “techno” solutions
• Customer was interested in what actions could be taken as a result of the data mining models
– Which characteristics are most correlated with best customers?
– What do they like and dislike about the club?
– Is it equipment? relationships? facility? staff?
• Show key contributors, how each club compared with other club locations, and if club is improving
Key: Explaining Results
• Visualization shows key variables in survey associated with “excellence”, and performance metrics for each club
– How well did this club do?
– What is the change over last year’s result?
• Shows which attributes the club needs to improve to raise customer satisfaction
[Figure: Drivers of Satisfaction: relationships, facility, equipment, Staff 1, Staff 2, goals, value]
So What’s The Problem with That?
• Regression and neural networks are “global” estimators
– They operate over the entire data space
– Descriptors of regression represent average influence
– Neither technique provides explicit localized characteristics
• Customer would like actionable analytics
– Clear characteristics of subgroups
– Different strategies for subgroups
• Conclusion: in Round 2 (Employee Survey), use another approach
Employee Survey Analysis Problem Setup
• Very similar to member survey
– 60+ questions
– Few demographics
– Attitudes toward the job
• How to handle questions
– They are ordinal, but CART® supports interval and nominal types
– Treat as categorical, but make sure values aren’t split up
– If a split on a question groups values such as 1, 2, 4, rebuild it as an interval variable
– Didn’t happen this way though; all worked out well
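The check described above, that a categorical split does not break up the ordinal answer codes, can be written as a small helper. This is a sketch under assumptions: answers are coded 1 to 5, the unknown code is left out, and the function names are hypothetical rather than part of CART.

```python
def is_contiguous(levels):
    """True if a set of ordinal answer codes forms an unbroken run, e.g. {1, 2, 3}."""
    ordered = sorted(levels)
    return all(b - a == 1 for a, b in zip(ordered, ordered[1:]))

def split_respects_order(left_levels, all_levels=(1, 2, 3, 4, 5)):
    """Check that both sides of a categorical split keep neighboring codes together."""
    right_levels = set(all_levels) - set(left_levels)
    return is_contiguous(left_levels) and is_contiguous(right_levels)

ok_split = split_respects_order({1, 2})       # 1,2 versus 3,4,5
bad_split = split_respects_order({1, 2, 4})   # 1,2,4 versus 3,5
print(ok_split, bad_split)
```

A split that fails this check is the signal, per the slide, to rebuild the question as an interval variable so the tree can only cut at a threshold.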
Employee Survey Question Groupings
Employee Survey: Target Variable Definition
• Predict key attitudes that are consequents
– Satisfaction
– Recommend to a Friend
– Intend to Work Next Year at Club
– Club is Good Place to Work
• Exclude these from each other’s models
– They are highly correlated with each other
– Models that predict a target variable with these as inputs are not actionable
• Key predictors, questions relating to:
– Communications with management
– Quality of supervisors
– Training received
– Effectiveness of club
– Fairness of policies
– Perceived member attitudes
Employee Satisfaction (=1) Model: Data Information
File: modeling data with binarized dependents w missing.txt
Target Variable: Q1_1
Predictor Variables: Q66, Q67, Q68, Q69, Q3, Q4, Q5, Q6, Q7, Q8, Q9, Q10, Q11, Q12, Q13, Q14, Q15, Q16, Q17, Q18, Q20, Q21, Q22, Q23, Q24, Q25, Q26, Q27, Q28, Q29, Q30, Q31, Q32, Q33, Q34, Q35, Q36, Q37, Q38, Q45, Q46, Q47, Q48, Q49, Q50, Q51, Q52, Q53, Q54, Q55, Q56, Q57, Q58, Q59, Q60, Q61, Q62, Q63, Q64, Q65

Class  N Cases  Pct Cases
0      4,645    76.0%
1      1,470    24.0%
Employee Satisfaction Model: Performance
Node  Cases Target Class  % of Node Tgt. Class  % Target Class  Cum % Tgt. Class  Cum % Pop  % Pop  Cases in Node  Cum Lift  Lift
8     859                 60.75                 58.44           58.44             23.12      23.12  1,414          2.53      2.53
4     95                  43.58                 6.46            64.90             26.69      3.57   218            2.43      1.81
7     201                 42.23                 13.67           78.57             34.47      7.78   476            2.28      1.76
3     30                  17.44                 2.04            80.61             37.29      2.81   172            2.16      0.73
5     92                  14.38                 6.26            86.87             47.75      10.47  640            1.82      0.60
6     14                  13.86                 0.95            87.82             49.40      1.65   101            1.78      0.58
2     124                 10.12                 8.44            96.26             69.44      20.03  1,225          1.39      0.42
1     55                  2.94                  3.74            100.00            100.00     30.56  1,869          1.00      0.12

Class  N Cases  N Misclassified  Pct. Misclassified
0      4,645    953              20.52
1      1,470    315              21.43
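The lift columns in a gains table like this one follow directly from the node counts. The sketch below recomputes them from the “Cases Target Class” and “Cases in Node” columns of the satisfaction model; the function and key names are illustrative, not CART output fields.

```python
# (node, target cases in node, total cases in node), copied from the gains table.
nodes = [(8, 859, 1414), (4, 95, 218), (7, 201, 476), (3, 30, 172),
         (5, 92, 640), (6, 14, 101), (2, 124, 1225), (1, 55, 1869)]

def gains_table(nodes):
    """Rank terminal nodes by target rate and compute lift and cumulative lift."""
    total_cases = sum(n for _, _, n in nodes)
    total_targets = sum(t for _, t, _ in nodes)
    base_rate = total_targets / total_cases      # overall share of the target class
    rows, cum_t, cum_n = [], 0, 0
    for node, t, n in sorted(nodes, key=lambda r: r[1] / r[2], reverse=True):
        cum_t += t
        cum_n += n
        rows.append({
            "node": node,
            "pct_of_node": 100 * t / n,
            "cum_pct_target": 100 * cum_t / total_targets,
            "lift": (t / n) / base_rate,
            "cum_lift": (cum_t / cum_n) / base_rate,
        })
    return rows

rows = gains_table(nodes)
for row in rows:
    print(row["node"], round(row["lift"], 2), round(row["cum_lift"], 2))
```

Lift is the node's target rate divided by the overall rate (1,470 of 6,115 here), so a lift above 1 marks a segment richer in highly satisfied employees than the population as a whole.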
Employee Satisfaction Model: Splits
• Q8: Feel Welcome
– Surrogates: Q27 (family friendly), Q28 (inclusive environment), Q18 (good working conditions)
• Q18: Good working conditions
– Surrogate: Q17 (necessary support/materials to do job)
• Q3: Feeling of accomplishment
– Surrogate: Q6 (responsibilities good fit with interests/skills)
• Q7: Staff Competent
– Surrogates: Q15 (supervisor lets know work is appreciated), Q33 (trust management to take interests into account), Q5 (good opportunities for professional growth)
[Figure: CART tree with splits on Q8 (root), Q18, Q3, Q7, Q32, and Q36, ending in terminal nodes 1 through 8]
Employee Satisfaction: Q8 Split (root node)
Competitor  Variable  Split  Improvement
Winner      Q8        1      0.1174
1           Q18       1      0.1169
2           Q3        1      0.0998
3           Q35       1      0.0957
4           Q6        1      0.0951
5           Q7        1,2    0.094
Strongly agree feel welcome
Employee Satisfaction: Q18 Split (right side of root)
This is the best terminal node for satisfaction
Strongly agree feel welcome
Competitor  Variable  Split  Improvement
Winner      Q18       1      0.0271
1           Q3        1      0.0203
2           Q35       1      0.0195
3           Q6        1      0.0177
4           Q14       1,5    0.0172
5           Q13       1,5    0.0167
Employee Satisfaction Model: Key Variables
Variable  Score
Q18       100
Q8        81.02
Q14       72.03
Q27       55.11
Q26       50.53
Q28       50.12
Q5        17.66
Q3        14.14
Q17       14.05
Q11       13.15
Q7        11.89
Q13       11.56
Q6        11.27
Q33       11.03
Q16       9.6

Primary splitters only
Variable  Score
Q8        100
Q18       23.11
Q3        17.46
Q7        14.68
Q36       2.88
Q32       2.68
Employee Satisfaction Model: Key Rules
/* Rules for terminal node 8 */
Matches
• 1,414 surveys (23.1%)
• 859 highly satisfied (60.8%)
• 58.4% of all highly satisfied
RULE:
If (Q18 = 1 and Q8 = 1)
Then Highly Satisfied
P(0) = 0.39; P(1) = 0.61; Lift 2.5
If strongly agree that there are good working conditions and strongly agree that feel welcome, then highly satisfied.

/* Rules for terminal node 7 */
Matches
• 476 surveys (7.8%)
• 201 highly satisfied (42.2%)
• 13.7% of all highly satisfied
RULE:
If (Q8 = 1 and Q18 <> 1 and Q3 = 1 and Q32 = 1 or 2)
Then Highly Satisfied
P(0) = 0.58; P(1) = 0.42; Lift 1.8
If strongly agree that feel welcome and strongly agree working at the club gives feeling of personal accomplishment, and agree management will take interests into account, even if don’t strongly agree good working conditions, then highly satisfied.

/* Rules for terminal node 4 */
Matches
• 218 surveys (3.6%)
• 95 highly satisfied (43.6%)
• 6.5% of all highly satisfied
RULE:
If (Q8 <> 1 and Q7 = 1 or 2 and Q3 = 1 and Q36 = 1 or 2)
Then Highly Satisfied
P(0) = 0.56; P(1) = 0.44; Lift 1.8
If agree that I’ll be recognized for doing a good job, and strongly agree working at the club gives feeling of personal accomplishment, and agree that am paid fairly, even if don’t strongly agree feel welcome, then highly satisfied.
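Terminal-node rules like these translate directly into predicates that can score a new survey response. The sketch below encodes the three highly-satisfied rules (nodes 8, 7, and 4); the helper names and the dictionary layout of a response are assumptions for illustration.

```python
def agrees(answer):
    """Codes 1 and 2, written '= 1 or 2' in the rules above."""
    return answer in (1, 2)

def node_8(a):
    return a["Q18"] == 1 and a["Q8"] == 1

def node_7(a):
    return a["Q8"] == 1 and a["Q18"] != 1 and a["Q3"] == 1 and agrees(a["Q32"])

def node_4(a):
    return a["Q8"] != 1 and agrees(a["Q7"]) and a["Q3"] == 1 and agrees(a["Q36"])

HIGHLY_SATISFIED_RULES = {8: node_8, 7: node_7, 4: node_4}

def matching_node(answers):
    """Return the terminal node whose rule fires, or None if none of the three applies."""
    for node, rule in HIGHLY_SATISFIED_RULES.items():
        if rule(answers):
            return node
    return None

# A made-up response: feels welcome, working conditions rated 2, trusts management.
respondent = {"Q8": 1, "Q18": 2, "Q3": 1, "Q32": 2, "Q7": 3, "Q36": 4}
print(matching_node(respondent))
```

The `agrees` helper matters: typing `a["Q32"] == 1 or 2` literally in Python would always be truthy, so the slide shorthand has to be spelled out as a membership test.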
Employee Satisfaction Model: Unsatisfied Rules
/* Rules for terminal node 1 */
Matches
• 1,869 surveys (30.6%)
• 55 highly satisfied (2.9%)
• 3.7% of all highly satisfied
• 39.0% of all not highly satisfied
RULE:
If (Q8 <> 1 and Q7 <> 1 or 2)
Then not highly satisfied
P(0) = 0.96; P(1) = 0.04; Lift 0.12
If don’t strongly agree that feel welcome and don’t agree that will be properly recognized for a good job, then not highly satisfied.

/* Rules for terminal node 5 */
Matches
• 640 surveys (10.5%)
• 92 highly satisfied (14.4%)
• 6.3% of all highly satisfied
• 11.8% of all not highly satisfied
RULE:
If (Q8 = 1 and Q18 <> 1 and Q3 <> 1)
Then not highly satisfied
P(0) = 0.86; P(1) = 0.14; Lift 0.60
If don’t strongly agree that there are good working conditions and work doesn’t give a feeling of accomplishment, even though strongly agree that feel welcome, then not highly satisfied.

/* Rules for terminal node 2 */
Matches
• 1,225 surveys (20.0%)
• 124 highly satisfied (10.1%)
• 8.4% of all highly satisfied
• 23.7% of all not highly satisfied
RULE:
If (Q8 <> 1 and Q7 = 1 or 2 and Q3 <> 1)
Then not highly satisfied
P(0) = 0.90; P(1) = 0.10; Lift 0.42
If don’t strongly agree that feel welcome and work doesn’t give a feeling of accomplishment, even though I agree that I will be properly recognized for a good job, then not highly satisfied.
Recommend to Friend (=1) Model: Data Information
File: modeling data with binarized dependents w missing.txt
Target Variable: Q44_1
Predictor Variables: Q66, Q67, Q68, Q69, Q3, Q4, Q5, Q6, Q7, Q8, Q9, Q10, Q11, Q12, Q13, Q14, Q15, Q16, Q17, Q18, Q19, Q20, Q21, Q22, Q23, Q24, Q25, Q26, Q27, Q28, Q29, Q30, Q31, Q32, Q33, Q34, Q35, Q36, Q37, Q38, Q45, Q46, Q47, Q48, Q49, Q50, Q51, Q52, Q53, Q54, Q55, Q56, Q57, Q58, Q59, Q60, Q61, Q62, Q63, Q64, Q65

Class  N Cases  Pct
0      3,958    64.7%
1      2,157    35.3%

This model includes Q19 (am treated with respect), and is the best model to report
Recommend to Friend Model Performance
Class  N Cases  N Misclassified  Pct. Misclassified
0      3,958    894              22.59
1      2,157    525              24.34

Node  Cases Target Class  % of Node Tgt. Class  % Target Class  Cum % Tgt. Class  Cum % Pop  % Pop  Cases in Node  Cum Lift  Lift
10    1,113               71.90                 51.60           51.60             25.32      25.32  1,548          2.04      2.04
9     110                 58.51                 5.10            56.70             28.39      3.07   188            2.00      1.66
5     198                 56.57                 9.18            65.88             34.11      5.72   350            1.93      1.60
4     128                 49.81                 5.93            71.81             38.32      4.20   257            1.87      1.41
8     83                  45.36                 3.85            75.66             41.31      2.99   183            1.83      1.29
3     215                 29.49                 9.97            85.63             53.23      11.92  729            1.61      0.84
7     36                  24.83                 1.67            87.30             55.60      2.37   145            1.57      0.70
2     132                 15.60                 6.12            93.42             69.44      13.84  846            1.35      0.44
6     12                  14.12                 0.56            93.97             70.83      1.39   85             1.33      0.40
1     130                 7.29                  6.03            100.00            100.00     29.17  1,784          1.00      0.21
Recommend to Friend Model Splits
• Q19: Treated with respect
– Surrogates: Q18 (good working conditions) and Q8 (feel welcome)
• Q37: Compensation practice is fair
– Surrogates: Q36 (I am paid fairly)
• Q45: How think members rate club
– Surrogates: Q47, Q46, Q60 (member-cleanliness, enough equip., check on progress)
• Q33: Trust management to take interests into account
– Surrogates: Q32 (management keeps promises), Q34 (leaders remove roadblocks to inclusion)
• Q5: Good opportunities for professional growth
– Surrogates: Q4 (responsibilities good fit with interests), Q7 (appropriately recognized)
• Q8: Feel welcome
– Surrogates: Q7
[Figure: CART tree with splits on Q19 (root), Q37, Q45, Q35, Q50, Q33, Q45, Q5, and Q8, ending in terminal nodes 1 through 10]
Recommend to Friend Model Key Variables
Variable  Score
Q8        100.0
Q19       99.1
Q18       97.4
Q15       64.5
Q16       63.1
Q14       61.3
Q33       39.6
Q35       33.8
Q32       24.7
Q34       23.9
Q31       23.9
Q9        21.5
Q7        15.4
Q45       14.8
Q37       12.9
Q5        10.0
Q36       9.7
Q4        4.3
Q38       4.0
Q22       1.6
Q50       1.4
Q26       1.0
Q48       0.8
Q47       0.7
Q28       0.6
Q46       0.6
Q11       0.3
Q51       0.3
Q60       0.1
Q49       0.0

Primary splitters only
Variable  Score
Q19       100
Q33       32.23
Q45       14.94
Q37       12.99
Q5        8.98
Q8        3.03
Q35       1.67
Q50       1.34
Recommend to Friend Model: Key Rules
/* Rules for terminal node 10 */
Matches
• 1,548 surveys (25.3%)
• 1,113 recommend (71.9%)
• 51.6% of all strong recommends
RULE:
If (Q19 = 1 and Q37 = 1 or 2)
Then Recommend = 1
P(0) = 0.281; P(1) = 0.719; Lift = 2.0
If strongly agree that supervisors treat me with respect, and agree that compensation practice is fair, then strongly agree that will recommend to friend.

/* Rules for terminal node 9 */
Matches
• 188 surveys (3.1%)
• 110 recommend (58.5%)
• 5.1% of all strong recommends
RULE:
If (Q19 = 1 and Q37 <> 1 or 2 and Q45 = 1)
Then Recommend = 1
P(0) = 0.415; P(1) = 0.585; Lift = 1.7
If strongly agree that supervisors treat me with respect, and believe that members strongly agree they are highly satisfied, even though don’t agree compensation practice is fair, then strongly agree that will recommend to friend.

/* Rules for terminal node 5 */
Matches
• 350 surveys (5.7%)
• 198 recommend (56.6%)
• 9.2% of all strong recommends
RULE:
If (Q19 <> 1 and Q33 = 1 or 2 and Q45 = 1)
Then Recommend = 1
P(0) = 0.434; P(1) = 0.566; Lift = 1.6
If agree that trust management will take my interests into account, and believe that members strongly agree they are highly satisfied, even though don’t strongly agree supervisors treat me with respect, then strongly agree that will recommend to friend.
Recommend to Friend Model: Rules for Not Recommending
/* Rules for terminal node 1 */
Matches
• 1,784 surveys (29.2%)
• 130 highly recommend (7.3%), 94% don’t highly rec.
• 6.0% of all highly recommend
RULE:
If (Q31 <> 1 and Q22 <> 1)
Then Don’t Strongly Recommend
P(0) = 0.94; P(1) = 0.06
If don’t strongly agree that supervisors treat me with respect, and don’t agree that management will take interests into account, then don’t strongly agree that will recommend to friend.

/* Rules for terminal node 2 */
Matches
• 846 surveys (13.84%)
• 132 highly recommend (15.6%), 84.4% don’t highly rec.
• 6.1% of all highly recommend
RULE:
If (Q19 <> 1 and Q33 = 1 or 2 and Q45 <> 1 and Q5 <> 1 or 2)
Then Don’t Strongly Recommend
P(0) = 0.84; P(1) = 0.16
If don’t strongly agree that supervisors treat me with respect, and don’t strongly believe that members are highly satisfied, and don’t agree that there are good opportunities for professional growth, then even though agree that management will take interests into account, don’t strongly agree that will recommend to friend.
Intend to Continue Working at Club (=1) Model: Data Information
File: modeling data with binarized dependents w missing.txt
Target Variable: Q39_1
Predictor Variables: Q66, Q67, Q68, Q69, Q3, Q4, Q5, Q6, Q7, Q8, Q9, Q10, Q11, Q12, Q13, Q14, Q15, Q16, Q17, Q18, Q20, Q21, Q22, Q23, Q24, Q25, Q26, Q27, Q28, Q29, Q30, Q31, Q32, Q33, Q34, Q35, Q36, Q37, Q38, Q45, Q46, Q47, Q48, Q49, Q50, Q51, Q52, Q53, Q54, Q55, Q56, Q57, Q58, Q59, Q60, Q61, Q62, Q63, Q64, Q65

Class  N Cases  Pct
0      3,030    49.6%
1      3,085    50.4%
Intend to Continue Working at Club: Model Performance
Class  N Cases  N Misclassified  Pct. Misclassified
0      3,030    868              28.65
1      3,085    849              27.52

Node  Cases Target Class  % of Node Tgt. Class  % Target Class  Cum % Tgt. Class  Cum % Pop  % Pop  Cases in Node  Cum Lift  Lift
10    1,099               80.81                 35.62           35.62             22.24      22.24  1,360          1.60      1.60
9     486                 69.63                 15.75           51.38             33.66      11.42  698            1.53      1.38
5     349                 67.38                 11.31           62.69             42.13      8.47   518            1.49      1.34
8     100                 65.36                 3.24            65.93             44.63      2.50   153            1.48      1.30
4     202                 53.87                 6.55            72.48             50.76      6.13   375            1.43      1.07
7     75                  43.86                 2.43            74.91             53.56      2.80   171            1.40      0.87
2     224                 35.33                 7.26            82.17             63.93      10.37  634            1.29      0.70
3     43                  33.59                 1.39            83.57             66.02      2.09   128            1.27      0.67
6     65                  30.23                 2.11            85.67             69.53      3.52   215            1.23      0.60
1     442                 23.73                 14.33           100.00            100.00     30.47  1,863          1.00      0.47
Intend to Continue Working at Club Model: Splitters
• Q8: Feel Welcome
– Surrogate: Q27 (family friendly place), Q28 (diverse environment), Q18 (good working conditions)
• Q69: Age
– Surrogate: Q66 (how long worked at Club), Q68 (education)
• Q18: Good Working Conditions
– Q17 (have necessary support and materials to do job)
• Q5: Good Opportunities for Professional Growth
– Q7, Q33 (Management will take my interests into account)
• Q7: Will be Recognized for Good Job
– Q15 (Work is appreciated)
[Figure: CART tree with splits on Q8 (root), Q69, Q18, Q5, Q6, Q5, Q7, Q66, and Q56, ending in terminal nodes 1 through 10]
Intend to Continue Working at Club Model: Key Variables
Variable Score
Q8 100
Q18 84.13
Q27 63.23
Q11 57.03
Q28 50.45
Q26 48.54
Q7 43.43
Q5 37.23
Q33 32.81
Q31 23.56
Q69 22.21
Q4 21.86
Q9 18.79
Q3 13.82
Q13 9.98
Q14 9.46
Q16 8.12
Q15 6.03
Q66 5.26
Q17 3.99
Q56 2.15
Q6 2.03
Q23 1.63
Q68 1.23
Primary splitters only
Variable Score
Q8 100
Q5 37.07
Q69 17.48
Q7 11.24
Q18 10.7
Q66 5.19
Q56 2.15
Q6 2.03
Intend to Continue Working at Club Model: Key Rules
/* Rules for terminal node 10 */
Matches
• 1,360 surveys (22.2%)
• 1,099 intend to continue (80.8%)
• 35.6% of all intend to continue
RULE:
If (Q8 = 1 and Q69 >= 2.5)
Then Intend to continue
P(0) = 0.19; P(1) = 0.81; Lift = 1.6
If strongly agree that feel welcome and am 35 years old or older, then strongly agree that intend to continue working at the club.

/* Rules for terminal node 9 */
Matches
• 698 surveys (11.4%)
• 486 intend to continue (69.6%)
• 15.8% of all intend to continue
RULE:
If (Q8 = 1 and Q18 = 1 and Q69 <= 2.5)
Then Intend to continue
P(0) = 0.30; P(1) = 0.70; Lift = 1.4
If strongly agree that feel welcome and strongly agree that there are good working conditions, am older than 35 years old, then strongly agree that intend to continue working at the club.

/* Rules for terminal node 5 */
Matches
• 518 surveys (8.5%)
• 349 intend to continue (67.4%)
• 11.3% of all intend to continue
RULE:
If (Q8 <> 1 and Q5 = 1 or 2 and Q7 = 1 or 2 and Q66 > 2.5)
Then Intend to continue
P(0) = 0.32; P(1) = 0.68; Lift = 1.3
If I strongly agree that if I do a good job I’ll be recognized, and I strongly agree that there are good opportunities for professional growth, and I have worked at the club for more than 2 years, even though don’t strongly agree that feel welcome, then I strongly agree that intend to continue working at the club.
Intend to Continue Working at Club Model: Rules for Don’t Strongly Intend to Continue
/* Rules for terminal node 1 */
Matches
• 1,863 surveys (30.5%)
• 442 strongly intend to continue working (23.7%)
• 14.3% of all strongly intend to continue working
• 46.9% of all not strongly intending to continue
RULE:
If (Q8 <> 1 and Q5 <> 1 or 2)
Then not strongly intending to continue working at club
P(0) = 0.76; P(1) = 0.24; Lift 0.47
If don’t strongly agree that feel welcome and don’t strongly agree that there are good opportunities for professional growth, then don’t strongly agree that intend to continue working at the club.

/* Rules for terminal node 2 */
Matches
• 634 surveys (10.4%)
• 224 strongly intend to continue working (35.3%)
• 7.3% of all strongly intend to continue working
• 13.5% of all not strongly intending to continue working
RULE:
If (Q8 <> 1 and Q5 = 1 or 2 and Q7 <> 1 or 2)
Then not strongly intending to continue working at club
P(0) = 0.65; P(1) = 0.35; Lift 0.70
If don’t strongly agree that feel welcome and don’t strongly agree that if I do a good job I’ll be recognized, even though I strongly agree that there are good opportunities for professional growth, then don’t strongly agree that intend to continue working at the club.
Summary of Results
• Satisfaction Model
– Top two rules identify 65% of most satisfied
– Top three rules identify 79% of most satisfied
• Recommend to Friend
– Top three rules identify 66% of most likely to recommend to friend
• Intend to Keep Working at Club
– Top three rules identify 63% of most likely to keep working
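These coverage figures are running totals of the “% Target Class” column in each model's gains table, since terminal nodes never overlap. A short sketch of that arithmetic, with the shares copied from the three tables and the dictionary keys chosen only for illustration:

```python
# Share of the target class captured by each of the three top-ranked terminal nodes,
# copied from the "% Target Class" column of the three gains tables.
pct_target = {
    "satisfaction": [58.44, 6.46, 13.67],
    "recommend": [51.60, 5.10, 9.18],
    "intend_to_continue": [35.62, 15.75, 11.31],
}

def captured(shares, k):
    """Percent of the target class covered by the top k rules (nodes do not overlap)."""
    return round(sum(shares[:k]), 1)

sat_top2 = captured(pct_target["satisfaction"], 2)
sat_top3 = captured(pct_target["satisfaction"], 3)
rec_top3 = captured(pct_target["recommend"], 3)
stay_top3 = captured(pct_target["intend_to_continue"], 3)
print(sat_top2, sat_top3, rec_top3, stay_top3)
```

Rounded to whole percentages, these totals give the 65%, 79%, 66%, and 63% quoted above.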
Summary of Results
• Satisfaction keys:
– Make an environment where employees feel welcome and have a sense of purpose
• Recommend to a Friend keys:
– Supervisors treat employees with respect, and either pay is good or it is perceived that members really like the club
• Will work at club in a year’s time:
– For those under 35: feel welcome (relationships)
– For those over 35 (or who have worked at the club a long time): feel welcome and good working conditions
– For those who don’t feel welcome: need good opportunities for professional growth
Conclusions
• Trees can be used to provide concise summaries of behavioral tendencies from surveys
– Regression shows global, average attitudes
– Trees show specific, localized attitudes
• Two or three rules can describe nearly 2/3 of all employee attitudes of interest
– Rules make sense, and are easy to explain
– Rules are actionable