The influence of the questionnaire design on the magnitude of change scores
Sandra Nolte (1), Gerald Elsworth (2), Richard Osborne (2)
(1) Association of Dermatological Prevention, Hamburg, GERMANY
(2) Deakin University, Melbourne, AUSTRALIA
Mar 30, 2015
The measurement of program outcomes
… is important because …
… it is the basis for continuous quality assurance / improvement
… it delivers crucial information for a wide range of stakeholders
… it can / should deliver information on what works and what doesn’t
Bias in outcomes assessment
However …
while program evaluations are crucial, there are continuous concerns about:
biases that may threaten the validity of outcomes data
one such bias that is a common concern in pre-test / post-test data is:
Response Shift (Howard, 1979)
Response Shift
Change in common metric because of redefinition, reprioritisation and/or recalibration of the target construct (Schwartz & Sprangers, 1999)
Common “remedy” to circumvent Response Shift: collection of retrospective pre-test data
• [actual pre-test - retrospective pre-test] = magnitude and direction of Response Shift
• [post-test - retrospective pre-test] = “true” program outcome (Visser et al., 2005)
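The two differences above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name, the dictionary keys and the example scores are hypothetical and do not come from the study, and the "conventional_change" entry (post-test minus actual pre-test) is added purely for comparison.

```python
def response_shift_estimates(actual_pre, retro_pre, post):
    """Apply the retrospective pre-test arithmetic to one score triple."""
    return {
        # actual pre-test - retrospective pre-test
        "response_shift": actual_pre - retro_pre,
        # post-test - retrospective pre-test
        "adjusted_outcome": post - retro_pre,
        # classic pre-test / post-test difference, shown for comparison only
        "conventional_change": post - actual_pre,
    }


# Hypothetical scores on a 6-point scale (not data from the study).
estimates = response_shift_estimates(actual_pre=4.0, retro_pre=3.5, post=5.0)
print(estimates)
```

With these made-up scores the respondent rates the "before" level lower in hindsight, so the adjusted outcome comes out larger than the conventional change.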
The retrospective pre-test
• Collected after an intervention, generally in close proximity to post-tests
How “good” (i.e. valid, reliable) are retrospective pre-test data?
• Past research generally focused on comparing retrospective pre-tests with actual pre-tests; however,
• only a few studies tested the influence of the scores on each other
• none tested the psychometric performance of retrospective pre-tests
Study aim
1) To explore the influence of posing retrospective pre-test questions on ratings of post-tests
2) To explore whether other types of questions (i.e. transition questions) influenced post-tests
Research design
• Setting: chronic disease self-management courses
• Randomised design: three versions of the Health Education Impact Questionnaire (heiQ) were distributed at post-test (randomised within courses)
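One way to randomise questionnaire versions within courses is sketched below. The slides do not describe the actual allocation procedure, so the shuffle-then-rotate approach, the function name, the course names and the participant IDs are all assumptions made for illustration.

```python
import random
from collections import Counter

VERSIONS = ["I", "II", "III"]  # the three post-test questionnaire versions


def randomise_within_course(participant_ids, rng):
    """Shuffle one course's participants, then deal versions out in rotation."""
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    return {pid: VERSIONS[i % len(VERSIONS)] for i, pid in enumerate(shuffled)}


# Hypothetical courses and participant IDs, for illustration only.
courses = {
    "course_A": [f"A{i}" for i in range(10)],
    "course_B": [f"B{i}" for i in range(7)],
}
rng = random.Random(2015)  # fixed seed so the allocation is reproducible
allocation = {
    name: randomise_within_course(ids, rng) for name, ids in courses.items()
}
for name, assigned in allocation.items():
    print(name, dict(Counter(assigned.values())))
```

Dealing versions out in rotation after a shuffle keeps the three groups close to equal in size inside every course, so course-level differences cannot be confounded with the questionnaire version.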
Research design
• Randomised design – Version I
1) post-test ONLY (n=331) (6-point Likert scale: “strongly disagree” to “strongly agree”)
Group I: post-test ONLY
Please answer the following questions:
Check a box by crossing it: Right now
1 On most days of the week, I do at least one activity to improve my health (e.g., walking, relaxation, exercise)
2 I am very good at using aids and devices to make my life easier
3 Most days I am doing some of the things I really enjoy
4 As well as seeing my doctor, I regularly monitor changes in my health
5 I often worry about my health
Research design
• Randomised design – Version II
1) post-test ONLY (n=331) (6-point Likert scale: “strongly disagree” to “strongly agree”)
2) post-test + transition questions (n=304) (transition Qs: 5-point response scale: “much worse” to “much better”)
Group II: post-test + transition question
Please answer the following questions:
Check a box by crossing it: Right now / Compared with before the program
1 On most days of the week, I do at least one activity to improve my health (e.g., walking, relaxation, exercise)
2 I am very good at using aids and devices to make my life easier
3 Most days I am doing some of the things I really enjoy
4 As well as seeing my doctor, I regularly monitor changes in my health
5 I often worry about my health
Research design
• Randomised design – Version III
1) post-test ONLY (n=331) (6-point Likert scale: “strongly disagree” to “strongly agree”)
2) post-test + transition questions (n=304) (transition Qs: 5-point response scale: “much worse” to “much better”)
3) post-test + retrospective pre-test (n=314) (both 6-point Likert scale: “strongly disagree” to “strongly agree”)
Group III: post-test + retro pre-test
Please answer the following questions:
Check a box by crossing it: Right now / Before the program
1 On most days of the week, I do at least one activity to improve my health (e.g., walking, relaxation, exercise)
2 I am very good at using aids and devices to make my life easier
3 Most days I am doing some of the things I really enjoy
4 As well as seeing my doctor, I regularly monitor changes in my health
5 I often worry about my health
Results
Across the three randomised groups:
no significant differences in:
demographic characteristics
pre-test scores (= scores collected before intervention)
The randomisation worked
Results (cont.)
• Posing transition questions in addition to post-test questions had hardly any influence on post-test levels (Group II)
• In contrast, posing retrospective pre-test questions after an intervention had significant influence on ratings of post-tests in six of the eight heiQ subscales:
• Post-test ONLY (Group I)
  • mean post-test: 4.76
• Post-test + retrospective pre-test (Group III)
  • mean post-test: 4.96 (on 6-pt Likert scale)
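The two summary means appear to be the averages of the post-test means of the six subscales flagged as significant in the results table. That reading is an inference from the arithmetic, not something the slides state. A minimal check, with the values copied from the results table:

```python
# Post-test means from the results table: (Group I, Group III).
# Only the six subscales flagged as significantly different are included.
post_test_means = {
    "Positive and Active Engagement in Life": (4.78, 5.00),
    "Health-Directed Behaviour": (4.65, 4.83),
    "Skill and Technique Acquisition": (4.64, 4.90),
    "Constructive Attitudes and Approaches": (4.72, 4.90),
    "Self-Monitoring and Insight": (4.96, 5.16),
    "Health Service Navigation": (4.84, 5.00),
}

group_i = [g1 for g1, _ in post_test_means.values()]
group_iii = [g3 for _, g3 in post_test_means.values()]
mean_i = sum(group_i) / len(group_i)
mean_iii = sum(group_iii) / len(group_iii)
print(f"Group I: {mean_i:.3f}")
print(f"Group III: {mean_iii:.3f}")
print(f"Difference: {mean_iii - mean_i:.3f}")
```

The averages come to about 4.765 and 4.965, which matches the reported 4.76 and 4.96 up to rounding and corresponds to a difference of roughly 0.2 scale points between the two questionnaire versions.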
Subscale | Time | Group I, Mean (SD) | Group II, Mean (SD) | Group III, Mean (SD)
1. Positive and Active Engagement in Life | Pretest | 4.47 (0.92) | 4.51 (1.02) | 4.42 (0.98)
1. Positive and Active Engagement in Life | Posttest* | 4.78 (0.78) | 4.87 (0.71) | 5.00 (0.74)
2. Health-Directed Behaviour | Pretest | 4.31 (1.18) | 4.42 (1.12) | 4.30 (1.16)
2. Health-Directed Behaviour | Posttest* | 4.65 (0.98) | 4.85 (0.85) | 4.83 (1.00)
3. Skill and Technique Acquisition | Pretest | 4.08 (0.92) | 4.17 (0.95) | 4.10 (0.96)
3. Skill and Technique Acquisition | Posttest* | 4.64 (0.72) | 4.79 (0.67) | 4.90 (0.69)
4. Constructive Attitudes and Approaches | Pretest | 4.51 (0.93) | 4.57 (0.96) | 4.42 (1.02)
4. Constructive Attitudes and Approaches | Posttest* | 4.72 (0.85) | 4.82 (0.86) | 4.90 (0.86)
5. Self-Monitoring and Insight | Pretest | 4.73 (0.65) | 4.79 (0.67) | 4.74 (0.68)
5. Self-Monitoring and Insight | Posttest* | 4.96 (0.55) | 5.03 (0.50) | 5.16 (0.52)
6. Health Service Navigation | Pretest | 4.62 (0.90) | 4.65 (0.92) | 4.64 (0.95)
6. Health Service Navigation | Posttest* | 4.84 (0.81) | 4.83 (0.86) | 5.00 (0.79)
7. Social Integration and Support | Pretest | 4.26 (1.13) | 4.27 (1.17) | 4.16 (1.21)
7. Social Integration and Support | Posttest | 4.43 (1.06) | 4.53 (1.03) | 4.50 (1.13)
8. Emotional Well-Being | Pretest | 3.29 (1.23) | 3.28 (1.26) | 3.29 (1.21)
8. Emotional Well-Being | Posttest | 3.57 (1.16) | 3.55 (1.22) | 3.54 (1.18)
* Significant differences; robust ANOVA (Brown-Forsythe), p<0.05
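The Brown-Forsythe robust ANOVA named in the table footnote can be approximated from summary statistics alone. The sketch below uses the post-test means and SDs for Positive and Active Engagement in Life and assumes that the group sizes equal the questionnaire counts given on the design slides (331, 304, 314); the analysed samples per subscale may well have been smaller, so the result is indicative only. The function name is hypothetical.

```python
def brown_forsythe_f(ns, means, sds):
    """Brown-Forsythe F* for equality of means, from per-group n, mean and SD."""
    total_n = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / total_n
    between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    # Each group's variance is weighted by (1 - n_j / N) instead of being pooled.
    weighted_vars = [(1 - n / total_n) * sd ** 2 for n, sd in zip(ns, sds)]
    f_star = between / sum(weighted_vars)
    # Satterthwaite-type approximation for the denominator degrees of freedom.
    shares = [w / sum(weighted_vars) for w in weighted_vars]
    df2 = 1 / sum(c ** 2 / (n - 1) for c, n in zip(shares, ns))
    return f_star, len(ns) - 1, df2


# ASSUMPTION: group sizes taken from the questionnaire counts on the design slides.
ns = [331, 304, 314]
means = [4.78, 4.87, 5.00]  # post-test means, Groups I to III
sds = [0.78, 0.71, 0.74]    # post-test SDs, Groups I to III
f_star, df1, df2 = brown_forsythe_f(ns, means, sds)
print(f"F* = {f_star:.2f} on {df1} and {df2:.0f} degrees of freedom")
```

Under these assumptions F* lies well above 3.0, the approximate 5% critical value of F with 2 and several hundred degrees of freedom, which is in line with the asterisk on this subscale. The sketch deliberately stays dependency-free; an exact p-value would need an F-distribution routine such as `scipy.stats.f.sf`.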
Conclusions
Asking retrospective pre-test questions at post-test has a significant influence on the ratings of post-test levels
The influence was so substantial that it leads to different conclusions about program effectiveness
It remains uncertain whether the application of retrospective pre-tests provides a more or less accurate reflection of the impact of chronic disease self-management programs
Conclusions
“It remains uncertain whether the application of retrospective pre-tests provides a more or less accurate reflection of the impact of chronic disease self-management programs”
However, the psychometric properties of retrospective pre-test data seem to be substantially weaker than those of classic pre-test data
Classic pre-test / post-test design may be the more valid approach to evaluate self-management programs
Discussion
Possible explanations:
1. The cognitive task may have triggered distorted responses, consistent with the following theories:
Effort justification (Hill & Betz, 2005)
Implicit theory of change (Ross, 1989)
Social desirability (Crowne & Marlowe, 1964)
2. The task of remembering pre-test levels might have been too complex for some respondents, making these data less reliable
Discussion (cont.)
3. It remains to be shown what people think while responding to questionnaires
qualitative research into response processes is essential to help understand & interpret self-report data