Adjust Survey Response Distributions Using Multiple Imputation: A Simulation with External Validation
Frank Liu & Yu-Sung Su

We Want a Better Guess of Respondents' Preferences.

Dec 30, 2015


Transcript
Page 1: We Want a Better Guess of Respondents' Preferences.

Adjust Survey Response Distributions Using Multiple Imputation: A Simulation with External Validation

Frank Liu & Yu-Sung Su

Page 2: We Want a Better Guess of Respondents' Preferences.

We Want a Better Guess of Respondents' Preferences.

Page 3: We Want a Better Guess of Respondents' Preferences.

How can multiple imputation be used to make better guesses about respondents’ preferences?

Page 4: We Want a Better Guess of Respondents' Preferences.

• Calculating proportions from raw data while omitting the non-response data results in biased proportions for the variables of interest (Bernhagen & Marsh, 2007).

• Multiple Imputation (MI) is a cost-efficient and methodologically sound approach for making better use of raw survey and poll data (Barzi, 2004).
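
The first point can be illustrated with a small simulation. This is only a sketch with invented numbers: the function name `complete_case_share` and the response probabilities are assumptions, not values from the survey. It shows how the share computed from answered cases alone drifts away from the true share when one camp answers more readily than the other.

```python
import random

def complete_case_share(n=20000, true_blue=0.5,
                        p_answer_blue=0.5, p_answer_green=0.3, seed=1):
    """Simulate a two-camp survey in which answering depends on the camp.

    Returns (true Blue share, Blue share among those who answered).
    All probabilities are illustrative assumptions.
    """
    rng = random.Random(seed)
    answered = []
    blue_total = 0
    for _ in range(n):
        is_blue = rng.random() < true_blue
        blue_total += is_blue
        # Hypothetical response behaviour: Blue supporters answer more often.
        p_answer = p_answer_blue if is_blue else p_answer_green
        if rng.random() < p_answer:
            answered.append(is_blue)
    true_share = blue_total / n
    observed_share = sum(answered) / len(answered)
    return true_share, observed_share

true_share, observed_share = complete_case_share()
print(f"true Blue share: {true_share:.3f}")
print(f"Blue share among answered cases: {observed_share:.3f}")
```

With these assumed rates, the answered-only share overstates Blue support, which is the kind of bias that dropping non-response can introduce.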

Page 5: We Want a Better Guess of Respondents' Preferences.

Picture Source: Kyle F. Edwards, Christopher A. Klausmeier, and Elena Litchman (2011)

Page 6: We Want a Better Guess of Respondents' Preferences.

MI Modeling vs. MI for DV

Page 7: We Want a Better Guess of Respondents' Preferences.

While most studies use MI for modeling, there is room to examine the extent to which MI can be applied to electoral forecasting (King et al., 2001; Snijders & Bosker, 2011).

Page 8: We Want a Better Guess of Respondents' Preferences.

Joint MI (Amelia II)

vs.

Conditional MI (mi & MICE)

Page 9: We Want a Better Guess of Respondents' Preferences.

Joint MI assumes a multivariate normal distribution.

vs.

Conditional MI does NOT.

Page 10: We Want a Better Guess of Respondents' Preferences.

mi takes advantage of existing regression models to handle various types of variables (Su, Gelman, Hill, & Yajima, 2011):

• using a logistic regression model to predict a binary outcome,

• using an ordered logit regression model to predict an ordinal outcome,

• using a multinomial logit regression model to predict an unordered categorical outcome.

Page 11: We Want a Better Guess of Respondents' Preferences.

mi results can be very biased when the data include extreme values (He & Raghunathan, 2009).

Page 12: We Want a Better Guess of Respondents' Preferences.

Data

RDD telephone survey data on Taiwanese voters’ national identification (N=1,078), collected from Jan. 23 to Feb. 4, 2013 (AAPOR Response Rate 3: 21.56%).

Page 13: We Want a Better Guess of Respondents' Preferences.

Sensitive Question (DV) and Item Non-Response

Variable | Question Wording | Distribution | Missing (%)
camp (v33) | Do you support any political party? | Blue camp: 215; Green camp: 204 | 658 (61.04)

Page 14: We Want a Better Guess of Respondents' Preferences.
Page 15: We Want a Better Guess of Respondents' Preferences.

Auxiliary Variables (1)

Variable | Wording
v6 | Do you agree that we can influence the government with our votes?
v7 | Do you agree that we have little influence on what the government plans to do?
v9 | Do you agree that we should use Taiwanese as the major language in Taiwan?
v10 | Do you agree that Taiwanese children perform better than those in mainland China?
v17 | Do you agree that those identifying with Taiwan can be called Taiwanese?
v18 | Do you agree that Chinese from mainland China have more money than sense?
v20 | Do you agree that people should not be called Taiwanese if they don't know Matsu (the name of a sea goddess widely worshipped on the SE China coast and in SE Asia)?

Page 16: We Want a Better Guess of Respondents' Preferences.

Variable | Wording
v22 | Do you agree that our government should adopt a more restrictive policy toward mainland China tourists?
v27 | Some call themselves Taiwanese, some Chinese, and some both. How about you?
v28 | Do you agree that “Taiwan” is the formal name of our country?
v29 | Do you agree that Taiwanese people and those in mainland China belong to the same nation?
v32 | Do you agree that we should seek unification with mainland China if it becomes a democracy?
v37 | Do you agree that the two sides of the Taiwan Strait will eventually become one country?
v38 | Have you been to mainland China in the past two years?
age | Which year were you born? (re-coded to real age)
edu | What is your education level?
sex | (coded by interviewer)

Page 17: We Want a Better Guess of Respondents' Preferences.

Study 1: Check Missingness Patterns

• Use a conditional MI algorithm to check the missingness patterns of the data and to conduct MI.

• The data should be at least missing at random (MAR), i.e., given the observed data, the missingness mechanism does not depend on the unobserved data.

• MCAR > MAR > MNAR

Page 18: We Want a Better Guess of Respondents' Preferences.
Page 19: We Want a Better Guess of Respondents' Preferences.

Check the Missingness Patterns

1: Specify a conditional MI model

• Include many valid auxiliary variables (17).

2: Contrast Simulated Data with MI Data

• Set the imputed datasets as the baseline for comparison.

• Create three copies of the MI data.

• Randomly remove values from the completely imputed data according to the original missing rate of the data, i.e., 61%.
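
The random-removal step can be sketched as follows. This is a minimal illustration, not the authors' R code: `make_mcar` is a hypothetical helper, and the completed camp column is randomly generated stand-in data (1 = Green, 0 = Blue) rather than the real imputed dataset.

```python
import random

def make_mcar(values, missing_rate=0.61, seed=0):
    """Return a copy of `values` with a random subset replaced by None.

    Every case has the same chance of being removed, so the
    missingness is completely at random (MCAR).
    """
    rng = random.Random(seed)
    n_missing = round(len(values) * missing_rate)
    drop = set(rng.sample(range(len(values)), n_missing))
    return [None if i in drop else v for i, v in enumerate(values)]

# Stand-in for one completed (imputed) camp column; values are invented.
data_rng = random.Random(42)
completed_camp = [int(data_rng.random() < 0.474) for _ in range(1078)]

# Three copies, each with its own random deletions at the 61% rate.
mcar_copies = [make_mcar(completed_camp, seed=s) for s in (1, 2, 3)]

for k, copy in enumerate(mcar_copies, start=1):
    n_missing = sum(v is None for v in copy)
    print(f"copy #{k}: {n_missing} of {len(copy)} values removed")
```

Each copy would then be re-imputed and compared with the baseline imputed data.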

Page 20: We Want a Better Guess of Respondents' Preferences.

• We create simulated datasets by modeling the missingness of each variable, conditional on a linear combination of the rest of the variables, with logistic regressions.

• Then we use the predicted missingness to create missing values in the three imputed datasets.

• In short, we compare the simulated datasets with the original MI dataset.
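
A rough sketch of this idea for a single variable is given below. It is an illustration under stated assumptions, not the authors' implementation: the two auxiliary variables, the missingness indicator, and the completed camp column are all simulated, and `fit_logistic` and `predict` are hypothetical helpers that fit a plain logistic regression by gradient descent.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit a logistic regression by batch gradient descent.

    Returns the weights as [intercept, coef_1, coef_2, ...].
    """
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w

def predict(w, xi):
    """Predicted probability that the value is missing for one case."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))

rng = random.Random(7)
n = 1000
# Invented, fully observed auxiliary variables (standardised).
aux = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
# Invented indicator of original missingness on camp (1 = missing),
# made to depend on the first auxiliary variable only.
was_missing = [int(rng.random() < sigmoid(0.45 + 1.0 * x[0])) for x in aux]
# Invented completed camp column (1 = Green, 0 = Blue).
completed_camp = [int(rng.random() < 0.5) for _ in range(n)]

# Model missingness on the observed auxiliary variables ...
w = fit_logistic(aux, was_missing)
p_missing = [predict(w, x) for x in aux]
# ... then use the predicted probabilities to delete values (MAR).
mar_camp = [None if rng.random() < p else v
            for p, v in zip(p_missing, completed_camp)]

rate = sum(v is None for v in mar_camp) / n
print("weights:", [round(wj, 2) for wj in w])
print(f"share of values removed: {rate:.2f}")
```

Because deletion depends only on observed auxiliary variables, the resulting dataset is MAR by construction; repeating this for each variable and for each of the three copies would follow the procedure on the slide.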

Page 21: We Want a Better Guess of Respondents' Preferences.

Results of the Check for MCAR (1)

Page 22: We Want a Better Guess of Respondents' Preferences.

Results of the Check for MCAR (2)

Dataset | Mean | SE | Blue | Green
Original Data | 0.488 | 0.500 | 51.1% | 48.8%
Imputed Original Data | 0.474+ | 0.500+ | 52.6% | 47.4%
#1 Imputed MCAR Data | 0.464+ | 0.499+ | 53.5% | 46.5%
#2 Imputed MCAR Data | 0.464+ | 0.500+ | 53.6% | 46.4%
#3 Imputed MCAR Data | 0.453+ | 0.500+ | 54.7% | 45.3%

Table 2: Summary of the camp variable for the original data and the imputed MCAR datasets.

Note: + The means and SEs reported here are pooled means and SEs across three chains of MI.
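
The slides do not say how the pooling across chains was done. One standard option is Rubin's rules, sketched below; the function name `pool_rubin` and the per-chain numbers are assumptions for illustration and are not taken from the table.

```python
import math

def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules.

    estimates: point estimate from each imputed dataset
    variances: squared standard error from each imputed dataset
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    q_bar = sum(estimates) / m                      # pooled point estimate
    within = sum(variances) / m                     # average within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between          # total variance
    return q_bar, math.sqrt(total)

# Hypothetical Green shares from three chains, each from n = 1078 cases.
n = 1078
chain_means = [0.47, 0.48, 0.47]
chain_vars = [p * (1 - p) / n for p in chain_means]

pooled_mean, pooled_se = pool_rubin(chain_means, chain_vars)
print(f"pooled mean: {pooled_mean:.3f}, pooled SE: {pooled_se:.4f}")
```

The between-chain term is what makes the pooled SE larger than a single chain's SE when the chains disagree.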

Page 23: We Want a Better Guess of Respondents' Preferences.

Results of the Check for MCAR (3)

Figure 3: Plots of the camp variable against other variables using the imputed MCAR data.

Page 24: We Want a Better Guess of Respondents' Preferences.
Page 25: We Want a Better Guess of Respondents' Preferences.
Page 26: We Want a Better Guess of Respondents' Preferences.

Results of the Check for MAR (1)

Page 27: We Want a Better Guess of Respondents' Preferences.

Results of the Check for MAR (2)

Dataset | Mean* | SE* | Blue | Green
Original Data | 0.488 | 0.500 | 51.1% | 48.8%
Imputed Original Data | 0.474+ | 0.500+ | 52.6% | 47.4%
#1 Imputed MAR Data | 0.477+ | 0.500+ | 50.1% | 49.9%
#2 Imputed MAR Data | 0.497+ | 0.500+ | 50.3% | 49.7%
#3 Imputed MAR Data | 0.499+ | 0.500+ | 50.1% | 49.9%

Table 3: Summary of the camp variable for the original data and the imputed MAR datasets.

Note: + The means and SEs reported here are pooled means and SEs across three chains of MI.

Page 28: We Want a Better Guess of Respondents' Preferences.

Results of the Check for MAR (3)

Page 29: We Want a Better Guess of Respondents' Preferences.
Page 30: We Want a Better Guess of Respondents' Preferences.
Page 31: We Want a Better Guess of Respondents' Preferences.

Summary: the assumption of MAR holds

Page 32: We Want a Better Guess of Respondents' Preferences.

Study 2: External Validation

1: Compare respondents' answers with MI guesses

• Figure out how well the MI prediction works.

2: Understand why the prediction does not perform well, if that is the case

• Follow-up telephone surveys with identity check

• Face-to-face, one-on-one interviews for explanations

Page 33: We Want a Better Guess of Respondents' Preferences.

Follow-up telephone surveys

1. Call-backs

• Called the 658 respondents whose political camp ID was NA to ask for their camp identification (April 13-15, 2013).

• Respondents were forced to choose between Green and Blue.

• N=143

2. Identity check

• Answers must be consistent with the first survey on two questions: (1) whether the respondent had been to mainland China in the past two years; (2) the frequency of watching political news.

Page 34: We Want a Better Guess of Respondents' Preferences.

3. In-depth interviews

• 45 out of 143 respondents were contacted for face-to-face, one-on-one interviews (incentive: cash $70).

• 5 out of 45 respondents were interviewed between April 20 and May 6, 2013.

Page 35: We Want a Better Guess of Respondents' Preferences.

Findings

ID | MI | Camp | pred.correct
5 | 0.161 | B | 1
8 | 0.400 | B | 1
9 | 0.804 | G | 1
15 | 0.728 | G | 1
24 | 0.348 | B | 1
25 | 0.259 | B | 1
36 | 0.300 | B | 1
44 | 0.574 | G | 1
58 | 0.587 | B | 0
59 | 0.730 | G | 1
69 | 0.382 | B | 1

Number of the 145 respondents whose MI values fall between .45 and .55: 19
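
The comparison behind the pred.correct column can be sketched as follows, using only the eleven rows shown on this slide. The 0.5 cutoff (scores above it read as Green, otherwise Blue) is an assumption inferred from those rows, and `predict_camp` is a hypothetical helper, not part of the authors' code.

```python
# (ID, MI score, camp chosen in the follow-up survey), copied from the slide.
rows = [
    (5, 0.161, "B"), (8, 0.400, "B"), (9, 0.804, "G"), (15, 0.728, "G"),
    (24, 0.348, "B"), (25, 0.259, "B"), (36, 0.300, "B"), (44, 0.574, "G"),
    (58, 0.587, "B"), (59, 0.730, "G"), (69, 0.382, "B"),
]

def predict_camp(score, cutoff=0.5):
    """Turn an MI score into a camp guess; the cutoff is an assumption."""
    return "G" if score > cutoff else "B"

# Count predictions that match the respondent's own answer.
hits = sum(predict_camp(score) == camp for _, score, camp in rows)

# Respondents whose scores sit in the ambivalent band used on the slide.
ambivalent = [rid for rid, score, _ in rows if 0.45 <= score <= 0.55]

print(f"correct predictions: {hits} of {len(rows)}")
print(f"IDs with scores between .45 and .55: {ambivalent}")
```

On these eleven rows, ten predictions match the follow-up answers (only ID 58 does not), and none of the eleven falls in the .45 to .55 band.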

Page 36: We Want a Better Guess of Respondents' Preferences.

Rate of Successful Prediction using MI: 70% (100/145).

Page 37: We Want a Better Guess of Respondents' Preferences.

Case #1

ID | Sex | Age | Edu | Date-Time | Place | MI | Camp ID | Causes of Inconsistency
905 | F | 34 | 3 | 2013.4.27 10:00 | Taipei Main Station, Taipei City | 0.42 (B) | G | Disappointed by the KMT's domestic policy reforms. Feels some of the reasons to keep supporting the KMT have been lost.

Page 38: We Want a Better Guess of Respondents' Preferences.

Case #2

ID | Sex | Age | Edu | Date-Time | Place | MI | Camp ID | Causes of Inconsistency
206 | M | 43 | 2 | 2013.4.30 10:00 | NSYSU campus, Kaohsiung City | 0.50 (B) | B | Disappointed by both political camps, but more concerned about the DPP than the KMT because of the DPP's ideology of seeking Taiwan independence.

Page 39: We Want a Better Guess of Respondents' Preferences.

Case #3

ID | Sex | Age | Edu | Date-Time | Place | MI | Camp ID | Causes of Inconsistency
140 | M | 29 | 3 | 2013.4.30 14:00 | Kaohsiung City | 0.54 (G) | B | Grew up with a KMT-supporting mother and has been fine with the KMT. Has come to like the DPP because of a growing Taiwanese national identification.

Page 40: We Want a Better Guess of Respondents' Preferences.

Case #4

ID | Sex | Age | Edu | Date-Time | Place | MI | Camp ID | Causes of Inconsistency
384 | M | 25 | 4 | 2013.5.4 14:00 | Taipei Main Station, Taipei City | 0.47 (B) | G | Feels cross-pressured because his parents support the KMT but his friends support the DPP. Cast his first vote for the DPP in the 2008 presidential election. Disappointed by the KMT's performance but not aware of the DPP's core ideology.

Page 41: We Want a Better Guess of Respondents' Preferences.

Case #5

ID | Sex | Age | Edu | Date-Time | Place | MI | Camp ID | Causes of Inconsistency
286 | M | 37 | 3 | 2013.5.4 16:00 | Taipei Main Station, Taipei City | 0.40 (B) | G | Feels cross-pressured because his family has supported the KMT but his wife's family supports the DPP. Disappointed with the KMT's leadership.

Page 42: We Want a Better Guess of Respondents' Preferences.

Summary

1. These “danglers” are politically aware.

2. Explanations for the ambivalence:

1. Potential blue camp supporters (ID 905, 206, 384, and 286) have become dissatisfied with the incumbent's performance and public policy.

2. Potential green camp supporters (ID 905, 206, 384, and 286) are affected by nationalism.

3. Cross-pressured respondents (ID 140, 384, and 286).

Page 43: We Want a Better Guess of Respondents' Preferences.

Conclusion

1. MI scores reflect respondents’ partisan orientation, including the level of their ambivalence. It seems reasonable to adopt this method to reconstruct the distribution of partisan orientation of the electorate.

2. Face-to-face interviews can explain the inconsistencies in respondents' answers.

Page 44: We Want a Better Guess of Respondents' Preferences.

Implications & Suggestions

• Encourage respondents to skip questions they cannot answer.

• Explore auxiliary questions related to partisanship (e.g., policy and performance questions) that seem less sensitive to respondents, and encourage respondents to answer them sincerely.

• Use these easier auxiliary variables for MI.

Page 45: We Want a Better Guess of Respondents' Preferences.

• Continue to test the reliability of this method using panel data.

• Be aware of the limits of forcing respondents to choose an answer. Alternative methods or question wordings for probing true answers about partisanship are still needed.

Page 46: We Want a Better Guess of Respondents' Preferences.

Thank you.

request for the paper, data, and/or R code

[email protected]