1
The Total Survey Error Framework and Survey Quality Controls
in the Data Harmonization Process
Marta Kołczyńska The Ohio State University & Polish Academy of Sciences
Kazimierz M. Slomczynski The Ohio State University & Polish Academy of Sciences
2015 International Total Survey Error Conference
Baltimore, MD, 21 September 2015
2
Outline
1. About the Harmonization Project
2. Target Variables
3. General Schema for Quality Controls
4. Data Structure
5. Data Quality Controls
6. Harmonization Controls
7. Conclusion
3
1. Harmonization Project
Democratic Values and Protest Behavior: Data Harmonization, Measurement Comparability, and Multi-Level Modeling
dataharmonization.org
Funding: Polish National Science Centre (2012/06/M/HS6/00322)
Research Team: Kazimierz M. Slomczynski, J. Craig Jenkins, Irina Tomescu-Dubrow,
Joshua K. Dubrow, Przemek Powałko, Olena Oleksiyenko, Ilona Wysmułek,
Marta Kołczyńska, Marcin W. Zieliński, Matthew Schoene
Institutional support: Cross-national Studies, Interdisciplinary Research and Training Program - CONSIRT, Polish Academy of Sciences and The Ohio State University
consirt.osu.edu
4
Criteria for selecting survey projects
- contain questions about political attitudes and behavior;
- designed as cross-national and, preferably, multi-wave;
- with samples intended to be representative of the adult population of a given country or territory;
- non-commercial;
- freely available in the public domain;
- with documentation (description, codebook, questionnaire) in English.
5
[Table: survey projects by abbreviation, time span, waves, files, data sets, and cases — rows not preserved in this transcript]
5. Data Quality Controls 5.1. Coverage and Sampling (ISSP 2011 SI)
Study description (extended)
Target population: adult residents of Slovenia, older than 18 years, living at a permanent address. Excluded: institutionalised people.
Sampling frame: Central Register of Population (a list of names and addresses continually updated by the public administration).
Sampling procedure: two-stage stratified random sample from the Central Register of Population, in which every population unit has an equal probability of selection.
First stage: PSUs are selected with probability proportional to the size of CEAs (Clusters of Enumeration Areas) (150 PSUs). CEAs are stratified by 12 regions × 6 types of settlement.
Second stage: systematic random selection within each CEA yields a fixed number of persons (150 × 24) with name and address. Split-half samples were used for parallel SJM surveys (2 × 1,800).
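The claim that every population unit has an equal probability of selection follows from combining PPS selection of PSUs with a fixed take per PSU: the PSU size cancels out. A minimal sketch, with hypothetical PSU sizes and population total:

```python
import math

# Why PPS first-stage selection plus a fixed second-stage take of
# 24 persons per PSU is self-weighting: the PSU size cancels out.
n_psus, take = 150, 24

def inclusion_prob(psu_size, total_pop):
    p_first = n_psus * psu_size / total_pop  # PPS: proportional to size
    p_second = take / psu_size               # fixed take within the PSU
    return p_first * p_second                # = n_psus * take / total_pop

total = 1_700_000  # hypothetical adult population size
small, large = inclusion_prob(500, total), inclusion_prob(2000, total)
print(math.isclose(small, large))  # True: size cancels out
```

Stratification by region and settlement type does not change this arithmetic; it only controls the composition of the selected PSUs.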
12
5.1 Coverage and Sampling (ISSP 2011 US)
Study description (simplified)
Multi-stage area probability sample.
13
5.2 Non-response
1 – documentation contains information about the response rate achieved, or information sufficient to compute the response rate.
0 – otherwise.
After some deliberation, we decided not to include the actual response rate value, due to the frequent lack of sufficient information about how the response rate was defined and calculated in a particular survey, as well as about the sampling scheme. Some illustrations of these ambiguities follow:
14
5.2 Response Rate: Definition
Response Rate = full interviews / (full & partial interviews + non-interviews (refusals + break-offs + non-contacts + others) + all cases of unknown eligibility).
4 more definitions of response rates.
4 definitions of cooperation rates.
3 definitions of refusal rates.
3 definitions of contact rates.
Source: Standard Definitions report (7th edition, 2011), aapor.org.
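The formula above is AAPOR's RR1, the most conservative of the response rate definitions. It can be sketched as a short function; the disposition counts in the example are hypothetical, not drawn from any survey in the Project:

```python
def response_rate_1(full, partial, refusals, break_offs,
                    non_contacts, others, unknown_eligibility):
    """AAPOR RR1: completed interviews divided by all interviews,
    non-interviews, and cases of unknown eligibility."""
    denominator = (full + partial
                   + refusals + break_offs + non_contacts + others
                   + unknown_eligibility)
    return full / denominator

# Hypothetical disposition counts, for illustration only.
rr1 = response_rate_1(full=1200, partial=50, refusals=180,
                      break_offs=20, non_contacts=90, others=10,
                      unknown_eligibility=50)
print(round(rr1, 2))  # 0.75
```

Dropping the unknown-eligibility cases from the denominator, or counting partials as responses, yields the higher alternative rates listed above, which is exactly why a bare "response rate" figure is ambiguous without its definition.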
15
5.2 Response Rate: Definition
“For non-probability samples, response rate calculations make little sense, given the broader inferential concerns. Further, for many of these surveys, the denominator is unknown, making the calculation of response rates impossible”
Source: Standard Definitions report (7th edition, 2011), p. 32, aapor.org
16
5.2 Response Rate (WVS 2005 CY)
17
5.2 Response Rate (WVS 2005 CY)
1200 / 1265 = 0.95
1050 / 1265 = 0.83
18
5.2 Response Rate (ISSP 2010 IL)
Study description
"Interviews: 1023
These figures pertain to interviews in Jewish and Mixed (Jewish-Arab) communities. In the case of additional 193 interviews conducted in small Arab communities there was no sampling list and we have no information on response rates"
19
5.3 Translation method
1 – documentation contains information about the method of questionnaire translation (any documented method more sophisticated than translation by the survey team).
0 – otherwise; includes:
- documentation includes information that no translation method was used.
- documentation does not include information about translation method at all.
20
5.3 Translation (ISSP 2011)
Methods report
21
5.4 Pretesting
1 – documentation contains information about pretesting/piloting.
0 – otherwise; includes:
- documentation contains information about no pretesting having been carried out;
- documentation does not contain information about pretesting.
22
5.5 Fieldwork Control
1 – documentation contains information about fieldwork control/backchecking
0 – otherwise; includes:
- documentation contains information about no fieldwork control having been carried out.
- documentation does not contain information about fieldwork control.
Worst (score 0): Bolivia, Costa Rica, Honduras, Nicaragua, Panama, Paraguay, Guatemala, El Salvador
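The 0/1 controls described in sections 5.2–5.5 lend themselves to a simple additive index. The sketch below assumes four equally weighted binary indicators; the indicator set and weighting are illustrative assumptions, not necessarily the Project's exact scoring:

```python
# Illustrative additive quality index built from binary
# documentation indicators; the indicator names and equal
# weighting are assumptions for this sketch.
INDICATORS = ["response_rate_info", "translation_method",
              "pretesting", "fieldwork_control"]

def quality_index(survey):
    """Sum of 0/1 indicators; higher means better documentation."""
    return sum(survey[k] for k in INDICATORS)

survey = {"response_rate_info": 1, "translation_method": 0,
          "pretesting": 1, "fieldwork_control": 1}
print(quality_index(survey))  # 3
```

Under such a scheme, a survey whose documentation reports none of the controls scores 0, which is how the worst cases above arise.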
27
Quality Index over time
28
6. Harmonization Controls Example: Trust in parliament
18 projects
137 countries/territories
1313 surveys
1.7 million individuals (unweighted)
29
Example: Trust in parliament
Wording, meaning of "trust"
Response options/scale (scale length, direction)
Context of the question (position in questionnaire)
30
Wording
I would like to ask you a question about how much trust you have in certain institutions. For each of the following institutions, please tell me if you tend to trust it or tend not to trust it. (EB 77.3)
In order to get ahead, people need to have confidence and to feel that they can trust themselves and others. To what degree do you think that you trust the following totally, to a certain point, little, or not at all? (CDCEE 2)
Please look at this card and tell me, for each item listed, how much confidence you have in them, is it a great deal, quite a lot, not very much or none at all? (EVS 4)
31
Wording
English – trust (ESS) vs. confidence (EVS): synonyms with subtle differences
Albania – the same (besim)
Belgium (Dutch) – the same (vertrouwen)
Belgium (French) – the same (confiance)
Bulgaria – the same (доверие)
Croatia – the same (povjerenje)
Czech Republic – the same, noun and verb (důvěra / důvěřovat)
Denmark – the same (tillid)
Estonia – the same (usaldate)
Poland – the same (zaufanie)
32
Response scale: Length and direction
Length of scale – projects, by direction of scale:
11 points – CNEP, ESS
10 points – EQLS
7 points – AMB, NBB (waves 5, 6)
5 points – ISSP, VPCPEE, CB, LITS
4 points – traditional (descending): ARB, ASB, ASES, CDCEE, EVS, LB, NBB (waves 1, 3), WVS; reversed (ascending): AFB
2 points – EB
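Bringing these items to a common metric requires adjusting for both scale length and direction. A minimal sketch, assuming linear rescaling onto [0, 1] with 1 = most trust; the Project's actual harmonization procedure may differ:

```python
def rescale(value, low, high, ascending=True):
    """Linearly map a response coded low..high onto [0, 1],
    with 1 = most trust. Use ascending=False for traditional
    descending scales (1 = a great deal ... high = none at all)."""
    x = (value - low) / (high - low)
    return x if ascending else 1 - x

# 'A great deal' (1) on a traditional 4-point scale:
print(rescale(1, 1, 4, ascending=False))  # 1.0
# 7 on an ascending 0..10 scale (10 = complete trust):
print(rescale(7, 0, 10))                  # 0.7
```

Linear rescaling handles length and direction but not wording or question-context effects, which is why those are examined separately above.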
33
Position in questionnaire
- Based on master/core questionnaire for each wave
Range: 6 (ARB 1) – 320 (EVS 1)
Mean: 87.5
Quartiles: 23; 62; 136
34
Example: Trust in parliament
Wording, meaning of "trust" – stable within project
Response options/scale (scale length, direction) – stable within project
Context of the question (position in questionnaire) – stable within wave
35
7. Conclusions
- Surveys vary greatly with regard to data and documentation quality and methodology, even within waves of the same survey project
- Joint analysis of data from different surveys requires quality and harmonization controls to account for these differences