Introduction to the key large-scale government surveys
Vanessa Higgins, Jo Wathan and Reza Afkhami
ESDS Government
Centre for Census and Survey Research (CCSR)
University of Manchester
Mar 28, 2015
Our mission…
• Intro to Economic and Social Data Service (ESDS) and how we can help you.
• What data is available, how it can be used in research, accompanying documentation
• Hierarchical nature of the data
• Registration, access, support services/resources
• STATA demo
ESDS Government
• One of four specialist services of ESDS. ESDS is a new national data service (since Jan 03)
  – ESDS Government
  – ESDS Longitudinal
  – ESDS Qualidata
  – ESDS International
• ESDS Government provides access and user support for key large-scale government surveys such as the Labour Force Survey and the General Household Survey
• Access remains via the UKDA
ESDS Government - some of the things we do!
• Helpdesk
• Survey pages incl. how to get started
• Online guides – SPSS, STATA, Weighting, Employment Research, Health Research
• User Group seminars (data users and data creators)
• Publications Database
• Derived variables
  – consistent over time
  – consistent with Census
• Teaching datasets
• Training
http://www.esds.ac.uk/government
Which surveys?
UK or GB surveys
• General Household Survey
• Labour Force Survey
• Family Resources Survey
• Expenditure and Food Survey (previously the National Food Survey and Family Expenditure Survey)
• ONS Omnibus Survey
• National Travel Survey
• Time Use Survey
• British Crime Survey/Scottish Crime Survey
• British Social Attitudes/Scottish Social Attitudes/Northern Ireland Life & Times/Young People’s Social Attitudes
• Health Survey for England/Wales/Scotland
• Survey of English Housing (England only)
Microdata
QUALITY OF DATA (1)
• Two main data collectors:
  – Office for National Statistics (ONS)
  – NatCen
• Both have considerable experience
  – ONS Social Surveys started in 1941
  – NatCen founded in 1969 (as SCPR)
• Permanent panels of highly trained field interviewers
• Management and Quality Checking
• (Relatively) high response rates, but falling
• Widespread use by secondary analysts
QUALITY OF DATA (2)
Example of GHS data collection
STAGE/PROCESS
PRE-FIELDWORK
• Commission
• Sampling/allocation
• Questionnaire development
• Survey documentation
• Pilot study
• Interviewer training
FIELDWORK
• Fieldwork
POST-FIELDWORK
• Data processing
• Analysing and reporting
• Publication
• Archiving/review etc.
What would you use the data for?
• Straightforward secondary analysis
  – To assess theoretical accounts
  – To quantify characteristics or behaviours
  – To challenge official views
  – To apply alternative definitions
• Context to your own primary research
  – Your research could be quantitative or qualitative
  – To assess the national context of an area study
  – To assess whether your sample is typical
  – To assess the scale of behaviours
Practical research uses of the data
• Looking at change over time
• Looking at sub-populations
• Using the flexibility of the data to look at alternative definitions
• Looking within households
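A minimal sketch of what "looking within households" can mean in practice, assuming the data arrive as person-level records that carry a household identifier. The records, the field names (`hh_id`, `age`, `employed`) and the age-16 adult threshold are all invented for illustration and are not taken from any of the surveys.

```python
from collections import defaultdict

# Invented person-level records, for illustration only (not survey data).
# "hh_id" links each person to a household; all field names are hypothetical.
persons = [
    {"hh_id": 1, "person": 1, "age": 42, "employed": True},
    {"hh_id": 1, "person": 2, "age": 40, "employed": False},
    {"hh_id": 1, "person": 3, "age": 9, "employed": False},
    {"hh_id": 2, "person": 1, "age": 67, "employed": False},
    {"hh_id": 3, "person": 1, "age": 30, "employed": True},
    {"hh_id": 3, "person": 2, "age": 31, "employed": True},
]

ADULT_AGE = 16  # assumed threshold for this sketch


def household_summary(person_records):
    """Group person records by household and derive household-level variables."""
    groups = defaultdict(list)
    for rec in person_records:
        groups[rec["hh_id"]].append(rec)

    summary = {}
    for hh_id, members in groups.items():
        adults = [m for m in members if m["age"] >= ADULT_AGE]
        summary[hh_id] = {
            "size": len(members),
            "n_adults": len(adults),
            "n_children": len(members) - len(adults),
            "n_employed_adults": sum(1 for m in adults if m["employed"]),
        }
    return summary


households = household_summary(persons)

# Households in which no adult is employed, a household-level characteristic
# that cannot be read off any single person record.
workless = [hh for hh, s in households.items() if s["n_employed_adults"] == 0]
print(households)
print("Households with no employed adult:", workless)
```

The same idea works in the other direction: a household-level variable can be attached back to each person record to study individuals in the context of the household they live in.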
Change over time
Secondary analysis: change over time among sub-populations
Marmot, M (2003)
[Figure: Smoking and social class, men. Percentage (y-axis, 0 to 45) by year (x-axis, 1994 to 2001) for three series: all, social classes I & II, social classes IV & V. Source: HSE]
Using successive cross-sectional data over time
Pros…
• Reasonable amount of comparability
• Can pool years/quarters
• Data is representative at each time point
• Good at looking at impacts on groups
Cons…
• Limits to continuity in the data (e.g. ethnic group)
• Cannot establish individual change
Looking at small populations
• Only the Samples of Anonymised Records have larger sample sizes
• Many surveys have 10,000+ respondents
  – Permits minority groups to be represented
  – For rare subpopulations the sample size may be too small… can consider combining years if appropriate
Survey data is subject to sampling error!
Example: Pregnancy and Employment
• Using 1998-99 General Household Survey data alone, there are only 168 pregnant women aged 16-49
• 95% confidence interval for the percentage of pregnant women who are economically inactive: 34.2% to 49.1%
• Combined 3 years’ data to obtain a sample of 465 pregnant women
• 95% confidence interval using 3 years’ data: 34.9% to 43.9%
Combining datasets to increase sample size
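The narrowing of the interval can be roughly reproduced with the textbook normal-approximation formula for a proportion. This is only a sketch under stated assumptions: the point estimates are not given on the slide, so they are taken here as the midpoints of the reported intervals, and the simple-random-sampling formula ignores the clustering and weighting of the actual survey design, so small differences from the reported figures are expected.

```python
import math

Z_95 = 1.96  # normal critical value for a two-sided 95% interval


def proportion_ci(p, n, z=Z_95):
    """Normal-approximation (Wald) interval for a proportion p with sample size n.

    Assumes simple random sampling; a complex survey design would normally
    need a design-based standard error instead.
    """
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# ASSUMPTION: point estimates are the midpoints of the intervals on the slide.
p_one_year = (0.342 + 0.491) / 2
p_three_years = (0.349 + 0.439) / 2

lo1, hi1 = proportion_ci(p_one_year, 168)     # single year, n = 168
lo3, hi3 = proportion_ci(p_three_years, 465)  # three years pooled, n = 465

print(f"n=168: {100 * lo1:.1f}% to {100 * hi1:.1f}%")
print(f"n=465: {100 * lo3:.1f}% to {100 * hi3:.1f}%")
print(f"width ratio: {(hi3 - lo3) / (hi1 - lo1):.2f}")
```

Because the standard error shrinks with the square root of the sample size, going from 168 to 465 cases cuts the interval width to roughly 60% of its single-year value, not to a third.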
Using the flexibility of the data to look at alternative definitions
What are ‘hours worked’?
• Is it just paid work? Or unpaid as well?
• Hours usually worked, or actually worked last week?
• In main job, or in any job?
• What about students?
• Overtime – paid?
• Overtime – unpaid?
• Lunch hours?
• Do non-workers work zero hours or should they be excluded?
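The questions above can be made concrete by computing the same summary under several definitions. The sketch below uses a handful of invented records; the field names (`usual_main`, `second_job`, `paid_overtime`, `unpaid_overtime`) are hypothetical and do not correspond to actual survey variable names.

```python
from statistics import mean

# Invented records, for illustration only (not survey data).
# All field names are hypothetical.
people = [
    {"works": True, "usual_main": 37.0, "second_job": 0.0,
     "paid_overtime": 3.0, "unpaid_overtime": 0.0},
    {"works": True, "usual_main": 20.0, "second_job": 8.0,
     "paid_overtime": 0.0, "unpaid_overtime": 0.0},
    {"works": True, "usual_main": 40.0, "second_job": 0.0,
     "paid_overtime": 0.0, "unpaid_overtime": 6.0},
    {"works": False, "usual_main": 0.0, "second_job": 0.0,
     "paid_overtime": 0.0, "unpaid_overtime": 0.0},
]

# Three alternative definitions of "hours worked" for someone in work.
definitions = {
    "usual hours, main job": lambda r: r["usual_main"],
    "usual hours, all jobs": lambda r: r["usual_main"] + r["second_job"],
    "main job incl. paid and unpaid overtime": lambda r: (
        r["usual_main"] + r["paid_overtime"] + r["unpaid_overtime"]
    ),
}


def mean_hours(records, hours_fn, include_non_workers):
    """Mean hours under one definition.

    Non-workers are either counted as zero hours or dropped entirely,
    depending on include_non_workers.
    """
    values = []
    for r in records:
        if r["works"]:
            values.append(hours_fn(r))
        elif include_non_workers:
            values.append(0.0)
    return mean(values)


for label, fn in definitions.items():
    workers_only = mean_hours(people, fn, include_non_workers=False)
    everyone = mean_hours(people, fn, include_non_workers=True)
    print(f"{label}: workers only {workers_only:.2f}, everyone {everyone:.2f}")
```

Even on four records the answer moves by several hours depending on the definition and on whether non-workers are counted as zero, which is why the definition should be stated alongside any published figure.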
Choosing a survey for research
• Which surveys cover your main topic?
• Which other topics are you interested in?
• Measurement over time
• Geography
• Respondents – whole household, children?
• Sample size
Using the data in teaching
• Methods courses
  – Using the data in a hands-on manner
  – Using substantive exemplars to demonstrate a methodological point
  – Using the surveys as methodological exemplars
• Substantive courses
  – Making your point using data
  – Integrating methods into substantive courses
• Teaching datasets
  – General Household Survey
  – Labour Force Survey
  – British Crime Survey
  – Health Survey for England
QUALITY OF DOCUMENTATION
• Questionnaire
• Code book of Variables
• Description of Derived Variables
• Definitions
• Methodology including
  – Sampling method
  – Response achieved
  – Population base
• Published reports
Health Survey for England documentation
Documentation - GHS Questionnaire
Continuous Population Survey
• Objectives
• Survey Aspects
• Benefits
• Risks
• Development programme
Objectives
• Develop a world-class modular survey system
• Provide more coherent, better quality information
• Increase the precision of most existing statistical outputs
• Create a range of new outputs
• Build a survey system with the flexibility to accommodate other surveys
• Maintain continuity
• Deliver further efficiency savings
Development programme
• October 2004: First formal consultation
• Spring 2005: First feasibility trial
• Summer 2005: Second feasibility trial
• Late 2005: Pilot
• Mid 2006: Parallel run
• January to March 2007: Second consultation
• April 2007: Decision
• January 2008: Start date
Survey Aspects
• Integrates LFS, GHS, EFS, OMN, APS into a single survey
  – Annual achieved sample size = 265,000 households (over half a million adults) GB
  – Common core module of questions to whole sample
  – Topic modules to portions of sample
  – Core and topic modules combined to construct a number of different viable interview combinations
Benefits
• A very large annual sample for core module variables
• An improved, ‘unclustered’ sample design
• Better representation at local authority district level
• Improved weighting methodology
• Greater coherence in official statistics, fewer ‘competing estimates’ between surveys
Risks
• Should the CORD or SCMS projects suffer significant delays in delivery, the CPS timetable would be affected.
• To achieve a cost-effective CPS, interviewers would need to cover a larger range of topics. New e-based learning methodologies and techniques must be devised and validated in field trials.
• Equally, the proposed new fieldwork design and modular questionnaire would need to be tested in practice to ensure they are acceptable to respondents and interviewers, while also demonstrating that outputs could be delivered within the necessary timescales and to the required quality.
• Any decision to implement the CPS rests on a demonstrable ability to deliver the benefits anticipated while maintaining the integrity of key time series.