1
Lecture 9 of 47C5 Social Research Process I:
Using Secondary Datasets
Paul Lambert, 8.10.03, 9-10am
Jan 13, 2016
2
47C5: Survey research lectures
Lecture 8: The Survey Method
Intro. to & qualities of survey method
Lecture 9: Using Secondary Datasets
Data access and issues
Lectures 11/12: Sampling
Sample design, data collection / analysis
3
Resources for lectures 8,9,11,12
• Lecture slides on WebCT site
• 2 Reading lists:
  – Initial list in 47C5 unit outlines
  – Some additions on further list at WebCT site
Also: http://staff.stir.ac.uk/paul.lambert/teaching.htm
4
L9: Using Secondary Datasets
1) Introduction and background
2) Accessing secondary datasets
3) Qualities of secondary datasets
4) Data analysis / management issues
5) Key variables in survey research
5
1) Introduction and Background
• Vast quantity of surveys conducted
• An efficient step would be to analyse existing data (secondary) rather than personally collect your own (primary)
• Data archives collate survey datasets and supply them for secondary analysis
6
Large scale data
Lecture 8: Modern social survey analysis is most often either large scale secondary or small scale primary
• Several assets of large scale surveys:
  – Generalise
  – Multivariate (more variables and more cases)
7
Large surveys’ high expenses:
• Government funds many large surveys
(also EU; LAs; charities; commercial)
• Often made available freely or at low cost
An ideal research tool (see ESRC):
  – Quick to access
  – Methodological rigour
  – Falsifiable: others can access also
8
Secondary analysis of surveys
• Makes particular sense when large scale datasets are desirable
• Also often applies to smaller surveys
• Involves particular issues of data analysis, management and interpretation
• …Is a highly marketable skill!
9
2) Accessing Secondary datasets
• Internet and computing developments have revolutionised delivery of data resources
• Three steps to data access:
  1. Find out survey details / documentation
  2. Apply for access from archive or collectors
  3. Obtain and analyse the data
10
2.1) Finding Details
• The modern way: Internet search, e.g. UK Data Archive, UK Question Bank, many others (reading list)
• The old fashioned way: Look out for research reports using datasets and contact authors / data collectors directly
11
The UK data archive www.data-archive.ac.uk
• ESRC Efforts to encourage usage
• ‘Athens’ authentication
• Survey descriptions and lists of research
• Variable lists
• ‘NESSTAR’ to browse data
• Links to more sources for secondary data
12
2.2) Applying for access
• The modern way: Email / webpage forms, agree to conditions
of access (anonymised data to reduce ‘disclosure risk’)
• The old fashioned way: Personal contacts and requests to original
data collectors
13
2.3) Obtaining / analysing data
• The modern way: Download data from supplier (usually compressed
and portable format), use with documentation and variable lists in data analysis package (eg SPSS)
• The old fashioned way: A plain text computer file on disk, and copy of
original questionnaire, arrive by post: good luck!
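A plain text data file of the "old fashioned" kind is typically read by slicing each line at the column positions given in the documentation. The sketch below is only an illustration of that idea: the variable names, column positions and the two records are all made up, and a real layout would have to be taken from the survey's own codebook.

```python
# Hypothetical codebook: variable name -> (start column, end column),
# 1-indexed and inclusive, as they would be read off the documentation.
LAYOUT = {
    "person_id": (1, 4),
    "age": (5, 6),
    "sex": (7, 7),
}


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-width line into a dict of named string fields."""
    return {
        name: line[start - 1:end].strip()
        for name, (start, end) in layout.items()
    }


# Two made-up records standing in for lines of the plain text data file.
raw_lines = ["0001342", "0002 71"]
records = [parse_record(line) for line in raw_lines]
print(records)
```

Values are kept as strings at this stage, so that any special codes in the documentation can be checked before converting to numbers.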
14
3) Qualities of Secondary data
• Efficient: cheap & quick to access / analyse
• Scale of data larger than most can afford
• Methodological rigour of major suppliers:
  – Sampling
  – Questionnaire and variable design
  – Trained interviewers and data entry
• Falsifiable nature of analysis
15
Some drawbacks
• Distance from data collection
  – Harder to assess reliability / validity
  – Many variables already pre-coded
  – Can’t change / add anything in study
• Time delays in access to results
• Data analysis / management complex
• May be bracketed with survey originators
16
Analytical possibilities vary by survey data type
One division: micro-social vs macro-social (most social survey analysis uses the former)
• Macro-social data
  – Government statistics (www.statistics.gov.uk)
  – Cross-national statistics (UN, OECD)
  – Macro-economic time series (trends / forecasts)
  – Beware: many critiques of ‘official statistics’
17
Types of micro-social data
• Censuses
  – General overview of whole population
  – Disclosure risk issues
• Cross-sectional surveys
  – Most widely used sources
  – Huge range of topic coverage
  – May be used to study small / rare populations
18
..more types of micro-social data
• Longitudinal datasets
  – Repeated cross-sections
  – Panel datasets
  – Cohort studies
  – Retrospective studies
  – Strengths: understand process and causality
  – Problems: sampling and attrition; complexity
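Attrition in a panel dataset can be made concrete by counting how many of the original respondents are still present at each later wave. The following is a minimal sketch under invented data: the respondent ids and the three waves are hypothetical, and real panel studies usually supply their own response-status variables for this purpose.

```python
# Made-up respondent ids observed at each wave of a hypothetical panel.
waves = {
    1: {"a", "b", "c", "d", "e"},
    2: {"a", "b", "c", "d"},
    3: {"a", "c", "d"},
}


def retention_rates(waves):
    """Share of first-wave respondents still present at each wave."""
    baseline = waves[min(waves)]
    return {
        wave: len(ids & baseline) / len(baseline)
        for wave, ids in sorted(waves.items())
    }


rates = retention_rates(waves)
# Attrition is the complement of retention at each wave.
attrition = {wave: 1 - rate for wave, rate in rates.items()}
print(rates)
```

If the respondents who drop out differ systematically from those who stay, later waves no longer represent the original sample, which is why attrition is listed as a sampling problem.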
19
..more types of micro-social data
• Cross-nationally comparative datasets
  – Focussed surveys (IPUMS censuses; ISSP; World Values Survey; European Social Survey)
  – Longitudinal studies (LIS; ECHP; CHER)
  – Many analytical attractions, but issues of comparable analysis are complex
20
Some major UK social surveys
• Cross-sectional:
  – OPCS Census
  – Labour Force Survey
  – New Earnings Survey
  – Family Expenditure Survey
  – General Household Survey
  – British Crime Survey
  – British Social Attitudes
  – British Election Studies
  – Policy Studies (Ethnicity)
  – Social Mobility enquiries
21
Some major UK social surveys
• Longitudinal:
  – OPCS Census Longitudinal Study
  – Labour Force Surveys (repeated cross-section)
  – British Household Panel (Scottish, Welsh, NI extensions)
  – Cohort studies: 1946, 1958, 1970, 2001, YCS
  – British election panel studies
22
4) Data analysis & management
• ..become core skills in using secondary surveys…
• Software packages – SPSS, SAS, STATA, .. – with wide capabilities
• Good and bad practice – should only do sensible things with data… (see 47C6)
23
Data Analysis
• Good practice
  – Reflects properties of variables
  – Describes output in appropriate context
• Bad practice (..is widespread)
  – Forcing data into style of analysis
  – Attributing false properties to data
  – Over-zealous conclusions
24
Data Analysis
Assessing appropriateness of data analysis techniques is inherent to assessing survey research findings (need to learn about statistics and analysis..)
• Misuse of secondary data analysis is common: too easy to get data & run (bad) analyses
• Primary theme: must remember social context and theories throughout analysis
25
Data management
• Matching data files
• Coding / transforming variables
• Dealing with ‘missing’ data
Secondary dataset management tends to be:
• More complex
• More error prone
• Subject to external scrutiny
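The three data management tasks listed above can be sketched in a few lines. Everything here is an assumption for illustration: the person and household records, the variable names, the missing-value codes (-9, -8) and the income cut point are invented, and a real dataset's documentation would define its own.

```python
# Made-up person-level and household-level files sharing a household id.
persons = [
    {"pid": 1, "hid": 10, "income": 1200},
    {"pid": 2, "hid": 10, "income": -9},  # -9 assumed to mean "refused"
    {"pid": 3, "hid": 11, "income": 800},
]
households = {10: {"region": "Scotland"}, 11: {"region": "Wales"}}

MISSING_CODES = {-9, -8}  # assumed codes; real ones come from the documentation


def prepare(person, households):
    """Match one person to their household, then clean and recode income."""
    row = dict(person)
    # Matching data files: attach household variables via the shared id.
    row.update(households.get(row["hid"], {}))
    # Dealing with 'missing' data: turn numeric missing codes into None.
    if row["income"] in MISSING_CODES:
        row["income"] = None
    # Coding / transforming variables: derive a two-category income band.
    if row["income"] is None:
        row["income_band"] = None
    else:
        row["income_band"] = "low" if row["income"] < 1000 else "high"
    return row


analysis_file = [prepare(p, households) for p in persons]
for row in analysis_file:
    print(row)
```

Leaving a code such as -9 in place would silently drag down any average of income, which is one way secondary dataset management becomes error prone.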
26
5) Key variables in social investigations
• Variable operationalisation key to surveys
• Choices: - in initial data collection
- in data recoding / analytic treatment
• In secondary analysis, researcher can only influence latter
• Here: Comment on some widely used variables (cf Burgess 1986, others)
27
Age and gender:
• Age
  – Linear or grouping or quadratic.. - which has most social significance?
  – Age / Period / Cohort confusion
• Gender:
  – Deceptively simple, politically sensitive
  – Concepts of sexuality; masculinity
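The alternative treatments of age can be written out as derived variables. This is a sketch only: the band cut points and function names are assumptions, chosen for illustration rather than taken from any survey. The second function shows why age, period and cohort are easily confused: in a single survey year, birth cohort is fixed once age is known.

```python
def age_terms(age):
    """Return three alternative codings of a respondent's age in years."""
    # Grouping: illustrative bands only; the cut points are an assumption.
    if age < 30:
        band = "under 30"
    elif age < 60:
        band = "30-59"
    else:
        band = "60+"
    return {
        "linear": age,          # age entered as a single continuous term
        "quadratic": age ** 2,  # used alongside the linear term to allow a curve
        "group": band,
    }


def birth_year(survey_year, age):
    """Approximate birth cohort implied by survey year (period) and age."""
    return survey_year - age


example = age_terms(45)
cohort = birth_year(2003, 45)
print(example, cohort)
```

Which coding is appropriate depends on the social process being studied, not on which is most convenient to compute.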
28
Education and occupation:
• Education
  – Changing ‘levels’ of education over time
  – Education as proxy for ability, intelligence?
• Occupation
  – Contested meanings of labour market status
  – Occupational indicators of stratification
  – Occupational gender segregation
29
Ethnicity and Health:
• Ethnicity
  – Existence of groups or racist language?
  – Identity vs nationality vs religion vs ..
• Health
  – Subjective nature of self-reports
  – Changing terminology and social stigmas
30
Income and crime
• Income
  – High non-response and recording errors
  – Current income = general well-being?
• Crime
  – Most crimes not reported
  – Categories of crimes arbitrary / debated / changing
31
Key variables: summary
• Methods guidelines on appropriate handling: ‘Harmonised concepts and questions’; textbooks; papers / debates on specific issues
• Choices / approximations always used
• Research reports and methods appendices must explain and justify position taken
32
Summary: Secondary datasets
• Rich resource for survey analysis
• Issues and problems in use – but benefits outweigh disadvantages
• To understand, best tactic is to read social science research reports based on relevant secondary datasets