Towards a Fully Adjusted Census Database for the 2011 Census

Christine Sexton (ONS)

Alan Taylor (ONS)

James Brown (ADMIN @ IoE)

Outline

• Overview of the Census Coverage Assessment and Adjustment Strategy for 2011

• The 2001 Adjustment Strategy

• Learning from 2001

• Assessment of the 2001 Adjustment System

• The Way Forward

Overview of the Coverage Assessment and Adjustment Process

[Flow diagram linking the following boxes: 2011 Census, Census Coverage Survey, Matching, Estimation, Adjustment, Quality Assurance]

The 2001 Adjustment Strategy

• Stage 1: Imputation of missed households (with people)

─ Model predicted census household coverage probabilities, using matched census to CCS data, to obtain coverage weights (tenure, ethnicity, household structure)

─ Calibrate coverage weights to key variable estimates (tenure exactly)

─ Impute households with people into the database (whole household records copied)

• Stage 2: Imputation of missed individuals into counted households

─ Model predicted coverage probabilities for persons within counted census households, using matched census to CCS data (age, sex, activity, household structure, LA)

─ Calibrate coverage weights to key variable estimates at local authority level (age-sex groups exactly)

─ Impute people into census counted households (whole person records copied)

• Stage 3: Final adjustment

─ Further adjustments to meet local authority level estimates for age-sex groups and household size distributions, by taking out imputed individuals and putting in extra individuals (“pruning and grafting”)

Ref: Steele, Brown and Chambers (2001), JRSS, series A.
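
The weighting and calibration steps in Stage 1 can be illustrated with a minimal sketch. It assumes that a coverage weight is the reciprocal of the predicted coverage probability and that "tenure exactly" means scaling weights within each tenure group until they sum to the estimated total. The function names, the probabilities and the totals are hypothetical and are not taken from the 2001 system.

```python
from collections import defaultdict

def coverage_weights(probabilities):
    """Assumed form of a coverage weight: the reciprocal of the coverage probability."""
    return [1.0 / p for p in probabilities]

def calibrate_to_groups(weights, groups, targets):
    """Scale weights within each group so they sum exactly to that group's target."""
    sums = defaultdict(float)
    for w, g in zip(weights, groups):
        sums[g] += w
    return [w * targets[g] / sums[g] for w, g in zip(weights, groups)]

# Hypothetical counted households: predicted coverage probability and tenure.
probs = [0.95, 0.90, 0.80, 0.85]
tenure = ["owned", "owned", "rented", "rented"]
estimated_totals = {"owned": 2.2, "rented": 2.5}  # illustrative estimates only

weights = calibrate_to_groups(coverage_weights(probs), tenure, estimated_totals)
owned_total = sum(w for w, t in zip(weights, tenure) if t == "owned")
rented_total = sum(w for w, t in zip(weights, tenure) if t == "rented")
```

Under this reading, the amount by which a calibrated weight exceeds 1 indicates how many missed households each counted household stands in for.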

Learning From 2001

• Insufficient control of household size and characteristics for imputed households

─ Too many people in certain age-sex groups added at household imputation stage

─ Much time spent “pruning and grafting”

• Insufficient heterogeneity in the imputed population for some characteristics

─ Whole records copied to imputed households and individuals

─ Ensured Census edit rules satisfied but may not reflect variability in population

Assessing the Performance of the 2001 System

• Used simulations

– Used 2001 census extracts as the ‘true population’

– Modelled 2001 matched census and CCS data

– 10 simulated censuses and CCSs for one Estimation Area (two LAs)

– Census coverage 94%

– 200,000 households

– 490,000 persons

– Used true totals as calibration constraints: LA age-sex group totals, activity, tenure, household size
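
A minimal sketch of how one might draw repeated simulated censuses from a fixed 'true population' follows. It is an illustration only: the coverage probabilities by tenure are invented (chosen so that overall coverage is about 94%), whereas the simulations described above took their coverage patterns from modelled 2001 matched data. All names are hypothetical.

```python
import random

def simulate_census(population, coverage_by_group, rng):
    """Keep each (id, group) record with its group's coverage probability."""
    return [rec for rec in population if rng.random() < coverage_by_group[rec[1]]]

rng = random.Random(2001)  # fixed seed so the sketch is reproducible

# Hypothetical true population: three quarters "owned", one quarter "rented".
true_population = [(i, "owned" if i % 4 else "rented") for i in range(10_000)]
coverage = {"owned": 0.96, "rented": 0.88}  # illustrative probabilities only

# Ten independent simulated censuses drawn from the same true population.
simulated_censuses = [simulate_census(true_population, coverage, rng) for _ in range(10)]
coverage_rates = [len(c) / len(true_population) for c in simulated_censuses]
```

Because the true population is known, every adjusted total produced from a simulated census can be compared directly with its true value.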

Performance measures

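\[
\mathrm{RAB} = \frac{1}{N} \sum_{e=1}^{N} \frac{1}{T_e} \left[ \frac{1}{10} \sum_{i=1}^{10} \left( T_{ei}^{(\mathrm{adj})} - T_e \right) \right] \times 100
\]

\[
\mathrm{RRAMSE} = \frac{1}{N} \sum_{e=1}^{N} \frac{1}{T_e} \sqrt{ \frac{1}{10} \sum_{i=1}^{10} \left( T_{ei}^{(\mathrm{adj})} - T_e \right)^{2} } \times 100
\]

where i indexes the 10 simulations, T_e is a true total and T_{ei}^{(adj)} is the corresponding adjusted total in simulation i.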
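
The two measures can be computed as in the sketch below, under one reading of the slide's notation: for each of N units e, the relative bias and the relative root mean squared error of the adjusted totals are taken across simulations and then averaged over e. The function names and the data layout (`adjusted[e][i]`, `truth[e]`) are assumptions made for illustration.

```python
import math

def rab(adjusted, truth):
    """Relative average bias (%): mean over e of the mean error across simulations, relative to truth."""
    total = 0.0
    for estimates, t in zip(adjusted, truth):
        mean_error = sum(a - t for a in estimates) / len(estimates)
        total += mean_error / t
    return 100.0 * total / len(truth)

def rramse(adjusted, truth):
    """Relative root average MSE (%): mean over e of the root mean squared error, relative to truth."""
    total = 0.0
    for estimates, t in zip(adjusted, truth):
        mse = sum((a - t) ** 2 for a in estimates) / len(estimates)
        total += math.sqrt(mse) / t
    return 100.0 * total / len(truth)

# Two hypothetical units, two simulations each (the slides use ten).
truth = [100.0, 200.0]
adjusted = [[90.0, 110.0], [190.0, 190.0]]

bias = rab(adjusted, truth)      # errors cancel for the first unit, not the second
error = rramse(adjusted, truth)  # squared errors do not cancel
```

In the example the first unit has errors that cancel in the bias but still contribute to the RRAMSE, which is why the two measures are reported side by side.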

Relative Average Bias Results for Tenure

[Bar chart: relative average bias (%) by tenure, Census vs Adjusted; bias axis from -16 to 0]

RRAMSE Results for Tenure

[Bar chart: RRAMSE (%) by tenure, Census vs Adjusted; RRAMSE axis from 0 to 60]

Relative Average Bias Results for Males by Age

[Bar chart: relative average bias (%) by age group for males, Census vs Adjusted; bias axis from -15 to 10]

RRAMSE Results for Males by Age

[Bar chart: RRAMSE (%) by age group for males (0, 1-4, 5-9, then five-year groups up to 75-79, and 80+), Census vs Adjusted; RRAMSE axis from 0 to 30]

Relative Average Bias Results for Activity

[Bar chart: relative average bias (%) by activity, Census vs Adjusted; bias axis from -20 to 5]

RRAMSE Results for Activity

[Bar chart: RRAMSE (%) by activity, Census vs Adjusted; RRAMSE axis from 0 to 120]

The Way Forward

• Aim to improve imputation by gaining better control of numbers of individuals imputed into households and their characteristics

─ Correct distribution of age group and household size at lower levels of geography

─ Reduce time spent on final adjustment (pruning and grafting)

Modelling Missed Individuals

• In 2001 we modelled individuals missed within counted households

─ No direct control of individuals missed within missed households

• Proposed new model: all missed individuals in a single model

─ missed within counted households

─ missed within missed households

• Calibrate coverage weights for all individuals, then split weights into two components based on the model
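
One way to picture the split of a calibrated person weight is sketched below. It assumes that the part of the weight above 1 represents missed individuals and that the model supplies, for each counted person, the share of those missed individuals who belong to counted households. Both assumptions, and the names `split_weight` and `share_in_counted`, are hypothetical and not taken from the proposed system.

```python
def split_weight(weight, share_in_counted):
    """Split the part of a coverage weight above 1 into two undercount components.

    Returns (missed within counted households, missed within missed households).
    """
    missed = weight - 1.0                  # individuals represented but not counted
    in_counted = missed * share_in_counted
    in_missed = missed - in_counted
    return in_counted, in_missed

# Hypothetical person: calibrated weight 1.10, 40% of the undercount in counted households.
in_counted, in_missed = split_weight(1.10, 0.4)
```

The first component would feed imputation into counted households, and the second would give the person totals that imputed households need to supply.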

Reverse the order of imputation

• In 2001 household imputation was carried out first

─ Within-household imputation used to make up shortfall

─ Household weights did not match individual totals

─ Imputed households did not contain correct types of individuals

Reverse the order of imputation

• New person model gives direct control over split between two sources of undercount

─ Can put missed individuals into counted households first to complete counted households

• Then model census household coverage

• Calibrate household weights to key variables at EA level: tenure and household size

• Also calibrate household weights, at LA level, to key individual-level variables (age-sex groups) taken from the totals of persons in missed households, to recover totals at the individual level
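
Calibrating one set of household weights to household-level and person-level totals at the same time can be sketched with linear (chi-square distance) calibration. This is a stand-in technique chosen for brevity, since the slides do not name the calibration method. Each household row holds tenure indicators followed by counts of persons by sex; all data, totals and function names are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def linear_calibrate(d, X, totals):
    """Adjust initial weights d so that the weighted column sums of X equal totals."""
    k = len(totals)
    # Weighted cross-product matrix: sum over households of d_h * x_h x_h'.
    M = [[sum(dh * xh[a] * xh[c] for dh, xh in zip(d, X)) for c in range(k)]
         for a in range(k)]
    # Shortfall between each target total and the current weighted total.
    gap = [totals[a] - sum(dh * xh[a] for dh, xh in zip(d, X)) for a in range(k)]
    lam = solve(M, gap)
    return [dh * (1.0 + sum(l * x for l, x in zip(lam, xh))) for dh, xh in zip(d, X)]

# Columns: owned, rented, number of males, number of females (hypothetical households).
X = [[1, 0, 1, 1], [1, 0, 2, 0], [0, 1, 0, 1],
     [0, 1, 1, 2], [1, 0, 0, 1], [0, 1, 1, 0]]
d = [1.05, 1.10, 1.20, 1.15, 1.05, 1.25]  # initial household coverage weights
totals = [3.3, 3.6, 5.6, 5.5]             # illustrative household and person targets

w = linear_calibrate(d, X, totals)
achieved = [sum(wh * xh[a] for wh, xh in zip(w, X)) for a in range(len(totals))]
```

Linear calibration meets every target exactly but does not prevent a weight from falling below 1, so a production system would need bounds or a different distance function.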

Conclusions

• By implementing the proposed changes we aim to improve on the 2001 system by gaining better control of the age-sex by household size distribution of the adjusted database and reducing the need for the final stage adjustment

• Analysis of the 2001 method gives us a benchmark against which to compare changes

• Work in progress

Questions?

