Register-Based Census 2011 in Slovenia – Some Quality Aspects Danilo Dolenc Statistical Office of the Republic of Slovenia UNECE-Eurostat Expert Group Meeting on Censuses Using Registers, Geneva, 22-23 May 2012
Basic facts (1)
- Fully register-based census using data from
  - Administrative registers (5)
  - Statistical register (1)
  - Administrative databases (7)
  - Statistical surveys (full coverage) (6)
- Organized as a project
  - Started in 2009
  - Project team consists of 6 employees
  - No permanent staff
  - No outsourcing
- No budget (savings of around EUR 14 million)
  - Census considered a regular statistical survey
Basic facts (2)
- Reference date 1 January 2011
- Three (four) stages of the statistical process, following the availability of sources, each concluded by dissemination of data:
  - Basic demographic data (30 April 2011)
    - Produced quarterly
  - Households and families (30 June 2011)
  - Other population topics (30 December 2011)
    - Including occupied dwellings (preliminary data)
  - Housing (by the end of June 2012)
    - Delay due to the updated version of administrative data (Real Estate Register)
Background
- Four basic registers set up by SORS long ago
  - CPR first used for statistics in 1986
    - Data already used in the 1991 and 2002 Censuses
  - Register of Spatial Units (address list) in the 1980s
  - Statistical Register on Employment, from 1986
    - In 2002, data on occupation, industry and place of work taken over
  - Business Register in 1976
- Two missing registers available after 2002
  - Real Estate Register established in 2007
  - Household Register computerized
    - CPR supplemented with dwelling numbers
    - Mini project of SORS and Ministry of the Interior in 2010
Preparatory phase (1)
- Analyses and evaluation
  - Data sources
  - Quality of data
  - Methodological and processing solutions
- Trial census carried out in 2010: main findings
  - Inconsistencies in Household Register
    - Easy to improve quality
    - Solved by Ministry of the Interior on the basis of SORS guidelines
  - Detected errors should be corrected in the primary source
Preparatory phase (2)
- Missing dwelling numbers (DN) in CPR
  - More than half of the population lives in multi-dwelling buildings
- Two main activities for improvement undertaken
  - Automated determination of DN on the basis of ownership and residence
  - 49,000 letters sent to residents without DN
    - Response rate 75%, including letters returned by post
- Still 12.3% of DN missing in input database
Preparatory phase (3)
- Unsatisfactory quality of Real Estate Register data
  - The main problem in the whole statistical process
  - SORS analyses sent to register keeper
  - Public data: owners had the chance to check and change data
  - Data on ownership depend on long-lasting legal matters
- Re-updating of final database (selected topics)
Linkage of data (1)
- Identifiers crucial for integration of persons, households and dwellings
  - PIN (transformed to SID before the process)
    - Basic identifier for most linkage regarding persons
  - Household number (HN)
    - Housekeeping concept is implemented
    - Not available for foreigners: 2.1% of HN missing
    - Relation to the reference person could be considered an identifier (key for family generation)
  - Dwelling number
    - The share of missing data still high: 12.3%
  - Address
    - Unique identifier of every building
Linkage of data (2)
- Statistical process almost completely automated
  - Very complex rules for imputing key identifiers
- Interface for manual editing incorporated in the statistical process
  - Better quality, but only 1% of records
  - Household formation of foreigners
  - Family formation
    - Multi-member households
    - Households without data on biological parents or spouses
Quality indicators - identifiers
Share in %

Identifier                           Number of    Unchanged   Imputed               Corrections
                                     records                  Automated   Manual    Automated   Manual
Dwelling ID 1)                       724,479      75.3        11.7        0.6       11.9        0.5
Household ID 2)                      2,016,423    94.9        2.0         0.1       2.3         0.6
Relation to the reference person 2)  2,016,423    91.6        4.1         0.1       3.3         0.8 3)

1) Multi-dwelling buildings only.
2) Private households only.
3) Manual corrections in the stage of family generation also included.
Current activity status
Population aged 15+
Data integration stage

Priority  Source content                  Period           Number of records         Share %
                                                           Input        Output
1         Employed persons                24-31.12.2010    820,793      804,854      98.1
2         Registered unemployed persons   1.1.2011         109,994      104,560      95.1
3         Students enrolment              2010/2011        91,654       77,346       84.4
4         Scholarship recipients          1.1.2011         31,076       18,353       59.1
5         Recipients of pensions          1.1.2011         582,594      511,279      87.8
6         Health insured persons          1.1.2011         1,253,284    187,418      15.0
7         Recipients of social benefits   2010             94,812       14,455       15.2
8         Income tax payers               2010             1,619,247    10,124       0.6
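The priority ordering in the data integration table can be illustrated with a minimal Python sketch. It assumes that each person takes the activity status from the highest-priority source in which they appear, so that lower-priority sources contribute only persons not yet covered. The slides do not spell out the actual SORS rules; the function names, source labels and person IDs below are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def integrate_by_priority(sources: List[Tuple[str, Iterable[str]]]) -> Dict[str, str]:
    """Map each person ID to the first (highest-priority) source listing it.

    `sources` must already be ordered by priority, highest first.
    """
    assigned: Dict[str, str] = {}
    for source_name, person_ids in sources:
        for pid in person_ids:
            # setdefault keeps the earlier (higher-priority) assignment
            assigned.setdefault(pid, source_name)
    return assigned


def output_share(input_count: int, output_count: int) -> float:
    """Share (%) of a source's input records that reach the output."""
    return round(100.0 * output_count / input_count, 1)


# Hypothetical toy data: person B is both employed and registered unemployed,
# person C is both registered unemployed and a pension recipient.
toy_sources = [
    ("employed", ["A", "B"]),
    ("unemployed", ["B", "C"]),
    ("pensions", ["C", "D"]),
]
toy_result = integrate_by_priority(toy_sources)

# Shares recomputed from the Input and Output columns of the table.
share_employed = output_share(820_793, 804_854)
share_tax_payers = output_share(1_619_247, 10_124)
```

Under this reading, the low shares for the last three sources simply reflect that most of their records belong to persons already classified by a higher-priority source.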
Imputation
- Almost all missing data imputed
  - Except occupation, industry and status in employment for persons working abroad (e.g. daily commuters)
- Two main methods used
  - Automated corrections on the basis of existing correlated data (e.g. activity status by health insurance code)
  - Hot-deck imputation
- Imputation rates lower than in the 2002 Census
  - Activity status 1.5%
  - Occupation (employed) 3.9%
  - Occupation (unemployed) 5.2%
  - Industry (employed) 3.7%
  - Industry (unemployed) 18.0%
  - Place of work 3.8%
  - Educational attainment 1.5%
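Hot-deck imputation, named above as one of the two main methods, can be sketched as follows. The slides do not say which variant was used, so this is only an illustration of a random hot-deck within imputation classes: a missing value is copied from a randomly chosen donor record that shares the same class variables. All field names (`sex`, `age_group`, `education`) and the `imputed` flag are assumptions made for the example.

```python
import random
from typing import Dict, List, Optional, Sequence

Record = Dict[str, Optional[str]]


def hot_deck_impute(records: List[Record], target: str,
                    class_vars: Sequence[str], seed: int = 0) -> List[dict]:
    """Fill missing `target` values from random donors in the same class."""
    rng = random.Random(seed)
    # Collect donor values per imputation class.
    donors: Dict[tuple, List[str]] = {}
    for rec in records:
        if rec.get(target) is not None:
            key = tuple(rec[v] for v in class_vars)
            donors.setdefault(key, []).append(rec[target])

    result = []
    for rec in records:
        new = dict(rec)
        new["imputed"] = False
        if new.get(target) is None:
            pool = donors.get(tuple(new[v] for v in class_vars))
            if pool:  # leave the value missing when no donor exists
                new[target] = rng.choice(pool)
                new["imputed"] = True
        result.append(new)
    return result


def imputation_rate(records: List[dict]) -> float:
    """Share (%) of records whose value was imputed."""
    return round(100.0 * sum(r["imputed"] for r in records) / len(records), 1)


# Hypothetical toy data.
people = [
    {"sex": "F", "age_group": "30-44", "education": "tertiary"},
    {"sex": "F", "age_group": "30-44", "education": None},
    {"sex": "M", "age_group": "65+", "education": "basic"},
    {"sex": "M", "age_group": "15-29", "education": None},  # no donor in class
]
completed = hot_deck_impute(people, "education", ["sex", "age_group"])
rate = imputation_rate(completed)
```

Keeping an explicit `imputed` flag on each record is what makes it possible to report imputation rates per topic, as in the list above.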
Where should we be heading?
- Integration into social statistics
  - Census data used for regular surveys (e.g. country of birth of parents, immigrant background)
- Coverage
  - Cooperation with MI to reduce over-registration
- Geo-referencing
  - Free of charge on the web: application KASPeR
- Quality of processes and outputs
  - Every single change of data from the input databases to the final census database is recorded
  - Introducing manual interface for improving quality
  - Common tools: internal integration of IT processes
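Recording every change between the input databases and the final census database is what allows indicators such as the shares of unchanged, imputed and corrected identifiers to be derived afterwards. The sketch below shows one possible way to keep such a change log in Python; it is not the SORS implementation. The `Change` structure, the rule that a change from a missing value counts as "imputed" and any other change as "corrected", and the assumption of at most one change per record and variable are all illustrative.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Change:
    record_id: str
    variable: str
    old_value: Optional[str]
    new_value: str
    kind: str  # "imputed" (old value missing) or "corrected"
    mode: str  # "automated" or "manual"


def apply_change(record: dict, log: List[Change], record_id: str,
                 variable: str, new_value: str, mode: str) -> None:
    """Change one value and append the change to the log."""
    old = record.get(variable)
    kind = "imputed" if old is None else "corrected"
    log.append(Change(record_id, variable, old, new_value, kind, mode))
    record[variable] = new_value


def indicator_shares(log: List[Change], variable: str,
                     n_records: int) -> Dict[str, float]:
    """Shares (%) by kind and mode, assuming one change per record at most."""
    relevant = [c for c in log if c.variable == variable]
    counts = Counter((c.kind, c.mode) for c in relevant)
    shares = {f"{kind} {mode}": round(100.0 * n / n_records, 1)
              for (kind, mode), n in counts.items()}
    changed = len({c.record_id for c in relevant})
    shares["unchanged"] = round(100.0 * (n_records - changed) / n_records, 1)
    return shares


# Hypothetical toy data: four persons, two of them changed.
persons = {
    "p1": {"household_id": "H1"},
    "p2": {"household_id": None},
    "p3": {"household_id": "H9"},
    "p4": {"household_id": "H2"},
}
change_log: List[Change] = []
apply_change(persons["p2"], change_log, "p2", "household_id", "H1", "automated")
apply_change(persons["p3"], change_log, "p3", "household_id", "H2", "manual")
shares = indicator_shares(change_log, "household_id", len(persons))
```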
Over-registration
- Common problem of register-based systems
- Missing data on activity status used as indicator
  - Data from 8 sources used
  - For 1.25% of population no evidence in any source
- Overestimated population groups
  - Foreigners with permanent residence
  - Working age population (30-44 years): working abroad?
  - Administrative survivors (over 94 years)
- Final estimation 0.9%
  - Very comparable with household surveys
- No need for post-enumeration survey
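The indicator described on this slide, persons in the register population for whom none of the activity sources holds any evidence, can be sketched with plain set operations. This is only an illustration under assumed inputs: the person IDs and source names are hypothetical, and the slides do not describe how the 1.25% was narrowed down to the final 0.9% estimate.

```python
from typing import Dict, Iterable, Set


def no_evidence(population_ids: Iterable[str],
                sources: Dict[str, Iterable[str]]) -> Set[str]:
    """Return persons in the population who appear in none of the sources."""
    seen: Set[str] = set().union(*sources.values())
    return set(population_ids) - seen


def share_percent(part: int, total: int) -> float:
    """Share (%) of `part` in `total`, rounded to two decimals."""
    return round(100.0 * part / total, 2)


# Hypothetical toy data: p4 has no trace in any source.
population = ["p1", "p2", "p3", "p4"]
activity_sources = {
    "employment": ["p1"],
    "pensions": ["p2", "p3"],
    "income_tax": ["p1", "p3"],
}
suspects = no_evidence(population, activity_sources)
suspect_share = share_percent(len(suspects), len(population))
```

A person flagged this way is only a candidate for over-registration; the groups listed on the slide (foreigners with permanent residence, persons possibly working abroad, administrative survivors) suggest where such candidates concentrate.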
Conclusion
- Two main conditions for input data quality
  - Close cooperation with register keepers
    - Feedback implemented in primary source
  - Permanent use of registers
    - Not only for statistical purposes
- In future no more "Census" but a regular annual/periodical survey
  - Every 3-4 years complete "Census"
  - Every year education, activity, migration data
  - Twice a year basic demographic data including citizenship
  - New term instead of "Census"?
Thank you for your attention!