IPUMS-International and Integrated European Census Microdata Projects Reduce Risks of Managing Trans-border Access and Add Significant Value

Robert McCaa and Albert Esteve Palos
Minnesota Population Center and Centre d’Estudis Demografics--Barcelona
www.ipums.org/international www.iecm-project.org

Jan 16, 2016

Page 1:

IPUMS-International and Integrated European Census Microdata Projects Reduce Risks of Managing Trans-border Access and Add Significant Value

Robert McCaa and Albert Esteve Palos
Minnesota Population Center and Centre d’Estudis Demografics--Barcelona

www.ipums.org/international www.iecm-project.org

Page 2:

“Dissemination [means] opening up the value inherent in our data.”

-- Walter Radermacher and Pieter Everaers, Seminar on Emerging Trends in Data Communication and Statistics, UNSC, New York, Feb. 19, 2010

Page 3:

Trans-border access is essential in the 21st century.

Many researchers (e.g., demographers, members of IUSSP) reside outside their country of birth:

• New Zealanders: 60% reside outside country of birth
• Dutch: 40%
• Germans: 38%
• Danes: 34%
• Chinese: 30%
• Belgians: 31%
• British: 25%
• Australians: 22%
• Canadians, Finns, French, Japanese, Swiss, etc.: ~20%

Limiting access to in-country is old-fashioned, inefficient, costly, & unfair. Encourages violations, brain drain.

Page 4:

IPUMS-International

dark green = anonymized, harmonized and disseminating (69 countries, 212 censuses, 480 million person records)

medium green = to be integrated (29 countries, 75 censuses, ~100 mpr)

Mollweide projection

IPUMS-International: 2012 (weighted by population size)

2012 launch: El Salvador (2), Indonesia (9), Mexico (2010), Morocco (3), Nicaragua (3), Turkey (3), Uruguay (5)

Work began in 1999. By 2020 we hope to integrate census microdata of 100 countries, including 2010 round censuses.

Page 5:

IPUMS-International

dark green = anonymized, harmonized and disseminating (17 countries, 56 censuses, 93 million person records)

medium green = to be integrated (2 countries, 6 censuses, ~5 mpr)

Mollweide projection

IECM/IPUMS-Europe: 2012 (weighted by population size)

Countries not yet participating are invited to consider doing so: Albania, Belgium, Bosnia-H, Croatia, Denmark, Estonia, Finland, Iceland, Latvia, Lithuania, Moldova R., Norway, Russia, Serbia, Slovak R., Sweden, etc.

Page 6:

Outline: IPUMS-International & IECM Reduce Risks of Managing Trans-border Access and Add Significant Value

• NSOs that disseminate microdata by “going it alone” incur significant risks, substantial costs, & much user dissatisfaction

I. IPUMS & IECM offer a “one-stop” comprehensive solution to managing access to census microdata
II. Statistical Confidentiality and Security
III. Integration
IV. Manage trans-border access
V. Conclusion: Invitation to cooperate, entrust 2010 round census microdata as soon as feasible.

Page 7:

I. One-stop, comprehensive solution to disseminating census microdata & metadata… of Europe and the world

1. Organize: Uniform agreement with each NSO
2. Administer: We manage approval/denial of user access
3. Anonymize: We are responsible for data anonymization
4. Integrate: We do the work
   Metadata: Official language and integrated in English
   Microdata: Integrated globally & optimized for Europe
5. Disseminate: Extracts, custom-tailored to each request
6. Share: We share: results, comprehensive electronic bibliography

No longer enough to prepare a CD or post a dataset on a web-site

Page 8:

II. Statistical Confidentiality and Security

A. Microdata security and confidentiality protections
• Employees face fines, job loss, and possible imprisonment for violations
• Security: “best practice” – Dennis Trewin, ex Aus. Stat.

B. Statistical disclosure control protections:
• Suppression of records using sub-sampling, names, low-level geography, unique variates
• Paired swapping of geographical identifiers of households to create uncertainty
• Top/bottom coding, global recodes, deletion of digits, etc.

C. Managing restricted access to microdata (next slide)
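
The disclosure control protections listed under B can be pictured with a minimal Python sketch. It is an illustration only, not the IPUMS or IECM procedure: the field names (`name`, `geo`, `age_head`), the swap rate, the cap of 80, and the 50% sampling fraction are all assumptions made for the example.

```python
import random


def suppress_fields(record, fields=("name",)):
    """Suppression: drop direct identifiers (hypothetical field names)."""
    return {k: v for k, v in record.items() if k not in fields}


def top_code(value, cap):
    """Top coding: any value above the cap is reported as the cap."""
    return min(value, cap)


def swap_geography(households, rate, rng):
    """Paired swapping: exchange 'geo' codes between randomly paired households.

    Roughly `rate` of households (rounded down to an even count) take part.
    Returns new records; the input list is left unchanged.
    """
    swapped = [dict(h) for h in households]
    n_pairs = int(len(swapped) * rate / 2)
    chosen = rng.sample(range(len(swapped)), 2 * n_pairs)
    for i, j in zip(chosen[0::2], chosen[1::2]):
        swapped[i]["geo"], swapped[j]["geo"] = swapped[j]["geo"], swapped[i]["geo"]
    return swapped


def subsample(households, fraction, rng):
    """Sub-sampling: release only a random fraction of households."""
    return rng.sample(households, int(len(households) * fraction))


rng = random.Random(0)  # fixed seed only so the sketch is repeatable
households = [
    {"id": i, "name": f"Household {i}", "geo": f"G{i}", "age_head": 60 + 5 * i}
    for i in range(10)
]
protected = [suppress_fields(h) for h in swap_geography(households, 0.4, rng)]
protected = [{**h, "age_head": top_code(h["age_head"], 80)} for h in protected]
released = subsample(protected, 0.5, rng)
```

Swapping leaves the overall distribution of geographic codes unchanged while making any single household's location uncertain, which is the "create uncertainty" effect the slide describes.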

Page 9:

II. Statistical Confidentiality and Security (cont’d.)

A. Microdata security and confidentiality protections

B. Statistical disclosure control protections:

C. Managing restricted access to microdata
• Detailed registration form to establish bona-fides
• 4/5ths of viewers do not complete the form! --automatic denial
• Conditions of use bind researcher & institution; violations penalize every researcher at institution
• Custom-tailored extracts encourage researchers to jealously guard their downloads.
• More than 5,000 researchers approved for access

Page 10:

III. Integration: Metadata & Microdata

D. Comprehensive source metadata in official language(s)
• Questionnaires, instructions, manuals, etc.

E. Integrated, DDI compatible metadata: definitions, concepts, variable names, value labels, codes--all link back to sources
• Descriptions of censuses and samples
• Variables defined, comparability discussions
• Example: educational attainment (next slide)

F. Integrated, pooled microdata: multiple censuses in a single file

G. Integrated boundary files (GIS) linked to microdata

H. IPUMS value added variables

Page 11:

Example of composite coding: Educational attainment

Page 12:

III. Integration: Metadata & Microdata (cont’d.)

D. Comprehensive source metadata in official language(s)

E. Integrated, DDI compatible metadata: definitions, concepts, variable names, value labels, codes--all link back to sources

F. Integrated, pooled microdata: many censuses in single file

G. Integrated boundary files (GIS) linked to microdata

H. IPUMS value added variables:
• Technical variables: weights, identifiers
• Family, household info: summary indicators
• Person variables: Locations of mother, father, spouse and rules for linking (momloc, poploc, sploc)
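
A rough sketch of how location pointers such as momloc, poploc and sploc can be used once an extract is downloaded. It assumes each pointer holds the within-household person number of the relative, with 0 meaning the relative is not present in the household; the identifier names `serial` (household) and `pernum` (person within household) and the helper `attach_characteristic` are hypothetical names chosen for this example.

```python
def attach_characteristic(persons, pointer, source_var, new_var):
    """Copy `source_var` from the relative named by `pointer` into `new_var`."""
    # Index every person by (household id, person number) for direct lookup.
    index = {(p["serial"], p["pernum"]): p for p in persons}
    result = []
    for p in persons:
        # Assumption: a pointer of 0 means the relative is not in the household.
        relative = index.get((p["serial"], p[pointer])) if p[pointer] else None
        result.append({**p, new_var: relative[source_var] if relative else None})
    return result


# One household: two spouses (persons 1 and 2) and their child (person 3).
persons = [
    {"serial": 1, "pernum": 1, "age": 41, "momloc": 0, "poploc": 0, "sploc": 2},
    {"serial": 1, "pernum": 2, "age": 43, "momloc": 0, "poploc": 0, "sploc": 1},
    {"serial": 1, "pernum": 3, "age": 12, "momloc": 1, "poploc": 2, "sploc": 0},
]
with_mother = attach_characteristic(persons, "momloc", "age", "age_mom")
with_spouse = attach_characteristic(persons, "sploc", "age", "age_sp")
```

Here the child's record gains the mother's age, and each spouse's record gains the other's age, without the researcher having to reconstruct family relationships from scratch.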

Page 13:

IV. Managing Trans-border Access

I. Trans-border access: uniform experience for access to all countries, regardless of nationality

J. Custom-tailored extracts: user selects country(ies), censuses, variables, sub-populations
• Extract engine fulfills request, generates custom-tailored microdata and metadata
• 3 unique IPUMS extract tools:
  1. Select cases
  2. Attach characteristics
  3. Customize sample size

K. Usage: 8,048 extracts in 2011; 40,142 samples. See next page.
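
Two of the extract tools named under J, "select cases" and "customize sample size", can be approximated locally as follows. This is a sketch under stated assumptions, not the extract engine itself: the variable names (`serial`, `sex`, `age`, `weight`) are hypothetical, and the choice to sample whole households and rescale weights by the inverse of the sampling fraction is an assumption made for the example.

```python
import random


def select_cases(persons, **criteria):
    """Keep persons whose variables match every criterion.

    Each criterion is either a single value or a collection of allowed values.
    """
    def matches(person):
        for var, allowed in criteria.items():
            if not isinstance(allowed, (set, list, tuple)):
                allowed = {allowed}
            if person[var] not in allowed:
                return False
        return True

    return [p for p in persons if matches(p)]


def customize_sample_size(persons, fraction, seed=0):
    """Keep a random fraction of households and rescale the person weights."""
    rng = random.Random(seed)
    serials = sorted({p["serial"] for p in persons})
    keep = set(rng.sample(serials, int(len(serials) * fraction)))
    # Dividing by the fraction keeps the weighted population total stable.
    return [
        {**p, "weight": p["weight"] / fraction}
        for p in persons
        if p["serial"] in keep
    ]


# 100 two-person households, each person carrying a weight of 10.
persons = [
    {"serial": s, "pernum": n, "sex": n, "age": 20 + s % 50, "weight": 10.0}
    for s in range(1, 101)
    for n in (1, 2)
]
women = select_cases(persons, sex=2)
smaller = customize_sample_size(persons, 0.5, seed=1)
```

Sampling by household rather than by person keeps family members together, so pointer-based linking within a household still works on the reduced file.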

Page 14:

Disclosure Controls for Trans-Border access to Census Microdata via a Single License, Access Point: The IPUMS-IECM partnership

Robert McCaa and Albert Esteve Palos
Minnesota Population Center and Centre d’Estudis Demografics--Barcelona

www.ipums.org/international

“You have to do due diligence, something to assure yourself that the people you’re giving your data to can be trusted.”
-- http://www.nytimes.com/2011/09/09/us/09breach.html?hp

IPUMS-International Google Analytics: 2011
Trans-Border Access: 169 countries/territories, 3,033 cities, 45,000 page views. Up 4X from 2010

Page 15:

Table 2. Rank of the Top Five and all European Countries plus Canada and the USA by Number of Extracts for the 2000 round census (statistics for calendar year 2011)

| Rank | Country | Sample %* | Variables (n)* | Years of census samples | Extracts |
|---|---|---|---|---|---|
| 1 | Brazil | 5 | 106 | 1960, 70, 80, 91, 2000 | 712 |
| 2 | Mexico | 10 | 120 | 1960p, 70, 90, 95, 2000, 05 | 626 |
| 3 | United States | 5 | 92 | 1960, 70, 80, 90, 2000, 05 | 554 |
| 4 | Colombia | 10 | 120 | 1964p, 72, 85, 93, 2005 | 516 |
| 5 | South Africa | 10 | 108 | 1996, 2001, 2007 | 428 |
| 7 | Canada | 2.5 | 59 | 1971p, 81p, 91p, 2001p | 409 |
| 9 | France | 33 | 94 | 1962, 68, 75, 82, 90, 99, 06 | 380 |
| 10 | Spain | 5 | 99 | 1981, 91, 2001 | 366 |
| 13 | Greece | 10 | 89 | 1971, 81, 91, 2001 | 327 |
| 18 | Austria | 10 | 75 | 1971, 81, 91, 2001 | 310 |
| 25 | Italy | 5 | 81 | 2001 | 285 |
| 26 | Portugal | 5 | 96 | 1981, 91, 2001 | 283 |
| 29 | Romania | 10 | 97 | 1976, 92, 2002 | 272 |
| 30 | Switzerland | 5 | 79 | 1970, 80, 90, 2000 | 266 |
| 32 | United Kingdom | 3 | 47 | 1991, 2001p | 263 |
| 38 | Hungary | 5 | 74 | 1970, 80, 90, 2001 | 222 |
| 42 | The Netherlands | 1 | 33 | 1960p, 71p, 2001p | 211 |
| 45 | Slovenia | 10 | 80 | 2002 | 185 |
| 48 | Belarus | 10 | 84 | 1999 | 179 |

Total samples extracted for 55 countries (162 samples) available from January 1, 2011: 8,048

* 2000 round census; refers to all integrated variables, including IPUMS constructed variables.
“p” = person sample; all other samples are of households

Page 16:

IECM value-added (in beta test): Password protected, trans-border on-line tabulator

Page 17:

Reflections

• Substantial returns to NSOs; no cost: economies of scale, low risk.
• 96 NSOs are participating
• If yours is not, let’s discuss how to resolve the obstacles: Amend legislation, Revise regulations, Advocate statistical transparency, etc.
• Entrust 2011 census microdata, as soon as feasible
• Provide boundary files at low-level geography for each census possible

Page 18:

IPUMS at the 59th ISI (Hong Kong, Aug 24-30, 2013) http://www.isi2013.hk/

» IPUMS Workshop
» Microdata session
» IPUMS Funding for delegates from developing countries
» IPUMS booth

Page 19:

Thank you

If your NSO is not participating yet, please contact: [email protected]

When processing of your 2011 census microdata is completed, please contact: [email protected]