Roundtable on Archiving and Disseminating official statistics with a focus on census microdata Example: IPUMS-International * * *

Jan 15, 2016

Transcript
Page 1: Roundtable on Archiving and Disseminating official statistics with a focus on census microdata Example: IPUMS-International  * * *

Roundtable on Archiving and Disseminating official statistics with a focus on census microdata

Example: IPUMS-International
http://www.ipums.org

* * *

Robert McCaa, Professor of Population History, and Wendy L. Thomas, Archivist, University of Minnesota Population Center

[email protected]

This .ppt, docs, & additional information at: www.hist.umn.edu/~rmccaa/ipums-africa

Page 2

Our common fate on a crowded planet: new forms of global cooperation are required.

We must engage interdisciplinary research combining theory and practice.

--Jeffrey D. Sachs, Common Wealth (Penguin 2008)

Page 3

A Census Microdata Revolution

1. Preserve all microdata and documentation (20 slides)
   Product (tables and microdata)
   Process (of conducting census and producing census microdata)

2. Integrate microdata and metadata (8 slides)

3. Disseminate to researchers world-wide (3 slides)

Conclusion: strengths, challenges, 7 golden rules (4 slides)

Page 4

A Census Microdata Revolution

1. Preserve all census microdata and documentation, product and process: 1960s – present
   ~100 countries (80 have endorsed IPUMS MoU)
   ~400 censuses (219 are entrusted to IPUMS)

2. Integrate: both microdata and metadata

3. Disseminate to researchers world-wide: “extracts” of database (countries, censuses, sub-populations, sample size, variables)

Page 5

IPUMS-International Today

dark green = already integrated: 35 countries, 111 censuses, 263 million person records
green = to be integrated: 39 countries, 103 censuses, 150 million

Mollweide projection

Page 6

IPUMS dissemination calendar (see handout)
samples for 35 countries available now, 74 soon

» Europe 10:4
  » Available (10): Austria, Belarus, France, Greece, Hungary, Netherlands, Portugal, Romania, Spain, UK
  » Soon (4): Germany, Czech Republic, Slovenia, Switzerland

» Americas (funding renewed July 1) 11:11
  » Available (11): Argentina, Brazil, Canada, Chile, Colombia, Costa Rica, Ecuador, Mexico, Panama, USA, Venezuela
  » Soon (11): Bolivia, Cuba, Dominican Republic, El Salvador, Guatemala, Honduras, Nicaragua, Paraguay, Peru, Puerto Rico, Uruguay

» Africa 6:11
  » Available (6): Egypt, Ghana, Kenya, Rwanda, South Africa, Uganda
  » Soon (11): Botswana, Ethiopia, Guinea (Conakry), Madagascar, Malawi, Mali, Mauritius, Sierra Leone, Sudan, Tanzania, Zambia

» Asia 8:13
  » Available (8): Cambodia, China, Iraq, Israel, Malaysia, Palestine, Philippines, Vietnam
  » Soon (13): Armenia, Bangladesh, Fiji, India, Indonesia, Jordan, Kyrgyz Republic, Mongolia, Nepal, Pakistan, Thailand, Turkmenistan

Page 7

IPUMS timeline

» 1995: IPUMS-USA first release of integrated microdata
  IPUMS-USA continues: 1850-2000 + ACS samples

» 1999: IPUMS-International funded

» 2002: 1st International release: 7 countries, including Colombia and Mexico

» 2006: 20 countries, 63 censuses

» 2008: 35 countries, 111 censuses
  » ~263 million person records
  » Two thousand users

» 2013: ~70 countries, ~200 censuses
  » 214 sets of microdata are already entrusted to MPC
  » Coming: Germany (8), Switzerland (4), Bangladesh (2), Cuba (1)...

Page 8

1. Preserve (Archive)
IPUMS Global workshop, ISI (Lisbon, Aug 2007)

Page 9

Microdata: Archiving & Disseminating

• The producer’s perspective (official statisticians):
  – Archiving:
    • Comprehensive preservation of both data and documentation (metadata) with easily searchable indices
    • Continually updated with technological innovation: hardware, software (doc, pdf, txt, xls, jpg, etc.) and wet-ware
  – Disseminating: the web revolution

• The consumer’s perspective (researchers)
  – Access: locate and use on the web without obstacles
  – Disseminating: free access to anyone, anywhere, anytime (access postponed is access denied)

• What are your interests?

Page 10

Microdata: Archiving & Disseminating

Our perspective:
• “Archiving Census Microdata and Documentation: Preserving Memory, Increasing Stakeholders” (UNSD NYC, 2001); copy of paper at ~rmccaa/ipums-africa
  – Long term, 7 keys: readable, intelligible, identifiable, encapsulated, understandable, reconstructable, authentic
  – What to preserve: the product and the process
  – How to assess future value: stakeholders, future impact, anticipated use, informing the future
  – Challenges: archive, plan, trained staff, external repository

Page 11

Preservation, the problem: 1973 census tapes of Sudan were at risk!

Page 12

A Solution: Data recovery (by a specialized data recovery company)

Page 13

>3,000 tapes recovered: 1971 Germany, 1980 Mexico, Mali 76, Sudan 73 and many more

Microdata on this tape were recovered!!

Data recovery. Example: Bangladesh Bureau of Statistics (1981 census, 276 tapes, recovery in Aug. ‘08)

Page 14

Census Microdata: 1950s
few countries archived microdata

(a country in green indicates microdata exist for the decade)
see: www.hist.umn.edu/~rmccaa/IUMSI/country6.htm

Mollweide projection

Page 15

Census Microdata: 1960s
The Americas: in the vanguard for preservation of microdata

Mollweide projection

Page 16

Census Microdata: 1970s
the preservation of microdata was almost universal in the Americas and was becoming widespread in Europe, Africa and Asia

Mollweide projection

Mali, 1976: census microdata recovered from old Bernoulli boxes

Page 17

Census Microdata: 1980s
The preservation of microdata became generalized

Mollweide projection

Ghana, 1984: census microdata recovered from floppy discs!

Page 18

Census Microdata: 1990s
many countries preserved microdata (or are disposed to recover them)

Mollweide projection

Page 19

Census Microdata: 2000s
many countries have microdata (or are disposed to make them available for research)

Mollweide projection

Page 20

Inventory of census microdata archived by region and decade (% of censuses conducted)

Region/continent        Countries  2000s  1990s  1980s  1970s  1960s
Latin America                  21   100%   100%    89%    81%    72%
North America                  27    91%    72%    64%    24%     8%
Africa                         58    15%    22%    25%    15%     2%
Asia                           44     ?%    54%    31%    30%    13%
Europe                         46     ?%    67%    55%    41%    13%
Pacific (pop. > 0.5m)           7   100%   100%   100%    43%    29%

Note: cases confirmed by the corresponding official statistical institute. Some datasets remain to be certified. Some countries have not responded to the invitation to inventory their stocks of data.
Source: http://www.hist.umn.edu/~rmccaa/IPUMS/country6.htm

Page 21

7 Essential Types of Metadata for Each Census
See IPUMS Documentation (“Table 1”)

1. Census Questionnaires (forms): dwellings, households, persons, mortality, migration, etc.
2. Enumerator instructions
3. Data Dictionaries (layouts)
4. Codebooks
   a. Geographic codes
   b. Occupation / Industry / Education codes
5. Data processing protocols
6. Official Statistics
7. Official Reports (Analytical, Technical, Methodological)

Page 22

7 Essential Types of Metadata for Each Census
Example: Ghana

www.hist.umn.edu/~rmccaa/ipums-africa

Page 23

7 Essential Types of Metadata for Each Census
Example: Guinea (Conakry)

www.hist.umn.edu/~rmccaa/ipums-africa

Page 24

2. Integration: Microdata and Metadata

Page 25

IPUMS integration of metadata and microdata

» Comprehensive documentation, including
  » Data dictionaries and codebooks
  » Complete original source documentation in the official language: questionnaires, manuals, etc.
  » All translated to English (from the German--thanks again to Martin Podehl!!) and converted into a metadatabase for each census

» Integration ≠ standardization
  » Composite codes (11, 12, 21, 22…) ≠ serial codes (1, 2, 3, …) (see next slide)

Page 26

                                                      Chile         México
Code  Label                                        1992   2002   1990   2000
0     NIU                                           X      X      X      X
      ACTIVE (In Labor Force)
100   EMPLOYED, not specified                       ·      ·      ·      ·
110   At work                                       X      X      X      X
111   At work, and 'student'                        ·      ·      ·      X
112   At work, and 'housework'                      ·      ·      ·      X
113   At work, and 'seeking work'                   ·      ·      ·      X
114   At work, and 'retired'                        ·      ·      ·      X
115   At work, and 'no work'                        ·      ·      ·      X
116   At work, and 'other'                          ·      ·      ·      X
117   At work, family holding, not specified        ·      ·      ·      ·
118   At work, family holding, not agricultural     ·      ·      ·      ·
119   At work, family holding, agricultural         ·      ·      ·      ·
120   Have job, not at work last week               X      X      X      X

IPUMS microdata integration method: composite codes (multiple digits)

retains not only significant distinctions but also integrates comparable concepts
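The practical benefit of composite codes can be sketched in a few lines of Python. This is only an illustration, not the IPUMS implementation: the function name `collapse` and the small `LABELS` dictionary are hypothetical, and the labels are copied from the three-digit employment status codes on this slide. Because each digit carries meaning, a detailed code can be collapsed to a more general one by zeroing trailing digits, which a serial code list (1, 2, 3, …) would not allow.

```python
# Hypothetical helper: labels copied from the three-digit table on this slide.
LABELS = {
    100: "EMPLOYED, not specified",
    110: "At work",
    111: "At work, and 'student'",
    112: "At work, and 'housework'",
    120: "Have job, not at work last week",
}


def collapse(code: int, digits: int) -> int:
    """Keep the first `digits` digits of a three-digit composite code.

    The remaining digits are set to zero, giving the more general category.
    """
    if not 1 <= digits <= 3:
        raise ValueError("digits must be 1, 2, or 3")
    factor = 10 ** (3 - digits)
    return (code // factor) * factor


for code in (111, 112, 120):
    detail = collapse(code, 2)   # second-digit level, e.g. 110 "At work"
    general = collapse(code, 1)  # first-digit level, e.g. 100 (employed)
    print(f"{code} {LABELS[code]!r} -> {detail} {LABELS[detail]!r} -> {general}")
```

A researcher who needs only the broad category can work at the first-digit level, while one who needs the detail reported in a particular census keeps all three digits.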

Page 27

(Employment status code table for Chile and México repeated from Page 26)

IPUMS microdata integration method: composite codes (multiple digits)

retains not only significant distinctions but also integrates comparable concepts

Goal of integration coding scheme: to assist each researcher in making informed decisions on comparability, not to make the one best decision for all researchers.

Page 28

IPUMSI IPUMSI Col Col Fra Fra Ken Mex Mex US Viet Viet

Code Label 1964 1993 1962 1975 1999 1970 2000 1960 1989 1999

 0000 N/A *,5 B * B BB 0 BB 00 B B,1

   ACTIVE (In Labor Force)                    

 1000 EMPLOYED, not specified 1               1

 1100 At work   4 1 1 01 1 10 10    

 1101         At work, and 'student'             14      

 1102         At work, and 'housework'             15      

 1103         At work, and 'seeking work'             13      

 1104           At work, and 'retired'             16      

 1105           At work, and 'no work'             18      

 1106           At work, public emergency               11    

 1107           At work, family holding, not specified                    

 1108           At work, family holding, not agricultural         03          

 1109           At work, family holding, agricultural         04          

 1110           Working and studying (France)                    

 1200     Have job, not at work last week   3     02   20 12    

 1300     Armed forces               13    

 1301           Armed forces, at work               14    

 1302           Armed forces, not at work last week               15    

 1303           Military trainee (France)     8 6            

 2000 UNEMPLOYED, not specified 2     3 05 2 30 20    

 2001             Unemployed (Vietnam)                 4 5

 2002             Worked less than 6 months, permanent job                 2

 2003             Worked less than 6 months, temporary job                 6  

 2100         Unemployed, experienced worker   1           21    

 2101             Seeking work, worked less than 3 months     2              

 2102             Seeking work, worked 3 to 6 months     3              

 2103             Seeking work, worked 6 to 12 months     4              

 2104             Seeking work, worked more than 1 year     5              

 2105             Seeking work, experience unspecified     6              

 2200         Unemployed, new worker   2 7         22    

 3000 INACTIVE (Not in Labor Force)               30    

 3100     Housework 3 6     10 3 50 31 6 2

 3200     Unable to work/disabled 7 7     09   70 32 7 4

 3300     In school 4 5 9 5 07   40 33 5 3

 3400     Retirees and living on rent 8           60      

 3401     Living on rent payments                    

 3402         Retirees/pensioners   8   4 08          

 3500 Elderly 6                  

 3600     No work available/discouraged         06          

 3700     Inactive, other reasons 9 0 0 0 11 4 80 34   6

 9000 UNKNOWN/MISSING 9 00 9 99 9

Note: In the source data columns, a comma indicates more than one code was coded to the respective IPUMS-International value; an asterisk means programming logic was used; B indicates a blank in the source data.

Translation Table for Employment Status
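A translation table of this kind amounts to a lookup from each sample's source codes to the harmonized codes. The sketch below is a minimal illustration in Python, not the IPUMS software. The dictionary holds the values that appear in the Mex 2000 column of the table above; the column positions are read from the flattened transcript, so treat the mapping as an assumption. The names `MEXICO_2000_TO_EMPSTAT` and `harmonize` are hypothetical.

```python
# Assumed mapping: Mex 2000 source code -> harmonized IPUMS-International code,
# as read from the translation table above (column positions inferred).
MEXICO_2000_TO_EMPSTAT = {
    "10": 1100,  # At work
    "13": 1103,  # At work, and 'seeking work'
    "14": 1101,  # At work, and 'student'
    "15": 1102,  # At work, and 'housework'
    "16": 1104,  # At work, and 'retired'
    "18": 1105,  # At work, and 'no work'
    "20": 1200,  # Have job, not at work last week
    "30": 2000,  # UNEMPLOYED, not specified
    "40": 3300,  # In school
    "50": 3100,  # Housework
    "60": 3400,  # Retirees and living on rent
    "70": 3200,  # Unable to work/disabled
    "80": 3700,  # Inactive, other reasons
}


def harmonize(source_code: str) -> int:
    """Translate one Mex 2000 source code to the harmonized code.

    A blank source field (shown as B in the table) maps to 0000, N/A.
    Codes missing from the dictionary raise KeyError rather than being guessed.
    """
    code = source_code.strip()
    if code == "":
        return 0
    return MEXICO_2000_TO_EMPSTAT[code]


print(harmonize("14"), harmonize("50"), harmonize("  "))
```

Raising an error for unlisted codes is a deliberate choice in this sketch: a silent default would hide source values that the table does not cover.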

Translation table column groups: Harmonized Codes and Labels; Source Data Codes (selected samples)

Metadata: Employment Status

EMPSTAT
Employment status

Description
EMPSTAT indicates whether or not the respondent was part of the labor force -- working or seeking work -- over a specified period of time. Depending on the sample, EMPSTAT can also convey further information.

The first digit of EMPSTAT is fully comparable, and classifies the population into three groups: employed, unemployed, and inactive. The combination of employed and unemployed yields the total labor force. The second and third digits of EMPSTAT preserve additional information available for some countries and census years but not for others.

Employment status is sometimes referred to in other sources as "activity status."

Comparability -- General
The age of persons to whom the question applies varies across the samples (see Universe).

The reference period for the employment status question varies. For most samples, employment status was reported with respect to the day of the census or…

Page 29

(Translation Table for Employment Status repeated from Page 28)

Metadata: Employment Status, example: Mexico

Comparability -- Mexico
The universe and reference period are fully comparable across the Mexico samples.

The 1970 Census did not provide detail on the inactive population except for "houseworkers," while the later samples have numerous subcategories.

In 1990, the employment status question refers to "Principal Activity" and therefore under-reports secondary economic activity by students, housewives, family-workers, the semi-retired, and others.

The 2000 Census sought to overcome deficiencies in reporting work status for people whose primary activity was not work (students, housewives, retirees, etc.), but who in fact were working according to international definitions. A second question introduced for the first time in 2000 sought to capture this secondary economic activity. For strict comparability with earlier Mexican censuses, this recovered activity (codes 1101-1106) should be considered "inactive."…

Integrate: retain all significant detail, harmonize everything
Not standardize: force square pegs in round holes
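The Mexico comparability note describes a recode that each researcher may choose to apply. A minimal Python sketch follows, assuming four-digit harmonized codes; the function name `strict_mexico_group` is hypothetical, and the rule is taken directly from the note: codes 1101-1106 count as "inactive" when strict comparability with earlier Mexican censuses is wanted.

```python
# Codes for secondary economic activity first captured in the 2000 census.
RECOVERED_ACTIVITY = range(1101, 1107)  # 1101 through 1106 inclusive

GENERAL_GROUPS = {1: "employed", 2: "unemployed", 3: "inactive"}


def strict_mexico_group(code: int, strict: bool = True) -> str:
    """Classify a harmonized code, optionally applying the strict rule.

    With strict=True, recovered activity (1101-1106) is treated as inactive,
    as the comparability note suggests. With strict=False, the first digit
    alone decides the group.
    """
    if strict and code in RECOVERED_ACTIVITY:
        return "inactive"
    return GENERAL_GROUPS.get(code // 1000, "other")


for code in (1100, 1101, 1106, 1200, 3100):
    print(code, strict_mexico_group(code), strict_mexico_group(code, strict=False))
```

Keeping the rule behind a `strict` flag reflects the stated goal of integration: the detail stays in the data, and the researcher decides how to treat it.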

Page 30

IPUMS integrated metadata: Instantly, compare text &/or image of enumeration forms and instructions for any combination of countries and censuses (example: educational attainment)

Page 31: Roundtable on Archiving and Disseminating official statistics with a focus on census microdata Example: IPUMS-International  * * *

In addition…

» Microdata: new high precision samples not only for contemporary censuses but also for historical ones (before the 90s)
» Systematic metadata for all variables
  » Universes
  » Definitions
  » Comparability
» Dynamic system: facilitates comparing the wording of questionnaires and instructions for any combination of countries and censuses

Page 32: Roundtable on Archiving and Disseminating official statistics with a focus on census microdata Example: IPUMS-International  * * *

3. Dissemination

Page 33: Roundtable on Archiving and Disseminating official statistics with a focus on census microdata Example: IPUMS-International  * * *

- Caution -

• IPUMS microdata are anonymized samples.
  – They are for advanced analysis and research.
  – Use of statistical software is required.
  – Statistical software provides great power.
  – “With great power comes great responsibility.”
• IPUMS samples are for analysis.
• IPUMS samples are not official statistics.

Page 34: Roundtable on Archiving and Disseminating official statistics with a focus on census microdata Example: IPUMS-International  * * *

6 steps using https://international.ipums.org/international:

1. Log on with password
2a. Study documentation
2b. Design extract
3. Receive email; log on with password
4. Download extract (SSL encrypted)
5. Unzip data
(also SAS, STATA)
6. Analyze

Page 35: Roundtable on Archiving and Disseminating official statistics with a focus on census microdata Example: IPUMS-International  * * *

Conclusion: IPUMS Strengths and Challenges plus 7 golden rules for promoting the microdata revolution

Page 36: Roundtable on Archiving and Disseminating official statistics with a focus on census microdata Example: IPUMS-International  * * *

The IPUMS team (Feb. 2008)

(Not present: computer gurus, some researchers, and others who were too busy for a photo!)

Steven Ruggles, inventor of IPUMS, Professor of History, and Director of the Minnesota Population Center

Page 37: Roundtable on Archiving and Disseminating official statistics with a focus on census microdata Example: IPUMS-International  * * *

IPUMS-International strengths

1. Uniform legal authorization with national statistical authorities
2. Access restricted to academics with need who agree to abide by stringent confidentiality protections
3. Sanctions against individual and institution: denial of access to all microdata for the entire institution
4. Experienced integration teams
5. Proven web-based distribution system
6. High user satisfaction with microdata & metadata
7. Sustainable funding: NSF, NIH

Page 38: Roundtable on Archiving and Disseminating official statistics with a focus on census microdata Example: IPUMS-International  * * *

5 Challenges

1. Microdata to recover (30 countries), integrate (60 countries)
2. 2010 round of censuses (~100 countries)
3. Tabulator (research tool, not official stats)
4. GIS
5. High security laboratory for sensitive, comprehensive microdata

Page 39: Roundtable on Archiving and Disseminating official statistics with a focus on census microdata Example: IPUMS-International  * * *

7 golden rules for the global microdata revolution

1. Respect “restricted-access” conditions of use:
   » protect confidentiality
   » “share” data only with registered users
2. Study both source documentation and metadata:
   » Original source: census forms, instructions to enumerators, etc.
   » Integrated metadata: samples, variables, comparability discussions
3. Construct extracts judiciously:
   » extract only needed countries, censuses, variables, sub-pops
   » use sample size and/or “subsamp” features to keep samples small
4. Use weights: either households or individuals (geographical strata = power)
5. Analyze carefully: proper statistical techniques, keeping in mind data quality, sample error
6. Cite properly: IPUMS and National Statistical Agencies
7. Share publications: IPUMS and National Statistical Agencies
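
Rule 4 (use weights) can be illustrated with a weighted proportion. This is a minimal sketch under stated assumptions: the field names `status` and `weight` and the helper `weighted_share` are hypothetical and do not reproduce IPUMS variable names or any particular statistical package.

```python
from typing import Callable, Iterable, Mapping


def weighted_share(
    records: Iterable[Mapping],
    weight_key: str,
    predicate: Callable[[Mapping], bool],
) -> float:
    """Share of the weighted population for which predicate(record) is true."""
    total = 0.0
    matching = 0.0
    for record in records:
        weight = float(record[weight_key])
        total += weight
        if predicate(record):
            matching += weight
    if total == 0:
        raise ValueError("total weight is zero")
    return matching / total


# Hypothetical person records: each sampled person stands for `weight` people.
sample = [
    {"status": "active", "weight": 10},
    {"status": "inactive", "weight": 30},
    {"status": "active", "weight": 20},
]

share = weighted_share(sample, "weight", lambda r: r["status"] == "active")
print(share)
```

In this toy sample two of three records are active, but they represent only 30 of 60 weighted persons, so the weighted and unweighted shares differ. Person weights suit person-level estimates and household weights suit household-level ones, as the rule indicates.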

Page 40: Roundtable on Archiving and Disseminating official statistics with a focus on census microdata Example: IPUMS-International  * * *

Thank you!!

[email protected]