Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm
Transcript
Page 1: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Census Data for Researchers

Thursday Feb 23, 2012, 3:30-4:45 pm

Page 2: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Getting Acquainted with Research Data

Six 1.5-hour seminars this spring
– Census (Content & Issues): Today
– Foreclosure Crisis Data: March 8
– Data from IGOs: March 16
– Geoportal/GIS resources: March 22
– AddHealth: April 12
– Census (Resources): April 26

Page 3: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

The Census Bureau spends a LOT of money ($7 billion in 2010, $11 billion in 2011) in its mission as the primary statistical agency of the US. It gathers an enormous amount of data about individuals, households, establishments and firms from a broad array of surveys and censuses. All of this is made possible by ongoing investments in developing new content, maintaining sampling frames, evaluating the quality of responses, and responding to methodological issues and concerns.

Today, we will identify the primary data collections of the Census Bureau, looking at content, geography, access levels, standard data products, value-added research resources, and issues in cross-time and cross-survey analyses.

Today

Page 4: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Recurring Questions

Basic distinctions about collections
– Survey vs. Census
– Population & Households vs. Economic
– Title 13 vs. Title 15
– Microdata vs. Aggregate data

Content
– What questions are asked and how?

Geography
– What data is available for what areas?

Multi-legged stools
– Drawing on multiple resources, surveys, time periods and geographies…. and the strengths and drawbacks.

Page 5: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Why is the Census Bureau important?

Huge data collection budget
Even more money allocated on the basis of data collection (~$400 billion annually)
Most widely used social science data
– High-quality sample frames
– Large sample sizes, small geographies
– Consistency

Page 6: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Broad Data Collections

Population & Housing Census - every 10 years
Economic Census - every 5 years
Census of Governments - every 5 years
American Community Survey - annually
Many additional surveys - both Demographic & Economic
Economic Indicators - each indicator is released on a specific schedule

Page 7: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Supplementary Resources

Population Projections and Estimates

Small Area Income and Poverty Estimates

Small Area Health Insurance Estimates

Geographic shapefiles & resources

Page 8: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Behind the scenes: Sampling Frames (Household Surveys)

Master Address File (MAF)
– Official inventory of known living quarters
– Linked to TIGER

Housing Units
– Based on Census 2000 MAF and updates from the USPS’ Delivery Sequence File

Group Quarters
– Updates from the administrative records and the FSCPE

Page 9: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Behind the scenes: Sampling Frames (Business Surveys)

Business Register
– Census Bureau’s master business list
– Industry classification – NAICS
– Geographic classification – states, counties, etc.
– Legal form & tax status

Establishments
– Places of Business

Enterprises
– Firms

Page 10: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Behind the scenes: Sampling Frames (Business Surveys)

Source
– Payroll Tax Returns (IRS Forms 941 & 943): 25 million
– Sole Proprietorships’ Business Income Tax Returns (IRS Form 1040, Schedule C): 22 million
– Other Business Income Tax Returns (Forms 990, 1065, 1120): 10 million
– Social Security Administration Industry Codes (Form SS-4): 1.8 million
– Bureau of Labor Statistics Industry Codes: 1.5 million

Page 11: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Behind the scenes: Sampling Frames (Geography/Other)

Topologically Integrated Geographic Encoding and Referencing system (TIGER)

Boundary and Annexation Survey (BAS)
– Annual; legally defined geographies

Population Estimates
– Based on vital statistics data, IRS migration, Medicare enrollment data

Page 12: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Broad Data Collections

Population & Housing Census
– Every 10 years
– Full enumeration
– Mixed mode (mail-in, CATI, in-person)
– Long form/short-form (2000 and earlier)
– Multiple data releases

Page 13: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Census 2010: Content

10 Questions
– Name
– Sex
– Age
– Relationship (to Household Head)
– Hispanic Origin
– Race
– Owner/Renter Status

Plus
– Whether each member sometimes lives/stays elsewhere
– Total number living in residence
– Probe for unreported persons
– Telephone contact

Page 14: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Census 2010: Products

Reapportionment data – December 2010

Redistricting data – February-April 2011

SF 1 – June-August 2011

SF 2 – Dec 2011-April 2012

Same Sex Couple Summary File – Nov 2011

Congressional District Summary File – Jan 2013

AIAN Summary File – April 2013

State Legislative District Summary File – June 2013

PUMS – TBD

Page 15: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Census 2010: Product Detail

P.L. 94-171 (Redistricting Data)

State and sub-state counts down to the block level are shown for the total population and the population 18 years and over for 63 race groups; and not Hispanic or Latino origin by 63 race groups. Also shown are housing unit counts by occupancy status (occupied units, vacant units).
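The figure of 63 race groups can be reproduced by counting every non-empty combination of six race categories (the five OMB categories plus "Some Other Race"). The short Python sketch below does that count; the category labels are written out for illustration and are not taken from the slide.

```python
from itertools import combinations

# Six race categories; labels are illustrative shorthand, not official table stubs.
RACES = [
    "White",
    "Black or African American",
    "American Indian and Alaska Native",
    "Asian",
    "Native Hawaiian and Other Pacific Islander",
    "Some Other Race",
]


def race_groups(categories):
    """Return every non-empty combination of the given categories."""
    groups = []
    for size in range(1, len(categories) + 1):
        groups.extend(combinations(categories, size))
    return groups


groups = race_groups(RACES)
alone = [g for g in groups if len(g) == 1]   # single-race groups
multi = [g for g in groups if len(g) > 1]    # two-or-more-race groups
print(len(groups), len(alone), len(multi))   # 63 6 57
```

Crossing these groups with the two Hispanic-origin categories gives the "63 Race x 2 Hispanic" layout of the 2000 and 2010 redistricting files.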

Page 16: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Census 2010: Product Detail

P.L. 94-171 (Redistricting Data)

          1980                        1990                  2000                  2010
Race      5 Race AND Spanish Origin   5 Race x 2 Hispanic   63 Race x 2 Hispanic  63 Race x 2 Hispanic
Age       ---                         Total, Age 18+        Total, Age 18+        Total, Age 18+
Housing   ---                         Total Housing Units   ---                   Occupied vs Vacant

Page 17: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Census 2010: Product Detail

Summary File 1 (SF1)
– About 300 tables
– Counts and cross tabulations
– Counts for detailed race, Hispanic or Latino groups, and American Indian/Alaska Native tribes (to the tract)
– Tables repeat for major race groups alone, two or more races, Hispanic or Latino, White not Hispanic or Latino
– Geography: block, census tract

http://www.census.gov/population/www/cen2010/glance/files/SF1_Final_1.5_Internet.xls

Page 18: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Examples of SF1 Tabulations

P1 Total population (1)
P3 Race (71)
P8 Hispanic or Latino (17)
P12 Sex by age (5-year groupings) (49)
P14 Sex by age for the population under 20 (single years of age) (43)
P15 Households (1)
P17 Average household size (1)

Page 19: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Census 2010: Product Detail

Summary File 2

Detailed tables on age, sex, households, families, relationship to householder, housing units, and group quarters.

Tables are repeated by 141 race groups, 98 American Indian and Alaska Native tribes/tribal groupings, and 39 Hispanic or Latino origin groups.

Page 20: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Per Census Bureau Technical Documentation:

“The concept of race, as used by the Census Bureau, reflects self-identification by people according to the race or races with which they most closely identify. These categories are socio-political constructs and should not be interpreted as being scientific or anthropological in nature. Furthermore, the race categories include both racial and national-origin groups.”

“Origin can be viewed as the heritage, nationality group, lineage, or country of birth of the person or the person’s parents or ancestors before their arrival in the United States. People who identify their origin as Spanish, Hispanic, or Latino may be of any race.”

Page 21: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

SF2 - Detailed Asian Categories

Asian
– Asian Indian
– Bangladeshi
– Bhutanese
– Burmese
– Cambodian
– Chinese
– Chinese, except Taiwanese
– Taiwanese
– Filipino
– Hmong
– Indonesian
– Japanese
– Korean
– Laotian
– Malaysian
– Nepalese
– Pakistani
– Sri Lankan
– Thai
– Vietnamese
– Other Asian

Page 22: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

SF2 - Detailed Hispanic/Latino Categories

Hispanic or Latino (of any race)
– Mexican
– Puerto Rican
– Cuban
– Other Hispanic or Latino
– Dominican
– Central American: Costa Rican, Guatemalan, Honduran, Nicaraguan, Panamanian, Salvadoran
– South American: Argentinean, Bolivian, Chilean, Colombian, Ecuadorian, Paraguayan, Peruvian, Uruguayan, Venezuelan
– Spaniard
– All other Hispanic or Latino

Page 23: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

SF 2 - 42 American Indian Categories

American Indian

Apache Houma South American Indian

Arapaho Iroquois Spanish American Indian

Blackfeet Kiowa Tohono O'Odham

Canadian and French American Indian Lumbee Ute

Central American Indian Menominee Yakama

Cherokee Mexican American Indian Yaqui

Cheyenne Navajo Yuman

Chickasaw Osage American Indian tribes, Other

Chippewa Ottawa

Choctaw Paiute Alaska Native

Colville Pima Alaskan Athabascan

Comanche Potawatomi Aleut

Cree Pueblo Inupiat

Creek Puget Sound Salish Tlingit-Haida

Crow Seminole Tsimshian

Delaware Shoshone Yup'ik

Hopi Sioux

Page 24: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

SF2 Subject Content

36 Population tables at census tract (PCT) level

10 Population tables at county level

10 Housing tables at census tract (HCT) level

Page 25: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Other Decennial-based tabulations

Same-Sex Tabulation: a single table, but tabulated as reported, not “edited” to unmarried partner

Congressional and State Legislative summary files – retabulation of SF1 to new boundaries

Other summary files draw upon ACS, rather than decennial

Page 26: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Where’s all the interesting stuff?

In 2000 (and earlier) censuses, the census used more than one form:

A “short” form, which asked basic demographic data, just like the 2010 census form (AKA – 100% data)

A “long” form, which collected both the items on the short form and a broader set of items about income, education, ancestry, language, disability, employment, etc. (AKA – sample data)

Now, decennial census focuses solely on basic demographic data, and social and economic data are collected in the American Community Survey (ACS)

Page 27: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Broad Data Collections

American Community Survey
– Annual
– Replacement for the “long form” of the decennial census.
– HH sample fully implemented in January 2005, annual sample of around 3 million.
– Multi-mode: mail, CATI, CAPI
– Multiple data releases: 1 year, 3 year, 5 year, PUMS

Page 28: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

ACS Content - Basic

Page 29: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

ACS: Design of the Sample

Annual sample size of 3 million addresses
Series of monthly samples of 250,000 addresses
HU sample in each of the 3,141 counties
Areas with smaller populations sampled at higher rates than those with larger populations
HU address sampling rate set by block
Final sampling rate varies between 1.6% and 10%
No HU address can be sampled more than once in 5 years
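The arithmetic behind these figures can be sketched in a few lines of Python. The function names and the 2,000-address example area are hypothetical, and the five-year calculation assumes the quoted sampling rate is an annual rate, which the slide does not state explicitly.

```python
ANNUAL_ADDRESSES = 3_000_000      # annual sample size quoted on the slide
MIN_RATE, MAX_RATE = 0.016, 0.10  # final sampling rate range quoted on the slide


def monthly_sample(annual_addresses=ANNUAL_ADDRESSES):
    """Split the annual sample into twelve equal monthly samples."""
    return annual_addresses // 12


def expected_sampled(addresses, rate):
    """Expected number of sampled addresses in one year at a given rate."""
    if not MIN_RATE <= rate <= MAX_RATE:
        raise ValueError("rate outside the 1.6%-10% range quoted on the slide")
    return addresses * rate


def five_year_share(rate):
    """Share of addresses sampled over five years, assuming `rate` is annual
    and that no address repeats within the five-year window."""
    return min(1.0, 5 * rate)


print(monthly_sample())
for rate in (MIN_RATE, MAX_RATE):
    # Hypothetical area with 2,000 housing-unit addresses.
    print(rate, round(expected_sampled(2_000, rate)), round(five_year_share(rate), 2))
```

Under these assumptions, an area sampled at the top rate would see up to half of its addresses in sample over a five-year period, which is why small areas rely on the 5-year estimates.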

Page 30: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

ACS: Data Collection

HU addresses by three modes
– Mailout of paper questionnaire in 1st month
– Telephone (CATI) non-response follow-up in 2nd month
– Personal visit (CAPI) non-response follow-up in 3rd month to a sub-sample

GQ
– Personal visit within 6 weeks of sample selection

Page 31: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

ACS: Sample Design

GQ facilities sample for each state

Two strata
– Small (15 or fewer residents)
– Large (more than 15 residents)

Small
– Data collected on all residents
– Facility eligible once in 5 years

Large
– Groups of ten residents sub-sampled
– Number of groups determined by size of facility
– Facility eligible every year

Page 32: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

ACS Content Tests

2006
– Health Insurance
– Marital History
– Veteran's Service-connected Disability

2007
– Field of Degree (BA)

2010
– Computer Ownership-Internet Access
– Parental Place of Birth

2011-2013
– Testing of Internet Response mode

Page 33: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Distribution Formats

Like former decennial census data, released in both aggregate and microdata formats

Because of change to continuous sampling, however, aggregate data released at different geographic levels with differing collection frames

Page 34: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Sample Data Summary Files

Summary File ….. 3?

813 tables of data

Counts and cross tabulations of sample items (income, occupation, education, rent and value, vehicles available)

Lowest level of geography: block group

Page 35: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Multi-year estimates

Larger geographies have multiple options for estimates – 1 year, 3 year, 5 year

Comparing and interpreting overlapping multi-year estimates is not intuitive: the only differences come from the non-overlapping periods.
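A small worked example makes the overlap point concrete. The sketch below treats a multi-year estimate as a simple average of annual values, which is a simplifying assumption (the actual ACS multi-year estimates pool and reweight the sample cases), and the annual figures are made up for illustration.

```python
def period_estimate(annual_values, start_year, span):
    """Simple average of annual values over `span` years (illustrative only)."""
    years = range(start_year, start_year + span)
    return sum(annual_values[year] for year in years) / span


# Hypothetical annual values of some rate, in percent.
annual = {2005: 10.0, 2006: 11.0, 2007: 12.0, 2008: 12.0, 2009: 13.0, 2010: 15.0}

est_2005_2009 = period_estimate(annual, 2005, 5)
est_2006_2010 = period_estimate(annual, 2006, 5)

# The four shared years (2006-2009) cancel out of the difference.
difference = est_2006_2010 - est_2005_2009
shortcut = (annual[2010] - annual[2005]) / 5

print(est_2005_2009, est_2006_2010, difference, shortcut)
```

Under this assumption the change between two adjacent 5-year estimates is just one fifth of the gap between the year that entered and the year that dropped out, so a large one-year shift shows up heavily damped.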

Page 36: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.
Page 37: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

EEOC: Fall 2010


Page 38: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

New this time: EEOC

ACS 2006-2010 5-year file

Margins of error

2010 Census population base

2010 SOC Occupation categories

Additional variable: Citizenship


Page 39: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Basic Census Geography

Page 40: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Legal/Administrative Entities

Page 41: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Statistical Entities

Page 42: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

A public use microdata area (PUMA) is a decennial census area for which the U.S. Census Bureau provides specially selected extracts of raw data from a small sample of long-form census records that are screened to protect confidentiality. These extracts are referred to as “public use microdata sample (PUMS)” files. For Census 2000, two types of PUMAs were delineated within states.

PUMAs of one type comprise areas that contain at least 100,000 people. The PUMS files for these PUMAs contain a 5-percent sample of the long-form records. The other type of PUMAs, super-PUMAs, comprise areas of at least 400,000 people. The sample size is 1 percent for the PUMS files for super-PUMAs.

PUMAs cannot be in more than one state or statistically equivalent entity. The larger 1-percent PUMAs are aggregations of the smaller 5-percent PUMAs. PUMAs of both types, wherever the population size criteria permit, comprise areas that are entirely within or outside metropolitan areas or the central cities of metropolitan areas.

Non-Nested Geographies – PUMAs

Page 43: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Some Key Points to Remember

Census Geographies include nested and non-nested geographies

Some geographies defined politically, others for statistical and reporting purposes

Geographies range in size from a block to the nation as a whole, but different sorts of data available depending on type of geography

Page 44: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Demographic (Household) Surveys

Survey of Income and Program Participation
Survey of Program Dynamics
American Housing Survey
Current Population Survey
Consumer Expenditure Survey (CES)

And more…..

Page 45: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Survey of Income and Program Participation

The Survey of Income and Program Participation (SIPP) program, initiated in 1983, is a longitudinal, multi-panel survey primarily of adults in households in the United States.

Sampled households are interviewed at least nine times at four-month intervals and followed over the life of the panel. New samples (panels) are drawn periodically (annually 1984-1993; then 1996, 2001, 2004, 2008), ranging in size from around 13,000 to around 40,000 households.
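The interview schedule implied by "nine times at four-month intervals" can be laid out with a short Python sketch. The function name is hypothetical, and nine is used as the minimum number of waves; some panels ran longer.

```python
def sipp_wave_offsets(n_waves=9, interval_months=4):
    """Months elapsed since the first interview for each wave of a panel."""
    return [wave * interval_months for wave in range(n_waves)]


offsets = sipp_wave_offsets()
print(offsets)       # [0, 4, 8, 12, 16, 20, 24, 28, 32]
print(offsets[-1])   # 32 months between the first and ninth interview
```

A nine-wave panel therefore follows a household for close to three years.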

The SIPP attempts to interview all members age 15 and older in the household during the first wave of interviewing. Subsequent interviews may be in-person or by phone, with the same interviewer speaking to the same respondents.

New members who join the household are interviewed after they join; departing members are interviewed at their new address.

Page 46: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Survey of Income and Program Participation

SIPP information falls into two categories: the core information, and other questions (found in "topical modules") that produce in-depth information on specific subjects and are asked at only one or two interviews.

SIPP core content covers demographic characteristics, work experience, earnings, program participation, transfer income, and asset income.

Page 47: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Current Population Survey

The Current Population Survey (CPS) is a monthly survey of about 50,000 to 65,000 households conducted by the Bureau of the Census for the Bureau of Labor Statistics. The survey has been conducted for more than 50 years.

The CPS is the primary source of information on the labor force characteristics of the U.S. population. The sample is scientifically selected to represent the civilian noninstitutional population.

Households are in the survey eight times: four consecutive months, eight months off, and then a final four months.
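This rotation pattern (often written 4-8-4) can be sketched in Python. The helper names and the January 2012 start date are hypothetical, chosen only to show which calendar months a household would be interviewed.

```python
def cps_months_in_sample(first_offset=0):
    """Month offsets at which a household is interviewed under the 4-8-4 pattern:
    four consecutive months in, eight months out, four consecutive months in."""
    first_spell = list(range(first_offset, first_offset + 4))
    second_spell = list(range(first_offset + 12, first_offset + 16))
    return first_spell + second_spell


def to_year_month(start_year, start_month, offset):
    """Convert a month offset into a (year, month) pair."""
    total = (start_month - 1) + offset
    return start_year + total // 12, total % 12 + 1


offsets = cps_months_in_sample()
schedule = [to_year_month(2012, 1, off) for off in offsets]  # hypothetical start: Jan 2012
print(offsets)
print(schedule)
```

Because the second spell starts exactly twelve months after the first, each household is interviewed in the same four calendar months in two consecutive years, which supports year-over-year comparisons for the same households.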

Estimates obtained from the CPS include employment, unemployment, earnings, hours of work, and other indicators. They are available by a variety of demographic characteristics including age, sex, race, marital status, and educational attainment. They are also available by occupation, industry, and class of worker.

Supplemental questions to produce estimates on a variety of topics including school enrollment, income, previous work experience, health, employee benefits, and work schedules are also often added to the regular CPS questionnaire.

Page 48: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Current Population Survey

Annual Social and Economic Supplement (ASEC) – (formerly called the Annual Demographic Survey or March Supplement)

Voting and Registration (November)
School Enrollment (October)
Food Security; every year since 1995
Computer Ownership
Fertility and Marital History
Fertility and Birth Expectations
Contingent Workers and Alternative Employment
Displaced Workers
Job Tenure and Occupational Mobility
Race and Ethnicity
Tobacco Use
Work Experience
Work Schedules

Page 49: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

American Housing Survey

Provides information on the size and composition of the housing inventory in the United States, neighborhood characteristics, characteristics of occupants, household characteristics, income, housing and neighborhood quality, housing costs, equipment and fuels, size of housing unit, and recent movers.

The AHS returns to the same housing units year after year to gather data; therefore, this survey is ideal for analyzing the flow of households through housing.

Sample of ~65,000

Collected for HUD

Separate national (fixed sample of ~50,000, followed since 1985) and metropolitan samples (~3,200-4,800 per area, every 6 years, 12-14 areas/year)

More detailed data, less geographic detail, than census

Page 50: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Consumer Expenditure Survey

The Consumer Expenditure Survey (CES) provides information on the buying habits of American consumers and also furnishes data to support periodic revisions of the Consumer Price Index. A new sample is drawn annually, and includes about 60,000 households.

The survey consists of two separate components: (1) a quarterly Interview Survey in which each consumer unit in the sample is interviewed every three months over a fifteen-month period, and (2) a Diary Survey completed by the sample consumer units for two consecutive one-week periods.

The quarterly interview gathers retrospective data on purchases, and focuses on regular and large expenses.

The Diary Survey contains consumer information on small, frequently-purchased items such as food, beverages, food consumed away from home, gasoline, housekeeping supplies, nonprescription drugs and medical supplies, and personal care products and services. Participants are asked to maintain expense records, or diaries, of all purchases made each day for two consecutive one-week periods.

Page 51: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Selected Other Data

National Crime Victimization Survey
– 48,000 addresses in 809 PSUs in US
– Operating since 1972
– 7 interviews over 3 ½ year period

National Corrections Reporting Program
– Prison admissions and discharges. Variables include incarceration history, current offenses, and total time served. Background information on individuals includes year of birth, sex, age, race, Hispanic origin, and educational attainment.

A variety of surveys for NCHS, e.g.
– National Health Interview Survey
– National Hospital Discharge Survey
– National Survey of Ambulatory Surgery

National Survey of College Graduates
– Baseline survey based on Census: 1993 from 1990 census, 2003 from 2000 census
– Follow-up surveys every 2 years (4 total per decade)

Page 52: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Census – Over time

One of the great strengths of Census collections is that their temporal span is quite wide.

Decennial Census
- Aggregate data from 1790 onward
- Microdata from 1850 onward

CPS
- Aggregate data from 1940s
- Microdata from 1962 onward

Page 53: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Census – Over time

However…
- Content/Questions change
- Geography changes

Lots of value-added resources to help address these issues….

Page 54: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Historical Census Geography

Census Tracts
– First created in 1910
– 8 cities tracted in 1910 and 1920
– By 1940, 60 cities tracted
– Substantial increase in tracting with advent of Metropolitan Areas in 1950
– Entire nation tracted/BNA’d by 1990
– Tracts can split/merge

Page 55: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Historical Census Geography

Places
– State-specific requirements for incorporation
– In 1950, CDPs introduced by Census (called unincorporated places).
– Increase in size due to annexations
– Increase in numbers due to incorporation
– Merging of places possible
– Between 1980 and 1990, 40% of places experienced some change in boundary

Page 56: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

In April…..

I’ve primarily talked about what’s available, not how to get it.

In April, I’ll talk about resources for online analysis and exploration, download resources, documentation, and local (Berkeley) resources…. And if you can’t wait, come by and visit me.

[email protected]

Page 57: Census Data for Researchers Thursday Feb 23, 2012 3:30-4:45 pm.

Geographic grain and Margin of Error