Units of Analysis The Basics Chuck Humphrey ACCOLEDS/DLI Training December, 2001


Mar 27, 2015

Page 1: Units of Analysis The Basics Chuck Humphrey ACCOLEDS/DLI Training December, 2001.

Units of Analysis

The Basics

Chuck Humphrey

ACCOLEDS/DLI Training

December, 2001

Page 2

Outline

An illustration

Definitions

Elements of the unit of analysis

Complexity

Data structure

Page 3

An Illustration

A group of students in an econometrics class were sent to the Data Library to find some data for an assignment.

Page 4

An Illustration

A typical request was like this one.

“I want to look at crime rates and a person’s level of education.”

Page 5

An Illustration

This request raises problems.

– crime rates are usually associated with spatial units or a time series

– a person’s education is an attribute of individuals

Page 6

An Illustration

What are we looking for?

Does the student want crime rates and the percentage of the population with certain education levels for specific cities? This would be data aggregated over geography.

Page 7

An Illustration

What are we looking for?

Does the student want the crime rate for one city over time, such as the number of homicides in Edmonton over the past 40 years? This would be data aggregated over time.

Page 8

An Illustration

What are we looking for?

Does the student want the education level of criminals? This would be a special subpopulation, individuals convicted of crimes, and the data would consist of a microdata file of criminals.

Page 9

An Illustration

What are we looking for?

Does the student want the education level of victims of crimes? This would be a special subpopulation, individuals who were victimized, and the data would consist of a microdata file of victims.

Page 10

An Illustration

Looking at crime rates and level of education can differ depending upon the unit of analysis.

• individuals

• geographic areas

• changes over time
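
To make the three units concrete, here is a minimal Python sketch. The records, field names, and values are invented for illustration and are not taken from any real crime file. The same hypothetical person-level records are summarized once by city and once by year, so the unit of analysis changes while the raw observations stay the same.

```python
from collections import defaultdict

# Hypothetical person-level records (unit: individuals). All values are invented.
people = [
    {"city": "Edmonton", "year": 2000, "victim": True},
    {"city": "Edmonton", "year": 2001, "victim": False},
    {"city": "Calgary", "year": 2000, "victim": True},
    {"city": "Calgary", "year": 2001, "victim": True},
]


def victim_rate_by(records, key):
    """Group individuals by `key`; each group becomes one unit of analysis."""
    totals = defaultdict(lambda: [0, 0])  # group -> [victims, persons]
    for rec in records:
        totals[rec[key]][0] += int(rec["victim"])
        totals[rec[key]][1] += 1
    return {group: victims / persons for group, (victims, persons) in totals.items()}


by_city = victim_rate_by(people, "city")  # unit: geographic areas
by_year = victim_rate_by(people, "year")  # unit: points in time
```

The individual records answer questions about persons, `by_city` answers questions about places, and `by_year` answers questions about change over time.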

Page 11

An Illustration

After being walked through these steps, the student chose to build a model predicting income on the basis of highest educational attainment and a few other variables from the Census individual-level public use microdata file.

He completely abandoned his interest in crime!

Page 12

An Illustration

Unfortunately, the student’s initial request not only failed to specify a clear unit of analysis but also mixed different units, which suggests that the concept was not understood.

Page 13

The Point of the Illustration

The unit of analysis is fundamental to the data reference interview. Early identification of the unit of analysis will help focus a search on statistics, aggregate data, or microdata.

Page 14

The Point of the Illustration

Furthermore, the unit of analysis is fundamental to secondary data analysis. It may be that knowledge of the unit of analysis is even more crucial in secondary analysis than in primary analysis, where the unit is implicit in the sample design, if not otherwise explicit.

Page 15

The Point of the Illustration

Finally, the unit of analysis is a fundamental characteristic of statistical data structures, which are the formal ways in which data are organized for processing.

Page 16

Definitions

The unit of analysis is the basic entity or object

– about which generalizations are to be made based on an analysis, and

– for which data have been collected.

Page 17

Definitions

How does the unit of analysis relate to the unit of observation?

The unit of observation is the entity in primary research that is observed and about which information is systematically collected.

Page 18

Definitions

The unit of observation and the unit of analysis are the same when the generalizations being made from a statistical analysis are attributed to the unit of observation.

Page 19

Definitions

Unit of Observation

– in original data collections, the unit of observation is determined by the method by which observations are selected

Unit of Analysis

– the unit of analysis is determined by an interest in exploring or explaining a specific phenomenon
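
The distinction between the two units can be sketched in a few lines of Python. This is an illustration under stated assumptions: the survey, the household identifiers, and the incomes are all hypothetical. Persons are the unit of observation; the analysis can either stay at the person level or regroup the same observations so that households become the unit of analysis.

```python
# Hypothetical survey in which persons were observed (unit of observation).
persons = [
    {"household": "H1", "income": 30000},
    {"household": "H1", "income": 20000},
    {"household": "H2", "income": 45000},
]

# Same unit: the generalization is about persons.
mean_person_income = sum(p["income"] for p in persons) / len(persons)

# Different unit: regroup the observations so each household is one case.
household_income = {}
for p in persons:
    household_income[p["household"]] = household_income.get(p["household"], 0) + p["income"]

mean_household_income = sum(household_income.values()) / len(household_income)
```

The two means differ because they describe different entities, even though they are computed from the same observations.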

Page 20

Identifying a Unit of Analysis

As hinted at in the earlier illustration, the unit of analysis is shaped by three attributes:

– Social Phenomena

– Time

– Space

Page 21

Research Outputs

Let’s begin by looking at a finished product to display these attributes.

We’ll use a table from the Health Indicators Database about suicide.

Page 22

Social Characteristics

Geography and Time held constant

Page 23

Ordered by Time

Geography and Age held constant

Page 24

Geography Emphasized

Time and Age held constant

Page 25

Social Phenomena

observations of a single social entity, such as a person or an institution

observations of multiple entities with a defined relationship, such as family, employer-employee

Page 26

Social Phenomena

transactional observations that are the result of actions among entities, such as labour strikes or international conflicts, including wars

Page 27

Time

observations made at one point in time; commonly referred to as a cross-sectional study

Page 28

Time

observations made at multiple points in time

– the data may be organized by time; commonly referred to as a time series

– time may structure some form of repeated measures of content or subjects

Page 29

Space

observations made within a specific spatial area

observations made within a hierarchy of spatial areas

Page 30

Complexity

Complexity occurs when multiple types of entities are introduced within the same study.

Examples

– parent, child, teacher

– person, activity, time

– person, car, trips

Page 31

Complexity

This complexity can arise within one of the attributes just discussed

– a study of parents, children, and teachers, which are all social units

or between attributes

– a study of people, their daily activities, and the length of time of each activity

Page 32

Complexity

Complexity is often represented in a hierarchy when the units can be grouped or nested within one another. For example, children may be grouped with their parents.

Page 33

Complexity

Children grouped (nested) with Parents.

Parent 1 Parent 2

Child 1 Child 2 Child 3

Page 34

Complexity

Parents and their children may be grouped into families and families grouped into households.

Household 1
    Family A
        Person i
        Person ii

Household 2
    Family A
        Person i
        Person ii
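
The nesting shown on this slide can be sketched as a nested Python structure. The labels come from the slide; the `flatten` helper is a hypothetical name introduced here. It walks the hierarchy and produces one row per person, with each row carrying the identifiers of the family and household that contain it.

```python
# Households contain families, and families contain persons (labels from the slide).
households = {
    "Household 1": {"Family A": ["Person i", "Person ii"]},
    "Household 2": {"Family A": ["Person i", "Person ii"]},
}


def flatten(nested):
    """Yield one (household, family, person) row per person."""
    for household, families in nested.items():
        for family, members in families.items():
            for person in members:
                yield (household, family, person)


person_rows = list(flatten(households))
```

Note that "Family A" and "Person i" are only unique within their parent unit, so a person is identified by the full combination of household, family, and person labels.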

Page 35

Complexity

Complexity may also be represented by combinations of entities among units. Entities that are associated with one another are combined; those that are not associated are not.

Page 36

Complexity

These combinations are often described as having been crossed. For example, activities may be crossed with people.

Page 37

Complexity

Activities crossed with people.

Activities: Activity 1, Activity 2, Activity 3, Activity 4, Activity 5, Activity 6

X

People: Person A, Person B

=

Person A: Activity 3, Activity 6

Person B: Activity 1, Activity 5
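
The crossing on this slide can be sketched in Python as follows. The people, activities, and associations are those shown on the slide; the variable names are illustrative. The full cross product lists every possible person and activity pair, while the crossed units are only the pairs that are actually associated.

```python
from itertools import product

people = ["Person A", "Person B"]
activities = [f"Activity {n}" for n in range(1, 7)]

# Associations shown on the slide: which person did which activity.
associations = {
    ("Person A", "Activity 3"),
    ("Person A", "Activity 6"),
    ("Person B", "Activity 1"),
    ("Person B", "Activity 5"),
}

# Every possible person-activity combination.
all_pairs = list(product(people, activities))

# Only associated entities are combined into a person-activity unit.
crossed = [pair for pair in all_pairs if pair in associations]
```

Of the twelve possible combinations, only four become cases in the crossed data.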

Page 38

Complexity

Up to this point, complexity has been described conceptually. We’ve mentioned how multiple units of analysis and the ways in which they are related can create complexity.

Page 39

Complexity

Complexity also manifests itself structurally through the ways in which data are organized to represent the nesting or crossing of multiple units of analysis.

Page 40

Thinking about Units of Analysis

Conceptually

– What is the content? This is what we’ve been reviewing up to this point.

Structurally

– How is it organized? This takes us to a discussion about data structure.

Page 41

Let’s review basic data structure.

The unit of analysis defines the underlying structure of a data file.

Statistical Data Structure

Page 42

This structure consists of a series of rows, with each row containing the data of one member of the unit of analysis.

This simple structure is known as the flat, rectangular data matrix.

Statistical Data Structure
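
A flat, rectangular data matrix can be sketched in Python as a list of rows. The variable names and values below are hypothetical. Each row is one case, and "rectangular" means every row has a value slot for every variable.

```python
# Hypothetical variables (columns) and cases (rows).
variables = ["id", "age", "sex"]
cases = [
    [1, 34, "F"],
    [2, 51, "M"],
    [3, 27, "F"],
]

# Rectangular: every case has exactly one slot per variable.
is_rectangular = all(len(row) == len(variables) for row in cases)

# A variable is read by taking the same position from every row.
age_position = variables.index("age")
ages = [row[age_position] for row in cases]
```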

Page 43

Case 1

Case 2

Case 3

*

*

*

Case n-1

Case n

Statistical Data Structure

Page 44

All of the information collected for each member of the unit of analysis is organized in fixed locations in the file, called fields or variables.

Statistical Data Structure
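
Reading fields from fixed locations can be sketched as below. The layout, the column positions, and the three raw lines are assumptions made up for this example; a real file's positions come from its record layout or codebook. `parse_record` is a hypothetical helper name.

```python
# Hypothetical record layout: field name -> (start, end) column positions,
# zero-based with the end position exclusive.
LAYOUT = {"RECID": (0, 5), "AGE": (5, 7), "SEX": (7, 8)}

# Hypothetical raw lines, one case per line.
raw_lines = ["00001341", "00002512", "00003271"]


def parse_record(line, layout=LAYOUT):
    """Slice each field out of its fixed location in the line."""
    return {name: line[start:end] for name, (start, end) in layout.items()}


records = [parse_record(line) for line in raw_lines]
```

Because every case stores a field in the same columns, one layout is enough to read the whole file.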

Page 45

            Field 1   Field 2   Field 3   * * *   Field k-1   Field k
Case 1
Case 2
Case 3
*
*
*
Case n-1
Case n

Statistical Data Structure


Page 47

This structure looks like the grid of a spreadsheet. However, there is one very important difference between a statistical data structure and a spreadsheet.

Statistical Data Structure

Page 48

The spreadsheet is organized around individual cells, while the statistical data structure is organized around the rows.

Statistical Data Structure
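
The difference in organization can be sketched in Python. Both structures below hold the same hypothetical values. In the spreadsheet view each value is addressed by its own cell reference; in the statistical view each row is a complete case and an analysis runs over the rows.

```python
# Spreadsheet view: every value is addressed cell by cell (hypothetical contents).
sheet = {"A1": "id", "B1": "age", "A2": 1, "B2": 34, "A3": 2, "B3": 51}
cell_b2 = sheet["B2"]

# Statistical view: each row is one complete case of the unit of analysis.
rows = [{"id": 1, "age": 34}, {"id": 2, "age": 51}]
mean_age = sum(row["age"] for row in rows) / len(rows)
```

Nothing in `sheet` says that B2 and A2 describe the same case; in `rows` that relationship is built into the structure.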

Page 49

Spreadsheet

Statistical Data Structure

Page 50

Cell B2

Cell E3

Cell C5

Cell F7

Spreadsheet

Statistical Data Structure

Page 51

Statistical Data Structure

Row 1

Row 3

Row k-1

Statistical Data Structure

Page 52

The next slide presents the way that this simple statistical data structure appears in SPSS.

Statistical Data Structure

Page 53

[Figure: the data matrix as displayed in SPSS. Successive slides (pages 54 to 57) highlight Row 1, Row 8, Row 15, and Field 8.]

Page 58

00001 1698957146206912669121413072202511
00002 2122943624103005230120703022303521
00003  617378410203706337121406032202511
00004 1519625424202804228069797974410620
00005 1695875212202303123521003022403121
00006 1737832824203806338649797971407550
00007  884349547103005230320703022403521
00008  760621824203005230069797971101570
00009 5814763024102604226369797973310620
00010 1234850712204407344949797972212570

Person: GSS 10 Main

Fields: RECID, WGHTFNL, PROV, DVSEX, DVAGECAP

Page 60

Adding Complexity to Data

Structurally

– hierarchical: order & different record layouts for different units of analysis

– relational: 1 to n relations

– compound records: combination of units represented on each record

Page 61

Complex Data Structure

Household 1
    Person 1
    Person 2

Household 2

Household 3
    Person 1
    Person 2
    Person 3

Hierarchical Data Structure
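
A hierarchical file of this kind can be read with a short Python sketch. The file format is an assumption made for illustration: each line starts with a record type ("H" for household, "P" for person), the two record types have different layouts, and person records belong to the household record that precedes them. The contents mirror the diagram above: two persons in Household 1, none in Household 2, three in Household 3.

```python
# Hypothetical hierarchical file: 'H' = household record, 'P' = person record.
# Person records belong to the most recent household record above them.
lines = [
    "H 1 owner",
    "P 1 34",
    "P 2 36",
    "H 2 renter",
    "H 3 owner",
    "P 1 61",
    "P 2 59",
    "P 3 22",
]


def read_hierarchical(lines):
    """Attach each person record to the household record that precedes it."""
    households = []
    for line in lines:
        rectype, ident, value = line.split()
        if rectype == "H":
            households.append({"id": int(ident), "tenure": value, "persons": []})
        elif rectype == "P":
            households[-1]["persons"].append({"id": int(ident), "age": int(value)})
        else:
            raise ValueError(f"unknown record type: {rectype!r}")
    return households


households = read_hierarchical(lines)
persons_per_household = [len(h["persons"]) for h in households]
```

The order of the records carries the relationship, which is why such a file cannot be sorted or subset line by line without losing information.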

Page 62

4600000000000 000000 00000000
4600100000000 000000 00000000
4600100105024RM 024000 5010 82090000
4600100205024RM 024000 5010 82090000
4600100305024RM 024000 5010 82090000
4600100405027T 024000 5010 82090410
4600100505027T 024000 5010 82090410
4600100605027T 024000 5010 82090410
4600100705031RM 031000 5011 82100000
4600100805031RM 031000 5011 82100000

Geography: 1991 Census N9101 Population 15 years and over by age groups (17) and marital status (6a), showing labour force activity (8) and sex (3)

Fields: PROV, FED, EA, CD, CSD, CSD Type, CCS, CMA/CA

Page 65

Complex Data Structure

Relational Data Structure

One to Many

R1    R1 C1, R1 C2, R1 C3, R1 C4
R2
R3    R3 C1, R3 C2
R4    R4 C1
R5    R5 C1, R5 C2

Page 66

0000111169122244421472240699799799779979
0000211130112194420772190699799799779979
0000311137122934421472930699799799779979
0000511123522094421072090699799799779979
0000611338622804410472800199999736019999
0000711130312034420772030699799799779979
0000831330021854421079970723099799780459
0000832330022353310979970726399799720289
0001011344921934420921930128799729350949
0001032344923202220879970736399799720439

Person: GSS 10 Union

Fields: RECID, UNIONTYP, UNIONRNK

Page 68

GSS 10 Main

00001 16989571462069
00002 21229436241030
00003  6173784102037
00004 15196254242028
00005 16958752122023
00006 17378328242038
00007  8843495471030
00008  7606218242030
00009 58147630241026
00010 12348507122044

GSS 10 Union

00001111691222444214
00002111301121944207
00003111371229344214
00005111235220944210
00006113386228044104
00007111303120344207
00008313300218544210
00008323300223533109
00010113449219344209
00010323449232022208
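
The one-to-many link between the two files can be sketched in Python. The identifiers below are read from the listings on this slide, on the assumption that the first five characters of every record are RECID; everything else about the records is left out. Each Main record is matched to zero, one, or several Union records that share its RECID.

```python
from collections import Counter

# RECID values as they appear in the two listings (assumed to be the first
# five characters of each record).
main_recids = [f"{n:05d}" for n in range(1, 11)]
union_recids = [
    "00001", "00002", "00003", "00005", "00006",
    "00007", "00008", "00008", "00010", "00010",
]

# One person (Main) relates to many unions (Union): count the matches per person.
union_counts = Counter(union_recids)
unions_per_person = {recid: union_counts.get(recid, 0) for recid in main_recids}

no_union = [recid for recid, n in unions_per_person.items() if n == 0]
multiple_unions = [recid for recid, n in unions_per_person.items() if n > 1]
```

In this listing, persons 00004 and 00009 have no Union record, while persons 00008 and 00010 each have two.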

Page 70

Complex Data Structure

Compound Data Structure

R1 x T1 x A1

R1 x T2 x A4

R1 x T3 x A7

R1 x T4 x A3

R1 x T4 x A1

R2 x T1 x A2

R2 x T2 x A9
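
A compound structure like the one above can be sketched as a list of tuples in Python. The combinations are those shown on the slide. Reading R, T, and A as respondent, time slot, and activity is an assumption; the slide does not spell the letters out.

```python
from collections import Counter

# Compound records from the slide; each record combines three units.
# Assumed meaning: R = respondent, T = time slot, A = activity.
compound = [
    ("R1", "T1", "A1"),
    ("R1", "T2", "A4"),
    ("R1", "T3", "A7"),
    ("R1", "T4", "A3"),
    ("R1", "T4", "A1"),
    ("R2", "T1", "A2"),
    ("R2", "T2", "A9"),
]

# Number of compound records contributed by each R.
records_per_r = Counter(r for r, _, _ in compound)

# R and T combinations that appear on more than one record.
slot_counts = Counter((r, t) for r, t, _ in compound)
repeated_slots = [slot for slot, n in slot_counts.items() if n > 1]
```

Each record is meaningful only as the whole combination: R1 appears on five records, and the pair R1 and T4 appears twice with different A values.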

Page 71

000041144504000800024010000000012518733
000041144308000900006011222220012518733
000041141709000930003031222220012518733
000041141709301100009031222220012518733
000041141211001330015011222220012518733
000041149113301630018011222220012518733
000041141216301800009011222220012518733
000041143018002000012031222220012518733
000041147920002015001541222220012518733
000041143720152130007531222220012518733

GSS 2 Episode

Fields: SEQNUM, DDAY, NO_EPISO, ACT_CODE