Crosstabs & Measures of Association POL242 October 9 and 11, 2012 Jennifer Hove

Mar 31, 2015

Page 1: Crosstabs & Measures of Association

POL242

October 9 and 11, 2012

Jennifer Hove

Page 2: Questions of Causality

Recall:

Most causal thinking in the social sciences is probabilistic, not deterministic: as X increases, the probability of Y increases, not that X invariably produces Y

We can observe only association, per Hume
We must therefore infer causation
Not one, but many possible causes

Page 3: Inferring Causal Relations

1. There must be association
X → Y; ~X → ~Y

2. Time order must be considered
Presumed cause should precede presumed effect

3. Must rule out possible rival explanations
Sometimes what appears to be a strong relationship between two variables is due to the influence of others

4. Must be able to identify the process by which one factor brings about change in another
Causal linkage

Page 4: Establishing Association

With nominal or ordinal data, relationships are usually presented in tabular form
Why? Hypotheses rest on the core idea of comparison
Ex: if we compare respondents on the basis of their value on the IV, say party identification, they should also differ along the DV, say support for gay rights

Crosstabs are a wonderful means of making comparisons

“God speaks to you through crosstabs!”

Page 5: Using/Interpreting Crosstabs

Data arranged in side-by-side frequency distributions
IV (X) presented across the top of the table, in columns
If ordinal, arrange from low scores (on left) to high scores (on right)
DV (Y) presented down the left-hand side of the table, in rows
Again, if ordinal, arrange from low (at top) to high (at bottom)

Table 1: Support for the Afghan Mission by Perceived Impact of Taliban Resurgence, 2007

                              Fear of Taliban Resurgence
Support for Afghan Mission    Low            High           All Respondents
Low                           86.1% (173)    52.7% (355)    60.4% (528)
High                          13.9 (28)      47.3 (318)     39.6 (346)
Total (N)                     100 (201)      100 (673)      100 (874)

Tau-b = .29
Source: Strategic Counsel, CTV/Globe and Mail Survey, July 2007

Page 6: Using/Interpreting Crosstabs

Data presented so that categories of the IV add to 100%
Percentaging within categories of the IV (down in a table)
Comparisons are made across categories of the IV
From left to right
To see the effect of the IV on the DV

Page 7: Rules (!) of Crosstabs

1. Make the IV define the columns and the DV define the rows of the table

2. Always percentage down within categories of the IV

3. Interpret the relationship by comparing across columns, within rows of the table
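
The three rules can be sketched in a few lines of Python. This is only an illustration using the cell counts shown in Table 1; the dictionary layout and the function name `column_percentages` are assumptions made for the example, not part of the lecture.

```python
# Cell counts from Table 1. Rows are the DV (support for the mission),
# columns are the IV (fear of Taliban resurgence).
# The names used here are illustrative only.
counts = {
    "Low support":  {"Low fear": 173, "High fear": 355},
    "High support": {"Low fear": 28,  "High fear": 318},
}

def column_percentages(table):
    """Percentage down: each cell as a share of its column (IV category) total."""
    columns = list(next(iter(table.values())).keys())
    totals = {c: sum(row[c] for row in table.values()) for c in columns}
    return {
        r: {c: round(100 * row[c] / totals[c], 1) for c in columns}
        for r, row in table.items()
    }

pct = column_percentages(counts)
for row_label, row in pct.items():
    # Read across a row to compare the IV categories (rule 3).
    print(row_label, row)
```

Reading across the "Low support" row, the share drops from 86.1% among respondents with low fear to 52.7% among those with high fear, which is the comparison rule 3 asks for.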

Page 8: Example: 2 x 2 Crosstab

Support for Y Variable by Support for X Variable

                       Score on X Variable
Score on Y Variable    Low      High     Total
Low                    A        B        A + B
High                   C        D        C + D
Total                  A + C    B + D

Page 9: Diagonals

Main diagonal: running to the right and down
When a larger proportion of cases falls on the main diagonal, the relationship is said to be direct or positive
Low values on X associated with low values on Y; high values on X associated with high values on Y

                       Score on X Variable
Score on Y Variable    Low      High     Total
Low                    A        B        A + B
High                   C        D        C + D
Total                  A + C    B + D

Page 10: Diagonals

Off diagonal: running to the right and up
When a larger proportion of cases falls on the off diagonal, the relationship is said to be inverse or negative
Low values on X associated with high values on Y; high values on X associated with low values on Y

                       Score on X Variable
Score on Y Variable    Low      High     Total
Low                    A        B        A + B
High                   C        D        C + D
Total                  A + C    B + D

Page 11: Explaining Variation in Y

Relationships between variables in the social sciences are rarely, if ever, perfectly predictable
You are unlikely to see something like this:

Support for Y Variable by Support for X Variable

                       Score on X Variable
Score on Y Variable    Low     High
Low                    100%    0
High                   0       100%
Total                  100     100

Page 12: Explaining Variation in Y

There is likely to be more than one explanation or “cause” behind the variation in Y
So we will generally be looking at:

X1 → Y

X2 → Y

To compare, we want to know the relative strength of each relationship
A variety of summary terms called measures of association are used

Page 13: Measures of Association

Compress information that appears in a crosstab into a single number by summarizing:
Magnitude (strength) of the relationship
Direction of the relationship

Magnitude: ranges from 0 (completely unpredictable) to 1 (perfectly predictable)

Direction: positive (+) = cases primarily on main diagonal; negative (-) = cases primarily on off diagonal

Page 14: Two Cautionary Notes

Direction is not useful with nominal-level variables, since they are not ordered/ranked from low to high
Even with ordinal measurement, interpretation of direction depends entirely on how your variables are coded
You should always code your variables so that high scores indicate “more” of what you want to explain

Page 15: Direction & Strength

Combining direction & strength, we get a range of possibilities

-1.0 -.8 -.6 -.4 -.2 0 +.2 +.4 +.6 +.8 +1.0

All intermediary values can also occur, e.g. -.2367
Note that equivalent positive and negative scores are equal in strength
Ex: +.4 and -.4 are equal in strength; they differ only in direction

Page 16: Choosing among Measures

We use different measures of association for 2 main reasons:

1. There are different levels of measurement
Ordinal measurement offers ranking information used to calculate association, which isn’t available with nominal data

2. Some measures are specific to tables of certain sizes and shapes
Specific measures for 2 x 2 tables; others for larger square tables; still others for rectangular tables

Page 17: Phi Φ

Use with dichotomous variables, 2 x 2 tables
Applies to nominal and ordinal data
Measures the strength of a relationship by taking the # of cases on the main diagonal minus the # of cases on the off diagonal (adjusting for marginal distribution of cases, i.e. the sum of the columns and rows)

$$\phi = \frac{AD - BC}{\sqrt{(A+B)(C+D)(A+C)(B+D)}}$$
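
As a quick check of the formula, the sketch below applies it to the four cell counts of Table 1, laid out as A, B, C, D in the generic 2 x 2 table. The function name `phi` and its argument order are choices made for this example.

```python
from math import sqrt

def phi(a, b, c, d):
    """Phi for a 2 x 2 table laid out as [[A, B], [C, D]] (rows = Y, columns = X)."""
    # Main-diagonal product minus off-diagonal product, scaled by the marginals.
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denominator

# Table 1 counts: A = 173, B = 355, C = 28, D = 318
phi_table1 = phi(173, 355, 28, 318)
print(round(phi_table1, 2))  # 0.29
```

The result rounds to .29, the same value reported as Tau-b under Table 1; in a 2 x 2 table the two measures coincide.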

Page 18: 2 Examples: Phi Φ

6.

2.

                       Score on X Variable
Score on Y Variable    Low     High
Low                    75%     10%
High                   25%     90%
Total                  100     100

                       Score on X Variable
Score on Y Variable    Low     High
Low                    50%     20%
High                   50%     80%
Total                  100     100

Page 19: Cramer’s V

An extension of Phi
Logic of Cramer’s V is based on percentage differences across the columns, not on logic of diagonals
Use with nominal data, when tables are larger than 2 x 2
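
The slide does not give a formula for Cramer’s V. The sketch below assumes the usual textbook definition, V = sqrt(chi-square / (N * (k - 1))), where k is the smaller of the number of rows and columns. The function name `cramers_v` and the list-of-rows layout are illustrative.

```python
from math import sqrt

def cramers_v(table):
    """Cramer's V for a crosstab given as a list of rows of cell counts.

    Assumes the standard chi-square-based definition; every row and
    column must contain at least one case.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0]))
    return sqrt(chi2 / (n * (k - 1)))

# Table 1 counts (a 2 x 2 table), used here only to compare with phi.
v_table1 = cramers_v([[173, 355], [28, 318]])
print(round(v_table1, 2))  # 0.29
```

On a 2 x 2 table V reduces to the absolute value of Phi, which is one way to read "an extension of Phi". Unlike Phi, V carries no sign.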

Page 20: Lambda

Lambda (λ) is another measure of association for nominal data
Its rationale of “percentage of improvement” or “proportional reduction in error” is relatively easy to explain
Not recommended in this course
When the modal category of each column is in the same row, λ=0
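
The λ=0 problem can be seen with a small sketch. It assumes the standard proportional-reduction-in-error form of Lambda, with the row variable (DV) predicted from the column variable (IV); the function name and the second table are hypothetical.

```python
def goodman_kruskal_lambda(table):
    """Lambda for predicting the row variable from the column variable.

    `table` is a list of rows of cell counts.
    """
    n = sum(sum(row) for row in table)
    # Errors made when always guessing the overall modal row.
    errors_without_iv = n - max(sum(row) for row in table)
    # Errors made when guessing the modal row within each column.
    errors_with_iv = n - sum(max(col) for col in zip(*table))
    return (errors_without_iv - errors_with_iv) / errors_without_iv

# Table 1: the modal category of both columns is the "Low" row.
lambda_table1 = goodman_kruskal_lambda([[173, 355], [28, 318]])
print(lambda_table1)  # 0.0

# Hypothetical counts where the modal row differs by column.
lambda_other = goodman_kruskal_lambda([[75, 10], [25, 90]])
print(round(lambda_other, 2))  # 0.59
```

Table 1 shows a clear relationship by Phi and Tau-b (.29), yet Lambda is 0 because knowing the IV never changes the best guess.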

Page 21: Measures of Association: Ordinal Data

Measures include Tau-b, Tau-c and Gamma
Rely on analysis of diagonals

                 Support for X
Support for Y    Low    Med    High
Low              a      b      c
Med              d      e      f
High             g      h      i

Page 24: Mind your Ps and Qs

The letter P indicates the # of pairs of cases on the main diagonals (from left to right)
The letter Q indicates the # of pairs of cases on the off diagonals (from right to left)
If P > Q, we have a positive association
If P < Q, we have a negative association
The core calculation = P - Q
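
A minimal sketch of the pair counting, assuming P is read as pairs ordered the same way on both variables (down and to the right of a cell) and Q as pairs ordered in opposite ways (down and to the left). The function name `count_pairs` is illustrative; the counts are those of Table 2.

```python
def count_pairs(table):
    """Return (P, Q) for a crosstab with rows and columns ordered low to high.

    `table` is a list of rows of cell counts.
    """
    p = q = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):       # rows below cell (i, j)
                for l in range(n_cols):
                    if l > j:                    # below and to the right
                        p += table[i][j] * table[k][l]
                    elif l < j:                  # below and to the left
                        q += table[i][j] * table[k][l]
    return p, q

# Table 2 counts: rows = Disapprove, Approve; columns = Very Bad ... Very Good
chavez = [[26, 64, 171, 110],
          [178, 217, 223, 52]]
p, q = count_pairs(chavez)
print(p, q, p - q)  # 39284 146917 -107633
```

Here P < Q, so the association is negative, in line with the negative Tau values reported under Table 2.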

Page 25: Gamma

The information of P and Q can be used to calculate Gamma (γ)

$$\gamma = \frac{P - Q}{P + Q} = \frac{P}{P + Q} - \frac{Q}{P + Q}$$

Problems:
Any vacant cell produces a score of 1.0
Tends to overstate strength of a relationship
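
Both problems show up in a short sketch for the 2 x 2 case, where P = A*D and Q = B*C. The function name `gamma_2x2` and the second set of counts (with one emptied cell) are hypothetical.

```python
def gamma_2x2(a, b, c, d):
    """Gamma for a 2 x 2 table [[A, B], [C, D]], where P = A*D and Q = B*C."""
    p = a * d
    q = b * c
    return (p - q) / (p + q)

# Table 1 counts: Gamma is far higher than the Tau-b of .29 reported there.
gamma_table1 = gamma_2x2(173, 355, 28, 318)
print(round(gamma_table1, 2))  # 0.69

# Hypothetical counts with one vacant cell: Q is 0, so Gamma is 1.0.
gamma_vacant = gamma_2x2(40, 30, 0, 30)
print(gamma_vacant)  # 1.0
```

Because Gamma ignores tied pairs, a single empty cell is enough to push it to its maximum in a 2 x 2 table.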

Page 26: Tau-b and Tau-c

Preferable to Gamma, though built on the same logic of diagonals
Tends to produce results similar to phi (using nominal data) or the most important interval measure (r), to be discussed later in the year

$$\text{Tau-b} = \frac{P - Q}{\sqrt{(P + Q + X)(P + Q + Y)}}$$

Page 27: Tau-b and Tau-c

Tau-b never quite reaches 1.0 in non-square tables
So Tau-c was developed to use with rectangular tables
In practice, the difference between Tau-b and Tau-c when applied to the same table is not great, but keep the distinction above in mind
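
The sketch below computes both measures from a table of counts. It assumes the standard definitions: the X and Y terms in the Tau-b denominator are taken to be the pairs tied on one variable only, and Tau-c is taken as 2m(P - Q) / (N²(m - 1)), with m the smaller of the number of rows and columns. The slides do not spell out either definition, so treat the function `kendall_taus` as an illustration.

```python
from math import sqrt

def kendall_taus(table):
    """Return (tau_b, tau_c) for an ordered crosstab given as a list of rows."""
    n_rows, n_cols = len(table), len(table[0])
    p = q = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    if l > j:
                        p += table[i][j] * table[k][l]   # same order on both variables
                    elif l < j:
                        q += table[i][j] * table[k][l]   # opposite order
    n = sum(sum(row) for row in table)
    total_pairs = n * (n - 1) // 2
    # Pairs tied on the row variable and on the column variable.
    tied_rows = sum(t * (t - 1) // 2 for t in (sum(row) for row in table))
    tied_cols = sum(t * (t - 1) // 2 for t in (sum(col) for col in zip(*table)))
    # total_pairs - tied_rows equals P + Q + pairs tied on the column variable only,
    # and likewise for total_pairs - tied_cols.
    tau_b = (p - q) / sqrt((total_pairs - tied_rows) * (total_pairs - tied_cols))
    m = min(n_rows, n_cols)
    tau_c = 2 * m * (p - q) / (n ** 2 * (m - 1))
    return tau_b, tau_c

# Table 2 counts: rows = Disapprove, Approve; columns = Very Bad ... Very Good
tau_b, tau_c = kendall_taus([[26, 64, 171, 110],
                             [178, 217, 223, 52]])
print(round(tau_b, 3), round(tau_c, 3))
```

On the Table 2 counts this gives a Tau-b of about -.35, matching the reported value, and a Tau-c of about -.40 against the reported -.39. The small gap may come from weighting or rounding in the original output, which cannot be checked from the table alone.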

Page 28: Example

Table 2: Approval of President Chavez by Opinion of the United States, 2007

                      Opinion of the United States
Approval of Chavez    Very Bad       Bad            Good           Very Good      All Respondents
Disapprove            12.7% (26)     22.8% (64)     43.4% (171)    67.9% (110)    35.6% (371)
Approve               87.3 (178)     77.2 (217)     56.6 (223)     32.1 (52)      64.4 (670)
Total (N)             100 (204)      100 (281)      100 (394)      100 (162)      100 (1041)

Tau-c: -.39   Tau-b: -.35
Source: Latinobarometer, 2007 – Venezuelan respondents only

Page 29: Summing Up

With nominal data, use Phi or Cramer’s V
Phi used for 2 x 2 tables
Cramer’s V used for any other crosstab involving nominal data
Avoid Lambda

With ordinal data, use Tau-c or Tau-b
Tau-b used for square tables: 3 x 3, 4 x 4, etc.
Tau-c used for rectangular tables
Avoid Gamma