Joint Distribution

Apr 03, 2018

Transcript
  • 7/28/2019 Joint Distribution

    1/19

    Two Random Variables

    W&W, Chapter 5

  • 7/28/2019 Joint Distribution

    2/19

    Joint Distributions

    So far we have been talking about the probability of a single variable, or of a
    variable conditional on another. We often want to determine the joint probability
    of two variables, such as X and Y.

    Suppose we are able to determine the following

    information for education (X) and age (Y) for

    all U.S. citizens based on the census.

  • 7/28/2019 Joint Distribution

    3/19

    Joint Distributions

    Education       X    Age 25-35    Age 35-55    Age 55-100
                         (Y = 30)     (Y = 45)     (Y = 70)
    None            0    .01          .02          .05
    Primary         1    .03          .06          .10
    Secondary       2    .18          .21          .15
    College         3    .07          .08          .04

  • 7/28/2019 Joint Distribution

    4/19

    Joint Distributions

    Each cell is the relative frequency (f/N).

    We can define the joint probability distribution as:

    p(x,y) = Pr(X=x and Y=y)

    Example: what is the probability of getting a 30-year-old college graduate?

  • 7/28/2019 Joint Distribution

    5/19

    Joint Distributions

    p(3,30) = Pr(X=3 and Y=30) = .07

    We can see that:

    $p(x) = \sum_y p(x,y)$

    p(x=1) = .03 + .06 + .10 = .19

  • 7/28/2019 Joint Distribution

    6/19

    Marginal Probability

    We call this the marginal probability because it is calculated by summing
    across rows or columns and is thus reported in the margins of the table.

    We can calculate this for our entire table.

  • 7/28/2019 Joint Distribution

    7/19

    Marginal Probability Distribution

    Education       X    Y = 30    Y = 45    Y = 70    p(x)
    None            0    .01       .02       .05       .08
    Primary         1    .03       .06       .10       .19
    Secondary       2    .18       .21       .15       .54
    College         3    .07       .08       .04       .19
    p(y)                 .29       .37       .34       1
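
The margins of the table can be reproduced with a few lines of Python. This is only a sketch: the dictionary layout and the names `joint` and `marginals` are illustrative choices, not something the slides prescribe.

```python
# Joint probabilities p(x, y) from the education/age table.
# Keys are (education code x, age value y).
joint = {
    (0, 30): 0.01, (0, 45): 0.02, (0, 70): 0.05,
    (1, 30): 0.03, (1, 45): 0.06, (1, 70): 0.10,
    (2, 30): 0.18, (2, 45): 0.21, (2, 70): 0.15,
    (3, 30): 0.07, (3, 45): 0.08, (3, 70): 0.04,
}

def marginals(joint):
    """Return (p_x, p_y): the row sums and column sums of a joint table."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p  # sum over y for each x
        p_y[y] = p_y.get(y, 0.0) + p  # sum over x for each y
    return p_x, p_y

p_x, p_y = marginals(joint)
for x in sorted(p_x):
    print(f"p(x={x}) = {p_x[x]:.2f}")
for y in sorted(p_y):
    print(f"p(y={y}) = {p_y[y]:.2f}")
```

The printed values should match the p(x) column and the p(y) row, and each set of marginals should sum to 1.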

  • 7/28/2019 Joint Distribution

    8/19

    Independence

    Two random variables X and Y are independent if the events (X=x) and
    (Y=y) are independent, or:

    p(x,y) = p(x)p(y) for all x and y

    Note that this parallels the definition for events: event E is
    independent of F if:

    Pr(E and F) = Pr(E)Pr(F)    (Eq. 3-21)

  • 7/28/2019 Joint Distribution

    9/19

    Example

    Are education and age independent? Start with the upper left hand cell:

    p(x,y) = .01
    p(x) = .08
    p(y) = .29

    We can see they are not independent because (.08)(.29) = .0232, which is not
    equal to .01.
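
One failing cell is enough to rule out independence, but the same comparison can be run over every cell. The sketch below assumes the table is stored as a dictionary with one entry per (x, y) combination; the name `is_independent` and the tolerance are illustrative.

```python
import math

# Joint probabilities p(x, y) from the education/age table.
joint = {
    (0, 30): 0.01, (0, 45): 0.02, (0, 70): 0.05,
    (1, 30): 0.03, (1, 45): 0.06, (1, 70): 0.10,
    (2, 30): 0.18, (2, 45): 0.21, (2, 70): 0.15,
    (3, 30): 0.07, (3, 45): 0.08, (3, 70): 0.04,
}

def is_independent(joint, tol=1e-9):
    """True if p(x, y) equals p(x) * p(y) in every cell, within tol.

    Assumes every (x, y) combination is present as a key.
    """
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return all(
        math.isclose(p, p_x[x] * p_y[y], abs_tol=tol)
        for (x, y), p in joint.items()
    )

print(is_independent(joint))  # False: p(0,30) = .01 but p(0)p(30) = .0232
```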

  • 7/28/2019 Joint Distribution

    10/19

    Independence

    In a table like this, if X and Y are independent, then the rows of the table
    p(x,y) will be proportional and so will the columns (see Example 5-1, page 158).
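
The proportionality property can be checked numerically. In the sketch below the independent table is built from made-up marginals (they are not from the slides or from W&W), and `rows_proportional` is a hypothetical helper that compares rows by cross-multiplication.

```python
import math

def rows_proportional(rows, tol=1e-9):
    """True if every row is a scalar multiple of the first row.

    Cross-multiplies (a[j] * b[k] == a[k] * b[j]) so nothing is divided.
    Assumes the first row is not all zeros.
    """
    first = rows[0]
    n = len(first)
    for row in rows[1:]:
        for j in range(n):
            for k in range(j + 1, n):
                if not math.isclose(first[j] * row[k], first[k] * row[j], abs_tol=tol):
                    return False
    return True

# Hypothetical marginals, used only to construct an independent table.
p_x = [0.25, 0.75]
p_y = [0.2, 0.3, 0.5]
independent = [[px * py for py in p_y] for px in p_x]

# The education/age table from the slides (rows = education, columns = age).
education_age = [
    [0.01, 0.02, 0.05],
    [0.03, 0.06, 0.10],
    [0.18, 0.21, 0.15],
    [0.07, 0.08, 0.04],
]

def columns(rows):
    """Transpose a table so its columns can be checked the same way."""
    return [list(col) for col in zip(*rows)]

print(rows_proportional(independent))           # True
print(rows_proportional(columns(independent)))  # True
print(rows_proportional(education_age))         # False
```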

  • 7/28/2019 Joint Distribution

    11/19

    Covariance

    It is useful to know how two variables vary together, or how they co-vary. We
    begin with the familiar concept of variance (E is expectation).

    $\sigma^2 = E[(X - \mu)^2] = \sum_x (x - \mu)^2 \, p(x)$

    $\sigma_{X,Y}$ = covariance of X and Y

    $\sigma_{X,Y} = E[(X - \mu_X)(Y - \mu_Y)] = \sum_x \sum_y (x - \mu_X)(y - \mu_Y) \, p(x,y)$

  • 7/28/2019 Joint Distribution

    12/19

    Covariance

    Let's calculate the covariance for education (X) and age (Y).

    First we need to calculate the mean for X and Y:

    $\mu_X = \sum_x x \, p(x)$ = (0)(.08)+(1)(.19)+(2)(.54)+(3)(.19) = 1.84

    $\mu_Y = \sum_y y \, p(y)$ = (30)(.29)+(45)(.37)+(70)(.34) = 49.15

    Now, for each cell in the table, take the X value minus its mean and the Y value
    minus its mean, multiply the two deviations together, and weight the product by
    the joint probability.

  • 7/28/2019 Joint Distribution

    13/19

    Covariance

    $\sigma_{X,Y} = \sum_x \sum_y (x - \mu_X)(y - \mu_Y) \, p(x,y)$

    = (0-1.84)(30-49.15)(.01) + (0-1.84)(45-49.15)(.02) + (0-1.84)(70-49.15)(.05) +
      (1-1.84)(30-49.15)(.03) + (1-1.84)(45-49.15)(.06) + (1-1.84)(70-49.15)(.10) +
      (2-1.84)(30-49.15)(.18) + (2-1.84)(45-49.15)(.21) + (2-1.84)(70-49.15)(.15) +
      (3-1.84)(30-49.15)(.07) + (3-1.84)(45-49.15)(.08) + (3-1.84)(70-49.15)(.04)

    = -3.636
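
The twelve-term sum is tedious by hand, so a short script is a convenient check. This is a sketch under the same dictionary layout as a hand-built table; `mean_xy` and `covariance` are illustrative names. The means are computed directly from the joint table, which gives the same result as summing x p(x) and y p(y) over the marginals.

```python
# Joint probabilities p(x, y) from the education/age table.
joint = {
    (0, 30): 0.01, (0, 45): 0.02, (0, 70): 0.05,
    (1, 30): 0.03, (1, 45): 0.06, (1, 70): 0.10,
    (2, 30): 0.18, (2, 45): 0.21, (2, 70): 0.15,
    (3, 30): 0.07, (3, 45): 0.08, (3, 70): 0.04,
}

def mean_xy(joint):
    """Return (mu_x, mu_y), the expected values of X and Y."""
    mu_x = sum(x * p for (x, y), p in joint.items())
    mu_y = sum(y * p for (x, y), p in joint.items())
    return mu_x, mu_y

def covariance(joint):
    """Sum of (x - mu_x)(y - mu_y) p(x, y) over every cell."""
    mu_x, mu_y = mean_xy(joint)
    return sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())

mu_x, mu_y = mean_xy(joint)
print(round(mu_x, 2), round(mu_y, 2))  # 1.84 49.15
print(round(covariance(joint), 3))     # -3.636
```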

  • 7/28/2019 Joint Distribution

    15/19

    Covariance and Independence

    If X and Y are independent, then they are uncorrelated, or their covariance is zero:

    $\sigma_{X,Y} = 0$

    The value for covariance depends on the units in which X and Y are measured. If X, for
    example, were measured in inches instead of feet, each X deviation and hence
    $\sigma_{X,Y}$ itself would increase by 12 times.
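
The unit dependence is easy to see numerically. The small table below is invented purely for illustration (X as a height in feet, Y as some second measurement); only the factor of 12 comes from the text.

```python
def covariance(joint):
    """Covariance of X and Y for a joint table keyed by (x, y)."""
    mu_x = sum(x * p for (x, y), p in joint.items())
    mu_y = sum(y * p for (x, y), p in joint.items())
    return sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())

# Hypothetical joint distribution with X measured in feet.
joint_feet = {
    (5, 100): 0.3, (5, 150): 0.2,
    (6, 100): 0.1, (6, 150): 0.4,
}

# Same distribution with X converted to inches (1 foot = 12 inches).
joint_inches = {(12 * x, y): p for (x, y), p in joint_feet.items()}

cov_feet = covariance(joint_feet)
cov_inches = covariance(joint_inches)
print(cov_feet, cov_inches, cov_inches / cov_feet)  # the ratio is 12, up to rounding
```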

  • 7/28/2019 Joint Distribution

    16/19

    Correlation

    We can calculate the correlation instead:

    $\rho = \dfrac{\sigma_{X,Y}}{\sigma_X \sigma_Y}$

    Correlation is independent of the scale it is measured in, and is always bounded:

    $-1 \le \rho \le 1$

  • 7/28/2019 Joint Distribution

    17/19

    Correlation

    A perfect positive correlation ($\rho = 1$): all x,y coordinate
    points will fall on a straight line with positive slope.

    A perfect negative correlation ($\rho = -1$): all x,y coordinate
    points will fall on a straight line with negative slope.

    A correlation of zero indicates no linear relationship between X and Y.
    Independence implies zero correlation, but zero correlation does not by
    itself guarantee independence.

    Positive correlations: as X increases, Y tends to increase.

    Negative correlations: as X increases, Y tends to decrease.

  • 7/28/2019 Joint Distribution

    18/19

    Example of Correlation

    Calculate the correlation between education and age:

    $\rho = \dfrac{\sigma_{X,Y}}{\sigma_X \sigma_Y} = \dfrac{-3.636}{(.8212)(16.14)} = -0.2743$
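
The standard deviations .8212 and 16.14 are used here without their derivation. A sketch of the full calculation, assuming the same dictionary layout for the table (the function name `correlation_parts` is illustrative), is:

```python
import math

# Joint probabilities p(x, y) from the education/age table.
joint = {
    (0, 30): 0.01, (0, 45): 0.02, (0, 70): 0.05,
    (1, 30): 0.03, (1, 45): 0.06, (1, 70): 0.10,
    (2, 30): 0.18, (2, 45): 0.21, (2, 70): 0.15,
    (3, 30): 0.07, (3, 45): 0.08, (3, 70): 0.04,
}

def correlation_parts(joint):
    """Return (sigma_x, sigma_y, rho) for a joint table keyed by (x, y)."""
    mu_x = sum(x * p for (x, y), p in joint.items())
    mu_y = sum(y * p for (x, y), p in joint.items())
    var_x = sum((x - mu_x) ** 2 * p for (x, y), p in joint.items())
    var_y = sum((y - mu_y) ** 2 * p for (x, y), p in joint.items())
    cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())
    sigma_x, sigma_y = math.sqrt(var_x), math.sqrt(var_y)
    return sigma_x, sigma_y, cov / (sigma_x * sigma_y)

sigma_x, sigma_y, rho = correlation_parts(joint)
print(round(sigma_x, 4), round(sigma_y, 2))  # 0.8212 16.14
print(round(rho, 4))                         # -0.2743
```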

  • 7/28/2019 Joint Distribution

    19/19

    Interpretation

    There is a weak, negative correlation between education and age, which
    means that older people tend to have less education.

    Later on we will learn how to conduct a hypothesis test to determine if
    $\rho$ is significantly different from zero.