Chi Distributions

May 30, 2018

anindya_kundu
  • 8/9/2019 Chi Distributions

    1/20

    Chi Square Distributions

    Prof G.R.C.Nair


    Applications

    To test whether more than two population proportions can be considered equal.

    To determine whether two attributes are independent of each other.

    To test goodness of fit.

    To test the discrepancies between observed and expected frequencies.


    Characteristics

    1. Continuous distribution.

    2. Assumes non-negative values only.

    3. The chi-square distribution curve starts at the origin and lies entirely to the right of the Y axis.

    4. Its single parameter is the degrees of freedom (d.f.).

    5. The shape of the chi-square distribution curve is strongly skewed for very small d.f. and changes drastically as d.f. increases. For large d.f., the chi-square distribution looks like a normal distribution curve.
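The change of shape with d.f. can be illustrated numerically. The sketch below uses the standard result, not stated on the slide, that a chi-square distribution with k degrees of freedom has skewness sqrt(8/k); the function name `chi_square_skewness` is an illustrative assumption.

```python
import math


def chi_square_skewness(df):
    """Skewness of a chi-square distribution with `df` degrees of freedom.

    Uses the standard closed form sqrt(8 / df) (assumed here, not from the slides).
    """
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return math.sqrt(8 / df)


# Skewness shrinks toward 0 (the value for a normal curve) as d.f. grows.
for df in (1, 2, 5, 10, 30, 100):
    print(df, round(chi_square_skewness(df), 3))
```

The values fall steadily as d.f. increases, which is consistent with the curve looking more and more like a normal curve.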


    Conditions for use

    1. Used for large samples.

    2. Every expected frequency should be at least 5, and preferably above 10 for good accuracy.

    3. Take H0: There is no significant difference between the sample proportions, or between the observed and the corresponding expected values.


    4. If the degrees of freedom and the area required in the right tail (i.e., the significance level of the test) are given, the critical value of chi-square can be found from the table.

    5. If the calculated chi-square value is greater than the critical value, H0 is rejected.


    Goodness Of Fit Test

    This is a test to see whether a particular theoretical probability distribution is appropriate in a specific case.

    Frequencies obtained from the actual performance of an experiment, or from observation in a survey, are called observed frequencies: fO


    We test the null hypothesis that these observed frequencies follow the expected frequencies fE given by a specific theoretical distribution. The hypothesis tested is how well the observed frequencies fit a given probability distribution.

    H0: The observed values follow the selected probability distribution.

    H1: They do not follow it.


    The test statistic for a chi-square test is calculated as

    $\chi^2 = \sum (f_O - f_E)^2 / f_E$

    Degrees of freedom = n - 1, where n denotes the number of possible outcomes/samples.

    The expected frequency for each category should not be less than 5. If a category has an expected frequency of less than 5, either increase the sample size or combine two or more categories so that the expected frequency is at least 5.
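A minimal Python sketch of the statistic defined above may help. The function name `chi_square_statistic` and the die-roll frequencies are hypothetical, chosen only for illustration; they do not come from the slides.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (f_O - f_E)^2 / f_E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 rolls of a die, so each face is expected 10 times.
observed = [8, 12, 9, 11, 10, 10]
expected = [10] * 6

statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1  # n - 1, as on the slide

print(round(statistic, 2), degrees_of_freedom)
```

The resulting statistic would then be compared with the table value for the computed degrees of freedom at the chosen significance level.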


    Contingency table

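    For testing association of attributes.

    The expected frequency in a contingency table is calculated by the formula

    $f_E = (T_R \times T_C) / n$, where $T_R$ is the row total, $T_C$ is the column total and n is the grand total.

    d.f. = (c-1)(r-1), where r and c are the numbers of rows and columns.

    H0: The attributes have no association, i.e., they are independent.

    H1: They have an association, i.e., they are dependent.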
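The expected-frequency rule for a contingency table can be sketched in a few lines of Python. The helper names `expected_frequencies` and `degrees_of_freedom` and the 2 x 3 table are hypothetical, used only to show the arithmetic.

```python
def expected_frequencies(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def degrees_of_freedom(table):
    """(r - 1)(c - 1) for a table with r rows and c columns."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Hypothetical 2 x 3 table of observed counts (not from the slides).
table = [[10, 20, 30],
         [20, 20, 20]]

expected = expected_frequencies(table)
df = degrees_of_freedom(table)
print(expected, df)
```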


    Example -1

    The number of car accidents in a city during each of the last 10 months was as follows: 20, 17, 12, 6, 7, 15, 8, 5, 16 and 14. Does it agree with the general belief that the number of accidents is more or less the same every month? Test at the 10% significance level.


    H0: The number of accidents is the same every month.

    H1: They are not the same.

    Expected number of accidents every month, if they are the same = 120/10 = 12

    $\chi^2 = \sum (f_O - f_E)^2 / f_E = (20-12)^2/12 + (17-12)^2/12 + \dots = 244/12 = 20.33$

    This is greater than the table value for 9 d.f. at the 10% significance level, which is 14.684.

    So reject H0.
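The arithmetic of Example 1 can be checked with a short sketch. The variable names are illustrative; the critical value 14.684 is the table value quoted on the slide rather than something the code computes.

```python
# Monthly accident counts from Example 1.
accidents = [20, 17, 12, 6, 7, 15, 8, 5, 16, 14]

# Under H0 every month has the same expected count: 120 / 10 = 12.
expected = sum(accidents) / len(accidents)

chi_sq = sum((o - expected) ** 2 / expected for o in accidents)
df = len(accidents) - 1   # n - 1
critical = 14.684         # table value quoted on the slide

reject_h0 = chi_sq > critical
print(f"chi-square = {chi_sq:.2f}, d.f. = {df}, reject H0: {reject_h0}")
```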


    Example - 2

    The following table shows the goals scored in World Cup matches. Test whether they follow the Poisson distribution at the 20% significance level.

    No. of goals      0    1    2    3    4    5    6    7
    No. of matches   95  158  108   63   40    9    5    2

    H0: The Poisson distribution fits well. H1: It does not fit well.

    Mean number of goals $\lambda = (95 \times 0 + 158 \times 1 + 108 \times 2 + \dots) / (95 + 158 + 108 + \dots) = 1.7$

    Expected frequency as per Poisson: $f_E = n \, e^{-\lambda} \lambda^x / x!$
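As a rough check of the mean and of the Poisson expected frequencies, the calculation can be sketched in Python. Variable names are illustrative, and rounding the mean to one decimal place follows the slide's value of 1.7.

```python
import math

# Data from Example 2.
goals = [0, 1, 2, 3, 4, 5, 6, 7]
matches = [95, 158, 108, 63, 40, 9, 5, 2]

n = sum(matches)
mean_goals = sum(g * m for g, m in zip(goals, matches)) / n

# The slide works with the mean rounded to one decimal place.
lam = round(mean_goals, 1)

# Expected frequency for x goals: n * e^(-lam) * lam^x / x!
expected = [n * math.exp(-lam) * lam ** x / math.factorial(x) for x in goals]

for x, f_e in zip(goals, expected):
    print(x, round(f_e, 1))
```

The unrounded expected frequencies can be compared with the whole-number values used in the slide's working table.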


    No. of goals    fO     fE     (fO-fE)^2/fE
    0               95     88     49/88 = 0.56
    1              158    150     64/150 = 0.43
    2              108    126     324/126 = 2.57
    3               63     72     81/72 = 1.12
    4               40     30     100/30 = 3.33
    5                9*    10*    4/14 = 0.29
    6                5*     3*
    7                2*     1*
    Σ                             8.3

    * Combined into one category (fO = 16, fE = 14).

    d.f. = 5, S.L. = 20%, critical $\chi^2$ = 7.289

    The calculated value is greater than the critical value, so reject H0.

    The Poisson distribution does not fit well.
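The combining step and the resulting statistic can be reproduced with the sketch below. The helper `combine_tail` is a hypothetical name; the expected frequencies are the rounded values from the slide, and the critical value 7.289 is the slide's table value. The sketch keeps the slide's d.f. = 5, although many textbooks subtract one further degree of freedom when the Poisson mean is estimated from the same data.

```python
# Observed counts and the slide's rounded expected counts for 0..7 goals.
observed = [95, 158, 108, 63, 40, 9, 5, 2]
expected = [88, 150, 126, 72, 30, 10, 3, 1]


def combine_tail(freqs, start):
    """Merge all categories from index `start` onward into a single category."""
    return freqs[:start] + [sum(freqs[start:])]


# Categories 5, 6 and 7 are merged because their expected counts are small.
obs_c = combine_tail(observed, 5)
exp_c = combine_tail(expected, 5)

chi_sq = sum((o - e) ** 2 / e for o, e in zip(obs_c, exp_c))
df = len(obs_c) - 1   # 6 categories - 1, following the slide
critical = 7.289      # table value quoted on the slide

print(f"chi-square = {chi_sq:.2f}, d.f. = {df}, reject H0: {chi_sq > critical}")
```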


    Example 3

    100 students, selected randomly from 10,000 doing an MBA, were cross-classified by their age at entry and final grade obtained. Test whether there is any association between age at entry and final grade obtained, at the 5% significance level.

    Grade         Under 21 yrs   21 to 24   Over 24   Total
    Up to 3             6             9         5        20
    3.1 to 3.5         18            14         8        40
    3.6 to 4.0         11            12        17        40
    Total              35            35        30       100


    Expected frequency table

    Grade         Under 21 yrs   21 to 24   Over 24   Total
    Up to 3             7             7         6        20
    3.1 to 3.5         14            14        12        40
    3.6 to 4.0         14            14        12        40
    Total              35            35        30       100

    Calculated $\chi^2$ = 6.37; d.f. = 4, S.L. = 5%, critical $\chi^2$ = 9.49

    The calculated value is less than the critical value, so H0 cannot be rejected.

    Age and grade points are independent (i.e., no association).
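The expected table and the calculated value for Example 3 can be reproduced as follows. This is a sketch with illustrative variable names; the critical value 9.49 is taken from the slide.

```python
# Observed counts from Example 3 (rows: grade bands, columns: age groups).
observed = [[6, 9, 5],
            [18, 14, 8],
            [11, 12, 17]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count per cell: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)
critical = 9.49  # table value quoted on the slide

print(f"chi-square = {chi_sq:.2f}, d.f. = {df}, reject H0: {chi_sq > critical}")
```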


    Example 4

    In a survey of 200 boys, of the 75 found to be intelligent, 40 had skilled fathers, while 85 of the unintelligent boys had unskilled fathers. Do these figures support the hypothesis that skilled fathers have intelligent boys? Use the $\chi^2$ test at the 5% significance level.
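The slides give no worked solution for this example, so the following is only one possible sketch. It assumes the 2 x 2 table implied by the stated counts, applies no continuity correction, and uses 3.841, the standard table value for 1 d.f. at the 5% level, which does not appear on the slides.

```python
total_boys = 200
intelligent = 75
intelligent_skilled = 40
unintelligent_unskilled = 85

unintelligent = total_boys - intelligent  # 125

# Rows: intelligent, unintelligent. Columns: skilled father, unskilled father.
observed = [
    [intelligent_skilled, intelligent - intelligent_skilled],
    [unintelligent - unintelligent_unskilled, unintelligent_unskilled],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
expected = [[r * c / total_boys for c in col_totals] for r in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (2 - 1) * (2 - 1)
critical = 3.841  # standard table value, 1 d.f., 5% level (assumed, not from the slides)

print(f"chi-square = {chi_sq:.2f}, d.f. = {df}, reject H0: {chi_sq > critical}")
```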


    Example 5 / HW

    A number of managers from various levels were randomly selected and interviewed about their concern for environmental issues. The response of each person was tallied into one of the three categories as given below:

                   No concern   Some concern   Great concern
    Top level          15            13             12
    Middle level       20            19             21
    Supervisor          7             7              6
    Group leader       28            21             13

    Use the 0.01 significance level to determine whether there is any relationship between the level of management and environmental concern.


    Further Examples / HW

    2009 mid term make up Q. 6

    2007 Terminal Part B. Q. 3

    2009 terminal part B Q.3

    2007 make up Terminal, 7

    2007 mid term 2?,4

    Mid term 2009 Part C . Q. 6

    2007 Mid term Make up Q. 5