Slide 1
Nonparametric Methods
• Sign Test
• Wilcoxon Signed-Rank Test
• Mann-Whitney-Wilcoxon Test
• Kruskal-Wallis Test
• Rank Correlation
Slide 2
Nonparametric Methods
Most of the statistical methods referred to as parametric require the use of interval- or ratio-scaled data.
Nonparametric methods are often the only way to analyze nominal or ordinal data and draw statistical conclusions.
Nonparametric methods require no assumptions about the population probability distributions.
Nonparametric methods are often called distribution-free methods.
Slide 3
Nonparametric Methods
In general, for a statistical method to be classified as nonparametric, it must satisfy at least one of the following conditions.
• The method can be used with nominal data.
• The method can be used with ordinal data.
• The method can be used with interval or ratio data when no assumption can be made about the population probability distribution.
Slide 4
Sign Test
A common application of the sign test involves using a sample of $n$ potential customers to identify a preference for one of two brands of a product.
The objective is to determine whether there is a difference in preference between the two items being compared.
To record the preference data, we use a plus sign if the individual prefers one brand and a minus sign if the individual prefers the other brand.
Because the data are recorded as plus and minus signs, this test is called the sign test.
Slide 5
Example: Peanut Butter Taste Test
Sign Test: Large-Sample Case
As part of a market research study, a sample of 36 consumers was asked to taste two brands of peanut butter and indicate a preference. Do the data summarized below indicate a significant difference in the consumer preferences for the two brands?
The analysis is based on a sample size of 18 + 12 = 30: 18 consumers recorded with a plus sign and 12 recorded with a minus sign. The remaining 6 of the 36 consumers are excluded from the analysis.
Slide 6
Example: Peanut Butter Taste Test
Hypotheses
$H_0$: No preference for one brand over the other exists
$H_a$: A preference for one brand over the other exists
Sampling Distribution
Sampling distribution of the number of “+” values if there is no brand preference:
$\mu = .5n = .5(30) = 15$
$\sigma = \sqrt{.25n} = \sqrt{.25(30)} = 2.74$
Slide 7
Example: Peanut Butter Taste Test
Rejection Rule
Using .05 level of significance, reject $H_0$ if $z < -1.96$ or $z > 1.96$
Test Statistic
$z = (18 - 15)/2.74 = 3/2.74 = 1.095$
Conclusion
Do not reject $H_0$. There is insufficient evidence in the sample to conclude that a difference in preference exists for the two brands of peanut butter.
Fewer than 10 or more than 20 individuals would have to have a preference for a particular brand in order for us to reject $H_0$.
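The large-sample sign test in this example can be reproduced with a short Python sketch. The function name and the returned dictionary are illustrative choices, not part of the slides; the counts (18 plus signs, 12 minus signs) and the 1.96 critical value come from the example.

```python
import math

def sign_test_large_sample(n_plus, n_minus, z_crit=1.96):
    """Two-tailed large-sample sign test (normal approximation).

    n_plus and n_minus are the counts of "+" and "-" signs; individuals
    with no recorded sign are assumed to be excluded already.
    """
    n = n_plus + n_minus
    mu = 0.5 * n                   # mean number of "+" signs under H0
    sigma = math.sqrt(0.25 * n)    # standard deviation under H0
    z = (n_plus - mu) / sigma
    return {
        "n": n,
        "mu": mu,
        "sigma": sigma,
        "z": z,
        "reject": abs(z) > z_crit,
        # counts of "+" signs below `lower` or above `upper` lead to rejection
        "lower": mu - z_crit * sigma,
        "upper": mu + z_crit * sigma,
    }

result = sign_test_large_sample(n_plus=18, n_minus=12)
print(f"z = {result['z']:.3f}, reject H0: {result['reject']}")
print(f"reject if count < {result['lower']:.2f} or > {result['upper']:.2f}")
```

The two cutoffs, roughly 9.6 and 20.4, are where the "fewer than 10 or more than 20" statement comes from.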
Slide 8
Wilcoxon Signed-Rank Test
This test is the nonparametric alternative to the parametric matched-sample test presented in Chapter 10.
The methodology of the parametric matched-sample analysis requires:
• interval data, and
• the assumption that the population of differences between the pairs of observations is normally distributed.
If the assumption of normally distributed differences is not appropriate, the Wilcoxon signed-rank test can be used.
District Office   Overnight   NiteFlite
Seattle           32 hrs.     25 hrs.
Los Angeles       30          24
Boston            19          15
Cleveland         16          15
New York          15          13
Houston           18          15
Atlanta           14          15
St. Louis         10           8
Milwaukee          7           9
Denver            16          11
Slide 11
Wilcoxon Signed-Rank Test
Preliminary Steps of the Test
• Compute the differences between the paired observations.
• Discard any differences of zero.
• Rank the absolute value of the differences from lowest to highest. Tied differences are assigned the average ranking of their positions.
• Give the ranks the sign of the original difference in the data.
• Sum the signed ranks.
. . . next we will determine whether the sum is significantly different from zero.
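The preliminary steps can be sketched in Python using the delivery times from the district office table. The helper `average_ranks` is an illustrative name. The slides that carry out the significance test are not in this transcript, so the last lines assume the usual large-sample normal approximation, in which the signed-rank sum $T$ has mean 0 and standard deviation $\sqrt{n(n+1)(2n+1)/6}$ under $H_0$.

```python
import math

# Delivery times in hours (Overnight, NiteFlite) from the table
times = {
    "Seattle": (32, 25), "Los Angeles": (30, 24), "Boston": (19, 15),
    "Cleveland": (16, 15), "New York": (15, 13), "Houston": (18, 15),
    "Atlanta": (14, 15), "St. Louis": (10, 8), "Milwaukee": (7, 9),
    "Denver": (16, 11),
}

def average_ranks(values):
    """Rank values from lowest to highest; ties get the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + 1 + end + 1) / 2      # mean of 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks

# Steps 1-2: differences, discarding zeros
diffs = [a - b for a, b in times.values() if a != b]
# Step 3: rank the absolute differences
ranks = average_ranks([abs(d) for d in diffs])
# Step 4: give each rank the sign of its difference
signed_ranks = [r if d > 0 else -r for d, r in zip(diffs, ranks)]
# Step 5: sum the signed ranks
T = sum(signed_ranks)

# Assumed large-sample approximation (not shown in the transcript)
n = len(diffs)
sigma_T = math.sqrt(n * (n + 1) * (2 * n + 1) / 6)
z = T / sigma_T
print(f"T = {T}, sigma_T = {sigma_T:.2f}, z = {z:.2f}")
```

With these data the signed ranks sum to 44 and $z$ is about 2.24, which is beyond 1.96 if a two-tailed .05 level of significance is assumed.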
Reject $H_0$. There is sufficient evidence in the sample to conclude that a difference exists in the delivery times provided by the two services. Recommend using the NiteFlite service.
Mann-Whitney-Wilcoxon Test
This test is another nonparametric method for determining whether there is a difference between two populations.
This test, unlike the Wilcoxon signed-rank test, is not based on a matched sample.
This test does not require interval data or the assumption that both populations are normally distributed.
The only requirement is that the measurement scale for the data is at least ordinal.
Slide 16
Mann-Whitney-Wilcoxon Test
Instead of testing for the difference between the means of two populations, this method tests to determine whether the two populations are identical.
The hypotheses are:
$H_0$: The two populations are identical
$H_a$: The two populations are not identical
Slide 17
Example: Westin Freezers
Mann-Whitney-Wilcoxon Test (Large-Sample Case)
Manufacturer labels indicate the annual energy cost associated with operating home appliances such as freezers.
The energy costs for a sample of 10 Westin freezers and a sample of 10 Brand-X freezers are shown on the next slide. Do the data indicate, using $\alpha$ = .05, that a difference exists in the annual energy costs associated with the two brands of freezers?
Mann-Whitney-Wilcoxon Test (Large-Sample Case)
• Hypotheses
$H_0$: Annual energy costs for Westin freezers and Brand-X freezers are the same.
$H_a$: Annual energy costs differ for the two brands of freezers.
Slide 20
Mann-Whitney-Wilcoxon Test: Large-Sample Case
First, rank the combined data from the lowest to the highest values, with tied values being assigned the average of the tied rankings.
Then, compute $T$, the sum of the ranks for the first sample.
Then, compare the observed value of $T$ to the sampling distribution of $T$ for identical populations. The value of the standardized test statistic $z$ will provide the basis for deciding whether to reject $H_0$.
Slide 21
Mann-Whitney-Wilcoxon Test: Large-Sample Case
Sampling Distribution of $T$ for Identical Populations
• Mean
$\mu_T = \frac{1}{2} n_1 (n_1 + n_2 + 1)$
• Standard Deviation
$\sigma_T = \sqrt{\frac{1}{12} n_1 n_2 (n_1 + n_2 + 1)}$
• Distribution Form
Approximately normal, provided $n_1 \ge 10$ and $n_2 \ge 10$
Do not reject $H_0$. There is insufficient evidence in the sample data to conclude that there is a difference in the annual energy cost associated with the two brands of freezers.
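The large-sample Mann-Whitney-Wilcoxon procedure can be sketched in Python as follows. The freezer cost data are not preserved in this transcript, so `sample_a` and `sample_b` below are made-up numbers used only to exercise the code; the function names are also illustrative. The sketch assumes the standard results $\mu_T = \frac{1}{2} n_1 (n_1 + n_2 + 1)$ and $\sigma_T = \sqrt{\frac{1}{12} n_1 n_2 (n_1 + n_2 + 1)}$ for the rank sum $T$ of the first sample.

```python
import math

def average_ranks(values):
    """Rank values from lowest to highest; ties get the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + 1 + end + 1) / 2      # mean of 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks

def mww_large_sample(sample_1, sample_2):
    """Return (T, mu_T, sigma_T, z) for the large-sample MWW test."""
    n1, n2 = len(sample_1), len(sample_2)
    ranks = average_ranks(list(sample_1) + list(sample_2))  # rank the combined data
    T = sum(ranks[:n1])                                     # rank sum of the first sample
    mu_T = 0.5 * n1 * (n1 + n2 + 1)
    sigma_T = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (T - mu_T) / sigma_T
    return T, mu_T, sigma_T, z

# Hypothetical data (NOT the Westin / Brand-X figures, which are not in the transcript)
sample_a = [55.1, 54.5, 53.2, 56.8, 54.9, 55.9, 54.0, 53.7, 56.2, 55.4]
sample_b = [56.1, 54.7, 54.4, 55.6, 54.1, 57.0, 53.9, 54.8, 57.4, 55.0]

T, mu_T, sigma_T, z = mww_large_sample(sample_a, sample_b)
print(f"T = {T}, mu_T = {mu_T}, sigma_T = {sigma_T:.2f}, z = {z:.2f}")
print("reject H0 at .05:", abs(z) > 1.96)
```

For two samples of 10, as in the freezer example, the mean of $T$ is 105 and its standard deviation is about 13.23, whatever the data values are.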
Slide 25
Kruskal-Wallis Test
The Mann-Whitney-Wilcoxon test can be used to test whether two populations are identical.
The MWW test has been extended by Kruskal and Wallis for cases of three or more populations.
The Kruskal-Wallis test can be used with ordinal data as well as with interval or ratio data.
Also, the Kruskal-Wallis test does not require the assumption of normally distributed populations.
The hypotheses are:
$H_0$: All populations are identical
$H_a$: Not all populations are identical
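The transcript does not preserve the Kruskal-Wallis test statistic, so the following Python sketch assumes the standard form $W = \frac{12}{n_T(n_T+1)} \sum_i \frac{R_i^2}{n_i} - 3(n_T+1)$, where $R_i$ is the rank sum of sample $i$, $n_i$ is its size, and $n_T$ is the total number of observations. The three groups are made-up numbers for illustration, and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from lowest to highest; ties get the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + 1 + end + 1) / 2      # mean of 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks

def kruskal_wallis_w(samples):
    """Kruskal-Wallis statistic W for a list of samples (no tie correction)."""
    combined = [v for s in samples for v in s]
    ranks = average_ranks(combined)          # rank the combined data
    n_total = len(combined)
    weighted = 0.0
    start = 0
    for s in samples:
        rank_sum = sum(ranks[start:start + len(s)])
        weighted += rank_sum ** 2 / len(s)
        start += len(s)
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

# Hypothetical, fully separated groups chosen so the result is easy to check by hand
groups = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]]
W = kruskal_wallis_w(groups)

# W is compared with a chi-square distribution with k - 1 degrees of freedom;
# 5.991 is the .05 critical value for 2 degrees of freedom.
print(f"W = {W:.2f}, reject H0 at .05: {W > 5.991}")
```

Here the rank sums are 15, 40, and 65, giving $W$ of about 12.5. The chi-square comparison is usually recommended only when each sample has at least five observations.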
Slide 26
Rank Correlation
The Pearson correlation coefficient, $r$, is a measure of the linear association between two variables for which interval or ratio data are available.
The Spearman rank-correlation coefficient, $r_s$, is a measure of association between two variables when only ordinal data are available.
Values of $r_s$ can range from -1.0 to +1.0, where
• values near 1.0 indicate a strong positive association between the rankings, and
• values near -1.0 indicate a strong negative association between the rankings.
$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$
where: $n$ = number of items being ranked
$x_i$ = rank of item $i$ with respect to one variable
$y_i$ = rank of item $i$ with respect to a second variable
$d_i = x_i - y_i$
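A minimal Python sketch of the Spearman coefficient, $r_s = 1 - 6 \sum d_i^2 / (n(n^2 - 1))$, with $d_i = x_i - y_i$. The function name and the two rankings are hypothetical and serve only to illustrate the arithmetic.

```python
def spearman_rank_correlation(x_ranks, y_ranks):
    """Spearman r_s from two lists of ranks for the same n items."""
    if len(x_ranks) != len(y_ranks):
        raise ValueError("both rankings must cover the same items")
    n = len(x_ranks)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(x_ranks, y_ranks))  # sum of d_i squared
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical rankings of five items by two judges
x_ranks = [1, 2, 3, 4, 5]
y_ranks = [2, 1, 4, 3, 5]

r_s = spearman_rank_correlation(x_ranks, y_ranks)
print(round(r_s, 3))  # sum of d_i^2 is 4, so r_s = 1 - 24/120 = 0.8
```

Identical rankings give $r_s = 1.0$ and exactly reversed rankings give $r_s = -1.0$, matching the range stated for $r_s$.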
Slide 28
Test for Significant Rank Correlation
We may want to use sample results to make an inference about the population rank correlation $\rho_s$.
To do so, we must test the hypotheses:
$H_0$: $\rho_s = 0$
$H_a$: $\rho_s \ne 0$
Slide 29
Sampling Distribution of $r_s$ when $\rho_s = 0$
• Mean
$\mu_{r_s} = 0$
• Standard Deviation
$\sigma_{r_s} = \sqrt{\frac{1}{n - 1}}$
• Distribution Form
Approximately normal, provided $n \ge 10$
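The test for significant rank correlation can be sketched in Python, assuming the standard large-sample result that $r_s$ is approximately normal with mean 0 and standard deviation $\sqrt{1/(n-1)}$ when $\rho_s = 0$. The function name is illustrative, and the inputs ($r_s = 0.5$, $n = 10$) are hypothetical values, not results from the slides.

```python
import math
from statistics import NormalDist

def rank_correlation_z_test(r_s, n, alpha=0.10):
    """Two-tailed large-sample test of H0: rho_s = 0."""
    sigma = math.sqrt(1 / (n - 1))                 # std. deviation of r_s under H0
    z = (r_s - 0) / sigma                          # mean of r_s under H0 is 0
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.645 for alpha = .10
    return z, z_crit, abs(z) > z_crit

# Hypothetical sample result: r_s = 0.5 from n = 10 ranked items
z, z_crit, reject = rank_correlation_z_test(r_s=0.5, n=10)
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

With these inputs $z = 0.5 \sqrt{9} = 1.5$, which is inside the critical values for $\alpha$ = .10, so $H_0$ would not be rejected.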
Connor Investors provides a portfolio management service for its clients. Two of Connor’s analysts rated ten investments from high (6) to low (1) risk as shown below. Use rank correlation, with $\alpha$ = .10, to comment on the agreement of the two analysts’ ratings.
Investment A B C D E F G H I J
Do not reject $H_0$. There is not a significant rank correlation. The two analysts are not showing agreement in their rating of the risk associated with the different investments.