Shahid Lecture-4-MKAG1273

MAL1303: STATISTICAL HYDROLOGY

Hypothesis Test

Dr. Shamsuddin Shahid

Department of Hydraulics and Hydrology
Faculty of Civil Engineering, Universiti Teknologi Malaysia

Room No.: M46-332; Phone: 07-5531624; Mobile: 0182051586 Email: [email protected]

11/23/2015 Shamsuddin Shahid, FKA, UTM


How can we solve it?

Groundwater depth (m) data are collected from two aquifers, namely X and Y. We want to know whether the groundwater depth in both aquifers is the same or not.





After using a new technique, groundwater yield has increased significantly. How can we prove it?





Environmental activists claim that after the introduction of fertilizer-based agriculture, the groundwater quality of the area has deteriorated. Is it possible to prove?




Is it the solution?

Sixteen (16) randomly selected river discharge data of two rivers are collected. From the mean of the discharge data, River-B appears to have a higher discharge than River-A. Is it possible to say that the discharge of River-B is higher than that of River-A?




Interval of Mean Discharge

For River-A at 95% level of confidence: 30.2 ≤ µA ≤ 215.5

For River-B at 95% level of confidence: 60.4 ≤ µB ≤ 190.7

The two intervals overlap, so River-A and River-B can have the same mean discharge value.

Is it the solution?




One-tailed test: rejection region for Ha: 520 when α = .025

Two-tailed test: rejection region for Ha: µ ≠ 520 when α = .025




Comparing two sets of data








Hypothesis Tests

One important use of hypothesis tests is to evaluate and compare groups of data. Statistical tests are the most quantitative ways to determine whether hypotheses can be substantiated, or whether they must be modified or rejected outright.

Hypothesis tests have at least two advantages over educated opinion:

1. They ensure that every analyst of a data set using the same methods will arrive at the same result.

2. They present a measure of the strength of the evidence (the p-value).




Structure of Hypothesis Tests

1) Choose the appropriate test.
2) Establish the null and alternate hypotheses.
3) Decide on an acceptable error rate α.
4) Compute the test statistic from the data.
5) Compute the p-value.
6) Reject the null hypothesis if p ≤ α.




Selection of Appropriate Test

There are a large number of hypothesis tests. They are classified based on:

1. The measurement scales of the data
2. The distribution of the data

If the measurement scales are interval/ratio and the data distribution is normal, we use parametric hypothesis tests.

If the measurement scales are not interval/ratio (such as ordinal or categorical), or are interval/ratio but the data are not normally distributed, then we use non-parametric hypothesis tests.




Null Hypothesis and Alternative Hypothesis

The 'null' often refers to the common view of something, while the alternative hypothesis is what the researcher really thinks is the cause of a phenomenon. The null hypothesis is a hypothesis which the researcher tries to disprove, reject or nullify.

The null hypothesis, denoted as H0

The alternative hypothesis, denoted as Ha




Want to test whether the mean can be 190?

Ho: µ = 190 when α = 0.05 [Null hypothesis: mean value can be 190]
Ha: µ ≠ 190 when α = 0.05 [Alternative hypothesis: mean value cannot be 190]

Comparing two population means, µ1 and µ2:

Null Hypothesis, H0: µ1 = µ2.

The alternative hypothesis, H1: µ1 ≠ µ2 (two-tailed t test),

H1: µ1 < µ2 (one-tailed t test), or
H1: µ1 > µ2 (one-tailed t test).

Example: Null and Alternative Hypothesis




Structure of Hypothesis Tests

1) Choose the appropriate test.
2) Establish the null and alternate hypotheses.
3) Decide on an acceptable error rate α.
4) Compute the test statistic from the data.
5) Compute the p-value.
6) Reject the null hypothesis if α ≥ p.




Permeability of groundwater is found to vary very widely in an area. One hundred (n = 100) permeability measurements are made in the area. The calculated mean permeability of the 100 measurements is 190. For some engineering purpose we need to know whether groundwater permeability in the area can have a mean value of 180 or not. We want to determine it at 95% level of confidence.

Ho: µ = 190 when α = 0.05 [Null hypothesis: mean value can be 190]

Ha: µ ≠ 190 when α = 0.05 [Alternative hypothesis: mean value cannot be 190]

A Simple Example




A Simple Example

At 95% level of confidence:

Accepted Region =

Result: 180 cannot be the mean permeability in the region
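The numerical acceptance region is not preserved in the text. A minimal sketch of how such a region could be computed is given below, assuming a large-sample normal approximation and a sample standard deviation `s = 40`. That value of `s` is hypothetical and chosen only for illustration; the lecture text does not state it.

```python
from math import sqrt
from statistics import NormalDist

n = 100              # number of permeability measurements (from the example)
sample_mean = 190.0  # calculated mean permeability (from the example)
s = 40.0             # HYPOTHETICAL sample standard deviation, not given in the text
alpha = 0.05         # 95% level of confidence

# Two-sided critical z value (about 1.96 for alpha = 0.05)
z = NormalDist().inv_cdf(1 - alpha / 2)

# Acceptance region: sample mean +/- z * s / sqrt(n)
half_width = z * s / sqrt(n)
lower = sample_mean - half_width
upper = sample_mean + half_width

candidate = 180.0
inside = lower <= candidate <= upper

print(f"Acceptance region: {lower:.2f} to {upper:.2f}")
print(f"Can {candidate:.0f} be the mean? {inside}")
```

With a different standard deviation the region widens or narrows, so the conclusion about 180 depends on the actual spread of the measurements.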




Comparing Two sets of Data: Student t-test

Underlying assumptions made in using the t test to compare two population means:

1. The underlying distributions for both populations are normal.

2. The variances of the two populations are approximately equal:

σ1 = σ2




Null Hypothesis

The null hypothesis, denoted as H0, is expressed as follows for the t-test comparing two population means, µ1 and µ2:

H0: µ1 = µ2.

Alternative Hypothesis

The alternative hypothesis, denoted as H1, is expressed as one of the following for the t test comparing two population means, µ1 and µ2:

H1: µ1 ≠ µ2 (two-tailed t test),

H1: µ1 < µ2 (one-tailed t test), or
H1: µ1 > µ2 (one-tailed t test).




Student t-test: Comparing two sets of data

Standard Error in Mean

t-statistic estimated using:

Where,
n1 is the number of xi observations, n2 is the number of yi observations,
Sx² is the sample variance of xi, Sy² is the sample variance of yi,
x̄ is the sample average for xi, and ȳ is the sample average for yi
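The formula itself did not survive in the text. A minimal sketch of one common form, the pooled-variance t-statistic, is shown below. This form is an assumption, chosen because it matches the equal-variance assumption and the n1 + n2 − 2 degrees of freedom used in this lecture. The data values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_statistic(x, y):
    """Two-sample t-statistic with pooled variance (assumes equal variances)."""
    n1, n2 = len(x), len(y)
    sx2 = variance(x)  # sample variance of x (n - 1 in the denominator)
    sy2 = variance(y)  # sample variance of y
    # Pooled variance weights each sample variance by its degrees of freedom
    sp2 = ((n1 - 1) * sx2 + (n2 - 1) * sy2) / (n1 + n2 - 2)
    # Standard error of the difference between the two means
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    t = (mean(x) - mean(y)) / se
    return t, n1 + n2 - 2

# HYPOTHETICAL observations, for illustration only
x_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
y_obs = [2.0, 3.0, 4.0, 5.0, 6.0]

t_value, dof = pooled_t_statistic(x_obs, y_obs)
print(f"t = {t_value:.4f}, degrees of freedom = {dof}")
```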




1. Once the t-statistic has been computed, we can compare our estimated t value to critical t values given in a table for the t distribution.

2. If the absolute value of the estimated t is greater than the critical t value entry in the t table associated with a significance level of α (one-sided t test) or α/2 (two-sided t test), we can reject the null hypothesis.

3. Thus, we compare our t value to the t distribution table entry for:

t(α, n1 + n2 − 2) (one-sided), or

t(α/2, n1 + n2 − 2) (two-sided)

where α is the level of significance (equal to 1 − level of confidence), and n1 and n2 are the number of samples from each of the two populations being compared.

Making Decision




Student t-test: Example

Groundwater samples from near an underground mining area, taken before the start of mining and after mining, are given below. Many scientists anticipate that the concentration of Chemical-X in groundwater has increased due to the mining. Is it true?

Null Hypothesis, H0: µ1 = µ2 [No change in groundwater quality]

Alternative Hypothesis, H1: µ1 ≠ µ2 [Groundwater quality has changed]




Student t-test: Example

t(calculated) = 0.7968

Degrees of freedom = n1 + n2 − 2 = 16 + 14 − 2 = 28

At α = 0.05, t(critical) = t(0.025, 28) = 2.0484

t(calculated) < t(critical)

Decision: Null hypothesis cannot be rejected at 95% level of confidence.




ANalysis Of VAriance (ANOVA)

Analysis of variance (ANOVA) is a method for testing the hypothesis that there is no difference between two or more population means (usually at least three).

Why can the t-test not be applied?

• The t-test, which is based on the standard error of the difference between two means, can only be used to test differences between two means.

• With more than two means, we could compare each mean with each other mean using t-tests. However, conducting multiple t-tests inflates the overall chance of falsely rejecting a true null hypothesis and is NOT RECOMMENDED.
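A minimal sketch of why repeated t-tests are risky: if each pairwise test is run at α = 0.05 and the tests are treated as independent (a simplifying assumption), the chance of at least one false rejection grows quickly with the number of groups. The function name is illustrative.

```python
from math import comb

def familywise_error_rate(groups, alpha=0.05):
    """Chance of at least one false rejection across all pairwise tests,
    treating the tests as independent (a simplification)."""
    pairs = comb(groups, 2)  # number of pairwise comparisons
    return pairs, 1 - (1 - alpha) ** pairs

for k in (3, 5):
    pairs, rate = familywise_error_rate(k)
    print(f"{k} groups: {pairs} t-tests, overall error rate = {rate:.3f}")
```

For three groups the overall error rate is already about 0.14 rather than 0.05, which is the motivation for a single ANOVA test.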




Three groups tightly spread about their respective means; the variability within each group is relatively small.

Three groups have the same means as in the previous figure, but the variability within each group is much larger.

ANOVA examines the difference between the groups as well as the difference within a group.

Analysis of Variance (ANOVA)




Assumptions of ANOVA

1. The observations are sampled independently and the groups under consideration are independent. Selection of one sample has no effect on another.

2. Each of the populations is normally distributed.

3. Population variances are equal (homogeneity of variance).




Calculating an ANOVA

Calculating an ANOVA means that we want to calculate the F statistic. There are six steps to calculating the F statistic:

1. Calculation of "sum of squares" between the groups
2. Calculation of "sum of squares" within the groups
3. Determination of the degrees of freedom for each
4. Calculation of "mean square between" and "mean square within"
5. Calculation of the F ratio (or F statistic)
6. Making a decision




Mean Square Between (MSB) = SSB / DFB

Mean Square Within (MSW) = SSW / DFW

F-statistic = MSB / MSW

A larger F-statistic means more variation between the groups compared to within the groups. A larger F-statistic supports the conclusion that the groups are from different populations.




Calculation of Degree of Freedom

Degrees of freedom between (DFB) and degrees of freedom within (DFW) can be calculated in the following way:

DFB = No. of groups − 1

DFW = Total number of observations − No. of groups




Example ANOVA Test




Hypotheses

We may test the

Null Hypothesis: There is no difference in groundwater depth among the three catchments

against the

Alternative Hypothesis: The groundwater depths of at least one pair of catchments are not equal




Example ANOVA Test




Sum of Squares Between (SSB)

SSB = 38.798



Total Sum of Squares (TSS)

Total sum of squares (TSS) = Sum of squares between (SSB) + Sum of squares within (SSW)

TSS = 44.735



Sum of Squares Within (SSW)

Total sum of squares (TSS) = Sum of squares between (SSB) + Sum of squares within (SSW)

Therefore,

SSW = TSS − SSB

= 44.735 − 38.798

= 5.937




Determine Degree of Freedoms

Between group degrees of freedom (BDF) = Number of groups − 1 = 3 − 1 = 2

Within group degrees of freedom (WDF) = Total number of observations − Number of groups = 30 − 3 = 27




Mean Squares

Between Group Mean Square = SSB / BDF = 38.798 / 2 = 19.399

Within Group Mean Square = SSW / WDF = 5.937 / 27 = 0.2199




F-Statistics

F = Between Group Mean Square / Within Group Mean Square

= 19.399 / 0.2199

= 88.2

F (0.05; 2, 27) = 3.35

F(calculated) > F(critical). Therefore, we can reject the null hypothesis.

Important: The F statistic does not tell us which groups are different; it only says whether the mean values differ significantly among the groups. In this case, it only says that groundwater depth differs significantly among the catchments.
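The arithmetic of this example can be retraced in a few lines. The sketch below starts from the SSB and TSS values quoted in the example (the raw depth data are not reproduced here) and follows the remaining steps. The variable names are illustrative.

```python
# Values taken from the worked example
ssb = 38.798        # sum of squares between groups
tss = 44.735        # total sum of squares
groups = 3          # number of catchments
observations = 30   # total number of observations

ssw = tss - ssb                 # sum of squares within groups
dfb = groups - 1                # between-group degrees of freedom
dfw = observations - groups     # within-group degrees of freedom

msb = ssb / dfb                 # mean square between
msw = ssw / dfw                 # mean square within
f_stat = msb / msw              # F statistic

print(f"SSW = {ssw:.3f}, DFB = {dfb}, DFW = {dfw}")
print(f"MSB = {msb:.3f}, MSW = {msw:.4f}, F = {f_stat:.1f}")
```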




One-way and Two-way ANOVA

When there is only one qualitative variable which denotes the groups and only one measurement variable (quantitative), a one-way ANOVA is carried out. The purpose of one-way ANOVA is to find out whether data from several groups have a common mean. That is, to determine whether the groups are actually different in the measured characteristic.

The purpose of two-way ANOVA is to test the effects of two independent variables on several groups. One-way ANOVA and two-way ANOVA differ in that the groups in two-way ANOVA have two categories of defining characteristics instead of one.

Suppose sediment samples are collected from three different areas. Contents of two minerals (A & B) are measured for each sample. We want to see whether the samples differ from area to area as well as between the types of mineral content.




Chi-square Test of Normality
















Normsdist(z) [Excel Function]

Normsdist(−1) = 0.158655
Normsdist(0) − Normsdist(−1) = 0.341345
Normsdist(1) − Normsdist(0) = 0.341345
Normsdist(2) − Normsdist(1) = 0.135905

Expected Frequency = n × [probability of z-value occurring in that class interval]

Example = 12 × 0.158655 = 1.903863





Example: (2 − 1.903863)² / 1.903863 = 0.004855
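The class probabilities and expected frequencies can be reproduced without a spreadsheet. The sketch below uses Python's standard-library `NormalDist().cdf` in place of Excel's Normsdist, with n = 12 and the observed frequency of 2 for the first class as quoted in the example. The observed frequencies of the other classes are not reproduced in the text, so only the first chi-square term is computed.

```python
from statistics import NormalDist

phi = NormalDist().cdf  # standard normal cumulative probability, like Normsdist(z)

n = 12  # number of measurements in the example

# Probability of a z-value falling in each class interval used in the example
class_probabilities = [
    phi(-1),            # z below -1
    phi(0) - phi(-1),   # z between -1 and 0
    phi(1) - phi(0),    # z between 0 and 1
    phi(2) - phi(1),    # z between 1 and 2
]

expected = [n * p for p in class_probabilities]

# Chi-square contribution of the first class (observed frequency = 2)
observed_first = 2
contribution = (observed_first - expected[0]) ** 2 / expected[0]

print([round(p, 6) for p in class_probabilities])
print(f"Expected frequency, first class = {expected[0]:.6f}")
print(f"Chi-square term, first class = {contribution:.6f}")
```

The chi-square statistic is the sum of such terms over all classes.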

Chi (calculated) = 0.09292
Chi (critical) (alpha, df) = ?

Degrees of Freedom (df) = m − k − 1

where m is the number of classes (here 4). We estimated ȳ and s, so k = 2. Therefore, df = 4 − 2 − 1 = 1

Chi (0.05, 1) = 3.841459
Chi (calculated) < Chi (critical)

Null hypothesis cannot be rejected.





We can conclude that, at 95% level of confidence, the measurements are consistent with a normal distribution.





Parametric and Non-parametric Tests




Mann-Whitney U-Test

Computational Steps

1. Two samples are taken.

2. The data are put into order, based on size.

3. Data can be ranked from highest to lowest or lowest to highest values

4. Calculate Mann-Whitney U statistic

U = n1n2 + n1(n1 + 1)/2 − R1




Example of Mann-Whitney U-test

Two-tailed null hypothesis that there is no difference between transmissivity in two aquifers

Ho: Aquifer-A and Aquifer-B have the same transmissivity

HA: Transmissivity of Aquifer-A and Aquifer-B are not the same.




Example of Mann-Whitney U test

Transmis.    Transmis.    Ranks of      Ranks of
Aquifer-A    Aquifer-B    Trans. of A   Trans. of B
193          175          1             7
188          173          2             8
185          168          3             10
183          165          4             11
180          163          5             12
178                       6
170                       9
n1 = 7       n2 = 5       R1 = 30       R2 = 48

U1 = n1n2 + n1(n1 + 1)/2 − R1
U1 = (7)(5) + (7)(8)/2 − 30
U1 = 35 + 28 − 30
U1 = 33

U2 = n1n2 + n2(n2 + 1)/2 − R2
U2 = (7)(5) + (5)(6)/2 − 48
U2 = 35 + 15 − 48
U2 = 2

U = minimum (U1, U2) = 2

U(0.05, 7, 5) = 5

The calculated value is less than the critical value. Therefore, Ho is rejected.

We can say at 95% level of confidence that the transmissivity of the two aquifers is different.
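A minimal sketch that recomputes the rank sums and both U values directly from the transmissivity values in the table, pairing each rank sum with the size of its own sample (7 values for Aquifer-A, 5 for Aquifer-B). Ranks run from highest to lowest as in the table. The helper assumes there are no tied values, which holds for these data.

```python
def mann_whitney_u(sample_1, sample_2):
    """Rank sums and U statistics for two samples without tied values."""
    pooled = sorted(sample_1 + sample_2, reverse=True)  # rank 1 = highest value
    rank_of = {value: i + 1 for i, value in enumerate(pooled)}  # valid only without ties
    n1, n2 = len(sample_1), len(sample_2)
    r1 = sum(rank_of[v] for v in sample_1)
    r2 = sum(rank_of[v] for v in sample_2)
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return r1, r2, u1, u2

# Transmissivity values from the example table
aquifer_a = [193, 188, 185, 183, 180, 178, 170]
aquifer_b = [175, 173, 168, 165, 163]

r1, r2, u1, u2 = mann_whitney_u(aquifer_a, aquifer_b)
u = min(u1, u2)
u_critical = 5  # tabulated value quoted in the example for alpha = 0.05

print(f"R1 = {r1}, R2 = {r2}, U1 = {u1}, U2 = {u2}")
print("Reject Ho" if u <= u_critical else "Cannot reject Ho")
```

The two U values always add up to n1n2 (here 35), which is a quick check on the arithmetic.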







• The Kruskal-Wallis test is a nonparametric (distribution free) test, which is used to compare three or more groups of sample data.

• The Kruskal-Wallis Test is used when the assumptions of ANOVA are not met. In ANOVA, we assume that each group is normally distributed. In the Kruskal-Wallis Test, we make no assumption about the distribution. So the Kruskal-Wallis Test is a distribution free test.

• If normality assumptions are met, then the Kruskal-Wallis Test is not as powerful as ANOVA.

• The Kruskal-Wallis Test was developed by Kruskal and Wallis jointly and is named after them.

Kruskal-Wallis Test




Steps of Kruskal-Wallis Test

1. Arrange the data of all samples in a single series in ascending order.
2. Assign ranks to them in ascending order. In the case of a repeated value, assign ranks by averaging their rank positions.
3. The ranks of the different samples are separated and summed up as R1, R2, R3, etc.
4. To calculate the value of the Kruskal-Wallis Test, apply the following formula:

H = [12 / (n(n + 1))] × Σ (Ri² / ni) − 3(n + 1)

Where,
H = Kruskal-Wallis Test statistic
n = total number of observations in all samples
ni = number of observations in sample i
Ri = sum of the ranks of sample i
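A minimal sketch of steps 1 to 4, assuming the standard form of the statistic, H = 12/(n(n + 1)) × Σ(Ri²/ni) − 3(n + 1), where ni is the number of observations in sample i. The groundwater depth values are hypothetical and no correction for ties is applied to H.

```python
def average_ranks(values):
    """Ascending ranks; repeated values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole block of equal values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def kruskal_wallis_h(samples):
    """Kruskal-Wallis H statistic (no tie correction)."""
    pooled = [v for sample in samples for v in sample]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total = 0.0
    start = 0
    for sample in samples:
        rank_sum = sum(ranks[start:start + len(sample)])  # Ri for this sample
        total += rank_sum ** 2 / len(sample)
        start += len(sample)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)

# HYPOTHETICAL groundwater depths (m) for three catchments
catchment_a = [2.1, 2.4, 2.7, 3.0, 3.3, 3.5]
catchment_b = [3.8, 4.1, 4.4, 4.6, 4.9, 5.2]
catchment_c = [5.5, 5.8, 6.1, 6.3, 6.6, 6.9]

h_value = kruskal_wallis_h([catchment_a, catchment_b, catchment_c])
print(f"H = {h_value:.4f}")
```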




Calculation of Degrees of Freedom:
Degrees of freedom = k − 1, where k is the number of groups; the number of observations in each group should be more than 5.

The Kruskal-Wallis Test statistic approximately follows a chi-square distribution.

Value of Kruskal-Wallis Test H < the chi-square table value:
The null hypothesis cannot be rejected. The samples come from the same population.

Value of Kruskal-Wallis Test H > the chi-square table value:
The null hypothesis is rejected. The samples come from different populations.

Kruskal-Wallis Test




Example: Groundwater depths in three catchments (A, B, C) are measured. Is there any variation in groundwater depth among the three catchments?

Kruskal-Wallis Test: Example




Example: Cont..

H = 9.84

Degrees of Freedom = No. of groups − 1 = 3 − 1 = 2

H(critical) = Chi-square (0.05, 2) = 5.99

H (calculated) > H (critical). H also exceeds Chi-square (0.01, 2) = 9.21, so the difference is significant at p = 0.01 as well.

Null hypothesis rejected.

Result: Significant difference exists in groundwater depth of the three catchments.
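For 2 degrees of freedom the chi-square distribution has a simple closed form, so the critical value and the p-value for this example can be checked without a table. The sketch below relies on that property; it applies only when the degrees of freedom equal 2, and the function names are illustrative.

```python
from math import exp, log

def chi2_critical_df2(alpha):
    """Upper-tail critical value of chi-square with 2 degrees of freedom."""
    return -2 * log(alpha)

def chi2_pvalue_df2(statistic):
    """Upper-tail probability of chi-square with 2 degrees of freedom."""
    return exp(-statistic / 2)

h_calculated = 9.84  # from the example

critical_05 = chi2_critical_df2(0.05)
critical_01 = chi2_critical_df2(0.01)
p_value = chi2_pvalue_df2(h_calculated)

print(f"Critical value at 0.05: {critical_05:.2f}")
print(f"Critical value at 0.01: {critical_01:.2f}")
print(f"p-value for H = {h_calculated}: {p_value:.4f}")
```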



Chi-square Table




Nonparametric Methods

• Mann-Whitney-Wilcoxon Test
• Kruskal-Wallis Test
• Sign Test
• Wilcoxon Signed-Rank Test
• Run Test




Example: Sign Test

As part of research, studies were carried out to measure whether the new method proposed by you (Method-A) can remove arsenic from water better than the well-known existing method (Method-B). A total of 36 case studies were conducted. The obtained result is given below. Do the data shown below indicate a significant difference between the two methods?

18 found Method-A is better (+ sign recorded)
12 found Method-B is better (− sign recorded)

In 6 cases both methods gave similar results

The analysis is based on a sample size of 18 + 12 = 30.




Hypotheses
H0: No preference for one method over the other exists
Ha: A preference for one method over the other exists

Rejection Rule
Reject H0 if the cumulative binomial value is less than the chosen significance level (such as 0.05)

Test Statistic
NEGBINOMDIST(12,18,0.5) = 0.1145 (cumulative value)

Conclusion
Do not reject H0. There is insufficient evidence in the sample to conclude that a difference in methods exists.

We could reject if success is 20 and failure is 10 (Table value: 0.034).




Example: Sign Test - Prevalence of one mineral

Problem
As part of a study, we want to see whether the concentration of Mineral-A is higher than that of Mineral-B in a place. We have collected 14 samples and measured the concentrations of Mineral-A and Mineral-B in the samples. Is there any difference in the concentration of minerals in the samples?




Example: Prevalence of one mineral

Test Statistic
Yes = 11, No = 3, Cumulative Binomial Value = 0.023

Conclusion
The binomial value is less than 0.05. Therefore, reject H0 at 95% level of confidence.

Decision: There is sufficient evidence in the sample to conclude that the concentration of one mineral is higher than that of the other.
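Both sign-test examples reduce to a cumulative binomial probability with p = 0.5. A minimal sketch using only the standard library is shown below. It computes the exact lower-tail probability of seeing at most the smaller count of signs. The figures it produces need not match the spreadsheet values quoted in the examples, which depend on the exact function and arguments used, but they lead to the same decisions at the 0.05 level.

```python
from math import comb

def binomial_cdf(k, n, p=0.5):
    """Probability of at most k successes in n trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Arsenic removal methods: 18 plus signs, 12 minus signs, ties discarded
p_methods = binomial_cdf(12, 18 + 12)

# Same comparison if the split had been 20 against 10
p_methods_20_10 = binomial_cdf(10, 20 + 10)

# Mineral concentrations: Yes = 11, No = 3
p_minerals = binomial_cdf(3, 11 + 3)

for label, p in [("18 vs 12", p_methods), ("20 vs 10", p_methods_20_10), ("11 vs 3", p_minerals)]:
    decision = "reject H0" if p < 0.05 else "do not reject H0"
    print(f"{label}: cumulative probability = {p:.4f} -> {decision}")
```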




Example: Wilcoxon Signed-Rank Test

This test is the nonparametric alternative to the parametric matched-sample test.

As part of a study, we want to see whether the concentration of Mineral-A is higher than that of Mineral-B in a place. We have collected 10 samples and measured the concentrations of Mineral-A and Mineral-B in the samples. Is there any difference in the concentration of minerals in the samples?




Wilcoxon Signed-Rank Test

Preliminary Steps of the Test
• Compute the differences between the paired observations.
• Discard any differences of zero.
• Rank the absolute values of the differences from lowest to highest. Tied differences are assigned the average ranking of their positions.
• Give the ranks the sign of the original difference in the data.
• Sum the signed ranks separately ("+" together and "−" together).
• Wilcoxon statistic W = minimum ("+" Rank; "−" Rank)
• Compare the calculated value to the Wilcoxon tabulated value.
• If your value is less than the tabulated value, reject the null hypothesis.
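The steps above can be sketched as a short function. The paired concentrations below are hypothetical and serve only to show the mechanics; the function name is illustrative.

```python
def wilcoxon_signed_rank(first, second):
    """Return the positive rank sum, the negative rank sum and W = min of the two."""
    # Differences between paired observations, discarding zeros
    diffs = [a - b for a, b in zip(first, second) if a != b]

    # Rank the absolute differences from lowest to highest, averaging ties
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1

    # Sum the ranks of positive and negative differences separately
    plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return plus, minus, min(plus, minus)

# HYPOTHETICAL paired concentrations, for illustration only
mineral_a = [15, 9, 13, 16, 8, 14, 17, 18, 19, 20]
mineral_b = [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]

plus_rank, minus_rank, w = wilcoxon_signed_rank(mineral_a, mineral_b)
print(f"+ Rank = {plus_rank}, - Rank = {minus_rank}, W = {w}")
```

With N non-zero differences, the two rank sums always add up to N(N + 1)/2, which gives a quick check on the ranking.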




Example: Wilcoxon Signed-Rank Test

+ Rank = 49.5; − Rank = 5.5; W = Minimum (+ Rank; − Rank) = 5.5

H0: The concentrations of the minerals are the same
Ha: The concentrations of the minerals are not the same.




Wilcoxon Critical Value Table

W = 5.5

N = 10

W(calculated) < W (critical)

Important Note: If W(calculated) is less than the critical table value, then the null hypothesis is rejected.

Decision: Reject H0. There is sufficient evidence in the sample to conclude that a difference exists in mineral concentration.




Run Test

• The runs test is used to test for serial randomness: whether or not observations occur in a random sequence in time or over space.

• The runs test is used for nominal data.

• In hydrological studies, the runs test is most often used to determine whether observations are random or follow some pattern.




Run Test

For example, we have sampled the occurrence of some hydrological disaster in every year, resulting in the data set:

where A denotes a "No Disaster" year and B denotes a "Disaster" year. We are interested in determining whether the order of the disastrous years is random or not. In some cases, phenomena follow a pattern, like below:




Run Test

Unlike other tests, there is no equation for the runs test unless the sample size of either group is greater than 30. One only needs to count the number of runs (u), a run being a series of the same nominal value when counting from left to right.




Run Test: Example (Two tailed)

Flood years in a place during the last twenty-one years (1990-2010) are given in the table below. It has been reported in different studies that climate change has caused an increase in flood frequency in recent years. We want to check whether this is true in the place of our interest.




Run Test: Example (Two tailed)

YNYNNYNNYNYYYNYYYNYYY

Hypothesis
H0: The occurrence of flood is random.
Ha: The occurrence of flood is not random.

Computation of Test
n1 = 13 ← there are 13 occurrences of flood.
n2 = 8 ← there are 8 occurrences of no flood.
u = 13 ← there are 13 runs.

Decision
At α = 0.05, u(critical) = 6, 16 ← there are 2 critical values of u; if the calculated value falls between these, then H0 is accepted.

Since 6 < 13 < 16, accept H0.
The distribution of flood years is random.
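Counting runs by hand is error-prone for longer records. A minimal sketch that counts the occurrences and runs in the flood sequence of this example is given below; the two critical values are taken from the example rather than computed.

```python
from itertools import groupby

# Flood (Y) and no-flood (N) years from the example, 1990-2010
sequence = "YNYNNYNNYNYYYNYYYNYYY"

n1 = sequence.count("Y")  # occurrences of flood
n2 = sequence.count("N")  # occurrences of no flood

# A run is an unbroken series of the same value; groupby yields one group per run
u = sum(1 for _ in groupby(sequence))

# Critical values quoted in the example for alpha = 0.05
u_lower, u_upper = 6, 16
is_random = u_lower < u < u_upper

print(f"n1 = {n1}, n2 = {n2}, u = {u}")
print("H0 accepted: flood years are random" if is_random else "H0 rejected")
```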




Critical Values for Run Test




Run Test: Example (One tailed)

If a one-tailed runs test is used, we can determine whether the data are random, non-random due to clustering, or non-random due to uniformity.

u has two critical values:
If u < the lower u(critical), then the data are non-random due to clustering.
If u > the upper u(critical), then the data are non-random due to uniformity.
If u falls between the lower and upper u(critical), then the data are random.