Bayan Almukhlif 1 | بيان المخلف
Tutorial 6-ch6
Exercise 1: (same as Example 6.1, page 297)
Municipal wastewater treatment plants are required by law to monitor their discharges into rivers
and streams on a regular basis. Concern about the reliability of data from one of these
self-monitoring programs led to a study in which samples of effluent were divided and sent to two
laboratories for testing. One-half of each sample was sent to the Wisconsin State Laboratory of
Hygiene, and one-half was sent to a private commercial laboratory routinely used in the
monitoring program. Measurements of biochemical oxygen demand (BOD) and suspended solids
(SS) were obtained, for n = 11 sample splits, from the two laboratories. The data are displayed in
Table 6.1.
a) Test $H_0: \boldsymbol{\delta} = \mathbf{0}$.
b) Construct 95% simultaneous confidence intervals for the components of the mean difference vector $\boldsymbol{\delta}$.
c) Construct 95% Bonferroni simultaneous intervals for the components of the mean
difference vector $\boldsymbol{\delta}$. Compare the lengths of these intervals with those of the simultaneous
intervals constructed in part (b).
d) Use $\chi^2$ to test $H_0: \boldsymbol{\delta} = \mathbf{0}$ and construct 95% simultaneous confidence intervals for the components of the
mean difference vector $\boldsymbol{\delta}$ (suppose n is large).
Solution:
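a) $H_0: \boldsymbol{\delta} = \begin{bmatrix} \delta_1 \\ \delta_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$ vs. $H_1: \boldsymbol{\delta} \neq \mathbf{0}$, where $\boldsymbol{\delta} = \boldsymbol{\mu}_{d1} - \boldsymbol{\mu}_{d2}$

# Test statistic for the vector of mean differences $\boldsymbol{\delta}$
$T^2 = n(\bar{\mathbf{d}} - \boldsymbol{\delta})' S_d^{-1} (\bar{\mathbf{d}} - \boldsymbol{\delta})$; reject $H_0$ if $T^2 \geq \frac{(n-1)P}{n-P} F_{P,\,n-P,\,1-\alpha}$

Note: in the R output, reject $H_0$ if $T.2 = F = T^2 \, \frac{n-P}{(n-1)P} \geq F_{P,\,n-P,\,1-\alpha}$

$\frac{(n-1)P}{n-P} F_{P,\,n-P,\,1-\alpha} = \frac{10 \cdot 2}{9} F_{2,\,9,\,0.95} = \frac{10 \cdot 2}{9} (4.25649) = 9.4589$

$\bar{\mathbf{d}} = \begin{bmatrix} -9.3636 \\ 13.2727 \end{bmatrix}$; $S_d = \begin{bmatrix} 199.255 & 88.309 \\ 88.309 & 418.618 \end{bmatrix}$, $S_d^{-1} = \begin{bmatrix} 0.0055363 & -0.0011679 \\ -0.0011679 & 0.0026352 \end{bmatrix}$

Since $T^2 = 13.6312 > 9.4589$, we reject $H_0: \boldsymbol{\delta} = \mathbf{0}$ and conclude that there is a nonzero
mean difference between the measurements of the two laboratories.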
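The test in part (a) can be reproduced from the summary statistics alone. The following is a minimal R sketch, assuming only the values of n, the mean difference vector and S_d quoted above; the variable names are illustrative. Because the inputs are rounded, the computed T2 is close to, but not exactly, the 13.6312 obtained from the raw data.

```r
# Paired-comparison Hotelling T^2 from the rounded summary statistics above
n <- 11; p <- 2; alpha <- 0.05
dbar <- c(-9.3636, 13.2727)
Sd <- matrix(c(199.255,  88.309,
                88.309, 418.618), nrow = 2, byrow = TRUE)

# T^2 = n * dbar' Sd^{-1} dbar  (delta = 0 under H0)
T2 <- n * drop(t(dbar) %*% solve(Sd) %*% dbar)

# Critical value (n-1)p/(n-p) * F_{p, n-p, 1-alpha}
crit <- (n - 1) * p / (n - p) * qf(1 - alpha, df1 = p, df2 = n - p)

# Equivalent F form, as reported by R's T.2
F.stat  <- T2 * (n - p) / ((n - 1) * p)
p.value <- pf(F.stat, p, n - p, lower.tail = FALSE)

decision <- ifelse(T2 > crit, "Reject H0", "Fail to Reject H0")
print(c(T2 = T2, crit = crit, F = F.stat, p.value = p.value))
print(decision)
```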
b) The $100(1-\alpha)\%$ simultaneous intervals for the individual mean differences (paired comparisons):
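As a hedged illustration of parts (b) to (d), and not the tutorial's own code, the intervals can be sketched in R from the same summary statistics. The sketch assumes the standard paired-comparison form, mean difference ± multiplier × sqrt(s²/n), with multiplier sqrt((n−1)P/(n−P) · F) for the T² intervals, the t quantile at 1 − α/(2P) with n − 1 d.f. for Bonferroni, and sqrt(χ²) with P d.f. for the large-sample version. All variable names are illustrative.

```r
n <- 11; p <- 2; alpha <- 0.05
dbar <- c(delta1 = -9.3636, delta2 = 13.2727)
s2   <- c(199.255, 418.618)          # diagonal elements of S_d
se   <- sqrt(s2 / n)

# Multipliers for the three kinds of interval
c.T2   <- sqrt((n - 1) * p / (n - p) * qf(1 - alpha, p, n - p))  # part (b)
c.bonf <- qt(1 - alpha / (2 * p), df = n - 1)                    # part (c)
c.chi  <- sqrt(qchisq(1 - alpha, df = p))                        # part (d), large n

ci <- function(mult) cbind(lower = dbar - mult * se, upper = dbar + mult * se)
ci.T2   <- ci(c.T2)
ci.bonf <- ci(c.bonf)
ci.chi  <- ci(c.chi)

# Interval lengths side by side, for the comparison asked for in part (c)
lengths <- cbind(T2 = 2 * c.T2 * se, Bonferroni = 2 * c.bonf * se, ChiSq = 2 * c.chi * se)
print(ci.T2); print(ci.bonf); print(ci.chi); print(lengths)
```

The Bonferroni multiplier is smaller than the T² multiplier here, so the Bonferroni intervals are shorter. The chi-square intervals are shorter still, but they rely on n being large, which n = 11 is not.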
##compute Hotelling Test ##
#_first way
z <- cbind(data2$fuel,data2$repair,data2$capital)
ICSNP::HotellingsT2(z ~data2$truck_type,test = "f")

Hotelling's two sample T2-test
data: z by data2$truck_type
T.2 = 16.375, df1 = 3, df2 = 55, p-value = 1e-07
alternative hypothesis: true location difference is not equal to c(0,0,0)

# note: $T^2 = (\text{approx. } F) \, \frac{(n_1+n_2-2)P}{n_1+n_2-P-1}$

#_second way
man.Res <- manova(cbind(fuel,repair,capital) ~truck_type,data=data2)
summary(man.Res)

           Df Pillai approx F num Df den Df Pr(>F)
truck_type  1 0.4718   16.375      3     55  1e-07 ***
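The two outputs report the same approximate F, and the conversion in the note can be checked numerically. The sketch below assumes n1 + n2 = 59, which is inferred from df2 = n1 + n2 − P − 1 = 55 with P = 3 rather than stated in the text. It also uses the fact that with two groups Pillai's trace equals T²/(T² + n1 + n2 − 2).

```r
F.approx <- 16.375; p <- 3; df2 <- 55
N <- df2 + p + 1                       # n1 + n2, inferred from df2

# Hotelling T^2 recovered from the approximate F
T2 <- F.approx * (N - 2) * p / (N - p - 1)

# Pillai's trace for two groups, to compare with the manova() output
pillai <- T2 / (T2 + N - 2)

p.value <- pf(F.approx, p, df2, lower.tail = FALSE)
print(c(T2 = T2, Pillai = pillai, p.value = p.value))
```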
alpha <- 0.01
n1 <- length(which((data2$truck_type=="gasoline")))
n2 <- length(which((data2$truck_type=="diesel")))
p <- 3
# mean vector of gasoline pop. and mean vector of diesel pop.
xbar1 <- colMeans(data2[data2$truck_type=="gasoline",-4])
xbar2 <- colMeans(data2[data2$truck_type=="diesel",-4])

# test and construct 99% C.I. when the covariance matrices are not homogeneous (using chi-square)
S.chi <- matrix((S1/n1+S2/n2),ncol=3,nrow=3)
(T2.chi <- t(xbar.d) %*% solve(S.chi) %*% xbar.d)

         [,1]
[1,] 43.17639

(chi.crit <- qchisq(.99, df=3))

[1] 11.34487

ifelse(T2.chi>chi.crit,"Reject H0","Fail to Reject H0")

     [,1]
[1,] "Reject H0"

for(i in 1:3) {
  print(xbar.d[i] + sqrt(S.chi[i,i])*sqrt(chi.crit)*c(-1,1))
}
Critical value: $F_{2P,\,2(\sum n_l - P - 2),\,1-\alpha} = F_{2(4),\,2(510),\,0.99} = 2.528682$. Since $18.48786 > 2.528682$, we reject $H_0$ at the 1% level and conclude that average costs
differ, depending on type of ownership.
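b) Other statistics can be used to check the equality of several multivariate means. Bartlett has shown
that if $H_0$ is true and $\sum n_l = n$ is large, the statistic below has approximately a chi-square distribution with $P(g-1)$ degrees of freedom.
Test statistic for large $\sum n_l$:
$T = -\left(n - 1 - \frac{P+g}{2}\right) \ln\left(\frac{|W|}{|B+W|}\right)$
We reject $H_0$ at significance level $\alpha$ if $T > \chi^2_{P(g-1),\,1-\alpha}$.
$T = -\left(n - 1 - \frac{P+g}{2}\right) \ln\left(\frac{|W|}{|B+W|}\right) = 138.5215$
$\chi^2_{P(g-1),\,1-\alpha} = \chi^2_{4(2),\,0.99} = 20.09024$
Since $T > \chi^2_{8,\,0.99}$, we reject $H_0$ at the 1% level. This result is consistent with the result based on
the foregoing F-statistic.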
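The F statistic and Bartlett's statistic are both transformations of the same Wilks' lambda, |W|/|B+W|, so one can be recovered from the other. The following is a minimal sketch, assuming the exact relation for g = 3 groups, F = ((Σn_l − P − 2)/P)(1 − sqrt(Λ))/sqrt(Λ), and the values n = 516, P = 4 implied by the degrees of freedom quoted above. Variable names are illustrative.

```r
n <- 516; p <- 4; g <- 3; alpha <- 0.01
F.stat <- 18.48786                       # F statistic reported above

# Invert the g = 3 relation to recover Wilks' lambda
sqrt.lambda <- 1 / (1 + F.stat * p / (n - p - 2))
lambda <- sqrt.lambda^2

# Bartlett's large-sample statistic and both critical values
T.bartlett <- -(n - 1 - (p + g) / 2) * log(lambda)
F.crit   <- qf(1 - alpha, 2 * p, 2 * (n - p - 2))
chi.crit <- qchisq(1 - alpha, p * (g - 1))

print(c(lambda = lambda, T = T.bartlett, F.crit = F.crit, chi.crit = chi.crit))
```

The recovered value of T agrees with the 138.5215 quoted above, which confirms that the two tests are based on the same lambda.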
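c) Construct simultaneous 95% confidence intervals to determine which mean components of
variable $x_3$ differ among the populations.
Using Result 6.5: $\tau_{ki} - \tau_{li} \in \bar{x}_{ki} - \bar{x}_{li} \pm t_{n-g,\,\alpha/(Pg(g-1))} \sqrt{\frac{w_{ii}}{n-g}\left(\frac{1}{n_k} + \frac{1}{n_l}\right)}$, where $w_{ii}$ is the $i$th diagonal element of $W$ and $t_{n-g,\,\alpha/(Pg(g-1))}$ is the upper $\alpha/(Pg(g-1))$ percentage point.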
Since $C = 240.9112 > \chi^2_{20,\,0.95} = 31.4104$, it is clear that $H_0$ is rejected at any reasonable
level of significance. We conclude that the covariance matrices of the cost variables associated
with the three populations of nursing homes are not the same.
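The statistic C above is Box's M statistic with its chi-square scaling. The code for this step is not shown in the text, so the following is a hedged sketch of the standard formulas: M, the correction factor u, and C = (1 − u)M with P(P+1)(g−1)/2 degrees of freedom. The function name `box_m` and its arguments are hypothetical. The built-in `iris` data stand in for the nursing home data only to make the block runnable.

```r
# Box's M test for equality of g covariance matrices (chi-square approximation).
# S_list: list of sample covariance matrices; n_vec: group sample sizes.
box_m <- function(S_list, n_vec) {
  g <- length(S_list)
  p <- ncol(S_list[[1]])
  v <- n_vec - 1                                   # group degrees of freedom
  S_pooled <- Reduce(`+`, Map(function(S, w) w * S, S_list, v)) / sum(v)
  logdet <- function(S) as.numeric(determinant(S, logarithm = TRUE)$modulus)
  M <- sum(v) * logdet(S_pooled) - sum(v * sapply(S_list, logdet))
  u <- (sum(1 / v) - 1 / sum(v)) * (2 * p^2 + 3 * p - 1) / (6 * (p + 1) * (g - 1))
  C <- (1 - u) * M
  df <- p * (p + 1) * (g - 1) / 2
  list(M = M, C = C, df = df, p.value = pchisq(C, df, lower.tail = FALSE))
}

# Stand-in data (NOT the nursing home data): 3 groups, 4 variables
groups <- split(iris[, 1:4], iris$Species)
res <- box_m(lapply(groups, cov), sapply(groups, nrow))
print(res)

# Critical value for the example in the text: P = 4, g = 3 gives 20 d.f.
crit <- qchisq(0.95, df = 4 * 5 * 2 / 2)
print(crit)
```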
R Code
# Example 6.10 & 6.11 page 306
rm(list=ls())
# note: Examples 6.10-6.12 are limited by the summary statistics rather than raw data - calculations are a bit off due to rounding
alpha <- 0.01
n1 <- 271
n2 <- 138
n3 <- 107
n <- n1 + n2 + n3
p <- 4
g <- 3
xbar1 <- matrix(c(2.066,0.480,0.082,0.360),ncol=1)
xbar2 <- matrix(c(2.167,0.596,0.124,0.418),ncol=1)
xbar3 <- matrix(c(2.273,0.521,0.125,0.383),ncol=1)
S1 <- matrix(c(.291,-.001,.002,.010,
               -.001,.011,.000,.003,
               .002,.000,.001,.000,
               .010,.003,.000,.010),ncol=4)
S2 <- matrix(c(.561,.011,.001,.037,
Critical value: $F_{2P,\,2(\sum n_l - P - 2),\,1-\alpha} = F_{8,\,168,\,0.95} = 1.9939$. Since $2.049 > 1.9939$, we reject $H_0$ at the 5% level, implying there is a time period effect.
The mean measurements of male Egyptian skulls differ across the three time periods.
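b) For pairwise comparisons of treatment means, the Bonferroni approach can be used to
construct simultaneous confidence intervals for the components of the differences $\tau_{ki} - \tau_{li}$.
From Result 6.5, for the MANOVA model, with confidence level at least $100(1-\alpha)\%$:
$\tau_{ki} - \tau_{li} \in \bar{x}_{ki} - \bar{x}_{li} \pm t_{n-g,\,\alpha/(Pg(g-1))} \sqrt{\frac{w_{ii}}{n-g}\left(\frac{1}{n_k} + \frac{1}{n_l}\right)}$, where $w_{ii}$ is the $i$th diagonal element of $W$ and $t_{n-g,\,\alpha/(Pg(g-1))}$ is the upper $\alpha/(Pg(g-1))$ percentage point.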
qtlevel <- qt(1-alpha/(p*g*(g-1)),df=n-g)
for ( i in 1:p ){
  # \tau_{1i}-\tau_{2i}
  LCI12 <- (xbar1[i]-xbar2[i])-qtlevel*sqrt(W[i,i]/(n-g)*(1/n1+1/n2))
  UCI12 <- (xbar1[i]-xbar2[i])+qtlevel*sqrt(W[i,i]/(n-g)*(1/n1+1/n2))
  cat("tau1[",i,"]-tau2[",i,"] belongs to (",LCI12,",",UCI12,")\n",sep="")
  # \tau_{1i}-\tau_{3i}
  LCI13 <- (xbar1[i]-xbar3[i])-qtlevel*sqrt(W[i,i]/(n-g)*(1/n1+1/n3))
  UCI13 <- (xbar1[i]-xbar3[i])+qtlevel*sqrt(W[i,i]/(n-g)*(1/n1+1/n3))
  cat("tau1[",i,"]-tau3[",i,"] belongs to (",LCI13,",",UCI13,")\n",sep="")
  # \tau_{2i}-\tau_{3i}
  LCI23 <- (xbar2[i]-xbar3[i])-qtlevel*sqrt(W[i,i]/(n-g)*(1/n2+1/n3))
  UCI23 <- (xbar2[i]-xbar3[i])+qtlevel*sqrt(W[i,i]/(n-g)*(1/n2+1/n3))
  cat("tau2[",i,"]-tau3[",i,"] belongs to (",LCI23,",",UCI23,")\n",sep="")
}

tau1[1]-tau2[1] belongs to (-4.442312,2.442312)
tau1[1]-tau3[1] belongs to (-6.542312,0.3423115)
tau2[1]-tau3[1] belongs to (-5.542312,1.342312)
tau1[2]-tau2[2] belongs to (-2.673706,4.473706)
tau1[2]-tau3[2] belongs to (-3.773706,3.373706)
tau2[2]-tau3[2] belongs to (-4.673706,2.473706)
tau1[3]-tau2[3] belongs to (-3.68011,3.88011)
tau1[3]-tau3[3] belongs to (-0.6467765,6.913443)
tau2[3]-tau3[3] belongs to (-0.7467765,6.813443)
tau1[4]-tau2[4] belongs to (-2.061423,2.661423)
tau1[4]-tau3[4] belongs to (-2.394756,2.328089)
tau2[4]-tau3[4] belongs to (-2.694756,2.028089)
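The loop above writes out each of the three pairs by hand. A more compact sketch, under the assumption that the raw data and a grouping factor are available, computes W and every pairwise interval in one function using the same multiplier and standard error as the loop. The name `bonferroni_pairs` is hypothetical, and the built-in `iris` data are used purely as stand-in data, not the skull measurements.

```r
# Bonferroni simultaneous intervals for all pairwise differences tau_ki - tau_li
# in a one-way MANOVA. X: numeric data (n x p); group: grouping factor.
bonferroni_pairs <- function(X, group, alpha = 0.05) {
  X <- as.matrix(X)
  group <- factor(group)
  n <- nrow(X); p <- ncol(X); g <- nlevels(group)
  n_l <- setNames(as.vector(table(group)), levels(group))
  xbar <- apply(X, 2, function(col) tapply(col, group, mean))  # g x p group means
  # Within-groups sum of squares and cross-products matrix W
  W <- Reduce(`+`, lapply(levels(group), function(l) {
    Xl <- X[group == l, , drop = FALSE]
    crossprod(sweep(Xl, 2, colMeans(Xl)))
  }))
  t_crit <- qt(1 - alpha / (p * g * (g - 1)), df = n - g)
  pairs <- combn(levels(group), 2)
  out <- NULL
  for (j in seq_len(ncol(pairs))) {
    k <- pairs[1, j]; l <- pairs[2, j]
    half <- unname(t_crit * sqrt(diag(W) / (n - g) * (1 / n_l[[k]] + 1 / n_l[[l]])))
    est  <- unname(xbar[k, ] - xbar[l, ])
    out <- rbind(out, data.frame(pair = paste(k, l, sep = " - "),
                                 variable = colnames(X),
                                 lower = est - half, upper = est + half))
  }
  out
}

res <- bonferroni_pairs(iris[, 1:4], iris$Species, alpha = 0.05)
print(res)
```

An interval that excludes zero indicates a difference between that pair of groups on that variable.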
Total: SSP = $\begin{bmatrix} 4.2655 & -0.7855 & -0.2395 \\ -0.7855 & 5.0855 & 1.9095 \\ -0.2395 & 1.9095 & 74.2055 \end{bmatrix}$, d.f. = $gbn - 1 = 19$
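The Total row can be checked directly, since the total sum of squares and cross-products matrix is the centered cross-product of the responses. The sketch below is an illustration under assumptions: it uses the plastic film data as printed in the example of R's `?summary.manova` help page, assumed to be the same 20 observations as file T6-4, with g = 2, b = 2 and n = 5 replicates. The factor layout is taken from that help page and may not match the row order of T6-4.

```r
# Plastic film data as given in the example of ?summary.manova
# (assumption: these are the same observations as file T6-4)
tear    <- c(6.5, 6.2, 5.8, 6.5, 6.5, 6.9, 7.2, 6.9, 6.1, 6.3,
             6.7, 6.6, 7.2, 7.1, 6.8, 7.1, 7.0, 7.2, 7.5, 7.6)
gloss   <- c(9.5, 9.9, 9.6, 9.6, 9.2, 9.1, 10.0, 9.9, 9.5, 9.4,
             9.1, 9.3, 8.3, 8.4, 8.5, 9.2, 8.8, 9.7, 10.1, 9.2)
opacity <- c(4.4, 6.4, 3.0, 4.1, 0.8, 5.7, 2.0, 3.9, 1.9, 5.7,
             2.8, 4.1, 3.8, 1.6, 3.4, 8.4, 5.2, 6.9, 2.7, 1.9)
Y <- cbind(tear, gloss, opacity)

g <- 2; b <- 2; n <- 5
SSP.total <- crossprod(sweep(Y, 2, colMeans(Y)))   # total (corrected) SSP matrix
df.total  <- g * b * n - 1
print(round(SSP.total, 4)); print(df.total)

# Two-way MANOVA with interaction; factor layout as in the R help page
rate     <- gl(2, 10, labels = c("Low", "High"))
additive <- gl(2, 5, length = 20, labels = c("Low", "High"))
fit <- manova(Y ~ rate * additive)
print(summary(fit, test = "Wilks"))
```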
R Code _two way MANOVA
# Example 5 (6.13 Page 318 (339pdf))
rm(list=ls())
data <- read.table(file.choose()) #file T6-4
names(data) <- c("ExtrusionRate","Additive","TearRes","Gloss","Opacity")
# For computerized approaches - make appropriate columns factors, but this can be really annoying so don't do it in dat
data2 <- data
data2[,1] <- as.factor(data2[,1])
data2[,2] <- as.factor(data2[,2])
## Part (a)
alpha <- 0.05