-
12.1
12 Probability and Statistics
1. COMBINATIONS AND PERMUTATIONS
There are a finite number of ways in which n elements can be
combined into distinctly different groups of r items. For example,
suppose a farmer has a chicken, a rooster, a duck, and a cage that
holds only two birds. The possible combinations of three birds taken
two at a time are (chicken, rooster), (chicken, duck), and
(rooster, duck). The birds in the cage will not remain stationary,
so the combination (rooster, chicken) is not distinctly different
from (chicken, rooster). That is, the combinations are not order
conscious.
Equation 12.1: Combinations
$C(n,r) = \frac{P(n,r)}{r!} = \frac{n!}{r!(n-r)!}$    12.1
The number of combinations of n items taken r at a time is written
C(n, r), $C_r^n$, $_nC_r$, $^nC_r$, or $\binom{n}{r}$ (pronounced
"n choose r"). It is sometimes referred to as the binomial
coefficient.
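The bird example can be checked by brute force. The following is a minimal sketch, assuming Python 3.8 or later; the variable names are illustrative only. It enumerates the unordered pairs and compares the count with C(3, 2).

```python
from itertools import combinations
from math import comb

birds = ["chicken", "rooster", "duck"]

# Enumerate every unordered pair of birds (order inside a pair is ignored).
pairs = list(combinations(birds, 2))

# Closed-form count, n! / (r! (n - r)!), for n = 3 and r = 2.
n_pairs = comb(len(birds), 2)

print(pairs)
print(n_pairs)
```

Because `combinations` never emits both (chicken, rooster) and (rooster, chicken), the list length agrees with the closed-form count.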
-
12.2
An order-conscious subset of r items taken from a set of n items
is the p
-
12.3
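Solution The marble colors represent different types of objects.
The number of permutations of the marbles taken 13 at a time is
$P(13; 4, 2, 7) = \frac{n!}{n_1!\,n_2! \cdots n_k!} = \frac{13!}{4!\,2!\,7!}$
$= \frac{13 \times 12 \times 11 \times 10 \times 9 \times 8 \times 7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 1}{(4 \times 3 \times 2 \times 1)(2 \times 1)(7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 1)}$
$= 25{,}740 \quad (26{,}000)$
The answer is (D).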
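The arithmetic in this solution can be reproduced with integer factorials. This is a small sketch; the helper name `multiset_permutations` is hypothetical and not taken from the text.

```python
from math import factorial

def multiset_permutations(counts):
    """Permutations of n objects where counts[i] objects of type i are alike."""
    n = sum(counts)
    result = factorial(n)
    for k in counts:
        # Each successive division is exact, so integer division loses nothing.
        result //= factorial(k)
    return result

# 13 marbles split into groups of 4, 2, and 7 identical marbles.
marbles = multiset_permutations([4, 2, 7])
print(marbles)
```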
2. LAWS OF PROBABILITY
Probability theory determines the relative likelihood that a
particular event will occur. An event, E, is one of the possible
outcomes of a trial. The probability of E occurring is denoted as
P(E).
Probabilities are real numbers in the range of zero to one. If an
event E is certain to occur, then the probability P(E) of the event
is equal to one. If the event is certain not to occur, then the
probability P(E) of the event is equal to zero. The probability of
any other event is between zero and one.
The probability of an event occurring is equal to one minus the
probability of the event not occurring. This is known as a
complementary probability.
P(E) = 1 - P(not E)
Complementary probability can be used to simplify some
probability calculations. For example, calculation of the
probability of numerical events being "greater than" or "less than"
or quantities being "at least" a certain number can often be
simplified by calculating the probability of the complementary
event.
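As an illustration of an "at least" question, the sketch below uses a hypothetical example that is not from the text: the probability of at least one head in three fair coin tosses. The complementary event (no heads at all) is easier to evaluate directly.

```python
from fractions import Fraction

# Hypothetical example: three independent tosses of a fair coin.
p_tail = Fraction(1, 2)

# Complementary event: every toss is a tail (no heads at all).
p_no_heads = p_tail ** 3

# P(E) = 1 - P(not E)
p_at_least_one_head = 1 - p_no_heads
print(p_at_least_one_head)
```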
Probabilities of multiple events can be calculated from the
probabilities of individual events using a variety of methods. When
multiple events are considered, those events can either be
independent or dependent. The probability of an independent event
does not affect (and is not affected by) other events. The assumption
of independence is appropriate when sampling from infinite or very
large populations, when sampling from finite populations with
replacement, or when sampling from different populations
(universes). For example, the outcome of a second coin toss is
generally not affected by the outcome of the first coin toss. The
probability of a dependent event is affected by what has previously
happened. For example, drawing a second card from a deck of cards
without replacement is affected by what was drawn as the first card.
Events can be combined in two basic ways, according to the way
the combination is described. Events can be connected by the words
"and" and "or." For example, the question, "What is the probability
of event A and event B occurring?" is different than the question,
"What is the probability of event A or event B occurring?" The
combinatorial "and" is designated in various ways: AB, A·B, A × B,
A ∩ B, and A, B, among others. In this book, the probability of A
and B both occurring is designated as P(A, B).
The combinatorial "or" is designated as A + B and A ∪ B. In this
book, the probability of A or B occurring is designated as P(A + B).
Equation 12.4: Law of Total Probability
P(A + B) = P(A) + P(B) - P(A, B)    12.4
Equation 12.4 gives the probability that either event A or B will
occur. P(A, B) is the probability that both A and B will occur.
-
12.4
A deck of ten children's cards contains three fish cards, two dog
cards, and five cat cards. What is the probability of drawing
either a cat card or a dog card from a full deck?
(A) 1/10
(B) 2/10
(C) 5/10
(D) 7/10
Solution The two events are mutually exclusive, so the
probability of both happening, P(A, B), is zero. The total
probability of drawing either a cat card or a dog card is
P(A + B) = P(A) + P(B) - P(A, B) = 5/10 + 2/10 - 0 = 7/10
The answer is (D).
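The card calculation can be written as a short function. The sketch below uses exact fractions; the function name `p_either` is an illustrative assumption.

```python
from fractions import Fraction

def p_either(p_a, p_b, p_both):
    """Law of total probability: P(A + B) = P(A) + P(B) - P(A, B)."""
    return p_a + p_b - p_both

p_cat = Fraction(5, 10)  # five cat cards in a ten-card deck
p_dog = Fraction(2, 10)  # two dog cards in a ten-card deck

# A single card cannot be both a cat and a dog, so P(A, B) = 0.
p_cat_or_dog = p_either(p_cat, p_dog, Fraction(0))
print(p_cat_or_dog)
```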
Equation 12.5: Law of Compound (Joint) Probability
P(A, B) = P(A)P(B|A) = P(B)P(A|B)  [independent events]    12.5
Equation 12.5, the law of compound (joint) probability, gives
the probability that events A and B will both occur. P(B|A) is the
conditional probability that B will occur given that A has already
occurred. Likewise, P(A|B) is the conditional probability that A
will occur given that B has already occurred.
-
12.5
Solution There is a total of 17 balls. There are 2 white balls.
The probability of picking a white ball as the first ball is
P(A) = 2/17
After picking a white ball first, there are 16 balls remaining,
7 of which are orange. The probability of picking an orange ball
second given that a white ball was chosen first is
P(B|A) = 7/16
The probability of picking a white ball first and an orange ball
second is
P(A, B) = P(A)P(B|A)
= (2/17)(7/16) = 0.05147 (0.052)
The answer is (B).
Equation 12.6: Bayes' Theorem
$P(B_j|A) = \frac{P(B_j)P(A|B_j)}{\sum_{i=1}^{n} P(A|B_i)P(B_i)}$
$P(B|A) = \frac{P(B \text{ and } A)}{P(A)}$    12.6
Given two dependent sets of events, A and B, the probability
that event B will occur given the fact that the dependent event A
has already occurred is written as $P(B_j|A)$ and is given by Bayes'
theorem, Eq. 12.6.
A medical patient exhibits a symptom that occurs naturally 10% of the
time in all people. The symptom is also exhibited by all patients who
have a particular disease. The incidence of that particular disease
among all people is 0.0002%. What is the probability of the
patient having that particular disease?
(A) 0.002%
(B) 0.01%
(C) 0.3%
(D) 4%
-
12.6
Solution This problem is asking for a conditional probability:
the probability that a person has a disease, D, given that the
person has a symptom, S. Use Bayes' theorem to calculate the
probability. The probability that a person has the symptom S, given
that they have the disease D, is P(S|D) and is 100%. Multiply by
100% to get the answer as a percentage.
$P(D|S) = \frac{P(D)P(S|D)}{P(S|D)P(D) + P(S|\text{not } D)P(\text{not } D)}$
$= \frac{(0.000002)(1.00)}{(1.00)(0.000002) + (0.10)(0.999998)}$
$= 0.00002 \quad (0.002\%)$
The answer is (A).
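The same Bayes' theorem evaluation can be sketched in code for the two-outcome case (disease or no disease). The function and parameter names below are illustrative assumptions, not notation from the text.

```python
def bayes_posterior(prior, p_s_given_d, p_s_given_not_d):
    """P(D|S) for a two-outcome partition: D and not D."""
    numerator = p_s_given_d * prior
    evidence = numerator + p_s_given_not_d * (1.0 - prior)
    return numerator / evidence

# Incidence of 0.0002% is 0.000002 as a fraction.
p_disease = bayes_posterior(prior=0.000002, p_s_given_d=1.00, p_s_given_not_d=0.10)
print(f"{p_disease:.6f} ({p_disease * 100:.3f}%)")
```

The result is small because the symptom is far more common among healthy people than the disease itself is in the population.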
3. MEASURES OF CENTRAL TENDENCY
It is often unnecessary to present experimental data in their
entirety, either in tabular or graphic form. In such cases, the data
and distribution can be represented by various parameters. One type
of parameter is a measure of central tendency.
-
12.7
45 measurements were between 0.859 and 0.900
0.901 was observed once
0.902 was observed three times
0.903 was observed twice
0.904 was observed four times
45 measurements were between 0.905 and 0.958
The smallest value was 0.859, and the largest value was 0.958.
The sum of all 100 measurements was 91.170. Except those noted, no
measurements occurred more than twice.
What are the (a) mean, (b) mode, and (c) median of the
measurements, respectively?
(A) 0.908; 0.902; 0.902
(B) 0.908; 0.904; 0.903
(C) 0.912; 0.902; 0.902
(D) 0.912; 0.904; 0.903
Solution (a) From Eq. 12.7, the arithmetic mean is
$\bar{X} = (1/n)\sum X_i = \left(\frac{1}{100}\right)(91.170) = 0.9117 \quad (0.912)$
(b) The mode is the value that occurs most frequently. The value
of 0.904 occurred four times, and no other measurements repeated
more than four times. 0.904 is the mode.
(c) The median is the value
-
12.8
student score
80%
95%
72%
95%
What is most nearly the student's final grade in the
course?
(A) 82%
(B) 85%
(C) 87%
(D) 89%
Solution The student's final grade is the weighted arithmetic
mean of the individual exam scores.
$\bar{X}_w = \frac{\sum w_iX_i}{\sum w_i} = \frac{(1)(80\%) + (2)(95\%) + (2)(72\%) + (5)(95\%)}{1 + 2 + 2 + 5}$
$= 88.9\% \quad (89\%)$
The answer is (D).
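The weighted mean can be reproduced with the scores and weights that appear in the solution. This is a brief sketch; the function name is illustrative.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w_i * X_i) / sum(w_i)."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

scores = [80, 95, 72, 95]  # exam scores in percent, as used in the solution
weights = [1, 2, 2, 5]     # corresponding exam weights

final_grade = weighted_mean(scores, weights)
print(final_grade)
```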
Equation 12.9: Geometric Mean
sample geometric mean = $\sqrt[n]{X_1 X_2 X_3 \cdots X_n}$    12.9
The geometric mean of n nonnegative values is defined by Eq.
12.9. The geometric mean is the number that, when raised to the
power of the sample size, produces the same result as the product
of all samples. It is appropriate to use the geometric mean when
the values being averaged are used as consecutive multipliers in
other calculations. For example, the total revenue earned on an
investment of C earning an effective interest rate of $i_k$ in year k
is calculated as $R = C(i_1 i_2 i_3 \cdots i_k)$. The interest rate, i,
is a multiplicative element. If a $100 investment earns 10% in year 1
(resulting in $110 at the end of the year), then the $110 earns
30% in year 2 (resulting in $143), and the $143 earns 50% in year 3
(resulting in $215), the average interest earned each year would
not be the arithmetic mean of (10% + 30% + 50%)/3 = 30%. The average
would be calculated as a geometric mean (24.66%).
What is most nearly the geometric mean of the following data
set?
0.820, 1.96, 2.22, 0.190, 1.00
-
12.9
(A) 0.79
(B) 0.81
(C) 0.93
(D) 0.96
Solution The geometric mean of the data set is
$\sqrt[5]{(0.820)(1.96)(2.22)(0.190)(1.00)} = 0.925 \quad (0.93)$
The answer is (C).
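The nth-root calculation can be checked numerically. The sketch below assumes Python 3.8 or later (for `math.prod`) and uses an illustrative function name.

```python
from math import prod

def geometric_mean(values):
    """nth root of the product of n nonnegative values."""
    return prod(values) ** (1.0 / len(values))

data = [0.820, 1.96, 2.22, 0.190, 1.00]
gm = geometric_mean(data)
print(round(gm, 3))
```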
Equation 12.10: Root-Mean-Square
sample root-mean-square value = $\sqrt{(1/n)\sum X_i^2}$    12.10
The root-m
-
12.10
4. MEASURES OF DISPERSION
Measures of dispersion describe the variability in observed
data.
Equation 12.11 Through Eq. 12.15: Standard Deviation
$\sigma = \sqrt{(1/N)\sum (X_i - \mu)^2}$    12.11
12.12
12.13
12.14
12.15
One measure of dispersion is the standard deviation, defined in
Eq. 12.11. N is the total population size, not the sample size, n.
This implies that the entire population is measured.
Equation 12.11 can be used to calculate the standard deviation
only when the entire population can be included in the calculation.
When only a small subs
-
12.11
A cat colony living in a small town has a total population of
seven cats. The ages of the cats are as shown.
age (yr)    number
7           1
8           1
10          2
12          1
13          2
What is most nearly the standard deviation of the age of the
cat population?
(A) 1.7 yr
(B) 2.0 yr
(C) 2.2 yr
(D) 2.4 yr
Solution Using Eq. 12.7, the arithmetic mean of the ages is the
population mean, $\mu$.
$\mu = \left(\frac{1}{7}\right)\big((1)(7\ \text{yr}) + (1)(8\ \text{yr}) + (2)(10\ \text{yr}) + (1)(12\ \text{yr}) + (2)(13\ \text{yr})\big)$
$= 10.4\ \text{yr}$
From Eq. 12.11, the standard deviation of the ages is
-
12.12
Equation 12.16: Sample Standard Deviation
$s = \sqrt{[1/(n-1)]\sum_{i=1}^{n}(X_i - \bar{X})^2}$    12.16
The standard deviation of a sample (particularly a small sample)
of n items calculated from Eq. 12.11 is a biased estimator of (i.e.,
on the average, it is not equal to) the population standard
deviation. A different measure of dispersion called the sample
standard deviation, s (not the same as the standard deviation of a
sample), is an unbiased estimator of the population standard
deviation. The sample standard deviation can be found using Eq.
12.16.
Samples of aluminum-alloy channels were tested for stiffness. The
following distribution of results was obtained.
stiffness    frequency
2480         23
2440         35
2400         40
2360         33
2320         21
If the mean of the samples is 2402, what is the approximate
standard deviation of the population from which the samples are
taken?
(A) 48.2
(B) 49.7
(C) 50.6
(D) 50.8
Solution The number of samples is
n = 23 + 35 + 40 + 33 + 21 = 152
The sample standard deviation, s, is the unbiased estimator of
the population standard deviation, $\sigma$.
$s = \sqrt{[1/(n-1)]\sum (X_i - \bar{X})^2}$
$= \sqrt{\left(\frac{1}{152-1}\right)\left((23)(2480-2402)^2 + (35)(2440-2402)^2 + (40)(2400-2402)^2 + (33)(2360-2402)^2 + (21)(2320-2402)^2\right)}$
$= 50.82 \quad (50.8)$
-
12.13
The answer is (D).
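For grouped data like the stiffness results, each squared deviation is weighted by its frequency. The following sketch assumes the mean of 2402 stated in the problem; the function name is illustrative.

```python
from math import sqrt

def grouped_sample_std(values, freqs, mean):
    """Sample standard deviation (n - 1 divisor) for frequency-grouped data."""
    n = sum(freqs)
    sum_sq = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs))
    return sqrt(sum_sq / (n - 1))

stiffness = [2480, 2440, 2400, 2360, 2320]
frequency = [23, 35, 40, 33, 21]

s = grouped_sample_std(stiffness, frequency, mean=2402)
print(round(s, 2))
```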
Equation 12.17 Through Eq. 12.19: Variance and Sample
Variance
$\sigma^2 = (1/N)\left[(X_1 - \mu)^2 + (X_2 - \mu)^2 + \cdots + (X_N - \mu)^2\right]$    12.17
$\sigma^2 = (1/N)\sum_{i=1}^{N}(X_i - \mu)^2$    12.18
$s^2 = [1/(n-1)]\sum_{i=1}^{n}(X_i - \bar{X})^2$    12.19
The variance is the square of the standard deviation. Since there
are two standard deviations, there are two variances. The variance
of the population (i.e., the population variance) is $\sigma^2$, and
the sample variance is $s^2$. The population variance can be found
using either Eq. 12.17 or Eq. 12.18, both derived from Eq. 12.11, and
the sample variance can be found using Eq. 12.19, derived from Eq.
12.16.
Most nearly, what is the sample variance of the following data
set?
2, 4, 6, 8, 10, 12, 14
(A) 4.3
(B) 5.2
(C) 8.0
(D) 19
Solution Find the mean using Eq. 12.7.
$\bar{X} = (1/n)\sum X_i = \left(\frac{1}{7}\right)(2 + 4 + 6 + 8 + 10 + 12 + 14)$
$= 8$
From Eq. 12.19, the sample variance is
$s^2 = [1/(n-1)]\sum (X_i - \bar{X})^2$
-
12.14
$= \left(\frac{1}{7-1}\right)\left((2-8)^2 + (4-8)^2 + (6-8)^2 + (8-8)^2 + (10-8)^2 + (12-8)^2 + (14-8)^2\right)$
$= 18.67 \quad (19)$
The answer is (D).
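Python's standard `statistics` module uses the same n - 1 divisor for `variance`, so it can serve as a quick check of this result. A minimal sketch:

```python
from statistics import mean, variance

data = [2, 4, 6, 8, 10, 12, 14]

x_bar = mean(data)          # arithmetic mean of the data set
s_squared = variance(data)  # sample variance, divisor n - 1

print(x_bar, round(s_squared, 2))
```

`statistics.pvariance` would instead divide by n, which corresponds to the population variance.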
Equation 12.20: Sample Coefficient of Variation
$CV = s/\bar{X}$    12.20
The relative dispersion is defined as a measure of dispersion
divided by a measure of central tendency. The sample coefficient
of variation, CV, is a relative dispersion calculated from the
sample standard deviation and the mean.
The following data were recorded from a laboratory
experiment
20, 25, 30, 32, 27, 22
-
12.15
The mean of the data is 26. What is most nearly the sample
coefficient of variation of the data?
(A) 0.18
(B) 1.1
(C) 2.4
(D) 4.6
Solution Find the sample standard deviation of the data using Eq.
12.16.
$s = \sqrt{[1/(n-1)]\sum_{i=1}^{n}(X_i - \bar{X})^2}$
$= \sqrt{\left(\frac{1}{6-1}\right)\left((20-26)^2 + (25-26)^2 + (30-26)^2 + (32-26)^2 + (27-26)^2 + (22-26)^2\right)}$
$= 4.6$
From Eq. 12.20, the sample coefficient of variation is
$CV = s/\bar{X} = \frac{4.6}{26} = 0.177 \quad (0.18)$
The answer is (A).
5. NUMERICAL EVENTS
A discrete numerical event is an occurrence that can be described
by an integer. For example, 27 cars passing through a bridge toll
booth in an hour is a discrete numerical event. Most numerical
events are continuously distributed and are not constrained to
discrete or integer values. For example, the resistance of a 10%
1 Ω resistor may be any value between 0.9 Ω and 1.1 Ω.
6. EXPECTED VALUES
Equation 12.21: Expected Value of a Discrete Variable
$\mu = E[X] = \sum_{k=1}^{n} x_k f(x_k)$    12.21
The expected value, E, of a discrete random variable, X, is given
by Eq. 12.21. $f(x_k)$ is the probability mass function as defined in
Eq. 12.28.
The probability distribution of the number of calls, X, that a
customer service agent receives each hour is shown.
number of calls, X    probability, f(X)
0                     0.00
2                     0.04
4                     0.05
6                     0.10
8                     0.35
10                    0.46
What is most nearly the average number of phone calls
that a customer service agent expects to receive in an hour?
(A) 5
(B) 7
(C) 8
(D) 9
Solution The expected number of received calls is
$\mu = E[X] = \sum_{k=1}^{n} x_k f(x_k)$
= (0)(0.00) + (2)(0.04) + (4)(0.05) + (6)(0.10) + (8)(0.35) +
(10)(0.46)
= 8.28 (8)
The answer is (C).
Equation 12.22: Variance of a Discrete Variable
$\sigma^2 = V[X] = \sum_{k=1}^{n}(x_k - \mu)^2 f(x_k)$    12.22
Equation 12.22 gives the variance, $\sigma^2$, of a discrete function
of variable X. To use Eq. 12.22, the population mean, $\mu$, must be
known, having been calculated from the total population of n values.
The name "discrete" requires only that n be a finite number and all
values of x be known. It does not limit the values of x to
integers.
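Both discrete formulas can be applied to the call distribution whose values appear in the worked solution above. A minimal sketch, with illustrative variable names:

```python
calls = [0, 2, 4, 6, 8, 10]                  # possible values x_k
prob = [0.00, 0.04, 0.05, 0.10, 0.35, 0.46]  # probabilities f(x_k)

# Expected value: sum of x_k * f(x_k)
mu = sum(x * f for x, f in zip(calls, prob))

# Variance: sum of (x_k - mu)^2 * f(x_k); mu must be computed first
var = sum((x - mu) ** 2 * f for x, f in zip(calls, prob))

print(round(mu, 2), round(var, 4))
```

The variance line depends on `mu`, which mirrors the requirement that the population mean be known before Eq. 12.22 is used.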
Equation 12.23 and Eq. 12.24: Expected Value (Mean) of a
Continuous Variable
$\mu = E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$    12.23
$E[Y] = E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\,dx$    12.24
-
12.17
Equation 12.23 calculates the population mean, $\mu$, of a
continuous variable, X, from the probability density function, f(x).
Equation 12.24 calculates the mean of any continuously distributed
variable defined by Y = g(x), whose values are observed according
to the probabilities given by the probability density function
(PDF) f(x). Equation 12.24 is the general form of Eq. 12.23, which is
the special case where g(x) = x.
$\sigma^2 = V[X] = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$    12.25
Equation 12.25 gives the variance of a continuous random
variable, X. $\mu$ is the mean of X, and f(x) is the density function
of X.
$\sigma = \sqrt{V[X]}$    12.26
The standard deviation is always the square root of the variance,
as shown in the variation equation. Equation 12.26 gives the
standard deviation for a continuous random variable, X.
Equation 12.27: Coefficient of Variation of a Continuous
Variable
$CV = \sigma/\mu$    12.27
The coefficient of variation of a continuous variable is
calculated from Eq. 12.27.
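The continuous formulas can be explored numerically. The sketch below assumes a hypothetical density, f(x) = 2x on the interval 0 to 1, which is not taken from the text. The integrals are approximated with a simple midpoint rule rather than evaluated in closed form.

```python
from math import sqrt

def integrate(func, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of func from a to b."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def f(x):
    """Hypothetical density (an assumption for illustration): f(x) = 2x on [0, 1]."""
    return 2.0 * x

total = integrate(f, 0.0, 1.0)                              # should be unity
mu = integrate(lambda x: x * f(x), 0.0, 1.0)                # mean of X
var = integrate(lambda x: (x - mu) ** 2 * f(x), 0.0, 1.0)   # variance of X
sigma = sqrt(var)                                           # standard deviation
cv = sigma / mu                                             # coefficient of variation

print(round(total, 6), round(mu, 6), round(var, 6), round(cv, 6))
```

For this assumed density the exact mean is 2/3 and the exact variance is 1/18, which the numerical values approach.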
-
12.18
7. PROBABILITY DENSITY FUNCTIONS (DISCRETE)
Equation 12.28: Probability Mass Function
$f(x_k) = P(X = x_k) \quad [k = 1, 2, \ldots, n]$    12.28
A discrete random variable, X, can take on values from a set of
discrete values, $x_i$. The set of values can be finite or infinite,
as long as each value can be expressed as an integer. The
probability mass function, defined by Eq. 12.28, gives the
probability that a discrete random variable, X, is equal to each of
the set's possible values, $x_k$. The probabilities of all possible
outcomes add up to unity.
Equation 12.29: Probability Density Function
$P(a \le X \le b) = \int_a^b f(x)\,dx$    12.29
A density function is a nonnegative function whose integral
taken over the entire range of the independent variable is unity. A
probability density function (PDF) is a mathematical formula that
gives the probability of a numerical event.
Various mathematical models are used to describe probability
density functions. Figure 12.1 shows a graph of a continuous
probability density function. The area under the probability
density function is the probability that the variable will assume a
value between the limits of evaluation. The total probability, or
the probability that the variable will assume any value over the
interval, is 1.0. The probability of an exact numerical event is
zero. That is, there is no chance that a numerical event will be
exactly a. It is possible to determine only the probability that a
numerical event will be less than a, greater than b, or between
the values of a and b.
Figure 12.1 Probability Density Function
[Figure: curve of p(x) versus x, with the limits a and b marked on the x-axis]
If a random variable, X, is continuous over an interval, then a
nonnegative probability density function of that variable exists
over the interval as defined by Eq. 12.29.
8. PROBABILITY DISTRIBUTION FUNCTIONS (CONTINUOUS)
A cumulative probability distribution function, F(x), gives the
probability that a numerical event will occur or the probability
that the numerical event will be less than or equal to some value, x.
-
12.19
Equation 12.30: Cumulative Distribution Function: Discrete
Random Variable
$F(x_m) = \sum_{k=1}^{m} P(x_k) = P(X \le x_m) \quad [m = 1, 2, \ldots, n]$    12.30
For a discrete random variable, X, the probability distribution
function is the sum of the individual probabilities of all possible
events up to and including event $x_m$. The cumulative distribution
function (CDF) is a function that calculates the cumulative sum of
all values up to and including a particular end point. For discrete
probability density functions (PDFs), $P(x_m)$, the CDF can be
calculated as a summation, as shown in Eq. 12.30.
Because calculating cumulative probabilities can be cumbersome,
tables of values are often used. Table 12.1 at the end of this
chapter gives values for cumulative binomial probabilities, where n
is the number of trials, P is the probability of success for a
single trial, and x is the maximum number of successful trials.
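A table of cumulative binomial probabilities can be generated as a running sum of the individual binomial probabilities. This is a sketch under the assumption that each table entry is P(X ≤ x); the function name is illustrative.

```python
from itertools import accumulate
from math import comb

def binomial_cdf_table(n, p):
    """Cumulative probabilities P(X <= x) for x = 0, 1, ..., n."""
    pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]
    return list(accumulate(pmf))  # running sum of the individual probabilities

# n = 2 trials with success probability P = 0.1
cdf = binomial_cdf_table(n=2, p=0.1)
print([round(v, 4) for v in cdf])
```

For n = 2 and P = 0.1, the first two values, 0.8100 and 0.9900, agree with the corresponding tabulated entries.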
Equation 12.31: Cumulative Distribution Function: Continuous
Random Variable
$F(x) = \int_{-\infty}^{x} f(t)\,dt$    12.31
For continuous functions, the CDF is calculated as
-
12.20
Therefore, the height of the curve at its peak is 2.
The equation of the line from x = 0 up to x = 1/2 is
$f(x) = 4x \quad [0 \le x \le 1/2]$
-
12.21
Solution Since the outcomes are "either-or" in nature, the
outcomes (and combinations of outcomes) follow a binomial
distribution. A male kitten is defined as a success. The probability
of a success is
p = 1 - 0.52 = 0.48 = P(male kitten)
q = 0.52 = P(female kitten)
n = 7 trials
x = 2 successes
$P_n(x) = \frac{n!}{x!(n-x)!}\,p^x q^{n-x}$
$P_7(2) = \left(\frac{7!}{2!(7-2)!}\right)(0.48)^2(0.52)^{7-2} = 0.184 \quad (0.18)$
The answer is (B).
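The binomial probability in this solution can be evaluated directly. A short sketch with an illustrative function name:

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Two male kittens (successes) in a litter of seven, with p = 0.48.
p_two_males = binomial_pmf(x=2, n=7, p=0.48)
print(round(p_two_males, 3))
```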
Equation 12.34 Through Eq. 12.37: Normal Distribution
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \quad [-\infty \le x \le \infty]$    12.34
$f(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2} \quad [-\infty \le x \le \infty]$    12.35
$F(-x) = 1 - F(x)$    12.36
12.37
The normal distribution (Gaussian distribution) is a
symmetrical continuous distribution, commonly referred to as the
bell-shaped curve.
-
12.22
Figure 12.3 Normal Curve with Mean $\mu$ and Standard Deviation $\sigma$
[Figure: bell curve with band areas 2.15%, 13.6%, 34.1%, 34.1%, 13.6%, 2.15%]
-
12.23
From Table 12.2, the cumulative distribution function at Z = 0.6
is F(Z) = 0.7257. The percentage of boys having height greater than
1.23 m is
percentage taller than 1.23 m = 100% - (0.7257)(100%) = 27.43%
(27%)
The answer is (A).
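The table lookup can be cross-checked with the standard normal distribution in Python's `statistics` module (available in Python 3.8 and later). A minimal sketch:

```python
from statistics import NormalDist

z = 0.6

# Cumulative distribution function of the unit normal at z.
cdf_at_z = NormalDist().cdf(z)

# The complement is the fraction above z, expressed as a percentage.
percent_taller = (1 - cdf_at_z) * 100

print(round(cdf_at_z, 4), round(percent_taller, 2))
```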
Equation 12.38 and Eq. 12.39: Central Limit Theorem
$\mu_{\bar{y}} = \mu$    12.38
$\sigma_{\bar{y}} = \frac{\sigma}{\sqrt{n}}$    12.39
The central limit theorem states that the distribution of a
significantly large number of sample means of n items where all
items are drawn from the same (i.e., parent) population will be
normal. According to the central limit theorem, the mean of sample
means, $\mu_{\bar{y}}$, is equal to the population mean of the parent
distribution, $\mu$, as shown in Eq. 12.38. The standard deviation of
the sample means, $\sigma_{\bar{y}}$, is equal to the standard
deviation of the parent population divided by the square root of the
sample size, as shown in Eq. 12.39.
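A small simulation can illustrate both relationships. The sketch below assumes a hypothetical parent population, uniform on 0 to 1 (mean 0.5, standard deviation sqrt(1/12)); the sample size, number of samples, and seed are arbitrary choices for the demonstration.

```python
import random
from math import sqrt

random.seed(12)  # fixed seed so the sketch is repeatable

n = 25               # items per sample (assumed)
num_samples = 4000   # number of sample means collected (assumed)
mu = 0.5             # mean of the uniform(0, 1) parent population
sigma = sqrt(1 / 12) # standard deviation of the uniform(0, 1) parent

sample_means = []
for _ in range(num_samples):
    sample = [random.random() for _ in range(n)]
    sample_means.append(sum(sample) / n)

mean_of_means = sum(sample_means) / num_samples
std_of_means = sqrt(
    sum((m - mean_of_means) ** 2 for m in sample_means) / num_samples
)

print(mean_of_means, mu)                 # compare with the parent mean
print(std_of_means, sigma / sqrt(n))     # compare with sigma / sqrt(n)
```

The two printed pairs should be close but not identical, since the simulation uses a finite number of samples.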
Equation 12.40 and Eq. 12.41: t-Distribution
$f(t) = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\,\Gamma\left(\frac{n}{2}\right)}\left(1 + \frac{t^2}{n}\right)^{-(n+1)/2}$    12.40
12.41
For the t-distribution (commonly referred to as Stu
-
12.24
Description
The gamma function, $\Gamma(n)$, is an extension of the factorial
function and is used to determine values of the factorial for
complex numbers greater than zero (i.e., positive integers).
Equation 12.43: Chi-Squared Distribution
$\chi^2 = Z_1^2 + Z_2^2 + \cdots + Z_n^2$    12.43
Description
The sum of the squares of n independent normal random variables
will be distributed according to the chi-squared distribution and
will have n degrees of freedom. The chi-squared distribution is
often used with hypothesis testing of variances. Chi-squared
values, $\chi^2_{\alpha,n}$, for selected values of $\alpha$ and n can
be found from Table 12.5 at the end of this chapter.
10. STUDENT'S T-TEST
Equation 12.44: Ex~
$\alpha = \int_{t_{\alpha,n}}^{\infty} f(t)\,dt$    12.44
Description
The t-test is a method of comparing two variables, usually to
test the significance of the difference between samples. For
example, the t-test can be used to test i
12.25
12. SUMS OF RANDOM VARIABLES
Equation 12.45: Sums of Random Variables
$Y = a_1X_1 + a_2X_2 + \cdots + a_nX_n$    12.45
Description
The sum of random variables, Y, is found from Eq. 12.45.
Equation 12.46: Expected Value of the Sum of Random Variables
$\mu_y = E(Y) = a_1E(X_1) + a_2E(X_2) + \cdots + a_nE(X_n)$    12.46
Description
The expected value of the sum of random variables, $\mu_y$, is
calculated using Eq. 12.46.
Equation 12.47 and Eq. 12.48: Variance of the Sum of Independent
Random Variables
12.47
$\sigma_y^2 = a_1^2\sigma_1^2 + a_2^2\sigma_2^2 + \cdots + a_n^2\sigma_n^2$    12.48
The variance of the sum of independent random variables can be
calculated from Eq. 12.47 and Eq. 12.48.
Equation 12.49: Standard Deviation of the Sum of Independent
Random Variables
$\sigma_y = \sqrt{\sigma_y^2}$    12.49
Description
The standard deviation of the sum of independent random
variables (see Eq. 12.45) is found from Eq. 12.49.
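The three results for a weighted sum of independent random variables fit in one short function. The coefficients, means, and variances in the sketch below are hypothetical values chosen only to show the arithmetic; the function name is also illustrative.

```python
from math import sqrt

def linear_combination_stats(coeffs, means, variances):
    """Mean, variance, and standard deviation of Y = a1*X1 + ... + an*Xn
    for independent X_i with the given means and variances."""
    mu_y = sum(a * m for a, m in zip(coeffs, means))
    var_y = sum(a**2 * v for a, v in zip(coeffs, variances))
    return mu_y, var_y, sqrt(var_y)

# Hypothetical example: Y = 2*X1 - X2
mu_y, var_y, sigma_y = linear_combination_stats(
    coeffs=[2, -1], means=[10.0, 4.0], variances=[9.0, 16.0]
)
print(mu_y, var_y, round(sigma_y, 3))
```

Note that the negative coefficient lowers the mean but still adds to the variance, because the coefficients are squared.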
13. SUMS AND DIFFERENCES OF MEANS
When two variables are sampled from two different standard normal
variables (i.e., are independent), their sums will be distributed
with mean $\mu_{new} = \mu_1 + \mu_2$ and variance
$\sigma^2_{new} = \sigma_1^2/n_1 + \sigma_2^2/n_2$. The sample sizes,
$n_1$ and $n_2$, do not have to be the same. The relationships for
confidence intervals and hypothesis testing can be used for a new
variable, $x_{new} = x_1 + x_2$, if $\mu$ is replaced by $\mu_{new}$
and $\sigma$ is replaced by $\sigma_{new}$.
For the difference in two standard normal variables, the mean
is the difference in two population means,
$\mu_{new} = \mu_1 - \mu_2$, but the variance is the sum, as it was
for the sum of two standard normal variables.
-
12.26
14. CONFIDENCE INTERVALS
Population properties such as means and variances must usually
be estimated from samples. The sample mean, $\bar{X}$, and sample
standard deviation, s, are unbiased estimators, but they are not
necessarily precisely equal to the true population properties. For
estimated values, it is common to specify an interval expected to
contain the true population properties. The interval is known as a
confidence interval because a confidence level, C (e.g., 99%), is
associated with it. (There is still a 1 - C chance that the true
population property is outside of the interval.) The interval will
be bounded below by its lower confidence limit (LCL) and above by
its upper confidence limit (UCL).
As a consequence of the central limit theorem, means of samples
of n items taken from a distribution that is normally distributed
with mean $\mu$ and standard deviation $\sigma$ will be normally
distributed with mean $\mu$ and variance $\sigma^2/n$. Therefore, the
probability that any given average, $\bar{X}$, exceeds some value, L,
is
$p\{\bar{X} > L\} = p\left\{x > \left|\frac{L - \mu}{\sigma/\sqrt{n}}\right|\right\}$
L is the confidence limit for the confidence level
$1 - p\{\bar{X} > L\}$ (expressed as a percentage). Values of p{x} are
read directly from the unit normal table (see Table 12.2). As an
example, x = 1.645 for a 95% confidence level since only 5% of
the curve is above that x in the upper tail. This is known as a
one-tail confidence limit because all of the exceedance
probability is given to one side of the variation. With two-tail
confidence limits, the probability is split between the two
sides of variation. There will be upper and lower confidence
limits, UCL and LCL, respectively. This is appropriate when it is
not specifically known that the calculated parameter is too high
or too low. Table 12.3 at the end of this chapter lists standard
normal variables and t values for two-tail confidence limits.
$p\{\text{LCL} < \bar{X} < \text{UCL}\} = p\left\{\frac{\text{LCL} - \mu}{\sigma/\sqrt{n}} < x < \frac{\text{UCL} - \mu}{\sigma/\sqrt{n}}\right\}$
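The one-tail and two-tail standard normal values can be obtained from the inverse cumulative distribution function instead of a printed table. This is a sketch using Python's `statistics.NormalDist` (Python 3.8 and later); the two helper names are illustrative.

```python
from statistics import NormalDist

def z_one_tail(confidence):
    """Standard normal value with (1 - confidence) of the area in the upper tail."""
    return NormalDist().inv_cdf(confidence)

def z_two_tail(confidence):
    """Standard normal value with (1 - confidence)/2 of the area in each tail."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

print(round(z_one_tail(0.95), 3))  # one-tail value for 95% confidence
print(round(z_two_tail(0.95), 4))  # two-tail value for 95% confidence
```

The two-tail value is larger than the one-tail value at the same confidence level because only half of the exceedance probability sits in each tail.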
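Equation 12.50 and Eq. 12.51: Confidence Limits and Interval for
Mean of a Normal Distribution
$\text{LCL} = \bar{X} - Z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) \qquad \text{UCL} = \bar{X} + Z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$    12.50
$\text{LCL} = \bar{X} - t_{\alpha/2,\,n-1}\left(\frac{s}{\sqrt{n}}\right) \qquad \text{UCL} = \bar{X} + t_{\alpha/2,\,n-1}\left(\frac{s}{\sqrt{n}}\right)$    12.51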
12.27
The confidence limits for the mean, $\mu$, of a normal
distribution can be calculated from Eq. 12.50 when the standard
deviation, $\sigma$, is known.
If the standard deviation, $\sigma$, of the underlying distribution
is not known, the confidence limits must be estimated from the sample
standard deviation, s, using Eq. 12.51. Accordingly, the standard
normal variable is replaced by the t-distribution parameter,
$t_{\alpha/2}$, with n - 1 degrees of freedom, where n is the sample
size. $\alpha = 1 - C$, and $\alpha/2$ is the t-distribution parameter
since half of the exceedance is allocated to each confidence limit.
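Equation 12.52 and Eq. 12.53: Confidence Limits for the Difference
Between Two Means
$\text{LCL}, \text{UCL} = \bar{X}_1 - \bar{X}_2 \pm Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$  [known $\sigma_1$ and $\sigma_2$]    12.52
$\text{LCL}, \text{UCL} = \bar{X}_1 - \bar{X}_2 \pm t_{\alpha/2}\sqrt{\frac{\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\left[(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2\right]}{n_1 + n_2 - 2}}$  [unknown $\sigma_1$ and $\sigma_2$]    12.53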
The difference in two standard normal variables will be
distributed with mean $\mu_{new} = \mu_1 - \mu_2$. Use Eq. 12.52 to
calculate the confidence interval for the difference between two
means, $\mu_1$ and $\mu_2$, if the standard deviations $\sigma_1$ and
$\sigma_2$ are known. If the standard deviations $\sigma_1$ and
$\sigma_2$ are unknown, use Eq. 12.53. The t-distribution parameter,
$t_{\alpha/2}$, has $n_1 + n_2 - 2$ degrees of freedom.
100 resistors produced by company A and 150 resistors produced
by company B are tested to find their limits before burning out. The
test results show that the company A resistors have a mean rating
of 2 W before burning out, with a standard deviation of 0.25 W, and
the company B resistors have a 3 W mean rating before burning out,
with a standard deviation of 0.30 W. What are the 95% confidence
limits for the difference between the two means for the company A
resistors and company B resistors (i.e., A - B)?
(A) -1.1 W; -1.0 W
(B) -1.1 W; -0.93 W
(C) -1.1 W; -0.90 W
(D) -1.0 W; -0.99 W
Solution From Table 12.3, the value of the standard normal
variable for a two-tail test with 95% confidence is 1.9600.
From Eq. 12.52, the confidence limits for the difference
between the two means are
-
12.28
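$\text{LCL} = \bar{X}_1 - \bar{X}_2 - Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
$= 2\ \text{W} - 3\ \text{W} - 1.9600\sqrt{\frac{(0.25\ \text{W})^2}{100} + \frac{(0.30\ \text{W})^2}{150}}$
$= -1.0686\ \text{W} \quad (-1.1\ \text{W})$
$\text{UCL} = \bar{X}_1 - \bar{X}_2 + Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
$= 2\ \text{W} - 3\ \text{W} + 1.9600\sqrt{\frac{(0.25\ \text{W})^2}{100} + \frac{(0.30\ \text{W})^2}{150}}$
$= -0.9314\ \text{W} \quad (-0.93\ \text{W})$
The answer is (B).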
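The resistor calculation can be reproduced with a short function for the known-standard-deviation case. The function and parameter names in this sketch are illustrative.

```python
from math import sqrt

def diff_of_means_limits(x1, x2, s1, s2, n1, n2, z):
    """Lower and upper confidence limits for the difference of two means
    when both standard deviations are treated as known."""
    half_width = z * sqrt(s1**2 / n1 + s2**2 / n2)
    diff = x1 - x2
    return diff - half_width, diff + half_width

# Company A: mean 2 W, sd 0.25 W, n = 100; company B: mean 3 W, sd 0.30 W, n = 150.
lcl, ucl = diff_of_means_limits(2.0, 3.0, 0.25, 0.30, 100, 150, z=1.9600)
print(round(lcl, 4), round(ucl, 4))
```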
Equation 12.54: Confidence Limits and Interval for the Variance of
a Normal Distribution
$\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$    12.54
Equation 12.54 gives the limits of a confidence interval
(confidence $C = 1 - \alpha$) for an estimate of the population
variance calculated as the sample variance from Eq. 12.19 with a
sample size of n drawn from a normal distribution. Since the variance
is a squared variable, it will be distributed as a chi-squared
distribution with n - 1 degrees of freedom. Therefore, the
denominators are the $\chi^2$ values taken from Table 12.5 at the end
of this chapter. (The values in Table 12.5 are already squared and
should not be squared again.) Since the chi-squared distribution
is not symmetrical, the table values for $\alpha/2$ and for
$1 - (\alpha/2)$ will be different for the two confidence limits.
15. HYPOTHESIS TESTING
A hypothesis test is a procedure that answers the question, "Did
these data come from [a particular type of] distribution?" There
are many types of tests, depending on the distribution and
parameter being evaluated. The simplest hypothesis test
determines whether an average value obtained from n repetitions of
an experiment could have come from a population with known mean
$\mu$ and standard deviation $\sigma$. A practical application of this
question is whether a manufacturing process has changed from what it
used to be or should be. Of course, the answer (i.e., yes or no)
cannot be given with absolute certainty; there will be a confidence
level associated with the answer.
The following procedure is used to determine whether the average
of n measurements can be assumed (with a given confidence level) to
have come from a known normal population, or to determine the
sample size required to make the decision with the desired
confidence level.
-
12.29
Equation 12.55 Through Eq. 12.60: Test on Mean of Normal
Distribution, Population Mean and Variance Known
step 1 Assume random sampling from a normal population.
The null hypothesis is
$H_0: \mu = \mu_0$    12.55
The alternative hypothesis is
$H_1: \mu \ne \mu_0$    12.56
A type I error is rejecting $H_0$ when it is true. The probability
of a type I error is the level of significance.
$\alpha$ = probability(type I error)    12.57
A type II error is accepting $H_0$ when it is false.
$\beta$ = probability(type II error)    12.58
step 2 Choose the desired confidence level, C.
step 3 Decide on a one-tail or two-tail test. If the hypothesis
being tested is that the average has or has not increased or has
not
12.30
Equation 12.61 Through Eq. 12.68: Sample Size for Normal
Distribution, $\alpha$ and $\beta$ Known
$(Z_{\alpha/2} + Z_{\beta})^2$
-
12.31
(A) There is at least a 5% probability that the plant is
operating properly
(B) There is at least a 95% probability that the plant is
operating properly
(C) There is at least a 5% probability that the plant is not
operating properly
(D) There is at least a 95% probability that the plant is not
operating properly
Solution Since a specific direction in the variation is not
given (i.e., the example does not ask if the average has
decreased), use a two-tail hypothesis test.
From Table 12.3, x = 1.9600.
Use Eq. 12.59 to calculate the actual standard normal
variable.
$Z_0 = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{871 - 880}{\sigma/\sqrt{n}} = -3.03$
Since -3.03 < -1.9600, the distributions are not the same.
There is at least a 95% probability that the plant is not operating
correctly.
The answer is (D).
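The decision rule for the two-tail test can be sketched as follows. The sample mean 871 and the population mean 880 come from the solution; the values sigma = 21 and n = 50 are assumptions made here only because they reproduce the printed statistic of -3.03, and they are not stated in the surviving text. The function name is also illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_tail_z_test(x_bar, mu_0, sigma, n, confidence):
    """Return the test statistic, the critical value, and whether the null
    hypothesis is rejected for a two-tail test with known sigma."""
    z_0 = (x_bar - mu_0) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_0, z_crit, abs(z_0) > z_crit

# sigma and n below are assumed values, not taken from the problem statement.
z_0, z_crit, reject = two_tail_z_test(871, 880, sigma=21, n=50, confidence=0.95)
print(round(z_0, 2), round(z_crit, 4), reject)
```

Comparing the absolute value of the statistic with the critical value is equivalent to checking whether it falls below the negative critical value or above the positive one.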
16. LINEAR REGRESSION
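Equation 12.69 Through Eq. 12.75: Method of Least Squares
If it is necessary to draw a straight line ($y = \hat{a} + \hat{b}x$)
through n two-dimensional data points $(x_1, y_1), (x_2, y_2), \ldots,
(x_n, y_n)$, the following method based on the method of least squares
can be used.
step 1 Calculate the following seven quantities.
$\sum x_i \quad \sum x_i^2 \quad \left(\sum x_i\right)^2 \quad \sum x_iy_i \quad \sum y_i$
$\bar{x} = (1/n)\left(\sum_{i=1}^{n} x_i\right)$    12.69
$\bar{y} = (1/n)\left(\sum_{i=1}^{n} y_i\right)$    12.70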
-
12.32
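step 2 Calculate the slope, $\hat{b}$, of the line.
$\hat{b} = S_{xy}/S_{xx}$    12.71
$S_{xy} = \sum x_iy_i - (1/n)\left(\sum x_i\right)\left(\sum y_i\right)$    12.72
$S_{xx} = \sum x_i^2 - (1/n)\left(\sum x_i\right)^2$    12.73
step 3 Calculate the y-intercept, $\hat{a}$.
$\hat{a} = \bar{y} - \hat{b}\bar{x}$    12.74
The equation of the straight line is
$y = \hat{a} + \hat{b}x$    12.75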
The least squares method is used to plot a straight line through
the data points (1, 6), (2, 7), (3, 11), and (5, 13). The slope of
the line is most nearly
(A) 0.87
(B) 1.7
(C) 1.9
(D) 2.0
Solution First, calculate the following values.
$\sum x_i = 1 + 2 + 3 + 5 = 11$
$\sum y_i = 6 + 7 + 11 + 13 = 37$
$\sum x_i^2 = (1)^2 + (2)^2 + (3)^2 + (5)^2 = 39$
$\sum x_iy_i = (1)(6) + (2)(7) + (3)(11) + (5)(13) = 118$
Find the value of $S_{xy}$ using Eq. 12.72.
-
12.33
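$S_{xy} = \sum x_i y_i - (1/n)\left(\sum x_i\right)\left(\sum y_i\right) = 118 - (1/4)(11)(37) = 16.25$
Find the value of $S_{xx}$ from Eq. 12.73.
$S_{xx} = \sum x_i^2 - (1/n)\left(\sum x_i\right)^2 = 39 - (1/4)(11)^2 = 8.75$
From Eq. 12.71, the slope is
$\hat{b} = S_{xy}/S_{xx} = 16.25/8.75 = 1.857 \quad (1.9)$
The answer is (C).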
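The three steps of the least squares method can be sketched in Python as a check on the hand calculation. The function name `least_squares_line` is hypothetical, and the intercept is computed as the mean of y minus the slope times the mean of x, which is the usual least squares form.

```python
def least_squares_line(points):
    """Fit y = a + b*x through (x, y) pairs using sums, as in the three steps."""
    n = len(points)
    # Step 1: the required sums and means.
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_x2 = sum(x * x for x, _ in points)
    sum_xy = sum(x * y for x, y in points)
    x_bar = sum_x / n
    y_bar = sum_y / n
    # Step 2: slope from S_xy and S_xx.
    s_xy = sum_xy - sum_x * sum_y / n
    s_xx = sum_x2 - sum_x ** 2 / n
    slope = s_xy / s_xx
    # Step 3: intercept from the means.
    intercept = y_bar - slope * x_bar
    return slope, intercept

slope, intercept = least_squares_line([(1, 6), (2, 7), (3, 11), (5, 13)])
print(round(slope, 3), round(intercept, 3))  # 1.857 4.143
```

The slope agrees with the 1.857 found in the solution, so the fitted line for these four points is approximately y = 4.143 + 1.857x.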
Equation 12.76 and Eq. 12.77: Standard Error of
Estimate
$S_e^2 = \frac{S_{xx}S_{yy} - S_{xy}^2}{S_{xx}(n - 2)} = \text{MSE}$ 12.76
$S_{yy} = \sum y_i^2 - (1/n)\left(\sum y_i\right)^2$ 12.77
Equation 12.76 gives the mean squared error, $S_e^2$ or MSE,
which estimates the likelihood of a value being close to an
observed value by averaging the square of the errors (i.e., the
difference between the estimated value and observed value). Small
MSE values are favorable, as they indicate a smaller likelihood of
error.
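As an illustration, the mean squared error can be computed two ways for the four data points of the earlier slope example: from the sums $S_{xx}$, $S_{yy}$, and $S_{xy}$, and directly from the squared errors of the fitted line. This sketch assumes the common form in which the squared errors are divided by n - 2; the function names are hypothetical.

```python
def sums_of_squares(points):
    """Return n, the means of x and y, and S_xx, S_yy, S_xy."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    s_xx = sum(x * x for x, _ in points) - sum_x ** 2 / n
    s_yy = sum(y * y for _, y in points) - sum_y ** 2 / n
    s_xy = sum(x * y for x, y in points) - sum_x * sum_y / n
    return n, sum_x / n, sum_y / n, s_xx, s_yy, s_xy

def mse_from_sums(points):
    """MSE from S_xx, S_yy, and S_xy (assumed form, divisor n - 2)."""
    n, _, _, s_xx, s_yy, s_xy = sums_of_squares(points)
    return (s_xx * s_yy - s_xy ** 2) / (s_xx * (n - 2))

def mse_from_residuals(points):
    """MSE from the squared differences between fitted and observed y."""
    n, x_bar, y_bar, s_xx, _, s_xy = sums_of_squares(points)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    squared_errors = [(y - (intercept + slope * x)) ** 2 for x, y in points]
    return sum(squared_errors) / (n - 2)

data = [(1, 6), (2, 7), (3, 11), (5, 13)]
print(round(mse_from_sums(data), 4), round(mse_from_residuals(data), 4))  # 1.2857 1.2857
```

Both routes give the same value, which shows that the formula in terms of the sums is a shortcut for squaring and pooling the individual errors.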
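Equation 12.78 and Eq. 12.79: Confidence Intervals for Slope
and Intercept
$\hat{b} \pm t_{\alpha/2,n-2}\sqrt{\frac{\text{MSE}}{S_{xx}}}$ 12.78
$\hat{a} \pm t_{\alpha/2,n-2}\sqrt{\left(\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right)\text{MSE}}$ 12.79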
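A minimal sketch of the two intervals follows, assuming the usual forms: the slope plus or minus t times the square root of MSE/S_xx, and the intercept plus or minus t times the square root of (1/n + x̄²/S_xx)·MSE. The function name `regression_intervals` is hypothetical. The inputs are taken from the four-point slope example, and the critical value 4.303 (two-tail 95%, 2 degrees of freedom) comes from a standard t-table rather than from this section.

```python
import math

def regression_intervals(n, x_bar, s_xx, mse, slope, intercept, t_crit):
    """Return ((slope_lo, slope_hi), (intercept_lo, intercept_hi))."""
    slope_half = t_crit * math.sqrt(mse / s_xx)
    intercept_half = t_crit * math.sqrt((1 / n + x_bar ** 2 / s_xx) * mse)
    return ((slope - slope_half, slope + slope_half),
            (intercept - intercept_half, intercept + intercept_half))

# Values for the points (1, 6), (2, 7), (3, 11), (5, 13).
n, x_bar, y_bar = 4, 11 / 4, 37 / 4
s_xx, mse = 8.75, 22.5 / 17.5
slope = 16.25 / 8.75
intercept = y_bar - slope * x_bar
# ASSUMPTION: t_crit = 4.303 is a standard table value for alpha/2 = 0.025, n - 2 = 2.
slope_ci, intercept_ci = regression_intervals(n, x_bar, s_xx, mse, slope, intercept, 4.303)
print([round(v, 2) for v in slope_ci])
print([round(v, 2) for v in intercept_ci])
```

With only four points the intervals are wide, which reflects the large critical value at 2 degrees of freedom.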
-
12.34
Equation 12.80 and Eq. 12.81: Sample Correlation
Coefficient
$R = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}}$ 12.80
$R^2 = \frac{S_{xy}^2}{S_{xx}S_{yy}}$ 12.81
Once the slope of the line is calculated using the least
squares method, the goodn
-
12.35
$R = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}} = \frac{-32 - (1/4)(9)(3)}{\sqrt{\left(39 - (1/4)(9)^2\right)\left(87 - (1/4)(3)^2\right)}} = -0.972 \quad (-0.97)$
The answer is (A).
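The correlation calculation can be checked with a short Python sketch that works directly from the sums. The function name `sample_correlation` is hypothetical, and n = 4 is an inference: it is the sample size that reproduces the stated result of -0.972 from the sums 9, 3, -32, 39, and 87.

```python
import math

def sample_correlation(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Sample correlation coefficient R = S_xy / sqrt(S_xx * S_yy)."""
    s_xy = sum_xy - sum_x * sum_y / n
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n
    return s_xy / math.sqrt(s_xx * s_yy)

# ASSUMPTION: n = 4 is inferred from the stated answer, not read from the text.
r = sample_correlation(4, 9, 3, -32, 39, 87)
print(round(r, 3))  # -0.972
```

A value this close to -1 indicates a strong negative linear relationship between the two variables.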
                                               p
n  x   0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9     0.95    0.99
1  0   0.9000  0.8000  0.7000  0.6000  0.5000  0.4000  0.3000  0.2000  0.1000  0.0500  0.0100
2  0   0.8100  0.6400  0.4900  0.3600  0.2500  0.1600  0.0900  0.0400  0.0100  0.0025  0.0001
   1   0.9900  0.9600  0.9100  0.8400  0.7500  0.6400  0.5100  0.3600  0.1900  0.0975  0.0199
3  0   0.7290  0.5120  0.3430  0.2160  0.1250  0.0640  0.0270  0.0080  0.0010  0.0001  0.0000
   1   0.9720  0.8960  0.7840  0.6480  0.5000  0.3520  0.2160  0.1040  0.0280  0.0073  0.0003
   2   0.9990  0.9920  0.9730  0.9360  0.8750  0.7840  0.6570  0.4880  0.2710  0.1426  0.0297
4  0   0.6561  0.4096  0.2401  0.1296  0.0625  0.0256  0.0081  0.0016  0.0001  0.0000  0.0000
   1   0.9477  0.8192  0.6517  0.4752  0.3125  0.1792  0.0837  0.0272  0.0037  0.0005  0.0000
   2   0.9963  0.9728  0.9163  0.8208  0.6875  0.5248  0.3483  0.1808  0.0523  0.0140  0.0006
   3   0.9999  0.9984  0.9919  0.9744  0.9375  0.8704  0.7599  0.5904  0.3439  0.1855  0.0394
5  0   0.5905  0.3277  0.1681  0.0778  0.0313  0.0102  0.0024  0.0003  0.0000  0.0000  0.0000
   1   0.9185  0.7373  0.5282  0.3370  0.1875  0.0870  0.0308  0.0067  0.0005  0.0000  0.0000
   2   0.9914  0.9421  0.8369  0.6826  0.5000  0.3174  0.1631  0.0579  0.0086  0.0012  0.0000
   3   0.9995  0.9933  0.9692  0.9130  0.8125  0.6630  0.4718  0.2627  0.0815  0.0226  0.0010
   4   1.0000  0.9997  0.9976  0.9898  0.9688  0.9222  0.8319  0.6723  0.4095  0.2262  0.0490
6  0   0.5314  0.2621  0.1176  0.0467  0.0156  0.0041  0.0007  0.0001  0.0000  0.0000  0.0000
   1   0.8857  0.6554  0.4202  0.2333  0.1094  0.0410  0.0109  0.0016  0.0001  0.0000  0.0000
   2   0.9842  0.9011  0.7443  0.5443  0.3438  0.1792  0.0705  0.0170  0.0013  0.0001  0.0000
   3   0.9987  0.9830  0.9295  0.8208  0.6563  0.4557  0.2557  0.0989  0.0159  0.0022  0.0000
   4   0.9999  0.9984  0.9891  0.9590  0.8906  0.7667  0.5798  0.3446  0.1143  0.0328  0.0015
   5   1.0000  0.9999  0.9993  0.9959  0.9844  0.9533  0.8824  0.7379  0.4686  0.2649  0.0585
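The tabulated entries behave as cumulative binomial probabilities, that is, the probability of x or fewer successes in n trials when each trial succeeds with probability p. A short Python sketch, with the hypothetical function name `binomial_cdf`, shows how any entry could be recomputed:

```python
from math import comb

def binomial_cdf(n, x, p):
    """Probability of at most x successes in n independent trials."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))

print(f"{binomial_cdf(4, 1, 0.1):.4f}")  # 0.9477
print(f"{binomial_cdf(6, 5, 0.5):.4f}")  # 0.9844
```

Entries for x = n are not tabulated because the cumulative probability there is always 1.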
-
12.36
[Table of Student's t-distribution values, with columns for α = 0.25, 0.20, 0.15, 0.10, 0.05, 0.025, 0.01, and 0.005; the entries on this page are illegible in the source.]
-
12.37
                               α
ν      0.25    0.20    0.15    0.10    0.05    0.025   0.01    0.005
29     0.683   0.854   1.055   1.311   1.699   2.045   2.462   2.756
30     0.683   0.854   1.055   1.310   1.697   2.042   2.457   2.750
∞      0.674   0.842   1.036   1.282   1.645   1.960   2.326   2.576
*The number of independent degrees of freedom, ν, is always
one less than the sample size, n.
Table 12.5 Critical Values of Chi-Squared Distribution
[Plot of f(χ²) and table entries are illegible in the source.]
Mof