STAT:5100 (22S:193) Statistical Inference I
Week 6
Luke Tierney
University of Iowa
Fall 2015
Luke Tierney (U Iowa) STAT:5100 (22S:193) Statistical Inference I Fall 2015 1
Monday, September 28, 2015
Recap
• Change of variables formula
• Probability integral transform
• Inverse probability integral transform
Expected Values
• The expected value, or mean value, is a way of capturing a “typical value” of a random variable.
• It corresponds to thinking about the average value observed in many repetitions of an experiment.
• To start off, call a random variable “simple” if it takes on only finitely many values.

Definition
The expected value, or mean value, of a simple random variable X with possible values x1, . . . , xN is

    E[X] = µ_X = ∑_{i=1}^{N} x_i P(X = x_i)
Examples
• X = number of eyes on a die roll:

    E[X] = 1 × (1/6) + · · · + 6 × (1/6) = (6 × 7)/(2 × 6) = 3.5

• Y = X², with X as above:

    E[Y] = 1 × (1/6) + 4 × (1/6) + · · · + 36 × (1/6) = (6 × 7 × 13)/(6 × 6) = 91/6

• Suppose 1_A is the indicator function of an event A. Then

    E[1_A] = 1 × P(A) + 0 × P(Aᶜ) = P(A).
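These computations can be checked directly; a minimal Python sketch (the `expect` helper is illustrative, not from the notes):

```python
# E[X] = sum_i x_i * P(X = x_i) for a simple (finitely-valued) random variable.
from fractions import Fraction

def expect(pmf):
    """Expected value of a simple random variable given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

die = {x: Fraction(1, 6) for x in range(1, 7)}     # fair die
print(expect(die))                                 # 7/2
print(expect({x * x: p for x, p in die.items()}))  # 91/6
```

Using exact `Fraction` arithmetic avoids any floating-point rounding in the comparison with the hand computation.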
Example
• Suppose a coin is flipped n times independently; p is the probability of a head on each toss, and X = number of heads.
• Then X has a binomial distribution, and for x = 0, 1, . . . , n the PMF is

    f_{n,p}(x) = C(n, x) p^x (1 − p)^{n−x}
Example
• One way to calculate the mean:

    E[X] = ∑_{k=0}^{n} k C(n, k) p^k (1 − p)^{n−k}
         = ∑_{k=0}^{n} k · n!/(k!(n − k)!) · p^k (1 − p)^{n−k}
         = ∑_{k=1}^{n} n!/((k − 1)!(n − k)!) · p^k (1 − p)^{n−k}
         = np ∑_{k=1}^{n} (n − 1)!/((k − 1)!(n − k)!) · p^{k−1} (1 − p)^{n−k}
         = np ∑_{j=0}^{n−1} C(n − 1, j) p^j (1 − p)^{n−1−j}
         = np ∑_{j=0}^{n−1} f_{n−1,p}(j) = np

• The simple answer suggests there may be an easier way.
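The collapse of this sum to np is easy to verify numerically; a small stdlib-only sketch (parameter values are arbitrary):

```python
# Direct numerical check that the binomial mean sum equals np.
from math import comb

def binom_mean(n, p):
    return sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))

print(binom_mean(10, 0.3))  # agrees with np = 3.0 up to float rounding
```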
Properties of Expected Values
• E[1] = 1
• Homogeneity: E[cX] = c E[X] for constants c.
• Additivity: E[X + Y] = E[X] + E[Y].
• For any function g,

    E[g(X)] = ∑_{i=1}^{N} g(x_i) P(X = x_i).

• If X ≤ Y, i.e. X(s) ≤ Y(s) for all s ∈ S, then E[X] ≤ E[Y].
Example
• Let X = number of heads on n coin tosses.
• Let Y_i = number of heads on the i-th toss = 1{i-th toss is a head}.
• Then for each i, E[Y_i] = P(Y_i = 1) = p.
• Furthermore, X = Y_1 + · · · + Y_n.
• Therefore

    E[X] = E[Y_1 + · · · + Y_n]
         = E[Y_1] + · · · + E[Y_n]
         = n E[Y_1]
         = np.
Example
• In the matching problem let A_i be the event that person i gets their own hat.
• The number of matches X is

    X = 1_{A_1} + · · · + 1_{A_n}.

• The probability of person i receiving their own hat is

    E[1_{A_i}] = P(A_i) = 1/n.

• So the expected number of matches is

    E[X] = E[1_{A_1} + · · · + 1_{A_n}]
         = E[1_{A_1}] + · · · + E[1_{A_n}]
         = n E[1_{A_1}] = n × (1/n) = 1.
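A quick Monte Carlo sketch of the matching problem confirms that the expected number of matches stays near 1 regardless of n (the function name, trial count, and seed are illustrative choices):

```python
# Simulate the hat-matching problem: average number of fixed points
# of a uniformly random permutation is 1 for any n.
import random

def mean_matches(n, trials=100_000, seed=0):
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        perm = rng.sample(range(n), n)  # random assignment of hats to people
        total += sum(1 for i, j in enumerate(perm) if i == j)
    return total / trials

print(mean_matches(10))  # close to 1.0
```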
Example
• It is often useful to compute a probability as the expectation of an indicator function.
• For example, the indicator function of a union is

    1_{A∪B} = 1 − (1 − 1_A)(1 − 1_B) = 1_A + 1_B − 1_{A∩B}.

• As a result,

    P(A ∪ B) = E[1_A + 1_B − 1_{A∩B}]
             = E[1_A] + E[1_B] − E[1_{A∩B}]
             = P(A) + P(B) − P(A ∩ B).
Example (continued)
For 3 or more events we can

• use induction to derive the inclusion-exclusion formula for indicator functions:

    1_{A_1∪···∪A_n} = 1 − (1 − 1_{A_1}) · · · (1 − 1_{A_n})
                    = ∑_i 1_{A_i} − ∑_{i<j} 1_{A_i∩A_j} + ∑_{i<j<k} 1_{A_i∩A_j∩A_k} − · · · ± 1_{A_1∩···∩A_n}

• then take expectations of the result:

    P(A_1 ∪ · · · ∪ A_n) = E[1_{A_1∪···∪A_n}]
                         = ∑_i P(A_i) − ∑_{i<j} P(A_i ∩ A_j)
                           + ∑_{i<j<k} P(A_i ∩ A_j ∩ A_k) − · · · ± P(A_1 ∩ · · · ∩ A_n)
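Inclusion-exclusion is easy to verify on a small finite sample space; in this sketch the uniform space and the three events are arbitrary illustrative choices:

```python
# Check inclusion-exclusion: P(union) equals the alternating sum of
# intersection probabilities, on a uniform space of 30 points.
from itertools import combinations

S = range(30)
events = [set(range(0, 15)), set(range(10, 25)), {x for x in S if x % 3 == 0}]

def prob(A):
    return len(A) / len(S)

lhs = prob(set().union(*events))
rhs = sum(
    (-1) ** (k + 1)
    * sum(prob(set.intersection(*c)) for c in combinations(events, k))
    for k in range(1, len(events) + 1)
)
print(lhs, rhs)  # the two values agree
```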
Example
• Expectations and indicator functions can also be used to find bounds on probabilities.
• For example, suppose
  • X is a nonnegative random variable
  • r is a positive constant.
• Then

    X ≥ X 1_{X≥r} ≥ r 1_{X≥r}.

• As a result,

    E[X] ≥ E[r 1_{X≥r}] = r E[1_{X≥r}] = r P(X ≥ r).

• So P(X ≥ r) ≤ E[X]/r.
• This is sometimes called Markov’s inequality.
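A quick sketch of the inequality on a concrete example (a fair die roll, an arbitrary choice) shows the bound holding, though not tightly:

```python
# Markov's inequality: P(X >= r) <= E[X]/r for nonnegative X.
pmf = {x: 1 / 6 for x in range(1, 7)}            # fair die
EX = sum(x * p for x, p in pmf.items())          # 3.5
r = 5
tail = sum(p for x, p in pmf.items() if x >= r)  # P(X >= 5) = 1/3
print(tail, "<=", EX / r)                        # 1/3 <= 0.7
```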
Expectation for General Random Variables
• If X is nonnegative, define
E [X ] = sup{E [Y ] : Y simple, Y ≤ X}
• The result is always well-defined but may be infinite.
• If X ≤ 0, define E [X ] = −E [−X ].
• If X takes on both positive and negative values, let

    X⁺ = max{X, 0} ≥ 0   (positive part)
    X⁻ = −min{X, 0} ≥ 0  (negative part)

• Then X = X⁺ − X⁻ and |X| = X⁺ + X⁻.
• Define

    E[X] = E[X⁺] − E[X⁻]

  if E[|X|] = E[X⁺] + E[X⁻] is finite.
• If E[|X|] is infinite then E[X] is not defined.
• Properties such as homogeneity and additivity continue to hold provided all expectations exist.
Theorem
Let X be a random variable, g a nice real-valued function, and Y = g(X).

(i) If X is discrete with values x1, x2, . . ., then

    E[Y] = ∑_i g(x_i) f_X(x_i)

    if Y ≥ 0 or E[|Y|] < ∞.

(ii) If X is continuous, then

    E[Y] = ∫ g(x) f_X(x) dx

    if Y ≥ 0 or E[|Y|] < ∞.

(iii) For a mixed discrete/continuous distribution,

    E[Y] = p ∑_i g(x_i) f_1(x_i) + (1 − p) ∫ g(x) f_2(x) dx

    if Y ≥ 0 or E[|Y|] < ∞.
Example
• Let X have a geometric distribution with PMF

    f_X(x) = p(1 − p)^{x−1}  for x = 1, 2, . . ..

• Let q = 1 − p.
• Then

    E[X] = ∑_{x=1}^{∞} x p(1 − p)^{x−1} = p ∑_{x=1}^{∞} x q^{x−1} = p ∑_{x=1}^{∞} d/dq q^x
         = p d/dq ∑_{x=1}^{∞} q^x = p d/dq [q/(1 − q)]
         = p d/dq [1/(1 − q) − 1] = p · 1/(1 − q)² = p · 1/p² = 1/p.
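A truncated-series check of this result (the cutoff N is an arbitrary accuracy knob; the remaining tail is geometrically small):

```python
# Partial-sum check of E[X] = 1/p for the geometric distribution
# with support 1, 2, ....
def geom_mean_partial(p, N=10_000):
    return sum(x * p * (1 - p) ** (x - 1) for x in range(1, N + 1))

print(geom_mean_partial(0.25))  # close to 1/p = 4.0
```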
Example
• Suppose X is uniform on [a, b], i.e. X has PDF

    f_X(x) = 1/(b − a) for a < x < b, and 0 otherwise.

• Then

    E[X] = ∫_a^b x/(b − a) dx = (1/2) · x²/(b − a) |_a^b = (1/2) · (b² − a²)/(b − a) = (a + b)/2
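The integral can be sketched numerically with a midpoint rule (the endpoints and step count below are arbitrary illustrative choices):

```python
# Midpoint-rule approximation of E[X] = integral of x/(b - a) over [a, b],
# which should land on (a + b)/2.
def uniform_mean(a, b, steps=100_000):
    h = (b - a) / steps
    return sum((a + (i + 0.5) * h) / (b - a) * h for i in range(steps))

print(uniform_mean(2.0, 6.0))  # close to (2 + 6)/2 = 4.0
```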
Wednesday, September 30, 2015
Recap
• Expected values of simple random variables
• Properties of expected values
• Expected values for general random variables
• Computing expected values
• Examples
Example
• Let X have an exponential distribution with PDF

    f_X(x) = λe^{−λx} for x ≥ 0, and 0 otherwise.

• Then

    E[X] = ∫_0^∞ x λe^{−λx} dx = (1/λ) ∫_0^∞ (λx) e^{−λx} λ dx = (1/λ) ∫_0^∞ y e^{−y} dy

• Using integration by parts,

    ∫_0^∞ y e^{−y} dy = −y e^{−y} |_0^∞ + ∫_0^∞ e^{−y} dy = 0 + 1 = 1.

• So E[X] = 1/λ.
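A Monte Carlo sketch of the same result, using the standard library's exponential sampler (sample size and seed are arbitrary):

```python
# Simulation check of E[X] = 1/lambda for the exponential distribution.
import random

def exp_mean_mc(lam, n=200_000, seed=1):
    rng = random.Random(seed)
    return sum(rng.expovariate(lam) for _ in range(n)) / n

print(exp_mean_mc(2.0))  # close to 1/2
```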
Example
• Let X have a negative binomial distribution with PMF

    f_{n,p}(x) = C(x + n − 1, n − 1) p^n (1 − p)^x  for x = 0, 1, 2, . . ..

• X corresponds to the number of tails before the n-th head in tosses of a biased coin with probability of heads p.
• Then

    E[X] = ∑_{x=0}^{∞} x C(x + n − 1, n − 1) p^n (1 − p)^x
         = ∑_{x=0}^{∞} x · (x + n − 1)!/(x!(n − 1)!) · p^n (1 − p)^x
         = ∑_{x=1}^{∞} (x + n − 1)!/((x − 1)!(n − 1)!) · p^n (1 − p)^x.
Example (continued)
• Rewriting the sum in terms of y = x − 1 produces

    E[X] = ∑_{x=1}^{∞} (x + n − 1)!/((x − 1)!(n − 1)!) · p^n (1 − p)^x
         = ∑_{y=0}^{∞} (y + n)!/(y!(n − 1)!) · p^n (1 − p)^{y+1}

• Factoring out some terms:

    E[X] = (n(1 − p)/p) ∑_{y=0}^{∞} (y + n)!/(y! n!) · p^{n+1} (1 − p)^y
         = (n(1 − p)/p) ∑_{y=0}^{∞} f_{n+1,p}(y)
         = n(1 − p)/p.

• Relating a complicated sum or integral back to one we know is a very useful strategy.
• Alternative: view X as a sum of n geometric random variables.
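A truncated-series check of this mean (cutoff N and the parameter values are arbitrary; the neglected tail is geometrically small):

```python
# Partial-sum check of E[X] = n(1 - p)/p for the negative binomial
# distribution (number of tails before the n-th head).
from math import comb

def nbinom_mean_partial(n, p, N=500):
    return sum(
        x * comb(x + n - 1, n - 1) * p**n * (1 - p) ** x for x in range(N + 1)
    )

print(nbinom_mean_partial(5, 0.4))  # close to 5 * 0.6 / 0.4 = 7.5
```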
Standard Deviation and Variance
• There are many different possible measures of spread of a distribution.
• They are generally designed to give the “typical” magnitude of a deviation of X from its mean.
• One simple measure is the mean absolute deviation

    MAD(X) = E[|X − µ_X|].

• The lack of differentiability of the absolute value function makes this hard to work with.
• A mathematically nicer option is the root mean square deviation, or standard deviation:

    SD(X) = σ_X = √(E[(X − µ_X)²]).

• The key step in computing the standard deviation is finding the variance:

    Var(X) = σ_X² = E[(X − µ_X)²].
• A useful property of the standard deviation: for constants a and b,

    SD(aX + b) = |a| SD(X)

• For the variance:

    Var(aX + b) = a² Var(X)

Proof.

    Var(aX + b) = E[(aX + b − E[aX + b])²]
                = E[(aX − a E[X])²]
                = a² E[(X − E[X])²]
                = a² Var(X)
• A shortcut for computing variances:

    Var(X) = E[X²] − E[X]² = E[X²] − µ²

Proof.

    Var(X) = E[(X − µ)²]
           = E[X² − 2Xµ + µ²]
           = E[X²] − 2µ E[X] + µ²
           = E[X²] − 2µ² + µ²
           = E[X²] − µ²
Example
• Suppose X is uniform on [0, 1], with PDF

    f_X(x) = 1 for 0 < x < 1, and 0 otherwise.

• Then

    E[X] = 1/2
    E[X²] = ∫_0^1 x² dx = x³/3 |_0^1 = 1/3
    Var(X) = 1/3 − 1/4 = 1/12
    SD(X) = 1/√12 ≈ 0.2887
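These moments can be sketched numerically via the shortcut formula Var(X) = E[X²] − E[X]², with midpoint-rule quadrature standing in for the integrals:

```python
# Numerical check of Var(X) = 1/12 for X uniform on [0, 1].
def moment(k, steps=100_000):
    """Midpoint-rule approximation of E[X^k] = integral of x^k over [0, 1]."""
    h = 1.0 / steps
    return sum(((i + 0.5) * h) ** k * h for i in range(steps))

var = moment(2) - moment(1) ** 2
print(var)  # close to 1/12 ≈ 0.08333
```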
Example (continued)
• Suppose Y is uniform on [a, b].
• This means Y ∼ (b − a)X + a (check this).
• Then

    E[Y] = (b − a)/2 + a = (a + b)/2
    Var(Y) = (b − a)²/12
    SD(Y) = (b − a)/√12
Example
• Suppose X has a binomial distribution with n trials and probability of heads p, with PMF

    f_{n,p}(x) = C(n, x) p^x (1 − p)^{n−x}.

• Instead of computing E[X²] directly, it is easier to compute

    E[X(X − 1)] = E[X² − X] = E[X²] − E[X].

• Computing E[X(X − 1)]:

    E[X(X − 1)] = ∑_{x=0}^{n} x(x − 1) C(n, x) p^x (1 − p)^{n−x}
                = ∑_{x=2}^{n} n!/((x − 2)!(n − x)!) · p^x (1 − p)^{n−x}
                = n(n − 1)p² ∑_{y=0}^{n−2} C(n − 2, y) p^y (1 − p)^{n−2−y}
                = n(n − 1)p².
Example (continued)
• The mean of a binomial random variable is E[X] = np.
• As a result,

    E[X²] = E[X(X − 1)] + E[X]
          = n(n − 1)p² + np
          = n²p² − np² + np
          = n²p² + np(1 − p).

• The variance is

    Var(X) = n²p² + np(1 − p) − (np)² = np(1 − p).

• The standard deviation is SD(X) = √(np(1 − p)).
• Again the simplicity of the answer suggests there may be a simpler way.
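The E[X(X − 1)] route can be checked numerically; a stdlib-only sketch with arbitrary parameter values:

```python
# Verify Var(X) = np(1 - p) for the binomial via
# Var(X) = E[X(X - 1)] + E[X] - E[X]^2.
from math import comb

def binom_var(n, p):
    pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]
    EX = sum(x * q for x, q in enumerate(pmf))
    EXX1 = sum(x * (x - 1) * q for x, q in enumerate(pmf))
    return EXX1 + EX - EX**2

print(binom_var(12, 0.25))  # close to 12 * 0.25 * 0.75 = 2.25
```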
• The standard deviation provides information on how much a random variable can deviate from its mean.
• Using the Markov inequality, for any t > 0,

    P(|X − µ| > t) = P((X − µ)² > t²) ≤ E[(X − µ)²]/t² = σ_X²/t².

• For t = kσ_X this can be written as

    P(|X − µ| > kσ_X) ≤ 1/k².

• These inequalities are known as Chebychev’s inequality.
Chebychev’s inequality is usually quite conservative:

    P(|X − µ|/σ > 1) ≤ 1
    P(|X − µ|/σ > 2) ≤ 1/4
    P(|X − µ|/σ > 3) ≤ 1/9
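The conservatism is easy to see on a concrete distribution; the binomial(100, 0.5) below is an arbitrary illustrative choice:

```python
# Compare Chebychev's bound 1/k^2 with exact tail probabilities
# for a binomial(100, 0.5) random variable.
from math import comb, sqrt

n, p = 100, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]
for k in (1, 2, 3):
    exact = sum(q for x, q in enumerate(pmf) if abs(x - mu) > k * sigma)
    print(k, round(exact, 4), "bound:", 1 / k**2)
```

The exact tail probabilities come out far below the bounds, which is typical for well-behaved distributions.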
Moments
• For each n, the n-th (non-central) moment of a random variable X is

    µ′_n = E[X^n]

• The n-th central moment of X is

    µ_n = E[(X − µ)^n].

• The mean of X is the first non-central moment, µ′_1 = µ = E[X].
• The variance of X is its second central moment,

    Var(X) = µ_2 = E[(X − µ)²]
Friday, October 2, 2015
First Midterm Exam