ACTL2002/ACTL5101 Probability and Statistics: Week 3 Video Lecture Notes
© Katja Ignatieva
School of Risk and Actuarial Studies, Australian School of Business, University of New South Wales
[email protected]

Course outline: Probability: Weeks 1-4. Estimation: Weeks 5-6, Review. Hypothesis testing: Weeks 7-9. Linear regression: Weeks 10-12. Video lectures: Weeks 1, 2, 4 and 5.
Transcript
Numerical methods to summarize data
Introduction
Agenda:
- Special sampling distributions & sample mean and variance
- Numerical methods to summarize data: Introduction; Measures of location & spread; Numerical example
- Graphical procedures to summarize data: Summarizing data
Numerical methods to summarize data
Introduction
Population vs sample
Population: the large body of data;
Sample: a subset of the population.
Question: For each of the following four cases, would we refer to a population or a sample?
1. All the actuaries in Australia;
2. The temperature on 5, randomly chosen, days;
3. All NSW cars;
4. The basket of goods of each fifth customer on a given day.
Solution: 1. Population; 2. Sample; 3. Population; 4. Sample.
402/420
Numerical methods to summarize data
Introduction
Summarising data: Numerical approaches
Given a set of observations x1, x2, x3, ..., xn selected from a population (usually assumed i.i.d., independent and identically distributed).
Sorted data in ascending order: x(1), x(2), ..., x(n), such that x(1) is the smallest and x(n) is the largest.
Objectives:
- Understand the main features of data and summarise data (an essential first step in analysing data);
- Make inferences about the population (more on this later in the course).
403/420
Numerical methods to summarize data
Measures of location & spread
Measures of location
Used to estimate the central point of the sample; also called measures of central tendency.

The sample mean is given by:
    x̄ = (1/n) · Σ_{k=1}^{n} xk.

The population mean is given by:
    μX = Σ_{all x} pX(x) · x.

The 100α% trimmed mean is the average of the observations after discarding the lowest 100α% and the highest 100α%:
    x̃α = (x(⌊nα⌋+1) + ... + x(n−⌊nα⌋)) / (n − 2⌊nα⌋),
where ⌊nα⌋ is the greatest integer less than or equal to nα.
404/420
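As an illustration (a minimal Python sketch, not part of the original slides; the data values are made up), the sample mean and trimmed mean can be computed as:

```python
import math

def trimmed_mean(xs, alpha):
    """100*alpha% trimmed mean: discard the lowest and highest
    floor(n*alpha) observations, then average what remains."""
    xs = sorted(xs)
    n = len(xs)
    k = math.floor(n * alpha)
    kept = xs[k:n - k]
    return sum(kept) / len(kept)

data = [2, 4, 4, 5, 6, 7, 9, 100]   # hypothetical sample with one outlier
print(sum(data) / len(data))         # ordinary sample mean: 17.125
print(trimmed_mean(data, 0.125))     # drops x(1) = 2 and x(8) = 100
```

Note how much less sensitive the trimmed mean is to the single outlier than the ordinary sample mean.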
Numerical methods to summarize data
Measures of location & spread
Measures of spread
The sample variance:
s2 =1
n − 1·
n∑k=1
(xk − x)2 =1
n − 1·
(n∑
k=1
x2k +
n∑k=1
x2 − 2n∑
k=1
xkx
)
=1
n − 1·
(n∑
k=1
x2k − n · x2
).
The population variance:
σ2 = Var(X ) =∑all x
pX (x) · (x − µX )2 =∑all x
pX (x) · x2 − µ2X
Sample standard deviation: s =√s2.
Population standard deviation: σ =√σ2.
405/420
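The computational shortcut for s² above is easy to verify numerically; a minimal Python sketch with made-up data:

```python
def sample_variance(xs):
    """Definition: s^2 = 1/(n-1) * sum of (x_k - xbar)^2."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

def sample_variance_shortcut(xs):
    """Computational form: s^2 = 1/(n-1) * (sum of x_k^2 - n * xbar^2)."""
    n = len(xs)
    xbar = sum(xs) / n
    return (sum(x * x for x in xs) - n * xbar ** 2) / (n - 1)

data = [3.1, 2.7, 4.4, 5.0, 3.8]     # made-up observations, mean 3.8
s2 = sample_variance(data)
assert abs(s2 - sample_variance_shortcut(data)) < 1e-9
print(s2, s2 ** 0.5)                  # sample variance (0.875) and standard deviation
```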
Numerical methods to summarize data
Measures of location & spread
Quantiles
Pα, the αth quantile or (α×100)th percentile, satisfies:
    (1/n)·[number of xk < Pα] ≤ α ≤ (1/n)·[number of xk ≤ Pα],
approximated by linear interpolation as the ((n−1)α + 1)th observation.

Quartiles: Q1 (25th percentile) and Q3 (75th percentile).
Quantile function: FX⁻¹(u), u ∈ [0, 1], where FX(x) = u.

Question: What are the 0.025, 0.16, 0.5, 0.84 and 0.975 quantiles of the N(0,1) distribution?
Solution: They are −1.96, −1, 0, 1 and 1.96, respectively.
406/420
Numerical methods to summarize data
Measures of location & spread
Mode: The mode m is the value that maximises the p.m.f. pX(x) in the discrete case or the p.d.f. fX(x) in the continuous case.

Median, M:
    M = x((n+1)/2), if n is odd;
    M = (x(n/2) + x(n/2+1)) / 2, if n is even.

Median absolute deviation:
    MAD = median of the numbers {|xi − M|}.

Range:
    R = x(n) − x(1).

Interquartile range:
    IQR = Q3 − Q1.
407/420
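The median, MAD and range above can be sketched in Python (illustrative data, not from the slides):

```python
def median(xs):
    xs = sorted(xs)
    n = len(xs)
    if n % 2 == 1:
        return xs[n // 2]                        # x((n+1)/2) in 1-based terms
    return (xs[n // 2 - 1] + xs[n // 2]) / 2     # average of x(n/2) and x(n/2+1)

def mad(xs):
    """Median absolute deviation from the sample median."""
    m = median(xs)
    return median([abs(x - m) for x in xs])

data = [1, 3, 3, 6, 7, 8, 9]   # made-up sample, already sorted for readability
print(median(data))             # 6
print(mad(data))                # median of {5, 3, 3, 0, 1, 2, 3} = 3
print(max(data) - min(data))    # range R = x(n) - x(1) = 8
```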
Numerical methods to summarize data
Numerical example
Numerical example
An insurance company has incurred 26 claims with the following amounts, displayed in a stem-and-leaf plot:
- Each row corresponds to a bin.
- The number before | displays the number of thousands (or hundreds/tens, etc.).
- Each number after | displays the 3rd (or 2nd/1st) digit of an observation.
Note: rounding!
417/420
as the joint probability mass function of X, then:
    Σ_{i=1}^{∞} Σ_{j=1}^{∞} p_{X1,X2}(x1i, x2j) = 1.
504/562
ACTL2002/ACTL5101 Probability and Statistics: Week 3
The Bivariate Case
Introduction
Discrete Random Variables
The marginal p.m.f.s of X1 and X2 are, respectively,
    p_{X1}(x1i) = Σ_{j=1}^{∞} p_{X1,X2}(x1i, x2j)
and
    p_{X2}(x2j) = Σ_{i=1}^{∞} p_{X1,X2}(x1i, x2j)
(sum over the other random variable(s)).
Proof: use the Law of Total Probability.
505/562
The Bivariate Case
Introduction
Example: discrete random variables
An insurer offers both disability insurance (DI) and unemployment insurance (UI) to small companies.
Most companies buy DI and UI because of a large discount.
The claims are categorized as "no claims", "mild claims", and "severe claims".
Last year the 100 insured fell into the following categories:

    DI:  no   no   no      mild  mild  mild    severe  severe  severe
    UI:  no   mild severe  no    mild  severe  no      mild    severe
    #:   74   6    2       3     2     4       1       3       5

Question: Find the marginal p.m.f.s of DI and UI.

Solution:
    x         no                     mild                   severe
    pDI(x)    (74+6+2)/100 = 0.82    (3+2+4)/100 = 0.09     (1+3+5)/100 = 0.09
    pUI(x)    (74+3+1)/100 = 0.78    (6+2+3)/100 = 0.11     (2+4+5)/100 = 0.11
506/562
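A quick Python check of the marginal p.m.f.s above, with the joint counts entered as a dictionary (the 0/1/2 coding of no/mild/severe is our own choice):

```python
# Joint counts from the DI/UI example (100 insured companies);
# categories coded 0 = "no", 1 = "mild", 2 = "severe".
counts = {
    (0, 0): 74, (0, 1): 6, (0, 2): 2,    # keys are (DI, UI)
    (1, 0): 3,  (1, 1): 2, (1, 2): 4,
    (2, 0): 1,  (2, 1): 3, (2, 2): 5,
}
n = sum(counts.values())

# Marginal p.m.f.s: sum the joint p.m.f. over the other variable.
p_DI = [sum(c for (di, ui), c in counts.items() if di == k) / n for k in range(3)]
p_UI = [sum(c for (di, ui), c in counts.items() if ui == k) / n for k in range(3)]
print(p_DI)   # [0.82, 0.09, 0.09]
print(p_UI)   # [0.78, 0.11, 0.11]
```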
The Bivariate Case
Introduction
Continuous Random Variables
In the case where X1 and X2 are both continuous random variables, we set the joint density function of X as
    f_{X1,X2}(x1, x2) = ∂²/(∂x1 ∂x2) F_{X1,X2}(x1, x2),
and therefore the joint cumulative distribution function is given by:
    F_{X1,X2}(x1, x2) = ∫_{−∞}^{x2} ∫_{−∞}^{x1} f_{X1,X2}(z1, z2) dz1 dz2.

Note:
    F_{X1,X2}(∞, ∞) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f_{X1,X2}(z1, z2) dz1 dz2 = 1,
    F_{X1,X2}(−∞, −∞) = 0.
507/562
The Bivariate Case
Introduction
Continuous Random Variables
The marginal density functions of X1 and X2 are, respectively:
    f_{X1}(x1) = ∫_{−∞}^{∞} f_{X1,X2}(x1, z2) dz2   and   f_{X2}(x2) = ∫_{−∞}^{∞} f_{X1,X2}(z1, x2) dz1.

The marginal cumulative distribution functions of X1 and X2 are then, respectively:
    F_{X1}(x1) = ∫_{−∞}^{x1} f_{X1}(u) du   and   F_{X2}(x2) = ∫_{−∞}^{x2} f_{X2}(u) du,
or, alternatively:
    F_{X1}(x1) = ∫_{−∞}^{∞} ∫_{−∞}^{x1} fX(u1, u2) du1 du2
and
    F_{X2}(x2) = ∫_{−∞}^{x2} ∫_{−∞}^{∞} fX(u1, u2) du1 du2.
508/562
The Bivariate Case
Introduction
Continuous Random Variables: example
The joint p.d.f. of X and Y is given by:
    f_{X,Y}(x, y) = 4·x·(1−y), for 0 ≤ x, y ≤ 1, and 0 otherwise.

a. The marginal p.d.f. of X is:
    fX(x) = ∫_{−∞}^{∞} f_{X,Y}(x, y) dy = ∫_0^1 4·x·(1−y) dy = [4·x·(y − y²/2)]_0^1 = 2x.

b. The marginal c.d.f. of X is:
    FX(x) = ∫_{−∞}^{x} fX(z) dz = ∫_0^x 2z dz = [z²]_0^x = x², if 0 ≤ x ≤ 1,
and zero if x < 0 and one if x > 1.

c. The marginal p.d.f. of Y is:
    fY(y) = ∫_{−∞}^{∞} f_{X,Y}(x, y) dx = ∫_0^1 4·x·(1−y) dx = [2·x²·(1−y)]_0^1 = 2·(1−y).

d. The marginal c.d.f. of Y is:
    FY(y) = ∫_{−∞}^{y} fY(z) dz = ∫_0^y 2·(1−z) dz = [2z − z²]_0^y = 2y − y², if 0 ≤ y ≤ 1,
and zero if y < 0 and one if y > 1.
509/562
Joint & Multivariate Distributions
- The Bivariate Case: Introduction; Exercises; Means, Variances, Covariances; Correlation coefficient; Conditional Distributions; The Bivariate Normal Distribution
- Laws: Law of Iterated Expectations; Conditional variance identity; Application & Exercise
- The Multivariate Case: Introduction
- Summarizing data: Exercises
- Summary
The Bivariate Case
Exercises
Exercise: Discrete case
Let X be the random variable taking one if there is a positive return on the asset portfolio and zero otherwise.
Let Y be the random variable for the claims for home insurance, which can take the values 0, 1, 2, and 3 for few, normal, many claims and a large number of claims due to floods, respectively.
The marginal probability mass functions of X and Y are:

    X = x   Pr(X = x)        Y = y   Pr(Y = y)
    0       1/2              0       1/8
    1       1/2              1       3/8
                             2       3/8
                             3       1/8

Question: What would be the joint probability mass function if X and Y are independent?
510/562
The Bivariate Case
Exercises
Exercise: Discrete case
Solution: If the two are independent, we would have:
    Pr(X = x, Y = y) = Pr(X = x) · Pr(Y = y).
For all X = x and Y = y the joint distribution, if they are independent, is described in the table below:

    Pr(X = x, Y = y)   Y = 0   Y = 1   Y = 2   Y = 3
    X = 0              1/16    3/16    3/16    1/16
    X = 1              1/16    3/16    3/16    1/16
511/562
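Using exact fractions, a minimal sketch confirming the product table above:

```python
from fractions import Fraction as F

p_X = {0: F(1, 2), 1: F(1, 2)}
p_Y = {0: F(1, 8), 1: F(3, 8), 2: F(3, 8), 3: F(1, 8)}

# Under independence, the joint p.m.f. is the product of the marginals.
joint = {(x, y): p_X[x] * p_Y[y] for x in p_X for y in p_Y}

assert joint[(0, 0)] == F(1, 16)
assert joint[(0, 1)] == F(3, 16)
assert sum(joint.values()) == 1      # a valid p.m.f. sums to one
print(joint[(1, 3)])                 # 1/16
```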
The Bivariate Case
Exercises
Exercise: Discrete case
Suppose instead they are not independent and their joint distribution is described as:

    Pr(X = x, Y = y)   Y = 0   Y = 1   Y = 2   Y = 3
    X = 0              0       3/16    3/16    1/8
    X = 1              1/8     3/16    3/16    0

Question: Prove that X and Y are dependent.
Solution: We have Pr(Y = 3) = 1/8 and Pr(X = 1) = 1/2. However, when Y takes the value 3, the probability that X takes the value 1 is zero (the joint probability of Y = 3 and X = 1 is zero), so Pr(X = 1, Y = 3) = 0 ≠ Pr(X = 1) · Pr(Y = 3) = 1/16.
512/562
The Bivariate Case
Exercises
Example: Multinomial distribution
Suppose we have n independent trials with r outcomes with probabilities p1, p2, ..., pr.
The joint frequency distribution is given by:
    p_{N1,N2,...,Nr}(n1, n2, ..., nr) = n! / (n1! · n2! · ... · nr!) · p1^{n1} · p2^{n2} · ... · pr^{nr}.

The marginal distribution is (a Binomial distribution!) given by:
    p_{Ni}(ni) = Σ_{n1} ... Σ_{n_{i−1}} Σ_{n_{i+1}} ... Σ_{nr} p_{N1,N2,...,Nr}(n1, n2, ..., nr)
              *= (n choose ni) · pi^{ni} · (1 − pi)^{n − ni}.

Can do this by summing the joint p.m.f. over the other counts.
* Using the Binomial expansion (proof not required).
513/562
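The binomial marginal can be checked by brute force for a small case (a sketch; n = 4, r = 3, and the probabilities are arbitrary choices):

```python
from math import comb, factorial, isclose

def multinomial_pmf(ns, ps):
    """Joint p.m.f. n!/(n1!*...*nr!) * p1^n1 * ... * pr^nr."""
    coef = factorial(sum(ns))
    for k in ns:
        coef //= factorial(k)
    prob = float(coef)
    for k, p in zip(ns, ps):
        prob *= p ** k
    return prob

n, ps = 4, (0.2, 0.3, 0.5)
for n1 in range(n + 1):
    # Marginal of N1: sum the joint p.m.f. over every split of the rest.
    marg = sum(multinomial_pmf((n1, n2, n - n1 - n2), ps)
               for n2 in range(n - n1 + 1))
    binom = comb(n, n1) * ps[0] ** n1 * (1 - ps[0]) ** (n - n1)
    assert isclose(marg, binom)
print("marginal of N1 matches Binomial(4, 0.2)")
```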
The Bivariate Case
Exercises
Exercise: Continuous case
Now consider an example of a bivariate random vector [X, Y]ᵀ whose joint density function is:
    f_{X,Y}(x, y) = c·(x² + x·y), for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1,
and zero otherwise. To find the constant c, note that f must be a valid density, so that:
    1 = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f_{X,Y}(x, y) dx dy = ∫_0^1 ∫_0^1 c·(x² + x·y) dx dy
      = c · ∫_0^1 [x³/3 + x²·y/2]_0^1 dy = c · [y/3 + y²/4]_0^1 = c · 7/12.
Hence, c = 12/7; then also f_{X,Y}(x, y) ≥ 0 for all x, y.
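A numerical sanity check (a sketch, not from the slides) that c = 12/7 normalises the density, using midpoint-rule integration over the unit square:

```python
# Midpoint-rule check that f(x, y) = c*(x^2 + x*y) integrates to 1
# over the unit square when c = 12/7.
c = 12 / 7
m = 400                      # 400 x 400 grid of cells
h = 1.0 / m
total = 0.0
for i in range(m):
    x = (i + 0.5) * h        # cell midpoint in x
    for j in range(m):
        y = (j + 0.5) * h    # cell midpoint in y
        total += c * (x * x + x * y) * h * h
print(total)                 # close to 1
```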
a. Question: Find the marginal densities.
b. Question: Find the joint distribution function.
514/562
The Bivariate Case
Exercises
Exercise: Continuous case
a. Solution: Knowing the constant, we can then determine the marginal densities. First the marginal density of X:
    fX(x) = ∫_{−∞}^{∞} f_{X,Y}(x, y) dy = ∫_0^1 (12/7)·(x² + x·y) dy
          = (12/7)·(x² + x/2), for 0 ≤ x ≤ 1,
and zero otherwise; and for Y:
    fY(y) = ∫_{−∞}^{∞} f_{X,Y}(x, y) dx = ∫_0^1 (12/7)·(x² + x·y) dx
          = (12/7)·(1/3 + y/2), for 0 ≤ y ≤ 1,
and zero otherwise.
515/562
The Bivariate Case
Exercises
Exercise: Continuous case
b. Solution: You can also determine the joint distribution function, for 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, by:
    F_{X,Y}(x, y) = ∫_{−∞}^{y} ∫_{−∞}^{x} f_{X,Y}(u, v) du dv = ∫_0^y ∫_0^x (12/7)·(u² + u·v) du dv
                  = ∫_0^y [(12/7)·(u³/3 + u²·v/2)]_0^x dv = ∫_0^y (12/7)·(x³/3 + x²·v/2) dv
                  = [(12/7)·(x³·v/3 + x²·v²/4)]_0^y = (12/7)·(x³·y/3 + x²·y²/4).

Hence:
    F_{X,Y}(x, y) = 0, if x < 0 or y < 0;
    F_{X,Y}(x, y) = (12/7)·(x³·y/3 + x²·y²/4), if 0 ≤ x ≤ 1, 0 ≤ y ≤ 1;
    F_{X,Y}(x, y) = FX(x), if y > 1;
    F_{X,Y}(x, y) = FY(y), if x > 1.
516/562
The Bivariate Case
Exercises
Exercise: Continuous case
[Figure: four panels on the unit square: a surface plot of the joint distribution of (X, Y), the two marginals of X and Y, and the x-y region used in the probability calculation on slide 519.]
517/562
The Bivariate Case
Exercises
Exercise: Continuous case
You can then determine the marginal distributions:
    FX(x) = F_{X,Y}(x, 1) = 0, if x < 0;
                          = (12/7)·(x³/3 + x²/4), if 0 ≤ x ≤ 1;
                          = 1, if x > 1,
and
    FY(y) = F_{X,Y}(1, y) = 0, if y < 0;
                          = (12/7)·(y/3 + y²/4), if 0 ≤ y ≤ 1;
                          = 1, if y > 1.
Can you confirm the marginal densities are correct?
518/562
The Bivariate Case
Exercises
Exercise: Continuous case
It becomes straightforward to compute probability statements such as (using the lower right panel on slide 517):
    Pr(X < Y) = ∫_0^1 ∫_0^y (12/7)·(x² + x·y) dx dy
              = (12/7) · ∫_0^1 [x³/3 + x²·y/2]_0^y dy
              = (12/7) · ∫_0^1 (y³/3 + y³/2) dy
              = ∫_0^1 (12/7)·(5/6)·y³ dy = (12·5)/(7·6) · [y⁴/4]_0^1 = 5/14,
so that Pr(X > Y) = ∫_{−∞}^{∞} ∫_y^{∞} f_{X,Y}(x, y) dx dy = 9/14.
519/562
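The value Pr(X < Y) = 5/14 can likewise be checked numerically (a sketch: midpoint rule over the part of the unit square where x < y):

```python
# Midpoint-rule check of Pr(X < Y) = 5/14 for the density
# f(x, y) = (12/7)*(x^2 + x*y) on the unit square.
c = 12 / 7
m = 400
h = 1.0 / m
prob = 0.0
for i in range(m):
    x = (i + 0.5) * h
    for j in range(m):
        y = (j + 0.5) * h
        if x < y:                        # integrate only over the region x < y
            prob += c * (x * x + x * y) * h * h
print(prob)                              # close to 5/14 = 0.3571...
```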
The Bivariate Case
Means, Variances, Covariances
Means
Consider the bivariate random vector X = [X1 X2]ᵀ.
The mean of X is the vector whose elements are the means of X1 and X2, that is:
    E[X] = [E[X1], E[X2]]ᵀ = [μ1, μ2]ᵀ.

If X1, X2, ..., Xn are jointly distributed random variables with expectations E[Xi] for i = 1, ..., n, and Y is an affine function of the Xi, i.e.,
    Y = a + Σ_{i=1}^{n} bi·Xi,
then we have the additivity rule:
    E[Y] = E[a + Σ_{i=1}^{n} bi·Xi] = a + Σ_{i=1}^{n} E[bi·Xi] = a + Σ_{i=1}^{n} bi·E[Xi].
520/562
The Bivariate Case
Means, Variances, Covariances
Variances, Covariances
Recall: the variance of X is a measure of the spread of X.
Covariance is a measure of the joint spread of X1 and X2.

The variance of the random vector X is also called the variance-covariance matrix:
    Var(X) = [ Var(X1)       Cov(X1, X2) ]   =   [ σ1²   σ12 ]
             [ Cov(X1, X2)   Var(X2)     ]       [ σ12   σ2² ],
where the covariance is defined as:
    Cov(X1, X2) ≡ σ12 = E[(X1 − μ1)·(X2 − μ2)]
                      = E[X1·X2 − X1·μ2 − μ1·X2 + μ1·μ2]
                      = E[X1·X2] − E[X1]·E[X2].
Note: Cov(Xi, Xi) = σii = σi², and covariance is only defined for a pair of random variables.
521/562
The Bivariate Case
Means, Variances, Covariances
Example: Consider the example from slide 506.
[Figure: bar chart of the joint p.m.f. of DI and UI over the categories no/mild/severe.]

Question: Is the covariance positive or negative?

Let "no" = 0, "mild" = 1, and "severe" = 2.
Question: Calculate the means of X1 = DI and X2 = UI.
Solution:
    E[X1] = (3+2+4)/100 · 1 + (1+3+5)/100 · 2 = 0.27.
    E[X2] = (6+2+3)/100 · 1 + (2+4+5)/100 · 2 = 0.33.
Question: Calculate the covariance between X1 and X2.
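A quick Python computation for the questions above, using the counts from slide 506 with the 0/1/2 coding:

```python
# Joint counts from slide 506, with "no" = 0, "mild" = 1, "severe" = 2.
counts = {
    (0, 0): 74, (0, 1): 6, (0, 2): 2,     # keys are (DI, UI)
    (1, 0): 3,  (1, 1): 2, (1, 2): 4,
    (2, 0): 1,  (2, 1): 3, (2, 2): 5,
}
n = sum(counts.values())

E1 = sum(di * c for (di, ui), c in counts.items()) / n      # E[X1] = 0.27
E2 = sum(ui * c for (di, ui), c in counts.items()) / n      # E[X2] = 0.33
E12 = sum(di * ui * c for (di, ui), c in counts.items()) / n
cov = E12 - E1 * E2
print(cov)    # 0.36 - 0.27*0.33 = 0.2709 > 0: the covariance is positive
```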
The Bivariate Case
Means, Variances, Covariances
Let X ∼ Beta(0.2, 1) (the probability of a claim) and Y|X ∼ NB(3, X) (so Y ∼ Beta-Negative-Binomial). Home insurance: an insured is qualified as a bad risk if there are 3 claims within 50 quarters.
Question: Does it have a negative or positive covariance?
524/562
The Bivariate Case
Means, Variances, Covariances
Properties of Covariance
If X and Y are jointly distributed random variables with expectations μX and μY, the covariance of X and Y is:
    Cov(X, Y) = E[(X − μX)·(Y − μY)]
              = E[X·Y − X·μY − Y·μX + μX·μY]
              = E[X·Y] − μX·μY.

If X and Y are independent:
    Cov(X, Y) = E[X·Y] − μX·μY *= E[X]·E[Y] − μX·μY = 0.
* using independence of X, Y.
525/562
The Bivariate Case
Means, Variances, Covariances
Properties of Covariance
Let X, Y, Z be random variables and a, b ∈ ℝ; we have:
    Cov(a + X, Y) = E[(a + X − (a + μX))·(Y − μY)]
                  = E[(X − μX)·(Y − μY)] = Cov(X, Y),
    Cov(a·X, b·Y) = E[(a·X − a·μX)·(b·Y − b·μY)]
                  = E[a·(X − μX)·b·(Y − μY)]
                  = a·b·E[(X − μX)·(Y − μY)] = a·b·Cov(X, Y),
    Cov(X, Y + Z) = E[(X − μX)·(Y + Z − μY − μZ)]
                  = E[(X − μX)·(Y − μY)] + E[(X − μX)·(Z − μZ)]
                  = Cov(X, Y) + Cov(X, Z).
The Bivariate Case
Conditional Distributions
Application: an imperfect particle counter
Define the random variable N as the number of incoming claims and X as the number of claims paid. The probability that a claim is fraudulent is q = 1 − p, and the number of claims paid is Binomial:
    (X | N = n) ∼ Binomial(n, p).
If the number of incoming claims follows a Poisson distribution (with parameter λ), then the number of claims paid turns out to also be Poisson, with parameter λ·p. This is an example of "thinning" of a Poisson probability.
We will see more on thinning of a Poisson probability in ACTL2003/5103 using Markov chains.
Proof: see the next slides.
540/562
The Bivariate Case
Conditional Distributions
Application: an imperfect particle counter
Proof: the law of total probability (why can we apply it here?) gives:
    Pr(X = k) = Σ_{n=0}^{∞} Pr(X = k | N = n) · Pr(N = n)
              = Σ_{n=k}^{∞} (n choose k) · p^k · (1−p)^{n−k} · λ^n · e^{−λ} / n!,   since n ≥ k,
              = Σ_{n=k}^{∞} n! / ((n−k)! · k!) · p^k · (1−p)^{n−k} · λ^n · e^{−λ} / n!;
continues on the next slide.
541/562
The Bivariate Case
Conditional Distributions
Application: an imperfect particle counter
Now (making the change of variables j = n − k in the third line):
    Pr(X = k) = Σ_{n=k}^{∞} n! / ((n−k)! · k!) · p^k · (1−p)^{n−k} · λ^n · e^{−λ} / n!
              = ((λ·p)^k / k!) · e^{−λ} · Σ_{n=k}^{∞} λ^{n−k} · (1−p)^{n−k} / (n−k)!
              = ((λ·p)^k / k!) · e^{−λ} · Σ_{j=0}^{∞} (λ·(1−p))^j / j!
             *= ((λ·p)^k / k!) · e^{−λ} · e^{λ·(1−p)} = ((λ·p)^k / k!) · e^{−λ·p},
which is the p.m.f. of a Poisson(λ·p) random variable.
* using the exponential series exp(x) = Σ_{i=0}^{∞} x^i / i!, with x = λ·(1−p).
542/562
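The thinning result can also be checked by simulation (a sketch; λ = 4 and p = 0.6 are arbitrary choices, and the Poisson sampler uses Knuth's product method):

```python
import math
import random

random.seed(1)
lam, p, trials = 4.0, 0.6, 100_000

def poisson(lam):
    """Knuth's product method; adequate for small lambda."""
    limit, k, prod = math.exp(-lam), 0, random.random()
    while prod >= limit:
        k += 1
        prod *= random.random()
    return k

paid = []
for _ in range(trials):
    n = poisson(lam)                                   # incoming claims
    # each incoming claim is paid independently with probability p
    paid.append(sum(1 for _ in range(n) if random.random() < p))

mean = sum(paid) / trials
var = sum((x - mean) ** 2 for x in paid) / trials
print(mean, var)   # both near lam*p = 2.4 (a Poisson mean equals its variance)
```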
The Bivariate Case
The Bivariate Normal Distribution
The Bivariate Normal Distribution
Suppose [X, Y]ᵀ has a bivariate normal distribution; then its density is given by:
    f_{X,Y}(x, y) = 1 / (2π·σX·σY·√(1−ρ²)) · exp(−A / (2·(1−ρ²))),
where
    A = ((x − μX)/σX)² − 2ρ·((x − μX)/σX)·((y − μY)/σY) + ((y − μY)/σY)².
543/562
The Bivariate Case
The Bivariate Normal Distribution
The following results are important although quite tedious to show (see section 5.10 of W+ (7th ed.) for some of the derivation):
1. The marginals are: X ∼ N(μX, σX²) and Y ∼ N(μY, σY²).
2. The conditional distributions are:
    (Y | X = x) ∼ N(μY + ρ·(x − μX)·σY/σX, σY²·(1−ρ²))   and
    (X | Y = y) ∼ N(μX + ρ·(y − μY)·σX/σY, σX²·(1−ρ²)).
3. The correlation coefficient between X and Y is: ρ(X, Y) = ρ.
544/562
The Bivariate Case
The Bivariate Normal Distribution
Simulating the multivariate normal distribution
Bivariate case: use properties 1 & 2 to simulate from i.i.d. standard normal random variables:
    X = μX + σX·Z1,
    Y = μY + σY·ρ·Z1 + σY·√(1−ρ²)·Z2,
where Z1 and Z2 are i.i.d. N(0, 1).

OPTIONAL: In the case of a multivariate normal, let Z = [Z1 ... Zn]ᵀ be i.i.d. N(0, 1); we have:
- The Cholesky decomposition: A·Aᵀ = Σ (Σ is the variance-covariance matrix).
- Then X = μ + A·Z.
545/562
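A sketch of the bivariate recipe above, checking that the sample correlation recovers ρ (the parameter values are arbitrary):

```python
import math
import random

random.seed(42)
mu_x, mu_y, s_x, s_y, rho = 1.0, -2.0, 2.0, 0.5, 0.7   # arbitrary parameters

xs, ys = [], []
for _ in range(50_000):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)    # i.i.d. N(0, 1)
    xs.append(mu_x + s_x * z1)
    ys.append(mu_y + s_y * rho * z1 + s_y * math.sqrt(1 - rho ** 2) * z2)

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / n
sx = (sum((a - mx) ** 2 for a in xs) / n) ** 0.5
sy = (sum((b - my) ** 2 for b in ys) / n) ** 0.5
r = cov / (sx * sy)
print(r)   # sample correlation, close to rho = 0.7
```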
Laws
Law of Iterated Expectations
Law of Iterated Expectations
Note: E[X | Y = y] is a constant, but E[X | Y] is a random variable.
For any two random variables X and Y, we have the law of iterated expectations:
    E[E[Y | X]] = E[Y].
To prove this in the continuous case, first consider:
    E[E[Y | X]] = ∫_{−∞}^{∞} E[Y | X = x] · fX(x) dx
                = ∫_{−∞}^{∞} (∫_{−∞}^{∞} y · f_{Y|X}(y|x) dy) · fX(x) dx.
546/562
Laws
Law of Iterated Expectations
Interchanging the order of integration, we have:
    E[E[Y | X]] = ∫_{−∞}^{∞} y · ∫_{−∞}^{∞} f_{Y|X}(y|x) · fX(x) dx dy   (the inner integral equals fY(y))
               *= ∫_{−∞}^{∞} y · fY(y) dy
                = E[Y].
* using the law of total probability (why can we use it here?).
547/562
Laws
Conditional variance identity
Conditional variance identity
Another important result is the conditional variance identity:
    Var(Y) = Var(E[Y | X]) + E[Var(Y | X)].
Proof (* using the law of iterated expectations):
    Var(Y) = E[Y²] − (E[Y])²
          *= E[E[Y² | X]] − (E[E[Y | X]])²
           = E[E[Y² | X]] − E[(E[Y | X])²] + E[(E[Y | X])²] − (E[E[Y | X]])²
           = E[Var(Y | X)] + Var(E[Y | X]).
The proof can also be found in section 5.11 of W+ (7th ed.).
548/562
Laws
Application & Exercise
Application: Random Sums
An insurance company usually has uncertainty in both the number of claims and the amount of each claim filed.
Denote by S the total claim size, by Xi the individual claim sizes, and by N the total number of claims.
We are interested in the (distribution,) mean and variance of a random sum defined as:
    S = X1 + X2 + ... + XN,
where both the Xi's and N are random variables.
We assume all the Xi are independent and identically distributed, and also independent of N.
549/562
Laws
Application & Exercise
Application: Random Sums
Mean of S: The mean of the aggregate claims is:
    E[S] = E[Xi] · E[N].
This is straightforward:
    E[S] = E[E[S | N]]
         = E[E[Σ_{i=1}^{N} Xi | N]]
         = E[Σ_{i=1}^{N} E[Xi | N]]
         = E[N · E[Xi | N]]
        *= E[Xi] · E[N].
* using independence of the Xi and N.
550/562
Laws
Application & Exercise
Application: Random Sums
Variance of S: The variance of the aggregate claims is:
    Var(S) = (E[Xi])² · Var(N) + E[N] · Var(Xi).
This is also straightforward to show:
    Var(S) *= E[Var(S | N)] + Var(E[S | N])
            = E[Var(Σ_{i=1}^{N} Xi | N)] + Var(E[Xi] · N)
          **= E[N · Var(Xi)] + (E[Xi])² · Var(N)
            = E[N] · Var(Xi) + (E[Xi])² · Var(N).
* using the conditional variance identity; ** using independence between the Xi and N (Var(Xi) and E[Xi] are constants).
551/562
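Both moment formulas can be checked by simulating the random sum (a sketch; N ~ Binomial(5, 0.4) and exponential claim sizes with mean 10 are arbitrary choices, not from the slides):

```python
import random

random.seed(7)
trials = 100_000
n_tr, q = 5, 0.4        # N ~ Binomial(5, 0.4): E[N] = 2, Var(N) = 1.2
mean_x = 10.0           # X_i ~ Exponential with mean 10: Var(X_i) = 100

ss = []
for _ in range(trials):
    n = sum(1 for _ in range(n_tr) if random.random() < q)   # draw N
    ss.append(sum(random.expovariate(1 / mean_x) for _ in range(n)))

m = sum(ss) / trials
v = sum((s - m) ** 2 for s in ss) / trials
print(m, v)   # near E[S] = 2*10 = 20 and Var(S) = 10^2 * 1.2 + 2 * 100 = 320
```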
Laws
Application & Exercise
Application: Random Sums
Moment Generating Function of S: The m.g.f. of the aggregate claims is given by:
    MS(t) = MN(log(MX(t))).
Finding the m.g.f. is also straightforward:
    MS(t) = E[e^{tS}] = E[E[e^{tS} | N]]
          = E[(MX(t))^N] = E[e^{N·log(MX(t))}]
          = MN(log(MX(t))).
Note that when the number of claims has a Poisson distribution, the resulting total claim amount S is said to have a Compound Poisson distribution.
552/562
Laws
Application & Exercise
Exercise
Let X ∼ Gamma(α, β) and Y | X ∼ EXP(1/X).
a. Question: Find E[Y]. (Note: E[X] = α/β; EXP(λ) = Gamma(1, λ).)
b. Question: Find Var(Y). (Note: Var(X) = α/β².)

a. Solution:
    E[Y] = E[E[Y | X]] = E[X] = α/β.

b. Solution:
    Var(Y) = Var(E[Y | X]) + E[Var(Y | X)]
           = Var(X) + E[X²]
           = Var(X) + Var(X) + (E[X])²
           = α/β² + α/β² + (α/β)² = (2α + α²)/β².
553/562
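A Monte Carlo check of this exercise (a sketch; α = 2, β = 4 are arbitrary, giving E[Y] = 0.5 and Var(Y) = (2·2 + 2²)/4² = 0.5; note that Python's gammavariate takes a scale parameter, hence the 1/beta):

```python
import random

random.seed(3)
alpha, beta = 2.0, 4.0
trials = 100_000

ys = []
for _ in range(trials):
    # gammavariate takes a SCALE parameter, so scale = 1/beta gives rate beta
    x = random.gammavariate(alpha, 1 / beta)
    ys.append(random.expovariate(1 / x))      # Y | X = x exponential with mean x

m = sum(ys) / trials
v = sum((y - m) ** 2 for y in ys) / trials
print(m, v)   # near E[Y] = alpha/beta = 0.5 and Var(Y) = (2*alpha + alpha^2)/beta^2 = 0.5
```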
The Multivariate Case
Introduction
The Multivariate Case
Let X = [X1, X2, ..., Xn]ᵀ be a random vector with n elements. The joint distribution function (DF) of X is denoted by:
    F_{X1,X2,...,Xn}(x1, ..., xn) = Pr(X1 ≤ x1, ..., Xn ≤ xn).
In the continuous case, we define the joint density function of X as:
    f_{X1,X2,...,Xn}(x1, ..., xn) = ∂ⁿ/(∂x1 ... ∂xn) F_{X1,X2,...,Xn}(x1, ..., xn).
554/562
The Multivariate Case
Introduction
The joint DF is given by:
    F_{X1,X2,...,Xn}(x1, ..., xn) = ∫_{−∞}^{xn} ... ∫_{−∞}^{x1} f_{X1,X2,...,Xn}(z1, ..., zn) dz1 ... dzn.

To derive marginal p.m.f.s or densities, simply evaluate (sum or integrate) over all of the region except for the variable of interest. For example, in the continuous case the marginal density of Xk, for k = 1, 2, ..., n, is given by:
    f_{Xk}(xk) = ∫_{−∞}^{∞} ... ∫_{−∞}^{∞} f_{X1,X2,...,Xn}(z1, ..., xk, ..., zn) Π_{j≠k} dzj.
555/562
The Multivariate Case
Introduction
Independent Random Variables
The random variables X1, X2, ..., Xn are said to be independent if their joint distribution function can be written as the product of their marginal distribution functions:
    F_{X1,...,Xn}(x1, ..., xn) = F_{X1}(x1) · F_{X2}(x2) · ... · F_{Xn}(xn).
Summarizing data
Exercises
Exercise: summarizing data
An insurer assumes that the time between claims isexponential distributed. A reinsurer pays out when the insurerhas two or more claims within two years. The distribution ofinterest is Gamma(2,3).