Lecture 1 Probability & Statistics: A Brief Overview

Shiwei Lan
School of Mathematical and Statistical Sciences, Arizona State University

STP427 Mathematical Statistics, Fall 2019
Table of Contents

1. Basic Concepts
2. Conditional Probability
3. Random Variables
4. Mathematical Expectation
5. Multivariate Distributions
6. Common Probability Distributions
Probability vs Statistics
Terminology
• random experiment: an experiment whose outcome cannot be predicted in advance.
• outcome c: a specific result of the experiment.
• sample space C: the collection of all possible outcomes.
• event C: a collection of some outcomes, i.e., a subset of the sample space.

Let C be a sample space and let B be the set of events. Let P be a real-valued function defined on B. Then P is a probability set function if P satisfies the following three conditions:

1. P(C) ≥ 0 for all C ∈ B.
2. P(C) = 1.
3. If {Cn} is a sequence of events in B and Cm ∩ Cn = ∅ for all m ≠ n, then P(∪_{n=1}^∞ Cn) = Σ_{n=1}^∞ P(Cn).

• Frequentist statisticians define probability via (relative) frequency: the number of outcomes in an event divided by the total number of outcomes.
• We therefore need counting rules such as the multiplication rule. Moreover, we have the following counting formulae, depending on whether the random draw is with replacement and whether the results are ordered.
Select k objects out of n    With replacement     Without replacement (k ≤ n)
ordered                      n^k                  P(n, k) = n!/(n − k)!
unordered                    C(n − 1 + k, k)      C(n, k)

Table: Counting Formulae
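These formulae can be checked numerically; a minimal sketch using Python's `math` module (the values n = 5, k = 3 are arbitrary):

```python
from math import comb, perm

n, k = 5, 3

# ordered, with replacement: n^k
print(n ** k)              # 125
# ordered, without replacement: P(n, k) = n!/(n-k)!
print(perm(n, k))          # 60
# unordered, without replacement: C(n, k)
print(comb(n, k))          # 10
# unordered, with replacement: C(n-1+k, k)
print(comb(n - 1 + k, k))  # 35
```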
Conditional Probability
Definition (Conditional Probability)
If P(C1) > 0, then the conditional probability of the event C2 given the event C1 is defined as

P(C2 | C1) = P(C1 ∩ C2) / P(C1)

This definition satisfies the requirements of a probability set function:

1. P(C2 | C1) ≥ 0.
2. P(C1 | C1) = 1.
3. P(∪_{n=2}^∞ Cn | C1) = Σ_{n=2}^∞ P(Cn | C1) for {Cn}_{n≥2} mutually exclusive.

We immediately have the following:

• multiplication rule: P(C1 ∩ C2) = P(C1) P(C2 | C1)
• law of total probability: if {Ci}_{i=1}^k form a partition of C, then

P(C) = Σ_{i=1}^k P(Ci) P(C | Ci)
Bayes’ Theorem
Theorem (Bayes’ Theorem)
If {Ci}_{i=1}^k form a partition of C and P(C) > 0, then

P(Cj | C) = P(C ∩ Cj) / P(C) = P(Cj) P(C | Cj) / Σ_{i=1}^k P(Ci) P(C | Ci)

• The P(Cj)'s are called prior probabilities.
• The P(Cj | C)'s are called posterior probabilities.
• The theorem enables us to update our prior belief P(Cj) with data P(C | Cj) to obtain new knowledge P(Cj | C), which is the foundation of Bayesian statistics.
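As an illustration, a small sketch with hypothetical numbers (an imagined diagnostic test with 1% prevalence, 95% sensitivity, and a 5% false-positive rate; none of these figures come from the slides):

```python
# Hypothetical numbers for illustration only.
prior = [0.01, 0.99]       # P(C1) = disease, P(C2) = healthy
likelihood = [0.95, 0.05]  # P(positive | Cj)

# law of total probability: P(C) = sum_i P(Ci) P(C | Ci)
p_positive = sum(p * l for p, l in zip(prior, likelihood))

# Bayes' theorem: P(Cj | C) = P(Cj) P(C | Cj) / P(C)
posterior = [p * l / p_positive for p, l in zip(prior, likelihood)]
print(posterior[0])  # ≈ 0.161: a positive test raises the 1% prior to ~16%
```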
Monty Hall Problem
The Monty Hall problem is a brain teaser, in the form of a probability puzzle, loosely based on the American television game show Let's Make a Deal and named after its original host, Monty Hall.

Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?"

Is it to your advantage to switch your choice?
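One way to convince yourself is simulation; a minimal sketch that plays the game many times under each strategy (the function name and trial count are chosen for illustration):

```python
import random

def monty_hall(switch, trials=100_000, seed=0):
    """Estimate the win probability of the stay/switch strategies."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)
        pick = rng.randrange(3)
        # the host opens a goat door that is neither the pick nor the car
        opened = next(d for d in range(3) if d != pick and d != car)
        if switch:
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == car)
    return wins / trials

print(monty_hall(switch=False))  # ≈ 1/3
print(monty_hall(switch=True))   # ≈ 2/3
```

Staying wins only when the first pick was the car (probability 1/3), so switching wins the remaining 2/3 of the time.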
Independence
Definition (Independence)
Events C1 and C2 are independent if
P(C1 ∩ C2) = P(C1)P(C2)
It immediately implies that P(C2 | C1) = P(C2) if P(C1) > 0, or P(C1 | C2) = P(C1) if P(C2) > 0. For multiple events, we have
Definition (Independence among multiple events)
Events {Ci}_{i=1}^n are pairwise independent if

P(Ci ∩ Cj) = P(Ci) P(Cj), 1 ≤ i ≠ j ≤ n

They are mutually independent if for every 2 ≤ k ≤ n and all distinct indices {dj : 1 ≤ dj ≤ n}_{j=1}^k,

P(C_{d1} ∩ ⋯ ∩ C_{dk}) = P(C_{d1}) ⋯ P(C_{dk})
• Hint: consider an urn with balls numbered 1, 2, 3, 4: how can we construct events A, B, C that are pairwise independent but not mutually independent?
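The hinted urn construction can be checked directly; a small sketch with the standard choice A = {1, 4}, B = {2, 4}, C = {3, 4} (one possible answer, not necessarily the one intended in lecture):

```python
from fractions import Fraction

# Urn with balls 1..4, each drawn with probability 1/4.
omega = {1, 2, 3, 4}
A, B, C = {1, 4}, {2, 4}, {3, 4}
P = lambda E: Fraction(len(E), len(omega))

# pairwise independent: each pairwise intersection is {4}
assert P(A & B) == P(A) * P(B) == Fraction(1, 4)
assert P(A & C) == P(A) * P(C)
assert P(B & C) == P(B) * P(C)

# but not mutually independent:
print(P(A & B & C))          # 1/4
print(P(A) * P(B) * P(C))    # 1/8
```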
Random Variable
Definition (Random Variable)
Consider the probability space (C, B, P). A random variable is a function that assigns to each element c ∈ C one and only one real number X(c) = x. The space or range of X is the set of real numbers D = {x : x = X(c), c ∈ C}. Depending on whether D is a countable set or a continuum (e.g., an interval), we call X a discrete or a continuous random variable.
Probability Distribution
Note that the random variable X induces a probability PX on D ⊂ R:

PX(A) = P[c ∈ C : X(c) ∈ A], for all events A ⊂ D
Probability Distribution
Definition (Probability Mass (Density) Function)
If D = {di} is countable, then the probability mass function (pmf) of the random variable X is

pX(di) = P[c ∈ C : X(c) = di]

If there exists a nonnegative function fX(x) such that for every interval (a, b) ⊂ D,

PX[(a, b)] = P[c ∈ C : a < X(c) < b] = ∫_a^b fX(x) dx

then we call fX the probability density function (pdf) of X.
Probability Distribution
Definition (Cumulative Distribution Function (CDF))
Let X be a random variable. Then its cumulative distribution function (cdf) is defined by

FX(x) = PX((−∞, x]) = P[c ∈ C : X(c) ≤ x]
Probability Distribution
• The CDF is a nondecreasing, right-continuous function bounded between 0 and 1.
• If X is discrete, FX(x) = Σ_{x' ≤ x} pX(x') and pX(x) = FX(x) − FX(x−).
• If X is continuous, FX(x) = ∫_{−∞}^x fX(x') dx' and fX(x) = (d/dx) FX(x) wherever fX is continuous. fX(x) = ?
• If X is continuous, P(a < X ≤ b) = P(a ≤ X ≤ b) = P(a ≤ X < b) = P(a < X < b). Is it still true if X is discrete?
Transformation
Let Y = g(X) with g one-to-one.

• If X is discrete, then

pY(y) = P[Y = y] = P[g(X) = y] = P[X = g^{−1}(y)] = pX(g^{−1}(y))

• If X is continuous, we further assume g is differentiable; then we have

fY(y) = fX(g^{−1}(y)) |dx/dy|, for y ∈ SY

where the support of Y is the set SY = {y = g(x) : x ∈ SX}.
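As a sanity check of the change-of-variable formula, a simulation sketch with g(x) = e^x and X standard normal (this example is ours, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(200_000)
y = np.exp(x)  # Y = g(X) = e^X, so g^{-1}(y) = log(y) and |dx/dy| = 1/y

def f_Y(t):
    # fX(g^{-1}(t)) * |dx/dy|, with fX the standard normal density
    return np.exp(-np.log(t) ** 2 / 2) / (np.sqrt(2 * np.pi) * t)

# check P(1 < Y < 2): midpoint-rule integral of f_Y vs Monte Carlo frequency
dt = 0.001
grid = np.arange(1.0, 2.0, dt) + dt / 2
integral = f_Y(grid).sum() * dt
mc = np.mean((y > 1) & (y < 2))
print(integral, mc)  # both ≈ 0.256
```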
Mathematical Expectation
Definition (Expectation)
If X is a continuous random variable with pdf f(x) and ∫_{−∞}^∞ |x| f(x) dx < ∞, then the expectation of X is

E(X) = ∫_{−∞}^∞ x f(x) dx

If X is a discrete random variable with pmf p(x) and Σ_x |x| p(x) < ∞, then the expectation of X is

E(X) = Σ_x x p(x)

In general, the expectation of Y = g(X) can be calculated by replacing the integrand (summand) with g(x) fX(x) (respectively g(x) pX(x)), as long as it is absolutely integrable (summable).
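A minimal numerical illustration of the discrete case with a fair die (our example, not the slides'):

```python
# Fair six-sided die: p(x) = 1/6 for x = 1, ..., 6.
xs = range(1, 7)
p = {x: 1 / 6 for x in xs}

EX = sum(x * p[x] for x in xs)       # E(X) = sum_x x p(x)
Eg = sum(x ** 2 * p[x] for x in xs)  # E(X^2): substitute g(x) p(x)
print(EX, Eg)  # 3.5 and 91/6 ≈ 15.17
```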
Variance and Moments of higher order
Definition (Variance)
If X is a random variable with finite mean µ = E[X] and such that E[(X − µ)²] is finite, then the variance of X is defined to be E[(X − µ)²], usually denoted by σ² or Var(X). Note

Var(X) = E[X²] − (E[X])²

It is conventional to call σ (the square root of the variance) the standard deviation of X. In general, the k-th moment of X is defined as µk := E[X^k], if it exists.

Definition (Moment Generating Function (mgf))

Let X be a random variable such that for some h > 0, the expectation of e^{tX} exists for −h < t < h. The moment generating function (mgf) of X is defined to be MX(t) = E[e^{tX}]. We have

µk = (d^k/dt^k) MX(t) |_{t=0}
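The moment formula can be checked numerically with finite differences of a known mgf; a sketch using an Exponential distribution with rate λ = 2 (our choice), for which M(t) = λ/(λ − t) for t < λ:

```python
# mgf of an Exponential(rate λ = 2) random variable: M(t) = λ/(λ − t), t < λ.
lam = 2.0
M = lambda t: lam / (lam - t)

h = 1e-4
mu1 = (M(h) - M(-h)) / (2 * h)            # central difference for M'(0)
mu2 = (M(h) - 2 * M(0) + M(-h)) / h ** 2  # second difference for M''(0)

print(mu1)  # ≈ 1/λ = 0.5, the mean
print(mu2)  # ≈ 2/λ² = 0.5, the second moment
```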
Random Vector
• Random vector: a function (X1, X2) : C → R², with space D = {(x1, x2) : x1 = X1(c), x2 = X2(c), c ∈ C}.
• Joint cdf: FX1,X2(x1, x2) = P[{X1 ≤ x1} ∩ {X2 ≤ x2}] = P[X1 ≤ x1, X2 ≤ x2].
• The expectation of Y = g(X1, X2) for g : R² → R is calculated as E[Y] = ∫∫ g(x1, x2) fX1,X2(x1, x2) dx1 dx2 or E[Y] = Σ_{x1} Σ_{x2} g(x1, x2) pX1,X2(x1, x2). Expectation is a linear operator.
• Mgf of X = (X1, X2)′: MX(t) = E[e^{t′X}] = E[e^{t1 X1 + t2 X2}].
• Transformation Y = [g1(X), g2(X)]′ := G(X): then fY(y) = fX(G^{−1}(y)) |∂x/∂y|, where |∂x/∂y| is the absolute value of the Jacobian determinant of the inverse transformation.
Random Vector
Example
Let Y1 = (1/2)(X1 − X2), where X1 and X2 have the joint pdf

fX1,X2(x1, x2) = (1/4) exp(−(x1 + x2)/2) for 0 < x1 < ∞, 0 < x2 < ∞, and 0 elsewhere.

What is the distribution of Y1?
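Without giving the closed form away, a Monte Carlo sketch can help conjecture the answer; note the joint pdf factorizes into two Exponential densities with mean 2:

```python
import numpy as np

# The joint pdf factorizes, so X1, X2 are independent Exponential
# random variables with mean 2; simulate Y1 and inspect its moments.
rng = np.random.default_rng(0)
x1 = rng.exponential(scale=2.0, size=500_000)
x2 = rng.exponential(scale=2.0, size=500_000)
y1 = 0.5 * (x1 - x2)

print(y1.mean())  # ≈ 0: the density is symmetric about 0
print(y1.var())   # ≈ 2
```

A histogram of `y1` (not shown) suggests the shape of the density; compare it against your analytic answer.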
Conditional Distributions and Expectations
• Conditional pmf: pX2|X1(x2|x1) = pX1,X2(x1, x2) / pX1(x1), for given x1 with pX1(x1) > 0; conditional pdf: fX2|X1(x2|x1) = fX1,X2(x1, x2) / fX1(x1), for given x1 with fX1(x1) > 0.
• The conditional cdf FX2|X1(x2|x1) can be calculated using the conditional pmf or pdf.
• Conditional expectation: E[u(X2)|x1] = ∫ u(x2) fX2|X1(x2|x1) dx2; conditional variance: Var(X2|x1) = E[X2²|x1] − (E[X2|x1])².
Theorem
Let (X1,X2) be a random vector such that the variance of X2 is finite. Then
1. E[E[X2|X1]] = E[X2].
2. Var[E[X2|X1]] ≤ Var(X2).
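Both parts of the theorem can be illustrated by simulation; the sketch below uses a hypothetical hierarchical model X1 ~ N(0, 1), X2 | X1 ~ N(X1, 1) (chosen for illustration), for which E[X2 | X1] = X1:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500_000
x1 = rng.standard_normal(n)       # X1 ~ N(0, 1)
x2 = x1 + rng.standard_normal(n)  # X2 | X1 ~ N(X1, 1), so E[X2|X1] = X1

print(x2.mean())           # ≈ E[E[X2|X1]] = E[X1] = 0
print(x1.var(), x2.var())  # Var(E[X2|X1]) = 1 ≤ Var(X2) = 2
```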
Correlation Coefficient
Definition (Covariance and Correlation Coefficient)

The covariance between random variables X and Y, denoted cov(X, Y), is defined to be E[(X − µX)(Y − µY)] = E[XY] − µX µY. If the standard deviations σ1 (of X) and σ2 (of Y) are both finite and positive, the number

ρ = E[(X − µX)(Y − µY)] / (σ1 σ2) = cov(X, Y) / (σ1 σ2)

is called the correlation coefficient of X and Y.
Independent Random Variables
Definition (Independence)
Let the random variables X1 and X2 have the joint pdf f(x1, x2) (joint pmf p(x1, x2)) and the marginal pdfs f1(x1), f2(x2) (marginal pmfs p1(x1), p2(x2)), respectively. X1 and X2 are independent if and only if f(x1, x2) ≡ f1(x1) f2(x2) (p(x1, x2) ≡ p1(x1) p2(x2)). Otherwise they are said to be dependent.
• Does independence imply being uncorrelated?
• Does being uncorrelated imply independence?
• Can you find a counterexample?
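One classic counterexample (our example, not the slides'): X standard normal and Y = X², which are clearly dependent yet have zero covariance, since cov(X, Y) = E[X³] − E[X] E[X²] = 0 by symmetry. A quick simulation check:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(500_000)
y = x ** 2  # Y is a deterministic function of X, hence dependent

# sample covariance should be near 0: uncorrelated but not independent
print(np.cov(x, y)[0, 1])  # ≈ 0
```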
Independent Random Variables
Criteria for judging independence between X1 and X2:

• The support is a product set S1 × S2 and the joint pdf factorizes: f(x1, x2) ≡ g(x1) h(x2).
• The joint cdf factorizes: F(x1, x2) = F1(x1) F2(x2).
• The joint probability factorizes: P(a < X1 ≤ b, c < X2 ≤ d) = P(a < X1 ≤ b) P(c < X2 ≤ d).