Instructor: Shengyu Zhang
Content
Basic Concepts
Probability Mass Function
Functions of Random Variables
Expectation, Mean, and Variance
Joint PMFs of Multiple Random Variables
Conditioning
Independence
Basic Concepts
In some experiments, the outcomes are numerical, e.g., stock prices.
In other experiments, the outcomes are not numerical, but they may be associated with numerical values of interest.
Example. When selecting students from a given population, we may wish to consider their grade point average (GPA). The students themselves are not numerical, but their GPA scores are.
Basic Concepts
When dealing with these numerical values, it is useful to assign probabilities to them.
This is done through the notion of a random variable.
[Figure: a random variable $X$ maps each outcome in the sample space $\Omega$ to a real number $x$ on the real number line.]
Main Concepts Related to Random Variables
Starting with a probabilistic model of an experiment:
A random variable is a real-valued function of the outcome of the experiment.
A function of a random variable defines another random variable.
Examples
5 tosses of a coin.
This is a random variable: the number of heads.
This is not: the sequence of heads and tails (it is not a numerical value).
Main Concepts Related to Random Variables
We can associate with each random variable certain "averages" of interest, such as the mean and the variance.
A random variable can be conditioned on an event or on another random variable.
There is a notion of independence of a random variable from an event or from another random variable.
We'll talk about all of these in this lecture.
Discrete Random Variable
A random variable is called discrete if its
range is either finite or countably infinite.
Example. Two rolls of a die.
The sum of the two rolls.
The number of sixes in the two rolls.
The second roll raised to the fifth power.
Continuous random variable
Example. Pick a real number $a$ and associate to it the numerical value $a^2$. The random variable $a^2$ is continuous, not discrete.
We'll talk about continuous random variables later.
The following random variable is discrete:
$$\mathrm{sign}(a) = \begin{cases} 1 & \text{if } a > 0 \\ 0 & \text{if } a = 0 \\ -1 & \text{if } a < 0 \end{cases}$$
Discrete Random Variables: Concepts
A discrete random variable is a real-valued
function of the outcome of a discrete experiment.
A discrete random variable has an associated
probability mass function (PMF), which gives the
probability of each numerical value that the
random variable can take.
A function of a discrete random variable defines
another discrete random variable, whose PMF
can be obtained from the PMF of the original
random variable.
Content
Basic Concepts
Probability Mass Function
Functions of Random Variables
Expectation, Mean, and Variance
Joint PMFs of Multiple Random Variables
Conditioning
Independence
Probability Mass Function
For a discrete random variable $X$, the probability mass function (PMF) of $X$ captures the probabilities of the values that it can take.
If $x$ is any possible value of $X$, the probability mass of $x$, denoted $p_X(x)$, is the probability of the event $\{X = x\}$ consisting of all outcomes that give rise to a value of $X$ equal to $x$:
$$p_X(x) = P(\{X = x\})$$
Example
Two independent tosses of a fair coin.
$X$: the number of heads obtained.
The PMF of $X$ is
$$p_X(x) = \begin{cases} 1/4 & \text{if } x = 0 \text{ or } x = 2 \\ 1/2 & \text{if } x = 1 \\ 0 & \text{otherwise} \end{cases}$$
Probability Mass Function
Upper-case characters denote random variables: $X, Y, Z, \ldots$
Lower-case characters denote real numbers: $x, y, z, \ldots$ (the numerical values of a random variable).
We'll write $P(X = x)$ in place of the notation $P(\{X = x\})$.
Similarly, we'll write $P(X \in S)$ for the probability that $X$ takes a value within a set $S$.
Probability Mass Function
The following follows from the additivity and normalization axioms:
$$\sum_{x:\, x \text{ is a possible value of } X} p_X(x) = 1$$
The events $\{X = x\}$ are disjoint, and they form a partition of the sample space.
For any set $S$ of real numbers,
$$P(X \in S) = \sum_{x \in S} p_X(x)$$
Probability Mass Function
For each possible value $x$ of $X$:
Collect all the possible outcomes that give rise to the event $\{X = x\}$.
Add their probabilities to obtain $p_X(x)$.
[Figure: the event $\{X = x\}$ as a subset of the sample space $\Omega$, with its total probability giving $p_X(x)$.]
Important specific distributions
Binomial random variable
Geometric random variable
Poisson random variable
Bernoulli Random Variable
The Bernoulli random variable takes the two values 1 and 0:
$$X \in \{0, 1\}$$
Its PMF is
$$p_X(x) = \begin{cases} p & \text{if } x = 1 \\ 1 - p & \text{if } x = 0 \end{cases}$$
Example of Bernoulli Random Variable
The state of a telephone at a given time that
can be either free or busy.
A person who can be either healthy or sick
with a certain disease.
The preference of a person who can be either
for or against a certain political candidate.
The Binomial Random Variable
A biased coin is tossed $n$ times.
Each toss is independent of prior tosses.
Head with probability $p$.
Tail with probability $1 - p$.
The number $X$ of heads is a binomial random variable.
The Binomial Random Variable
We refer to $X$ as a binomial random variable with parameters $n$ and $p$.
For $k = 0, 1, \ldots, n$,
$$p_X(k) = P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$$
The Binomial Random Variable
Normalization:
$$\sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k} = 1$$
The Geometric Random Variable
Independently and repeatedly toss a biased coin with probability of a head $p$, where $0 < p < 1$.
The geometric random variable is the number $X$ of tosses needed for a head to come up for the first time.
The Geometric Random Variable
The PMF of a geometric random variable:
$$p_X(k) = (1-p)^{k-1} p, \quad k = 1, 2, \ldots$$
($k - 1$ tails followed by a head.)
The normalization condition is satisfied:
$$\sum_{k=1}^{\infty} p_X(k) = \sum_{k=1}^{\infty} (1-p)^{k-1} p = p \sum_{k=0}^{\infty} (1-p)^k = p \cdot \frac{1}{1 - (1-p)} = 1$$
The Geometric Random Variable
The PMF $p_X(k) = (1-p)^{k-1} p$ decreases as a geometric progression with parameter $1 - p$.
The Poisson Random Variable
A Poisson random variable takes nonnegative integer values.
The PMF:
$$p_X(k) = e^{-\lambda} \frac{\lambda^k}{k!}, \quad k = 0, 1, 2, \ldots$$
Normalization condition:
$$\sum_{k=0}^{\infty} e^{-\lambda} \frac{\lambda^k}{k!} = e^{-\lambda} \left(1 + \lambda + \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} + \cdots \right) = e^{-\lambda} e^{\lambda} = 1$$
A Poisson random variable can be viewed as a binomial random variable with very small $p$ and very large $n$.
More precisely, the Poisson PMF with parameter $\lambda$ is a good approximation for a binomial PMF with parameters $n$ and $p$, where $\lambda = np$, $n$ is large, and $p$ is small.
See the wiki page for a proof.
Examples
Because of the above connection, Poisson random variables are used in many scenarios.
$X$ is the number of typos in a book of $n$ words. The probability that any one word is misspelled is very small.
$X$ is the number of cars involved in accidents in a city on a given day. The probability that any one car is involved in an accident is very small.
The Poisson Random Variable
For a Poisson random variable, $p_X(k) = e^{-\lambda} \dfrac{\lambda^k}{k!}$:
If $\lambda \le 1$: monotonically decreasing in $k$.
If $\lambda > 1$: first increases and then decreases.
Content
Basic Concepts
Probability Mass Function
Functions of Random Variables
Expectation, Mean, and Variance
Joint PMFs of Multiple Random Variables
Conditioning
Independence
Functions of Random Variables
Consider a probability model of today's weather:
$X$ = the temperature in degrees Celsius
$Y$ = the temperature in degrees Fahrenheit
Their relation is given by
$$Y = 1.8X + 32$$
In this example, $Y$ is a linear function of $X$, of the form
$$Y = g(X) = aX + b$$
Functions of Random Variables
We may also consider nonlinear functions, such as
$$Y = \log X$$
In general, if $Y = g(X)$ is a function of a random variable $X$, then $Y$ is also a random variable.
The PMF $p_Y$ of $Y = g(X)$ can be calculated from the PMF $p_X$ of $X$:
$$p_Y(y) = \sum_{\{x \,|\, g(x) = y\}} p_X(x)$$
Example
The PMF of $X$ is
$$p_X(x) = \begin{cases} 1/9 & \text{if } x \text{ is an integer in } [-4, 4] \\ 0 & \text{otherwise} \end{cases}$$
Let $Y = |X|$. Then the PMF of $Y$ is
$$p_Y(y) = \begin{cases} 2/9 & \text{if } y = 1, 2, 3, 4 \\ 1/9 & \text{if } y = 0 \\ 0 & \text{otherwise} \end{cases}$$
Example
[Figure: visualization of the relation between $p_X$ and $p_Y$ for $Y = |X|$.]
Example
Let $Z = X^2$. Then the PMF of $Z$ is
$$p_Z(z) = \begin{cases} 2/9 & \text{if } z = 1, 4, 9, 16 \\ 1/9 & \text{if } z = 0 \\ 0 & \text{otherwise} \end{cases}$$
Content
Basic Concepts
Probability Mass Function
Functions of Random Variables
Expectation, Mean, and Variance
Joint PMFs of Multiple Random Variables
Conditioning
Independence
Expectation
Sometimes it is desirable to summarize the values and probabilities by one number.
The expectation of $X$ is a weighted average of the possible values of $X$.
Weights: probabilities.
Formally, the expected value of a random variable $X$, with PMF $p_X(x)$, is
$$E[X] = \sum_x x\, p_X(x)$$
Names: expected value, expectation, mean.
Example
Two independent coin tosses with $P(H) = 3/4$.
$X$ = the number of heads.
$X$ is a binomial random variable with parameters $n = 2$ and $p = 3/4$.
Example
The PMF is
$$p_X(k) = \begin{cases} (1/4)^2 & \text{if } k = 0 \\ 2 \cdot (1/4) \cdot (3/4) & \text{if } k = 1 \\ (3/4)^2 & \text{if } k = 2 \end{cases}$$
The mean is
$$E[X] = 0 \cdot \left(\frac{1}{4}\right)^2 + 1 \cdot \left(2 \cdot \frac{1}{4} \cdot \frac{3}{4}\right) + 2 \cdot \left(\frac{3}{4}\right)^2 = \frac{3}{2}$$
Expectation
Consider the mean as the center of gravity of the PMF:
$$\sum_x (x - c)\, p_X(x) = 0 \;\Rightarrow\; c = \sum_x x\, p_X(x)$$
Center of gravity: $c = \text{mean} = E[X]$.
Variance
Besides the mean, there are several other important quantities.
The $n$th moment is $E[X^n]$. So the first moment is just the mean.
The variance of $X$, denoted by $\mathrm{var}(X)$, is
$$\mathrm{var}(X) = E\left[(X - E[X])^2\right]$$
i.e., the second moment of $X - E[X]$.
The variance is always nonnegative: $\mathrm{var}(X) \ge 0$.
Standard deviation
Variance is closely related to another measure, the standard deviation of $X$, denoted by $\sigma_X$:
$$\sigma_X = \sqrt{\mathrm{var}(X)}$$
Example
Suppose that the PMF of $X$ is
$$p_X(x) = \begin{cases} 1/9 & \text{if } x \text{ is an integer in } [-4, 4] \\ 0 & \text{otherwise} \end{cases}$$
The expectation:
$$E[X] = \sum_x x\, p_X(x) = \frac{1}{9} \sum_{x=-4}^{4} x = 0$$
This can also be seen from symmetry.
Example
Let $Z = (X - E[X])^2 = X^2$. The PMF of $Z$:
$$p_Z(z) = \begin{cases} 2/9 & \text{if } z = 1, 4, 9, 16 \\ 1/9 & \text{if } z = 0 \\ 0 & \text{otherwise} \end{cases}$$
The variance of $X$ is then
$$\mathrm{var}(X) = E[Z] = \sum_z z\, p_Z(z) = 0 \cdot \frac{1}{9} + 1 \cdot \frac{2}{9} + 4 \cdot \frac{2}{9} + 9 \cdot \frac{2}{9} + 16 \cdot \frac{2}{9} = \frac{60}{9}$$
Expectation for $g(X)$
There is a simpler way of computing $\mathrm{var}(X)$, via the expected value rule.
Let $X$ be a random variable with PMF $p_X(x)$, and let $g(X)$ be a real-valued function of $X$.
The expected value of the random variable $Y = g(X)$ is
$$E[g(X)] = \sum_x g(x)\, p_X(x)$$
Expectation for $g(X)$
Using the formula $p_Y(y) = \sum_{\{x | g(x) = y\}} p_X(x)$:
$$E[g(X)] = E[Y] = \sum_y y\, p_Y(y) = \sum_y y \sum_{\{x | g(x) = y\}} p_X(x) = \sum_y \sum_{\{x | g(x) = y\}} y\, p_X(x) = \sum_y \sum_{\{x | g(x) = y\}} g(x)\, p_X(x) = \sum_x g(x)\, p_X(x)$$
Variance example
The PMF of $X$:
$$p_X(x) = \begin{cases} 1/9 & \text{if } x \text{ is an integer in } [-4, 4] \\ 0 & \text{otherwise} \end{cases}$$
The variance:
$$\mathrm{var}(X) = E\left[(X - E[X])^2\right] = \sum_x (x - E[X])^2 p_X(x) = \frac{1}{9} \sum_{x=-4}^{4} x^2 = \frac{16 + 9 + 4 + 1 + 0 + 1 + 4 + 9 + 16}{9} = \frac{60}{9}$$
Mean of $aX + b$
Let $Y$ be a linear function of $X$: $Y = aX + b$.
The mean of $Y$:
$$E[Y] = \sum_x (ax + b)\, p_X(x) = a \sum_x x\, p_X(x) + b \sum_x p_X(x) = a E[X] + b$$
The expectation scales linearly.
Variance of $aX + b$
Let $Y$ be a linear function of $X$: $Y = aX + b$.
The variance of $Y$:
$$\mathrm{var}(Y) = \sum_x (ax + b - E[aX + b])^2 p_X(x) = \sum_x (ax + b - aE[X] - b)^2 p_X(x) = a^2 \sum_x (x - E[X])^2 p_X(x) = a^2\, \mathrm{var}(X)$$
The variance scales quadratically.
Variance as moments
Fact. $\mathrm{var}(X) = E[X^2] - (E[X])^2$.
$$\mathrm{var}(X) = E\left[(X - E[X])^2\right] = E\left[X^2 - 2X E[X] + (E[X])^2\right] = E[X^2] - 2E[X]\, E[X] + (E[X])^2 = E[X^2] - (E[X])^2$$
Example: Average time
The distance between class and home is 2 miles.
$P(\text{weather is good}) = 0.6$.
Speed:
$V = 5$ miles/hour if the weather is good.
$V = 30$ miles/hour if the weather is bad.
Question: What is the mean of the time $T$ to get to class?
Example: Average time
The PMF of $T$:
$$p_T(t) = \begin{cases} 0.6 & \text{if } t = \frac{2}{5} \text{ hours} \\ 0.4 & \text{if } t = \frac{2}{30} \text{ hours} \end{cases}$$
The mean of $T$:
$$E[T] = 0.6 \cdot \frac{2}{5} + 0.4 \cdot \frac{2}{30} = \frac{4}{15}$$
Example: Average time
Wrong calculation via the speed $V$:
The mean of the speed is $E[V] = 0.6 \cdot 5 + 0.4 \cdot 30 = 15$, which would suggest a travel time of
$$\frac{2}{E[V]} = \frac{2}{15}$$
To summarize, in this example we have
$$T = \frac{2}{V} \quad \text{and} \quad E[T] = E\left[\frac{2}{V}\right] \ne \frac{2}{E[V]}$$
Example: Bernoulli
Consider the Bernoulli random variable $X$ with PMF
$$p_X(x) = \begin{cases} p & \text{if } x = 1 \\ 1 - p & \text{if } x = 0 \end{cases}$$
Its mean, second moment, and variance:
$$E[X] = 1 \cdot p + 0 \cdot (1 - p) = p, \qquad E[X^2] = 1^2 \cdot p + 0 \cdot (1 - p) = p$$
$$\mathrm{var}(X) = E[X^2] - (E[X])^2 = p - p^2 = p(1 - p)$$
Example: Uniform
What are the mean and variance of the roll of a fair six-sided die?
$$p_X(k) = \begin{cases} 1/6 & \text{if } k = 1, 2, 3, 4, 5, 6 \\ 0 & \text{otherwise} \end{cases}$$
The mean is $E[X] = 3.5$, and the variance is
$$\mathrm{var}(X) = E[X^2] - (E[X])^2 = \frac{1}{6}\left(1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2\right) - 3.5^2 = \frac{35}{12}$$
Example: Uniform integers
More generally, consider a discrete uniformly distributed random variable:
Range: contiguous integer values $a, a+1, \ldots, b$.
Probability: equal probability for each value.
The PMF is
$$p_X(k) = \begin{cases} \dfrac{1}{b - a + 1} & \text{if } k = a, a+1, \ldots, b \\ 0 & \text{otherwise} \end{cases}$$
Example: Uniform integers
The mean:
$$E[X] = \frac{a + b}{2}$$
For the variance, first consider the special case $a = 1$ and $b = n$.
The second moment:
$$E[X^2] = \frac{1}{n} \sum_{k=1}^{n} k^2 = \frac{1}{6}(n+1)(2n+1)$$
Example: Uniform integers
The variance for the special case:
$$\mathrm{var}(X) = E[X^2] - (E[X])^2 = \frac{1}{6}(n+1)(2n+1) - \frac{1}{4}(n+1)^2 = \frac{n^2 - 1}{12}$$
Example: Uniform integers
For the case of general integers $a$ and $b$:
$X$: discrete uniform over $[a, b]$.
$Y$: discrete uniform over $[1, b - a + 1]$.
Relation between $X$ and $Y$: $Y = X - a + 1$.
Thus
$$\mathrm{var}(X) = \mathrm{var}(Y) = \frac{(b - a + 1)^2 - 1}{12}$$
Example: Poisson
Recall the Poisson PMF:
$$p_X(k) = e^{-\lambda} \frac{\lambda^k}{k!}, \quad k = 0, 1, 2, \ldots$$
Mean:
$$E[X] = \sum_{k=0}^{\infty} k\, e^{-\lambda} \frac{\lambda^k}{k!} = \sum_{k=1}^{\infty} k\, e^{-\lambda} \frac{\lambda^k}{k!} = \lambda \sum_{k=1}^{\infty} e^{-\lambda} \frac{\lambda^{k-1}}{(k-1)!} = \lambda \sum_{m=0}^{\infty} e^{-\lambda} \frac{\lambda^m}{m!} = \lambda$$
Variance: $\mathrm{var}(X) = \lambda$. Verification is left as an exercise.
The Quiz Problem
A person is given two questions and must decide which question to answer first.
$P(\text{question 1 correct}) = 0.8$; prize = $100.
$P(\text{question 2 correct}) = 0.5$; prize = $200.
If the first question is answered incorrectly, no second question is given.
Question: How should the first question be chosen to maximize the expected prize?
[Figure: tree illustration of the two possible orderings.]
The Quiz Problem
Answer question 1 first. Then the PMF of the prize $X$ is
$$p_X(0) = 0.2, \quad p_X(100) = 0.8 \cdot 0.5, \quad p_X(300) = 0.8 \cdot 0.5$$
We have
$$E[X] = 0.8 \cdot 0.5 \cdot 100 + 0.8 \cdot 0.5 \cdot 300 = 160$$
The Quiz Problem
Answer question 2 first. Then the PMF of $X$ is
$$p_X(0) = 0.5, \quad p_X(200) = 0.5 \cdot 0.2, \quad p_X(300) = 0.5 \cdot 0.8$$
We have
$$E[X] = 0.5 \cdot 0.2 \cdot 200 + 0.5 \cdot 0.8 \cdot 300 = 140$$
So it is better to answer question 1 first.
The Quiz Problem
Let us now generalize the analysis.
$p_1$: $P(\text{correctly answering question 1})$
$p_2$: $P(\text{correctly answering question 2})$
$v_1$: prize for question 1
$v_2$: prize for question 2
The Quiz Problem
Answer question 1 first:
$$E[X] = p_1 (1 - p_2) v_1 + p_1 p_2 (v_1 + v_2) = p_1 v_1 + p_1 p_2 v_2$$
Answer question 2 first:
$$E[X] = p_2 (1 - p_1) v_2 + p_2 p_1 (v_2 + v_1) = p_2 v_2 + p_2 p_1 v_1$$
The Quiz Problem
It is optimal to answer question 1 first if and only if
$$p_1 v_1 + p_1 p_2 v_2 \ge p_2 v_2 + p_2 p_1 v_1$$
or, equivalently,
$$\frac{p_1 v_1}{1 - p_1} \ge \frac{p_2 v_2}{1 - p_2}$$
Rule: Order the questions in decreasing value of the expression $pv/(1 - p)$.
Content
Basic Concepts
Probability Mass Function
Functions of Random Variables
Expectation, Mean, and Variance
Joint PMFs of Multiple Random Variables
Conditioning
Independence
Multiple Random Variables
Probabilistic models often involve several
random variables of interest.
Example: In a medical diagnosis context, the
results of several tests may be significant.
Example: In a networking context, the
workloads of several gateways may be of
interest.
Joint PMFs of Multiple Random Variables
Consider two discrete random variables $X$ and $Y$ associated with the same experiment.
The joint PMF of $X$ and $Y$ is denoted by $p_{X,Y}$. It specifies the probabilities of the pairs of values that $X$ and $Y$ can take.
If $(x, y)$ is a pair of values that $(X, Y)$ can take, then the probability mass of $(x, y)$ is the probability of the event $\{X = x, Y = y\}$:
$$p_{X,Y}(x, y) = P(X = x, Y = y)$$
The joint PMF determines the probability of any event that can be specified in terms of the random variables $X$ and $Y$.
For example, if $A$ is the set of all pairs $(x, y)$ that have a certain property, then
$$P((X, Y) \in A) = \sum_{(x, y) \in A} p_{X,Y}(x, y)$$
Joint PMFs of Multiple Random Variables
The PMFs of $X$ and $Y$:
$$p_X(x) = \sum_y p_{X,Y}(x, y), \qquad p_Y(y) = \sum_x p_{X,Y}(x, y)$$
The formula can be verified by
$$p_X(x) = P(X = x) = \sum_y P(X = x, Y = y) = \sum_y p_{X,Y}(x, y)$$
$p_X$ and $p_Y$ are called the marginal PMFs.
Joint PMFs of Multiple Random Variables
Computing the marginal PMFs $p_X$ and $p_Y$ of $p_{X,Y}$ from a table:
The joint PMF $p_{X,Y}$ is arranged in a two-dimensional table.
The marginal PMF of $X$ or $Y$ at a given value is obtained by adding the table entries along a corresponding column or row, respectively.
Functions of Multiple Random Variables
One can generate new random variables by applying functions to several random variables.
Consider $Z = g(X, Y)$. Its PMF can be calculated from the joint PMF $p_{X,Y}$ according to
$$p_Z(z) = \sum_{\{(x, y) \,|\, g(x, y) = z\}} p_{X,Y}(x, y)$$
Functions of Multiple Random Variables
The expected value rule for multiple variables:
$$E[g(X, Y)] = \sum_{x, y} g(x, y)\, p_{X,Y}(x, y)$$
For the special case where $g$ is linear and of the form $aX + bY + c$, we have
$$E[aX + bY + c] = a E[X] + b E[Y] + c$$
This is the "linearity of expectation", and it holds regardless of any dependence between $X$ and $Y$.
More than Two Random Variables
We can also consider three or more random variables.
The joint PMF of three random variables $X$, $Y$, and $Z$:
$$p_{X,Y,Z}(x, y, z) = P(X = x, Y = y, Z = z)$$
The marginal PMFs are
$$p_{X,Y}(x, y) = \sum_z p_{X,Y,Z}(x, y, z)$$
and
$$p_X(x) = \sum_y \sum_z p_{X,Y,Z}(x, y, z)$$
More than Two Random Variables
The expected value rule for functions:
$$E[g(X, Y, Z)] = \sum_{x, y, z} g(x, y, z)\, p_{X,Y,Z}(x, y, z)$$
If $g$ is linear and of the form $g(X, Y, Z) = aX + bY + cZ + d$, then
$$E[aX + bY + cZ + d] = a E[X] + b E[Y] + c E[Z] + d$$
More than Two Random Variables
This generalizes to more than three random variables.
For any random variables $X_1, X_2, \ldots, X_n$ and any scalars $a_1, a_2, \ldots, a_n$, we have
$$E[a_1 X_1 + a_2 X_2 + \cdots + a_n X_n] = a_1 E[X_1] + a_2 E[X_2] + \cdots + a_n E[X_n]$$
Example: Mean of the Binomial
There are 300 students in a probability class.
Each student has probability 1/3 of getting an A, independently of any other student.
$X$: the number of students that get an A.
Question: What is the mean of $X$?
Example: Mean of the Binomial
Let $X_i$ be the random variable for the $i$th student:
$$X_i = \begin{cases} 1 & \text{if the } i\text{th student gets an A} \\ 0 & \text{otherwise} \end{cases}$$
Each $X_i$ is a Bernoulli random variable:
$$E[X_i] = p = 1/3, \qquad \mathrm{var}(X_i) = p(1 - p) = (1/3)(2/3) = 2/9$$
Example: Mean of the Binomial
The random variable $X$ can be expressed as their sum:
$$X = X_1 + X_2 + \cdots + X_{300}$$
Using the linearity of expectation for $X$ as a function of the $X_i$:
$$E[X] = \sum_{i=1}^{300} E[X_i] = \sum_{i=1}^{300} \frac{1}{3} = 300 \cdot \frac{1}{3} = 100$$
Example: Mean of the Binomial
If we repeat this calculation for a general number of students $n$ and probability of an A equal to $p$, we obtain
$$E[X] = \sum_{i=1}^{n} E[X_i] = np$$
Example: The Hat Problem
Suppose that $n$ people throw their hats in a box, and then each picks up one hat at random.
$X$: the number of people that get back their own hat.
Question: What is the expected value of $X$?
Example: The Hat Problem
For the $i$th person, we introduce a random variable $X_i$:
$$X_i = \begin{cases} 1 & \text{if the } i\text{th person gets back his own hat} \\ 0 & \text{otherwise} \end{cases}$$
Since $P(X_i = 1) = \frac{1}{n}$ and $P(X_i = 0) = 1 - \frac{1}{n}$,
$$E[X_i] = 1 \cdot \frac{1}{n} + 0 \cdot \left(1 - \frac{1}{n}\right) = \frac{1}{n}$$
Example: The Hat Problem
We know
$$X = X_1 + X_2 + \cdots + X_n$$
Thus
$$E[X] = E[X_1] + E[X_2] + \cdots + E[X_n] = n \cdot \frac{1}{n} = 1$$
Summary of Facts About Joint PMFs
The joint PMF of $X$ and $Y$ is defined by
$$p_{X,Y}(x, y) = P(X = x, Y = y)$$
The marginal PMFs of $X$ and $Y$ can be obtained from the joint PMF, using the formulas
$$p_X(x) = \sum_y p_{X,Y}(x, y), \qquad p_Y(y) = \sum_x p_{X,Y}(x, y)$$
Summary of Facts About Joint PMFs
A function $g(X, Y)$ of $X$ and $Y$ defines another random variable, and
$$E[g(X, Y)] = \sum_{x, y} g(x, y)\, p_{X,Y}(x, y)$$
If $g$ is linear, of the form $aX + bY + c$, then
$$E[aX + bY + c] = a E[X] + b E[Y] + c$$
These naturally extend to more than two random variables.
Content
Basic Concepts
Probability Mass Function
Functions of Random Variables
Expectation, Mean, and Variance
Joint PMFs of Multiple Random Variables
Conditioning
Independence
Conditioning
Suppose that, in a probabilistic model, a certain event $A$ has occurred.
Conditional probability captures this knowledge.
Conditional probabilities are like ordinary probabilities (they satisfy the three axioms), except that they refer to a new universe: event $A$ is known to have occurred.
Conditioning a Random Variable on an Event
The conditional PMF of a random variable $X$, conditioned on a particular event $A$ with $P(A) > 0$, is defined by
$$p_{X|A}(x) = P(X = x \mid A) = \frac{P(\{X = x\} \cap A)}{P(A)}$$
Conditioning a Random Variable on an Event
Consider the events $\{X = x\} \cap A$:
They are disjoint for different values of $x$.
Their union is $A$.
Thus $P(A) = \sum_x P(\{X = x\} \cap A)$.
Combining this with $p_{X|A}(x) = P(\{X = x\} \cap A)/P(A)$ (last slide), we see that
$$\sum_x p_{X|A}(x) = 1$$
So $p_{X|A}$ is a legitimate PMF.
Conditioning a Random Variable on an Event
The conditional PMF is calculated similarly to its unconditional counterpart. To obtain $p_{X|A}(x)$:
Add the probabilities of the outcomes that give rise to $X = x$ and belong to the conditioning event $A$.
Normalize by dividing by $P(A)$.
Conditioning a Random Variable on an Event
[Figure: visualization and calculation of the conditional PMF $p_{X|A}(x)$.]
Example: die roll
$X$: the roll of a fair 6-sided die.
$A$: the roll is an even number.
$$p_{X|A}(x) = P(X = x \mid A) = \frac{P(X = x \text{ and } A)}{P(A)} = \begin{cases} 1/3 & \text{if } x = 2, 4, 6 \\ 0 & \text{otherwise} \end{cases}$$
Conditioning one random variable on another
We have talked about conditioning a random variable $X$ on an event $A$. Now let's consider conditioning a random variable $X$ on another random variable $Y$.
Let $X$ and $Y$ be two random variables associated with the same experiment.
The experimental value $Y = y$ (with $p_Y(y) > 0$) provides partial knowledge about the value of $X$.
Conditioning one random variable on another
This knowledge is captured by the conditional PMF $p_{X|Y}$ of $X$ given $Y$, which is defined as $p_{X|A}$ for $A = \{Y = y\}$:
$$p_{X|Y}(x \mid y) = P(X = x \mid Y = y)$$
Using the definition of conditional probabilities,
$$p_{X|Y}(x \mid y) = \frac{P(X = x, Y = y)}{P(Y = y)} = \frac{p_{X,Y}(x, y)}{p_Y(y)}$$
Conditioning one random variable on another
Fix some $y$ with $p_Y(y) > 0$ and consider $p_{X|Y}(x|y)$ as a function of $x$.
This function is a valid PMF for $X$:
It assigns nonnegative values to each possible $x$.
These values add to 1: $\sum_x p_{X|Y}(x|y) = 1$.
Moreover, it has the same shape as $p_{X,Y}(x, y)$, viewed as a function of $x$.
Conditioning one random variable on another
[Figure: visualization of the conditional PMF $p_{X|Y}(x|y)$.]
Conditioning one random variable on another
It is often convenient to calculate the joint PMF by a sequential approach, using the formula
$$p_{X,Y}(x, y) = p_Y(y)\, p_{X|Y}(x|y)$$
or its counterpart
$$p_{X,Y}(x, y) = p_X(x)\, p_{Y|X}(y|x)$$
This method is entirely similar to the use of the multiplication rule from previous lectures.
Example: Question answering
A professor independently answers each of her students' questions incorrectly with probability 1/4.
In each lecture the professor is asked 0, 1, or 2 questions, each with equal probability 1/3.
$Y$: the number of questions the professor is asked.
$X$: the number of questions she answers wrong in a given lecture.
Example: Question answering
To construct the joint PMF $p_{X,Y}(x, y)$, calculate all the probabilities $P(X = x, Y = y)$, using a sequential description of the experiment and the multiplication rule:
$$p_{X,Y}(x, y) = p_Y(y)\, p_{X|Y}(x|y)$$
Example: Question answering
For example,
$$p_{X,Y}(1, 1) = p_Y(1)\, p_{X|Y}(1|1) = \frac{1}{3} \cdot \frac{1}{4} = \frac{1}{12}$$
Example: Question answering
We can compute other useful information from the two-dimensional table. For example,
$$P(\text{at least one wrong answer}) = p_{X,Y}(1, 1) + p_{X,Y}(1, 2) + p_{X,Y}(2, 2) = \frac{4}{48} + \frac{6}{48} + \frac{1}{48} = \frac{11}{48}$$
Conditioning one random variable on another
The conditional PMF can also be used to calculate the marginal PMFs:
$$p_X(x) = \sum_y p_{X,Y}(x, y) = \sum_y p_Y(y)\, p_{X|Y}(x|y)$$
This formula provides a divide-and-conquer method for calculating marginal PMFs.
Summary of Facts About Conditional PMFs
Conditional PMFs are similar to ordinary PMFs, but refer to a universe where the conditioning event is known to have occurred.
The conditional PMF of $X$ given an event $A$ with $P(A) > 0$ is defined by
$$p_{X|A}(x) = P(X = x \mid A)$$
and satisfies
$$\sum_x p_{X|A}(x) = 1$$
Summary of Facts About Conditional PMFs
The conditional PMF of $X$ given $Y$ can be used to calculate the marginal PMFs with the formula
$$p_X(x) = \sum_y p_Y(y)\, p_{X|Y}(x|y)$$
This is analogous to the divide-and-conquer approach for calculating probabilities using the total probability theorem.
Conditional Expectations
The conditional expectation of $X$ given an event $A$ with $P(A) > 0$ is defined by
$$E[X \mid A] = \sum_x x\, p_{X|A}(x)$$
For a function $g(X)$, it is given by
$$E[g(X) \mid A] = \sum_x g(x)\, p_{X|A}(x)$$
Conditional Expectations
The conditional expectation of $X$ given a value $y$ of $Y$ is defined by
$$E[X \mid Y = y] = \sum_x x\, p_{X|Y}(x|y)$$
The total expectation theorem:
$$E[X] = \sum_y p_Y(y)\, E[X \mid Y = y]$$
Conditional Expectations
Let $A_1, \ldots, A_n$ be disjoint events that form a partition of the sample space, and assume that $P(A_i) > 0$ for all $i$. Then
$$E[X] = \sum_{i=1}^{n} P(A_i)\, E[X \mid A_i]$$
Indeed,
$$E[X] = \sum_x x\, p_X(x) = \sum_x x \sum_{i=1}^{n} P(A_i)\, p_{X|A_i}(x) = \sum_{i=1}^{n} P(A_i) \sum_x x\, p_{X|A_i}(x) = \sum_{i=1}^{n} P(A_i)\, E[X \mid A_i]$$
Conditional Expectation
Messages transmitted by a computer in Boston through a data network are destined
for New York with probability 0.5,
for Chicago with probability 0.3,
for San Francisco with probability 0.2.
The transit time $T$ of a message is random, with
$E[T] = 0.05$ for New York,
$E[T] = 0.1$ for Chicago,
$E[T] = 0.3$ for San Francisco.
Conditional Expectation
By the total expectation theorem,
$$E[T] = 0.5 \cdot 0.05 + 0.3 \cdot 0.1 + 0.2 \cdot 0.3 = 0.115$$
Mean and Variance of the Geometric Random Variable
You run a software program over and over; each attempt works correctly with probability $p$, independently of previous attempts.
$X$: the number of tries until the program works correctly.
Question: What are the mean and variance of $X$?
Mean and Variance of the Geometric Random Variable
$X$ is a geometric random variable with PMF
$$p_X(k) = (1-p)^{k-1} p, \quad k = 1, 2, \ldots$$
The mean and variance of $X$:
$$E[X] = \sum_{k=1}^{\infty} k (1-p)^{k-1} p, \qquad \mathrm{var}(X) = \sum_{k=1}^{\infty} (k - E[X])^2 (1-p)^{k-1} p$$
Mean and Variance of the Geometric Random Variable
Evaluating these infinite sums is somewhat tedious. As an alternative, we will apply the total expectation theorem.
Let
$$A_1 = \{X = 1\} = \{\text{first try is a success}\}, \qquad A_2 = \{X > 1\} = \{\text{first try is a failure}\}$$
Mean and Variance of the Geometric Random Variable
If the first try is successful, we have $X = 1$, so $E[X \mid X = 1] = 1$.
If the first try fails ($X > 1$), we have wasted one try, and we are back where we started. The expected number of remaining tries is $E[X]$, so we have
$$E[X \mid X > 1] = 1 + E[X]$$
Mean and Variance of the Geometric Random Variable
Thus
$$E[X] = P(X = 1)\, E[X \mid X = 1] + P(X > 1)\, E[X \mid X > 1] = p + (1 - p)(1 + E[X])$$
Solving this equation gives
$$E[X] = \frac{1}{p}$$
Mean and Variance of the Geometric Random Variable
By similar reasoning,
$$E[X^2 \mid X = 1] = 1$$
and
$$E[X^2 \mid X > 1] = E[(1 + X)^2] = 1 + 2E[X] + E[X^2]$$
So
$$E[X^2] = p \cdot 1 + (1 - p)\left(1 + 2E[X] + E[X^2]\right)$$
Mean and Variance of the Geometric Random Variable
We obtain
$$E[X^2] = \frac{2}{p^2} - \frac{1}{p}$$
and conclude that
$$\mathrm{var}(X) = E[X^2] - (E[X])^2 = \frac{2}{p^2} - \frac{1}{p} - \frac{1}{p^2} = \frac{1 - p}{p^2}$$
Content
Basic Concepts
Probability Mass Function
Functions of Random Variables
Expectation, Mean, and Variance
Joint PMFs of Multiple Random Variables
Conditioning
Independence
Independence of a r.v. from an event
Idea is similar to the independence of two
events.
Knowing the occurrence of the conditioning
event tells us nothing about the value of the
random variable.
Independence of a r.v. from an event
Formally, the random variable $X$ is independent of the event $A$ if
$$P(X = x \text{ and } A) = P(X = x)\, P(A) = p_X(x)\, P(A) \quad \text{for all } x$$
This is the same as requiring that the events $\{X = x\}$ and $A$ are independent, for any choice of $x$.
Independence of a r.v. from an event
Consider the case $P(A) > 0$.
By the definition of the conditional PMF,
$$p_{X|A}(x) = P(X = x \text{ and } A)/P(A)$$
so independence is the same as the condition
$$p_{X|A}(x) = p_X(x) \quad \text{for all } x$$
Independence of a r.v. from an event
Consider two independent tosses of a fair coin.
$X$: the number of heads.
$A$: the number of heads is even.
The PMF of $X$:
$$p_X(x) = \begin{cases} 1/4 & \text{if } x = 0 \\ 1/2 & \text{if } x = 1 \\ 1/4 & \text{if } x = 2 \end{cases}$$
Independence of a r.v. from an event
We know $P(A) = \frac{1}{2}$.
The conditional PMF:
$$p_{X|A}(x) = \begin{cases} 1/2 & \text{if } x = 0 \\ 0 & \text{if } x = 1 \\ 1/2 & \text{if } x = 2 \end{cases}$$
The PMFs $p_X$ and $p_{X|A}$ are different, so $X$ and $A$ are not independent.
Independence of random variables
The notion of independence of two random variables is similar.
Two random variables $X$ and $Y$ are independent if
$$p_{X,Y}(x, y) = p_X(x)\, p_Y(y) \quad \text{for all } x, y$$
This is the same as requiring that the two events $\{X = x\}$ and $\{Y = y\}$ be independent for every $x$ and $y$.
Independence of random variables
By the formula
$$p_{X,Y}(x, y) = p_{X|Y}(x|y)\, p_Y(y)$$
independence is equivalent to the condition
$$p_{X|Y}(x|y) = p_X(x) \quad \text{for all } y \text{ with } p_Y(y) > 0 \text{ and all } x$$
Independence means that the experimental value of $Y$ tells us nothing about the value of $X$.
Independence of random variables
$X$ and $Y$ are conditionally independent, given a positive-probability event $A$, if
$$P(X = x, Y = y \mid A) = P(X = x \mid A)\, P(Y = y \mid A)$$
Using this chapter's notation,
$$p_{X,Y|A}(x, y) = p_{X|A}(x)\, p_{Y|A}(y)$$
or, equivalently,
$$p_{X|Y,A}(x|y) = p_{X|A}(x) \quad \text{for all } x, y \text{ such that } p_{Y|A}(y) > 0$$
Independence of random variables
If $X$ and $Y$ are independent random variables, then
$$E[XY] = E[X] \cdot E[Y]$$
as shown by the following calculation:
$$E[XY] = \sum_x \sum_y xy\, p_{X,Y}(x, y) = \sum_x \sum_y xy\, p_X(x)\, p_Y(y) = \sum_x x\, p_X(x) \sum_y y\, p_Y(y) = E[X] \cdot E[Y]$$
Independence of random variables
Conditional independence may not imply unconditional independence.
Example (based on a joint PMF table in the original slides): $X$ and $Y$ are not independent, since
$$p_{X|Y}(1|1) = P(X = 1 \mid Y = 1) = 0 \ne P(X = 1) = p_X(1)$$
Yet conditioned on the event $A = \{X \le 2, Y \ge 3\}$, they are independent.
Independence of random variables
A very similar calculation shows that if $X$ and $Y$ are independent, then so are $g(X)$ and $h(Y)$ for any functions $g$ and $h$:
$$E[g(X)\, h(Y)] = E[g(X)] \cdot E[h(Y)]$$
Next, we consider the variance of a sum of independent random variables.
Independence of random variables
Consider $Z = X + Y$, where $X$ and $Y$ are independent.
$$\mathrm{var}(Z) = E\left[(X + Y - E[X + Y])^2\right] = E\left[(X + Y - E[X] - E[Y])^2\right] = E\left[\big((X - E[X]) + (Y - E[Y])\big)^2\right]$$
$$= E\left[(X - E[X])^2\right] + E\left[(Y - E[Y])^2\right] + 2\, E\left[(X - E[X])(Y - E[Y])\right]$$
Independence of random variables
Now we compute $E[(X - E[X])(Y - E[Y])]$.
Since $X$ and $Y$ are independent, so are $X - E[X]$ and $Y - E[Y]$, as they are functions of $X$ and $Y$, respectively. Thus
$$E[(X - E[X])(Y - E[Y])] = E[X - E[X]] \cdot E[Y - E[Y]] = 0 \cdot 0 = 0$$
So
$$\mathrm{var}(Z) = E\left[(X - E[X])^2\right] + E\left[(Y - E[Y])^2\right] = \mathrm{var}(X) + \mathrm{var}(Y)$$
Summary of independent r.v.'s
$X$ is independent of the event $A$ if
$$p_{X|A}(x) = p_X(x) \quad \text{for all } x$$
that is, if for all $x$, the events $\{X = x\}$ and $A$ are independent.
$X$ and $Y$ are independent if for all possible pairs $(x, y)$, the events $\{X = x\}$ and $\{Y = y\}$ are independent:
$$p_{X,Y}(x, y) = p_X(x)\, p_Y(y)$$
Summary of Facts About Independent Random Variables
If $X$ and $Y$ are independent random variables, then:
1. $E[XY] = E[X]\, E[Y]$
2. $E[g(X)\, h(Y)] = E[g(X)]\, E[h(Y)]$, for any functions $g$ and $h$.
3. $\mathrm{var}(X + Y) = \mathrm{var}(X) + \mathrm{var}(Y)$
Independence of Several Random Variables
All previous results have natural extensions to more than two random variables.
Example: Random variables $X$, $Y$, and $Z$ are independent if
$$p_{X,Y,Z}(x, y, z) = p_X(x)\, p_Y(y)\, p_Z(z)$$
Example: If $X_1, X_2, \ldots, X_n$ are independent random variables, then
$$\mathrm{var}(X_1 + X_2 + \cdots + X_n) = \mathrm{var}(X_1) + \mathrm{var}(X_2) + \cdots + \mathrm{var}(X_n)$$
Variance of the Binomial
Consider $n$ independent coin tosses, with $P(H) = p$.
Let $X_i$ be the Bernoulli random variable for the $i$th toss:
$$X_i = \begin{cases} 1 & \text{if the } i\text{th toss comes up a head} \\ 0 & \text{otherwise} \end{cases}$$
Variance of the Binomial
Let $X = X_1 + X_2 + \cdots + X_n$; it is a binomial random variable.
By the independence of the coin tosses,
$$\mathrm{var}(X) = \sum_{i=1}^{n} \mathrm{var}(X_i) = np(1 - p)$$
Mean and Variance of the Sample Mean
We want to estimate the approval rating of a president, $C$.
Ask $n$ persons drawn at random from the voter population; let $X_i$ be the response of the $i$th person:
$$X_i = \begin{cases} 1 & \text{if the } i\text{th person approves of } C \\ 0 & \text{if the } i\text{th person disapproves of } C \end{cases}$$
Mean and Variance of the Sample Mean
Model $X_1, X_2, \ldots, X_n$ as independent Bernoulli random variables with mean $p$ and variance $p(1 - p)$.
The sample mean:
$$M_n = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
Mean and Variance of the Sample Mean
$M_n$ is the approval rating of $C$ within our $n$-person sample.
Using the linearity of $M_n$ as a function of the $X_i$:
$$E[M_n] = \sum_{i=1}^{n} \frac{1}{n} E[X_i] = \frac{1}{n} \sum_{i=1}^{n} p = p$$
and
$$\mathrm{var}(M_n) = \sum_{i=1}^{n} \frac{1}{n^2} \mathrm{var}(X_i) = \frac{p(1 - p)}{n}$$