Lecture #3: Probability Theory
Basic concepts of Probability
Random Variables
Transformations of random variables
Multiple Random Variables
4.1 Statistical Independence
Stochastic Process
White noise
Lecture #3: Probability Theory
Website: http://ncbs.knu.ac.kr
Email: [email protected]
IT Engineering, Kyungpook National University
Kalyana C. Veluvolu (#IT1-817) OPTIMAL CONTROL - ELEC732001
http://ncbs.knu.ac.kr/Teaching/index.htm
Why Probability Theory?
• In this course, our objective is to estimate a system's outcome (signal); we will try to extract meaningful information about the outcome and its uncertainties (noise characteristics).
• To accomplish this, we need to know about uncertainty, its characteristics, and how it can be handled.
• Various real-world situations that involve uncertainty:
  • throwing of dice;
  • measuring a physical parameter such as length, current, temperature, etc.;
  • sampling a batch of manufactured items.
Why Probability Theory?
• Probability theory is a universally accepted tool for expressing degrees of confidence or doubt about some proposition (outcomes) in the presence of uncertainty, or randomness, in some sense.
• This class reviews probability theory.
Outline
1 Basic concepts of Probability
2 Random Variables
3 Transformations of random variables
4 Multiple Random Variables
4.1 Statistical Independence
5 Stochastic Process
6 White noise
Definitions: Sample space
Probability theory starts with the consideration of a sample space.
The sample space is the set of all possible outcomes in any physical experiment.
For example, if a coin is tossed twice and after each toss the face that shows is recorded, then the possible outcomes of this particular experiment are
HH, HT, TH, TT
with H denoting the occurrence of heads and T of tails. Then
S = {HH, HT, TH, TT}    (1)
is called the sample space of this coin-tossing experiment.
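The sample space above can also be enumerated mechanically. A minimal Python sketch (the variable names are illustrative, not from the slides):

```python
from itertools import product

# Enumerate the sample space of tossing a coin twice, recording the
# face after each toss, with H for heads and T for tails.
sample_space = {"".join(faces) for faces in product("HT", repeat=2)}
print(sorted(sample_space))  # ['HH', 'HT', 'TH', 'TT']
```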
Definitions
In general, a sample space is
• finite or countably infinite: for example, toss a coin until the first time heads shows up and record the trial number.
• uncountably infinite: for example, consider the experiment of choosing a number at random from the interval [0, 1].
Let Ω be the sample space of an experiment E. Then any subset A of Ω, including the empty set φ and the entire sample space Ω, is called an event. An event may contain even a single sample point ω, in which case the event is a singleton set {ω}.
Events, sets and words
Experiment: toss a coin 3 times.
Which of the following describes the event {THH, HTH, HHT}?
(1) Exactly one head {TTH, HTT, THT}
(2) Exactly one tail {THH, HTH, HHT}
(3) At most one tail {THH, HTH, HHT, HHH}
(4) None of the above
Answer: (2) Exactly one tail.
Notice that the same event E ⊂ Ω may be described in words in multiple ways, e.g. "exactly 2 heads" and "exactly 1 tail".
Events, sets and words
Experiment: toss a coin 3 times.
Q: Which of the following equals the event "exactly two heads"?
A = {THH, HTH, HHT, HHH}
B = {THH, HTH, HHT}
C = {HTH, THH}
(1) A  (2) B  (3) C  (4) B or C
Probability measure
Given a sample space Ω, a probability or probability measure on Ω is a function P on subsets of Ω such that
a) P(A) ≥ 0 for any A ⊆ Ω;
b) P(Ω) = 1;
c) given disjoint subsets A1, A2, · · · of Ω, P(∪_{i=1}^∞ Ai) = Σ_{i=1}^∞ P(Ai). This property is known as countable additivity.
Let Ω be a finite sample space consisting of N sample points. We say that the sample points are equally likely if P(ω) = 1/N for each sample point ω.
Probability
• Let Ω be a finite sample space consisting of N equally likely sample points. Let A be any event and suppose A contains n distinct sample points. Then
P(A) = n/N = (number of sample points favourable to A) / (total number of sample points)
• For instance, our experiment may be rolling a six-sided fair die. Event A may be defined as the number 4 showing up on the top surface of the die after we roll. Then the probability of event A is 1/6, and similarly for any other number on the die.
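The counting rule P(A) = n/N can be applied directly. A small sketch of the fair-die example (names are illustrative):

```python
from fractions import Fraction

# Equally likely outcomes: P(A) = n / N, counted directly for the
# event "a 4 shows on top" when rolling a fair six-sided die.
omega = [1, 2, 3, 4, 5, 6]              # N = 6 sample points
event_A = [w for w in omega if w == 4]  # n = 1 favourable point
p_A = Fraction(len(event_A), len(omega))
print(p_A)  # 1/6
```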
Probability and Induction
• By convention, calibrated probabilities lie between 0 and 1, and all propositions (outcomes) fall somewhere in between. If the probability is zero, the occurrence of that event is impossible. If the probability is one, then the occurrence of that event is certain.
• The probability statements that we make can be based on our past experience, or on our personal judgments.
• Whether our probability statements are based on past experience or subjective personal judgments, they obey a common set of rules, which we can use to treat probabilities in a mathematical framework, and also for making decisions on predictions, for understanding complex systems, or as intellectual experiments and for entertainment.
Probability and Induction
• Suppose that we know somehow from past observations the probability P(A) of an event A in a given experiment. What conclusion can we draw about the occurrence of this event in a single future performance of this experiment?
• Suppose that P(A) = 0.6. In this case, the number 0.6 gives us only a "certain degree of confidence" that the event A will occur. The known probability is thus used as a "measure of our belief" about the occurrence of A in a single trial. In a single trial, the event A will either occur or not occur. If it does not, this will not be a reason for questioning the validity of the assumption that P(A) = 0.6.
Probability
Example: What is the probability of being dealt four of a kind in a five-card poker hand?
Solution:
Event: being dealt four of a kind in a poker game.
Total number of outcomes: the number of subsets of size five that can be picked from a deck of 52 cards = the total number of possible poker hands = (52 choose 5) = 2,598,960.
Number of times the event occurs: out of all those hands, there are 48 possible hands containing four aces, 48 possible hands containing four kings, and so on. So there are a total of 13 × 48 = 624 hands containing four of a kind.
P(event) = (number of times the event occurs) / (total number of outcomes) = (13)(48)/2,598,960 ≈ 0.00024
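The counts above can be checked in a few lines; a sketch using Python's standard library:

```python
from math import comb

# Counting check for the four-of-a-kind hand: 13 possible ranks, and
# for each rank the fifth card is any of the remaining 48 cards.
total_hands = comb(52, 5)     # 2,598,960 possible 5-card hands
favourable = 13 * 48          # 624 four-of-a-kind hands
p = favourable / total_hands
print(favourable, total_hands, p)
```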
Probability and set operations on events
Events: A, L, R.
Rule 1: Complements. P(A^c) = 1 − P(A)
Rule 2: Disjoint events. If L and R are disjoint, then P(L ∪ R) = P(L) + P(R)
Rule 3: Inclusion-exclusion principle. For any L and R: P(L ∪ R) = P(L) + P(R) − P(L ∩ R)
Definitions: Event
Example: Lucky Larry has a coin that you're quite sure is not fair.
• He will flip the coin twice (independently).
• It's your job to bet whether the outcomes will be the same (HH, TT) or different (HT, TH).
Solution: Let the probability of heads be P(H) = p and the probability of tails P(T) = 1 − P(H) = (1 − p) = q. Since the flips are independent, the probabilities are
Event — Probability
HH — p²
TT — q²
HT — pq
TH — pq
So, the probability of the same outcome is
P(same) = P(HH) + P(TT) = p² + q²
The probability of different outcomes is
P(diff) = P(HT) + P(TH) = 2pq
Arithmetic: if a ≠ b, then (a − b)² > 0 ⇔ a² + b² > 2ab.
Since the coin is unfair, we know p ≠ q. Thus
p² + q² > 2pq ⇒ P(same) > P(different)
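The inequality can be verified numerically for any biased coin; the bias p = 7/10 below is an assumed value for illustration:

```python
from fractions import Fraction

# The unfair-coin bet: with P(H) = p and q = 1 - p, matching
# outcomes have probability p^2 + q^2 and differing ones 2pq.
p = Fraction(7, 10)   # assumed bias, any p != 1/2 works
q = 1 - p
p_same = p**2 + q**2
p_diff = 2 * p * q
print(p_same, p_diff)  # 29/50 21/50
```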
Conditional Probability
P(A|B) is the conditional probability of A given B, that is, the probability that A occurs given the fact that B occurred, defined as:
P(A|B) = P(A, B) / P(B)
• P(A, B) is the joint probability of A and B, that is, the probability that events A and B both occur.
• The probability P(A) or P(B) is called an a priori probability because it applies to the probability of an event apart from any previously known information.
• A conditional probability is called an a posteriori probability because it applies to a probability given the fact that some information about a possibly related event is already known.
Conditional Probability
The conditional probability of event A given event B can be defined if the probability of B is nonzero.
Example: Suppose that A is the appearance of a 4 on a die, and B is the appearance of an even number on a die. How much is P(A|B)?
Solution:
Event A: appearance of a 4 on a die — P(A) = 1/6
Event B: appearance of an even number on a die — P(B) = 3/6 = 1/2
Joint probability: appearance of a 4 and an even number — P(A, B) = 1/6
Then, the conditional probability is
P(A|B) = P(A, B) / P(B) = (1/6) / (1/2) = 1/3
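The same calculation by direct counting on the sample space of one roll (a sketch; names are illustrative):

```python
from fractions import Fraction

# P(A|B) = P(A, B) / P(B) for A = "a 4 shows" and B = "an even
# number shows" on one roll of a fair die.
omega = set(range(1, 7))
A = {w for w in omega if w == 4}
B = {w for w in omega if w % 2 == 0}
p_B = Fraction(len(B), len(omega))        # 1/2
p_AB = Fraction(len(A & B), len(omega))   # 1/6
p_A_given_B = p_AB / p_B
print(p_A_given_B)  # 1/3
```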
Conditional Probability
Consider the eight shapes in the figure (not reproduced in this extract). Find P(gray|circle).
Bayes' Rule
For two events A and B, we can write the conditional probability as:
P(A|B) = P(A, B) / P(B);  P(B|A) = P(A, B) / P(A)
From the above two equations, we obtain
P(A|B)P(B) = P(B|A)P(A)
Bayes' rule is often written by re-arranging the above equation to obtain:
P(A|B) = P(B|A)P(A) / P(B)
Bayes’ rule
Consider the eight shapes in the figure (not reproduced in this extract). Find P(gray|circle).
By Bayes' rule,
P(gray|circle) = P(circle|gray)P(gray) / P(circle)
with P(circle|gray) = 1/5, P(circle) = 3/8, and P(gray) = 5/8. The conditional probability is
P(gray|circle) = (1/5)(5/8) / (3/8) = 1/3
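A sketch of the same Bayes computation. The original figure is not reproduced, so the counts below (8 shapes, 5 gray, 3 circles, exactly 1 gray circle) are assumptions chosen to match the slide's numbers:

```python
from fractions import Fraction

# Bayes' rule: P(gray|circle) = P(circle|gray) * P(gray) / P(circle).
p_gray = Fraction(5, 8)                  # assumed: 5 gray of 8 shapes
p_circle = Fraction(3, 8)                # assumed: 3 circles of 8 shapes
p_circle_given_gray = Fraction(1, 5)     # assumed: 1 gray circle of 5 gray
p_gray_given_circle = p_circle_given_gray * p_gray / p_circle
print(p_gray_given_circle)  # 1/3
```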
Random Variables
A random variable (RV) is defined as a functional mapping from a set of experimental outcomes (the domain) to a set of real numbers (the range).
• For example, the roll of a die can be viewed as an RV if we map the appearance of one dot on the die to the output one, the appearance of two dots on the die to the output two, and so on.
• If we define X as an RV that represents the roll of a die, then the probability that X will be a four is equal to 1/6. If we then roll a four, the four is a realization of the RV X. If we roll the die again and get a three, the three is another realization of the RV X. However, the RV X exists independently of any of its realizations.
Random Variables
• This distinction between an RV and its realizations is important for understanding the concept of probability. Realizations of an RV are not equal to the RV itself.
• The probability of X = 4 is equal to 1/6; that means that there is a 1 out of 6 chance that each realization of X will be equal to 4. However, the RV X will always be random and will never be equal to a specific value.
• An RV can be either continuous or discrete. The throw of a die is a discrete random variable because its realizations belong to a discrete set of values. The high temperature tomorrow is a continuous random variable because its realizations belong to a continuous set of values.
Probability distribution function
The fundamental property of an RV X is its probability distribution function (PDF) FX(x), defined as
FX(x) = P(X ≤ x)
where x is a nonrandom independent variable or constant.
Properties:
1. FX(x) ∈ [0, 1]
2. FX(−∞) = 0; FX(∞) = 1
3. FX(a) ≤ FX(b) if a ≤ b
4. P(a < X ≤ b) = FX(b) − FX(a)
Probability density function
The probability density function (pdf) fX(x) is defined as the derivative of the probability distribution function:
fX(x) = dFX(x)/dx
Properties:
1. FX(x) = ∫_{−∞}^{x} fX(z) dz
2. fX(x) ≥ 0
3. ∫_{−∞}^{∞} fX(x) dx = 1
4. P(a < X < b) = ∫_a^b fX(x) dx
Note: the probability distribution function is a probability of the random variable. The probability density function contains information about probability, but it is not itself a probability, since it can take any nonnegative value, even larger than one.
Moments
First moment: the expected value of an RV X is defined as its average value over a large number of experiments. This is also called the expectation, the mean, or the average of the RV.
Suppose we run the experiment N times and observe m different outcomes. We observe that outcome A1 occurs n1 times, A2 occurs n2 times, and so on, and Am occurs nm times. Then the expected value of X is computed as:
E(X) = (1/N) Σ_{i=1}^{m} Ai ni
E(X) is also often written as E(x), X̄, or x̄.
Mean
Suppose that we roll a die an infinite number of times. We would expect to see each possible number 1/6 of the time. We can compute the expected value of the roll of the die as
E(X) = lim_{N→∞} (1/N)[1(N/6) + · · · + 6(N/6)] = 3.5
Note that the expected value of an RV is not necessarily what we would expect to see when we run a particular experiment. For example, even though the expected value of X above is 3.5, we will never see a 3.5 when we roll a die.
Mean and Variance
If a function, say g(X), acts upon an RV, then the output of the function is also an RV. For example, if X is the roll of a die, then P(X = 4) = 1/6. If g(X) = X², then P[g(X) = 16] = 1/6. We can compute the expected value of any function g(X) as
E[g(X)] = ∫_{−∞}^{∞} g(x) fX(x) dx
where fX(x) is the pdf of X. If g(X) = X, then we compute the expected value of X as
E[X] = ∫_{−∞}^{∞} x fX(x) dx
Variance
The variance of a random variable is a measure of how much we expect the random variable to vary from its mean.
Case 1: If the RV X is always equal to one value (for example, the die is loaded and we always get a 4 when we roll the die), then the variance of X is equal to 0.
Case 2: If X can take on any value between ±∞ with equal probability, then the variance of X is equal to ∞.
The variance of a random variable is defined as
σ²X = E[(X − x̄)²] = ∫_{−∞}^{∞} (x − x̄)² fX(x) dx
The notation to indicate a random variable X with mean x̄ and variance σ² is
X ∼ (x̄, σ²)
Skewness and Kurtosis
The skew of an RV is a measure of the asymmetry of the pdf around its mean. Skew is defined as
skew = E[(X − x̄)³]
Kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution. That is, data sets with high kurtosis tend to have a distinct peak near the mean, decline rather rapidly, and have heavy tails.
kurtosis = E[(X − x̄)⁴]
Uniform Distribution
An RV is called uniform if its pdf is a constant value between two limits. This indicates that the RV has an equally likely probability of obtaining any value between its limits, but a zero probability of obtaining a value outside its limits.
fX(x) = 1/(b − a) for x ∈ [a, b], and 0 otherwise
Uniform Distribution
In this example we will find the mean and variance of an RV that is uniformly distributed between 1 and 3. The pdf of the RV is given as
fX(x) = 1/2 for x ∈ [1, 3], and 0 otherwise
The mean is computed as:
x̄ = ∫_{−∞}^{∞} x fX(x) dx = ∫_1^3 (1/2)x dx = 2
The variance is computed as:
σ²X = ∫_{−∞}^{∞} (x − x̄)² fX(x) dx = ∫_1^3 (1/2)(x − 2)² dx = 1/3
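The two integrals above can be checked numerically; a sketch using the midpoint rule (the grid size is an arbitrary choice):

```python
# Numerical check of the uniform [1, 3] example: with pdf 1/2 on
# [1, 3], the mean should come out as 2 and the variance as 1/3.
def f(x):
    return 0.5 if 1 <= x <= 3 else 0.0

n = 100_000
dx = 2 / n
xs = [1 + (i + 0.5) * dx for i in range(n)]   # midpoints on [1, 3]
mean = sum(x * f(x) * dx for x in xs)
var = sum((x - mean) ** 2 * f(x) * dx for x in xs)
print(round(mean, 4), round(var, 4))  # 2.0 0.3333
```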
Uniform Distribution
In this example we will find the mean and variance of an RV that is uniformly distributed between 1 and 6. The pdf of the RV is given as
fX(x) = 1/5 for x ∈ [1, 6], and 0 otherwise
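The slide leaves this as an exercise; a sketch solution using the closed forms for a uniform RV on [a, b], mean (a + b)/2 and variance (b − a)²/12:

```python
from fractions import Fraction

# Uniform RV on [1, 6]: mean (a + b)/2 and variance (b - a)^2 / 12.
a, b = 1, 6
mean = Fraction(a + b, 2)            # 7/2 = 3.5
var = Fraction((b - a) ** 2, 12)     # 25/12
print(mean, var)  # 7/2 25/12
```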
Normal distribution
An RV is called Gaussian or normal if its pdf is given by
fX(x) = (1/(σ√(2π))) exp[−(x − x̄)²/(2σ²)]
where x̄ and σ² are the mean and variance of the Gaussian RV. We use the notation X ∼ N(x̄, σ²).
If the mean changes, the pdf will shift to the left or right. If the variance increases, the pdf will spread out. If the variance decreases, the pdf will be squeezed in.
Transformations of random variables
What happens to the pdf of an RV when we pass the RV through some function?
Suppose that we have two RVs, X and Y, related to one another by the monotonic functions g(·) and h(·):
Y = g(X), X = g⁻¹(Y) = h(Y)
If we know the pdf of X, fX(x), then we can compute the pdf of Y, fY(y), as follows:
fY(y) = fX(h(y)) |dh(y)/dy|
RV transformations
In this example, we will find the pdf of a linear function of a Gaussian RV. Suppose that X ∼ N(x̄, σ²X) and Y = g(X) = aX + b, where a ≠ 0 and b are any real constants. Then the RV Y is Gaussian with mean ȳ = ax̄ + b and variance σ²Y = a²σ²X. This example shows that a linear transformation of a Gaussian RV results in a new Gaussian RV.
RV transformations
Suppose that we pass a Gaussian RV X ∼ N(0, σ²X) through the nonlinear function Y = g(X) = X³.
Multiple Random Variables
For example, if X and Y are RVs, then their distribution functions are defined as:
FX(x) = P(X ≤ x); FY(y) = P(Y ≤ y)
Now we define the probability that both X ≤ x and Y ≤ y as the joint probability distribution function of X and Y: FXY(x, y) = P(X ≤ x, Y ≤ y).
Statistical Independence
Two events are independent if the occurrence of one event has no effect on the probability of the occurrence of the other event. RVs X and Y are independent if they satisfy the following relation:
P(X ≤ x, Y ≤ y) = P(X ≤ x)P(Y ≤ y)
• The central limit theorem says that the sum of independent RVs tends toward a Gaussian RV, regardless of the pdf of the individual RVs that contribute to the sum. This is why so many RVs in nature seem to have a Gaussian distribution.
• For example, the temperature on any given day in any given location follows a Gaussian distribution. This is because temperature is affected by clouds, wind, air pressure, and other factors. Each of these factors is determined by other random factors.
Covariance
The covariance between two scalar RVs X and Y can be defined as:
CXY = E[(X − X̄)(Y − Ȳ)] = E(XY) − X̄Ȳ
The correlation coefficient of two scalar RVs X and Y is:
ρ = CXY / (σX σY)
The correlation coefficient is a normalized measurement of the dependence between two RVs X and Y. If X and Y are independent, then ρ = 0 (although the converse is not necessarily true). If Y is a linear function of X, then |ρ| = 1.
Correlation
The correlation between two scalar RVs X and Y can be defined as RXY = E(XY).
• Two RVs are said to be uncorrelated if RXY = E(X)E(Y).
• From the definition of independence, if two RVs are independent then they are also uncorrelated. Independence implies uncorrelatedness, but uncorrelatedness does not necessarily imply independence.
• Two RVs are said to be orthogonal if RXY = 0. If two RVs are uncorrelated, then they are orthogonal only if at least one of them is zero-mean. If two RVs are orthogonal, then they may or may not be uncorrelated.
Uncorrelated
Example: Show that the outcomes of two rolls of a die are uncorrelated and not orthogonal.
Solution: Let the two rolls of the die be represented by the RVs X and Y. Uncorrelated means E(XY) = E(X)E(Y), and not orthogonal means RXY ≠ 0.
The two RVs are independent because one roll of the die does not have any effect on a second roll of the die. Each roll of the die has an equally likely probability (1/6) of being a 1, 2, 3, 4, 5, or 6. Therefore,
E(X) = E(Y) = (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5
There are 36 possible combinations of the two rolls of the die. We could get the combination (1,1), (1,2), and so on. Each of these 36 combinations has an equally likely probability (1/36). Therefore, the correlation between X and Y is
RXY = E(XY) = (1/36) Σ_{i=1}^{6} Σ_{j=1}^{6} ij = 12.25 = E(X)E(Y)
Since E(XY) = E(X)E(Y), we see that X and Y are uncorrelated. However, RXY ≠ 0, so X and Y are not orthogonal.
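The double sum can be checked by enumerating all 36 equally likely pairs directly:

```python
from fractions import Fraction

# R_XY = E(XY) for two independent fair dice: all 36 ordered pairs
# (i, j) have probability 1/36, so E(XY) = (1/36) * sum of i*j.
r_xy = sum(Fraction(i * j, 36) for i in range(1, 7) for j in range(1, 7))
e_x = Fraction(sum(range(1, 7)), 6)     # 7/2 for a single roll
print(r_xy, e_x * e_x)  # 49/4 49/4, i.e. 12.25 — uncorrelated but not orthogonal
```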
Correlated and Not Orthogonal
A slot machine is rigged so you get 1 or −1 with equal probability on the first spin X, and the opposite number on the second spin Y. We have equal probabilities of obtaining (X, Y) outcomes of (1, −1) and (−1, 1).
Solution:
The two RVs are dependent because the realization of Y depends on the realization of X. We also see that
E(X) = E(Y) = 0
E(XY) = [(1)(−1) + (−1)(1)]/2 = −1
We see that X and Y are correlated because E(XY) ≠ E(X)E(Y). We also see that X and Y are not orthogonal because RXY ≠ 0.
Uncorrelated and Orthogonal
A slot machine is rigged so you get −1, 0, or +1 with equal probability on the first spin X. On the second spin Y you get 1 if X = 0, and 0 if X ≠ 0.
Solution:
E(X) = 0, E(Y) = 1/3
E(XY) = [(−1)(0) + (0)(1) + (1)(0)]/3 = 0
X and Y are uncorrelated because E(XY) = 0 = E(X)E(Y), and orthogonal because RXY = 0.
Suppose that x and y are independent RVs, and the RV z is computed as z = g(x) + h(y). In this example, we will calculate the mean of z:
E(z) = E[g(x)] + E[h(y)]
As a special case of this example, we see that the mean of the sum of two independent RVs is equal to the sum of their means. That is, E(x + y) = E(x) + E(y) if x and y are independent.
Suppose we roll a die twice. What is the expected value of the sum of the two outcomes?
Solution: We use X and Y to refer to the two rolls of the die, and Z to refer to the sum of the two outcomes. Therefore, Z = X + Y. Each roll of the die has an equally likely probability (1/6) of being a 1, 2, 3, 4, 5, or 6. Therefore,
E(X) = E(Y) = (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5
Since X and Y are independent,
E(Z) = E(X) + E(Y) = 3.5 + 3.5 = 7
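The linearity shortcut can be checked against brute-force enumeration of all 36 equally likely pairs (a sketch; names are illustrative):

```python
from fractions import Fraction

# E(Z) for Z = X + Y over two die rolls: linearity of expectation
# versus direct enumeration of the 36 ordered outcome pairs.
e_x = Fraction(sum(range(1, 7)), 6)                    # 7/2
e_z_linear = e_x + e_x                                 # 7
pairs = [(i, j) for i in range(1, 7) for j in range(1, 7)]
e_z_brute = Fraction(sum(i + j for i, j in pairs), len(pairs))
print(e_z_linear, e_z_brute)  # 7 7
```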
Stochastic Process
A stochastic process, also called a random process, is a very simple generalization of the concept of an RV. A stochastic process X(t) is an RV X that changes with time. A stochastic process can be one of four types.
• If the RV at each time is continuous and time is continuous, then X(t) is a continuous random process. For example, the temperature at each moment of the day is a continuous random process because both temperature and time are continuous.
• If the RV at each time is discrete and time is continuous, then X(t) is a discrete random process. For example, the number of people in a given building at each moment of the day is a discrete random process because the number of people is a discrete variable and time is continuous.
-
Stochastic Process
• If the RV at each time is continuous and time is discrete, then X(t) is a continuous random sequence. For example, the high temperature each day is a continuous random sequence because temperature is continuous but time is discrete (day one, day two, etc.).
• If the RV at each time is discrete and time is discrete, then X(t) is a discrete random sequence. For example, the highest number of people in a given building each day is a discrete random sequence because the number of people is a discrete variable and time is also discrete.
-
Stochastic Process
Since a stochastic process is an RV that changes with time, it has a distribution and density function that are functions of time.
The PDF of X(t) is F_X(x, t) = P(X(t) ≤ x).
The pdf of X(t) is f_X(x, t) = dF_X(x, t)/dx.
If X(t) is a random vector, then the derivative above is taken once with respect to each element of x. The mean and covariance of X(t) are also functions of time:
• Mean: x̄(t) = ∫_{−∞}^{∞} x f(x, t) dx
• Covariance: C_X(t) = E{[X(t) − x̄(t)][X(t) − x̄(t)]^T} = ∫_{−∞}^{∞} [x − x̄(t)][x − x̄(t)]^T f(x, t) dx
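A Monte Carlo sketch of these time-dependent statistics, using an assumed example process X(t) = W·t with W ~ N(0, 1), so the mean is 0 and the variance is t² at each time t:

```python
import random

# Monte Carlo sketch (assumed example process): X(t) = W * t with
# W ~ N(0, 1), so the mean is 0 and the variance is t**2 at each t.
random.seed(0)
N = 200_000
w_samples = [random.gauss(0.0, 1.0) for _ in range(N)]

def mean_and_var_at(t):
    xs = [w * t for w in w_samples]
    m = sum(xs) / N
    v = sum((x - m) ** 2 for x in xs) / N
    return m, v

m1, v1 = mean_and_var_at(1.0)   # mean near 0, variance near 1
m2, v2 = mean_and_var_at(2.0)   # mean near 0, variance near 4
# The statistics are functions of time: v2 is about four times v1.
```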
-
Stochastic Process - Examples
• The high temperature each day can be considered a stochastic process. However, this process is not stationary. The high temperature on a day in July might be an RV with a mean of 100 degrees Fahrenheit, but the high temperature on a day in December might have a mean of 30 degrees. This is a stochastic process whose statistics change with time, so the process is not stationary.
• The closing price of the stock market is an RV whose mean generally increases with time. Therefore, the stock market price is a non-stationary stochastic process.
-
Stochastic Process
Suppose we have a stochastic process X(t). Further suppose that the process has a realization x(t). The time average of X(t) is denoted as A[X(t)], and the time autocorrelation of X(t) is denoted as R[X(t), τ]. These quantities are defined for continuous-time random processes as

A[X(t)] = lim_{T→∞} (1/2T) ∫_{−T}^{T} x(t) dt
R[X(t), τ] = lim_{T→∞} (1/2T) ∫_{−T}^{T} x(t) x(t + τ) dt
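These time averages can be approximated numerically from one sampled realization; a sketch using the assumed realization x(t) = sin(t), for which the time average is 0 and the time autocorrelation at τ = 0 is 1/2:

```python
import math

# Time average and time autocorrelation (at tau = 0) of one sampled
# realization; assumed example realization: x(t) = sin(t).
dt = 0.001
T = 20 * math.pi            # a long window of whole periods
n = int(T / dt)
x = [math.sin(i * dt) for i in range(n)]

# A[X(t)] ~ (1/T) * sum of x(t) dt
time_avg = sum(x) * dt / T

# R[X(t), 0] ~ (1/T) * sum of x(t)**2 dt
autocorr_0 = sum(v * v for v in x) * dt / T

# For x(t) = sin(t): time average -> 0 and R(0) -> 1/2.
```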
-
Ergodic process
An ergodic process is a stationary random process for which

A[X(t)] = E(x);  R[X(t), τ] = R_X(τ)

• In the real world, we are often limited to only a few realizations of a stochastic process.
• For example, if we measure the fluctuation of a voltmeter reading, we are actually measuring only one realization of a stochastic process. We can compute the time average, time autocorrelation, and other time-based statistics of the realization.
• If the random process is ergodic, then we can use those time averages to estimate the statistics of the stochastic process.
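A sketch of this idea for an assumed random-phase sinusoid X(t) = sin(t + Θ), Θ ~ Uniform[0, 2π), which is stationary and ergodic: the ensemble mean over many realizations and the time average of a single realization should agree.

```python
import math, random

# Ergodicity sketch for the assumed random-phase sinusoid
# X(t) = sin(t + Theta), Theta ~ Uniform[0, 2*pi).
random.seed(1)

# Ensemble average E(X) at a fixed time t0, over many realizations.
t0 = 0.3
n_real = 100_000
ensemble_mean = sum(math.sin(t0 + random.uniform(0, 2 * math.pi))
                    for _ in range(n_real)) / n_real

# Time average A[X(t)] of a single realization (one fixed phase).
theta = 1.234
dt, T = 0.001, 20 * math.pi
n = int(T / dt)
time_mean = sum(math.sin(i * dt + theta) for i in range(n)) * dt / T

# Both averages are near zero, as ergodicity predicts.
```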
-
White noise and colored noise
If the RV X(t₁) is independent of the RV X(t₂) for all t₁ ≠ t₂, then X(t) is called white noise. Otherwise, X(t) is called colored noise.
The whiteness or color content of a stochastic process can be characterized by its power spectrum. The power spectrum S_X(ω) of a wide-sense stationary stochastic process X(t) is defined as the Fourier transform of the autocorrelation:

S_X(ω) = ∫_{−∞}^{∞} R_X(τ) e^{−jωτ} dτ

The autocorrelation is, in turn, the inverse Fourier transform of the power spectrum:

R_X(τ) = (1/2π) ∫_{−∞}^{∞} S_X(ω) e^{jωτ} dω
-
White and colored noise
The power spectrum is sometimes referred to as the power density spectrum, the power spectral density, or the power density. The power of a wide-sense stationary stochastic process is defined as

P_X = (1/2π) ∫_{−∞}^{∞} S_X(ω) dω

A discrete-time stochastic process X(t) is called white noise if

R_X(k) = E[X(n) X(n + k)] = σ² δ_k

where δ_k is the delta function.
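Discrete-time white noise is uncorrelated across time, so the sample autocorrelation of simulated i.i.d. Gaussian noise with standard deviation σ should be near σ² at lag 0 and near zero at every other lag. A sketch with assumed values:

```python
import random

# Sample autocorrelation of simulated i.i.d. Gaussian noise with
# standard deviation sigma (assumed values for illustration).
random.seed(2)
sigma = 1.5
N = 100_000
x = [random.gauss(0.0, sigma) for _ in range(N)]

def sample_autocorr(lag):
    return sum(x[i] * x[i + lag] for i in range(N - lag)) / (N - lag)

r0 = sample_autocorr(0)   # near sigma**2 = 2.25
r1 = sample_autocorr(1)   # near 0
r5 = sample_autocorr(5)   # near 0
```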
-
White noise
• Discrete-time white noise has no correlation with itself except at the present time.
• If X(k) is a discrete-time white noise process, then the RV X(n) is uncorrelated with X(m) unless n = m. This shows that the power of a discrete-time white noise process is equal at all frequencies: S_X(ω) = R_X(0) ∀ ω ∈ [−π, π]
• For a continuous-time random process, white noise is defined similarly. White noise has equal power at all frequencies (like white light): S_X(ω) = R_X(0) ∀ ω
-
Power Spectrum
Suppose that a zero-mean stationary stochastic process has the autocorrelation function R_X(τ) = σ² e^{−β|τ|}, where β is a positive real number. The power spectrum is computed as:

S_X(ω) = ∫_{−∞}^{∞} σ² e^{−β|τ|} e^{−jωτ} dτ = 2σ²β / (β² + ω²)
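This Fourier transform can be checked by quadrature (a sketch with assumed values σ = 2, β = 0.5): the cosine-transform integral of σ²e^{−β|τ|} should match 2σ²β/(β² + ω²).

```python
import math

# Quadrature check of the transform pair (assumed sigma, beta values):
# R_X(tau) = sigma**2 * exp(-beta * |tau|)  <->
# S_X(w)   = 2 * sigma**2 * beta / (beta**2 + w**2)
sigma, beta = 2.0, 0.5

def spectrum_numeric(w, tau_max=60.0, dtau=0.001):
    # R_X is even, so the Fourier transform reduces to twice the
    # cosine integral over [0, infinity), truncated at tau_max.
    n = int(tau_max / dtau)
    integral = sum(sigma**2 * math.exp(-beta * i * dtau)
                   * math.cos(w * i * dtau) for i in range(n)) * dtau
    return 2.0 * integral

def spectrum_analytic(w):
    return 2.0 * sigma**2 * beta / (beta**2 + w**2)

# The two agree closely at any frequency, e.g. w = 0, 0.5, 2.0.
```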
-
Power Spectrum
The variance of the stochastic process is computed as

σ_X² = P_X = (1/2π) ∫_{−∞}^{∞} 2σ²β/(β² + ω²) dω = (σ²β/π) [ (1/β) arctan(ω/β) ]_{−∞}^{∞} = σ²
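The variance equals the power P_X = (1/2π)∫S_X(ω)dω, which works out to σ² for this spectrum; the following quadrature sketch (assumed σ = 2, β = 0.5) confirms it numerically.

```python
import math

# Variance as power: (1/(2*pi)) * integral of S_X(w) dw, computed by
# quadrature for S_X(w) = 2*sigma**2*beta / (beta**2 + w**2)
# (assumed sigma, beta values). The result should be sigma**2.
sigma, beta = 2.0, 0.5
dw, w_max = 0.01, 2000.0
n = int(w_max / dw)

# The spectrum is even, so integrate over [0, w_max) and double.
half = sum(2 * sigma**2 * beta / (beta**2 + (i * dw) ** 2)
           for i in range(n)) * dw
variance = 2 * half / (2 * math.pi)   # near sigma**2 = 4
```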