
Random Variables and Random Processes


Before we give a definition of probability, let us examine the following concepts:

1. Random Experiment: An experiment is a random experiment if its outcome cannot be predicted precisely. One out of a number of possible outcomes occurs in a random experiment.

2. Sample Space: The sample space S is the collection of all outcomes of a random experiment. The elements of S are called sample points.

• A sample space may be finite, countably infinite or uncountable.
• A finite or countably infinite sample space is called a discrete sample space.
• An uncountable sample space is called a continuous sample space.

3. Event: An event A is a subset of the sample space such that probability can be assigned to it. Thus

• A ⊆ S.
• For a discrete sample space, all subsets are events.
• S is the certain event (sure to occur) and φ is the impossible event.

Consider the following examples.

Example 1 Tossing a fair coin: the possible outcomes are H (head) and T (tail). The associated sample space is S = {H, T}. It is a finite sample space. The events associated with the sample space S are S, {H}, {T} and φ.

Example 2 Throwing a fair die: the 6 possible outcomes are '1', '2', '3', '4', '5', '6'. The associated finite sample space is S = {'1', '2', '3', '4', '5', '6'}. Some events are

A = the event of getting an odd face = {'1', '3', '5'}
B = the event of getting a six = {'6'}

and so on.

Example 3 Tossing a fair coin until a head is obtained: we may have to toss the coin any number of times before a head is obtained. Thus the possible outcomes are H, TH, TTH, TTTH, ... How many outcomes are there? The outcomes are countable but infinite in number. The countably infinite sample space is S = {H, TH, TTH, ...}.

Example 4 Picking a real number at random between -1 and 1.

[Fig.: sample points s₁, s₂ shown inside the sample space S]

The associated sample space is S = {s | s ∈ ℝ, −1 ≤ s ≤ 1} = [−1, 1]. Clearly S is a continuous sample space.

The probability of an event A is a number P(A) assigned to the event. Let us see how we can define probability.

1. Classical definition of probability (Laplace, 1812)

Consider a random experiment with a finite number of outcomes N. If all the outcomes of the experiment are equally likely, the probability of an event A is defined by

P(A) = N_A / N

where N_A = number of outcomes favourable to A.

Example 5 A fair die is rolled once. What is the probability of getting a '6'?
Here S = {'1', '2', '3', '4', '5', '6'} and A = {'6'}.

∴ N_A = 1 and N = 6
∴ P(A) = 1/6

Example 6 A fair coin is tossed twice. What is the probability of getting two 'heads'?
Here S = {HH, HT, TH, TT} and A = {HH}. The total number of outcomes is 4 and all four outcomes are equally likely. The only outcome favourable to A is HH.

∴ P(A) = 1/4

Discussion

• The classical definition is limited to a random experiment which has only a finite number of outcomes. In many experiments like that in the above example, the sample space is finite and each outcome may be assumed ‘equally likely.’ In such cases, the counting method can be used to compute probabilities of events.

• Consider the experiment of tossing a fair coin until a 'head' appears. As we have discussed earlier, there are countably infinite outcomes. Can you believe that all these outcomes are equally likely?

• The notion of equally likely is important here. Equally likely means equally probable. Thus this definition presupposes that all outcomes occur with equal probability, i.e. the definition uses the very concept (probability) that it is trying to define.

2. Relative-frequency based definition of probability (von Mises, 1919)

If an experiment is repeated n times under similar conditions and the event A occurs n_A times, then

P(A) = lim(n → ∞) n_A / n

This definition is also inadequate from the theoretical point of view. We cannot repeat an experiment an infinite number of times.


How do we ascertain that the above ratio will converge for all possible sequences of outcomes of the experiment?
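The behaviour of the ratio n_A/n is easy to explore numerically. Below is a minimal sketch, assuming Python with only the standard library (the function name relative_frequency is illustrative, not part of the notes), which repeats a die-rolling experiment and prints the relative frequency of a '6' as n grows:

```python
import random

def relative_frequency(n_rolls, face=6, seed=1):
    """Roll a fair die n_rolls times and return the relative frequency of `face`."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == face)
    return hits / n_rolls

# The ratio n_A/n settles near the classical value 1/6 = 0.1667 as n grows.
for n in (100, 1000, 10000, 100000):
    print(n, round(relative_frequency(n), 4))
```

The 500-roll example that follows shows the same effect with real (tabulated) frequencies.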

Example Suppose a die is rolled 500 times. The following table shows the frequency of each face.

Face                 1      2      3      4      5      6
Frequency            82     81     88     81     90     78
Relative frequency   0.164  0.162  0.176  0.162  0.180  0.156

3. Axiomatic definition of probability (Kolmogorov, 1933)

We have earlier defined an event as a subset of the sample space. Does each subset of the sample space form an event? The answer is yes for a finite sample space. However, we may not be able to assign probability meaningfully to all the subsets of a continuous sample space. We have to eliminate those subsets. The concept of the sigma algebra is meaningful here.

Definition: Let S be a sample space and F a sigma field defined over it. Let P: F → ℝ be a mapping from the sigma-algebra F into the real line such that for each A ∈ F there exists a unique P(A) ∈ ℝ. Clearly P is a set function and is called probability if it satisfies the following three axioms:

1. P(A) ≥ 0 for all A ∈ F
2. P(S) = 1
3. Countable additivity: if A₁, A₂, ... are pairwise disjoint events, i.e. Aᵢ ∩ Aⱼ = φ for i ≠ j, then
   P(A₁ ∪ A₂ ∪ ...) = Σ(i=1 to ∞) P(Aᵢ)

Remark

• The triplet (S, F, P) is called the probability space.

[Fig.: the probability P maps each event A ⊆ S to a number in the interval [0, 1]]

• Any assignment of probability must satisfy the above three axioms.
• If A ∩ B = φ, then P(A ∪ B) = P(A) + P(B).
  This is a special case of axiom 3 and, for a discrete sample space, this simpler version may be taken as axiom 3. We shall give a proof of this result below.
• The events A and B are called mutually exclusive if A ∩ B = φ.

Basic results of probability

From the above axioms we can establish the following basic results:

1. P(φ) = 0

This is because
S ∪ φ = S
⇒ P(S ∪ φ) = P(S)
⇒ P(S) + P(φ) = P(S)
∴ P(φ) = 0

2. P(Aᶜ) = 1 − P(A), where A ∈ F

We have
A ∪ Aᶜ = S
⇒ P(A ∪ Aᶜ) = P(S)
⇒ P(A) + P(Aᶜ) = 1   (since A ∩ Aᶜ = φ)
∴ P(Aᶜ) = 1 − P(A)

3. If A, B ∈ F and A ∩ B = φ, then P(A ∪ B) = P(A) + P(B)

We have
A ∪ B = A ∪ B ∪ φ ∪ φ ∪ ...
∴ P(A ∪ B) = P(A) + P(B) + P(φ) + P(φ) + ...   (using axiom 3)
∴ P(A ∪ B) = P(A) + P(B)

4. If A, B ∈ F, then P(A ∩ Bᶜ) = P(A) − P(A ∩ B)

We have
A = (A ∩ B) ∪ (A ∩ Bᶜ)
∴ P[(A ∩ B) ∪ (A ∩ Bᶜ)] = P(A)
⇒ P(A ∩ B) + P(A ∩ Bᶜ) = P(A)
⇒ P(A ∩ Bᶜ) = P(A) − P(A ∩ B)

We can similarly show that P(Aᶜ ∩ B) = P(B) − P(A ∩ B).

5. If A, B ∈ F, then P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

[Fig.: Venn diagram of S showing A, B, A ∩ B and A ∩ Bᶜ; A = (A ∩ B) ∪ (A ∩ Bᶜ)]

We have
A ∪ B = (A ∩ Bᶜ) ∪ (A ∩ B) ∪ (Aᶜ ∩ B)
∴ P(A ∪ B) = P[(A ∩ Bᶜ) ∪ (A ∩ B) ∪ (Aᶜ ∩ B)]
 = P(A ∩ Bᶜ) + P(A ∩ B) + P(Aᶜ ∩ B)
 = P(A) − P(A ∩ B) + P(A ∩ B) + P(B) − P(A ∩ B)
 = P(A) + P(B) − P(A ∩ B)

6. We can apply the properties of sets to establish the following result for A, B, C ∈ F:

P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(B ∩ C) − P(A ∩ C) + P(A ∩ B ∩ C)

The following generalization is known as the principle of inclusion-exclusion.

7. Principle of inclusion-exclusion: suppose A₁, A₂, ..., Aₙ ∈ F. Then

P(A₁ ∪ A₂ ∪ ... ∪ Aₙ) = Σᵢ P(Aᵢ) − Σ(i<j) P(Aᵢ ∩ Aⱼ) + Σ(i<j<k) P(Aᵢ ∩ Aⱼ ∩ A_k) − ... + (−1)ⁿ⁺¹ P(A₁ ∩ A₂ ∩ ... ∩ Aₙ)

Discussion We require some rules to assign probabilities to some basic events in F. For other events we can compute the probabilities in terms of the probabilities of these basic events.

Probability assignment in a discrete sample space

Consider a finite sample space S = {s₁, s₂, ..., sₙ}. Then the sigma algebra F is defined by the power set of S. For any elementary event {sᵢ} ∈ F, we can assign a probability P({sᵢ}) such that

Σ(i=1 to n) P({sᵢ}) = 1

For any event A ∈ F, we can define the probability

P(A) = Σ(sᵢ ∈ A) P({sᵢ})

In the special case when the outcomes are equiprobable, we can assign an equal probability p to each elementary event:

Σ(i=1 to n) p = 1 ⇒ p = 1/n

∴ P(A) = Σ(sᵢ ∈ A) 1/n = n(A)/n

where n(A) is the number of outcomes in A.

Example Consider the experiment of rolling a fair die considered in Example 2. Suppose Aᵢ, i = 1, ..., 6, represent the elementary events. Thus A₁ is the event of getting '1', A₂ is the event of getting '2' and so on. Since all six disjoint events are equiprobable and S = A₁ ∪ A₂ ∪ ... ∪ A₆, we get

P(A₁) = P(A₂) = ... = P(A₆) = 1/6

Suppose A is the event of getting an odd face. Then A = A₁ ∪ A₃ ∪ A₅

∴ P(A) = P(A₁) + P(A₃) + P(A₅) = 3 × 1/6 = 1/2

Example Consider the experiment of tossing a fair coin until a head is obtained, discussed in Example 3. Here S = {H, TH, TTH, ...}. Let us call

s₁ = H
s₂ = TH
s₃ = TTH

and so on. If we assign P(sₙ) = 1/2ⁿ, then Σ(sₙ ∈ S) P(sₙ) = 1. Let A = {s₁, s₂, s₃} be the event of obtaining the head before the 4th toss. Then

P(A) = P(s₁) + P(s₂) + P(s₃) = 1/2 + 1/2² + 1/2³ = 7/8

Probability assignment in a continuous space

Suppose the sample space S is continuous and uncountable. Such a sample space arises when the outcomes of an experiment are numbers; for example, when the experiment consists in measuring the voltage, the current or the resistance. In such a case, the sigma algebra consists of the Borel sets on the real line.

Suppose S = ℝ and f: ℝ → ℝ is a non-negative integrable function such that

∫ f(x) dx = 1   (integral over ℝ)

For any Borel set A,

P(A) = ∫_A f(x) dx

defines the probability on the Borel sigma-algebra B. We can similarly define probability on the continuous spaces ℝ², ℝ³, etc.

Example Suppose

f_X(x) = 1/(b − a)  for x ∈ [a, b]
       = 0          otherwise

Then for [a₁, b₁] ⊆ [a, b],

P([a₁, b₁]) = (b₁ − a₁)/(b − a)

Example Consider S = ℝ², the two-dimensional Euclidean space. Let S₁ ⊆ ℝ² and let area(S₁) denote the area of S₁. Define

f_X(x) = 1/area(S₁)  for x ∈ S₁
       = 0           otherwise

Then for A ⊆ S₁,

P(A) = area(A)/area(S₁)

This example interprets the geometrical definition of probability.
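A numerical illustration of the geometrical definition: the sketch below (a Monte Carlo estimate in Python; the quarter-disc region and the sample size are my own choice, not taken from the notes) draws points uniformly over the unit square S₁ and estimates P(A) = area(A)/area(S₁) for A = {x² + y² ≤ 1}.

```python
import random

def estimate_quarter_disc_probability(n_points=200_000, seed=1):
    """Estimate P(A) for A = {x^2 + y^2 <= 1} with points uniform on the unit square."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_points):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return inside / n_points

# area(A)/area(S1) = (pi/4)/1, so the estimate should be close to 0.7854.
print(estimate_quarter_disc_probability())
```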

Probability Using the Counting Method

In many applications we have to deal with a finite sample space S in which the elementary events formed by single elements of the set may be assumed equiprobable. In this case, we can define the probability of the event A according to the classical definition discussed earlier:

P(A) = n_A / n

where n_A = number of elements favourable to A and n is the total number of elements in the sample space S. Thus the calculation of probability involves finding the number of elements in the sample space S and in the event A. Combinatorial rules give us quick algebraic rules to find the number of elements in S. We briefly outline some of these rules:

(1) Product rule: Suppose we have a set A with m distinct elements and a set B with n distinct elements, and A × B = {(aᵢ, bⱼ) | aᵢ ∈ A, bⱼ ∈ B}. Then A × B contains mn ordered pairs of elements. This is illustrated in the figure below for m = 5 and n = 4. In other words, if we can choose element a in m possible ways and element b in n possible ways, then the ordered pair (a, b) can be chosen in mn possible ways.

The above result can be generalized as follows: the number of distinct k-tuples in A₁ × A₂ × ... × A_k = {(a₁, a₂, ..., a_k) | a₁ ∈ A₁, a₂ ∈ A₂, ..., a_k ∈ A_k} is n₁ n₂ ... n_k, where nᵢ represents the number of distinct elements in Aᵢ.

[Fig.: illustration of the product rule - the 5 × 4 grid of ordered pairs (aᵢ, bⱼ)]

Example A fair die is thrown twice. What is the probability that a 3 will appear at least once?

Solution: The sample space corresponding to two throws of the die is illustrated in the following table. Clearly, the sample space has 6 × 6 = 36 elements by the product rule. The event corresponding to getting at least one 3 contains 11 elements.

(1,6) (2,6) (3,6) (4,6) (5,6) (6,6)
(1,5) (2,5) (3,5) (4,5) (5,5) (6,5)
(1,4) (2,4) (3,4) (4,4) (5,4) (6,4)
(1,3) (2,3) (3,3) (4,3) (5,3) (6,3)
(1,2) (2,2) (3,2) (4,2) (5,2) (6,2)
(1,1) (2,1) (3,1) (4,1) (5,1) (6,1)
(first entry = throw 1, second entry = throw 2)

Therefore, the required probability is 11/36.

(2) Sampling with replacement and ordering: Suppose we have to choose k objects from a set of n objects and, after every choice, the chosen object is placed back in the set. In this case, the number of distinct ordered k-tuples = n × n × ... × n (k times) = nᵏ.

(3) Sampling without replacement: Suppose we have to choose k objects from a set of n objects by picking one object after another at random. In this case the first object can be chosen from n objects, the second object can be chosen from n − 1 objects, and so on. Therefore, by applying the product rule, the number of distinct ordered k-tuples in this case is

n(n − 1) ... (n − k + 1) = n!/(n − k)!

The number n!/(n − k)! is called the permutation of n objects taking k at a time and is denoted by ⁿP_k. Thus

ⁿP_k = n!/(n − k)!

Clearly, ⁿPₙ = n!

Example: Birthday problem. Given a class of students, what is the probability of two students in the class having the same birthday? Plot this probability vs. the number of people and be surprised! If the group has more than 365 people, the probability of two people in the group having the same birthday is obviously 1.

Let k be the number of students in the class. Then the number of possible birthdays = 365 × 365 × ... × 365 (k times) = 365ᵏ. The number of cases with each of the k students having a different birthday is 365 × 364 × ... × (365 − k + 1) = ³⁶⁵P_k.

Therefore, the probability of a common birthday = 1 − ³⁶⁵P_k / 365ᵏ

Number of persons   Probability
2                   0.0027
10                  0.1169
20                  0.4114
25                  0.5687
40                  0.8912
50                  0.9704
60                  0.9941
80                  0.9999
100                 ≈ 1

The plot of probability vs. number of students is shown in the figure. Observe the steep rise in the probability in the beginning. In fact this probability for a group of 25 students is greater than 0.5, and from 60 students onward it is close to 1. This probability for 366 or more students is exactly one.
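The tabulated birthday probabilities can be reproduced directly from the formula 1 − ³⁶⁵P_k/365ᵏ; a minimal Python sketch (the function name is illustrative):

```python
def birthday_probability(k):
    """Probability that at least two of k people share a birthday (365 equally likely days)."""
    p_all_distinct = 1.0
    for i in range(k):
        p_all_distinct *= (365 - i) / 365   # builds 365*364*...*(365-k+1) / 365^k term by term
    return 1.0 - p_all_distinct

for k in (2, 10, 20, 25, 40, 50, 60, 80, 100):
    print(k, round(birthday_probability(k), 4))
```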

(4) Sampling without replacement and without ordering:

Suppose ⁿC_k is the number of ways in which k objects can be chosen out of a set of n objects, where the ordering of the chosen objects is not considered. Note that k objects can be arranged among themselves in k! ways. Therefore, if ordering of the k objects is considered, the number of ways in which k objects can be chosen out of n objects is ⁿC_k × k!. This is the case of sampling with ordering.

∴ ⁿC_k × k! = ⁿP_k = n!/(n − k)!

⇒ ⁿC_k = n! / (k! (n − k)!)

ⁿC_k is also called the binomial coefficient.

Example: An urn contains 6 red balls, 5 green balls and 4 blue balls. 9 balls are picked at random from the urn without replacement. What is the probability that, out of the 9 balls, 4 are red, 3 are green and 2 are blue?

Solution: 9 balls can be picked from a population of 15 balls in ¹⁵C₉ = 15!/(9!6!) ways.

Therefore the required probability is

(⁶C₄ × ⁵C₃ × ⁴C₂) / ¹⁵C₉
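The same counting can be done with math.comb; a short numerical check of the urn example (numbers as given above):

```python
from math import comb

# 4 red out of 6, 3 green out of 5, 2 blue out of 4, with 9 balls drawn from 15.
favourable = comb(6, 4) * comb(5, 3) * comb(4, 2)
total = comb(15, 9)
print(favourable, "/", total, "=", favourable / total)   # 900 / 5005, about 0.18
```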

(5) Arranging n objects into k specific groups:

Suppose we want to partition a set of n distinct elements into k distinct subsets A₁, A₂, ..., A_k of sizes n₁, n₂, ..., n_k respectively, so that n = n₁ + n₂ + ... + n_k. Then the total number of distinct partitions is

n! / (n₁! n₂! ... n_k!)

This can be proved by noting that the resulting number of partitions is

ⁿC(n₁) × ⁿ⁻ⁿ¹C(n₂) × ... × ⁿ⁻ⁿ¹⁻···⁻ⁿ(k−1)C(n_k)
 = [n! / (n₁! (n − n₁)!)] × [(n − n₁)! / (n₂! (n − n₁ − n₂)!)] × ... × [(n − n₁ − ... − n_{k−1})! / n_k!]
 = n! / (n₁! n₂! ... n_k!)

Example: What is the probability that in a throw of 12 dice each face occurs twice?

Solution: The total number of elements in the sample space of the outcomes of a single throw of 12 dice is 6¹². The number of favourable outcomes is the number of ways in which the 12 dice can be arranged in six groups of size 2 each - group 1 consisting of the two dice showing 1, group 2 consisting of the two dice showing 2, and so on.

Therefore, the total number of distinct groups is

12! / (2! 2! 2! 2! 2! 2!)

Hence the required probability is

[12!/(2!)⁶] / 6¹²
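As a quick numerical check of this multinomial count, the sketch below evaluates the same expression (Python standard library only):

```python
from math import factorial

favourable = factorial(12) // factorial(2) ** 6   # 12!/(2!)^6 distinct arrangements
total = 6 ** 12                                   # all outcomes of throwing 12 dice
print(favourable / total)                         # about 0.0034
```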

Conditional probability

Consider the probability space (S, F, P). Let A and B be two events in F. We ask the following question: given that A has occurred, what is the probability of B? The answer is the conditional probability of B given A, denoted by P(B/A). We shall develop the concept of conditional probability and explain under what condition this conditional probability is the same as P(B).

Let us consider the case of equiprobable outcomes discussed earlier. Let N_AB sample points be favourable for the joint event A ∩ B. Clearly,

P(B/A) = (number of outcomes favourable to A and B) / (number of outcomes favourable to A)
       = N_AB / N_A
       = (N_AB / N) / (N_A / N)
       = P(A ∩ B) / P(A)

This suggests how to define conditional probability. The probability of an event B under the condition that another event A has occurred is called the conditional probability of B given A and is defined by

P(B/A) = P(A ∩ B) / P(A),   P(A) ≠ 0

We can similarly define the conditional probability of A given B, denoted by P(A/B).

Notation: P(B/A) = conditional probability of B given A.

From the definition of conditional probability, we have the joint probability P(A ∩ B) of two events A and B as follows:

P(A ∩ B) = P(A) P(B/A) = P(B) P(A/B)

Example 1 Consider the experiment of tossing a fair die. Suppose

A = event of getting an even number = {2, 4, 6}
B = event of getting a number less than 4 = {1, 2, 3}

∴ A ∩ B = {2}
∴ P(B/A) = P(A ∩ B)/P(A) = (1/6)/(3/6) = 1/3

Example 2 A family has two children. It is known that at least one of the children is a girl. What is the probability that both the children are girls?

A = event of at least one girl
B = event of two girls

Clearly S = {gg, gb, bg, bb}, A = {gg, gb, bg}, B = {gg} and A ∩ B = {gg}

∴ P(B/A) = P(A ∩ B)/P(A) = (1/4)/(3/4) = 1/3
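Conditional probabilities on a finite sample space can be checked by direct enumeration; a small sketch for Example 2 (the event encoding is mine, for illustration):

```python
from itertools import product

sample_space = list(product("gb", repeat=2))         # ('g','g'), ('g','b'), ('b','g'), ('b','b')
A = [s for s in sample_space if "g" in s]            # at least one girl
B = [s for s in sample_space if s == ("g", "g")]     # two girls
A_and_B = [s for s in A if s in B]

# P(B/A) = P(A ∩ B)/P(A) with equally likely sample points
print(len(A_and_B) / len(A))                         # 1/3
```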

Conditional probability and the axioms of probability

In the following we show that the conditional probability satisfies the axioms of probability. By definition,

P(B/A) = P(A ∩ B)/P(A),   P(A) ≠ 0

Axiom 1: Since P(A ∩ B) ≥ 0 and P(A) > 0,

∴ P(B/A) = P(A ∩ B)/P(A) ≥ 0

Axiom 2: We have S ∩ A = A

∴ P(S/A) = P(S ∩ A)/P(A) = P(A)/P(A) = 1

Axiom 3: Consider a sequence of disjoint events B₁, B₂, ..., Bₙ, ... We have

(⋃(i=1 to ∞) Bᵢ) ∩ A = ⋃(i=1 to ∞) (Bᵢ ∩ A)

(See the Venn diagram below for an illustration of the finite version of this result.)

Note that the sequence Bᵢ ∩ A, i = 1, 2, ..., is also a sequence of disjoint events.

∴ P(⋃(i=1 to ∞) (Bᵢ ∩ A)) = Σ(i=1 to ∞) P(Bᵢ ∩ A)

∴ P(⋃(i=1 to ∞) Bᵢ / A) = P(⋃(i=1 to ∞) (Bᵢ ∩ A)) / P(A) = Σ(i=1 to ∞) P(Bᵢ ∩ A)/P(A) = Σ(i=1 to ∞) P(Bᵢ/A)

Properties of Conditional Probabilities

(1) If A ⊆ B, then P(B/A) = 1 and P(A/B) ≥ P(A)

We have A ∩ B = A

∴ P(B/A) = P(A ∩ B)/P(A) = P(A)/P(A) = 1

and

P(A/B) = P(A ∩ B)/P(B) = P(A)/P(B) ≥ P(A)   (since P(B) ≤ 1)

(2) Chain rule of probability

P(A₁ ∩ A₂ ∩ ... ∩ Aₙ) = P(A₁) P(A₂/A₁) P(A₃/A₁ ∩ A₂) ... P(Aₙ/A₁ ∩ A₂ ∩ ... ∩ Aₙ₋₁)

We have, for three events,

A ∩ B ∩ C = (A ∩ B) ∩ C
P(A ∩ B ∩ C) = P(A ∩ B) P(C/A ∩ B) = P(A) P(B/A) P(C/A ∩ B)

∴ P(A ∩ B ∩ C) = P(A) P(B/A) P(C/A ∩ B)

We can generalize the above to get the chain rule of probability

P(A₁ ∩ A₂ ∩ ... ∩ Aₙ) = P(A₁) P(A₂/A₁) P(A₃/A₁ ∩ A₂) ... P(Aₙ/A₁ ∩ ... ∩ Aₙ₋₁)

[Fig.: Venn diagram of the events A and B in S]

(3) Theorem of Total Probability: Let A₁, A₂, ..., Aₙ be n events such that S = A₁ ∪ A₂ ∪ ... ∪ Aₙ and Aᵢ ∩ Aⱼ = φ for i ≠ j. Then for any event B,

P(B) = Σ(i=1 to n) P(Aᵢ) P(B/Aᵢ)

Proof: We have B = ⋃(i=1 to n) (B ∩ Aᵢ) and the events B ∩ Aᵢ are disjoint.

∴ P(B) = P(⋃(i=1 to n) (B ∩ Aᵢ))
 = Σ(i=1 to n) P(B ∩ Aᵢ)
 = Σ(i=1 to n) P(Aᵢ) P(B/Aᵢ)

Remark

(1) A decomposition of a set S into two or more disjoint nonempty subsets is called a partition of S. The subsets A₁, A₂, ..., Aₙ form a partition of S if S = A₁ ∪ A₂ ∪ ... ∪ Aₙ and Aᵢ ∩ Aⱼ = φ for i ≠ j.

(2) The theorem of total probability can be used to determine the probability of a complex event in terms of related simpler events. This result will be used in Bayes' theorem, to be discussed at the end of the lecture.

Example 3 Suppose a box contains 2 white and 3 black balls. Two balls are picked at random without replacement. Let A₁ = the event that the first ball is white and A₁ᶜ = the event that the first ball is black. Clearly A₁ and A₁ᶜ form a partition of the sample space corresponding to picking two balls from the box. Let B = the event that the second ball is white. Then

P(B) = P(A₁) P(B/A₁) + P(A₁ᶜ) P(B/A₁ᶜ)
     = (2/5)(1/4) + (3/5)(2/4) = 2/5

[Fig.: a partition A₁, A₂, A₃ of S and an event B intersecting each Aᵢ]
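The total-probability calculation of Example 3 can also be verified by simulating the draws; a minimal sketch (urn contents as in the example, all other names mine):

```python
import random

def second_ball_white(n_trials=200_000, seed=1):
    """Estimate P(second ball is white) when drawing 2 balls without replacement."""
    rng = random.Random(seed)
    count = 0
    for _ in range(n_trials):
        box = ["w", "w", "b", "b", "b"]
        rng.shuffle(box)
        if box[1] == "w":        # the second ball drawn
            count += 1
    return count / n_trials

print(second_ball_white())       # close to 2/5 = 0.4
```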

Independent events

Two events are called independent if the probability of occurrence of one event does not affect the probability of occurrence of the other. Thus the events A and B are independent if

P(B/A) = P(B) and P(A/B) = P(A)

where P(A) and P(B) are assumed to be non-zero. Equivalently, if A and B are independent, we have

P(A ∩ B)/P(A) = P(B)

or

P(A ∩ B) = P(A) P(B)

i.e. the joint probability is the product of the individual probabilities.

Two events A and B are called statistically dependent if they are not independent. Similarly, we can define the independence of n events. The events A₁, A₂, ..., Aₙ are called independent if and only if, for all distinct indices,

P(Aᵢ ∩ Aⱼ) = P(Aᵢ) P(Aⱼ)
P(Aᵢ ∩ Aⱼ ∩ A_k) = P(Aᵢ) P(Aⱼ) P(A_k)
...
P(A₁ ∩ A₂ ∩ ... ∩ Aₙ) = P(A₁) P(A₂) ... P(Aₙ)

Example 4 Consider the example of tossing a fair coin twice. The resulting sample space is S = {HH, HT, TH, TT} and all the outcomes are equiprobable. Let A = {TH, TT} be the event of getting 'tail' in the first toss and B = {TH, HH} be the event of getting 'head' in the second toss. Then P(A) = 1/2 and P(B) = 1/2.

Again, A ∩ B = {TH}, so that

P(A ∩ B) = 1/4 = P(A) P(B)

Hence the events A and B are independent.

Example 5 Consider the experiment of picking two balls at random discussed in Example 3. In this case, P(B) = 2/5 and P(B/A₁) = 1/4. Therefore P(B) ≠ P(B/A₁), and A₁ and B are dependent.

Bayes' Theorem

Suppose A₁, A₂, ..., Aₙ form a partition of S, so that S = A₁ ∪ A₂ ∪ ... ∪ Aₙ and Aᵢ ∩ Aⱼ = φ for i ≠ j. Suppose the event B occurs if one of the events A₁, A₂, ..., Aₙ occurs. Thus we have the information of the probabilities P(Aᵢ) and P(B/Aᵢ), i = 1, 2, ..., n. We ask the following question: given that B has occurred, what is the probability that a particular event A_k has occurred? In other words, what is P(A_k/B)?

We have

P(B) = Σ(i=1 to n) P(Aᵢ) P(B/Aᵢ)   (using the theorem of total probability)

∴ P(A_k/B) = P(A_k ∩ B)/P(B)
 = P(A_k) P(B/A_k) / P(B)
 = P(A_k) P(B/A_k) / Σ(i=1 to n) P(Aᵢ) P(B/Aᵢ)

This result is known as Bayes' theorem. The probability P(A_k) is called the a priori probability and P(A_k/B) is called the a posteriori probability. Thus Bayes' theorem enables us to determine the a posteriori probability P(A_k/B) from the observation that B has occurred. This result is of practical importance and is at the heart of Bayesian classification, Bayesian estimation, etc.

Example 6 In a binary communication system a zero and a one are transmitted with probability 0.6 and 0.4 respectively. Due to error in the communication system a zero becomes a one with probability 0.1 and a one becomes a zero with probability 0.08. Determine the probability (i) of receiving a one and (ii) that a one was transmitted when the received message is a one.

Let S be the sample space corresponding to binary communication. Suppose T₀ is the event of transmitting 0, T₁ is the event of transmitting 1, and R₀ and R₁ are the corresponding events of receiving 0 and 1 respectively.

Given P(T₀) = 0.6, P(T₁) = 0.4, P(R₁/T₀) = 0.1 and P(R₀/T₁) = 0.08.

(i) P(R₁) = probability of receiving a 'one'
 = P(T₁) P(R₁/T₁) + P(T₀) P(R₁/T₀)
 = 0.4 × 0.92 + 0.6 × 0.1
 = 0.428

(ii) Using Bayes' rule,

P(T₁/R₁) = P(T₁) P(R₁/T₁) / P(R₁)
 = P(T₁) P(R₁/T₁) / [P(T₁) P(R₁/T₁) + P(T₀) P(R₁/T₀)]
 = (0.4 × 0.92) / (0.4 × 0.92 + 0.6 × 0.1)
 ≈ 0.86
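The two answers of Example 6 amount to a few lines of arithmetic; the sketch below (probabilities as given in the example) computes P(R₁) by total probability and P(T₁/R₁) by Bayes' theorem:

```python
# Channel description from Example 6
p_T0, p_T1 = 0.6, 0.4
p_R1_given_T0 = 0.1          # a transmitted 0 is received as 1
p_R1_given_T1 = 1 - 0.08     # a transmitted 1 is received as 1

p_R1 = p_T1 * p_R1_given_T1 + p_T0 * p_R1_given_T0   # total probability
p_T1_given_R1 = p_T1 * p_R1_given_T1 / p_R1          # Bayes' theorem

print(round(p_R1, 3), round(p_T1_given_R1, 3))        # 0.428 and about 0.86
```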

Example 7 In an electronics laboratory there are identically looking capacitors of three makes A₁, A₂ and A₃ in the ratio 2:3:4. It is known that 1% of A₁, 1.5% of A₂ and 2% of A₃ are defective. What percentage of capacitors in the laboratory are defective? If a capacitor picked at random is found to be defective, what is the probability that it is of make A₃?

Let D be the event that the item is defective. Here we have to find P(D) and P(A₃/D).

Here P(A₁) = 2/9, P(A₂) = 3/9 = 1/3 and P(A₃) = 4/9.

The conditional probabilities are P(D/A₁) = 0.01, P(D/A₂) = 0.015 and P(D/A₃) = 0.02.

∴ P(D) = P(A₁) P(D/A₁) + P(A₂) P(D/A₂) + P(A₃) P(D/A₃)
 = (2/9) × 0.01 + (1/3) × 0.015 + (4/9) × 0.02
 ≈ 0.0161

and

P(A₃/D) = P(A₃) P(D/A₃) / P(D)
 = (4/9) × 0.02 / 0.0161
 ≈ 0.55

Repeated Trials

In our discussion so far, we considered the probability defined over a sample space corresponding to a single random experiment. Often we have to consider several random experiments in a sequence. For example, the experiment corresponding to sequential transmission of bits through a communication system may be considered as a sequence of experiments, each representing the transmission of a single bit through the channel.

Suppose two experiments E₁ and E₂ with the corresponding sample spaces S₁ and S₂ are performed sequentially. Such a combined experiment is called the product of the two experiments E₁ and E₂. Clearly, the outcome of this combined experiment is the ordered pair (s₁, s₂), where s₁ ∈ S₁ and s₂ ∈ S₂. The sample space corresponding to the combined experiment is given by S = S₁ × S₂. The events in S consist of all the Cartesian products of the form A₁ × A₂, where A₁ is an event in S₁ and A₂ is an event in S₂. Our aim is to define the probability P(A₁ × A₂). The sample space S₁ × S₂ and the events A₁ × S₂, S₁ × A₂ and A₁ × A₂ are illustrated in the figure below.

We can easily show that

P(S₁ × A₂) = P₂(A₂) and P(A₁ × S₂) = P₁(A₁)

where Pᵢ is the probability defined on the events of Sᵢ, i = 1, 2. This is because the event A₁ × S₂ occurs whenever A₁ occurs in S₁, irrespective of the event in S₂.

Also note that

A₁ × A₂ = (A₁ × S₂) ∩ (S₁ × A₂)
∴ P(A₁ × A₂) = P[(A₁ × S₂) ∩ (S₁ × A₂)]

[Fig.: the rectangle S₁ × S₂ with the strips A₁ × S₂ and S₁ × A₂ and their intersection A₁ × A₂]

Independent Experiments: In many experiments, the events A₁ × S₂ and S₁ × A₂ are independent for every selection of events A₁ in S₁ and A₂ in S₂. Such experiments are called independent experiments. In this case we can write

P(A₁ × A₂) = P[(A₁ × S₂) ∩ (S₁ × A₂)]
 = P(A₁ × S₂) P(S₁ × A₂)
 = P₁(A₁) P₂(A₂)

Example 1 Suppose S₁ is the sample space of the experiment of rolling a six-faced fair die and S₂ is the sample space of the experiment of tossing a fair coin. Clearly,

S₁ = {1, 2, 3, 4, 5, 6}, A₁ = {2, 3} and
S₂ = {H, T}, A₂ = {H}

S₁ × S₂ = {(1, H), (1, T), (2, H), (2, T), (3, H), (3, T), (4, H), (4, T), (5, H), (5, T), (6, H), (6, T)}

and

P(A₁ × A₂) = (2/6) × (1/2) = 1/6

Example 2 In a digital communication system transmitting 1 and 0, a 1 is transmitted twice as often as a 0. If two bits are transmitted in a sequence, what is the probability that both the bits will be 1?

S₁ = {0, 1}, A₁ = {1} and
S₂ = {0, 1}, A₂ = {1}

S₁ × S₂ = {(0, 0), (0, 1), (1, 0), (1, 1)}

and

P(A₁ × A₂) = P₁(A₁) P₂(A₂) = (2/3) × (2/3) = 4/9

Generalization

We can similarly define the sample space S = S₁ × S₂ × ... × Sₙ corresponding to n experiments, and the Cartesian product of events A₁ × A₂ × ... × Aₙ = {(s₁, s₂, ..., sₙ) | s₁ ∈ A₁, s₂ ∈ A₂, ..., sₙ ∈ Aₙ}.

If the experiments are independent, we can write

P(A₁ × A₂ × ... × Aₙ) = P₁(A₁) P₂(A₂) ... Pₙ(Aₙ)

where Pᵢ is the probability defined on the events of Sᵢ.

Bernoulli trial

Suppose in an experiment we are only concerned with whether a particular event A has occurred or not. We call this event the 'success', with probability P(A) = p, and the complementary event Aᶜ the 'failure', with probability P(Aᶜ) = 1 − p. Such a random experiment is called a Bernoulli trial.

Binomial Law: We are interested in finding the probability of k 'successes' in n independent Bernoulli trials. This probability pₙ(k) is given by

pₙ(k) = ⁿC_k pᵏ (1 − p)ⁿ⁻ᵏ

Consider n independent repetitions of the Bernoulli trial. Let S₁ be the sample space associated with each trial, and suppose we are interested in a particular event A in S₁ and its complement Aᶜ, with P(A) = p and P(Aᶜ) = 1 − p. If A occurs in a trial, then we have a 'success', otherwise a 'failure'. Thus the sample space corresponding to the n repeated trials is S = S₁ × S₂ × ... × Sₙ. Any event in S is of the form A₁ × A₂ × ... × Aₙ, where some of the Aᵢ are A and the remaining Aᵢ are Aᶜ. Using the property of independent experiments we have

P(A₁ × A₂ × ... × Aₙ) = P(A₁) P(A₂) ... P(Aₙ)

If k of the Aᵢ are A and the remaining n − k of the Aᵢ are Aᶜ, then

P(A₁ × A₂ × ... × Aₙ) = pᵏ (1 − p)ⁿ⁻ᵏ

But the number of events in S with k of the factors equal to A and n − k equal to Aᶜ is ⁿC_k. For example, if n = 4 and k = 2, the possible events are

A × A × Aᶜ × Aᶜ
A × Aᶜ × A × Aᶜ
A × Aᶜ × Aᶜ × A
Aᶜ × A × A × Aᶜ
Aᶜ × A × Aᶜ × A
Aᶜ × Aᶜ × A × A

We also note that all these ⁿC_k events are mutually exclusive. Hence the probability of k successes in n independent repetitions of the Bernoulli trial is given by

pₙ(k) = ⁿC_k pᵏ (1 − p)ⁿ⁻ᵏ

Example 1: A fair die is rolled 6 times. What is the probability that a '4' appears four times?

Solution: We have S₁ = {1, 2, 3, 4, 5, 6},

A = {4} with P(A) = p = 1/6
Aᶜ = {1, 2, 3, 5, 6} with P(Aᶜ) = 1 − p = 5/6

∴ p₆(4) = ⁶C₄ (1/6)⁴ (5/6)² = (6 × 5/2) × (1/6)⁴ × (5/6)² ≈ 0.008

Example 2: A communication source emits binary symbols 1 and 0 with probability 0.6 and 0.4 respectively. What is the probability that there will be five 1s in a message of 20 symbols?

Solution: S₁ = {0, 1}, A = {1}, P(A) = p = 0.6

∴ p₂₀(5) = ²⁰C₅ (0.6)⁵ (0.4)¹⁵ ≈ 0.0013

Example 3 In a binary communication system, a bit error occurs with probability 10⁻⁵. What is the probability of getting at least one error bit in a message of 8 bits?

Here we can consider the sample space S₁ = {error in transmission of 1 bit, no error in transmission of 1 bit}, with p = P(error in transmission of 1 bit) = 10⁻⁵.

Probability of no bit error in the transmission of 8 bits = p₈(0) = (1 − 10⁻⁵)⁸ ≈ 0.9999
∴ Probability of at least one bit error in the transmission of 8 bits = 1 − p₈(0) ≈ 0.0001

Typical plots of the binomial probabilities are shown in the figure.

Approximations of the binomial probabilities

Two interesting approximations of the binomial probabilities are very important.

Case 1 Suppose n is very large and p is very small, with np = λ a constant.
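The binomial probabilities in Examples 1-3 can be evaluated with a one-line pmf; a minimal sketch (function name mine):

```python
from math import comb

def binomial_pmf(n, k, p):
    """P(k successes in n independent Bernoulli trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

print(round(binomial_pmf(6, 4, 1/6), 4))       # Example 1: about 0.0080
print(round(binomial_pmf(20, 5, 0.6), 4))      # Example 2: about 0.0013
print(round(1 - binomial_pmf(8, 0, 1e-5), 6))  # Example 3: about 0.00008
```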

pₙ(k) = ⁿC_k pᵏ (1 − p)ⁿ⁻ᵏ
 = ⁿC_k (λ/n)ᵏ (1 − λ/n)ⁿ⁻ᵏ
 = [n(n − 1)...(n − k + 1)/k!] (λᵏ/nᵏ) (1 − λ/n)ⁿ / (1 − λ/n)ᵏ
 = (λᵏ/k!) (1 − 1/n)(1 − 2/n)...(1 − (k − 1)/n) (1 − λ/n)ⁿ / (1 − λ/n)ᵏ

Since

lim(n → ∞) (1 − λ/n)ᵏ = 1 and lim(n → ∞) (1 − λ/n)ⁿ = e^(−λ)

we get

p(k) = (λᵏ/k!) e^(−λ)

This distribution is known as the Poisson probability and is widely used in engineering and other fields. We shall discuss more about this distribution in a later class.
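The quality of the Poisson approximation is easy to check numerically; the sketch below compares the exact binomial pmf with λᵏ e^(−λ)/k! for a large n and small p (the particular n and p are illustrative choices, not from the notes):

```python
from math import comb, exp, factorial

n, p = 1000, 0.003            # large n, small p; lambda = np = 3
lam = n * p

for k in range(6):
    binom = comb(n, k) * p**k * (1 - p)**(n - k)
    poisson = lam**k * exp(-lam) / factorial(k)
    print(k, round(binom, 5), round(poisson, 5))   # the two columns agree closely
```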

Case 2 When n is sufficiently large and np(1 − p) ≫ 1, pₙ(k) may be approximated as

pₙ(k) ≈ [1/√(2πnp(1 − p))] e^(−(k − np)²/(2np(1 − p)))

The right-hand side is an expression for the normal (Gaussian) distribution, to be discussed in a later class.

Example: Consider the problem in Example 2. Here n = 20, p = 0.6 and k = 5, so

p₂₀(5) ≈ [1/√(2π × 20 × 0.6 × 0.4)] e^(−(5 − 12)²/(2 × 20 × 0.6 × 0.4)) ≈ 0.0011

Random Variables

In applications of probability we are often concerned with numerical quantities which are random in nature. These random quantities may be considered as real-valued functions on the sample space. Such a real-valued function is called a real random variable and plays an important role in describing random data. We shall introduce the concept of random variables in the following sections.

Mathematical Preliminaries

Real-valued point function on a set

Recall that a real-valued function f: S → ℝ maps each element s ∈ S to a unique element f(s) ∈ ℝ. The set S is called the domain of f and the set R_f = {f(x) | x ∈ S} is called the range of f. Clearly R_f ⊆ ℝ. The range and domain of f are shown in the figure.

[Fig.: the domain S = {s₁, s₂, s₃, s₄} mapped by f onto its range R_f ⊆ ℝ]

Image and Inverse image

For a point s ∈ S, the functional value f(s) ∈ ℝ is called the image of the point s. If A ⊆ S, then the set of the images of the elements of A is called the image of A and is denoted by f(A). Thus

f(A) = {f(s) | s ∈ A}

Clearly f(A) ⊆ R_f. Suppose B ⊆ ℝ. The set {x | f(x) ∈ B} is called the inverse image of B under f and is denoted by f⁻¹(B).

Example Suppose S = {H, T} and f: S → ℝ is defined by f(H) = 1 and f(T) = −1. Therefore,

• R_f = {1, −1} ⊆ ℝ
• The image of H is 1 and that of T is −1.
• For a subset of ℝ, say B₁ = (−∞, 1.5], f⁻¹(B₁) = {s | f(s) ∈ B₁} = {H, T}. For another subset B₂ = [5, 9], f⁻¹(B₂) = φ.

Random variable

A random variable associates the points in the sample space with real numbers. Consider the probability space (S, F, P) and a function X: S → ℝ mapping the sample space into the real line. Let us define the probability of a subset B ⊆ ℝ by

P_X(B) = P(X⁻¹(B)) = P({s | X(s) ∈ B})

Such a definition will be valid if X⁻¹(B) is a valid event. If S is a discrete sample space, X⁻¹(B) is always a valid event, but the same may not be true if S is infinite. The concept of sigma algebra is again necessary to overcome this difficulty. We also need the Borel sigma algebra B, the sigma algebra defined on the real line.

The function X: S → ℝ is called a random variable if the inverse image under X of every Borel set is an event. Thus, if X is a random variable, then

X⁻¹(B) = {s | X(s) ∈ B} ∈ F

[Fig.: the random variable X maps S into ℝ; A = X⁻¹(B) is an event in F for every Borel set B]

Remark

• S is the domain of X.
• The range of X, denoted by R_X, is given by R_X = {X(s) | s ∈ S}. Clearly R_X ⊆ ℝ.
• The above definition of the random variable requires that the mapping X be such that X⁻¹(B) is a valid event in S. If S is a discrete sample space, this requirement is met by any mapping X: S → ℝ. Thus any mapping defined on a discrete sample space is a random variable.

Notations:

• Random variables are represented by upper-case letters.
• Values of a random variable are denoted by lower-case letters.
• X(s) = x means that x is the value of the random variable X at the sample point s. Usually the argument s is omitted and we simply write X = x.

Example 1: Consider the example of tossing a fair coin twice. The sample space is S = {HH, HT, TH, TT} and all four outcomes are equally likely. Then we can define a random variable X as follows:

Sample point    Value of the random variable X = x
HH              0
HT              1
TH              2
TT              3

Here R_X = {0, 1, 2, 3}.

Example 2: Consider the sample space associated with a single toss of a fair die. The sample space is given by S = {1, 2, 3, 4, 5, 6}. If we define the random variable X that associates a real number equal to the number on the face of the die, then X takes the values {1, 2, 3, 4, 5, 6}.

Probability Space induced by a Random Variable

The random variable X induces a probability measure P_X on B defined by

P_X(B) = P(X⁻¹(B)) = P({s | X(s) ∈ B})

The probability measure P_X satisfies the three axioms of probability:

Axiom 1: 0 ≤ P_X(B) = P(X⁻¹(B)) ≤ 1

Axiom 2: P_X(ℝ) = P(X⁻¹(ℝ)) = P(S) = 1

Axiom 3: Suppose B₁, B₂, ... are disjoint Borel sets. Then X⁻¹(B₁), X⁻¹(B₂), ... are disjoint events in F. Therefore,

P_X(⋃(i=1 to ∞) Bᵢ) = P(X⁻¹(⋃(i=1 to ∞) Bᵢ))
 = P(⋃(i=1 to ∞) X⁻¹(Bᵢ))
 = Σ(i=1 to ∞) P(X⁻¹(Bᵢ))
 = Σ(i=1 to ∞) P_X(Bᵢ)

Thus the random variable X induces a probability space (ℝ, B, P_X).

Probability Distribution Function

We have seen that the event B and {s | X(s) ∈ B} are equivalent, and P_X(B) = P({s | X(s) ∈ B}). The underlying sample space is omitted in notation and we simply write {X ∈ B} and P(X ∈ B) instead of {s | X(s) ∈ B} and P({s | X(s) ∈ B}) respectively.

Consider the Borel set (−∞, x], where x represents any real number. The equivalent event X⁻¹((−∞, x]) = {s | X(s) ≤ x, s ∈ S} is denoted by {X ≤ x}. The event {X ≤ x} can be taken as a representative event in studying the probability description of a random variable X. Any other event can be represented in terms of this event. For example,

{X > x} = {X ≤ x}ᶜ
{x₁ < X ≤ x₂} = {X ≤ x₂} \ {X ≤ x₁}
{X = x} = ⋂(n=1 to ∞) ({X ≤ x} \ {X ≤ x − 1/n})

and so on.

The probability P(X ≤ x) = P({s | X(s) ≤ x, s ∈ S}) is called the probability distribution function (also called the cumulative distribution function, abbreviated CDF) of X and is denoted by F_X(x). Thus

F_X(x) = P(X ≤ x)

(In F_X(x) the subscript X names the random variable and the argument x is the value of the random variable.)

Example 3 Consider the random variable X in Example 1. We have

Value of the random variable X = x     P(X = x)
0                                      1/4
1                                      1/4
2                                      1/4
3                                      1/4

For x < 0,
F_X(x) = P(X ≤ x) = 0

For 0 ≤ x < 1,
F_X(x) = P(X ≤ x) = P(X = 0) = 1/4

For 1 ≤ x < 2,
F_X(x) = P(X ≤ x) = P({X = 0} ∪ {X = 1}) = P(X = 0) + P(X = 1) = 1/4 + 1/4 = 1/2

For 2 ≤ x < 3,
F_X(x) = P(X ≤ x) = P({X = 0} ∪ {X = 1} ∪ {X = 2}) = P(X = 0) + P(X = 1) + P(X = 2) = 1/4 + 1/4 + 1/4 = 3/4

For x ≥ 3,
F_X(x) = P(X ≤ x) = P(S) = 1

Properties of the Distribution Function

• 0 ≤ F_X(x) ≤ 1

  This follows from the fact that F_X(x) is a probability and its value should lie between 0 and 1.

• F_X(x) is a non-decreasing function of x. Thus, if x₁ < x₂, then F_X(x₁) ≤ F_X(x₂).

  x₁ < x₂
  ⇒ {X(s) ≤ x₁} ⊆ {X(s) ≤ x₂}
  ⇒ P(X(s) ≤ x₁) ≤ P(X(s) ≤ x₂)
  ∴ F_X(x₁) ≤ F_X(x₂)

• F_X(x) is right continuous:

  F_X(x⁺) = lim(h → 0, h > 0) F_X(x + h) = F_X(x)

  because lim(h → 0, h > 0) F_X(x + h) = lim(h → 0, h > 0) P(X(s) ≤ x + h) = P(X(s) ≤ x) = F_X(x).

  (A real function f(x) is said to be continuous at a point a if and only if (i) f(a) is defined and (ii) lim(x → a⁺) f(x) = lim(x → a⁻) f(x) = f(a). The function f(x) is said to be right-continuous at a point a if and only if f(a) is defined and lim(x → a⁺) f(x) = f(a).)

• F_X(−∞) = 0

  because F_X(−∞) = P({s | X(s) ≤ −∞}) = P(φ) = 0

• F_X(∞) = 1

  because F_X(∞) = P({s | X(s) ≤ ∞}) = P(S) = 1

• P(x₁ < X ≤ x₂) = F_X(x₂) − F_X(x₁)

  We have {X ≤ x₂} = {X ≤ x₁} ∪ {x₁ < X ≤ x₂}
  ∴ P(X ≤ x₂) = P(X ≤ x₁) + P(x₁ < X ≤ x₂)
  ⇒ P(x₁ < X ≤ x₂) = P(X ≤ x₂) − P(X ≤ x₁) = F_X(x₂) − F_X(x₁)

• F_X(x⁻) = F_X(x) − P(X = x)

  F_X(x⁻) = lim(h → 0, h > 0) F_X(x − h)
   = lim(h → 0, h > 0) P(X(s) ≤ x − h)
   = P(X(s) ≤ x) − P(X(s) = x)
   = F_X(x) − P(X = x)

We can further establish the following results on probabilities of events on the real line:

P(x₁ ≤ X ≤ x₂) = F_X(x₂) − F_X(x₁) + P(X = x₁)
P(x₁ ≤ X < x₂) = F_X(x₂) − F_X(x₁) + P(X = x₁) − P(X = x₂)
P(X > x) = P(x < X < ∞) = 1 − F_X(x)

Thus we have seen that, given F_X(x) for −∞ < x < ∞, we can determine the probability of any event involving values of the random variable X. Thus F_X(x), for all x ∈ ℝ, is a complete description of the random variable X.

Example 4 Consider the random variable X defined by

F_X(x) = 0,              x < −2
       = (1/8)x + 1/4,   −2 ≤ x < 0
       = 1,              x ≥ 0

Find a) P(X = 0)  b) P(X ≤ 0)  c) P(X > 2)  d) P(−1 < X ≤ 1)

Solution:

a) P(X = 0) = F_X(0⁺) − F_X(0⁻) = 1 − 1/4 = 3/4
b) P(X ≤ 0) = F_X(0) = 1
c) P(X > 2) = 1 − F_X(2) = 1 − 1 = 0
d) P(−1 < X ≤ 1) = F_X(1) − F_X(−1) = 1 − 1/8 = 7/8

[Fig.: plot of F_X(x) for Example 4]

Conditional Distribution and Density Functions

We discussed conditional probability in an earlier class. For two events A and B with P(B) ≠ 0, the conditional probability P(A/B) was defined as

P(A/B) = P(A ∩ B)/P(B)

Clearly, the conditional probability can be defined for events involving a random variable X. Consider the event {X ≤ x} and any event B involving the random variable X. The conditional distribution function of X given B is defined as

F_X(x/B) = P({X ≤ x}/B) = P({X ≤ x} ∩ B)/P(B),   P(B) ≠ 0

We can verify that F_X(x/B) satisfies all the properties of the distribution function. In a similar manner, we can define the conditional density function f_X(x/B) of the random variable X given the event B as

f_X(x/B) = (d/dx) F_X(x/B)

Example 1: Suppose X is a random variable with the distribution function F_X(x). Define B = {X ≤ b}. Then

F_X(x/B) = P({X ≤ x} ∩ B)/P(B)
 = P({X ≤ x} ∩ {X ≤ b})/P(X ≤ b)
 = P({X ≤ x} ∩ {X ≤ b})/F_X(b)

Case 1: x < b. Then

F_X(x/B) = P({X ≤ x} ∩ {X ≤ b})/F_X(b) = P(X ≤ x)/F_X(b) = F_X(x)/F_X(b)

and

f_X(x/B) = (d/dx)[F_X(x)/F_X(b)] = f_X(x)/F_X(b)

Case 2: x ≥ b. Then {X ≤ x} ∩ {X ≤ b} = {X ≤ b}, so

F_X(x/B) = P(X ≤ b)/F_X(b) = F_X(b)/F_X(b) = 1

and

f_X(x/B) = (d/dx) F_X(x/B) = 0

F_X(x/B) and f_X(x/B) are plotted in the figures.

Remark: We can obtain a Bayes-like rule in a similar manner. Suppose the event {X ≤ x} is partitioned into non-overlapping subsets B₁, B₂, ..., Bₙ such that {X ≤ x} = ⋃(i=1 to n) Bᵢ. Then

F_X(x) = Σ(i=1 to n) P(Bᵢ) F_X(x/Bᵢ)

∴ P(B/{X ≤ x}) = P(B ∩ {X ≤ x})/F_X(x) = P(B) F_X(x/B) / Σ(i=1 to n) P(Bᵢ) F_X(x/Bᵢ)

Mixed-Type Random Variable

A random variable X is said to be of mixed type if its distribution function F_X(x) has jump discontinuities at a finite number of points and increases strictly with respect to x over at least one interval of values of the random variable X.

Thus for a mixed-type random variable X, F_X(x) has discontinuities, but it is not of the staircase type as in the case of a discrete random variable. A typical plot of the distribution function of a mixed-type random variable is shown in the figure.

Suppose S_D denotes the countable subset of points of S_X on which the RV X is characterized by the probability mass function p_X(x), x ∈ S_D. Similarly, let S_C be a continuous subset of points of S_X on which the RV is characterized by the probability density function f_X(x), x ∈ S_C. Clearly the subsets S_D and S_C partition the set S_X. If P(S_D) = p, then P(S_C) = 1 − p. Thus the probability of the event {X ≤ x} can be expressed as

P(X ≤ x) = P(S_D) P(X ≤ x / S_D) + P(S_C) P(X ≤ x / S_C)

F_X(x) = p F_D(x) + (1 − p) F_C(x)

where F_D(x) is the conditional distribution function of X given that X is discrete and F_C(x) is the conditional distribution function of X given that X is continuous. Clearly F_D(x) is a staircase function and F_C(x) is a continuous function.

Also note that

p = P(S_D) = Σ(x ∈ S_D) p_X(x)

Example: Consider the distribution function of a random variable X given by

F_X(x) = 0,                     x < 0
       = 1/4 + x/16,            0 ≤ x < 4
       = 3/4 + (1/16)(x − 4),   4 ≤ x ≤ 8
       = 1,                     x > 8

Express F_X(x) as the distribution function of a mixed-type random variable.

Solution: The distribution function is as shown in the figure. Clearly F_X(x) has jumps at x = 0 and x = 4.

∴ p = p_X(0) + p_X(4) = 1/4 + 1/4 = 1/2

and

F_D(x) = [p_X(0) u(x) + p_X(4) u(x − 4)]/p = (1/2) u(x) + (1/2) u(x − 4)

F_C(x) = 0,    x < 0
       = x/8,  0 ≤ x ≤ 8
       = 1,    x > 8

Example 2: X is the RV representing the life time of a device with the pdf f_X(x) for x > 0. Define the following random variable:

Y = X  if X ≤ a
  = a  if X > a

Here S_D = {a}, S_C = (0, a),

p = P(Y ∈ S_D) = P(X > a) = 1 − F_X(a)

∴ F_Y(x) = p F_D(x) + (1 − p) F_C(x)

Discrete, Continuous and Mixed-type Random Variables

• A random variable X is called a discrete random variable if F_X(x) is piece-wise constant. Thus F_X(x) is flat except at the points of jump discontinuity. If the sample space S is discrete, the random variable X defined on it is always discrete.

• X is called a continuous random variable if F_X(x) is an absolutely continuous function of x. Thus F_X(x) is continuous everywhere on ℝ and F_X'(x) exists everywhere except at a finite or countably infinite number of points.

• X is called a mixed random variable if F_X(x) has jump discontinuities at a countable number of points and increases continuously over at least one interval of values of x. For such an RV X,

F_X(x) = p F_X^d(x) + (1 − p) F_X^c(x)

where F_X^d(x) is the distribution function of a discrete RV and F_X^c(x) is the distribution function of a continuous RV.

Typical plots of F_X(x) for discrete, continuous and mixed-type random variables are shown in the figures.

[Fig.: plots of F_X(x) vs. x for a discrete, a continuous and a mixed-type random variable]

Discrete Random Variables and Probability Mass Functions

A random variable is said to be discrete if the number of elements in the range R_X is finite or countably infinite. Examples 1 and 2 above are discrete random variables. Assume R_X to be countably finite and let x₁, x₂, x₃, ..., x_N be the elements in R_X. Here the mapping X(s) partitions S into N subsets {s | X(s) = xᵢ}, i = 1, 2, ..., N.

[Fig.: a discrete random variable mapping the sample points s₁, ..., s₄ to the values X(s₁), ..., X(s₄)]

The discrete random variable in this case is completely specified by the probability mass function (pmf)

p_X(xᵢ) = P({s | X(s) = xᵢ}),  i = 1, 2, ..., N.

Clearly,

• p_X(xᵢ) ≥ 0 for all xᵢ ∈ R_X
• Σ(xᵢ ∈ R_X) p_X(xᵢ) = 1
• Suppose D ⊆ R_X. Then

  P(x ∈ D) = Σ(xᵢ ∈ D) p_X(xᵢ)

Example Consider the random variable X with the distribution function

F_X(x) = 0,    x < 0
       = 1/4,  0 ≤ x < 1
       = 1/2,  1 ≤ x < 2
       = 1,    x ≥ 2

The plot of F_X(x) is a staircase with steps of height 1/4, 1/4 and 1/2 at x = 0, 1 and 2.

The probability mass function of the random variable is given by

Value of the random variable X = x     p_X(x)
0                                      1/4
1                                      1/4
2                                      1/2

We shall describe some useful discrete probability mass functions in a later class.

Continuous Random Variables and Probability Density Functions

For a continuous random variable X, F_X(x) is continuous everywhere. Therefore, F_X(x) = F_X(x⁻) for all x ∈ ℝ. This implies that

p_X(x) = P(X = x) = F_X(x) − F_X(x⁻) = 0

Therefore, the probability mass function of a continuous RV X is zero for all x. A continuous random variable cannot be characterized by a probability mass function. A continuous random variable has a very important characterization in terms of a function called the probability density function.

If F_X(x) is differentiable, the probability density function (pdf) of X, denoted by f_X(x), is defined as

f_X(x) = (d/dx) F_X(x)

Interpretation of f_X(x):

f_X(x) = (d/dx) F_X(x)
 = lim(Δx → 0) [F_X(x + Δx) − F_X(x)]/Δx
 = lim(Δx → 0) P(x < X ≤ x + Δx)/Δx

so that

P(x < X ≤ x + Δx) ≈ f_X(x) Δx

Thus the probability of X lying in the small interval (x, x + Δx] is determined by f_X(x). In that sense, f_X(x) represents the concentration of probability just as the density represents the concentration of mass.

Properties of the Probability Density Function

• f_X(x) ≥ 0.
  This follows from the fact that F_X(x) is a non-decreasing function.

• F_X(x) = ∫(−∞ to x) f_X(u) du

• ∫(−∞ to ∞) f_X(x) dx = 1

• P(x₁ < X ≤ x₂) = ∫(x₁ to x₂) f_X(x) dx

[Fig.: P(x₀ < X ≤ x₀ + Δx₀) ≈ f_X(x₀) Δx₀]

Example Consider the random variable X with the distribution function

F_X(x) = 0,             x < 0
       = 1 − e^(−ax),   x ≥ 0, a > 0

The pdf of the RV is given by

f_X(x) = 0,          x < 0
       = a e^(−ax),  x ≥ 0

Remark: Using the Dirac delta function we can define the density function for a discrete random variable.

Consider the random variable X defined by the probability mass function (pmf) p_X(xᵢ) = P({s | X(s) = xᵢ}), i = 1, 2, ..., N. The distribution function F_X(x) can be written as

F_X(x) = Σ(i=1 to N) p_X(xᵢ) u(x − xᵢ)

where u(x − xᵢ) is the shifted unit-step function given by

u(x − xᵢ) = 1  for x ≥ xᵢ
          = 0  otherwise

Then the density function f_X(x) can be written in terms of the Dirac delta function as

f_X(x) = Σ(i=1 to N) p_X(xᵢ) δ(x − xᵢ)

Example Consider the random variable defined in Example 1 and Example 3. The distribution function F_X(x) can be written as

F_X(x) = (1/4) u(x) + (1/4) u(x − 1) + (1/2) u(x − 2)

and

f_X(x) = (1/4) δ(x) + (1/4) δ(x − 1) + (1/2) δ(x − 2)

Probability Density Function of a Mixed-type Random Variable

Suppose X is a mixed-type random variable with F_X(x) having jump discontinuities at X = xᵢ, i = 1, 2, ..., n. As already stated, the CDF of a mixed-type random variable X is given by

F_X(x) = p F_X^d(x) + (1 − p) F_X^c(x)

where F_X^d(x) is the distribution function of a discrete RV and F_X^c(x) is the distribution function of a continuous RV. Therefore,

f_X(x) = p f_X^d(x) + (1 − p) f_X^c(x)

where

f_X^d(x) = Σ(i=1 to n) p_X(xᵢ) δ(x − xᵢ)

Example Consider the random variable X with the distribution function

F_X(x) = 0,            x < 0
       = 0.1,          x = 0
       = 0.1 + 0.8x,   0 < x < 1
       = 1,            x ≥ 1

The plot of F_X(x) is shown in the figure.

F_X(x) can be expressed as

F_X(x) = 0.2 F_X^d(x) + 0.8 F_X^c(x)

where

F_X^d(x) = 0,    x < 0
         = 0.5,  0 ≤ x < 1
         = 1,    x ≥ 1

and

F_X^c(x) = 0,  x < 0
         = x,  0 ≤ x ≤ 1
         = 1,  x > 1

The pdf is given by

f_X(x) = 0.2 f_X^d(x) + 0.8 f_X^c(x)

where

f_X^d(x) = 0.5 δ(x) + 0.5 δ(x − 1)

and

f_X^c(x) = 1,  0 ≤ x ≤ 1
         = 0,  elsewhere

[Fig.: plot of f_X(x)]

Functions of Random Variables

Often we have to consider random variables which are functions of other random variables. Let X be a random variable and g(·) a function. Then Y = g(X) is a random variable. We are interested in finding the pdf of Y. For example, suppose X represents the random voltage input to a full-wave rectifier. Then the rectifier output is given by Y = |X|. We have to find the probability description of the random variable Y. We consider the following cases:

(a) X is a discrete random variable with probability mass function p_X(x).

The probability mass function of Y is given by

p_Y(y) = P(Y = y)
       = P({x | g(x) = y})
       = Σ(x: g(x) = y) P(X = x)
       = Σ(x: g(x) = y) p_X(x)

(b) X is a continuous random variable with probability density function f_X(x), and y = g(x) is one-to-one and monotonically increasing.

The probability distribution function of Y is given by

F_Y(y) = P(Y ≤ y)
 = P(g(X) ≤ y)
 = P(X ≤ g⁻¹(y))
 = F_X(x) evaluated at x = g⁻¹(y)

∴ f_Y(y) = (d/dy) F_Y(y)
 = (d/dy) F_X(x), at x = g⁻¹(y)
 = (dF_X(x)/dx)(dx/dy), at x = g⁻¹(y)
 = f_X(x)/(dy/dx), at x = g⁻¹(y)

∴ f_Y(y) = [f_X(x)/g'(x)] evaluated at x = g⁻¹(y)

This is illustrated in the figure.

Example 1: Probability density function of a linear function of a random variable

Suppose Y = aX + b, a > 0. Then x = (y − b)/a and dy/dx = a.

∴ f_Y(y) = f_X(x)/(dy/dx) = f_X((y − b)/a)/a

Example 2: Probability density function of the distribution function of a random variable

Suppose the distribution function F_X(x) of a continuous random variable X is monotonically increasing and one-to-one, and define the random variable Y = F_X(X). Then f_Y(y) = 1 for 0 ≤ y ≤ 1.

Clearly, for y = F_X(x) and 0 ≤ y ≤ 1,

dy/dx = (d/dx) F_X(x) = f_X(x)

∴ f_Y(y) = f_X(x)/(dy/dx) = f_X(x)/f_X(x) = 1

∴ f_Y(y) = 1,  0 ≤ y ≤ 1.
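Read in the other direction, this result is the basis of inverse-transform sampling: applying F_X⁻¹ to a uniform random number produces a sample with CDF F_X, which is the simulation use anticipated in the remark below. A minimal sketch for the exponential CDF F_X(x) = 1 − e^(−ax) (the distribution choice and the parameter value are mine, for illustration):

```python
import math
import random

def sample_exponential(a, rng):
    """Inverse-transform sampling: solve F_X(x) = 1 - exp(-a*x) = u for x."""
    u = rng.random()                  # U uniform on [0, 1)
    return -math.log(1.0 - u) / a     # x = F_X^{-1}(u)

rng = random.Random(1)
samples = [sample_exponential(a=2.0, rng=rng) for _ in range(100_000)]
print(sum(samples) / len(samples))    # sample mean, close to 1/a = 0.5
```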

Remark

(1) The distribution given by f_Y(y) = 1, 0 ≤ y ≤ 1, is called the uniform distribution over the interval [0, 1].

(2) The above result is particularly important in simulating a random variable with a particular distribution function. We assumed F_X(x) to be a one-to-one function for invertibility. However, the result is more general: the random variable defined by the distribution function of any random variable is uniformly distributed over [0, 1]. For example, if X is a discrete RV,

F_Y(y) = P(Y ≤ y)
 = P(F_X(X) ≤ y)
 = P(X ≤ F_X⁻¹(y))
 = F_X(F_X⁻¹(y))
 = y

(assigning F_X⁻¹(y) to the left-most point of the interval for which F_X(x) = y).

∴ f_Y(y) = (d/dy) F_Y(y) = 1,  0 ≤ y ≤ 1.

[Fig.: the transformation Y = F_X(X); the value y corresponds to x = F_X⁻¹(y)]

(c) X is a continuous random variable with probability density function f_X(x), and y = g(x) has multiple solutions for x.

Suppose for y ∈ R_Y the equation y = g(x) has n solutions xᵢ, i = 1, 2, 3, ..., n. Then

Page 53: 35229433 Random Variale Random Process

1

( )( )

i

nX

Yi

x x

f xf ydydx

=

=

=∑

Proof: Consider the plot of . Suppose at a point( )Y g X= ( )y g x= , we have three distinct roots as shown. Consider the event y Y y dy< ≤ + . This event will be equivalent to union events 1 1 1 2 2 2 3 3, and 3x X x dx x dx X x x X x dx< ≤ + − < ≤ < ≤ +

1 1 1 2 2 2 3 3P y Y y dy P x X x dx P x dx X x P x X x dx3∴ < ≤ + = < ≤ + + − < ≤ + < ≤ +

1 1 2 2 3( ) ( ) ( )( ) ( )Y X X X 3f y dy f x dx f x dx f x dx∴ = + − + Where the negative sign in is used to account for positive probability. 2dx− Therefore, dividing by dy and taking the limit, we get

Page 54: 35229433 Random Variale Random Process

31 21 2 3

31 21 2 3

3

1

( ) ( ) ( ) ( )

( ) ( ) ( )

( )

i

Y X X X

X X X

X i

i

x x

dxdx dxf y f x f x f xdy dy dy

dxdx dxf x f x f xdy dy dy

f xdydx

=

=

⎛ ⎞ ⎛ ⎞ ⎛= + − +⎜ ⎟ ⎜ ⎟ ⎜

⎝ ⎠ ⎝ ⎠ ⎝

= + +

=∑

⎞⎟⎠

In the above we assumed to have three roots. In general, if has n roots, then

( )y g x= ( )y g x=

1

( )( )

i

nX i

Yi

x x

f xf ydydx

=

=

=∑

Example 3: Probability density function of a linear function of random variable Suppose , 0.Y aX b a= + ≠

Then and

( )( )( )X

XY

y b dyx aa dx

y bff x af ydy adx

−= =

∴ = =

Example 4: Probability density function of the output of a full-wave rectifier Suppose , , 0Y X a X a a= − ≤ ≤ >

Page 55: 35229433 Random Variale Random Process

y

X

Y

yy−

y x= has two solutions 1x y= and 2x y= − and 1dydx

= at each solution point.

] ]( ) ( )

( )1 1

X Xx y x yY

f x f xf y = =−∴ = +

( ) ( )X Xf x f x= + − Example 5: Probability density function of the output a square-law device

2 , 0Y CX C= ≥

2 0yy cx x yc

∴ = => = ± ≥

And 2dy cxdx

= so that 2 / 2dy c y c cydx

= =

( ) ( )/ /( )

2X X

Y

f y c f y cf y

cy

+ −∴ = 0y >

= 0 otherwise

Page 56: 35229433 Random Variale Random Process

Expected Value of a Random Variable

• The expectation operation extracts a few parameters of a random variable and provides a summary description of the random variable in terms of these parameters.

• It is far easier to estimate these parameters from data than to estimate the distribution or density function of the random variable.

Expected value or mean of a random variable

The expected value of a random variable X is defined by

( )XEX xf x d∞

−∞= ∫ x

provided ( )Xxf x dx∞

−∞∫ exists.

EX is also called the mean or statistical average of the random variable X and denoted by .Xµ Note that, for a discrete RV X defined by the probability mass function (pmf)

( ), 1, 2,...., ,X ip x i N= the pdf ( )Xf x is given by

1

( ) ( ) ( )N

X X ii

if x p x xδ=

= −∑ x

i

1

1

1

( ) ( )

= ( ) ( )

= ( )

N

X X ii

N

X i ii

N

i X ii

EX x p x x x d

p x x x x dx

x p x

µ δ

δ

=−∞

= −∞

=

∴ = = −∑∫

−∑ ∫

x

Thus for a discrete random variable X with ( ), 1, 2,...., ,X ip x i N=

X1

= ( )N

i X ii

x p xµ=∑

Interpretation of the mean • The mean gives an idea about the average value of the random value. The

values of the random variable are spread about this value. • Observe that

( )

( ) ( ) 1

( )

X X

X

X

X

xf x dx

xf x dxf x dx

f x dx

µ∞

−∞

∞−∞∞

−∞

−∞

= ∫

∫= =∫

∫∵

Therefore, the mean can be also interpreted as the centre of gravity of the pdf curve.

Page 57: 35229433 Random Variale Random Process

Fig. Mean of a random variable Example 1 Suppose is a random variable defined by the pdf X

1( )

otherwise0X

a x bf x b a

⎧ ≤ ≤⎪= −⎨⎪⎩

Then

( )

1

2

X

b

a

EX xf x dx

x dxb a

a b

−∞= ∫

= ∫−

+=

Example 2 Consider the random variable X with pmf as tabulated below

Value of the random variable x

0 1 2 3

( )Xp x 18

18

14

12

X1

= ( )

1 1 1 =0 1 2 38 8 4

17 =8

N

i X ii

x p xµ=

∴ ∑

× + × + × + ×12

Remark If ( )Xf x is an even function of ,x then ( ) 0.Xxf x dx∞

−∞=∫ Thus the mean of a RV

with an even symmetric pdf is 0.

Page 58: 35229433 Random Variale Random Process

Expected value of a function of a random variable Suppose is a function of a random variable ( )Y g X= X as discussed in the last class.

Then, ( ) ( ) ( )XEY Eg X g x f x dx∞

−∞= = ∫

We shall illustrate the theorem in the special case when ( )g X ( )y g x= is one-to-one and monotonically increasing function of In this case, .x

1 ( )

( )( )

( )X

Yx g y

f xf y

g x −=

⎤= ⎥′ ⎦

1

12

1

( )

( ( )) =

( ( )

Y

yX

y

EY yf y dy

f g yy d

g g y

−∞

= ∫

∫ ′y

where 1 2( ) and ( ).y g y g= −∞ = ∞

Substituting 1 ( ) so that ( ) and ( ) ,x g y y g x dy g x dx− ′= = = we ge

= ( ) ( )XEY g x f x dx∞

−∞∫

The following important properties of the expectation operation caderived:

(a) If is a constant, c Ec c= Clearly

( ) ( )X XEc cf x dx c f x dx∞ ∞

−∞ −∞= =∫ ∫ c=

(b) If are two functions of the random vari1 2( ) and ( ) g X g X

are constants, 1 and c 2c) 1 1 2 2 1 1 2 2[ ( ) ( )]= ( ) (E c g X c g X c Eg X c Eg X+ +

1 1 2 2 1 1 2 2

1 1 2 2

1 1 2 2

[ ( ) ( )] [ ( ) ( )] ( )

= ( ) ( ) ( ) ( )

= ( ) ( ) ( ) (

X

X X

X X

E c g X c g X c g x c g x f x dx

c g x f x dx c g x f x

c g x f x dx c g x f x

−∞

∞ ∞

−∞ −∞

∞ ∞

−∞ −∞

+ = +∫

+∫ ∫

+∫ ∫

1 1 2 2 = ( ) ( )c Eg X c Eg X+

( )g x

t

n be imme

able X an

)

dx

dx

2y

1y

diately

d

x
Page 59: 35229433 Random Variale Random Process

The above property means that E is a linear operator.

Mean-square value

2 2 ( )XEX x f x∞

−∞= ∫ dx

Variance For a random variable X with the pdf ( )xXf and men ,Xµ the variance of X is denoted by 2

Xσ and defined as

2 2 2( ) ( ) ( )X X X XE X x fσ µ µ∞

−∞= − = −∫ x dx

Thus for a discrete random variable X with ( ), 1, 2,...., ,X ip x i N=

2 2X

1 = ( ) ( )

N

i X X ii

x p xσ µ=

−∑

The standard deviation of X is defined as 2( )X XE Xσ µ= −

Example 3 Find the variance of the random variable discussed in Example 1.

2 2

2

22

2

( )1 ( )

2

1 = [ 22 2

( ) 12

X Xb

a

b b

a a

E Xa bx dx

b a

a b a b b

ax dx xdx dx

b a

b a

σ µ= −+

= −∫−

+ +⎛ ⎞− × +∫ ∫ ⎜ ⎟− ⎝ ⎠−

=

Example 4 Find the variance of the random variable discussed in Example 2.

As already computed 178Xµ =

2 2

2 2 2

( )17 1 17 1 17 1 17 1 (0 ) (1 ) (2 ) (3 )8 8 8 8 8 4 8

117 128

X XE Xσ µ= −

= − × + − × + − × + − ×

=

2

2

Page 60: 35229433 Random Variale Random Process

Remark • Variance is a central moment and measure of dispersion of the random

variable about the mean.

• is the average of the square deviation from the mean. It gives information about the deviation of the values of the RV about the mean. A smaller

2( XE X µ− )

2Xσ implies that the random values are more clustered

about the mean, Similarly, a bigger 2Xσ means that the random values are

more scattered. For example, consider two random variables with pmf as

shown below. Note that each of has zero means.

1 and XX 2

21 and XX1

2 12Xσ = and

2

2 53Xσ = implying that has more spread about the mean. 2X

x1-1

14

( )Xp x12

0x -1 0 1

1( )Xp x 1

4 1

2 1

2

x

16

( )Xp x3

x -2 -1 0 1 2

2( )Xp x 1

6 1

6

13

16

16

Fig. shows the pdfs of two continuous random variables with same mean but differvariances

1

1

ent

2

0 -1
Page 61: 35229433 Random Variale Random Process

• We could have used the mean absolute deviation XE X µ− for the same

purpose. But it is more difficult both for analysis and numerical calculations.

Properties of variance

(1) 2 2X XEX 2σ µ= −

Page 62: 35229433 Random Variale Random Process

2 2

2 2

2 2

2 2 2

2 2

( )

( 2 )

2

2

X X

X X

X X

X X

X

E X

E X X

EX EX E

EX

EX

σ µ

µ µ

µ µ

µ µ

µ

= −

= − +

= − +

= − +

= −

(2) If then , where and are constants,Y cX b c b= + 2 2Y Xc 2σ σ=

2 2

2 2

2 2

( )

( )

Y X

X

X

E cX b c b

Ec X

c

σ µ

µ

σ

= + − −

= −

=

(3) If c is a constant,

var( ) 0.c =

nth moment of a random variable

We can define the nth moment and the nth central-moment of a random variable X

by the following relations

nth-orde moment ( ) 1, 2,..

nth-orde central moment ( ) ( ) ( ) 1, 2, ...

n nX

n nX X X

r EX x f x dx n

r E X x f x dx nµ µ

−∞

−∞

= =∫

− = − =∫

Note that

• The mean X = EXµ is the first moment and the mean-square value 2 EX

is the second moment

• The first central moment is 0 and the variance is the

second central moment

2 2(X XE Xσ = − )µ

• The third central moment measures lack of symmetry of the pdf of a random

variable. 3

3

( X

X

E X µσ− ) is called the coefficient of skewness and If the pdf is

symmetric this coefficient will be zero.

• The fourth central moment measures flatness of peakednes of the pdf of a

random variable. 4

4

( X

X

E X µσ− ) is called kurtosis. If the peak of the pdf is

sharper, then the random variable has a higher kurtosis.

Page 63: 35229433 Random Variale Random Process

Inequalities based on expectations

The mean and variance also give some quantitative information about the bounds of

RVs. Following inequalities are extremely useful in many practical problems.

Chebysev Inequality Suppose X is a parameter of a manufactured item with known mean

2and variance .X Xµ σ The quality control department rejects the item if the absolute

deviation of X from Xµ is greater than 2 X .σ What fraction of the manufacturing

item does the quality control department reject? Can you roughly guess it?

The standard deviation gives us an intuitive idea how the random variable is

distributed about the mean. This idea is more precisely expressed in the remarkable

Chebysev Inequality stated below. For a random variable X with mean 2

X Xµ σ and variance

2

2 XXP X σµ ε

ε− ≥ ≤

Proof:

2

2 2

2

2

2

2

( ) ( )

( ) ( )

( )

X

X

X

x X X

X XX

XX

X

X

x f x dx

x f x dx

f x dx

P X

P X

µ ε

µ ε

σ

σ µ

µ

ε

ε µ ε

µ εε

−∞

− ≥

− ≥

= −∫

≥ −∫

≥ ∫

= − ≥

∴ − ≥ ≤

Markov Inequality For a random variable X which take only nonnegative values

( ) E XP X aa

≥ ≤ where 0.a >

Page 64: 35229433 Random Variale Random Process

0( ) ( )

( )

( )

X

Xa

Xa

E X xf x dx

xf x dx

af x dx

aP X a

= ∫

≥ ∫

≥ ∫

= ≥

( ) E XP X aa

∴ ≥ ≤

Remark:2

2 ( ) ( ) E X kP X k aa−

− ≥ ≤

Example

Example A nonnegative RV X has the mean 1.Xµ = Find an upper bound of the

probability ( 3P X ≥ ).

By Markov’s inequality

( ) 1( 3)3 3

E XP X ≥ ≤ = .

Hence the required upper bound 1 .3

=

Page 65: 35229433 Random Variale Random Process

Characteristic Functions of Random Variables Just as the frequency-domain charcterisations of discrete-time and continuous-time signals, the probability mass function and the probability density function can also be characterized in the frequency-domain by means of

the charcteristic function of a random variable. These functions

are particularly important in • calculating of moments of a random variable • evaluating the PDF of combinations of multiple RVs. Characteristic function

Consider a random variable X with probability density function ( ).Xf x The characteristic function of X denoted by ( ),X wφ is defined as

( )

= ( )

where 1.

j XX

j xX

w Ee

e f x dx

j

ω

ω

φ∞

−∞

=

= −

Note the following:

• ( )Xφ ω is a complex quantity, representing the Fourier transform of ( )f x and

traditionally using j Xe ω instead of .j Xe ω− This implies that the properties of the Fourier transform applies to the characteristic function.

• The interpretation that ( )Xφ ω is the expectation of j Xe ω helps in calculating moments with the help of the characteristics function.

• As ( )Xφ ω always +ve and ( ) 1Xf x dx∞

−∞

=∫ , ( )Xφ ω always exists.

[Recall that the Fourier transform of a function f(t) exists if ( )f t dt∞

−∞

< ∞⎡ ⎤⎣ ⎦∫ , i.e., f(t) is absolutely

integrable.] We can get ( )Xf x from ( )Xφ ω by the inverse transform

( ) ( )12

j xX Xf x e ω dφ ω ω

π

∞−

−∞

= ∫

Example 1 Consider the random variable X with pdf ( )Xf x given by

( ) 1Xf x a

b a= ≤

−x b≤

Page 66: 35229433 Random Variale Random Process

= 0 otherwise. The characteristics function is given by Solution:

( )

( ) ( )

1

1

1

bj x

Xa

bj x

a

j b j a

e dxb a

eb a j

e ej b a

ω

ω

ω ω

φ ω

ω

ω

=−

⎤= ⎥− ⎦

= −−

Example 2 The characteristic function of the random variable X with

0

( )

0

( ) 0, 0 is

( )

xX

j x xX

jw x

f x e x

e e

e dx

j

λ

ω λ

λ

λ λ

φ ω λ

λ

λλ ω

∞−

∞− −

= > >

= ∫

= ∫

=−

Characteristic function of a discrete random variable: Suppose X is a random variable taking values from the discrete set 1 2, ,.....XR x x= with

corresponding probability mass function ( )X ip x for the value ix .

Then

( )( ) i

i X

j XX

j xX i

X R

Ee

p x e

ω

ω

φ ω

=

= ∑

Note that ( )Xφ ω can be interpreted as the discrete-time Fourier transform with ij xe ω substituting

ij xe ω− in the original discrete-time Fourier transform. The inverse relation is

Page 67: 35229433 Random Variale Random Process

1( ) ( )2

ij xX i Xp x e

πω

πdφ ω ω

π−

−= ∫

Example 3 Suppose X is a random variable with the probability mass function

( )( ) 1 , 0,1,....,n kn kX kp k C p p k n−= − =

Then ( ) ( )0

1n

n kn k jX k

k

C p p e kωφ ω −

=

= −∑

( ) ( )0

1kn

n kn jk

kC pe pω −

=

= −∑

( ) njpe pω⎡= + −⎣ ⎤⎦ (Using the Binomial theorem)

Example 4 The characteristic function of the discrete random variable X with

0

0

( ) (1 ) , 0,1,....

( ) (1 )

(1 )

1 (1 )

kX

j k kX

k

j k k

k

j

p k p p k

e p p

p e p

pp e

ω

ω

ω

φ ω∞

=

=

= − =

= −∑

= −∑

=− −

Moments and the characteristic function Given the characteristics function ( )Xφ ω , the kth moment is given by

( )0

1 kk

Xk

dEXj d ω

φ ωω

=

=

To prove this consider the power series expansion of i Xe ω

2 2( ) ( )1 ...... ..2! !

n nj X j X j Xe j X

nω ω ωω= + + + + +

Taking expectation of both sides and assuming to exist, we get 2, ,..........., nEX EX EX

2 2( ) ( )( ) 1 ...... .....2! !

n n

Xj EX j EXj EX

nω ωφ ω ω= + + + + +

Taking the first derivative of ( )Xφ ω with respect to ω at 0,ω = we get

Page 68: 35229433 Random Variale Random Process

0

( )Xd jEXd ω

φ ωω =

=

Similarly, taking the derivative of nth ( )Xφ ω with respect to ω at 0,ω = we get

0

( )nn nX

n

d j EXd ω

φ ωω

=

=

Thus

0

0

( )1

( )1

X

nn X

n n

dEXj d

dEXj d

ω

ω

φ ωω

φ ωω

=

=

=

=

Example 3 First two moments of random variable in Example 2

2

2 2

2 3

20

22

2 30

( )

( )( )

2( )( )

1 1( )

1 2 2( )

X

X

X

jd j

d jd j

d j

jEXj j

jEXj j

ω

ω2

λφ ωλ ω

λφ ωω λ ω

λφ ωω λ ω

λλ ω λ

λλ ω λ

=

=

=−

=−

=−

= =−

= =−

Probability generating function: If the random variable under consideration takes non negative integer values only, it is convenient to characterize the random variable in terms of the probability generating function G (z) defined by

( )

( )0

XX

Xk

G z Ez

p k z∞

=

=

=∑

Note that

is related to z-transform, in actual z-transform, ( )XG z kz− is used instead of . kz

The characteristic function of X is given by ( ) ( )jX XG e ωφ ω =

( ) ( )0

1 1X Xk

G p k∞

=

= =∑

Page 69: 35229433 Random Variale Random Process

( ) ( ) 1

0

' kX X

k

G z kp k z∞

=

= ∑

( )0

'(1) Xk

G kp k∞

=

= =∑ EX

2kx ( ) ( ) ( )2 2 2

0 0 0

''( ) ( 1) k kX X x

k k k

G z k k p k z k p k z k p k z∞ ∞ ∞

− − −

= = =

= − = −∑ ∑ ∑

( ) ( )2 2

0 0''(1)X X X

k kG k p k kp k EX

∞ ∞

= =

EX∴ = − =∑ ∑ −

X

( ) ( )2 22 2 ''(1) '(1) '(1)X X XEX EX G G Gσ∴ = − = + − Ex: Binomial distribution: ( ) (1 )n x

m xp x C p p= − x

xz ( ) (1 )n x n xx x

xz C p pφ −∴ = −∑

( ) (1 )n xx

xC pz p n x−= −∑

(1 )np pz= − + ' (1)X EX npφ = = '' 2 2(1) ( 1)X EX EX n n pφ = − = −

2 ''

2

2

(1)

( 1)XEX E

n n p npnp npq

φ X∴ = +

= − +

= +

Example 2: Geometric distribution ( ) (1 )x

Xp x p p= −

( )( )

( ) (1 )

1

11 (1 )

x xX

x

x

x

z p p z

p p z

pp z

φ = −

= −

=− −

'2

(1 )( )(1 (1 ) )X

p pzp z

φ −= +

− −

Page 70: 35229433 Random Variale Random Process

'2 2

(1 ) (1 )(1)(1 1 )X

p p p p qp p p

φ − −= =

− +=

''3

2 (1 )(1 )( )(1 (1 ) )Xp p pz

p zφ − −

=− −

22

''3

2(1) 2Xpq qp p

φ⎛ ⎞

= = ⎜ ⎟⎝ ⎠

2

2 '' (1) 2q qEX qp p p

φ⎛ ⎞

= + = +⎜ ⎟⎝ ⎠

2 2

( ) 2 q q qVar Xp p p

⎛ ⎞ ⎛ ⎞ ⎛ ⎞= + −⎜ ⎟ ⎜ ⎟ ⎜ ⎟

⎝ ⎠ ⎝ ⎠ ⎝ ⎠

Moment Generating Function: Sometimes it is convenient to work with a function similarly to the Laplace transform and known as the moment generating function. For a random variable X, the moment generating function ( )XM s is defined by

( )( )

X

SXX

SXX

R

M s Ee

f x e dx

=

= ∫

Where XR is the range of the random variable X. If X is a non negative continuous random variable, we can write

( ) ( )0

SXX XM s f x e d

= ∫ x

Note the following:

( )0

'( ) sxx xM s xf x e d

= ∫ x

'(0)M EX∴ =

( ) ( )0

kk sx

X Xk

d M s x f x e dds

= ∫ x

= kEX

Page 71: 35229433 Random Variale Random Process

Example Let X be a continuous random variable with

( ) ( )2 2, 0Xf x x

xα α

π α= −∞ <

+< ∞ >

Then

( )XEX xf x dx∞

−∞

= ∫

= 2 20

2x dxx

απ α

+∫

= ( )2

0

ln 1 xαπ

∞⎤+ ⎥⎦

Hence EX does not exist. This density function is known as the Cauchy density function.

2

q qp p

⎛ ⎞ ⎛ ⎞= +⎜ ⎟ ⎜ ⎟⎝ ⎠ ⎝ ⎠

The joint characteristic function of two random variables X and Y is defined by,

1 2

1 2

, 1 2

,

( , )

( , )

j x j yX Y

j x j yX y

Ee

f x y e dydx

ω ω

ω ω

φ ω ω +

∞ ∞+

−∞ −∞

=

= ∫ ∫

And the joint moment generating function of 1 2( , )s sφ is defined by,

1 2

1 2

, 1 2 ,( , ) ( , ) xs ysX Y X Y

s x s y

s s f x y e dxdy

Ee

φ∞ ∞

+

−∞ −∞

+

=

=

∫ ∫

IF Z =ax+by, then

( ),( ) ( , )Zs ax by s

Z Xs Ee Ee as bsφ φ+= = = Y Suppose X and Y are independent. Then

Page 72: 35229433 Random Variale Random Process

1 2

1 2

, 1 2 ,

1 1 2 2

( , ) ( , )

( ) ( )

( ) ( )

s x s xX Y X Y

s x s yX Y

s s e f x y dydx

e f x dx e f y dy

s s

φ

φ φ

∞ ∞+

−∞ −∞

∞ ∞

−∞ −∞

=

=

=

∫ ∫

∫ ∫

Particularly if Z=X+Y and X and Y are independent, then

,( ) ( , )( ) ( )

Z X Y

X Y

s ss s

sφ φφ φ

=

=

Using the property of Laplace transformation we get,

( ) ( )* ( )Z X Yf z f z f z= Let us recall the MGF of a Gaussian random variable

2( ,X XX N µ σ= )

2

2 2 2 2

2

4 2 22 2

12

2( ) ( ) ( ) ..12

( 2 )11( ) 22

( )

1 .2

12

12

XX

X X X X X X

X

X X XX X X

XsX

xxs

Xx s x s s

Xs s

x s

X

s Ee

e e dx

e d

e dx e

µσ

µ σ µ σ µ σσ

σ µ σµ σ σ

x

dx

φ

πσ

πσ

πσ

⎛ ⎞⎜ ⎟⎜ ⎟⎝ ⎠

−∞ −

−∞

− + + + − + +∞ −

−∞

+∞ +− − −

−∞

=

=

=

= +

We have,

, 1 2( , )X Y s sφ

( )

1 2

1 21 2

2 2 2 21 2

1 2 1 2

(1 ..............)2

12 2

Xs YsEeXs Ys

E Xs Ys

s EX s EYs EX s EY s s EXY

+=

+= + + + +

= + + + + +

Hence, 1, 1 2

1

( , )]X Y sEX s ssφ =

∂=∂ 0

2, 1 2

2

( , )]X Y sEY s ssφ =

∂=∂ 0

Page 73: 35229433 Random Variale Random Process

1 2

2

, 1 2 0,1 2

( , )]X Y s sEXY s ss s

φ 0= =

∂=∂ ∂

We can generate the joint moments of the RVS from the moment generating function.

Page 74: 35229433 Random Variale Random Process

Important Discrete Random Variables

Discrete uniform random variable

A discrete random variable X is said to be a uniform random variable if it assumes each of the values 1 2, ,... nx x x with equal probability. The probability mass function of the uniform random variable X is given by

1( ) , 1, 2,... X ip x i nn

= =

Its CDF is

1

1( ) ( )n

X ii

F x xn

δ=

= −∑ x

( )Xp x

Mean and variance of the Discrete uniform random variable

1x 2x 3x nx

1n

x

iii iii

Page 75: 35229433 Random Variale Random Process

X0

0

2 2

0

2

02 2 2X

22

0 0

= (

1

( )

1

1 1

n

i X ii

n

ii

n

i X ii

n

ii

X

n n

i ii i

EX x p x

xn

EX x p x

xn

EX

)

x xn n

µ

σ µ

=

=

=

=

= =

=

=

==

=

∴ = −

⎛ ⎞= − ⎜ ⎟⎝ ⎠

∑ ∑

Example: Suppose X is the random variable representing the outcome of a single roll of a fair dice. Then X can assume any of the 6 values in the set 1, 2, 3, 4, 5, 6 with the probability mass function

1( ) 1,2,3,4,5,66

Xp x x= =

Bernoulli random variable Suppose X is a random variable that takes two values 0 and 1, with probability mass functions

(1) 1Xp P X p= = = and (0) 1 , 0 1Xp p p= − ≤ ≤ Such a random variable X is called a Bernoulli random variable, because it describes the outcomes of a Bernoulli trial. The typical cdf of the Bernoulli RV X is as shown in Fig.

Page 76: 35229433 Random Variale Random Process

( )XF x

1 x0

Remark We can define the pdf of X with the help of delta function. Thus

( ) (1 ) ( ) ( )Xf x p x p xδ δ= − + Example 1: Consider the experiment of tossing a biased coin. Suppose P H p= and 1P T p= − .

If we define the random variable ( ) 1X H = and ( ) 0X T = then X is a Bernoulli random variable. Mean and variance of the Bernoulli random variable

1

X0

12 2

0

2 2 2X

= ( ) 1 0 (1

( ) 1 0 (1 )

(1 )

Xk

Xk

X

EX kp k p p p

EX k p k p p p

EX p p

µ

σ µ

=

=

= = × + × − =

= = × + × − =

)

∴ = − = −

Remark

• The Bernoulli RV is the simplest discrete RV. It can be used as the building block for many discrete RVs.

• For the Bernoulli RV, . Thus all the moments of the Bernoulli RV have the same value of .

1, 2,3...mEX p m= =p

Binomial random variable: Suppose X is a discrete random variable taking values from the set 0,1,.......,n . X is called a binomial random variable with parameters n and 0 1p≤ ≤ if

Page 77: 35229433 Random Variale Random Process

( ) (1 )n k nX kp k C p p −= − k

here w

( )!

! !nC = k

nk n k−

As we have seen, the probability of k successes in n independent repetitions of the

• The notation

Bernoulli trial is given by the binomial law. If X is a discrete random variable representing the number of successes in this case, then X is a binomial random variable. For example, the number of heads in ‘n’ independent tossing of a fair coin is a binomial random variable.

~ ( , )X B n p is used to represent a binomial RV with the parameters an nd . p

( ),Xp• k k), defines a valid probability mass function. This is because

n kX k

k kp k C p p p p

= =

= − = + − =∑ ∑

• The sum of n independent identically distributed Bernoulli random variables is a

• s useful when there are two types of objects - good,

xample: In a binary communication system, the probability of bit error is 0.001. If

k=0,1,...,n

0 0( ) (1 ) [ (1 )] 1.

n nn k n−

binomial random variable. The binomial distribution ibad; correct, erroneous; healthy , diseased etc.

Ea block of 8 bits are transmitted, find the probability that (a) exactly 2 bit errors will occur (b) at least 2 bit errors will occur (c) all the bits will bit erroneous Suppose X is the random variable representing the number of bit errors in a block of

2 bit errors w (2)

0.01 0.99(b) (at least 2 bit errors will occur)= (0) (1) (2)

X

X X X

CP p p p

8 bits. Then ~ (8,0.01).X B Therefore, (a) (exactly ill occur)P p=

8 2 62= × ×

+ +8 8 1 7 8 2 6

1 28 6

=0.99 0.01 0.99 0.01 0.99

(c) P( all 8 bits will be erroneous)= (8) 0.01 10X

C C

p −

+ × × + × ×

= =

he probability mass function for a binomial random variable with n = 6 and p =0.8 is T

shown in the figure below.

Page 78: 35229433 Random Variale Random Process

Mean and variance of the Binomial random variable

Page 79: 35229433 Random Variale Random Process

1

0

0

1

1

1

11

1

1

We have

( )

(1 )

=0 (1 )

! = (1 )!( )!

! = (1 )( 1)!( )!

1! = (1 )( 1)!( )!

1! =!(

n

Xk

nn k n k

kk

nn n k n k

kk

nk n k

k

nk n k

k

nn kk

k

EX kp k

k C p p

q k C p p

nk p pk n k

n p pk n k

nnp p pk n k

nnpk n

=

=

=

=

=

− −−

=

=

= −

× + −

−−

−− −

−−

− −

1 1

1

11

10 1

1

2 2

0

2

0

2

1

2

1

(1 ) (Substituting 1)1 )!

( 1 ) Similarly

( )

(1 )

=0 (1 )

! (1 )!( )!

nk n k

k

n

n

Xk

nn k n k

kk

nn n k n k

kk

nk n k

k

p p k kk

np p pnp

EX k p k

k C p p

q k C p p

nk p pk n k

−− −

=

=

=

=

=

− = −− −

= + −=

=

= −

× + −

= −−

1

1 1 ( 1)

1

1 1 ( 1) 1 1 (

1 1

! (1 )( 1)!( )!

1! ( 1 1) (1 )( 1)!( )!

1! 1! ( 1) (1 ) (1 )( 1)!( 1 1)! ( 1)!( 1 1)!

nk n k

k

nk n k

k

n nk n k k n

k k

nk p pk n k

nnp k p pk n k

n nnp k p p np p pk n k k n k

=

− − − −

=

1)k− − − − − − − −

= =

= −− −

−= − + −

− −

− −= − − + −

− − − + − − − +=

∑ ∑

2

2 2 2 21) (1 )p np n p np p

( 1) ( 1)

variance of (X

np n p npn n p np

X n nσ

× − +

= − +

∴ = = − + − = − Mean of ( 1, )B n p−

Page 80: 35229433 Random Variale Random Process

Important Random Variables

Discrete uniform random variable

A discrete random variable X is said to be a uniform random variable if it assumes each of the values 1 2, ,... nx x x with equal probability. The probability mass function of the uniform random variable X is given by

1( ) , 1, 2,... X ip x i nn

= =

( )Xp x

Example: Suppose X is the random variable representing the outcome of a single roll of a fair dice. Then X can assume any of the 6 values in the set 1, 2, 3, 4, 5, 6 with the probability mass function

1( ) 1,2,3,4,5,66

Xp x x= =

Bernoulli random variable Suppose X is a random variable that takes two values 0 and 1, with probability mass functions

(1) 1Xp P X= = = p and (0) 1 , 0 1Xp p p= − ≤ ≤

1x 2x 3x nx

1n

x

iii iii

Page 81: 35229433 Random Variale Random Process

Such a random variable X is called a Bernoulli random variable, because it describes the outcomes of a Bernoulli trial. The typical cdf of the Bernoulli RV X is as shown in Fig.

( )XF x

1 x0

Remark We can define the pdf of X with the help of delta function. Thus

( ) (1 ) ( ) ( )Xf x p x p xδ δ= − + Example 1: Consider the experiment of tossing a biased coin. Suppose P H p= and 1P T p= − .

If we define the random variable ( ) 1X H = and ( ) 0X T = then X is a Bernoulli random variable. Mean and variance of the Bernoulli random variable

1

X0

12 2

0

2 2 2X

= ( ) 1 0 (1

( ) 1 0 (1 )

(1 )

Xk

Xk

X

EX kp k p p p

EX k p k p p p

EX p p

µ

σ µ

=

=

= = × + × − =

= = × + × − =

)

∴ = − = −

Remark

• The Bernoulli RV is the simplest discrete RV. It can be used as the building block for many discrete RVs.

Page 82: 35229433 Random Variale Random Process

• For the Bernoulli RV, . Thus all the moments of the Bernoulli RV have the same value of .

1, 2,3...mEX p m= =p

Binomial random variable: Suppose X is a discrete random variable taking values from the set 0,1,.......,n . X is called a binomial random variable with parameters n and 0 1p≤ ≤ if ( ) (1 )n k n

X kp k C p p −= − k

where

( )

!! !

nk

nCk n k

=−

As we have seen, the probability of k successes in n independent repetitions of the Bernoulli trial is given by the binomial law. If X is a discrete random variable representing the number of successes in this case, then X is a binomial random variable. For example, the number of heads in ‘n’ independent tossing of a fair coin is a binomial random variable.

• The notation ~ ( , )X B n p is used to represent a binomial RV with the parameters and n .p

• k), defines a valid probability mass function. This is because

( ), k=0,1,...,nXp k

0 0( ) (1 ) [ (1 )] 1.

n nn k n k n

X kk k

p k C p p p p−

= =

= − = + −∑ ∑ =

• The sum of n independent identically distributed Bernoulli random variables is a binomial random variable.

• The binomial distribution is useful when there are two types of objects - good, bad; correct, erroneous; healthy , diseased etc.

Example: In a binary communication system, the probability of bit error is 0.001. If a block of 8 bits are transmitted, find the probability that (a) exactly 2 bit errors will occur (b) at least 2 bit errors will occur (c) all the bits will bit erroneous Suppose X is the random variable representing the number of bit errors in a block of 8 bits. Then Therefore, ~ (8,0.01).X B

8 2 62

(a) (exactly 2 bit errors will occur) (2)

0.01 0.99(b) (at least 2 bit errors will occur)= (0) (1) (2)

X

X X X

P p

CP p p p

=

= × ×+ +

8 8 1 7 8 2 61 2

8 6

=0.99 0.01 0.99 0.01 0.99

(c) P( all 8 bits will be erroneous)= (8) 0.01 10X

C C

p −

+ × × + × ×

= =

Page 83: 35229433 Random Variale Random Process

The probability mass function for a binomial random variable with n = 6 and p =0.8 is shown in the figure below.

Mean and variance of the Binomial random variable

Page 84: 35229433 Random Variale Random Process

1

0

0

1

1

1

11

1

1

We have

( )

(1 )

=0 (1 )

! = (1 )!( )!

! = (1 )( 1)!( )!

1! = (1 )( 1)!( )!

1! =!(

n

Xk

nn k n k

kk

nn n k n k

kk

nk n k

k

nk n k

k

nn kk

k

EX kp k

k C p p

q k C p p

nk p pk n k

n p pk n k

nnp p pk n k

nnpk n

=

=

=

=

=

− −−

=

=

= −

× + −

−−

−− −

−−

− −

1 1

1

11

10 1

1

2 2

0

2

0

2

1

2

1

(1 ) (Substituting 1)1 )!

( 1 ) Similarly

( )

(1 )

=0 (1 )

! (1 )!( )!

nk n k

k

n

n

Xk

nn k n k

kk

nn n k n k

kk

nk n k

k

p p k kk

np p pnp

EX k p k

k C p p

q k C p p

nk p pk n k

−− −

=

=

=

=

=

− = −− −

= + −=

=

= −

× + −

= −−

1

1 1 ( 1)

1

1 1 ( 1) 1 1 (

1 1

! (1 )( 1)!( )!

1! ( 1 1) (1 )( 1)!( )!

1! 1! ( 1) (1 ) (1 )( 1)!( 1 1)! ( 1)!( 1 1)!

nk n k

k

nk n k

k

n nk n k k n

k k

nk p pk n k

nnp k p pk n k

n nnp k p p np p pk n k k n k

=

− − − −

=

1)k− − − − − − − −

= =

= −− −

−= − + −

− −

− −= − − + −

− − − + − − − +=

∑ ∑

2

2 2 2 21) (1 )p np n p np p

( 1) ( 1)

variance of (X

np n p npn n p np

X n nσ

× − +

= − +

∴ = = − + − = − Mean of ( 1, )B n p−

Page 85: 35229433 Random Variale Random Process

Geometric random variable: X as a discrete random variable with range 1,2,.......XR = . X is called a geometric

random variable of 1( ) (1 ) 0 1kXp k p p p−= − ≤ ≤

• X describes the outcomes of independent Bernoulli trials each with probability of

success ,p before a ‘success’ occurs. If the first success occurs at the trial, then the outputs of the Bernoulli trials are

kth

1 times1 1

, ,.., ,

( ) (1 ) (1 )k

k kX

F ailiure Failure Failure Success

p k p p p p−

− −∴ = − = −

• XR is countablly infinite, because we may have to wait infinitely long before the first success occurs.

• The geometric random variable X with the parameter p is denoted by ~ ( )X geo p

• The CDF of ~ ( )X geo p is given by

1

1( ) (1 ) 1 (1 )

ki k

Xi

F k p p p−

=

= − = − −∑which gives the probability that the first ‘success’ will occur before the trial. ( 1)k t+ h

) Following Figure shows the pmf of a random variable ~ (X geo p for and

respectively. Observe that the plots have a mode at 0.25p =

0.5p = 1.k =

Page 86: 35229433 Random Variale Random Process

Example: Suppose X is the random variable representing the number of independent tossing of a coin before a head shows up. Clearly X will be a geometric random variable. Example: A fair dice is rolled repeatedly. What is the probability that a 6 will be shown before the fourth roll. Suppose X is the random variable representing the number of independent rolling of the

dice before a ‘6’ shows up. Clearly X will be a geometric random variable with 1 .6

p = .

( 1 2 3(1) (2) (3)X X X

P P Xp p p

)X X= = == + +

( a '6' will be shown before the 4th roll)= or or

2

2

(1 ) (1 )(3 3 )

91196

p p p p pp p p

= + − + −

= − +

=

Mean and variance of the Geometric random variable

Page 87: 35229433 Random Variale Random Process

0

1

0

1

0

0

0

Mean= ( )

(1 )

(1 )

(1 )

(1 ) ( Sum of the geometric series)

Xk

k

k

k

k

k

k

k

k

EX kp k

k p p

p k p

dp pdk

dp pdk

=

∞−

=

∞−

=

=

=

=

= −

= −

= − −

= − −

2

2 2

0

2 1

0

1

0

2 1

0 0

21

20 0

2

1

( )

(1 )

( ( 1) )(1 )

(1 ) ( 1)(1 ) (1 )

(1 ) (1 ) (1 )

(1- ) 2

Xk

k

k

k

k

k k

k k

k k

k k

pp

p

EX k p k

k p p

p k k k p

p p k k p p k p

dp p p p k pdp

pp

=

∞−

=

∞−

=

∞ ∞− −

= =

∞ ∞−

= =

=

=

=

= −

= − + −

= − − − + −

= − − + −

=

∑ ∑

∑ ∑

22 2

2

1

(1- ) 1 12

(1- )

X

pp

pp pp

p

σ

+

∴ = + −

=

1

X

22

Mean

(1- )Variance X

pp

p

µ

σ

=

=

Page 88: 35229433 Random Variale Random Process

Poisson Random Variable: X is a Poisson distribution with the parameter λ such that 0λ > and

( ) , 0,1, 2,....!

k

Xep k k

k

λλ−

= =

The plot of the pmfof the Poisson RV is shown in Fig.

Remark

• ( )!

k

Xep k

k

λλ−

= satisfies to be a pmf, because

0 0 0( ) 1

! !

k k

Xk k k

ep k e e ek k

λλ λλ λ−∞ ∞ ∞

− −

= = =

= = =∑ ∑ ∑ λ =

• Named after the French mathematician S.D. Poisson

Page 89: 35229433 Random Variale Random Process

Mean and Variance of the Poisson RV The mean of the Poisson RV X is given by

0

1

1

1

( )

0!

1!

X Xk

k

k

k

k

kp k

ekk

ek

λ

λ

µ

λ

λλ

λ

=

−∞

=

−∞−

=

=

= +

=−

=

2 2

0

2

1

1

1

2 1

2 12

2 1

2

( )

0!

1!

( 1 1) 1!

02! 1!

2! 1!

Xk

k

k

k

kk

k

k k

k k

k k

k k

EX k p k

ekk

kekke

k

e ek k

e ek k

e e e e

λ

λ

λ

λ λ

λ λ

λ λ λ

λ

λ

λ

λ λ

λ λλ λ

λ λ λ

=

−∞

=

∞−

=

∞−

=

∞ ∞− −

= =

− −∞ ∞− −

= =

− −

=

= +

=−

− +=

⎛ ⎞= + +⎜ ⎟− −⎝ ⎠

= +− −

= +

∑ ∑

∑ ∑

2

2 2 2X XEXλ λ

σ µ λ

= +

∴ = − =

Example: The number of calls received ina telephone exchange follows a Poisson distribution with an average of 10 calls per minute. What is the probability that in one-minute duration

(i) no call is received (ii) exactly 5 callsare received (iii) more than 3 calls are received

Let be the random variable representing the number of calls received. Given X

( )!

k

Xep k

k

λλ−

= where 10.λ = Therefore,

Page 90: 35229433 Random Variale Random Process

(i) probability that no call is received 10(0)Xp e−= = =

(ii) probability that exactly 5 calls are received 10 510(5)

5!Xep− ×

= = =

(iii) probability that more the 3 calls are received 2 35

10

0

10 10 101 ( ) 1 (11 2! 3!X

kp k e−

=

= − = − + + + =∑ )

The Poisson distribution is used to model many practical problems. It is used in many counting applications to count events that take place independently of one another. Thus it is used to model the count during a particular length of time of:

• customers arriving at a service station • telephone calls coming to a telephone exchange packets arriving at a particular

server • particles decaying from a radioactive specimen

Poisson approximation of the binomial random variable The Poisson distribution is also used to approximate the binomial distribution ( , )B n p when is very large and n p is small. Consider a binomial RV ~ ( , )X B n p with

, 0 so that remains constant.n p EX np λ→∞ → = = Then

( )!

k

Xep k

k

λλ−

Page 91: 35229433 Random Variale Random Process

( ) (1 )! = (1 )

!( )!( 1)( 2).....( 1) = (1 )

!1 2 1(1 )(1 ).....(1 )

= (1 )!

1 2 1(1 )(1 ).....(1 ) = ( ) (1 )

!1(1 )

n k n kX k

k n k

k n k

k

k n k

k n k

p k C p pn p p

k n kn n n n k p p

kkn

n n n p pk

kn n n np p

k

n

= −

−−

− − − +−

−− − −

−− − −

−=

n

2 1(1 ).....(1 )( ) (1 )

!(1 )

Note that lim(1 ) .

1 2 1(1 )(1 ).....(1 )( ) (1 )( ) lim

!!(1 )

k n

k

n

n

k nk

Xk

kn n n

kn

en

ken n n np k

kkn

λ

λ

λλ

λ

λ

λλ λλ

→∞

→∞

−− − −

− =

−− − − −

∴ =−

Thus the Poisson approximation can be used to compute binomial probabilities for large

It also makes the analysis of such probabilities easier.Typical examples are: .n• number of bit errors in a received binary data file • number of typographical errors in a printed page

Example Suppose there is an error probability of 0.01 per word in typing. What is the probability that there will be more than 1 error in a page 120 words. Suppose X is the RV representing the number of errors per page of 120 words.

~ (120, )X B p where Therefore, 0.01.p =

120 0.01 0.12P(more than one errors) 1 (0) (1)

1X Xp p

e eλ λ

λ

λ− −

∴ = × =

= − −

− −

Page 92: 35229433 Random Variale Random Process

Uniform Random Variable A continuous random variable X is called uniformly distributed over the interval [a, b], if its probability density function is given by

1( )

otherwise0X

a x bf x b a

⎧ ≤ ≤⎪= −⎨⎪⎩

We use the notation to denote a random variable X uniformly distributed over the interval

[a,b]. Also note that

~ ( , )X U a b1( ) 1.

b

Xa

f x dx dxb a

−∞

= =−∫ ∫

Distribution function ( )XF x For

( ) 0For

X

x aF x

a x b

<=≤ ≤

( )x

X

x

a

f u dx

dub a

x ab a

−∞

=−

−=

For ,( ) 1X

x bF x

>=

Mean and Variance of a Uniform Random Variable

( )

2

b

X Xa

xEX xf x dx dxb a

a b

µ∞

−∞

= = =−

+=

∫ ∫

Page 93: 35229433 Random Variale Random Process

22 2

2 2

2 22 2 2

2

( )

3

( )3 4

( ) 12

b

Xa

X X

xEX x f x dx dxb a

b ab a

b ab a a bEX

b a

σ µ

−∞

= =−

+ +=

+ + + 2

∴ = − = −

−=

∫ ∫

The characteristic function of the random variable is given by ~ ( , )X U a b

( )

jwxbjwx

Xa

jwb jwa

ew Ee db a

e eb a

φ = = ∫−

−=

x

Example: Suppose a random noise voltage X across an electronic circuit is uniformly distributed between -4 V and 5 V. What is the probability that the noise voltage will lie between 2 v and 4 V? What is the variance of the voltage?

3

2

22

1(2 3) .5 ( 4) 9

(5 4) 27 .12 4X

dxP X

σ

< ≤ = =∫− −

+= =

Remark • The uniform distribution is the simplest continuous distribution • Used, for example, to to model quantization errors. If a signal is discritized into steps of ,∆

then the quantization error is uniformly distributed between and .2 2−∆ ∆

• The unknown phase of a sinusoid is assumed to be uniformly distributed over [0,2 ]π in many applications. For example, in studying the noise performance of a communication receiver, the carrier signal is modeled as

( ) cos( )cX t A w t= +Φ. where ~ (0, 2 )U πΦ

• A random variable of arbitrary distribution can be generated with the help of a routine to generate uniformly distributed random numbers. This follows from the fact that the distribution function of a random variable is uniformly distributed over [0,1]. (See Example)

Thus if is a contnuous raqndom variable, then ( ) ~ (0,1).XX F X U

Normal or Gaussian Random Variable The normal distribution is the most important distribution used to model natural and man made phenomena. Particular, when the random variable is the sum of a large number of random variables, it can be modeled as a normal random variable.

Page 94: 35229433 Random Variale Random Process

A continuous random variable X is called a normal or a Gaussian random variable with parameters Xµ and 2

Xσ if its probability density function is given by,

2121( )

2

X

X

x

XX

f x eµ

σ

πσ

⎛ ⎞−− ⎜ ⎟

⎝ ⎠= , x−∞ < < ∞

where Xµ and 0Xσ > are real numbers. We write that X is distributed. ( 2,X XN µ σ ) If 0Xµ = and , 2 1Xσ =

2121( )

2x

Xf x eπ

−=

and the random variable X is called the standard natural variable.

• ( )Xf x , is a bell-shaped function, symmetrical about Xx µ= .

Page 95: 35229433 Random Variale Random Process

• Xσ , determines the spread of the random variable X. If 2Xσ is small X is more concentrated

around the mean Xµ .

( )2

12

( )

12

X

X

X

tx

X

F x P X x

e dµ

σ

πσ

⎛ ⎞−− ⎜ ⎟

⎝ ⎠

−∞

= ≤

= ∫ t

Substituting, X

X

tu µσ−

= , we get

21

21( )2

X

X

x

u

XF x e

µσ

π

−∞

= ∫ du

X

X

x µσ

⎛ ⎞−= Φ⎜ ⎟

⎝ ⎠

where ( )xΦ is the distribution function of the standard normal variable. Thus can be computed from tabulated values of( )XF x ( ).xΦ . The table ( )xΦ was very useful in the pre-computer days. In communication engineering, it is customary to work with the Q function defined by,

2

2

( ) 1 ( )

12

u

x

Q x x

e duπ

∞−

= −Φ

= ∫

Note that 1(0) , ( ) ( )2

Q Q x Q= − = x

)

If X is distributed, then ( 2,X XN µ σ

XEX µ= 2var( ) XX σ=

Proof:

Page 96: 35229433 Random Variale Random Process

( )2

2

2

2

12

12

12

2

2

0

1( )2

12

12 2

102

22

X

X

x

XX

u

X X X

uXX

u

X

uXX

X

EX xf x dx xe dx

u e du

udu e du

e du

e du

µσ

πσ

σ σ µπ

µσπ π

µπ

µ µπ

⎛ ⎞−∞ ∞ − ⎜ ⎟⎝ ⎠

−∞ −∞

∞−

−∞

∞ ∞−

−∞ −∞

∞−

−∞

∞−

= =

= +

= +

= +

= =

∫ ∫

∫ ∫

Evaluation of 2

2x

e dx∞

−∞∫

Suppose 2

2x

I e dx∞

−∞

= ∫

Then

2

2 2

2 2

2

2 2

2 2

2

x

x y

x y

I e dx

e dx e dy

e dydx

∞−

−∞

∞ ∞− −

−∞ −∞

∞ ∞ +−

−∞ −∞

⎛ ⎞= ⎜ ⎟⎜ ⎟⎝ ⎠

=

=

∫ ∫

∫ ∫

Substituting cosx r θ= and siny r θ= we ge

2

2

2 2

0

0

0

2

2

2 1 2

r

r

s

I e r d dr

e r dr

e ds

π

π

θ

π

π

π π

∞−

∞−

∞−

=

=

=

= × =

∫ ∫

( ) 2r s=

2I π∴ =

Substituting,Xx µ−

so that

X

X X

u

x uσ

σ µ

=

= +

t

Page 97: 35229433 Random Variale Random Process

( )

( )2

2

2

2

12 2

12 2 2

122 2

0

122

0

2

2

2

2

( )

12

12

22

2 22

322

1 122 2

X

X

X

x

XX

u

X XX

uX

tX

X

X

X

X

Var X E X

x e d

u e du

u e du

t e dt

µσ

µ

µπσ

σ σπσ

σπ

σπ

σπ

σπ

σ ππ

σ

⎛ ⎞−∞ − ⎜ ⎟⎝ ⎠

−∞

∞−

−∞

∞−

∞−

= −

= −

=

= ×

= ×

= ×

= ×

= ×

=

x

1

0

Note the definition and properties of the gamma function

( 1) ( 1)12

n tn t e dt

n n n

π

∞− −Γ =

Γ = − Γ −

Γ =

Put 21 so that

2t u dt udu= =

Put So that XX

X

x u dx duµ σσ−

= =

Page 98: 35229433 Random Variale Random Process

Exponential Random Variable A continuous random variable X is called exponentially distributed with the parameter 0λ > if the probability density function is of the form

0

2 2 22

0

0( )

0The corresponding probabilty distribution function is

( ) ( )

0 01 0

1We have

1 1( ) ( )

x

X

x

X X

x

xX

xX X

xef x

otherwise

F x f u du

xe x

EX x e dx

E X x e dx

λ

λ

λ

λ

λ

µ λλ

σ µ λλ λ

−∞

∞−

∞−

≥⎧= ⎨⎩

=

<⎧= ⎨

− ≥⎩

= = =

= − = − =

The following figure shows the pdf of an exponential RV.

Page 99: 35229433 Random Variale Random Process

• The time between two consecutive occurrences of independent events can be modeled by the exponential RV. For example, the exponential distribution gives probability density function of the time between two successive counts in Poisson distribution

• Used to model the service time in a queing system. • In reliability studies, the expected life-time of a part, the average time between successive

failures of a system etc.,are determined using the exponential distribution. Memoryless Property of the Exponential Distribution For an exponential RV with parameter X ,λ

0 0 0

0 00 0

0

0

0

0

0

( / ) ( ) for 0, Proof:

[( ) )]( / )( )

( ) ( )

1 ( ) 1 ( )

X

X

P X t t X t P X t t t

P X t t PX tP X t t X tP X t

P X t tP X t

F t tF t

> + > = > > >

> + ∩ >> + > =

>

> +=

>− +

=−

0

0

0

( )

( )

t t

t

t

ee

e P X t

λ

λ

λ

− +

=

= = >

Hence if X represents the life of a component in hours, the probability that the component will last more than hours given that it has lasted hours, is same as the probability that the component will last t hours. The information that the component has already lasted for hours is not used. Thus the life expectancy of an used component is same as that for a new component. Such a model cannot represent a real-world situation, but used for its simplicity.

0t t+ 0t

0t

Laplace Distribution A continuous random variable X is called Laplace distributed with the parameter 0λ > with the probability density function is of the form

( ) 0, -2

xXf x e xλλ λ−= > ∞ < < ∞

2 2 2

We have 02

( ) 2

xX

xX X

EX x e dx

E X x e dx

λ

λ

λµ

λσ µ λ

∞−

−∞

∞−

= = =

= − = =

Page 100: 35229433 Random Variale Random Process

Chi-square random variable

A random variable is called a Chi-square random variable with degrees of freedom if its

PDF is given by

n

2/ 2 1

/ 2/ 2 0

( ) 2 ( / 2)0 0

nx

n nX

x e xf x n

x

σ

σ

−−⎧

>⎪= Γ⎨⎪ <⎩

with the parameter 0σ > and (.)Γ denoting the gamma function. A Chi-square random

variable with n degrees of freedom is denoted by 2.nχ

Note that a22χ RV is the exponential RV.

The pdf of 2nχ RVs with different degrees of freedom is shown in Fig. below:

Page 101: 35229433 Random Variale Random Process

Mean and variance of the chi-square random variable

2

2

/ 2 1/ 2

/ 20

/ 2/ 2

/ 20

2 / 2 / 22 2

/ 20

2

2

( )

2 ( / 2)

2 ( / 2)(2 ) (2 ) ( Substituting / 2 )

2 ( / 2)2 [( 2) / 2]

( / 2)2 / 2 ( / 2)

( / 2)

X X

nx

n n

nx

n n

n nu

n n

xf x dx

xx e dxn

x e dxnu e du u xn

nn

n nn

σ

σ

µ

σ

σσ σ σσ

σ

σ

−∞

−∞−

∞−

∞−

= ∫

= ∫Γ

= ∫Γ

= =∫Γ

Γ +=

Γ

Γ=

Γ2 nσ=

Similarly,

2

2

2 2

/ 2 12 / 2

/ 20

( 2) / 2/ 2

/ 20

2 ( 2) / 2 / 22 2

/ 20

4

4

( )

2 ( / 2)

2 ( / 2)(2 ) (2 ) ( Substituting / 2 )2 ( / 2)

4 [( 4) / 2] ( / 2)

4 [(

X

nx

n n

nx

n n

n nu

n n

EX x f x dx

xx e dxn

x e dxn

u e du u xn

nn

n

σ

σ

σ

σσ σ σσ

σ

σ

−∞

−∞−

+∞−

+∞−

= ∫

= ∫Γ

= ∫Γ

= =∫Γ

Γ +=

Γ

+=

4

2 2 2 4 4 4

2) / 2] / 2 ( / 2)( / 2)

( 2)( 2) 2X X

n nn

n nEX n n n n

σ

σ µ σ σ σ

ΓΓ

= +

= − = + − =

A random variable Let be independent zero-mean Gaussian variables each with

variance Then has distribution with mean

X . . . , XX n21

.2Xσ 2 2

1 2 nY X X . . . X = + + + 2 2nχ

2nσ and varaiance

42n .σ This result gives an application of the chi-square random variable.

Page 102: 35229433 Random Variale Random Process

Relation between the chi-square distribution and the Gaussian distribution

A random variable Let be independent zero-mean Gaussian variables each with

variance Then has distribution with mean

X . . . , XX n21

.2Xσ 2 2 2

1 2 nY X X . . . X = + + + 2nχ

Rayleigh Random Variable A Rayleigh random variable is characterized by the PDF

0 0

0 )(

22 2/2

⎪⎩

⎪⎨⎧

<

>=

x

xexxf

x

X

σ

σ

where σ is the parameter of the random variable. The probability density functions for the Rayleigh RVs are illustrated in Fig.

Mean and variance of the Rayleigh distribution

Page 103: 35229433 Random Variale Random Process

2 2

2 2

/ 22

0

2/ 2

0

2

( )

22

22

2

X

x

x

EX xf x dx

xx e dx

x e d

σ

σ

σ

πσ πσ

π σσπ σ

−∞

∞−

∞−

=

=

=

=

=

∫ x

Similarly

2 2

2 2

2 / 22

0

22

20

2

0

2 2 2

( )

2 )2

2

2 ( )2

(2 )2

( Substituting

( Noting that is the mean of the exponential RV with =1)

X

x

u

u

X

EX x f x dx

xx e dx

xue du u

ue du

σ

σ

σσ

σ λ

πσ σ σ

π σ

−∞

∞−

∞−

∞−

=

=

= =

=

∴ = −

= −

2

Relation between the Rayleigh distribution and the Gaussian distribution A Rayleigh RV is related to Gaussian RVs as follow: Let 1X and 2X be two independent zero-mean Gaussian RVs with a common variance 2 .σ Then

21X X X= + 2

2 has the Rayleigh pdf with the parameter .σ We shall prove this result in a later class. This important result also suggests the cases where the Rayleigh RV can be used. Application of the Rayleigh RV

• Modelling the root mean square error- • Modelling the envelope of a signal with two orthogonal components as in the case of a signal

of the following form: 1 1( ) cos sins t X wt Y wt= +

If 2

1 ~ (0, )X N σ and 22 ~ (0, )X N σ are independent, then the envelope 21X X X= + 2

2 has the Rayleigh distribution.

Page 104: 35229433 Random Variale Random Process
Page 105: 35229433 Random Variale Random Process

Simulation of Random Variables • In many fields of science and engineering, computer simulation is used to study

random phenomena in nature and the performance of an engineering system in a noisy environment. For example, we may study through computer simulation the performance of a communication receiver. Sometimes a probability model may not be analytically tractable and computer simulation is used to calculate probabilities.

• The heart of all these application is that it is possible to simulate a random variable with an empirical CDF or pdf that fits well with the theoretical CDF or pdf.

Generation of Random Numbers

Generation of random numbers means producing a sequence of independent random numbers with a specified CDF or pdf. All the random number generators rely on a routine to generate random numbers with the uniform pdf. Such routine is of vital importance because the quality of the generated random numbers with any other distribution depends on it. By the quality of the generated random numbers we mean how closely the empirical CDF or PDF fits the true one.

There are several algorithms to generate random numbers. Note that these algorithms generate random number by a reproducible deterministic method. These numbers are pseudo random numbers because they are reproducible and the same sequence of numbers repeats after some period of count specific to the generating algorithm. This period is very high and a finite sample of data within the period appears to be uniformly distributed. We will not discuss about these algorithms. Software packages provide routines to generate such numbers.

[0 1]U

Method of Inverse transform

Suppose we want to generate a random variable X with a prescribed distribution function We have observed that the random variable defined by

Thus given random number ( ).XF x ,Y

( ) ~ [ 0, 1 ]..XY F X U= [ 0, 1 ].U ,Ythe inverse transform 1 ( )XX F Y−= will have the CDF ( ).XF x

Page 106: 35229433 Random Variale Random Process

The algorithmic steps for the inverse transform method are as follows:

1. Generate a random number from Call it ~ [0, 1].Y U .y . 2. Compute the value such that x ( x) =y.XF 3. Take to be the random number generated. x

Example Suppose, we want to generate a random variable with the pdf ( )Xf x ribution given by

2 0 3( ) 9

0 otherwiseX

x xf x

⎧ ≤ ≤⎪= ⎨⎪⎩

The CDF of X is given by

2

01( ) 0 391 x>0

XF x x x

⎧⎪⎪= ≤⎨⎪⎪⎩

Page 107: 35229433 Random Variale Random Process

x

FX(x)

x2/9

1

3

Therefore, we generate a random number from the distribution and set y [0, 1]U

( ) .XF x y= We have

21 9

x= 9y

x y=

Example Suppose, we want to generate a random variable with the exponential distribution given by ( ) 0, 0.X

xf x e xλλ λ−= > ≥ Then ( ) 1X

xF x e λ−= − Therefore given y, we can get by the mapping x

( )1

log 1

x

e

e yy

x

λ

λ

−− =

−= −

Since 1 is also uniformly distributed over [0, 1], the above expression can be simplified as,

y−

log yxλ

= −

Page 108: 35229433 Random Variale Random Process

Generation of Gaussian random numbers Generation of discrete random variables We observed that the CDF of a discrete random variable is also distributed. [0, 1]USuppose X is a discrete random variable with the probability mass function

( ), 1, 2,.. .X ip x i n= Given the inverse mapping is defined as shown in the Fig. below.

( ) ,Xy F x=

( )XY F X=

y

Thus if en 1( ) ( )X k X kF x y F x− ≤ < , th 1 ( )X kF y x− =

The algorithmic steps for the inverse transform method for simulating discrete random variables are as follows:

1. Generate a random number fromY U Call it ~ [0, 1]. .y2. Compute the value kx such that 1( ) ( )X k X kF x y F x− ,≤ < 3. Take kx to be the random number generated.

Example Generation of Bernoulli random numbers Suppose we want to generate a random number from ~ ( ).X Br p Generate from the

distribution. Set y

[0, 1]U

0 for 11 otherwise

y px

≤ −⎧= ⎨⎩

1 ( )k Xx F y−= X

Page 109: 35229433 Random Variale Random Process

1 p−

0 x

y

1

Page 110: 35229433 Random Variale Random Process

Jointly distributed random variables

We may define two or more random variables on the same sample space. Let X and

be two real random variables defined on the same probability space ( , The mapping such that for is called a joint random variable.

Y , ).S PF2S → 2, ( ( ), ( ))s S X s Y s∈ ∈

2R

S

( ( ), ( ))X s Y s

Figure

s •

Remark • The above figure illustrates the mapping corresponding

variable. The joint random variable in the above case id d• We may represent a joint random variable as a two-dime

[ Y] .X ′=X• We can extend the above definition to define joint random

dimension.The mapping such that for is called a n-dimes

and denoted by the vector

nS →

1 2, ( ( ), ( ),...., ( )) nns S X s X s X s∈ ∈

1 2[ .... ] .nX X X ′=X

Example1: Suppose we are interested in studying the height andin a class. We can define the joint RV ( , )X Y where X

represents the weight. Y Example 2 Suppose in a communication system X is the tra

is the corresponding noisy received signal. Then Y ( , )X Y isvariable.

Joint Probability Distribution Function Recall the definition the distribution of a single random X x≤ was used to define the probability distribution

we can find the probability of any event involvinSimilarly, for two random variables

( ),XF xX and

, X x Y y X x Y y≤ ≤ = ≤ ∩ ≤ is considered as the repre

( )Y s

e

to a joint random enoted by ( , ).X Y nsional vector

variables of any

ional random variabl

weight of the studerepresents height a

nsmitted signal and a joint random

variable. The evfunction Givg the random variab

( ).XF x

the ev,Y

sentative event.

Joint Random Variabl

e

nts nd

ent en le. ent

Page 111: 35229433 Random Variale Random Process

The probability is called the joint distribution function of the random variables

2 , ( , )P X x Y y x y≤ ≤ ∀ ∈X and Y and denoted by ).,(, yxF YX

( , )x y

X

Y

, ( , )X YF x y satisfies the following properties: • , 1 1 , 2 2 1 2 1( , ) ( , ) if and yX Y X YF x y F x y x x y≤ ≤ 2≤

2

1 2 1 2

1 1 2 2

1 1 2

, 1 , 2 2

If and y , , ,

, ,( , ) ( , )X Y X Y

x x yX x Y y X x Y yP X x Y y P X x Y yF x y F x y

< <≤ ≤ ⊆ ≤ ≤

∴ ≤ ≤ ≤ ≤ ≤∴ ≤

2 2( , )x y

• , ,( , ) ( , ) 0X Y X YF y F x−∞ = −∞ =

Note that , X Y y X≤ −∞ ≤ ⊆ ≤ −∞

f and y ,

• , ( , ) 1.X YF ∞ ∞ =

• is right continuous in both the variables. , ( , )X YF x y

• I I 1 2 1 2x x y< <

.1 2 1 2 , 2 2 , 1 2 , 2 1 , 1 1 , y ( , ) ( , ) ( , ) ( , ) 0X Y X Y X Y X YP x X x Y y F x y F x y F x y F x y< ≤ < ≤ = − − + ≥

1 1( , )x y


Jointly distributed random variables

We may define two or more random variables on the same sample space. Let X and Y be two real random variables defined on the same probability space (S, F, P). The mapping S → R^2 such that for s ∈ S, (X(s), Y(s)) ∈ R^2 is called a joint random variable.

[Figure: a sample point s ∈ S mapped to the point (X(s), Y(s)) in the X−Y plane]

Remark
• The above figure illustrates the mapping corresponding to a joint random variable. The joint random variable in the above case is denoted by (X, Y).
• We may represent a joint random variable as a two-dimensional vector X = [X Y]'.
• We can extend the above definition to define joint random variables of any dimension. The mapping S → R^n such that for s ∈ S, (X_1(s), X_2(s), ..., X_n(s)) ∈ R^n is called an n-dimensional random variable and is denoted by the vector X = [X_1 X_2 ... X_n]'.

Example 1: Suppose we are interested in studying the height and weight of the students in a class. We can define the joint RV (X, Y), where X represents height and Y represents weight.

Example 2: Suppose in a communication system X is the transmitted signal and Y is the corresponding noisy received signal. Then (X, Y) is a joint random variable.

Joint Probability Distribution Function

Recall the definition of the distribution function of a single random variable: the event {X ≤ x} was used to define the probability distribution function F_X(x), and given F_X(x) we can find the probability of any event involving the random variable. Similarly, for two random variables X and Y, the event {X ≤ x, Y ≤ y} = {X ≤ x} ∩ {Y ≤ y} is considered as the representative event.
The probability P{X ≤ x, Y ≤ y}, ∀(x, y) ∈ R^2, is called the joint distribution function of the random variables X and Y and is denoted by F_{X,Y}(x, y).

F_{X,Y}(x, y) satisfies the following properties:

• F_{X,Y}(x_1, y_1) ≤ F_{X,Y}(x_2, y_2) if x_1 ≤ x_2 and y_1 ≤ y_2.
  This is because, if x_1 ≤ x_2 and y_1 ≤ y_2, then {X ≤ x_1, Y ≤ y_1} ⊆ {X ≤ x_2, Y ≤ y_2},
  so that P{X ≤ x_1, Y ≤ y_1} ≤ P{X ≤ x_2, Y ≤ y_2} and hence F_{X,Y}(x_1, y_1) ≤ F_{X,Y}(x_2, y_2).

• F_{X,Y}(−∞, y) = F_{X,Y}(x, −∞) = 0.
  Note that {X ≤ −∞, Y ≤ y} ⊆ {X ≤ −∞}.

• F_{X,Y}(∞, ∞) = 1.

• F_{X,Y}(x, y) is right continuous in both the variables.

• If x_1 < x_2 and y_1 < y_2,
  P{x_1 < X ≤ x_2, y_1 < Y ≤ y_2} = F_{X,Y}(x_2, y_2) − F_{X,Y}(x_1, y_2) − F_{X,Y}(x_2, y_1) + F_{X,Y}(x_1, y_1) ≥ 0.

[Figure: the rectangle with corners (x_1, y_1) and (x_2, y_2) in the X−Y plane]
Given F_{X,Y}(x, y), −∞ < x < ∞, −∞ < y < ∞, we have a complete description of the random variables X and Y.

• F_X(x) = F_{X,Y}(x, +∞). To prove this,
  {X ≤ x} = {X ≤ x} ∩ {Y ≤ +∞}
  ∴ F_X(x) = P{X ≤ x} = P{X ≤ x, Y ≤ +∞} = F_{X,Y}(x, +∞).
  Similarly F_Y(y) = F_{X,Y}(∞, y).

• Given F_{X,Y}(x, y), −∞ < x < ∞, −∞ < y < ∞, each of F_X(x) and F_Y(y) is called a marginal distribution function.

Example
Consider two jointly distributed random variables X and Y with the joint CDF

  F_{X,Y}(x, y) = (1 − e^{−2x})(1 − e^{−y}),  x ≥ 0, y ≥ 0
              = 0                              otherwise

(a) Find the marginal CDFs. (b) Find the probability P{1 < X ≤ 2, 1 < Y ≤ 2}.

(a) F_X(x) = lim_{y→∞} F_{X,Y}(x, y) = 1 − e^{−2x},  x ≥ 0;  0 elsewhere.
    F_Y(y) = lim_{x→∞} F_{X,Y}(x, y) = 1 − e^{−y},   y ≥ 0;  0 elsewhere.

(b) P{1 < X ≤ 2, 1 < Y ≤ 2} = F_{X,Y}(2, 2) + F_{X,Y}(1, 1) − F_{X,Y}(1, 2) − F_{X,Y}(2, 1)
    = (1 − e^{−4})(1 − e^{−2}) + (1 − e^{−2})(1 − e^{−1}) − (1 − e^{−2})(1 − e^{−2}) − (1 − e^{−4})(1 − e^{−1})
    = 0.0272
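Part (b) is easy to check numerically. A minimal Python sketch is given below; the helper name joint_cdf is ours, introduced only for this illustration.

import math

def joint_cdf(x, y):
    # F_{X,Y}(x, y) = (1 - e^{-2x})(1 - e^{-y}) for x >= 0, y >= 0, else 0
    if x < 0 or y < 0:
        return 0.0
    return (1 - math.exp(-2 * x)) * (1 - math.exp(-y))

# P{1 < X <= 2, 1 < Y <= 2} from the rectangle formula
p = joint_cdf(2, 2) - joint_cdf(1, 2) - joint_cdf(2, 1) + joint_cdf(1, 1)
print(round(p, 4))   # 0.0272, matching the value computed above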

Jointly distributed discrete random variables

If X and Y are two discrete random variables defined on the same probability space (S, F, P) such that X takes values from the countable subset R_X and Y takes values from the countable subset R_Y, then the joint random variable (X, Y) takes values from the countable subset R_X × R_Y. The joint random variable (X, Y) is completely specified by the joint probability mass function

  p_{X,Y}(x, y) = P{s | X(s) = x, Y(s) = y},  ∀(x, y) ∈ R_X × R_Y.

Given p_{X,Y}(x, y), we can determine other probabilities involving the random variables X and Y.

Remark
• p_{X,Y}(x, y) = 0 for (x, y) ∉ R_X × R_Y.
• Σ_{(x,y) ∈ R_X × R_Y} p_{X,Y}(x, y) = 1. This is because
  Σ_{(x,y) ∈ R_X × R_Y} p_{X,Y}(x, y) = P(R_X × R_Y) = P{s | (X(s), Y(s)) ∈ R_X × R_Y} = P(S) = 1.
• Marginal Probability Mass Functions: The probability mass functions p_X(x) and p_Y(y) are obtained from the joint probability mass function as follows:
  p_X(x) = P{X = x} = Σ_{y ∈ R_Y} p_{X,Y}(x, y)
  and similarly
  p_Y(y) = Σ_{x ∈ R_X} p_{X,Y}(x, y)
  These probability mass functions p_X(x) and p_Y(y) obtained from the joint probability mass function are called marginal probability mass functions.

Example Consider the random variables X and Y with the joint probability mass function tabulated below. The marginal probabilities are shown in the last column and the last row.

            X = 0    X = 1    X = 2    p_Y(y)
  Y = 0      0.25     0.10     0.15     0.5
  Y = 1      0.14     0.35     0.01     0.5
  p_X(x)     0.39     0.45     0.16
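The marginal sums above are easy to verify programmatically. A small illustrative sketch that stores the table as a NumPy array indexed as [y, x] and sums out one variable at a time:

import numpy as np

# joint pmf p_{X,Y}(x, y); rows are y = 0, 1 and columns are x = 0, 1, 2
p_xy = np.array([[0.25, 0.10, 0.15],
                 [0.14, 0.35, 0.01]])

p_x = p_xy.sum(axis=0)   # marginal of X: sum over y -> [0.39, 0.45, 0.16]
p_y = p_xy.sum(axis=1)   # marginal of Y: sum over x -> [0.5, 0.5]
print(p_x, p_y, p_xy.sum())   # the last value should be 1.0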

Joint Probability Density Function

If X and Y are two continuous random variables and their joint distribution function is continuous in both x and y, then we can define the joint probability density function f_{X,Y}(x, y) by

  f_{X,Y}(x, y) = ∂²F_{X,Y}(x, y)/∂x∂y,  provided it exists.

Clearly

  F_{X,Y}(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{y} f_{X,Y}(u, v) dv du

Properties of Joint Probability Density Function
• f_{X,Y}(x, y) is always a non-negative quantity. That is, f_{X,Y}(x, y) ≥ 0 ∀(x, y) ∈ R^2.
• ∫_{−∞}^{∞} ∫_{−∞}^{∞} f_{X,Y}(x, y) dx dy = 1.
• The probability of any Borel set B can be obtained by
  P(B) = ∫∫_{(x,y) ∈ B} f_{X,Y}(x, y) dx dy.

Marginal density functions
The marginal density functions f_X(x) and f_Y(y) of two joint RVs X and Y are given by the derivatives of the corresponding marginal distribution functions. Thus

  f_X(x) = (d/dx) F_X(x)
        = (d/dx) F_{X,Y}(x, ∞)
        = (d/dx) ∫_{−∞}^{x} ( ∫_{−∞}^{∞} f_{X,Y}(u, y) dy ) du
        = ∫_{−∞}^{∞} f_{X,Y}(x, y) dy

and similarly f_Y(y) = ∫_{−∞}^{∞} f_{X,Y}(x, y) dx.

Remark
• The marginal CDF and pdf are the same as the CDF and pdf of the concerned single random variable. The term "marginal" simply indicates that it is derived from the corresponding joint distribution or density function of two or more jointly distributed random variables.
• With the help of the two-dimensional Dirac delta function, we can define the joint pdf of two jointly distributed discrete random variables. Thus, for discrete jointly distributed random variables X and Y,

  f_{X,Y}(x, y) = Σ_{(x_i, y_j) ∈ R_X × R_Y} p_{X,Y}(x_i, y_j) δ(x − x_i, y − y_j).

Example The joint density function f_{X,Y}(x, y) in the previous example is

  f_{X,Y}(x, y) = ∂²F_{X,Y}(x, y)/∂x∂y
              = ∂²[(1 − e^{−2x})(1 − e^{−y})]/∂x∂y
              = 2 e^{−2x} e^{−y},  x ≥ 0, y ≥ 0.

Example: The joint pdf of two random variables X and Y is given by

  f_{X,Y}(x, y) = c x y,  0 ≤ x ≤ 2, 0 ≤ y ≤ 2
              = 0        otherwise

(i) Find c. (ii) Find F_{X,Y}(x, y). (iii) Find f_X(x) and f_Y(y). (iv) What is the probability P(0 < X ≤ 1, 0 < Y ≤ 1)?

(i)  ∫_{−∞}^{∞} ∫_{−∞}^{∞} f_{X,Y}(x, y) dy dx = c ∫_{0}^{2} ∫_{0}^{2} xy dy dx = 4c = 1,  ∴ c = 1/4.

(ii) F_{X,Y}(x, y) = (1/4) ∫_{0}^{x} ∫_{0}^{y} uv dv du = x²y²/16,  0 ≤ x ≤ 2, 0 ≤ y ≤ 2.

(iii) f_X(x) = ∫_{0}^{2} (xy/4) dy = x/2,  0 ≤ x ≤ 2.
      Similarly f_Y(y) = y/2,  0 ≤ y ≤ 2.
(iv) P(0 < X ≤ 1, 0 < Y ≤ 1) = F_{X,Y}(1, 1) + F_{X,Y}(0, 0) − F_{X,Y}(0, 1) − F_{X,Y}(1, 0)
    = 1/16 + 0 − 0 − 0
    = 1/16
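The constant c and the probability in (iv) can also be checked symbolically. The sketch below uses SymPy and is purely illustrative of the computation in this example:

import sympy as sp

x, y, c = sp.symbols('x y c', positive=True)
f = c * x * y

# (i) total probability over the square [0, 2] x [0, 2] must equal 1
c_val = sp.solve(sp.integrate(f, (x, 0, 2), (y, 0, 2)) - 1, c)[0]
print(c_val)          # 1/4

# (iv) P(0 < X <= 1, 0 < Y <= 1) by integrating the pdf over (0, 1] x (0, 1]
p = sp.integrate(f.subs(c, c_val), (x, 0, 1), (y, 0, 1))
print(p)              # 1/16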

Conditional probability mass functions

  p_{Y/X}(y/x) = P(Y = y / X = x)
             = P(X = x, Y = y)/P(X = x)
             = p_{X,Y}(x, y)/p_X(x),  provided p_X(x) ≠ 0.

Similarly we can define the conditional probability mass function p_{X/Y}(x/y).

• From the definition of conditional probability mass functions, we can define two independent random variables. Two discrete random variables X and Y are said to be independent if and only if
  p_{Y/X}(y/x) = p_Y(y),  so that  p_{X,Y}(x, y) = p_X(x) p_Y(y).

• Bayes' rule:
  p_{X/Y}(x/y) = P(X = x / Y = y)
             = P(X = x, Y = y)/P(Y = y)
             = p_{X,Y}(x, y)/p_Y(y)
             = p_{X,Y}(x, y) / Σ_{x ∈ R_X} p_{X,Y}(x, y)

Example Consider the random variables X and Y with the joint probability mass function tabulated below.

            X = 0    X = 1    X = 2    p_Y(y)
  Y = 0      0.25     0.10     0.15     0.5
  Y = 1      0.14     0.35     0.01     0.5
  p_X(x)     0.39     0.45     0.16

The marginal probabilities are shown in the last column and the last row.
For instance,

  p_{Y/X}(1/0) = p_{X,Y}(0, 1)/p_X(0) = 0.14/0.39 ≈ 0.36
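Continuing with the same table, a conditional pmf is just a normalised column of the joint table. A short illustrative sketch:

import numpy as np

p_xy = np.array([[0.25, 0.10, 0.15],    # row y = 0
                 [0.14, 0.35, 0.01]])   # row y = 1
p_x = p_xy.sum(axis=0)                  # [0.39, 0.45, 0.16]

# conditional pmf of Y given X = 0: divide the x = 0 column by p_X(0)
p_y_given_x0 = p_xy[:, 0] / p_x[0]
print(p_y_given_x0)   # [0.641..., 0.358...]; the second entry is 0.14/0.39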

Conditional Probability Density Function

f_{Y/X}(y/x) = f_{Y/X}(y / X = x) is called the conditional density of Y given X.

Let us define the conditional distribution function. We cannot define the conditional distribution function for continuous random variables X and Y by the relation

  F_{Y/X}(y/x) = P(Y ≤ y / X = x) = P(Y ≤ y, X = x)/P(X = x)

as both the numerator and the denominator are zero for the above expression. The conditional distribution function is instead defined in the limiting sense as follows:

  F_{Y/X}(y/x) = lim_{Δx→0} P(Y ≤ y / x < X ≤ x + Δx)
             = lim_{Δx→0} P(Y ≤ y, x < X ≤ x + Δx)/P(x < X ≤ x + Δx)
             = lim_{Δx→0} [ ∫_{−∞}^{y} f_{X,Y}(x, u) du Δx ] / [ f_X(x) Δx ]
             = ∫_{−∞}^{y} f_{X,Y}(x, u) du / f_X(x)

The conditional density is defined in the limiting sense as follows:

  f_{Y/X}(y / X = x) = lim_{Δy→0} [ F_{Y/X}(y + Δy / X = x) − F_{Y/X}(y / X = x) ] / Δy
                    = lim_{Δy→0, Δx→0} [ F_{Y/X}(y + Δy / x < X ≤ x + Δx) − F_{Y/X}(y / x < X ≤ x + Δx) ] / Δy        (1)

because {X = x} = lim_{Δx→0} {x < X ≤ x + Δx}. The right-hand side of equation (1) is
  lim_{Δy→0, Δx→0} [ F_{Y/X}(y + Δy / x < X ≤ x + Δx) − F_{Y/X}(y / x < X ≤ x + Δx) ] / Δy
  = lim_{Δy→0, Δx→0} P(y < Y ≤ y + Δy / x < X ≤ x + Δx) / Δy
  = lim_{Δy→0, Δx→0} P(y < Y ≤ y + Δy, x < X ≤ x + Δx) / [ P(x < X ≤ x + Δx) Δy ]
  = lim_{Δy→0, Δx→0} f_{X,Y}(x, y) Δx Δy / [ f_X(x) Δx Δy ]
  = f_{X,Y}(x, y)/f_X(x)

  ∴ f_{Y/X}(y/x) = f_{X,Y}(x, y)/f_X(x)        (2)

Similarly we have

  f_{X/Y}(x/y) = f_{X,Y}(x, y)/f_Y(y)        (3)
• Two random variables X and Y are statistically independent if, for all (x, y) ∈ R^2,

  f_{Y/X}(y/x) = f_Y(y),  or equivalently  f_{X,Y}(x, y) = f_X(x) f_Y(y)        (4)

Bayes' rule for continuous random variables: From (2) and (3) we get Bayes' rule
  f_{X/Y}(x/y) = f_{X,Y}(x, y)/f_Y(y)
             = f_X(x) f_{Y/X}(y/x)/f_Y(y)
             = f_X(x) f_{Y/X}(y/x) / ∫_{−∞}^{∞} f_{X,Y}(x, y) dx
             = f_{Y/X}(y/x) f_X(x) / ∫_{−∞}^{∞} f_X(u) f_{Y/X}(y/u) du        (5)
Given the joint density function, we can find the conditional density function, as the following example shows.

Example: For random variables X and Y, the joint probability density function is given by

  f_{X,Y}(x, y) = (1 + xy)/4,  |x| ≤ 1, |y| ≤ 1
              = 0             otherwise

Find the marginal densities f_X(x), f_Y(y) and the conditional density f_{Y/X}(y/x). Are X and Y independent?

  f_X(x) = ∫_{−1}^{1} (1 + xy)/4 dy = 1/2,  −1 ≤ x ≤ 1

Similarly

  f_Y(y) = 1/2,  −1 ≤ y ≤ 1

and

  f_{Y/X}(y/x) = f_{X,Y}(x, y)/f_X(x) = (1 + xy)/2,  |y| ≤ 1.

Since f_{X,Y}(x, y) = (1 + xy)/4 ≠ f_X(x) f_Y(y) = 1/4 (except where xy = 0), X and Y are not independent.
Bayes' Rule for mixed random variables

Let X be a discrete random variable with probability mass function p_X(x) and let Y be a continuous random variable with the conditional probability density function f_{Y/X}(y/x). In practical problems we may have to estimate X from the observed Y. Then
  p_{X/Y}(x/y) = lim_{Δy→0} P(X = x / y < Y ≤ y + Δy)
             = lim_{Δy→0} P(X = x, y < Y ≤ y + Δy)/P(y < Y ≤ y + Δy)
             = lim_{Δy→0} p_X(x) f_{Y/X}(y/x) Δy / [ f_Y(y) Δy ]
             = p_X(x) f_{Y/X}(y/x)/f_Y(y)
             = p_X(x) f_{Y/X}(y/x) / Σ_x p_X(x) f_{Y/X}(y/x)
Example

X is a binary random variable with

  X = 1   with probability p
    = −1  with probability 1 − p

and V is Gaussian noise with mean 0 and variance σ². The observation is Y = X + V. Then

  p_{X/Y}(1/y) = p_X(1) f_{Y/X}(y/1) / Σ_x p_X(x) f_{Y/X}(y/x)
             = p e^{−(y−1)²/2σ²} / [ p e^{−(y−1)²/2σ²} + (1 − p) e^{−(y+1)²/2σ²} ]
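The posterior probability above is the quantity a detector would threshold. A small illustrative sketch (the helper name posterior_x1 is ours, not from the notes):

import math

def posterior_x1(y, p=0.5, sigma=1.0):
    # P(X = 1 / Y = y) for X in {+1, -1} observed through additive Gaussian noise
    num = p * math.exp(-(y - 1) ** 2 / (2 * sigma ** 2))
    den = num + (1 - p) * math.exp(-(y + 1) ** 2 / (2 * sigma ** 2))
    return num / den

print(posterior_x1(0.0))   # 0.5: y = 0 carries no information when p = 0.5
print(posterior_x1(1.5))   # about 0.95: a large positive observation favours X = +1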

Independent Random Variables

Let X and Y be two random variables characterised by the joint distribution function

  F_{X,Y}(x, y) = P{X ≤ x, Y ≤ y}

and the corresponding joint density function f_{X,Y}(x, y) = ∂²F_{X,Y}(x, y)/∂x∂y.

Then X and Y are independent if, ∀(x, y) ∈ R^2, {X ≤ x} and {Y ≤ y} are independent events. Thus,

  F_{X,Y}(x, y) = P{X ≤ x, Y ≤ y} = P{X ≤ x} P{Y ≤ y} = F_X(x) F_Y(y)

  ∴ f_{X,Y}(x, y) = ∂²F_{X,Y}(x, y)/∂x∂y = f_X(x) f_Y(y)

and equivalently f_{Y/X}(y/x) = f_Y(y).

Remark: Suppose X and Y are two discrete random variables with joint probability mass function p_{X,Y}(x, y). Then X and Y are independent if

  p_{X,Y}(x, y) = p_X(x) p_Y(y)  ∀(x, y) ∈ R_X × R_Y
Transformation of two random variables

We are often interested in finding out the probability density function of a function of two or more RVs. Following are a few examples.

• The received signal of a communication receiver is given by Z = X + Y, where Z is the received signal, the superposition of the message signal X and the noise Y.
• Frequently applied operations on communication signals like modulation, demodulation, correlation, etc. involve the multiplication of two signals in the form Z = XY.

We have to know the probability distribution of Z in any analysis of Z. More formally, given two random variables X and Y with joint probability density function f_{X,Y}(x, y) and a function Z = g(X, Y), we have to find f_Z(z). In this lecture, we will try to address this problem.

Probability density of a function of two random variables:

We consider the transformation g: R^2 → R. Consider the event {Z ≤ z} corresponding to each z. We can find a subset D_z ⊆ R^2 such that D_z = {(x, y) | g(x, y) ≤ z}.
  ∴ F_Z(z) = P(Z ≤ z)
          = P{(x, y) | (x, y) ∈ D_z}
          = ∫∫_{(x,y) ∈ D_z} f_{X,Y}(x, y) dy dx

  ∴ f_Z(z) = dF_Z(z)/dz
Example: Suppose Z = X + Y. Find the pdf f_Z(z).

Z ≤ z ⇒ X + Y ≤ z. Therefore D_z is the half-plane x + y ≤ z, shaded in the figure.

  ∴ F_Z(z) = ∫∫_{(x,y) ∈ D_z} f_{X,Y}(x, y) dx dy
          = ∫_{−∞}^{∞} [ ∫_{−∞}^{z−x} f_{X,Y}(x, y) dy ] dx
          = ∫_{−∞}^{∞} [ ∫_{−∞}^{z} f_{X,Y}(x, u − x) du ] dx    (substituting u = y + x)
          = ∫_{−∞}^{z} [ ∫_{−∞}^{∞} f_{X,Y}(x, u − x) dx ] du    (interchanging the order of integration)

  ∴ f_Z(z) = (d/dz) ∫_{−∞}^{z} [ ∫_{−∞}^{∞} f_{X,Y}(x, u − x) dx ] du
          = ∫_{−∞}^{∞} f_{X,Y}(x, z − x) dx

If X and Y are independent,

  f_{X,Y}(x, z − x) = f_X(x) f_Y(z − x)

  ∴ f_Z(z) = ∫_{−∞}^{∞} f_X(x) f_Y(z − x) dx = f_X(z) * f_Y(z)

where * is the convolution operation.
Example: Suppose X and Y are independent random variables, each uniformly distributed over (a, b), with f_X(x) and f_Y(y) as shown in the figure below.

The pdf of Z = X + Y is then the triangular probability density function shown in the figure, obtained by convolving the two rectangular pdfs.
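The triangular shape is easy to reproduce numerically by discretising the two rectangular pdfs and convolving them. A minimal sketch (the grid choices are arbitrary):

import numpy as np

a, b = 0.0, 1.0
dx = 0.001
x = np.arange(a, b, dx)
f_x = np.ones_like(x) / (b - a)           # uniform pdf on (a, b)
f_y = np.ones_like(x) / (b - a)

f_z = np.convolve(f_x, f_y) * dx          # pdf of Z = X + Y, supported on (2a, 2b)
z = np.arange(2 * a, 2 * a + len(f_z) * dx, dx)[:len(f_z)]
print(f_z.max())                          # near 1/(b - a): the triangle apex
print(z[np.argmax(f_z)])                  # near a + b, where the apex sits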

Probability density function of Z = XY

  F_Z(z) = ∫∫_{(x,y) ∈ D_z} f_{X,Y}(x, y) dy dx

For x > 0 the event {XY ≤ z} corresponds to y ≤ z/x (and to y ≥ z/x for x < 0). Substituting u = xy, du = x dy, the integral can be written as

  F_Z(z) = ∫_{−∞}^{∞} ∫_{−∞}^{z} (1/|x|) f_{X,Y}(x, u/x) du dx
  ∴ f_Z(z) = (d/dz) F_Z(z) = ∫_{−∞}^{∞} (1/|x|) f_{X,Y}(x, z/x) dx
                          = ∫_{−∞}^{∞} (1/|y|) f_{X,Y}(z/y, y) dy

Probability density function of Z = Y/X

  Z ≤ z  ⇒  Y/X ≤ z  ⇒  Y ≤ xz for x > 0 (and Y ≥ xz for x < 0)

  ∴ D_z = {(x, y) | y/x ≤ z}

  ∴ F_Z(z) = ∫∫_{(x,y) ∈ D_z} f_{X,Y}(x, y) dy dx
          = ∫_{−∞}^{∞} ∫_{−∞}^{z} |x| f_{X,Y}(x, ux) du dx    (substituting u = y/x)

  ∴ f_Z(z) = (d/dz) F_Z(z) = ∫_{−∞}^{∞} |x| f_{X,Y}(x, xz) dx

Suppose X and Y are independent random variables. Then f_{X,Y}(x, xz) = f_X(x) f_Y(xz) and

  f_Z(z) = ∫_{−∞}^{∞} |x| f_X(x) f_Y(xz) dx

Example:
Suppose X and Y are independent zero-mean Gaussian random variables, each with unit standard deviation, and Z = Y/X. Then

  f_Z(z) = ∫_{−∞}^{∞} |x| (1/√(2π)) e^{−x²/2} (1/√(2π)) e^{−x²z²/2} dx
        = (1/2π) ∫_{−∞}^{∞} |x| e^{−(1+z²)x²/2} dx
        = (1/π) ∫_{0}^{∞} x e^{−(1+z²)x²/2} dx
        = 1/[π(1 + z²)]

which is the Cauchy probability density function.
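A quick Monte Carlo sanity check of this result (illustrative only): the empirical density of Y/X near zero should match the Cauchy value 1/π.

import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x = rng.standard_normal(n)
y = rng.standard_normal(n)
z = y / x

half_width = 0.05
est = np.mean(np.abs(z) < half_width) / (2 * half_width)   # empirical density near z = 0
print(est, 1 / np.pi)                                       # both close to 0.318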

Probability density function of Z = √(X² + Y²)

  D_z = {(x, y) | Z ≤ z}
     = {(x, y) | √(x² + y²) ≤ z}
     = {(r, θ) | 0 ≤ r ≤ z, 0 ≤ θ ≤ 2π}   (in polar coordinates)

  ∴ F_Z(z) = ∫∫_{(x,y) ∈ D_z} f_{X,Y}(x, y) dy dx
          = ∫_{0}^{2π} ∫_{0}^{z} f_{X,Y}(r cos θ, r sin θ) r dr dθ

  ∴ f_Z(z) = (d/dz) F_Z(z) = ∫_{0}^{2π} f_{X,Y}(z cos θ, z sin θ) z dθ

Example Suppose X and Y are two independent Gaussian random variables, each with mean 0 and variance σ², and Z = √(X² + Y²). Then
  f_Z(z) = ∫_{0}^{2π} f_{X,Y}(z cos θ, z sin θ) z dθ
        = z ∫_{0}^{2π} f_X(z cos θ) f_Y(z sin θ) dθ
        = z ∫_{0}^{2π} (1/2πσ²) e^{−(z² cos² θ + z² sin² θ)/2σ²} dθ
        = (z/σ²) e^{−z²/2σ²},  z ≥ 0.

The above is the Rayleigh density function we discussed earlier.

Rician Distribution:

Suppose X and Y are independent Gaussian random variables with non-zero means µ_X and µ_Y respectively and a common variance σ². We have to find the density function of the random variable Z = √(X² + Y²). This situation arises, for example, for the envelope of a sinusoid plus narrow-band Gaussian noise, and for the received signal in a multipath situation.

Here

  f_{X,Y}(x, y) = (1/2πσ²) e^{−[(x − µ_X)² + (y − µ_Y)²]/2σ²}   and   Z = √(X² + Y²).

We have shown that

  f_Z(z) = ∫_{0}^{2π} f_{X,Y}(z cos θ, z sin θ) z dθ

Suppose µ_X = µ cos φ and µ_Y = µ sin φ. Then

  f_{X,Y}(z cos θ, z sin θ) = (1/2πσ²) e^{−[(z cos θ − µ cos φ)² + (z sin θ − µ sin φ)²]/2σ²}
                           = (1/2πσ²) e^{−[z² − 2zµ cos(θ − φ) + µ²]/2σ²}

  ∴ f_Z(z) = (z/2πσ²) e^{−(z² + µ²)/2σ²} ∫_{0}^{2π} e^{zµ cos(θ − φ)/σ²} dθ

The remaining integral equals 2π I_0(zµ/σ²), where I_0(·) is the modified Bessel function of the first kind of order zero, so that

  f_Z(z) = (z/σ²) e^{−(z² + µ²)/2σ²} I_0(zµ/σ²),  z ≥ 0

which is the Rician probability density function.
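For reference, the Rician density can be evaluated directly with SciPy's Bessel function. A small sketch (the function name and parameter values here are arbitrary illustrations):

import numpy as np
from scipy.special import i0

def rician_pdf(z, mu, sigma):
    # f_Z(z) = (z / sigma^2) exp(-(z^2 + mu^2) / (2 sigma^2)) I0(z mu / sigma^2), z >= 0
    z = np.asarray(z, dtype=float)
    out = (z / sigma**2) * np.exp(-(z**2 + mu**2) / (2 * sigma**2)) * i0(z * mu / sigma**2)
    return np.where(z >= 0, out, 0.0)

print(rician_pdf([0.5, 1.0, 2.0], mu=1.0, sigma=1.0))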

Joint Probability Density Function of two functions of two random variables

We consider the transformation (g_1, g_2): R^2 → R^2. We have to find the joint probability density function f_{Z_1,Z_2}(z_1, z_2), where Z_1 = g_1(X, Y) and Z_2 = g_2(X, Y). Suppose the inverse mapping relation is

  x = h_1(z_1, z_2)  and  y = h_2(z_1, z_2)

Consider a differential region of area dz_1 dz_2 at the point (z_1, z_2) in the Z_1−Z_2 plane. Let us see how the corners of this differential region are mapped to the X−Y plane. Observe that

  h_1(z_1 + dz_1, z_2) = h_1(z_1, z_2) + (∂h_1/∂z_1) dz_1 = x + (∂h_1/∂z_1) dz_1
  h_2(z_1 + dz_1, z_2) = h_2(z_1, z_2) + (∂h_2/∂z_1) dz_1 = y + (∂h_2/∂z_1) dz_1

Therefore the point (z_1 + dz_1, z_2) is mapped to the point (x + (∂h_1/∂z_1) dz_1, y + (∂h_2/∂z_1) dz_1) in the X−Y plane. We can similarly find the points in the X−Y plane corresponding to (z_1, z_2 + dz_2) and (z_1 + dz_1, z_2 + dz_2). The mapping is shown in the figure. We notice that each differential region in the X−Y plane is a parallelogram. It can be shown that the differential parallelogram at (x, y) has area |J(z_1, z_2)| dz_1 dz_2, where J(z_1, z_2) is the Jacobian of the transformation defined as the determinant
  J(z_1, z_2) = det [ ∂h_1/∂z_1  ∂h_1/∂z_2
                      ∂h_2/∂z_1  ∂h_2/∂z_2 ]

Further, it can be shown that the absolute values of the Jacobians of the forward and the inverse transforms are inverses of each other, so that

  |J(z_1, z_2)| = 1/|J(x, y)|

where

  J(x, y) = det [ ∂g_1/∂x  ∂g_1/∂y
                  ∂g_2/∂x  ∂g_2/∂y ]

Therefore, the differential parallelogram in the figure has an area dz_1 dz_2 / |J(x, y)|.

Suppose the transformation z_1 = g_1(x, y), z_2 = g_2(x, y) has n roots, and let (x_i, y_i), i = 1, 2, ..., n be the roots. The inverse mapping of the differential region in the Z_1−Z_2 plane consists of n differential regions corresponding to the n roots. The inverse mapping is illustrated in the figure for n = 4. As these parallelograms are non-overlapping,

  f_{Z_1,Z_2}(z_1, z_2) dz_1 dz_2 = Σ_{i=1}^{n} f_{X,Y}(x_i, y_i) dz_1 dz_2 / |J(x_i, y_i)|

  ∴ f_{Z_1,Z_2}(z_1, z_2) = Σ_{i=1}^{n} f_{X,Y}(x_i, y_i) / |J(x_i, y_i)|

Remark
• If z_1 = g_1(x, y) and z_2 = g_2(x, y) have no root in (x, y), then f_{Z_1,Z_2}(z_1, z_2) = 0.
Example: pdf of a linear transformation

  Z_1 = aX + bY
  Z_2 = cX + dY

[Figure: the differential rectangle at (z_1, z_2) in the Z_1−Z_2 plane and the corresponding differential parallelogram at (x, y) in the X−Y plane]
Solving for x and y,

  x = (d z_1 − b z_2)/(ad − bc),   y = (a z_2 − c z_1)/(ad − bc)

and

  J(x, y) = det [ a  b
                  c  d ] = ad − bc

so that

  f_{Z_1,Z_2}(z_1, z_2) = (1/|ad − bc|) f_{X,Y}( (d z_1 − b z_2)/(ad − bc), (a z_2 − c z_1)/(ad − bc) )

Example:
Suppose X and Y are two independent Gaussian random variables, each with mean 0 and variance σ². Given R = √(X² + Y²) and Θ = tan⁻¹(Y/X), find f_R(r) and f_Θ(θ).

Solution: We have x = r cos θ and y = r sin θ, so that

  r² = x² + y²          .......... (1)
  tan θ = y/x           .......... (2)

From (1),

  ∂r/∂x = x/r = cos θ   and   ∂r/∂y = y/r = sin θ

From (2),

  ∂θ/∂x = −y/(x² + y²) = −sin θ/r   and   ∂θ/∂y = x/(x² + y²) = cos θ/r

  ∴ J = det [ cos θ       sin θ
              −sin θ/r    cos θ/r ] = 1/r

  ∴ f_{R,Θ}(r, θ) = f_{X,Y}(x, y)/|J|  evaluated at x = r cos θ, y = r sin θ
                 = r (1/2πσ²) e^{−(r² cos² θ + r² sin² θ)/2σ²}
                 = (r/2πσ²) e^{−r²/2σ²}
  ∴ f_R(r) = ∫_{0}^{2π} f_{R,Θ}(r, θ) dθ = (r/σ²) e^{−r²/2σ²},  0 ≤ r < ∞

  f_Θ(θ) = ∫_{0}^{∞} f_{R,Θ}(r, θ) dr = (1/2πσ²) ∫_{0}^{∞} r e^{−r²/2σ²} dr = 1/2π,  0 ≤ θ ≤ 2π
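These two marginals can be checked by simulation: generate Gaussian pairs, convert to polar form, and compare the sample statistics with the Rayleigh and uniform densities. A minimal sketch:

import numpy as np

rng = np.random.default_rng(1)
sigma = 2.0
x = sigma * rng.standard_normal(200_000)
y = sigma * rng.standard_normal(200_000)

r = np.hypot(x, y)                          # R = sqrt(X^2 + Y^2)
theta = np.arctan2(y, x) % (2 * np.pi)      # angle folded into [0, 2*pi)

print(r.mean(), sigma * np.sqrt(np.pi / 2))   # Rayleigh mean is sigma*sqrt(pi/2)
print(theta.mean(), np.pi)                    # uniform on [0, 2*pi) has mean pi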

Rician Distribution:

• X and Y are independent Gaussian random variables with non-zero means µ_X and µ_Y respectively and a common variance σ².

We have to find the density function of the random variable Z = √(X² + Y²). This models, for example, the envelope of a sinusoid plus narrow-band Gaussian noise, and the received signal in a multipath situation. Consider the transformation

  Z = √(X² + Y²),   Φ = tan⁻¹(Y/X)

We have to find J(x, y) corresponding to z and φ. We have z² = x² + y² and tan φ = y/x. Therefore,

  ∂z/∂x = x/z = cos φ   and   ∂z/∂y = y/z = sin φ

Also, differentiating tan φ = y/x,

  ∂φ/∂x = −(y/x²) cos² φ   and   ∂φ/∂y = cos² φ/x

  ∴ J(x, y) = det [ cos φ             sin φ
                    −(y/x²) cos² φ    cos² φ/x ]
            = cos³ φ/x + (y/x²) sin φ cos² φ
            = cos² φ (x cos φ + y sin φ)/x²
            = z cos² φ/x² = 1/z

Here

  f_{X,Y}(x, y) = (1/2πσ²) e^{−[(x − µ_X)² + (y − µ_Y)²]/2σ²}

We have to find the density at (x, y) corresponding to z and φ. Writing µ_X = µ cos φ_0 and µ_Y = µ sin φ_0 (see the diagram), we have x − µ_X = z cos φ − µ cos φ_0 and y − µ_Y = z sin φ − µ sin φ_0, so that

  f_{X,Y}(x, y) = (1/2πσ²) e^{−[(z cos φ − µ cos φ_0)² + (z sin φ − µ sin φ_0)²]/2σ²}
              = (1/2πσ²) e^{−[z² − 2zµ cos(φ − φ_0) + µ²]/2σ²}

Dividing by |J(x, y)| = 1/z gives f_{Z,Φ}(z, φ) = (z/2πσ²) e^{−[z² − 2zµ cos(φ − φ_0) + µ²]/2σ²}, and integrating over φ from 0 to 2π gives, as before,

  f_Z(z) = (z/σ²) e^{−(z² + µ²)/2σ²} I_0(zµ/σ²),  z ≥ 0

which is the Rician probability density function.
Expected Values of Functions of Random Variables

Recall that
• If Y = g(X) is a function of a continuous random variable X, then
  EY = Eg(X) = ∫_{−∞}^{∞} g(x) f_X(x) dx
• If Y = g(X) is a function of a discrete random variable X, then
  EY = Eg(X) = Σ_{x ∈ R_X} g(x) p_X(x)

Suppose Z = g(X, Y) is a function of continuous random variables X and Y. Then the expected value of Z is given by

  EZ = Eg(X, Y) = ∫_{−∞}^{∞} z f_Z(z) dz
              = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f_{X,Y}(x, y) dx dy
Thus EZ can be computed without explicitly determining f_Z(z). We can establish the above result as follows.

Suppose Z = g(X, Y) has n roots (x_i, y_i), i = 1, 2, ..., n at Z = z. Then

  {z < Z ≤ z + Δz} = ∪_{i=1}^{n} {(x, y) ∈ ΔD_i}

where ΔD_i is the differential region containing (x_i, y_i). The mapping is illustrated in the figure for n = 3.

[Figure: the slab {z < Z ≤ z + Δz} on the Z axis and the corresponding differential regions ΔD_1, ΔD_2, ΔD_3 in the X−Y plane]

  P(z < Z ≤ z + Δz) = f_Z(z) Δz = Σ_{(x_i, y_i) ∈ ΔD_i} f_{X,Y}(x_i, y_i) Δx_i Δy_i

  ∴ z f_Z(z) Δz = Σ_i z f_{X,Y}(x_i, y_i) Δx_i Δy_i = Σ_i g(x_i, y_i) f_{X,Y}(x_i, y_i) Δx_i Δy_i

As z is varied over the entire Z axis, the corresponding (non-overlapping) differential regions in the X−Y plane cover the entire plane.

  ∴ ∫_{−∞}^{∞} z f_Z(z) dz = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f_{X,Y}(x, y) dx dy

Thus,

  Eg(X, Y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f_{X,Y}(x, y) dx dy

If Z = g(X, Y) is a function of discrete random variables X and Y, we can similarly show that

  EZ = Eg(X, Y) = Σ_{(x,y) ∈ R_X × R_Y} g(x, y) p_{X,Y}(x, y)
Example: The joint pdf of two random variables X and Y is given by

  f_{X,Y}(x, y) = (1/4) x y,  0 ≤ x ≤ 2, 0 ≤ y ≤ 2
              = 0            otherwise

Find the joint expectation of g(X, Y) = X²Y.

  Eg(X, Y) = EX²Y = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f_{X,Y}(x, y) dx dy
          = ∫_{0}^{2} ∫_{0}^{2} x²y (1/4) xy dx dy
          = (1/4) ∫_{0}^{2} x³ dx ∫_{0}^{2} y² dy
          = (1/4) × 4 × (8/3)
          = 8/3
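This double integral is quick to verify numerically, for instance with SciPy (an illustrative sketch):

from scipy.integrate import dblquad

# E[X^2 Y] = double integral of x^2 * y * (x*y/4) over the square [0, 2] x [0, 2]
val, err = dblquad(lambda y, x: x**2 * y * (x * y / 4), 0, 2, 0, 2)
print(val)   # 2.666..., i.e. 8/3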

Example: If Z = aX + bY, where a and b are constants, then EZ = aEX + bEY.

Proof:

  EZ = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (ax + by) f_{X,Y}(x, y) dx dy
     = ∫_{−∞}^{∞} ∫_{−∞}^{∞} ax f_{X,Y}(x, y) dx dy + ∫_{−∞}^{∞} ∫_{−∞}^{∞} by f_{X,Y}(x, y) dx dy
     = a ∫_{−∞}^{∞} x ( ∫_{−∞}^{∞} f_{X,Y}(x, y) dy ) dx + b ∫_{−∞}^{∞} y ( ∫_{−∞}^{∞} f_{X,Y}(x, y) dx ) dy
     = a ∫_{−∞}^{∞} x f_X(x) dx + b ∫_{−∞}^{∞} y f_Y(y) dy
     = aEX + bEY

Thus, expectation is a linear operator.
Example: Consider the discrete random variables X and Y discussed earlier, with the joint probability mass function tabulated below. Find the joint expectation of g(X, Y) = XY.

            X = 0    X = 1    X = 2    p_Y(y)
  Y = 0      0.25     0.10     0.15     0.5
  Y = 1      0.14     0.35     0.01     0.5
  p_X(x)     0.39     0.45     0.16

Clearly

  EXY = Σ_{(x,y) ∈ R_X × R_Y} g(x, y) p_{X,Y}(x, y)
      = 1 × 1 × 0.35 + 1 × 2 × 0.01
      = 0.37
Remark
(1) We have earlier shown that expectation is a linear operator. In general,
    E[a_1 g_1(X, Y) + a_2 g_2(X, Y)] = a_1 Eg_1(X, Y) + a_2 Eg_2(X, Y).
    Thus, for example, E(XY + 5 log_e XY) = EXY + 5 E log_e XY.
(2) If X and Y are independent random variables and g(X, Y) = g_1(X) g_2(Y), then
  Eg(X, Y) = E[g_1(X) g_2(Y)]
          = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g_1(x) g_2(y) f_{X,Y}(x, y) dx dy
          = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g_1(x) g_2(y) f_X(x) f_Y(y) dx dy
          = ∫_{−∞}^{∞} g_1(x) f_X(x) dx ∫_{−∞}^{∞} g_2(y) f_Y(y) dy
          = Eg_1(X) Eg_2(Y)
=Joint Moments of Random Variables

Just like the moments of a random variable provides a summary description of the

random variable, so also the joint moments provide summary description of two random

variables.

For two continuous random variables X and the joint moment of order is

defined as

Y m n+

,( ) ( , )m n m nX YE X Y x y f x y dxdy

∞ ∞

−∞ −∞= ∫ ∫ and

the joint central moment of order m n+ is defined as

,( ) ( ) ( ) ( ) ( , )m n m nX Y X Y X YE X Y x y f x y dµ µ µ µ

∞ ∞

−∞ −∞− − = − −∫ ∫ xdy

where X EXµ =and Y EYµ =

Remark

(1) If X and Y are discrete random variables, the joint expectation of order and

is defined as

m

n

,

,

,( , )

,( , )

( ) ( , )

( ) ( ) ( ) ( ) (X Y

X Y

m n m nX Y

x y R

m n m nX Y X Y X Y

x y R

E X Y x y p x y

, )E X Y x y p xµ µ µ µ

= ∑ ∑

− − = − −∑ ∑ y

(2) If m = 1 and n = 1, we have the second-order moment of the random variables X and Y given by

  E(XY) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy f_{X,Y}(x, y) dx dy      if X and Y are continuous
        = Σ_{(x,y) ∈ R_{X,Y}} xy p_{X,Y}(x, y)               if X and Y are discrete

(3) If X and Y are independent, E(XY) = EX EY.
Covariance of two random variables

The covariance of two random variables X and Y is defined as

  Cov(X, Y) = E(X − µ_X)(Y − µ_Y)

Expanding the right-hand side, we get

  Cov(X, Y) = E(X − µ_X)(Y − µ_Y)
           = E(XY − µ_Y X − µ_X Y + µ_X µ_Y)
           = EXY − µ_Y EX − µ_X EY + µ_X µ_Y
           = EXY − µ_X µ_Y

The ratio ρ(X, Y) = Cov(X, Y)/(σ_X σ_Y) is called the correlation coefficient. We will give an interpretation of Cov(X, Y) and ρ(X, Y) later on.
We will show that |ρ(X, Y)| ≤ 1. To establish this, we first prove the following result: for two random variables X and Y,

  (EXY)² ≤ EX² EY²

Proof: Consider the random variable Z = aX + Y.

  E(aX + Y)² ≥ 0
  ⇒ a²EX² + 2aEXY + EY² ≥ 0

Non-negativity of the left-hand side means that its minimum over a must also be non-negative. For the minimum value,

  dEZ²/da = 0  ⇒  a = −EXY/EX²

and the corresponding minimum is

  EY² − (EXY)²/EX²

Since the minimum is non-negative,

  EY² − (EXY)²/EX² ≥ 0
  ⇒ (EXY)² ≤ EX² EY²
  ⇒ |EXY| ≤ √(EX² EY²)
Now

  ρ(X, Y) = Cov(X, Y)/(σ_X σ_Y) = E(X − µ_X)(Y − µ_Y) / √( E(X − µ_X)² E(Y − µ_Y)² )

Applying the result just proved to the zero-mean random variables X − µ_X and Y − µ_Y,

  |E(X − µ_X)(Y − µ_Y)| ≤ √( E(X − µ_X)² E(Y − µ_Y)² )

so that

  |ρ(X, Y)| ≤ 1
Uncorrelated random variables

Two random variables X and Y are called uncorrelated if

  Cov(X, Y) = 0,  which also means  E(XY) = µ_X µ_Y

Recall that if X and Y are independent random variables, then f_{X,Y}(x, y) = f_X(x) f_Y(y). Then (assuming X and Y are continuous)

  EXY = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy f_{X,Y}(x, y) dx dy
      = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy f_X(x) f_Y(y) dx dy
      = ∫_{−∞}^{∞} x f_X(x) dx ∫_{−∞}^{∞} y f_Y(y) dy
      = EX EY

Thus two independent random variables are always uncorrelated. The converse is not always true: two random variables may be dependent and still be uncorrelated. If there exists correlation between two random variables, one may be represented as a linear regression of the other. We will discuss this point in the next section.
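In practice, Cov(X, Y) and ρ(X, Y) are usually estimated from samples. A small illustrative sketch using NumPy:

import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(100_000)
y = 0.8 * x + 0.6 * rng.standard_normal(100_000)   # correlated with X by construction

cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
rho = cov_xy / (x.std() * y.std())
print(cov_xy, rho)               # both close to 0.8 for this construction
print(np.corrcoef(x, y)[0, 1])   # the same correlation via the built-in routine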

Linear prediction of Y from X

  Ŷ = aX + b   (regression)

Prediction error: Y − Ŷ.

Mean square prediction error: E(Y − Ŷ)² = E(Y − aX − b)².

Minimising the error with respect to a and b gives the optimal values. Corresponding to the optimal solution for a and b we have

  ∂E(Y − aX − b)²/∂a = 0
  ∂E(Y − aX − b)²/∂b = 0

Solving for a and b,

  Ŷ − µ_Y = (σ_{XY}/σ_X²)(x − µ_X)

so that

  Ŷ − µ_Y = ρ_{X,Y} (σ_Y/σ_X)(x − µ_X)

where ρ_{X,Y} = σ_{XY}/(σ_X σ_Y) is the correlation coefficient. The regression line thus passes through (µ_X, µ_Y) with slope ρ_{X,Y} σ_Y/σ_X.

Remark

If ρ_{X,Y} > 0, then X and Y are called positively correlated.
If ρ_{X,Y} < 0, then X and Y are called negatively correlated.
If ρ_{X,Y} = 0, then X and Y are uncorrelated. In that case the best linear prediction is simply Ŷ = µ_Y.
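The optimal a and b can be computed directly from sample moments. A short illustrative sketch:

import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(50_000)
y = 2.0 * x + 1.0 + 0.5 * rng.standard_normal(50_000)   # true line: slope 2, intercept 1

a = np.mean((x - x.mean()) * (y - y.mean())) / np.var(x)   # a = Cov(X, Y) / Var(X)
b = y.mean() - a * x.mean()                                 # b = E[Y] - a E[X]
print(a, b)                         # close to 2.0 and 1.0

y_hat = a * x + b
print(np.mean((y - y_hat) ** 2))    # residual mean-square error, about 0.25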

Note that independence => Uncorrelatedness. But uncorrelated generally does not imply

independence (except for jointly Gaussian random variables).

Example:

Suppose Y = X² and X is uniformly distributed between (−1, 1). Then X and Y are dependent, but they are uncorrelated, because

  Cov(X, Y) = E(X − µ_X)(Y − µ_Y) = EXY − EX EY = EX³ = 0   (∵ EX = 0 and EX³ = 0)

In fact, for any zero-mean symmetric distribution of X, X and X² are uncorrelated.

(4) X̂ = aY is a linear estimator.
Jointly Gaussian Random Variables

Many random variables occurring in practice are modeled as jointly Gaussian random variables. For example, the noise occurring in communication systems is modeled as jointly Gaussian.

Two random variables X and Y are called jointly Gaussian if their joint density function is

  f_{X,Y}(x, y) = [1/(2πσ_X σ_Y √(1 − ρ²_{X,Y}))]
                  exp{ −[ (x − µ_X)²/σ_X² − 2ρ_{X,Y}(x − µ_X)(y − µ_Y)/(σ_X σ_Y) + (y − µ_Y)²/σ_Y² ] / [2(1 − ρ²_{X,Y})] },
  −∞ < x < ∞, −∞ < y < ∞.

The joint pdf is determined by 5 parameters:
  - the means µ_X and µ_Y
  - the variances σ_X² and σ_Y²
  - the correlation coefficient ρ_{X,Y}

We denote the jointly Gaussian random variables X and Y with these parameters as (X, Y) ~ N(µ_X, µ_Y, σ_X², σ_Y², ρ_{X,Y}).

The pdf has a bell shape centred at (µ_X, µ_Y), as shown in the figure below. The variances σ_X² and σ_Y² determine the spread of the pdf surface, and ρ_{X,Y} determines the orientation of the surface in the X−Y plane.

[Figure: surface plot of the bivariate Gaussian pdf]
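For completeness, the bivariate Gaussian density is straightforward to evaluate numerically. A minimal sketch (the function name and parameter values are our own illustration):

import numpy as np

def bivariate_gaussian_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    # jointly Gaussian pdf with means, standard deviations and correlation coefficient rho
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    q = (zx**2 - 2 * rho * zx * zy + zy**2) / (2 * (1 - rho**2))
    norm = 2 * np.pi * sigma_x * sigma_y * np.sqrt(1 - rho**2)
    return np.exp(-q) / norm

print(bivariate_gaussian_pdf(0.0, 0.0, 0, 0, 1, 1, 0.5))   # peak value, about 0.1838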

Properties of jointly Gaussian random variables

(1) If X and Y are jointly Gaussian, then X and Y are each Gaussian.

We have

  f_X(x) = ∫_{−∞}^{∞} f_{X,Y}(x, y) dy

Substituting the jointly Gaussian pdf and completing the square in y, the exponent separates as

  f_X(x) = [1/(√(2π) σ_X)] e^{−(x − µ_X)²/2σ_X²}
           ∫_{−∞}^{∞} [1/(√(2π) σ_Y √(1 − ρ²_{X,Y}))] e^{−[y − µ_Y − ρ_{X,Y}(σ_Y/σ_X)(x − µ_X)]² / [2σ_Y²(1 − ρ²_{X,Y})]} dy

The remaining integral is the area under a Gaussian pdf and equals 1, so

  f_X(x) = [1/(√(2π) σ_X)] e^{−(x − µ_X)²/2σ_X²}

Similarly

  f_Y(y) = [1/(√(2π) σ_Y)] e^{−(y − µ_Y)²/2σ_Y²}
(2) The converse of the above result is not true: if each of X and Y is Gaussian, X and Y are not necessarily jointly Gaussian. Suppose (taking µ_X = µ_Y = 0, so that the odd-symmetry argument below applies)

  f_{X,Y}(x, y) = [1/(2πσ_X σ_Y)] e^{−[x²/2σ_X² + y²/2σ_Y²]} (1 + sin x sin y)

f_{X,Y}(x, y) in this example is non-Gaussian but qualifies as a joint pdf, because f_{X,Y}(x, y) ≥ 0 and

  ∫∫ f_{X,Y}(x, y) dy dx = ∫∫ [1/(2πσ_X σ_Y)] e^{−(x²/2σ_X² + y²/2σ_Y²)} dy dx
                         + [1/(2πσ_X σ_Y)] ∫ e^{−x²/2σ_X²} sin x dx ∫ e^{−y²/2σ_Y²} sin y dy
                         = 1 + 0 = 1

since each of the sine integrals vanishes (the integrand is an odd function). The marginal density f_X(x) is given by

  f_X(x) = ∫ f_{X,Y}(x, y) dy
        = [1/(√(2π) σ_X)] e^{−x²/2σ_X²} + [1/(2πσ_X σ_Y)] e^{−x²/2σ_X²} sin x ∫ e^{−y²/2σ_Y²} sin y dy
        = [1/(√(2π) σ_X)] e^{−x²/2σ_X²} + 0
        = [1/(√(2π) σ_X)] e^{−x²/2σ_X²}

Similarly,

  f_Y(y) = [1/(√(2π) σ_Y)] e^{−y²/2σ_Y²}

Thus X and Y are each Gaussian, but not jointly Gaussian.
(3) If X and Y are jointly Gaussian, then for any constants a and b the random variable Z = aX + bY is Gaussian with mean

  µ_Z = a µ_X + b µ_Y

and variance

  σ_Z² = a²σ_X² + b²σ_Y² + 2ab ρ_{X,Y} σ_X σ_Y

(4) Two jointly Gaussian RVs X and Y are independent if and only if X and Y are uncorrelated (ρ_{X,Y} = 0). Observe that if X and Y are uncorrelated, then ρ_{X,Y} = 0 and the joint pdf factors:

  f_{X,Y}(x, y) = [1/(2πσ_X σ_Y)] e^{−[(x − µ_X)²/2σ_X² + (y − µ_Y)²/2σ_Y²]}
              = [1/(√(2π) σ_X)] e^{−(x − µ_X)²/2σ_X²} × [1/(√(2π) σ_Y)] e^{−(y − µ_Y)²/2σ_Y²}
              = f_X(x) f_Y(y)
Joint Characteristic Functions of Two Random Variables

The joint characteristic function of two random variables X and Y is defined by

  φ_{X,Y}(ω_1, ω_2) = E e^{jω_1 X + jω_2 Y}

If X and Y are jointly continuous random variables, then

  φ_{X,Y}(ω_1, ω_2) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f_{X,Y}(x, y) e^{jω_1 x + jω_2 y} dy dx

Note that φ_{X,Y}(ω_1, ω_2) is the same as the two-dimensional Fourier transform with the basis function e^{jω_1 x + jω_2 y} instead of e^{−(jω_1 x + jω_2 y)}.

f_{X,Y}(x, y) is related to the joint characteristic function by the Fourier inversion formula

  f_{X,Y}(x, y) = (1/4π²) ∫_{−∞}^{∞} ∫_{−∞}^{∞} φ_{X,Y}(ω_1, ω_2) e^{−jω_1 x − jω_2 y} dω_1 dω_2

If X and Y are discrete random variables, we can define the joint characteristic function in terms of the joint probability mass function as

  φ_{X,Y}(ω_1, ω_2) = Σ_{(x,y) ∈ R_X × R_Y} p_{X,Y}(x, y) e^{jω_1 x + jω_2 y}

Properties of Joint Characteristic Functions of Two Random Variables

The joint characteristic function has properties similar to those of the characteristic function of a single random variable. We can easily establish the following properties:

1. φ_X(ω) = φ_{X,Y}(ω, 0)
2. φ_Y(ω) = φ_{X,Y}(0, ω)
3. If X and Y are independent random variables, then
  φ_{X,Y}(ω_1, ω_2) = E e^{jω_1 X + jω_2 Y}
                   = E (e^{jω_1 X} e^{jω_2 Y})
                   = E e^{jω_1 X} E e^{jω_2 Y}
                   = φ_X(ω_1) φ_Y(ω_2)

4. We have

  φ_{X,Y}(ω_1, ω_2) = E e^{jω_1 X + jω_2 Y}
                   = E[ 1 + (jω_1 X + jω_2 Y) + (jω_1 X + jω_2 Y)²/2! + ... ]
                   = 1 + jω_1 EX + jω_2 EY + (jω_1)² EX²/2 + (jω_2)² EY²/2 + j²ω_1ω_2 EXY + ...

Hence,

  1 = φ_{X,Y}(0, 0)
  EX = (1/j) ∂φ_{X,Y}(ω_1, ω_2)/∂ω_1 |_{ω_1 = 0, ω_2 = 0}
  EY = (1/j) ∂φ_{X,Y}(ω_1, ω_2)/∂ω_2 |_{ω_1 = 0, ω_2 = 0}
  EXY = (1/j²) ∂²φ_{X,Y}(ω_1, ω_2)/∂ω_1∂ω_2 |_{ω_1 = 0, ω_2 = 0}

In general, the (m + n)-th order joint moment is given by

  E X^m Y^n = (1/j^{m+n}) ∂^{m+n} φ_{X,Y}(ω_1, ω_2)/∂ω_1^m ∂ω_2^n |_{ω_1 = 0, ω_2 = 0}
Example Joint characteristic function of the jointly Gaussian random variables X and Y with the joint pdf

  f_{X,Y}(x, y) = [1/(2πσ_X σ_Y √(1 − ρ²_{X,Y}))]
                  exp{ −[ (x − µ_X)²/σ_X² − 2ρ_{X,Y}(x − µ_X)(y − µ_Y)/(σ_X σ_Y) + (y − µ_Y)²/σ_Y² ] / [2(1 − ρ²_{X,Y})] }

Let us recall the characteristic function of a Gaussian random variable X ~ N(µ_X, σ_X²):

  φ_X(ω) = E e^{jωX}
        = ∫_{−∞}^{∞} [1/(√(2π) σ_X)] e^{−(x − µ_X)²/2σ_X²} e^{jωx} dx
        = e^{jµ_X ω − σ_X²ω²/2} ∫_{−∞}^{∞} [1/(√(2π) σ_X)] e^{−[x − (µ_X + jσ_X²ω)]²/2σ_X²} dx    (completing the square)
        = e^{jµ_X ω − σ_X²ω²/2}

since the remaining integral is the area under a Gaussian and equals 1.

If X and Y are jointly Gaussian with the pdf above, we can similarly show that

  φ_{X,Y}(ω_1, ω_2) = E e^{j(ω_1 X + ω_2 Y)}
                   = e^{j(µ_X ω_1 + µ_Y ω_2) − (σ_X²ω_1² + 2ρ_{X,Y}σ_X σ_Y ω_1ω_2 + σ_Y²ω_2²)/2}

We can use the joint characteristic function to simplify probabilistic analysis, as illustrated below.

Example 2 Suppose Z = aX + bY. Then

  φ_Z(ω) = E e^{jωZ} = E e^{jω(aX + bY)} = φ_{X,Y}(aω, bω)

If X and Y are jointly Gaussian, then

  φ_Z(ω) = φ_{X,Y}(aω, bω) = e^{j(aµ_X + bµ_Y)ω − (a²σ_X² + 2ab ρ_{X,Y}σ_X σ_Y + b²σ_Y²)ω²/2}

which is the characteristic function of a Gaussian random variable with mean µ_Z = aµ_X + bµ_Y and variance σ_Z² = a²σ_X² + 2ab ρ_{X,Y}σ_X σ_Y + b²σ_Y².

Example 3 If Z = X + Y and X and Y are independent, then

  φ_Z(ω) = φ_{X,Y}(ω, ω) = φ_X(ω) φ_Y(ω)
Using the convolution property of the Fourier transform, we get

  f_Z(z) = f_X(z) * f_Y(z)
Conditional Expectation

Recall that

• If X and Y are continuous random variables, then the conditional density function of Y given X = x is given by
  f_{Y/X}(y/x) = f_{X,Y}(x, y)/f_X(x)

• If X and Y are discrete random variables, then the conditional probability mass function of Y given X = x is given by
  p_{Y/X}(y/x) = p_{X,Y}(x, y)/p_X(x)

The conditional expectation of Y given X = x is defined by

  E(Y / X = x) = µ_{Y/X=x} = ∫_{−∞}^{∞} y f_{Y/X}(y/x) dy    if X and Y are continuous
              = Σ_{y ∈ R_Y} y p_{Y/X}(y/x)                  if X and Y are discrete

Remark

• The conditional expectation of Y given X = x is also called the conditional mean of Y given X = x.
• We can similarly define the conditional expectation of X given Y = y, denoted by E(X / Y = y).
• Higher-order conditional moments can be defined in a similar manner.
• In particular, the conditional variance of Y given X = x is given by
  σ²_{Y/X=x} = E[(Y − µ_{Y/X=x})² / X = x]

Example:

Consider the discrete random variables X and Y discussed earlier, with the joint probability mass function tabulated below. Find E(Y / X = 2).
            X = 0    X = 1    X = 2    p_Y(y)
  Y = 0      0.25     0.10     0.15     0.5
  Y = 1      0.14     0.35     0.01     0.5
  p_X(x)     0.39     0.45     0.16

The conditional probability mass function is given by

  p_{Y/X}(y/2) = p_{X,Y}(2, y)/p_X(2)

  ∴ p_{Y/X}(0/2) = p_{X,Y}(2, 0)/p_X(2) = 0.15/0.16 = 15/16
and
  p_{Y/X}(1/2) = p_{X,Y}(2, 1)/p_X(2) = 0.01/0.16 = 1/16

  ∴ E(Y / X = 2) = 0 × p_{Y/X}(0/2) + 1 × p_{Y/X}(1/2) = 1/16
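The same conditional mean can be read off the joint table programmatically (an illustrative sketch, reusing the array from the earlier snippets):

import numpy as np

p_xy = np.array([[0.25, 0.10, 0.15],    # row y = 0
                 [0.14, 0.35, 0.01]])   # row y = 1
y_vals = np.array([0, 1])

p_y_given_x2 = p_xy[:, 2] / p_xy[:, 2].sum()   # condition on X = 2
print(p_y_given_x2)                            # [0.9375, 0.0625] = [15/16, 1/16]
print(y_vals @ p_y_given_x2)                   # E(Y / X = 2) = 0.0625 = 1/16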

Example

Suppose X and Y are jointly uniform random variables with the joint probability density function

  f_{X,Y}(x, y) = 1/2,  x ≥ 0, y ≥ 0, x + y ≤ 2
              = 0      otherwise

Find E(Y / X = x).

From the figure, f_{X,Y}(x, y) = 1/2 in the shaded triangular area. We have

  f_X(x) = ∫_{0}^{2−x} f_{X,Y}(x, y) dy = ∫_{0}^{2−x} (1/2) dy = (2 − x)/2,  0 ≤ x ≤ 2

  ∴ f_{Y/X}(y/x) = f_{X,Y}(x, y)/f_X(x) = 1/(2 − x),  0 ≤ y ≤ 2 − x

  ∴ E(Y / X = x) = ∫_{0}^{2−x} y f_{Y/X}(y/x) dy = (1/(2 − x)) ∫_{0}^{2−x} y dy = (2 − x)/2
Example Suppose X and Y are jointly Gaussian random variables with the joint probability density function

  f_{X,Y}(x, y) = [1/(2πσ_X σ_Y √(1 − ρ²_{X,Y}))]
                  exp{ −[ (x − µ_X)²/σ_X² − 2ρ_{X,Y}(x − µ_X)(y − µ_Y)/(σ_X σ_Y) + (y − µ_Y)²/σ_Y² ] / [2(1 − ρ²_{X,Y})] }

We have to find E(Y / X = x).

We have f_{Y/X}(y/x) = f_{X,Y}(x, y)/f_X(x). Dividing the joint pdf by the marginal

  f_X(x) = [1/(√(2π) σ_X)] e^{−(x − µ_X)²/2σ_X²}

and completing the square in y gives

  f_{Y/X}(y/x) = [1/(√(2π) σ_Y √(1 − ρ²_{X,Y}))] exp{ −[y − µ_Y − ρ_{X,Y}(σ_Y/σ_X)(x − µ_X)]² / [2σ_Y²(1 − ρ²_{X,Y})] }

Therefore,

  E(Y / X = x) = ∫_{−∞}^{∞} y f_{Y/X}(y/x) dy = µ_Y + ρ_{X,Y}(σ_Y/σ_X)(x − µ_X)
Conditional Expectation as a random variable

Note that E(Y / X = x) is a function of x. Using this function, we may define a random variable φ(X) = E(Y / X). Thus we may consider E(Y / X) as a function of the random variable X, and E(Y / X = x) as the value of E(Y / X) at X = x.

An important result: E[E(Y / X)] = EY, and similarly E[E(X / Y)] = EX.

Proof:

  E[E(Y / X)] = ∫_{−∞}^{∞} E(Y / X = x) f_X(x) dx
             = ∫_{−∞}^{∞} ( ∫_{−∞}^{∞} y f_{Y/X}(y/x) dy ) f_X(x) dx
             = ∫_{−∞}^{∞} ∫_{−∞}^{∞} y f_X(x) f_{Y/X}(y/x) dy dx
             = ∫_{−∞}^{∞} ∫_{−∞}^{∞} y f_{X,Y}(x, y) dy dx
             = ∫_{−∞}^{∞} y ( ∫_{−∞}^{∞} f_{X,Y}(x, y) dx ) dy
             = ∫_{−∞}^{∞} y f_Y(y) dy
             = EY

Thus E[E(Y / X)] = EY, and similarly E[E(X / Y)] = EX.
Bayesian Estimation theory and conditional expectation

Consider two random variables X and Y with joint pdf f_{X,Y}(x, y). Suppose Y is observable and f_X(x) is known. We have to estimate X for a given value of Y in some optimal sense, in a setting where some values of X are more likely than others (a priori information). We can represent this prior information in the form of a prior density function. In the following we omit the suffix in the density functions just for notational simplicity.

  Observation: Y = y
  Random variable X with density f_X(x)

The conditional density f_{Y/X}(y/x) is called the likelihood function in estimation terminology, and

  f_{X,Y}(x, y) = f_X(x) f_{Y/X}(y/x)

Also we have the Bayes rule

  f_{X/Y}(x/y) = f_X(x) f_{Y/X}(y/x) / f_Y(y)

where f_{X/Y}(x/y) is the a posteriori density function.

Suppose the optimal estimator X̂(Y) is a function of the random variable Y such that it minimizes the mean-square estimation error E(X̂(Y) − X)². Such an estimator is known as the minimum mean-square error (MMSE) estimator.

The estimation problem is:

  Minimize ∫_{−∞}^{∞} ∫_{−∞}^{∞} (X̂(y) − x)² f_{X,Y}(x, y) dx dy

with respect to X̂(y). This is equivalent to minimizing
  ∫_{−∞}^{∞} [ ∫_{−∞}^{∞} (X̂(y) − x)² f_{X/Y}(x/y) dx ] f_Y(y) dy

Since f_Y(y) is always non-negative, the above integral will be minimum if the inner integral is minimum for each y. This results in the problem:

  Minimize ∫_{−∞}^{∞} (X̂(y) − x)² f_{X/Y}(x/y) dx  with respect to X̂(y).

The minimum is given by

  ∂/∂X̂(y) ∫_{−∞}^{∞} (X̂(y) − x)² f_{X/Y}(x/y) dx = 0

  ⇒ 2 ∫_{−∞}^{∞} (X̂(y) − x) f_{X/Y}(x/y) dx = 0

  ⇒ X̂(y) ∫_{−∞}^{∞} f_{X/Y}(x/y) dx = ∫_{−∞}^{∞} x f_{X/Y}(x/y) dx

  ⇒ X̂(y) = E(X / Y = y)

Thus the minimum mean-square error estimate of X given the observation Y = y is the conditional mean E(X / Y = y).
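The statement that the conditional mean minimizes the mean-square error can be illustrated by simulation. A rough sketch for a zero-mean jointly Gaussian pair, where E(X / Y = y) happens to be linear in y:

import numpy as np

rng = np.random.default_rng(4)
x = rng.standard_normal(200_000)               # X ~ N(0, 1), not observable
y = x + 0.5 * rng.standard_normal(200_000)     # observable Y = X + noise

# for zero-mean jointly Gaussian X, Y the conditional mean is linear:
# E(X / Y = y) = [Cov(X, Y) / Var(Y)] * y
a = np.cov(x, y)[0, 1] / np.var(y)
x_hat_cm = a * y                   # conditional-mean (MMSE) estimate
x_hat_raw = y                      # a cruder estimate: use the observation directly

print(np.mean((x - x_hat_cm) ** 2))    # about 0.20
print(np.mean((x - x_hat_raw) ** 2))   # about 0.25, larger than the MMSE error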

Multiple Random Variables

In many applications we have to deal with many random variables. For example, in the navigation problem the position of a spacecraft is represented by three random variables denoting the x, y and z coordinates. The noise affecting the R, G, B channels of colour video may be represented by three random variables. In such situations it is convenient to define vector-valued random variables, where each component of the vector is a random variable.

In this lecture, we extend the concepts of joint random variables to the case of multiple random variables. A generalized analysis will be presented for n random variables defined on the same sample space.

Joint CDF of n random variables

Consider n random variables X_1, X_2, ..., X_n defined on the same probability space (S, F, P). We define the random vector X as

  X = [X_1 X_2 ... X_n]'

where ' indicates the transpose operation. Thus an n-dimensional random vector is defined by the mapping X : S → R^n. A particular value of the random vector X is denoted by x = [x_1 x_2 ... x_n]'.

The CDF of the random vector X is defined as the joint CDF of X_1, X_2, ..., X_n. Thus

  F_X(x) = F_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_n) = P{X_1 ≤ x_1, X_2 ≤ x_2, ..., X_n ≤ x_n}

Some of the most important properties of the joint CDF are listed below. These properties are mere extensions of the properties of two joint random variables.

Properties of the joint CDF of n random variables

(a) F_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_n) is a non-decreasing function of each of its arguments.
(b) F_{X_1, X_2, ..., X_n}(−∞, x_2, ..., x_n) = F_{X_1, X_2, ..., X_n}(x_1, −∞, ..., x_n) = ... = F_{X_1, X_2, ..., X_n}(x_1, x_2, ..., −∞) = 0
(c) F_{X_1, X_2, ..., X_n}(∞, ∞, ..., ∞) = 1
(d) F_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_n) is right-continuous in each of its arguments.
(e) The marginal CDF of a random variable X_i is obtained from F_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_n) by letting all arguments except x_i tend to ∞. Thus

  F_{X_1}(x_1) = F_{X_1, X_2, ..., X_n}(x_1, ∞, ..., ∞),
  F_{X_2}(x_2) = F_{X_1, X_2, ..., X_n}(∞, x_2, ..., ∞)

and so on.

Joint pmf of n discrete random variables

Suppose X is a discrete random vector defined on the probability space (S, F, P). Then X is completely specified by the joint probability mass function

  p_X(x) = p_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_n) = P{X_1 = x_1, X_2 = x_2, ..., X_n = x_n}

Given p_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_n), we can find the marginal probability mass functions; for example,

  p_{X_1}(x_1) = Σ_{x_2} Σ_{x_3} ... Σ_{x_n} p_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_n)    (n − 1 summations)

Joint PDF of n random variables

If X is a continuous random vector, that is, F_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_n) is continuous in each of its arguments, then X can be specified by the joint probability density function

  f_X(x) = f_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_n) = ∂^n F_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_n) / ∂x_1 ∂x_2 ... ∂x_n

Properties of the joint PDF of n random variables

The joint pdf of n random variables satisfies the following important properties:

(1) f_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_n) is always a non-negative quantity. That is,
    f_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_n) ≥ 0  ∀(x_1, x_2, ..., x_n) ∈ R^n
(2) ∫_{−∞}^{∞} ... ∫_{−∞}^{∞} f_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_n) dx_1 dx_2 ... dx_n = 1
(3) Given f_X(x) for all x ∈ R^n, we can find the probability of a Borel set (region) B ⊆ R^n:
    P((X_1, X_2, ..., X_n) ∈ B) = ∫ ... ∫_B f_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_n) dx_1 dx_2 ... dx_n
(4) The marginal pdf of a random variable X_i is related to f_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_n) by the (n − 1)-fold integral
  f_{X_i}(x_i) = ∫_{−∞}^{∞} ... ∫_{−∞}^{∞} f_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_i, ..., x_n) dx_1 ... dx_{i−1} dx_{i+1} ... dx_n

where the integration is performed over all the arguments except x_i. Similarly,

  f_{X_i, X_j}(x_i, x_j) = ∫_{−∞}^{∞} ... ∫_{−∞}^{∞} f_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_n) dx_1 ... dx_n    ((n − 2)-fold integral, omitting dx_i and dx_j)

and so on.

The conditional density functions are defined in a similar manner. Thus

  f_{X_{m+1}, ..., X_n / X_1, ..., X_m}(x_{m+1}, ..., x_n / x_1, ..., x_m) = f_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_n) / f_{X_1, X_2, ..., X_m}(x_1, x_2, ..., x_m)

Independent random variables: The random variables X_1, X_2, ..., X_n are called (mutually) independent if and only if

  f_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_n) = Π_{i=1}^{n} f_{X_i}(x_i)

For example, if X_1, X_2, ..., X_n are independent Gaussian random variables, then

  f_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_n) = Π_{i=1}^{n} [1/(√(2π) σ_i)] e^{−(x_i − µ_i)²/2σ_i²}

Remark: X_1, X_2, ..., X_n may be pairwise independent but may not be mutually independent.

Identically distributed random variables: The random variables X_1, X_2, ..., X_n are called identically distributed if each random variable has the same marginal distribution function, that is,

  F_{X_1}(x) = F_{X_2}(x) = ... = F_{X_n}(x)  ∀x

An important subclass of independent random variables is the class of independent and identically distributed (iid) random variables. The random variables X_1, X_2, ..., X_n are called iid if they are mutually independent and each of them has the same marginal distribution function.

Example: If X_1, X_2, ..., X_n are iid random variables generated by n independent tosses of a fair coin, each taking values 0 and 1, then

  p_X(1, 1, ..., 1) = (1/2)^n
Moments of Multiple Random Variables

Consider n jointly distributed random variables represented by the random vector X = [X_1, X_2, ..., X_n]'. The expected value of any scalar-valued function g(X) is defined using the n-fold integral as

  E g(X) = ∫_{−∞}^{∞} ... ∫_{−∞}^{∞} g(x_1, x_2, ..., x_n) f_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_n) dx_1 dx_2 ... dx_n

The mean vector of X, denoted by µ_X, is defined as

  µ_X = E(X) = [E(X_1) E(X_2) ... E(X_n)]' = [µ_{X_1} µ_{X_2} ... µ_{X_n}]'

Similarly, for each pair (i, j), i = 1, 2, ..., n, j = 1, 2, ..., n, we can define the covariance

  Cov(X_i, X_j) = E(X_i − µ_{X_i})(X_j − µ_{X_j})

All the possible covariances can be represented in terms of a matrix, called the covariance matrix C_X, defined by

  C_X = E(X − µ_X)(X − µ_X)'
      = [ var(X_1)        cov(X_1, X_2)   ...   cov(X_1, X_n)
          cov(X_2, X_1)   var(X_2)        ...   cov(X_2, X_n)
          ...
          cov(X_n, X_1)   cov(X_n, X_2)   ...   var(X_n) ]

Properties of the Covariance Matrix

• C_X is a symmetric matrix, because Cov(X_i, X_j) = Cov(X_j, X_i).
• C_X is a non-negative definite matrix in the sense that, for any real vector z ≠ 0, the quadratic form z'C_X z ≥ 0. The result can be proved as follows:

  z'C_X z = z'E(X − µ_X)(X − µ_X)'z = E[ z'(X − µ_X)(X − µ_X)'z ] = E[ z'(X − µ_X) ]² ≥ 0

The covariance matrix represents the second-order relationship between each pair of the random variables and plays an important role in applications of random variables.

• The n random variables X_1, X_2, ..., X_n are called uncorrelated if, for each pair (i, j) with i ≠ j,

  Cov(X_i, X_j) = 0

If X_1, X_2, ..., X_n are uncorrelated, C_X is a diagonal matrix.
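The mean vector and covariance matrix are exactly what numpy.cov estimates from data. A brief illustrative sketch:

import numpy as np

rng = np.random.default_rng(5)
n = 100_000
x1 = rng.standard_normal(n)
x2 = x1 + rng.standard_normal(n)           # correlated with x1
x3 = rng.standard_normal(n)                # independent of the others
X = np.vstack([x1, x2, x3])                # rows are the random variables

mu = X.mean(axis=1)                        # estimate of the mean vector
C = np.cov(X)                              # 3x3 covariance matrix estimate
print(mu)
print(C)                                   # symmetric; C[0, 2] and C[1, 2] near 0
print(np.all(np.linalg.eigvalsh(C) >= 0))  # non-negative definiteness of the estimate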

Multiple Jointly Gaussian Random Variables

For any positive integer n, let X_1, X_2, ..., X_n represent n jointly distributed random variables. These random variables define the random vector X = [X_1, X_2, ..., X_n]'. The random variables are called jointly Gaussian if X_1, X_2, ..., X_n have the joint probability density function

  f_{X_1, X_2, ..., X_n}(x_1, x_2, ..., x_n) = e^{−(1/2)(x − µ_X)' C_X^{−1} (x − µ_X)} / √( (2π)^n det(C_X) )

where C_X = E(X − µ_X)(X − µ_X)' is the covariance matrix and µ_X = E(X) = [E(X_1) E(X_2) ... E(X_n)]' is the vector formed by the means of the random variables.
Page 172: 35229433 Random Variale Random Process

Vector space Interpretation of Random Variables

Consider a set $V$ with elements called vectors and the field of real numbers $\mathbb{R}$. $V$ is called a vector space if and only if

1. An operation vector addition '+' is defined in $V$ such that $(V, +)$ is a commutative group. Thus $(V, +)$ satisfies the following properties.
(i) For any pair of elements $\mathbf{v}, \mathbf{w} \in V$, there exists a unique element $(\mathbf{v} + \mathbf{w}) \in V$.
(ii) Vector addition is associative: $\mathbf{v} + (\mathbf{w} + \mathbf{z}) = (\mathbf{v} + \mathbf{w}) + \mathbf{z}$ for any three vectors $\mathbf{v}, \mathbf{w}, \mathbf{z} \in V$.
(iii) There is a vector $\mathbf{0} \in V$ such that $\mathbf{v} + \mathbf{0} = \mathbf{0} + \mathbf{v} = \mathbf{v}$ for any $\mathbf{v} \in V$.
(iv) For any $\mathbf{v} \in V$ there is a vector $-\mathbf{v} \in V$ such that $\mathbf{v} + (-\mathbf{v}) = \mathbf{0} = (-\mathbf{v}) + \mathbf{v}$.
(v) For any $\mathbf{v}, \mathbf{w} \in V$, $\mathbf{v} + \mathbf{w} = \mathbf{w} + \mathbf{v}$.

2. For any element $\mathbf{v} \in V$ and any $r \in \mathbb{R}$, the scalar product $r\mathbf{v} \in V$. This scalar product has the following properties for any $r, s \in \mathbb{R}$ and any $\mathbf{v}, \mathbf{w} \in V$:
3. $r(s\mathbf{v}) = (rs)\mathbf{v}$ for $r, s \in \mathbb{R}$ and $\mathbf{v} \in V$
4. $r(\mathbf{v} + \mathbf{w}) = r\mathbf{v} + r\mathbf{w}$
5. $(r + s)\mathbf{v} = r\mathbf{v} + s\mathbf{v}$
6. $1\mathbf{v} = \mathbf{v}$

It is easy to verify that the set of all random variables defined on a probability space $(S, \mathcal{F}, P)$ forms a vector space with respect to addition and scalar multiplication. Similarly, the set of all $n$-dimensional random vectors forms a vector space. The interpretation of random variables as elements of a vector space helps in understanding many operations involving random variables.

Linear Independence

Consider $N$ random vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_N$. If

$$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_N\mathbf{v}_N = \mathbf{0}$$

implies that $c_1 = c_2 = \cdots = c_N = 0$, then $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_N$ are linearly independent.

For $N$ random vectors $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_N$, if

$$c_1\mathbf{X}_1 + c_2\mathbf{X}_2 + \cdots + c_N\mathbf{X}_N = \mathbf{0}$$

implies that $c_1 = c_2 = \cdots = c_N = 0$, then $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_N$ are linearly independent.

Inner Product
If $\mathbf{v}$ and $\mathbf{w}$ are real vectors in a vector space $V$ defined over the field $\mathbb{R}$, the inner product $\langle \mathbf{v}, \mathbf{w} \rangle$ is a scalar such that, $\forall\, \mathbf{v}, \mathbf{w}, \mathbf{z} \in V$ and $r \in \mathbb{R}$,

1. $\langle \mathbf{v}, \mathbf{w} \rangle = \langle \mathbf{w}, \mathbf{v} \rangle$
2. $\langle \mathbf{v}, \mathbf{v} \rangle = \|\mathbf{v}\|^2 \geq 0$, where $\|\mathbf{v}\|$ is a norm induced by the inner product
3. $\langle \mathbf{v} + \mathbf{w}, \mathbf{z} \rangle = \langle \mathbf{v}, \mathbf{z} \rangle + \langle \mathbf{w}, \mathbf{z} \rangle$
4. $\langle r\mathbf{v}, \mathbf{w} \rangle = r\langle \mathbf{v}, \mathbf{w} \rangle$

In the case of two random variables $X$ and $Y$, the joint expectation $EXY$ defines an inner product between $X$ and $Y$. Thus

$$\langle X, Y \rangle = EXY$$

We can easily verify that $EXY$ satisfies the axioms of an inner product.

The norm of a random variable $X$ is given by

$$\|X\|^2 = EX^2$$

For two $n$-dimensional random vectors $\mathbf{X} = [X_1, X_2, \ldots, X_n]'$ and $\mathbf{Y} = [Y_1, Y_2, \ldots, Y_n]'$, the inner product is

$$\langle \mathbf{X}, \mathbf{Y} \rangle = E\mathbf{X}'\mathbf{Y} = \sum_{i=1}^{n} E X_i Y_i$$

The norm of a random vector $\mathbf{X}$ is given by

$$\|\mathbf{X}\|^2 = \langle \mathbf{X}, \mathbf{X} \rangle = E\mathbf{X}'\mathbf{X} = \sum_{i=1}^{n} E X_i^2$$

• The set of RVs along with the inner product defined through the joint expectation

operation and the corresponding norm defines a Hilbert Space.

Schwarz Inequality
For any two vectors $\mathbf{v}$ and $\mathbf{w}$ belonging to a Hilbert space $V$,

$$|\langle \mathbf{v}, \mathbf{w} \rangle| \leq \|\mathbf{v}\|\,\|\mathbf{w}\|$$

This means that for any two random variables $X$ and $Y$,

$$(EXY)^2 \leq EX^2\, EY^2$$

Similarly, for any two random vectors $\mathbf{X}$ and $\mathbf{Y}$,

$$(E\mathbf{X}'\mathbf{Y})^2 \leq E\mathbf{X}'\mathbf{X}\; E\mathbf{Y}'\mathbf{Y}$$

Orthogonal Random Variables and Orthogonal Random Vectors
Two vectors $\mathbf{v}$ and $\mathbf{w}$ are called orthogonal if $\langle \mathbf{v}, \mathbf{w} \rangle = 0$.

Two random variables $X$ and $Y$ are called orthogonal if $EXY = 0$. Similarly, two random vectors $\mathbf{X}$ and $\mathbf{Y}$ are called orthogonal if

$$E\mathbf{X}'\mathbf{Y} = \sum_{i=1}^{n} E X_i Y_i = 0$$

Just like the independent random variables and the uncorrelated random variables, the orthogonal random variables form an important class of random variables.

Remark

If $X$ and $Y$ are uncorrelated, then

$$E(X - \mu_X)(Y - \mu_Y) = 0$$

$\therefore (X - \mu_X)$ is orthogonal to $(Y - \mu_Y)$.

If each of $X$ and $Y$ is zero-mean,

$$\operatorname{Cov}(X, Y) = EXY$$

In this case, $EXY = 0 \Leftrightarrow \operatorname{Cov}(X, Y) = 0$.


Minimum Mean-square-error Estimation

Suppose $X$ is a random variable which is not observable and $Y$ is another observable random variable which is statistically dependent on $X$ through the joint probability density function $f_{X,Y}(x, y)$. We pose the following problem:

Given a value of $Y$, what is the best guess for $X$?

This problem is known as the estimation problem and has many practical applications. One application is signal estimation from noisy observations, as illustrated in the figure below:

[Figure: Signal $X$ plus noise gives the noisy observation $Y$; an estimator produces the estimated signal $\hat{X}$.]

Let $\hat{X}(Y)$ be the estimate of the random variable $X$ based on $Y$. Clearly $\hat{X}(Y)$ is a function of $Y$. We have to find the best estimate $\hat{X}(Y)$ in some meaningful sense. Observe that

• $X$ is the unknown random variable.
• $\hat{X}(Y)$ is the estimate of $X$.
• $X - \hat{X}(Y)$ is the estimation error.
• $E(X - \hat{X}(Y))^2$ is the mean of the square error.

One meaningful criterion is to minimize $E(X - \hat{X}(Y))^2$ with respect to $\hat{X}(Y)$; the corresponding estimation principle is called the minimum mean square error principle. Such a function which we want to minimize is called a cost function in optimization theory. For finding $\hat{X}(Y)$, we have to minimize the cost function

$$\begin{aligned}
E(X - \hat{X}(Y))^2 &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - \hat{X}(y))^2 f_{X,Y}(x, y)\, dy\, dx \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - \hat{X}(y))^2 f_Y(y) f_{X/Y}(x/y)\, dy\, dx \\
&= \int_{-\infty}^{\infty} f_Y(y)\left(\int_{-\infty}^{\infty} (x - \hat{X}(y))^2 f_{X/Y}(x/y)\, dx\right) dy
\end{aligned}$$

Since $f_Y(y)$ is always positive, the minimization of $E(X - \hat{X}(Y))^2$ with respect to $\hat{X}(Y)$ is equivalent to minimizing the inside integral

$$\int_{-\infty}^{\infty} (x - \hat{X}(y))^2 f_{X/Y}(x/y)\, dx$$

with respect to $\hat{X}(y)$. The condition for the minimum is

$$\frac{\partial}{\partial \hat{X}(y)} \int_{-\infty}^{\infty} (x - \hat{X}(y))^2 f_{X/Y}(x/y)\, dx = 0$$

Or,

$$-2\int_{-\infty}^{\infty} (x - \hat{X}(y)) f_{X/Y}(x/y)\, dx = 0$$

$$\Rightarrow \hat{X}(y)\int_{-\infty}^{\infty} f_{X/Y}(x/y)\, dx = \int_{-\infty}^{\infty} x f_{X/Y}(x/y)\, dx$$

$$\Rightarrow \hat{X}(y) = E(X / Y = y)$$

Thus, the minimum mean-square error estimation involves the conditional expectation $E(X/Y = y)$. To find $E(X/Y = y)$, we have to determine the a posteriori probability density $f_{X/Y}(x/y)$ and perform $\int_{-\infty}^{\infty} x f_{X/Y}(x/y)\, dx$. These operations are computationally expensive when we have to perform them numerically.
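As a sanity check, the sketch below (a minimal Monte Carlo example with a made-up zero-mean jointly Gaussian model) compares the mean-square error of the conditional-mean estimator with that of an arbitrary alternative estimator.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Assumed model for illustration: X = 0.8*Y + noise, all zero mean
Y = rng.standard_normal(n)
X = 0.8 * Y + 0.6 * rng.standard_normal(n)

# For this model E(X | Y = y) = 0.8 y, so the MMSE estimator is 0.8*Y
mse_conditional_mean = np.mean((X - 0.8 * Y) ** 2)

# Any other function of Y gives a larger mean-square error
mse_other = np.mean((X - (0.5 * Y + 0.1)) ** 2)

print(mse_conditional_mean, mse_other)   # the first value should be smaller
```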

Example Consider two zero-mean jointly Gaussian random variables X and Y with the joint pdf

$$f_{X,Y}(x, y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1 - \rho_{X,Y}^2}}\, e^{-\frac{1}{2(1 - \rho_{X,Y}^2)}\left[\frac{x^2}{\sigma_X^2} - \frac{2\rho_{X,Y}\,xy}{\sigma_X\sigma_Y} + \frac{y^2}{\sigma_Y^2}\right]}, \quad -\infty < x < \infty,\ -\infty < y < \infty$$

The marginal density $f_Y(y)$ is Gaussian and is given by

$$f_Y(y) = \frac{1}{\sqrt{2\pi}\,\sigma_Y}\, e^{-\frac{y^2}{2\sigma_Y^2}}$$

$$\therefore f_{X/Y}(x/y) = \frac{f_{X,Y}(x, y)}{f_Y(y)} = \frac{1}{\sqrt{2\pi}\,\sigma_X\sqrt{1 - \rho_{X,Y}^2}}\, e^{-\frac{1}{2\sigma_X^2(1 - \rho_{X,Y}^2)}\left(x - \frac{\rho_{X,Y}\sigma_X}{\sigma_Y}\,y\right)^2}$$

which is Gaussian with mean $\frac{\rho_{X,Y}\sigma_X}{\sigma_Y}\, y$. Therefore, the MMSE estimator of $X$ given $Y = y$ is

$$\hat{X}(y) = E(X / Y = y) = \frac{\rho_{X,Y}\sigma_X}{\sigma_Y}\, y$$

This example illustrates that in the case of jointly Gaussian random variables $X$ and $Y$, the mean-square estimator of $X$ given $Y = y$ is linearly related with $y$. This important result gives us a clue to a simpler version of the mean-square error estimation problem, discussed below.

Linear Minimum Mean-square-error Estimation and the Orthogonality Principle

We assume that $X$ and $Y$ are both zero-mean and $\hat{X}(y) = ay$. The estimation problem is now to find the optimal value of $a$. Thus we have the linear minimum mean-square error criterion, which minimizes $E(X - aY)^2$ with respect to $a$.

[Figure: Signal $X$ plus noise gives the noisy observation $Y$; the linear estimator $\hat{X} = aY$ produces the estimated signal.]

$$\frac{d}{da} E(X - aY)^2 = 0 \;\Rightarrow\; E(X - aY)Y = 0 \;\Rightarrow\; E e Y = 0$$

where $e = X - aY$ is the estimation error.

Thus the optimum value of $a$ is such that the estimation error $(X - aY)$ is orthogonal to the observed random variable $Y$, and the optimal estimator $aY$ is the orthogonal projection of $X$ on $Y$. This orthogonality principle forms the heart of a class of estimation problems called Wiener filtering. The orthogonality principle is illustrated geometrically in the following figure.

[Figure: $X$, its orthogonal projection $aY$ on $Y$, and the error $e$, which is orthogonal to $Y$.]

The optimum value of $a$ is given by

$$E(X - aY)Y = 0 \;\Rightarrow\; EXY - aEY^2 = 0 \;\Rightarrow\; a = \frac{EXY}{EY^2}$$

The corresponding minimum linear mean-square error (LMMSE) is

$$\begin{aligned}
LMMSE &= E(X - aY)^2 \\
&= E(X - aY)X - aE(X - aY)Y \\
&= E(X - aY)X \qquad (\because E(X - aY)Y = 0, \text{ using the orthogonality principle}) \\
&= EX^2 - aEXY
\end{aligned}$$
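The sketch below (a minimal Monte Carlo illustration with made-up zero-mean data) estimates the optimum coefficient $a = EXY/EY^2$ from samples and verifies the orthogonality condition $EeY \approx 0$.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# Assumed zero-mean model: noisy observation of the signal X
X = rng.standard_normal(n)
Y = X + 0.5 * rng.standard_normal(n)

# Optimum linear coefficient a = EXY / EY^2, estimated from the samples
a = np.mean(X * Y) / np.mean(Y ** 2)

e = X - a * Y                              # estimation error
print("a =", a)
print("E[eY] ~", np.mean(e * Y))           # close to 0: error is orthogonal to Y
print("LMMSE ~", np.mean(X ** 2) - a * np.mean(X * Y))
print("empirical MSE ~", np.mean(e ** 2))  # should match the LMMSE value
```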

The orthogonality principle can be applied to optimal estimation of a random variable from more than one observation. We illustrate this in the following example.

Example Suppose $X$ is a zero-mean random variable which is to be estimated from two zero-mean random variables $Y_1$ and $Y_2$. Let the LMMSE estimator be $\hat{X} = a_1 Y_1 + a_2 Y_2$. Then the optimal values of $a_1$ and $a_2$ are given by

$$\frac{\partial}{\partial a_i}\, E(X - a_1 Y_1 - a_2 Y_2)^2 = 0, \quad i = 1, 2.$$

This results in the orthogonality conditions

$$E(X - a_1 Y_1 - a_2 Y_2)Y_1 = 0$$

and

$$E(X - a_1 Y_1 - a_2 Y_2)Y_2 = 0$$

Rewriting the above equations, we get

$$a_1 EY_1^2 + a_2 EY_1 Y_2 = EXY_1$$

and

$$a_1 EY_1 Y_2 + a_2 EY_2^2 = EXY_2$$

Solving these equations we can find $a_1$ and $a_2$. Further, the corresponding minimum linear mean-square error (LMMSE) is

$$\begin{aligned}
LMMSE &= E(X - a_1 Y_1 - a_2 Y_2)^2 \\
&= E(X - a_1 Y_1 - a_2 Y_2)X - a_1 E(X - a_1 Y_1 - a_2 Y_2)Y_1 - a_2 E(X - a_1 Y_1 - a_2 Y_2)Y_2 \\
&= E(X - a_1 Y_1 - a_2 Y_2)X \qquad (\text{using the orthogonality principle}) \\
&= EX^2 - a_1 EXY_1 - a_2 EXY_2
\end{aligned}$$
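A minimal numerical sketch (made-up data; NumPy assumed) of solving the two normal equations above for $a_1, a_2$ and checking the resulting LMMSE:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

# Assumed zero-mean model: X observed through two noisy channels
X = rng.standard_normal(n)
Y1 = X + 0.3 * rng.standard_normal(n)
Y2 = 0.5 * X + 0.7 * rng.standard_normal(n)

# Normal equations:  [EY1^2  EY1Y2] [a1]   [EXY1]
#                    [EY1Y2  EY2^2] [a2] = [EXY2]
R = np.array([[np.mean(Y1 * Y1), np.mean(Y1 * Y2)],
              [np.mean(Y1 * Y2), np.mean(Y2 * Y2)]])
r = np.array([np.mean(X * Y1), np.mean(X * Y2)])
a1, a2 = np.linalg.solve(R, r)

lmmse = np.mean(X ** 2) - a1 * r[0] - a2 * r[1]
print(a1, a2, lmmse)
```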


Convergence of a sequence of random variables
Let $X_1, X_2, \ldots, X_n$ be a sequence of independent and identically distributed random variables. Suppose we want to estimate the mean of the random variable on the basis of the observed data by means of the relation

$$\mu_n = \frac{1}{n}\sum_{i=1}^{n} X_i$$

How closely does $\mu_n$ represent the true mean $\mu_X$ as $n$ is increased? How do we measure the closeness between $\mu_n$ and $\mu_X$?

Notice that $\mu_n$ is a random variable. What do we mean by the statement "$\mu_n$ converges to $\mu_X$"?

• Consider a deterministic sequence of real numbers $x_1, x_2, \ldots, x_n, \ldots$ The sequence converges to a limit $x$ if, corresponding to every $\varepsilon > 0$, we can find a positive integer $N$ such that $|x_n - x| < \varepsilon$ for $n > N$. For example, the sequence $1, \frac{1}{2}, \ldots, \frac{1}{n}, \ldots$ converges to the number 0.

• The Cauchy criterion gives the condition for convergence of a sequence without actually finding the limit. The sequence $x_1, x_2, \ldots, x_n, \ldots$ converges if and only if, for every $\varepsilon > 0$, there exists a positive integer $N$ such that $|x_{n+m} - x_n| < \varepsilon$ for all $n > N$ and all $m > 0$.

Convergence of a random sequence $X_1, X_2, \ldots, X_n, \ldots$ cannot be defined as above. Note that for each $s \in S$, $X_1(s), X_2(s), \ldots, X_n(s), \ldots$ represents a sequence of numbers. Thus $X_1, X_2, \ldots, X_n, \ldots$ represents a family of sequences of numbers. Convergence of a random sequence is to be defined using different criteria. Five of these criteria are explained below.

Convergence Everywhere
A sequence of random variables is said to converge everywhere to $X$ if

$$X_n(s) \to X(s) \text{ as } n \to \infty \text{ for every } s \in S.$$

Note here that the sequence of numbers for each sample point is convergent.

Almost sure (a.s.) convergence or convergence with probability 1
A random sequence $X_1, X_2, \ldots, X_n, \ldots$ may not converge for every $s \in S$. Consider the event $\{s \mid X_n(s) \to X(s)\}$.


The sequence $X_1, X_2, \ldots, X_n, \ldots$ is said to converge to $X$ almost surely or with probability 1 if

$$P\{s \mid X_n(s) \to X(s)\} = 1 \text{ as } n \to \infty,$$

or equivalently, for every $\varepsilon > 0$ there exists $N$ such that

$$P\{s \mid |X_n(s) - X(s)| < \varepsilon \text{ for all } n \geq N\} = 1$$

We write $X_n \xrightarrow{a.s.} X$ in this case.

One important application is the Strong Law of Large Numbers (SLLN):

If $X_1, X_2, \ldots, X_n, \ldots$ are independent and identically distributed random variables with a finite mean $\mu_X$, then

$$\frac{1}{n}\sum_{i=1}^{n} X_i \to \mu_X \text{ with probability 1 as } n \to \infty.$$

Remark:
• $\mu_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ is called the sample mean.
• The strong law of large numbers states that the sample mean converges to the true mean as the sample size increases.
• The SLLN is one of the fundamental theorems of probability. There is a weaker version of the law that we will discuss later.
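A minimal simulation sketch (using an exponential distribution as a made-up iid source) showing the sample mean settling near the true mean as the sample size grows:

```python
import numpy as np

rng = np.random.default_rng(4)

true_mean = 2.0
samples = rng.exponential(scale=true_mean, size=1_000_000)  # iid with mean 2.0

# Sample mean after n observations approaches the true mean as n grows
for n in (10, 1_000, 100_000, 1_000_000):
    print(n, samples[:n].mean())
```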

Convergence in mean-square sense
A random sequence $X_1, X_2, \ldots, X_n, \ldots$ is said to converge in the mean-square sense (m.s.) to a random variable $X$ if

$$E(X_n - X)^2 \to 0 \text{ as } n \to \infty$$

$X$ is called the mean-square limit of the sequence and we write

$$\operatorname{l.i.m.} X_n = X$$

where l.i.m. means limit in mean-square. We also write $X_n \xrightarrow{m.s.} X$.

• The following Cauchy criterion gives the condition for m.s. convergence of a random sequence without actually finding the limit. The sequence $X_1, X_2, \ldots, X_n, \ldots$ converges in m.s. if and only if

$$E\left[(X_{n+m} - X_n)^2\right] \to 0 \text{ as } n \to \infty \text{ for all } m > 0.$$

Example:

If $X_1, X_2, \ldots, X_n, \ldots$ are iid random variables, then

$$\frac{1}{n}\sum_{i=1}^{n} X_i \to \mu_X \text{ in the mean-square sense as } n \to \infty.$$

We have to show that

$$\lim_{n\to\infty} E\left(\frac{1}{n}\sum_{i=1}^{n} X_i - \mu_X\right)^2 = 0$$

Now,

$$\begin{aligned}
E\left(\frac{1}{n}\sum_{i=1}^{n} X_i - \mu_X\right)^2 &= \frac{1}{n^2}\, E\left(\sum_{i=1}^{n}(X_i - \mu_X)\right)^2 \\
&= \frac{1}{n^2}\sum_{i=1}^{n} E(X_i - \mu_X)^2 + \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1, j\neq i}^{n} E(X_i - \mu_X)(X_j - \mu_X) \\
&= \frac{\sigma_X^2}{n} + 0 \qquad (\text{because of independence})
\end{aligned}$$

$$\therefore \lim_{n\to\infty} E\left(\frac{1}{n}\sum_{i=1}^{n} X_i - \mu_X\right)^2 = 0$$

Convergence in probability
Associated with the sequence of random variables $X_1, X_2, \ldots, X_n, \ldots$, we can define a sequence of probabilities $P\{|X_n - X| > \varepsilon\}$, $n = 1, 2, \ldots$, for every $\varepsilon > 0$.

The sequence $X_1, X_2, \ldots, X_n, \ldots$ is said to converge to $X$ in probability if this sequence of probabilities is convergent, that is,

$$P\{|X_n - X| > \varepsilon\} \to 0 \text{ as } n \to \infty$$

for every $\varepsilon > 0$. We write $X_n \xrightarrow{P} X$ to denote convergence in probability of the sequence of random variables $X_1, X_2, \ldots, X_n, \ldots$ to the random variable $X$.

If a sequence is convergent in the mean-square sense, then it is convergent in probability also, because by the Markov inequality

$$P\{|X_n - X| > \varepsilon\} \leq \frac{E(X_n - X)^2}{\varepsilon^2}$$

If $E(X_n - X)^2 \to 0$ as $n \to \infty$ (mean-square convergence), then $P\{|X_n - X| > \varepsilon\} \to 0$ as $n \to \infty$.

Example:
Suppose $\{X_n\}$ is a sequence of random variables with

$$P\{X_n = 1\} = \frac{1}{n} \quad \text{and} \quad P\{X_n = 0\} = 1 - \frac{1}{n}$$

Clearly,

$$P\{|X_n - 0| > \varepsilon\} = P\{X_n = 1\} = \frac{1}{n} \to 0 \text{ as } n \to \infty.$$

Therefore $X_n \xrightarrow{P} X = 0$.

Thus the above sequence converges to a constant in probability.

Remark:

Convergence in probability is also called stochastic convergence.

Weak Law of Large Numbers
If $X_1, X_2, \ldots, X_n, \ldots$ are independent and identically distributed random variables with sample mean $\mu_n = \frac{1}{n}\sum_{i=1}^{n} X_i$, then $\mu_n \xrightarrow{P} \mu_X$ as $n \to \infty$.

We have

$$E\mu_n = \frac{1}{n}\sum_{i=1}^{n} EX_i = \mu_X$$

and

$$E(\mu_n - \mu_X)^2 = \frac{\sigma_X^2}{n} \quad (\text{as shown above})$$

$$\therefore P\{|\mu_n - \mu_X| \geq \varepsilon\} \leq \frac{E(\mu_n - \mu_X)^2}{\varepsilon^2} = \frac{\sigma_X^2}{n\varepsilon^2} \to 0 \text{ as } n \to \infty.$$

Convergence in distribution
Consider the random sequence $X_1, X_2, \ldots, X_n, \ldots$ and a random variable $X$. Suppose $F_{X_n}(x)$ and $F_X(x)$ are the distribution functions of $X_n$ and $X$ respectively. The sequence is said to converge to $X$ in distribution if

$$F_{X_n}(x) \to F_X(x) \text{ as } n \to \infty$$

for all $x$ at which $F_X(x)$ is continuous. Here the two distribution functions eventually coincide. We write $X_n \xrightarrow{d} X$ to denote convergence in distribution of the random sequence $X_1, X_2, \ldots, X_n, \ldots$ to the random variable $X$.

Example: Suppose $X_1, X_2, \ldots, X_n, \ldots$ is a sequence of RVs with each random variable $X_i$ having the uniform density

$$f_{X_i}(x) = \begin{cases} \dfrac{1}{a} & 0 \leq x \leq a \\ 0 & \text{otherwise} \end{cases}$$

Define $Z_n = \max(X_1, X_2, \ldots, X_n)$. We can show that

$$F_{Z_n}(z) = \begin{cases} 0, & z < 0 \\ \left(\dfrac{z}{a}\right)^n, & 0 \leq z < a \\ 1 & \text{otherwise} \end{cases}$$

Clearly,

$$\lim_{n\to\infty} F_{Z_n}(z) = F_Z(z) = \begin{cases} 0, & z < a \\ 1, & z \geq a \end{cases}$$

Thus $Z_n$ converges to $Z$ in distribution.
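A small simulation sketch (assuming $a = 1$) illustrating how the distribution of $Z_n = \max(X_1, \ldots, X_n)$ piles up near $a$ as $n$ grows:

```python
import numpy as np

rng = np.random.default_rng(5)
a = 1.0

for n in (1, 5, 50, 500):
    Z = rng.uniform(0, a, size=(10_000, n)).max(axis=1)   # 10,000 realizations of Z_n
    # P(Z_n <= 0.9): the theoretical value (0.9/a)**n tends to 0 as n grows
    print(n, np.mean(Z <= 0.9), (0.9 / a) ** n)
```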

Relation between Types of Convergence

[Figure: Implication diagram relating the four modes of convergence. Almost-sure convergence ($X_n \xrightarrow{a.s.} X$) and mean-square convergence ($X_n \xrightarrow{m.s.} X$) each imply convergence in probability ($X_n \xrightarrow{P} X$), which in turn implies convergence in distribution ($X_n \xrightarrow{d} X$).]

Central Limit Theorem
Consider $n$ independent random variables $X_1, X_2, \ldots, X_n$. The mean and variance of each of the random variables are known. Suppose $E(X_i) = \mu_{X_i}$ and $\operatorname{var}(X_i) = \sigma_{X_i}^2$.

Form a random variable

$$Y_n = X_1 + X_2 + \cdots + X_n$$

The mean and variance of $Y_n$ are given by

$$\mu_{Y_n} = EY_n = \mu_{X_1} + \mu_{X_2} + \cdots + \mu_{X_n}$$

and

$$\begin{aligned}
\operatorname{var}(Y_n) = \sigma_{Y_n}^2 &= E\left(\sum_{i=1}^{n}(X_i - \mu_{X_i})\right)^2 \\
&= \sum_{i=1}^{n} E(X_i - \mu_{X_i})^2 + \sum_{i=1}^{n}\sum_{j=1, j\neq i}^{n} E(X_i - \mu_{X_i})(X_j - \mu_{X_j}) \\
&= \sigma_{X_1}^2 + \sigma_{X_2}^2 + \cdots + \sigma_{X_n}^2 \qquad (\because X_i \text{ and } X_j \text{ are independent for } i \neq j)
\end{aligned}$$

Thus we can determine the mean and variance of $Y_n$. Can we guess the probability distribution of $Y_n$?

The central limit theorem (CLT) provides an answer to this question.

The CLT states that under very general conditions $Y_n = \sum_{i=1}^{n} X_i$ converges in distribution to $Y \sim N(\mu_Y, \sigma_Y^2)$ as $n \to \infty$. The conditions are:

1. The random variables $X_1, X_2, \ldots, X_n$ are independent with the same mean and variance, but not necessarily identically distributed.
2. The random variables $X_1, X_2, \ldots, X_n$ are independent, with different means and the same variance, and not identically distributed.
3. The random variables $X_1, X_2, \ldots, X_n$ are independent, with different means and each variance being neither too small nor too large.

We shall consider the first condition only. In this case, the central-limit theorem can

be stated as follows:


Suppose $X_1, X_2, \ldots, X_n$ is a sequence of independent and identically distributed random variables, each with mean $\mu_X$ and variance $\sigma_X^2$, and

$$Y_n = \frac{\sum_{i=1}^{n}(X_i - \mu_X)}{\sqrt{n}}.$$

Then the sequence $\{Y_n\}$ converges in distribution to a Gaussian random variable $Y$ with mean 0 and variance $\sigma_X^2$. That is,

$$\lim_{n\to\infty} F_{Y_n}(y) = \frac{1}{\sqrt{2\pi}\,\sigma_X}\int_{-\infty}^{y} e^{-\frac{u^2}{2\sigma_X^2}}\, du$$
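A minimal simulation sketch (using an exponential distribution as the made-up iid source) showing the normalized sum settling toward a Gaussian shape:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(6)
n, trials = 200, 10_000

# iid X_i ~ Exp(1): mean 1, variance 1
X = rng.exponential(scale=1.0, size=(trials, n))
Yn = (X - 1.0).sum(axis=1) / np.sqrt(n)       # normalized sum, one value per trial

# Compare a few empirical CDF values with the N(0,1) CDF
for y in (-1.0, 0.0, 1.0):
    phi = 0.5 * (1 + erf(y / sqrt(2)))        # standard normal CDF
    print(y, np.mean(Yn <= y), phi)           # the two values should be close
```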

Remarks
• The central limit theorem is really a property of convolution. Consider the sum of two statistically independent random variables, say $Y = X_1 + X_2$. Then the pdf $f_Y(y)$ is the convolution of $f_{X_1}(x)$ and $f_{X_2}(x)$. This can be shown with the help of the characteristic functions as follows:

$$\phi_Y(\omega) = E\left[e^{j\omega(X_1 + X_2)}\right] = E\,e^{j\omega X_1}\; E\,e^{j\omega X_2} = \phi_{X_1}(\omega)\,\phi_{X_2}(\omega)$$

$$\therefore f_Y(y) = f_{X_1}(y) * f_{X_2}(y) = \int_{-\infty}^{\infty} f_{X_1}(\tau)\, f_{X_2}(y - \tau)\, d\tau$$

where $*$ is the convolution operation. We can illustrate this by convolving two uniform distributions repeatedly. The convolution of two uniform distributions gives a triangular distribution. Further convolution gives a parabolic distribution, and so on.
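The repeated-convolution idea can be sketched numerically (a minimal example on a discretized uniform density):

```python
import numpy as np

dx = 0.01
uniform = np.ones(int(1 / dx))                 # uniform density on [0, 1), discretized

pdf = uniform
for k in range(2, 5):
    pdf = np.convolve(pdf, uniform) * dx       # density of the sum of k uniforms
    print(f"sum of {k} uniforms: peak density ~ {pdf.max():.3f}")
# 2 uniforms -> triangular, 3 -> piecewise parabolic, ...; the shape approaches a Gaussian bell
```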


Proof of the central limit theorem

We give a less rigorous proof of the theorem with the help of the characteristic function. Further, we consider each of $X_1, X_2, \ldots, X_n$ to have zero mean. Thus,

$$Y_n = (X_1 + X_2 + \cdots + X_n)/\sqrt{n}.$$

Clearly,

$$\mu_{Y_n} = 0, \quad \sigma_{Y_n}^2 = \sigma_X^2, \quad E(Y_n)^3 = E(X)^3/\sqrt{n}, \text{ and so on.}$$

The characteristic function of $Y_n$ is given by

$$\phi_{Y_n}(\omega) = E\left[e^{j\omega Y_n}\right] = E\left[e^{\frac{j\omega}{\sqrt{n}}\sum_{i=1}^{n} X_i}\right]$$

We will show that as $n \to \infty$ the characteristic function $\phi_{Y_n}$ is of the form of the characteristic function of a Gaussian random variable. Expanding $e^{j\omega Y_n}$ in a power series,

$$e^{j\omega Y_n} = 1 + j\omega Y_n + \frac{(j\omega)^2}{2!}Y_n^2 + \frac{(j\omega)^3}{3!}Y_n^3 + \cdots$$

Assume all the moments of $Y_n$ to be finite. Then

$$\phi_{Y_n}(\omega) = E\left[e^{j\omega Y_n}\right] = 1 + j\omega\mu_{Y_n} + \frac{(j\omega)^2}{2!}E(Y_n^2) + \frac{(j\omega)^3}{3!}E(Y_n^3) + \cdots$$

Substituting $\mu_{Y_n} = 0$ and $E(Y_n^2) = \sigma_{Y_n}^2 = \sigma_X^2$, we get

$$\phi_{Y_n}(\omega) = 1 - (\omega^2\sigma_X^2/2!) + R(\omega, n)$$

where $R(\omega, n)$ is the average of terms involving $\omega^3$ and higher powers of $\omega$. Note also that each term in $R(\omega, n)$ involves a ratio of a higher moment and a power of $n$, and therefore

$$\lim_{n\to\infty} R(\omega, n) = 0$$

$$\therefore \lim_{n\to\infty} \phi_{Y_n}(\omega) = e^{-\frac{\omega^2\sigma_X^2}{2}}$$

which is the characteristic function of a Gaussian random variable with 0 mean and variance $\sigma_X^2$. Hence

$$Y_n \xrightarrow{d} N(0, \sigma_X^2)$$

Remark:
(1) Under the conditions of the CLT, the sample mean $\hat{\mu}_X = \frac{1}{n}\sum_{i=1}^{n} X_i$ converges in distribution to $N\left(\mu_X, \frac{\sigma_X^2}{n}\right)$. In other words, if samples are taken from any distribution with mean $\mu_X$ and variance $\sigma_X^2$, then as the sample size $n$ increases, the distribution function of the sample mean approaches the distribution function of a Gaussian random variable.

(2) The CLT states that the distribution function $F_{Y_n}(y)$ converges to a Gaussian distribution function. The theorem does not say that the pdf $f_{Y_n}(y)$ is a Gaussian pdf in the limit. For example, suppose each $X_i$ has a Bernoulli distribution. Then the pdf of $Y_n$ consists of impulses and can never approach the Gaussian pdf.

(3) The Cauchy distribution does not meet the conditions for the central limit theorem to hold. As we have noted earlier, this distribution does not have a finite mean or variance. Suppose a random variable $X_i$ has the Cauchy distribution

$$f_{X_i}(x) = \frac{1}{\pi(1 + x^2)}, \quad -\infty < x < \infty.$$

The characteristic function of $X_i$ is given by

$$\phi_{X_i}(\omega) = e^{-|\omega|}$$

The sample mean $\hat{\mu}_X = \frac{1}{n}\sum_{i=1}^{n} X_i$ will have the characteristic function

$$\phi(\omega) = e^{-|\omega|}$$

Thus the sum of a large number of Cauchy random variables will not follow a Gaussian distribution.

(4) The central limit theorem is one of the most widely used results of probability. If a random variable is the result of several independent causes, then the random variable can be considered to be Gaussian. For example,
- the thermal noise in a resistor is the result of the independent motion of billions of electrons and is modelled as Gaussian;
- the observation error/measurement error of any process is modelled as Gaussian.

(5) The CLT can be used to simulate a Gaussian distribution given a routine to simulate a particular random variable.
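A minimal sketch of that idea: generating approximately Gaussian samples by summing and normalizing uniform random variables (12 uniforms is a common choice, since each has variance 1/12).

```python
import numpy as np

rng = np.random.default_rng(7)

# Sum 12 Uniform(0,1) variables: mean 6, variance 12 * (1/12) = 1
U = rng.uniform(0.0, 1.0, size=(100_000, 12))
G = U.sum(axis=1) - 6.0          # approximately N(0, 1)

print(G.mean(), G.var())         # close to 0 and 1
print(np.mean(np.abs(G) <= 1))   # close to 0.683, the N(0,1) value
```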

Normal approximation of the Binomial distribution
One of the applications of the CLT is the approximation of the Binomial probabilities. Suppose $X_1, X_2, X_3, \ldots, X_n, \ldots$ is a sequence of Bernoulli($p$) random variables with $P\{X_i = 1\} = p$ and $P\{X_i = 0\} = 1 - p$.

Then $Y_n = \sum_{i=1}^{n} X_i$ has a Binomial distribution with $\mu_{Y_n} = np$ and $\sigma_{Y_n}^2 = np(1 - p)$.

Thus,

$$\frac{Y_n - np}{\sqrt{np(1-p)}} \xrightarrow{d} N(0, 1) \quad \text{or} \quad Y_n \xrightarrow{d} N(np,\, np(1-p))$$

$$\therefore P(k - 1 < Y_n \leq k) = \int_{k-1}^{k} \frac{1}{\sqrt{2\pi np(1-p)}}\, e^{-\frac{(y - np)^2}{2np(1-p)}}\, dy$$

$$\therefore P(Y_n = k) \simeq \frac{1}{\sqrt{2\pi np(1-p)}}\, e^{-\frac{(k - np)^2}{2np(1-p)}} \qquad (\text{taking the integration interval as } 1)$$

This is the normal approximation to the Binomial probabilities and is known as the De Moivre-Laplace approximation.
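A quick numerical sketch comparing the exact Binomial pmf with the De Moivre-Laplace approximation (SciPy assumed; n, p and k are made-up values):

```python
import numpy as np
from scipy.stats import binom

n, p, k = 100, 0.3, 30

exact = binom.pmf(k, n, p)
sigma2 = n * p * (1 - p)
approx = np.exp(-(k - n * p) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

print(exact, approx)   # the two values should be close for large n
```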


RANDOM PROCESSES
In practical problems we deal with time-varying waveforms whose value at a time is random in nature. For example, the speech waveform, the signal received by a communication receiver, or the daily record of stock-market data represent random variables that change with time. How do we characterize such data? Such data are characterized as random or stochastic processes. This lecture covers the fundamentals of random processes.

Random processes

Recall that a random variable maps each sample point in the sample space to a point in the real line. A random process maps each sample point to a waveform.

Consider a probability space $(S, \mathcal{F}, P)$. A random process can be defined on $(S, \mathcal{F}, P)$ as an indexed family of random variables $\{X(s, t),\ s \in S,\ t \in \Gamma\}$, where $\Gamma$ is an index set which may be discrete or continuous, usually denoting time. Thus a random process is a function of the sample point $\xi$ and the index variable $t$ and may be written as $X(t, \xi)$.

Remark
• For a fixed $t (= t_0)$, $X(t_0, \xi)$ is a random variable.
• For a fixed $\xi (= \xi_0)$, $X(t, \xi_0)$ is a single realization of the random process and is a deterministic function.
• For a fixed $\xi (= \xi_0)$ and a fixed $t (= t_0)$, $X(t_0, \xi_0)$ is a single number.
• When both $t$ and $\xi$ are varying we have the random process $X(t, \xi)$.

The random process $\{X(s, t),\ s \in S,\ t \in \Gamma\}$ is normally denoted by $X(t)$. A random process is illustrated in the following figure.

[Figure: Random process. Each sample point $s_1, s_2, s_3 \in S$ maps to a waveform $X(t, s_1), X(t, s_2), X(t, s_3)$.]

Example Consider a sinusoidal signal $X(t) = A\cos\omega t$ where $A$ is a binary random variable with probability mass functions $p_A(1) = p$ and $p_A(-1) = 1 - p$.

Clearly, $\{X(t),\ t \in \Gamma\}$ is a random process with two possible realizations $X_1(t) = \cos\omega t$ and $X_2(t) = -\cos\omega t$. At a particular time $t_0$, $X(t_0)$ is a random variable with two values $\cos\omega t_0$ and $-\cos\omega t_0$.

Continuous-time vs. discrete-time process
If the index set $\Gamma$ is continuous, $\{X(t),\ t \in \Gamma\}$ is called a continuous-time process.

Example Suppose $X(t) = A\cos(w_0 t + \phi)$ where $A$ and $w_0$ are constants and $\phi$ is uniformly distributed between 0 and $2\pi$. $X(t)$ is an example of a continuous-time process. Four realizations of the process are illustrated below.

[Figure: Four realizations of $X(t) = A\cos(w_0 t + \phi)$, for phase values $\phi = 0.8373\pi$, $0.9320\pi$, $1.6924\pi$, and $1.8636\pi$.]

If the index set $\Gamma$ is a countable set, $\{X(t),\ t \in \Gamma\}$ is called a discrete-time process. Such a random process can be represented as $\{X[n],\ n \in \mathbb{Z}\}$ and is called a random sequence. Sometimes the notation $\{X_n,\ n \geq 0\}$ is used to describe a random sequence indexed by the set of positive integers.

We can define a discrete-time random process on discrete points of time. Particularly, we can get a discrete-time random process $\{X[n],\ n \in \mathbb{Z}\}$ by sampling a continuous-time process $\{X(t),\ t \in \Gamma\}$ at a uniform interval $T$ such that $X[n] = X(nT)$.

The discrete-time random process is more important in practical implementations. Advanced statistical signal processing techniques have been developed to process this type of signal.

Example Suppose $X_n = \sqrt{2}\cos(\omega_0 n + Y)$ where $\omega_0$ is a constant and $Y$ is a random variable uniformly distributed between $-\pi$ and $\pi$. $\{X_n\}$ is an example of a discrete-time process.

[Figure: Three realizations of the discrete-time process $X_n$, for phase values $0.4623\pi$, $1.9003\pi$, and $0.9720\pi$.]
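A minimal simulation sketch of such a random-phase process (made-up constants), generating a few realizations of $X_n = \sqrt{2}\cos(\omega_0 n + Y)$:

```python
import numpy as np

rng = np.random.default_rng(8)
omega0 = 0.2 * np.pi
n = np.arange(50)

# Each realization uses one draw of the random phase Y ~ Uniform(-pi, pi)
for _ in range(3):
    Y = rng.uniform(-np.pi, np.pi)
    Xn = np.sqrt(2) * np.cos(omega0 * n + Y)
    print(f"phase = {Y:+.3f},  first samples: {np.round(Xn[:4], 3)}")
```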

Continuous-state vs. discrete-state process:
The value of a random process $X(t)$ at any time $t$ can be described from its probabilistic model. The state is the value taken by $X(t)$ at a time $t$, and the set of all such states is called the state space. A random process is discrete-state if the state space is finite or countable. It also means that the corresponding sample space is also finite or countable. Otherwise, the random process is called continuous-state.

Example Consider the random sequence $\{X_n,\ n \geq 0\}$ generated by repeated tossing of a fair coin, where we assign 1 to Head and 0 to Tail. Clearly $X_n$ can take only two values, 0 and 1. Hence $\{X_n,\ n \geq 0\}$ is a discrete-time two-state process.

How to describe a random process?
As we have observed above, $X(t)$ at a specific time $t$ is a random variable and can be described by its probability distribution function $F_{X(t)}(x) = P(X(t) \leq x)$. This distribution function is called the first-order probability distribution function. We can similarly define the first-order probability density function

$$f_{X(t)}(x) = \frac{dF_{X(t)}(x)}{dx}.$$

To describe $\{X(t),\ t \in \Gamma\}$ we have to use the joint distribution function of the random variables at all possible values of $t$. For any positive integer $n$, $X(t_1), X(t_2), \ldots, X(t_n)$ represent $n$ jointly distributed random variables. Thus a random process $\{X(t),\ t \in \Gamma\}$ can be described by specifying the $n$-th order joint distribution function

$$F_{X(t_1), X(t_2), \ldots, X(t_n)}(x_1, x_2, \ldots, x_n) = P(X(t_1) \leq x_1, X(t_2) \leq x_2, \ldots, X(t_n) \leq x_n), \quad \forall n \geq 1 \text{ and } \forall t_n \in \Gamma$$

or the $n$-th order joint density function

$$f_{X(t_1), X(t_2), \ldots, X(t_n)}(x_1, x_2, \ldots, x_n) = \frac{\partial^n}{\partial x_1\, \partial x_2 \cdots \partial x_n}\, F_{X(t_1), X(t_2), \ldots, X(t_n)}(x_1, x_2, \ldots, x_n)$$

If $\{X(t),\ t \in \Gamma\}$ is a discrete-state random process, then it can also be specified by the collection of $n$-th order joint probability mass functions

$$p_{X(t_1), X(t_2), \ldots, X(t_n)}(x_1, x_2, \ldots, x_n) = P(X(t_1) = x_1, X(t_2) = x_2, \ldots, X(t_n) = x_n), \quad \forall n \geq 1 \text{ and } \forall t_n \in \Gamma$$

If the random process is continuous-state, it can be specified by the collection of joint probability density functions.

Moments of a random process
We defined the moments of a random variable and the joint moments of random variables. We can define all the possible moments and joint moments of a random process $\{X(t),\ t \in \Gamma\}$. Particularly, the following moments are important.

• $\mu_X(t)$ = mean of the random process at $t$ = $E(X(t))$
• $R_X(t_1, t_2)$ = autocorrelation function of the process at times $t_1, t_2$ = $E(X(t_1)X(t_2))$

Note that

$$R_X(t_1, t_2) = R_X(t_2, t_1)$$

and

$$R_X(t, t) = EX^2(t) = \text{second moment or mean-square value at time } t.$$

• The autocovariance function $C_X(t_1, t_2)$ of the random process at times $t_1$ and $t_2$ is defined by

$$C_X(t_1, t_2) = E(X(t_1) - \mu_X(t_1))(X(t_2) - \mu_X(t_2)) = R_X(t_1, t_2) - \mu_X(t_1)\mu_X(t_2)$$

$$C_X(t, t) = E(X(t) - \mu_X(t))^2 = \text{variance of the process at time } t.$$

These moments give partial information about the process.

The ratio

$$\rho_X(t_1, t_2) = \frac{C_X(t_1, t_2)}{\sqrt{C_X(t_1, t_1)\, C_X(t_2, t_2)}}$$

is called the correlation coefficient.

The autocorrelation function and the autocovariance function are widely used to characterize a class of random processes called the wide-sense stationary process.

We can also define higher-order moments, e.g.

$$R_X(t_1, t_2, t_3) = E(X(t_1)X(t_2)X(t_3)) = \text{triple correlation function at } t_1, t_2, t_3,\ \text{etc.}$$

The above definitions are easily extended to a random sequence $\{X_n,\ n \geq 0\}$.

Example
(a) Gaussian Random Process
For any positive integer $n$, $X(t_1), X(t_2), \ldots, X(t_n)$ represent $n$ jointly distributed random variables. These random variables define a random vector $\mathbf{X} = [X(t_1), X(t_2), \ldots, X(t_n)]'$. The process $X(t)$ is called Gaussian if the random vector $[X(t_1), X(t_2), \ldots, X(t_n)]'$ is jointly Gaussian with the joint density function given by

$$f_{X(t_1), X(t_2), \ldots, X(t_n)}(x_1, x_2, \ldots, x_n) = \frac{1}{(2\pi)^{n/2}\sqrt{\det(\mathbf{C}_\mathbf{X})}}\, e^{-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu}_\mathbf{X})'\mathbf{C}_\mathbf{X}^{-1}(\mathbf{x} - \boldsymbol{\mu}_\mathbf{X})}$$

where $\mathbf{C}_\mathbf{X} = E(\mathbf{X} - \boldsymbol{\mu}_\mathbf{X})(\mathbf{X} - \boldsymbol{\mu}_\mathbf{X})'$ and $\boldsymbol{\mu}_\mathbf{X} = E(\mathbf{X}) = [E X(t_1), E X(t_2), \ldots, E X(t_n)]'$.

The Gaussian random process is completely specified by the autocovariance matrix, and hence by the mean vector and the autocorrelation matrix $\mathbf{R}_\mathbf{X} = E\mathbf{X}\mathbf{X}'$.

(b) Bernoulli Random Process
A Bernoulli process is a discrete-time random process consisting of a sequence of independent and identically distributed Bernoulli random variables. Thus the discrete-time random process $\{X_n,\ n \geq 0\}$ is a Bernoulli process if

$$P\{X_n = 1\} = p \quad \text{and} \quad P\{X_n = 0\} = 1 - p$$

Example
Consider the random sequence $\{X_n,\ n \geq 0\}$ generated by repeated tossing of a fair coin, where we assign 1 to Head and 0 to Tail. Here $\{X_n,\ n \geq 0\}$ is a Bernoulli process where each random variable $X_n$ is a Bernoulli random variable with

$$p_X(1) = P\{X_n = 1\} = \frac{1}{2} \quad \text{and} \quad p_X(0) = P\{X_n = 0\} = \frac{1}{2}$$

(c) A sinusoid with a random phase
$X(t) = A\cos(w_0 t + \phi)$, where $A$ and $w_0$ are constants and $\phi$ is uniformly distributed between 0 and $2\pi$. Thus

$$f_{\Phi}(\phi) = \frac{1}{2\pi}, \quad 0 \leq \phi \leq 2\pi$$

$X(t)$ at a particular $t$ is a random variable, and it can be shown that

$$f_{X(t)}(x) = \begin{cases} \dfrac{1}{\pi\sqrt{A^2 - x^2}} & |x| < A \\ 0 & \text{otherwise} \end{cases}$$

The pdf is sketched in the figure below.

The mean and autocorrelation of $X(t)$:

$$\begin{aligned}
\mu_X(t) = EX(t) &= E\,A\cos(w_0 t + \phi) \\
&= A\int_{0}^{2\pi} \cos(w_0 t + \phi)\,\frac{1}{2\pi}\, d\phi \\
&= 0
\end{aligned}$$

$$\begin{aligned}
R_X(t_1, t_2) &= E\,A\cos(w_0 t_1 + \phi)\, A\cos(w_0 t_2 + \phi) \\
&= A^2 E\cos(w_0 t_1 + \phi)\cos(w_0 t_2 + \phi) \\
&= \frac{A^2}{2}\, E\left(\cos(w_0(t_1 - t_2)) + \cos(w_0(t_1 + t_2) + 2\phi)\right) \\
&= \frac{A^2}{2}\cos(w_0(t_1 - t_2)) + \frac{A^2}{2}\int_{0}^{2\pi} \cos(w_0(t_1 + t_2) + 2\phi)\,\frac{1}{2\pi}\, d\phi \\
&= \frac{A^2}{2}\cos(w_0(t_1 - t_2))
\end{aligned}$$
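A quick Monte Carlo sketch (made-up constants) checking the zero mean and the autocorrelation formula above:

```python
import numpy as np

rng = np.random.default_rng(9)
A, w0 = 2.0, 2 * np.pi * 5
t1, t2 = 0.13, 0.21
phi = rng.uniform(0, 2 * np.pi, size=1_000_000)   # one random phase per realization

X1 = A * np.cos(w0 * t1 + phi)
X2 = A * np.cos(w0 * t2 + phi)

print("mean ~", X1.mean())                                   # close to 0
print("R_X  ~", np.mean(X1 * X2))                            # empirical autocorrelation
print("A^2/2 cos(w0(t1-t2)) =", A**2 / 2 * np.cos(w0 * (t1 - t2)))
```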

Two or More Random Processes
In practical situations we deal with two or more random processes. We often deal with the input and output processes of a system. To describe two or more random processes we have to use the joint distribution functions and the joint moments.

Consider two random processes $\{X(t),\ t \in \Gamma\}$ and $\{Y(t),\ t \in \Gamma\}$. For any positive integers $n$ and $m$, $X(t_1), X(t_2), \ldots, X(t_n), Y(t_1'), Y(t_2'), \ldots, Y(t_m')$ represent $n + m$ jointly distributed random variables. Thus these two random processes can be described by the $(n+m)$-th order joint distribution function

$$F_{X(t_1), \ldots, X(t_n), Y(t_1'), \ldots, Y(t_m')}(x_1, \ldots, x_n, y_1, \ldots, y_m) = P(X(t_1) \leq x_1, \ldots, X(t_n) \leq x_n,\ Y(t_1') \leq y_1, \ldots, Y(t_m') \leq y_m)$$

or the corresponding $(n+m)$-th order joint density function

$$f_{X(t_1), \ldots, X(t_n), Y(t_1'), \ldots, Y(t_m')}(x_1, \ldots, x_n, y_1, \ldots, y_m) = \frac{\partial^{n+m}}{\partial x_1 \cdots \partial x_n\, \partial y_1 \cdots \partial y_m}\, F_{X(t_1), \ldots, X(t_n), Y(t_1'), \ldots, Y(t_m')}(x_1, \ldots, x_n, y_1, \ldots, y_m)$$

Two random processes can be partially described by the joint moments:

• Cross-correlation function of the processes at times $t_1, t_2$:
$$R_{XY}(t_1, t_2) = E(X(t_1)Y(t_2))$$

• Cross-covariance function of the processes at times $t_1, t_2$:
$$C_{XY}(t_1, t_2) = E(X(t_1) - \mu_X(t_1))(Y(t_2) - \mu_Y(t_2)) = R_{XY}(t_1, t_2) - \mu_X(t_1)\mu_Y(t_2)$$

• Cross-correlation coefficient:
$$\rho_{XY}(t_1, t_2) = \frac{C_{XY}(t_1, t_2)}{\sqrt{C_X(t_1, t_1)\, C_Y(t_2, t_2)}}$$

On the basis of the above definitions, we can study the degree of dependence between two random processes.

Independent processes: Two random processes $\{X(t),\ t \in \Gamma\}$ and $\{Y(t),\ t \in \Gamma\}$ are called independent if, for all choices of the time instants, the joint distribution function of $X(t_1), \ldots, X(t_n)$ and $Y(t_1'), \ldots, Y(t_m')$ factors into the product of the joint distribution function of $X(t_1), \ldots, X(t_n)$ and that of $Y(t_1'), \ldots, Y(t_m')$.

Uncorrelated processes: Two random processes $\{X(t),\ t \in \Gamma\}$ and $\{Y(t),\ t \in \Gamma\}$ are called uncorrelated if

$$C_{XY}(t_1, t_2) = 0 \quad \forall t_1, t_2 \in \Gamma$$

This also implies that for two such processes

$$R_{XY}(t_1, t_2) = \mu_X(t_1)\mu_Y(t_2).$$

Orthogonal processes: Two random processes $\{X(t),\ t \in \Gamma\}$ and $\{Y(t),\ t \in \Gamma\}$ are called orthogonal if

$$R_{XY}(t_1, t_2) = 0 \quad \forall t_1, t_2 \in \Gamma$$

Example Suppose $X(t) = A\cos(w_0 t + \phi)$ and $Y(t) = A\sin(w_0 t + \phi)$, where $A$ and $w_0$ are constants and $\phi$ is uniformly distributed between 0 and $2\pi$.


Important Classes of Random Processes
Having characterized the random process by the joint distribution (density) functions and joint moments, we define the following two important classes of random processes.

(a) Independent and Identically Distributed Process

Consider a discrete-time random process $\{X_n\}$. For any finite choice of time instants $n_1, n_2, \ldots, n_N$, if the random variables $X_{n_1}, X_{n_2}, \ldots, X_{n_N}$ are jointly independent with a common distribution, then $\{X_n\}$ is called an independent and identically distributed (iid) random process. Thus for an iid random process $\{X_n\}$,

$$F_{X_{n_1}, X_{n_2}, \ldots, X_{n_N}}(x_1, x_2, \ldots, x_N) = F_X(x_1)\, F_X(x_2) \cdots F_X(x_N)$$

and equivalently

$$p_{X_{n_1}, X_{n_2}, \ldots, X_{n_N}}(x_1, x_2, \ldots, x_N) = p_X(x_1)\, p_X(x_2) \cdots p_X(x_N)$$

Moments of the iid process: It is easy to verify that for an iid process $\{X_n\}$

• Mean: $EX_n = \mu_X$ = constant
• Variance: $E(X_n - \mu_X)^2 = \sigma_X^2$ = constant
• Autocovariance:
$$C_X(n, m) = E(X_n - \mu_X)(X_m - \mu_X) = \begin{cases} 0 & \text{for } n \neq m \\ \sigma_X^2 & \text{otherwise} \end{cases} = \sigma_X^2\,\delta[n, m]$$
where $\delta[n, m] = 1$ for $n = m$ and 0 otherwise.
• Autocorrelation: $R_X(n, m) = C_X(n, m) + \mu_X^2 = \sigma_X^2\,\delta[n, m] + \mu_X^2$

Example Bernoulli process: Consider the Bernoulli process $\{X_n\}$ with

$$p_X(1) = p \quad \text{and} \quad p_X(0) = 1 - p$$

This process is an iid process. Using the iid property, we can obtain the joint probability mass functions of any order in terms of $p$. For example,

$$p_{X_1, X_2}(1, 0) = p(1 - p)$$
$$p_{X_1, X_2, X_3}(0, 0, 1) = (1 - p)^2\, p$$

and so on. Similarly, the mean, the variance and the autocorrelation function are given by

$$\mu_{X_n} = EX_n = p$$

$$\operatorname{var}(X_n) = p(1 - p)$$

$$R_X(n_1, n_2) = EX_{n_1}X_{n_2} = EX_{n_1}\, EX_{n_2} = p^2 \quad (n_1 \neq n_2)$$

(b) Independent Increment Process

A random process $\{X(t)\}$ is called an independent increment process if for any $n > 1$ and $t_1 < t_2 < \cdots < t_n \in \Gamma$, the set of $n$ random variables

$$X(t_1),\ X(t_2) - X(t_1),\ \ldots,\ X(t_n) - X(t_{n-1})$$

are jointly independent random variables.

If the probability distribution of $X(t + r) - X(t' + r)$ is the same as that of $X(t) - X(t')$ for any choice of $t$, $t'$ and $r$, $\{X(t)\}$ is called a stationary increment process.

• The above definitions of the independent increment process and the stationary increment process can be easily extended to discrete-time random processes.

• The independent increment property simplifies the calculation of the joint probability distribution, density and mass functions from the corresponding first-order quantities. As an example, for $t_1 < t_2$, $x_1 < x_2$,

$$\begin{aligned}
F_{X(t_1), X(t_2)}(x_1, x_2) &= P(X(t_1) \leq x_1, X(t_2) \leq x_2) \\
&= P(X(t_1) \leq x_1)\, P(X(t_2) \leq x_2 / X(t_1) \leq x_1) \\
&= P(X(t_1) \leq x_1)\, P(X(t_2) - X(t_1) \leq x_2 - x_1) \\
&= F_{X(t_1)}(x_1)\, F_{X(t_2) - X(t_1)}(x_2 - x_1)
\end{aligned}$$

• The independent increment property simplifies the computation of the autocovariance function.

For $t_1 < t_2$, the autocorrelation function of $X(t)$ is given by

$$\begin{aligned}
R_X(t_1, t_2) &= EX(t_1)X(t_2) \\
&= EX(t_1)\left(X(t_1) + X(t_2) - X(t_1)\right) \\
&= EX^2(t_1) + EX(t_1)\, E\left(X(t_2) - X(t_1)\right) \\
&= EX^2(t_1) + EX(t_1)\left(EX(t_2) - EX(t_1)\right) \\
&= \operatorname{var}(X(t_1)) + EX(t_1)\, EX(t_2)
\end{aligned}$$

$$\therefore C_X(t_1, t_2) = EX(t_1)X(t_2) - EX(t_1)EX(t_2) = \operatorname{var}(X(t_1))$$

Similarly, for $t_1 > t_2$, $C_X(t_1, t_2) = \operatorname{var}(X(t_2))$. Therefore

$$C_X(t_1, t_2) = \operatorname{var}(X(\min(t_1, t_2)))$$


Example: Two continuous-time independent increment processes are widely studied. They are:

(a) Wiener process with the increments following Gaussian distribution and (b) Poisson process with the increments following Poisson distribution. We shall

discuss these processes shortly.

Random Walk process
Consider an iid process $\{Z_n\}$ having two states $Z_n = 1$ and $Z_n = -1$ with the probability mass functions

$$p_Z(1) = p \quad \text{and} \quad p_Z(-1) = q = 1 - p.$$

Then the sum process $\{X_n\}$ given by

$$X_n = \sum_{i=1}^{n} Z_i = X_{n-1} + Z_n$$

with $X_0 = 0$ is called a Random Walk process.

• This process is one of the widely studied random processes.
• It is an independent increment process. This follows from the fact that $X_n - X_{n-1} = Z_n$ and $\{Z_n\}$ is an iid process.
• If we call $Z_n = 1$ a success and $Z_n = -1$ a failure, then $X_n = \sum_{i=1}^{n} Z_i$ represents the difference between the number of successes and the number of failures in $n$ independent trials.
• If $p = \frac{1}{2}$, $\{X_n\}$ is called a symmetrical random walk process.

Probability mass function of the Random Walk Process

At an instant $n$, $X_n$ can take integer values from $-n$ to $n$. Suppose $X_n = k$. Clearly $k = n_1 - n_{-1}$, where $n_1$ = number of successes and $n_{-1}$ = number of failures in $n$ trials of $Z_n$, such that $n_1 + n_{-1} = n$.

$$\therefore n_1 = \frac{n + k}{2} \quad \text{and} \quad n_{-1} = \frac{n - k}{2}$$

Also, $n_1$ and $n_{-1}$ are necessarily non-negative integers.

$$\therefore p_{X_n}(k) = \begin{cases} \dbinom{n}{\frac{n+k}{2}}\, p^{\frac{n+k}{2}} (1 - p)^{\frac{n-k}{2}} & \text{if } \frac{n+k}{2} \text{ and } \frac{n-k}{2} \text{ are non-negative integers} \\ 0 & \text{otherwise} \end{cases}$$
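A small simulation sketch (made-up values $n = 10$, $p = 0.6$) comparing the empirical distribution of $X_n$ with the pmf above:

```python
import numpy as np
from math import comb

rng = np.random.default_rng(10)
n, p, trials = 10, 0.6, 200_000

Z = np.where(rng.random((trials, n)) < p, 1, -1)   # iid steps +1/-1
Xn = Z.sum(axis=1)                                 # random walk value after n steps

for k in (-2, 0, 4):                               # even k only, since n is even
    pmf = comb(n, (n + k) // 2) * p ** ((n + k) / 2) * (1 - p) ** ((n - k) / 2)
    print(k, np.mean(Xn == k), pmf)                # empirical vs theoretical
```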

Mean, Variance and Covariance of a Random Walk process

Note that

$$EZ_n = 1 \times p + (-1) \times (1 - p) = 2p - 1$$

$$EZ_n^2 = 1 \times p + 1 \times (1 - p) = 1$$

and

$$\operatorname{var}(Z_n) = EZ_n^2 - (EZ_n)^2 = 1 - 4p^2 + 4p - 1 = 4p(1 - p) = 4pq$$

$$\therefore EX_n = \sum_{i=1}^{n} EZ_i = n(2p - 1)$$

and

$$\operatorname{var}(X_n) = \sum_{i=1}^{n} \operatorname{var}(Z_i) = 4npq \qquad (\because Z_i\text{'s are independent random variables})$$

Since the random walk process $\{X_n\}$ is an independent increment process, the autocovariance function is given by

$$C_X(n_1, n_2) = 4pq\, \min(n_1, n_2)$$

Three realizations of a random walk process are shown in the figure below.

Remark
If the increment $Z_n$ of the random walk process takes the values $s$ and $-s$, then

$$EX_n = \sum_{i=1}^{n} EZ_i = n(2p - 1)s$$

and

$$\operatorname{var}(X_n) = \sum_{i=1}^{n} \operatorname{var}(Z_i) = 4npqs^2$$

(c) Markov process

A process $\{X(t)\}$ is called a Markov process if for any sequence of times $t_1 < t_2 < \cdots < t_n$,

$$P(X(t_n) \leq x_n \mid X(t_1) = x_1, X(t_2) = x_2, \ldots, X(t_{n-1}) = x_{n-1}) = P(X(t_n) \leq x_n \mid X(t_{n-1}) = x_{n-1})$$

• Thus for a Markov process, "the future of the process, given the present, is independent of the past."
• A discrete-state Markov process is called a Markov Chain. If $\{X_n\}$ is a discrete-time discrete-state random process, the process is Markov if

$$P(X_n = x_n \mid X_0 = x_0, X_1 = x_1, \ldots, X_{n-1} = x_{n-1}) = P(X_n = x_n \mid X_{n-1} = x_{n-1})$$

• An iid random process is a Markov process.
• Many practical signals with strong correlation between neighbouring samples are modelled as Markov processes.

Example Show that the random walk process $\{X_n\}$ is Markov.

Here,

$$\begin{aligned}
P(X_n = x_n \mid X_0 = 0, X_1 = x_1, \ldots, X_{n-1} = x_{n-1}) &= P(X_{n-1} + Z_n = x_n \mid X_0 = 0, X_1 = x_1, \ldots, X_{n-1} = x_{n-1}) \\
&= P(Z_n = x_n - x_{n-1}) \\
&= P(X_n = x_n \mid X_{n-1} = x_{n-1})
\end{aligned}$$

Wiener Process
Consider a symmetrical random walk process $\{X_n\}$ given by $X_n = X(n\Delta)$, where the discrete instants on the time axis are separated by $\Delta$, so that $t = n\Delta$. Assume $\Delta$ to be infinitesimally small.

Clearly,

$$EX_n = 0$$

$$\operatorname{var}(X_n) = 4pqns^2 = 4 \times \frac{1}{2} \times \frac{1}{2} \times ns^2 = ns^2$$

For large $n$, the distribution of $X_n$ approaches the normal with mean 0 and variance

$$ns^2 = \frac{s^2}{\Delta}\, t = \alpha t$$

As $\Delta \to 0$ and $n \to \infty$, $X_n$ becomes the continuous-time process $X(t)$ with the pdf

$$f_{X(t)}(x) = \frac{1}{\sqrt{2\pi\alpha t}}\, e^{-\frac{x^2}{2\alpha t}}.$$

This process $X(t)$ is called the Wiener process.

A random process $X(t)$ is called a Wiener process or the Brownian motion process if it satisfies the following conditions:

(1) $X(0) = 0$
(2) $X(t)$ is an independent increment process.
(3) For each $s \geq 0$, $t \geq 0$, $X(s + t) - X(s)$ has the normal distribution with mean 0 and variance $\alpha t$:

$$f_{X(s+t) - X(s)}(x) = \frac{1}{\sqrt{2\pi\alpha t}}\, e^{-\frac{x^2}{2\alpha t}}$$

• The Wiener process was used to model Brownian motion: microscopic particles suspended in a fluid are subject to continuous molecular impacts, resulting in the zigzag motion of the particle, named Brownian motion after the British botanist Brown.
• The Wiener process is the integration of the white noise process. A realization of the Wiener process is shown in the figure below.
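A minimal simulation sketch of Wiener process sample paths, generated as cumulative sums of independent Gaussian increments (made-up parameters):

```python
import numpy as np

rng = np.random.default_rng(11)
alpha, T, steps, paths_count = 1.0, 1.0, 500, 2_000
dt = T / steps

# Each increment X(t+dt) - X(t) ~ N(0, alpha*dt); a path is their cumulative sum
increments = rng.normal(0.0, np.sqrt(alpha * dt), size=(paths_count, steps))
paths = np.cumsum(increments, axis=1)             # realizations of X(t), with X(0) = 0

print("sample variance at t = T:", paths[:, -1].var(), "(theory:", alpha * T, ")")
```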

Assuming $t_2 > t_1$,

$$\begin{aligned}
R_X(t_1, t_2) &= EX(t_1)X(t_2) \\
&= EX(t_1)\left(X(t_2) - X(t_1) + X(t_1)\right) \\
&= EX(t_1)\, E\left(X(t_2) - X(t_1)\right) + EX^2(t_1) \\
&= EX^2(t_1) \\
&= \alpha t_1
\end{aligned}$$

Similarly, if $t_1 > t_2$, $R_X(t_1, t_2) = \alpha t_2$.

$$\therefore R_X(t_1, t_2) = \alpha \min(t_1, t_2)$$

Remark

$$C_X(t_1, t_2) = \alpha \min(t_1, t_2)$$

$X(t)$ is a Gaussian process.

Poisson Process
Consider a random process representing the number of occurrences of an event up to time $t$ (over the time interval $(0, t]$). Such a process is called a counting process, and we shall denote it by $\{N(t),\ t \geq 0\}$. Clearly $\{N(t),\ t \geq 0\}$ is a continuous-time discrete-state process and any of its realizations is a non-decreasing function of time.

The counting process $\{N(t),\ t \geq 0\}$ is called a Poisson process with the rate parameter $\lambda$ if

(i) $N(0) = 0$
(ii) $N(t)$ is an independent increment process. Thus the increments $N(t_2) - N(t_1)$ and $N(t_4) - N(t_3)$, etc. are independent.
(iii) $P(N(\Delta t) = 1) = \lambda\Delta t + 0(\Delta t)$, where $0(\Delta t)$ denotes any function such that $\lim_{\Delta t \to 0} \frac{0(\Delta t)}{\Delta t} = 0$.
(iv) $P(N(\Delta t) \geq 2) = 0(\Delta t)$

The assumptions are valid for many applications. Some typical examples are:
• Number of alpha particles emitted by a radioactive substance.
• Number of binary packets received at a switching node of a communication network.
• Number of cars arriving at a petrol pump during a particular interval of time.

• Number of alpha particles emitted by a radio active substance. • Number of binary packets received at switching node of a communication

network. • Number of cars arriving at a petrol pump during a particular interval of time.

( ) ( ) ( ) ( ) ( )

( ) = Probability of occurrence of events up to time

( ) , ( ) 0 ( ) 1, ( ) 1 ( ) 1, ( ) 2

( ) (1 0( ))

P N t t n n t t

P N t n N t P N t n n t P N t n n t

P N t n t tλ

+ ∆ = + ∆

= = ∆ = + = − ∆ = + < − ∆ ≥

= = − ∆ − ∆ ( ) ( )( ) 1 ( 0( )) ( ) ( 1) (0

(P N t n t t P N t nλ+ = − ∆ + ∆ + = −

( ) ( ) ( ) ( ) ( ) ( )

( ) , ( ) 0 ( ) 1, ( ) 1 ( ) 1, ( ) 2

( ) (1 0( )) ( ) 1 ( 0( )) ( ) ( 1) (0( ))

P N t n N t P N t n n t P N t n n t

P N t n t t P N t n t t P N t n tλ λ

= ∆ = + = − ∆ = + < − ∆ ≥

= = − ∆ − ∆ + = − ∆ + ∆ + = − ∆

Page 221: 35229433 Random Variale Random Process

( ) ( ) ( ) ( )

( ) ( ) ( )0

( ) ( )lim ( ) ( ) 1

( ) ( ) ( ) 1 (1)

t

P N t t n P N t nP N t n P N t n

td P N t n P N t n P N t ndt

λ

λ

∆ →

+ ∆ = − == − = − = −⎡ ⎤⎣ ⎦∆

⎡ ⎤∴ = = = − = −⎣ ⎦

The above is a first-order linear differential equation with initial condition

( (0) ) 0P N n= = . . This differential equation can be solved recursively. First consider the problem to find ( )( ) 0P N t = From (1)

( ) ( )

( ) 0 ( ) 0

( ) 0 t

d P N t P N tdt

P N t e λ

λ

= = =

⇒ = =

Next to find ( ( ) 1)P N t =

( ) ( ) ( )

( ( ) 1 ( ) 1 ( ) 0

( ) 1 t

d P N t P N t P N tdt

P N t e λ

λ λ

λ λ −

= = − = − =

= − = −

with initial condition ( )(0) 1 0.P N = = Solving the above first-order linear differential equation we get ( ) 1 tP N t te λλ −= = Now by mathematical indication we can show that

( ) ( )( )!

n tt eP N t nn

λλ −

= =

Remark (1) The parameter λ is called the rate or intensity of the Poisson process.

It can be shown that
$$P\bigl(N(t_2)-N(t_1)=n\bigr) = \frac{\bigl(\lambda(t_2-t_1)\bigr)^n e^{-\lambda(t_2-t_1)}}{n!}$$
Thus the probability of the increments depends on the length of the interval $t_2-t_1$ and not on the absolute times $t_1$ and $t_2$. Thus the Poisson process is a process with stationary increments.

(2) The independent and stationary increment properties help us to compute the joint probability mass function of $N(t)$. For example,
$$
\begin{aligned}
P(N(t_1)=n_1,\ N(t_2)=n_2) &= P(N(t_1)=n_1)\,P(N(t_2)=n_2\mid N(t_1)=n_1)\\
&= P(N(t_1)=n_1)\,P(N(t_2)-N(t_1)=n_2-n_1)\\
&= \frac{(\lambda t_1)^{n_1}e^{-\lambda t_1}}{n_1!}\cdot\frac{\bigl(\lambda(t_2-t_1)\bigr)^{n_2-n_1}e^{-\lambda(t_2-t_1)}}{(n_2-n_1)!}
\end{aligned}
$$

Mean, Variance and Covariance of the Poisson process

We observe that at any time $t>0$, $N(t)$ is a Poisson random variable with the parameter $\lambda t$. Therefore
$$E\,N(t) = \lambda t\qquad\text{and}\qquad\operatorname{var}N(t) = \lambda t$$
Thus both the mean and the variance of a Poisson process vary linearly with time. As $N(t)$ is a random process with independent increments, we can readily show that
$$C_N(t_1,t_2) = \operatorname{var}\bigl(N(\min(t_1,t_2))\bigr) = \lambda\min(t_1,t_2)$$
$$\therefore\ R_N(t_1,t_2) = C_N(t_1,t_2) + E\,N(t_1)\,E\,N(t_2) = \lambda\min(t_1,t_2) + \lambda^2 t_1 t_2$$

A typical realization of a Poisson process is shown below:
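A realization can also be generated numerically. The sketch below (Python/NumPy, with an assumed rate of 2 events per unit time) builds $N(t)$ by accumulating exponential inter-arrival times, anticipating the result derived later in this section.

```python
import numpy as np

# Sketch: one realization of a Poisson process of rate lam over (0, T].
# lam and T are illustrative values, not from the notes.
rng = np.random.default_rng(1)
lam, T = 2.0, 10.0

inter_arrivals = rng.exponential(1.0 / lam, size=int(5 * lam * T))
arrivals = np.cumsum(inter_arrivals)          # arrival instants
arrivals = arrivals[arrivals <= T]

def N(t, arrivals=arrivals):
    """Counting process: number of arrivals in (0, t]."""
    return np.searchsorted(arrivals, t, side="right")

print("number of events in (0, T]:", N(T))    # on average close to lam*T = 20
```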

Example A petrol pump serves on the average 30 cars per hour. Find the probability that during a period of 5 minutes (i) no car comes to the station, (ii) exactly 3 cars come to the station and (iii) more than 3 cars come to the station.

Average arrival rate $\lambda$ = 30 cars/hr = $\tfrac12$ car/min, so over 5 minutes $\lambda t = 2.5$.

(i) Probability of no car in 5 minutes: $P(N(5)=0) = e^{-\frac12\times5} = e^{-2.5} = 0.0821$

(ii) $P(N(5)=3) = \dfrac{(2.5)^3}{3!}\,e^{-2.5} \approx 0.214$

(iii) $P(N(5)>3) = 1 - P(N(5)\le3) = 1 - e^{-2.5}\left(1 + 2.5 + \dfrac{2.5^2}{2!} + \dfrac{2.5^3}{3!}\right) \approx 1 - 0.758 = 0.242$

For comparison, a binomial model with $p$ = probability of a car arriving in 1 minute = $\tfrac12$ and $n=5$ gives
$$P(X=0) = (1-p)^5 = 0.5^5 = 0.03125$$

Inter-arrival time and Waiting time of a Poisson Process

Let $T_n$ = the time elapsed between the $(n-1)$st event and the $n$th event. The random sequence $\{T_n,\ n=1,2,\dots\}$ represents the inter-arrival times of the Poisson process; $T_1$ is the time elapsed before the first event takes place. Clearly $T_1$ is a continuous random variable.

Let us find the probability $P(T_1>t)$:
$$P(T_1>t) = P(\text{0 events up to time }t) = e^{-\lambda t}$$
$$\therefore\ F_{T_1}(t) = 1 - P(T_1>t) = 1 - e^{-\lambda t}$$

[Fig.: arrival instants $t_1, t_2,\dots,t_{n-1}, t_n$ on the time axis, with the inter-arrival times $T_1, T_2,\dots,T_{n-1}, T_n$ between them]

$$\therefore\ f_{T_1}(t) = \lambda e^{-\lambda t},\qquad t\ge0$$
Similarly,
$$
\begin{aligned}
P(T_n>t) &= P\bigl(\text{0 events occur in }(t_{n-1},t_{n-1}+t]\ \big|\ (n-1)\text{th event occurs at }t_{n-1}\bigr)\\
&= P\bigl(\text{0 events occur in }(t_{n-1},t_{n-1}+t]\bigr) = e^{-\lambda t}
\end{aligned}
$$
$$\therefore\ f_{T_n}(t) = \lambda e^{-\lambda t},\qquad t\ge0$$
Thus the inter-arrival times of a Poisson process with the parameter $\lambda$ are exponentially distributed, with
$$f_{T_n}(t) = \lambda e^{-\lambda t},\qquad t\ge0,\quad n = 1,2,\dots$$

Remark

• We have seen that the inter-arrival times are identically distributed with the exponential pdf. Further, we can easily prove that the inter-arrival times are independent. For example,
$$F_{T_1,T_2}(t_1,t_2) = P(T_1\le t_1,\ T_2\le t_2) = P(T_1\le t_1)\,P(T_2\le t_2\mid T_1\le t_1)$$
Given $T_1$, the event $\{T_2\le t_2\}$ depends only on the occurrences in $(T_1, T_1+t_2]$, which by the independent increment property are independent of the occurrences in $(0,T_1]$. Hence
$$F_{T_1,T_2}(t_1,t_2) = P(T_1\le t_1)\,P(T_2\le t_2) = F_{T_1}(t_1)\,F_{T_2}(t_2)$$

• It is interesting to note that the converse of the above result is also true: if the inter-arrival times between the events of a discrete-state process $\{N(t),\ t\ge0\}$ are independent and exponentially distributed with mean $\frac{1}{\lambda}$, then $\{N(t),\ t\ge0\}$ is a Poisson process with the parameter $\lambda$.

• The exponential distribution of the inter-arrival times indicates that the arrival process has no memory. Thus
$$P(T_n>t_0+t_1\mid T_n>t_0) = P(T_n>t_1)\qquad\forall\,t_0,t_1$$

Another important quantity is the waiting time $W_n$. This is the time that elapses before the $n$th event occurs. Thus
$$W_n = \sum_{i=1}^{n}T_i$$
How to find the first-order pdf of $W_n$ is left as an exercise. Note that $W_n$ is the sum of $n$ independent and identically distributed random variables, and
$$E\,W_n = \sum_{i=1}^{n}E\,T_i = \frac{n}{\lambda}\qquad\text{and}\qquad\operatorname{var}(W_n) = \sum_{i=1}^{n}\operatorname{var}(T_i) = \frac{n}{\lambda^2}$$
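These moments are easy to confirm by simulation. The following sketch (Python/NumPy; $\lambda$, $n$ and the number of trials are assumed illustrative values) estimates $E\,W_n$ and $\operatorname{var}(W_n)$ from simulated exponential inter-arrival times.

```python
import numpy as np

# Sketch: verify E W_n = n/lam and var W_n = n/lam^2 for W_n = T_1 + ... + T_n.
rng = np.random.default_rng(2)
lam, n, trials = 10.0, 10, 100_000

T = rng.exponential(1.0 / lam, size=(trials, n))   # iid Exp(lam) inter-arrival times
W_n = T.sum(axis=1)                                # waiting time before the n-th event

print("estimated E W_n  :", W_n.mean(), "  theory:", n / lam)       # 1.0 here
print("estimated var W_n:", W_n.var(),  "  theory:", n / lam**2)    # 0.1 here
```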

Example The number of customers arriving at a service station is a Poisson process with a rate 10 customers per minute.

(a) What is the mean arrival time of the customers? (b) What is the probability that the second customer will arrive 5 minutes after the

first customer has arrived? (c) What is the average waiting time before the 10th customer arrives?

Semi-random Telegraph signal

Consider a two-state random process $X(t)$ with the states $X(t)=1$ and $X(t)=-1$. Suppose $X(t)=1$ if the number of events $N(t)$ of the Poisson process in the interval $(0,t]$ is even, and $X(t)=-1$ if the number of events in $(0,t]$ is odd. Such a process $X(t)$ is called the semi-random telegraph signal because the initial value $X(0)$ is always 1.
$$
\begin{aligned}
p_{X(t)}(1) &= P(X(t)=1) = P(N(t)=0) + P(N(t)=2) + \dots\\
&= e^{-\lambda t}\left(1 + \frac{(\lambda t)^2}{2!} + \dots\right) = e^{-\lambda t}\cosh\lambda t
\end{aligned}
$$
and
$$
\begin{aligned}
p_{X(t)}(-1) &= P(X(t)=-1) = P(N(t)=1) + P(N(t)=3) + \dots\\
&= e^{-\lambda t}\left(\frac{\lambda t}{1!} + \frac{(\lambda t)^3}{3!} + \dots\right) = e^{-\lambda t}\sinh\lambda t
\end{aligned}
$$
We can also find the conditional and joint probability mass functions. For example, for $t_1 < t_2$,

$$
\begin{aligned}
p_{X(t_1),X(t_2)}(1,1) &= P(X(t_1)=1)\,P(X(t_2)=1\mid X(t_1)=1)\\
&= e^{-\lambda t_1}\cosh\lambda t_1\ P\bigl(N(t_2)\text{ is even}\mid N(t_1)\text{ is even}\bigr)\\
&= e^{-\lambda t_1}\cosh\lambda t_1\ P\bigl(N(t_2)-N(t_1)\text{ is even}\bigr)\\
&= e^{-\lambda t_1}\cosh\lambda t_1\ e^{-\lambda(t_2-t_1)}\cosh\lambda(t_2-t_1)\\
&= e^{-\lambda t_2}\cosh\lambda t_1\,\cosh\lambda(t_2-t_1)
\end{aligned}
$$
Similarly,
$$p_{X(t_1),X(t_2)}(1,-1) = e^{-\lambda t_2}\cosh\lambda t_1\,\sinh\lambda(t_2-t_1)$$
$$p_{X(t_1),X(t_2)}(-1,1) = e^{-\lambda t_2}\sinh\lambda t_1\,\sinh\lambda(t_2-t_1)$$
$$p_{X(t_1),X(t_2)}(-1,-1) = e^{-\lambda t_2}\sinh\lambda t_1\,\cosh\lambda(t_2-t_1)$$

Mean, autocorrelation and autocovariance function of $X(t)$
$$
\begin{aligned}
E\,X(t) &= 1\times e^{-\lambda t}\cosh\lambda t - 1\times e^{-\lambda t}\sinh\lambda t = e^{-\lambda t}(\cosh\lambda t - \sinh\lambda t) = e^{-2\lambda t}\\
E\,X^2(t) &= 1\times e^{-\lambda t}\cosh\lambda t + 1\times e^{-\lambda t}\sinh\lambda t = e^{-\lambda t}(\cosh\lambda t + \sinh\lambda t) = 1\\
\therefore\ \operatorname{var}(X(t)) &= 1 - e^{-4\lambda t}
\end{aligned}
$$
For $t_1 < t_2$,
$$
\begin{aligned}
R_X(t_1,t_2) &= E\,X(t_1)X(t_2)\\
&= p_{X(t_1),X(t_2)}(1,1) - p_{X(t_1),X(t_2)}(1,-1) - p_{X(t_1),X(t_2)}(-1,1) + p_{X(t_1),X(t_2)}(-1,-1)\\
&= e^{-\lambda t_2}\Bigl[\cosh\lambda t_1\bigl(\cosh\lambda(t_2-t_1)-\sinh\lambda(t_2-t_1)\bigr) + \sinh\lambda t_1\bigl(\cosh\lambda(t_2-t_1)-\sinh\lambda(t_2-t_1)\bigr)\Bigr]\\
&= e^{-\lambda t_2}\,(\cosh\lambda t_1 + \sinh\lambda t_1)\,e^{-\lambda(t_2-t_1)}\\
&= e^{-\lambda t_2}\,e^{\lambda t_1}\,e^{-\lambda(t_2-t_1)} = e^{-2\lambda(t_2-t_1)}
\end{aligned}
$$
Similarly, for $t_1 > t_2$, $R_X(t_1,t_2) = e^{-2\lambda(t_1-t_2)}$.
$$\therefore\ R_X(t_1,t_2) = e^{-2\lambda|t_2-t_1|}$$

Random Telegraph signal

Consider a two-state random process $Y(t)$ with the states $Y(t)=1$ and $Y(t)=-1$. Suppose $P(Y(0)=1)=\tfrac12$ and $P(Y(0)=-1)=\tfrac12$, and $Y(t)$ changes polarity with equal probability with each occurrence of an event in a Poisson process of parameter $\lambda$. Such a random process $Y(t)$ is called a random telegraph signal and can be expressed as
$$Y(t) = A\,X(t)$$
where $X(t)$ is the semi-random telegraph signal and $A$ is a random variable independent of $X(t)$ with $P(A=1)=\tfrac12$ and $P(A=-1)=\tfrac12$.

Clearly,
$$E\,A = -1\times\tfrac12 + 1\times\tfrac12 = 0\qquad\text{and}\qquad E\,A^2 = (-1)^2\times\tfrac12 + 1^2\times\tfrac12 = 1$$
Therefore,
$$E\,Y(t) = E\,A X(t) = E\,A\;E\,X(t) = 0\qquad(A\text{ and }X(t)\text{ are independent})$$
$$
\begin{aligned}
R_Y(t_1,t_2) &= E\,A X(t_1)\,A X(t_2)\\
&= E\,A^2\;E\,X(t_1)X(t_2)\\
&= e^{-2\lambda|t_2-t_1|}
\end{aligned}
$$
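A quick numerical check of $R_Y(t_1,t_2)=e^{-2\lambda|t_2-t_1|}$ is sketched below (Python/NumPy; the rate, time instants and number of trials are assumed values). It generates many realizations of the random telegraph signal at two fixed instants and estimates their correlation.

```python
import numpy as np

# Sketch: Monte-Carlo estimate of E Y(t1)Y(t2) for the random telegraph signal.
rng = np.random.default_rng(3)
lam, t1, t2, trials = 1.5, 0.4, 1.0, 200_000   # requires t2 > t1

# Independent Poisson increments: N(t1) and N(t2)-N(t1) are independent.
N_t1 = rng.poisson(lam * t1, size=trials)
N_t2 = N_t1 + rng.poisson(lam * (t2 - t1), size=trials)
A = rng.choice([-1.0, 1.0], size=trials)       # random initial polarity, P = 1/2 each

Y_t1 = A * (-1.0) ** N_t1                      # polarity flips at every event
Y_t2 = A * (-1.0) ** N_t2

print("estimated R_Y(t1,t2):", np.mean(Y_t1 * Y_t2))
print("theory exp(-2*lam*|t2-t1|):", np.exp(-2 * lam * abs(t2 - t1)))
```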


Stationary Random Process The concept of stationarity plays an important role in solving practical problems

involving random processes. Just like time-invariance is an important characteristics of

many deterministic systems, stationarity describes certain time-invariant property of a

random process. Stationarity also leads to frequency-domain description of a random

process.

Strict-sense Stationary Process

A random process ( )X t is called strict-sense stationary (SSS) if its probability

structure is invariant with time. In terms of the joint distribution function, ( )X t is

called SSS if

$$F_{X(t_1),X(t_2),\dots,X(t_n)}(x_1,x_2,\dots,x_n) = F_{X(t_1+t_0),X(t_2+t_0),\dots,X(t_n+t_0)}(x_1,x_2,\dots,x_n)$$
$\forall n\in\mathbb{N}$, $\forall t_0\in\Gamma$ and for all choices of the sample points $t_1,t_2,\dots,t_n\in\Gamma$.

Thus the joint distribution functions of any set of random variables $X(t_1),X(t_2),\dots,X(t_n)$ do not depend on the placement of the origin of the time axis. This is a very strict requirement. Less strict forms of stationarity may be defined.

Particularly, if
$$F_{X(t_1),\dots,X(t_n)}(x_1,\dots,x_n) = F_{X(t_1+t_0),\dots,X(t_n+t_0)}(x_1,\dots,x_n)\qquad\text{for }n=1,2,\dots,k,$$
then $X(t)$ is called $k$th order stationary.

• If $X(t)$ is stationary up to order 1,
$$F_{X(t_1)}(x_1) = F_{X(t_1+t_0)}(x_1)\qquad\forall t_0\in\Gamma$$
Let us assume $t_0=-t_1$. Then
$$F_{X(t_1)}(x_1) = F_{X(0)}(x_1)$$
which is independent of time. As a consequence,
$$E\,X(t_1) = E\,X(0) = \mu_X(0) = \text{constant}$$

• If $X(t)$ is stationary up to order 2,

$$F_{X(t_1),X(t_2)}(x_1,x_2) = F_{X(t_1+t_0),X(t_2+t_0)}(x_1,x_2)$$
Put $t_0=-t_2$:
$$F_{X(t_1),X(t_2)}(x_1,x_2) = F_{X(t_1-t_2),X(0)}(x_1,x_2)$$
This implies that the second-order distribution depends only on the time lag $t_1-t_2$. As a consequence, for such a process,
$$
\begin{aligned}
R_X(t_1,t_2) &= E\bigl(X(t_1)X(t_2)\bigr)\\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}x_1x_2\,f_{X(0),X(t_1-t_2)}(x_1,x_2)\,dx_1\,dx_2\\
&= R_X(t_1-t_2)
\end{aligned}
$$
Similarly,
$$C_X(t_1,t_2) = C_X(t_1-t_2)$$
Therefore, the autocorrelation function of a SSS process depends only on the time lag $t_1-t_2$.

We can also define the joint stationarity of two random processes. Two processes

( )X t and ( )Y t are called jointly strict-sense stationary if their joint probability

distributions of any order is invariant under the translation of time. A complex process

( ) ( ) ( )Z t X t jY t= + is called SSS if ( )X t and ( )Y t are jointly SSS.

Example An iid process is SSS. This is because, $\forall n$,
$$
\begin{aligned}
F_{X(t_1),X(t_2),\dots,X(t_n)}(x_1,x_2,\dots,x_n) &= F_X(x_1)F_X(x_2)\dots F_X(x_n)\\
&= F_{X(t_1+t_0),X(t_2+t_0),\dots,X(t_n+t_0)}(x_1,x_2,\dots,x_n)
\end{aligned}
$$

Example The Poisson process $\{N(t),\ t\ge0\}$ is not stationary, because
$$E\,N(t) = \lambda t$$
which varies with time.

Wide-sense stationary process

It is very difficult to test whether a process is SSS or not. A subclass of the SSS process

called the wide sense stationary process is extremely important from practical point of

view.

A random process $X(t)$ is called a wide-sense stationary (WSS) process if
$$E\,X(t) = \mu_X = \text{constant}$$
and
$$E\,X(t_1)X(t_2) = R_X(t_1-t_2)\ \text{is a function of the time lag }t_1-t_2$$
(equivalently, $\operatorname{Cov}\bigl(X(t_1),X(t_2)\bigr) = C_X(t_1-t_2)$ is a function of the time lag $t_1-t_2$).

Remark

(1) For a WSS process $X(t)$,
$$E\,X^2(t) = R_X(0) = \text{constant}$$
$$\operatorname{var}(X(t)) = E\,X^2(t) - \bigl(E\,X(t)\bigr)^2 = \text{constant}$$
$$C_X(t_1,t_2) = E\,X(t_1)X(t_2) - E\,X(t_1)\,E\,X(t_2) = R_X(t_1-t_2) - \mu_X^2$$
$$\therefore\ C_X(t_1,t_2)\ \text{is a function of the lag }(t_1-t_2).$$

(2) An SSS process is always WSS, but the converse is not always true.

Example: Sinusoid with random phase

Consider the random process $X(t)$ given by
$$X(t) = A\cos(\omega_0t+\phi)$$
where $A$ and $\omega_0$ are constants and $\phi$ is uniformly distributed between 0 and $2\pi$.

• This is the model of the carrier wave (sinusoid of fixed frequency) used to analyse the noise performance of many receivers.

Note that
$$f_\Phi(\phi) = \begin{cases}\dfrac{1}{2\pi} & 0\le\phi\le2\pi\\[4pt] 0 & \text{otherwise}\end{cases}$$
By applying the rule for the transformation of a random variable, we get
$$f_{X(t)}(x) = \begin{cases}\dfrac{1}{\pi\sqrt{A^2-x^2}} & -A\le x\le A\\[4pt] 0 & \text{otherwise}\end{cases}$$
which is independent of $t$. Hence $X(t)$ is first-order stationary.

Note that

$$
\begin{aligned}
E\,X(t) &= E\,A\cos(\omega_0t+\phi)\\
&= \frac{1}{2\pi}\int_0^{2\pi}A\cos(\omega_0t+\phi)\,d\phi\\
&= 0,\quad\text{which is a constant}
\end{aligned}
$$
and
$$
\begin{aligned}
R_X(t_1,t_2) &= E\,X(t_1)X(t_2)\\
&= E\,A\cos(\omega_0t_1+\phi)\,A\cos(\omega_0t_2+\phi)\\
&= \frac{A^2}{2}\,E\bigl[\cos(\omega_0(t_1+t_2)+2\phi) + \cos(\omega_0(t_1-t_2))\bigr]\\
&= \frac{A^2}{2}\cos\bigl(\omega_0(t_1-t_2)\bigr),\quad\text{which is a function of the lag }t_1-t_2.
\end{aligned}
$$

Hence ( )X t is wide-sense stationary.
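The two ensemble averages just computed can be checked by simulation. The sketch below (Python/NumPy; the amplitude, frequency and the chosen time instants are assumed values) estimates $E\,X(t)$ and $E\,X(t_1)X(t_2)$ over many random phases.

```python
import numpy as np

# Sketch: Monte-Carlo check that the random-phase sinusoid is WSS.
rng = np.random.default_rng(4)
A, w0, trials = 2.0, 2 * np.pi * 5.0, 500_000
phi = rng.uniform(0.0, 2 * np.pi, size=trials)

def X(t):                                    # one value per random phase
    return A * np.cos(w0 * t + phi)

t1, t2 = 0.13, 0.33                          # lag t1 - t2 = -0.2
print("estimated E X(t1)    :", X(t1).mean())                        # approx 0
print("estimated R_X(t1,t2) :", np.mean(X(t1) * X(t2)))
print("theory A^2/2 cos(w0(t1-t2)):", A**2 / 2 * np.cos(w0 * (t1 - t2)))
```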

Example: Sinusoid with random amplitude

Consider the random process $X(t)$ given by
$$X(t) = A\cos(\omega_0t+\phi)$$
where $\phi$ and $\omega_0$ are constants and $A$ is a random variable. Here,
$$E\,X(t) = E\,A\times\cos(\omega_0t+\phi)$$
which is independent of time only if $E\,A = 0$.
$$
\begin{aligned}
R_X(t_1,t_2) &= E\,X(t_1)X(t_2)\\
&= E\,A^2\cos(\omega_0t_1+\phi)\cos(\omega_0t_2+\phi)\\
&= E\,A^2\times\tfrac12\bigl[\cos(\omega_0(t_1+t_2)+2\phi) + \cos(\omega_0(t_1-t_2))\bigr]
\end{aligned}
$$
which will not be a function of $(t_1-t_2)$ only.

Example: Random binary wave

Consider a binary random process $X(t)$ consisting of a sequence of random pulses of duration $T$ with the following features:

• The pulse amplitude $A_K$ is a random variable with two values: $p_{A_K}(1)=\tfrac12$ and $p_{A_K}(-1)=\tfrac12$.
• Pulse amplitudes at different pulse durations are independent of each other.
• The start time of the pulse sequence can be any value between 0 and $T$. Thus the random start time $D$ (delay) is uniformly distributed between 0 and $T$.

A realization of the random binary wave is shown in the figure above. Such waveforms are used in binary communication: a pulse of amplitude 1 is used to transmit '1' and a pulse of amplitude $-1$ is used to transmit '0'. The random process $X(t)$ can be written as
$$X(t) = \sum_{n=-\infty}^{\infty}A_n\,\operatorname{rect}\!\left(\frac{t-nT-D}{T}\right)$$
For any $t$,
$$E\,X(t) = 1\times\tfrac12 + (-1)\times\tfrac12 = 0,\qquad E\,X^2(t) = 1^2\times\tfrac12 + (-1)^2\times\tfrac12 = 1$$
Thus the mean and variance of the process are constants. To find the autocorrelation function $R_X(t_1,t_2)$, let us consider the case $0 < t_1 < t_2 = t_1+\tau < T$. Depending on the delay $D$, the points $t_1$ and $t_2$ may lie in one or two pulse intervals.

[Fig.: three cases showing the positions of $t_1$, $t_2$ and the delay $D$ relative to the pulse boundaries]

Case 1 and Case 2 ($t_1$ and $t_2$ in the same pulse interval, i.e. $0<D<t_1$ or $t_2<D<T$): $X(t_1)X(t_2) = 1$ (either $1\times1$ or $(-1)\times(-1)$).

Case 3 ($t_1$ and $t_2$ in different pulse intervals, i.e. $t_1<D<t_2$): $X(t_1)X(t_2) = \pm1$ with equal probability, since the two pulse amplitudes are independent.

Thus,
$$
\begin{aligned}
R_X(t_1,t_2) &= E\bigl[E\bigl(X(t_1)X(t_2)\mid D\bigr)\bigr]\\
&= E\bigl(X(t_1)X(t_2)\mid 0<D<t_1\text{ or }t_2<D<T\bigr)\,P(0<D<t_1\text{ or }t_2<D<T)\\
&\quad + E\bigl(X(t_1)X(t_2)\mid t_1<D<t_2\bigr)\,P(t_1<D<t_2)\\
&= 1\times\left(1-\frac{t_2-t_1}{T}\right) + \left(1\times\tfrac12 - 1\times\tfrac12\right)\frac{t_2-t_1}{T}\\
&= 1 - \frac{t_2-t_1}{T},\qquad 0<t_2-t_1\le T
\end{aligned}
$$
We also have $R_X(t_2,t_1) = E\,X(t_2)X(t_1) = E\,X(t_1)X(t_2) = R_X(t_1,t_2)$, so that
$$R_X(t_1,t_2) = 1 - \frac{|t_2-t_1|}{T},\qquad |t_2-t_1|\le T$$
For $|t_2-t_1|>T$, $t_1$ and $t_2$ always lie in different pulse intervals, and
$$E\,X(t_1)X(t_2) = E\,X(t_1)\,E\,X(t_2) = 0$$
Thus the autocorrelation function for the random binary waveform depends only on $\tau = t_2-t_1$, and we can write
$$R_X(\tau) = 1 - \frac{|\tau|}{T},\qquad |\tau|\le T$$

The plot of ( )XR τ is shown below.

[Fig.: the triangular autocorrelation function $R_X(\tau)$, equal to 1 at $\tau=0$ and falling linearly to 0 at $\tau=\pm T$]
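A numerical sketch of this triangular autocorrelation is given below (Python/NumPy; the pulse duration, sampling step and number of realizations are assumed values). It generates realizations of the random binary wave and estimates $R_X(\tau)$ at a few lags.

```python
import numpy as np

# Sketch: estimate R_X(tau) of the random binary wave and compare with 1 - |tau|/T.
rng = np.random.default_rng(5)
T, dt, n_real, t_max = 1.0, 0.01, 20_000, 4.0

t = np.arange(0.0, t_max, dt)
taus = np.array([0.0, 0.25, 0.5, 0.75, 1.0, 1.5])
t0 = 1.7                                             # fixed reference instant

def realization():
    D = rng.uniform(0.0, T)                          # random start delay
    amps = rng.choice([-1.0, 1.0], size=int(t_max / T) + 2)
    idx = np.floor((t - D) / T).astype(int) + 1      # pulse index of each sample
    return amps[idx]

acc = np.zeros_like(taus)
i0 = int(t0 / dt)
i_tau = ((t0 + taus) / dt).astype(int)
for _ in range(n_real):
    x = realization()
    acc += x[i0] * x[i_tau]

print("lags      :", taus)
print("estimated :", acc / n_real)
print("theory    :", np.maximum(1 - np.abs(taus) / T, 0.0))
```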

Example: Gaussian Random Process

Consider the Gaussian process $X(t)$ discussed earlier. For any positive integer $n$, $X(t_1),X(t_2),\dots,X(t_n)$ are jointly Gaussian with the joint density function given by
$$f_{X(t_1),X(t_2),\dots,X(t_n)}(x_1,x_2,\dots,x_n) = \frac{1}{(2\pi)^{n/2}\sqrt{\det(\mathbf{C}_X)}}\,e^{-\frac12(\mathbf{X}-\boldsymbol{\mu}_X)'\mathbf{C}_X^{-1}(\mathbf{X}-\boldsymbol{\mu}_X)}$$
where
$$\mathbf{C}_X = E(\mathbf{X}-\boldsymbol{\mu}_X)(\mathbf{X}-\boldsymbol{\mu}_X)'\qquad\text{and}\qquad\boldsymbol{\mu}_X = E\,\mathbf{X} = \bigl[E\,X(t_1),\,E\,X(t_2),\dots,E\,X(t_n)\bigr]'.$$
If $X(t)$ is WSS, then
$$\boldsymbol{\mu}_X = \begin{bmatrix}\mu_X\\ \mu_X\\ \vdots\\ \mu_X\end{bmatrix}
\qquad\text{and}\qquad
\mathbf{C}_X = \begin{bmatrix}
C_X(0) & C_X(t_1-t_2) & \cdots & C_X(t_1-t_n)\\
C_X(t_2-t_1) & C_X(0) & \cdots & C_X(t_2-t_n)\\
\vdots & \vdots & \ddots & \vdots\\
C_X(t_n-t_1) & C_X(t_n-t_2) & \cdots & C_X(0)
\end{bmatrix}$$
We see that $f_{X(t_1),X(t_2),\dots,X(t_n)}(x_1,x_2,\dots,x_n)$ depends only on the time lags. Thus, for a Gaussian random process, WSS implies strict-sense stationarity, because this process is completely described by the mean and the autocorrelation functions.

Properties of the Autocorrelation Function of a Real WSS Random Process

Autocorrelation of a deterministic signal

Consider a deterministic signal $x(t)$ such that
$$0 < \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}x^2(t)\,dt < \infty$$
Such signals are called power signals. For a power signal $x(t)$, the autocorrelation function is defined as
$$R_x(\tau) = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}x(t)\,x(t+\tau)\,dt$$
$R_x(\tau)$ measures the similarity between a signal and its time-shifted version. Particularly, $R_x(0) = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}x^2(t)\,dt$ is the mean-square value. If $x(t)$ is a voltage waveform across a 1 ohm resistance, then $R_x(0)$ is the average power delivered to the resistance. In this sense, $R_x(0)$ represents the average power of the signal.

Example Suppose $x(t) = A\cos\omega t$. The autocorrelation function of $x(t)$ at lag $\tau$ is given by
$$
\begin{aligned}
R_x(\tau) &= \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}A\cos\omega(t+\tau)\,A\cos\omega t\,dt\\
&= \lim_{T\to\infty}\frac{A^2}{4T}\int_{-T}^{T}\bigl[\cos(2\omega t+\omega\tau)+\cos\omega\tau\bigr]\,dt\\
&= \frac{A^2}{2}\cos\omega\tau
\end{aligned}
$$
We see that $R_x(\tau)$ of the above periodic signal is also periodic and its maximum occurs at $\tau = 0,\ \pm\frac{2\pi}{\omega},\ \pm\frac{4\pi}{\omega}$, etc. The power of the signal is $R_x(0) = \frac{A^2}{2}$.
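The time-average definition can also be evaluated numerically. The short sketch below (Python/NumPy; the amplitude, frequency, observation window and lags are assumed values) approximates $R_x(\tau)$ for $x(t)=A\cos\omega t$ and compares it with $\frac{A^2}{2}\cos\omega\tau$.

```python
import numpy as np

# Sketch: numerical time-averaged autocorrelation of x(t) = A cos(w t).
A, w = 3.0, 2 * np.pi * 4.0
T, dt = 50.0, 1e-3
t = np.arange(-T, T, dt)
x = A * np.cos(w * t)

for tau in (0.0, 0.05, 0.125):
    x_shift = A * np.cos(w * (t + tau))
    R_num = np.mean(x * x_shift)          # (1/2T) * integral, approximated by a mean
    print(f"tau={tau:5.3f}  numeric={R_num: .4f}  theory={(A**2/2)*np.cos(w*tau): .4f}")
```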

The autocorrelation of the deterministic signal gives us insight into the properties of the

autocorrelation function of a WSS process. We shall discuss these properties next.

Properties of the autocorrelation function of a WSS process

Consider a real WSS process $X(t)$. Since the autocorrelation function $R_X(t_1,t_2)$ of such a process is a function of the lag $\tau = t_1-t_2$, we can redefine a one-parameter autocorrelation function as
$$R_X(\tau) = E\,X(t+\tau)X(t)$$
If $X(t)$ is a complex WSS process, then
$$R_X(\tau) = E\,X(t+\tau)X^*(t)$$
where $X^*(t)$ is the complex conjugate of $X(t)$. For a discrete random sequence, we can define the autocorrelation sequence similarly.

The autocorrelation function is an important function characterising a WSS random process. It possesses some general properties. We briefly describe them below.

1. $R_X(0) = E\,X^2(t)$ is the mean-square value of the process. If $X(t)$ is a voltage signal applied across a 1 ohm resistance, then $R_X(0)$ is the ensemble average power delivered to the resistance. Thus
$$R_X(0) = E\,X^2(t) \ge 0.$$

2. For a real WSS process $X(t)$, $R_X(\tau)$ is an even function of the lag $\tau$, i.e. $R_X(-\tau) = R_X(\tau)$:
$$
\begin{aligned}
R_X(-\tau) &= E\,X(t-\tau)X(t)\\
&= E\,X(t_1)X(t_1+\tau)\qquad(\text{substituting }t_1 = t-\tau)\\
&= R_X(\tau)
\end{aligned}
$$
Remark For a complex process $X(t)$, $R_X(-\tau) = R_X^*(\tau)$.

3. $|R_X(\tau)| \le R_X(0)$. This follows from the Schwarz inequality
$$\bigl\langle X(t),X(t+\tau)\bigr\rangle^2 \le \|X(t)\|^2\,\|X(t+\tau)\|^2$$
We have
$$
\begin{aligned}
R_X^2(\tau) &= \bigl(E\,X(t)X(t+\tau)\bigr)^2\\
&\le E\,X^2(t)\,E\,X^2(t+\tau) = R_X(0)\,R_X(0)\\
\therefore\ |R_X(\tau)| &\le R_X(0)
\end{aligned}
$$

4. $R_X(\tau)$ is a positive semi-definite function in the sense that for any positive integer $n$ and any real $a_i,a_j$,
$$\sum_{i=1}^{n}\sum_{j=1}^{n}a_ia_jR_X(t_i-t_j)\ge0$$
Proof: Define the random variable
$$Y = \sum_{i=1}^{n}a_iX(t_i)$$
Then we have
$$0 \le E\,Y^2 = \sum_{i=1}^{n}\sum_{j=1}^{n}a_ia_j\,E\,X(t_i)X(t_j) = \sum_{i=1}^{n}\sum_{j=1}^{n}a_ia_jR_X(t_i-t_j)$$

It can be shown that the sufficient condition for a function ( )XR τ to be the

autocorrelation function of a real WSS process ( )X t is that ( )XR τ be real, even and

positive semidefinite.

5. If $X(t)$ is MS periodic, then $R_X(\tau)$ is also periodic with the same period.

Proof: Note that a real WSS random process $X(t)$ is called mean-square periodic (MS periodic) with a period $T_p$ if for every $t\in\Gamma$
$$E\bigl(X(t+T_p)-X(t)\bigr)^2 = 0$$
$$
\begin{aligned}
&\Rightarrow\ E\,X^2(t+T_p) + E\,X^2(t) - 2E\,X(t+T_p)X(t) = 0\\
&\Rightarrow\ R_X(0) + R_X(0) - 2R_X(T_p) = 0\\
&\Rightarrow\ R_X(T_p) = R_X(0)
\end{aligned}
$$
Again,

$$
\begin{aligned}
\Bigl(E\bigl[\bigl(X(t+\tau+T_p)-X(t+\tau)\bigr)X(t)\bigr]\Bigr)^2 &\le E\bigl(X(t+\tau+T_p)-X(t+\tau)\bigr)^2\,E\,X^2(t)\\
\Rightarrow\ \bigl(R_X(\tau+T_p)-R_X(\tau)\bigr)^2 &\le 2\bigl(R_X(0)-R_X(T_p)\bigr)\,R_X(0) = 0\\
\therefore\ R_X(\tau+T_p) &= R_X(\tau)
\end{aligned}
$$
For example, $X(t) = A\cos(\omega_0t+\phi)$, where $A$ and $\omega_0$ are constants and $\phi\sim U[0,2\pi]$, is an MS periodic random process with period $\frac{2\pi}{\omega_0}$. Its autocorrelation function $R_X(\tau) = \frac{A^2}{2}\cos\omega_0\tau$ is periodic with the same period $\frac{2\pi}{\omega_0}$.

The converse of this result is also true: if $R_X(\tau)$ is periodic with period $T_p$, then $X(t)$ is MS periodic with period $T_p$. This property helps us in determining the time period of an MS periodic random process.

6. Suppose $X(t) = \mu_X + V(t)$, where $V(t)$ is a zero-mean WSS process with $\lim_{\tau\to\infty}R_V(\tau) = 0$. Then
$$\lim_{\tau\to\infty}R_X(\tau) = \mu_X^2$$

Interpretation of the autocorrelation function of a WSS process

The autocorrelation function $R_X(\tau)$ measures the correlation between the two random variables $X(t)$ and $X(t+\tau)$. If $R_X(\tau)$ drops quickly with respect to $\tau$, then $X(t)$ and $X(t+\tau)$ are weakly correlated for large $\tau$. This in turn means that the signal changes rapidly with time and has significant high-frequency components. If $R_X(\tau)$ drops slowly, the signal samples are highly correlated and the signal has fewer high-frequency components. Later on we shall see that $R_X(\tau)$ is directly related to the frequency-domain representation of a WSS process.

Cross-correlation function of jointly WSS processes

If $X(t)$ and $Y(t)$ are two real jointly WSS random processes, their cross-correlation functions are independent of $t$ and depend only on the time lag. We can write the cross-correlation function
$$R_{XY}(\tau) = E\,X(t+\tau)Y(t)$$
The cross-correlation function satisfies the following properties:

(i) $R_{XY}(\tau) = R_{YX}(-\tau)$. This is because
$$R_{XY}(\tau) = E\,X(t+\tau)Y(t) = E\,Y(t)X(t+\tau) = R_{YX}(-\tau)$$

(ii) $|R_{XY}(\tau)| \le \sqrt{R_X(0)R_Y(0)}$. We have
$$
\begin{aligned}
R_{XY}^2(\tau) &= \bigl(E\,X(t+\tau)Y(t)\bigr)^2\\
&\le E\,X^2(t+\tau)\,E\,Y^2(t)\qquad(\text{using the Cauchy-Schwarz inequality})\\
&= R_X(0)\,R_Y(0)\\
\therefore\ |R_{XY}(\tau)| &\le \sqrt{R_X(0)R_Y(0)}
\end{aligned}
$$
Further,
$$\sqrt{R_X(0)R_Y(0)} \le \tfrac12\bigl(R_X(0)+R_Y(0)\bigr)\qquad(\because\ \text{geometric mean} \le \text{arithmetic mean})$$
$$\therefore\ |R_{XY}(\tau)| \le \sqrt{R_X(0)R_Y(0)} \le \tfrac12\bigl(R_X(0)+R_Y(0)\bigr)$$

(iii) If $X(t)$ and $Y(t)$ are uncorrelated, then $R_{XY}(\tau) = E\,X(t+\tau)\,E\,Y(t) = \mu_X\mu_Y$.

(iv) If $X(t)$ and $Y(t)$ are orthogonal processes, $R_{XY}(\tau) = E\,X(t+\tau)Y(t) = 0$.

[Fig.: $R_{XY}(\tau)$ and $R_{YX}(\tau)$ plotted against $\tau$, illustrating $R_{XY}(\tau) = R_{YX}(-\tau)$]

Example Consider a random process $Z(t)$ which is the sum of two real jointly WSS random processes $X(t)$ and $Y(t)$:
$$Z(t) = X(t) + Y(t)$$
We have
$$
\begin{aligned}
R_Z(\tau) &= E\,Z(t+\tau)Z(t)\\
&= E\,[X(t+\tau)+Y(t+\tau)][X(t)+Y(t)]\\
&= R_X(\tau) + R_Y(\tau) + R_{XY}(\tau) + R_{YX}(\tau)
\end{aligned}
$$
If $X(t)$ and $Y(t)$ are orthogonal processes, then $R_{XY}(\tau) = R_{YX}(\tau) = 0$, and
$$\therefore\ R_Z(\tau) = R_X(\tau) + R_Y(\tau)$$

Continuity and Differentiation of Random Processes

• We know that the dynamic behavior of a system is described by differential equations involving input to and output of the system. For example, the behavior of the RC network is described by a first order linear differential equation with the source voltage as the input and the capacitor voltage as the output. What happens if the input voltage to the network is a random process?

• Each realization of the random process is a deterministic function and the

concepts of differentiation and integration can be applied to it. Our aim is to extend these concepts to the ensemble of realizations and apply calculus to random process.

We discussed the convergence and the limit of a random sequence. The continuity of the random process can be defined with the help of convergence and limits of a random process. We can define continuity with probability 1, mean-square continuity, and continuity in probability etc. We shall discuss the mean-square continuity and the elementary concepts of corresponding mean-square calculus. Mean-square continuity of a random process Recall that a sequence of random variables nX converges to a random variable X if

$$\lim_{n\to\infty}E\bigl[(X_n-X)^2\bigr] = 0$$
and we write
$$\operatorname{l.i.m.}_{n\to\infty}X_n = X$$
A random process $X(t)$ is said to be continuous at a point $t=t_0$ in the mean-square sense if
$$\operatorname{l.i.m.}_{t\to t_0}X(t) = X(t_0)$$
or, equivalently,
$$\lim_{t\to t_0}E\bigl[\bigl(X(t)-X(t_0)\bigr)^2\bigr] = 0$$

Mean-square continuity and autocorrelation function

(1) A random process $X(t)$ is MS continuous at $t_0$ if its autocorrelation function $R_X(t_1,t_2)$ is continuous at $(t_0,t_0)$.

Proof:
$$E\bigl[\bigl(X(t)-X(t_0)\bigr)^2\bigr] = E\bigl(X^2(t) - 2X(t)X(t_0) + X^2(t_0)\bigr) = R_X(t,t) - 2R_X(t,t_0) + R_X(t_0,t_0)$$
If $R_X(t_1,t_2)$ is continuous at $(t_0,t_0)$, then
$$\lim_{t\to t_0}E\bigl[\bigl(X(t)-X(t_0)\bigr)^2\bigr] = R_X(t_0,t_0) - 2R_X(t_0,t_0) + R_X(t_0,t_0) = 0$$

(2) If $X(t)$ is MS continuous at $t_0$, its mean is continuous at $t_0$. This follows from the fact that
$$\bigl(E\bigl[X(t)-X(t_0)\bigr]\bigr)^2 \le E\bigl[\bigl(X(t)-X(t_0)\bigr)^2\bigr]$$
$$\therefore\ \lim_{t\to t_0}\bigl(E\bigl[X(t)-X(t_0)\bigr]\bigr)^2 \le \lim_{t\to t_0}E\bigl[\bigl(X(t)-X(t_0)\bigr)^2\bigr] = 0$$
$$\therefore\ E\,X(t)\ \text{is continuous at }t_0.$$

Example Consider the random binary wave $X(t)$ discussed in the earlier example. A typical realization of the process, shown in the figure below, is a discontinuous function.

[Fig.: a realization of the random binary wave, switching between $+1$ and $-1$ at multiples of $T_p$]

The process has the autocorrelation function given by
$$R_X(\tau) = \begin{cases}1-\dfrac{|\tau|}{T_p} & |\tau|\le T_p\\[4pt] 0 & \text{otherwise}\end{cases}$$
We observe that $R_X(\tau)$ is continuous at $\tau=0$. Therefore $X(t)$ is MS continuous at every $t$, even though its realizations are discontinuous.

Example For a Wiener process $X(t)$,
$$R_X(t_1,t_2) = \alpha\min(t_1,t_2),\quad\text{where }\alpha\text{ is a constant,}\qquad\therefore\ R_X(t,t) = \alpha t$$
Thus the autocorrelation function of a Wiener process is continuous everywhere, implying that a Wiener process is m.s. continuous everywhere. We can similarly show that the Poisson process is m.s. continuous everywhere.

Mean-square differentiability

The random process $X(t)$ is said to have the mean-square derivative $X'(t)$ at a point $t\in\Gamma$ provided $\frac{X(t+\Delta t)-X(t)}{\Delta t}$ approaches $X'(t)$ in the mean-square sense as $\Delta t\to0$. In other words, the random process $X(t)$ has an m.s. derivative $X'(t)$ if
$$\lim_{\Delta t\to0}E\left[\left(\frac{X(t+\Delta t)-X(t)}{\Delta t} - X'(t)\right)^2\right] = 0$$

Remark (1) If all the sample functions of a random process $X(t)$ are differentiable, then the above condition is satisfied and the m.s. derivative exists.

Example Consider the random-phase sinusoid $X(t) = A\cos(\omega_0t+\phi)$, where $A$ and $\omega_0$ are constants and $\phi\sim U[0,2\pi]$. Then for each $\phi$, $X(t)$ is differentiable. Therefore, the m.s. derivative is
$$X'(t) = -A\omega_0\sin(\omega_0t+\phi)$$

M.S. Derivative and Autocorrelation Functions

The m.s. derivative of a random process $X(t)$ at a point $t\in\Gamma$ exists if $\frac{\partial^2R_X(t_1,t_2)}{\partial t_1\partial t_2}$ exists at the point $(t,t)$. Applying the Cauchy criterion, the condition for existence of the m.s. derivative is
$$\lim_{\Delta t_1,\Delta t_2\to0}E\left[\left(\frac{X(t+\Delta t_1)-X(t)}{\Delta t_1} - \frac{X(t+\Delta t_2)-X(t)}{\Delta t_2}\right)^2\right] = 0$$
Expanding the square and taking expectations gives
$$
\begin{aligned}
&\frac{R_X(t+\Delta t_1,t+\Delta t_1) - 2R_X(t+\Delta t_1,t) + R_X(t,t)}{\Delta t_1^2}
+ \frac{R_X(t+\Delta t_2,t+\Delta t_2) - 2R_X(t+\Delta t_2,t) + R_X(t,t)}{\Delta t_2^2}\\
&\qquad - 2\,\frac{R_X(t+\Delta t_1,t+\Delta t_2) - R_X(t+\Delta t_1,t) - R_X(t,t+\Delta t_2) + R_X(t,t)}{\Delta t_1\,\Delta t_2}
\end{aligned}
$$
Each of the terms above converges to $\left.\frac{\partial^2R_X(t_1,t_2)}{\partial t_1\partial t_2}\right|_{t_1=t_2=t}$ if the second partial derivative exists, so that the limit of the whole expression is
$$\frac{\partial^2R_X}{\partial t_1\partial t_2} + \frac{\partial^2R_X}{\partial t_1\partial t_2} - 2\,\frac{\partial^2R_X}{\partial t_1\partial t_2} = 0$$
Thus $X(t)$ is m.s. differentiable at $t\in\Gamma$ if $\frac{\partial^2R_X(t_1,t_2)}{\partial t_1\partial t_2}$ exists at $(t,t)\in\Gamma\times\Gamma$.

Particularly, if $X(t)$ is WSS,
$$R_X(t_1,t_2) = R_X(t_1-t_2)$$
Substituting $\tau = t_1-t_2$, we get
$$\frac{\partial^2R_X(t_1,t_2)}{\partial t_1\partial t_2} = \frac{\partial}{\partial t_1}\left(\frac{\partial R_X(t_1-t_2)}{\partial t_2}\right) = \frac{\partial}{\partial t_1}\left(-\frac{dR_X(\tau)}{d\tau}\right) = -\frac{d^2R_X(\tau)}{d\tau^2}$$
Therefore, a WSS process $X(t)$ is m.s. differentiable if $R_X(\tau)$ has a second derivative at $\tau=0$.

Example Consider a WSS process $X(t)$ with autocorrelation function
$$R_X(\tau) = \exp(-a|\tau|)$$
$R_X(\tau)$ does not have first and second derivatives at $\tau=0$, so $X(t)$ is not mean-square differentiable.

Example The random binary wave $X(t)$ has the autocorrelation function
$$R_X(\tau) = \begin{cases}1-\dfrac{|\tau|}{T_p} & |\tau|\le T_p\\[4pt] 0 & \text{otherwise}\end{cases}$$

( )XR τ does not have the first and second derivative at 0.τ = Therefore, ( ) X t is not mean-square differentiable. Example For a Wiener process ( ) ,X t

$$R_X(t_1,t_2) = \alpha\min(t_1,t_2),\quad\text{where }\alpha\text{ is a constant.}$$
$$\therefore\ \frac{\partial R_X(t_1,t_2)}{\partial t_2} = \begin{cases}\alpha & t_2<t_1\\ 0 & t_2>t_1\\ \text{does not exist} & t_2=t_1\end{cases}$$
$$\therefore\ \frac{\partial^2R_X(t_1,t_2)}{\partial t_1\partial t_2}\ \text{does not exist at }t_1=t_2.$$
Thus a Wiener process is m.s. differentiable nowhere.

Mean and Autocorrelation of the Derivative Process

We have

$$
\begin{aligned}
E\,X'(t) &= E\lim_{\Delta t\to0}\frac{X(t+\Delta t)-X(t)}{\Delta t}\\
&= \lim_{\Delta t\to0}\frac{E\,X(t+\Delta t)-E\,X(t)}{\Delta t}\\
&= \lim_{\Delta t\to0}\frac{\mu_X(t+\Delta t)-\mu_X(t)}{\Delta t}\\
&= \mu_X'(t)
\end{aligned}
$$
For a WSS process, $E\,X'(t) = \mu_X'(t) = 0$, since $\mu_X(t)$ is a constant.
$$
\begin{aligned}
E\,X(t_1)X'(t_2) = R_{XX'}(t_1,t_2) &= E\,X(t_1)\lim_{\Delta t\to0}\frac{X(t_2+\Delta t)-X(t_2)}{\Delta t}\\
&= \lim_{\Delta t\to0}\frac{E\bigl[X(t_1)X(t_2+\Delta t)\bigr]-E\bigl[X(t_1)X(t_2)\bigr]}{\Delta t}\\
&= \lim_{\Delta t\to0}\frac{R_X(t_1,t_2+\Delta t)-R_X(t_1,t_2)}{\Delta t}\\
&= \frac{\partial R_X(t_1,t_2)}{\partial t_2}
\end{aligned}
$$
Similarly, we can show that
$$E\,X'(t_1)X'(t_2) = \frac{\partial^2R_X(t_1,t_2)}{\partial t_1\partial t_2}$$
For a WSS process,
$$E\,X(t_1)X'(t_2) = \frac{\partial R_X(t_1-t_2)}{\partial t_2} = -\frac{dR_X(\tau)}{d\tau}$$
and
$$E\,X'(t_1)X'(t_2) = -\frac{d^2R_X(\tau)}{d\tau^2}$$
$$\therefore\ \operatorname{var}\bigl(X'(t)\bigr) = E\,X'^2(t) = -\left.\frac{d^2R_X(\tau)}{d\tau^2}\right|_{\tau=0}$$

Mean Square Integral

Recall that the definite integral (Riemann integral) of a function $x(t)$ over the interval $[t_0,t]$ is defined as the limiting sum
$$\int_{t_0}^{t}x(\tau)\,d\tau = \lim_{\substack{n\to\infty\\ \Delta_k\to0}}\sum_{k=0}^{n-1}x(\tau_k)\,\Delta_k$$
where $t_0 < t_1 < \dots < t_{n-1} < t_n = t$ is a partition of the interval $[t_0,t]$, $\Delta_k = t_{k+1}-t_k$ and $\tau_k\in[t_k,t_{k+1}]$.

For a random process $X(t)$, the m.s. integral can be similarly defined as the process $Y(t)$ given by
$$Y(t) = \int_{t_0}^{t}X(\tau)\,d\tau = \operatorname*{l.i.m.}_{\substack{n\to\infty\\ \Delta_k\to0}}\sum_{k=0}^{n-1}X(\tau_k)\,\Delta_k$$

Existence of the M.S. Integral

• It can be shown that a sufficient condition for the m.s. integral $\int_{t_0}^{t}X(\tau)\,d\tau$ to exist is that the double integral $\int_{t_0}^{t}\int_{t_0}^{t}R_X(\tau_1,\tau_2)\,d\tau_1\,d\tau_2$ exists.
• If $X(t)$ is M.S. continuous, then the above condition is satisfied and the process is M.S. integrable.

Mean and Autocorrelation of the Integral of a WSS Process

We have
$$E\,Y(t) = E\int_{t_0}^{t}X(\tau)\,d\tau = \int_{t_0}^{t}E\,X(\tau)\,d\tau = \int_{t_0}^{t}\mu_X\,d\tau = \mu_X(t-t_0)$$
Therefore, if $\mu_X\ne0$, $Y(t)$ is necessarily non-stationary.
$$
\begin{aligned}
R_Y(t_1,t_2) &= E\,Y(t_1)Y(t_2)\\
&= E\int_{t_0}^{t_1}\int_{t_0}^{t_2}X(\tau_1)X(\tau_2)\,d\tau_1\,d\tau_2\\
&= \int_{t_0}^{t_1}\int_{t_0}^{t_2}E\,X(\tau_1)X(\tau_2)\,d\tau_1\,d\tau_2\\
&= \int_{t_0}^{t_1}\int_{t_0}^{t_2}R_X(\tau_1-\tau_2)\,d\tau_1\,d\tau_2
\end{aligned}
$$
which is a function of $t_1$ and $t_2$. Thus the integral of a WSS process is always non-stationary.

[Fig.: (a) a realization of a WSS process $X(t)$, (b) the corresponding integral $Y(t)$]

Remark The nonstationarity of the M.S. integral of a random process has physical importance: the output of an integrator driven by stationary noise rises unboundedly.

Example The random binary wave $X(t)$ has the autocorrelation function

$$R_X(\tau) = \begin{cases}1-\dfrac{|\tau|}{T_p} & |\tau|\le T_p\\[4pt] 0 & \text{otherwise}\end{cases}$$
$R_X(\tau)$ is continuous at $\tau=0$, implying that $X(t)$ is M.S. continuous. Therefore $X(t)$ is mean-square integrable.

Time Averages and Ergodicity

Often we are interested in finding the various ensemble averages of a random process $X(t)$ by means of the corresponding time averages determined from a single realization of the random process. For example, we can compute the time-mean of a single realization of the random process by the formula
$$\langle\mu_x\rangle_T = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}x(t)\,dt$$
which is a constant for the selected realization. $\langle\mu_x\rangle_T$ represents the dc value of $x(t)$.

Another important average used in electrical engineering is the rms value, given by
$$x_{rms} = \sqrt{\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}x^2(t)\,dt}$$
Can $\langle\mu_x\rangle_T$ and $x_{rms}^2$ represent $\mu_X$ and $E\,X^2(t)$ respectively? To answer such a question we have to understand the various time averages and their properties.

Time averages of a random process

The time-average of a function $g(X(t))$ of a continuous random process $X(t)$ is defined by
$$\bigl\langle g(X(t))\bigr\rangle_T = \frac{1}{2T}\int_{-T}^{T}g(X(t))\,dt$$
where the integral is defined in the mean-square sense. Similarly, the time-average of a function $g(X_n)$ of a discrete-time random process $X_n$ is defined by
$$\bigl\langle g(X_n)\bigr\rangle_N = \frac{1}{2N+1}\sum_{i=-N}^{N}g(X_i)$$
The above definitions are in contrast to the corresponding ensemble averages, defined by
$$E\,g(X(t)) = \int_{-\infty}^{\infty}g(x)\,f_{X(t)}(x)\,dx\qquad\text{(continuous case)}$$
$$E\,g(X_n) = \sum_{x_i\in R_X}g(x_i)\,p_{X_n}(x_i)\qquad\text{(discrete case)}$$
The following time averages are of particular interest:

(a) Time-averaged mean
$$\langle\mu_X\rangle_T = \frac{1}{2T}\int_{-T}^{T}X(t)\,dt\qquad\text{(continuous case)}$$
$$\langle\mu_X\rangle_N = \frac{1}{2N+1}\sum_{i=-N}^{N}X_i\qquad\text{(discrete case)}$$

(b) Time-averaged autocorrelation function
$$\bigl\langle R_X(\tau)\bigr\rangle_T = \frac{1}{2T}\int_{-T}^{T}X(t)X(t+\tau)\,dt\qquad\text{(continuous case)}$$
$$\bigl\langle R_X[m]\bigr\rangle_N = \frac{1}{2N+1}\sum_{i=-N}^{N}X_iX_{i+m}\qquad\text{(discrete case)}$$

Note that $\bigl\langle g(X(t))\bigr\rangle_T$ and $\bigl\langle g(X_n)\bigr\rangle_N$ are functions of random variables and are governed by their respective probability distributions. However, determination of these distribution functions is difficult, and we shall discuss the behaviour of these averages in terms of their means and variances. We shall further assume that the random processes $X(t)$ and $X_n$ are WSS.

Mean and Variance of the Time Averages

Let us consider the simplest case of the time-averaged mean of a discrete-time WSS random process $X_n$, given by
$$\langle\mu_X\rangle_N = \frac{1}{2N+1}\sum_{i=-N}^{N}X_i$$
The mean of $\langle\mu_X\rangle_N$ is
$$E\langle\mu_X\rangle_N = \frac{1}{2N+1}\sum_{i=-N}^{N}E\,X_i = \mu_X$$
and the variance is
$$
\begin{aligned}
E\bigl(\langle\mu_X\rangle_N-\mu_X\bigr)^2 &= E\left(\frac{1}{2N+1}\sum_{i=-N}^{N}(X_i-\mu_X)\right)^2\\
&= \frac{1}{(2N+1)^2}\left[\sum_{i=-N}^{N}E(X_i-\mu_X)^2 + \sum_{i=-N}^{N}\ \sum_{\substack{j=-N\\ j\ne i}}^{N}E(X_i-\mu_X)(X_j-\mu_X)\right]
\end{aligned}
$$
If the samples $X_{-N},X_{-N+1},\dots,X_{N}$ are uncorrelated,
$$E\bigl(\langle\mu_X\rangle_N-\mu_X\bigr)^2 = \frac{1}{(2N+1)^2}\sum_{i=-N}^{N}E(X_i-\mu_X)^2 = \frac{\sigma_X^2}{2N+1}$$
We also observe that
$$\lim_{N\to\infty}E\bigl(\langle\mu_X\rangle_N-\mu_X\bigr)^2 = 0$$
From the above result, we conclude that $\langle\mu_X\rangle_N\xrightarrow{M.S.}\mu_X$.
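The $\frac{\sigma_X^2}{2N+1}$ decay of the variance can be illustrated with a short simulation (Python/NumPy; the distribution of the uncorrelated samples and the values of $N$ are assumptions made only for this sketch).

```python
import numpy as np

# Sketch: variance of the time-averaged mean of 2N+1 uncorrelated samples.
rng = np.random.default_rng(6)
mu, sigma, trials = 1.0, 2.0, 20_000

for N in (5, 50, 500):
    X = rng.normal(mu, sigma, size=(trials, 2 * N + 1))   # iid samples
    mu_hat = X.mean(axis=1)                               # time-averaged mean per realization
    print(f"N={N:4d}  var(mu_hat)={mu_hat.var():.5f}  "
          f"theory sigma^2/(2N+1)={sigma**2/(2*N+1):.5f}")
```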

Let us consider the time-averaged mean for the continuous case. We have
$$\langle\mu_X\rangle_T = \frac{1}{2T}\int_{-T}^{T}X(t)\,dt$$
$$\therefore\ E\langle\mu_X\rangle_T = \frac{1}{2T}\int_{-T}^{T}E\,X(t)\,dt = \frac{1}{2T}\int_{-T}^{T}\mu_X\,dt = \mu_X$$
and the variance is
$$
\begin{aligned}
E\bigl(\langle\mu_X\rangle_T-\mu_X\bigr)^2 &= E\left(\frac{1}{2T}\int_{-T}^{T}\bigl(X(t)-\mu_X\bigr)\,dt\right)^2\\
&= \frac{1}{4T^2}\int_{-T}^{T}\int_{-T}^{T}E\bigl(X(t_1)-\mu_X\bigr)\bigl(X(t_2)-\mu_X\bigr)\,dt_1\,dt_2\\
&= \frac{1}{4T^2}\int_{-T}^{T}\int_{-T}^{T}C_X(t_1-t_2)\,dt_1\,dt_2
\end{aligned}
$$
The above double integral is evaluated over the square region bounded by $t_1=\pm T$ and $t_2=\pm T$. We divide this square region into strips parallel to the line $t_1-t_2=0$. Putting $t_1-t_2=\tau$ and noting that the differential area between $t_1-t_2=\tau$ and $t_1-t_2=\tau+d\tau$ is $(2T-|\tau|)\,d\tau$, the above double integral is converted to a single integral as follows. Therefore,

$$
\begin{aligned}
E\bigl(\langle\mu_X\rangle_T-\mu_X\bigr)^2 &= \frac{1}{4T^2}\int_{-T}^{T}\int_{-T}^{T}C_X(t_1-t_2)\,dt_1\,dt_2\\
&= \frac{1}{4T^2}\int_{-2T}^{2T}(2T-|\tau|)\,C_X(\tau)\,d\tau\\
&= \frac{1}{2T}\int_{-2T}^{2T}\left(1-\frac{|\tau|}{2T}\right)C_X(\tau)\,d\tau
\end{aligned}
$$

[Fig.: the square region of integration $|t_1|\le T$, $|t_2|\le T$, with strips between the lines $t_1-t_2=\tau$ and $t_1-t_2=\tau+d\tau$; the corners correspond to $t_1-t_2=\pm2T$]

Ergodicity Principle

If the time averages converge to the corresponding ensemble averages in the probabilistic sense, then a time average computed from a large realization can be used as the value of the corresponding ensemble average. Such a principle is the ergodicity principle, discussed below.

Mean ergodic process

A WSS process $X(t)$ is said to be ergodic in mean if $\langle\mu_X\rangle_T\xrightarrow{M.S.}\mu_X$ as $T\to\infty$.

Thus, for a mean ergodic process $X(t)$,
$$\lim_{T\to\infty}E\langle\mu_X\rangle_T = \mu_X\qquad\text{and}\qquad\lim_{T\to\infty}\operatorname{var}\bigl(\langle\mu_X\rangle_T\bigr) = 0$$
We have earlier shown that $E\langle\mu_X\rangle_T = \mu_X$ and
$$\operatorname{var}\bigl(\langle\mu_X\rangle_T\bigr) = \frac{1}{2T}\int_{-2T}^{2T}C_X(\tau)\left(1-\frac{|\tau|}{2T}\right)d\tau$$
Therefore, the condition for ergodicity in mean is
$$\lim_{T\to\infty}\frac{1}{2T}\int_{-2T}^{2T}C_X(\tau)\left(1-\frac{|\tau|}{2T}\right)d\tau = 0$$
If $C_X(\tau)$ decreases to 0 for $\tau>\tau_0$, then the above condition is satisfied. Further,
$$\left|\frac{1}{2T}\int_{-2T}^{2T}C_X(\tau)\left(1-\frac{|\tau|}{2T}\right)d\tau\right| \le \frac{1}{2T}\int_{-2T}^{2T}\bigl|C_X(\tau)\bigr|\,d\tau$$
Therefore, a sufficient condition for mean ergodicity is
$$\int_{-\infty}^{\infty}\bigl|C_X(\tau)\bigr|\,d\tau < \infty$$

Example Consider the random binary waveform $X(t)$ discussed in the earlier example. The process has the autocovariance function
$$C_X(\tau) = \begin{cases}1-\dfrac{|\tau|}{T_p} & |\tau|\le T_p\\[4pt] 0 & \text{otherwise}\end{cases}$$
Here
$$\int_{-\infty}^{\infty}C_X(\tau)\,d\tau = 2\int_{0}^{T_p}\left(1-\frac{\tau}{T_p}\right)d\tau = 2\left(T_p-\frac{T_p}{2}\right) = T_p$$
$$\therefore\ \int_{-\infty}^{\infty}C_X(\tau)\,d\tau < \infty$$
Hence $X(t)$ is mean ergodic.

Autocorrelation ergodicity

$$\bigl\langle R_X(\tau)\bigr\rangle_T = \frac{1}{2T}\int_{-T}^{T}X(t)X(t+\tau)\,dt$$
If we consider $Z(t) = X(t)X(t+\tau)$, then $\mu_Z = E\,Z(t) = R_X(\tau)$, and $X(t)$ will be autocorrelation ergodic if $Z(t)$ is mean ergodic. Thus $X(t)$ will be autocorrelation ergodic if
$$\lim_{T\to\infty}\frac{1}{2T}\int_{-2T}^{2T}\left(1-\frac{|\tau_1|}{2T}\right)C_Z(\tau_1)\,d\tau_1 = 0$$
where
$$C_Z(\tau_1) = E\,Z(t)Z(t-\tau_1) - E\,Z(t)\,E\,Z(t-\tau_1) = E\,X(t)X(t+\tau)X(t-\tau_1)X(t+\tau-\tau_1) - R_X^2(\tau)$$
This involves a fourth-order moment, which is difficult to evaluate in general; for a jointly Gaussian process the condition can be worked out explicitly. Since $C_Z(\tau_1) = E\,Z(t+\tau_1)Z(t) - R_X^2(\tau)$, $X(t)$ will be autocorrelation ergodic if
$$\lim_{T\to\infty}\frac{1}{2T}\int_{-2T}^{2T}\left(1-\frac{|\tau_1|}{2T}\right)\bigl(E\,Z(t+\tau_1)Z(t) - R_X^2(\tau)\bigr)\,d\tau_1 = 0$$

Example Consider the random-phase sinusoid given by
$$X(t) = A\cos(\omega_0t+\phi)$$
where $A$ and $\omega_0$ are constants and $\phi\sim U[0,2\pi]$ is a random variable. We have earlier proved that this process is WSS with $\mu_X=0$ and
$$R_X(\tau) = \frac{A^2}{2}\cos\omega_0\tau$$
For any particular realization $x(t) = A\cos(\omega_0t+\phi_1)$,
$$\langle\mu_x\rangle_T = \frac{1}{2T}\int_{-T}^{T}A\cos(\omega_0t+\phi_1)\,dt = \frac{A\sin(\omega_0T)\cos\phi_1}{\omega_0T}$$
and
$$
\begin{aligned}
\bigl\langle R_x(\tau)\bigr\rangle_T &= \frac{1}{2T}\int_{-T}^{T}A\cos(\omega_0t+\phi_1)\,A\cos(\omega_0(t+\tau)+\phi_1)\,dt\\
&= \frac{A^2}{4T}\int_{-T}^{T}\bigl[\cos\omega_0\tau + \cos(\omega_0(2t+\tau)+2\phi_1)\bigr]\,dt\\
&= \frac{A^2}{2}\cos\omega_0\tau + \frac{A^2}{4\omega_0T}\cos(\omega_0\tau+2\phi_1)\sin(2\omega_0T)
\end{aligned}
$$
We see that, as $T\to\infty$, $\langle\mu_x\rangle_T\to0$ and
$$\bigl\langle R_x(\tau)\bigr\rangle_T\to\frac{A^2}{2}\cos\omega_0\tau$$
For each realization, both the time-averaged mean and the time-averaged autocorrelation function converge to the corresponding ensemble averages. Thus the random-phase sinusoid is ergodic in both mean and autocorrelation.

Remark A random process $X(t)$ is ergodic if its time averages converge in the M.S. sense to the corresponding ensemble averages. This is a stronger requirement than stationarity: the ensemble averages of all orders of such a process are independent of time. This implies that an ergodic process is necessarily stationary in the strict sense. The converse is not true; there are stationary random processes which are not ergodic.
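As a numerical illustration of the ergodicity of the random-phase sinusoid discussed above, the sketch below (Python/NumPy; the amplitude, frequency, window and lag are assumed values) computes the time averages from a single realization and compares them with the ensemble values $\mu_X=0$ and $R_X(\tau)=\frac{A^2}{2}\cos\omega_0\tau$.

```python
import numpy as np

# Sketch: time averages of ONE realization of X(t) = A cos(w0 t + phi), phi ~ U[0, 2pi].
rng = np.random.default_rng(7)
A, w0 = 1.5, 2 * np.pi * 3.0
T, dt, tau = 200.0, 1e-3, 0.1

phi = rng.uniform(0.0, 2 * np.pi)         # a single fixed phase for this realization
t = np.arange(-T, T, dt)
x = A * np.cos(w0 * t + phi)
x_shift = A * np.cos(w0 * (t + tau) + phi)

print("time-averaged mean        :", x.mean())                      # -> 0
print("time-averaged R_x(tau)    :", np.mean(x * x_shift))
print("ensemble A^2/2 cos(w0 tau):", A**2 / 2 * np.cos(w0 * tau))
```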


The following figure shows a hierarchical classification of random processes.

[Fig.: nested classes of random processes: ergodic processes, contained in the strict-sense stationary processes, contained in the WSS processes, contained in the class of all random processes]

Example Suppose $X(t) = C$ where $C\sim U[0,a]$. $X(t)$ is a family of horizontal straight lines, as illustrated below.

[Fig.: realizations $X(t) = 0,\ \tfrac{a}{4},\ \tfrac{a}{2},\ \tfrac{3a}{4},\ a$, each constant in $t$]

Here $\mu_X = \frac{a}{2}$, but
$$\langle\mu_X\rangle_T = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}C\,dt = C$$
is a different constant for different realizations. Hence $X(t)$ is not mean ergodic.

Spectral Representation of a Wide-sense Stationary Random Process

Recall that the Fourier transform (FT) of a real signal $g(t)$ is given by
$$G(\omega) = FT(g(t)) = \int_{-\infty}^{\infty}g(t)\,e^{-j\omega t}\,dt$$
where $e^{-j\omega t} = \cos\omega t - j\sin\omega t$ is the complex exponential. The Fourier transform $G(\omega)$ exists if $g(t)$ satisfies the following Dirichlet conditions:

1) $g(t)$ is absolutely integrable, that is, $\int_{-\infty}^{\infty}|g(t)|\,dt < \infty$
2) $g(t)$ has only a finite number of discontinuities in any finite interval
3) $g(t)$ has only a finite number of maxima and minima within any finite interval.

The signal $g(t)$ can be obtained from $G(\omega)$ by the inverse Fourier transform (IFT) as follows:
$$g(t) = IFT(G(\omega)) = \frac{1}{2\pi}\int_{-\infty}^{\infty}G(\omega)\,e^{j\omega t}\,d\omega$$
The existence of the inverse Fourier transform implies that we can represent a function $g(t)$ as a superposition of a continuum of complex sinusoids. The Fourier transform $G(\omega)$ is the strength of the sinusoids of frequency $\omega$ present in the signal. If $g(t)$ is a voltage signal measured in volts, $G(\omega)$ has the unit of volts per unit of radian frequency. The function $G(\omega)$ is also called the spectrum of $g(t)$.

We can define the Fourier transform also in terms of the frequency variable $f = \frac{\omega}{2\pi}$. In this case, the Fourier transform and the inverse Fourier transform are
$$G(f) = \int_{-\infty}^{\infty}g(t)\,e^{-j2\pi ft}\,dt\qquad\text{and}\qquad g(t) = \int_{-\infty}^{\infty}G(f)\,e^{j2\pi ft}\,df$$
The Fourier transform is a linear transform and has many interesting properties. In particular, the energy of the signal $g(t)$ is related to $G(\omega)$ by Parseval's theorem:
$$\int_{-\infty}^{\infty}g^2(t)\,dt = \frac{1}{2\pi}\int_{-\infty}^{\infty}|G(\omega)|^2\,d\omega$$
How do we obtain a frequency-domain representation of a random process, particularly a WSS process? The answer is the spectral representation of the WSS random process. Wiener (1930) and Khinchin (1934) independently discovered it; Einstein (1914) also used the concept.

Difficulty in Fourier Representation of a Random Process

We cannot define the Fourier transform of a WSS process $X(t)$ by the mean-square integral
$$FT(X(t)) = \int_{-\infty}^{\infty}X(t)\,e^{-j\omega t}\,dt$$
The existence of the above integral would have implied the existence of the Fourier transform of every realization of $X(t)$. But the very notion of stationarity demands that the realizations do not decay with time, so the first Dirichlet condition is violated. This difficulty is avoided by a frequency-domain representation of $X(t)$ in terms of the power spectral density (PSD). Recall that the power of a WSS process $X(t)$ is a constant, given by $E\,X^2(t)$. The PSD describes the distribution of this power over frequencies.

Definition of the Power Spectral Density of a WSS Process

Let us define the truncated process
$$X_T(t) = X(t)\,\operatorname{rect}\!\left(\frac{t}{2T}\right) = \begin{cases}X(t) & -T<t<T\\ 0 & \text{otherwise}\end{cases}$$
where $\operatorname{rect}\!\left(\frac{t}{2T}\right)$ is the unity-amplitude rectangular pulse of width $2T$ centred at the origin. As $T\to\infty$, $X_T(t)$ will represent the random process $X(t)$.

Define the mean-square integral
$$FTX_T(\omega) = \int_{-T}^{T}X_T(t)\,e^{-j\omega t}\,dt$$
Applying Parseval's theorem, we find the energy of the signal
$$\int_{-T}^{T}X_T^2(t)\,dt = \frac{1}{2\pi}\int_{-\infty}^{\infty}\bigl|FTX_T(\omega)\bigr|^2\,d\omega$$
Therefore, the power associated with $X_T(t)$ is
$$\frac{1}{2T}\int_{-T}^{T}X_T^2(t)\,dt = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{\bigl|FTX_T(\omega)\bigr|^2}{2T}\,d\omega$$
and the average power is given by
$$E\,\frac{1}{2T}\int_{-T}^{T}X_T^2(t)\,dt = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{E\,\bigl|FTX_T(\omega)\bigr|^2}{2T}\,d\omega$$
where $\frac{E|FTX_T(\omega)|^2}{2T}$ is the contribution to the average power at frequency $\omega$ and represents the power spectral density of $X_T(t)$. As $T\to\infty$, the left-hand side of the above expression represents the average power of $X(t)$. Therefore, the PSD $S_X(\omega)$ of the process $X(t)$ is defined in the limiting sense by
$$S_X(\omega) = \lim_{T\to\infty}\frac{E\,\bigl|FTX_T(\omega)\bigr|^2}{2T}$$

Relation between the autocorrelation function and the PSD: the Wiener-Khinchin-Einstein theorem

We have
$$
\begin{aligned}
E\,\frac{\bigl|FTX_T(\omega)\bigr|^2}{2T} &= \frac{1}{2T}\,E\,FTX_T(\omega)\,FTX_T^*(\omega)\\
&= \frac{1}{2T}\int_{-T}^{T}\int_{-T}^{T}E\,X(t_1)X(t_2)\,e^{-j\omega(t_1-t_2)}\,dt_1\,dt_2\\
&= \frac{1}{2T}\int_{-T}^{T}\int_{-T}^{T}R_X(t_1-t_2)\,e^{-j\omega(t_1-t_2)}\,dt_1\,dt_2
\end{aligned}
$$

[Fig.: the square region of integration $|t_1|\le T$, $|t_2|\le T$, with strips between the lines $t_1-t_2=\tau$ and $t_1-t_2=\tau+d\tau$; the corners correspond to $t_1-t_2=\pm2T$]

Note that the above integral is performed over the square region bounded by $t_1=\pm T$ and $t_2=\pm T$. Substitute $t_1-t_2=\tau$, so that $t_2=t_1-\tau$ describes a family of straight lines parallel to $t_1-t_2=0$. The differential area in terms of $\tau$ is given by the shaded strip and equals $(2T-|\tau|)\,d\tau$. The double integral is now replaced by a single integral in $\tau$. Therefore,
$$
\begin{aligned}
E\,\frac{\bigl|FTX_T(\omega)\bigr|^2}{2T} &= \frac{1}{2T}\int_{-2T}^{2T}R_X(\tau)\,e^{-j\omega\tau}\,(2T-|\tau|)\,d\tau\\
&= \int_{-2T}^{2T}R_X(\tau)\,e^{-j\omega\tau}\left(1-\frac{|\tau|}{2T}\right)d\tau
\end{aligned}
$$
If $R_X(\tau)$ is absolutely integrable, then the right-hand integral converges to $\int_{-\infty}^{\infty}R_X(\tau)e^{-j\omega\tau}\,d\tau$ as $T\to\infty$:
$$\lim_{T\to\infty}\frac{E\,\bigl|FTX_T(\omega)\bigr|^2}{2T} = \int_{-\infty}^{\infty}R_X(\tau)\,e^{-j\omega\tau}\,d\tau$$
As we have noted earlier, the power spectral density $S_X(\omega) = \lim_{T\to\infty}\frac{E|FTX_T(\omega)|^2}{2T}$ is the contribution to the average power at frequency $\omega$. Thus
$$S_X(\omega) = \int_{-\infty}^{\infty}R_X(\tau)\,e^{-j\omega\tau}\,d\tau$$
and, using the inverse Fourier transform,
$$R_X(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty}S_X(\omega)\,e^{j\omega\tau}\,d\omega$$
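The defining limit and the Wiener-Khinchin relation can be checked numerically. The sketch below (Python/NumPy; the chosen process, sampling step and averaging parameters are assumptions) averages periodograms of simulated realizations of the random telegraph signal of rate $\lambda=1$, whose $R_X(\tau)=e^{-2|\tau|}$ gives $S_X(\omega)=\frac{4}{4+\omega^2}$.

```python
import numpy as np

# Sketch: estimate S_X(w) = lim E|FTX_T(w)|^2 / 2T by averaging periodograms of the
# random telegraph signal (rate lam = 1). Here the record is [0, T), so the record
# length plays the role of 2T and we normalize by T.
rng = np.random.default_rng(8)
lam, dt, T, n_real = 1.0, 0.01, 50.0, 400

t = np.arange(0.0, T, dt)
w = 2 * np.pi * np.fft.rfftfreq(t.size, d=dt)       # angular frequencies of FFT bins

S_est = np.zeros(w.size)
for _ in range(n_real):
    increments = rng.poisson(lam * dt, size=t.size)  # Poisson event counts per bin
    x = rng.choice([-1.0, 1.0]) * (-1.0) ** np.cumsum(increments)
    FTX = np.fft.rfft(x) * dt                        # approximate Fourier integral
    S_est += np.abs(FTX) ** 2 / T                    # periodogram of this realization
S_est /= n_real

for k in (0, 20, 80):
    print(f"w={w[k]:6.2f}  estimated={S_est[k]:7.4f}  theory={4/(4+w[k]**2):7.4f}")
```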

Example 1 The autocorrelation function of a WSS process $X(t)$ is given by
$$R_X(\tau) = a^2e^{-b|\tau|},\qquad b>0$$
Find the power spectral density of the process.
$$
\begin{aligned}
S_X(\omega) &= \int_{-\infty}^{\infty}R_X(\tau)\,e^{-j\omega\tau}\,d\tau\\
&= \int_{-\infty}^{0}a^2e^{b\tau}e^{-j\omega\tau}\,d\tau + \int_{0}^{\infty}a^2e^{-b\tau}e^{-j\omega\tau}\,d\tau\\
&= \frac{a^2}{b-j\omega} + \frac{a^2}{b+j\omega}\\
&= \frac{2a^2b}{b^2+\omega^2}
\end{aligned}
$$
The autocorrelation function and the PSD are shown in Fig.
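A direct numerical check of this transform pair is sketched below (Python/NumPy; the values of $a$, $b$ and the integration grid are assumptions): the integral $\int R_X(\tau)e^{-j\omega\tau}d\tau$ is evaluated on a grid and compared with $\frac{2a^2b}{b^2+\omega^2}$.

```python
import numpy as np

# Sketch: numerically Fourier-transform R_X(tau) = a^2 exp(-b|tau|) and compare
# with the closed form 2 a^2 b / (b^2 + w^2).
a, b, dtau = 1.5, 2.0, 1e-3
tau = np.arange(-40.0, 40.0, dtau)
R = a**2 * np.exp(-b * np.abs(tau))

for w in (0.0, 1.0, 3.0):
    S_num = np.sum(R * np.exp(-1j * w * tau)).real * dtau   # Riemann-sum approximation
    S_th = 2 * a**2 * b / (b**2 + w**2)
    print(f"w={w:4.1f}  numeric={S_num:8.4f}  closed form={S_th:8.4f}")
```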

Example 2 Suppose $X(t) = A + B\sin(\omega_ct+\Phi)$, where $A$ is a constant bias and $\Phi\sim U[0,2\pi]$. Find $R_X(\tau)$ and $S_X(\omega)$.
$$
\begin{aligned}
R_X(\tau) &= E\,X(t+\tau)X(t)\\
&= E\bigl(A+B\sin(\omega_c(t+\tau)+\Phi)\bigr)\bigl(A+B\sin(\omega_ct+\Phi)\bigr)\\
&= A^2 + \frac{B^2}{2}\cos\omega_c\tau
\end{aligned}
$$
$$\therefore\ S_X(\omega) = 2\pi A^2\delta(\omega) + \frac{\pi B^2}{2}\bigl(\delta(\omega+\omega_c)+\delta(\omega-\omega_c)\bigr)$$
where $\delta(\omega)$ is the Dirac delta function.

[Fig.: $S_X(\omega)$ consists of an impulse at $\omega=0$ (from the dc component $A$) and impulses at $\omega=\pm\omega_c$ (from the sinusoid)]

Example 3 PSD of the amplitude-modulated random-phase sinusoid
$$X(t) = M(t)\cos(\omega_ct+\Phi),\qquad\Phi\sim U[0,2\pi]$$
where $M(t)$ is a WSS process independent of $\Phi$.
$$
\begin{aligned}
R_X(\tau) &= E\,M(t+\tau)\cos(\omega_c(t+\tau)+\Phi)\,M(t)\cos(\omega_ct+\Phi)\\
&= E\,M(t+\tau)M(t)\;E\cos(\omega_c(t+\tau)+\Phi)\cos(\omega_ct+\Phi)\qquad(\text{using the independence of }M(t)\text{ and the sinusoid})\\
&= \frac{1}{2}R_M(\tau)\cos\omega_c\tau
\end{aligned}
$$
$$\therefore\ S_X(\omega) = \frac{1}{4}\bigl(S_M(\omega+\omega_c)+S_M(\omega-\omega_c)\bigr)$$
where $S_M(\omega)$ is the PSD of $M(t)$.

[Fig.: $S_M(\omega)$ centred at $\omega=0$, and $S_X(\omega)$ consisting of scaled copies of $S_M$ shifted to $\pm\omega_c$]

Example 4 The PSD of a bandpass noise process is given by
$$S_N(\omega) = \begin{cases}\dfrac{N_0}{2} & \bigl||\omega|-\omega_c\bigr|\le\dfrac{\omega_0}{2}\\[4pt] 0 & \text{otherwise}\end{cases}$$
Find the autocorrelation of the process.
$$
\begin{aligned}
R_N(\tau) &= \frac{1}{2\pi}\int_{-\infty}^{\infty}S_N(\omega)\,e^{j\omega\tau}\,d\omega\\
&= \frac{1}{2\pi}\times2\times\frac{N_0}{2}\int_{\omega_c-\frac{\omega_0}{2}}^{\omega_c+\frac{\omega_0}{2}}\cos\omega\tau\,d\omega\\
&= \frac{N_0}{2\pi\tau}\left[\sin\!\left(\Bigl(\omega_c+\frac{\omega_0}{2}\Bigr)\tau\right) - \sin\!\left(\Bigl(\omega_c-\frac{\omega_0}{2}\Bigr)\tau\right)\right]\\
&= \frac{N_0}{\pi\tau}\,\sin\frac{\omega_0\tau}{2}\,\cos\omega_c\tau
\end{aligned}
$$

Properties of the PSD

Since $S_X(\omega)$ is the Fourier transform of $R_X(\tau)$, it shares the properties of the Fourier transform. Here we discuss important properties of $S_X(\omega)$.

• The average power of a random process $X(t)$ is
$$E\,X^2(t) = R_X(0) = \frac{1}{2\pi}\int_{-\infty}^{\infty}S_X(\omega)\,d\omega$$
• The average power in the band $[\omega_1,\omega_2]$ is
$$\frac{1}{\pi}\int_{\omega_1}^{\omega_2}S_X(\omega)\,d\omega$$
(the factor 2 relative to $\frac{1}{2\pi}$ accounts for the symmetric band $[-\omega_2,-\omega_1]$).

[Fig.: the average power in the frequency band $[\omega_1,\omega_2]$ shown as the shaded area under $S_X(\omega)$]

• If $X(t)$ is real, $R_X(\tau)$ is a real and even function of $\tau$. Therefore,
$$
\begin{aligned}
S_X(\omega) &= \int_{-\infty}^{\infty}R_X(\tau)\,e^{-j\omega\tau}\,d\tau\\
&= \int_{-\infty}^{\infty}R_X(\tau)(\cos\omega\tau - j\sin\omega\tau)\,d\tau\\
&= \int_{-\infty}^{\infty}R_X(\tau)\cos\omega\tau\,d\tau\\
&= 2\int_{0}^{\infty}R_X(\tau)\cos\omega\tau\,d\tau
\end{aligned}
$$
Thus $S_X(\omega)$ is a real and even function of $\omega$.

• From the definition, $S_X(\omega) = \lim_{T\to\infty}\frac{E|FTX_T(\omega)|^2}{2T}$ is always non-negative. Thus $S_X(\omega)\ge0$.

• If $X(t)$ has a periodic component, $R_X(\tau)$ is periodic and so $S_X(\omega)$ will have impulses.

Remark

1) The function $S_X(\omega)$ is the PSD of a WSS process $X(t)$ if and only if $S_X(\omega)$ is a non-negative, real and even function of $\omega$ and $\int_{-\infty}^{\infty}S_X(\omega)\,d\omega < \infty$.
2) The above condition on $S_X(\omega)$ also ensures that the corresponding autocorrelation function $R_X(\tau)$ is non-negative definite. Thus the non-negative definite property of an autocorrelation function can be tested through its power spectrum.
3) Recall that a periodic function has a Fourier series expansion. If $X(t)$ is M.S. periodic, we can have an equivalent Fourier series expansion of $X(t)$.

Cross Power Spectral Density

Consider a random process $Z(t)$ which is the sum of two real jointly WSS random processes $X(t)$ and $Y(t)$. As we have seen earlier,
$$R_Z(\tau) = R_X(\tau) + R_Y(\tau) + R_{XY}(\tau) + R_{YX}(\tau)$$
If we take the Fourier transform of both sides,
$$S_Z(\omega) = S_X(\omega) + S_Y(\omega) + FT\bigl(R_{XY}(\tau)\bigr) + FT\bigl(R_{YX}(\tau)\bigr)$$
where $FT(\cdot)$ stands for the Fourier transform. Thus we see that $S_Z(\omega)$ includes contributions from the Fourier transforms of the cross-correlation functions $R_{XY}(\tau)$ and $R_{YX}(\tau)$. These Fourier transforms represent cross power spectral densities.

Definition of the Cross Power Spectral Density

Given two real jointly WSS random processes $X(t)$ and $Y(t)$, the cross power spectral density (CPSD) $S_{XY}(\omega)$ is defined as
$$S_{XY}(\omega) = \lim_{T\to\infty}\frac{E\,FTX_T(\omega)\,FTY_T^*(\omega)}{2T}$$
where $FTX_T(\omega)$ and $FTY_T(\omega)$ are the Fourier transforms of the truncated processes $X_T(t)=X(t)\operatorname{rect}\!\left(\frac{t}{2T}\right)$ and $Y_T(t)=Y(t)\operatorname{rect}\!\left(\frac{t}{2T}\right)$ respectively, and $*$ denotes the complex conjugate operation.

We can similarly define $S_{YX}(\omega)$ by
$$S_{YX}(\omega) = \lim_{T\to\infty}\frac{E\,FTY_T(\omega)\,FTX_T^*(\omega)}{2T}$$
Proceeding in the same way as the derivation of the Wiener-Khinchin-Einstein theorem for the WSS process, it can be shown that
$$S_{XY}(\omega) = \int_{-\infty}^{\infty}R_{XY}(\tau)\,e^{-j\omega\tau}\,d\tau\qquad\text{and}\qquad S_{YX}(\omega) = \int_{-\infty}^{\infty}R_{YX}(\tau)\,e^{-j\omega\tau}\,d\tau$$
The cross-correlation function and the cross power spectral density thus form a Fourier transform pair, and we can write
$$R_{XY}(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty}S_{XY}(\omega)\,e^{j\omega\tau}\,d\omega\qquad\text{and}\qquad R_{YX}(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty}S_{YX}(\omega)\,e^{j\omega\tau}\,d\omega$$

Properties of the CPSD

The CPSD is a complex function of the frequency $\omega$. Some properties of the CPSD of two jointly WSS processes $X(t)$ and $Y(t)$ are listed below:

(1) $S_{XY}(\omega) = S_{YX}^*(\omega)$

Note that $R_{XY}(\tau) = R_{YX}(-\tau)$. Therefore
$$
\begin{aligned}
S_{XY}(\omega) &= \int_{-\infty}^{\infty}R_{XY}(\tau)\,e^{-j\omega\tau}\,d\tau\\
&= \int_{-\infty}^{\infty}R_{YX}(-\tau)\,e^{-j\omega\tau}\,d\tau\\
&= \int_{-\infty}^{\infty}R_{YX}(\tau)\,e^{j\omega\tau}\,d\tau\\
&= S_{YX}^*(\omega)
\end{aligned}
$$

(2) $\operatorname{Re}\bigl(S_{XY}(\omega)\bigr)$ is an even function of $\omega$ and $\operatorname{Im}\bigl(S_{XY}(\omega)\bigr)$ is an odd function of $\omega$.

We have
$$
\begin{aligned}
S_{XY}(\omega) &= \int_{-\infty}^{\infty}R_{XY}(\tau)(\cos\omega\tau - j\sin\omega\tau)\,d\tau\\
&= \int_{-\infty}^{\infty}R_{XY}(\tau)\cos\omega\tau\,d\tau - j\int_{-\infty}^{\infty}R_{XY}(\tau)\sin\omega\tau\,d\tau\\
&= \operatorname{Re}\bigl(S_{XY}(\omega)\bigr) + j\operatorname{Im}\bigl(S_{XY}(\omega)\bigr)
\end{aligned}
$$
where $\operatorname{Re}\bigl(S_{XY}(\omega)\bigr) = \int_{-\infty}^{\infty}R_{XY}(\tau)\cos\omega\tau\,d\tau$ is an even function of $\omega$, and $\operatorname{Im}\bigl(S_{XY}(\omega)\bigr) = -\int_{-\infty}^{\infty}R_{XY}(\tau)\sin\omega\tau\,d\tau$ is an odd function of $\omega$.

(3) If $X(t)$ and $Y(t)$ are uncorrelated and have constant means, then
$$S_{XY}(\omega) = S_{YX}(\omega) = 2\pi\mu_X\mu_Y\,\delta(\omega)$$
Observe that
$$
\begin{aligned}
R_{XY}(\tau) &= E\,X(t+\tau)Y(t)\\
&= E\,X(t+\tau)\,E\,Y(t)\\
&= \mu_X\mu_Y = R_{YX}(\tau)\\
\therefore\ S_{XY}(\omega) &= S_{YX}(\omega) = 2\pi\mu_X\mu_Y\,\delta(\omega)
\end{aligned}
$$

(4) If $X(t)$ and $Y(t)$ are orthogonal, then $S_{XY}(\omega) = S_{YX}(\omega) = 0$. Indeed, if $X(t)$ and $Y(t)$ are orthogonal,
$$R_{XY}(\tau) = E\,X(t+\tau)Y(t) = 0 = R_{YX}(\tau)$$
$$\therefore\ S_{XY}(\omega) = S_{YX}(\omega) = 0$$

(5) The cross power $P_{XY}$ between $X(t)$ and $Y(t)$ is defined by
$$P_{XY} = \lim_{T\to\infty}\frac{1}{2T}\,E\int_{-T}^{T}X(t)Y(t)\,dt$$
Applying Parseval's theorem, we get
$$
\begin{aligned}
P_{XY} &= \lim_{T\to\infty}\frac{1}{2T}\,E\int_{-T}^{T}X_T(t)Y_T(t)\,dt\\
&= \lim_{T\to\infty}\frac{1}{2T}\cdot\frac{1}{2\pi}\,E\int_{-\infty}^{\infty}FTX_T(\omega)\,FTY_T^*(\omega)\,d\omega\\
&= \frac{1}{2\pi}\int_{-\infty}^{\infty}\lim_{T\to\infty}\frac{E\,FTX_T(\omega)\,FTY_T^*(\omega)}{2T}\,d\omega\\
&= \frac{1}{2\pi}\int_{-\infty}^{\infty}S_{XY}(\omega)\,d\omega
\end{aligned}
$$
$$\therefore\ P_{XY} = \frac{1}{2\pi}\int_{-\infty}^{\infty}S_{XY}(\omega)\,d\omega$$
Similarly,
$$P_{YX} = \frac{1}{2\pi}\int_{-\infty}^{\infty}S_{YX}(\omega)\,d\omega = P_{XY}^*$$

Example Consider the random process $Z(t) = X(t)+Y(t)$ discussed at the beginning of the lecture. Here $Z(t)$ is the sum of two jointly WSS random processes $X(t)$ and $Y(t)$. We have
$$R_Z(\tau) = R_X(\tau) + R_Y(\tau) + R_{XY}(\tau) + R_{YX}(\tau)$$
Taking the Fourier transform of both sides,
$$S_Z(\omega) = S_X(\omega) + S_Y(\omega) + S_{XY}(\omega) + S_{YX}(\omega)$$
$$\therefore\ \frac{1}{2\pi}\int_{-\infty}^{\infty}S_Z(\omega)\,d\omega = \frac{1}{2\pi}\int_{-\infty}^{\infty}S_X(\omega)\,d\omega + \frac{1}{2\pi}\int_{-\infty}^{\infty}S_Y(\omega)\,d\omega + \frac{1}{2\pi}\int_{-\infty}^{\infty}S_{XY}(\omega)\,d\omega + \frac{1}{2\pi}\int_{-\infty}^{\infty}S_{YX}(\omega)\,d\omega$$
Therefore,
$$P_Z = P_X + P_Y + P_{XY} + P_{YX}$$

Remark

• $P_{XY} + P_{YX}$ is the additional power contributed by the interaction of $X(t)$ and $Y(t)$ to the resulting power of $X(t)+Y(t)$.
• If $X(t)$ and $Y(t)$ are orthogonal, then
$$S_Z(\omega) = S_X(\omega) + S_Y(\omega) + 0 + 0 = S_X(\omega) + S_Y(\omega)$$
Consequently,
$$P_Z = P_X + P_Y$$
Thus in the case of two jointly WSS orthogonal processes, the power of the sum of the processes is equal to the sum of the respective powers.

Power spectral density of a discrete-time WSS random process

Suppose g[n] is a discrete-time real signal. Assume g[n] to be obtained by sampling a continuous-time signal g(t) at a uniform interval T such that

g[n] = g(nT), n = 0, ±1, ±2, ...

The discrete-time Fourier transform (DTFT) of the signal g[n] is defined by

G(ω) = Σ_{n=-∞}^{∞} g[n] e^{-jωn}


G(ω) exists if g[n] is absolutely summable, that is, Σ_{n=-∞}^{∞} |g[n]| < ∞. The signal g[n] is obtained from G(ω) by the inverse discrete-time Fourier transform

g[n] = (1/2π) ∫_{-π}^{π} G(ω) e^{jωn} dω

The following observations about the DTFT are important:

• ω is a frequency variable representing the frequency of a discrete sinusoid. Thus the signal g[n] = A cos(ω_0 n) has a frequency of ω_0 radians per sample.

• G(ω) is always periodic in ω with a period of 2π. Thus G(ω) is uniquely defined in the interval -π ≤ ω ≤ π.

• Suppose g[n] is obtained by sampling a continuous-time signal g_a(t) at a uniform interval T such that

g[n] = g_a(nT), n = 0, ±1, ±2, ...

The frequency ω of the discrete-time signal is related to the frequency Ω of the continuous-time signal by the relation

ω = ΩT

where T is the uniform sampling interval. The symbol Ω for the frequency of a continuous-time signal is used in the signal-processing literature just to distinguish it from the corresponding frequency ω of the discrete-time signal. This is illustrated in the Fig. below.

• We can define the z-transform of the discrete-time signal by the relation


G(z) = Σ_{n=-∞}^{∞} g[n] z^{-n}

where z is a complex variable. G(ω) is related to G(z) by

G(ω) = G(z)|_{z = e^{jω}}
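As a small numerical illustration of these definitions (not part of the original notes), the DTFT of a short, arbitrarily chosen sequence can be evaluated directly on a frequency grid and compared with the z-transform evaluated on the unit circle; the 2π-periodicity noted above can be checked the same way.

    import numpy as np

    g = np.array([1.0, 2.0, 3.0, 2.0, 1.0])   # finite-length sequence g[n], n = 0..4 (arbitrary)
    n = np.arange(g.size)
    w = np.linspace(-np.pi, np.pi, 513)        # frequency grid, radians/sample

    # G(w) = sum_n g[n] e^{-jwn}
    G = np.array([np.sum(g * np.exp(-1j * wk * n)) for wk in w])

    # G(w) equals the z-transform evaluated at z = e^{jw}
    z = np.exp(1j * w)
    G_from_z = np.array([np.sum(g * zk ** (-n)) for zk in z])
    print(np.allclose(G, G_from_z))            # True

    # Periodicity: G(w + 2*pi) = G(w)
    G_shift = np.array([np.sum(g * np.exp(-1j * (wk + 2 * np.pi) * n)) for wk in w])
    print(np.allclose(G, G_shift))             # True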

Power spectrum of a discrete-time real WSS process X[n]

Consider a discrete-time real WSS process X[n]. The very notion of stationarity poses a problem for the frequency-domain representation of X[n] through the discrete-time Fourier transform. The difficulty is avoided, as in the case of the continuous-time WSS process, by defining the truncated process

X_N[n] = X[n] for |n| ≤ N
       = 0 otherwise

The power spectral density S_X(ω) of the process X[n] is defined as

S_X(ω) = lim_{N→∞} (1/(2N+1)) E |DTFT X_N(ω)|²

where

DTFT X_N(ω) = Σ_{n=-∞}^{∞} X_N[n] e^{-jωn} = Σ_{n=-N}^{N} X[n] e^{-jωn}

Note that the average power of X[n] is R_X[0] = E X²[n], and the power spectral density S_X(ω) indicates the contribution to the average power of the sinusoidal component of frequency ω.
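The limiting definition can be mimicked numerically by averaging |DTFT X_N(ω)|²/(2N+1) over independent realizations of the truncated process. The sketch below is an illustration under assumed parameters (a white Gaussian sequence of variance 2, a finite N and a finite number of trials), for which the PSD should come out approximately flat at the value of the variance.

    import numpy as np

    rng = np.random.default_rng(1)
    N = 256                                    # truncation: n = -N..N  (2N+1 samples)
    sigma2 = 2.0                               # variance of the white sequence (arbitrary)
    w = np.linspace(-np.pi, np.pi, 101)
    n = np.arange(-N, N + 1)

    S_est = np.zeros_like(w)
    trials = 200                               # ensemble average approximates E[.]
    for _ in range(trials):
        x = rng.normal(scale=np.sqrt(sigma2), size=2 * N + 1)    # X_N[n]
        X = np.exp(-1j * np.outer(w, n)) @ x                     # DTFT X_N(w) on the grid
        S_est += np.abs(X) ** 2 / (2 * N + 1)
    S_est /= trials

    print(S_est.mean(), sigma2)   # flat level close to sigma^2, which is S_X(w) for white noise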

Wiener-Einstein-Khinchin theorem

The Wiener-Einstein-Khinchin theorem is also valid for discrete-time random processes.

The power spectral density S_X(ω) of the WSS process X[n] is the discrete-time Fourier transform of the autocorrelation sequence:

S_X(ω) = Σ_{m=-∞}^{∞} R_X[m] e^{-jωm},  -π ≤ ω ≤ π

R_X[m] is related to S_X(ω) by the inverse discrete-time Fourier transform:

R_X[m] = (1/2π) ∫_{-π}^{π} S_X(ω) e^{jωm} dω


Thus R_X[m] and S_X(ω) form a discrete-time Fourier transform pair. A generalized PSD can be defined in terms of the z-transform as follows:

S_X(z) = Σ_{m=-∞}^{∞} R_X[m] z^{-m}

Clearly, S_X(ω) = S_X(z)|_{z = e^{jω}}

Example Suppose R_X[m] = 2^{-|m|}, m = 0, ±1, ±2, ±3, .... Then

S_X(ω) = Σ_{m=-∞}^{∞} R_X[m] e^{-jωm}
       = 1 + Σ_{m≠0} 2^{-|m|} e^{-jωm}
       = 3 / (5 - 4 cos ω)

The plots of the autocorrelation sequence and the power spectral density are shown in Fig. below.
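The closed form can be verified numerically by truncating the defining sum, since 2^{-|m|} decays rapidly; the grid size and truncation length below are arbitrary choices in this illustrative check.

    import numpy as np

    w = np.linspace(-np.pi, np.pi, 201)
    m = np.arange(-50, 51)                      # truncated lag range; 2^{-|m|} decays fast
    R = 2.0 ** (-np.abs(m))                     # R_X[m] = 2^{-|m|}

    S_numeric = (np.exp(-1j * np.outer(w, m)) @ R).real     # sum_m R_X[m] e^{-jwm}
    S_closed = 3.0 / (5.0 - 4.0 * np.cos(w))                # closed form from the example

    print(np.max(np.abs(S_numeric - S_closed)))             # ~1e-15: the two agree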

Properties of the PSD of a discrete-time WSS process


• For the real discrete-time process X[n], the autocorrelation function R_X[m] is real and even. Therefore, S_X(ω) is real and even.
• S_X(ω) ≥ 0.
• The average power of X[n] is given by

  E X²[n] = R_X[0] = (1/2π) ∫_{-π}^{π} S_X(ω) dω

  Similarly, the average power in the frequency band [ω_1, ω_2] is given by

  2 × (1/2π) ∫_{ω_1}^{ω_2} S_X(ω) dω

  (the factor 2 accounts for the mirror band [-ω_2, -ω_1]).
• S_X(ω) is periodic in ω with a period of 2π.

Interpretation of the power spectrum of a discrete-time WSS process

Assume that the discrete-time WSS process X[n] is obtained by sampling a continuous-time random process X_a(t) at a uniform interval T, that is,

X[n] = X_a(nT), n = 0, ±1, ±2, ...

The autocorrelation function R_X[m] is defined by

R_X[m] = E X[n + m] X[n]
       = E X_a(nT + mT) X_a(nT)
       = R_Xa(mT)

∴ R_X[m] = R_Xa(mT), m = 0, ±1, ±2, ...

Thus the sequence R_X[m] is obtained by sampling the autocorrelation function R_Xa(τ) at a uniform interval T. The frequency ω of the discrete-time WSS process is related to the frequency Ω of the continuous-time process by the relation ω = ΩT.


White noise process

A white noise process W(t) is defined by

S_W(ω) = N_0/2,  -∞ < ω < ∞

where N_0 is a real constant called the intensity of the white noise. The corresponding autocorrelation function is given by

R_W(τ) = (N_0/2) δ(τ)

where δ(τ) is the Dirac delta.

The average power of white noise is

P_avg = E W²(t) = (1/2π) ∫_{-∞}^{∞} (N_0/2) dω → ∞

The autocorrelation function R_W(τ) = (N_0/2)δ(τ) and the PSD S_W(ω) = N_0/2 of a white noise process are shown in Fig. below.

Remarks

• The term white noise is analogous to white light which contains all visible light

frequencies.

• We generally consider a zero-mean white noise process.

• A white noise process is unpredictable, as the noise samples at different instants of time are uncorrelated: C_W(t_i, t_j) = 0 for t_i ≠ t_j.

• White noise is a mathematical abstraction; it cannot be realized physically, since it has infinite average power.

• If the system bandwidth is sufficiently narrower than the noise bandwidth and the noise PSD is flat over the system bandwidth, we can model the noise as a white noise process. Thermal noise, which is the noise generated in resistors due to the random motion of electrons, is well modelled as white Gaussian noise, since it has a very flat PSD over a very wide band (several GHz).



• A white noise process can have any probability density function. In particular, if the white noise process W(t) is a Gaussian random process, then W(t) is called a white Gaussian random process.

• A white noise process is called a strict-sense white noise process if the noise samples at distinct instants of time are independent. A white Gaussian noise process is a strict-sense white noise process. Such a process represents a 'purely' random process, because its samples at arbitrarily close instants are also independent.

Example A random-phase sinusoid corrupted by white noise

Suppose X(t) = B sin(ω_c t + Φ) + W(t), where B is a constant amplitude, Φ ~ U[0, 2π], and W(t) is a zero-mean WGN process with PSD N_0/2, independent of Φ.

Find R_X(τ) and S_X(ω).

R_X(τ) = E X(t + τ) X(t)
       = E [B sin(ω_c(t + τ) + Φ) + W(t + τ)][B sin(ω_c t + Φ) + W(t)]
       = (B²/2) cos ω_c τ + R_W(τ)

∴ S_X(ω) = (πB²/2)[δ(ω + ω_c) + δ(ω - ω_c)] + N_0/2

where δ(ω) is the Dirac delta function.
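A sampled simulation makes the result plausible: the PSD estimate of one realization should show a spectral line at the sinusoid frequency on top of a flat noise floor. The sketch below is an illustration with arbitrary parameter choices; the discrete sequence approximating W(t) over a sampling interval Δt is taken as white Gaussian noise of variance N_0/(2Δt), the variance of ideally band-limited white noise over the corresponding Nyquist band.

    import numpy as np
    from scipy import signal

    rng = np.random.default_rng(2)
    dt = 1e-3                                # sampling interval (arbitrary)
    fs = 1.0 / dt
    t = np.arange(200_000) * dt
    B_amp, fc, N0 = 2.0, 50.0, 0.01          # amplitude, sinusoid frequency (Hz), noise intensity

    phi = rng.uniform(0.0, 2.0 * np.pi)      # one realization of the random phase
    w = rng.normal(scale=np.sqrt(N0 / (2 * dt)), size=t.size)   # discrete approx. of W(t)
    x = B_amp * np.sin(2 * np.pi * fc * t + phi) + w

    f, Sx = signal.welch(x, fs=fs, nperseg=4096)
    print(f[np.argmax(Sx)])                  # ~50 Hz: the spectral line at f_c
    print(np.median(Sx), N0)                 # noise floor ~ N0 (one-sided PSD = 2 x N0/2)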

Band-limited white noise

A noise process which has a non-zero constant PSD over a finite frequency band and zero PSD elsewhere is called band-limited white noise. Thus the WSS process X(t) is band-limited white noise if

S_X(ω) = N_0/2,  -B < ω < B, and 0 elsewhere.

For example, thermal noise which has constant PSD up to very high frequency is better

modeled by a band-limited white noise process.

The corresponding autocorrelation function R_X(τ) is given by

R_X(τ) = (N_0 B / 2π) · sin(Bτ)/(Bτ)


The plots of S_X(ω) and R_X(τ) of a band-limited white noise process are shown in Fig. below. Further assume that X(t) is a zero-mean process.

Observe that

• The average power of the process is E X²(t) = R_X(0) = N_0 B / 2π.

• R_X(τ) = 0 for τ = ±π/B, ±2π/B, ±3π/B, .... This means that X(t) and X(t + nπ/B), where n is a non-zero integer, are uncorrelated. Thus we can get uncorrelated samples by sampling a band-limited white noise process at a uniform interval of π/B (see the numerical sketch after this list).

• A band-limited white noise process may also be a band-pass process with PSD as shown in the Fig. and given by

  S_X(ω) = N_0/2 for | |ω| - ω_0 | < B/2
         = 0 otherwise

  The corresponding autocorrelation function is given by

  R_X(τ) = (N_0 B / 2π) · [sin(Bτ/2)/(Bτ/2)] · cos ω_0 τ
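The uncorrelatedness of samples spaced π/B apart can be checked by simulation. The sketch below (an illustration under assumed parameters, not part of the original notes) approximates low-pass band-limited white noise by brick-wall filtering a white Gaussian sequence in the frequency domain and then estimates sample correlations at lags π/B and π/(2B).

    import numpy as np

    rng = np.random.default_rng(3)
    dt = 1e-3                       # simulation step (arbitrary)
    n_samp = 2 ** 20
    B = np.pi / (10 * dt)           # choose B so that pi/B is exactly 10 samples

    # Brick-wall low-pass filtering of white noise in the frequency domain
    w = rng.standard_normal(n_samp)
    W = np.fft.rfft(w)
    f = np.fft.rfftfreq(n_samp, d=dt)             # Hz
    W[2 * np.pi * f > B] = 0.0                    # keep |omega| <= B only
    x = np.fft.irfft(W, n_samp)

    def corr(x, k):
        """Sample correlation coefficient between x[n] and x[n+k]."""
        return np.corrcoef(x[:-k], x[k:])[0, 1]

    print(corr(x, 10))   # lag = pi/B     -> close to 0 (uncorrelated samples)
    print(corr(x, 5))    # lag = pi/(2B)  -> clearly non-zero, ~ sin(pi/2)/(pi/2) = 0.64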


Coloured Noise

A noise process which is not white is called coloured noise. Thus the noise process X(t) with

R_X(τ) = a² e^{-b|τ|},  b > 0

and PSD

S_X(ω) = 2a²b / (b² + ω²)

is an example of coloured noise.

White Noise Sequence
A random sequence W[n] is called a white noise sequence if

S_W(ω) = N/2,  -π ≤ ω ≤ π

Therefore

R_W[m] = (N/2) δ[m]


where δ[m] is the unit impulse sequence. The autocorrelation function and the PSD of a white noise sequence are shown in Fig.

A realization of a white noise sequence is shown in Fig. below.

Remark

• The average power of the white noise sequence is

  E W²[n] = R_W[0] = (1/2π) × 2π × (N/2) = N/2

  The average power of the white noise sequence is finite and uniformly distributed over all frequencies.

• If the white noise sequence W[n] is a Gaussian sequence, then W[n] is called a white Gaussian noise (WGN) sequence.

• An i.i.d. random sequence is always white. Such a sequence may be called a strict-sense white noise sequence. A WGN sequence is a strict-sense stationary white noise sequence.

• The model of a white noise sequence looks artificial, but it plays a key role in random signal modelling. It plays a role similar to that of the impulse function in the modelling of deterministic signals. A class of WSS processes called regular processes can be considered as the output of a linear system with white noise as input, as illustrated in Fig.

• The notion of the sequence of i.i.d. random variables is also important in statistical inference.

Figure: White noise process → Linear system → Regular WSS random process


Response of a linear time-invariant system to WSS input: In many applications, physical systems are modeled as linear time-invariant (LTI) systems. The dynamic behavior of an LTI system for deterministic inputs is described by linear differential equations. We are familiar with time-domain and transform-domain (such as Laplace transform and Fourier transform) techniques to solve these equations. In this lecture, we develop the techniques to analyse the response of an LTI system to a WSS random process. The purpose of this study is twofold:

(1) Analysis of the response of a system.
(2) Finding an LTI system that can optimally estimate an unobserved random process from an observed process. The observed random process is statistically related to the unobserved random process. For example, we may have to find an LTI system (also called a filter) to estimate the signal from noisy observations.

Basics of Linear Time Invariant Systems: A system is modeled by a transformation T that maps an input signal x(t) to an output signal y(t). We can thus write y(t) = T[x(t)].

Linear system: The system is called linear if superposition applies: the weighted sum of inputs results in the weighted sum of the corresponding outputs. Thus for a linear system

T[a_1 x_1(t) + a_2 x_2(t)] = a_1 T[x_1(t)] + a_2 T[x_2(t)]

Example: Consider the output of a differentiator, given by

y(t) = dx(t)/dt

Then

d/dt [a_1 x_1(t) + a_2 x_2(t)] = a_1 dx_1(t)/dt + a_2 dx_2(t)/dt

Hence the differentiator is a linear system.


Linear time-invariant system: Consider a linear system with y(t) = T[x(t)]. The system is called time-invariant if

T[x(t - t_0)] = y(t - t_0) for all t_0

It is easy to check that the differentiator in the above example is a linear time-invariant system.

Causal system: The system is called causal if the output of the system at t = t_0 depends only on the present and past values of the input. Thus for a causal system

y(t_0) = T[x(t), t ≤ t_0]

Response of a linear time-invariant system to a deterministic input: A linear system can be characterised by its impulse response h(t) = T[δ(t)], where δ(t) is the Dirac delta function.

Recall that any function x(t) can be represented in terms of the Dirac delta function as

x(t) = ∫_{-∞}^{∞} x(s) δ(t - s) ds

If x(t) is input to the linear system y(t) = T[x(t)], then, using the linearity property,

y(t) = T[∫_{-∞}^{∞} x(s) δ(t - s) ds]
     = ∫_{-∞}^{∞} x(s) T[δ(t - s)] ds
     = ∫_{-∞}^{∞} x(s) h(t, s) ds

where h(t, s) is the response at time t due to the shifted impulse δ(t - s).

If the system is time-invariant, h(t, s) = h(t - s). Therefore, for a linear time-invariant system,

y(t) = ∫_{-∞}^{∞} x(s) h(t - s) ds = x(t) * h(t)

where * denotes the convolution operation. Thus, for an LTI system,

y(t) = x(t) * h(t) = h(t) * x(t)


Figure: x(t) → LTI system h(t) → y(t);   X(ω) → LTI system H(ω) → Y(ω)

Taking the Fourier transform, we get

Y(ω) = H(ω) X(ω)

where H(ω) = FT h(t) = ∫_{-∞}^{∞} h(t) e^{-jωt} dt is the frequency response of the system.

Response of an LTI system to WSS input
Consider an LTI system with impulse response h(t). Suppose X(t) is a WSS process input to the system. The output Y(t) of the system is given by

Y(t) = ∫_{-∞}^{∞} h(s) X(t - s) ds = ∫_{-∞}^{∞} h(t - s) X(s) ds

where we have assumed that the integrals exist in the mean-square (m.s.) sense.

Mean and autocorrelation of the output process Y(t)
The mean of the output process is given by

E Y(t) = E ∫_{-∞}^{∞} h(s) X(t - s) ds
       = ∫_{-∞}^{∞} h(s) E X(t - s) ds
       = μ_X ∫_{-∞}^{∞} h(s) ds
       = μ_X H(0)

where H(0) is the frequency response H(ω) at zero frequency (ω = 0), given by

H(0) = H(ω)|_{ω=0} = ∫_{-∞}^{∞} h(t) e^{-jωt} dt |_{ω=0} = ∫_{-∞}^{∞} h(t) dt

Therefore, the mean of the output process Y(t) is a constant.

The cross-correlation of the input X(t) and the output Y(t) is given by

E X(t + τ) Y(t) = E [X(t + τ) ∫_{-∞}^{∞} h(s) X(t - s) ds]
                = ∫_{-∞}^{∞} h(s) E X(t + τ) X(t - s) ds
                = ∫_{-∞}^{∞} h(s) R_X(τ + s) ds
                = ∫_{-∞}^{∞} h(-u) R_X(τ - u) du   [putting s = -u]
                = h(-τ) * R_X(τ)

∴ R_XY(τ) = h(-τ) * R_X(τ)

Also, R_YX(τ) = R_XY(-τ) = h(τ) * R_X(-τ) = h(τ) * R_X(τ)

Thus we see that R_XY(τ) is a function of the lag τ only. Therefore, X(t) and Y(t) are jointly wide-sense stationary.

The autocorrelation function of the output process Y(t) is given by

E Y(t + τ) Y(t) = E [∫_{-∞}^{∞} h(s) X(t + τ - s) ds  Y(t)]
                = ∫_{-∞}^{∞} h(s) E X(t + τ - s) Y(t) ds
                = ∫_{-∞}^{∞} h(s) R_XY(τ - s) ds
                = h(τ) * R_XY(τ) = h(τ) * h(-τ) * R_X(τ)

Thus the autocorrelation of the output process Y(t) depends on the time lag τ only, i.e., E Y(t + τ) Y(t) = R_Y(τ). Thus

R_Y(τ) = R_X(τ) * h(τ) * h(-τ)

The above analysis indicates that for an LTI system with WSS input

(1) the output is WSS and


(2) the input and output are jointly WSS.

The average power of the output process Y(t) is given by

P_Y = R_Y(0) = [R_X(τ) * h(τ) * h(-τ)]|_{τ=0}

Power spectrum of the output process
Using the properties of the Fourier transform, we get the power spectral density of the output process:

S_Y(ω) = S_X(ω) H(ω) H*(ω) = S_X(ω) |H(ω)|²

Also note that

R_XY(τ) = h(-τ) * R_X(τ) and R_YX(τ) = h(τ) * R_X(τ)

Taking the Fourier transform of R_XY(τ), we get the cross-power spectral density S_XY(ω):

S_XY(ω) = H*(ω) S_X(ω) and S_YX(ω) = H(ω) S_X(ω)

Figure: R_X(τ) → h(-τ) → R_XY(τ) → h(τ) → R_Y(τ);   S_X(ω) → H*(ω) → S_XY(ω) → H(ω) → S_Y(ω)
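The relation S_Y(ω) = |H(ω)|² S_X(ω) can be illustrated numerically. Since a direct continuous-time simulation is awkward, the sketch below uses a sampled, discrete-time stand-in: white noise is passed through an arbitrarily chosen FIR filter and the Welch PSD estimates of input and output are compared against |H(ω)|²; the filter and all parameters are assumptions made only for this illustration.

    import numpy as np
    from scipy import signal

    rng = np.random.default_rng(4)
    b, a = [1.0, 0.5, 0.25], [1.0]               # an arbitrary FIR filter h[n] (assumption)
    x = rng.standard_normal(500_000)             # discrete white noise, unit variance
    y = signal.lfilter(b, a, x)

    # Welch estimates of input and output PSDs on the same frequency grid (rad/sample)
    w, Sx = signal.welch(x, fs=2 * np.pi, nperseg=2048, return_onesided=False)
    _, Sy = signal.welch(y, fs=2 * np.pi, nperseg=2048, return_onesided=False)
    _, H = signal.freqz(b, a, worN=w)            # frequency response H(w)

    ratio = Sy / (np.abs(H) ** 2 * Sx)
    print(np.median(ratio))                      # ~ 1, i.e. S_Y(w) = |H(w)|^2 S_X(w)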


Example:

(a) A white noise process X(t) with power spectral density N_0/2 is input to an ideal low-pass filter of bandwidth B. Find the PSD and autocorrelation function of the output process.

Figure: Ideal low-pass filter frequency response H(ω), passband -B ≤ ω ≤ B.

The input process X(t) is white noise with power spectral density S_X(ω) = N_0/2.

The output power spectral density S_Y(ω) is given by

S_Y(ω) = |H(ω)|² S_X(ω)
       = 1 × N_0/2,  -B ≤ ω ≤ B
       = N_0/2,      -B ≤ ω ≤ B

∴ R_Y(τ) = inverse Fourier transform of S_Y(ω)
         = (1/2π) ∫_{-B}^{B} (N_0/2) e^{jωτ} dω
         = (N_0 B / 2π) · sin(Bτ)/(Bτ)

The output PSD S_Y(ω) and the output autocorrelation function R_Y(τ) are illustrated in Fig. below.


Example 2: A random voltage modeled by a white noise process X(t) with power spectral density N_0/2 is input to the RC network shown in the Fig.

Find (a) the output PSD S_Y(ω),
(b) the output autocorrelation function R_Y(τ),
(c) the average output power E Y²(t).


The frequency response of the system is given by

H(ω) = (1/jωC) / (R + 1/jωC) = 1 / (1 + jωRC)

Therefore,

(a) S_Y(ω) = |H(ω)|² S_X(ω)
           = S_X(ω) / (1 + ω²R²C²)
           = (N_0/2) / (1 + ω²R²C²)

(b) Taking the inverse Fourier transform,

R_Y(τ) = (N_0 / 4RC) e^{-|τ|/RC}

(c) The average output power is

E Y²(t) = R_Y(0) = N_0 / 4RC
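Part (c) can be sanity-checked by numerically integrating the PSD of part (a): (1/2π)∫S_Y(ω)dω should reproduce N_0/(4RC). The component values and noise intensity below are arbitrary assumptions made only for this check.

    import numpy as np

    R, C, N0 = 1.0e3, 1.0e-6, 4.0e-3        # ohms, farads, noise intensity (arbitrary)

    w = np.linspace(-1e6, 1e6, 2_000_001)   # wide frequency range in rad/s
    S_Y = (N0 / 2.0) / (1.0 + (w * R * C) ** 2)

    power_numeric = np.trapz(S_Y, w) / (2.0 * np.pi)
    power_closed = N0 / (4.0 * R * C)

    print(power_numeric, power_closed)      # both ~ 1.0 here, since N0/(4RC) = 4e-3/4e-3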


Discrete-time Linear Shift-Invariant System with WSS Random Inputs
We have seen that the Dirac delta function δ(t) plays a very important role in the analysis of the response of continuous-time LTI systems to deterministic and random inputs. A similar role in the case of the discrete-time LTI system is played by the unit sample sequence δ[n], defined by

δ[n] = 1 for n = 0
     = 0 otherwise

Any discrete-time signal x[n] can be expressed in terms of δ[n] as follows:

x[n] = Σ_{k=-∞}^{∞} x[k] δ[n - k]

A discrete-time linear shift-invariant system is characterized by its unit sample response h[n], which is the output of the system to the unit sample sequence δ[n].

Figure: δ[n] → LTI system → h[n]

The transfer function of such a system is given by

H(ω) = Σ_{n=-∞}^{∞} h[n] e^{-jωn}

An analysis similar to that for the continuous-time LTI system shows that the response y[n] of a linear shift-invariant system with impulse response h[n] to a deterministic input x[n] is given by

y[n] = Σ_{k=-∞}^{∞} x[k] h[n - k] = x[n] * h[n]

Consider a discrete-time linear system with impulse response h[n] and WSS input X[n].

Figure: X[n] → h[n] → Y[n]

Y[n] = X[n] * h[n]

E Y[n] = E(X[n] * h[n]) = Σ_{k=-∞}^{∞} h[k] E X[n - k]

For the WSS input X[n],

μ_Y = E Y[n] = μ_X Σ_{n=-∞}^{∞} h[n] = μ_X H(0)


where H(0) is the dc gain of the system, given by

H(0) = H(ω)|_{ω=0} = Σ_{n=-∞}^{∞} h[n] e^{-jωn}|_{ω=0} = Σ_{n=-∞}^{∞} h[n]

R_Y[m] = E Y[n] Y[n - m]
       = E (X[n] * h[n])(X[n - m] * h[n - m])
       = R_X[m] * h[m] * h[-m]

which is a function of the lag m only.

From the above we get S_Y(ω) = |H(ω)|² S_X(ω).

Figure: S_X(ω) → |H(ω)|² → S_Y(ω)

• Note that even though the input may be an uncorrelated (white) process, the output is in general a correlated process.

Consider the case of a discrete-time system with a random sequence x[n] as input.

R_Y[m] = R_X[m] * h[m] * h[-m]

Taking the z-transform, we get

S_Y(z) = H(z) H(z^{-1}) S_X(z)

Notice that if H(z) is causal, then H(z^{-1}) is anti-causal. Similarly, if H(z) is minimum-phase, then H(z^{-1}) is maximum-phase.

Figure: R_X[m] → h[m], h[-m] → R_Y[m];   S_X(z) → H(z) H(z^{-1}) → S_Y(z)


Example

If H(z) = 1/(1 - α z^{-1}) and x[n] is a unity-variance white-noise sequence, then, given E X²[n] = σ_X² = 1,

S_X(ω) = σ_X² = 1,  -π ≤ ω ≤ π

∴ S_Y(z) = H(z) H(z^{-1}) S_X(z)
         = (1/(1 - α z^{-1})) (1/(1 - α z))

By partial fraction expansion and inverse z-transform, we get

R_Y[m] = α^{|m|} / (1 - α²)
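This closed form can be checked by simulation: drive the recursion y[n] = α y[n-1] + x[n] with unit-variance white noise and compare sample autocorrelations against α^{|m|}/(1 - α²). The value of α and the record length below are arbitrary choices for the illustration.

    import numpy as np
    from scipy import signal

    rng = np.random.default_rng(5)
    alpha = 0.6
    x = rng.standard_normal(1_000_000)                 # unity-variance white noise x[n]
    y = signal.lfilter([1.0], [1.0, -alpha], x)        # H(z) = 1/(1 - alpha z^{-1})

    def sample_autocorr(y, m):
        """Biased estimate of R_Y[m] = E Y[n] Y[n-m]."""
        return np.mean(y[m:] * y[:-m]) if m > 0 else np.mean(y * y)

    for m in range(4):
        print(m, sample_autocorr(y, m), alpha ** m / (1.0 - alpha ** 2))
        # sample values track 1.5625, 0.9375, 0.5625, 0.3375 for alpha = 0.6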

Spectral factorization theorem

A WSS random signal X[n] that satisfies the Paley-Wiener condition

∫_{-π}^{π} |ln S_X(ω)| dω < ∞

can be considered as the output of a linear filter fed by a white noise sequence.

If S_X(ω) is an analytic function of ω and ∫_{-π}^{π} |ln S_X(ω)| dω < ∞, then

S_X(z) = σ_v² H_c(z) H_a(z)

where

H_c(z) is the causal minimum-phase transfer function,
H_a(z) is the anti-causal maximum-phase transfer function, and
σ_v² is a constant interpreted as the variance of a white-noise sequence.

Innovation sequence

Figure (Innovation filter): v[n] → H_c(z) → X[n]

Minimum-phase filter ⇒ the corresponding inverse filter exists.

Figure (Whitening filter): X[n] → 1/H_c(z) → v[n]


Since ln S_X(z) is analytic in an annular region ρ < |z| < 1/ρ,

ln S_X(z) = Σ_{k=-∞}^{∞} c[k] z^{-k}

where c[k] = (1/2π) ∫_{-π}^{π} ln S_X(ω) e^{jωk} dω is the k-th order cepstral coefficient.

For a real signal, c[k] = c[-k], and

c[0] = (1/2π) ∫_{-π}^{π} ln S_X(ω) dω

Thus we can write

S_X(z) = e^{Σ_{k=-∞}^{∞} c[k] z^{-k}}
       = e^{c[0]} · e^{Σ_{k=1}^{∞} c[k] z^{-k}} · e^{Σ_{k=1}^{∞} c[-k] z^{k}}

Let H_C(z) = e^{Σ_{k=1}^{∞} c[k] z^{-k}},  |z| > ρ
           = 1 + h_c(1) z^{-1} + h_c(2) z^{-2} + ......

(∵ h_c(0) = lim_{z→∞} H_C(z) = 1)

H_C(z) and ln H_C(z) are both analytic for |z| > ρ
⇒ H_C(z) is a minimum-phase filter.

Similarly, let

H_a(z) = e^{Σ_{k=-∞}^{-1} c[k] z^{-k}} = e^{Σ_{k=1}^{∞} c[-k] z^{k}} = H_C(z^{-1}),  |z| < 1/ρ

Therefore,

S_X(z) = σ_v² H_C(z) H_C(z^{-1})

where σ_v² = e^{c[0]}

Salient points

• S_X(z) can be factorized into a minimum-phase factor H_C(z) and a maximum-phase factor H_C(z^{-1}).

• In general, spectral factorization is difficult; however, for a signal with a rational power spectrum, the spectral factorization can be done easily.


• Since H_C(z) is a minimum-phase filter, 1/H_C(z) exists and is stable; therefore we can filter the given signal with 1/H_C(z) to obtain the innovation sequence.

• X[n] and v[n] are related through an invertible transform, so they contain the same information.
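The cepstral construction above can be carried out numerically on an FFT grid: compute c[k] as the inverse DTFT of ln S_X(ω), keep the strictly causal part, and exponentiate to obtain H_C(z). The sketch below is an illustration (not the author's worked example) applied to S_X(ω) = 3/(5 - 4 cos ω) from the earlier example, whose exact factorization is σ_v² = 3/4 and H_C(z) = 1/(1 - 0.5 z^{-1}).

    import numpy as np

    # Dense FFT grid; S_X(w) = 3/(5 - 4 cos w) from the earlier example
    Nfft = 4096
    w = 2.0 * np.pi * np.arange(Nfft) / Nfft
    Sx = 3.0 / (5.0 - 4.0 * np.cos(w))

    # Cepstral coefficients: c[k] = inverse DTFT of ln S_X(w)
    c = np.fft.ifft(np.log(Sx)).real                 # real and even for a real signal

    sigma_v2 = np.exp(c[0])                          # innovation variance e^{c[0]}
    print(sigma_v2)                                  # ~ 0.75

    # Causal (minimum-phase) factor: H_C(w) = exp( sum_{k>=1} c[k] e^{-jwk} )
    c_causal = np.zeros(Nfft)
    c_causal[1:Nfft // 2] = c[1:Nfft // 2]           # keep the strictly causal cepstral part
    Hc = np.exp(np.fft.fft(c_causal))
    hc = np.fft.ifft(Hc).real                        # impulse response of H_C(z)
    print(hc[:4])                                    # ~ [1, 0.5, 0.25, 0.125], i.e. 1/(1 - 0.5 z^{-1})

    # Reconstruction check: sigma_v^2 |H_C(w)|^2 should equal S_X(w)
    print(np.max(np.abs(sigma_v2 * np.abs(Hc) ** 2 - Sx)))   # small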

Example

Wold’s Decomposition

Any WSS signal X[n] can be decomposed as a sum of two mutually orthogonal processes:

• a regular process X_r[n] and a predictable process X_p[n], with X[n] = X_r[n] + X_p[n]

• X_r[n] can be expressed as the output of a linear filter driven by a white noise sequence.

• X_p[n] is a predictable process, that is, a process that can be predicted from its own past with zero prediction error.