Notes and figures are based on or taken from materials in the course textbook: Charles Boncelet,
Probability, Statistics, and Random Signals, Oxford University Press, February 2016.
B.J. Bazuin, Spring 2022 1 of 34 ECE 3800
Charles Boncelet, "Probability, Statistics, and Random Signals," Oxford University Press, 2016. ISBN: 978-0-19-020051-0
Chapter 5: MULTIPLE DISCRETE RANDOM VARIABLES
Sections
5.1 Multiple Random Variables and PMFs
5.2 Independence
5.3 Moments and Expected Values
  5.3.1 Expected Values for Two Random Variables
  5.3.2 Moments for Two Random Variables
5.4 Example: Two Discrete Random Variables
  5.4.1 Marginal PMFs and Expected Values
  5.4.2 Independence
  5.4.3 Joint CDF
  5.4.4 Transformations With One Output
  5.4.5 Transformations With Several Outputs
  5.4.6 Discussion
5.5 Sums of Independent Random Variables
5.6 Sample Probabilities, Mean, and Variance
5.7 Histograms
5.8 Entropy and Data Compression
  5.8.1 Entropy and Information Theory
  5.8.2 Variable Length Coding
  5.8.3 Encoding Binary Sequences
  5.8.4 Maximum Entropy
Summary
Problems
The marginal PMF of X is found by summing the joint PMF over y:

p_X(x) = Σ_y p_XY(x, y)

p_X(0) = 3/12, p_X(1) = 3/12, p_X(2) = 3/12, p_X(3) = 2/12, p_X(4) = 1/12

E[X] = Σ_x x·p_X(x)

E[X] = 0·(3/12) + 1·(3/12) + 2·(3/12) + 3·(2/12) + 4·(1/12) = (0 + 3 + 6 + 6 + 4)/12 = 19/12

E[X²] = Σ_x x²·p_X(x)

E[X²] = 0·(3/12) + 1·(3/12) + 4·(3/12) + 9·(2/12) + 16·(1/12) = (0 + 3 + 12 + 18 + 16)/12 = 49/12

Var[X] = E[X²] − (E[X])²

Var[X] = 49/12 − (19/12)² = 49/12 − 361/144 = (588 − 361)/144 = 227/144
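These hand computations are easy to check; a minimal sketch using exact fractions (the PMF values are the ones from the example above):

```python
from fractions import Fraction as F

# Marginal PMF of X from the example: p(0)=p(1)=p(2)=3/12, p(3)=2/12, p(4)=1/12
pmf = {0: F(3, 12), 1: F(3, 12), 2: F(3, 12), 3: F(2, 12), 4: F(1, 12)}

mean = sum(x * p for x, p in pmf.items())       # E[X]    = 19/12
second = sum(x**2 * p for x, p in pmf.items())  # E[X^2]  = 49/12
var = second - mean**2                          # Var[X]  = 227/144
```

Working in Fraction rather than float keeps the 144ths exact instead of rounding them.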
Computing a CDF

Determine the bounds of interest:

F_XY(2.5, 1.3) = Σ_{x ≤ 2.5} Σ_{y ≤ 1.3} p_XY(x, y) = 6/12
HW Problem 5.5: Continue the example in Section 5.4 and consider the joint transformation. Two-dimensional probability example in (X, Y):

[Figure: grid of points with X ∈ {0, 1, 2, 3, 4} and Y ∈ {0, 1, 2}; 12 equally likely points in X and Y]

Letting U = min(X, Y) and W = max(X, Y):
a.) What are the level curves (draw picture)?
b.) What are the individual PMFs of U and W?
c.) What is the joint PMF of U and W?
For the sum of independent random variables, the MGF of the sum is the product of the individual MGFs!
General comment on Laplace Transforms
a convolution in one domain is multiplication in the other:
  Convolve in time/sample ↔ multiply in Laplace
  Multiply in time/sample ↔ convolve in Laplace
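The same fact in the sample domain: the PMF of a sum of independent discrete random variables is the convolution of their PMFs. A minimal sketch for two fair dice (numpy is assumed here; it is not part of the notes):

```python
import numpy as np

# PMF of one fair die on {1, ..., 6}
die = np.full(6, 1 / 6)

# PMF of the sum of two independent dice: convolve the individual PMFs.
# The result has support {2, ..., 12}; index k corresponds to a sum of k + 2.
pmf_sum = np.convolve(die, die)

p7 = pmf_sum[7 - 2]  # P(sum = 7) = 6/36
```

Multiplying the two MGFs and reading off coefficients would give the same 11 numbers.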
5.6 Sample Probabilities, Mean, and Variance (The beginning of the relationship between statistics and probability!)
Statistics Definition: The science of assembling, classifying, tabulating, and analyzing data or facts:
Descriptive statistics – the collecting, grouping and presenting of data in a way that can be easily understood or assimilated.
Inductive statistics or statistical inference – use data to draw conclusions about, or estimate parameters of, the environment from which the data came.
Theoretical Areas:
Sampling Theory – selecting samples from a collection of data that is too large to be examined completely.
Estimation Theory – concerned with making estimates or predictions based on the data that are available.
Hypothesis Testing – attempts to decide which of two or more hypotheses about the data are true.
Curve fitting and regression – attempt to find mathematical expressions that best represent the data. (Shown in Chap. 4)
Analysis of Variance – attempt to assess the significance of variations in the data and the relation of these variances to the physical situations from which the data arose. (Modern term: ANOVA)
We will focus on parameter estimation of the mean and variance to begin!
Sampling Theory – The Sample Mean
How many samples are required to find a representative sample set that provides confidence in the results?
Defect testing, opinion polls, infection rates, etc.
Definitions
Population: the collection of data being studied. N is the size of the population.
Sample: a random sample is the part of the population selected; all members of the population must be equally likely to be selected! n is the size of the sample.
Sample Mean: the average of the numerical values that make up the sample.
To generalize, describe the statistical properties of arbitrary random samples rather than those of any particular sample.
Sample Mean:  X̄ = (1/n) Σ_{i=1}^{n} X_i,

where the X_i are random variables with a pdf.

Notice that for a pdf the true mean, μ_X, can be computed, while for a sample data set the above sample mean, X̄, is computed.
As may be noted, the sample mean is a combination of random variables and, therefore, can also be considered a random variable. As a result, the hoped-for result can be derived as:
E[X̄] = E[(1/n) Σ_{i=1}^{n} X_i] = (1/n) Σ_{i=1}^{n} E[X_i] = (1/n) Σ_{i=1}^{n} μ_X = (n/n)·μ_X = μ_X
If and when this is true, the estimate is said to be an unbiased estimate.
Though the sample mean may be unbiased, the sample mean may still not provide a good estimate.
What is the "variance" of the computation of the sample mean?
You would expect the sample mean to have some variance about the "probabilistic" or actual mean; therefore, it is also desirable to know something about the fluctuations around the mean. As a result, computation of the variance of the sample mean is desired.
For N >> n or N → ∞ (or even a known pdf), using the collected samples and the prior definition of variance, form a statistical estimate based on the 2nd moment and the square of the mean.
Var[X̄] = E[X̄²] − (E[X̄])²

X̄² = ((1/n) Σ_{i=1}^{n} X_i)·((1/n) Σ_{j=1}^{n} X_j) = (1/n²) Σ_{i=1}^{n} Σ_{j=1}^{n} X_i·X_j

Var[X̄] = (1/n²) Σ_{i=1}^{n} Σ_{j=1}^{n} E[X_i·X_j] − μ_X²

For X_i independent (measurements should be independent of each other):

E[X_i·X_j] = E[X_i]·E[X_j] = μ_X², for i ≠ j
E[X_i·X_j] = E[X_i²], for i = j
As a result, we can split the double sum into the terms where i = j and those where i ≠ j:

Var[X̄] = (1/n²)·( Σ_{i=1}^{n} E[X_i²] + Σ_{i=1}^{n} Σ_{j=1, j≠i}^{n} E[X_i·X_j] ) − μ_X²

Var[X̄] = (1/n²)·( n·E[X²] + n·(n−1)·μ_X² ) − μ_X²

Var[X̄] = (1/n)·E[X²] + ((n−1)/n)·μ_X² − μ_X² = (1/n)·E[X²] − (1/n)·μ_X²

Var[X̄] = ( E[X²] − μ_X² )/n = σ_X²/n
where σ_X² is the true variance (probabilistic) of the random variable, X.
Therefore, as n approaches infinity, this variance in the sample mean estimate goes to zero!
It is referred to as a βconsistentβ estimate. Thus a larger sample size leads to a better estimate of the population mean.
Note: this variance is developed based on "sampling with replacement".
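The σ²/n result can be verified exactly for a small case. The sketch below enumerates every ordered pair of two independent die rolls (n = 2, sampling with replacement) and computes the variance of the sample mean with exact fractions:

```python
from fractions import Fraction as F
from itertools import product

faces = range(1, 7)
mu = F(sum(faces), 6)                           # true mean = 7/2
var = sum((x - mu) ** 2 for x in faces) / 6     # true variance = 35/12

# Enumerate all 36 equally likely ordered samples of size n = 2
n = 2
means = [F(a + b, n) for a, b in product(faces, repeat=n)]
e_mean = sum(means) / len(means)                # equals mu (unbiased)
var_mean = sum((m - e_mean) ** 2 for m in means) / len(means)
# var_mean equals var / n = 35/24
```

Enumeration replaces simulation here: with only 36 equally likely samples, the expectation is computed exactly rather than approximated.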
Example: How many samples of an infinitely long time waveform would be required to ensure the mean is within 1% of the true (probabilistic) mean value? For this relationship, we would require that

σ_X̄ = sqrt(Var[X̄]) ≤ 0.01·μ_X,  or  Var[X̄] ≤ (0.01·μ_X)²

Infinite set, therefore assume that you use the "with replacement" equation:

Var[X̄] = σ_X²/n

Assume that the true mean is 10 and that the true variance is 9, so that the mean +/- a standard deviation would be 10 ± 3. Then,

Var[X̄] = 9/n ≤ (0.01·10)² = 0.01

n ≥ 9/0.01 = 900

A very large sample set size to "estimate" the mean within the 1% desired bound!
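The sample-size arithmetic can be scripted; a small sketch using exact fractions, with μ = 10 and σ² = 9 as in the example:

```python
from fractions import Fraction as F
import math

mu = 10
var = F(9)
rel_tol = F(1, 100)  # the mean estimate must be within 1% of the true mean

# Require Var[X-bar] = sigma^2 / n <= (0.01 * mu)^2, i.e. n >= sigma^2 / (0.01 * mu)^2
n_required = math.ceil(var / (rel_tol * mu) ** 2)
```

Fractions avoid a floating-point ceiling landing one sample high or low right at the boundary.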
Sampling Theory – The Sample Variance
When dealing with probability, both the mean and the variance provide valuable information: the mean describes the "DC" operating condition (about what value is expected), and the variance (in terms of power or squared value) describes the fluctuations about the operating point.
Therefore, we are also interested in the sample variance as compared to the true data variance.
The sample variance of the population (stdevp) is defined as:
S² = (1/n) Σ_{i=1}^{n} (X_i − X̄)²
and continuing until (shown in the coming pages)

E[S²] = ((n−1)/n)·σ_X²

where σ_X² is the true probabilistic variance of the random variable.
Note: the sample variance is not equal to the true variance; it is a biased estimate!
To create an unbiased estimator, scale by the biasing factor to compute (stdev):
S̃² = (n/(n−1))·S² = (1/(n−1)) Σ_{i=1}^{n} (X_i − X̄)²

E[S̃²] = (n/(n−1))·E[S²] = (n/(n−1))·((n−1)/n)·σ_X² = σ_X²
This is equation 5.12 in the textbook!
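A quick numeric check of the biasing factor; numpy is assumed here (the notes themselves use MATLAB and Excel):

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = len(x)

s2_biased = np.var(x)            # divides by n      -> 4.0
s2_unbiased = np.var(x, ddof=1)  # divides by n - 1  -> 32/7
# The two estimates differ by exactly the factor n/(n-1)
```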
Additional notes: MATLAB and MS Excel
Simulation and statistical software packages allow for either biased or unbiased computations.
In MS Excel there are two distinct functions stdev and stdevp.
In MATLAB, there is an additional flag associated with the std function.
std(X) = sqrt( (1/(n−1)) Σ_{j=1}^{n} (X_j − X̄)² ) = sqrt(var(X)),  flag implied as 0

std(X, 1) = sqrt( (1/n) Σ_{j=1}^{n} (X_j − X̄)² ) = sqrt(var(X, 1)),  flag specified as 1
>> help std
 std  Standard deviation.
   For vectors, Y = std(X) returns the standard deviation.
   For matrices, Y is a row vector containing the standard deviation of each column.
   For N-D arrays, std operates along the first non-singleton dimension of X.
   std normalizes Y by (N-1), where N is the sample size. This is the sqrt of an unbiased estimator of the variance of the population from which X is drawn, as long as X consists of independent, identically distributed samples.
   Y = std(X,1) normalizes by N and produces the square root of the second moment of the sample about its mean. std(X,0) is the same as std(X).
The tools you use compute the unbiased variance and standard deviation! Did you know this before?!
Sampling Theory – The Sample Variance – Proof
The sample variance of the population is defined as

S² = (1/n) Σ_{i=1}^{n} (X_i − X̄)² = (1/n) Σ_{i=1}^{n} ( X_i − (1/n) Σ_{j=1}^{n} X_j )²

Determining the expected value:

E[S²] = (1/n) Σ_{i=1}^{n} E[ X_i² − (2/n)·X_i·Σ_{j=1}^{n} X_j + (1/n²)·(Σ_{j=1}^{n} X_j)·(Σ_{k=1}^{n} X_k) ]

E[S²] = (1/n) Σ_{i=1}^{n} ( E[X_i²] − (2/n) Σ_{j=1}^{n} E[X_i·X_j] + (1/n²) Σ_{j=1}^{n} Σ_{k=1}^{n} E[X_j·X_k] )

Using independence, E[X_i·X_j] = E[X²] for i = j and E[X_i·X_j] = μ_X² for i ≠ j, so that

Σ_{j=1}^{n} E[X_i·X_j] = E[X²] + (n−1)·μ_X²

Σ_{j=1}^{n} Σ_{k=1}^{n} E[X_j·X_k] = n·E[X²] + n·(n−1)·μ_X²

E[S²] = E[X²] − (2/n)·( E[X²] + (n−1)·μ_X² ) + (1/n)·E[X²] + ((n−1)/n)·μ_X²
E[S²] = ( 1 − 2/n + 1/n )·E[X²] − ( 2·(n−1)/n − (n−1)/n )·μ_X²

E[S²] = ((n−1)/n)·E[X²] − ((n−1)/n)·μ_X² = ((n−1)/n)·( E[X²] − μ_X² )

Therefore,

E[S²] = ((n−1)/n)·σ_X²

To create an unbiased estimator, scale by an (un-)biasing factor to compute:

E[S̃²] = (n/(n−1))·E[S²] = σ_X²
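The bias result E[S²] = ((n−1)/n)·σ² can also be confirmed exactly for a tiny case, by enumerating every ordered sample of size n = 2 drawn with replacement from the population {1, 2, 3}:

```python
from fractions import Fraction as F
from itertools import product

population = [1, 2, 3]
mu = sum(population) / F(len(population))               # true mean = 2
sigma2 = sum((x - mu) ** 2 for x in population) / F(3)  # true variance = 2/3

# Average the biased sample variance S^2 over all 9 equally likely samples
n = 2
biased_vars = []
for sample in product(population, repeat=n):
    xbar = sum(sample) / F(n)
    biased_vars.append(sum((x - xbar) ** 2 for x in sample) / F(n))

e_s2 = sum(biased_vars) / F(len(biased_vars))
# e_s2 == ((n - 1)/n) * sigma2 == 1/3, matching the proof
```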
Statistical Mean and Variance Summary

For taking samples and estimating the mean and variance …

Mean:
  The estimate:          X̄ = (1/n) Σ_{i=1}^{n} X_i
  An unbiased estimate:  E[X̄] = E[X] = μ_X
  Variance of estimate:  Var[X̄] = σ_X²/n

Variance (biased):
  The estimate:          S² = (1/n) Σ_{i=1}^{n} (X_i − X̄)²
  A biased estimate:     E[S²] = ((n−1)/n)·σ_X²
  Variance of estimate:  Var[S̃²] ≈ ( μ₄ − σ_X⁴ )/n,  where μ₄ = E[(X − μ_X)⁴]

Variance (unbiased):
  The estimate:          S̃² = (n/(n−1))·S² = (1/(n−1)) Σ_{i=1}^{n} (X_i − X̄)²
  An unbiased estimate:  E[S̃²] = E[X²] − μ_X² = σ_X²
  Variance of estimate:  Var[S̃²] ≈ ( μ₄ − σ_X⁴ )/n,  where μ₄ = E[(X − μ_X)⁴]
5.7 Histograms
Histogramming can be used to estimate the values of a PMF! However, a significant number of trials may have to be run before the correct PMF can be observed.
Remember the MATLAB simulation of the marble selection in homework #1?!
Sec1_Marble1.m
Sec1_Marble2.m
Sec1_Marble3.m
See Uniform_hist.m
See Binomial_hist.m
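In the same spirit as the MATLAB scripts above, a sketch in Python (an assumption; the course files are MATLAB) that estimates a binomial PMF from a histogram of simulated trials:

```python
import numpy as np
from math import comb

rng = np.random.default_rng(seed=1)
n_trials = 100_000

# Simulate many binomial(4, 0.5) outcomes and histogram them
samples = rng.binomial(4, 0.5, size=n_trials)
counts = np.bincount(samples, minlength=5)
est_pmf = counts / n_trials

# Theoretical PMF for comparison
true_pmf = np.array([comb(4, k) * 0.5**4 for k in range(5)])
```

With enough trials the histogram frequencies settle onto the PMF values; with only a handful they can look quite different, which is the caution stated above.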
Concepts to validate probability: ground truth, traffic studies, trend analysis.
5.8 Entropy and Data Compression
See https://en.wikipedia.org/wiki/Information_theory
The basis of information theory, and of particular benefit to data compression, is the concept of entropy.

When evaluating information, a measure of the information content (randomness) involves the probability of occurrence of the various "letters" in the alphabet and the number of bits actually needed to represent the alphabet.

For the English alphabet, there are m = 26 letters. For normal language, each letter has a probability of occurrence.

The measure of the entropy of each potential symbol is

H = − Σ_{i=1}^{m} p_i·log2(p_i)  bits per symbol
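A minimal sketch of the entropy computation H = −Σ p·log2(p) (Python assumed; the alphabets below are illustrative, not letter frequencies from the text):

```python
import math

def entropy(pmf):
    """Entropy in bits per symbol: H = -sum(p * log2(p)), skipping p = 0."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

# A uniform 26-letter alphabet needs log2(26) ~ 4.70 bits per symbol
H_uniform = entropy([1 / 26] * 26)

# A skewed 4-symbol alphabet carries less information per symbol
H_skewed = entropy([0.5, 0.25, 0.125, 0.125])  # 1.75 bits < log2(4) = 2
```

The skewed case is where variable-length coding pays off: frequent symbols can get short codewords.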
Overall, this is a specific application and discussion related to encoding that is quite involved and very important, but somewhat unique to an area of interest. Therefore, read it at your leisure.
Shannon's papers on "A Mathematical Theory of Communication"
Homework Problem 5.5:
Continue the example in Section 5.4 and consider the joint transformation, U = min(X ,Y) (e.g., min(3,2) = 2), and W = max(X ,Y ). For each transformation,
a) What are the level curves (draw pictures)?
b) What are the individual PMFs of U and W?
c) What is the joint PMF of U and W?
Below are the level curves and PMFs for W = max(X ,Y ) and U = min(X ,Y ):
Homework Problem 5.30:
Prove the Cauchy-Schwarz inequality:
( Σ_{i=1}^{n} x_i·y_i )² ≤ ( Σ_{i=1}^{n} x_i² )·( Σ_{i=1}^{n} y_i² )
where the xβs and yβs are arbitrary numbers.
Hint: Start with the following inequality (why is this true?):
0 ≤ Σ_{i=1}^{n} ( x_i − a·y_i )²
Find the value of a that minimizes the right hand side above, substitute that value into the same inequality, and rearrange the terms into the Cauchy-Schwarz inequality at the top.
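A sketch of the steps the hint suggests, written out in LaTeX:

```latex
0 \le \sum_{i=1}^{n} (x_i - a\, y_i)^2
  = \sum_i x_i^2 - 2a \sum_i x_i y_i + a^2 \sum_i y_i^2 .

% Minimize the right-hand side over a (set the derivative to zero):
a^{*} = \frac{\sum_i x_i y_i}{\sum_i y_i^2} .

% Substitute a^{*} back in and rearrange:
0 \le \sum_i x_i^2 - \frac{\left(\sum_i x_i y_i\right)^2}{\sum_i y_i^2}
\quad\Longrightarrow\quad
\left(\sum_i x_i y_i\right)^2 \le \left(\sum_i x_i^2\right)\left(\sum_i y_i^2\right).
```

The initial inequality holds because a sum of squares of real numbers is never negative.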
( Σ_{i=1}^{n} x_i·y_i )² ≤ ( Σ_{i=1}^{n} x_i² )·( Σ_{i=1}^{n} y_i² )

or

0 ≤ ( Σ_{i=1}^{n} x_i² )·( Σ_{i=1}^{n} y_i² ) − ( Σ_{i=1}^{n} x_i·y_i )²
You may have heard the phrase, "The square of the sum of the products is less than or equal to the product of the sums of the squares!"