NPR COLLEGE OF ENGINEERING & TECHNOLOGY.
EC1252
COMMUNICATION THEORY
DEPT/ YEAR/ SEM: ECE/ II/ IV
PREPARED BY: Ms. S. THENMOZHI/ Lecturer/ECE
SYLLABUS
UNIT 1 AMPLITUDE MODULATION SYSTEMS
Review of spectral characteristics of periodic and non-periodic signals - Generation and
demodulation of AM, DSB-SC, SSB and VSB signals - Comparison of amplitude
modulation systems - Frequency translation - FDM - Non-linear distortion.
UNIT II ANGLE MODULATION SYSTEMS
Phase and frequency modulation - Single tone - Narrow band and wideband FM -
Transmission bandwidth - Generation and demodulation of FM signal.
UNIT III NOISE THEORY
Review of probability - Random variables and random process - Gaussian process -
Noise - Shot noise - Thermal noise and white noise - Narrow band noise - Noise
temperature - Noise figure.
UNIT IV PERFORMANCE OF CW MODULATION SYSTEMS
Superheterodyne radio receiver and its characteristics - SNR - Noise in DSB-SC systems
using coherent detection - Noise in AM systems using envelope detection - FM systems -
FM threshold effect - Pre-emphasis and de-emphasis in FM - Comparison of
performances.
UNIT V INFORMATION THEORY
Discrete messages and information content - Concept of amount of information -
Average information - Entropy - Information rate - Source coding to increase average
information per bit - Shannon-Fano coding - Huffman coding - Lempel-Ziv (LZ) coding -
Shannon's theorem - Channel capacity - Bandwidth-S/N trade-off - Mutual
information and channel capacity - Rate distortion theory - Lossy source coding.
TEXT BOOKS
1. Dennis Roddy and John Coolen, Electronic Communication, 4th Edition,
PHI, 1995.
2. Herbert Taub and Donald L. Schilling, Principles of Communication Systems,
3rd Edition, TMH, 2008.
REFERENCES
1. Simon Haykin, Communication Systems, 4th Edition, John Wiley and Sons, 2001.
2. Bruce Carlson, Communication Systems, 3rd Edition, TMH, 1996.
3. B. P. Lathi, Modern Digital and Analog Communication Systems, 3rd Edition,
Oxford University Press, 2007.
4. John G. Proakis and Masoud Salehi, Fundamentals of Communication Systems,
5th Edition, Pearson Education, 2006.
UNIT 1
AMPLITUDE MODULATION SYSTEMS
Review of spectral characteristics of periodic and non-periodic signals.
Generation and demodulation of AM signal.
Generation and demodulation of DSBSC signal.
Generation and demodulation of SSB signal.
Generation and demodulation of VSB signal.
Comparison of amplitude modulation systems.
Frequency translation.
FDM.
Non-linear distortion.
Introduction:
In electronics, a signal is an electric current or electromagnetic field used to convey data from one
place to another. The simplest form of signal is a direct current (DC) that is switched on and off; this is the
principle by which the early telegraph worked. More complex signals consist of an alternating-current (AC)
or electromagnetic carrier that contains one or more data streams.
Modulation:
Modulation is the addition of information (or the signal) to an electronic or
optical signal carrier. Modulation can be applied to direct current (mainly by turning it on
and off), to alternating current, and to optical signals. One can think of blanket waving as a
form of modulation used in smoke signal transmission (the carrier being a steady stream of
smoke). Morse code, invented for telegraphy and still used in amateur radio, uses
a binary (two-state) digital code similar to the code used by modern computers. For most
radio and telecommunication today, the carrier is alternating current (AC) in a given range
of frequencies. Common modulation methods include:
Amplitude modulation (AM), in which the voltage applied to the carrier is varied
over time
Frequency modulation (FM), in which the frequency of the carrier waveform is
varied in small but meaningful amounts
Phase modulation (PM), in which the natural flow of the alternating current
waveform is delayed temporarily
Classification of Signals:
Some important classifications of signals
Analog vs. Digital signals: as stated in the previous lecture, a signal with a
magnitude that may take any real value in a specific range is called an analog signal
while a signal with amplitude that takes only a finite number of values is called a
digital signal.
Continuous-time vs. discrete-time signals: continuous-time signals may be analog
or digital signals whose magnitudes are defined for all values of t, while
discrete-time signals are analog or digital signals whose magnitudes are defined at
specific instants of time only and are undefined at other time instants.
Periodic vs. aperiodic signals: periodic signals are those that are constructed from a
specific shape that repeats regularly after a specific amount of time T0, [i.e., a
periodic signal f(t) with period T0 satisfies f(t) = f(t+nT0) for all integer values of
n], while aperiodic signals do not repeat regularly.
Deterministic vs. probabilistic signals: deterministic signals are those that can be
computed beforehand at any instant of time while a probabilistic signal is one that
is random and cannot be determined beforehand.
Energy vs. Power signals: as described below.
Energy and Power Signals
The total energy contained in and average power provided by a signal f(t) (which
is a function of time) are defined as

E_f = \int_{-\infty}^{\infty} |f(t)|^2 \, dt,

and

P_f = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} |f(t)|^2 \, dt,

respectively.
For periodic signals, the power P can be computed using a simpler form based on
the periodicity of the signal as

P_{f,Periodic} = \frac{1}{T} \int_{t_0}^{t_0 + T} |f(t)|^2 \, dt,

where T here is the period of the signal and t_0 is an arbitrary time instant chosen to
simplify the computation of the integration (to reduce the functions you have to integrate
over one period).
Classification of Signals into Power and Energy Signals
Most signals can be classified into energy signals or power signals. A signal is classified
into an energy or a power signal according to the following criteria:
a) Energy Signals: an energy signal is a signal with finite energy and zero
average power (0 < E < \infty, P = 0),
b) Power Signals: a power signal is a signal with infinite energy but finite
average power (0 < P < \infty, E \to \infty).
Comments:
1. The square root of the average power P of a power signal is what is
usually defined as the RMS value of that signal.
2. Your book says that if a signal approaches zero as t approaches \pm\infty then the
signal is an energy signal. This is true in most cases but not always, as you
can verify in part (d) of the following example.
3. All periodic signals are power signals (but not all non-periodic signals are
energy signals).
4. Any signal f that has limited amplitude (|f| < \infty) and is time-limited
(f = 0 for |t| > t_0 for some t_0 > 0) is an energy signal, as in part (g) of
the following example.
Exercise 1: determine if the following signals are Energy signals, Power signals, or
neither, and evaluate E and P for each signal (see examples 2.1 and 2.2 on
pages 17 and 18 of your textbook for help).
a) a(t) = 3\sin(2\pi t), \quad -\infty < t < \infty,
This is a periodic signal, so it must be a power signal. Let us prove it.
E_a = \int_{-\infty}^{\infty} |a(t)|^2 \, dt = \int_{-\infty}^{\infty} |3\sin(2\pi t)|^2 \, dt
    = \int_{-\infty}^{\infty} 9 \cdot \frac{1 - \cos(4\pi t)}{2} \, dt
    = \int_{-\infty}^{\infty} \frac{9}{2} \, dt - \int_{-\infty}^{\infty} \frac{9}{2}\cos(4\pi t) \, dt
    = \infty \text{ J}
Notice that the evaluation of the last line in the above equation is infinite
because of the first term. The second term oscillates between finite bounds, so it
has no effect on the overall (infinite) value of the energy.
Since a(t) is periodic with period T = 2\pi/2\pi = 1 second, we get
P_a = \frac{1}{1}\int_{0}^{1} |a(t)|^2 \, dt = \int_{0}^{1} |3\sin(2\pi t)|^2 \, dt
    = \int_{0}^{1} 9 \cdot \frac{1 - \cos(4\pi t)}{2} \, dt
    = \int_{0}^{1} \frac{9}{2} \, dt - \int_{0}^{1} \frac{9}{2}\cos(4\pi t) \, dt
    = \frac{9}{2} - \left[\frac{9}{8\pi}\sin(4\pi t)\right]_{0}^{1}
    = \frac{9}{2} \text{ W}
So, the energy of this signal is infinite and its average power is finite (9/2 W).
This means that it is a power signal, as expected. Notice that the average
power of this signal is as expected (the square of the amplitude divided by 2).
b) b(t) = 5e^{-2|t|}, \quad -\infty < t < \infty,
Let us first find the total energy of the signal.
E_b = \int_{-\infty}^{\infty} |b(t)|^2 \, dt = \int_{-\infty}^{\infty} \left(5e^{-2|t|}\right)^2 \, dt
    = \int_{-\infty}^{0} 25e^{4t} \, dt + \int_{0}^{\infty} 25e^{-4t} \, dt
    = \left[\frac{25}{4}e^{4t}\right]_{-\infty}^{0} + \left[-\frac{25}{4}e^{-4t}\right]_{0}^{\infty}
    = \frac{25}{4} + \frac{25}{4} = \frac{50}{4} \text{ J}
The average power of the signal is
P_b = \lim_{T\to\infty}\frac{1}{T}\int_{-T/2}^{T/2} |b(t)|^2 \, dt
    = \lim_{T\to\infty}\frac{25}{T}\int_{-T/2}^{0} e^{4t} \, dt + \lim_{T\to\infty}\frac{25}{T}\int_{0}^{T/2} e^{-4t} \, dt
    = \lim_{T\to\infty}\frac{25}{4T}\left(1 - e^{-2T}\right) + \lim_{T\to\infty}\frac{25}{4T}\left(1 - e^{-2T}\right)
    = 0 + 0 = 0 \text{ W}
So, the signal b(t) is definitely an energy signal.
c) c(t) = 4e^{-3t} for |t| < 5, and c(t) = 0 for |t| \ge 5,
d) d(t) = 1/\sqrt{t} for t \ge 1, and d(t) = 0 for t < 1,
Let us first find the total energy of the signal.
E_d = \int_{-\infty}^{\infty} |d(t)|^2 \, dt = \int_{1}^{\infty} \frac{1}{t} \, dt
    = \Big[\ln t\Big]_{1}^{\infty} = \infty \text{ J}
So, this signal is NOT an energy signal. However, it is also NOT a power
signal since its average power as shown below is zero.
The average power of the signal is
P_d = \lim_{T\to\infty} \frac{1}{T}\int_{-T/2}^{T/2} |d(t)|^2 \, dt
    = \lim_{T\to\infty} \frac{1}{T}\int_{1}^{T/2} \frac{1}{t} \, dt
    = \lim_{T\to\infty} \frac{1}{T}\left[\ln\frac{T}{2} - \ln 1\right]
    = \lim_{T\to\infty} \frac{\ln(T/2)}{T}
Using L'Hopital's rule, we see that the power of the signal is zero. That is,
P_d = \lim_{T\to\infty} \frac{\ln(T/2)}{T} = \lim_{T\to\infty} \frac{1/T}{1} = 0 \text{ W}
So, not all signals that approach zero as time approaches positive and
negative infinity are energy signals. They may not be power signals either.
e) e(t) = 7t^2, \quad -\infty < t < \infty,
f) f(t) = 2\cos^2(2\pi t), \quad -\infty < t < \infty.
g) g(t) = 12\cos^2(2\pi t) for 8 \le t \le 31, and g(t) = 0 elsewhere.
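These classifications can also be checked numerically. The short Python sketch below (written for this text as an illustration; truncating the integration grid to |t| <= 10 is an assumption justified by how quickly e^(-4|t|) decays) approximates the energy integral of part (b) and agrees with the analytical value of 50/4 J:

```python
import numpy as np

# Numerical check of part (b): b(t) = 5*exp(-2|t|) should have total energy
# 50/4 = 12.5 J. The grid limits |t| <= 10 are an assumption (the tail
# contribution beyond that is on the order of e^(-40), i.e. negligible).
t = np.linspace(-10.0, 10.0, 200_001)
dt = t[1] - t[0]
b = 5.0 * np.exp(-2.0 * np.abs(t))

energy = np.sum(b**2) * dt                 # E = integral of |b(t)|^2 dt
print(round(energy, 3))                    # -> 12.5
```

Dividing the same sum by an ever-growing observation window would drive the average power to zero, consistent with b(t) being an energy signal.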
AMPLITUDE MODULATION:
In amplitude modulation, the instantaneous amplitude of a carrier wave is varied in accordance with the instantaneous amplitude of the
modulating signal. Main advantages of AM are small bandwidth and simple transmitter and receiver designs. Amplitude modulation is
implemented by mixing the carrier wave in a nonlinear device with the modulating signal. This produces upper and lower sidebands,
which are the sum and difference frequencies of the carrier wave and modulating signal.
The carrier signal is represented by
c(t) = A cos(\omega_c t)
The modulating signal is represented by
m(t) = B cos(\omega_m t)
Then the final modulated signal is
s(t) = A [1 + m(t)] cos(\omega_c t)
     = A [1 + B cos(\omega_m t)] cos(\omega_c t)
     = A cos(\omega_c t) + (AB/2) cos((\omega_c + \omega_m)t) + (AB/2) cos((\omega_c - \omega_m)t)
For demodulation reasons, the magnitude of m(t) is always kept less than 1, and
its frequency is kept much smaller than that of the carrier signal.
The modulated signal has frequency components at the frequencies \omega_c, \omega_c + \omega_m and \omega_c - \omega_m.
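This three-component spectrum can be verified numerically. The following sketch (frequencies in Hz and all parameter values are illustrative choices, not values from the text) builds a single-tone AM signal and locates its spectral peaks:

```python
import numpy as np

# Single-tone AM: s(t) = A*(1 + B*cos(wm*t))*cos(wc*t). With a one-second
# record and integer frequencies, every tone lands exactly on an FFT bin.
# All parameter values here are illustrative assumptions.
fs, fc, fm = 1_000, 100, 10                # sample rate, carrier, message (Hz)
A, B = 1.0, 0.5                            # carrier amplitude, modulation index < 1
t = np.arange(fs) / fs
s = A * (1 + B * np.cos(2 * np.pi * fm * t)) * np.cos(2 * np.pi * fc * t)

spec = np.abs(np.fft.rfft(s)) / len(s)
peaks = np.flatnonzero(spec > 0.01)        # bins with significant energy
print(peaks)                               # [ 90 100 110] -> fc-fm, fc, fc+fm
```

Only the carrier and the two side frequencies appear, matching the expansion above.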
DSBSC:
Double-Sideband Suppressed Carrier Modulation: In amplitude modulation the
amplitude of a high-frequency carrier is varied in direct proportion to the low-frequency
(baseband) message signal. The carrier is usually a sinusoidal waveform, that is,
c(t) = A_c \cos(\omega_c t + \phi_c)
or
c(t) = A_c \sin(\omega_c t + \phi_c)
where:
A_c is the unmodulated carrier amplitude;
\omega_c = 2\pi f_c is the unmodulated carrier angular frequency in radians/s;
\phi_c is the unmodulated carrier phase, which we shall assume is zero.
The amplitude-modulated carrier has the mathematical form
DSB-SC(t) = A(t) \cos(\omega_c t)
where:
A(t) is the instantaneous amplitude of the modulated carrier, and is a linear function
of the message signal m(t). A(t) is also known as the envelope of the modulated signal. For
double-sideband suppressed carrier (DSB-SC) modulation the amplitude is
related to the message as follows:
A(t) = A_c m(t)
Consider a message signal with spectrum (Fourier transform) M(\omega) which is band-limited
to 2\pi B rad/s, as shown in Figure 1(b). The bandwidth of this signal is B Hz, and \omega_c is
chosen such that \omega_c >> 2\pi B. Applying the modulation theorem, the Fourier transform of
the modulated signal is
A(t)\cos(\omega_c t) = A_c m(t)\cos(\omega_c t) \;\leftrightarrow\; \frac{A_c}{2}\left[M(\omega - \omega_c) + M(\omega + \omega_c)\right]
GENERATION OF DSBSC:
The DSB-SC signal can be generated using either the balanced modulator or the ring modulator.
The balanced modulator uses two identical AM generators along with an adder. The two
amplitude modulators have a common carrier, with one of them modulating the input
message and the other modulating the inverted message. Generation of AM is not simple,
and having two AM generators with identical operating conditions is extremely difficult.
Hence, laboratory implementation of the DSB-SC usually uses the ring modulator,
shown in figure 1.
Figure 1
Figure 1: The ring modulator used for the generation of the double-side-band-suppressed-
carrier (DSB-SC)
This standard form of DSB-SC generation is the most preferred method of laboratory
implementation. However, it cannot be used for the generation of the AM waveform.
The DSB-SC and the DSB forms of AM are closely related: the DSB-SC with the
addition of the carrier becomes the DSB, while the DSB with the carrier removed results in
the DSB-SC form of modulation. Yet, existing methods of DSB generation cannot be used
for the generation of the DSB-SC. Similarly, the ring modulator cannot be used for the
generation of the DSB. These two forms of modulation are generated using different
methods. Our attempt in this work is to propose a single circuit capable of generating both
the DSB-SC and the DSB forms of AM.
THE MODIFIED SWITCHING MODULATOR:
The block diagram of the modified switching modulator, given in figure 1, has all
the blocks of the switching modulator but with an additional active device. In this case, the
active device has to have three terminals to enable its use as a controlled switch.
Another significant change is that the adder is shifted to after the active device. These
changes in the switching modulator enable the carrier to independently control the
switching action of the active device, and thus eliminate the restriction existing in the usual
switching modulator (equation (2)). In addition, the same circuit can generate the DSB-SC
waveform. Thus the task of the modulators given in figures 1 and 2 is accomplished by the
single modulator of figure 3.
Figure 2
Figure 2: The modified switching modulator
It is possible to obtain the AM or the DSB-SC waveform from the modified switching
modulator of figure 3 by just varying the amplitude of the square-wave carrier. It may be
noted that the carrier performs two tasks: (i) it controls the switching action of the active
devices, and (ii) it controls the depth of modulation of the generated AM waveform. Thus,
the proposed modification in the switching modulator enables the generation of both the
AM and the DSB-SC from a single circuit. Also, it may be noted that the method is devoid
of any assumptions or stringent, difficult-to-maintain operating conditions, as in existing
low-power generation of the AM. We now implement the modified switching modulator
and record the observed output in the next section.
Experimental results
The circuit implemented for testing the proposed method is given in figure 4, which
uses transistors CL-100 and CK-100 as controlled switches and two transformers for the
adder, followed by a passive BPF. The square-wave carrier and the sinusoidal message are
given from a function generator (6 MHz Aplab FG6M). The waveforms are observed on a
mixed-signal oscilloscope (100 MHz Agilent 54622D, capable of recording the output in
.tif format).
Figure 3
Figure 3: The implementation of the modified switching modulator to generate the AM
and the DSB-SC waveform
The modified switching modulator is tested using a single-tone message of 706 Hz
with a square-wave carrier of frequency 7.78 kHz. The depth of modulation of the
generated waveform can be varied either by varying the amplitude of the carrier or by
varying the amplitude of the signal. Figure 5 has the results of the modulated waveforms
obtained using the modified switching modulator. It can be seen that the same circuit is
able to generate AM for varying depths of modulation, including over-modulation and
the DSB-SC. The quality of the modulated waveforms is comparable to that obtained using
industry-standard communication modules (like the LabVolt, for example).
Properties of DSB-SC Modulation:
(a) There is a 180° phase reversal at the point where the envelope A(t) = m(t) goes
negative. This is typical of DSB-SC modulation.
(b) The bandwidth of the DSB-SC signal is double that of the message signal, that
is,
BW_DSB-SC = 2B (Hz).
(c) The modulated signal is centered at the carrier frequency \omega_c with two identical
sidebands (double-sideband): the lower sideband (LSB) and the upper sideband (USB).
Being identical, they both convey the same message component.
(d) The spectrum contains no isolated carrier. Thus the name suppressed carrier.
(e) The 180° phase reversal causes the positive (or negative) side of the envelope to
have a shape different from that of the message signal. This is known as envelope
distortion, which is typical of DSB-SC modulation.
(f) The power in the modulated signal is contained entirely in the sidebands.
Generation of DSB-SC Signals
The circuits for generating modulated signals are known as modulators. The basic
modulators are the nonlinear, switching and ring modulators. Conceptually, the simplest
modulator is the product or multiplier modulator, shown in figure 1-a. However, it
is very difficult (and expensive) in practice to design a product modulator that maintains
amplitude linearity at high carrier frequencies. One way of replacing the modulator stage
is by using a non-linear device. We use the non-linearity to generate a harmonic that
contains the product term, then use a BPF to separate the term of interest. Figure 3 shows a
block diagram of a nonlinear DSB-SC modulator. Figure 4 shows a double balanced
modulator that uses diodes as the non-linear device, then uses a BPF to separate the
product term.
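The square-law idea behind the nonlinear modulator can be sketched numerically (all parameter values and the device coefficients a1, a2 are illustrative assumptions). Passing m(t) + c(t) through y = a1·v + a2·v² produces, among other terms, the product term 2·a2·m(t)·cos(ω_c t), i.e. sidebands at f_c ± f_m that a BPF can extract; the a1 term leaves a carrier at f_c, which is why a balanced arrangement is used when the carrier must be suppressed:

```python
import numpy as np

# Square-law sketch of the nonlinear modulator; all values are illustrative.
fs, fc, fm = 4_000, 400, 20                # sample rate, carrier, message (Hz)
a1, a2 = 1.0, 0.5                          # assumed device coefficients
t = np.arange(fs) / fs
v = np.cos(2 * np.pi * fm * t) + np.cos(2 * np.pi * fc * t)   # m(t) + c(t)
y = a1 * v + a2 * v**2                     # nonlinear device

spec = np.abs(np.fft.rfft(y)) / len(y)
# The v^2 cross term 2*a2*m(t)*cos(wc*t) puts sidebands at fc +/- fm;
# the a1 term leaves a residual carrier at fc for the BPF to pass or reject.
print(spec[fc - fm] > 0.1, spec[fc + fm] > 0.1, spec[fc] > 0.1)
```

The spectrum also contains baseband and harmonic terms (at f_m, 2f_m, 2f_c, DC), which is exactly why the band-pass filter is needed.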
The received DSB-SC signal is
s_m(t) = DSB-SC(t) = A_c m(t) \cos(\omega_c t)
The receiver first generates an exact (coherent) replica (same phase and frequency) of the
unmodulated carrier,
s_c(t) = \cos(\omega_c t)
The coherent carrier is then multiplied with the received signal to give
s_m(t) s_c(t) = A_c m(t) \cos(\omega_c t) \cos(\omega_c t)
             = \frac{1}{2} A_c m(t) + \frac{1}{2} A_c m(t) \cos(2\omega_c t)
The first term is the desired baseband signal, while the second is a band-pass signal
centered at 2\omega_c. A low-pass filter with bandwidth equal to that of m(t) will pass the
first term and reject the band-pass component.
Single Side Band (SSB) Modulation:
In DSB-SC it is observed that there is symmetry in the band structure. So,
even if only one half is transmitted, the other half can be recovered at the receiver. By
doing so, the bandwidth and power of transmission are reduced by half.
Depending on which half of the DSB-SC signal is transmitted, there are two types:
1. Lower Side Band (LSB) Modulation
2. Upper Side Band (USB) Modulation
Vestigial Side Band (VSB) Modulation:
The following are the drawbacks of SSB signal generation:
1. Generation of an SSB signal is difficult.
2. Selective filtering is to be done to get the original signal back.
3. The phase shifter should be exactly tuned to 90°.
To overcome these drawbacks, VSB modulation is used. It can be viewed
as a compromise between SSB and DSB-SC.
In VSB:
1. One sideband is not rejected fully.
2. One sideband is transmitted fully, and a small part (vestige) of
the other sideband is transmitted.
The transmission bandwidth is BW_v = B + f_v, where f_v is the vestigial frequency band.
FREQUENCY TRANSLATION:
The transfer of signals occupying a specified frequency band, such
as a channel or group of channels, from one portion of the frequency spectrum to another,
in such a way that the arithmetic frequency difference of signals within the band is
unaltered.
FREQUENCY-DIVISION MULTIPLEXING (FDM):
It is a form of signal multiplexing which involves assigning non-overlapping
frequency ranges to different signals or to each "user" of a medium.
FDM can also be used to combine signals before final modulation onto a carrier
wave. In this case the carrier signals are referred to as subcarriers: an example is stereo
FM transmission, where a 38 kHz subcarrier is used to separate the left-right difference
signal from the central left-right sum channel, prior to the frequency modulation of the
composite signal. A television channel is divided into subcarrier frequencies for video,
color, and audio. DSL uses different frequencies for voice and
for upstream and downstream data transmission on the same conductors, which is also an
example of frequency duplex. Where frequency-division multiplexing is used to allow
multiple users to share a physical communications channel, it is called frequency-division
multiple access (FDMA).
NONLINEAR DISTORTION:
It is a term used (in fields such as electronics, audio and telecommunications) to
describe the phenomenon of a non-linear relationship between the "input" and "output"
signals of - for example - an electronic device.
EFFECTS OF NONLINEARITY:
Nonlinearity can have several effects, which are unwanted in typical situations.
The a_3 term, for example, would, when the input is a sine wave with frequency \omega,
result in an extra sine wave at 3\omega.
In certain situations, this spurious signal can be filtered away because the
"harmonic" 3\omega lies far outside the frequency range used, but in cable television, for
example, third-order distortion could cause a 200 MHz signal to interfere with the regular
channel at 600 MHz.
Nonlinear distortion applied to a superposition of two signals at different frequencies
causes the circuit to act as a frequency mixer, creating intermodulation distortion.
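The harmonic-generation effect is easy to demonstrate. The sketch below (illustrative values) applies a cubic nonlinearity y = x + a3·x³ to a pure tone; since sin³(x) = (3 sin x - sin 3x)/4, the output contains only the fundamental and its third harmonic:

```python
import numpy as np

# A cubic term applied to a pure tone generates a third harmonic:
# sin^3(x) = (3*sin(x) - sin(3*x))/4, so y = x + a3*x^3 contains f0 and 3*f0.
fs, f0, a3 = 1_000, 50, 0.2                # illustrative values
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * f0 * t)
y = x + a3 * x**3

spec = np.abs(np.fft.rfft(y)) / len(y)
print(np.flatnonzero(spec > 1e-3))         # [ 50 150] -> components at f0 and 3*f0
```

Feeding two tones into the same nonlinearity would additionally produce the intermodulation products mentioned above (sums and differences of the input frequencies and their harmonics).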
PART A (2 MARK) QUESTIONS.
1. As related to AM, what is over modulation, under modulation and 100% modulation?
2. Draw the frequency spectrum of VSB, where it is used
3. Define modulation index of an AM signal
4. Draw the circuit diagram of an envelope detector
5. What is the mid frequency of the IF section of AM receivers and its bandwidth?
6. A transmitter radiates 9 kW without modulation and 10.125 kW after modulation.
Determine depth of modulation.
7. Draw the spectrum of DSB.
8. Define the transmission efficiency of AM signal.
9. Draw the phasor diagram of AM signal.
10. Advantages of SSB.
11. Disadvantages of DSB-FC.
12. What are the advantages of superheterodyne receiver?
13. Advantages of VSB.
14. Distinguish between low level and high level modulator.
15. Define FDM & frequency translation.
16. Give the parameters of receiver.
17. Define sensitivity and selectivity.
18. Define fidelity.
19. What is meant by image frequency?
20. Define multitone modulation.
PART B (16 MARK) QUESTIONS
1. Explain the generation of AM signals using square law modulator. (16)
2. Explain the detection of AM signals using envelope detector. (16)
3. Explain about Balanced modulator to generate DSB-SC signal. (16)
4. Explain about coherent detector to detect SSB-SC signal. (16)
5. Explain the generation of SSB using balanced modulator. (16)
6. Draw the circuit diagram of Ring modulator and explain with its operation? (16)
7. Discuss the coherent detection of DSB-SC modulated wave with a block diagram of the
detector and explain. (16)
8. Explain the working of Superheterodyne receiver with its parameters. (16)
9. Draw the block diagram for the generation and demodulation of a VSB signal and
explain the principle of operation. (16)
10. Write short notes on frequency translation and FDM? (16)
UNIT II
ANGLE MODULATION SYSTEMS
Phase and frequency modulation
Single tone
Narrow band FM
Wideband FM
Transmission bandwidth
Generation of FM signal.
Demodulation of FM signal
PHASE MODULATION:
Phase modulation (PM) is a form of modulation that represents information as
variations in the instantaneous phase of a carrier wave.
Unlike its more popular counterpart, frequency modulation (FM), PM is not very
widely used for radio transmissions. This is because it tends to require more complex
receiving hardware, and there can be ambiguity problems in determining whether, for
example, the signal has changed phase by +180° or -180°. PM is used, however, in digital
music synthesizers such as the Yamaha DX7, even though these instruments are usually
referred to as "FM" synthesizers (both modulation types sound very similar, but PM is
usually easier to implement in this area).
An example of phase modulation. The top diagram shows the modulating signal
superimposed on the carrier wave. The bottom diagram shows the resulting phase-
modulated signal. PM changes the phase angle of the complex envelope in direct
proportion to the message signal.
Suppose that the signal to be sent (called the modulating or message signal) is m(t) and the
carrier onto which the signal is to be modulated is
c(t) = A_c \sin(\omega_c t + \phi_c)
i.e., the carrier amplitude times the sine of (the carrier angular frequency times time, plus a phase shift).
This makes the modulated signal
y(t) = A_c \sin(\omega_c t + m(t) + \phi_c)
greater the phase shift of the modulated signal at that point. It can also be viewed as a
change of the frequency of the carrier signal, and phase modulation can thus be considered
a special case of FM in which the carrier frequency modulation is given by the time
derivative of the phase modulation.
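The time-derivative relationship can be written out explicitly with the carrier defined above (a standard derivation, using the same symbols as the text):

```latex
y(t) = A_c \sin\bigl(\omega_c t + m(t) + \phi_c\bigr),
\qquad
\omega_i(t) = \frac{d}{dt}\bigl[\omega_c t + m(t) + \phi_c\bigr]
            = \omega_c + \frac{dm(t)}{dt}.
```

The instantaneous angular frequency deviates from \omega_c by exactly dm/dt, so PM with message m(t) behaves as FM whose modulating signal is the derivative of m(t).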
The spectral behavior of phase modulation is difficult to derive, but the
mathematics reveals that there are two regions of particular interest:
For small-amplitude signals, PM is similar to amplitude
modulation (AM) and exhibits its unfortunate doubling of
baseband bandwidth and poor efficiency.
For a single large sinusoidal signal, PM is similar to FM, and its
bandwidth is approximately
2 f_M (h + 1),
where f_M = \omega_m / 2\pi and h is the modulation index defined below. This is
also known as Carson's rule for PM.
MODULATION INDEX:
As with other modulation indices, this quantity indicates by how much the
modulated variable varies around its unmodulated level. It relates to the variations in the
phase of the carrier signal:
h = \Delta\theta,
where \Delta\theta is the peak phase deviation. Compare this to the modulation index for
frequency modulation.
Variable-capacitance diode phase modulator:
This circuit varies the phase between two square waves through at least 180°. This
capability finds application in fixed-frequency, phase shift, resonant-mode converters. ICs
such as the UC3875 usually only work up to about 500 kHz, whereas this circuit can be
extended up to tens of megahertz. In addition, the circuit shown uses low-cost components.
This example was used for a high-efficiency 2-MHz RF power supply.
The signal is delayed at each gate by the RC network formed by the 4.7k input
resistor and capacitance of the 1N4003 diode. The capacitance of the diode, and hence
delay, can be varied by controlling the reverse dc bias applied across the diode. The 100k
resistor to ground at the input to the second stage corrects a slight loss of 1:1 symmetry.
The fixed delay for output A adjusts the phase to be approximately in phase at a 5-V bias.
Note that the control voltage should not drop below approximately 3 V, because the diodes
will start to be forward-biased and the signal will be lost.
FREQUENCY MODULATION:
Frequency modulation (FM) conveys information over a carrier wave by varying
its instantaneous frequency. This is in contrast with amplitude modulation, in which
the amplitude of the carrier is varied while its frequency remains constant.
In analog applications, the difference between the instantaneous and the base frequency of
the carrier is directly proportional to the instantaneous value of the input signal
amplitude. Digital data can be sent by shifting the carrier's frequency among a set of
discrete values, a technique known as frequency-shift keying.
Frequency modulation can be regarded as phase modulation where the carrier phase
modulation is the time integral of the FM modulating signal.
FM is widely used for broadcasting of music and speech, and in two-way
radio systems, in magnetic tape recording systems, and certain video transmission systems.
In radio systems, frequency modulation with sufficient bandwidth provides an advantage in
cancelling naturally-occurring noise. Frequency-shift keying (digital FM) is widely used in
data and fax modems.
THEORY:
Suppose the baseband data signal (the message) to be transmitted is x_m(t) and
the sinusoidal carrier is x_c(t) = A_c \cos(2\pi f_c t), where f_c is the carrier's base frequency
and A_c is the carrier's amplitude. The modulator combines the carrier with the baseband
data signal to get the transmitted signal:
y(t) = A_c \cos\left(2\pi f_c t + 2\pi f_\Delta \int_0^t x_m(\tau) \, d\tau\right)     (1)
In this equation, f(t) = f_c + f_\Delta x_m(t) is the instantaneous frequency of the oscillator
and f_\Delta is the frequency deviation, which represents the maximum shift away from f_c in
one direction, assuming x_m(t) is limited to the range \pm 1.
Although it may seem that this limits the frequencies in use to f_c \pm f_\Delta, this neglects the
distinction between instantaneous frequency and spectral frequency. The frequency
spectrum of an actual FM signal has components extending out to infinite frequency,
although they become negligibly small beyond a point.
SINUSOIDAL BASEBAND SIGNAL:
While it is an over-simplification, a baseband modulated signal may be approximated
by a sinusoidal continuous-wave signal x_m(t) = A_m \cos(2\pi f_m t) with a frequency f_m.
The integral of such a signal is
\int_0^t x_m(\tau) \, d\tau = \frac{A_m \sin(2\pi f_m t)}{2\pi f_m}
Thus, in this specific case, equation (1) above simplifies to:
y(t) = A_c \cos\left(2\pi f_c t + \frac{A_m f_\Delta}{f_m} \sin(2\pi f_m t)\right)
where the amplitude A_m of the modulating sinusoid is represented by the peak
deviation \Delta f = A_m f_\Delta (see frequency deviation).
The harmonic distribution of a sine wave carrier modulated by such
a sinusoidal signal can be represented with Bessel functions; this provides a basis for a
mathematical understanding of frequency modulation in the frequency domain.
MODULATION INDEX:
As with other modulation indices, this quantity indicates by how much the
modulated variable varies around its unmodulated level. It relates to the variations in the
frequency of the carrier signal:
h = \frac{\Delta f}{f_m},
where f_m is the highest frequency component present in the modulating
signal x_m(t), and \Delta f is the peak frequency deviation, i.e. the maximum deviation of
the instantaneous frequency from the carrier frequency. If h << 1, the modulation is
called narrowband FM, and its bandwidth is approximately 2 f_m. If h >> 1, the
modulation is called wideband FM and its bandwidth is approximately 2 \Delta f. While
wideband FM uses more bandwidth, it can improve the signal-to-noise ratio significantly.
With a tone-modulated FM wave, if the modulation frequency is held constant and
the modulation index is increased, the (non-negligible) bandwidth of the FM signal
increases, but the spacing between spectra stays the same; some spectral components
decrease in strength as others increase. If the frequency deviation is held constant and the
modulation frequency increased, the spacing between spectra increases.
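This sideband structure is easy to verify numerically. The sketch below (all parameters are illustrative choices) generates a single-tone FM signal with modulation index beta = 2 and inspects its spectrum; the significant components sit at f_c ± n·f_m, spaced by f_m, with amplitudes |J_n(beta)|:

```python
import numpy as np

# Single-tone FM with modulation index beta = 2 (illustrative parameters).
fs, fc, fm, beta = 8_000, 1_000, 50, 2.0
t = np.arange(fs) / fs                     # one second of samples
y = np.cos(2 * np.pi * fc * t + beta * np.sin(2 * np.pi * fm * t))

spec = np.abs(np.fft.rfft(y)) / len(y)
sig = np.flatnonzero(spec > 0.01)          # significant spectral lines
print(sig)                                 # lines at fc +/- n*fm
print(np.diff(sig))                        # spacing equals fm = 50
```

Raising beta while keeping f_m fixed adds more significant sidebands at the same spacing; raising f_m while keeping the deviation fixed spreads the same number of lines further apart, as described above.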
Frequency modulation can be classified as narrowband if the change in the carrier
frequency is about the same as the signal frequency, or as wideband if the change in the
carrier frequency is much higher (modulation index > 1) than the signal frequency. For
example, narrowband FM is used for two-way radio systems such as Family Radio
Service, where the carrier is allowed to deviate only 2.5 kHz above and below the center
frequency, carrying speech signals of no more than 3.5 kHz bandwidth. Wideband FM is
used for FM broadcasting, where music and speech are transmitted with up to 75 kHz
deviation from the center frequency, carrying audio with up to 20 kHz bandwidth.
CARSON'S RULE:
A rule of thumb, Carson's rule states that nearly all (~98%) of the power of a
frequency-modulated signal lies within a bandwidth of

BT = 2 (Δf + fm)

where Δf, as defined above, is the peak deviation of the instantaneous frequency
from the center carrier frequency, and fm is the highest frequency in the modulating signal.
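As a quick numerical illustration of the rule (the broadcast-FM figures used below, 75 kHz peak deviation and 15 kHz maximum audio frequency, are standard textbook values, not taken from this section):

```python
def carson_bandwidth(peak_deviation_hz, max_modulating_hz):
    """Carson's rule: approximate FM bandwidth B = 2 * (delta_f + f_m)."""
    return 2.0 * (peak_deviation_hz + max_modulating_hz)

# Broadcast FM: 75 kHz peak deviation, 15 kHz maximum audio frequency
bandwidth = carson_bandwidth(75e3, 15e3)   # 180 kHz
```

The same one-liner answers typical exercises, e.g. a 2 kHz tone with 12 kHz deviation gives 2(12 + 2) = 28 kHz.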
NOISE QUIETING:
As the received FM signal power increases, the noise at the demodulator output
decreases, so the output SNR goes up significantly. This effect is known as noise quieting.
MODULATION:
FM signals can be generated using either direct or indirect frequency modulation.
Direct FM modulation can be achieved by directly feeding the message into the
input of a VCO.
For indirect FM modulation, the message signal is integrated to generate a phase
modulated signal. This is used to modulate a crystal controlled oscillator, and the
result is passed through a frequency multiplier to give an FM signal.
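The indirect path (integrate the message, then phase-modulate) can be sketched numerically; the sample rate, carrier frequency, and frequency sensitivity below are illustrative choices, not values from the text:

```python
import numpy as np

fs = 100_000   # sample rate, Hz (illustrative)
fc = 10_000    # carrier frequency, Hz
kf = 5_000     # frequency sensitivity, Hz per unit of message amplitude
t = np.arange(0, 0.01, 1 / fs)
message = np.sin(2 * np.pi * 500 * t)   # a 500 Hz modulating tone

# Indirect-style generation: integrate the message to obtain the phase term,
# then phase-modulate the carrier with it.
phase = 2 * np.pi * kf * np.cumsum(message) / fs
fm_signal = np.cos(2 * np.pi * fc * t + phase)
```

Note the constant envelope: every sample of `fm_signal` lies in [-1, 1], since all the information is carried in the phase.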
DEMODULATION:
Many FM detector circuits exist. One common method for recovering the
information signal is the Foster-Seeley discriminator. A phase-locked loop (PLL) can also
be used as an FM demodulator.
Slope detection demodulates an FM signal by using a tuned circuit, which has its
resonant frequency slightly offset from the carrier frequency. As the frequency rises and
falls, the tuned circuit provides a changing amplitude of response, converting FM to AM.
AM receivers may detect some FM transmissions by this means, though it does not provide
an efficient method of detection for FM broadcasts.
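A discriminator-style detector can likewise be sketched at complex baseband: the phase difference between consecutive samples is proportional to the instantaneous frequency, and hence to the message. All parameter values below are illustrative:

```python
import numpy as np

fs = 100_000   # sample rate, Hz (illustrative values throughout)
kf = 5_000     # frequency sensitivity, Hz per unit amplitude
t = np.arange(0, 0.01, 1 / fs)
message = np.sin(2 * np.pi * 500 * t)

# Complex-baseband FM signal (carrier already translated to 0 Hz for clarity)
phase = 2 * np.pi * kf * np.cumsum(message) / fs
iq = np.exp(1j * phase)

# Discriminator: the angle of iq[n] * conj(iq[n-1]) is the per-sample phase
# step, proportional to the instantaneous frequency deviation.
demod = np.angle(iq[1:] * np.conj(iq[:-1])) * fs / (2 * np.pi * kf)
```

Because the deviation here keeps the per-sample phase step well below pi radians, no phase unwrapping is needed and `demod` reproduces the message samples.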
APPLICATIONS:
MAGNETIC TAPE STORAGE:
FM is also used at intermediate frequencies by all analog VCR systems,
including VHS, to record both the luminance (black and white) and the chrominance
portions of the video signal. FM is the only feasible method of recording video to and
retrieving video from magnetic tape without extreme distortion, as video signals have a
very large range of frequency components, from a few hertz to several megahertz, too
wide for equalizers to cope with given a noise floor roughly 60 dB down. FM also keeps the
tape at saturation level, and therefore acts as a form of noise reduction, and a
simple limiter can mask variations in the playback output, and the FM capture effect
removes print-through and pre-echo. A continuous pilot tone, if added to the signal (as
was done on V2000 and many Hi-band formats), can keep mechanical jitter under control
and assist time base correction.
These FM systems are unusual in that they have a ratio of carrier to maximum
modulation frequency of less than two; contrast this with FM audio broadcasting where the
ratio is around 10,000. Consider for example a 6 MHz carrier modulated at a 3.5 MHz rate;
by Bessel analysis, the first sidebands are at 9.5 and 2.5 MHz, while the second sidebands
are at 13 MHz and -1 MHz. The result is a sideband of reversed phase at +1 MHz; on
demodulation, this produces an unwanted output at 6 - 1 = 5 MHz. The system must be
designed so that this is at an acceptable level.
SOUND:
FM is also used at audio frequencies to synthesize sound. This technique, known
as FM synthesis, was popularized by early digital synthesizers and became a standard
feature for several generations of personal computer sound cards.
RADIO:
Wideband FM (WFM) requires a wider signal bandwidth than amplitude
modulation by an equivalent modulating signal, but this also makes the signal more robust
against noise and interference. Frequency modulation is also more robust against simple
signal-amplitude fading. As a result, FM was chosen as the
modulation standard for high frequency, high fidelity radio transmission: hence the term
"FM radio" (although for many years the BBC called it "VHF radio", because commercial
FM broadcasting uses a well-known part of the VHF band, the FM broadcast band).
FM receivers employ a special detector for FM signals and exhibit
a phenomenon called capture effect, where the tuner is able to clearly receive the stronger
of two stations being broadcast on the same frequency. Problematically
however, frequency drift or lack of selectivity may cause one station or signal to be
suddenly overtaken by another on an adjacent channel. Frequency drift typically
constituted a problem on very old or inexpensive receivers, while inadequate selectivity
may plague any tuner.
An FM signal can also be used to carry a stereo signal: see FM stereo. However,
this is done by using multiplexing and demultiplexing before and after the FM process. The
rest of this article ignores the stereo multiplexing and demultiplexing process used in
"stereo FM", and concentrates on the FM modulation and demodulation process, which is
identical in stereo and mono processes.
A high-efficiency radio-frequency switching amplifier can be used to transmit FM
signals (and other constant-amplitude signals). For a given signal strength (measured at the
receiver antenna), switching amplifiers use less battery power and typically cost less than
a linear amplifier. This gives FM another advantage over other modulation schemes that
require linear amplifiers, such as AM and QAM.
FM is commonly used at VHF radio frequencies for high-
fidelity broadcasts of music and speech (see FM broadcasting). Normal (analog) TV sound
is also broadcast using FM. A narrow band form is used for voice communications in
commercial and amateur radio settings. In broadcast services, where audio fidelity is
important, wideband FM is generally used. In two-way radio, narrowband FM (NBFM) is
used to conserve bandwidth for land mobile radio stations, marine mobile, and many other
radio services.
VARACTOR FM MODULATOR:
Varactor FM Modulator
Another FM modulator which is widely used in transistorized circuitry uses a
voltage-variable capacitor (varactor). The varactor is simply a diode, or pn junction,
that is designed to have a certain amount of capacitance between its junctions. View (A) of
figure 2 shows the varactor schematic symbol. A diagram of a varactor in a simple
oscillator circuit is shown in view (B). This is not a working circuit, but merely a simplified
illustration. The capacitance of a varactor, as with regular capacitors, is determined by the
area of the capacitor plates and the distance between the plates. The depletion region in the
varactor is the dielectric and is located between the p and n elements, which serve as the
plates. Capacitance is varied in the varactor by varying the reverse bias which controls the
thickness of the depletion region. The varactor is so designed that the change in
capacitance is linear with the change in the applied voltage. This is a special design
characteristic of the varactor diode. The varactor must not be forward biased because it
cannot tolerate much current flow. Proper circuit design prevents the application of
forward bias.
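Although the text describes the circuit qualitatively, the way a changing varactor capacitance swings the oscillator frequency follows directly from the LC resonance formula f = 1 / (2π√(LC)); a small sketch with assumed component values:

```python
import math

def lc_frequency(l_henry, c_farad):
    """Resonant frequency of an LC tank: f = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))

L = 1e-6                              # 1 uH tank inductance (assumed value)
f_nominal = lc_frequency(L, 100e-12)  # varactor biased to 100 pF
f_deviated = lc_frequency(L, 90e-12)  # more reverse bias -> wider depletion
                                      # region -> less capacitance

# Less capacitance raises the oscillation frequency; the modulating voltage
# on the varactor therefore deviates the carrier frequency.
assert f_deviated > f_nominal
```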
IMPORTANT QUESTIONS
PART A
All questions carry two marks:
1. What do you mean by narrowband and wideband FM?
2. Give the frequency spectrum of narrowband FM.
3. Why is the Armstrong method superior to the reactance modulator?
4. Define frequency deviation in FM.
5. State Carson's rule for FM bandwidth.
6. Differentiate between narrowband and wideband FM.
7. What are the advantages of FM?
8. Define PM.
9. What is meant by indirect FM generation?
10. Draw the phasor diagram of narrowband FM.
11. Write the expression for the spectrum of a single-tone FM signal.
12. What are the applications of the phase-locked loop?
13. Define the modulation index of FM and PM.
14. Differentiate between phase and frequency modulation.
15. A carrier of frequency 100 MHz is frequency modulated by a signal
x(t) = 20 sin(200×10³ t). What is the bandwidth of the FM signal if the frequency
sensitivity of the modulator is 25 kHz per volt?
16. What is the bandwidth required for an FM wave in which the modulating signal
frequency is 2 kHz and the maximum frequency deviation is 12 kHz?
17. Determine and draw the instantaneous frequency of a wave having a total phase angle
given by θ(t) = 2000t + sin 10t.
18. Draw the block diagram of the PLL.
PART B
1. Explain the indirect method of generation of FM wave and any one method of
demodulating an FM wave. (16)
2. Derive the expression for the frequency modulated signal. Explain what is meant by
narrowband FM and wideband FM using the expression. (16)
3. Explain any two techniques of demodulation of FM. (16)
4. Explain the working of the reactance tube modulator and derive an expression to show
how the variation of the amplitude of the input signal changes the frequency of the output
signal of the modulator. (16)
5. Discuss the effects of nonlinearities in FM. (8)
6. Discuss in detail FM stereo multiplexing. (8)
7. Draw the frequency spectrum of FM and explain. Explain how Varactor diode can be
used for frequency modulation. (16)
8. Discuss the indirect method of generating a wide-band FM signal. (8)
9. Draw the circuit diagram of the Foster-Seeley discriminator and explain its working. (16)
10. Explain the principle of indirect method of generating a wide-band FM signal with a
neat block diagram. (8)
UNIT III
NOISE THEORY
Review of probability.
Random variables and random process.
Gaussian process.
Noise.
Shot noise.
Thermal noise.
White noise.
Narrow band noise.
Noise temperature.
Noise figure.
INTRODUCTION OF PROBABILITY:
Probability theory is the study of uncertainty. These notes attempt to cover the basics of probability theory at an appropriate level. The mathematical theory of probability is very sophisticated, and delves into a branch of analysis known as measure theory. Here we provide a basic treatment of probability that does not address these finer details.
1 Elements of probability
In order to define a probability on a set we need a few basic elements:

Sample space Ω: the set of all the outcomes of a random experiment. Here, each outcome ω ∈ Ω can be thought of as a complete description of the state of the real world at the end of the experiment.

Set of events (or event space) F: a set whose elements A ∈ F (called events) are subsets of Ω (i.e., A ⊆ Ω is a collection of possible outcomes of an experiment).

Probability measure: a function P : F → R that satisfies the following properties:
- P(A) ≥ 0, for all A ∈ F.
- P(Ω) = 1.
- If A1, A2, ... are disjoint events (i.e., Ai ∩ Aj = ∅ whenever i ≠ j), then

P(∪i Ai) = Σi P(Ai).

These three properties are called the Axioms of Probability.
Example: Consider the event of tossing a six-sided die. The sample space is Ω = {1, 2, 3, 4, 5, 6}. We can define different event spaces on this sample space. For example, the simplest event space is the trivial event space F = {∅, Ω}. Another event space is the set of all subsets of Ω. For the first event space, the unique probability measure satisfying the requirements above is given by P(∅) = 0, P(Ω) = 1. For the second event space, one valid probability measure is to assign the probability of each set in the event space to be i/6, where i is the number of elements of that set.

Properties:
- If A ⊆ B, then P(A) ≤ P(B).
- P(A ∩ B) ≤ min(P(A), P(B)).
- (Union Bound) P(A ∪ B) ≤ P(A) + P(B).
- P(Ω \ A) = 1 - P(A).
- (Law of Total Probability) If A1, ..., Ak are a set of disjoint events such that ∪_{i=1}^{k} Ai = Ω, then Σ_{i=1}^{k} P(Ai ∩ B) = P(B) for any event B.
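These axioms and properties can be checked concretely for the fair-die example; the events A and B below are arbitrary illustrations:

```python
from fractions import Fraction

# Fair six-sided die: the measure assigns |A| / 6 to each event A
omega = {1, 2, 3, 4, 5, 6}

def P(event):
    return Fraction(len(event & omega), len(omega))

A = {1, 2, 3}   # "roll at most 3"
B = {2, 4, 6}   # "roll an even number"

assert P(omega) == 1                        # second axiom: P(Omega) = 1
assert P(A | B) == P(A) + P(B) - P(A & B)   # inclusion-exclusion
assert P(A | B) <= P(A) + P(B)              # union bound
assert P(omega - A) == 1 - P(A)             # complement rule
```

Using `Fraction` keeps the arithmetic exact, so the identities hold with equality rather than within floating-point tolerance.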
2 Random variables
Consider an experiment in which we flip 10 coins, and we want to know the number of coins that come up heads. Here, the elements of the sample space Ω are 10-length sequences of heads and tails. For example, we might have ω0 = (H, H, T, H, T, H, H, T, T, T) ∈ Ω. However, in practice, we usually do not care about the probability of obtaining any particular sequence of heads and tails. Instead we usually care about real-valued functions of outcomes, such as the number of heads that appear among our 10 tosses, or the length of the longest run of tails. These functions, under some technical conditions, are known as random variables.
More formally, a random variable X is a function X : Ω → R. Typically, we will denote random variables using upper case letters X(ω) or more simply X (where the dependence on the random outcome ω is implied). We will denote the value that a random variable may take on using lower case letters x.
Example: In our experiment above, suppose that X(ω) is the number of heads which occur in the sequence of tosses ω. Given that only 10 coins are tossed, X(ω) can take only a finite number of values, so it is known as a discrete random variable. Here, the probability of the set associated with a random variable X taking on some specific value k is
P(X = k) := P({ω : X(ω) = k}).

Example: Suppose that X(ω) is a random variable indicating the amount of time it takes for a radioactive particle to decay. In this case, X(ω) takes on an infinite number of possible values, so it is called a continuous random variable. We denote the probability that X takes on a value between two real constants a and b (where a < b) as

P(a ≤ X ≤ b) := P({ω : a ≤ X(ω) ≤ b}).
2.1 Cumulative distribution functions
In order to specify the probability measures used when dealing with random variables, it is often convenient to specify alternative functions (CDFs, PDFs, and PMFs) from which the probability measure governing an experiment immediately follows. In this section and the next two sections, we describe each of these types of functions in turn.
A cumulative distribution function (CDF) is a function FX : R → [0, 1] which specifies a probability measure as

FX(x) := P(X ≤ x). (1)

By using this function one can calculate the probability of any event in F. Figure 1 shows a sample CDF.
2.2 Probability mass functions
When a random variable X takes on a finite set of possible values (i.e., X is a discrete random variable), a simpler way to represent the probability measure associated with a random variable is to directly specify the probability of each value that the random variable can assume. In particular,
a probability mass function (PMF) is a function pX : Ω → R such that

pX(x) := P(X = x).

In the case of a discrete random variable, we use the notation Val(X) for the set of possible values that the random variable X may assume. For example, if X(ω) is a random variable indicating the number of heads out of ten tosses of a coin, then Val(X) = {0, 1, 2, ..., 10}.

Properties:
- 0 ≤ pX(x) ≤ 1.
- Σ_{x ∈ Val(X)} pX(x) = 1.
- Σ_{x ∈ A} pX(x) = P(X ∈ A).
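For the ten-coin-toss example above, the PMF is Binomial(10, 1/2); a quick sketch confirming the stated properties:

```python
from math import comb

n, p = 10, 0.5   # ten fair-coin tosses; X = number of heads

def pmf(k):
    """Binomial PMF: P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# The PMF sums to one over Val(X) = {0, 1, ..., 10}
total = sum(pmf(k) for k in range(n + 1))

# P(X in A) is the sum of the PMF over A, here A = {0, 1, 2}
p_at_most_two = sum(pmf(k) for k in range(3))
```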
2.3 Probability density functions
For some continuous random variables, the cumulative distribution function FX (x) is differentiable everywhere. In these cases, we define the Probability Density Function or PDF as the derivative of the CDF, i.e.,
fX(x) := dFX(x)/dx. (2)
Note here that the PDF for a continuous random variable may not always exist (i.e., if FX(x) is not differentiable everywhere).
According to the properties of differentiation, for very small δx,

P(x ≤ X ≤ x + δx) ≈ fX(x) δx. (3)

Both CDFs and PDFs (when they exist!) can be used for calculating the probabilities of different events. But it should be emphasized that the value of the PDF at any given point x is not the probability
of that event, i.e., fX(x) ≠ P(X = x). For example, fX(x) can take on values larger than one (but the integral of fX(x) over any subset of R will be at most one).
Properties:
- fX(x) ≥ 0.
- ∫_{-∞}^{∞} fX(x) dx = 1.
- ∫_{x ∈ A} fX(x) dx = P(X ∈ A).
2.4 Expectation
Suppose that X is a discrete random variable with PMF pX(x) and g : R → R is an arbitrary function. In this case, g(X) can be considered a random variable, and we define the expectation or expected value of g(X) as

E[g(X)] := Σ_{x ∈ Val(X)} g(x) pX(x).

If X is a continuous random variable with PDF fX(x), then the expected value of g(X) is defined as

E[g(X)] := ∫_{-∞}^{∞} g(x) fX(x) dx.
Intuitively, the expectation of g(X) can be thought of as a weighted average of the values that g(x) can take on for different values of x, where the weights are given by pX(x) or fX(x). As a special case of the above, note that the expectation, E[X], of a random variable itself is found by letting g(x) = x; this is also known as the mean of the random variable X.
Properties:
- E[a] = a for any constant a ∈ R.
- E[a f(X)] = a E[f(X)] for any constant a ∈ R.
- (Linearity of Expectation) E[f(X) + g(X)] = E[f(X)] + E[g(X)].
- For a discrete random variable X, E[1{X = k}] = P(X = k).
2.5 Variance
The variance of a random variable X is a measure of how concentrated the distribution of a random variable X is around its mean. Formally, the variance of a random variable X is defined as
Var[X] := E[(X - E[X])²]

Using the properties in the previous section, we can derive an alternate expression for the variance:

E[(X - E[X])²] = E[X² - 2 E[X] X + E[X]²]
= E[X²] - 2 E[X] E[X] + E[X]²
= E[X²] - E[X]²,
where the second equality follows from linearity of expectations and the fact that E[X ] is actually a constant with respect to the outer expectation.
Properties:
- Var[a] = 0 for any constant a ∈ R.
- Var[a f(X)] = a² Var[f(X)] for any constant a ∈ R.
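The identity Var[X] = E[X²] - E[X]² can be verified exactly for a fair die (an illustrative choice of discrete random variable):

```python
from fractions import Fraction

# X = outcome of a fair die: p_X(x) = 1/6 for x in {1, ..., 6}
values = range(1, 7)
p = Fraction(1, 6)

E_X = sum(x * p for x in values)                      # mean E[X]
E_X2 = sum(x * x * p for x in values)                 # second moment E[X^2]
var_direct = sum((x - E_X) ** 2 * p for x in values)  # E[(X - E[X])^2]

# The two forms of the variance agree exactly
assert var_direct == E_X2 - E_X ** 2
```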
2.6 Some common random variables
Discrete random variables

X ~ Bernoulli(p) (where 0 ≤ p ≤ 1): one if a coin with heads probability p comes up heads, zero otherwise.

p(x) = p if x = 1; 1 - p if x = 0

X ~ Binomial(n, p) (where 0 ≤ p ≤ 1): the number of heads in n independent flips of a coin with heads probability p.

p(x) = C(n, x) p^x (1 - p)^(n-x)

X ~ Geometric(p) (where p > 0): the number of flips of a coin with heads probability p until the first heads.

p(x) = p (1 - p)^(x-1)

X ~ Poisson(λ) (where λ > 0): a probability distribution over the nonnegative integers used for modeling the frequency of rare events.

p(x) = e^(-λ) λ^x / x!

Continuous random variables

X ~ Uniform(a, b) (where a < b): equal probability density to every value between a and b on the real line.

f(x) = 1/(b - a) if a ≤ x ≤ b; 0 otherwise
X ~ Exponential(λ) (where λ > 0): decaying probability density over the nonnegative reals.

f(x) = λ e^(-λx) if x ≥ 0; 0 otherwise

X ~ Normal(μ, σ²): also known as the Gaussian distribution.

f(x) = (1/√(2πσ²)) e^(-(x-μ)²/(2σ²))
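The stated means and variances (e.g. 1/λ and 1/λ² for the exponential) can be checked empirically by sampling; the parameter values below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0
n = 200_000

# Exponential(lam): mean 1/lam, variance 1/lam^2
x = rng.exponential(scale=1 / lam, size=n)
assert abs(x.mean() - 1 / lam) < 0.01
assert abs(x.var() - 1 / lam**2) < 0.01

# Normal(mu, sigma^2): mean mu, variance sigma^2
y = rng.normal(loc=3.0, scale=0.5, size=n)
assert abs(y.mean() - 3.0) < 0.01
assert abs(y.var() - 0.25) < 0.01
```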
Figure 2: PDF and CDF of a couple of random variables.
3 Two random variables
Thus far, we have considered single random variables. In many situations, however, there may be more than one quantity that we are interested in knowing during a random experiment. For instance, in an experiment where we flip a coin ten times, we may care about both X(ω) = the number of heads that come up as well as Y(ω) = the length of the longest run of consecutive heads. In this section, we consider the setting of two random variables.
3.1 Joint and marginal distributions
Suppose that we have two random variables X and Y . One way to work with these two random variables is to consider each of them separately. If we do that we will only need FX (x) and FY (y). But if we want to know about the values that X and Y assume simultaneously during outcomes of a random experiment, we require a more complicated structure known as the joint cumulative distribution function of X and Y , defined by
FXY(x, y) = P(X ≤ x, Y ≤ y)
It can be shown that by knowing the joint cumulative distribution function, the probability of any event involving X and Y can be calculated.
The joint CDF FXY(x, y) and the marginal distribution functions FX(x) and FY(y) of each variable separately are related by

FX(x) = lim_{y→∞} FXY(x, y)
FY(y) = lim_{x→∞} FXY(x, y).

Here, we call FX(x) and FY(y) the marginal cumulative distribution functions of FXY(x, y).
Properties:
- 0 ≤ FXY(x, y) ≤ 1.
- lim_{x,y→∞} FXY(x, y) = 1.
- lim_{x,y→-∞} FXY(x, y) = 0.
- FX(x) = lim_{y→∞} FXY(x, y).
3.2 Joint and marginal probability mass functions
If X and Y are discrete random variables, then the joint probability mass function pXY : R × R → [0, 1] is defined by

pXY(x, y) = P(X = x, Y = y).

Here, 0 ≤ pXY(x, y) ≤ 1 for all x, y, and Σ_{x ∈ Val(X)} Σ_{y ∈ Val(Y)} pXY(x, y) = 1.
How does the joint PMF over two variables relate to the probability mass function for each variable separately? It turns out that
pX(x) = Σ_y pXY(x, y)
and similarly for pY (y). In this case, we refer to pX (x) as the marginal probability mass function of X . In statistics, the process of forming the marginal distribution with respect to one variable by summing out the other variable is often known as marginalization.
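Marginalization is just a row or column sum of the joint PMF table; the joint distribution below is made up for illustration:

```python
import numpy as np

# A made-up joint PMF p_XY with Val(X) = {0, 1} (rows), Val(Y) = {0, 1, 2} (columns)
p_xy = np.array([[0.10, 0.20, 0.10],
                 [0.25, 0.15, 0.20]])
assert np.isclose(p_xy.sum(), 1.0)   # a valid joint PMF sums to one

# Marginalization: sum out the other variable
p_x = p_xy.sum(axis=1)   # p_X(x) = sum over y of p_XY(x, y)
p_y = p_xy.sum(axis=0)   # p_Y(y) = sum over x of p_XY(x, y)
```

Each marginal is itself a valid PMF: nonnegative entries summing to one.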
3.3 Joint and marginal probability density functions
Let X and Y be two continuous random variables with joint distribution function FX Y . In the case that FX Y (x, y) is everywhere differentiable in both x and y, then we can define the joint probability density function,
fXY(x, y) = ∂²FXY(x, y) / (∂x ∂y).
Like in the single-dimensional case, fXY(x, y) ≠ P(X = x, Y = y), but rather

∫∫_{(x,y) ∈ A} fXY(x, y) dx dy = P((X, Y) ∈ A).

Note that the values of the probability density function fXY(x, y) are always nonnegative, but they may be greater than 1. Nonetheless, it must be the case that ∫_{-∞}^{∞} ∫_{-∞}^{∞} fXY(x, y) dx dy = 1.
Analogous to the discrete case, we define

fX(x) = ∫_{-∞}^{∞} fXY(x, y) dy

as the marginal probability density function (or marginal density) of X, and similarly for fY(y).
3.4 Conditional distributions
Conditional distributions seek to answer the question: what is the probability distribution over Y, when we know that X must take on a certain value x? In the discrete case, the conditional probability mass function of Y given X is simply

pY|X(y|x) = pXY(x, y) / pX(x),

assuming that pX(x) ≠ 0.
In the continuous case, the situation is technically a little more complicated because the probability that a continuous random variable X takes on a specific value x is equal to zero. Ignoring this technical point, we simply define, by analogy to the discrete case, the conditional probability density of Y given X = x to be

fY|X(y|x) = fXY(x, y) / fX(x),

provided fX(x) ≠ 0.
3.5 Bayes's rule
A useful formula that often arises when trying to derive the expression for the conditional probability of one variable given another is Bayes's rule.
In the case of discrete random variables X and Y,

pY|X(y|x) = pXY(x, y) / pX(x) = pX|Y(x|y) pY(y) / Σ_{y' ∈ Val(Y)} pX|Y(x|y') pY(y').
If the random variables X and Y are continuous,

fY|X(y|x) = fXY(x, y) / fX(x) = fX|Y(x|y) fY(y) / ∫_{-∞}^{∞} fX|Y(x|y') fY(y') dy'.
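A small numeric instance of the discrete form of Bayes's rule; the prior and likelihood values are invented for illustration:

```python
# Invented numbers: Y is a rare binary condition, X = 1 is a "positive" test result
p_y = {0: 0.99, 1: 0.01}            # prior P(Y = y)
p_pos_given_y = {0: 0.05, 1: 0.90}  # likelihood P(X = 1 | Y = y)

# Bayes' rule:
# P(Y=1 | X=1) = P(X=1 | Y=1) P(Y=1) / sum over y' of P(X=1 | Y=y') P(Y=y')
numerator = p_pos_given_y[1] * p_y[1]
evidence = sum(p_pos_given_y[y] * p_y[y] for y in p_y)
posterior = numerator / evidence
```

The denominator is exactly the Law of Total Probability applied to the event X = 1; here the posterior comes out to roughly 0.15, small despite the strong likelihood, because the prior is small.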
3.6 Independence
Two random variables X and Y are independent if FXY(x, y) = FX(x) FY(y) for all values of x and y. Equivalently,
For discrete random variables, pXY(x, y) = pX(x) pY(y) for all x ∈ Val(X), y ∈ Val(Y).
For discrete random variables, pY|X(y|x) = pY(y) whenever pX(x) ≠ 0, for all y ∈ Val(Y).
For continuous random variables, fXY(x, y) = fX(x) fY(y) for all x, y ∈ R.
For continuous random variables, fY|X(y|x) = fY(y) whenever fX(x) ≠ 0, for all y ∈ R.
To get around this, a more reasonable way to calculate the conditional CDF is

FY|X(y, x) = lim_{δx→0} P(Y ≤ y | x ≤ X ≤ x + δx).

It can be easily seen that if F(x, y) is differentiable in both x and y, then

FY|X(y, x) = ∫_{-∞}^{y} fX,Y(x, α) / fX(x) dα

and therefore we define the conditional PDF of Y given X = x in the following way:

fY|X(y|x) = fXY(x, y) / fX(x).
Informally, two random variables X and Y are independent if knowing the value of one variable never has any effect on the conditional probability distribution of the other variable: you know all the information about the pair (X, Y) by just knowing fX(x) and fY(y). The following lemma formalizes this observation:
Lemma 3.1. If X and Y are independent, then for any subsets A, B ⊆ R, we have

P(X ∈ A, Y ∈ B) = P(X ∈ A) P(Y ∈ B).
By using the above lemma one can prove that if X is independent of Y then any function of X is independent of any function of Y .
3.7 Expectation and covariance
Suppose that we have two discrete random variables X, Y and g : R² → R is a function of these two random variables. Then the expected value of g is defined in the following way:

E[g(X, Y)] := Σ_{x ∈ Val(X)} Σ_{y ∈ Val(Y)} g(x, y) pXY(x, y).

For continuous random variables X, Y, the analogous expression is

E[g(X, Y)] = ∫_{-∞}^{∞} ∫_{-∞}^{∞} g(x, y) fXY(x, y) dx dy.
We can use the concept of expectation to study the relationship of two random variables with each other. In particular, the covariance of two random variables X and Y is defined as
Cov[X, Y] := E[(X - E[X])(Y - E[Y])]

Using an argument similar to that for variance, we can rewrite this as

Cov[X, Y] = E[(X - E[X])(Y - E[Y])]
= E[XY - X E[Y] - Y E[X] + E[X] E[Y]]
= E[XY] - E[X] E[Y] - E[Y] E[X] + E[X] E[Y]
= E[XY] - E[X] E[Y].

Here, the key step in showing the equality of the two forms of covariance is in the third equality, where we use the fact that E[X] and E[Y] are actually constants which can be pulled out of the expectation. When Cov[X, Y] = 0, we say that X and Y are uncorrelated.
Properties:
- (Linearity of expectation) E[f(X, Y) + g(X, Y)] = E[f(X, Y)] + E[g(X, Y)].
- Var[X + Y] = Var[X] + Var[Y] + 2 Cov[X, Y].
- If X and Y are independent, then Cov[X, Y] = 0.
- If X and Y are independent, then E[f(X) g(Y)] = E[f(X)] E[g(Y)].
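The identity Var[X + Y] = Var[X] + Var[Y] + 2 Cov[X, Y] also holds exactly for 1/n sample moments, which makes it easy to check numerically; the data below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)   # correlated with x by construction

cov = np.mean((x - x.mean()) * (y - y.mean()))   # sample Cov[X, Y]
lhs = np.var(x + y)
rhs = np.var(x) + np.var(y) + 2 * cov

# Var[X + Y] = Var[X] + Var[Y] + 2 Cov[X, Y]
assert abs(lhs - rhs) < 1e-8
```

Since y was built with a positive dependence on x, the sample covariance comes out close to 0.5, not zero.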
4 Multiple random variables
The notions and ideas introduced in the previous section can be generalized to more than two random variables. In particular, suppose that we have n continuous random variables, X1 (), X2 (), . . . Xn (). In this section, for simplicity of presentation, we focus only on the continuous case, but the generalization to discrete random variables works similarly.
4.1 Basic properties
We can define the joint distribution function of X1, X2, ..., Xn, the joint probability density function of X1, X2, ..., Xn, the marginal probability density function of X1, and the conditional probability density function of X1 given X2, ..., Xn, as
FX1,...,Xn(x1, x2, ..., xn) = P(X1 ≤ x1, X2 ≤ x2, ..., Xn ≤ xn)

fX1,...,Xn(x1, x2, ..., xn) = ∂ⁿFX1,...,Xn(x1, x2, ..., xn) / (∂x1 ... ∂xn)

fX1(x1) = ∫_{-∞}^{∞} ... ∫_{-∞}^{∞} fX1,...,Xn(x1, x2, ..., xn) dx2 ... dxn

fX1|X2,...,Xn(x1|x2, ..., xn) = fX1,...,Xn(x1, x2, ..., xn) / fX2,...,Xn(x2, ..., xn)

To calculate the probability of an event A ⊆ Rⁿ we have

P((x1, x2, ..., xn) ∈ A) = ∫_{(x1,...,xn) ∈ A} fX1,...,Xn(x1, x2, ..., xn) dx1 dx2 ... dxn (4)
Chain rule: From the definition of conditional probabilities for multiple random variables, one can show that
f(x1, x2, ..., xn) = f(xn|x1, x2, ..., xn-1) f(x1, x2, ..., xn-1)
= f(xn|x1, x2, ..., xn-1) f(xn-1|x1, x2, ..., xn-2) f(x1, x2, ..., xn-2)
= ... = f(x1) Π_{i=2}^{n} f(xi|x1, ..., xi-1).
Independence: For multiple events A1, ..., Ak, we say that A1, ..., Ak are mutually independent if for any subset S ⊆ {1, 2, ..., k}, we have

P(∩_{i ∈ S} Ai) = Π_{i ∈ S} P(Ai).

Likewise, we say that random variables X1, ..., Xn are independent if

f(x1, ..., xn) = f(x1) f(x2) ... f(xn).
Here, the definition of mutual independence is simply the natural generalization of independence of two random variables to multiple random variables.
Independent random variables arise often in machine learning algorithms where we assume that the training examples belonging to the training set represent independent samples from some unknown probability distribution. To make the significance of independence clear, consider a bad training set in which we first sample a single training example (x(1), y(1)) from some unknown distribution, and then add m - 1 copies of the exact same training example to the training set. In this case, we have (with some abuse of notation)

P((x(1), y(1)), ..., (x(m), y(m))) ≠ Π_{i=1}^{m} P(x(i), y(i)).
Despite the fact that the training set has size m, the examples are not independent! While clearly the procedure described here is not a sensible method for building a training set for a machine learning algorithm, it turns out that in practice, non-independence of samples does come up often, and it has the effect of reducing the effective size of the training set.
4.2 Random vectors
Suppose that we have n random variables. When working with all these random variables together, we will often find it convenient to put them in a vector X = [X1 X2 ... Xn]^T. We call the resulting vector a random vector (more formally, a random vector is a mapping from Ω to Rⁿ). It should be clear that random vectors are simply an alternative notation for dealing with n random variables, so the notions of joint PDF and CDF will apply to random vectors as well.
Expectation: Consider an arbitrary function g : Rⁿ → R. The expected value of this function is defined as

E[g(X)] = ∫_{Rⁿ} g(x1, x2, ..., xn) fX1,...,Xn(x1, x2, ..., xn) dx1 dx2 ... dxn, (5)

where ∫_{Rⁿ} is n consecutive integrations from -∞ to ∞. If g is a function from Rⁿ to Rᵐ, then the expected value of g is the element-wise expected value of the output vector, i.e., if

g(x) = [g1(x), g2(x), ..., gm(x)]^T,

then

E[g(X)] = [E[g1(X)], E[g2(X)], ..., E[gm(X)]]^T.
Covariance matrix: For a given random vector X : Ω → Rⁿ, its covariance matrix Σ is the n × n square matrix whose entries are given by Σij = Cov[Xi, Xj].
From the definition of covariance, we have

Σ = [ Cov[X1, X1] ... Cov[X1, Xn]
      ...
      Cov[Xn, X1] ... Cov[Xn, Xn] ]

and, using Cov[Xi, Xj] = E[Xi Xj] - E[Xi] E[Xj] in each entry,

Σ = E[X X^T] - E[X] E[X]^T = ... = E[(X - E[X])(X - E[X])^T],

where the matrix expectation is defined in the obvious way (element-wise).
The covariance matrix has a number of useful properties:
- Σ ⪰ 0; that is, Σ is positive semidefinite.
- Σ = Σ^T; that is, Σ is symmetric.
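Both the closed form Σ = E[XX^T] - E[X]E[X]^T and the two properties above can be checked on synthetic samples:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(50_000, 3))   # 50,000 draws of a 3-dimensional random vector
X[:, 1] += 0.8 * X[:, 0]           # introduce correlation between components 0 and 1

mu = X.mean(axis=0)
# Sample estimate of Sigma = E[X X^T] - E[X] E[X]^T
sigma = (X.T @ X) / len(X) - np.outer(mu, mu)

assert np.allclose(sigma, sigma.T)                  # Sigma is symmetric
assert np.all(np.linalg.eigvalsh(sigma) >= -1e-10)  # and positive semidefinite
```

The off-diagonal entry sigma[0, 1] estimates Cov[X0, X1], which by construction is about 0.8.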
4.3 The multivariate Gaussian distribution
One particularly important example of a probability distribution over random vectors X is called
the multivariate Gaussian or multivariate normal distribution. A random vector X e Rn is said to have a multivariate normal (or Gaussian) distribution with mean e Rn and covariance matrix e Sn (where S
n refers to the space of symmetric positive definite fl fl matrices)
1 fX1 ,X2 ,...,Xn (x1 , x2 , . . . , xn ; , ) = (2)n/2
||
exp 1/2
1 T - 2
(x - )
1
(x - ) .
We write this as X ~ N(μ, Σ). Notice that in the case n = 1, this reduces to the regular definition of a normal distribution with mean parameter μ1 and variance Σ11.
Generally speaking, Gaussian random variables are extremely useful in machine learning and statistics for
two main reasons. First, they are extremely common when modeling noise in statistical algorithms. Quite
often, noise can be considered to be the accumulation of a large number of small independent random
perturbations affecting the measurement process; by the Central Limit Theorem, summations of independent
random variables will tend to look Gaussian. Second, Gaussian random variables are convenient for many
analytical manipulations, because many of the integrals involving Gaussian distributions that arise in practice
have simple closed form solutions.
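The density above can be evaluated directly; for n = 1 it should agree with the univariate normal PDF. The test point and parameters below are arbitrary:

```python
import numpy as np

def mvn_pdf(x, mu, sigma):
    """Density of N(mu, Sigma) evaluated at the point x."""
    n = len(mu)
    diff = x - mu
    norm = (2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(sigma))
    return np.exp(-0.5 * diff @ np.linalg.solve(sigma, diff)) / norm

# For n = 1 the formula reduces to the familiar univariate normal density
x, mu, var = 1.2, 0.5, 2.0
univariate = np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
multivariate = mvn_pdf(np.array([x]), np.array([mu]), np.array([[var]]))
assert np.isclose(univariate, multivariate)
```

Using `np.linalg.solve` rather than an explicit matrix inverse is the usual numerically stabler way to form the quadratic term (x - μ)^T Σ⁻¹ (x - μ).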
GAUSSIAN PROCESS:
In probability theory and statistics, a Gaussian process is a stochastic process whose realizations
consist of random values associated with every point in a range of times (or of space) such that each
such random variable has a normal distribution. Moreover, every finite collection of those random variables
has a multivariate normal distribution.
Gaussian processes are important in statistical modeling because of properties inherited from the normal
distribution. For example, if a random process is modeled as a Gaussian process, the distributions of various
derived quantities can be obtained explicitly. Such quantities include: the average value of the process over a
range of times; the error in estimating the average using sample values at a small set of times.
A process is Gaussian if and only if, for every finite set of indices t1, ..., tk in the index set T, (X_{t1}, ..., X_{tk}) is a vector-valued Gaussian random variable. Using characteristic functions of random variables, the Gaussian property can be formulated as follows: {X_t ; t ∈ T} is Gaussian if and only if, for every finite set of indices t1, ..., tk, there are reals σ_{lj} with σ_{ll} > 0 and reals μ_j such that

E[ exp( i Σ_l θ_l X_{t_l} ) ] = exp( −(1/2) Σ_{l,j} σ_{lj} θ_l θ_j + i Σ_j μ_j θ_j )

for all reals θ_1, ..., θ_k. The numbers σ_{lj} and μ_j can be shown to be the covariances and means of the variables in the process.
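Because every finite collection (X_{t1}, ..., X_{tk}) is jointly Gaussian, a Gaussian process can be sampled at finitely many times by drawing a single multivariate normal vector. The sketch below uses a squared-exponential covariance function purely as an illustrative choice; it is not the only valid kernel:

```python
import numpy as np

def sq_exp_kernel(t1, t2, length=1.0):
    """Squared-exponential covariance, an illustrative (assumed) choice."""
    return np.exp(-0.5 * ((t1 - t2) / length) ** 2)

t = np.linspace(0.0, 5.0, 50)                 # a finite set of indices t1..t50
K = sq_exp_kernel(t[:, None], t[None, :])     # covariance of (X_t1, ..., X_t50)
K += 1e-9 * np.eye(len(t))                    # tiny jitter for numerical stability

rng = np.random.default_rng(1)
# The finite collection is multivariate normal, so one draw gives a sample path:
path = rng.multivariate_normal(mean=np.zeros(len(t)), cov=K)
assert path.shape == (50,)
```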
NOISE:
In common use, the word noise means any unwanted sound. In both analog and digital
electronics, noise is an unwanted perturbation to a wanted signal; it is called noise as a generalization of the
audible noise heard when listening to a weak radio transmission. Signal noise is heard as acoustic noise if
played through a loudspeaker; it manifests as 'snow' on a television or video image. Noise can block, distort,
change or interfere with the meaning of a message in human, animal and electronic communication.
In signal processing or computing it can be considered unwanted data without meaning; that is, data
that is not being used to transmit a signal, but is simply produced as an unwanted by-product of other activities.
"Signal-to-noise ratio" is sometimes used informally to refer to the ratio of useful information to false or
irrelevant data in a conversation or exchange, such as off-topic posts and spam in online discussion forums and
other online communities. In information theory, however, noise is still considered to be information. In a
broader sense, film grain or even advertisements encountered while looking for something else can be
considered noise. In biology, noise can describe the variability of a measurement around the mean, for
example transcriptional noise describes the variability in gene activity between cells in a population.
In many of these areas, the special case of thermal noise arises, which sets a fundamental lower limit
to what can be measured or signaled and is related to basic physical processes at the molecular level described
by well-established thermodynamics considerations, some of which are expressible by simple formulae.
SHOT NOISE:
Shot noise consists of random fluctuations of the electric current in an electrical conductor, which are
caused by the fact that the current is carried by discrete charges (electrons). The strength of this noise increases
with the magnitude of the average current flowing through the conductor. Shot noise is to be distinguished
from current fluctuations in equilibrium, which happen without any applied voltage and without any average
current flowing. These equilibrium current fluctuations are known as Johnson-Nyquist noise.
Shot noise is important in electronics, telecommunication, and fundamental physics.
The strength of the current fluctuations can be expressed by the variance ⟨(ΔI)²⟩ of the current, where ΔI = I − ⟨I⟩ and
⟨I⟩ is the average ("macroscopic") current. However, the value measured in this way depends on the frequency
range of fluctuations which is measured ("bandwidth" of the measurement): The measured variance of the
current grows linearly with bandwidth. Therefore, a more fundamental quantity is the noise power, which is
essentially obtained by dividing through the bandwidth (and, therefore, has the dimension ampere squared
divided by Hertz). It may be defined as the zero-frequency Fourier transform of the current-current correlation
function.
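The text stops short of the standard quantitative result, Schottky's formula, which gives the shot-noise power spectral density as 2qI (in A²/Hz), so the RMS current fluctuation in a bandwidth Δf is sqrt(2qIΔf). A quick numerical sketch, where the current and bandwidth are assumed example values:

```python
import math

# Schottky's formula: shot-noise current PSD S_I = 2*q*I  [A^2/Hz].
# (Standard result; the current and bandwidth below are assumed example values.)
q = 1.602176634e-19   # elementary charge, C
I = 1e-3              # average DC current, A
df = 10e3             # measurement bandwidth, Hz

i_rms = math.sqrt(2 * q * I * df)   # RMS shot-noise current in bandwidth df
print(f"RMS shot-noise current: {i_rms:.2e} A")   # about 1.79e-09 A
```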
THERMAL NOISE:
Thermal noise (Johnson-Nyquist noise, Johnson noise, or Nyquist noise) is the electronic noise
generated by the thermal agitation of the charge carriers (usually the electrons) inside an electrical conductor at
equilibrium, which happens regardless of any applied voltage.
Thermal noise is approximately white, meaning that the power spectral density is nearly equal
throughout the frequency spectrum (however see the section below on extremely high frequencies).
Additionally, the amplitude of the signal has very nearly a Gaussian probability density function.
This type of noise was first measured by John B. Johnson at Bell Labs in 1928. He described his
findings to Harry Nyquist, also at Bell Labs, who was able to explain the results.
Noise voltage and power
Thermal noise is distinct from shot noise, which consists of additional current fluctuations that occur
when a voltage is applied and a macroscopic current starts to flow. For the general case, the above definition
applies to charge carriers in any type of conducting medium (e.g. ions in an electrolyte), not just resistors. It
can be modeled by a voltage source representing the noise of the non-ideal resistor in series with an ideal noise
free resistor.
The power spectral density, or voltage variance (mean square) per hertz of bandwidth, is given by

v_n² = 4 k_B T R
where kB is Boltzmann's constant in joules per kelvin, T is the resistor's absolute temperature in kelvins, and R
is the resistor value in ohms (Ω). Use this equation for quick calculation at room temperature:

sqrt(v_n²) ≈ 0.13 × sqrt(R) nV/√Hz.

For example, a 1 kΩ resistor at a temperature of 300 K has

sqrt(v_n²) ≈ 4.07 nV/√Hz.

For a given bandwidth, the root mean square (RMS) of the voltage, vn, is given by

v_n = sqrt(4 k_B T R Δf),

where Δf is the bandwidth in hertz over which the noise is measured. For a 1 kΩ resistor at room temperature
and a 10 kHz bandwidth, the RMS noise voltage is 400 nV. A useful rule of thumb to remember is that 50 Ω at
1 Hz bandwidth corresponds to 1 nV noise at room temperature.
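These figures are easy to reproduce. The sketch below evaluates the RMS formula for the 1 kΩ / 10 kHz example and for the 50 Ω rule of thumb:

```python
import math

kB = 1.380649e-23    # Boltzmann constant, J/K
T = 300.0            # room temperature, K
R = 1e3              # resistance, ohms
df = 10e3            # bandwidth, Hz

v_rms = math.sqrt(4 * kB * T * R * df)
print(f"1 kOhm, 10 kHz: {v_rms * 1e9:.0f} nV")   # 407 nV, i.e. the text's ~400 nV

# Rule of thumb: 50 ohms in 1 Hz of bandwidth gives about 1 nV.
v_50 = math.sqrt(4 * kB * T * 50.0 * 1.0)
print(f"50 Ohm, 1 Hz: {v_50 * 1e9:.2f} nV")      # about 0.91 nV
```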
A resistor in a short circuit dissipates a noise power of

P = v_n² / R = 4 k_B T Δf.
The noise generated at the resistor can transfer to the remaining circuit; the maximum noise power
transfer happens with impedance matching when the Thevenin equivalent resistance of the remaining circuit is
equal to the noise generating resistance. In this case each one of the two participating resistors dissipates noise
in both itself and in the other resistor. Since only half of the source voltage drops across any one of these
resistors, the resulting noise power is given by

P = k_B T Δf,

where P is the thermal noise power in watts. Notice that this is independent of the noise-generating resistance.
Noise current:
The noise source can also be modeled by a current source in parallel with the resistor by taking the Norton
equivalent, which corresponds simply to dividing by R. This gives the root mean square value of the current source
as:

i_n = sqrt(4 k_B T Δf / R).
Thermal noise is intrinsic to all resistors and is not a sign of poor design or manufacture, although resistors
may also have excess noise.
Noise power in decibels:
Signal power is often measured in dBm (decibels relative to 1 milliwatt, assuming a 50 ohm load).
From the equation above, noise power in a resistor at room temperature, in dBm, is then:

P_dBm = 10 log10(k_B T Δf × 1000)

where the factor of 1000 is present because the power is given in milliwatts, rather than watts. This equation
can be simplified by separating the constant parts from the bandwidth:

P_dBm = −173.8 + 10 log10(Δf)

which is more commonly seen approximated as:

P_dBm ≈ −174 + 10 log10(Δf).
Noise power at different bandwidths is then simple to calculate:
Bandwidth (Δf)   Thermal noise power   Notes
1 Hz             −174 dBm
10 Hz            −164 dBm
100 Hz           −154 dBm
1 kHz            −144 dBm
10 kHz           −134 dBm              FM channel of 2-way radio
100 kHz          −124 dBm
180 kHz          −121.45 dBm           One LTE resource block
200 kHz          −120.98 dBm           One GSM channel (ARFCN)
1 MHz            −114 dBm
2 MHz            −111 dBm              Commercial GPS channel
6 MHz            −106 dBm              Analog television channel
20 MHz           −101 dBm              WLAN 802.11 channel
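The table entries follow directly from P_dBm = 10 log10(kB T Δf × 1000). The sketch below reproduces a few rows, using T = 290 K, the conventional "room temperature" behind the −174 dBm/Hz figure:

```python
import math

kB = 1.380649e-23   # Boltzmann constant, J/K
T = 290.0           # conventional "room temperature" for the -174 dBm/Hz figure

def noise_power_dbm(bw_hz):
    # P = kB*T*bw in watts; *1000 converts to milliwatts before taking dB.
    return 10 * math.log10(kB * T * bw_hz * 1000)

# Matches the -174 / -144 / -114 / -101 dBm rows in the table (to rounding):
for bw, label in ((1.0, "1 Hz"), (1e3, "1 kHz"), (1e6, "1 MHz"), (20e6, "20 MHz")):
    print(f"{label:>7}: {noise_power_dbm(bw):7.1f} dBm")
```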
Thermal noise on capacitors:
Thermal noise on capacitors is referred to as kTC noise. Thermal noise in an RC circuit has an
unusually simple expression, as the value of the resistance (R) drops out of the equation. This is because higher
R contributes to more filtering as well as to more noise. The noise bandwidth of the RC circuit is 1/(4RC),
which can be substituted into the above formula to eliminate R. The mean-square and RMS noise voltage
generated in such a filter are:

v_n² = k_B T / C,    v_n = sqrt(k_B T / C).
Thermal noise accounts for 100% of kTC noise, whether it is attributed to the resistance or to the
capacitance.
In the extreme case of the reset noise left on a capacitor by opening an ideal switch, the resistance is
infinite, yet the formula still applies; however, now the RMS must be interpreted not as a time average, but as
an average over many such reset events, since the voltage is constant when the bandwidth is zero. In this sense,
the Johnson noise of an RC circuit can be seen to be inherent, an effect of the thermodynamic distribution of
the number of electrons on the capacitor, even without the involvement of a resistor.
The noise is not caused by the capacitor itself, but by the thermodynamic equilibrium of the amount
of charge on the capacitor. Once the capacitor is disconnected from a conducting circuit, the thermodynamic
fluctuation is frozen at a random value with standard deviation as given above.
The reset noise of capacitive sensors is often a limiting noise source, for example in image sensors.
As an alternative to the voltage noise, the reset noise on the capacitor can also be quantified as the electrical
charge standard deviation, as

Q_n = sqrt(k_B T C).

Since the charge variance is k_B T C, this noise is often called kTC noise.
Any system in thermal equilibrium has state variables with a mean energy of kT/2 per degree of
freedom. Using the formula for the energy on a capacitor (E = ½CV²), the mean noise energy on a capacitor can be
seen to also be ½C(kT/C) = kT/2. Thermal noise on a capacitor can be derived from this relationship,
without consideration of resistance.
The kTC noise is the dominant noise source at small capacitances.
Noise of capacitors at 300 K
Capacitance   Noise voltage sqrt(kT/C)   Electrons
1 fF          2 mV                       12.5 e−
10 fF         640 µV                     40 e−
100 fF        200 µV                     125 e−
1 pF          64 µV                      400 e−
10 pF         20 µV                      1250 e−
100 pF        6.4 µV                     4000 e−
1 nF          2 µV                       12500 e−
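These values follow from v_n = sqrt(kT/C) for the voltage and sqrt(kTC)/q for the equivalent number of electrons. The sketch below reproduces the 1 pF row:

```python
import math

kB = 1.380649e-23     # Boltzmann constant, J/K
T = 300.0             # K
q = 1.602176634e-19   # elementary charge, C

def ktc_voltage(C):
    return math.sqrt(kB * T / C)        # RMS reset-noise voltage, V

def ktc_electrons(C):
    return math.sqrt(kB * T * C) / q    # charge noise as a number of electrons

C = 1e-12   # 1 pF
# Gives 64 uV and about 402 electrons, matching the table's 64 uV / 400 e- row.
print(f"1 pF: {ktc_voltage(C) * 1e6:.0f} uV, {ktc_electrons(C):.0f} electrons")
```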
Noise at very high frequencies:
The above equations are good approximations at any practical radio frequency in use (i.e. frequencies
below about 80 gigahertz). In the most general case, which includes up to optical frequencies, the power
spectral density of the voltage across the resistor R, in V²/Hz, is given by:

S_v(f) = (4 R h f) / (e^(h f / (k_B T)) − 1)
where f is the frequency, h is Planck's constant, kB is the Boltzmann constant and T is the temperature in kelvins. If the
frequency is low enough, that means

h f << k_B T

(this assumption is valid until a few terahertz at room temperature), then the exponential can be expressed in
terms of its Taylor series. The relationship then becomes:

S_v(f) ≈ 4 k_B T R.
In general, both R and T depend on frequency. In order to find the total noise, it suffices to
integrate over the whole bandwidth. Since the signal is real, it is possible to integrate over only the positive
frequencies and then multiply by 2. Assuming that R and T are constant over the whole bandwidth Δf, the root
mean square (RMS) value of the voltage across a resistor due to thermal noise is given by

v_n = sqrt(4 k_B T R Δf),

that is, the same formula as above.
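The low-frequency approximation can be checked numerically against the full quantum expression; at radio frequencies and room temperature the two agree to well within a percent:

```python
import math

kB = 1.380649e-23    # Boltzmann constant, J/K
h = 6.62607015e-34   # Planck constant, J*s
T = 300.0            # K
R = 1e3              # ohms

def psd_quantum(f):
    # Full expression: 4*R*h*f / (exp(h*f/(kB*T)) - 1), in V^2/Hz
    x = h * f / (kB * T)
    return 4 * R * h * f / math.expm1(x)

def psd_classical(f):
    return 4 * kB * T * R            # low-frequency (h*f << kB*T) limit

f = 1e9                              # 1 GHz, far below the ~THz crossover
rel_err = abs(psd_quantum(f) - psd_classical(f)) / psd_classical(f)
assert rel_err < 1e-3                # the approximation is excellent here
```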
WHITE NOISE:
White noise is a random signal (or process) with a flat power spectral density. In other words, the
signal contains equal power within a fixed bandwidth at any center frequency. White noise draws its name
from white light in which the power spectral density of the light is distributed over the visible band in such a
way that the eye's three color receptors (cones) are approximately equally stimulated. In a statistical sense, a
time series rt is called white noise if {rt} is a sequence of independent and identically distributed (iid) random
variables with finite mean and variance. In particular, if rt is normally distributed with mean zero and variance
σ², the series is called a Gaussian white noise.
An infinite-bandwidth white noise signal is a purely theoretical construction. The bandwidth of white noise is
limited in practice by the mechanism of noise generation, by the transmission medium and by finite
observation capabilities. A random signal is considered "white noise" if it is observed to have a flat spectrum
over a medium's widest possible bandwidth.
WHITE NOISE IN A SPATIAL CONTEXT:
While it is usually applied in the context of frequency domain signals, the term white noise is also
commonly applied to a noise signal in the spatial domain. In this case, it has an auto correlation which can be
represented by a delta function over the relevant space dimensions. The signal is then "white" in the spatial
frequency domain (this is equally true for signals in the angular frequency domain, e.g., the distribution of a
signal across all angles in the night sky).
STATISTICAL PROPERTIES:
Being uncorrelated in time does not restrict the values a signal can take. Any distribution of values is
possible (although it must have a zero DC component). Even a binary signal which can only take on the values
1 or -1 will be white if the sequence is statistically uncorrelated. Noise having a continuous distribution, such
as a normal distribution, can of course be white.
It is often incorrectly assumed that Gaussian noise (i.e., noise with a Gaussian amplitude distribution; see
normal distribution) is necessarily white noise, yet neither property implies the other. Gaussianity refers to the
probability distribution with respect to the value i.e. the probability that the signal has a certain given value,
while the term 'white' refers to the way the signal power is distributed over time or among frequencies.
We can therefore find Gaussian white noise, but also Poisson, Cauchy, etc. white noises. Thus, the
two words "Gaussian" and "white" are often both specified in mathematical models of systems. Gaussian white
noise is a good approximation of many real-world situations and generates mathematically tractable models.
These models are used so frequently that the term additive white Gaussian noise has a standard abbreviation:
AWGN. Gaussian white noise has the useful statistical property that its values are independent (see Statistical
independence).
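Both properties can be illustrated with a short simulation: samples drawn i.i.d. from a normal distribution are Gaussian by construction, and their sample autocorrelation is large at lag 0 and near zero at every other lag, which is the "white" part:

```python
import numpy as np

rng = np.random.default_rng(42)
N = 100_000
w = rng.normal(0.0, 1.0, N)      # i.i.d. Gaussian samples: Gaussian white noise

# "White" = uncorrelated in time: autocorrelation ~ variance at lag 0, ~0 elsewhere.
r0 = np.mean(w * w)              # lag-0 autocorrelation, close to the variance 1
r5 = np.mean(w[:-5] * w[5:])     # lag-5 autocorrelation, close to 0
assert abs(r0 - 1.0) < 0.05
assert abs(r5) < 0.05
```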
White noise is the generalized mean-square derivative of the Wiener process or Brownian motion.
APPLICATIONS: