
    Chapter 4

    Signal Design Trade-Offs

    4.1 Introduction

In Chapters 2 and 3 we have focused on the receiver, assuming that the signal set was given to us. In this chapter we introduce the signal design. We have three main goals in mind: (i) introduce the design parameters we care mostly about; (ii) sharpen our intuition about the role played by the dimensions of the signal space as we increase the number of bits to be transmitted; and (iii) define the signal design strategy to be pursued in the next two chapters. We will also discuss isometries that may be applied to the signal set to vary some design parameters while keeping other parameters fixed, notably the error probability. The continuous-time AWGN channel model is assumed.

4.2 Design Parameters

The problem of choosing a convenient signal constellation is not as clean-cut as the receiver design problem. The reason is that the receiver design problem has a clear objective, to minimize the error probability, and one solution, namely the MAP rule. In contrast, when we choose a signal constellation we make tradeoffs among conflicting objectives. The design parameters and the performance measures we are mostly concerned with are:

The cardinality m of the message set H. Since in most cases the message consists of bits, typically we choose m to be a power of 2. Whether m is a power of 2 or not, we say that a message is worth k = log2 m bits.

The implementation cost and computational complexity. To keep the discussion as simple as possible, we continue to assume that the cost is determined by the number of matched filters in the n-tuple former and the complexity is that of the decoder.


First Zero-Crossing Bandwidth: The first zero-crossing bandwidth, if it exists, is that W for which |hF(f)| is positive in the interior of I = [−W/2, W/2] and vanishes on the boundary of I.

Equivalent Noise Bandwidth: It is W if ∫ |hF(f)|² df = W |hF(0)|². This bandwidth name comes from the fact that if we feed with white noise a filter of impulse response h(t) and we feed with the same input an ideal lowpass filter of frequency response |hF(0)| 1[−W/2, W/2](f), then the output power is the same in both situations.

Root-Mean Square (RMS) Bandwidth: It is defined as

W = [ ∫ f² |hF(f)|² df / ∫ |hF(f)|² df ]^(1/2).

To understand this definition, notice that the function g(f) := |hF(f)|² / ∫ |hF(f)|² df is non-negative, even, and integrates to 1. Hence it is the density of some zero-mean random variable and W = ( ∫ f² g(f) df )^(1/2) is the standard deviation of that random variable.

The reader should be aware that some authors define the bandwidth by considering only positive frequencies. Since we have assumed that |hF(f)| is an even function, the value they obtain is exactly half the value obtained by considering the entire frequency axis. The definition we have used easily extends to the cases where |hF(f)| is not even, and it is more useful in answering some fundamental questions, see in particular Section 4.7. The other, single-sided, definition makes sense for real-valued functions and it is somewhat more useful for passband signals (see Chapter 7). We use them both, reserving the letter B for single-sided bandwidths of real-valued signals.
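As a numerical illustration of these definitions, the following Python sketch (with RC = 1 chosen arbitrarily) computes the equivalent noise bandwidth of the RC lowpass filter and the RMS bandwidth of the Gaussian pulse h(t) = exp(−πt²), and compares them with the closed-form values 1/(2RC) and 1/√(4π).

    import numpy as np
    from scipy.integrate import quad

    # Equivalent noise bandwidth of the RC lowpass filter, defined by
    # integral of |hF(f)|^2 df = W * |hF(0)|^2.  Expected value: 1/(2*RC).
    RC = 1.0                                       # arbitrary choice
    H2 = lambda f: 1.0 / (1.0 + (2 * np.pi * RC * f) ** 2)
    total, _ = quad(H2, -np.inf, np.inf)
    print(total / H2(0.0), 1 / (2 * RC))           # both ~0.5

    # RMS bandwidth of the Gaussian pulse h(t) = exp(-pi t^2), hF(f) = exp(-pi f^2).
    G2 = lambda f: np.exp(-2 * np.pi * f ** 2)     # |hF(f)|^2
    num, _ = quad(lambda f: f ** 2 * G2(f), -np.inf, np.inf)
    den, _ = quad(G2, -np.inf, np.inf)
    print(np.sqrt(num / den), 1 / np.sqrt(4 * np.pi))   # both ~0.282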

    4.4 Isometric Transformations Applied to the Codebook

If the channel is AWGN and the receiver implements a MAP rule, the error probability is completely determined by the codebook C = {c0, . . . , cm−1}. The purpose of this section is to identify transformations to the codebook that do not affect the error probability. What we do generalizes to complex-valued codebooks and complex-valued noise. However, since we are not ready to discuss complex-valued random variables, for the moment we assume that the codebook and the noise are real-valued.

From the geometrical intuition gained in Chapter 2, it should be clear to the reader that the probability of error remains the same if a given codebook and the corresponding decoding regions are translated by the same n-tuple b ∈ Rⁿ.

A translation is a particular instance of an isometry. An isometry is a distance-preserving transformation. Formally, given an inner product space V, a: V → V is an isometry if and only if for any α ∈ V and β ∈ V, the distance between α and β equals that between a(α) and a(β). The following example gives three isometries applied to a codebook.


Example 61. Figure 4.1 shows an original codebook C = {c0, c1, c2, c3} and three variations obtained by applying to C a reflection, a rotation, and a translation, respectively. In each case the isometry a: Rⁿ → Rⁿ sends ci to c̃i = a(ci).

[Figure 4.1: Isometries. (a) Original codebook C; (b) reflected codebook; (c) rotated codebook; (d) translated codebook.]
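The following short Python sketch, in the spirit of Figure 4.1, applies a rotation, a reflection, and a translation to a small codebook and checks that all pairwise distances are preserved; the particular codebook, angle, and translation vector are arbitrary choices made for the illustration.

    import numpy as np

    # Codebook with four codewords in R^2 (arbitrary, in the spirit of Figure 4.1).
    C = np.array([[1.0, 1.0], [-1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]])

    def pairwise_distances(points):
        diff = points[:, None, :] - points[None, :, :]
        return np.linalg.norm(diff, axis=-1)

    theta = 0.3                                      # arbitrary rotation angle
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])  # rotation
    F = np.array([[1.0, 0.0], [0.0, -1.0]])          # reflection about the first axis
    b = np.array([2.0, -0.5])                        # translation vector

    for name, mapped in [("rotation", C @ R.T), ("reflection", C @ F.T), ("translation", C + b)]:
        ok = np.allclose(pairwise_distances(C), pairwise_distances(mapped))
        print(name, "preserves all pairwise distances:", ok)   # True for all three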

Next we give a formal proof that if we apply an isometry to a codebook and its decoding regions, then the error probability associated to the new codebook and the new regions is the same as that of the original codebook and original regions. Let

g(ξ) = (2πσ²)^(−n/2) exp( −ξ²/(2σ²) ),  ξ ∈ R,

so that for Z ∼ N(0, σ²In) we can write fZ(z) = g(‖z‖). Then for any codebook


C = {c0, . . . , cm−1}, decoding regions R0, . . . , Rm−1, and isometry a: Rⁿ → Rⁿ we have

Pc(i) = Pr{Y ∈ Ri | codeword ci is transmitted}
      = ∫_{y ∈ Ri} g(‖y − ci‖) dy
  (a) = ∫_{y ∈ Ri} g(‖a(y) − a(ci)‖) dy
  (b) = ∫_{a(y) ∈ a(Ri)} g(‖a(y) − a(ci)‖) dy
  (c) = ∫_{α ∈ a(Ri)} g(‖α − a(ci)‖) dα
      = Pr{Y ∈ a(Ri) | codeword a(ci) is transmitted},

where in (a) we use the distance-preserving property of an isometry, in (b) we use the fact that y ∈ Ri if and only if a(y) ∈ a(Ri), and in (c) we make the change of variable α = a(y) and use the fact that the Jacobian of an isometry is 1. The last line is the probability of decoding correctly when the transmitter sends a(ci) and the corresponding decoding region is a(Ri).

One can show that all isometries in Rⁿ are obtained from the composition of translation, rotation, and reflection. If we apply a rotation or a reflection to an n-tuple, we do not change its norm. Hence reflections and rotations applied to a signal set do not change the average energy, but translations generally do. In the next section, we determine the translation that minimizes the average energy.

    4.5 The Energy-Minimizing Translation

We keep assuming that vectors, scalars, and random variables are defined over the reals. Generalization to the complex-valued counterparts is straightforward. But for this, we first need to introduce complex-valued random variables (Appendix 7.B).

Let Y be a zero-mean random vector in Rⁿ. For any b ∈ Rⁿ,

E‖Y + b‖² = E‖Y‖² + ‖b‖² + 2E⟨Y, b⟩ = E‖Y‖² + ‖b‖² ≥ E‖Y‖²,

with equality if and only if b = 0. An arbitrary (not necessarily zero-mean) random vector Y ∈ Rⁿ can be written as Y = Ỹ + m, where m = E[Y] and Ỹ = Y − m is zero-mean. The above inequality can then be restated as

E‖Y − b‖² ≥ E‖Ỹ‖²,

with equality if and only if b = m.

We apply the above to a codebook C = {c0, . . . , cm−1}. If we let Y be the random variable that takes value ci with probability PH(i), then we see that the average energy


E = E[‖Y‖²] can be decreased by a translation if and only if the mean m = E[Y] = Σi PH(i) ci is non-zero. If it is non-zero, then the translated constellation C̃ = {c̃0, . . . , c̃m−1}, where c̃i = ci − m, will achieve the minimum energy among all possible translated versions of C. The average energy associated to the translated constellation is Ẽ = E − ‖m‖². If S = {w0(t), . . . , wm−1(t)} is the set of waveforms linked to C via some orthonormal basis, then through the same basis c̃i will be associated to w̃i(t) = wi(t) − m(t), where m(t) = Σi PH(i) wi(t). An example follows.

Example 62. Let w0(t) and w1(t) be rectangular pulses with support [0, T] and [T, 2T], respectively, as shown on the left of Figure 4.2(a). Assuming that PH(0) = PH(1) = 1/2, we calculate the average m(t) = (1/2)w0(t) + (1/2)w1(t) and see that it is non-zero (center waveform). Hence we can save energy by using the new signal set defined by w̃i(t) = wi(t) − m(t), i = 0, 1 (right). In Figure 4.2(b) we see the same idea, but acting on the codewords c0 and c1 obtained through the orthonormal basis ψi(t) = w_{i−1}(t)/‖w_{i−1}‖, i = 1, 2. As we see from the figures, w̃0(t) and w̃1(t) are antipodal signals. This is not a coincidence: after we remove the mean, the two signals become the negative of each other. This is best seen from the codeword viewpoint of Figure 4.2(b).

[Figure 4.2: Energy minimization by translation, shown from the waveform viewpoint (w0(t), w1(t), the mean, and the translated waveforms) and from the codeword viewpoint (c0, c1 and their translated counterparts).]
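A small numerical sketch of the mean-removal translation, using illustrative codewords c0 = (1, 0) and c1 = (0, 1) as would arise in Example 62 for unit-norm pulses: it verifies that the translated codewords are antipodal and that the energy saving equals ‖m‖².

    import numpy as np

    # Codewords as in Example 62, via the basis psi_i = w_{i-1}/||w_{i-1}||,
    # assuming unit-norm pulses so that c0 = (1, 0), c1 = (0, 1) (illustrative values).
    C = np.array([[1.0, 0.0], [0.0, 1.0]])
    p = np.array([0.5, 0.5])                       # P_H(0) = P_H(1) = 1/2

    m = p @ C                                      # mean codeword, sum_i P_H(i) c_i
    C_tilde = C - m                                # energy-minimizing translation

    E = float(np.sum(p * np.sum(C ** 2, axis=1)))              # average energy before
    E_tilde = float(np.sum(p * np.sum(C_tilde ** 2, axis=1)))  # after the translation

    print(C_tilde)                                 # antipodal: second row is the negative of the first
    print(E - E_tilde, float(np.sum(m ** 2)))      # energy saving equals ||m||^2 = 0.5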


4.6 Isometric Transformations Applied to the Waveform Set

The definition of isometry is based on the notion of distance, which is defined in every inner product space: the distance between α ∈ V and β ∈ V is the norm ‖α − β‖. Let V be the inner product space spanned by a signal set W = {w0(t), . . . , wm−1(t)} and let a: V → V be an isometry. If we apply this isometry to W, we obtain a new signal set W̃ = {w̃0(t), . . . , w̃m−1(t)} ⊂ V. Let B = {ψ1(t), . . . , ψn(t)} be an orthonormal basis for V and let C = {c0, . . . , cm−1} be the codebook associated to W via B. Could we have obtained W̃ by applying some isometry to the codebook C?

Yes we could. Through B, we obtain the codebook C̃ = {c̃0, . . . , c̃m−1} associated to W̃. Through the composition that sends ci → wi(t) → w̃i(t) → c̃i, we obtain a map from C to C̃. It is easy to see that this map is an isometry of the kind considered in Section 4.4.

Are there other kinds of isometries applied to W that cannot be obtained simply by applying an isometry to C? Yes there are. The easiest way to see this is to keep the codebook the same and substitute the original orthonormal basis B = {ψ1(t), . . . , ψn(t)} with some other orthonormal basis B̃ = {ψ̃1(t), . . . , ψ̃n(t)}. In so doing, we obtain an isometry from V to some other subspace Ṽ of the set of finite energy signals. (See Exercise 4 for more insight on this.)

The new signal set W̃ might not bear any resemblance to W, yet the resulting error probability will be identical since the codebook is unchanged. This sort of transformation is implicit in Example 60 of Section 3.4.

    4.7 Time Bandwidth Product versus Dimensionality

If we substitute an orthonormal basis {ψ1(t), . . . , ψn(t)} with the related orthonormal basis {ψ̃1(t), . . . , ψ̃n(t)} obtained via the relationship ψ̃i(t) = √b ψi(bt) for some b ≥ 1, i = 1, . . . , n, then all signals are time-compressed and frequency-expanded by the same factor b. Regardless of how we define the duration T and the bandwidth W, this example suggests that we can increase one of the two at the expense of the other while keeping WT constant. Is there a minimum to WT for a fixed dimensionality?

There is indeed a fundamental relationship between the dimensionality n and the product WT. Formulating this relationship precisely is tricky because the details depend on how we define duration and bandwidth. As it turns out, any reasonable definition leads to the same conclusion, specifically that the dimensionality of a set of time- and frequency-limited signals grows linearly with WT when WT is large. Perhaps the cleanest formulation is


the one presented by David Slepian in his Shannon Lecture¹ [8]. Hereafter we summarize his main result without proof.

Formulating Slepian's view on the relationship between n and WT requires a worthwhile philosophical digression about mathematical model versus reality. When we say that a real-world signal h(t) is time-limited to (−T/2, T/2), we can at best mean that if we measure it we cannot tell the difference between h(t) and h(t) 1[−T/2, T/2](t). Our limited ability to tell the difference between a signal and another could be due to a number of things, including the facts that the instrument we use to make measurements is made of wires that filter the signal and add noise.

To cope with the indistinguishability of certain signals we say that two signals are indistinguishable at level ε if their difference has norm less than ε.

We say that h(t) is time-limited to the interval (−T/2, T/2) at level ε if h(t) is indistinguishable from 1(−T/2, T/2)(t) h(t) at level ε. If T0 is the smallest such T, then we say that h(t) is of duration T0 at level ε.

Example 63. Consider the signal h(t) = e^{−|t|}, t ∈ R. The norm of h(t) − h(t) 1(−T/2, T/2)(t) is e^{−T/2}. Hence, for any fixed ε > 0, h(t) is of duration T0 = 2 ln(1/ε). For instance, for ε = 10⁻⁵, T0 ≈ 23.03.
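A quick numerical check of this example (assuming the reading h(t) = e^{−|t|}):

    import numpy as np
    from scipy.integrate import quad

    eps = 1e-5
    T0 = 2 * np.log(1 / eps)                       # claimed duration at level eps
    print(T0)                                      # ~23.03

    # Norm of the part of h(t) = exp(-|t|) lying outside (-T0/2, T0/2).
    tail_energy, _ = quad(lambda t: np.exp(-2 * t), T0 / 2, np.inf)
    print(np.sqrt(2 * tail_energy), eps)           # ~1e-5 in both cases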

Similarly, we say that h(t) is frequency-limited to the interval (−W/2, W/2) at level ε if hF(f) is indistinguishable from 1(−W/2, W/2)(f) hF(f) at level ε. If W0 is the smallest such W, then we say that h(t) is a signal of bandwidth W0 at level ε.

A particularity of these definitions is that if we increase the strength of a signal, we could very well increase its duration and bandwidth. This is in distinct contradiction with the usual definitions where duration and bandwidth are not affected by scaling. Another particularity is that all finite-energy signals are both frequency-limited to some finite bandwidth W and time-limited to some finite duration T.

The dimensionality of a signal set² is modified accordingly. We say that a set G of signals has approximate dimension n at level ε during the interval (−T/2, T/2) if there is a fixed collection of n = n(T, ε) signals, say {φ1(t), . . . , φn(t)}, such that over the interval (−T/2, T/2) every signal in G is indistinguishable at level ε from some signal of the form Σ_{i=1}^{n} ai φi(t). That is, we require for each h(t) ∈ G that there exist coefficients a1, . . . , an such that 1(−T/2, T/2)(t) h(t) and 1(−T/2, T/2)(t) Σ_{i=1}^{n} ai φi(t) are indistinguishable at level ε. We further require that n be the smallest such number. We can now state the main result.

Theorem 64. (Slepian) Let G be the set of all signals frequency-limited to (−W/2, W/2) and time-limited to (−T/2, T/2) at level ε. Let n(W, T, ε, ε′) be the approximate dimension


of G at level ε′ during the interval (−T/2, T/2). Then, for every ε′ > ε,

lim_{T→∞} n(W, T, ε, ε′)/T = W,
lim_{W→∞} n(W, T, ε, ε′)/W = T.

¹ The Shannon Award is the most prestigious award bestowed by the Information Theory Society. Slepian was the first after Shannon himself to receive the award. The recipient presents the Shannon Lecture at the next IEEE International Symposium on Information Theory.

² We do not require that this signal set be closed under addition and under multiplication by scalars, i.e., we do not require that it forms an inner product space.

So for large values, n is essentially WT. As already mentioned, the linear asymptotic relationship between n and WT is not tied to how we define duration and bandwidth. In the following example, for every positive integer n we construct a signal space for which WT = n.

Example 65. Let ψ(t) = (1/√Ts) sinc(t/Ts) and ψF(f) = √Ts 1[−1/(2Ts), 1/(2Ts)](f) be a normalized pulse and its Fourier transform. Let ψl(t) = ψ(t − lTs), l = 1, . . . , n. The collection B = {ψ1(t), . . . , ψn(t)} forms an orthonormal set. One way to see that ψi(t) and ψj(t) are orthogonal to one another when i ≠ j is to go to the Fourier domain and use Parseval's relationship. (Another way is to invoke Theorem 79 of Chapter 5.) Let G be the space spanned by the orthonormal basis B. It has dimension n by construction. All signals of G are strictly frequency-limited to (−W/2, W/2) for W = 1/Ts and in a sense (that we could define) time-limited to (0, T) for T = nTs. For this example WT = n.

    In light of the above example, the next example might be surprising at first.

Example 66. Let ψ(t) = (1/√Ts) 1[−Ts/2, Ts/2](t) and ψF(f) = √Ts sinc(Ts f) be the normalized rectangular pulse of duration Ts and its Fourier transform. The collection {ψ1(t), . . . , ψn(t)}, where ψl(t) = ψ(t − lTs), l = 1, . . . , n, forms an orthonormal set. (This is obvious from the time domain.) Let G be the space spanned by the orthonormal basis B. It has dimension n by construction. All signals of G are strictly time-limited to (Ts/2, T + Ts/2) for T = nTs and in a sense (that we could define) frequency-limited to (−W/2, W/2) for W = 2/Ts. For this example WT = 2n.
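The orthonormality claims of Examples 65 and 66 are easy to check numerically. The following Python sketch, with Ts = 1 and an arbitrary discretization grid, computes the Gram matrix of four shifted sinc pulses and of four shifted rectangular pulses; both come out close to the identity.

    import numpy as np

    Ts, dt = 1.0, 1e-3
    t = np.arange(-200.0, 200.0, dt)               # wide grid to capture the sinc tails

    def psi_sinc(t):
        return np.sinc(t / Ts) / np.sqrt(Ts)       # np.sinc(x) = sin(pi x)/(pi x)

    def psi_rect(t):
        return np.where(np.abs(t) <= Ts / 2, 1 / np.sqrt(Ts), 0.0)

    for psi in (psi_sinc, psi_rect):
        shifts = [psi(t - l * Ts) for l in range(1, 5)]      # psi(t - l*Ts), l = 1..4
        gram = np.array([[np.sum(u * v) * dt for v in shifts] for u in shifts])
        print(np.round(gram, 2))                   # approximately the 4x4 identity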

In both of the above two examples, we have constructed a signal set of dimensionality n for which there is a linear relationship between n and WT, where W is the bandwidth (according to some reasonable definition) of every signal in the set and T is the width of the smallest interval of time that contains (according to some reasonable definition) every signal in the set. What might be surprising is that in the second example WT is twice the value taken in the first example.

The explanation lies in the fact that in the first example we shift the sinc by half its width. (By width we mean the main-lobe width.) So it is half the width of the sinc that matters for the growth of WT for large n. Every shift contributes by 1 to the final WT count. In contrast, in the second example it is the full width of the sinc main lobe that matters for the final count. Every time we shift the rectangle by Ts, we contribute by 2 to the final WT count. We can summarize this by saying that the width of a sinc's


main lobe is not representative of how tightly we can pack shifted sincs next to each other while keeping them orthogonal to one another. It is enough to shift the sinc by half that width to ensure orthogonality.

Note that, in this section, n is the dimensionality of the signal space, which may or may not be related to a codeword length (also denoted by n). For instance, if we communicate using signals from a signal space of dimensionality n, we can choose the codeword length to be any integer less than or equal to n. For example, we could use a space of dimension n = 10⁷ to send one thousand codewords of length n = 10⁴ each. It is standard practice to use n for the dimensionality and for the codeword length: which is which should always be clear from the context.

Theorem 64 establishes a fundamental relationship between the continuous-time and the discrete-time channel model. It says that if we are allowed to use a frequency interval of width W Hz during T seconds, then we can make approximately (asymptotically exactly) up to WT uses of the equivalent discrete-time channel model. In other words, we get to choose the discrete-time channel at a rate of up to W channel uses per second. This is perhaps the most useful interpretation. It tells us that a discrete-time system that sends k = (log2 m)/n bits per channel use, where m is the codebook size and n the codeword length, can be used to send kW bits per second.

Theorem 64 tells us that time and frequency are on an equal footing in terms of providing the degrees of freedom exploited by the discrete-time channel. It is sometimes useful to think of T and W as the width and height of a rectangle in the time frequency plane as shown in Figure 4.3. We associate such a rectangle with the set of signals that have the corresponding time and frequency limitations according to, say, the criterion used in Theorem 64. Like a piece of land, such a rectangle represents a natural resource and what matters for its exploitation is its area.

[Figure 4.3: Time frequency plane. A rectangle of width T along the time axis t and height W along the frequency axis f.]

While Theorem 64 assumes a signal set that can be represented in the time frequency plane by a rectangle as in Figure 4.3, as we can see from Exercise 5, one can argue that the relationship between the dimensionality of a signal set and the area occupied by its representation in the time frequency plane extends to any shape. So


the shape does not matter in some sense but in practice it does, as it affects the implementation. This explains why we typically communicate using signal sets that correspond to a rectangle in the time frequency plane.

    4.8 Building Intuition about Scalability: n versus k

The aim of this section is to sharpen our intuition by looking at a few examples of signal constellations that contain a large number m of signals. We are interested in exploring what happens to the probability of error when the number k = log2 m of bits carried by one signal becomes large. In doing so, we will let the energy grow linearly with k so as to keep constant the energy per bit, which seems to be fair. The dimensionality of the signal space will be n = 1 for the first example (PAM) and n = 2 for the second (PSK). In the third example (bit-by-bit on a pulse train) n will be equal to k. In the final example (block-orthogonal signaling) we will let n = 2^k. These examples will provide us with useful insight on the asymptotic relationship between the number of transmitted bits and the dimensionality of the signal space.

What matters for all these examples is the choice of codebook. There is no need, in principle, to specify the waveform signal wi(t) associated to a codeword ci. Nevertheless, we will specify wi(t) to make the examples more realistic.

    4.8.1 Keeping n Fixed as k Grows

Example 67. (PAM) In this example, we fix n = 1. Let m be a positive even integer, H = {0, 1, . . . , m − 1} be the message set, and for each i ∈ H let ci be a distinct element of {±a, ±3a, ±5a, . . . , ±(m − 1)a} as shown in Figure 4.4.

[Figure 4.4: Codebook for PAM signaling. The codewords c0, . . . , c5 are placed at −5a, −3a, −a, a, 3a, 5a on the ψ1 axis (the case m = 6).]

The waveform associated to message i is

wi(t) = ci ψ(t),

where ψ(t) is an arbitrary unit-energy waveform. This signaling method is called Pulse Amplitude Modulation (PAM). (With n = 1 we do not have any choice other than modulating the amplitude of a pulse.) We are totally free to choose the pulse. For the sake of completeness we arbitrarily choose a rectangular pulse such as ψ(t) = (1/√T) 1[−T/2, T/2](t).


We have already computed the error probability of PAM in Example 6 of Section 2.4.3, namely

Pe = (2 − 2/m) Q(a/σ),

where σ² = N0/2. As shown in Exercise 9, the average energy of the above constellation when signals are uniformly distributed is

E = a²(m² − 1)/3.   (4.1)

Equating to E = kEb and using the fact that k = log2 m yields

a = √( 3 Eb log2 m / (m² − 1) ),

which goes to 0 as m goes to ∞. Hence Pe goes to 1 as m goes to ∞.
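The conclusion of Example 67 can be made concrete with a few lines of Python; the values Eb = 10 and N0 = 1 are arbitrary choices for the illustration.

    import numpy as np
    from scipy.special import erfc

    def Q(x):
        return 0.5 * erfc(x / np.sqrt(2))

    Eb, N0 = 10.0, 1.0                  # arbitrary example values
    sigma = np.sqrt(N0 / 2)

    for k in range(1, 13):
        m = 2 ** k
        a = np.sqrt(3 * Eb * k / (m ** 2 - 1))     # from E = a^2 (m^2 - 1)/3 = k*Eb
        Pe = (2 - 2 / m) * Q(a / sigma)
        print(k, Pe)                    # Pe increases toward 1 as k grows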

Example 68. (PSK) In this example, we keep n = 2. We could start by defining the codebook as in the previous example and then choose an arbitrary orthonormal basis, but for the codebook we are after in this example, there is a natural signal set. Hence, we start from there. Let T be a positive number and define the Phase-Shift-Keying constellation

wi(t) = √(2E/T) cos( 2πf0t + 2πi/m ) 1[0,T](t),  i = 0, 1, . . . , m − 1.   (4.2)

We assume that 2f0T is an integer, so that ‖wi‖² = E for all i. (When 2f0T is an integer, wi(t) contains a whole number of half periods in a length-T interval. This ensures that its norm is the same, regardless of the initial phase.) The signal space representation can be obtained by using the trigonometric equivalence cos(α + β) = cos(α)cos(β) − sin(α)sin(β) to rewrite (4.2) as

wi(t) = ci,1 ψ1(t) + ci,2 ψ2(t),

where

ci,1 = √E cos(2πi/m),  ψ1(t) = √(2/T) cos(2πf0t) 1[0,T](t),
ci,2 = √E sin(2πi/m),  ψ2(t) = −√(2/T) sin(2πf0t) 1[0,T](t).

Hence the codeword associated to wi(t) is

ci = √E ( cos(2πi/m), sin(2πi/m) )ᵀ.


In Example 19, we have already studied this constellation and have derived the following lower bound to the error probability:

Pe ≥ 2 Q( (√E/σ) sin(π/m) ) (m − 1)/m,

where σ² = N0/2 is the variance of the noise in each coordinate. If we let E = kEb grow linearly with k, the circle that contains the codewords has radius √E = √(kEb). Its circumference grows with √k, and the number m = 2^k of points on this circle grows exponentially with k. Hence the minimum distance between points goes to zero (indeed exponentially fast). As a consequence, the argument of the Q function that lower bounds the probability of error for PSK goes to 0 and the probability of error goes to 1.
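A similar short computation for PSK, with the same arbitrary Eb and N0, shows the argument of the Q function in the lower bound shrinking to zero as k grows.

    import numpy as np
    from scipy.special import erfc

    Q = lambda x: 0.5 * erfc(x / np.sqrt(2))
    Eb, N0 = 10.0, 1.0                  # same arbitrary values as above
    sigma = np.sqrt(N0 / 2)

    for k in range(1, 13):
        m = 2 ** k
        arg = np.sqrt(k * Eb) / sigma * np.sin(np.pi / m)
        print(k, 2 * Q(arg) * (m - 1) / m)   # lower bound on Pe approaches 1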

As they are, the signal constellations used in the above two examples are not suitable to transmit a large amount k of bits by letting the constellation size m = 2^k grow exponentially with k. The problem with the above two examples is that, as m grows, we are trying to pack an increasing number of points into a space that also grows in size but not fast enough. The space becomes crowded as m grows, meaning that the minimum distance becomes smaller and the probability of error increases.

We should not conclude that PAM and PSK are not useful to send many bits. On the contrary, these signaling methods are used in conjunction with some variation of the technique described in the next example. The idea is to keep m relatively small and reuse the constellation over and over along different dimensions. See the comment after the next example.

    4.8.2 Growing n Linearly with k

Example 69. (Bit by Bit on a Pulse Train) The idea is to use a different dimension for each bit. Let (bi,1, bi,2, . . . , bi,k) be the binary sequence corresponding to message i. For mathematical convenience, we assume these bits to take value in {±1} rather than {0, 1}. We let the associated codeword ci = (ci,1, ci,2, . . . , ci,k)ᵀ be defined by

ci,j = bi,j √Eb,

where Eb = E/k is the energy per bit. The transmitted signal is

wi(t) = Σ_{j=1}^{k} ci,j ψj(t),  t ∈ R.   (4.3)

As already mentioned, the choice of orthonormal basis is immaterial for the point we are making, but in practice some choices are more convenient than others. Specifically, if we choose ψj(t) = φ(t − jTs) for some waveform φ(t) that fulfills ⟨ψi, ψj⟩ = δij,


then the n-tuple former is drastically simplified because a single matched filter is sufficient to obtain all n projections (see Subsection 3.4.1). For instance, we can choose φ(t) = (1/√Ts) 1[−Ts/2, Ts/2](t), which fulfills the mentioned constraints. We can now rewrite the waveform signal as

wi(t) = Σ_{j=1}^{k} ci,j φ(t − jTs),  t ∈ R.   (4.4)

The above expression justifies the name bit-by-bit on a pulse train given to this signaling method. As we will see in Chapter 5, there are many other possible choices for φ(t).

[Figure 4.5: Codebooks for bit-by-bit on a pulse train signaling for (a) k = 1, (b) k = 2, and (c) k = 3: the codewords are the vertices of a k-dimensional hypercube.]

The codewords c0, . . . , cm−1 are the vertices of a k-dimensional hypercube as shown in Figure 4.5 for k = 1, 2, 3. For these values of k we immediately see from the figure what the decoding regions of a ML decoder are, but let us proceed analytically and find a ML decoding rule that works for any k. The ML receiver decides that the constellation point used by the sender is the ci ∈ {±√Eb}^k that maximizes ⟨y, ci⟩ − ‖ci‖²/2. Since ‖ci‖² is the same for all i, the previous expression is maximized by the ci that maximizes


⟨y, ci⟩ = Σ_j yj ci,j. The maximum is achieved for the i for which ci,j = sign(yj)√Eb, where sign(y) = 1 if y ≥ 0 and sign(y) = −1 if y < 0.

Assume that ci,j = √Eb. The decoder makes the correct decision about the j-th bit if and only if yj ≥ 0, i.e., if and only if Zj > −√Eb. (The statement is an if and only if if we ignore the zero-probability event that Zj = −√Eb.) This happens with probability 1 − Q(√Eb/σ). Based on similar reasoning, it is straightforward to verify that the probability of error is the same if ci,j is negative. Now let Cj be the event that the decoder makes the correct decision about the j-th bit. The probability of Cj depends only on Zj. The independence of the noise components implies the independence of C1, C2, . . . , Ck. Thus, the probability that all k bits are decoded correctly when H = i is

Pc(i) = ( 1 − Q(√Eb/σ) )^k,

which is the same for all i and, therefore, it is also the average probability of correct decoding Pc. Notice that Pc → 0 as k → ∞. However, the probability that any specific bit be decoded incorrectly is Q(√Eb/σ), which does not depend on k.
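A two-line numerical check of Pc = (1 − Q(√Eb/σ))^k, with an arbitrary choice of Eb and N0, confirms that Pc decays to zero with k while the per-bit error probability stays fixed.

    import numpy as np
    from scipy.special import erfc

    Q = lambda x: 0.5 * erfc(x / np.sqrt(2))
    Eb, N0 = 4.0, 1.0                   # arbitrary example values
    sigma = np.sqrt(N0 / 2)
    p_bit = Q(np.sqrt(Eb) / sigma)      # per-bit error probability, independent of k
    for k in (1, 10, 100, 1000, 10000):
        print(k, (1 - p_bit) ** k)      # probability that all k bits are correct -> 0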

Although in this example we chose to transmit a single bit per dimension, we could have transmitted instead some small number of bits per dimension by means of one of the methods discussed in the previous two examples. In that case we would have called the signaling scheme symbol-by-symbol on a pulse train: this term will come up often in this text. In fact it is the basis for many digital communication systems.

The following question seems natural at this point: Is it possible to avoid that Pc → 0 as k → ∞? The next example gives us the answer.

    4.8.3 Growing n Exponentially With k

Example 70. (Block-Orthogonal Signaling) Let n = m = 2^k, choose n orthonormal waveforms ψ1(t), . . . , ψn(t) and define w1(t), . . . , wm(t) to be

wi(t) = √E ψi(t).

This is called block-orthogonal signaling. The name stems from the fact that in practice a block of k bits is collected and then mapped into one of m orthogonal waveforms. Notice that ‖wi‖² = E for all i.

There are many ways to choose the 2^k waveforms ψi(t). One way is to choose ψi(t) = φ(t − iT) for some normalized pulse φ(t) such that φ(t − iT) and φ(t − jT) are orthogonal


when i ≠ j. In this case the requirement for φ(t) is the same as that in bit-by-bit on a pulse train, but now we need 2^k rather than k shifted versions, and we send one pulse rather than a train of k weighted pulses. For obvious reasons the resulting signaling scheme is sometimes called pulse position modulation.

Another example is to choose

wi(t) = √(2E/T) cos(2πfit) 1[0,T](t).   (4.5)

This is called m-FSK (m-ary frequency shift keying). If we choose fiT = ki/2 for some integer ki such that ki ≠ kj if i ≠ j, then

⟨wi, wj⟩ = (2E/T) ∫₀ᵀ [ (1/2) cos(2π(fi + fj)t) + (1/2) cos(2π(fi − fj)t) ] dt = E δij,

as desired.

[Figure 4.6: Codebooks for block-orthogonal signaling, shown for two and for three mutually orthogonal codewords.]

When m ≥ 3, it is not easy to visualize the decoding regions. However, we can proceed analytically, using the fact that all coordinates of ci are 0 except for the i-th, which has value √E. Hence,

ĤML(y) = arg maxᵢ ( ⟨y, ci⟩ − E/2 ) = arg maxᵢ ⟨y, ci⟩ = arg maxᵢ yi.

To compute (or bound) the error probability, we start as usual with a fixed ci. We choose i = 1. When H = 1,

Yj = Zj if j ≠ 1,  and  Yj = √E + Zj if j = 1.


Then

Pc(1) = Pr{Y1 > Z2, Y1 > Z3, . . . , Y1 > Zm | H = 1}.

To evaluate the right side, we first condition on Y1 = α, where α ∈ R is an arbitrary number:

Pr{Ĥ = H | H = 1, Y1 = α} = Pr{α > Z2, . . . , α > Zm} = [ 1 − Q( α/√(N0/2) ) ]^{m−1},

and then remove the conditioning on Y1:

Pc(1) = ∫ f_{Y1|H}(α|1) [ 1 − Q( α/√(N0/2) ) ]^{m−1} dα
      = ∫ (1/√(πN0)) exp( −(α − √E)²/N0 ) [ 1 − Q( α/√(N0/2) ) ]^{m−1} dα,

where we use the fact that when H = 1, Y1 ∼ N(√E, N0/2). The above expression for Pc(1) cannot be simplified further, but we can evaluate it numerically. By symmetry, Pc(1) = Pc(i) for all i. Hence Pc = Pc(1) = Pc(i).
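The Pc(1) integral can indeed be evaluated numerically, for instance as in the following Python sketch (Eb and N0 are arbitrary example values with Eb/N0 above 2 ln 2, so the message error probability decreases with k).

    import numpy as np
    from scipy.special import erfc
    from scipy.integrate import quad

    def Q(x):
        return 0.5 * erfc(x / np.sqrt(2))

    Eb, N0 = 4.0, 1.0                   # arbitrary values with Eb/N0 > 2 ln 2

    for k in (1, 2, 4, 8):
        m, E = 2 ** k, k * Eb
        def integrand(a):
            density = np.exp(-(a - np.sqrt(E)) ** 2 / N0) / np.sqrt(np.pi * N0)
            return density * (1 - Q(a / np.sqrt(N0 / 2))) ** (m - 1)
        Pc, _ = quad(integrand, -np.inf, np.inf)
        print(k, 1 - Pc)                # message error probability decreases with k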

The fact that the distance between any two distinct codewords is a constant simplifies the union bound considerably:

Pe = Pe(i) ≤ (m − 1) Q( d/(2σ) ) = (m − 1) Q( √(E/N0) ),

which goes to zero as k → ∞ provided that Eb/N0 > 2 ln 2.

The result of the above example is quite surprising at first. The more bits we send, the larger the probability Pc that they will all be decoded correctly. On second thought, it


is less surprising. We have more options to resist the noise if we are allowed to choose an encoder that maps many bits at a time. In fact, in n dimensions the noise source is more constrained than the encoder in the following sense. The encoder is free to choose the components of the codewords it produces, with the only restriction that the average energy constraint be met. A codeword that is zero in all components except for one would not raise any eyebrows. To the contrary, you should be very suspicious of an iid Gaussian source that outputs such an n-tuple. Implicitly, designing a good encoder is a matter of using the encoder's freedom to create patterns (codewords) that remain identifiable after being corrupted by noise.

Unfortunately there is a major problem with the requirement that n grows exponentially with k. In fact, from Theorem 64 it means that WT has to grow exponentially with k. In general, we expect W to be fixed and T to grow linearly with k. We deduce that for large values of k we can use bit-by-bit on a pulse train but not block-orthogonal signaling.

In the next subsection we gain additional insight on why the message error probability goes to 0 for block-orthogonal signaling and to 1 for bit-by-bit on a pulse train. The union bound is very useful for this.

    4.8.4 Bit-By-Bit Versus Block-Orthogonal

We have seen that the message error probability goes to 1 in bit-by-bit on a pulse train and goes to zero (exponentially fast) in block-orthogonal signaling. The union bound is quite useful to understand what goes on.

In computing the error probability when message i is transmitted, the union bound has one term for each j ≠ i. The dominating terms correspond to the signals cj that are closest to ci. If we neglect the other terms, we obtain an expression of the form

Pe(i) ≈ Nd Q( dm/(2σ) ),

where Nd is the number of dominant terms, i.e., the number of nearest neighbors to ci, dm is the minimum distance, i.e., the distance to a nearest neighbor, and 2σ² = N0.

For bit-by-bit on a pulse train, there are k closest neighbors, each neighbor obtained by changing ci in exactly one component, and each of them is at distance 2√Eb from ci. As k increases, Nd increases and Q( dm/(2σ) ) stays constant. The increase of Nd makes Pe(i) increase.

Now consider block-orthogonal signaling. All signals are at the same distance from each other. Hence there are Nd = 2^k − 1 nearest neighbors to ci, all at distance dm = √(2E) =


√(2kEb). Hence

Q( dm/(2σ) ) ≤ (1/2) exp( −dm²/(8σ²) ) = (1/2) exp( −kEb/(4σ²) ),

Nd = 2^k − 1 = e^{k ln 2} − 1.

We see that the probability that the noise carries a signal closer to a specific neighbor decreases as exp( −kEb/(4σ²) ), whereas the number of nearest neighbors increases as exp(k ln 2). For Eb/(4σ²) > ln 2 the product decreases, otherwise it increases.
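The competition between the two exponentials can be illustrated numerically; in the following sketch, Eb = 0.5 lies below the threshold 4σ² ln 2 and Eb = 4 lies above it (with N0 = 1, both values arbitrary).

    import numpy as np
    from scipy.special import erfc

    Q = lambda x: 0.5 * erfc(x / np.sqrt(2))
    N0 = 1.0
    sigma2 = N0 / 2                     # 2*sigma^2 = N0

    for Eb in (0.5, 4.0):               # below and above 4*sigma2*ln(2) ~ 1.386
        for k in (2, 8, 32, 128):
            Nd = 2 ** k - 1                         # number of nearest neighbors
            dm = np.sqrt(2 * k * Eb)                # minimum distance
            print(Eb, k, Nd * Q(dm / (2 * np.sqrt(sigma2))))   # grows, then decays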

In all four examples considered in this section, there is a common thread. As k increases, the space populated by the signals grows in size and the number of signals increases. If the former does not grow fast enough, the space becomes more and more crowded and the error probability goes up. Sophisticated coding techniques can avoid this while keeping a linear relationship between n and k. In Chapter 6, we do a case study of such a technique.

    4.9 Conclusion and Outlook

We have discussed some of the trade-offs between the number of transmitted bits, the signal duration, the bandwidth, the signal's energy, and the error probability. We have seen that, rather surprisingly, it is possible to transmit an increasing number k of bits at a fixed energy per bit Eb and to make the probability that even a single bit is decoded incorrectly go to zero as k increases. However, the scheme we used to prove this has the undesirable property of requiring an exponential growth of the time bandwidth product. In any given channel, we would quickly run out of time and/or bandwidth even with moderate values of k. In real-world applications, we are given a fixed bandwidth and we let the duration grow linearly with k. It is not a coincidence that most signaling methods in use today can be seen one way or another as refinements of bit-by-bit on a pulse train. This line of signaling technique will be pursued in the next two chapters.

This is a good time to clarify our non-standard use of the words coding, encoder, codeword, and codebook. We have seen that no matter which waveform signals we use to communicate, we can always break down the sender into a block that provides an n-tuple and one that maps the n-tuple into the corresponding waveform. This view is completely general and serves us well, whether we analyze or implement a system. Unfortunately there is no standard name for the first block. Calling it an encoder is a good name, but the reader should be aware that the current practice is to say that there is coding when the mapping from bits to codewords is non-trivial and to say that there is no coding when the map is trivial as in bit-by-bit on a pulse train. Making a distinction is not a satisfactory solution in our view. First of all, there is no good way to make a clean distinction between a trivial and a non-trivial map. Second, there is no good substitute for the term encoder for the block that implements a trivial map. We find it to be a cleaner solution to talk about an


encoder, regardless of complexity. An example of a non-trivial encoder will be studied in depth in Chapter 6.

Calling the second block a waveform former is definitely non-standard, but we find this name to be more appropriate than calling it a modulator, which is the most common name used for it. It has been inherited from the old days of analog communication techniques such as amplitude modulation (AM), for which it was an appropriate name.

In this chapter we have looked at the relationship between k, T, W, E and Pe by considering specific signaling methods. Information theory is a field that searches for the ultimate tradeoffs, regardless of the signaling method. A main result from information theory is the famous formula

C = (W/2) log2( 1 + 2P/(N0 W) ).   (4.6)

It gives a precise value to the ultimate rate C [bps] at which we can transmit reliably over a waveform AWGN channel of power spectral density N0/2 [Watts/Hz] if we are allowed to use signals of power not exceeding P [Watts] and absolute bandwidth not exceeding W [Hz].

As already mentioned, it is quite common to consider only positive frequencies in determining the bandwidth of a signal. We can identify two reasons this point of view has become popular. One reason is that the positive frequencies are those we see when we observe a signal with a spectrum analyzer. The other reason is that the use of complex-valued notation to design and analyze communication systems is a relatively recent practice. If we consider only real-valued signals, then the only difference between the two bandwidth definitions is a factor 2. As negative frequencies count as much as the positive ones in determining the dimensionality of a set of signals, if we allow for complex-valued signals, we can no longer disregard the negative frequencies in the definition of bandwidth. Notice that if we define bandwidth accounting only for positive frequencies, then the relationship n = WT becomes n = 2BT (which is less attractive) and (4.6) becomes C = B log2(1 + P/(N0 B)) (which is more attractive³).
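For instance, with arbitrary example values W = 1 MHz, N0 = 10⁻⁸ W/Hz and P = 0.1 W, formula (4.6) evaluates to roughly 2.2 Mbps:

    import numpy as np

    W, N0, P = 1e6, 1e-8, 0.1                      # arbitrary example values
    C = (W / 2) * np.log2(1 + 2 * P / (N0 * W))    # equation (4.6), in bits per second
    print(C / 1e6, "Mbps")                         # ~2.2 Mbps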

³ The factor 1/2 in front of (4.6) is fundamental: it reflects the fact that there is a factor 1/2 in front of the capacity of the discrete-time channel model. The factor 2 inside the log is an artifact of an unfortunate practice, specifically that we denote the noise power spectral density by N0/2 rather than by the better choice N0.


    4.10 Exercises

Problem 1. (Bandwidth) Verify the following statements.

(a) The absolute bandwidth of sinc(t/Ts) is W = 1/Ts.

(b) The 3-dB bandwidth of an RC lowpass filter is W = 1/(πRC).

(c) The η-bandwidth of an RC lowpass filter is W = (1/(πRC)) tan( (π/2)(1 − η) ).

(d) The zero-crossing bandwidth of 1[−Ts/2, Ts/2](t) is W = 2/Ts.

(e) The equivalent noise bandwidth of an RC lowpass filter is W = 1/(2RC).

(f) The RMS bandwidth of h(t) = exp(−πt²) is W = 1/√(4π). Hint: hF(f) = exp(−πf²).

Hint: For an RC lowpass filter, we have h(t) = (1/(RC)) exp(−t/(RC)) for t ≥ 0 and h(t) = 0 otherwise. The squared magnitude of its Fourier transform is |hF(f)|² = 1/(1 + (2πRCf)²).
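Two of these claims, stated with the double-sided bandwidth convention used in this chapter, can be sanity-checked numerically (an illustration, not a derivation; RC = 1 and Ts = 1 are arbitrary):

    import numpy as np
    from scipy.optimize import brentq

    RC = 1.0
    H2 = lambda f: 1.0 / (1.0 + (2 * np.pi * RC * f) ** 2)

    # (b) 3-dB bandwidth (double-sided): |hF(W/2)|^2 = |hF(0)|^2 / 2.
    f_half = brentq(lambda f: H2(f) - 0.5, 0.0, 10.0 / RC)
    print(2 * f_half, 1 / (np.pi * RC))            # both ~0.318

    # (d) zero-crossing bandwidth of the rectangular pulse: its Fourier transform
    # Ts*sinc(Ts*f) first vanishes at f = +/- 1/Ts, hence W = 2/Ts.
    Ts = 1.0
    print(np.sinc(Ts * (1 / Ts)), 2 / Ts)          # ~0 and 2.0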

Problem 2. (Signal Translation) Consider the signals w0(t) and w1(t) shown in Figure 4.7, used to communicate one bit across an AWGN channel of power spectral density N0/2.

[Figure 4.7: The waveforms w0(t) and w1(t).]

(a) Determine an orthonormal basis {ψ0(t), ψ1(t)} for the space spanned by {w0(t), w1(t)} and find the corresponding codewords c0 and c1. Work out two solutions, one obtained via Gram-Schmidt and one in which the second element of the orthonormal basis is a delayed version of the first. Which of the two solutions would you choose if you had to implement the system?

(b) Let X be a uniformly distributed binary random variable that takes values in {0, 1}. We want to communicate the value of X over an additive white Gaussian noise channel. When X = 0, we send w0(t), and when X = 1, we send w1(t). Draw the block diagram of a ML receiver based on a single matched filter.


(c) Determine the error probability Pe of your receiver as a function of T and N0.

(d) Find a suitable waveform v(t), such that the new signals w̃0(t) = w0(t) − v(t) and w̃1(t) = w1(t) − v(t) have minimal energy, and plot the resulting waveforms.

(e) What is the name of the kind of signaling scheme that uses w̃0(t) and w̃1(t)? Argue that one obtains this kind of signaling scheme independently of the initial choice of w0(t) and w1(t).

Problem 3. (Orthogonal Signal Sets) Consider a set W = {w0(t), . . . , wm−1(t)} of mutually orthogonal signals with squared norm E, each used with equal probability.

(a) Find the minimum-energy signal set W̃ = {w̃0(t), . . . , w̃m−1(t)} obtained by translating the original set.

(b) Let Ẽ be the average energy of a signal picked at random within W̃. Determine Ẽ and the energy saving E − Ẽ.

(c) Determine the dimension of the inner-product space spanned by W̃.

Problem 4. (Isometries) Let W̃ = {w̃0(t), . . . , w̃m−1(t)} be obtained from a given signal set W = {w0(t), . . . , wm−1(t)} via an isometry that sends wi(t) ∈ W to w̃i(t) ∈ W̃, i = 0, . . . , m − 1. To simplify notation, we assume that W and W̃ are orthogonal to one another and each spans a space of dimension n. Let B = {ψ1(t), . . . , ψn(t), ψn+1(t), . . . , ψ2n(t)} be an orthonormal basis for the signal space spanned by W ∪ W̃, where the first n elements of this basis form a basis for the inner product space spanned by W and the last n form a basis for that spanned by W̃. Let C be the codebook associated to W with respect to the basis B and let C̃ be the one associated to W̃. Notice that codewords have length 2n. Describe the isometry that maps C to C̃.

Problem 5. (Dimensionality vs Area) The purpose of this exercise is to explore in what sense the dimensionality of a signal set equals the area it occupies in the time frequency plane. The focus is on ideas rather than mathematical rigor.

Consider the three regions in the time frequency plane of Figure 4.8. The region that contains the origin represents a set G of signals that are time-limited to (−T/2, T/2) and frequency-limited to (−W/2, W/2) in the sense of Theorem 64. The other two regions are obtained by frequency-shifting all signals by f0 or by time-shifting the signals by t0. We assume that f0 > W and t0 > T.

(a) Argue that for G, there exists a real-valued basis B. Hint: See Exercise 8.


(b) Describe an orthonormal basis for each of the other two signal sets knowing that B = {φ1(t), . . . , φn(t)} is an orthonormal basis for G. Conclude that the three sets have the same dimensionality.

(c) Argue that the three sets are orthogonal to one another.

(d) Argue that the dimensionality of a large signal space G described by one or more non-overlapping regions (not necessarily rectangular) in the time frequency plane is essentially equal to the total area of the describing regions.

[Figure 4.8: Three regions in the time frequency plane: the region G of width T and height W containing the origin, a copy shifted in frequency by f0, and a copy shifted in time by t0.]

Problem 6. (Time and Frequency-Limited Orthonormal Sets) Complement Examples 65 and 66 with similar examples in which the shifts occur in the frequency domain. The corresponding time-domain signals can be complex-valued.

Problem 7. (Root-Mean Square Bandwidth) The root-mean square (rms) bandwidth of a low-pass signal g(t) of finite energy is defined by

Wrms = [ ∫ f² |G(f)|² df / ∫ |G(f)|² df ]^(1/2),

where |G(f)|² is the energy spectral density of the signal. Correspondingly, the root-mean-square (rms) duration of the signal is defined by

Trms = [ ∫ t² |g(t)|² dt / ∫ |g(t)|² dt ]^(1/2).


We want to show that, with the above definitions and assuming that |g(t)| → 0 faster than 1/√|t| as |t| → ∞, the time bandwidth product satisfies

Trms Wrms ≥ 1/(4π).

(a) Use the Schwarz inequality and the fact that for any c ∈ C, c + c* = 2 Re{c}


Comment: In this exercise we have shown that we can always find a real-valued orthonormal basis for an inner product space that fulfills the stated condition with respect to conjugation. An equivalent condition is that if g(t) ∈ G then also the inverse Fourier transform of g*F(−f) is in G. The set G of complex-valued finite-energy waveforms that are time-limited to (−T/2, T/2) (in a strict sense) and frequency-limited to (−W/2, W/2) fulfills the condition for any of the bandwidth definitions given in Section 4.3. (If we use the absolute bandwidth definition, then T must be infinite or else the set G is empty.)

Problem 9. (Average Energy of PAM) Let U be a random variable uniformly distributed in [−a, a] and let S be a discrete random variable independent of U and uniformly distributed over the PAM constellation {±a, ±3a, . . . , ±(m − 1)a}, where m is an even integer. Let V = S + U.

(a) Find the distribution of V.

(b) Find the variance of U and that of V.

(c) Use part (b) to determine the variance of S. Notice that the variance of S is the average energy of the PAM constellation used with uniform distribution.

Problem 10. (Suboptimal Receiver for Orthogonal Signaling) This exercise takes a different approach to the evaluation of the performance of Block-Orthogonal Signaling (Example 70). Let the message H ∈ {1, . . . , m} be uniformly distributed and consider the communication problem described by:

H = i :  Y = ci + Z,  Z ∼ N(0, σ²Im),

where Y = (Y1, . . . , Ym)ᵀ ∈ Rᵐ is the received vector and {c1, . . . , cm} ⊂ Rᵐ the codebook consisting of constant-energy codewords that are orthogonal to each other. Without loss of essential generality, we can assume

ci = √E ei,

where ei is the i-th unit vector in Rᵐ, i.e., the vector that contains 1 at position i and 0 elsewhere, and E is some positive constant.

(a) Describe the statistic of Yj for j = 1, . . . , m given that H = 1.

(b) Consider a suboptimal receiver that uses a threshold t = α√E where 0 < α < 1. The receiver declares Ĥ = i if i is the only integer such that Yi ≥ t. If there is no such i or there is more than one index i for which Yi ≥ t, the receiver declares that it cannot decide. This will be viewed as an error. Let Ei = {Yi ≥ t}, Eiᶜ = {Yi < t}, and describe, in words, the meaning of the event

E1 ∩ E2ᶜ ∩ E3ᶜ ∩ · · · ∩ Emᶜ.


(c) Find an upper bound to the probability that the above event does not occur when H = 1. Express your result using the Q function.

(d) Now we let E and ln m go to ∞ while keeping their ratio constant, namely E = Eb ln(m) log2(e). (Here Eb is the energy per transmitted bit.) Find the smallest value of Eb/σ² (according to your bound) for which the error probability goes to zero as E goes to ∞. Hint: Use m − 1 < m = exp(ln m) and Q(x) < (1/2) exp(−x²/2).

Problem 11. (Receiver Diagrams) For each signaling method discussed in Section 4.8, draw the block diagram of an ML receiver.

Problem 12. (Bit-By-Bit on a Pulse Train) A communication system uses bit-by-bit on a pulse train to communicate at 1 Mbps using a rectangular pulse. The transmitted signal is of the form

Σ_j Bj 1[0, Ts](t − jTs),

where Bj ∈ {±b}. Determine the value of b needed to achieve bit-error probability Pb = 10⁻⁵, knowing that the channel corrupts the transmitted signal with additive white Gaussian noise of power spectral density N0/2, where N0 = 10⁻² W/Hz.

Problem 13. (Bit Error Probability) A discrete memoryless source produces bits at a rate of 10⁶ bps. The bits, which are uniformly distributed and iid, are grouped into pairs and each pair is mapped into a distinct waveform and sent over an AWGN channel of noise power spectral density N0/2. Specifically, the first two bits are mapped into one of the four waveforms shown in Figure 4.9 with Ts = 2 · 10⁻⁶ s, the next two bits are mapped onto the same set of waveforms delayed by Ts, etc.

(a) Describe an orthonormal basis for the inner product space W spanned by wi(t), i = 0, . . . , 3 and plot the signal constellation in Rⁿ, where n is the dimensionality of W.

(b) Determine an assignment between pairs of bits and waveforms such that the bit error probability is minimized and derive an expression for Pb.

(c) Draw a block diagram of the receiver that achieves the above Pb and uses a single and causal filter.

(d) Determine the energy per bit Eb and the power of the transmitted signal.


[Figure 4.9: The four waveforms w0(t), w1(t), w2(t), w3(t), each supported on an interval of length Ts.]

Problem 14. (m-ary Frequency Shift Keying) m-ary Frequency Shift Keying (m-FSK) is a signaling method that uses signals of the form

wi(t) = A √(2/T) cos( 2π(fc + i∆f)t ) 1[0,T](t),  i = 0, . . . , m − 1,

where A, T, fc, and ∆f are fixed parameters.

(a) Assuming that fcT is an integer, find the smallest value of ∆f that makes wi(t) orthogonal to wj(t) when i ≠ j.

(b) In practice the signals wi(t), i = 0, 1, . . . , m − 1 can be generated by changing the frequency of a signal oscillator. In passing from one frequency to another a phase shift is introduced. Again, assuming that fcT is an integer, determine the smallest value ∆f that ensures orthogonality between cos( 2π(fc + i∆f)t + ϕi ) and cos( 2π(fc + j∆f)t + ϕj ) whenever i ≠ j, regardless of ϕi and ϕj.

(c) Sometimes we do not have complete control over fc either, in which case it is not possible to set fcT to an integer. Argue that if we choose fc ≫ m∆f, then for all practical purposes the signals will be orthogonal to one another.

(d) Determine the average energy E and the frequency-domain interval occupied by the signal constellation. How does the BT product behave as a function of k = log2(m)?


Problem 15. (Antipodal Signaling and Rayleigh Fading) Consider using antipodal signaling, i.e. w0(t) = −w1(t), to communicate one bit across a Rayleigh fading channel that we model as follows. When wi(t) is transmitted the channel output is

R(t) = A wi(t) + N(t),

where N(t) is white Gaussian noise of power spectral density N0/2 and A is a random variable of probability density function

fA(a) = 2a exp(−a²) for a ≥ 0, and fA(a) = 0 otherwise.   (4.7)

We assume that, unlike the transmitter, the receiver knows the realization of A. We also assume that the receiver implements a maximum likelihood decision, and that the signal energy is Eb.

(a) Describe the receiver.

(b) Determine the error probability conditioned on the event A = a.

(c) Determine the unconditional error probability Pf. (The subscript stands for fading.)

(d) Compare Pf to the error probability Pe achieved by an ML receiver that observes R(t) = m wi(t) + N(t), where m = E[A]. Comment on the different behavior of the two error probabilities. For each of them, find the Eb/N0 value necessary to obtain the probability of error 10⁻⁵.

Problem 16. (Non-White Gaussian Noise) Consider the following transmitter/receiver design problem for an additive non-white Gaussian noise channel.

(a) Let the hypothesis H be uniformly distributed in H = {0, . . . , m − 1} and, when H = i, i ∈ H, let wi(t) be the channel input. The channel output is then

R(t) = wi(t) + N(t),

where N(t) is a Gaussian process of known power spectral density G(f), where we assume that G(f) ≠ 0 for all f. Describe a receiver that, based on the channel output R(t), decides on the value of H with least probability of error. Hint: Find a way to transform this problem into one that you can solve.

(b) Consider the setting as in part (a) except that now you get to design the signal set with the restrictions that m = 2 and that the average energy cannot exceed E. We also assume that G(f) is constant in the interval [a, b], a < b, where it also achieves its global minimum. What are the two signals that allow for the smallest possible probability of error to be achieved?


Problem 17. (Continuous-Time AWGN Capacity) To prove the formula for the capacity C of the continuous-time AWGN channel of noise power density N0/2 when signals are power-limited to P and frequency-limited to (−W/2, W/2), we first derive the capacity Cd for the discrete-time AWGN channel of noise variance σ² and symbols constrained to average energy not exceeding Es. The two expressions are:

Cd = (1/2) log2( 1 + Es/σ² )   [bits per channel use]

C = (W/2) log2( 1 + P/(N0 W/2) )   [bps].

To derive Cd we need tools from information theory. However, going from Cd to C using Theorem 64 is straightforward. To do so, let G be the set of all signals that are frequency-limited to (−W/2, W/2) and time-limited to (−T/2, T/2) at level ε. We choose ε small enough that for all practical purposes all signals of G are strictly frequency-limited to (−W/2, W/2) and strictly time-limited to (−T/2, T/2). Each waveform in G is represented by an n-tuple and, as T goes to infinity, n approaches WT. Complete the argument assuming n = WT and without worrying about convergence issues.

Problem 18. (Energy Efficiency of PAM) This exercise complements what we have learned in Example 67. Consider using the m-PAM constellation {±a, ±3a, ±5a, . . . , ±(m − 1)a} to communicate across the discrete-time AWGN channel of noise variance σ² = 1. Our goal is to communicate at some level of reliability, say with error probability Pe = 10⁻⁵. We are interested in comparing the energy needed by PAM versus the energy needed by a system that operates at channel capacity, namely at (1/2) log2(1 + Es/σ²) [bits per channel use].

(a) Using the capacity formula, determine the energy per symbol Es^C(k) needed to transmit k bits per channel use. (The superscript C stands for channel capacity.) At any rate below capacity it is possible to make the error probability decrease without limit by increasing the codeword length. This implies that there is a way to achieve the desired error probability at energy per symbol Es^C(k).

(b) Using m-PAM, we can achieve an arbitrarily small error probability by making the parameter a sufficiently large. As the size m of the constellation increases, the edge effects become negligible, and the average error probability approaches 2Q(a/σ), which is the probability of error conditioned on an interior point being transmitted. Find the numerical value of the parameter a for which 2Q(a/σ) = 10⁻⁵. (You may use (1/2) exp(−x²/2) as an approximation of Q(x).)

(c) Having fixed the value of a, we can use equation (4.1) to determine the average energy Es^P(k) needed by PAM to send k bits at the desired error probability. (The superscript P stands for PAM.) Find and compare the numerical values of Es^P(k) and Es^C(k) for k = 1, 2, 4.


(d) Find lim_{k→∞} Es^C(k+1)/Es^C(k) and lim_{k→∞} Es^P(k+1)/Es^P(k).

(e) Comment on PAM's efficiency in terms of energy per bit for small and large values of k. Comment also on the relationship between this exercise and Example 67.
