Electronics Engineering Department of Electrical Engineering, Linköping University, 2016

Modeling in Simulink and Synthesis of Digital Pre-Distortion for WLAN Power Amplifiers on a Coarse-Grained Reconfigurable Fabric

Muhammad Safdar

Linköping University SE-581 83 Linköping, Sweden

Copyright 2016 Muhammad Safdar.

Master of Science Thesis in Electrical Engineering

Muhammad Safdar

LiTH-ISY-EX--16/4997--SE

Supervisors:

Ted Johansson

ISY, Linköping University

Shuo Li

KTH, Stockholm

Examiner:

Mark Vesterbacka

ISY, Linköping University

Abstract

High data rates are in great demand in most communication systems, such as audio/video broadcasting, cable networks, and wireless networks. These rates can be achieved using Orthogonal Frequency Division Multiplexing (OFDM), which is a bandwidth-efficient method. However, the major drawback of the OFDM technique is its high Peak-to-Average Power Ratio (PAPR). Because of this high PAPR, the amplified signal is distorted if its peaks are not controlled.

This thesis investigates a PAPR reduction technique called the Fourier Projection Algorithm (FPA). In this thesis, the FPA is designed to reduce the PAPR in OFDM systems and thereby avoid clipping. The results show that the efficiency of the system depends on the throughput, the complexity, and the Tone Rate Loss (TRL) of the system. The simulations are first carried out in the SIMULINK and MATLAB environments, and the design is later synthesized on a coarse-grained reconfigurable fabric platform.

CONTENTS

Abstract
Contents
Acknowledgments
List of Figures
Glossary of Terms
1 Introduction
1.1 Motivation
1.2 Purpose
1.3 Problem Statement
1.4 Research Limitations
2 Theory
2.1 The Power Amplifier (PA)
2.2 RF Transmitter
2.3 Nonlinearity
2.4 AM-AM and AM-PM Distortions
2.5 Orthogonal Frequency Division Multiplexing (OFDM)
2.5.1 Multi-Carrier Modulation
2.5.2 Orthogonality in OFDM
2.5.3 Mathematical Definition of PAPR
2.5.4 Advantages and disadvantages of OFDM System
2.6 Non-Linearity of OFDM Signals
2.7 Compensation Techniques for Non-Linear Distortions
2.7.1 Power Back-off Method
2.7.2 Amplifier Linearization Methods
2.7.2.1 Feed-forward Linearizer
2.7.2.2 Feedback Linearizer
2.7.2.3 Pre-Distortion Linearizer
2.7.3 PAPR Reduction Techniques
2.9 Coarse-Grained Reconfigurable Architectures (CGRAs)
2.9.1 Introduction
2.9.2 Dynamically Reconfigurable Resource Array (DRRA)
2.9.2.1 Data-Path Unit (DPU)
2.9.2.2 Register File (RFile)
2.9.2.3 Sequencer
2.9.3 VESYLA
3 Implementation
3.1 Fourier Projection Algorithm
3.1.1 The POCS Algorithm
3.1.2 Flow graph of FPA
3.2 Implementation of FPA in SIMULINK
3.3 Implementation of Radix-2 FFT and IFFT on DRRA fabric
3.4 Implementation of FPA on DRRA fabric
4 Results
4.1 Number of Iterations versus Number of unused tones
4.2 Throughput versus Number of unused tones
4.3 Throughput versus Number of Iterations
4.4 Discussions
5 Conclusion and Future work
6 References

Acknowledgments

First of all, I am thankful to Prof. Ahmed Hemani for giving me the opportunity to work with his team at KTH Royal Institute of Technology, Stockholm. His encouragement, positive feedback, and guidance were always excellent. He was always available to help whenever I wanted to discuss a problem I was facing during this thesis. I would like to thank my supervisor Dr. Shuo Li at KTH, who guided and assisted me throughout my thesis and made it possible for me to understand the FPA algorithm. I learned a lot from him during this period. It was a great experience to move back to Stockholm and do my thesis at KTH.

I would like to thank my examiner Prof. Mark Vesterbacka and supervisor Prof. Ted Johansson at Linköping University for their support and guidance. I am very grateful to Ted Johansson for his valuable insights during the selection of this thesis. I would like to thank my parents and other family members for their unconditional support and love. Finally, thanks to all my friends in Linköping and Stockholm for their company and the wonderful time I had with them.

List of Figures

2_1 Block diagram of a basic wireless communication system
2_2 A schematic of RF transmitter
2_3 Response of a nonlinear amplifier with two-tone test signal
2_4 Third order Intercept Point (IP3)
2_5 AM-AM characteristics
2_6 AM-PM characteristics
2_7 Spectrum of multi-Carrier modulation
2_8 Multi-Carrier Modulator
2_9 (a) Traditional multi-carrier technique (b) Orthogonal multi-carrier Technique
2_10 Frequency Spectrum of OFDM signal
2_11 IBO and OBO
2_12 Feed-forward Linearizer
2_13 Feedback Linearizer
2_14 PD’s response opposite to a PA’s response in magnitude and phase
2_15 DRRA single cell
2_16 Mapping example for DRRA cell
3_1 Decomposition of POCS algorithm
3_2 Flow graph of FPA
3_3 Fourier Projection Algorithm from SIMULINK
3_4 Inside view of the training stage
3_5 Inside view of the FPA
3_6 Counter logic
3_7 Clipping block
3_8 Single butterfly operation
3_9 Implementation of Radix-2 FFT single butterfly on DRRA
3_10 Implementation of FPA on DRRA fabric
4_1 Number of iterations versus Number of unused tones
4_2 Number of iterations versus Number of unused tones for varying PAPR levels
4_3 Throughput versus Number of unused tones
4_4 Throughput versus Number of iterations

Glossary of Terms

ACE Active Constellation Extension

AFE Analog Front End

AGU Address Generation Units

ASICs Application Specific Integrated Circuits

BER Bit Error Rate

CF Crest Factor

CGIs Coarse Grain Instructions

CGRA Coarse-Grained Reconfigurable Architecture

DAC Digital to Analog Converter

DFT Discrete Fourier Transform

DPU Data-Path Unit

DRRA Dynamically Reconfigurable Resource Array

FFT Fast Fourier Transform

FPA Fourier Projection Algorithm

FPGAs Field-Programmable Gate Arrays

IBO Input Back-Off

IFFT Inverse Fast Fourier Transform

IM Inter-Modulation

IM3 Third-order Inter-Modulation

IP3 Third-Order Intercept Point

ISI Inter-Symbol Interference

MCM Multi-Carrier Modulation

OBO Output Back-Off

OFDM Orthogonal Frequency Division Multiplexing

P1dB 1-dB Compression Point

PA Power Amplifier

PAPR Peak-to-Average Power Ratio

PD Pre-Distortion

POCS Projection Onto Convex Sets

PTS Partial Transmit Sequences

RF Radio Frequency

RFile Register File

SLM Selective Level Mapping

SNR Signal-to-Noise Ratio

TR Tone Reservation

TRL Tone Rate Loss

1 INTRODUCTION

1.1 Motivation

The increasing demand for high-bandwidth data communication has been the main reason for adopting multi-carrier systems, and Orthogonal Frequency Division Multiplexing (OFDM) systems in particular, in recent communication systems. Despite the many benefits of OFDM systems, the high Peak-to-Average Power Ratio (PAPR) of OFDM signals complicates the design of the Digital to Analog Converter (DAC) and the Analog Front End (AFE).

The large fluctuations in the power level of an OFDM signal demand DACs with a higher dynamic range, which consume more power. In addition, the signal is distorted by the limited linearity of the Power Amplifier (PA), which causes spectral re-growth of the signal. The spectral re-growth in turn causes unwanted interference with the adjacent channels. Reducing the PAPR therefore reduces the power dissipation and lowers the cost of the DAC and AFE.

There are different ways to deal with a high PAPR. Many methods for linearizing PAs have been presented in the literature [1], and many techniques have been introduced for reducing the PAPR of OFDM systems [2]. This thesis mainly focuses on PAPR reduction techniques, but an overview of some linearization techniques is also presented and compared with the PAPR reduction techniques. System simulations are carried out in the SIMULINK and MATLAB environments, and the design is later synthesized on CREST, a coarse-grained reconfigurable embedded system technology platform.

1.2 Purpose

This thesis uses a technique called the Fourier Projection Algorithm (FPA) to reduce the PAPR in OFDM communication systems. The purpose of this thesis is to model the FPA in the SIMULINK and MATLAB environments and then synthesize it on the CREST fabric, a Coarse-Grained Reconfigurable Architecture (CGRA) developed by KTH Royal Institute of Technology in Sweden together with IIT Delhi and IISc Bangalore in India [3]. CREST provides better flexibility than Application Specific Integrated Circuits (ASICs) and better performance than Field-Programmable Gate Arrays (FPGAs). The SIMULINK and MATLAB models and the synthesized design serve as a reference design for refining the CREST fabric.

1.3 Problem Statement

The output of an OFDM system is the superposition of multiple subcarriers. When many of these subcarriers add in phase, the instantaneous output power can be much higher than the average power. This is known as high PAPR, and it is the most serious problem in OFDM systems. A high PAPR requires power amplifiers with a high dynamic range, which are very expensive. Since the dynamic range of practical power amplifiers is limited, the PAPR of the OFDM signal should be reduced. The FPA is an algorithm for reducing the PAPR of an OFDM system. To evaluate the FPA implementation, we first compute the PAPR of the original signal; after the signal has been modified by the implemented FPA, we compute the resulting PAPR and compare the two values. The questions to be answered are: How can the FPA be implemented on CGRAs? How is the implemented FPA evaluated? How effective is the implemented FPA at reducing the PAPR?

1.4 Research Limitations

This thesis focuses on the FPA technique. The implemented FPA targets OFDM systems, since they normally produce high-PAPR signals that can easily exceed the dynamic range of the power amplifier. The FPA reduces the peak values so that the signal stays within the dynamic range of the power amplifier. The FPA is implemented on a CGRA because CREST II, a joint project of KTH and Catena, requires a CGRA implementation of the algorithm.

2 THEORY

Nowadays, the Orthogonal Frequency Division Multiplexing (OFDM) technique is preferred over the traditional Multi-Carrier Modulation (MCM) technique because of useful properties such as its high data rates. This chapter describes the basic concepts of the Power Amplifier (PA), the RF transmitter, the third-order Inter-Modulation (IM3) distortion generated by the PA, and the PA’s characteristic curve. Later, the non-linearity problems that arise in the PA due to the high Peak-to-Average Power Ratio (PAPR) of OFDM systems are discussed in detail.

2.1 The Power Amplifier (PA)

The PA plays a key role in modern communication systems. It is placed in the transmitter to raise the power level of the signal before it is sent to the antenna. Without sufficient amplification, the signal arrives at the receiver with a poor Signal-to-Noise Ratio (SNR) and cannot be recovered reliably. The PA is therefore an essential component of the transmitter, but it also introduces several impairments into the signal.

The important parameters of the PA are its output power, gain, distortion, and efficiency. The gain of the PA should be as high as possible, while its distortion should be kept as small as possible.

The PA’s efficiency is another important factor in transmitter design, because devices containing PAs are often battery driven, e.g. a mobile telephone. A PA with lower efficiency dissipates more power as heat and therefore drains the battery faster. The PA has a linear response under small-signal conditions and becomes increasingly non-linear as the signal level increases.

Operating the PA at higher efficiency drives it toward its non-linear region, because its linear range is limited. The non-linearity produces unwanted frequency components, which cause spectral re-growth of the signal and in-band distortion. Spectral re-growth causes adjacent channel interference, in which the sidebands of channels that are adjacent in frequency interfere with each other, while in-band distortion degrades the Bit Error Rate (BER). Thus, there is a trade-off between efficiency and interference. Adjacent channel interference is limited by the regulatory authorities and must be tightly controlled to comply with the rules. One way of avoiding adjacent channel interference is to run the PA in back-off mode, but this reduces the efficiency of the amplifier.

Because of this non-linear behavior, the input peak power of the PA must be kept below a certain level; otherwise the output is distorted. Before turning to PAPR reduction in OFDM systems, it is useful to understand the block diagram of the transmitter and the non-linearities of the PA, which are discussed in the next two sections.

2.2 RF Transmitter

An overall wireless system can be represented by Figure 2_1. The digital baseband signals to be transmitted are converted to analog, up-converted, and finally transmitted via an antenna through the channel. The received signals are down-converted to baseband and then converted back to the digital domain.

[Figure: transmit path (Source, Encoder, D/A, RF Tx, Tx Antenna), Channel, receive path (Rx Antenna, RF Rx, A/D, Decoder, Source)]

Figure 2_1: Block diagram of a basic wireless communication system

The PA and the mixer contribute most of the non-linear impairments in the transmitter. They are usually located just before the antenna, as shown in Figure 2_2. Mixers add phase noise, non-linearity, and spurious frequencies, while the PA adds non-linearity to the signal. The non-linear characteristics of the PA are a problem for OFDM systems because of their high PAPR.

[Figure: blocks Modulator, Mixer, Local Oscillator, PA, Antenna]

Figure 2_2: A schematic of RF transmitter

2.3 Nonlinearity

In a perfectly linear system, the principle of superposition holds. Suppose that K maps the input x to an output y, so that

$$y = K(x). \tag{2.1}$$

For two inputs x1 and x2 we can write

$$y_1 = K(x_1), \tag{2.2}$$

$$y_2 = K(x_2). \tag{2.3}$$

A linear system must then satisfy, for any constants a and b,

$$K(a x_1 + b x_2) = a K(x_1) + b K(x_2) = a y_1 + b y_2. \tag{2.4}$$

In a non-linear system, the superposition property no longer holds. In the frequency domain, the output of a linear system contains only the tones that are present at its input, while a non-linear system generates additional tones such as harmonics.

Electronic devices are never perfectly linear. This non-linear behavior is usually undesirable and is characterized by a single-tone test and a two-tone test.

In a single-tone test, a single tone at frequency f is applied to the input of a non-linear PA. In addition to the original tone, components at integer multiples of f (2f, 3f, ...) appear at the output of the amplifier. These tones are called harmonics.
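The single-tone test can be illustrated numerically. The following is a minimal sketch in Python with NumPy, assuming a hypothetical memoryless polynomial PA model; the coefficients a1, a2, a3, the sample rate, and the tone frequency are illustrative choices and are not taken from this thesis.

```python
import numpy as np

def tone_spectrum(signal, fs):
    """Return (frequencies, amplitudes) of a real signal via a one-sided FFT."""
    n = len(signal)
    amplitudes = np.abs(np.fft.rfft(signal)) * 2.0 / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, amplitudes

fs = 1000.0                    # sample rate in Hz (assumed)
f = 50.0                       # input tone in Hz (assumed)
t = np.arange(1000) / fs       # exactly 50 periods, so there is no leakage
x = np.cos(2 * np.pi * f * t)

# Hypothetical PA model: linear gain plus second- and third-order terms.
a1, a2, a3 = 10.0, 1.0, -2.0
y = a1 * x + a2 * x**2 + a3 * x**3

freqs, amp = tone_spectrum(y, fs)
harmonics = freqs[amp > 1e-6]  # frequencies that carry energy at the output
print(harmonics)               # DC, f, 2f and 3f
print(amp[50])                 # fundamental amplitude, compressed below a1
```

With a negative third-order coefficient, the fundamental at the output is smaller than a1 times the input amplitude, which is the gain compression discussed later in this chapter.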

In a two-tone test, two closely spaced tones f1 and f2 are applied to the input of a non-linear device. The output then includes Inter-Modulation (IM) products of the form nf1 ± mf2, as shown in Figure 2_3, where m and n are positive integers and m+n is known as the order of the distortion. The third-order Inter-Modulation products (IM3) at 2f1 - f2 and 2f2 - f1 fall close to the original tones, so they cause undesired spectral re-growth at the output and hence interference with the adjacent channels. The original tones, f1 and f2, also appear amplified at the output of the PA. The PA generates further distortion tones besides the IM3 products, but these lie far away from the main tones. They are less harmful because they can easily be filtered out.

[Figure: input spectrum (Pin versus frequency) with tones f1 and f2 applied to an RF PA; output spectrum (Pout versus frequency) with a DC zone (f2–f1, 2f2–2f1), a fundamental zone (3f1–2f2, 2f1–f2, f1, f2, 2f2–f1, 3f2–2f1), a second harmonic zone (2f1, f1+f2, 2f2), and a third harmonic zone (3f1, 2f1+f2, 2f2+f1, 3f2)]

Figure 2_3: Response of a nonlinear amplifier with two-tone test signal
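The location of the IM3 products can be reproduced with a short simulation. This is a sketch only, assuming a hypothetical memoryless third-order PA model; the coefficients a1 and a3 and the tone frequencies are illustrative values, not parameters of a PA studied in this thesis.

```python
import numpy as np

fs = 1000.0                      # sample rate in Hz (assumed)
n = 1000                         # 1 Hz bin spacing, so all tones fall on bins
t = np.arange(n) / fs
f1, f2 = 100.0, 110.0            # two closely spaced test tones (assumed)
x = np.cos(2 * np.pi * f1 * t) + np.cos(2 * np.pi * f2 * t)

# Hypothetical third-order PA model (illustrative coefficients).
a1, a3 = 10.0, -0.5
y = a1 * x + a3 * x**3

amp = np.abs(np.fft.rfft(y)) * 2.0 / n       # one-sided amplitude spectrum
freqs = np.fft.rfftfreq(n, d=1.0 / fs)

def amplitude_at(freq_hz):
    """Amplitude of the spectral line closest to freq_hz."""
    return amp[np.argmin(np.abs(freqs - freq_hz))]

for label, freq in [("f1", f1), ("f2", f2),
                    ("2f1-f2", 2 * f1 - f2), ("2f2-f1", 2 * f2 - f1),
                    ("3f1", 3 * f1), ("2f1+f2", 2 * f1 + f2)]:
    print(f"{label:>7} = {freq:5.0f} Hz, amplitude {amplitude_at(freq):.3f}")
```

In this example the IM3 products at 90 Hz and 120 Hz lie only 10 Hz away from the test tones, whereas the third-harmonic-zone products lie near 300 Hz, which is why only the latter are easy to remove by filtering.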

Figure 2_4 shows plots of Pout versus Pin and of IM3 versus Pin, where Pin is the input power and Pout is the output power, both in dBm. On this logarithmic scale, the slope of the fundamental Pout is 1, while the slope of IM3 is 3. If the two straight lines are extrapolated, they intersect at a point called the Third-Order Intercept Point (IP3). IP3 is used for approximating the linear region of the PA. The higher the IP3, the less distortion there is at a given power level, and vice versa.
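Because the two lines have slopes 1 and 3, the intercept point follows from a single two-tone measurement taken well below compression: the lines are separated by Δ dB at the measurement point and close at 2 dB per dB of input, so they meet Δ/2 dB above it. The sketch below applies this geometry; the function name and the numbers are hypothetical and serve only as an illustration.

```python
def intercept_point(p_in_dbm, p_fund_dbm, p_im3_dbm):
    """Extrapolate the slope-1 and slope-3 lines to their crossing point.

    Returns (IIP3, OIP3) in dBm from one two-tone measurement taken well
    below compression, where the ideal slopes still hold.
    """
    delta_db = p_fund_dbm - p_im3_dbm      # spacing between the two lines
    return p_in_dbm + delta_db / 2.0, p_fund_dbm + delta_db / 2.0


# Illustrative numbers only (not measurements from this thesis).
p_in, p_fund, p_im3 = -10.0, 10.0, -30.0
iip3, oip3 = intercept_point(p_in, p_fund, p_im3)

# Check: both extrapolated lines give the same output power at Pin = IIP3.
fund_at_iip3 = p_fund + 1.0 * (iip3 - p_in)
im3_at_iip3 = p_im3 + 3.0 * (iip3 - p_in)
print(iip3, oip3, fund_at_iip3, im3_at_iip3)
```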

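[Figure: Pout (dBm) versus Pin (dBm); fundamental line with slope 1 and IM3 products line with slope 3, extrapolated to the intercept point IP3 with coordinates IIP3 and OIP3; 1 dB compression point marked with P1dB,in and P1dB,out]

Figure 2_4: Third order intercept point (IP3)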

Figure 2_4 also shows another way of characterizing the PA: the 1-dB compression point (P1dB). It is defined as the point at which the actual output power of the device falls 1 dB below the extrapolated linear output.

2.4 AM-AM and AM-PM Distortions

The PA’s performance is power limited, i.e. beyond a certain input power its output becomes non-linear, and the distortion of the signal grows with every further increase of the power level. The non-linear characteristics of the PA can be represented using AM-AM and AM-PM plots, where AM stands for amplitude modulation and PM for phase modulation. The AM-AM plot represents the amplitude distortion, while the AM-PM plot represents the phase distortion at the output. The PA’s curves have three different operating regions:

• Linear region, where the output follows the input
• Saturation region, where the output reaches its maximum level
• Non-linear region, also known as the compression region, where the PA’s output decreases with every increase of the input power level

These regions can be seen in Figures 2_5 and 2_6 [4].

[Figure: OBO (dB) versus IBO (dB), with the linear region, saturation region, and compression region marked]

Figure 2_5: AM-AM characteristics [4]

[Figure: phase (degrees) versus IBO (dB)]

Figure 2_6: AM-PM characteristics [4]

2.5 Orthogonal Frequency Division Multiplexing (OFDM)

In the next sub-sections, basic concepts of Multi-Carrier Modulation (MCM), Orthogonal Frequency Division Multiplexing (OFDM), the mathematical definition of PAPR, and the advantages and disadvantages of OFDM systems are discussed.

2.5.1 Multi-Carrier Modulation (MCM)

In this scheme, the transmitted data stream is divided into multiple parallel streams, each with a low bit rate, which are carried on sub-channels of equal bandwidth. Each sub-carrier is modulated independently with a narrow-band signal [5].

A single-carrier modulation scheme does not utilize the whole bandwidth efficiently. The idea of MCM was therefore presented in the mid-1960s [6]. In a classical parallel transmission system, the total signal bandwidth W is divided into N non-overlapping, equidistant sub-channels, Ch. 1 to Ch. N, with identical bandwidths. Unlike in single-carrier modulation, each sub-channel in MCM is modulated using a separate sub-carrier, as shown in Figure 2_7. The modulated signals are then added to obtain the signal for transmission. Guard bands are inserted between the sub-channels to avoid spectral overlap [6].

Figure 2_7: Spectrum of multi-Carrier modulation

The basic principle of generating an MCM signal is shown in Figure 2_8. The input data M is divided into n messages, which are then modulated onto different sub-carriers.

Figure 2_8: Multi-Carrier Modulator

The modulated carriers are summed for transmission. At the receiver, the reverse of the transmitter operation is performed: the modulated sub-carriers are separated using filters and then demodulated to retrieve the transmitted messages [7].

2.5.2 Orthogonality in OFDM

OFDM is a special case of the MCM technique. In OFDM systems, the sub-carriers are orthogonal to each other and the information is sent on parallel, overlapping sub-carriers, unlike in traditional MCM, where it is sent on non-overlapping sub-carriers. Bandwidth is thus saved by using overlapping orthogonal sub-carriers. The other advantage of orthogonality is that neighboring carriers cause less interference [5]. Figure 2_9 [6] shows that almost 50% of the bandwidth can be saved by using the OFDM technique. The saved bandwidth can then be used for sending more information.

Figure 2_9: (a) Traditional multi-carrier technique (b) Orthogonal multi-carrier Technique [6]

The frequency spectrum of the OFDM signal is shown in Figure 2_10. The spectrum of a single sub-carrier is a sinc function in the frequency domain. All the sub-carriers in this spectrum are orthogonal to each other: the peak of each sub-carrier coincides with the zero crossings of all the others, so they do not interfere with each other. Accurate carrier synchronization is very important in OFDM systems; without it, orthogonality is lost and the adjacent sub-channels interfere with each other.

Figure 2_10: Frequency Spectrum of OFDM signal [5]
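The orthogonality of the sub-carriers, and its loss under a carrier-frequency offset, can be checked numerically. The following sketch assumes complex baseband sub-carriers sampled N times per OFDM symbol; N, the sub-carrier indices, and the offset of 0.1 sub-carrier spacings are illustrative assumptions.

```python
import numpy as np

N = 64                                   # samples per OFDM symbol (assumed)
n = np.arange(N)

def subcarrier(k):
    """Complex baseband sub-carrier k, sampled over one symbol period."""
    return np.exp(2j * np.pi * k * n / N)

def inner_product(k, m):
    """Normalized inner product of sub-carriers k and m over one symbol."""
    return np.vdot(subcarrier(m), subcarrier(k)) / N

print(abs(inner_product(3, 3)))          # same sub-carrier: unit energy
print(abs(inner_product(3, 4)))          # adjacent sub-carrier: numerically zero

# A carrier-frequency offset of 0.1 sub-carrier spacings (illustrative value)
# destroys orthogonality: sub-carrier 3 now leaks into sub-carrier 4.
offset = np.exp(2j * np.pi * 0.1 * n / N)
leak = np.vdot(subcarrier(4), subcarrier(3) * offset) / N
print(abs(leak))
```

The third value is clearly non-zero, which illustrates why accurate carrier synchronization is required.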

2.5.3 Mathematical Definition of PAPR

The fluctuations in the OFDM signal can be expressed in terms of the PAPR, which is the ratio between the maximum instantaneous power and the average power of the signal. The PAPR is related to another quantity called the Crest Factor (CF), which is another measure of how extreme the peaks of a waveform are. The CF of a waveform is the ratio of its peak value to its effective (RMS) value. The relation between CF and PAPR can be expressed as

$$\mathrm{CF} = \sqrt{\mathrm{PAPR}} \tag{2.5}$$

or

$$\mathrm{PAPR} = \mathrm{CF}^2. \tag{2.6}$$

The PAPR of the OFDM signal can be expressed by the equation

$$\mathrm{PAPR}\{x(t)\} = \frac{\max_{t \in \tau} |x(t)|^2}{E\{|x(t)|^2\}}, \tag{2.7}$$

where x(t) is the original OFDM signal, τ is the time interval, max_{t∈τ} |x(t)|² is the peak signal power, and E{|x(t)|²} is the average signal power.
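As an illustration of (2.5) to (2.7), the PAPR of a sampled OFDM symbol can be computed as follows. This is a minimal sketch, assuming QPSK-modulated sub-carriers and a 64-point IFFT without oversampling; these choices are illustrative and do not describe the SIMULINK model of this thesis. Without oversampling, the sampled PAPR can underestimate the PAPR of the continuous-time signal.

```python
import numpy as np

def papr_db(x):
    """PAPR of a sampled signal in dB: peak power over mean power, as in (2.7)."""
    power = np.abs(x) ** 2
    return 10.0 * np.log10(power.max() / power.mean())

def crest_factor(x):
    """Crest factor: peak amplitude over RMS amplitude, i.e. sqrt(PAPR)."""
    return np.abs(x).max() / np.sqrt(np.mean(np.abs(x) ** 2))

rng = np.random.default_rng(0)
n_sub = 64                                     # number of sub-carriers (assumed)
qpsk = (rng.choice([-1, 1], n_sub) + 1j * rng.choice([-1, 1], n_sub)) / np.sqrt(2)
ofdm_symbol = np.fft.ifft(qpsk)                # time-domain OFDM symbol

print(f"random QPSK symbol:     PAPR = {papr_db(ofdm_symbol):.2f} dB")

# Worst case: all sub-carriers carry the same value and add in phase at n = 0,
# which gives a PAPR equal to the number of sub-carriers.
worst = np.fft.ifft(np.ones(n_sub))
print(f"in-phase sub-carriers:  PAPR = {papr_db(worst):.2f} dB")
```

The worst case shows why the PAPR problem grows with the number of sub-carriers: with N sub-carriers adding in phase, the peak power is N times the average power.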

2.5.4 Advantages and Disadvantages of OFDM Systems

According to [8], OFDM systems have several advantages and disadvantages. The advantages are:

• OFDM makes efficient use of the bandwidth by using overlapping sub-carriers.
• Modulation and demodulation can be implemented efficiently using IFFT and FFT operations.
• Interference does not affect all sub-channels at once, so not all of the data is lost.
• OFDM systems have better resistance to frequency-selective fading because they use multiple narrow-band signals.

Despite these advantages, OFDM systems also have drawbacks:

• OFDM systems have a high PAPR, i.e. their amplitude fluctuates strongly. Hence, they require highly linear power amplifiers to accommodate the large amplitude variations.
• Accurate synchronization is required; otherwise the adjacent channels interfere with each other.
• OFDM systems are more sensitive to carrier-frequency offset and drift than single-carrier systems.

2.6 Non-Linearity of OFDM Signals

As discussed in the previous section, OFDM signals have a high PAPR, i.e. very large amplitude variations, and therefore require a highly linear power amplifier. These fluctuations arise because the IFFT at the transmitter sums a large number of independently modulated sub-carriers.

Because of this high PAPR, the signal peaks drive the power amplifier into its saturation and compression regions in an uncontrolled way. OFDM systems therefore suffer more from non-linear PAs than traditional single-carrier systems. Several techniques exist for compensating the non-linear distortion; they are discussed in the next section.

2.7 Compensation Techniques for Non-Linear Distortions

To decrease the BER and the out-of-band distortion while still operating the PA close to its saturation region, the non-linear distortion of the system must be compensated. Several approaches provide distortion compensation in the transmitter. They are grouped into three main classes:

• Power back-off method
• Amplifier linearization methods
• PAPR reduction methods

It is useful to understand these methods before the detailed discussion of the FPA technique. Each of the three is described briefly in the following sub-sections.

2.7.1 Power Back-off Method

One of the main problems in PAs is the generation of Inter-Modulation (IM) terms. When a multi-tone signal is applied to the input of a PA, the PA amplifies the desired signal but also generates unwanted IM terms. This non-linearity increases as the power level approaches the saturation point, i.e. the region where the output reaches its maximum level, although the exact behavior varies from amplifier to amplifier and with the operating conditions.

To tackle this non-linearity issue, the PA is operated at a power back-off from its saturation point. This means that the maximum output power level of the PA is reduced so that the signal stays within the linear range of the PA’s transfer curve.

The amount of back-off, which determines how much the non-linear distortion is reduced, is measured using two quantities: the Input Back-Off (IBO) and the Output Back-Off (OBO). These are defined in dB according to [9] by

$$\mathrm{IBO} = 10 \log_{10} \frac{P_{\max,\mathrm{in}}}{\langle P_{\mathrm{in}} \rangle}, \tag{2.8}$$

$$\mathrm{OBO} = 10 \log_{10} \frac{P_{\max,\mathrm{out}}}{\langle P_{\mathrm{out}} \rangle}, \tag{2.9}$$

where Pmax,in and Pmax,out are the input and output saturation power levels, and <Pin> and <Pout> are the average input and output powers. Either the IBO or the OBO can be used to specify the operating point of a PA. Both are illustrated graphically in Figure 2_11.
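The definitions (2.8) and (2.9) translate directly into a small helper. The sketch below is illustrative only: the function name and the power levels are hypothetical and are not measurements from this thesis.

```python
import math

def back_off_db(p_sat_watts, p_avg_watts):
    """Back-off in dB between a saturation power and an average power,
    following the form of (2.8) and (2.9)."""
    return 10.0 * math.log10(p_sat_watts / p_avg_watts)

# Illustrative numbers only.
ibo = back_off_db(p_sat_watts=0.010, p_avg_watts=0.001)   # input side
obo = back_off_db(p_sat_watts=1.0, p_avg_watts=0.2)       # output side
print(f"IBO = {ibo:.2f} dB, OBO = {obo:.2f} dB")
```

In this example the OBO is smaller than the IBO, as expected when the gain is already compressed near saturation. To avoid clipping the peaks of a signal, the back-off has to be at least as large as the PAPR of the signal in dB, which is why reducing the PAPR allows the PA to be operated closer to saturation.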

The disadvantage of operating in back-off mode is that it decreases the power efficiency of the PA. This technique is therefore not considered a good option for dealing with non-linear distortion.

[Figure: Pout versus Pin; 1 dB compression point with IP1dB and OP1dB, saturation region, operation point, input back-off, output back-off, and back-off from P1dB marked]

Figure 2_11: IBO and OBO

2.7.2 Amplifier Linearization Methods

Amplifier linearization is a way to reduce the distortion of PAs. Many linearization techniques have been presented to compensate for the non-linear distortion of PAs; they add extra circuitry or components to counteract the non-linearity of the amplifier. Although there are many methods, the most common approaches are feed-forward, feedback, and pre-distortion. Each of these techniques compensates for the non-linearity of the PA in a different way.

Digital techniques are preferred over analog techniques because their baseband implementation offers a cost-effective solution. These amplifier linearization techniques are briefly discussed in the following sub-sections [1].

2.7.2.1 Feed-forward Linearizer

The feed-forward linearizer is shown in Figure 2_12 [1]. This system has two loops: a signal-cancellation loop and an error-cancellation loop. The first loop extracts the distortion signal of the main amplifier by subtracting the input signal from the output signal. The second loop produces a distortion-free signal by subtracting the amplified distortion signal from the delayed, distorted output of the main amplifier. Hence the resulting output signal is, ideally, distortion free.

The disadvantages of this technique are its high power consumption and the complexity of tracking changes in component behavior, e.g. due to temperature or to component properties that change with time.
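The two loops can be illustrated with an idealized numerical sketch. It assumes a memoryless main amplifier with a cubic distortion term, perfectly matched delays and ideal couplers; the functions `main_amp` and `feed_forward` and all coefficient values are hypothetical and only serve the illustration.

```python
def main_amp(v, gain=10.0, a3=0.05):
    """Assumed main amplifier: linear gain plus a cubic distortion term."""
    return gain * (v - a3 * v ** 3)

def feed_forward(v_in, gain=10.0, a3=0.05):
    """Return (distorted main-amplifier output, linearized output)."""
    v_out1 = main_amp(v_in, gain, a3)
    # Loop 1 (signal cancellation): attenuated output minus the input
    # leaves only the distortion produced by the main amplifier.
    error = v_out1 / gain - v_in
    # Loop 2 (error cancellation): the auxiliary amplifier scales the error
    # back up, and it is subtracted from the main-amplifier output.
    v_out2 = v_out1 - gain * error
    return v_out1, v_out2

for v in (0.2, 0.5, 1.0):
    distorted, linearized = feed_forward(v)
    print(f"v_in={v:.1f}  main amp={distorted:.4f}  feed-forward={linearized:.4f}")
```

In this ideal model the cancellation is exact. In a real linearizer, gain and delay mismatch between the branches limits how much of the distortion is removed, which is the tracking problem mentioned above.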

[Figure: block diagram with the main amplifier, the auxiliary amplifier, two couplers and two delay elements; signals Vin, Vout1 and Vout2]

Figure 2_12: Feed-forward linearizer [1]

2.7.2.2 Feedback Linearizer

The basic principle of the feedback linearizer is illustrated in Figure 2_13. In this technique, the output of the amplifier is attenuated and subtracted from the input signal to compensate for the distortion of the amplifier. The gain of the closed loop is:

GCL = Gamp / (1 + Gamp). (2.10)

The closed loop therefore reduces the gain of the amplifier, but in exchange it attenuates the distortion by a factor of 1/(1 + Gamp).
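A short worked example shows how quickly the closed-loop gain and the distortion attenuation of equation (2.10) change with the amplifier gain. The parameter `alpha` is an addition for this sketch: with a feedback attenuation α, as drawn in Figure 2_13, the standard closed-loop expression is Gamp / (1 + α Gamp), and equation (2.10) corresponds to α = 1. The gain values are arbitrary.

```python
def closed_loop(g_amp, alpha=1.0):
    """Return (closed-loop gain, distortion attenuation factor)."""
    loop = 1.0 + alpha * g_amp
    return g_amp / loop, 1.0 / loop

for g in (10.0, 100.0, 1000.0):
    g_cl, d = closed_loop(g)
    print(f"Gamp={g:7.1f}  GCL={g_cl:.4f}  distortion factor={d:.5f}")
```

The larger the open-loop gain, the closer the closed-loop gain gets to 1/α and the stronger the suppression of the distortion.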

The best linearity is obtained by correcting both the phase and the amplitude. The envelope feedback technique corrects only the amplitude, whereas the Cartesian feedback technique corrects both the amplitude and the phase. The disadvantage of the Cartesian feedback technique is that it is not successful in high-frequency applications.

[Figure: feedback loop around the PA with attenuator α; signals Vin(t), Verror(t), Vd(t) and Vout(t)]

Figure 2_13: Feedback Linearizer

2.7.2.3 Pre-Distortion Linearizer

Pre-Distortion (PD) is a method of distorting the signal before it is sent to the PA. The PD has the inverse transfer characteristic of the PA, as shown in Figure 2_14 [1], so that a linearized characteristic is obtained at the output. Another way to look at it is that the PD circuitry produces Inter-Modulation (IM) products that are equal in magnitude but 180 degrees out of phase with the IM products produced by the PA. Depending on where the PD linearizer is placed, it falls into one of two main categories: analog IF/RF pre-distortion and digital baseband pre-distortion.

Analog PDs are small and inexpensive, but they usually target only the third-order Inter-Modulation (IM3) products. Digital PDs, on the other hand, are easier to make adaptive and have become popular owing to rapid advances in FPGAs, ASICs, etc.
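The inverse-characteristic idea can be sketched with a toy model. The PA is assumed to be memoryless with unit gain and a cubic compression term, and the pre-distorter is a first-order inverse that expands the signal by the same cubic term. Both functions and the coefficient `a3` are assumptions for illustration, not the pre-distorter designed in this thesis.

```python
def pa(x, a3=0.1):
    """Assumed memoryless PA: unit gain with cubic compression."""
    return x - a3 * x ** 3

def predistort(x, a3=0.1):
    """First-order inverse of pa(): expands where the PA compresses."""
    return x + a3 * x ** 3

levels = [i / 10 for i in range(9)]                        # amplitudes 0.0 ... 0.8
err_plain = max(abs(pa(x) - x) for x in levels)            # PA alone
err_pd = max(abs(pa(predistort(x)) - x) for x in levels)   # PD followed by PA
print(f"max deviation from linear: PA only {err_plain:.4f}, with PD {err_pd:.4f}")
```

The cascade is not perfectly linear, because the first-order inverse leaves fifth- and higher-order residual terms, but the deviation from the ideal linear response is clearly smaller than for the PA alone.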

[Figure: pre-distortion linearizer followed by the PA; the transfer curves Pout,dpd versus Pin,dpd and Pout,A versus Pin,A combine into a linear overall Pout versus Pin curve]

Figure 2_14: PDs response opposite to a PA’s response in magnitude and phase [1]


2.7.3 PAPR Reduction Techniques

Many techniques exist for reducing the PAPR of OFDM systems, but the main focus of this thesis is on a technique called the Fourier Projection Algorithm (FPA), which is discussed in detail in chapter 3. All PAPR reduction methods involve a trade-off between an increase in Bit-Error-Rate (BER), an increase in transmit signal power, a loss in data rate, and an increase in computational complexity.

According to [2], PAPR reduction techniques are categorized into two main groups:

Signal scrambling techniques: These techniques scramble the codes to decrease the PAPR of the system. The best-known techniques in this category are Tone Reservation (TR), Selected Mapping (SLM) and Partial Transmit Sequences (PTS). Their disadvantage is that they decrease the throughput of the system by introducing redundancy.

Signal distortion techniques: These techniques clip the signal to reduce the high peaks prior to amplification. Clipping distorts the signal, which causes both in-band and out-of-band interference, and it increases the complexity of the system. The most practical techniques in this category are peak clipping and filtering, windowing, peak cancellation, peak power suppression, weighted multicarrier transmission, etc.

2.8 Coarse-Grained Reconfigurable Architectures (CGRAs)

2.8.1 Introduction

Field-Programmable Gate Arrays (FPGAs) have long been an affordable way to implement logic circuits without access to an integrated-circuit fabrication facility. One drawback of FPGAs is that the logic functions are programmed at bit level, which is unnecessary for many applications. Another disadvantage is that a large amount of area is needed for the large number of processing units and routing switches [10]. A newer trend is to perform the standard logic operations at word level instead of bit level. Coarse-Grained Reconfigurable Architectures (CGRAs) provide word-level optimizations and very efficient routing switches. CGRAs are less flexible than traditional FPGAs but offer reduced energy consumption and area, as well as a smaller configuration memory and a shorter configuration time. Much research is ongoing to integrate time-multiplexing, parallelism, and power management in CGRAs. Recent CGRAs provide improved performance and flexibility at the cost of additional memory [11]. The performance-flexibility gap between Application Specific Integrated Circuits (ASICs) and FPGAs has always been a problem, and CGRAs now fill this gap.

The Dynamically Reconfigurable Resource Array (DRRA) fabric has been designed by researchers at the Royal Institute of Technology (KTH), Stockholm, and is used for customization of the CGRA. The next section gives a brief introduction to the DRRA. The DRRA is still in an early stage of development, so the fabric still has certain limitations and problems.

2.8.2 Dynamically Reconfigurable Resource Array (DRRA)

The DRRA is a fabric of coarse-grained cells. Each cell in this fabric contains a Register File (RFile), a sequencer, four Address Generation Units (AGUs) and a Data Path Unit (DPU). A single DRRA cell is shown in Figure 2_15. The current DRRA version has 2 rows and 4 columns of cells.

[Figure: one DRRA cell containing an RFile, a DPU and a sequencer]

Figure 2_15: DRRA single cell

2.8.2.1 Data-Path Unit (DPU)

The DPU has four inputs and two outputs, corresponding to two complex numbers, and has a 16-bit data path. The inputs and outputs of the DPU receive data from and send data to the DRRA interconnection network. The DPU has different modes for different operations, such as multiple add/subtract or multiply-and-add operations. It can perform truncation, rounding and saturation, as well as signal processing operations such as encoding, scrambling, etc. DPUs can also be configured in parallel to achieve parallelism of operations.

2.8.2.2 Register File (RFile)

The Register File (RFile) is used as local storage for implementing the Coarse Grain Instructions (CGIs). The RFile has 2 read ports and 2 write ports and can store 32 words of 16 bits each. Every port of the RFile has a dedicated Address Generation Unit (AGU), which is controlled by the sequencer.

2.8.2.3 Sequencer

The DRRA is controlled by the sequencer. Each sequencer controls only a single DRRA cell in

the DRRA fabric, which makes it easier for customization, unlike the other architectures where

a single controller is used. Each sequencer controls the RFile, DPU, and interconnect-switches

in its DRRA cell.

2.9 VESYLA

A software tool called VESYLA is used for mapping an application onto the DRRA. VESYLA is a compiler that accepts MATLAB code and generates VHDL to run on the DRRA fabric. VESYLA currently has several limitations, as it is still being developed.

VESYLA does not currently support automatic resource allocation. Therefore, the MATLAB code is mapped manually to the fabric. For example, register and processor allocation are specified as follows:

For Register Allocation: %! RFILE<>[R_index, C_index]

For Processor Allocation: %! CDPU[R_index, C_index]

where R_index and C_index indicate the location of the RFile and CDPU in the fabric.

Writing the MATLAB code for VESYLA can be explained by one simple example. Let us

assume we want to implement the operation,

Result = (a+b-c) * d. (2.11)


This task can be mapped onto the DRRA fabric as shown in Figure 2_16. DRRA cell [0, 0] calculates a+b, which is then sent to the next DRRA cell [0, 1] to calculate a+b-c. Finally, this is fed to the last DRRA cell [0, 2], which gives the final result (a+b-c) * d. The result is stored in RFile[0, 2].

The MATLAB code for this example is given in Table 3_1 below.

Figure 2_16: Mapping example for DRRA cell

Table 3_1: MATLAB source code example for VESYLA

a = [0]; %! RFILE<>[0, 0]
b = [0]; %! RFILE<>[0, 0]
c = [0]; %! RFILE<>[0, 1]
d = [0]; %! RFILE<>[0, 2]
abcd = [0]; %! RFILE<>[0, 2]
ab(1) = a(1) + b(1); %! CDPU<>[0, 0]
abc(1) = ab(1) - c(1); %! CDPU<>[0, 1]
abcd(1) = abc(1) * d(1); %! DPU<>[0, 2]

Implementation


3 IMPLEMENTATION

This chapter discusses in detail the Fourier Projection Algorithm (FPA) for reducing the Peak-to-Average Power Ratio (PAPR) of an Orthogonal Frequency Division Multiplexing (OFDM) system. The algorithm is first implemented in the SIMULINK and MATLAB environments and then synthesized on the coarse-grained reconfigurable fabric platform.

3.1 Fourier Projection Algorithm (FPA)

As seen in the previous chapter, OFDM systems have a very high PAPR. The Fourier Projection Algorithm (FPA) is a technique that reduces the peaks in order to avoid distortion in OFDM systems. An OFDM system rarely uses all of its bandwidth, i.e. some portion of the available bandwidth carries little or no data. The tones carrying no data are reserved for clipping control. Since the reserved tones in OFDM are orthogonal to the rest of the tones, changing their energy does not affect the original data. The reserved tones are used to store the energy clipped from the high peaks of the original data, so that the non-linear distortion of the Power Amplifier (PA) is avoided.

The purpose of this technique is to keep the magnitude of every element of a length-N signal below the pre-defined clipping threshold C. The FPA is based on a popular algorithm called Projection Onto Convex Sets (POCS) [12], which converges by bouncing back and forth between two sets. First, L tones in the frequency domain are reserved for clipping control. These tones carry no data but are used to store the energy of the clipped used tones. The length-N signal in the time domain is projected onto an N-dimensional clipping 'hypercube' centered on the origin with side length 2C: every sample whose value exceeds the threshold C is clipped to C. The clipped signal is then transformed into the frequency domain, where its used tones are reset while the L unused tones are left as they are. These L tones now contain the energy of the clipped used tones. The signal is transformed back into the time domain and again projected onto the clipping hypercube, and the whole process is repeated until both projections generate no changes [12].

3.1.1 The POCS algorithm

The POCS algorithm can be seen graphically in Figure 3_1. It starts by taking IFFT of the data

to get x in the time domain. The algorithm can be described as below [12]:


1. IFFT of data to get x.

2. Clip x. If any element changes value, go to step 3; otherwise return x and terminate.

3. Take the FFT of the clipped x to get the data in the frequency domain, and set the (N-L) used elements back to the original data while keeping the L unused tones as they are.

4. IFFT to get the new x.

5. Go back to step 2.
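The five steps above can be sketched in plain Python as follows. This is a minimal illustration, not the SIMULINK or DRRA implementation of this thesis. The 16-tone example symbol, the choice of reserved tones, the tolerance `tol` and the iteration cap `max_iter` are assumptions added so that the sketch always terminates.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT (length must be a power of two)."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    tw = [cmath.exp(-2j * cmath.pi * k / n) * odd[k] for k in range(n // 2)]
    return [e + t for e, t in zip(even, tw)] + [e - t for e, t in zip(even, tw)]

def ifft(X):
    """Inverse FFT via conjugation: conj(FFT(conj(X))) / N."""
    n = len(X)
    return [v.conjugate() / n for v in fft([u.conjugate() for u in X])]

def clip(x, c):
    """Project onto the hypercube: limit real and imaginary parts to [-c, c]."""
    lim = lambda v: max(-c, min(c, v))
    return [complex(lim(v.real), lim(v.imag)) for v in x]

def excess(x, c):
    """Energy of the part of x that lies outside the clipping hypercube."""
    return sum(abs(a - b) ** 2 for a, b in zip(x, clip(x, c)))

def pocs(data, reserved, c, tol=1e-9, max_iter=500):
    """Steps 1-5 of the list above; tol and max_iter are safeguards of this sketch."""
    x = ifft(data)                                    # step 1
    for iteration in range(max_iter):
        xc = clip(x, c)                               # step 2
        if max(abs(a - b) for a, b in zip(x, xc)) <= tol:
            return x, iteration                       # nothing was clipped
        Xc = fft(xc)                                  # step 3
        X = [Xc[k] if k in reserved else data[k] for k in range(len(data))]
        x = ifft(X)                                   # step 4, then back to step 2
    return x, max_iter

# Assumed example: QPSK-like data on the used tones, zeros on 5 reserved tones.
N = 16
reserved = {6, 7, 8, 9, 10}
qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
data = [0j if k in reserved else qpsk[(3 * k + k // 4) % 4] for k in range(N)]

x0 = ifft(data)
C = 0.8 * max(max(abs(v.real), abs(v.imag)) for v in x0)   # threshold below the peak
x_out, iterations = pocs(data, reserved, C)
X_out = fft(x_out)
print("iterations:", iterations)
print("excess energy before/after:", excess(x0, C), excess(x_out, C))
```

Two properties hold for any input in this sketch: the used tones of the returned signal still carry the original data, and the energy outside the clipping hypercube never grows from one iteration to the next, because each step is a projection onto a convex set.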

As this algorithm relies on FFT/IFFT processing, its complexity is reduced to O(N log N). First, clip x to get xc, where the clip signal c is a sum of impulses, each of amplitude α (the clipping amplitude) located at a clip position d. Then take the FFT of both sides to get X and C:

$$x_c = x + c \tag{3.1}$$

$$\mathrm{FFT:}\quad X_c = X + C$$

Figure 3_1: Decomposition of POCS algorithm [12]

Projecting the clipped signal Xc so that the used sub-carriers are reset to the original data returns X′. This projection leaves the data spectrum X unchanged, whereas projecting the clip spectrum C onto the set B of reserved tones zeros out the used tones, which carry the transmitted data, and gives a new C′, which can be written as:

$$C' = C \ast \mathbf{1}_{S_B} \tag{3.2}$$

where 1_SB is the indicator function of B and ∗ denotes element-wise multiplication. Taking the IFFT gives the signal x′ back in the time domain:

$$x' = \mathrm{IFFT}(X')$$

$$x' = \mathrm{IFFT}(X + C')$$

$$x' = \mathrm{IFFT}(X) + \mathrm{IFFT}(C')$$

$$x' = \mathrm{IFFT}(X) + \mathrm{IFFT}(C \ast \mathbf{1}_{S_B})$$

$$x' = x + c \otimes s \tag{3.3}$$

$$x' = x + \sum_{n=0}^{N} \alpha\, s_{(n-d) \bmod N} \tag{3.4}$$

where ⊗ denotes circular convolution [12] and s = IFFT(1_SB) is the shaping function. Thus, the algorithm can be iterated completely in the time domain by applying scaled and shifted versions of the shaping function to x:

1. IFFT of input data to get x.

2. Check if the amplitude changes at any position when x is clipped. If no changes occur

return x and terminate.

3. For each clip, add the shaping function scaled by the clipping amplitude to x. The shap-

ing function is also circularly shifted so that it is centered about the clipping position.

4. Go back to step 2.
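The equivalence behind the time-domain iteration can be checked numerically: zeroing the used tones of the clip spectrum and transforming back gives the same result as adding scaled, circularly shifted copies of the shaping function. The sketch below uses a naive DFT for brevity. The tone set, clip positions and clip amplitudes are arbitrary assumptions.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) DFT; the inverse includes the 1/N scaling."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

N = 8
reserved = {3, 4, 5}                                   # assumed reserved tones
indicator = [1.0 if k in reserved else 0.0 for k in range(N)]
s = dft(indicator, inverse=True)                       # shaping function

clips = {2: -0.3, 6: 0.1 + 0.2j}                       # clip position d -> amplitude alpha
c = [clips.get(n, 0.0) for n in range(N)]              # clip signal as a sum of impulses

# Frequency-domain route: zero the used tones of C, then transform back.
freq_route = dft([Ck * ind for Ck, ind in zip(dft(c), indicator)], inverse=True)

# Time-domain route: scaled and circularly shifted shaping functions.
time_route = [sum(alpha * s[(n - d) % N] for d, alpha in clips.items())
              for n in range(N)]

max_diff = max(abs(a - b) for a, b in zip(freq_route, time_route))
print("largest difference between the two routes:", max_diff)
```

Since the shaping function depends only on the reserved tone set, it can be computed once and reused for every clip.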

3.1.2 Flow Graph of FPA

The flow graph of the FPA algorithm is shown in Figure 3_2. The first step of the FPA is clipping the input data xs in the time domain to get xc: every sample greater than C is brought down to C. In the next step, the condition of a while loop checks whether any sample value of the input data xs is greater than C. If the condition is false, the algorithm returns xs and terminates the loop; otherwise it enters the loop body. In the loop body, the clipped signal xc is converted into the frequency domain, where the N-L used tones are reset while the L unused tones are left as they are. The new xs is obtained by converting the signal back into the time domain. The new xs is compared against the threshold value C, and the whole process is repeated until the condition is false.

Figure 3_2: Flow graph of FPA

3.2 Implementation of the FPA in SIMULINK

Figure 3_3 shows the FPA block in SIMULINK, which is used to deal with the PAPR problem in OFDM systems. C is the clipping threshold, M denotes the used tones, which carry data, and L denotes the unused tones, which carry no data. The threshold value C is usually chosen slightly above the average value of the OFDM signal.


[Figure: SIMULINK model with a training stage and an FPA block; input xs,in, output xs,out, parameters L, M and C]

Figure 3_3: Fourier Projection Algorithm from SIMULINK

As described above, an OFDM system always has a certain number of unused tones. Therefore, the unused tones are first determined in the training stage. This is done by converting the input data xs into the frequency domain, where it has N elements. It is assumed that L of these tones are unused and can be used for storing the energy of the clipped data. Thus, if these L elements are zeroed out, N-L used elements remain, which carry the original data. The inside of the training stage is shown in Figure 3_4. The vector M is multiplied with the input data Xs in the frequency domain, which zeros out the L elements. Hence, the input data contains only the N-L used elements after the training stage.

Figure 3_4: Inside view of the training stage

The second block, called FPA, is the most important block of the model. It clips the signal in the time domain and then resets the used tones in the frequency domain. A while loop is used in the model to control the peaks of the OFDM signal. The condition of the while loop is shown in Figure 3_5(a), whereas the loop body is shown in Figure 3_5(b). Each of these blocks is explained in detail below.

Figure 3_5: Inside view of the FPA

The Condition block checks whether any sample of the input data xs is greater than the threshold value C. If so, the first input of the AND gate is 1; otherwise it is 0. The other input of the AND gate is driven by a Counter block, which simply counts the number of iterations of the while loop. The internal view of the Counter block is shown in Figure 3_6. The maximum number of iterations can be changed in this block; it is set to a fixed limit, because otherwise the simulation could run forever in some cases. Hence the while loop runs as long as the iteration limit has not been reached and the input data xs exceeds the threshold value C.

Figure 3_6: ‘Counter’ logic


Another block, called Clipping, clips the data in the time domain. Although the SIMULINK library has a built-in block called Saturation that performs the required clipping, we implemented this block ourselves in order to understand how it works. The inside of the Clipping block is shown in Figure 3_7.

Figure 3_7: Clipping block

This block works on the real and imaginary parts separately. First, each input sample is compared against 0 to determine whether it is positive or negative. In the next step, the sample is compared against the threshold value C or -C, depending on its sign.

If the sample value is positive, the next relational operator compares it against the threshold value C. The output is C if the input sample is greater than C; otherwise the output equals the input sample.

If the sample value is negative, it is compared against the threshold -C instead of C. The rest of the process of bringing the value within the threshold limit is the same as for a positive value. Finally, we get the output data in the time domain, clipped to the threshold value C. This whole process is carried out separately for the real and the imaginary part.
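The comparison logic of the Clipping block can be written as a short reference model. This is a sketch of the behavior described above, not the SIMULINK block itself; the function names are hypothetical.

```python
def clip_component(v, c):
    """Clip one real value using the comparison order described above."""
    if v >= 0:                        # sign check against 0
        return c if v > c else v      # positive branch: compare against +C
    return -c if v < -c else v        # negative branch: compare against -C

def clip_sample(z, c):
    """Clip the real and imaginary parts of one complex sample separately."""
    return complex(clip_component(z.real, c), clip_component(z.imag, c))

samples = [1.5 - 0.2j, -2.0 + 3.0j, 0.5 + 0.5j]
clipped = [clip_sample(z, 1.0) for z in samples]
print(clipped)
```

Clipping the two parts separately limits each of them to the interval [-C, C], which corresponds to the hypercube projection used by the algorithm.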

In the body of the while loop, the input data in the time domain is first clipped by the Clipping block if it has peaks greater than the threshold value C. The output of the Clipping block is converted into the frequency domain by taking its FFT. Multiplying the result by the vector L and then adding the original signal resets the used tones while leaving the unused tones as they are. The resulting data contains the original used tones, while the unused tones contain the energy of the clipped data. This new data, containing both the L and M tones, is converted back to the time domain using the IFFT. The resulting signal is the new xs, which is compared against the threshold value in the while condition. This process is repeated until the while condition is false, the target being a time-domain signal that stays within the threshold limit.

3.3 Implementation of Radix-2 FFT and IFFT on the DRRA fabric

As the DRRA fabric is in its early stages, FFT/IFFT operations are not available in the fabric. Since these are essential operations in this thesis, they were implemented first. The implementation of the IFFT is not described separately, because it is the same as the FFT except that the real and imaginary parts are swapped at the input and at the output and the result is divided by the number of samples. The implementation of the Radix-2 FFT on the DRRA fabric is described in this section.
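The swap trick can be verified with a small sketch. It uses the known identity that swapping the real and imaginary parts of the input and of the output of a forward transform, followed by division by N, yields the inverse transform. A naive DFT stands in for the FFT here to keep the example short, and the test vector is arbitrary.

```python
import cmath

def dft(x):
    """Naive forward DFT, standing in for the FFT in this sketch."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def swap(z):
    """Exchange the real and imaginary parts of a complex number."""
    return complex(z.imag, z.real)

def idft_via_dft(X):
    """Inverse transform built only from the forward transform."""
    n = len(X)
    return [swap(v) / n for v in dft([swap(u) for u in X])]

x = [complex(m, (3 * m) % 5 - 2) for m in range(8)]
x_back = idft_via_dft(dft(x))
round_trip_error = max(abs(a - b) for a, b in zip(x, x_back))
print("round-trip error:", round_trip_error)
```

This is why a single FFT datapath is sufficient: only the routing of the real and imaginary parts and a final scaling differ between the two directions.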

The Fast Fourier Transform (FFT) is a method for computing the Discrete Fourier Transform (DFT) of a series of input samples in the time domain. The DFT, from which the Radix-2 FFT butterfly operations are derived, is defined by the equation below [13]:

$$X[k] = \sum_{n=0}^{N-1} x(n)\,W_N^{kn}, \tag{3.5}$$

for k = 0, ..., N-1, where

$$W_N = e^{-j2\pi/N}. \tag{3.6}$$

Splitting the sequence x(n) of the above equation into two sequences of even and odd samples, g1(m) = x(2m) and g2(m) = x(2m+1), each of length N/2, we can write:

$$X[k] = \sum_{m=0}^{N/2-1} x(2m)\,W_N^{k2m} + \sum_{m=0}^{N/2-1} x(2m+1)\,W_N^{k(2m+1)},$$

$$X[k] = \sum_{m=0}^{N/2-1} g_1(m)\,W_N^{k2m} + W_N^{k}\sum_{m=0}^{N/2-1} g_2(m)\,W_N^{k2m}. \tag{3.7}$$

Here, $W_N^{k2m} = e^{-j2km\frac{2\pi}{N}} = e^{-jkm\frac{2\pi}{N/2}} = W_{N/2}^{km}$, so we get:

$$X[k] = \sum_{m=0}^{N/2-1} g_1(m)\,W_{N/2}^{km} + W_N^{k}\sum_{m=0}^{N/2-1} g_2(m)\,W_{N/2}^{km}, \tag{3.8}$$

$$X[k] = G_1[k] + W_N^{k}\,G_2[k], \tag{3.9}$$

where $G_1[k] = \sum_{m=0}^{N/2-1} g_1(m)\,W_{N/2}^{km}$ and $G_2[k] = \sum_{m=0}^{N/2-1} g_2(m)\,W_{N/2}^{km}$ are the N/2-point DFTs of g1(n) and g2(n). G1[k] and G2[k] are periodic with period N/2, so $G_1[k] = G_1[k + N/2]$, $G_2[k] = G_2[k + N/2]$ and $W_N^{k+N/2} = -W_N^{k}$. Thus we can write:

$$X[k] = G_1[k] + W_N^{k}\,G_2[k], \tag{3.10}$$

$$X[k + N/2] = G_1[k] - W_N^{k}\,G_2[k], \tag{3.11}$$

where k = 0, ..., N/2-1.

This single butterfly operation can be seen in Figure 3_8.

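[Figure: butterfly with inputs G1[k] and G2[k], twiddle factor (WN)^k, a multiplier of -1 on the lower branch, and outputs X[k] and X[k+N/2]]

Figure 3_8: Single butterfly operation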
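Equations (3.10) and (3.11) translate directly into a recursive decimation-in-time FFT. The sketch below is a plain software reference, not the DRRA mapping; it is checked against a naive DFT, and the test vector is arbitrary.

```python
import cmath

def dft(x):
    """Naive DFT following equation (3.5), used as a reference."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def fft_radix2(x):
    """Recursive radix-2 FFT built from the butterfly of (3.10) and (3.11)."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    g1 = fft_radix2(x[0::2])          # G1[k]: N/2-point DFT of the even samples
    g2 = fft_radix2(x[1::2])          # G2[k]: N/2-point DFT of the odd samples
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * g2[k]   # W_N^k * G2[k]
        out[k] = g1[k] + t                              # equation (3.10)
        out[k + n // 2] = g1[k] - t                     # equation (3.11)
    return out

x = [complex(m % 3, -m) for m in range(8)]
fft_error = max(abs(a - b) for a, b in zip(fft_radix2(x), dft(x)))
print("difference to the naive DFT:", fft_error)
```

Each level of the recursion performs N/2 butterflies and there are log2(N) levels, which gives the O(N log N) operation count.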

In general, butterfly equations of Radix-2 FFT can be written as [3]:

A = a + W b (3.12)

B = a - W b (3.13)

In the DRRA, we work on the real and imaginary parts separately. Let Dr and Di be the real and imaginary parts of the data, so that a = Dr(k) + i Di(k). Similarly, b = Dr(kh) + i Di(kh), where 'k' is the index for even samples and 'kh' is the index for odd samples. With the twiddle factor written as W = Wr + i Wi, inserting 'a' and 'b' in the above equations gives:

A = (Dr(k) + i Di(k)) + (Wr + i Wi)(Dr(kh) + i Di(kh)),

A = Dr(k) + i Di(k) + Wr Dr(kh) - Wi Di(kh) + i(Wr Di(kh) + Wi Dr(kh)),

A = (Tr0(k) - Wi Di(kh)) + i(Ti0(k) + Wi Dr(kh)), (3.14)

where Tr0(k) = Dr(k) + Wr Dr(kh) and Ti0(k) = Di(k) + Wr Di(kh).

B = (Dr(k) + i Di(k)) - (Wr + i Wi)(Dr(kh) + i Di(kh)),

B = Dr(k) + i Di(k) - Wr Dr(kh) + Wi Di(kh) - i(Wr Di(kh) + Wi Dr(kh)),

B = (Tr1(k) + Wi Di(kh)) + i(Ti1(k) - Wi Dr(kh)), (3.15)

where Tr1(k) = Dr(k) - Wr Dr(kh) and Ti1(k) = Di(k) - Wr Di(kh).
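Equations (3.14) and (3.15) use only real multiplications and additions, which can be checked against ordinary complex arithmetic. The function below is a floating-point reference sketch with hypothetical names; it does not model the 16-bit fixed-point data path of the DPU, and the operand values are arbitrary.

```python
import cmath

def butterfly_real(dr_k, di_k, dr_kh, di_kh, wr, wi):
    """Radix-2 butterfly on separate real and imaginary parts, per (3.14) and (3.15)."""
    tr0 = dr_k + wr * dr_kh
    ti0 = di_k + wr * di_kh
    tr1 = dr_k - wr * dr_kh
    ti1 = di_k - wr * di_kh
    a_re = tr0 - wi * di_kh           # real part of A
    a_im = ti0 + wi * dr_kh           # imaginary part of A
    b_re = tr1 + wi * di_kh           # real part of B
    b_im = ti1 - wi * dr_kh           # imaginary part of B
    return (a_re, a_im), (b_re, b_im)

a = complex(1.0, 2.0)
b = complex(3.0, -1.0)
w = cmath.exp(-2j * cmath.pi / 8)     # example twiddle factor W_8^1

(a_re, a_im), (b_re, b_im) = butterfly_real(a.real, a.imag, b.real, b.imag,
                                            w.real, w.imag)
A_ref, B_ref = a + w * b, a - w * b   # equations (3.12) and (3.13)
print(complex(a_re, a_im), A_ref)
print(complex(b_re, b_im), B_ref)
```

The four products Wr Dr(kh), Wr Di(kh), Wi Dr(kh) and Wi Di(kh) are shared between A and B, so one butterfly needs four real multiplications.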

The implementation of a single butterfly on the DRRA fabric is shown in Figure 3_9 [3]. Two RFiles are used, one for the real and the other for the imaginary part of equations (3.14) and (3.15). A third RFile stores the twiddle factor and a fourth RFile is used as the delay line for Dr(kh) and Di(kh).

An RFile in the current DRRA fabric can store up to 32 words of 16 bits each. Therefore, for a 64-point FFT, we need 2 RFiles for the real and imaginary parts and 2 more RFiles for the twiddle factor and the delay line.

[Figure: four DRRA cells [0, 0], [0, 1], [1, 0] and [1, 1], each with a sequencer, an RFile and a DPU; the RFiles hold Dreal, Dimag, the twiddle factor W and the delay line; the DPUs for the real and imaginary parts compute Tr0(k), Tr1(k), Ti0(k) and Ti1(k) and the outputs ARE, BRE, AIM and BIM]

Figure 3_9: Implementation of Radix-2 FFT single butterfly on DRRA [3]


3.4 Implementation of FPA on the DRRA fabric

The implementation of the FPA on the DRRA fabric is shown in Figure 3_10. Eight DRRA cells are used for implementing the FPA algorithm. RFile[0, 0] and RFile[1, 0] store the real part Xs,r and the imaginary part Xs,i, respectively, of the frequency-domain representation of the original OFDM signal that is to be clipped. Since the real and imaginary parts are processed separately in the DRRA fabric, RFile[0, 1] and RFile[1, 1] are used for the FFT and IFFT operations. The clipping function is implemented in DRRA cell [1, 3]. The clipped signal xc is sent to DRRA cell [0, 1] and DRRA cell [1, 1] for conversion into the frequency domain. The FFT operation generates the real part Xc,r and the imaginary part Xc,i separately. Xc,r and Xc,i are multiplied by the vector L and then added to Xs,r and Xs,i, respectively, to reset the used tones while the unused tones are left as they are. The result is converted back into the time domain to give the new xs. This xs is again compared against the predefined threshold value C, which is stored in RFile[1, 3]. If it is greater than C, the whole process is repeated; otherwise the algorithm terminates and we get the required signal with reduced PAPR.

[Figure: eight DRRA cells [0, 0] to [1, 3], each with a sequencer, an RFile and a DPU; labelled contents: Xs,r, Xs,i, Dreal, Dimag, twiddle factor W, L and Clip]

Figure 3_10: Implementation of FPA on DRRA fabric

Results


4 RESULTS

This chapter describes the simulation results of this thesis. The efficiency of the Fourier Projection Algorithm (FPA) can be measured by the Tone Rate Loss (TRL), the number of iterations required to reduce the Peak-to-Average Power Ratio (PAPR), and the throughput of the system.

The TRL is defined as the ratio of the number of unused tones to the total number of tones:

$$\text{Tone rate loss} = \frac{\text{Number of unused tones}}{\text{Total number of tones}}. \tag{4.1}$$

Throughput is a measure of the number of bits, or samples, per unit time. Our system sends N tones, of which NB are unused, so the sample size is (N−NB)/N. The time taken to process the samples is measured by the number of iterations n. Thus, the throughput of our system is ((N−NB)/N)/n.

Another important factor is the complexity of the system, which is defined as the number of iterations required to reduce the PAPR of the system.

The simulations produce four different outputs: the reduction of the PAPR, the number of iterations versus the number of unused tones, the throughput versus the number of unused tones, and the throughput versus the number of iterations. In the following sections, the simulation results are plotted and discussed in detail.

4.1 Number of iterations versus number of unused tones

In this analysis, a random number of unused tones is selected and the FPA algorithm is applied to count the number of iterations required to reduce the PAPR of the system. Figure 4_1 shows the number of iterations the FPA algorithm needs to reduce the PAPR of the signal for a varying number of unused tones. Note that as the number of unused tones increases, the FPA model requires fewer iterations to reduce the PAPR to the required threshold limit, and vice versa. Thus, increasing the number of unused tones increases the TRL but reduces the number of iterations needed to reduce the PAPR. Figure 4_1 shows that the number of iterations needed for convergence remains fairly constant for higher numbers of unused tones, whereas the complexity of the FPA algorithm increases for lower numbers of unused tones. For example, when the number of unused tones increases from 11 to 20, the number of iterations drops from 487 to 37.

Figure 4_1: Number of iterations versus Number of unused tones

Figure 4_2 shows the effect of different PAPR levels. When the PAPR level is varied from 3 to 4 dB, the number of iterations increases from 487 to 637 for 11 unused tones. For higher numbers of unused tones, however, the number of iterations is nearly the same for the 3 dB and 4 dB PAPR signals.

Figure 4_2: Number of iterations versus Number of unused tones for varying PAPR levels


4.2 Throughput versus number of unused tones

The next plot shows the throughput of the system against the number of unused tones. The FPA is simulated for different numbers of unused tones and the corresponding throughput of the system is calculated as ((N−NB)/N)/n. Figure 4_3 shows that the throughput of the system increases with the TRL. Thus, there is a trade-off between the TRL and the throughput of the system: the higher the TRL, the better the throughput of the FPA algorithm, and vice versa. When the number of unused tones increases from 11 to 20, the throughput of the system increases from 0.0017 to 0.0186.
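The throughput values quoted above can be reproduced with a short calculation. The total number of tones N = 64 is an assumption here: it is not stated in this chapter, but it matches the 64-point FFT mentioned in section 3.3 and reproduces both quoted values. The iteration counts are those reported in section 4.1.

```python
def tone_rate_loss(n_total, n_unused):
    """Equation (4.1): fraction of tones that carry no data."""
    return n_unused / n_total

def throughput(n_total, n_unused, iterations):
    """Fraction of data-carrying tones divided by the number of iterations."""
    return (n_total - n_unused) / n_total / iterations

N = 64  # assumed total number of tones
for n_unused, iterations in ((11, 487), (20, 37)):
    trl = tone_rate_loss(N, n_unused)
    tp = throughput(N, n_unused, iterations)
    print(f"unused tones={n_unused:2d}  TRL={trl:.4f}  throughput={tp:.4f}")
```

The drop in the number of iterations outweighs the loss of data-carrying tones, which is why the throughput measure rises with the TRL in this range.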

Figure 4_3: Throughput versus Number of unused tones

4.3 Throughput versus number of Iterations

Figure 4_4 shows the throughput of the FPA versus the number of iterations. As the number of unused tones increases, the number of iterations needed to reduce the PAPR of the OFDM system decreases and hence the throughput of the system increases. The throughput increases from 0.0017 to 0.0186 as the number of iterations decreases from 487 to 37.


Figure 4_4: Throughput versus Number of iterations

4.4 Discussions

The reduction of the PAPR of OFDM systems using the FPA algorithm has been achieved successfully. The results in this chapter show that the performance of the FPA algorithm depends on the number of unused tones and the PAPR level of the system. The throughput and the complexity of the FPA algorithm can be improved at the cost of the number of unused tones as well as the PAPR of the system. For a higher PAPR level, more unused tones are needed to reduce the PAPR of the OFDM system. However, increasing the number of unused tones reduces the data rate of the system, which is one of the disadvantages of this algorithm.

The FPA method works well in reducing the PAPR, but its computation is complex. It is also not known in advance how many unused tones are needed for clipping control of the signal, so a way to calculate the number of unused tones required for a specific PAPR value is needed. The number of unused tones can vary depending on the PAPR of the system. If fewer unused tones are available than required, the FPA algorithm does not reduce the PAPR to the required limit.

Conclusions and Future Work


5 CONCLUSIONS AND FUTURE WORK

The purpose of this thesis, modeling the Fourier Projection Algorithm (FPA) in the SIMULINK and MATLAB environments and synthesizing the design on the CREST fabric, has been achieved successfully. The simulation results in chapter 4 show that the FPA is a very efficient way of suppressing the Peak-to-Average Power Ratio (PAPR) of Orthogonal Frequency Division Multiplexing (OFDM) systems, but that there is a trade-off between the number of unused tones, the number of iterations and the throughput. The level of PAPR reduction and the time needed to reach the required level depend on the number of unused tones. The FPA algorithm has also been successfully implemented on a CGRA. The effectiveness of the FPA algorithm depends on the original PAPR of the system and the number of unused tones available to reduce it. Further, the FPA algorithm offers good PAPR reduction with a small amount of data loss.

This thesis covers only a prototype implementation of the FPA. The main future work is the full integration of the FPA algorithm into the WLAN transmitter. A further extension concerns the receiver side: the FPA technique reduces the PAPR in the transmitter, and at the receiver end another technique, called Clipping Estimation and Correction (CEC), is used to retrieve the data in its original form. CEC is not implemented in this thesis. In addition, a better estimator for the unused tones can be implemented.


REFERENCES

[1] A. Katz, “Linearization: Reducing distortion in power amplifiers,” IEEE Microwave Magazine, vol. 2, no. 4, pp. 37–49, 2001.

[2] V. Vijayarangan and R. Sukanesh, “An overview of techniques for reducing peak to average power ratio and its selection criteria for orthogonal frequency division multiplexing radio systems,” Journal of Theoretical and Applied Information Technology, vol. 5, no. 1, pp. 25–36, 2009.

[3] M. A. Shami, “Dynamically reconfigurable resource array,” PhD dissertation, KTH Stockholm, 2012.

[4] S. Cioni, G. E. Corazza, M. Neri, and A. Vanelli-Coralli, “On the use of OFDM radio interface for satellite digital multimedia broadcasting systems,” International Journal of Satellite Communications and Networking, vol. 24, no. 2, pp. 153–167, 2006.

[5] K. Pietikäinen, “Orthogonal frequency division multiplexing,” Internet presentation, 2005.

[6] N. Chide, S. Deshmukh, and P. Borole, “Implementation of OFDM system using IFFT and FFT,” International Journal of Engineering Research and Applications (IJERA), vol. 3, no. 1, pp. 2009–2014, 2013.

[7] J. A. Bingham, “Multicarrier modulation for data transmission: An idea whose time has come,” IEEE Communications Magazine, vol. 28, no. 5, pp. 5–14, 1990.

[8] A. Chadha, N. Satam, and B. Ballal, “Orthogonal frequency division multiplexing and its applications,” arXiv preprint arXiv:1309.7334, 2013.

[9] E. Costa, M. Midrio, and S. Pupolin, “Impact of amplifier nonlinearities on OFDM transmission system performance,” IEEE Communications Letters, vol. 3, no. 2, pp. 37–39, 1999.

[10] R. Panda and S. Hauck, “Dynamic communication in a coarse grained reconfigurable array,” in Field-Programmable Custom Computing Machines (FCCM), 2011 IEEE 19th Annual International Symposium on. IEEE, 2011, pp. 25–28.

[11] S. M. Jafri, M. Daneshtalab, A. Hemani, N. Abbas, M. A. Awan, and J. Plosila, “TEA: Timing and energy aware compression architecture for efficient configuration in CGRAs,” Microprocessors and Microsystems, vol. 39, no. 8, pp. 973–986, 2015.

[12] A. Gatherer and M. Polley, “Controlling clipping probability in DMT transmission,” in Signals, Systems & Computers, 1997. Conference Record of the Thirty-First Asilomar Conference on, vol. 1. IEEE, 1997, pp. 578–584.

[13] W. T. Cochran, J. W. Cooley, D. L. Favin, H. D. Helms, R. A. Kaenel, W. W. Lang, G. Maling, D. E. Nelson, C. M. Rader, and P. D. Welch, “What is the fast Fourier transform?” Proceedings of the IEEE, vol. 55, no. 10, pp. 1664–1674, 1967.
