Artificial Neural Networks for RF and Microwave Design: From Theory to Practice Qi-Jun Zhang + Kuldip C. Gupta* and Vijay K. Devabhaktuni + +Department.

Artificial Neural Networks for RF and Microwave Design: From Theory to Practice

Qi-Jun Zhang+, Kuldip C. Gupta* and Vijay K. Devabhaktuni+

+Department of Electronics, Carleton University, Ottawa, ON, Canada

*Department of ECE, University of Colorado, Boulder, CO, USA

• Introduction and overview

• Neural network structures

• Neural network model development process

• RF/Microwave component modeling using neural networks

• High-frequency circuit optimization using neural network models

• Conclusions

Outline

• Accurate RF/Microwave design is crucial for the current upsurge in VLSI, telecommunication and wireless technologies

• Design at microwave frequencies is significantly different from low-frequency and digital designs

• Substantial developments in RF/microwave CAD techniques have been made during the last decade

• Further advances in CAD are needed to address new design challenges, e.g., 3D-EM optimization, CPW and multi-layered circuits, and IC antenna modules

• Fast and accurate models are key to efficient CAD

• Neural network based modeling and design could significantly impact high-frequency CAD

Introduction

A Quick Illustration Example: Neural Network Model for Delay Estimation in a High-Speed Interconnect Network

High-Speed VLSI Interconnect Network

[Figure: interconnect network linking Drivers 1 to 3 and Receivers 1 to 4]

Circuit Representation of the Interconnect Network

[Figure: equivalent circuit labeled with the source (Vp, Tr) and Rs, interconnect lengths L1 to L4, terminations R1, C1 through R4, C4, and nodes 1 to 4]

• A PCB contains a large number of interconnect networks, each with different interconnect lengths, terminations, and topology, which calls for massive analysis of interconnect networks

• During PCB design/optimization, the interconnect networks need to be adjusted in terms of interconnect lengths, receiver-pin load characteristics, etc., which calls for repetitive analysis of interconnect networks

• This necessitates fast and accurate interconnect network models, and a neural network model is a good candidate

Need for a Neural Network Model

Neural Network Model for Delay Analysis

[Figure: neural network with 18 inputs (L1, L2, L3, L4, R1, R2, R3, R4, C1, C2, C3, C4, Rs, Vp, Tr, e1, e2, e3) and 4 outputs, labeled 1 to 4]

Simulation Time for 20,000 Interconnect Configurations

Method                      CPU Time
Circuit Simulator (NILT)    34.43 hours
AWE                         9.56 hours
Neural Network Approach     6.67 minutes
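The speed-up implied by these CPU times can be checked with a few lines of arithmetic. The sketch below only converts the quoted times to minutes and divides; the variable names are illustrative and not part of the original slides.

```python
# CPU times quoted on the slide, converted to minutes.
cpu_minutes = {
    "Circuit Simulator (NILT)": 34.43 * 60,
    "AWE": 9.56 * 60,
    "Neural Network Approach": 6.67,
}

ann_time = cpu_minutes["Neural Network Approach"]

# Speed-up of the neural network approach relative to each method.
speedup = {name: t / ann_time for name, t in cpu_minutes.items()}

for name, factor in speedup.items():
    print(f"{name}: {factor:.0f}x")
```

For these figures the neural network approach comes out roughly 310 times faster than the circuit simulator and roughly 86 times faster than AWE.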

• Neural networks have the ability to model multi-dimensional nonlinear relationships

• Neural models are simple and the model computation is fast

• Neural networks can learn and generalize from available data thus making model development possible even when component formulae are unavailable

• Neural network approach is generic, i.e., the same modeling technique can be re-used for passive/active devices/circuits

• It is easier to update neural models whenever device or component technology changes

Important Features of Neural Networks

• Neural models are efficient alternatives to closed-form expressions, equivalent circuit models and look-up tables

• Neural network models can be developed from measured or simulated data

• Neural models can also be used to update or improve the accuracy of already existing models

• Neural network models have been developed for active devices, passive components and interconnect networks

• These models have been used in circuit simulators for circuit-level simulation, design and optimization

Neural Networks for RF/Microwave Applications: Overview

Neural Network Structures

• A neural network contains
    • neurons (processing elements)
    • connections (links between neurons)

• A neural network structure defines
    • how information is processed inside a neuron
    • how the neurons are connected

• Examples of neural network structures
    • multi-layer perceptrons (MLP)
    • radial basis function (RBF) networks
    • wavelet networks
    • recurrent neural networks
    • knowledge based neural networks

• MLP is the basic and most frequently used structure

Neural Network Structures

MLP Structure

[Figure: MLP with L layers. Layer 1 (input) has N1 neurons fed by x1, x2, x3, ..., xn; Layers 2 to L-1 (hidden) have N2, ..., NL-1 neurons; Layer L (output) has NL neurons giving y1, y2, ..., ym]

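Information Processing in a Neuron

[Figure: neuron $i$ of layer $l$. The inputs $z_0^{l-1}, z_1^{l-1}, z_2^{l-1}, \ldots, z_{N_{l-1}}^{l-1}$ are multiplied by the weights $w_{i0}^{l}, w_{i1}^{l}, w_{i2}^{l}, \ldots, w_{iN_{l-1}}^{l}$ and summed, and the sum is passed through the activation function $\sigma(\cdot)$ to give the neuron output $z_i^{l}$]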

• Input layer neurons simply relay the external inputs to the neural network

• Hidden layer neurons have smooth switch-type activation functions

• Output layer neurons can have simple linear activation functions

Neuron Activation Functions

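Activation Functions for Hidden Neurons

Sigmoid: $\sigma(\gamma) = \dfrac{1}{1 + e^{-\gamma}}$

Arc-tangent: $\sigma(\gamma) = \dfrac{2}{\pi} \arctan(\gamma)$

Hyperbolic-tangent: $\sigma(\gamma) = \dfrac{e^{+\gamma} - e^{-\gamma}}{e^{+\gamma} + e^{-\gamma}}$

[Figure: plots of the three functions; the sigmoid is shown for inputs from -25 to 25, the arc-tangent and hyperbolic-tangent for inputs from -10 to 10]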
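The three hidden-neuron activation functions named on this slide (sigmoid, arc-tangent, hyperbolic-tangent) can be evaluated numerically as in the sketch below. The argument name `gamma` and the function names are assumptions made for illustration.

```python
import numpy as np

def sigmoid(gamma):
    """Logistic sigmoid 1 / (1 + e^-gamma), output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-gamma))

def arctangent(gamma):
    """Scaled arc-tangent (2 / pi) * arctan(gamma), output in (-1, 1)."""
    return (2.0 / np.pi) * np.arctan(gamma)

def hyperbolic_tangent(gamma):
    """Hyperbolic tangent, output in (-1, 1)."""
    return np.tanh(gamma)

gamma = np.array([-5.0, 0.0, 5.0])
print(sigmoid(gamma))
print(arctangent(gamma))
print(hyperbolic_tangent(gamma))
```

All three are smooth and saturate for large positive or negative inputs, which is the "smooth switch-type" behavior expected of hidden neurons.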

MLP Structure

[Figure: the MLP structure repeated, with the layer vectors marked: inputs x, layer outputs z(1), z(2), ..., z(L-1), z(L), and outputs y]

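3 Layer MLP: Feedforward Computation

[Figure: inputs x1, x2, x3; hidden neurons Z1, Z2, Z3, Z4; outputs y1, y2; input-to-hidden weights $W_{ki}$ and hidden-to-output weights $W'_{jk}$]

Hidden Neuron Values: $Z_k = \tanh\left(\sum_i W_{ki}\, x_i\right)$

Outputs: $y_j = \sum_k W'_{jk}\, Z_k$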
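The 3-layer feedforward computation (tanh hidden neurons followed by linear output neurons) can be sketched with NumPy as follows. The sizes match the slide (3 inputs, 4 hidden neurons, 2 outputs), but the weight values are random placeholders and the function name is hypothetical.

```python
import numpy as np

def mlp3_feedforward(x, W, W_prime):
    """3-layer MLP: Z_k = tanh(sum_i W_ki x_i), y_j = sum_k W'_jk Z_k."""
    z = np.tanh(W @ x)   # hidden neuron values, shape (n_hidden,)
    y = W_prime @ z      # linear output neurons, shape (n_outputs,)
    return y, z

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))        # hidden x input weights (illustrative values)
W_prime = rng.normal(size=(2, 4))  # output x hidden weights (illustrative values)
x = np.array([0.1, -0.2, 0.3])

y, z = mlp3_feedforward(x, W, W_prime)
print(z.shape, y.shape)  # (4,) (2,)
```

Because the whole computation is two small matrix products and one element-wise tanh, evaluating a trained model is very cheap compared with a circuit or EM simulation.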

How can ANN represent an arbitrary nonlinear input-output relationship?

Universal Approximation Theorem (Cybenko, 1989; Hornik, Stinchcombe and White, 1989)

In plain words:

Given enough hidden layer neurons, a 3-layer MLP neural network can approximate an arbitrary continuous multidimensional function to any desired accuracy

• The number of hidden neurons depends upon the degree of non-linearity, and dimension of the original problem

• Highly nonlinear problems and high dimensional problems need more neurons while smoother problems and small dimensional problems need fewer neurons

• To determine the number of hidden neurons
    • experience
    • empirical criteria
    • adaptive schemes
    • software tool internal estimation

How many hidden neurons are needed?

Development of Neural Network Models

Notation

y = y(x, w): ANN model

x: inputs of given modeling problem or ANN

y: outputs of given modeling problem or ANN

w: weight parameters in ANN

d: data of y from simulation or measurement

Define Model Input-Output and Generate Data

Define model input-output (x, y), for example,

x: physical/geometrical parameters of the component

y: S-parameters of the component

Generate (x, y) samples (xk, dk), k = 1, 2, …, P, such that the data set sufficiently represents the behavior of the given x-y problem

Types of data generator: simulation and measurement

• Uniform grid distribution

• Non-uniform grid distribution

• Design of Experiments (DOE) methodology
    • central-composite design
    • 2^n factorial design

• Star distribution

• Random distribution

Where Data Should be Sampled

[Figure: sample locations in the input space with axes x1, x2, x3]
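Several of the sampling distributions listed on this slide can be generated in a few lines. The sketch below is one possible reading: in particular, the star distribution is assumed here to mean the center point plus the minimum and maximum along each axis, and all function names are hypothetical. The example bounds reuse the frequency, Win and Gin ranges quoted later for the CPW T-junction.

```python
import itertools
import numpy as np

def uniform_grid(lo, hi, points_per_dim):
    """Full uniform grid: points_per_dim levels along every input."""
    axes = [np.linspace(l, h, points_per_dim) for l, h in zip(lo, hi)]
    return np.array(list(itertools.product(*axes)))

def two_level_factorial(lo, hi):
    """2^n factorial design: every combination of min/max values."""
    return np.array(list(itertools.product(*zip(lo, hi))))

def star(lo, hi):
    """Star distribution (assumed form): center plus min/max on each axis."""
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    center = (lo + hi) / 2.0
    points = [center]
    for i in range(len(lo)):
        for value in (lo[i], hi[i]):
            p = center.copy()
            p[i] = value
            points.append(p)
    return np.array(points)

def random_samples(lo, hi, count, seed=0):
    """Uniformly random samples inside the input box."""
    rng = np.random.default_rng(seed)
    return rng.uniform(lo, hi, size=(count, len(lo)))

lo = [1.0, 20.0, 20.0]     # e.g. frequency (GHz), Win, Gin (micrometers)
hi = [50.0, 120.0, 60.0]
print(len(uniform_grid(lo, hi, 3)), len(two_level_factorial(lo, hi)), len(star(lo, hi)))
```

The counts show the trade-off: a full grid grows exponentially with the number of inputs, while the star distribution grows only linearly.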

Input / Output Scaling

The orders of magnitude of various x and d values in microwave applications can be very different from one another.

Scaling of training data is desirable for efficient neural network training.

The data can be scaled such that the various x (or d) values have similar orders of magnitude.
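One common way to bring inputs to similar orders of magnitude is linear min-max scaling. The slides do not say which scaling is used, so the mapping to [-1, 1] below is an assumption, and the function names and sample values are illustrative.

```python
import numpy as np

def fit_scaler(data):
    """Record the per-column minimum and maximum of the training data."""
    data = np.asarray(data, float)
    return data.min(axis=0), data.max(axis=0)

def scale(data, lo, hi):
    """Map each column linearly so that [lo, hi] becomes [-1, 1]."""
    return 2.0 * (np.asarray(data, float) - lo) / (hi - lo) - 1.0

def unscale(scaled, lo, hi):
    """Inverse of scale(), used to recover physical units."""
    return (np.asarray(scaled, float) + 1.0) * (hi - lo) / 2.0 + lo

# Illustrative inputs: frequency in Hz and a width in meters,
# which differ by roughly fifteen orders of magnitude.
x = np.array([[1e9, 20e-6], [25e9, 70e-6], [50e9, 120e-6]])
lo, hi = fit_scaler(x)
x_scaled = scale(x, lo, hi)
print(x_scaled)
```

The same `lo` and `hi` found on the training data would be reused for validation data, test data and later model evaluation, so that all inputs pass through an identical mapping.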

Training, Validation and Test Data Sets

The overall data should be divided into 3 sets, training, validation and test.

Notation:

Tr - Index set of training data

V - Index set of validation data

Te - Index set of test data

In RF/microwave cases where overall data is limited, validation and test (or training and validation) data can be shared.
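A minimal sketch of dividing the sample indices into the three sets Tr, V and Te. The 60/20/20 proportions and the random shuffling are assumptions for illustration; the slides do not prescribe them.

```python
import numpy as np

def split_indices(num_samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Return disjoint index sets Tr, V, Te that together cover all samples."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(num_samples)
    n_train = int(round(train_frac * num_samples))
    n_val = int(round(val_frac * num_samples))
    Tr = order[:n_train]
    V = order[n_train:n_train + n_val]
    Te = order[n_train + n_val:]
    return Tr, V, Te

Tr, V, Te = split_indices(100)
print(len(Tr), len(V), len(Te))  # 60 20 20
```

When data is scarce, the same function could be called with a zero test fraction remainder and the validation set reused for testing, at the cost of a less independent quality measure.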

Error Definitions

Training error:

$$E_{T_r}(w) = \left[ \frac{1}{\mathrm{size}(T_r)\, m} \sum_{k \in T_r} \sum_{j=1}^{m} \left| \frac{y_j(x_k, w) - d_{jk}}{d_{\max,j} - d_{\min,j}} \right|^{q} \right]^{1/q}$$

Validation and test errors $E_V$ and $E_{T_e}$ can be similarly defined.

Training Objective: Adjust w to minimize $E_V$, but the update of w is carried out using the information $E_{T_r}(w)$ and $\partial E_{T_r}(w)/\partial w$.

At the end of training, the quality of the neural model can be tested using the test error $E_{T_e}$.
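A sketch of how such a normalized error could be computed, assuming it is the q-th norm of the differences between model outputs and data, each divided by the range (maximum minus minimum) of the corresponding output data and averaged over all samples and all m outputs. The function and argument names are illustrative.

```python
import numpy as np

def normalized_error(y, d, d_min, d_max, q=2):
    """Normalized q-th norm error over a set of samples.

    y, d: arrays of shape (num_samples, m) with model outputs and data.
    d_min, d_max: per-output minimum and maximum of the data, shape (m,).
    """
    scaled = np.abs((y - d) / (d_max - d_min)) ** q
    # mean() divides by num_samples * m, matching 1 / (size * m).
    return scaled.mean() ** (1.0 / q)

d = np.array([[0.0, 10.0], [1.0, 20.0], [2.0, 30.0]])
y = d + np.array([0.2, 2.0])   # offset equal to 10% of each output's range
d_min, d_max = d.min(axis=0), d.max(axis=0)

print(normalized_error(y, d, d_min, d_max))  # approximately 0.1
```

Passing only the rows indexed by the training, validation or test set gives the training, validation or test error respectively. The normalization keeps outputs with large numerical ranges from dominating the total.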

• Sample-by-sample (or online) training: ANN weights are updated every time a training sample is presented to the network, i.e., weight update is based on training error from that sample

• Batch-mode (or offline) training: ANN weights are updated after each epoch, i.e., weight update is based on training error from all the samples in training data set

• An epoch is defined as a stage of ANN training that involves presentation of all the samples in the training data set to the neural network once for the purpose of learning

Types of Training
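The difference between the two training modes can be sketched on a deliberately simple model. The example below uses a linear model with a squared error instead of a neural network, purely to keep the code short; the data, step size and function names are assumptions.

```python
import numpy as np

def gradient(w, x, d):
    """Gradient of the squared error 0.5 * (w.x - d)^2 for one sample."""
    return (w @ x - d) * x

def online_epoch(w, X, D, eta):
    """Sample-by-sample training: one weight update per training sample."""
    for x, d in zip(X, D):
        w = w - eta * gradient(w, x, d)
    return w

def batch_epoch(w, X, D, eta):
    """Batch-mode training: one weight update per epoch, from all samples."""
    total = sum(gradient(w, x, d) for x, d in zip(X, D))
    return w - eta * total / len(X)

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
D = np.array([2.0, -1.0, 1.0])    # generated by the weights [2, -1]
w_online = w_batch = np.zeros(2)
for epoch in range(500):          # each pass over X is one epoch
    w_online = online_epoch(w_online, X, D, eta=0.1)
    w_batch = batch_epoch(w_batch, X, D, eta=0.1)
print(w_online, w_batch)
```

Both loops present every sample once per epoch; they differ only in whether the weights change after each sample or once after the whole set.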

Neural Network Training

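The error between the training data and the neural network outputs is fed back to the neural network to guide the internal weight update of the network.

[Figure: input x enters the Neural Network (W), which produces output y; y is compared with the Training Data d, and the difference (the Training Error) is fed back to the network]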

Typical Training Process

Step 1: w = initial guess, set epoch = 0

Step 2: If (EV(epoch) < required_accuracy) or if (epoch > max_epoch) then STOP

Step 3: Compute $E_{T_r}(w)$ (or $E_{T_r}(w)$ and $\partial E_{T_r}(w)/\partial w$) using all samples in training data set (i.e., batch-mode training)

Step 4: Use optimization algorithm to find $\Delta w$ and update the weights $w = w + \Delta w$

Step 5: Set epoch = epoch + 1 and GO TO Step 2
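The five steps can be sketched as a short batch-mode training loop. Several choices below are assumptions made to keep the example small: a 3-layer MLP with tanh hidden neurons, a plain mean squared error in place of the normalized error, fixed-step gradient descent as the optimization algorithm, a constant 1 appended to the input to act as a bias, and a toy target d = x^2.

```python
import numpy as np

def feedforward(w, X):
    """3-layer MLP with tanh hidden neurons and linear outputs."""
    W, W_prime = w
    Z = np.tanh(X @ W.T)           # hidden values, shape (P, n_hidden)
    return Z @ W_prime.T, Z        # outputs, shape (P, m)

def error_and_gradient(w, X, D):
    """Mean squared error and its derivatives w.r.t. both weight matrices."""
    W, W_prime = w
    Y, Z = feedforward(w, X)
    R = (Y - D) / len(X)           # residuals scaled by 1/P
    error = 0.5 * np.sum((Y - D) ** 2) / len(X)
    grad_W_prime = R.T @ Z                             # shape (m, n_hidden)
    grad_W = ((R @ W_prime) * (1.0 - Z ** 2)).T @ X    # shape (n_hidden, n)
    return error, (grad_W, grad_W_prime)

def train(w, X_tr, D_tr, X_v, D_v, eta=0.05,
          required_accuracy=1e-3, max_epoch=2000):
    epoch = 0                                               # Step 1
    while True:
        E_V, _ = error_and_gradient(w, X_v, D_v)
        if E_V < required_accuracy or epoch > max_epoch:    # Step 2
            return w, E_V, epoch
        _, grad = error_and_gradient(w, X_tr, D_tr)         # Step 3 (batch mode)
        w = tuple(wi - eta * gi for wi, gi in zip(w, grad)) # Step 4
        epoch += 1                                          # Step 5

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 40)
X = np.column_stack([x, np.ones_like(x)])   # constant 1 acts as a bias input
D = (x ** 2).reshape(-1, 1)
X_tr, D_tr, X_v, D_v = X[::2], D[::2], X[1::2], D[1::2]

w0 = (0.5 * rng.normal(size=(5, 2)), 0.5 * rng.normal(size=(1, 5)))
E_V_start, _ = error_and_gradient(w0, X_v, D_v)
w, E_V_end, epochs = train(w0, X_tr, D_tr, X_v, D_v)
print(E_V_start, E_V_end, epochs)
```

Note that the stopping test looks only at the validation error, while the weight update uses only the training samples, which mirrors the separation of roles between the two data sets.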

Gradient-based Training Algorithms

$$\Delta w = \eta h$$

where h is the direction of the update of w, and $\eta$ is the step size

Gradient-based methods use information of $E_{T_r}(w)$ and $\partial E_{T_r}(w)/\partial w$ to determine the update direction h

Step size $\eta$ is determined in one of the following ways:

• Small value, either fixed or adaptive during training

• Line minimization to find the best value of $\eta$

Examples of algorithms: backpropagation, conjugate gradient, and quasi-Newton

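Example: Backpropagation (BP) Training (Rumelhart, Hinton, Williams 1986)

In the gradient algorithm,

$$\Delta w = \eta h$$

Let the update direction h be the negative gradient direction, then:

$$w = w - \eta \frac{\partial E_{T_r}(w)}{\partial w}$$

or

$$w = w - \eta \frac{\partial E_{T_r}(w)}{\partial w} + \alpha\, \Delta w \big|_{epoch-1}$$

where $\eta$ is called the learning rate and $\alpha$ is called the momentum factor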
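The weight update with a learning rate and a momentum factor can be sketched as follows. The names `eta` and `alpha`, the function name, and the toy quadratic error surface are assumptions for illustration; in real BP training the gradient would come from backpropagating the training error through the network.

```python
import numpy as np

def bp_update(w, grad, prev_delta, eta=0.1, alpha=0.9):
    """One step: delta = -eta * dE/dw + alpha * (delta of previous epoch)."""
    delta = -eta * grad + alpha * prev_delta
    return w + delta, delta

# Toy error surface E(w) = 0.5 * sum(c * w^2), so dE/dw = c * w.
c = np.array([1.0, 10.0])
w = np.array([1.0, 1.0])
delta = np.zeros_like(w)
for epoch in range(300):
    w, delta = bp_update(w, c * w, delta, eta=0.05, alpha=0.5)
print(w)
```

Setting `alpha=0` recovers the plain negative-gradient update. The momentum term reuses part of the previous step, which tends to speed up progress along directions where successive gradients agree.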

Flow-chart Showing Neural Network Training, Neural Model Testing, and Use of Training, Validation and Test Data Sets in ANN Modeling

[Flow-chart, in order:
START
Select a neural network structure, e.g., MLP
Assign random initial values for all the weight parameters
Perform feedforward computation for all samples in training set
Evaluate training error
Compute derivatives of training error w.r.t. ANN weights
Update neural network weight parameters using a gradient-based algorithm (e.g., BP, quasi-Newton)
Perform feedforward computation for all samples in validation set
Evaluate validation error
Desired accuracy achieved? If No, return to the feedforward computation on the training set; if Yes, STOP Training
Perform feedforward computation for all samples in test set
Evaluate test error as an independent quality measure for ANN model reliability]
Example: EM-ANN Models for CPW Circuit Design and Optimization

Example: CPW Symmetric T-junction

Range of input parameters of CPW T-junction model

Input Parameter    Minimum Value    Maximum Value
Frequency          1 GHz            50 GHz
Win                20 μm            120 μm
Gin                20 μm            60 μm
Wout               20 μm            120 μm
Gout               20 μm            60 μm

Substrate: H = 25 mil, εr = 12.9, tan δ = 0.0005

Strip: tmetal = 3 μm

Wa = 40 μm

[Figure: CPW T-junction Geometry, showing the strip and the dimensions Win, Gin, Wout, Gout and Wa]

Error Comparison Between EM-ANN Model and EM Simulations for the CPW Symmetric T-Junction

                    |S11|      ∠S11     |S13|      ∠S13     |S23|      ∠S23     |S33|      ∠S33
Training Data
  Average Error     0.00150    0.754    0.00071    0.176    0.00084    0.246    0.00106    0.633
  Std. Deviation    0.00128    0.696    0.00058    0.172    0.00097    0.237    0.00109    0.546
Test Data
  Average Error     0.00345    0.782    0.00088    0.141    0.00126    0.177    0.00083    0.838
  Std. Deviation    0.00337    0.674    0.00085    0.125    0.00105    0.129    0.00068    0.717

Example: ANN Based Design of a CPW Folded Double-Stub Filter

• CPW Transmission line

• CPW Bend

• CPW Short-circuit

• CPW Open-circuit

• CPW Step-in-width

• CPW Symmetric T-junction

List of EM-ANN Models Trained from Detailed Electromagnetic (EM) Data

CPW Folded Double-Stub Filter Designed Using EM-ANN Models of Circuit Components

ANN Based Optimization:
    • Goal: Resonant at 26 GHz
    • Optimize lstub and lmid
    • Required 7 overall circuit simulations
    • CPU-Time: 3 minutes

Simulation of the optimized filter circuit using EM-ANN models:
    • 30 seconds
    • 100 frequency points

Full-wave EM simulation of the optimized filter circuit:
    • 14 hours
    • 17 frequency points

[Figure: CPW folded double-stub filter geometry]

[Figure: Comparison of Optimized Circuit Responses From EM-ANN Based Simulations and Full-Wave EM Simulations]

Example: FET Modeling Using Neural Networks

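[Figure: MESFET structure showing Source, Gate and Drain, labeled with L, W, a and Nd]

Neural Model for MESFET Modeling

[Figure: neural network with inputs L, W, a, Nd, Vgs, Vds and outputs Id, qg, qd, qs]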

FET I-V curves: Neural model vs FET test data

[Figure: Id (mA) versus Vd (V) for Vg = 0 V, -1 V, -2 V, -3 V, -4 V, -5 V]

FET S-parameters: Neural model vs FET test data

[Figure: S-parameters S11, S12, S21, S22 (dB) versus Frequency (GHz)]

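Incorporating Large-Signal FET Neural Network Model into HB Circuit Simulator

[Figure: inside the SIMULATOR, the Harmonic Balance Equation Solver connects a Linear subnet and a Nonlinear subnet. The Nonlinear subnet is represented by the Neural Model: the voltages V are its inputs x, and the currents and charges I, Q are its outputs y]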

Example: Yield Optimization of a 3-Stage MMIC Amplifier Using Neural Network Models

Three-Stage X-Band Amplifier Circuit With 3 FETs Represented By ANN Models

[Figure: amplifier schematic from Input to Output, with gate biases Vg1, Vg2, Vg3 and drain biases Vd1, Vd2, Vd3]

Distributions for Statistical Variables

Variable      Mean      Deviation (%)
Nd (1/m^3)    10^23     7.0
L (μm)        1.0       3.5
a (μm)        0.3       3.5
W (μm)        300       2.0
WL (μm)       20        3.0
SL (μm)       10        3.0
d (μm)        0.1       4.0
SC1 (μm^2)    326.8     3.5
SC2 (μm^2)    2022.4    3.5
SC3 (μm^2)    218.2     3.5
SC4 (μm^2)    352.2     3.5

3-Stage Amplifier: Before and After ANN Based Yield Optimization (500 Monte Carlo Simulations are Shown)
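A rough sketch of how a Monte Carlo yield estimate of this kind could be set up. Everything specific here is an assumption: the statistical variables are drawn from Gaussian distributions with the listed deviation treated as a standard deviation in percent of the mean, and `toy_model` and `toy_spec` are made-up stand-ins for the neural-model-based circuit simulation and the amplifier specifications, which the slides do not give.

```python
import numpy as np

def monte_carlo_yield(model, spec, means, deviations_pct, trials=500, seed=0):
    """Fraction of randomly perturbed designs whose response meets the spec."""
    rng = np.random.default_rng(seed)
    means = np.asarray(means, float)
    sigmas = means * np.asarray(deviations_pct, float) / 100.0
    samples = rng.normal(means, sigmas, size=(trials, len(means)))
    passed = [spec(model(x)) for x in samples]
    return float(np.mean(passed))

# Means and deviations for L, a and W (micrometers), as listed on the slide.
means = [1.0, 0.3, 300.0]
deviations_pct = [3.5, 3.5, 2.0]

def toy_model(x):
    """Hypothetical response; not a physical formula."""
    return x[2] / (x[0] * x[1])

def toy_spec(response):
    """Hypothetical acceptance window around the nominal response of 1000."""
    return 950.0 <= response <= 1050.0

yield_estimate = monte_carlo_yield(toy_model, toy_spec, means, deviations_pct)
print(yield_estimate)
```

Yield optimization would repeat this estimate while adjusting the nominal design values, which is practical only when each model evaluation is fast, as it is with neural models.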

• ANNs can be trained to learn RF/Microwave data and the resulting ANN model is fast and can accurately represent the corresponding input-output relationship

• ANN development is a computer-based training process as opposed to human-based trial-and-error process in developing empirical/equivalent circuit models

• Neural network modeling approach is generic, i.e., the same ANN method can be used to model passive or active components, for devices or circuits

• Neural models can be used in place of CPU-intensive detailed models to enhance high-level CAD operations such as circuit simulation and yield optimization

• Neural network based modeling and design promises to address ever increasing demand for efficient CAD

Conclusions

• Advanced methods of combining neural networks with RF/Microwave knowledge for efficient modeling

• Towards full automation of model generation process using neural networks featuring online data generation and ANN training

• Modeling nonlinear dynamic behaviors of devices and circuits employing neural network techniques

• 3D-EM modeling of passive components using ANNs

• Combining neural networks with other advanced CAD concepts such as the space mapping (SM) technique

• Application of artificial neural networks for RF and microwave measurements

• Knowledge Aided Design (KAD) of microwave circuits exploiting neural networks

Emerging Trends

A HyperLink to NeuroModeler

NeuroModeler is software for developing neural network models of passive and active components/circuits for high-frequency circuit design.

Here is a Web demonstration version with which you can train an MLP neural model, test it with test data, and see the neural model reproduce the input-output relationship it learnt.

Basic Steps of Running NeuroModeler

The demonstration version is self-explanatory. The basic steps are:

• Press the “New Neural Model” button to define the neural model structure and the number of input, output and hidden neurons

• Press the “Train Neural Model” button to train the neural model

• Press the “Test Neural Model” button to test the quality of the model

• Press the “Display Model Input-Output” button to see the input-output relationship reproduced by the neural model

Start NeuroModeler

Make sure your computer is connected to the Internet.

Now you can click NeuroModeler here to start the Web demonstration version of the program.
