Function Approximation
Fariba Sharifian, Somaye Kafi
Dec 26, 2015
Contents
- Introduction to Counterpropagation
- Full Counterpropagation
  - Architecture
  - Algorithm
  - Application example
- Forward-Only Counterpropagation
  - Architecture
  - Algorithm
  - Application example
Contents
- Function Approximation Using Neural Networks
  - Introduction
  - Development of Neural Network Weight Equations
  - Algebraic Training Algorithms
    - Exact Matching of Function Input-Output Data
    - Approximate Matching of Gradient Data in Algebraic Training
    - Approximate Matching of Function Input-Output Data
    - Exact Matching of Function Gradient Data
Introduction to Counterpropagation
Counterpropagation networks:
- are multilayer networks built from a combination of input, clustering, and output layers;
- can be used to compress data, to approximate functions, or to associate patterns;
- approximate their training input vector pairs by adaptively constructing a lookup table.
Introduction to Counterpropagation (cont.)
Training has two stages:
- clustering
- output weight updating
There are two types:
- full
- forward-only
Full Counterpropagation
Produces an approximation x*:y* based on:
- input of an x vector only,
- input of a y vector only, or
- input of an x:y pair, possibly with distorted or missing elements in either or both vectors.
Full Counterpropagation (cont.)
Phase 1: The units in the cluster layer compete. Only the winning cluster unit is allowed to learn; the weight updates for the winning unit J are (this is standard Kohonen learning):

$$w_{iJ}^{new} = w_{iJ}^{old} + \alpha\,(x_i - w_{iJ}^{old}), \quad i = 1, 2, \ldots, n$$
$$u_{kJ}^{new} = u_{kJ}^{old} + \beta\,(y_k - u_{kJ}^{old}), \quad k = 1, 2, \ldots, m$$
Full Counterpropagation (cont.)
Phase 2: The weights from the winning cluster unit J to the output units are adjusted so that the vector of activations of the Y output layer, y*, approximates the input vector y, and x* approximates the input vector x. The weight updates for the units in the Y output and X output layers are (this is known as Grossberg learning):

$$v_{Jk}^{new} = v_{Jk}^{old} + a\,(y_k - v_{Jk}^{old}), \quad k = 1, 2, \ldots, m$$
$$t_{Ji}^{new} = t_{Ji}^{old} + b\,(x_i - t_{Ji}^{old}), \quad i = 1, 2, \ldots, n$$
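For concreteness, here is a minimal NumPy sketch of both update rules for a single training pair, assuming the winning unit J has already been found (the names w, u, v, t, alpha, beta, a, b follow the slide notation; in actual training, Phase 1 runs to convergence before Phase 2 begins):

```python
import numpy as np

def full_cpn_step(J, x, y, w, u, v, t, alpha, beta, a, b):
    """One update for winning cluster unit J.

    w: (n, p) X-input -> cluster weights;   u: (m, p) Y-input -> cluster weights
    v: (p, m) cluster -> Y* output weights; t: (p, n) cluster -> X* output weights
    """
    # Phase 1 (Kohonen): pull unit J's incoming weights toward the training pair.
    w[:, J] += alpha * (x - w[:, J])
    u[:, J] += beta * (y - u[:, J])
    # Phase 2 (Grossberg): pull unit J's outgoing weights toward the targets.
    v[J, :] += a * (y - v[J, :])
    t[J, :] += b * (x - t[J, :])
```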
Architecture of Full Counterpropagation

[Figure: the X input layer (X1, ..., Xi, ..., Xn) and the Y input layer (Y1, ..., Yk, ..., Ym) both feed the cluster (hidden) layer (Z1, ..., Zj, ..., Zp) through weights w and u; the cluster layer feeds the Y* output layer (Y1*, ..., Yk*, ..., Ym*) through weights v and the X* output layer (X1*, ..., Xi*, ..., Xn*) through weights t.]
Full Counterpropagation Algorithm

Notation:
- x: training input vector, x = (x_1, ..., x_i, ..., x_n)
- y: target output corresponding to input x, y = (y_1, ..., y_k, ..., y_m)
- z_j: activation of cluster layer unit Z_j
- x*: computed approximation to vector x
- y*: computed approximation to vector y
- w_ij: weight from X input layer unit X_i to cluster layer unit Z_j
- u_kj: weight from Y input layer unit Y_k to cluster layer unit Z_j
- v_jk: weight from cluster layer unit Z_j to output layer unit Y_k*
- t_ji: weight from cluster layer unit Z_j to output layer unit X_i*
- alpha, beta: learning rates for weights into the cluster layer (Kohonen learning)
- a, b: learning rates for weights out of the cluster layer (Grossberg learning)
Full Counterpropagation Algorithm (Phase 1)
Step 1. Initialize weights, learning rates, etc.
Step 2. While the stopping condition for Phase 1 is false, do Steps 3-8.
Step 3. For each training input pair x:y, do Steps 4-6.
Step 4. Set X input layer activations to vector x; set Y input layer activations to vector y.
Step 5. Find the winning cluster unit; call its index J.
Step 6. Update the weights for unit Z_J (the Kohonen updates given above).
Step 7. Reduce the learning rates alpha and beta.
Step 8. Test the stopping condition for Phase 1 training.
Full Counterpropagation Algorithm (Phase 2)
Step 9. While the stopping condition for Phase 2 is false, do Steps 10-16.
(Note: alpha and beta are small, constant values during Phase 2.)
Step 10. For each training input pair x:y, do Steps 11-14.
Step 11. Set X input layer activations to vector x; set Y input layer activations to vector y.
Step 12. Find the winning cluster unit; call its index J.
Step 13. Update the weights for unit Z_J (the same Kohonen updates, with the small fixed rates).
Full Counterpropagation Algorithm (Phase 2, cont.)
Step 14. Update the weights from unit Z_J to the output layers (the Grossberg updates given above).
Step 15. Reduce the learning rates a and b.
Step 16. Test the stopping condition for Phase 2 training.
Which cluster unit is the winner?
- Dot product: find the cluster unit with the largest net input,
$$net_j = \sum_i x_i\,w_{ij} + \sum_k y_k\,u_{kj}$$
- Euclidean distance: find the cluster unit with the smallest squared distance from the input,
$$D_j = \sum_i (x_i - w_{ij})^2 + \sum_k (y_k - u_{kj})^2$$
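Both criteria are one-liners in NumPy; a sketch using the weight layout from the earlier snippet:

```python
import numpy as np

def winner_dot(x, y, w, u):
    # Largest net input: net_j = sum_i x_i * w_ij + sum_k y_k * u_kj
    return int(np.argmax(x @ w + y @ u))

def winner_euclidean(x, y, w, u):
    # Smallest squared distance: D_j = sum_i (x_i - w_ij)^2 + sum_k (y_k - u_kj)^2
    D = ((x[:, None] - w) ** 2).sum(axis=0) + ((y[:, None] - u) ** 2).sum(axis=0)
    return int(np.argmin(D))
```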
Full Counterpropagation Application
The application procedure for counterpropagation is as follows:
Step 0. Initialize weights (by training as above).
Step 1. For each input pair x:y, do Steps 2-4.
Step 2. Set X input layer activations to vector x; set Y input layer activations to vector y.
Full Counterpropagation Application (cont.)
Step 3. Find the cluster unit Z_J that is closest to the input pair.
Step 4. Compute the approximations to x and y:
x*_i = t_Ji
y*_k = v_Jk
Full Counterpropagation Example
Function approximation of y = 1/x. After the training phase we have:

Cluster unit   x weight (w)   y weight (u)
z1             0.11           9.0
z2             0.14           7.0
z3             0.20           5.0
z4             0.30           3.3
z5             0.60           1.6
z6             1.60           0.6
z7             3.30           0.3
z8             5.00           0.2
z9             7.00           0.14
z10            9.00           0.11

(After Phase 2 the output weights settle to essentially the same values, t_J1 ≈ w_1J and v_J1 ≈ u_1J, so the table also gives the network outputs.)
Full Counterpropagation Example (cont.)

[Figure: the trained network for y = 1/x, with one X input unit X1, cluster units Z1 through Z10, and output units Y1* and X1*. The weights match the table above, e.g. Z1 stores x weight 0.11 and y weight 9.0, while Z10 stores 9.0 and 0.11.]
Full Counterpropagation Example (cont.)
To approximate the value of y for x = 0.12: since nothing is known about y, compute D from x alone:
D1 = (0.12 - 0.11)^2 = 0.0001
D2 = 0.0004
D3 = 0.0064
D4 = 0.032
D5 = 0.23
D6 = 2.2
D7 = 10.1
D8 = 23.8
D9 = 47.3
D10 = 78.9
Z1 is therefore the winner, and the network returns y* = 9.0 (the true value is 1/0.12 ≈ 8.3).
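The lookup is easy to verify; a short script using the trained weights from the table above:

```python
import numpy as np

w = np.array([0.11, 0.14, 0.20, 0.30, 0.60, 1.60, 3.30, 5.00, 7.00, 9.00])  # x weights
v = np.array([9.0, 7.0, 5.0, 3.3, 1.6, 0.6, 0.3, 0.2, 0.14, 0.11])          # y-side weights

x = 0.12
D = (x - w) ** 2        # squared distances, x part only (y is unknown)
J = int(np.argmin(D))   # index of the winning cluster unit
print(D.round(4))       # [0.0001 0.0004 0.0064 0.0324 0.2304 2.1904 10.1124 23.8144 47.3344 78.8544]
print(f"winner: z{J + 1}, y* = {v[J]}")  # winner: z1, y* = 9.0
```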
Forward-Only Counterpropagation
- A simplified version of full counterpropagation.
- Intended to approximate a function y = f(x) that is not necessarily invertible.
- It may be used when the mapping from x to y is well defined but the mapping from y to x is not.
Forward-Only Counterpropagation Architecture

[Figure: the input layer (X1, ..., Xi, ..., Xn) is connected to the cluster layer (Z1, ..., Zj, ..., Zp) by weights w, and the cluster layer is connected to the output layer (Y1, ..., Yk, ..., Ym) by weights u.]
Forward-Only Counterpropagation Algorithm
Step 1. Initialize weights, learning rates, etc.
Step 2. While the stopping condition for Phase 1 is false, do Steps 3-8.
Step 3. For each training input x, do Steps 4-6.
Step 4. Set X input layer activations to vector x.
Step 5. Find the winning cluster unit; call its index J.
Step 6. Update the weights for unit Z_J:
$$w_{iJ}^{new} = w_{iJ}^{old} + \alpha\,(x_i - w_{iJ}^{old}), \quad i = 1, 2, \ldots, n$$
Step 7. Reduce the learning rate alpha.
Step 8. Test the stopping condition for Phase 1 training.
Step 9. While the stopping condition for Phase 2 is false, do Steps 10-16.
(Note: alpha is a small, constant value during Phase 2.)
Step 10. For each training input pair x:y, do Steps 11-14.
Step 11. Set X input layer activations to vector x; set Y input layer activations to vector y.
Step 12. Find the winning cluster unit; call its index J.
Step 13. Update the weights for unit Z_J (alpha is small):
$$w_{iJ}^{new} = w_{iJ}^{old} + \alpha\,(x_i - w_{iJ}^{old}), \quad i = 1, 2, \ldots, n$$
Step 14. Update the weights from unit Z_J to the output layer:
$$u_{Jk}^{new} = u_{Jk}^{old} + a\,(y_k - u_{Jk}^{old}), \quad k = 1, 2, \ldots, m$$
Step 15. Reduce the learning rate a.
Step 16. Test the stopping condition for Phase 2 training.
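Putting the two phases together, a compact training sketch (the epoch counts, initialization, and learning-rate schedule are illustrative choices, not prescribed by the slides):

```python
import numpy as np

def train_forward_only_cpn(X, Y, p, alpha=0.5, a=0.5, epochs=50, decay=0.95):
    """X: (N, n) training inputs; Y: (N, m) targets; p: number of cluster units."""
    rng = np.random.default_rng(0)
    w = X[rng.choice(len(X), p)].T.astype(float)   # (n, p) input -> cluster weights
    u = np.zeros((p, Y.shape[1]))                  # (p, m) cluster -> output weights
    for _ in range(epochs):                        # Phase 1: cluster the inputs (Kohonen)
        for x in X:
            J = np.argmin(((x[:, None] - w) ** 2).sum(axis=0))
            w[:, J] += alpha * (x - w[:, J])
        alpha *= decay                             # Step 7: reduce the learning rate
    alpha = 0.01                                   # small and constant during Phase 2
    for _ in range(epochs):                        # Phase 2: learn the outputs (Grossberg)
        for x, y in zip(X, Y):
            J = np.argmin(((x[:, None] - w) ** 2).sum(axis=0))
            w[:, J] += alpha * (x - w[:, J])
            u[J] += a * (y - u[J])
        a *= decay                                 # Step 15: reduce the learning rate
    return w, u
```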
Forward-Only Counterpropagation Application
Step 0. Initialize weights (by the training in the previous subsection).
Step 1. Present input vector x.
Step 2. Find the cluster unit J closest to vector x.
Step 3. Set the activations of the output units:
y_k = u_Jk
Forward-Only Counterpropagation Example
Function approximation of y = 1/x. After the training phase we have:

Cluster unit   w      u
z1             0.5    5.5
z2             1.5    0.75
z3             2.5    0.4
...            ...    ...
z10            9.5    0.1
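Using the rows shown above (a sketch with a hypothetical query point x = 2.3; z4 through z9 are omitted because their values are not listed on the slide):

```python
import numpy as np

w = np.array([0.5, 1.5, 2.5, 9.5])    # cluster centers shown on the slide
u = np.array([5.5, 0.75, 0.4, 0.1])   # corresponding output weights

x = 2.3
J = int(np.argmin((x - w) ** 2))      # z3 (center 2.5) is closest to 2.3
print(u[J])                           # 0.4, versus the true value 1/2.3 ≈ 0.43
```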
Function Approximation Using Neural Networks
- Introduction
- Development of Neural Network Weight Equations
- Algebraic Training Algorithms
  - Exact Matching of Function Input-Output Data
  - Approximate Matching of Gradient Data in Algebraic Training
  - Approximate Matching of Function Input-Output Data
  - Exact Matching of Function Gradient Data
Introduction
- Function approximation seeks an analytical description for a set of data.
- This task is also referred to as data modeling or system identification.
Standard Tools
- Splines
- Wavelets
- Neural networks
Why Use Neural Networks?
- Splines and wavelets do not generalize well to spaces of more than three dimensions.
- Neural networks are universal approximators.
- Their parallel architecture can be trained to map multidimensional nonlinear functions.
Why Use Neural Networks? (cont.)
- They are central to the solution of differential equations.
- They provide differentiable, closed-analytic-form solutions.
- They have very good generalization properties and are widely applicable.
- Training translates into a set of nonlinear, transcendental weight equations.
- Their cascade structure combines the nonlinearity of the hidden nodes with linear operations in the input and output layers.
Function Approximation Using Neural Networks
- The functions are not known analytically, but a set of precise input-output samples is available.
- The functions are modeled using an algebraic approach, with two design objectives:
  - exact matching
  - approximate matching
- The networks are feedforward neural networks.
- Data: inputs, outputs, and/or gradient information.
Objective
- Find exact solutions, using sufficient degrees of freedom while retaining good generalization properties.
- Synthesize a large data set with a parsimonious network.
Input-to-Node Values
- The basis of algebraic training: if the inputs of all sigmoidal functions are known, the weight equations become algebraic.
- The input-to-node values are the sigmoidal functions' inputs.
- They determine the saturation level of each sigmoid at a given data point.
Weight Equations Structure
- The weight equations make it possible to analyze and train a nonlinear neural network by means of linear algebra,
- by controlling the distribution of the input-to-node values and
- by controlling the saturation level of the active nodes.
Function Approximation Using Neural Networks
- Introduction
- Development of Neural Network Weight Equations
- Algebraic Training Algorithms
  - Exact Matching of Function Input-Output Data
  - Approximate Matching of Gradient Data in Algebraic Training
  - Approximate Matching of Function Input-Output Data
  - Exact Matching of Function Gradient Data
Development of Neural Network Weight Equations
Objective: approximate a smooth scalar function of q inputs using a feedforward sigmoidal network.
Derivative Information
- Derivative information can improve the network's generalization properties.
- The partial derivatives with respect to the inputs can be incorporated in the training set.
Network Output
- z: network output, computed as a nonlinear transformation of the input
- w: input weights
- p: input vector
- d: input bias
- b: output bias
- v: output weights
- sigma: sigmoid functions (for example, a tanh-type sigmoid)
- n: input-to-node variables
Scalar Output of the Network
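The slide's equation image did not survive extraction; under the notation above, the scalar output takes the standard form (a reconstruction, assuming s hidden nodes and an output bias b):

$$z = \sum_{j=1}^{s} v_j\,\sigma(n_j) + b, \qquad n_j = \sum_{i=1}^{q} w_{ji}\,p_i + d_j .$$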
Exactly Match of the Function’s Outputs
output weighted equation
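A plausible reconstruction of the missing equation: stacking the p training points, with S the p x s matrix of sigmoid values, S_kj = sigma(n_j^(k)), exact matching of the sampled outputs u^(k) = z(p^(k)) gives the linear output weight equation

$$S\,v = u ,$$

with the output bias b absorbed into u or handled by an extra unity column of S.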
Gradient Equations
The derivative of the network output with respect to its inputs.
Exact Matching of the Function’s Derivatives
gradient weight equations
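A reconstruction of the missing relations: differentiating the network output with respect to the inputs gives, at training point k,

$$\left.\frac{\partial z}{\partial \mathbf{p}}\right|_{\mathbf{p}^{(k)}} = \sum_{j=1}^{s} v_j\,\sigma'(n_j^{(k)})\,\mathbf{w}_j = W^{T}\!\left(v \odot \sigma'(n^{(k)})\right),$$

so exact matching of a known gradient c^(k) requires W^T (v ⊙ sigma'(n^(k))) = c^(k) at every training point.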
Input-to-Node Weight Equations
Obtained by rewriting (12).
Four Algebraic Algorithms
A. Exact Matching of Function Input-Output Data
B. Approximate Matching of Gradient Data in Algebraic Training
C. Approximate Matching of Function Input-Output Data
D. Exact Matching of Function Gradient Data
Function Approximation Using Neural Networks
- Introduction
- Development of Neural Network Weight Equations
- Algebraic Training Algorithms
  - Exact Matching of Function Input-Output Data
  - Approximate Matching of Gradient Data in Algebraic Training
  - Approximate Matching of Function Input-Output Data
  - Exact Matching of Function Gradient Data
A. Exact Matching of Function Input-Output Data
- The matrix S of sigmoid values is a known p x s matrix.
- Strategy for producing a well-conditioned S:
  - the input weights w are generated as random numbers from N(0,1), multiplied by a scaling factor L;
  - L is a user-defined scalar chosen so that the input-to-node values do not saturate the sigmoids.
Input Bias
The input bias d is computed to center each sigmoid at one of the training pairs.
Finally, the linear system in (9) is solved for v by inverting S.
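A compact sketch of the whole procedure (a minimal interpretation, with s = p nodes, tanh standing in for the sigmoid, the output bias omitted, and an illustrative condition-number threshold):

```python
import numpy as np

def exact_match_train(P, u, L=1.0, seed=0):
    """P: (p, q) training inputs; u: (p,) sampled outputs. Uses s = p nodes."""
    p, q = P.shape
    rng = np.random.default_rng(seed)
    while True:
        W = L * rng.standard_normal((p, q))   # random N(0,1) input weights scaled by L
        d = -np.einsum('ij,ij->i', W, P)      # center sigmoid i at training point i
        S = np.tanh(P @ W.T + d)              # sigmoid matrix from input-to-node values
        if np.linalg.cond(S) < 1e8:           # if S is ill-conditioned, repeat
            break
    v = np.linalg.solve(S, u)                 # solve the linear system S v = u
    return W, d, v

def evaluate(W, d, v, x):
    return np.tanh(W @ x + d) @ v             # network output for a new input x
```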
If (17) produces an ill-conditioned S, the computation is repeated.
Exact Input-Output-Based Algebraic Algorithm
[Fig. 2(a): flowchart of the exact input-output-based algebraic algorithm.]
Exact Input-Output-Based Algebraic Algorithm with Gradient Information
[Fig. 2(b): the exact input-output-based algebraic algorithm with added p steps for incorporating gradient information.]
Then exact matching of input-output and gradient information can be solved exactly and simultaneously for the neural parameters.
Function Approximation Using Neural Networks
- Introduction
- Development of Neural Network Weight Equations
- Algebraic Training Algorithms
  - Exact Matching of Function Input-Output Data
  - Approximate Matching of Gradient Data in Algebraic Training
  - Approximate Matching of Function Input-Output Data
  - Exact Matching of Function Gradient Data
B. Approximate Matching of Gradient Data in Algebraic Training
- Estimate the output weights and the input-to-node values.
- First solution: use a randomized W; all parameters are then refined by a p-step, node-by-node update algorithm.
Approximate Matching of Gradient Data in Algebraic Training (cont.)
- d and the output weights can then be computed solely from the input-to-node values.
Approximate Matching of Gradient Data in Algebraic Training (cont.)
- At each step, the kth gradient equations are solved for the input weights associated with the ith node.
Approximate Matching of Gradient Data in Algebraic Training (cont.)
- At the end of each step, solve the full system again.
- Terminate when a user-specified gradient tolerance is met.
- Error enters through v and through the input weights; it is adjusted in later steps.
- The basic idea: the ith node's input weights contribute mainly to the kth partial derivatives.
Function Approximation Using Neural Networks
- Introduction
- Development of Neural Network Weight Equations
- Algebraic Training Algorithms
  - Exact Matching of Function Input-Output Data
  - Approximate Matching of Gradient Data in Algebraic Training
  - Approximate Matching of Function Input-Output Data
  - Exact Matching of Function Gradient Data
C.Approximate Matching of Function Input-Output Data
algebraic approach approximate parsimonious network exact sulotion s<p satisfy rank(S|u)= rank(S)= s
example linear system in (9) not square sp inverse relationship between u and v (9) will be overdetermined
Approximate Matching of Function Input-Output Data (cont.)
- A superposition technique combines networks that individually map the nonlinear function over portions of its input space.
- The training set covers the entire input space, which is divided into m subsets.
Approximate Matching of Function Input-Output Data (cont.)
[Fig. 3: superposition of smaller networks into one s-node network.]
Approximate Matching of Function Input-Output Data (cont.)
- The gth neural network approximates its portion of the output vector by its own estimate.
Approximate Matching of Function Input-Output Data (cont.)
- The full network's matrix of input-to-node values has one element per node (ith column) and training point (kth row).
- Its main diagonal blocks are the input-to-node value matrices of the m sub-networks; the off-diagonal blocks are columnwise linearly dependent on the elements of the diagonal blocks.
Approximate Matching of Function Input-Output Data (cont.)
Output weights:
- S is constructed to be of rank s, and the rank of (S|u) is s or s+1,
- so the error introduced during the superposition is zero or small, and it does not increase with m.
Approximate Matching of Function Input-Output Data (cont.)
- The key to developing algebraic training techniques is to construct a matrix S, through the input-to-node values N, that displays the desired characteristics:
- S must be of rank s, and
- s is kept small to produce a parsimonious network.
Function Approximation Using Neural Networks
- Introduction
- Development of Neural Network Weight Equations
- Algebraic Training Algorithms
  - Exact Matching of Function Input-Output Data
  - Approximate Matching of Gradient Data in Algebraic Training
  - Approximate Matching of Function Input-Output Data
  - Exact Matching of Function Gradient Data
D. Exact Matching of Function Gradient Data
- Gradient-based training sets: at every training point, the gradient is known for e of the neural network inputs, denoted by x; the remaining (q - e) inputs are denoted by a.
- Input-output information is also available.
Exact Matching of Function Gradient Data (cont.)
The weight equations comprise:
- the input weights,
- the output weight equation,
- the gradient weight equations, and
- the input-to-node weight equation.
First Linear System (36)
- Obtained by reorganizing all input-to-node values.
- With s = p, the right-hand side is a known ps-dimensional column vector.
- Rewritten in linear form, A is a ps x (q - e + 1)s matrix computed from all a-input vectors.
Second Linear System (34)
- Once the input-to-node values are known, system (34) becomes linear.
- It can always be solved for v provided s = p and S is nonsingular; v can then be treated as a constant.
Third Linear System (35)
- (35) also becomes linear; the unknowns are the x-input weights.
- The gradients in the training set are known, and X is a known ep x es matrix.
Exact Matching of Function Gradient Data (cont.)
- The algorithm's goal is to determine an effective distribution for the elements of N, so that the weight equations can be solved in one step.
- The strategy solved first generates input-to-node values that, with probability 1, produce a well-conditioned S.
Input-to-Node Values
These are substituted in (38).
Input-to-Output Values (cont)
sigmoids are very nearly centered desirable one sigmoid be centered for a given
input prevent ill-conditioning S
same sigmoid should close to saturation for any other known input
need a factor absolute value of the largest element in
Exact Matching of Function Gradient Data (cont)
Function Approximation spring 2006
76
Example: Neural Network Modeling of the Sine Function
- A sigmoidal neural network is trained to approximate the sine function u = sin(y) over the domain 0 ≤ y ≤ π.
- The training set comprises the gradient and output information shown in Table 1: {y_k, u_k, c_k}, k = 1, 2, 3.
- q = e = 1.
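With q = e = 1 and the two-node network discussed below, the model and its derivative take the form (a reconstruction from the notation above; the actual numbers come from Table 1, which is not reproduced here):

$$z(y) = v_1\,\sigma(w_1 y + d_1) + v_2\,\sigma(w_2 y + d_2),$$
$$\frac{\partial z}{\partial y} = v_1\,\sigma'(w_1 y + d_1)\,w_1 + v_2\,\sigma'(w_2 y + d_2)\,w_2,$$

and exact matching requires z(y_k) = u_k and ∂z/∂y(y_k) = c_k for k = 1, 2, 3.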
It is shown that the data is matched exactly by a network with two nodes. Suppose the input-to-node values of the two nodes are chosen so that the weight equations remain consistent.
In this example, the input-to-node matrix N is chosen to make the above weight equations consistent and to meet the assumptions in (57) and (60)-(61). It can easily be shown that this corresponds to computing the elements of N from the corresponding equation.
Conclusion: Algebraic Training vs. Optimization-Based Techniques
Algebraic training offers:
- faster execution speeds,
- better generalization properties,
- reduced computational complexity, and
- a direct correlation between the number of network nodes needed to model a given data set and the desired accuracy of representation.
Function Approximation
Fariba Sharifian, Somaye Kafi