Software and Hardware Implementation of Cellular Automata for Structural Analysis and Design Zafer Gürdal * & Mark T. Jones ** Virginia Tech * Depts. of Aerospace and Ocean Eng., & Engineering Science and Mechanics ** The Bradley Department of Electrical and Computer Engineering 06/17/03 National Institute of Aerospace, Hampton VA Support NASA LaRC, NRA 98, Innovative Algorithms for Aerospace Engineering Analysis and Optimization, PM: Jarek Sobieski NASA LaRC, Mechanics and Durability Branch, PM: Damodar Ambur Virginia Tech, ASPIRES Program
Software and Hardware Implementation of Cellular Automata for Structural Analysis and Design
Zafer Gürdal* & Mark T. Jones**
Virginia Tech
* Depts. of Aerospace and Ocean Eng., & Engineering Science and Mechanics
** The Bradley Department of Electrical and Computer Engineering
06/17/03 National Institute of Aerospace, Hampton VA
Support
NASA LaRC, NRA 98, Innovative Algorithms for Aerospace Engineering Analysis and Optimization, PM: Jarek Sobieski
NASA LaRC, Mechanics and Durability Branch, PM: Damodar Ambur
Virginia Tech, ASPIRES Program
CA Software Hardware Implementation
June 17, 2003
Outline
Introduction
– Evolutionary Design
– Elements of Cellular Automata
CA applied to Engineering Design
– Truss Domain
– Composite Laminate Design
Hardware Implementation
– Configurable Computing – FPGAs
– CA Implementation
– Results
Multigrid Acceleration
Evolutionary Design
Mimic natural evolution of biological systems for structural design
Evolutionary design often relies on local optimality/decision making of independent parts
Examples:
– Reaction wood
– Bone growth
Cellular Automata: Decomposition of a seemingly complex macro behavior into basic small local problems
Idealizations of complex natural systems
– Flock behavior
– Diffusion of gaseous systems
– Solidification and crystal growth
– Hydrodynamic flow and turbulence
General characteristics
– Locality
– Vast Parallelism
– Simplicity
– Matching engine executes on one FPGA (XC2V1000)
– Performs 200 billion cell updates per second
– 1,200 billion operations per second (1.2 TOPS)
BYU - Network Intrusion Detection Systems (2002)
– Hardware implementation uses one FPGA (XC2V1000)
– Outperformed software version running on P3 – 750MHz:
  • Up to 400 times more throughput than software version
  • Up to 1000 times less latency than software version
Xilinx – High Performance DES Encryption (2000)
– Implemented on one small FPGA (XCV150)
– Maximum throughput 10.75 GB/sec
– Outperformed best ASIC implementation
University of Texas at Austin – Target Recognition System (2000)
– System built using one FPGA (ORCA 40k) and Myrinet interfacing
– Capable of processing 900 templates per second
– 2,800 billion operations per second (2.8 TOPS)
Iterative Methods for Linear Systems
Consider Jacobi’s method
– $D x_{i+1} = (D - A) x_i + b$
– In software, we would select either single or double precision floating point
On a configurable computer we can select any format in which to store/compute values
– Choose the desired precision of the solution
– Reconstruct the method for fast computation
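The Jacobi update above can be sketched in a few lines. This is only an illustration in ordinary double precision; the function name `jacobi`, the stopping rule, and the default tolerances are assumptions, not taken from the slides.

```python
import numpy as np

def jacobi(A, b, x0=None, tol=1e-10, max_iter=10_000):
    """Jacobi iteration: D x_{i+1} = (D - A) x_i + b (hypothetical helper)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    d = np.diag(A)                          # diagonal of D
    x = np.zeros_like(b) if x0 is None else np.asarray(x0, dtype=float)
    for it in range(1, max_iter + 1):
        # (D - A) x_i + b, then divide by the diagonal to apply D^{-1}
        x_new = (d * x - A @ x + b) / d
        if np.max(np.abs(x_new - x)) < tol:  # assumed stopping rule
            return x_new, it
        x = x_new
    return x, max_iter

# Usage on a small diagonally dominant system
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x, iterations = jacobi(A, b)
```

Jacobi converges here because the matrix is strictly diagonally dominant; for general matrices convergence is not guaranteed.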
Iterative Methods Continued
Re-cast as iterative improvement scheme
• $r_i = b - A x_i$ (compute in n bits)
• $\Delta x_i = A^{-1} r_i$ (compute in k bits)
• $x_{i+1} = x_i + \Delta x_i = x_i + A^{-1} r_i$ (compute in n bits)
Use Jacobi to solve for $\Delta x_i$ in compact, fast k-bit hardware (cost ~ bits$^2$)
Thm: Convergence rate is independent of k
Thm: Optimal choice of $k \sim n / (\#\ \text{iterations})^{1/3}$
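One way to read the scheme above is sketched below. It is not the authors' hardware design: k-bit arithmetic is emulated by rounding to k significant decimal digits, float64 stands in for the n-bit format, and the names `round_sig` and `refine_limited_precision`, the residual normalization, and the single inner Jacobi sweep are all assumptions made for illustration.

```python
import numpy as np

def round_sig(v, k):
    """Round each entry of a 1-D array to k significant decimal digits
    (a software stand-in for a k-digit hardware format)."""
    v = np.asarray(v, dtype=float)
    out = np.zeros_like(v)
    nz = v != 0
    mag = np.floor(np.log10(np.abs(v[nz])))
    factor = 10.0 ** (k - 1 - mag)
    out[nz] = np.round(v[nz] * factor) / factor
    return out

def refine_limited_precision(A, b, k=3, inner_sweeps=1, tol=1e-10, max_iter=10_000):
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    d = np.diag(A)
    off = A - np.diag(d)                     # off-diagonal part of A
    x = np.zeros_like(b)
    for it in range(1, max_iter + 1):
        r = b - A @ x                        # residual in "n bits" (float64)
        scale = np.max(np.abs(r))
        if scale < tol:
            return x, it - 1
        rk = round_sig(r / scale, k)         # normalized residual handed to the k-digit unit
        dx = np.zeros_like(b)
        for _ in range(inner_sweeps):        # Jacobi sweeps for A dx = r, rounded to k digits
            dx = round_sig((rk - round_sig(off @ dx, k)) / d, k)
        x = x + scale * dx                   # update in "n bits"
    return x, max_iter

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x3, it3 = refine_limited_precision(A, b, k=3)
x9, it9 = refine_limited_precision(A, b, k=9)
```

Comparing `it3` and `it9` on a given system gives a quick empirical check of how weakly the iteration count depends on k in this emulation.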
Convergence
Solution Error vs. Number of Iterations
k = 3, 6, 9 decimal digits
No difference in convergence rate
Performance Advantage
Execution Cost (number of bit operations) vs. the size of the matrix
Compares cost of normal vs. modified algorithm
Convergence for each algorithm is identical
Euler Beam Formulation
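[Figure: Cell Neighborhood. Beam along x under load F with design variable d(x); cells L, C, and R at spacing h with degrees of freedom (wL, θL), (wC, θC), (wR, θR).]
[Figure: Control Volume. Forces and moments FL, ML, FC, MC, FR, MR acting on the cell.]
Cell Equilibrium
$K_C u_C = f_C + f_g$
where $u_C$ holds the cell degrees of freedom ($w_C$, $\theta_C$), $f_C$ holds the applied cell loads ($F_C$, $M_C$), $K_C$ is a 2×2 cell stiffness assembled from $EI^*_L$, $EI^*_R$, and $h$ (entries of magnitude 12, 6, 6, 4 scaled by $EI^*/h^3$), and $f_g$ collects the contributions of the neighbor values ($w_L$, $h\theta_L$) and ($w_R$, $h\theta_R$), weighted by $EI^*_L/h^3$ and $EI^*_R/h^3$ through 2×2 matrices with entries of magnitude 12, 6, 6, 2.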
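For reference, a cell equilibrium of this form follows from assembling two standard two-node Euler-Bernoulli (Hermite cubic) beam elements of length h at the center node. The expressions below are a sketch under assumptions the slide does not confirm: rotation defined as θ = dw/dx, scaled unknowns u = {w, hθ}, scaled loads f = {F, M/h}, and EI*_L, EI*_R read as the bending stiffnesses of the elements to the left and right of the cell.

```latex
K_C u_C = f_C + f_g, \qquad
u_C = \begin{Bmatrix} w_C \\ h\,\theta_C \end{Bmatrix}, \quad
f_C = \begin{Bmatrix} F_C \\ M_C / h \end{Bmatrix}

K_C = \frac{1}{h^3}
\begin{bmatrix}
12\,(EI^*_L + EI^*_R) & 6\,(EI^*_R - EI^*_L) \\
6\,(EI^*_R - EI^*_L)  & 4\,(EI^*_L + EI^*_R)
\end{bmatrix}

f_g = \frac{EI^*_L}{h^3}
\begin{bmatrix} 12 & 6 \\ -6 & -2 \end{bmatrix}
\begin{Bmatrix} w_L \\ h\,\theta_L \end{Bmatrix}
+ \frac{EI^*_R}{h^3}
\begin{bmatrix} 12 & -6 \\ 6 & -2 \end{bmatrix}
\begin{Bmatrix} w_R \\ h\,\theta_R \end{Bmatrix}
```

With a different sign convention for θ or M, the off-diagonal signs change while the magnitudes stay the same.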
Cellular Automata Model
Multiple Cells per Processing Element
Beam Design
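Equilibrium Update
– residual: $r = f_g + f_C - K_C u_C$
– error: $e = K_C^{-1} r$
– correction: $u_C^{k+1} = u_C^k + e$
Design Update
– [Equation: update of the design variable d in terms of M, E, α, and γ]
[Flowchart: Equilibrium Update -> Converged? (NO: repeat Equilibrium Update; YES: continue) -> Design Update -> Converged? (NO: return to Equilibrium Update; YES: End)]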
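The residual / error / correction update can be tried on a small cantilever. The sketch below is an assumption-laden illustration, not the authors' implementation: it uses standard Hermite beam elements with uniform EI, scaled unknowns (w, hθ), θ = dw/dx, and an in-place (Gauss-Seidel style) sweep over the cells. The design update is omitted, and the function name `solve_cantilever_ca` and its parameters are hypothetical.

```python
import numpy as np

def solve_cantilever_ca(n_cells=4, L=1.0, EI=1.0, F=1.0, tol=1e-12, max_sweeps=200_000):
    """Sweep local equilibrium updates over the cells of a clamped-free beam
    with a tip load F. Returns nodal deflections and the number of sweeps."""
    h = L / n_cells
    c = EI / h**3
    A_L = c * np.array([[12.0, -6.0], [-6.0, 4.0]])   # left element's share of K_C
    A_R = c * np.array([[12.0, 6.0], [6.0, 4.0]])     # right element's share of K_C
    G_L = c * np.array([[12.0, 6.0], [-6.0, -2.0]])   # maps left-neighbor values into f_g
    G_R = c * np.array([[12.0, -6.0], [6.0, -2.0]])   # maps right-neighbor values into f_g
    K_inner_inv = np.linalg.inv(A_L + A_R)             # interior cells see both elements
    K_tip_inv = np.linalg.inv(A_L)                     # tip cell has no right element
    u = np.zeros((n_cells + 1, 2))                     # rows: (w, h*theta); cell 0 is clamped
    f = np.zeros((n_cells + 1, 2))
    f[-1, 0] = F
    for sweep in range(1, max_sweeps + 1):
        worst = 0.0
        for i in range(1, n_cells + 1):
            tip = i == n_cells
            K_C = A_L if tip else A_L + A_R
            f_g = G_L @ u[i - 1] if tip else G_L @ u[i - 1] + G_R @ u[i + 1]
            r = f_g + f[i] - K_C @ u[i]                    # residual
            e = (K_tip_inv if tip else K_inner_inv) @ r    # error
            u[i] += e                                      # correction
            worst = max(worst, float(np.max(np.abs(e))))
        if worst < tol:
            break
    return u[:, 0], sweep

w, sweeps = solve_cantilever_ca()
tip_deflection = w[-1]   # analytical value for comparison: F L^3 / (3 EI)
```

Each cell only reads its immediate neighbors, which is the locality the CA formulation relies on. The large sweep count on even a few cells shows why acceleration is of interest.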
Algorithm Strategy
The limited precision algorithm illustrated for Jacobi’s method earlier is applied to CA
– Much smaller, faster circuits for applying CA rule updates in k-bit operations
– Built-in 18x18 multipliers compute residual
Built-in high-speed memories provide
– Storage for intermediate and permanent quantities
– Many customizable word-lengths
– Extremely high memory bandwidth
Processing Element
FPGA Performance
[Figure: cell updates per second (millions, 0 to 1600) vs. number of processing elements (9 to 20), for the 8-bit model and the 16-bit model]
CA Performance
[Figure: total number of cell updates (0 to 2×10^7) vs. number of cells (10 to 45)]
Multigrid Acceleration
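[Figure: beam along x under load F, discretized on lattice h, lattice 2h, lattice 4h, and lattice 8h]
E : Equilibrium update to convergence
S : Equilibrium updated α times
Restriction (on r): transfer from a finer lattice to the next coarser one
Prolongation (on e): transfer from a coarser lattice to the next finer one
V-cycle: S (h) -> S (2h) -> S (4h) -> E (8h) -> S (4h) -> S (2h) -> S (h)
W-cycle: S (h) -> S (2h) -> S (4h) -> E (8h) -> S (4h) -> E (8h) -> S (4h) -> S (2h) -> S (4h) -> E (8h) -> S (4h) -> E (8h) -> S (4h) -> S (2h) -> S (h)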
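The V-cycle schedule can be illustrated on a simpler stand-in problem. The sketch below swaps the beam for a 1D Poisson model problem (-u'' = f with zero boundary values), uses weighted Jacobi for the S steps and a direct solve in place of the E step, and transfers with full weighting and linear interpolation. All of these choices and all function names are assumptions made for illustration, not the presentation's beam operators.

```python
import numpy as np

def residual(u, f, h):
    r = np.zeros_like(u)
    r[1:-1] = f[1:-1] - (2.0 * u[1:-1] - u[:-2] - u[2:]) / h**2
    return r

def smooth(u, f, h, alpha, omega=2.0 / 3.0):
    """S: alpha weighted-Jacobi sweeps (updates u in place)."""
    for _ in range(alpha):
        u_jacobi = 0.5 * (u[:-2] + u[2:] + h**2 * f[1:-1])
        u[1:-1] += omega * (u_jacobi - u[1:-1])
    return u

def solve_coarsest(f, h):
    """E: stands in for 'update to convergence' on the coarsest lattice."""
    m = f.size - 2
    A = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h**2
    u = np.zeros_like(f)
    u[1:-1] = np.linalg.solve(A, f[1:-1])
    return u

def restrict(r):
    """Full-weighting restriction of the residual to the 2h lattice."""
    rc = np.zeros((r.size - 1) // 2 + 1)
    rc[1:-1] = 0.25 * r[1:-2:2] + 0.5 * r[2:-1:2] + 0.25 * r[3::2]
    return rc

def prolong(ec):
    """Linear-interpolation prolongation of the error to the h lattice."""
    ef = np.zeros(2 * (ec.size - 1) + 1)
    ef[::2] = ec
    ef[1::2] = 0.5 * (ec[:-1] + ec[1:])
    return ef

def v_cycle(u, f, h, alpha=2, levels=4):
    if levels == 1:
        return solve_coarsest(f, h)
    u = smooth(u, f, h, alpha)
    rc = restrict(residual(u, f, h))
    ec = v_cycle(np.zeros_like(rc), rc, 2.0 * h, alpha, levels - 1)
    u += prolong(ec)
    return smooth(u, f, h, alpha)

n = 64                                   # lattices h, 2h, 4h, 8h
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)
f = np.pi**2 * np.sin(np.pi * x)
u = np.zeros(n + 1)
for _ in range(12):
    u = v_cycle(u, f, h)
```

A W-cycle would visit the coarser lattice twice per level, which amounts to calling the recursive step twice before prolongating.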
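Prolongation
– Coincident nodes: the fine-lattice values are taken from the coarse-lattice node at the same location, with weights [1, 0] for the displacement and [0, 1/2] for the scaled rotation.
– Intermediate nodes: the fine-lattice displacement combines the two neighboring coarse-lattice nodes with weights of magnitude 1/2 (on w) and 1/8 (on the scaled rotation); the fine-lattice scaled rotation uses weights of magnitude 3/4 (on w) and 1/8 (on the scaled rotation).
[Figure: lattice 2h nodes i and i+1 above lattice h nodes 2i+1 and 2i+2, each labeled with its displacement w and scaled rotation]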
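Prolongation/Restriction
Prolongation Operator $I_{2h}^{h}$
[Equation: 6×2 block of $I_{2h}^{h}$ with rows of magnitude (1/2, 1/8), (3/4, 1/8), (1, 0), (0, 1/2), (1/2, 1/8), (3/4, 1/8)]
Correction Prolongation: $e^{h} = I_{2h}^{h}\, e^{2h}$
Residual Restriction: $r^{2h} = I_{h}^{2h}\, r^{h}$, where $I_{h}^{2h} = \left(I_{2h}^{h}\right)^{T}$
[Figure: lattice 2h node above the neighboring lattice h nodes, each labeled with its displacement w and scaled rotation]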
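The weights 1/2, 1/8, 3/4, and 1/8 are the values that Hermite cubic interpolation gives at the midpoint of a coarse interval. The sketch below builds a prolongation matrix on that basis. Its assumptions are not confirmed by the slide: θ = dw/dx, coarse unknowns stored as (w, 2hθ), fine unknowns as (w, hθ), and nodes numbered from 0 so that coarse node i coincides with fine node 2i. The function name `prolongation` is hypothetical.

```python
import numpy as np

def prolongation(n_coarse):
    """Matrix mapping coarse values (w, 2h*theta) at n_coarse nodes to fine
    values (w, h*theta) at 2*n_coarse - 1 nodes via Hermite cubic interpolation."""
    n_fine = 2 * n_coarse - 1
    P = np.zeros((2 * n_fine, 2 * n_coarse))
    for i in range(n_coarse):
        # Coincident fine node 2i: copy w, halve the scaled rotation
        P[4 * i, 2 * i] = 1.0
        P[4 * i + 1, 2 * i + 1] = 0.5
        if i + 1 < n_coarse:
            # Intermediate fine node 2i+1 between coarse nodes i and i+1
            cols = slice(2 * i, 2 * i + 4)
            P[4 * i + 2, cols] = [0.5, 0.125, 0.5, -0.125]      # w
            P[4 * i + 3, cols] = [-0.75, -0.125, 0.75, -0.125]  # h*theta
    return P

P = prolongation(3)   # prolongs corrections: e_h = P @ e_2h
R = P.T               # restricts residuals:  r_2h = R @ r_h
```

Because Hermite interpolation reproduces cubic polynomials exactly, prolongating the nodal values of any cubic deflection field recovers its fine-lattice values, which gives a convenient correctness check.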
Nested Iteration for MG accelerated CA
[Figure: five panels of A/Ao vs. x/L (both axes 0 to 1) showing the design d(x) on successively finer lattices: Design with 3 Cells, Design with 5 Cells, Design with 17 Cells, Design with 65 Cells, Design with 257 Cells]
CA Design Performance with Full MG
[Figure: total number of cell updates (10^0 to 10^8, log scale) vs. number of cells (1 to 1000, log scale)]
Concluding Remarks
Summary
– CA paradigm has been demonstrated for various structural systems
– CA paradigm matches well with Configurable Computing acceleration
– Full Multigrid acceleration for CA improves design convergence
Future Work
– Expand the design capabilities in terms of structural details and the types of field problems that can be solved
– Tools that will enable engineers to effortlessly use configurable computers for CA applications
– Continue to investigate algorithms to improve CA performance