Slide 1
Derivatives of static response from linear finite element analysis
Local search algorithms benefit from derivatives even when they are calculated by finite differences
Often derivatives can be calculated at a fraction of the cost of finite-difference derivatives
Goal of today's lecture is to show why this is usually true for static response
Key concepts will be the direct and adjoint variants of derivative calculation
Also there are some numerical pitfalls
Finite difference derivatives cost approximately one additional
analysis per design variable. Structural optimization is often done
with thousands of design variables, so each time we calculate
derivatives by finite differences we will need to perform thousands
of analyses. Fortunately, there is a shortcut that allows us to
calculate derivatives at a fraction of the cost, which is the topic
of today's lecture.
It turns out that there are two ways to reduce the cost of
calculating derivatives, called the direct and adjoint techniques,
and they are the main subject of the lecture. Another important
topic is the numerical difficulties that may be encountered.

Slide 2
Properties of governing equations
Equations of equilibrium in finite element system: Ku=f
K is normally large, sparse, and ill-conditioned, but positive definite
A variant of Gaussian elimination is used for solution, without pivoting (averts matrix fill-up), which allows most of the computation to be done without involving the load f
In aerospace applications the system needs to be solved repeatedly for thousands of different right-hand sides. Why?
The shortcut we have for finding derivatives depends on the
properties of the stiffness matrix K that we use to solve for
displacements from the system Ku=f, with f being the force vector.
The matrix is large and sparse, often ill-conditioned, but positive
definite.
These properties mean that we usually solve the system of
equations by Gaussian elimination, where most of the processing is
independent of the force vector. That means that solving for
multiple load vectors is efficient, which is important even if we
are not calculating derivatives. This is because in aerospace
design we often need to calculate the response to thousands of
load vectors coming from different combinations of maneuvers, gust
loads, and weight distributions.
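The reuse of one factorization across many load vectors can be sketched as follows. This is a minimal illustration under stated assumptions, not the lecture's code: a small spring-chain matrix stands in for a finite element stiffness matrix, NumPy's standard Cholesky factor (K = C C^T) stands in for the elimination variant, and the helper names forward_sub and backward_sub are hypothetical.

```python
import numpy as np

def forward_sub(C, B):
    """Solve C Y = B for lower-triangular C; B may hold many load columns."""
    Y = np.zeros_like(B, dtype=float)
    for i in range(C.shape[0]):
        Y[i] = (B[i] - C[i, :i] @ Y[:i]) / C[i, i]
    return Y

def backward_sub(U, Y):
    """Solve U X = Y for upper-triangular U; Y may hold many columns."""
    X = np.zeros_like(Y, dtype=float)
    for i in range(U.shape[0] - 1, -1, -1):
        X[i] = (Y[i] - U[i, i + 1:] @ X[i + 1:]) / U[i, i]
    return X

# Assumed stand-in for a stiffness matrix: chain of unit springs fixed at one end.
n = 6
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
K[-1, -1] = 1.0

# Expensive step, done once and independent of the loads.
C = np.linalg.cholesky(K)

# Cheap step, repeated for every load vector (here 1000 random load cases).
rng = np.random.default_rng(0)
F = rng.standard_normal((n, 1000))
U = backward_sub(C.T, forward_sub(C, F))

residual = np.max(np.abs(K @ U - F))
print(f"largest residual over all load cases: {residual:.2e}")
```

The factorization never sees F, so adding more load cases only adds substitution work.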
The trick to efficient calculation of derivatives will turn out
to be converting the calculation of a derivative into the calculation
of the response to a different load vector.

Slide 3
Cholesky decomposition
A positive-definite stiffness matrix K can be decomposed as $K = L^T D L$, where L is a lower triangular matrix with 1s on the diagonal, and D is a diagonal matrix
Solution of Ku=f is replaced by the triangular solves $L^T y = f$ and $L u = D^{-1} y$
Why is that good for multiple RHS?
When K has a banded or skyline sparseness structure, it is preserved by L.
An example of Gaussian elimination is the Cholesky decomposition
(André-Louis Cholesky, born October 15, 1875, in Montguyon, died
August 31, 1918, in Bagneux, was a French military officer and
mathematician). The amount of computation that goes into the
decomposition is of order $n^3$ when the matrix is full, and much
lower for a sparse matrix, but still typically increases with the
cube of the dimension. On the
other hand the cost of solving for u after the decomposition is
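complete is of the order of $n^2$.

Slide 4
Direct method
Equations for displacement vector: Ku=f
Response quantity of interest: g(u,x)
Differentiate (primes denote d/dx): $dg/dx = \partial g/\partial x + z^T u'$, where $z_i = \partial g/\partial u_i$ and $K u' = f' - K' u$
RHS of derivative equation called pseudo load
Can be calculated outside of finite element program
Requires only forward and backward substitution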
Consider the need to calculate the derivative of a constraint
function (e.g., the stress at a point) that depends on the
displacement vector and a design variable x (such as thickness).
When we differentiate the equations Ku=f with respect to the design
variable, we will get
$K u' + K' u = f'$ or $K u' = f' - K' u = f_p$
where a prime denotes derivative with respect to x. Note that we
can find the derivative u' by solving a system of
equations with the same stiffness matrix, but with a different load
vector $f_p = f' - K' u$. This load is called the pseudo load, and its
calculation is often inexpensive because the design variable often
affects only a small part of the matrix K.
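The pseudo-load calculation described above can be sketched on a toy problem. This is an illustration under assumptions, not the lecture's example: three springs in series with a unit tip load, the design variable taken as the stiffness of the middle spring, and a hypothetical helper assemble. It calls np.linalg.solve twice for brevity, whereas a finite element code would reuse the factorization of K.

```python
import numpy as np

def assemble(k):
    """Stiffness matrix of springs in series; spring 0 is attached to a wall."""
    n = len(k)
    K = np.zeros((n, n))
    for e, ke in enumerate(k):
        K[e, e] += ke                  # node at the far end of spring e
        if e > 0:                      # node shared with the previous spring
            K[e - 1, e - 1] += ke
            K[e - 1, e] -= ke
            K[e, e - 1] -= ke
    return K

k = np.array([1.0, 2.0, 4.0])          # spring stiffnesses (assumed values)
f = np.array([0.0, 0.0, 1.0])          # unit load at the tip
K = assemble(k)
u = np.linalg.solve(K, f)

# Design variable x = k[1]. K is linear in k, so K' is the assembly
# of a unit stiffness in that spring alone (only a small part of K changes).
dK = assemble(np.array([0.0, 1.0, 0.0]))
df = np.zeros_like(f)                  # the load does not depend on x

f_p = df - dK @ u                      # pseudo load f' - K'u
du = np.linalg.solve(K, f_p)           # same matrix, different right-hand side

# Check: tip displacement is sum(1/k) for a unit load, so d/dk[1] = -1/k[1]**2.
exact = -f[-1] / k[1] ** 2
print(du[-1], exact)
```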
With the direct method of derivative calculation we first solve
for the derivative of the displacement vector, u', and then we
substitute it into the expression for the derivative of g. The
method is most efficient when we have a small number of design
variables and a large number of functions g.

Slide 5
Adjoint method
Rewrite derivative equation: $dg/dx = \partial g/\partial x + z^T K^{-1} f_p = \partial g/\partial x + \lambda^T f_p$, where $K \lambda = z$
For both adjoint and direct method need pseudo load vector and z vector.
The first step in deriving the adjoint method is to substitute
the equation for the displacement derivative into the equation for
the derivative of g. Then we note that if we apply the inverse of
the stiffness matrix to the vector on its left instead of a vector
on its right we can get an alternative expression.
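The two steps just described can be written out as follows. This is a reconstruction from the surrounding text rather than the slide's own equations; the symbol $\lambda$ for the adjoint vector is an assumption, and the last step relies on the symmetry of K.

```latex
\begin{aligned}
\frac{dg}{dx} &= \frac{\partial g}{\partial x} + z^T u',
  \qquad z_i = \frac{\partial g}{\partial u_i}, \\
K u' = f_p \;&\Rightarrow\;
  \frac{dg}{dx} = \frac{\partial g}{\partial x} + z^T K^{-1} f_p, \\
z^T K^{-1} = \left(K^{-1} z\right)^T = \lambda^T \;&\Rightarrow\;
  \frac{dg}{dx} = \frac{\partial g}{\partial x} + \lambda^T f_p,
  \qquad K \lambda = z .
\end{aligned}
```

Because $\lambda$ does not depend on the design variable, one solve per function g serves every design variable.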
The alternative expression involves solving for a load vector
equal to z, which is the derivative of g with respect to the
displacement components. Since g is usually a stress component, it
depends on very few displacement components, and so its
computation is usually easy.

Slide 6
Direct vs. Adjoint method
Direct method requires pseudo load calculation and solution for each design variable
Minimal effort with new constraints
Adjoint method requires solution for each new g, and pseudo load calculation for each variable
How do we choose?
See beam example in Section 7.2
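The trade-off between the two methods can be sketched by computing the same gradient both ways. This is a toy illustration under assumptions: three springs in series, every spring stiffness treated as a design variable, one response g (the middle-node displacement), and a hypothetical helper assemble. The adjoint vector is named lam here, which is not a name from the lecture.

```python
import numpy as np

def assemble(k):
    """Stiffness matrix of springs in series; spring 0 is attached to a wall."""
    n = len(k)
    K = np.zeros((n, n))
    for e, ke in enumerate(k):
        K[e, e] += ke
        if e > 0:
            K[e - 1, e - 1] += ke
            K[e - 1, e] -= ke
            K[e, e - 1] -= ke
    return K

k = np.array([1.0, 2.0, 4.0])          # design variables (assumed values)
f = np.array([0.0, 0.0, 1.0])          # unit tip load, independent of k
K = assemble(k)
u = np.linalg.solve(K, f)

# Both methods need one pseudo load per design variable: f_p = -K'_j u.
pseudo_loads = [-assemble(np.eye(len(k))[j]) @ u for j in range(len(k))]

# Response g = displacement of the middle node, so z = dg/du selects that entry.
z = np.array([0.0, 1.0, 0.0])

# Direct: one solve per design variable, then dg/dx_j = z^T u'_j.
grad_direct = np.array([z @ np.linalg.solve(K, fp) for fp in pseudo_loads])

# Adjoint: one solve per response, then dg/dx_j = lam^T f_p_j.
lam = np.linalg.solve(K, z)
grad_adjoint = np.array([lam @ fp for fp in pseudo_loads])

print(grad_direct, grad_adjoint)
```

With three design variables and one response, the direct route takes three solves and the adjoint route one. Adding more responses would reverse the advantage.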
Direct method requires solution for one pseudo load for each
design variable, but it does not suffer much if there are many
functions g. So for example, if we need to calculate the
derivatives of many stresses from the derivatives of the
displacements, the direct method would be a good choice.
On the other hand, the adjoint method requires a solution for
each g. So if we have a single g and a large number of design
variables it would be a good method. For example, if we need to
calculate the derivatives of one displacement component with
respect to many design variables the adjoint method would be
ideal.

Slide 7
Problems direct & adjoint
For a dense matrix, the Cholesky decomposition requires about $n^3$ operations, and forward or backward substitution about $n^2$ operations. Estimate the operation count for calculating the derivatives of 3 displacement components with respect to 10 design variables for n=1000, using the direct and adjoint methods. Assume that the cost of the pseudo load calculation is negligible.
How do the two methods compare for calculating the derivative of the compliance, $u^T f$?

Slide 8
The semi-analytical method
Analytical derivatives of stiffness matrices are burdensome, especially in commercial software. Why?
Most resort to finite-difference calculation of derivatives of stiffness matrix and force vector
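The semi-analytical idea can be sketched as a finite-difference derivative of K fed into the direct method. This is a minimal illustration under assumptions: an axial bar rather than a beam, its length as the design variable, a forward-difference step of 1e-6 times the length, and a hypothetical helper bar_stiffness.

```python
import numpy as np

def bar_stiffness(L, n_el, EA):
    """Axial bar of length L, fixed at one end, with n_el equal two-node elements."""
    ke = EA * n_el / L                          # element stiffness EA / (L / n_el)
    K = 2.0 * ke * np.eye(n_el) - ke * (np.eye(n_el, k=1) + np.eye(n_el, k=-1))
    K[-1, -1] = ke                              # tip node belongs to one element only
    return K

L, n_el, EA, P = 2.0, 5, 1000.0, 10.0           # assumed data
f = np.zeros(n_el)
f[-1] = P                                       # tip load, independent of L
K = bar_stiffness(L, n_el, EA)
u = np.linalg.solve(K, f)

# Finite-difference derivative of the stiffness matrix with respect to L.
h = 1e-6 * L
dK_fd = (bar_stiffness(L + h, n_el, EA) - K) / h

# Direct method with the approximate K': pseudo load, then one more solve.
f_p = -dK_fd @ u
du = np.linalg.solve(K, f_p)

# Check: tip displacement is P L / EA, so its derivative with respect to L is P / EA.
exact = P / EA
rel_error = abs(du[-1] - exact) / exact
print(du[-1], exact, rel_error)
```

Only the matrix is differenced, so no additional full analysis at the perturbed length is needed.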
Calculating the derivatives of the stiffness matrix and the load
vector with respect to design variables can be burdensome,
especially for complex finite elements that depend on a large number
of parameters. These include material properties, dimensions, and
the coordinates of the nodes. So it is quite common to calculate
the derivatives of K and f by finite differences instead of
analytically, and then use them with the direct or adjoint method.
The combination is called the semi-analytical method.

Slide 9
Problems with shape derivatives for a car model, circa 1985.
Derivative of compliance with respect to a dimension of car
My PhD student, Bruno Barthelemy, stumbled across a problem with
the semi-analytical method when on internship at Ford in 1985. He
used the semi-analytical method to calculate the derivatives of the
compliance (work of the applied loads) with respect to the
dimensions of the car frame. For two of the dimensions he found
that it was very difficult to find a step size that provided
accurate derivatives.

Slide 10
Cantilever beam example
When Barthelemy came back from his internship, he set to work to
find the reason for the problem, and he was finally able to
duplicate it for the derivative of the tip displacement of a beam
with respect to the length of the beam.
As the number of elements in the beam increased the error of the
semi-analytical method increased. This was shown with his own beam
finite element code as well as with a commercial finite element
code called EAL.
In comparison, the overall finite difference approach, where you
change the length and recalculate the displacement, was very
accurate.

Slide 11
Difficulty and partial solution
We live dangerously when we solve ill-conditioned equations like FE equations
Loadings that excite huge errors are not met in real life; they look like high vibration modes
The pseudo load can come to resemble such loads because it is not derived from physical loading
Haftka, R.T., Stiffness-Matrix Condition Number and Shape Sensitivity Errors, AIAA Journal, Vol. 28, No. 7, pp. 1322-1324, 1990.
Van Keulen's refined semi-analytical method purges K of the rigid body component as a way around the most common manifestation of the problem
Barthelemy's dissertation was mostly about documenting the
problem, because there was widespread skepticism that the problem
exists. He was also able to show that the problem is likely to be
associated with the pseudo load that you get when you have shape
design variables.
In follow-up work, I noted that the stiffness matrices used in
finite element calculations have condition numbers that are in the
thousands or higher. This means that small errors in the matrix
elements or in the loads can lead to large errors in the solution.
In fact the condition number is the largest possible amplification
factor for the error.
For large amplification of the error, you need the actual loads
on the structure to resemble an eigenvector associated with a high
eigenvalue. Such a load is usually highly oscillatory, and not
encountered in actual applications. However, the pseudo load is not
physical, and when shape variables are involved, it can have this
property of rapid oscillation.
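The role of the load shape in error amplification can be sketched numerically. This is an illustration under assumptions: a 50-spring chain stands in for a finite element stiffness matrix, the perturbation size 1e-8 is arbitrary, and the helper names amplification and sign_changes are hypothetical.

```python
import numpy as np

# Assumed stand-in for a stiffness matrix: chain of unit springs fixed at one end.
n = 50
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
K[-1, -1] = 1.0

# eigh returns eigenvalues in ascending order with eigenvectors as columns.
w, V = np.linalg.eigh(K)
cond = w[-1] / w[0]                    # 2-norm condition number of a symmetric PD matrix

def amplification(f, df):
    """Relative error in u divided by the relative error in f that caused it."""
    u = np.linalg.solve(K, f)
    du = np.linalg.solve(K, df)
    return (np.linalg.norm(du) / np.linalg.norm(u)) / (np.linalg.norm(df) / np.linalg.norm(f))

def sign_changes(v):
    """Number of sign changes along a vector, a crude measure of oscillation."""
    return int(np.sum(np.diff(np.sign(v)) != 0))

smooth, rough = V[:, 0], V[:, -1]      # lowest and highest eigenvectors

# Smooth (physical-looking) load with an oscillatory error: error is damped.
amp_smooth_load = amplification(smooth, 1e-8 * rough)
# Oscillatory load with a smooth error: amplification reaches the condition number.
amp_rough_load = amplification(rough, 1e-8 * smooth)

print(f"condition number: {cond:.1f}")
print(f"amplification, smooth load: {amp_smooth_load:.2e}")
print(f"amplification, oscillatory load: {amp_rough_load:.2e}")
print(sign_changes(smooth), sign_changes(rough))
```

The worst case occurs only when the load itself looks like the highest eigenvector, which is why a non-physical pseudo load can trigger it while ordinary loads do not.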
There are now treatments of the semi-analytical method that
avoid this problem, in particular one due to van Keulen and
colleagues (Boer and van Keulen, Refined semi-analytical design
sensitivities, Int. J. Solids Struct. 37, 6961-6980, 2000).

Slide 12
Problem semi-analytical
Describe in 45-55 words the pros and cons of the semi-analytical method
Use the semi-analytical method to reproduce the results of the figure on slide 10. You will need to generate the stiffness matrix of a cantilever beam with several elements. Check that you assembled the right matrix by noting that the tip displacement of a beam under an end load P is $PL^3/(3EI)$
If you find it difficult to do the problem in non-dimensional form, you may use the following data: P=1,000 lb, L=120 in, E=30 msi, I=3 in^4.
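As a reference value for the check, the standard cantilever tip deflection $PL^3/(3EI)$ can be evaluated with the data given. This assumes that L is in inches and that msi means $10^6$ psi; the symbol $\delta_{tip}$ is introduced here only for illustration.

```latex
\delta_{tip} = \frac{P L^3}{3 E I}
  = \frac{(1000)(120)^3}{3\,(30 \times 10^{6})(3)}
  = \frac{1.728 \times 10^{9}}{2.7 \times 10^{8}}
  = 6.4 \ \text{in}
```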