The Development of Hyper-Dual Numbers for Exact Second-Derivative Calculations
Jeffrey A. Fike and Juan J. Alonso
Department of Aeronautics and Astronautics
Stanford University
49th AIAA Aerospace Sciences Meeting
January 4, 2011
Outline
● What are Hyper-Dual Numbers?
● Why use Hyper-Dual Numbers?
● Implementation of Hyper-Dual Numbers and Demonstration of Hyper-Dual Calculations
● CFD Results using Hyper-Dual Numbers
● Cost of using Hyper-Dual Numbers
● Extensions and Alternative Formulations
Extension of Dual Numbers
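● Complex Numbers
  $z = a + bi$, with $i^2 = -1$
● Quaternions
  $q = a + bi + cj + dk$, with $i^2 = j^2 = k^2 = -1$ and $ij = -ji = k$
● Dual Numbers
  $x = a + b\epsilon$, with $\epsilon^2 = 0$, $\epsilon \neq 0$
● Hyper-Dual Numbers
  $x = a + b\epsilon_1 + c\epsilon_2 + d\epsilon_1\epsilon_2$, with $\epsilon_1^2 = \epsilon_2^2 = (\epsilon_1\epsilon_2)^2 = 0$ and $\epsilon_1 \neq \epsilon_2 \neq \epsilon_1\epsilon_2 \neq 0$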
Dual Numbers
● Have a history of independent discovery
  ● William Kingdon Clifford in 1873
  ● Eduard Study in 1891
● They developed what are now known as Dual-Quaternions
  ● Quaternions composed of dual numbers, or dual numbers composed of quaternions
  ● Represent rotations and translations
  ● Used in robotics, computer graphics, and flight simulation
Generalized Complex Numbers
● $a + bE$
● Addition:
  ● $(a + bE) + (c + dE) = (a + c) + (b + d)E$
● Multiplication:
  ● $(a + bE)(c + dE) = ac + (ad + bc)E + bd\,E^2$
● Three types:
  ● $E^2 = -1$: Ordinary Complex Numbers
  ● $E^2 = 0$: Dual Numbers
  ● $E^2 = 1$: Double Numbers
Comparison of Methods
● There are many different methods for calculating derivatives
  ● Finite-Differences, Complex-Step Approximation, Adjoint, etc.
● Each method has its own advantages and disadvantages
  ● Accuracy
  ● Ease of implementation
  ● Computational Efficiency
Finite Differences
● Derived from the Taylor Series
$$f(x+h) = f(x) + h f'(x) + \frac{h^2}{2} f''(x) + \frac{h^3}{6} f'''(x) + \cdots$$
● Many options
  ● Forward Difference
    – First-Order Approximation
$$f'(x) = \frac{f(x+h) - f(x)}{h} + O(h)$$
  ● Central Difference
    – Second-Order Approximation
$$f'(x) = \frac{f(x+h) - f(x-h)}{2h} + O(h^2)$$
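The two formulas above can be tried numerically. The following is a minimal Python sketch, not taken from the slides; the test function `math.sin` and the step sizes are assumptions chosen only to show how the error first shrinks with h and then grows again once subtractive cancellation dominates.

```python
import math

def forward_difference(f, x, h):
    # First-order accurate: error is O(h).
    return (f(x + h) - f(x)) / h

def central_difference(f, x, h):
    # Second-order accurate: error is O(h^2).
    return (f(x + h) - f(x - h)) / (2.0 * h)

x0 = 1.0
exact = math.cos(x0)  # analytic derivative of sin at x0
for h in (1e-2, 1e-4, 1e-8, 1e-12):
    err_fwd = abs(forward_difference(math.sin, x0, h) - exact)
    err_ctr = abs(central_difference(math.sin, x0, h) - exact)
    print(f"h={h:.0e}  forward error={err_fwd:.2e}  central error={err_ctr:.2e}")
```

For very small h the printed errors stop decreasing, which is the step-size search problem that motivates the methods on the following slides.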
Complex-Step Approximation
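● Taylor Series with an imaginary step
$$f(x+hi) = f(x) + h f'(x)\,i - \frac{h^2}{2} f''(x) - \frac{h^3}{6} f'''(x)\,i + \cdots$$
$$f(x+hi) = \left[ f(x) - \frac{h^2}{2} f''(x) + O(h^4) \right] + i\,h \left[ f'(x) - \frac{h^2}{6} f'''(x) + O(h^4) \right]$$
● First-Derivative Approximation
$$f'(x) = \frac{\mathrm{Imag}\left[ f(x+hi) \right]}{h} + O(h^2)$$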
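As a small illustration of the first-derivative formula, the Python sketch below evaluates a function at x + hi with h = 1e-20. The test function `g` and the step size are assumptions made for this example; the standard-library `cmath` module supplies the complex elementary functions.

```python
import cmath
import math

def complex_step_derivative(f, x, h=1e-20):
    # The derivative comes from the imaginary part alone; no difference is taken.
    return f(complex(x, h)).imag / h

def forward_difference(f, x, h):
    return (f(x + h) - f(x)) / h

def g(z):
    # Hypothetical test function; cmath accepts real and complex arguments.
    return cmath.exp(z) * cmath.sin(z)

x0 = 1.5
exact = math.exp(x0) * (math.sin(x0) + math.cos(x0))
cs = complex_step_derivative(g, x0)
fd = forward_difference(lambda t: g(t).real, x0, 1e-20)
print("complex step error:      ", abs(cs - exact))
print("forward difference value:", fd)
```

With the same tiny step, the forward difference returns zero because x + h rounds back to x, while the complex-step result agrees with the analytic derivative to roughly machine precision.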
Properties of Complex-Step Approximation
● Immune to Subtractive Cancellation Error.
  ● The Complex-Step Approximation can be used with an arbitrarily small step size to produce numerically-exact first-derivatives.
    – Eliminates the need to search for a good step size.
● Fairly easy to implement.
● What about second-derivative calculations?
  ● Does the complex-step approximation retain these properties?
Second-Derivative Complex-Step Approximations
● One possibility
$$f''(x) = \frac{2\left( f(x) - \mathrm{Real}\left[ f(x+hi) \right] \right)}{h^2} + O(h^2)$$
● It is possible to derive alternate formulas for the second derivative that use two different complex steps.
  ● But these still suffer from subtractive cancellation error.
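The cancellation problem in this second-derivative formula is easy to reproduce. The Python sketch below is an illustration under assumed inputs (the function `cmath.exp` at x = 1 and two step sizes); it is not code from the presentation.

```python
import cmath
import math

def second_derivative_complex_step(f, x, h):
    # Uses the difference f(x) - Real[f(x + hi)], so it is exposed to
    # subtractive cancellation when h is small.
    fx = f(complex(x, 0.0)).real
    return 2.0 * (fx - f(complex(x, h)).real) / h**2

exact = math.e  # second derivative of exp at x = 1
moderate = second_derivative_complex_step(cmath.exp, 1.0, 1e-3)
tiny = second_derivative_complex_step(cmath.exp, 1.0, 1e-10)
print("h=1e-3  error:", abs(moderate - exact))
print("h=1e-10 error:", abs(tiny - exact))
```

A moderate step gives a usable answer limited by the O(h^2) truncation error, while the tiny step destroys the result, unlike the first-derivative complex-step formula.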
Immunity to Subtractive Cancellation Error
● For the Complex Step Approximation, the first-derivative term is the leading term of the imaginary part and can be extracted without a difference operation.
  ● This new method should have the second derivative as the leading term of a non-real part.
● Usually want both first and second derivatives
  ● Requires each to be a leading term
  ● Suggests a number with multiple non-real parts
Cross-Derivative Calculations
● For multi-variable functions, cross derivatives are computed based on previous calculations.
$$\frac{\partial^2 f(x,y)}{\partial x \partial y} \approx \frac{1}{2} \left[ \frac{2\left( f(x,y) - \mathrm{Real}\left[ f(x+hi,\, y+hi) \right] \right)}{h^2} - \frac{\partial^2 f(x,y)}{\partial x^2} - \frac{\partial^2 f(x,y)}{\partial y^2} \right]$$
  ● Error is cumulative
● To eliminate this type of procedure, the perturbation of each variable should be applied to different non-real parts.
  ● Again suggesting the use of multiple non-real parts
Quaternions
● $q = a + bi + cj + dk$
  $i^2 = j^2 = k^2 = -1$, $ijk = -1$, $ij = -ji = k$
  Note: multiplication is not commutative
● Taylor Series
$$f(x + h_1 i + h_2 j + 0k) = f(x) + h_1 f'(x)\,i + h_2 f'(x)\,j - \frac{h_1^2 + h_2^2}{2} f''(x) + \cdots$$
● Second Derivative Approximation
$$f''(x) = \frac{2\left( f(x) - \mathrm{Real}\left[ f(x + h_1 i + h_2 j + 0k) \right] \right)}{h_1^2 + h_2^2} + O(h_1^2 + h_2^2)$$
● Subject to Subtractive Cancellation error
A Different Type of Number
$$x = x_1 + x_2 E_1 + x_3 E_2 + x_4 E_1 E_2$$
● Enforce multiplicative commutativity, $E_1 E_2 = E_2 E_1$
● This creates a constraint on the possible values of $E_1$ and $E_2$
$$(E_1 E_2)^2 = (E_1)^2 (E_2)^2$$
● Many possibilities:
  ● $E_1^2 = E_2^2 = -1$, $(E_1 E_2)^2 = 1$: Circular Fourcomplex
  ● $E_1^2 = E_2^2 = (E_1 E_2)^2 = 1$: Hyper-Double
  ● $E_1^2 = E_2^2 = (E_1 E_2)^2 = 0$: Hyper-Dual
Hyper-Dual Numbers
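● $E_1^2 = E_2^2 = (E_1 E_2)^2 = 0$, with $E_1 \neq E_2 \neq E_1 E_2 \neq 0$, produces the desired results
● Change to dual number notation:
$$\epsilon_1^2 = \epsilon_2^2 = (\epsilon_1\epsilon_2)^2 = 0, \qquad \epsilon_1 \neq \epsilon_2 \neq \epsilon_1\epsilon_2 \neq 0$$
● For $d = h_1\epsilon_1 + h_2\epsilon_2 + 0\,\epsilon_1\epsilon_2$, the Taylor Series becomes
$$f(x+d) = f(x) + h_1 f'(x)\,\epsilon_1 + h_2 f'(x)\,\epsilon_2 + h_1 h_2 f''(x)\,\epsilon_1\epsilon_2$$
● This expression is exact, no truncation error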
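The reason the series stops after the second-derivative term can be worked out directly from the multiplication rules for $\epsilon_1$ and $\epsilon_2$. The short derivation below is a sketch using only those rules; $d$ is the perturbation with zero $\epsilon_1\epsilon_2$ part defined on this slide.

```latex
\begin{aligned}
d^2 &= (h_1\epsilon_1 + h_2\epsilon_2)^2
     = h_1^2\epsilon_1^2 + 2h_1h_2\,\epsilon_1\epsilon_2 + h_2^2\epsilon_2^2
     = 2h_1h_2\,\epsilon_1\epsilon_2, \\
d^3 &= d^2\, d
     = 2h_1^2h_2\,\epsilon_1^2\epsilon_2 + 2h_1h_2^2\,\epsilon_1\epsilon_2^2
     = 0, \\
f(x+d) &= f(x) + f'(x)\,d + \tfrac{1}{2} f''(x)\,d^2
        = f(x) + h_1 f'(x)\,\epsilon_1 + h_2 f'(x)\,\epsilon_2 + h_1h_2 f''(x)\,\epsilon_1\epsilon_2 .
\end{aligned}
```

Every power of $d$ from the third onward vanishes, so no terms are truncated and the step sizes $h_1$, $h_2$ do not affect accuracy.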
Implementation
● Implemented as a class using operator overloading in C++ and Matlab.
  ● This allows existing codes to be converted to use Hyper-Dual Numbers with very little modification.
    – Typically, only the variable type declarations need to be changed.
    – For codes using MPI there is slightly more that needs to be changed, such as defining new reduce operations.
    – Calls to external packages, such as PETSc, may need to be altered.
● Implementation available at http://adl.stanford.edu
Arithmetic Operations
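● Consider two hyper-dual numbers:
$$a = a_1 + a_2\epsilon_1 + a_3\epsilon_2 + a_4\epsilon_1\epsilon_2, \qquad b = b_1 + b_2\epsilon_1 + b_3\epsilon_2 + b_4\epsilon_1\epsilon_2$$
● Addition is defined as
$$a + b = (a_1 + b_1) + (a_2 + b_2)\,\epsilon_1 + (a_3 + b_3)\,\epsilon_2 + (a_4 + b_4)\,\epsilon_1\epsilon_2$$
● Multiplication is defined as
$$a * b = a_1 b_1 + (a_1 b_2 + a_2 b_1)\,\epsilon_1 + (a_1 b_3 + a_3 b_1)\,\epsilon_2 + (a_1 b_4 + a_2 b_3 + a_3 b_2 + a_4 b_1)\,\epsilon_1\epsilon_2$$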
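The addition and multiplication rules map directly onto operator overloading. The sketch below uses Python rather than the C++ and Matlab of the authors' implementation, and the class name `HyperDual`, its field names, and the helper `_lift` are all hypothetical choices for this illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HyperDual:
    re: float          # real part a1
    e1: float = 0.0    # coefficient of eps1 (a2)
    e2: float = 0.0    # coefficient of eps2 (a3)
    e12: float = 0.0   # coefficient of eps1*eps2 (a4)

    def __add__(self, other):
        b = _lift(other)
        return HyperDual(self.re + b.re, self.e1 + b.e1,
                         self.e2 + b.e2, self.e12 + b.e12)

    __radd__ = __add__

    def __mul__(self, other):
        b = _lift(other)
        return HyperDual(
            self.re * b.re,
            self.re * b.e1 + self.e1 * b.re,
            self.re * b.e2 + self.e2 * b.re,
            self.re * b.e12 + self.e1 * b.e2 + self.e2 * b.e1 + self.e12 * b.re,
        )

    __rmul__ = __mul__

def _lift(value):
    # Promote plain real numbers so mixed expressions such as 2 * x work.
    return value if isinstance(value, HyperDual) else HyperDual(float(value))

# f(x) = x^2 + 2x at x = 3, seeded with unit eps1 and eps2 parts.
x = HyperDual(3.0, 1.0, 1.0, 0.0)
y = x * x + 2 * x
print(y)  # re = f(3), e1 = e2 = f'(3), e12 = f''(3)
```

Seeding both non-real parts with 1 means the `e1` and `e2` fields of the result carry the first derivative and the `e12` field carries the second derivative.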
Other Operations
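● The inverse
$$\frac{1}{a} = \frac{1}{a_1} - \frac{a_2}{a_1^2}\,\epsilon_1 - \frac{a_3}{a_1^2}\,\epsilon_2 + \left( \frac{2 a_2 a_3}{a_1^3} - \frac{a_4}{a_1^2} \right)\epsilon_1\epsilon_2$$
  Only exists for $a_1 \neq 0$
● This suggests a definition for the norm
$$\mathrm{norm}(a) = \sqrt{a_1^2}$$
● Which implies that comparisons should be made based only on the real part
  $a > b$ is equivalent to $a_1 > b_1$
● This allows the code to follow the same execution path as the real-valued code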
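The inverse formula can be checked by multiplying a hyper-dual number by its inverse. The Python sketch below stores a number as a plain 4-tuple `(a1, a2, a3, a4)`; the function names `hd_mul`, `hd_inv`, and `hd_greater` are hypothetical and chosen only for this illustration.

```python
def hd_mul(a, b):
    # Hyper-dual product of two 4-tuples (real, eps1, eps2, eps1*eps2).
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (a1 * b1,
            a1 * b2 + a2 * b1,
            a1 * b3 + a3 * b1,
            a1 * b4 + a2 * b3 + a3 * b2 + a4 * b1)

def hd_inv(a):
    # The inverse exists only when the real part is nonzero.
    a1, a2, a3, a4 = a
    if a1 == 0.0:
        raise ZeroDivisionError("hyper-dual inverse requires a nonzero real part")
    return (1.0 / a1,
            -a2 / a1**2,
            -a3 / a1**2,
            2.0 * a2 * a3 / a1**3 - a4 / a1**2)

def hd_greater(a, b):
    # Comparisons look at the real part only, so branches match the real code.
    return a[0] > b[0]

a = (2.0, 3.0, 5.0, 7.0)
identity = hd_mul(a, hd_inv(a))
print(identity)
```

The product has real part 1 and zero non-real parts, and a number such as `(0.0, 1.0, 1.0, 0.0)` has no inverse even though it is not zero.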
Functions
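● Differentiable functions can be defined using the Taylor Series for a generic hyper-dual number.
$$f(a) = f(a_1) + a_2 f'(a_1)\,\epsilon_1 + a_3 f'(a_1)\,\epsilon_2 + \left( a_4 f'(a_1) + a_2 a_3 f''(a_1) \right)\epsilon_1\epsilon_2$$
● For instance,
$$a^3 = a_1^3 + 3 a_2 a_1^2\,\epsilon_1 + 3 a_3 a_1^2\,\epsilon_2 + \left( 3 a_4 a_1^2 + 6 a_2 a_3 a_1 \right)\epsilon_1\epsilon_2$$
$$\sin(a) = \sin(a_1) + a_2 \cos(a_1)\,\epsilon_1 + a_3 \cos(a_1)\,\epsilon_2 + \left( a_4 \cos(a_1) - a_2 a_3 \sin(a_1) \right)\epsilon_1\epsilon_2$$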
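The generic Taylor-series rule can be written once and reused for each elementary function. In the Python sketch below a hyper-dual number is a 4-tuple `(a1, a2, a3, a4)`, and `hd_apply`, `hd_sin`, and `hd_cube` are hypothetical names introduced for this illustration.

```python
import math

def hd_apply(f, df, d2f, a):
    # Lift a real function f with known first and second derivatives
    # to a hyper-dual argument using the Taylor-series rule.
    a1, a2, a3, a4 = a
    return (f(a1),
            a2 * df(a1),
            a3 * df(a1),
            a4 * df(a1) + a2 * a3 * d2f(a1))

def hd_sin(a):
    return hd_apply(math.sin, math.cos, lambda t: -math.sin(t), a)

def hd_cube(a):
    return hd_apply(lambda t: t**3, lambda t: 3.0 * t**2, lambda t: 6.0 * t, a)

print(hd_cube((2.0, 3.0, 5.0, 7.0)))
print(hd_sin((0.5, 1.0, 1.0, 0.0)))
```

The cube result matches the closed form on the slide: the last component is 3·7·2² + 6·3·5·2 = 264.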
Derivative Calculations
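● For $f(\mathbf{x})$, $\mathbf{x} \in \mathbb{R}^n$, i.e. $\mathbf{x} = [x_1, x_2, \ldots, x_i, \ldots, x_j, \ldots, x_n]^T$
● To calculate $\dfrac{\partial^2 f(\mathbf{x})}{\partial x_i \partial x_j}$
● Compute $f(\mathbf{x}_{ij})$ with $\mathbf{x}_{ij} = \mathbf{x} + h_1\epsilon_1 \mathbf{e}_i + h_2\epsilon_2 \mathbf{e}_j + 0\,\epsilon_1\epsilon_2$
● Which gives
$$f(\mathbf{x}_{ij}) = f(\mathbf{x}) + h_1 \frac{\partial f(\mathbf{x})}{\partial x_i}\,\epsilon_1 + h_2 \frac{\partial f(\mathbf{x})}{\partial x_j}\,\epsilon_2 + h_1 h_2 \frac{\partial^2 f(\mathbf{x})}{\partial x_i \partial x_j}\,\epsilon_1\epsilon_2$$
  With $h_1 = h_2 = h$ this reduces to
$$f(\mathbf{x}_{ij}) = f(\mathbf{x}) + h \frac{\partial f(\mathbf{x})}{\partial x_i}\,\epsilon_1 + h \frac{\partial f(\mathbf{x})}{\partial x_j}\,\epsilon_2 + h^2 \frac{\partial^2 f(\mathbf{x})}{\partial x_i \partial x_j}\,\epsilon_1\epsilon_2$$
● One run provides the derivatives
$$\frac{\partial f(\mathbf{x})}{\partial x_i}, \qquad \frac{\partial f(\mathbf{x})}{\partial x_j}, \qquad \frac{\partial^2 f(\mathbf{x})}{\partial x_i \partial x_j}$$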
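The procedure above can be sketched for a small multi-variable function. The Python code below is an illustration only: the class `HD`, the driver `gradient_and_hessian`, and the polynomial test function `f` are hypothetical, and only addition and multiplication are overloaded because the test function needs nothing else.

```python
class HD:
    """Minimal hyper-dual number supporting + and * (illustrative only)."""
    def __init__(self, re, e1=0.0, e2=0.0, e12=0.0):
        self.re, self.e1, self.e2, self.e12 = re, e1, e2, e12

    def __add__(self, o):
        o = o if isinstance(o, HD) else HD(o)
        return HD(self.re + o.re, self.e1 + o.e1, self.e2 + o.e2, self.e12 + o.e12)

    __radd__ = __add__

    def __mul__(self, o):
        o = o if isinstance(o, HD) else HD(o)
        return HD(self.re * o.re,
                  self.re * o.e1 + self.e1 * o.re,
                  self.re * o.e2 + self.e2 * o.re,
                  self.re * o.e12 + self.e1 * o.e2 + self.e2 * o.e1 + self.e12 * o.re)

    __rmul__ = __mul__

def gradient_and_hessian(f, x, h1=1.0, h2=1.0):
    n = len(x)
    grad = [0.0] * n
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):  # one hyper-dual evaluation per (i, j), j >= i
            xij = [HD(v) for v in x]
            xij[i] = xij[i] + HD(0.0, e1=h1)  # perturb x_i along eps1
            xij[j] = xij[j] + HD(0.0, e2=h2)  # perturb x_j along eps2
            out = f(xij)
            grad[i] = out.e1 / h1
            grad[j] = out.e2 / h2
            hess[i][j] = hess[j][i] = out.e12 / (h1 * h2)
    return grad, hess

def f(v):
    # Hypothetical test function: f(x, y) = x^2 y + y^3
    x, y = v
    return x * x * y + y * y * y

grad, hess = gradient_and_hessian(f, [2.0, 3.0])
print(grad, hess)
```

Each pass through the inner loop is one hyper-dual function evaluation and fills one Hessian entry (and its symmetric partner) together with two gradient entries.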
Simple Example
$$f(x) = \sin^3 x$$
● This function can be evaluated as:
$$t_0 = x, \qquad t_1 = \sin t_0, \qquad t_2 = t_1^3$$
Hyper-Dual Evaluation
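● This function can be evaluated as:
$$t_0 = x + h_1\epsilon_1 + h_2\epsilon_2 + 0\,\epsilon_1\epsilon_2$$
$$t_1 = \sin t_0 = \sin x + h_1 \cos x\,\epsilon_1 + h_2 \cos x\,\epsilon_2 - h_1 h_2 \sin x\,\epsilon_1\epsilon_2$$
$$t_2 = t_1^3 = \sin^3 x + 3 h_1 \cos x \sin^2 x\,\epsilon_1 + 3 h_2 \cos x \sin^2 x\,\epsilon_2 - \frac{3}{4} h_1 h_2 \left( \sin x - 3 \sin 3x \right)\epsilon_1\epsilon_2$$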
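As a check on the final $\epsilon_1\epsilon_2$ coefficient, the second derivative of $\sin^3 x$ can be computed by hand and compared. The derivation below is a sketch that uses only the identities $\cos^2 x = 1 - \sin^2 x$ and $\sin^3 x = \tfrac{1}{4}(3\sin x - \sin 3x)$.

```latex
\begin{aligned}
\frac{d}{dx}\sin^3 x &= 3\sin^2 x \cos x, \\
\frac{d^2}{dx^2}\sin^3 x &= 6\sin x\cos^2 x - 3\sin^3 x
  = 6\sin x - 9\sin^3 x \\
 &= 6\sin x - \tfrac{9}{4}\left(3\sin x - \sin 3x\right)
  = -\tfrac{3}{4}\left(\sin x - 3\sin 3x\right).
\end{aligned}
```

The $\epsilon_1$ and $\epsilon_2$ coefficients are $h_1$ and $h_2$ times the first derivative, and the $\epsilon_1\epsilon_2$ coefficient is $h_1 h_2$ times the second derivative, as the hyper-dual Taylor series requires.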
CFD Results
● Hyper-Dual numbers have been applied to an unstructured, parallel, 3-D, unsteady Reynolds-averaged Navier-Stokes solver, Joe.
● Two examples
  ● Turbulent, transonic flow over an airfoil
    – Derivatives of aerodynamic coefficients with respect to free stream Mach number and angle of attack.
  ● Inviscid, supersonic flow over a wedge
    – Derivatives of the pressure ratio across the oblique shock with respect to the Mach number before the shock.
Turbulent, Transonic Airfoil
● NACA 0012
● Mach 0.8
● α = 1°
● Reynolds Number of 7e6
● Spalart-Allmaras Turbulence Model
Comparison of Derivatives
Method                    Value
Joe, hyper-dual           9.29954176747898
Joe, finite difference    9.29952525902111

Joe, hyper-dual           0.644541632471284
Joe, finite difference    0.644517820502788

Joe, hyper-dual           0.165004436840803
Joe, finite difference    0.165004436840802

Quantities compared: $C_L$, $\dfrac{dC_L}{d\alpha}$, $\dfrac{dC_L}{dM}$
Comparison of Derivatives
Method                    Value
Joe, hyper-dual           -445.302962067975
Joe, finite difference    -432.298641328544

Joe, hyper-dual           -338.716668969934
Joe, finite difference    -331.287774990585

Joe, hyper-dual           -309.121862728704
Joe, finite difference    -310.051984087067

Quantities compared: $\dfrac{d^2 C_L}{d\alpha^2}$, $\dfrac{d^2 C_L}{dM^2}$, $\dfrac{d^2 C_L}{d\alpha\, dM}$
Mach 2 Flow Over a 15° Wedge
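[Figure: oblique shock on a wedge. Labels: $M_1 = 2.0$ and $P_1$ before the shock, $M_2$ and $P_2$ after the shock, wedge angle $\theta = 15°$, shock angle $\beta$.]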
Error in Pressure Ratio Calculation
Error in First Derivative
Error in Second Derivative
Computational Cost
● Working with Hyper-Dual numbers is more expensive than using real numbers
  ● Hyper-Dual addition requires 4 real additions
  ● Hyper-Dual multiplication requires 9 real multiplications and 5 real additions
● Only one Hyper-Dual function evaluation is required per second derivative
  ● Finite Differences require multiple function evaluations per derivative
Gradient & Hessian Calculations
● Cost of forming the gradient and Hessian of a function of n variables
● Forward Difference
  ● $(n+1)^2$ real function evaluations
● Central Difference
  ● $2n(n+2)$ real function evaluations
● Hyper-Dual Numbers
  ● $n(n+1)/2$ Hyper-Dual function evaluations
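To get a feel for these counts, the formulas can be tabulated for a few problem sizes. The Python sketch below simply evaluates the three expressions from this slide; the function name `evaluation_counts` and the chosen values of n are assumptions for illustration.

```python
def evaluation_counts(n):
    # Function evaluations needed for the full gradient and Hessian
    # of a function of n variables.
    return {
        "forward_difference": (n + 1) ** 2,   # real evaluations
        "central_difference": 2 * n * (n + 2),  # real evaluations
        "hyper_dual": n * (n + 1) // 2,       # hyper-dual evaluations
    }

for n in (2, 10, 100):
    print(n, evaluation_counts(n))
```

The hyper-dual count is the smallest, but each hyper-dual evaluation costs more than a real one, so the counts alone do not give the runtime ratio.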
Actual Costs
● The Hyper-Dual number version of Joe takes, on average, 10 times as long to run as the real number version.
  ● As low as 7 times, or as high as 30 times (compiler dependent)
● However, the cost may be reduced if the analysis code involves an iterative procedure
  ● Converge using real numbers, then perform one iteration using hyper-dual numbers
    – For Supersonic Business Jet analysis, cost reduced by a factor of 8.
      ● 0.9 times Forward Difference, 0.46 times Central Difference
Extensions of this Method
● Derivatives of complex valued functions
  ● Change the class definition to use complex numbers instead of double precision reals.
● If interested in first derivatives only, can simplify to Dual Numbers and reduce computational cost
● Can extend to third derivatives or higher by including other terms such as $\epsilon_3$, etc.
● Partial implementations available for the first two
Conclusion
● Created a new method for second-derivative calculations
● Developed and implemented hyper-dual numbers, which yield exact derivatives
● Demonstrated use on problems of realistic complexity
Questions?
Backup Slides
Division Algebra
● Real Numbers, Complex Numbers, and Quaternions are Division Algebras
  – Additive associativity, additive commutativity, additive identity, additive inverse, multiplicative associativity, multiplicative identity, multiplicative inverse, left and right distributivity
● These new numbers satisfy all these conditions except for the multiplicative inverse, i.e. the requirement that an inverse exists for all numbers not equal to zero
  ● We have an inverse for all numbers whose norm is not equal to zero
Other Functions
● Non-differentiable functions, such as the absolute value, can be defined as a procedure
    if x < 0
        return -x
    else
        return x
Analytic Derivatives
● Analytic derivatives of the pressure ratio across an oblique shock with respect to the Mach number before the shock can be derived using an adjoint approach.
● The oblique shock relation is cast as a residual equation.
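$$R = 2\cot\beta\, \frac{M_1^2 \sin^2\beta - 1}{M_1^2\left( \gamma + \cos 2\beta \right) + 2} - \tan\theta = 0$$
$$P_{ratio} = \frac{P_2}{P_1} = 1 + \frac{2\gamma}{\gamma + 1}\left( M_1^2 \sin^2\beta - 1 \right)$$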
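For the Mach 2, 15° wedge case, the residual equation can be solved for the shock angle and the pressure ratio evaluated. The Python sketch below is an illustration under stated assumptions: it uses the standard oblique-shock relation between wedge angle θ, shock angle β, and upstream Mach number, assumes γ = 1.4 for air, and finds the weak-shock root by bisection with an assumed upper bracket of 60°. The function names are hypothetical.

```python
import math

GAMMA = 1.4  # assumed ratio of specific heats for air

def residual(beta, mach, theta, gamma=GAMMA):
    # Oblique shock relation written as R(beta, M1) = 0.
    return (2.0 / math.tan(beta)
            * (mach**2 * math.sin(beta)**2 - 1.0)
            / (mach**2 * (gamma + math.cos(2.0 * beta)) + 2.0)
            - math.tan(theta))

def pressure_ratio(beta, mach, gamma=GAMMA):
    return 1.0 + 2.0 * gamma / (gamma + 1.0) * (mach**2 * math.sin(beta)**2 - 1.0)

def solve_weak_shock_angle(mach, theta, beta_hi=math.radians(60.0), tol=1e-13):
    # Bisection between the Mach angle (R < 0) and an assumed upper
    # bracket below the maximum-deflection shock angle (R > 0).
    lo, hi = math.asin(1.0 / mach), beta_hi
    if residual(lo, mach, theta) > 0.0 or residual(hi, mach, theta) < 0.0:
        raise ValueError("bracket does not contain the weak-shock root")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid, mach, theta) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

mach1, theta = 2.0, math.radians(15.0)
beta = solve_weak_shock_angle(mach1, theta)
print("shock angle [deg]:", math.degrees(beta))
print("pressure ratio:   ", pressure_ratio(beta, mach1))
```

Replacing the real arithmetic in such a solver with hyper-dual arithmetic is one way derivatives of the pressure ratio with respect to the upstream Mach number could be obtained for comparison with the analytic approach.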
Analytic Derivatives
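$$\frac{d P_{ratio}}{d M_1} = \frac{\partial P_{ratio}}{\partial M_1} + \psi_1 \frac{\partial R}{\partial M_1}$$
$$\frac{d^2 P_{ratio}}{d M_1^2} = \frac{\partial^2 P_{ratio}}{\partial M_1^2} + \psi_1 \frac{\partial^2 R}{\partial M_1^2} + \psi_2 \frac{\partial R}{\partial M_1} + \psi_3 \left( \frac{\partial R}{\partial M_1} \right)^2$$
Where there are three adjoint equations
$$\psi_1 = -\frac{\partial P_{ratio}}{\partial \beta} \left( \frac{\partial R}{\partial \beta} \right)^{-1}$$
$$\psi_3 = -\left( \frac{\partial^2 P_{ratio}}{\partial \beta^2} + \psi_1 \frac{\partial^2 R}{\partial \beta^2} \right) \left( \frac{\partial R}{\partial \beta} \right)^{-2}$$
$$\psi_2 = -2 \left( \frac{\partial^2 P_{ratio}}{\partial M_1 \partial \beta} + \psi_1 \frac{\partial^2 R}{\partial M_1 \partial \beta} + \psi_3 \frac{\partial R}{\partial M_1} \frac{\partial R}{\partial \beta} \right) \left( \frac{\partial R}{\partial \beta} \right)^{-1}$$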
Grid Convergence
Grid Convergence