The Development of Hyper-Dual Numbers for Exact Second-Derivative Calculations

Jeffrey A. Fike and Juan J. Alonso

Department of Aeronautics and Astronautics, Stanford University

49th AIAA Aerospace Sciences Meeting

January 4, 2011

Outline

● What are Hyper-Dual Numbers?
● Why use Hyper-Dual Numbers?
● Implementation of Hyper-Dual Numbers and Demonstration of Hyper-Dual Calculations
● CFD Results using Hyper-Dual Numbers
● Cost of using Hyper-Dual Numbers
● Extensions and Alternative Formulations


Extension of Dual Numbers

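● Complex Numbers
   $z = a + bi$, with $i^2 = -1$
● Quaternions
   $q = a + bi + cj + dk$, with $i^2 = j^2 = k^2 = -1$ and $ij = -ji = k$
● Dual Numbers
   $x = a + b\epsilon$, with $\epsilon^2 = 0$, $\epsilon \neq 0$
● Hyper-Dual Numbers
   $x = a + b\epsilon_1 + c\epsilon_2 + d\epsilon_1\epsilon_2$, with $\epsilon_1^2 = \epsilon_2^2 = (\epsilon_1\epsilon_2)^2 = 0$ and $\epsilon_1 \neq \epsilon_2 \neq \epsilon_1\epsilon_2 \neq 0$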

Dual Numbers

● Have a history of independent discovery
   ● William Kingdon Clifford in 1873
   ● Eduard Study in 1891
● They developed what are now known as Dual-Quaternions
   ● Quaternions composed of dual numbers, or dual numbers composed of quaternions
   ● Represent rotations and translations
   ● Used in robotics, computer graphics, and flight simulation

Generalized Complex Numbers

● $a + bE$
● Addition:
   ● $(a + bE) + (c + dE) = (a + c) + (b + d)E$
● Multiplication:
   ● $(a + bE)(c + dE) = ac + (ad + bc)E + bd\,E^2$
● Three types:
   ● $E^2 = -1$: Ordinary Complex Numbers
   ● $E^2 = 0$: Dual Numbers
   ● $E^2 = 1$: Double Numbers


Comparison of Methods

● There are many different methods for calculating derivatives
   ● Finite-Differences, Complex-Step Approximation, Adjoint, etc.
● Each method has its own advantages and disadvantages
   ● Accuracy
   ● Ease of implementation
   ● Computational Efficiency

Finite Differences

● Derived from the Taylor Series

$$f(x+h) = f(x) + h f'(x) + \frac{h^2}{2} f''(x) + \frac{h^3}{6} f'''(x) + \cdots$$

● Many options
   ● Forward Difference
      – First-Order Approximation

$$f'(x) = \frac{f(x+h) - f(x)}{h} + \mathcal{O}(h)$$

   ● Central Difference
      – Second-Order Approximation

$$f'(x) = \frac{f(x+h) - f(x-h)}{2h} + \mathcal{O}(h^2)$$
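A small experiment can make the step-size trade-off concrete: the truncation error shrinks with h, but subtracting nearly equal function values eventually dominates. This is an illustrative sketch only; the test function exp(x), the evaluation point, and the step sizes are assumptions chosen for the demo, not values from the talk.

```python
import math

def forward_difference(f, x, h):
    """First-order approximation: (f(x+h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h

def central_difference(f, x, h):
    """Second-order approximation: (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

if __name__ == "__main__":
    x = 1.0
    exact = math.exp(x)  # d/dx exp(x) = exp(x)
    # Errors first fall with h (truncation), then rise again (cancellation).
    for h in (1e-2, 1e-5, 1e-8, 1e-12, 1e-15):
        fd = abs(forward_difference(math.exp, x, h) - exact)
        cd = abs(central_difference(math.exp, x, h) - exact)
        print(f"h={h:.0e}  forward error={fd:.3e}  central error={cd:.3e}")
```

Running the loop shows why a finite-difference user has to search for a good step size rather than simply taking h as small as possible.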

Complex-Step Approximation

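● Taylor Series with an imaginary step

$$f(x+hi) = f(x) + h f'(x)\,i - \frac{h^2}{2} f''(x) - \frac{h^3}{6} f'''(x)\,i + \cdots$$

$$f(x+hi) = \left[ f(x) - \frac{h^2}{2} f''(x) + \mathcal{O}(h^4) \right] + i\,h \left[ f'(x) - \frac{h^2}{6} f'''(x) + \mathcal{O}(h^4) \right]$$

● First-Derivative Approximation

$$f'(x) = \frac{\mathrm{Imag}[f(x+hi)]}{h} + \mathcal{O}(h^2)$$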

Properties of Complex-Step Approximation

● Immune to Subtractive Cancellation Error.
● The Complex-Step Approximation can be used with an arbitrarily small step size to produce numerically-exact first-derivatives.
   – Eliminates the need to search for a good step size.
● Fairly easy to implement.
● What about second-derivative calculations?
   ● Does the complex-step approximation retain these properties?

Second-Derivative Complex-Step Approximations

● One possibility

$$f''(x) = \frac{2\left( f(x) - \mathrm{Real}[f(x+hi)] \right)}{h^2} + \mathcal{O}(h^2)$$

● It is possible to derive alternate formulas for the second derivative that use two different complex steps.
   ● These formulas still suffer from subtractive cancellation error.
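The origin of this formula, and of its cancellation problem, can be read off the real part of the complex-step Taylor series. The following is a short sketch of that step:

```latex
\begin{aligned}
\mathrm{Real}[f(x+hi)] &= f(x) - \frac{h^2}{2} f''(x) + \frac{h^4}{24} f''''(x) - \cdots \\
\Rightarrow\quad f''(x) &= \frac{2\left( f(x) - \mathrm{Real}[f(x+hi)] \right)}{h^2} + \mathcal{O}(h^2)
\end{aligned}
```

As h decreases, f(x) and Real[f(x+hi)] agree to more and more digits, so the numerator is a difference of nearly equal numbers, which is exactly the situation the first-derivative formula avoids.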

Immunity to Subtractive Cancellation Error

● For the Complex Step Approximation, the first-derivative term is the leading term of the imaginary part and can be extracted without a difference operation.
● This new method should have the second derivative as the leading term of a non-real part.
● Usually want both first and second derivatives
   ● Requires each to be a leading term
   ● Suggests a number with multiple non-real parts

Cross-Derivative Calculations

● For multi-variable functions, cross derivatives are computed based on previous calculations.
   ● Error is cumulative

$$\frac{\partial^2 f(x,y)}{\partial x \partial y} \approx \frac{1}{2} \left[ \frac{2\left( f(x,y) - \mathrm{Real}[f(x+hi,\, y+hi)] \right)}{h^2} - \frac{\partial^2 f(x,y)}{\partial x^2} - \frac{\partial^2 f(x,y)}{\partial y^2} \right]$$

● To eliminate this type of procedure, the perturbation of each variable should be applied to different non-real parts.
   ● Again suggesting the use of multiple non-real parts
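The cross-derivative formula follows from perturbing both variables by the same imaginary step; a brief sketch of the reasoning:

```latex
\begin{aligned}
\mathrm{Real}[f(x+hi,\, y+hi)] &= f(x,y) - \frac{h^2}{2}\left( \frac{\partial^2 f}{\partial x^2} + 2\frac{\partial^2 f}{\partial x \partial y} + \frac{\partial^2 f}{\partial y^2} \right) + \mathcal{O}(h^4) \\
\Rightarrow\quad \frac{2\left( f(x,y) - \mathrm{Real}[f(x+hi,\, y+hi)] \right)}{h^2} &= \frac{\partial^2 f}{\partial x^2} + 2\frac{\partial^2 f}{\partial x \partial y} + \frac{\partial^2 f}{\partial y^2} + \mathcal{O}(h^2)
\end{aligned}
```

Solving for the mixed term requires the two pure second derivatives, each of which carries its own approximation error, which is why the error accumulates.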

Quaternions

● $q = a + bi + cj + dk$
   $i^2 = j^2 = k^2 = -1$, $ijk = -1$, $ij = -ji = k$
   Note: multiplication is not commutative
● Taylor Series

$$f(x + h_1 i + h_2 j + 0k) = f(x) + h_1 f'(x)\,i + h_2 f'(x)\,j - \frac{h_1^2 + h_2^2}{2} f''(x) + \cdots$$

● Second Derivative Approximation

$$f''(x) = \frac{2\left( f(x) - \mathrm{Real}[f(x + h_1 i + h_2 j + 0k)] \right)}{h_1^2 + h_2^2} + \mathcal{O}(h^2)$$

● Subject to Subtractive Cancellation error

A Different Type of Number

$$x = x_1 + x_2 E_1 + x_3 E_2 + x_4 E_1 E_2$$

● Enforce multiplicative commutativity, $E_1 E_2 = E_2 E_1$
● This creates a constraint on the possible values of $E_1$ and $E_2$

$$(E_1 E_2)^2 = (E_1)^2 (E_2)^2$$

● Many possibilities:
   ● $E_1^2 = E_2^2 = -1$, $(E_1 E_2)^2 = 1$: Circular Fourcomplex
   ● $E_1^2 = E_2^2 = (E_1 E_2)^2 = 1$: Hyper-Double
   ● $E_1^2 = E_2^2 = (E_1 E_2)^2 = 0$: Hyper-Dual

Hyper-Dual Numbers

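● $E_1^2 = E_2^2 = (E_1 E_2)^2 = 0$, with $E_1 \neq E_2 \neq E_1 E_2 \neq 0$, produces the desired results
● Change to dual number notation:
   $\epsilon_1^2 = \epsilon_2^2 = (\epsilon_1\epsilon_2)^2 = 0$, $\epsilon_1 \neq \epsilon_2 \neq \epsilon_1\epsilon_2 \neq 0$
● For $d = h_1\epsilon_1 + h_2\epsilon_2 + 0\,\epsilon_1\epsilon_2$, the Taylor Series becomes

$$f(x+d) = f(x) + h_1 f'(x)\,\epsilon_1 + h_2 f'(x)\,\epsilon_2 + h_1 h_2 f''(x)\,\epsilon_1\epsilon_2$$

● This expression is exact, no truncation error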
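Why the series terminates can be checked directly from the multiplication rules. The derivation below is a sketch that uses only the definitions on this slide:

```latex
\begin{aligned}
d^2 &= (h_1\epsilon_1 + h_2\epsilon_2)^2
     = h_1^2\epsilon_1^2 + 2 h_1 h_2\,\epsilon_1\epsilon_2 + h_2^2\epsilon_2^2
     = 2 h_1 h_2\,\epsilon_1\epsilon_2, \\
d^3 &= 2 h_1 h_2\,\epsilon_1\epsilon_2\,(h_1\epsilon_1 + h_2\epsilon_2) = 0, \\
f(x+d) &= f(x) + f'(x)\,d + \frac{f''(x)}{2}\,d^2
        = f(x) + h_1 f'(x)\,\epsilon_1 + h_2 f'(x)\,\epsilon_2 + h_1 h_2 f''(x)\,\epsilon_1\epsilon_2 .
\end{aligned}
```

Every term of order three or higher contains a factor of $\epsilon_1^2$ or $\epsilon_2^2$ and vanishes, so $f'(x)$ is the $\epsilon_1$ part divided by $h_1$ and $f''(x)$ is the $\epsilon_1\epsilon_2$ part divided by $h_1 h_2$, with no subtraction involved.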


Implementation

● Implemented as a class using operator overloading in C++ and Matlab.
● This allows existing codes to be converted to use Hyper-Dual Numbers with very little modification.
   – Typically, only the variable type declarations need to be changed.
   – For codes using MPI there is slightly more that needs to be changed, such as defining new reduce operations.
   – Calls to external packages, such as PETSc, may need to be altered.
● Implementation available at http://adl.stanford.edu

Arithmetic Operations

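● Consider two hyper-dual numbers:

$$a = a_1 + a_2\epsilon_1 + a_3\epsilon_2 + a_4\epsilon_1\epsilon_2, \qquad b = b_1 + b_2\epsilon_1 + b_3\epsilon_2 + b_4\epsilon_1\epsilon_2$$

● Addition is defined as

$$a + b = (a_1 + b_1) + (a_2 + b_2)\epsilon_1 + (a_3 + b_3)\epsilon_2 + (a_4 + b_4)\epsilon_1\epsilon_2$$

● Multiplication is defined as

$$a * b = a_1 b_1 + (a_1 b_2 + a_2 b_1)\epsilon_1 + (a_1 b_3 + a_3 b_1)\epsilon_2 + (a_1 b_4 + a_2 b_3 + a_3 b_2 + a_4 b_1)\epsilon_1\epsilon_2$$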
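The two rules above translate almost directly into an operator-overloaded class. The sketch below is written in Python for brevity and is not the authors' C++/Matlab implementation; the class name `HyperDual` and the field names `real`, `e1`, `e2`, `e12` are assumptions chosen for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HyperDual:
    """a = real + e1*eps1 + e2*eps2 + e12*eps1*eps2 (illustrative names)."""
    real: float
    e1: float = 0.0
    e2: float = 0.0
    e12: float = 0.0

    @staticmethod
    def _lift(value):
        # Treat plain numbers as hyper-duals with zero non-real parts.
        return value if isinstance(value, HyperDual) else HyperDual(float(value))

    def __add__(self, other):
        b = HyperDual._lift(other)
        return HyperDual(self.real + b.real, self.e1 + b.e1,
                         self.e2 + b.e2, self.e12 + b.e12)

    __radd__ = __add__

    def __mul__(self, other):
        b = HyperDual._lift(other)
        return HyperDual(
            self.real * b.real,
            self.real * b.e1 + self.e1 * b.real,
            self.real * b.e2 + self.e2 * b.real,
            self.real * b.e12 + self.e1 * b.e2 + self.e2 * b.e1 + self.e12 * b.real,
        )

    __rmul__ = __mul__

if __name__ == "__main__":
    # f(x) = x^3 + 2x at x = 3 with h1 = h2 = 1:
    # the parts are f, f', f', f'' = 33, 29, 29, 18.
    x = HyperDual(3.0, 1.0, 1.0, 0.0)
    print(x * x * x + 2 * x)
```

Because only the number type changes, the expression `x * x * x + 2 * x` is written exactly as it would be for real numbers, which is the point of the operator-overloading approach.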

Other Operations

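● The inverse

$$\frac{1}{a} = \frac{1}{a_1} - \frac{a_2}{a_1^2}\epsilon_1 - \frac{a_3}{a_1^2}\epsilon_2 + \left( \frac{2 a_2 a_3}{a_1^3} - \frac{a_4}{a_1^2} \right)\epsilon_1\epsilon_2$$

   only exists for $a_1 \neq 0$
● This suggests a definition for the norm

$$\mathrm{norm}(a) = \sqrt{a_1^2}$$

● Which implies that comparisons should be made based only on the real part
   ● $a > b$ is equivalent to $a_1 > b_1$
● This allows the code to follow the same execution path as the real-valued code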
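The inverse can be checked against the multiplication rule, where the $\epsilon_1\epsilon_2$ part of a product $a * b$ is $a_1 b_4 + a_2 b_3 + a_3 b_2 + a_4 b_1$. As a quick sketch, write $b = 1/a$ with $b_1 = 1/a_1$, $b_2 = -a_2/a_1^2$, $b_3 = -a_3/a_1^2$, $b_4 = 2a_2a_3/a_1^3 - a_4/a_1^2$:

```latex
\begin{aligned}
\text{real part:} \quad & a_1 b_1 = 1 \\
\epsilon_1: \quad & a_1 b_2 + a_2 b_1 = -\frac{a_2}{a_1} + \frac{a_2}{a_1} = 0 \\
\epsilon_2: \quad & a_1 b_3 + a_3 b_1 = -\frac{a_3}{a_1} + \frac{a_3}{a_1} = 0 \\
\epsilon_1\epsilon_2: \quad & a_1 b_4 + a_2 b_3 + a_3 b_2 + a_4 b_1
  = \frac{2 a_2 a_3}{a_1^2} - \frac{a_4}{a_1} - \frac{2 a_2 a_3}{a_1^2} + \frac{a_4}{a_1} = 0
\end{aligned}
```

Every term divides by a power of $a_1$ only, which is why the inverse exists exactly when the real part is nonzero, whatever the non-real parts are.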

Functions

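● Differentiable functions can be defined using the Taylor Series for a generic hyper-dual number.

$$f(a) = f(a_1) + a_2 f'(a_1)\,\epsilon_1 + a_3 f'(a_1)\,\epsilon_2 + \left( a_4 f'(a_1) + a_2 a_3 f''(a_1) \right)\epsilon_1\epsilon_2$$

● For instance,

$$a^3 = a_1^3 + 3 a_2 a_1^2\,\epsilon_1 + 3 a_3 a_1^2\,\epsilon_2 + \left( 3 a_4 a_1^2 + 6 a_2 a_3 a_1 \right)\epsilon_1\epsilon_2$$

$$\sin(a) = \sin(a_1) + a_2 \cos(a_1)\,\epsilon_1 + a_3 \cos(a_1)\,\epsilon_2 + \left( a_4 \cos(a_1) - a_2 a_3 \sin(a_1) \right)\epsilon_1\epsilon_2$$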
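The generic rule can be captured once and reused for each elementary function. In the sketch below a hyper-dual number is represented as a plain 4-tuple `(a1, a2, a3, a4)`; that representation and the helper names `apply_function`, `hd_sin`, and `hd_cube` are assumptions made for illustration.

```python
import math

def apply_function(f, df, d2f, a):
    """Evaluate f at the hyper-dual number a = (a1, a2, a3, a4).

    f, df and d2f are the real-valued function and its first two derivatives.
    """
    a1, a2, a3, a4 = a
    return (
        f(a1),
        a2 * df(a1),
        a3 * df(a1),
        a4 * df(a1) + a2 * a3 * d2f(a1),
    )

def hd_sin(a):
    return apply_function(math.sin, math.cos, lambda t: -math.sin(t), a)

def hd_cube(a):
    return apply_function(lambda t: t ** 3, lambda t: 3 * t ** 2, lambda t: 6 * t, a)

if __name__ == "__main__":
    x = 0.7
    # Compose the two rules: sin(x)^3 with unit perturbations in eps1 and eps2.
    value, d_eps1, d_eps2, d_eps12 = hd_cube(hd_sin((x, 1.0, 1.0, 0.0)))
    print(value, d_eps1, d_eps12)
```

Composing the lifted functions applies the chain rule automatically: each stage only needs the first two derivatives of its own elementary function.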

Derivative Calculations

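● For $f(\mathbf{x})$, $\mathbf{x} \in \mathbb{R}^n$, i.e. $\mathbf{x} = (x_1, x_2, \ldots, x_i, \ldots, x_j, \ldots, x_n)^T$
● To calculate $\dfrac{\partial^2 f(\mathbf{x})}{\partial x_i \partial x_j}$
● Compute $f(\mathbf{x}_{ij})$ with $\mathbf{x}_{ij} = \mathbf{x} + h_1\epsilon_1\mathbf{e}_i + h_2\epsilon_2\mathbf{e}_j + \mathbf{0}\,\epsilon_1\epsilon_2$
● Which gives

$$f(\mathbf{x}_{ij}) = f(\mathbf{x}) + h_1 \frac{\partial f(\mathbf{x})}{\partial x_i}\epsilon_1 + h_2 \frac{\partial f(\mathbf{x})}{\partial x_j}\epsilon_2 + h_1 h_2 \frac{\partial^2 f(\mathbf{x})}{\partial x_i \partial x_j}\epsilon_1\epsilon_2$$

● One run provides the derivatives $\dfrac{\partial f(\mathbf{x})}{\partial x_i}$, $\dfrac{\partial f(\mathbf{x})}{\partial x_j}$, $\dfrac{\partial^2 f(\mathbf{x})}{\partial x_i \partial x_j}$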
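The procedure above can be sketched for a function of several variables. The minimal class `HD`, the helper `second_derivative_entry`, and the test function are assumptions made for illustration and are not taken from the authors' implementation.

```python
import math

class HD:
    """Minimal hyper-dual number: f0 + f1*eps1 + f2*eps2 + f12*eps1*eps2."""
    def __init__(self, f0, f1=0.0, f2=0.0, f12=0.0):
        self.f0, self.f1, self.f2, self.f12 = f0, f1, f2, f12

    def __add__(self, o):
        o = o if isinstance(o, HD) else HD(o)
        return HD(self.f0 + o.f0, self.f1 + o.f1, self.f2 + o.f2, self.f12 + o.f12)
    __radd__ = __add__

    def __mul__(self, o):
        o = o if isinstance(o, HD) else HD(o)
        return HD(self.f0 * o.f0,
                  self.f0 * o.f1 + self.f1 * o.f0,
                  self.f0 * o.f2 + self.f2 * o.f0,
                  self.f0 * o.f12 + self.f1 * o.f2 + self.f2 * o.f1 + self.f12 * o.f0)
    __rmul__ = __mul__

def hd_sin(a):
    s, c = math.sin(a.f0), math.cos(a.f0)
    return HD(s, a.f1 * c, a.f2 * c, a.f12 * c - a.f1 * a.f2 * s)

def second_derivative_entry(f, x, i, j, h1=1.0, h2=1.0):
    """Return (df/dx_i, df/dx_j, d2f/dx_i dx_j) from one hyper-dual evaluation."""
    xs = [HD(v) for v in x]
    xs[i] = xs[i] + HD(0.0, h1, 0.0, 0.0)  # perturb x_i along eps1
    xs[j] = xs[j] + HD(0.0, 0.0, h2, 0.0)  # perturb x_j along eps2
    r = f(xs)
    return r.f1 / h1, r.f2 / h2, r.f12 / (h1 * h2)

def example(xs):
    # Assumed test function: f(x, y) = x^2 y + sin(x y)
    x, y = xs
    return x * x * y + hd_sin(x * y)

if __name__ == "__main__":
    print(second_derivative_entry(example, [1.2, 0.5], 0, 1))
```

Setting `i == j` places both perturbations on the same variable and returns the pure second derivative, so the same routine fills every entry of the Hessian.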

Simple Example

$$f(x) = \sin^3 x$$

● This function can be evaluated as:

$$t_0 = x, \qquad t_1 = \sin t_0, \qquad t_2 = t_1^3$$

Hyper-Dual Evaluation

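● This function can be evaluated as:

$$t_0 = x + h_1\epsilon_1 + h_2\epsilon_2 + 0\,\epsilon_1\epsilon_2$$

$$t_1 = \sin t_0 = \sin x + h_1 \cos x\,\epsilon_1 + h_2 \cos x\,\epsilon_2 - h_1 h_2 \sin x\,\epsilon_1\epsilon_2$$

$$t_2 = t_1^3 = \sin^3 x + 3 h_1 \cos x \sin^2 x\,\epsilon_1 + 3 h_2 \cos x \sin^2 x\,\epsilon_2 - \frac{3}{4} h_1 h_2 \left( \sin x - 3 \sin 3x \right)\epsilon_1\epsilon_2$$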
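As a cross-check, the $\epsilon_1\epsilon_2$ coefficient can be compared with the second derivative obtained by hand. The sketch below uses the identity $\sin^3 x = \tfrac{1}{4}(3\sin x - \sin 3x)$:

```latex
\begin{aligned}
\frac{d}{dx}\sin^3 x &= 3\cos x \sin^2 x, \\
\frac{d^2}{dx^2}\sin^3 x &= \frac{d^2}{dx^2}\,\frac{3\sin x - \sin 3x}{4}
  = \frac{-3\sin x + 9\sin 3x}{4}
  = -\frac{3}{4}\left( \sin x - 3\sin 3x \right).
\end{aligned}
```

The $\epsilon_1$ and $\epsilon_2$ parts of $t_2$ are $h_1$ and $h_2$ times the first derivative, and the $\epsilon_1\epsilon_2$ part is $h_1 h_2$ times the second derivative, as the Taylor series for hyper-dual numbers predicts.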


CFD Results

● Hyper-Dual numbers have been applied to an unstructured, parallel, 3-D, unsteady Reynolds-averaged Navier-Stokes solver, Joe.
● Two examples
   ● Turbulent, transonic flow over an airfoil
      – Derivatives of aerodynamic coefficients with respect to free stream Mach number and angle of attack.
   ● Inviscid, supersonic flow over a wedge
      – Derivatives of the pressure ratio across the oblique shock with respect to the Mach number before the shock.

Turbulent, Transonic Airfoil

● NACA 0012
● Mach 0.8
● α = 1°
● Reynolds Number of 7e6
● Spalart-Allmaras Turbulence Model

Comparison of Derivatives

Table columns: $C_L$, $dC_L/d\alpha$, $dC_L/dM$

Joe, hyper-dual: 9.29954176747898
Joe, finite difference: 9.29952525902111

Comparison of Derivatives

Table columns: $d^2C_L/d\alpha^2$, $d^2C_L/dM^2$, $d^2C_L/d\alpha\,dM$

Joe, hyper-dual: -445.302962067975
Joe, finite difference: -432.298641328544

Mach 2 Flow Over a 15° Wedge

[Figure: oblique shock at angle β on a wedge of angle θ = 15°; upstream state M1 = 2.0, P1; downstream state M2, P2]

Error in Pressure Ratio Calculation

Error in First Derivative

Error in Second Derivative


Computational Cost

● Working with Hyper-Dual numbers is more expensive than using real numbers
   ● Hyper-Dual addition requires 4 real additions
   ● Hyper-Dual multiplication requires 9 real multiplications and 5 real additions
● Only one Hyper-Dual function evaluation is required per second derivative
   ● Finite Differences require multiple function evaluations per derivative

Gradient & Hessian Calculations

● Cost of forming the gradient and Hessian of a function of n variables

● Forward Difference
   ● $(n+1)^2$ real function evaluations
● Central Difference
   ● $2n(n+2)$ real function evaluations
● Hyper-Dual Numbers
   ● $n(n+1)/2$ Hyper-Dual function evaluations
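To get a feel for how these counts grow, the three formulas can be tabulated for a few problem sizes. The helper name `evaluation_counts` and the chosen values of n are assumptions for illustration; note that the counts are not directly comparable in cost, since each hyper-dual evaluation is more expensive than a real one.

```python
def evaluation_counts(n):
    """Function evaluations needed for the gradient and Hessian of f: R^n -> R."""
    return {
        "forward_difference": (n + 1) ** 2,   # real evaluations
        "central_difference": 2 * n * (n + 2),  # real evaluations
        "hyper_dual": n * (n + 1) // 2,       # hyper-dual evaluations
    }

if __name__ == "__main__":
    for n in (1, 3, 10, 100):
        print(n, evaluation_counts(n))
```

The hyper-dual count is the number of distinct entries in a symmetric n-by-n Hessian, since each evaluation yields one second derivative together with the two matching first derivatives.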

Actual Costs

● The Hyper-Dual number version of Joe takes on average 10 times as long to run as the real number version.
   ● As low as 7 times, or as high as 30 times (compiler dependent)
● However, the cost may be reduced if the analysis code involves an iterative procedure
   ● Converge using real numbers, then perform one iteration using hyper-dual numbers
      – For Supersonic Business Jet analysis, cost reduced by a factor of 8.
   ● 0.9 times Forward Difference, 0.46 times Central Difference


Extensions of this Method

● Derivatives of complex valued functions
   ● Change the class definition to use complex numbers instead of double precision reals.
● If interested in first derivatives only, can simplify to Dual Numbers and reduce computational cost
● Can extend to third derivatives or higher by including other terms such as $\epsilon_3$, etc.
● Partial implementations available for first two

Conclusion

● Created a new method for second-derivative calculations

● Developed and implemented hyper-dual numbers, which yield exact derivatives

● Demonstrated use on problems of realistic complexity

Questions?

Backup Slides

Division Algebra

● Real Numbers, Complex Numbers, and Quaternions are Division Algebras
   – Additive associativity, additive commutativity, additive identity, additive inverse, multiplicative associativity, multiplicative identity, multiplicative inverse, left and right distributivity
● These new numbers satisfy all these conditions except for the multiplicative inverse condition, which requires that an inverse exist for all numbers not equal to zero
   ● Instead, we have an inverse for all numbers whose norm is not equal to zero

Other Functions

● Non-differentiable functions, such as the absolute value, can be defined as a procedure

   if x < 0
      return -x
   else
      return x

Analytic Derivatives

● Analytic derivatives of the pressure ratio across an oblique shock with respect to the Mach number before the shock can be derived using an adjoint approach.

$$P_{ratio} = \frac{P_2}{P_1} = 1 + \frac{2\gamma}{\gamma + 1}\left( M_1^2 \sin^2\beta - 1 \right)$$

● The oblique shock relation is cast as a residual equation.

$$R = 2\cot\beta\, \frac{M_1^2 \sin^2\beta - 1}{M_1^2\left( \gamma + \cos 2\beta \right) + 2} - \tan\theta = 0$$

Analytic Derivatives

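$$\frac{d P_{ratio}}{d M_1} = \frac{\partial P_{ratio}}{\partial M_1} + \psi_1 \frac{\partial R}{\partial M_1}$$

$$\frac{d^2 P_{ratio}}{d M_1^2} = \frac{\partial^2 P_{ratio}}{\partial M_1^2} + \psi_1 \frac{\partial^2 R}{\partial M_1^2} + \psi_2 \frac{\partial R}{\partial M_1} + \psi_3 \left( \frac{\partial R}{\partial M_1} \right)^2$$

Where there are three adjoint equations

$$\psi_1 = -\frac{\partial P_{ratio}}{\partial \beta} \left( \frac{\partial R}{\partial \beta} \right)^{-1}$$

$$\psi_3 = -\left( \frac{\partial^2 P_{ratio}}{\partial \beta^2} + \psi_1 \frac{\partial^2 R}{\partial \beta^2} \right) \left( \frac{\partial R}{\partial \beta} \right)^{-2}$$

$$\psi_2 = -2 \left( \frac{\partial^2 P_{ratio}}{\partial M_1 \partial \beta} + \psi_1 \frac{\partial^2 R}{\partial M_1 \partial \beta} + \psi_3 \frac{\partial R}{\partial M_1} \frac{\partial R}{\partial \beta} \right) \left( \frac{\partial R}{\partial \beta} \right)^{-1}$$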

Grid Convergence
