
New Stochastic Volatility Models: PDE, Approximation, Deep Pricing and Calibration

Webinar - WBS Quantshub, 14.05.2020

Jorg Kienitz - Finciraptor, UCT, AMNA, Quaternion

UCT, BUW, Finciraptor finciraptor.de, [email protected]


Disclaimer

This presentation and any accompanying material are being provided solely for information and general illustrative purposes. The author will not be responsible for the consequences of reliance upon any information contained in or derived from the presentation or for any omission of information therefrom and hereby excludes all liability for loss or damage (including, without limitation, direct, indirect, foreseeable, or consequential loss or damage and including loss or profit and even if advised of the possibility of such damages or if such damages were foreseeable) that may be incurred or suffered by any person in connection with the presentation, including (without limitation) for the consequences of reliance upon any results derived therefrom or any error or omission whether negligent or not. No representation or warranty is made or given by the author that the presentation or any content thereof will be error free, updated, complete or that inaccuracies, errors or defects will be corrected.

The views are solely those of the author and not of any affiliated institution. The Chatham House rules apply.

The presentation may not be reproduced in whole or part or delivered to any other person without prior permission of the author.


The presentation is based on:

M. Felpel, J. Kienitz, T. McWalter, Effective Stochastic Volatility: Applications to ZABR-type Models, [8]

J. Kienitz, S. K. Acar, Q. Liang, N. Nowaczyk, The CV Makes the Difference - Control Variates for Neural Networks, [19]


SV and ML

1 SV Models
   Introduction
   (New) Stochastic Volatility Models
   Examples

2 Neural Networks
   Supervised Learning
   The Neural Network and Training
   The Control Variate Approach
   Example

3 Appendix




Discrete Volatility Surfaces

We start with the discrete implied volatility surface.

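$\mathcal{T} := \{T_1, T_2, \ldots, T_N\}$ is the set of option maturities.

$\mathcal{K} := \{\mathcal{K}_1, \mathcal{K}_2, \ldots, \mathcal{K}_N\}$, where $\mathcal{K}_i := \{K_{i,1}, K_{i,2}, \ldots, K_{i,M_i}\}$ are the sets of strike values, indexed by the maturities considered. Usually $\mathcal{K}_i = \mathcal{K}_j$ for all $1 \le i, j \le N$.

The implied volatility for each quoted option with respect to $T_i \in \mathcal{T}$, $K_j \in \mathcal{K}$ defines the map

$$\Sigma_d : \mathcal{T} \times \mathcal{K} \to \mathbb{R}_+, \qquad (T, K) \mapsto \sigma_d.$$

The map $\Sigma_d$ is called the discrete implied volatility surface.

The implied volatilities are quoted with respect to a reference model (e.g. Bachelier, Black-Scholes-Merton).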
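To make the role of the reference model concrete, the following is a minimal sketch, assuming the Bachelier (normal) model with an undiscounted call price and a generic root finder. The function names `bachelier_call` and `bachelier_implied_vol` and the sample quotes are illustrative assumptions, not part of the presentation.

```python
import math
from scipy.optimize import brentq


def bachelier_call(f, k, sigma, t):
    """Undiscounted call price in the Bachelier (normal) model."""
    s = sigma * math.sqrt(t)
    if s == 0.0:
        return max(f - k, 0.0)
    d = (f - k) / s
    pdf = math.exp(-0.5 * d * d) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))
    return (f - k) * cdf + s * pdf


def bachelier_implied_vol(price, f, k, t, lo=1e-8, hi=10.0):
    """Invert the Bachelier formula for sigma by bracketing root search."""
    return brentq(lambda s: bachelier_call(f, k, s, t) - price, lo, hi)


# A discrete surface stored as a map (T, K) -> implied volatility.
# The maturities, strikes and prices below are made-up sample values.
forward = 0.01
quotes = {(1.0, 0.008): 0.0035, (1.0, 0.012): 0.0016, (2.0, 0.010): 0.0034}
surface = {
    (t, k): bachelier_implied_vol(price, forward, k, t)
    for (t, k), price in quotes.items()
}
```

Each entry of `surface` is one value $\sigma_d$ of the map $\Sigma_d$ for a quoted pair $(T, K)$.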


Continuous Volatility Surface

For practical purposes we consider the continuous implied volatility surface given by the map

$$\Sigma_{c,0} : [0, T] \times [K_l, K_u] \to \mathbb{R}_+, \qquad (T, K) \mapsto \sigma_c.$$

Many approaches exist for modeling the dynamics of the (instantaneous) volatility and for determining $\Sigma_c$ with respect to a reference model.

We consider stochastic volatility models (SVM).

Selecting an SVM and its parameters determines $\Sigma_c$ (and its dynamics $\Sigma_{c,t}(T, K)$, $t \in \mathbb{R}_+$).

Matching to the observed discrete implied volatility surface is called calibration. Once a model is calibrated, the continuous implied volatility surface may be used for interpolation and extrapolation.


General Stochastic Volatility Models

In particular, we consider the general stochastic volatility model (GSVM) determined by the coupled SDEs

$$dF_t = C(F_t)\, v_t\, dW^{(1)}_t, \qquad F_{t_0} = f,$$

$$dv_t = \mu(v_t)\, dt + \nu(v_t)\, dW^{(2)}_t, \qquad v_{t_0} = \alpha, \tag{1.1}$$

with $d\langle W^{(1)}, W^{(2)} \rangle_t = \rho\, dt$.

This general framework covers most well-known stochastic volatility models: the SABR model (including displacements), [13], the free SABR (fSABR) model, [3], the ZABR model, [2], the Stein-Stein model, [22], the Schoebel-Zhu model, [21], and the Heston model, [16]. It also covers new variants of the classic models, including fZABR (free ZABR), mrZABR (mean reverting ZABR) and fmrSABR (free mean reverting SABR).
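The coupled SDEs (1.1) can be simulated directly, which is one way to produce Monte Carlo reference values. The following is a minimal sketch, assuming a plain Euler scheme with the volatility floored at zero; the scheme, the flooring and the function name `simulate_gsvm` are assumptions made for illustration and are not taken from the presentation.

```python
import numpy as np


def simulate_gsvm(C, mu, nu, f, alpha, rho, T,
                  n_steps=200, n_paths=20000, seed=0):
    """Euler simulation of dF = C(F) v dW1, dv = mu(v) dt + nu(v) dW2."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    F = np.full(n_paths, float(f))
    v = np.full(n_paths, float(alpha))
    for _ in range(n_steps):
        z1 = rng.standard_normal(n_paths)
        # Correlate the second driver with the first: corr(z1, z2) = rho.
        z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n_paths)
        F_next = F + C(F) * v * np.sqrt(dt) * z1
        v_next = v + mu(v) * dt + nu(v) * np.sqrt(dt) * z2
        # Flooring v at zero is a pragmatic choice for this sketch only.
        F, v = F_next, np.maximum(v_next, 0.0)
    return F, v


# Example: normal SABR, i.e. C = 1, mu = 0 and nu(v) = 0.3 * v,
# where 0.3 is an illustrative volatility-of-volatility level.
F_T, v_T = simulate_gsvm(
    C=lambda F: np.ones_like(F),
    mu=lambda v: np.zeros_like(v),
    nu=lambda v: 0.3 * v,
    f=0.01, alpha=0.005, rho=-0.5, T=1.0,
    n_steps=50, n_paths=20000, seed=1,
)
```

Other members of the family are obtained by swapping the three callables `C`, `mu` and `nu`.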


Classic and New Stochastic Volatility Models

Specific choices of the coefficients lead to ZABR-type models. In particular we consider GSVMs of the form (1.1) where the functions $\mu$, $C$ and $\nu$ are of the form given by:

µ                        C                          ν                  Model
$0$                      $F_t^{\beta}$              $v_t$              SABR
$0$                      $1$                        $v_t$              Normal SABR (nSABR)
$0$                      $(F_t + d)^{\beta}$        $v_t$              Displaced SABR
$0$                      $|F_t|^{\beta}$            $v_t$              Free SABR (fSABR)
$\kappa(\theta - v_t)$   $F_t^{\beta}$              $v_t$              Mean reverting SABR (mrSABR)
$0$                      $F_t^{\beta_1}$            $v_t^{\beta_2}$    ZABR
$0$                      $(F_t + d)^{\beta_1}$      $v_t^{\beta_2}$    Displaced ZABR
$0$                      $|F_t|^{\beta_1}$          $v_t^{\beta_2}$    Free ZABR (fZABR)
$\kappa(\theta - v_t)$   $F_t^{\beta_1}$            $v_t^{\beta_2}$    Mean reverting ZABR (mrZABR)


General Stochastic Volatility Models

The choice of model and parameters should ensure the best fit to the current (discrete) market implied volatility surface.

The dynamics should be suitable for risk management and trading of exotic contracts.

Sometimes ease of implementation determines the choice of the model, rather than model suitability.

We provide a general modeling approach with a tractable computational framework that does not require this compromise.




Effective PDE

To achieve numerical tractability, we use singular perturbation methods to derive an approximate PDE, called the effective PDE, for the marginal probability density of the asset. Here, this probability density should be understood as

$$\mathbb{P}\left[F < F_t < F + dF \,\middle|\, F_{t_0} = f,\ v_{t_0} = \alpha\right].$$

This technique was originally introduced by Hagan et al. [10, 9, 11] for SABR models.


Effective PDE

Given the reduced density for a specified exercise time $T$,

$$Q(t, F)\, dF = \mathbb{P}\left[F < F_t < F + dF \,\middle|\, F_{t_0} = f,\ v_{t_0} = \alpha\right], \tag{1.2}$$

we can recover option prices with payoff $h$ by evaluating

$$V_{h,Q}(T, K) = \int h(F)\, Q(T, F)\, dF.$$

To compute the reduced density, we derive a PDE of the form

$$\partial_t Q(t, F) = \partial_{FF}\left[D(t, F)\, Q(t, F)\right], \qquad Q(t_0, F) = \delta(F - f), \tag{1.3}$$

where $D(\cdot, \cdot)$ is a function that involves the model parameters and depends on $t$ and the asset value $F$. It can be viewed as a local volatility function.
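The pricing integral $V_{h,Q}(T,K) = \int h(F) Q(T,F)\, dF$ can be sketched with a simple quadrature rule. The example below assumes a Gaussian density as a stand-in for the solution of the effective PDE, so that the result can be compared with the Bachelier call formula; the helper name `price_from_density` and all parameter values are illustrative assumptions.

```python
import math
import numpy as np


def price_from_density(payoff, grid, density):
    """Trapezoidal approximation of the integral of payoff(F) * Q(T, F)."""
    values = payoff(grid) * density
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))


# Stand-in density: Gaussian with mean f and standard deviation sigma*sqrt(T).
f, sigma, T, K = 0.01, 0.006, 2.0, 0.012
std = sigma * math.sqrt(T)
grid = np.linspace(f - 10.0 * std, f + 10.0 * std, 4001)
density = np.exp(-0.5 * ((grid - f) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

call_from_density = price_from_density(lambda F: np.maximum(F - K, 0.0), grid, density)

# Closed-form Bachelier call price for comparison.
d = (f - K) / std
call_closed_form = (f - K) * 0.5 * (1.0 + math.erf(d / math.sqrt(2.0))) \
    + std * math.exp(-0.5 * d * d) / math.sqrt(2.0 * math.pi)
```

In an actual application the array `density` would be replaced by the numerical solution $Q(T, \cdot)$ of (1.3) on the same grid.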


Effective PDE

The effective PDE, also called the effective forward equation, is accurate to order $O(\varepsilon^2)$.

To achieve a stable and efficient numerical implementation for solving the PDE, we need to specify the boundary behavior in particular.

This leads us to consider two additional equations for the probability accumulated at the lower boundary $b_l$ and the upper boundary $b_u$, denoted by $Q_L$ and $Q_R$:

$$\partial_t Q_L(t) = \lim_{F \downarrow b_l} \partial_F\left[D(t, F)\, Q(t, F)\right], \qquad Q_L(t_0) = 0,$$

and

$$\partial_t Q_R(t) = -\lim_{F \uparrow b_u} \partial_F\left[D(t, F)\, Q(t, F)\right], \qquad Q_R(t_0) = 0.$$
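The interplay between the interior density and the accumulated boundary probabilities can be illustrated with a small finite-difference solver. This is a minimal sketch, assuming an explicit conservative scheme on cell midpoints with $D\,Q = 0$ imposed on both boundaries; it is a deliberately simple stand-in, not the scheme used for the results in this presentation, and the name `solve_effective_pde` is hypothetical.

```python
import numpy as np


def solve_effective_pde(D, f, b_l, b_u, T, n_cells=200, cfl=0.25):
    """Explicit scheme for dQ/dt = d^2/dF^2 [D Q] with absorbing boundaries.

    Returns the cell midpoints, the density Q(T, .), and the probabilities
    QL and QR accumulated at the lower and upper boundary.
    """
    h = (b_u - b_l) / n_cells
    F = b_l + (np.arange(n_cells) + 0.5) * h
    Q = np.zeros(n_cells)
    Q[np.argmin(np.abs(F - f))] = 1.0 / h  # discrete delta at the forward
    QL, QR = 0.0, 0.0

    # Time step from a stability bound on the explicit scheme.
    d_max = max(float(np.max(D(t, F))) for t in np.linspace(0.0, T, 51))
    n_steps = max(int(np.ceil(T * d_max / (cfl * h * h))), 1)
    dt = T / n_steps

    t = 0.0
    for _ in range(n_steps):
        M = D(t, F) * Q
        # Ghost cells with opposite sign enforce D*Q = 0 on the boundary faces.
        M_ext = np.concatenate(([-M[0]], M, [-M[-1]]))
        Q = Q + dt / h**2 * (M_ext[2:] - 2.0 * M_ext[1:-1] + M_ext[:-2])
        # One-sided approximations of the boundary fluxes d/dF [D Q].
        QL += dt * 2.0 * M[0] / h
        QR += dt * 2.0 * M[-1] / h
        t += dt
    return F, Q, QL, QR


# Example with a constant D = sigma^2 / 2 (normal model, sigma = 0.2).
F, Q, QL, QR = solve_effective_pde(
    D=lambda t, F: 0.5 * 0.2**2 * np.ones_like(F),
    f=0.0, b_l=-1.0, b_u=1.0, T=1.0,
)
h = F[1] - F[0]
total_probability = h * Q.sum() + QL + QR
```

By construction the mass leaving the grid equals the mass added to `QL` and `QR`, so `total_probability` stays at one up to rounding.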


Effective PDE

Theorem 1

Let the GSVM (1.1) obey Assumptions I to IV (see the appendix). Then the effective PDE (1.3) for the effective probability (1.2) is given by the coefficient

$$D(t, F) = \frac{1}{2}\, a(t)^2\, C(F)^2\, e^{G(t)} \left(1 + 2 b(t) z(F) + c(t) z(F)^2\right),$$

where the coefficients are specified as

$$a(t) = Y(t, t_0, \alpha), \qquad b(t) = \frac{1}{a(t) s(t)}\, I_1(t),$$

$$c(t) = b(t)^2 + \frac{1}{a(t) s(t)^2}\, I_2(t) - \frac{6 b(t)}{s(t)^2}\, I_3(t) + \frac{2}{a(t) s(t)^2}\, I_4(t),$$

$$G(t) = -s(t) c(t) - s(t) b(t) \Gamma_0 + \frac{1}{a(t)^2}\, I_5(t), \qquad \Gamma_0 = -C'(f).$$


Effective PDE

The general expressions for $I_1$ to $I_5$ can be found in the Appendix.

Time-dependent parameters can be handled.

We can also apply effective parameters (parameter averaging, see [1] or [12]).


Implied Volatility Formulae using Effective Parameters

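For a fixed maturity $T$ we derive effective parameters for the GSVM:

$$\bar{b} = \frac{2}{T^2} \int_0^T u\, b(u)\, du,$$

$$\bar{c} = \frac{3}{T^3} \int_0^T u^2 c(u)\, du + \frac{18}{T^3} \int_0^T b(u) \int_0^u v\, b(v)\, dv\, du - 3 \bar{b}^2,$$

$$\bar{G} = \frac{1}{T} \int_0^T G(u)\, du + \frac{1}{T} \int_0^T u \left(c(u) - \bar{c}\right) du.$$

These constant parameters are used to determine effective SABR parameters:

$$\nu_{\mathrm{eff}} = \sqrt{\bar{c}}, \qquad \rho_{\mathrm{eff}} = \frac{\bar{b}}{\sqrt{\bar{c}}}, \qquad \alpha_{\mathrm{eff}} = \alpha \left(1 + \frac{1}{2} \bar{G} + \frac{1}{4} \alpha \bar{b}\, \Gamma_0 T\right). \tag{1.4}$$

This allows us to approximate the implied volatility to order $O(\varepsilon^2)$ with the SABR approximation formula.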
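The averaging integrals and the mapping to effective SABR parameters can be sketched with numerical quadrature. The following is a minimal sketch, assuming $t_0 = 0$ and that $b$, $c$ and $G$ are available as Python callables; the function names `effective_parameters` and `effective_sabr` are illustrative assumptions.

```python
import math
from scipy.integrate import quad


def effective_parameters(b, c, G, T):
    """Constant (averaged) parameters for maturity T, assuming t0 = 0."""
    b_bar = 2.0 / T**2 * quad(lambda u: u * b(u), 0.0, T)[0]

    def inner(u):
        # Inner integral of v * b(v) from 0 to u.
        return quad(lambda v: v * b(v), 0.0, u)[0]

    c_bar = (
        3.0 / T**3 * quad(lambda u: u**2 * c(u), 0.0, T)[0]
        + 18.0 / T**3 * quad(lambda u: b(u) * inner(u), 0.0, T)[0]
        - 3.0 * b_bar**2
    )
    G_bar = (
        quad(G, 0.0, T)[0] + quad(lambda u: u * (c(u) - c_bar), 0.0, T)[0]
    ) / T
    return b_bar, c_bar, G_bar


def effective_sabr(alpha, Gamma0, b_bar, c_bar, G_bar, T):
    """Map the averaged parameters to effective SABR parameters, eq. (1.4)."""
    nu_eff = math.sqrt(c_bar)
    rho_eff = b_bar / nu_eff
    alpha_eff = alpha * (1.0 + 0.5 * G_bar + 0.25 * alpha * b_bar * Gamma0 * T)
    return alpha_eff, nu_eff, rho_eff


# Sanity check with constant coefficients (illustrative values): the averages
# should reproduce the constants themselves.
b_bar, c_bar, G_bar = effective_parameters(
    b=lambda u: 0.3, c=lambda u: 0.5, G=lambda u: -0.1, T=2.0
)
```

For constant $b$ and $c$ the double-integral term equals $3b^2$ and cancels against $-3\bar{b}^2$, which is a quick consistency check of the formula for $\bar{c}$.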


Example: ZABR

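For the ZABR model the explicit parameters are:

$$\nu_{\mathrm{eff}} = \nu \alpha^{\beta_2 - 1} \sqrt{1 + (\beta_2 - 1) \rho^2},$$

$$\rho_{\mathrm{eff}} = \frac{\rho}{\sqrt{1 + (\beta_2 - 1) \rho^2}},$$

$$\alpha_{\mathrm{eff}} = \alpha \left(1 + \frac{1}{4} \rho^2 \nu^2 \alpha^{2(\beta_2 - 1)} (1 - \beta_2)\, T\right). \tag{1.5}$$

To guarantee that the term $\sqrt{D(t, F)}$ remains real, we further impose the condition

$$\beta_2 > 1 + \frac{\rho^2 - 1}{\rho^2}.$$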
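The explicit ZABR mapping (1.5), together with the admissibility condition on $\beta_2$, translates into a few lines of code. This is a minimal sketch; the function name `zabr_effective_sabr` and the choice to raise an exception when the condition fails are assumptions for illustration.

```python
import math


def zabr_effective_sabr(alpha, nu, rho, beta2, T):
    """Effective SABR parameters (alpha_eff, nu_eff, rho_eff) for ZABR."""
    if rho != 0.0:
        lower_bound = 1.0 + (rho**2 - 1.0) / rho**2
        if beta2 <= lower_bound:
            raise ValueError(
                f"beta2 = {beta2} violates beta2 > {lower_bound:.4f}"
            )
    root = math.sqrt(1.0 + (beta2 - 1.0) * rho**2)
    nu_eff = nu * alpha ** (beta2 - 1.0) * root
    rho_eff = rho / root
    alpha_eff = alpha * (
        1.0
        + 0.25 * rho**2 * nu**2 * alpha ** (2.0 * (beta2 - 1.0)) * (1.0 - beta2) * T
    )
    return alpha_eff, nu_eff, rho_eff


# For beta2 = 1 the ZABR model reduces to SABR and the mapping is the identity.
identity_case = zabr_effective_sabr(alpha=0.2, nu=0.3, rho=-0.8, beta2=1.0, T=1.0)
# An example with beta2 = 0.8 and rho = -0.8.
example_case = zabr_effective_sabr(alpha=0.2, nu=0.3, rho=-0.8, beta2=0.8, T=1.0)
```

The resulting triple can be passed to any standard SABR implied volatility approximation.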


Example: mrZABR

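Considering the mean-reverting ZABR model with a reversion back to the initial state, i.e., $\theta = \alpha$:

$$b(t) = \frac{\rho \nu \alpha^{\beta_2 - 2}}{\kappa (t - t_0)} \left(1 - e^{-\kappa (t - t_0)}\right),$$

$$c(t) = \frac{(1 + \rho^2) \nu^2 \alpha^{2(\beta_2 - 2)}}{\kappa^2 (t - t_0)^2} \left(1 - e^{-\kappa (t - t_0)}\right)^2 + \frac{6 \rho^2 \nu^2 \alpha^{2(\beta_2 - 2)}}{\kappa^3 (t - t_0)^3} \left(1 - e^{-\kappa (t - t_0)}\right) \left(1 - \kappa (t - t_0) - e^{-\kappa (t - t_0)}\right) + (1 + \beta_2) \frac{2 \rho^2 \nu^2 \alpha^{2(\beta_2 - 2)}}{\kappa^2 (t - t_0)^2} \left(1 - (1 + \kappa (t - t_0))\, e^{-\kappa (t - t_0)}\right),$$

$$G(t) = -\alpha^2 (t - t_0)\, c(t) - \frac{\rho \nu \alpha^{\beta_2}}{\kappa} \left(1 - e^{-\kappa (t - t_0)}\right) \Gamma_0 + \frac{\nu^2 \alpha^{2(\beta_2 - 1)}}{2 \kappa} \left(1 - e^{-2 \kappa (t - t_0)}\right).$$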


Example: mrZABR

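Fixing a specified maturity $T$, the corresponding constant effective parameters are:

$$\bar{b} = \frac{2 \rho \nu \alpha^{\beta_2 - 1}}{\kappa^2 T^2} \left(\kappa T - 1 + e^{-\kappa T}\right),$$

$$\bar{c} = \frac{3 (1 + \rho^2) \nu^2 \alpha^{2(\beta_2 - 1)}}{2 (\kappa T)^3} \left(2 \kappa T + 4 e^{-\kappa T} - 3 - e^{-2 \kappa T}\right) + 6 (1 + \beta_2) \frac{\rho^2 \nu^2 \alpha^{2(\beta_2 - 1)}}{(\kappa T)^3} \left(\kappa T + 2 e^{-\kappa T} - 2 + \kappa T e^{-\kappa T}\right) - 12 \rho^2 \nu^2 \alpha^{2(\beta_2 - 1)} \left(\frac{\kappa T - 1 + e^{-\kappa T}}{(\kappa T)^2}\right)^2,$$

$$\bar{G} = \frac{\nu^2 \alpha^{2(\beta_2 - 1)}}{4 \kappa^2 T} \left(2 \kappa T + e^{-2 \kappa T} - 1\right) - \frac{1}{2} \bar{c}\, T - \frac{\rho \nu \alpha^{\beta_2}}{\kappa^2 T} \left(\kappa T - 1 + e^{-\kappa T}\right) \Gamma_0.$$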
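A useful plausibility check for the mrZABR effective parameters is the limit of vanishing mean reversion, where they should approach the ZABR values $\bar{b} \to \rho\nu\alpha^{\beta_2-1}$ and $\bar{c} \to \nu^2\alpha^{2(\beta_2-1)}(1 + (\beta_2 - 1)\rho^2)$. The sketch below assumes that the last term of $\bar{c}$ is $-3\bar{b}^2$, in line with the general averaging formula; the function name `mrzabr_effective` and the parameter values are illustrative.

```python
import math


def mrzabr_effective(alpha, nu, rho, beta2, kappa, T):
    """Averaged parameters b_bar and c_bar for mrZABR with theta = alpha."""
    x = kappa * T
    e1 = math.exp(-x)
    e2 = math.exp(-2.0 * x)
    scale = nu**2 * alpha ** (2.0 * (beta2 - 1.0))
    b_bar = 2.0 * rho * nu * alpha ** (beta2 - 1.0) / x**2 * (x - 1.0 + e1)
    c_bar = (
        3.0 * (1.0 + rho**2) * scale / (2.0 * x**3) * (2.0 * x + 4.0 * e1 - 3.0 - e2)
        + 6.0 * (1.0 + beta2) * rho**2 * scale / x**3 * (x + 2.0 * e1 - 2.0 + x * e1)
        - 3.0 * b_bar**2  # assumption: last term read as -3 * b_bar^2
    )
    return b_bar, c_bar


alpha, nu, rho, beta2, T = 0.2, 0.3, -0.5, 0.8, 1.0

# Small mean reversion speed: results should be close to the ZABR values.
b_small_kappa, c_small_kappa = mrzabr_effective(alpha, nu, rho, beta2, kappa=0.01, T=T)
b_zabr = rho * nu * alpha ** (beta2 - 1.0)
c_zabr = nu**2 * alpha ** (2.0 * (beta2 - 1.0)) * (1.0 + (beta2 - 1.0) * rho**2)
```

For very small `kappa * T` the bracketed terms suffer from cancellation, so a series expansion would be preferable in production code.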




Examples for Probability Distributions

Figure 1 shows the output obtained by numerically solving the effective PDE. It is the density of the asset at maturity and depends on all the input parameters.

[Plot: Density from Effective Equation. x-axis: Strikes in Forward Units. Curves: SABR, ZABR, mrSABR, mrZABR.]

Figure: Output from numerically solving the effective PDE for SABR, ZABR, mrSABR and mrZABR.


Examples of Implied Bachelier Volatilities

[Six plots of implied Bachelier volatility (effective vs MC), x-axis: Strikes in bp. Panels: SABR, ZABR, mrSABR, mrZABR, fSABR, fZABR, each showing the effective-equation curve and the Monte Carlo curve.]

Figure: Implied Bachelier volatility computed from the Call option prices obtained from the effective equation and Monte Carlo simulation for the SABR (top left), ZABR (top right), mrSABR (mid left), mrZABR (mid right), fSABR (bottom left) and fZABR (bottom right).


[Two plots of implied Bachelier volatility, x-axis: Strikes. Left: ZABR and ZABR_Shifted. Right: mrZABR and mrZABR_Shifted.]

Figure: The implied volatility for the ZABR model with parameters β = 0.5, β2 = 0.8, ν = 0.3, ρ = −0.8, an underlying forward rate of 0.005, which is shifted by 0.002, and a displacement of 0.001 (left) and mrZABR with mean reversion of κ = 0.2 and shift (right).


ZABR with different CEV parameters (volatility)

[Plot of implied volatility, x-axis: Strikes. Curves: SABR, ZABR_0.9, ZABR_0.8, ZABR_0.7.]

Figure: Implied volatility for the ZABR model when β2 changes.


mrZABR with different reversion rates

[Plot of implied volatility, x-axis: Strikes. Curves: ZABR, mrZABR_0.1, mrZABR_0.5, mrZABR_0.8.]

Figure: Implied volatility for the mean reverting ZABR when κ changes.


ZABR Density with different CEV parameters (volatility)

[Plot: Density Function. x-axis: Strikes in Forward Units. Curves: SABR, ZABR_0.9, ZABR_0.8, ZABR_0.7.]

Figure: Density for the ZABR model when β2 changes. This model may lead to higher and steeper peaks in the density function, compared with the SABR model.


fZABR

[Plot of implied volatility, x-axis: Strikes. Curves: fZABR, fZABR_Shifted.]

Figure: Free ZABR implied volatility with parameters as in Figure 3.




Supervised Learning

The setting we consider is supervised learning, where we are given a set

$$D := \{(x_1, y_1), \ldots, (x_n, y_n)\},$$

with $x_i \in \mathbb{R}^d$, $y_i \in \mathbb{R}^l$ for $i = 1, \ldots, n$.

$X := (x_1, \ldots, x_n)$ are the inputs and $Y := (y_1, \ldots, y_n)$ the targets or labels.

The dimension $l$ is the dimension of the targets.

We realize the supervised learning using a deep neural network approach.




The neural network

The neural network we use consists of one input layer (#nodes = d), with d being the input tensor dimension.

The number of nodes of the output layer corresponds to the dimension l of the labels y, i.e. the quantities that we wish to learn.

We stack a number of hidden layers l_i, i = 1, . . . , n, on top of the input layer, each having n_i nodes.

For each hidden layer we specify an activation function (elu in our experiments), and we use a linear activation for the output layer.


The Training

We create the data (approximately 40,000 samples) for SABR, e.g. using

    alphaMin = 0.01; alphaMax = 0.3; alphaDelta = alphaMax - alphaMin
    betaMin = 0.01; betaMax = 0.4; betaDelta = betaMax - betaMin
    nuMin = 0.01; nuMax = 0.25; nuDelta = nuMax - nuMin
    rhoMin = -0.9; rhoMax = -0.5; rhoDelta = rhoMax - rhoMin
    TMin = 1; TMax = 2.5; TDelta = TMax - TMin

The strike values are set to

    strikes = np.linspace(kmin, kmax, Nk)

The data set is scaled.

For the training we split the sets X, Y into training (Xtrain, Ytrain) and validation (aka test) sets (Xval, Yval).

We use 15% of the data for the validation set.

We choose a cost function that measures the difference between the ANN's output and the true values Ytrain.

The optimization is done using the Adam optimizer, [20].
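The data preparation steps above can be sketched as follows. This is a minimal sketch, assuming uniform sampling of each parameter between its minimum and maximum, min-max scaling of the inputs, and a random 85/15 split; the sampling distribution, the scaling method and all variable names are assumptions, since the slides only state the parameter ranges, that the data is scaled, and the split ratio.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples = 40_000

# Parameter ranges (min, max) as listed above.
bounds = {
    "alpha": (0.01, 0.3),
    "beta": (0.01, 0.4),
    "nu": (0.01, 0.25),
    "rho": (-0.9, -0.5),
    "T": (1.0, 2.5),
}
lower = np.array([lo for lo, _ in bounds.values()])
upper = np.array([hi for _, hi in bounds.values()])

# Uniform samples: min + delta * U(0, 1) for every parameter.
X = lower + (upper - lower) * rng.random((n_samples, len(bounds)))

# Min-max scaling of the inputs to the unit cube.
X_scaled = (X - lower) / (upper - lower)

# Random split: 15% of the samples are held out for validation.
permutation = rng.permutation(n_samples)
n_val = round(0.15 * n_samples)
val_idx, train_idx = permutation[:n_val], permutation[n_val:]
X_train, X_val = X_scaled[train_idx], X_scaled[val_idx]
```

The labels Y (prices or implied volatilities on the strike grid) would be computed from the unscaled parameters `X` by the chosen pricing method and split with the same index arrays.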


[Four plots, x-axis: Strike. Left column: "SABR Implied Bachelier Smile" (y-axis: Implied Bachelier volatility) and "SABR - Call Option Prices" (y-axis: Call Option Price). Right column: differences CV vs SABR in implied Bachelier volatility and in call option price.]

Figure: (Left) Some realizations for implied Bachelier volatilities / prices computed by PDE and approximation formula. (Right) Differences of the methods.


The testing phase

For testing we consider the model's learning history.

If the results are not satisfactory we may alter the topology of the network by adding or removing layers or nodes.

General methods, including cross-validation and the treatment of over-fitting issues, are not repeated here and can be found in [14].

Finally, the result is applied to newly generated data that was used neither during training nor during validation.




The Control Variate Approach

To apply a control variate approach to Deep Learning, we consider another set $Y_{\mathrm{CV}}$ and set $Y_{\mathrm{new}} := Y - Y_{\mathrm{CV}}$.

We apply the training to $(X, Y_{\mathrm{new}})$. It remains to choose the set $Y_{\mathrm{CV}}$.

Once the relationship between $X$ and $Y_{\mathrm{new}}$ has been learned, we use the ANN to predict values. To this end let $x_{\mathrm{input}}$ be the input, $y_{\mathrm{predict}}$ the prediction derived by applying the ANN and $y_{\mathrm{cv}}$ the value derived by applying the control variate.

Then, we derive an approximation to the true value by setting

$$y_{\mathrm{true}} \approx y_{\mathrm{cv}} + y_{\mathrm{predict}}.$$

We use the loss function for 'extreme' cases to match the 'true' solution.
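The mechanics of difference learning can be illustrated without a neural network. In the toy sketch below a low-degree polynomial fit stands in for the ANN, and the functions `true_value` and `control_variate` are made-up examples chosen so that the control variate captures most of the shape of the target; none of this is taken from the presentation.

```python
import numpy as np


def true_value(x):
    """Made-up 'expensive' target function."""
    return np.sin(3.0 * x) + 0.05 * x**2


def control_variate(x):
    """Made-up cheap approximation of the target."""
    return np.sin(3.0 * x)


x_train = np.linspace(0.0, 1.0, 200)
y = true_value(x_train)
y_cv = control_variate(x_train)
y_new = y - y_cv  # labels for difference learning

# A degree-2 polynomial stands in for the network in both approaches.
degree = 2
plain_model = np.polynomial.Polynomial.fit(x_train, y, degree)
cv_model = np.polynomial.Polynomial.fit(x_train, y_new, degree)

# Prediction on unseen inputs: y_true is approximated by y_cv + y_predict.
x_test = np.linspace(0.0, 1.0, 57)
err_plain = np.max(np.abs(plain_model(x_test) - true_value(x_test)))
err_cv = np.max(np.abs(control_variate(x_test) + cv_model(x_test) - true_value(x_test)))
```

With the same model capacity, the residual is a much simpler function than the target, so `err_cv` is far smaller than `err_plain` in this example.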



The CV method can be used in two ways. To this end let $n$ be the network:

Learn in a region $R$ where the chosen CV is reasonably good, using the objective function (empirical risk):

$$l_1(y, n(x)) = \|y - n(x)\|_2^2, \qquad x \in R.$$

Learn in a region $R$ where the chosen CV is reasonably good and also outside the region, $x \notin R$, using the objective function (regularized empirical risk):

$$l_2(y, n(x)) = 1_{\{x \in R\}}\, l_1(y, n(x)) + \beta\, 1_{\{x \notin R\}}\, \|A \cdot (y_{\mathrm{cv}}(x) - (y_{\mathrm{cv}} + n(x)))\|_2^2,$$

with $A$ a linear map, $\beta$ a scalar and $y_{\mathrm{cv}}$ a control variate for $x \notin R$.

The second term is a regularization and we may take $A = \mathrm{Id}$.
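The regularized objective can be written down per sample as in the following minimal sketch. It assumes that $y_{\mathrm{cv}}$ inside the norm is evaluated at the same $x$, so that the penalty argument reduces to $-n(x)$; the function name `cv_loss` and the array layout are assumptions for illustration.

```python
import numpy as np


def cv_loss(y, n_x, in_region, beta=1.0, A=None):
    """Per-sample loss l2: squared error inside R, penalty on n(x) outside R.

    y, n_x    : arrays of shape (batch, l) with labels and network outputs
    in_region : boolean array of shape (batch,), True where x lies in R
    A         : optional (m, l) matrix; None means A = Id
    """
    l1 = np.sum((y - n_x) ** 2, axis=1)
    # y_cv(x) - (y_cv(x) + n(x)) = -n(x): outside R the network output is
    # pushed towards zero so that the prediction falls back to the CV.
    penalty_arg = -n_x if A is None else (-n_x) @ A.T
    penalty = np.sum(penalty_arg**2, axis=1)
    return np.where(in_region, l1, beta * penalty)


labels = np.array([[1.0, 2.0], [0.0, 0.0]])
outputs = np.array([[1.0, 1.0], [3.0, 4.0]])
inside = np.array([True, False])
losses = cv_loss(labels, outputs, inside, beta=0.5)
```

In a training loop the mean of these per-sample values would be minimized; the labels of samples outside $R$ do not enter the loss.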



The CV method can be used in two ways:

'Cheap to compute' CV + learn the difference (Difference Learning). Final price: CV + learned differences within a training region. Then, use a penalty in the loss function for 'extreme' cases to control the behaviour of the approximation outside a given region. In this way we aim to use the exact numerical method (PDE, integration, etc.) as CV outside the training region to stabilize the approximation, and we use a simple deep learning architecture (feedforward, fully connected).

Choose a (possibly expensive to compute) CV to determine the asymptotics and learn the difference (Asymptotic Learning). Suggested in [4]. The aim is to control the asymptotic behaviour. They suggest methods for achieving that (spline methods and Constrained Radial Layers). This needs a sophisticated deep learning architecture.


Control Variates

In practice we often face situations where reasonably good approximation formulae exist for a given model:

(i) A special case of a model, e.g. an SVM with ρ = 0, or an all-purpose approximation taking into account all model parameters.

(ii) A different model, e.g. the Black-Scholes model for computing Heston model prices, [16].

(iii) Standard contracts close to an exotic, e.g. Bermudan swaptions, see for instance [18].

(iv) Markov projection for baskets of SVM or LV.

These are exactly the cases where the control variate can be applied in the sense of difference learning. Instead of learning the values of the set Y, we only learn the labels Ynew, obtained by subtracting the control variate YCV, i.e. the approximation, analytic or vanilla price.


Calibration with ANN

Using ANNs to calibrate models, we have different choices.

Inverse map approach

Learn the inverse map from observable market data. This approach needs a lot of observed market data for training. This might be a bottleneck.

Learn the inverse map from model prices (training + validation) and use observable market data as test input. It is possible to have as much training data as we want. Learning the inverse map directly may suffer from instabilities, see [15, 7].

Two-step approach

Learn the pricing and calibrate using standard techniques applied to the learned pricing. It is possible to create as many samples for training/validation as we like using some pricing function. Separation of pricing and calibration leads to stability. Calibration is lightning fast, see [6, 5, 17].


Calibration with ANN

Applying a two-step ML approach we see the following advantages:

Independence of the pricing approximation
For each model the most favourable pricing approximation can be used. Since the generation of prices is separated from the actual calibration, we can even rely on Monte Carlo methods.

Availability of training data
It is possible to generate as much training data as we like. We could also use different price approximations, e.g. net architectures for different parameter sets.

Interpretability
The interpretability of the results is the same as in the classic approach. Since we work with models instead of purely ANN-based methods, the model parameters have the same meaning as in the classic approach. The ANN is nothing but a complex black-box approximation, and we need to assure that it works.

We directly benefit from the improved pricing methodology using the CV, since for calibration the optimizer calls the trained ANN pricing function.
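The second step of the two-step approach, calibrating with a standard optimizer against a fast pricing function, can be sketched as follows. The function `surrogate_smile` is a made-up quadratic stand-in for the trained ANN (plus control variate), and all parameter values are illustrative; only the structure of the calibration loop is the point here.

```python
import numpy as np
from scipy.optimize import least_squares


def surrogate_smile(params, strikes):
    """Stand-in for the trained pricing network: a quadratic smile."""
    level, skew, curvature = params
    m = strikes - 1.0
    return level + skew * m + curvature * m**2


def calibrate(pricer, strikes, market_quotes, initial_guess):
    """Least-squares fit of the pricer's parameters to the market quotes."""
    def residuals(params):
        return pricer(params, strikes) - market_quotes

    return least_squares(residuals, initial_guess).x


strikes = np.linspace(0.5, 1.5, 11)
true_params = np.array([0.20, -0.10, 0.30])
market_quotes = surrogate_smile(true_params, strikes)  # synthetic 'market'

calibrated = calibrate(surrogate_smile, strikes, market_quotes,
                       initial_guess=np.array([0.1, 0.0, 0.0]))
```

Because every residual evaluation is a cheap function call, the optimizer can afford many iterations, which is where the speed of this approach comes from.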




The Neural Network

The deep neural network we choose for the examples:

    import keras

    keras.backend.set_floatx('float64')  # use double precision throughout

    Nparams = 5  # number of inputs: alpha, beta, nu, rho, T
    Nk = 21      # number of strikes per smile (illustrative value, not stated on the slide)

    input1 = keras.layers.Input(shape=(Nparams,))             # input layer
    x1 = keras.layers.Dense(20, activation='elu')(input1)     # hidden layer 1
    x2 = keras.layers.Dense(20, activation='elu')(x1)         # hidden layer 2
    x3 = keras.layers.Dense(20, activation='elu')(x2)         # hidden layer 3
    x4 = keras.layers.Dense(Nk, activation='linear')(x3)      # output layer; size depends on option surface

    # set up the model
    modelGENp = keras.models.Model(inputs=input1, outputs=x4)
    modelGENp.summary()

We use only 40,000 samples.


The SABR Model - Training

We train two neural networks on varying parameters α, β, ν, ρ and T, keeping the forward equal to 1 (without loss of generality, due to the transformation properties of the SABR model).

We train on a log-moneyness range from −0.5 to 1.5.

One network uses the standard approach without a control variate, and the other applies the SABR approximation formula for the Bachelier volatility as control variate.

The resulting errors and the standard deviation are much smaller for the latter case.

For the standard approach we have $e_{\mathrm{abs,std}} = 0.00373$ and an average error of $e_{\mathrm{av,std}} = 4.17 \times 10^{-5}$; for the control variate approach we have $e_{\mathrm{abs,cv}} = 0.00034$ and $e_{\mathrm{av,cv}} = 3.99 \times 10^{-6}$. The absolute error is roughly 10 times smaller for the control variate technique.


SABR Model - Learning Call Prices

[Four plots of model loss (train and validation) against epoch, 0 to 200. Panels: "model loss SABR PDE price", "model loss SABR cv price", "model loss SABR PDE vol", "model loss SABR cv vol".]

Figure: Learning history for PDE prices/implied Bachelier volatilities. (Left) Standard (Right) CV.


The SABR Model - Learning Call Prices

[Three error histograms, x-axis: Error, y-axis: Number of Data points. Top: plain and cv together. Bottom: plain and cv separately.]

Figure: (Top) Error histogram for CV and Standard approach. (Bottom) Histograms for CV and Standard approach.


The SABR Model - Learning Implied Volatilities

[Histogram of errors for the cv and plain approaches, y-axis: number of data points.]

Figure: For a given maturity we show the relative error for the standard (blue) and the control variate (orange) approach. The errors are calculated along the strike range of moneyness from 0.5 to 1.5 for implied volatilities.


The SABR Model

To further illustrate the superiority of the control variate approach we consider the relative errors for a given maturity for moneyness levels from 0.5 to 1.5.

[Plot: "SABR relative error, full training vs cv training". x-axis: moneyness, y-axis: percentage.]

Figure: For a given maturity we show the relative error for the standard (orange) and the CV (blue) approach along the strike range of moneyness from 0.5 to 1.5.


SABR Model - Learning Call Prices

Figure: Trained neural net with CV applied to unseen data. (Top) Prices,(bottom) Differences in bp.


Pitfalls

Figure: Trained neural net with CV applied to unseen data.

The CV approach does not automatically account for data failures when learning the pricing function.

If there is erroneous data, the ANN is trained as if the data were correct.

To illustrate this, we used some corrupted data when training the pricing within a SABR model.


Summary and Conclusions

General Stochastic Volatility Models (GSVM) were considered.

New variants of classical models were considered.

PDE and approximation methods were introduced and illustrated with examples.

Neural networks (ANN) with Control Variates (CV) were considered.

The CV method stabilizes the ANN approach for pricing and calibration.

An extensive example from the GSVM class was given: the SABR model.


Appendix


To derive the effective PDE we make the following assumptions related to (1.1):

Assumption I

The drift term, $\mu(\cdot)$, is differentiable, with derivative $\mu'(\cdot)$, and a solution $Y(t, t_0, \alpha)$ to the following differential equation exists:

$$\partial_t Y(t, t_0, \alpha) = \mu(Y(t, t_0, \alpha)),$$

$$Y(t, t, \alpha) = \alpha,$$

$$Y(t_0, t_0, \alpha) = \alpha.$$


Assumption II

The function $Y$ is differentiable and has an inverse function $y(t_0, t, a)$ such that

$$Y(t, t_0, \alpha) = a \iff \alpha = y(t_0, t, a).$$

Remark 3.1

Functions $\mu(\cdot)$ allowing a closed-form solution include:

(i) for $\mu(x) = \mu$ the solution is $Y(t, t_0, \alpha) = \alpha + \mu (t - t_0)$.

(ii) for $\mu(x) = \kappa(\theta - x)$ the solution is $Y(t, t_0, \alpha) = \alpha e^{-\kappa (t - t_0)} + \theta \left(1 - e^{-\kappa (t - t_0)}\right)$.
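The closed-form solution in case (ii) can be checked against a numerical integration of $\partial_t Y = \mu(Y)$. The following is a minimal sketch with illustrative parameter values; the function name `Y_mean_reverting` is an assumption.

```python
import math
from scipy.integrate import solve_ivp


def Y_mean_reverting(t, t0, alpha, kappa, theta):
    """Closed-form solution for mu(x) = kappa * (theta - x)."""
    decay = math.exp(-kappa * (t - t0))
    return alpha * decay + theta * (1.0 - decay)


kappa, theta, alpha, t0, t1 = 0.5, 0.04, 0.02, 0.0, 3.0

# Integrate dY/dt = kappa * (theta - Y) with Y(t0) = alpha numerically.
solution = solve_ivp(
    lambda t, y: kappa * (theta - y),
    (t0, t1),
    [alpha],
    rtol=1e-10,
    atol=1e-12,
)
Y_numeric = solution.y[0, -1]
Y_closed = Y_mean_reverting(t1, t0, alpha, kappa, theta)
```

The same pattern can be used to obtain $Y$ numerically for drift functions without a closed-form solution.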


Assumption III

The functions

$$X(t, t_0, \alpha) = \partial_\alpha Y(t, t_0, \alpha),$$

$$Z(t, u) = Z(t, u, t_0, \alpha) = y(u, t, Y(t, t_0, \alpha)),$$

$$z(F) = \int_f^F \frac{1}{C(u)}\, du,$$

$$s(t) = S(t_0, t, \alpha) = \int_{t_0}^t Z(t, u, t_0, \alpha)^2\, du$$

and

$$\psi(t, u, Z) = \nu\big(Z(t, u)\big)\, Z(t, u)\, X\big(t, u, Z(t, u)\big)$$

are well defined, $X\big(t, u, Z(t, u)\big)^{-1}$ exists, and the following integral functions are defined:

$$I_1(t) = \rho \int_{t_0}^t \psi(t, u, Z)\, du,$$

$$I_2(t) = 2 \int_{t_0}^t \nu\big(Z(t, u)\big)^2 X\big(t, u, Z(t, u)\big)^2 \int_u^t Z(t, v)\, X\big(t, v, Z(t, v)\big)^{-1}\, dv\, du,$$

$$I_3(t) = \rho \int_{t_0}^t \psi(t, u, Z) \int_u^t Z(t, v)\, X\big(t, v, Z(t, v)\big)^{-1}\, dv\, du,$$

$$I_4(t) = \rho^2 \int_{t_0}^t \psi(t, u, Z) \int_u^t \partial_Z\big(\psi(t, v, Z)\big)\, X\big(t, v, Z(t, v)\big)^{-1}\, dv\, du,$$

$$I_5(t) = \int_{t_0}^t \nu\big(Z(t, u)\big)^2 X\big(t, u, Z(t, u)\big)^2\, du.$$


Assumption IV

Assumption IV

The function $C(\cdot)$ is differentiable at $f$, with derivative denoted by $C'(\cdot)$.


Literature I

[1] Andersen, L. and Piterbarg, V. Interest Rate Modeling - Volume I: Foundations and Vanilla Models. Atlantic Financial Press, 2010.

[2] Andreasen, J. and Huge, B. N. Expanded Forward Volatility. RISK, 1, 2013.

[3] Antonov, A., Konikov, M. and Spector, M. Free Boundary SABR. RISK, 2015.

[4] Antonov, A., Konikov, M. and Piterbarg, V. Neural Networks with Asymptotics Control. SSRN, 3585966, 2020.

[5] Bayer, C., Horvath, B., Muguruza, A., Stemper, B. and Tomas, M. On deep calibration of (rough) stochastic volatility models. ArXiv, 2019.

[6] Bayer, C. and Stemper, B. Deep calibration of rough stochastic volatility models. ArXiv 1810.03399, 2018.

[7] Dimitroff, G., Roeder, D. R. and Fries, C. P. Volatility model calibration with convolutional neural networks. Preprint, SSRN:3252432, 2018.

[8] Felpel, M., Kienitz, J. and McWalter, T. Effective Stochastic Volatility - Applications to ZABR-type models. SSRN Preprint, 2020.


Literature II

[9] Hagan, P., Kumar, D., Lesniewski, A. S. and Woodward, D. E. Universal Smiles. Wilmott Magazine, 2016.

[10] Hagan, P., Kumar, D., Lesniewski, A. and Woodward, D. Arbitrage-Free SABR. Wilmott Magazine, 1:60–75, 2014.

[11] Hagan, P., Lesniewski, A. and Woodward, D. Probability Distribution in the SABR Model of Stochastic Volatility. Working Paper, http://lesniewski..us/papers/ProbDistForSABR.pdf, 2005.

[12] Hagan, P., Lesniewski, A. S. and Woodward, D. E. Effective Media Analysis for Stochastic Volatility Models. Wilmott Magazine, 1:46–55, 2018.

[13] Hagan, P. S., Kumar, D., Lesniewski, A. S. and Woodward, D. E. Managing Smile Risk. Wilmott Magazine, 1:84–108, 2002.

[14] Hastie, T., Tibshirani, R. and Friedman, J. The Elements of Statistical Learning, 2nd edition. Springer Series in Statistics, 2008.

[15] Hernandez, A. Model Calibration with Neural Networks. SSRN: https://ssrn.com/abstract=2812140, 2016.


Literature III

[16] Heston, S. A closed form solution for options with stochastic volatility with applications to bond and currency options. Rev. Fin. Studies, pages 327–343, 1993.

[17] Horvath, B., Muguruza, A. and Tomas, M. Deep Learning Volatility. Preprint, 2019.

[18] Kienitz, J., Acar, S., Liang, Q. and Nowaczyk, N. Deep Option Pricing. Machine Learning in Finance, 1, 2020(a).

[19] Kienitz, J., Acar, S., Liang, Q. and Nowaczyk, N. The CV makes the difference. SSRN, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3527314, 1, 2020(b).

[20] Kingma, D. P. and Ba, J. Adam: A method for stochastic optimization. International Conference on Learning Representations (ICLR), 2015.

[21] Schoebel, R. and Zhu, J. Stochastic Volatility with an Ornstein Uhlenbeck Process: An Extension. European Finance Review, 3:23–46, 1999.

[22] Stein, E. M. and Stein, J. C. Stock Price Distribution with Stochastic Volatility: An Analytic Approach. Review of Financial Studies, 4:727–752, 1991.

