Risk Analysis & Modelling Lecture 6: CAT Models & EVT
Dec 19, 2015
Cat Modelling
Catastrophe Modelling (also known as CAT Modelling) is the process of using statistical models to assess the loss which could occur in a Catastrophic Event. Catastrophic Events are extreme events which go beyond normal occurrences, such as exceptionally strong windstorms or exceptionally large financial losses. So far on this course we have primarily focused on the modelling of day-to-day Attritional Losses, which are high in frequency and low in severity. Catastrophic Losses are low in frequency and high in severity.
Two Sources of CAT Risks
[Charts: illustrative loss distributions for the two sources of CAT risk.]
Small Number of Very Large Claims: for example MAT, the total destruction of a large oil tanker. These are normally Man-Made Events.
Infrequent External Events Causing Many Simultaneous Losses: Property Insurance (e.g. Hurricanes & Meteor Strikes). These are normally Events of Nature.
Infrequent Natural Catastrophes Causing Multiple Losses
Modelling Natural Catastrophes
Natural Catastrophes like Hurricanes and Earthquakes are also modelled statistically in terms of their Frequency and Severity. Statistical Distributions are used to describe the random severity of a Catastrophe. In the Earth Sciences the EP (Exceedance Probability) Curve is used to estimate the Severity of a Catastrophe in terms of the probability of observing a Catastrophe of greater severity. The measure of severity used to describe a Catastrophe's strength depends on its type. For example, a Windstorm's Severity might be measured in terms of its Maximum Sustained Wind Speed, while an Earthquake's Severity might be measured in terms of a Maximum Shaking Index.
EP Curve for Maximum Windstorm Speed over the Next Year
[Figure: EP curve. X-axis: Maximum Sustained Windspeed (mph), 0 to 150; Y-axis: Probability of Exceedance, 0 to 1. The Exceedance Probability for a Windstorm with a Maximum Windspeed greater than 105 mph is 5%.]
Return Period
Another Probabilistic Measure which is commonly used to measure the Severity of a Catastrophe is the Return Period. It measures the Average Time you would have to wait to observe a Catastrophe of a certain strength. For example, a 100-year storm will only occur on average every one hundred years. The Return Period (RP) is related to the Annual Exceedance Probability (EP) by the following formula:

RP = 1 / EP

Using this formula, calculate the Annual Exceedance Probability of a 100 Year Storm (the probability of experiencing a 100 Year Storm or worse in a given year)
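As a quick worked answer to the question above, here is a minimal Python sketch of the RP = 1/EP conversion (the function names are ours, purely for illustration):

```python
# Minimal sketch of the relationship RP = 1 / EP.
# Function names are illustrative, not from the course spreadsheets.

def exceedance_prob(return_period_years):
    """Annual probability of a catastrophe of this strength or worse."""
    return 1.0 / return_period_years

def return_period(ep):
    """Average waiting time (years) for a catastrophe of this strength or worse."""
    return 1.0 / ep

# A 100 Year Storm has an Annual Exceedance Probability of 1/100 = 1%
ep_100_year = exceedance_prob(100)
```

So the Annual Exceedance Probability of a 100 Year Storm is 1%.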
Return Period vs Maximum Windspeed
[Figure: X-axis: Return Period (Years), 0 to 60; Y-axis: Maximum Sustained Windspeed (mph), 0 to 160. The Maximum Windspeed of a 30 Year Storm is 125 mph.]
From Catastrophe To Loss
An Insurance Company could use an estimate of the Worst Case Intensity of a Catastrophe to imply the Worst Case Loss they could incur. Estimating the Loss would involve implying damage to property from the intensity of the Catastrophe. The intensity of the Catastrophe (such as the intensity of the wind speed or the height of the flood water) is likely to vary at different locations. The damage the Catastrophe causes will also depend on the type of structure or property insured. To accurately measure the damage caused by the Catastrophe we need to know the location of the insured properties, the strength of the catastrophe at those locations and the property's Vulnerability to the Catastrophe…
Proprietary Catastrophe Models
Proprietary Catastrophe Models are complex Computer Models used to estimate the distribution of losses an Insurance Company could experience due to Catastrophic Events. Popular proprietary CAT models include RMS, EQECAT and AIR. Their primary use is to model losses on Property Insurance Portfolios. They combine models from the Earth Sciences and Engineering with the Frequency-Severity Model to produce Distributions of the losses. Although these models are too complicated to build in Excel, we will look at how they work internally. Proprietary CAT models can be split into four components: Hazard, Vulnerability, Exposure and Loss.
Components of CAT Models
HAZARD VULNERABILITY
EXPOSURE
LOSS
Model Output (AAL, AEP, OEP)
Hazard Component
The Hazard Component is the part of the CAT Model that deals with the Severity of the Catastrophe. It contains a database of a selection of historical and simulated catastrophes (floods, hurricanes, earthquakes) known as the Event Catalogue. The Catastrophes in the Event Catalogue are stored as points on a map measuring their Maximum Intensity at that location. For example, a Hurricane would be stored as a series of points on a map measuring the Maximum Sustained Windspeed. The area on the map affected by the Catastrophe is known as the Footprint. Some Catastrophes, such as winter storms, have a very large footprint.
The simulated Catastrophes are generated using Earth Science computer models and reflect the frequency and severity of real world phenomena. The Event Catalogue represents the entire range of possible Catastrophes that could occur in the future. Each Catastrophe also has an Average Annual Frequency or Rate associated with it; this is the measure of how likely it is that this Catastrophe will occur, and is the Primary Uncertainty. In the RMS Model the distribution of the Frequency of each Catastrophe is assumed to follow a Poisson Distribution. Each Catastrophe is also assigned a unique Event ID to identify it in the Database.
Hazard Component Diagram
Historical Catastrophes (eg Hurricane Katrina)
Simulated Catastrophes
Rate (λ)
Event Catalogue Database
For every Catastrophe in the Database, a Map of Maximum Intensities (the Intensity Map) and its Average Frequency (or Rate) is stored
Vulnerability Component
The Vulnerability Component converts the Catastrophe's Intensity at a point on the Map to the Total Damage Incurred on a particular Structure or Building (the Ground Up Loss). The Vulnerability Component comprises a series of Vulnerability Curves (from Civil Engineering) which relate the Mean (Average) Damage Ratio (MDR) to a Catastrophe's Intensity measure. The Mean Damage Ratio (MDR) is the ratio of the damage done to the property or structure as a proportion of its total value (Exposure). There are different Vulnerability Curves for different Catastrophe Types and different Building Types.
Which Vulnerability Curve to use for a given Property depends on its resistance to the Catastrophe (such as resistance to Ground Shaking). For example, Vulnerability Curves for Earthquake Damage could be divided into Good, Average or Poor Resistance. The allocation of a building to one of these categories would be determined by build material, age, structural design and the rock type under the foundations. The Actual Damage Ratio that will occur in a Catastrophe is a random variable that will fluctuate about its Average; this is the Secondary Uncertainty. The RMS model provides estimates for both the Average and Standard Deviation of the Damage done. RMS assumes the random Damage caused by the Catastrophe follows a Beta Distribution. Ultimately the CAT Model uses the Vulnerability Curves to simulate the Ground Up Loss given the strength of the Catastrophe at that Location.
[Figure: Vulnerability Curves for Earthquake Catastrophe. X-axis: Modified Mercalli Shaking Intensity (MM Intensity), 5 to 12; Y-axis: Mean Damage Ratio (MDR), 0% to 100%. Separate curves are shown for Poor, Medium and Good Construction Quality. The actual damage ratio fluctuates about the MDR.]
Exposure Component
Relevant details of the Insurer's Underwriting Portfolio are stored in a database called the Exposure Component. For each building insured, the Sum Insured or Exposure, the address and the various physical characteristics required to assess the appropriate Vulnerability Curve are stored. The Exposure Database also contains policy specific information such as Limits, Deductibles, the Coverage of the policy (all-risks vs named peril etc) and details of Reinsurance coverage. By combining the information held in the Hazard, Vulnerability and Exposure components, a table of the possible losses can be generated…
The Event Loss Table (ELT)
Using the information in the Hazard, Vulnerability and Exposure components, the CAT model produces a table of potential losses from Catastrophic Events called the Event Loss Table. Its columns include the Event ID, the Rate, the Exposure, and the Average and Standard Deviation of Loss Severity. Although it has been derived by combining Earth Science and Civil Engineering models, from an Actuarial perspective the ELT is simply a table of the Frequency and Severity of losses due to Catastrophes. The Rate is the Average Frequency of the Loss (the Frequency of the Catastrophe causing this loss) from the Hazard Module, which is Poisson Distributed. The Average and Standard Deviation of Loss Severity are calculated from the Vulnerability Curves and the CAT's Intensity at the building's location, and the losses are assumed to be Beta Distributed…
Event Loss Table
[Diagram: the Hazard Component's Intensity Maps, the Vulnerability Curves and the Exposure Database are combined. For the various CATs, a table of the Frequency and Severity of Losses for the Properties in the Exposure Database is produced, called the Event Loss Table.]
Event Loss Table

Event ID | Rate (λ) | Exposure | Average Severity | Std Dev Severity
5564     | 0.00013  | 13.5m    | 1.7m             | 2.4m
7831     | 0.005    | 8.7m     | 0.8m             | 1.1m
…        | …        | …        | …                | …
The Event Loss Table gives the Frequency and Severity of Losses due to Catastrophes in the Event Catalogue
Loss Component
The Loss Component is the Calculation Engine of the CAT Model. Using the information in the Event Loss Table and the Statistical Assumptions of the Model (Frequency is Poisson and Severity is Beta Distributed), the Loss Component can produce a series of Loss Distributions and Loss Statistics. The simplest calculation that the Loss Component can make is the Average Annual Loss (AAL). This is simply equal to the sum of the Average Losses for each row in the Event Loss Table, which can be calculated from the formula we saw in Lecture 4:

Expected Loss = λ * E(X)

Where λ is the Average Frequency or Rate, and E(X) is the Average Loss Severity. The Average Annual Loss is used when calculating the Catastrophe Load for the Premium.
AAL Loss Example

Event ID | Rate (λ) | Exposure | Average Severity | Std Dev Severity
5564     | 0.01     | 13.5m    | 1.7m             | 2.4m
7831     | 0.05     | 8.7m     | 0.8m             | 1.1m
8214     | 0.03     | 20.7m    | 2.7m             | 4.2m

The Average Annual Loss across these 3 Catastrophic Losses is AAL = 0.01*1.7 + 0.05*0.8 + 0.03*2.7 = 0.138m
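The same calculation can be sketched in a few lines of Python (illustrative only; the course itself works in Excel/VBA). Each term is the Rate multiplied by the Average Severity:

```python
# AAL = sum over ELT rows of (Rate x Average Severity), figures in millions.
# The rows below are the three example events from the table above.
event_loss_table = [
    # (event_id, rate, average_severity)
    (5564, 0.01, 1.7),
    (7831, 0.05, 0.8),
    (8214, 0.03, 2.7),
]

aal = sum(rate * severity for _, rate, severity in event_loss_table)
# 0.01*1.7 + 0.05*0.8 + 0.03*2.7 = 0.138 (i.e. 0.138m)
```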
Annual Exceedance Probability Curve
The Annual Exceedance Probability (AEP) Curve gives the probability that the Total Annual Loss from all Catastrophes over a year will exceed a value
To calculate this distribution the Loss Component has to run a full Frequency-Severity Simulation across all possible Catastrophes
To calculate the AEP Curve the Loss Component has to use the Poisson-Beta Frequency Severity Model combined with the statistics in the ELT
The AEP Curve can be used to calculate how much capital is required to absorb 99.5% of all losses due to Catastrophes over a one year period and would be used to calculate the CAT SCR in Solvency II
AEP Curve Calculation
[Diagram: the Event Loss Table feeds a Frequency-Severity Model (Poisson-Beta), producing the Simulated Distribution of the Total Loss Over One Year.]
Using a Monte Carlo Simulation (AIR, EQECAT) or a Fast Fourier Transform (RMS), the Aggregate Loss Distribution over One Year can be calculated from the ELT and the Frequency-Severity Model
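The simulation approach can be sketched as follows. This is an illustrative Python toy, not the RMS or AIR algorithm: event counts are drawn from a Poisson distribution per ELT row, severities from a Beta distribution moment-matched to the row's mean and standard deviation (expressed as fractions of exposure), and the annual losses are summed to give an empirical AEP point. The ELT rows are hypothetical.

```python
import math
import random

random.seed(42)

# Hypothetical ELT rows: (rate, exposure, mean severity, sd of severity)
elt = [(0.01, 13.5, 1.7, 2.4), (0.05, 8.7, 0.8, 1.1), (0.03, 20.7, 2.7, 4.2)]

def poisson(lam):
    """Draw a Poisson event count (Knuth's method; fine for small annual rates)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= limit:
            return k
        k += 1

def beta_params(mean, sd, exposure):
    """Moment-match a Beta distribution on [0, exposure] to the given mean and sd."""
    mu, var = mean / exposure, (sd / exposure) ** 2
    common = mu * (1 - mu) / var - 1
    return mu * common, (1 - mu) * common

losses = []                       # simulated total loss for each year
for _ in range(10_000):
    total = 0.0
    for rate, exposure, mean, sd in elt:
        a, b = beta_params(mean, sd, exposure)
        for _ in range(poisson(rate)):
            total += exposure * random.betavariate(a, b)
    losses.append(total)

# One point on the empirical AEP curve: P(annual loss > 5m)
aep_at_5m = sum(l > 5.0 for l in losses) / len(losses)
```

Sorting the simulated annual losses and reading off quantiles gives the whole empirical AEP curve.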
OEP and AEP Curve in RMS
The Probability that Losses due to the Catastrophes Modelled will
Exceed 20 million is 1%
Occurrence Exceedance Probability Curve
The Loss Component can also calculate the Occurrence Exceedance Probability (OEP) Curve.
This EP Curve gives the probability of the Total Loss from ANY ONE Catastrophe exceeding a given loss over a period of time.
When the frequency of Catastrophic Events is low, the AEP and OEP curves will be similar, since it is unlikely that multiple catastrophes will occur in the same period.
The OEP Curve can be used by the Insurance Company to determine how much risk they should Retain when purchasing Catastrophe Excess of Loss Reinsurance.
Proprietary CAT Models & Accuracy
Conceptually Proprietary CAT models make sense; however, their accuracy in some cases has been very poor. The largest insurance losses predicted by CAT models for Hurricane Katrina were $25 billion, while the post-event estimates are closer to $50 billion. This discrepancy of 100% is primarily due to the unforeseen effect of storm surge and flooding during the hurricane. While CAT models certainly have their uses, they are expensive (RMS has spent over $200 million). They suffer from Model Risk: the more complex the model, the greater the chance that a part of it is incorrect. The complexity also means it is difficult to know if the output statistics are reasonable; for this reason some insurance companies use multiple CAT models and take an average of their answers!
Cat Risks in Solvency II
As we saw in last week's class, the Standard Model in Solvency II is based on estimating losses for 1 in 200 year events and aggregating these losses using Correlation Matrices. The Standard Model in Solvency II provides a simplified CAT model based on scenarios for Earthquake, Flood, Windstorm, Hail and Subsidence catastrophes with a 200 Year Return Period. These scenarios are provided on spreadsheets in which the insurer enters their exposure to different geographical regions (sub-divided by CRESTA Zones); the loss given the severity of the disaster in each region is then calculated: https://eiopa.europa.eu/Publications/Standards/20131202_eiopa_helper_tab_cat.xlsm CRESTA Zones are a standardised way of dividing geographical regions; in the UK these are either Postcode Areas (Low Resolution) or Postcode Districts (High Resolution).
Natural Catastrophes In Solvency II
Solvency II provides average LDRs (Loss Damage Ratios) for various CRESTA Zones (Low Resolution, which are Postcode Areas in the UK) for 1 in 200 year Natural Catastrophes. The LDR for each CRESTA Zone is the product of a National Average Loss Ratio LDRNAT and a factor for the severity of the catastrophe in that CRESTA Zone, FZONE:

LDRZONE = FZONE * LDRNAT

The loss (or the capital required to absorb this loss) that the insurer would experience in the event of this 1 in 200 Catastrophe would be calculated as:

SCRZONE = TIVZONE * LDRZONE = TIVZONE * FZONE * LDRNAT

Where TIVZONE is the Total Insured Value or Exposure in this zone (normally the sum of the Total Insured Value of Property with Fire and Other Damage Coverage and the Total Insured Value of Static Cargo (Marine) in this CRESTA Zone). SCRZONE is the size of the loss from this 1 in 200 Catastrophe in this CRESTA Zone and, equivalently, the Solvency Capital Requirement (SCR) needed to absorb it. National LDR Scenarios and Relative Severity Factors are provided for various natural catastrophes: Windstorms, Earthquakes, Floods, Hail and Subsidence. Reinsurance should be taken into account at the appropriate level; for example, if the insurer has an Aggregate XL Windstorm programme for the UK then its effect on the SCR should be taken into account after losses have been aggregated for the whole country.
1 in 200 Windstorm LDRNAT
Country LDR
Belgium 0.16%
Switzerland 0.08%
Denmark 0.25%
France 0.12%
Germany 0.09%
Iceland 0.03%
Ireland 0.20%
Lithuania 0.10%
Netherlands 0.18%
Norway 0.08%
Poland 0.04%
Spain 0.03%
Sweden 0.09%
UK 0.17%
UK CRESTA Relativity Factors For Windstorm (FZONE)
CRESTA Zone (Postcode Area) CRESTA Relativity Factor
AB 90.00%
AL 110.00%
B 70.00%
BA 150.00%
BB 110.00%
BD 90.00%
BH 150.00%
BL 90.00%
BN 190.00%
BR 70.00%
BS 130.00%
….. …..
…… ……
Review Question
An Insurance Company has a Total Sum Insured in Postcode B of £10,000,000 and a Total Sum Insured in Postcode BN of £20,000,000. Calculate the SCR required to absorb the losses resulting from the 1 in 200 year Catastrophic Windstorms in these two CRESTA Zones.
Assuming a Correlation between the losses in these two CRESTA Zones of 50%, calculate the SCR required to absorb both of these using the Risk Aggregation Formula:
SCR = √(VSCRᵀ P VSCR)

where VSCR is the vector of individual SCRs and P is the Correlation Matrix
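A worked sketch of the review question in Python (illustrative; the figures are taken from the LDR and relativity tables that follow: UK windstorm LDRNAT = 0.17%, FZONE = 70% for postcode area B and 190% for BN):

```python
import math

ldr_nat = 0.0017                      # UK 1-in-200 windstorm national LDR (0.17%)
zones = {"B": (10_000_000, 0.70),     # (Total Sum Insured, F_ZONE)
         "BN": (20_000_000, 1.90)}

# SCR per zone: TIV * F_ZONE * LDR_NAT
scr = {zone: tiv * f * ldr_nat for zone, (tiv, f) in zones.items()}
# scr["B"] = 11,900 and scr["BN"] = 64,600

# Aggregate the two zonal SCRs with a 50% correlation, i.e. sqrt(v' P v)
# written out explicitly for the two-zone case
rho = 0.5
scr_total = math.sqrt(scr["B"] ** 2 + scr["BN"] ** 2
                      + 2 * rho * scr["B"] * scr["BN"])
# scr_total is roughly 71,299
```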
Aggregating Risks Between Different CRESTA Zones Within a Country
Post Code
Area AB AL B BA BB BD BH BL BN ….
AB 1.00 0.00 0.00 0.00 0.25 0.25 0.00 0.25 0.00 …..
AL 0.00 1.00 0.75 0.75 0.50 0.50 0.75 0.50 0.75 …..
B 0.00 0.75 1.00 0.75 0.75 0.75 0.50 0.75 0.50 …..
BA 0.00 0.75 0.75 1.00 0.25 0.50 1.00 0.50 0.75 …..
BB 0.25 0.50 0.75 0.25 1.00 1.00 0.25 1.00 0.25 …..
BD 0.25 0.50 0.75 0.50 1.00 1.00 0.25 1.00 0.25 …..
BH 0.00 0.75 0.50 1.00 0.25 0.25 1.00 0.25 0.75 …..
BL 0.25 0.50 0.75 0.50 1.00 1.00 0.25 1.00 0.25 …..
BN 0.00 0.75 0.50 0.75 0.25 0.25 0.75 0.25 1.00 …..
…. ….. ….. ….. ….. …… ….. ….. ….. ….. …..
Large Correlation Matrices (124 by 124 for the UK) are used to Aggregate SCRs between CRESTA Zones for a Given Nat Cat within a Country, this Matrix is for
Windstorms for the CRESTA Zones (Postcode Areas) in the UK
Aggregating SCRS for Natural Catastrophes Across Countries
AT BE CH CZ DE DK ES FR UK …..
AT 1.00 0.25 0.50 0.25 0.25 0.00 0.00 0.25 0.00 …..
BE 0.25 1.00 0.25 0.25 0.50 0.25 0.00 0.50 0.50 …..
CH 0.50 0.25 1.00 0.25 0.25 0.00 0.25 0.50 0.00 …..
CZ 0.25 0.25 0.25 1.00 0.25 0.00 0.00 0.25 0.00 …..
DE 0.25 0.50 0.25 0.25 1.00 0.50 0.00 0.50 0.25 …..
DK 0.00 0.25 0.00 0.00 0.50 1.00 0.00 0.25 0.25 …..
ES 0.00 0.00 0.25 0.00 0.00 0.00 1.00 0.25 0.00 …..
FR 0.25 0.50 0.50 0.25 0.50 0.25 0.25 1.00 0.25 …..
UK 0.00 0.50 0.00 0.00 0.25 0.25 0.00 0.25 1.00 …..
…. …. ….. ….. ….. ….. ….. ….. ….. ….. …..
Once the SCRs are combined for an Individual Country, they are then Aggregated Again Across Countries using another Correlation Matrix for that particular Catastrophe. This Matrix is for Aggregating Windstorm Losses.
Aggregating Across Different Natural Catastrophes
Windstorm Earthquake Flood Hail Subsidence
Windstorm 1.00 0.00 0.00 0.00 0.00
Earthquake 0.00 1.00 0.00 0.00 0.00
Flood 0.00 0.00 1.00 0.00 0.00
Hail 0.00 0.00 0.00 1.00 0.00
Subsidence 0.00 0.00 0.00 0.00 1.00
The SCRs Between Different Natural Catastrophes are combined again using another Correlation matrix to Obtain the SCR_NAT_CAT – note that the matrix assumes
catastrophes are independent or uncorrelated
Extreme Value Theory (EVT)
If we were to look inside the CAT models and see how they generate extreme floods or windstorms, we would see a recurring theme: Extreme Value Theory (EVT). EVT is based on the Pickands–Balkema–de Haan Theorem, which states that the distribution describing the behaviour of the Extreme Events of any random variable above a suitably high threshold will follow a single distribution: the Generalised Pareto Distribution (GPD). We can think of the GPD as a distribution that describes the tails of all other distributions. EVT is the most powerful statistical tool available to assess "how bad things can get from what we have seen so far".
EVT: Fitting the Tails
If we stand back and look at the Statistical Methods we have been using to measure Risk, we notice a pattern.
This pattern is that we have been looking at the tails of distributions.
We have been fitting statistical distributions to the entire behaviour of a random variable and then using this to estimate the worst 5%, 1% or 0.1% of outcomes.
EVT just focuses on the tails of a distribution and tries to find the best fit just for the tail.
[Figure: Largest Claims, Upper Tail. The largest 1% of claims.]

We make a lot of strict assumptions about the random variable in order to estimate the whole distribution, and then just use the tail.

[Figure: VaR, Lower Tail. The largest 1% of losses; the fitted distribution describes the entire random behaviour. Find the distribution that best fits the entire dataset.]

[Figure: EVT, describing the tail. Take the largest values in the data set and just fit the tail.]
Conditional Probability
The EVT Distribution is generally used to measure Conditional Probability instead of the Unconditional Probability we have been using so far. Conditional Probability is simply the probability of something happening GIVEN THAT something else has already happened. For example, what is the probability of rolling a 6 on a fair die if we know we will roll more than a 3? The chance of rolling a 6 unconditionally is simply 1 in 6 or 1/6 (6 is one of six equally likely possibilities: 1, 2, 3, 4, 5, 6); this is the unconditional probability. The chance of rolling a 6 conditional on the fact we will roll more than 3 is 1 in 3 or 1/3 (6 is one of three equally likely possibilities: 4, 5, 6); this is the conditional probability.
Conditional Probability Review Question 1
10% of the population are over 180cm tall and 2% of the population are over 190cm tall.
What is the probability of selecting someone at random and finding they are less than 190cm tall (Unconditional Probability)?
What proportion of the population over 180cm tall are also over 190cm?
What is the probability of selecting someone at random from a group of individuals over 180cm and finding they are less than 190cm tall (Conditional Probability)?
Conditional Probability Review Question 2
Only 5% of the claims on a given class of insurance are greater than £2000
Out of the large claims greater than £2000, 80% are less than or equal to £5000
What proportion of the large claims greater than £2000 are also greater than £5000?
What is the probability of any claim being greater than £5000 (unconditional probability)?
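A worked check of this question as an illustrative Python sketch (the variable names are ours):

```python
p_gt_2000 = 0.05                    # 5% of all claims exceed 2000
p_le_5000_given_gt_2000 = 0.80      # of those large claims, 80% are <= 5000

# Proportion of the large claims (> 2000) that also exceed 5000
p_gt_5000_given_gt_2000 = 1 - p_le_5000_given_gt_2000    # 20%

# Unconditional probability of any claim exceeding 5000
p_gt_5000 = p_gt_2000 * p_gt_5000_given_gt_2000          # 5% * 20% = 1%
```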
Generalised Pareto Distribution Formula

The Cumulative Distribution Function for the Generalised Pareto Distribution (GPD) is:

CDF*(Y) = 1 - ((M + β) / (Y + β))^ξ,  for Y ≥ M

ξ is the shape parameter, β is the shift parameter and M is the minimum or threshold above which we analyse the random variable.
In the context of EVT, this is a Conditional CDF giving the probability of the random variable being less than Y GIVEN THAT IT IS GREATER THAN OR EQUAL to M.
The CDFs we have seen so far are unconditional; they simply measure the probability that the variable is less than some value.
The Inverse CDF for the Generalised Pareto Distribution is:

Y = (M + β) / (1 - P)^(1/ξ) - β

Where P is the probability of the random variable being less than some value given that it is greater than or equal to M. The Probability Density Function for the Generalised Pareto Distribution is:

PDF*(Y) = ξ (M + β)^ξ / (Y + β)^(ξ + 1)
VBA Formula

Public Function GPDCDF(X, ShapeParam, ShiftParam, MinParam)
    GPDCDF = 1 - (((MinParam + ShiftParam) / (X + ShiftParam)) ^ ShapeParam)
End Function

Public Function GPDICDF(P, ShapeParam, ShiftParam, MinParam)
    GPDICDF = ((MinParam + ShiftParam) / ((1 - P) ^ (1 / ShapeParam))) - ShiftParam
End Function

Public Function GPDPDF(X, ShapeParam, ShiftParam, MinParam)
    GPDPDF = ShapeParam * ((MinParam + ShiftParam) ^ ShapeParam) / ((X + ShiftParam) ^ (ShapeParam + 1))
End Function
Generalised Pareto CDF: M = 2000, ξ = 5 & β = 2000

[Figure: GPD CDF, x-axis 0 to 14,000, y-axis 0 to 1.]

The random variable can take on values below the threshold of 2000; however, this distribution is conditional on the outcome being greater than or equal to 2000. Of all the claims above 2000, 90% are below 4340.
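A direct Python translation of the VBA functions above can be used to check the chart's claim (ξ = 5 and β = 2000 are our reading of the chart title's parameters):

```python
def gpd_cdf(x, shape, shift, minimum):
    """Conditional GPD CDF: P(X <= x | X >= minimum)."""
    return 1 - ((minimum + shift) / (x + shift)) ** shape

def gpd_icdf(p, shape, shift, minimum):
    """Inverse of gpd_cdf: the value below which a proportion p of the tail falls."""
    return (minimum + shift) / (1 - p) ** (1 / shape) - shift

# With M = 2000, shape = 5, shift = 2000:
p_4340 = gpd_cdf(4340, shape=5, shift=2000, minimum=2000)    # roughly 0.90
q_90 = gpd_icdf(0.90, shape=5, shift=2000, minimum=2000)     # roughly 4340
```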
GPD Review Question

Of the claims above 2000, 90% are below 4340; what proportion of the claims above 2000 are above 4340?
If 5% of all claims are above 2000, what is the probability of observing any claim above 4340 (unconditional probability)?
What is the 99.5% PML? (Hint: this is the largest 10% of the largest 5%, or the 90% Quantile in the Tail)
Fitting Distributions

Throughout the course we have used statistical distributions to describe random variables. Up until now we have been fitting the shape of the distributions (Normal, Gamma, Pareto etc) using simple statistics like the mean and variance of our observations. The mean and variance are the first two moments, and this method of fitting distributions is known as the Method of Moments. Unfortunately the moments of a distribution do not always exist (they can be undefined or infinite) and this method does not always provide the best fit. We cannot fit the Generalised Pareto Distribution to a dataset using the Method of Moments…
Example: Pareto Distribution
In an earlier lecture we looked at the Pareto Distribution (which is closely related to the Generalised Pareto Distribution) as a way of modelling claim severity. Its Probability Density Function was:

PDF(X) = α M^α / X^(α + 1)

We saw how the distribution was determined by the minimum (M) and the shape (α). We then saw how we could determine the shape parameter from the minimum and the average of a data set:

α = E(X) / (E(X) - M)

The intuition behind this approach is that if the distribution has the same moments as the data set it should provide a "reasonably good" fit.
Fitting the Pareto Distribution

Historic Observations (Xi): calculate the average, implied from the dataset:

E(X) = (1/N) Σ (i=1 to N) Xi

Fit the Pareto Distribution's Shape:

α = E(X) / (E(X) - M)

Best Fit for the Dataset: Same Average and Minimum
Maximum Likelihood
There is a much more versatile approach to fitting statistical distributions to historical data or observations. This is the Maximum Likelihood Method, which was introduced by the famous statistician Ronald Fisher in 1922. Although it is one of the most important techniques in Statistics, the idea behind the Maximum Likelihood Method is very simple. Statistical distributions describe the probability of observing outcomes. So, select the distribution's parameters so that the probability of observing the data from that distribution is maximised. How likely is it that this sample of data came from this distribution? Find the distribution which is most likely to have produced this sample: the Maximum Likelihood.
Selecting the Distribution

[Figure: the same dataset overlaid with two candidate densities. How likely is it that this data came from Distribution A? How likely is it that this data came from Distribution B?]
Maximum Likelihood

[Figure] Out of all the possible Pareto Distributions, which is most likely to have produced our set of observations?
How to calculate the Likelihood
We have the general idea that we want to select the distribution so as to maximise the probability of observing our observations, or maximise the likelihood. How do we calculate the probability of observing a set of outcomes from a given distribution? The PDF tells us the density of outcomes in a region; this concept is very closely related to the probability of the outcome. We are more likely to observe an outcome from a region of high density than from a region of low density. For the following argument we will think of maximising the probability density and maximising the probability as equivalent.
Probability Revision

[Diagram: balls of three colours with probabilities 20%, 50% and 30%.]

What is the probability of drawing a green then a red then a blue?

P(a green, then a red, then a blue) = 50% * 30% * 20% = 3%
Probability (Density) of a Sample From a Continuous Distribution

[Diagram: P(sample) = P(X1) * P(X2) * P(X3), the product of the densities at each observed point.]

This gives us the probability (density) of observing this sample
Maximum Likelihood Evaluation

Let PDF be the density of a distribution we want to fit to a dataset X1, X2, X3, …, XN.
PDF is parameterised by a set of parameters S (if PDF is the Pareto Density then S would be the minimum and shape parameters).
The Maximum Likelihood problem is simply:
Select S to Maximise PDF(X1) * PDF(X2) * PDF(X3) * … * PDF(XN)
This multiplication calculates the Joint Density of the sample of observations.
Let's do our first ML calculation

Let's fit a Pareto distribution to a data set in Excel using maximum likelihood.
We can use the VBA ParetoPDF function to calculate the probability (density) of each observation given our selected minimum and shape parameters.
We can multiply all the probability densities together using the PRODUCT formula.
We can use Solver to find the Shape parameter which maximises the likelihood of the sample.
Getting Around the 0 Problem

Multiplying together hundreds of small densities quickly produces a number too small for the computer to represent: the product underflows to 0. The solution is simple: we take a log transformation.

So instead of maximising this:

P(A) = P(A1) * P(A2) * P(A3) * … * P(AN)

We maximise this:

ln(P(A)) = ln(P(A1) * P(A2) * P(A3) * … * P(AN))

From the rules of logs:

ln(P(A1) * P(A2) * P(A3) * … * P(AN)) = ln(P(A1)) + ln(P(A2)) + ln(P(A3)) + … + ln(P(AN))
Log Likelihood Maximisation

We reformulate the problem as maximising the log likelihood:

Select S to maximise: Σ (i=1 to N) ln P(Ai)

Isn't this a completely different problem? Yes and No. The maximum of this problem is the same as the maximum of the original problem because the natural logarithm (ln) is a monotonic function (see Appendix).
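For the Pareto distribution the log likelihood can in fact be maximised analytically: with the minimum M known, the shape that maximises the sum of log densities is N / Σ ln(Xi / M). A small Python sketch (the data are invented for illustration; the course does the equivalent maximisation in Excel with Solver):

```python
import math

M = 100.0
data = [105.0, 130.0, 180.0, 250.0, 410.0, 900.0]   # invented observations

def pareto_log_likelihood(shape, data, minimum):
    """Sum of ln(shape * M^shape / x^(shape + 1)) over the observations."""
    return sum(math.log(shape) + shape * math.log(minimum)
               - (shape + 1) * math.log(x) for x in data)

# Closed-form maximum likelihood estimate of the Pareto shape parameter
shape_mle = len(data) / sum(math.log(x / M) for x in data)

ll_at_mle = pareto_log_likelihood(shape_mle, data, M)
```

Nearby shape values give a lower log likelihood, confirming that the closed form sits at the maximum.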
What if we Lost Half the Data?

Imagine we lost roughly half our data: all the values below 240. Could we still fit a loss distribution to describe the behaviour of losses?
By modifying our Densities to take into account that we only observe values above 240, we could ask which Pareto Distribution best describes the underlying data conditional on the fact we only observe values over 240:

pdf(x | x > 240) = pdf(x) / (1 - cdf(240))
Why This is Useful
[Figure: a fitted CDF from 0% to 100% over values 0 to 1000. Values below 240 are missing, but we can use the Maximum Likelihood method to deduce what their most likely behaviour is.]

Reinsurance Companies often do not see losses below the Excess of an Excess of Loss Reinsurance Treaty, but they can use this technique to assess the entire loss distribution from the losses they see in the tail.
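The truncated likelihood idea can be sketched as follows: each density is replaced by the conditional density pdf(x) / (1 - cdf(240)), and we search for the shape that maximises the resulting log likelihood. This is an illustrative Python toy with invented data and a crude grid search, not the course's Excel implementation:

```python
import math

M = 100.0                                        # known true minimum of the Pareto
observed = [260.0, 300.0, 350.0, 520.0, 800.0]   # only values above 240 survive

def pareto_pdf(x, shape):
    return shape * M ** shape / x ** (shape + 1)

def pareto_cdf(x, shape):
    return 1 - (M / x) ** shape

def truncated_log_likelihood(shape):
    """Log likelihood using the conditional density pdf(x) / P(X > 240)."""
    norm = 1 - pareto_cdf(240.0, shape)
    return sum(math.log(pareto_pdf(x, shape) / norm) for x in observed)

# Crude grid search over the shape parameter
grid = [0.1 * k for k in range(1, 100)]
best_shape = max(grid, key=truncated_log_likelihood)
```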
EVT and Claim Severity

Some lines of business such as MAT can experience exceptionally large losses
These exceptionally large losses are low in frequency so the existing loss data is often very limited in scope
Future Catastrophic losses have the potential to be far greater than the losses already observed
EVT can be used to extrapolate the existing loss data to find out how large future catastrophic losses could be and their probability of occurring
Setting the Threshold
In order to use EVT to estimate the distribution of large losses we need to set a Threshold (u) above which we fit our tail. The selection of the Threshold depends on how much of the tail you need to use for the analysis, how much data you have available (the higher the threshold the less data) and how well the GPD fits the data. One technique that can be used to help decide where to place the threshold is the Mean Excess Plot (see Appendix). Generally speaking, the higher the threshold the better the GPD fits the remaining data, but the less data you have to fit it; this trade-off is one of the main problems with EVT in practice!
Splitting Data Above Threshold
[Figure: claim sizes plotted against a threshold line U*. We fit the GPD to claims above the Threshold U*.]
Applying this to Claims Data
We will take some claims data and use it to extrapolate extreme losses. Our first task is to set the Threshold above which we will estimate the tail. We will set the Threshold at claims above £2000 in size. This results in our tail estimate containing the largest 35 out of 500 claims. Our estimation is therefore for the 7% upper tail of the claims distribution.
Probability of Observing a Claim Greater than £10,000
We can use the GPD CDF to calculate the conditional probability of observing a claim less than or equal to 10000 when the Shape (ξ) is 2, the Shift (β) is -900 and u is 2000:

CDF*(10000) = 1 − ((10000 − 900)/(2000 − 900))^(−2) = 0.9854

So the Conditional Probability of observing a claim greater than this is 0.0146 (1 − 0.9854) or 1.46%. It is important to remember that this is a Conditional Probability – conditional on the outcome already being in the largest 7% of claims. This is the largest 1.46% of the largest 7%, so it represents the largest 0.102% (7% × 1.46%) of all claims. So the Unconditional Probability of observing a loss greater than 10000 is 0.102%.
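This calculation can be reproduced in a few lines. The sketch below is Python (the course itself works in Excel/VBA); `gpd_cdf` is an illustrative helper, not a function from the spreadsheet, written for the parameterisation used in this lecture.

```python
def gpd_cdf(y, xi, beta, u):
    """Conditional CDF of a claim y, given the claim is above the threshold u."""
    return 1.0 - ((y + beta) / (u + beta)) ** (-xi)

xi, beta, u = 2.0, -900.0, 2000.0
cond_cdf = gpd_cdf(10000, xi, beta, u)   # ~0.9854
cond_exceed = 1.0 - cond_cdf             # conditional probability of a larger claim
uncond_exceed = 0.07 * cond_exceed       # the fitted tail holds 7% of all claims
print(round(cond_exceed, 4), round(uncond_exceed, 5))   # prints 0.0146 0.00102
```

The multiplication by 0.07 is the step that converts the conditional tail probability into an unconditional one.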
EVT and Predicting the Total Loss of a Catastrophe
Another use of EVT is to fit a distribution to the total financial damage caused by a Catastrophe in a region. We will fit EVT to the damage caused by Hurricanes to property in the Florida Region. Using this we can try to extrapolate how much total damage the largest hurricanes could cause to the entire Insurance Industry. Using the data on the industry loss sheet we fit the Generalised Pareto Distribution with a threshold of 1000 (million) and find the best fit distribution has a shape of 1.03 and a shift of 3901. For more information on financial damage caused by hurricanes in the US visit: http://www.icatdamageestimator.com/viewdata
GPD CDF for the total loss due to Hurricanes in Florida (Shape = 1.03, Shift = 3901, Min = 1000)
[Figure: GPD CDF for total hurricane loss – Cumulative Probability plotted against Total Damage (Millions $), 0 to 350,000]
EVT predicts that the 95% PML due to hurricanes in the Florida Region is 86,370 million, or roughly 86 Billion: =GPDICDF(0.95,1.03,3901,1000)
Empirical Vs EVT Fit
[Figure: Empirical Distribution of Hurricane Losses plotted against the EVT Fit – cumulative probability against loss, 0 to 60,000]
Tail starts to diverge because EVT tells us much larger losses are possible
Hurricane Frequencies
We have a distribution for the severity of damage caused by Hurricanes in Florida. If we wish to calculate the AAL, OEP or AEP to model the loss over a year we need to know the Catastrophe's frequency. The frequency of Catastrophes is random, as well as their loss severity. As we saw with the RMS model, a common distribution used to describe the frequency of Catastrophic Events is the Poisson distribution. The historical average frequency of Hurricanes in the Florida region is about 0.56 per year.
Company’s Share of Industry Loss
It is useful to have a simple alternative model to “reality check” Proprietary Catastrophe Models. Insurers are concerned about the level of industry losses because they will have to pay a percentage of this total loss. The percentage that the insurer will have to pay will ultimately be determined by the level of exposure they have to the catastrophe. One simple way to estimate this exposure is to look at the proportion of the total value of property in the region insured by the company. This gives an approximate estimate of the Insurance Company's Gross loss, which can be compared to the output of a Proprietary Model.
Frequency Severity CAT Model
[Figure: simulated severity CDF – Cumulative Probability plotted against Industry Loss (Millions $), 0 to 30,000]
Frequency: How many Catastrophes of this type occur in the year?
Severity: What is the loss for each Catastrophe?
The Total Loss From CAT is the sum of the severities of the simulated CATs (e.g. 3 CATs); multiplying by the company's % of Total Exposure gives the Insurer's Total Loss
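A minimal frequency-severity simulation of this model can be sketched in Python. The Poisson frequency (0.56) and the GPD severity parameters come from the lecture; the 5% share of total exposure and the hand-rolled Poisson sampler are illustrative assumptions.

```python
import math
import random

def poisson(lam, rng):
    # Knuth's inverse-transform Poisson sampler
    target, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= target:
            return k
        k += 1

def gpd_icdf(q, xi, beta, u):
    # Inverse CDF under the lecture's GPD parameterisation
    return (u + beta) * (1.0 - q) ** (-1.0 / xi) - beta

rng = random.Random(42)
lam = 0.56                  # historical Florida hurricane frequency per year
share = 0.05                # assumed company share of industry exposure
xi, beta, u = 1.03, 3901.0, 1000.0

losses = []
for _ in range(10_000):     # simulate 10,000 years
    n = poisson(lam, rng)   # frequency: how many CATs this year?
    industry = sum(gpd_icdf(rng.random(), xi, beta, u) for _ in range(n))
    losses.append(share * industry)   # company's gross loss for the year

aal = sum(losses) / len(losses)       # company's Average Annual Loss
```

Sorting `losses` and reading off quantiles would give the company's OEP/AEP-style measures in the same spirit.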
Appendix: Using Log-Likelihood to Decide Which Distribution to Use
So far we have used a visual method to decide which distribution best fits our data – comparing the Gamma or Pareto Distribution to the Empirical Distribution. We could also compare the Log-Likelihood from the various distributions and select the one with the highest value. This is equivalent to asking which distribution is most likely to have produced our observations, the Pareto or the Gamma?
Which Provides Best Fit, Pareto or Gamma?
[Figure: Empirical, Pareto and Gamma CDFs plotted together over claim sizes 0 to 4000]
Log-Likelihood for Pareto is -6588
Log-Likelihood for Gamma is -6802
By comparing the Log-Likelihoods we can see that this dataset is more likely to have been produced by the Pareto Distribution than by the Gamma Distribution
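The mechanics of the comparison can be sketched with stdlib Python. The dataset and parameter values below are synthetic stand-ins (not the claims on the sheet); because the sample really is Pareto, its log-likelihood comes out higher.

```python
import math
import random

def pareto_loglik(data, alpha, xm):
    # Pareto pdf: alpha * xm^alpha / x^(alpha+1) for x >= xm
    return sum(math.log(alpha) + alpha * math.log(xm) - (alpha + 1) * math.log(x)
               for x in data)

def gamma_loglik(data, k, theta):
    # Gamma pdf: x^(k-1) * exp(-x/theta) / (Gamma(k) * theta^k)
    return sum((k - 1) * math.log(x) - x / theta - k * math.log(theta) - math.lgamma(k)
               for x in data)

random.seed(0)
data = [100 * random.paretovariate(2.5) for _ in range(500)]  # truly Pareto sample

ll_pareto = pareto_loglik(data, 2.5, 100)
ll_gamma = gamma_loglik(data, 2.0, 80.0)
best = "Pareto" if ll_pareto > ll_gamma else "Gamma"
```

Picking the distribution with the higher log-likelihood is exactly the selection rule described above.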
Appendix: Calculating the 99.5% PML
We will now calculate the 99.5% Probable Maximum Loss (PML) on a given claim using our GPD. This is the claim size such that 99.5% of the time claims will be less than this (a 0.5% Exceedance Probability). Firstly we need to calculate how far into the 7% tail we need to go to find the value such that only 0.5% of all outcomes are greater than this:
X * 0.07 = 0.005, so X = 0.005 / 0.07 = 0.0714
What this means is that if we find the largest 7.14% of the largest 7% of claims we will find the claim size such that we will only observe claims greater than this 0.5% of the time. Finding the largest 7.14% in the tail is equivalent to finding the 92.85% quantile (1 – 0.0714).
Using the Inverse CDF we can find the location of the 92.85% quantile where the Shape (ξ) is 2, the Shift (β) is -900 and u is 2000:
We could also calculate this in Excel using the GPDICDF formula:
=GPDICDF(0.9285,2,-900,2000)
If we repeat this calculation for the entire range of the 7% tail we have fitted to the data we can obtain an EP Curve (see the Appendix EP Curve sheet)
Y = (u + β)·(1 − 0.9285)^(−1/ξ) − β = (2000 − 900)·(1 − 0.9285)^(−1/2) + 900 = 5013.8
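The same quantile can be computed outside Excel; here is an illustrative Python stand-in for the spreadsheet's GPDICDF function, under the parameterisation used in this lecture.

```python
def gpd_icdf(q, xi, beta, u):
    """Quantile q of the fitted tail under the lecture's GPD parameterisation."""
    return (u + beta) * (1.0 - q) ** (-1.0 / xi) - beta

# The 99.5% PML overall is the 92.85% quantile within the 7% tail
pml = gpd_icdf(0.9285, 2.0, -900.0, 2000.0)
print(round(pml, 1))   # prints 5013.8
```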
EP Curve
The EP Curve only deals with Exceedance Probabilities less than 7% because we are only modelling the upper 7% tail (claims above 2000)
[Figure: EP Curve – Exceedance Probability (0% to 7%) plotted against Claim Severity, 0 to 7000]
We will only observe claims greater than £2959 2% of the time
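An EP Curve of this kind can be regenerated by multiplying the conditional tail probabilities by the 7% tail share; a Python sketch, with `gpd_cdf` an illustrative helper for the lecture's parameterisation:

```python
def gpd_cdf(y, xi, beta, u):
    return 1.0 - ((y + beta) / (u + beta)) ** (-xi)

xi, beta, u = 2.0, -900.0, 2000.0
tail_share = 0.07   # the fitted tail is the largest 7% of claims

# Unconditional exceedance probability for a grid of claim severities
ep_curve = [(y, tail_share * (1.0 - gpd_cdf(y, xi, beta, u)))
            for y in range(2000, 7001, 500)]
```

At the threshold itself the exceedance probability is exactly 7%, and it falls monotonically as the claim severity grows.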
Appendix: GPD Formula
The formula you will generally see in text books for the GPD is:

CDF(X) = 1 − (1 + ξ·(X − M)/σ)^(−1/ξ)

where ξ is the shape, σ is the scale and M is the location. By making the following substitutions for these parameters:

ξ → 1/ξ, σ → (u + β)/ξ, M → u

we can derive the form of the Generalised Pareto Distribution used on this course:

CDF*(Y) = 1 − ((Y + β)/(u + β))^(−ξ)
Appendix: MLE for the Shape of the GPD
The log-likelihood function for the GPD is:

ln L = n·ln(ξ) + ξ·n·ln(xm + β) − (ξ + 1)·Σ(i=1..n) ln(xi + β)

where xm is the threshold u and x1 … xn are the observations above it. Assuming the shift (β) is fixed or predetermined, the maximum value for the shape (ξ) is located where:

d(ln L)/dξ = n/ξ + n·ln(xm + β) − Σ(i=1..n) ln(xi + β) = 0

Solving, we find that the optimal shape is:

ξ̂ = n / Σ(i=1..n) (ln(xi + β) − ln(xm + β))

This is implemented in the MLEShape VBA function on the spreadsheet. It speeds up the problem of finding the optimal values for ξ and β: we can simply iterate over values of β and then use this formula to find the optimal ξ. This simplifies the search for the optimal ξ and β from a 2-dimensional grid search to a 1-dimensional grid search.
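A Python sketch of what the MLEShape VBA function computes, assuming the closed-form estimator above; the synthetic sample is only there to sanity-check that the estimator recovers a known shape.

```python
import math
import random

def mle_shape(data, beta, u):
    """Closed-form MLE of the shape for a fixed shift beta and threshold u."""
    n = len(data)
    return n / sum(math.log(x + beta) - math.log(u + beta) for x in data)

# Sanity check: simulate from the GPD with a known shape and re-estimate it
rng = random.Random(7)
xi_true, beta, u = 2.0, -900.0, 2000.0
sample = [(u + beta) * (1 - rng.random()) ** (-1 / xi_true) - beta
          for _ in range(20_000)]
xi_hat = mle_shape(sample, beta, u)   # lands close to 2.0
```

Wrapping `mle_shape` in a loop over candidate β values gives the 1-dimensional grid search described above.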
Appendix: Monotonic Functions
A monotonic (increasing) function is a function with the following property:

if X ≤ Y then f(X) ≤ f(Y)

This means that if we take a sequence of values and apply a monotonic increasing function to each of them then their order does not change. The implication for a maximisation or minimisation problem is that a monotonic transformation does not change the location of the maximum and minimum values.
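A quick numerical illustration in Python, using an arbitrary positive peaked function:

```python
import math

xs = [i / 100 for i in range(1, 500)]    # grid over (0, 5)
f = lambda x: x * math.exp(-x)           # positive, with its peak at x = 1

argmax_f = max(xs, key=f)
argmax_log_f = max(xs, key=lambda x: math.log(f(x)))
# Both searches return the same location, x = 1.0
```

This is why maximising the log-likelihood finds the same parameters as maximising the likelihood itself.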
Monotonic Transformation Does Not Change Location of the Maximum
[Figure: y plotted against x, and Ln(y) plotted against the same x, with the peak at the same location]
Applying the Log function does not change the location of the peak
Appendix: Problem with Solver
We have looked at Solver as a powerful tool with which to find maximum and minimum values – such as minimising risk or maximising return. However, Solver can sometimes get “stuck”. This is because it works on an algorithm called the Feasible Direction Search, which takes an initial position and tries to move from it to a higher position (maximisation) or a lower position (minimisation). It stops when it cannot make a small move that takes it higher in the case of a maximisation problem, or lower in the case of a minimisation problem. This can lead to it finding what is called a local solution, rather than the global solution.
Solver Diagram
[Figure: F(x) plotted against x. Solver starts at an initial point and makes a series of moves to increase the value of F(x). It stops at x* because it cannot make a small move to a higher value of F(x), and returns x* as the optimal value, even though the true, global optimal value lies elsewhere]
Grid Search Diagram
[Figure: F(x) evaluated on an evenly spaced grid of values x1, x2, …, x13; out of all the values on the grid, the one with the highest F(x) is the optimal]
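A 1-dimensional grid search is easy to sketch in Python; the bumpy test function below is an arbitrary example where a hill-climbing routine started in the wrong place could stop at a local peak, while the grid search sees every region.

```python
import math

def grid_search(f, lo, hi, n):
    """Evaluate f on an evenly spaced grid of n points and return the best x."""
    grid = [lo + i * (hi - lo) / (n - 1) for i in range(n)]
    return max(grid, key=f)

f = lambda x: math.sin(3 * x) + 0.5 * math.sin(7 * x)   # several local peaks
best = grid_search(f, 0.0, 6.0, 1001)
```

The trade-off is resolution: the grid only locates the optimum to within one grid spacing, so in practice a grid search is often followed by a local refinement.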
Appendix: Testing EVT
To see if the Generalized Pareto Distribution really describes the tail of nearly all distributions we can look at the average of a random variable x above a threshold v (this is known as the Mean Excess function ME):

ME(v) = E(x̃ − v) where x̃ > v

For a Generalized Pareto distributed random variable this should be a linear (straight line) function of v:

ME(v) = (v + β)/(ξ − 1)
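The empirical Mean Excess is straightforward to compute. This Python sketch checks it against the linear theoretical value, using a synthetic GPD sample with an assumed shape of 3 (kept well above 1 so the sample mean excess is well-behaved):

```python
import random

def mean_excess(data, v):
    """Average amount by which observations exceed the threshold v."""
    excesses = [x - v for x in data if x > v]
    return sum(excesses) / len(excesses) if excesses else None

def gpd_mean_excess(v, xi, beta):
    # Theoretical ME for the lecture's GPD: linear in v (requires xi > 1)
    return (v + beta) / (xi - 1.0)

rng = random.Random(3)
xi, beta, u = 3.0, -900.0, 2000.0
sample = [(u + beta) * (1 - rng.random()) ** (-1 / xi) - beta
          for _ in range(200_000)]

empirical = mean_excess(sample, 3000.0)           # close to the theoretical value
theoretical = gpd_mean_excess(3000.0, xi, beta)   # = 1050.0
```

Evaluating `mean_excess` over a range of thresholds and plotting the results gives exactly the Mean Excess Plot used to choose the threshold.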
Mean Excess becomes Linear Above Threshold
[Figure: Mean Excess plotted against Threshold v, 0 to 25,000]
Above a certain threshold the mean excess becomes linear, suggesting the distribution is GPD above this point. The threshold should be set where the Mean Excess starts to become linear.