Active Portfolio Management
Lectures
Richard R. Lindsey
Portfolio Choice
Individual:
1. Strictly prefers more to less (strictly increasing utility
function)
2. Risk averse
Notation:
$w_0$: initial wealth
$r_f$: riskless interest rate
$\tilde{r}_j$: random return on j-th risky asset
$a_j$: dollar investment in j-th asset
$\tilde{w}$: uncertain end of period wealth
Portfolio Choice
$$\tilde{w} = \Big(w_0 - \sum_j a_j\Big)(1 + r_f) + \sum_j a_j (1 + \tilde{r}_j) = w_0 (1 + r_f) + \sum_j a_j (\tilde{r}_j - r_f)$$
$$\max_{\{a_j\}} E\Big[U\Big(w_0 (1 + r_f) + \sum_j a_j (\tilde{r}_j - r_f)\Big)\Big]$$
Portfolio Choice
F.O.C. $E[U'(\tilde{w})(\tilde{r}_j - r_f)] = 0 \quad \forall j$
S.O.C. $E[U''(\tilde{w})(\tilde{r}_j - r_f)^2] < 0 \quad \forall j$
$U'(\cdot) > 0$: more preferred to less
$U''(\cdot) < 0$: concave utility or risk averse
Portfolio Choice
Theorem: An individual who is risk averse and strictly
prefers more to less will invest in risky assets iff the expected
rate of return on at least one asset exceeds $r_f$.
Consider the case with a single risky asset
F.O.C. $E[U'(\tilde{w})(\tilde{r} - r_f)] = 0$
Portfolio Choice
Claim:
$a^* > 0$ iff $E[\tilde{r}] - r_f > 0$
$a^* = 0$ iff $E[\tilde{r}] - r_f = 0$
$a^* < 0$ iff $E[\tilde{r}] - r_f < 0$
Consider the no investment case
$$E[U'(w_0(1 + r_f))(\tilde{r} - r_f)] = U'(w_0(1 + r_f))(E[\tilde{r}] - r_f)$$
Portfolio Choice
$U'(\cdot) > 0$, so the sign is entirely determined by $E[\tilde{r}] - r_f$
$E[\tilde{r}] - r_f > 0$: can increase utility by adding some of the risky asset
$E[\tilde{r}] - r_f < 0$: can increase utility by shorting some of the risky asset
$E[\tilde{r}] - r_f = 0$: utility is maximized
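The claim can be checked numerically. The sketch below assumes log utility, a two-state risky return, and illustrative numbers (w0 = 100, rf = 2%); none of these come from the lecture, and the function name is hypothetical. It solves the first-order condition by bisection and reports a* for a positive, zero, and negative risk premium.

```python
def optimal_log_investment(w0, rf, outcomes, tol=1e-9):
    """Dollar amount a* solving E[U'(w)(r - rf)] = 0 for U(w) = ln(w).

    outcomes is a list of (return, probability) pairs with at least one
    return above rf and one below it; w = w0*(1 + rf) + a*(r - rf).
    """
    base = w0 * (1 + rf)
    # Restrict a so that end-of-period wealth stays positive in every state.
    lo = max(-base / (r - rf) for r, _ in outcomes if r > rf) * (1 - 1e-9)
    hi = min(-base / (r - rf) for r, _ in outcomes if r < rf) * (1 - 1e-9)

    def foc(a):
        return sum(p * (r - rf) / (base + a * (r - rf)) for r, p in outcomes)

    # The FOC is strictly decreasing in a (U is concave), so bisect on its sign.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rf = 0.02
a_pos = optimal_log_investment(100.0, rf, [(0.12, 0.5), (-0.04, 0.5)])   # E[r] > rf
a_zero = optimal_log_investment(100.0, rf, [(0.08, 0.5), (-0.04, 0.5)])  # E[r] = rf
a_neg = optimal_log_investment(100.0, rf, [(0.06, 0.5), (-0.04, 0.5)])   # E[r] < rf
print(a_pos, a_zero, a_neg)
```

In this setup the optimal position is long, flat, and short respectively, matching the three cases of the claim.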
Portfolio Choice
In the multi-asset case, to hold no risky assets or to short
them
$$E[U'(w_0(1 + r_f))(\tilde{r}_j - r_f)] \le 0 \quad \forall j$$
And again
$$U'(w_0(1 + r_f))(E[\tilde{r}_j] - r_f) \le 0 \quad \forall j$$
Therefore, a risk averse individual with strictly increasing
utility avoids any positive investment in risky assets only
if none of the investments have a positive risk premium.
$$a_j \le 0 \ \forall j \quad \text{only if} \quad E[\tilde{r}_j] - r_f \le 0 \ \forall j$$
Portfolio Choice
When one or more of the risky assets has a positive risk
premium, the investor will have positive holdings in some
risky assets
$$\exists j': a_{j'} > 0 \quad \text{if} \quad \exists j: E[\tilde{r}_j] - r_f > 0$$
Note that j and j´ are not necessarily the same because with
more than one risky asset, a positive risk premium on an
asset does not necessarily mean a positive investment (e.g.
2 assets with positive risk premium but one stochastically
dominates the other).
Risk Aversion
Consider now the case with one risky asset and one riskless
asset.
For a monotonically increasing strictly concave (MISC)
individual to invest all her wealth in the risky asset:
$$E[U'(w_0(1 + \tilde{r}))(\tilde{r} - r_f)] \ge 0$$
Take a 1st order Taylor series expansion around $U'(w_0(1 + r_f))$.
Risk Aversion
$$E[U'(w_0(1 + \tilde{r}))(\tilde{r} - r_f)] = U'(w_0(1 + r_f))\,E[\tilde{r} - r_f] + U''(w_0(1 + r_f))\,E[(\tilde{r} - r_f)^2]\,w_0 + o(E[(\tilde{r} - r_f)^2])$$
Note that this is for a small risk.
The minimum risk premium to induce full investment is
$$E[\tilde{r} - r_f] \ge -\frac{U''(w_0(1 + r_f))}{U'(w_0(1 + r_f))}\,w_0\,E[(\tilde{r} - r_f)^2] = R_A(w_0(1 + r_f))\,w_0\,E[(\tilde{r} - r_f)^2]$$
Risk Aversion
$R_A(z) = -U''(z)/U'(z)$ is known as the Arrow-Pratt measure of absolute risk
aversion (the inverse of $R_A$ is the risk tolerance).
For small risks (or small changes in risk) it is a measure of
the intensity of an individual’s aversion to risk.
It is a measure of curvature (but since von Neumann-Morgenstern
utility is unique only up to affine transformations,
the 2nd derivative alone is not sufficient).
Risk Aversion
Theorem:
$dR_A(z)/dz < 0 \ \forall z$: decreasing absolute risk aversion
$dR_A(z)/dz > 0 \ \forall z$: increasing absolute risk aversion
$dR_A(z)/dz = 0 \ \forall z$: constant absolute risk aversion
$da/dw_0 > 0 \ \forall w_0$ if $dR_A(z)/dz < 0 \ \forall z$
$da/dw_0 < 0 \ \forall w_0$ if $dR_A(z)/dz > 0 \ \forall z$
$da/dw_0 = 0 \ \forall w_0$ if $dR_A(z)/dz = 0 \ \forall z$
Risk Aversion
Decreasing absolute risk aversion implies that the risky asset
is a normal good (i.e. the dollar demand increases as
wealth increases).
Increasing absolute risk aversion implies that the risky asset
is an inferior good (i.e. the dollar demand decreases as
wealth increases).
Constant absolute risk aversion implies that the dollar
demand is invariant with respect to wealth.
Risk Aversion
Absolute risk aversion is therefore related to the dollar
demand for the risky asset.
But under decreasing absolute risk aversion, an individual
may actually increase, hold constant, or decrease the
proportion of wealth in the risky asset as wealth increases.
This brings us to the Arrow-Pratt measure of relative risk
aversion
$$R_R(z) = z\,R_A(z)$$
Risk Aversion
Theorem:
$\eta > 1$ if $dR_R(z)/dz < 0$ (relatively elastic)
$\eta = 1$ if $dR_R(z)/dz = 0$
$\eta < 1$ if $dR_R(z)/dz > 0$ (relatively inelastic)
Where
$$\eta = \frac{da}{dw_0}\,\frac{w_0}{a}$$
is the wealth elasticity of demand.
Risk Aversion
η<1: the proportion of agent’s initial wealth invested in the
risky asset decreases as wealth increases
η=1: the proportion of agent’s initial wealth invested in the
risky asset is constant as wealth increases
η>1: the proportion of agent’s initial wealth invested in the
risky asset increases as wealth increases
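A small numerical sketch of the wealth elasticity, under assumptions that are not in the lecture: a two-state risky return, rf = 2%, and two illustrative utilities (log, which has constant relative risk aversion, and negative exponential with coefficient 0.01, which has constant absolute risk aversion). The helper names are hypothetical. The optimal dollar demand is found by bisection on the first-order condition and η is estimated by a central finite difference.

```python
import math

RF = 0.02
OUTCOMES = [(0.12, 0.5), (-0.04, 0.5)]  # hypothetical (return, probability) pairs

def optimal_a(u_prime, w0, tol=1e-9):
    """Solve E[U'(w)(r - rf)] = 0 for a, with w = w0*(1 + rf) + a*(r - rf)."""
    base = w0 * (1 + RF)
    # Search only over positions that keep wealth positive in both states.
    lo = max(-base / (r - RF) for r, _ in OUTCOMES if r > RF) * (1 - 1e-9)
    hi = min(-base / (r - RF) for r, _ in OUTCOMES if r < RF) * (1 - 1e-9)

    def foc(a):
        return sum(p * u_prime(base + a * (r - RF)) * (r - RF) for r, p in OUTCOMES)

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def log_marginal(w):
    return 1.0 / w                  # U(w) = ln(w)

def cara_marginal(w):
    return math.exp(-0.01 * w)      # U(w) = -exp(-0.01 w), up to scaling

def elasticity(u_prime, w0, h=1e-3):
    """Wealth elasticity eta = (da/dw0) * (w0/a), by central differences."""
    a = optimal_a(u_prime, w0)
    da = (optimal_a(u_prime, w0 + h) - optimal_a(u_prime, w0 - h)) / (2 * h)
    return da * w0 / a

eta_log = elasticity(log_marginal, 100.0)
eta_cara = elasticity(cara_marginal, 100.0)
print(eta_log, eta_cara)
```

With these inputs η is approximately 1 for log utility (constant proportion of wealth in the risky asset) and approximately 0 for negative exponential utility (constant dollar demand).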
Linear Risk Tolerance Utility
To get sharper results and closed form solution for securities
holdings, we need to specify the form of the utility
function. Most typically we use a class of utility function
known as linear risk tolerance (LRT) utilities or HARA
utilities (hyperbolic absolute risk aversion). These utility
functions satisfy state independence and time additivity.
Linear Risk Tolerance Utility
Definition: Linear risk tolerance utility. The time additive
and state independent utility function U(·) satisfies linear
risk tolerance if it solves the differential equation:
$$-\frac{U'(z)}{U''(z)} = \phi + \beta z$$
Where φ and β are independent of z.
Note: every LRT utility function is identified by 2
parameters: the intercept φ and the slope β.
Linear Risk Tolerance Utility
This differential equation has three sets of solutions
depending on the value of β
(A) $\beta \neq 0, 1$: $U(z) \approx \frac{1}{\beta - 1}(\phi + \beta z)^{1 - 1/\beta}$, on the domain where $\phi + \beta z > 0$
(B) $\beta = 1$: $U(z) \approx \ln(\phi + z)$
(C) $\beta = 0$: $U(z) \approx -\phi \exp(-z/\phi)$ where $\phi > 0$
Where ≈ means that the solutions are unique up to a positive
linear transform.
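As a sanity check, the sketch below evaluates the risk tolerance −U′(z)/U″(z) by finite differences for one member of each class and compares it with φ + βz. The parameter values and function names are illustrative assumptions, not taken from the lecture.

```python
import math

def risk_tolerance(U, z, h=1e-3):
    """Approximate -U'(z)/U''(z) with central finite differences."""
    d1 = (U(z + h) - U(z - h)) / (2 * h)
    d2 = (U(z + h) - 2 * U(z) + U(z - h)) / (h * h)
    return -d1 / d2

def power_utility(phi, beta):
    # Class (A): requires beta not in {0, 1} and phi + beta*z > 0.
    return lambda z: (phi + beta * z) ** ((beta - 1) / beta) / (beta - 1)

def log_utility(phi):
    # Class (B): beta = 1.
    return lambda z: math.log(phi + z)

def neg_exp_utility(phi):
    # Class (C): beta = 0, phi > 0.
    return lambda z: -phi * math.exp(-z / phi)

z = 2.0
checks = [
    (risk_tolerance(power_utility(1.0, 0.5), z), 1.0 + 0.5 * z),
    (risk_tolerance(power_utility(10.0, -1.0), z), 10.0 - z),   # quadratic utility
    (risk_tolerance(log_utility(1.0), z), 1.0 + z),
    (risk_tolerance(neg_exp_utility(2.0), z), 2.0),
]
for numeric, linear in checks:
    print(round(numeric, 6), linear)
```

Each numeric risk tolerance should agree with the linear expression up to finite-difference error.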
Linear Risk Tolerance Utility
These three classes are:
(A) Generalized Power Utility (when $\phi \neq 0$)
$$R_A(z) = \frac{1}{\phi + \beta z}$$
$$\frac{dR_A(z)}{dz} = -\frac{\beta}{(\phi + \beta z)^2} < 0 \quad \text{for } \beta > 0$$
Linear Risk Tolerance Utility
$$R_R(z) = \frac{z}{\phi + \beta z}$$
$$\frac{dR_R(z)}{dz} = \frac{\phi}{(\phi + \beta z)^2}$$
Which is
$> 0$ iff $\phi > 0$
$= 0$ iff $\phi = 0$
$< 0$ iff $\phi < 0$
Recall from Risk Aversion
Theorem:
$\eta > 1$ if $dR_R(z)/dz < 0$ (relatively elastic)
$\eta = 1$ if $dR_R(z)/dz = 0$
$\eta < 1$ if $dR_R(z)/dz > 0$ (relatively inelastic)
Where $\eta = \frac{da}{dw_0}\,\frac{w_0}{a}$ is the wealth elasticity of demand.
Linear Risk Tolerance Utility
When φ = 0 we have power utility which is CPRA or
constant proportional (relative) risk aversion. Also known
as iso-elastic utility.
The proportion of wealth in the risky asset is invariant to
changes in wealth.
When β = -1 we have quadratic utility.
Linear Risk Tolerance Utility
(B) Generalized Log Utility (when $\phi \neq 0$)
$$R_A(z) = \frac{1}{\phi + z}$$
$$\frac{dR_A(z)}{dz} = -\frac{1}{(\phi + z)^2} < 0$$
Linear Risk Tolerance Utility
$$R_R(z) = \frac{z}{\phi + z}$$
$$\frac{dR_R(z)}{dz} = \frac{\phi}{(\phi + z)^2}$$
Which is
$> 0$ iff $\phi > 0$
$= 0$ iff $\phi = 0$
$< 0$ iff $\phi < 0$
Recall from Risk Aversion
Theorem:
$\eta > 1$ if $dR_R(z)/dz < 0$ (relatively elastic)
$\eta = 1$ if $dR_R(z)/dz = 0$
$\eta < 1$ if $dR_R(z)/dz > 0$ (relatively inelastic)
Where $\eta = \frac{da}{dw_0}\,\frac{w_0}{a}$ is the wealth elasticity of demand.
Linear Risk Tolerance Utility
When φ = 0 we have log utility which is CPRA or constant
proportional (relative) risk aversion. Also known as
iso-elastic utility.
The proportion of wealth in the risky asset is invariant to
changes in wealth.
Note when φ = 0 we have $R_R(z) = 1$.
Linear Risk Tolerance Utility
(C) Negative Exponential Utility
$$R_A(z) = \frac{1}{\phi}, \qquad \frac{dR_A(z)}{dz} = 0$$
Constant absolute risk aversion (CARA)
Dollar demand for risky assets is unaffected by changes in
wealth (riskless borrowing or lending absorbs all
changes).
Stochastic Dominance
Empirical observations and the corresponding properties of U(z):
Investors prefer more to less: $U'(z) > 0$
Investors are risk averse: $U''(z) < 0$
The risky asset is a normal good: $dR_A(z)/dz < 0$
We now want to relate these three properties of utility
functions to the properties of payoff distributions.
For example, one question we can ask is: Under what
circumstances can we unambiguously say that an
individual will prefer one risky asset to another if all
we know is that he prefers more to less?
Stochastic Dominance
We can answer questions like this using stochastic
dominance.
Note that stochastic dominance is:
1. Always a pairwise comparison.
2. Only a partial ordering among risky assets.
3. Much richer than what we will cover here (e.g. you can
develop much of modern portfolio theory just using
stochastic dominance).
Stochastic Dominance
Definition: First Order Stochastic Dominance
Let $F(x) = \Pr[X \le x]$, let $F_A(x)$ and $F_B(x)$ be different distributions, and let there exist $a$ such that $F(a) = 0$.
If $F_A(x) - F_B(x) \le 0 \ \forall x$, and $< 0$ for some $x$,
then $X_A$ FSD $X_B$.
Stochastic Dominance
Definition: Second Order Stochastic Dominance
If $\int_a^t [F_A(x) - F_B(x)]\,dx \le 0 \ \forall t$, and $< 0$ for some $t$,
and $E[X_A] \ge E[X_B]$,
then $X_A$ SSD $X_B$.
Stochastic Dominance
Definition: Third Order Stochastic Dominance
If $\int_a^y \int_a^t [F_A(x) - F_B(x)]\,dx\,dt \le 0 \ \forall y$, and $< 0$ for some $y$,
$E[X_A] \ge E[X_B]$ and $Var[X_A] \le Var[X_B]$,
then $X_A$ TSD $X_B$.
Stochastic Dominance
Theorem: $X_A$ FSD $X_B$ ⇒ $X_A$ SSD $X_B$ ⇒ $X_A$ TSD $X_B$ (these are
progressively weaker tests).
Theorem: $E[U(X_A)] > E[U(X_B)]$ for all U(·) (that are finite
for all finite x) such that $U'(x) > 0$ everywhere iff $X_A$ FSD
$X_B$ (i.e. prefers more to less).
Theorem: $E[U(X_A)] > E[U(X_B)]$ for all U(·) (that are finite
for all finite x) such that $U'(x) > 0$ and $U''(x) < 0$
everywhere iff $X_A$ SSD $X_B$ (i.e. risk averse).
Stochastic Dominance
Theorem: $E[U(X_A)] > E[U(X_B)]$ for all U(·) (that are finite
for all finite x) such that $U'(x) > 0$, $U''(x) < 0$ and $U'''(x) > 0$
everywhere iff $X_A$ TSD $X_B$.
Theorem: $E[U(X_A)] > E[U(X_B)]$ for all U(·) (that are finite
for all finite x) such that $U'(x) > 0$, $U''(x) < 0$ and $R_A'(x) < 0$
everywhere iff $X_A$ TSD $X_B$ (i.e. risky asset is a normal
good).
Stochastic Dominance
Theorem: The following three statements are equivalent:
1. A FSD B
2. $F_A(x) \le F_B(x)$ for all x
3. $\tilde{x}_A = \tilde{x}_B + \tilde{\alpha}$ where $\tilde{\alpha} \ge 0$
Theorem: The following three statements are equivalent:
1. A SSD B
2. $E[\tilde{x}_A] = E[\tilde{x}_B]$ and $\int_a^t [F_A(x) - F_B(x)]\,dx \le 0 \ \forall t$, and $< 0$ for some $t$
3. $\tilde{x}_B = \tilde{x}_A + \tilde{\varepsilon}$ where $E[\tilde{\varepsilon} \mid \tilde{x}_A] = 0$
Stochastic Dominance
Let’s consider an example
$$X_1 = \begin{cases} 1 & \text{with probability } 0.25 \\ 4 & \text{with probability } 0.75 \end{cases} \qquad X_2 = \begin{cases} 2 & \text{with probability } 0.50 \\ 4 & \text{with probability } 0.25 \\ 5 & \text{with probability } 0.25 \end{cases}$$
$$E[X_1] = 3.25, \quad Var[X_1] = 1.6875$$
$$E[X_2] = 3.25, \quad Var[X_2] = 1.6875$$
Which investment do we choose?
Stochastic Dominance
[Figure: cumulative distribution functions of X1 and X2, for x from 0 to 6.]
Stochastic Dominance
Cannot have FSD because the cumulative distribution
functions cross.
No SSD because both distribution functions are admissible.
Definition: A distribution is admissible or efficient with
respect to a set of distribution functions, S, if it is not
dominated by a member of S.
Stochastic Dominance
[Figure: top panel, cumulative distribution functions of X1 and X2 for x from 0 to 6; bottom panel, g(t) for t from 0 to 6, ranging between -0.3 and 0.3.]
Stochastic Dominance
$X_2$ TSD $X_1$, so we would choose $X_2$.
Note that this choice reflects a preference for skewness.
If you must take a risky gamble, do you prefer to take it
when wealth is high or low?
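The example can be verified with exact arithmetic. The sketch below is one possible implementation, not the lecture's: it takes the two discrete distributions from the example, builds D = F_A − F_B (piecewise constant), its integral g (piecewise linear), and its double integral G (piecewise quadratic), and then tests the first, second, and third order conditions. The function names are hypothetical, and the third order test here checks only the integral and mean conditions.

```python
from fractions import Fraction as Fr

X1 = [(Fr(1), Fr(1, 4)), (Fr(4), Fr(3, 4))]                   # (value, probability)
X2 = [(Fr(2), Fr(1, 2)), (Fr(4), Fr(1, 4)), (Fr(5), Fr(1, 4))]

def cdf(dist, x):
    return sum((p for v, p in dist if v <= x), Fr(0))

def mean(dist):
    return sum(v * p for v, p in dist)

def central_moment(dist, k):
    m = mean(dist)
    return sum(p * (v - m) ** k for v, p in dist)

def dominance_profile(a, b):
    """Values of D = F_A - F_B, its integral g, and its double integral G."""
    pts = sorted({v for v, _ in a} | {v for v, _ in b})
    d_vals, g_vals, G_vals = [], [Fr(0)], [Fr(0)]
    g = G = Fr(0)
    for lo, hi in zip(pts, pts[1:]):
        d = cdf(a, lo) - cdf(b, lo)        # D is constant on [lo, hi)
        h = hi - lo
        d_vals.append(d)
        if d != 0 and 0 < -g / d < h:      # g changes sign inside: extremum of G
            s = -g / d
            G_vals.append(G + g * s + d * s * s / 2)
        G += g * h + d * h * h / 2
        g += d * h
        g_vals.append(g)
        G_vals.append(G)
    return d_vals, g_vals, G_vals

def dominates(a, b, order):
    """True if a dominates b at the given order (1 = FSD, 2 = SSD, 3 = TSD)."""
    vals = dominance_profile(a, b)[order - 1]
    mean_ok = order == 1 or mean(a) >= mean(b)
    return max(vals) <= 0 and min(vals) < 0 and mean_ok

for order in (1, 2, 3):
    print(order, dominates(X1, X2, order), dominates(X2, X1, order))
print(central_moment(X1, 3), central_moment(X2, 3))
```

Neither distribution dominates at first or second order, while X2 dominates X1 at third order. The third central moments (negative for X1, positive for X2) illustrate the preference for skewness.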
Riskiness of Distributions
This is a partial ordering of distributions.
Definition: Distribution Y is more risky than distribution X
if:
1. Y=X+Z where E[Z|X]=0 and non-degenerate.
2. Y is obtained from X by the addition of a mean
preserving spread.
3. X is preferred to Y by all risk averters providing
E[X]=E[Y].
4. Var[Y] > Var[X] provided E[X]=E[Y].
Riskiness of Distributions
Theorem: The partial orderings given by 1, 2, and 3 are
equivalent.
Theorem: The partial orderings given by 1, 2, 3, and 4 are
equivalent for normal distributions. (Reason: normals are
stable under addition if variances are finite.)
Bibliography
Huang, Chi-fu, and Robert Litzenberger, Foundations for
Financial Economics, North-Holland.
Levy, Haim, Stochastic Dominance: Investment Decision
Making under Uncertainty, Springer.
Ohlson, James, The Theory of Financial Markets and
Information, North-Holland.
Rothschild, M. and J. E. Stiglitz (1970). "Increasing Risk: I.
A Definition." Journal of Economic Theory 2: 225-43.
Optimization: Definitions
Our optimization problems will take the form:
$$\max_x f(x) \ \text{subject to} \ x \in S$$
Where f is a function, x is an n-vector and S is a set of n-vectors.
We call f the objective function, x the choice
variable or control variable, and S the constraint set or
opportunity set.
Optimization: Definitions
Definition: The value x* of the variable x solves the problem
$$\max_x f(x) \ \text{subject to} \ x \in S$$
if
$$f(x^*) \ge f(x) \quad \forall x \in S$$
In this case, we say that x* is a maximizer of the function f subject to the constraint x ∈ S, and that f(x*) is the maximum (or maximum value) of the function f subject to the constraint.
Optimization: Definitions
A minimizer is defined analogously.
[Figure: graph of a function with labelled points. x1 is a local maximizer, x2 is a minimizer, x3 is a maximizer; x4 is a ? and x5 is a ?]
Optimization: Definitions
Note that we can transform the objective function f with any strictly increasing function g. In other words, the set of solutions to the problem:
$$\max_x f(x) \ \text{subject to} \ x \in S$$
is identical to the set of solutions to the problem:
$$\max_x g(f(x)) \ \text{subject to} \ x \in S$$
This fact is sometimes useful since it may be easier to work with a transform of the objective function rather than the original function.
Optimization: Definitions
Minimization problems are just the maximization of the
negative of the objective function
$$\min_x f(x) \ \text{subject to} \ x \in S$$
has the same set of solutions as
$$\max_x -f(x) \ \text{subject to} \ x \in S$$
Optimization: Definitions
Note that a continuous function on a compact set (closed
and bounded) attains both a minimum and a maximum on
that set (this is the Extreme Value Theorem). This is a
sufficient condition for a maximum (and a minimum) to
exist.
Interior Optimum: One Variable
Proposition: (FOC) Let f be a differentiable function of a single variable defined on the interval I. If a point x* in the interior of I is a local or global maximizer or minimizer of f then f '(x*) = 0 (i.e. it is stationary).
Proposition: (SOC) Let f be a function of a single variable with continuous first and second derivatives, defined on the interval I. Suppose that x* is a stationary point of f in the interior of I (so that f '(x*) = 0).
1. If f "(x*) < 0 then x* is a local maximizer.
2. If x* is a local maximizer then f "(x*) ≤ 0.
3. If f "(x*) > 0 then x* is a local minimizer.
4. If x* is a local minimizer then f "(x*) ≥ 0.
Note: These are necessary conditions.
Interior Optimum: Many Variables
Proposition: (FOC) Let f be a differentiable function of n variables defined on the set S. If the point x in the interior of S is a local or global maximizer or minimizer of f then $f_i'(x) = 0$ for i = 1, ..., n (i.e. it is stationary).
Proposition (SOC) Let f be a function of n variables with continuous partial derivatives of first and second order, defined on the set S. Suppose that x* is a stationary point of f in the interior of S (so that f i'(x*) = 0 for all i).
1. If H(x*) is negative definite then x* is a local maximizer.
2. If x* is a local maximizer then H(x*) is negative semidefinite.
3. If H(x*) is positive definite then x* is a local minimizer.
4. If x* is a local minimizer then H(x*) is positive semidefinite.
Note: These are necessary conditions.
Interior Optimum: Many Variables
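Where H is the Hessian matrix
$$H = \begin{pmatrix} \dfrac{\partial^2 f}{\partial x_1 \partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_n \partial x_n} \end{pmatrix}$$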
Interior Optimum: Many Variables
An implication of this result is that if x* is a stationary point of f then
1. if H(x*) is negative definite then x* is a local maximizer
2. if H(x*) is negative semidefinite, but neither negative definite nor positive semidefinite, then x* is not a local minimizer, but might be a local maximizer
3. if H(x*) is positive definite then x* is a local minimizer
4. if H(x*) is positive semidefinite, but neither positive definite nor negative semidefinite, then x* is not a local maximizer, but might be a local minimizer
5. if H(x*) is neither positive semidefinite nor negative semidefinite then x* is neither a local maximizer nor a local minimizer.
A stationary point which is neither a maximizer nor a minimizer is called a saddle point (note that not all saddle points look like a saddle; for example, every point (0, y) is a saddle point of the function $f(x, y) = x^3$).
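For two variables, the classification can be sketched directly from the eigenvalues of the symmetric 2x2 Hessian. The function name, the tolerance, and the test functions below are illustrative assumptions. Semidefinite cases are reported as inconclusive, because the second-order test alone cannot settle them.

```python
import math

def classify_stationary_point(fxx, fxy, fyy, tol=1e-12):
    """Classify a stationary point from the entries of its 2x2 Hessian."""
    centre = 0.5 * (fxx + fyy)
    radius = math.hypot(0.5 * (fxx - fyy), fxy)
    low, high = centre - radius, centre + radius   # the two eigenvalues
    if high < -tol:
        return "local maximizer"    # negative definite
    if low > tol:
        return "local minimizer"    # positive definite
    if low < -tol and high > tol:
        return "saddle point"       # indefinite
    return "inconclusive"           # semidefinite: second-order test is silent

print(classify_stationary_point(-2.0, 0.0, -2.0))  # f = -(x^2 + y^2) at (0, 0)
print(classify_stationary_point(2.0, 0.0, -2.0))   # f = x^2 - y^2 at (0, 0)
print(classify_stationary_point(0.0, 0.0, 0.0))    # f = x^3 at (0, y)
```

The last case shows why the Hessian is not always enough: f(x, y) = x^3 has a zero Hessian at every point (0, y), so identifying those points as saddle points requires looking at the function itself.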
Global Optimum: One Variable
Proposition: Let f be a differentiable function defined on
the interval I, and let x be in the interior of I. Then:
1. if f is concave then x is a global maximizer of f in I if and only if x
is a stationary point of f
2. if f is convex then x is a global minimizer of f in I if and only if x
is a stationary point of f .
So if f is twice differentiable:
1. f "(z) ≤ 0 for all z ∈ I ⇒ [x is a global maximizer of f in I if and
only if f '(x) = 0]
2. f "(z) ≥ 0 for all z ∈ I ⇒ [x is a global minimizer of f in I if and only
if f '(x) = 0].
Global Optimum: Many Variables
Proposition: Suppose that the function f has continuous
partial derivatives in a convex set S and let x be in the
interior of S. Then:
1. if f is concave then x is a global maximizer of f in S if and only if
it is a stationary point of f .
2. if f is convex then x is a global minimizer of f in S if and only if it
is a stationary point of f .
So if f is twice differentiable:
1. H(z) is negative semidefinite for all z ∈ S ⇒ [x is a global maximizer
of f in S if and only if x is a stationary point of f ].
2. H(z) is positive semidefinite for all z ∈ S ⇒ [x is a global minimizer
of f in S if and only if x is a stationary point of f ].
Global Optimum: Many Variables
Note the difference between this and the local optima:
Sufficient conditions for local maximizer: if x* is a
stationary point of f and the Hessian of f is negative
definite at x* then x* is a local maximizer of f.
Sufficient conditions for global maximizer: if x* is a
stationary point of f and the Hessian of f is negative
semidefinite for all values of x then x* is a global
maximizer of f.
Constrained Optimization: Equality
Usually it is not enough to consider solutions which
maximize (or minimize) a particular function (e.g. Diet
Coke can).
Instead, we want to find a solution which is subject to fixed,
outside constraints.
To solve these problems, we can use Lagrange multipliers.
Constrained Optimization: Equality
Suppose that Monique and Carl are going swimming in the river, and they see
each other in a field bounded by the river. Since it is such a hot day,
they want to jump in the river as quickly as possible, but they want to
do it together. At what point (P) on the riverbank should they meet?
Constrained Optimization: Equality
In mathematical terms, if d(M,P) is the distance between M
and P, they must solve the problem:
$$\min_P f(P) = d(M,P) + d(P,C)$$
Subject to the constraint:
$$g(P) = 0$$
Constrained Optimization: Equality
We can solve this graphically if we recall that ellipses are
curves of constant f(P) (i.e. for every point P on an
ellipse, the total distance from one focus of the
ellipse to P and then to the other focus is the same).
So we need to find an ellipse (with C and M as
the foci) which is tangent to the riverbank.
Constrained Optimization: Equality
Or, mathematically, the normal vector to the ellipse must
point in the same direction as the normal vector to the
river.
Constrained Optimization: Equality
Recall that the gradient of a function f (which is written $\nabla f$)
is a normal vector to a level curve of f (in two dimensions) or a
level surface (in higher dimensions). The length of the normal
vector doesn’t matter; any constant multiple of the
gradient is also a normal vector. In our case, we have two
functions whose normal vectors are parallel, so:
$$\nabla f(P) = \lambda \nabla g(P)$$
The unknown multiplier λ is necessary because the
magnitudes of the two gradients may be different.
Constrained Optimization: Equality
Alternatively, we can approach the problem by considering
the optimization problem and combining it with the
constraint to form a new function called the Lagrangian or
Lagrangian function:
$$\min_{P,\lambda} \mathcal{L}(P, \lambda) = \min_{P,\lambda} \, f(P) - \lambda g(P)$$
and then we set:
$$\nabla \mathcal{L}(P, \lambda) = 0$$
Constrained Optimization: Equality
Proposition: Let f and g be continuously differentiable
functions of two variables defined on the set S, let c be a
number, and suppose that (x*, y*) is an interior point of S
that solves the problem
$$\max_{x,y} f(x, y) \ \text{subject to} \ g(x, y) = c$$
Suppose also that either
$$\frac{\partial g(x^*, y^*)}{\partial x} \neq 0 \quad \text{or} \quad \frac{\partial g(x^*, y^*)}{\partial y} \neq 0$$
Constrained Optimization: Equality
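Then there is a unique number λ such that (x*, y*) is a
stationary point of the Lagrangian
$$\mathcal{L}(x, y) = f(x, y) - \lambda (g(x, y) - c)$$
That is, (x*, y*) satisfies the FOC
$$\frac{\partial \mathcal{L}(x^*, y^*)}{\partial x} = \frac{\partial f(x^*, y^*)}{\partial x} - \lambda \frac{\partial g(x^*, y^*)}{\partial x} = 0$$
$$\frac{\partial \mathcal{L}(x^*, y^*)}{\partial y} = \frac{\partial f(x^*, y^*)}{\partial y} - \lambda \frac{\partial g(x^*, y^*)}{\partial y} = 0$$
and the constraint
$$g(x^*, y^*) = c$$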
Constrained Optimization: Equality
Algorithm for solving a two-variable maximization problem with an equality constraint.
Let f and g be continuously differentiable functions of two variables defined on a set S and let c be a number. If the problem
$$\max_{x,y} f(x, y) \ \text{subject to} \ g(x, y) = c$$
has a solution, it may be found as follows.
A) Find all the values of (x, y, λ) in which
1. (x, y) is an interior point of S
2. (x, y, λ) satisfies the FOC and the constraint.
B) Find all the points (x, y) that satisfy $g_1'(x, y) = 0$, $g_2'(x, y) = 0$, and g(x, y) = c. (For most problems, there are no such values of (x, y). In particular, if g is linear there are no such values of (x, y).)
C) If the set S has any boundary points, find all the points that solve the problem $\max_{x,y} f(x, y)$ subject to the two conditions g(x, y) = c and (x, y) is a boundary point of S.
D) The points (x, y) you have found at which f(x, y) is largest are the maximizers of f.
Constrained Optimization: Equality
Example: Consider the problem
$$\max_{x,y} xy \ \text{subject to} \ x + y = 6$$
(Note that the objective function xy is defined on the set of
all 2-vectors, which has no boundary. The constraint set is
not bounded, so the extreme value theorem does
not imply that this problem has a solution.)
The Lagrangian is
$$\mathcal{L}(x, y) = xy - \lambda (x + y - 6)$$
Constrained Optimization: Equality
The FOC are
$$\frac{\partial \mathcal{L}}{\partial x} = y - \lambda = 0$$
$$\frac{\partial \mathcal{L}}{\partial y} = x - \lambda = 0$$
And the constraint
$$x + y = 6$$
These equations have a unique solution, (x, y, λ) = (3, 3, 3). We
have $g_1'(x, y) = 1 \neq 0$ and $g_2'(x, y) = 1 \neq 0$ for all (x, y), so we
conclude that if the problem has a solution it is (x, y) = (3, 3).▄
Constrained Optimization: Equality
Example: Consider the problem
$$\max_{x,y} x^2 y \ \text{subject to} \ 2x^2 + y^2 = 3$$
(Note that the constraint set is compact and the objective
function is continuous, so the extreme value theorem
implies that this problem has a solution.)
The Lagrangian is
$$\mathcal{L}(x, y) = x^2 y - \lambda (2x^2 + y^2 - 3)$$
Constrained Optimization: Equality
The FOC are
$$\frac{\partial \mathcal{L}}{\partial x} = 2xy - 4\lambda x = 2x(y - 2\lambda) = 0$$
$$\frac{\partial \mathcal{L}}{\partial y} = x^2 - 2\lambda y = 0$$
And the constraint
$$2x^2 + y^2 - 3 = 0$$
(Note that the constraint could also be considered the FOC
for the Lagrangian with respect to λ, the Lagrange
multiplier.)
Constrained Optimization: Equality
To find the solutions of these three equations, first note that
from the first equation we have either x = 0 or y = 2λ. We
can check each possibility in turn.
x = 0: we have $y = 3^{1/2}$ and λ = 0, or $y = -3^{1/2}$ and λ = 0.
y = 2λ: we have $x^2 = y^2$ from the second equation, so either x =
1 or x = −1 from the third equation.
x = 1: either y = 1 and λ = 1/2, or y = −1 and λ = −1/2.
x = −1: either y = 1 and λ = 1/2, or y = −1 and λ = −1/2.
Constrained Optimization: Equality
So, the FOC have six solutions:
1. (x, y, λ) = $(0, 3^{1/2}, 0)$, with f(x, y) = 0.
2. (x, y, λ) = $(0, -3^{1/2}, 0)$, with f(x, y) = 0.
3. (x, y, λ) = (1, 1, 1/2), with f(x, y) = 1.
4. (x, y, λ) = (1, −1, −1/2), with f(x, y) = −1.
5. (x, y, λ) = (−1, 1, 1/2), with f(x, y) = 1.
6. (x, y, λ) = (−1, −1, −1/2), with f(x, y) = −1.
Now, $g_1'(x, y) = 4x$ and $g_2'(x, y) = 2y$, so the only value of (x, y) for which $g_1'(x, y) = 0$ and $g_2'(x, y) = 0$ is (x, y) = (0, 0). At this point the constraint is not satisfied, so the only possible solutions of the problem are the solutions of the first-order conditions.
We conclude that the problem has two solutions, (x, y) = (1, 1) and (x, y) = (−1, 1).▄
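The six candidates can be checked mechanically. The sketch below (helper names are illustrative) plugs each triple into the two first-order conditions and the constraint for this example, then picks the candidates with the largest objective value.

```python
import math

def foc_residuals(x, y, lam):
    """Residuals of dL/dx = 0, dL/dy = 0 and the constraint 2x^2 + y^2 = 3."""
    return (2 * x * y - 4 * lam * x, x * x - 2 * lam * y, 2 * x * x + y * y - 3)

def objective(x, y):
    return x * x * y

s3 = math.sqrt(3)
candidates = [(0, s3, 0), (0, -s3, 0), (1, 1, 0.5),
              (1, -1, -0.5), (-1, 1, 0.5), (-1, -1, -0.5)]

worst_residual = max(abs(r) for c in candidates for r in foc_residuals(*c))
best_value = max(objective(x, y) for x, y, _ in candidates)
maximizers = [(x, y) for x, y, _ in candidates if objective(x, y) == best_value]
print(worst_residual, best_value, maximizers)
```

All residuals vanish up to rounding, and the largest objective value, 1, is attained at (1, 1) and (−1, 1).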
Constrained Optimization: Equality
2/3/2009Richard R. Lindsey134
Consider the problem
$$\max_{x,y} f(x, y) \ \text{subject to} \ g(x, y) = c$$
and suppose we solve the problem for various values of c.
Let the solution be (x*(c), y*(c)) with a Lagrange
multiplier of λ*(c). Assume that the functions x*, y*, and
λ* are differentiable and that $g_1'(x^*(c), y^*(c)) \neq 0$ or
$g_2'(x^*(c), y^*(c)) \neq 0$, so that the first-order conditions are
satisfied. Let f*(c) = f(x*(c), y*(c)).
Constrained Optimization: Equality
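Differentiate f*(c) with respect to c:
$$\frac{\partial f^*(c)}{\partial c} = \frac{\partial f(x^*(c), y^*(c))}{\partial x}\frac{\partial x^*(c)}{\partial c} + \frac{\partial f(x^*(c), y^*(c))}{\partial y}\frac{\partial y^*(c)}{\partial c}$$
$$= \lambda^*(c)\left[\frac{\partial g(x^*(c), y^*(c))}{\partial x}\frac{\partial x^*(c)}{\partial c} + \frac{\partial g(x^*(c), y^*(c))}{\partial y}\frac{\partial y^*(c)}{\partial c}\right]$$
(using the FOC). Note, however, that g(x*(c), y*(c)) = c for
all c, so the derivatives of each side of this equality are the
same for all c. That is
$$\frac{\partial g(x^*(c), y^*(c))}{\partial x}\frac{\partial x^*(c)}{\partial c} + \frac{\partial g(x^*(c), y^*(c))}{\partial y}\frac{\partial y^*(c)}{\partial c} = 1$$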
Constrained Optimization: Equality
Therefore
$$\frac{\partial f^*(c)}{\partial c} = \lambda^*(c)$$
Or, in words: the value of the Lagrange multiplier at the solution of the problem is equal to the rate of change in the maximal value of the objective function as the constraint is relaxed.
(Note that this follows directly from our use of the gradient earlier.)
So, in a utility maximization problem, the optimal value of the Lagrange multiplier measures the marginal utility of relaxing the constraint (the shadow price of the constrained resource).
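A quick numerical illustration, using the earlier example of maximizing xy subject to x + y = c (the lecture solves it for c = 6; generalizing to arbitrary c is an assumption made here). The FOC give x = y = λ = c/2, and the sketch compares λ with a finite-difference estimate of df*/dc.

```python
def solve(c):
    """FOC solution of max xy subject to x + y = c: y = lam, x = lam, x + y = c."""
    lam = c / 2
    return lam, lam, lam          # (x*, y*, lambda*)

def f_star(c):
    x, y, _ = solve(c)
    return x * y                  # maximal value of the objective

c, h = 6.0, 1e-6
slope = (f_star(c + h) - f_star(c - h)) / (2 * h)   # numerical df*/dc
multiplier = solve(c)[2]
print(slope, multiplier)
```

Both numbers are approximately 3: relaxing the constraint by a small amount dc raises the maximal value of xy by about 3·dc.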
Constrained Optimization: Equality
Sufficient conditions for a local optimum with two variables.
Consider the problem
$$\max_{x,y} f(x, y) \ \text{subject to} \ g(x, y) = c$$
Suppose (x*, y*) and λ* satisfy the FOC:
$$\frac{\partial f(x^*, y^*)}{\partial x} - \lambda^* \frac{\partial g(x^*, y^*)}{\partial x} = 0$$
$$\frac{\partial f(x^*, y^*)}{\partial y} - \lambda^* \frac{\partial g(x^*, y^*)}{\partial y} = 0$$
And the constraint
$$g(x^*, y^*) = c$$
Constrained Optimization: Equality
Then
If D(x*, y*, λ*) > 0 then (x*, y*) is a local maximizer
of f subject to the constraint g(x, y) = c.
If D(x*, y*, λ*) < 0 then (x*, y*) is a local minimizer
of f subject to the constraint g(x, y) = c.
Where D(x*, y*, λ*) is the determinant of the bordered
Hessian of the Lagrangian.
Constrained Optimization: Equality
$$D(x^*, y^*, \lambda^*) = \begin{vmatrix} 0 & g_x & g_y \\ g_x & f_{xx} - \lambda^* g_{xx} & f_{xy} - \lambda^* g_{xy} \\ g_y & f_{yx} - \lambda^* g_{yx} & f_{yy} - \lambda^* g_{yy} \end{vmatrix}$$
where subscripts denote partial derivatives evaluated at (x*, y*).
Constrained Optimization: Equality
Example: Consider again the problem
$$\max_{x,y} x^2 y \ \text{subject to} \ 2x^2 + y^2 = 3$$
We previously found that there are six solutions to the FOC
1. (x, y, λ) = $(0, 3^{1/2}, 0)$, with f(x, y) = 0.
2. (x, y, λ) = $(0, -3^{1/2}, 0)$, with f(x, y) = 0.
3. (x, y, λ) = (1, 1, 1/2), with f(x, y) = 1.
4. (x, y, λ) = (1, −1, −1/2), with f(x, y) = −1.
5. (x, y, λ) = (−1, 1, 1/2), with f(x, y) = 1.
6. (x, y, λ) = (−1, −1, −1/2), with f(x, y) = −1.
Constrained Optimization: Equality
Further, we found that solutions 3 and 5 are global
maximizers and solutions 4 and 6 are global minimizers.
The two remaining solutions of the FOC, $(0, 3^{1/2})$ and
$(0, -3^{1/2})$, are neither global maximizers nor global
minimizers. Are they local maximizers or local
minimizers?
Constrained Optimization: Equality
The determinant of the bordered Hessian of the Lagrangian
is
$$D(x, y, \lambda) = \begin{vmatrix} 0 & 4x & 2y \\ 4x & 2y - 4\lambda & 2x \\ 2y & 2x & -2\lambda \end{vmatrix}$$
The determinant is
$$-4x(-8x\lambda - 4xy) + 2y(8x^2 - 2y(2y - 4\lambda)) = 8(2\lambda(2x^2 + y^2) + y(4x^2 - y^2)) = 8(6\lambda + y(4x^2 - y^2))$$
Constrained Optimization: Equality
(since $2x^2 + y^2 = 3$ at each solution, from the constraint). The
value of the determinant at the two solutions is
$(0, 3^{1/2}, 0)$: $-8 \cdot 3^{3/2}$, so $(0, 3^{1/2})$ is a local minimizer;
$(0, -3^{1/2}, 0)$: $8 \cdot 3^{3/2}$, so $(0, -3^{1/2})$ is a local maximizer. ▄
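The determinant values can be reproduced with a direct cofactor expansion. This is a minimal sketch for this specific example (f = x²y, g = 2x² + y²); the function name is illustrative.

```python
import math

def bordered_hessian_det(x, y, lam):
    """Determinant of the bordered Hessian for L = x^2*y - lam*(2x^2 + y^2 - 3)."""
    g_x, g_y = 4 * x, 2 * y
    l_xx, l_xy, l_yy = 2 * y - 4 * lam, 2 * x, -2 * lam
    m = [[0, g_x, g_y],
         [g_x, l_xx, l_xy],
         [g_y, l_xy, l_yy]]
    # Cofactor expansion along the first row.
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

s3 = math.sqrt(3)
d_plus = bordered_hessian_det(0, s3, 0)     # at (0, 3^(1/2), 0)
d_minus = bordered_hessian_det(0, -s3, 0)   # at (0, -3^(1/2), 0)
print(d_plus, d_minus, bordered_hessian_det(1, 1, 0.5), bordered_hessian_det(1, -1, -0.5))
```

The signs match the text: negative at (0, 3^(1/2)) (local minimizer) and positive at (0, −3^(1/2)) (local maximizer). The same function gives a positive determinant at (1, 1) and a negative one at (1, −1), consistent with those points being a maximizer and a minimizer.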
Constrained Optimization: Equality
Proposition: Suppose that f and g are continuously differentiable functions defined on an open convex subset S of two-dimensional space and suppose that there exists a number λ* such that (x*, y*) is an interior point of S that is a stationary point of the Lagrangian
$$\mathcal{L}(x, y) = f(x, y) - \lambda^* (g(x, y) - c)$$
Suppose further that g(x*, y*) = c.
Then
if L is concave (in particular if f is concave and λ*g is convex) then (x*, y*) solves the problem $\max_{x,y} f(x, y)$ subject to g(x, y) = c;
if L is convex (in particular if f is convex and λ*g is concave) then (x*, y*) solves the problem $\min_{x,y} f(x, y)$ subject to g(x, y) = c.
Envelope Theorem
Often we are interested in how the maximal value of a
function depends on its parameters.
Consider the unconstrained maximization problem:
$$\max_x f(x, a)$$
Assume that for any a the problem has a unique solution;
denote this solution x*(a). Denote the maximum value
of f, for any given value of a, by M*(a): M*(a)
= f(x*(a), a). We call M* the value function.
Envelope Theorem
Taking the derivative of M* using the chain rule
$$\frac{dM^*(a)}{da} = \frac{\partial f(x^*(a), a)}{\partial x}\frac{dx^*(a)}{da} + \frac{\partial f(x^*(a), a)}{\partial a}$$
The first term is the indirect effect of how changing a affects the optimal
choice of x and how that change in x affects the value of f. The second term
is the direct effect of how changing a changes f holding x fixed at x*(a). This
expression can be simplified by noticing that since x*(a) is the optimal
choice for x at each value of a,
$$\frac{\partial f(x^*(a), a)}{\partial x} = 0$$
Envelope Theorem
This means
$$\frac{dM^*(a)}{da} = \frac{\partial f(x^*(a), a)}{\partial a}$$
Or the change in the objective function adjusting optimally
is equal to the change in the objective function when one
doesn’t adjust x.
In other words, the total derivative of f(x*(a), a) with respect
to a is equal to the partial derivative of f(x, a) with
respect to a, evaluated at the optimal choice of x.
This is known as the Envelope Theorem.
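A minimal numerical check of the envelope theorem, using an illustrative objective that is not from the lecture: f(x, a) = a·x − x² + a, whose first-order condition gives x*(a) = a/2. The total derivative of the value function is compared with the partial derivative of f with respect to a, holding x fixed at x*(a).

```python
def f(x, a):
    return a * x - x * x + a     # hypothetical objective, concave in x

def x_star(a):
    return a / 2                 # from the FOC: df/dx = a - 2x = 0

def value(a):
    return f(x_star(a), a)       # value function M*(a)

a, h = 3.0, 1e-6
total = (value(a + h) - value(a - h)) / (2 * h)            # dM*/da, x re-optimized
xs = x_star(a)
partial = (f(xs, a + h) - f(xs, a - h)) / (2 * h)          # df/da, x held at x*(a)
print(total, partial)
```

Both derivatives are approximately 2.5 (that is, x*(a) + 1 at a = 3), so re-optimizing x contributes nothing to the first-order change in the value function.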
Envelope Theorem
Note that to compute the effect of changing a on x*(a), we
differentiate the FOC
$$\frac{\partial f(x^*(a), a)}{\partial x} = 0$$
with respect to a:
$$\frac{\partial^2 f(x^*(a), a)}{\partial x^2}\frac{dx^*(a)}{da} + \frac{\partial^2 f(x^*(a), a)}{\partial x \, \partial a} = 0$$
Envelope Theorem
$$\frac{dx^*(a)}{da} = -\frac{\partial^2 f(x^*(a), a) / \partial x \, \partial a}{\partial^2 f(x^*(a), a) / \partial x^2}$$
The sign of the denominator is negative by the SOC,
therefore the sign of the expression is determined by the
sign of the mixed partial in the numerator.
Envelope Theorem
Now consider
$$\max_x f(x, y) \ \text{subject to} \ g(x, y) = 0$$
Then the Lagrangian is
$$\mathcal{L}(x, y) = f(x, y) - \lambda g(x, y)$$
The envelope theorem states
$$\frac{\partial \mathcal{L}(x^*(y), y)}{\partial y} = \frac{\partial f(x^*(y), y)}{\partial y} - \lambda^* \frac{\partial g(x^*(y), y)}{\partial y}$$
Again, we only have to take into account the change in y, not the associated change in x.
Example: Consider a utility maximization problem: $\max_x U(x)$
subject to p·x = w, where x is a vector (a bundle of
goods), p is the price vector, and w is the consumer's
wealth (a real number). Denote the solution of the problem
by x*(p, w), and denote the value function by v, so that
$$v(p, w) = U(x^*(p, w)) \quad \text{for every } (p, w)$$
The function v is known as the indirect utility function.
By the envelope theorem
$$\frac{\partial v(p, w)}{\partial p_i} = -\lambda^*(p, w)\, x_i^*(p, w)$$
$$\frac{\partial v(p, w)}{\partial w} = \lambda^*(p, w)$$
Thus
$$x_i^*(p, w) = -\,\frac{\partial v(p, w)/\partial p_i}{\partial v(p, w)/\partial w}$$
This result is known as Roy's identity. ▄
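Roy's identity can be verified on a simple case. The worked example below assumes a two-good Cobb-Douglas utility with parameter 0 < α < 1; this functional form is an illustrative assumption, not part of the lecture.

```latex
\begin{aligned}
U(x_1, x_2) &= x_1^{\alpha} x_2^{1-\alpha}, \qquad
x_1^*(p, w) = \frac{\alpha w}{p_1}, \quad x_2^*(p, w) = \frac{(1-\alpha) w}{p_2},\\
v(p, w) &= \left(\frac{\alpha}{p_1}\right)^{\alpha}\left(\frac{1-\alpha}{p_2}\right)^{1-\alpha} w,\\
\frac{\partial v}{\partial p_1} &= -\frac{\alpha}{p_1}\, v(p, w), \qquad
\frac{\partial v}{\partial w} = \frac{v(p, w)}{w},\\
-\frac{\partial v/\partial p_1}{\partial v/\partial w} &= \frac{\alpha w}{p_1} = x_1^*(p, w).
\end{aligned}
```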
Mean-Variance Analysis: Intro
The mean-variance model for asset choice was developed by
Markowitz (1952, Journal of Finance).
Recalling our discussion of stochastic dominance, we can
see that, in general, investors should have MISC
preferences. In other words, they should exhibit a
preference for expected return and aversion to variance.
But for arbitrary distribution functions and utility functions
E[U(·)] cannot be expressed as a function of only mean
and variance.
To see this, take a Taylor series expansion around the
expected end of period wealth:
$$U(w) = U(E[w]) + U'(E[w])\,(w - E[w]) + \frac{1}{2}U''(E[w])\,(w - E[w])^2 + \sum_{n=3}^{\infty}\frac{1}{n!}\,U^{(n)}(E[w])\,(w - E[w])^n$$
Taking the expectation:
$$E[U(w)] = U(E[w]) + \frac{1}{2}U''(E[w])\,Var[w] + \sum_{n=3}^{\infty}\frac{1}{n!}\,U^{(n)}(E[w])\,E\left[(w - E[w])^n\right]$$
Unless the last term is zero, we need more than the mean and variance.
Note that the last part of the last term is the nth central moment of w.
For arbitrary distributions, the mean-variance model can be
motivated by assuming quadratic utility, U(w) = w − (b/2)w²:
$$E[U(w)] = E[w] - \frac{b}{2}E[w^2] = E[w] - \frac{b}{2}\left((E[w])^2 + \sigma^2(w)\right)$$
There are no additional terms because the third and higher
order derivatives are zero.
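A small numerical check of this identity, as a sketch only: the wealth distribution and the value of b below are made-up numbers, chosen so that all outcomes lie below the saturation point 1/b.

```python
# Hypothetical discrete wealth distribution and risk parameter b (assumptions).
outcomes = [0.8, 1.0, 1.3]
probs = [0.25, 0.5, 0.25]
b = 0.4

def U(w):
    # Quadratic utility U(w) = w - (b/2) w^2.
    return w - 0.5 * b * w ** 2

mean = sum(p * w for p, w in zip(probs, outcomes))
var = sum(p * (w - mean) ** 2 for p, w in zip(probs, outcomes))

# Direct expectation of utility over the distribution.
direct = sum(p * U(w) for p, w in zip(probs, outcomes))
# Mean-variance form: E[w] - (b/2) * (E[w]^2 + Var[w]).
mean_variance = mean - 0.5 * b * (mean ** 2 + var)

print(direct, mean_variance)  # the two values agree up to rounding
```

Only the mean and variance of wealth enter the second expression, which is the sense in which quadratic utility justifies mean-variance analysis for any distribution.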
Problems with quadratic utility
Saturation (i.e. utility decreases as wealth increases after a certain
point).
Increasing absolute risk aversion (i.e. risky assets are inferior
goods).
For arbitrary preferences, the mean-variance model can be
motivated by assuming that rates of return on risky assets
are multivariate normal.
The normal is completely characterized by the mean and the
variance (all higher moments can be described as
functions of the first two moments).
Note: the lognormal is also characterized by the mean and
variance, but is not stable under addition.
Problems with normality
Unbounded
Inconsistent with limited liability
Inconsistent with economic theory (no place for negative
consumption)
Experimentally, returns are not normal
Note: multivariate normal is sufficient for mean-variance
analysis, but not necessary.
Although the mean-variance model is not a general model of
asset choice, it holds a central role in finance due to its
tractability and its richness of empirical predictions.
Mean-Variance Analysis: Basics
Assume that we have:
N ≥ 2 assets
frictionless markets
unlimited short selling
common knowledge about expected returns and the variance-covariance structure
finite variances and unequal expectations
$\Omega$ = the variance-covariance matrix of asset returns
$e = (e_1, \dots, e_N)^\top$ = the vector of expected returns
If we plot the variance and expected returns for all N
securities, and then consider all possible portfolios of them,
we have the feasible set of portfolios in mean-variance
space (whose boundary is a parabola).
Definition: A portfolio is a frontier portfolio if it has the
minimum variance among portfolios having the same
expected rate of return.
$$E[r_p] = \sum_{i=1}^{N} w_i E[r_i] = w^\top e, \qquad w^\top \iota = 1$$
$$Var[r_p] = \sum_{i=1}^{N}\sum_{j=1}^{N} w_i w_j \sigma_{ij} = w^\top \Omega w$$
A portfolio p is a frontier portfolio iff wp, the N-vector of
portfolio weights of p, is the solution to:
$$\min_{w}\ \frac{1}{2} w^\top \Omega w \quad \text{s.t.} \quad w^\top e = E[r_p] \ \text{ and } \ w^\top \iota = 1$$
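Forming the Lagrangian and solving for the first order
conditions:
$$\mathcal{L} = \frac{1}{2} w^\top \Omega w + \lambda\left(E[r_p] - w^\top e\right) + \gamma\left(1 - w^\top \iota\right)$$
F.O.C.
$$\frac{\partial \mathcal{L}}{\partial w} = \Omega w - \lambda e - \gamma \iota = 0$$
$$\frac{\partial \mathcal{L}}{\partial \lambda} = E[r_p] - w^\top e = 0$$
$$\frac{\partial \mathcal{L}}{\partial \gamma} = 1 - w^\top \iota = 0$$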
Since Ω is positive definite, these first order conditions are
necessary and sufficient for a global optimum.
Solving the 1st FOC for the weights
$$w_p = \lambda\,\Omega^{-1} e + \gamma\,\Omega^{-1}\iota$$
Premultiplying by the expected returns and using the 2nd FOC
$$E[r_p] = \lambda\, e^\top\Omega^{-1} e + \gamma\, e^\top\Omega^{-1}\iota$$
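Or premultiply the portfolio weights by a vector of 1’s and
use the 3rd FOC
$$1 = \lambda\,\iota^\top\Omega^{-1} e + \gamma\,\iota^\top\Omega^{-1}\iota$$
Define
$$A = \iota^\top\Omega^{-1} e = e^\top\Omega^{-1}\iota, \qquad B = e^\top\Omega^{-1} e$$
$$C = \iota^\top\Omega^{-1}\iota, \qquad D = BC - A^2$$
$$M = \begin{bmatrix} B & A \\ A & C \end{bmatrix}$$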
Note: A, B, C, and D are just numbers. M contains
sufficient information to prove everything in efficient set
mathematics.
Solving for the Lagrange multipliers
$$\lambda = \frac{C\,E[r_p] - A}{D}, \qquad \gamma = \frac{B - A\,E[r_p]}{D}$$
And substituting into our expression for wp gives
$$w_p = \frac{C\,E[r_p] - A}{D}\,\Omega^{-1} e + \frac{B - A\,E[r_p]}{D}\,\Omega^{-1}\iota$$
$$w_p = \frac{1}{D}\left[C\,\Omega^{-1} e - A\,\Omega^{-1}\iota\right]E[r_p] + \frac{1}{D}\left[B\,\Omega^{-1}\iota - A\,\Omega^{-1} e\right]$$
$$w_p = h\,E[r_p] + g$$
Any frontier portfolio can be found this way since the expected return was arbitrary and this equation is a necessary and sufficient solution.
Note that g is the vector of portfolio weights corresponding
to a frontier portfolio with E[r]=0 and that g + h is the
vector of portfolio weights corresponding to a frontier
portfolio with E[r]=1.
Claim: all frontier portfolios can be generated by forming
portfolios of the two frontier portfolios formed with
weights g and g + h.
Note that it therefore follows that all frontier portfolios can
be formed from any two distinct frontier portfolios.
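The construction of g and h can be sketched in a few lines of plain Python. The inputs are the three-stock expected returns and covariance matrix from the numerical example later in these notes; the helper names (`solve`, `dot`, `frontier_weights`) are illustrative and not part of the lecture.

```python
def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Data from the three-stock example in these notes (Omega filled in by symmetry).
e = [0.100162, 0.164244, 0.182082]
omega = [[0.100162, 0.045864, 0.005712],
         [0.045864, 0.210773, 0.028283],
         [0.005712, 0.028283, 0.066884]]
iota = [1.0, 1.0, 1.0]

omega_inv_e = solve(omega, e)        # Omega^{-1} e
omega_inv_iota = solve(omega, iota)  # Omega^{-1} iota

A = dot(iota, omega_inv_e)
B = dot(e, omega_inv_e)
C = dot(iota, omega_inv_iota)
D = B * C - A ** 2

# g = (B Omega^{-1} iota - A Omega^{-1} e) / D,  h = (C Omega^{-1} e - A Omega^{-1} iota) / D
g = [(B * oi - A * oe) / D for oi, oe in zip(omega_inv_iota, omega_inv_e)]
h = [(C * oe - A * oi) / D for oi, oe in zip(omega_inv_iota, omega_inv_e)]

def frontier_weights(target_return):
    # w_p = g + h * E[r_p]
    return [gi + hi * target_return for gi, hi in zip(g, h)]

w = frontier_weights(0.15)
print([round(x, 4) for x in w])  # compare with the weights reported in the three-stock example
```

By construction g sums to one with zero expected return, while h sums to zero with unit expected return, so g + h E[r_p] always satisfies both constraints.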
Mean-Variance Analysis: Frontier
The covariance between the returns of any two frontier
portfolios is
$$Cov(r_p, r_q) = w_p^\top \Omega\, w_q = \frac{C}{D}\left(E[r_p] - \frac{A}{C}\right)\left(E[r_q] - \frac{A}{C}\right) + \frac{1}{C}$$
Setting q = p gives the variance of any frontier portfolio,
which we can then write as
$$\frac{\sigma^2(r_p)}{1/C} - \frac{\left(E[r_p] - A/C\right)^2}{D/C^2} = 1$$
Which is the equation of a hyperbola in SD-E[r] space with
center (0, A/C) and asymptotes
$$E[r_p] = \frac{A}{C} \pm \sqrt{\frac{D}{C}}\;\sigma(r_p)$$
The minimum variance portfolio is defined as the portfolio
having the minimum variance of all possible portfolios.
Note
$$E[r_{MV}] = \frac{A}{C}, \qquad Var[r_{MV}] = \frac{1}{C}$$
Definition: Frontier portfolios which have expected rates of return
strictly greater than that of the minimum variance portfolio are called
efficient portfolios.
These are portfolios which have the highest return for a given variance.
Let $w_i$, $i = 1, \dots, m$, be m frontier portfolios and
$\alpha_i$, $i = 1, \dots, m$, be real numbers such that $\sum_{i=1}^{m}\alpha_i = 1$.
Then
$$\sum_{i=1}^{m}\alpha_i w_i = \sum_{i=1}^{m}\alpha_i\left(g + h\,E[r_i]\right) = g + h\sum_{i=1}^{m}\alpha_i E[r_i]$$
Therefore, any linear combination of frontier portfolios (with weights
summing to one) is on the frontier.
If the i=1,…,m portfolios are efficient, and αi>0 for all i,
then
$$\sum_{i=1}^{m}\alpha_i E[r_i] > \sum_{i=1}^{m}\alpha_i \frac{A}{C} = \frac{A}{C}$$
Any convex combination of efficient portfolios is an
efficient portfolio (i.e. the set of efficient portfolios is a
convex set).
Bibliography
Cornuejols and Tütüncü, Optimization Methods in Finance, Cambridge.
Huang and Litzenberger, Foundations for Financial Economics, North-Holland.
Intriligator, Mathematical Optimization and Economic Theory, Prentice-Hall.
Marsden and Tromba, Vector Calculus, Freeman.
Varian, Microeconomic Analysis, Norton.
Mean-Variance Analysis: Risk Free Rate
Everything we have done so far did not have a riskless asset.
Now consider N+1 assets, with $w_p$ equal to the vector of portfolio
weights on the risky assets. Then $w_p$ is the solution to
$$\min_{w}\ \frac{1}{2} w^\top \Omega w \quad \text{s.t.} \quad w^\top e + (1 - w^\top \iota)\, r_f = E[r_p]$$
Which has the solution
$$w_p = \Omega^{-1}\left(e - r_f\,\iota\right)\frac{E[r_p] - r_f}{B - 2A r_f + C r_f^2}$$
$$\sigma^2(r_p) = \frac{\left(E[r_p] - r_f\right)^2}{B - 2A r_f + C r_f^2}$$
There are three cases.
1. A/C > rf
2. A/C < rf
3. A/C = rf
Note: in the third case, invest everything in the riskless asset and hold an arbitrage portfolio of risky assets whose weights sum to zero.
We can also write
$$E[r_q] - r_f = \beta_{qp}\left(E[r_p] - r_f\right)$$
which holds independent of the relationship between rf and
A/C
and
$$r_q = (1 - \beta_{qp})\, r_f + \beta_{qp}\, r_p + \varepsilon_q, \qquad Cov(r_p, \varepsilon_q) = E[\varepsilon_q] = 0$$
for any frontier portfolio p other than the riskless asset.
Mean-Variance Analysis
Let’s return to our minimization problem:
$$\min_{w}\ \frac{1}{2} w^\top \Omega w \quad \text{s.t.} \quad w^\top e = E[r_p] \ \text{ and } \ w^\top \iota = 1$$
There are alternative ways to pose this problem; for
example, we could rewrite the constraints as:
$$A w = b$$
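Where (A now denotes the constraint matrix, not the scalar defined earlier)
$$A = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ e_1 & e_2 & \cdots & e_N \end{bmatrix}, \qquad b = \begin{bmatrix} 1 \\ E[r_p] \end{bmatrix}$$
Note: If we wanted to include a riskless asset, we could also have N+1 assets
with one of the assets’ return equal to the risk-free rate.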
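Forming the Lagrangian
$$\mathcal{L} = \frac{1}{2} w^\top \Omega w + \lambda^\top\left(b - A w\right)$$
With FOC
$$\frac{\partial \mathcal{L}}{\partial w} = \Omega w - A^\top \lambda = 0$$
$$\frac{\partial \mathcal{L}}{\partial \lambda} = b - A w = 0$$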
Solving now, from the first FOC
$$w = \Omega^{-1} A^\top \lambda$$
Substituting into the second FOC and solving for the
optimal weights gives
$$w = \Omega^{-1} A^\top \left(A\,\Omega^{-1} A^\top\right)^{-1} b$$
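Example: Assume that we have three stocks with the
following characteristics (what do you expect?)
$$e = \begin{bmatrix} e_1 \\ e_2 \\ e_3 \end{bmatrix} = \begin{bmatrix} 0.100162 \\ 0.164244 \\ 0.182082 \end{bmatrix}$$
$$\Omega = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} \\ \sigma_{21} & \sigma_{22} & \sigma_{23} \\ \sigma_{31} & \sigma_{32} & \sigma_{33} \end{bmatrix} = \begin{bmatrix} 0.100162 & 0.045864 & 0.005712 \\ & 0.210773 & 0.028283 \\ & & 0.066884 \end{bmatrix}$$
(only the upper triangle is shown; Ω is symmetric)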
And that we want a 15% return on the portfolio (is this
feasible?). The constraints can be written
$$A = \begin{bmatrix} 1 & 1 & 1 \\ 0.100162 & 0.164244 & 0.182082 \end{bmatrix}, \qquad b = \begin{bmatrix} 1 \\ 0.15 \end{bmatrix}$$
Now we can use the solution to find the optimal weights
$$w = \Omega^{-1} A^\top \left(A\,\Omega^{-1} A^\top\right)^{-1} b = \begin{bmatrix} 0.3830 \\ 0.0397 \\ 0.5773 \end{bmatrix}$$
▄
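The closed-form solution can be reproduced without any matrix library. The following is a minimal sketch in plain Python using the example's data; the helper `solve` is an illustrative Gaussian-elimination routine, not something from the lecture, and Ω is filled in by symmetry.

```python
def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

omega = [[0.100162, 0.045864, 0.005712],
         [0.045864, 0.210773, 0.028283],
         [0.005712, 0.028283, 0.066884]]
A = [[1.0, 1.0, 1.0],
     [0.100162, 0.164244, 0.182082]]
b = [1.0, 0.15]

# Columns of Omega^{-1} A': one linear solve per constraint row.
cols = [solve(omega, row) for row in A]
# S = A Omega^{-1} A' (a 2 x 2 matrix).
S = [[sum(A[i][n] * cols[k][n] for n in range(3)) for k in range(2)]
     for i in range(2)]
# Multipliers: lambda = (A Omega^{-1} A')^{-1} b.
lam = solve(S, b)
# Weights: w = Omega^{-1} A' lambda.
w = [sum(cols[k][n] * lam[k] for k in range(2)) for n in range(3)]

print([round(x, 4) for x in w])  # compare with the weights reported above
```

Solving linear systems instead of forming Ω⁻¹ explicitly is a design choice for numerical stability; for a 3 x 3 problem either route gives the same answer.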
Do you see any problems or issues associated with the
solution to our portfolio problem?
There may be other constraints which must be imposed:
Diversification constraints (max or min)
Short-sale constraints
Borrowing constraints
Leverage constraints
Tracking error constraints
Etc.
For example, the Investment Company Act of 1940
Rule 12-d3 imposes certain investment constraints on
mutual funds:
Mutual funds cannot own more than 5% of other investment
companies (firms which derive more than 15% of revenue from
securities related activity)
If a mutual fund advertises as a diversified fund, it cannot hold
more than 5% of its assets in any company or hold more than
10% of the voting stock for any company for 75% of the fund
This means that we may want to (or need to) place
additional constraints on our optimization. Further, these
constraints may be inequality constraints (for example a
short-sale constraint would be expressed as wi ≥ 0 for
all i).
So, let’s revisit optimization, this time with inequality
constraints.
Optimization with Inequalities
Consider a problem of the form
$$\max_{x} f(x) \quad \text{subject to} \quad g_j(x) \le c_j \ \text{ for } j = 1, \dots, m$$
where f and gj for j = 1, ..., m are functions of n variables, x
= (x1, ..., xn), and cj for j = 1, ..., m are constants.
All of the problems we have studied so far can be put into
this form…
For equality constraints, we simply introduce two inequality
constraints for every equality. For example, the problem
$$\max_{x} f(x) \quad \text{subject to} \quad g(x) = 0$$
can be written as
$$\max_{x} f(x) \quad \text{subject to} \quad g(x) \le 0 \ \text{ and } \ -g(x) \le 0$$
To start thinking about how to solve the general problem,
first consider the case with a single constraint
$$\max_{x} f(x) \quad \text{subject to} \quad g(x) \le c$$
There are two possible kinds of solution for this problem: one where
the constraint is binding and one where the
constraint does not bind. In the latter case, where the
constraint is not binding for small changes in the
constraint, we say that the constraint is slack.
[Figure: two panels. In the left-hand panel the constraint binds at the solution, g(x*) = c; in the right-hand panel the constraint is slack, g(x*) < c.]
As before, we define the Lagrangian by
$$\mathcal{L}(x) = f(x) - \lambda\,(g(x) - c)$$
From our previous analysis of problems with equality
constraints and problems with no constraints,
if g(x*) = c (as in the left-hand panel) and the constraint
satisfies a regularity condition, then L'i(x*) = 0 for all i
if g(x*) < c (as in the right-hand panel), then f 'i(x*) = 0 for
all i.
In the first case (that is, if g(x*) = c) we have λ ≥ 0. Suppose, to the contrary, that λ < 0. Then we know that a small decrease in c raises the maximal value of f . That is, moving x* inside the constraint raises the value of f , contradicting the fact that x* is the solution of the problem.
In the second case, the value of λ does not enter the conditions, so we can choose any value for it. Given the interpretation of λ, setting λ = 0 makes sense. Under this assumption we have f 'i(x) = L'i(x) for all x, so that L'i(x*) = 0 for all i.
Thus in both cases we have L'i(x*) = 0 for all i, λ ≥ 0, and
g(x*) ≤ c. In the first case we have g(x*) = c and in the
second case λ = 0.
We can combine the two cases by writing the conditions as
$$\frac{\partial \mathcal{L}(x^*)}{\partial x_j} = 0 \ \text{ for } j = 1, \dots, n$$
$$\lambda \ge 0, \quad g(x^*) \le c, \quad \text{and either } \lambda = 0 \text{ or } g(x^*) - c = 0$$
Alternatively, since the product of two numbers is zero if and only if at
least one of them is zero, we can write
$$\frac{\partial \mathcal{L}(x^*)}{\partial x_j} = 0 \ \text{ for } j = 1, \dots, n$$
$$\lambda \ge 0, \quad g(x^*) \le c, \quad \text{and} \quad \lambda\,(g(x^*) - c) = 0$$
Note that we have not ruled out the possibility that both λ = 0 and g(x*) = c.
The inequalities λ ≥ 0 and g(x*) ≤ c are called
complementary slackness conditions; at most one of these
conditions is slack (i.e. not an equality).
For a problem with many constraints, we introduce a
multiplier for each constraint and obtain the Kuhn-Tucker
conditions. For the problem
$$\max_{x} f(x) \quad \text{subject to} \quad g_j(x) \le c_j \ \text{ for } j = 1, \dots, m$$
The Kuhn-Tucker conditions are
$$\frac{\partial \mathcal{L}(x^*)}{\partial x_i} = 0 \ \text{ for } i = 1, \dots, n$$
$$\lambda_j \ge 0, \quad g_j(x^*) \le c_j, \quad \text{and} \quad \lambda_j\,(g_j(x^*) - c_j) = 0 \ \text{ for } j = 1, \dots, m$$
Where
$$\mathcal{L}(x) = f(x) - \sum_{j=1}^{m} \lambda_j\,(g_j(x) - c_j)$$
Example: Consider the problem
$$\max_{x_1, x_2}\ -(x_1 - 4)^2 - (x_2 - 4)^2 \quad \text{subject to} \quad x_1 + x_2 \le 4 \ \text{ and } \ x_1 + 3x_2 \le 9$$
The Lagrangian is
$$\mathcal{L}(x_1, x_2) = -(x_1 - 4)^2 - (x_2 - 4)^2 - \lambda_1 (x_1 + x_2 - 4) - \lambda_2 (x_1 + 3x_2 - 9)$$
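And the Kuhn-Tucker conditions are
$$-2(x_1 - 4) - \lambda_1 - \lambda_2 = 0$$
$$-2(x_2 - 4) - \lambda_1 - 3\lambda_2 = 0$$
$$x_1 + x_2 \le 4, \quad \lambda_1 \ge 0, \quad \text{and} \quad \lambda_1 (x_1 + x_2 - 4) = 0$$
$$x_1 + 3x_2 \le 9, \quad \lambda_2 \ge 0, \quad \text{and} \quad \lambda_2 (x_1 + 3x_2 - 9) = 0$$
▄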
We have seen that a solution x* of an optimization problem
with equality constraints is a stationary point of the
Lagrangian if the constraints satisfy a regularity condition
(∇g(x*) ≠ 0 in the case of a single constraint g(x) = c). In
an optimization problem with inequality constraints a
related regularity condition guarantees that a solution
satisfies the Kuhn-Tucker conditions. The weakest forms
of this regularity condition are difficult to verify. The next
result gives three alternative strong forms that are much
easier to verify.
Proposition: Let f and gj for j = 1, ..., m be continuously
differentiable functions of many variables and let cj for j =
1, ..., m be constants. Suppose that x* solves the problem
$$\max_{x} f(x) \quad \text{subject to} \quad g_j(x) \le c_j \ \text{ for } j = 1, \dots, m$$
Suppose that either each gj is concave
or each gj is convex and there is some x such that gj(x) < cj for j = 1, ..., m
or each gj is quasi-convex, ∇gj(x*) ≠ (0, ..., 0) for all j, and there is some x
such that gj(x) < cj for j = 1, ..., m.
Then there exists a unique vector λ = (λ1, ..., λm) such that
(x*, λ) satisfies the Kuhn-Tucker conditions.
[Figures: an example of a quasi-convex function which is not convex, and an example of a function which is not quasi-convex.]
Recall that a linear function is concave, so the conditions in
the result are satisfied if each constraint function is linear.
Note that the last part of the second and third conditions is
very weak: it requires only that some point strictly satisfy
all the constraints.
One way in which the conditions in the result may be
weakened is sometimes useful: the conditions on the
constraint functions need to be satisfied only by the
binding constraints—those for which gj(x*) = cj.
We saw previously that for both an unconstrained
maximization problem and a maximization problem with
an equality constraint the first-order conditions are
sufficient for a global optimum when the objective and
constraint functions satisfy appropriate
concavity/convexity conditions. The same is true for an
optimization problem with inequality constraints.
Precisely, we have the following result.
Proposition: Let f and gj for j = 1, ..., m be continuously
differentiable functions of many variables and let cj for j =
1, ..., m be constants. Consider the problem
$$\max_{x} f(x) \quad \text{subject to} \quad g_j(x) \le c_j \ \text{ for } j = 1, \dots, m$$
Suppose that
f is concave
and gj is quasi-convex for j = 1, ..., m.
If there exists λ = (λ1, ..., λm) such that (x*, λ) satisfies the
Kuhn-Tucker conditions then x* solves the problem.
Corollary: The Kuhn-Tucker conditions are both necessary
and sufficient if the objective function is concave and
either
each constraint is linear
or each constraint function is convex and some vector of the
variables satisfies all constraints strictly.
But sometimes the condition that the objective function is
concave is too strong to be useful; for instance, we
generally assume only that utility functions are quasi-concave.
In that case, the following result is useful.
Proposition: Let f and gj for j = 1, ..., m be continuously differentiable functions of many variables and let cj for j = 1, ..., m be constants. Consider the problem
$$\max_{x} f(x) \quad \text{subject to} \quad g_j(x) \le c_j \ \text{ for } j = 1, \dots, m$$
Suppose that f is twice differentiable and quasi-concave
and gj is quasi-convex for j = 1,...,m.
If there exists λ = (λ1, ..., λm) and a value of x* such that (x*, λ) satisfies the Kuhn-Tucker conditions and ∇f(x*) ≠ (0, ..., 0), that is, the partial derivatives f 'i(x*), i = 1, ..., n, are not all zero, then x* solves the problem.
Corollary: Suppose that the objective function is twice
differentiable and quasi-concave and every constraint is
linear. If x* solves the problem then there exists a unique
vector λ such that (x*, λ) satisfies the Kuhn-Tucker
conditions, and if (x*, λ) satisfies the Kuhn-Tucker
conditions and ∇f(x*) ≠ (0, ..., 0) (the partial derivatives
f 'i(x*) are not all zero) then x* solves the problem.
Very Important!
If you have a minimization problem, remember that you can
transform it to a maximization problem by multiplying the
objective function by −1. Thus for a minimization
problem the condition on the objective function in the first
result above is that it be convex, and the condition in the
second result is that it be quasi-convex.
Example: $\max_x\,[-(x - 2)^2]$ subject to x ≥ 1
Written in the standard format, this problem is
$\max_x\,[-(x - 2)^2]$ subject to 1 − x ≤ 0.
The objective function is concave and the constraint is linear. Thus the Kuhn-Tucker conditions are both necessary and sufficient: the set of solutions of the problem is the same as the set of solutions of the Kuhn-Tucker conditions.
The Kuhn-Tucker conditions are
−2(x − 2) + λ = 0
x−1 ≥ 0, λ ≥ 0, and λ(1 − x) = 0.
From the last condition we have either λ = 0 or x = 1.
x = 1: 2 + λ = 0, or λ = −2, which violates λ ≥ 0.
λ = 0: −2(x − 2) = 0; the only solution is x = 2.
Thus the Kuhn-Tucker conditions have a unique solution,
(x, λ) = (2, 0). Hence the problem has a unique solution
x = 2. ▄
Example: $\max_x\,[-(x - 2)^2]$ subject to x ≥ 3
Written in the standard format, this problem is
$\max_x\,[-(x - 2)^2]$ subject to 3 − x ≤ 0.
As in the previous example, the objective function is concave and the constraint function is linear, so that the set of solutions of the problem is the set of solutions of the Kuhn-Tucker conditions.
The Kuhn-Tucker conditions are
−2(x − 2) + λ = 0
x − 3 ≥ 0, λ ≥ 0, and λ(3 − x) = 0.
From the last condition we have either λ = 0 or x = 3.
x = 3: −2 + λ = 0, or λ = 2.
λ = 0: −2(x − 2) = 0 requires x = 2, which violates x ≥ 3, so this case has no solution compatible with the other conditions.
Thus the Kuhn-Tucker conditions have a single solution, (x, λ) = (3, 2). Hence the problem has a unique solution, x = 3. ▄
These two examples illustrate a procedure for finding
solutions of the Kuhn-Tucker conditions that is useful in
many problems.
1. Look at the complementary slackness conditions, which
imply that either a Lagrange multiplier is zero or a
constraint is binding.
2. Check the implications of each case, using the other
equations.
In these two examples, this procedure is very easy to follow.
The following examples are more complicated.
Example: Consider the problem
$$\max_{x_1, x_2}\ -(x_1 - 4)^2 - (x_2 - 4)^2 \quad \text{subject to} \quad x_1 + x_2 \le 4 \ \text{ and } \ x_1 + 3x_2 \le 9$$
The objective function is concave and the constraints are
both linear, so the solutions of the problem are the
solutions of the Kuhn-Tucker conditions.
We previously found the Kuhn-Tucker conditions,
$$-2(x_1 - 4) - \lambda_1 - \lambda_2 = 0$$
$$-2(x_2 - 4) - \lambda_1 - 3\lambda_2 = 0$$
$$x_1 + x_2 \le 4, \quad \lambda_1 \ge 0, \quad \text{and} \quad \lambda_1 (x_1 + x_2 - 4) = 0$$
$$x_1 + 3x_2 \le 9, \quad \lambda_2 \ge 0, \quad \text{and} \quad \lambda_2 (x_1 + 3x_2 - 9) = 0$$
What are the solutions of these conditions? Start by looking at the two conditions λ1(x1 + x2 − 4) = 0 and λ2(x1 + 3x2 − 9) = 0. These two conditions yield the following four cases.
(1) x1 + x2 = 4 and x1 + 3x2 = 9:
In this case we have x1 = 3/2 and x2 = 5/2. Then the first two
equations are
5 − λ1 − λ2 = 0
3 − λ1 − 3λ2 = 0
which imply that λ1 = 6 and λ2 = −1, which violates the
condition λ2 ≥ 0. We can rule out this case.
(2) x1 + x2 = 4 and x1 + 3x2 < 9, so that λ2 = 0:
Then the first two equations imply x1 = x2 = 2 and λ1 = 4.
All the conditions are satisfied, so
(x1, x2, λ1, λ2) = (2, 2, 4, 0) is a solution.
(3) x1 + x2 < 4 and x1 + 3x2 = 9, so that λ1 = 0:
Then the first two equations imply x1 = 33/10 and x2 = 19/10
(with λ2 = 7/5), violating x1 + x2 < 4. We can rule out this case.
(4) x1 + x2 < 4 and x1 + 3x2 < 9, so that λ1 = λ2 = 0:
Then the first two equations imply x1 = x2 = 4, violating x1 + x2
< 4. We can rule out this case.
So (x1, x2, λ1, λ2) = (2, 2, 4, 0) is the single solution of the
Kuhn-Tucker conditions. Hence the unique solution of
the problem is (x1, x2) = (2, 2). ▄
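The case-by-case procedure can be mechanized for this example. The sketch below enumerates the four binding/slack combinations, solves the resulting linear system in (x1, x2, λ1, λ2), and keeps only candidates that are feasible with non-negative multipliers. It is a hand-rolled illustration for this specific quadratic problem; the helper names are assumptions, and the approach is not a general solver.

```python
from itertools import product

def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

G = [[1.0, 1.0], [1.0, 3.0]]  # constraint rows: x1 + x2 <= 4 and x1 + 3 x2 <= 9
c = [4.0, 9.0]

def kt_solutions(tol=1e-9):
    found = []
    for active in product([True, False], repeat=2):
        # Stationarity: 2 x_i + sum_j lambda_j G[j][i] = 8 for i = 1, 2.
        M = [[2.0, 0.0, G[0][0], G[1][0]],
             [0.0, 2.0, G[0][1], G[1][1]]]
        rhs = [8.0, 8.0]
        for j in range(2):
            if active[j]:
                M.append([G[j][0], G[j][1], 0.0, 0.0])  # constraint j binds
                rhs.append(c[j])
            else:
                row = [0.0, 0.0, 0.0, 0.0]
                row[2 + j] = 1.0                        # constraint j slack: lambda_j = 0
                M.append(row)
                rhs.append(0.0)
        x1, x2, l1, l2 = solve(M, rhs)
        feasible = all(G[j][0] * x1 + G[j][1] * x2 <= c[j] + tol for j in range(2))
        if feasible and l1 >= -tol and l2 >= -tol:
            found.append((x1, x2, l1, l2))
    return found

solutions = kt_solutions()
print(solutions)  # one candidate survives, approximately (2, 2, 4, 0)
```

With m inequality constraints this enumeration visits 2^m cases, which is why practical solvers use active-set or interior-point strategies instead.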
Example: $\max_{x,y}\, xy$ subject to x + y ≤ 6, x ≥ 0, and y ≥ 0.
The objective function is twice-differentiable and quasi-
concave and the constraint functions are linear, so the
Kuhn-Tucker conditions are necessary, and if ((x*, y*), λ*)
satisfies these conditions and the partial derivatives of the
objective function at (x*, y*) are not all zero then (x*, y*) solves
the problem. Solutions of the Kuhn-Tucker conditions at
which all derivatives of the objective function are zero
may or may not be solutions of the problem (we need to check
the values of the objective function at these solutions).
The Lagrangian is
$$\mathcal{L}(x, y) = xy - \lambda_1 (x + y - 6) + \lambda_2 x + \lambda_3 y$$
The Kuhn-Tucker conditions are
y − λ1 + λ2 = 0
x − λ1 + λ3 = 0
λ1 ≥ 0, x + y ≤ 6, λ1(x + y − 6) = 0
λ2 ≥ 0, x ≥ 0, λ2x = 0
λ3 ≥ 0, y ≥ 0, λ3y = 0.
(1) If x > 0 and y > 0 then λ2 = λ3 = 0, so that λ1 = x = y from
the first two conditions. Hence x = y = λ1 = 3 from the third
condition. These values satisfy all the conditions.
(2) If x = 0 and y > 0 then λ3 = 0 from the last condition and
hence λ1 = x = 0 from the second condition. But now from
the first condition λ2 = −y < 0, contradicting λ2 ≥ 0.
(3) If x > 0 and y = 0 then λ2 = 0, and a symmetric argument
yields a contradiction.
(4) If x = y = 0 then λ1 = 0 from the third set of conditions,
so that λ2 = λ3 = 0 from the first and second conditions. These
values satisfy all the conditions.
We conclude that there are two solutions of the Kuhn-
Tucker conditions, (x, y, λ1, λ2, λ3) = (3, 3, 3, 0, 0) and
(0, 0, 0, 0, 0). The value of the objective function at (3, 3)
is greater than the value of the objective function at (0, 0),
so the solution of the problem is (3, 3). ▄
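As a sanity check on this conclusion, the objective can be evaluated on a grid over the feasible set. This brute-force sketch is only illustrative; the grid resolution (`steps`) is an arbitrary choice and the method does not replace the Kuhn-Tucker analysis.

```python
# Brute-force check: evaluate xy on a grid over x >= 0, y >= 0, x + y <= 6.
steps = 60  # grid spacing of 0.1 (illustrative choice)
best = None
for i in range(steps + 1):
    for j in range(steps + 1 - i):  # i + j <= steps enforces x + y <= 6
        x = 6.0 * i / steps
        y = 6.0 * j / steps
        value = x * y
        if best is None or value > best[0]:
            best = (value, x, y)

print(best)  # (9.0, 3.0, 3.0)
```

The grid maximum sits at (3, 3) with value 9, while the other Kuhn-Tucker point (0, 0) gives value 0.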
Optimization Summary
Conditions under which FOC are necessary and sufficient:
Unconstrained Maximization Problems
If x* solves maxx f (x) then f 'i(x*) = 0 for i = 1, ..., n.
If f 'i(x*) = 0 for i = 1, ..., n and if f is concave then x*
solves maxx f (x).
Equality Constrained Maximization Problems (one constraint)
If x* solves maxx f (x) subject to g(x) = c, and if
∇g(x*) ≠ (0,...,0), then there exists λ such that L'i(x*) = 0
for i = 1, ..., n and g(x*) = c.
If there exists λ such that L'i(x*) = 0 for i = 1, ..., n and
g(x*) = c and if f is concave and λg is convex then x*
solves maxx f (x) subject to g(x) = c.
Inequality Constrained Maximization Problems
If x* solves maxx f (x) subject to gj(x) ≤ cj for j = 1, ..., m
and if {gj is concave for j = 1, ..., m} or {gj is convex for
j = 1, ..., m and there exists x such that gj(x) < cj for
j = 1, ..., m} or {gj is quasi-convex for j = 1, ..., m,
∇gj(x*) ≠ (0,...,0) for j = 1, ..., m, and there exists x such
that gj(x) < cj for j = 1, ..., m} then there exists (λ1,...,λm)
such that L'i(x*) = 0 for i = 1, ..., n and λj ≥ 0, gj(x*) ≤ cj,
and λj(gj(x*) − cj) = 0 for j = 1, ..., m.
If there exists (λ1,...,λm) such that L'i(x*) = 0 for i = 1, ..., n
and λj ≥ 0, gj(x*) ≤ cj, and λj(gj(x*) − cj) = 0 for j = 1, ..., m
and if gj is quasi-convex for j = 1, ..., m and either {f is
concave} or {f is quasi-concave and twice differentiable
and ∇f(x*) ≠ (0,...,0)}, where $\mathcal{L}(x) = f(x) - \sum_{j=1}^{m}\lambda_j (g_j(x) - c_j)$,
then x* solves maxx f (x) subject to gj(x) ≤ cj for
j = 1, ..., m.
Mean-Variance Analysis: Risk Free Rate
3. A/C = rf
Note: invest everything in the riskless asset and hold an arbitrage portfolio of risky assets whose weights sum to zero.
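Recall the expression for the optimal weights
$$w_p = \Omega^{-1}\left(e - r_f\,\iota\right)\frac{E[r_p] - r_f}{B - 2A r_f + C r_f^2}$$
Substituting rf=A/C and premultiplying by ι, we get
$$\iota^\top w_p = \iota^\top \Omega^{-1}\left(e - \frac{A}{C}\,\iota\right)\frac{E[r_p] - r_f}{B - 2A r_f + C r_f^2} = \left(A - \frac{A}{C}\,C\right)\frac{E[r_p] - r_f}{B - 2A r_f + C r_f^2} = 0$$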
M-V Analysis Inequalities
Let’s return to our exploration of mean-variance analysis.
When we add inequality constraints to our problem, the
quadratic optimization problem generally does not have a
simple analytical solution. Instead, we must use
numerical methods to solve for the optimal portfolio
weighting.
State-of-the-art quadratic programming algorithms with inequality constraints use two kinds of approaches: (1) the active-set method or projection method, and (2) the interior point method.
Both of these approaches solve a series of sub-problems where there are only equality constraints. They differ only in how they arrange the order of those sub-problems. In the active-set method, you proceed along the boundary of the feasible set defined by the constraints. In the interior-point method, you proceed within the feasible set. (You can use Matlab’s functions e.g. quadprog).
Current implementations of interior methods often outperform active set methods in terms of speed. On the other hand, active set methods are more robust and better suited for warm starts, which are important for solving integer optimization problems (quadprog uses an active set method).
M-V Analysis Inequalities: Example
Example: Let’s return to our earlier numerical example,
adding the restriction that we cannot short any of the
stocks. In addition, we will also add the constraint that
stock 2 must have a weight of at least 0.10. Our problem
can be written:
$$\min_{w}\ \frac{1}{2} w^\top \Omega w \quad \text{s.t.} \quad Aw = b \ \text{(rows 1 and 2)}, \quad Aw \ge b \ \text{(rows 3 to 5)}, \quad Aw \le b \ \text{(row 6)}$$
Where
$$A = \begin{bmatrix} 1 & 1 & 1 \\ 0.100162 & 0.164244 & 0.182082 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix}$$
And
$$b = \begin{bmatrix} 1 \\ 0.15 \\ 0 \\ 0 \\ 0 \\ 0.90 \end{bmatrix}$$
Notice that to express the constraint w2 ≥ 0.10, we used w1 + w3 ≤ 0.90 (the two are
equivalent because the weights sum to one). Sometimes
we need to reengineer our constraints to reach a solution.
The solution is
$$w = \begin{bmatrix} 0.3699 \\ 0.1000 \\ 0.5301 \end{bmatrix}$$
(using quadprog this took 1 iteration)
▄
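This result can be reproduced by hand using the active-set idea. The equality-only solution found earlier has w2 ≈ 0.0397, which violates the floor of 0.10, so the sketch below assumes that floor is the only binding inequality and treats it as an equality. With three equalities and three unknowns the weights are pinned down by a linear system. This is an illustration of the idea, not a reimplementation of quadprog, and the helper `solve` is an assumed name.

```python
def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

e = [0.100162, 0.164244, 0.182082]

# Active constraints treated as equalities:
#   weights sum to one, expected return is 15%, and the floor w2 = 0.10 binds.
A_active = [[1.0, 1.0, 1.0],
            e,
            [0.0, 1.0, 0.0]]
b_active = [1.0, 0.15, 0.10]

w = solve(A_active, b_active)

# Confirm the remaining inequalities hold at this point.
no_shorts = all(wi >= 0.0 for wi in w)
cap_ok = w[0] + w[2] <= 0.90 + 1e-12

print([round(x, 4) for x in w], no_shorts, cap_ok)
```

Because the variance is convex along the line defined by the two equality constraints and the unconstrained minimizer lies on the infeasible side of w2 = 0.10, the constrained minimum sits exactly on that boundary.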
M-V Analysis
Congratulations!
Now you know how to do everything in portfolio analysis;
you just need to set up the appropriate problem.
Let’s consider a few alternatives…
M-V Analysis: Diversification Constraint
As discussed last time, there are sometimes regulatory
requirements for diversification. In addition, many portfolios
are required (by their managers/investors) to have minimum
and/or maximum investment limits in certain stocks, industries,
sectors, or asset classes. These types of problems can be
generally expressed:
$$\min_{w}\ \frac{1}{2} w^\top \Omega w \quad \text{s.t.} \quad Aw = b \ \text{ and } \ w_l \le w \le w_u$$
Where the vectors wl and wu represent lower and upper bounds.
M-V Analysis: Trading Volume
Richard R. Lindsey259
A typical constraint is one on trading volume. This constraint may be used for a large portfolio where you want to avoid price impact or for any portfolio where you want to control the liquidity risk of the portfolio.
$$\min_w\ \tfrac{1}{2}\, w'\Sigma w \quad \text{s.t.} \quad Aw = b \ \text{ and } \ w \le c\,x$$
Where x is a vector of ADV in dollar terms and c is a constant for the threshold.
(e.g. $500 million portfolio; 10% of ADV (in millions) of stock i: w_i ≤ (0.1/500) x_i.) Can you generalize this?
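One way to read the constant c is as the allowed participation rate divided by portfolio value. The sketch below computes the resulting weight caps under that reading; the function name, the ADV figures, and the parameter names are hypothetical.

```python
def adv_weight_caps(adv_dollars, portfolio_value, participation):
    """Upper bounds on weights so that each position is at most
    `participation` times that stock's average daily volume (ADV).

    adv_dollars and portfolio_value must be expressed in the same units.
    """
    c = participation / portfolio_value   # the constant c in w <= c * x
    return [c * x_i for x_i in adv_dollars]


# Hypothetical ADV figures in $ millions for three stocks.
adv = [50.0, 200.0, 1000.0]
caps = adv_weight_caps(adv, portfolio_value=500.0, participation=0.10)
print(caps)
```

These caps would enter the optimizer as the upper-bound vector in the constraint w ≤ cx.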
M-V Analysis: Beta Exposure
Richard R. Lindsey260
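Sometimes it is desirable to match the beta of a benchmark portfolio:
$$\min_w\ \tfrac{1}{2}\, w'\Sigma w \quad \text{s.t.} \quad Aw = b \ \text{ and } \ w'\beta = \beta_{\text{benchmark}}$$
Where:
$$\beta = (\beta_1,\ \ldots,\ \beta_N)'$$
(note that this will not bound the tracking error or asset specific risk – only the factor risk)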
M-V Analysis: Beta Exposure
Richard R. Lindsey261
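Or we can specify a range for the beta exposure:
$$\min_w\ \tfrac{1}{2}\, w'\Sigma w \quad \text{s.t.} \quad Aw = b \ \text{ and } \ \beta^{\text{lower limit}} \le w'\beta \le \beta^{\text{upper limit}}$$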
M-V Analysis: Factor Exposure
Richard R. Lindsey262
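Or sometimes we are matching multiple factors:
$$\min_w\ \tfrac{1}{2}\, w'\Sigma w \quad \text{s.t.} \quad Aw = b \ \text{ and } \ \beta^{\text{lower limit}} \le B'w \le \beta^{\text{upper limit}}$$
Where:
$$B = \begin{pmatrix} \beta_{11} & \beta_{12} & \cdots & \beta_{1K} \\ \beta_{21} & \beta_{22} & \cdots & \beta_{2K} \\ \vdots & \vdots & & \vdots \\ \beta_{N1} & \beta_{N2} & \cdots & \beta_{NK} \end{pmatrix}$$
(NB: tilting)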
M-V Analysis: Tracking Error
Richard R. Lindsey264
Most professionals with a benchmark use a minimization of
tracking error when weighting stocks in the portfolio.
Two methods:
1. Minimize the tracking error for a given expected excess
return over the benchmark.
2. Maximize the expected excess return over the benchmark
without exceeding a maximum tracking error constraint.
M-V Analysis: Tracking Error
Richard R. Lindsey265
Tracking error is generally defined as the standard deviation of the portfolio returns minus the benchmark returns:
$$\text{TE} = \text{StdDev}(r_p - r_{\text{benchmark}}) = \sqrt{\text{Var}(r_p - r_b)}$$
Consider the components of the variance
$$\text{Var}(r_p - r_b) = \text{Var}(r_p) - 2\,\text{Cov}(r_p, r_b) + \text{Var}(r_b)$$
The last term is beyond our control and the first term is what we "usually" minimize.
M-V Analysis: Tracking Error
Richard R. Lindsey266
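Define
$$\xi = \begin{pmatrix} \text{Cov}(r_1, r_b) \\ \vdots \\ \text{Cov}(r_N, r_b) \end{pmatrix}$$
And our problem becomes
$$\min_w\ w'\Sigma w - 2\, w'\xi \quad \text{s.t.} \quad Aw = b \ \text{ and } \ w'\mu = \mu_p$$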
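The variance decomposition behind this objective can be checked numerically. The following is a minimal sketch using NumPy on simulated returns; the benchmark weights, the noise levels, and all variable names are hypothetical and only serve to illustrate that the portfolio variance, the covariance with the benchmark, and the benchmark variance add up to the squared tracking error.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 500, 4
asset_returns = rng.normal(0.0, 0.02, size=(T, N))    # simulated asset returns
w_b = np.array([0.4, 0.3, 0.2, 0.1])                  # hypothetical benchmark weights
# Benchmark returns with some extra noise so it is not an exact asset mix.
benchmark_returns = asset_returns @ w_b + rng.normal(0.0, 0.005, size=T)
w = np.array([0.25, 0.25, 0.25, 0.25])                # candidate portfolio

Sigma = np.cov(asset_returns, rowvar=False)           # N x N sample covariance
# Vector of Cov(r_i, r_b), one entry per asset.
cov_with_b = np.array(
    [np.cov(asset_returns[:, i], benchmark_returns)[0, 1] for i in range(N)]
)
var_b = np.var(benchmark_returns, ddof=1)

# Var(r_p) - 2 Cov(r_p, r_b) + Var(r_b)
te_sq_decomposed = w @ Sigma @ w - 2.0 * w @ cov_with_b + var_b
# Direct sample variance of the active return series.
te_sq_direct = np.var(asset_returns @ w - benchmark_returns, ddof=1)

print(te_sq_decomposed, te_sq_direct)
```

Because the benchmark variance does not depend on w, dropping it leaves the minimizer unchanged, which is why only the first two terms appear in the objective.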
M-V Analysis: Tracking Error (Factors)
Richard R. Lindsey267
If we are dealing with multiple factors and want to minimize tracking error, we note:
$$r_i = \beta_{i1} f_1 + \cdots + \beta_{ij} f_j + \cdots + \beta_{iK} f_K + \varepsilon_i$$
$$\text{Var}(r_i) = \beta_i'\,\text{Var}(f)\,\beta_i + \text{Var}(\varepsilon_i)$$
Where the vector f contains the factors into which we have decomposed returns and the residual terms for different securities have covariance of zero.
M-V Analysis: Tracking Error (Factors)
Richard R. Lindsey268
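We can then write the variance-covariance matrix as
$$\Sigma = \begin{pmatrix} \beta_{1,1} & \cdots & \beta_{1,K} \\ \vdots & & \vdots \\ \beta_{N,1} & \cdots & \beta_{N,K} \end{pmatrix} \begin{pmatrix} \text{Var}(f_1) & \cdots & \text{Cov}(f_1, f_K) \\ \vdots & & \vdots \\ \text{Cov}(f_K, f_1) & \cdots & \text{Var}(f_K) \end{pmatrix} \begin{pmatrix} \beta_{1,1} & \cdots & \beta_{N,1} \\ \vdots & & \vdots \\ \beta_{1,K} & \cdots & \beta_{N,K} \end{pmatrix} + \begin{pmatrix} \text{Var}(\varepsilon_1) & & 0 \\ & \ddots & \\ 0 & & \text{Var}(\varepsilon_N) \end{pmatrix}$$
Or
$$\Sigma = B\,\text{Var}(f)\,B' + \text{Var}(\varepsilon)$$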
M-V Analysis: Tracking Error (Factors)
Richard R. Lindsey269
B then represents the N by K matrix of factor exposures; Var(f) is a K by K matrix of factor premium variances and Var(ε) is an N by N diagonal matrix of error variances.
The squared tracking error is then
$$\text{TE}^2 = (w_p - w_b)'\, B\,\text{Var}(f)\,B'\,(w_p - w_b) + (w_p - w_b)'\,\text{Var}(\varepsilon)\,(w_p - w_b)$$
If we add any other relevant constraints, we can solve this using our quadratic optimizer.
(note: we are now minimizing the tracking error)
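To make the factor form concrete, here is a minimal NumPy sketch that builds the covariance matrix from exposures, factor covariances, and residual variances, then evaluates the squared tracking error both ways. All numbers and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 5, 2
B = rng.normal(1.0, 0.3, size=(N, K))             # hypothetical N x K exposures
var_f = np.array([[0.04, 0.01],
                  [0.01, 0.02]])                   # K x K factor covariance
var_eps = np.diag(rng.uniform(0.01, 0.03, N))      # N x N diagonal residual variances

Sigma = B @ var_f @ B.T + var_eps                  # covariance implied by the factor model

w_p = np.array([0.30, 0.25, 0.20, 0.15, 0.10])     # portfolio weights
w_b = np.full(N, 0.20)                             # benchmark weights
active = w_p - w_b

factor_part = active @ B @ var_f @ B.T @ active    # risk from active factor exposure
specific_part = active @ var_eps @ active          # risk from active stock-specific bets
te_squared = factor_part + specific_part

print(te_squared, active @ Sigma @ active)
```

Working with the K by K factor covariance rather than the full N by N matrix is the practical appeal when N is large.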
M-V Analysis: Tracking Error (Tilting)
Richard R. Lindsey270
When we actually have specific values or weights for our factor exposure, we can tilt the portfolio to those weights by applying a constraint
$$B'(w_p - w_b) = d$$
Where B is as defined earlier and d is the vector representing the tilt. For example, if we have five factors: market, size, growth, country, and sector and we wanted to overweight size and growth, we could use
$$d = (0,\ 0.1,\ 0.1,\ 0,\ 0)'$$
M-V Analysis: Tracking Error (Tilting)
Richard R. Lindsey271
The zeros in d make sure that the portfolio’s exposures to market, country and sector are the same as the benchmark’s, and the nonzero values make sure that the exposures to size and growth will be higher than the benchmark’s by 0.1.
With factor tilting, the optimization problem becomes
$$\min_{w_p}\ (w_p - w_b)'\,\text{Var}(r)\,(w_p - w_b) \quad \text{s.t.} \quad B'(w_p - w_b) = d \ \text{ and any other constraints}$$
M-V Analysis: Tracking Error (Ghost)
Richard R. Lindsey272
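There may be cases in which you do not know what the underlying securities in the benchmark are or their weights. In this case, you would minimize the tracking error with respect to the history of returns of the benchmark. One possible approach is to minimize
$$\text{TE}^2 = \begin{pmatrix} w_p \\ -1 \end{pmatrix}' \begin{pmatrix} B \\ \beta_b' \end{pmatrix} \text{Var}(f) \begin{pmatrix} B \\ \beta_b' \end{pmatrix}' \begin{pmatrix} w_p \\ -1 \end{pmatrix} + \begin{pmatrix} w_p \\ -1 \end{pmatrix}' \begin{pmatrix} \text{Var}(\varepsilon) & 0 \\ 0 & \text{Var}(\varepsilon_b) \end{pmatrix} \begin{pmatrix} w_p \\ -1 \end{pmatrix}$$
Where $\beta_b$ is the benchmark’s factor exposure and $\varepsilon_b$ is the benchmark’s error term. Now that we have described the tracking error, we continue as before.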
M-V Analysis: Tracking Error (Risk-Adj)
Richard R. Lindsey273
As indicated earlier, an alternative approach is to have a maximum tracking error constraint and maximize expected return of the portfolio subject to that constraint. We could write this as
$$\max_w\ w'\mu \quad \text{s.t.} \quad \text{Var}(r_p - r_b) \le \sigma_x^2$$
And any other constraints. Alternatively, if we did not have a target mean or tracking error, we could use a tracking error risk aversion parameter A and write
$$\max_w\ w'\mu - A\,\text{Var}(r_p - r_b)$$
M-V Analysis: Tracking Error (Risk-Adj)
Richard R. Lindsey274
Note that these two formulations are related. The set of maximum-return portfolios obtained as we vary the tracking error constraint is identical to the set of optimal portfolios obtained as we vary the tracking-error risk aversion parameter. In other words, we can always choose parameters so the two formulations are equivalent. This property may be useful for solving the optimization problem depending on how our optimizer wants the problem to be set.
M-V Analysis
Richard R. Lindsey277
Get the idea?
Once we know how to solve the portfolio optimization problem, everything else is just a wrinkle.
That doesn’t mean that it’s easy – what it means is that we have to figure out how to pose the problem that we want to solve in a manner in which we can solve it (with the help of an optimizer).
But, just for fun, let’s see if there is anything else we can learn.
M-V Analysis Utility
Richard R. Lindsey278
Notice that in the numerical example at the beginning of
class, we assumed that we wanted an expected return for
the portfolio of 15% and optimized to achieve that
objective. What makes this right?
Theory would tell us that what we want to do is find the point on the efficient frontier which maximizes the investor’s utility.
Note that less risk averse investors will have “flatter” indifference curves.
M-V Analysis Utility
Richard R. Lindsey279
In practice, we often use a modified approach to mean-variance analysis in which we construct optimal portfolios for different risk tolerance parameters (λ), and by varying λ, find the efficient frontier.
In this approach, we trade off risk against return by maximizing
$$\max_w U = \max_w \left( \mu_p - \frac{1}{2\lambda}\sigma_p^2 \right) = \max_w \left( w'\mu - \frac{1}{2\lambda}\, w'\Sigma w \right)$$
For various risk tolerances λ.
M-V Analysis Utility
Richard R. Lindsey280
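Where
$$\Sigma_{ij} = \text{Cov}(R_i - c,\ R_j - c), \qquad \mu_i = E[R_i - c]$$
The unconstrained optimum is found using the FOC
$$\frac{dU}{dw} = \mu - \frac{1}{\lambda}\Sigma w = 0 \quad \Rightarrow \quad w^* = \lambda\,\Sigma^{-1}\mu$$
Under the normal regularity conditions.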
M-V Analysis Utility
Richard R. Lindsey281
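Or with equality constraints
$$\max_w U = \max_w\ w'\mu - \frac{1}{2\lambda}\, w'\Sigma w \quad \text{subject to} \quad Aw = b$$
Forming the standard Lagrangian
$$L = w'\mu - \frac{1}{2\lambda}\, w'\Sigma w - \gamma'(Aw - b)$$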
M-V Analysis Utility
Richard R. Lindsey282
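FOC
$$\frac{\partial L}{\partial w} = \mu - \frac{1}{\lambda}\Sigma w - A'\gamma = 0 \quad \Rightarrow \quad w^* = \lambda\,\Sigma^{-1}(\mu - A'\gamma)$$
$$\frac{\partial L}{\partial \gamma} = Aw - b = 0 \quad \Rightarrow \quad Aw = b$$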
M-V Analysis Utility/2-Fund Separation
Richard R. Lindsey283
Solving for the optimal weights
$$w^* = \Sigma^{-1}A'(A\Sigma^{-1}A')^{-1}b + \lambda\,\Sigma^{-1}\left(\mu - A'(A\Sigma^{-1}A')^{-1}A\Sigma^{-1}\mu\right)$$
Notice that the optimal solution is split into a constrained minimum-variance portfolio and a speculative portfolio. This is known as two-fund separation. The first term does not depend either on the expected returns or on the risk tolerance – it is the constrained minimum-variance portfolio. The second term depends on the expected returns and the investor’s risk tolerance.
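The two-fund split can be verified numerically. The sketch below, assuming NumPy and entirely hypothetical inputs (three assets, a single budget constraint, risk tolerance 0.5), computes the two terms separately and compares their sum with a direct solve of the first-order conditions.

```python
import numpy as np


def two_fund_weights(Sigma, mu, A, b, lam):
    """Return (minimum-variance term, speculative term) of the
    equality-constrained mean-variance solution."""
    Sigma_inv = np.linalg.inv(Sigma)
    # M = Sigma^{-1} A' (A Sigma^{-1} A')^{-1}
    M = Sigma_inv @ A.T @ np.linalg.inv(A @ Sigma_inv @ A.T)
    w_min_var = M @ b
    # M.T @ mu equals (A Sigma^{-1} A')^{-1} A Sigma^{-1} mu
    w_spec = lam * Sigma_inv @ (mu - A.T @ (M.T @ mu))
    return w_min_var, w_spec


# Hypothetical inputs.
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
mu = np.array([0.03, 0.05, 0.07])
A = np.array([[1.0, 1.0, 1.0]])      # budget constraint: weights sum to 1
b = np.array([1.0])
lam = 0.5

w_min_var, w_spec = two_fund_weights(Sigma, mu, A, b, lam)
w_star = w_min_var + w_spec
print(w_min_var, w_spec, w_star)
```

Under these assumptions the first term satisfies the constraints on its own, while the second is a zero-net-investment overlay that scales with the risk tolerance.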
M-V Analysis Efficiency of Solution
Richard R. Lindsey284
A brief aside:
Note that constrained optimization reduces the efficiency of the
solution. A constrained solution must be less optimal than an
unconstrained solution (assuming that the constraint is
binding). The loss in efficiency can be measured as the
difference between a constrained and unconstrained solution.
But, not every difference between constrained and unconstrained
portfolios is statistically or economically significant. So we
might want to test whether there is a difference. One way to
test for significance is to use the Sharpe ratio (SR).
▲▲
M-V Analysis Efficiency of Solution
Richard R. Lindsey285
Consider a simple case of running an unconstrained optimization with k* assets and a constrained optimization with k assets (k* > k). We can use
$$\frac{(N - k^*)\,(SR^{*2} - SR^2)}{(k^* - k)\,(1 + SR^2)} \sim F_{k^*,\ N-(k^*+k+1)}$$
Where the statistic is F-distributed and the Sharpe Ratio is
$$SR = \frac{r - r_f}{\sigma}$$
Asset-Liability Management
Richard R. Lindsey286
Now consider the problem when we also have stochastic liabilities. In this case, we focus on the difference between assets and liabilities. This is known as surplus. The change in surplus depends directly on the returns of the asset portfolio ($R_p$) as well as the liability returns ($R_l$).
$$\Delta\text{Surplus} = \text{Assets} \times R_p - \text{Liabilities} \times R_l$$
We will express surplus returns as a change in surplus relative to assets
$$\frac{\Delta\text{Surplus}}{\text{Assets}} = R_p - \frac{\text{Liabilities}}{\text{Assets}}\, R_l = R_p - f R_l$$
Asset-Liability Management
Richard R. Lindsey287
Where f is the ratio of liabilities to assets. If we set f = 1 and $R_l = c$, we are back in the world without liabilities (or where cash is our liability).
If we want to use the same optimizer, we need to transform this problem into one of surplus, i.e. we need to express covariance in terms of surplus risk and expected returns in terms of the relative return of assets versus liabilities.
$$\max_w\ w'\mu_S - \frac{1}{2\lambda}\, w'\Sigma_S w \quad \text{subject to} \quad Aw = b$$
Asset-Liability Management
Richard R. Lindsey288
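$$\Sigma_S = \begin{pmatrix} 1 & 0 & \cdots & 0 & -f \\ 0 & 1 & \cdots & 0 & -f \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & -f \end{pmatrix} \begin{pmatrix} \sigma_{11} & \cdots & \sigma_{1k} & \sigma_{1l} \\ \vdots & & \vdots & \vdots \\ \sigma_{k1} & \cdots & \sigma_{kk} & \sigma_{kl} \\ \sigma_{l1} & \cdots & \sigma_{lk} & \sigma_{ll} \end{pmatrix} \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ -f & -f & \cdots & -f \end{pmatrix}$$
$$\mu_S = \begin{pmatrix} \mu_1 - f\mu_l \\ \vdots \\ \mu_k - f\mu_l \end{pmatrix} + (1 - f)\,c\,\mathbf{1}$$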
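The surplus transformation amounts to pre- and post-multiplying the joint asset-liability covariance matrix by a matrix with an identity block and a column of −f. A minimal NumPy sketch follows; the function name and all inputs are hypothetical, and the liability is assumed to be the last row and column of the joint covariance matrix.

```python
import numpy as np


def surplus_inputs(Sigma_al, mu_assets, mu_liab, cash, f):
    """Build surplus covariance and surplus expected returns.

    Sigma_al is the (k+1) x (k+1) covariance of k assets and the liability
    (liability last); mu_* are excess returns over cash; f is the ratio of
    liabilities to assets.
    """
    k = len(mu_assets)
    M = np.hstack([np.eye(k), -f * np.ones((k, 1))])   # k x (k+1)
    Sigma_S = M @ Sigma_al @ M.T
    mu_S = mu_assets - f * mu_liab + (1.0 - f) * cash
    return Sigma_S, mu_S


# Hypothetical two-asset example with one liability.
Sigma_al = np.array([[0.0400, 0.0060, 0.0030],
                     [0.0060, 0.0100, 0.0040],
                     [0.0030, 0.0040, 0.0064]])
mu_assets = np.array([0.06, 0.03])
Sigma_S, mu_S = surplus_inputs(Sigma_al, mu_assets, mu_liab=0.02, cash=0.03, f=0.8)
print(Sigma_S)
print(mu_S)
```

Each entry of the surplus covariance is $\sigma_{ij} - f\sigma_{il} - f\sigma_{jl} + f^2\sigma_{ll}$, so assets that co-move with liabilities look less risky in surplus terms.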
Asset-Liability Management
Richard R. Lindsey289
Now our solution is
$$w^* = \Sigma_S^{-1}A'(A\Sigma_S^{-1}A')^{-1}b + \lambda\,\Sigma_S^{-1}\left(\mu_S - A'(A\Sigma_S^{-1}A')^{-1}A\Sigma_S^{-1}\mu_S\right)$$
By varying the risk-tolerance parameter, we can trace out the surplus-efficient frontier.
Asset-Liability Management
Richard R. Lindsey290
The unconstrained (asset-only) frontier and the surplus-efficient frontier coincide if:
Liabilities are cash (or, equivalently, if assets have zero covariance with liabilities)
All assets have the same covariance with liabilities
There exists a liability-mimicking asset and it lies on the efficient frontier
The Investment Universe
Richard R. Lindsey291
The choice of the investment universe has a significant
impact on the outcome of portfolio construction. If we
constrain ourselves to NYSE equities, it is likely that our
optimizer will produce a solution skewed toward smaller
cap stocks (why?). If we add Nasdaq equities and foreign
equities, this is likely to change as the variance-covariance
structure changes.
In general, to avoid the accumulation of estimation errors,
we would like to limit our portfolio optimization to groups
of assets with high intragroup and low intergroup
correlations.
The Investment Universe
Richard R. Lindsey292
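In the two asset case, our unconstrained optimization produces
$$w^* = \lambda\,\Sigma^{-1}\mu$$
$$\begin{pmatrix} w_1^* \\ w_2^* \end{pmatrix} = \lambda \begin{pmatrix} \Sigma^{-1}_{11} & \Sigma^{-1}_{12} \\ \Sigma^{-1}_{21} & \Sigma^{-1}_{22} \end{pmatrix} \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}$$
$$\frac{dw_1^*}{d\mu_1} = \lambda\,\Sigma^{-1}_{11} = \lambda\,\frac{\sigma_{22}}{\sigma_{11}\sigma_{22} - \sigma_{12}\sigma_{21}} = \lambda\,\frac{1}{\sigma_{11}}\,\frac{1}{1 - \rho^2}$$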
The Investment Universe
Richard R. Lindsey293
As the correlation between the two assets approaches 1, the
portfolio weights will react very sensitively to changes in
means (or expected return estimates). As assets become
more similar, any expected return becomes increasingly
important for the allocation decision. Portfolio
optimization with highly correlated assets will almost
certainly lead to extreme and undiversified results.
In the next homework set, I have you explore a method of reducing this
problem using cluster analysis.
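The sensitivity of the weights to the means can be illustrated with the closed-form two-asset solution. This is a small sketch in plain Python; the volatilities, expected returns, and risk tolerance are hypothetical.

```python
def weight_sensitivity(sigma1, rho, lam=1.0):
    """d w1* / d mu1 = lam / (sigma_11 * (1 - rho^2)) for two assets."""
    return lam / (sigma1 ** 2 * (1.0 - rho ** 2))


def optimal_weights(mu1, mu2, sigma1, sigma2, rho, lam=1.0):
    """w* = lam * Sigma^{-1} mu using the closed-form 2x2 inverse."""
    s11, s22, s12 = sigma1 ** 2, sigma2 ** 2, rho * sigma1 * sigma2
    det = s11 * s22 - s12 ** 2
    w1 = lam * (s22 * mu1 - s12 * mu2) / det
    w2 = lam * (s11 * mu2 - s12 * mu1) / det
    return w1, w2


# Two assets with 20% volatility and a half-point difference in expected return.
for rho in (0.0, 0.9, 0.99):
    w1, w2 = optimal_weights(0.050, 0.055, 0.20, 0.20, rho)
    print(f"rho={rho:4.2f}  w1={w1:8.2f}  w2={w2:8.2f}  "
          f"dw1/dmu1={weight_sensitivity(0.20, rho):8.1f}")
```

As the correlation rises, the small return advantage of the second asset is turned into a large long-short position, which is the extreme and undiversified behaviour described above.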
Risk Decomposition
Richard R. Lindsey294
It is often useful to understand the sources of risk in our portfolio and how those risks are spread through it. To get at this, we can decompose risk in the following way.
Consider the standard deviation of portfolio returns
$$\sigma_p = (w'\Sigma w)^{1/2} = \Big( \sum_i w_i^2 \sigma_{ii} + \sum_i \sum_{j \ne i} w_i w_j \sigma_{ij} \Big)^{1/2}$$
The first question we would like to address is how does portfolio risk change as we change the holdings of a particular asset?
Risk Decomposition
Richard R. Lindsey295
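What we need is the "marginal contribution to risk" (MCTR), which can be easily calculated
$$\text{MCTR}_{k \times 1} = \frac{d\sigma_p}{dw} = \frac{\Sigma w}{\sigma_p}$$
Where the ith element in the k by 1 vector is
$$\frac{d\sigma_p}{dw_i} = \frac{w_i\sigma_{ii} + \sum_{j \ne i} w_j \sigma_{ij}}{\sigma_p} = \frac{\sigma_{ip}}{\sigma_p} = \beta_i \sigma_p$$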
Risk Decomposition
Richard R. Lindsey296
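Note that if we add the weighted MCTRs of all securities in the portfolio, we get the volatility of the portfolio
$$\sum_i w_i \frac{d\sigma_p}{dw_i} = \sum_i w_i \frac{\sigma_{ip}}{\sigma_p} = \sigma_p$$
as we would expect. If we divide this expression by the volatility of the portfolio, we get
$$\sum_i \frac{w_i}{\sigma_p} \frac{d\sigma_p}{dw_i} = \sum_i w_i \frac{\sigma_{ip}}{\sigma_p^2} = \sum_i w_i \beta_i = 1$$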
Risk Decomposition
Richard R. Lindsey297
Which shows that the percentage contributions to risk (PCTR), which add up to 100%, are equal to the weighted betas. This can be written as a vector
$$\text{PCTR}_{k \times 1} = \frac{W}{\sigma_p} \frac{d\sigma_p}{dw}$$
Where W is a k by k diagonal matrix with portfolio weights on the diagonal. Each element of the vector PCTR is given by
$$\text{PCTR}_i = \frac{w_i}{\sigma_p} \frac{d\sigma_p}{dw_i} = w_i \beta_i$$
Bibliography
Richard R. Lindsey298
Huang and Litzenberger, Foundations for Financial
Economics, North-Holland.
Intriligator, Mathematical Optimization and Economic
Theory, Prentice-Hall.
▲▲▲▲▲▲▲▲▲▲▲▲▲
Factor Risk Contributions
Richard R. Lindsey300
Last time we looked at risk decomposition of a portfolio.
Today we will assume that we can decompose the
uncertainty in asset returns into common factors.
Stocks are at least partly driven by characteristics like
industry, country, size, etc.
We can write the risk premium of a given stock as a
combination of these factor returns weighted by their
respective factor exposures.
▲▲▲▲
Factor Risk Contributions
Richard R. Lindsey301
$$r = Xf + u$$
Where r is a k by 1 vector of risk premia (asset return minus cash), X is a k by p matrix of factor exposures, f is a p by 1 vector of factor returns and u is a k by 1 vector of asset-specific returns which are both uncorrelated with factor returns and uncorrelated across assets.
The covariance matrix of excess returns can be expressed
$$E[rr'] = E[(Xf + u)(Xf + u)']$$
Factor Risk Contributions
Richard R. Lindsey302
$$E[rr'] = E[Xfu'] + E[Xff'X'] + E[uu'] + E[uf'X'] = X\Sigma_{ff}X' + \Sigma_{uu}$$
Where $\Sigma_{ff}$ denotes the p by p covariance matrix of factor returns and $\Sigma_{uu}$ is a k by k (diagonal) covariance matrix of asset-specific returns.
Factor Risk Contributions
Richard R. Lindsey303
We can now decompose the portfolio risk into a common and a specific part
$$\sigma_p^2 = w'X\Sigma_{ff}X'w + w'\Sigma_{uu}w$$
Using the same logic as last time, we get for the marginal factor contribution to risk MFCTR (a p by 1 vector)
$$\text{MFCTR} = \frac{d\sigma_p}{d(X'w)} = \frac{\Sigma_{ff}X'w}{\sigma_p}$$
Implied View Analysis
Richard R. Lindsey304
So far, we have calculated the optimal portfolio weights
from given return expectations. But often we are working
with previously established portfolios and all we have are
the weights. How can we determine what the expectations
are and whether or not the weights make sense?
This is done using "reverse optimization", which maps the positions into implicit return expectations.
Implied View Analysis
Richard R. Lindsey305
In an unconstrained portfolio optimization, marginal risks
are traded off against marginal returns. A portfolio is
therefore optimal when the relationship between marginal
risks and marginal returns is the same for all assets in the
portfolio
Since the Sharpe ratio of the portfolio measures the
relationship between incremental risk and return, we can
express the relationship between marginal return and
marginal risk as:
▲▲▲▲▲▲▲▲▲▲▲
Implied View Analysis
Richard R. Lindsey306
$$\mu = \beta\,\mu_p = \frac{\Sigma w}{\sigma_p}\,\frac{\mu_p}{\sigma_p}$$
Where the beta measures the sensitivity of an asset to movements of the portfolio:
$$\beta = \frac{\Sigma w}{\sigma_p^2}$$
Note that this follows from portfolio mathematics not from an equilibrium condition, but if the portfolio were the market portfolio, the implied returns would be the returns that investors would need to hold the market portfolio.
Implied View Analysis
Richard R. Lindsey307
This kind of analysis can be used to show investors whether
their return expectations are consistent with market
realities, i.e., whether they are over or under investing
their risk budget in particular areas and whether they are
investing in a way that is consistent with their views.
▲▲▲▲▲▲▲▲▲▲▲
Implied View Analysis
Richard R. Lindsey308
Let’s consider an example
Expected return 10% (5% excess); Volatility 8.79%; Sharpe ratio 0.57

Asset           Weight %   Return %   Volatility %
Equity             40         11          18
Absolute Rtn       15         12           8
Private Eqty       15         11           9
Real Estate         5         10          14
US Bonds           25          7           3
Non-US Bonds        0          8           8
Cash                0          5           0
Implied View Analysis
Richard R. Lindsey309
With a correlation matrix
1.0 0.0 0.5 0.5 0.3 0.3 0.0
0.0 1.0 0.0 0.0 0.0 0.0 0.0
0.5 0.0 1.0 0.5 0.3 0.3 0.0
0.5 0.0 0.5 1.0 0.5 0.3 0.0
0.3 0.0 0.3 0.5 1.0 0.8 0.0
0.3 0.0 0.3 0.3 0.8 1.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 1.0
▲▲▲▲▲▲▲▲▲▲▲
Implied View Analysis
Richard R. Lindsey310
We can compute the marginal contribution to risk using the equation from last time
$$\text{MCTR}_i = \beta_i\,\sigma_p$$
We compute the MCTR for US Bonds as 0.014. What does this mean? Suppose instead of holding 25%, we invested 26%; then our total portfolio risk would change from 8.7948 to 8.8089:
$$\frac{d\sigma_p}{dw_{\text{US Bonds}}} \approx \frac{\Delta\sigma_p}{\Delta w_{\text{US Bonds}}} = 8.8089 - 8.7948 = 0.0141$$
Implied View Analysis
Richard R. Lindsey311
Or for the complete picture

Asset           PCTR %   MCTR    Implied Rtn %
Equity           79.1    0.174       9.84
Absolute Rtn      1.9    0.011       0.62
Private Eqty     10.2    0.060       3.39
Real Estate       4.8    0.085       4.80
US Bonds          4.0    0.014       0.80
Non-US Bonds      0.0    0.029       1.66
Cash              0.0    0.000       0.00

Biggest increase in risk would come from equities (already about 80%), smallest increase from Absolute Return (most diversifying).
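The risk decomposition and the implied returns can be recomputed from the weights, volatilities, and correlations given in the example. The sketch below assumes NumPy and takes the inputs exactly as printed; because those inputs are rounded, the outputs should be close to, but not necessarily identical to, the table.

```python
import numpy as np

assets = ["Equity", "Absolute Rtn", "Private Eqty", "Real Estate",
          "US Bonds", "Non-US Bonds", "Cash"]
w = np.array([0.40, 0.15, 0.15, 0.05, 0.25, 0.00, 0.00])
vol = np.array([0.18, 0.08, 0.09, 0.14, 0.03, 0.08, 0.00])
corr = np.array([
    [1.0, 0.0, 0.5, 0.5, 0.3, 0.3, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 1.0, 0.5, 0.3, 0.3, 0.0],
    [0.5, 0.0, 0.5, 1.0, 0.5, 0.3, 0.0],
    [0.3, 0.0, 0.3, 0.5, 1.0, 0.8, 0.0],
    [0.3, 0.0, 0.3, 0.3, 0.8, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
])
Sigma = np.outer(vol, vol) * corr          # covariance from vols and correlations

sigma_p = np.sqrt(w @ Sigma @ w)           # portfolio volatility
mctr = Sigma @ w / sigma_p                 # marginal contribution to risk
pctr = w * mctr / sigma_p                  # percentage contribution (weighted betas)
excess_return_p = 0.05                     # 5% portfolio excess return from the example
implied = mctr / sigma_p * excess_return_p # beta_i times portfolio excess return

for name, p, m, r in zip(assets, pctr, mctr, implied):
    print(f"{name:13s} PCTR={100 * p:5.1f}%  MCTR={m:.3f}  implied={100 * r:5.2f}%")
```

The percentage contributions sum to one and the weighted implied returns sum to the portfolio excess return, which is a useful internal check on any such table.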
▲▲▲▲▲▲▲▲▲▲▲
Implied View Analysis
Richard R. Lindsey312
[Figure not recoverable: only the axis tick labels survive (vertical axis 0 to 12, horizontal axis 0 to 0.2).]
Implied View Analysis
Richard R. Lindsey313
Implied excess return for Absolute Return strategies is much
lower than the forecast. This means that the investor is
underspending risk in this area.
For equities, the investor is overspending in the risk
allocation. A large allocation in a relatively
undiversifying asset requires large implied return to make
the portfolio optimal.
In this case, it is apparent that the investor’s implied return
for equities is much larger than historical experience.
▲▲▲▲▲▲▲▲▲▲▲
Implied View Analysis
Richard R. Lindsey314
View Optimization
This approach can be used iteratively where changes are
made to allocations or to forecasts until there is reasonable
correspondence between implied returns and expected
returns.
It can also be used to build a consensus view within a
portfolio team.
Note, however, that these views are for an unconstrained investor.
▲▲▲▲▲▲▲▲▲▲▲
Correcting for Autocorrelation
Richard R. Lindsey315
Some asset classes appear to have much less risk than one
might commonly believe.
Corporate high yield
Hedge funds
If the risk for an asset class is underestimated, too much
capital will be allocated to that class.
Loss of efficiency in the portfolio.
Broader issue of societal allocations.
▲▲▲▲▲▲▲
Correcting for Autocorrelation
Richard R. Lindsey316
Positively autocorrelated returns (high returns tend to be
followed by high returns), show less historical volatility
than an uncorrelated series.
Where does autocorrelation come from?
Infrequent trading in illiquid securities.
Real estate
High yield
Hedge funds
Non-synchronous trading
▲▲▲▲▲▲▲
Correcting for Autocorrelation
Richard R. Lindsey317
One of the ways to check and correct for autocorrelation is known as the Blundell-Ward filter:
$$r_t^* = \frac{1}{1 - a_1}\, r_t - \frac{a_1}{1 - a_1}\, r_{t-1}$$
Which creates a new, transformed return series, r*, using the returns r at times t and t-1. The coefficient $a_1$ is estimated from an autoregressive first-order (AR(1)) model:
$$r_t = a_0 + a_1 r_{t-1} + \varepsilon_t$$
Correcting for Autocorrelation
Richard R. Lindsey318
Note that by applying this filter the mean is unchanged:
$$\bar{r}^* = \frac{1}{1 - a_1}\,\bar{r} - \frac{a_1}{1 - a_1}\,\bar{r} = \bar{r}$$
And the variance increases:
$$\sigma^2(r_t^*) = \frac{1 + a_1^2}{(1 - a_1)^2}\,\sigma^2(r_t)$$
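A minimal sketch of the filter on simulated data, assuming NumPy: the series is generated from an AR(1) process with a hypothetical coefficient of 0.5, the coefficient is estimated by OLS, and the filter is applied. The helper names are illustrative only.

```python
import numpy as np


def estimate_ar1(r):
    """OLS estimate of (a0, a1) in r_t = a0 + a1 * r_{t-1} + e_t."""
    x, y = r[:-1], r[1:]
    a1 = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    a0 = y.mean() - a1 * x.mean()
    return a0, a1


def blundell_ward(r, a1):
    """r*_t = r_t / (1 - a1) - a1 / (1 - a1) * r_{t-1}."""
    return (r[1:] - a1 * r[:-1]) / (1.0 - a1)


# Simulated smoothed return series (hypothetical parameters).
rng = np.random.default_rng(42)
T, a1_true, mean = 5000, 0.5, 0.01
r = np.empty(T)
r[0] = mean
for t in range(1, T):
    r[t] = mean * (1 - a1_true) + a1_true * r[t - 1] + rng.normal(0.0, 0.01)

_, a1_hat = estimate_ar1(r)
r_star = blundell_ward(r, a1_hat)
_, a1_after = estimate_ar1(r_star)

print("estimated a1:", a1_hat)
print("mean before/after:", r.mean(), r_star.mean())
print("std before/after:", r.std(ddof=1), r_star.std(ddof=1))
print("a1 of filtered series:", a1_after)
```

In this simulation the filtered series keeps roughly the same mean, shows a visibly higher volatility, and has little remaining first-order autocorrelation.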
▲▲▲▲▲▲▲
Correcting for Autocorrelation
Richard R. Lindsey319
This approach can also be used to arrive at more realistic beta estimates.
Let’s consider an example using four hedge fund indices (convertible arbitrage, distressed debt, event-driven and macro) and the MSCI USA index as the market. We could run three types of regressions
$$r_{it} = \alpha + \beta_0 r_{mt} + \varepsilon_t$$
$$r_{it}^* = \alpha + \beta_0^* r_{mt} + \varepsilon_t$$
$$r_{it} = \alpha + \beta_0 r_{mt} + \beta_1 r_{mt-1} + \beta_2 r_{mt-2} + \beta_3 r_{mt-3} + \varepsilon_t$$
Correcting for Autocorrelation
Richard R. Lindsey320
Index          a1            β0     β*0    β0+β1+β2+β3
Convertible    0.55 (7.66)   0.09   0.22   0.25
Distressed     0.52 (6.86)   0.18   0.44   0.49
Event-Driven   0.28 (3.56)   0.29   0.38   0.38
Macro          0.18 (2.10)   0.29   0.37   0.52

The betas from ordinary regressions appear to underestimate the true market exposure and therefore overstate the diversifying effects associated with the hedge funds.
Problems with the Covariance Matrix
Richard R. Lindsey321
The covariance matrix is a fundamental tool for our analysis, so it is worthwhile spending a bit of time looking at its properties.
Since this is intended to be a covariance matrix, it must be true that $w'\Sigma w \ge 0$ for all w. In other words, it must be positive semi-definite. A necessary and sufficient condition for positive semi-definiteness (for symmetric matrices) is that all of the eigenvalues of Σ are positive or zero and at least one eigenvalue is greater than zero.
Problems with the Covariance Matrix
Richard R. Lindsey322
However, we may find that we sometimes have negative eigenvalues when we have estimated our covariance matrix.
This can arise for several reasons:
Estimates are generated from time series of different lengths.
The number of observations is less than the number of assets or
risk factors.
Two or more assets are collinear.
▲▲▲▲▲
Problems with the Covariance Matrix
Richard R. Lindsey323
Consider the following:
$$\Sigma = \begin{pmatrix} 1.0 & 0.9 & -0.3 \\ 0.9 & 1.0 & 0.7 \\ -0.3 & 0.7 & 1.0 \end{pmatrix}$$
Where the variances have been standardized to 1.0 for simplicity.
The eigenvalues can be found
$$(e_1, e_2, e_3) = (2.0,\ 1.29,\ -0.3)$$
Problems with the Covariance Matrix
Richard R. Lindsey324
So this matrix is not positive semi-definite. One of the ways to fix this is to perform an adjustment to the matrix.
1. Find the smallest eigenvalue (here $e_3$)
2. Create a minimum zero eigenvalue by shifting the covariance matrix, $\Sigma^* = \Sigma - e_3 I$, where I is an identity matrix.
3. Scale the resulting matrix by $1/(1 - e_3)$ to enforce variances of 1: $\Sigma^{**} = \frac{1}{1 - e_3}\,\Sigma^*$
Problems with the Covariance Matrix
Richard R. Lindsey325
For our example, the new adjusted matrix is
$$\Sigma^{**} = \begin{pmatrix} 1.0 & 0.69 & -0.23 \\ 0.69 & 1.0 & 0.54 \\ -0.23 & 0.54 & 1.0 \end{pmatrix}$$
With eigenvalues
$$(e_1, e_2, e_3) = (1.77,\ 1.22,\ 0)$$
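The shift-and-rescale adjustment can be sketched in a few lines of NumPy. The example assumes the correlation between the first and third asset is −0.3, which is the sign choice that reproduces the quoted eigenvalues; with +0.3 the smallest eigenvalue would be close to zero instead.

```python
import numpy as np

# Example matrix, assuming the (1,3) correlation is -0.3.
corr = np.array([[1.0, 0.9, -0.3],
                 [0.9, 1.0, 0.7],
                 [-0.3, 0.7, 1.0]])

eigenvalues = np.linalg.eigvalsh(corr)       # ascending order for symmetric input
e_min = eigenvalues[0]                       # smallest (here negative) eigenvalue

shifted = corr - e_min * np.eye(3)           # step 2: smallest eigenvalue becomes 0
adjusted = shifted / (1.0 - e_min)           # step 3: rescale so the diagonal is 1 again
adjusted_eigs = np.linalg.eigvalsh(adjusted)

print("eigenvalues:", np.round(eigenvalues[::-1], 2))
print("adjusted matrix:\n", np.round(adjusted, 2))
print("adjusted eigenvalues:", np.round(adjusted_eigs[::-1], 2))
```

Note that the adjustment shrinks every off-diagonal correlation by the same factor, so it repairs the matrix at the cost of uniformly dampening all estimated correlations.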
▲▲▲▲▲
Significance of the Inverse Covariance
Richard R. Lindsey326
Let’s turn to the economics of our unconstrained solution
$$w^* = \lambda\,\Sigma^{-1}\mu$$
If we run the regression of asset i against all other k-1 assets
$$r_i = a_i + \sum_{j \ne i} \beta_{ij} r_j + \varepsilon_i$$
The explanatory power of this regression is given as $R_i^2$.
Significance of the Inverse Covariance
Richard R. Lindsey327
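It can then be shown that
$$\Sigma^{-1} = \begin{pmatrix} \dfrac{1}{\sigma_{11}(1 - R_1^2)} & \dfrac{-\beta_{12}}{\sigma_{11}(1 - R_1^2)} & \cdots & \dfrac{-\beta_{1k}}{\sigma_{11}(1 - R_1^2)} \\ \dfrac{-\beta_{21}}{\sigma_{22}(1 - R_2^2)} & \dfrac{1}{\sigma_{22}(1 - R_2^2)} & \cdots & \dfrac{-\beta_{2k}}{\sigma_{22}(1 - R_2^2)} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{-\beta_{k1}}{\sigma_{kk}(1 - R_k^2)} & \dfrac{-\beta_{k2}}{\sigma_{kk}(1 - R_k^2)} & \cdots & \dfrac{1}{\sigma_{kk}(1 - R_k^2)} \end{pmatrix}$$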
Significance of the Inverse Covariance
Richard R. Lindsey328
Which means that the optimal weight for asset i is
$$w_i^* = \lambda\,\frac{\mu_i - \sum_{j \ne i} \beta_{ij}\mu_j}{\sigma_{ii}(1 - R_i^2)}$$
The numerator is the excess return after regression hedging (i.e. the excess return after the reward for implicit exposure to other assets has been removed). This is equivalent to $a_i$ in the regression.
Significance of the Inverse Covariance
Richard R. Lindsey329
Since $\sigma_{ii}$ is the total risk associated with asset i, the fraction of risk that cannot be hedged away is the denominator of our expression, $\sigma_{ii}(1 - R_i^2)$.
In terms of the regression equation, this is the unexplained variance or the variance of the error term.
Significance of the Inverse Covariance
Richard R. Lindsey330
Since the regression attempts to minimize the variance of
the errors – this means that the optimization will put
maximum weight into those assets that are similar to the
other assets (as a group) but have a small return
advantage. This property leads to implausible results
when estimation errors are taken into account.
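The regression reading of the inverse covariance matrix can be checked directly. The sketch below assumes NumPy and a hypothetical three-asset covariance matrix; it computes the population regression of one asset on the others and compares the result with the corresponding row of the inverse and with the optimal weight.

```python
import numpy as np

# Hypothetical covariance matrix, expected excess returns, and risk tolerance.
Sigma = np.array([[0.040, 0.018, 0.006],
                  [0.018, 0.090, 0.030],
                  [0.006, 0.030, 0.160]])
mu = np.array([0.04, 0.06, 0.08])
lam = 0.5

Sigma_inv = np.linalg.inv(Sigma)
i = 0
others = [j for j in range(Sigma.shape[0]) if j != i]

# Population regression of asset i on all other assets.
beta = np.linalg.solve(Sigma[np.ix_(others, others)], Sigma[others, i])
resid_var = Sigma[i, i] - Sigma[i, others] @ beta     # sigma_ii * (1 - R_i^2)
r_squared = 1.0 - resid_var / Sigma[i, i]

# Weight from the regression form versus the direct solution.
w_i_regression = lam * (mu[i] - beta @ mu[others]) / resid_var
w_direct = lam * Sigma_inv @ mu

print("R^2:", r_squared)
print("diagonal of inverse:", Sigma_inv[i, i], "vs", 1.0 / resid_var)
print("weight:", w_i_regression, "vs", w_direct[i])
```

The denominator shrinks as the other assets explain more of asset i, so a small hedged return advantage on a highly explainable asset produces a very large weight.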
Covariance in Good and Bad Times
Richard R. Lindsey331
Often we find that during times of market difficulty, correlations within an asset class increase. Sometimes this is stated, "In times of stress, all correlations go to one."
Is the low correlation in a full sample covariance matrix just an artifact of reasonably positive correlation in normal times and of highly negative correlation in unusual times? Or is it a diversifying asset?
Investors may not want to bet on average correlation – they may actually have preferences that vary depending on the state of the world.
Covariance in Good and Bad Times
Richard R. Lindsey332
To address these types of issues, we may want to optimize our portfolio based upon our expectation of the occurrence of "normal" and "unusual" times.
To determine what are unusual times, we will define them according to their statistical distance from the mean vector
$$D_t = (r_t - \hat{\mu})'\,\hat{\Sigma}^{-1}\,(r_t - \hat{\mu}) = d_t'\,\hat{\Sigma}^{-1}\,d_t$$
This statistic is distributed Chi-Squared with k degrees of freedom. If we define an unusual observation as the outer 10%, we can test each time period.
Covariance in Good and Bad Times
Richard R. Lindsey333
Notice that the distance is weighted by the inverse of the
covariance matrix. This means that we take into account
asset volatilities (the same deviation from the mean might
be significant for low-volatility series but not for high-
volatility series). Hence, outliers are not necessarily
associated with down markets.
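A minimal sketch of the classification step, assuming NumPy and simulated returns for two assets. Two assets are used on purpose: for k = 2 the chi-squared quantile has the closed form −2 ln(1 − p), so no statistics library is needed; for general k a chi-squared quantile function would be required instead. All parameters are hypothetical.

```python
import math
import numpy as np

rng = np.random.default_rng(7)
cov_true = np.array([[0.0004, 0.0001],
                     [0.0001, 0.0009]])
returns = rng.multivariate_normal([0.005, 0.004], cov_true, size=2000)

mu_hat = returns.mean(axis=0)
Sigma_hat = np.cov(returns, rowvar=False)
d = returns - mu_hat
# D_t = d_t' Sigma^{-1} d_t for every period t.
D = np.einsum("ti,ij,tj->t", d, np.linalg.inv(Sigma_hat), d)

# 90% quantile of a chi-squared with 2 degrees of freedom (closed form for k = 2 only).
threshold = -2.0 * math.log(0.10)
unusual = D > threshold

Sigma_normal = np.cov(returns[~unusual], rowvar=False)
Sigma_unusual = np.cov(returns[unusual], rowvar=False)

print("threshold:", threshold)
print("share of unusual periods:", unusual.mean())
```

The two regime covariance matrices produced at the end are the inputs one would then blend using subjective regime probabilities.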
▲▲▲▲
Covariance in Good and Bad Times
Richard R. Lindsey334
We could now build a new covariance matrix weighted by our subjective (or estimated) probabilities.
$$\Sigma_{new} = p\,\lambda_{normal}\,\Sigma_{normal} + (1 - p)\,\lambda_{unusual}\,\Sigma_{unusual}$$
Where we have included the relative risk tolerance for each regime (note that these must be scaled so they sum to the actual risk tolerance of the investor).
Note that this analysis can be very sensitive to the inclusion of new assets since that may change which periods are usual and unusual. For that reason, it may be useful to define unusual times with respect to a core set of assets.
Estimation Error
Richard R. Lindsey335
We should be clear that everything that we have done so far
is predicated on a couple of things:
1. We are using expected returns – in other words,
forecasted returns for our assets.
2. We are using an expected variance-covariance structure
– in other words, forecasted for our universe of assets.
3. If the future deviates from our forecasts by a significant
amount, we will not have an optimal portfolio. (This is an
issue of performance measurement)
▲▲▲▲▲▲▲▲▲▲▲▲▲
Estimation Error
Richard R. Lindsey336
As I have said, generally you will want to forecast the mean
in some manner (if we have time we will talk more about
this later in the course). Your forecast could be a simple
forecast (like last period’s return or the sample mean) or it
could be more complex (Delphi method; time series
forecast; multi-factor forecast).
▲▲▲▲▲▲
Estimation Error
Richard R. Lindsey337
For the variance-covariance structure, one typically uses simple approaches like the estimated structure based upon the sample history, a 250 day moving average, or an exponentially weighted average. You can add complexity to this by embedding ARCH/GARCH processes or other generalizations, but remember that if you are not using a factor decomposition (and thereby reducing the space), you are now attempting to forecast a large number of variables for a problem of any size: a covariance matrix for n assets has
$$\frac{n^2 + n}{2}$$
distinct entries.
Estimation Error
Richard R. Lindsey338
To review what I discussed last time, assume that we have an estimated mean of 10% and an estimated volatility of 20%.
Estimation error for the mean is given by
$$\frac{\sigma}{\sqrt{T}}$$
And the confidence interval is calculated as
$$\left[\mu - z\,\frac{\sigma}{\sqrt{T}},\ \ \mu + z\,\frac{\sigma}{\sqrt{T}}\right]$$
Estimation Error
Richard R. Lindsey339
For the variance, Campbell, Lo and MacKinlay have shown
$$\text{Var}(\hat{\sigma}^2) = \left(\frac{T}{\Delta t} - 1\right)^{-1} 2\sigma^4$$
We can see from these expressions that the estimation error for the mean is affected by the length of the time series T and the estimation error for the variance is affected both by the length and by the frequency of sampling (∆t).
We also see this in the following tables:
Estimation Error
Richard R. Lindsey340
Effect of Sample Period on Estimation Error for Mean Returns

Estimation Period (yrs)   Estimation Error %   95% Confidence Interval %
         1                       20                      78
         5                        9                      35
        10                        6                      25
        20                        4                      18
        50                        3                      11
Estimation Error
Richard R. Lindsey341
Effect of Sample Period on Estimation Error (%) for Variance

Estimation            Estimation Frequency
Period (yrs)   Daily   Weekly   Monthly   Quarterly
     1         0.35     0.79     1.71       3.27
     5         0.16     0.35     0.74       1.30
    10         0.11     0.25     0.52       0.91
    20         0.08     0.18     0.37       0.64
    50         0.05     0.11     0.23       0.40
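Both tables can be regenerated from the two formulas with the 20% volatility of the example. This is a sketch in plain Python; the 1.96 critical value, the reading of the confidence-interval column as the full interval width, and the 260 trading days per year are assumptions chosen because they appear to reproduce the printed figures.

```python
import math

SIGMA = 0.20   # annual volatility from the example
Z_95 = 1.96    # assumed critical value for a 95% interval


def mean_error(years, sigma=SIGMA):
    """Standard error of the mean and full width of the 95% interval."""
    se = sigma / math.sqrt(years)
    return se, 2.0 * Z_95 * se


def variance_error(years, periods_per_year, sigma=SIGMA):
    """Standard error of the variance estimate: sqrt(2 sigma^4 / (T/dt - 1))."""
    n = years * periods_per_year
    return math.sqrt(2.0 * sigma ** 4 / (n - 1))


# Sampling frequencies; 260 trading days per year is an assumption.
FREQUENCIES = {"Daily": 260, "Weekly": 52, "Monthly": 12, "Quarterly": 4}

for years in (1, 5, 10, 20, 50):
    se, width = mean_error(years)
    var_errors = "  ".join(
        f"{name}={100 * variance_error(years, n):.2f}"
        for name, n in FREQUENCIES.items()
    )
    print(f"{years:2d} yrs  mean error={100 * se:4.0f}%  CI width={100 * width:4.0f}%  {var_errors}")
```

More frequent sampling tightens the variance estimate for a fixed history length, whereas the error in the mean only falls with the length of the history.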
▲▲▲▲▲▲
What is more important – estimation error in the
mean or in the variance?
Currency in the Portfolio
Richard R. Lindsey342
When optimizing a portfolio, one often has to deal with a block structure. In other words, two or more blocks of assets (e.g. stocks and bonds, equities and currencies, active managers and passive strategies).
Often the correlation between blocks is ignored or set to zero and the problem is solved separately, or the problem is solved in a two-step process where one finds the "optimal" allocation for part of the problem and then finds the "optimal" allocation for the second part of the problem.
Currency in the Portfolio
Richard R. Lindsey343
We will study this problem using currencies.
Optimal currency hedging is the subject of ongoing debate between plan sponsors, asset managers and consultants.
We will consider asset returns (local return plus currency return minus domestic cash rate)
$$a_i = \frac{\Delta p_i}{p_i} + \frac{\Delta s_i}{s_i} - c_h$$
Currency in the Portfolio
Richard R. Lindsey344
And currency returns (local cash rate plus currency return minus domestic cash rate)
$$e_i = \frac{\Delta s_i}{s_i} + c_i - c_h$$
The covariance matrix of asset and currency returns is assumed to follow the block structure
$$\Sigma = \begin{pmatrix} \Sigma_{aa} & \Sigma_{ae} \\ \Sigma_{ea} & \Sigma_{ee} \end{pmatrix}$$
Currency in the Portfolio
Richard R. Lindsey345
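Currency hedging takes the form of regression hedging where we regress asset returns against all currency returns:
$$a_i = h_i + \beta_{i1} e_1 + \cdots + \beta_{ik} e_k + \varepsilon_i$$
Regression hedging can also be expressed in matrix terms as
$$\beta = \Sigma_{ee}^{-1}\,\Sigma_{ea}$$
Where β is
$$\beta = \begin{pmatrix} \beta_{11} & \beta_{12} & \cdots & \beta_{1k} \\ \beta_{21} & \beta_{22} & \cdots & \beta_{2k} \\ \vdots & \vdots & & \vdots \\ \beta_{k1} & \beta_{k2} & \cdots & \beta_{kk} \end{pmatrix}$$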
Currency in the Portfolio
Richard R. Lindsey346
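We can now define the variance in asset returns that remains unexplained by currency returns (this is the conditional variance of asset returns conditioned on currency returns)
$$\Sigma_{a|e} = \Sigma_{aa} - \beta'\,\Sigma_{ee}\,\beta$$
And write the inverse of the covariance matrix of asset and currency returns as
$$\Sigma^{-1} = \begin{pmatrix} \Sigma_{a|e}^{-1} & -\Sigma_{a|e}^{-1}\beta' \\ -\beta\,\Sigma_{a|e}^{-1} & \Sigma_{ee}^{-1} + \beta\,\Sigma_{a|e}^{-1}\beta' \end{pmatrix}$$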
Currency in the Portfolio
Richard R. Lindsey347
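Where we use the results for the inverse of a partitioned matrix
$$\begin{pmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{pmatrix}^{-1} = \begin{pmatrix} D^{-1} & -D^{-1}P_{12}P_{22}^{-1} \\ -P_{22}^{-1}P_{21}D^{-1} & P_{22}^{-1} + P_{22}^{-1}P_{21}D^{-1}P_{12}P_{22}^{-1} \end{pmatrix}$$
$$D = P_{11} - P_{12}P_{22}^{-1}P_{21}$$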
Currency in the Portfolio
Richard R. Lindsey348
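For example, checking the value of D
$$D = P_{11} - P_{12}P_{22}^{-1}P_{21} = \Sigma_{aa} - \Sigma_{ae}\Sigma_{ee}^{-1}\Sigma_{ea} = \Sigma_{aa} - \Sigma_{ae}\Sigma_{ee}^{-1}\Sigma_{ee}\Sigma_{ee}^{-1}\Sigma_{ea} = \Sigma_{aa} - (\Sigma_{ae}\Sigma_{ee}^{-1})\,\Sigma_{ee}\,(\Sigma_{ee}^{-1}\Sigma_{ea}) = \Sigma_{aa} - \beta'\Sigma_{ee}\beta = \Sigma_{a|e}$$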
Currency in the Portfolio
Richard R. Lindsey349
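Now, defining
$$w = \begin{pmatrix} w_a \\ w_e \end{pmatrix}, \qquad \mu = \begin{pmatrix} \mu_a \\ \mu_e \end{pmatrix}$$
And recalling the solution to the unconstrained optimization
$$w^* = \lambda\,\Sigma^{-1}\mu$$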
Currency in the Portfolio
Richard R. Lindsey350
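There are three solutions to our problem.
First is the simultaneous optimization or the joint full blown optimization (choosing the optimal asset and currency positions simultaneously):
$$w^*_{sim} = \begin{pmatrix} w^*_{a,sim} \\ w^*_{e,sim} \end{pmatrix} = \begin{pmatrix} \lambda\,\Sigma_{a|e}^{-1}\mu_a - \lambda\,\Sigma_{a|e}^{-1}\beta'\mu_e \\ \lambda\,\Sigma_{ee}^{-1}\mu_e - \beta\,w^*_{a,sim} \end{pmatrix}$$
This assumes that the manager has expertise over all assets and currencies.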
▲▲▲▲▲▲▲▲
Currency in the Portfolio
Richard R. Lindsey351
Note that the optimal hedge positions for currency depend
on the optimal asset positions, which are themselves
affected by the presence of currencies in the portfolio.
Also, the hedge positions have a speculative component
driven by non-zero expected returns in currencies as well
as a variance reduction component related to beta.
$$w^*_{e,sim} = \lambda\,\Omega_{ee}^{-1}\mu_e - \beta\, w^*_{a,sim}$$
Currency in the Portfolio
Richard R. Lindsey352
If currencies carry a positive risk premium (the currency
return is, on average, greater than the interest rate
differential), currencies will be included in the optimal
portfolio because the first term will be positive.
Instead, let’s focus on the case (often assumed in practice)
that currencies do not offer a significant risk premium. In
this case, the solution becomes
$$w^*_{a,sim} = \lambda\,\Omega_{a|e}^{-1}\mu_a, \qquad w^*_{e,sim} = -\beta\, w^*_{a,sim}$$
Currency in the Portfolio
Richard R. Lindsey353
Suppose now that local asset returns are also uncorrelated
with currency returns. In that case, taking on currency
risk does not help to reduce (or hedge) asset risk and
currency risk would always be an add-on to asset risk.
If local returns are not correlated with currency movements,
the covariance between currency returns and foreign
assets returns in home currency units contains solely the
covariance between currencies.
Currency in the Portfolio
Richard R. Lindsey354
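$$\text{Cov}\left(\frac{\Delta p_i}{p_i} + \frac{\Delta s_i}{s_i},\ \frac{\Delta s_j}{s_j}\right) = \text{Cov}\left(\frac{\Delta p_i}{p_i},\ \frac{\Delta s_j}{s_j}\right) + \text{Cov}\left(\frac{\Delta s_i}{s_i},\ \frac{\Delta s_j}{s_j}\right) = \text{Cov}\left(\frac{\Delta s_i}{s_i},\ \frac{\Delta s_j}{s_j}\right)$$
Which in matrix terms becomes
$$\Omega_{ea} = \Omega_{ee}$$
or
$$\beta = \Omega_{ee}^{-1}\Omega_{ea} = \Omega_{ee}^{-1}\Omega_{ee} = I$$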
Currency in the Portfolio
Richard R. Lindsey355
So the currency positions will completely hedge out the
currency risk that arises from the unhedged asset positions
(unitary hedging):
$$w^*_{a,sim} = \lambda\,\Omega_{a|e}^{-1}\mu_a, \qquad w^*_{e,sim} = -w^*_{a,sim}$$
Currency in the Portfolio
Richard R. Lindsey356
Now, suppose the opposite: that foreign asset returns (in
home country currency) and currency returns are not
correlated. Now we would have
$$\Omega_{ea} = 0 \quad \text{and} \quad \beta = \Omega_{ee}^{-1}\Omega_{ea} = 0$$
so our solution would be
$$w^*_{a,sim} = \lambda\,\Omega_{aa}^{-1}\mu_a, \qquad w^*_{e,sim} = 0$$
Since the covariance of asset returns conditioned on
currency returns would be
$$\Omega_{a|e} = \Omega_{aa} - \beta'\Omega_{ee}\beta = \Omega_{aa}$$
Currency in the Portfolio
Richard R. Lindsey357
To summarize:
1. If currencies carry a risk premium, there will always be a
speculative aspect to currency exposure.
2. If currencies do not have a risk premium, we need to look at
currency exposure in terms of its ability to reduce asset risk:
a. Zero correlation between local returns and currency returns means
currencies add risk without return or diversification benefits.
b. Negative correlation between local returns and currency returns
makes currencies a hedge asset that reduces total portfolio risk.
c. Positive correlation between local returns and currency returns
would increase total portfolio risk. In that case, over-hedging
(short position in currency is greater than the long position in the
asset) is optimal.
Currency in the Portfolio
Richard R. Lindsey358
Now consider the second approach, where we optimize asset
positions in a first step and in a second step choose
optimal currency positions conditional on the already
established asset positions. This is known as partial
optimization and the solution is
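$$w^*_{par} = \begin{bmatrix} w^*_{a,par} \\ w^*_{e,par} \end{bmatrix} = \begin{bmatrix} \lambda\,\Omega_{aa}^{-1}\mu_a \\ \lambda\,\Omega_{ee}^{-1}\mu_e - \beta\, w^*_{a,par} \end{bmatrix}$$
Terms representing the conditional covariance drop out and
there is no feedback of currency positions on asset
positions. Total risk is controlled but currencies are managed
independently.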
Currency in the Portfolio
Richard R. Lindsey359
The final option for constructing portfolios with currencies
is simply separate optimization (also known as currency
overlay)
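$$w^*_{sep} = \begin{bmatrix} w^*_{a,sep} \\ w^*_{e,sep} \end{bmatrix} = \begin{bmatrix} \lambda\,\Omega_{aa}^{-1}\mu_a \\ \lambda\,\Omega_{ee}^{-1}\mu_e \end{bmatrix}$$
In this case currencies are completely independent and should be measured
against their own benchmark.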
Currency in the Portfolio
Richard R. Lindsey360
I hope, by now, that it is obvious to you that these different
techniques are in decreasing order of efficiency (in other
words, decreasing utility).
Moreover, it should also be obvious that currencies are just a
proxy for any investible asset that you want as part of your
portfolio (hedge funds; foreign equity; private equity; real
estate; etc.). These three techniques can always be used
(and commonly are), but they are always in decreasing
efficiency.
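The ranking of the three techniques can be checked numerically. The sketch below is a minimal illustration for one asset and one currency, so every covariance block is a scalar. It assumes a mean-variance utility of the form w'μ - w'Ωw/(2λ) with risk tolerance λ; the input numbers and the function names `utility` and `solve_all` are hypothetical.

```python
def utility(w_a, w_e, mu_a, mu_e, s_aa, s_ae, s_ee, lam):
    """Assumed mean-variance utility w'mu - w'Omega w / (2 lam)."""
    variance = w_a * w_a * s_aa + 2.0 * w_a * w_e * s_ae + w_e * w_e * s_ee
    return w_a * mu_a + w_e * mu_e - variance / (2.0 * lam)

def solve_all(mu_a, mu_e, s_aa, s_ae, s_ee, lam):
    beta = s_ae / s_ee                   # Omega_ee^{-1} Omega_ea in the scalar case
    s_cond = s_aa - beta * s_ee * beta   # Omega_a|e
    # Simultaneous: asset and currency positions chosen jointly.
    a_sim = lam * (mu_a - beta * mu_e) / s_cond
    e_sim = lam * mu_e / s_ee - beta * a_sim
    # Partial: assets first, then currencies conditional on the asset position.
    a_par = lam * mu_a / s_aa
    e_par = lam * mu_e / s_ee - beta * a_par
    # Separate (overlay): each block optimized on its own.
    a_sep = lam * mu_a / s_aa
    e_sep = lam * mu_e / s_ee
    return {"sim": (a_sim, e_sim), "par": (a_par, e_par), "sep": (a_sep, e_sep)}

# Hypothetical inputs: expected excess returns, covariances, risk tolerance.
params = dict(mu_a=0.05, mu_e=0.01, s_aa=0.04, s_ae=0.006, s_ee=0.01, lam=1.0)
solutions = solve_all(**params)
utilities = {name: utility(w_a, w_e, **params) for name, (w_a, w_e) in solutions.items()}

for name in ("sim", "par", "sep"):
    print(name, solutions[name], round(utilities[name], 6))
```

With these inputs the utilities fall from simultaneous to partial to separate optimization, matching the decreasing order of efficiency described above.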
Bibliography
Richard R. Lindsey361
Blundell and Ward, "Property Portfolio Allocation: A Multifactor
Model", Land Development Studies, 1987.
Campbell, Lo, and MacKinlay, The Econometrics of
Financial Markets, Princeton University Press, 1997.
Chan and Hussey, "Marginal Contribution to the Sharpe Ratio",
Northwater Capital Management Inc., January 2009.
Chow, Jacquier, Kritzman, and Lowry, "Optimal Portfolios in
Good Times and Bad", Financial Analysts Journal, 1999.
Jorion, "Mean Variance Analysis of Currency Overlays",
Financial Analysts Journal, 1994.
Scholes and Williams, "Estimating Beta from Nonsynchronous
Data", Journal of Financial Economics, 1977.
Stevens, "On the Inverse of the Covariance Matrix in Portfolio
Analysis", Journal of Finance, 1998.
Risk Revisited
Richard R. Lindsey386
So far we have often relied on an assumption (or
presumption) of normal returns. But we know that asset
returns are not normal and, therefore, the mean and
variance do not fully describe the characteristics of the
joint asset return distribution. Specifically, the risk and
the undesirable outcomes associated with the portfolio
cannot be adequately captured by the variance.
Let’s spend a bit of time looking at alternative portfolio risk
measures that are sometimes used in practice.
Risk Revisited
Richard R. Lindsey387
Generally speaking, there are two different types of risk
measures:
1. Dispersion Measures: consider both positive and
negative deviations from the mean, and treat those
deviations as equally risky.
2. Downside Measures: maximize the probability that the
portfolio return is above a certain minimal acceptable
level known as the benchmark or disaster level.
Dispersion: Standard Deviation
Richard R. Lindsey388
Of course, the best known and most used dispersion
measure is (for historical reasons) the foundation of
modern portfolio theory – standard deviation
$$\sigma_p = (w'\Omega w)^{1/2} = \left(\sum_i w_i^2\sigma_{ii} + \sum_i\sum_{j\neq i} w_i w_j \sigma_{ij}\right)^{1/2}$$
Dispersion: Mean-Absolute Deviation
Richard R. Lindsey389
The mean-absolute deviation or MAD approach doesn’t use squared deviations, but absolute deviations
$$\text{MAD}(r_p) = E\left|\sum_i w_i r_i - \sum_i w_i \mu_i\right|$$
Where
$$r_p = \sum_i w_i r_i$$
And $r_i$ is the return on the asset and $\mu_i$ is the expected return on the asset.
Dispersion: Mean-Absolute Deviation
Richard R. Lindsey390
The computation of optimal portfolios under MAD is
straightforward since the optimization problem is linear
and can be solved with standard linear programming
routines.
Note that it can be shown that if individual asset returns are
multivariate normal
$$\text{MAD}(r_p) = \sqrt{\frac{2}{\pi}}\,\sigma_p$$
Dispersion: Mean-Absolute Moment
Richard R. Lindsey391
The mean-absolute moment ($\text{MAM}_q$) of order q is defined by
$$\text{MAM}_q(r_p) = \left(E\left|\sum_i w_i r_i - \sum_i w_i \mu_i\right|^q\right)^{1/q}, \quad q \ge 1$$
Or
$$\text{MAM}_q(r_p) = \left(E\left|r_p - E(r_p)\right|^q\right)^{1/q}, \quad q \ge 1$$
Which is a straightforward generalization of the mean-standard deviation (q=2) and the mean-absolute deviation (q=1) approaches.
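The dispersion measures above can be computed directly from a set of equally likely return scenarios. The following sketch is illustrative only: the scenario returns, the weights, and the function names are assumptions, and expectations are replaced by simple scenario averages.

```python
import statistics

def portfolio_returns(weights, scenarios):
    """r_p = sum_i w_i r_i for each scenario (one row of asset returns per scenario)."""
    return [sum(w * r for w, r in zip(weights, row)) for row in scenarios]

def mam(returns, q):
    """Mean-absolute moment of order q, with E[.] taken as the scenario average."""
    m = sum(returns) / len(returns)
    return (sum(abs(r - m) ** q for r in returns) / len(returns)) ** (1.0 / q)

def mad(returns):
    """Mean-absolute deviation, the q = 1 special case."""
    return mam(returns, 1)

# Hypothetical scenarios for two assets and a 60/40 portfolio.
scenarios = [(0.04, 0.01), (-0.03, 0.02), (0.02, -0.01), (-0.06, 0.00), (0.05, 0.01)]
weights = (0.6, 0.4)

rp = portfolio_returns(weights, scenarios)
std = statistics.pstdev(rp)   # population standard deviation, the q = 2 case
print(f"MAD = {mad(rp):.5f}, MAM_2 = {mam(rp, 2):.5f}, std = {std:.5f}")
```

Setting q = 2 reproduces the (population) standard deviation, and q = 1 reproduces MAD, as the generalization claims.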
Downside Measures
Richard R. Lindsey392
Now let’s turn to downside measures, where the objective is to have a portfolio return above a certain minimum – a safety first approach.
While these types of measures may have significant intuitive and theoretical appeal, they are often computationally more complicated to use in a portfolio context.
Downside risk measures of individual assets cannot be easily integrated into portfolio downside risk measures since their computation requires knowledge of the entire joint distribution of asset returns.
You usually have to resort to computationally intense nonparametric estimation, simulation, and optimization techniques.
Moreover, the estimation error for downside measures is usually higher than that for mean-variance approaches since we only use a portion of the original data – often just the tail of the empirical distribution.
Downside: Roy’s Safety First
Richard R. Lindsey393
Roy’s paper on safety first (the foundation of downside risk
measures) was published in 1952, the same year as Markowitz’s
paper (the foundation of Modern Portfolio Theory).
Under MPT, the investor makes a trade off between risk and
return where the final portfolio allocation depends on the
investor’s utility function. As you know, it can be hard, or
even impossible, to determine the investor’s actual utility
function.
Downside: Roy’s Safety First
Richard R. Lindsey394
Roy argued that an investor, rather than thinking in terms of
utility, first wants to make sure that a certain amount of
the principal is preserved. Thereafter, the investor decides
on a minimal acceptable return that achieves this principal
preservation.
In essence, the investor solves
$$\min_w \Pr(r_p \le r_0) \quad \text{subject to} \quad w'\iota = 1$$
Where Pr is the probability function, $r_p$ is the portfolio
return, and ι is a vector of ones.
Downside: Roy’s Safety First
Richard R. Lindsey395
Of course, it would be unlikely that the investor would know
the true probability function, but if we recall that
Tchebycheff’s inequality (for a random variable x, mean μ
and variance σ²) states that for any positive real number c
$$\Pr\left(|x - \mu| \ge c\right) \le \frac{\sigma^2}{c^2}$$
Then we can write
$$\Pr(r_p \le r_0) = \Pr(\mu_p - r_p \ge \mu_p - r_0) \le \frac{\sigma_p^2}{(\mu_p - r_0)^2}$$
Downside: Roy’s Safety First
Richard R. Lindsey396
Therefore, not knowing the probability function, the investor
solves the approximation
$$\min_w \frac{\sigma_p}{\mu_p - r_0} \quad \text{subject to} \quad w'\iota = 1$$
Note that if r0 is equal to the risk-free rate, then this optimization problem is
equivalent to maximizing a portfolio’s Sharpe ratio.
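To make the link between Roy's approximation and the Sharpe ratio concrete, the sketch below searches a grid of two-asset portfolios. The expected returns, volatilities, correlation, and threshold are hypothetical, and a grid search is used only for transparency; it is not how such a problem would normally be solved.

```python
import math

MU = (0.10, 0.05)      # hypothetical expected returns
SIGMA = (0.20, 0.08)   # hypothetical standard deviations
RHO = 0.2              # hypothetical correlation
R0 = 0.02              # minimal acceptable return r_0

def moments(w1):
    """Mean and standard deviation of the portfolio (w1, 1 - w1)."""
    w2 = 1.0 - w1
    mu_p = w1 * MU[0] + w2 * MU[1]
    var_p = (w1 * SIGMA[0]) ** 2 + (w2 * SIGMA[1]) ** 2 \
        + 2.0 * w1 * w2 * RHO * SIGMA[0] * SIGMA[1]
    return mu_p, math.sqrt(var_p)

def roy_objective(w1):
    """sigma_p / (mu_p - r_0), the quantity Roy's approximation minimizes."""
    mu_p, sigma_p = moments(w1)
    return sigma_p / (mu_p - R0)

def reward_to_risk(w1):
    """(mu_p - r_0) / sigma_p, the Sharpe ratio when r_0 is the risk-free rate."""
    mu_p, sigma_p = moments(w1)
    return (mu_p - R0) / sigma_p

grid = [i / 1000 for i in range(1001)]   # weights on asset 1 (budget constraint built in)
w_roy = min(grid, key=roy_objective)
w_sharpe = max(grid, key=reward_to_risk)
chebyshev_bound = roy_objective(w_roy) ** 2   # upper bound on Pr(r_p <= r_0)
print(w_roy, w_sharpe, round(chebyshev_bound, 4))
```

Both searches land on the same weight, because minimizing σ_p/(μ_p - r0) and maximizing its reciprocal are the same problem whenever μ_p exceeds r0.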
Downside: Semi-variance
Richard R. Lindsey397
Even in his 1959 book, Markowitz proposed the use of
semi-variance to correct for the fact that variance
penalizes over-performance and under-performance
equally.
Portfolio semi-variance is
$$\sigma^2_{p,\min} = E\left[\left(\min\left(\sum_i w_i r_i - \sum_i w_i \mu_i,\ 0\right)\right)^2\right]$$
Downside: Lower Partial Moment
Richard R. Lindsey398
The lower partial moment risk measure is a generalization of
semi-variance. The lower partial moment with power
index q and a target rate of return r0 is given by
$$\sigma_{r_p,q,r_0} = \left(E\left[\min(r_p - r_0,\ 0)^q\right]\right)^{1/q}$$
If we set q=2 and r0 equal to the expected return, we get the
square root of the semi-variance (the semi-deviation).
Note, it can be shown q=1 represents a risk neutral investor, 0<q<1 a risk
seeking investor and q>1 a risk-averse investor.
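A short numerical check of the relationship between the lower partial moment and the semi-variance, using equally likely scenarios in place of the expectation. The return sample and function names are assumptions; the shortfall is taken in absolute value so that the power is well defined for any q > 0, which is a choice made for this sketch.

```python
def semi_variance(returns):
    """Average squared shortfall below the mean."""
    m = sum(returns) / len(returns)
    return sum(min(r - m, 0.0) ** 2 for r in returns) / len(returns)

def lower_partial_moment(returns, q, r0):
    """(E[|min(r_p - r0, 0)|^q])^(1/q) with E[.] as a scenario average."""
    shortfalls = [abs(min(r - r0, 0.0)) for r in returns]
    return (sum(s ** q for s in shortfalls) / len(returns)) ** (1.0 / q)

# Hypothetical portfolio returns.
rp = [0.04, -0.02, 0.01, -0.05, 0.03, 0.00]
mean_rp = sum(rp) / len(rp)

lpm2 = lower_partial_moment(rp, q=2, r0=mean_rp)
print(f"semi-variance = {semi_variance(rp):.6f}, LPM(q=2)^2 = {lpm2 ** 2:.6f}")
print(f"LPM(q=1, r0=0) = {lower_partial_moment(rp, q=1, r0=0.0):.6f}")
```

Squaring the q = 2 lower partial moment with the target set to the mean recovers the semi-variance.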
Downside: Value at Risk
Richard R. Lindsey399
The best known downside risk measure is probably value at
risk (VaR), originally developed by JP Morgan. VaR is
related to the percentiles of loss distributions, and
measures the predicted maximum loss at a specified
probability level (for example 95%).
VaR can be defined as
$$\text{VaR}_{1-\epsilon}(r_p) = \min\{r \mid \Pr(-r_p \ge r) \le \epsilon\}$$
Typical values of (1-ε) are 90%, 95%, and 99%.
Downside: Value at Risk
Richard R. Lindsey400
Note that there are several equivalent ways to define VaR. The definition
$$\text{VaR}_{1-\epsilon}(r_p) = \min\{r \mid \Pr(-r_p \ge r) \le \epsilon\}$$
emphasizes that r is the value such that the probability of a loss greater than r is less than ε.
An alternative (and equivalent) way to define VaR
$$\text{VaR}_{1-\epsilon}(r_p) = \min\{r \mid \Pr(-r_p \le r) \ge (1-\epsilon)\}$$
emphasizes that r is the value such that the probability that the maximum loss is at most r is (1-ε).
Downside: Value at Risk
Richard R. Lindsey401
There are many well known problems with VaR:
1. The common assumption of lognormal returns is problematic
when you have long and short positions.
2. It is not sub-additive (in other words, the VaR of two
combined portfolios may exceed the sum of the VaRs
of each), which means that diversification is not always
rewarded.
3. When calculated from generated scenarios, VaR is a
non-smooth and non-convex function with multiple stationary
points, which makes a global optimum difficult to find.
4. It does not take into account the magnitude of losses beyond
the VaR value.
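The failure of sub-additivity can be seen in a small textbook-style example. The sketch below uses two hypothetical, independent bonds that each lose 100 with probability 4%; the numbers and the helper `var_loss` are assumptions chosen so the effect is visible at the 95% level.

```python
def var_loss(losses, probs, eps, tol=1e-12):
    """VaR_{1-eps}: smallest loss level g with Pr(loss <= g) >= 1 - eps."""
    cumulative = 0.0
    for loss, p in sorted(zip(losses, probs)):
        cumulative += p
        if cumulative >= 1.0 - eps - tol:   # tol guards against rounding in the sum
            return loss
    return max(losses)

p = 0.04      # hypothetical default probability of each bond
eps = 0.05    # 95% confidence level

# One bond: loss 0 with probability 96%, loss 100 with probability 4%.
single = var_loss([0.0, 100.0], [1.0 - p, p], eps)

# Two independent bonds held together: 0, 1, or 2 defaults.
combined = var_loss([0.0, 100.0, 200.0],
                    [(1.0 - p) ** 2, 2.0 * p * (1.0 - p), p ** 2], eps)

print(f"VaR of each bond: {single}, VaR of the pair: {combined}")
```

Each bond alone has a 95% VaR of zero, yet the two-bond portfolio has a 95% VaR of 100, because the chance of at least one default exceeds 5%.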
Downside: Conditional Value at Risk
Richard R. Lindsey402
The problems with value at risk led to the development of
desirable properties for a risk measure. Risk measures
which satisfy these properties are known as coherent risk
measures.
A risk measure ρ is called a coherent measure of risk if it
satisfies:
1. Monotonicity: if X ≥ 0, then ρ(X) ≤ 0.
2. Subadditivity: ρ(X+Y) ≤ ρ(X)+ ρ(Y).
3. Positive Homogeneity: for any positive real number c,
ρ(cX) = cρ(X).
4. Translational invariance: for any real number c,
ρ(X+c) = ρ(X)-c.
Downside: Conditional Value at Risk
Richard R. Lindsey403
These properties can be interpreted:
1. If there are only positive returns, then the risk should be non-
positive.
2. The risk of a portfolio of two assets should be less than or
equal to the sum of the risks of the individual assets.
3. If the portfolio is increased c times, the risk becomes c times
larger.
4. Cash or another risk-free asset does not contribute to
portfolio risk.
Note that standard deviation is not a coherent measure since it violates the
monotonicity property. Semi-deviation type measures violate the
subadditivity condition. The four properties together are quite restrictive.
Downside: Conditional Value at Risk
Richard R. Lindsey404
Conditional value at risk is a coherent risk measure defined
as:
$$\text{CVaR}_{1-\epsilon}(r_p) = E\left(-r_p \mid -r_p \ge \text{VaR}_{1-\epsilon}(r_p)\right)$$
CVaR measures the expected amount of losses in the tail of
the distribution of possible portfolio losses (beyond the
portfolio VaR).
This is also known as expected shortfall, expected tail loss,
or tail VaR.
Downside: Conditional Value at Risk
Richard R. Lindsey405
Let’s consider some of the mathematical properties of
CVaR.
Let w be the vector denoting the number of shares of each
asset and y be a random vector describing the uncertain
outcomes of the economy (or the market variables). The
function f(w,y) (the loss function) represents the loss
associated with the portfolio vector w (Note that for each
w, the loss function is a one-dimensional random
variable). Finally, p(y) is the probability associated with
scenario y.
Downside: Conditional Value at Risk
Richard R. Lindsey406
Now, assuming all random variables are discrete, the
probability that the loss function does not exceed a certain
value γ is given by the cumulative probability
$$\Psi(w,\gamma) = \sum_{\{y \mid f(w,y) \le \gamma\}} p(y)$$
Using this cumulative probability, we can write
$$\text{VaR}_{1-\epsilon}(w) = \min\{\gamma \mid \Psi(w,\gamma) \ge (1-\epsilon)\}$$
Downside: Conditional Value at Risk
Richard R. Lindsey407
Since CVaR of the losses of portfolio w is the expected
value of the losses conditioned on the losses being in
excess of VaR, we have
$$\text{CVaR}_{1-\epsilon}(w) = E\left(f(w,y) \mid f(w,y) > \text{VaR}_{1-\epsilon}(w)\right) = \frac{\sum_{\{y \mid f(w,y) > \text{VaR}_{1-\epsilon}(w)\}} p(y)\, f(w,y)}{\sum_{\{y \mid f(w,y) > \text{VaR}_{1-\epsilon}(w)\}} p(y)}$$
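The discrete formulas translate almost line by line into code. The sketch below is a minimal illustration with ten equally likely, hypothetical loss scenarios for a fixed portfolio; the function names are assumptions, and a small tolerance is added because summed floating-point probabilities may fall just short of 1 - ε.

```python
def cumulative_prob(losses, probs, gamma):
    """Psi(w, gamma): probability that the loss does not exceed gamma."""
    return sum(p for f, p in zip(losses, probs) if f <= gamma)

def var_discrete(losses, probs, eps, tol=1e-12):
    """Smallest gamma among the scenario losses with Psi(w, gamma) >= 1 - eps."""
    return min(g for g in losses
               if cumulative_prob(losses, probs, g) >= 1.0 - eps - tol)

def cvar_discrete(losses, probs, eps):
    """Expected loss conditional on the loss being in excess of VaR."""
    v = var_discrete(losses, probs, eps)
    tail = [(f, p) for f, p in zip(losses, probs) if f > v]
    if not tail:              # no scenario strictly beyond VaR
        return v
    return sum(p * f for f, p in tail) / sum(p for _, p in tail)

# Hypothetical losses f(w, y) for one fixed portfolio w, each with probability 0.1.
losses = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 8.0, 10.0]
probs = [0.1] * len(losses)

var_80 = var_discrete(losses, probs, eps=0.2)
cvar_80 = cvar_discrete(losses, probs, eps=0.2)
print(f"VaR(80%) = {var_80}, CVaR(80%) = {cvar_80:.2f}")
```

Here the 80% VaR is the eighth-smallest loss, and CVaR averages the two losses beyond it, so CVaR is the larger of the two numbers.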
Downside: Conditional Value at Risk
Richard R. Lindsey409
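The continuous equivalents of these formulas are
$$\Psi(w,\gamma) = \int_{f(w,y) \le \gamma} p(y)\,dy$$
$$\text{VaR}_{1-\epsilon}(w) = \min\{\gamma \mid \Psi(w,\gamma) \ge (1-\epsilon)\}$$
$$\text{CVaR}_{1-\epsilon}(w) = E\left(f(w,y) \mid f(w,y) \ge \text{VaR}_{1-\epsilon}(w)\right) = \frac{1}{\epsilon}\int_{f(w,y) \ge \text{VaR}_{1-\epsilon}(w)} f(w,y)\, p(y)\,dy$$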
Downside: Conditional Value at Risk
Richard R. Lindsey410
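Moreover, we see that
$$\begin{aligned} \text{CVaR}_{1-\epsilon}(w) &= \frac{1}{\epsilon}\int_{f(w,y) \ge \text{VaR}_{1-\epsilon}(w)} f(w,y)\, p(y)\,dy \\ &\ge \frac{1}{\epsilon}\int_{f(w,y) \ge \text{VaR}_{1-\epsilon}(w)} \text{VaR}_{1-\epsilon}(w)\, p(y)\,dy \\ &= \text{VaR}_{1-\epsilon}(w) \end{aligned}$$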
Downside: Conditional Value at Risk
Richard R. Lindsey411
Since
$$\frac{1}{\epsilon}\int_{f(w,y) \ge \text{VaR}_{1-\epsilon}(w)} p(y)\,dy = 1$$
In other words, CVaR is always at least as large as VaR, but
it is a coherent risk measure (and VaR is not). Further,
CVaR is a convex function, so any local minimum is also a
global minimum.
Note, however, we have a problem in that you need to have
an analytical expression for VaR; this problem was
solved by Rockafellar and Uryasev (2000).
Downside: Conditional Value at Risk
Richard R. Lindsey412
Their idea is that instead of CVaR we can use the function
$$F(w,\gamma) = \gamma + \frac{1}{\epsilon}\int_{f(w,y) \ge \gamma} \left(f(w,y) - \gamma\right) p(y)\,dy$$
Rockafellar and Uryasev prove the following
1. $F(w,\gamma)$ is a convex and continuously differentiable
function in $\gamma$.
2. $\text{VaR}_{1-\epsilon}(w)$ is a minimizer of $F(w,\gamma)$.
3. The minimum value of $F(w,\gamma)$ is $\text{CVaR}_{1-\epsilon}(w)$.
Downside: Conditional Value at Risk
Richard R. Lindsey413
So we can find the optimal value of $\text{CVaR}_{1-\epsilon}(w)$ by
solving the optimization problem
$$\min_{w,\gamma} F(w,\gamma)$$
If we denote $(w^*,\gamma^*)$ as the solution to this optimization
problem, then $F(w^*,\gamma^*)$ is the optimal CVaR.
The optimal portfolio is given by $w^*$ and the corresponding
VaR is given by $\gamma^*$.
In other words, we can compute the optimal CVaR without first calculating
VaR.
Downside: Conditional Value at Risk
Richard R. Lindsey414
In practice, the probability density function p(y) is not
known or is difficult to estimate. Instead, we might have T
different scenarios Y={y1,…,yT} that are sampled from the
probability distribution or that have been obtained from
computer simulations. Evaluating the auxiliary function
$F(w,\gamma)$ using the scenarios Y, we obtain
$$F^Y(w,\gamma) = \gamma + \frac{1}{\epsilon T}\sum_{i=1}^{T} \max\left(f(w,y_i) - \gamma,\ 0\right)$$
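The scenario version of the auxiliary function is easy to evaluate for a fixed portfolio. In the sketch below the ten loss scenarios are hypothetical and `aux_F` is an assumed name; because the function is piecewise linear and convex in γ, its minimum is attained at one of the scenario losses, so a search over those values is enough for illustration.

```python
def aux_F(losses, gamma, eps):
    """F^Y(w, gamma) = gamma + (1 / (eps * T)) * sum_i max(f(w, y_i) - gamma, 0)."""
    T = len(losses)
    return gamma + sum(max(f - gamma, 0.0) for f in losses) / (eps * T)

# Hypothetical scenario losses f(w, y_i) for one fixed portfolio w.
losses = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
eps = 0.2

# Search gamma over the scenario losses only (the kinks of the piecewise-linear function).
best_gamma = min(losses, key=lambda g: aux_F(losses, g, eps))
cvar = aux_F(losses, best_gamma, eps)

# Compare with the average of the worst eps * T scenarios.
worst = sorted(losses)[-int(round(eps * len(losses))):]
tail_mean = sum(worst) / len(worst)
print(f"minimizing gamma = {best_gamma}, min F = {cvar}, tail mean = {tail_mean}")
```

The minimizing γ plays the role of VaR and the minimum value equals the average of the worst 20% of the losses, without VaR having been computed beforehand.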
Downside: Conditional Value at Risk
Richard R. Lindsey415
Therefore the optimization problem
$$\min_w \text{CVaR}_{1-\epsilon}(w)$$
Takes the form
$$\min_{w,\gamma}\ \gamma + \frac{1}{\epsilon T}\sum_{i=1}^{T} \max\left(f(w,y_i) - \gamma,\ 0\right)$$
Downside: Conditional Value at Risk
Richard R. Lindsey416
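Which can also be written
$$\min_{w,z,\gamma}\ \gamma + \frac{1}{\epsilon T}\sum_{i=1}^{T} z_i$$
Subject to
$$z_i \ge 0, \quad i = 1,\dots,T$$
$$z_i \ge f(w,y_i) - \gamma, \quad i = 1,\dots,T$$
Along with any other constraints (like short sales). Where $z_i$
is an auxiliary variable for $\max\left(f(w,y_i) - \gamma,\ 0\right)$.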
Downside: Conditional Value at Risk
Richard R. Lindsey417
Under the assumption that f(w,y) is linear in w, the above
optimization is linear and can be solved using standard
linear programming techniques.
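For a feel of what the optimization does, the sketch below minimizes scenario CVaR over long-only two-asset portfolios. It is deliberately not a linear program: it is a brute-force grid search over the weight, with the inner minimization over γ done by checking the scenario losses. The simulated scenarios, the seed, and all names are assumptions made for illustration.

```python
import random

EPS = 0.10   # tail probability; 1 - EPS is the confidence level

def portfolio_losses(w1, scenarios):
    """Loss f(w, y_i): minus the portfolio return in scenario i, linear in w."""
    return [-(w1 * r1 + (1.0 - w1) * r2) for r1, r2 in scenarios]

def scenario_cvar(losses, eps):
    """min over gamma of F^Y(w, gamma); the minimum occurs at a scenario loss."""
    T = len(losses)
    return min(g + sum(max(f - g, 0.0) for f in losses) / (eps * T) for g in losses)

# Hypothetical return scenarios for a volatile asset and a calmer one.
rng = random.Random(1)
scenarios = [(rng.gauss(0.010, 0.060), rng.gauss(0.004, 0.015)) for _ in range(100)]

grid = [i / 50 for i in range(51)]   # long-only weights on asset 1
cvar_by_weight = {w1: scenario_cvar(portfolio_losses(w1, scenarios), EPS) for w1 in grid}
best_w1 = min(cvar_by_weight, key=cvar_by_weight.get)
print(f"weight on asset 1: {best_w1}, CVaR: {cvar_by_weight[best_w1]:.4f}")
```

With more than a couple of assets the grid becomes impractical, which is why the linear programming formulation with the auxiliary variables z_i matters in practice.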
Downside: Conditional Value at Risk
Richard R. Lindsey418
This representation of CVaR can also be used to construct
other portfolio optimization problems. For example, the
mean-CVaR optimization problem
$$\max_w\ \mu'w$$
Subject to
$$\text{CVaR}_{1-\epsilon}(w) \le c_0$$
Along with other constraints on w written as
$$w \in C_w$$
Downside: Conditional Value at Risk
Richard R. Lindsey419
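Results in the following
$$\max_{w,z,\gamma}\ \mu'w$$
Subject to
$$\gamma + \frac{1}{\epsilon T}\sum_{i=1}^{T} z_i \le c_0$$
$$z_i \ge 0, \quad i = 1,\dots,T$$
$$z_i \ge f(w,y_i) - \gamma, \quad i = 1,\dots,T$$
$$w \in C_w$$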
Downside: Conditional Value at Risk
Richard R. Lindsey420
Palmquist, Uryasev, and Krokhmal provide us with an
example of the mean-CVaR approach.
They considered two-week returns for all of the stocks in the
S&P 100 from July 1, 1997 to July 8, 1999 for scenario
generation. Optimal portfolios were constructed solving
the mean-CVaR optimization approach for a two-week
horizon at different levels of confidence.
Downside: Conditional Value at Risk
Richard R. Lindsey421
Note risk is the percent of the portfolio allowed to be put at risk.
Downside: Conditional Value at Risk
Richard R. Lindsey422
It can be shown that for a normally distributed loss function,
the mean-variance and mean-CVaR frameworks generate
the same efficient frontier. However, when distributions
are non-normal, these two approaches can be significantly
different.
M-V optimization relies on deviations on both sides of the
mean, while M-CVaR relies only on the part of the
distribution which contributes to high losses.
Downside: Conditional Value at Risk
Richard R. Lindsey423
Bibliography
Richard R. Lindsey424
Artzner, Delbaen, Eber, and Heath, "Coherent Measures of Risk", Mathematical Finance, 1999.
Grootveld and Hallerbach, "Variance vs Downside Risk: Is There Really That Much Difference?", European Journal of Operational Research, 1999.
Krokhmal, Palmquist, and Uryasev, "Portfolio Optimization with Conditional Value-At-Risk Objective and Constraints", Journal of Risk, 2002.
Markowitz, "Portfolio Selection", Journal of Finance, 1952.
Rockafellar and Uryasev, "Optimization of Conditional Value-At-Risk", Journal of Risk, 2000.
Roy, "Safety-First and the Holding of Assets", Econometrica, 1952.
Uryasev, "Conditional Value-At-Risk: Optimization Algorithms and Applications", Financial Engineering News, 2000.
Asset Allocation
Allocation between asset classes accounts for the major
portion of risk and return in a portfolio
Selection of specific instruments is a decision with smaller
influence on portfolio performance
Asset Allocation should consider all financial aspects
Current and future wealth, income, and financial needs
Financial goals
Taxes and tax advantaged investments
Liquidity (for unexpected needs)
Investors (all types) need customized strategies
426 Richard R. Lindsey
Typical Financial Advice for Individuals
Richard R. Lindsey427
Questionnaires to assess investor’s risk aversion
E*Trade, Charles Schwab, Fidelity, Financial Engines, etc.
Risk aversion of the investor typically assumed to be CRRA
Choose from standardized portfolios
Conservative (20% stocks)
Dynamic (40% stocks)
Aggressive (60% stocks)
Is this customized?
Typical Financial Advice for Individuals
Richard R. Lindsey428
Recently, so called life-cycle funds have been popular
Fidelity Freedom 2020
Asset allocation is purely time-dependent
Rule of thumb percent stock = 100 – age
But these strategies do not depend on wealth, expected
performance, cash flow, etc.
Dynamic Asset Allocation
Richard R. Lindsey429
In real life investors change their asset allocation as time
goes by and new information is available
In theory investors value wealth at the end of the planning
horizon (and along the way) using a specific utility
function and maximize expected utility
Fixed-mix strategies are optimal only under certain
conditions
In general, the optimal investment strategy is dynamic and
reflects real-life behavior
Dynamic Asset Allocation
Richard R. Lindsey430
After a stock market correction (with significant losses in
the stock portion of the portfolio) an investor would:
Dynamic Asset Allocation
Richard R. Lindsey431
After a stock market correction (with significant losses in
the stock portion of the portfolio) an investor would:
Rebalance back to the original allocation (constant RRA)
Dynamic Asset Allocation
Richard R. Lindsey432
After a stock market correction (with significant losses in
the stock portion of the portfolio) an investor would:
Rebalance back to the original allocation (constant RRA)
Buy more stocks and assume a larger stock allocation than in the
original portfolio (increasing RRA)
Dynamic Asset Allocation
Richard R. Lindsey433
After a stock market correction (with significant losses in
the stock portion of the portfolio) an investor would:
Rebalance back to the original allocation (constant RRA)
Buy more stocks and assume a larger stock allocation than in the
original portfolio (increasing RRA)
Do nothing and keep the new stock allocation or sell stocks to assume
a smaller stock allocation than in the original portfolio (decreasing
RRA)
Dynamic Asset Allocation
Richard R. Lindsey434
Samuelson (1969)
Optimal program for investment/consumption in each period
Backward dynamic programming (maximize discounted expected
utility over lifetime)
No bequest
One risky asset (iid) and one riskless
Power utility
Optimal to invest the same proportion of wealth in stocks
in every period, independent of wealth
Merton (1969) extended this to multiple risky assets and a
variety of bequest situations
Dynamic Asset Allocation
Richard R. Lindsey435
Conflict between theoreticians and practitioners
Samuelson’s and Merton’s result is that under their
assumptions about the market and under constant relative
risk aversion, the consumption and investment decisions
are independent of each other; the optimal investment
decision is invariant with respect to the investment
horizon and with respect to wealth.
Dynamic Asset Allocation
Richard R. Lindsey436
This is the same as an investment problem where you
maximize the utility of final wealth at the end of the
investment horizon, by allocating and reallocating at each
period along the way.
The result follows directly from the utility function used.
Myopic investment strategy.
Dynamic Asset Allocation
Richard R. Lindsey437
Mossin (1968) attempted to isolate the class of utility
functions of terminal wealth which result in myopic utility
for intermediate periods.
Log utility for general asset distributions
Power utility for serially independent asset distributions
If there is a riskless asset – all HARA (linear risk tolerance) utility
functions
Dynamic Asset Allocation
Richard R. Lindsey438
Hakansson (1971) showed that for HARA utility there is no myopic
strategy except in the complete absence of restrictions on borrowing
and short sales. Examples of such restrictions:
A percent margin requirement
An absolute limit on borrowing
Lending that must be repaid
Therefore, under those restrictions, only power and log
utility functions can lead to myopic policies; furthermore
if there is serial correlation only log utility produces
myopic policies
Dynamic Asset Allocation
Richard R. Lindsey439
More recently, numerical dynamic portfolio optimization
methods have been developed
Two methods
Stochastic programming
Stochastic dynamic programming (stochastic control)
Stochastic Programming
Richard R. Lindsey440
Efficiently solves the most general models
Transaction costs
Return distributions with serial dependence
Lends itself well to the more general asset liability model (ALM)
Traditionally uses scenario trees to represent possible
future events
Need to keep the tree thin for computational tractability
In later stages a very small number of scenarios are used to represent
the distribution (very thin sub-trees)
Emphasis is on obtaining a good first-stage solution rather than an
entire accurate policy
Stochastic Dynamic Programming
Richard R. Lindsey441
Used when focus is on obtaining optimal policies and transaction costs are not a primary issue.
Based on Bellman’s dynamic programming principle.
An optimal policy has the property that, whatever the initial action, the remaining choices constitute an optimal policy with respect to the subproblem starting at the state that results from the initial action.
Closed form solutions exist for HARA utility functions.
For general monotone increasing and concave utility functions there are no analytical solutions, but the problem can be solved numerically when the state space is small.
Curse of dimensionality
Dynamic Portfolio Choice
Richard R. Lindsey442
Let’s extend the single-period utility maximization problem
to a multi-period setting.
Let:
t = 0,…, T be discrete time periods with T the investment
horizon
Rt be the random vector of asset returns in time periods t
yt = (y1,…, yN)t be the amount of money invested in the
different asset classes i = 1,…, N at time t
Scalars W0 and st, t = 0,…, T-1, represent the initial wealth
and possible cash flows (positive and negative) over time
Dynamic Portfolio Choice
Richard R. Lindsey443
Dynamic Portfolio Choice
Richard R. Lindsey444
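We can then write (with ι a vector of ones):
$$\begin{aligned} \max\ & E\,U(\iota' y_T) \\ \text{s.t.}\ & \iota' y_0 = W_0 + s_0 \\ & -R_{t-1}'\, y_{t-1} + \iota' y_t = s_t, \quad t = 1,\dots,T \\ & y_t \ge 0, \quad W_0, s_0, \dots, s_{T-1} \text{ given}, \quad s_T = 0 \end{aligned}$$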
Dynamic Portfolio Choice
Richard R. Lindsey445
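As an aside, note that with time-additive utility we could
also write
$$\begin{aligned} \max\ & E \sum_{t=1}^{T} \delta^{t}\, U(\iota' y_t) \\ \text{s.t.}\ & \iota' y_0 = W_0 + s_0 \\ & -R_{t-1}'\, y_{t-1} + \iota' y_t = s_t, \quad t = 1,\dots,T \\ & y_t \ge 0, \quad W_0, s_0, \dots, s_{T-1} \text{ given}, \quad s_T = 0 \end{aligned}$$
Where δ represents the discount factor.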
Dynamic Portfolio Choice
Richard R. Lindsey446
Back to our problem, defining $x_t$ (for t = 0,…,T-1) as the vector
of fractions invested in each asset class in each period, we
write
$$x_t = \frac{y_t}{W_t + s_t}$$
Where $W_t$ is the wealth available each period before adding
or deducting cash
$$W_t = R_{t-1}'\, x_{t-1}\,(W_{t-1} + s_{t-1})$$
Dynamic Portfolio Choice
Richard R. Lindsey447
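We can then write:
$$\begin{aligned} \max\ & E\,U(W_T) \\ \text{s.t.}\ & \iota' x_t = 1, \quad t = 0,\dots,T-1 \\ & W_{t+1} = R_t'\, x_t\,(W_t + s_t), \quad t = 0,\dots,T-1 \\ & y_t \ge 0, \quad W_0, s_0, \dots, s_{T-1} \text{ given}, \quad s_T = 0 \end{aligned}$$
Here we can see that for serially independent asset returns, wealth is a single state connecting one period with the next.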
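The wealth recursion is simple to trace by hand or in code. The sketch below follows one path of gross returns for a two-asset portfolio; the starting wealth and contribution echo the magnitudes used later in Example 1, but the returns, the 60/40 split, and the function name `wealth_path` are assumptions for illustration.

```python
def wealth_path(w0, cash_flows, fractions, gross_returns):
    """Apply W_{t+1} = R_t' x_t (W_t + s_t) period by period."""
    wealth = [w0]
    for s_t, x_t, r_t in zip(cash_flows, fractions, gross_returns):
        if abs(sum(x_t) - 1.0) > 1e-12:
            raise ValueError("fractions must sum to one in every period")
        growth = sum(x * r for x, r in zip(x_t, r_t))   # R_t' x_t
        wealth.append(growth * (wealth[-1] + s_t))
    return wealth

# Hypothetical three-period path: (stock, cash) gross returns and a fixed 60/40 mix.
cash_flows = [15_000.0, 15_000.0, 15_000.0]
fractions = [(0.6, 0.4)] * 3
gross_returns = [(1.10, 1.03), (0.95, 1.03), (1.08, 1.03)]

path = wealth_path(100_000.0, cash_flows, fractions, gross_returns)
print([round(w, 2) for w in path])
```

Because next period's wealth depends only on current wealth, the cash flow, and the new return draw, wealth is the single state variable carried through the recursion.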
Dynamic Portfolio Choice
Richard R. Lindsey448
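Now we can write the problem as a dynamic programming
recursion
$$\begin{aligned} U_t(W_t) = \max_{x_t}\ & E\,U_{t+1}\left((W_t + s_t)\, R_t'\, x_t\right) \\ \text{s.t.}\ & \iota' x_t = 1 \\ & A x_t = b \\ & l \le x_t \le u \end{aligned}$$
where $U_T(W_T) = U(W_T)$,
$W_{t+1} = R_t'\, x_t\,(W_t + s_t)$ and $W_0, s_0, \dots, s_{T-1}$ given, $s_T = 0$.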
Dynamic Portfolio Choice
Richard R. Lindsey449
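In practice, we need to resort to Monte Carlo simulation to
estimate the expected utility of the single-period utility
maximizing problem in each period.
Let $R_t^{\omega},\ \omega \in S_t,\ t = 1,\dots,T-1,$ be samples of return
distributions for each period t. We can represent the
problem as:
$$\begin{aligned} \hat{U}_t(W_t) = \max_{x_t}\ & \frac{1}{|S_t|}\sum_{\omega \in S_t} \hat{U}_{t+1}\left((W_t + s_t)\, R_t^{\omega\prime} x_t\right) \\ \text{s.t.}\ & \iota' x_t = 1 \\ & A x_t = b, \quad l \le x_t \le u \end{aligned}$$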
Dynamic Portfolio Choice
Richard R. Lindsey450
Now the dynamic optimization problem can be solved using a backward dynamic programming recursion, conditioning on wealth.
Starting at T-1, parameterize wealth into K discrete levels
$W_{T-1}^{k},\ k = 1,\dots,K,$
and solve the T-1 problem K times using sample $S_{T-1}$, obtaining solutions $\hat{x}_{T-1}^{k}$.
We then use those solutions to obtain the T-2 solutions and continue "backward". In period 0, the initial wealth is known and we conduct the final optimization using the period 1 value function.
In each period in the backward recursion, use a new sample generated from Monte Carlo.
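The backward recursion can be sketched in a few lines for a toy problem. Everything specific here is an assumption chosen to keep the example small: one risky asset with lognormal gross returns and a riskless asset, a power terminal utility with relative risk aversion 3, a coarse wealth grid, a grid of candidate stock fractions instead of a proper optimizer, and linear interpolation of the value function between wealth levels.

```python
import random

GAMMA = 3.0   # assumed relative risk aversion of the terminal power utility

def terminal_utility(w):
    return w ** (1.0 - GAMMA) / (1.0 - GAMMA)

def interpolate(grid, values, w):
    """Piecewise-linear value function, clamped at the ends of the wealth grid."""
    if w <= grid[0]:
        return values[0]
    if w >= grid[-1]:
        return values[-1]
    for k in range(1, len(grid)):
        if w <= grid[k]:
            t = (w - grid[k - 1]) / (grid[k] - grid[k - 1])
            return values[k - 1] + t * (values[k] - values[k - 1])

def backward_recursion(T, grid, fractions, cash_flow, n_samples, seed=0):
    rng = random.Random(seed)
    values = [terminal_utility(w) for w in grid]   # U_T(W_T) = U(W_T)
    policy = {}
    for t in range(T - 1, -1, -1):
        # A new Monte Carlo sample S_t of (stock, cash) gross returns for each period.
        sample = [(rng.lognormvariate(0.06, 0.18), 1.03) for _ in range(n_samples)]
        new_values, choices = [], []
        for w in grid:                              # solve the period-t problem K times
            best_value, best_x = None, None
            for x in fractions:                     # x = fraction of wealth in the stock
                ev = sum(interpolate(grid, values,
                                     (w + cash_flow) * (x * rs + (1.0 - x) * rc))
                         for rs, rc in sample) / n_samples
                if best_value is None or ev > best_value:
                    best_value, best_x = ev, x
            new_values.append(best_value)
            choices.append(best_x)
        values, policy[t] = new_values, choices
    return values, policy

wealth_grid = [0.25 * k for k in range(1, 21)]      # K = 20 wealth levels
stock_fractions = [i / 10 for i in range(11)]
values, policy = backward_recursion(T=3, grid=wealth_grid, fractions=stock_fractions,
                                    cash_flow=0.1, n_samples=200)
print(policy[0])
```

The returned `policy[t]` lists the chosen stock fraction at each wealth level, and `values` is the estimated period 0 value function on the grid. A production implementation would replace the fraction grid with a constrained optimizer and treat the grid boundaries more carefully.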
Practical Utility
Richard R. Lindsey451
Represent utility as a piecewise exponential function with K
pieces, where each piece i represents a certain absolute risk aversion γi,
i = 1,…, K.
Let $\hat{W}_i,\ i = 1,\dots,K,$ be discrete wealth levels representing
the borders of each piece i, such that below $\hat{W}_i$ the risk
aversion is γi and above $\hat{W}_i$ (until $\hat{W}_{i+1}$) the risk aversion is
γi+1 for all i = 1,…, K.
For each piece i represent utility by an exponential function
$$U_i(W_i) = a_i - b_i e^{-\gamma_i W_i}$$
Practical Utility
Richard R. Lindsey452
With a first derivative with respect to wealth
$$\frac{\partial U_i(W_i)}{\partial W_i} = \gamma_i b_i e^{-\gamma_i W_i}$$
The γi are chosen to represent the desired function of risk
aversion versus wealth.
The coefficients of the exponential functions for each piece i
are found by matching both the function values and the
first derivatives at the intersections $\hat{W}_i$. In other words, we
fit a spline function.
Practical Utility
Richard R. Lindsey453
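Thus at each wealth level $\hat{W}_i$, representing the border
between risk aversion γi and γi+1 , we have the following
two equations
$$a_i - b_i e^{-\gamma_i \hat{W}_i} = a_{i+1} - b_{i+1} e^{-\gamma_{i+1} \hat{W}_i}$$
$$\gamma_i b_i e^{-\gamma_i \hat{W}_i} = \gamma_{i+1} b_{i+1} e^{-\gamma_{i+1} \hat{W}_i}$$
From which we calculate the coefficients (setting a1 = 0 and
b1 = 1)
$$b_{i+1} = b_i \frac{\gamma_i}{\gamma_{i+1}} e^{(\gamma_{i+1} - \gamma_i)\hat{W}_i}$$
$$a_{i+1} = a_i - b_i\left(1 - \frac{\gamma_i}{\gamma_{i+1}}\right) e^{-\gamma_i \hat{W}_i}$$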
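The coefficient recursion can be checked numerically. The sketch below fits a three-piece utility with two interior breakpoints; the risk aversion values, the breakpoints, and the function names are hypothetical, and the indexing assumes that K pieces are separated by K - 1 breakpoints.

```python
import math

def fit_coefficients(gammas, breakpoints):
    """Return (a, b) for pieces U_i(W) = a_i - b_i exp(-gamma_i W), with a_1 = 0, b_1 = 1."""
    a, b = [0.0], [1.0]
    for i, w_hat in enumerate(breakpoints):
        g, g_next = gammas[i], gammas[i + 1]
        ratio = g / g_next
        b.append(b[i] * ratio * math.exp((g_next - g) * w_hat))
        a.append(a[i] - b[i] * (1.0 - ratio) * math.exp(-g * w_hat))
    return a, b

def piece_utility(a_i, b_i, g_i, w):
    return a_i - b_i * math.exp(-g_i * w)

def piece_marginal(b_i, g_i, w):
    return g_i * b_i * math.exp(-g_i * w)

# Hypothetical absolute risk aversion that falls as wealth rises.
gammas = [3.0, 2.0, 1.0]
breakpoints = [0.5, 1.5]
a, b = fit_coefficients(gammas, breakpoints)

# Largest mismatch in level and slope between neighbouring pieces at the breakpoints.
level_gap = max(abs(piece_utility(a[i], b[i], gammas[i], w)
                    - piece_utility(a[i + 1], b[i + 1], gammas[i + 1], w))
                for i, w in enumerate(breakpoints))
slope_gap = max(abs(piece_marginal(b[i], gammas[i], w)
                    - piece_marginal(b[i + 1], gammas[i + 1], w))
                for i, w in enumerate(breakpoints))
print(a, b, level_gap, slope_gap)
```

Both gaps are zero up to rounding, confirming that the fitted pieces join with matching values and first derivatives.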
Practical Utility
Richard R. Lindsey454
Example 1
Richard R. Lindsey455
Current wealth $100,000
Cash contributions (savings) of $15,000 per year
20 year investment horizon
US Stocks, International Stocks, Corporate Bonds,
Government Bonds, and Cash
Example 1
Richard R. Lindsey456
        US Stocks   Int Stocks   Corp Bonds   Gvt Bonds   Cash
Mean    10.80       10.37        9.49         7.90        5.61
Std     15.72       16.75        6.57         4.89        0.70
Example 1
Richard R. Lindsey457
Four utility functions
A: exponential, absolute risk aversion = 2
B: Increasing relative risk aversion and decreasing absolute risk
aversion
2.0 @ W of $0.25M and below, increasing to 3.5 @ W of $3.5M and above
C: Decreasing relative risk aversion and decreasing absolute
risk aversion
8.0 @ W of $1.0M and below, decreasing to 1.01 @ W of $1.5M and above
D: Quadratic (downside)
Quadratic with linear penalty of 1000 for underperforming $1.0M
Recall from Lecture 2
Richard R. Lindsey458
Example 1
Richard R. Lindsey459
Utility           CEW     Mean    Std     99%     95%
Exponential       1.412   1.564   0.424   0.770   0.943
Increasing RRA    1.440   1.575   0.452   0.771   0.937
Decreasing RRA    1.339   1.498   0.436   0.865   0.998
Quadratic         0.982   1.339   0.347   0.911   1.006
Example 1
Richard R. Lindsey460
[Figure: four panels titled Exponential, Increasing RRA, Decreasing RRA, and Quadratic.]
Example 1
Richard R. Lindsey461
[Figure: four pie charts of portfolio allocation across US Stock, Int Stock, Corp Bonds, Gvmt Bonds, and Cash. Data labels as extracted: Exponential 57.4, 16.9, 25.7, 0, 0; Increasing RRA 34, 13.7, 52.3, 0, 0; Decreasing RRA 10.6, 10, 67.2, 12.2, 0; Quadratic 53.2, 16.4, 30.4, 0, 0.]
Example 1
[Figures, slides 462-485: six plots for each utility function (Exponential, Increasing RRA, Decreasing RRA, Quadratic). For each function, three plots carry only the function name and three are labeled "1 to go", "10 to go", and "19 to go".]
Example 2
Now compare these dynamic strategies with six fixed-mix
strategies:
US stocks only
Cash only
All asset classes equally weighted
Risk averse (conservative)
Medium risk (dynamic)
Risk prone (aggressive)
With the exception of equally weighted asset classes, all
strategies are solutions of the single-period Markowitz
optimization.
Fixed-mix strategy results:
Strategy            Mean    Std     99%     95%
US stocks           1.825   1.065   0.469   0.660
Cash                0.868   0.019   0.822   0.834
Equally weighted    1.349   0.301   0.799   0.920
Risk Averse         1.098   0.110   0.869   0.930
Medium Risk         1.538   0.407   0.825   0.975
Risk Prone          1.663   0.639   0.677   0.852
Example 2: CEW Improvement
Strategy       Exponential   Increasing RRA   Decreasing RRA   Quadratic
US stocks         9.61%          7.17%           96.12%         12.06%
Cash             62.79%         66.04%           56.08%         13.36%
Equally wtd      11.10%         12.30%           14.56%          2.03%
Risk averse      29.93%         32.42%           27.45%          1.03%
Medium risk       0.55%          0.76%            0.62%          1.19%
Risk Prone        1.63%          0.44%           23.72%          4.81%
Bibliography
Hakansson, “On Myopic Portfolio Policies, With and Without Serial Correlation of Yields”, Journal of Business, 1971.
Infanger, “Dynamic Asset Allocation Strategies Using a Stochastic Dynamic Programming Approach”, in Handbook of Asset and Liability Management, Volume 1, Zenios and Ziemba eds., 2006.
Merton, “Lifetime Portfolio Selection Under Uncertainty: the Continuous-time Case”, Review of Economics and Statistics, 1969.
Mossin, “Optimal Multiperiod Portfolio Policies”, Journal of Business, 1968.
Samuelson, “Lifetime Portfolio Selection by Dynamic Stochastic Programming”, Review of Economics and Statistics, 1969.
▲▲▲▲▲▲▲▲▲▲▲▲▲
Characteristic Portfolios
Consider a single-period problem with no rebalancing within
the period, under the following assumptions:
There is a riskless asset
All first and second moments exist
It is not possible to build a fully invested portfolio that has zero
risk
The expected excess return on the fully invested portfolio with
minimum risk is positive.
Define a vector of asset attributes or characteristics
$$a = (a_1, a_2, \ldots, a_N)^\top$$
(these could be betas, expected returns, earnings-to-price ratios,
capitalization, membership in an economic sector, etc.)
The exposure of portfolio $w_P$ to the attribute is $a_P = a^\top w_P$.
Richard R. Lindsey, 4/14/2009
The characteristic portfolio uniquely captures the defining
attribute.
The characteristic portfolio machinery connects attributes and
portfolios, and lets us identify a portfolio’s exposure to an
attribute in terms of its covariance with the characteristic
portfolio.
The process works both ways: we can start with a portfolio
and find the attribute that the portfolio expresses most
effectively.
Proposition 1
1. For any non-zero attribute $a$ there is a unique portfolio $w_a$ that
has minimum risk and unit exposure to the attribute.
The weights of the characteristic portfolio are:
$$w_a = \frac{\Sigma^{-1} a}{a^\top \Sigma^{-1} a}$$
where $\Sigma$ is the covariance matrix of asset returns.
Characteristic portfolios are not necessarily fully
invested; they can have long and short positions, and
may have significant leverage.
2. The variance of the characteristic portfolio $w_a$ is given by:
$$\sigma_a^2 = w_a^\top \Sigma w_a = \frac{1}{a^\top \Sigma^{-1} a}$$
3. The beta of all assets with respect to the characteristic
portfolio $w_a$ is equal to $a$:
$$a = \frac{\Sigma w_a}{\sigma_a^2}$$
4. Consider two attributes $a$ and $d$ with characteristic
portfolios $w_a$ and $w_d$. Let $a_d$ and $d_a$ be, respectively, the
exposure of portfolio $w_d$ to characteristic $a$ and the
exposure of portfolio $w_a$ to characteristic $d$. The
covariance of the characteristic portfolios satisfies
$$\sigma_{a,d} = a_d \sigma_a^2 = d_a \sigma_d^2$$
5. If $\kappa$ is a positive scalar, then the characteristic portfolio
of $\kappa a$ is $w_a / \kappa$. Because characteristic portfolios have
unit exposure to the attribute, if we multiply the attribute
by $\kappa$ we will need to divide the characteristic portfolio
by $\kappa$ to preserve unit exposure.
6. If characteristic $a$ is a weighted combination of
characteristics $d$ and $f$, then the characteristic portfolio
of $a$ is a weighted combination of the characteristic
portfolios of $d$ and $f$; in particular, if
$$a = \kappa_d d + \kappa_f f$$
then
$$w_a = \frac{\kappa_d \sigma_a^2}{\sigma_d^2} w_d + \frac{\kappa_f \sigma_a^2}{\sigma_f^2} w_f$$
where
$$\frac{1}{\sigma_a^2} = \frac{\kappa_d a_d}{\sigma_d^2} + \frac{\kappa_f a_f}{\sigma_f^2}$$
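Items 1 through 4 of Proposition 1 can be checked numerically. The sketch below is only an illustration: the 3-asset covariance matrix and the attribute vectors are hypothetical, and the helper names (`solve`, `dot`, `matvec`, `characteristic_portfolio`) are introduced here for the example. It uses the standard library only, with a small Gaussian-elimination routine standing in for a matrix inverse.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def matvec(A, x):
    return [dot(row, x) for row in A]

def characteristic_portfolio(cov, attribute):
    """w_a = Sigma^{-1} a / (a' Sigma^{-1} a)."""
    x = solve(cov, attribute)          # Sigma^{-1} a
    scale = dot(attribute, x)          # a' Sigma^{-1} a
    return [xi / scale for xi in x]

# Hypothetical covariance matrix and attributes.
cov = [[0.04, 0.01, 0.00],
       [0.01, 0.09, 0.02],
       [0.00, 0.02, 0.16]]
a = [1.0, 0.5, 2.0]
d = [1.0, 1.0, 1.0]

w_a = characteristic_portfolio(cov, a)
w_d = characteristic_portfolio(cov, d)
var_a = dot(w_a, matvec(cov, w_a))
var_d = dot(w_d, matvec(cov, w_d))
beta_wrt_a = [c / var_a for c in matvec(cov, w_a)]   # item 3: should equal a
cov_ad = dot(w_a, matvec(cov, w_d))                  # item 4: covariance of w_a and w_d

print("exposure of w_a to a:", dot(a, w_a))
print("variance of w_a:", var_a, "vs", 1.0 / dot(a, solve(cov, a)))
print("betas with respect to w_a:", beta_wrt_a)
print("item 4:", cov_ad, dot(a, w_d) * var_a, dot(d, w_a) * var_d)
```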
Proof
The holdings of the characteristic portfolio can be
determined by solving for the portfolio with minimum risk
given the constraint that the exposure to characteristic $a$
equals 1:
$$\min_w \; w^\top \Sigma w \quad \text{s.t.} \quad w^\top a = 1$$
The first order conditions are
$$w^\top a = 1, \qquad \Sigma w - \lambda a = 0$$
where $\lambda$ is the Lagrange multiplier (the factor of 2 from differentiating $w^\top \Sigma w$ is absorbed into $\lambda$).
The results are
$$\lambda = \frac{1}{a^\top \Sigma^{-1} a}$$
and
$$w_a = \frac{\Sigma^{-1} a}{a^\top \Sigma^{-1} a}$$
which proves item 1. Item 2 can be verified using $w_a$ and
the definition of portfolio variance. Item 3 can be verified
using the definition of beta with respect to portfolio P as
$$\beta = \frac{\Sigma w_P}{\sigma_P^2}$$
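For item 4, note
$$\sigma_{a,d} = w_a^\top \Sigma w_d = \{\sigma_a^2 a^\top \Sigma^{-1}\} \Sigma w_d = \sigma_a^2 \, a^\top w_d = a_d \sigma_a^2$$
and
$$\sigma_{a,d} = w_a^\top \Sigma w_d = w_a^\top \Sigma \{\sigma_d^2 \Sigma^{-1} d\} = \sigma_d^2 \, w_a^\top d = d_a \sigma_d^2$$
Items 5 and 6 are straightforward.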
Example 1:
Suppose $e = (1, 1, \ldots, 1)^\top$ is the attribute. Every
portfolio’s exposure to $e$, $e_P = e^\top w_P$, measures the extent of its
investment; if $e_P = 1$ then the portfolio is fully invested.
Portfolio C, the characteristic portfolio for attribute $e$, is
the minimum-risk fully invested portfolio:
$$w_C = \frac{\Sigma^{-1} e}{e^\top \Sigma^{-1} e}$$
$$\sigma_C^2 = w_C^\top \Sigma w_C = \frac{1}{e^\top \Sigma^{-1} e}$$
$$e = \frac{\Sigma w_C}{\sigma_C^2}$$
Note every asset has a beta of 1 with this portfolio; and the
covariance of any fully invested portfolio with C is $\sigma_C^2$.
Example 2
Suppose beta is the attribute, where beta is defined by some
benchmark portfolio B
$$\beta = \frac{\Sigma w_B}{\sigma_B^2}$$
Then the benchmark is the characteristic portfolio of beta
$$w_B = \frac{\Sigma^{-1} \beta}{\beta^\top \Sigma^{-1} \beta}$$
$$\sigma_B^2 = w_B^\top \Sigma w_B = \frac{1}{\beta^\top \Sigma^{-1} \beta}$$
So the benchmark is the minimum-risk portfolio with a beta
of 1.
Note that the relationship between portfolios C and B is
$$\sigma_{B,C} = e_B \sigma_C^2 = \beta_C \sigma_B^2$$
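The two examples can be checked with a small numerical sketch. The covariance matrix and the benchmark weights below are hypothetical, and the helper functions are illustrative names rather than anything defined in the lecture. The code builds portfolio C from the vector of ones, computes beta from the benchmark, and confirms that the characteristic portfolio of beta reproduces the benchmark.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def matvec(A, x):
    return [dot(row, x) for row in A]

def characteristic_portfolio(cov, attribute):
    """w_a = Sigma^{-1} a / (a' Sigma^{-1} a)."""
    x = solve(cov, attribute)
    scale = dot(attribute, x)
    return [xi / scale for xi in x]

# Hypothetical inputs.
cov = [[0.04, 0.01, 0.00],
       [0.01, 0.09, 0.02],
       [0.00, 0.02, 0.16]]
e = [1.0, 1.0, 1.0]
w_B = [0.5, 0.3, 0.2]          # hypothetical fully invested benchmark

# Example 1: portfolio C, the minimum-risk fully invested portfolio.
w_C = characteristic_portfolio(cov, e)
var_C = dot(w_C, matvec(cov, w_C))
beta_wrt_C = [c / var_C for c in matvec(cov, w_C)]   # every asset should have beta 1

# Example 2: beta defined by the benchmark, and its characteristic portfolio.
var_B = dot(w_B, matvec(cov, w_B))
beta = [c / var_B for c in matvec(cov, w_B)]
w_beta = characteristic_portfolio(cov, beta)          # should reproduce w_B

# Relationship between C and B.
cov_BC = dot(w_B, matvec(cov, w_C))
e_B = dot(e, w_B)
beta_C = dot(beta, w_C)

print("betas with respect to C:", beta_wrt_C)
print("characteristic portfolio of beta:", w_beta)
print("sigma_BC:", cov_BC, "e_B*var_C:", e_B * var_C, "beta_C*var_B:", beta_C * var_B)
```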
Proposition 2
Let q be the characteristic portfolio of the characteristic $f$
(expected excess returns)
$$w_q = \frac{\Sigma^{-1} f}{f^\top \Sigma^{-1} f}$$
Then
a. The Sharpe ratio is
$$SR_q = \max\{SR_P \mid P\} = (f^\top \Sigma^{-1} f)^{1/2}$$
b.
$$f_q = w_q^\top f = 1, \qquad \sigma_q^2 = \frac{1}{f^\top \Sigma^{-1} f}$$
c.
$$f = \frac{\Sigma w_q}{\sigma_q^2} = \frac{\Sigma w_q}{\sigma_q} SR_q$$
d. If $\rho_{Pq}$ is the correlation between portfolios P and q, then
$$SR_P = \rho_{Pq} SR_q$$
e. The fraction of q invested in risky assets is given by
$$e_q = \frac{f_C \sigma_q^2}{\sigma_C^2}$$
Proof
For any portfolio $w_P$, the Sharpe ratio is $SR_P = f_P / \sigma_P$. For
any positive constant $\kappa$, the portfolio with holdings $\kappa w_P$
will also have a Sharpe ratio equal to $SR_P$. Thus, to find
the maximum Sharpe ratio, we can set the expected excess
return to 1 and minimize risk. We can then minimize $w^\top \Sigma w$
subject to the constraint that $w^\top f = 1$. This is just the
problem we solved to get $w_q$, the characteristic portfolio
of $f$.
Items b and c are properties of the characteristic portfolio.
For d, we use c:
$$SR_P = \frac{f_P}{\sigma_P} = \frac{w_P^\top f}{\sigma_P} = \frac{w_P^\top \Sigma w_q}{\sigma_P \sigma_q} SR_q = \rho_{Pq} SR_q$$
And e follows from Proposition 1, item 4.
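Items a and d of Proposition 2 lend themselves to a quick numerical check. The sketch below assumes a hypothetical covariance matrix, a hypothetical vector of expected excess returns, and an arbitrary comparison portfolio P. The helper names are illustrative. It confirms that the Sharpe ratio of q equals the square root of f'Σ⁻¹f and that the Sharpe ratio of P equals its correlation with q times the Sharpe ratio of q.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def matvec(A, x):
    return [dot(row, x) for row in A]

# Hypothetical inputs.
cov = [[0.04, 0.01, 0.00],
       [0.01, 0.09, 0.02],
       [0.00, 0.02, 0.16]]
f = [0.06, 0.04, 0.08]        # expected excess returns
w_P = [0.2, 0.5, 0.3]         # an arbitrary comparison portfolio

# Portfolio q: characteristic portfolio of f.
x = solve(cov, f)                                   # Sigma^{-1} f
w_q = [xi / dot(f, x) for xi in x]
sigma_q = math.sqrt(dot(w_q, matvec(cov, w_q)))
sr_q = dot(f, w_q) / sigma_q                        # f_q = 1, so SR_q = 1 / sigma_q
sr_max = math.sqrt(dot(f, x))                       # (f' Sigma^{-1} f)^(1/2)

# Portfolio P: Sharpe ratio and correlation with q.
sigma_P = math.sqrt(dot(w_P, matvec(cov, w_P)))
sr_P = dot(f, w_P) / sigma_P
rho_Pq = dot(w_P, matvec(cov, w_q)) / (sigma_P * sigma_q)

print("SR_q:", sr_q, "sqrt(f' Sigma^-1 f):", sr_max)
print("SR_P:", sr_P, "rho_Pq * SR_q:", rho_Pq * sr_q)
```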
Proposition 3
Assume $f_C > 0$.
1. Portfolio q is net long: $e_q > 0$.
Let portfolio Q be the characteristic portfolio of $e_q f$.
Portfolio Q is fully invested with holdings
$$w_Q = \frac{w_q}{e_q}$$
In addition $SR_Q = SR_q$, and for any portfolio P with a
correlation $\rho_{PQ}$ with portfolio Q, we have
$$SR_P = \rho_{PQ} SR_Q$$
2.
$$\frac{f_Q}{\sigma_Q^2} = \frac{f_C}{\sigma_C^2}$$
$$f = f_Q \frac{\Sigma w_Q}{\sigma_Q^2} = f_Q \, \beta_{\text{wrt } Q}$$
Note that this specifies exactly how Portfolio Q “explains”
expected returns.
3.
$$\beta_Q = \frac{f_B \sigma_Q^2}{f_Q \sigma_B^2}$$
4. If the benchmark is fully invested, $e_B = 1$, then
$$\beta_Q = \frac{\beta_C f_B}{f_C}$$
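A short numerical sketch can illustrate Proposition 3. It assumes a hypothetical covariance matrix and hypothetical expected excess returns chosen so that the premise (a positive expected excess return on portfolio C) holds. The helper names are illustrative. The code rescales q into the fully invested portfolio Q and checks that expected excess returns are proportional to betas with respect to Q.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def matvec(A, x):
    return [dot(row, x) for row in A]

def characteristic_portfolio(cov, attribute):
    """w_a = Sigma^{-1} a / (a' Sigma^{-1} a)."""
    x = solve(cov, attribute)
    scale = dot(attribute, x)
    return [xi / scale for xi in x]

# Hypothetical inputs.
cov = [[0.04, 0.01, 0.00],
       [0.01, 0.09, 0.02],
       [0.00, 0.02, 0.16]]
f = [0.06, 0.04, 0.08]        # expected excess returns
e = [1.0, 1.0, 1.0]

w_C = characteristic_portfolio(cov, e)
w_q = characteristic_portfolio(cov, f)
var_C = dot(w_C, matvec(cov, w_C))
f_C = dot(f, w_C)             # premise of the proposition: must be positive
e_q = dot(e, w_q)             # net investment of q in risky assets

# Portfolio Q: q rescaled to be fully invested.
w_Q = [w / e_q for w in w_q]
var_Q = dot(w_Q, matvec(cov, w_Q))
f_Q = dot(f, w_Q)
beta_wrt_Q = [c / var_Q for c in matvec(cov, w_Q)]

print("f_C:", f_C, "e_q:", e_q, "sum of w_Q:", sum(w_Q))
print("f_Q / var_Q:", f_Q / var_Q, "f_C / var_C:", f_C / var_C)
print("f:", f, "f_Q * beta:", [f_Q * b for b in beta_wrt_Q])
```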
Portfolio A (characteristic portfolio for alpha)
Define alpha as $\alpha = f - \beta f_B$. Let $w_A$ be the characteristic
portfolio for alpha, the minimum risk portfolio with alpha
of 100% (note that this portfolio will have significant
leverage). According to Proposition 1, item 6, we can
express $w_A$ in terms of $w_B$ and $w_q$. From item 4, we see
that the relationship between alpha and beta is
$$\sigma_{B,A} = \alpha_B \sigma_A^2 = \beta_A \sigma_B^2$$
However, $\alpha_B = 0$ by construction, so portfolios A and B are
uncorrelated and $\beta_A = 0$.
Characteristic Portfolio of Alpha
Consider the characteristic portfolio for alpha where
$$\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_N)^\top$$
is the vector of forecasted expected residual returns, where
the residual is relative to the benchmark portfolio. Since
the alphas are forecasts of residual return, both the
benchmark and the riskless asset have alphas of zero.
The portfolio weights are
$$w_A = \frac{\Sigma^{-1} \alpha}{\alpha^\top \Sigma^{-1} \alpha}$$
Portfolio A has an alpha of 1, $w_A^\top \alpha = 1$, and it has minimum
risk among all portfolios with that property. The variance
of portfolio A is
$$\sigma_A^2 = w_A^\top \Sigma w_A = \frac{1}{\alpha^\top \Sigma^{-1} \alpha}$$
In addition, we can define alpha in terms of Portfolio A
$$\alpha = \frac{\Sigma w_A}{\sigma_A^2}$$
Alpha
Looking forward (ex ante), alpha is a forecast of residual return.
Looking backward (ex post), alpha is the average of the realized
residual returns.
The term alpha (just like beta) comes from the use of linear
regression
$$r_P(t) = \alpha_P + \beta_P r_B(t) + \varepsilon_P(t)$$
The residual returns from this regression are
$$\theta_P(t) = \alpha_P + \varepsilon_P(t)$$
“Realized alphas are for keeping score – the job of an active manager is to
score – for that you need to forecast alpha”
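The regression view of realized alpha can be sketched as follows. This is a minimal ordinary least squares fit in plain Python. The return series are hypothetical and the function name `regress_alpha_beta` is illustrative. The residual return in each period is the portfolio return minus beta times the benchmark return, and its average is the realized alpha.

```python
def regress_alpha_beta(r_p, r_b):
    """OLS fit of r_P(t) = alpha + beta * r_B(t) + eps(t); returns (alpha, beta)."""
    n = len(r_p)
    mean_p = sum(r_p) / n
    mean_b = sum(r_b) / n
    cov_pb = sum((p - mean_p) * (b - mean_b) for p, b in zip(r_p, r_b))
    var_b = sum((b - mean_b) ** 2 for b in r_b)
    beta = cov_pb / var_b
    alpha = mean_p - beta * mean_b
    return alpha, beta

# Hypothetical excess returns per period.
r_b = [0.02, -0.01, 0.03, 0.00, 0.01]    # benchmark
r_p = [0.03, -0.01, 0.05, 0.00, 0.02]    # portfolio

alpha, beta = regress_alpha_beta(r_p, r_b)

# Residual returns theta_P(t) = alpha_P + eps_P(t) = r_P(t) - beta_P * r_B(t).
residual = [p - beta * b for p, b in zip(r_p, r_b)]
realized_alpha = sum(residual) / len(residual)

print("alpha:", alpha, "beta:", beta)
print("average residual return:", realized_alpha)

# The benchmark regressed on itself has beta 1 and alpha 0.
alpha_b, beta_b = regress_alpha_beta(r_b, r_b)
print("benchmark alpha:", alpha_b, "benchmark beta:", beta_b)
```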
Looking into the future, alpha is a forecast of residual return
$$\alpha_n = E\{\theta_n\}$$
Note that by definition, the benchmark portfolio always has
a residual return of 0. Therefore the alpha of the
benchmark portfolio must also be 0.
Similarly, the residual return of a riskless portfolio is also
0, and its alpha must be 0.
Information Ratio
While α is the primary measure of a portfolio’s excess
return, another metric, the information ratio, is often used
by professionals.
The information ratio adjusts the α for the portfolio’s
residual risk and is written:
$$IR_P = \frac{\alpha_P}{\omega_P}$$
αP is predicted alpha; ωP is the predicted standard deviation
of the residual.
Typically, we consider the ex-ante information ratio for making decisions and
the ex-post information ratio for performance evaluation.
If ωP is 0, we set IRP equal to 0, and, in general, we define
the information ratio IR as the largest possible value of
IRP given alphas {αn}
$$IR = \max\{IR_P \mid P\}$$
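The definition, including the convention for zero residual risk, can be sketched in a few lines. The inputs below are hypothetical. The ex-post version uses the mean and sample standard deviation of a made-up series of realized residual returns, which is one common way to estimate it and is an assumption of this example.

```python
import statistics

def information_ratio(alpha, omega):
    """IR_P = alpha_P / omega_P, with IR_P set to 0 when omega_P is 0."""
    return 0.0 if omega == 0 else alpha / omega

# Ex ante: predicted alpha and predicted residual risk (hypothetical values).
ex_ante_ir = information_ratio(alpha=0.02, omega=0.04)

# Ex post: average and standard deviation of realized residual returns (hypothetical).
residual_returns = [0.03, -0.01, 0.02, 0.00, 0.01]
ex_post_ir = information_ratio(statistics.mean(residual_returns),
                               statistics.stdev(residual_returns))

print("ex-ante IR:", ex_ante_ir)
print("ex-post IR:", ex_post_ir)
print("zero residual risk:", information_ratio(0.0, 0.0))
```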
Now, returning to Portfolio A (the characteristic portfolio for
alpha), we note that it has several interesting properties
Proposition 4
1. Portfolio A has zero beta; therefore it typically has long
and short positions
$$\beta_A = w_A^\top \beta = 0$$
2. Portfolio A has the maximum information ratio
$$IR_A = IR = \sqrt{\alpha^\top \Sigma^{-1} \alpha} \ge IR_P \quad \text{for all } P$$
3. Portfolio A has total and residual risk equal to the inverse of
IR.
$$\sigma_A = \omega_A = \frac{1}{IR}$$
4. Any portfolio P that can be written as
$$w_P = \beta_P w_B + \alpha_P w_A \quad \text{with } \alpha_P > 0$$
has IRP = IR.
5. Recall Portfolio Q, the characteristic portfolio of $e_q f$.
This portfolio is a mixture of the benchmark and portfolio
A:
$$w_Q = \beta_Q w_B + \alpha_Q w_A$$
with
$$\beta_Q = \frac{f_B \sigma_Q^2}{f_Q \sigma_B^2}$$
and
$$\alpha_Q = \frac{\sigma_Q^2}{f_Q \sigma_A^2}$$
Therefore IRQ = IR. The information ratio of Portfolio Q
equals that of Portfolio A.
6. Total holdings in risky assets for Portfolio A are
$$e_A = \frac{\alpha_C \sigma_A^2}{\sigma_C^2}$$
7. Let $\theta_P$ be the residual return on any portfolio P. The
information ratio of portfolio P is
$$IR_P = IR_Q \cdot \mathrm{Corr}\{\theta_P, \theta_Q\}$$
8. The maximum information ratio is related to portfolio Q’s maximum Sharpe ratio
$$IR = \frac{\alpha_Q}{\omega_Q} = SR \, \frac{\omega_Q}{\sigma_Q}$$
9. Alpha can be represented as
$$\alpha = IR \, \frac{\Sigma w_A}{\omega_A} = IR \cdot MCRR_Q$$
So alpha is directly related to the marginal contribution to residual risk (MCRR) through the information ratio.
10. The Sharpe ratio of the benchmark is related to the
maximal information ratio and Sharpe ratio
$$SR_B^2 = SR^2 - IR^2$$
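Several items of Proposition 4 can be verified numerically. The sketch below assumes a hypothetical covariance matrix, hypothetical expected excess returns, and a hypothetical benchmark. The helper names are illustrative. It builds alpha as f minus beta times the benchmark's expected excess return, forms Portfolio A, and checks items 1, 3, and 10.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def matvec(A, x):
    return [dot(row, x) for row in A]

# Hypothetical inputs.
cov = [[0.04, 0.01, 0.00],
       [0.01, 0.09, 0.02],
       [0.00, 0.02, 0.16]]
f = [0.06, 0.04, 0.08]        # expected excess returns
w_B = [0.5, 0.3, 0.2]         # benchmark weights

var_B = dot(w_B, matvec(cov, w_B))
beta = [c / var_B for c in matvec(cov, w_B)]
f_B = dot(f, w_B)
alpha = [fi - bi * f_B for fi, bi in zip(f, beta)]   # alpha = f - beta * f_B

# Portfolio A: characteristic portfolio of alpha.
x = solve(cov, alpha)                                # Sigma^{-1} alpha
w_A = [xi / dot(alpha, x) for xi in x]
sigma_A = math.sqrt(dot(w_A, matvec(cov, w_A)))
beta_A = dot(beta, w_A)                              # item 1: should be 0

ir = math.sqrt(dot(alpha, x))                        # maximum information ratio
sr = math.sqrt(dot(f, solve(cov, f)))                # maximum Sharpe ratio
sr_B = f_B / math.sqrt(var_B)                        # benchmark Sharpe ratio

print("beta of A:", beta_A)
print("sigma_A:", sigma_A, "1/IR:", 1.0 / ir)
print("SR^2:", sr ** 2, "SR_B^2 + IR^2:", sr_B ** 2 + ir ** 2)
```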
Fundamental Law of Active Management
A portfolio manager applies quantitative analysis to market
data to find and exploit the opportunities for excess return
hidden in market inefficiencies.
Quantitative analysis opens up the possibility of statistical
arbitrage if the methods and models used combine all
available information efficiently.
This is illustrated within the framework of the fundamental
law of active management (Grinold 1989; Grinold & Kahn
1997).
The fundamental law states that the information ratio (IR) is
the product of the information coefficient (IC) and the
square root of breadth (BR)
$$IR = IC \cdot \sqrt{BR}$$
Breadth is defined as the number of independent forecasts of
exceptional return (think of breadth as the number of
independent factors for which you make forecasts).
The information coefficient is the correlation of each
forecast with the actual outcomes (here assumed to be the
same for all forecasts).
This equation says that a higher information ratio can be
achieved by increasing the information coefficient or by
increasing the breadth.
IC can be increased by finding factors that are more
significant than those that are already in the model.
BR can be increased by finding more factors that are
uncorrelated (or relatively uncorrelated) with the existing
factors in the model.
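The trade-off between skill and breadth can be made concrete with a few lines of arithmetic. The IC and BR values below are hypothetical, and the function name is illustrative. Because breadth enters through a square root, doubling the information coefficient has the same effect on IR as quadrupling the breadth.

```python
import math

def information_ratio(ic, breadth):
    """Fundamental law of active management: IR = IC * sqrt(BR)."""
    return ic * math.sqrt(breadth)

# Hypothetical starting point: modest skill on 100 independent forecasts.
base = information_ratio(0.05, 100)

# Two ways to improve: better forecasts or more independent forecasts.
more_skill = information_ratio(0.10, 100)      # IC doubled
more_breadth = information_ratio(0.05, 400)    # BR quadrupled

print("base IR:", base)
print("IC doubled:", more_skill)
print("BR quadrupled:", more_breadth)
```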
Generally, for quantitative portfolio management, we use a
model something like
$$r_{it} = \alpha_i + \beta_{i1} f_{1t} + \beta_{i2} f_{2t} + \cdots + \beta_{iK} f_{Kt} + \varepsilon_{it}$$
The fundamental law basically assesses how well our model
explains the stock-return process, and it expresses the
equation’s goodness of fit as the product of the number of
explanatory variables and each variable’s average
contribution.
While the fundamental law can be expressed in different ways, there are certain general facts which always hold:
1. $IR^2$ approximately equals the goodness of fit ($R^2$) of the forecasting equations.
2. The breadth is the number of explanatory variables in the forecasting equations.
3. $IC^2$ is the average contribution of each explanatory variable in increasing $R^2$.
4. When the benchmark is ignored and the risk-free rate is subtracted from the portfolio returns, IR is essentially the maximum Sharpe ratio one can achieve and the fundamental law decomposes the maximum Sharpe ratio into the number of explanatory variables and their average contribution.
Bibliography
Chincarini and Kim, Quantitative Equity Portfolio Management, 2006.
Grinold, “The Fundamental Law of Active Management”, Journal of Portfolio Management, 1989.
Grinold and Kahn, Active Portfolio Management, 2000.
▲▲▲▲▲▲▲▲▲▲▲▲▲