
Time-Inconsistency: Problems and Mathematical Theory

Jiongmin Yong

(University of Central Florida)

December 28, 2015

Outline

1. Introduction: Time-Consistency

2. Time-Inconsistent Problems

3. Equilibrium Strategies

4. Open Problems

1. Introduction: Time-Consistency

Continuous Compound Interest

— Exponential Discounting.

P(0) — initial principal

r — annual interest rate

P(t) = P(0)e^{rt} — amount at the end of the t-th year (compounded continuously)

For any given future times T > t > 0, from

P(T) = P(0)e^{rT}, P(t) = P(0)e^{rt},

one has P(T) = P(t)e^{r(T−t)}, 0 < t < T,

or, equivalently,

P(t) = P(T)e^{−r(T−t)}, 0 < t < T.

This is the value (price) at t of a payoff P(T ) at T .

e^{−r(T−t)} — exponential discounting.

For any 0 < t1 < t2 < T , one has

e^{r(T−t_1)}P(t_1) = P(T) = e^{r(T−t_2)}P(t_2).

Therefore,

possessing P(t_1) at t_1 is equivalent to possessing P(t_2) at t_2.

This is called the Time-Consistency of exponential discounting

Preferred Choice: Assume that annual rate is r = 10%

Option (A): Get $100 today (December 28, 2015).

Option (B): Get $105 (> 100(1 + r/12)) on January 28, 2016.

Option (A′): Get $110 (= 100 × 1.10) on December 28, 2016.

Option (B′): Get $115.50 (> 110(1 + r/12)) on January 28, 2017.

For a time-consistent person,

(A) ∼ (A′), (B) ∼ (B′),

(B) ≻ (A), (B′) ≻ (A′).

(We will come back to this example later)
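A quick numerical check of these comparisons (a minimal Python sketch; the rate r = 10% and the simple one-month factor 1 + r/12 are taken from the example above):

```python
# Verify the option comparisons under the slide's conventions:
# one-month growth factor 1 + r/12, one-year factor 1 + r.
r = 0.10
print(100 * (1 + r / 12))   # ~100.83 < 105: a time-consistent agent prefers (B) to (A)
print(100 * (1 + r))        # 110.00: options (A) and (A') have the same value
print(105 * (1 + r))        # 115.50: options (B) and (B') have the same value
```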

Semigroups: Consider Ẋ(s) = b(s, X(s)), s ∈ [t, T],

X(t) = x.

Suppose for any (t, x) ∈ [0, T) × Rn, the above admits a unique solution X(· ; t, x). Then for any τ ∈ (t, T),

X (s; τ,X (τ ; t, x)) = X (s; t, x), s ∈ [τ,T ].

The restriction X(· ; t, x)|_{[τ,T]} is the solution of the equation starting from (τ, X(τ; t, x)). A (nonlinear) semigroup property.
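The flow property can be checked numerically. Below is a minimal sketch (the drift b is a made-up example, not from the talk): solve the ODE from (t, x), restart from (τ, X(τ; t, x)), and compare the two trajectories on [τ, T].

```python
import numpy as np
from scipy.integrate import solve_ivp

def b(s, x):                     # hypothetical drift, for illustration only
    return np.sin(s) - 0.5 * x

t, T, x0, tau = 0.0, 2.0, [1.0], 1.2

# X(.; t, x): solve from (t, x)
sol1 = solve_ivp(b, (t, T), x0, dense_output=True, rtol=1e-10, atol=1e-12)
x_tau = sol1.sol(tau)            # X(tau; t, x)

# X(.; tau, X(tau; t, x)): restart from (tau, X(tau; t, x))
sol2 = solve_ivp(b, (tau, T), x_tau, dense_output=True, rtol=1e-10, atol=1e-12)

s_grid = np.linspace(tau, T, 5)
print(np.max(np.abs(sol1.sol(s_grid) - sol2.sol(s_grid))))   # ~1e-9: same trajectory
```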

Dynamic Programming/Feynman-Kac formula: Consider

Ẋ(s) = b(s, X(s)), s ∈ [0, T],

J(t, X(t)) = h(X(T)) + ∫_t^T g(s, X(s)) ds.

For τ ∈ (t,T ),

J(t, X(t)) = h(X(T; t, X(t))) + ∫_t^T g(s, X(s; t, X(t))) ds

= h(X(T; τ, X(τ; t, X(t)))) + ∫_τ^T g(s, X(s; τ, X(τ; t, X(t)))) ds + ∫_t^τ g(s, X(s; t, X(t))) ds

= J(τ, X(τ)) + ∫_t^τ g(s, X(s)) ds.

Extended semigroup property. (Special case: h(x) = x , g = 0)

This leads to J_t(t, x) + J_x(t, x) b(t, x) + g(t, x) = 0,

J(T, x) = h(x).    (1)

Linear Hamilton-Jacobi type equation.

Another viewpoint: The solution J(t, x) of PDE (1) admits the representation:

J(t, x) = h(X(T; t, x)) + ∫_t^T g(s, X(s; t, x)) ds.

This is a deterministic Feynman-Kac formula.
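As a sanity check, one can verify this representation against PDE (1) by finite differences. The sketch below uses made-up coefficients b, g, h (not from the talk), computes J(t, x) by integrating g along the trajectory, and evaluates the residual of (1) at one point.

```python
import numpy as np
from scipy.integrate import solve_ivp

b = lambda s, x: -0.5 * x        # hypothetical coefficients, for illustration only
g = lambda s, x: x ** 2
h = lambda x: x
T = 1.0

def J(t, x):
    # Augment the state with z(s) = integral of g(r, X(r; t, x)) dr over [t, s].
    rhs = lambda s, y: [b(s, y[0]), g(s, y[0])]
    sol = solve_ivp(rhs, (t, T), [x, 0.0], rtol=1e-10, atol=1e-12)
    return h(sol.y[0, -1]) + sol.y[1, -1]   # h(X(T; t, x)) + integral of g

t0, x0, eps = 0.3, 0.7, 1e-5
Jt = (J(t0 + eps, x0) - J(t0 - eps, x0)) / (2 * eps)
Jx = (J(t0, x0 + eps) - J(t0, x0 - eps)) / (2 * eps)
print(Jt + Jx * b(t0, x0) + g(t0, x0))      # ~0: J_t + J_x b + g = 0 holds
```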

Optimal Control Problem: Consider Ẋ(s) = b(s, X(s), u(s)), s ∈ [t, T],

X (t) = x ,

with (scalar) cost functional

J(t, x; u(·)) = h(X(T)) + ∫_t^T g(s, X(s), u(s)) ds,

where

U[t, T] = { u : [t, T] → U | u(·) is measurable }.

Problem (C). For given (t, x) ∈ [0, T) × Rn, find ū(·) ∈ U[t, T] such that

J(t, x; ū(·)) = inf_{u(·)∈U[t,T]} J(t, x; u(·)) ≡ V(t, x).

Bellman Optimality Principle: For any τ ∈ [t, T],

V(t, x) = inf_{u(·)∈U[t,τ]} [ ∫_t^τ g(s, X(s), u(s)) ds + V(τ, X(τ; t, x, u(·))) ].

Let (X̄(·), ū(·)) be optimal for (t, x) ∈ [0, T) × Rn. Then

V(t, x) = J(t, x; ū(·)) = ∫_t^τ g(s, X̄(s), ū(s)) ds + J(τ, X̄(τ; t, x, ū(·)); ū(·)|_{[τ,T]})

≥ ∫_t^τ g(s, X̄(s), ū(s)) ds + V(τ, X̄(τ; t, x, ū(·)))

≥ inf_{u(·)∈U[t,τ]} [ ∫_t^τ g(s, X(s), u(s)) ds + V(τ, X(τ; t, x, u(·))) ] = V(t, x).

Thus, all the equalities hold.

Consequently,

J(τ, X̄(τ); ū(·)|_{[τ,T]}) = V(τ, X̄(τ)) = inf_{u(·)∈U[τ,T]} J(τ, X̄(τ); u(·)), a.s.

Hence, ū(·)|_{[τ,T]} ∈ U[τ, T] is optimal for (τ, X̄(τ; t, x, ū(·))).

This is called the time-consistency of Problem (C).
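The same tail-optimality can be observed in a toy discrete-time control problem. The sketch below (state dynamics, costs, and horizon are all made up for illustration) finds an optimal control sequence from (0, x) by brute force and checks that its tail is still optimal for the subproblem starting at a later stage.

```python
import itertools

U = (-1, 0, 1)                       # hypothetical control set
N = 4                                # stages 0,1,2,3; terminal time 4
g = lambda k, x, u: x * x + abs(u)   # hypothetical running cost
h = lambda x: x * x                  # hypothetical terminal cost

def cost(k0, x0, controls):
    x, total = x0, 0.0
    for k, u in enumerate(controls, start=k0):
        total += g(k, x, u)
        x += u                       # dynamics: x_{k+1} = x_k + u_k
    return total + h(x)

def best(k0, x0):                    # brute-force optimal control sequence from (k0, x0)
    return min(itertools.product(U, repeat=N - k0), key=lambda c: cost(k0, x0, c))

u_opt = best(0, 3)                   # optimal sequence from (0, x = 3)
x2 = 3 + u_opt[0] + u_opt[1]         # state reached at stage 2 under u_opt
print(u_opt[2:], best(2, x2))        # the tail of u_opt is optimal from (2, x2) (up to ties)
```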

Definition. A problem involving decision-making is said to be time-consistent if

an optimal decision made at a given time t remains optimal at any later time s > t.

If this is not the case, the problem is said to be time-inconsistent.

*********************

If the problem under consideration is time-consistent, then once an optimal decision is made, we will not regret it afterwards!

If the whole world were time-consistent,

then things would be too ideal and life would be much easier!

But it might also be a little too boring

(it is exciting to have some challenges)!

Fortunately (unfortunately?), life is not that ideal!

(Challenges are around!)

Time-inconsistent problems exist almost everywhere!

2. Time-Inconsistent Problems

In reality, problems are hardly time-consistent:

An optimal decision/policy made at time t, more often than not, will not stay optimal thereafter.

Main reason: When building the model and describing the utility/cost, etc., the following are used:

subjective Time-Preferences and

subjective Risk-Preferences.

• Time-Preferences:

Most people do not discount exponentially! Instead, they over-discount the utility of outcomes in the immediate future.

* Overreaction without thinking of the consequences (bad temper and impatience lead to unnecessary fighting, ...)

* Breaking promises, delaying planned projects (failing to meet deadlines, such as refereeing papers; quitting smoking, ...)

* Shopping using credit cards (meeting immediate satisfaction, big discounts, buy one get one free, ...)

* Unintentionally polluting the environment due to over-development

* Corruption, without thinking of the consequences

...........

Doing things not because you need to, but because you like to.

Not doing things not because you do not need to, but because you do not like to.

* D. Hume (1739), “A Treatise of Human Nature”

“Reason is, and ought only to be the slave of the passions.”

More often than not, people do things because of their passions.

* A. Smith (1759), “The Theory of Moral Sentiments”

Utility is not intertemporally separable; rather,

past and future experiences, jointly with current ones,

provide current utility.

Roughly, in mathematical terms, one should have

U(t, X(t)) = f(U(t − r, X(t − r)), U(t + τ, X(t + τ))),

where U(t,X ) is the utility at (t,X ).

Exponential discounting: λ_e(t) = e^{−rt}, r > 0 — discount rate

Hyperbolic discounting: λ_h(t) = 1/(1 + kt) — a hyperbola

If we let k = e^r − 1, i.e., e^{−r} = λ_e(1) = λ_h(1) = 1/(1 + k), then

λ_e(t) = e^{−rt} = 1/(1 + k)^t,  λ_h(t) = 1/(1 + kt).

For t ∼ 0, t ↦ 1/(1 + kt) decreases faster than t ↦ 1/(1 + k)^t:

λ′_h(0) = −k < −ln(1 + k) = λ′_e(0).

Hyperbolic discounting actually appears in people’s behavior.
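A small numerical comparison of the two discount functions (a Python sketch; the rate r = 10% is only an example value), with k calibrated so that they agree at t = 1:

```python
import numpy as np

r = 0.10
k = np.exp(r) - 1                      # so that lambda_e(1) = lambda_h(1)
lam_e = lambda t: (1 + k) ** (-t)      # exponential discounting e^{-rt}
lam_h = lambda t: 1 / (1 + k * t)      # hyperbolic discounting

print(lam_e(1.0), lam_h(1.0))          # equal by construction
print(-k, -np.log(1 + k))              # slopes at 0: lambda_h'(0) = -k < -ln(1+k) = lambda_e'(0)
for t in (0.1, 1.0, 5.0, 20.0):
    print(t, lam_e(t), lam_h(t))       # near t = 0 the hyperbola is smaller; far out it is larger
```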

Come back to a previous example: Annual rate is 10%

Option (A): Get $100 today (December 28, 2015).

Option (B): Get $105 (> 100(1 + r/12)) on January 28, 2016.

Option (A′): Get $110 (= 100 × 1.10) on December 28, 2016.

Option (B′): Get $115.50 (> 110(1 + r/12)) on January 28, 2017.

For a time-consistent person,

(A) ∼ (A′), (B) ∼ (B′),

(B) ≻ (A), (B′) ≻ (A′).

However, for an uncertainty-averse person,

(A) ≻ (B), (B′) ≻ (A′).

Magnifying the example:

Option (A): Get $1M today (December 28, 2015).

Option (B): Get $1.05M (> 1M(1 + r/12)) on January 28, 2016.

Option (A′): Get $1.1M (= 1M × 1.10) on December 28, 2016.

Option (B′): Get $1.155M (> 1.1M(1 + r/12)) on January 28, 2017.

For an uncertainty-averse person,

(A) ≻ (B), (B′) ≻ (A′).

Is the feeling stronger?

* Palacios-Huerta (2003), survey on history

* Strotz (1956), Pollak (1968), Laibson (1997), ...

* Finn E. Kydland and Edward C. Prescott (1977) (2004 Nobel Prize winners; classical optimal control theory not working)

* Ekeland–Lazrak (2008), Yong (2011, 2012)

• Risk-Preferences:

Consider two investments whose returns are R1 and R2, with

P(R1 = 100) = 1/2,  P(R1 = −50) = 1/2,

P(R2 = 150) = 1/3,  P(R2 = −60) = 2/3.

Which one do you prefer?

E[R1] = (1/2)·100 + (1/2)·(−50) = 25,

E[R2] = (1/3)·150 + (2/3)·(−60) = 10.

So R1 seems to be better.
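The two expectations can be reproduced directly (a trivial Python check of the numbers above):

```python
# Expected returns of the two investments.
E_R1 = 0.5 * 100 + 0.5 * (-50)          # = 25
E_R2 = (1 / 3) * 150 + (2 / 3) * (-60)  # = 10
print(E_R1, E_R2)                        # R1 has the higher expected return
```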

* St. Petersburg Paradox: (posed by Nicolas Bernoulli in 1713)

P(X = 2^n) = 1/2^n,  n ≥ 1,

E[X] = Σ_{n=1}^∞ 2^n P(X = 2^n) = Σ_{n=1}^∞ 2^n · (1/2^n) = ∞.

Question: How much are you willing to pay to play the game?

How about $10,000? Or $1,000? Or ???

In 1738, Daniel Bernoulli (a cousin of Nicolas) introduced expected utility: E[u(X)]. With u(x) = √x, one has

E[√X] = Σ_{n=1}^∞ (1/√2)^n = 1 + √2.

* 1944, von Neumann–Morgenstern: Introduced “rationality” axioms: Completeness, Transitivity, Independence, Continuity.

Standard stochastic optimal control theory is based on the expected utility theory.

• Decision-making based on expected utility theory is time-consistent.

• In classical expected utility theory, the probability is objective.

• It is controversial whether a probability should be objective.

• Early relevant works: Ramsey (1926), de Finetti (1937)

Allais Paradox (1953). Ω = {1, 2, · · · , 100}, P(ω) = 1/100, ∀ω ∈ Ω.

X1(ω) = 100 χ_{1≤ω≤100},  X2(ω) = 200 χ_{1≤ω≤70},

X3(ω) = 100 χ_{1≤ω≤15},  X4(ω) = 200 χ_{1≤ω≤10}.

Most people have the following preferences:

X2 ≺ X1, X3 ≺ X4.

If there exists a utility function u : R→ R+ such that

X ≺ Y ⇐⇒ E[u(X )] < E[u(Y )],

then

X2 ≺ X1 ⇒ E[u(X2)] = 0.7u(200) < u(100) = E[u(X1)],

X3 ≺ X4 ⇒ E[u(X3)] = 0.15u(100) < 0.1u(200) = E[u(X4)],

Thus, 1.05u(100) < 0.7u(200) < u(100), a contradiction.
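The contradiction can also be seen by a brute scan over possible utility ratios (a minimal Python sketch, assuming u(0) = 0 and writing ρ = u(200)/u(100) > 0):

```python
import numpy as np

# X2 < X1 requires 0.7*rho < 1; X3 < X4 requires 0.15 < 0.1*rho.
rho = np.linspace(0.01, 10.0, 1000)
both = (0.7 * rho < 1.0) & (0.15 < 0.1 * rho)
print(both.any())    # False: no expected-utility ranking satisfies both preferences
```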

Ellsberg's Paradox (1961). In an urn, there are 90 balls: 30 are Red, and the remaining 60 are Black or White in unknown proportion.

          Red (30)   Black   White   (Black + White = 60)
XR          $100       0       0
XB            0      $100      0
XR∪W        $100       0     $100
XB∪W          0      $100    $100

Most people have the following preferences: (ambiguity-averse)

XB ≺ XR , XR∪W ≺ XB∪W .

P(R) = 1/3,  P(B) ∈ [0, 2/3],  P(B∪W) = 2/3,  P(R∪W) ∈ [1/3, 1].

P(B∪W) = P(B) + P(W),  P(R∪W) = P(R) + P(W).


If there exists a utility function u : R→ R+ such that

X ≺ Y ⇐⇒ E[u(X )] < E[u(Y )],

then

XR∪W ≺ XB∪W ⇐⇒ u(100) P(R∪W) < u(100) P(B∪W)

⇐⇒ u(100) P(R) = u(100)[P(R∪W) − P(W)] < u(100)[P(B∪W) − P(W)] = u(100) P(B)

⇐⇒ XR ≺ XB,

a contradiction.
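The same incompatibility can be scanned numerically (a minimal Python sketch over the unknown P(B) = p, with P(R) = 1/3 and P(W) = 2/3 − p):

```python
import numpy as np

p = np.linspace(0.0, 2.0 / 3.0, 1001)          # candidate values of P(B)
pref_R_over_B = p < 1.0 / 3.0                  # X_B < X_R  <=>  P(B) < P(R)
pref_BW_over_RW = (1.0 / 3.0 + (2.0 / 3.0 - p)) < 2.0 / 3.0   # P(R u W) < P(B u W)
print((pref_R_over_B & pref_BW_over_RW).any()) # False: no single P explains both choices
```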

Relevant Literature:

* Subjective expected utility theory (Savage 1954)

* Mean-variance preference (Markowitz 1952), leading to nonlinear appearance of conditional expectation

* Choquet integral (1953), leading to Choquet expected utility theory

* Prospect Theory (Kahneman–Tversky 1979) (Kahneman won the 2002 Nobel Prize)

* Distorted probability (Wang–Young–Panjer 1997), widely used in insurance/actuarial science

* BSDEs, g-expectation (Peng 1997), leading to time-consistent nonlinear expectation

* BSVIEs (Yong 2006, 2008), leading to time-inconsistent dynamic risk measure

Recent Relevant Literature:

* Bjork–Murgoci (2008), Bjork–Murgoci–Zhou (2013)

* Hu–Jin–Zhou (2012, 2015)

* Yong (2012, 2013, 2014, 2015)

• A Summary:

Time-Preferences: (Exponential/General) Discounting.

Risk-Preferences: (Subjective/Objective) Expected Utility.

Exponential discounting + objective expected utility/disutility

leads to time-consistency.

Otherwise, the problem will be time-inconsistent.

Time-consistent solution:

Instead of finding an optimal solution (which is time-inconsistent),

find an equilibrium strategy (which is time-consistent).

Sacrifice some immediate satisfaction,

save some for the future

(retirement plans, controlling the speed of economic growth, ...)

3. Equilibrium Strategies

A General Formulation: Consider

dX(s) = b(s, X(s), u(s)) ds + σ(s, X(s), u(s)) dW(s), s ∈ [t, T],

X(t) = x,

with

J(t, x; u(·)) = E_t[ ∫_t^T g(t, s, X(s), u(s)) ds + h(t, X(T)) ].

U[t, T] = { u : [t, T] → U | u(·) is F-adapted }.

Problem (N). For given (t, x) ∈ [0, T) × Rn, find ū(·) ∈ U[t, T] such that

J(t, x; ū(·)) = inf_{u(·)∈U[t,T]} J(t, x; u(·)).

This problem is time-inconsistent.

[Diagram: time axis with partition points t_{N−3} < t_{N−2} < t_{N−1} < t_N = T; the states X^N(t_{N−1}), X^{N−1}(t_{N−2}) and the cost functionals J(t_{N−1}, x_{N−1}; u(·)), J(t_{N−2}, x_{N−2}; u(·)) are attached to the corresponding subintervals.]

Idea of Seeking Equilibrium Strategies.

• Partition the interval [0,T ]:

[0, T] = ⋃_{k=1}^N [t_{k−1}, t_k],  Π : 0 = t_0 < t_1 < · · · < t_{N−1} < t_N = T.

• Solve an optimal control problem on [t_{N−1}, t_N], with cost functional:

J_N(u) = E[ h(t_{N−1}, X(T)) + ∫_{t_{N−1}}^{t_N} g(t_{N−1}, s, X(s), u(s)) ds ],

obtaining an optimal pair (X^N(·), u^N(·)), depending on the initial pair (t_{N−1}, x_{N−1}).

• Solve an optimal control problem on [t_{N−2}, t_{N−1}] with a sophisticated cost functional:

J_{N−1}(u) = E[ h(t_{N−2}, X(T)) + ∫_{t_{N−1}}^{t_N} g(t_{N−2}, s, X^N(s), u^N(s)) ds + ∫_{t_{N−2}}^{t_{N−1}} g(t_{N−2}, s, X(s), u(s)) ds ].

• By induction, obtain an approximate equilibrium strategy, depending on Π.

• Let ‖Π‖ → 0 to get a limit (a discrete-time sketch of this recursion follows below).
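As a rough discrete-time illustration of this backward recursion, the sketch below solves a 3-period cake-eating problem with quasi-hyperbolic (β-δ) discounting and log utility (all parameters and the model itself are made up, not from the talk). Each "self" best-responds to the already-computed strategies of later selves, a discrete analogue of the sophisticated cost functionals above; the precommitment plan computed at t = 0 differs from the equilibrium, exhibiting the time-inconsistency.

```python
import numpy as np

beta, delta = 0.6, 0.95                   # hypothetical present bias / discount factor
grid = np.arange(0.01, 1.00, 0.01)        # candidate consumption fractions
u = np.log                                # log utility

# Self at t = 2 consumes whatever cake remains (fraction 1).
# Self at t = 1: maximize u(c1*y) + beta*delta*u((1-c1)*y); with log utility the
# optimal fraction does not depend on the remaining cake y, so set y = 1.
c1_eq = grid[np.argmax(u(grid) + beta * delta * u(1.0 - grid))]

def tail_value(rem, c1):                  # utilities collected at t = 1 and t = 2
    return u(c1 * rem) + delta * u((1.0 - c1) * rem)

# Self at t = 0, anticipating that self-1 will use c1_eq (sophisticated equilibrium).
obj0 = [u(c0) + beta * delta * tail_value(1.0 - c0, c1_eq) for c0 in grid]
c0_eq = grid[np.argmax(obj0)]

# Precommitment plan: self-0 optimizes c0 and c1 jointly from its own viewpoint.
best = max(((u(c0) + beta * delta * tail_value(1.0 - c0, c1), c0, c1)
            for c0 in grid for c1 in grid), key=lambda triple: triple[0])
_, c0_pre, c1_pre = best

print(f"equilibrium:   c0 = {c0_eq:.2f}, c1 = {c1_eq:.2f}")
print(f"precommitment: c0 = {c0_pre:.2f}, c1 = {c1_pre:.2f}")
# c1_pre < c1_eq: the plan that is optimal at t = 0 would be abandoned at t = 1,
# while the equilibrium strategy is followed by construction.
```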

Definition. Ψ : [0, T] × Rn → U is called a time-consistent equilibrium strategy if for any x ∈ Rn,

dX̄(s) = b(s, X̄(s), Ψ(s, X̄(s))) ds + σ(s, X̄(s), Ψ(s, X̄(s))) dW(s), s ∈ [0, T],

X̄(0) = x

admits a unique solution X̄(·), and for some Ψ^Π : [0, T] × Rn → U,

lim_{‖Π‖→0} d(Ψ^Π(t, x), Ψ(t, x)) = 0,

uniformly for (t, x) in any compact set, where Π : 0 = t_0 < t_1 < · · · < t_{N−1} < t_N = T, and

J_k(t_{k−1}, X^Π(t_{k−1}); Ψ^Π(·)|_{[t_{k−1},T]}) ≤ J_k(t_{k−1}, X^Π(t_{k−1}); u^k(·) ⊕ Ψ^Π(·)|_{[t_k,T]}), ∀u^k(·) ∈ U[t_{k−1}, t_k],

J_k(·) — sophisticated cost functional.

dX^Π(s) = b(s, X^Π(s), Ψ^Π(s, X^Π(s))) ds + σ(s, X^Π(s), Ψ^Π(s, X^Π(s))) dW(s), s ∈ [0, T],

X^Π(0) = x,

[u^k(·) ⊕ Ψ^Π(·)|_{[t_k,T]}](s) = u^k(s) for s ∈ [t_{k−1}, t_k), and = Ψ^Π(s, X^k(s)) for s ∈ [t_k, T],

dX^k(s) = b(s, X^k(s), u^k(s)) ds + σ(s, X^k(s), u^k(s)) dW(s), s ∈ [t_{k−1}, t_k),

dX^k(s) = b(s, X^k(s), Ψ^Π(s, X^k(s))) ds + σ(s, X^k(s), Ψ^Π(s, X^k(s))) dW(s), s ∈ [t_k, T],

X^k(t_{k−1}) = X^Π(t_{k−1}).

Equilibrium control:

ū(s) = Ψ(s, X̄(s)), s ∈ [0, T].

Equilibrium state process X̄(·), satisfying:

dX̄(s) = b(s, X̄(s), Ψ(s, X̄(s))) ds + σ(s, X̄(s), Ψ(s, X̄(s))) dW(s), s ∈ [0, T],

X̄(0) = x.

Equilibrium value function:

V(t, X̄(t)) = J(t, X̄(t); ū(·)).

The idea explained previously will help us obtain such a Ψ(· , ·).

Let D[0, T] = {(τ, t) | 0 ≤ τ ≤ t ≤ T}. Define

a(t, x, u) = (1/2) σ(t, x, u) σ(t, x, u)^T, ∀(t, x, u) ∈ [0, T] × Rn × U,

H(τ, t, x, u, p, P) = tr[a(t, x, u) P] + ⟨ b(t, x, u), p ⟩ + g(τ, t, x, u), ∀(τ, t, x, u, p, P) ∈ D[0, T] × Rn × U × Rn × Sn.

Let ψ : D(ψ) ⊆ D[0, T] × Rn × Rn × Sn → U be such that

H(τ, t, x, ψ(τ, t, x, p, P), p, P) = inf_{u∈U} H(τ, t, x, u, p, P) > −∞, ∀(τ, t, x, p, P) ∈ D(ψ).

In the classical case, one just needs

H(t, x, p, P) = inf_{u∈U} H(t, x, u, p, P) > −∞, ∀(t, x, p, P) ∈ [0, T] × Rn × Rn × Sn.

Equilibrium HJB equation:

Θ_t(τ, t, x) + tr[ a(t, x, ψ(t, t, x, Θ_x(t, t, x), Θ_xx(t, t, x))) Θ_xx(τ, t, x) ]

+ ⟨ b(t, x, ψ(t, t, x, Θ_x(t, t, x), Θ_xx(t, t, x))), Θ_x(τ, t, x) ⟩

+ g(τ, t, x, ψ(t, t, x, Θ_x(t, t, x), Θ_xx(t, t, x))) = 0, (τ, t, x) ∈ D[0, T] × Rn,

Θ(τ, T, x) = h(τ, x), (τ, x) ∈ [0, T] × Rn.

Classical HJB Equation:

Θ_t(t, x) + tr[ a(t, x, ψ(t, x, Θ_x(t, x), Θ_xx(t, x))) Θ_xx(t, x) ]

+ ⟨ b(t, x, ψ(t, x, Θ_x(t, x), Θ_xx(t, x))), Θ_x(t, x) ⟩

+ g(t, x, ψ(t, x, Θ_x(t, x), Θ_xx(t, x))) = 0, (t, x) ∈ [0, T] × Rn,

Θ(T, x) = h(x), x ∈ Rn,

or

Θ_t(t, x) + H(t, x, Θ_x(t, x), Θ_xx(t, x)) = 0, (t, x) ∈ [0, T] × Rn,

Θ(T, x) = h(x), x ∈ Rn.

Equilibrium value function:

V (t, x) = Θ(t, t, x), ∀(t, x) ∈ [0,T ]× Rn.

It satisfies

V(t, X̄(t; x)) = J(t, X̄(t; x); Ψ(·)|_{[t,T]}), (t, x) ∈ [0, T] × Rn.

Equilibrium strategy:

Ψ(t, x) = ψ(t, t, x, V_x(t, x), V_xx(t, x)), (t, x) ∈ [0, T] × Rn.

Theorem. Under proper conditions, the equilibrium HJB equation admits a unique classical solution Θ(· , · , ·). Hence, an equilibrium strategy Ψ(· , ·) exists.

4. Open Problems

1. The well-posedness of the equilibrium HJB equation for the case where σ(t, x, u) is not independent of u.

2. The case where ψ is not unique, has discontinuities, etc.

3. The case where σ(t, x, u) is degenerate; viscosity solutions?

4. Random coefficient case (non-degenerate/degenerate cases).

5. The case involving conditional expectation.

6. Infinite horizon problems.

Thank You!
