Financial notes

FINANCE NOTES

Mike Cliff

Current Draft: June 30, 1998

Contents

1 Introduction

2 Asset Pricing
    2.1 Introduction
    2.2 Portfolio Theory
        2.2.1 Single Period Optimization Problem
        2.2.2 Key Results
        2.2.3 Multiperiod Portfolio Choice
    2.3 Equilibrium Asset Pricing Theory
        2.3.1 Utility Functions
        2.3.2 CAPM Theory
        2.3.3 ICAPM Theory
        2.3.4 CCAPM Theory
        2.3.5 The CIR Model
    2.4 Arbitrage Asset Pricing
        2.4.1 State Contingent Claims
        2.4.2 Arbitrage Pricing Theory
    2.5 Pricing Kernel Approach
        2.5.1 Basics
        2.5.2 Different Expectations
        2.5.3 Asset Pricing with m
        2.5.4 The Agent’s Problem
        2.5.5 The Main Results
        2.5.6 Hansen-Jagannathan Bounds
    2.6 Conditioning Information
    2.7 Market Efficiency
    2.8 Empirical Asset Pricing
        2.8.1 Properties of Asset Returns
        2.8.2 General Procedures
        2.8.3 CAPM Tests
        2.8.4 ICAPM/CCAPM Tests
        2.8.5 APT Tests
        2.8.6 Present Value Relations

3 Fixed Income
    3.1 Introduction
    3.2 Term Structure Basics
    3.3 Inflation and Returns
    3.4 Forward Rates
    3.5 Bond Pricing
    3.6 Affine Models
        3.6.1 Vasicek
        3.6.2 The CIR Model
        3.6.3 Duffie-Kan Class
        3.6.4 Other Single Factor Models
        3.6.5 Alternatives
    3.7 Multi-Factor Models
    3.8 Empirical Tests
        3.8.1 Brown & Dybvig (1986)
        3.8.2 Brown & Schaefer (1994)
        3.8.3 Chan, Karolyi, Longstaff & Sanders (1992)
        3.8.4 Gibbons & Ramaswamy (1993)
        3.8.5 Pearson & Sun (1994)
        3.8.6 Longstaff & Schwartz (1992)

4 Derivatives
    4.1 Introduction
    4.2 Binomial Models
        4.2.1 Alternative Derivations
        4.2.2 Trinomial Models
    4.3 Black Scholes Model
        4.3.1 Black Scholes Derivations
        4.3.2 Implied Volatilities
        4.3.3 Hedging
    4.4 Advanced Topics
        4.4.1 American Options
        4.4.2 Exotic Options
        4.4.3 Other Advanced Topics
    4.5 Interest Rate Derivatives
        4.5.1 Stochastic Interest Rate Models
        4.5.2 Stochastic Term Structure Models

5 Corporate Finance
    5.1 Introduction
    5.2 Information Asymmetry/Signaling
    5.3 Agency Theory
    5.4 Capital Structure
    5.5 Dividends
        5.5.1 Factors Influencing Dividend Policy
        5.5.2 Key Dividends Papers
    5.6 Corporate Control
    5.7 Mergers and Acquisitions
        5.7.1 Tender Offers
        5.7.2 Competition Among Bidders
        5.7.3 Managerial Power
        5.7.4 Key Papers
    5.8 Financial Distress
        5.8.1 Factors Affecting Reorganizations
        5.8.2 Private Resolution
        5.8.3 Formal Resolution
        5.8.4 Key Papers
    5.9 Equity Issuance
        5.9.1 Flotation Methods
        5.9.2 Direct Flotation Costs
        5.9.3 Indirect Flotation Costs
        5.9.4 Valuation Effects
        5.9.5 SEO Timing
        5.9.6 Key Papers
    5.10 Initial Public Offerings
        5.10.1 IPO Anomalies
        5.10.2 Key Papers
    5.11 Executive Compensation
    5.12 Risk Management
    5.13 Internal/External Markets and Banking
    5.14 Convertible Debt
    5.15 Imperfections and Demand
    5.16 Financial Innovation

6 Market Microstructure
    6.1 Introduction
    6.2 The Value of Information
    6.3 Single Period REE
    6.4 Batch Models
        6.4.1 Strategic Uninformed Traders
    6.5 Sequential Trade Models
        6.5.1 Specialists and Dealers
        6.5.2 Other Topics
    6.6 Special Topics
        6.6.1 Bubbles
        6.6.2 Speculation
        6.6.3 Noise
        6.6.4 Cascades

7 International Finance
    7.1 Introduction
    7.2 Spot Currency Pricing
    7.3 Forward Currency Pricing
    7.4 Integration
    7.5 International Asset Pricing
    7.6 Other Topics

8 Appendix: Math Results
    8.1 Basics
        8.1.1 Norms
        8.1.2 Moments
        8.1.3 Distributions
        8.1.4 Convergence
        8.1.5 Some Famous Inequalities
        8.1.6 Stein’s Lemma
        8.1.7 Bayes Law
        8.1.8 Law of Iterated Expectations
        8.1.9 Stochastic Dominance
    8.2 Econometrics
        8.2.1 Projection Theorem
        8.2.2 Cramer-Rao Bound and the Var-Cov Matrix
        8.2.3 Testing: Wald, LM, LR
    8.3 Continuous-Time Math
        8.3.1 Stochastic Processes
        8.3.2 Martingales
        8.3.3 Ito’s Lemma
        8.3.4 Cameron-Martin-Girsanov Theorem
        8.3.5 Special Processes
        8.3.6 Special Lemma

References

Chapter 1

Introduction

These notes are an effort to integrate the body of knowledge encountered in the Finance PhD classes at the University of North Carolina. The original draft of this document was developed to prepare for the area comprehensive exams. As such, the presentation is in a condensed format. The theoretical models are derived in a skeleton form, with the focus on the setup and key steps rather than line-by-line explanations. Similarly, empirical work is summarized in terms of purpose, methodology (where important), findings, and fit with the literature. Throughout the manuscript special attention is given to tying together the ideas in different models and areas. Providing this structure should make the material easier to remember and more meaningful to interpret.

The organization of the paper is as follows. Chapter 2 covers asset pricing, both theoretical and empirical. A separate chapter on fixed income securities follows. Chapter 4 covers derivative securities, from both a binomial perspective and a continuous time framework. Chapter 5 covers the main topics in corporate finance, again both theoretically and empirically. Next is a chapter on market microstructure and information economics, which is largely theoretical. A chapter on international finance concludes the main body of the document. Last is a chapter covering important mathematical, statistical, and econometric issues.

An effort is made to preserve notational consistency, but inevitably there will be deviations. Bold is used for vectors x and matrices X. Time subscripts are dropped unless needed for clarity. Generally t is the current time, T is the end, and τ is the time between two dates. Random variables are given a tilde only as needed. Expectations are with respect to the true probabilities P unless otherwise denoted. The risk-neutral measure is represented by Q. R denotes a gross return, whereas r is a net return. When working with the pricing kernel models it is useful to have a notation for element-wise operations; separate symbols are used for element-by-element multiplication and for element-by-element division.

This document draws heavily from a variety of sources, including Bhattacharya and Constantinides (1989), Cochrane (1998), Campbell, Lo, and MacKinlay (1997), Huang and Litzenberger (1988), Ingersoll (1987), Jarrow, Maksimovic, and Ziemba (1995), as well as lecture notes from Dong-Hyun Ahn, Jennifer Conrad, and Dick Rendleman.

June 30, 1998
Mike Cliff

Chapter 2

Asset Pricing

2.1 Introduction

There are three primary approaches to pricing assets. The equilibrium approach begins with agents’ preferences (e.g., over expected returns or consumption). Agents maximize expected utility subject to budget constraints and market clearing conditions. Equilibrium models price all assets simultaneously, and in equilibrium there is no arbitrage. The arbitrage approach takes a different point of view. It takes as given the prices of basis assets, which can be combined to generate other payoffs. The absence of arbitrage implies unique prices for these synthetic assets when markets are (locally) complete. If markets are incomplete, there may be a range of admissible prices. Unfortunately, it is generally not possible to recover a supporting equilibrium from the arbitrage approach. Somewhat paradoxically, the arbitrage approach may in fact admit arbitrage opportunities in the sense that selecting different basis assets may give different prices. The final approach focuses on the pricing kernel. This approach shares many of the features of the first two approaches and provides a unifying framework. Under this paradigm, all assets can be priced by the relation p = E[mx], where x is the asset’s payoff. Asset pricing models differ in the specification of the pricing kernel m.
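To make the relation p = E[mx] concrete, the following is a minimal sketch in a three-state economy. The probabilities, kernel values, and payoffs are made-up numbers chosen only for illustration, and the helper name `price` is hypothetical; nothing here comes from a particular model in these notes.

```python
import numpy as np

# Hypothetical three-state economy (all numbers are illustrative assumptions).
prob = np.array([0.3, 0.4, 0.3])      # true (P-measure) state probabilities
m = np.array([1.10, 0.95, 0.80])      # pricing kernel: high in the bad state
x_bond = np.array([1.0, 1.0, 1.0])    # riskless payoff of 1 in every state
x_stock = np.array([0.8, 1.0, 1.3])   # risky payoff, low in the bad state


def price(payoff):
    """Price a payoff with p = E[m x] under the true probabilities."""
    return float(np.sum(prob * m * payoff))


p_bond = price(x_bond)                 # equals E[m]
p_stock = price(x_stock)
gross_riskfree = 1.0 / p_bond          # R_f = 1 / E[m]
expected_stock_return = float(prob @ x_stock) / p_stock
```

Because the risky payoff is low exactly when m is high, its price is low relative to its expected payoff, so its expected gross return exceeds the riskless gross return in this example.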

One question that arises immediately in asset pricing is the decision to work in discrete or continuous time. The discrete time models were developed first, and they have the benefit of a more intuitive feel. Continuous time models have a number of advantages. With a single state variable, returns are perfectly instantaneously correlated, which simplifies the analysis. More generally, moments higher than the second vanish in continuous time.

Conditional asset pricing models have become popular in response to the failure of unconditional models. A conditional model can capture time-varying expected returns and/or risk premiums.

This chapter develops each theoretical approach, discussing the underlying assumptions and the resulting implications. Derivations are provided for each, and an effort is made to show the connections among the models. The chapter concludes with a summary of the major empirical results and methodologies. We begin the chapter by reviewing portfolio theory.

2.2 Portfolio Theory

Portfolio theory is concerned with the investor’s decision to consume or save and the portfolio selection decision. The theory develops many of the results that appear in the CAPM framework. These results follow from mean-variance mathematics, not from any economic model. Early works were due to Markowitz (1959), who moved the thinking from maximizing E[R] to consideration of both mean and variance. The principle of diversification comes from this work.

Under certain conditions, we can consider only the mean and variance of asset returns. One sufficient condition is quadratic utility (see Section 2.3.1). The other sufficient condition is multivariate normality of asset returns. Although neither of these assumptions is likely to hold, the resulting analysis provides an intuitively appealing framework.

2.2.1 Single Period Optimization Problem

In terms of notation, consider a vector of asset weights w, returns r, expected returns µ, and a variance-covariance matrix Σ. Investors minimize variance, subject to achieving a particular return and the portfolio weights summing to one.

\[
L = \tfrac{1}{2} w'\Sigma w + \lambda(\mu_p - w'\mu) + \gamma(1 - w'\iota) \tag{2.1}
\]

with FOCs

\[
\Sigma w = \lambda\mu + \gamma\iota, \qquad \mu_p = w'\mu, \qquad 1 = w'\iota.
\]


Solve for w to get

\[
w = \lambda\Sigma^{-1}\mu + \gamma\Sigma^{-1}\iota. \tag{2.2}
\]

Frontier portfolios are linear combinations of two portfolios. Premultiply by ι′ and µ′, then define $A = \mu'\Sigma^{-1}\iota$, $B = \mu'\Sigma^{-1}\mu$, $C = \iota'\Sigma^{-1}\iota$, $D = BC - A^2$. Combine the expression for w and the FOCs to get $\lambda = (C\mu_p - A)/D$ and $\gamma = (B - A\mu_p)/D$. This gives

\[
w_p = g + h\mu_p \tag{2.3}
\]

where $g = [B\Sigma^{-1}\iota - A\Sigma^{-1}\mu]/D$ and $h = [C\Sigma^{-1}\mu - A\Sigma^{-1}\iota]/D$. Note that $\iota'g = 1$, $\iota'h = 0$, $\mu'g = 0$, $\mu'h = 1$.
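As a numerical sketch of (2.3), the snippet below builds g and h for three assets and forms the frontier portfolio for a target mean. The expected returns, covariance matrix, and target of 10% are assumed values for illustration only, and the function name `frontier_weights` is hypothetical.

```python
import numpy as np

# Illustrative inputs (assumptions, not data from the notes).
mu = np.array([0.08, 0.12, 0.15])
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
iota = np.ones(3)

# Solve linear systems rather than forming the inverse explicitly.
Sinv_mu = np.linalg.solve(Sigma, mu)
Sinv_iota = np.linalg.solve(Sigma, iota)

A = mu @ Sinv_iota
B = mu @ Sinv_mu
C = iota @ Sinv_iota
D = B * C - A**2

g = (B * Sinv_iota - A * Sinv_mu) / D
h = (C * Sinv_mu - A * Sinv_iota) / D


def frontier_weights(mu_p):
    """Minimum-variance weights for target mean mu_p, w_p = g + h * mu_p."""
    return g + h * mu_p


w = frontier_weights(0.10)
```

The weights in `w` sum to one and deliver the target mean, which is just the pair of constraints in (2.1) restated numerically.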

2.2.2 Key Results

From here, we can establish a number of results [see Markowitz (1959) and Roll (1977)].

• The efficient frontier is a hyperbola in µ-σ space.
• The global minimum variance portfolio o is the point $(\sqrt{1/C},\, A/C)$.
• o is positively correlated with all other minimum variance portfolios and its covariance with these portfolios is its variance, 1/C.
• A frontier portfolio p is efficient if $\mu_p \ge A/C$.
• For all frontier portfolios except the minimum variance portfolio, there exists a unique orthogonal frontier portfolio z with $w_z'\Sigma w_p = 0$.
• All portfolios on the efficient frontier are positively correlated. More generally, $\rho_{p,j} = SR_j/SR_p$ where p is on the efficient frontier. (??)
• $\mu_g = 0$, $\mu_{g+h} = 1$.
• The portfolios g and g + h span the entire frontier.
• Any n ≥ 2 frontier portfolios can span the entire frontier.
• If $w_i$ is efficient then $w_q = A'w_i$ is efficient (A diagonal, trace(A) = 1, and $A_{ii} \ge 0 \;\forall\, i$).
• The covariance between the returns of a frontier portfolio p and any other portfolio n (not necessarily on the frontier) is $\lambda\mu_n + \gamma$.
• $\mu_n = \mu_z + \beta(\mu_p - \mu_z)$, where $\sigma_{p,z} = 0$.
• Geometry of tangency lines.


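A beta representation is easy to derive from the FOCs.

\[
\begin{aligned}
w_p'\Sigma w_p &= \sigma_p^2 = \lambda\mu_p + \gamma \\
w_i'\Sigma w_p &= \sigma_{ip} = \lambda\mu_i + \gamma \\
w_z'\Sigma w_p &= \sigma_{zp} = \lambda\mu_z + \gamma = 0
\end{aligned}
\]

where z is the portfolio orthogonal to p (or $r_f$ in the SL model) and i is any portfolio. Since the third equation equals zero, subtract it from the first two equations.

\[
\begin{aligned}
\sigma_p^2 &= \lambda\mu_p + \gamma - (\lambda\mu_z + \gamma) = \lambda(\mu_p - \mu_z) \\
\sigma_{ip} &= \lambda\mu_i + \gamma - (\lambda\mu_z + \gamma) = \lambda(\mu_i - \mu_z)
\end{aligned}
\]

Solving for λ and rearranging gives the desired result

\[
\mu_i = \mu_z + \beta_{ip}(\mu_p - \mu_z). \tag{2.4}
\]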
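The beta representation can be checked numerically. The sketch below picks an arbitrary frontier portfolio p, finds its orthogonal portfolio z from the condition λµ_z + γ = 0, and confirms that (2.4) reproduces every asset’s expected return. The inputs and the target mean of 12% are assumed values for illustration, not data from the notes.

```python
import numpy as np

# Illustrative inputs (assumptions).
mu = np.array([0.08, 0.12, 0.15])
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
iota = np.ones(3)

Sinv_mu = np.linalg.solve(Sigma, mu)
Sinv_iota = np.linalg.solve(Sigma, iota)
A, B, C = mu @ Sinv_iota, mu @ Sinv_mu, iota @ Sinv_iota
D = B * C - A**2


def frontier(mu_target):
    """Return (weights, lambda, gamma) of the frontier portfolio with mean mu_target."""
    lam = (C * mu_target - A) / D
    gam = (B - A * mu_target) / D
    return lam * Sinv_mu + gam * Sinv_iota, lam, gam


mu_p = 0.12                          # any frontier mean other than A / C
w_p, lam, gam = frontier(mu_p)

mu_z = -gam / lam                    # from lambda * mu_z + gamma = 0
w_z, _, _ = frontier(mu_z)
cov_zp = w_z @ Sigma @ w_p           # zero by construction

var_p = w_p @ Sigma @ w_p
beta = Sigma @ w_p / var_p           # beta of each asset relative to p
implied_mu = mu_z + beta * (mu_p - mu_z)
```

Here `implied_mu` matches `mu` for every asset even though p was chosen arbitrarily, which illustrates that (2.4) is a mathematical property of frontier portfolios rather than an economic statement.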

Note that the beta is measured relative to portfolio p, which is currently unspecified. The important point of Roll’s critique is that this representation is a mathematical result from the setup of the minimization problem. It does not have any economic content unless we specify p as a particular portfolio. His critique also says that the CAPM is not testable because the market portfolio includes all assets, which we cannot measure.

2.2.3 Multiperiod Portfolio Choice

In moving to a multiperiod setting the agent now considers future expectedconsumption. Time subscripts are for indexing only. Other subscripts denotepartial derivatives.

\[
\max_{C,w} \sum_{t=1}^{T} E_t[U(C_t)] + E_t[B(W_T)]
\]

Define the indirect utility function as

\[
J(W_t) \equiv \max_{C,w} \sum_{s=t}^{T} E_t[U(C_s)] + E_t[B(W_T)]
\]

with $J(W_T) = B(W_T)$. At T − 1, indirect utility is

\[
J(W_{T-1}) = \max\, U(C_{T-1}) + E_{T-1}[J(W_T)]
\]

where $W_T = [W_{T-1} - C_{T-1}]\,[\sum_i w_i(R_i - R_f) + R_f]$. The first order conditions are

\[
U_C - E_{T-1}[B_W R^*] = 0 \quad\text{and}\quad E_{T-1}[B_W(R_i - R_f)] = 0.
\]

This generalizes to

\[
J(W_\tau) = \max\, U(C_\tau) + E_\tau[J(W_{\tau+1})]
\]

with FOCs

\[
U_C = E_\tau[J_W R^*] \quad\text{and}\quad E_\tau[J_W(R_i - R_f)] = 0.
\]

With log utility, optimal consumption depends only on current wealth and not on the investment opportunity set. Consumption is a specific proportion of wealth and investors choose portfolios as in a single period setting by equating the marginal utilities across assets. With power utility, optimal consumption does depend on the investment opportunity set, although investment decisions are independent of consumption. With more general HARA utility, both consumption and portfolio choice depend on wealth.

2.3 Equilibrium Asset Pricing Theory

The equilibrium approach begins with agents’ preferences and maximizes expected utility subject to budget constraints and market clearing conditions. This approach has the advantage of internal consistency (no arbitrage opportunities) and providing comparative statics. Models may be general equilibrium or partial (e.g., take the riskless rate as given). A disadvantage of these models is that they require taking a stand on preferences, and this often involves a tradeoff between reality and tractability. The standard CAPM is set in a single-period discrete world, whereas the ICAPM and CCAPM are multi-period models in continuous time.


2.3.1 Utility Functions

Utility functions are the foundation of equilibrium asset pricing models. Specifying a utility function determines the features of the agents’ preferences, which in turn affect how assets are priced in the economy. Here we will discuss several important classes of utility functions in a nested framework. Many of the commonly used functions are special cases of more general specifications. This section also briefly discusses aggregation, representative agents, and the implications for asset pricing.

One desirable feature is time-separability of a utility function. This means that an agent’s consumption today does not affect his consumption preferences in the future (no hangovers):

\[
u(c_0, \ldots, c_T) = E_t\left[\sum_{\tau=0}^{T} \beta^{\tau} u_{\tau}(c_{\tau})\right]
\]

This is a strong assumption, but it greatly simplifies much of the analysis. Durability is one source of nonseparability. Models that relax the separability assumption include habit persistence, “keeping up with the Joneses,” and the Epstein-Zin class of recursive preferences models.

Risk-averse agents have utility functions that are concave in wealth (or consumption). In this case, u[E(c)] ≥ E[u(c)] (by Jensen’s Inequality). It is expected utility we care about. Concave utility functions mean the agent is better off with a certain outcome than with a risky outcome that has the same expected value.
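A small numerical sketch of the Jensen’s Inequality statement follows. It assumes log utility and a made-up 50/50 gamble over consumption of 50 or 150; the certainty equivalent and the implied risk premium are computed only to illustrate the inequality.

```python
import numpy as np

# Hypothetical gamble: consumption of 50 or 150 with equal probability.
outcomes = np.array([50.0, 150.0])
prob = np.array([0.5, 0.5])

u = np.log        # a concave utility function, chosen for illustration
u_inv = np.exp    # its inverse, used to recover the certainty equivalent

expected_c = float(prob @ outcomes)                    # E(c)
utility_of_mean = float(u(expected_c))                 # u[E(c)]
expected_utility = float(prob @ u(outcomes))           # E[u(c)]
certainty_equivalent = float(u_inv(expected_utility))  # sure amount with the same utility
risk_premium = expected_c - certainty_equivalent       # amount given up to avoid the risk
```

With these numbers the certainty equivalent is the geometric mean of the two outcomes, which lies below the expected consumption of 100, so the risk premium is positive.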

There are several measures of risk aversion. $R_A = -u''/u'$ is the Arrow-Pratt measure of absolute risk aversion, which applies to small risks. The larger is $R_A$, the larger the risk premium required to induce the agent to invest in risky assets. CARA means the agent keeps a fixed dollar amount invested in the risky asset as wealth changes. Models based on CARA preferences do not have income effects. IARA implies the risky asset is an inferior good: those with more wealth take less risk. This doesn’t make sense if one thinks about a subsistence level of wealth.

To measure relative risk aversion, we use $R_R(C) = R_A(C)\,C$. This measure describes proportional changes in the risky asset investment for changes in wealth. The wealth elasticity of demand is unity for CRRA utility functions and greater than one for DRRA functions. With CRRA, agents invest a constant proportion of their wealth in the risky assets, whereas with DRRA the fraction of wealth invested in the risky asset increases with initial wealth.


Table 2.1: Common Utility Functions: HARA

\[
u(W) = \frac{1-\gamma}{\gamma}\left(\frac{\alpha W}{1-\gamma} + b\right)^{\gamma}
\]

Case             u(W)          γ       b     Features
Risk-neutral                   1
Quadratic                      2             IARA, M-V
Negative Exp.    −e^{−αW}      −∞      1     CARA = α
Power            W^γ/γ         < 1     0     CRRA = 1 − γ
Log              log(W)        0       0     CRRA = 1

The HARA (hyperbolic absolute risk aversion) family nests many commonly used classes of utility functions. Table 2.1 summarizes the features of common utility functions.
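The risk-aversion entries in Table 2.1 can be reproduced from the HARA form. Differentiating u(W) gives $u' = \alpha(\alpha W/(1-\gamma) + b)^{\gamma-1}$ and $u'' = -\alpha^2(\alpha W/(1-\gamma) + b)^{\gamma-2}$, so $R_A = \alpha/(\alpha W/(1-\gamma) + b)$; this closed form is derived here from the table’s functional form rather than quoted from the notes. The sketch below evaluates it for three cases, with parameter values and function names that are illustrative assumptions.

```python
def hara_absolute_risk_aversion(W, alpha, gamma, b):
    """Arrow-Pratt -u''/u' for u(W) = ((1-gamma)/gamma) * (alpha*W/(1-gamma) + b)**gamma."""
    return alpha / (alpha * W / (1.0 - gamma) + b)


def hara_relative_risk_aversion(W, alpha, gamma, b):
    """Relative risk aversion, R_R = W * R_A."""
    return W * hara_absolute_risk_aversion(W, alpha, gamma, b)


wealth_levels = (1.0, 5.0, 20.0)

# Power utility (b = 0): relative risk aversion is constant at 1 - gamma.
rra_power = [hara_relative_risk_aversion(W, 1.0, 0.5, 0.0) for W in wealth_levels]

# Negative exponential limit (gamma very negative, b = 1): R_A tends to alpha.
ara_cara = [hara_absolute_risk_aversion(W, 3.0, -1e9, 1.0) for W in wealth_levels]

# Quadratic case (gamma = 2): R_A rises with wealth (IARA).
# b = 30 is an assumed value that keeps marginal utility positive on this range.
ara_quadratic = [hara_absolute_risk_aversion(W, 1.0, 2.0, 30.0) for W in wealth_levels]
```

The negative exponential case is approximated with a large negative γ rather than taken as an exact limit, so its values agree with α only up to a small numerical error.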

With a riskless asset, quadratic or HARA utility implies two-fund separation. If there is not a riskless asset, quadratic or CRRA utility provides this result. With the exception of quadratic utility (which has its own undesirable properties), restrictions on utility functions alone do not imply mean-variance preferences, and therefore do not imply the CAPM.

Equilibrium models rely on the ability to aggregate over individuals in the economy. A complete or effectively complete market guarantees the existence of a representative agent. The representative agent’s utility function is completely determined by individual agents’ preferences and wealths and is independent of available assets only when all investors have HARA utility. The risk aversion of the representative agent is the harmonic mean of individual risk aversions, and will be less than or equal to the wealth-weighted average. It is easier to establish the existence of a representative agent than it is to aggregate demands. In many cases, however, we are interested in the less difficult task of aggregating demand only at the equilibrium price.

2.3.2 CAPM Theory

Assumptions

• homogeneous expectations (distinguishes from portfolio theory)
• quadratic utility or multivariate normality of returns
• rational, risk-averse investors
• perfect capital markets
• unrestricted short selling (Black)
• borrow and lend at riskless rate (SL)

Derivation of Sharpe-Lintner Model

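\[
L = \tfrac{1}{2} w'\Sigma w + \lambda[\mu_p - \mu_f - w'(\mu - \mu_f\iota)]
\]

FOCs:

\[
\Sigma w = \lambda(\mu - \mu_f\iota), \qquad \mu_p - \mu_f = w'(\mu - \mu_f\iota)
\]

Solving for λ,

\[
\lambda = w'\Sigma w\,[w'(\mu - \mu_f\iota)]^{-1}
\]

so

\[
\mu - \mu_f\iota = \Sigma w/\lambda = \frac{\Sigma w}{w'\Sigma w}(\mu_p - \mu_f) = \beta(\mu_p - \mu_f)
\]

Investors will only hold a combination of the riskfree asset and a tangency portfolio. With homogeneous expectations the portfolio p must be the value-weighted market portfolio M.

\[
\mu - \mu_f\iota = \beta(\mu_M - \mu_f)
\]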
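The Sharpe-Lintner first order condition can be illustrated numerically. The sketch below computes a tangency portfolio proportional to $\Sigma^{-1}(\mu - \mu_f\iota)$ and verifies that betas measured against it price every asset. The expected returns, covariance matrix, and riskless rate of 3% are assumed values, and rescaling the weights to sum to one is a normalization chosen here for illustration.

```python
import numpy as np

# Illustrative inputs (assumptions, not data from the notes).
mu = np.array([0.08, 0.12, 0.15])
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
rf = 0.03
iota = np.ones(3)

excess = mu - rf * iota

# FOC: Sigma w = lambda * (mu - rf * iota), so w is proportional to Sigma^{-1} excess.
raw = np.linalg.solve(Sigma, excess)
w_tan = raw / raw.sum()              # tangency weights, rescaled to sum to one

mu_p = w_tan @ mu
var_p = w_tan @ Sigma @ w_tan
beta = Sigma @ w_tan / var_p         # beta of each asset relative to the tangency portfolio
implied_excess = beta * (mu_p - rf)  # right-hand side of the pricing relation
```

In this sketch `implied_excess` equals `excess` asset by asset. Identifying the tangency portfolio with the market portfolio M is the economic step that the homogeneous expectations assumption supplies.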

Derivation of Black Model

Black’s (1972) CAPM adds one assumption to give the portfolio math results economic content. With investor homogeneity, all investors will hold efficient portfolios. Since the value weighted market portfolio is a linear combination of these efficient portfolios, it too is efficient. We can then rewrite (2.4) as

\[
\mu_i = \mu_z + \beta_i(\mu_M - \mu_z).
\]


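Alternatively, we can maximize expected return for a given portfolio variance.

\[
L = w'\mu + \mu_z(1 - \iota'w) + \lambda(\sigma^2 - w'\Sigma w)
\]

gives FOCs:

\[
\sigma^2 = w'\Sigma w, \qquad 1 = \iota'w, \qquad \mu = \mu_z\iota + 2\lambda\Sigma w.
\]

So

\[
w'\mu = \mu_z + 2\lambda\sigma^2
\]

For the market portfolio $2\lambda = (\mu_M - \mu_z)/\sigma_M^2$. For a generic asset,

\[
\mu_i = \mu_z + (\mu_M - \mu_z)\,\sigma_{iM}/\sigma_M^2 = \mu_z + \beta_i(\mu_M - \mu_z).
\]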

Interpretation

The assets that covary negatively with the market tend to pay off when the market is doing poorly. These assets are valuable to investors in smoothing their wealth. Since they are valuable, investors will pay a high price and accept a low return. Thus, assets with low or negative betas will have low (or possibly negative) expected returns. Higher risk aversion increases the risk-return tradeoff. This is measured by the Sharpe ratio $(E[r_M] - r_f)/\sigma_M$, the slope of the CML.

2.3.3 ICAPM Theory

The intertemporal capital asset pricing model and consumption capital asset pricing model extend the standard CAPM intuition to a multi-period setting. The ICAPM replaces dependence on quadratic utility/normal returns with the assumption of a GBM process, which implies normally distributed returns. In the continuous time setting, higher moments do not matter, improving tractability of the model. An advantage over the CAPM is that utility can be state-dependent, although the time-separability assumption remains. With constant risk tolerance utility functions and constant investment opportunities, optimal portfolio choices are also constant. When the investment opportunity set changes, so will portfolio allocations.


Merton’s (1973) ICAPM begins with the specification of asset price paths. Demands are determined by investors maximizing current and expected future utility, subject to their budget constraints. Preferences are instantaneously state-independent and depend only on immediate consumption. The indirect utility function, which is the maximized utility of future wealth, is state-dependent. A collection of state variables serves as sufficient statistics for summarizing the investment opportunities. Investors hedge against adverse changes in the investment opportunity set, with the end goal being a hedge against changes in consumption.

Assumptions

• limited liability
• perfect markets
• no restrictions on trading volume/short selling
• always in equilibrium
• borrow/lend at same rate
• continuous-time trading
• state variable has continuous sample path
• first 2 return moments exist, higher moments unimportant
• returns have a compact distribution
• time-separable preferences
• $r_i = \alpha_i\,dt + \sigma_i\,dz_i$

Under certain conditions, we have two-fund separation and the CAPM:

1. log utility (this means $J_{Wx} = 0$; investors do not want to hedge)
2. $\sigma_{ix} = 0 \;\forall\, i$ (no hedge is possible)

The following derivation is for a single state variable x. The more general case of a vector of state variables is similar.

Underlying Processes

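\[
\begin{aligned}
dW &= -C\,dt + [W - C\,dt]\,w'r, & dx &= \mu\,dt + s\,\varepsilon_x\sqrt{dt}, \\
E_t[dW] &= [W w'\alpha - C]\,dt, & E(dx) &= \mu\,dt, \\
\operatorname{var}(dW) &= W^2 w'\Sigma w\,dt, & \operatorname{var}(dx) &= s^2\,dt, \\
\operatorname{cov}(dx, r) &= \rho_{ix}\,s\,\sigma_i\,dt = \sigma_{ix}\,dt.
\end{aligned}
\]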


Optimization Problem

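\[
J(W, x, t) = \max E_t\left[\int_t^{t+dt} U(C, s)\,ds + J(W + dW, x + dx, t + dt)\right]
\]

\[
\begin{aligned}
J(W + dW, x + dx, t + dt) ={}& J(W, x, t) + J_t\,dt + J_W\,dW + J_x\,dx \\
&+ \tfrac{1}{2}J_{WW}(dW)^2 + \tfrac{1}{2}J_{xx}(dx)^2 + \tfrac{1}{2}J_{tt}(dt)^2 \\
&+ J_{Wx}\,dW\,dx + J_{tx}\,dt\,dx + J_{tW}\,dt\,dW + \phi
\end{aligned}
\tag{2.5}
\]

where φ contains higher-order terms.

\[
\begin{aligned}
E[J(\cdot, \cdot, \cdot)] ={}& J + J_t\,dt + J_W E[dW] + J_x E[dx] \\
&+ \tfrac{1}{2}J_{WW}\operatorname{var}(dW) + \tfrac{1}{2}J_{xx}\operatorname{var}(dx) + J_{Wx}\operatorname{cov}(dW, dx)
\end{aligned}
\tag{2.6}
\]

\[
\begin{aligned}
0 = \max_{C,w}\Big[ & U(C, t) + J_t + J_W(-C + W w'\alpha) + \frac{W^2}{2}J_{WW}\,w'\Sigma w \\
& + J_x\mu + \tfrac{1}{2}J_{xx}s^2 + J_{Wx}W w'\sigma_{ix}\Big]\,dt
\end{aligned}
\tag{2.7}
\]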

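FOCs: (with portfolio constraint $\sum_{i=0}^{N} w_i\alpha_i = r_f + \sum_{i=1}^{N} w_i(\alpha_i - r_f)$)

\[
U_C = J_W \quad \text{(envelope condition)}
\]

\[
W J_W(\alpha - r_f\iota) + W^2 J_{WW}\Sigma w + W J_{Wx}\sigma_{ix} = 0
\]

Now solve for optimal portfolio weights

\[
w^* = \left(\frac{-J_W}{W J_{WW}}\right)\Sigma^{-1}(\alpha - r_f\iota) + \left(\frac{-J_{Wx}}{W J_{WW}}\right)\Sigma^{-1}\sigma_{ix} \tag{2.8}
\]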

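Define

\[
D \equiv \left(\frac{-J_W}{W J_{WW}}\right)\iota'\Sigma^{-1}(\alpha - r_f\iota), \qquad
H \equiv \left(\frac{-J_{Wx}}{W J_{WW}}\right)\iota'\Sigma^{-1}\sigma_{ix}
\]

\[
t = \frac{\Sigma^{-1}(\alpha - r_f\iota)}{\iota'\Sigma^{-1}(\alpha - r_f\iota)}, \qquad
h = \frac{\Sigma^{-1}\sigma_{ix}}{\iota'\Sigma^{-1}\sigma_{ix}}
\]

Therefore $w^* = Dt + Hh$. Further, $\iota't = \iota'h = 1$ so t and h are portfolios. This gives three-fund separation, with the third fund being the riskless asset. h is the “hedge portfolio,” and has the highest correlation with the state variable x. This setup generalizes with a vector of state variables, in which case we have dim(x) + 2-fund separation.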
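The three-fund decomposition can be sketched numerically as follows. The covariance matrix, drifts, covariances with the state variable, wealth level, and especially the derivatives J_W, J_WW, and J_Wx of the indirect utility function are hypothetical values chosen for illustration; in the model they would come from solving the investor’s problem.

```python
import numpy as np

# Illustrative inputs (assumptions, not data from the notes).
alpha = np.array([0.08, 0.12, 0.15])         # expected returns
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
sigma_x = np.array([0.010, -0.005, 0.020])   # cov of each asset with the state variable x
rf = 0.03
iota = np.ones(3)

# Hypothetical wealth and indirect-utility derivatives (J_WW < 0 for risk aversion).
W = 100.0
J_W, J_WW, J_Wx = 1.0, -0.03, -0.002

Sinv_excess = np.linalg.solve(Sigma, alpha - rf * iota)
Sinv_sigx = np.linalg.solve(Sigma, sigma_x)

t = Sinv_excess / (iota @ Sinv_excess)       # tangency portfolio
h = Sinv_sigx / (iota @ Sinv_sigx)           # hedge portfolio
D = (-J_W / (W * J_WW)) * (iota @ Sinv_excess)
H = (-J_Wx / (W * J_WW)) * (iota @ Sinv_sigx)

w_star = D * t + H * h                       # risky-asset weights
riskless_weight = 1.0 - w_star.sum()         # third fund: the riskless asset

# Direct evaluation of (2.8), for comparison with the fund decomposition.
w_direct = (-J_W / (W * J_WW)) * Sinv_excess + (-J_Wx / (W * J_WW)) * Sinv_sigx
```

Both t and h sum to one, and the decomposition reproduces the direct evaluation of (2.8). With log utility J_Wx would be zero, so H would vanish and only the tangency and riskless funds would remain.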

Equilibrium conditions:

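Define $a_k = -J_W/J_{WW}$ and $b_k = -J_{Wx}/J_{WW}$ where k indexes the investor. Rewrite the second FOC as:

\[
J_W(\alpha - r_f\iota) + J_{WW}W\Sigma w^* + J_{Wx}\sigma_{ix} = 0
\]

\[
a_k(\alpha - r_f\iota) = W_k\Sigma w_k - b_k\sigma_{ix}
\]

Sum over all investors and divide by $\sum_k a_k$:

\[
(\alpha - r_f\iota) = A\Sigma\mu - B\sigma_{ix} \quad\text{or}\quad (\alpha_i - r_f) = A\sigma_{im} - B\sigma_{ix}
\]

where $A = \sum_k W_k/\sum_k a_k$, $B = \sum_k b_k/\sum_k a_k$, and $\mu = \sum_k w_k W_k/\sum_k W_k$ (average investment in each asset across investors). Now multiply by µ′ and h′ to get

\[
\alpha_m - r_f = A\sigma_m^2 - B\sigma_{mx}, \qquad \alpha_h - r_f = A\sigma_{hm} - B\sigma_{hx}.
\]

Solving for A and B and substituting,

\[
\begin{aligned}
\alpha_i - r_f &= \frac{\sigma_{im}\sigma_{hx} - \sigma_{ix}\sigma_{mh}}{\sigma_m^2\sigma_{hx} - \sigma_{mx}\sigma_{mh}}(\alpha_m - r_f) + \frac{\sigma_{ix}\sigma_m^2 - \sigma_{im}\sigma_{mx}}{\sigma_m^2\sigma_{hx} - \sigma_{mx}\sigma_{mh}}(\alpha_h - r_f) \\
&= \beta_i^m(x)(\alpha_m - r_f) + \beta_i^h(x)(\alpha_h - r_f)
\end{aligned}
\]

The βs have the interpretation of regression coefficients in an IV regression, where x serves as an instrument for h. Note that

\[
\sigma_{ih} = \Sigma h = \frac{\Sigma\Sigma^{-1}\sigma_{ix}}{\iota'\Sigma^{-1}\sigma_{ix}} = \frac{\sigma_{ix}}{\iota'\Sigma^{-1}\sigma_{ix}}
\]

Therefore, $\sigma_{ix} = k\sigma_{ih}$. This trick generalizes to cov(j, x) = k cov(j, h) where $k = \iota'\Sigma^{-1}\sigma_{ix}$. Terms depending on x can be factored from the betas so $\beta_i^m(x) = \beta_i^m$ and $\beta_i^h(x) = \beta_i^h$.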
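The two-beta pricing relation can be verified numerically. The sketch below generates expected returns from the aggregated first order condition, using assumed market weights (called µ in the text and `m` in the code) and assumed values for the aggregate coefficients A and B, then checks that the market and hedge-portfolio betas recover those expected returns. All inputs are illustrative assumptions.

```python
import numpy as np

# Illustrative inputs (assumptions, not data from the notes).
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
sigma_x = np.array([0.010, -0.005, 0.020])   # cov of each asset with the state variable
rf = 0.03
iota = np.ones(3)
m = np.array([0.5, 0.3, 0.2])                # assumed market weights
A_coef, B_coef = 2.0, 0.5                    # assumed aggregate coefficients A and B

# Equilibrium expected returns from alpha - rf = A * Sigma m - B * sigma_x.
alpha = rf + A_coef * (Sigma @ m) - B_coef * sigma_x

# Hedge portfolio and the scaling constant k.
Sinv_sigx = np.linalg.solve(Sigma, sigma_x)
k = iota @ Sinv_sigx
h = Sinv_sigx / k

s_im, s_ih = Sigma @ m, Sigma @ h            # asset covariances with market and hedge
s_mm, s_mh = m @ Sigma @ m, m @ Sigma @ h
s_mx, s_hx = m @ sigma_x, h @ sigma_x

den = s_mm * s_hx - s_mx * s_mh
beta_m = (s_im * s_hx - sigma_x * s_mh) / den
beta_h = (sigma_x * s_mm - s_im * s_mx) / den

alpha_m, alpha_h = m @ alpha, h @ alpha
implied_alpha = rf + beta_m * (alpha_m - rf) + beta_h * (alpha_h - rf)
```

The computed `implied_alpha` matches `alpha`, and `sigma_x` equals `k * s_ih`, which is the proportionality that lets the x-dependent terms be factored out of the betas.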


2.3.4 CCAPM Theory

The CCAPM, due to Breeden (1979), is very much like the ICAPM with consumption growth as the single state variable. In the ICAPM investors hedge against changes in the state variables because these represent changes in the investment opportunity set, and therefore, changes in consumption. The CCAPM goes directly to hedging against changes in consumption. The model is also similar to the static CAPM, where end of period wealth mattered. Since the CAPM is one period, end of period wealth is the same as consumption. A key assumption in the CCAPM is additively separable preferences, which gives state independence of direct utility.

To make more clear the link between the ICAPM and the CCAPM, note that in the ICAPM agents set the marginal utility of wealth equal to the marginal utility of consumption along the optimal consumption path. This is the envelope condition, $U_C = J_W$. If markets are complete, then perfect hedges for the state variables can be formed and all individuals will have perfectly (instantaneously) correlated consumption policies. This is an analogue to all individuals holding the market portfolio in the static CAPM.

In many ways, the CCAPM is the most fundamental of the equilibrium models. It is illogical to choose the CAPM or ICAPM because you think the consumption-based model is wrong. The only reason for choosing an alternative model is that the consumption data needed to test the model may be unsatisfactory.

CCAPM Derivation

The combination of portfolios h and t which the investor chooses minimizes the variance in consumption, not wealth.

The CCAPM can be derived as a simple modification to the previous derivation of the ICAPM. Since $U_C = J_W$ at the optimum, $J_{WW} = U_{CC}C^*_W$ and $J_{Wx} = U_{CC}C^*_x$. Substituting into (2.8),

\[
w^* = \left(\frac{-U_C}{W U_{CC} C^*_W}\right)\Sigma^{-1}(\alpha - r_f\iota) + \left(\frac{-U_{CC} C^*_x}{W U_{CC} C^*_W}\right)\Sigma^{-1}\sigma_{ix}
\]

or

\[
\frac{-U_C}{U_{CC}}(\alpha - r_f\iota) = W C^*_W\Sigma w^* + C^*_x\sigma_{ix}.
\]


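The covariance between the return on asset i and consumption growth is

$$\mathrm{cov}\left(r, \frac{dC}{C}\right) = E[(\alpha\,dt + \Sigma\,dz)(C_t\,dt + C_W\,dW + C_x\,dx + \phi)] = \frac{W C_W}{C}\Sigma w + \frac{C_x}{C}\sigma_{ix} = \sigma^k_{iC}/C.$$

Noting that this is different for each agent k and letting $T^k = -CU_C/U_{CC}$,

$$T^k(\alpha_i - r_f) = \sigma^k_{iC}.$$

Summing over all investors we get

$$(\alpha_i - r_f) = T^{-1}\sigma_{iC}.$$

Defining a reference portfolio C,

$$\sigma^2_C = w'_C\sigma_{iC} = T(\alpha_C - r_f).$$

Solving for T and substituting,

$$(\alpha_i - r_f) = \frac{\sigma_{iC}}{\sigma_{pC}}(\alpha_p - r_f) = \beta_{iC}(\alpha_C - r_f).$$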

Note that if the consumption portfolio is not itself a traded asset then the portfolio with the maximum correlation with consumption can be used. The same basic intuition applies, but this results in the same kind of instrumental variable flavor as in the previous presentation of the ICAPM. If consumption is available, it serves as the single variable driving the returns process. When it is not available we include additional state variables to use as instruments.

2.3.5 The CIR Model

Cox, Ingersoll, and Ross (CIR) derive a general equilibrium model with endogenous production and stochastic technology shocks. The distribution of production depends on the state variables Y, which are changing randomly. This model fills a void in the literature in that it endogenously determines the equilibrium price path, given the specification of technology. Recall the ICAPM begins with a specification of the price path then determines the equilibrium demand.


Assumptions

• single physical good
• n production activities follow (2.9)
• k state variables follow (2.10)
• contingent claims for the single good, whose value follows (2.11)
• competitive markets
• endogenously determined instantaneous borrowing/lending rate r
• fixed number of identical individuals who maximize $E\int_t^{t'} U[C(s), Y(s), s]\,ds$
• continuous investing and trading with no transactions costs
• there exists a unique J and v
• (technical) v ∈ V is the class of admissible controls
• (technical) J, a∗ and C∗ are sufficiently differentiable.

Underlying Processes

n Production Activities

$$d\eta(t) = I_\eta\,\alpha(Y, t)\,dt + I_\eta\,G(Y, t)\,dw(t) \tag{2.9}$$

k State Variables

$$dY(t) = \mu(Y, t)\,dt + S(Y, t)\,dw(t) \tag{2.10}$$

Value of Contingent Claim i

$$dF^i = (F^i\beta_i - \delta_i)\,dt + F^i h_i\,dw(t) \tag{2.11}$$

Derivation

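Budget constraint

$$dW = \left[\sum_{i=1}^{n} a_i W(\alpha_i - r) + \sum_{i=1}^{k} b_i W(\beta_i - r) + rW - C\right]dt + \sum_{i=1}^{n} a_i W\left(\sum_{j=1}^{n+k} g_{ij}\,dw_j\right) + \sum_{i=1}^{k} b_i W\left(\sum_{j=1}^{k} h_{ij}\,dw_j\right) \tag{2.12}$$

or

$$dW = W\mu(W)\,dt + W\sum_{j=1}^{n+k} q_j\,dw_j$$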


Let $K(v(t), W(t), Y(t), t) \equiv E_{W,Y,t}\left[\int_t^{t'} U(v(s), Y(s), s)\,ds\right]$ and define $L^v(t)$ as the differential operator

$$L^v(t)K = \mu(W)WK_W + \sum_{i=1}^{k}\mu_i K_{Y_i} + \frac{1}{2}W^2 K_{WW}\sum_{i=1}^{n+k} q_i^2 + \sum_{i=1}^{k} W K_{WY_i}\sum_{j=1}^{n+k} q_j s_{ij} + \frac{1}{2}\sum_{i=1}^{k}\sum_{j=1}^{k} K_{Y_iY_j}\sum_{m=1}^{n+k} s_{im}s_{jm} \tag{2.13}$$

Let the indirect utility function J(W, Y, t) be the solution to

$$\max_{v \in V}\,[L^v(t)J + U(v, Y, t)] + J_t = 0.$$

J has many of the same properties as U, such as being increasing and strictly concave in W.

Defining $\Psi = L^vJ + U$, we get the following necessary and sufficient conditions:

• $\Psi_C = U_C - J_W \le 0$
• $C\Psi_C = 0$
• $\Psi_a = [\alpha - r]WJ_W + [GG'a + GH'b]W^2J_{WW} + GS'WJ_{WY} \le 0$
• $a'\Psi_a = 0$
• $\Psi_b = [\beta - r]WJ_W + [HG'a + HH'b]W^2J_{WW} + HS'WJ_{WY} = 0$

Solving for C, a, b, we obtain a PDE for J. The equilibrium satisfies these conditions and markets clear: $\sum a_i = 1$ and $b_i = 0\ \forall\, i$.

Characterizations

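The expected rate of return on wealth is $a^{*\prime}\alpha$. The interest rate r is the negative of the expected rate of change in the marginal utility (MU) of wealth, or $a^{*\prime}\alpha$ plus the covariance between the rate of return on wealth and the rate of change in the MU of wealth.

$$r = -E\left[\frac{J_{WW}}{J_W}\right] = E\left[\frac{dW}{W}\right] + \mathrm{cov}\left(\frac{dW}{W}, \frac{-J_{WW}}{J_W}\right)$$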

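The expected rate of return on the ith contingent claim is

$$(\beta_i - r)F^i = [\phi_W \;\; \phi'_Y]\,[F^i_W \;\; F^i_Y]'$$

where

$$\phi_W = \left[\left(\frac{-J_{WW}}{J_W}\right)\mathrm{var}(W) + \sum_{i=1}^{k}\left(\frac{-J_{WY_i}}{J_W}\right)\mathrm{cov}(W, Y_i)\right] = (a^{*\prime}\alpha - r)W$$

and

$$\phi_{Y_i} = \left[\left(\frac{-J_{WW}}{J_W}\right)\mathrm{cov}(W, Y_i) + \sum_{j=1}^{k}\left(\frac{-J_{WY_i}}{J_W}\right)\mathrm{cov}(Y_i, Y_j)\right]$$

Alternatively, we can write

$$\beta_i = r - \frac{\mathrm{cov}(F^i, J_W)}{F^iJ_W}$$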

The expected return on a contingent claim is the riskfree rate plus a linear combination of the first partials of the asset price with respect to W and Y. The weights are the φ coefficients, which are much like factor risk premiums in the APT or hedge portfolios in the ICAPM. The φs do not depend on the contingent claim itself and are the same for all claims.

If U is not state-dependent, we get a CCAPM-type result, with $\phi_W = -\frac{u''}{u'}\mathrm{cov}(C^*, W)$ and $\phi_Y = -\frac{u''}{u'}\mathrm{cov}(C^*, Y)$, giving $(\beta_i - r)F^i = -\frac{u''}{u'}\mathrm{cov}(C^*, F^i)$.

The expected excess return on an asset is proportional to its covariance with optimal consumption. We can then express relative rates of return in a way that does not depend (explicitly) on preferences.

Fundamental Valuation Equation

$$\frac{1}{2}\mathrm{var}(W)F_{WW} + \sum_i \mathrm{cov}(W, Y_i)F_{WY_i} + \frac{1}{2}\sum_i\sum_j \mathrm{cov}(Y_i, Y_j)F_{Y_iY_j} + \sum_i F_{Y_i}\left[\mu_i - \left(\frac{-J_{WW}}{J_W}\right)\mathrm{cov}(W, Y_i) - \sum_j\left(\frac{-J_{WY_i}}{J_W}\right)\mathrm{cov}(Y_i, Y_j)\right] + [rW - C^*]F_W + F_t - rF + \delta(W, Y, t) = 0 \tag{2.14}$$

where r and C∗ are functions of W, Y, and t. This PDE holds for any contingent claim, with boundary conditions and δ depending on the terms of the claim. The PDE can price assets with payoffs (i) contingent on crossing a barrier, (ii) contingent on not crossing a barrier, and/or (iii) flow payoffs.


We can focus on the system of equations:

dW (t) = [a∗′αW − C∗]dt+ a∗′GWdw(t)

dY (t) = µ(Y, t)dt+ S(Y, t)dw(t)

or a second system with a different drift term reflecting a change of measure:

dW (t) = [a∗′αW − C∗ − φW ]dt+ a∗′GWdw(t)

dY (t) = [µ(Y, t) − φ′Y ]dt+ S(Y, t)dw(t)

The expression $\frac{J_W(W(s), Y(s), s)}{J_W(W(t), Y(t), t)}$ is the conditional pricing kernel.

2.4 Arbitrage Asset Pricing

Arbitrage pricing takes a set of basis assets as given and uses them to price other assets.

2.4.1 State Contingent Claims

State contingent claims, or Arrow-Debreu securities, are the building blocks for all assets. These securities pay $1 in a specified state and zero otherwise. Ross (1977b) shows the absence of arbitrage implies the existence of state contingent prices and, therefore, of a linear pricing operator. This is really just a spanning result. We can write $p(x) = \sum_s \phi(s)x(s)$. This says the price of security x is the sum over all states of the price of a dollar in each state φ(s) scaled by the size of the payoff in each state x(s). Harrison and Kreps (1979) extend this to show that this operator can be represented as an expectation with respect to a martingale measure.

Let D denote an (n × n) matrix of asset payoffs with typical element $d_{ij}$, where i denotes the state and j the security. This matrix is a collection of vectors $d_j$ of asset payoffs. α is an n-vector of weights, b an n-vector of payoffs. φ is the price vector for the n Arrow-Debreu securities and p the prices of the complex securities. We have the following pricing relations

$$D'\phi = p \quad\text{and}\quad D\alpha = b$$

with $\iota'\phi = \frac{1}{1+r_f} = p_f$ and $(1 + r_f)\iota'\phi = \iota'\pi = 1$. Here π = f(θ, λ) is the vector of risk-neutral probabilities, a function of the true probabilities θ and risk aversion λ.


2.4.2 Arbitrage Pricing Theory

The APT, originally developed by Ross (1976), has generated a tremendous literature of theoretical extensions and a wide range of empirical tests. The intuition is simple. Assume returns follow a factor model, meaning returns depend on the realization of factors and (quasi-) orthogonal shocks.1 The factors are not diversifiable, whereas the orthogonal shocks are in some sense. The theory is silent on what the factors are, or even the number of factors. A key idea is the factor-mimicking portfolio.

There are really three different cases of the APT, depending on the assumptions about the structure of the Ω matrix of “idiosyncratic” covariances. If we have an exact or noiseless factor model, then Ω is the zero matrix and an exact arbitrage argument will hold. Alternatively, we could have a strict factor model in which the matrix is diagonal so there is no correlation across assets. Large diversified portfolios cause the idiosyncratic variance to go to zero. We appeal to an asymptotic arbitrage argument in which there is no arbitrage on average, although specific securities may be mispriced. Finally, we could allow for a more general correlation structure where Ω may contain non-zero off-diagonal elements. This approximate factor model allows for idiosyncratic correlations (e.g., industries) and requires restrictions on the covariances of returns such that the idiosyncratic part is diversified away while the factors remain. The controversy over the structure of Ω has major implications for the testability of the model.

The APT has a flavor very similar to the ICAPM, although it arises from a different viewpoint. In the end, both models specify expected returns as a function of a linear combination of their covariances with variables (factors and state variables, respectively). This link arises because it is implied by the absence of arbitrage. The additional assumptions in the equilibrium model serve to determine the risk premium associated with each state variable.2

The model has been extended in a number of other ways including dynamic, conditional, nonlinear, and international versions. Tests of the model have also followed several paths, broadly categorized as cross-sectional or time series. In general, tests reject the model but find it provides more favorable performance than models like the CAPM.

1 By quasi-orthogonal shocks I mean that some correlation among the residuals is allowed.

2 Actually, models such as the CAPM are partial equilibrium models and take the riskless rate and market price of risk as given. Richer models such as CIR introduce production uncertainty and are able to more completely characterize the economy.

APT Derivation

This derivation is based on the strict factor version. The exact APT derivation will also work under this approach. Modifications for the approximate APT are mentioned at the end. It is very important to understand that the APT starts with a characterization of realized returns r, and uses statistical properties to say something about expected returns µ.

$$r_t = \mu_t + \nu_t = \mu_t + Bf_t + u_t \tag{2.15}$$

$$E[r_t] = \mu_t \qquad f_t \sim N(0, I) \qquad u_t \sim N(0, \Omega)$$

where Ω is diagonal. $f_t$ is a factor vector and B a loading matrix, which together give the unexpected factor-related return. Return covariances are

$$E[r_tr_t'] = Bff'B' + \Omega = BB' + \Omega = \Psi.$$

As an aside, define Φ such that ΦΦ′ = I, giving B = DΦ, a rotation. Therefore, Ψ = DΦΦ′D′ + Ω = DD′ + Ω, illustrating the rotational indeterminacy.
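The rotational indeterminacy can be checked numerically. The sketch below is only an illustration: the loading matrix, the rotation angle, and all variable names are made up, and NumPy is assumed to be available. It shows that rotated loadings imply the same factor part of the covariance matrix, so the data alone cannot distinguish them.

```python
import numpy as np

# Hypothetical loadings: 4 assets, 2 factors (illustrative numbers only).
B = np.array([[1.0, 0.2],
              [0.8, -0.3],
              [1.2, 0.5],
              [0.9, 0.1]])

# Any orthogonal Phi (Phi @ Phi.T = I) will do; here a 30-degree rotation.
theta = np.pi / 6
Phi = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])

# Rotated loadings D satisfy B = D @ Phi, i.e. D = B @ Phi.T.
D = B @ Phi.T

# Both loading matrices imply the same factor covariance BB' = DD'.
same_cov = np.allclose(B @ B.T, D @ D.T)
print(same_cov)
```

The loadings themselves differ (B and D are not equal), which is why factor loadings are identified only up to a rotation.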

Next form a portfolio with weights w. The portfolio variance is

$$\sigma^2_p = w'\Psi w = w'BB'w + w'\Omega w \approx w'BB'w.$$

The strategy is to choose w such that w′BB′w = 0 without making an investment, ι′w = 0. To find a w think of this as a regression of µ on [ι B]. This is

$$\mu = \lambda_0\iota + B\lambda + w. \tag{2.16}$$

The normal equations from the regression give ι′w = 0 and B′w = 0, which implies w′BB′w = 0 as desired.

To find w′r, insert (2.16) into (2.15) to get

$$r_p = w'(\lambda_0\iota + B\lambda + w) + w'Bf + w'u.$$


Taking expectations and using the orthogonality conditions, $\mu_p = w'w$. This validates (2.16), which can be written as

$$\mu_t \approx \lambda_0\iota + B\lambda. \tag{2.17}$$

If a factor is negatively correlated with the IMRS the model implies a positive risk premium.

Using $w^N$ in (2.16), where N indexes the number of assets, a sequence of arbitrage portfolios satisfies the Ross pricing bound if $w^{N\prime}w^N$ does not go to infinity with N. The approximate factor model is derived by requiring that as N → ∞ the smallest eigenvalue of B′B → ∞ while the largest eigenvalue of Ω → 0. That is, the factors are pervasive while the idiosyncratic part is diversifiable.
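The regression step in the derivation above can be sketched numerically. This is a minimal illustration under assumed inputs: the number of assets, the loadings, the premia, and the size of the pricing deviations are all invented, and NumPy is assumed. It regresses µ on [ι B] and confirms that the residual vector w costs nothing, carries no factor exposure, and has expected return w′w.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical economy: N assets, K factors (all numbers are illustrative).
N, K = 50, 2
B = rng.normal(size=(N, K))                  # factor loadings
lam0, lam = 0.02, np.array([0.05, 0.03])     # assumed zero-beta rate and premia
# Expected returns: linear in the loadings plus a small pricing deviation.
mu = lam0 + B @ lam + 0.001 * rng.normal(size=N)

# Regress mu on [iota, B]; the residual vector is the arbitrage portfolio w.
X = np.column_stack([np.ones(N), B])
coef, *_ = np.linalg.lstsq(X, mu, rcond=None)
w = mu - X @ coef

zero_cost = np.ones(N) @ w       # iota'w: no net investment
factor_exposure = B.T @ w        # B'w: no factor risk
mu_p = w @ mu                    # expected return of the arbitrage portfolio

print(zero_cost, factor_exposure, mu_p, w @ w)
```

Because the least squares residual is orthogonal to every regressor, the first two quantities are zero up to rounding, and µ_p equals w′w, which is small when the pricing deviations are small.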

2.5 Pricing Kernel Approach

The pricing kernel approach is in many ways a hybrid of the equilibrium and arbitrage approaches. The focus is to specify the pricing kernel3 m which makes the Euler equation hold:

$$p_t = E_t[m_{t+\tau}x_{t+\tau}] \tag{2.18}$$

This seemingly simple expression is general enough to cover pricing for any asset. The expression can be modified to handle returns, excess returns, stocks, bonds, options, etc. The meanings of the payoff x and the price p change, but the same intuition applies.

The expected return on an asset is negatively related to its covariance with the stochastic discount factor. Assets whose returns vary positively with the sdf pay off when the marginal utility is high. That is, they provide wealth in the states when it is most valuable to investors. Consequently, investors are willing to pay high prices and accept low returns for these assets.

There are basically two ways of doing business. One is to take the IMRS as given and interpret (2.18) as the Euler equation arising from the consumer’s optimization problem. The goal would then be to explain asset returns. The other view is to take the returns as given and explore the implications for m.

3 This object lives by many names, including the stochastic discount factor (sdf), intertemporal marginal rate of substitution (IMRS), or benchmark pricing variable. It is incorrectly referred to as the Radon-Nikodym derivative, Arrow-Debreu price, or state-contingent claims price (unless the riskless rate is zero). While on naming conventions, the risk-neutral probability measure is also referred to as the equivalent martingale measure (EMM).

The characteristics of m depend upon the structure of the economy. If the law of one price (LOP) is satisfied, there will exist (at least one) m such that (2.18) holds. In the absence of arbitrage (NA), m is strictly positive. If markets are complete then m is unique.

2.5.1 Basics

This presentation is for a discrete time, multiperiod model. Define the consumption set c ∈ B(ei, p) ⊂ R × X. The budget constraints are c(0) = e(0) − θ′p and c(T, ω) = e(T, ω) − θ′d(ω). Combining these two equations, Dθ = c − e. The attainable set Dθ = c ignores the initial endowment. I will abuse notation and consistency by letting Q and π∗ refer to the EMM. The latter is more appropriate for discrete settings. Also, dividend (payoff) vectors and matrices are indicated by d and D.

Definition 1 The market is complete iff every consumption process is attainable (M = X), or iff rank(D) = k.

Definition 2 An arbitrage strategy has non-negative, non-zero consumption with e(0) = (0); Dθ ≥+ 0

Definition 3 An Equivalent Martingale Measure Q (or π∗) satisfies p = D′π∗/Rf.

Q exists iff there is no arbitrage, or iff an equilibrium exists. If markets are complete then Q is unique.

Definition 4 A price functional Φ : R × M → R (Π : M → R) satisfies Φ(c) = c(0) + Π(c(T)) = c(0) + θ′p for any θ such that c(T) = θ′d.

This implies B(e, p) can be expressed as Φ(nc) = 0 where nc(t) ≡ c(t) − e(t) ∈ M.

Π is unique even in an incomplete market and exists if there is an equilibrium. A price system is viable: iff there is no arbitrage, iff Q exists, or iff Φ (or Π) exists.

Definition 5 Ψ : X → R is an extension of Π if for all x ∈ M, Ψ(x) = Π(x).

A sequence of scaled prices is a Q-martingale.


2.5.2 Different Expectations

Denote the price of asset x, a package of state-contingent claims, as p(x). Then

$$p(x) = \sum_s \phi(s)x(s) = \sum_s \pi(s)\left(\frac{\phi(s)}{\pi(s)}\right)x(s) = E^P[mx]$$

where π(s) is the (true) probability of state s. It follows then that m(s) = φ(s)/π(s). To move to risk-neutral probabilities π∗, define

$$\pi^*(s) \equiv R_f\,m(s)\,\pi(s) = R_f\,\phi(s),$$

where $1/R_f = \sum \phi(s) = E[m]$. Then

$$p(x) = \sum_s \phi(s)x(s) = \sum_s \frac{\pi^*(s)}{R_f}x(s) = \frac{E^Q[x]}{R_f}.$$

These results imply

$$p(x) = E^Q[x]/R_f = E^P[mx].$$

Stated differently

$$\frac{\pi^*(s)}{\pi(s)} = R_f\,m(s).$$

The risk neutral probabilities give greater weight to states with high marginal utility, the “bad” states. In discrete time, the “change of measure” is

$$\frac{\pi^*}{\pi} = R_f\,m = \frac{Q}{P}$$
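A small numerical example may help fix the discrete-time change of measure. The three-state economy below is entirely hypothetical (the probabilities, state prices, and payoff are invented for illustration). It computes m and π∗ from the state prices and confirms that the three pricing expressions agree.

```python
# Hypothetical three-state economy (numbers are illustrative).
pi = [0.3, 0.5, 0.2]        # true probabilities pi(s)
phi = [0.35, 0.40, 0.20]    # state prices phi(s)
x = [1.0, 2.0, 4.0]         # payoff x(s) of the asset to be priced

Rf = 1.0 / sum(phi)                       # 1/Rf = sum of state prices = E[m]
m = [f / p for f, p in zip(phi, pi)]      # m(s) = phi(s) / pi(s)
pi_star = [Rf * f for f in phi]           # pi*(s) = Rf * phi(s)

# Three equivalent ways of pricing the payoff x.
price_state = sum(f * xs for f, xs in zip(phi, x))             # sum phi(s) x(s)
price_P = sum(p * ms * xs for p, ms, xs in zip(pi, m, x))      # E^P[m x]
price_Q = sum(q * xs for q, xs in zip(pi_star, x)) / Rf        # E^Q[x] / Rf

print(price_state, price_P, price_Q)
```

In this example the first state has the highest m, so its risk-neutral probability exceeds its true probability, which is the sense in which the "bad" states receive greater weight.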

In continuous time the analogous expression is

$$\frac{dQ}{dP} = \lim_{n\to\infty}\frac{f^Q_n(x_1, \ldots, x_n)}{f^P_n(x_1, \ldots, x_n)}$$

where $f_n(\cdot)$ represents the joint likelihood under the respective measure. This expression is the Radon-Nikodym derivative, and is the limit of the likelihood ratios. This random variable satisfies

$$E^Q(x_T) = E^P\left(\frac{dQ}{dP}x_T\right).$$


2.5.3 Asset Pricing with m

This analysis is useful in pricing assets. For a collection of assets in an economy, the price is the risk-neutral expectation of the future value, discounted back to the present at the riskless rate

$$p = D'\pi^*/R_f.$$

If the market is complete, Q is unique (π∗ is identifiable in a discrete setting) and we can invert the payoff matrix to solve for the probabilities

$$\pi^* = R_f(D')^{-1}p.$$

If the market is not complete it is often possible to get a range of admissible EMMs. Further restrictions may result from imposing the NA condition that the pricing kernel be positive.

Recall that dividing by the riskless rate will give the Arrow-Debreu prices $\phi = \pi^*/R_f = (D')^{-1}p$. Furthermore, the pricing kernel is

$$m = p_f\frac{\pi^*}{\pi} = \frac{(D')^{-1}p}{\pi},$$

where the division is state by state. Once the EMM or pricing kernel is known it can be used to price any other asset.
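The inversion can be sketched for a small complete market. Everything numerical here is an assumption made for illustration: the payoff matrix, the observed prices, the true probabilities, and the call option that gets priced at the end. NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical complete market: 3 states (rows) x 3 securities (columns).
D = np.array([[1.0, 0.5, 0.0],
              [1.0, 1.0, 1.0],
              [1.0, 2.0, 4.0]])     # d_ij = payoff of security j in state i
p = np.array([0.95, 0.975, 1.20])  # assumed observed security prices
pi = np.array([0.3, 0.5, 0.2])     # assumed true probabilities

phi = np.linalg.solve(D.T, p)      # Arrow-Debreu prices from D' phi = p
Rf = 1.0 / phi.sum()               # security 1 is riskless, so 1/Rf = iota' phi
pi_star = Rf * phi                 # risk-neutral probabilities
m = phi / pi                       # pricing kernel, state by state

# Price a new claim: a call on security 3 with strike 1 (hypothetical).
call_payoff = np.maximum(D[:, 2] - 1.0, 0.0)
call_price = phi @ call_payoff

print(phi, pi_star, m, call_price)
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse explicitly. All recovered state prices are positive here, so these prices admit no arbitrage.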

2.5.4 The Agent’s Problem

There is a relationship between the pricing kernel and equilibrium approaches. The agent will

$$\max_{c,\,c(s)}\; u(c) + \sum_s \beta\pi(s)u[c(s)] \quad\text{s.t.}\quad c + \sum_s \phi(s)c(s) = y + \sum_s \phi(s)y(s).$$

FOCs are

$$u'(c) = \lambda \qquad \beta\pi(s)u'[c(s)] = \lambda\phi(s)$$

Solving,

$$\phi(s) = \beta\pi(s)\frac{u'[c(s)]}{u'(c)}$$

or

$$m(s) = \frac{\phi(s)}{\pi(s)} = \beta\frac{u'[c(s)]}{u'(c)}.$$

Thus $m(s_1)/m(s_2) = u'[c(s_1)]/u'[c(s_2)]$, so m gives the marginal rate of substitution between date and state contingent claims. In equilibrium, marginal utility growth should be the same for all consumers

$$\beta_i\frac{u'(c_{i,t+1})}{u'(c_{i,t})} = \beta_j\frac{u'(c_{j,t+1})}{u'(c_{j,t})}.$$

Hence m is referred to as the IMRS. Taking the expectation of either m or IMRS gives the price of a riskless bond.
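As an illustration of the first-order conditions, the sketch below evaluates m(s) for one particular choice of preferences. Power (CRRA) utility is an assumption made here, not something the derivation requires, and the discount factor, risk aversion, consumption levels, and probabilities are all invented.

```python
# Assumed preferences: CRRA utility, so marginal utility is u'(c) = c**(-gamma).
beta, gamma = 0.98, 2.0      # hypothetical time preference and risk aversion
c0 = 1.0                     # consumption today
c1 = [0.95, 1.02, 1.08]      # consumption in each future state (hypothetical)
pi = [0.2, 0.5, 0.3]         # true state probabilities (hypothetical)

def marginal_utility(c):
    """u'(c) under the assumed CRRA utility."""
    return c ** (-gamma)

# m(s) = beta * u'[c(s)] / u'(c)
m = [beta * marginal_utility(c) / marginal_utility(c0) for c in c1]

# State prices phi(s) = pi(s) m(s); their sum is E[m], the riskless bond price.
phi = [p * ms for p, ms in zip(pi, m)]
bond_price = sum(phi)
Rf = 1.0 / bond_price

print(m, bond_price, Rf)
```

The kernel is largest in the low-consumption state, which is where an extra unit of payoff is valued most.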

2.5.5 The Main Results

Using the definition of covariance and (2.18)

$$1 = E[mR] = E[m]E[R] + \mathrm{cov}(m, R) \tag{2.19}$$

$$E[R] = \frac{1}{E[m]} - \frac{\mathrm{cov}(m, R)}{E[m]} \tag{2.20}$$

It follows immediately that if there is a riskless asset $R_f = 1/E[m]$, or $p_f = E[m]$. Without a riskless asset, we can view 1/E[m] as a “shadow” riskfree rate, or a zero beta return. Note that the expectations have been taken under the true probability measure P.

Using the above results,

$$E[R_i] = R_f + \left(\frac{\mathrm{cov}(m, R_i)}{\mathrm{var}(m)}\right)\left(\frac{-\mathrm{var}(m)}{E[m]}\right) = R_f + \beta_{i,m}\lambda_m$$

which is a beta pricing model.
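The beta representation can be verified on a discrete example. The state probabilities, kernel values, and payoff below are hypothetical, and the helper functions `mean` and `cov` are defined here only for the illustration. The asset is priced with p = E[mx], so the beta pricing relation holds exactly.

```python
# Hypothetical three-state example (illustrative numbers only).
pi = [0.3, 0.5, 0.2]       # true probabilities
m = [1.10, 0.95, 0.80]     # pricing kernel in each state
x = [0.9, 1.0, 1.3]        # asset payoff in each state

def mean(v):
    """Expectation under the true probabilities."""
    return sum(p * vi for p, vi in zip(pi, v))

def cov(a, b):
    """Covariance under the true probabilities."""
    ma, mb = mean(a), mean(b)
    return sum(p * (ai - ma) * (bi - mb) for p, ai, bi in zip(pi, a, b))

price = mean([mi * xi for mi, xi in zip(m, x)])   # p = E[m x]
R = [xi / price for xi in x]                      # gross return, so E[m R] = 1

Rf = 1.0 / mean(m)                  # riskless (or shadow) rate
beta_im = cov(m, R) / cov(m, m)     # beta of the asset on m
lam_m = -cov(m, m) / mean(m)        # price of risk, negative by construction

expected_R = mean(R)
model_R = Rf + beta_im * lam_m
print(expected_R, model_R)
```

This asset pays off most when m is low, so its beta on m is negative and it earns a positive premium over the riskless rate.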

Relation between m, β models, and MV frontier

• p = E[mx] ⇒ β: m, x∗, or R∗ can serve as reference variables. If m = b′f, then f, proj(f|X), or proj(f|R) can be used.
• p = E[mx] ⇒ mean-variance frontier which includes R∗
• β ⇒ p = E[mx]: m = b′f
• MV frontier ⇒ p = E[mx]: m = a + bR_mv
• MV frontier ⇒ β model with R_mv as a reference variable.

Table 2.2: Common Pricing Kernels

Model            $m_{t+1}$
CAPM             $a + bR_{W,t+1}$
ICAPM            $a + \sum_{k=1}^{K} b_k f_{k,t+1}$
CCAPM            $\beta\,u'(c_{t+1})/u'(c_t)$
APT              $b'f$
Black-Scholes    $\exp[-(r + \frac{1}{2}\sigma^2)\tau + \sigma\,dZ]$

Since mean-variance efficiency implies a single beta representation, some single beta representation can always be found. The asset pricing model says that a particular portfolio (e.g., the market) will be mean-variance efficient. In other words, the content of a model comes from m = f(·), not p = E[mx]. Also, given any multi-factor or multi-beta representation, we can always find a single beta representation. The relationship between the ICAPM and CCAPM is an example of this.

m as a Portfolio

The portfolio that maximizes squared correlation with m is a minimum variance portfolio. m∗, the projection, also prices assets and can replace m.

p = E[mx] = E[(m∗ + ε)x] = E[m∗x]

2.5.6 Hansen-Jagannathan Bounds

The Hansen and Jagannathan (1991) bounds are an important addition to asset pricing. Instead of a binary reject/fail to reject result, the HJ bounds offer some insights as to why the model may be rejected. The model is most useful for testing models like the consumption model where m is explicitly specified. The model is useless for evaluating factor models that do not specify the factors since there are always some factor-mimicking portfolios that will work ex post.


Working with excess returns, $E[mr^e] = 0$, so $E[m]E[r^e_i] = -\mathrm{cov}(m, r^e_i) = -\rho_{mr_i}\sigma_m\sigma_{r_i}$. Since $|\rho| \le 1$,

$$\frac{\sigma_m}{E[m]} \ge \frac{E[r^e_i]}{\sigma_{r_i}} \tag{2.21}$$

where r∗ represents the return with the maximum Sharpe ratio. This holds for any asset i, including the one with the maximum Sharpe ratio. To be clear, the maximal Sharpe ratio measures the excess return on the tangency portfolio r∗ relative to its standard deviation (assuming a one-factor world). Both the excess return on the tangent portfolio and the SR depend on Rf. Rewriting as σm = E[m]SR, the H-J bound is a function of E[m]. As we change E[m], we get a new Rf, a new tangency portfolio, and a new Sharpe ratio. Plotting σm as a function of E[m] gives us the locus of points comprising the H-J bound. Note that if we know Rf, the bound is just a point. These results are based on the law of one price (LOP), and do not use the no arbitrage (NA) restriction that m > 0.
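Tracing out the LOP bound as E[m] varies can be sketched as follows. The return moments are invented, the function name `hj_bound` is hypothetical, and NumPy is assumed. The maximal Sharpe ratio is computed with the standard expression $\sqrt{\mu_e'\Sigma^{-1}\mu_e}$ for expected excess returns $\mu_e$, which is not derived in these notes.

```python
import numpy as np

# Hypothetical gross-return moments for two assets (illustrative numbers).
mu = np.array([1.08, 1.03])                  # E[R]
Sigma = np.array([[0.0400, 0.0020],
                  [0.0020, 0.0025]])         # cov(R)

def hj_bound(Em):
    """Lower bound on sigma_m for a given E[m], using only the law of one price."""
    Rf = 1.0 / Em                             # implied (shadow) riskfree rate
    excess = mu - Rf                          # expected excess returns
    max_sharpe = np.sqrt(excess @ np.linalg.solve(Sigma, excess))
    return Em * max_sharpe                    # sigma_m >= E[m] * SR

# Each value of E[m] implies a new Rf, tangency portfolio, and Sharpe ratio.
grid = np.linspace(0.90, 1.00, 11)
frontier = [(Em, hj_bound(Em)) for Em in grid]
for Em, bound in frontier:
    print(f"E[m] = {Em:.2f}   sigma_m >= {bound:.4f}")
```

A candidate m, for example from a consumption model, passes the bound if its (E[m], σm) pair lies on or above this locus. The sharper NA bound would require the numerical problem described in the text and is not attempted here.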

By imposing the NA restriction we can sharpen the bound given in (2.21). The NA bound is very similar to the LOP bound for moderate values of E[m], but as E[m] becomes more extreme (higher SR), the NA bound is much stricter (higher). For payoffs x and Lagrange multipliers λ and δ,

$$m^+ = [\lambda + \delta'x]^+$$

subject to $E[m^+] = E[m]$ and $E[m^+x] = p$. This nonlinear problem can generally be solved numerically. $m^+$ has the interpretation of a call option with zero strike price on a portfolio of payoffs [1 x]′.

The H-J bound analysis has been extended in several ways. Snow (1991) generalizes the model to include any moment of m. In this setting the bounds are more sensitive to outliers. Other extensions include incorporating transactions costs, utilizing cross-moments, and analyzing pricing errors as a way to detect specification errors. One example is adding different sets of assets and seeing how much the bound shifts up.

2.6 Conditioning Information

The difference between a conditional and unconditional model is the information set used. If payoffs and discount factors (and therefore, prices) are iid, then conditional and unconditional models are the same. Define

$$\text{UMV iff } E[R^2_{p^*}] \le E[R^2_p] \;\;\forall\; R_p \text{ s.t. } E[R_{p^*}] = E[R_p]$$

$$\text{CMV iff } E_t[R^2_{p^*}] \le E_t[R^2_p] \;\;\forall\; R_p \text{ s.t. } E_t[R_{p^*}] = E_t[R_p]$$

By iterated expectations, this gives UMV ⊆ CMV. If a portfolio is UMV it must be CMV, but the converse need not be true. We can also consider the set of minimum variance portfolios conditional on Z, CMV_Z. Then CMV includes CMV_Z, which in turn includes UMV. A conditional factor pricing model does not imply an unconditional model. An unconditional model does imply a conditional model.

From here we can say that it is possible to reject that a portfolio is UMV or CMV_Z, but we cannot reject CMV since the information set for CMV is unobservable. This is similar to the issue raised by Roll (1977); rejecting UMV does not imply rejection of CMV. Cochrane (1998) refers to this as the Hansen and Richard (1987) critique. The use of scaled factors (i.e., scaled by instruments in the proper information set) is a partial solution.

If the test is based on 1 = E[mR] for some particular m, then it is possible to test without the complete information set. Recall m∗ can replace m in (2.18), so m∗ is also CMV and is a function of the unobserved information set.

The use of conditional models allows for time-varying expected returns. This time variation can arise due to changes in the risk premium or because of conditional covariances (β changes through time). The ARCH-GARCH family of models is often used to capture the time series behavior of conditional moments.

2.7 Market Efficiency

Examining the link between the theoretical asset pricing models and empirical tests requires a position on market efficiency. The general idea behind market efficiency is that prices reflect available information. Of course a more precise definition of available information and the implications of reflecting this information are necessary.

The early view of market efficiency was the random walk. In this model the series of innovations is independent. Empirical evidence during this period found that prices are consistent with a random walk. The apparent implications of this model are that prices are not driven by supply/demand and there is no point in fundamental analysis. In fact, the random walk does not have these implications since slowly adjusting prices would allow profitable trading strategies. A problem with the random walk is that it simultaneously requires rational investors to eliminate profitable trading opportunities, but also assumes investors irrationally pay for security analysis.

The martingale model was proposed as an alternative to the random walk by Samuelson in the mid-1960s. A random variable $x_{t+1}$ is a martingale with respect to an information set $\Phi_t$ if

$$E[x_{t+1} \mid \Phi_t] = x_t.$$

A fair game has the property that $E[y_{t+1} \mid \Phi_t] = 0$. Returns are a fair game if prices and dividends follow a martingale. Finding a variable that can predict returns means either that returns are not a martingale or that that variable is not in the information set. More recent versions of market efficiency also assume rational expectations.

The martingale will hold when investors have common, constant rates of time preference, homogeneous beliefs, and are risk-neutral. Note that risk neutrality implies a martingale, but does not imply a random walk. The reason is that a martingale allows dependence of higher moments on the information set, whereas the random walk does not. Allowing for risk aversion does not go very far in reconciling the martingale model with the data.

There are several reasons not to base market efficiency on the martingale model. In a setting such as the ICAPM, conditional expected returns depend on dividends. Since dividends are autocorrelated the conditional expected returns are partially forecastable in violation of the martingale model. Time variation in the risk premium may also lead to failure of the martingale model. Finally, most empirical tests have a joint hypothesis problem. Rejecting a model may mean either the model is wrong or the market is inefficient.

2.8 Empirical Asset Pricing

2.8.1 Properties of Asset Returns

Normality offers nice features in modeling asset prices; however, departures from normality have been extensively documented. Relative to the normal distribution, asset returns exhibit skewness and excess kurtosis. Matters are complicated further by serial correlation in returns.


Table 2.3: Patterns In Returns

Factor                  Relation   Comment
Size                    –          Banz (1981)
B/M                     +
E/P                     +          Basu (1977)
CF/P                    +
1/P                     +
T-bills                 –
Dividend Yield          +
Term structure slope
Expected Inflation      –
Credit quality          +          also related to volatility
January                 +
Monday                  –
Contrarian                         ?, Jegadeesh and Titman (1993)
Momentum

Cross-sectional Patterns

There is evidence that lagged variables are useful in predicting stock and bond returns. Many of the results documented in the U.S. are also present in other countries. Table 2.3 provides an overview of these patterns.

Interpretation of these patterns is difficult since many of these variables are highly correlated, and much of the relation each has with returns comes in January. At longer time horizons some of the effects, such as size and E/P, tend to reverse themselves. A common criticism is that these variables may be correlated with the true β when estimates of β are noisy. Chan & Chen () show that average size and estimated beta in size-sorted portfolios are almost perfectly negatively correlated.

Another issue that arises in interpretation of the cross-sectional regularities is whether they are all capturing the same underlying phenomenon. This is especially likely considering price is in many of the variables.

Attempts to disentangle the effects are inconclusive. Some researchers claim size subsumes E/P, while others claim the opposite. Fama and French (1992) claim that size and B/M together subsume E/P (and beta). Given the way these tests are designed, the B/M variable may actually be a proxy for the true beta. A stock that recently declined in price will have a high B/M. This stock is also likely to be more levered than before its decline, so it is now riskier and should have a higher beta. However, the beta estimate is generally based on returns several years prior, so the recent downturn is likely to be washed out. In the end, the estimated beta may be too low, and the high B/M may capture the added risk of the stock. Alternatively, the B/M results may be due to survivorship biases in the COMPUSTAT tapes.

There are several calendar related patterns in returns. Most famous is the January effect, where returns are much larger in January than in other months. Possible explanations include tax-based trading, window dressing by institutions, and liquidity trading. The January effect is most pronounced for small firms.

The weekend effect describes the large negative returns from Friday close to Monday close. It is not clear that all the abnormal return is due to the weekend period, but Monday returns alone do not seem to account for the entire effect. International evidence is mixed with respect to weekly patterns, but many of the Asian markets have a Tuesday effect, which corresponds to Monday trading in the U.S. There is some evidence that most of the returns each month occur during the first two weeks. This may be due to portfolio rebalancing caused by month-end salaries. Finally, there is a holiday effect, where one third of the annual returns occur on the trading days preceding the eight holidays on which the market is closed.4

4 This is misleading since positive and negative returns cancel out.

In a clever paper Berk (1995) addresses the fact that price is directly related to size. The basic logic is very simple: risky firms will be discounted at a higher rate, therefore current market values will be smaller. This will give the appearance that small firms have higher returns, even though firm size (future cashflows) and risk may be unrelated. Consider a set of firms with log future cash flows c, log price p, and log return r = c − p. Further assume size and risk are independent. Now regress returns on beginning of period size

$$r = \alpha_1 + \beta_1 p + \varepsilon_1.$$

The sign of $\beta_1$ depends on the covariance between r and p

$$\mathrm{cov}(r, p) = \mathrm{cov}(c - r, r) = \mathrm{cov}(c, r) - \mathrm{var}(r) = -\mathrm{var}(r) < 0.$$

Thus we should expect a negative relation between firm size and returns. Now consider a regression of actual returns on expected (model) returns $\hat{r}$

$$r = \alpha_2 + \beta_2\hat{r} + \varepsilon_2.$$

Take the pricing errors $\varepsilon_2$ and regress them on current size

$$\varepsilon_2 = \alpha_3 + \beta_3 p + \varepsilon_3.$$

The sign of this regression coefficient depends on the covariance

$$\mathrm{cov}(p, \varepsilon_2) = \mathrm{cov}(c - r, \varepsilon_2) = \mathrm{cov}(c, r - \alpha_2 - \beta_2\hat{r}) - \mathrm{cov}(\alpha_2 + \beta_2\hat{r} + \varepsilon_2, \varepsilon_2) = -\mathrm{var}(\varepsilon_2) < 0.$$

This shows that size is negatively related to pricing errors. How much of the variation in actual returns is explained by size? Decompose the $R^2$ from the first regression

$$R^2 = \frac{\mathrm{var}(\beta_1 p)}{\mathrm{var}(r)} = \beta_1^2\frac{\mathrm{var}(p)}{\mathrm{var}(r)} = \left[\frac{\mathrm{cov}(r, p)}{\mathrm{var}(p)}\right]^2\frac{\mathrm{var}(p)}{\mathrm{var}(r)} = \frac{\mathrm{var}(r)}{\mathrm{var}(p)} = \frac{\mathrm{var}(r)}{\mathrm{var}(c - r)} = \frac{\mathrm{var}(r)}{\mathrm{var}(r) + \mathrm{var}(c)}$$

The larger the variation in cashflows the lower is the $R^2$. The basic conclusion of the article is that market value will end up capturing unmeasured/unmodeled risks.
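Berk's argument is easy to reproduce in a simulation. The sketch below is only illustrative: the normal distributions, their parameters, the sample size, and the seed are assumptions chosen so the effect is clearly visible, and NumPy is assumed. Log cash flows and log returns are drawn independently, yet log market value still lines up negatively with returns.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Assumed distributions: log cash flows c and log returns r are independent.
sigma_c, sigma_r = 0.2, 0.1
c = rng.normal(loc=5.0, scale=sigma_c, size=n)    # log future cash flow
r = rng.normal(loc=0.10, scale=sigma_r, size=n)   # log return (risk)
p = c - r                                         # log price, i.e. market value

# Slope and R^2 from regressing returns on beginning-of-period size.
beta1 = np.cov(r, p)[0, 1] / np.var(p, ddof=1)
r2 = np.corrcoef(r, p)[0, 1] ** 2

# Population values implied by the decomposition in the text.
beta1_theory = -sigma_r**2 / (sigma_r**2 + sigma_c**2)
r2_theory = sigma_r**2 / (sigma_r**2 + sigma_c**2)

print(beta1, beta1_theory, r2, r2_theory)
```

Raising `sigma_c` relative to `sigma_r` lowers the R², in line with the statement that more variation in cash flows weakens the apparent size effect.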

Time Series Patterns

Asset returns contain patterns in autocorrelations summarized in Table 2.4. Using CRSP stock returns from 1962–1994, portfolio autocorrelations range from 1.3% to 43.1%. Autocorrelations increase with shorter time horizons and are higher in equally-weighted portfolios than value-weighted portfolios. Both of these effects are likely due to higher autocorrelation in smaller stocks, which may be due to non-synchronous trading. There is weak evidence of negative autocorrelations in multi-year returns. In most cases the economic significance of the autocorrelations may be small, as is the proportion of the total variance explained. Individual stocks, especially smaller ones, tend to have negative autocorrelation.


Table 2.4: Correlation Patterns

Horizon       Individual   Portfolio
Daily         –            +
Weekly        –            +
Monthly       –            +
Annual
Multi-year                 –

Variance Ratios

The random walk hypothesis implies the variance of asset returns scales withtime; a T -period return should have a variance T times as large as a one-period return. A similar statistic can be derived using variance differences.Finite sample properties can be significantly improved by using overlappingobservations and making appropriate degrees of freedom adjustments.

Positive autocorrelations suggest variance ratios greater than one. For the equally-weighted portfolios, this seems to be the case, with VR(2) ≈ 1.2, and increasing with longer horizons. VR(16) ranges from 1.5 to 1.9, depending on the time period (this effect is getting smaller as time goes on). These results disappear in value-weighted portfolios. Looking at size-sorted portfolios, the variance ratios are largest for the small-stock portfolios and are close to one for the stocks in the largest decile. For individual securities the variance ratios are close to one in general, and less than one for the longer horizons. This is because there is some negative autocorrelation in individual security returns due to the bid-ask spread.

The combination of negative autocorrelation in individual securities and positive autocorrelation in portfolios implies positive cross-autocorrelations. This phenomenon can be summarized as a stronger correlation between current small-stock returns and lagged large-stock returns than between current large-stock returns and lagged small-stock returns. More directly, large stocks tend to lead smaller stocks. This can help explain the apparent profitability of contrarian strategies.


Long-Horizon Returns

Shiller () and Summers () present models where stock prices have fads or bubbles, causing large slowly decaying swings from fundamental values. Shorter horizon portfolio returns have little autocorrelation, while returns at longer horizons have strong negative autocorrelation. Empirical evidence supports these models, although the tests are based on small sample sizes and lack power. Other empirical results indicate that the variance grows more slowly than the time horizon, also consistent with the model. A general problem is that irrational bubbles in stock prices are not distinguishable from rational time-varying expected returns. Long-horizon returns are also predictable with other variables such as D/P and E/P. These variables can explain roughly a quarter of the variation in two to four year returns, much more than is possible for shorter horizons.

? propose their contrarian viewpoint, where buying losers and selling winners (measured over 3 to 5 year periods) produces excess returns. Others have argued that the excess returns are due to differences in risk, although a rebuttal paper from DeBondt and Thaler disagrees. It is possible that the contrarian results are due to a size effect or some type of distressed-firm effect.

2.8.2 General Procedures

Multivariate tests can eliminate the errors-in-variables problem and increase the precision of parameter estimates. This type of test still does not say why the model is rejected. Consider a multi-beta model of the form

\[ E_t[R_{i,t+1}] = \lambda_{0,t} + \sum_{j=1}^{K} \beta_{i,j,t} \lambda_{j,t}. \]

To test this using a multivariate regression

\[ R_{i,t+1} = \alpha_i + \sum_{j=1}^{K} \beta_{i,j} R_{j,t+1} + \varepsilon_{i,t+1} \tag{2.22} \]

the intercept restriction is

\[ \alpha_i = \lambda_0 \Big( 1 - \sum_{j} \beta_{i,j} \Big). \]

This is equivalent to mean-variance intersection, meaning that the minimum variance boundaries of all the asset returns and minimum variance portfolios intersect at a single point. In other words, a combination of mimicking portfolios lies on the mean-variance frontier.

The multivariate regression in the restricted form uses TN observations to estimate N + 1 parameters. The unrestricted model has 2N parameters to estimate. Tests with longer time series have more power, while those with more assets have a larger size. The restrictions can be tested with the Wald (W), likelihood ratio (LR), or Lagrange multiplier (LM) statistics. These are all asymptotically χ² but may differ in finite samples.

2.8.3 CAPM Tests

The only testable implications of the CAPM are that the market is mean-variance efficient, and for the SL model that the intercept is zero. Roll (1977) indicates that this is inherently impossible to test since the market is unobservable. “Rejecting” the model may simply mean that the proxy is not mean-variance efficient. Conversely, “failing to reject” may mean only that the proxy is mean-variance efficient. In either case, we have not said anything about the mean-variance efficiency of the market. Further, there are always some portfolios which are mean-variance efficient. There is also the issue of conditioning information. The CAPM can hold conditionally but fail unconditionally. Without knowing what conditioning information to use, the models are difficult to test.

Stambaugh (1982) examines the sensitivity to excluded assets in the market proxy, finding inferences are similar regardless of the specific composition of the proxy. Kandel and Stambaugh (1987) and Shanken (1987) estimate the upper bound on the correlation between the proxy and the true market needed to overturn rejection of the model. As long as the correlation is at least 0.70, inferences would not change. Roll and Ross (1994) counter by saying that if the true market portfolio is efficient, cross-sectional relations between expected return and beta are very sensitive to the proxy choice.

As in any statistical test, there is a tradeoff between size and power. Adding assets tends to increase the size of a test in finite samples. A longer time series can considerably increase the power of a test. GMM tests have become popular since they do not rely on normality, homoskedasticity, or uncorrelatedness.

The early evidence was generally supportive of the CAPM, in that the evidence seemed consistent with mean-variance efficiency of the “market” portfolio. Representative studies include Fama and MacBeth (1973), Black, Jensen, and Scholes (1972), and Blume and Friend (1973). In the mid-1970s the “anomalies” literature developed [see Fama (1991) for a review].

Common criticisms of these “anomalies” are sample selection and data snooping biases. Kothari, Shanken, and Sloan (1995) claim that sample selection biases drive the results of Fama and French (1992), although Fama and French (1996b) dispute this claim.

Fama-MacBeth (1973)

FM introduce what has become a classic methodology for empirical asset pricing tests. They test the Black and SL CAPMs using monthly portfolio returns and the equally-weighted NYSE as the market. Their tests examine (i) the linearity of the risk-return tradeoff, (ii) if variables other than β matter, (iii) if the risk premium is positive, and (iv) if the return on the zero-beta portfolio is equal to the riskless rate.

The procedure is as follows. First, portfolios are formed using estimated β of individual securities over a four year period. Since measurement error will systematically affect these portfolios, the betas are reestimated over a five year period and averaged across assets to get portfolio β. The β for each portfolio is recalculated each month over the next four years to cover delistings. Returns for each of the 20 portfolios are regressed on the portfolio betas. This is repeated each month, and the estimated coefficients are averaged over time.
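The two-pass logic of the procedure can be sketched as follows. This is a simplified illustration under stated assumptions, not FM's exact design: betas are estimated once over the full sample rather than over rolling formation and estimation windows, and the name `fama_macbeth` and its arguments are hypothetical.

```python
import numpy as np

def fama_macbeth(returns, market):
    """Simplified two-pass cross-sectional regression.

    returns: T x N array of portfolio returns.
    market:  length-T array of market (proxy) returns.
    Returns (betas, mean_coefs, std_errors), where mean_coefs holds the
    time-series averages of the monthly intercept and slope.
    """
    R = np.asarray(returns, dtype=float)
    m = np.asarray(market, dtype=float)
    T, N = R.shape
    # First pass: time-series slope of each portfolio on the market.
    m_dm = m - m.mean()
    betas = (m_dm @ (R - R.mean(axis=0))) / (m_dm @ m_dm)
    # Second pass: one cross-sectional regression per period of
    # returns on a constant and the estimated betas.
    X = np.column_stack([np.ones(N), betas])
    coefs = np.linalg.lstsq(X, R.T, rcond=None)[0].T   # T x 2
    # Average the period-by-period coefficients; standard errors come
    # from the time-series variation of the coefficients.
    mean_coefs = coefs.mean(axis=0)
    std_errors = coefs.std(axis=0, ddof=1) / np.sqrt(T)
    return betas, mean_coefs, std_errors
```

The average intercept plays the role of the zero-beta (or riskless) rate and the average slope is the estimated risk premium. These standard errors ignore the measurement error in the first-pass betas.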

The results are generally supportive of the Black model but the estimated riskless rate is higher than the market rate. Additional regressions including β² and the asset-specific risk indicate that the risk-return relation is linear and there is no reward for bearing unsystematic risk.

Extensions by Litzenberger and Ramaswamy (1979) and Shanken (1992) explicitly adjust standard errors for the EIV bias rather than form portfolios. Shanken (1992) shows that the standard errors in Fama and MacBeth (1973) do not properly reflect measurement error in β, overstating the precision of the risk premium estimates.

Black, Jensen & Scholes (1972)

Fama-French (1992)

The controversial Fama and French (1992) paper has generated a significant debate in the literature. The general goal of the paper is to assess the relative importance of beta, size, B/M, leverage, and E/P in determining the cross-section of expected returns. These variables had been previously documented as important in the “anomalies” literature. Their general findings are that beta is not systematically related to returns, while size and B/M subsume the other factors.

The methodology employed is basically an extension of the Fama and MacBeth (1973) procedure. The new steps involve the combination of accounting and market data. All accounting data for the fiscal year ending t − 1 is combined with returns measured from July of year t to June of t + 1. Stock price data used to construct accounting ratios is from the beginning of year t, while the size measure is from June of year t. This procedure ensures all explanatory variables are known prior to the return.

In order to preserve the firm-specific accounting information, portfolios are not used in the same way as in FM. Instead, portfolios are used to calculate betas, which are then assigned to all firms in that portfolio. The portfolios are formed by first forming size deciles, then forming beta deciles within each size decile. In both sorts, breakpoints are set based on only the NYSE firms. With these 100 portfolios, each portfolio beta is calculated as the sum of the coefficients on current and prior month CRSP value-weighted returns. The beta for a particular stock can change over time as the stock moves into different portfolios.

This two-way sorting procedure produces variation in beta that is unrelated to size. Univariate statistics show that average returns are related to size, but unrelated to beta. This evidence is confirmed by the FM regressions.

Gibbons (1982)

Gibbons (1982) introduces a multivariate test of the CAPM and rejects the CAPM soundly using LR. He uses the CRSP equally-weighted index as the market, estimates β over a 5 year period, and forms 40 portfolios. This multivariate methodology avoids the EIV problem, provides more precise risk premium estimates, and has more power than previous tests. The nonlinear restriction on the intercept is linearized with a Taylor-series expansion.

Stambaugh (1982)

Stambaugh (1982) shows inferences are not sensitive to proxy choice, but are sensitive to the asset choice. He argues that W lacks power, LR has the wrong size, and LM is closest to its asymptotic distribution. Using a portfolio of stocks, bonds, and preferred, he fails to reject linearity (Black CAPM), but rejects SL. Using fewer assets he rejects both models.

Shanken (1985)

Shanken (1985) provides the asymptotic results for the multivariate tests in Gibbons (1982). He shows that LM < LR < Q∗ (= W). These statistics are all transformations of one another. Shanken uses $Q_C^A$, which includes considerations for sample size and degrees of freedom adjustments. Recalculating Gibbons’ LR statistic, Shanken shows p = 0.75, so the rejection inference is overturned.

The cross-section regression test (CSRT) used in this paper does not require specifying HA. The procedure estimates beta in a first stage, then uses the betas in cross-sectional regressions. The CAPM is rejected using the equally-weighted CRSP index.

MacKinlay (1987)

MacKinlay (1987) discusses the power of multivariate SL CAPM tests. He finds that tests against an unspecified alternative have low power. The type of deviation from the model is important in determining power. These tests have reasonable power against cross-sectional random deviations. However, these tests have low power against omitted factors. He rejects in some subperiods but fails to reject overall.

2.8.4 ICAPM/CCAPM Tests

Tests of a multi-beta model are similar to CAPM tests in that they are really tests of the mean-variance efficiency of a particular combination of portfolios. There is mixed evidence about the importance of durable goods. Habit persistence models perform better in goodness-of-fit tests, but still do not explain the first moment of the equity premium puzzle.

Hansen Singleton

Reject model. See QM notes for more details.


Mehra Prescott (1985)

The equity premium puzzle arises because extreme risk aversion parameters are needed to make the low volatility of aggregate consumption growth in the U.S. consistent with the returns on both equity and T-bills. Some of these results may arise partially because of poorly measured consumption data, but efforts to correct for this still lead to rejections of the model. One possible (partial) explanation for the equity premium puzzle is incomplete markets, which may result in the overestimation of risk aversion. One experiment using log utility (CRRA = 1) results in an estimate based on aggregate consumption of CRRA = 3. Weil () presents the same puzzle from the perspective of the riskless asset.

2.8.5 APT Tests

The testable implications of the APT given in (2.17) are

1. λi ≠ 0 for any i
2. λ0 (= rf) ≥ 0 (debated)
3. linearity

Again, the test really amounts to seeing if a particular combination of portfolios is mean-variance efficient.

To make the intertemporal APT testable, certain restrictions need to be imposed. One alternative is to assume that (i) the observed set of assets has a factor structure, (ii) the noise terms of the observed assets are uncorrelated with the noise terms on the unobserved assets, and (iii) the factors span the state variables. Alternatively, we can assume logarithmic utility in which case the intertemporal APT reduces to the APT. These requirements are very similar to the ICAPM.

As mentioned in Section 2.4.2, the APT has features which make testing difficult. In fact, one view is that APT is not testable [e.g., Shanken (1982), Reisman (1992)], whereas others [Ingersoll (1984), others ??] claim it is. The primary reason for this disagreement is the approximate nature of the model. Are deviations from the exact model due to the approximation or are they genuine deviations from the model itself? The test then becomes a joint test of the model and the additional assumptions needed to impose the exact pricing relation. The APT and ICAPM are not empirically distinguishable. The “pervasive factors” in the APT world can coincide with the “state variables” in the ICAPM world.


The test of the model requires estimation of both the factor loadings (B) and the factor prices (F). The two primary testing approaches differ in the order these variables are estimated. Cross-sectional tests estimate (B) in the time series, then use these estimates for a number of firms to estimate (F) in the cross-section. The time series tests perform the estimation in the reverse order.

Fama and MacBeth (1973) provide the basic approach for the cross-sectional test [see Section 2.8.3 for details]. Some of these tests estimate the factors statistically while others use economic specifications. Chen, Roll, and Ross (1986) specify five economic variables as factors: industrial production, unexpected inflation, changes in expected inflation, credit quality, and a term premium. They find that the specification is good in the sense that many of these factors are priced and additional factors such as the market return, consumption growth, and changes in oil prices are not priced. Chan, Chen, and Hsieh (1984) perform a study similar to CRR, but are also able to explain the size anomaly. However, Shanken and Weinstein (1990) reply that these two studies are sensitive to the portfolio formation used. Specifically, forming size-based portfolios at the end of the estimation period causes misestimation of the βs to show up systematically in the size portfolios, biasing the subsequent risk premium estimate.

The time series test method was originally proposed by Black, Jensen, and Scholes (1972). Factor prices are estimated in the first pass, and their sensitivity in the second pass. The null hypothesis is that the intercept is zero (or α = (1 − Bi)λ0 in the absence of a riskless asset).

In summary, the tests of the APT generally reject the model, but the APT seems to perform better than alternatives such as CAPM. The APT has been used in applications which offer indirect evidence of its success as well. In fund performance tests, the model indicates fund managers have negative Jensen’s alphas, which is similar to the result from the CAPM models (the magnitudes differ though). In calculating the cost of capital, CAPM and APT yield similar results. In event studies the APT does not seem to offer much gain over a single factor model.

2.8.6 Present Value Relations

The history of volatility and returns tests shows a flip-flop of results. The early variance bounds tests rejected the present value models, whereas the returns tests failed to reject. More recently, volatility bounds tests provide mixed evidence, but the returns tests now reject the model.

Volatility tests

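Denote the “perfect foresight” price

\[ p^*_t = \sum_{\tau=1}^{\infty} \beta^{\tau} d_{t+\tau} \]

Then pt = E[p∗t] or

\[ p^*_t = E[p^*_t] + \varepsilon_t = p_t + \varepsilon_t \]

\[ \mathrm{var}(p^*_t) = \mathrm{var}(p_t) + \mathrm{var}(\varepsilon_t) \geq \mathrm{var}(p_t) \tag{2.23} \]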

This says actual prices should be less volatile than the “model” price from the dividend series. In fact, we find the opposite. Actual prices are more volatile than would be expected from dividends.

There are several problems with the above test. First, the price series is nonstationary so it needs to be modified. Second, the infinite sum is a problem in a finite sample. This can be overcome by including a terminal value in the distant future. Third, the observed dividend series is not a series of independent observations, but rather a single realization. This creates a small sample problem in implementing the test. Fourth, there is no way to capture time-varying expected returns in this framework. Finally, different specifications of the investors’ information sets lead to different critical values, making interpretation difficult. In summary, there are several necessary adjustments to the variance bounds test. Even after making these adjustments, there is no way to hold size constant so there is no way to meaningfully compare the power of this test to alternatives.
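The terminal-value fix for the infinite sum can be sketched with the backward recursion p∗t = β(dt+1 + p∗t+1). This is an illustration only: the function name `perfect_foresight_price` is hypothetical, and the choice of terminal value is left to the user as an assumption, since the notes do not specify one.

```python
import numpy as np

def perfect_foresight_price(dividends, beta, terminal_price):
    """Perfect-foresight price series built backward from a terminal value.

    dividends:      array d[0..T-1] of realized dividends.
    beta:           constant discount factor.
    terminal_price: assumed value of p* at the last date (user-supplied).
    """
    d = np.asarray(dividends, dtype=float)
    T = d.size
    p_star = np.empty(T)
    p_star[-1] = terminal_price
    # p*_t discounts next period's dividend plus next period's p*.
    for t in range(T - 2, -1, -1):
        p_star[t] = beta * (d[t + 1] + p_star[t + 1])
    return p_star
```

Given an actual price series `p` of the same length, the bound in (2.23) can then be examined by comparing `np.var(p)` with `np.var(perfect_foresight_price(d, beta, terminal))`, after whatever detrending is used to handle nonstationarity.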

Shiller (1981) uses the perfect foresight price decomposition to derive variance bounds. He finds the actual price is five to thirteen times more volatile than the perfect foresight price. His analysis indicates that the price change volatility is highest when information about dividends is revealed smoothly. Large, occasional information releases result in prices with lower variance but higher kurtosis.


Returns tests

Tests of long horizon returns have found that there is significant negative autocorrelation over the three to five year horizon, indicating a tendency for mean reversion.

Orthogonality tests

A model-free version is not subject to the nuisance parameter problem which plagues the variance bounds test. Both the model-free and the model-based orthogonality tests are better-behaved econometrically than the returns tests.

Chapter 3

Fixed Income

3.1 Introduction

The pricing of bonds differs from pricing other assets such as equity primarily because bonds are nonlinear. A bond has:

1. fixed, known maturity
2. fixed, known terminal (face) value
3. fixed, known periodic cash flows
4. thinner trading (at least for “older” issues)

Term structure models can be viewed as time series models of the stochastic discount factor.

Duration, Convexity

3.2 Term Structure Basics

3.3 Inflation and Returns

3.4 Forward Rates

Forward rates had been viewed simply as forecasts of expected future spot rates (PEH). Fama (??) shows that the forward rates also contain expectations of the premium above one month T-bills.

• Holding period return is the change in log price on a particular bond from one period to the next.


• The forward rate is the difference in the log prices of bonds of different maturities at the same point in time.

• Premium is the holding period return less the one month spot rate.

Fama (1984)

Fama uses a regression approach to separate the information about expected future spot rates from information about the expected premium.

1. premium = f(forward - spot)
2. ∆ spot = f(forward - spot)

Results are that forward rates can predict premiums which vary through time and the expected future spot rate up to five months out. Froot has a response to Fama’s finding, suggesting that Fama ignores systematic expectations errors.
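The definitions of Section 3.4 and the first regression above can be sketched from a panel of log discount-bond prices. The layout of `log_prices`, the function names, and the indexing convention are assumptions made for illustration; the second regression (change in spot on forward minus spot) follows the same pattern with the future spot change as the dependent variable.

```python
import numpy as np

def premium_and_spread(log_prices, n):
    """Build the premium and forward-spot spread for an n-period bond.

    log_prices[t, k] is the log price at date t of a discount bond with
    k periods to maturity (column 0 is zero: the bond pays one at maturity).
    """
    p = np.asarray(log_prices, dtype=float)
    spot = -p[:-1, 1]                      # one-period spot rate at t
    forward = p[:-1, n - 1] - p[:-1, n]    # difference in log prices at t
    hpr = p[1:, n - 1] - p[:-1, n]         # change in log price, t to t+1
    premium = hpr - spot                   # holding period return less spot
    return premium, forward - spot

def ols_slope(y, x):
    """Slope from a univariate OLS regression of y on x with an intercept."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    x_dm = x - x.mean()
    return float(x_dm @ (y - y.mean()) / (x_dm @ x_dm))
```

A slope near one in `ols_slope(premium, spread)` would say that variation in the forward-spot spread reflects variation in the expected premium rather than in expected spot rate changes.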

Fama–Bliss (1987)

Find that forward rate forecasts of near-term changes in interest rates are poor, but forecast power increases at longer time horizons. Interpret this as evidence of a slow mean-reverting process. Also find evidence of time-varying expected premiums, and that the ordering of risks and rewards changes with the business cycle.

Stambaugh (1988)

An affine yield model implies a latent variable structure for bond returns. Fewer state variables than forecasting variables puts testable restrictions on forecasting equations for bond returns. Reject CIR with non-matched maturities (avoids measurement error). Addresses source of errors, their consequences, and how the choice of instruments affects the outcome of the tests.

3.5 Bond Pricing

Like any asset, bonds can be priced using the pricing kernel approach presented in Section 2.5. Begin with the fundamental pricing equation

\[ 1 = E_t[M_{t+1} R_{n,t+1}]. \]

The uppercase M is used to distinguish it from logs and the n subscript indicates the time to maturity. The return can be expressed as the relative price change

\[ R_{n,t+1} = P_{n-1,t+1} / P_{n,t}. \]

Substituting this into the pricing equation gives

\[ P_{n,t} = E_t[M_{t+1} P_{n-1,t+1}]. \]

Recursive substitution and the fact that the bond is worth a dollar at maturity gives another representation

\[ P_{n,t} = E_t[M_{t+1} \cdots M_{t+n}]. \]

In this light a bond pricing model is really a time series model of the stochastic discount factor.

Fixed income models are broadly categorized as either stochastic interest rate models or stochastic term structure models. Stochastic interest rate models begin by specifying a process dr for the short rate. The problem with this approach is that the model price of the bond may not equal the market price. The short rate process also implies prices for bonds of other maturities and these may be mispriced as well. The stochastic term structure models use the observed market prices and estimates of the volatility structure to infer the stochastic process of the short rate. This information is then used to get a distribution for the bond price.

3.6 Affine Models

Affine yield models represent a class of relatively simple models in which all relevant variables are conditionally log-normal and log yields are linear in state variables. Affine forward rates imply affine yields. Taking logs of the pricing relation

\[ p_{n,t} = E_t[m_{t+1} + p_{n-1,t+1}] + \tfrac{1}{2} \mathrm{var}(m_{t+1} + p_{n-1,t+1}). \]

A model with k state variables implies that the term structure can be summarized by the levels of k bond yields at each point in time and the constant coefficients relating the bond yields. In this sense affine yield models are linear; they are non-linear in the evolutionary process of the k basis yields and the relation between the cross-sectional coefficients and the underlying parameters of the model.


Table 3.1: Single Factor Stochastic Interest Rate Models

dr = (α + βr)dt + σr^γ dZ

Model          α    β    γ     Specification
Merton (ABM)        0    0     α dt + σ dZ
Vasicek                  0     (α + βr)dt + σ dZ
CIR SR                   1/2   (α + βr)dt + σ√r dZ
Courtadon                1     (α + βr)dt + σr dZ
Dothan         0    0    1     σr dZ
GBM            0         1     βr dt + σr dZ
CIR VR         0    0    3/2   σr^(3/2) dZ
CEV            0               βr dt + σr^γ dZ
Duffie-Kan               1/2   (α1 + β1r)dt + (α2 + β2r)^γ dZ

Assumptions

• distribution of the SDF is conditionally lognormal;
• bond prices are jointly lognormal with the SDF;
• (additional strong assumptions): homoskedastic mt+1 (Vasicek)

Properties

• Log prices (and yields) are affine in state variables.
• Analytic solution of pricing equations (outside affine yield generally requires numerical solutions e.g., Black, Derman, and Toy).
• Trivial rejection of model without addition of an error term.
• Limits the way in which interest rate volatility can change with the level of interest rates.
• Implies risk premia on long bonds always have the same sign (single-factor).
• Applies to real bonds only ?
• The model can be renormalized so that the yields themselves are the state variables (e.g., a two-factor model would use two yields).


3.6.1 Vasicek

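\[ dr = \kappa(\theta - r)dt + \sigma dB \]

\[ y_{1t} = x_t - \beta^2\sigma^2/2 \quad \text{and} \quad -p_{nt} = A_n + B_n x_t \]

To get this model begin by writing the sdf as a forecast and an innovation

\[ -m_{t+1} = x_t + \varepsilon_{t+1}. \]

The sign is a convention. Assume that xt+1 follows an AR(1) process with innovation ξt+1 (variance σ²) and, for simplicity, that the sdf innovation εt+1 is proportional to ξt+1, dropping any component of εt+1 that is uncorrelated with ξt+1

\[ x_{t+1} - \mu = \phi(x_t - \mu) + \xi_{t+1} \quad \text{and} \quad \varepsilon_{t+1} = \beta\xi_{t+1}. \]

Now consider the log price of a one period bond

\[ p_{1,t} = E_t[m_{t+1}] + \tfrac{1}{2}\mathrm{var}(m_{t+1}) = -x_t + \tfrac{1}{2}\beta^2\sigma^2 = -y_{1,t}. \]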

• Allows interest rates to be negative (OK for real, not nominal).
• Can handle rising, inverted, and humped yield curves, but not inverted humped curves.
• Price of interest rate risk is a constant that does not depend on the level of the short rate.
• Interest rate changes have constant variance.
• Limiting forward rate cannot be both finite and time-varying.
• Log forward rate curve tends to slope downwards unless β is sufficiently small.
• Random walk is a special case.
• B measures the sensitivity of the n-period bond return to the one-period interest rate (and the state variable). This sensitivity increases in maturity, and is always less than the maturity.
• Average short rate is µ − β²σ²/2.
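The coefficients An and Bn can be computed recursively. Substituting the guess −pnt = An + Bn xt and the AR(1) process into the log pricing relation of Section 3.6 gives Bn = 1 + φBn−1 and An = An−1 + (1 − φ)µBn−1 − (β + Bn−1)²σ²/2, starting from A0 = B0 = 0. The recursion is a derivation from the equations above rather than something stated in these notes, and the function names below are hypothetical.

```python
import numpy as np

def vasicek_coefficients(n_max, mu, phi, beta, sigma):
    """Affine coefficients for -p_n = A_n + B_n * x, n = 0..n_max."""
    A = np.zeros(n_max + 1)
    B = np.zeros(n_max + 1)
    for n in range(1, n_max + 1):
        # Loading on the state variable: geometric sum in phi.
        B[n] = 1.0 + phi * B[n - 1]
        # Constant: drift of x minus the lognormal variance term.
        A[n] = (A[n - 1] + (1.0 - phi) * mu * B[n - 1]
                - 0.5 * (beta + B[n - 1]) ** 2 * sigma ** 2)
    return A, B

def vasicek_yields(x, A, B):
    """Log yields y_n = (A_n + B_n * x) / n for maturities 1..n_max."""
    n = np.arange(1, A.size)
    return (A[1:] + B[1:] * x) / n
```

For n = 1 this reproduces y1t = xt − β²σ²/2, and with φ < 1 the loading Bn = (1 − φⁿ)/(1 − φ) increases with maturity while staying below n for n > 1, matching the property listed above.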

3.6.2 The CIR Model

dr = κ(θ − r)dt+ σ√rdB


The basic CIR model is a general equilibrium, continuous time model of the real returns on the asset in an economy [see section 2.3.5]. The general model is specialized to the term structure in ?. The asset is used to smooth consumption, so its value depends on its hedging effectiveness, or its covariance with consumption. The model is derived in an option pricing framework by constructing a riskless synthetic portfolio, which must earn the riskless rate in equilibrium. The hedge portfolio is constructed of bonds of differing maturities; it is assumed that the market price of risk is the same for bonds of all maturities. A recursive approach must be used to solve the model. Although the model claims to endogenously derive the interest rate process, it is a direct consequence of the specification of the state variable.

Assumptions

• identical individuals with time-additive log utility (Dunn and Singleton relax this assumption but do not have much success)
• xt+i and mt+i are normal conditional on xt for i = 1, but non-normal for i > 1.
• y1t = −p1t = xt(1 − β²σ²/2); y1t is proportional to the state variable and its conditional variance is proportional to its level.
• restricts interest rates to be positive

Predictions

• Variance proportional to the state variable.
• All bond returns are perfectly correlated (general prediction of all single-factor models).
• Prices are a deterministic function of the parameters, the short rate, and maturity; an error term must be specified to keep the model testable.
• The long rate converges to a constant.
• Stable parameters (λ, κ, θ, σ).
• Forward rate $f_{nt} = -B_n^2 x_t \sigma^2/2$
• time variation in term premia ?

3.6.3 Duffie-Kan Class

The Duffie-Kan model is the most general affine model possible. It nests all the common models as special cases.

\[ dr = \kappa(\theta - r)dt + \sqrt{\alpha + \beta r}\,dZ \]


3.6.4 Other Single Factor Models

HJM

Ho-Lee

BDT

3.6.5 Alternatives

• Non-linear models (γ = 3/2)
• Non-parametric models
• Markov switching models
• GARCH
• Higher-order ARMA processes
• Several state variables

3.7 Multi-Factor Models

Longstaff and Schwartz (2–factor)

\[ -m_{t+1} = x_{1t} + x_{2t} + x_{1t}^{1/2}\varepsilon_{t+1} \]

\[ p_{1t} = -x_{1t} - x_{2t} + x_{1t}\beta^2\sigma_1^2/2 \]

• second factor (instantaneous variance of changes in short rate) avoids implication that all bond returns are perfectly correlated
• variance of innovation to log SDF is proportional to the level of x1t and is conditionally correlated with x1t but not with x2t.
• One-period yield is no longer proportional to x1t and the short rate alone is no longer sufficient to describe the state of the economy.
• The model is a generalization of the square-root model
• it can also generate inverted humped yield curves.
• Whenever the SDF can be expressed as the sum of two independent processes, the resulting term structure is the sum of the term structures that would exist under each of these processes.

3.8 Empirical Tests

3.8.1 Brown & Dybvig (1986)

• Nominal, prices, cross-sectional, ML


Table 3.2: Summary of Empirical Results

Paper        Data(a)  Methods(b)  Notes                        Results
BD (1986)    P,N      C,ML        iid errors                   r̂ > r, σ not constant
BS (1994)    P,R      C,ML                                     Unstable est., don’t support
                                                               mean reversion, σ > 0 binds
CKLS (1992)  Y,N      TS,GMM      assume normality             reject γ < 1, unconstr. γ = 1.5,
                                                               mean reversion not important
GR (1993)    Y,R      TS,GMM      forecast R from N,           fail to reject CIR, plausible
                                  use non-central χ²           estimates, fit short bonds better
PS (1994)    P,N      TS,ML       second factor for inflation, unstable/unrealistic estimates,
                                  non-central χ²               reject original and two factor CIR
LS (1992)    Y,N      C,GMM       second factor for volatility reject single factor model,
                                  estimated with GARCH         2–factor holds for short and int bonds

(a) Price or Yield; Nominal or Real.
(b) Cross-section or Time series, Econometric Method.


• Assume pricing errors are iid - a strong assumption given the differences in trading frequency across maturities; an alternative is to assume variance increases with maturity and is correlated across maturities.
• Estimated r systematically overstates implied short rates (recall Fama MacBeth; Merton’s model of heterogeneous information sets).
• Find estimated variance is erratic, although similar in magnitude to CIR weekly time series estimates.
• Find annual average of implied standard deviation ($\overline{\sigma\sqrt{r}}$) appears to be an unbiased predictor of time series estimate of the standard deviation of changes in the short rate.
• Bills appear to be better described by the model than bonds.
• Discount issues’ prices are underestimated, premiums are overestimated.
• Evidence that the errors are not iid.

3.8.2 Brown & Schaefer (1994)

• Real, prices, cross-sectional, ML
• CIR model is generally able to replicate observed yield curve shapes
• Pricing errors are generally within the bid–ask spread
• Parameter estimates are unstable, especially κ + λ
• Positivity constraint on σ² binds in many cases
• Cross-sectional estimates of variance are not unbiased estimates of the time series estimates.
• evidence on mean reversion is generally not supportive

3.8.3 Chan, Karolyi, Longstaff & Sanders (1992)

CKLS present a generalized model that nests eight popular interest rate processes.

\[ dr = (\alpha + \beta r)dt + \sigma r^{\gamma} dZ \]

• Nominal, yields, time series, GMM
• The γ term seems to be the most important; models with γ < 1 are all rejected, and those with γ = 1.5 fare the best. The unrestricted estimate of γ is 1.5, and is significantly different from unity.
• The mean reversion process, which adds considerable complexity to the model, does not appear to be of major importance.
• Results are trouble for single-factor affine yield models. First, without mean reversion, the term structure may increase initially, but will then be downward sloping. Second, with γ > 0.5, the models become intractable and must be solved numerically.
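The nested processes can be explored by simulating the general specification with a simple Euler discretization. This is a sketch for intuition only, not the GMM estimation CKLS perform; the function name `simulate_ckls`, the truncation of the rate at zero inside the diffusion term, and the default seed are all assumptions of the example.

```python
import numpy as np

def simulate_ckls(r0, alpha, beta, sigma, gamma, dt, n_steps, seed=0):
    """Euler path of dr = (alpha + beta*r)dt + sigma * r**gamma dZ."""
    rng = np.random.default_rng(seed)
    r = np.empty(n_steps + 1)
    r[0] = r0
    for t in range(n_steps):
        dz = rng.standard_normal() * np.sqrt(dt)
        # Truncate at zero so r**gamma is defined for fractional gamma
        # (an assumption of this sketch, not part of the model).
        level = max(r[t], 0.0)
        r[t + 1] = r[t] + (alpha + beta * r[t]) * dt + sigma * level ** gamma * dz
    return r
```

Setting gamma to 0, 1/2, 1, or 3/2 (with the matching restrictions on alpha and beta from Table 3.1) gives the Vasicek, CIR SR, Courtadon, and CIR VR cases, so paths can be compared to see how the level of the rate feeds into its volatility.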


3.8.4 Gibbons & Ramaswamy (1993)

• Forecast real returns on nominal bonds in a time series setting (assume inflation is independent of the real SDF ?)
• GMM in a time series
• Fail to reject CIR, obtain plausible parameter estimates
• Reject with off-the-run bonds (measurement error and a small sample).
• Model fits short end of term structure better than longer maturities.
• Find some evidence of autocorrelation in returns

3.8.5 Pearson & Sun (1994)

• Nominal, prices, time series, ML
• Generalize square-root model to allow the variance of the state variable to be linear in the level of the state variable.
• Also include a second factor: expected inflation.
• Reject original and two-factor CIR model.
• Unrealistic parameter estimates.
• Unstable parameter estimates (across datasets).
• Within sample prediction has no power and is little better than a naive prediction of current values.

3.8.6 Longstaff & Schwartz (1992)

• second factor for volatility estimated using GARCH
• test cross-sectional restrictions with GMM
• Find model holds for both short- and intermediate-term maturities
• Reject single-factor model

Chapter 4

Derivatives

4.1 Introduction

Virtually all derivatives pricing is based on some sort of arbitrage argument. This chapter outlines derivative pricing in terms of both discrete- and continuous-time models. Several derivations of each model are given to show the links between them. More advanced topics are covered rather superficially.

4.2 Binomial Models

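Binomial option pricing is a special case of the Arrow-Debreu pricing presented in Section 2.4.1. Standardize the asset to have a price of $1, a value of \(u\) in the “up state,” and a value of \(d\) in the “down state.” Recall \(X\phi = p\) and \(X'\alpha = b\), so

\[
\begin{bmatrix} u & d \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} \phi_1 \\ \phi_2 \end{bmatrix}
=
\begin{bmatrix} 1 \\ p_f \end{bmatrix}
\]

where \(p_f = 1/R_f\) is the price of the riskless bond. Solving these equations gives the state prices

\[ \phi_1 = \frac{R_f - d}{R_f(u-d)} \quad\text{and}\quad \phi_2 = \frac{u - R_f}{R_f(u-d)} \]

and, scaling by \(R_f\), the risk-neutral probabilities

\[ \pi_1 = \frac{R_f - d}{u-d} \quad\text{and}\quad \pi_2 = \frac{u - R_f}{u-d}. \]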


The binomial model is based on a replication argument. Consider positions in a stock and a bond such that the portfolio replicates the payoffs on an option in the next period. That is, we want to find holdings in the stock and bond, \(\Delta\) and \(B\), so that the value of the position is \(C_u\) in the up-state and \(C_d\) in the down-state:

\[ Su\Delta + R_f B = C_u \quad\text{and}\quad Sd\Delta + R_f B = C_d. \]

Solving these equations gives

\[ \Delta = \frac{C_u - C_d}{S(u-d)} \quad\text{and}\quad B = \frac{uC_d - dC_u}{R_f(u-d)} = \frac{C_u - Su\Delta}{R_f}. \]

The stock holding \(\Delta\) has the interpretation of the partial derivative of the call price with respect to the stock price. The current price of the option is

\[ C = \Delta S + B = \frac{\pi C_u + (1-\pi)C_d}{R_f}. \]
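As a minimal numerical sketch of the replication argument, the following Python snippet computes the stock and bond holdings for a one-period call and checks that the cost of the replicating portfolio matches the risk-neutral valuation. The function name `replicate` and the input values (S = 100, K = 100, u = 1.2, d = 0.9, R_f = 1.05) are illustrative assumptions, not taken from the notes.

```python
def replicate(S, u, d, Rf, Cu, Cd):
    """One-period replication of an option paying Cu (up) or Cd (down)."""
    delta = (Cu - Cd) / (S * (u - d))          # stock holding
    bond = (u * Cd - d * Cu) / (Rf * (u - d))  # bond holding (negative = borrowing)
    pi = (Rf - d) / (u - d)                    # risk-neutral up probability
    price_replication = delta * S + bond
    price_risk_neutral = (pi * Cu + (1 - pi) * Cd) / Rf
    return delta, bond, price_replication, price_risk_neutral


# Hypothetical inputs chosen only for illustration.
S, K, u, d, Rf = 100.0, 100.0, 1.2, 0.9, 1.05
Cu = max(S * u - K, 0.0)
Cd = max(S * d - K, 0.0)
delta, bond, price_rep, price_rn = replicate(S, u, d, Rf, Cu, Cd)
print(delta, bond, price_rep, price_rn)
```

The two prices agree because both are solutions of the same pair of linear payoff equations.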

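To implement this approach we need to calculate \(u\) and \(d\). The formal specification is

\[
u = \exp\left(\frac{\mu_a\tau}{n} + \sigma_a\frac{\sqrt{\tau}}{\sqrt{n}}\sqrt{\frac{1-\theta}{\theta}}\right)
\quad\text{and}\quad
d = \exp\left(\frac{\mu_a\tau}{n} - \sigma_a\frac{\sqrt{\tau}}{\sqrt{n}}\sqrt{\frac{\theta}{1-\theta}}\right)
\]

but a “shortcut” specification is

\[
u = \exp\left(\sigma_a\frac{\sqrt{\tau}}{\sqrt{n}}\right)
\quad\text{and}\quad
d = \exp\left(-\sigma_a\frac{\sqrt{\tau}}{\sqrt{n}}\right).
\]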

The subscript \(a\) indicates annual figures, and continuous compounding should be used. The life of the option is \(\tau\) and there are \(n\) periods in the binomial tree. The corresponding riskless rate is \(R_f = \exp(r_a\tau/n)\).

Solving for the price of the option uses a recursive algorithm. At the expiration of the option the value is given by \(C = \max(S_T - K, 0) = (S_T - K)^+\). Using these values, the option price at \(T - 1\) can be calculated. Stepping backwards through the tree gives the initial option price.
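The backward recursion can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions: it uses the “shortcut” specification for u and d, a European call, and hypothetical inputs (S = 100, K = 100, r = 5%, σ = 20%, one year, 500 steps); the function name is not from the notes.

```python
import math


def binomial_european_call(S, K, r, sigma, tau, n):
    """European call on an n-step tree with the shortcut u and d."""
    dt = tau / n
    u = math.exp(sigma * math.sqrt(dt))
    d = math.exp(-sigma * math.sqrt(dt))
    Rf = math.exp(r * dt)                 # gross riskless return per step
    pi = (Rf - d) / (u - d)               # risk-neutral up probability

    # Payoffs at expiration; index j counts the number of up-moves.
    values = [max(S * u**j * d**(n - j) - K, 0.0) for j in range(n + 1)]

    # Step backwards: each node is the discounted risk-neutral expectation.
    for step in range(n, 0, -1):
        values = [(pi * values[j + 1] + (1 - pi) * values[j]) / Rf
                  for j in range(step)]
    return values[0]


# Hypothetical inputs chosen only for illustration.
call_price = binomial_european_call(100.0, 100.0, 0.05, 0.20, 1.0, 500)
print(round(call_price, 4))
```

With these inputs the tree price should lie close to the corresponding Black Scholes value of roughly 10.45.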

To get the price of a European put option, put-call parity can be used. This is an arbitrage argument that requires

\[ S + P = C + Ke^{-r\tau}. \]

Volatility does not enter the equation directly since it affects the put and call in the same way.

Table 4.1: Early Exercise of American Options

            Call              Put
d = 0       Never             In the money
d > 0       Before ex-date    After ex-date

(In this table d denotes the dividend, not the down-move factor.)

If the option is American it is necessary to check for early exercise at each node in the tree. To do so, simply use \(C = \max(C^h, C^x)\), where \(h\) indicates the hold value as calculated above and \(x\) is the early exercise value. Early exercise is never optimal for a call on a stock that does not pay dividends. For the put to be exercised early it must be sufficiently in the money. If the stock does pay dividends, calls may be exercised just before the ex-date and puts just after the ex-date.
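The early exercise check amounts to taking the larger of the hold value and the exercise value at every node. The sketch below prices a put on a non-dividend-paying stock with and without that check; the function name, the `american` flag, and the inputs are illustrative assumptions rather than anything specified in the notes.

```python
import math


def binomial_put(S, K, r, sigma, tau, n, american=True):
    """Put on an n-step tree; optionally check early exercise at each node."""
    dt = tau / n
    u = math.exp(sigma * math.sqrt(dt))
    d = math.exp(-sigma * math.sqrt(dt))
    Rf = math.exp(r * dt)
    pi = (Rf - d) / (u - d)

    # Payoffs at expiration; index j counts the number of up-moves.
    values = [max(K - S * u**j * d**(n - j), 0.0) for j in range(n + 1)]

    for step in range(n - 1, -1, -1):
        for j in range(step + 1):
            hold = (pi * values[j + 1] + (1 - pi) * values[j]) / Rf
            if american:
                exercise = K - S * u**j * d**(step - j)
                values[j] = max(hold, exercise)   # compare hold and exercise values
            else:
                values[j] = hold
    return values[0]


# Hypothetical inputs chosen only for illustration.
american_put = binomial_put(100.0, 100.0, 0.05, 0.20, 1.0, 200, american=True)
european_put = binomial_put(100.0, 100.0, 0.05, 0.20, 1.0, 200, american=False)
print(round(american_put, 4), round(european_put, 4))
```

The difference between the two values is the early exercise premium, which is positive here because the put is worth exercising early when it is sufficiently in the money.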

The number of steps in the tree affects the computed option price. The model value converges to the true value as the number of steps gets large, but at a computational expense. The model price generally changes very little after about a hundred steps. There is an “odd-even” effect where the calculated value oscillates between over- and under-valued as the number of steps is incremented. To remove this error, you can use a weighted average of prices calculated with \(n-1\), \(n\), and \(n+1\) steps.

4.2.1 Alternative Derivations

CAPM-based derivation

The standard CAPM result is

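\[ E[r_i] = r_f + \beta_i\,(E[r_m] - r_f) \]

and \(\beta_i = \sigma_{im}/\sigma_m^2 = \rho_{im}\sigma_i/\sigma_m\). Let \(\lambda_i = \rho_{im}\,[E[r_m] - r_f]/\sigma_m\), the correlation-adjusted market risk premium. Rewriting the CAPM relation,

\[ E[r_i] = r_f + \lambda_i\sigma_i \]

or, in gross returns,

\[ E[R_i] = R_f + \lambda_i\sigma_i. \]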

Now assume asset \(i\) is an option written on a stock whose returns follow a binomial process. \(P^u\) and \(P^d\) are the end-of-period prices in the up- and down-states, with \(\theta\) the true probability of the up-state. The current price is given by \(P\). Then

\[ E[R_i] = \frac{\theta P^u + (1-\theta)P^d}{P} \]

and

\[ \sigma_i = \frac{P^u - P^d}{P}\sqrt{\theta(1-\theta)}. \]

Substituting these expressions into the modified CAPM expression and rearranging yields

\[ P = \frac{P^u\pi + P^d(1-\pi)}{R_f} \]

where the risk-neutral probability \(\pi = \theta - \lambda_i\sqrt{\theta(1-\theta)}\) is a function of the true probabilities and the correlation-adjusted market price of risk. To avoid arbitrage, all assets must be priced with the same risk-neutral probabilities. Every dollar invested in the stock should be priced according to

\[ 1 = \frac{u\pi + d(1-\pi)}{R_f}. \]

Rearranging and solving for \(\pi\) gives

\[ \pi = \frac{R_f - d}{u - d}. \]

Note that when \(\lambda_i = 0\), \(\pi = \theta\). This happens when investors are actually risk-neutral or if the security is uncorrelated with the market. With \(\lambda_i > 0\) the risk-neutral probabilities overstate the true probabilities in unfavorable states and understate them in favorable states.
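A quick numerical check of the two expressions for the risk-neutral probability can be sketched as follows. The snippet backs out the stock's own λ from its expected gross return and standard deviation, which is an assumption made here for illustration (it is what the modified CAPM relation implies when applied to the stock itself); the function name and the inputs are likewise hypothetical.

```python
import math


def risk_neutral_probabilities(u, d, Rf, theta):
    """Compare pi = theta - lambda*sqrt(theta(1-theta)) with (Rf - d)/(u - d)."""
    spread = math.sqrt(theta * (1 - theta))
    expected_gross = theta * u + (1 - theta) * d   # E[R] per dollar of stock
    sigma = (u - d) * spread                       # std. dev. of the gross return
    lam = (expected_gross - Rf) / sigma            # lambda implied by E[R] = Rf + lambda*sigma
    pi_capm = theta - lam * spread
    pi_arbitrage = (Rf - d) / (u - d)
    return lam, pi_capm, pi_arbitrage


# Hypothetical inputs chosen only for illustration.
lam, pi_capm, pi_arbitrage = risk_neutral_probabilities(1.2, 0.9, 1.05, 0.6)
print(lam, pi_capm, pi_arbitrage)
```

With a positive λ the risk-neutral up probability comes out below the true probability θ, so the favorable state is understated.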


Relation to Black Scholes

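Superscripts \(u\) and \(d\) index the up- and down-states, while subscripts denote partial derivatives. Begin with the single period binomial option pricing equation

\[ C = \frac{V^u\pi + V^d(1-\pi)}{R_f} \]

where

\[ \pi = \frac{R_f - d}{u - d} = \frac{e^{r\tau} - e^{-\sigma\sqrt{\tau}}}{e^{\sigma\sqrt{\tau}} - e^{-\sigma\sqrt{\tau}}}. \]

Assume a 50% probability of the up-state to get \(u = e^{\sigma\sqrt{\tau}}\) and \(d = e^{-\sigma\sqrt{\tau}}\). Re-expressing \(V^u\) and \(V^d\),

\[ V^u = C(e^{\sigma\sqrt{\tau}}S,\, t+\tau) \quad\text{and}\quad V^d = C(e^{-\sigma\sqrt{\tau}}S,\, t+\tau). \]

Substitute into the binomial equation

\[
C(S,t) = \frac{(e^{r\tau} - e^{-\sigma\sqrt{\tau}})\,C(e^{\sigma\sqrt{\tau}}S,\, t+\tau) + (e^{\sigma\sqrt{\tau}} - e^{r\tau})\,C(e^{-\sigma\sqrt{\tau}}S,\, t+\tau)}{e^{r\tau}\,[e^{\sigma\sqrt{\tau}} - e^{-\sigma\sqrt{\tau}}]}.
\]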

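Next, perform several Taylor series expansions.

\[ \Delta S^u = (e^{\sigma\sqrt{\tau}} - 1)S, \qquad \Delta S^d = (e^{-\sigma\sqrt{\tau}} - 1)S, \qquad \Delta t = [(t+\tau) - t] = \tau \]

\[ e^{\sigma\sqrt{\tau}} \approx 1 + \sigma\sqrt{\tau} + \tfrac{1}{2}\sigma^2\tau, \qquad e^{-\sigma\sqrt{\tau}} \approx 1 - \sigma\sqrt{\tau} + \tfrac{1}{2}\sigma^2\tau, \qquad e^{r\tau} \approx 1 + r\tau \]

\[ C(e^{\sigma\sqrt{\tau}}S,\, t+\tau) \approx C + (e^{\sigma\sqrt{\tau}} - 1)SC_S + \tfrac{1}{2}(e^{\sigma\sqrt{\tau}} - 1)^2S^2C_{SS} + \tau C_t \]

and similarly for the down state. Substituting all this into the expanded binomial formula, cancelling like terms, and dropping terms of higher order in \(\tau\) gives the Black Scholes PDE

\[ C_t = rC - rSC_S - \tfrac{1}{2}\sigma^2S^2C_{SS}. \]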


4.2.2 Trinomial Models

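Multinomial models are based on matching risk-neutral moments. For example, the trinomial model requires three probabilities, \(p_u\), \(p_m\), and \(p_d\). If the stock price process is

\[ \frac{dS}{S} = (r - \tfrac{1}{2}\sigma^2)\,dt + \sigma\,dW = \alpha k + \sigma\,dW, \]

with \(\alpha = r - \tfrac{1}{2}\sigma^2\) and \(k\) the length of the time step, then \(E[dS/S] = \alpha k\) and the second moment is \(E[(dS/S)^2] = \alpha^2k^2 + \sigma^2k\) (the variance \(\sigma^2k\) plus the squared mean). With \(h\) the size of an up or down move, three equations are used to solve for the three unknown probabilities:

\[ p_u h + p_m 0 + p_d(-h) = \alpha k \]

\[ p_u h^2 + p_m 0^2 + p_d h^2 = \alpha^2k^2 + \sigma^2k \]

\[ p_u + p_m + p_d = 1. \]

The resulting answers are

\[ p_u = \frac{1}{2}\left(\sigma^2\frac{k}{h^2} + \alpha^2\frac{k^2}{h^2} + \alpha\frac{k}{h}\right) \]

\[ p_m = 1 - \sigma^2\frac{k}{h^2} - \alpha^2\frac{k^2}{h^2} \]

\[ p_d = \frac{1}{2}\left(\sigma^2\frac{k}{h^2} + \alpha^2\frac{k^2}{h^2} - \alpha\frac{k}{h}\right) \]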
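The closed-form probabilities can be checked numerically against the three moment conditions. In the sketch below, the function name, the weekly time step, and the choice of move size h = σ√(3k) are assumptions made only for illustration; the notes do not specify how h is chosen.

```python
import math


def trinomial_probabilities(alpha, sigma, k, h):
    """Up, middle, and down probabilities from the moment-matching solution."""
    var_term = sigma**2 * k / h**2
    mean_sq_term = alpha**2 * k**2 / h**2
    drift_term = alpha * k / h
    p_u = 0.5 * (var_term + mean_sq_term + drift_term)
    p_m = 1.0 - var_term - mean_sq_term
    p_d = 0.5 * (var_term + mean_sq_term - drift_term)
    return p_u, p_m, p_d


# Hypothetical inputs chosen only for illustration.
r, sigma = 0.05, 0.20
alpha = r - 0.5 * sigma**2
k = 1.0 / 52.0                       # weekly time step
h = sigma * math.sqrt(3.0 * k)       # assumed move size
p_u, p_m, p_d = trinomial_probabilities(alpha, sigma, k, h)
print(p_u, p_m, p_d)
```

If h is chosen too small relative to σ√k, the middle probability turns negative, so in practice the move size has to be checked against the time step.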

4.3 Black Scholes Model

The famous Black and Scholes (1973) option pricing model and its extensions by Merton (1973) have revolutionized derivative pricing.

4.3.1 Black Scholes Derivations

Derivation I: Replication

Assume the stock price follows GBM

\[ dS = \mu S\,dt + \sigma S\,dW \]

and there is a riskless asset \(B = e^{rt}\). The option price depends on the stock price and time, \(C(S,t)\). Using Ito’s Lemma

\[ dC = C_t\,dt + C_S\,dS + \tfrac{1}{2}C_{SS}(dS)^2. \]

Making the substitutions gives

\[
\begin{aligned}
dC &= (C_t + \mu SC_S + \tfrac{1}{2}\sigma^2S^2C_{SS})\,dt + \sigma SC_S\,dW \\
   &= \mu_C C\,dt + \sigma_C C\,dW.
\end{aligned}
\]

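Form an arbitrage portfolio with investments \(w_S + w_C + w_B = 0\). The return on this investment is

\[
\begin{aligned}
\frac{d\Pi}{\Pi} &= w_S\frac{dS}{S} + w_C\frac{dC}{C} + w_B r\,dt \\
 &= w_S[\mu\,dt + \sigma\,dW - r\,dt] + w_C[\mu_C\,dt + \sigma_C\,dW - r\,dt] \\
 &= [w_S(\mu - r) + w_C(\mu_C - r)]\,dt + [w_S\sigma + w_C\sigma_C]\,dW.
\end{aligned}
\]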

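Choose \(w_S\) and \(w_C\) such that there is no risk, \(w_S\sigma + w_C\sigma_C = 0\). With no risk, \(w_S(\mu - r) + w_C(\mu_C - r) = 0\) to avoid arbitrage, so

\[ \frac{\mu - r}{\sigma} = \frac{\mu_C - r}{\sigma_C} = \lambda, \]

the market price of risk. Making the substitutions

\[ \frac{\mu - r}{\sigma} = \frac{(C_t + \mu SC_S + \tfrac{1}{2}\sigma^2S^2C_{SS})/C - r}{\sigma SC_S/C}. \]

Simplifying gives the PDE

\[ C_t = rC - rSC_S - \tfrac{1}{2}\sigma^2S^2C_{SS}. \]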

Derivation II: Using CAPM

An alternative derivation uses the CAPM. The beta of an option is a function of the stock beta and the elasticity of the option price with respect to the stock price

\[ \beta_C = \beta_S\frac{C_S S}{C}. \]

The expected returns on the stock and option are

\[ E\left[\frac{dS}{S}\right] = (r + \alpha\beta_S)\,dt = \mu\,dt \quad\text{and}\quad E\left[\frac{dC}{C}\right] = (r + \alpha\beta_C)\,dt = \mu_C\,dt. \]

Making the substitution,

\[ E[dC] = (rC + \alpha SC_S\beta_S)\,dt. \]

By Ito’s Lemma

\[ dC = (C_t + \mu SC_S + \tfrac{1}{2}\sigma^2S^2C_{SS})\,dt + \sigma SC_S\,dW = \mu_C C\,dt + \sigma_C C\,dW. \]

Taking expectations and setting the two expressions equal gives the Black Scholes PDE

\[ C_t = rC - rSC_S - \tfrac{1}{2}\sigma^2S^2C_{SS}. \]

Solving the PDE

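The following method makes use of the Feynman-Kac (Cox-Ross) solution. The boundary condition is \(C(S_T, T) = (S_T - K)^+\).

\[
\begin{aligned}
C &= E^Q[e^{-r\tau}(S_T - K)^+] = e^{-r\tau}E^Q[(S_T - K)^+] \\
  &= e^{-r\tau}E^Q[S_T \mid S_T \ge K]\,\mathrm{Prob}[S_T \ge K] - Ke^{-r\tau}\,\mathrm{Prob}[S_T \ge K].
\end{aligned}
\]

Next, get the conditional distribution

\[ \ln S_T \mid \ln S_t \sim N(\ln S_t + (r - \sigma^2/2)\tau,\ \sigma^2\tau) = N(m, v^2). \]

The density is (see footnote 1)

\[
\begin{aligned}
f(S_T \mid S_t) &= f(\ln S_T \mid \ln S_t)\cdot\frac{\partial \ln S_T}{\partial S_T} \\
 &= \frac{1}{\sqrt{2\pi\sigma^2\tau}}\exp\left[-\frac{(\ln S_T - \ln S_t - (r - \sigma^2/2)\tau)^2}{2\sigma^2\tau}\right]\frac{1}{S_T} \\
 &= \frac{1}{vS_T\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{\ln S_T - m}{v}\right)^2\right].
\end{aligned}
\]

Footnote 1: To derive this, realize that under \(Q\), \(dS = Sr\,dt + S\sigma\,dZ\). Let \(x = \ln S\) so

\[ dx = \frac{\partial x}{\partial S}dS + \frac{1}{2}\frac{\partial^2 x}{\partial S^2}(dS)^2 = \frac{dS}{S} - \frac{1}{2S^2}(dS)^2 = (r - \tfrac{1}{2}\sigma^2)\,dt + \sigma\,dZ. \]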

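Next calculate the terms involving \(S_T\) and \(K\), where \(\phi(\cdot)\) denotes the standard normal distribution function (also written \(N(\cdot)\)):

\[
\begin{aligned}
\mathrm{Prob}[S_T \ge K] &= \mathrm{Prob}[\ln S_T \ge \ln K] = 1 - \mathrm{Prob}[\ln S_T \le \ln K] \\
 &= 1 - \phi\left(\frac{\ln K - m}{v}\right) = \phi\left(\frac{m - \ln K}{v}\right) \\
 &= \phi\left(\frac{\ln(S_t/K) + (r - \sigma^2/2)\tau}{\sigma\sqrt{\tau}}\right) = N(d_2).
\end{aligned}
\]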

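Using the same idea and a change of variable \(y = \ln S_T\), so \(e^y = S_T\) and \(dS_T = e^y\,dy\),

\[
\begin{aligned}
E^Q[S_T \mid S_T \ge K]\,\mathrm{Prob}[S_T \ge K]
 &= \int_K^\infty S_T\frac{1}{v\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{\ln S_T - m}{v}\right)^2\right]\frac{1}{S_T}\,dS_T \\
 &= \int_{\ln K}^\infty \frac{1}{v\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{\ln S_T - m}{v}\right)^2\right]\exp(\ln S_T)\,d\ln S_T \\
 &= \int_{\ln K}^\infty \frac{1}{v\sqrt{2\pi}}\exp\left[-\frac{1}{2v^2}\left[\ln S_T - (m + v^2)\right]^2 + m + v^2/2\right]d\ln S_T \\
 &= \exp(m + v^2/2)\int_{\ln K}^\infty \frac{1}{v\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{\ln S_T - (m + v^2)}{v}\right)^2\right]d\ln S_T \\
 &= \exp(m + v^2/2)\left[1 - \phi\left(\frac{\ln K - (m + v^2)}{v}\right)\right] \\
 &= \exp(m + v^2/2)\,\phi\left(\frac{m + v^2 - \ln K}{v}\right) \\
 &= \exp\left[\ln S_t + (r - \sigma^2/2)\tau + \sigma^2\tau/2\right]\phi\left(\frac{\ln(S_t/K) + (r - \sigma^2/2)\tau + \sigma^2\tau}{\sigma\sqrt{\tau}}\right) \\
 &= Se^{r\tau}N(d_1).
\end{aligned}
\]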

Combining these results gives the Black Scholes model

\[ C(S,t) = SN(d_1) - Ke^{-r\tau}N(d_2) \]

where

\[ d_1 = \frac{\ln(S/K) + (r + \sigma^2/2)\tau}{\sigma\sqrt{\tau}} \quad\text{and}\quad d_2 = d_1 - \sigma\sqrt{\tau}. \]
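The closed-form call price is straightforward to evaluate. The sketch below uses only the Python standard library, building the normal distribution function from `math.erf`; the function names and inputs are illustrative assumptions. The put is obtained from put-call parity rather than from a separate formula.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(S, K, r, sigma, tau):
    """Black Scholes price of a European call on a non-dividend-paying stock."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)


def bs_put(S, K, r, sigma, tau):
    """European put from put-call parity: S + P = C + K exp(-r tau)."""
    return bs_call(S, K, r, sigma, tau) - S + K * math.exp(-r * tau)


# Hypothetical inputs chosen only for illustration.
call = bs_call(100.0, 100.0, 0.05, 0.20, 1.0)
put = bs_put(100.0, 100.0, 0.05, 0.20, 1.0)
print(round(call, 4), round(put, 4))
```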


4.3.2 Implied Volatilities

The volatility parameter is the most difficult to obtain and perhaps the most important. An alternative to using the model to give an option price is to invert the model to give an implied volatility, taking option prices as inputs.
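Because the call price is increasing in volatility, the inversion can be done with a simple bisection search. The sketch below is one possible approach, not a method prescribed by the notes; the function names, the search bracket of [1e-6, 5], and the inputs are assumptions for illustration.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(S, K, r, sigma, tau):
    """Black Scholes price of a European call."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)


def implied_vol(price, S, K, r, tau, lo=1e-6, hi=5.0, tol=1e-10):
    """Bisection on sigma; relies on the call price being increasing in sigma."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, r, mid, tau) > price:
            hi = mid          # model price too high: volatility is lower
        else:
            lo = mid          # model price too low: volatility is higher
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Round trip with hypothetical inputs: price an option at 25% volatility,
# then recover the volatility from that price.
market_price = bs_call(100.0, 110.0, 0.03, 0.25, 0.5)
sigma_hat = implied_vol(market_price, 100.0, 110.0, 0.03, 0.5)
print(round(sigma_hat, 6))
```

Bisection is slow but cannot diverge; a Newton step using vega would converge faster when the option is not too far from the money.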

4.3.3 Hedging

Hedging involves forming portfolios to reduce or minimize various types of risk. The most common hedge is a delta-neutral position. This investment has an expected price change of zero when the stock price changes: the loss from a drop in the stock is offset by a gain on an option. This is a local hedge, since the delta changes when the stock price changes. A gamma-neutral hedge preserves the delta-hedge. Other hedges include rho for the interest rate and vega for volatility. Again, these are partial hedges and assume everything else is constant.

To determine the appropriate hedge, find the options with the maximum and minimum pricing error per unit of stock-equivalent risk, \((\text{model} - \text{market})/\Delta\). Buy and sell these options in amounts proportional to the inverse of the delta to balance the stock-equivalent risk. For a gamma hedge, combine two delta-neutral portfolios such that the gammas balance.

4.4 Advanced Topics

4.4.1 American Options

Boundary Conditions

Define \(S^*_t\) as the exercise boundary. Conditions for early exercise require

\[ \lim_{S_t \to S^*_t} C(S_t) = S^*_t - K \]

\[ \lim_{S_t \to S^*_t} \frac{\partial C(S_t)}{\partial S_t} = 1. \]

With dividends the stock process is

\[ \frac{dS}{S} = (r - \delta)\,dt + \sigma\,dW \]

so

\[ dC = [C_t + (r - \delta)SC_S + \tfrac{1}{2}\sigma^2S^2C_{SS}]\,dt + \sigma_C\,dW = rC\,dt + \sigma_C\,dW. \]

The resulting PDE is

\[ C_t + (r - \delta)SC_S + \tfrac{1}{2}\sigma^2S^2C_{SS} - rC = 0 \]

with boundary conditions

\[ C_T(S_T) = (S_T - K)^+ \quad\text{and}\quad C_0(S_0) = \sup_{\tau\in[t,T]} E^Q[e^{-r(\tau - t)}(S_0 - K)^+]. \]

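At the boundary you are indifferent to exercising since exercising generates

\[ dS + (\delta S - rK)\,dt \]

while continuing generates

\[
\begin{aligned}
dC &= (C_t + \tfrac{1}{2}\sigma^2S^2C_{SS})\,dt + C_S\,dS \\
   &= [rC - (r - \delta)S]\,dt + dS \\
   &= [r(S - K) - (r - \delta)S]\,dt + dS \\
   &= (\delta S - rK)\,dt + dS.
\end{aligned}
\]

To exercise you borrow \(rK\) and receive \(\delta S\), so \(rK = \delta S\) and \(S^*_T = rK/\delta\).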

Integration

Broadie & DeTemple and Barone-Adesi & Whaley. Let \(C_t\) and \(c_t\) denote American and European call option values. We can write

\[ C_t(S_t) = c_t(S_t) + \varepsilon_t \]

so

\[ C_0(S_0) = E^Q[e^{-rt}(S_T - K)^+] + E^Q\left[\varepsilon\int_0^T e^{-rt}\,df(\tau)\right]. \]

CHECK


L-U Bound

Broadie & DeTemple find upper and lower bounds on American options by using capped calls. A capped call value can be found for a given early exercise path.

BBS/Richardson Extrapolation

The binomial Black-Scholes method (BBS) is essentially a binomial tree with the analytic BS formula attached at the last node. This avoids some of the problems from discretization in a tree, but preserves the ability to price American options. Richardson extrapolation involves calculating the price with \(N\) nodes and again with \(2N\) nodes. The option price is then calculated as twice the second value minus the first (e.g., \(p = 2p_{2N} - p_N\)), which cancels an error term proportional to \(1/N\). This avoids the odd-even effect and allows use of a small \(N\).

4.4.2 Exotic Options

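Barrier options utilize the reflection principle. Put-call symmetry says \(C(S, K, r, \delta) = P(K, S, \delta, r)\).

To price a down-and-out call, let \(H\) denote the barrier, \(x_t = \ln(S_t/S_0)\), \(y_t = \inf_{t\in[0,T]} x_t\), \(Y_t = \sup_{t\in[0,T]} x_t\), and \(y = \ln(H/S_0)\). Then

\[
\begin{aligned}
C_{doc} &= e^{-rt}E^Q\left[(S_T - K)^+\,\mathrm{Prob}[y_T \ge y]\right] \\
        &= e^{-rt}E^Q\left[(S_0e^{x_T} - K)\,\mathrm{Prob}[y_T \ge y,\ x_t > \ln(K/S_0)]\right].
\end{aligned}
\]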

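To price lookback options,

\[ \text{Standard:}\quad C = (S_T - M_{T_0}^T)^+ \qquad P = (M_{T_0}^T - S_T)^+ \]

\[ \text{Extreme:}\quad C = (M_{T_0}^T - K)^+ \qquad P = (K - M_{T_0}^T)^+ \]

Asian options can be of the average-price or average-strike form, where an overbar denotes an average:

\[ C = (\bar{S} - K)^+ \qquad P = (K - \bar{S})^+ \]

\[ C = (S - \bar{K})^+ \qquad P = (\bar{K} - S)^+ \]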

Table 4.2: Common Interest Rate Models

Stochastic Interest Rate    Stochastic Term Structure
Rendleman & Bartter         Ho & Lee
Courtadon                   HJM
Vasicek                     Black, Derman & Toy
CIR

4.4.3 Other Advanced Topics

Stochastic Volatility and Jumps

Monte Carlo, QMC, etc.

Parametric Pricing

4.5 Interest Rate Derivatives

An underlying assumption of the preceding option models is that the asset follows a lognormal or binomial process. For fixed income securities this assumption is not valid. The price of these securities must end up back at par when they mature. Surprisingly, the fact that we know the terminal price makes option pricing more difficult. With the view that this is an additional constraint on the system it is more understandable why interest rate options have this added complexity. Many of the interest rate models are discussed more fully in Chapter 3.

There are three basic steps in interest rate option pricing. First, random interest rates are modeled. Next, the distribution of the interest rates is used to infer the distribution of prices for the underlying debt instrument. Finally, the distribution of the underlying asset is used to price the option.

Interest rate option models can be broadly categorized as stochastic interest rate models and stochastic term structure models. Refer to Section 3.5 for a discussion. The stochastic interest rate approach is subject to error since the option model is based on bond prices that are potentially wrong. In the stochastic term structure models, market data are used to get a distribution for the bond price, which, in turn, is used to price the option.


4.5.1 Stochastic Interest Rate Models

The Rendleman & Bartter model uses a binomial process and assumes interest rate changes are a constant percentage. Courtadon models the interest rate process in continuous time. Like the RB model, interest rates are lognormal. However, Courtadon adds a mean-reversion feature which overcomes the problem in RB that the interest rate can become infinitely large. The Vasicek model includes mean-reversion, but uses a normal process, allowing negative interest rates. The CIR model modifies Vasicek by using a square root process which produces variance proportional to the level of the interest rate. Refer to Table 3.1 for a summary of model specifications.

4.5.2 Stochastic Term Structure Models

The presentation of the following models is based on their discrete time analogs.

Ho & Lee

The Ho-Lee model generates parallel shifts in the yield curve. In a binomial setting it produces a recombining tree. It is based on

\[ R_{t,j} = \frac{D(t)\,[\pi + (1 - \pi)\delta^t]}{D(t+1)\,\delta^{t-j}} \]

where \(\delta = \exp[-2\phi(\tau/n)^{1.5}]\), \(D(t)\) is the current price of the bond maturing at time \(t\), and \(\phi\) is the standard deviation of the log yield of one-year discount bonds.

BDT

The Black, Derman, & Toy model features a fixed ratio of adjacent prices at each point in time, \(\alpha_t\). The rate can be expressed as \(r_{t,j} = \alpha^j r_{t,0}\). In the Ho-Lee model this ratio is fixed for all \(t\).

Heath, Jarrow, & Morton

The Heath, Jarrow & Morton model is the most general term structure model. Although the HJM model is set in continuous time, a discrete time analog is available. Market bond prices and volatilities are used to determine the interest rate process (tree). In terms of notation, let \(D^m_{t,k}\) denote the price of a bond maturing at \(m\) observed at time \(t\) in state \(k\). Since the tree does not recombine, there are \(2^t\) nodes at time \(t\). The states are indexed with the lowest (all downs) state as 0 and the highest state as \(2^t - 1\). By convention, up states are when the bond price increases (and the interest rate decreases). The risk neutral and true probabilities of an up-state are \(\pi\) and \(\theta\). There are two equations expressing current price and volatility as functions of the next period prices.

\[ D^m_{t,k} = \frac{\pi D^m_{t+1,2k+1} + (1 - \pi)D^m_{t+1,2k}}{1 + r_{t,k}} \]

\[ \sigma_{t+1} = \frac{\ln\left(\dfrac{1}{D^m_{t+1,2k}}\right) - \ln\left(\dfrac{1}{D^m_{t+1,2k+1}}\right)}{2(m - t - 1)} \]

The second equation can be more conveniently expressed as

\[ D^m_{t+1,2k+1} = \exp[\sigma_{t+1}\cdot 2(m - t - 1)]\,D^m_{t+1,2k}. \]

The values \(D^m_{t,k}\) and \(\sigma_{t+1}\) are generally estimated (or given) and the prices at \(t+1\) are determined by solving the equations simultaneously at each node. Unlike the standard binomial pricing model, this model is solved by stepping forward through the tree.
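The forward step at a single node can be sketched as follows: given the current bond price, the one-period rate, the risk-neutral probability, and next period's volatility, the two equations pin down the up- and down-state prices. The function name and the numerical inputs are hypothetical and serve only to illustrate the algebra.

```python
import math


def hjm_next_prices(D_tk, r_tk, pi, sigma_next, m, t):
    """Solve the pricing and volatility equations for the two successor prices.

    Returns (D_up, D_down), i.e. the prices in states 2k+1 and 2k at time t+1.
    """
    # Volatility equation: D_up = growth * D_down.
    growth = math.exp(sigma_next * 2 * (m - t - 1))
    # Substitute into the pricing equation and solve for D_down.
    D_down = D_tk * (1 + r_tk) / (pi * growth + (1 - pi))
    D_up = growth * D_down
    return D_up, D_down


# Hypothetical node: a bond maturing at m = 5, observed at t = 1.
D_tk, r_tk, pi, sigma_next, m, t = 0.90, 0.05, 0.5, 0.01, 5, 1
D_up, D_down = hjm_next_prices(D_tk, r_tk, pi, sigma_next, m, t)
print(round(D_up, 6), round(D_down, 6))
```

Repeating this step at every node, moving forward in time, builds out the non-recombining tree.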


Chapter 5

Corporate Finance

5.1 Introduction

Corporate finance covers a range of issues related to the choice of capital structure, distribution of cashflows, and issuance of securities. Asymmetric information problems are common and are the subject of much of the work in corporate finance. Also important are the agency costs that arise from the conflicts of interest between the decision makers and other parties. A common example of an agency cost is that between the manager and the outside owners of the firm. The compensation contract offered to managers is one way of dealing with this agency cost.

Many of the earlier works make relatively strong assumptions. The last few sections attempt to understand the implications of relaxing these assumptions. When demand for assets is not perfectly elastic there will be price effects caused by changes in quantity. Similarly, imperfections may give rise to financial innovation.

5.2 Information Asymmetry/Signaling

An information asymmetry occurs when one group of agents has better information than other groups. Adverse selection arises when an agent making a decision is better informed than the person with whom he is contracting. This is different from moral hazard, where the agent with superior information can influence the outcomes by his action. A signal is an action that an agent takes to provide credible information. The signal must impose a greater cost on the “low quality” agents than on the “high quality” agents to prevent mimicking.

A common element of signaling papers in finance is that the informational asymmetry generally comes from managers’ superior forecasts of future cash flows. Investors are typically homogeneous with respect to taxes and restrictions on trading; otherwise clienteles would arise. Models also usually prevent the manager from trading personally.

The outcomes depend on the nature of the informational asymmetry. If the asymmetry is over the assets in place, but not the new project, overpricing and project scaling are the most efficient signals. If there is asymmetric information about the project’s value, and good firms have more valuable projects than bad firms, signals which burn money after the issuance are dominant.

The overinvestment (when only project value is asymmetric information) can be eliminated through money burning, but the underinvestment problem (when there is differential information about the assets in place) is not completely solved. The distinction that the burned money comes from project cash flows is important. Equity financed money burning is an inefficient signal.

Akerlof (1970)

In his famous “lemons” paper, Akerlof (1970) shows how quality uncertainty can affect the size and average quality in the automobile market. In extreme cases, markets can fail completely. A case is made for the role of certain institutions in improving the efficiency of the market.

The sellers of cars know more about the quality of their cars than the buyers do. Demand depends on price and average quality, \(Q^D = D(p, \mu)\). The average quality will also depend on price, \(\mu = \mu(p)\). Supply depends on price as well, \(Q^S = S(p)\). In equilibrium, \(S(p) = D(p, \mu(p))\). A low average quality will cause the owners of good cars not to sell, further lowering the average quality.

Several applications of the basic model are discussed. In the insurance market, healthy individuals will tend to opt out of the market, leaving the insurer with a disproportionately large share of the unhealthy. Costs of dishonesty must include both the direct costs and the indirect costs of driving business out of the market. There are several institutions that can mitigate these types of problems. Risk transferring guarantees can allow the owners of good cars to get their fair value. Brand-names and chains can also reduce quality uncertainty, as do licensing practices.

Spence

Spence (1973) develops the signaling model in the context of the job market. This is really just one example of an investment under uncertainty problem. Here the potential employer cannot observe the quality of the applicants, but can use education to make rational inferences about the applicant’s quality. Education separates the types of applicants because it is costly to obtain, and more so for the low types. The high types will obtain just enough education to make it unattractive for a low type to mimic.

The employers offer wage schedules that are a function of the educational signal and other (non-signal) indices. Individuals choose education levels to maximize wages net of signaling costs.

For the signaling equilibrium to work the costs of signaling must be negatively correlated with productive capacity. Otherwise, lower quality types will overinvest in the signal to mimic higher quality types. The use of indices results in forming probability distributions conditional on both the signal and the indices. This segments the population by indices, and these subsets need not have the same equilibrium.

Spence (1974) is a more general description of the signaling environment. There must be an information asymmetry where the seller knows more about the good than the buyer. The seller signals and the buyer responds. The signal is based on the anticipated response of the buyer.

Myers & Majluf (1984)

Myers and Majluf (1984) is the classic paper on financing under asymmetric information. A firm has a positive NPV project that requires external financing. If the manager believes the stock is underpriced, pursuing the project requires issuing underpriced stock, diluting the value to existing shareholders. Investors will then believe that when a firm does issue, it is likely that the stock is overpriced. Consequently, announcements of new issues generate a share price decline.

In the basic model, a firm has existing assets in place, \(a\), and a valuable investment opportunity, \(b > 0\). The project is all-or-nothing and requires the issuance of equity to make the initial investment \(I\). The firm currently has slack \(S\), which is fixed and publicly known, and would need to raise additional equity \(E = I - S\). There are three dates in the model. At time \(t - 1\) the market has the same information as management; valuations are given by \(A\) and \(B\). At time \(t\) the manager learns \(a\) and \(b\), while the market knows only the distribution of \(A\) and \(B\). Additional assumptions are perfect markets, costly signaling, and passive existing shareholders.

Managers act in the interest of old shareholders by maximizing \(V_0^{old} = V(a, b, E)\). The market value of the shares will generally be different from the manager’s valuation since the market does not know \(a\) or \(b\). Denote the market value \(P'\) if stock is issued and \(P\) otherwise. The managers will issue new stock when

\[ \frac{E}{P' + E}(S + a) \le \frac{P'}{P' + E}(E + b). \]

In words this says the old shareholders must get more of the new value than the new shareholders get of the original value. The firm is more likely to issue when \(b\) is high or \(a\) is low. Rearranging, the indifference equation is

\[ b = (E/P')(S + a) - E. \]

Above this line the firm will issue and invest; below it the firm will do nothing. The issue price \(P'\) is given by

\[ P' = S + A(M') + B(M') \]

where the last terms represent expected values given issuance (\(M'\) denotes the region of \((a, b)\) in which the firm issues and \(M\) the region in which it does not).

Unless the firm is certain to issue, \(P' < P\). The decision not to issue is good news about the value of the existing assets. The ex ante loss from passing up good projects is \(L = F(M)B(M)\). With \(S > I\), \(L = 0\). If the firm could be split, then the problem goes away. The solution for \(P'\) requires a simple numerical algorithm. Start by setting \(P' = S + A + B\). Then determine \(M\) and \(M'\) and calculate \(P' = S + A(M') + B(M')\). Repeat this procedure until convergence.
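The iterative solution for the issue price can be sketched for a discrete set of equally likely states. The function below is an illustrative assumption about how the algorithm might be coded, not the authors' implementation; the two-state inputs (assets in place of 150 or 50, project NPV of 20 or 10, investment of 100, no slack) are chosen for illustration.

```python
def issue_price(states, slack, investment, tol=1e-10, max_iter=1000):
    """Fixed-point iteration for the issue price P'.

    states: list of equally likely (a, b) pairs known to the manager.
    Returns (price, issuing_states); price is None if no state issues.
    """
    equity = investment - slack
    # Start from the price that would prevail if every state issued.
    price = (slack
             + sum(a for a, _ in states) / len(states)
             + sum(b for _, b in states) / len(states))
    for _ in range(max_iter):
        # Issue region M': b at or above the indifference line.
        issuers = [(a, b) for a, b in states
                   if b >= (equity / price) * (slack + a) - equity]
        if not issuers:
            return None, []
        new_price = (slack
                     + sum(a for a, _ in issuers) / len(issuers)
                     + sum(b for _, b in issuers) / len(issuers))
        if abs(new_price - price) < tol:
            return new_price, issuers
        price = new_price
    raise RuntimeError("issue price did not converge")


# Hypothetical two-state example.
price, issuers = issue_price([(150.0, 20.0), (50.0, 10.0)],
                             slack=0.0, investment=100.0)
print(price, issuers)
```

In this example only the low-value firm issues, so the issue price reflects the low state alone and the high-value firm passes up its positive NPV project.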

The above analysis can be extended to include debt financing. If the firm can issue riskless debt the problem disappears: the firm always issues riskless debt and takes the project. If the firm can only issue risky debt the problem is reduced, but not eliminated. Thus, the general rule is to issue securities less subject to mispricing first. The firm will issue and invest only when \(b \ge \Delta D\) (or \(\Delta E\)). We should have \(|\Delta D| < |\Delta E|\), with the same signs. In this case the firm will never issue equity. Any time it decides to issue it will use debt. This extreme condition can be tempered by introducing costs of debt such as bankruptcy or agency costs. Note that if the information asymmetry is about the variance of value rather than the mean, then equity will dominate debt.

The model makes a number of predictions. It says it is generally better to issue safe securities, a pecking order result. Firms with insufficient slack may forgo good investment opportunities (the underinvestment problem). Firms can build up slack by retaining earnings or issuing securities when information asymmetries are small to avoid some of these problems. Firms should avoid issuing risky securities to pay dividends. Stock price will fall when managers have superior information and they issue securities. A merger between a firm with little slack and one with a lot of slack is likely to increase value, but negotiating such a merger is likely to be difficult.

The basic Myers and Majluf (1984) framework has been extended in a number of ways, including dividend policy, scale of investment, project timing, and public offerings (overpricing and underpricing).

Cooney & Kalay (1993)

Cooney and Kalay (1993) extend the classic Myers and Majluf (1984) paper to allow for the possibility of negative NPV projects. With this additional realism the stock price reaction to equity issuance is not necessarily negative. Note that it is the existence, not acceptance, of negative NPV projects that drives this result.

In the new model, low values of \(a\) can cause overinvestment. Firms will accept negative NPV projects in order to sell overvalued existing assets. Also, firms with riskier new projects may experience stock price increases on issuance announcement. The revised model has lower issue prices and probability of issuance. When there is a limited supply of zero NPV projects (e.g., transactions costs and taxes for financial investments) there may be positive announcement effects.

5.3 Agency Theory

An agency relationship is a contract under which a principal engages an agent to perform some task on his behalf which involves delegating some decision-making authority to the agent. The agent may not always have natural incentives to act in the best interest of the principal. The principal can address this problem by establishing the appropriate incentives and/or monitoring the agent. Incentive alignment is rarely free; agency costs are defined as the sum of monitoring costs, bonding costs, and the residual loss.

Jensen & Meckling (1976)

In a widely cited paper, Jensen and Meckling (1976) develop a theory of the ownership structure of the firm using elements of property rights, agency, and financial theory. Property rights specify how costs and rewards will be allocated among the participants in an organization. The firm is defined as a legal fiction which serves as a “nexus for contracting.” The firm has divisible claims on assets and cashflows, but does not have intentions, behaviors, or motivations. The paper focuses on the positive aspects of agency theory: the interaction of the various parties assuming they act optimally. Most of the previous literature was normative in nature.

The presence of inside and outside equity owners introduces an agency cost of outside equity. This arises because the outsider funds a portion of the insider’s perquisite consumption, so the insider will consume “too much.” As long as the market anticipates this, all these costs will be passed back to the insider.

The manager consumes perquisites \(F\). When he is the sole proprietor he chooses to consume \(F^*\) and the firm is worth \(V^*\). Every dollar of perqs he forgoes increases the value of the firm by a dollar. He chooses the point with the highest utility given his budget set. The manager sells a share \((1 - \alpha)\) to an outsider. Now the manager pays only \(\alpha\) dollars for every dollar in benefits. If the outsider pays \(V^*\), holding \(F^*\) constant, the budget slope changes to \(-\alpha\). If the manager can change his consumption, he will increase consumption, which lowers the value of the firm. With rational expectations the market will foresee this and will pay only \(V^0\). At this point the manager is over-consuming perqs; by decreasing to \(F'\) he increases his utility and the value increases to \(V'\). The entire decrease in value \(V^* - V'\) is borne by the insider. This is a gross cost, since it does not include the benefit from increased consumption. The net cost is given by the change in the utility levels.

Introducing monitoring allows an improvement. The insider receives all the benefits from monitoring (i.e., he bears all the net costs). It does not matter who actually makes the payment for these costs since they all fall back to the insider in the end. The outcome is suboptimal or inefficient only relative to a world with no agency costs. Given that these costs exist, and since the insider bears these costs, the insider will minimize these costs. The size of these agency costs will depend on managers’ tastes, degree of managerial discretion, monitoring and bonding costs, difficulty in measuring performance, and the costs of devising, implementing and enforcing incentive contracts.

The scale of the firm can also be analyzed in this framework. When the insider lacks sufficient resources and needs external financing, agency costs reduce the value of the firm at a given level of fringe benefits consumption. The insider will stop increasing the value of the firm when the gross increment in value is offset by the incremental loss in the consumption of additional fringe benefits. The end result is that the insiders are worse off than before. They cannot do as well as before because they cannot credibly commit to not consuming additional benefits.

Debt financing creates a risk-shifting incentive since equity holders enjoy the benefits of positive outcomes without a matching liability for negative outcomes, much like an option. This is an overinvestment problem: the firm takes bad projects because the equity holders can expropriate wealth from the bondholders. Again, monitoring and bonding are possible (partial) solutions, but are likely to be difficult to implement.

In a multiperiod setting, “being good” will reduce agency costs due to a reputation effect. Yet the problem will not be solved since each agent has an end to their game and will always eventually face the temptation to shirk. Inside debt may help reduce the problems as well since the manager will not be tempted to expropriate wealth from his own bonds. In some sense the manager’s salary may serve this purpose. He may take measures to preserve his salary, including pursuing safe investments. Incentive compensation, such as options, may be effective in this case. Convertible securities may give managers an incentive to avoid risk-shifting. Security analysts may also help reduce agency costs. In situations where it is easy for the insider to consume perqs, less outside equity should be used.

Jensen (1986)

Jensen (1986) discusses how free cashflows (FCF) can cause agency costs by allowing managers discretion to make bad investments. Reducing FCF can minimize a manager’s ability to waste resources and it also subjects the firm to more frequent monitoring since it has to access the capital markets more often.

The agency costs of debt have been cited as a reason to use less debt. Jensen points out that debt can also help reduce agency costs by reducing FCF. Debt can be viewed as a substitute for dividends in this sense. Additional debt will also serve to increase efficiency as bankruptcy becomes more likely. There is evidence supporting these claims. Leverage-increasing transactions are associated with increases in equity value. LBO targets tend to have high FCF and low growth opportunities. Also, strip, or mezzanine, financing limits the conflicts of interest between classes of security holders.

The FCF hypothesis also applies to takeovers. Firms with high FCF and unused borrowing power are likely to undertake bad mergers. Takeovers, especially hostile ones, can generate the crisis needed to make changes. Within declining industries, mergers are likely to be value-enhancing since they remove resources from a relatively unproductive sector. Acquirers tend to have performed well, generating excess cash to pursue the acquisition. Targets tend to either have poor managers and poor performance or good performance and significant FCF. Cash or debt financed takeovers generally provide larger benefits than transactions financed with stock.

Fama (1980)

Fama (1980) explains how the separation of ownership and control in a large corporation is an efficient organizational form. The basic idea is that management is a special type of labor which coordinates inputs and makes decisions. Management rents its human capital to the firm. Risk bearers provide capital ex ante in exchange for uncertain future payments. The capital markets and managerial labor markets provide discipline to the manager. Monitoring occurs within and among management, up and down the chain of command. The board monitors top management; it can include top management but should also include outsiders.

A distinction between ownership of the firm and ownership of capital is made. Since the firm is a collection of contracts, no one really owns it. Rather, security holders own claims on the cashflows. With this view, control rights over a firm’s decisions do not necessarily lie with the security holders.

In order to hold the manager accountable there must be some mechanism for ex post settling up. The general necessary conditions are uncertainty about managerial talents or tastes, labor markets that efficiently use past information in determining wages, and a wage revision process that is strong enough to resolve incentive problems. When the manager is the sole proprietor he can not avoid ex post settling up with himself. The optimal pay incentives are effort based. This does not expose the manager to risks beyond their control for which they would demand compensation. The problem is that effort is difficult to measure. When performance measures are noisy, less weight should be put on recent results.

Lehn & Poulsen (1989)

Lehn and Poulsen (1989) analyze FCF in privatizing transactions to identify the sources of value. Unlike other corporate control transactions, synergies are not a potential source of value. The four sources under consideration are tax effects, wealth redistribution, asymmetric information, and agency costs. The results are largely consistent with the FCF hypothesis. These transactions are more likely in firms with high CF/EQ or low sales growth. Premiums paid are also positively related to CF/EQ. The results are strongest in the hostile takeover wave in the mid-80’s and among firms with low management ownership.

The analysis consists of two parts. First, firms that went private are contrasted to a control sample that did not to understand the factors important in the decision. This is done by comparing means of the groups and also in a logit regression. The variables of interest are CF/EQ, Tax/EQ, sales growth, and footsteps, a dummy for competing bids or rumors. The results indicate that privatized firms are larger, have more cash, slightly lower recent growth, and are more likely to have other bids. These effects tend to become stronger in the second half of the sample. The second part of the paper attempts to explain the cross-sectional variation in premiums in these transactions by regressing the premium on CF/EQ, Tax/EQ, and sales growth. The results are generally supportive of the FCF hypothesis, especially in the second half of the sample and among low management ownership firms.

5.4 Capital Structure

The capital structure decision balances the costs and benefits of the various financing choices. These can be categorized as taxes, bankruptcy costs, and agency costs. Tax considerations include both the advantages of debt at the corporate level as well as the disadvantage at the individual level. Bankruptcy costs associated with debt are subdivided into direct and indirect components. Agency costs arise from the conflicts of interest between different investor classes and also with management. Although debt creates several agency costs, it actually reduces agency costs under the FCF hypothesis. Agency costs can lead to underinvestment or overinvestment.

Miller (1977)

This paper is a study of the way taxes affect capital market equilibrium. Pre-“Debt and Taxes” the view was that optimal capital structure involved balancing the corporate tax advantage of debt against the costs of financial distress (loss of tax shields, overinvestment, underinvestment, monitoring costs, etc.). Miller’s “horse and rabbit stew” refers to the corporate tax advantages of debt dominating the costs associated with bankruptcy. Miller adds personal tax considerations of investors to the mix. Taxes are important in the capital structure decision because they affect aggregate supply and demand for corporate securities.

Using a bond market equilibrium analysis, Miller argues that the higher costs of borrowing negate the entire benefit of tax shields so the capital structure choice is irrelevant for individual firms, although there will be an optimal amount of aggregate debt. With progressive corporate taxes, or if the differential information-related costs of debt versus equity are convex in the amount of debt, capital structure may in fact matter.

The classic M&M Proposition I is modified to include personal taxes

\[ V_L = V_U + \left[ 1 - \frac{(1 - \tau_C)(1 - \tau_S)}{(1 - \tau_B)} \right] B. \]

Proposition I (with taxes) says that firms can increase value by issuing debt. But if this is the case then the market is not in equilibrium. Assuming for simplicity that there are no capital gains taxes, all bonds are riskless, and there are no transactions costs, Miller’s equilibrium is given by the curves S and D in Figure 5.1. The flat part of the demand curve represents the demand for taxable bonds by tax-exempt investors. To get taxable investors to hold bonds, the rate must be high enough to offset the taxes. The equilibrium is where τC = τB. In the more general case, with capital gains taxes, the equilibrium condition is

(1 − τC)(1 − τS) = (1 − τB).
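The link between the modified Proposition I and the equilibrium condition can be checked numerically. The following is a minimal sketch, not part of Miller's paper: the function names and all numbers are hypothetical, and the demand-curve expression r0/(1 − τB) is taken from Figure 5.1.

```python
def gain_from_leverage(B, tau_c, tau_s, tau_b):
    """Gain from leverage V_L - V_U under the modified Proposition I."""
    return (1.0 - (1.0 - tau_c) * (1.0 - tau_s) / (1.0 - tau_b)) * B


def required_bond_rate(r0, tau_b):
    """Pre-tax bond rate that leaves an investor taxed at tau_b with r0 after tax."""
    return r0 / (1.0 - tau_b)


# Hypothetical inputs, for illustration only.
B = 100.0
print(gain_from_leverage(B, tau_c=0.35, tau_s=0.0, tau_b=0.0))   # corporate taxes only
print(gain_from_leverage(B, tau_c=0.35, tau_s=0.0, tau_b=0.35))  # tau_C = tau_B
print(required_bond_rate(r0=0.06, tau_b=0.40))
```

With only corporate taxes the gain reduces to τC·B, and it disappears once (1 − τC)(1 − τS) = (1 − τB), which is why the individual firm is indifferent at the equilibrium.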


[Figure 5.1: Bond Market Equilibrium. Axes: Q and R. Curves: S = r0/(1 − τC), D = r0/(1 − τB), S1 = r0/(1 − τ′C), S2 = r0/(1 − τ′C) − d. Marked values: Q∗, r∗, r0.]

The area between the supply and demand curves below the equilibrium is the “bondholder surplus.” This arises because rates are driven up to the point where the marginal investor’s tax rate is equal to the corporate rate, but all investors can get the same rate in the market.

A crucial assumption in Miller is the inability to perform tax arbitrage: selling assets taxed at a high rate to buy those taxed at a lower rate. Clienteles may arise because of differences in tax treatment of various organizational forms and differences in transaction costs [Shin and Stulz (1996)].

DeAngelo & Masulis (1980)

DeAngelo and Masulis (1980) generalize Miller (1977a) to include more realistic taxes, bankruptcy, and agency costs. In this “modified balancing theory” the full burden of bankruptcy or lending costs is not necessarily borne by the debtors. Some of these costs are shifted to bond buyers in the form of lower risk-adjusted interest rates. Miller’s irrelevance result is shown to be extremely fragile. When either non-debt tax shields or bankruptcy/agency costs create an increasing marginal cost, individual firms do have an optimal capital structure.

The single period model allows different tax rates for each investor, so long as the ordinary income tax rate is higher than the capital gains rate. (A multiperiod model with tax carryforwards, etc. would be qualitatively similar.) All firms face the same marginal tax rates. The setup results in three tax brackets: those who prefer debt, those who prefer equity, and those who are indifferent.


For the marginal investor µ

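\[ \frac{(1 - \tau^{\mu}_{B})\,\pi(s)}{P_B(s)} = \frac{(1 - \tau^{\mu}_{B})}{P_B} = \frac{(1 - \tau^{\mu}_{E})}{P_E} = \frac{(1 - \tau^{\mu}_{E})\,\pi(s)}{P_E(s)} \quad \forall\, s. \]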

Non-debt tax shields such as depreciation are given by ∆, Γ is the dollar amount of tax credits, and θ represents the maximum fraction of the tax liability that can be shielded by tax credits. There are four outcomes that result from different states.

Debt    Equity                                    State
X(s)    0                                         [0, s1]
B       X(s) − B                                  [s1, s2]
B       X(s) − B − τC [X(s) − B − ∆](1 − θ)       [s2, s3]
B       X(s) − B − τC [X(s) − B − ∆] + Γ          [s3, s]
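The four rows of the payoff table can be written as a single function of state earnings. This is only a sketch: the function name, argument order, and numbers are hypothetical, and the state boundaries are assumptions inferred from the payoffs (s1 where X(s) = B, s2 where X(s) = B + ∆, s3 where θ times the tax liability equals Γ), not statements taken from the paper.

```python
def dm_payoffs(X, B, delta, gamma, theta, tau_c):
    """Return (debt, equity) payoffs for state earnings X.

    delta: non-debt tax shields, gamma: dollar tax credits,
    theta: maximum fraction of the tax liability credits may shield.
    """
    if X < B:
        # [0, s1]: default, debtholders receive all earnings.
        return X, 0.0
    taxable = X - B - delta
    if taxable <= 0:
        # [s1, s2]: no tax liability, so some shields go unused.
        return B, X - B
    liability = tau_c * taxable
    if gamma >= theta * liability:
        # [s2, s3]: credits are capped at the fraction theta of the liability.
        return B, X - B - liability * (1.0 - theta)
    # [s3, s]: all credits are used.
    return B, X - B - liability + gamma


# Hypothetical firm: B = 100, delta = 20, gamma = 10, theta = 0.5, tau_c = 0.4.
for earnings in (50.0, 110.0, 150.0, 300.0):
    print(earnings, dm_payoffs(earnings, 100.0, 20.0, 10.0, 0.5, 0.4))
```

Under these assumed boundaries the equity payoff is continuous in X, since the third and fourth rows coincide where θ times the liability equals Γ.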

For all states up to s3 the firm loses some of its tax shields, even though it may not be in bankruptcy. The value of the firm is given by

\[ \int_{S} \left[ B(s) + E(s) \right] ds. \]

In Miller’s world, ∆ = Γ = 0 so s1 = s2 = s3. Taking the partial with respect to B, PB = PC(1 − τC). The interpretation of the flat section of the supply curve is that all tax shields are fully utilized in all states of nature. The curve begins to slope to compensate the firm for some of these tax shields going unutilized in some states.

In the new equilibrium, the net tax advantages of debt are equated with the expected default costs

(1 − τB) − (1 − τC)(1 − τS) = E[default and agency costs].

Firms with low earnings may lose some of the value of their tax shields. The incremental value of interest tax shields decreases as firms increase leverage, implying a negative slope for the supply curve of taxable corporate bonds. This is depicted as S1 in Figure 5.1.

Adding leverage-related deadweight costs d will cause the tax advantage of corporate borrowing to become more significant. At the margin, the deadweight cost per dollar of borrowing, d∗, is the same for all firms. The new supply curve S2 has a more negative slope because of the deadweight costs. This reduces the level of aggregate borrowing and the equilibrium risk-adjusted rate of return. Leverage-related deadweight costs increase the marginal tax advantage of borrowing because they decrease the supply of bonds, eliminating some of the “bondholder surplus.”


The existence of an optimal capital structure in this setting is essentially an empirical issue. Do deadweight costs and underutilization of tax shields have significant impacts on the rate of return to bondholders? There is evidence that deadweight costs and possible underutilization of tax shields are sufficiently significant to affect bond pricing. Evidence implies that leverage-related costs reduce the supply of corporate bonds and lower the cost of borrowing, generating a positive net tax advantage of corporate debt.

The theory also implies that firms that reach d∗ faster than others will have less leverage. In other words, firms that are more likely to encounter financial distress at a given debt ratio are less likely to borrow. Supportive evidence shows that there is a significant negative relation between observed leverage measures and historical failure rates. The probability of financial distress is also positively related to the variability of operating earnings. In sum, the evidence is consistent with the generalized balancing theory.

Myers (1977)

In Myers (1977) the firm is viewed as a collection of assets in place and growth opportunities. Risky debt reduces the value of the real options, an agency cost. This cost arises either from a suboptimal underinvestment strategy or from the costs of avoiding underinvestment. This underinvestment results even when managers are acting in shareholders’ best interest. The level of borrowing is inversely related to the relative size of the growth opportunities and is determined by the tradeoff between these costs and the tax benefits of debt. The shareholders absorb the costs of avoiding underinvestment, which include:

• Rewrite/renegotiate debt contract
• Shorten maturity prior to “exercise date”
• Mediation
• Dividend restrictions
• Reputation effects
• Monitoring

The basic analysis considers the value of a firm facing an investment opportunity requiring an investment I and paying V(s). A firm with risky debt P will take the project only if V(s) ≥ I + P, so projects with I ≤ V(s) < I + P are passed up even though they have nonnegative NPV.
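A small numerical sketch of the investment rule makes the underinvestment region explicit. The function names and numbers are hypothetical and only illustrate the inequality V(s) ≥ I + P.

```python
def shareholders_invest(V, I, P):
    """Shareholders fund the project only if the payoff covers I plus the debt P."""
    return V >= I + P


def is_underinvestment(V, I, P):
    """True when a project with V >= I is nonetheless rejected by shareholders."""
    return V >= I and not shareholders_invest(V, I, P)


# Hypothetical project: I = 100 with promised debt payment P = 30.
for payoff in (90.0, 120.0, 140.0):
    print(payoff, shareholders_invest(payoff, 100.0, 30.0),
          is_underinvestment(payoff, 100.0, 30.0))
```

In the middle case the project is worth more than it costs, but the gain would go to the debtholders, so equity declines to invest.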

The analysis can be extended to a multiperiod setting

Vt = VE,t + VD,t.


The firm will invest as long as the incremental benefit

dVE/dI = dV/dI − dVD/dI > 1.

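If the value of debt depends on the volatility of the firm value, then the transfer of value from equity to debt is

\[ \frac{dV}{dI} - \frac{dV_E}{dI} = \frac{dV}{dI} \cdot \frac{\partial f}{\partial V} + \frac{\partial f}{\partial \sigma^2} \cdot \frac{\partial \sigma^2}{\partial I} > 0. \]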

In conclusion, Myers’ work indicates that assets in place can support more debt than growth opportunities can, capital intensive businesses with high operating leverage can support more debt, and more profitable firms should have more debt. This logic is similar to Shleifer and Vishny (1992) who say more liquid assets can support more debt.

Masulis (1980)

Masulis (1980) examines the valuation effects of capital structure changes on security value. The sample of intrafirm exchange offers and recapitalizations abstracts from asset changes that accompany many other changes in capital structure. The types of transactions considered include issuing debt for equity (E → D), preferred for equity (E → P), and debt for preferred (P → D).

There are three primary sources of valuation effects. Tax-related stories predict changes in equity value to be positively related to increases in debt. Bankruptcy and reorganization expenses should cause a negative relation between equity value and leverage increases. Wealth redistribution from agency costs is a zero sum game, so gains to one group of security holders are at the expense of another group. Two other theories that are not considered are signaling and the offer premium hypothesis.

The methodology employed uses comparison period returns. This approach essentially calculates the abnormal return for a security as the deviation from the mean return over a comparison period. These abnormal returns are averaged across all securities to get a portfolio abnormal return.
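The comparison period approach is simple enough to sketch in a few lines. This is an illustration under stated assumptions rather than Masulis's actual procedure: the function names and return figures are hypothetical, and every security is assumed to have an event window of the same length.

```python
from statistics import mean


def abnormal_returns(event_returns, comparison_returns):
    """Event-window returns less the mean return over the comparison period."""
    benchmark = mean(comparison_returns)
    return [r - benchmark for r in event_returns]


def portfolio_abnormal_return(securities):
    """Average abnormal returns across securities, event day by event day.

    securities: list of (event_returns, comparison_returns) pairs.
    """
    per_security = [abnormal_returns(ev, comp) for ev, comp in securities]
    return [mean(day) for day in zip(*per_security)]


# Hypothetical two-day event window for two securities.
sample = [
    ([0.05, 0.01], [0.01, 0.01]),
    ([0.03, 0.00], [0.00, 0.02]),
]
print(portfolio_abnormal_return(sample))
```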

The results are largely consistent with the tax and wealth redistribution effects, but provide little evidence about the bankruptcy costs. Leverage increasing transactions tend to increase equity value, while leverage decreasing transactions tend to decrease shareholder value.


Table 5.1: Predictions in Masulis

Source        E → D    E → P    P → D
Tax           +        0        +
Bankruptcy    –        0        –
WR: E         +        +        –/0
WR: P         –        –        +
WR: D         –        –/0      –

Table 5.2: Predictions in Titman & Wessels

Attribute              Pred.    Significant Results
Collateral Value       +
Non-debt tax shield    –
Growth                 –
Uniqueness             –        Yes
Industrial             +
Size                   +        Small = ST
Volatility             –
Profitability          –        MV measure

Titman & Wessels (1988)

Titman and Wessels (1988) expand the range of capital structure theories tested and attempt to overcome some econometric problems. The paper uses a factor analytic technique, similar to some APT tests, to relate unobservable attributes to capital structure measures using observable data. The process involves estimating a measurement model and a structural model simultaneously. Although theoretically appealing, implementation requires imposing a number of restrictions on the loading matrix in the measurement model.

The authors use six D/E ratios as dependent variables obtained from all combinations of long-term, short-term, and convertible debt to book and market equity, {LT, ST, Conv}/{BE, ME}. The explanatory attributes are summarized in Table 5.2.

The results indicate that uniqueness is important. The authors believe this supports the costs of financial distress, but the proxies may also be related to non-debt tax shields and collateral value. The size effect for small firms is taken as evidence that transaction costs may be important. The analysis is unable to explain the cross-sectional variation in convertible debt. The lack of evidence in many cases may be due to problems with the measurement model.

Rajan & Zingales (1995)

The purpose of the Rajan and Zingales (1995) paper is to see if factors determined to be important in determining capital structure in the U.S. are also important in other countries. This research is valuable because many of the theories that explain capital structure were developed in response to empirical observations. The paper studies the G-7 countries: U.S., U.K., Canada, Japan, Germany, France, and Italy.

There are a few limitations to the analysis. First, there is a bias towards large, listed companies. Second, there are variations in industry concentrations across countries. Third, there are differences in financial statements and reporting across countries. Finally, bank- versus market-oriented economies may produce systematic differences.

The primary analysis in the paper relates four factors to leverage, which is measured in both book and market terms. The factors are tangibility, M/B, size, and profitability. A long list of control variables is also included. The authors also look at the distribution of wealth transfers out of the firm and find that these payments are generally made through the most tax-advantaged route.

The general results indicate that U.K. and German firms tend to have lower leverage than firms in the U.S. The factors generally have the hypothesized relation with leverage. Tangibility is positively related, M/B negatively related. Size is positively related except in Germany, where it is negatively related. Profitability is negatively related, except in Germany and France [this is opposite the predictions of Ross (1977a) and Myers (1977)].

Graham (1996)

Graham (1996) is the first paper to take a careful look at the role of marginal taxes in the capital structure decision. Economic theory indicates marginal rates are what matter, but previous studies have used statutory rates as a matter of convenience.

The approach for estimating marginal rates is to calculate the present value of current and future taxes on a $1 increase in income based on simulations. The main analysis regresses (D1 − D0)/D0 on the marginal tax rate, relative cost of debt, probability of bankruptcy, non-debt tax shields, and a list of control variables.

The results find that the marginal tax rate is important in explaining capital structure. The difference between statutory and marginal tax rates is also important, providing evidence that firms still use the statutory rate in the capital structure decision. Firms with volatile tax rates tend to use more debt, as expected with a progressive tax schedule. The relative cost of debt has the wrong sign, but there may be a multicollinearity problem.

5.5 Dividends

Developing a model of dividend policy consistent with firms maximizing profits and individuals maximizing utility has been a challenge. MM moved the thinking away from the view that more dividends were better. Dividend irrelevance in perfect markets is based on the idea of replicating any desired payoff by buying/selling shares. Transactions costs remove the ability of individuals to make homemade dividends. There may be clienteles that prefer dividends. There are also behavioral arguments, market timing stories, and institutional constraints (“prudent man” rules).

Stylized Facts:
• Corporations pay out a significant portion of earnings as dividends.
• Dividends have been the predominant form of payout.
• Individuals in high tax brackets receive substantial dividends.
• Corporations smooth dividends.
• Market reactions are positively correlated with dividend changes.

Black (1976) presents arguments for and against dividends as the “dividend puzzle.” A firm may choose to pay dividends to provide a return expected by investors, even though this may be irrational. With transactions costs, dividends may be a better way to distribute wealth to shareholders than selling a few shares. The dividends may be used to signal information, such as higher expected future earnings. Finally, dividends could be used to expropriate wealth from bondholders. Reasons not to pay dividends include tax avoidance, investment in growth opportunities, and the pecking order argument.

5.5.1 Factors Influencing Dividend Policy

Dividends and Taxes

Since capital gains taxes are typically lower than the tax on dividends, and capital gains can be deferred, there is a general tax disadvantage to dividends. This disadvantage may vary over investor types (low tax-rate individuals, corporations, tax-exempt institutions). The price drop on the ex-date has been well-documented to be less than the dividend amount. The average premium increases with the dividend yield, consistent with the tax clientele hypothesis. There is also evidence of abnormal volume around the ex-date, indicating there is not a (perfect) tax clientele.

Signaling with Dividends

Signaling implications that have been tested empirically include (i) dividend changes should be followed by subsequent earnings changes in the same direction, (ii) unanticipated changes in dividends should be followed by revisions in the market’s expectation of future earnings, (iii) unanticipated dividend changes should be accompanied by stock price changes in the same direction.

There is only weak evidence that dividend changes convey information about future earnings. There is evidence that earnings forecast revisions are positively related to both dividend changes and the market reaction (causality), consistent with the signaling hypothesis. There is fairly strong evidence of a positive relation between market reaction and dividend changes.

Agency Costs and Dividends

Expropriation of bondholders may come in the form of dividend payments. Under this hypothesis, equity increases in value with a payout, while debt loses value. Under the alternative that dividends signal good news, both debt and equity should increase in value. There is evidence that bond prices drop significantly with dividend decreases, but do not change significantly at an increase. This is consistent with the information content explanation. Dividends also reduce the free cash flow problem of Jensen (1986). In sum, there is weak empirical support for the informational content of dividends, and practically no support for dividends as a solution to agency problems.

5.5.2 Key Dividends Papers

Miller & Rock (1985)

Miller and Rock (1985) develop a model where firms use dividends to signal their quality in a setting where there is an information asymmetry about current earnings. The model has two periods. At time zero the firm invests in a project whose profitability is unobservable by investors. The project produces earnings at time one, which the firm uses to finance the dividend and new investment. The project produces additional earnings at time two, which are correlated with time one earnings. Good firms pay a level of dividends sufficiently high to make it unattractive for bad firms to copy them. Costs arise from the distortion in the investment decision. Dividends provide information about earnings through the sources and uses of funds identity. This model does not say why firms use dividends rather than repurchases.

• Outsiders can not observe the cash flows.
• All firms have identical investments with diminishing marginal rates of return.
• External financing is done only with riskless debt.
• All dividends and capital gains are taxed at τ.
• Dividends and repurchases are perfect substitutes.
• Firms signal by distributing cash and altering their investments.
• Good firms are able to distribute more cash and still match investments of bad firms.
• Bad firms can not afford to mimic the good firms because they would have to forgo projects with relatively high marginal returns.
• The equilibrium has deadweight costs relative to the perfect information case.

In this two period model a firm has a concave investment technology F(I) and makes investments It at t = 0, 1 that generate random earnings Xt+1 = F(It) + εt+1. The errors are unconditionally mean zero, but E[ε2|ε1] = γε1. The sources and uses of funds identity requires

I1 +D1 = X1 +B1,

where B1 is additional financing and D1 the dividend.


At time 1 the value of the shares is

V1 = D1 − B1 + [F (I1) + γε1]/(1 + r). (5.1)

The firm maximizes value by choosing I1, D1 and B1 subject to the sources/uses constraint. Substituting for the net dividend, the FOC is F′(I∗1) = 1 + r. The earnings announcement effect is

\[ V_1 - E_0[V_1] = \varepsilon_1 \left[ 1 + \frac{\gamma}{1 + r} \right] = \bigl[ X_1 - E_0[X_1] \bigr] \left[ 1 + \frac{\gamma}{1 + r} \right]. \tag{5.2} \]

The difference between actual and expected dividends is

(D1 −B1) − E0[D1 − B1] = X1 − E0[X1] = ε1

and the dividend announcement effect is the same as (5.2).

The dividend announcement reveals information about current earnings, which in turn are useful for predicting future earnings. There are two components to the announcement effect. The first is a dollar for dollar reaction to the dividend surprise. The second is the discounted future change arising from the persistence parameter. In this model, earnings announcements shortly after the net dividend announcement should not contain any new information. In practice such earnings announcements do appear to be informative. This is because they contain information on outside financing, which is not part of the gross dividend. Financing announcement effects are similar to dividend announcements, but with the sign reversed.
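The two components of the announcement effect can be verified against the valuation in (5.1). The sketch below holds investment I1 fixed and uses the sources/uses identity D1 − B1 = X1 − I1 with X1 = F(I0) + ε1; the function names and numbers are hypothetical.

```python
def share_value(net_dividend, F_I1, gamma, eps1, r):
    """Equation (5.1) with net_dividend standing for D1 - B1."""
    return net_dividend + (F_I1 + gamma * eps1) / (1.0 + r)


def announcement_effect(eps1, gamma, r):
    """Equation (5.2): dollar-for-dollar surprise plus its discounted persistence."""
    return eps1 * (1.0 + gamma / (1.0 + r))


def value_given_surprise(F_I0, I1, F_I1, gamma, eps1, r):
    """Value at time 1 when earnings are X1 = F(I0) + eps1 and I1 is held fixed."""
    X1 = F_I0 + eps1
    net_dividend = X1 - I1  # sources/uses identity: D1 - B1 = X1 - I1
    return share_value(net_dividend, F_I1, gamma, eps1, r)


# Hypothetical inputs: F(I0) = 10, I1 = 5, F(I1) = 6, gamma = 0.5, r = 10%.
surprise = 2.0
with_surprise = value_given_surprise(10.0, 5.0, 6.0, 0.5, surprise, 0.10)
expected = value_given_surprise(10.0, 5.0, 6.0, 0.5, 0.0, 0.10)
print(with_surprise - expected, announcement_effect(surprise, 0.5, 0.10))
```

The two printed numbers agree up to floating point error, since E0[V1] is the value with ε1 = 0.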

With intermediate trading, optimal policies are inconsistent because a firm could pay a higher dividend by forgoing investments and raise the stock price. A solution to this problem is underinvestment. The informational asymmetry is that at time 1 the market knows the initial investment and the first dividend, while the directors also know the cashflow and investment. That is, Ωm = {I0, D1} and Ωd = {I0, D1, I1, X1, B1, ε1}. As a result, the directors and market have different valuations of the firm. The directors value the firm according to (5.1). The market can only use its information in the valuation

\[ V_1 = D_1 - B_1 + E^m_1\left[ F(I_1) + \gamma \varepsilon_1 \mid \Omega^m \right] / (1 + r). \]

The managers choose the net dividend and investment to maximize the weighted average of the two valuations subject to the sources/uses constraint. The weights are the fraction owned by selling stockholders and the fraction retained. The public can use its information about the net dividend to infer the earnings for which the dividend is optimal. Although there are an infinite number of informationally consistent valuation schedules, one Pareto dominates the others. A firm with the lowest earnings will choose the same net dividend and investment level as in the full information case, giving a boundary condition. The solution to an ODE satisfying the maximization problem has all net dividends at least as large as the optimal level.

Higher dividends serve as a signal of higher current earnings. The better firms are able to pay out a higher dividend and forgo productive investments. Since the investment technology is concave, forgoing projects has a higher marginal cost for the lower quality firms. This separating equilibrium restores consistency, but at the expense of underinvesting.

There is some empirical evidence supporting the validity of dividends as signals. Examples include Vermaelen (1981) and Prabhala (1993), but DeAngelo, DeAngelo and Skinner (1996) do not find supportive evidence. Since the Miller and Rock (1985) model is in response to the observation that unexpected dividend changes are positively related to stock price changes, one would expect to find some supportive evidence.

Prabhala (1993)

Prabhala (1993) presents a framework where dividends serve as a signal of the quality of investment opportunities. This comes in response to earlier literature where Tobin’s q and dividend yield are claimed to explain stock price reactions to dividend announcements arising from agency costs of free cashflows and the existence of dividend clienteles. This same evidence is consistent with a signaling model which subsumes the importance of the other effects.

The motivation for the signaling interpretation is that the other interpretations are inconsistent with rational expectations. Since q, dividend yield, firm value, and stock price are useful in predicting dividends, they should be used in making optimal forecasts. The alternative interpretations depend on dividend changes being unanticipated.

This model can be viewed as an extension of Miller and Rock (1985), where the information asymmetry now relates to the quality of growth opportunities θ. A larger net dividend gets a higher market price at t = 1, but reduces investment and the cashflow at t = 2 which is distributed to the remaining (1 − k) shareholders. Since signal costs decrease with firm type, the better firms can afford to signal more than the lower quality firms.

The dividend yield effect has been interpreted as evidence supporting the existence of clienteles. Evidence that high-yield firms experience larger announcement effects is consistent with this argument. Prabhala reinterprets this evidence in a signaling framework where dividends are more informative about growth opportunities for high-yield firms. Also, these firms are less likely to have strong growth prospects so dividend increases are less likely.

Prior studies show dividend increase announcement effects for low q firms are larger than for the high q firms, consistent with the free cashflow hypothesis. This is because a reduction in FCF is more valuable for firms which tend to squander cash. The signaling interpretation reflects the market’s expectations: high q firms are more likely to have better growth prospects and are more likely to increase dividends so dividend increases result in smaller announcement effects.

The empirical methodology estimates a dividend forecast, then examines whether the deviation from the forecast explains price changes. The explanatory variables used to forecast the dividend are long-term dividend yield, q, firm value, stock price, the difference in long- and short-term yields, and stock volatility. Announcement effects are then regressed on dividend surprises. Results indicate a positive relation between dividend surprise and announcement effect, and dividends are more informative signals for high-yield firms. Tobin’s q has limited marginal benefit beyond the dividend surprise. There is little evidence of the agency or clientele effects after controlling for the signaling effect, although these former hypotheses are not explicitly rejected.
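The second step of this procedure, regressing announcement effects on dividend surprises, can be sketched as follows. This is a simplified illustration and not Prabhala's estimator: the dividend forecast is taken as given rather than estimated from the listed variables, a single-regressor OLS stands in for the full specification, and all names and data are hypothetical.

```python
def dividend_surprises(actual, forecast):
    """Surprise = announced dividend change less its forecast."""
    return [a - f for a, f in zip(actual, forecast)]


def ols_slope_intercept(x, y):
    """Ordinary least squares of y on a constant and x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical announced and forecast dividend changes.
actual = [0.10, 0.02, 0.05, 0.00]
forecast = [0.04, 0.03, 0.05, 0.02]
surprise = dividend_surprises(actual, forecast)

# Hypothetical announcement returns built to load 0.5 on the surprise.
returns = [0.01 + 0.5 * s for s in surprise]
print(ols_slope_intercept(surprise, returns))
```

A positive slope corresponds to the positive relation between dividend surprise and announcement effect described above.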

Vermaelen (1981)

Vermaelen (1981) examines the price behavior of securities when firms repurchase shares in a tender offer or on the open market. This allows testing the importance of information/signaling, personal taxes, corporate taxes, and bondholder expropriation.

Repurchases serve as a signal of firm value since managers’ ownership,etc. creates an incentive to increase stock price by announcing a tenderoffer. Repurchasing shares above their true value will dilute the value ofthe managers’ holdings. But with positive information, the manager may bewilling to pursue a tender offer. The more valuable the information, the lowerthe marginal cost to buying back large fractions, offering a higher premium,

5.5. DIVIDENDS 93

and holding more shares in the firm. The price during the offer is given by

PA = αPT + (1 − α)PE

with α being the ratio of the fraction purchased to the fraction tendered.

Vermaelen finds that repurchase announcements are followed by a permanent increase in stock price. Signaling seems to be the predominant influence. There is no evidence of wealth expropriation from bondholders or tendering shareholders. Those that do not tender are worse off than those that do, but they are better off than before. The results are inconclusive with respect to the leverage and personal tax hypotheses.
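As a numerical illustration of the pricing relation for the offer period, the sketch below evaluates the weighted average directly. It assumes that P_T denotes the tender price and P_E the price expected after the offer expires (the notes do not define these symbols explicitly), and the numbers are hypothetical.

```python
def price_during_offer(alpha: float, tender_price: float, expiration_price: float) -> float:
    """P_A = alpha * P_T + (1 - alpha) * P_E.

    alpha is the ratio of the fraction purchased to the fraction tendered,
    i.e. the chance that a tendered share is actually bought at P_T.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * tender_price + (1.0 - alpha) * expiration_price


# Hypothetical offer: half of the tendered shares are purchased at 23,
# and the stock is expected to trade at 21 after expiration.
example_price = price_during_offer(0.5, 23.0, 21.0)
```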

Open market transactions are associated with a negative CAR prior to the announcement, followed by an abnormal return of roughly 2% around the announcement. Tender offers exhibit a flat CAR prior to announcement, but an abnormal return on the order of 15% around the announcement. Following the announcement, the tender offers have a decline in the CAR, which is consistent with the expiration of some of the offers. Looking specifically at the expiration of offers, there is a negative abnormal return.

The abnormal return to shareholders, INFO, is regressed on a number of signaling variables to test this hypothesis. INFO is defined as

I/(N_0 P_0) = (1 − F_P)(P′_E − P_0)/P_0 + F_P(P_T − P_0)/P_0,

the weighted average of the return to tendered and non-tendered shares. The results are consistent with the signaling hypothesis. The size of the offer premium, target fraction, managerial ownership, and subscription level are all positively related to the value of information.
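The INFO measure is simple to compute once its inputs are known. The sketch below assumes F_P is the fraction of shares purchased, P_T the tender price, P′_E the post-expiration price, and P_0 the pre-announcement price; these readings of the symbols are assumptions, and the inputs are hypothetical.

```python
def info_return(p0: float, p_tender: float, p_expiration: float,
                fraction_purchased: float) -> float:
    """Weighted average return to tendered and non-tendered shares."""
    return_tendered = (p_tender - p0) / p0          # shares bought at P_T
    return_not_tendered = (p_expiration - p0) / p0  # shares still held at P'_E
    return (fraction_purchased * return_tendered
            + (1.0 - fraction_purchased) * return_not_tendered)


# Hypothetical offer: 25% of shares bought at 24, stock at 22 after
# expiration, pre-announcement price of 20.
example_info = info_return(p0=20.0, p_tender=24.0, p_expiration=22.0,
                           fraction_purchased=0.25)
```

Here the tendered shares earn 20%, the remaining shares earn 10%, and INFO is their weighted average of 12.5%.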

DeAngelo, DeAngelo & Skinner (1996)

DeAngelo, DeAngelo, and Skinner (1996) provide another test of dividends as signals. They identify stocks with a history of growth followed by a decline in earnings and examine the dividend policy before and after the decline. They find that dividends are not reliable signals of future earnings. The results could be due to overoptimistic managers who “oversignal,” to the relatively small cash commitment of a dividend undermining its credibility as a signal, or to signaling based on imperfect information.

At the year 0 dividend decision, 68% of the firms increase dividends while only 1% decrease dividends. There is no evidence of positive earnings surprises among these firms over the next three years, and some evidence of negative surprises. Dividend increases cause small abnormal returns at the announcement, but over the course of the year the firms have large negative abnormal returns. The dividend increasing firms have a less negative abnormal return than the decreasers, which suggests that managers may be able to prop up the stock price with a dividend increase.

Eades, Hess & Kim (1994)

Eades, Hess, and Kim (1994) examine the time series of ex-dividend day pricing and identify variation due to tax effects, strategic short-term trading (dividend capturing), and business cycle effects. They find the variability in pricing is positively correlated with dividend yield and dividend pricing is countercyclical. Dividend capturing reduces ex-date returns and depends on transactions costs, interest rates, and dividend yield.

The methodology forms ex-date portfolios on each calendar date. Standardized excess portfolio returns (SER) are the ex-date portfolio return (including the dividend) less the average non-ex-date portfolio return, divided by the estimated portfolio standard deviation. The portfolios are further subdivided into high-yield and low-yield portfolios. The SER of the low-yield portfolio is always positive, has relatively low variation, and zero to negative autocorrelation. The SER for the high-yield portfolio changes from positive to negative, is more volatile, and exhibits high positive autocorrelation.
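The SER definition can be written out directly. In the sketch below the portfolio standard deviation is estimated as the sample standard deviation of the non-ex-date returns; the notes do not say how it is estimated, so this is an assumption, as are the numbers.

```python
import statistics


def standardized_excess_return(ex_date_return: float,
                               non_ex_date_returns: list) -> float:
    """SER = (ex-date return - mean non-ex-date return) / estimated std. dev."""
    mean_non_ex = statistics.mean(non_ex_date_returns)
    # Assumption: sample standard deviation of the non-ex-date returns.
    std_dev = statistics.stdev(non_ex_date_returns)
    return (ex_date_return - mean_non_ex) / std_dev


# Hypothetical portfolio: ex-date return (including the dividend) of 3%,
# non-ex-date returns of 0%, 1%, and 2%.
example_ser = standardized_excess_return(0.03, [0.00, 0.01, 0.02])
```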

The tax effect hypothesis is tested by including dummy variables for different tax regimes in an ARIMA model. There is little evidence of a tax effect. The test of the dividend capturing hypothesis includes a dummy for the introduction of negotiated commissions. This lowers transactions costs and makes it easier for corporations to perform tax arbitrage. The dummy is significantly negative, especially for the high-yield firms. This is consistent with the dividend capturing hypothesis. The dividend capturing hypothesis also predicts dividend capturing is negatively related to T-bill yields and positively related to dividend yields. The evidence also supports these predictions. Analysis of the business cycle effects indicates that low-yield firms are valued countercyclically (procyclical ex-date returns). The high-yield firms do not exhibit this pattern because the dividend capturing effects work in an offsetting direction.


5.6 Corporate Control

Manne (1965)

The Manne (1965) paper is the first to introduce the idea of a market for corporate control. For the market for corporate control to be effective there must be a high positive correlation between managerial efficiency and share price. Takeovers lead to competitive efficiency among managers and are more efficient than bankruptcy. They allow increased mobility of capital which provides more efficient allocation of resources. Corporate control may be transferred through a proxy contest, direct share purchases, or mergers.

Proxy contests are the most expensive, most uncertain, and least used. This method tends to be used when the issue is over compensation, not managers’ policies. Proxy contests are more likely with dispersed share ownership. The share price generally rises on the announcement.

Direct share purchases may be open market purchases, direct purchases of blocks from large owners, or tender offers. With lower ownership concentration other shareholders are more likely to participate in the premium and outsiders are willing to pay less for control.

Mergers typically offer cost advantages over the other methods. In a merger the managers’ interests are generally in line with the owners’. The main exception is that managers do not have an incentive to buy managerial services as cheaply as possible. When incumbent managers recommend a merger there are likely to be side payments. Within an industry mergers may be an alternative to bankruptcy. These mergers typically reduce the information gap between the target and bidder.

Shleifer & Vishny (1986)

Shleifer and Vishny (1986) examine the role of large shareholders as monitors and the ways in which they bring about improvements in corporate policy. The basic idea is that someone needs to monitor the managers, but it is too expensive for small owners to do so. Large shareholders are better able to bear the monitoring costs and will do so when it is in their best interest.

In the model the large shareholder L has a probability I of getting a value improvement Z above q from a probability distribution F(Z) for a cost C(I). The large shareholder begins with α shares so he needs an additional .5 − α to attain control. If he invests C(I) he will bid q + π where

.5Z − (.5 − α)π − c_T ≥ 0    (5.3)

and c_T represents the costs of making the bid. The small shareholders will tender if

π − E[Z | Z ≥ (1 − 2α)π + 2c_T] ≥ 0.

Let π*(α) and I*(α) be the optimal amounts, and Z^c(α) be the cutoff value at which L is indifferent about taking over.

There are a number of important results. First, the premium decreases in L’s stake, π*′(α) ≤ 0. Second, a larger initial stake permits takeovers for smaller improvements, Z^c′(α) < 0. Third, with a larger stake L invests more in monitoring, I*′(α) > 0. Next, the expected increase in firm profits rises with α, given L has an improvement. Therefore, an increase in α decreases the premium but increases the market value of the firm. Increasing c_T will increase the takeover premium but decrease the market value of the firm. There is not an equilibrium where L attains more than the amount necessary for control, say 50%. This is because the small shareholders will infer that L is trying to profit at their expense.
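A small numerical sketch helps show where the conditioning event in the tendering condition comes from: setting (5.3) to zero and solving for Z gives the smallest improvement at which L is willing to bid a premium π, namely (1 − 2α)π + 2c_T. The function names and numbers below are hypothetical, and the premium is held fixed rather than solved for in equilibrium.

```python
def takeover_profit(z: float, premium: float, alpha: float, bid_cost: float) -> float:
    """Left-hand side of (5.3): L's gain from bidding q + premium."""
    return 0.5 * z - (0.5 - alpha) * premium - bid_cost


def min_improvement_to_bid(premium: float, alpha: float, bid_cost: float) -> float:
    """Improvement Z at which (5.3) holds with equality."""
    return (1.0 - 2.0 * alpha) * premium + 2.0 * bid_cost


# Hypothetical values: premium of 10 and bid cost of 1.
cutoff_small_stake = min_improvement_to_bid(10.0, alpha=0.1, bid_cost=1.0)
cutoff_large_stake = min_improvement_to_bid(10.0, alpha=0.3, bid_cost=1.0)
```

For a given premium the break-even improvement is lower when L starts with a larger stake, which is in the spirit of the second result above (the equilibrium premium also changes with α, which this sketch ignores).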

“Jawboning” is an alternative to a takeover. Essentially, L uses his size as a threat of takeover. The managers may then be willing to negotiate and make some of the changes L seeks. This method can be incorporated into the above analysis by including the condition that the left side of (5.3) be greater than αβZ, where β is the proportion of the potential value gain attainable through negotiation. Jawboning will typically be used for making less valuable improvements since the costs are typically lower. As before, the value of the firm increases with α, but now the option to jawbone can actually make the larger shareholder worse off. This is because the required bid on the takeovers rises. Small shareholders can be worse off as well since takeovers are typically more valuable to them than private negotiation.

Assembling a large block is a complicated problem. If L can accumulate a position anonymously he can deprive small shareholders of their gains from his larger holding. If L trades publicly small shareholders will bid the price up to reflect the potential value gain. This makes it expensive for L to get his position. He will want to increase his position again to offset these additional costs. But the small shareholders will see this and hold out from selling the first time. Similarly, L will never fragment his stake because doing so reduces the value of his remaining shares since there will be less monitoring. Assembling a block is a one-way proposition. It is expensive to do, so once done it should not be undone. Large blocks should be sold intact to preserve the value of monitoring.

Dividends may provide the compensation to L necessary to get him to assemble a block. Large shareholders are typically corporations who enjoy tax benefits on dividend income. Dividends are a sort of bribe from the small shareholders to the large to get them to serve as monitors.

Stulz (1988)

Stulz (1988) shows that management’s voting power is important in determining capital structure. For small α, ∂V/∂α > 0; for large α, ∂V/∂α < 0. The intuition is that the premium offered in a takeover increases with α, but the probability of an offer falls. When α is too high, it is beneficial to make a takeover less costly to management with a golden parachute, for example. There is no benefit in this model to the manager holding the controlling interest. He will be able to block any takeover in this case. This implies α* ∈ [0, 1/2). The conflict of interest in the model arises from the fact that successful tender offers affect the wealth of outside shareholders and managers differently.

These results are demonstrated in a single period model where the manager owns α of an all equity firm. At the beginning of the period there is homogeneous information and a bidder decides if he wants to get information on the target. He pays I for information delivered at the end of the period. The bidder will bid for half the shares at a price of the no-bid value plus a premium on all the shares, y/2 + P. All the benefits of the value increase go to the target. The probability of a successful offer depends on the likelihood shareholders’ tax rates are low enough to accept the bid and the fraction of outsiders needed to make the offer a success. The bidder chooses P to maximize the difference between the gain and the premium times the probability of making a successful bid. With α > 0, the bidder has to persuade z(α) = 1/(2(1 − α)) > 1/2 of the outsiders to tender. Increasing α decreases the probability of a successful bid so the bidder’s expected value falls as well. For the bidder the optimal premium increases with α.
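The expression for z(α) follows from the bidder needing half of all shares while only the outsiders, who hold 1 − α, are tendering. A minimal sketch with illustrative values of α:

```python
def required_outsider_fraction(alpha: float) -> float:
    """z(alpha) = 1 / (2 * (1 - alpha)): share of outsiders who must tender."""
    if not 0.0 <= alpha < 0.5:
        raise ValueError("alpha must lie in [0, 1/2); otherwise the manager can block the bid")
    return 1.0 / (2.0 * (1.0 - alpha))


# Illustrative managerial stakes of 0%, 25%, and 40%.
fractions = [required_outsider_fraction(a) for a in (0.0, 0.25, 0.40)]
```

With no managerial stake the bidder needs half of the outsiders; at α = 0.25 he needs two thirds, and at α = 0.40 five sixths, which is why the probability of success falls as α rises.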

Allowing the manager to tender preserves the general results. More risk-averse managers will hold fewer shares since they are risky. With DARA preferences, α will increase with the manager’s wealth. Managers with greater benefits from control will hold more shares to protect their interests. Managers will also hold more shares when the sensitivity of offer success to changes in ownership is large.

Figure 5.2: Managerial Ownership and Firm Value (firm value V against managerial ownership α; curves labeled MSV, JM, and Stulz)

Due to risk aversion and budget constraints, managers typically hold only a small portion of the shares. Alternatives that give them (some) additional voting power can increase firm value. Changing the debt ratio or repurchasing shares will increase α. Convertible debt and delayed conversion can also help since conversion will decrease α. By changing the requirements for control, a super-majority rule or differential voting rights effectively increase α. The manager may also have voting power over shares he does not own. In ESOPs and pensions the manager is often the trustee. A standstill agreement gives the manager voting power over a large shareholder’s position but may also effectively eliminate a bidder.

Morck, Shleifer & Vishny (1988)

Morck, Shleifer, and Vishny (1988) offer an empirical test of the effect of managerial ownership on firm value. High managerial ownership may be good because it aligns the incentives of the managers with the shareholders. However, too much managerial ownership may be bad because the manager may become entrenched. The analysis studies the relation between Tobin’s q and managerial ownership after controlling for intangible assets, tax shields, size, and industry. More specific tests distinguish between insider and outsider ownership and connections to founding families.

In the study management is defined as the board of directors. Tobin’s q is regressed on measures of growth opportunities (R&D/A, Adv./A), tax shields/capital structure (D/A), size (A), industry dummies, and dummies for board ownership. The ownership dummies indicate ownership up to 5%, from 5% to 25%, and above 25%. The results indicate that there is a positive relation between ownership and firm value at very low and very high levels of ownership, and a negative relation in between. The explanation is that the incentive alignment effect is always present, but the entrenchment effect is not important until ownership is sufficiently high. Also, managers become fully entrenched at some point, while the incentive effect continues to increase. The results are robust to different ownership breakpoints and measures of firm value. Analysis of the board composition indicates that outsiders are slightly better monitors but still become entrenched. Close connection to the founding family increases value in new firms but decreases value in old firms. The main results are depicted graphically in Figure 5.2 (not drawn to scale). These results are consistent with a combination of the predictions in Jensen and Meckling (1976) and Stulz (1988).
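One way to let the slope of q on ownership differ across the three ranges is to split board ownership into piecewise-linear terms at the 5% and 25% breakpoints. The sketch below assumes that specification for illustration; the notes describe the ownership terms as dummies, and the function name is hypothetical.

```python
def ownership_segments(ownership: float, low: float = 0.05, high: float = 0.25):
    """Split board ownership into the portions falling in
    [0, low], (low, high], and above high."""
    seg_low = min(ownership, low)
    seg_mid = min(max(ownership - low, 0.0), high - low)
    seg_high = max(ownership - high, 0.0)
    return seg_low, seg_mid, seg_high


# A board owning 3% only loads on the first term; a board owning 30%
# loads on all three (5% + 20% + 5%).
small_board = ownership_segments(0.03)
large_board = ownership_segments(0.30)
```

Regressing q on the three terms (plus the controls) would then give a separate slope for each range, which is how a positive, negative, positive pattern can be read off the coefficients.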

These results may be partly due to the fact that managers in high q firms are more likely to have more stock. This is likely to be especially important in the low ownership range and can induce a spurious correlation between ownership and q.

Cotter, Shivdasani & Zenner (1996)

Cotter, Shivdasani, and Zenner (1996) examine the effect that outside directors have on the value of target shareholders. This is a situation where the board should be particularly important. Insiders on the board may have incentives that are different from those of outsiders. The results indicate that target shareholder gains are 20% larger when the board is independent. The value comes at the expense of the bidder shareholders. Outsiders are associated with higher initial bids and greater offer revisions. With an independent board, defense mechanisms such as poison pills enhance shareholder returns rather than entrench managers. Target gains are negatively related to interlocking boards and positively related to ownership of insiders.

The study examines the impact of board composition on the initial premium, premium revisions, and target shareholder gains. Board members are classified as independent, insiders, or gray. Control variables include size, poison pills, golden parachutes, managerial ownership, block ownership, and performance.

5.7 Mergers and Acquisitions

There are many potential benefits to mergers and acquisitions. These takeovers can remove inefficient management, achieve economies of scale, or generate synergies. Offsetting these benefits are costs such as wealth redistributions and reduced efficiency, which may arise due to the numerous conflicts of interest and informational asymmetries.

Stylized Facts

• target SH earn large positive AR and negative AR on failure
• bidding SH earn zero to negative AR
• multiple bidder contests magnify AR
• bidder AR were lower in 1980’s than before
• joint MV increases on average
• success is highly uncertain and positively related to bid premium and toehold
• defensive measures reduce probability of success
• target reaction to defensive measures and greenmail is negative
• large target mgt. share increases bid premium
• prob. of hostile takeover lower with high target D/E
• bid revisions are large jumps
• puzzlingly low toeholds
• mixed evidence about means of payment

5.7.1 Tender Offers

Tender offers can be either conditional or unconditional on attaining a critical level of participation. A target shareholder is more likely to tender if he thinks the post-takeover value is low and if he thinks he is pivotal. A shareholder has an incentive not to tender if he thinks the post-takeover value is high. He can let others tender so the takeover succeeds, giving him much of the benefit.

Complete Information

With complete information about the future value, no shareholders will tender for less than the future value; all of the potential benefits go to the target shareholders and none to the bidder. The bidder may be able to make a profit by diluting the value of minority shares after the takeover. The threat of this dilution may induce the target SH to tender at a price less than the full future value. If the bidder has a toehold he will also be able to profit even without dilution. In practice the gains on the toehold are not likely to be important since toeholds are typically small. A bidder may be able to threaten the target SH in other ways to get them to tender as well. One example is to threaten to enter the target’s market and compete with them, thereby reducing the value of the target.

Incomplete Information

A bidder may have a better idea about the future value than the target. Under rational expectations, targets know that bidders will try to use their superior information to under-bid. In equilibrium, the free-rider problem remains and target shareholders will still refuse to tender. There are two types of equilibrium, one where offers are uninformative, the other where the offer provides information.

This problem was originally studied in Grossman & Hart (). The two-tiered tender offer and dilution of holdout shares are potential solutions to the problem. The difference is the type of signaling possible in a two-tiered bid, which allows separation of the signals for undervaluation and private synergies. This type of offer can eliminate the incentive to free-ride without voluntary dilution. Another approach is to solve the free-rider problem by allowing the individuals to realize the effect their action has on the outcome. This is a rational, but not a competitive, outcome.

In Shleifer and Vishny (1986) there are incentives in the form of dividends for large shareholders to monitor the managers. This increases the value of the shares for all shareholders, including the small shareholders. The intention of acquiring a large block also raises the share price, making it more costly to acquire the block. The dividend incentive argument, which presumes the large shareholders are corporations, is not well-supported empirically (cite ??).

A low bid may signal that the expected improvement is small. Since bidders with high potential improvements have a stronger incentive to bid high, a low bid is a credible signal. The probability of an offer’s success increases with the bid premium and size of the toehold and decreases with the number of additional shares needed for control.


Defensive Actions

Defensive actions may reduce shareholder value if they block a potentially good takeover, but they may also improve value by increasing the incentive to bid high and encouraging other bids. Some actions, such as the poison pill, reduce the incentive to bid high. In general, strategies that impose greater costs on the bidder when the offer succeeds than when it fails reduce the incentive to bid high. In summary, some defensive measures are in the shareholders’ best interest, while others are used to create private benefits for the manager.

More subtly, a defensive measure may change the informational asymmetry. Decreasing the importance of publicly known improvements decreases the probability of success. Another defensive measure is to signal to the target shareholders that their shares are undervalued, in which case they are less willing to tender. Target management may do this by increasing leverage and/or repurchasing shares. There may also be reputational effects to consider.

Pivotal Shareholders

A pivotal shareholder is more likely to tender than a non-pivotal one. A large blockholder is much more likely to be pivotal than a small investor. The ability of a bidder to revise a bid becomes very important with pivotal shareholders.

Means of Payment

The means of payment has important consequences for the information revealed by the bidder. Offering equity may indicate that the bidder’s shares are overvalued [Myers and Majluf (1984)]. An offer of cash may signal high value. Cash offers create an adverse selection problem for the bidder. Offering equity can reduce the risk of overpayment by making the terms of the offer contingent on the target’s value. The target shares in gains or losses so it will tend to reject the transactions that are likely to be undesirable. Finally, there are tax advantages to using at least 50% equity financing. With no private information on the part of the target, the target can increase the auction price with equity. If the target has private information about the synergy, the bidder could benefit by conditioning the target’s acceptance on its value.


5.7.2 Competition Among Bidders

In the English auction model of bidding, bidders trade incremental bids until the bidders with the lowest valuations drop out. The winning bid will be just above the second highest valuation. A very important assumption is that bids can be costlessly revised and resubmitted.
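The outcome of the costless English auction can be sketched as follows. The bid increment and the valuations are hypothetical, and the function simply reports the end state rather than simulating each round of bidding.

```python
def english_auction(valuations, increment=1.0):
    """Return (index of winner, winning bid) when bids can be revised costlessly.

    Bidders drop out once the price exceeds their valuation, so the
    highest-valuation bidder wins at roughly the second-highest
    valuation plus one increment (never above his own valuation).
    """
    if len(valuations) < 2:
        raise ValueError("need at least two bidders")
    order = sorted(range(len(valuations)), key=lambda i: valuations[i], reverse=True)
    winner, runner_up = order[0], order[1]
    winning_bid = min(valuations[winner], valuations[runner_up] + increment)
    return winner, winning_bid


# Three hypothetical bidders valuing the target at 100, 120, and 90.
winner, winning_bid = english_auction([100.0, 120.0, 90.0])
```

The bidder who values the target at 120 wins at 101, just above the second-highest valuation of 100.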

If bids are costless to submit but there is an investigation cost, the bidder’s strategy will change. Now he will want to submit a large initial bid to avoid a costly bidding contest. The high initial pre-emptive bid does not deter other bidders directly by requiring other bidders to improve, but rather it is a signal that the initial bidder has a high valuation and reduces the probability of additional bidders. All the bidders that decide to investigate will then enter into the English auction. If bids are costly to submit, then the revised bids will move in large steps.

Management may undertake activities to discriminate among bidders. In general, exclusion of bidders is viewed as bad for target shareholders. There are some reasons that these defensive measures may be good. For example, target management may reject a bid if the target firm is worth more, or if it is likely that other bids will come. Other measures may be repurchasing shares to make a takeover more difficult, or removing the incentive for the takeover by fixing existing problems. Removing bidders may increase ex ante the frequency of bidding competition. It is optimal to pay greenmail only if there is no white knight.

5.7.3 Managerial Power

Managers have potentially conflicting interests of maximizing shareholder value and looking out for their own best interest. Increasing target debt levels may be a way of reducing some of the agency costs that may be the impetus for the takeover. High leverage may also allow the target to capture a greater fraction of the bidder’s improvements. Shifts in debt levels can also affect management’s voting power and gains from change in control. If the supply of shares is upward sloping, bidders must offer larger premiums. This causes the level of the bid to increase with the manager’s share ownership.


5.7.4 Key Papers

Roll (1986)

The hubris hypothesis of Roll (1986) suggests that takeovers occur because managers overestimate their own abilities. Under this hypothesis the gains to takeovers are small or non-existent. This explanation is not inconsistent with strong-form efficiency, whereas other explanations require at least temporary inefficiency. There is some evidence that is generally consistent with this idea.

Since the current market price is a lower bound on bids, only bids with relatively high valuations are observed. Thus takeover attempts are likely to contain random overvaluation errors. Since these transactions are driven by a relatively small number of people and depend heavily on individual decisions, irrational behavior is more likely.

The theory predicts that the bidding firm will have a price decline on the announcement followed by a further decline on winning or an increase on losing the bid. The total gains to the takeover should be non-positive. Gains to the target come at the expense of the bidder and transactions costs are a deadweight loss. There should be more hubris among firms that have been successful recently.

Lang, Stulz, & Walkling (1989)

Lang, Stulz, and Walkling (1989) examine the variation in tender offer abnormal returns to understand the determinants of the bid premium. The results indicate benefits are greatest for high q bidders and low q targets, consistent with the Jensen (1986) FCF argument. The evidence that high q bidders profit indicates that the hubris hypothesis is not a complete description of the process.

In successful tender offers the typical bidder has a low q for several years prior to the bid. The target’s q tends to have declined recently. The takeovers creating the most value are high q firms acquiring low q firms. The most value is lost when low q firms acquire high q firms. These results could also be interpreted to mean that value is created when undervalued firms are acquired and destroyed when overvalued firms are acquired.

The analysis regresses gains on dummy variables for the q of the bidder and target. The q is measured as either the average over three years prior to the bid or from the most recent year. The regressions do not control for the growth opportunities, form of payment, or number of bidders. This is a problem because q can measure not only management ability, but also the growth potential of the firm. Separate regressions are performed for bidder gains, target gains, and total gains. Each is further subdivided by whether there are opposing offers.

Berger & Ofek (1995)

Berger and Ofek (1995) examine the effect of diversification on firm value. Diversification programs were popular in the 1950s and ’60s, but more recently firms have moved in the opposite direction. The authors compare the sum of imputed stand-alone values for a firm’s segments to the market value of the firm. The general conclusion is that diversification tends to reduce firm value by roughly 15%. Unrelated diversifications destroy the most value.

There are many potential benefits and costs to consider in analyzing the value of diversification. There are gains in operating efficiency, increased debt capacity, reduced taxes, and efficiencies with internal capital markets. Potential agency costs such as FCF, cross-subsidization, and incentive conflicts between and among the divisions weigh against the benefits.

Imputed valuations are based on the median ratio of capital to EBIT, assets (A), or sales (S) among single-segment firms. For each multi-segment firm the value from diversification is the log of the ratio of actual value to imputed value.[2]
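The excess value measure can be illustrated with a short calculation. The sketch below uses sales multiples; the segment figures, industry names, and multiples are made up, and the function names are hypothetical.

```python
import math


def imputed_value(segment_sales: dict, industry_multiples: dict) -> float:
    """Sum over segments of segment sales times the median capital-to-sales
    ratio of single-segment firms in that segment's industry."""
    return sum(sales * industry_multiples[industry]
               for industry, sales in segment_sales.items())


def excess_value(actual_value: float, imputed: float) -> float:
    """Log of the ratio of actual value to imputed value."""
    return math.log(actual_value / imputed)


# Hypothetical two-segment firm with a market value of 136.
segments = {"chemicals": 100.0, "retail": 50.0}
multiples = {"chemicals": 1.2, "retail": 0.8}  # assumed industry medians

firm_imputed = imputed_value(segments, multiples)
firm_excess = excess_value(136.0, firm_imputed)
```

Here the imputed value is 160 and the firm trades at 85% of it, so the excess value is negative (a diversification discount).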

The results indicate that multi-segment firms tend to have lower ratios. Regressions of excess value on multi-segment indicators indicate that diversification reduces value by 15%, even after controlling for size, profitability, and growth. Acquisitions of related segments tend to be less harmful than diversifying acquisitions. The results are robust to imputed measure and persist over time.

In examining the sources of value gain or loss the authors consider overinvestment, cross-subsidization, and tax effects. Overinvestment, as measured by capital expenditures to assets in low-q segments, is negatively related to excess value, especially for diversified firms. The cross-subsidization effect is captured by a dummy variable for negative cashflows in a segment. Again, the effect is more negative for multi-segment firms. The evidence suggests that the tax benefits are economically insignificant.

[2] This is biased towards a diversification discount since logs are not symmetric about 1. Also, the allocation of overhead and reliance on segment-level reporting may create biases in the imputed valuations.


Mitchell & Lehn (1990)

Mitchell and Lehn (1990) ask “Do Bad Bidders Become Good Targets?” The answer seems to be yes. The idea is to see whether takeovers discipline managers of firms that have demonstrated poor acquisition programs. This suggests that at least part of the gains to targets may be in reduced agency costs. The authors find that there is little change in value for acquisitions in general. But there is a significant decrease in value for bidders who are subsequently acquired. For all firms, the average gain for an acquisition that is later divested is smaller. This effect is especially true for firms that later become targets themselves. Finally, the probability that a firm becomes a target is inversely related to the announcement effects of its acquisition.

In defining bad bidders it is important to distinguish between overpayment, which cannot be fixed, and poor ongoing performance, which presumably can be fixed. Also note that most targets of hostile takeovers did not previously make an acquisition, so this is only a partial explanation.

The main analysis is based on an event study methodology of abnormal bidder returns around the bid announcement. Average abnormal returns for different classifications of the bidders are compared. On average, bidders earn a negligible return, but non-targets actually earn a positive return. Subsequent targets, especially those in hostile takeovers, earn negative returns. Since the divestiture rate is higher for subsequent targets than for non-targets, it appears that the bad bidders are bad because they have poor ongoing performance. Logit regressions give evidence that firms that make bad acquisitions are more likely to get takeover offers than firms that make good acquisitions.
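The mechanics of an event study can be sketched with market-adjusted returns, where the abnormal return is the firm’s return less the market return and the CAR is the running sum over the event window. The choice of a market-adjusted benchmark is an assumption made here for simplicity (the notes do not state which benchmark the paper uses), and the returns are hypothetical.

```python
def abnormal_returns(firm_returns, market_returns):
    """Market-adjusted abnormal returns: firm return minus market return."""
    if len(firm_returns) != len(market_returns):
        raise ValueError("return series must have the same length")
    return [r - m for r, m in zip(firm_returns, market_returns)]


def cumulative_abnormal_returns(ars):
    """Running sum of abnormal returns over the event window."""
    car, total = [], 0.0
    for ar in ars:
        total += ar
        car.append(total)
    return car


# Hypothetical three-day window around a bid announcement.
bidder = [0.010, -0.020, 0.030]
market = [0.005, 0.000, 0.010]

ar = abnormal_returns(bidder, market)
car = cumulative_abnormal_returns(ar)
```

Averaging such CARs within each bidder classification (non-targets, subsequent targets, hostile targets) gives the comparisons described above.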

Mitchell & Mulherin (1996)

The Mitchell and Mulherin (1996) paper addresses the impact of industry shocks on the high level of restructuring in the 1980s. The hypothesis is that tender offers, mergers, and LBOs are among the lowest cost means of responding to industry change. The study is motivated by the high concentration of restructuring within industries. If these activities are driven by industry effects, the announcement of one firm in an industry should provide information about the prospects of the other firms in that industry. In this sense it is not surprising that we see poor performance following a takeover. These activities are not the cause of a problem, but rather a response to a problem.

The study is based on the roughly 1,000 Value Line firms in 1981. These firms are tracked throughout the rest of the decade and marked as to the type of takeover target they were (if at all). The analysis indicates that over the full period there is significant clustering of takeover activity at the industry level. Furthermore, within industries there is also clustering over time. Across all industries takeovers are spread fairly evenly over time. This provides evidence that takeovers are responses to industry-specific shocks. Further analysis indicates that this industry clustering was less common in the 1970s. Regressions of takeover activity on variables measuring sales and employment shock and growth indicate that it is industry change, not growth, that drives the takeovers. The findings in this paper indicate the problem of asset liquidity in Shleifer and Vishny (1992) may be important.

5.8 Financial Distress

When a firm faces financial distress a number of problems arise. Depending on how severe the distress, the firm may be tempted to underinvest or engage in risk-shifting. A firm in financial distress can attempt to reschedule its debt, raise cash via the issuance of new securities, or sell some of its assets.

It is important to distinguish between financial and economic distress.Bankruptcy proceedings are intended to do so by directing assets to theirgreatest use. In some cases this means liquidating the assets and dissolvingthe firm, whereas in other cases it means reorganizing the firm and its financ-ing to preserve going-concern value. There is debate over whether marketsor courts are better at resolving distress.

There is evidence that competing firms experience a stock price dropon the bankruptcy announcement, indicating that the announcement sig-nals poor industry conditions [see Mitchell and Mulherin (1996)]. However,firms in concentrated industries with low leverage have price increases. Zen-der (1991) discusses optimal security design that implements efficient invest-ment. Bankruptcy is the mechanism that creates a state-contingent transferof control.

The direct costs of bankruptcy are relatively small. Indirect costs maybe more significant, but are hard to measure. Liquidation costs are distinctfrom costs of financial distress and arise in the process of selling assets.


5.8.1 Factors Affecting Reorganizations

Free Rider Problem

A debt restructuring requires unanimous approval of all holders of a class of security. To get around this requirement, a firm can use an exchange offer, where security holders have the right to participate in the exchange. Since the restructuring is designed to increase the health of the firm, the old debt increases in value. Therefore, some of the bondholders may hold out of the exchange and capture this increase in value. Since all the bondholders (in a given class) have the same incentives, the exchange is likely to fail.

This problem can be solved with different indenture provisions ex ante, or with coercive participation. Examples of indenture provisions include granting a trustee the right to accept offers on behalf of the bondholders, requiring only a majority for approval, or including a “continuous” call provision. Coercive methods include ex post modification of the covenants directly.
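The holdout incentive can be seen in a small numerical sketch. All figures below are hypothetical and chosen only for illustration; the sketch assumes each bondholder is too small to affect whether the exchange succeeds, and that an untendered old bond gains value once the firm is healthier.

```python
# Hypothetical numbers, chosen only to illustrate the holdout incentive.
NEW_BOND_VALUE = 70.0     # value of the new bond received by a tendering holder
VALUE_IF_FAIL = 50.0      # value of an old bond if the exchange fails
VALUE_IF_SUCCEED = 95.0   # value of an untendered old bond if the exchange succeeds


def payoff(tender: bool, exchange_succeeds: bool) -> float:
    """Payoff to one small bondholder who cannot change the outcome alone."""
    if not exchange_succeeds:
        # The offer is withdrawn and everyone keeps the old bond.
        return VALUE_IF_FAIL
    return NEW_BOND_VALUE if tender else VALUE_IF_SUCCEED


for succeeds in (True, False):
    tender_payoff = payoff(True, succeeds)
    holdout_payoff = payoff(False, succeeds)
    print(f"exchange succeeds={succeeds}: "
          f"tender={tender_payoff:.0f}, hold out={holdout_payoff:.0f}")
```

Under these assumed numbers holding out is never worse and is strictly better when the exchange succeeds, even though bondholders as a group prefer a successful exchange (70 each) to failure (50 each).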

Information Asymmetries

Insiders and outsiders may disagree about the value of the firm due to differential information. Further, they may have incentives to intentionally misrepresent the value of their claims. The state of financial distress may be misrepresented as well (e.g., discount bonds). Insiders of a firm with poor prospects may hide the truth, whereas insiders of a firm with better prospects may claim they are in distress hoping for a favorable debt renegotiation. Intermediate payments such as coupons and deviations from the APR rule can reduce these problems.

Agency Costs

The various investor groups and managers have different incentives in the bankruptcy process, leading to conflicts of interest. Some of these groups may join together to form a coalition to increase their bargaining power.

Managerial Behavior

Fama (1980) posits that a competitive market for managerial talent is an important mechanism to control the behavior of corporate managers. Managerial behavior is likely to be influenced by financial distress for several reasons, including direct financial effects, potential loss of future income, loss of firm-specific human capital, loss of power/prestige, and reputation effects. It is difficult to observe managerial ability, so it is hard to tell if financial distress is due to poor management, the wrong incentives, or an adverse environment. There is evidence of increased board turnover prior to financial distress, just when the board is most needed to monitor the managers. Managers of distressed firms are many times more likely to experience turnover than managers at healthy firms. There is also evidence that the role of the board changes after restructuring.

5.8.2 Private Resolution

Private resolution of financial distress involves activities outside formal bankruptcy proceedings. Common techniques include exchange offers, tender offers, covenant modification, maturity extension, or rate adjustment.

Evidence on Restructurings

Asset and financial characteristics jointly affect the choice of restructuring mechanism. Private workouts are more common for firms with (i) more intangible assets, (ii) fewer classes of debt, and (iii) greater reliance on bank financing. There is evidence that the market is capable of predicting whether a workout will be successful, and that workouts are a more efficient form of reorganization than Chapter 11.[3] Evidence from the Japanese markets indicates that firms with close ties to a main bank are able to invest more and increase sales more following the onset of financial distress. The close relationship with the main bank internalizes some of the free rider and asymmetric information problems.

Asset Sales

A firm may sell some of its assets to relieve its financial distress. Asset sales may be different for distressed firms than for healthy firms. As discussed in Section 5.15, Shleifer and Vishny (1992) suggest that the secondary market for interfirm asset sales may be subject to adverse liquidity problems. The purchaser may be exposed to unique risks in the transaction with the distressed firm, or they may also be distressed if there are industry problems. These factors combine to reduce the attractiveness of the asset sale. Evidence indicates that asset sales among distressed firms are more common when the firm has several divisions.

[3] This may be misleading since the firms that choose Ch. 11 may have done so optimally given the characteristics of their bankruptcy.

New Capital

If the firm still has good projects it may wish to acquire additional capital. If the firm is in distress, it may have difficulty raising capital, as in Myers (1977). Underinvestment arises because much of the benefit from the new capital goes to the old debtholders. To solve this problem, new securities should be senior and/or asset-backed.
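A small two-state sketch makes the underinvestment argument concrete. Everything here is an illustrative assumption rather than part of Myers (1977): risk neutrality, a zero interest rate, strict priority, and the specific payoffs. It compares funding a positive-NPV project with new junior money from equityholders against funding it with new senior debt.

```python
# All numbers are hypothetical; risk-neutral pricing with a zero interest rate.
STATES = {"good": 0.5, "bad": 0.5}        # state probabilities
ASSETS = {"good": 150.0, "bad": 50.0}     # value of assets in place
OLD_DEBT_FACE = 100.0
PROJECT_COST = 20.0
PROJECT_PAYOFF = 30.0                     # same in both states, so NPV = 10


def expected_claims(extra_value: float, senior_face: float):
    """Expected payoffs (new senior debt, old debt, equity) under strict priority."""
    senior = old = equity = 0.0
    for state, prob in STATES.items():
        value = ASSETS[state] + extra_value
        s = min(value, senior_face)              # new senior claim is paid first
        o = min(value - s, OLD_DEBT_FACE)        # then the old debt
        e = value - s - o                        # equity is the residual
        senior += prob * s
        old += prob * o
        equity += prob * e
    return senior, old, equity


_, old_before, equity_before = expected_claims(0.0, 0.0)

# Case 1: equityholders pay for the project themselves (a junior claim).
_, old_junior, equity_junior = expected_claims(PROJECT_PAYOFF, 0.0)
equity_gain_junior = equity_junior - equity_before - PROJECT_COST

# Case 2: the project is funded by new debt that is senior to the old debt.
senior_value, old_senior, equity_senior = expected_claims(PROJECT_PAYOFF, PROJECT_COST)
equity_gain_senior = equity_senior - equity_before

print("junior funding: equity gain", equity_gain_junior,
      "old debt gain", old_junior - old_before)
print("senior funding: equity gain", equity_gain_senior,
      "old debt gain", old_senior - old_before,
      "senior debt value", senior_value)
```

With junior funding the old debtholders capture more than the project's NPV and equityholders lose, so the project is passed up; with senior funding the new lender breaks even and equityholders share in the gain.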

5.8.3 Formal Resolution

Since a firm can generally choose private or formal bankruptcy proceedings, the cost of bankruptcy will be the lesser of the two. The ability to choose avenues will cause many of the features in the formal proceedings to appear in the private resolutions as well.

Liquidation (Ch. 7)

Reorganization (Ch. 11)

• automatic stay
  – stops principal and interest payments to unsecured creditors
  – secured creditors lose rights to collateral, may receive “adequate protection” payments
  – effectively extends maturity of debt
  – Executory contracts can be assumed or rejected
  – reduces blocking power of debtholders and leads to renegotiation
• debtor-in-possession
  – Current management and directors typically retain control
  – Management can file reorganization plan within 120 days, extensions are common
  – incremental senior borrowing is allowed, strip seniority/collateral from existing debt
• Negotiation
  – All classes of creditors and court must approve agreement
  – “Cramdown” forces creditors to accept the plan
  – Management delays are a bargaining tool transferring wealth from debt to equity
  – Debtors have favorable bargaining power
  – power given to management/equity viewed as compensation for not exercising option to delay or shift risk

Chapter 11 has ambiguous effects on efficiency, but provides the greatest economic benefit when underinvestment is a problem.

Prepackaged Bankruptcy

A firm is allowed to simultaneously file for bankruptcy and give its plan of reorganization. This allows the firm to get the efficiency of the private restructuring, yet retain some of the benefits of the formal proceeding (e.g., the cramdown and certain tax benefits). Prepacks may also reduce the holdout problem inherent in workouts.

5.8.4 Key Papers

Ross (1977)

Ross (1977a) develops a theory incorporating managerial incentives into the capital structure decision. Insiders have private information and are compensated by a known incentive schedule. Since the manager incurs a penalty if the firm goes into bankruptcy, the amount of debt is a valid signal since it is costly for the managers, and more so for managers at lower-quality firms. This signal then influences the market’s perception of the firm’s risk, although it does not affect the actual risk. The M&M irrelevancy result holds within a risk class, but there is an optimal capital structure for each firm type.

There are several empirical predictions from this model. Cross-sectionally, the cost of capital will be unrelated to the financing decision, although the debt level is uniquely determined. Bankruptcy risk should be an increasing function of firm type and debt level. Finally, value should increase with leverage in the cross-section.


James (1995)

The paper by James (1995) attempts to understand the conditions under which a bank will take equity in a distressed firm. Bank debt is generally thought to be easier to renegotiate than public debt since coordination is easier. Banks have limited incentives to make unilateral concessions, since this will create a wealth transfer to junior claimants. Banks are more likely to take equity when bankruptcy costs are high, such as when a firm has significant growth opportunities.

James examines roughly 100 bank debt restructurings in the 1980s. In some cases the firms attempted restructuring of public debt as well. The restructurings involved either forgiving financial obligations or modifying the terms of the debt.

There are five general findings. First, whenever the bank takes equity the public debtholders (if any) also take equity. Public bondholders are much more likely to act unilaterally than are banks. Second, banks tend to make larger concessions when there is no public debt. Third, banks also tend to take relatively large equity positions and hold them for several years. Fourth, banks are more likely to take equity when the firm has a small proportion of public debt, more valuable growth opportunities, greater cashflow constraints, and poor prior operating performance. Finally, the firms in which banks take equity tend to perform better subsequently than the ones in which they do not.

Hotchkiss (1995)

The basic goal of Hotchkiss (1995) is to see if Chapter 11 bankruptcy proceedings are effective in reviving troubled companies. Results indicate that a large number of firms are not viable after the reorganization and that existing management’s role in the process is associated with continued poor performance. The latter point may mean either that the process favors management or that these distressed firms have difficulty in attracting new managers.

The paper includes an analysis of post-bankruptcy operating performance in terms of accounting profitability, deviation from cashflow projections, and subsequent distress. Many of the firms increase in size shortly after bankruptcy. The firms begin with average profitability in their industry five years prior to bankruptcy. Closer to the filing, performance deteriorates. Following confirmation of the plan performance improves somewhat, but a number of firms continue to have trouble. The cashflow forecast errors are significantly negative each year, beyond any industry effect. This result may be due to incentives to make high forecasts; the managers who remain in control tend to make overly optimistic forecasts. Finally, roughly a third of the firms file for a second restructuring within a few years.

Logit regressions provide additional evidence about firm characteristics. Large firms are more likely to emerge as public companies and are less likely to report negative operating income. There is strong evidence that retaining the pre-bankruptcy CEO is positively related to poor post-bankruptcy performance. Finally, there is some evidence that firms filing in New York are more likely to remain in distress.

Weiss (1990)

Weiss (1990) performs an examination of the direct costs of bankruptcy and violation of the absolute priority rule. He finds direct costs average about 3% of firm value (20% of equity value) the year prior to bankruptcy. The absolute priority rule is frequently violated, especially in New York. There is no evidence that these cases are resolved more quickly. Larger transactions are more likely to violate strict priority since there are more opportunities to extract concessions.

One view is that the violation of APR is to compensate equityholders for not exercising their option to delay the proceedings or pursue actions detrimental to the senior debtholders. Evidence suggests that equity markets anticipate the deviation from APR, and the junior debt incorporates a premium for APR violations.

Betker (1995)

In order to understand the effectiveness of prepackaged bankruptcies, Betker (1995) documents the costs and sources of economic gain associated with this method. The time spent in bankruptcy is much shorter, 2.5 months in a prepack versus 25 months in Chapter 11. The total time including preliminary negotiations is similar to Chapter 11 and is long relative to workouts. The direct costs are estimated to be about 3%, very similar to the results in Weiss (1990) for Chapter 11 proceedings. Indirect costs in a prepack may be lower, but it is not clear by how much. It is possible that the indirect costs would be similar to a workout. Prepacks appear to offer some tax advantages over workouts in treatment of NOLs, but not CODs.

5.9 Equity Issuance

The security issuance decision involves many of the same issues as capital structure. In addition, the issuance process creates other considerations. Seasoned Equity Offerings (SEOs) are similar to Initial Public Offerings (IPOs) in many respects. The primary difference is that there is an existing market price on which the valuations can be based. This section discusses the common elements and the specifics of SEOs. The following section addresses the issues particular to IPOs.

Smith (1986) provides a review of the theory and evidence on security issuance. There are several theories that attempt to explain the empirical evidence. There may be an optimal capital structure in which case optimizing firms should have non-negative valuation effects for capital structure changes. The issuance could serve as a signal of decreased cashflows as in Miller and Rock (1985). The degree of predictability will also influence the size of the announcement reaction. Since debt principal repayment is predictable, debt reissuances should also be more predictable and have smaller announcement effects. A similar argument can be made with high dividend yield firms such as utilities. As in Myers (1984) and Myers and Majluf (1984), when information asymmetries are large the price impacts should also be larger. Finally, changes in ownership concentration such as equity carve-outs may affect value. Table 5.3 summarizes these predictions and the related evidence.

Stylized Facts

• Retained earnings are most common source of financing
• Debt is used more than equity, net retirement of equity in 1980s
• Increased use of leverage over time
• Equity is issued relatively more frequently during expansions
• Private placements are becoming more important
• Gradual switch from rights to firm commitment
• Strong preference for firm commitment for non-equity issues
• IPOs use firm commitment (60%) or best efforts (40%)
• DRIPs and ESOPs have replaced rights offerings
• Underwritten offers are more expensive (directly), but more common


Table 5.3: Theories of Security Issuance Reactions

Theory                        Prediction                 Evidence

Optimal Capital Structure     AR > 0                     Opler and Titman (1995)

Info. Asymmetry               AR < 0, more so for        Yes: Mikkelson and Partch (1986),
Myers and Majluf (1984)       securities with high            Eckbo and Masulis (1992),
                              info. asymmetry                 Opler and Titman (1995).
                                                         No:  Helwege and Liang (1996),
                                                              Opler and Titman (1995)

Signaling                     Issue signals              Yes: Mikkelson and Partch (1986), Prabhala
Miller and Rock (1985)        lower earnings             No:  ?

Ownership Concentration       Use underwriting with      Eckbo and Masulis (1992)
Eckbo and Masulis (1992)      disperse ownership

Predictability                Smaller reaction to        Prabhala (1993),
Prabhala (1993)               predictable issues         Mikkelson and Partch (1986)


5.9.1 Flotation Methods

Since the use of an underwriter has higher direct costs than a rights offer, there must be some indirect benefits provided by the underwriter. The flotation choice can be viewed as an attempt to signal firm quality. The underwriter also acts as a monitor or certifying agent. The best firms will use standby rights offers, medium quality firms will use uninsured rights, and the worst firms will use firm commitment underwriting. The flotation choice can also be viewed as an optimal risk sharing contract in a principal-agent problem, where the issuer is the principal and the investment bank is the agent. The issuer wants to incent the banker to expend effort which is difficult to measure or observe.

Firm Commitment

In a firm commitment the investment bank assumes the risk of the offer. It essentially buys the offer from the issuer and is responsible for selling it. The process begins with an SEC filing. Next, a preliminary prospectus stating a range of offer prices and the maximum number of shares is issued. After SEC approval, the final offer price is set and a final prospectus is issued. The underwriter’s guarantee begins once the final offer price is set. Competition among underwriters has led to the “bought deal,” where an investment bank will buy an entire issuance outright. Firm commitment becomes more attractive with less asymmetric information, more risk-averse issuers, less risk-averse underwriters, less price uncertainty, or when the investment bank’s effort is more observable.

Best Efforts

In a best efforts offer the underwriter acts as a marketing agent on behalf of the issuer. The issuer bears the risk of the offering. The filing process is similar to a firm commitment offer, except there is a minimum sales level below which the offer will be withdrawn. After SEC approval, the underwriter attempts to sell the issue during a selling period. Average initial returns are higher with best efforts offerings.


Rights Offers

Current shareholders are given short-term warrants in proportion to their shareholdings. Shareholders can either exercise the warrants or sell them. The subscription price is typically 15-20% below the current market price. Sometimes rights offers use standby underwriting to guarantee the proceeds of unsubscribed shares. Rights offers in the U.S. are typically fully subscribed.
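The mechanics of the subscription discount can be illustrated with the standard theoretical ex-rights price calculation. The sketch below assumes that firm value after the offer equals the pre-offer market value plus the cash raised (no announcement effect, taxes, or transaction costs); the share price, discount, and one-for-four terms are hypothetical.

```python
def ex_rights_price(cum_price: float, sub_price: float, old_per_new: float) -> float:
    """Theoretical ex-rights price when one new share is offered per `old_per_new` old shares."""
    return (old_per_new * cum_price + sub_price) / (old_per_new + 1)


def right_value(cum_price: float, sub_price: float, old_per_new: float) -> float:
    """Value of the right attached to one old share."""
    return cum_price - ex_rights_price(cum_price, sub_price, old_per_new)


# Hypothetical offer: shares trade at 50, subscription price 40 (20% discount),
# one new share for every four old shares.
P, S, R = 50.0, 40.0, 4.0
ex_price = ex_rights_price(P, S, R)
right = right_value(P, S, R)

# A holder of four old shares either exercises or sells the rights.
wealth_before = 4 * P
wealth_if_exercise = 5 * ex_price - S          # five shares, net of the cash paid in
wealth_if_sell = 4 * ex_price + 4 * right      # four shares plus proceeds from the rights

print(ex_price, right, wealth_before, wealth_if_exercise, wealth_if_sell)
```

Under these assumptions the discount itself does not change shareholder wealth: the drop in the share price is exactly offset by the value of the right, whether it is exercised or sold.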

Indirect Issuances

Convertibles, warrants, options, DRIPs, and ESOPs are examples of indirect methods of equity issuance. Stein (1992) develops a theory for convertible issuance as “back door” equity financing. DRIPs and ESOPs have replaced rights offerings to some extent.

Shelf Registration

The issuer can pre-register for the issuance of a security over a two-year period. This can reduce the direct costs of issuance but it increases the information asymmetry problem since it is easier for managers to time their offers.

Negotiated Bid

A firm can select its investment bank through either a negotiated or competitive bid process. Negotiated bids are more common, especially among larger issues, even though they are more expensive. The main users of competitive bid offers are utilities, which are required to do so. Possible explanations for the preference for negotiated bids include side payments to managers, increased accounting-based compensation to managers, lower variability in costs, reduced agency costs, and protection of proprietary information.

5.9.2 Direct Flotation Costs

A summary of direct flotation costs is shown in Panel A of Table 5.4. Direct flotation costs are generally higher for equity issues than for other securities. They also tend to be higher for industrial companies than for utilities.


Table 5.4: Some Issuance Costs

Panel A: Direct Flotation Costs
Method               Industrial    Utility
Rights                  1.8%         0.5%
Standby Rights          4.0%         2.4%
Firm Commitment         6.1%         4.2%

Panel B: Seasoned Issue Valuation Effects
Security             Industrial    Utility
Equity                 –3.14%       –0.75%
Conv. Preferred        –1.44%       –1.38%
Preferred              –0.19%        0.08%
Conv. Debt             –2.07%
Debt                   –0.26%       –0.13%

Convertible debt offers have higher flotation costs than similar sized non-convertible offers, consistent with the hypothesis that issue costs are related to security volatility. Not surprisingly, underwriter compensation is higher in negotiated contracts than in competitively bid contracts. Underwriter compensation has decreased since the introduction of shelf registration, although this may be due to selection bias issues.

Several firm characteristics are correlated with direct flotation costs [see Smith (1986) and Eckbo and Masulis (1992)]. The models use direct flotation costs as a percentage of issue proceeds as the dependent variable. A positive intercept indicates there are fixed costs to the issuance. Measures of size indicate that the costs are a decreasing, convex function of size, indicating there are economies of scale. High shareholder concentration also lowers issuance costs (this may be due to an increased reliance on subscription precommitments). The direct costs are positively related to stock volatility. Dummy variables indicate that rights offers have the lowest direct flotation costs, and firm commitment offers the highest. These results are robust to the time period used and across industrial and utility firms.

Issuers often grant an overallotment option, allowing the underwriter to purchase additional shares if the offer is oversubscribed. This increases the underwriter’s incentive to sell the issue, reducing the risk of failure.


5.9.3 Indirect Flotation Costs

Given the lower direct costs of rights offers, but the preference for firm commitment offers, indirect expenses may be important. Managers may receive personal benefits from underwriters, or there may be pressure from investment bankers who sit on the board. Also, sales to the public are more likely to create a more disperse ownership structure, either reducing the monitoring of managers as in Shleifer and Vishny (1986) or increasing liquidity as in Merton (1987). Expected rights offer failure costs are small.

Other indirect costs may include the capital gains taxes and transaction costs to the shareholders associated with a rights offer. There may also be anti-dilution clauses and wealth transfers to convertible security holders.

5.9.4 Valuation Effects

Leverage-increasing transactions produce positive ARs, while leverage-decreasing transactions have a negative effect. There is an average negative price impact of SEO announcements of about 3%. This contrasts with no significant price impacts for the announcement of straight debt, equity sold through rights offers, or private placements. Common stock offer cancellations are also associated with positive reactions. Indirect equity issuances (e.g., convertible debt) are also associated with negative announcement reactions. Evidence on shelf registration indicates a more negative reaction, which is consistent with the increased adverse selection problem. These valuation effects are summarized in Panel B of Table 5.4.

There are several possible explanations for these valuation effects. If there is an optimal capital structure, then a change to restore the optimal level should be met with a positive reaction. The nonpositive announcement effects do not support this hypothesis, although the announcement may also convey information about the firm’s situation. This signaling effect, as in Ross (1977a), implies that leverage-decreasing events signal negative revisions in management’s expectations, and should be accompanied by a negative price reaction. Under the Miller and Rock (1985) model, any security issuance signals lower than anticipated operating cash flow and is bad news. There is some evidence indicating firms tend to issue debt following earnings declines, whereas equity issuances tend to come before an abnormal earnings decline.

The adverse selection problem in Myers and Majluf (1984) can also explain some of the valuation effects. In an extension to this framework, managers who choose the size of the offer choose larger offers when their stock is more overvalued. Other adverse selection problems arise because the underwriter is able to distribute shares in the “good” offers to preferred clients. Also, as in Rock (1986), with differentially informed investors the underpriced issues will be oversubscribed, while the overpriced ones will go to the uninformed investors. There are some (partial) solutions to these adverse selection problems. The firm can try to change managerial incentives, use private placements, maintain financial slack, use certifying institutions, use equity carveouts, or issue convertible securities.

Much of the empirical evidence is consistent with the adverse selection hypotheses. The announcements have nonpositive effects regardless of security type but larger or riskier issues have more negative reactions. Firm commitment offers have the most negative reactions, followed by standbys, then uninsured rights. These results are consistent with the model of Eckbo and Masulis (1992). Other supportive evidence includes the more negative reaction for industrial issues than for utilities, and more negative reactions to shelf registration announcements.

5.9.5 SEO Timing

There is some evidence supporting the hypothesis that equity offers will be more frequent in an expansion. The argument is that there are more profitable investment opportunities in these times, and firms are less likely to forego investment projects because of underpricing. Additional evidence indicates that the announcement effect is less negative during expansions for equity issues, while announcements of debt issuances are not affected. By contrast, the Myers (1984) pecking order hypothesis suggests firms will issue equity in economic downturns because they are less likely to have excess cash and their leverage is likely to have increased as market values of equity have fallen. The relative regularity of debt issuances raises the possibility that the announcement effect is small because the market anticipates the issuance. Jung, Kim, and Stulz (1996) and Opler and Titman (1995) provide evidence that the debt-equity choice is predictable.


5.9.6 Key Papers

Eckbo & Masulis (1992)

Eckbo and Masulis (1992) model the choice between a rights issue and an underwritten offer as an extension to Myers and Majluf (1984). In the model, shareholder takeup k is an important determinant of the flotation method. Firms using uninsured rights offers may use subscription precommitments to credibly signal a high takeup. The precommitments in rights offers and underwriter certification in firm commitment offers serve to reduce the wealth transfer between current shareholders and outsiders.

Firms with more dispersed ownership will tend to choose underwriting. Firms with less discretion over their issuance, such as a utility, will tend to use a rights offer. The model predicts that the announcement effect will be most negative for firm commitment offers, followed by standby rights and uninsured rights. This analysis can be applied to other flotation methods as well.

To analyze the determinants of direct costs, they estimate a regression with measures of size, percentage change in shares, ownership concentration, return standard deviation, and dummies for offer type. The results indicate there are significant fixed costs and economies of scale. A positive coefficient on the change in shares variable indicates there are adverse selection costs. High ownership concentration lowers direct issuance costs, perhaps through precommitments. More volatile returns are associated with higher costs since there is increased underwriting risk. After controlling for issue characteristics, rights offers are still less expensive than standby or firm commitment offers. Having documented the rights issue paradox, the authors present a model to explain it.

In the model, firms will issue if the value of the projects exceeds the direct cost and dilution from issuing undervalued securities, b − (f + c) ≥ 0, where b is the value of the projects, f is the direct cost, and c is the dilution cost. The cost c(k, m) depends on the level of existing shareholder participation. Managers select the flotation method m to maximize firm value. The market gets information about k through precommitments, trading volumes and actual subscription levels.

With full participation the dilution cost is zero. When k < 1 some undervalued firms will find it too costly to issue. In this sense k is similar to the inverse of slack. High-k firms select uninsured rights and use takeup to substitute for the underwriter guarantee. Firms with k ∈ (k_f, k_s) will choose standby rights. The lowest k firms will not bother paying the additional rights distribution costs and will just use firm commitment offers. If a firm is overvalued then high-k firms may choose either uninsured rights or they may “hide” with a firm commitment offer. If they are detected they will sell at a lower price or cancel and forgo the project. Since the market understands these strategies, the high-k firms will face the lowest adverse selection costs and the low-k firms the highest costs.
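The issue condition and the takeup cutoffs can be written as a short decision sketch. This is only an illustration of the ordering described above, not the authors' model: the function names are hypothetical, the cutoff values passed in are made up, and the treatment of a firm sitting exactly on a cutoff is an assumption (the text only specifies the open interval).

```python
def should_issue(b: float, f: float, c: float) -> bool:
    """Issue condition from the text: b - (f + c) >= 0."""
    return b - (f + c) >= 0


def choose_flotation_method(k: float, k_f: float, k_s: float) -> str:
    """Map expected shareholder takeup k to a flotation method.

    k_f and k_s are the lower and upper cutoffs; boundary cases are assigned
    to the lower-takeup method by assumption.
    """
    if not 0.0 <= k <= 1.0:
        raise ValueError("takeup k must lie in [0, 1]")
    if not k_f < k_s:
        raise ValueError("cutoffs must satisfy k_f < k_s")
    if k > k_s:
        return "uninsured rights"     # high takeup substitutes for the guarantee
    if k > k_f:
        return "standby rights"       # intermediate takeup
    return "firm commitment"          # low takeup: skip rights distribution costs


# Hypothetical cutoffs and firms.
K_F, K_S = 0.3, 0.7
for takeup in (0.1, 0.5, 0.9):
    print(takeup, choose_flotation_method(takeup, K_F, K_S))

print(should_issue(b=10.0, f=4.0, c=3.0))   # project value covers both costs
print(should_issue(b=10.0, f=4.0, c=8.0))   # dilution cost too high
```

In the model the dilution cost c itself depends on k and the method m; here it is simply supplied by the caller to keep the sketch short.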

The authors test their model using an event study methodology. Consistent with prior literature, the negative market reaction is strongest for firm commitment offers and weakest for uninsured rights. After adjusting for flotation costs, either type of rights issue has a negligible effect. Reactions are less negative for utilities, consistent with smaller adverse selection. Firm commitment offers are generally associated with stock price runups, whereas the runups for standby and uninsured rights are smaller and negligible, respectively.

Mikkelson & Partch (1986)

In a study similar to Masulis (1980), Mikkelson and Partch (1986) reexamine the effect of announcements of capital structure changes on stock price to better understand the determinants. They find a significant negative announcement effect for stock and convertible debt, and a less pronounced effect for debt. Completed offerings have positive returns between announcement and issuance, and a negative return at the issuance, indicating that firms time their security issuance. Similarly, firms that refinance have more negative reactions than those who raise funds for capital expenditures. In general, the results are consistent with the predictions of Myers and Majluf (1984) and the notion of a pecking order.

The paper uses an event study methodology to measure excess returns. The estimation window for α and β is the 140 days beginning 21 days after issuance or cancellation. Throughout their sample the number of announcements varies considerably across time. External financing is not a common event for many firms, consistent with the pecking order hypothesis. Equity offers tend to finance new assets. Among public offerings for cash, stock has the most negative abnormal return at about –4% and straight debt the least negative reaction. These results are consistent with the predictions in both Myers and Majluf (1984) and Miller and Rock (1985), although the latter paper does not distinguish between security types.
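A minimal sketch of the excess-return calculation, assuming a standard market model R_it = α + β R_mt + ε_it fitted by OLS on the estimation window. The function names and the return series are hypothetical, and the window is whatever data the caller passes in (in the paper it is the post-event window described above).

```python
from statistics import mean


def fit_market_model(stock_returns, market_returns):
    """OLS estimates (alpha, beta) of the market model on the estimation window."""
    mx, my = mean(market_returns), mean(stock_returns)
    sxx = sum((m - mx) ** 2 for m in market_returns)
    sxy = sum((m - mx) * (r - my) for m, r in zip(market_returns, stock_returns))
    beta = sxy / sxx
    alpha = my - beta * mx
    return alpha, beta


def abnormal_returns(stock_returns, market_returns, alpha, beta):
    """Abnormal return = actual return minus the market-model prediction."""
    return [r - (alpha + beta * m) for r, m in zip(stock_returns, market_returns)]


# Hypothetical estimation-window data (five days only, to keep the example short).
market_est = [0.010, -0.020, 0.015, 0.000, -0.005]
stock_est = [0.001 + 1.5 * m for m in market_est]
alpha_hat, beta_hat = fit_market_model(stock_est, market_est)

# Hypothetical two-day announcement window.
market_evt = [0.010, 0.002]
stock_evt = [-0.024, 0.004]
ars = abnormal_returns(stock_evt, market_evt, alpha_hat, beta_hat)
car = sum(ars)   # cumulative abnormal return over the window
print(alpha_hat, beta_hat, ars, car)
```

Averaging the abnormal returns across firms for each event day gives the average abnormal returns that the event studies in this section report.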


Table 5.5: Price Reactions by Issuance Type

Issuance            Pre-AD    AD    AD-ID    ID
All Equity             +       –
All Debt               –       0
Completed Equity                       +      –
Cancelled Equity                       –      +
Completed Debt                         0      0

The abnormal returns surrounding these events provide evidence that managers time security issuance. Prior to the announcement, equity offers tend to have runups, while debt offers tend to have declines. At the announcement, the equity offers have the most negative returns and the debt offers the least negative. For completed offers, equity again has a runup between the announcement and issuance, while debt is essentially flat. At the issuance there is another negative effect for equity offers and a neutral reaction to debt issuances. In general, convertible debt and preferred stock fall in between the debt and equity effects, although small sample sizes make interpretation more tenuous. A more direct analysis of cancelled and completed offers confirms that the cancelled offers have declines between the announcement and cancellation, while completed offers increase in price between announcement and issuance. Further, at the cancellation there is a positive return versus a negative return at the issuance. Note that these patterns are ex post, and it is not likely that there are any profitable trading strategies. An effort to determine whether debt ratings make a difference is inconclusive due to small sample sizes, but announcements of bank credit lines are associated with positive abnormal returns.

Opler & Titman (1995)

If there is an optimal capital structure then firms experiencing an equity price runup should issue debt to move back towards the optimum. Evidence that firms issue equity after a runup seems to indicate the opposite. Opler and Titman (1995) address this issue by testing whether deviations from an estimate of the optimal debt ratio are useful in predicting whether the firm issues debt. The general results indicate that firms do move towards a target debt ratio. A puzzling finding is that the security choices of the firms least subject to information asymmetry are the most sensitive to recent returns.

There are several possible explanations of equity issuance after run-ups. The optimal capital structure could simply change over time. If firms whose growth opportunities improve have price runups, these firms should desire relatively more equity financing. An agency theory explanation is that additional debt constrains a manager’s ability to grow and raises the probability of default and firing. The observed behavior is also consistent with the Myers and Majluf (1984) model where the firms with overvalued securities issue. A behavioral explanation is that managers want to avoid dilution as a rational response to an irrational market.

The analysis is performed in two stages. In the first stage debt ratios are regressed on proxies for growth opportunities and size to get predicted debt levels. Deviations from the predicted level and control variables are then used to predict the probability of debt issuance in a second stage. The second stage regressions are further stratified by size, dividend policy, and utility status.

Their findings do not fully support any of the proposed theories. Partial support comes from several observations. Profitable firms issue debt or repurchase shares to offset the accumulation of retained earnings. The larger issuances tend to involve equity, perhaps in response to the higher fixed costs. Stock return and M/B are good predictors of equity issuance. The results on convertible debt generally fall between debt and equity. Firms that issue short-term debt are less profitable than equity issuers, whereas long-term debt issuers are more profitable.

The results from the stratified regressions are less supportive of the theories. Utilities, firms that pay dividends, and firms followed by more analysts are more sensitive to recent returns in their security choice. Small firms are less sensitive to price runups. In the more active market for corporate control in the mid-80s, there is no evidence that managers are less willing to issue equity following a stock price decline.

Helwege & Liang (1996)

Helwege and Liang (1996) test the pecking order theory using a sample of IPOs. The basic design is to identify a cohort of firms going public in 1983 and follow their financing choices through time. The general finding is that there is little support for the pecking order. The probability of external financing is unrelated to internal cash shortages and financing patterns indicate an “overuse” of equity.

The study starts with 367 firms. Over the next decade roughly equal numbers go bankrupt, are acquired, or survive. The firms tend to have losses early in their lives and then show increases in profitability. Dividends are rarely paid and there is a tendency to rely on internal funds over time. Firms seem to choose private debt, then equity, then public debt. Large firms tend to issue debt, whereas small firms or low growth firms use more private debt. Coefficient estimates on default and asymmetric information variables are mostly inconsistent with the pecking order. Riskier firms tend to issue more equity.

Jung, Kim & Stulz (1996)

Jung, Kim, and Stulz (1996) perform a test comparing the issuance decision, market reaction, and subsequent actions predicted by the pecking order [Myers (1984)], agency [a special case of Myers and Majluf (1984)], and issuance timing [Loughran and Ritter (1995)] theories. The findings are consistent with agency theory, partially support the pecking order, and do not support issuance timing.

Under agency theory the manager will issue equity when the firm’s shares are overvalued to maximize current shareholder value (assuming the firm cannot issue riskless debt). This setup is a special case of Myers and Majluf (1984) where there is no information asymmetry about the assets in place but the manager has incentives to issue equity to take negative NPV projects that are privately valuable. The agency cost of outside equity arises because of managerial discretion. Issuing debt instead reduces the manager’s discretion, but gives rise to the underinvestment problem of Myers (1977) since gains go to the bondholders first. Thus, high growth firms will have less leverage to avoid forgoing good projects. The pecking order says firms should issue debt instead of equity whenever possible. The timing model predicts firms will issue equity when overvalued, so subsequent returns should be lower.

The analysis is based on a sample of debt and equity issues between 1977 and 1984. Equity issuers tend to be smaller, riskier, growth oriented firms. The security issue choice is estimated with a logistic regression. High M/B firms, leading indicators, and recent returns are positively related to the choice of equity. Firms with high taxes are less likely to issue equity. Firms that are predicted to issue debt but instead issue equity appear to overinvest.


Overall, abnormal returns are negative for equity issues and insignificant for debt issues. For equity issues with high prior excess returns the announcement abnormal return is positive. The correlation between the firm type predicted in the logit regression and abnormal returns is positive for equity issues and negative for debt issues. This is evidence supportive of the agency theory but not the pecking order theory.

The general results indicate that firms issuing equity tend to either have valuable growth opportunities or lack valuable growth opportunities but have excess debt capacity. The firms lacking valuable growth opportunities have more negative stock price reactions to the announcement of equity issuance. Other evidence indicates that some firms issue equity to benefit the managers rather than the shareholders.

5.10 Initial Public Offerings

This section discusses the features that distinguish IPOs from seasoned issuances. The primary difference is that valuing an IPO is more difficult than valuing an existing public firm. Essentially all the problems with SEOs remain with additional complications. The potential benefits of going public include (i) diversification, (ii) liquidity, and (iii) more capital to take good projects. The costs include (i) information collection/disclosure, (ii) legal and auditing costs, (iii) underwriting and one-time direct issuance costs, (iv) management time and effort, and (v) dilution. Much of the general discussion in this section comes from Ibbotson and Ritter (1995).

IPOs are a stage in the life cycle of a firm. Initially, firms will be self-financed since the capital requirements are the smallest and the information asymmetry problems the largest. The next step is often financing by friends, relatives, and associates. Personal relationships serve to align the interests of the manager and the investors. Next come non-affiliated sources of private capital such as bank financing and venture capital. These owners typically require a large amount of information disclosure and are often active investors. After exhausting available private financing a firm will go public.

IPOs are characterized by information asymmetry problems. In particular, there are adverse selection problems since the owners self-select into going public, and moral hazard problems since the manager/owners affect the value of the firm. There are several mechanisms to deal with the informational asymmetries. By holding a sizeable portion of the firm the manager provides a signal to outsiders. Managers may agree to a “lock up” period, during which they will not sell their shares. A manager could also take a small fixed salary in exchange for a contingent compensation scheme. Firms may hire certifying agents (see footnote 4) who have credibility arising from their desire to protect their reputation capital. There is evidence supporting the role of certifying agents [Booth and Chua (1996)].

Table 5.6: Theories of IPO Underpricing

Theory             Prediction                               Evidence
Winner’s curse     riskier issues, greater underpricing     some
Costly info. acq.  upward revisions more underpriced        yes
Cascades           underprice to guarantee success
I-banker power     no underpricing of IB IPOs               no
Lawsuits           underprice to avoid lawsuits             mixed
Signaling          underprice IPO for successful SEO        no
Regulatory         utilities less underpriced
Wealth redist.     bribe                                    limited
Stabilization      underwriter support generates return     no
Ownership disp.    underpricing creates diverse ownership
Market incompl.    compensation for risk-bearing            some

5.10.1 IPO Anomalies

There are three puzzling observations with respect to IPOs. The issues are significantly underpriced from their secondary market values, yet over the longer term IPOs tend to underperform. There are also cycles in the extent of underpricing.

New Issue Underpricing

Initial returns are skewed, with a positive mean and a median near zero. Smaller offerings are more underpriced, so equally-weighted returns overstate the degree of underpricing. The underpricing effect is present internationally.

Underpricing can be viewed as the solution to a moral hazard problem between the issuer and the underwriter. The degree of underpricing will increase with demand uncertainty.

Rock (1986) argues there will be a winner’s curse due to an adverse selection problem arising from an information asymmetry between informed and uninformed investors. In this model the market price and underwriter offers are jointly determined in equilibrium and the selling mechanism is exogenous. The banker sets an optimal offer and the market reacts. The informed agents will only buy IPOs if they are good deals, in which case the issues will be oversubscribed and the uninformed will not be able to get their desired amount. In the case where the IPO is overpriced, the informed will pass, leaving the full amount to the uninformed. The uninformed will realize this and require a discount on all issues in order to buy any of the IPOs. An implication is that underpricing should be greater for riskier issues. Koh and Walter (1989) provide direct evidence in support of the model.
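A small numerical sketch may help fix the winner’s curse logic. The numbers below are purely hypothetical (they are not from Rock or from Koh and Walter), and `allocation_weighted_return` is an illustrative name. The idea is that an uninformed investor who applies to every issue is rationed on the good issues and filled in full on the bad ones, so the realized return is weighted toward the overpriced issues.

```python
def allocation_weighted_return(issues):
    """issues: list of (probability, initial_return, allocation_fraction).

    Returns the expected initial return per dollar actually allocated.
    """
    gain = sum(p * r * a for p, r, a in issues)
    invested = sum(p * a for p, r, a in issues)
    return gain / invested


# Hypothetical market: half the issues are underpriced (+20%) but rationed
# to one third of the amount applied for; half are overpriced (-10%) and
# filled in full because the informed investors stay away.
issues = [(0.5, 0.20, 1 / 3), (0.5, -0.10, 1.0)]

unconditional = sum(p * r for p, r, _ in issues)   # average initial return
uninformed = allocation_weighted_return(issues)    # return actually earned
print(f"average initial return:   {unconditional:+.3f}")
print(f"uninformed investor earns: {uninformed:+.3f}")
```

Under these assumed numbers the average issue is underpriced by 5%, yet the uninformed investor loses 2.5% per dollar allocated, which is why a discount on all issues is needed to keep the uninformed in the market.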

Welch (1992) presents a model of information cascades where agents’ decisions are influenced by the actions of other agents. Firms will underprice to get the first few investors to participate, starting a cascade. Booth and Chua (1996) argue that shares are more valuable to investors when they are liquid. Providing a more dispersed ownership structure will increase the liquidity of the shares. Shares are underpriced to compensate a broad investor base for costly information acquisition.

4. See James (1995) or Smith (1986).

Long-Run Underperformance

There is significant evidence that IPOs perform poorly after the initial large returns. The magnitude of this underperformance is on the order of a –15% CAR over the following three years. This type of underperformance is also present in closed-end funds and REITs.

There is some evidence supporting each of the following theories of underperformance. The divergence of opinion argument of Miller (1977b) is that the buyers of IPOs are the most optimistic. With greater uncertainty, the difference between the optimistic and the pessimistic is larger. As time goes on, information will be revealed that will cause the difference of opinion to converge, and therefore the price will drop. There is some survey evidence supporting this theory. The impresario hypothesis suggests that investment bankers underprice initially to create the appearance of excess demand. Under the windows of opportunity hypothesis there is a sort of dynamic pecking order theory where firms will issue equity when it is overvalued in general.

Cycles

Cycles in both volume and underpricing are well documented but hard to explain as rational. One explanation is changing risk composition, meaning more risky offerings are underpriced more, and there may be a clustering of IPOs by similar firms. There is some evidence in this direction, but it is not entirely convincing. A second explanation is “positive feedback” strategies, where investors buy IPOs expecting positive autocorrelation. If enough investors do this, the autocorrelation becomes a self-fulfilling prophecy (this is basically the “greater fool” theory). This effect may be difficult to stop with arbitrage since it is difficult to short-sell the IPO [Rajan and Servaes (????)].


5.10.2 Key Papers

Welch (1992)

In his “cascades” model Welch (1992) provides an explanation of IPO underpricing. Investors are sequentially asked if they want part of the IPO. Each investor gets a signal of the value, but the investors cannot observe each other’s signals. Investors are able to observe the actions of those that went before them. Because investors use the decisions of others to update their own beliefs, information cascades can occur where individuals disregard their own information and follow the masses. Issuers may underprice to ensure the first few investors accept and ensure success of the offer. Cascades are not necessarily bad for an issuer. The issuer is at less of an informational disadvantage since the individuals are unable to aggregate their information. The underwriter seeks to distribute the offer widely to make investor communication more difficult. With inside information, a high offer price increases the probability of offer failure and more so for lower quality issuers, creating a separating equilibrium.

The model uses an economy of rational, risk-neutral investors. The unknown true price of the firm is $V$, and the issuer has a reservation price $V^P \le V^L < V^H$. All participants have a prior $V \sim U[V^L, V^H]$. The price can be expressed as $p = \theta V^H + (1 - \theta)V^L$ where $\theta \in [0, 1]$ indexes the firm type. Each investor gets a signal $s_i \in \{H, L\}$ with $\mathrm{Prob}[s_i = H] = \theta$.

With perfect communication all successful offers are underpriced. Observed ex post underpricing is strictly increasing in uncertainty. If communication can only go from early to late investors things change slightly. Issuer proceeds are path-dependent, but in a large economy the perfect information results obtain. When only decisions, and not information, are observable, things change dramatically. Once an investor M with an H signal decides not to invest, no subsequent investors will invest. Similarly, once M with an L signal invests, all that follow him will also invest. This says that as the game goes on, individual signals get less weight relative to the information from previous decisions. As soon as one person goes against their information, anyone after him would place even less weight on that information.
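The decisions-only case can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not taken from the paper: values are normalized so that the firm is worth θ, the prior on θ is uniform on [0, 1], an indifferent investor invests, and the function name `simulate_offering` is hypothetical. Each investor forms the posterior mean of θ from the signals that can be inferred from earlier decisions plus his own signal, and invests when that mean is at least the offer price.

```python
import random
from fractions import Fraction


def simulate_offering(price, theta, n_investors, seed=0):
    """Return the list of invest (True) / abstain (False) decisions."""
    rng = random.Random(seed)
    n_pub = 0  # number of past signals that can be inferred from decisions
    k_pub = 0  # how many of those inferred signals were H
    decisions = []
    for _ in range(n_investors):
        signal_high = rng.random() < theta
        # Posterior mean of theta (uniform prior) after n_pub + 1 signals
        mean_if_high = Fraction(k_pub + 2, n_pub + 3)
        mean_if_low = Fraction(k_pub + 1, n_pub + 3)
        if mean_if_low >= price:
            invest = True            # invests even with L: up-cascade
        elif mean_if_high < price:
            invest = False           # abstains even with H: down-cascade
        else:
            invest = signal_high     # decision reveals the private signal
            n_pub += 1
            k_pub += int(signal_high)
        decisions.append(invest)
    return decisions


low_price = simulate_offering(Fraction(1, 3), theta=0.5, n_investors=20)
high_price = simulate_offering(Fraction(7, 10), theta=0.5, n_investors=20)
mid_price = simulate_offering(Fraction(1, 2), theta=0.5, n_investors=20)
print(sum(low_price), sum(high_price), sum(mid_price))
```

Once a decision stops revealing the investor’s signal, the public information never changes again, so every later investor makes the same choice. That is the cascade: at a sufficiently low price it starts with the first investor, and at a sufficiently high price nobody ever invests.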

With cascades and an infinite number of investors the probability of failure is zero for P ≤ 1/3 and one for P ≥ 2/3. All prices in between have an uncertain outcome. An uninformed risk-neutral issuer will optimally choose P = 1/3 and the offer always succeeds. Since everyone chooses this price and the average price is 1/2, the expected IPO underpricing is 50%. Under these conditions, the issuer has higher expected proceeds with cascades than with path dependency or perfect communication.
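The 50% figure can be checked directly, assuming underpricing is measured as the expected initial return relative to the offer price, with the offer at $P = 1/3$ and an average aftermarket value of $1/2$ as stated above:

```latex
\[
\text{expected underpricing}
  = \frac{E[V] - P}{P}
  = \frac{1/2 - 1/3}{1/3}
  = \frac{1}{2}
  = 50\%.
\]
```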

What if the issuer can modify the price based on past sales? Issuers with sufficiently high risk aversion may prefer to start an immediate cascade to path dependency with the option to change the price later.

The model has a number of implications. First, when distribution is less fragmented (a local issue) the issuer will underprice more. For these issues the offer price decreases with the issuer’s risk aversion and capital requirements. Issuers have an incentive to prevent communications to preserve the cascade. Welch argues that the winner’s curse in Rock (1986) is not important, since success of the offer is a foregone conclusion by the time the “marginal” investor is approached.

To add another element of realism, issuers are given inside information about the firm type which is correlated with the outside signals. This makes it relatively less expensive for a high quality firm to raise the price than for a low quality firm, creating a separating equilibrium.

Loughran & Ritter (1995)

Loughran and Ritter (1995) attempt to understand the long run underperformance of new issues. They find that only a small part of the underperformance is explained by B/M effects. The degree of underperformance varies through time.

The authors calculate three- and five-year returns for a large sample of IPOs and SEOs. The issue date return is not included in the calculations. Returns are also calculated for a sample of non-issuing matching firms. Wealth indices of issuers’ returns relative to matching firms are calculated for the two holding periods by cohort year. In almost all cases these indices are less than one and deteriorate from the three year measure to the five year measure. For SEOs the issuing firms had extremely high (72% on average) returns in the year prior to issuance. This underperformance is not due to mean reversion as in ?. Separating extreme winners into issuers and non-issuers, the issuers underperform over the next five years while the non-issuers beat the market.
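A wealth index of this kind can be sketched as the ratio of average gross holding-period returns. The exact construction used in the paper is not spelled out in these notes, so the definition below is an assumption, the return figures are invented, and `wealth_relative` is an illustrative name.

```python
def wealth_relative(issuer_returns, match_returns):
    """Ratio of average gross holding-period returns: issuers / matching firms.

    A value below one means issuers ended the holding period with less
    wealth per dollar invested than their non-issuing matches.
    """
    avg_issuer = sum(issuer_returns) / len(issuer_returns)
    avg_match = sum(match_returns) / len(match_returns)
    return (1 + avg_issuer) / (1 + avg_match)


# Hypothetical five-year holding-period returns for one cohort year
issuers = [0.10, -0.20, 0.40]   # average +10%
matches = [0.50, 0.30, 0.70]    # average +50%
wr = wealth_relative(issuers, matches)
print(f"wealth relative = {wr:.3f}")
```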

Since many researchers have documented a relation between the cross-section of returns and B/M, the paper tests for this effect. Although size and B/M are significant, a dummy variable for new issues is significantly negative, especially in periods following heavy volume. Using a Fama-French three factor model, intercept estimates for issuers are significantly less than for non-issuers. Also, the issuers have higher betas, which is inconsistent with the lower returns from the previous analysis.

Koh & Walter (1989)

Koh and Walter (1989) provide a direct test of the Rock (1986) model of IPO underpricing as a response to the winner’s curse. This test is unique in that it uses data from Singapore where rationing of oversubscribed offers is done in a special lottery. In this market all applicants for a given size of the issuance have an equal chance of winning. The tests confirm the implications of the Rock model that there is a winner’s curse and that uninformed investors earn a return similar to the riskless rate.

The authors use simulations to generate the returns to different bidding strategies. Assuming no rationing occurs, underpricing is large even after transactions costs. Examination of the probability of an allocation indicates that small investors are more likely to get an allocation. More importantly, investors are nearly three times as likely to get an overpriced issue as an underpriced one. Average returns incorporating costs and the probability of allocation are approximately zero, consistent with the Rock model. Also consistent are the correlations between proportions applied for or allocated to and initial returns. The small investors apply for and get a larger proportion of the issue when the issue is more fairly priced, whereas the larger investors apply for and get more when the issue is underpriced. Both small and large investors’ demands increase for underpriced issues, but the large investors are much more responsive.

Booth & Chua (1996)

Booth and Chua (1996) explain IPO underpricing as an attempt to generate ownership dispersion and enhance liquidity in the secondary market, giving the model a flavor of Merton (1987). In the model, informed investors are more likely to participate in secondary market trading. The underpricing is set to compensate investors for information acquisition. They find that underpricing is positively related to information costs. Investment banker prestige is negatively related to underpricing in firm commitment offers and unrelated in best efforts. Finally, the clustering of issues seems to lower information costs and underpricing.


The model has a number of empirical predictions. Underpricing should be negatively related to the probability of receiving an allocation. The costs of achieving ownership dispersion and liquidity should be higher for best efforts offers since they tend to be smaller. Best efforts offers have a higher probability of failure so they should be more underpriced. Finally, best efforts offers should benefit most from clustering.

Initial returns are regressed on firm size, offer price, IPO activity in the market and industry, underwriter rank, and interactions with these variables and dummies for offer type. The results indicate that more reputable underwriters underprice less in firm commitment offers, and that clustering is important, especially for best efforts offers.

5.11 Executive Compensation

Murphy (1985)

Murphy (1985) reexamines the relation between firm performance and executive compensation. This study focuses on individual executives over time and includes important explanatory variables as well as indirect forms of compensation which prior research has ignored. Murphy finds that executive compensation is strongly positively related to firm performance.

The paper attempts to avoid errors in variables problems associated with omitting factors such as entrepreneurial ability, managerial responsibility, firm size and past performance. If these factors are constant over time, then time series regressions for individual executives can correctly assess the sensitivity of pay to performance. The components of compensation under consideration include: salary, bonus, salary + bonus, deferred compensation, stock options, and total. Compensation is purged of any direct relation to the firm’s stock price, and compensation over time is re-expressed in 1983 dollars.

The analysis is conducted in two parts. Time series regressions relate annual compensation for each executive to measures of performance for each firm-year and dummy variables to control for the executive’s position. The measures of performance include combinations of the stock return and sales growth. There is an intercept for each individual to capture any other important variables which are constant over time. Cross-sectional regressions use average compensation (over time) and average performance.
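The role of the individual intercepts can be sketched with a toy panel. Everything here is hypothetical: the data are invented, the single performance regressor stands in for the paper’s richer specification, and `within_slope` is an illustrative name. The sketch removes each executive’s own means before estimating the slope, which is equivalent to giving each executive a separate intercept.

```python
from collections import defaultdict


def within_slope(panel):
    """panel: list of (executive_id, performance, compensation).

    Estimates the pay-performance slope after removing each executive's
    own means (one intercept per executive).
    """
    groups = defaultdict(list)
    for exec_id, x, y in panel:
        groups[exec_id].append((x, y))
    sxy = sxx = 0.0
    for obs in groups.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        sxy += sum((x - mx) * (y - my) for x, y in obs)
        sxx += sum((x - mx) ** 2 for x, _ in obs)
    return sxy / sxx


# Two hypothetical executives with the same slope (0.2) but very different
# levels of pay, for reasons that are constant over time.
panel = [("A", x, 5.0 + 0.2 * x) for x in (-0.1, 0.0, 0.1)]
panel += [("B", x, 1.0 + 0.2 * x) for x in (0.3, 0.4, 0.5)]

fixed_effects = within_slope(panel)
# Ignoring the individual intercepts: treat everyone as a single group
pooled = within_slope([("all", x, y) for _, x, y in panel])
print(f"within slope = {fixed_effects:.3f}, pooled slope = {pooled:.3f}")
```

In this made-up panel the pooled slope has the wrong sign because the highly paid executive happens to have the lower returns, while the within estimate recovers the common slope.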


The results indicate that executive compensation changes by about 20% of the firm’s returns. The ranking of sensitivity is CEO, President, Chairman, and Vice President. The bonus is most sensitive to performance and the option compensation is negatively related to performance. The cross-sectional regression without sales growth gives the wrong signs, evidence of mis-specification. Adding the sales growth to the regression reduces the position-specific sensitivity in the time series regression and reverses the signs in the cross-sectional regression. Including performance interacted with position gives positive coefficients but the hierarchical ordering fails. In particular, the results indicate that vice presidents have the most sensitive compensation. Using relative performance provides evidence that salaries are positively related to raw returns and negatively related to relative returns, while bonuses are unrelated to raw returns and positively related to relative returns.

Jensen & Murphy (1990)

Jensen and Murphy (1990) examine pay for top executives to see if they are compensated in a way that will reduce agency costs by aligning incentives. They find that during the mid-70s to mid-80s, executive pay is not particularly sensitive to performance, and most of the sensitivity comes from stock ownership.

The methodology regresses different measures of compensation change on changes in shareholder wealth to get a sensitivity estimate. They find that a $1000 change in value causes only a $3.25 change in CEO wealth, and $2.50 of this comes from stock ownership. The sensitivity is greater in small firms, $8.05 versus $1.85. Bonuses are generally very stable and do not seem to reflect changes in performance. Real CEO stock holdings and the level of pay have fallen over time, suggesting that political pressures have constrained the ability to offer pay for performance contracts. The variability of CEO pay has also fallen so that it is no more variable now than general labor, although executives are less likely to receive pay cuts and more likely to receive large raises. Pay seems to be tied to accounting measures rather than market or individual performance. Dismissals do not seem to be an important incentive, mainly because they are rare. Sensitivity does not seem to reflect the manager’s level of stockholdings. Sensitivity has decreased over time.
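The sensitivity estimate is simply the slope from regressing the change in CEO wealth on the change in shareholder wealth, rescaled to dollars per $1000. A minimal sketch follows, with invented data constructed to have a slope of 0.00325; the helper `ols_slope` is an illustrative name and the numbers are not the paper’s data.

```python
def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical annual changes, both measured in $ millions
delta_shareholder_wealth = [-500.0, -100.0, 50.0, 200.0, 800.0]
delta_ceo_wealth = [0.1 + 0.00325 * v for v in delta_shareholder_wealth]

b = ols_slope(delta_shareholder_wealth, delta_ceo_wealth)
dollars_per_1000 = 1000 * b
print(f"CEO wealth changes by ${dollars_per_1000:.2f} per $1000 of shareholder wealth")
```

Read the same way, the $2.50 attributed to stock ownership corresponds to the CEO holding 0.25% of the firm’s equity, since 2.50 / 1000 = 0.0025.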

These results are inconsistent with formal agency models of optimal contracts. Alternative explanations are that CEOs are unimportant inputs in the production process, actions are easily monitored/evaluated, or there are political or social pressures that “cap” compensation. Of these, only the latter is reasonable. Perhaps managerial risk aversion requires even higher compensation for subjecting managers to performance risk. This is hard to rationalize as the sole reason since the amount of wealth at risk is a relatively small portion of total CEO wealth. Highly sensitive contracts may not be feasible since executives cannot credibly commit to paying large amounts in the event of poor performance. It may also be the case that there are non-pecuniary benefits such as power and prestige that do provide the right incentives. However, these factors may motivate the manager to be a good citizen rather than maximize shareholder value.

A weakness of the study is that firm value changes may not be good measures of the CEO’s performance. For example, flat performance during a recession may in fact be good. Tests indicate that relative performance is not important, however. There is also an endogeneity problem.

Yermack (1995)

Yermack (1995) tests nine theories of why companies award executives stock options. The main idea being tested is whether firms with high agency costs increase pay for performance sensitivity with stock options. The primary findings support few of the theories. There is evidence that regulated firms are less likely to use options, while firms with noisy accounting earnings or liquidity constraints will use options more.

The analysis uses two possible dependent variables, the option delta times the fraction of ownership or the value of option compensation relative to salary and bonus. The first is a “flow” measure while the latter is a “stock” measure. Measures of option values are based on the Black-Scholes model and include only new awards. A tobit regression incorporates individual firm effects and accounts for the large number of variables with values of zero. The predictions and results are in Table 5.7.
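The first dependent variable can be sketched as a Black-Scholes delta scaled by the share of the firm that the new award represents. The notes do not give the exact formula, so the dividend-adjusted delta, the parameter values, and the names `call_delta` and `option_sensitivity` below are all assumptions made for illustration.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def call_delta(S, K, r, q, sigma, T):
    """Black-Scholes delta of a European call with continuous dividend yield q."""
    d1 = (log(S / K) + (r - q + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    return exp(-q * T) * norm_cdf(d1)


def option_sensitivity(delta, options_granted, shares_outstanding):
    """Change in the award's value per $1 change in shareholder wealth."""
    return delta * options_granted / shares_outstanding


# Hypothetical at-the-money ten-year award
delta = call_delta(S=50.0, K=50.0, r=0.05, q=0.0, sigma=0.30, T=10.0)
sens = option_sensitivity(delta, options_granted=100_000,
                          shares_outstanding=50_000_000)
print(f"delta = {delta:.4f}; ${1000 * sens:.2f} per $1000 of shareholder wealth")
```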

Sloan (1993)

Sloan (1993) examines the incremental role of accounting figures in determining CEO compensation. The logic follows Fama (1980); accounting earnings do not subject the risk-averse manager to uncontrollable market noise.


Table 5.7: Predictions and Results in Yermack

Theory                      Prediction   Finding
Incentive Alignment             –
Horizon Problems                +
Growth Oppty’s                  +           –
Accounting Noise                +           +
Agency Costs of Debt/FCF        –
Regulation                      –           –
Liquidity constraints           +           +
Tax loss CF                     +
Earnings Management             –

There will be a greater reliance on earnings when: (i) the firm’s stock returns are highly correlated with market noise, (ii) earnings are highly correlated with firm specific signals in returns, or (iii) earnings are less correlated with market wide noise. Thus, accounting earnings are used as an instrumental variable in a sense. Ideally, pay would be a function of actions, but these are not easily observable. Instead, price can be used as a determinant of compensation, but price is a noisy measure. The weights placed on price and earnings reflect the tradeoff between incentive alignment and risk-sharing.

There are two important variables in the analysis, the ratio of variance in market wide noise to variance of earnings noise and the correlation between these sources of noise. Both variables are interacted with accounting performance and stock performance. When the ratio of noise variances is large, compensation should be based more on the accounting earnings and less on the returns performance. When the sources of noise are positively correlated the firm will base compensation less on accounting measures and more on stock returns. The results indicate that the variance of noise in returns is less than the variance of noise in earnings. The correlation between market wide noise and earnings noise is close to zero.

Sloan finds support for the three hypotheses tested in this paper. First, earnings measures shield executives from market noise. Second, CEO compensation is more sensitive to earnings performance when the returns are noisy relative to earnings. Finally, firms place more emphasis on earnings when the correlation between noise in stock returns and earnings is closer to negative one.

Bizjak, Brickley & Coles (1993)

Bizjak, Brickley, and Coles (1993) explain why firms use multi-year compensation contracts and show that it is not always optimal to tie compensation to current performance when there are information asymmetries. This contrasts with the rule of maximizing current stock price in a world of perfect markets and homogeneous expectations [Fama & Miller (1972)]. The basic intuition is that when compensation is based only on current performance there are incentives to maximize the current stock price at the expense of long-run performance, either by under- or overinvesting. Supportive evidence shows that high growth firms use longer contracts. There is no relation between either CEO starting age or tenure and growth opportunities. A surprising result is that the sensitivities of salary/bonus and total compensation to stock performance are lower in high growth firms.

In the model managers use the observable investment decisions to manipulate the market’s inference about the firm. The incentive to do so is strongest when the manager is likely to leave the firm before the market fully learns the firm’s type. The compensation plan is then structured to balance the emphasis on current versus future stock price.

To test the theory empirically the authors use M/B and R&D as proxies for informational asymmetries with control variables for size and regulated industries. The main analysis uses the ratio of salary and bonus incentives to total compensation incentives (see footnote 5). Additional regressions use these variables in isolation. The results indicate that firms with high information asymmetries pay a lower proportion of compensation in the form of salary and bonus. Large firms, regulated firms, and high growth firms have total compensation and salary/bonus that are less sensitive to changes in shareholder wealth.

Smith & Watts (1992)

Smith and Watts (1992) test a variety of theories regarding decisions about financing, dividend, and compensation policies. The evidence suggests that contracting theories are more important in explaining cross-sectional variation in these policies than either tax-based or signaling theories.

5. These are actually the change in each per $1000 change in shareholder wealth, as in Jensen and Murphy (1990).


Table 5.8: Predictions and Results of Smith & Watts

Dependent    Independent Variable
Variable     A/V      Reg.     Size     Ret.
E/V          –        –        –
D/P          +        +        ? (+)
Comp.        –        + (–)    +        +
Bonus        ? (+)    –        +
Option       –        –        +

Symbols shown are predictions. Actual results that are significantly different are in parentheses.

The study considers four endogenous policy variables: E/V for financing, D/P for dividends, CEO salary for compensation, and frequency of option/bonus plans for incentive compensation. Independent variables include book assets to value for the investment opportunity set, size, accounting return, and a dummy for regulated industries. The data are on the industry level.

The results indicate that firms with more growth options have lower leverage, lower dividend yields, higher compensation, and more frequent usage of stock option plans. Regulated firms have higher leverage, higher dividend yields, lower compensation, and less frequent usage of stock option/bonus plans. Finally, larger firms tend to have higher dividend yields and higher levels of executive compensation. These results imply relations among the policy variables as well. There should be a positive relation between leverage and dividend yield and between compensation and the use of incentive plans. There should be negative relations between dividend yield and incentive plans and also between leverage and either compensation or incentive plans.

5.12 Risk Management

There are several ways a firm can manage risk, including diversification, insurance (nonlinear), and hedging (linear). To measure the valuation effect of risk management, researchers use either an event study or matched samples.


Stulz (1995)

Stulz (1995) attempts to reconcile the theories and practice of risk management. Survey data indicate that firms typically hedge transactions and do not engage in speculation or arbitrage. At the same time, managers indicate that their views influence the extent of hedging, and many large firms view the treasury as a profit center. Large firms tend to use derivatives more than smaller firms.

Theories predict gains from risk management may come from several sources. In an efficient market with diversification, these gains must arise only from real resource gains such as reducing costs due to financial distress, taxes, wages, or capital acquisition. Since increases in capital are a substitute for risk management, firms with low leverage are generally not expected to benefit much from hedging. In this sense, hedging allows firms to save capital. Since managers dictate the risk management policy, it is important to consider their incentives to reduce or increase risk. The chances of bankruptcy also affect risk management. The lowest risk firms can afford to take bets and the highest risk firms are forced to take bets.

Since most of the arguments for risk management focus on left-tail outcomes, methods such as variance reduction are not really appropriate. Value at risk emphasizes the magnitude of the loss that occurs with a given probability, but it is not appropriate either. The path of firm value over time is more important than the distribution at a point in time.
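The first claim above, that variance does not isolate left-tail outcomes, can be illustrated with a small sketch. The two samples are hypothetical mirror images (not data from Stulz (1995)), and the function name historical_var and its order-statistic quantile rule are assumptions made for illustration. The example does not address the second point, that the path of firm value matters more than a point-in-time distribution.

```python
import statistics


def historical_var(outcomes, prob):
    """Empirical value at risk, reported as a positive loss.

    Returns the loss at the `prob` quantile of the sorted outcomes, using a
    simple order-statistic rule (an assumption; other quantile rules exist).
    """
    if not outcomes or not 0 < prob < 1:
        raise ValueError("need a non-empty sample and 0 < prob < 1")
    ordered = sorted(outcomes)
    index = max(int(prob * len(ordered)) - 1, 0)
    return -ordered[index]


# Hypothetical changes in firm value: mirror images of each other, so the
# mean and variance match while the left tails differ.
left_tail = [-9, 1, 1, 1, 1, 1, 1, 1, 1, 1]
right_tail = [-x for x in left_tail]

same_variance = statistics.pvariance(left_tail) == statistics.pvariance(right_tail)
var_left = historical_var(left_tail, 0.10)    # loss in the worst 10% of cases
var_right = historical_var(right_tail, 0.10)
```

Both samples have the same variance, yet the 10% loss is much larger for the left-skewed sample, which is the kind of outcome the distress-based arguments for hedging care about.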

Froot, Scharfstein & Stein (1993)

Froot, Scharfstein, and Stein (1993) develop a theoretical framework describing optimal risk management strategies. The focus is on what and how much hedging should be done as opposed to why or how to implement the program. The optimal amount and type of hedging depends on the nature of a firm’s investment and financing opportunities.

This paper takes the view that the motivation for risk management is to reduce the variability in cashflows, since it disrupts investment and financing activities.[6] When cashflows are variable, the amount of external financing and/or investment will also be variable. Holding investment fixed requires changing external financing. If the marginal cost of funds increases in the amount of financing then the investment policy will still be altered. Thus, actions the firm can take to reduce cashflow variability may increase firm value. This is based on the assumption that firms are more efficient at hedging than individuals.

[6] Other theories include managerial risk aversion, information asymmetries in the labor market, taxes, financial distress costs/additional debt capacity, and underinvestment.

An implication of the model is that high R&D firms are more likely to hedge. These firms may have greater difficulty raising external funds, either because the growth opportunities are not good collateral or because there may be large information asymmetries. Also, the R&D growth options are not likely to be correlated with hedgeable risks. This effect comes from the distinction between collateral value sensitivities and marginal product sensitivities. Here the marginal product is insensitive to hedgeable risk. Therefore, the firm desires more hedging so it can still fully invest in the bad states. If the marginal product were more sensitive there would be a natural hedge, in the sense that when the firm is in a bad state it wants to invest less anyway.

Several conclusions arise from the model.

• Optimal hedging does not always mean full hedging.
• Firms should hedge less when future investment and cashflows are highly correlated and more when collateral and cashflows are correlated.
• Hedging by multinationals is influenced by revenue and expense exposures to exchange rates.
• Nonlinear hedging allows added precision.
• Futures and forwards are different intertemporally.
• Hedging practices of competitors matter to a firm.

May (1995)

May (1995) tests the theory that managerial risk preferences affect the risk management decisions of the firm. The paper focuses on acquisitions, which can be a substitute for other risk management practices. For managers, diversification may be a positive NPV project, even though it may be bad for shareholders. The main finding is that managers with more personal wealth invested in the firm tend to diversify, despite evidence that diversification typically reduces firm value [Berger and Ofek (1995)].

The CEO’s motives are proxied by his tenure, estimated fraction of wealth in equity, specialization of human capital, and past performance. The relation between these variables and the diversification level sought, industry-adjusted leverage, volatility, and idiosyncratic risk are considered. Diversification level sought is measured as the covariance of returns between bidder and target, firm-specific risk reduction, and implied change in volatility.

Table 5.9: Predictions and Results in Tufano

Hypothesis              Variable          Predicted  Actual
Distress                Cash costs        +
                        Leverage          +          +
Disruption of Invest.   Exploration       +
                        Acquisitions      +
Cost of Ext. Fin.       Firm value        –
                        Reserves          –
Tax                     Tax loss CF       +
Risk Aversion           Mgr. stock        +          +
                        Mgr. options      –          –
                        Nonmgr. block     ?          –
Other Fin. Policies     Diversification   –
                        Cash              –          –

There is strong evidence that the fraction of wealth in equity is important. CEOs with specific expertise tend to buy related targets. Poor past performers often make diversifying acquisitions. There is weak evidence that seasoned experts also make diversifying acquisitions, perhaps because their human capital becomes too firm-specific.

Tufano (1996)

By focusing on the gold industry, Tufano (1996) is able to carefully examine the determinants of risk management. Isolating the gold industry allows a study where there is a common exposure to output price. The wide variety of risk management policies and gold-related derivative instruments used by the industry provides cross-sectional variation. Data collection efforts are aided by the public disclosure of risk management activities. The gold industry should use very little hedging since its assets are mostly tangible and known, investors can hedge on their own relatively easily, and detailed reporting minimizes informational asymmetries. Despite these reasons, 85% of the firms do manage risk.


To perform the analysis, Tufano calculates a delta percentage (∆%), which is the portfolio delta times the ratio of ounces hedged to expected production. If ∆% = 0 there is no hedge and the firm is long its full production. At ∆% = 1 the firm has a full hedge. A delta percentage less than zero or greater than one indicates a speculative long or short position, respectively. The independent variables in the tobit regression are summarized in Table 5.9.
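As a rough illustration of the delta percentage just described, the sketch below computes ∆% and maps it to the cases in the text. The function names, the input numbers, and the "partial hedge" label for values strictly between zero and one are assumptions made here, not terminology taken from Tufano (1996).

```python
def delta_percentage(portfolio_delta, ounces_hedged, expected_production):
    """Portfolio delta times the ratio of ounces hedged to expected production."""
    if expected_production <= 0:
        raise ValueError("expected production must be positive")
    return portfolio_delta * ounces_hedged / expected_production


def classify_position(delta_pct):
    """Map a delta percentage to the cases described in the text."""
    if delta_pct < 0:
        return "speculative long"
    if delta_pct > 1:
        return "speculative short"
    if delta_pct == 0:
        return "no hedge"
    if delta_pct == 1:
        return "full hedge"
    return "partial hedge"  # label assumed for 0 < delta% < 1


# Hypothetical firm: contracts covering 100,000 oz with portfolio delta 0.5,
# against expected production of 200,000 oz.
example = delta_percentage(0.5, 100_000.0, 200_000.0)
example_label = classify_position(example)
```

In this hypothetical case one quarter of expected production is effectively hedged, so the firm remains long most of its output.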

He finds support for management incentives, but little support for firm incentives. When managers own more stock options firms manage risk less, but when managers have more wealth invested the firms manage risk more. Other results show that firms with low cash balances or CFOs with short tenure manage risk more.

Geczy, Minton & Schrand (1996)

Geczy, Minton, and Schrand (1996) try to explain “Why Firms Use Currency Derivatives.” They test the predictions of hedging theories by looking at a subset of the Fortune 500 firms with ex ante foreign exchange exposure. The study also considers how the magnitude of the exposure affects the benefits from risk reduction and the associated expenses. The results indicate that financing constraints provide incentives for hedging. There is evidence of underinvestment, especially for firms with little financial flexibility. Firms may choose to use foreign-denominated debt as a substitute for direct hedging. The expenses associated with hedging are important. There is no support for speculative positions.

Roughly 40% of the firms use currency swaps, forwards, or options. Usage is more common among firms with more growth opportunities or greater financial constraints, consistent with the model of Froot, Scharfstein, and Stein (1993). Larger firms or firms using other derivative instruments are more likely to use currency derivatives, indicating economies of skill and scale. Firms tend to hedge foreign currency with forwards and foreign interest with swaps.

The analysis is based on a logit regression predicting currency derivative use. The categories of factors considered are managerial incentives, bondholders, equityholders, operating characteristics, substitutes, and costs. An effort is made to account for the endogeneity problem related to a firm’s choices of capital structure, executive compensation, and derivatives usage.

Consistent with the argument that more foreign exchange exposure increases the benefits to hedging, the authors find that the likelihood of currency derivatives use is positively related to foreign sales, foreign-denominated debt, and foreign pre-tax income. The positive relation between hedging and R&D and the negative relation with the quick ratio support the claim that firms with the highest external finance costs use currency derivatives.

5.13 Internal/External Markets and Banking

The distinction between internal and external capital markets becomes important when there are market frictions. Fama (1985) claims that banks must provide some unique services since they are effectively taxed by reserve requirements, but the organizational form still exists. Possible explanations are an informational advantage, greater capacity to monitor, and a certification/signaling role.

Rajan (1996)

Rajan (1996) presents a model incorporating the endogenous costs and benefits of bank debt. An optimal borrowing structure reduces a bank’s ability to appropriate rents from the borrower without drastically reducing its ability to control. The main result is that an informed bank can prevent a manager from continuing a negative NPV project, but it comes at a cost of reduced managerial effort and value due to the bank’s bargaining power over positive NPV projects. Arm’s length debt has neither the bargaining power nor the monitoring capacity of bank debt, but demands a higher return ex ante to compensate for the negative NPV projects.

In the model, an owner-manager needs external financing to pursue a project idea. After making the investment, the manager exerts costly effort which affects the distribution of project returns. The bank has the ability to force discontinuation if the project becomes negative NPV. Since the manager is a residual claimant, he always wants to continue [Jensen and Meckling (1976)]. Note that everyone is risk-neutral in the model.

The structure of the bank loan is important. If the bank requires repayment when the true state is revealed, the bank has the power to hold up the manager unless he has other financing options. This causes the owner to lose some of the surplus from the project and he will no longer exert optimal effort. Alternatively, the bank can require repayment only at completion of the project. Now the bank loses its power to force discontinuation and has to bribe the manager to stop negative NPV projects. Competition among financiers has ambiguous effects. It reduces the bank’s ability to extract a surplus in the good states, but also reduces its ability to force discontinuation since the manager can borrow from uninformed sources.

Puri (1996)

The purpose of Puri (1996) is to determine whether banks suffered from a conflict of interest when they were allowed to underwrite securities offerings. The Glass-Steagall Act of 1933 prevented banks from underwriting based on the premise that banks had an incentive to underwrite offerings of their own troubled loans.

There is a tradeoff between the informational advantage banks have, which should reduce the yield premium, and the conflict of interest, which would raise the premium. The strategy of the paper is to look at yield premiums of commercial banks versus investment banks. The null hypothesis is that the yield premiums are the same for the two types of banks. The sample includes several hundred offerings between 1927 and 1929, the period between the McFadden Act, which made underwriting legal, and the Depression. The main analysis is a regression of yield premium on control variables and a dummy for commercial banks. The control variables include credit quality, loan amount, syndicate size, firm age, and dummy variables for exchange listing, securitization, and new issue.

The results suggest that commercial banks did not have a conflict of interest. The yield on bank underwritten issues is lower than that on underwritings by investment banks, especially for the informationally sensitive offerings such as new issues, industrials, preferred, and lower-grade. This indicates the informational effects dominate the conflict of interest and is consistent with positive AR for bank loan announcements.

Shin & Stulz (1996)

A test of whether divisional structures influence investment policy is the focus of Shin and Stulz (1996). A firm with multiple divisions has several potential costs and benefits of diversification. On the one hand, internal capital markets will provide cheaper access to capital if external markets are imperfect. On the other hand, bureaucracy may hamper efficient investment. The basic evidence is that the investment of small divisions depends heavily on the cashflow of larger divisions, but the investment of larger divisions does not depend much on the cashflows of other divisions. This suggests that internal capital markets are important, but does not tell us if they are good or bad.

There are three hypotheses under consideration. With bureaucratic rigidity, additional management and inefficient policies and procedures may cause firms to give divisions a “sticky” fraction of the total capital budget. One division’s allocation will be inversely related to the cashflows of other divisions. This inverse relation will be stronger with more divisions, and weaker when investment is not expected to be sensitive to cashflows. Under the hypothesis of efficient internal capital markets, firms will shift funds (including dividends) to the source of highest value. In this setting other divisions will benefit when a large division has high cashflows and relatively poor investment opportunities. Finally, the free cashflow hypothesis says that firms may still shift funds to the best use, but dividends will not be paid. The prediction of this theory is that firms will invest more in non-core segments if the core business has high cashflows and poor prospects.

To address these theories, the authors examine the link between CF and investment at the division level compared to the entire firm. The link between a division’s investment and the cashflows of the other divisions is also considered. A distinction is made between small and large divisions. The ratio of divisional capital expenditures to lagged divisional assets is regressed on the lagged value of that ratio, divisional sales growth, divisional CF/Assets, and CF/Assets of other segments. These regressions are performed separately for small and large segments, with further subdivision on the number of segments. The entire analysis is repeated for large firms.
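To make the regression specification concrete, the sketch below assembles one division-year observation with the dependent variable and the four regressors named above. The function and field names are hypothetical, the input numbers are invented, and the scaling choices (own cashflow over own lagged assets, other segments' cashflow over their combined lagged assets) are assumptions; the notes do not state exactly how Shin and Stulz (1996) scale each term.

```python
def investment_regression_row(capex, lagged_assets, lagged_investment_ratio,
                              sales, lagged_sales, cashflow,
                              other_cashflow, other_lagged_assets):
    """Build one division-year observation for the investment regression.

    Scaling choices (own lagged assets for own cashflow, combined lagged
    assets of the other segments for their cashflow) are assumptions.
    """
    return {
        "investment": capex / lagged_assets,          # dependent variable
        "lagged_investment": lagged_investment_ratio,
        "sales_growth": sales / lagged_sales - 1.0,
        "own_cf_to_assets": cashflow / lagged_assets,
        "other_cf_to_assets": other_cashflow / other_lagged_assets,
    }


# Hypothetical small division inside a larger firm.
row = investment_regression_row(
    capex=10.0, lagged_assets=100.0, lagged_investment_ratio=0.08,
    sales=150.0, lagged_sales=100.0, cashflow=25.0,
    other_cashflow=40.0, other_lagged_assets=400.0,
)
```

Stacking such rows across divisions and years gives the sample on which the coefficient on other_cf_to_assets can be compared between small and large segments.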

The results indicate that the investment of all divisions is positively related to each of the independent variables. For small divisions, the cashflows of other divisions are fairly important, while this is not the case for larger divisions. With more divisions the importance of other segments increases.

For firms where investment is not expected to be sensitive to cashflows (e.g., low leverage or high q[7]), the sensitivity of a division’s investment to other divisions’ cashflows is weaker. This sensitivity increases with the number of divisions. These results are consistent with the bureaucratic rigidity hypothesis. There is little evidence supportive of the efficient internal capital markets or free cashflow hypotheses; firms with large divisions that have poor prospects but large free cashflow do not seem to direct more funds to small divisions in growing industries.

[7] Market to Book is actually used as a proxy for Tobin’s q.

Billett, Flannery & Garfinkel (1995)

Billett, Flannery, and Garfinkel (1995) attempt to determine whether the quality of the lender has a valuation impact on the borrower. The lender’s identity might matter if certain lenders have special monitoring abilities, or if the lender’s preferences for certain risk classes signal the borrower’s type. Announcement of issuance of public securities is generally met with a price decline. Private securities are often associated with a positive price impact. Therefore, public and private financing do not seem to be perfect substitutes.

Institutional features may affect this process. Banks have access to private information in the form of deposit accounts. Government regulations require banks to focus on the risk of individual loans rather than the entire portfolio. Therefore, borrowing from a constrained bank may signal a less risky borrower. The lender’s credit quality may also matter. Borrowers are likely to prefer healthy banks to preserve long-term relationships and minimize search/switching costs. Expertise in monitoring may produce economies to specialization. A high rating will reduce the lender’s cost of capital. A reputational equilibrium may develop where lenders are expected to deliver securities of a certain type.

The analysis performs an event study on a sample of firms with loan announcements in the 1980s. Univariate analysis shows no difference between the abnormal returns when the lender is a bank versus a non-bank. However, borrowers experience a positive abnormal return when borrowing from a bank with a high credit rating, versus a negative abnormal return from lenders with lower ratings. Regression results indicate that abnormal returns increase by 20 basis points for each change in the lender’s credit rating after controlling for other factors such as firm size, preannouncement run-ups, and other firm characteristics.

Fazzari, Hubbard & Petersen (1988)

Fazzari, Hubbard, and Petersen (1988) test whether financing constraints affect investment. In perfect markets, financing alternatives are perfect substitutes and the investment and financing decisions are separate. Market imperfections make external markets more expensive. Asymmetric information is the primary friction; others include transactions costs, taxes, agency problems, and financial distress.

The paper explores the empirical support for the q theory, the sales accelerator model, and the neoclassical model of investment. Each of these models predicts that factors other than cash flow drive investment. Under the q theory, firms invest as long as the marginal q is greater than unity. The neoclassical theory is based on the notion that the financial characteristics of a firm do not affect the cost of capital. The sales accelerator model says that sales growth drives investment.

The basic idea behind the empirical tests is to define three classes of firms based on dividend payouts (retained earnings). These groups are proxies for information asymmetry; high payouts mean the firm has the lowest costs to external financing. Investment per dollar of capital is then regressed on financial measures to see if there are differences across groups. The results indicate that cashflow is important in determining investment, and more so for the firms with low dividend payouts. This supports the pecking order theory.

5.14 Convertible Debt

Convertible debt can be viewed as an indirect equity issuance: when a firm calls its bonds it is like issuing equity. Under a signaling hypothesis, firms tend to issue equity when their shares are overvalued.

Many researchers have argued that in perfect markets, convertible bonds should be called as soon as possible to minimize the value of the liability. Early empirical evidence suggests that corporations wait too long to call and there are negative excess returns at the announcement of the call. Subsequent researchers proposed several reasons why a firm may choose to delay the call. This could be due to managerial compensation schemes based on EPS, the effect of reduced bondholder goodwill on future issuances, a preference for voluntary conversion induced by dividend increases, and suboptimal conversion strategies by the security holders.

Stein (1992)

Stein (1992) develops a theory explaining the use of convertible debt based on the cost of financial distress and the importance of call provisions. Convertibles allow a company to get equity into the capital structure “through the back door,” while mitigating the adverse selection costs of a direct equity issuance. Since a convertible issue is like a combination of debt and equity, the issuance signals better prospects than an equity issuance.

The model is an extension of Myers and Majluf (1984), where there are good, medium, and bad firms that differ in the probability of a high cash flow. The firm knows its type at time zero, while investors get this information at time one. The cashflow is revealed at time two. A good firm is certain to get the high cashflow XH. Medium firms get XH with probability p. Bad firms may improve with probability (1 − z) and then have probability p of getting XH, or deteriorate and get nothing.

A basic version of the model gives firms the choice of equity, long term debt, and convertible debt. When costs of financial distress are sufficiently high (C > I − XL) there is a separating equilibrium. Good firms choose debt since there are no distress costs and the firm does not have to sell undervalued securities. Medium firms choose convertible debt to reflect the tradeoff between distress costs and issuing undervalued securities. The bad firms choose equity because the distress costs of other securities outweigh the benefits.
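The type structure and the separating condition can be written out as a minimal sketch. It follows only what is stated above; reading I as the required investment and XL as the low cashflow is an assumption (the notes do not define these symbols), and the function names and numbers are invented for illustration.

```python
def prob_high_cashflow(firm_type, p, z):
    """Time-zero probability that a firm of the given type ends up with XH."""
    if not (0.0 <= p <= 1.0 and 0.0 <= z <= 1.0):
        raise ValueError("p and z must be probabilities")
    if firm_type == "good":
        return 1.0                 # certain to receive XH
    if firm_type == "medium":
        return p
    if firm_type == "bad":
        return (1.0 - z) * p       # must first improve, then draw XH
    raise ValueError("unknown firm type")


def separating_equilibrium(distress_cost, investment, low_cashflow):
    """Condition C > I - XL from the text (symbol meanings assumed)."""
    return distress_cost > investment - low_cashflow


# Security chosen by each type in the separating equilibrium described above.
SEPARATING_CHOICE = {"good": "debt", "medium": "convertible debt", "bad": "equity"}

bad_prob = prob_high_cashflow("bad", p=0.5, z=0.5)
separates = separating_equilibrium(distress_cost=30.0, investment=100.0,
                                   low_cashflow=80.0)
```

With these hypothetical inputs the bad type reaches XH a quarter of the time, and distress costs of 30 exceed I − XL = 20, so the three types would sort into the three securities.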

There are several forms of empirical support for the model. Firms often state the desire to get equity into the capital structure as a reason for issuing convertible securities. Convertible debt is often (and fairly quickly) converted into equity. Convertible issuers tend to have high informational asymmetries and costs of financial distress, as indicated by high R&D/Sales, M/B, D/E, and CF volatility. Finally, the stock price reaction to convertible issues is typically one-third to one-half as large as the negative reaction to equity issuances.

Ofer & Natarajan (1989)

The paper by Ofer and Natarajan (1989) assesses whether the negative share price reaction to a call announcement is due to signaling. There is a decline in performance after the announcement as well as a continued negative CAR over the next five years.

Under a signaling framework the announcement of the call will be met with a negative return since investors perceive the call as signaling bad news. For the signal to be effective the firm must perform poorly after the call. The sample consists of over 100 voluntary calls during the 1970s. There is a potential sample selection bias since the pre-announcement performance tends to be abnormally high. After the call, what may be normal performance will look poor in comparison. Other papers which correct for this problem do not find evidence of poor post-announcement performance.

The authors use several measures of performance to avoid the causality problem between the call decision and the performance measures. EBIT will not be affected by the conversion. EBT is affected through the reduction in interest. EPS is affected by both the interest and the increase in number of shares. Finally, AEBT is EBT less the interest that would have been paid.

Three models of normal performance are used. The first assumes the performance is stationary through time. The second and third models express expected performance as a function of average market- and industry-wide performance. In all cases the results indicate that these firms have unexpectedly poor performance. The call announcement is associated with a negative abnormal return, then followed by negative CARs over the next five years. These results are consistent with the information signaling hypothesis and the predictions of Myers and Majluf (1984).

Dunn & Eades (1989)

Dunn and Eades (1989) attempt to explain the observation that firms wait too long to call preferred stock by focusing on the assumption that investors follow perfect-market strategies. If enough investors deviate from the perfect-market strategy then it may be optimal for the firm to delay the call. If the dividend yield on the callable security is lower than on the common stock then managers can take advantage of the slow conversion by passive investors.

The optimal call policy for the firm is to force conversion by calling as soon as the conversion value exceeds the call price, but before the issue enters the voluntary conversion region (VCR). The VCR is the first ex-date where the dividend on conversion is greater than the preferred dividend and conversion premium.

The study uses convertible preferred stock to avoid complications related to interest tax deductibility. Consistent with the passive investor theory:

• Many investors do not convert in the VCR
• Convertible preferreds sell below conversion values
• Firms are generally not able to increase shareholder wealth by calling
• Passive investors would typically realize incremental returns by converting

The authors define the dividend ratio (DR) as the total conversion dividends relative to the total preferred dividends in a year. The price ratio (PR) is the average ratio of preferred price to conversion value of equity. The share ratio (SR) is the fraction of preferred shares remaining at the end of the year after conversion.
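The three ratios can be sketched directly from these definitions. The function names and sample figures are hypothetical, and measuring SR against the shares originally outstanding is an assumption, since the definition above does not name the base.

```python
def dividend_ratio(conversion_dividends, preferred_dividends):
    """DR: total dividends available on conversion over total preferred dividends in a year."""
    return sum(conversion_dividends) / sum(preferred_dividends)


def price_ratio(preferred_prices, conversion_values):
    """PR: average of preferred price divided by conversion value of equity."""
    ratios = [price / value for price, value in zip(preferred_prices, conversion_values)]
    return sum(ratios) / len(ratios)


def share_ratio(shares_remaining, shares_originally_outstanding):
    """SR: fraction of preferred shares still outstanding at year end (base assumed)."""
    return shares_remaining / shares_originally_outstanding


# Hypothetical issue: four quarterly dividends and two price observations.
dr = dividend_ratio([0.50] * 4, [0.25] * 4)
pr = price_ratio([50.0, 60.0], [50.0, 40.0])
sr = share_ratio(800, 1000)
```

A DR above one, as in this made-up example, corresponds to the case where common dividends on conversion exceed the preferred dividend.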

When DR < 1 then PR > 1, indicating that the preferred sells at a premium due to the conversion option and dividend advantage. When DR > 1 then PR = 1 since there is no conversion premium. The SR drops to around 80% prior to entering the VCR, drops to around 50% in the next year, then declines to roughly 10% ten years after entering the VCR. Consistent with the theory, callable survivors have higher φC/Call, lower SR, higher DR, and lower PR than the called sample. The called sample also has a higher proportion of issues in the VCR.

Regression results show that before entering the VCR, conversions increase when preferred is selling below its conversion value. After entering the conversion region, investors are increasingly motivated to convert as the dividend advantage of common stock increases. Using institutional ownership as a proxy for active investors, there is some weak evidence that institutional investors reduce their holdings more than other investors.

Asquith & Mullins (1991)

Asquith and Mullins (1991) explain why companies do not call convertible debt when the conversion value exceeds the call price, as predicted by many theories. There are three primary criteria used to explain this behavior. The first, and most obvious, is simply that the issues are still call-protected. Second, the firms may want the conversion value to be somewhat higher than the call price to provide protection from a price decline during the call notice period. Finally, the most powerful explanation is that there may be cashflow advantages to the firm from not calling when the after-tax interest is less than the dividends.

An analysis of convertible bonds with conversion values in excess of par indicates that 89% fall into one of the above categories. 21 of the remaining 22 are close to or subsequently meet the requirements for one of these groups. Voluntary conversion is more likely with higher conversion value or higher dividends relative to after-tax interest. An increase in conversion value decreases the option value. Investors voluntarily convert when they get more cash in dividends, a time when firms have an incentive not to call. This is supported by the data since less than 20% of the issues remain when converted dividends exceed the interest. Although the investor’s problem is the inverse of the firm’s, the decisions are not symmetric because of taxes. Therefore there are bonds which a firm will not call and investors do not convert.

Asquith (1995)

Asquith (1995) corrects prior studies by showing that, when measured properly, there is no call delay. Prior studies draw the conclusion that conversion value in excess of call value indicates a delay from the optimal time to call. A number of these bonds are still call-protected. Many of those that are not protected have the after-tax yield below the dividend, providing a cashflow incentive not to force conversion. Finally, delayed conversion bonds often have relatively low premia or volatile cashflows, providing a price protection justification for the delay. These motivations are discussed in Asquith and Mullins (1991). This paper adds an analysis of the delay between when a bond is callable and when it is called.

The paper finds that those bonds that are called have fewer “live” days. Bonds with relatively high conversion prices and those with D < I(1−τ) are called more quickly. A puzzle is that there are several bonds with D > I(1−τ) that are called. The general conclusion is that most bonds are called as soon as possible unless there are cashflow advantages to delaying. The median call delay for all bonds is four months, but less than one month if a price cushion is considered. Asquith argues that call premiums are not a useful method of detecting whether bonds are called late. Overall, the average call premium is 50%. The average call premium drops to 25% after considering factors such as cashflow motivated delays, sudden stock price increases, and large premiums while call protected.
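The cashflow comparison behind D versus I(1−τ) can be checked with a one-line rule. This is only a sketch of the inequality as stated in the text; the function name and the numbers are hypothetical, with D read as the dividends payable on the converted shares, I as the bond interest, and τ as the corporate tax rate.

```python
def cashflow_incentive_to_delay(dividends_if_converted, interest, tax_rate):
    """True when D > I(1 - tau).

    In that case after-tax interest on the bond is lower than the dividends
    the firm would pay on the converted shares, so not calling saves cash.
    """
    if not 0.0 <= tax_rate < 1.0:
        raise ValueError("tax rate must lie in [0, 1)")
    return dividends_if_converted > interest * (1.0 - tax_rate)


# Hypothetical bond: interest of 8.0 at a 50% tax rate costs 4.0 after tax.
delay_when_dividends_high = cashflow_incentive_to_delay(5.0, 8.0, 0.5)
delay_when_dividends_low = cashflow_incentive_to_delay(3.0, 8.0, 0.5)
```

Under these made-up numbers the firm has a cashflow reason to delay only when the dividends on conversion (5.0) exceed the after-tax interest (4.0); with dividends of 3.0 the rule points to calling promptly.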

5.15 Imperfections and Demand

In perfect markets demand curves should be flat, but market imperfections may cause downward sloping demand curves. Many important propositions in finance are based on the assumption that investors can buy or sell stock without changing the price. Observed price reactions indicate prices are sensitive to volumes. Large block purchases generally result in price increases, while sales cause prices to fall. With equity issuance there are negative price reactions, potentially due to agency costs of free cashflow [Jensen (1986)], asymmetric information [Myers and Majluf (1984)], and signaling [Miller and Rock (1985)]. In takeovers bidder prices typically fall while targets receive a premium. Convertible debt and the call announcements are associated with negative market reactions. It is not clear if these reactions are driven by signaling, liquidity, or downward sloping demand curves.

Shleifer (1986)

Shleifer (1986) provides evidence that demand curves for stocks do slope down. He uses inclusion in the S&P 500 as a sample since this event increases demand for the stock without contaminating information effects. Earlier studies had examined the price effects of large block trades but these events may be based on information. A possible certification role of index membership is refuted since the returns are unrelated to bond ratings. The liquidity hypothesis is rejected by finding no difference in the returns of Fortune 500 firms and other firms.

There is no evidence that the market is able to predict inclusion in the index. Before daily notification of the inclusion there is no abnormal return on the event day. Since 1976 there has been a daily notification service of changes in the index. In this period inclusion in the index is associated with a positive abnormal return of about 2.8%. This return lasts for several weeks and seems to be related to buying by index funds. Other evidence supports the downward sloping demand curve hypothesis as well. The price reaction to large block trades typically only lasts a few hours. Firms with multiple classes of stock that issue more of one class generally experience a price drop only for that class of stock [Loderer, Cooney, and VanDrunen (1991)]. A downward sloping demand curve is also consistent with the January effect.

Shleifer & Vishny (1992)

Shleifer and Vishny (1992) relate the costs of asset sales to leverage in a general equilibrium setting. When a firm is in financial distress, the most ideal purchasers of the assets are likely to be in financial distress themselves. This liquidity cost is recognized ex ante as a cost of leverage. The main result is that more liquid assets are able to support more debt. This is broadly consistent with Myers (1977).

The intuition behind the model is that assets are often specialized, making them most valuable to firms within the industry. When industry shocks send a firm into financial distress its competitors will also be affected. As a result, there is an industry debt capacity and the leverage of one firm will depend on the leverage of its peers. Firms outside the industry may have an interest in the assets but are likely to pay less. Outsiders fear overpaying since they lack the expertise to properly value the assets, they may lack the knowledge or skills to fully utilize the assets, and they face agency costs in hiring experts to help them.

There are several empirical implications of the model. Liquid assets should be financed with more debt. Cyclical and growth oriented assets are likely to have lower debt financing. Ceteris paribus, smaller firms should be able to support more debt since they can more easily be purchased. Conglomerates should also be able to use more debt since the divisions can cross-subsidize each other. High markets are likely to be liquid markets.

The takeover wave of the 1980s is consistent with this theory. Corporate cashflows were large, as was the number of potential buyers. Antitrust enforcement was relaxed, allowing more intra-industry acquisitions. This increased liquidity and the rise of the junk bond market reinforced each other.

Merton (1987)

Merton (1987) is an asset pricing model which relaxes the assumption of homogeneous information. Although the model is cast as one with imperfect information, it can be interpreted as a model of incomplete markets. Investors are unable to fully diversify so they demand a premium for bearing this undiversifiable unsystematic risk.

In this one period model risk-averse investors know about a subset of the securities in n risky firms. There is also a riskless asset and another asset that combines the riskless security with a forward contract. The market is absent frictions from taxes, transactions costs, and restrictions on borrowing. If all investors had complete information sets the model reduces to the standard SL CAPM, otherwise the market portfolio is not mean-variance efficient. Information costs come in the form of gathering and processing data, transmitting information, and most importantly, making investors aware of the firm.

The return generating process is

$$\tilde R_k = \bar R_k + b_k \tilde Y + \sigma_k \tilde\varepsilon_k.$$


An investor is informed about asset $k$ if he knows $\bar R_k$, $b_k$, and $\sigma_k$. All informed investors have conditionally homogeneous beliefs. This structure is similar to the single asset model of Grossman and Stiglitz (1980), but here there is no gaming between the informed and uninformed because investors only invest in securities in which they are informed. The shadow cost of not knowing about an asset is the same for all uninformed investors and is equal to the expected excess return on the asset. The equilibrium expected return is

$$\bar R_k = R + b_k b\,\delta + \delta x_k \sigma_k^2 / q_k.$$

This equation shows the expected return decreases when the investor base increases.
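The comparative static in the investor base can be illustrated with a few lines of Python. This is only a sketch of the formula above: the function name and the parameter values are hypothetical, and reading δ as an aggregate risk-aversion coefficient and b as the market-wide factor loading is an assumption, since the notes do not define them.

```python
def merton_expected_return(R, b_k, b, delta, x_k, sigma_k, q_k):
    """Evaluate R + b_k*b*delta + delta*x_k*sigma_k**2/q_k.

    R       : riskless return
    b_k     : common-factor exposure of firm k
    b       : market-wide factor loading (assumed interpretation)
    delta   : risk-aversion coefficient (assumed interpretation)
    x_k     : relative size of firm k
    sigma_k : firm-specific volatility
    q_k     : fraction of investors who know about firm k, 0 < q_k <= 1
    """
    if not 0.0 < q_k <= 1.0:
        raise ValueError("q_k must lie in (0, 1]")
    return R + b_k * b * delta + delta * x_k * sigma_k**2 / q_k


# Illustrative (made-up) inputs: the same firm with a narrow and a broad investor base.
narrow = merton_expected_return(1.02, 1.0, 1.0, 2.0, 0.01, 0.3, q_k=0.1)
broad = merton_expected_return(1.02, 1.0, 1.0, 2.0, 0.01, 0.3, q_k=0.5)
print(narrow > broad)  # the firm-specific premium shrinks as q_k rises
```

With complete information (q_k = 1 for every firm) the last term is as small as it can be, which is the sense in which the model nests the standard CAPM.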

The model makes several predictions. A large common-factor exposure ($b_k$), large size ($x_k$), or large variance ($\sigma_k^2$) creates high expected returns. When the firm is well-known or has a large investor base ($q_k$) the expected return is smaller. This may give rise to a size effect. These effects can give rise to downward-sloping demand curves. Expansion of the firm’s investor base and increases in investment will tend to coincide, giving a motivation for an underwritten offer instead of a rights offer. Managers have an incentive to expand the investor base, especially for relatively unknown firms and those with large firm-specific variances. This can explain why firms advertise their stock and invest in generating interest in the firm by the financial press. The model is also consistent with IPO waves in general and concentration within an industry.

Kadlec & McConnell (1994)

Kadlec and McConnell (1994) use exchange listing to test the predictions of the Merton (1987) model of investor recognition and the Amihud & Mendelson (1986) model of liquidity factors. In the former model expected returns decrease as the size of the investor base grows. In the latter, expected returns decrease with a reduction in the relative bid-ask spread. If the expected return decreases then the market value should increase and abnormal returns should be positive.

During the 1980s, announcement of NYSE listing results in an abnormal return of 5 to 6%. The listing is also associated with a 19% increase in the number of shareholders, a 27% increase in institutional ownership, a 5% reduction in absolute bid-ask spreads, and a 7% reduction in relative spreads.


The results are consistent with both models. The proxy for Merton’s shadow cost of incomplete information is the inverse of the change in investor base scaled by the level of firm-specific risk and market value. Controlling for the change in bid-ask spread, an increase in investor base results in a positive abnormal return. Controlling for change in investor base, a decrease in the spread is associated with higher abnormal returns.

Loderer, Cooney & VanDrunen (1991)

Loderer, Cooney, and VanDrunen (1991) isolate and identify the potential influence of price elasticity on demand using the price discount from SEOs by regulated firms. Regulated firms are used because they are more likely to have preferred stock and less likely to have information asymmetries. If the stock issuance announcement contains negative information, there should be a negative reaction for preferred stock as well. The evidence supports the incomplete markets theory of Merton (1987), but is inconclusive with respect to theories of liquidity or heterogeneous beliefs.

To estimate the determinants of elasticity, INV ELAS (see footnote 8) is regressed on variance (–), size (–), investor base/liquidity (+), and proxies for information effects (+). To capture information effects the authors consider ∆E[EPS], ∆EPS, ∆ROE, and the price change of nonconvertible preferred stock at the announcement. The results are significant and consistent with predictions for all variables except liquidity and information, which are insignificant. These results are robust to a number of different specifications and proxy variables. A potential caveat is that the predictability of issuance by regulated firms may make it difficult to detect information effects.

5.16 Financial Innovation

When there are market imperfections there may be structures of claims that have special value. Just as the prior section dealt with imperfections and asset demands, this section addresses the effects on the supply of securities. Topics covered here include optimal financial contracts, the incentives to innovate, and the existence of clienteles.

Footnote 8: This is the inverse of elasticity. The inverse introduces a nonlinearity in the model that may result in mis-specification.


Zender (1991)

Zender (1991) develops a model of the optimal financing contract that incorporates both cashflow and control allocations. Most existing theories focus only on cashflows. The optimal financial instruments completely resolve incentive problems induced by asymmetric information. In the setting of the paper, standard debt and equity contracts are optimal. Bankruptcy broadens the investment opportunity set and facilitates cooperation between the parties.

In the model there are three agents: an entrepreneur, an active owner, and a passive owner. The owners are risk-neutral and have limited capital. At t = 0 contracts are designed and sold and the initial investment $I_0$ is made. At t = 1 information about $CF_3$ is made public. The firm receives $CF_1$ and control for t = 2 is assigned. At t = 2 the firm has an investment opportunity which the controlling owner knows but for which the public knows only the distribution. The investment requires an expenditure I that is unobservable to outsiders. At t = 3 the investment payoff is realized and the firm is liquidated.

There is disagreement among the agents about investment/dividend policy due to the passive investor’s inability to observe investment expenditures. The agents realize up front that risk-shifting may occur and they mitigate it by inducing a state-contingent control change when an observable signal is realized. Cashflows to debt must be fixed in order to provide the equityholder incentives to make efficient investments. This can explain the use of debt before tax shields.

Tufano (1989)

Tufano (1989) examines “innovative” investment banks and the benefits from innovation. He finds that innovators generally do not charge monopoly prices (underwriting spread). Instead, they charge lower long-run prices and gain market share. One interpretation is that innovation can reduce costs of trading, underwriting, and marketing.

To identify the importance of price as a source of first-mover advantage, underwriting spreads are regressed on measures of competitiveness and underwriter identity and control variables for offering characteristics. A dummy variable for the monopoly period is insignificant for all offers and negative for imitated products. Permanent price effects as measured by a pioneer dummy are reliably negative.

The long-run quantity effects appear to be an important source of first-mover advantage. Pioneers capture market share nearly 2.5 times as large as imitators. Temporary quantity effects due to periods of monopoly are not important since the number of deals is small and imitators are quick to follow.

Kim & Stulz (1988)

Kim and Stulz (1988) directly test the clientele hypothesis, which says that firm value can be increased by seeking funding from groups with unique demands. The evidence is consistent with this hypothesis.

The authors focus on Eurobonds from U.S. firms that also issue domestic debt. Eurobonds are generally bearer bonds, allowing the holder to escape taxes. There are some questions over the enforceability of the bond indenture, so reputation replaces restrictive covenants. Foreign investors may desire these securities because they offer diversification yet have smaller purchasing power and political risks. This market is characterized by larger underwriting spreads.

If the supply of Eurobonds is not perfectly elastic then excess demand can create profitable financing opportunities since investors will accept lower yields. The supply of these securities is somewhat constrained because of the high issuance costs, low risk requirement, and reputational capital required.

The results indicate there are positive abnormal returns at the announcement of Eurobond issues. A comparison sample of domestic debt issues shows no significant announcement effect, as in Mikkelson and Partch (1986). The positive abnormal returns occur mostly during the 1979–1982 period of bought-deal underwriting when yield spreads were large. This type of arrangement reduced the time it takes to issue Eurobonds. The positive abnormal returns diminished in subsequent years when shelf registration increased the attractiveness of domestic issues. Abnormal returns were indistinguishable from zero when tax laws ended the withholding tax for foreign investors in domestic bonds. The clientele hypothesis is tested by regressing abnormal returns on the size of the financing bargain. They find a slope coefficient different from zero but not different from one, consistent with the clientele hypothesis.


Jung, Kim & Stulz (1996)

The paper by Jung, Kim, and Stulz (1996) finds that some firms appear to issue equity for the benefit of managers rather than shareholders. See Section 5.9.6 for a more complete discussion of this paper.

McConnell & Schwartz (1992)

McConnell and Schwartz (1992) describe the process leading up to the development of the Liquid Yield Option Note (LYON) by Merrill Lynch in 1985. This is a zero-coupon, callable, convertible, putable instrument. This instrument is designed to reduce the transactions costs associated with a strategy of investing in options and the money market. These investors desire a risky investment paying interest but preserving the principal, much like portfolio insurance. The value of the security is relatively insensitive to the risk of the company, reducing the cost of information asymmetries. There is a self-selection by firms since only those with the most confidence in their prospects will issue. When pricing the instrument it is important to consider the interaction/covariance between the various components.

Chapter 6

Market Microstructure

6.1 Introduction

Information economics deals with incorporating information into asset prices. Market microstructure is the study of the process and outcomes of exchanging assets under explicit trading rules. The focus is often on the interaction between the mechanics of the trading process and its outcomes, with specific emphasis on how actual markets and intermediaries behave.

Randomness is an important part of any of these models. The source of the randomness has implications for the characteristics of the model. In all cases there is uncertainty about future outcomes or cashflows. Informed agents have imperfect information about the future value of the asset. This information may be the same for all informed agents, or they may each have diverse signals. Some models include uninformed agents whose demands depend on price. Noise trading is an additional source of randomness that introduces uncertainty about the net demands for the asset and prevents fully revealing prices.

A major difference in the models is whether trades are processed in a batch or sequentially. The latter allows dynamics in the price process and facilitates analysis of the bid-ask spread. The literature is fragmented in its view on the risk preferences of specialists.

There are several important idiosyncrasies in early papers that much of the subsequent work tries to solve. The first is a paradox where agents ignore their private information when prices are fully revealing. If this is true, then how does the private information get into prices in the first place? A second paradox arises when private information is costly and prices are fully revealing. If so, then there is no incentive for collection of private information, and this private information will then never become impounded in the prices. Finally, there is the schizophrenia result where rational agents in a competitive market act as price takers, ignoring the impact their trades will have on the price. The first two problems are solved by introducing noise trading. Allowing imperfect competition solves the last problem.

6.2 The Value of Information

Hirshleifer (1971)

Hirshleifer (1971) analyzes the private and social value of private information in a context of uncertain personal productivity. A distinction is made between foreknowledge, knowing something in advance of its occurrence, and discovery, recognition of something (that may have already occurred) which is not readily observable. Hirshleifer argues that there is no social value to foreknowledge without production. This is because information is valuable only if it can affect actions. Under the assumptions in the paper, agents have the same endowments, preferences, and beliefs so there is no incentive to trade. If the informed agent could speculate the information would be privately valuable.

In a production economy, foreknowledge is both privately and socially valuable. This is because production can be shifted to the optimal channels based on this information. The informed agent can sell his information so that the economy can fully use it in redirecting productive activities. This has implications for the timing of information releases. Announcements at regularly scheduled intervals allow risk-averse agents to insure, getting out of the way before the news announcement. Random releases of news as it occurs allow more efficient reallocation of production, but expose the agents to distributive risk. The same general results obtain with discovery information.

Marshall (1974)

Marshall (1974) shows that information can be socially valuable even in a pure exchange economy if agents have heterogeneous priors. With homogeneous beliefs, private information has no social value if it can be hedged and it reduces value if it cannot be hedged. In a production economy information is socially valuable with sufficient hedges. Marshall says that there is an overincentive to produce private information.

6.3 Single Period REE

Prices reflect traders’ information in a securities market. In a Rational Expectations Equilibrium (REE) traders with heterogeneous information attempt to infer the information of others from the prices, and then use this information to revise their beliefs. There are several problems with the REE concept. Unless noise is added, prices are typically fully revealing [Grossman (1976)]. Fully revealing prices preclude speculative trading on the basis of heterogeneous beliefs, giving the “no trade” result of Tirole (see footnote 1). If traders are allowed to condition on trades as well as prices, then these data are sufficient statistics for all information and there is no advantage to being informed. Finally, without restricting the distribution of information, there is no trading mechanism that could implement an REE.

Milgrom & Stokey (1982)

Milgrom and Stokey (1982) is a base-case for the information content of trades. The model imposes very restrictive assumptions such as complete markets and concordant beliefs. Agents are risk-averse. Under these conditions, a “no trade” result obtains. Once ex ante trading occurs to a Pareto optimal level, no future trading will take place although prices may change. This is because anyone willing to trade must have private information. Other agents will realize this and will be unwilling to trade since they all interpret information in the same way.

Footnote 1: The no trade result of Milgrom and Stokey (1982) will obtain with homogeneous beliefs and a Pareto optimal allocation.


Grossman (1976)

The Grossman (1976) paper deals with the price system as an aggregator of diverse information. If private signals are identically distributed, then the price reveals the average of all agents’ information and private information is redundant given the price. The REE is identical to a Walrasian equilibrium in an artificial economy where agents share their information before trading. With complete markets, equilibrium allocations are ex post Pareto efficient. In this model, agents have CARA utility so there are no wealth effects, but in a REE there are information effects. A price change affects the desirability of an asset.

The model specifies informed trader i knows

$$y_i = p_1 + \varepsilon_i.$$

The resulting price is $p_0(y_1, \ldots, y_N)$. Prices reflect each agent’s private information but do not depend on preferences. This results in a paradox: individuals ignore their own information in favor of the aggregated information, but if they do ignore their private information, how does it get into prices? The result that prices perfectly aggregate information is not robust to the addition of noise, but another paradox remains. If markets are “perfect” and information collection is costly, then there is no incentive to collect information. The agents in this model are “schizophrenic” in that their actions affect price but they take price as given in determining their demand.

Grossman & Stiglitz (1980)

Grossman and Stiglitz (1980) say that informationally efficient markets cannot exist. If private information is costly, but has no value, then there is no incentive to collect it. This paper differs from Grossman (1976) in that it is a model of asymmetric information rather than diverse information. It endogenously derives the allocation of pretrading information, whereas most other papers take it as exogenous.

The model is based on perfect competition, one-shot trading, and a Walrasian auctioneer. The return on the risky asset is

u = θ + ε.

An agent can pay c to observe θ. Informed agents receive the same signal and all agents have negative exponential utility with risk aversion parameter a.


Table 6.1: Summary of Key Models

| Paper | MM | Inf. | Uninf. (a) | Noise (b) | Comments |
|---|---|---|---|---|---|
| Milgrom and Stokey (1982) | | MAC | - | No | |
| Grossman (1976) | | MAC | - | No | $p_i = P + \varepsilon_i$ |
| Grossman and Stiglitz (1980) | | MAC | MAC | Yes | $r = \theta + \varepsilon$ |
| Hellwig (1980) | | MAC | - | Yes | diverse info |
| Diamond and Verrecchia (1981) | | MAC | - | Yes | diverse info, noise in endowments |
| Admati (1985) | | MAC | - | Yes | multiple assets |
| Kyle (1989) | | MAU | MAU | Yes | $i_n = v + e_n$ |
| Kyle (1985) | SNC | SNU | - | Yes | dynamic model |
| Admati and Pfleiderer (1988) | SNC | MNC | M | Yes | |
| Admati and Pfleiderer (1989) | MN | MNU | M | Yes | |
| Foster and Viswanathan (1990) | SC | SU | C | Yes | |
| Slezak (1994) | | MAC | MAC | Yes | multi-period generalization of GS |
| Amihud and Mendelson (1980) | SNU | M | - | Yes | spread = cost; is MM C? |
| Glosten and Milgrom (1985) | NC | MN | - | Yes | |
| Glosten (1989) | SNU | MA | - | Yes | all traders have liq. and info. |
| Rock (1989) | SA | MA | MN | | Inf. = mkt. orders, Uninf. = limit |

Codes in Table: S: Single, M: Multiple; A: Risk-averse, N: Risk-neutral; C: Competitive, U: Uncompetitive.
(a) Uninformed traders whose demands depend on price.
(b) Noise generally refers to liquidity traders, whose demands do not depend on price.


The informed and uninformed have demands

$$X_I(\cdot) = \frac{\theta - Rp}{a\sigma_\varepsilon^2} \quad\text{and}\quad X_U(\cdot) = \frac{E[u \mid P(\cdot) = p] - Rp}{a\,\mathrm{var}(u \mid P = p)}.$$

For markets to clear

$$\lambda X_I + (1 - \lambda)X_U = x.$$

In Grossman (1976) there is a paradox since agents ignore their own information, yet prices perfectly aggregate this information. A solution is to introduce noise in the form of liquidity traders. Now prices are not fully revealing, private information still has value, and trading based on common beliefs is possible. With dynamic trading the market maker can break even on average, not on every trade [see Glosten (1989)].

If competition is imperfect, the equilibrium price reveals less information, although the price is determined as if a nontrading auctioneer aggregated demand curves. If a market maker replaces the auctioneer, one needs to ask what services the market maker is providing. Many papers take the position that the market maker is an information processor. The informational component of the spread is proportional to the probability of trading with an informed agent and also proportional to the informed trader’s expected profit from holding the asset. Spreads will be larger for larger quantities.

Hellwig (1980)

Hellwig (1980) attempts to avoid the schizophrenic agents in Grossman (1976) by enlarging the economy. The model takes the limit of the incorrect economy, rather than fixing the problem. In other words, this solution is essentially “at the limit” rather than “in the limit,” leaving open the question of how an economy becomes large in the first place. The paper is still important in that it shows the schizophrenia problem may be small when the economy is large.

The model is basically an extension of Grossman (1976), but with the addition of noise in the supply of the risky asset. The amount of information also grows with the size of the economy [Kyle (1989) holds it fixed]. In a finite-agent economy when the noise is small, the price becomes fully revealing, as in Grossman. Upon enlarging the economy, the prices do not fully reflect the information of the informed agents. Individuals find their own information to be incrementally informative to the price alone. The strength of an agent’s reaction to his signal is inversely related to his risk aversion and the noisiness of his signal.

Diamond & Verrecchia (1981)

The Diamond and Verrecchia (1981) model of a competitive market yields prices which partially aggregate diverse information to form prices which are not fully revealing. Prices deviate from the “efficient” level by a random amount. Noise is explicitly modeled as random endowments in the risky asset. If individual endowments are iid, per capita supply is constant in the limit and the model approaches the Grossman (1976) fully revealing model. If the variance of individual endowments grows with the population, the limit is Hellwig (1980).

Admati (1985)

Admati (1985) is a multisecurity version of Hellwig (1980). Investment decisions are based on MV considerations, but each agent in effect uses a different model since they condition on different information. These conditional models do not naturally aggregate to imply similar unconditional models. Therefore, the market is generally not MV efficient for any particular information set, including all public information. Uncertainty about the supply of one asset may prevent the prices of other assets from being fully revealing. This may represent a solution to the Grossman and Stiglitz (1980) paradox.

The correlations among the assets can result in a number of strange results. Price may be decreasing in the profitability of an asset or increasing in its supply. The predicted payoff of an asset may be decreasing in price. Finally, assets may increase in price with greater demand.

Kyle (1989)

The Kyle (1989) paper solves the schizophrenia problem by allowing imperfect competition. The model uses noise traders, uninformed traders, and multiple informed speculators in a static model. A Walrasian auctioneer accepts limit orders. The informed speculators receive independent, normally distributed noisy signals of the asset value. Traders have negative exponential (CARA) utility.


The value of the asset is given by $v$ with variance $\tau_v^{-1}$. Noise traders have random demands $z$ with variance $\sigma_z^2$. There are $N$ informed agents with information $i_n = v + e_n$ where $\mathrm{var}(e_n) = \tau_e^{-1}$. There is a symmetric linear equilibrium with informed demands $X_n(p, i_n) = \mu_I + \beta i_n - \gamma_I p$ and uninformed demands $X_m(p) = \mu_U - \gamma_U p$.

If all information could be combined the precision of the forecast would be $\tau_F = \mathrm{var}^{-1}(v \mid i_1, \ldots, i_N) = \tau_v + N\tau_e$. The precisions for the informed and uninformed are $\tau_I = \mathrm{var}^{-1}(v \mid p, i_n) = \tau_v + \tau_e + \psi_I(N - 1)\tau_e$ and $\tau_U = \mathrm{var}^{-1}(v \mid p) = \tau_v + \psi_U N\tau_e$. The terms $\psi_I$ and $\psi_U$ represent the fraction of information available to the type of agent. When $\psi = 0$ prices are uninformative and when $\psi = 1$ prices are fully revealing. Expressions for these terms are

$$\psi_U = \frac{N\beta^2}{N\beta^2 + \sigma_z^2\tau_e} \quad\text{and}\quad \psi_I = \frac{(N - 1)\beta^2}{(N - 1)\beta^2 + \sigma_z^2\tau_e}.$$
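As a quick numerical check on these expressions, the sketch below evaluates the informativeness fractions and the three precisions for given parameter values. It treats β as an input rather than solving for it (in the model β is determined in equilibrium), takes the full-information precision to be the prior precision plus N independent signals, $\tau_v + N\tau_e$, and uses hypothetical function and argument names.

```python
def kyle1989_precisions(N, beta, sigma_z2, tau_v, tau_e):
    """Informativeness fractions and forecast precisions.

    N        : number of informed speculators (N >= 2)
    beta     : informed traders' response to their signal (taken as given here)
    sigma_z2 : variance of noise-trader demand
    tau_v    : prior precision of the asset value
    tau_e    : precision of each private signal
    """
    psi_U = N * beta**2 / (N * beta**2 + sigma_z2 * tau_e)
    psi_I = (N - 1) * beta**2 / ((N - 1) * beta**2 + sigma_z2 * tau_e)
    return {
        "psi_U": psi_U,
        "psi_I": psi_I,
        "tau_F": tau_v + N * tau_e,                      # all signals pooled
        "tau_I": tau_v + tau_e + psi_I * (N - 1) * tau_e,  # own signal plus price
        "tau_U": tau_v + psi_U * N * tau_e,              # price only
    }


# Illustrative (made-up) parameters.
example = kyle1989_precisions(N=5, beta=1.0, sigma_z2=1.0, tau_v=1.0, tau_e=2.0)
no_noise = kyle1989_precisions(N=5, beta=1.0, sigma_z2=0.0, tau_v=1.0, tau_e=2.0)
print(example)
print(no_noise)
```

Without noise trading both fractions equal one and every agent attains the full-information precision; with noise the ordering is τ_U ≤ τ_I ≤ τ_F in this example.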

The results of the model are prices that are less revealing than in the perfect competition case. The uninformed break even on average and the informed profit at the expense of the noise traders. An increase in the number of uninformed ($M$) or a decrease in their risk aversion $\rho_U$ increases the information effect. As $M \to \infty$, $E[v \mid p] = p$, a martingale result. An increase in the number of informed or a decrease in per capita noise trading increases $\psi_I$.

In the limiting economy as $N \to \infty$, $\tau_e = \tau_E/N$ where $\tau_E$ is fixed. The uninformed do not trade (??). As the informed become risk-neutral, prices become fully revealing in the competitive case, but only half as much in the imperfect competition case. Endogenizing information acquisition overcomes the schizophrenia problem.

This equilibrium is different from the competitive outcome since informed traders now take into account the effect of their actions on the market price. In this case, traders must know the pricing function, the number of other traders, and all other agents’ demand schedules. By accounting for their impact on price, traders no longer completely trade away their informational advantage.

6.4 Batch Models

This section begins with the analysis of market orders in the Kyle (1985) model. A market maker observes the net order flow and sets a single price at which all orders are cleared. Without price-contingent orders, it is not possible to explore the bid-ask spread or transaction prices. This framework does allow analysis of the effect informed traders’ strategies have on prices. Kyle was the first to develop a model of this nature. Price-contingent orders are taken up in Kyle (1989). The strategic actions of uninformed agents are covered in models such as Admati and Pfleiderer (1988), Admati and Pfleiderer (1989), and Foster and Viswanathan (1990).

Kyle (1985)

The classic Kyle (1985) model uses a single risk-neutral informed trader, a group of noise traders, and a single risk-neutral market maker. The model is dynamic, allowing an analysis of trading strategies over time.

The model is presented first in a single period setting. The random future asset value is $v$, which only the informed trader can observe. The market maker does not explicitly know $v$, but knows $v \sim N(p_0, \Sigma_0)$. The uninformed traders provide noise in the aggregate order flow $(x + u)$, thereby preventing the market maker from perfectly inferring $v$. These noise traders submit orders for $u \sim N(0, \sigma_u^2)$.

The informed trader, who has rational expectations, knows the pricing function and the distribution of noise trades. He chooses an order quantity to maximize his expected profits

$$X(v) = \arg\max E[\Pi(X(\cdot), P(\cdot)) \mid v].$$

The informed trader does not know the price at which his order will be filled. The equilibrium (see footnote 2) is the pair $P(\cdot)$ and $X(\cdot)$. The market maker sets price equal to the expected value of $v$ conditional on observing $x + u$,

$$P(x + u) = E[v \mid x + u].$$

The equilibrium is

$$X(v) = \beta(v - p_0)$$

$$P(x + u) = p_0 + \lambda(x + u)$$

where $\beta = \sqrt{\sigma_u^2/\Sigma_0}$ and $\lambda = \frac{1}{2}\sqrt{\Sigma_0/\sigma_u^2}$. The market maker can use his knowledge of $X(\cdot)$ to observe a random variable $\sim N(v, \sigma_u^2/\beta^2)$. Using Bayes rule, his posterior on $v$ is $N(p_0 + \lambda(x + u), \Sigma_0/2)$.

Footnote 2: This setup is not game-theoretic, but can be made so by including additional market makers with identical information, or by giving the market maker an objective function. The equilibrium then is such that each player’s strategy is a best response given his information at each stage in the game.

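To derive the equilibrium, suppose that $P$ and $X$ are linear functions with coefficients $\mu$, $\lambda$, $\alpha$, and $\beta$, where $y = x + u$ is the aggregate order flow:

$$P(y) = \mu + \lambda y \quad\text{and}\quad X(v) = \alpha + \beta v.$$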

The expected profit for the informed agent given his signal is

$$E[\Pi \mid \tilde v = v] = E\big[[\tilde v - P(x + u)]\,x \mid \tilde v = v\big] = (v - \mu - \lambda x)x.$$

Profit maximization gives the FOC $v - \mu - 2\lambda x = 0$, or $X(v) = \alpha + \beta v$ with $\alpha = -\mu\beta$ and $\beta = 1/(2\lambda)$. The market efficiency condition is

$$\mu + \lambda y = E[v \mid \alpha + \beta v + u = y].$$

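Normality makes the regression linear. Applying the projection theorem gives

$$\lambda = \frac{\mathrm{cov}(v, y)}{\mathrm{var}(y)} = \frac{\beta\Sigma_0}{\beta^2\Sigma_0 + \sigma_u^2}$$

and

$$\mu - p_0 = -\lambda(\alpha + \beta p_0).$$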

Solving, we get $\mu = p_0$ and $\alpha = -\beta p_0$.

Several characterizations can be made. The unconditional expected profit to the informed is

$$E[\Pi] = E\big[E[\Pi \mid v]\big] = E[(v - p_0 - \lambda x)x] = E[\beta(1 - \lambda\beta)(v - p_0)^2] = \tfrac{1}{2}\beta\Sigma_0 = \tfrac{1}{2}(\Sigma_0\sigma_u^2)^{1/2}.$$

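The variance of the value conditional on the price is

$$\begin{aligned}
\mathrm{var}(v \mid p) &= \mathrm{var}(v - p_0 - \lambda(\alpha + \beta v + u)) = E\big[[v - p_0 - \lambda(\alpha + \beta v + u)]^2\big] \\
&= E\big[[(v - p_0)(1 - \lambda\beta) - \lambda u]^2\big] \\
&= E\big[(v - p_0)^2(1 - \lambda\beta)^2 + \lambda^2 u^2\big] \\
&= \Sigma_0/4 + \lambda^2\sigma_u^2 = \tfrac{1}{2}\Sigma_0,
\end{aligned}$$

where the second line uses $\alpha = -\beta p_0$ and the cross term drops out because $u$ has mean zero and is independent of $v$.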
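The single period results can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names and the parameter values ($p_0 = 10$, $\Sigma_0 = 4$, $\sigma_u^2 = 1$) are illustrative assumptions. It evaluates the closed-form coefficients and confirms that the projection-theorem slope, the FOC relation $\beta = 1/(2\lambda)$, the expected profit, and the conditional variance agree with the formulas above.

```python
import math


def kyle_single_period(p0, Sigma0, sigma_u2):
    """Closed-form coefficients of the linear equilibrium X(v), P(y)."""
    beta = math.sqrt(sigma_u2 / Sigma0)        # slope of X(v) = alpha + beta * v
    lam = 0.5 * math.sqrt(Sigma0 / sigma_u2)   # slope of P(y) = mu + lam * y
    return {
        "beta": beta,
        "lam": lam,
        "alpha": -beta * p0,
        "mu": p0,
        # E[Pi] = (1/2) * beta * Sigma0
        "expected_profit": 0.5 * beta * Sigma0,
        # var(v | p) = (1 - lam*beta)^2 * Sigma0 + lam^2 * sigma_u^2
        "posterior_var": (1 - lam * beta) ** 2 * Sigma0 + lam ** 2 * sigma_u2,
    }


def projection_slope(beta, Sigma0, sigma_u2):
    """cov(v, y) / var(y) for order flow y = alpha + beta * v + u."""
    return beta * Sigma0 / (beta ** 2 * Sigma0 + sigma_u2)


# Illustrative parameter values (assumed, not from the notes).
eq = kyle_single_period(p0=10.0, Sigma0=4.0, sigma_u2=1.0)
print(eq)
```

With these inputs the market maker's slope from the projection theorem reproduces $\lambda$, and the conditional variance equals $\Sigma_0/2$, so half of the private information is revealed by the price.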


Note that the noise traders have an expected loss, which can be justified with liquidity trading arguments. The noise traders' loss is the informed trader's gain. The market maker expects to break even on average by balancing his loss to the informed with the gain from trading with the uninformed.

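In the discrete time sequential auction there is a unique linear equilibrium. There are constants $\beta_n$, $\lambda_n$, $\alpha_n$, $\delta_n$, and $\Sigma_n$ such that

$$\Delta x_n = \beta_n (v - p_{n-1})\,\Delta t_n$$

$$\Delta p_n = \lambda_n(\Delta x_n + \Delta u_n)$$

$$\Sigma_n = \mathrm{var}(v \mid \Delta x_1 + \Delta u_1, \ldots, \Delta x_n + \Delta u_n)$$

$$E[\pi_n \mid p_1, \ldots, p_{n-1}, v] = \alpha_{n-1}(v - p_{n-1})^2 + \delta_{n-1}$$

Given $\Sigma_0$, the constants are a unique solution to a difference equation system

$$\alpha_{n-1} = \frac{1}{4\lambda_n(1 - \alpha_n\lambda_n)}$$

$$\delta_{n-1} = \delta_n + \alpha_n\lambda_n^2\sigma_u^2\,\Delta t_n$$

$$\beta_n\,\Delta t_n = \frac{1 - 2\alpha_n\lambda_n}{2\lambda_n(1 - \alpha_n\lambda_n)}$$

$$\lambda_n = \beta_n\Sigma_n/\sigma_u^2$$

$$\Sigma_n = (1 - \beta_n\lambda_n\,\Delta t_n)\,\Sigma_{n-1}$$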

The derivation of the above results follows three steps. First, solve for the informed agent's trading strategy as a function of the price function. Second, find the price function that is consistent with market efficiency given optimal trades. Finally, show the difference equation system implied by the first two steps has a solution.
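The difference equation system can be solved numerically by backward recursion. The sketch below is illustrative only and rests on assumptions that are not stated in these notes: equally spaced auctions with $\Delta t_n = 1/N$, and the terminal conditions $\alpha_N = \delta_N = 0$ together with the second-order condition $\lambda_n(1 - \alpha_n\lambda_n) > 0$, as recalled from Kyle (1985). Substituting $\lambda_n = \beta_n\Sigma_n/\sigma_u^2$ into the equation for $\beta_n\Delta t_n$ gives a cubic in $\lambda_n$, which is solved by bisection on the interval where $\beta_n > 0$. Because the system is homogeneous, $\Sigma_0$ is proportional to the guessed $\Sigma_N$, so one pass with $\Sigma_N = 1$ is enough to find the right terminal value. All function names are hypothetical.

```python
def solve_lambda(alpha, k):
    """Root of 2*k*alpha*l**3 - 2*k*l**2 - 2*alpha*l + 1 = 0 on (0, 1/(2*alpha)).

    k = sigma_u^2 * dt / Sigma_n. On this interval beta_n > 0 and the
    second-order condition lambda_n * (1 - alpha_n * lambda_n) > 0 holds.
    """
    if alpha == 0.0:
        return (1.0 / (2.0 * k)) ** 0.5

    def f(l):
        return 2 * k * alpha * l ** 3 - 2 * k * l ** 2 - 2 * alpha * l + 1

    lo, hi = 0.0, 1.0 / (2.0 * alpha)  # f(lo) > 0 > f(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def backward_pass(N, sigma_u2, Sigma_N):
    """Run the recursion from n = N down to n = 1 (assumes dt = 1/N)."""
    dt = 1.0 / N
    alpha, delta, Sigma = 0.0, 0.0, Sigma_N  # alpha_N = delta_N = 0 (assumed)
    path = []
    for n in range(N, 0, -1):
        lam = solve_lambda(alpha, sigma_u2 * dt / Sigma)
        beta = (1 - 2 * alpha * lam) / (2 * lam * (1 - alpha * lam) * dt)
        path.append({"n": n, "lam": lam, "beta": beta, "Sigma": Sigma})
        Sigma_prev = Sigma / (1 - beta * lam * dt)
        delta = delta + alpha * lam ** 2 * sigma_u2 * dt  # uses alpha_n
        alpha = 1.0 / (4 * lam * (1 - alpha * lam))       # alpha_{n-1}
        Sigma = Sigma_prev
    return Sigma, alpha, delta, path[::-1]


def solve_kyle(N, Sigma0, sigma_u2):
    """Rescale the terminal variance so the recursion ends at the given Sigma0."""
    Sigma0_unit, _, _, _ = backward_pass(N, sigma_u2, 1.0)
    return backward_pass(N, sigma_u2, Sigma0 / Sigma0_unit)


Sigma0, alpha0, delta0, path = solve_kyle(N=10, Sigma0=2.0, sigma_u2=0.5)
for row in path:
    print(row)
```

With $N = 1$ the recursion collapses to the single period solution, which gives a quick sanity check on the sketch.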


In a continuous time setting, cumulative noise trading $u(t)$ follows a Brownian motion. Therefore, the uninformed quantity is independent through time. Since this independence will not be true for the informed trader, there is a linkage between quantity and information that causes prices to (eventually) reflect all information. The informed trader need not trade the same amount every period. He changes his trade size to try to “hide” from the market maker.

The prices have a constant volatility as information is gradually incorporated into prices at a constant rate. Prices follow a martingale (and a random walk), so they are efficient in the semi-strong sense. The informed trader profits more by continuously trading than by using a mixed strategy attempting to manipulate prices. You could not tell that there is an informed trader by looking at prices alone. The continuous time setting makes it possible to spread information quickly without removing the incentives to acquire information [Grossman and Stiglitz (1980)].

The speed with which the informed trader pushes prices to the true value measures resiliency. This speed is the difference between his private information and the current price, divided by the remaining trading time. The depth of the market, constant over time, is proportional to the amount of noise trading and is inversely proportional to the amount of private information. The market is infinitely tight in continuous time.

There are many extensions to the model. You could let the market maker know more about the distribution of orders than the market as a whole. This drastically reduces the informed trader's ability to make profits, and prices reflect information much more quickly. Another extension allows multiple informed traders. Foster and Viswanathan (1990) is an example of this, where the normality assumption is relaxed to elliptical distributions. The result is that competition among the informed forces prices to their full-information levels almost immediately, eliminating the smoothing behavior. Their work also shows that the Kyle results may be sensitive to the normality assumption. Kyle (1989) uses a more complex trading mechanism (limit orders) in a single period setting to overcome this problem.

6.4.1 Strategic Uninformed Traders

Strategic uninformed trading may allow these agents to reduce their trading losses. This may create price effects by the uninformed traders, as they attempt to “hide” from the informed traders. Admati and Pfleiderer (1988) and Admati and Pfleiderer (1989) examine intraday timing decisions of the uninformed. Foster and Viswanathan (1990) focus on interday effects as the levels of public and private information vary across days.

Admati & Pfleiderer (1988)

There is empirical evidence of U-shaped patterns in intraday volume and volatility. In Admati and Pfleiderer (1988) the risk-neutral[3] informed traders get their information one period before it becomes public knowledge. The informed then just decide the optimal order size in each period. The uninformed discretionary traders cannot split trades, but they do decide when to trade. There are also nondiscretionary traders providing noise in the model. The competitive informed traders do not consider the price consequences of their actions. Uninformed traders end up clustering their trades. This clustering can improve the liquidity of the market and reduce their losses to the informed. The informed traders recognize the clumping of uninformed trades and will also trade during these periods, intensifying the clustering.

The concentration of discretionary liquidity traders does not affect the amount of information revealed by prices or the variance of price changes if the number of informed traders is fixed. This is because there is an increase in informed trading just enough to keep the informational content the same. Endogenizing information acquisition intensifies the concentration of trading and prices become more informative. The liquidity traders are better off with no informed traders, but if there are any the cost of trading decreases with the number of informed.

A critical assumption is the independence of trade between periods. Subsequent prices will not reflect previous order flow. If the uninformed are allowed to split their trades, an equilibrium may not exist, and if it does it may not be unique. The results are also sensitive to the assumptions about the risk preferences of the informed traders. If they are risk-averse, then it may not be the case that periods with more informed traders result in better prices for the uninformed. Thus, the clumping may not hold if traders are risk-averse. If uninformed trade flows become more informative over time, uninformed traders will be more likely to trade early.

[3] The results do not change with risk-averse liquidity traders.

Admati & Pfleiderer (1989)

Admati and Pfleiderer (1989) examine patterns in mean returns in a framework where market makers reduce the adverse selection problem by inducing patterns in volume and price. By changing the bid or ask commission, the market maker can change the expected number of liquidity sellers and buyers. The market maker's expected loss to the informed decreases with the commission, but so do his expected profits on the discretionary liquidity traders. The market maker processes trades in a manner combining some features of batch and sequential trading. Traders do know the prices at which they will transact, but prices are updated after every period in time, not after every trade.

Equilibrium trading results in all discretionary buying occurring in a single period, and similarly for selling. This is because the liquidity trading reduces the adverse selection problem. The paper uses a market where traders can only buy on even days and sell on odd days as an example.

Foster & Viswanathan (1990)

In Foster and Viswanathan (1990) an interday pattern in trading arises because the informational advantage of the informed decreases over time as the uninformed infer information from the price and the market maker from the order flow. The informed trader will be at the greatest advantage when the market first opens, such as in the morning or on Monday.

The model is an extension of Kyle (1985). There is only one informed trader and the uninformed act competitively. The ability of the uninformed to choose when to trade creates the temporal pattern. The sensitivity of the price to the order flow increases with the amount of information released by the informed and falls with the amount of liquidity trading. The informed trades more when there are more liquidity traders to hide his trade. Consequently, he has a higher profit when there is more liquidity trading, or when he releases more private information. The release of private information will be smooth throughout the day.

Slezak (1994)

Slezak (1994) develops a multiperiod generalization of Grossman and Stiglitz (1980) that produces patterns in both the mean and variance of returns without relying on irrationality, bubbles, or strategic liquidity trading. These patterns arise because of the effect market closures have on the information structure in the economy.


The model uses risk-averse agents. Market closures alter investor uncertainty by changing the timing of resolution of uncertainty and by reducing the informed agent's comparative advantage at risk bearing. Closures affect the variance of returns by altering the informativeness of the price. Post closure prices reflect a greater proportion of private news on the reopening day, but less private news accumulated over the closure. Preclosure prices are relatively less informative as well. Post closure liquidity costs are higher since increased adverse selection causes the uninformed to provide less liquidity.

6.5 Sequential Trade Models

Sequential trade models allow for the analysis of bid-ask spreads and the details of the price process. The main underlying idea is that an informed trader will prefer to buy when the price is low and sell when it is high. The market maker will lose money on him if there is a single price. By introducing a bid-ask spread the market maker can offset the losses to the informed with gains from the uninformed.

6.5.1 Specialists and Dealers

Amihud & Mendelson (1980)

In Amihud and Mendelson (1980) the (risk-averse ??) market maker maximizes expected profits by changing the bid and ask. This can give rise to an asymmetric bid-ask spread as the market maker manages his inventory. This contrasts with Admati and Pfleiderer (1989) where an asymmetric spread results from information effects.

In the model the market maker is a monopolist who sets bid and ask prices ($p_b$ and $p_a$) to maximize expected profits. The quotes are good for a single transaction. The arrival of buy and sell orders is Poisson with rates $D(p_a)$ and $S(p_b)$ where $D' < 0$, $S' > 0$. The market maker dislikes extreme inventory positions because they force him to take transactions under unfavorable conditions. To stay in his desired inventory range the market maker adjusts bid and ask prices to manage his inventory.

Glosten & Milgrom (1985)

Glosten and Milgrom (1985) model the market maker's pricing decision in an environment where he learns from previous trades. In a competitive market, informed traders' orders will reflect their information. A sell order will lower the market maker's expectation, while a buy order will raise it. The competitive market maker sets the bid and ask such that his expected profit on any trade is zero. The bid reflects the expected value of an asset conditional on a sell order arriving, and the ask the expected value conditional on a buy order arriving. Bayes' rule updates the conditional probabilities as trades occur. Since the distribution of trades differs depending on the true state, the market maker will eventually learn the informed trader's information. Many of the results stem from the fact that the informed can only trade a single unit at a point in time.
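The zero-profit quotes and the Bayesian updating can be illustrated with a two-state example. This is a minimal sketch under assumptions that go beyond the notes: the asset is worth either `v_low` or `v_high`, a fraction `informed_share` of arriving traders knows the true value and trades in its direction, and the remaining traders buy or sell with equal probability. The function names are hypothetical.

```python
def gm_quotes(v_low, v_high, prob_high, informed_share):
    """Zero-profit bid and ask in a two-state example.

    Returns (bid, ask, posterior_after_sell, posterior_after_buy), where the
    posteriors are the updated probabilities that the value is v_high.
    """
    noise = (1.0 - informed_share) / 2.0
    buy_if_high = informed_share + noise   # informed buy when the value is high
    buy_if_low = noise
    sell_if_high = noise
    sell_if_low = informed_share + noise   # informed sell when the value is low

    p_buy = prob_high * buy_if_high + (1.0 - prob_high) * buy_if_low
    p_sell = prob_high * sell_if_high + (1.0 - prob_high) * sell_if_low

    post_buy = prob_high * buy_if_high / p_buy      # Bayes' rule after a buy
    post_sell = prob_high * sell_if_high / p_sell   # Bayes' rule after a sell

    ask = post_buy * v_high + (1.0 - post_buy) * v_low     # E[v | buy]
    bid = post_sell * v_high + (1.0 - post_sell) * v_low   # E[v | sell]
    return bid, ask, post_sell, post_buy


def learn_from_trades(prob_high, informed_share, trades):
    """Update the probability of the high state through a sequence of trades."""
    for side in trades:
        _, _, post_sell, post_buy = gm_quotes(0.0, 1.0, prob_high, informed_share)
        prob_high = post_buy if side == "buy" else post_sell
    return prob_high


bid, ask, _, _ = gm_quotes(v_low=0.0, v_high=1.0, prob_high=0.5, informed_share=0.5)
print(bid, ask)  # 0.25 0.75
print(learn_from_trades(0.5, 0.5, ["buy", "buy", "buy"]))
```

In this example the spread is zero when there are no informed traders and widens with their share, and a run of buy orders pushes the market maker's belief toward the high state.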

Prices follow a martingale with respect to the specialist's and public information; price changes will be serially uncorrelated. Spreads due to adverse selection are different from spreads arising from transactions costs, risk aversion, or monopoly power. These other sources of spreads will lead to negative serial correlation. The spread can be expressed as $\Psi + 2c$, where $\Psi$ is the adverse selection cost and $c$ is the cost of transacting. The covariance of adjacent price changes is $-\frac{1}{2}\Psi c - c^2$. This is similar to Roll (1984), with the addition of the adverse selection component. The variance of a price change is $\theta^2 + (\Psi/2)^2 + c\Psi + 2c^2$, where $\theta^2$ is the variance of public information arriving exogenously between trades.
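The two moment formulas can be reproduced by simulation. The sketch below assumes one simple price process that delivers them; it is an illustration rather than the model in the paper. The trade direction $q_t = \pm 1$ is i.i.d., the market maker's expectation moves by $(\Psi/2)q_t$ plus a public news shock with standard deviation $\theta$, and the transaction price adds the cost $c\,q_t$. Function names and parameter values are hypothetical.

```python
import random


def theoretical_moments(psi, c, theta):
    """Autocovariance and variance of price changes from the formulas above."""
    autocov = -0.5 * psi * c - c ** 2
    variance = theta ** 2 + (psi / 2.0) ** 2 + c * psi + 2.0 * c ** 2
    return autocov, variance


def simulate_price_changes(psi, c, theta, n, seed=0):
    """Simulate n transaction price changes under the assumed process."""
    rng = random.Random(seed)
    expectation = 0.0
    prev_price = None
    changes = []
    for _ in range(n + 1):
        q = rng.choice((-1, 1))                      # buy (+1) or sell (-1)
        expectation += 0.5 * psi * q + theta * rng.gauss(0.0, 1.0)
        price = expectation + c * q                  # transaction price
        if prev_price is not None:
            changes.append(price - prev_price)
        prev_price = price
    return changes


def sample_moments(x):
    """Sample first-order autocovariance and variance of a series."""
    n = len(x)
    mean = sum(x) / n
    variance = sum((a - mean) ** 2 for a in x) / n
    autocov = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1)) / (n - 1)
    return autocov, variance


changes = simulate_price_changes(psi=0.4, c=0.1, theta=0.1, n=100000, seed=1)
print(sample_moments(changes))
print(theoretical_moments(psi=0.4, c=0.1, theta=0.1))
```

Setting $\Psi = 0$ gives an autocovariance of $-c^2$, which is the Roll (1984) case with half-spread $c$.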

The bid-ask spread reflects the informational asymmetry. With large volume the spread will be small. As the market maker learns the insider's information the valuations of the informed trader and the market maker converge. The spread will increase when the informed traders' information is better, insiders become relatively more numerous, or the elasticity of the supply and demand of uninformed traders increases. If the adverse selection problem is too large, then the market may collapse as in Akerlof (1970). If the market closes for this reason the problem only gets worse. Once a market closes it will stay closed until the information asymmetry is reduced.

Glosten (1989)

When investors trade on private information it can lead to suboptimal risk sharing if the market maker reduces the liquidity of the market. Glosten (1989) looks at whether the monopoly power of the specialist can preserve market liquidity and avoid market failure.

In Glosten and Milgrom (1985) the market maker is competitive. He sets the price to have a zero expected profit on every trade. When information asymmetries are large the market may fail completely. Furthermore, market closure generally makes the information asymmetry worse. With monopoly power[4] the market maker can set prices to average profits across trades. He will lose on trades with the informed, but compensate with trades to the uninformed. The result is increased liquidity. The market maker is risk-neutral so there are no inventory costs associated with risk bearing. The model also ignores any dynamic trading.

Rock (1989)

Rock (1989) examines the interaction of the specialist's order book and prices. Risk-neutral uninformed traders submit limit orders. A risk-averse market maker competes with the orders in the book. The market maker has two advantages. First, he knows the size of the trade. Second, he moves second so he can get out of the way of big trades and fill them from the book, creating an adverse selection problem. The book orders tend to only get the unprofitable trades.

Limit orders provide liquidity to the market. These orders have an option component to them. The order is an obligation to buy or sell at the specified price. Since the order submitters are writing the option, they receive an option premium in the form of reduced transactions costs. These investors avoid the adverse selection component of the bid-ask spread by standing ready to transact ahead of time.

The assumptions about risk preferences are important in this model. If the specialist were risk-neutral he would not need to bother with the order book. It is the risk neutrality of the limit order submitters that gives them a comparative advantage at risk bearing. If the limit order submitters were risk-averse they would only submit orders in response to inventory positions, etc. The risk aversion of the specialist will cause him to take transactions at prices that may differ from underlying value.

[4] The market maker need not literally be a monopolist. He has superior information about the trading process from his order book, but may face competition from limit orders and other floor traders. Limit orders allow traders to provide liquidity to the market and compete with the market maker. What is important is his ability to average profits across trades.


6.5.2 Other Topics

Trading Volume

Volume of trade generally increases with the precision of private information. Equilibrium beliefs are not always more homogeneous if information is more precise.

Sale of Information

People with private information can profit from it by selling it or by trading on it themselves. The more selling they do, the less valuable the information is in their trading. The information seller can add noise to the information (either the same noise for each purchaser or unique noises). Selling can also be done indirectly, as in a mutual fund.

Regulation

The adverse selection component of trading costs is like a tax on noise traders that subsidizes the acquisition of private information and its release through the price system. Regulators can attempt to influence the liquidity of markets and the informativeness of prices. Attempts to reduce noise trading on the grounds that it destabilizes prices may not work. It is noise trading that attracts informed traders to the market in the first place. Reducing noise trading may actually reduce the informativeness of prices.

6.6 Special Topics

6.6.1 Bubbles

Bubbles deal with deviations from fundamental value. Shiller (1981) is one of the classic papers in this area. Refer to Section 2.8.6 for more information. Tirole () has a no trade result in a dynamic context where trade does not occur because it would burst a bubble.

Blanchard & Watson (1982)

Blanchard and Watson (1982) argue that rational bubbles are possible even in efficient markets. The market price can be expressed as the fundamental value plus a bubble

$$p_t = p_t^* + c_t$$

where $E[c_t \mid \Omega_{t-1}] = (1 + r)\,c_{t-1}$.

A deterministic bubble is given by $c_t = c_0(1 + r)^t$. The bubble grows with time so that it eventually dominates the fundamental value portion of the price. Since the growth must continue forever for the price to be rational, this type of bubble is implausible. A stochastic bubble is created by adding a random shock to a deterministic bubble.

A stochastic crash takes a value of $c_t = \mu_t + c_{t-1}(1 + r)/\pi$ with probability $\pi$ and $c_t = \mu_t$ otherwise. In this case $E[\mu_t \mid \Omega_{t-1}] = 0$. This produces a situation where the bubble will persist with probability $\pi$ or crash. While the bubble persists it grows at the rate $(1 + r)/\pi - 1$, which is greater than $r$, to compensate for the risk of a crash.
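A short simulation makes the stochastic crash concrete. This is an illustrative sketch: the function names, the normal distribution for the shock $\mu_t$, and the parameter values are assumptions made for the example. The helper `conditional_mean` confirms that the process satisfies $E[c_t \mid \Omega_{t-1}] = (1 + r)c_{t-1}$ even though the growth rate while the bubble survives is $(1 + r)/\pi - 1 > r$.

```python
import random


def next_bubble(c_prev, r, pi, shock, survives):
    """One step of the bubble: grow at (1 + r)/pi if it survives, else crash."""
    if survives:
        return shock + c_prev * (1.0 + r) / pi
    return shock


def conditional_mean(c_prev, r, pi):
    """E[c_t | c_{t-1}] when the shock has conditional mean zero."""
    return pi * (c_prev * (1.0 + r) / pi) + (1.0 - pi) * 0.0


def simulate_bubble(c0, r, pi, shock_sd, periods, seed=0):
    """Simulate a bubble path; shocks are drawn as N(0, shock_sd^2) (assumed)."""
    rng = random.Random(seed)
    path = [c0]
    for _ in range(periods):
        survives = rng.random() < pi
        shock = rng.gauss(0.0, shock_sd)
        path.append(next_bubble(path[-1], r, pi, shock, survives))
    return path


print(conditional_mean(c_prev=2.0, r=0.05, pi=0.9))
print(simulate_bubble(c0=1.0, r=0.05, pi=0.9, shock_sd=0.0, periods=20, seed=1))
```

With `shock_sd=0.0` the path either grows by the factor $(1 + r)/\pi$ each period or drops to zero and stays there, which shows why a random shock is needed for the bubble to restart after a crash.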

Arbitrage does not eliminate these bubbles. Since the bubble grows in any of the above cases, as the time horizon becomes infinite the bubble will be infinitely large. Since some assets, such as bonds, have finite lives the bubble must be zero at their maturity. Therefore bubbles are ruled out for these securities. The structure above also rules out negative bubbles since they imply negative security prices with a positive probability.

Empirically detecting bubbles is challenging. To use the price process to say something interesting about bubbles requires an understanding of the fundamental value process, including the information sets available. Tests for bubbles can be divided into variance bounds and patterns in innovations. The variance bounds tests, such as Shiller (1981), put upper bounds on the conditional or unconditional variances of prices relative to the variance of dividends. The innovation patterns tests look for either runs in shocks or extreme outliers.

6.6.2 Speculation

Hart & Kreps (1986)

Hart and Kreps (1986) show that, contrary to common belief, speculation can destabilize prices. Speculators buy when the chances of price appreciation are high, which is not necessarily when prices are actually low.


6.6.3 Noise

DeLong, Shleifer, Summers, & Waldmann (1990)

DeLong, Shleifer, Summers, and Waldmann (1990) develop an overlapping generations model with irrational noise traders. The rational investors do not fully exploit the irrational investors. Their short-run focus prevents them from completely wiping out the irrational investors.

“Noise trader risk” is the chance that marketwide irrational beliefs of the noise traders may become even more irrational before reverting to their mean. Essentially the noise trader beliefs are slowly mean reverting. If an arbitrageur has a limited investment horizon there is a chance that the prices will not return to their true value before he has to close out his position. In fact, if the beliefs become more irrational the arbitrageur may face a loss.

There are several plausible predictions from the model. Prices are more volatile with noise trading. If the noise traders' opinions are stationary there will be a mean-reverting component to stock returns. Assets may be underpriced relative to fundamental value, consistent with the equity premium puzzle.

6.6.4 Cascades

Bikhchandani, Hirshleifer, & Welch (1992)

Bikhchandani, Hirshleifer, and Welch (1992) generalize the idea of IPO cascades in Welch (1992). For the details of cascades refer to the discussion of the original paper in Section 5.10.1. An information cascade describes a sequence of decisions where individuals ignore their own private information in favor of information inferred from the observation of others' decisions. Cascades can be reversed by the release of new information.

Chapter 7

International Finance

7.1 Introduction

What distinguishes international finance from traditional finance is the addition of foreign exchange rate assets, both spot and forward. There are several measures of returns in international finance. The return from currency speculation by buying forward and selling spot is $(f_t - s_{t+1})/s_t$. Mean returns are generally close to zero. Depreciation is defined as $(s_{t+1} - s_t)/s_t$. The forward premium is $(f_t - s_t)/s_t$.

There are several basic concepts that are important in international finance. Covered Interest Rate Parity dictates that

$$\exp(r_t^i) = \exp(r_t^j)\,\frac{F_t^{ij}}{S_t^{ij}} \quad\text{or}\quad r_t^i - r_t^j = f_t^{ij} - s_t^{ij}$$

to prevent arbitrage. Uncovered Interest Rate Parity states that

$$\exp(r_t^i) = \exp(r_t^j)\,\frac{E[S_{t+1}^{ij}]}{S_t^{ij}} \quad\text{or}\quad r_t^i - r_t^j = E[s_{t+1}^{ij}] - s_t^{ij}$$

In terms of notation, the $ij$ superscript indicates the price of a unit of currency $j$ in terms of currency $i$. Purchasing Power Parity (PPP) is another no arbitrage condition that says the prices of a good in different countries must be the same after converting currencies.
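Covered interest parity pins down the forward rate once the spot rate and the two interest rates are known. The following minimal sketch uses hypothetical function names and made-up rates with continuous compounding, as in the formulas above. It computes the parity forward rate and checks that the round trip (borrow currency $i$, convert at spot, invest in currency $j$, sell the proceeds forward) earns zero profit.

```python
import math


def cip_forward(spot, r_i, r_j):
    """Forward price of currency j in units of currency i implied by CIP."""
    return spot * math.exp(r_i - r_j)


def forward_premium(spot, forward):
    """Forward premium (f - s) / s in levels."""
    return (forward - spot) / spot


def covered_arbitrage_profit(spot, forward, r_i, r_j):
    """Profit in currency i, per unit borrowed, from the covered round trip.

    Borrow one unit of i, buy 1/spot units of j, invest at r_j, sell the
    proceeds forward, and repay exp(r_i).
    """
    return (1.0 / spot) * math.exp(r_j) * forward - math.exp(r_i)


# Illustrative numbers (assumed): spot 1.25, r_i = 5%, r_j = 2%.
F = cip_forward(spot=1.25, r_i=0.05, r_j=0.02)
print(F, forward_premium(1.25, F))
print(covered_arbitrage_profit(1.25, F, 0.05, 0.02))
```

Any quoted forward rate above the parity value makes the round trip profitable, and any rate below it makes the reverse trade profitable, which is the sense in which the condition prevents arbitrage.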

Floating rates began in 1973. The period shortly thereafter is known as the “dirty float” period.


There are two puzzles in international finance. The first is the deviation of the forward rate from the expected future spot rate. This captures the difference between covered and uncovered interest parity. The second puzzle is the home country bias: too little investment in foreign assets.

7.2 Spot Currency Pricing

Lucas (1982)

The Lucas (1982) model extends Lucas (1978) to international asset pricing. In the model there are two infinitely-lived countries with identical agents. There are two non-storable goods, no production, stochastic endowment shocks, and monetary instability. The model is developed first in a barter economy, then in a world with a single currency, and finally with national currencies and flexible exchange rates. Country 0 produces good $X$ in amounts $\xi_t$. Similarly, country 1 produces good $Y$ in amounts $\eta_t$. Denote the price of $Y$, in units of $X$, in state $s$ as $p_Y(s)$. The prices of the future streams $\{\xi_t\}$ and $\{\eta_t\}$ are given by $q_X(s)$ and $q_Y(s)$, again in units of $X$.

An agent with wealth $\theta$ chooses consumption of $(X, Y)$ at prices $(1, p_Y(s))$ and shares $(\theta_X, \theta_Y)$ of $(\xi_t, \eta_t)$ at prices $(q_X(s), q_Y(s))$. The agent's objective is to

$$\max E\left[\sum_{t=0}^{\infty}\beta^t\,U(X_{it}, Y_{it})\right]$$

subject to a budget constraint and a cash in advance constraint. This means that the value of current period endowments cannot be used in trading for assets or the other consumption good until next period. Each agent can be viewed as a two member household. One member collects the endowment and exchanges it for currency while the other uses existing currency to trade assets and goods. The two members do not interact until the end of the trading period.

With national currencies, the monetary shocks are given by

$$\Delta M_{t+1} = w_{0,t+1}M_t \quad\text{and}\quad \Delta N_{t+1} = w_{1,t+1}N_t.$$

Within each country the price of the home good in terms of home currency is

$$p_X(s, M) = M/\xi \quad\text{and}\quad p_Y(s, N) = N/\eta.$$


Also note the price of $Y$ in terms of $X$ can be expressed as $p_Y(s) = \frac{\partial U/\partial Y}{\partial U/\partial X}$.

The exchange rate (currency 0 per unit of currency 1) is

$$e(s, M, N) = \frac{p_X(s, M)}{p_Y(s, N)}\,p_Y(s) = \frac{M/\xi}{N/\eta}\,p_Y(s) = \frac{\pi_Y}{\pi_X}\,p_Y(s)$$

where $\pi_i = 1/p_i(s, \cdot)$ gives the purchasing power for country $i$. In equilibrium,

$$V_i(t)\,\pi_i(t)\,\frac{\partial U(t)}{\partial i} = E\left[\beta\,\frac{\partial U(t+1)}{\partial i}\,\pi_i(t+1)\,[V_i(t+1) + D_i(t+1)]\right].$$

For a riskless asset

$$B_i(t) = E\left[\beta\,\frac{\partial U(t+1)/\partial i}{\partial U(t)/\partial i}\,\frac{\pi_i(t+1)}{\pi_i(t)}\right] = E[m].$$

7.3 Forward Currency Pricing

Hansen & Hodrick (1983)

Hansen and Hodrick (1983) study the determinants of the risk premium in foreign exchange rates. This premium arises when the forward rate is not equal to the expected future spot rate, $F_t \neq E[S_{t+1}]$. The basic idea is to test the orthogonality condition

$$E[Q_{m,t+k}(s_{t+k}^j - f_{t,k}^j)] = 0$$

where $Q_{m,t+k}$ is the IMRS of money.

To make the above condition testable the authors propose three models. The first is a lognormal model which implies a constant risk premium. The second uses a riskless nominal rate and assumes a constant conditional covariance. The third is a latent variable model. The first two models are rejected, while the third provides some evidence that the risk premium is important. These tests are joint tests of the orthogonality condition and the auxiliary restrictions in each of the three models.

Fama (1984)

Fama (1984a) uses the same basic framework as Fama (1984b), which looks at Treasury bills. The idea here is to determine the information in the forward premium about forecast errors and changes in the spot rate. This research shows that excess returns are not only predictable ex ante, but also that the variance of the predictable component exceeds the variance of the expected rate change.

The analysis begins with a specification for the components of the forward rate

$$f_t = E[s_{t+1}] + p_t$$

where the lower case letters indicate logs and $p_t$ is the premium. This can be modified to represent the forward premium, which is then used to predict the forecast error and spot rate innovation

$$f_t - s_t = E[s_{t+1} - s_t] + p_t$$

$$f_t - s_{t+1} = \alpha_1 + \beta_1(f_t - s_t) + \varepsilon_{1,t+1}$$

$$s_{t+1} - s_t = \alpha_2 + \beta_2(f_t - s_t) + \varepsilon_{2,t+1}.$$

Adding the last two equations implies that $\alpha_1 + \alpha_2 = 0$, $\beta_1 + \beta_2 = 1$, and $\varepsilon_{1,t+1} + \varepsilon_{2,t+1} = 0$.
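The adding-up restrictions hold for the OLS estimates in any sample, because the two dependent variables sum to the regressor. The sketch below demonstrates this on simulated data. The data-generating process (a random walk for the log spot rate and a noisy forward rate) and all function names are assumptions made purely for illustration, not a model of actual exchange rates.

```python
import random


def ols(x, y):
    """Intercept and slope from a bivariate least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fama_regressions(s, f):
    """Run the two regressions; s[t] and f[t] are log spot and forward rates."""
    premium = [f[t] - s[t] for t in range(len(s) - 1)]             # f_t - s_t
    forecast_error = [f[t] - s[t + 1] for t in range(len(s) - 1)]  # f_t - s_{t+1}
    spot_change = [s[t + 1] - s[t] for t in range(len(s) - 1)]     # s_{t+1} - s_t
    return ols(premium, forecast_error), ols(premium, spot_change)


def simulate_rates(n, seed=0):
    """Hypothetical data: random-walk log spot rate, forward = spot + noise."""
    rng = random.Random(seed)
    s = [0.0]
    for _ in range(n):
        s.append(s[-1] + rng.gauss(0.0, 0.03))
    f = [level + rng.gauss(0.0, 0.005) for level in s]
    return s, f


(a1, b1), (a2, b2) = fama_regressions(*simulate_rates(500))
print(a1 + a2, b1 + b2)
```

Because the restrictions are mechanical, the two regressions carry the same information; testing $\beta_1 = 0$ is equivalent to testing $\beta_2 = 1$.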

Fama finds that both components vary through time, but most of the variation in the forward rate is due to the premium. The null hypothesis is $\beta_1 = 0$ and $\beta_2 = 1$. The estimated coefficients are $\beta_1 > 1$ and $\beta_2 < 0$. Thus, the premium and expected future spot rate are negatively correlated.

Possible explanations for these findings can be categorized as either a risk premium story or some type of forecast errors. The risk premium can arise in either a CAPM or a dynamic general equilibrium setting if investors have rational expectations. While a risk premium could account for non-zero excess returns, it does not explain the high variability. Explanations based on forecast errors may rely on either rational or irrational agents. Examples of cases with rational investors include learning models and the peso problem.

Mark (1988)

Mark (1988) allows time-variation in beta or the risk premium in a single beta CAPM to attempt to explain the forward premium puzzle. The conditional beta comes from an ARCH model. Using GMM, he fails to reject the model, indicating that there is evidence of time-varying beta. Additional tests reject the hypothesis of a constant beta.


Froot & Frankel (1989)

Froot and Frankel (1989) use survey data to extend the analysis in Fama (1984a). They focus on the regression

$$s_t - s_{t-1} = \alpha + \beta(f_{t-1} - s_{t-1}) + \varepsilon_t$$

and decompose beta into $\beta = 1 - \beta_{re} - \beta_{rp}$. The term $\beta_{re}$ captures failure of rational expectations while $\beta_{rp}$ represents the risk premium. The priors are that $\beta_{rp}$ is large and $\beta_{re}$ small, but the authors find the opposite. The risk premium does not appear to be an economically important source of the forward premium. The authors fail to reject the hypothesis that all the bias in the forward premium is due to expectation errors. Contrary to Fama (1984a), Froot and Frankel find that the variance of expected depreciation is large relative to the variance of the risk premium and the risk premium is uncorrelated with the forward discount. This analysis does not incorporate learning effects or the “peso problem.”

Backus, Gregory & Telmer (1993)

Backus, Gregory, and Telmer (1993) view the evidence on forward premiums in the same light as the equity premium puzzle. They introduce habit persistence to get around the high risk aversion implied by models with representative agents and time-separable utility. The model is tested using GMM estimation and simulations.

The statistical properties of forward and spot rates imply predictable returns from speculation. These returns are highly variable and imply a highly variable pricing kernel. Using GMM, the authors reject models with power utility and a particular specification of habit persistence. Simulations are used to place more structure on the theory. The evidence is partially consistent with the revised theory.

Huang (1989)

Huang (1989) examines the risk-return characteristics of the term structure of forward FX. The analysis is much like Hansen and Hodrick (1983), but in a multiple maturity setting. The evidence is that there appear to be some country-specific effects in the short (1 month) end of the term structure. In particular, Huang rejects the model using one month forwards, contrary to Hansen and Hodrick. With 3, 6, and 12 month forwards and with multiple maturities he fails to reject. These results are important since virtually all other papers in the literature (at least the ones mentioned here) use one month forwards. If there are strange influences on this maturity then the results in other papers may not be robust.

7.4 Integration

Bekaert and Harvey (1995) develop a conditional regime-switching model where expected returns are a weighted average of returns in integrated and segmented markets

$$E[r_i] = \phi\lambda\,\mathrm{cov}(r_i, r_W) + (1 - \phi)\lambda\,\mathrm{var}(r_i)$$

where $\phi$ is the probability the market is integrated. This probability is estimated with regime switching models assuming constant or time-varying transition probabilities. The authors find evidence of a time-varying world price of risk related to the business cycle (the Sharpe ratio is high in a trough). There is also evidence of time-varying integration for a number of countries.

Evans & Lewis (1995)

Evans and Lewis (1995) study whether long swings in the dollar can affect risk premium estimates. Using a regime-switching model they find that long swings make the risk premium appear to contain a permanent disturbance and can bias the Fama-style forward regressions. The authors are also unable to reject the restriction that the actual forward premium equals the risk premium plus the expected change in the exchange rate.

7.5 International Asset Pricing

Skip. Papers: Stulz (1981), Bansal, Hsieh, & Viswanathan (1993), Dumas & Solnik (1995), Ferson & Harvey (1993).

7.6 Other Topics

Skip. Papers: Bekaert & Hodrick (1992), Engle & Hamilton (1990), Engle, Ito, & Lin (1990).

Chapter 8

Appendix: Math Results

8.1 Basics

8.1.1 Norms

A norm measures the magnitude of a vector. The Euclidean norm is the common measure.

$$\|x\| \equiv \sqrt{x'x} \equiv \left[\sum x_i^2\right]^{1/2}$$

8.1.2 Moments

Moments describe the characteristics of a distribution. The $i$th moment is $\mu_i' = E[x^i]$ and the $i$th central moment is $\mu_i = E[(x - E[x])^i]$. The first moment is the mean and the second central moment the variance.

8.1.3 Distributions

Normal

If $x \sim N(\mu, \sigma^2)$ the density and characteristic functions are

$$f(x) = (2\pi\sigma^2)^{-1/2}\exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$

$$\phi(t) = \exp\left(i\mu t - \frac{\sigma^2t^2}{2}\right)$$


Lognormal

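If $x$ is normally distributed, then $z = e^x$ is lognormal (its log is normal).

$$f(z) = \frac{1}{\sigma z\sqrt{2\pi}}\exp\left[-\frac{(\ln z - \mu)^2}{2\sigma^2}\right].$$

$$E[z] = \exp(\mu + \sigma^2/2)$$

$$\mathrm{var}(z) = \exp(2\mu + \sigma^2)(\exp(\sigma^2) - 1)$$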
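The lognormal mean and variance can be confirmed by integrating $e^x$ and $e^{2x}$ against the normal density. The sketch below uses a plain midpoint rule; the function names, the integration width, and the parameter values are choices made for this illustration.

```python
import math


def lognormal_moments(mu, sigma):
    """Mean and variance of z = exp(x), x ~ N(mu, sigma^2), from the formulas."""
    mean = math.exp(mu + sigma ** 2 / 2.0)
    variance = math.exp(2.0 * mu + sigma ** 2) * (math.exp(sigma ** 2) - 1.0)
    return mean, variance


def lognormal_moments_numeric(mu, sigma, steps=20000, width=12.0):
    """Same moments by midpoint integration over mu +/- width * sigma."""
    lower = mu - width * sigma
    h = 2.0 * width * sigma / steps
    norm_const = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    first = 0.0   # accumulates E[exp(x)]
    second = 0.0  # accumulates E[exp(2x)]
    for i in range(steps):
        x = lower + (i + 0.5) * h
        density = norm_const * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
        first += math.exp(x) * density * h
        second += math.exp(2.0 * x) * density * h
    return first, second - first ** 2


print(lognormal_moments(0.1, 0.5))
print(lognormal_moments_numeric(0.1, 0.5))
```

The mean of $z$ exceeds $\exp(\mu)$ by the factor $\exp(\sigma^2/2)$, which is the Jensen's inequality adjustment that appears whenever log returns are converted to expected gross returns.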

8.1.4 Convergence

Probability

A sequence of random variables $x_n$ converges in probability to a constant $c$ if

$$\lim_{n\to\infty}\Pr[|x_n - c| < \delta] = 1 \quad \forall\,\delta > 0.$$

Distribution

A sequence of random variables $x_n$ with cdf $F_n$ converges in distribution to a random variable $x$ with cdf $F$ if

$$\lim_{n\to\infty}F_n(x) = F(x)$$

at every point $x$ where $F$ is continuous.

Almost Sure

A sequence of random variables $x_n$ defined on a probability space $(\Omega, \mathcal{F}, P)$ converges almost surely to an rv $x$ if

$$\lim_{n\to\infty}x_n(\omega) = x(\omega)$$

for each $\omega \in \Omega$ except for $\omega \in E$ where $P(E) = 0$.

Quadratic Mean

8.1.5 Some Famous Inequalities

Jensen’s Inequality

If G is concave in x, then

E[G(x)] ≤ G[E(x)].

This is where risk aversion comes from.


Chebychev’s Inequality

If mean $\mu$ and variance $\sigma^2$ exist, then for all $\varepsilon > 0$

$$\Pr[|x - \mu| \geq \varepsilon] \leq \sigma^2/\varepsilon^2$$

Cauchy-Schwarz Inequality

$$(E[xy])^2 \le E[x^2]\,E[y^2]$$

8.1.6 Stein’s Lemma

If $(x, y)$ are jointly normal, $(x, y) \sim N(\cdot, \cdot)$, $g$ is everywhere differentiable, and $E[|g'(x)|] < \infty$, then

$$\mathrm{cov}(g(x), y) = E[g'(x)]\,\mathrm{cov}(x, y).$$

This result is useful in working with the fundamental valuation equation $1 = E[mR]$. It can linearize a model under normality.
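Stein's lemma can be checked by simulation. The sketch below assumes the test function g(x) = x^2 and arbitrary illustrative means, variances, and correlation; for this choice the population value of both sides is 2 E[x] cov(x, y).

```python
import math
import random


def sample_cov(a, b):
    """Unbiased sample covariance of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n - 1)


def g(x):
    return x ** 2


def g_prime(x):
    return 2.0 * x


# Assumed bivariate normal parameters (illustrative only).
mu_x, mu_y = 1.0, 0.5
sd_x, sd_y, rho = 1.0, 1.0, 0.5

rng = random.Random(42)
xs, ys = [], []
for _ in range(200_000):
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    xs.append(mu_x + sd_x * z1)
    # Build y with correlation rho to x from two independent standard normals.
    ys.append(mu_y + sd_y * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2))

lhs = sample_cov([g(x) for x in xs], ys)                       # cov(g(x), y)
rhs = sum(g_prime(x) for x in xs) / len(xs) * sample_cov(xs, ys)  # E[g'(x)] cov(x, y)

print(f"cov(g(x), y) = {lhs:.3f}, E[g'(x)] cov(x, y) = {rhs:.3f}")
```

With these parameters both sides should be close to 2 * 1.0 * 0.5 = 1, up to sampling error.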

8.1.7 Bayes Law

Bayes law is useful for updating probabilities.

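$$\mathrm{Prob}(X_i \mid Y) = \frac{\mathrm{Prob}(X_i Y)}{\mathrm{Prob}(Y)} = \frac{\mathrm{Prob}(Y \mid X_i)\,\mathrm{Prob}(X_i)}{\sum_{j=1}^{N} \mathrm{Prob}(Y \mid X_j)\,\mathrm{Prob}(X_j)}$$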
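The updating rule is easy to apply mechanically. In the sketch below the function name `posterior` and the two-state numbers are hypothetical, chosen only to show the computation.

```python
def posterior(priors, likelihoods):
    """Return Prob(X_i | Y) for each state i.

    priors[i] is Prob(X_i) and likelihoods[i] is Prob(Y | X_i).
    """
    joint = [lik * p for p, lik in zip(priors, likelihoods)]  # Prob(X_i Y)
    prob_y = sum(joint)                                       # Prob(Y)
    return [j / prob_y for j in joint]


# Hypothetical example: two states with prior probabilities 0.3 and 0.7.
# The signal Y is observed with probability 0.8 in state 1 and 0.1 in state 2.
priors = [0.3, 0.7]
likelihoods = [0.8, 0.1]
post = posterior(priors, likelihoods)

print([round(p, 4) for p in post])
```

Observing Y moves the probability of the first state from 0.3 to 0.24/0.31, because Y is far more likely in that state.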

8.1.8 Law of Iterated Expectations

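The law of iterated expectations is useful for conditioning down from a finer information set to a coarser one. If $E[|Y|] < \infty$ and $\mathcal{F}_0 \subset \mathcal{F}_1 \subset \mathcal{F}$, then

$$E[Y \mid \mathcal{F}_0] = E\big[E[Y \mid \mathcal{F}_1] \mid \mathcal{F}_0\big].$$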

8.1.9 Stochastic Dominance

To compare two risky payoffs $c_1$ and $c_2$, we can use the notion of stochastic dominance. The idea is to choose the asset

Let $F(c) = \Pr[c_1 \le c]$ and $G(c) = \Pr[c_2 \le c]$. First order SD: 1 dominates 2 in the first-order sense if $F(c) \le G(c)\ \forall\ c$. Second order SD: 1 dominates 2 in the second-order sense if

$$\int_{-\infty}^{c} F(r)\,dr \le \int_{-\infty}^{c} G(r)\,dr \quad \forall\ c.$$
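Both dominance conditions can be checked directly for discrete payoffs. The sketch below assumes the two payoffs share a common, equally spaced grid, so the integral of each cdf reduces to a running sum; the function names and the example distributions are hypothetical.

```python
def cdf(probs):
    """Cumulative distribution on the common payoff grid."""
    total, out = 0.0, []
    for p in probs:
        total += p
        out.append(total)
    return out


def first_order_dominates(probs_1, probs_2, tol=1e-12):
    """True if F(c) <= G(c) at every grid point (1 dominates 2)."""
    return all(f <= g + tol for f, g in zip(cdf(probs_1), cdf(probs_2)))


def second_order_dominates(probs_1, probs_2, step=1.0, tol=1e-12):
    """True if the running integral of F never exceeds that of G."""
    int_f = int_g = 0.0
    for f, g in zip(cdf(probs_1), cdf(probs_2)):
        int_f += f * step
        int_g += g * step
        if int_f > int_g + tol:
            return False
    return True


# Assumed common payoff grid 0, 1, 2, 3 (spacing 1).
safe = [0.0, 0.0, 1.0, 0.0]    # pays 2 for sure
spread = [0.0, 0.5, 0.0, 0.5]  # pays 1 or 3: a mean-preserving spread of `safe`

print(first_order_dominates(safe, spread))   # False
print(second_order_dominates(safe, spread))  # True
```

The sure payoff does not dominate its mean-preserving spread in the first-order sense, but it does in the second-order sense, which is the ranking a risk-averse agent would use.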


8.2 Econometrics

This is a very brief review of some of the highlights from econometrics that are not immediately obvious.

8.2.1 Projection Theorem

If

$$E[y \mid x] = \alpha + \beta x$$

then

$$\beta = \frac{\mathrm{cov}(x, y)}{\mathrm{var}(x)} \quad \text{and} \quad \alpha = \bar{y} - \beta\bar{x}.$$
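The sample analogue of the projection coefficients is the usual least-squares fit. The data below are made up for illustration, and `projection_coefficients` is a hypothetical helper name.

```python
def projection_coefficients(x, y):
    """Sample analogues of beta = cov(x, y) / var(x) and alpha = ybar - beta * xbar."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / n
    var_x = sum((xi - x_bar) ** 2 for xi in x) / n
    beta = cov_xy / var_x
    alpha = y_bar - beta * x_bar
    return alpha, beta


# Made-up data, roughly y = 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

alpha, beta = projection_coefficients(x, y)
residuals = [yi - alpha - beta * xi for xi, yi in zip(x, y)]

print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
```

By construction the residuals have zero mean and are orthogonal to x, which is the defining property of a projection.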

8.2.2 Cramer-Rao Bound and the Var-Cov Matrix

The Cramer-Rao Bound gives the minimum variance of an unbiased estimator. Estimators that achieve the bound are most efficient in their class.

Under regularity conditions, the variance of an unbiased estimator $\hat\theta_n$ is bounded by $\mathrm{var}(\hat\theta_n) \ge \mathrm{var}(G)^{-1} = -E[H]^{-1}$, where $G$ and $H$ are the gradient and Hessian of the log-likelihood.

8.2.3 Testing: Wald, LM, LR

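There are three basic tests of hypotheses: the Wald (W), the likelihood ratio (LR), and the Lagrange multiplier (LM). All three are asymptotically $\chi^2$, but finite sample properties may differ. One test may be preferred over the others depending on the ease of calculation under the null or alternative hypotheses.

Consider an ML estimate with $g(y, \theta) = \ln[f(y, \theta)]$ the log-likelihood. Let $G = g_\theta$ and $H = g_{\theta\theta'}$. Then $E[G] = 0$, $\mathrm{var}(G) = E[GG'] = -E[H] = I$, and $\mathrm{var}(\hat\theta) \overset{a}{\to} I(\theta)^{-1}$. With $\hat\theta_R$ and $\hat\theta_U$ the restricted and unrestricted estimates,

$$W = (\hat\theta - \theta)'[\mathrm{var}(\hat\theta - \theta)]^{-1}(\hat\theta - \theta)$$

$$LM = G(\hat\theta_R)'\,I(\hat\theta_R)^{-1}\,G(\hat\theta_R)$$

$$LR = -2[g(\hat\theta_R) - g(\hat\theta_U)]$$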
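A one-parameter example makes the three statistics concrete. The sketch below assumes a Bernoulli sample with k successes in n trials and tests the null p = p0; the sample numbers are invented, and the LM statistic is computed as the squared score divided by the information (that is, weighted by the inverse of var(G)), both evaluated at the restricted value.

```python
import math


def loglik(p, k, n):
    """Bernoulli log-likelihood g for k successes in n trials."""
    return k * math.log(p) + (n - k) * math.log(1.0 - p)


def score(p, k, n):
    """Gradient G of the log-likelihood with respect to p."""
    return k / p - (n - k) / (1.0 - p)


def information(p, n):
    """Fisher information I = -E[H] for the whole sample."""
    return n / (p * (1.0 - p))


# Invented sample: 60 successes in 100 trials, null hypothesis p0 = 0.5.
n, k, p0 = 100, 60, 0.5
p_hat = k / n  # unrestricted ML estimate

wald = (p_hat - p0) ** 2 * information(p_hat, n)      # uses the unrestricted estimate
lm = score(p0, k, n) ** 2 / information(p0, n)        # uses the restricted value only
lr = -2.0 * (loglik(p0, k, n) - loglik(p_hat, k, n))  # uses both

print(f"W = {wald:.4f}, LR = {lr:.4f}, LM = {lm:.4f}")
```

The three values differ in this finite sample even though they share the same asymptotic chi-squared(1) distribution, and each needs a different estimate: W only the unrestricted one, LM only the restricted one, LR both.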

[Figure: graphical comparison of the W, LR, and LM tests. Recoverable labels: G, g, c, W, LR, LM.]

8.3 Continuous-Time Math

8.3.1 Stochastic Processes

8.3.2 Martingales

Prices follow a martingale when adjusted for dividends.

Random Walk

8.3.3 Ito’s Lemma

Consider the diffusion process of a variable $X$:

$$dX(t) = \mu(X, t)\,dt + \sigma(X, t)\,dW(t)$$

where $dW$ is the increment of a standard Wiener process with the properties $E[dW] = 0$ and $E[dW^2] = dt$. Then the function $F(X, t)$ has the stochastic differential equation

$$dF(X, t) = \frac{\partial F}{\partial X}\,dX + \left[\frac{\partial F}{\partial t} + \frac{1}{2}\sigma^2(X, t)\frac{\partial^2 F}{\partial X^2}\right]dt$$
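As a worked illustration (an example chosen here, not taken from the notes), apply the formula to $F = \ln X$ when $X$ follows $dX = \mu X\,dt + \sigma X\,dW$, so that $\mu(X,t) = \mu X$ and $\sigma(X,t) = \sigma X$:

```latex
\begin{aligned}
\frac{\partial F}{\partial X} &= \frac{1}{X}, \qquad
\frac{\partial^2 F}{\partial X^2} = -\frac{1}{X^2}, \qquad
\frac{\partial F}{\partial t} = 0, \\
d\ln X &= \frac{1}{X}\,(\mu X\,dt + \sigma X\,dW)
        + \left[\,0 + \tfrac{1}{2}\,\sigma^2 X^2 \left(-\frac{1}{X^2}\right)\right]dt \\
       &= \left(\mu - \tfrac{1}{2}\sigma^2\right)dt + \sigma\,dW .
\end{aligned}
```

The log of $X$ therefore has constant drift and volatility, and the drift is reduced by $\sigma^2/2$ relative to $\mu$ because the log is concave.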

8.3.4 Cameron-Martin-Girsanov Theorem

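If $W_t$ is a $P$-Brownian motion and $\gamma_t$ is an $\mathcal{F}$-previsible process satisfying the boundedness condition

$$E_P\left[\exp\left(\frac{1}{2}\int_0^T \gamma_t^2\,dt\right)\right] < \infty,$$

then there exists a measure $Q$ such that

1. $Q$ is equivalent to $P$

2. $\dfrac{dQ}{dP} = \exp\left(-\int_0^T \gamma_t\,dW_t - \frac{1}{2}\int_0^T \gamma_t^2\,dt\right)$

3. $\tilde{W}_t = W_t + \int_0^t \gamma_s\,ds$ is a $Q$-Brownian motion.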

There is a converse as well.

8.3.5 Special Processes

Arithmetic Brownian Motion

$$dX = \mu\,dt + \sigma\,dW$$

X grows linearly with increasing uncertainty. Given $X(t)$, $X(t+\tau)$ is normally distributed with mean $X(t) + \mu\tau$ and standard deviation $\sigma\sqrt{\tau}$.

Geometric Brownian Motion

$$dX = \mu X\,dt + \sigma X\,dW$$

X grows exponentially at rate µ with volatility proportional to the level of X. The distribution of X is lognormal, which makes it useful in modeling asset prices.

Mean-reverting Process

$$dX = \kappa(\mu - X)\,dt + \sigma X^{\gamma}\,dW$$

If γ = 1/2 then X is distributed non-central χ². It is often used to model interest rates, inflation, and volatility; the CIR model is an example of the square root process. If γ = 0, this is called an Ornstein–Uhlenbeck process.
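The mean-reverting process can be simulated with a simple Euler scheme. The sketch below is an illustration under assumed parameter values; the function name `simulate_terminal`, the step count, and the floor at zero inside the diffusion term are choices made for the example. It compares the simulated mean of X(T) in the square-root case with the value implied by the linear drift, µ + (X(0) − µ)exp(−κT).

```python
import math
import random


def simulate_terminal(x0, kappa, mu, sigma, gamma, horizon, n_steps, rng):
    """Euler path of dX = kappa*(mu - X)dt + sigma*X**gamma dW; returns X(T)."""
    dt = horizon / n_steps
    x = x0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        # max(x, 0) keeps X**gamma defined if a discretized path dips below zero.
        x += kappa * (mu - x) * dt + sigma * max(x, 0.0) ** gamma * dw
    return x


# Assumed parameters: start above the long-run mean and revert toward it.
x0, kappa, mu, sigma, horizon = 0.10, 2.0, 0.05, 0.1, 1.0

rng = random.Random(1)
terminal = [
    simulate_terminal(x0, kappa, mu, sigma, 0.5, horizon, 250, rng)
    for _ in range(2000)
]
mean_sim = sum(terminal) / len(terminal)
mean_exact = mu + (x0 - mu) * math.exp(-kappa * horizon)

print(f"simulated mean {mean_sim:.4f}, implied by drift {mean_exact:.4f}")
```

Setting `gamma` to 0 in the same function gives an Ornstein–Uhlenbeck path, and the expected value is unchanged because it depends only on the drift.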

8.3.6 Special Lemma

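If

$$\begin{bmatrix} x \\ y \end{bmatrix} \sim N(0, \Omega) \quad \text{with} \quad \Omega = \begin{bmatrix} \sigma_x^2 & \sigma_{xy} \\ \sigma_{xy} & \sigma_y^2 \end{bmatrix}$$

then

$$E\left[\left(A_x \exp\left(x - \tfrac{1}{2}\sigma_x^2\right) - A_y \exp\left(y - \tfrac{1}{2}\sigma_y^2\right)\right)^+\right] = A_x N(d_1) - A_y N(d_2)$$

where

$$d_1 = \frac{\ln(A_x/A_y) + \tfrac{1}{2}\Sigma}{\sqrt{\Sigma}}, \qquad d_2 = d_1 - \sqrt{\Sigma},$$

and $\Sigma = \mathrm{var}(x - y) = \sigma_x^2 + \sigma_y^2 - 2\sigma_{xy}$.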
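The lemma can be checked by simulation. The sketch below uses the standard exchange-option form d1 = [ln(Ax/Ay) + Σ/2]/√Σ and d2 = d1 − √Σ; all parameter values and the helper name `exchange_value` are assumptions for the example.

```python
import math
import random
from statistics import NormalDist


def exchange_value(a_x, a_y, var_x, var_y, cov_xy):
    """Closed form A_x N(d1) - A_y N(d2) with Sigma = var(x - y)."""
    big_sigma = var_x + var_y - 2.0 * cov_xy
    d1 = (math.log(a_x / a_y) + 0.5 * big_sigma) / math.sqrt(big_sigma)
    d2 = d1 - math.sqrt(big_sigma)
    norm_cdf = NormalDist().cdf
    return a_x * norm_cdf(d1) - a_y * norm_cdf(d2)


# Assumed inputs (illustrative only).
a_x, a_y = 100.0, 95.0
sd_x, sd_y, rho = 0.3, 0.2, 0.4
var_x, var_y, cov_xy = sd_x ** 2, sd_y ** 2, rho * sd_x * sd_y

closed_form = exchange_value(a_x, a_y, var_x, var_y, cov_xy)

# Monte Carlo estimate of E[(A_x e^{x - var_x/2} - A_y e^{y - var_y/2})^+].
rng = random.Random(7)
n_draws = 200_000
total = 0.0
for _ in range(n_draws):
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    x = sd_x * z1
    y = sd_y * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    payoff = a_x * math.exp(x - 0.5 * var_x) - a_y * math.exp(y - 0.5 * var_y)
    total += max(payoff, 0.0)
monte_carlo = total / n_draws

print(f"closed form {closed_form:.3f}, Monte Carlo {monte_carlo:.3f}")
```

Setting y to a constant (zero variance and covariance) reduces the expression to the familiar Black-Scholes call form, which is one way to see why the lemma is useful in option pricing.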


Bibliography

Admati, Anat, 1985, A noisy rational expectations equilibrium for multi-asset securities markets, Econometrica 53, 629–657.

Admati, Anat, and Paul Pfleiderer, 1988, A theory of intraday patterns: Volume and price variability, Review of Financial Studies 1, 3–40.

Admati, Anat, and Paul Pfleiderer, 1989, Divide and conquer: A theory of intraday and day-of-the-week mean effects, Review of Financial Studies 2, 189–223.

Akerlof, George A., 1970, The market for “lemons”: Quality uncertainty and the market mechanism, Quarterly Journal of Economics 84, 488–500.

Amihud, Y., and H. Mendelson, 1980, Dealership markets: Market-making with inventory, Journal of Financial Economics 8, 31–53.

Asquith, Paul, 1995, Convertible bonds are not called late, Journal of Finance 50, 1275–1289.

Asquith, Paul, and David Mullins, 1991, Convertible debt: Corporate call policy and voluntary conversion, Journal of Finance 46, 1273–1289.

Backus, David, Allan Gregory, and Chris Telmer, 1993, Accounting for forward rates in markets for foreign currency, Journal of Finance 48, 1887–1908.

Banz, R., 1981, The relation between the return and market value of common stocks, Journal of Financial Economics 9, 3–18.

Basu, S., 1977, The investment performance of common stocks in relation to their price to earnings ratios: A test of the efficient markets hypothesis, Journal of Finance 32, 663–682.

Bekaert, Geert, and Campbell Harvey, 1995, Time-varying world market integration, Journal of Finance 50, 403–444.

Berger, Phillip, and Eli Ofek, 1995, Diversification’s effect on firm value, Journal of Financial Economics 37, 39–65.

Berk, Jonathan, 1995, A critique of size related anomalies, Review of Financial Studies 8, 275–286.

Betker, Brian, 1995, An empirical examination of pre-packaged bankruptcy, Financial Management 24, 3–18.

Bhattacharya, Sudipto, and George Constantinides, 1989, Frontiers of Modern Financial Theory, vol. I & II of Studies in Financial Economics (Rowman & Littlefield: Totowa, NJ).

Bikhchandani, S., David Hirshleifer, and Ivo Welch, 1992, A theory of fads, fashion, custom, and cultural change as informational cascades, Journal of Political Economy 100, 992–1025.

Billett, Matthew, Mark Flannery, and Jon Garfinkel, 1995, The effect of lender identity on a borrowing firm’s equity return, Journal of Finance 50, 699–718.

Bizjak, John, James Brickley, and Jeffrey Coles, 1993, Stock-based incentive compensation and investment behavior, Journal of Accounting and Economics 16, 349–372.

Black, Fischer, 1972, Capital market equilibrium with restricted borrowing, Journal of Business 45, 444–455.

Black, Fischer, 1976, The dividend puzzle, Journal of Portfolio Management 2, 5–8.

Black, Fischer, Michael Jensen, and Myron Scholes, 1972, The capital asset pricing model: Some empirical tests, in Michael Jensen, ed.: Studies in the Theory of Capital Markets (Praeger: New York, NY).

Black, Fischer, and Myron Scholes, 1973, The pricing of options and corporate liabilities, Journal of Political Economy 81, 637–659.

Blanchard, O., and Mark Watson, 1982, Bubbles, Rational Expectations, and Financial Markets, vol. Crises in the Economic and Financial Structure (Lexington Books: Lexington, MA).

Blume, M., and I. Friend, 1973, A new look at the capital asset pricing model, Journal of Finance 28, 19–33.

Booth, James, and Lena Chua, 1996, Ownership dispersion, costly information, and IPO underpricing, Journal of Financial Economics 41, 291–310.

Breeden, Douglas T., 1979, An intertemporal asset pricing model with stochastic consumption and investment opportunities, Journal of Financial Economics 7, 265–96.

Brown, Roger, and Stephen Schaefer, 1994, The term structure of real interest rates and the Cox, Ingersoll, and Ross model, Journal of Financial Economics 35, 3–42.

Brown, Stephen, and Philip Dybvig, 1986, The empirical implications of the Cox, Ingersoll, and Ross theory of the term structure of interest rates, Journal of Finance 41, 617–632.

Campbell, John Y., Andrew W. Lo, and A. Craig MacKinlay, 1997, The Econometrics of Financial Markets (Princeton University Press: Princeton, NJ).

Chan, K.C., Nai-fu Chen, and David Hsieh, 1984, An exploratory investigation of the firm size effect, Journal of Financial Economics 14, 451–471.

Chan, K.C., G. Andrew Karolyi, Francis Longstaff, and Anthony Sanders, 1992, An empirical comparison of alternative models of the short-term interest rate, Journal of Finance 47, 1209–1227.

Chen, Nai-fu, Richard Roll, and Stephen A. Ross, 1986, Economic forces and the stock market, Journal of Business 59, 383–403.

Cochrane, John, 1998, Asset pricing, Unpublished Book.

Diamond, Douglas, and Robert Verrecchia, 1981, Information aggregation in a noisy rational expectations economy, Journal of Financial Economics 9, 221–235.

Dunn, Kenneth, and Kenneth Eades, 1989, Voluntary conversion of convertible securities and the optimal call strategy, Journal of Financial Economics 23, 273–301.

Eades, Kenneth, Patrick Hess, and E. Han Kim, 1994, Time-series variation in dividend pricing, Journal of Finance 49, 1617–1638.

Eckbo, Espen, and Ronald Masulis, 1992, Adverse selection and the rights offer paradox, Journal of Financial Economics 32, 293–332.

Evans, Martin, and Karen Lewis, 1995, Do long-swings in the dollar affect estimates of the risk premia?, Review of Financial Studies 8, 709–742.

Fama, Eugene, 1980, Agency problems and the theory of the firm, Journal of Political Economy 88, 288–307.

Fama, Eugene, 1984a, Forward and spot exchange rates, Journal of Monetary Economics 14, 319–338.

Fama, Eugene, 1984b, The information in the term structure, Journal of Financial Economics 13, 509–521.

Fama, Eugene, 1991, Efficient capital markets: II, Journal of Finance 46, 1575–1618.

Fama, Eugene, and Kenneth French, 1992, The cross-section of expected stock returns, Journal of Finance 47, 427–465.

Fama, Eugene, and Kenneth French, 1996b, The CAPM is wanted, dead or alive, Journal of Finance.

Fama, Eugene F., and James MacBeth, 1973, Risk, return and equilibrium: Empirical tests, Journal of Political Economy 81, 607–636.

Fazzari, Steven, Glenn Hubbard, and Bruce Petersen, 1988, Financing constraints and corporate investment, Brookings Papers on Economic Activity 1, 141–195.

Foster, Douglas, and S. Viswanathan, 1990, A theory of interday variations in volume, variance and trading costs in securities markets, Review of Financial Studies 3, 593–624.

Froot, Kenneth, and Jeffrey Frankel, 1989, Forward discount bias: Is it an exchange rate risk premium?, Quarterly Journal of Economics Feb., 139–161.

Froot, Kenneth, David Scharfstein, and Jeremy Stein, 1993, Risk management: Coordinating corporate investment and financing policies, Journal of Finance 48, 1629–1658.

Geczy, Christopher, Bernadette Minton, and Catherine Schrand, 1996, Why firms use currency derivatives, Working paper.

Gibbons, Michael, 1982, Multivariate tests of financial models: A new approach, Journal of Financial Economics 10, 3–27.

Gibbons, Michael, and Krishna Ramaswamy, 1993, A test of the Cox, Ingersoll, and Ross model of the term structure, Review of Financial Studies 6, 619–658.

Glosten, Larry, 1989, Insider trading, liquidity, and the role of the monopolist specialist, Journal of Business 62, 211–235.

Glosten, Larry, and P. Milgrom, 1985, Bid, ask, and transaction prices in a specialist market with heterogeneously informed traders, Journal of Financial Economics 14, 71–100.

Graham, John, 1996, Debt and the marginal tax rate, Journal of Financial Economics 41, 41–73.

Grossman, Sanford, 1976, On the efficiency of competitive stock markets where trades have diverse information, Journal of Finance 31, 573–585.

Grossman, Sanford, and J. E. Stiglitz, 1980, On the impossibility of informationally efficient markets, American Economic Review 70, 393–408.

Hansen, Lars Peter, and Robert J. Hodrick, 1983, Risk Averse Speculation in the Forward Foreign Exchange Market: An Econometric Analysis of Linear Models, vol. Exchange Rates and International Macroeconomics, pp. 113–152 (University of Chicago Press: Chicago).

Hansen, Lars Peter, and Ravi Jagannathan, 1991, Implications of securities market data for models of dynamic economies, Journal of Political Economy 99, 225–262.

Hansen, Lars Peter, and Scott F. Richard, 1987, The role of conditioning information in deducing testable restrictions implied by dynamic asset pricing models, Econometrica 55, 587–613.

Harrison, J., and David Kreps, 1979, Martingales and arbitrage in multiperiod securities markets, Journal of Economic Theory 20, 381–408.

Hart, Oliver, and David Kreps, 1986, Price destabilizing speculation, Journal of Political Economy 94, 927–952.

Hellwig, M.F., 1980, On the aggregation of information in competitive markets, Journal of Economic Theory 22, 477–498.

Helwege, Jean, and Nellie Liang, 1996, Is there a pecking order? Evidence from a panel of IPO firms, Journal of Financial Economics 40, 429–458.

Hirshleifer, Jack, 1971, The private and social value of information and the reward to inventive activity, American Economic Review 61, 561–574.

Hotchkiss, Edith Shwalb, 1995, Postbankruptcy resolution: Direct costs and violation of priority claims, Journal of Finance 50, 3–21.

Huang, Chi-fu, and Robert H. Litzenberger, 1988, Foundations for Financial Economics (Prentice-Hall: Englewood Cliffs, NJ).

Huang, Roger, 1989, An analysis of intertemporal pricing for forward foreign exchange contracts, Journal of Finance 44, 183–194.

Ibbotson, Roger, and Jay Ritter, 1995, Initial Public Offerings, vol. North-Holland Handbooks of Operations Research and Management Science: Finance, pp. 993–1016 (North-Holland: Amsterdam).

Ingersoll, 1987, Theory of Financial Decision Making, Studies in Financial Economics (Rowman & Littlefield: Savage, MD).

Ingersoll, Jonathan, 1984, Some results in the theory of arbitrage pricing, Journal of Finance 39, 1021–1039.

James, Christopher, 1995, When do banks take equity in debt restructurings, Review of Financial Studies 8.

Jarrow, Robert, V. Maksimovic, and W. T. Ziemba, 1995, Finance, vol. 9 of Handbooks in Operations Research and Management Science (North-Holland: Amsterdam).

Jegadeesh, Narasimhan, and Sheridan Titman, 1993, Returns to buying winners and selling losers: Implications for stock market efficiency, Journal of Finance 48, 65–91.

Jensen, Michael, 1986, Agency costs of free cash flow, corporate finance, and takeovers, American Economic Review 76, 323–329.

Jensen, Michael, and W.H. Meckling, 1976, Theory of the firm: Managerial behavior, agency costs, and ownership structure, Journal of Financial Economics 3, 305–360.

Jensen, Michael, and Kevin Murphy, 1990, Performance pay and top management incentives, Journal of Political Economy 98, 225–264.

Jung, Kooyul, Yong-Cheol Kim, and Rene Stulz, 1996, Investment opportunities, managerial discretion, and the security issue decision, Journal of Financial Economics 42, 159–185.

Kadlec, Greg, and John McConnell, 1994, The effect of market segmentation and illiquidity on asset prices: Evidence from exchange listings, Journal of Finance 49, 611–636.

Kandel, S., and Robert Stambaugh, 1987, On correlations and inferences about mean-variance efficiency, Journal of Financial Economics 18, 61–90.

Kim, Yong-Cheol, and Rene Stulz, 1988, The Eurobond market and corporate financial policy: A test of the clientele hypothesis, Journal of Financial Economics 22, 189–205.

Koh, and Walter, 1989, A direct test of Rock’s model of the pricing of unseasoned issues, Journal of Financial Economics 23, 251–272.

Kothari, S., J. Shanken, and R. Sloan, 1995, Another look at the cross-section of expected returns, Journal of Finance 50, 185–224.

Kyle, Albert S., 1985, Continuous auctions and insider trading, Econometrica 53, 1315–1335.

Kyle, Albert S., 1989, Informed speculation with imperfect competition, Review of Economic Studies 56, 317–356.

Lang, Larry, Rene Stulz, and Ralph Walkling, 1989, Managerial performance, Tobin’s q, and the gain from tender offers, Journal of Financial Economics 24, 137–154.

Lehn, Kenneth, and Annette Poulsen, 1989, Free cash flow and stockholder gains in going private transactions, Journal of Finance 44, 771–787.

Litzenberger, Robert, and Krishna Ramaswamy, 1979, The effect of personal taxes and dividends on capital asset prices: Theory and evidence, Journal of Financial Economics 7, 163–196.

Loderer, Claudio, John Cooney, and Leonard VanDrunen, 1991, The price elasticity of demand for common stock, Journal of Finance 46, 621–651.

Longstaff, Francis, and Eduardo Schwartz, 1992, Interest rate volatility and the term structure: A two-factor general equilibrium model, Journal of Finance 47, 1259–1282.

Loughran, and Ritter, 1995, The new issues puzzle, Journal of Finance 50, 23–51.

Lucas, Robert, 1978, Asset prices in an exchange economy, Econometrica 46, 1429–1445.

Lucas, Robert, 1982, Interest rates and currency prices in a two-country world, Journal of Monetary Economics 10, 335–360.

MacKinlay, A. Craig, 1987, On multivariate tests of the CAPM, Journal of Financial Economics 18, 341–371.

Manne, Henry G., 1965, Mergers and the market for corporate control, Journal of Political Economy 73, 110–120.

Mark, Nelson, 1988, Time-varying betas and risk premia in the pricing of forward foreign exchange contracts, Journal of Financial Economics 22, 335–354.

Markowitz, Harry, 1959, Portfolio Selection: Efficient Diversification of Investments (Wiley: New York).

Marshall, J. M., 1974, Private incentives and information, American Economic Review 64, 373–390.

Masulis, Ronald, 1980, The effects of capital structure change on security prices, Journal of Financial Economics 8, 139–178.

May, Don, 1995, Do managerial motives influence firm risk reduction strategies?, Journal of Finance 50, 1291–1308.

McConnell, John, and Eduardo Schwartz, 1992, The origin of LYONS: A case study in financial innovation, Journal of Applied Corporate Finance, pp. 40–47.

Merton, Robert, 1987, A simple model of capital market equilibrium with incomplete information, Journal of Finance 42, 483–510.

Merton, Robert C., 1973, An intertemporal capital asset pricing model, Econometrica 41, 867–887.

Mikkelson, Wayne, and Megan Partch, 1986, Valuation effects of security offerings and the issuance process, Journal of Financial Economics 15, 31–60.

Milgrom, P., and N. Stokey, 1982, Information, trade, and common knowledge, Journal of Economic Theory 26, 17–27.

Miller, Merton, 1977a, Debt and taxes, Journal of Finance 32, 261–276.

Miller, Edward M., 1977b, Risk, uncertainty, and divergence of opinion, Journal of Finance 32, 1151–1168.

Miller, Merton, and Kevin Rock, 1985, Dividend policy under asymmetric information, Journal of Finance 40, 1030–1051.

Mitchell, Mark, and Kenneth Lehn, 1990, Do bad bidders become good targets?, Journal of Political Economy 98, 372–398.

Mitchell, Mark L., and J. Harold Mulherin, 1996, Impact of industry shocks on takeover and restructuring activity, Journal of Financial Economics 41, 193–229.

Morck, R.A., Andrei Shleifer, and Robert Vishny, 1988, Management ownership and market valuation: An empirical analysis, Journal of Financial Economics 20, 293–315.

Murphy, Kevin, 1985, Corporate performance and managerial remuneration: An empirical analysis, Journal of Accounting and Economics 7, 11–42.

Myers, Stewart, 1977, Determinants of corporate borrowing, Journal of Financial Economics 5, 147–175.

Myers, Stewart, 1984, The capital structure puzzle, Journal of Finance 39, 575–592.

Myers, Stewart, and N. Majluf, 1984, Corporate financing and investment decisions when firms have information that investors do not have, Journal of Financial Economics 13, 187–221.

Ofer, Aharon, and Ashok Natarajan, 1989, Convertible call policies: An empirical analysis of an information-signalling hypothesis, Journal of Financial Economics 19, 91–108.

Opler, Tim, and Sheridan Titman, 1995, The debt-equity choice: An analysis of issuing firms, Working Paper.

Pearson, Neal, and Tong Sheng Sun, 1994, Exploiting the conditional density in estimating the term structure: An application to the Cox, Ingersoll, and Ross model, Journal of Finance 49, 1279–1304.

Prabhala, N. R., 1993, On interpreting dividend announcement effects: Free cash flow, clientele, or signalling?, Yale Working Paper.

Puri, Manju, 1996, Commercial banks in investment banking: Conflict of interest or certification role?, Journal of Financial Economics 40, 373–401.

Rajan, Raghuram, 1996, Insiders and outsiders: The choice between informed and arm’s length debt, Journal of Finance 47, 1367–1400.

Rajan, Raghuram, and Henri Servaes, ????, The effect of market conditions on initial public offerings.

Rajan, Raghuram, and Luigi Zingales, 1995, What do we know about capital structure? Some evidence from international data, Journal of Finance 50, 1421–1460.

Reisman, H., 1992, Reference variables, factor structure, and the approximate multibeta representation, Journal of Finance 47, 1303–1314.

Rock, Kevin, 1986, Why new issues are underpriced, Journal of Financial Economics 15, 187–212.

Rock, Kevin, 1989, The specialist’s order book, Unpublished Working Paper.

Roll, Richard, 1977, A critique of the asset pricing theory’s tests, Journal of Financial Economics 4, 129–176.

Roll, Richard, 1984, A simple measure of the effective bid/ask spread in an efficient market, Journal of Finance 39, 1127–1139.

Roll, Richard, 1986, The hubris hypothesis of corporate takeovers, Journal of Business 59, 197–216.

Roll, Richard, and Stephen Ross, 1994, On the cross-sectional relation between expected returns and betas, Journal of Finance 49, 101–122.

Ross, Stephen, 1976, The arbitrage theory of capital asset pricing, Journal of Economic Theory 13, 341–360.

Ross, Stephen, 1977a, The determination of financial structure: The incentive signalling approach, Bell Journal of Economics 8, 23–40.

Ross, Stephen, 1977b, Return, Risk, and Arbitrage, vol. Risk and Return in Finance, I (Ballinger: Cambridge, MA).

Shanken, Jay, 1982, The arbitrage pricing theory: Is it testable?, Journal of Finance 37, 1129–1140.

Shanken, Jay, 1985, Multivariate tests of the zero-beta CAPM, Journal of Financial Economics 14, 327–348.

Shanken, J., 1987, Multivariate proxies and asset pricing relations: Living with the Roll critique, Journal of Financial Economics 18, 91–110.

Shanken, J., 1992, On the estimation of beta-pricing models, Review of Financial Studies 5, 1–34.

Shanken, Jay, and M. Weinstein, 1990, Macroeconomic variables and asset pricing: Estimation and tests, Working Paper, University of Rochester.

Shiller, Robert J., 1981, Do stock prices move too much to be justified by subsequent changes in dividends?, American Economic Review 71, 421–436.

Shin, Hyun-Han, and Rene Stulz, 1996, An analysis of divisional investment policy, NBER Working Paper.

Shleifer, and Vishny, 1986, Large shareholders and corporate control, Journal of Political Economy 94, 461–488.

Shleifer, and Vishny, 1992, Liquidation values and debt capacity: A market equilibrium approach, Journal of Finance 47, 1343–1366.

Shleifer, Andrei, 1986, Do demand curves for stock slope down?, Journal of Finance 41, 579–590.

Slezak, Steve, 1994, A theory of the dynamics of security returns around market closures, Journal of Finance 49, 1163–1211.

Sloan, Richard, 1993, Accounting earnings and top executive compensation, Journal of Accounting and Economics 16, 55–100.

Smith, Clifford, 1986, Investment banking and the capital acquisition process, Journal of Financial Economics 15, 3–29.

Smith, Clifford, and Ross Watts, 1992, The investment opportunity set and corporate financing, dividend and compensation policies, Journal of Financial Economics 32, 263–292.

Snow, Karl, 1991, Diagnosing asset pricing models using the distribution of asset returns, Journal of Finance 46, 955–983.

Spence, Michael, 1973, Job market signaling, Quarterly Journal of Economics, pp. 355–374.

Spence, Michael, 1974, Competitive and optimal responses to signals: An analysis of efficiency distribution, Journal of Economic Theory 7, 296–332.

Stambaugh, Robert, 1982, On the exclusion of assets from tests of the two parameter model, Journal of Financial Economics 10, 235–268.

Stein, Jeremy, 1992, Convertible bonds as backdoor equity financing, Journal of Financial Economics 32, 3–21.

Stulz, Rene, 1988, Managerial control of voting rights: Financing policies and the market for corporate control, Journal of Financial Economics 20, 25–54.

Stulz, Rene, 1995, Rethinking risk management, Working paper.

Titman, Sheridan, and Robert Wessels, 1988, The determinants of capital structure choice, Journal of Finance 43, 1–19.

Tufano, Peter, 1989, Financial innovation and first mover advantages, Journal of Financial Economics 25, 213–240.

Tufano, Peter, 1996, Who manages risk? An empirical examination of risk management practices in the gold mining industry, Journal of Finance 51, 1097–1137.

Vermaelen, Theo, 1981, Common stock repurchases and market signalling: An empirical study, Journal of Financial Economics 9, 138–183.

Weiss, Lawrence, 1990, Bankruptcy resolution: Direct costs and violation of priority of claims, Journal of Financial Economics 27, 285–314.

Welch, Ivo, 1992, Sequential sales, learning, and cascades, Journal of Finance 47, 695–732.

Yermack, David, 1995, Do corporations award stock options effectively?, Journal of Financial Economics 39, 237–269.

Zender, Jaime, 1991, Optimal financial instruments, Journal of Finance 46, 1645–1663.
