Applications of optimal portfolio management Dimitrios Bisias

Applications of optimal portfolio management
by
Dimitrios Bisias
Submitted to the Sloan School of Management
in partial fulfillment of the requirements for the degree of
Doctor of Philosophy in Operations Research
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
September 2015
© Massachusetts Institute of Technology 2015. All rights reserved.
Author:
Sloan School of Management
June 22, 2015
Certified by:
Andrew W. Lo
Charles E. and Susan T. Harris Professor of Finance
Thesis Supervisor
Accepted by:
Patrick Jaillet
Dugald C. Jackson Professor, Department of Electrical Engineering
and Computer Science
Co-director, Operations Research Center
Applications of optimal portfolio management
by
Dimitrios Bisias
Submitted to the Sloan School of Management
on June 22, 2015, in partial fulfillment of the
requirements for the degree of
Doctor of Philosophy in Operations Research
Abstract
This thesis revolves around applications of optimal portfolio theory.
In the first essay, we study the optimal portfolio allocation among convergence
trades and mean reversion trading strategies for a risk-averse investor who faces Value-at-Risk and collateral constraints with and without fear of model misspecification.
We investigate the properties of the optimal trading strategy when the investor fully
trusts his model dynamics. Subsequently, we investigate how the optimal trading
strategy of the investor changes when he mistrusts the model. In particular, we
assume that the investor believes that the data will come from an unknown member
of a set of unspecified alternative models near his approximating model. The investor
believes that his model is a pretty good approximation in the sense that the relative
entropy of the alternative models with respect to his nominal model is small. Concern
about model misspecification leads the investor to choose a robust optimal portfolio
allocation that works well over that set of alternative models.
In the second essay, we study how portfolio theory can be used as a framework for
making biomedical funding allocation decisions focusing on the National Institutes
of Health (NIH). Prioritizing research efforts is analogous to managing an investment portfolio. In both cases, there are competing opportunities to invest limited
resources, and expected returns, risk, correlations, and the cost of lost opportunities
are important factors in determining the return of those investments. Can we apply
portfolio theory as a systematic framework for making biomedical funding allocation
decisions? Does NIH manage its research risk in an efficient way? What are the
challenges and limitations of portfolio theory as a way of making biomedical funding
allocation decisions?
Finally, in the third essay, we investigate how risk constraints in portfolio optimization and fear of model misspecification affect the statistical properties of the
market returns. Risk-sensitive regulation has become the cornerstone of international
financial regulations. How does this kind of regulation affect the statistical properties
of the financial market? Does it affect the risk premium of the market? What about
the volatility or the liquidity of the market?
Thesis Supervisor: Andrew W. Lo
Title: Charles E. and Susan T. Harris Professor of Finance
Acknowledgments
I would like to express my gratitude to my advisor and mentor, Professor Andrew
W. Lo, for his continuing support and advice over all the years I spent at MIT. His
immense knowledge in diverse research areas, enthusiasm, hard work, outstanding
leadership and motivation have been a source of inspiration. Working with him has
been an honor and privilege and I could not have imagined having a better advisor
and mentor for my Ph.D. study.
I would also like to thank the rest of my thesis committee: Professor Dimitri P.
Bertsekas for comments that greatly improved this thesis and for his great books that
made me love the field of optimization in the first place, and Professor Leonid Kogan,
who provided insight and expertise that greatly assisted this research.
In addition, I would like to thank Dr. James F. Watkins, MD for his invaluable
help, insights and contribution to the second part of this research.
Moreover, I would like to thank Dr. Paul Mende, Dr. Saman Majd and Dr. Eric
Rosenfeld, for whom I had the good fortune to serve as a teaching assistant in finance classes.
Paul’s experience in quantitative trading made me realize what career I would like to
follow and I am grateful for this.
Being part of MIT and in particular the ORC and LFE communities has been a
blessing and I consider myself very fortunate to be among very interesting and smart
people. I will always remember my years at MIT with nostalgia and joy and I hope
that I’ll be able to express my gratitude many times in the future.
My life at MIT would not have been so complete and joyful if I had not had good lifelong
friends to spend time and have productive discussions with. In particular, I would
like to thank Nick Trichakis and his wife Lena, Christos and Elli Nicolaides, Markos
and Sophia Trichas, Thomas and Anastasia Trikalinos, the golden coach George Papachristoudis and Gerry Tsoukalas.
Last but not least, I would like to thank my parents Giorgo and Roula and my
sister Katerina for their unconditional love and support. I owe them everything
and this thesis is dedicated to them.
Contents
1 Optimal trading of arbitrage opportunities under constraints
  1.1 Literature review
  1.2 Analysis
    1.2.1 Models
    1.2.2 Constraints
    1.2.3 Solution
    1.2.4 Connection with Ridge and Lasso regression
  1.3 Results
    1.3.1 Convergence trades
    1.3.2 Mean reversion trading opportunities
  1.4 Conclusions
2 Optimal trading of arbitrage opportunities under model misspecification
  2.1 Literature review
  2.2 Analysis
    2.2.1 Alternative models representation
    2.2.2 Model setup
  2.3 Solution
    2.3.1 No fear of model misspecification
    2.3.2 Fear of model misspecification no constraints
    2.3.3 Fear of model misspecification with VaR and margin constraints
  2.4 Results
    2.4.1 Convergence trades without constraints
    2.4.2 Mean reversion trading strategies without constraints
    2.4.3 Convergence trades with constraints
    2.4.4 Mean reversion trading strategies with constraints
  2.5 Conclusions
3 Estimating the NIH Efficient Frontier
  3.1 NIH Background and Literature Review
  3.2 Methods
    3.2.1 Funding Data
    3.2.2 Burden of Disease Data
    3.2.3 Applying Portfolio Theory
  3.3 Results
    3.3.1 Summary Statistics
    3.3.2 Efficient Frontiers
  3.4 Discussion
4 Impact of model misspecification and risk constraints on market
  4.1 Literature review
  4.2 Analysis
    4.2.1 Model setup
    4.2.2 Varying constraints
    4.2.3 Varying risk aversions
    4.2.4 Varying constraints and risk aversions
    4.2.5 Varying fear of model misspecification
  4.3 Conclusions
A Technical Notes
List of Figures
1-1 Ellipsoids. Ellipsoids of poor investment opportunities for N=2 convergence trades at times t = 0.3, 0.6, 0.9.
1-2 Weights for the case of uncorrelated spreads and collateral constraint. Weights for the case of uncorrelated spreads.
1-3 VaR constraints, positive correlations. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two positively correlated (ρ = 0.5) convergence trades, while facing VaR constraints (K=1). Initial wealth is $100.
1-4 VaR constraints, negative correlations. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two negatively correlated (ρ = −0.5) convergence trades, while facing VaR constraints (K=1). Initial wealth is $100.
1-5 VaR constraints, positive correlations, tight constraints. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two positively correlated (ρ = 0.5) convergence trades, while facing VaR constraints (K=0.25). Initial wealth is $100.
1-6 VaR constraints, negative correlations, tight constraints. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two negatively correlated (ρ = −0.5) convergence trades, while facing VaR constraints (K=0.25). Initial wealth is $100.
1-7 Wealth evolution under VaR constraint. Typical path of the wealth evolution for an investor investing in two convergence trades using the same noise process for positive and negative correlation under the VaR constraint. Initial wealth is $100.
1-8 Relation between final wealth and frequency the VaR constraint binds. Final wealth is negatively correlated to the percentage of time the constraints bind when the initial values of the convergence trades are low.
1-9 Margin constraints, positive correlations. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two positively correlated (ρ = 0.5) convergence trades, while facing margin constraints (Collateral = 1). Initial wealth is $100.
1-10 Margin constraints, negative correlations. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two negatively correlated (ρ = −0.5) convergence trades, while facing margin constraints (Collateral = 1). Initial wealth is $100.
1-11 Margin constraints, positive correlations, more collateral needed. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two positively correlated (ρ = 0.5) convergence trades, while facing margin constraints (Collateral = 2). Initial wealth is $100.
1-12 Margin constraints, negative correlations, more collateral needed. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two negatively correlated (ρ = −0.5) convergence trades, while facing margin constraints (Collateral = 2). Initial wealth is $100.
1-13 Wealth evolution under margin constraint. Typical path of the wealth evolution for an investor investing in two convergence trades using the same noise process for positive and negative correlation under the margin constraint. Initial wealth is $100.
1-14 Relation between final wealth and frequency the margin constraint binds. Final wealth is negatively correlated to the percentage of time the constraints bind when the initial values of the convergence trades are low.
1-15 Positions evolution under VaR constraints. Typical path of the positions in two convergence trading opportunities under VaR constraints.
1-16 Positions evolution under margin constraints. Typical path of the positions in two convergence trading opportunities under margin constraints.
2-1 Partial derivative of the value function with respect to S for a single convergence trade. V_S as a function of time at S = 1 for different values of the robustness multiplier for a single convergence trade.
2-2 Distortion drift for a single convergence trade. Distortion drift as a function of time at S = 1 for different values of the robustness multiplier for a single convergence trade.
2-3 Distortion drift terms for a single convergence trade. Distortion drift terms as a function of time at S = 1 for ν = 1 for a single convergence trade. The first term corresponds to a positive distortion drift that reduces the wealth of the investor since the investor is shorting the spread, while the second term corresponds to a negative distortion drift that points to worse investment opportunities.
2-4 Optimal weight of a single convergence trade. Weight of the convergence trading strategy as a function of time at S = 1 for different values of the robustness multiplier.
2-5 Partial derivative of the value function with respect to S for a single mean reversion trading strategy. V_S as a function of time at S = 1 for different values of the robustness multiplier.
2-6 Distortion drift for a single mean reversion trading strategy. Distortion drift as a function of time at S = 1 for different values of the robustness multiplier.
2-7 Distortion drift terms for a single mean reversion trading strategy. Distortion drift terms as a function of time at S = 1 for ν = 1. The first term corresponds to a positive distortion drift that reduces the wealth of the investor, since the investor is shorting the spread, while the second term corresponds to a negative distortion drift that points to worse investment opportunities.
2-8 Optimal weight of a single mean reversion trading strategy. Weight of the mean reversion trading strategy as a function of time at S = 1 for different values of the robustness multiplier.
2-9 Optimal weights of two uncorrelated mean reversion trading strategies for S1 = 1 and S2 = 2. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.
2-10 Ratio of the optimal weights. Ratio of the optimal weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.
2-11 Partial derivative of the value function with respect to S1 and S2 at S1 = 1 and S2 = 2 when ρ = 0. Partial derivative of the value function with respect to S1 and S2 as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.
2-12 Optimal weights of two positively correlated mean reversion trading strategies for S1 = 1 and S2 = 2. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.5.
2-13 Optimal weights of two negatively correlated mean reversion trading strategies for S1 = 1 and S2 = 2. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = −0.5.
2-14 Optimal weights of two uncorrelated mean reversion trading strategies for S1 = 1 and S2 = 1. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.
2-15 Ratio of the optimal weights at S1 = 1 and S2 = 1 when ρ = 0. Ratio of the optimal weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.
2-16 Partial derivative of the value function with respect to S1 and S2 at S1 = 1 and S2 = 1 when ρ = 0. Partial derivative of the value function with respect to S1 and S2 as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.
2-17 Optimal weights of two positively correlated mean reversion trading strategies for S1 = 1 and S2 = 1. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.9.
2-18 Partial derivative of the value function with respect to S1 and S2 at S1 = 1 and S2 = 1 when ρ = 0.9. Partial derivative of the value function with respect to S1 and S2 as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.9.
2-19 Ratio of the optimal weights at S1 = 1 and S2 = 1 when ρ = 0.9. Ratio of the magnitude of the optimal weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.9.
2-20 Optimal weights of two negatively correlated mean reversion trading strategies for S1 = 1 and S2 = 1. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = −0.8.
2-21 Partial derivative of the value function with respect to S for a single convergence trade when L = 0.1 and L = 100. V_S as a function of time at S = 1 for different values of the robustness multiplier. The solid line is when L = 100 and the dotted line is for L = 0.1.
2-22 Partial derivative of the value function with respect to S for a single convergence trade when L = 0.1. V_S as a function of time at S = 1 for different values of the robustness multiplier. The collateral constraint is |F| ≤ 0.1.
2-23 Optimal weight of a single convergence trade when L = 0.1. Weight of the convergence trading strategy as a function of time at S = 1 for different values of the robustness multiplier. The collateral constraint is |F| ≤ 0.1.
2-24 Optimal weight of a single convergence trade when L = 1. Weight of the convergence trading strategy as a function of time at S = 1 for different values of the robustness multiplier. The collateral constraint is |F| ≤ 1.
2-25 Distortion drift for a single convergence trade when L = 0.1. Distortion drift as a function of time at S = 1 for different values of the robustness multiplier. The collateral constraint is |F| ≤ 0.1.
2-26 Distortion drift for a single convergence trade when L = 1. Distortion drift as a function of time at S = 1 for different values of the robustness multiplier. The collateral constraint is |F| ≤ 1.
2-27 Distortion drift terms for a single convergence trade when L = 0.1. Distortion drift terms as a function of time at S = 1 for ν = 1 and L = 0.1. The first term corresponds to a positive distortion drift that reduces the wealth of the investor and it is bounded above due to the collateral constraint, while the second term corresponds to a negative distortion drift that points to worse investment opportunities.
2-28 Distortion drift terms for a single convergence trade when L = 1. Distortion drift terms as a function of time at S = 1 for ν = 1 and L = 1. The first term corresponds to a positive distortion drift that reduces the wealth of the investor and it is bounded above due to the collateral constraint, while the second term corresponds to a negative distortion drift that points to worse investment opportunities.
2-29 Optimal weight of a single convergence trade when L = 0.1 and L = 100. Weight of the convergence trading strategy as a function of time at S = 1 for different values of the robustness multiplier. The solid line is when L = 100 and the dotted line is for L = 0.1.
2-30 Optimal weights of two uncorrelated convergence trades for S1 = 1 and S2 = 2 when L = 0.5. Weights of the convergence trades as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.5.
2-31 Value of the normalized wealth variance for two uncorrelated convergence trades at S1 = 1 and S2 = 2 when L = 0.5. Value of the normalized wealth variance for two uncorrelated convergence trades as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.5.
2-32 Optimal weights of two uncorrelated convergence trades for S1 = 1 and S2 = 2 when L = 0.05. Weights of the convergence trades as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.05.
2-33 Value of the normalized wealth variance for two uncorrelated convergence trades at S1 = 1 and S2 = 2 when L = 0.05. Value of the normalized wealth variance for two uncorrelated convergence trades as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.05.
2-34 Optimal weights of two positively correlated convergence trades for S1 = 1 and S2 = 2 when L = 0.05. Weights of the convergence trades as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.5 and the rhs of the VaR constraint is L = 0.05.
2-35 Value of the normalized wealth variance for two positively correlated convergence trades at S1 = 1 and S2 = 2 when L = 0.05. Value of the normalized wealth variance for two positively correlated convergence trades as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.5 and the rhs of the VaR constraint is L = 0.05.
2-36 Optimal weights of two negatively correlated convergence trades for S1 = 1 and S2 = 2 when L = 0.05. Weights of the convergence trades as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = −0.5 and the rhs of the VaR constraint is L = 0.05.
2-37 Value of the normalized wealth variance for two negatively correlated convergence trades at S1 = 1 and S2 = 2 when L = 0.05. Value of the normalized wealth variance for two negatively correlated convergence trades as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = −0.5 and the rhs of the VaR constraint is L = 0.05.
2-38 Optimal weights of two uncorrelated convergence trades for S1 = 1 and S2 = 1 when L = 0.05. Weights of the convergence trades as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.05.
2-39 Value of the normalized wealth variance for two uncorrelated convergence trades at S1 = 1 and S2 = 1 when L = 0.05. Value of the normalized wealth variance for two uncorrelated convergence trades as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.05.
2-40 Optimal weights of two positively correlated convergence trades for S1 = 1 and S2 = 1 when L = 0.05. Weights of the convergence trades as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.8 and the rhs of the VaR constraint is L = 0.05.
2-41 Value of the normalized wealth variance for two positively correlated convergence trades at S1 = 1 and S2 = 1 when L = 0.05. Value of the normalized wealth variance for two positively correlated convergence trades as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.8 and the rhs of the VaR constraint is L = 0.05.
2-42 Optimal weights of two negatively correlated convergence trades for S1 = 1 and S2 = 1 when L = 0.05. Weights of the convergence trades as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = −0.8 and the rhs of the VaR constraint is L = 0.05.
2-43 Value of the normalized wealth variance for two negatively correlated convergence trades at S1 = 1 and S2 = 1 when L = 0.05. Value of the normalized wealth variance for two negatively correlated convergence trades as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = −0.8 and the rhs of the VaR constraint is L = 0.05.
2-44 Partial derivative of the value function with respect to S for a single mean reversion trading strategy and a collateral constraint with L = 0.7. V_S as a function of time at S = 1 for different values of the robustness multiplier for L = 0.7.
2-45 Distortion drift terms for a single mean reversion trading strategy and a collateral constraint with L = 0.7. Distortion drift terms as a function of time at S = 1 for ν = 2 and for L = 0.7. The first term corresponds to a positive distortion drift that reduces the wealth of the investor, since the investor is shorting the spread, while the second term corresponds to a negative distortion drift that points to worse investment opportunities. The first term is bounded above due to the collateral constraint.
2-46 Optimal weight of a single mean reversion trading strategy with a collateral constraint with L = 0.7. Weight of the mean reversion trading strategy as a function of time at S = 1 for different values of the robustness multiplier and for L = 0.7.
2-47 Partial derivative of the value function with respect to S for a single mean reversion trading strategy with different collateral constraints. V_S as a function of time at S = 1 for different values of the robustness multiplier and different collateral constraints. The solid line is for L = 70 and the dotted line for L = 0.7.
2-48 Optimal weight of a single mean reversion trading strategy with different collateral constraints. Weight of the mean reversion trading strategy as a function of time at S = 1 for different values of the robustness multiplier and different collateral constraints. The solid line is for L = 70 and the dotted line for L = 0.7.
2-49 Optimal weights of two uncorrelated mean reversion trading strategies for S1 = 1 and S2 = 2 when L = 3. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 3.
2-50 Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies at S1 = 1 and S2 = 2 when
L = 3. Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies as a function of time at S1 = 1 and
S2 = 2 for different values of the robustness multiplier. The correlation
coefficient is ρ = 0 and the rhs of the VaR constraint is L = 3. . . . . 118
2-51 Optimal weights of two uncorrelated mean reversion trading
strategies for S1 = 1 and S2 = 2 when L = 2. Weights of the
mean reversion trading strategies as a function of time at S1 = 1 and
S2 = 2 for different values of the robustness multiplier. The correlation
coefficient is ρ = 0 and the rhs of the VaR constraint is L = 2. . . . . 118
2-52 Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies at S1 = 1 and S2 = 2 when
L = 2. Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies as a function of time at S1 = 1 and
S2 = 2 for different values of the robustness multiplier. The correlation
coefficient is ρ = 0 and the rhs of the VaR constraint is L = 2. . . . . 119
2-53 Optimal weights of two uncorrelated mean reversion trading
strategies for S1 = 1 and S2 = 2 when L = 7. Weights of the
mean reversion trading strategies as a function of time at S1 = 1 and
S2 = 2 for different values of the robustness multiplier. The correlation
coefficient is ρ = 0 and the rhs of the VaR constraint is L = 7. . . . . 119
2-54 Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies at S1 = 1 and S2 = 2 when
L = 7. Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies as a function of time at S1 = 1 and
S2 = 2 for different values of the robustness multiplier. The correlation
coefficient is ρ = 0 and the rhs of the VaR constraint is L = 7. . . . . 120
2-55 Optimal weights of two positively correlated mean reversion
trading strategies for S1 = 1 and S2 = 2 when L = 7. Weights
of the mean reversion trading strategies as a function of time at S1 =
1 and S2 = 2 for different values of the robustness multiplier. The
correlation coefficient is ρ = 0.5 and the rhs of the VaR constraint is
L = 7. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
2-56 Value of the normalized wealth variance for two positively
correlated mean reversion trading strategies at S1 = 1 and
S2 = 2 when L = 7. Value of the normalized wealth variance for two
positively correlated mean reversion trading strategies as a function
of time at S1 = 1 and S2 = 2 for different values of the robustness
multiplier. The correlation coefficient is ρ = 0.5 and the rhs of the
VaR constraint is L = 7. . . . . . . . . . . . . . . . . . . . . . . . . . 122
2-57 Optimal weights of two negatively correlated mean reversion
trading strategies for S1 = 1 and S2 = 2 when L = 7. Weights
of the mean reversion trading strategies as a function of time at S1 =
1 and S2 = 2 for different values of the robustness multiplier. The
correlation coefficient is ρ = −0.5 and the rhs of the VaR constraint is
L = 7. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
2-58 Value of the normalized wealth variance for two negatively
correlated mean reversion trading strategies at S1 = 1 and
S2 = 2 when L = 7. Value of the normalized wealth variance for two
negatively correlated mean reversion trading strategies as a function
of time at S1 = 1 and S2 = 2 for different values of the robustness
multiplier. The correlation coefficient is ρ = −0.5 and the rhs of the
VaR constraint is L = 7. . . . . . . . . . . . . . . . . . . . . . . . . . 123
2-59 Optimal weights of two uncorrelated mean reversion trading
strategies for S1 = 1 and S2 = 1 when L = 2. Weights of the
mean reversion trading strategies as a function of time at S1 = 1 and
S2 = 1 for different values of the robustness multiplier. The correlation
coefficient is ρ = 0 and the rhs of the VaR constraint is L = 2. . . . . 125
2-60 Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies at S1 = 1 and S2 = 1 when
L = 2. Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies as a function of time at
S1 = 1 and S2 = 1 for different values of the robustness multiplier.
The correlation coefficient is ρ = 0 and the rhs of the VaR constraint
is L = 2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
2-61 Optimal weights of two positively correlated mean reversion
trading strategies for S1 = 1 and S2 = 1 when L = 2. Weights
of the mean reversion trading strategies as a function of time at S1 =
1 and S2 = 1 for different values of the robustness multiplier. The
correlation coefficient is ρ = 0.9 and the rhs of the VaR constraint is
L = 2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
2-62 Value of the normalized wealth variance for two positively
correlated mean reversion trading strategies at S1 = 1 and
S2 = 1 when L = 2. Value of the normalized wealth variance for two
positively correlated mean reversion trading strategies as a function
of time at S1 = 1 and S2 = 1 for different values of the robustness
multiplier. The correlation coefficient is ρ = 0.9 and the rhs of the
VaR constraint is L = 2. . . . . . . . . . . . . . . . . . . . . . . . . . 127
2-63 Optimal weights of two negatively correlated mean reversion
trading strategies for S1 = 1 and S2 = 1 when L = 8. Weights
of the mean reversion trading strategies as a function of time at S1 =
1 and S2 = 1 for different values of the robustness multiplier. The
correlation coefficient is ρ = −0.5 and the rhs of the VaR constraint is
L = 8. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
2-64 Value of the normalized wealth variance for two negatively
correlated mean reversion trading strategies at S1 = 1 and
S2 = 1 when L = 8. Value of the normalized wealth variance for two
negatively correlated mean reversion trading strategies as a function
of time at S1 = 1 and S2 = 1 for different values of the robustness
multiplier. The correlation coefficient is ρ = −0.5 and the rhs of the
VaR constraint is L = 8. . . . . . . . . . . . . . . . . . . . . . . . . . 128
3-1 NIH time series flowchart. Flowchart for the construction of NIH
appropriations time series. “NIH Approp.” denotes NIH appropriations; “PHS Gaps” denotes Institute funding by the U.S. Public Health
Service; “Complete Approp.” denotes the union of these two series;
“FY Change” allows for the change in government fiscal years; “4Q
FY” time series refers to the resulting series in which all years are
treated as having four quarters of three months each. . . . . . . . . . 138
3-2 Appropriations data. NIH appropriations in real (2005) dollars,
categorized by disease group. . . . . . . . . . . . . . . . . . . . . . . 138
3-3 YLL time series flowchart. Flowchart for the construction of years
of life lost (YLL) time series. “WONDER Chapter Age Group” refers
to a query to the CDC WONDER database at the chapter level, stratified by age group at death; “US Pop.” is the United States population
from census data as expressed in the WONDER dataset; and “US
GDP” denotes U.S. gross domestic product. . . . . . . . . . . . . . . 140
3-4 YLL data. Panel (a): Raw YLL categorized by disease group. Panel
(b): Population-normalized YLL (with base year of 2005), categorized
by disease group. Both panels are based on data from 1979 to 2007. . . . 141
3-5 Efficient frontiers. Efficient frontiers for (a) all groups except HIV
and AMS, γ = 0; (b) all groups except HIV and AMS, γ = 5; (c) all
groups except HIV and AMS without the dementia effect, γ = 0; and
(d) all groups except HIV and AMS without the dementia effect, γ = 5;
based on historical ROI from 1980 to 2003. . . . . . . . . . . . . . . . 148
4-1 Price of the risky asset as a function of the aggregate market
supply under varying constraints. We assume that we have 5
agents with the same risk aversion coefficients. The red plot assumes
the same L = 30 for all the agents, while the blue assumes L to be
different across the agents L1 = 10, L2 = 20, L3 = 30, L4 = 40, L5 = 50. 163
4-2 Price of the risky asset as a function of the aggregate market
supply under tightening constraints. We assume that we have 5
agents with the same risk aversion coefficients. The blue plot assumes
L to be different across the agents L1 = 10, L2 = 20, L3 = 30, L4 =
40, L5 = 50 and the red assumes that each Li is reduced by 20%. . . . 164
4-3 Price of the risky asset as a function of the aggregate market
supply with less variable constraints. We assume that we have 5
agents with the same risk aversion coefficients. The blue plot assumes
L to be different across the agents L1 = 10, L2 = 20, L3 = 30, L4 =
40, L5 = 50 and the red assumes that L1 = 20, L2 = 25, L3 = 30, L4 =
35, L5 = 40. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
4-4 Price of the risky asset as a function of the aggregate market
supply with constraints and varying risk aversions. We assume
that we have 5 agents with same constraints but different risk aversion
coefficients. The blue plot assumes L = 30 for each agent, while the
red line assumes that the agents are unconstrained. . . . . . . . . . . 166
4-5 Price of the risky asset as a function of the aggregate market
supply with tightening constraints and varying risk aversions.
We assume that we have 5 agents with same constraints but different
risk aversion coefficients. The blue plot assumes L = 30 for each agent,
while the red line assumes that L = 20 for each agent. . . . . . . . . . 167
List of Tables
3.1 IoM recommendations. 12 major recommendations of the 1998
Institute of Medicine panel in four large areas for improving the process
of allocating research funds. . . . . . . . . . . . . . . . . . . . . . . . 133
3.2 ICD mapping. Classification of ICD-9 (1978–1998) and ICD-10 (1999–
2007) Chapters and NIH appropriations by Institute and Center to 7
disease groups: oncology (ONC); heart lung and blood (HLB); digestive,
renal and endocrine (DDK); central nervous system and sensory
(CNS) into which we placed dementia and unspecified psychoses to
create comparable series as there was a clear, ongoing migration noted
from NMH to CNS after the change to ICD-10 in 1999; psychiatric and
substance abuse (NMH); infectious disease, subdivided into estimated
HIV (HIV) and other (AID); maternal, fetal, congenital and pediatric
(CHD). The categories LAB and EXT are omitted from our analysis. . . . 137
3.3 Return summary statistics. Summary statistics for the ROI of
disease groups, in units of years (for the lag length) and
per-capita-GDP-denominated reductions in YLL between years t and t + 4 per
dollar of research funding in year t−q, based on historical ROI from
1980 to 2003. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
3.4 ROI example. An example of the ROI calculation for HLB from 1986. 147
3.5 Portfolio weights. Benchmark, single- and dual-objective optimal
portfolio weights (in percent), based on historical ROI from 1980 to
2003. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
Chapter 1
Optimal trading of arbitrage
opportunities under constraints
In financial economics, an arbitrage is an investment opportunity that is too good
to be true when there are no market frictions. In actual financial markets however,
there are frictions and even if there are arbitrage opportunities the investors may not
be able to fully exploit them due to the constraints they face.
We will explore two kinds of risky arbitrage opportunities when there are market
frictions. The first one is a case of a textbook arbitrage, a convergence trade strategy.
The second is a case of a statistical arbitrage, a mean reversion trading strategy. These
two strategies are two of the most popular trading strategies that hedge funds follow,
so studying them in detail when there are market frictions is a valuable exercise.
A convergence trade is a trading strategy consisting of long/short positions in two
similar assets: we buy the cheap asset, short the expensive asset and wait
until the prices of the two assets converge, which we know will happen with certainty
at some particular time in the future. An example of this trade involves the difference
in price between the on-the-run and the most recent off-the-run security. An on-the-run
security is the most recently issued, and hence most liquid, of a periodically
issued security. Since an on-the-run security is more liquid, it trades at a premium
to off-the-run securities [29]. A convergence trade involves taking a long position in
the most recent off-the-run security and shorting the on-the-run security. The
on-the-run security will become off-the-run upon the issue of a newer security, and then there
will be almost no difference between the two securities in our trade, so their prices
will converge. Another example involves investing in Treasury STRIPS with identical
maturity dates but different prices.
A mean reversion trading strategy involves investing in an asset or a portfolio of
assets whose value is a mean reverting process. Since most price series in the equity
space follow a random walk, this strategy most commonly involves investing in a portfolio
of non-mean-reverting assets whose value is a stationary mean reverting series.
Price series that can be combined in such a way are called cointegrated. A
classic statistical arbitrage example is pairs trading, which was the first type of
algorithmic mean reversion trading strategy invented by institutional traders, reportedly
by the trading desk of Nunzio Tartaglia at Morgan Stanley [64]. The statistical
arbitrage pairs trading strategy bets on the convergence of the prices of two similar
assets whose prices have diverged without a fundamental reason for this.
These arbitrage opportunities are risky under market frictions. In particular the
first case is exposed to the “divergence risk”, i.e. the fact that the pricing differential
between the two similar assets can diverge arbitrarily far from 0 prior to its convergence at some particular time in the future. The second case is exposed both to the
“divergence risk” and to the “horizon risk”, in other words the fact that the times at
which the spread will converge to its long run mean are uncertain.
We will explore the optimal portfolio allocation of a risk averse investor who
invests in N convergence trades or mean reverting trading strategies, while facing
constraints. In particular, we will study the optimal trading strategy when he faces
VaR constraints or collateral constraints. Risk sensitive regulation, such as the VaR
constraint, has lately become a central component of international financial regulation.
Collateral or margin constraints, under which the investor must have sufficient
wealth to secure the liabilities created by short positions, have been ubiquitous in
financial transactions for centuries, and margin calls have been behind several crises,
including the LTCM debacle [54].
In the rest of this chapter, we first review the relevant literature. We then
describe the setup of the model and the constraints, derive the optimal
trading strategy of the investor and finally explore the characteristics of this
optimal strategy.
1.1 Literature review
Merton studied the problem of optimal portfolio allocation in a continuous time setting without any market frictions [59]. The optimal portfolio involves two terms: a
market timing term and a hedging demand term. The first term is a myopic term that
represents the optimal allocation of an investor who, at each time instant t, cares only
about a horizon dt ahead. The second term represents the investor’s additional demand
due to the covariance of the wealth process with the attractiveness of the available
investment opportunities. Although Merton gives a general analytical solution, it
is expressed in terms of the partial derivatives of the value function, and additional
work is needed to derive the solution in terms of the model parameters. Additionally,
it assumes that there are no market frictions.
Optimal trading of mean reversion trading strategies has been studied by both
Boguslavsky and Boguslavskaya [15] and Jurek and Yang [48]. They found
analytical solutions for the optimal weight of a single mean reverting trading strategy
for risk averse CRRA investors. Their analysis is similar to the one in Kim and
Omberg [49], who assume that there is a risk-free asset with a constant risk-free rate
and a single risky asset with a mean reverting risk premium, which implies
a mean reverting instantaneous Sharpe ratio. In all these cases the authors assume that
there are no market frictions whatsoever.
Longstaff and Liu [53] have studied the problem of optimal trading of a single
convergence trade under a margin constraint. For the single convergence trade case,
VaR constraints and margin constraints collapse into the same constraint and the
problem is significantly easier. In addition, by studying only convergence trades they
have removed one important dimension of risk, the horizon risk, keeping only the
divergence risk. Brennan and Schwartz [17] have also studied the problem of optimal
trading of a single convergence trade including transaction costs when the arbitrage
potential is restricted by position limits.
The literature is rich with papers that study the existence of an equilibrium in which
mispricings exist. The persistence of mispricings is typically attributed to
agency problems, frictions or some kind of risk. Unlike textbook arbitrages, which
generate riskless profits and require no capital commitments, exploiting real-world
mispricings requires the assumption of some kind of risk. Shleifer and Vishny [74]
emphasized that risks such as the uncertainty about when the pricing differential
will converge to 0 and the possibility of a divergence of the mispricing prior to its
elimination may play a role in limiting the size of positions that arbitrageurs are
willing to take, contributing to the persistence of the arbitrage in equilibrium. Basak
and Croitoru [6] also showed that arbitrage can persist in equilibrium when there are
frictions. They study dynamic models with log utility and heterogeneous beliefs in
the presence of margin requirements and other portfolio constraints.
With respect to the constraints, Basak and Shapiro [7] study the problem of optimal trading strategy of a risk averse investor who faces finite horizon VaR constraints
in a complete markets setting using the martingale representation approach [4]. Here
again there are no constraints in the optimal portfolio allocation at each time t but
there is only one constraint in the wealth at some finite horizon. Finally, Geanakoplos
[33] studies the collateral constraints, how these determine an equilibrium leverage
and how this leverage changes over time, the so-called leverage cycles.
Let us now discuss the setup of the model and the constraints and find the
optimal trading strategy of the investor.
1.2 Analysis
We assume we have a risk averse investor maximizing the expected continuously
compounded rate of return or, equivalently, the expected logarithm of his final wealth
$E(\ln W_T)$. There are two cases to consider. In the first case, the investor can invest
in a risk-free asset and N non-redundant convergence trades, modeled as correlated
Brownian bridges. In the second case, the investor can invest in a risk-free asset
and N non-redundant mean reversion trading strategies, modeled as a multivariate
Ornstein-Uhlenbeck (OU) process. The investor faces two kinds of constraints: VaR
constraints or collateral constraints. We determine the optimal trading strategy and
its characteristics in both cases.
1.2.1 Models
As we mentioned already, a convergence trade is a trading strategy consisting of
long/short positions in two similar assets: we buy the cheap asset, short the
expensive asset and wait until the prices of the two assets converge, which we
know will happen with certainty at some time in the future. The spread of the convergence
trade can be modeled as a Brownian bridge driven by K Brownian motions, which
has the property that the spread converges to 0 almost surely at a predetermined
time in the future. The stochastic differential equation governing the spread of the
trade is given by:

$$dS_t = -\frac{a S_t}{T-t}\,dt + \sum_{k=1}^{K} \sigma_k\,dZ_{kt} \tag{1.1}$$

where $S_t$ is the spread of the trade, $a$ is a parameter controlling the rate of the mean
reversion to 0, $T$ is the horizon of the investor, which is also the time at which the
spread goes to 0 with probability 1, and $Z_t$ is a Brownian motion in $\mathbb{R}^K$. We can
see that the reversion to 0 grows stronger as $t \to T$. Therefore, the investment
opportunities get better as the spread gets larger and $t \to T$, since then the drift
term pushing the spread towards 0 gets larger.
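The pull towards 0 in the Brownian bridge dynamics can be seen in a small simulation. The following is only an illustrative sketch: it uses a plain Euler-Maruyama discretization, and the function name, the parameter values (`a = 1`, two volatilities, 1000 steps) and the seed are assumptions chosen for the example, not values taken from this chapter.

```python
import math
import random


def simulate_bridge_spread(s0, a, sigmas, horizon, n_steps, seed=0):
    """Euler-Maruyama path of dS = -a*S/(T-t) dt + sum_k sigma_k dZ_k."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    s = s0
    path = [s]
    for step in range(n_steps):
        t = step * dt
        drift = -a * s / (horizon - t)  # reversion strengthens as t -> T
        shock = sum(sig * rng.gauss(0.0, math.sqrt(dt)) for sig in sigmas)
        s = s + drift * dt + shock
        path.append(s)
    return path


# Hypothetical parameters: initial spread 0.5, K = 2 driving Brownian motions.
path = simulate_bridge_spread(s0=0.5, a=1.0, sigmas=[0.1, 0.05],
                              horizon=1.0, n_steps=1000)
print(f"spread at t=0: {path[0]:.3f}, spread at t=T: {path[-1]:.4f}")
```

With `a = 1` the last Euler step removes essentially all of the remaining spread, so the simulated terminal value is of the order of one step's noise. For larger `a` this naive scheme can overshoot near T, and a finer grid or an exact bridge sampler would be a safer choice.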
A mean reversion trading strategy involves investing in a stationary portfolio of
non-mean reverting assets, whose value is a mean reverting process. The value of
the portfolio can be modeled as an Ornstein-Uhlenbeck (OU) process. The stochastic
differential equation governing it is given by:

$$dS_t = -\phi (S_t - \bar{S})\,dt + \sum_{k=1}^{K} \sigma_k\,dZ_{kt}$$
In our case we have N of these mean reverting processes and we assume that they
are modeled as a multivariate Ornstein-Uhlenbeck process, which is defined by the
following stochastic differential equation:

$$dS_t = -\Phi (S_t - \bar{S})\,dt + \sigma\,dZ_t \tag{1.2}$$

Above, $\Phi$ is an N-by-N square transition matrix that characterizes the deterministic
portion of the evolution of the process, $\bar{S}$ is the vector representing the unconditional
mean of the process, $\sigma$ is an N-by-K matrix that drives the dispersion of the process
and $Z_t$ is a Brownian motion in $\mathbb{R}^K$.
The Ornstein-Uhlenbeck process has the nice property that its conditional distribution
is normal at all times, with mean equal to

$$E_t[S_{t+\tau}] = \bar{S} + e^{-\Phi\tau}(S_t - \bar{S})$$

and covariance matrix independent of $S_t$ [60]. We assume that $\Phi$ has eigenvalues with
positive real part, so that the conditional expectation approaches $\bar{S}$ as $\tau \to \infty$.
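The conditional mean formula can be evaluated numerically. The sketch below assumes a small, well-scaled matrix so that a truncated Taylor series is an adequate stand-in for the matrix exponential. The helper names and the diagonal example matrix are hypothetical and serve only to make the result easy to check by hand.

```python
import math


def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_exp(m, terms=60):
    """Truncated Taylor series for exp(m); fine for small, well-scaled m."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, m)]  # m^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def ou_conditional_mean(s_t, s_bar, phi, tau):
    """E_t[S_{t+tau}] = s_bar + exp(-phi * tau) (s_t - s_bar)."""
    decay = mat_exp([[-x * tau for x in row] for row in phi])
    dev = [s - m for s, m in zip(s_t, s_bar)]
    return [m + sum(decay[i][j] * dev[j] for j in range(len(dev)))
            for i, m in enumerate(s_bar)]


# Hypothetical example: diagonal phi, so each component decays independently.
phi = [[1.0, 0.0], [0.0, 2.0]]
mean = ou_conditional_mean([2.0, 0.0], [1.0, 1.0], phi, tau=0.5)
expected = [1.0 + math.exp(-0.5), 1.0 - math.exp(-1.0)]
far = ou_conditional_mean([2.0, 0.0], [1.0, 1.0], phi, tau=5.0)
print(mean, expected, far)
```

For a long horizon the deviation from the unconditional mean has decayed almost completely, which is the mean reversion property in the text.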
The Ornstein-Uhlenbeck process captures the two important dimensions of risk
in all relative value trades: the “horizon risk”, in other words the fact that the
times at which the spread will converge to its long run mean are uncertain and the
“divergence risk”, i.e. the fact that the pricing differential can diverge arbitrarily far
from its long run mean prior to its convergence. The Brownian bridge captures only
the “divergence” risk, since by its definition we assume that the investor has perfect
information about the magnitude of the mispricing at some future date T , i.e. we
assume that the date T on which the mispricing will be eliminated is known ahead
with certainty.
1.2.2 Constraints
We consider two kinds of constraints: VaR and collateral constraints. Value-at-Risk
is a widely used statistical risk measure, adopted both by regulators
and by the private sector. It is the cornerstone of the Basel capital
regulations. Both the 1996 market risk amendment of the original 1988 Basel
accord and the Basel II regulations have been built on the notion of Value-at-Risk
[47]. The Value-at-Risk (VaR) at level α is defined as the threshold value such that the
probability of losses greater than the threshold is less than α. In our case we consider
instantaneous VaR constraints, which amount to imposing an upper bound on
the wealth volatility, since locally the diffusion processes have normal distributions.
Therefore, the instantaneous VaR constraints are given by:

$$\theta^T \Sigma \theta \le L W^2$$

where $\theta$ is an N-by-1 vector of positions, $\Sigma$ is the instantaneous covariance matrix of
the spreads, $L$ is a proportionality constant that determines the tightness of the
constraint and $W$ is the investor’s wealth.
Collateral or margin constraints have been ubiquitous in financial transactions for
centuries. Even Shakespeare in “The Merchant of Venice” points out the importance
of collateral, as Shylock charged Antonio no interest but asked for a
pound of flesh as collateral. The collateral constraints provide protection against
mark-to-market losses whenever an investor generates a liability by shorting an asset.
Therefore, they require that the investor’s wealth is bounded below by the collateral
necessary to secure the liabilities. They are given by:

$$\sum_{i=1}^{N} \lambda_i |\theta_i| \le W$$

where $\lambda_i$ is the collateral necessary to secure the liability in spread i. In our work, each
unit of arbitrage should be understood as being relative to a fixed face or notional
amount and therefore each $\lambda_i$ is a percentage of this fixed face value or notional
amount.
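Both constraints are simple to check for a given vector of positions. The following is a minimal sketch; the function names and all numerical values are hypothetical and chosen only to show one portfolio that violates both constraints at low wealth and satisfies them at higher wealth.

```python
def var_constraint_ok(theta, cov, wealth, var_limit):
    """Instantaneous VaR constraint: theta' cov theta <= L * W^2."""
    n = len(theta)
    quad = sum(theta[i] * cov[i][j] * theta[j]
               for i in range(n) for j in range(n))
    return quad <= var_limit * wealth ** 2


def margin_constraint_ok(theta, collateral, wealth):
    """Collateral constraint: sum_i lambda_i * |theta_i| <= W."""
    return sum(lam * abs(pos) for lam, pos in zip(collateral, theta)) <= wealth


# Hypothetical positions in N = 2 spreads (long the first, short the second).
theta = [10.0, -5.0]
cov = [[0.04, 0.01], [0.01, 0.09]]   # instantaneous covariance of the spreads
collateral = [0.1, 0.1]              # lambda_i as a fraction of notional

for wealth in (1.0, 2.0):
    print(wealth,
          var_constraint_ok(theta, cov, wealth, var_limit=3.0),
          margin_constraint_ok(theta, collateral, wealth))
```

Note that the VaR bound scales with the square of wealth while the collateral bound scales linearly, so the two constraints do not tighten at the same rate when wealth falls.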
1.2.3 Solution
Let us now find the optimal trading strategy of a risk averse investor who maximizes
the expected logarithm of his final wealth $E(\ln W_T)$. We consider two cases:
• The investor invests in the risk free asset and in N correlated convergence trades.
• The investor invests in the risk free asset and in N correlated mean reversion
trading strategies.
The analysis is similar for both cases. In both cases we have:

$$W_t = \sum_{i=1}^{N} \theta_{it} S_{it} + \theta_{0t} B_{0t} \quad \forall t \in [0, T] \tag{1.3}$$

where $\theta_{it}$ is the investor’s position in opportunity i for $i = 1, \cdots, N$, $\theta_{0t}$ is the
investor’s position in the risk free asset, $S_{it}$ is the spread of the convergence trade or
the value of the mean reverting portfolio and $B_{0t}$ is the price of the risk free asset.
The process $\theta_t$ is adapted to the filtration generated by the Brownian motion $Z_t$.
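The investor solves the following problem:

$$\begin{aligned}
\underset{\theta \in \Theta}{\text{maximize}} \quad & E(\ln W_T) \\
\text{subject to} \quad & dW_t = \sum_{i=1}^{N} \theta_{it}\,dS_{it} + \theta_{0t}\,dB_{0t} \\
& dS_t = \mu(S, t)\,dt + \sigma(S, t)\,dZ_t
\end{aligned} \tag{1.4}$$

where $\Theta$ is the set of admissible trading strategies. Let us first define
$F_t = \theta_t / W_t \in \mathbb{R}^N$ for all $t \in [0, T]$.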
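For the convergence trades case, the investor’s wealth satisfies the following stochastic
differential equation:

$$dW_t = W_t r\,dt + \sum_{i=1}^{N} \theta_{it} S_{it}\left(-\frac{a_i}{T-t} - r\right)dt + \theta_t^T \sigma\,dZ_t$$

$$\frac{dW_t}{W_t} = r\,dt + \sum_{i=1}^{N} F_{it} S_{it}\left(-\frac{a_i}{T-t} - r\right)dt + F_t^T \sigma\,dZ_t$$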
By applying Ito’s Lemma we have that:

$$d(\ln W_t) = r\,dt + \sum_{i=1}^{N} F_{it} S_{it}\left(-\frac{a_i}{T-t} - r\right)dt - \frac{1}{2} F_t^T \Sigma F_t\,dt + F_t^T \sigma\,dZ_t$$
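Therefore:

$$\begin{aligned}
\ln(W_T) = \ln(W_t) &+ \int_t^T r_s\,ds \\
&+ \int_t^T \left(\sum_{i=1}^{N} F_{is} S_{is}\left(-\frac{a_i}{T-s} - r_s\right) - \frac{1}{2} F_s^T \Sigma F_s\right)ds \\
&+ \int_t^T F_s^T \sigma\,dZ_s
\end{aligned} \tag{1.5}$$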
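Assuming a constant interest rate, we have:

$$\begin{aligned}
E_t(\ln(W_T)) = \ln(W_t) &+ r(T - t) \\
&+ E_t\left(\int_t^T \left(\sum_{i=1}^{N} F_{is} S_{is}\left(-\frac{a_i}{T-s} - r\right) - \frac{1}{2} F_s^T \Sigma F_s\right)ds\right) \\
&+ E_t\left(\int_t^T F_s^T \sigma\,dZ_s\right)
\end{aligned} \tag{1.6}$$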
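For the mean reversion trading strategies case, the investor’s wealth satisfies the following
stochastic differential equation:

$$dW_t = W_t r\,dt + \sum_{i=1}^{N} \theta_{it}\left(-\Phi_i^T (S_t - \bar{S}) - r S_{it}\right)dt + \theta_t^T \sigma\,dZ_t$$

$$\frac{dW_t}{W_t} = r\,dt + \sum_{i=1}^{N} F_{it}\left(-\Phi_i^T (S_t - \bar{S}) - r S_{it}\right)dt + F_t^T \sigma\,dZ_t$$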
where $\Phi_i$ is the i’th row of the transition matrix $\Phi$. By applying Ito’s Lemma we
have that:

$$d(\ln W_t) = r\,dt + \sum_{i=1}^{N} F_{it}\left(-\Phi_i^T (S_t - \bar{S}) - r S_{it}\right)dt - \frac{1}{2} F_t^T \Sigma F_t\,dt + F_t^T \sigma\,dZ_t$$
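Both log-wealth equations, for the convergence trades and for the mean reversion strategies, rest on the same Itô step. As a brief sketch, assume $\Sigma = \sigma\sigma^T$, which follows from $\Sigma$ being the instantaneous covariance matrix of spreads driven by $\sigma\,dZ_t$, and write $m_t$ for the drift of $dW_t / W_t$ (a shorthand introduced here only for illustration):

```latex
\begin{aligned}
\frac{dW_t}{W_t} &= m_t\,dt + F_t^T \sigma\,dZ_t, \\
d(\ln W_t) &= \frac{dW_t}{W_t} - \frac{1}{2}\,\frac{(dW_t)^2}{W_t^2}
            = m_t\,dt + F_t^T \sigma\,dZ_t - \frac{1}{2} F_t^T \sigma\sigma^T F_t\,dt \\
           &= m_t\,dt - \frac{1}{2} F_t^T \Sigma F_t\,dt + F_t^T \sigma\,dZ_t .
\end{aligned}
```

The quadratic variation term is the only place where volatility enters the drift of log wealth, which is why the optimization later in this section trades off a term linear in $F_t$ against $\frac{1}{2} F_t^T \Sigma F_t$.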
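Therefore:

$$\begin{aligned}
\ln(W_T) = \ln(W_t) &+ \int_t^T r_s\,ds \\
&+ \int_t^T \left(\sum_{i=1}^{N} F_{is}\left(-\Phi_i^T (S_s - \bar{S}) - r S_{is}\right) - \frac{1}{2} F_s^T \Sigma F_s\right)ds \\
&+ \int_t^T F_s^T \sigma\,dZ_s
\end{aligned} \tag{1.7}$$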
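Assuming a constant interest rate, we have:

$$\begin{aligned}
E_t(\ln(W_T)) = \ln(W_t) &+ r(T - t) \\
&+ E_t\left(\int_t^T \left(\sum_{i=1}^{N} F_{is}\left(-\Phi_i^T (S_s - \bar{S}) - r S_{is}\right) - \frac{1}{2} F_s^T \Sigma F_s\right)ds\right) \\
&+ E_t\left(\int_t^T F_s^T \sigma\,dZ_s\right)
\end{aligned} \tag{1.8}$$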
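Under VaR constraints it is:

$$F_t^T \Sigma F_t \le L < \infty \quad \forall t$$

Under the margin constraints it is:

$$\sum_{i=1}^{N} \lambda_i |F_{it}| \le 1 \quad \forall t$$

$$F_t^T \Sigma F_t = \sum_{i=1}^{N} \sum_{j=1}^{N} F_{it} F_{jt} \sigma_{ij} \le \sum_{i=1}^{N} \sum_{j=1}^{N} \lambda_i \lambda_j |F_{it}| |F_{jt}| \frac{|\sigma_{ij}|}{\lambda_i \lambda_j} < C < \infty \quad \forall t$$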
Therefore, for both cases and both constraints the integrand of the stochastic
integral belongs to $H^2$, which is a sufficient condition for the stochastic integral to be
a martingale. Consequently, $E_t\left(\int_t^T F_s^T \sigma\,dZ_s\right)$ is equal to 0.
Maximizing $E_t(\ln(W_T))$ is equivalent to maximizing the third term in equations
(1.6) and (1.8) for the two cases respectively. Let’s now study in detail the solution for
both cases under both constraints.
VaR constraint
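Maximizing $E_t(\ln(W_T))$ under the VaR constraint is equivalent to solving, for all t, the
following QCQP:

$$\begin{aligned}
\text{minimize} \quad & F_t^T \mu_t + \frac{1}{2} F_t^T \Sigma F_t \\
\text{subject to} \quad & F_t^T \Sigma F_t \le L
\end{aligned} \tag{1.9}$$

where

$$\mu_t = \begin{bmatrix} S_{1t}\left(\frac{a_1}{T-t} + r\right) \\ \vdots \\ S_{Nt}\left(\frac{a_N}{T-t} + r\right) \end{bmatrix} \tag{1.10}$$

for the convergence trades case and

$$\mu_t = \begin{bmatrix} \Phi_1^T (S_t - \bar{S}) + r S_{1t} \\ \vdots \\ \Phi_N^T (S_t - \bar{S}) + r S_{Nt} \end{bmatrix} \tag{1.11}$$

for the mean reversion trading strategies case.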
We can easily solve problem (1.9) by applying the KKT conditions or by geometry (see
Appendix). $F_t^{opt}$, $\lambda_t^{opt}$ are optimal iff they satisfy the following KKT
conditions ([10]):
• Primal feasibility: $F_t^{opt\,T} \Sigma F_t^{opt} \le L$
• Dual feasibility: $\lambda_t^{opt} \ge 0$
• Complementary slackness: $\lambda_t^{opt} (F_t^{opt\,T} \Sigma F_t^{opt} - L) = 0$
• Minimization of the Lagrangian: $F_t^{opt} = \arg\min \mathcal{L}(F_t, \lambda_t^{opt})$
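By solving the KKT conditions (see Appendix for details) we find that:

$$\theta_t^{opt} = \begin{cases} -\Sigma^{-1} \mu_t W_t & \text{if } \mu_t^T \Sigma^{-1} \mu_t \le L \\[4pt] -\dfrac{\Sigma^{-1} \mu_t W_t}{\sqrt{\mu_t^T \Sigma^{-1} \mu_t / L}} & \text{if } \mu_t^T \Sigma^{-1} \mu_t \ge L \end{cases}$$

This is equivalently written as:

$$\theta_t^{opt} = -\frac{\Sigma^{-1} \mu_t W_t}{1 + \lambda_t^{opt}} \quad \text{where} \quad 1 + \lambda_t^{opt} = \max\left(1, \sqrt{\frac{\mu_t^T \Sigma^{-1} \mu_t}{L}}\right)$$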
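The closed-form solution above translates directly into a few lines of code. The following is a minimal sketch that uses a small Gaussian-elimination helper instead of a linear algebra library. The function names and the numerical inputs are hypothetical, and the returned `scale` is the quantity written as 1 + λ in the formula.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def var_optimal_positions(mu, cov, wealth, var_limit):
    """theta = -cov^{-1} mu W / max(1, sqrt(mu' cov^{-1} mu / L))."""
    inv_mu = solve_linear(cov, mu)                    # cov^{-1} mu
    signal = sum(m * x for m, x in zip(mu, inv_mu))   # mu' cov^{-1} mu
    scale = max(1.0, math.sqrt(signal / var_limit))   # equals 1 + lambda_opt
    return [-x * wealth / scale for x in inv_mu], scale


# Hypothetical inputs for N = 2 opportunities.
mu = [0.3, -0.2]
cov = [[0.04, 0.01], [0.01, 0.09]]
theta_tight, scale_tight = var_optimal_positions(mu, cov, 1.0, var_limit=3.0)
theta_loose, scale_loose = var_optimal_positions(mu, cov, 1.0, var_limit=5.0)
print(theta_tight, scale_tight)
print(theta_loose, scale_loose)
```

With the tighter limit the unconstrained position is scaled down until the constraint holds with equality; with the looser limit the scale factor is 1 and the unconstrained myopic position is returned unchanged.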
Let’s now discuss the properties of the solution in more detail. The investor has logarithmic
preferences. Therefore, he is a myopic optimizer: there is no hedging demand [59].
At each time t he looks dt ahead and decides how to trade in an optimal way. There
are two cases to consider:
• Case 1: At time t, $\mu_t^T \Sigma^{-1} \mu_t \le L$. In this case, the optimal solution is the
unconstrained myopic optimal solution, since it satisfies the VaR constraint.
For the convergence trades case, this is equivalent to the spread $S_t$ being in the
ellipsoid $E_t = \{S \mid S^T (A_t \Sigma^{-1} A_t) S \le L\}$ where
$A_t = \mathrm{diag}\left(\frac{a_1}{T-t} + r, \cdots, \frac{a_N}{T-t} + r\right)$.
The volume of the ellipsoid $E_t$ is shrinking as $t \to T$, since
$\mathrm{vol}(E_t) = \prod_{i=1}^{N} \left(\frac{a_i}{T-t} + r\right)^{-1} \sqrt{\det(\Sigma)}\,\mathrm{vol}(B(0, 1))$,
where $B(0, 1)$ is the unit sphere. Figure 1-1 shows this
shrinking ellipsoid at three time instants.
For the mean reversion trading strategies case, this is equivalent to the spread or
value of the trade being inside the convex set
$C = \{S \mid (S - \bar{S})^T ((\Phi + rI)^T \Sigma^{-1} (\Phi + rI))(S - \bar{S}) + 2r \bar{S}^T \Sigma^{-1} (\Phi + rI)(S - \bar{S}) \le L - r^2 \bar{S}^T \Sigma^{-1} \bar{S}\}$,
which in the case of $r = 0$ is the ellipsoid
$C = \{S \mid (S - \bar{S})^T (\Phi^T \Sigma^{-1} \Phi)(S - \bar{S}) \le L\}$. If $\bar{S} = 0$ this
convex set is also an ellipsoid.
These ellipsoids characterize poor opportunities where the constraints are not
active. What constitutes poor investment opportunities changes over time for
the case of convergence trades, while it remains invariant for the mean reversion
trades case. For the case of convergence trades, the same spreads initially can be
considered poor investment opportunities, where the investor does not bind the
constraint, he is more conservative, but after some time they can be considered
good opportunities and the investor becomes more aggressive and binds the
constraint.
Informally, when the investment opportunities are poor, the spreads are more
likely to widen, which would then lead to mark-to-market losses, and the investor
would not have sufficient wealth to take advantage of the better investment opportunities
and simultaneously satisfy the VaR constraints. Therefore, the investor
is more conservative.
• Case 2: At time t, $\mu_t^T \Sigma^{-1} \mu_t > L$. Now the unconstrained myopic optimal solution
does not satisfy the VaR constraint. This case is equivalent to the spread
$S_t$ being outside the shrinking ellipsoid $E_t$ for the convergence trades case or the
set $C$ for the mean reversion trades case. Now the investment opportunities are
good. The investor wants to follow the unconstrained optimal trading strategy,
but due to the VaR constraint invests in the fraction of this optimal trading
strategy necessary to satisfy the VaR constraint.
[Figure 1-1 plot: “Ellipsoids of poor investment opportunities for t=0.3, 0.6, 0.9.” Horizontal axis: Spread 1; vertical axis: Spread 2.]
Figure 1-1: Ellipsoids. Ellipsoids of poor investment opportunities for N=2 convergence trades at times t = 0.3, 0.6, 0.9.
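The way the region of poor opportunities shrinks for convergence trades can be checked with a short membership test. This is only a sketch under assumed inputs: two uncorrelated spreads with volatility 0.1, `a = 1` for both, `r = 0.02`, `T = 1` and `L = 3` are hypothetical values and are not the parameters behind Figure 1-1.

```python
def inside_poor_region(spread, a, r, horizon, t, cov_inv, var_limit):
    """True when S' (A_t cov^{-1} A_t) S <= L, with A_t = diag(a_i/(T-t) + r)."""
    scaled = [s * (a_i / (horizon - t) + r)          # A_t S, i.e. mu_t
              for s, a_i in zip(spread, a)]
    n = len(scaled)
    quad = sum(scaled[i] * cov_inv[i][j] * scaled[j]
               for i in range(n) for j in range(n))
    return quad <= var_limit


# Hypothetical setup: the same spread observed at three different times.
spread = [0.05, -0.03]
cov_inv = [[100.0, 0.0], [0.0, 100.0]]   # inverse of diag(0.1^2, 0.1^2)
status = [inside_poor_region(spread, a=[1.0, 1.0], r=0.02, horizon=1.0,
                             t=t, cov_inv=cov_inv, var_limit=3.0)
          for t in (0.3, 0.6, 0.9)]
print(status)
```

In this example the same spread lies inside the ellipsoid at the two earlier times and outside it close to the horizon, so an investor who initially holds the unconstrained position later hits the VaR constraint.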
Margin constraint
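Maximizing $E_t(\ln(W_T))$ under the margin constraint is equivalent to solving, for all t, the
following convex program:

$$\begin{aligned}
\text{minimize} \quad & F_t^T \mu_t + \frac{1}{2} F_t^T \Sigma F_t \\
\text{subject to} \quad & \sum_{i=1}^{N} \lambda_i |F_{it}| \le 1
\end{aligned} \tag{1.12}$$

where

$$\mu_t = \begin{bmatrix} S_{1t}\left(\frac{a_1}{T-t} + r\right) \\ \vdots \\ S_{Nt}\left(\frac{a_N}{T-t} + r\right) \end{bmatrix}$$

for the convergence trades case and

$$\mu_t = \begin{bmatrix} \Phi_1^T (S_t - \bar{S}) + r S_{1t} \\ \vdots \\ \Phi_N^T (S_t - \bar{S}) + r S_{Nt} \end{bmatrix}$$

for the mean reversion trading strategies case.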
Let’s apply the KKT conditions. $F_t^{opt}$, $\nu_t^{opt}$ are optimal iff they satisfy the KKT
conditions:
• Primal feasibility: $\sum_{i=1}^{N} \lambda_i |F_{it}| \le 1$
• Dual feasibility: $\nu_t^{opt} \ge 0$
• Complementary slackness: $\nu_t^{opt} \left(\sum_{i=1}^{N} \lambda_i |F_{it}| - 1\right) = 0$
• Minimization of the Lagrangian: $F_t^{opt} = \arg\min \mathcal{L}(F_t, \nu_t^{opt})$
This program cannot be solved analytically in general. Again there are two cases to
consider.
• Case 1: At time t, $\|\Lambda \Sigma^{-1} \mu_t\|_1 \le 1$ where $\Lambda = \mathrm{diag}(\lambda_1, \cdots, \lambda_N)$. In this case, the
optimal solution is the unconstrained myopic optimal solution, since it satisfies
the margin constraint.
For the convergence trades, this is equivalent to having at time t:
$\|\Lambda \Sigma^{-1} A_t S_t\|_1 \le 1$ where $\Lambda = \mathrm{diag}(\lambda_1, \cdots, \lambda_N)$ and
$A_t = \mathrm{diag}\left(\frac{a_1}{T-t} + r, \cdots, \frac{a_N}{T-t} + r\right)$. In this case
we have that $S_t$ is inside a “diamond” in N dimensional space, which shrinks
as $t \to T$.
For the mean reversion trades, this is equivalent to having at time t:
$\|\Lambda \Sigma^{-1} (\Phi(S_t - \bar{S}) + r S_t)\|_1 \le 1$ where $\Lambda = \mathrm{diag}(\lambda_1, \cdots, \lambda_N)$.
Informally again, when the investment opportunities are poor, the spreads are
more likely to widen, which would then lead to mark-to-market losses, and the
investor would not have sufficient wealth to take advantage of the better investment
opportunities and still have enough wealth for the collateral necessary to secure the
liabilities.
• Case 2: At time t, $\|\Lambda \Sigma^{-1} \mu_t\|_1 > 1$ where $\Lambda = \mathrm{diag}(\lambda_1, \cdots, \lambda_N)$. Now the
investment opportunities are good, the unconstrained myopic optimal solution
does not satisfy the collateral constraint and the constraint binds at the optimal
solution.
Uncorrelated opportunities
There is a special case, when the trading opportunities are uncorrelated, where we can solve the KKT conditions analytically (see Appendix for details). In that case the optimal positions are given by:

$$
\theta_{it}^{opt} = \frac{\mathrm{sign}(-\mu_{it}) \left( \left| \frac{\mu_{it}}{\lambda_i} \right| - \nu_t^{opt} \right)^{+}}{\sigma_i^2 / \lambda_i} \, W_t
\tag{1.13}
$$
We observe the following:
• First of all, for the convergence trades, if the spread is positive we short the spread, as we would expect, and if it is negative we are long the spread. For the mean reversion trades, the sign of the position is the opposite of the sign of $\Phi_i^T (S_t - \bar{S}) + r S_{it}$.
• Second, if $|\mu_{it}|$ is high relative to the collateral $\lambda_i$, then the magnitude of the position is higher.
• Third, if the variability of the opportunity is high, the magnitude of the corresponding position is low.
• Finally, the most interesting property of the solution is that it has a cutoff value, the dual variable. If the absolute value of $\mu_{it}$ over the collateral $\lambda_i$ is greater than the dual variable, the position is different from zero; otherwise the position is 0.
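It is:

$$
\nu_t^{opt} = 0 \ \text{ if } \ \sum_{i=1}^{N} \frac{|\lambda_i \mu_{it}|}{\sigma_i^2} \le 1
\qquad \text{and} \qquad
\nu_t^{opt} > 0 \ \text{ if } \ \sum_{i=1}^{N} \frac{|\lambda_i \mu_{it}|}{\sigma_i^2} > 1
$$

The dual variable is 0 when the investment opportunities are poor. It is easy to see that when the margin constraint binds we have:

$$
\tilde{F}_{it}^{opt} = \frac{\mathrm{sign}(-\mu_{it}) \left( \left| \frac{\mu_{it}}{\lambda_i} \right| - \nu_t^{opt} \right)^{+}}{\sigma_i^2 / \lambda_i^2}
\quad \text{and} \quad \|\tilde{F}\|_1 = 1
\tag{1.14}
$$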
[Figure 1-2 plot: bar chart titled "Weights in different arbitrage opportunities"; vertical axis "Weights" from 0 to 1, horizontal axis opportunities 1 to 6.]
Figure 1-2: Weights for the case of uncorrelated spreads and collateral constraint. Weights for the case of uncorrelated spreads.
In Figure 1-2 we can see an example of how we invest in different convergence trades when there is no correlation among them, with λ = 1 and volatilities equal to 1 for all the opportunities. The height of each bar is the absolute value of $\mu_{it}$, and we invest only in those spreads where $|\mu_{it}|$ is larger than the dual variable. If $\sum_{i=1}^{N} |\mu_{it}| < 1$, then the dual variable is 0, we invest in all the opportunities and the collateral constraint does not bind. If $\sum_{i=1}^{N} |\mu_{it}| > 1$, as in the figure, then the margin constraint binds, the dual variable is positive and we can find it as follows. We start with ν at the maximum $|\mu_{it}|$ and then reduce it until the sum of the weights is equal to 1, where each weight is the distance between the absolute value of $\mu_{it}$ and ν (and 0 for bars below ν).
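The search for the dual variable described above can be sketched numerically. The following is a minimal illustration of equation 1.13 for uncorrelated opportunities, not the code used for the figures; the function name `margin_weights` and the use of bisection (instead of lowering ν bar by bar) are our own assumptions.

```python
import numpy as np

def margin_weights(mu, sigma2, lam):
    """Fractions of wealth F_i = theta_i / W and dual variable nu for
    uncorrelated opportunities under the margin constraint (eq. 1.13).
    Hypothetical helper; bisection is an illustrative choice."""
    mu = np.asarray(mu, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    lam = np.asarray(lam, dtype=float)
    ratio = np.abs(mu) / lam        # |mu_i / lambda_i|, the bar heights
    scale = lam**2 / sigma2         # collateral used per unit of (ratio - nu)

    def collateral(nu):
        # sum_i lambda_i |F_i| as a function of the cutoff nu
        return float(np.sum(scale * np.maximum(ratio - nu, 0.0)))

    if collateral(0.0) <= 1.0:
        nu = 0.0                    # constraint is slack, no cutoff needed
    else:
        lo, hi = 0.0, float(ratio.max())
        for _ in range(200):        # collateral(nu) is decreasing in nu
            mid = 0.5 * (lo + hi)
            if collateral(mid) > 1.0:
                lo = mid
            else:
                hi = mid
        nu = hi
    F = np.sign(-mu) * np.maximum(ratio - nu, 0.0) * lam / sigma2
    return F, nu

# Four opportunities with lambda = 1 and unit variances, as in the setting of Figure 1-2.
F, nu = margin_weights([0.9, -0.6, 0.3, 0.1], [1, 1, 1, 1], [1, 1, 1, 1])
```

In this example the sum of the $|\mu_{it}|$ exceeds 1, so the constraint binds and the smallest opportunity falls below the cutoff and receives no weight.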
1.2.4 Connection with Ridge and Lasso regression
Before we explore further the properties and the results of the optimal trading strategies, it would be interesting to digress for a while and see what connection there is
between our problems and the regularized regressions.
In the basic form of regularized regression, the goal is not only to have a good fit,
but also regression coefficients that are “small”. Two of the most common forms of
regularized regressions are the Ridge and Lasso regression.
Ridge regression shrinks the regression coefficients by imposing a penalty on their
size [42]. Equation 1.15 is one of the ways to write the Ridge problem.
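$$
\begin{aligned}
\text{minimize} \quad & \sum_{i=1}^{N} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij} \beta_j \Big)^2 \\
\text{subject to} \quad & \sum_{j=1}^{p} \beta_j^2 \le t
\end{aligned}
\tag{1.15}
$$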
The Ridge regression coefficients solution is similar to the optimal trading strategy followed by a risk averse investor with logarithmic preferences, who can choose among N diffusion processes and faces VaR constraints. In both cases we have a proportional shrinkage, where all the weights are scaled down by a common factor.
Lasso regression is another common form of a regularized regression. It can be
used as a heuristic for finding a sparse solution. It does a kind of continuous subset
selection [16]. Equation 1.16 is one of the ways to write the Lasso problem.
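$$
\begin{aligned}
\text{minimize} \quad & \sum_{i=1}^{N} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij} \beta_j \Big)^2 \\
\text{subject to} \quad & \sum_{j=1}^{p} |\beta_j| \le t
\end{aligned}
\tag{1.16}
$$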
The Lasso regression coefficients solution is similar to the optimal trading strategy
followed by a risk averse investor with logarithmic preferences, who can choose among
N diffusion processes and faces margin constraints. Therefore, we can expect that in
this case we will have a sparse solution where the weights of several of the opportunities will be 0.
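The contrast between the two kinds of shrinkage can be made concrete with a small sketch. We assume here, for illustration only, an orthonormal design ($X^T X = I$) and the penalized forms $\|y - X\beta\|^2 + \text{penalty} \cdot \|\beta\|_2^2$ and $\|y - X\beta\|^2 + \text{penalty} \cdot \|\beta\|_1$ rather than the constrained forms 1.15 and 1.16; the function names are hypothetical.

```python
import numpy as np

def ridge_shrink(beta_ols, penalty):
    """Ridge solution under an orthonormal design (assumption):
    every OLS coefficient is scaled by the same factor 1 / (1 + penalty)."""
    return np.asarray(beta_ols, dtype=float) / (1.0 + penalty)

def lasso_shrink(beta_ols, penalty):
    """Lasso solution under an orthonormal design (assumption):
    soft-thresholding at penalty / 2, so small coefficients become exactly 0."""
    b = np.asarray(beta_ols, dtype=float)
    return np.sign(b) * np.maximum(np.abs(b) - penalty / 2.0, 0.0)

beta_ols = [3.0, -1.0, 0.2]
ridge = ridge_shrink(beta_ols, 1.0)   # all coefficients halved, none set to zero
lasso = lasso_shrink(beta_ols, 1.0)   # the smallest coefficient is set to zero
```

The Lasso rule has the same soft-thresholding form as equation 1.13, with the dual variable playing the role of the threshold.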
1.3 Results
Let us move on now to the results first for the convergence trades and then for the
mean reversion trading strategies.
1.3.1 Convergence trades
VaR constraints. We have simulated the optimal trading strategy for N = 2
correlated convergence trading opportunities under VaR constraints. We find the
following:
• It is often optimal for the investor to underinvest, i.e., to leave the constraint slack.
• The investor typically experiences losses early before locking in a profit, as we can see in Figures 1-3, 1-4, 1-5, 1-6.
• Tighter constraints lead to less variability and less skewness in the distribution of wealth. They also lead to lower final wealth, as we can see in Figures 1-5, 1-6.
• The wealth is higher when the opportunities hedge each other, as we can see in Figures 1-4, 1-6. This makes sense because when the constraints are binding we care more about losing money, since losses would then surely lead to liquidation just when the investment opportunities are better; we therefore prefer the opportunities to hedge each other. Figure 1-7 shows a typical path for the wealth evolution using the same noise process for positive and negative correlation under the VaR constraint. We see clearly this hedging effect, where negative correlation leads to more wealth.
• When the initial values of the convergence trades are low, the constraints bind for a small percentage of time and final wealth is negatively correlated to the percentage of time the constraints bind. Figure 1-8 shows this effect.
• The final portfolio wealth is highly positively skewed, as is evident in Figures 1-3, 1-4, 1-5, 1-6.
For all the simulations we used: σ1 = σ2 = 1, a1 = a2 = 1, S[0] = [1; 1], rf =
0.06, number of steps = 1000.
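As a rough illustration of the spread dynamics behind these simulations, the following sketch discretizes $dS_{it} = -\frac{a_i S_{it}}{T-t} dt + \sum_k \sigma_{ik} dZ_{kt}$ with the parameters listed above. The Euler scheme, the Cholesky factor used for σ, the seed and the function name are our assumptions; the text does not specify how the paths were generated.

```python
import numpy as np

def simulate_convergence_spreads(a, sigma, s0, T=1.0, n_steps=1000, seed=0):
    """Euler discretization (an assumption) of the convergence-trade spreads.
    sigma is the N x K loading matrix, so that Sigma = sigma @ sigma.T."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    dt = T / n_steps
    path = np.empty((n_steps + 1, len(s0)))
    path[0] = s0
    for n in range(n_steps):
        t = n * dt
        dz = rng.standard_normal(sigma.shape[1]) * np.sqrt(dt)
        # the pull toward zero grows like 1 / (T - t) as t approaches T
        path[n + 1] = path[n] - a * path[n] / (T - t) * dt + sigma @ dz
    return path

rho = 0.5
sigma = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))  # unit volatilities
path = simulate_convergence_spreads(a=[1.0, 1.0], sigma=sigma, s0=[1.0, 1.0])
```

Because the drift coefficient grows without bound as t → T, the simulated spreads are forced close to 0 at the final step, which is the defining property of a convergence trade.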
[Figure 1-3 plot: four panels titled "Distribution of wealth", at Time 0.25, 0.5, 0.75 and 1 with rho 0.5; horizontal axis from 0 to 4000, vertical axis from 0 to 100.]
Figure 1-3: VaR constraints, positive correlations. Wealth distribution at t =
0.25, 0.5, 0.75, 1 for an investor who invests in two positively correlated (ρ = 0.5)
convergence trades, while facing VaR constraints (K=1). Initial wealth is $100.
[Figure 1-4 plot: four panels, at Time 0.25, 0.5, 0.75 and 1 with rho −0.5 and K 1; horizontal axis from 0 to 4000, vertical axis from 0 to 100.]
Figure 1-4: VaR constraints, negative correlations. Wealth distribution at t =
0.25, 0.5, 0.75, 1 for an investor who invests in two negatively correlated (ρ = −0.5)
convergence trades, while facing VaR constraints (K=1). Initial wealth is $100.
[Figure 1-5 plot: four panels, at Time 0.25, 0.5, 0.75 and 1 with rho 0.5 and K 0.25; horizontal axis from 0 to 4000, vertical axis from 0 to 100.]
Figure 1-5: VaR constraints, positive correlations, tight constraints. Wealth
distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two positively correlated (ρ = 0.5) convergence trades, while facing VaR constraints (K=0.25). Initial
wealth is $100.
[Figure 1-6 plot: four panels, at Time 0.25, 0.5, 0.75 and 1 with rho −0.5 and K 0.25; horizontal axis from 0 to 4000, vertical axis from 0 to 100.]
Figure 1-6: VaR constraints, negative correlations, tight constraints. Wealth
distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests in two negatively
correlated (ρ = −0.5) convergence trades, while facing VaR constraints (K=0.25).
Initial wealth is $100.
[Figure 1-7 plot: two paths labeled "rho 0.5" and "rho −0.5"; vertical axis "Final wealth" from 0 to 1400, horizontal axis "Simulation step" from 0 to 1200.]
Figure 1-7: Wealth evolution under VaR constraint. Typical path of the wealth
evolution for an investor investing in two convergence trades using the same noise
process for positive and negative correlation under the VaR constraint. Initial wealth
is $100.
[Figure 1-8 plot: vertical axis "Final wealth" from 0 to 1600, horizontal axis "Frequency the constraint binds" from 0 to 1.]
Figure 1-8: Relation between final wealth and frequency the VaR constraint
binds. Final wealth is negatively correlated to the percentage of time the constraints
bind when the initial values of the convergence trades are low.
Margin constraints. We have also simulated the optimal trading strategy for N = 2 correlated convergence trading opportunities under margin constraints, using the same noise process as with the VaR constraints. The results are similar to the case of VaR constraints, as we see in Figures 1-9, 1-10, 1-11, 1-12, 1-13, with the following important differences:
• When the constraints bind, it is often the case that the position in one of the convergence trades is 0, i.e., we have less diversification and a sparse solution. Figure 1-15 shows a typical path of the positions in two convergence trading opportunities under VaR constraints, where we see that they tend to be different from 0. Figure 1-16 shows the evolution of the positions in two convergence trading opportunities under margin constraints for exactly the same asset processes as before. We clearly see that we often invest in only one position, as we expected due to the similarity of the positions to the Lasso regression coefficients.
• The final wealth is less skewed and smaller than in the case of VaR constraints.
[Figure 1-9 plot: four panels, at Time 0.25, 0.5, 0.75 and 1 with rho 0.5 and Collateral 1; horizontal axis from 0 to 4000, vertical axis from 0 to 100.]
Figure 1-9: Margin constraints, positive correlations. Wealth distribution at
t = 0.25, 0.5, 0.75, 1 for an investor who invests in two positively correlated (ρ = 0.5)
convergence trades, while facing margin constraints (Collateral = 1). Initial wealth
is $100.
[Figure 1-10 plot: four panels, at Time 0.25, 0.5, 0.75 and 1 with rho −0.5 and Collateral 1; horizontal axis from 0 to 4000, vertical axis from 0 to 100.]
Figure 1-10: Margin constraints, negative correlations. Wealth distribution at
t = 0.25, 0.5, 0.75, 1 for an investor who invests in two negatively correlated (ρ = −0.5)
convergence trades, while facing margin constraints (Collateral = 1). Initial wealth
is $100.
[Figure 1-11 plot: four panels, at Time 0.25, 0.5, 0.75 and 1 with rho 0.5 and Collateral 2; horizontal axis from 0 to 4000, vertical axis from 0 to 100.]
Figure 1-11: Margin constraints, positive correlations, more collateral
needed. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests
in two positively correlated (ρ = 0.5) convergence trades, while facing margin constraints (Collateral = 2). Initial wealth is $100.
[Figure 1-12 plot: four panels, at Time 0.25, 0.5, 0.75 and 1 with rho −0.5 and Collateral 2; horizontal axis from 0 to 4000, vertical axis from 0 to 100.]
Figure 1-12: Margin constraints, negative correlations, more collateral
needed. Wealth distribution at t = 0.25, 0.5, 0.75, 1 for an investor who invests
in two negatively correlated (ρ = −0.5) convergence trades, while facing margin constraints (Collateral = 2). Initial wealth is $100.
[Figure 1-13 plot: two paths labeled "rho 0.5" and "rho −0.5"; vertical axis "Final wealth" from 0 to 700, horizontal axis "Simulation step" from 0 to 1200.]
Figure 1-13: Wealth evolution under margin constraint. Typical path of the
wealth evolution for an investor investing in two convergence trades using the same
noise process for positive and negative correlation under the margin constraint. Initial
wealth is $100.
[Figure 1-14 plot: vertical axis "Final wealth" from 0 to 1600, horizontal axis "Frequency the constraint binds" from 0 to 1.]
Figure 1-14: Relation between final wealth and frequency the margin constraint binds. Final wealth is negatively correlated to the percentage of time the
constraints bind when the initial values of the convergence trades are low.
[Figure 1-15 plot: two paths labeled "Convergence trade 1" and "Convergence trade 2"; vertical axis "Position" from −800 to 800, horizontal axis "Simulation step" from 0 to 1000.]
Figure 1-15: Positions evolution under VaR constraints. Typical path of the
positions in two convergence trading opportunities under VaR constraints.
[Figure 1-16 plot: two paths labeled "Convergence trade 1" and "Convergence trade 2"; vertical axis "Position" from −400 to 400, horizontal axis "Simulation step" from 0 to 1000.]
Figure 1-16: Positions evolution under margin constraints. Typical path of
the positions in two convergence trading opportunities under margin constraints.
1.3.2 Mean reversion trading opportunities
We obtain similar results by simulating the optimal trading strategy for N = 2 correlated mean reversion trading opportunities under VaR and margin constraints.
This makes sense since without loss of generality, we have assumed that S̄ = 0 and
that Φ is a diagonal matrix, which makes the mean reversion trades case similar to the
convergence trades case. They are different in that the drift term of the convergence
trades for the same spreads S gets better and better as t → T , while for the mean
reversion trades it remains constant.
1.4 Conclusions
We explored the optimal portfolio allocation of a risk averse investor who invests in
N convergence trades or mean reverting trading strategies, while facing constraints.
In particular, we studied the optimal trading strategy when he faces VaR constraints
or collateral constraints. In both cases the optimal trading strategy is found by solving a convex program at each time t. We characterized the solution of the convex program and derived the properties of the optimal trading strategy. Throughout the chapter, we have assumed that the investor completely trusts his model and is certain about the dynamics of the opportunities he faces. What happens if the model
is just an approximation? What happens if the investor believes that opportunities’
dynamics come from an unknown member of a set of unspecified models near an
approximating model? Concern about model misspecification will change the optimal
trading strategy of the investor and this is the topic of the next chapter.
Chapter 2
Optimal trading of arbitrage opportunities under model misspecification
A decision maker maximizes a utility function subject to a model. Standard control
theory helps a decision maker to make optimal decisions when his model is correct.
Robust control theory helps him to make good decisions when his model is approximately correct. In this chapter we will use methods of robust control theory to find
the optimal portfolio allocation of a risk averse investor, who invests in convergence
trades or mean reverting trading strategies, and is not completely confident about
the dynamics of his models.
In particular, we assume that the investor believes that the data comes from
an unknown member of a set of alternative models near his nominal model. These
alternative models are statistically difficult to distinguish from the nominal model.
The investor believes that his model is a pretty good approximation in the sense that
the discrepancies between the alternative models and his nominal model are small.
We will use the relative entropy to characterize the discrepancies between different
models. Concern about model misspecification leads the investor to choose a trading
strategy that is robust over the alternative models.
Three questions come naturally at this point:
• What does it mean to have a robust trading strategy?
A robust trading strategy is a strategy that works well over the set of alternative
probability models. We evaluate the worst performance of a given strategy over
that set of alternative probability models and we pick the one that maximizes
this worst case performance. It is essentially a “max-min” problem, a two-player
game in which a maximizing player chooses the best response to a malevolent
player who can disturb the stochastic model within limits.
• Why would we be interested in a robust decision rule over alternative models?
Why don’t we take a Bayesian approach, where we put a prior distribution over
the set of alternative probability models?
This could be another approach, but this set of alternative models may be too
large or too difficult for the investor to come up with a well behaved, plausible
prior distribution. In addition, we might want our solution to work well over
any kind of prior distribution [41].
• Why do we use the relative entropy to measure the discrepancy between an
alternative and the nominal model?
There are other ways to measure discrepancies between alternative probability
models, like Prokhorov distance [9] but the relative entropy with respect to a
measure P has nice properties and it is more tractable. It is given by:
$$
D(Q) = \int \log\left( \frac{dQ}{dP} \right) dQ
$$
and it is a convex function of the measure Q.
In the rest of this chapter, we will first review the relevant literature. Then we will discuss the set of alternative models, the relative entropy and equivalent ways to formulate our problem of the optimal robust portfolio allocation of the investor. Subsequently, we will find the optimal robust trading strategy of the investor and finally we will explore the characteristics of this robust strategy.
2.1 Literature review
Whittle [79], [80] has studied mathematical methods for answering the question of
how to make decisions when you don’t fully trust your model.
Lars Hansen and Thomas Sargent in [41] have studied how to make economic
decisions in the face of model misspecification by modifying and extending aspects
of robust control theory. Their work revolves mostly around the linear-quadratic
regulator framework, where there is a certainty equivalence principle that allows a
deterministic presentation of the control theory.
Gilboa and Schmeidler in [34] have studied the max-min expected utility problem
where the decision maker has multiple priors and maximizes his expected utility assuming that nature chooses a probability measure to minimize his expected utility.
The minimization is over a closed and convex set of finitely additive probability measures. Their axiomatic treatment views this set of non-unique priors as an expression
of the agent’s preferences and the priors are not cast as distortions to a nominal
model.
Lars Hansen et al. [40] have studied robust decision rules when the agent fears that
the data are generated by a statistical perturbation of an approximating model that
is either a controlled diffusion process or a control measure over continuous functions
of time. They describe how stochastic formulations of robust control “constraint
problems” can be viewed in terms of Gilboa and Schmeidler’s max-min expected
utility model. They show the connection between the penalty robust control problem
and the constraint robust control problem, two closely related problems and formulate
the Hamilton Jacobi Bellman equations for various two-player zero sum continuous
time games that are defined in terms of a Markov diffusion process. We extend their
framework to the problem of optimal robust trading rules for a risk averse investor
who does not trust his model dynamics, believes that his nominal model is a good
approximation to the real model and invests in arbitrage opportunities.
Fleming and Souganidis [77] present how the Bellman-Isaacs condition defines a
Bellman equation for a two-player zero-sum game in which both players decide at time
0 or recursively. In other words, they show that the freedom to exchange orders of
maximization and minimization guarantees that equilibria of games where the choices
are done under mutual commitment at time 0 and of games where the choices are
done sequentially by both agents coincide.
Anderson et al. [3] show how the set of perturbed models in our formulations
is difficult to distinguish statistically from the approximating model given a finite
sample of timeseries observations.
Jacobson [44] and Whittle [78] studied risk sensitive optimal control in the context
of discrete-time linear quadratic regulator decision problems. They showed how the
risk-sensitive control law can be computed by equivalently solving a robust penalty
problem.
We will now discuss first how to represent the alternative probability models over
which we want our decision rules to be robust and how relative entropy can be used
to describe their discrepancies from the nominal model. We will formulate two closely
related nonsequential problems and the corresponding recursive HJB equations and
finally we will find the optimal robust portfolio allocation for a risk averse investor who
is not confident about the dynamics of his models and wants to invest in convergence
trades or mean reversion trading strategies.
2.2 Analysis
In Chapter 1 we saw that in the case when there is no model misspecification, the
investor wants to find the optimal portfolio allocation that solves the following problem:
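$$
\begin{aligned}
\underset{\theta \in \Theta}{\text{maximize}} \quad & E(\ln W_T) \\
\text{subject to} \quad & dW_t = \theta_t \, dS_t + \theta_{0t} \, dB_t \\
& dS_t = \mu(S, t) \, dt + \sigma(S, t) \, dZ_t \\
& dB_t = r B \, dt
\end{aligned}
\tag{2.1}
$$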
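Here we have $S_t \in \mathbb{R}^N$ and we have studied the following special cases:

$$
dS_{it} = -\frac{a_i S_{it}}{T-t} \, dt + \sum_{k=1}^{K} \sigma_{ik} \, dZ_{kt}
$$

$$
dS_t = -\Phi(S_t - \bar{S}) \, dt + \sigma \, dZ_t
$$

and Θ is the set of admissible trading strategies:

$$
\Theta = \left\{ \theta \mid \theta^T \Sigma \theta \le L W^2 \right\}
$$

for the case of VaR constraints and

$$
\Theta = \left\{ \theta \;\Big|\; \sum_{i=1}^{N} \lambda_i |\theta_i| \le W \right\}
$$

for the case of the margin constraints.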
In this Chapter, the investor doubts his model dSt = µ(S, t)dt + σ(S, t)dZt . To
capture this doubt of the investor, we surround the approximating model with a cloud
of models that are statistically difficult to distinguish and we add a malevolent agent
who picks the worst possible model. The investor wants to find the optimal trading
strategy that solves the following problem:
$$
\max_{\theta \in \Theta} \, \min_{Q \in \mathcal{Q}} \, E_Q(\ln W_T)
\tag{2.2}
$$
where Θ is the set of admissible trading strategies and Q is the set of alternative
probability models. Problem 2.2 fits the max-min expected utility model of Gilboa
and Schmeidler [34], where Q is a set of multiple different priors. Let’s now discuss
how we represent the set of alternative probability models.
2.2.1 Alternative models representation
We use martingales to represent perturbations to the probability models and relative entropy to measure the discrepancy between our nominal model and the alternative models. To better understand our continuous time formulations, we digress for a while by borrowing an example from [41].
Let us consider a discrete time approximating model and its innovations $\epsilon_t$, which are i.i.d. Gaussian shocks. An alternative model alters the distribution of these shocks. We use martingales to represent distortions to the probabilities. Let $\hat{\pi}_t(\epsilon)$ be the alternative density of the shock $\epsilon_{t+1}$ based on date t information. Then the random variable $M_t = \prod_{j=1}^{t} m_j$, where $m_j = \frac{\hat{\pi}_{j-1}(\epsilon)}{\pi(\epsilon)}$ and $M_0 = 1$, is a martingale and is the ratio of the joint alternative density to the joint nominal density. We define the entropy of the alternative distribution associated with $M_t$ as the expected log-likelihood ratio with respect to the distorted distribution, $E(M_t \log(M_t))$. It is always non-negative, and it is equal to 0 only when there is no distortion to the nominal distribution.
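The two properties of the likelihood ratio can be checked numerically in the simplest case. The sketch below assumes, for illustration only, that the alternative model shifts the mean of a scalar standard normal shock by a constant w, so that $m = \exp(w \epsilon - w^2/2)$; the function name and the use of Gauss-Hermite quadrature are our own choices.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def one_period_check(w, degree=40):
    """Return (E[m], E[m log m]) under the nominal N(0, 1) shock, where
    m = exp(w * eps - w**2 / 2) is the likelihood ratio of N(w, 1) to N(0, 1).
    The constant mean shift w is an illustrative assumption."""
    x, weights = hermegauss(degree)            # nodes/weights for exp(-x^2 / 2)
    weights = weights / np.sqrt(2.0 * np.pi)   # normalize to the N(0, 1) density
    m = np.exp(w * x - 0.5 * w**2)
    mean_m = float(np.sum(weights * m))
    entropy = float(np.sum(weights * m * np.log(m)))
    return mean_m, entropy

mean_m, entropy = one_period_check(0.5)
```

Under this assumption the one-period ratio has mean 1 and entropy $w^2/2$. With the same shift in every period the shocks are independent, so the entropy of $M_t$ adds up to $t \, w^2 / 2$.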
Similarly, in our continuous time formulations we will use martingales to represent distortions to the nominal probability model. We will construct an alternative model by replacing $Z_t$ in our model by $\hat{Z}_t + \int_0^t h_s \, ds$, where $\hat{Z}_t$ is a Brownian motion under the alternative measure Q and $h_t$ is an adapted process that models the distortion, such that the process

$$
\xi_t = \exp\left( \int_0^t h_s \, dZ_s - \frac{1}{2} \int_0^t h_s^T h_s \, ds \right)
$$

is a martingale. Therefore, the nominal model is misspecified by allowing the conditional mean of the shock vector in the alternative models to feed back arbitrarily on the history up to date t. Since $\xi_0 = 1$ we have that $E(\xi_t) = 1$. Since in addition $\xi_t > 0$, we can define a probability measure Q such that $Q(A) = E[1_A \xi_T]$; in other words, $\xi_T = \frac{dQ}{dP}$ is the Radon-Nikodym derivative of Q with respect to P, where the measures Q and P are equivalent. In fact one can always define a process $h_t$ so that for any measure Q the Radon-Nikodym derivative of Q with respect to P, $\frac{dQ}{dP}$, is given by the exponential martingale $\xi_T$. In this way, our distorted models are:
• For the convergence trades,

$$
dS_{it} = -\frac{a_i S_{it}}{T-t} \, dt + \sum_{k=1}^{K} \sigma_{ik} \, (d\hat{Z}_{kt} + h_{kt} \, dt)
\tag{2.3}
$$

• For the mean reversion trades,

$$
dS_t = -\Phi(S_t - \bar{S}) \, dt + \sigma \, (d\hat{Z}_t + h_t \, dt)
\tag{2.4}
$$

where $\hat{Z}_t$ is a Brownian motion under Q. Why is it a Brownian motion under Q? The answer lies with the Girsanov theorem [30], which states that if a process $h_t$ is such that $\xi_t$ is a martingale and $\xi_T = \frac{dQ}{dP}$ is the Radon-Nikodym derivative of Q with respect to P, then the process $\hat{Z}_t = Z_t - \int_0^t h_s \, ds$ is a Brownian motion under measure Q. Therefore, we parameterize Q by the choice of the drift distortion adapted process $h_t$.
As in the discrete time case, we measure the discrepancy between measures Q and P as the relative entropy D(Q) (see Appendix for derivation):

$$
D(Q) = \int_0^T \frac{1}{2} E_Q[h_t^T h_t] \, dt
\tag{2.5}
$$
This is to be expected, since the relative entropy between a multivariate Gaussian distribution $N(\mu, I)$ and the multivariate standard normal distribution is $D(Q) = \frac{1}{2} \mu^T \mu$ (see Appendix for derivation) and $h_t \, dt$ is the conditional mean of the process $dZ_t$ under the alternative probability measure Q.
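The Gaussian identity quoted above follows in two lines. As a brief sketch (the Appendix contains the author's derivation; here x denotes a draw from $Q = N(\mu, I)$ and $P = N(0, I)$):

```latex
\begin{aligned}
\log\frac{dQ}{dP}(x) &= -\tfrac{1}{2}(x-\mu)^T(x-\mu) + \tfrac{1}{2}x^T x
                      = \mu^T x - \tfrac{1}{2}\mu^T\mu, \\
D(Q) &= E_Q\!\left[\log\frac{dQ}{dP}\right]
      = \mu^T E_Q[x] - \tfrac{1}{2}\mu^T\mu
      = \mu^T\mu - \tfrac{1}{2}\mu^T\mu
      = \tfrac{1}{2}\mu^T\mu .
\end{aligned}
```

Applying the same computation to an increment $dZ_t$ with mean $h_t \, dt$ and covariance $I \, dt$ gives $\frac{1}{2} h_t^T h_t \, dt$, which is the integrand in equation 2.5.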
To express the notion that the nominal model is a good approximation to the real model that generates the spread dynamics, we either restrict the alternative models by $D(Q) \le \eta$ or we penalize them by the magnitude of the entropy.
2.2.2 Model setup
Having described the set of alternative distributions, we are ready to formulate the problem faced by a risk averse investor who distrusts his model dynamics. As in [40] we define two closely related problems:
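• A multiplier robust control problem.

$$
\begin{aligned}
\max_{\theta \in \Theta} \, \min_{Q} \quad & E_Q(\ln W_T) + \nu D(Q) \\
\text{subject to} \quad & dW_t = \theta_t \, dS_t + \theta_{0t} \, dB_t \\
& dS_t = \mu(S, t) \, dt + \sigma(S, t) \, (d\hat{Z}_t + h_t \, dt) \\
& dB_t = r B \, dt \\
& d\xi_t = \xi_t h_t \, dZ_t
\end{aligned}
\tag{2.6}
$$

where $\xi_T = \frac{dQ}{dP}$ and D(Q) is given by equation 2.5. Here in essence there is an implicit restriction manifested by the nonnegative penalty parameter ν.
• A constrained robust control problem.

$$
\begin{aligned}
\max_{\theta \in \Theta} \, \min_{Q} \quad & E_Q(\ln W_T) \\
\text{subject to} \quad & dW_t = \theta_t \, dS_t + \theta_{0t} \, dB_t \\
& dS_t = \mu(S, t) \, dt + \sigma(S, t) \, (d\hat{Z}_t + h_t \, dt) \\
& dB_t = r B \, dt \\
& d\xi_t = \xi_t h_t \, dZ_t \\
& D(Q) \le \eta
\end{aligned}
\tag{2.7}
$$

where $\xi_T = \frac{dQ}{dP}$ and D(Q) is given by equation 2.5.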
In both cases the minimizing malevolent agent chooses the distortion process $h_t$ taking θ as given, and the maximizing investor chooses the optimal strategy taking $h_t$ as given. We index the family of multiplier robust control problems by ν and the family of constrained robust control problems by η. Obviously the two problems are related, since the robustness parameter ν can be interpreted as the Lagrange multiplier on the constraint $D(Q) \le \eta$. Actually we can show that if V(ν) is the optimal value of the multiplier robust problem and K(η) is the optimal value of the constrained robust problem, then we have $K(\eta) = \max_{\nu \ge 0} V(\nu) - \nu \eta$ [40]. Therefore we will only be interested in finding V(ν).
2.3 Solution
We will solve 2.6 by solving the corresponding Hamilton Jacobi Bellman (HJB) equation. We will solve the HJB equation for the case that there are no constraints in
the admissible trading strategies and when the trading strategies are constrained by
VaR or collateral considerations. But first let’s digress for a while and solve the HJB
equation for the case when there is no fear of model misspecification.
2.3.1 No fear of model misspecification
For now we assume that the investor completely trusts the dynamics of his models
dSt = µ(S, t)dt + σ(S, t)dZt . He chooses the trading strategy θ ∈ Θ that solves the
problem 2.1 where Θ is the set of admissible trading strategies:
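$$
\Theta = \left\{ \theta \mid \theta^T \Sigma \theta \le L W^2 \right\}
$$

for the case of VaR constraints and

$$
\Theta = \left\{ \theta \;\Big|\; \sum_{i=1}^{N} \lambda_i |\theta_i| \le W \right\}
$$

for the case of the margin constraints. In this case the HJB equation is:

$$
\max_{\theta \in \Theta} \; V_t + V_W \big( W r + \theta^T (\mu(S, t) - r S_t) \big) + V_S^T \mu(S, t)
+ \tfrac{1}{2} V_{WW} \theta^T \Sigma \theta + V_{WS} \Sigma \theta + \tfrac{1}{2} \mathrm{trace}(\Sigma V_{SS}) = 0
$$

where $\Sigma = \sigma \sigma^T$ and V(W, S, t) is the value function of the investor, subject to the terminal condition V(W, S, T) = ln(W).
Due to the logarithmic preferences of the investor it is V(W, S, t) = ln(W) + H(S, t), therefore $V_{WS} = 0$, $V_W = \frac{1}{W}$ and $V_{WW} = -\frac{1}{W^2}$. We also define, for all t ∈ [0, T], $F_t = \theta_t / W_t \in \mathbb{R}^N$.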
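The HJB equation becomes:

$$
\max_{F \in \mathcal{F}} \; V_t + \big( r + F^T (\mu(S, t) - r S_t) \big) + V_S^T \mu(S, t)
- \tfrac{1}{2} F^T \Sigma F + \tfrac{1}{2} \mathrm{trace}(\Sigma V_{SS}) = 0
$$

where $\mathcal{F}$ is the set of admissible trading strategies:

$$
\mathcal{F} = \left\{ F \mid F^T \Sigma F \le L \right\}
$$

for the case of VaR constraints and

$$
\mathcal{F} = \left\{ F \;\Big|\; \sum_{i=1}^{N} \lambda_i |F_i| \le 1 \right\}
$$

for the case of the margin constraints.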
The optimal trading strategy is the solution to the following convex problem

$$
\min_{F \in \mathcal{F}} \; F^T (-\mu(S, t) + r S_t) + \tfrac{1}{2} F^T \Sigma F
\tag{2.8}
$$

as we also proved with a different method in Chapter 1, where $\mu_t = -\mu(S, t) + r S_t$ and in particular it is:

$$
\mu_t = \begin{pmatrix} S_{1t}\left(\frac{a_1}{T-t} + r\right) \\ \vdots \\ S_{Nt}\left(\frac{a_N}{T-t} + r\right) \end{pmatrix}
$$

for the convergence trades case and

$$
\mu_t = \begin{pmatrix} \Phi_1^T (S_t - \bar{S}) + r S_{1t} \\ \vdots \\ \Phi_N^T (S_t - \bar{S}) + r S_{Nt} \end{pmatrix}
$$

for the mean reversion trading strategies case.
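For the VaR constraint, problem 2.8 can be solved in closed form: the KKT conditions give $F = -\Sigma^{-1} \mu_t / (1 + 2\kappa)$ for a multiplier $\kappa \ge 0$, so the constrained solution is the unconstrained one scaled back onto the ellipsoid $F^T \Sigma F = L$. The sketch below illustrates this under our own naming (`var_constrained_weights` and κ are not notation from the text).

```python
import numpy as np

def var_constrained_weights(mu_t, Sigma, L):
    """Minimize F' mu_t + 0.5 F' Sigma F subject to F' Sigma F <= L.
    Illustrative helper; mu_t is the vector defined after equation 2.8."""
    mu_t = np.asarray(mu_t, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    F_unc = -np.linalg.solve(Sigma, mu_t)      # unconstrained minimizer
    risk = float(F_unc @ Sigma @ F_unc)
    if risk <= L:
        return F_unc                           # constraint is slack
    return F_unc * np.sqrt(L / risk)           # proportional shrinkage

# Two convergence trades with S = 1, a = 1, T - t = 1, r = 0.06 and rho = 0.5.
Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
mu_t = np.array([1.06, 1.06])
F = var_constrained_weights(mu_t, Sigma, L=0.25)
```

All weights are multiplied by the same factor, which is the proportional shrinkage behind the analogy with Ridge regression; the margin constraint has no such closed form in general.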
2.3.2 Fear of model misspecification no constraints
In this section we assume that there are no constraints in the trading strategies followed by the risk averse investor and the investor is not confident about the dynamics of his models. The Hamilton Jacobi Bellman equation for the problem 2.6 is given by:

$$
\begin{aligned}
\max_{\theta} \, \min_{h} \; & V_t + V_W \big( W r + \theta^T (\mu(S, t) - r S_t) \big) + V_S^T \mu(S, t) \\
& + \tfrac{1}{2} V_{WW} \theta^T \Sigma \theta + V_{WS} \Sigma \theta + \tfrac{1}{2} \mathrm{trace}(\Sigma V_{SS}) + V_W \theta^T \sigma h + V_S^T \sigma h + \frac{\nu}{2} h^T h = 0
\end{aligned}
$$

where $\Sigma = \sigma \sigma^T$ and V(W, S, t) is the value function of the investor, subject to the terminal condition V(W, S, T) = ln(W). The malevolent agent picks the worst case distortion drift process $h_t$ and the investor maximizes against the worst case scenario.
After defining, for all t ∈ [0, T], $F_t = \theta_t / W_t \in \mathbb{R}^N$, the HJB equation becomes:

$$
\begin{aligned}
\max_{F} \, \min_{h} \; & V_t + W V_W \big( r + F^T (\mu(S, t) - r S_t) \big) + V_S^T \mu(S, t) \\
& + \tfrac{1}{2} W^2 V_{WW} F^T \Sigma F + W V_{WS} \Sigma F + \tfrac{1}{2} \mathrm{trace}(\Sigma V_{SS}) + W V_W F^T \sigma h + V_S^T \sigma h + \frac{\nu}{2} h^T h = 0
\end{aligned}
$$
The inner minimization problem is a convex quadratic problem. The first order conditions are:

$$
W V_W \sigma^T F + \sigma^T V_S + \nu h = 0
$$

$$
h = -\frac{\sigma^T (W V_W F + V_S)}{\nu}
$$

The optimal value of the inner minimization problem is:

$$
g(F) = -\frac{W^2 V_W^2 F^T \Sigma F + V_S^T \Sigma V_S + 2 W V_W F^T \Sigma V_S}{2\nu}
$$
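The value g(F) follows from completing the minimization over h. As a short sketch, writing $c = \sigma^T (W V_W F + V_S)$ (a shorthand introduced here only for this step):

```latex
\begin{aligned}
\min_h \; c^T h + \frac{\nu}{2} h^T h
  &\;\Rightarrow\; h^{*} = -\frac{c}{\nu}, \\
g(F) = c^T h^{*} + \frac{\nu}{2} h^{*T} h^{*}
  &= -\frac{c^T c}{\nu} + \frac{c^T c}{2\nu} = -\frac{c^T c}{2\nu}, \\
c^T c = (W V_W F + V_S)^T \sigma \sigma^T (W V_W F + V_S)
  &= W^2 V_W^2 F^T \Sigma F + 2 W V_W F^T \Sigma V_S + V_S^T \Sigma V_S .
\end{aligned}
```

The last line uses $\Sigma = \sigma \sigma^T$ and reproduces the numerator of g(F) term by term.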
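Plugging this back into the HJB equation we have:

$$
\begin{aligned}
\max_{F} \; & V_t + W V_W \big( r + F^T (\mu(S, t) - r S_t) \big) + V_S^T \mu(S, t) \\
& + \tfrac{1}{2} W^2 V_{WW} F^T \Sigma F + W V_{WS} \Sigma F + \tfrac{1}{2} \mathrm{trace}(\Sigma V_{SS}) \\
& - \frac{W^2 V_W^2 F^T \Sigma F + V_S^T \Sigma V_S + 2 W V_W F^T \Sigma V_S}{2\nu} = 0
\end{aligned}
$$

Due to the logarithmic preferences of the investor it is V(W, S, t) = ln W + H(S, t), and in that case $V_W = \frac{1}{W}$, $V_{WW} = -\frac{1}{W^2}$, $V_{WS} = 0$, $V_S(W, S, t) = H_S(S, t)$ and the minimizing drift distortion $h = -\frac{\sigma^T (F + H_S)}{\nu}$ is independent of the wealth.
The HJB equation now becomes:

$$
\begin{aligned}
\max_{F} \; & V_t + r + F^T (\mu(S, t) - r S_t) + V_S^T \mu(S, t) \\
& - \tfrac{1}{2} F^T \Sigma F + \tfrac{1}{2} \mathrm{trace}(\Sigma V_{SS}) - \frac{F^T \Sigma F + V_S^T \Sigma V_S + 2 F^T \Sigma V_S}{2\nu} = 0
\end{aligned}
$$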
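The optimal trading strategy is the solution to the following convex quadratic problem:
$$\text{maximize} \quad F_t^T \Big(\mu(S,t) - r S_t - \frac{\Sigma V_S}{\nu}\Big) - \frac{1}{2}\Big(1 + \frac{1}{\nu}\Big) F_t^T \Sigma F_t \qquad (2.9)$$
The first order conditions are:
$$\mu(S,t) - r S_t - \frac{\Sigma V_S(S_t, t)}{\nu} = \Big(1 + \frac{1}{\nu}\Big) \Sigma F_t^{opt}$$
$$F_t^{opt} = \frac{1}{1 + \frac{1}{\nu}}\, \Sigma^{-1} \Big(\mu(S,t) - r S_t - \frac{\Sigma V_S(S_t, t)}{\nu}\Big)$$
$$F_t^{opt} = \frac{\nu}{\nu + 1}\, \Sigma^{-1} (\mu(S,t) - r S_t) - \frac{V_S(S_t, t)}{\nu + 1}$$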
We clearly see that as $\nu \to \infty$ the optimal trading strategy converges to the one where we have no fear of model misspecification. This is to be expected, since in this case the problems 2.1 and 2.6 are equivalent. It is interesting to find the conditions under which these weights are equal to the weights when there is no fear of model misspecification. When there is a fear of model misspecification, the optimal weights are a convex combination of $\Sigma^{-1}(\mu(S,t) - r S_t)$, i.e. the weights without model misspecification, and $-V_S$. Therefore these weights are equal to the weights when there is no fear of model misspecification when $V_S + F^{opt} = 0$, which is equivalent to $h_{\min} = 0$. Of course this is expected, since in that case there would be no distortion drift and the HJB equation would be the same as the benchmark case of no model misspecification.
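The closed form above can be checked numerically. The following is a minimal sketch for the scalar case N = 1, assuming hypothetical values for the excess drift `m` (standing for µ(S, t) − rS), the volatility `sigma` and the derivative `v_s` (standing for V_S); none of these numbers come from the thesis. It compares the closed form with a crude grid search over the objective of (2.9) and evaluates the minimizing distortion drift.

```python
# Sketch only: scalar (N = 1) version of problem (2.9).
# All numerical inputs below are hypothetical illustrations.

def robust_weight(m, sigma2, v_s, nu):
    """Closed form F_opt = nu/(nu+1) * m/sigma2 - v_s/(nu+1), with m = mu - r*S."""
    return nu / (nu + 1.0) * m / sigma2 - v_s / (nu + 1.0)

def objective(f, m, sigma2, v_s, nu):
    """Objective of (2.9): F*(m - sigma2*V_S/nu) - 0.5*(1 + 1/nu)*sigma2*F^2."""
    return f * (m - sigma2 * v_s / nu) - 0.5 * (1.0 + 1.0 / nu) * sigma2 * f * f

def worst_case_drift(f, sigma, v_s, nu):
    """Minimizing distortion h = -sigma*(F + V_S)/nu under log utility."""
    return -sigma * (f + v_s) / nu

m, sigma, v_s = -1.0, 1.0, 0.3      # hypothetical excess drift, volatility, V_S
sigma2 = sigma ** 2
no_fear = m / sigma2                # weight without fear of misspecification
grid = [-3.0 + 0.001 * k for k in range(6001)]

for nu in (0.1, 1.0, 10.0, 1e6):
    f_opt = robust_weight(m, sigma2, v_s, nu)
    # independent check: maximize the objective over a coarse grid
    f_grid = max(grid, key=lambda f: objective(f, m, sigma2, v_s, nu))
    h = worst_case_drift(f_opt, sigma, v_s, nu)
    print(f"nu={nu}: closed form={f_opt:.4f}, grid={f_grid:.4f}, h_min={h:.4f}")
```

As ν grows, the printed closed-form weight approaches `no_fear`, and choosing `v_s = -m / sigma2` makes the distortion drift vanish, which mirrors the condition V_S + F^opt = 0 discussed above.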
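After plugging in the optimal trading strategy to the HJB equation, it becomes:
$$V_t + r + \tfrac{1}{2}\,\mathrm{trace}(\Sigma V_{SS}) + \tfrac{1}{2}\,\frac{1}{1 + \frac{1}{\nu}}\, (\mu(S,t) - r S_t)^T \Sigma^{-1} (\mu(S,t) - r S_t) + V_S^T \Big(\mu(S,t) - \frac{\mu(S,t) - r S_t}{\nu + 1}\Big) - \tfrac{1}{2}\,\frac{1}{\nu + 1}\, V_S^T \Sigma V_S = 0$$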
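We can plug in the optimal trading strategy to $h = -\frac{\sigma^T (W V_W F + V_S)}{\nu}$ to find that:
$$h_{\min} = -\frac{\sigma^T F^{opt} + \sigma^T V_S}{\nu}$$
$$h_{\min} = -\frac{\sigma^T \frac{1}{1 + \frac{1}{\nu}} \Sigma^{-1} \big(\mu(S,t) - r S_t - \frac{\Sigma V_S}{\nu}\big) + \sigma^T V_S}{\nu}$$
$$h_{\min} = -\frac{\sigma^T \big(\Sigma^{-1} (\mu(S,t) - r S_t) + V_S\big)}{\nu + 1}$$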
We consider two cases:
• Convergence trades. The optimal trading strategy is given by:
$$\theta_t^{opt} = -\Big(\frac{\nu}{\nu + 1}\, \Sigma^{-1} A_t S_t + \frac{V_S}{\nu + 1}\Big) W_t \qquad (2.10)$$
where $A_t = \mathrm{diag}\big(\frac{a_1}{T-t} + r, \cdots, \frac{a_N}{T-t} + r\big)$. The optimal trading strategy is a convex combination of the strategy without fear of model misspecification and $-V_S$ with weights $\frac{\nu}{\nu+1}$ and $\frac{1}{\nu+1}$. From the symmetry of the problem we have $H(S, t) = H(-S, t)$, from which we get $V_S(S_t, t) = -V_S(-S_t, t)$.
• Mean reversion trades. The optimal trading strategy is given by:
$$\theta_t^{opt} = -\Big(\frac{\nu}{\nu + 1}\, \Sigma^{-1} \big(\Phi(S_t - \bar{S}) + r S_t\big) + \frac{V_S}{\nu + 1}\Big) W_t \qquad (2.11)$$
The optimal trading strategy is again a convex combination of the strategy without fear of model misspecification and $-V_S$ with weights $\frac{\nu}{\nu+1}$ and $\frac{1}{\nu+1}$. For the special case where $\bar{S} = 0$ we have:
$$\theta_t^{opt} = -\Big(\frac{\nu}{\nu + 1}\, \Sigma^{-1} \big((\Phi + rI) S_t\big) + \frac{V_S}{\nu + 1}\Big) W_t \qquad (2.12)$$
2.3.3 Fear of model misspecification with VaR and margin constraints
In this section we assume that the investor is not confident about the dynamics of his models and that he faces either VaR or margin constraints. The Hamilton-Jacobi-Bellman equation for problem 2.6 is given by:
$$\max_{\theta \in \Theta} \min_{h} \Big\{ V_t + V_W \big(W r + \theta^T(\mu(S,t) - r S_t)\big) + V_S^T \mu(S,t) + \tfrac{1}{2} V_{WW}\, \theta^T \Sigma \theta + V_{WS} \Sigma \theta + \tfrac{1}{2}\,\mathrm{trace}(\Sigma V_{SS}) + V_W \theta^T \sigma h + V_S^T \sigma h + \tfrac{\nu}{2} h^T h \Big\} = 0$$
where $\Sigma = \sigma\sigma^T$ and $V(W, S, t)$ is the value function of the investor subject to the terminal condition $V(W, S, T) = \ln(W)$. As previously, $\Theta$ is the set of admissible trading strategies:
$$\Theta = \{\theta \mid \theta^T \Sigma \theta \le L W^2\}$$
for the case of VaR constraints and
$$\Theta = \Big\{\theta \;\Big|\; \sum_{i=1}^{N} \lambda_i |\theta_i| \le W\Big\}$$
for the case of the margin constraints.
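We can proceed as in the previous case, where we had no constraints, and we get the following HJB equation:
$$\max_{F \in \mathcal{F}} \Big\{ V_t + \big(r + F^T(\mu(S,t) - r S_t)\big) + V_S^T \mu(S,t) - \tfrac{1}{2} F^T \Sigma F + \tfrac{1}{2}\,\mathrm{trace}(\Sigma V_{SS}) - \frac{F^T \Sigma F + V_S^T \Sigma V_S + 2 F^T \Sigma V_S}{2\nu} \Big\} = 0$$
where $\mathcal{F}$ is the set of admissible trading strategies:
$$\mathcal{F} = \{F \mid F^T \Sigma F \le L\}$$
for the case of VaR constraints and
$$\mathcal{F} = \Big\{F \;\Big|\; \sum_{i=1}^{N} \lambda_i |F_i| \le 1\Big\}$$
for the case of the margin constraints. The optimal trading strategy is the solution to the following convex problem:
$$\text{maximize} \quad F_t^T \Big(\mu(S,t) - r S_t - \frac{\Sigma V_S}{\nu}\Big) - \frac{1}{2}\Big(1 + \frac{1}{\nu}\Big) F_t^T \Sigma F_t \qquad (2.13)$$
$$\text{subject to} \quad F_t \in \mathcal{F}$$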
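We clearly see again that as $\nu \to \infty$ the optimal trading strategy converges to the one where we have no fear of model misspecification, as expected. Using a similar proof as in Chapter 1, we can show (see Appendix) that in the case where the investor faces VaR constraints the optimal portfolio is given by:
$$F_t^{opt} = \begin{cases} \dfrac{1}{1 + \frac{1}{\nu}}\, \Sigma^{-1} \mu_t & \text{if } \mu_t^T \Sigma^{-1} \mu_t \le L \big(1 + \frac{1}{\nu}\big)^2 \\[2ex] \dfrac{\Sigma^{-1} \mu_t}{\sqrt{\dfrac{\mu_t^T \Sigma^{-1} \mu_t}{L}}} & \text{if } \mu_t^T \Sigma^{-1} \mu_t \ge L \big(1 + \frac{1}{\nu}\big)^2 \end{cases}$$
where $\mu_t = \mu(S,t) - r S_t - \frac{\Sigma V_S(S_t, t)}{\nu}$.
This is also equivalently written as:
$$F_t^{opt} = \frac{1}{1 + \frac{1}{\nu} + \lambda}\, \Sigma^{-1} \mu_t$$
where $1 + \frac{1}{\nu} + \lambda = \max\Big(1 + \frac{1}{\nu}, \sqrt{\frac{\mu_t^T \Sigma^{-1} \mu_t}{L}}\Big)$.
We consider two cases: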
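• Convergence trades. The optimal trading strategy is the solution to the following convex problem:
$$\text{minimize} \quad F_t^T \Big(A_t S_t + \frac{\Sigma V_S}{\nu}\Big) + \frac{1}{2}\Big(1 + \frac{1}{\nu}\Big) F_t^T \Sigma F_t \qquad (2.14)$$
$$\text{subject to} \quad F_t \in \mathcal{F}$$
where $A_t = \mathrm{diag}\big(\frac{a_1}{T-t} + r, \cdots, \frac{a_N}{T-t} + r\big)$.
• Mean reversion trades. The optimal trading strategy is the solution to the following convex problem:
$$\text{minimize} \quad F_t^T \Big(\Phi(S_t - \bar{S}) + r S_t + \frac{\Sigma V_S}{\nu}\Big) + \frac{1}{2}\Big(1 + \frac{1}{\nu}\Big) F_t^T \Sigma F_t \qquad (2.15)$$
$$\text{subject to} \quad F_t \in \mathcal{F}$$
For the special case where $\bar{S} = 0$, we have the following problem:
$$\text{minimize} \quad F_t^T \Big((\Phi + rI) S_t + \frac{\Sigma V_S}{\nu}\Big) + \frac{1}{2}\Big(1 + \frac{1}{\nu}\Big) F_t^T \Sigma F_t \qquad (2.16)$$
$$\text{subject to} \quad F_t \in \mathcal{F}$$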
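The VaR-constrained closed form of this section can be illustrated with a short sketch for N = 2. The helper names and the input values below are hypothetical and chosen only for illustration; `mu_t` stands for the adjusted drift µ(S, t) − rS − ΣV_S/ν, which in practice requires V_S from the solution of the HJB equation.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    """Product of a 2x2 matrix with a 2-vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def var_constrained_weights(mu_t, sigma_mat, nu, L):
    """F = Sigma^{-1} mu_t / max(1 + 1/nu, sqrt(mu_t' Sigma^{-1} mu_t / L))."""
    x = matvec(inv2(sigma_mat), mu_t)      # Sigma^{-1} mu_t
    q = dot(mu_t, x)                       # mu_t' Sigma^{-1} mu_t
    scale = max(1.0 + 1.0 / nu, math.sqrt(q / L))
    return [xi / scale for xi in x]

def risk(f, sigma_mat):
    """Quadratic form F' Sigma F that the VaR constraint bounds by L."""
    return dot(f, matvec(sigma_mat, f))

sigma_mat = [[1.0, 0.5], [0.5, 1.0]]       # hypothetical covariance matrix
mu_t = [-2.0, -1.0]                        # hypothetical adjusted drift
tight = var_constrained_weights(mu_t, sigma_mat, nu=1.0, L=0.25)
loose = var_constrained_weights(mu_t, sigma_mat, nu=1.0, L=4.0)
print("tight:", tight, "risk:", risk(tight, sigma_mat))
print("loose:", loose, "risk:", risk(loose, sigma_mat))
```

With the tight limit the constraint binds, so the quadratic form equals L; with the loose limit the weights coincide with the unconstrained robust solution.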
2.4 Results
We will investigate how the optimal trading strategy changes as a result of mistrust
of the model dynamics. We will first study the case where we have no constraints
and then the case where we have VaR constraints.
2.4.1 Convergence trades without constraints
We consider the case where we have N = 1 arbitrage opportunity and there are no
constraints. We will study the case where N = 2 when we have constraints. Due to
the symmetry of S around 0 it suffices to study only what happens when S ≥ 0, since
the symmetry implies that the value function is an even function of the spread S and
its partial derivative with respect to the spread is an odd function of S for each t.
The optimal weight in the arbitrage opportunity is given by:
$$F_t^{opt} = -\frac{1}{\sigma^2 \big(1 + \frac{1}{\nu}\big)} \Big(\Big(\frac{a}{T-t} + r\Big) S_t + \frac{\sigma^2 V_S(S_t, t)}{\nu}\Big) \qquad (2.17)$$
and the minimizing distortion drift is given by $h_{\min} = -\frac{\sigma (F^{opt} + V_S)}{\nu}$. Comparing to the case where there is no fear of model misspecification, we see that now the variance is multiplied by a factor of $\big(1 + \frac{1}{\nu}\big)$ and the drift is increased by $\frac{\sigma^2 V_S(S_t, t)}{\nu}$.
When S is positive, one would think that there are three cases to consider:
• If $V_S > 0$ then $F < 0$. In this case there is a tradeoff between the two terms in $h_{\min}$. The first term $-\frac{\sigma F^{opt}}{\nu}$ corresponds to a positive distortion drift that reduces the wealth of the investor, since the investor is shorting the spread, while the second term $-\frac{\sigma V_S}{\nu}$ corresponds to a negative distortion drift that points to worse investment opportunities.
• If $-\frac{A_t S_t \nu}{\sigma^2} < V_S < 0$ then $F < 0$ and both terms in $h_{\min}$ correspond to positive distortion drifts, with the first one reducing the wealth of the investor and the second one pointing to worse investment opportunities.
• If $V_S < -\frac{A_t S_t \nu}{\sigma^2}$ then $F > 0$. There is again a tradeoff between the two terms in $h_{\min}$. Now the first term corresponds to a negative distortion drift reducing the wealth of the investor, since in this case the investor is long the spread, and the second term corresponds to a positive distortion pointing to worse investment opportunities.
A little more thought, though, excludes the last two cases: we expect the value function V to be a non-decreasing function of S for nonnegative values of the spread, since higher values of the spread S correspond to better investment opportunities. That leads to non-negative values of $V_S$ for $S \ge 0$.
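The sign analysis above can be reproduced with a few lines of code. This is a minimal sketch that evaluates equation (2.17) and splits the distortion drift into its two terms; the parameter defaults follow the values used later for the figures (a = 0.01, r = 0, σ = 1, T = 1), while the value of `v_s` is a hypothetical placeholder and is not taken from the numerical solution of the HJB equation.

```python
def convergence_weight(S, t, v_s, nu, a=0.01, r=0.0, sigma=1.0, T=1.0):
    """Equation (2.17): F = -((a/(T-t) + r)*S + sigma^2*V_S/nu) / (sigma^2*(1 + 1/nu))."""
    drift = (a / (T - t) + r) * S
    return -(drift + sigma ** 2 * v_s / nu) / (sigma ** 2 * (1.0 + 1.0 / nu))

def distortion_terms(F, v_s, nu, sigma=1.0):
    """Split h_min = -sigma*(F + V_S)/nu into its two terms."""
    h1 = -sigma * F / nu      # term acting through the investor's position
    h2 = -sigma * v_s / nu    # term acting through the investment opportunities
    return h1, h2

v_s = 0.05  # hypothetical positive value of V_S at S = 1
for nu in (0.1, 1.0, 10.0):
    F = convergence_weight(S=1.0, t=0.5, v_s=v_s, nu=nu)
    h1, h2 = distortion_terms(F, v_s, nu)
    print(f"nu={nu}: F={F:.4f}, h1={h1:.4f}, h2={h2:.4f}, h_min={h1 + h2:.4f}")
```

For any positive `v_s` and positive spread the weight is negative, the first term is positive and the second is negative, which is the tradeoff described in the first case.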
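The HJB equation is given by:
$$V_t + r + \tfrac{1}{2} \sigma^2 V_{SS} + \tfrac{1}{2}\,\frac{1}{1 + \frac{1}{\nu}} \Big(\frac{a}{T-t} + r\Big)^2 \frac{S^2}{\sigma^2} + V_S \Big(-\frac{a}{T-t} S + \frac{\big(\frac{a}{T-t} + r\big) S}{\nu + 1}\Big) - \tfrac{1}{2}\,\frac{1}{\nu + 1}\, V_S^2 \sigma^2 = 0$$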
We solve the HJB equation numerically using the method of finite differences [75].
In the following figures we have assumed that rf = 0, σ = 1, a = 0.01 and T = 1.
We observe the following:
• VS becomes larger and larger as t → T for each value of ν as we see in Figure 2-1
until some value close to the horizon where it starts going down. In addition,
VS is higher for higher values of ν.
• For very low values of ν the drift distortion hmin is positive and becomes larger
as t → T as we see in Figure 2-2. For higher values of ν the drift distortion
starts negative and after some point increases as t → T to positive values. As we
showed in the previous section, when hmin = 0, the optimal weight is equal to
the optimal weight in the case where there is no fear of model misspecification.
We can see this in Figure 2-4, where very close at the time where hmin crosses
0, the optimal weight graph crosses the one when ν = 100.
• Figure 2-3 shows the two terms of the distortion drift for $\nu = 1$. The first term corresponds to a positive distortion drift that reduces the wealth of the investor, since the investor is shorting the spread, while the second term corresponds to a negative distortion drift that points to worse investment opportunities. In this tradeoff the first term is losing at the beginning, which makes the drift distortion negative, but as $t \to T$ it increases at a fast rate, making the drift distortion positive.
• The fact that $V_S$ becomes larger as $t \to T$ for each value of $\nu$ until a point close to the horizon, in combination with the fact that the drift term $\frac{a}{T-t}$ increases in a hyperbolic way, leads to F becoming larger (in absolute value) as $t \to T$ (Figure 2-4). The risk averse investor becomes more aggressive as the time to T becomes smaller, despite the fact that as $t \to T$ the malevolent agent picks a more adverse distortion drift (Figure 2-2). This is because the improvement in the investment opportunities is so substantial that it dominates the fact that the distortion drift also gets larger.
• For very low values of $\nu$ the investor is more conservative than in the case without model misspecification for all t. For higher values of $\nu$ the investor is more aggressive at the beginning and becomes more conservative as $t \to T$, compared to the case without model misspecification (Figure 2-4). This is because at the beginning the drift term $\frac{a}{T-t}$ is very low compared to $V_S$, making the total drift term lower for large values of $\nu$ than for smaller values of $\nu$, which leads to a lower magnitude of weight. This situation changes as the time to the horizon T gets smaller.
• As ν → ∞ the optimal weight in the strategy converges to the optimal weight
when there is no fear of model misspecification as we have argued before.
[Figure 2-1 plot. Title: "Vs as a function of time for S = 1". Axes: Time, Vs. Curves: nu = 0.01, 0.1, 1, 10, 100.]
Figure 2-1: Partial derivative of the value function with respect to S for a single convergence trade. $V_S$ as a function of time at S = 1 for different values of the robustness multiplier for a single convergence trade.
[Figure 2-2 plot. Title: "hmin as a function of time for S = 1". Axes: Time, hmin. Curves: nu = 0.01, 0.1, 1, 10, 100.]
Figure 2-2: Distortion drift for a single convergence trade. Distortion drift as a function of time at S = 1 for different values of the robustness multiplier for a single convergence trade.
[Figure 2-3 plot. Title: "Drift distortion and its two terms as a function of time for S = 1". Axes: Time, Distortion. Curves: h, h1, h2.]
Figure 2-3: Distortion drift terms for a single convergence trade. Distortion drift terms as a function of time at S = 1 for $\nu = 1$ for a single convergence trade. The first term corresponds to a positive distortion drift that reduces the wealth of the investor since the investor is shorting the spread, while the second term corresponds to a negative distortion drift that points to worse investment opportunities.
[Figure 2-4 plot. Title: "Optimal weights as a function of time for S = 1". Axes: Time, Weights. Curves: nu = 0.01, 0.1, 1, 10, 100.]
Figure 2-4: Optimal weight of a single convergence trade. Weight of the convergence trading strategy as a function of time at S = 1 for different values of the robustness multiplier.
2.4.2 Mean reversion trading strategies without constraints
We consider first the case where we have N = 1 mean reversion trading strategy and
S̄ = 0 i.e.
dSt = −φSt dt + σdZt
Due to the symmetry of S around 0 it suffices to study only what happens when
S ≥ 0, since the symmetry implies that the value function is an even function of the
spread S and its partial derivative with respect to the spread is an odd function of S
for each t.
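The optimal weight in the trading strategy is given by:
$$F_t^{opt} = -\frac{1}{\sigma^2 \big(1 + \frac{1}{\nu}\big)} \Big((\phi + r) S_t + \frac{\sigma^2 V_S(S_t, t)}{\nu}\Big) = -\frac{\nu}{\sigma^2 (\nu + 1)} (\phi + r) S_t - \frac{V_S(S_t, t)}{\nu + 1}$$
and the minimizing distortion drift is given by:
$$h_{\min} = -\frac{\sigma (F^{opt} + V_S)}{\nu} = \frac{(\phi + r) S_t}{\sigma (\nu + 1)} - \frac{\sigma V_S}{\nu + 1}$$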
Comparing to the case where there is no fear of model misspecification, we see that when the investor does not trust the model dynamics, the variance is multiplied by a factor of $\big(1 + \frac{1}{\nu}\big)$ and the drift is increased by $\frac{\sigma^2 V_S(S_t, t)}{\nu}$. If $V_S$ is non-negative and decreases as a function of time t, then the increase in the drift gets smaller and smaller and the investor gets more and more conservative as time passes. In this case the distortion drift also gets larger and larger as time passes, which explains why the investor gets more and more conservative.
When S is positive, we have $V_S \ge 0$, since higher values of S correspond to better investment opportunities. If $V_S \ge 0$ then $F^{opt} \le 0$ for $S \ge 0$. Therefore we see that there is a tradeoff between the two terms in $h_{\min}$. The first term $-\frac{\sigma F^{opt}}{\nu}$ corresponds to a positive distortion drift that reduces the wealth of the investor, since the investor is shorting the spread, while the second term $-\frac{\sigma V_S}{\nu}$ corresponds to a negative distortion drift that points to worse investment opportunities.
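The HJB equation is given by:
$$V_t + r + \tfrac{1}{2} \sigma^2 V_{SS} + \tfrac{1}{2}\,\frac{1}{1 + \frac{1}{\nu}}\, (\phi + r)^2 \frac{S^2}{\sigma^2} + V_S \Big(-\phi S + \frac{(\phi + r) S}{\nu + 1}\Big) - \tfrac{1}{2}\,\frac{1}{\nu + 1}\, V_S^2 \sigma^2 = 0$$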
We solve the HJB equation numerically using the method of finite differences. In the
following figures we have assumed that rf = 0, σ = 1, φ = 1 and T = 1. We observe
the following:
• VS becomes smaller and smaller as t → T for each value of ν as we see in Figure
2-5. This makes sense, since as t → T there is less time to take advantage of
the mean reversion trading strategy. In addition, VS is higher for higher values
of ν.
• The drift distortion hmin is positive and becomes larger and larger as t → T
for each value of the robustness multiplier as we see in Figure 2-6. Figure 2-7
shows the two terms of the distortion drift for ν = 1. The first term corresponds
to a positive distortion drift that reduces the wealth of the investor since the
investor is shorting the spread, while the second term corresponds to a negative
distortion drift that points to worse investment opportunities. In this tradeoff
the first term wins making the drift distortion positive.
• The fact that $V_S$ becomes smaller and smaller as $t \to T$ for each value of $\nu$ leads to F becoming smaller and smaller (in absolute value) as $t \to T$ (Figure 2-8). In other words, the risk averse investor becomes more conservative as the time to T becomes smaller. This is to be expected, since as $t \to T$ the malevolent agent picks a more adverse distortion drift (Figure 2-6), causing the investor to be more cautious.
• The lower the value of $\nu$, the more conservative the investor is, as we see in Figure 2-8. This is because $\nu$ is the robustness multiplier and lower values of it put a smaller penalty on the distorting alternative distribution, leading to higher positive drift distortions (Figure 2-6).
• As ν → ∞ the optimal weight in the strategy converges to the optimal weight
when there is no fear of model misspecification as we have argued before.
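As a cross-check on these observations, the single-strategy HJB equation can be reduced to an ordinary differential equation under an additional assumption. The sketch below is not the finite-difference method used in the thesis: it assumes the ansatz H(S, t) = c(t) + k(t)S²/2, so that V_S = k(t)S, which turns the HJB equation of this section into a Riccati equation for k in the time to go τ = T − t with k = 0 at τ = 0. The function names are hypothetical; the defaults are the parameter values stated above (r = 0, σ = 1, φ = 1, T = 1).

```python
import math

def ks_at_time(t, nu, phi=1.0, r=0.0, sigma=1.0, T=1.0, steps=2000):
    """Integrate the Riccati ODE for k from tau = 0 to tau = T - t with classical RK4.

    Under the quadratic ansatz (an assumption), substituting V_S = k*S into the HJB gives
    dk/dtau = nu/(nu+1)*(phi+r)^2/sigma^2 - 2*k*(phi - (phi+r)/(nu+1)) - k^2*sigma^2/(nu+1).
    """
    def rhs(k):
        return (nu / (nu + 1.0) * (phi + r) ** 2 / sigma ** 2
                - 2.0 * k * (phi - (phi + r) / (nu + 1.0))
                - k * k * sigma ** 2 / (nu + 1.0))
    h = (T - t) / steps
    k = 0.0
    for _ in range(steps):
        k1 = rhs(k)
        k2 = rhs(k + 0.5 * h * k1)
        k3 = rhs(k + 0.5 * h * k2)
        k4 = rhs(k + h * k3)
        k += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return k

def weight_and_distortion(S, t, nu, phi=1.0, r=0.0, sigma=1.0, T=1.0):
    """Return (V_S, F_opt, h_min) at (S, t) from the formulas of this section."""
    v_s = ks_at_time(t, nu, phi, r, sigma, T) * S
    F = -nu / (sigma ** 2 * (nu + 1.0)) * (phi + r) * S - v_s / (nu + 1.0)
    h = -sigma * (F + v_s) / nu
    return v_s, F, h

for nu in (1.0, 10.0, 100.0):
    for t in (0.0, 0.5, 1.0):
        v_s, F, h = weight_and_distortion(1.0, t, nu)
        print(f"nu={nu}, t={t}: V_S={v_s:.4f}, F={F:.4f}, h_min={h:.4f}")
```

Under this assumption, V_S at S = 1 shrinks to zero as t → T, the magnitude of the weight falls over time, and the distortion drift rises, in line with the qualitative behaviour listed above. For very large ν the ODE reduces to dk/dτ = 1 − 2k with these parameters, whose solution is (1 − e^{−2τ})/2.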
[Figure 2-5 plot. Title: "Vs as a function of time for S = 1". Axes: Time, Vs. Curves: nu = 1, 10, 100.]
Figure 2-5: Partial derivative of the value function with respect to S for a single mean reversion trading strategy. $V_S$ as a function of time at S = 1 for different values of the robustness multiplier.
[Figure 2-6 plot. Title: "hmin as a function of time for S = 1". Axes: Time, hmin. Curves: nu = 1, 10, 100.]
Figure 2-6: Distortion drift for a single mean reversion trading strategy. Distortion drift as a function of time at S = 1 for different values of the robustness multiplier.
[Figure 2-7 plot. Title: "Drift distortion and its two terms as a function of time for S = 1". Axes: Time, Distortion. Curves: h, h1, h2.]
Figure 2-7: Distortion drift terms for a single mean reversion trading strategy. Distortion drift terms as a function of time at S = 1 for $\nu = 1$. The first term corresponds to a positive distortion drift that reduces the wealth of the investor, since the investor is shorting the spread, while the second term corresponds to a negative distortion drift that points to worse investment opportunities.
[Figure 2-8 plot. Title: "Optimal weights as a function of time for S = 1". Axes: Time, Weights. Curves: nu = 1, 10, 100.]
Figure 2-8: Optimal weight of a single mean reversion trading strategy. Weight of the mean reversion trading strategy as a function of time at S = 1 for different values of the robustness multiplier.
Let us now consider the case where we have N = 2 mean reversion trading strategies and again $\bar{S} = 0$. We solve the HJB equation numerically using the method of finite differences. We have assumed that $r_f = 0$, $T = 1$,
$$\Phi = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$$
In Figure 2-9 we plot the weights of the mean reversion trading strategies as a
function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier.
We have chosen S1 = 1 and S2 = 2, since for these values the drift is the same for
both the strategies. We have assumed that there is no correlation between the two
trading strategies. We observe the following:
• First of all when there is no fear of model misspecification the weights of the
two strategies are the same and they do not change over time.
• When there is a fear of model misspecification the investor becomes more and more conservative over time, just like in the N = 1 case. It is interesting to note that the weight is higher for the first trading strategy, where the φ coefficient is higher. That makes sense, since ceteris paribus we would expect $V_S$ to be higher for the strategy with the stronger rate of mean reversion (φ coefficient). This is shown in Figure 2-11.
• Figure 2-10 shows the ratio of the weights of the two trading strategies. We
observe that this is higher for smaller values of the robustness multiplier and it
is reduced to 1 as t → T .
In Figure 2-12 we have assumed that there is a correlation ρ = 0.5 between the two trading strategies. Now the weights are smaller than before due to the positive correlation, but they have the same properties as before. In the case when there is a negative correlation (Figure 2-13) the weights are larger, since now the opportunities hedge each other; otherwise the properties remain the same.
[Figure 2-9 plot. Title: "Optimal weights as a function of time for S1 = 1 and S2 = 2". Axes: Time, Weights. Curves: Spread 1 and Spread 2 for nu = 1, 10, 100.]
Figure 2-9: Optimal weights of two uncorrelated mean reversion trading strategies for S1 = 1 and S2 = 2. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.
[Figure 2-10 plot. Title: "Ratio of optimal weights as a function of time for S1 = 1 and S2 = 2". Axes: Time, Weights. Curves: nu = 1, 10, 100.]
Figure 2-10: Ratio of the optimal weights. Ratio of the optimal weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.
[Figure 2-11 plot. Title: "Vs1 Vs2 as a function of time for S1 = 1 and S2 = 2". Axes: Time, Vs. Curves: Spread 1 and Spread 2 for nu = 1, 10, 100.]
Figure 2-11: Partial derivative of the value function with respect to S1 and S2 at S1 = 1 and S2 = 2 when ρ = 0. Partial derivative of the value function with respect to S1 and S2 as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.
[Figure 2-12 plot. Title: "Optimal weights as a function of time for S1 = 1 and S2 = 2". Axes: Time, Weights. Curves: Spread 1 and Spread 2 for nu = 1, 10, 100.]
Figure 2-12: Optimal weights of two positively correlated mean reversion trading strategies for S1 = 1 and S2 = 2. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.5.
[Figure 2-13 plot. Title: "Optimal weights as a function of time for S1 = 1 and S2 = 2". Axes: Time, Weights. Curves: Spread 1 and Spread 2 for nu = 1, 10, 100.]
Figure 2-13: Optimal weights of two negatively correlated mean reversion trading strategies for S1 = 1 and S2 = 2. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation coefficient is ρ = −0.5.
So far we have examined the two trading strategies at values where the drifts are
the same. We now choose S1 = 1 and S2 = 1, since for these values the drifts differ
for the two strategies. Figure 2-14 shows the weights of the strategies over time for
different values of the robustness multiplier when there is no correlation between the
two trading strategies. We now observe the following:
• First of all when there is no fear of model misspecification the ratio of the
weights of the two strategies is the same as the ratio of their drifts normalized
by their variances and it does not change over time.
• When there is a fear of model misspecification the investor becomes more and more conservative over time, just like in the N = 1 case, as we see in Figure 2-14.
• $V_{S_1}$ is higher than $V_{S_2}$. That makes sense, since ceteris paribus we would expect $V_S$ to be higher for the strategy with the stronger rate of mean reversion (φ coefficient). This is shown in Figure 2-16.
• Figure 2-15 shows the ratio of the weights of the two trading strategies. We observe that this is higher for smaller values of the robustness multiplier and that, after initially increasing, it finally converges to the ratio of the drifts (equal to 2 here) as $t \to T$, since $V_S$ vanishes at the horizon. The reason that initially the ratio is less than the one for the case where there is no fear of model misspecification is that initially the ratio $\frac{V_{S_1}}{V_{S_2}}$ is less than the ratio of the drifts.
A very interesting case arises when there is a high positive correlation, such as ρ = 0.9, between the two trading strategies. In this case the investor uses the second trading strategy, which has the lower drift, as a hedge for the first strategy, as shown in Figure 2-17, where the investor is shorting the first strategy while he is long the second strategy. Now $V_{S_2}$ is negative (Figure 2-18), which is to be expected, since the investor is long the asset and higher values of $S_2$ lead to worse investment opportunities for a long investor. Figure 2-19 shows the ratio of the magnitudes of the optimal weights as a function of time.
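The hedging effect under high positive correlation can already be seen in the limiting weights without fear of model misspecification, F = −Σ⁻¹(Φ + rI)S, which is the ν → ∞ limit of (2.12). The sketch below evaluates that limit for the parameter sets used in this section (Φ = diag(2, 1), unit variances, r = 0); the function name is hypothetical, and the finite-ν weights in the figures additionally depend on V_S, which this sketch does not compute.

```python
def no_fear_weights(S, rho, phi=(2.0, 1.0), r=0.0):
    """F = -Sigma^{-1} (Phi + r I) S with Phi = diag(phi), Sigma = [[1, rho], [rho, 1]]."""
    d1 = (phi[0] + r) * S[0]
    d2 = (phi[1] + r) * S[1]
    det = 1.0 - rho * rho          # determinant of Sigma
    return (-(d1 - rho * d2) / det, -(d2 - rho * d1) / det)

cases = [((1.0, 2.0), 0.0), ((1.0, 2.0), 0.5), ((1.0, 2.0), -0.5),
         ((1.0, 1.0), 0.0), ((1.0, 1.0), 0.9), ((1.0, 1.0), -0.8)]
for S, rho in cases:
    f1, f2 = no_fear_weights(S, rho)
    print(f"S={S}, rho={rho}: F1={f1:.3f}, F2={f2:.3f}")
```

At S1 = S2 = 1 with ρ = 0.9 the first weight is negative and the second is positive, so the second strategy is held long as a hedge, while for ρ = 0 and for negative correlation both weights are negative.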
[Figure 2-14 plot. Title: "Optimal weights as a function of time for S1 = 1 and S2 = 1". Axes: Time, Weights. Curves: Spread 1 and Spread 2 for nu = 1, 10, 100.]
Figure 2-14: Optimal weights of two uncorrelated mean reversion trading strategies for S1 = 1 and S2 = 1. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.
[Figure 2-15 plot. Title: "Ratio of optimal weights as a function of time for S1 = 1 and S2 = 1". Axes: Time, Weights. Curves: nu = 1, 10, 100.]
Figure 2-15: Ratio of the optimal weights at S1 = 1 and S2 = 1 when ρ = 0. Ratio of the optimal weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.
Finally, Figure 2-20 shows the weights when there is a negative correlation ρ = −0.8. The results are similar to the case of no correlation, although now the weights are higher in magnitude due to the negative correlation, which makes the two strategies good hedges.
[Figure 2-16 plot. Title: "Vs1 Vs2 as a function of time for S1 = 1 and S2 = 1". Axes: Time, Vs. Curves: Spread 1 and Spread 2 for nu = 1, 10, 100.]
Figure 2-16: Partial derivative of the value function with respect to S1 and S2 at S1 = 1 and S2 = 1 when ρ = 0. Partial derivative of the value function with respect to S1 and S2 as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.
[Figure 2-17 plot. Title: "Optimal weights as a function of time for S1 = 1 and S2 = 1". Axes: Time, Weights. Curves: Spread 1 and Spread 2 for nu = 0.1, 1, 10.]
Figure 2-17: Optimal weights of two positively correlated mean reversion trading strategies for S1 = 1 and S2 = 1. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.9.
[Figure 2-18 plot. Title: "Vs1 Vs2 as a function of time for S1 = 1 and S2 = 1". Axes: Time, Vs. Curves: V_S1 and V_S2 for nu = 0.1, 1, 10.]
Figure 2-18: Partial derivative of the value function with respect to S1 and S2 at S1 = 1 and S2 = 1 when ρ = 0.9. Partial derivative of the value function with respect to S1 and S2 as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.9.
[Figure 2-19 plot. Title: "Ratio of optimal weights as a function of time for S1 = 1 and S2 = 1". Axes: Time, Weights. Curves: nu = 1, 10, 100.]
Figure 2-19: Ratio of the optimal weights at S1 = 1 and S2 = 1 when ρ = 0.9. Ratio of the magnitude of the optimal weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = 0.9.
[Figure 2-20 plot. Title: "Optimal weights as a function of time for S1 = 1 and S2 = 1". Axes: Time, Weights. Curves: Spread 1 and Spread 2 for nu = 1, 10, 100.]
Figure 2-20: Optimal weights of two negatively correlated mean reversion trading strategies for S1 = 1 and S2 = 1. Weights of the mean reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The correlation coefficient is ρ = −0.8.
2.4.3 Convergence trades with constraints
We consider first the case where we have N = 1 arbitrage opportunity and there are
collateral constraints i.e. |F | ≤ L. Due to the symmetry of the constraint and the
spread S around 0 it suffices to study only what happens when S ≥ 0, since the
symmetry implies that the value function is an even function of the spread S and its
partial derivative with respect to the spread is an odd function of S for each t.
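The optimal weight in the arbitrage opportunity is given by:
$$F_t^{opt} = \begin{cases} -\dfrac{1}{\sigma^2 \big(1 + \frac{1}{\nu}\big)} \Big(\big(\frac{a}{T-t} + r\big) S_t + \frac{\sigma^2 V_S(S_t, t)}{\nu}\Big) & \text{if } \Big|\big(\frac{a}{T-t} + r\big) S_t + \frac{\sigma^2 V_S(S_t, t)}{\nu}\Big| \le L \sigma^2 \big(1 + \frac{1}{\nu}\big) \\[2ex] -L & \text{if } \big(\frac{a}{T-t} + r\big) S_t + \frac{\sigma^2 V_S(S_t, t)}{\nu} \ge L \sigma^2 \big(1 + \frac{1}{\nu}\big) \\[2ex] L & \text{if } \big(\frac{a}{T-t} + r\big) S_t + \frac{\sigma^2 V_S(S_t, t)}{\nu} \le -L \sigma^2 \big(1 + \frac{1}{\nu}\big) \end{cases}$$
and the minimizing distortion drift is given by $h_{\min} = -\frac{\sigma (F^{opt} + V_S)}{\nu}$.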
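The HJB equation is given by:
$$V_t + r + \tfrac{1}{2} \sigma^2 V_{SS} + \tfrac{1}{2}\,\frac{1}{1 + \frac{1}{\nu}} \Big(\frac{a}{T-t} + r\Big)^2 \frac{S^2}{\sigma^2} + V_S \Big(-\frac{a}{T-t} S + \frac{\big(\frac{a}{T-t} + r\big) S}{\nu + 1}\Big) - \tfrac{1}{2}\,\frac{1}{\nu + 1}\, V_S^2 \sigma^2 = 0$$
if $\Big|\big(\frac{a}{T-t} + r\big) S_t + \frac{\sigma^2 V_S(S_t, t)}{\nu}\Big| \le L \sigma^2 \big(1 + \frac{1}{\nu}\big)$,
$$V_t + r + \tfrac{1}{2} \sigma^2 V_{SS} - \frac{\sigma^2}{2\nu} V_S^2 - V_S \Big(\frac{a}{T-t} S - L \frac{\sigma^2}{\nu}\Big) - \frac{1}{2} \Big(1 + \frac{1}{\nu}\Big) \sigma^2 L^2 + L \Big(\frac{a}{T-t} + r\Big) S = 0$$
if $\big(\frac{a}{T-t} + r\big) S_t + \frac{\sigma^2 V_S(S_t, t)}{\nu} \ge L \sigma^2 \big(1 + \frac{1}{\nu}\big)$, and
$$V_t + r + \tfrac{1}{2} \sigma^2 V_{SS} - \frac{\sigma^2}{2\nu} V_S^2 - V_S \Big(\frac{a}{T-t} S + L \frac{\sigma^2}{\nu}\Big) - \frac{1}{2} \Big(1 + \frac{1}{\nu}\Big) \sigma^2 L^2 - L \Big(\frac{a}{T-t} + r\Big) S = 0$$
if $\big(\frac{a}{T-t} + r\big) S_t + \frac{\sigma^2 V_S(S_t, t)}{\nu} \le -L \sigma^2 \big(1 + \frac{1}{\nu}\big)$.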
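The piecewise optimal weight for the collateral-constrained convergence trade is equivalent to clipping the unconstrained robust weight to the interval [−L, L]. The following is a minimal sketch of that rule; the function name is hypothetical, the defaults a = 0.01, r = 0, σ = 1 and T = 1 are the values used for the figures, and the value passed as `v_s` is a placeholder rather than a solution of the HJB equation.

```python
def collateral_constrained_weight(S, t, v_s, nu, L, a=0.01, r=0.0, sigma=1.0, T=1.0):
    """Clip the unconstrained robust weight of the convergence trade to [-L, L]."""
    x = (a / (T - t) + r) * S + sigma ** 2 * v_s / nu
    unconstrained = -x / (sigma ** 2 * (1.0 + 1.0 / nu))
    # |unconstrained| <= L is the same condition as |x| <= L*sigma^2*(1 + 1/nu)
    return max(-L, min(L, unconstrained))

# Hypothetical illustration with V_S set to zero: the constraint binds close to T.
for t in (0.0, 0.5, 0.9, 0.99):
    F = collateral_constrained_weight(S=1.0, t=t, v_s=0.0, nu=1.0, L=0.1)
    print(f"t={t}: F={F:.4f}")
```

Because the drift term a/(T − t) grows without bound as t → T, the clipped weight eventually sits at −L for a positive spread, which is the binding behaviour described in the observations that follow.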
We solve the HJB equation numerically using the method of finite differences. In
the following figures we have assumed that rf = 0, σ = 1, a = 0.01 and T = 1. We
observe the following:
• VS is lower when the constraint is tighter (Figure 2-21). Figure 2-22 shows a
typical behaviour of VS over time at S = 1 and L = 0.1 for different values of
the robustness multiplier.
• The lower the value of the robustness multiplier, the more time it takes for the constraint to bind. For very low values of $\nu$ the investor is more conservative than in the case without model misspecification for all t. When the constraint is relatively tight, the investor is more conservative when the robustness multiplier is lower (Figure 2-23). When the constraint is relatively loose (L is higher), it might be the case that for not very low values of $\nu$ the investor is initially more aggressive and as $t \to T$ becomes more conservative compared to the case without model misspecification (Figure 2-24). This is because at the beginning the drift term $\frac{a}{T-t}$ is very low compared to $V_S$, making the total drift term lower for large values of $\nu$ than for smaller values of $\nu$, which leads to a lower magnitude of weight. This situation changes as the time to the horizon T gets smaller.
• The fact that typically $V_S$ becomes larger as $t \to T$ for each value of $\nu$ until a point close to the horizon, in combination with the fact that the drift term $\frac{a}{T-t}$ increases in a hyperbolic way, leads to F becoming larger (in absolute value) as $t \to T$ (Figures 2-23 and 2-24) until the collateral constraint binds. The risk averse investor becomes more aggressive as the time to T becomes smaller, despite the fact that as $t \to T$ the malevolent agent picks a more adverse distortion drift. This is because the improvement in the investment opportunities is so substantial that it dominates the fact that the distortion drift also gets larger.
• For very low values of ν the drift distortion hmin is always positive. When the
constraint is tight enough the drift distortion is positive for all values of the
robustness multiplier (see Figure 2-25). When the constraint is not very tight
for not very low values of ν the drift distortion starts negative and after some
point increases as t → T to positive values (see Figure 2-26).
• Figure 2-27 shows the two terms of the distortion drift for ν = 1 when L = 0.1.
The first term corresponds to a positive distortion drift that reduces the wealth
of the investor, since the investor is shorting the spread, and it is bounded above
due to the collateral constraint. The second term corresponds to a negative
distortion drift that points to worse investment opportunities. In this tradeoff
the first term wins and the resulting distortion drift is positive. For higher
values of L (see Figure 2-28) the first term loses at the beginning, which makes
the drift distortion negative and explains why the investor is initially more
aggressive than in the case without fear of model misspecification. As t → T,
however, the first term increases at a fast rate and eventually makes the drift
distortion positive, which explains why after a while the investor becomes more
conservative than in the case without fear of model misspecification. It is
interesting to note that, after the collateral constraint binds, the evolution of
the distortion drift is determined by the evolution of V_S, and therefore it might
also undergo some initial reduction before increasing to its upper bound dictated
by the constraint.
• The tighter the collateral constraint, the more conservative the investor is, even
when the constraint does not bind (Figure 2-29).
• As ν → ∞ the optimal weight in the strategy converges to the optimal weight
when there is no fear of model misspecification as we have argued before.
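The interplay between the hyperbolic drift and the collateral constraint can be
illustrated in the limiting case ν → ∞, where the weight converges to the one without
fear of model misspecification. The following is a minimal sketch, assuming (this is
not restated in this section) that the spread drift is −aS/(T − t), that the spread
volatility is a constant σ, and that the unconstrained weight in this limit is the
myopic one, −aS/(σ²(T − t)), truncated at ±L. The function names and the values
a = 0.01 and σ = 1 are hypothetical and chosen only for illustration.

```python
def myopic_weight(t, s, a, sigma, T, L):
    """Myopic weight -a*s / (sigma^2 * (T - t)), truncated to the band [-L, L]."""
    unconstrained = -a * s / (sigma**2 * (T - t))
    return max(-L, min(L, unconstrained))


def binding_time(s, a, sigma, T, L):
    """First time at which the unconstrained myopic weight reaches the bound L.

    Returns 0.0 if the constraint already binds at t = 0.
    """
    return max(0.0, T - a * abs(s) / (sigma**2 * L))


# Hypothetical parameters, not taken from the text.
a, sigma, T, s = 0.01, 1.0, 1.0, 1.0
for L in (0.1, 1.0):
    print(f"L = {L}: constraint binds from t = {binding_time(s, a, sigma, T, L):.3f}")
```

Under these assumptions a tighter bound L is reached earlier, and before that time the
weight magnitude grows hyperbolically in the time to the horizon.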
[Plot: V_S as a function of time for S = 1. Horizontal axis: Time; vertical axis: V_S.]
Figure 2-21: Partial derivative of the value function with respect to S for
a single convergence trade when L = 0.1 and L = 100. VS as a function of
time at S = 1 for different values of the robustness multiplier. The solid line is when
L = 100 and the dotted line is for L = 0.1.
[Plot: V_S as a function of time for S = 1. Horizontal axis: Time; vertical axis: V_S (×10^-3). Curves: ν = 0.01, 0.1, 1, 10, 100 with L = 0.1.]
Figure 2-22: Partial derivative of the value function with respect to S for a
single convergence trade when L = 0.1. VS as a function of time at S = 1 for
different values of the robustness multiplier. The collateral constraint is |F | ≤ 0.1.
[Plot: Optimal weights as a function of time for S = 1. Horizontal axis: Time; vertical axis: Weights. Curves: ν = 0.01, 0.1, 1, 10, 100 with L = 0.1.]
Figure 2-23: Optimal weight of a single convergence trade when L = 0.1.
Weight of the convergence trading strategy as a function of time at S = 1 for different
values of the robustness multiplier. The collateral constraint is |F | ≤ 0.1.
[Plot: Optimal weights as a function of time for S = 1. Horizontal axis: Time; vertical axis: Weights. Curves: ν = 0.01, 0.1, 1, 10, 100 with L = 1.]
Figure 2-24: Optimal weight of a single convergence trade when L = 1. Weight
of the convergence trading strategy as a function of time at S = 1 for different values
of the robustness multiplier. The collateral constraint is |F | ≤ 1.
[Plot: h_min as a function of time for S = 1. Horizontal axis: Time; vertical axis: h_min. Curves: ν = 0.01, 0.1, 1, 10, 100 with L = 0.1.]
Figure 2-25: Distortion drift for a single convergence trade when L = 0.1.
Distortion drift as a function of time at S = 1 for different values of the robustness
multiplier. The collateral constraint is |F | ≤ 0.1.
[Plot: h_min as a function of time for S = 1. Horizontal axis: Time; vertical axis: h_min. Curves: ν = 0.01, 0.1, 1, 10, 100 with L = 1.]
Figure 2-26: Distortion drift for a single convergence trade when L = 1.
Distortion drift as a function of time at S = 1 for different values of the robustness
multiplier. The collateral constraint is |F | ≤ 1.
[Plot: Drift distortion and its two terms as a function of time for S = 1. Horizontal axis: Time; vertical axis: Distortion. Curves: h, h1, h2.]
Figure 2-27: Distortion drift terms for a single convergence trade when
L = 0.1. Distortion drift terms as a function of time at S = 1 for ν = 1 and L =
0.1. The first term corresponds to a positive distortion drift that reduces the wealth
of the investor and it is bounded above due to the collateral constraint, while the
second term corresponds to a negative distortion drift that points to worse investment
opportunities.
[Plot: Drift distortion and its two terms as a function of time for S = 1. Horizontal axis: Time; vertical axis: Distortion. Curves: h, h1, h2.]
Figure 2-28: Distortion drift terms for a single convergence trade when
L = 1. Distortion drift terms as a function of time at S = 1 for ν = 1 and L =
1. The first term corresponds to a positive distortion drift that reduces the wealth
of the investor and it is bounded above due to the collateral constraint, while the
second term corresponds to a negative distortion drift that points to worse investment
opportunities.
[Plot: Optimal weights as a function of time for S = 1. Horizontal axis: Time; vertical axis: Weights.]
Figure 2-29: Optimal weight of a single convergence trade when L = 0.1 and
L = 100. Weight of the convergence trading strategy as a function of time at S = 1
for different values of the robustness multiplier. The solid line is when L = 100 and
the dotted line is for L = 0.1.
Let us now consider the case where we have N = 2 convergence trades. We
solve the HJB equation numerically using the method of finite differences. We have
assumed that r_f = 0, T = 1,

$$a = \begin{pmatrix} 0.04 \\ 0.02 \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}.$$

Additionally we assume that we have a VaR constraint F^T Σ F ≤ L. In Figures
2-30 and 2-32 we plot the weights of the convergence trading strategies as a function
of time at S_1 = 1 and S_2 = 2 for different values of L and different values of the
robustness multiplier. At these values of S_1 and S_2 the drift is the same for both
strategies. We have assumed that there is no correlation between the two trading
strategies. Figures 2-31 and 2-33 show F^T Σ F, the normalized wealth variance, as a
function of time for the different values of the robustness multiplier. We observe the
following:
• When the VaR constraint binds and there is fear of model misspecification,
we invest more in the spread with the higher rate of mean reversion, due to its
higher V_S. The asymmetry between the two convergence trades goes down as
t → T, which is why, when the VaR constraint binds, the difference in the
weights of the two strategies has to go down, as we see in Figures 2-30 and
2-32, where L = 0.5 and L = 0.05 respectively. Moreover, the investment in the
spread with the higher rate of mean reversion is higher than the corresponding
investment when there is no fear of model misspecification. This is because in
both cases we have the same binding VaR constraint F_1² + F_2² = L, but in
the first case there is an asymmetry causing F_1 to be higher and F_2 to be lower
(in magnitude) than the corresponding weights in the second case.
• The lower the value of the robustness multiplier, the more time it takes to bind
the constraint, just like in the N = 1 case.
• When the VaR constraint does not bind, the weights of both strategies increase
in magnitude as t → T due to the improvement of the investment opportunities.
When the constraint is relatively loose (L is higher), it may happen that, for
values of ν that are not very low, the investor is initially more aggressive and,
as t → T, becomes more conservative than in the case without model
misspecification (Figures 2-30 and 2-32). This is because at the beginning the
drift term a/(T − t) is very low compared to V_S, making the total drift term lower
for large values of ν than for smaller values of ν, which leads to a lower weight
magnitude.
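As a reference point for the binding behaviour described above, the benchmark without
fear of model misspecification can be worked out by hand. The display below is a
sketch under assumptions that are not restated in this section: the drift of spread i
is −a_i S_i/(T − t), the unconstrained benchmark weights are the myopic ones, and
ρ = 0 with unit variances. It gives the time t* at which the VaR constraint starts to
bind at S_1 = 1, S_2 = 2.

```latex
\[
F_i(t) = -\frac{a_i S_i}{T-t}, \qquad
F^{T}\Sigma F = \frac{(a_1 S_1)^2 + (a_2 S_2)^2}{(T-t)^2}
             = \frac{2\,(0.04)^2}{(T-t)^2},
\]
\[
F^{T}\Sigma F = L \iff T - t^{*} = 0.04\sqrt{2/L}:
\qquad L = 0.5 \Rightarrow t^{*} = 0.92,
\qquad L = 0.05 \Rightarrow t^{*} \approx 0.747 .
\]
```

After t* the benchmark weights stay equal, F_1 = F_2 = −√(L/2), which is the symmetric
point from which the weights under fear of model misspecification deviate.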
[Plot: Optimal weights as a function of time for S1 = 1 and S2 = 2. Horizontal axis: Time; vertical axis: Weights. Curves: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-30: Optimal weights of two uncorrelated convergence trades for
S1 = 1 and S2 = 2 when L = 0.5. Weights of the convergence trades as a function
of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The
correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.5.
[Plot: VaR constraint as a function of time for S1 = 1 and S2 = 2. Horizontal axis: Time; vertical axis: left-hand side of the VaR constraint. Curves: ν = 1, 10, 100.]
Figure 2-31: Value of the normalized wealth variance for two uncorrelated
convergence trades at S1 = 1 and S2 = 2 when L = 0.5. Value of the normalized
wealth variance for two uncorrelated convergence trades as a function of time at
S1 = 1 and S2 = 2 for different values of the robustness multiplier. The correlation
coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.5.
[Plot: Optimal weights as a function of time for S1 = 1 and S2 = 2. Horizontal axis: Time; vertical axis: Weights. Curves: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-32: Optimal weights of two uncorrelated convergence trades for
S1 = 1 and S2 = 2 when L = 0.05. Weights of the convergence trades as a function
of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier. The
correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.05.
[Plot: VaR constraint as a function of time for S1 = 1 and S2 = 2. Horizontal axis: Time; vertical axis: left-hand side of the VaR constraint. Curves: ν = 1, 10, 100.]
Figure 2-33: Value of the normalized wealth variance for two uncorrelated
convergence trades at S1 = 1 and S2 = 2 when L = 0.05. Value of the normalized
wealth variance for two uncorrelated convergence trades as a function of time at S1 = 1
and S2 = 2 for different values of the robustness multiplier. The correlation coefficient
is ρ = 0 and the rhs of the VaR constraint is L = 0.05.
Figures 2-34 and 2-35 show the weights and the left-hand side of the constraint for
the case of positive correlation, while Figures 2-36 and 2-37 cover the case of
negative correlation. The properties are similar to those in the uncorrelated case.
[Plot: Optimal weights as a function of time for S1 = 1 and S2 = 2. Horizontal axis: Time; vertical axis: Weights. Curves: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-34: Optimal weights of two positively correlated convergence trades
for S1 = 1 and S2 = 2 when L = 0.05. Weights of the convergence trades as a
function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier.
The correlation coefficient is ρ = 0.5 and the rhs of the VaR constraint is L = 0.05.
[Plot: VaR constraint as a function of time for S1 = 1 and S2 = 2. Horizontal axis: Time; vertical axis: left-hand side of the VaR constraint. Curves: ν = 1, 10, 100.]
Figure 2-35: Value of the normalized wealth variance for two positively
correlated convergence trades at S1 = 1 and S2 = 2 when L = 0.05. Value of
the normalized wealth variance for two positively correlated convergence trades as a
function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier.
The correlation coefficient is ρ = 0.5 and the rhs of the VaR constraint is L = 0.05.
[Plot: Optimal weights as a function of time for S1 = 1 and S2 = 2. Horizontal axis: Time; vertical axis: Weights. Curves: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-36: Optimal weights of two negatively correlated convergence
trades for S1 = 1 and S2 = 2 when L = 0.05. Weights of the convergence trades
as a function of time at S1 = 1 and S2 = 2 for different values of the robustness
multiplier. The correlation coefficient is ρ = −0.5 and the rhs of the VaR constraint
is L = 0.05.
[Plot: VaR constraint as a function of time for S1 = 1 and S2 = 2. Horizontal axis: Time; vertical axis: left-hand side of the VaR constraint. Curves: ν = 1, 10, 100.]
Figure 2-37: Value of the normalized wealth variance for two negatively
correlated convergence trades at S1 = 1 and S2 = 2 when L = 0.05. Value of
the normalized wealth variance for two negatively correlated convergence trades as a
function of time at S1 = 1 and S2 = 2 for different values of the robustness multiplier.
The correlation coefficient is ρ = −0.5 and the rhs of the VaR constraint is L = 0.05.
So far we have examined the two trading strategies at values where the drifts are
the same. We now choose S1 = 1 and S2 = 1, since for these values the drifts differ
for the two strategies. Figure 2-38 shows the weights of the strategies over time for
different values of the robustness multiplier when there is no correlation between the
two trading strategies and when L = 0.05, while Figure 2-39 shows the lhs of the VaR
constraint, the normalized wealth variance, over time. We now observe the following:
• First of all, when there is no fear of model misspecification the ratio of the
weights of the two strategies is the same as the ratio of their drifts normalized
by their variances, and it does not change over time regardless of whether the
VaR constraint binds or not.
• When the VaR constraint binds and there is fear of model misspecification, the
investor invests more in the spread with the higher drift. Since the constraint is
F_1² + F_2² = L, if F_1 increases (decreases) in magnitude over time, then F_2
decreases (increases) in magnitude over time. In Figure 2-38 F_1 decreases over
time (in magnitude), because the asymmetry between the two strategies goes
down as t → T. This asymmetry is also the reason why the difference in the
weights of the two strategies is larger when the robustness multiplier is lower
than when it is higher.
• The lower the value of the robustness multiplier, the more time it takes to bind
the VaR constraint.
• When the VaR constraint does not bind then the investor becomes more and
more aggressive, due to the improvement of the investment opportunities.
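The first observation, that without fear of model misspecification the weight ratio is
fixed by the normalized drifts whether or not the VaR constraint binds, can be checked
numerically. The sketch below assumes a myopic benchmark (maximize μᵀF − ½FᵀΣF subject
to FᵀΣF ≤ L) with drifts μ_i = −a_i S_i/(T − t); both the function name
`myopic_var_weights` and this drift specification are assumptions made for the
illustration, not formulas quoted from this section.

```python
import numpy as np


def myopic_var_weights(mu, cov, L):
    """Maximize mu'F - 0.5 F'ΣF subject to F'ΣF <= L (benchmark without distortion)."""
    f = np.linalg.solve(cov, mu)        # unconstrained optimum Σ^{-1} μ
    variance = float(f @ cov @ f)       # F'ΣF at the unconstrained optimum
    if variance > L:
        # The constrained optimum shrinks the same direction onto F'ΣF = L.
        f = f * np.sqrt(L / variance)
    return f


a = np.array([0.04, 0.02])              # values of a used in this section
s = np.array([1.0, 1.0])                # S1 = 1, S2 = 1
cov = np.eye(2)                         # rho = 0
T, L = 1.0, 0.05

for t in (0.0, 0.5, 0.9):
    mu = -a * s / (T - t)               # assumed drift of each spread
    f = myopic_var_weights(mu, cov, L)
    print(t, f, f[0] / f[1], float(f @ cov @ f))
```

Because the constraint only rescales Σ⁻¹μ, the ratio F_1/F_2 stays at 2 for every t in
this sketch, both before and after the constraint binds.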
[Plot: Optimal weights as a function of time for S1 = 1 and S2 = 1. Horizontal axis: Time; vertical axis: Weights. Curves: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-38: Optimal weights of two uncorrelated convergence trades for
S1 = 1 and S2 = 1 when L = 0.05. Weights of the convergence trades as a function
of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier. The
correlation coefficient is ρ = 0 and the rhs of the VaR constraint is L = 0.05.
[Plot: VaR constraint as a function of time for S1 = 1 and S2 = 1. Horizontal axis: Time; vertical axis: left-hand side of the VaR constraint. Curves: ν = 1, 10, 100.]
Figure 2-39: Value of the normalized wealth variance for two uncorrelated
convergence trades at S1 = 1 and S2 = 1 when L = 0.05. Value of the normalized
wealth variance for two uncorrelated convergence trades as a function of time at S1 = 1
and S2 = 1 for different values of the robustness multiplier. The correlation coefficient
is ρ = 0 and the rhs of the VaR constraint is L = 0.05.
It is interesting to look at the weights when the positive correlation is high enough
for the investor to be long the second spread and use it as a hedge for the first one.
In Figure 2-40 we plot the weights of the two trading strategies and we observe that
they have different signs. Figure 2-41 shows the normalized wealth variance as a
function of time. When the VaR constraint does not bind, both weights increase in
magnitude due to the improvement in the investment opportunities. When the VaR
constraint binds, both weights also become larger in magnitude. This is because the
asymmetry between the two strategies is reduced towards the asymmetry in the case
without fear of model misspecification.
Finally, Figures 2-42 and 2-43 show what happens when there is a negative
correlation. The results are similar to the case of no correlation.
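How high the correlation has to be for the second spread to become a hedge can be
seen in the benchmark without fear of model misspecification. The display is a sketch
under the assumption (not restated here) that the benchmark weights are proportional
to Σ⁻¹μ with drifts μ_i = −a_i S_i/(T − t), evaluated at S_1 = S_2 = 1 with
a = (0.04, 0.02).

```latex
\[
\Sigma^{-1} = \frac{1}{1-\rho^{2}}
\begin{pmatrix} 1 & -\rho \\ -\rho & 1 \end{pmatrix},
\qquad
\Sigma^{-1}\mu = \frac{1}{(1-\rho^{2})(T-t)}
\begin{pmatrix} -0.04 + 0.02\,\rho \\ -0.02 + 0.04\,\rho \end{pmatrix},
\]
\[
F_1 < 0 \ \text{for all } |\rho| < 1,
\qquad
F_2 > 0 \iff \rho > \frac{a_2 S_2}{a_1 S_1} = 0.5 .
\]
```

With ρ = 0.8 the threshold is exceeded, so the benchmark already has weights of
opposite sign, while for ρ = 0 or ρ = −0.8 both spreads are shorted.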
[Plot: Optimal weights as a function of time for S1 = 1 and S2 = 1. Horizontal axis: Time; vertical axis: Weights. Curves: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-40: Optimal weights of two positively correlated convergence trades
for S1 = 1 and S2 = 1 when L = 0.05. Weights of the convergence trades as a
function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier.
The correlation coefficient is ρ = 0.8 and the rhs of the VaR constraint is L = 0.05.
[Plot: VaR constraint as a function of time for S1 = 1 and S2 = 1. Horizontal axis: Time; vertical axis: left-hand side of the VaR constraint. Curves: ν = 1, 10, 100.]
Figure 2-41: Value of the normalized wealth variance for two positively
correlated convergence trades at S1 = 1 and S2 = 1 when L = 0.05. Value of
the normalized wealth variance for two positively correlated convergence trades as a
function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier.
The correlation coefficient is ρ = 0.8 and the rhs of the VaR constraint is L = 0.05.
[Plot: Optimal weights as a function of time for S1 = 1 and S2 = 1. Horizontal axis: Time; vertical axis: Weights. Curves: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-42: Optimal weights of two negatively correlated convergence
trades for S1 = 1 and S2 = 1 when L = 0.05. Weights of the convergence trades as a
function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier.
The correlation coefficient is ρ = −0.8 and the rhs of the VaR constraint is L = 0.05.
[Plot: VaR constraint as a function of time for S1 = 1 and S2 = 1. Horizontal axis: Time; vertical axis: left-hand side of the VaR constraint. Curves: ν = 1, 10, 100.]
Figure 2-43: Value of the normalized wealth variance for two negatively
correlated convergence trades at S1 = 1 and S2 = 1 when L = 0.05. Value of
the normalized wealth variance for two negatively correlated convergence trades as a
function of time at S1 = 1 and S2 = 1 for different values of the robustness multiplier.
The correlation coefficient is ρ = −0.8 and the rhs of the VaR constraint is L = 0.05.
2.4.4 Mean reversion trading strategies with constraints
We consider first the case where we have N = 1 mean reversion trading strategy with
S̄ = 0 and a collateral constraint |F| ≤ L. Again, due to the symmetry of
S around 0, it suffices to study only what happens when S ≥ 0: the symmetric
behaviour of the spread dynamics combined with the symmetry of the constraint
implies that the value function is an even function of the spread S and that its
partial derivative with respect to the spread is an odd function of S for each t.
The optimal weight in the trading strategy is given by:

$$F_t^{opt} = \begin{cases}
-\dfrac{1}{\sigma^2\left(1+\frac{1}{\nu}\right)}\left((\phi + r)S_t + \dfrac{\sigma^2 V_S(S_t,t)}{\nu}\right) & \text{if } \left|(\phi + r)S_t + \dfrac{\sigma^2 V_S(S_t,t)}{\nu}\right| \le L\sigma^2\left(1+\frac{1}{\nu}\right) \\[2ex]
-L & \text{if } (\phi + r)S_t + \dfrac{\sigma^2 V_S(S_t,t)}{\nu} \ge L\sigma^2\left(1+\frac{1}{\nu}\right) \\[2ex]
L & \text{if } (\phi + r)S_t + \dfrac{\sigma^2 V_S(S_t,t)}{\nu} \le -L\sigma^2\left(1+\frac{1}{\nu}\right)
\end{cases}$$

and the minimizing distortion drift is given by:

$$h_{min} = -\frac{\sigma\left(F^{opt} + V_S\right)}{\nu}.$$

When S is positive, we have V_S ≥ 0, since higher values of S correspond to better
investment opportunities. If V_S ≥ 0 then −L ≤ F^opt ≤ 0 for S ≥ 0. Therefore we see
that again there is a tradeoff between the two terms in h_min. The first term,
−σF^opt/ν, corresponds to a positive distortion drift that reduces the wealth of the
investor, since the investor is shorting the spread, and it is bounded above. The
second term, −σV_S/ν, corresponds to a negative distortion drift that points to worse
investment opportunities.
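The piecewise rule for the weight and the two-term split of the distortion drift can be
written compactly, since the three cases amount to truncating the unconstrained weight
to the band [−L, L]. The following is a minimal sketch; the function names are
hypothetical, and the value V_S = 0.2 used in the example is an arbitrary placeholder,
not a solution of the HJB equation.

```python
def optimal_weight(s, v_s, phi, r, sigma, nu, L):
    """Unconstrained weight from the first case, truncated to [-L, L]."""
    z = (phi + r) * s + sigma**2 * v_s / nu
    unconstrained = -z / (sigma**2 * (1.0 + 1.0 / nu))
    return max(-L, min(L, unconstrained))


def distortion_terms(f_opt, v_s, sigma, nu):
    """Split h_min = -sigma * (F_opt + V_S) / nu into its two terms."""
    h1 = -sigma * f_opt / nu    # wealth term: positive when the spread is shorted
    h2 = -sigma * v_s / nu      # opportunity term: negative when V_S > 0
    return h1, h2, h1 + h2


# Example at S = 1 with phi = sigma = 1, r = 0, nu = 2, L = 0.7.
# V_S = 0.2 is a placeholder value chosen only for illustration.
f_opt = optimal_weight(s=1.0, v_s=0.2, phi=1.0, r=0.0, sigma=1.0, nu=2.0, L=0.7)
h1, h2, h_min = distortion_terms(f_opt, v_s=0.2, sigma=1.0, nu=2.0)
print(f_opt, h1, h2, h_min)
```

In this example the constraint binds, the first term is capped at σL/ν, and it
outweighs the second term, so the total distortion drift is positive.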
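The HJB equation is given by:

$$\begin{cases}
V_t + r + \frac{1}{2}\sigma^2 V_{SS} + \frac{1}{2}\,\dfrac{1}{1+\frac{1}{\nu}}\,\dfrac{(\phi + r)^2 S^2}{\sigma^2} + V_S\left(-\phi S + \dfrac{(\phi + r)S}{\nu+1}\right) - \frac{1}{2}\,\dfrac{1}{\nu+1}\,V_S^2\sigma^2 = 0 \\[1ex]
\qquad \text{if } \left|(\phi + r)S_t + \dfrac{\sigma^2 V_S(S_t,t)}{\nu}\right| \le L\sigma^2\left(1+\frac{1}{\nu}\right) \\[3ex]
V_t + r + \frac{1}{2}\sigma^2 V_{SS} - \dfrac{\sigma^2}{2\nu}V_S^2 - V_S\left(\phi S - \dfrac{L\sigma^2}{\nu}\right) - \frac{1}{2}\left(1+\frac{1}{\nu}\right)\sigma^2 L^2 + L(\phi + r)S = 0 \\[1ex]
\qquad \text{if } (\phi + r)S_t + \dfrac{\sigma^2 V_S(S_t,t)}{\nu} \ge L\sigma^2\left(1+\frac{1}{\nu}\right) \\[3ex]
V_t + r + \frac{1}{2}\sigma^2 V_{SS} - \dfrac{\sigma^2}{2\nu}V_S^2 - V_S\left(\phi S + \dfrac{L\sigma^2}{\nu}\right) - \frac{1}{2}\left(1+\frac{1}{\nu}\right)\sigma^2 L^2 - L(\phi + r)S = 0 \\[1ex]
\qquad \text{if } (\phi + r)S_t + \dfrac{\sigma^2 V_S(S_t,t)}{\nu} \le -L\sigma^2\left(1+\frac{1}{\nu}\right)
\end{cases}$$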
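The three branches can be traced back to a single max-min problem. The display below
is a sketch of that step, under the assumption (not restated in this section) that the
penalty on the distortion drift enters the HJB equation as (ν/2)h²; with this form the
stated h_min and F^opt follow from the first-order conditions.

```latex
\[
V_t + r + \tfrac{1}{2}\sigma^2 V_{SS}
+ \max_{|F|\le L}\,\min_{h}\Big\{ F\big(-(\phi+r)S + \sigma h\big)
- \tfrac{1}{2}\sigma^2 F^2 + V_S\big(-\phi S + \sigma h\big)
+ \tfrac{\nu}{2}h^2 \Big\} = 0 ,
\]
\[
\nu h + \sigma (F + V_S) = 0
\;\Longrightarrow\;
h_{min} = -\frac{\sigma (F + V_S)}{\nu},
\]
\[
\max_{|F|\le L}\Big\{ -(\phi+r)S\,F - \tfrac{1}{2}\sigma^2 F^2 - \phi S\,V_S
- \frac{\sigma^2 (F + V_S)^2}{2\nu} \Big\},
\qquad
\sigma^2\Big(1+\frac{1}{\nu}\Big)F = -\Big((\phi+r)S + \frac{\sigma^2 V_S}{\nu}\Big).
\]
```

Substituting the interior solution gives the first branch, while setting F = −L or
F = L gives the two binding branches. At the switching points
(φ + r)S + σ²V_S/ν = ±Lσ²(1 + 1/ν) the branches take the same value, so the
equation is continuous across regimes.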
We solve the HJB equation numerically using the method of finite differences. In the
following figures we have assumed that r_f = 0, σ = 1, φ = 1 and T = 1 for two
different constraints, one tight and one very loose. The results are similar to the
case where there are no constraints. In addition we observe the following:
• V_S becomes higher the less tight the constraint is, for each value of the
robustness multiplier (Figure 2-47). This is to be expected, since the tighter the
constraint, the less we can take advantage of the better investment opportunities
when S gets higher.
• Figure 2-45 shows the two terms of the distortion drift for ν = 2 and L = 0.7.
The first term corresponds to a positive distortion drift that reduces the wealth
of the investor, since the investor is shorting the spread, and it is bounded above
due to the constraint. The second term corresponds to a negative distortion
drift that points to worse investment opportunities. In this tradeoff the first
term wins, making the drift distortion positive.
• Figures 2-46 and 2-48 show the optimal weight when S = 1 over time for different
values of ν. For L = 0.7 we see that for high values of ν the constraint binds over
the whole time period; for lower values of ν the constraint binds initially and
the magnitude of F is reduced after some time t; finally, for even lower values
of ν the constraint does not bind at all.
• Figure 2-48 shows that when the constraint is tighter the investor is more
conservative even when the constraint does not bind. This is expected, since the
tighter the constraint, the smaller V_S is. This difference in the weights
becomes smaller as t → T.
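The finite-difference solution can be sketched with a simple explicit scheme. The code
below is only an illustration under several assumptions that this section does not
specify: the terminal condition V(S, T) = 0, the grid on [0, s_max], the crude
far-boundary treatment V_SS = 0 at s_max, and a penalty of the form (ν/2)h² in the
HJB equation. The scheme actually used for the figures is not described here, and all
function names are hypothetical.

```python
import numpy as np


def optimal_weight(s, v_s, phi, r, sigma, nu, L):
    """Unconstrained first-order condition, truncated to the band [-L, L]."""
    z = (phi + r) * s + sigma**2 * v_s / nu
    return np.clip(-z / (sigma**2 * (1.0 + 1.0 / nu)), -L, L)


def derivatives(v, ds):
    """First and second differences of V on the grid S in [0, s_max]."""
    v_s = np.empty_like(v)
    v_ss = np.empty_like(v)
    v_s[1:-1] = (v[2:] - v[:-2]) / (2.0 * ds)
    v_ss[1:-1] = (v[2:] - 2.0 * v[1:-1] + v[:-2]) / ds**2
    v_s[0] = 0.0                              # V is even in S, so V_S(0, t) = 0
    v_ss[0] = 2.0 * (v[1] - v[0]) / ds**2     # mirror point across S = 0
    v_s[-1] = (v[-1] - v[-2]) / ds            # one-sided difference at s_max
    v_ss[-1] = 0.0                            # crude far-boundary assumption
    return v_s, v_ss


def solve_hjb(nu, L, phi=1.0, sigma=1.0, r=0.0, T=1.0,
              s_max=3.0, n_s=61, n_t=2000):
    """March V backward in time from the assumed terminal condition V(S, T) = 0."""
    s = np.linspace(0.0, s_max, n_s)
    ds = s[1] - s[0]
    dt = T / n_t
    if dt > ds**2 / sigma**2:
        raise ValueError("explicit scheme needs dt <= ds^2 / sigma^2")
    v = np.zeros(n_s)
    for _ in range(n_t):
        v_s, v_ss = derivatives(v, ds)
        f = optimal_weight(s, v_s, phi, r, sigma, nu, L)
        h = -sigma * (f + v_s) / nu           # minimizing distortion drift
        wealth_part = f * (-(phi + r) * s + sigma * h) - 0.5 * sigma**2 * f**2
        value_part = v_s * (-phi * s + sigma * h) + 0.5 * sigma**2 * v_ss
        v = v + dt * (r + wealth_part + value_part + 0.5 * nu * h**2)
    v_s, _ = derivatives(v, ds)               # quantities at t = 0
    f = optimal_weight(s, v_s, phi, r, sigma, nu, L)
    return s, v, v_s, f


s, v, v_s, f = solve_hjb(nu=2.0, L=0.7)
i = int(np.argmin(np.abs(s - 1.0)))           # grid index closest to S = 1
print(f"V_S(1, 0) = {v_s[i]:.3f}, F(1, 0) = {f[i]:.3f}")
```

An explicit scheme keeps the sketch short at the cost of a small time step; an implicit
or semi-implicit scheme would relax the step-size restriction.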
[Plot: V_S as a function of time for S = 1. Horizontal axis: Time; vertical axis: V_S. Curves: ν = 1, 2, 3 with L = 0.7.]
Figure 2-44: Partial derivative of the value function with respect to S for
a single mean reversion trading strategy and a collateral constraint with
L = 0.7. VS as a function of time at S = 1 for different values of the robustness
multiplier for L = 0.7.
[Plot: Drift distortion and its two terms as a function of time for S = 1. Horizontal axis: Time; vertical axis: Distortion. Curves: h, h1, h2.]
Figure 2-45: Distortion drift terms for a single mean reversion trading strategy and a collateral constraint with L = 0.7. Distortion drift terms as a function
of time at S = 1 for ν = 2 and for L = 0.7. The first term corresponds to a positive
distortion drift that reduces the wealth of the investor, since the investor is shorting the spread, while the second term corresponds to a negative distortion drift that
points to worse investment opportunities. The first term is bounded above due to the
collateral constraint.
[Plot: Optimal weights as a function of time for S = 1. Horizontal axis: Time; vertical axis: Weights. Curves: ν = 1, 2, 3 with L = 0.7.]
Figure 2-46: Optimal weight of a single mean reversion trading strategy
with a collateral constraint with L = 0.7. Weight of the mean reversion trading
strategy as a function of time at S = 1 for different values of the robustness multiplier
and for L = 0.7.
[Plot: V_S as a function of time for S = 1. Horizontal axis: Time; vertical axis: V_S.]
Figure 2-47: Partial derivative of the value function with respect to S for a
single mean reversion trading strategy with different collateral constraints.
VS as a function of time at S = 1 for different values of the robustness multiplier and
different collateral constraints. The solid line is for L = 7 and the dotted line for
L = 0.7.
[Plot: Optimal weights as a function of time for S = 1. Horizontal axis: Time; vertical axis: Weights.]
Figure 2-48: Optimal weight of a single mean reversion trading strategy with
different collateral constraints. Weight of the mean reversion trading strategy
as a function of time at S = 1 for different values of the robustness multiplier and
different collateral constraints. The solid line is for L = 7 and the dotted line for
L = 0.7.
Let us now consider the case where we have N = 2 mean reversion trading strategies
and again S̄ = 0. We solve the HJB equation numerically using the method of
finite differences. We have assumed that r_f = 0, T = 1,

$$\Phi = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}.$$

Additionally we assume that we have a VaR constraint F^T Σ F ≤ L. In Figures
2-49, 2-51 and 2-53 we plot the weights of the mean reversion trading strategies as a
function of time at S_1 = 1 and S_2 = 2 for different values of L and different values
of the robustness multiplier. At these values of S_1 and S_2 the drift is the same for
both strategies. We have assumed that there is no correlation between the two
trading strategies. Figures 2-50, 2-52 and 2-54 show F^T Σ F as a function of time for
the different values of the robustness multiplier. We observe the following:
• First of all, when there is no fear of model misspecification the weights of the
two strategies are the same, since there is no asymmetry between them, and
they do not change over time regardless of whether the VaR constraint binds
or not.
• When the VaR constraint binds and there is fear of model misspecification, we
invest more in the spread with the higher φ coefficient, due to its higher V_S. As
we saw in the unconstrained case (Figure 2-10), the ratio of the weights is reduced
over time, since the asymmetry between them shrinks as V_S1 and V_S2 decrease
over time. Therefore, when the VaR constraint binds, the difference
in the weights of the two strategies has to go down, as we see in Figures 2-49
and 2-51, where L = 3 and L = 2 respectively. Moreover, the investment in the
spread with the higher φ coefficient is higher than the corresponding investment
when there is no fear of model misspecification. This is because
in both cases we have the same binding VaR constraint F_1² + F_2² = L, but in the
first case there is an asymmetry causing F_1 to be higher and F_2 to be lower (in
magnitude) than the corresponding weights in the second case.
• When the VaR constraint does not bind the weights of both of the strategies
go down just like in the unconstrained case as we see in Figures 2-49 and 2-53.
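The benchmark weights without fear of model misspecification can be worked out
explicitly for this parameter set. The display is a sketch under the assumption (not
restated here) that the benchmark is myopic, with drifts μ = −ΦS for r_f = 0 and
weights proportional to Σ⁻¹μ, rescaled onto the VaR constraint when it binds; ρ = 0
and S = (1, 2) as in the figures.

```latex
\[
\mu = -\Phi S = -\begin{pmatrix} 2\cdot 1 \\ 1\cdot 2 \end{pmatrix}
    = \begin{pmatrix} -2 \\ -2 \end{pmatrix},
\qquad
\Sigma^{-1}\mu = \mu,
\qquad
\mu^{T}\Sigma^{-1}\mu = 8,
\]
\[
F = \sqrt{\frac{L}{8}}\;\mu \quad (L < 8):
\qquad
L = 2 \Rightarrow F_1 = F_2 = -1,
\quad
L = 3 \Rightarrow F_1 = F_2 \approx -1.22,
\quad
L = 7 \Rightarrow F_1 = F_2 \approx -1.87 .
\]
```

Since neither μ nor the scaling factor depends on t, these benchmark weights are equal
and constant over time, and the constraint binds throughout for each of the three
values of L.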
[Plot: Optimal weights as a function of time for S1 = 1 and S2 = 2. Horizontal axis: Time; vertical axis: Weights. Curves: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-49: Optimal weights of two uncorrelated mean reversion trading
strategies for S1 = 1 and S2 = 2 when L = 3. Weights of the mean reversion
trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of
the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR
constraint is L = 3.
[Plot: VaR constraint as a function of time for S1 = 1 and S2 = 2. Horizontal axis: Time; vertical axis: left-hand side of the VaR constraint. Curves: ν = 1, 10, 100.]
Figure 2-50: Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies at S1 = 1 and S2 = 2 when L = 3. Value of
the normalized wealth variance for two uncorrelated mean reversion trading strategies
as a function of time at S1 = 1 and S2 = 2 for different values of the robustness
multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is
L = 3.
[Plot: Optimal weights as a function of time for S1 = 1 and S2 = 2. Horizontal axis: Time; vertical axis: Weights. Curves: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-51: Optimal weights of two uncorrelated mean reversion trading
strategies for S1 = 1 and S2 = 2 when L = 2. Weights of the mean reversion
trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of
the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR
constraint is L = 2.
[Plot: VaR constraint as a function of time for S1 = 1 and S2 = 2. Horizontal axis: Time; vertical axis: left-hand side of the VaR constraint. Curves: ν = 1, 10, 100.]
Figure 2-52: Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies at S1 = 1 and S2 = 2 when L = 2. Value of
the normalized wealth variance for two uncorrelated mean reversion trading strategies
as a function of time at S1 = 1 and S2 = 2 for different values of the robustness
multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is
L = 2.
[Plot: Optimal weights as a function of time for S1 = 1 and S2 = 2. Horizontal axis: Time; vertical axis: Weights. Curves: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-53: Optimal weights of two uncorrelated mean reversion trading
strategies for S1 = 1 and S2 = 2 when L = 7. Weights of the mean reversion
trading strategies as a function of time at S1 = 1 and S2 = 2 for different values of
the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR
constraint is L = 7.
[Plot: VaR constraint as a function of time for S1 = 1 and S2 = 2. Horizontal axis: Time; vertical axis: left-hand side of the VaR constraint. Curves: ν = 1, 10, 100.]
Figure 2-54: Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies at S1 = 1 and S2 = 2 when L = 7. Value of
the normalized wealth variance for two uncorrelated mean reversion trading strategies
as a function of time at S1 = 1 and S2 = 2 for different values of the robustness
multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR constraint is
L = 7.
Figures 2-55 and 2-56 show the weights and the constraint for the case of positive
correlation, while Figures 2-57 and 2-58 cover the case of negative correlation. The
properties are similar to those in the uncorrelated case.
[Plot: Optimal weights as a function of time for S1 = 1 and S2 = 2. Horizontal axis: Time; vertical axis: Weights. Curves: Spread 1 and Spread 2 for ν = 1, 10, 100.]
Figure 2-55: Optimal weights of two positively correlated mean reversion
trading strategies for S1 = 1 and S2 = 2 when L = 7. Weights of the mean
reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different
values of the robustness multiplier. The correlation coefficient is ρ = 0.5 and the rhs
of the VaR constraint is L = 7.
[Plot: VaR constraint as a function of time for S1 = 1 and S2 = 2. Horizontal axis: Time; vertical axis: left-hand side of the VaR constraint. Curves: ν = 1, 10, 100.]
Figure 2-56: Value of the normalized wealth variance for two positively
correlated mean reversion trading strategies at S1 = 1 and S2 = 2 when
L = 7. Value of the normalized wealth variance for two positively correlated mean
reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different
values of the robustness multiplier. The correlation coefficient is ρ = 0.5 and the rhs
of the VaR constraint is L = 7.
[Plot for Figure 2-57: "Optimal weights as a function of time for S1 = 1 and S2 = 2". Horizontal axis: Time, 0 to 1; vertical axis: Weights; curves for Spread 1 and Spread 2 at nu = 1, 10, 100.]
Figure 2-57: Optimal weights of two negatively correlated mean reversion
trading strategies for S1 = 1 and S2 = 2 when L = 7. Weights of the mean
reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different
values of the robustness multiplier. The correlation coefficient is ρ = −0.5 and the
rhs of the VaR constraint is L = 7.
[Plot for Figure 2-58: "VaR Constraint as a function of time for S1 = 1 and S2 = 2". Horizontal axis: Time, 0 to 1; vertical axis labeled "Weights" in the source; curves for nu = 1, 10, 100.]
Figure 2-58: Value of the normalized wealth variance for two negatively
correlated mean reversion trading strategies at S1 = 1 and S2 = 2 when
L = 7. Value of the normalized wealth variance for two negatively correlated mean
reversion trading strategies as a function of time at S1 = 1 and S2 = 2 for different
values of the robustness multiplier. The correlation coefficient is ρ = −0.5 and the
rhs of the VaR constraint is L = 7.
So far we have examined the two trading strategies at values where the drifts are
the same. We now choose S1 = 1 and S2 = 1 just like in the unconstrained case, since
for these values the drifts differ for the two strategies. Figure 2-59 shows the weights
of the strategies over time for different values of the robustness multiplier when there
is no correlation between the two trading strategies and when L = 2, while Figure
2-60 shows the VaR constraint over time. We now observe the following:
• First of all, when there is no fear of model misspecification, the ratio of the
weights of the two strategies equals the ratio of their drifts normalized
by their variances, and it does not change over time regardless of whether the
VaR constraint binds.
• When the VaR constraint binds and there is fear of model misspecification, the
investor invests more in the spread with the higher drift. Since the constraint is
F1² + F2² = L, if F1 increases (decreases) over time, then F2 decreases (increases)
over time. In Figure 2-59, F1 increases over time because the asymmetry between
the two strategies grows larger, as we can see in the unconstrained case
in Figure 2-15.
• When the VaR constraint does not bind, the investor becomes more and
more conservative in both strategies over time, just like in the N = 1 case.
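The first observation above can be illustrated with a minimal sketch. It assumes that F1 and F2 are the two strategy positions entering the constraint F1² + F2² ≤ L in the uncorrelated case, and that without fear of model misspecification the constrained allocation is a radial rescaling of the unconstrained one, so that the ratio F1/F2 is preserved. The function name and the input numbers are hypothetical and are not taken from the numerical experiments of this chapter.

```python
import math


def scale_to_constraint(f1, f2, limit):
    """Shrink (f1, f2) radially so that f1**2 + f2**2 <= limit.

    Returns the (possibly rescaled) pair and a flag saying whether the
    constraint binds. Radial scaling leaves the ratio f1 / f2 unchanged.
    """
    variance = f1 ** 2 + f2 ** 2
    if variance <= limit:
        return f1, f2, False
    shrink = math.sqrt(limit / variance)
    return f1 * shrink, f2 * shrink, True


# HYPOTHETICAL unconstrained positions (both short) and limit L = 2.
f1, f2, binds = scale_to_constraint(-3.0, -4.0, limit=2.0)
print(binds, f1 / f2, f1 ** 2 + f2 ** 2)
```

With fear of model misspecification the ratio is no longer constant over time, so this rescaling would not describe the robust allocation.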
[Plot for Figure 2-59: "Optimal weights as a function of time for S1 = 1 and S2 = 1". Horizontal axis: Time, 0 to 1; vertical axis: Weights; curves for Spread 1 and Spread 2 at nu = 1, 10, 100.]
Figure 2-59: Optimal weights of two uncorrelated mean reversion trading
strategies for S1 = 1 and S2 = 1 when L = 2. Weights of the mean reversion
trading strategies as a function of time at S1 = 1 and S2 = 1 for different values of
the robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR
constraint is L = 2.
[Plot for Figure 2-60: "VaR Constraint as a function of time for S1 = 1 and S2 = 1". Horizontal axis: Time, 0 to 1; vertical axis labeled "Weights" in the source; curves for nu = 1, 10, 100.]
Figure 2-60: Value of the normalized wealth variance for two uncorrelated
mean reversion trading strategies at S1 = 1 and S2 = 1 when L = 2. Value of
the normalized wealth variance for two uncorrelated mean reversion trading
strategies as a function of time at S1 = 1 and S2 = 1 for different values of the
robustness multiplier. The correlation coefficient is ρ = 0 and the rhs of the VaR
constraint is L = 2.
It is interesting to examine the weights when the positive correlation is high
enough that the investor is long the second spread and uses it as a hedge for the first one. In
Figure 2-61 we plot the weights of the two trading strategies and observe that
they have different signs. Figure 2-62 shows the VaR value as a function of time.
When the VaR constraint does not bind, both weights decrease in magnitude, as
in the unconstrained case. When the VaR constraint binds, both weights
become larger in magnitude. This is because the asymmetry between the
two strategies is reduced towards the asymmetry in the case without fear of model
misspecification, as we see in the unconstrained case (Figure 2-19).
Finally, Figures 2-63 and 2-64 show what happens when there is negative correlation. The results are similar to those for the case of no correlation.
[Plot for Figure 2-61: "Optimal weights as a function of time for S1 = 1 and S2 = 1". Horizontal axis: Time, 0 to 1; vertical axis: Weights; curves for Spread 1 and Spread 2 at nu = 1, 10, 100.]
Figure 2-61: Optimal weights of two positively correlated mean reversion
trading strategies for S1 = 1 and S2 = 1 when L = 2. Weights of the mean
reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different
values of the robustness multiplier. The correlation coefficient is ρ = 0.9 and the rhs
of the VaR constraint is L = 2.
[Plot for Figure 2-62: "VaR Constraint as a function of time for S1 = 1 and S2 = 1". Horizontal axis: Time, 0 to 1; vertical axis labeled "Weights" in the source; curves for nu = 1, 10, 100.]
Figure 2-62: Value of the normalized wealth variance for two positively
correlated mean reversion trading strategies at S1 = 1 and S2 = 1 when
L = 2. Value of the normalized wealth variance for two positively correlated mean
reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different
values of the robustness multiplier. The correlation coefficient is ρ = 0.9 and the rhs
of the VaR constraint is L = 2.
[Plot for Figure 2-63: "Optimal weights as a function of time for S1 = 1 and S2 = 1". Horizontal axis: Time, 0 to 1; vertical axis: Weights; curves for Spread 1 and Spread 2 at nu = 1, 10, 100.]
Figure 2-63: Optimal weights of two negatively correlated mean reversion
trading strategies for S1 = 1 and S2 = 1 when L = 8. Weights of the mean
reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different
values of the robustness multiplier. The correlation coefficient is ρ = −0.5 and the
rhs of the VaR constraint is L = 8.
[Plot for Figure 2-64: "VaR Constraint as a function of time for S1 = 1 and S2 = 1". Horizontal axis: Time, 0 to 1; vertical axis labeled "Weights" in the source; curves for nu = 1, 10, 100.]
Figure 2-64: Value of the normalized wealth variance for two negatively
correlated mean reversion trading strategies at S1 = 1 and S2 = 1 when
L = 8. Value of the normalized wealth variance for two negatively correlated mean
reversion trading strategies as a function of time at S1 = 1 and S2 = 1 for different
values of the robustness multiplier. The correlation coefficient is ρ = −0.5 and the
rhs of the VaR constraint is L = 8.
2.5 Conclusions
We investigated how the optimal trading strategy of a risk averse investor, facing
arbitrage opportunities and risk constraints, changes when he is not confident of
his model dynamics. In particular, we assumed that the investor believes that the
data come from an unknown member of a set of unspecified alternative models near
his approximating model. The investor believes that his model is a pretty good
approximation in the sense that the relative entropy of the alternative models with
respect to his nominal model is small. Concern about model misspecification leads the
investor to choose a robust trading strategy that works well over that set of alternative
models. We derived the optimal trading strategy for convergence
trades and mean reversion trading strategies, with and without constraints, by solving
the corresponding Hamilton-Jacobi-Bellman equation.
In all of our cases we dealt with diffusion processes and the alternative models
distorted the conditional mean of the Brownian motion. An interesting extension of
our work would be to assume that we have jump processes, where the misspecification
is now about the dynamics of the jumps. This will be the topic of future work.
Chapter 3
Estimating the NIH Efficient Frontier
The National Institutes of Health (NIH) is among the world’s largest and most important investors in biomedical research. Its stated mission is to “seek fundamental
knowledge about the nature and behavior of living systems and the application of
that knowledge to enhance health, lengthen life, and reduce the burdens of illness
and disability” (http://www.nih.gov/about/mission.htm). Some have criticized the
NIH funding process as not being sufficiently focused on disease burden[69, 45, 43].
We consider a framework in which biomedical research allocation decisions are
more directly tied to the risk/reward trade-off of burden-of-disease outcomes. Prioritizing research efforts is analogous to managing an investment portfolio—in both
cases, there are competing opportunities to invest limited resources, and expected
returns, risk, correlations, and the cost of lost opportunities are important factors in
determining the return of those investments.
Financial decisions are commonly made according to portfolio theory[55], in which
the optimal trade-off between risk and reward among a collection of competing
investments—known as the “efficient frontier”—is constructed via quadratic optimization, and a point on this frontier is selected based on an investor’s risk/reward
preferences. Given a measure of “return on investment” (ROI), an “efficient portfolio” is defined to be the investment allocation that yields the highest expected return
for a given and fixed level of risk (as measured by return volatility), and the locus of
efficient portfolios across all levels of risk is the efficient frontier.
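The definition of an efficient portfolio can be made concrete with a small sketch. It is not the quadratic-optimization procedure used in practice: for readability it replaces the quadratic program with a brute-force grid search over a two-asset, long-only portfolio whose weights sum to one, and it treats the risk level as an upper bound on volatility. All names and numbers are hypothetical.

```python
import math


def portfolio_stats(w, means, vols, corr):
    """Expected return and volatility of a two-asset portfolio.

    w is the weight in asset 0; the remaining 1 - w goes to asset 1.
    """
    mean = w * means[0] + (1.0 - w) * means[1]
    variance = ((w * vols[0]) ** 2 + ((1.0 - w) * vols[1]) ** 2
                + 2.0 * w * (1.0 - w) * corr * vols[0] * vols[1])
    return mean, math.sqrt(variance)


def efficient_portfolio(max_vol, means, vols, corr, steps=1000):
    """Grid-search the weight with the highest expected return whose
    volatility does not exceed max_vol. Returns (w, mean, vol) or None."""
    best = None
    for k in range(steps + 1):
        w = k / steps
        mean, vol = portfolio_stats(w, means, vols, corr)
        if vol <= max_vol and (best is None or mean > best[1]):
            best = (w, mean, vol)
    return best


# HYPOTHETICAL inputs: asset 1 has the higher return and the higher risk.
means, vols, corr = (0.05, 0.10), (0.10, 0.20), 0.0

# Sweeping the risk level traces out points on the efficient frontier.
frontier = [efficient_portfolio(v, means, vols, corr) for v in (0.10, 0.15, 0.25)]
for w, mean, vol in frontier:
    print(f"w0={w:.3f}  expected return={mean:.4f}  volatility={vol:.4f}")
```

Risk levels below the minimum attainable volatility have no feasible portfolio, in which case the function returns None.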
We recast the NIH funding allocation decision as a portfolio-optimization problem
in which the objective is to allocate a fixed amount of funds across a set of disease
groups to maximize the expected “return on investment” (ROI) for a given level of
volatility. We define ROI as the subsequent improvements in years of life lost (YLL).
We use historical time series data provided by the NIH and the Centers for Disease
Control for each of 7 disease groups and we estimate the means, variances, and covariances among these time series. These serve as inputs to the portfolio-optimization
problem. Such an approach provides objective, systematic, transparent, and repeatable metrics that can incorporate “real-world” constraints, and yields well-defined
optimal risk-sensitive biomedical research funding allocations expressly designed to
reduce the burden of disease.
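As a minimal sketch of the estimation step, the following computes sample means, variances, and covariances from per-group return series. The group labels ONC and HLB follow the disease groups used in this chapter, but the return values are hypothetical and the helper names are our own.

```python
def sample_mean(xs):
    """Arithmetic mean of a sequence."""
    return sum(xs) / len(xs)


def sample_cov(xs, ys):
    """Unbiased sample covariance (divides by n - 1)."""
    mx, my = sample_mean(xs), sample_mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def estimate_inputs(returns_by_group):
    """Return (means, covariances) keyed by group and by pair of groups."""
    groups = sorted(returns_by_group)
    means = {g: sample_mean(returns_by_group[g]) for g in groups}
    cov = {(a, b): sample_cov(returns_by_group[a], returns_by_group[b])
           for a in groups for b in groups}
    return means, cov


# HYPOTHETICAL return series, one value per year.
toy_returns = {"ONC": [1.0, 2.0, 3.0], "HLB": [2.0, 4.0, 6.0]}
means, cov = estimate_inputs(toy_returns)
print(means)
print(cov[("ONC", "ONC")], cov[("HLB", "HLB")], cov[("ONC", "HLB")])
```

The diagonal entries of the covariance dictionary are the variances; together with the means they form the inputs to the portfolio-optimization problem.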
In the rest of this chapter, we first review the relevant literature. We then
describe the data and solution methods used, present our results, and
conclude with a discussion of our findings.
3.1 NIH Background and Literature Review
The National Institutes of Health (NIH) was established in 1938 and has a budget
of over $31 billion, of which 80% is awarded in competitive research grants to more
than 325,000 researchers through nearly 50,000 competitive grants at over 3,000 universities, medical schools, and other research institutions (http://www.nih.gov). The
NIH allocates funding among competing priorities by assessing such priorities with respect to five major criteria[65]: (a) public needs; (b) scientific quality of the research;
(c) potential for scientific progress (the existence of promising pathways and qualified
investigators); (d) portfolio diversification; and (e) adequate support of infrastructure (human capital, equipment, instrumentation, and facilities). This framework
was supported, with some additional recommendations, by an Institute of Medicine
(IoM) blue-ribbon panel in 1998 (see Table 3.1)[1].
Criteria

Recommendation 1. The committee generally supports the criteria that NIH uses for priority setting and recommends that NIH continue to use these criteria in a balanced way to cover the full spectrum of research related to human health.
Recommendation 2. NIH should make clear its mechanisms for implementing its criteria for setting priorities and should evaluate their use and effectiveness.
Recommendation 3. In setting priorities, NIH should strengthen its analysis and use of health data, such as burdens and costs of diseases, and of data on the impact of research on the health of the public.
Recommendation 4. NIH should improve the quality and analysis of its data on funding by disease and should include direct and related expenditures.

Processes

Recommendation 5. In exercising the overall authority to oversee and coordinate the priority-setting process, the NIH director should receive from the directors of all the institutes and centers multiyear strategic plans, including budget scenarios, in a standard format on an annual basis.
Recommendation 6. The director of NIH should increase the involvement of the Advisory Committee to the Director in the priority-setting process. The diversity of the committee's membership should be increased, particularly with respect to its public members.

Public

Recommendation 7. NIH should establish an Office of Public Liaison in the Office of the Director, and, where offices performing such a function are not already in place, in each institute. These offices should document, in a standard format, their public outreach, input, and response mechanisms. The director's Office of Public Liaison should review and evaluate these mechanisms and identify best practices.
Recommendation 8. The director of NIH should establish and appropriately staff a Director's Council of Public Representatives, chaired by the NIH director, to facilitate interactions between NIH and the general public.
Recommendation 9. The public membership of NIH policy and program advisory groups should be selected to represent a broad range of public constituencies.

Congress

Recommendation 10. The U.S. Congress should use its authority to mandate specific research programs, establish levels of funding for them, and implement new organizational entities only when other approaches have proven inadequate. NIH should provide Congress with analyses of how NIH is responding to requests for such major changes and whether these requests can be addressed within existing mechanisms.
Recommendation 11. The director of NIH should periodically review and report on the organizational structure of NIH, in light of changes in science and the health needs of the public.
Recommendation 12. Congress should adjust the levels of funding for research management and support so that the NIH can implement improvements in the priority-setting process, including stronger analytical, planning, and public interface capabilities.

Table 3.1: IoM recommendations. 12 major recommendations of the 1998 Institute of Medicine panel in four large areas for improving the process of allocating
research funds.
Despite this framework and the IoM endorsement, NIH funding has been criticized
as not being aligned to disease burden and insufficiently effective[69, 45, 43]. For
example, the impact of cancer has been estimated as only 5% of total direct cost but
23% of all deaths[20], while extramural spending by the National Cancer Institute
(NCI) is about 15% of the total (http://report.nih.gov/). Sandler et al.[70] suggested
that digestive diseases were relatively underfunded based on comparisons of disease
burden as measured by direct and indirect cost. Gross[38] noted that NIH funding
is reasonably predicted by some burden-of-disease metrics (disability-adjusted life-years or DALY, which are unavailable in time-series form)[57]. Earmarks or target
funding levels for specific diseases and programs have been suggested by a number of
policymakers[2].
Funding allocation decisions are not unique to the NIH; in a study similar to that of
Gross et al., Curry et al.[27] questioned the allocations of the Centers
for Disease Control. NIH leaders have noted that funding basic research is itself a
risky endeavor, involving trade-offs among all five of their funding criteria, and may
also include unstated secondary objectives, e.g., actively “balancing out” spending by
other agencies, charities, and the private sector[76]. Collectively, these factors impose
significant challenges to determining an ideal allocation of research funds.
Although the economic impact of biomedical research has been considered[23],
the main focus has been on measuring value-added rather than determining optimal
funding allocations. Murphy and Topel [63] estimate U.S. economic surplus from
improved health on the order of $2.6 trillion annually, with benefits distributed unequally across age and gender, and suggest that in some cases, incremental benefits
may not exceed the cost of achieving them. Johnston et al.[46] found a return to society in the form of averted treatment costs and public health benefits divided by cost
of trial expenditures of 46% for clinical trials at the National Institute of Neurological
Disorders and Stroke (NINDS), where the returns or net savings were generated by
four of the 28 trials examined, and collectively exceeded the costs of not only the
clinical, but the entire program of research at NINDS during the study period. Cutler and McClellan[28] computed returns of technological advances for five conditions
and found net benefit for four and costs equal to benefits in the fifth. Fleurence and
Torgerson[31] suggested that research should be allocated to provide the most health
benefits to the population, subject to equity considerations, and observed that subjective, burden-of-disease, and payback methods all failed this test to some degree.
Instead, they argue that a method of information valuation is superior.
Modern financial portfolio theory—in which the expected return, risk (as measured
by volatility), and correlations of a collection of investment opportunities are taken
as inputs, and the set of all portfolio weights with the highest expected return for
a given level of risk is the output—produces rational allocations of limited resources
among competing priorities. For developing this method in 1952, Markowitz shared
the Nobel Memorial Prize in Economic Sciences in 1990. The theory has had extensive
applications among mutual funds, pension funds, endowments, and sovereign wealth
funds[55, 56, 24].
More recently, portfolio theory has been proposed as a means for conducting risk-sensitive cost-benefit analysis for health-care budgeting decisions[11, 66, 19, 18, 71,
72]. The motivation for these studies is the observation that typical cost-effectiveness
studies of healthcare programs ignore the uncertainty of realized costs, which can be
addressed by applying portfolio theory to balance the risks against the rewards of specific budget allocations. These studies present simplified frameworks for incorporating
risk into the healthcare budgeting process, e.g., two-security examples (although[71]
does contain 11 hypothetical cost/effect distributions) and do not contain full-scale
empirical applications to realistic budgeting tasks. As the authors note, applying
portfolio theory to large public healthcare reimbursement problems can be challenging. Patients may have differing and non-constant utility functions, and some argue
that the manager/administrator should only consider expected returns, allowing the
patient and physician to consider risk trade-offs at individual treatment levels, in
which case the aggregate utility function is implicit.
Despite the growing interest in measuring the return on biomedical research[67,
51], and the fact that portfolio theory has already been applied to healthcare budgeting decisions, some sceptics continue to argue against the use of any quantitative
metrics in this domain. For example, Black[14] states categorically that “[t]he
biomedical ‘payback’ approach is certainly inappropriate and attempts to impose it
should be strenuously resisted. Instead, a qualitative approach should be applied
that takes into account the ‘slow-burning fuse’ and avoids simple attribution of cause
and effect”. While such a response may be acceptable for certain types of funding,
it is becoming increasingly untenable with respect to public funds and government
support, which, by law, almost always require some form of cost/benefit analysis,
performance attribution, and oversight.
3.2 Methods
3.2.1 Funding Data
The NIH has 27 Institutes and Centers, of which we identified 10 with research
missions clearly tied to specific disease states, and which account for $21 billion of
funding in 2005 or 74% of the total (see Table 3.2 for the disease classification scheme
used and Figure 3-1 for the procedure for constructing the appropriation time series).
The National Institute of Allergy and Infectious Diseases (NIAID) spending has
been split to account for HIV, which is presented separately (see HIV discussion
below).
These Institutes and the basic research they fund have inevitable overlap and
effect beyond their charter; we treat all spending for any given Institute as being
directed toward the corresponding disease states, and account for spillover effects by
considering the correlations in the lessening of the burden of disease in other groups.
For example, molecular biology funded by the NCI may be relevant to infectious
diseases but, like the entire NCI budget, would be assumed for modeling purposes to
be directed at cancer; the hypothetical infectious-disease improvement would appear
in the correlation between the decrease in years of life lost for cancer and that of
infectious diseases.
Columns: Analytic Group; ICD 9 Chapter(s) and Codes; ICD 10 Chapter(s) and Blocks; NIH Institute(s)

AID
  ICD 9: Infectious and Parasitic Diseases (001-139)
  ICD 10: Certain infectious and parasitic diseases (A00-B99)
  NIH Institute(s): NIAID

NCI
  ICD 9: Neoplasms (140-239)
  ICD 10: Neoplasms (C00-D48)
  NIH Institute(s): NCI

DDK
  ICD 9: Endocrine, nutritional and metabolic diseases, and immunity disorders (240-279); Diseases of the digestive system (520-579); Diseases of the genitourinary system (580-629)
  ICD 10: Endocrine, nutritional and metabolic diseases (E00-E88); Diseases of the digestive system (K00-K92); Diseases of the genitourinary system (N00-N98)
  NIH Institute(s): NIDDK

HLB
  ICD 9: Diseases of the blood and blood-forming organs (280-289); Diseases of the circulatory system (390-459); Diseases of the respiratory system (460-519)
  ICD 10: Diseases of the blood and blood-forming organs and certain disorders involving the immune mechanism (D50-D89); Diseases of the circulatory system (I00-I99); Diseases of the respiratory system (J00-J98)
  NIH Institute(s): NHLBI

NMH
  ICD 9: Mental disorders (290-319)
  ICD 10: Mental and behavioural disorders (F01-F99)
  NIH Institute(s): NIMH

CNS
  ICD 9: Diseases of the nervous system and sense organs (320-389)
  ICD 10: Diseases of the nervous system (G00-G98); Diseases of the eye and adnexa (H00-H57); Diseases of the ear and mastoid process (H60-H93)
  NIH Institute(s): NINDS, NEI, NIDCD

CHD
  ICD 9: Complications of pregnancy, childbirth, and the puerperium (630-676); Congenital anomalies (740-759); Certain conditions originating in the perinatal period (760-779)
  ICD 10: Pregnancy, childbirth and the puerperium (O00-O99); Certain conditions originating in the perinatal period (P00-P96); Congenital malformations, deformations and chromosomal abnormalities (Q00-Q99)
  NIH Institute(s): NICHD

AMS
  ICD 9: Diseases of the skin and subcutaneous tissue (680-709); Diseases of the musculoskeletal system and connective tissue (710-739)
  ICD 10: Diseases of the skin and subcutaneous tissue (L00-L98); Diseases of the musculoskeletal system and connective tissue (M00-M99)
  NIH Institute(s): NIAMS

LAB
  ICD 9: Symptoms, signs, and ill-defined conditions (780-799)
  ICD 10: Symptoms, signs and abnormal clinical and laboratory findings, not elsewhere classified (R00-R99)
  NIH Institute(s): none listed

EXT
  ICD 9: External causes of injury and poisoning (E800-E999)
  ICD 10: Codes for special purposes (U00-U99); External causes of morbidity and mortality (V01-Y89)
  NIH Institute(s): none listed

Table 3.2: ICD mapping. Classification of ICD-9 (1978–1998) and ICD-10 (1999–
2007) Chapters and NIH appropriations by Institute and Center to 7 disease groups:
oncology (ONC); heart lung and blood (HLB); digestive, renal and endocrine (DDK);
central nervous system and sensory (CNS) into which we placed dementia and unspecified psychoses to create comparable series as there was a clear, ongoing migration
noted from NMH to CNS after the change to ICD-10 in 1999; psychiatric and substance abuse (NMH); infectious disease, subdivided into estimated HIV (HIV) and
other (AID); maternal, fetal, congenital and pediatric (CHD). The categories LAB
and EXT are omitted from our analysis.
Figure 3-1: NIH time series flowchart. Flowchart for the construction of NIH appropriations time series. “NIH Approp.” denotes NIH appropriations; “PHS Gaps”
denotes Institute funding by the U.S. Public Health Service; “Complete Approp.”
denotes the union of these two series; “FY Change” allows for the change in government fiscal years; “4Q FY” time series refers to the resulting series in which all years
are treated as having four quarters of three months each.
[Plot for Figure 3-2: NIH funding by disease group (AID, HIV, AMS, CHD, CNS, DDK, HLB, ONC, NMH). Horizontal axis: Year, 1940 to 2000; vertical axis: Funding (in billions of $), 0 to 25.]
Figure 3-2: Appropriations data. NIH appropriations in real (2005) dollars, categorized by disease group.
3.2.2 Burden of Disease Data
Because of its simplicity, availability, breadth, and long history, years of life lost (YLL)
was chosen as the measure of burden of disease to be used in constructing the estimated return on investment from NIH-funded research. The CDC Wide-ranging Online Data for Epidemiologic Research (WONDER) database (http://wonder.cdc.gov/)
was queried for the underlying cause of death at the Chapter level (except for mental
disorders, where dementia and unspecified psychoses were all placed in CNS for consistency with CDC coding after 1998) for International Classification of Diseases (ICD)
categories ICD-9 (for 1979–1998) and ICD-10 (for 1999–2007). The two datasets for
pre- and post-1998 were joined into one continuous series, data were stratified into
groups by age at death, and YLL were computed by comparing the midpoint of the
age ranges with the World Health Organization’s (WHO) year-2000 U.S. life table
(http://www.who.int/whosis/en/). Years of life lost were then tabulated by Chapter
annually, and adjusted for population growth to remove what would otherwise be a
systematic downward bias in realized health improvements. This process yielded YLL
series for 9 distinct disease groups.
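The YLL computation described above can be sketched as follows. The sketch assumes one reading of the procedure: each death in an age range contributes the remaining life expectancy evaluated at the midpoint of that range. The life table and death counts below are hypothetical placeholders, not the WHO year-2000 U.S. life table or WONDER data, and the function names are our own.

```python
def years_of_life_lost(deaths_by_age_group, remaining_life_expectancy):
    """Sum deaths times remaining life expectancy at each age-range midpoint.

    deaths_by_age_group: dict mapping (age_low, age_high) -> number of deaths
    remaining_life_expectancy: function age -> expected remaining years
    """
    total = 0.0
    for (age_low, age_high), deaths in deaths_by_age_group.items():
        midpoint = (age_low + age_high) / 2.0
        # Ages beyond the table's horizon contribute zero, never negative, YLL.
        total += deaths * max(remaining_life_expectancy(midpoint), 0.0)
    return total


# HYPOTHETICAL life table: a straight line, NOT the WHO year-2000 table.
def toy_remaining_life_expectancy(age):
    return 80.0 - age


# HYPOTHETICAL death counts for one ICD chapter in one year.
toy_deaths = {(0, 10): 10, (40, 50): 100, (70, 80): 1000}

yll = years_of_life_lost(toy_deaths, toy_remaining_life_expectancy)
print(yll)  # 10*75 + 100*35 + 1000*5 = 9250.0
```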
Using 2005 as the base year, the raw YLL observations were adjusted in other
years to be comparable to the 2005 population:
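$$\mathrm{YLL}_t \;\equiv\; \mathrm{YLL}^{\mathrm{raw}}_t \times \frac{\mathrm{POP}_{2005}}{\mathrm{POP}_t}, \qquad \mathrm{POP}_t \;\equiv\; \text{U.S. population in year } t. \tag{3.1}$$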
The procedure for assembling the YLL time series is summarized in Figure 3-3,
and the resulting series, both raw and normalized for population growth, are shown
in Figure 3-4.
The change in burden of disease was measured by taking first differences. These
first differences were used to compute the “return on investment” on which the mean-variance optimizations were based (see Section 3.2.3 below).
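The population adjustment of Equation (3.1) and the subsequent first differencing can be sketched in a few lines. The function names and all numbers are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def normalize_yll(raw_yll, population, base_year=2005):
    """Rescale raw YLL in each year to the base-year population (Equation 3.1)."""
    base_pop = population[base_year]
    return {year: raw_yll[year] * base_pop / population[year] for year in raw_yll}


def first_differences(series):
    """Year-over-year change, assuming the keys are consecutive years."""
    years = sorted(series)
    return {year: series[year] - series[year - 1] for year in years[1:]}


# HYPOTHETICAL population (millions) and raw YLL (millions of years).
toy_population = {2003: 150.0, 2004: 200.0, 2005: 300.0}
toy_raw_yll = {2003: 10.0, 2004: 9.0, 2005: 5.0}

normalized = normalize_yll(toy_raw_yll, toy_population)
changes = first_differences(normalized)
print(normalized)
print(changes)
```

A negative first difference corresponds to a reduction in the burden of disease.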
Three disease areas required special consideration: HIV, AMS, and dementia.
AMS and HIV have shorter histories, which is problematic for estimating parameters
based on historical returns that are lagged by typical FDA approval times plus 4 years.
Figure 3-3: YLL time series flowchart. Flowchart for the construction of years
of life lost (YLL) time series. “WONDER Chapter Age Group” refers to a query to
the CDC WONDER database at the chapter level, stratified by age group at death;
“US Pop.” is the United States population from census data as expressed in the
WONDER dataset; and “US GDP” denotes U.S. gross domestic product.
[Figure 3-4 panels: (a) YLL Gross; (b) YLL Normalized.]
Figure 3-4: YLL data. Panel (a): Raw YLL categorized by disease group. Panel (b):
Population-normalized YLL (with base year of 2005), categorized by disease group.
Both panels are based on data from 1979 to 2007.
Dementia, including Alzheimer’s disease and unspecified psychoses, was reclassified
with the change from ICD-9 to ICD-10 from mental and behavioral disorders to
diseases of the nervous system; we placed all dementia YLL in the CNS group to
avoid a transition-point artifact at the juncture between ICD-9 and ICD-10, and
then performed a sensitivity analysis with and without the dementia YLL.
HIV poses a special challenge given its extreme returns after the introduction of
protease inhibitors, which are outliers that are likely to be non-stationary and would
heavily bias the parameter estimates on which the portfolio optimization is based.
To address this outlier, HIV spending and its corresponding YLL were omitted from
those of other infectious diseases—the component of NIAID spending directed at
HIV was estimated by straight-line interpolation from published figures, and this
HIV spending was treated as a separate entity and subtracted from reported NIAID
appropriations; a similar procedure was followed for the estimation of HIV-related
YLL, and WONDER was queried at the subchapter level to implement this separation. Because of their unique characteristics, these two groups are omitted from our
main empirical results.
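The separation of HIV spending from NIAID appropriations can be sketched as follows. The sketch assumes that published HIV figures are available only for some years and are joined by straight lines in between; the dollar amounts and names are hypothetical, not actual NIH data.

```python
def interpolate_linear(year, published):
    """Linearly interpolate a value for `year` from sparse {year: value} figures."""
    years = sorted(published)
    if not years[0] <= year <= years[-1]:
        raise ValueError("year lies outside the published range")
    for left, right in zip(years, years[1:]):
        if left <= year <= right:
            share = (year - left) / (right - left)
            return published[left] + share * (published[right] - published[left])
    return published[years[0]]  # reached only when a single year is published


# HYPOTHETICAL figures in billions of dollars.
published_hiv_spending = {1990: 0.5, 1994: 1.5}
niaid_appropriations = {1990: 2.0, 1991: 2.25, 1992: 2.5, 1993: 2.75, 1994: 3.0}

# Estimated HIV spending in every year, then the remainder for other infections.
hiv_spending = {year: interpolate_linear(year, published_hiv_spending)
                for year in niaid_appropriations}
other_infectious = {year: niaid_appropriations[year] - hiv_spending[year]
                    for year in niaid_appropriations}
print(hiv_spending)
print(other_infectious)
```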
3.2.3 Applying Portfolio Theory
To apply portfolio theory, the concept of a “return on investment” (ROI) must first be
defined. Although YLL has already been chosen as the metric by which the impact
of research funding is to be gauged, there are at least two issues in determining
the relation between research expenditures and YLL that must be considered. The
first is whether or not any relation exists between the two quantities. While the
objectives of pure science do not always include practical applications that impact
YLL, the fact that part of the NIH mission is to “enhance health, lengthen life, and
reduce the burdens of illness and disability” suggests the presumption—at least by
the NIH—that there is indeed a non-trivial relation between NIH-funded research and
burden of disease. For the purposes of this study, and as a first approximation, we
assume that YLL improvements are proportional to research expenditures. Of course,
factors other than NIH research expenditures also affect YLL, including research
from other domestic and international medical centers and institutes, spending in
the pharmaceutical and biotechnology industries, public health policy, behavioral
patterns, prosperity level and environmental conditions. Therefore, the YLL/NIH-funding relation is likely to be noisy, with confounding effects that may not be easily
disentangled. The Discussion section contains a more detailed discussion of this
assumption and some possible alternatives.
The second issue is the significant time lag between research expenditures and
observable impact on YLL. For example, Mosteller[62] cites a lag of 264 years, starting
in 1601, for the adoption of citrus to prevent scurvy by the British merchant marine.
More contemporary examples[26, 35, 39] cite lags of 17 to 20 years. We use shorter
lags in this study both because of data limitations (our entire dataset spans only
29 years), and also to reduce the impact of factors other than research expenditures
on our measure of burden of disease (YLL). Any attempt to optimize appropriations
to achieve YLL-related objectives must take this lag into account, otherwise the
resulting optimized appropriations may not have the intended effects on subsequent
YLL outcomes.
The impact of NIH-funded research on disease burden is likely to be spread out
over several years after this intervening lag, given the diffusion-like process in which
research results are shared in the scientific community. For simplicity, the same
duration (p = 5 years) of the diffusion-like impact for all the disease groups was
hypothesized. The lag q for each disease group was estimated by running linear
regressions of improvements in YLL over p = 5 years on NIH funding q years earlier
and on real income, and choosing, among lags between 9 and 16 years (beyond
which data limitations and other factors make it impossible to distinguish the impact
of research funding from other confounding factors affecting YLL), the lag that maximizes
the R². The corresponding lags are shown in Table 3.3.
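The lag search described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually used for Table 3.3: the function names (`solve_linear`, `r_squared`, `select_lag`), the dictionary-based data layout, the choice to date real income in the same year as the YLL improvement, and the synthetic series are all assumptions made here.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(y, regressors):
    """R^2 of an OLS fit of y on an intercept plus the given regressor columns."""
    rows = [[1.0] + [col[i] for col in regressors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    beta = solve_linear(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def select_lag(improvement, funding, income, lags=range(9, 17)):
    """Return (best_lag, best_r2). All inputs are dicts keyed by calendar year."""
    best_lag, best_r2 = None, float("-inf")
    for q in lags:
        # Keep only the years for which funding q years earlier is observed.
        years = sorted(s for s in improvement if (s - q) in funding and s in income)
        y = [improvement[s] for s in years]
        x_funding = [funding[s - q] for s in years]
        x_income = [income[s] for s in years]
        r2 = r_squared(y, [x_funding, x_income])
        if r2 > best_r2:
            best_lag, best_r2 = q, r2
    return best_lag, best_r2


# Synthetic (hypothetical) data: improvements are built from funding 12 years earlier.
funding = {y: float((4 * y) % 11) + 0.1 * (y - 1950) for y in range(1950, 2004)}
income = {y: 20.0 + 0.5 * (y - 1950) + float((6 * y) % 7) for y in range(1950, 2004)}
improvement = {s: 2.0 * funding[s - 12] + 0.3 * income[s] for s in range(1980, 2004)}
best_lag, best_r2 = select_lag(improvement, funding, income)
```

On this synthetic series the search recovers the lag that generated the data. With real data the sample shrinks as q grows, so R² values across lags are computed on different numbers of observations, which is one reason the text calls the procedure a crude heuristic.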
This procedure is, of course, a crude but systematic heuristic for relating research
funding to YLL outcomes. Alternatives include using a single fixed lag across all
groups, simply assuming particular values for group-specific lags based on NIH mandates and experience, computing a time-weighted average YLL for each group with
a weighting scheme corresponding to an assumed or estimated knowledge-diffusion
rate for that group, or constructing a more accurate YLL return series by tracking
individual NIH grants within each group to determine the specific impact on YLL
(through new drugs, protocols, and other improvements in morbidity and mortality)
from the award dates to the present. While the choice of lag is critical in determining
the characteristics of the YLL return series and deserves further research, it does
not affect the applicability of the overall analytical framework. While our procedure
is surely imperfect, it is a plausible starting point from which improvements can be
made.
Assuming constant impact of research funding on YLL over the duration of p
years, the measure of the ROI that accrues to funds allocated in year t is then given
by:
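\[
R_{t+q} \equiv -\,\frac{\dfrac{1}{p}\displaystyle\sum_{i=0}^{p-1}\left(\mathrm{YLL}_{t+q+i} - \mathrm{YLL}_{t+q+i-1}\right)\times \mathrm{GDP}_{t+q+i}}{\mathrm{Appropriation}_{t}}
\tag{3.2}
\]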
where the minus sign reflects the focus on decreases in YLL, and the multiplier
GDPt+q+i is per capita real gross domestic product (GDP) in year t + q + i, which is
included to convert the numerator to a dollar-denominated quantity to match the
denominator. This ratio’s units are then comparable to those of typical investment
returns: date-(t+q) dollars of return per date-t dollars of investment.
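As a worked instance of equation (3.2), the sketch below recomputes the HLB return for funding year 1970 from the figures reported in Table 3.4. The function name `roi` and the dictionary layout are illustrative choices made here. Because the inputs are the rounded table entries, intermediate values differ slightly from those printed in the table.

```python
def roi(yll, gdp_per_capita, appropriation, funding_year, lag, p=5):
    """Equation (3.2): minus the mean GDP-weighted YLL change, per dollar appropriated."""
    start = funding_year + lag
    total = 0.0
    for i in range(p):
        year = start + i
        # Year-over-year change in YLL, converted to dollars with that year's GDP per capita.
        total += (yll[year] - yll[year - 1]) * gdp_per_capita[year]
    return -(total / p) / appropriation


# HLB figures from Table 3.4.
yll = {
    1985: 19_741_993, 1986: 19_380_387, 1987: 19_015_838,
    1988: 18_951_220, 1989: 18_086_872, 1990: 17_670_665,
}
gdp = {1986: 29_443, 1987: 30_115, 1988: 31_069, 1989: 31_877, 1990: 32_112}

hlb_roi = roi(yll, gdp, appropriation=695_809_705, funding_year=1970, lag=16)
print(round(hlb_roi, 1))
```

The result rounds to the 18.6 reported for HLB in Table 3.4.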
Given the definition in equation (3.2) for the ROI of each of the disease groups,
the “optimal” appropriation of funds among those groups must be determined, i.e.,
the appropriation that produces the best possible aggregate expected return on total
research funding per unit risk. Denote by R ≡ [ R1 R2 · · · Rn ]′ the vector
of returns of all n groups for a given appropriation date t (where time subscripts
have been suppressed for notational simplicity), and denote by µ and Σ the vector
of expected returns and the covariance matrix, respectively. If the weights of the
budget allocation among the groups are ω, the ROI for the entire portfolio of grants,
denoted by Rp , is given by Rp = ω ′ R, and its expected value and variance are ω ′ µ
and ω ′ Σω, respectively. The objective function to be optimized is then given by
the expected value minus some multiple of the variance which reflects risk tolerance,
and this quadratic function of ω is maximized using standard quadratic optimization
techniques, subject to the constraint that the weights sum to 1.
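To make this objective concrete, one possible way to write it is shown below. Writing the "multiple of the variance" as λ/2 with λ > 0 is a notational assumption made here, ι denotes a vector of ones, and only the budget constraint is imposed (no sign restriction on the weights).

```latex
\[
\max_{\boldsymbol{\omega}} \;\; \boldsymbol{\omega}'\boldsymbol{\mu} - \frac{\lambda}{2}\,\boldsymbol{\omega}'\boldsymbol{\Sigma}\boldsymbol{\omega}
\quad \text{subject to} \quad \boldsymbol{\omega}'\boldsymbol{\iota} = 1 .
\]
% First-order condition with multiplier \eta on the budget constraint:
\[
\boldsymbol{\mu} - \lambda\,\boldsymbol{\Sigma}\boldsymbol{\omega} - \eta\,\boldsymbol{\iota} = \mathbf{0}
\;\;\Longrightarrow\;\;
\boldsymbol{\omega}^{*} = \frac{1}{\lambda}\,\boldsymbol{\Sigma}^{-1}\left(\boldsymbol{\mu} - \eta\,\boldsymbol{\iota}\right),
\qquad
\eta = \frac{\boldsymbol{\iota}'\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu} - \lambda}{\boldsymbol{\iota}'\boldsymbol{\Sigma}^{-1}\boldsymbol{\iota}} .
\]
```

Under these assumptions a larger λ (lower risk tolerance) moves the solution toward the minimum-variance weights, which are proportional to Σ⁻¹ι.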
In the mean-variance framework, we seek to find the best trade-offs between risk
and expected return by varying the portfolio weights ω to trace out the locus of
mean-variance combinations that cannot be improved upon, i.e., that are “efficient”.
This set of efficient portfolios, also known as the “efficient frontier” is formally defined
as the curve in mean-variance (or mean-standard deviation) space corresponding to
all portfolios with the highest level of expected return for a given level of variance.
This efficient frontier defines the set of allocations that cannot be improved upon
from a mean-variance perspective, and the optimal allocation is a single point on this
frontier that is determined by the investor’s desired volatility level or risk tolerance.
More formally, an investor with standard mean-variance preferences is assumed to
prefer portfolios with greater expected return and lower variance, with diminishing
returns in each (so that progressively greater increments of expected return must be
offered to the investor to induce him to accept increases in the same increment of risk
as the level of risk rises). This type of preferences generates so-called “indifference
curves” (non-intersecting curves in mean-variance space that trace out combinations
of mean and variance for which an individual is indifferent) that are upward sloping
and convex. The optimal portfolio for a given set of indifference curves is the tangency
point T of the efficient frontier with the most upper-left indifference curve.
To compute the efficient frontier, the following optimization problem must be
solved (we maintain the following notational conventions: (1) all vectors are column
vectors unless otherwise indicated; (2) matrix transposes are indicated by a prime
superscript, hence ω ′ is the transpose of ω; and (3) vectors and matrices are always
typeset in boldface, i.e., $X$ and $\mu$ are scalars and $\mathbf{X}$ and $\boldsymbol{\mu}$ are vectors or matrices):
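\[
\begin{aligned}
\min_{\boldsymbol{\omega}} \quad & \boldsymbol{\omega}'\boldsymbol{\Sigma}\boldsymbol{\omega} \\
\text{subject to} \quad & \boldsymbol{\omega}'\boldsymbol{\mu} \ge \mu_o , \quad \boldsymbol{\omega}'\boldsymbol{\iota} = 1 , \quad \boldsymbol{\omega} \ge \mathbf{0}
\end{aligned}
\tag{3.3}
\]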
where ι is an (n × 1)-vector of 1’s and µo is an arbitrary fixed level of expected
return. By varying µo over a range of values and solving the optimization problem
for each value, all the efficient allocations ω ∗ may be tabulated, and the locus of
points in mean-standard-deviation space corresponding to these efficient allocations
is the efficient frontier. This so-called Markowitz portfolio optimization problem
involves minimizing a quadratic objective function with linear constraints, which is a
standard quadratic programming (QP) problem that can easily be solved analytically
in some cases[58], and numerically in all other cases by a variety of efficient and stable
solvers[37, 36].
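For reference, the textbook analytical case can be written out as follows. It assumes that the non-negativity constraint is dropped (or is not binding), that the expected-return constraint holds with equality, and that Σ is invertible. The scalars A, B, C and D are shorthand introduced here, not notation from the text.

```latex
\[
A = \boldsymbol{\iota}'\boldsymbol{\Sigma}^{-1}\boldsymbol{\iota}, \qquad
B = \boldsymbol{\iota}'\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}, \qquad
C = \boldsymbol{\mu}'\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}, \qquad
D = AC - B^{2} .
\]
\[
\boldsymbol{\omega}^{*}(\mu_o) =
\frac{(C - \mu_o B)\,\boldsymbol{\Sigma}^{-1}\boldsymbol{\iota} + (\mu_o A - B)\,\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}}{D},
\qquad
\boldsymbol{\omega}^{*\prime}\boldsymbol{\Sigma}\,\boldsymbol{\omega}^{*} = \frac{A\mu_o^{2} - 2B\mu_o + C}{D} .
\]
```

The variance is quadratic in µo, so the frontier is a parabola in mean-variance space and a hyperbola in mean-standard-deviation space. Once the constraint ω ≥ 0 binds, this closed form no longer applies and a numerical QP solver is needed.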
We propose one additional refinement to address the well-known issue of “corner solutions”
(in which several components of ω∗ are 0) that often arise in the standard portfolio-optimization framework. While such extreme allocations may, indeed, be
optimal with respect to the mean-variance criterion, they are more often the result
of estimation error and outliers in the data[8]. Moreover, even in the absence of estimation error, mean-variance optimality may not adequately reflect other objectives
such as social equity across disease groups or distance from current status quo in allocation. To incorporate such considerations, a “regularization” technique is applied
in which the objective function is penalized for allocations that are far away from the
average allocation policy. Specifically, we consider the following regularized version
of the standard portfolio-optimization problem:
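\[
\begin{aligned}
\min_{\boldsymbol{\omega}} \quad & \boldsymbol{\omega}'\boldsymbol{\Sigma}\boldsymbol{\omega} + \gamma\,\lVert \boldsymbol{\omega} - \boldsymbol{\omega}_{NIH} \rVert^{2} \\
\text{subject to} \quad & \boldsymbol{\omega}'\boldsymbol{\mu} \ge \mu_o , \quad \boldsymbol{\omega}'\boldsymbol{\iota} = 1 , \quad \boldsymbol{\omega} \ge \mathbf{0}
\end{aligned}
\tag{3.4}
\]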
This formulation is essentially a dual-objective optimization problem in which the
first objective is to minimize the portfolio’s variance (ω′Σω), and the second objective
is to minimize the difference from the average NIH allocation policy (‖ω − ω_NIH‖²).
The non-negative parameter γ determines the relative importance of these two
objectives. Larger values of γ yield optimal weights that are closer to the average NIH
allocation but which correspond to portfolios with greater volatility, and smaller values
of γ yield optimal weights that may be more concentrated among a smaller subset of
groups, but which imply lower portfolio volatility.

Group   Lag   Mean    SD     Min    Med    Max   Skewness   Kurtosis
AID      10   -0.9   1.3    -4.3   -0.8    2.4     -0.3        5.2
CHD      12    5.1   3.8     0.2    4.4   11.1      0.1        1.4
CNS      10   -1.7   0.8    -3.1   -1.7   -0.4      0.1        1.9
DDK      11   -0.8   1.5    -3.8   -0.8    2.9      0.2        3.6
HLB      16    9.8   3.4     4.3    9.6   18.6      0.7        3.6
ONC      16    0.5   1.3    -2.3    1.2    2.1     -0.7        2.2
NMH       9    0.0   0.3    -0.8    0.0    0.6     -0.1        3.2

Table 3.3: Return summary statistics. Summary statistics for the ROI of disease
groups, in units of years (for the lag length) and per-capita-GDP-denominated reductions in YLL between years t and t+4 per dollar of research funding in year t−q,
based on historical ROI from 1980 to 2003.

Year   YLL          ∆         GDP/Capita ($)   GDP-Weighted ∆ ($)
1985   19,741,993
1986   19,380,387   361,605   29,443           10,646,748,753
1987   19,015,838   364,549   30,115           10,978,408,090
1988   18,951,220    64,617   31,069            2,007,589,676
1989   18,086,872   864,348   31,877           27,552,827,900
1990   17,670,665   416,207   32,112           13,365,233,032

Mean GDP-Weighted YLL ∆ ($)   12,910,161,490
Lag (years)                   16
Funding year                  1970
Appropriation $               695,809,705
ROI                           18.6

Table 3.4: ROI example. An example of the ROI calculation for HLB from 1986.
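The role of γ in the regularized problem (3.4) can be illustrated with a small numerical sketch. It makes several simplifying assumptions that are not in the text: the expected-return constraint is dropped, so only the minimum-variance end of the regularized frontier is computed; the solver is a plain projected-gradient loop rather than a production QP solver; and the covariance matrix and reference weights are made-up toy values. The names `project_to_simplex` and `regularized_min_variance` are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        cumulative += uj
        candidate = (cumulative - 1.0) / j
        if uj - candidate > 0:
            theta = candidate  # threshold from the largest index that passes the test
    return [max(x - theta, 0.0) for x in v]


def regularized_min_variance(sigma, w_ref, gamma, iterations=5000):
    """Minimize w' sigma w + gamma * ||w - w_ref||^2 over the simplex."""
    n = len(w_ref)
    # Step size from a Lipschitz bound on the gradient 2*sigma*w + 2*gamma*(w - w_ref).
    lipschitz = 2.0 * (max(sum(abs(x) for x in row) for row in sigma) + gamma)
    step = 1.0 / lipschitz
    w = list(w_ref)
    for _ in range(iterations):
        grad = [
            2.0 * sum(sigma[i][j] * w[j] for j in range(n)) + 2.0 * gamma * (w[i] - w_ref[i])
            for i in range(n)
        ]
        w = project_to_simplex([w[i] - step * grad[i] for i in range(n)])
    return w


# Toy inputs (hypothetical): three uncorrelated groups and an equal-weight reference.
sigma = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 4.0]]
w_ref = [1.0 / 3.0] * 3

w_unregularized = regularized_min_variance(sigma, w_ref, gamma=0.0)
w_heavily_regularized = regularized_min_variance(sigma, w_ref, gamma=1e6)
```

With γ = 0 the weights are inversely proportional to the variances (4/7, 2/7, 1/7), and with a very large γ they stay essentially at the reference allocation, which mirrors the trade-off described above.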
3.3 Results
3.3.1 Summary Statistics
Summary statistics of the ROI for the period 1980–2003 are presented in Table 3.3.
In Table 3.4 we provide an example of the ROI calculation for HLB for 1986, when
the return was 18.6.
Large differences in mean ROI for different Institutes are evident in Table 3.3,
ranging from small negative values (e.g., −1.7 for CNS) to large positive values (e.g.,
9.8 for HLB). Large differences in standard deviation also exist, ranging from 0.3 for
NMH to 3.8 for CHD.

[Figure 3-5 panels: (a) With Alzheimer effect, gamma = 0; (b) With Alzheimer effect, gamma = 5; (c) Without Alzheimer effect, gamma = 0; (d) Without Alzheimer effect, gamma = 5]
Figure 3-5: Efficient frontiers. Efficient frontiers for (a) all groups except HIV and
AMS, γ = 0; (b) all groups except HIV and AMS, γ = 5; (c) all groups except HIV
and AMS without the dementia effect, γ = 0; and (d) all groups except HIV and AMS
without the dementia effect, γ = 5; based on historical ROI from 1980 to 2003.
3.3.2 Efficient Frontiers
In Figure 3-5, efficient frontiers for the single- and dual-objective optimization problems are plotted in mean-standard deviation space for the 7-group cases with and
without taking into account the dementia effect.
[Figure 3-5 legend: efficient frontier; disease groups AID, CHD, CNS, DDK, HLB, ONC, NMH; allocations NIH Avg, 1/n, NIH-Var, NIH-Mean, Min-Var, Eff-25%, Eff-50%, Eff-75%]
For each of these frontiers, the mean-standard deviation points for the following
funding allocations are also plotted:
(i) historical average NIH allocation for years 1996–2005;
(ii) equal-weighted (1/n) allocation;
(iii) minimum-variance allocation;
(iv) the allocation on the efficient frontier that has the same mean as the average
NIH allocation (the “NIH-mean” allocation);
(v) the allocation on the efficient frontier which has the same variance as the average
NIH allocation (the “NIH-var” allocation);
(vi) the allocation on the efficient frontier that is 25% of the distance from the
minimum variance allocation to the maximum expected-return allocation;
(vii) the allocation on the efficient frontier that is 50% of the distance from the
minimum variance allocation to the maximum expected-return allocation;
(viii) the allocation on the efficient frontier that is 75% of the distance from the
minimum variance allocation to the maximum expected-return allocation.
The region bounded by (i), (iv), (v) and the efficient frontier is of special interest
because all portfolios in this region offer lower variance, higher expected return, or
both when compared to the average NIH allocation, hence from a mean-variance
perspective such allocations are unambiguously preferable. These allocations are
called “dominating” portfolios relative to the average NIH allocation (i).
Figure 3-5a shows that a number of the disease groups appear to be concentrated
in a relatively low-risk sector of the risk/reward universe, which may be evidence of
active variance-minimization strategies by various stakeholders.
A sensitivity analysis is conducted by estimating the efficient frontier with (Figure 3-5a) and without the dementia effect (Figure 3-5c). Table 3.5 contains the
portfolio weights corresponding to Figures 3-5a and 3-5c respectively.
                   Benchmarks    Single-Objective Portfolios (in %)                   Dual-Objective Portfolios (γ = 5) (in %)
Group              NIH Avg  1/n  NIH-Var  NIH-Mean  Min-Var  Eff-25%  Eff-50%  Eff-75%  NIH-Var  NIH-Mean  Min-Var  Eff-25%  Eff-50%  Eff-75%
All Groups:
AID                    8    14       0        0        0        0        0        0        0        3        5        0        0        0
CHD                    7    14      24       11        0       13       27       36       18       11        7       18       28       34
CNS                   14    14       0        0       25        0        0        0        7       15       19        8        0        0
DDK                   10    14       0        0        0        0        0        0        0        5        8        0        0        0
HLB                   17    14      23       11        0       13       32       55       24       14        9       23       39       59
ONC                   27    14      53       28       16       33       42        9       34       33       32       34       31        7
NMH                   16    14       0       50       58       41        0        0       17       20       21       17        2        0
Without Dementia:
AID                    8    14       0        0        0        0        0        0        0        3        5        0        0        0
CHD                    7    14      23       11        0       14       28       36       17       11        8       18       28       34
CNS                   14    14       2       32       41       27        0        0       12       18       19       11        0        0
DDK                   10    14       0        0        0        0        0        0        0        5        8        0        0        0
HLB                   17    14      23       12        0       16       34       55       23       13        9       24       40       60
ONC                   27    14      52       37       17       43       39        9       33       32       31       33       31        6
NMH                   16    14       0        8       41        0        0        0       15       19       20       14        1        0

Table 3.5: Portfolio weights. Benchmark, single- and dual-objective optimal portfolio weights (in percent), based on historical ROI from 1980 to 2003.
The top left sub-panel of Table 3.5 shows that the single-objective optimization
does yield sparse weights, as expected. For example, the minimum-variance portfolio
allocates to only three groups: 58% to NMH, 25% to CNS, and 16% to ONC. By minimizing variance, irrespective of the mean, this portfolio allocates funding to the groups
with the least variability in YLL improvements. The efficient-25% portfolio allocates
non-zero weights to four groups (41% to NMH, 33% to ONC, 13% to HLB, and 13%
to CHD), and yields 26% better expected return with 28% less risk. With still more
emphasis on expected return, the efficient-50% portfolio gives non-zero weights only
to three successful groups: 42% to ONC, 32% to the higher-risk, higher-expected-return HLB, and 27% to CHD. This portfolio has 172% higher expected return but
only 27% more risk than the NIH portfolio. The efficient-75% portfolio gives an even
higher weight of 55% to HLB, 36% to CHD, and 9% to ONC, yielding 318% higher
expected return and 148% more risk, a diminishing risk-adjusted expected return as
compared to portfolios with lower volatility. Given the greater emphasis on expected
return for this portfolio, it is not surprising to see HLB getting a bigger role due to
its apparent historical success in reducing YLL. Of course, whether or not past success is indicative of comparable future success hinges on the science and associated
translational efforts underlying the diseases covered by HLB. This underscores the
importance of incorporating research and clinical insights into the funding allocation
process, especially within a systematic framework such as portfolio theory.
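The expected-return comparisons quoted above can be roughly reproduced from the published tables. The sketch below computes ω′µ for the average NIH allocation and for the efficient-50% portfolio, assuming that the group means in Table 3.3 are the expected returns used in the optimization and taking the weights from the upper panel of Table 3.5. Both inputs are rounded (the weights need not sum to exactly 100), so the result is only approximate. The function name `expected_roi` is illustrative.

```python
# Mean ROI per group, from Table 3.3.
mean_roi = {"AID": -0.9, "CHD": 5.1, "CNS": -1.7, "DDK": -0.8,
            "HLB": 9.8, "ONC": 0.5, "NMH": 0.0}

# Weights in percent, from the "All Groups" panel of Table 3.5.
nih_avg = {"AID": 8, "CHD": 7, "CNS": 14, "DDK": 10, "HLB": 17, "ONC": 27, "NMH": 16}
eff_50 = {"AID": 0, "CHD": 27, "CNS": 0, "DDK": 0, "HLB": 32, "ONC": 42, "NMH": 0}


def expected_roi(weights_pct, means):
    """Portfolio expected return: weights (in percent) times group means."""
    return sum(weights_pct[g] * means[g] for g in means) / 100.0


nih_mean = expected_roi(nih_avg, mean_roi)
eff_mean = expected_roi(eff_50, mean_roi)
relative_gain = eff_mean / nih_mean - 1.0
```

With these rounded inputs the efficient-50% portfolio comes out roughly 167% above the average NIH allocation in expected return, in the neighborhood of the 172% quoted in the text. The risk comparison cannot be checked the same way because the covariance matrix is not reported.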
However, the dementia effect may cause the performance of the CNS
disease group to be underestimated, hence the lower panel of Table 3.5 reports corresponding optimal-portfolio results without the dementia effect. In the single-objective case, the efficient-50% and efficient-75% portfolios are still sparse, with non-zero weights in 3 groups, while the
lower-risk efficient-25% portfolio is less concentrated, with non-zero weights in 4 groups
and significant weight (27%) on the CNS group.
Table 3.5 also contains the optimal portfolios for the dual-objective case (with
γ = 5) in the right sub-panels (see Figures 3-5b and 3-5d). These cases correspond
to portfolios that trade off closeness to the average NIH allocation policy with better
risk-adjusted expected returns. In both the upper and lower sub-panels, corresponding to the 7-group optimization with and without the dementia effect,
respectively, the weights are less concentrated than in the single-objective case. For
example, the minimum-variance portfolio without the dementia effect now allocates
funding to all the groups, with weights ranging from 5% to 31%. However, even in this
case, the efficient-75% portfolio is still extreme, allocating weights only to HLB, CHD
and ONC. Therefore, special care must be exercised in selecting the appropriate point
on the efficient frontier. We also observe from the NIH-var or NIH-mean portfolios
that slight changes to the average NIH policy apparently yield superior performance
in mean-standard deviation space (28% to 89% relative improvement, depending on
the assumptions).
3.4 Discussion
Portfolio theory provides a systematic framework for determining optimal research
funding allocations based on historical return on investment, variance, and correlation
between appropriations and reductions in disease burden. The optimization results
suggest that significant YLL improvements with respect to a mean-variance criterion
may be possible through funding re-allocation. To our knowledge, this is the first
time such an approach has been empirically implemented in this domain.
However, our findings must be qualified in at least three respects: (1) YLL as a
measure of burden of disease, which is clearly incomplete and less than ideal; (2) the
definition of ROI and the challenges of relating research expenditures to subsequent
outcomes such as burden of disease; and (3) the known limitations of portfolio theory. While each of these qualifications can be addressed to varying degrees through
additional data and analysis, the empirical conclusions are likely to depend critically
on the nature of their resolution. In this section, we provide a short synopsis of these
qualifications, and also consider other objections to this framework and directions for
future research.
YLL captures only the most extreme form of disease burden, and other measures
such as disability-adjusted or quality-adjusted life years are clearly preferable. However, time series histories for such measures are currently unavailable; hence YLL is
the most natural starting point for gauging the impact of biomedical research funding,
and is directly aligned with the NIH mission to “lengthen life”.
As a measure of disease burden, YLL captures only lethal illness by definition;
chronic illness enters the optimization process only indirectly, mortality in the young
is more heavily weighted than that of the elderly, and quality-of-life is not captured at
all. The choice of YLL is motivated by several factors: long time-series observations
of YLL are readily available, they cover a large population, and they address the
entire spectrum of diagnoses categorized under the ICD. Broader measures of burden
of disease such as disability adjusted life years (DALY)[38] and quality-adjusted life
years (QALY)[68, 22] have been proposed, but historical time series for such measures
are not yet available. As better measures are developed (e.g., incidence, prevalence,
physician visits, hospitalization, DALY, QALY), portfolio-optimization methods may
be applied to them as well through appropriately defined “returns”. Should datasets
covering not only age and cause of death but also ante-mortem symptoms become
available, mean-variance-efficient allocations would likely place significant weight on
improvements in the care of less-lethal chronic diseases.
Even if YLL is an appropriate measure of disease burden, our definition of ROI
can also be challenged as being imprecise and ad hoc in several respects. NIH funding
is typically focused on basic research rather than translational efforts; therefore, NIH
spending may not be as directly related to subsequent YLL improvements. We have
not accounted for other expenditures that may also affect YLL, and to the extent
that NIH appropriations are systematically used to complement private spending
to allocate total funding across diseases more fairly[76], the relation between NIH
funding and subsequent YLL improvements may be even noisier, and may require
modelling private-sector expenditures as a separate but complementary portfolio-optimization problem with an objective function and constraints that are linked to
those of the NIH. Also, the standard portfolio-optimization framework implicitly assumes a constant multiplicative relation between dollars invested today and dollars
returned tomorrow (so that doubling the investment will typically double the ROI of
that investment), whereas the return to biomedical investments may be non-linear.
In addition, translational research takes time and significant non-NIH resources, further blurring the relation between NIH allocations and subsequent changes in YLL.
Finally, other factors may contribute to YLL improvements, including changes in cultural norms (including consumption of alcohol and cigarettes), economic conditions
(such as recessions vs. expansions), and public policy (such as vaccine programs and
mandates for automobile, home, and workplace safety). While all of these qualifications have merit, they are not insurmountable obstacles and can likely be addressed
through additional data collection and more sophisticated metrics, perhaps along the
lines of Porter[67] or Lane and Bertuzzi[51]. Moreover, the portfolio-optimization
approach provides a useful conceptual framework for formulating funding allocation
decisions systematically, even if its empirical implications are imprecise.
The estimates of q were an initial attempt to link appropriation with outcome in
a systematic and non-discretionary manner, but they were derived heuristically from
regulatory, appropriation, and epidemiological data which may not be stationary or
predictive. For example, if the Food and Drug Administration’s capacity for reviewing
new-drug applications is held constant and applications double, substantial increases
in regulatory queuing would be expected, even with the added resources generated by
the Prescription Drug User Fee Act. Finally, in converting changes in YLL to dollar
amounts, per-capita real GDP was used as the “conversion factor” irrespective of age,
despite the fact that children and retired individuals are economically less active.
While these caveats highlight the imprecision with which the impact of research
spending is measured, they also provide direction for developing better metrics. In
particular, the underlying science of each grant implies a particular set of dynamics for
translation and YLL impact, and with more sophisticated models of such dynamics,
the returns to fundamental research should be measurable with greater accuracy.
Even within the exact domain for which it was developed, portfolio theory has
several well-known limitations, of which the most obvious is the possibility that the
mean-variance criterion may not, in fact, be the appropriate objective function to
be optimized. While there is little disagreement that higher expected ROI is preferable to its alternative, the trade-off between expected ROI and risk is fraught with
subtleties involving specific psychological, perceptual, and behavioural mechanisms
of individuals and groups. Because of these considerations, mean-variance analysis is
often considered an approximation to a much more complicated reality—a starting
point for investment allocation decisions, not the final answer.
Another known limitation of portfolio theory is the fact that the input parameters
(µ, Σ) must be estimated from historical data, and estimation error in these parameter estimates can lead to portfolios that are unstable and sub-optimal[55]. One
common approach to addressing this problem in the financial context is to employ
prior information regarding the input parameters, thereby reducing the dependence
on historical data. Using Bayesian methods, expert opinions regarding the statistical properties of the individual asset returns can be incorporated into the portfolio
optimization process[13, 5, 12].
One limitation that is unique to the current application is the fact that portfolio
theory is silent on which mean-variance-optimal portfolio to select. In the financial
context, the existence of a riskless investment (e.g., U.S. Treasury bills) implies that
one unique portfolio on the efficient frontier will be desired by all investors: the so-called “tangency” portfolio[73]. Because there is no analog to a riskless investment in
biomedical research, the notion of a tangency portfolio does not exist in this context.
Therefore, decision makers must first determine society’s collective preferences for risk
and return with respect to changes in YLL before a unique solution to the portfolio-optimization problem can be obtained, i.e., they must agree on a societal “utility
function” for trading off the risks and rewards of biomedical research.
This critical step is a pre-requisite to any formal analysis of funding allocation
decisions, and underscores the need for integration of basic science with biomedical
investment performance analysis and science policy. Such integration will require
close and ongoing collaboration between scientists and policymakers to determine the
appropriate parameters for the funding allocation process, and to incorporate prior
information and qualitative judgments[14] regarding likely research successes, social
priorities, policy objectives and constraints, and hidden correlations due to non-linear
dependencies not captured by the data. In particular, it is easy to imagine contexts
in which funding objectives can and should change quickly in response to new environmental threats or public-policy concerns. However, such pressing needs must be
balanced against the disruptions—which can be severe due to the significant adjustment costs implicit in biomedical research[32]—caused by large unanticipated positive
or negative shifts in research funding. Although the end result of collaborative discussion may fall short of a well-defined objective function that yields a clear-cut optimal
portfolio allocation, the portfolio-optimization process provides a transparent and rational starting point for such discussions, from which several insights regarding the
complex relation between research funding and social outcomes are likely to emerge.
Any repeatable and transparent process for making funding allocation decisions—
especially one that involves criteria other than peer-review-based academic excellence—
will, understandably, be viewed with some degree of suspicion and contempt by the
scientific community. However, if one of the goals of biomedical research is to reduce
the burden of disease, some tension between academics and public policy may be
unavoidable. Moreover, in the absence of a common framework for evaluating the
trade-offs between academic excellence and therapeutic potential, other approaches
such as political earmarking[2] are being proposed, which may be even less palatable
from the scientific perspective.
In an environment of tightening budgets and increasing oversight of appropriations, portfolio theory offers scientists, policymakers, and regulators—all of whom
are, in effect, research portfolio managers—a rational, systematic, transparent, and
reproducible framework in which to explicitly balance and trade off expected benefits
with potential risks while accounting for correlation among multiple research agendas and real-world constraints in allocating scarce resources. Most funding agencies
and scientists have already been making such trade-offs informally and heuristically;
there may be additional benefits to making such decisions within an explicit framework based on standardized and objective metrics.
One of the most significant benefits from adopting such a framework may be the
reduction of uncertainty surrounding future funding-allocation decisions, which would
greatly enhance the ability of funding agencies and scientists to plan for the future and
better manage their respective budgets, research agendas, and careers. By approaching funding decisions in a more analytical fashion, it may be possible to improve their
ultimate outcomes while reducing the chances of unintended consequences.
Chapter 4
Impact of model misspecification and risk constraints on market
In Chapters 1 and 2 we studied the optimal trading strategy of a risk-averse investor who
faces risk constraints and model misspecification. In this chapter we study how
risk constraints and fear of model misspecification affect the statistical properties of
the market returns. In particular, we will study their effect on the risk premium, the
volatility and liquidity of the market.
We find that the statistical properties of the market change. In particular, variability of the risk constraints leads to increasing risk premium, increasing volatility
and increasing illiquidity of the market. In addition, tightening of these constraints
leads also to increasing risk premium, increasing volatility and increasing illiquidity.
Moreover, we find that variability in risk aversions along with risk constraints also
leads to a more concave pricing function of the aggregate supply for the market, implying increasing risk premium, increasing volatility and increasing illiquidity. Finally,
we explore how the properties of the asset returns change when the investors do not
completely trust their models. We find that model misspecification is another source
of increasing risk premium, endogenous volatility and increasing illiquidity.
In the rest of this chapter, we first review the relevant literature. We then
describe the setup of the model and analyze the impact on market returns of
varying risk constraints, varying risk aversions, and varying degrees of fear of
model misspecification across agents. Finally, we conclude by summarizing our results.
4.1 Literature review
In the literature there are papers that assume heterogeneity along three dimensions:
risk aversion coefficients of the agents, constraints the agents face and beliefs of the
agents.
Danielsson and Zigrand [81] assume that the agents have the same beliefs and face
the same constraints but they differ in their risk aversion coefficients. They study
the economic implications of a Value-at-risk based regulatory system by analyzing a
two period multi-asset general equilibrium model with agents heterogeneous in risk
preferences and wealth. They assume that the agents have CARA utilities and they
argue that there will be endogenous volatility and increasing risk premium due to the
fact that “... risk will have to be transferred from the more risk-tolerant to the more
risk-averse”. As we will prove, this is not true and not necessary for having increasing
risk premium, endogenous volatility and increasing illiquidity.
Kogan and Uppal [50] show how to analyze the equilibrium prices and policies in
an economy with incomplete financial markets and stochastic investment opportunity
set, where the agents face portfolio constraints. They study a general equilibrium
exchange economy with multiple agents, who differ in the risk aversion coefficients
and face borrowing constraints, while having the same beliefs.
Brumm et al. [21] consider a general equilibrium infinite-horizon economy with
the agents having heterogeneous risk preferences and facing the same constraints,
while having the same beliefs. They find that the presence of collateral constraints
leads to strong excess volatility and a regulation of margin requirements potentially
has stabilizing effects.
Then, there are papers with different prior beliefs not due to asymmetric information among the agents. Geanakoplos [33] assumes that the agents have different
priors (optimists, pessimists) but same risk aversion coefficients and wealth and they
face identical collateral constraints. He studies how these constraints determine an
equilibrium leverage and how this leverage changes over time leading to crashes and
boom periods, the so-called leverage cycles.
Chen, Hong and Stein [25] study what happens to the price of a risky asset, when
there are investors with heterogeneous priors who face short sales constraints. The
idea that short sales constraints increase the prices of risky assets when the investors
have heterogeneous beliefs is due to Lintner [52] and Miller [61]. Chen, Hong and
Stein show that greater dispersion of beliefs leads to even higher prices.
Finally, Hansen and Sargent [41] study a framework where the agents have a
common approximating model, but they differ in the degree of mistrust of the model.
They find that the agents’ caution in responding to concerns about model misspecification
can raise prices assigned to macroeconomic risks.
We will see how risk constraints affect the statistical properties of the market,
in particular the risk premium, volatility and liquidity of the market. We will first
study the case where the investors differ in the constraints they face and/or their risk
aversion coefficients and then the case where they mistrust the model of asset payoffs
and the mistrust varies among the investors.
4.2 Analysis
4.2.1 Model setup
We assume we have H mean-variance single-period optimizers with heterogeneous
risk aversions, risk constraints and wealth. Each agent can invest in the market and
the risk free rate at t = 0. We assume that the risk-free rate is exogenously given
and the market is modeled as a risky asset with stochastic payoff at t = 1. There are
also noise traders. We do not model their utility explicitly; we only assume that they
are hit by random liquidity shocks and that they submit random market orders at time
t = 0. Equivalently, the supply of the risky asset is stochastic. Each agent faces a
risk constraint, that is, a constraint on his wealth volatility, of the form |θh | ≤ Lh Wh0 , where
θh is the position of agent h in the market, Wh0 is the initial wealth of agent h
and Lh determines the tightness of the risk constraint that agent h faces.
Each agent’s wealth at time t = 1 is given by:
$$W_{h1} = d\,\theta_h + (W_{h0} - q\,\theta_h) R_f$$
where d is the payoff of the risky asset (market), θh is the number of shares, q is the
price of the risky asset and Rf is the risk-free rate. The expected return on wealth, R = Wh1 / Wh0 , is:
$$E(R) = \frac{\hat\mu\,\theta_h + (W_{h0} - q\,\theta_h) R_f}{W_{h0}}$$
where µ̂ is the expected payoff of the risky asset. We also have:
$$\mathrm{var}(R) = \frac{\hat\sigma^2 \theta_h^2}{W_{h0}^2}$$
where σ̂ is the volatility of the payoff of the risky asset.
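Each agent solves the following optimization problem:
$$\begin{aligned}
\underset{\theta_h}{\text{maximize}} \quad & (\hat\mu - q R_f)\,\frac{\theta_h}{W_{h0}} - \frac{1}{2}\,\gamma_h \hat\sigma^2 \left(\frac{\theta_h}{W_{h0}}\right)^2 \\
\text{subject to} \quad & |\theta_h| \le L_h W_{h0}
\end{aligned}$$
where γh is the agent’s risk aversion coefficient.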
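By solving the KKT conditions we find:
$$\theta_h^{opt} = \frac{\hat\mu - q R_f}{\hat\sigma^2\,\frac{\gamma_h}{W_{h0}}\,(1 + \lambda_h)} \qquad (4.1)$$
where $1 + \lambda_h = \max\left(1, \frac{|\hat\mu - q R_f|}{L_h \gamma_h \hat\sigma^2}\right)$.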
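The solution can also be written as:
$$\theta_h^{opt} = \begin{cases}
\dfrac{\hat\mu - q R_f}{\hat\sigma^2\,\frac{\gamma_h}{W_{h0}}} & \text{if } \dfrac{\hat\mu - L_h \gamma_h \hat\sigma^2}{R_f} \le q \le \dfrac{\hat\mu + L_h \gamma_h \hat\sigma^2}{R_f} \\[2ex]
L_h W_{h0} & \text{if } q \le \dfrac{\hat\mu - L_h \gamma_h \hat\sigma^2}{R_f} \\[2ex]
-L_h W_{h0} & \text{if } q \ge \dfrac{\hat\mu + L_h \gamma_h \hat\sigma^2}{R_f}
\end{cases} \qquad (4.2)$$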
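As an illustration of equations 4.1 and 4.2, the following minimal Python sketch computes one agent's constrained position by clipping the unconstrained mean-variance demand at the risk limit. The function and argument names are hypothetical and not part of the thesis; the numbers in the usage note are arbitrary.

```python
def optimal_position(mu_hat, sigma_hat, q, r_f, gamma_h, w_h0, l_h):
    """Constrained demand of agent h, following equation 4.2."""
    # Unconstrained demand: (mu_hat - q R_f) / (sigma_hat^2 * gamma_h / W_h0).
    unconstrained = (mu_hat - q * r_f) * w_h0 / (gamma_h * sigma_hat ** 2)
    cap = l_h * w_h0  # risk constraint |theta_h| <= L_h W_h0
    return max(-cap, min(cap, unconstrained))


def multiplier(mu_hat, sigma_hat, q, r_f, gamma_h, l_h):
    """lambda_h from 1 + lambda_h = max(1, |mu_hat - q R_f| / (L_h gamma_h sigma_hat^2))."""
    ratio = abs(mu_hat - q * r_f) / (l_h * gamma_h * sigma_hat ** 2)
    return max(1.0, ratio) - 1.0
```

For example, with µ̂ = 2, σ̂ = 1, Rf = 1, γh = 1, Wh0 = 1 and Lh = 0.5, a price q = 1.8 gives an interior position of about 0.2, while q = 1 gives the capped position 0.5 with λh = 1, which agrees with equation 4.1.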
A competitive equilibrium is a set of portfolios (θ1 , · · · , θH ) and a price q such
that:
• Markets clear
• Each agent’s portfolio is optimal
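The market clearing condition implies that:
$$\sum_{h \in H} \theta_h^{opt} = \theta_\alpha \;\Rightarrow\; q = \frac{\hat\mu - \Psi \hat\sigma^2 \theta_\alpha}{R_f}$$
where θα is the aggregate supply of the risky asset and
$$\frac{1}{\Psi} \equiv \sum_{h \in H} \frac{1}{\frac{\gamma_h}{W_{h0}}\,(1 + \lambda_h)}$$
We will consider two special cases: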
• Constraints vary across the agents
• Risk aversion relative to wealth varies across the agents
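The competitive equilibrium defined above can be computed numerically. The sketch below is one possible approach, not the method used in the thesis: it bisects on the price q until the aggregate constrained demand of equation 4.2 equals the supply θα, which works because aggregate demand is non-increasing in q. The agent tuple layout `(gamma_h, W_h0, L_h)` and all names are assumptions made for illustration.

```python
def aggregate_demand(q, mu_hat, sigma_hat, r_f, agents):
    """Sum of the constrained demands (equation 4.2) at price q.

    agents is a list of (gamma_h, W_h0, L_h) tuples (hypothetical layout).
    """
    total = 0.0
    for gamma_h, w_h0, l_h in agents:
        unconstrained = (mu_hat - q * r_f) * w_h0 / (gamma_h * sigma_hat ** 2)
        cap = l_h * w_h0
        total += max(-cap, min(cap, unconstrained))
    return total


def equilibrium_price(theta_alpha, mu_hat, sigma_hat, r_f, agents, iterations=200):
    """Bisect on q until aggregate demand equals the supply theta_alpha."""
    total_cap = sum(l_h * w_h0 for _, w_h0, l_h in agents)
    if abs(theta_alpha) >= total_cap:
        # At or beyond total capacity the price is not pinned down by clearing.
        raise ValueError("supply at or beyond the total position limit")
    widest = max(l_h * gamma_h for gamma_h, _, l_h in agents) * sigma_hat ** 2
    lo = (mu_hat - widest) / r_f  # every agent is at the long limit here
    hi = (mu_hat + widest) / r_f  # every agent is at the short limit here
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if aggregate_demand(mid, mu_hat, sigma_hat, r_f, agents) > theta_alpha:
            lo = mid  # excess demand, so the clearing price is higher
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

When no constraint binds, the result should coincide with the closed form q = (µ̂ − Ψσ̂²θα)/Rf with all λh = 0; when some constraint binds, the price is lower than that unconstrained value.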
4.2.2 Varying constraints
In the first case, we assume that the agents face heterogeneous constraints. In particular, we assume that the parameter $\hat L_h \equiv L_h W_{h0}$ varies across the agents, while $\frac{\gamma_h}{W_{h0}}$ is constant and equal to γ. Without loss of generality, we assume that $\hat L_1 \le \hat L_2 \le \cdots \le \hat L_H$.
From the market clearing condition, we have that $\Psi\theta_\alpha = \frac{\hat\mu - q R_f}{\hat\sigma^2}$. In addition, from equation 4.1, we have that:
$$\frac{\hat\mu - q R_f}{\hat\sigma^2} = \theta_h^{opt}\,\frac{\gamma_h}{W_{h0}}\,(1 + \lambda_h) = \theta_h^{opt}\,\gamma\,(1 + \lambda_h)$$
Therefore, $\Psi\theta_\alpha = \theta_h^{opt}\,\gamma\,(1 + \lambda_h)\ \ \forall h \in H$.
We perform a sensitivity analysis for two cases:
• Keep L̂h constant and change θα . By changing the aggregate supply, we find
that:
– Ψθα is a piecewise linear convex increasing function of the aggregate supply.
– Its slope is given by $\frac{\gamma}{H - i + 1}$ until the constraint binds for agent i. In other words, at each point it is equal to γ over the number of agents for whom the constraint is not binding yet.
– The constraint binds for agent 1 when $\theta_{\alpha,1} = H \hat L_1$ and for agent i when $\theta_{\alpha,i} = \theta_{\alpha,i-1} + (H - i + 1)(\hat L_i - \hat L_{i-1})$. These are the kink points in Figures 4-1, 4-2, 4-3.
– When the aggregate supply is greater than $\theta_\alpha = \sum_{h \in H} \hat L_h$ there is no equilibrium, since in that case all the agents are constrained in their positions, they cannot buy any more shares and therefore the market cannot clear.
– Since $q = \frac{\hat\mu - \Psi \hat\sigma^2 \theta_\alpha}{R_f}$, we see that the pricing function is a piecewise linear concave decreasing function of the aggregate supply, as we see in Figures 4-1, 4-2, 4-3.
– Therefore, variability of the constraints leads to increasing risk premium, increasing volatility and increasing illiquidity, since a small change in the aggregate supply (a small liquidity shock by the noise traders) leads to a larger change in the price of the risky asset compared to the case where there is no variability in the constraints the different agents face. Figure 4-1 shows the pricing function when the agents face the same constraints and when they face variable constraints. Figure 4-3 also shows the pricing function of the risky asset when the agents face two sets of constraints with the same mean but with different variability.
[Figure 4-1 plot: "Price of the risky asset"; x-axis: Aggregate supply; y-axis: Price; series: "Variability in constraints", "Same constraints".]
Figure 4-1: Price of the risky asset as a function of the aggregate market supply under varying constraints. We assume that we have 5 agents with the same risk aversion coefficients. The red plot assumes the same L = 30 for all the agents, while the blue assumes L to be different across the agents: L1 = 10, L2 = 20, L3 = 30, L4 = 40, L5 = 50.
• Keep θα constant and change L̂h . As we see in Figure 4-2 as we tighten the
constraints, the pricing function becomes more concave. Therefore, tightening
of the constraints leads to increasing risk premium, increasing volatility and
increasing illiquidity.
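The kink points and slopes listed above can be reproduced with a short Python sketch. It assumes a non-negative aggregate supply and treats the limits quoted in the caption of Figure 4-1 as the values of L̂h; the function names are hypothetical.

```python
def kink_points(l_hat):
    """Supply levels at which each agent's constraint starts to bind."""
    caps = sorted(l_hat)
    n = len(caps)
    kinks = [n * caps[0]]  # theta_{alpha,1} = H * L_hat_1
    for i in range(1, n):
        # theta_{alpha,i} = theta_{alpha,i-1} + (H - i + 1)(L_hat_i - L_hat_{i-1}),
        # written with a 0-based index, so H - i + 1 becomes n - i.
        kinks.append(kinks[-1] + (n - i) * (caps[i] - caps[i - 1]))
    return kinks


def psi_theta(theta_alpha, l_hat, gamma):
    """Psi * theta_alpha as a piecewise linear function of the supply."""
    caps = sorted(l_hat)
    n = len(caps)
    if theta_alpha > sum(caps):
        raise ValueError("no equilibrium: supply exceeds the total limit")
    value, previous = 0.0, 0.0
    for i, kink in enumerate(kink_points(caps)):
        slope = gamma / (n - i)  # gamma over the number of unconstrained agents
        if theta_alpha <= kink:
            return value + slope * (theta_alpha - previous)
        value += slope * (kink - previous)
        previous = kink
    return value
```

With limits 10, 20, 30, 40, 50 the kinks fall at 50, 90, 120, 140 and 150. At a supply of 100 and γ = 1 the discount term Ψθα is 70/3, compared with 20 when every agent has the limit 30, so the price (µ̂ − σ̂²Ψθα)/Rf is lower under variable constraints.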
[Figure 4-2 plot: "Price of the risky asset"; x-axis: Aggregate supply; y-axis: Price; series: "Different constraints Lh", "Constraints reduced by a factor 1/5".]
Figure 4-2: Price of the risky asset as a function of the aggregate market supply under tightening constraints. We assume that we have 5 agents with the same risk aversion coefficients. The blue plot assumes L to be different across the agents: L1 = 10, L2 = 20, L3 = 30, L4 = 40, L5 = 50, and the red assumes that each Li is reduced by 20%.
[Figure 4-3 plot: "Price of the risky asset"; x-axis: Aggregate supply; y-axis: Price; series: "Different constraints Lh", "Less variable constraints by a factor of 2".]
Figure 4-3: Price of the risky asset as a function of the aggregate market supply with less variable constraints. We assume that we have 5 agents with the same risk aversion coefficients. The blue plot assumes L to be different across the agents: L1 = 10, L2 = 20, L3 = 30, L4 = 40, L5 = 50, and the red assumes that L1 = 20, L2 = 25, L3 = 30, L4 = 35, L5 = 40.
4.2.3 Varying risk aversions
So far we have assumed that the agents have heterogeneous constraints but the same normalized risk aversions. Now, we assume that the risk aversion relative to wealth $\hat\gamma_h \equiv \frac{\gamma_h}{W_{h0}}$ varies across the agents, while the parameters $\hat L_h$ are constant and equal to L. Without loss of generality, we assume that $\hat\gamma_1 \le \hat\gamma_2 \le \cdots \le \hat\gamma_H$. As we proved in the previous section, we have: $\Psi\theta_\alpha = \theta_h^{opt}\,\hat\gamma_h\,(1 + \lambda_h)\ \ \forall h \in H$.
We perform a sensitivity analysis for two cases:
• Keep γˆh constant and change θα . By changing the aggregate supply, we find
that:
– Ψθα is a piecewise linear convex increasing function of θα .
– Its slope is $\left(\sum_{h=i}^{H} \frac{1}{\hat\gamma_h}\right)^{-1}$ until the constraint binds for agent i. In other words, at each point it is equal to the aggregate risk aversion of the agents whose constraint is not binding yet.
– The constraint binds for agent i when $\theta_{\alpha,i} = iL + \sum_{k=i+1}^{H} L\,\frac{\hat\gamma_i}{\hat\gamma_k}$. These are the kink points in Figures 4-4, 4-5.
– When the aggregate supply is greater than $\theta_\alpha = HL$ there is no equilibrium, since in that case all the agents are constrained in their positions, they cannot buy any more shares and therefore the market cannot clear.
– Since $q = \frac{\hat\mu - \Psi \hat\sigma^2 \theta_\alpha}{R_f}$, we see that the pricing function is a piecewise linear concave decreasing function of the aggregate supply, as we see in Figures 4-4, 4-5.
– Therefore, variability of the risk aversion coefficients together with the constraints leads to increasing risk premium, increasing volatility and increasing illiquidity, since a small change in the aggregate supply (a small liquidity shock by the noise traders) leads to a larger change in the price of the risky asset compared to the case where there is variability in the risk aversion coefficients but no constraints. Figure 4-4 shows the pricing function of the risky asset when the agents with different risk aversion coefficients face constraints and when they do not face any constraints.
[Figure 4-4 plot: "Price of the risky asset"; x-axis: Aggregate supply; y-axis: Price; series: "Variability in risk aversions", "No constraints".]
Figure 4-4: Price of the risky asset as a function of the aggregate market supply with constraints and varying risk aversions. We assume that we have 5 agents with the same constraints but different risk aversion coefficients. The blue plot assumes L = 30 for each agent, while the red line assumes that the agents are unconstrained.
• Keep θα constant and change L. As we see in Figure 4-5 as we tighten the
constraints, the pricing function becomes more concave. Therefore, tightening
of the constraints leads to increasing risk premium, increasing volatility and
increasing illiquidity.
[Figure 4-5 plot: "Price of the risky asset"; x-axis: Aggregate supply; y-axis: Price; series: "Variability in risk aversions", "Tighter constraints".]
Figure 4-5: Price of the risky asset as a function of the aggregate market supply with tightening constraints and varying risk aversions. We assume that we have 5 agents with the same constraints but different risk aversion coefficients. The blue plot assumes L = 30 for each agent, while the red line assumes that L = 20 for each agent.
4.2.4 Varying constraints and risk aversions
Now we assume that both the risk aversion relative to wealth $\hat\gamma_h \equiv \frac{\gamma_h}{W_{h0}}$ and the parameters $\hat L_h$ vary across the agents. Without loss of generality we assume that $\hat\gamma_1 \hat L_1 \le \hat\gamma_2 \hat L_2 \le \cdots \le \hat\gamma_H \hat L_H$. By changing the aggregate supply, we find that:
• Ψθα is a piecewise linear convex increasing function of θα .
• Its slope is $\left(\sum_{h=i}^{H} \frac{1}{\hat\gamma_h}\right)^{-1}$ until the constraint binds for agent i. In other words, at each point it is equal to the aggregate risk aversion of the agents whose constraint is not binding yet.
• The constraints bind in increasing order of $\hat\gamma_i \hat L_i = \gamma_i L_i$. The constraint binds for agent i when $\theta_{\alpha,i} = \sum_{k=1}^{i} \hat L_k + \sum_{k=i+1}^{H} \hat L_i\,\frac{\hat\gamma_i}{\hat\gamma_k}$.
• When the aggregate supply is greater than $\theta_\alpha = \sum_{h \in H} \hat L_h$ there is no equilibrium, since in that case all the agents are constrained in their positions, they cannot buy any more shares and therefore the market cannot clear.
• The pricing function of the risky asset is a piecewise linear concave decreasing function of the aggregate supply. That means increasing risk premium,
increasing volatility and increasing illiquidity.
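The general kink-point and slope formulas of this section can be checked against the two special cases with the following sketch. The function names and the sample risk aversions (1, 2, 4) are illustrative assumptions, not values taken from the thesis.

```python
def general_kinks(gamma_hat, l_hat):
    """Supply levels at which constraints bind, in the order of gamma_hat_i * L_hat_i."""
    agents = sorted(zip(gamma_hat, l_hat), key=lambda a: a[0] * a[1])
    kinks = []
    for i, (g_i, l_i) in enumerate(agents):
        bound = sum(l_k for _, l_k in agents[: i + 1])            # sum_{k<=i} L_hat_k
        free = sum(l_i * g_i / g_k for g_k, _ in agents[i + 1:])  # sum_{k>i} L_hat_i g_i / g_k
        kinks.append(bound + free)
    return kinks


def segment_slopes(gamma_hat, l_hat):
    """Slope of Psi * theta_alpha on the segment that ends at each kink."""
    agents = sorted(zip(gamma_hat, l_hat), key=lambda a: a[0] * a[1])
    return [1.0 / sum(1.0 / g for g, _ in agents[i:]) for i in range(len(agents))]
```

With equal risk aversions the kinks reduce to the recursion of Section 4.2.2, and with equal limits L they reduce to iL + Σ L γ̂i/γ̂k as in Section 4.2.3. In both cases the slopes increase from one segment to the next, which is the convexity of Ψθα stated above.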
4.2.5 Varying fear of model misspecification
We now study how fear of model misspecification across the agents affects the statistical properties of risky assets. We modify our initial model setup slightly. In particular, we assume we have H investors with CARA utilities, heterogeneous risk aversions and wealth. Each investor can invest in N risky assets and the risk free rate at t = 0. Each agent mistrusts the model that describes the payoff distribution of the risky assets. All the agents have a common nominal model but they have heterogeneous fears of model misspecification. We assume that the risk-free rate is exogenously given and the common approximating model describing the risky assets’ payoff distribution at t = 1 is N(µ̂, Σ̂). There are also noise traders. We do not model their utility explicitly; we only assume that they are hit by random liquidity shocks and that they submit random market orders at time t = 0. Equivalently, the supply of the risky assets is stochastic.
Each agent’s wealth at time t = 1 is given by:
$$W_{h1} = d^T \theta_h + (W_{h0} - q^T \theta_h) R_f$$
where d ∈ R^N describes the payoff of the N risky assets, θh describes the number of shares held of each asset, q ∈ R^N is the vector of prices of the risky assets and Rf is the risk-free rate.
The common approximating model of the payoff distribution is $d = \hat\mu + \hat\Sigma^{1/2}\epsilon$, where ε follows the standard multivariate Gaussian distribution. The alternative models alter the distribution of ε. In particular, similarly to the framework described in Chapter 2, they change the mean of the shocks and assume that the distribution of the shock is N(h, I). Therefore, under the alternative models the payoff is $d = \hat\mu + \hat\Sigma^{1/2}(\epsilon + h)$, i.e. d follows $N(\hat\mu + \hat\Sigma^{1/2} h, \hat\Sigma)$. The relative entropy of the alternative distributions with respect to the nominal distribution is given by $D(Q) = \frac{1}{2} h^T h$, as we have shown in Chapter 2 and proved in the Appendix. Under the alternative distributions the certainty equivalent of the investors with CARA utilities is $\gamma (W_0 - q^T\theta) R_f + \gamma \hat\mu^T \theta - \frac{1}{2}\gamma^2 \theta^T \hat\Sigma \theta + \gamma \theta^T \hat\Sigma^{1/2} h$.
We assume each investor solves the following optimization problem:
$$\max_{\theta}\ \min_{h}\ \gamma (W_0 - q^T\theta) R_f + \gamma \hat\mu^T \theta - \frac{1}{2}\gamma^2 \theta^T \hat\Sigma \theta + \gamma \theta^T \hat\Sigma^{1/2} h + \lambda\,\frac{1}{2} h^T h \qquad (4.3)$$
where γ is the risk aversion coefficient of the investor and λ is the multiplier that penalizes the relative entropy of the alternative distribution.
By solving this optimization problem (see Appendix) we find that:
$$\theta_h^{opt} = \frac{\hat\Sigma^{-1}(\hat\mu - R_f q)}{\gamma_h \left(1 + \frac{1}{\lambda_h}\right)} \qquad (4.4)$$
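The market clearing condition for the risky assets implies that (see Appendix):
$$\sum_{h \in H} \theta_h^{opt} = \theta_\alpha \;\Rightarrow\; q = \frac{\hat\mu - \Psi \hat\Sigma \theta_\alpha}{R_f}$$
where θα ∈ R^N is the aggregate supply of the risky assets and $\frac{1}{\Psi} \equiv \sum_{h \in H} \frac{1}{\gamma_h \left(1 + \frac{1}{\lambda_h}\right)}$.
If the investors fully trusted their model dynamics, we would instead have
$$q = \frac{\hat\mu - \Psi_{nr} \hat\Sigma \theta_\alpha}{R_f} \qquad (4.5)$$
where $\frac{1}{\Psi_{nr}} \equiv \sum_{h \in H} \frac{1}{\gamma_h}$.
So we see that the case where the agents mistrust their models is equivalent to the case where they fully trust their models but have a higher effective risk aversion $\gamma_{eff} = \gamma \left(1 + \frac{1}{\lambda}\right)$.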
By changing the aggregate supply, we find that when the investors mistrust their models and follow a robust decision rule, the slope of the pricing function of the assets is larger than in the case where the investors fully trust their models’ dynamics, since Ψ > Ψnr . If we consider the case where N = 1 and the risky asset is the market, then we see that the risk premium, the volatility and the illiquidity go up, since a small change in the aggregate supply (a small liquidity shock by the noise traders) leads to a larger change in the price of the risky asset compared to the case
where the investors fully trust their models.
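The comparison between Ψ and Ψnr can be illustrated for a single risky asset (N = 1) with a few lines of Python. This is a sketch under assumed parameter values; the function names are hypothetical.

```python
def psi_robust(gammas, lambdas):
    """Psi with model mistrust: 1/Psi = sum_h 1 / (gamma_h (1 + 1/lambda_h))."""
    return 1.0 / sum(1.0 / (g * (1.0 + 1.0 / lam)) for g, lam in zip(gammas, lambdas))


def psi_no_robustness(gammas):
    """Psi_nr when every investor fully trusts the model: 1/Psi_nr = sum_h 1/gamma_h."""
    return 1.0 / sum(1.0 / g for g in gammas)


def price(mu_hat, sigma2_hat, theta_alpha, r_f, psi):
    """Scalar version of q = (mu_hat - Psi * Sigma_hat * theta_alpha) / R_f."""
    return (mu_hat - psi * sigma2_hat * theta_alpha) / r_f
```

For example, with γ = (1, 2) and λ = (1, 4) the effective risk aversions are 2 and 2.5, so Ψ = 1/0.9 while Ψnr = 1/1.5. The price under mistrust is therefore lower and falls faster as the supply grows. As every λh becomes large, Ψ approaches Ψnr.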
4.3 Conclusions
Risk sensitive regulations have become the cornerstone of international financial regulations. They imply an upper bound on the wealth volatility for each investor. In
this chapter we studied how risk constraints and model misspecification affect the
statistical properties of the market returns. In particular, we studied their effect on
the risk premium, the volatility and liquidity of the market.
We studied the following cases:
• Variability of the risk constraints: This is the case where the agents face different risk constraints. The more variability there is in the risk constraints, the
larger the risk premium, volatility and illiquidity of the market is. In addition
tightening of the constraints leads also to more risk premium, volatility and
illiquidity of the market.
• Variability in risk aversions: This is the case where the agents face the same
risk constraint but have different aversions to risk. Variability in risk aversions
along with the risk constraints also leads to a more concave pricing function of the
aggregate supply for the market, implying increasing risk premium, increasing
volatility and increasing illiquidity. An interesting question here is the following.
Do we have a concave pricing function due to the fact that “...risk will have to be
transferred from the more risk-tolerant to the more risk-averse”? As we saw,
this is not the case. Any constraint that binds for an agent forces the discount
and the slope of the pricing function to be larger so that the other agents are
induced to absorb the excess supply and this is the mechanism that leads to a
more concave decreasing pricing function with respect to the aggregate supply.
• Model misspecification: This is the case where the agents do not fully trust their
model dynamics, they believe that the real model is an unknown member of a set
of alternative models near their nominal model and they make robust decision
rules. We find that model misspecification is another source of increasing risk
premium, endogenous volatility and increasing illiquidity.
Appendix A
Technical Notes
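Proposition 1. The following QCQP:
$$\begin{aligned}
\text{minimize} \quad & F_t^T \mu_t + \frac{1}{2} F_t^T \Sigma F_t \\
\text{subject to} \quad & F_t^T \Sigma F_t \le L
\end{aligned}$$
has a solution given by:
$$F_t^{opt} = \begin{cases}
-\Sigma^{-1}\mu_t & \text{if } \mu_t^T \Sigma^{-1} \mu_t \le L \\[1ex]
-\dfrac{\Sigma^{-1}\mu_t}{\sqrt{\mu_t^T \Sigma^{-1} \mu_t / L}} & \text{if } \mu_t^T \Sigma^{-1} \mu_t \ge L
\end{cases}$$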
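Proof. Let us apply the Karush Kuhn Tucker (KKT) conditions. $F_t^{opt}$, $\lambda_t^{opt}$ are optimal iff they satisfy the following KKT conditions:
• Primal feasibility: $F_t^{opt\,T} \Sigma F_t^{opt} \le L$
• Dual feasibility: $\lambda_t^{opt} \ge 0$
• Complementary slackness: $\lambda_t^{opt} (F_t^{opt\,T} \Sigma F_t^{opt} - L) = 0$
• Minimization of the Lagrangean: $F_t^{opt} = \operatorname{argmin} L(F_t, \lambda_t^{opt})$
The Lagrangean is given by:
$$L(F, \lambda) = F_t^T \mu_t + \frac{1}{2} F_t^T \Sigma F_t + \lambda \left(\frac{1}{2} F_t^T \Sigma F_t - \frac{1}{2} L\right)$$
The first order conditions are:
$$\mu_t + \Sigma F_t^{opt} + \lambda_t^{opt} \Sigma F_t^{opt} = 0 \;\Rightarrow\; F_t^{opt} = -\frac{\Sigma^{-1}\mu_t}{1 + \lambda_t^{opt}}$$
If $\mu_t^T \Sigma^{-1} \mu_t > L$, then we cannot have $\lambda_t^{opt} = 0$, since in that case $F_t^{opt\,T} \Sigma F_t^{opt} > L$. Therefore $\lambda_t^{opt} > 0$ and from the CS condition
$$F_t^{opt\,T} \Sigma F_t^{opt} = L \;\Rightarrow\; \frac{\mu_t^T \Sigma^{-1} \mu_t}{(1 + \lambda_t^{opt})^2} = L \;\Rightarrow\; 1 + \lambda_t^{opt} = \sqrt{\frac{\mu_t^T \Sigma^{-1} \mu_t}{L}} > 1$$
Therefore, if $\mu_t^T \Sigma^{-1} \mu_t > L$, then $F_t^{opt} = -\dfrac{\Sigma^{-1}\mu_t}{\sqrt{\mu_t^T \Sigma^{-1} \mu_t / L}}$.
If $\mu_t^T \Sigma^{-1} \mu_t < L$, then $F_t^{opt\,T} \Sigma F_t^{opt} < L$, therefore $\lambda_t^{opt} = 0$ and $F_t^{opt} = -\Sigma^{-1}\mu_t$.
Finally, if $\mu_t^T \Sigma^{-1} \mu_t = L$, then we cannot have $\lambda_t^{opt} > 0$, since in that case we would have $F_t^{opt\,T} \Sigma F_t^{opt} < L$ and $\lambda_t^{opt} > 0$ and the CS condition would be violated. Therefore, $\lambda_t^{opt} = 0$ and $F_t^{opt} = -\Sigma^{-1}\mu_t$.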
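Therefore we proved that:
$$F_t^{opt} = \begin{cases}
-\Sigma^{-1}\mu_t & \text{if } \mu_t^T \Sigma^{-1} \mu_t \le L \\[1ex]
-\dfrac{\Sigma^{-1}\mu_t}{\sqrt{\mu_t^T \Sigma^{-1} \mu_t / L}} & \text{if } \mu_t^T \Sigma^{-1} \mu_t \ge L
\end{cases}$$
We could also prove this result in another way. Let us make a change of variables where:
$$y = \Sigma^{1/2} F_t, \qquad F_t = \Sigma^{-1/2} y$$
Then our problem becomes:
$$\begin{aligned}
\text{minimize} \quad & y^T \Sigma^{-1/2} \mu_t + \frac{1}{2} y^T y \\
\text{subject to} \quad & y^T y \le L
\end{aligned}$$
The optimal solution is given by finding the projection of $\tilde\mu = -\Sigma^{-1/2}\mu_t$ on the Euclidean ball $y^T y \le L$. This projection is given by:
$$y^{opt} = \frac{\tilde\mu}{\max\left(1, \sqrt{\tilde\mu^T \tilde\mu / L}\right)}$$
Therefore the optimal solution for the original problem is given by:
$$F_t^{opt} = \Sigma^{-1/2} y^{opt} = -\frac{\Sigma^{-1}\mu_t}{\max\left(1, \sqrt{\mu_t^T \Sigma^{-1} \mu_t / L}\right)}$$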
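The closed form of Proposition 1 can be sanity-checked numerically. To stay within the Python standard library, the sketch below assumes a diagonal Σ (a simplification that the proposition itself does not require); the function names and sample numbers are hypothetical.

```python
import math


def qcqp_solution(mu, sigma_diag, limit):
    """F_opt = -Sigma^{-1} mu / max(1, sqrt(mu' Sigma^{-1} mu / L)) for diagonal Sigma."""
    inv_mu = [m / s for m, s in zip(mu, sigma_diag)]  # Sigma^{-1} mu
    quad = sum(m * v for m, v in zip(mu, inv_mu))     # mu' Sigma^{-1} mu
    scale = max(1.0, math.sqrt(quad / limit))
    return [-v / scale for v in inv_mu]


def risk(f, sigma_diag):
    """F' Sigma F for diagonal Sigma."""
    return sum(s * x * x for s, x in zip(sigma_diag, f))


def objective(f, mu, sigma_diag):
    """F' mu + 0.5 F' Sigma F."""
    return sum(x * m for x, m in zip(f, mu)) + 0.5 * risk(f, sigma_diag)
```

With µ = (1, −2), Σ = diag(1, 4) and L = 0.5 we get µᵀΣ⁻¹µ = 2 > L, so the unconstrained solution (−1, 0.5) is scaled by 1/2 to (−0.5, 0.25), which sits exactly on the boundary FᵀΣF = L. With L = 10 the constraint is slack and the unconstrained solution is returned.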
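Proposition 2. When Σ is a diagonal matrix, the following convex program:
$$\begin{aligned}
\text{minimize} \quad & F_t^T \mu_t + \frac{1}{2} F_t^T \Sigma F_t \\
\text{subject to} \quad & \sum_{i=1}^{N} \lambda_i |F_{it}| \le 1
\end{aligned}$$
has a solution given by:
$$F_{it}^{opt} = \frac{\operatorname{sign}(-\mu_{it}) \left(\left|\frac{\mu_{it}}{\lambda_i}\right| - \nu_t^{opt}\right)^{+}}{\sigma_i^2 / \lambda_i}$$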
Proof. Let us apply the Karush Kuhn Tucker (KKT) conditions. $F_t^{opt}$, $\nu_t^{opt}$ are optimal iff they satisfy the KKT conditions:
• Primal feasibility: $\sum_{i=1}^{N} \lambda_i |F_{it}| \le 1$
• Dual feasibility: $\nu_t^{opt} \ge 0$
• Complementary slackness: $\nu_t^{opt} \left(\sum_{i=1}^{N} \lambda_i |F_{it}| - 1\right) = 0$
• Minimization of the Lagrangean: $F_t^{opt} = \operatorname{argmin} L(F_t, \nu_t^{opt})$
The Lagrangean is given by:
$$L(F, \nu) = F_t^T \mu_t + \frac{1}{2} F_t^T \Sigma F_t + \nu \left(\sum_{i=1}^{N} \lambda_i |F_{it}| - 1\right)$$
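$$\begin{aligned}
F_t^{opt} &= \operatorname{argmin}\ L(F_t, \nu_t^{opt}) \\
&= \operatorname{argmin}\ \frac{1}{2} \sum_{i=1}^{N} \sigma_i^2 F_{it}^2 + \sum_{i=1}^{N} F_{it}\,\mu_{it} + \nu_t^{opt} \left(\sum_{i=1}^{N} \lambda_i |F_{it}| - 1\right) \\
&= \operatorname{argmin}\ \frac{1}{2} \sum_{i=1}^{N} \sigma_i^2 |F_{it}|^2 + \sum_{i=1}^{N} |F_{it}| \operatorname{sign}(-\mu_{it})\,\mu_{it} + \nu_t^{opt} \left(\sum_{i=1}^{N} \lambda_i |F_{it}| - 1\right) \\
&= \operatorname{argmin}\ \frac{1}{2} \sum_{i=1}^{N} \sigma_i^2 |F_{it}|^2 + \sum_{i=1}^{N} |F_{it}| \left(\operatorname{sign}(-\mu_{it})\,\mu_{it} + \lambda_i \nu_t^{opt}\right)
\end{aligned}$$
where $F_{it}^{opt} = |F_{it}^{opt}| \operatorname{sign}(-\mu_{it})$.
We have:
$$|F_{it}^{opt}| = \begin{cases}
0 & \text{if } \operatorname{sign}(-\mu_{it})\,\mu_{it} + \lambda_i \nu_t^{opt} \ge 0 \\[1ex]
-\dfrac{\operatorname{sign}(-\mu_{it})\,\mu_{it} + \lambda_i \nu_t^{opt}}{\sigma_i^2} & \text{if } \operatorname{sign}(-\mu_{it})\,\mu_{it} + \lambda_i \nu_t^{opt} \le 0
\end{cases}$$
Therefore:
$$F_{it}^{opt} = \begin{cases}
0 & \text{if } \operatorname{sign}(-\mu_{it})\,\mu_{it} + \lambda_i \nu_t^{opt} \ge 0 \\[1ex]
-\dfrac{\frac{\mu_{it}}{\lambda_i} + \operatorname{sign}(-\mu_{it})\,\nu_t^{opt}}{\sigma_i^2 / \lambda_i} & \text{if } \operatorname{sign}(-\mu_{it})\,\mu_{it} + \lambda_i \nu_t^{opt} \le 0
\end{cases}$$
which is equivalent to:
$$F_{it}^{opt} = \begin{cases}
0 & \text{if } \operatorname{sign}(-\mu_{it})\,\mu_{it} + \lambda_i \nu_t^{opt} \ge 0 \\[1ex]
\operatorname{sign}(-\mu_{it})\,\dfrac{\frac{|\mu_{it}|}{\lambda_i} - \nu_t^{opt}}{\sigma_i^2 / \lambda_i} & \text{if } \operatorname{sign}(-\mu_{it})\,\mu_{it} + \lambda_i \nu_t^{opt} \le 0
\end{cases}$$
Since it is:
$$\operatorname{sign}(-\mu_{it})\,\mu_{it} + \lambda_i \nu_t^{opt} \le 0 \;\Rightarrow\; -|\mu_{it}| + \lambda_i \nu_t^{opt} \le 0 \;\Rightarrow\; \left|\frac{\mu_{it}}{\lambda_i}\right| - \nu_t^{opt} \ge 0$$
we have proved that:
$$F_{it}^{opt} = \frac{\operatorname{sign}(-\mu_{it}) \left(\left|\frac{\mu_{it}}{\lambda_i}\right| - \nu_t^{opt}\right)^{+}}{\sigma_i^2 / \lambda_i}$$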
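Proposition 2 gives the positions as a function of the multiplier ν but does not say how to find ν. One way to do so, sketched below as an assumption rather than as the procedure used in the thesis, is to bisect on ν until the weighted L1 budget Σ λi|Fit| equals 1, since the budget is non-increasing in ν. All names are hypothetical and the weights λi are assumed positive.

```python
import math


def l1_constrained_positions(mu, sigma2, lam, iterations=200):
    """Return (F, nu) for min F'mu + 0.5 F' Sigma F  s.t.  sum_i lam_i |F_i| <= 1.

    Sigma is diagonal with entries sigma2; lam holds the positive weights lambda_i.
    """
    def positions(nu):
        # Proposition 2: sign(-mu_i) (|mu_i / lam_i| - nu)^+ / (sigma_i^2 / lam_i)
        return [
            math.copysign(1.0, -m) * max(abs(m) / l - nu, 0.0) / (s / l)
            for m, s, l in zip(mu, sigma2, lam)
        ]

    def budget(nu):
        return sum(l * abs(f) for l, f in zip(lam, positions(nu)))

    if budget(0.0) <= 1.0:
        return positions(0.0), 0.0  # constraint is slack, so nu = 0
    lo, hi = 0.0, max(abs(m) / l for m, l in zip(mu, lam))  # budget(hi) = 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if budget(mid) > 1.0:
            lo = mid  # still over budget: a larger multiplier is needed
        else:
            hi = mid
    return positions(hi), hi
```

For a single asset with µ = −1, σ² = 1 and λ = 2, the unconstrained position 1 uses a budget of 2, so the constraint binds and the sketch returns F close to 0.5 with ν close to 0.25.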
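Proposition 3. The relative entropy of a multivariate Gaussian distribution N(µ, I) with respect to the multivariate standard Gaussian distribution is $D(Q) = \frac{1}{2}\mu^T\mu$.
Proof. It is:
$$D(Q) = \int_{-\infty}^{+\infty} \log\left(\frac{f(x)}{g(x)}\right) f(x)\,dx$$
where $f(x) = \frac{1}{(2\pi)^{N/2}} e^{-\frac{1}{2}(x-\mu)^T(x-\mu)}$ and $g(x) = \frac{1}{(2\pi)^{N/2}} e^{-\frac{1}{2}x^T x}$. Hence:
$$D(Q) = -\int_{-\infty}^{+\infty} \frac{1}{2}(x-\mu)^T(x-\mu)\,f(x)\,dx + \int_{-\infty}^{+\infty} \frac{1}{2}x^T x\,f(x)\,dx$$
The first term is:
$$\begin{aligned}
-\frac{1}{2} E[(X-\mu)^T(X-\mu)] &= -\frac{1}{2} E[\operatorname{trace}((X-\mu)^T(X-\mu))] \\
&= -\frac{1}{2} E[\operatorname{trace}((X-\mu)(X-\mu)^T)] \\
&= -\frac{1}{2} \operatorname{trace}(I) \\
&= -N/2
\end{aligned}$$
The second term is:
$$\begin{aligned}
\frac{1}{2} E[X^T X] &= \frac{1}{2} E[\operatorname{trace}((X-\mu)^T(X-\mu))] + \frac{1}{2} E[X]^T E[X] \\
&= N/2 + \frac{1}{2}\mu^T\mu
\end{aligned}$$
Therefore, we have $D(Q) = \frac{1}{2}\mu^T\mu$.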
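Proposition 3 can be checked numerically in one dimension. The sketch below integrates log(f/g)·f with a midpoint rule and compares the result with µ²/2; the integration window and step count are arbitrary choices, and the function names are hypothetical.

```python
import math


def gaussian_pdf(x, mean):
    """Density of N(mean, 1) at x."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def relative_entropy_1d(mean, half_width=12.0, steps=20000):
    """Midpoint-rule estimate of D(N(mean, 1) || N(0, 1))."""
    lo = mean - half_width
    dx = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        # log(f/g) = -0.5 (x - mean)^2 + 0.5 x^2, used directly to avoid log(0) in the tails
        log_ratio = 0.5 * x * x - 0.5 * (x - mean) ** 2
        total += log_ratio * gaussian_pdf(x, mean) * dx
    return total
```

For a mean of 1.5 the estimate should be close to 1.5²/2 = 1.125, in line with the proposition for N = 1.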
Proposition 4. The relative entropy of probability measure Q with respect to P, where $\frac{dQ}{dP} = \xi_T$ and $\xi_t = e^{\int_0^t h_s^T dZ_s - \frac{1}{2}\int_0^t h_s^T h_s\,ds}$, is given by:
$$D(Q) = \int_0^T \frac{1}{2} E_Q[h_t^T h_t]\,dt$$
Proof. The relative entropy of Q with respect to P is:
$$\begin{aligned}
D(Q) &= E_Q\left[\log\left(\frac{dQ}{dP}\right)\right] \\
&= E_Q[\log(\xi_T)] \\
&= E_Q\left[\int_0^T h_s^T dZ_s - \frac{1}{2}\int_0^T h_s^T h_s\,ds\right] \\
&= \int_0^T E_Q[h_s^T dZ_s] - \frac{1}{2}\int_0^T E_Q[h_s^T h_s]\,ds \\
&= \int_0^T E_Q\left[E_Q[h_s^T dZ_s \mid F_s]\right] - \frac{1}{2}\int_0^T E_Q[h_s^T h_s]\,ds \\
&= \int_0^T \frac{1}{2} E_Q[h_t^T h_t]\,dt
\end{aligned}$$
since $E_Q[dZ_t \mid F_t] = h_t\,dt$ from Girsanov’s theorem.
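Proposition 5. The following QCQP:
$$\begin{aligned}
\text{minimize} \quad & -F_t^T \left(\mu(S,t) - r S_t - \frac{\Sigma H_S}{\nu}\right) + \frac{1}{2}\left(1 + \frac{1}{\nu}\right) F_t^T \Sigma F_t \\
\text{subject to} \quad & F_t^T \Sigma F_t \le L
\end{aligned}$$
has a solution given by:
$$F_t^{opt} = \begin{cases}
\dfrac{1}{1 + \frac{1}{\nu}}\,\Sigma^{-1}\mu_t & \text{if } \mu_t^T \Sigma^{-1} \mu_t \le L\left(1 + \frac{1}{\nu}\right)^2 \\[2ex]
\dfrac{\Sigma^{-1}\mu_t}{\sqrt{\mu_t^T \Sigma^{-1} \mu_t / L}} & \text{if } \mu_t^T \Sigma^{-1} \mu_t \ge L\left(1 + \frac{1}{\nu}\right)^2
\end{cases}$$
where $\mu_t = \mu(S,t) - r S_t - \frac{\Sigma H_S(S_t,t)}{\nu}$.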
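Proof. Let us apply the Karush Kuhn Tucker (KKT) conditions. $F_t^{opt}$, $\lambda_t^{opt}$ are optimal iff they satisfy the following KKT conditions:
• Primal feasibility: $F_t^{opt\,T} \Sigma F_t^{opt} \le L$
• Dual feasibility: $\lambda_t^{opt} \ge 0$
• Complementary slackness: $\lambda_t^{opt} (F_t^{opt\,T} \Sigma F_t^{opt} - L) = 0$
• Minimization of the Lagrangean: $F_t^{opt} = \operatorname{argmin} L(F_t, \lambda_t^{opt})$
The Lagrangean is given by:
$$L(F, \lambda) = -F_t^T \mu_t + \frac{1}{2}\left(1 + \frac{1}{\nu}\right) F_t^T \Sigma F_t + \lambda \left(\frac{1}{2} F_t^T \Sigma F_t - \frac{1}{2} L\right)$$
The first order conditions are:
$$-\mu_t + \left(1 + \frac{1}{\nu}\right) \Sigma F_t^{opt} + \lambda_t^{opt} \Sigma F_t^{opt} = 0 \;\Rightarrow\; F_t^{opt} = \frac{\Sigma^{-1}\mu_t}{1 + \frac{1}{\nu} + \lambda_t^{opt}}$$
If $\mu_t^T \Sigma^{-1} \mu_t > L\left(1 + \frac{1}{\nu}\right)^2$, then we cannot have $\lambda_t^{opt} = 0$, since in that case $F_t^{opt\,T} \Sigma F_t^{opt} > L$. Therefore $\lambda_t^{opt} > 0$ and from the CS condition
$$F_t^{opt\,T} \Sigma F_t^{opt} = L \;\Rightarrow\; \frac{\mu_t^T \Sigma^{-1} \mu_t}{\left(1 + \frac{1}{\nu} + \lambda_t^{opt}\right)^2} = L \;\Rightarrow\; 1 + \frac{1}{\nu} + \lambda_t^{opt} = \sqrt{\frac{\mu_t^T \Sigma^{-1} \mu_t}{L}} > 1 + \frac{1}{\nu}$$
Therefore, if $\mu_t^T \Sigma^{-1} \mu_t > L\left(1 + \frac{1}{\nu}\right)^2$, then $F_t^{opt} = \dfrac{\Sigma^{-1}\mu_t}{\sqrt{\mu_t^T \Sigma^{-1} \mu_t / L}}$.
If $\mu_t^T \Sigma^{-1} \mu_t < L\left(1 + \frac{1}{\nu}\right)^2$, then $F_t^{opt\,T} \Sigma F_t^{opt} < L$, therefore $\lambda_t^{opt} = 0$ and $F_t^{opt} = \dfrac{\Sigma^{-1}\mu_t}{1 + \frac{1}{\nu}}$.
Finally, if $\mu_t^T \Sigma^{-1} \mu_t = L\left(1 + \frac{1}{\nu}\right)^2$, then we cannot have $\lambda_t^{opt} > 0$, since in that case we would have $F_t^{opt\,T} \Sigma F_t^{opt} < L$ and $\lambda_t^{opt} > 0$ and the CS condition would be violated. Therefore, $\lambda_t^{opt} = 0$ and $F_t^{opt} = \dfrac{\Sigma^{-1}\mu_t}{1 + \frac{1}{\nu}}$.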
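Proposition 6. The following convex program:
$$\max_{\theta}\ \min_{h}\ \gamma (W_0 - q^T\theta) R_f + \gamma \hat\mu^T \theta - \frac{1}{2}\gamma^2 \theta^T \hat\Sigma \theta + \gamma \theta^T \hat\Sigma^{1/2} h + \lambda\,\frac{1}{2} h^T h \qquad (A.1)$$
has a solution given by:
$$\theta^{opt} = \frac{\hat\Sigma^{-1}(\hat\mu - R_f q)}{\gamma \left(1 + \frac{1}{\lambda}\right)}$$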
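Proof. For each h the objective function is concave in θ. Therefore, when we take the minimum over h, we take the minimum of concave functions, which leads to a concave function. Therefore this problem maximizes a concave function of θ.
The inner minimization problem is:
$$\min_{h}\ \gamma \theta^T \hat\Sigma^{1/2} h + \lambda\,\frac{1}{2} h^T h$$
This is a convex problem where we minimize a quadratic function of h. The first order conditions are:
$$\gamma \hat\Sigma^{1/2} \theta + \lambda h = 0 \;\Rightarrow\; h = -\frac{\gamma \hat\Sigma^{1/2} \theta}{\lambda}$$
The optimal value of the inner optimization problem is:
$$V(\theta) = -\gamma \theta^T \hat\Sigma^{1/2}\,\frac{\gamma \hat\Sigma^{1/2} \theta}{\lambda} + \lambda\,\frac{1}{2}\,\frac{\gamma^2 \theta^T \hat\Sigma \theta}{\lambda^2} = -\frac{\gamma^2 \theta^T \hat\Sigma \theta}{2\lambda}$$
The problem A.1 then becomes:
$$\max_{\theta}\ \gamma (W_0 - q^T\theta) R_f + \gamma \hat\mu^T \theta - \frac{1}{2}\gamma^2 \theta^T \hat\Sigma \theta - \frac{\gamma^2 \theta^T \hat\Sigma \theta}{2\lambda}$$
The first order conditions of this concave problem are:
$$\gamma (\hat\mu - R_f q) - \gamma^2 \hat\Sigma \theta^{opt} \left(1 + \frac{1}{\lambda}\right) = 0 \;\Rightarrow\; \theta^{opt} = \frac{\hat\Sigma^{-1}(\hat\mu - R_f q)}{\gamma \left(1 + \frac{1}{\lambda}\right)}$$
Therefore we proved our proposition.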
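Proposition 7. The market clearing condition for the risky assets implies that
$$\sum_{h \in H} \theta_h^{opt} = \theta_\alpha \;\Rightarrow\; q = \frac{\hat\mu - \Psi \hat\Sigma \theta_\alpha}{R_f}$$
where θα ∈ R^N is the aggregate supply of the risky assets and $\frac{1}{\Psi} \equiv \sum_{h \in H} \frac{1}{\gamma_h \left(1 + \frac{1}{\lambda_h}\right)}$.
Proof. From the proposition above we have, for each agent h:
$$\theta_h = \frac{\hat\Sigma^{-1}(\hat\mu - R_f q)}{\gamma_h \left(1 + \frac{1}{\lambda_h}\right)} \;\Rightarrow\; \hat\Sigma \theta_h = \frac{\hat\mu - R_f q}{\gamma_h \left(1 + \frac{1}{\lambda_h}\right)}$$
By adding across agents we have:
$$\begin{aligned}
\sum_{h \in H} \hat\Sigma \theta_h &= \sum_{h \in H} \frac{\hat\mu - R_f q}{\gamma_h \left(1 + \frac{1}{\lambda_h}\right)} \\
\hat\Sigma \theta_\alpha &= (\hat\mu - R_f q) \sum_{h \in H} \frac{1}{\gamma_h \left(1 + \frac{1}{\lambda_h}\right)} \\
\Psi \hat\Sigma \theta_\alpha &= \hat\mu - R_f q \\
q &= \frac{\hat\mu - \Psi \hat\Sigma \theta_\alpha}{R_f}
\end{aligned}$$
where $\frac{1}{\Psi} \equiv \sum_{h \in H} \frac{1}{\gamma_h \left(1 + \frac{1}{\lambda_h}\right)}$.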
Bibliography
[1] Committee on the NIH Research Priority-Setting Process, Scientific Opportunities and Public Needs: Improving Priority Setting and Public Input at the
National Institutes of Health, pp. 11–12. Washington, D.C.: National Academy
Press. (1998).
[2] C. Anderson. A new kind of earmarking. Science, 260(5107):483, Apr. 23, 1993.
[3] Evan W. Anderson, Lars Peter Hansen, and Thomas J. Sargent. Robustness,
detection and the price of risk, 2000.
[4] Kerry E. Back. Asset Pricing and Portfolio Choice Theory. Oxford University
Press, 2010.
[5] Alexander Bade, Gabriel Frahm, and Uwe Jaekel. A general approach to Bayesian
portfolio optimization. Mathematical Methods of Operations Research, 70(2):337–
356, 2009.
[6] Suleyman Basak and B. Croitoru. Equilibrium mispricing in a capital market
with portfolio constraints. The Review of financial studies, 13:715–748, 2000.
[7] Suleyman Basak and Alexander Shapiro. Value-at-risk-based risk management:
Optimal policies and asset prices. The Review of financial studies, 14:371–405,
2001.
[8] V. Bawa, S. Brown, and R. Klein. Estimation Risk and Optimal Portfolio Choice.
North-Holland, Amsterdam, 1979.
[9] Dirk Bergemann and Karl Schlag. Robust monopoly pricing. Cowles Foundation,
Yale University, 2005.
[10] Dimitri P. Bertsekas. Convex Optimization Theory. Athena Scientific, 2009.
[11] S. Birch and A. Gafni. Cost effectiveness/utility analyses. Do current decision
rules lead us to where we want to be? Journal of Health Economics, 11:279–296,
1992.
[12] Dimitrios Bisias, Andrew W. Lo, and James F. Watkins. Estimating the NIH
efficient frontier. PLOS One, 2012.
[13] F. Black and R. Litterman. Global portfolio optimization. Financial Analysts
Journal, 48(5):28–43, 1992.
[14] Nick Black. Health services research: the gradual encroachment of ideas. Journal
of Health Services Research & Policy, 14:120–123, 2009.
[15] Michael Boguslavsky and Elena Boguslavskaya. Arbitrage under power. Risk,
pages 69–73, June 2004.
[16] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University
Press, Cambridge, UK, 2004.
[17] Michael J. Brennan and Eduardo S. Schwartz. Arbitrage in stock index futures.
The Journal of Business, 63, 1990.
[18] J. F. Bridges and D. D. Terris. Portfolio evaluation of health programs: A reply
to Sendi et al. Social Science & Medicine, 58:1849–1851, 2004.
[19] J. F. B. Bridges, M. Stewart, M. T. King, and K. van Gool. Adapting portfolio
theory for the evaluation of multiple investments in health with a multiplicative
extension for treatment synergies. European Journal of Health Economics, 3:47–
53, 2002.
[20] M. L. Brown, J. Lipscomb, and C. Snyder. The burden of illness of cancer:
economic cost and quality of life. Annual Review of Public Health, 22:91–113,
2001.
[21] Johannes Brumm, Felix Kubler, Michael Grill, and Karl Schmedders. Margin
regulation and volatility. ECB working paper, 1698, 2014.
[22] Carol S. Burckhardt and Kathryn L. Anderson. The quality of life scale (QOLS):
Reliability, validity, and utilization. Health and Quality of Life Outcomes, 1:60–
66, 2003.
[23] M. Buxton, S. Hanney, and T. Jones. Estimating the economic value to societies
of the impact of health research: a critical review. Bulletin of the World Health
Organization, 82(10):733–739, Oct 2004.
[24] S. Chandra. Regional economy size and the growth/instability frontier: Evidence
from Europe. Journal of Regional Science, 43(1):95–122, 2003.
[25] Joseph Chen, Harrison Hong, and Jeremy C. Stein. Breadth of ownership and
stock returns. Journal of financial economics, 66:171–205, 2002.
[26] Julius H. Comroe Jr. and Robert D. Dripps. Scientific basis for the support of
biomedical science. Science, 192(4235):105–111, 1976.
[27] C. W. Curry, A. K. De, R. M. Ikeda, and S. B. Thacker. Health burden and
funding at the Centers for Disease Control and Prevention. American Journal
of Preventive Medicine, 30(3):269–276, MAR 2006.
[28] D. M. Cutler and M. McClellan. Is technological change in medicine worth it?
Health Affairs, 20(5):11–29, Sep–Oct 2001.
[29] Darrell Duffie. Special repo rates. Journal of Finance, 51:493–526, 1996.
[30] Darrell Duffie. Dynamic Asset Pricing Theory. Princeton Series in Finance, 3
edition, 2001.
[31] R. L. Fleurence and D. J. Torgerson. Setting priorities for research. Health
Policy, 69(1):1–10, JUL 2004.
[32] Richard Freeman and John Van Reenen. What if Congress doubled R&D spending on the physical sciences? Technical Report 931, Center for Economic Performance, May 2009.
[33] John Geanakoplos. The leverage cycle. Cowles Foundation Discussion Paper,
1715R, 2009.
[34] Itzhak Gilboa and David Schmeidler. Maxmin expected utility with non-unique
prior. Journal of Mathematical Economics, 18:141–153, 1989.
[35] J. Grant. Evaluating “payback” on biomedical research from papers cited in
clinical guidelines: applied bibliometric study. BMJ, 320(7242):1107–1111, 2000.
[36] M. Grant and S. Boyd. Graph implementations for nonsmooth convex programs.
In V. Blondel, S. Boyd, and H. Kimura, editors, Recent Advances in Learning and
Control (a tribute to M. Vidyasagar), Lecture Notes in Control and Information
Sciences, pages 95–110. Springer, Berlin / Heidelberg, 2008.
[37] M. Grant and S. Boyd. CVX: Matlab software for disciplined convex programming, June 2009.
[38] C. P. Gross, G. F. Anderson, and N. R. Rowe. The relation between funding
by the National Institutes of Health and the burden of disease. New England
Journal of Medicine, 340(24):1881–1887, JUN 17 1999.
[39] Steve Hanney, Iain Frame, Jonathan Grant, Philip Green, and Martin Buxton.
From Bench to Bedside: Tracing the Payback Forwards from Basic or Early
Clinical Research A Preliminary Exercise and Proposals for a Future Study.
Health Economics Research Group, Brunel University, Uxbridge, UK, 2003.
[40] Lars Hansen, Thomas Sargent, G. Turmuhambetova, and N. Williams. Robust
control, min-max expected utility and model misspecification. Journal of Economic Theory, 128:45–90, 2006.
[41] Lars P. Hansen and Thomas Sargent. Robustness. Princeton University Press,
2008.
[42] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 2013.
[43] Ernest Istook. Research funding on major diseases is not proportionate to taxpayer’s needs. Journal of NIH Research, 9(8):26–28, 1997.
[44] D.H. Jacobson. Optimal stochastic linear systems with exponential performance
criteria and their relation to deterministic differential games. IEEE Transactions
Automatic Control, 18:124–131, 1973.
[45] S. C. Johnston and S. L. Hauser. Basic and clinical research: what is the most
appropriate weighting in a public investment portfolio? Annals of Neurology,
60(1):9A–11A, Jul 2006.
[46] S. C. Johnston, J. D. Rootenberg, S. Katrak, W. S. Smith, and J. S. Elkins.
Effect of a US national institutes of health programme of clinical trials on public
health and costs. Lancet, 367(9519):1319–1327, APR 22 2006.
[47] Philippe Jorion. Value at Risk: The New Benchmark for Managing Financial
Risk. McGraw-Hill, 2006.
[48] Jakub Jurek and Halla Yang. Dynamic portfolio selection in arbitrage. EFA
Meeting, 2006.
[49] Tong Suk Kim and Edward Omberg. Dynamic nonmyopic portfolio behavior.
The Review of Financial Studies, 9:141–161, 1996.
[50] Leonid Kogan and Raman Uppal. Risk aversion and optimal portfolio policies
in partial and general equilibrium economies. NBER working paper, 8609, 2001.
[51] Julia Lane and Stefano Bertuzzi. Measuring the results of science investments.
Science, 331:678–680, 2011.
[52] John Lintner. The aggregation of investor’s diverse judgements and preferences
in purely competitive security markets. Journal of Financial and Quantitative
Analysis, 4:347–400, 1969.
[53] Jun Liu and Francis A. Longstaff. Losing money on arbitrage: Optimal dynamic
portfolio choice in markets with arbitrage opportunities. The Review of Financial
Studies, 17, 2004.
[54] Roger Lowenstein. When genius failed: The Rise and Fall of Long-Term Capital
Management. Random House Trade Paperbacks, 2001.
[55] H. M. Markowitz. Portfolio selection. Journal of Finance, 7(1):77–91, March
1952.
[56] F. W. McFarlan. Portfolio approach to information systems. Harvard Business
Review, 59(4):142–150, 1981.
[57] M. T. McKenna, C. M. Michaud, C. J. L. Murray, and J. S. Marks. Assessing
the burden of disease in the United States using disability-adjusted life years.
American Journal of Preventive Medicine, 28(5):415–423, Jun 2005.
[58] Robert C. Merton. An analytic derivation of the efficient portfolio frontier.
Journal of Financial and Quantitative Analysis, 7:1851–1872, 1972.
[59] Robert C. Merton. Continuous-Time Finance. Wiley-Blackwell, 1992.
[60] Attilio Meucci. Review of statistical arbitrage, cointegration and multivariate
Ornstein-Uhlenbeck, 2010.
[61] Edward M. Miller. Risk, uncertainty and divergence of opinion. Journal of
Finance, 32:1151–1168, 1977.
[62] F. Mosteller. Innovation and evaluation. Science, 211(4485):881–886, 1981.
[63] Kevin Murphy and Richard Topel. Diminishing returns? The costs and benefits
of improving health. Perspectives in Biology and Medicine, 43(3):S108–S128,
2003.
[64] Rishi K. Narang. Inside the Black Box: A Simple Guide to Quantitative and
High Frequency Trading. Wiley, 2013.
[65] National Institutes of Health. Setting Research Priorities at the National Institutes of Health. National Institutes of Health, Bethesda, MD, 1997.
[66] B. J. O’Brien and M. J. Sculpher. Building uncertainty into cost-effectiveness
rankings: Portfolio risk-return tradeoffs and implications for decision rules. Medical Care, 38:460–468, 2000.
[67] M. E. Porter. What is value in health care? New England Journal of Medicine,
363(26):2477–2481, 2010.
[68] Luis Prieto and Jose A. Sacristan. Problems and solutions in calculating quality-adjusted life years (QALYs). Health and Quality of Life Outcomes, 1:80–87, 2003.
[69] S. J. Rangel, B. Efron, and R. L. Moss. Recent trends in National Institutes
of Health funding of surgical research. Annals of Surgery, 236(3):277–287, Sep
2002.
[70] R. S. Sandler, J. E. Everhart, M. Donowitz, E. Adams, K. Cronin, C. Goodman,
E. Gemmen, S. Shah, A. Avdic, and R. Rubin. The burden of selected digestive
diseases in the United States. Gastroenterology, 122(5):1500–1511, May 2002.
[71] P. Sendi, M. J. Al, A. Gafni, and S. Birch. Optimizing a portfolio of health
care programs in the presence of uncertainty and constrained resources. Social
Science & Medicine, 57:2207–2215, 2003.
[72] Pedram Sendi, Maiwenn J. Al, and Frans F. H. Rutten. Portfolio theory and
cost-effectiveness analysis: A further discussion. Value In Health, 7:595–601,
2004.
[73] William F. Sharpe. Capital asset prices: a theory of market equilibrium under
conditions of risk. Journal of Finance, 19:425–442, 1964.
[74] Andrei Shleifer and Robert W. Vishny. The limits of arbitrage. The Journal of
Finance, 52:35–55, 1997.
[75] Gilbert Strang. Computational Science and Engineering. Wellesley-Cambridge
Press, 2007.
[76] H. Varmus. Evaluating the burden of disease and spending the research dollars of the National Institutes of Health. New England Journal of Medicine,
340(24):1914–1915, Jun 17, 1999.
[77] W. H. Fleming and P. E. Souganidis. On the existence of value functions of two-player zero-sum stochastic differential games. Indiana University Mathematics
Journal, 38:293–314, 1989.
[78] P. Whittle. Risk sensitive linear quadratic Gaussian control. Advances in Applied
Probability, 13:776–777, 1981.
[79] P. Whittle. Risk sensitive optimal control. Wiley, 1990.
[80] P. Whittle. Optimal control: basics and beyond. Wiley, 1996.
[81] Jean-Pierre Zigrand and Jon Danielsson. What happens when you regulate risk?:
evidence from a simple equilibrium model. LSE Research Online Documents on
Economics, London School of Economics and Political Science, LSE Library, 2001.