A Game Theoretic Approach to
Robust Option Pricing
Peter DeMarzo, Stanford
Ilan Kremer, Stanford
Yishay Mansour, TAU
Efficient markets,
Option Pricing,
Universal Portfolios:
Cover et al.
Game theory: Calibration, regret matching
Example 1: Approachability
You repeatedly predict the outcome of a coin toss. The coin need not be a fair
coin. What success rate can you guarantee in the long run?
Simple strategy: Predict Heads with probability 50% and obtain 50% success
Learning Strategy: At each point in time guess heads if there were more
heads in the past, otherwise guess tails. In the limit the success rate is
max{p,1-p} where p is the fixed parameter of the coin.
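The learning strategy above can be simulated for a fixed coin. This is a minimal sketch: the function name, the seed, and the choice p = 0.7 are illustrative assumptions, not part of the talk.

```python
import random

def follow_the_majority(p_heads, n_tosses, seed=0):
    """Success rate of guessing whichever side has appeared more often so far."""
    rng = random.Random(seed)
    heads = tails = correct = 0
    for _ in range(n_tosses):
        guess_heads = heads > tails            # more heads in the past: guess heads
        toss_heads = rng.random() < p_heads    # the bias is unknown to the guesser
        correct += guess_heads == toss_heads
        heads += toss_heads
        tails += not toss_heads
    return correct / n_tosses

# For a fixed p the long-run success rate should be close to max{p, 1-p}.
print(follow_the_majority(p_heads=0.7, n_tosses=20_000))
```

The same function with p_heads=0.3 should also land near 0.7, since the strategy then settles on guessing tails.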
Q: What if the coin can change arbitrarily from period to period?
A: You can still get an equivalent performance!!
$\lim_{n\to\infty} \left[ \{\text{realized success rate}\} - \max\{p_n^*,\, 1 - p_n^*\} \right] \ge 0 \quad \text{a.s.}$
Example 2: Competitive
Analysis (Regret Minimization)
When you go gambling each minute you choose which slot machine to
use. There are N different machines, some machines may be better.
You see the payoff of machines even if you did not use them.
Goal: In the long run obtain an average payoff that is no worse than the
best individual machine ex-post. (No Regret)
Q: What if payoffs are not stationary (so a machine which has a high
payoff may deteriorate over time)?
Introduced by Blackwell and Hannan in the
late 50s, rediscovered and used in:
• Computer Science: online algorithms
• Statistics and Information Theory
• Game Theory: calibration, dynamic foundation
for Nash/correlated equilibrium
• With the exception of work on ‘Universal
Portfolios’ not incorporated into finance.
Regret Minimization
• Dynamic optimization under uncertainty
without a prior.
• Worst case analysis but specified in
relative rather than absolute terms (as in
Gilboa and Schmeidler)
• Minimizing regret can be expressed as a robust upper
bound for option pricing.
- Describe trading strategies that are based on approachability and the
bounds/regret they imply for call options with different strike prices.
• The optimal robust upper bound can be expressed as the
value of a zero-sum game.
- Provide a numerical solution and a conjecture about the closed form
- Bounds are not wide and resemble empirical patterns.
Options: Basics
• What is an option:
A right to buy a stock at a given price
- strike price = K
At a given date (European)
- duration = T
• Option payoff: max{0, S_T - K}
S_t = stock price at time t
• This talk:
For simplicity:
- zero interest rate
- no dividends
Option Pricing: Arbitrage Bounds
• Upper Bound [Merton 1973]
Current stock price: S1
• Lower bounds [Merton 1973]
Always positive
- At worst, payoff is zero.
Stock versus strike price: S_1 - K
• Claim these bounds are tight
• Proof: Assume a huge change in first period…
• Better pricing needs more assumptions!
Example I: Binomial model
• Suppose the risky asset can take only two values next period.
[Figure: one-period binomial tree for the stock and for a call with K=1.]
Option price is 0.5 - 0.4 = 0.1
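The arithmetic 0.5 - 0.4 = 0.1 is what one-period replication gives if the stock starts at 1 and moves to 1.2 or 0.8. Those tree values are an assumption chosen to reproduce the slide's numbers, since the original figure did not survive, and the function name is hypothetical.

```python
def one_period_call(s0, s_up, s_down, strike):
    """Price a call by replication in a one-period, zero-interest binomial model."""
    pay_up = max(0.0, s_up - strike)
    pay_down = max(0.0, s_down - strike)
    delta = (pay_up - pay_down) / (s_up - s_down)   # shares of stock held
    bond = pay_down - delta * s_down                # cash position (negative = borrowing)
    return delta, bond, delta * s0 + bond

# Assumed tree: 1 -> 1.2 or 0.8, call struck at K = 1.
delta, bond, price = one_period_call(s0=1.0, s_up=1.2, s_down=0.8, strike=1.0)
print(delta, bond, price)   # roughly 0.5 shares, 0.4 borrowed, price 0.1
```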
Example II: Black & Scholes
• Extend the tree to many periods
The limit is continuous time
• Black and Scholes
continuous prices and complete markets
 A specific stochastic model: random walk + drift
Regret: There is a given strategy and a set of alternative
strategies. Regret is defined as the difference/ratio
between the performance of the given strategy and the
ex-post optimal strategy among the alternatives.
Regret guarantee: A lower bound for regret that holds
even in the worst-case scenario. This guarantee may be
conditional on some restricted set of possible scenarios.
For the purpose of this talk we ignore any behavioral
aspects. We do not argue that people behave according
to our measure of regret or that they should behave in
this way!!! We consider a specific regret measure to
allow us to derive pricing bounds and compare them to
the existing literature.
Regret and Financial markets
I have $100 which equals the price of IBM.
Should I buy one share of IBM or get a risk free asset?
Alternatives: IBM, risk free asset
Ex-post, compare
Max{IBM, risk free asset}
Loss: ratio
Linking Regret to Options
Note that holding Treasuries plus an at-the-money call option on IBM leads to no regret:
Payoff = Max{IBM, risk-free asset}
Thus, regret minimizing trading strategies
have implications for option values.
Regret and option pricing: An example
• Suppose we measure regret by looking at the ratio of our
performance to the best asset ex-post. In addition,
suppose that the current IBM share price is $100 and the
risk free interest rate is zero. Your goal is to minimize
regret as compared to the best asset ex-post.
• Suppose we have a trading strategy such that if we start
with $100 then at time T our payoff will always exceed
max{80, 0.8 S_T}, where S_T denotes the IBM share price at
time T. Hence, the regret guarantee is 20%.
• We later describe how one can construct such strategies
and now focus only on the implications for option pricing.
• By scaling we conclude that starting with
$125 our strategy would have a payoff that
exceeds max{100, S_T}.
• max{100, S_T} is like $100 plus a call option
with strike $100
• the value of the option is bounded by
$125 - $100 = $25
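The scaling step can be written as a one-line calculation. This is a minimal sketch: the function name is hypothetical and `guarantee` stands for the 0.8 factor in max{80, 0.8 S_T}.

```python
def call_bound_from_guarantee(s0, guarantee):
    """Upper bound on an at-the-money call implied by payoff >= guarantee * max{s0, S_T}."""
    capital_needed = s0 / guarantee   # scale up so the payoff dominates max{s0, S_T}
    return capital_needed - s0        # max{s0, S_T} = s0 + call struck at s0

print(call_bound_from_guarantee(s0=100.0, guarantee=0.8))
```

A better regret guarantee (a factor closer to 1) tightens the option bound toward zero.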
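• Discrete-time finite-horizon model, $t = 1, \dots, T$
• A risky asset whose value at time $t$ is $S_t$, where
$S_t = (1 + r_t) S_{t-1}$ and $r_t \ge -1$.
• In addition, agents can borrow and lend at zero interest
• Restriction on price paths: $(r_1, \dots, r_T) \in \Psi \subset \mathbb{R}^T$, where $\Psi$ is compact
and $0 \in \Psi$
$\Psi = \{ r \mid q(r) \le \hat{q} \}$, where $q^2(r)$ is the quadratic variation of the path, summed over periods $j = 1, \dots, T$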
Model (cont'd)
• A dynamic trading strategy has initial value $G_0 = c$. At time
$t$ it invests a fraction $x_t$ in the risky asset and $1 - x_t$ in the
risk-free asset. The zero risk-free rate implies that $G_{t+1} = G_t (1 + x_t r_{t+1})$.
• Definition: We say that $c = C(K)$ is an upper bound if
there exists a dynamic trading strategy that starts with initial capital $c$
and, for all possible price paths in the allowed set $\Psi$, its final payoff $G_T$
satisfies $G_T \ge \max\{0, S_T - K\}$ (super-replication).
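The definition can be checked on a single path with a few lines of code. This is only a sketch: a real upper bound must hold on every allowed path, the fractions are passed in as a fixed list rather than chosen adaptively, and all names are hypothetical.

```python
def final_wealth_and_price(c, fractions, returns, s0=1.0):
    """Run G_{t+1} = G_t (1 + x_t r_{t+1}) along one path; return (G_T, S_T)."""
    wealth, price = c, s0
    for x, r in zip(fractions, returns):
        wealth *= 1.0 + x * r    # fraction x in the stock, the rest at zero interest
        price *= 1.0 + r
    return wealth, price

def super_replicates(c, fractions, returns, strike, s0=1.0):
    """True if the strategy's final payoff covers the call payoff on this path."""
    wealth, price = final_wealth_and_price(c, fractions, returns, s0)
    return wealth >= max(0.0, price - strike)

path = [0.10, -0.25, 0.40]   # an arbitrary illustrative path
# Holding one share (c = S_0, x_t = 1) covers any call: the arbitrage upper bound.
print(super_replicates(c=1.0, fractions=[1.0] * 3, returns=path, strike=1.0))
# Starting with nothing fails whenever the option ends in the money.
print(super_replicates(c=0.0, fractions=[0.0] * 3, returns=path, strike=1.0))
```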
Blackwell (recall Example #2)
• You repeatedly choose a single action among $\{1, \dots, I\}$
possible alternatives; $\pi_{j,i}$ denotes the payoff of alternative
$i$ at time $j$.
• You can use a randomized strategy, described by a
random variable $\gamma_j$; $\gamma_j = i$ means that you choose
alternative $i$ at time $j$; your time-$j$ payoff is then $\pi_{j,\gamma_j}$
[Figure: the regret so far plotted as a point, with axes "Regret vs. machine #1" and "Regret vs. machine #2"; arrows labelled "Play 1" and "Play 2" show the period n+1 expected regret if machine #1 pays more and if machine #2 pays more.]
Choose the two alternatives with probability proportional to current regret.
$E\left(|A_{n+1}|^2 \mid \pi_{n+1}\right) \le |A_n|^2 + (\pi_{n+1,2} - \pi_{n+1,1})^2$, where $A_n$ is the regret vector after $n$ periods
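The rule "choose the two alternatives with probability proportional to current regret" can be sketched for two machines as follows. The payoff sequence, the seed, and the uniform choice when no regret is positive are illustrative assumptions.

```python
import random

def regret_matching(payoffs, seed=0):
    """payoffs: list of (pay_1, pay_2) per period. Returns (realized total, regrets)."""
    rng = random.Random(seed)
    regret = [0.0, 0.0]   # regret[i] = machine i's total minus our realized total
    total = 0.0
    for pay in payoffs:
        pos = [max(r, 0.0) for r in regret]
        s = pos[0] + pos[1]
        # Probability proportional to positive regret; uniform if there is none.
        p_first = pos[0] / s if s > 0 else 0.5
        choice = 0 if rng.random() < p_first else 1
        got = pay[choice]
        total += got
        regret[0] += pay[0] - got
        regret[1] += pay[1] - got
    return total, regret

n = 10_000
# Non-stationary payoffs: machine 1 is better early on, machine 2 later.
payoffs = [(0.8, 0.2) if j < n // 3 else (0.1, 0.6) for j in range(n)]
total, regret = regret_matching(payoffs)
print(max(regret) / n)   # average regret against the best single machine
```

The printed average regret should be small even though the better machine changes partway through.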
Finite horizon properties
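Proposition: Conditional on the set of realized payoffs $\pi$:
$E\left(|A_n|^2\right) \le \sum_{j=1}^{n} (\pi_{j,1} - \pi_{j,2})^2$
Corollary: Conditional on the set of realized payoffs $\pi$:
$E\left(\max_i A_{n,i}\right) = E\left[\max_i \Big\{ \sum_{j=1}^{n} \pi_{j,i} \Big\} - \sum_{j=1}^{n} \pi_{j,\gamma_j}\right] \le \sqrt{\sum_{j=1}^{n} (\pi_{j,1} - \pi_{j,2})^2}$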
Asymptotic No-Regret
Theorem (Hannan & Blackwell): If payoffs are uniformly
bounded, then there exists a randomized strategy such that:
$\lim_{n\to\infty} \left[ \frac{1}{n} \sum_{j=1}^{n} \pi_{j,\gamma_j} - \max_i \Big\{ \frac{1}{n} \sum_{j=1}^{n} \pi_{j,i} \Big\} \right] \ge 0 \quad \text{a.s.}$
Arbitrary starting point
Consider a variant of the previous strategy where,
instead of starting at (0,0), we start at an arbitrary
point (-x,-y) for some non-negative x, y.
Corollary: Conditional on the set of realized payoffs $\pi$:
$E\left(|A_n^{x,y}|^2\right) \le q^2(r) + x^2 + y^2$
Useful in improving performance and in the
application for different strike prices
A trading strategy
• Multiplicative model versus additive model: Let $\pi_{t,0} = 0$, $\pi_{t,1} = \ln(1 + r_t)$.
• Remove randomness: Invest at time $t$ a fraction $x_t = E(\gamma_t)$ in the risky asset.
Proposition: The payoff of a trading strategy based on the
generalized strategy satisfies:
$G_n \ge \max\{\alpha(r),\, \beta(r) S_n\}$, where
$\alpha(r) = \exp\left(x - \sqrt{q^2(r) + x^2 + y^2}\right)$, and
$\beta(r) = \exp\left(y - \sqrt{q^2(r) + x^2 + y^2}\right)$
Application- Upper bounds for at the money
options (K=1)
Using the same logic as the IBM example, if
$G_n \ge \max\{\alpha,\, \beta S_n\}$
then the bound is $1/\min\{\alpha, \beta\} - 1$.
So we can choose x=y
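The regret-based trading strategy and its payoff guarantee can be sketched as below. Several points are assumptions of this sketch rather than statements from the talk: regret is stored as (benchmark minus strategy), so the starting point appears as positive offsets `x0`, `y0`; the quadratic variation is taken as the sum of squared log returns; and the fraction defaults to 0.5 when no regret is positive. All names are hypothetical.

```python
import math
import random

def regret_trading(returns, x0=0.0, y0=0.0):
    """Invest the regret-matching probability of 'stock' as a deterministic fraction."""
    r_cash, r_stock = x0, y0            # regret vs. cash and vs. stock, with offsets
    wealth, stock, q2 = 1.0, 1.0, 0.0
    for r in returns:
        pos_cash, pos_stock = max(r_cash, 0.0), max(r_stock, 0.0)
        s = pos_cash + pos_stock
        frac = pos_stock / s if s > 0 else 0.5
        log_ret = math.log1p(r)         # additive payoff of the stock; cash pays 0
        wealth *= 1.0 + frac * r
        stock *= 1.0 + r
        expected = frac * log_ret       # expected additive payoff of the randomized rule
        r_cash += 0.0 - expected
        r_stock += log_ret - expected
        q2 += log_ret ** 2
    root = math.sqrt(q2 + x0 ** 2 + y0 ** 2)
    alpha = math.exp(x0 - root)         # guarantee relative to holding cash
    beta = math.exp(y0 - root)          # guarantee relative to holding the stock
    return wealth, stock, alpha, beta

rng = random.Random(0)
path = [rng.uniform(-0.05, 0.05) for _ in range(100)]
wealth, stock, alpha, beta = regret_trading(path, x0=0.2, y0=0.2)
print(wealth >= max(alpha, beta * stock))
print(1.0 / min(alpha, beta) - 1.0)     # implied at-the-money bound for this path's q
```

With x0 = y0 the two coefficients coincide, which is the choice x = y suggested for the at-the-money case.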
Application- Upper bounds for at the money
options (K=1)
Restricting price paths: $\Psi = \{ r \mid q(r) \le \hat{q} \}$
Using the expressions we derived before, we can get an upper bound on
the regret. Using the same logic as the IBM example, one gets:
Using the basic trading strategy:
$\exp[\hat{q}] - 1$
Using a generalized strategy with optimal starting point (that depends on
$\hat{q}$):
$\exp[\hat{q}/\sqrt{2}] - 1$
• Choose a starting point where $x = y + \log(k)$
• That implies:
$G_n \ge \max\{\alpha(r),\, \beta(r) S_n\}$ where $\alpha(r)/\beta(r) = k$
• It also implies a bound of $1/\beta - k$ for the value of
an option with strike $k$
- Borrow $k$ and invest $1/\beta$ in the trading strategy.
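The step from the ratio condition to the strike-k bound can be spelled out as follows. The symbols $\alpha$, $\beta$ for the two guarantee coefficients and $\hat{q}$ for the path restriction are assumed notation, and $\beta$ is evaluated at the worst case $q(r) = \hat{q}$.

```latex
\begin{align*}
G_n &\ge \max\{\alpha(r),\, \beta(r) S_n\}
  && \text{(strategy started with 1)} \\
\frac{G_n}{\beta(r)} &\ge \max\Big\{\frac{\alpha(r)}{\beta(r)},\, S_n\Big\}
  = \max\{k,\, S_n\} = k + \max\{0,\, S_n - k\}
  && \text{(since } \alpha(r)/\beta(r) = e^{x-y} = k\text{)} \\
C(k) &\le \frac{1}{\beta} - k,
  \qquad \beta = \exp\!\Big(y - \sqrt{\hat{q}^2 + x^2 + y^2}\Big)
  && \text{(invest } 1/\beta\text{, borrow } k\text{)}
\end{align*}
```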
Optimal bound
• Let $V(S, q^2, n)$ denote the optimal (lowest) upper
bound for a call option with strike $K = 1$ when the
current price is $S$. This is equivalent to having
$S = 1$ and arbitrary $K$. The restriction on the price
paths is again:
$\Psi = \{ r \mid q(r) \le q \}$
• Let $V^*(S, q^2)$ denote the limit as $n$ goes to infinity.
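Dynamic Programming
$V(S, q^2, n) = \min_{\Delta, B} \; \Delta S + B$
subject to $\Delta S e^{r} + B \ge V\left(S e^{r},\, q^2 - r^2,\, n - 1\right)$
for all $r$ such that $r^2 \le q^2$
$V^*(S, q^2) = \lim_{n\to\infty} V(S, q^2, n)$
$V^*(S, q^2) = \begin{cases} \dfrac{q^+ q^-}{q^+ + q^-}\, S^{1/q^-} & \text{for } S \le 1 \\[1ex] \dfrac{q^+ q^-}{q^+ + q^-}\, S^{-1/q^+} + S - 1 & \text{for } S \ge 1 \end{cases}$
where $q^+ = e^{q} - 1$, $q^- = 1 - e^{-q}$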
Consider small q
• Original strategy: $\exp[\hat{q}] - 1 \approx \hat{q}$
• Optimal starting point: $\exp[\hat{q}/\sqrt{2}] - 1 \approx \hat{q}/\sqrt{2}$
• Optimal strategy: $\approx \hat{q}/2$
• Black-Scholes: $\approx \hat{q}/\sqrt{2\pi}$
Example: 20% (vs. Black-Scholes)
[Figure: curves for the Gradient Strategy and the Optimal q-Bound; vertical axis: Black-Scholes Implied Volatility of Price Bound; horizontal axis: Strike Price K.]
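The at-the-money bounds can be compared numerically. This sketch assumes that the 20% in the example is the path restriction (q = 0.20), that the Black-Scholes benchmark uses sigma * sqrt(T) = q with zero interest, and that the optimal bound at S = K = 1 is q+ q- / (q+ + q-) as in the closed form; the function names are hypothetical.

```python
import math

def basic_bound(q):
    """Regret strategy started at the origin: exp(q) - 1."""
    return math.exp(q) - 1.0

def shifted_start_bound(q):
    """Regret strategy with the optimal starting point: exp(q / sqrt(2)) - 1."""
    return math.exp(q / math.sqrt(2.0)) - 1.0

def optimal_bound(q):
    """Closed form at S = K = 1, with q+ = e^q - 1 and q- = 1 - e^-q."""
    q_plus, q_minus = math.exp(q) - 1.0, 1.0 - math.exp(-q)
    return q_plus * q_minus / (q_plus + q_minus)

def black_scholes_atm(q):
    """Black-Scholes call with S = K = 1, zero rate, sigma * sqrt(T) = q."""
    normal_cdf = 0.5 * (1.0 + math.erf((q / 2.0) / math.sqrt(2.0)))
    return 2.0 * normal_cdf - 1.0

q = 0.20
for name, bound in [("basic strategy", basic_bound),
                    ("optimal starting point", shifted_start_bound),
                    ("optimal bound", optimal_bound),
                    ("Black-Scholes", black_scholes_atm)]:
    print(f"{name:24s}{bound(q):.4f}")
```

The values should decrease down the list, with the optimal bound near q/2 and Black-Scholes near q/sqrt(2*pi), in line with the small-q approximations.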
Approaches to option pricing
Black and Scholes:
Continuous price paths
Constant volatility (quadratic variation)
⇒ Exact replication and pricing
No probability or preference assumptions
Strong assumption on allowable paths
With jumps & stochastic volatility, exact pricing requires:
(i) A probability distribution P over price paths
(ii) A utility function, U
Our Approach:
No probability or preference assumptions
Constraints on the set of price paths (support of P)
Super-replication ⇒ Upper Bound for Option Price
Relation to Universal Portfolios:
• Cover and Ordentlich (1998)- consider the set of
constant-rebalanced portfolios.
- Provide a closed-form tight universal regret bound (min-max). This provides a bound for the value of the
derivative that pays ex-post the optimal constant-rebalanced portfolio.
- Universal means that we consider all possible scenarios
Call (or put) options
- The relevant set of benchmarks is much simpler – buy
and hold strategies.
- The min-max value is 0.5 (trivial to prove). From an option
pricing perspective this yields an upper bound of the current
stock price.
- Hence, we need to consider a ‘less universal’ approach, and
the Blackwell strategy is useful.
Early empirical work (joint work
with Tyler Shumway)
S&P 500 options prices from 1/96 to 4/05
from OptionMetrics.
Options with 15 to 45 calendar days to