Presentation of commodity futures software for conferences

Enterprise Risk Management Applications in High-Frequency Commodity Futures
Trading
Asymptotic Investors
Contents
Motivation
Quantitative Risk Metrics
  Application of Risk Metrics to Commodities Trading
    Spot to Futures Mapping
    Last Trade Date Lookup
  Performance Measurement Methodology
    Sharpe Ratio Subroutine
    Information Ratio Subroutine
    Sortino Ratio Subroutine
Market Risk
Credit Risk
Profit and Loss Simulation Results
Software Architecture
  External Data Sources
  Graphical User Interface
  Exchange Web Service
Significant Algorithms
  High-Frequency Trading
Financial Simulation
Conclusions
About the Author
Bibliography
Motivation
Risk managers have the objective of minimizing the economic capital and loss exposure while facilitating
trading profits. In short, the goal is to maximize the risk-adjusted return on capital (RAROC).
Commodity futures can prove to be volatile, exposing the firm to considerable market risk. Recent daily
volatility in Brent Crude Oil is at thirty-two percent (annualized), and gold is at about nineteen percent.
This is far higher than the six percent volatility of treasury bonds. High-frequency trading subroutines
permit rapid opening and closing of positions, allowing short-term arbitrage opportunities to be
capitalized on without incurring the danger of a dramatic price move while holding the position for a
long time. A machine-learning-based price inference engine allows statistical models which maximize
the probability (given the model assumptions) of a positive return to be leveraged in selecting contracts
to trade. Margin computation allows the firm’s positions to be marked to market and the margin
balance visualized in internal systems, minimizing the risk of a capital shortfall. Lastly, a new web-service-accessible exchange enables the firm to act as an intermediary and earn trading commissions
without taking a proprietary position (while incurring the credit risk of a default by either side of a trade).
Quantitative Risk Metrics
The three types of risk commonly measured for financial assets are market risk, credit risk, and
operational risk. This fund and software application endeavors to minimize market and credit risks, for
which a variety of metrics exist. For market risk, the most common indicators are value-at-risk (VaR)
and volatility. These represent the worst-case or average-case movements in portfolio value as a result
of a change in market values (in this case, the futures prices of listed commodity futures contracts).
Credit risk indicates the likelihood that a transaction counterparty will default. It is often represented
by credit rating, value-at-risk, or expected default frequency (EDF). In addition, exposure (the amount of
money which would be lost upon default of a counterparty) is used to weight the probability of default.
Application of Risk Metrics to Commodities Trading
Risk management can be performed on all three areas of this undertaking: the hedge fund, the trading
system service, and the commodity futures exchange. The hedge fund and trading system will be
assessed similarly. The price-modelling module will be evaluated by assuming a holding period until the
maturity of the contract. The model’s price (Naïve Bayes or Logistic Regression) will be contrasted with
the current futures price to infer whether a short or long position should be established. Then the
current futures price is contrasted with the actual futures price at maturity, obtained from historical market data
(MRCI), and the profit or loss of the prospective investment is calculated. Note that the spot price
of all commodities but oil is obtained from the Wall Street Journal’s cash prices site and that the Brent
crude price is downloaded from the US Energy Information Administration (EIA). The EIA price
seems to be subject to a one-week delay, as the government releases statistics every Wednesday.
There may be difficulty in obtaining the sugar price from the Wall Street Journal site, as it has been left
blank in recent days (although the previous year’s prices are published). A disclaimer might need to be
issued that simulation dates should be at least thirty days prior to the current date in order to ensure
that prices are available. All quotes are delayed by at least one day (i.e. the previous day’s close is
used). Note also that the contract size (delivery amount for spot) and quotation unit (i.e. dollars or
cents) may differ between the spot pricing source (WSJ or EIA) and the futures pricing source (CME or
ICE). Historical prices are available on both the Wall Street Journal and EIA sites via separate links, so
sufficient data (one year) for testing should be accessible. Only the historical prices will be required for
testing, so futures prices (both current and maturity values) will be obtained from MRCI. Note that the
only difference between the current and at-maturity futures prices will be the URL string, which will
contain the date of pricing. However, the last trade date (and last publication of price) differs based on
the commodity, so the calculation of which date to use for the “maturity futures price” is non-trivial.
For gold, prices are available up to the 28th of a 31-day month. However, for soybeans, the 13th was the
final day for which pricing of the November contract was available in that month. Price
normalization logic must be instituted when modeling the futures price based on the spot
price for underlying assets where the units differ. Absolute profits can be converted to returns by
dividing the net cash flows into the margin account after the initial investment by the initial margin posted.
$r = (E_{cf} - I_{cf}) / IM$, where $E_{cf}$ is the amount paid by the exchange, $I_{cf}$ is the investor’s cash flow paid in to fund
margin calls, and $IM$ is the initial margin amount.
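As a worked illustration of the return formula, a minimal Python sketch follows. The function name and the dollar amounts are hypothetical and were chosen only to show the arithmetic; they do not come from the simulation.

```python
def margin_return(exchange_cash_flow, investor_cash_flow, initial_margin):
    """Return r = (Ecf - Icf) / IM as a fraction of the initial margin."""
    if initial_margin <= 0:
        raise ValueError("initial margin must be positive")
    return (exchange_cash_flow - investor_cash_flow) / initial_margin


# Hypothetical trade: 10,000 posted as initial margin, 2,000 paid in to
# fund margin calls, and 5,000 paid out by the exchange.
r = margin_return(exchange_cash_flow=5_000,
                  investor_cash_flow=2_000,
                  initial_margin=10_000)
print(f"return on initial margin: {r:.1%}")
```

Expressing the result as a fraction of the initial margin keeps trades of different sizes comparable when they are later fed into the performance ratios.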
Spot to Futures Mapping
Note that the spot and futures underlying do not always match exactly in nomenclature (i.e. what is
called Chicago Soft Red Winter Wheat on the CME is referred to as “Wheat, No. 2 soft red, St. Louis” by
the Wall Street Journal). However, it is assumed that the prices are comparable (i.e. futures equals spot
on the delivery date). Note that the URL for the spot price lookup points to historical spot prices for the
simulated trade date.
Futures (Asymptotic)    Spot (WSJ or EIA)
Gold                    London p.m. fixing
Sugar                   Raw sugar FOB, $/metric ton-K
Wheat                   Wheat, No. 2 soft red, St.Louis, bushel-BP,U
Soybeans                Soybeans, No. 1 yellow Illinois, bu-BP,U
Corn                    Corn, No. 2 yellow. Cent. Ill. bu-BP,U
Brent Oil               Brent - Europe
Last Trade Date Lookup
Note that the table below is premised on the definition of “business day” in either the US or the UK.
Business days are inferred by discounting both weekends and weekdays which are deemed “holidays”
based on their presence in a country-specific holiday calendar (either text file or database table). Note
that early close dates are counted as business days. A load script was written to populate a “holidays”
table, and a test was run using products listed both in the US (soybeans, gold) and the UK (Brent oil).
Then, discounting weekends and holidays, the last trade date was found and used as the current date to
download the maturity price. The exchange-mandated last trade date for each product is below. Note
the idiosyncratic nature of Brent Oil’s last trade date, which is bifurcated based on the maturity month
and year.
Commodity    Exchange    Date (contract month)
Gold         CME         Third-to-last business day
Soybeans     CME         Business day prior to 15th day
Wheat        CME         Business day prior to 15th day
Corn         CME         Business day prior to 15th day
Brent Oil    ICE         Up to 2/2016, 15th calendar day prior to first day of calendar month
                         or nearest business day prior. After 3/2016, last business day of month
                         two months prior to contract month
Sugar        ICE         Last business day of preceding month
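The business-day rules in the table can be sketched in Python as follows. This is a minimal illustration, not the production lookup: the function names are hypothetical, the holiday calendar is passed in as a plain set of dates rather than read from the “holidays” table, and Brent Oil is omitted because its rule depends on the contract date.

```python
from datetime import date, timedelta


def is_business_day(day, holidays):
    """A weekday that is absent from the holiday set is a business day."""
    return day.weekday() < 5 and day not in holidays


def previous_business_day(day, holidays):
    """Latest business day strictly before `day`."""
    day -= timedelta(days=1)
    while not is_business_day(day, holidays):
        day -= timedelta(days=1)
    return day


def first_of_next_month(year, month):
    return date(year + month // 12, month % 12 + 1, 1)


def last_trade_date(commodity, year, month, holidays=frozenset()):
    """Last trade date for a contract month, following the table's rules."""
    if commodity == "gold":
        # Third-to-last business day of the contract month.
        day = first_of_next_month(year, month)
        for _ in range(3):
            day = previous_business_day(day, holidays)
        return day
    if commodity in ("soybeans", "wheat", "corn"):
        # Business day prior to the 15th day of the contract month.
        return previous_business_day(date(year, month, 15), holidays)
    if commodity == "sugar":
        # Last business day of the month preceding the contract month.
        return previous_business_day(date(year, month, 1), holidays)
    raise ValueError(f"no rule sketched for {commodity!r}")


print(last_trade_date("soybeans", 2014, 7))
print(last_trade_date("gold", 2014, 8))
```

Passing the holiday set as an argument lets the same routine serve both the US and UK calendars.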
Performance Measurement Methodology
For the evaluation of the fund and software, volatility will be computed and used to derive risk-adjusted
performance metrics. For market risk evaluation, the Sharpe, Information, and Sortino ratios can be
utilized. The Sharpe Ratio (Sharpe, 1966) measures the fund’s expected return above the risk-free rate
against the returns’ expected dispersion around the mean. The Information Ratio indicates the alpha
return of the investment versus the expected dispersion of the return around the market return. Note
that the Information Ratio computation requires some definition of a benchmark “market” return to
derive alpha from.
The Sortino Ratio (Sortino and Forsey, 1996) was devised in 1983 by Brian Rom and utilizes a
“target return” as the basis for the numerator and denominator. The numerator is the expected
return above the target, and the denominator is the negative semi-standard-deviation, which is the
dispersion below the target of those returns less than the target.
$$S = \frac{E(r) - r_\theta}{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\max(r_\theta - r_i, 0)^2}}$$
Here $n$ is the number of returns, $r_\theta$ is the target return, $E(r)$ is the expected value of the investment
return, and $r_i$ is the $i$th investment return.
Note that the benchmark return for the Information Ratio and the target return for the Sortino Ratio
(the risk-free rate for Sharpe is discussed below) should be obtained from a market data provider by
querying the S&P return, the GSCI Commodities Index returns, or some other basis for comparison.
Credit risk, which is incurred only in the application of the futures exchange, can most easily be
measured by credit VaR, which is the worst-case expected loss at a given confidence level (the
probability of a greater loss equals one minus the confidence level, so a 95% confidence means that the
probability of a greater loss is 5%). The performance metric utilized for the exchange will be risk-adjusted return on capital (RAROC). The capital will be computed via the 90% VaR, and the ratio will be
as follows.
RAROC = E(income)/capital
Note that both the numerator and the denominator are per-investment averages. Instead of utilizing
dollar figures, percent return and percent loss are employed. A RAROC value of one would indicate that,
for every dollar held to ensure solvency in the event of an unusually large loss, one dollar of profit is
earned.
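A minimal Python sketch of the RAROC computation follows. It assumes an empirical (historical-simulation) VaR, taken as the loss that is not exceeded in 90% of the observed cases; the text does not specify how the VaR is estimated, so this choice, the function names, and the sample returns are all illustrative.

```python
import math


def value_at_risk(returns, confidence=0.90):
    """Empirical VaR: the loss not exceeded in `confidence` of the cases."""
    losses = sorted(-r for r in returns)  # positive values are losses
    # round() guards against floating-point noise before taking the ceiling.
    index = math.ceil(round(confidence * len(losses), 9)) - 1
    return losses[max(index, 0)]


def raroc(returns, confidence=0.90):
    """Expected per-investment return divided by VaR-based capital."""
    capital = value_at_risk(returns, confidence)
    if capital <= 0:
        raise ValueError("VaR is not positive; RAROC is undefined")
    expected_income = sum(returns) / len(returns)
    return expected_income / capital


# Hypothetical per-investment percent returns, expressed as fractions.
sample = [0.10, 0.05, -0.02, 0.08, -0.04, 0.03, 0.06, -0.01, 0.07, 0.02]
print(f"90% VaR: {value_at_risk(sample):.2%}, RAROC: {raroc(sample):.2f}")
```

Because both inputs are percentages of the amount invested, the ratio is independent of position size.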
Sharpe Ratio Subroutine
The Sharpe Ratio is written as:
S = (mu-r)/sigma
“mu” is the expected return of the investment, “r” is the risk-free rate, and “sigma” is the volatility of
the investment. One point of ambiguity is “r”, which could be the US Treasury Bond yield, LIBOR, or
some other risk-free metric. For now, “r” will be hard-coded to be 3.0%, but in the future this should be
obtained via an HTTP request to Bloomberg, the US Treasury department, or some other source of fixed
income information. “mu” and “sigma” will be computed based on the returns found from the
profit/loss computation on each input contract listed in the test cases file (i.e. empirical mean and
variance).
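The subroutine can be sketched in Python as below. The function names are hypothetical, and the use of the population variance (dividing by n) is an assumption, since the text only says “empirical mean and variance”. The usage lines feed in the mean and variance printed in the Profit and Loss Simulation Results section, which yields a ratio of about 0.163 with the hard-coded 3.0% rate.

```python
import math


def sharpe_from_moments(mu, variance, risk_free=0.03):
    """S = (mu - r) / sigma, with sigma the square root of the variance."""
    return (mu - risk_free) / math.sqrt(variance)


def sharpe_ratio(returns, risk_free=0.03):
    """Sharpe ratio from a list of per-trade returns."""
    n = len(returns)
    mu = sum(returns) / n
    # Population variance (divide by n); an assumption, see the lead-in.
    variance = sum((r - mu) ** 2 for r in returns) / n
    return sharpe_from_moments(mu, variance, risk_free)


# Mean and variance as printed by the pricing test.
print(sharpe_from_moments(0.0431989906926242, 0.00652900688709723))
```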
Information Ratio Subroutine
The information ratio is computed as follows.
$$IR = \frac{E(r) - r_b}{\sqrt{\frac{1}{n}\sum_{i=1}^{n}(r_i - r_b)^2}}$$
The numerator is similar to the Sharpe ratio’s, with the risk-free rate replaced by the benchmark rate, $r_b$,
and the denominator replaces the mean used in computing the ordinary variance
with the benchmark return, yielding the variance of the alpha. The benchmark return is hard-coded to
be 5% for now.
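A minimal Python sketch of this computation follows; the function name and the sample returns are hypothetical, and the 5% default mirrors the hard-coded benchmark mentioned above.

```python
import math


def information_ratio(returns, benchmark=0.05):
    """(E(r) - rb) divided by the root-mean-square deviation from rb."""
    n = len(returns)
    expected = sum(returns) / n
    # Dispersion is measured around the benchmark, not around the mean.
    alpha_variance = sum((r - benchmark) ** 2 for r in returns) / n
    if alpha_variance == 0:
        raise ValueError("all returns equal the benchmark")
    return (expected - benchmark) / math.sqrt(alpha_variance)


# Hypothetical returns for three trades.
print(information_ratio([0.08, 0.02, 0.11]))
```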
Sortino Ratio Subroutine
The Sortino ratio is computed in the source code as follows (repeat of above formula).
$$SR = \frac{E(r) - r_\theta}{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\max(r_\theta - r_i, 0)^2}}$$
The numerator is similar to the Information Ratio’s, replacing the benchmark rate with the target rate
(hard-coded to be 4% for now). The denominator counts only returns below the target, reverses the
return and target in the difference, and replaces the benchmark return of the Information Ratio with the
target return.
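The same computation can be sketched in Python as follows. The function name and sample returns are hypothetical; the 4% default mirrors the hard-coded target. As in the formula, the sum runs over all n returns, with returns at or above the target contributing zero.

```python
import math


def sortino_ratio(returns, target=0.04):
    """(E(r) - target) divided by the downside deviation below the target."""
    n = len(returns)
    expected = sum(returns) / n
    # Only shortfalls below the target contribute to the denominator.
    downside_variance = sum(max(target - r, 0.0) ** 2 for r in returns) / n
    if downside_variance == 0:
        raise ValueError("no returns fall below the target")
    return (expected - target) / math.sqrt(downside_variance)


# Hypothetical returns for four trades.
print(sortino_ratio([0.10, 0.01, 0.09, 0.00]))
```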
Market Risk
Market risk is assessed for the machine learning and convergence arbitrage pricing mechanism as well
as for the high frequency opening and closing of positions. The means to compute this is to back-test
the pricing algorithms and trading decision infrastructure against historical market data downloaded via
Hypertext Transfer Protocol (HTTP) from the Moore Research Center, Inc. (MRCI). A flat file of contracts to
trade is constructed as follows.
<code>, <nContracts>, <currDate>
Here, code is the securities code of the contract in the Asymptotic system (the four-character CME
code or the <abbr>MMYYYY code used by Asymptotic for ICE listings, where <abbr> is the one- or two-character
abbreviation), nContracts is the number of contracts to trade, and currDate is the date to use
as the current trade date. These test cases are read into memory, and, for each test case, a buy/sell
decision is made on the simulated “currDate” and the contract is closed (for a net profit/loss) on
the maturity date. For each contract, the profit (negative for a loss) is added to the total, and the percentage
of cases resulting in a profit is also tracked. Note that this simulation is simplistic, as no logic to
determine the optimal close date for the contract is implemented.
A sample test file is below.
GCV4,100,2014-08-21
ZSN4,200,2014-07-21
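Reading such a file can be sketched in Python as follows. The class and function names are hypothetical, and the YYYY-MM-DD date format is inferred from the sample lines. Each line must have exactly three comma-separated fields, so a line with a stray comma inside the date is rejected rather than silently misread.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class ContractCase:
    code: str          # securities code in the Asymptotic system
    n_contracts: int   # number of contracts to trade
    curr_date: date    # simulated current trade date


def parse_test_cases(text):
    """Parse '<code>,<nContracts>,<currDate>' lines into ContractCase rows."""
    cases = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 3:
            raise ValueError(
                f"line {line_number}: expected 3 fields, got {len(fields)}")
        code, n_contracts, curr_date = fields
        cases.append(ContractCase(
            code,
            int(n_contracts),
            datetime.strptime(curr_date, "%Y-%m-%d").date()))
    return cases


for case in parse_test_cases("GCV4,100,2014-08-21\nZSN4,200,2014-07-21\n"):
    print(case)
```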
Credit Risk
As listed futures are settled with an exchange clearinghouse and not with the firm taking the other side
of the trade, no statistically significant risk of counterparty default exists. In the case of forward
(OTC) or spot transactions, such risk would be salient.
However, the commodity futures exchange will incur substantial credit risk, which must be hedged by
meticulously enforcing margin requirements. The credit risk will be quantified as credit Value-at-Risk,
which will be based on the outstanding notional values of all contracts, the volatility of those contracts,
and the credit ratings of the exchange participants.
Profit and Loss Simulation Results
A file containing three contracts (for gold, soybeans, and Brent Oil) was created (see above).
The output for a single trade of each contract was as follows.
test of pricing, pctCorrect 0.666666666666667
RAROC pricing test, mean 0.0431989906926242, variance 0.00652900688709723, sharpe ratio 0.163349255437263
This means that the risk-adjusted return metric (the Sharpe ratio) is about 0.16. More test contracts should be added and
more trades executed to attempt to improve the performance. A small number of contracts also biases
the results, as a single large profit or loss can sway the final ratio. Secondly, better training data and an
improved learning heuristic might make the weights, and thus the pricing algorithm, more accurate,
leading to higher returns.
In an assessment of the soybean pricing, the first problem noted was that WSJ spot price quotes are in
units of dollars per bushel, whereas the CME futures quotes are in terms of cents per bushel. Either a
units conversion method must be written, or the training must be conducted with spot prices in the
same units as those tested on.
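The first option, a units conversion method, might look like the following Python sketch. The function name and unit labels are hypothetical, and the 15.00 dollar spot quote is an invented value used only to show the conversion.

```python
CENTS_PER_DOLLAR = 100


def to_cents_per_bushel(price, unit):
    """Normalize a grain quote to cents per bushel."""
    if unit == "cents/bushel":
        return price
    if unit == "dollars/bushel":
        return price * CENTS_PER_DOLLAR
    raise ValueError(f"unsupported quotation unit: {unit!r}")


# A spot quote in dollars and a futures quote in cents become comparable.
spot = to_cents_per_bushel(15.00, "dollars/bushel")
futures = to_cents_per_bushel(1512.75, "cents/bushel")
print(spot, futures)
```

Normalizing at the point where quotes are loaded would keep both the training data and the test data in one unit.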
A graphical dashboard for loading data and monitoring the performance metrics is under consideration.
This will allow repeated running of different test cases and the monitoring of different performance
metrics (Sharpe Ratio, Information Ratio, Sortino Ratio, and risk-adjusted return on capital (RAROC)),
with the metric to be used for evaluation selected in a combobox.
Two entries from an Excel sheet tracking the algorithm’s progress on gold and soybeans are below, one per column. Note
the inclusion of both margin and ROI.
Field                     Gold         Soybeans
Maturity                  Oct14        Jul14
Sim date                  4/28/2014    4/30/2014
Model price               1309         801.7
Initial futures price     1299.2       1512.75
Position                  long         short
Maturity futures price    1229.2       1295
Contract multiplier       100          5000
Number of contracts       100          200
Profit                    700000       2.18E+08
Margin per contract       4000         2500
Margin (amount)           400000       500000
ROI (percent)             -175         43550
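The soybean figures above can be checked with a short Python sketch. The function names are hypothetical, and the profit is computed directly from the quoted prices and the contract multiplier, as the sheet appears to do.

```python
def position_profit(position, initial_price, maturity_price,
                    multiplier, n_contracts):
    """Profit of holding a long or short futures position to maturity."""
    direction = {"long": 1, "short": -1}[position]
    price_move = maturity_price - initial_price
    return direction * price_move * multiplier * n_contracts


def roi_percent(profit, margin_per_contract, n_contracts):
    """Profit as a percentage of the total margin posted."""
    return 100.0 * profit / (margin_per_contract * n_contracts)


# Soybeans: short 200 contracts at 1512.75, closed at 1295.
profit = position_profit("short", 1512.75, 1295, 5000, 200)
roi = roi_percent(profit, 2500, 200)
print(f"profit {profit:.3g}, ROI {roi:.0f} percent")
```

A short position gains when the price falls, so the direction factor flips the sign of the price move.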
Software Architecture
The architecture of the order management system is essentially three-tiered, consisting of a Windows
GUI, a suite of business logic components, and a SQL database. The exchange service will consist of an
HTTP daemon and a matching engine thread pool.
[Figure: architecture diagram. Components shown: Market Data Provider; Exchange Connectivity; Market Data Download; Polyhedra SQL Database; Pricing (Logistic Regression, Naïve Bayes); Order Entry UI; Commodity Futures Exchange; High-Frequency Trading (Open Algorithm, Reverse Basis, End Basis); Margin (compute initial and maintenance margin; post deposits and note gains with clearinghouse).]
External Data Sources
Data is obtained for risk-adjusted performance measurement, order entry, and the visualization of
market conditions.
[Figure: external data sources. Labels shown: Historical Futures Data and Historical Spot Data (EIA, MRCI, WSJ; historical prices and cash prices over HTTP); Exchanges (ICE, Asymptotic Dark Pool, CME); Live Futures Mkt Data (ICE, CME; current prices over HTTP); Asymptotic OMS (bid, ask, and execution prices over HTTP web service or TCP/IP).]
Graphical User Interface
While this GUI is currently compiled along with much of the quantitative pricing and trading logic, it may
later be repackaged as a thin client. Then the GUI events will send HTTP requests to a cloud-hosted web
service to manage a brokerage’s commodity futures trading.
The GUI consists of: web-based market data display, pre-downloaded market data analysis, high-frequency trading, convergence arbitrage, margin, and risk-adjusted performance measure modules.
Some screen shots of significant UI screens for arbitrage, risk-adjusted performance measurement, and
market data are below.
Here, convergence arbitrage is employed to infer the futures price of a given underlying from the spot
and maturity. The two-dimensional graph maps one of the two inputs to the futures price.
The risk-adjusted performance measurement GUI below enables the simulation of a portfolio using the
linear logistic regression pricing mechanism and the scoring of the strategy via the Sharpe Ratio,
Information Ratio, Sortino Ratio, or risk-adjusted return on capital.
The market data screen provides pricing information for the most recent trading day, as well as the gain
and loss distributions and candlestick chart for a selected contract.
Exchange Web Service
This is a Windows Communication Foundation Web Service running atop an HTTP server. Matching
engine classes are invoked by this web service to pair bids and asks for the same contract. Remote
methods that can be invoked by clients include: “get bids”, “get asks”, “send order”, and “get trade”.
Significant Algorithms
Machine Learning
1. Logistic Regression – Maximum likelihood estimation is used to infer the weights which make
the training set outputs most likely for a given model.
a. Linear – degree is one. P(Y) = 1/(1+e^(-(theta ∙ x)))
b. Quadratic – degree of exponent is two. P(Y) = 1/(1+e^(-(theta ∙ (x^2, x))))
2. Naïve Bayes – maximizes the joint probability of the training outputs in the model given the
training vectors P(Y|X), where P(X|Y) is assumed to be normally distributed.
Above, X is a two-dimensional vector of (spot, maturity) values.
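The two logistic forms above can be sketched in Python as follows. This evaluates P(Y) for given weights only; fitting theta by maximum likelihood is not shown. The function names and sample inputs are hypothetical, and no intercept term is added because the formulas above do not include one.

```python
import math


def sigmoid(z):
    """Numerically stable 1 / (1 + e^(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # avoids overflow for large negative z
    return e / (1.0 + e)


def linear_features(x):
    """Degree one: the inputs themselves, e.g. (spot, maturity)."""
    return list(x)


def quadratic_features(x):
    """Degree two: squared inputs followed by the inputs, (x^2, x)."""
    return [v * v for v in x] + list(x)


def logistic_probability(theta, features):
    """P(Y) = sigmoid(theta . features)."""
    if len(theta) != len(features):
        raise ValueError("theta and features must have the same length")
    return sigmoid(sum(t * f for t, f in zip(theta, features)))


x = (2.0, 3.0)  # hypothetical (spot, maturity) input
print(logistic_probability([0.1, -0.2], linear_features(x)))
print(logistic_probability([0.01, 0.0, 0.1, -0.2], quadratic_features(x)))
```

The quadratic model doubles the number of weights, since each input contributes both a squared and a linear term.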
High-Frequency Trading
The high-frequency trading user interface permits the configuration of the pricing algorithm, the
reversal basis, and the halting basis. The pricing algorithm is: logistic regression (linear or quadratic) or
Naïve Bayes. The basis for reversal can be either time or profit/loss (e.g. 10%), and the basis for halting
is similar.
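One way to represent the reversal and halting bases is sketched below in Python. The class, field, and function names are hypothetical, and the sketch assumes that a profit/loss basis triggers on a move of the given size in either direction; the actual trigger semantics are not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Basis:
    kind: str         # "time" or "pnl"
    threshold: float  # seconds for "time"; fraction (0.10 = 10%) for "pnl"


def is_triggered(basis, elapsed_seconds, pnl_fraction):
    """Report whether a reversal or halting basis has been reached."""
    if basis.kind == "time":
        return elapsed_seconds >= basis.threshold
    if basis.kind == "pnl":
        # Assumption: gains and losses of the same size both trigger.
        return abs(pnl_fraction) >= basis.threshold
    raise ValueError(f"unknown basis kind: {basis.kind!r}")


reverse_basis = Basis("pnl", 0.10)  # reverse after a 10% profit or loss
halt_basis = Basis("time", 60.0)    # halt after 60 seconds
print(is_triggered(reverse_basis, elapsed_seconds=5.0, pnl_fraction=0.12))
print(is_triggered(halt_basis, elapsed_seconds=5.0, pnl_fraction=0.12))
```

Using one structure for both bases reflects the statement that the halting basis is configured in the same way as the reversal basis.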
Financial Simulation
A C# program can be utilized to generate Excel sheets simulating market returns and trading profit/loss
(as represented by the margin account balance) over a time series. In another currency futures
application, this was handled by Excel Visual Basic code. This component has yet to be implemented.
Conclusions
By leveraging canonical concepts in Computer Science – machine learning, distributed systems, and
parallel processing – derivatives trading can be improved significantly. As the order management
system and exchange service continue to mature, advances should be achieved in both quantitative
trading and enterprise risk management.
About the Author
Patrick Toolis has spent fifteen years involved in theoretical computer science applications in five
countries. After co-founding J-Surplus.com, one of the first business-to-business e-commerce auction
sites for excess inventory in Japan, Mr. Toolis worked in the system integration realm at Iona
Technologies, dealing with major customers in telecommunications and semiconductors. He then
helped develop the order and execution processing at JapanCross Securities, one of the first electronic
crossing networks for Japanese equities (later merged with Instinet and now a part of Nomura). As a
consultant for Sears Holdings Corporation, he has written significant machine learning implementations
on mobile computing platforms. Most recently, he served in a risk management role,
developing parallel computing algorithms for The American Express Company.
Bibliography
Hull, John C. Options, Futures, and Other Derivatives, Fifth Edition. 2002. Prentice Hall
http://www.cboe.com/micro/volatility/VXTYN/default.aspx Treasury volatility
http://vlab.stern.nyu.edu/analysis/VOL.GSGLD:COM-R.GARCH Volatility models
http://online.wsj.com/mdc/public/page/2_3023-cashprices.html Wall Street Journal spot prices
http://www.mrci.com/ohlc/index.php Moore Research Center Commodity Futures Prices
http://www.eia.gov/dnav/pet/pet_pri_spt_s1_d.htm Energy Information Administration’s spot crude prices
Sharpe, W.F. “Mutual Fund Performance”. Journal of Business 39(S1) 1966:119-138
Sortino, F. and H. Forsey. “On the Use and Misuse of Downside Risk”. The Journal of Portfolio
Management Winter 1996.
http://en.wikipedia.org/wiki/Information_ratio Information Ratio discussion.