
Analytics, Data and the Time for
High Performance Computing in Financial Markets
Brian Sentance, Chief Executive Officer, Xenomorph
Published: November 2007
Abstract
This white paper describes the current drivers behind the need for greater data analytical power within financial
markets, the business and technical challenges these drivers imply and how the Microsoft® High Performance
Computing (HPC) platform can move financial institutions closer to the goal of real-time pricing, trading and risk
management across asset classes. Specifically, the paper illustrates many business scenarios present in
financial markets that may benefit from the calculation scalability and fault-tolerance of Microsoft HPC, and how
Microsoft is enabling greater productivity in financial markets by bringing this compute power closer to the
traders, risk managers and researchers that need it.
This is a preliminary document and may be changed substantially prior to
final commercial release of the software described herein.
The information contained in this document represents the current view of
Microsoft Corporation on the issues discussed as of the date of
publication. Because Microsoft must respond to changing market
conditions, it should not be interpreted to be a commitment on the part of
Microsoft, and Microsoft cannot guarantee the accuracy of any information
presented after the date of publication.
This White Paper is for informational purposes only. MICROSOFT
MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS
TO THE INFORMATION IN THIS DOCUMENT.
Complying with all applicable copyright laws is the responsibility of the
user. Without limiting the rights under copyright, no part of this document
may be reproduced, stored in or introduced into a retrieval system, or
transmitted in any form or by any means (electronic, mechanical,
photocopying, recording, or otherwise), or for any purpose, without the
express written permission of Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights,
or other intellectual property rights covering subject matter in this
document. Except as expressly provided in any written license agreement
from Microsoft, the furnishing of this document does not give you any
license to these patents, trademarks, copyrights, or other intellectual
property.
© 2007 Microsoft Corporation. All rights reserved.
Microsoft, Active Directory, Excel, Visual Basic, Visual Studio, Windows,
Windows Server, and the Windows logo are trademarks of the Microsoft
group of companies.
All other trademarks are property of their respective owners.
Contents
Data and Analytics in Financial Markets
High Performance Computing Background
Microsoft® and High Performance Computing
Introducing the Microsoft HPC Platform
Microsoft® Windows® Compute Cluster Server 2003
Driving the Need for HPC - Regulation
Basel I, II and Risk Management
Sarbanes-Oxley and Spreadsheets
UCITS III, Derivatives and Structured Products
RegNMS and Data Volumes
MiFID, Data Volumes and Data Analysis
Driving the Need for HPC - Competition
Algorithmic Trading, Speed and Data Volumes
Decimalization and US Trading Volumes
Credit Derivatives and Product Complexity
Structured Products Growth in the US
Liability Driven Investment (LDI) and Derivatives
Hedge Fund Valuation and Risk Management
Where to Apply HPC in Financial Markets
Backtesting, Data Mining and Complexity in Algorithmic Trading
New Derivative Products from Quantitative Research
Product Control for New Types of Derivative
Derivatives Trading
Portfolio Optimization
Risk Management
Putting the Productivity in HPC – Some Practical Examples
Introducing High Productivity Computing
Productivity Example #1 – Windows CCS applied to Excel® 2007
Productivity Example #1(a) – Excel-WCCS Adapter for Legacy XLLs
Productivity Example #2 – Windows CCS and Legacy VBA Code
Productivity Example #3 – Windows CCS and MATLAB
Productivity Example #4 – A Windows CCS “Call to Action” for Systems Developers
Summary
Appendix 1 - URL Links for More Information
Appendix 2 - Improvements in Microsoft® Office Excel® 2007
Contributors and Acknowledgments
Chris Budgen, Chief Technical Architect, Xenomorph
Michael Newberry, Product Manager HPC, Microsoft
Jeff Wierer, Senior Product Manager HPC, Microsoft
Antonio Zurlo, Technology Specialist HPC, Microsoft
Data and Analytics in Financial Markets
Financial markets are going through interesting times. The market usually has more than enough to absorb when a single area of regulation is changing or a single area of the market is going through rapid expansion. Today, however, business and technical challenges are arriving on multiple fronts at once, spanning both market growth and regulatory change. The institutions that win will be those that can manage: (i) higher data volumes, more quickly; (ii) greater data complexity, more easily; and (iii) more challenging data analysis, more productively.
All of these challenges in data and analytics management drive the need for greater
compute power within financial markets. In order to ensure productive use of this
compute power, it is essential that it becomes easily accessible to all practitioners within financial institutions, from traders and risk managers to IT development staff, regardless of whether they are spreadsheet users or involved in the low-level
programming of high performance trading applications. This is one of the key design
goals of the Microsoft® High Performance Computing (HPC) platform.
Microsoft believes that the Microsoft HPC platform is an enabling technology for
financial markets institutions that wish to be first to market with innovative new
financial products, risk management techniques and trading strategies. The time for
high performance computing is now, as institutions must face the challenging journey
towards real-time pricing, trading and risk management across asset classes.
High Performance Computing Background
So first, let’s take a step back and ask: just what is High Performance Computing? Fundamentally, HPC is concerned with running applications and business processes faster and more reliably than ever before. In more detail, HPC is a branch of computer science that studies system designs and programming techniques to extract the best computational performance from microprocessors. Whilst the origins of HPC are to be found in the specialized supercomputers developed for the US government in the 1960s, in recent years distributed (parallel) computing using commodity hardware has come to dominate the field.
The simple idea of distributed computing within HPC is to take a calculation, break it
into smaller chunks and run these smaller sub-calculations in parallel on multiple
processors and servers in order to reduce the overall calculation time. For example,
applying one hundred processors in parallel to a calculation has the theoretical potential to reduce the overall calculation time by a factor of one hundred. Add to
this compute power the ability to re-run and recover from hardware or software failures
within the “cluster” of parallel computers, and it becomes easy to understand why HPC
technology is so appropriate to the challenges currently faced in financial markets.
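In practice, the achievable speed-up depends on how much of a workload can actually be parallelized. Amdahl’s law (a standard result, stated here for context) bounds the speed-up from running the parallelizable fraction p of a calculation on N processors:

$$ S(N) \;=\; \frac{1}{(1-p) + p/N} $$

so a calculation that is 95% parallelizable can never run more than 20 times faster, however many processors are applied. Happily, many financial calculations, Monte-Carlo simulation in particular, have p very close to one.

To give a flavor of how such a split looks in code, the sketch below uses the standard MPI interface (supported on Windows CCS through MS-MPI) to divide a simple Monte-Carlo estimate across the processes of a cluster job. It is a minimal illustration rather than production code: the integrand, sample count and per-rank seeding are all placeholders.

```cpp
#include <mpi.h>
#include <cstdio>
#include <cstdlib>

// Minimal sketch: estimate the integral of f(x) = x^2 on [0,1] by
// Monte-Carlo, splitting the samples evenly across the MPI ranks in a
// cluster job and combining the partial sums on rank 0.
int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);

    int rank = 0, size = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const long totalSamples = 100000000;         // whole job (placeholder)
    const long mySamples = totalSamples / size;  // this rank's share

    std::srand(42u + rank);                      // illustrative per-rank seeding
    double localSum = 0.0;
    for (long i = 0; i < mySamples; ++i) {
        double x = std::rand() / (double)RAND_MAX;
        localSum += x * x;                       // integrand f(x) = x^2
    }

    // Gather the partial sums onto rank 0 in a single reduction.
    double globalSum = 0.0;
    MPI_Reduce(&localSum, &globalSum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        double estimate = globalSum / ((double)mySamples * size);
        std::printf("Monte-Carlo estimate: %f (exact value 1/3)\n", estimate);
    }

    MPI_Finalize();
    return 0;
}
```

When launched with, say, one hundred processes, each rank performs one hundredth of the sampling work, and only the final one-number reduction crosses the network.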
Microsoft and High Performance Computing
Microsoft has set out to make high performance computing available to all, and nowhere is that goal more relevant than in financial markets. In summary, Microsoft HPC technology has been designed to:

• Leverage existing user skills, such as familiarity with Microsoft® Windows® and Windows-based applications such as Microsoft® Excel®.
• Build upon existing .NET developer skills through tight integration of HPC functionality within Microsoft® Visual Studio®.
• Enable business-user productivity through easy-to-use GUI applications for cluster management and job scheduling.
• Provide extensive command-line interfaces for power users and developers who need more flexibility and control.
• Offer support for HPC industry-standard interfaces such as MPI (Message Passing Interface) for clustering.
• Provide an easy-to-program job scheduler API for integrating desktop applications directly with the cluster.
• Offer integrated cluster setup, deployment and monitoring capabilities.
• Build upon Microsoft® Active Directory® to offer powerful, built-in cluster security management functionality.
Combine these goals with the ability of Microsoft’s partners to deliver out-of-the-box business functionality based on Microsoft HPC technology, and high performance computing has the potential to go mainstream within financial markets.
Introducing the Microsoft HPC Platform
The Microsoft HPC Platform is fundamentally based upon Windows Compute Cluster
Server 2003 (WCCS), which in turn is composed of two components:

• Windows Server® 2003 Compute Cluster Edition (CCE)
• Windows® Compute Cluster Pack (CCP)
Windows CCE is Microsoft’s 64-bit operating system offered specifically for HPC deployments. It offers the power and integration of the Windows Server family at a value point designed for distributed, multiple-server computing. Windows CCP is Microsoft’s 64-bit cluster management suite, offering new levels of ease in cluster deployment, access and control for high performance computing.
In addition to Windows CCS, two further components complete the picture of the Microsoft HPC Platform offering:

• Microsoft Visual Studio
• Microsoft .NET Framework
Microsoft Visual Studio is the most advanced and widely used integrated development
environment (IDE) on the market, and has a wealth of features designed to ensure it is
the IDE of choice for developing and debugging distributed HPC software. The
Microsoft .NET Framework should need no introduction as Microsoft’s strategic
managed code programming model for Windows and web application development
and deployment.
Microsoft Windows Compute Cluster Server 2003
In this introductory section, let’s spend a little more time looking at the core member of
the Microsoft HPC Platform, Windows Compute Cluster Server 2003 (WCCS).
Windows CCS provides an integrated application platform for deploying, running, and
managing high performance computing applications. For customers who need to solve
complex computational problems, Windows CCS accelerates time-to-insight by
providing an HPC platform that is easy to deploy, operate, and integrate with your
existing infrastructure. Windows CCS allows researchers and financial analysts to conduct their analysis with minimal IT administration. Windows CCS operates on a cluster of servers that includes a single head node and one or more compute nodes. The head node controls and mediates access to the cluster resources and is the single point of management, deployment and job scheduling for the compute cluster. Windows CCS uses the existing corporate Active Directory infrastructure for security, account management and overall operations management, using tools such as Microsoft Operations Manager 2005 and Microsoft Systems Management Server 2003.
For more detailed information please visit:
http://www.microsoft.com/windowsserver2003/ccs/technology.aspx and
http://www.microsoft.com/windowsserver2003/ccs/techresources/default.mspx
Driving the Need for HPC - Regulation
So what is driving this growth towards high volumes of data, greater complexity and
faster analysis, and ultimately the need for high performance computing in financial
markets? Regulatory compliance is a major factor behind many market developments,
and wherever you look regulators require institutions to manage and analyze more
data, ever more quickly. The regulations pushing financial markets towards the need for more compute power, and broader access to it, are outlined in more detail below.
Basel I, II and Risk Management
Basel I and II are industry-wide banking regulations defining capital adequacy requirements for banks, based around the concepts of market risk, credit risk and
operational risk. Basel I and its subsequent 1996 amendment were the drivers behind
the use of Value at Risk (VaR) to determine the degree of market risk being
undertaken. This calculation is typically data- and calculation-intensive, from the
perspectives of both multi-factor market simulation and full revaluation of derivative
and fixed income instruments. As such, VaR is ideally suited to the application of HPC, enabling more detailed scenario and sensitivity analysis and calculation within shorter timeframes.
Basel I was also one of the drivers behind the growth of the credit markets, where
regulatory weaknesses in Basel I were exploited to free up regulatory capital using
credit derivatives. Basel II (effective worldwide from 2008) focuses on refining the treatment of credit risk and adding operational risk to the Basel I framework, and brings with it further computational challenges in extending risk methodologies to credit risk management and in promoting further innovation in structured credit derivative instruments. HPC offers a technology
framework where the need to re-price ever more complex financial instruments can be
successfully matched against the compute power needed to achieve this goal in a
meaningful business timeframe.
Sarbanes-Oxley and Spreadsheets
Sarbanes-Oxley is a US federal law passed in response to a number of corporate and accounting scandals in the US, and it enforces higher reporting responsibilities and standards upon any corporation that trades in the US. Given the ubiquitous use of spreadsheets in the creation of company reports and accounts, Sarbanes-Oxley has been one of the main drivers behind the establishment of “Spreadsheet Management” as a mainstream sector of the software market.
Whilst Sarbanes-Oxley is not specifically targeted at institutions within financial markets, its implications apply equally to any corporation operating in and around the US financial markets. Adding further, very specific weight to the business case for spreadsheet management within financial markets is the operational risk associated with the overuse of spreadsheets by traders.
Traders typically use spreadsheets as the main tool for pricing, hedging and risk
management when derivative products are too complex for mainstream systems to
handle. In addition to the direct operational risk of traders reporting on the value of their own positions, regulators can insist on much higher regulatory capital
levels if a bank’s risk management systems are not capable of analyzing and reporting
upon the more exotic products within a portfolio. Here the Microsoft HPC platform can
once again add value, both from the perspective of increasing data transparency for
regulators by using the server-based spreadsheet technology found in Excel Services,
and by using HPC behind Excel to enable closer to real-time calculation of trading
opportunity and risk exposure.
UCITS III, Derivatives and Structured Products
The original UCITS (Undertakings for Collective Investment in Transferable Securities) directive was intended by the European Union (EU) to create a single
European market and regulatory framework for fund management, allowing
investment management companies to “passport” between different member states. In
many ways its “passporting” goals were similar to the more recent MiFID directive
from the EU (see below). Whilst its latest incarnation, UCITS III, continues with this
aim in the “Management Directive” side of its scope, the “Product Directive” has
enabled the increased usage of derivatives and complex structured products by fund
management institutions.
UCITS III expands the range and type of financial instruments permitted by the original
1985 directive to include:

• Transferable securities and money market instruments.
• Bank deposits.
• Units of other investment funds.
• Financial derivative instruments.
• Index tracking funds.
Overall, a UCITS fund can have up to 20% of its NAV invested in OTC derivatives, and up to 35% if exchange-traded derivatives are used. The leverage potential of this
change is enormous, as is the attractiveness to fund managers of being able to
change fund risk/return profiles, particularly in times where stock market performance
has been highly variable. These factors have contributed to growth in equity
derivatives, credit derivatives and structured products on the buy and sell side, with
the competitive drive for product innovation requiring the pricing of instruments of
greater complexity. The advantages of using these more complex products do not
however come without cost. In this regard, both buy-side and sell-side institutions
must put the tools in place that enable their technology to scale up along with
business need. The Microsoft HPC platform can fulfill this need, enabling reliable and
rapid valuation and re-pricing of complex instrument portfolios.
RegNMS and Data Volumes
Regulation NMS (RegNMS) aims to promote “best execution” and greater market transparency within US equity markets. The four main components of RegNMS are:

• Order Protection Rule – ensuring that best bid and best offer prices are displayed in all markets.
• Access Rule – preventing an order being executed in one market center at an inferior price to that available in another.
• Sub-Penny Pricing Rule – preventing practitioners quoting in fractions of a penny to step ahead of existing buy and sell limit orders.
• Market Data Rule – providing more transparent allocation of market data revenues, combined with the freedom for broker-dealers and market centers to distribute their own data.
The implications of these rules are wide and, although effective October 2007, are still
subject to much debate. However, given that aspects of the regulation are concerned
with market transparency, inevitably this will increase the need to manage and
analyze more data sourced from a multitude of trade execution venues. Whilst many of the preceding examples have been concerned with pricing ever more complex instruments using ever more complex mathematical techniques, here the computational challenge is the sheer volume of data to be analyzed in timeframes that are meaningful from a business perspective. HPC is not the complete answer to these data analysis problems, but the problem is migrating from one that is heavily data-centric to one of making more sense of the data through more detailed and faster analysis.
MiFID, Data Volumes and Data Analysis
The Markets in Financial Instruments Directive (MiFID) issued by the European Union
(effective November 2007) will be a great catalyst for change in European financial
markets. Like RegNMS, MiFID is designed to increase market transparency and
focuses on ensuring “best execution” of financial instrument transactions in the EU.
Unlike RegNMS, MiFID also has to deal with enforcing market transparency across
the twenty-seven member states, and in this way could be considered as a more
ambitious and far-reaching regulation, the full effects of which are yet to be fully
determined.
The regulation aims to allow investment firms to “passport” across member states (see
earlier section on UCITS III) as a single regulatory framework is created across the
EU. It encourages the creation of alternative trading venues (such as Project Turquoise), which will increase the number of venues from which data may need to be sourced to obtain a full market “picture” of instrument prices and liquidity. Pre-trade and post-trade reporting requirements will put additional pressure on data management activity at investment institutions. MiFID also has execution of OTC derivatives within
its scope, and so more complex products are also in scope for the increasing number
of institutions that trade them. Increased data volumes and data complexity look set to
be a key implication of MiFID implementation, presenting further challenges to existing
data and analytics management technology and processes. Enabling both business
users and technologists alike to face these future challenges with confidence is a key
design aim of the Microsoft HPC platform.
Driving the Need for HPC - Competition
Whilst regulation increasingly requires institutions to analyze more data more quickly,
intense commercial competition is the profit-driven motivation for many institutions to
look at high performance computing. The commercial rewards for bringing a new product or service to market more quickly than the competition are enormous, and wherever calculation time is a delaying factor, the use of HPC will follow. Combine this
decreased time-to-market with improved end-user productivity and operational
efficiency, and the case for deployment of the Microsoft HPC Platform is a strong one,
as outlined by the commercial drivers described below.
Algorithmic Trading, Speed and Data Volumes
The algorithmic trading market has undergone extraordinary expansion in the US over
recent years and adoption of this approach to trading is increasing dramatically in
Europe and Asia. The days when individual traders would attempt to optimally fill a client buy or sell order by manually scheduling out trades during the day are numbered.
Clients are now able to see better order execution outcomes through the use of
computer automated or “algorithmic” trading techniques, and the banks and brokers
are able to dispense with the cost of expensive trading staff. Additionally the banks
and brokerage firms are now “productizing” the algorithms they have developed as
competitive differentiators for their buy-side clients. The market has become a competitive battle fought on the speed (latency) of execution and the sophistication of the algorithms used.
These trading algorithms have given rise to greater trading volumes, as more trades
are implemented in smaller transaction amounts across multiple trading venues.
Changes in exchange quoting methods (penny quotes, as described below) have also driven this growth in trade volumes. Regulations such as RegNMS in the US and MiFID in the EU have driven further growth in algorithmic trading activity, as the search for price transparency and liquidity causes algorithmic trading products to search across a greater number of trade execution venues.
Whilst algorithmic trading is ordinarily concerned with the execution of client orders, the broader practice of automated trading is also driving growth in trading volumes, with traders and investors trading on their own account (i.e. using the organization’s own funds, not driven by client orders). Here traders go in search of
“Alpha” from intraday opportunities, often based around techniques such as statistical
arbitrage. As equity markets have matured, fewer opportunities for arbitrage trading now exist using daily/end-of-day data, hence the increasing interest in intraday tick data. Additionally, the increase in algorithmic and automated trading has also opened
the door to the concept of algorithms that compete against each other, either in terms
of better execution or profiting directly from knowledge of what algorithms are being
used in the market.
Decimalization and US Trading Volumes
Whilst the RegNMS regulation addresses sub-penny pricing, it is probably worth a
brief mention of decimalization (penny pricing) of market quotes, and how this has
been a catalyst for increased trading volumes in the US. Decimalization in the US equity markets since 2000 has contributed greatly to a reduction in trading spreads and to an increase in trading volumes, coupled with (and driving) the growth in algorithmic and automated trading. This pattern is currently being repeated in US exchange-traded option markets, where the recent change to penny quotes (for example, the February 2007 move to penny quotes on OPRA) will increase trading volumes, increase algorithmic trading activity in these markets and increase the amounts of data that need to be analyzed.
Credit Derivatives and Product Complexity
The credit markets have also undergone rapid expansion over recent years, with
issuance rising from $180 billion in 1996 to an estimated $20 trillion in 2006 according
to the British Bankers Association. Prior to the creation of the credit derivatives
markets in the early 1990s, it was very difficult for banks to reduce or offload the credit
risk associated with their loan portfolio (i.e. the very real risk that some of the loans will
not be repaid due to company bankruptcy and other circumstances). The credit
derivatives markets and the products within it have allowed credit risk to become a
tradable commodity, and hence credit risk can be better distributed across the financial markets, from highly leveraged institutions (banks) to better-capitalized businesses such as investment institutions and corporations. As mentioned earlier,
regulatory arbitrage has also played its part in the growth of this market.
The benefits and pitfalls of the advent of the credit derivatives markets are much debated; however, a continued feature of this market is product innovation, as quantitative researchers create more complex products and structures. One of the fundamental and enabling building blocks of the market (and still accounting for around a third of the market’s size) is the Credit Default Swap (CDS). In overview terms, a CDS is a financial contract in which a protection buyer pays a fee in return for a compensating payment should a third party default on some referenced debt obligation. Other more complex products, such as Collateralized Debt Obligations (CDOs) and CDO-squared instruments, can require extensive use of Monte-Carlo simulation methods, since the pricing techniques in this market are new and immature when compared to other better-understood asset classes. Gone are the days when a credit derivatives desk
could simply acquire a new server to apply to each new computationally intensive
trade undertaken. As trading desks compete for more business within the credit
derivatives market, scalability across business processes will be a key success factor.
The Microsoft HPC platform is one of the key technologies that will enable scalability
and faster pricing of both existing and new types of exotic credit derivatives.
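To give a flavor of the calculation behind even the simplest of these products, the standard textbook approximation for the par spread s of a CDS balances the expected value of the protection leg against that of the premium leg (the notation here is generic, not drawn from any particular system):

$$ s \;=\; \frac{(1-R)\,\sum_{i=1}^{n} D(t_i)\,\bigl[S(t_{i-1}) - S(t_i)\bigr]}{\sum_{i=1}^{n} \Delta_i\, D(t_i)\, S(t_i)} $$

where R is the assumed recovery rate, D(t_i) the risk-free discount factor to premium date t_i, S(t_i) the reference entity’s survival probability to t_i, and Δ_i the premium accrual fraction. Even this vanilla contract requires a calibrated survival curve; the correlation-dependent CDOs and CDO-squared structures built on top of it are where the Monte-Carlo compute load really arrives.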
Structured Products Growth in the US
Given the effects of UCITS III on structured products trading in Europe, described in the previous section, it is also worth mentioning that structured products are finally experiencing very strong growth in the United States (U.S.). U.S. markets and investors have always been very equity-focused; Europe has led the growth in structured products because of the greater taxation, regulatory and market complexity of the European market. Even so, growth of U.S. structured products issuance from $50 billion in 2005 to around $100 billion in 2006 means that more complex products are here to stay in the US, with all of the complexity of data management and data analysis that constant product innovation implies. Again, with its computational scalability, reliability and easy integration with Microsoft Excel and Microsoft’s development tools, the Microsoft HPC Platform is the technology to accelerate the management of structured products.
Liability Driven Investment (LDI) and Derivatives
Another factor that has contributed to the increased use of derivatives by asset management institutions is the emergence of “Liability Driven Investment” (LDI) as a
mainstream technique for pension fund management. As a relatively recent refinement
to Asset and Liability Management (ALM), investment managers are now offering
pension funds an LDI service that focuses on better matching of pension fund assets
to member liabilities. As described in more detail below, LDI demands the kind of
increased compute power available from the Microsoft HPC platform.
In a traditional scenario, a pension fund’s assets might well be invested primarily in equities, leading to a large exposure to equity market risk. This risk profile is not well
matched against the liability profile of long-term cash payments made to retired
members of the fund, which is exposed to different risk factors such as interest rates,
inflation and longevity. Within LDI, asset management institutions are now competing
with the sell-side banks in offering investment overlay services, where the asset risk
profile is overlaid with derivative products such as equity, fixed-income and inflation
derivatives. In addition to pricing and hedging the derivatives used in LDI, a further
computationally intensive stage is the optimization of the asset portfolio against the
client’s liability benchmark.
Hedge Fund Valuation and Risk Management
As investment management institutions search for greater sources of trading “Alpha”, investment by fund managers and pension funds in hedge funds has become much more mainstream. Many of the investment and trading strategies employed by hedge funds involve the use of derivatives and other complex techniques to generate Alpha, as well as to seek returns that are less correlated with mainstream market indices. The desirability of these more complex strategies, and the acceptance of hedge funds as a clearly defined asset class, is in turn driving the need for better risk management and derivatives valuation on the buy-side.
Where to Apply HPC in Financial Markets
Now that we have examined some of the drivers behind the growth in data volumes, complexity and speed of analysis, let’s get into more detail on where Microsoft HPC can be applied within financial markets.
Backtesting, Data Mining and Complexity in Algorithmic Trading
Strategists involved in automated and algorithmic trading usually have no shortage of
trading ideas that they would like to try out, against a background where they have
very little time to develop and back-test them. Even once an algorithm with some potential is found, it often needs to be re-written and re-tested within a production-worthy automated trading environment. All of
these processes take time, and once again time to market is a key success factor in
this rapidly evolving market sector.
Additionally, as the automated trading market develops further, the algorithms used are becoming more complex, combining real-time statistical analysis of real-time, intraday and historic data. This complexity is only set to increase, especially as new asset classes fall within the grasp of trading automation, implying that real-time derivatives and fixed income pricing will soon be required within all algorithmic trading solutions. Whilst latency is the current key technology metric for algorithmic trading, as algorithms and the asset classes traded by them become more complex, compute power will be fundamental to success in this market.
Against this background of increasing algorithmic complexity sits the growing difficulty of both managing and analyzing vast volumes of stored real-time data. According to recent estimates by the Tabb Group, markets world-wide generate around 3.5 terabytes of real-time data per day, leading to databases at many institutions that are currently many tens of terabytes in size and which will soon expand far beyond these levels. Many institutions now looking at their huge investment in databases of tick data are finding that workstation-bound analysis of these vast datasets by traders is simply not feasible and limits the kinds of analysis that can be done. The question seems to be heading towards: “Well, we have all the data we need, so what shall we do with it now?”
This data analysis challenge demands scalable, fault-tolerant, server-side tools that deliver compute power to traders without sacrificing the traders’ ability to try new ideas and strategies quickly and easily. The Microsoft HPC Platform can deliver this compute power to where it is needed, for practitioners and technologists alike. Faster parameter optimization, faster back-testing, faster re-pricing of derivatives and faster data analysis are all areas where the Microsoft HPC Platform can deliver, enabling faster time to market for new automated trading strategies, as the sketch below illustrates.
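As an illustration of why strategy research parallelizes so naturally, the sketch below fans a grid of strategy parameters out across local processor cores; the same split-and-gather shape maps directly onto the nodes of a Windows CCS cluster under a job scheduler. The backtest function and parameter grid here are invented stand-ins, not a real strategy.

```cpp
#include <cstdio>
#include <future>
#include <vector>

// Stand-in for a real backtest: evaluates one parameter setting against
// stored market history and returns a fitness score (e.g. a Sharpe ratio).
double backtest(int lookback)
{
    double score = 0.0;
    for (int i = 1; i <= 1000000; ++i)       // placeholder workload
        score += 1.0 / (double)(i + lookback);
    return score;
}

int main()
{
    // Build the parameter grid; each point is an independent job.
    std::vector<int> grid;
    for (int lookback = 10; lookback <= 200; lookback += 10)
        grid.push_back(lookback);

    // Fan the grid out across local cores; on a cluster, each of these
    // would be a scheduled job rather than a local task.
    std::vector<std::future<double>> jobs;
    for (int p : grid)
        jobs.push_back(std::async(std::launch::async, backtest, p));

    // Gather the results and keep the best-scoring parameter set.
    int bestParam = -1;
    double bestScore = -1.0;
    for (size_t i = 0; i < jobs.size(); ++i) {
        double s = jobs[i].get();
        if (s > bestScore) { bestScore = s; bestParam = grid[i]; }
    }
    std::printf("Best lookback: %d (score %.6f)\n", bestParam, bestScore);
    return 0;
}
```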
New Derivative Products from Quantitative Research
The speed with which a new financial product can come to market is a key competitive
factor for many derivative trading desks. Shorter time to market means higher margins
can be charged as market demand can be met in the absence of strong price
competition. A reputation for successful innovation also assists in driving market share
through perceived “ownership” of expertise in a particular product area. Even for those
products that are more widespread, the ability to quote a price faster to the client than
the competition can only increase the chances of success.
Quant and derivative research teams are at the front line of this battle of innovation,
under constant pressure from trading desks to deliver pricing models for new products
in timescales that are invariably short and very challenging. Whilst an elegant “closed-form” solution may ultimately be found for a derivative product, for new and complex products financial mathematicians will most likely have to apply numerical methods to value the security. Faster numerical methods, such as tree-based solutions, may eventually be found for the valuation problem. However, it is highly likely that either the initial valuation (or at the very least the validation) of the new derivative product will be done as some form of Monte-Carlo simulation, a technique that lends itself well to “parallelization” with the Microsoft HPC Platform.
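The suitability of Monte-Carlo methods for parallel hardware follows from basic statistics. A Monte-Carlo value is an average over N independently simulated paths, and its standard error shrinks only with the square root of N:

$$ \hat{V} \;=\; \frac{1}{N}\sum_{j=1}^{N} f(X_j), \qquad \operatorname{stderr}\bigl(\hat{V}\bigr) \;=\; \frac{\sigma}{\sqrt{N}} $$

Halving the pricing error therefore requires four times as many paths; but because the paths are independent, they can be farmed out across P compute nodes for close to a P-fold reduction in wall-clock time, with only the final averaging step crossing the network.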
Product Control for New Types of Derivative
Ensuring that the pricing and sensitivity outputs of a pricing model for a new and
potentially exotic instrument are accurate before it is traded is a key concern for
Product Control departments at many sell-side institutions. These concerns arise from the possible trading losses that may ensue if an instrument is mispriced, from the need to check that an instrument can be hedged and risk-managed, and from the need for regulators to see that due process is being followed and that unnecessary operational risk is not being taken.
Time to market is key for new types of derivative and structured products, as profit
margins are often determined by who else in the market can offer a similar service and
whether or not competition on price comes into play. Hence, the quicker new products can be brought to market, the greater the opportunities for increased profitability.
One of the main techniques used by quantitative analysts within product control is
Monte-Carlo simulation to verify independently that the trading desk’s pricing method
is both accurate and robust. If this process of model validation can be speeded up
linearly through the use of the Microsoft HPC Platform to distribute this compute load
across many machines, then time to market is reduced and profit margins will rise. As
a final thought for the future, it is probably not long until product control will need to be
involved in the validation of algorithmic trading “products” before they are released to
market, with all of the back testing and data analysis that this process implies.
Derivatives Trading
Once an instrument has been priced and a deal done with a counterparty, it is then
down to the derivatives trader to manage and hedge each instrument position within
the context of a wider portfolio of derivatives, underlying securities and hedge
instruments. Traders must consider both position risk and un-hedged exposure at both single-instrument and portfolio level, under a wide variety of “what-if” scenarios. Given easier access to more computing power, traders would be able to re-value their portfolios and analyze more complex market scenarios in shorter periods of time. Here, utilizing the Microsoft HPC Platform to deliver this compute power to traders would enable more robust post-trade hedging, plus faster analysis of the pre-trade market opportunities available.
Portfolio Optimization
Portfolio optimization has long been a major subject within academia; however, its practical application has never quite met its true potential, as market realities have often interfered with the elegance of some of the mathematical techniques applied. Fixed transaction costs, non-fractional positions and other sources of mathematical discontinuity have always forced practitioners down the route of more expensive and difficult optimization techniques, such as integer quadratic programming and the like.
The advent of Liability Driven Investment and other techniques involving the use of derivative products has added further non-linearity and discontinuity to the mathematical and computational problems of applied optimization. Whilst some of the
problems will remain just out of reach due to their computational intractability, the
Microsoft HPC Platform delivers the compute performance that will enable
practitioners to face more difficult problems with confidence, allowing faster
optimization runs and more sophisticated modeling.
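To make the difficulty concrete, a stylized version of the practitioner’s problem (illustrative notation, not drawn from any particular product) extends textbook mean-variance optimization with whole-lot positions and fixed dealing costs:

$$ \min_{n \,\in\, \mathbb{Z}_{\ge 0}^{k}} \; w^{\top}\Sigma\,w \;+\; \sum_{i=1}^{k} c_i\,\mathbf{1}\{n_i \neq n_i^{0}\} \quad \text{s.t.} \quad \mu^{\top}w \;\ge\; r_{\min}, \qquad w_i = \frac{n_i\,l_i\,p_i}{W} $$

where holdings are whole numbers n_i of lots of size l_i at price p_i, W is the total portfolio value, n_i^0 is the current holding and each position change incurs a fixed cost c_i. The integer variables and the fixed-cost indicator terms destroy the convexity that makes the textbook problem cheap to solve, which is precisely why realistic optimization runs are compute-hungry and benefit from cluster-scale hardware.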
Risk Management
Given the economic pain associated with having high regulatory capital reserves enforced upon them, it is unsurprising that the calculation of enterprise Value at Risk (VaR) is one of the key responsibilities of risk management departments within sell-side institutions. VaR can be defined as the level of portfolio loss that would not be expected to be exceeded, under normal market conditions, over a defined time horizon for a given level of probability. For example, under Basel I the VaR calculation is defined as the loss that would only be exceeded one time in one hundred (99% VaR) over a ten-day investment horizon. Taking a different example, if we had a portfolio of financial instruments worth $500 million with a 95% VaR of $40 million over a one-day timeframe, then only one day in twenty would we expect to suffer losses greater than $40 million.
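Written formally (a standard definition, included here for precision), the VaR at confidence level α of the portfolio loss L over the chosen horizon is the smallest loss level whose probability of being exceeded is at most 1 − α:

$$ \mathrm{VaR}_{\alpha}(L) \;=\; \inf\bigl\{\, l \in \mathbb{R} \;:\; P(L > l) \,\le\, 1 - \alpha \,\bigr\} $$

In the worked example above, L is the one-day loss, α = 0.95 and VaR is $40 million; that is, the probability of losing more than $40 million in a day is at most 5%.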
The two main techniques used to estimate VaR are the historical simulation and
Monte-Carlo simulation methods. Historical VaR requires that a period of history is
chosen whose set of returns will “represent” a distribution or scenario set of market
conditions in the future. Under each scenario, defined by current market conditions
perturbed by each historical set of changes, the profits and losses for the current
portfolio are calculated. Monte-Carlo VaR is similar, except that Monte-Carlo simulation is used to model the future market scenarios, rather than a period of history being used directly to model market movements. A minimal sketch of the historical method is given below.
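The sketch below shows the core of the historical method: once each scenario’s portfolio profit and loss has been computed (the expensive, parallelizable step), the VaR itself is simply an order statistic of the P&L distribution. The scenario figures here are invented for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

// Historical-simulation VaR sketch: given one profit-and-loss figure per
// scenario, the VaR at the chosen confidence level is the loss exceeded
// in only (1 - confidence) of the scenarios.
double historicalVaR(std::vector<double> pnl, double confidence)
{
    std::sort(pnl.begin(), pnl.end());  // ascending: worst losses first
    size_t idx = (size_t)std::floor((1.0 - confidence) * pnl.size());
    return -pnl[idx];                   // report the loss as a positive number
}

int main()
{
    // Invented one-day P&L scenarios in $ millions; a production run would
    // hold thousands of scenarios, each a full portfolio revaluation.
    std::vector<double> pnl = { -42.0, -7.5, 3.1, 12.4, -18.9,
                                  5.6, -1.2, 9.8, -25.3, 0.4 };
    std::printf("95%% one-day VaR: $%.1f million\n", historicalVaR(pnl, 0.95));
    return 0;
}
```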
However, Monte-Carlo VaR requires a multi-factor random simulation of market
variables, and in this regard historic data is still required in order to calculate the
correlation matrices necessary to drive the simulation. The addition of derivative
products to the portfolio of instruments to be analyzed adds further complication, as
the whole portfolio may need to be fully revalued for each simulation run. This could result in multiple Monte-Carlo simulations (to value the complex instruments in the portfolio) being run within each VaR scenario (the scenarios themselves being generated by Monte-Carlo methods).
Both methods are computationally intense, but both are “embarrassingly parallel” (i.e. calculations that can easily be split and distributed within an HPC environment). The parallel nature of these calculations, along with the fact that many are performed within Microsoft Excel, makes them well suited to implementation on the Microsoft HPC Platform, with the aim of spreading the large compute load across multiple servers and processors in order to reduce calculation time. Monte-Carlo simulation also requires correlation matrices to be pre-calculated in addition to the actual simulation runs themselves. VaR is usually calculated as part of an end-of-day batch process. Whilst initial implementations may run in an hour or so, it is often the case that with portfolio growth an “overnight” process quickly evolves into something that may run for many hours, challenging the concept of overnight reporting and certainly not allowing any time for re-runs should there be any process failures or strange results needing further investigation. This pressure on VaR calculation times will only increase, as the growth in automated trading will eventually require institutions to move towards real-time risk management. Faster, real-time risk management will be the challenge of the future, a challenge that can be met using the compute scalability and fault-tolerance of the Microsoft HPC Platform.
Putting the Productivity in HPC – Some Practical Examples
Introducing High Productivity Computing
One of the key design aims of the Microsoft HPC platform is to make high performance computing available to business users and technologists alike, and in so doing to increase the productivity of all users. From a trading and risk
management perspective, business users will be able to work on a Windows desktop,
using familiar Windows applications, such as Excel, scaling seamlessly across a
Windows cluster. All this can be done using a single set of security credentials, without requiring any additional skills or training to benefit from the power of distributed computing.
From a technologist’s viewpoint, the Microsoft HPC platform offers integrated access to high performance computing through the Microsoft Visual Studio 2005 IDE, the tightly integrated security and management features mentioned above, powerful command-line scripting, a parallel debugger, and conformance to parallel computing industry standards such as MPI through the Microsoft Message Passing Interface (MS-MPI).
Time to market is a key factor within financial markets, and in this way the Microsoft
HPC Platform has been designed to address the problem of HPC deployment,
management and usage from both a business and technical viewpoint, not just a
computational perspective. Given this focus on delivering the complete solution, Microsoft suggests that High Productivity Computing is both a valid and valuable alternative definition to consider when talking about why financial markets institutions need HPC.
Productivity Example #1 – Windows CCS applied to Excel 2007
Whilst we have seen some of the business cases for using HPC in financial markets in preceding sections, it is worth stating that the majority of them share an additional common theme: many will be front-ended (or at least prototyped) with the “lingua franca” of financial markets, Microsoft Office Excel. The tight integration that is possible between Windows CCS
and Microsoft Office Excel 2007 has the potential to “democratize” access to high
performance computing. Through spreadsheet access, high performance computing
can be delivered directly to the trading, portfolio and risk managers who need it.
Microsoft Office Excel has become a mission critical application for most financial
markets organizations, where business users find that the spreadsheet is the only
environment that can quickly bring together position data, complex instrument data,
market data and pricing models, and at the same time support the most complex
analysis under a wide variety of market scenarios. Whilst Excel has been an
overwhelming success in financial markets, these same business users have been
pushing Excel to the limit and demanding enterprise performance and reliability. In
summary, this increasingly “mission critical” usage of Excel requires:

• Guaranteed execution of mission-critical calculations.
• Improved performance by completing parallel iterations on models or long-running calculations.
• Improved productivity of employees and resources by moving complex calculations off desktops.
• Transparent access to centralized spreadsheet reports and analysis.
Excel add-ins are the primary way in which derivative and fixed income pricing models
are deployed and used in financial markets. Complex spreadsheets used to price
derivatives, simulate market movements or optimize portfolio composition are compute-intensive and, as we have seen, it is invariably desirable to greatly speed up such calculations. Given the addition of the new multi-threaded calculation engine in Excel 2007 (see Appendix 2 for more detail), traders and risk managers will notice dramatic calculation speed improvements, as Excel can now distribute its cell
calculation load in parallel across the available cores and processors on the machine
it is running on. In addition, the greatly improved memory-management features of
Excel 2007 will ensure that less time is taken accessing physical disk storage for data-intensive calculation runs.
There are many cases, however, where the parallelization of the calculation load on one machine will not be sufficient. In these scenarios, Excel 2007 can be combined
with Windows Compute Cluster Server in order to deliver calculation scalability and
fault tolerance across many machines containing multiple processors and processor
cores. In order to achieve this, user-defined functions can be created and installed on
64-bit servers running WCCS. Using the new multi-threaded calculation engine of Excel 2007, simultaneous calls can be made to the cluster servers to perform remote
calculations. Since calculations may be performed in parallel by multiple servers, many complex spreadsheet calculations can be performed far quicker than before whilst, at the same time, the load on the local client machine is significantly reduced. Hence traders and risk managers can achieve faster time to market with new ideas without bringing their local desktops to a standstill as a big spreadsheet calculates.
With this solution, organizations are able to move formulas out of the Excel workbooks
and store them on a set of servers. Additionally, since the processing is moved off the
desktop, organizations have the option to lock down access to the formulas used in the calculations and provide users with visibility of the results only. This scenario requires creating a local, client-side user-defined function (UDF), packaged as an XLL, that schedules jobs on the head node via the web services API. Additionally, a server version of the UDF (or its functional equivalent) needs to be created to live on the cluster servers. The “server” UDF performs the calculations and returns the results back to Excel.
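To make the split concrete, the sketch below shows the kind of pure, stateless calculation routine that would be deployed to the compute nodes in this scenario. The thin client-side XLL stub and the job-scheduling plumbing are deliberately elided, and the function name and signature are hypothetical rather than taken from any Microsoft sample.

```cpp
#include <algorithm>
#include <cmath>
#include <random>

// Hypothetical server-side routine: a Monte-Carlo price for a European
// call under geometric Brownian motion. A client-side XLL stub with the
// same signature would forward the call to the cluster and hand the
// returned double back to the calling Excel cell.
extern "C" __declspec(dllexport)
double MonteCarloCallPrice(double spot, double strike, double rate,
                           double vol, double maturity, int paths)
{
    std::mt19937_64 gen(12345);          // fixed seed: repeatable results
    std::normal_distribution<double> normal(0.0, 1.0);

    const double drift = (rate - 0.5 * vol * vol) * maturity;
    const double diffusion = vol * std::sqrt(maturity);

    double payoffSum = 0.0;
    for (int i = 0; i < paths; ++i) {
        double terminal = spot * std::exp(drift + diffusion * normal(gen));
        payoffSum += std::max(terminal - strike, 0.0);   // call payoff
    }
    return std::exp(-rate * maturity) * payoffSum / paths;  // discounted mean
}
```

Because each call is independent and side-effect free, Excel 2007’s multi-threaded calculation engine can issue many such remote calls at once, which is what delivers the cluster-wide scalability described above.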
Productivity Example #1(a) – Excel-WCCS Adapter for Legacy XLLs
As mentioned in Example #1 above, the integration of Windows CCS with Excel 2007
requires some coding work to be done to build new “server-side” UDF functions and to
wrap existing legacy add-ins (often developed as XLLs) so that they can be accessed
in Windows CCS from Excel. In its desire to make the application of the Windows HPC Platform open and productive for all users, Microsoft has developed a
tool called the Excel-WCCS Adapter. The Excel-WCCS Adapter allows a developer to
automatically create a cluster-enabled add-in from an existing XLL and provides at
runtime a system for the distributed execution of XLL function calls on the cluster,
typically resulting in accelerated performance for supported scenarios.
The primary benefits of the Excel-WCCS Adapter include:

• No custom development is required to adapt existing XLL libraries for compute cluster deployment, provided the XLL UDFs meet the deployment criteria.
• The source code for the original XLL is not required, as the automated proxy generation process analyzes the public interface of the XLL binary component itself to create a compatible compute cluster proxy component.
• Multi-user support, where many different users can each run their own spreadsheets and call the same server-side UDFs on the compute cluster server.
• Multi-XLL support, where each spreadsheet calls UDFs in multiple XLL libraries, each of which can be distributed to the server. Mixed deployment (i.e. spreadsheets with both local and remote XLLs) is also supported.
It should be pointed out that certain deployment criteria must be met by a legacy XLL before this tool can be applied to it. Those readers interested in reading more about this productivity initiative for Excel add-ins should see the Microsoft HPC resources listed in Appendix 1.
Productivity Example #2 – Windows CCS and Legacy VBA Code
When designing Excel Services, Microsoft took the decision that “unmanaged” (non-.NET) code was not appropriate for secure, server-based spreadsheet access. Whilst the technical reasons for this approach are understandable and valid, it does leave a productivity gap for those users who depend on functionality contained within “unmanaged” Visual Basic® for Applications (VBA) code.
Whilst the Microsoft HPC platform allows for many configurations of high performance clustering behind Excel 2007, it is also possible and practical to combine legacy versions of Excel and Excel spreadsheets with Windows CCS. In this scenario, the legacy version of Excel would be installed on each compute node of the Windows CCS cluster. An application or command-line script would then marshal which jobs (calculations within spreadsheet workbooks) need to be undertaken by Excel within the cluster and where the results are to be placed. Obviously, this approach is only appropriate where the desired calculation can be parallelized. However, this is often the case where VBA code has been used to both control and implement Monte-Carlo simulations for risk management and pricing purposes.
Productivity Example #3 – Windows CCS and MATLAB
The MathWorks is one of the world’s leading developers of technical computing and
Model-Based Design software for risk managers and quants in financial markets. With
an extensive product set based on MATLAB® and Simulink®, The MathWorks provides
software and services to solve challenging problems and accelerate innovation in
financial markets.
MATLAB and Simulink enable risk managers and quants to concentrate on problem
solving, instead of spending their time programming. The MATLAB development
environment lets you interactively analyze and visualize data, develop algorithms, and
manage projects. With the Distributed Computing Toolbox and the MATLAB Distributed Computing Engine, users can tackle even larger problems than before by exploiting the computing power of a group of networked computers managed by Windows CCS.
Productivity Example #4 – A Windows CCS “Call to Action” for Systems
Developers
Given the levels of Excel usage within financial markets, it is perhaps unsurprising that
some of the preceding examples have focused on the use of Windows CCS with
Excel. This also reflects Microsoft’s belief that Excel is one of the main ways in which
business users in financial markets can begin to harness and benefit from the
computational power of HPC.
However, Excel is only one way of accessing Windows CCS, and in this regard
Microsoft has implemented very tight integration of Windows CCS with the Microsoft
Visual Studio 2005 IDE. The .NET Framework and features such as parallel
debugging of distributed calculations can help systems developers and independent
software vendors to deliver HPC-enabled applications faster than ever before. As we have seen, the market need for HPC-enabled applications is there, and Microsoft is making the transition to distributed-architecture computing as simple and straightforward as possible.
Summary
Regulation and competition are currently driving extraordinary levels of change and
innovation within financial markets. The ability to manage higher volumes of data,
greater complexity of data and faster analysis of data will be crucial in determining
which organizations succeed in the markets of the future.
High Performance Computing technology will play a key part in delivering the compute
power to deal with this data and analytics revolution, and the Microsoft HPC Platform
has been designed to maximize the productivity of users in meeting this challenge.
Leveraging familiar operating system, application and programming resources, and
combining them within one fully integrated management and security environment for
clustering, means that the Microsoft HPC Platform is an enabling technology in the
journey towards real-time pricing, trading and risk management across asset classes.
Appendix 1 - URL Links for More Information
Product Information
• http://www.microsoft.com/hpc
• http://www.microsoft.com/hpc/financial
Client Case Studies
• http://www.microsoft.com/casestudies/casestudy.aspx?casestudyid=4000000030
• http://www.microsoft.com/casestudies/casestudy.aspx?casestudyid=53917
Other Whitepapers
Overview of Windows Compute Cluster Server 2003
• http://download.microsoft.com/download/9/e/d/9edcdeab-f1fb-4670-8914-c08c5c6f22a5/HPC_Overview.doc
Windows Compute Cluster Server 2003 Reviewer’s Guide
• http://www.microsoft.com/windowsserver2003/ccs/reviewersguide.mspx
Deploying and Managing Windows Compute Cluster Server 2003
• http://go.microsoft.com/fwlink/?LinkId=55927
Improving Performance in Microsoft Office Excel 2007
• http://msdn2.microsoft.com/en-us/library/aa730921.aspx
Spreadsheet Compliance in the 2007 Microsoft Office System
• http://download.microsoft.com/download/8/d/7/8d7ea200-5370-4f23-bdca-ca1615060ec4/Excel%20Regulatory%20White%20Paper_Final0424.doc
Excel Services Technical Overview
• http://office.microsoft.com/search/redir.aspx?AssetID=XT102058301033&CTT=5&Origin=HA102058281033
Blogs
• http://blogs.msdn.com/hpc/
• http://blogs.msdn.com/excel
• http://blogs.msdn.com/cumgranosalis
• http://windowshpc.net/
Appendix 2 - Improvements in Microsoft Office Excel 2007
Microsoft has made significant investments in improving Excel for the 2007 release. With Excel 2007, users will experience a redesigned interface that makes it easier and faster to create spreadsheets that match the growing needs of the market. Here are some of the key features of Excel 2007 that can help to greatly improve the data analysis capabilities of spreadsheet users and add-in developers:

• Expanded capacity for spreadsheets. Previous releases of Excel were limited to 256 columns by 65,536 rows. Users can now create spreadsheets with columns going out to XFD (that’s 16,384 columns!) and up to 1,048,576 rows.
• Maximized memory usage. Previous releases of Excel could only take advantage of 1 GB of memory. That limit has been removed, and Excel can now take advantage of the maximum amount of memory addressable by the installed 32-bit version of Windows.
• A multi-threaded calculation engine. By default, Excel will maximize the utilization of all CPUs on a machine. For customers running multi-core, multi-CPU or hyper-threaded processors, near-linear improvements in performance can be seen, assuming the calculations parallelize and the components and environments used are multi-thread capable.
• Continued compatibility with existing Excel XLL add-ins, with the addition of an extended Excel XLL C API to fully support the functionality outlined above, plus new features such as support for strings up to 32,767 characters in length.
For more information on Excel 2007, visit: http://office.microsoft.com/excel and for
more detail on developing add-ins for Excel 2007 using the updated Excel XLL C API,
see: http://msdn2.microsoft.com/en-us/library/aa730920.aspx