post modern portfolio theory

The role of News Analytics in financial
engineering: a review and the road
ahead
Gautam Mitra
7 December 2011 London
Outline

• Introduction
    • What… Why… How.
    • A commercial
• News data
    • Data sources
    • Information Contents/Metadata
    • Summary Information/Views
    • Information/modelling architecture
• Models and Applications
    • Abnormal Returns
    • News Enhanced Trading Strategies
    • Risk Control
• Case studies
    • Risk Control
    • News Analytics Toolkit
    • Momentum study
• Summary Conclusion
WHAT
News analytics: a working definition

"News analytics refers to the measurement of the various qualitative and quantitative attributes of textual news stories. Some of these attributes are: sentiment, relevance, and novelty. Expressing news stories as numbers permits the manipulation of … information in a mathematical and statistical way."
<Taken from Wikipedia>

A news story is about an event.
WHY
The research problem = the business problem
The world of financial analytics is concerned with three leading problems:
(i) Pricing of assets in a temporal setting
(ii) Making optimum investment decisions (low frequency) or optimum trading decisions (high frequency)
(iii) Controlling risk at different time exposures
HOW
The message
The finance industry focuses on three major applications:
> High frequency: Trading strategies
> Low frequency: Investment strategies
> Risk control
By enlarging the information set with quantified news, the legacy models for the above applications can be enhanced.
Knowledge from three disciplines is required:
> Information engineering
> AI … Knowledge Engineering
> Financial Engineering
Introduction
• News
• Market Environment
• Sentiment
[Behavioural finance <greed .. fear .. irrational exuberance> …
Wall Street 1
Wall Street 2 => money never sleeps]

Introduction
[Neoclassical models for choice or decision making]
• Trading Strategies/Decisions
• Investment Decisions
• Risk Control Decisions
Introduction
R&D Challenge => Identify Killer Application

Smart investors rapidly analyse/digest information:
• News stories/announcements.
• Stock price moves (market reactions).
• Act promptly to take trading/investment decisions.

Can a machine act intelligently (AI) to compete with or outsmart humans?
Commercial

Read
The Handbook of News Analytics in Finance
By: Gautam Mitra and Leela Mitra
<for an instant understanding ...!>

<or look up http://www.bis.gov.uk/foresight/our-work/projects/current-projects/computer-trading>

The Future of Computer Trading in Financial Markets
Our report: "Automated analysis of news to compute market sentiment: its impact on liquidity and trading", by Gautam Mitra, Dan diBartolomeo, Ashok Banerjee and Xiang Yu.
News data: Data sources

Which asset classes?
• FX (Currency)
• Commodities
• Fixed income (Bonds)
• Stocks (Equities)

Wall Street proverb:
‘Stocks are stories, bonds are mathematics’
[Diagram of market participants. Labels: News Data Feed Providers; Tertiary Market Participants; Market Data Feed Providers; Customers; Institutional Customers; Broker-Dealers & Market Makers; ECN; Retail Customers; Retail Brokers & Market Makers; Exchange; Main Market Participants]
News data: Data sources

Traders [High Frequency]
Fund Managers [Low Frequency]

Desktop
• Market Data
• NewsWire
• Web <blogs, twitter, message boards>

Data Warehouse
Data Mart
News data: Data sources

Sources of news/informational flows (Leinweber)

• News: mainstream media, reputable sources.
    • Newswires to traders' desks.
    • Newspapers, radio and TV.
• Pre-News: source data
    • SEC reports and filings. Government agency reports.
    • Scheduled announcements, macroeconomic news, industry stats, company earnings reports…
• Web-based news
• Social media: blogs, websites and message boards
    • Quality can vary significantly
    • Barriers to entry low
    • Human behaviour and agendas
News data: Data sources

Financial news can be split between
• Scheduled news (Synchronous)
• Unscheduled news (Asynchronous, event driven)

Scheduled news (Synchronous)
• Arrives at pre-scheduled times
• Much of pre-news
• Structured format <XML, XBRL>
• Often basic numerical format
• Typically macroeconomic announcements and earnings announcements
News data: Data sources

Unscheduled news (Asynchronous, event driven)
• Arrives unexpectedly over time
• Mainstream news and social media
• Unstructured, qualitative, textual form
• Non-numeric
• Difficult to process quickly and quantitatively
• May contain information about the cause and effect of an event
• To be applied in quant models, it needs to be converted to an input time series
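The conversion of news items into an input time series can be sketched as follows. This is a minimal illustration, not any vendor's method: the record layout, the relevance threshold of 75 and the sentiment values are all assumptions made up for the example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical news metadata records (made-up values):
# (UTC timestamp, company id, relevance 0-100, sentiment 0-100)
news_items = [
    ("2010-01-21 21:20:08", "TM", 100, 30),
    ("2010-01-21 21:21:27", "TM", 90, 25),
    ("2010-01-22 09:05:00", "TM", 40, 60),
    ("2010-01-22 14:30:00", "TM", 80, 35),
]

def daily_sentiment_series(items, company, min_relevance=75):
    """Average sentiment per calendar day for one company, relevance-filtered."""
    buckets = defaultdict(list)
    for stamp, cid, relevance, sentiment in items:
        if cid != company or relevance < min_relevance:
            continue  # skip other companies and weakly relevant stories
        day = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").date()
        buckets[day].append(sentiment)
    # Sorted (date, mean sentiment, story count) tuples form the time series;
    # the story count is a simple proxy for news flow/intensity.
    return [(day, sum(v) / len(v), len(v)) for day, v in sorted(buckets.items())]

series = daily_sentiment_series(news_items, "TM")
```

Days without qualifying news are simply absent from the series here; a model would need its own rule for filling them.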
Information contents/Metadata
Key attributes include:
• Entity Recognition
• Relevance
• Novelty
• Event categories
• Sentiment
Pre-analysis extracts, computes or mines these attributes; text analysis and AI classifiers are then used to create sentiment scores.
This is the (news) metadata.
The news flow (its intensity) also influences the resulting sentiment.
Information/modelling architecture

[Diagram]
News inputs: Mainstream News; Pre-News; Web 2.0 / Social Media
Pre-Analysis (Classifiers & others) produces metadata:
• Entity Recognition
• Relevance
• Novelty
• Events
• Sentiment Score
News Flow/Intensity
(Numeric) financial market data
Analysis
Consolidated Data mart
Updated beliefs, ex-ante view of market environment
Quant Models
1. Return Predictions
2. Fund Management / Trading Decisions
3. Volatility estimates and risk control

Information value chain
Data … information … knowledge
Data => analysis => Data mart => quant models
Analysis .. synthesis .. mining
entity recognition
Identify entities such as companies in news stories using point-in-time sensitive information:
• Short names
• Long names
• Common abbreviations
• Common misspellings
• Securities identifiers
• Subsidiaries
Analysis .. synthesis .. mining
relevance
Calculate the relevance of a story to a given company:
• Mentions in the text
• Positioning in the story (headline vs. last paragraph)
• Total number of companies mentioned
• Detect roles played by companies in the story
• Represent the context numerically
Analysis .. synthesis .. mining
novelty
Is the news story "new" or novel?
• Elementize the various characteristics of a news story
• Distinguish between similar vs. duplicate stories
• Define a time window between stories
Example: Toyota’s Vehicle Recall (news flow in the first 30 minutes)
Scores shown beside the four stories: 100, 75, 56, 42
• 2010-01-21 21:20:08 -- PRESS RELEASE: Toyota Files Voluntary Safety Recall on Select Toyota
• 2010-01-21 21:20:08 -- News Flash: Toyota Files Voluntary Safety Recall On Select Toyota Division Vehicles
• 2010-01-21 21:21:27 -- Toyota To Recall About 2.3M Vehicles For Sticking Accelerator Pedals>TM
• 2010-01-21 21:48:10 -- DJ Toyota Recalls 2.3 Million Vehicles For Sticking Accelerators
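One way to picture a novelty score with a time window is a geometric decay over earlier similar stories. The sketch below is an assumption, not the provider's documented method: the decay factor 0.75 and the 24-hour window are chosen only because they happen to reproduce the numbers 100, 75, 56, 42 shown with the Toyota example, and the stories are assumed to have already been matched as similar and sorted by arrival time.

```python
from datetime import datetime, timedelta

def novelty_scores(timestamps, window=timedelta(hours=24), decay=0.75):
    """Score each story as 100 * decay**k, where k is the number of earlier
    similar stories that arrived within `window` before it.
    Assumes `timestamps` is sorted and all stories concern the same event."""
    times = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in timestamps]
    scores = []
    for i, t in enumerate(times):
        earlier = sum(1 for s in times[:i] if t - s <= window)
        scores.append(round(100 * decay ** earlier))
    return scores

# Arrival times of the four Toyota recall stories from the example
toyota = [
    "2010-01-21 21:20:08",
    "2010-01-21 21:20:08",
    "2010-01-21 21:21:27",
    "2010-01-21 21:48:10",
]
print(novelty_scores(toyota))  # [100, 75, 56, 42]
```

A story arriving after the window has elapsed would count no earlier stories and score 100 again, which is the role of the time window in the list above.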
Analysis .. synthesis .. mining:
event categories
Company news and events are categorized:
• Identify actionable events
• The more detailed the event, the better
• Differentiate between scheduled vs. unscheduled news events
• Distinguish between explanatory and predictive inputs
Example categories: M&A Activity, Stock Price Changes, Analyst Ratings, Bankruptcy, Revenues, Credit Ratings, Regulatory, Price Targets, Dividends, Legal Issues, Earnings, Insider Trading
Analysis ..synthesis ..mining
sentiment
Summary information and views
Thomson Reuters News Analytics
Equity coverage and available data
(i) Coverage
(ii) Equity:
    All equities          34,037  (100.0%)
    Active companies      32,719  (96.1%)
    Inactive companies     1,318  (3.9%)
Equity coverage by region
    Americas              14,785
    APAC                  11,055
    EMEA                   8,197
Equity coverage updates: updated bi-weekly for recent changes (de-listings, M&A, IPOs).
History: available from January 2003 (history kept for delisted companies; symbology changes tracked).
RavenPack News Analytics
Equity coverage by region
    All equities    28,279  (100%)
    Americas        11,950  (42.24%)
    Asia             8,858  (31.31%)
    Europe           5,859  (20.71%)
    Oceania            436  (5.08%)
    Africa             186  (0.66%)
For the most up-to-date list of supported companies, download the companies.csv file at:
https://ravenpack.com/newsscores/
Historical data:
• Data format: comma-separated values (.csv) files
• Date/time info: in Coordinated Universal Time (UTC)
• Archive range: since Jan 1, 2005
• Archive packaging: monthly .csv files compressed in .zip files on a per-year basis
Summary information

Other suppliers
• Deutsche Boerse <Alpha Flash>
• Bloomberg ‘Black box newsfeed’
• Dow Jones Elementized Newsfeed
Summary information and views
Tetlock et al. event study shows “information leakage”

Summary information and views
[Chart: Average Stock Price Reaction to Negative News Events. Source: Macquarie Quant Research, May 2009]

Summary information and views
[Chart: Average Stock Price Reaction to Positive News Events. Source: Macquarie Quant Research, May 2009]

Summary information and views
[Charts: Illustration of Seasonality (Hafez, RavenPack). Panels: RavenPack Sentiment Scores; Reuters NewsScope Sentiment Engine]
Models & Applications: (abnormal) returns

Traders and quant managers seek to identify and exploit asset mispricings before they correct, and so generate alpha.
News data can be used for:
• Stock picking and generating trading signals
• Factor models
• Exploiting behavioural biases in investor decisions
Models & Applications: (abnormal) returns

Stock picking and generating trading signals
• Sentiment reversal as a buy signal: J. Kittrell uses a sequence of P, N scores as a means of testing sentiment reversal.
• Momentum strategy enhanced by news sentiment scores: see Macquarie research; Sinha also reports results with Thomson Reuters data.
Models & Applications: (abnormal) returns
Behavioural biases
• Odean and Barber (2007) find evidence that individual investors have a tendency to buy attention-grabbing stocks.
    • Professional investors are better equipped to assess a wider range of stocks, so they are less prone to buying attention-grabbing stocks.
• Da, Engelberg and Gao also consider how the amount of attention a stock receives affects the cross-section of returns.
    • They use the frequency of Google searches for a particular company as a measure of attention.
    • They find some evidence that changes in investor attention can predict the cross-section of returns.
Models & Applications: (abnormal) returns

Stock picking and generating trading signals

Li (2006): simple ranking procedure
• … identify stocks with positive and negative sentiment
• 10-K SEC filings for non-financial firms, 1994 – 2005
• Risk sentiment measure: count the number of times the words risk, risks, risky, uncertain, uncertainty and uncertainties appear in the management discussion and analysis section
• Strategy: long in low risk sentiment stocks, short in high risk sentiment stocks
• … reasonable level of returns
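The word-count measure and the long/short ranking described above can be sketched in a few lines. This is an illustration only: it uses the raw count exactly as stated on the slide, the function names are hypothetical, and the filing excerpts are made up.

```python
import re

# The six words listed for the risk sentiment measure
RISK_WORDS = {"risk", "risks", "risky", "uncertain", "uncertainty", "uncertainties"}

def risk_sentiment(text):
    """Count occurrences of the risk words in a block of text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(1 for tok in tokens if tok in RISK_WORDS)

def long_short(mdna_by_ticker, n=1):
    """Rank tickers by risk-word count; long the n lowest, short the n highest."""
    ranked = sorted(mdna_by_ticker, key=lambda t: risk_sentiment(mdna_by_ticker[t]))
    return ranked[:n], ranked[-n:]

# Hypothetical MD&A excerpts, made up for illustration
filings = {
    "AAA": "Demand is stable and margins improved.",
    "BBB": "We face risks from uncertain demand; the risk of default is rising.",
    "CCC": "Currency risk remains, but the outlook is otherwise stable.",
}
longs, shorts = long_short(filings)
print(longs, shorts)  # ['AAA'] ['BBB']
```

A raw count favours short documents, so a practical version would probably normalise by document length or use the change in count from the previous filing; neither choice is specified here.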
Leinweber (2010): event studies based on the Reuters NewsScope Sentiment Engine
News Enhanced Algorithmic Trading
1. Information/modelling architecture
2. Modelling architecture
    • Pre-trade and post-trade analysis

Characterize asset behaviour/dynamics by
i. Asset Price/Return
ii. Asset (Price) Volatility
iii. Asset (Price) Liquidity
Construct trading models using these measures
[Diagram: Predictive Analysis Model]
• Input, Market Data: Bid, Ask, Execution price, Time bucket
• Input, News Meta Data: Time stamp, CompanyID, Relevance, Novelty, Sentiment score, Event category…
• Outputs: Price/Returns, Volatility, Liquidity

[Diagram: trading pipeline]
• Pre-Trade Analysis: Market Data Feed and News Meta Data Feed go into Predictive Analytics (Ex-Ante Decision Model)
• Automated Algo-Strategies, using (analytic) market data: price, volatility, liquidity
• Low Latency Execution Algorithms; Trade orders
• Post Trade Analysis: Market Data and News Data go into the Post Trade Analysis Report (Ex-Post Analysis Model)
Applications: Risk management
• Traditionally, historic asset price data has been used to estimate risk measures.
• When there are significant changes in the market environment, these ex post, retrospective measures fail to account for developments in the market environment, investor sentiment and knowledge.
• Traditional measures can fail to capture the true level of risk (Mitra, Mitra and diBartolomeo 2009; diBartolomeo and Warrick 2005).
• Incorporating measures or observations of the market environment in risk estimation is important.
EQUITY PORTFOLIO RISK (VOLATILITY) ESTIMATION USING MARKET INFORMATION AND SENTIMENT
Leela Mitra
Co-authors: Gautam Mitra and Dan diBartolomeo
Case study: Outline
• Problem setting
• Model description
• Updating the model using quantified news
• Study I
• Study II
• Discussion and conclusions
Introduction & background
• Tetlock et al. (2007) note there are three main sources of information:
    • Analyst forecasts
    • Publicly disclosed accounting variables
    • Linguistic descriptions of operating environments
• If the first two are incomplete, the third may give us relevant information.
• Tetlock et al. (2007) introduce “news” to a fundamental factor model.
Problem setting
Three main types of factor models:
• Macroeconomic: use economic variables as factors (Chen, Roll and Ross; Sharpe)
• Fundamental: based on firm-specific (cross-sectional) attributes (BARRA and Fama-French)
• Statistical: factors are unobservable and derived via calibration, often orthogonal
They differ on the sources of risk (uncertainty), but can be shown to be rotations of each other.
Problem setting
• Need for models to update the risk structure as the environment changes
• diBartolomeo and Warrick (2005) update covariance estimates using option implied volatility

CHANGES TO MARKET ENVIRONMENT => TRADERS REACT => CHANGES IN OPTION IMPLIED VOLATILITY => CHANGES IN ASSET COVARIANCE MATRIX

Traders respond quickly in an intelligent fashion
Model description
An extension of diBartolomeo & Warrick (2005), in two parts:
• “Basic” statistical factor model
• Factor variance estimates are updated for changes in option implied volatility

Model description
• We construct a statistical factor model using principal component analysis to find orthogonal factors
• Update the asset variances using option implied volatility data
Model description
• For each asset for which we have option implied volatility data, we wish to identify the new factor variances and asset-specific variances implied by the updated asset variances.
• Solve this set of simultaneous equations to derive the values, subject to some further conditions.
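The equations for this step are not reproduced in the text. A sketch of what the system could look like, assuming the standard variance decomposition for a linear factor model with orthogonal factors; all symbols below are assumed notation, not taken from the slides:

```latex
% Assumed notation: \beta_{ik} = loading of asset i on factor k (K factors),
% \sigma^2_{F_k} = variance of factor k, \sigma^2_{\epsilon_i} = specific variance of asset i.
% Tildes mark updated quantities; \tilde{\sigma}_i is the option implied volatility of asset i.
\begin{align}
r_i &= \sum_{k=1}^{K} \beta_{ik} f_k + \epsilon_i \\
\sigma_i^2 &= \sum_{k=1}^{K} \beta_{ik}^2 \, \sigma_{F_k}^2 + \sigma_{\epsilon_i}^2 \\
\tilde{\sigma}_i^2 &= \sum_{k=1}^{K} \beta_{ik}^2 \, \tilde{\sigma}_{F_k}^2 + \tilde{\sigma}_{\epsilon_i}^2
\qquad \text{for each asset } i \text{ with option data}
\end{align}
```

Under these assumptions there is one equation per asset with option data, say N of them, but K + N unknowns (the updated factor variances and specific variances), so the system alone is underdetermined, which is presumably why further conditions are imposed.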
Model description
Further conditions
• Allow for the structure that is expected of principal component factors
• Assume factor variances do not decline substantially from one period to the next
• Similarly, assume asset-specific variances do not decline substantially from one period to the next
Study I
• Period: 17 January 2008 to 23 January 2008
• EURO STOXX 50
• Market sentiment worsened
• Option implied volatility measures surged
• A few key events:
    • Large interest rate cut
    • George Bush announced stimulus plan
    • Soc Gen hit by Jerome Kerviel rogue trader scandal
Study I
Portfolio volatility from the option implied model
• is higher than that from the “basic” model
• rises significantly on 21 January
Study II
• Over 2008 markets fell
    • Loss of liquidity in credit markets and the banking system
    • Many banks went bankrupt or had to be propped up
• September and October 2008: volatility for financial firms particularly high
    • Lehman bankruptcy
    • Lloyds takeover of HBOS
    • Restrictions on short selling of financials

18 September 2008 to 24 September 2008

Dow Jones 30

Portfolio of three finance stocks



Portfolio of three non-finance stocks



Bank of America, CitiGroup and JP Morgan Chase
Equal weight on each stock
Johnson & Johnson, Kraft Foods and Coca Cola
Equal weight on each stock
Can the model predict impact in one sector…?
News Analytics Toolkit
Momentum Study
• RSI (Relative Strength Indicator) with a 15-day timeframe
    • U = close_now − close_previous if up period, 0 otherwise
    • D = close_previous − close_now if down period, 0 otherwise
    • RS = EMA(U, n) / EMA(D, n)
    • RSI = 100 − 100 / (1 + RS)
    • EMA = n-period Exponential Moving Average
• Asset Universe: FTSE100 and CAC40
• Daily market data from Jan 2005 to Jan 2011
Portfolio Selection:
• Ranked by the RSI Momentum Indicator
• Long only, equally weighted
• Calendar rebalancing every 60 or 90 working days
• Transaction cost: 0.2%
• Number of assets in portfolio: 10 for FTSE100, 5 for CAC40
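The RSI definition and the ranking step above can be sketched as follows. Several details are assumptions rather than the study's stated choices: the EMA smoothing constant 2 / (n + 1), seeding the EMA with the first value, the conventions for a zero denominator, and ranking highest RSI first. Transaction costs and rebalancing are left out.

```python
def ema(values, n):
    """n-period exponential moving average with smoothing 2 / (n + 1),
    seeded with the first value (both are assumed conventions)."""
    alpha = 2.0 / (n + 1)
    out = values[0]
    for v in values[1:]:
        out = alpha * v + (1 - alpha) * out
    return out

def rsi(closes, n=15):
    """RSI = 100 - 100 / (1 + RS), with RS = EMA(U, n) / EMA(D, n).
    Requires at least two closing prices."""
    ups, downs = [], []
    for prev, now in zip(closes, closes[1:]):
        change = now - prev
        ups.append(max(change, 0.0))     # U: gain in an up period, else 0
        downs.append(max(-change, 0.0))  # D: loss in a down period, else 0
    avg_up, avg_down = ema(ups, n), ema(downs, n)
    if avg_down == 0:
        # RS is undefined: treat "no losses" as 100 and "no moves at all" as 50
        return 100.0 if avg_up > 0 else 50.0
    return 100.0 - 100.0 / (1.0 + avg_up / avg_down)

def select_portfolio(closes_by_ticker, size, n=15):
    """Rank by RSI (highest first) and hold the top `size` names, equally weighted."""
    ranked = sorted(closes_by_ticker,
                    key=lambda t: rsi(closes_by_ticker[t], n), reverse=True)
    picks = ranked[:size]
    return {t: 1.0 / len(picks) for t in picks}

# Hypothetical closing prices, made up for illustration
prices = {"A": [1, 2, 3], "B": [3, 2, 1], "C": [1, 2, 1.5]}
weights = select_portfolio(prices, size=2)
```

With `size=10` for the FTSE100 universe or `size=5` for CAC40, the same function gives the long-only, equally weighted selection described in the list.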
Momentum Study
News-enhanced Momentum Strategy
• News provided by RavenPack News Score 1.5
• Revised ranking including market data and news data
    • Companies are ranked according to average sentiment score
    • Only news with Relevance ≥ 75 and within the previous 15 days is considered
    • Momentum ranking and news ranking are combined with equal weights between news sentiment score and RSI score
    • Companies with no news in the period are considered to have an average sentiment score of 50 (neutral sentiment)
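A minimal sketch of the revised ranking, under stated assumptions: the two inputs are blended as rank positions (the text leaves open whether ranks or raw scores are combined), "previous 15 days" is read as calendar days, and all function names and data are hypothetical.

```python
from datetime import date, timedelta

def average_sentiment(stories, as_of, lookback_days=15, min_relevance=75, neutral=50.0):
    """Mean sentiment over qualifying stories; neutral (50) if there are none.
    Each story is a tuple (story_date, relevance, sentiment)."""
    start = as_of - timedelta(days=lookback_days)
    scores = [s for d, rel, s in stories
              if rel >= min_relevance and start <= d <= as_of]
    return sum(scores) / len(scores) if scores else neutral

def rank_positions(values):
    """Map each key to its rank: 1 for the highest value, 2 for the next, ..."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {k: i + 1 for i, k in enumerate(ordered)}

def combined_ranking(rsi_by_ticker, stories_by_ticker, as_of):
    """Equal-weight blend of the RSI rank and the news-sentiment rank, best first."""
    sentiment = {t: average_sentiment(stories_by_ticker.get(t, []), as_of)
                 for t in rsi_by_ticker}
    r_rsi, r_news = rank_positions(rsi_by_ticker), rank_positions(sentiment)
    blended = {t: 0.5 * r_rsi[t] + 0.5 * r_news[t] for t in rsi_by_ticker}
    return sorted(blended, key=blended.get)

# Hypothetical inputs, made up for illustration
rsi_scores = {"A": 70.0, "B": 60.0, "C": 40.0}
stories = {
    "A": [(date(2010, 6, 10), 80, 30)],
    "B": [(date(2010, 6, 12), 90, 80), (date(2010, 6, 13), 50, 10)],  # second story fails the relevance filter
}
order = combined_ranking(rsi_scores, stories, as_of=date(2010, 6, 15))
print(order)  # ['B', 'A', 'C']
```

Here "C" has no news, so it receives the neutral score of 50 and sits between "B" (80) and "A" (30) in the news ranking.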
Momentum Study
[Charts: results for FTSE 100 with 90-day rebalancing; CAC 40 with 90-day rebalancing; FTSE 100 with 60-day rebalancing; CAC 40 with 60-day rebalancing]
Summary & discussions
• Applications of (semi-)automated news analytics in finance are growing in importance.
• Payback can be substantial to:
    • Investment Managers
    • Traders
    • Internal Risk Auditors
    • Regulators
Summary & discussions
Knowledge and skills from three different disciplines are required, in various degrees, to progress the field and make a substantial impact:
• Information Systems
• Artificial Intelligence
• Financial Engineering & quantitative modelling (including behavioural finance)
Thank you....


Thank you for your attention
Comments and Questions please