Presented at EDAMBA summer school, Soreze (France), 23 July – 27 July 2009

Data Sourcing, Statistical Processing and Time Series Analysis
An Example from Research into Hedge Fund Investments

Presenter: Florian Boehlandt
University: University of Stellenbosch – Business School
Supervisor: Prof Eon Smit, Prof Niel Krige
Research Title: A Risk-Return Assessment of Fund of Hedge Funds in Comparison to Single Hedge Funds – An Empirical Analysis
Contact: 14959747@sun.ac.za

‘In the business world, the rearview mirror is always clearer than the windshield’
- Warren Buffett -

Research Purpose
1. Developing accurate parametric pricing models for hedge funds and funds of hedge funds
2. Accounting for the special statistical properties of alternative investment funds
3. Providing practitioners and statisticians with a framework to assess, categorize and predict hedge fund investments

Research Approach
Research Philosophy
Positivistic, deductive research: Postulation of hypotheses that are tested via standard statistical procedures
Research Approach
Empirical analysis: Interpreting the quality of pricing models on the basis of historical data
Primary Data
External secondary data: Historic time series adjusted for data-bias effects

Data Sourcing
Data Sources
• Hedge Fund Databases (CISDM/MAR)
• Financial Databases
Risk Simulation
• Monte Carlo (Solver)
• Confidence (RiskSim)
DATA POOL → Data Treatment

Data Treatment
DATA POOL, Risk Simulation
Statistical Processing
• Excel / VBA
• Statistica
• EViews
FACTOR ANALYSIS, STATISTICAL CLUSTERING, MODEL BUILDING, STATISTICAL SIGNIFICANCE

Data Processing (1/2)
Data Import
• Extract relevant data from Access (SQL)
• Import data as Pivot table report
Data Treatment
• Test for serial correlation / data bias
• Calculate adjusted excess returns
Data Analysis
• Select funds with consistent data series
• Determine statistical model

Data Processing (2/2)
Weighting
• Estimate weighted average parameters
• Construct style indices
Comparative Analysis
• Calculate within-group variation
• Calculate between-group variation
Data Output
• Tabular display of aggregate results
• Construction of line and bar charts

Data Import
Access Database
Information
• Code
• Fund (Name)
• Main Strategy
Performance
• MM_DD_YYYY (Date)
• Yield
• Ptype (ROI or AUM)
System Information
• Leverage (Yes/No)
Excel Pivot table report

Access Database Management
1. Introduce Autonumber as primary keys
2. Define foreign keys for data queries
3. Define table relationships (one-to-many)
4. Build junction tables (many-to-many)
5. Write SQL queries to display relevant data
6. Integrate SQL in VBA code

Why Access?
• Avoiding duplicate entries
• Cross-referencing data from various sources
• Combining and aggregating different databases
• Efficient storage due to relational data management
• Queries allow for retrieval/display of specific data
• Linked-in with Microsoft VBA and Excel (data displayable as Pivot table reports)
• Searching for specific entries via SQL

Data Validity
• Consistency of performance history across different database providers
• Degree of history-backfilling bias
• Exclusion of defaulted funds/non-reporting funds from databases (survivorship bias)
• Extent of infrequent or inconsistent pricing of assets (managerial bias)

Data Bias
• Survivorship: Inclusion of graveyard funds
• Self-Selection: Multiple databases
• Database
• Instant History
• Look-ahead
• Rolling-window observation / Incubation period

Hedge Fund Categories (TASS)
Categories (diagram labels in slide order)
• Dedicated Short Bias
• Directional
• Managed Futures
• Fund of Hedge Funds
• Market Neutral
• Global Macro
• Long / Short Equity
• Equity Market Neutral
• Event Driven
• Emerging Markets
• Global Macro
• Event Driven
• Fixed Income Arbitrage
• Convertible Arbitrage

Statistical tests
• Regression Alpha
• Average Error term
• Information Ratio
• Normality (Chi-squared, Jarque-Bera)
• Goodness of fit, phase-locking and collinearity (Akaike Information Criterion, Hannan-Schwartz)
• Serial Correlation (Durbin-Watson, Portmanteau)
• Non-stationarity (unit root)

Comparative Analysis
Unbalanced ANOVA (within and between treatments)
[Diagram: four groups (Strategy 1 Leverage, Strategy 2 Leverage, Strategy 1 No Leverage, Strategy 2 No Leverage) connected by t-tests for equal means: leverage vs. no leverage, and between strategies]

Empirical Findings
• The accuracy of pricing models could be significantly improved when accounting for special statistical properties of hedge funds (non-normality, non-linearity)
• Hedge fund performance can be attributed to location choice as well as trading strategy
• A limited number of principal components explains a significant proportion of cross-sectional return variation

Literature Review
• Hedge Fund Linear Pricing Models
– Sharpe Factor Model (Sharpe, 1992)
– Constrained Regression (Otten, 2000)
– Fama-French Factor Model (Fama, 1992)
• Factor Component Analysis (Fung, 1997)
• Simulation of Trading component (lookback straddle)

Prediction Models
[Diagram labels in slide order: AR, GLS, PCA, Polynomial Fitting, Constrained, ARMA, Univariate, Taylor Series, Lagrange, ARIMA, Multivariate, Higher Co-Moments, KKT, Conditional Simulation]

Sources
Fama, E.F. & French, K.R. 1992. The Cross-Section of Expected Stock Returns. Journal of Finance, 47(2), June, 427-465. [Online] Available: http://links.jstor.org/sici?sici=00221082%28199206%2947%3A2%3C427%3ATCOESR%3E2.0.CO%3B2-N

Fung, W. & Hsieh, D.A. 1997. Empirical characteristics of dynamic trading strategies: the case of hedge funds. Review of Financial Studies, 10(2), Summer, 275-302. [Online] Available: http://faculty.fuqua.duke.edu/~dah7/rfs1997.pdf

Otten, R. & Bams, D. 2000. Statistical Tests for Return-Based Style Analysis. Paper delivered at EFMA 2001 Lugano Meetings, July. [Online] Available: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=277688

Sharpe, W.F. 1992. Asset allocation: management style and performance measurement. Journal of Portfolio Management, Winter, 7-19. [Online] Available: www.uic.edu/classes/fin/fin512/Articles/sharpe.pdf
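
Illustration for the "Data Import" and "Access Database Management" slides. The slides describe extracting data from Access via SQL and displaying it as a pivot table, but show no query. The following is only a sketch of that idea, using Python's built-in sqlite3 module as a stand-in for Access. The table names (Information, Performance), the sample funds, and all figures are hypothetical; only the field names come from the slides.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Information (
            Code INTEGER PRIMARY KEY AUTOINCREMENT,  -- stand-in for Access Autonumber
            Fund TEXT NOT NULL,
            MainStrategy TEXT NOT NULL,
            Leverage TEXT NOT NULL                   -- 'Yes' or 'No'
        );
        CREATE TABLE Performance (
            Code INTEGER NOT NULL REFERENCES Information(Code),  -- foreign key
            MM_DD_YYYY TEXT NOT NULL,  -- stored as ISO text here so it sorts by date
            Yield REAL NOT NULL,
            Ptype TEXT NOT NULL        -- 'ROI' or 'AUM'
        );
    """)
    conn.executemany(
        "INSERT INTO Information (Fund, MainStrategy, Leverage) VALUES (?, ?, ?)",
        [("Fund A", "Global Macro", "Yes"), ("Fund B", "Event Driven", "No")],
    )
    conn.executemany(
        "INSERT INTO Performance (Code, MM_DD_YYYY, Yield, Ptype) VALUES (?, ?, ?, ?)",
        [
            (1, "2009-01-31", 0.012, "ROI"),
            (1, "2009-01-31", 150.0, "AUM"),
            (2, "2009-01-31", -0.004, "ROI"),
            (1, "2009-02-28", 0.007, "ROI"),
        ],
    )
    return conn


def pivot_roi(conn):
    """Join both tables, keep ROI rows, and pivot to {date: {fund: yield}}."""
    rows = conn.execute("""
        SELECT p.MM_DD_YYYY, i.Fund, p.Yield
        FROM Performance AS p
        JOIN Information AS i ON i.Code = p.Code
        WHERE p.Ptype = 'ROI'
        ORDER BY p.MM_DD_YYYY, i.Fund
    """).fetchall()
    table = {}
    for date, fund, yld in rows:
        table.setdefault(date, {})[fund] = yld
    return table


if __name__ == "__main__":
    for date, funds in pivot_roi(build_demo_db()).items():
        print(date, funds)
```

The date-by-fund layout mirrors what an Excel pivot table report would show. In Access the same SELECT could be issued from VBA, as step 6 of the management list suggests.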
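
Illustration for the "Statistical tests" slide. Two of the listed diagnostics, Durbin-Watson (serial correlation) and Jarque-Bera (normality), have simple closed forms. The sketch below uses the standard textbook definitions in plain Python. It is not taken from the presentation, which names Excel/VBA, Statistica and EViews as the tools used, and the sample series are hypothetical.

```python
import math


def durbin_watson(residuals):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2).

    Values near 2 suggest no first-order serial correlation; values
    toward 0 or 4 suggest positive or negative correlation respectively.
    """
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    den = sum(e ** 2 for e in residuals)
    return num / den


def jarque_bera(returns):
    """Return (JB statistic, asymptotic p-value) for a return series.

    JB = n/6 * (S^2 + (K - 3)^2 / 4), with sample skewness S and kurtosis K.
    Under normality JB is asymptotically chi-squared with 2 degrees of
    freedom, whose upper-tail probability is exp(-JB / 2).
    """
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return jb, math.exp(-jb / 2.0)


if __name__ == "__main__":
    # Hypothetical monthly excess returns, for illustration only.
    sample = [0.012, -0.004, 0.007, 0.021, -0.015, 0.003, 0.009, -0.002]
    print("Durbin-Watson:", durbin_watson(sample))
    print("Jarque-Bera:", jarque_bera(sample))
```

The chi-squared p-value is an asymptotic approximation and can be unreliable for short fund histories, so a small-sample result should be read with caution.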
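
Illustration for the "Data Processing (2/2)" and "Comparative Analysis" slides. Within-group and between-group variation are the two sums of squares of a one-way ANOVA, and the decomposition holds when groups differ in size (the unbalanced case). The sketch below assumes a single grouping factor. The group names and values are hypothetical, and the slides' two-factor layout (strategy by leverage) would need an extension of this.

```python
def anova_decomposition(groups):
    """One-way ANOVA sums of squares for groups of possibly unequal size.

    groups: dict mapping a group label to a list of observations.
    Returns between-group SS, within-group SS, and the F statistic.
    """
    all_values = [x for values in groups.values() for x in values]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total

    ss_between = 0.0
    ss_within = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        # Between: each group's mean deviation, weighted by group size.
        ss_between += len(values) * (group_mean - grand_mean) ** 2
        # Within: spread of observations around their own group mean.
        ss_within += sum((x - group_mean) ** 2 for x in values)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "f_stat": ms_between / ms_within,
    }


if __name__ == "__main__":
    # Hypothetical parameter estimates per strategy group (unequal sizes).
    demo = {"Strategy 1": [1.0, 2.0, 3.0], "Strategy 2": [4.0, 6.0]}
    print(anova_decomposition(demo))
```

With only two groups, the F statistic equals the square of the pooled-variance t statistic, which matches the pairwise "t-test for equal means" comparisons named on the slide.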