IMBA THESIS WORKSHOP
CLASS 2
P. SCHUHMANN, SPRING 2013

LECTURE MATERIAL BASED ON THE WORK OF STEVEN GREENLAW:
DOING ECONOMICS: A GUIDE TO UNDERSTANDING AND CARRYING OUT ECONOMIC RESEARCH, STEVEN A. GREENLAW, 2006. HOUGHTON MIFFLIN CO.
AVAILABLE FOR PURCHASE HERE: HTTP://WWW.AMAZON.COM/DOING-ECONOMICSUNDERSTANDING-CARRYING-ECONOMIC/DP/0618379835

THESIS OUTLINE
• Title
• Abstract
• Table of contents
• Acknowledgements
• Introduction
• Literature review
• Theory
• Data
• Methods
• Results
• Discussion & conclusions
• References
• Appendices

SUMMARIZING RESEARCH
• What is the research topic?
• What is the research question (or questions)?
• Is this research question problem-oriented or descriptive?
• What is the hypothesis?
• What is the basis of this hypothesis?
• What data are used?
• What methodology is used?
• What are the results?
  • I.e., what are the facts associated with testing the hypothesis?
• What knowledge is created?
  • Using logic, theory and intuition, what is the meaning of the facts?
• What other interesting questions can be investigated?
  • Can you suggest exploratory or confirmatory research that might be associated with the research questions addressed in this study?

QUESTIONS TO GUIDE YOUR ANNOTATED BIBLIOGRAPHY (A.B.)
You should attempt to answer the following for each paper you read (1-3 sentences for each):
• Who? (full citation)
• What? (what are the research questions?)
• Why? (why is this important?)
• How? (how was the research question addressed? i.e., what data and methods were used?)
• What? (what were the main findings?)
• Why? (why is this important?)

READING AND SUMMARIZING
• What methodology is used?
• What if I don’t understand what the authors did?
  • Chances are that you will encounter several papers where you do not fully understand the methodology. This is normal.
  • If the paper is published, then it is very likely that the authors were able to convey the purpose of their study and some basic idea of the methods.
  • Summarize what you can.
  • Ask your committee members for guidance.

ARE A LIT REVIEW AND AN A.B. THE SAME?
• No.
• An A.B. is a list of sources and summaries of those sources that you reviewed in order to understand your topic.
• A literature review is a summary of the work that has been done in your topic area, written with the goal of providing the reader with the information needed to understand:
  • The major findings in the area
  • The approaches (and data) employed
  • Any deficiencies in those studies
  • The importance of your topic

LIT REVIEW
• The purpose of the literature review is to provide justification for your research.

ELEMENTS OF GOOD WRITING
Use the literature
• Seek out published confirmation of your thoughts, ideas and assertions.
• “Facts” should always be cited, unless they are common knowledge.

ELEMENTS OF GOOD WRITING
• Always give credit for intellectual property.
• Failure to do so is plagiarism.
• The key to avoiding plagiarism is to keep very careful records of everything you read and everything you write.
• When in doubt, cite it.

CITING YOUR SOURCES
• One or two authors:
  • Smith (2001) notes that …
  • Smith and Jones (2008) suggest …
  • Empirical analysis of the relationship between environmental quality and travel demand includes applications in Asia (Lee and Phan, 2009), the Caribbean (Schuhmann, 2010; Oxenford and Mahon, 2007), and the U.S. (Dant, 2001; Ainsley and Medford, 2005).

CITING YOUR SOURCES
• More than two authors:
  • Smith et al. (2001) note that …
  • et al. is an abbreviation for the Latin phrase et alia, which means “and others.”
  • The period comes after “al.”
• Provide the full citation in the reference section.

HOW TO CITE A SOURCE
Reference sources of factual information or opinion:
• “The protection of biological resources maintains essential ecosystem services that, while not explicitly represented in GDP (de Groot et al., 2002), serve to attract foreign exchange to developing nations via tourism (e.g.
Troëng and Drews, 2004), and significantly contribute to human health and quality of life (McField and Kramer, 2007).”
• “Lack of formal teacher training and the ingrained tradition of conventional methods of teaching and assessment may also create barriers to change in the classroom (Sunal et al., 2001).”

HOW TO CITE A SOURCE
Reference facts, positions or arguments to help motivate your argument:
• “Kwok (2006) and Weber (1998) suggest that financial systems differ across countries because of different perceptions of risk.”
• “Graham et al. (2010) show that …”

HOW TO CITE A SOURCE
Reference general methodology:
• “The random utility modeling framework for describing site-choice decisions is well established in the literature (see, e.g., Bockstael et al. 1987; Bockstael et al. 1989; Kaoru et al. 1995).”

HOW TO CITE A SOURCE
Reference modeling particulars:
• “In a framework similar to those employed by Bunn et al. (1992), Mixon and Mixon (1996), and Mixon (1996), we seek to determine the degree to which macroeconomic movements impact firm investment behavior, ceteris paribus.”
• “Following Burrus et al. (2009) we also control for differences in firm size using the log of the number of employees.”

HOW TO CITE A SOURCE
Reference source conclusions to support your conclusions:
• “These results are also supported by those found by Baird (1980), Kerkvliet (1994), McCabe and Bowers (1996), and McCabe and Trevino (1997), but stand in contrast to the results found by Kermit et al. (1990). Notably, we show …”

HOW TO CITE A SOURCE
Reference sources for additional information:
• “There is currently no ongoing teacher training program in economics, and past efforts have varied in terms of intensity of experience and frequency. For more information on such programs see Salemi (2010), Goodman et al. (2003) and Salemi et al. (1996).”

ELEMENTS OF GOOD WRITING
• You should rarely (if ever) use direct quotes in your paper.
• The effort required to paraphrase someone else’s ideas is an important part of gaining a command over the literature.
• There is almost always an alternative way of expressing a thought.
• You want to put the work of others into the context of your work.

ELEMENTS OF GOOD WRITING: ACTIVE OR PASSIVE VOICE?
Which of these styles of writing seems more appropriate for your thesis (or professional research article)? A or B?
A. “We measure the importance of several factors influencing biotech valuation.”
B. “The importance of several factors influencing biotech valuation was measured.”

A. “I hypothesize that firms with greater drug approval rates will have higher valuations.”
B. “It is hypothesized that firms with greater drug approval rates will have higher valuations.”

ELEMENTS OF GOOD WRITING: ACTIVE OR PASSIVE VOICE?
The active voice is clearer and more honest.
• “We collected data from …”
• “We hypothesize that …”
• “We estimated the following model …”
• “We can conclude that …”
• “I …” instead of “We …” is perfectly fine (especially for a thesis, which by definition has a single author), but less common.

ELEMENTS OF GOOD WRITING
• Keep it simple.
• Avoid run-on sentences.
  • Can a sentence be divided into 2 simpler sentences?
• Avoid emotion.
• Avoid subjective language.
  • Is this a matter of opinion or a matter of judgment?
• Avoid questions.
  • But how important is beach width to Caribbean tourists?

PAIR UP
• Choose a topic in finance.
• What might be an interesting research question in this topic area?
  • Can you be more specific?
• How might you introduce the research question (the inverted triangle) to a novice reader on the subject?
• What data might you need to collect to examine the research question?

DATA
• What is the concept or relationship in question?
• How can the concept (or variables in the relationship) be measured?
• What are the appropriate units of measure?
• What is an appropriate sample frame or time frame?
DATA
• Cross-section
• Time series
• Panel/longitudinal

TIME SERIES FREQUENCIES
• Quarterly
  • GDP
  • Profits (Revenues, Costs)
  • Productivity measures
• Monthly
  • Personal income measures
  • Sales measures
  • Price indices (CPI, PPI)
• Weekly
  • Money measures
• Daily
  • Interest rates
  • Stock prices
  • Exchange rates

TIME SERIES FREQUENCIES
• How can you combine time series collected at different frequencies?

TIME SERIES FREQUENCIES
• Example: Suppose you have total sales data (measured monthly) and GDP data (measured quarterly).
  • Note: GDP = total value of production
  • How could you combine these time series?
• Suppose you have a measure of total output (measured quarterly) and average price (measured monthly).
  • How could you combine these time series?

SOURCES OF DATA
• Census Bureau (www.census.gov)
  • Population statistics for the U.S.
  • Statistical Abstract of the U.S.
  • County and City Data Book
  • Population census (every 10 years)
  • Economic census (every 5 years)
  • Annual Survey of Manufactures
  • American Housing Survey

SOURCES OF DATA
• Bureau of Economic Analysis (www.bea.gov)
  • Major macro indicators for the U.S.
  • National income and product accounts (components of GDP and associated price indices)
• Bureau of Labor Statistics (www.bls.gov)
  • Employment, productivity and consumption data
• The Federal Reserve (http://www.federalreserve.gov/econresdata/)
  • Interest rates, exchange rates, money, public debt, bank assets & liabilities, corporate debt

SOURCES OF DATA
• IMF (http://www.imf.org/external/fin.htm)
  • International Financial Statistics (good source of financial data for IMF member countries)
• EuroStat (http://epp.eurostat.ec.europa.eu/portal/page/portal/statistics/themes)
  • Official statistical agency for the EU (fee for historic data?)
• Others: World Bank, Organization for Economic Cooperation & Development (OECD), UN Agencies

DATA
• What if I cannot find the variable I’m looking for?
  • Can you find or create a proxy?
    • Does it measure the same behavior?
    • Is it highly correlated with the variable of interest?
  • If no proxy is available, can you reformulate the hypothesis given the available data?

EXCHANGE RATES
Official exchange rates vs. PPP exchange rates
• To properly use GDP as a measure of economic well-being, we must consider differences in purchasing power.
• E.g., comparing per capita GDP in the U.S. to per capita GDP in Dominica is an inaccurate way to compare well-being, because prices are much lower for many goods in Dominica.

OFFICIAL EXCHANGE RATES VS. PPP EXCHANGE RATES
We need equality of purchasing power for these comparisons to be meaningful.
• The official exchange rate between two countries may not be an accurate measure of purchasing power parity (PPP).

OFFICIAL EXCHANGE RATES VS. PPP EXCHANGE RATES
• Official exchange rates are determined by the supply and demand for currencies.
• The supply and demand for currencies comes from the supply and demand for goods and services that can be purchased with those currencies and traded over international borders (“tradable goods”).
• The prices of these “tradable goods” will eventually equalize across nations due to the forces of supply and demand: “the law of one price.”

OFFICIAL EXCHANGE RATES VS. PPP EXCHANGE RATES
• True PPP depends not only on the prices of “traded goods”, but also on the prices of goods not traded internationally, like meals, haircuts, bus rides, land, housekeeping services, etc.
• The prices of these “non-traded” goods are in large part determined by unit labor costs, which tend to be lower in poorer countries.

OFFICIAL EXCHANGE RATES VS. PPP EXCHANGE RATES
• PPP exchange rate = ratio of the price of a basket of (traded and non-traded) goods in nation a vs. nation b

OFFICIAL EXCHANGE RATES VS. PPP EXCHANGE RATES
• For “rich” nations like Japan, the U.S. and Germany, GDP converted at the official exchange rate is a reasonable approximation of GDP converted at the PPP exchange rate.
• The difference between the PPP exchange rate and the official exchange rate will be larger for poorer, less developed nations.
• Converting at the official exchange rate therefore underestimates living standards in poorer countries when they are measured in the currency of a richer country.

REAL VS. NOMINAL
• When dealing with price data or interest rate data, be sure to differentiate between nominal and real:
  • Real interest rate = nominal interest rate − inflation rate
    E.g., bond yield = 6%, inflation = 2%, real interest rate = 4%
  • Real price = nominal price / price index (with the index scaled so that the base year = 1, e.g. CPI/100)
  • Real price = nominal price / (1 + % increase in prices since base year)

REAL VS. NOMINAL
E.g., price in (base) year 2000 = $50 (CPI = 100)
• Price in 2001 = $60
• Did you earn 20%?
• Inflation = 2% => (CPI = 102)
• Real price in 2001 is 60/1.02 = $58.82
• Change in real price is $8.82 (17.65%)

WHAT BELONGS IN THE DATA SECTION OF YOUR THESIS?
• An explanation of what data you are using
  • Where did the data come from?
  • What variables are measured? (this could be shown in a table of variable names and definitions)
  • What time period is covered? Or, when were the data collected?

WHAT BELONGS IN THE DATA SECTION OF YOUR THESIS?
1. An explanation of what data you are using:
“The data in this study include daily closing price indices of Shanghai A share (SHA), Shenzhen A share (SZA), Shanghai B share (SHB), Shenzhen B share (SZB) and Hong Kong Hang Seng China Enterprises Index H-Shares from the first quarter of 1992 through the fourth quarter of 2012.”

AN EXPLANATION OF THE DATA YOU ARE USING:
“In order to test the hypotheses noted above, we use measures of monetary aggregates, price levels and real GDP from 23 European nations for the period 1998-2011. All data are quarterly, except personal income, which was converted to a quarterly average from monthly data.
Variable names, definitions and sources are reported in Table 1.”

“Data on mergers and acquisitions between U.S. corporations were collected from Bloomberg using the following criteria: …”

WHAT BELONGS IN THE DATA SECTION OF YOUR THESIS?
2. A description of how you treated or modified the data.
• Are there missing observations?
• Are there outliers? How were they treated?
• Did you convert nominal to real?
• Did you convert monthly to quarterly (etc.)?
• Did you create new variables? (e.g., indicator variables, indices)

DESCRIBING HOW DATA WERE TREATED
• “Firms in our sample have Standard Industrial Classification (SIC) codes between 2830 and 2836. These firms discover, develop, produce, and sell drugs for the treatment or diagnosis of human diseases; 199 firms meeting this criterion were initially identified; 29 firms were omitted due to incomplete or inaccessible financial data and seven firms were excluded due to incomplete or inaccessible clinical trial data.”

DESCRIBING HOW DATA WERE TREATED
“Monthly returns of international equity indices were averaged for the ten-year period 2000-2010, and closing prices expressed in local currency were used to compute returns. For countries with more than one equity index, we select the capitalization-weighted index that best represents the country’s overall stock market. We estimate volatility as the standard deviation of the monthly stock index returns over the 2000-2010 sample period. We estimate the risk-adjusted return using the Sharpe ratio calculated as:

$$\text{Sharpe Ratio} = \frac{\text{Average index return} - \text{Average risk-free rate}}{\sigma}$$

The numerator of the Sharpe ratio is the average excess return over a risk-free benchmark. We use the average risk-free rate per country from 2000 to 2009 which we retrieved from the United Nations database.
Homogeneous data were difficult to gather, so for some countries, the money market rate was chosen as the risk-free rate while for others we use the three-month government borrowing rate.”

WHAT BELONGS IN THE DATA SECTION OF YOUR THESIS?
3. A description of the distribution of important variables, including descriptive statistics.
• Be sure to note measures of central tendency, dispersion and spread for key variables.
• Refer back to theory/hypotheses to justify the measurement or inclusion of variables.
• Include a table of descriptive statistics.
• Consider a histogram or (smoothed) series plot for your key variable(s) of interest.

DESCRIBING THE DISTRIBUTION
• “The typical size of boards of directors in our sample is approximately eight individuals, with an average of 3.45 business experts, 1.47 financiers and 1.3 medical doctors per board. There is an average of nearly seven males in each group and roughly 42 percent of the boards in our sample are chaired by the CEO. On average, boards contain approximately 1.5 members who are current or previous employees of the company.”
• “Our sample of divers contains 195 persons from Tobago and 165 persons from Barbados. Descriptive statistics are shown in Table 1. Divers in the sample are predominantly male and highly educated, mostly from the UK or the US. Approximately 60 percent of the sample had visited the island where they were diving on a prior occasion, or had traveled to the Caribbean previously. …
  … Divers reported encountering an average of less than four other divers at the dive site and approximately one-third of the sample reported viewing no other divers at the site. Only 23 divers (6.4 percent) reported viewing more than 11 other divers at the site and only 4 divers (1 percent) reported viewing more than 15 other divers.
  With more than 7.5 years of diving experience on average, our sample contains many experienced divers. Yet, 10 percent of the sample claimed zero years of diving experience and 32 percent indicated having no formal scuba certification. …
  … Following the approach of Dearden et al. (2006), we construct a diver specialization index that ranges from 0 to 10 a priori using responses to 7 questions in our survey. Scoring for the index is shown in Table 2 and the frequency distribution of index values is shown in Table 3.”

IMPORTANT TABLES AND FIGURES
• Table: Variable names and definitions (and sources?)
• Table: Descriptive statistics
• Table: Correlation coefficients
• Figure: Histogram for key variable(s)
• Figure: Smoothed time series plot for key variable(s)

The Theory / Hypotheses Section
• The purpose of this section is to present a theoretical analysis (the logical argument) of the problem you are investigating.
• Explain how your problem can be viewed as an application of relevant theory (econ theory?) or results from the literature.
• Describe your theoretical model.

The Theory / Hypotheses Section
• Why is this section important?
  • The purpose of any empirical investigation is to attempt to validate a particular hypothesis (or set of hypotheses).
  • It is therefore important that you communicate this hypothesis to the reader so that they can put the results in context.

EXAMPLE
Within the revealed preference literature, while there has been considerable research investigating various representations of expected catch (McConnell et al. 1995), there has been considerably less attention given to expected congestion. Furthermore, the effects of accounting for congestion on compensating variation measures for changes in site quality or access price within this framework have yet to be explored.
To further illustrate some of these important issues within the RUM framework, consider the following explicit linear representation of the conditional indirect utility function:

$$V_{ijt} + \varepsilon_{ijt} = \beta\, tc_{ij} + \delta\, c^{e}_{jt} + \lambda\, q^{e}_{jt} + \varepsilon_{ijt} \qquad (3)$$

where $V$ is the conditional indirect utility for individual $i$, $c^{e}$ and $q^{e}$ are the expected catch and congestion, respectively, of visiting site $j$ at time $t$, and the other variables are defined above.

EXAMPLE
Real estate investor sentiment surrounding periods of recurring hurricane landfalls is an attractive topic for research, especially in the area around Wilmington, NC, where four hurricanes made landfall between 1996 and 1999. Adjacent to discoveries of a real estate market “recovery” in this area of southeastern North Carolina since the unprecedented series of hurricane landfalls in the late 1990s, we test a series of empirical expectations. First, we affirm the findings of Graham, Hall, and Schuhmann (2007) where home prices rebound in the years following Hurricane Floyd, the last major storm to hit the region in 1999. Second, we assemble metrics to proxy for investor sentiment, and use those metrics to illustrate the market’s improving sentiment since early this century. The first metric we consider is the spread between listing and selling prices. Our premise is that spreads between listing and selling prices increase as home-buyer sentiment changes with perceptions of increased exposure to hurricanes and catastrophic risk. This expanding spread is affirmed by Graham and Hall (2002). Extending those findings, we expect the spread to narrow in the years following Floyd. Home buyers become less willing to purchase at current prices, ceteris paribus, due to expectations of increasing future hurricane losses. As a result, sellers are forced to provide some price concession to compensate buyers for the assumption of additional risk.
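
A minimal sketch of how the listing-selling spread from the example above might be computed. The excerpt does not say how the spread is defined, so the percentage definition used here (listing price minus selling price, divided by listing price), the function name `listing_selling_spread`, and all prices are assumptions made purely for illustration. They are not the authors' method or data.

```python
from statistics import mean


def listing_selling_spread(listing_price: float, selling_price: float) -> float:
    """Spread as a fraction of the listing price (an assumed definition)."""
    if listing_price <= 0:
        raise ValueError("listing_price must be positive")
    return (listing_price - selling_price) / listing_price


# Hypothetical (listing, selling) price pairs grouped by sale year.
# These numbers are invented for illustration only.
sales_by_year = {
    1999: [(200_000, 180_000), (150_000, 138_000)],
    2003: [(210_000, 201_600), (160_000, 155_200)],
}

# Average the per-sale spreads within each year.
average_spread_by_year = {
    year: mean(listing_selling_spread(listing, selling) for listing, selling in sales)
    for year, sales in sales_by_year.items()
}

for year, spread in sorted(average_spread_by_year.items()):
    print(f"{year}: average spread = {spread:.1%}")
```

With these made-up prices the average spread is smaller in the later year, which is the pattern the example expects if buyer sentiment improves after the storms. A thesis would state the chosen definition of the spread explicitly in the data section.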
EXAMPLE
Based on results in the literature and economic theory of demand, we hypothesize that tourism demand is a function of the explicit and implicit costs of travel, individual demographics and destination quality. More formally, travel demand can be estimated as:

$$\log(v_i / p_i) = \beta_0 + \beta_1 TC_i + \beta_2 Q_i + \sum \left(\beta_k X_k + \dots + \beta_j X_j\right) \qquad (1)$$

where
$v_i$ = total visits from zone $i$,
$p_i$ = population of zone $i$,
$TC_i$ = round trip travel cost from zone $i$ (explicit + time cost),
$Q_i$ = measure of coastal quality for respondents from zone $i$ (response or instrument), and
$X_k, \dots, X_j$ = demographic characteristics of respondents from zone $i$.

EXAMPLE
In order to examine the relationship between student characteristics and economic knowledge acquisition, scores on the economics portion of the survey serve as the dependent variable. Note that the pre- and post-course survey results can be examined individually or together by calculating the difference in correct answers between the pre- and post-course surveys. In the former case, the variable we wish to explain is constrained to be zero or a positive integer, hence a count data model will be appropriate for estimation. In the latter, the variable of interest (change in score) can be positive or negative; hence more traditional regression methods will suffice for estimation.

EXAMPLE CONT’D
Poisson regression models provide a standard framework for the analysis of count data when a majority of the data falls in the lower end of the distribution (i.e., 0, 1, 2, …). The Poisson distribution determines the probability of a count:

$$P(y_i) = \text{Prob}[y_i = j] = \frac{\exp(-\lambda_i)\,\lambda_i^{\,j}}{j!}, \qquad j = 0, 1, 2, \dots \qquad (1)$$

where the standard formulation for $\lambda_i$ is:

$$\lambda_i = \exp(\beta' x_i) \qquad (2)$$

In order to examine the relationship between student characteristics and pre- and post-course scores on the economics questions in our survey, we estimate the following equations using a Poisson specification for both the pre-course survey results and the post-course results (variable definitions are provided in Table 1):

(Model 1)
$$
\begin{aligned}
Y_i ={}& \beta_0 + \beta_1\,\text{RSURVEY}_i + \beta_2\,\text{BUSINESS}_i + \beta_3\,\text{OTHER}_i + \beta_4\,\text{HSECON}_i + \beta_5\,\text{MACRO}_i \\
&+ \beta_6\,\text{MAC HAD MIC}_i + \beta_7\,\text{MIC HAD MAC}_i + \beta_8\,\text{MIC HAD SURVEY}_i + \beta_9\,\text{MAC HAD MIC AND SURVEY}_i \\
&+ \beta_{10}\,(\text{TMATHPRE}_i \text{ or } \text{TMATHPOST}_i) + \beta_{11}\,\text{MWC}_i + \beta_{12}\,\text{UNCW}_i + \beta_{13}\,\text{UN}_i
\end{aligned}
$$

and

(Model 2)
$$
\begin{aligned}
Y_i ={}& \beta_0 + \beta_1\,\text{RSURVEY}_i + \beta_2\,\text{BUSINESS}_i + \beta_3\,\text{OTHER}_i + \beta_4\,\text{HSECON}_i + \beta_5\,\text{MACRO}_i \\
&+ \beta_6\,\text{MAC HAD MIC}_i + \beta_7\,\text{MIC HAD MAC}_i + \beta_8\,\text{MIC HAD SURVEY}_i + \beta_9\,\text{MAC HAD MIC AND SURVEY}_i \\
&+ \beta_{10}\,\text{Q1}_i + \beta_{11}\,\text{Q2}_i + \beta_{12}\,\text{Q3}_i + \beta_{13}\,\text{Q4}_i + \beta_{14}\,\text{Q5}_i + \beta_{15}\,\text{Q6}_i + \beta_{16}\,\text{Q7}_i + \beta_{17}\,\text{Q8}_i \\
&+ \beta_{18}\,\text{MWC}_i + \beta_{19}\,\text{UNCW}_i + \beta_{20}\,\text{UN}_i
\end{aligned}
$$

METHODS
• The methods section may be combined with the theory section or it may be combined with the data section.
  • “Theory and methods”
  • “Materials and methods”
• In this section you specify the exact model that you are going to estimate.
  • E.g., “Using OLS regression, we estimate the following version of equation (2):”

The Introduction
• The introduction is often one of the last things you write (abstract will probably be the very last thing).