stata hands on session

SIMULATIONS MODELS FOR INTERNATIONAL TRADE
GRAVITY EQUATIONS FOR INTERNATIONAL TRADE MODELS
Paris-Dauphine / September 2015

DOCUMENT 2: STATA HANDS ON SESSION
Ramón Mahía – UAM
(Based on the material provided by UNCTAD-WTO) [1]

Complete modified and commented DO file: DO_MODIFIED_COMMENTED

1.- MANIPULATION OF DATA (Previous to econometric estimation)
(Steps 1 to 7 as described in Chapter 3 – UNCTAD/WTO.)

Several operations must be performed before estimation (see DO_MANIPULATION_COMMENTED):
- Download the datasets from their sources and import them into a single software format (Stata dta, EViews wf, ...)
- Homogenize the formats of the different datasets: list of countries, names for countries, names for variables, "names" for years
- Replace missings (zeros for trade, a functional code such as 999 for real missings, ...)
- Generate the structure of the gravity model dataset: all possible combinations of countries (and years, if a panel is used)
- Merge the different files into a single one
- Generate dummies (if needed) for year, country, and year x country
- Compute log variables (for GDP, trade and distance)

Step 1:
- Import the CSV trade flows (tradeflows.csv), label variables and save to .dta
- Import the txt file “joinwto.txt” with the year of accession for each country and save it in .dta format
- Import the CSV file “GDP.csv” with GDP data for each country from 1960 to 2006, replace BELGIUM and LUXEMBOURG by BENELUX, compute the BENELUX GDP as the sum of both countries, change the names of the year variables and save it in .dta format
- Open the Stata data file containing the rest of the explanatory variables, fix the BENELUX problem, change some variable names, label some other variables and save it in .dta format

[1] IMPORTANT NOTE: The content of this document, and especially the exercise section, is based on the document prepared by UNCTAD-WTO entitled “A Practical Guide to Trade Policy Analysis” (Chapter 3: Analyzing bilateral trade using the gravity equation).
To access the online version of this UNCTAD-WTO document, visit the web page: http://vi.unctad.org/tpa/index.html

Basically, at the end of Step 1, four different Stata files are created and stored in the default directory:
1. tradeflows.dta (endogenous variable): a panel dataset for YEARS and PAIRS of countries, in LONG format
2. joinwto.dta (for the explanatory variable “wtoaccesion”): a cross-section dataset for INDIVIDUAL countries
3. GDP.dta (for the GDP explanatory variables): a panel dataset for YEARS and INDIVIDUAL countries, in WIDE format
4. CEPII.dta (other explanatory variables, in LONG format): a cross-section dataset for PAIRS of countries

Step 2:
- Starting with “tradeflows.dta”, create the FULL structure of the data file: PANEL DATA for YEARS and every possible combination (PAIR) of countries, filling the newly created pairs with zeros. The temporary file created is "gravity_temp1.dta"
- Reshape GDP.dta to a LONG panel dataset and create a duplicate (GDP is going to be used as both the importer's GDP and the exporter's GDP):
    reshape long stub, i(i) j(j)              (generic syntax; j is the new variable)
    reshape long yr, i(countrycode) j(year)
    rename yr gdp

Step 3:
- MERGE those two new files (“GDP_exporter.dta” and “GDP_importer.dta”) with "gravity_temp1.dta", keeping those observations (PAIRS of countries) with information in both files.
- MERGE “joinwto.dta” with that file, creating two new variables: join_exporter and join_importer. The new temporary file created is "gravity_temp3.dta"
- MERGE the data of the two files “CEPII.dta” (previously saved) and “religion.dta” with the previous file.
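The reshape and merge operations described in Steps 2 and 3 can be tried out on toy data. The following is only a minimal sketch: the country codes, years and values are invented, the key names `exporter` and `importer` are assumptions, and temporary files stand in for “GDP_exporter.dta” and “GDP_importer.dta”. It is not the UNCTAD-WTO DO file.

```stata
* Toy GDP table in WIDE format: one row per country, one column per year.
* All values below are invented for illustration only.
clear
input str3 countrycode yr2004 yr2005
"AAA" 100 110
"BBB" 200 210
end

* WIDE -> LONG: the stub "yr" holds the values, its suffix becomes "year"
reshape long yr, i(countrycode) j(year)
rename yr gdp

* Save one copy per role (exporter / importer) with matching key names
preserve
rename (countrycode gdp) (exporter gdp_exporter)
tempfile gdp_exp
save `gdp_exp'
restore
rename (countrycode gdp) (importer gdp_importer)
tempfile gdp_imp
save `gdp_imp'

* Toy trade flows in LONG format (hypothetical key names)
clear
input str3 exporter str3 importer year imports
"AAA" "BBB" 2004 5
"AAA" "BBB" 2005 6
"BBB" "AAA" 2004 7
"BBB" "AAA" 2005 8
end

* Many trade rows share one country-year GDP value, hence m:1.
* keep(match) keeps only rows with information in both files.
merge m:1 exporter year using `gdp_exp', keep(match) nogenerate
merge m:1 importer year using `gdp_imp', keep(match) nogenerate
list, sepby(exporter)
```

Run as a single do-file (the temporary file names are local macros). Each trade row then carries both `gdp_exporter` and `gdp_importer` for its year.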
- The new temporary file created is "gravity_temp4.dta"

Steps 4 and 5:
- Create WTO accession dummies depending on whether one, none or both countries are members of the WTO (onein, nonein, bothin)
- The new PERMANENT file created is "gravity.dta". It contains the core dataset (endogenous and exogenous variables, except for the country / country x time / time dummies and some remaining transformations)

The structure of the main dataset (shown as a screenshot in the original document) is as follows: each row contains a trade flow (import) and the variables for the gravity equation (GDPs and the terms for barriers and incentives), EXCEPT FOR the MRT dummies.

Steps 6 and 7:
- Create country / country x time / time dummies for the specification of the MRT terms and time fixed effects. In this block, due to memory restrictions, three different options are offered in case the number of dummies exceeds Stata's capacity:
  o Option selected in this example: reduce the number of years (keep years after 1995, i.e. 1996-2005)
  o Compute country-period (and not country-year) dummies
  o Make a balanced panel (reducing the sample to those countries having information for the same time period).
- Create logs of the GDP and distance variables
- Compute five-year averages of some variables
- Create a subset with OECD countries for the period 1996-2005
- Create a subset with OECD countries for the period 2000-2005

2.- ECONOMETRIC ESTIMATIONS OF GRAVITY EQUATIONS

- Load dataset “gravity_OECD_2000_2005.dta”:
  o 33 countries
  o 6 years
  o for a total of 33 x (33-1) x 6 = 6336 records
- REG1: ESTIMATE A BASIC CROSS-SECTION REGRESSION IN LOG-LOG, WITHOUT MRTs, FOR A SINGLE YEAR, AND DO SOME BASIC CHECKS:
  o Check the number of valid observations for the endogenous variable “LIMPORTS” in 2000 and 2005. There could be a maximum of 33*32 = 1056 valid values, but there are only 992 because of 64 missings due to zero values for trade with origin or destination in BLX.
  o Estimate the simplest log-linear gravity model regression for the year 2005, using only lgdp_exporter, lgdp_importer and ldist, and interpret the parameters / elasticities
  o Check whether the GDP elasticities are close to unity, as predicted by theory:
    ▪ Theory predicts a value around 1 for both elasticities
    ▪ A difference between the origin and destination GDP elasticities is expected; a lower estimate for the importer's GDP would suggest evidence of home market effects (due to barriers to entry or national product differentiation).
    ▪ Meta-analysis shows that the distance coefficient is also around -1.
    Meta-analysis of 2500 gravity equation estimates. Table extracted from Head, K., & Mayer, T. (2013). Gravity equations: Workhorse, toolkit, and cookbook. Handbook of International Economics, 4.
  o Check whether trade is significantly more sensitive to trade barriers (proxied by distance) in 2005 than in 2000
    ▪ Procedure: compare the basic estimation for different years (2000 vs 2005) using seemingly unrelated estimation (Stata suest command) [2]
    ▪ It looks like no statistical difference exists between the 2000 and 2005 estimates.
- REG2: ESTIMATE ANOTHER CROSS-SECTION REGRESSION WITH ADDITIONAL VARIABLES
  o Estimate, with robust inference, for 2005 adding more variables:
    reg limports contig comlang_off onein colony REPlandlocked PARTlandlocked religion ldist lgdp* if year==2005, robust

[2] The seemingly unrelated estimation procedure combines the estimation results (parameter and variance matrices) into one parameter vector and a simultaneous (co)variance matrix. The procedure is applied after each equation has been estimated separately. The reasoning is that the error terms of the different equations might be correlated, which may affect the estimated covariance of the parameters and thus every cross-model hypothesis about the parameters of those equations.
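The 2000 vs 2005 comparison with suest can be sketched as follows. This is a minimal illustration on simulated data, not the UNCTAD-WTO dataset: the coefficients used to generate `limports` are assumptions, and the stored-estimate names `m2000` and `m2005` are arbitrary. By construction the true distance elasticity is the same in both years.

```stata
* Simulated cross sections for two years (NOT the UNCTAD-WTO data)
clear
set seed 2015
set obs 2000
gen year = cond(_n <= 1000, 2000, 2005)
gen lgdp_exporter = rnormal(26, 1.5)
gen lgdp_importer = rnormal(26, 1.5)
gen ldist = rnormal(8, 0.8)
* Assumed data-generating process: distance elasticity of -1 in both years
gen limports = -30 + lgdp_exporter + 0.9*lgdp_importer - ldist + rnormal()

* Estimate each year separately; suest expects fits WITHOUT the robust option
regress limports lgdp_exporter lgdp_importer ldist if year == 2000
estimates store m2000
regress limports lgdp_exporter lgdp_importer ldist if year == 2005
estimates store m2005

* Combine both results into one parameter vector with a joint covariance matrix
suest m2000 m2005

* H0: the distance elasticity is the same in 2000 and in 2005
test [m2000_mean]ldist = [m2005_mean]ldist
```

After `regress`, suest labels the coefficient equation of each stored model with the suffix `_mean`, which is why the test refers to `[m2000_mean]ldist`. With the real data, the same three steps (estimate, store, suest) would apply to the REG1 specification.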
  o The “onein” coefficient cannot be estimated (only zero values), and the same holds for “bothin” (only value 1) (tab onein if year==2005)
  o Compare the REG1 and REG2 regressions [3]. Check the elasticities obtained:
    ▪ The GDP coefficients appear to be slightly overestimated in the first regression, but the size, and even the sign, of this bias depends on the relationship between the omitted trade resistance / incentive variables for the particular countries included in the sample.
    ▪ How do you compute the elasticity for dummy variables? [4]
    ▪ The adjacency coefficient (“contig”) usually lies in the vicinity of 0.5 (Head, K. 2003), suggesting that trade is about 65% higher as a result of sharing a border (exp(0.5) - 1 ≈ 0.65). Omitting this variable may cause an upward bias (in absolute value) in the distance parameter (contiguity and distance are negatively related to each other)
    ▪ Contiguity and common language seem to have very comparable effects, with coefficients around 0.5 (Head, K., & Mayer, T. (2013), see table above).
    ▪ According to some papers, common links (language, colony, ...) may cause very significant rises in trade (doubling or tripling it, or even more). Colonial links are not significant in our regression, given the particular nature of the sample (only OECD countries included)
    ▪ The “landlocked” variable seems to be significant only for the PARTNER (importer) country, resulting in a reduction of imports of around 42% (coeff.=0.357).
- REG3: ADDING DUMMIES TO CONTROL FOR THE MRT EFFECT
  o Keep only the 2000-2005 observations with origin or destination in an OECD country
  o 3.1 Try to estimate REG2, with robust inference, for a cross section in 2005, adding the country dummies importer_* and exporter_* to control for the MRTs (adding also year* dummies)

[3] For that, it is useful to use the “eststo” command (download it first if not already installed)
[4] Remember that, in a log-log model, raw coefficients for dummies do not represent elasticities (% changes).
The elasticity can be easily derived as Exp(β)-1.

    ▪ Commonly, year dummies control for omitted terms causing secular / trend variation in panel data models (affecting, in our example, world trade for every single exporter-importer pair)
    ▪ Given that importer_* and exporter_* are country specific (not pair specific), they are perfectly collinear with other country-specific variables such as REPlandlocked, PARTlandlocked, lgdp_importer and lgdp_exporter
    ▪ Important differences appear for the common coefficients, especially in the case of distance (“ldist”), which now exhibits an elasticity greater than one in absolute value (as expected according to the meta-analysis table previously shown)
  o 3.2 How can we add country dummies while keeping the estimates of country-specific variables such as the GDPs? A pooled OLS regression for a short period (2000-2005) could be a solution for country-specific variables that vary over time (GDPs, for example), but not for time-invariant country variables (such as REPlandlocked, PARTlandlocked)
    ▪ Repeat the previous regression for the period 2000-2005
    ▪ In effect, the GDP coefficients can now be estimated and, in line with the literature, the elasticities drop substantially (down to 0.6) in this “structural” version compared with the previous estimates (without controlling for MRTs)
  o 3.3 What if we now add country x time dummies, allowing for time-varying MRTs? (In the previous regression, the MRT terms were assumed to be constant over time.)
    ▪ The answer is that, given that the MRTs now vary over time, we are again unable to estimate country-specific time-varying variables (such as the GDPs)
  o 3.4 What if we now add country-pair dummies to control for pair heterogeneity?
    ▪ Adding “pairid” fixed effects does not allow us to estimate the coefficients of any country-pair variable such as distance, colony, onein, ...
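The identification issue discussed in 3.1 and 3.2 can be reproduced on simulated data. The sketch below is only an illustration under stated assumptions: the panel (8 countries, 2000-2005), the data-generating coefficients and the country-specific GDP growth rates are all invented, and factor-variable notation (`i.exporter`, `i.importer`, `i.year`) is used in place of the handout's importer_*, exporter_* and year* dummies.

```stata
* Simulated gravity panel (NOT the UNCTAD-WTO data): 8 countries, 2000-2005
clear
set seed 2015
set obs 8
gen exporter = _n
gen importer = _n
fillin exporter importer              // every exporter x importer combination
drop if exporter == importer          // 8 x 7 = 56 directed pairs
drop _fillin
expand 6
bysort exporter importer: gen year = 1999 + _n
* Assumed log GDP: differs by country and grows at a country-specific rate
gen lgdp_exporter = 25 + 0.4*exporter + 0.02*exporter*(year - 2000)
gen lgdp_importer = 25 + 0.4*importer + 0.02*importer*(year - 2000)
* Log distance is pair specific and constant over time
bysort exporter importer (year): gen ldist = rnormal(8, 0.8) if _n == 1
bysort exporter importer (year): replace ldist = ldist[1]
gen limports = -20 + 0.8*lgdp_exporter + 0.8*lgdp_importer - ldist + rnormal()

* 3.1: a single year. GDP is constant within each country, so it is
* collinear with the country dummies and Stata omits the redundant terms.
regress limports i.exporter i.importer ldist lgdp_exporter lgdp_importer if year == 2005
display "Identified coefficients, 2005 cross section: " e(rank)

* 3.2: pooled 2000-2005. GDP now varies within country and is identified.
regress limports i.exporter i.importer i.year ldist lgdp_exporter lgdp_importer
display "Identified coefficients, pooled regression: " e(rank)
```

Comparing the two regression tables shows which terms Stata flags as omitted in the cross section, and that both GDP coefficients receive estimates once several years are pooled.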
    ▪ SO IF WE CONTROL FOR ALL FIXED EFFECTS AT THE SAME TIME (COUNTRY, YEAR, COUNTRY X YEAR, AND COUNTRY-PAIRS) WE THEN LOSE THE REST OF THE PARAMETERS (except for the fixed effects)
- REG4: PANEL DATA (Step 8 in the UNCTAD-WTO document)
  o Set the panel data structure (remember that the panel unit refers to “ij” pairs)
  o Estimate a simple panel data FIXED effects model to control for bilateral MRTs (including, also, time effects)
  o Note that, when controlling for bilateral MRT terms with FE, we are unable to estimate the coefficients of any TIME-INVARIANT variable, either at the level of “ij” pairs (such as distance, colony, common language, FTA) or simply at the level of “i” and/or “j” (such as landlocked)
  o Using RANDOM effects, we may estimate every coefficient (lost with FE) but, as always with RE, at the risk of biased estimates
  o Consider choosing between RE and FE by means of a Hausman test
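The REG4 sequence (set the panel, FE, RE, Hausman test) can be sketched as follows. This is a minimal illustration on simulated data, not the UNCTAD-WTO dataset: the data-generating process, the unobserved pair effect `u` and the stored-estimate names `fe_model` and `re_model` are assumptions made for the example.

```stata
* Simulated pair-year panel (NOT the UNCTAD-WTO data): 8 countries, 2000-2005
clear
set seed 2015
set obs 8
gen exporter = _n
gen importer = _n
fillin exporter importer
drop if exporter == importer
drop _fillin
egen pairid = group(exporter importer)    // one id per directed "ij" pair
expand 6
bysort pairid: gen year = 1999 + _n
gen lgdp_exporter = 25 + 0.4*exporter + 0.02*exporter*(year - 2000)
gen lgdp_importer = 25 + 0.4*importer + 0.02*importer*(year - 2000)
* Time-invariant pair variables: log distance and an unobserved pair effect
bysort pairid (year): gen ldist = rnormal(8, 0.8) if _n == 1
bysort pairid (year): replace ldist = ldist[1]
bysort pairid (year): gen u = rnormal() if _n == 1
bysort pairid (year): replace u = u[1]
gen limports = -20 + 0.8*lgdp_exporter + 0.8*lgdp_importer - ldist + u + rnormal()

* The panel unit is the country pair, observed over years
xtset pairid year

* FE: the time-invariant ldist is omitted from the estimates
xtreg limports lgdp_exporter lgdp_importer ldist i.year, fe
estimates store fe_model

* RE: ldist is estimated, at the risk of bias if u correlates with regressors
xtreg limports lgdp_exporter lgdp_importer ldist i.year, re
estimates store re_model

* Hausman test: consistent estimator (FE) first, efficient estimator (RE) second
hausman fe_model re_model, sigmamore
```

The `sigmamore` option bases both covariance matrices on a common disturbance variance estimate, which tends to avoid a negative test statistic in small samples; it can be dropped to reproduce the default test.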
