DATASET CONSTRUCTION

After downloading the data from the IMF (stored in the folder “DATI”), I merged them into a single file called “TRADE MIO” in the folder “COSTRUZIONE DATASET”. The main goal of the work was to extend the trade and GDP information in Rose’s dataset to additional years.

Rose calculated the “trade” variable by averaging all four possible values of bilateral trade (exports from i to j, imports into j from i, and so forth), e.g.:

Country1  Country2  Pairid  Exp     Imp     TRADE
Italia    Francia   2345    100000  200000  (100000+200000+250000+95000)/4
Francia   Italia    2345    250000  95000

To construct this variable I assigned a pairid, i.e. the same id value to both records of each pair of countries; in the table above, for example, the two rows share pairid=2345.

PAIRID CONSTRUCTION

Because Rose’s data already carry a pairid, I used this variable to identify the same country pairs in the new IMF data. To do this, I merged Rose’s pairid with the new IMF data through the country codes. The output file is called “trade mio con merge pair” in the folder “costruzione dataset”.

At this stage I ran into a problem: the new IMF dataset contains data that are not in Rose’s dataset. These records are in the file “trade mio con merge pair” and are flagged by a dummy called countrypairnew, which is 1 if the record is in the new IMF data but not in Rose’s, and 0 otherwise. For records with countrypairnew=1 I had to construct a new pairid, as follows.

Starting from the dataset called merge pair=1 (which contains only the new IMF data that lack Rose’s pairid), I created the pairid for these new data.

1. First, I separated the bilateral flows from the unilateral ones by constructing the variable codmio2. I created this variable by running the do-files in the folder costruzione dataset\merge pair=1\per generare i paridnew. If codmio2 is 0, the flows are unilateral.
I saved the unilateral flows as paesi con flussi unilaterali in the folder costruzione dataset\merge pair=1\per generare i paridnew.

2. For unilateral flows I created a new pairid by running the do-files gen pairidnew flussi unilaterali1 and gen pairidnew flussi unilaterali2.

3. For bilateral flows I constructed the variable codmio3 (by multiplying codmio and codmio2) and ran the do-file gen pairidnew flussi bilaterali, located in costruzione dataset\merge pair=1\per generare i paridnew.

TRADE CONSTRUCTION

I first created the variable “trade” for the data with the new pairid, then for the data whose country pairs already appear in Rose’s dataset.

We start with the data containing the new pairid. I separated the bilateral flows from the unilateral flows. For unilateral flows, trade is the average of imports and exports: (imp+exp)/2. The file with these data is called flussi unilaterali con trade, in costruzione dataset\merge pair=1.

For bilateral flows the construction of the trade variable was more difficult. First, I created this variable year by year, in a way that depends on the characteristics of the import and export data. I then generated the variable casi, whose value depends on whether imports and exports are a positive number, zero, or missing. The values of casi are the following (num/0 = a positive number or zero, - = missing):

CASI  1      1      2   10
imp   num/0  -      -   num/0
exp   -      num/0  -   num/0

The second stage was to sum casi over the records with the same pairid.
I called this variable sumcasi and obtained the following combinations: SUMCASI = 2, 3, 4, 11, 12 and 20.

[Table: patterns of imp and exp values (num = positive number, 0 = zero, - = missing) observed for each value of SUMCASI.]

The data with sumcasi=4 were dropped. I created the “trade” variable separately for each value of sumcasi, trying to use all the available information.

DATA WITH SUMCASI=2

When the data contained two positive numbers, I created “trade” by averaging the two available values (i.e. trade=(exp+imp)/2). When only one positive value was available, I used that value (e.g. trade=exp).

DATA WITH SUMCASI=3

The variable trade was constructed as in the previous case (i.e. trade=exp).

DATA WITH SUMCASI=11

When these data contained two values or only one, “trade” was constructed as for sumcasi=2. When there were three numbers (one country reports both imp and exp, the other only one of them), I preferred to use the information from the same statistical centre, so trade is the average of the import and export values reported by the same country.

DATA WITH SUMCASI=12

When one country reports both a positive import and a positive export, I created trade as in the last case of sumcasi=11 (i.e. trade=(imp+exp)/2); otherwise trade is simply the only available number.

DATA WITH SUMCASI=20

These data combine several of the previous cases. To summarise: when three values are positive and one is 0, or when two are positive and two are 0, trade=(imp+exp)/2 (the data are collected by the same statistical centre).
When only one value is positive and the other three are 0, trade=exp (I use the only available value); when all four values are positive, trade=(imp1+exp1+imp2+exp2)/4.

You can find the do-files and outputs for all these cases in costruzione dataset\merge pair=1. I used the same strategy for the data containing Rose’s pairid, and I saved the outputs and do-files in the folder costruzione dataset. The final (cleaned) dataset is file last pronto per unione con Rose in costruzione dati.

GDP

I downloaded GDP from the World Development Indicators. The data are in Excel format in the folder GDP/DATA, and in Stata format in GDP/STATA FORMAT. These data differ from Rose’s GDP: Rose’s GDP is expressed in a different base from the downloaded data. In GDP/STATA FORMAT you can find:

1) GDP in constant 2000$
2) GDP per capita in constant 2000$
3) GDP in current $
4) population
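
As an illustration of the pairid and trade averaging described under DATASET CONSTRUCTION, the following is a minimal Python sketch. It is not the original code (the work itself was done with Stata do-files, which are not reproduced here), and the names pair_key and rose_trade are hypothetical. It assumes every record carries both an export and an import value, so it covers only the simplest case (all four values available), not the casi/sumcasi rules for zeros and missing values.

```python
from collections import defaultdict


def pair_key(country_a, country_b):
    """Hypothetical helper: order-independent key, so that
    (A, B) and (B, A) identify the same country pair."""
    return tuple(sorted((country_a, country_b)))


def rose_trade(rows):
    """Average every export and import value reported within a pair.

    rows: iterable of (reporter, partner, exp, imp) tuples.
    Assumption: all values are present (no zeros-as-missing handling).
    Returns a dict mapping pair_key -> mean of the reported values.
    """
    values = defaultdict(list)
    for reporter, partner, exp, imp in rows:
        values[pair_key(reporter, partner)].extend([exp, imp])
    return {key: sum(v) / len(v) for key, v in values.items()}


# The two records of the Italia-Francia example from the table above.
rows = [
    ("Italia", "Francia", 100000, 200000),
    ("Francia", "Italia", 250000, 95000),
]
trade = rose_trade(rows)
print(trade[pair_key("Italia", "Francia")])  # 161250.0
```

Sorting the two country names plays the role of the shared pairid: both rows of a pair fall under the same key regardless of which country is the reporter.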