DATASET CONSTRUCTION

After downloading the data from the IMF (stored in the folder “DATI”), I merged them into a single
file called “TRADE MIO” in the folder “COSTRUZIONE DATASET”.
The main goal of the work was to extend the trade and GDP information in Rose’s dataset to
additional years. Rose calculated the “trade” variable by averaging all four possible values of
bilateral trade (exports from i to j, imports into j from i, and so forth).
i.e.
Country1   Country2   Pairid   Exp      Imp      TRADE
Italy      France     2345     100000   200000   (100000+200000+250000+95000)/4
France     Italy      2345     250000   95000
To construct this variable I assigned a pairid, i.e. the same id value to both rows of each pair of
countries; for example, the two rows in the table above share pairid=2345.
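The averaging in the table can be checked with a few lines of Python. This is only an illustrative sketch, not the dofile used for the dataset: the dictionary keys and the function name average_trade_by_pair are hypothetical, and the figures are those of the example table.

```python
from collections import defaultdict

# The two directed rows of the example table; both carry the same pairid.
rows = [
    {"country1": "Italy", "country2": "France", "pairid": 2345, "exp": 100000, "imp": 200000},
    {"country1": "France", "country2": "Italy", "pairid": 2345, "exp": 250000, "imp": 95000},
]


def average_trade_by_pair(rows):
    """Average every exp and imp value recorded under the same pairid."""
    values = defaultdict(list)
    for row in rows:
        values[row["pairid"]].extend([row["exp"], row["imp"]])
    return {pairid: sum(v) / len(v) for pairid, v in values.items()}


trade = average_trade_by_pair(rows)
print(trade[2345])  # 161250.0
```

Grouping by pairid rather than by country names is what lets the two directions of the same pair end up in one average.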
PAIRID CONSTRUCTION
Because Rose’s data already carry a pairid, I used this variable to identify the same country pairs
in the new IMF data. To do this, I merged Rose’s pairid into the new IMF data through the country
codes. The output file is called “trade mio con merge pair” in the folder “costruzione dataset”.
At this stage I ran into a problem.
The new IMF dataset contains observations that are not in Rose’s dataset. They can be found in the
file “trade mio con merge pair”, where they are identified by a dummy called countrypairnew,
which is 1 if the observation is in the new IMF data but not in Rose’s, and 0 otherwise.
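As a rough illustration of this merge step, the sketch below attaches Rose’s pairid to new rows through the country codes and sets the dummy. It is not the original procedure: the country codes, the field names code1 and code2, and the function merge_pairid are all assumptions made for the example.

```python
# Hypothetical lookup built from Rose's data: (code1, code2) -> pairid.
# The country codes are illustrative placeholders.
rose_pairid = {(136, 132): 2345, (132, 136): 2345}

# Hypothetical rows of the new IMF data, identified by country codes.
imf_rows = [
    {"code1": 136, "code2": 132, "exp": 100000, "imp": 200000},
    {"code1": 136, "code2": 999, "exp": 500, "imp": 700},
]


def merge_pairid(imf_rows, rose_pairid):
    """Attach Rose's pairid where it exists and flag the rows where it does not."""
    merged = []
    for row in imf_rows:
        pairid = rose_pairid.get((row["code1"], row["code2"]))
        # countrypairnew = 1 when the pair appears only in the new IMF data.
        flag = 1 if pairid is None else 0
        merged.append({**row, "pairid": pairid, "countrypairnew": flag})
    return merged


merged = merge_pairid(imf_rows, rose_pairid)
for row in merged:
    print(row["code1"], row["code2"], row["pairid"], row["countrypairnew"])
```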
For data with countrypairnew=1 I had to construct a new pairid.
The method was the following:
Starting from the dataset called merge pair=1 (which contains only the new IMF data that have no
Rose pairid), I created the pairid for these new data.
1. First, I separated the bilateral flows from the unilateral ones by constructing the variable
codmio2. I created this variable by running the dofiles in the folder costruzione
dataset\merge pair=1\per generare i paridnew. If codmio2 is 0, the flows are unilateral.
I saved them as paesi con flussi unilaterali in the same folder.
2. For unilateral flows I created a new pairid by running the dofiles gen pairidnew flussi
unilaterali1 and gen pairidnew flussi unilaterali2.
3. For bilateral flows I constructed the variable codmio3 (by multiplying codmio and codmio2) and
ran the dofile gen pairidnew flussi bilaterali, located in costruzione dataset\merge
pair=1\per generare i paridnew.
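The dofiles themselves are not reproduced here, so the following is only a sketch of one way the same result could be obtained, and it does not use codmio2 or codmio3. It assumes that a pair is "bilateral" when both directions appear in the data and "unilateral" when only one does. The field names, the helper functions and the starting id are hypothetical.

```python
def split_flows(rows):
    """Separate pairs seen in both directions (bilateral) from pairs seen in one only."""
    directed = {(r["code1"], r["code2"]) for r in rows}
    bilateral, unilateral = [], []
    for r in rows:
        reverse = (r["code2"], r["code1"])
        (bilateral if reverse in directed else unilateral).append(r)
    return bilateral, unilateral


def assign_pairidnew(rows, first_id):
    """Give both directions of a country pair the same new id, counting up from first_id."""
    ids = {}
    out = []
    for r in rows:
        key = tuple(sorted((r["code1"], r["code2"])))  # unordered pair
        if key not in ids:
            ids[key] = first_id + len(ids)
        out.append({**r, "pairidnew": ids[key]})
    return out


new_rows = [
    {"code1": 1, "code2": 2},
    {"code1": 2, "code2": 1},
    {"code1": 3, "code2": 4},
]
bilateral, unilateral = split_flows(new_rows)
# first_id is an arbitrary value assumed to lie above every existing pairid.
with_ids = assign_pairidnew(bilateral + unilateral, first_id=100000)
```

Keying the id on the sorted pair of codes is what guarantees that the rows i-j and j-i receive the same value.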
TRADE CONSTRUCTION
I first created the variable “trade” for the data with the new pairid, then for the data that have
the same country pairs as Rose’s data.
We start with the data containing the new pairid.
I separated the bilateral flows from the unilateral flows. For the unilateral flows, the trade
variable is the average of imports and exports: (imp+exp)/2. The file with these data is called
flussi unilaterali con trade, in costruzione dataset\merge pair=1.
For bilateral flows the construction of the trade variable was more difficult. I created this
variable year by year, in a way that depends on the characteristics of the import and export data.
To that end, I generated a variable called casi, whose value depends on whether imports and exports
are a positive number, zero, or missing.
The casi are the following (num/0 = a number or zero, - = missing):
CASI   1       1       2   10
imp    num/0   -       -   num/0
exp    -       num/0   -   num/0
The second step was to sum casi over the rows sharing the same pairid. I called this variable
sumcasi and obtained the following combinations:
[Table of the imp/exp combinations (num, 0, or - for missing) observed for SUMCASI=2, 3, 4, 11, 12
and 20; the cell layout is not recoverable.]
The data with sumcasi=4 were dropped.
I created the “trade” variable for each value of sumcasi, trying to use all the available information.
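The coding of casi and sumcasi can be sketched as follows. The sketch assumes a coding that is consistent with the sums 2, 3, 4, 11, 12 and 20 listed above: casi=1 when exactly one of imp and exp is reported, casi=2 when neither is, and casi=10 when both are, with zero counting as reported and None standing for a missing value. The function names are hypothetical.

```python
from collections import defaultdict


def casi(imp, exp):
    """Assumed coding: 10 if both values are reported, 1 if exactly one, 2 if neither."""
    reported = (imp is not None) + (exp is not None)
    return {2: 10, 1: 1, 0: 2}[reported]


def sumcasi_by_pair(rows):
    """Sum casi over the rows that share a pairid."""
    totals = defaultdict(int)
    for r in rows:
        totals[r["pairid"]] += casi(r["imp"], r["exp"])
    return dict(totals)


def drop_sumcasi_4(rows):
    """Drop the pairs for which neither row reports anything (sumcasi=4)."""
    totals = sumcasi_by_pair(rows)
    return [r for r in rows if totals[r["pairid"]] != 4]


rows = [
    {"pairid": 1, "imp": 5, "exp": None},
    {"pairid": 1, "imp": None, "exp": 0},
    {"pairid": 2, "imp": None, "exp": None},
    {"pairid": 2, "imp": None, "exp": None},
    {"pairid": 3, "imp": 3, "exp": 0},
    {"pairid": 3, "imp": 7, "exp": 9},
]
totals = sumcasi_by_pair(rows)
kept = drop_sumcasi_4(rows)
```

Under this coding a pair with sumcasi=4 carries no information at all, which would explain why those data were dropped.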
DATA WITH SUMCASI=2
When these data contained two positive numbers, one imp and one exp, I created “trade” by averaging
the two available values (i.e. trade=(exp+imp)/2).
When there was only one positive value, e.g. exp, with the other entries zero or missing, I used
that value (i.e. trade=exp).
DATA WITH SUMCASI=3
The variable trade was constructed as in the previous case (i.e. trade=exp).
DATA WITH SUMCASI=11
For these data, when there were two values or only one, “trade” was constructed as in the cases of
sumcasi=2.
When there were three numbers
i.e.
imp   num   num
exp   num
I preferred to use the information from the same statistical centre, so trade was created by
averaging the import and export values reported by the same country.
DATA WITH SUMCASI=12
When one country reported both values, I created trade as in the last case of sumcasi=11
(i.e. trade=(imp+exp)/2):
imp   num   -
exp   num   -
otherwise the variable trade was exactly the only available number.
DATA WITH SUMCASI=20
These data combine several of the previous cases.
To summarise, if there were:
imp   num   num
exp   0     num
or
imp   0   num
exp   0   num
trade=(imp+exp)/2 (using the data collected by the same statistical centre).
In a case such as:
imp   0   0
exp   0   num
trade=exp (I use the only available value)
and with
imp   num   num
exp   num   num
trade=(imp1+exp1+imp2+exp2)/4.
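The rules for the different values of sumcasi can be condensed into one function. This is a simplified sketch rather than the original dofiles: it assumes that only positive numbers count as available (zeros and missing values are ignored, as in the trade=exp cases above), it represents a missing value as None, and the function names are hypothetical.

```python
def positive(x):
    """True for a usable value: not missing (None) and greater than zero."""
    return x is not None and x > 0


def trade_for_pair(row1, row2):
    """row1 and row2 are the (imp, exp) tuples reported by the two countries of a pair."""
    full_rows = [r for r in (row1, row2) if positive(r[0]) and positive(r[1])]
    values = [v for r in (row1, row2) for v in r if positive(v)]
    if len(full_rows) == 2:
        # Four numbers: average all of them.
        return sum(values) / 4
    if len(full_rows) == 1:
        # One country reports both flows: average its own imp and exp.
        imp, exp = full_rows[0]
        return (imp + exp) / 2
    if values:
        # One or two isolated numbers: use the value or the average of the two.
        return sum(values) / len(values)
    return None  # nothing usable


print(trade_for_pair((200000, 100000), (95000, 250000)))  # 161250.0
print(trade_for_pair((10, 0), (20, 40)))                  # 30.0
print(trade_for_pair((0, 0), (0, 8)))                     # 8.0
```

Checking for a country that reports both flows before falling back to isolated values reflects the stated preference for data from the same statistical centre.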
You can find the dofiles and output for all these cases in costruzione dataset\merge pair=1.
I used the same strategy for the data containing Rose’s pairid and saved the outputs and dofiles
in the folder costruzione dataset.
The final (cleaned) dataset is the file last pronto per unione con Rose in costruzione dati.
GDP
I downloaded GDP from the World Development Indicators.
The data are in Excel format in the folder GDP/DATA, and in Stata format in GDP/STATA
FORMAT.
These data differ from Rose’s GDP: Rose’s GDP series use a different base from the downloaded
data.
In GDP/STATA FORMAT you can find:
1) GDP in constant 2000$
2) GDP per capita in constant 2000$
3) GDP in current $
4) population