DAD: A Software for Distributive Analysis / Analyse Distributive
By Abdelkrim Araar and Jean-Yves Duclos
PEP - PMMA TRAINING - ADDIS ABABA, June 2006

What is Distributive Analysis?

Distributive analysis is concerned with the distribution and redistribution of well-being, usually captured by living standards at the household level.

The distribution of living standards depends dynamically on a number of factors, such as:
- Average living standards at the level of the population
- Living standards relative to the mean
- The structure of the economy and the channels through which wealth is distributed
- The economic policies in place (redistribution policies)
- Economic shocks

Main topics linked to distribution and redistribution:
- Absolute & relative poverty
- Absolute & relative inequality
- Polarisation
- Vertical & horizontal equity
- Redistribution
- etc.

Examples of some relatively recent economic shocks in developing countries:
- Economic transition from planned to market economies
- Application of macro adjustment programs
- Trade liberalisation
- Globalisation

These shocks can have a significant impact on the distribution of living standards at different levels (regions, countries, within households).

Positioning DAD in distributive analysis

The main features of the software can be summarized as follows:
- Free!
- User friendly: no need for programming
- Easily estimates a number of distributive indices and curves that are used extensively in the distributive analysis literature
- Accurately estimates the sampling distribution of such indices and curves by taking into account the sampling design of household surveys, by means of analytical and numerical procedures
- Provides tools for testing the robustness of comparisons
- Emphasises the power of graphs to provide informative pictures of the distribution of living standards

Basic descriptive tools:
- Estimation of means, quantiles, variances
- Non-parametric estimation of:
  - density
  - joint density
  - non-parametric regression between two variables
  - regression slopes
- Scatter graphs
- Extensive and flexible graphical capabilities

Poverty decomposition

Static decomposition:
- Population subgroups
- Income components
- FGT index: analytical approach
- FGT index: Shapley approach

Dynamic decomposition:
- Growth and redistribution
- Sectoral decomposition
- FGT index: analytical & Shapley approaches
- Transient and chronic
- FGT index: analytical & Shapley approaches
- FGT index: analytical approach
- EDE index: analytical approach
- Absolute transition matrix: analytical approach

Inequality decomposition

Static decomposition:
- Population subgroups
- Income components
- S-Gini index: analytical & Shapley approaches
- Generalised entropy index: analytical approach
- S-Gini index: analytical & Shapley approaches
- Coefficient of variation index: analytical approach

Dynamic decomposition:
- Difference: population subgroups
- Difference: income components
- S-Gini index: analytical & Shapley approaches
- S-Gini index: analytical & Shapley approaches

Social welfare:
- Atkinson index: analytical approach

Simulations and policy applications
- Impact of income-component growth on inequality, poverty and social welfare
- Impact of marginal price changes on poverty, social welfare and inequality
- Impact of demographic changes on poverty
- Impact of sectoral changes on poverty
- Impact of lump-sum targeting on poverty
- Impact of inequality-neutral targeting on poverty
- Gini income-component elasticity
- Growth elasticity of poverty
- Impact of marginal tax reforms on poverty and inequality
- Impact of reforms to poverty alleviation programmes, by targeting/allocation effects

Estimation of curves for descriptive and normative purposes:
- Lorenz & generalised Lorenz curves
- Concentration & generalised concentration curves
- Quantile and normalised quantile curves
- Poverty gap & cumulative poverty gap (CPG) curves
- FGT curves
- Pro-poor curves
- Bi-polarisation curves
- Deprivation curves
- Consumption dominance (CD) and normalised CD curves

Checking the robustness of poverty, social welfare, inequality and equity comparisons
- Estimation of stochastic dominance curves for:
  - poverty
  - social welfare
  - inequality (normalised stochastic dominance)
  - relative poverty
  - indirect tax reforms
  - “efficient” targeting reforms
- Estimation of “critical” poverty lines for absolute and relative poverty
- Estimation of crossing points for Lorenz, CPG and concentration curves

Estimating sampling distributions

Data from sample surveys usually display four important characteristics:
- they are stratified;
- they are clustered;
- they come with sampling weights (SW), also called inverse probability weights;
- sample observations provide aggregate information (such as household expenditures) on a number of “statistical units” (such as individuals).

[Figure: Simple Random Sampling]

Usual sampling procedures

A country is first divided into geographical or administrative zones and areas, called strata. Each zone or area thus represents a stratum.
The first random selection takes place among the Primary Sampling Units (PSUs) of each stratum. Within each stratum, a number of PSUs are randomly selected. This random selection of PSUs provides “clusters” of information. PSUs are often provinces, departments, villages, etc.

Within each PSU, there may then be other levels of random selection. For instance, within each province, a number of villages may be randomly selected, and within every selected village, a number of households may be randomly selected. The final sample observations constitute the Last Sampling Units (LSUs).

Each sample observation may then provide aggregate information (such as household expenditures) on all individuals or agents found within that LSU. These individuals or agents are not themselves selected: information on all of them appears in the sample. In statistical terminology, they are therefore not the LSUs.

[Figure: Sampling design with two levels of random selection]

[Figure: Sampling design and statistical significance. Labels: Strata A, Strata B, PSUs, highest incomes, lowest incomes]

Example: Figure 1. The sampling design (SD) of Priority Survey I (1994) of Burkina

EP1: Burkina (1994)

Stratum  Region                 PSUs  LSUs
1        West                     42   839
2        South and West South     37   737
3        North Center             98  1960
4        South Center             55  1099
5        North                    66  1288
6        Other Cities             39   778
7        Ouaga Bobo               97  1938

Total observations: 8639

Initialising the sampling design

From the main menu, choose the item Edit -> Set Sample Design. Indicate the variables that define the sample design and confirm your choice by clicking on the OK button.
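To see why the strata, PSU and sampling weight variables matter for inference, the following is a minimal Python sketch of a design-based standard error for a weighted mean. It is not DAD's own code: the function name `weighted_mean_with_design_se` is hypothetical, the estimator is the standard linearisation approximation for a ratio of weighted totals, and it assumes PSUs are drawn with replacement within strata (so no finite population correction is applied).

```python
from collections import defaultdict
from math import sqrt


def weighted_mean_with_design_se(y, w, strata, psu):
    """Weighted mean and its standard error under a stratified, clustered design.

    Illustrative assumption: PSUs are sampled with replacement within each
    stratum, so the finite population correction is ignored.
    """
    total_w = sum(w)
    mean = sum(wi * yi for wi, yi in zip(w, y)) / total_w

    # Linearised contribution of each observation, summed within each PSU.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for yi, wi, h, c in zip(y, w, strata, psu):
        psu_totals[h][c] += wi * (yi - mean) / total_w

    variance = 0.0
    for h, totals in psu_totals.items():
        n_h = len(totals)  # number of sampled PSUs in stratum h
        if n_h < 2:
            raise ValueError(f"stratum {h!r} needs at least two PSUs")
        z_bar = sum(totals.values()) / n_h
        # Between-PSU variability within the stratum.
        variance += n_h / (n_h - 1) * sum((z - z_bar) ** 2 for z in totals.values())

    return mean, sqrt(variance)


if __name__ == "__main__":
    y = [1.0, 2.0, 3.0, 4.0]
    w = [1.0, 1.0, 1.0, 1.0]
    one_stratum = [1, 1, 1, 1]
    # Each observation is its own PSU: equivalent to simple random sampling.
    print(weighted_mean_with_design_se(y, w, one_stratum, [1, 2, 3, 4]))
    # Observations grouped into two PSUs: same mean, larger standard error.
    print(weighted_mean_with_design_se(y, w, one_stratum, [1, 1, 2, 2]))
```

When similar households are grouped in the same PSU, the between-PSU variation drives the standard error, which is why ignoring clustering tends to overstate precision.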
Performing statistical inference

Estimating confidence intervals and p-values:
- Estimates are included directly for FGT, S-Gini and Atkinson indices
- Can be computed via the “Confidence interval” application in DAD

Testing hypotheses:
- Can be performed directly for FGT, S-Gini and Atkinson indices
- Can be computed via the “Confidence interval” application in DAD

DAD and data files
- Shows two sheets, so that two databases can be loaded simultaneously
- Can read ASCII files safely through a data wizard
- Supports copy/paste to and from the sheets of the most common software (Excel, Stata, ...)
- Offers its own ASCII format for saving data
- Can edit variable information and content
- Can add or delete observations
- Can generate other variables

DAD & Graphs

Flexible graph options. For example, one can easily change the:
- main title, axis titles and legends
- graph size
- template choice
- colour, width and style of curves

Saving DAD’s graphs. One can save DAD’s graphs in:
- DAD Graph Format (*.dgf)
- JPEG, GIF, BMP …

One can also save curves’ coordinates in ASCII format.

Editing curves’ coordinates in a new data sheet

How to learn to use DAD?
- The book entitled POVERTY AND EQUITY: MEASUREMENT, POLICY AND ESTIMATION WITH DAD covers most of the measurement theory implemented in DAD. The book is also a comprehensive reference for intermediate and advanced study in distributive analysis.
- DAD’s user manual provides tools for learning DAD quickly and can lead to rapid use of any of DAD’s applications.
- Exercises & technical notes were written to consolidate the learning of DAD.
- Training sessions are regularly organised to teach distributive analysis and the use of DAD and other software.

DAD’s data files

With DAD, micro data from household surveys are typically required.
A database used in DAD is thus a matrix whose columns are the variables and whose number of lines is the number of observations. DAD can load two databases simultaneously. The maximum number of variables for each DAD file is currently 20.

[Figure: DAD’s spreadsheet]

The structure of a data file

I- Sampling design
- STRATA. Specifies the name of the variable (in an integer format) that contains the stratum identifiers.
- PSU. Specifies the name of the variable (in an integer format) that contains the identifiers for the Primary Sampling Units.
- SAMPLING WEIGHT. Specifies the name of the sampling weights variable.
- FINITE CORRECTION. Gives the Finite Population Correction variable, which is needed when the number of PSUs is small and sampling was done without replacement.

II- Basic distributive variables
- VARIABLE OF INTEREST. This is the variable that usually captures living standards. It can be for the entire household or for individuals (e.g., per capita or per equivalent adult expenditure).
- SIZE VARIABLE. This refers to the “ethical” or physical size of the sampling observation.
- GROUP VARIABLE. Used to perform computations at the group level (integer variable, e.g. Rural (1), Urban (2)).
- GROUP NUMBER. Tells DAD on which value of the GROUP VARIABLE to condition the computation of some distributive statistics. The value for GROUP NUMBER should be an integer. For example, rural households might be assigned a value of 1 for some variable denoted as region.
- SAMPLING WEIGHT. Sampling weights are the inverse of the sampling rate.

Importing data files into DAD

ASCII files

After preparing the required variables, one can export an ASCII file to be read by DAD. To import the data safely, a wizard is used in DAD.
One can also use Copy/Paste to copy data from the sheets of other software (this is riskier, however). A helpful program for preparing DAF files is Stat/Transfer (although it is commercial software).

Launching DAD’s applications

From the main menu one can choose the desired application; applications are organised by main themes. After choosing the desired application, a second window appears to indicate the number of distributions or data files that should be used.

[Figure: DAD’s application for the FGT index]

[Figure: DAD’s window of results]

[Figure: DAD’s graphs (ex. Lorenz curves)]

[Figure: DAD’s graphs (ex. difference between Lorenz curves)]

[Figure: DAD’s graphs (ex. FGT curves (a=0))]

[Figure: DAD’s graphs (ex. Concentration & Lorenz curves)]

[Figure: DAD’s graphs (ex. density curves)]

[Figure: DAD’s graphs (ex. Non parametric regression)]
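As an illustration of how the variable of interest, the size variable and the sampling weight combine in an index such as the FGT, here is a minimal Python sketch of the normalised FGT index. It is not DAD's implementation: the function name `fgt_index` and its arguments are hypothetical, and it assumes that an observation counts as poor when its living standard is strictly below the poverty line `z`.

```python
def fgt_index(y, z, alpha=0.0, weights=None, sizes=None):
    """Normalised FGT poverty index for poverty line z and parameter alpha.

    Each observation counts with weight (sampling weight * size), so that a
    household-level record represents all of its members. Assumption for this
    sketch: an observation is poor when y < z (strict inequality).
    """
    n = len(y)
    weights = [1.0] * n if weights is None else weights
    sizes = [1.0] * n if sizes is None else sizes

    numerator = 0.0
    denominator = 0.0
    for yi, wi, si in zip(y, weights, sizes):
        ws = wi * si
        denominator += ws
        if yi < z:
            # Normalised poverty gap raised to the power alpha.
            numerator += ws * ((z - yi) / z) ** alpha
    return numerator / denominator


if __name__ == "__main__":
    # Three households: per capita expenditure, sampling weight, household size.
    expenditure = [50.0, 150.0, 80.0]
    sampling_weight = [1.0, 1.0, 2.0]
    household_size = [4.0, 2.0, 1.0]
    for a in (0, 1, 2):
        value = fgt_index(expenditure, 100.0, a, sampling_weight, household_size)
        print(f"FGT(alpha={a}) = {value:.4f}")
```

With alpha = 0 the index is the headcount ratio, with alpha = 1 the average normalised poverty gap, and with alpha = 2 it gives more weight to the poorest observations.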