Uploaded by Mitch Campbell

CLEAR Gini Coefficient Math IA

Mathematics HL IA
Comparison of three different methods of computing the Gini coefficient
IBDP 2021
Introduction: Income disparity is a well-known problem today. With every passing day, the rich are becoming richer while
the impoverished face bigger challenges. No thoughtful person can ignore the fact that the majority of the world’s wealth is held
by a few people, and this creates many problems. Economics is a subject I chose and enjoy very much. There is a
misconception among many people that economics is only about money, but it is really about human well-being.
I therefore began to investigate the tools economics offers for understanding income disparity, and this led me to the
Gini coefficient. It allows countries to be ranked by how unequally their income is distributed, and I learned that
mathematics can be used to find its value. I therefore chose the Gini coefficient as my IA topic and decided to investigate
which method is best for finding it. Income disparity is of the utmost importance to us all, because left unchecked it can
disturb the balance of our society.
Image 1: A worrying situation of our country’s economy
Source: Livemint
I wondered how such an important measure, one that informs various government policies, could be determined, given how
much incomes vary across a geographical region and how much data is needed for a precise figure in a country such as
India. This drew my attention to the central role of mathematics in the formulae used to describe income inequality.
The knowledge I gained at school gave me a chance to grasp the key parameters behind some of these formulae, which
challenged me to investigate further. I was eager to apply the analytical and organisational skills we use in our school
work to something more concrete and authentic.
I wanted to use mathematics to better understand the world’s social, economic and political condition, so I decided to
calculate the Gini coefficient using different methods, as it is a measure of income disparity that is ultimately related
to the world’s major challenges.
Background Information: Whenever we want to measure the income disparity of a region, we use the Gini coefficient, and
most governments carry out official computations of it to check their nation’s economic status. Corrado Gini
introduced this measure in 1912. Numerically, the Gini coefficient ranges between 0 and 1 only. A higher value
indicates high income disparity, while a low value indicates a more equal distribution of wealth. A low Gini
coefficient is therefore generally taken as a sign of a more equitable society.
There are two main types of method for finding the Gini coefficient:
● Statistical methods
● Graphical methods
The Lorenz curve method is a graphical method, while the covariance method and the Pareto distribution method are
statistical methods. I am going to use these three approaches in my exploration.
Using all three methods, we shall compute the Gini coefficient of our country, India, for the year 2013. We shall also
compare each of our values with the official value released by our government (G = 0.510).
First Method – Rule of Trapezium (Lorenz Curve)
It is straightforward to find the Gini coefficient from the Lorenz curve.
The Lorenz curve plots the cumulative percentage of the population, ordered from the lowest income to the highest, on the
x-axis against the cumulative share of income on the y-axis. These sections of the population are well defined. The
percentages are cumulative: if quantile 3 is examined, it shows the total wealth of quantiles 1, 2 and 3 together. Zero
percent of the population holds zero percent of the income, which is why the graph passes through the point A(0,0).
Likewise, the whole population holds all of the wealth, so the graph must end at the point B(1,1). A curve that follows
the diagonal from A to B represents a perfectly equal distribution of wealth in the chosen sample; this is the curve
$L_0(x) = x$.
A practical Lorenz curve is one such as $L(x) = \frac{2^{10x} - 1}{1023}$.
We shall measure how far this curve deviates from the line of equality, which tells us the degree of wealth disparity.
The area beneath a curve $L(x)$ between $x = 0$ and $x = 1$ is given by the integral $\int_0^1 L(x)\,dx$.
First, we calculate the area under the line of perfect equality, because the Gini coefficient equals the area between the
two curves divided by this area:
$\frac{1}{2} \times 1 \times 1 = \frac{1}{2}$
Therefore, we can express the Gini coefficient as $G = \frac{L_A}{1/2} = 2L_A$,
where $L_A$ is the area between the two curves and $G$ is the Gini coefficient. This formula is difficult to use directly
in practice: real income data sets are huge, and it is almost impossible to show every category on graph paper. We shall
therefore apply the trapezium rule to the following set of data, which represents our country’s income groups:
Fundamentally, the trapezium rule is a form of numerical integration that estimates the area beneath a curve. The region
under the curve is split into several trapezia, and in the final step all of their areas are added. The whole process is
described here:
First, we add all the areas shown in red, which together approximate the area under the Lorenz curve. $L_A$ is the area
shown in green. We have already computed that the area under $L_0(x)$ is 0.5, so the green area is 0.5 minus the total red
area. We can now calculate the Gini coefficient with the formula shown above.
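The trapezium-rule procedure described above can be sketched in Python. This is only a minimal illustration, not the calculation on the India 2013 data table: it samples the practical Lorenz curve $L(x) = (2^{10x} - 1)/1023$ quoted earlier at the quintile boundaries, and the function names `lorenz` and `gini_trapezium` are assumptions introduced here.

```python
def lorenz(x):
    """Practical Lorenz curve from the text: L(x) = (2^(10x) - 1) / 1023."""
    return (2 ** (10 * x) - 1) / 1023


def gini_trapezium(xs, ys):
    """Estimate the Gini coefficient from cumulative points (xs, ys).

    xs: cumulative population shares, increasing from 0 to 1.
    ys: cumulative income shares at those points.
    """
    area_under_curve = 0.0
    for i in range(1, len(xs)):
        width = xs[i] - xs[i - 1]
        # Area of one trapezium: width times the average of the two heights.
        area_under_curve += width * (ys[i - 1] + ys[i]) / 2
    # The area under the line of equality is 0.5, so L_A = 0.5 - area and G = 2 * L_A.
    return 2 * (0.5 - area_under_curve)


# Illustrative sample points only (quintile boundaries), not the India 2013 data.
xs = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
ys = [lorenz(x) for x in xs]
print(round(gini_trapezium(xs, ys), 4))
```

Because a Lorenz curve is convex, the straight edges of the trapezia lie on or above it, so this estimate comes out somewhat below the value obtained by integrating the same curve exactly.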
Second Method – Polynomial Regression (Lorenz Curve)
Here we use polynomial regression to obtain an expression for the Lorenz curve, and then use that expression to compute
the Gini coefficient. Polynomial regression finds the function that best fits our data, for a polynomial of any chosen
degree and with any coefficients. The strength of the fit can be gauged by the R-squared value.
We can set up a polynomial regression using the method of least squares: we find the coefficients that minimise the sum
of the squared residuals, which also tells us how accurate the regression is. A residual is simply the difference between
an actual value and the corresponding predicted value. The residual is computed as follows:
The sum of the squared residuals can be represented using this expression:
Finally, to obtain the equation of the Lorenz curve, we take the partial derivative of this sum with respect to each
constant and set it equal to zero, because this is the condition for the sum to be a minimum. To model the Lorenz curve
for our country, we shall use a polynomial of degree 2 (a quadratic):
Also:
Therefore, the partial derivatives of the sum are:
Dividing both sides by 2 and substituting the values of the constants, we get the following equations:
These can also be written in matrix form as:
If we multiply both sides of this matrix equation by the transpose of the first matrix, we can then solve for the values
of the constants:
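The least-squares steps described above can be written out explicitly. The following is a sketch under assumed notation, since the original equations are not reproduced here: the quadratic is written as $\hat{y} = ax^2 + bx + c$, and the symbols $a$, $b$, $c$, $e_i$, $S$, $X$ and $\beta$ are names introduced for illustration.

```latex
% Residual of each data point and the sum of squared residuals
e_i = y_i - \hat{y}_i = y_i - (a x_i^2 + b x_i + c), \qquad
S = \sum_{i=1}^{n} e_i^2

% Setting each partial derivative to zero
\frac{\partial S}{\partial a} = -2 \sum_{i=1}^{n} x_i^2 \,(y_i - a x_i^2 - b x_i - c) = 0
\frac{\partial S}{\partial b} = -2 \sum_{i=1}^{n} x_i \,(y_i - a x_i^2 - b x_i - c) = 0
\frac{\partial S}{\partial c} = -2 \sum_{i=1}^{n} (y_i - a x_i^2 - b x_i - c) = 0

% Dividing through by -2 and rearranging gives the normal equations
a \sum x_i^4 + b \sum x_i^3 + c \sum x_i^2 = \sum x_i^2 y_i
a \sum x_i^3 + b \sum x_i^2 + c \sum x_i   = \sum x_i y_i
a \sum x_i^2 + b \sum x_i   + c\,n         = \sum y_i

% Matrix form: row i of X is (x_i^2, x_i, 1) and \beta = (a, b, c)^T
X \beta = y
\;\Longrightarrow\; X^{T} X \beta = X^{T} y
\;\Longrightarrow\; \beta = (X^{T} X)^{-1} X^{T} y
```

The matrix $X^{T}X$ is the three-by-three matrix whose determinant and inverse are needed to solve for the constants.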
This gives a three-by-three matrix. We compute its determinant and then its inverse, which allows us to obtain the
degree-two equation of the Lorenz curve. We begin by substituting the x and y values of our data into the above matrices,
which gives:
We carry out the steps below to calculate the determinant of the matrix:
After this, we obtain the inverse matrix by multiplying the adjugate matrix by the reciprocal of the determinant:
The values of the constants are then obtained as follows:
Finally, we arrive at this quadratic equation:
We can now plot the graph of this equation:
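The matrix calculation described above can be sketched in Python. This is a minimal illustration under stated assumptions: the function names `det3` and `fit_quadratic` are introduced here, the sample points are hypothetical rather than the India 2013 data, and the system is solved with Cramer's rule (ratios of determinants) instead of forming the inverse matrix explicitly, which gives the same constants.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c; returns (a, b, c)."""
    n = len(xs)
    s1 = sum(xs)
    s2 = sum(x ** 2 for x in xs)
    s3 = sum(x ** 3 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    # Left-hand side (X^T X) and right-hand side (X^T y) of the normal equations.
    xtx = [[s4, s3, s2],
           [s3, s2, s1],
           [s2, s1, n]]
    xty = [sum(x ** 2 * y for x, y in zip(xs, ys)),
           sum(x * y for x, y in zip(xs, ys)),
           sum(ys)]
    d = det3(xtx)
    coeffs = []
    for col in range(3):
        # Cramer's rule: replace one column of X^T X by X^T y.
        m = [row[:] for row in xtx]
        for r in range(3):
            m[r][col] = xty[r]
        coeffs.append(det3(m) / d)
    return tuple(coeffs)


# Hypothetical points lying exactly on y = x^2, used only to check the method.
sample_x = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
sample_y = [x ** 2 for x in sample_x]
print(fit_quadratic(sample_x, sample_y))
```

Fitting points that lie exactly on a known quadratic is a quick way to confirm that the normal equations have been set up correctly before applying them to real data.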
It is good practice to cross-check the polynomial regression with a technological tool, and Microsoft Excel is a very
good option. Excel allows us to change the degree of the polynomial regression.
We used Microsoft Excel to apply both the quadratic regression and the fifth-order regression. The screenshot of
Microsoft Excel shows where the order of the polynomial can be changed. We obtained the following results from Excel:
$y = 1.22770x^2 - 0.31910x + 0.03610$
$y = 6.51040x^5 - 12.76000x^4 + 9.01040x^3 - 2.23960x^2 + 0.47920x - 2 \times 10^{-10}$
We settled on a polynomial of degree 5 because of its R-squared value. Microsoft Excel also reports R-squared values,
which measure the accuracy of the regression: the closer the value is to one, the better the regression fits the data.
The table below explains why we chose the polynomial of degree 5 as our primary option:

Regression polynomial order    R-squared value
2                              0.97480
3                              0.99580
4                              0.99960
5                              1.00000

For a polynomial of order five, $R^2 = 1.00000$, so we select the fifth-order polynomial.
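The R-squared value can also be computed by hand. The sketch below uses the standard coefficient of determination, $R^2 = 1 - SS_{res}/SS_{tot}$, which is assumed here to correspond to the value Excel reports for a trendline; the function name `r_squared` and the sample numbers are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    # Sum of squared residuals between actual and predicted values.
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    # Total sum of squares of the actual values about their mean.
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    return 1 - ss_res / ss_tot


# Hypothetical values: a perfect fit gives 1, predicting the mean gives 0.
print(r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
print(r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))
```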
The quadratic obtained from Excel is almost identical to the one we found manually, so the regression we carried out with
matrices can be trusted.
We now compute the residuals of the regression by comparing it with the actual data:
The residuals show that our degree-two equation does not fit the Lorenz curve well, because it does not pass through the
point B(1,1). So I chose to use the polynomial regression of degree 5, because we have 6 data points and a fifth-degree
polynomial can pass through all of them. We can show it this way:
The above information gives:
This is our country’s Lorenz curve for the year 2013; its scatterplot is shown below:
We can now compute the value of the Gini coefficient:
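Once the Lorenz curve is a polynomial, the Gini coefficient follows from $G = 1 - 2\int_0^1 L(x)\,dx$, and the integral can be taken term by term. The sketch below applies this to the fifth-order polynomial reported by Excel; the function name `gini_from_polynomial` is an assumption, and because the coefficients are the rounded values Excel displays, the output need not match the figure reported in the results exactly.

```python
def gini_from_polynomial(coeffs):
    """Gini coefficient for a polynomial Lorenz curve on [0, 1].

    coeffs are listed from the highest power down to the constant term.
    """
    degree = len(coeffs) - 1
    area = 0.0
    for i, c in enumerate(coeffs):
        power = degree - i
        # The integral of c * x^power over [0, 1] is c / (power + 1).
        area += c / (power + 1)
    # G = 1 - 2 * (area under the Lorenz curve).
    return 1 - 2 * area


# Fifth-order regression coefficients as displayed by Excel (rounded).
fifth_order = [6.51040, -12.76000, 9.01040, -2.23960, 0.47920, -2e-10]
print(round(gini_from_polynomial(fifth_order), 4))
```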
Third Method – Covariance Formula
This is our third and final method. It works by finding the covariance between the income levels and the cumulative
distribution of income at those levels. The Lorenz curve can be written as:
We assume that G(x) is a cumulative distribution function (CDF). We also assume that the function f(x) is continuously
differentiable, and that the following condition holds:
For a known value of x, p can be defined as:
Combining the above assumptions with what we know about the Gini coefficient, we obtain the Gini coefficient formula:
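The link between the Lorenz curve and the covariance can be sketched as follows. This is an outline under assumed notation, not necessarily the exact derivation used in this exploration: $f$ is taken to be the density belonging to the CDF $G(x)$, $\mu$ is the mean income, and $\mathrm{Gini}$ denotes the coefficient to avoid a clash with $G(x)$.

```latex
% Lorenz curve at p = G(x): share of total income held by incomes up to x
L(p) = \frac{1}{\mu} \int_{0}^{x} t\, f(t)\, dt, \qquad p = G(x)

% Gini coefficient as twice the area between the diagonal and the Lorenz curve
\mathrm{Gini} = 1 - 2 \int_{0}^{1} L(p)\, dp

% Integrating by parts gives \int_0^1 L(p)\,dp = 1 - \frac{1}{\mu} E[x\, G(x)], and E[G(x)] = \tfrac{1}{2}, so
\mathrm{Gini} = \frac{2}{\mu} \left( E[x\, G(x)] - \frac{\mu}{2} \right)
              = \frac{2\, \operatorname{cov}\big(x,\, G(x)\big)}{\mu}
```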
The table below shows the quantiles for India for 2013:
We shall use the data above in the following formula to calculate the covariance:
Here x is the independent variable, y is the dependent variable, n is the total number of data points, and the mean of
each variable is denoted with a bar ($\bar{x}$, $\bar{y}$). We first compute the means of the data:
Substituting the above into the covariance formula, the result is:
Finally, we apply the factor of 2 divided by the mean of the data to the above result, and the covariance formula gives our required result:
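The covariance method can be sketched in Python using the standard identity that the Gini coefficient equals twice the covariance between income and its cumulative distribution, divided by the mean income. This is a minimal illustration under stated assumptions: the function name `gini_covariance` is introduced here, the incomes are hypothetical individual values rather than the grouped India 2013 quantiles, the cumulative distribution is approximated by rank divided by n, and the covariance is the population form (dividing by n).

```python
def gini_covariance(incomes):
    """Gini coefficient as 2 * cov(income, cumulative distribution) / mean income."""
    ys = sorted(incomes)
    n = len(ys)
    # Empirical cumulative distribution: rank of each income divided by n.
    fs = [(i + 1) / n for i in range(n)]
    mean_y = sum(ys) / n
    mean_f = sum(fs) / n
    # Population covariance between income and its cumulative distribution.
    cov = sum((y - mean_y) * (f - mean_f) for y, f in zip(ys, fs)) / n
    return 2 * cov / mean_y


# Hypothetical incomes, for illustration only.
print(gini_covariance([1, 2, 3, 4, 5]))
print(gini_covariance([5, 5, 5, 5]))
```

When every income is equal the covariance is zero, so the coefficient is zero, which matches the interpretation of the line of perfect equality.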
Discussion: I used three methods to compute the Gini coefficient in this exploration. One method was based on the
covariance formula, while the other two used the Lorenz curve.
The results of the three methods were:

Method                 Result
Trapezium rule         0.4200
Lorenz curve method    0.4330
Covariance method      0.4950
The three methods give answers that are not very different from one another. We can therefore see that all of them are
appropriate for computing a coefficient that helps us better understand the income disparity of a country. A detailed
discussion of each method follows.
When the trapezium rule is compared with the second method, the outcomes are not the same, and the trapezium rule is the
less accurate of the two. It treats the curve as a series of straight-line segments, and because the Lorenz curve is
convex these segments lie above it, which inflates the area beneath the curve. A larger area below the Lorenz curve means
a smaller area between the two curves, so the Gini coefficient obtained with this method is slightly too small.
Comparing the second and third methods, we notice that both give values smaller than the one reported by the central
government of India (G = 0.5100). The second method clearly shows a much greater error than the third. The reason may be
that this method relies on quantiles, which introduces larger estimation errors and variability. The covariance formula,
our third method, gives a result very close to the Indian central government's figure, because we used 5 coordinate pairs
in it.
All of the methods for calculating the Gini coefficient are serviceable, but when they are compared the covariance method
clearly has an edge over the others, since it gives the value closest to the official one. The Lorenz curve method is
preferable when the data set is very large, because it is more straightforward. I do not consider the third method to be
fully independent; I see it as an extension of the second method that gives a more comprehensive value of the Gini
coefficient. It can serve as a complement to the second method by providing a more comprehensive measure of income
disparity.
Overall, the Gini coefficient has limitations as a measure of disparity, because it divides the sample into segments and
does not take into account how income is distributed within each segment. Even so, it can certainly serve as a common
reference point when disparity is measured with various other indicators.
Evaluation:
● The Gini coefficient enables us to understand income disparity and to compare it across different countries. I did not
do this in my exploration, but it could be included in an extended version: we explored three methods, but not
different nations.
● In this exploration we used data from our census only. This data relates only to income, so the scope of the
exploration was restricted. In spite of these restrictions, we were able to apply all three methods and obtain a
reasonable value for the Gini coefficient.
● We demonstrated the detailed calculation for each method, and this was the main strength of this exploration.
Conclusion: In this exploration I used all three methods and calculated a value of the Gini coefficient with each.
Although the answers differed slightly, I feel that the task gave me a statistical understanding of inequality and of how
it is harming countries at the present time. I can now think about income disparity in a more informed way. It is
striking how a small number of families hold almost all the wealth of the world. It surprised me to see how steeply the
Lorenz curve rises at the higher quantiles: while tens of millions of people lack daily necessities, a few people sit at
the very top. This task has widened my viewpoint and helped me to develop a deeper understanding of the topic.
Bibliography:
6523.0 - Household Income and Wealth, Australia, 2017-18. Australian Bureau of Statistics. (2019). Retrieved 23 January 2020, from https://www.abs.gov.au/ausstats/abs@.nsf/Lookup/by%20Subject/6523.0~201718~Main%20Features~Distribution%20of%20Household%20Income%20and%20Wealth~6.
Chappelow, J. (2019). Gini Index Definition. Investopedia. Retrieved 22 January 2020, from https://www.investopedia.com/terms/g/gini-index.asp.
Haitovsky, Y. (2004). International encyclopedia of the social & behavioral sciences.
Mexico - Gini Coefficient. Trading Economics. Retrieved 23 January 2020, from https://tradingeconomics.com/mexico/gini-coefficient-wb-data.html.
Norway - Gini Index. Trading Economics. Retrieved 23 January 2020, from https://tradingeconomics.com/norway/gini-index-wb-data.html.
South Africa - Income distribution. Index Mundi. Retrieved 23 January 2020, from https://www.indexmundi.com/facts/south-africa/income-distribution.
South Africa | ZA: Gini Coefficient (Gini Index): World Bank Estimate | Economic Indicators. CEIC Data. Retrieved 23 January 2020, from https://www.ceicdata.com/en/south-africa/poverty/za-gini-coefficient-gini-index-world-bank-estimate.
Bourne, Murray. "The Gini Coefficient of Wealth Distribution." Intmathcom RSS. N.p., 24 Feb. 2010. Web. 07 Mar. 2017.
Nair, Remya. "IMF Warns of Growing Inequality in India and China." http://www.livemint.com/. Livemint, 03 May 2016. Web. 07 Mar. 2017.
"Finding Residuals." Interactivate: Finding Residuals. CSERD, n.d. Web. 23 Mar. 2017.