Uploaded by Morris King'ang'i

Mathe matics economics-383623049

Introductory Quantitative Methods in Economics and Business 1
Coursework
Student Name
Student ID
Date
Question 1
a1)
Computation of descriptive statistics for the amount that students spend on
alcohol, tobacco and other narcotics (ATN).
The following table shows the computed statistics for ATN variable.
Statistic            Value
Average              19.33
Median               14.00
Standard Deviation   22.83
The average is calculated through the formula
$$\bar{x} = \frac{\sum x_i}{N}$$
This involves adding all the values of student spending on alcoholic drinks,
tobacco, and narcotics and dividing by the number of students.
The formula for standard deviation is
$$\sigma_x = \sqrt{\frac{\sum (x_i - \bar{x})^2}{N}}$$
The median is the middle value when the observations are arranged in increasing order. In
this case, with 170 observations, the median is the average of the two middle values, i.e. the 85th and 86th
values, which is (14 + 14)/2 = 14. This means that half of the students in the
dataset spend less than £14 on ATN while the other half spend more than £14 on ATN.
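As a minimal sketch of the three statistics described above, the following Python snippet computes the mean, median and standard deviation with the standard library. The six spending values are hypothetical and are not taken from the coursework dataset; `pstdev` is used because the formula above divides by N.

```python
import statistics

# Hypothetical weekly ATN spending (in pounds) for six students.
# These values are illustrative only and do not come from the dataset.
spending = [0, 5, 12, 16, 30, 75]

mean = statistics.fmean(spending)      # sum of the values divided by N
median = statistics.median(spending)   # average of the two middle values when N is even
pop_sd = statistics.pstdev(spending)   # square root of the mean squared deviation (divides by N)

print(f"mean={mean:.2f}, median={median:.2f}, sd={pop_sd:.2f}")
```

In this made-up example the mean (23) exceeds the median (14), the same pattern that signals right skew in the ATN data.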
a2)
The average value of 19.33 is larger than the median value of 14.00. This
means that the distribution is positively skewed, with a longer tail to the right. In terms
of student spending, a small number of students spend
large amounts on alcohol, tobacco and narcotics, while most students
spend comparatively little.
b)
Statistic            Value
Average              19.34
Median               0
Standard Deviation   72.07
The distribution of spending on restaurants and hotels is more right-skewed than that of
spending on alcohol, tobacco, and narcotics because the mean is far larger than the
median. At least half of the students spend nothing on hotels and restaurants.
Additionally, spending on hotels and restaurants is more dispersed than
spending on alcohol, tobacco, and narcotics, as indicated by the standard deviations of
72.07 and 22.83 respectively. Being more dispersed means that the differences
between students in hotel and restaurant spending are larger than the differences
between students in spending on alcohol, tobacco, and narcotics.
c1)
The sample mean is 17. The sample mean is approximately normally distributed with a mean of 19.33 and a standard
error of 12.04/√5 = 5.38.
c2)
The sample was selected using a three-digit random number table. If the number
was greater than 170 it was ignored. If the number was less than or equal to 170 then the
student with that ID was recruited for the sample. The process was repeated to obtain a
sample of five students. The student IDs of the selected students are 19, 37, 63, 111, and 136.
This selection used simple random sampling, in which each observation in the
variable under consideration has an equal chance of being selected.
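The selection procedure described above can be sketched in Python. This is an illustration under assumptions: a seeded pseudo-random generator stands in for the printed random number table, numbers are drawn from 000 to 999, and the function name `draw_ids` is hypothetical. It will not reproduce the IDs listed above.

```python
import random

def draw_ids(population_size=170, sample_size=5, seed=None):
    """Mimic reading a random number table and keeping only valid student IDs."""
    rng = random.Random(seed)
    chosen = []
    while len(chosen) < sample_size:
        number = rng.randint(0, 999)  # stands in for one three-digit table entry
        # Ignore numbers outside 1..population_size and IDs already selected.
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
    return chosen

ids = draw_ids(seed=1)
print(sorted(ids))
```

Because every valid ID is equally likely to be drawn at each step, each student has the same chance of entering the sample.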
The formula for sample standard deviation is
$$s_x = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$
and the sample standard deviation is 12.04
The computation is done by first calculating the mean, or average, spending
amount for the 170 students. This involves adding all the amounts of spending and dividing
the resulting figure by the total number of students, that is 170. The figure obtained is the
arithmetic mean, or the average amount spent. The mean spending is then subtracted from
each of the amounts of spending to obtain a difference for each student. The difference
is then squared for each student and the squared values are summed to obtain
the single value in the numerator under the square root in the formula, $\sum (x_i - \bar{x})^2$. This figure is
then divided by 169, which is (n-1). This gives the variance of the sample data on spending.
Taking the square root of the variance gives the standard deviation of the amount that
students spend. The standard deviation so obtained represents the extent of dispersion or
variation of spending by students on alcohol, tobacco and other narcotics.
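The steps just described (mean, deviations, squares, sum, division by n-1, square root) can be sketched as follows. The five spending values are hypothetical, not the sampled students' figures, and `sample_sd` is an illustrative helper name. The last line reproduces the standard error quoted in part c1 from the reported sample standard deviation of 12.04.

```python
import math

def sample_sd(values):
    """Sample standard deviation, following the steps in the text."""
    n = len(values)
    mean = sum(values) / n                                   # arithmetic mean
    squared_deviations = [(x - mean) ** 2 for x in values]   # (x_i - mean)^2 for each student
    variance = sum(squared_deviations) / (n - 1)             # divide by n - 1
    return math.sqrt(variance)

# Hypothetical spending values for a sample of five students.
sample = [4, 9, 15, 22, 30]
s = sample_sd(sample)

# Standard error of the mean using the figures reported in the coursework.
standard_error = 12.04 / math.sqrt(5)

print(f"s={s:.2f}, standard error={standard_error:.2f}")
```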
Question 2
a)
Distribution of total expenditure
The classification of total expenditure is obtained by first summing all the expenditure of
each student to obtain that student's total spending. The totals are then sorted in Excel
in ascending order, from the smallest to the largest. Since quintiles require the data to be
divided into five ascending groups of equal size, the total number of students
(170) is divided by 5 to obtain 34. This is the number of students in each quintile.
The first 34 amounts are added to obtain the total expenditure of the students in the
first quintile. The process is repeated for the next 34 students to obtain
the total expenditure for quintile 2. Total expenditure for the other quintiles is obtained by
the same method, with results shown in the following table. All the workings were
done in the Microsoft Excel spreadsheet package.
Quintile   Total expenditure by quintile
1          £4177
2          £9697
3          £13391
4          £16913
5          £46961
Total      £91139
The data shows the distribution of total expenditure by the students in each quintile.
Each of the categories has 34 students, totalling 170 students. The data was organized in
ascending order and divided into five quintiles. The total value shows the sum of all amounts
spent by the students. The table shows that students in the 5th quintile spent over ten times
more than those in the first quintile. The total spending amounted to
£91139.
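The sort-and-split procedure described above was carried out in Excel; a rough Python equivalent is sketched below. The ten spending totals are hypothetical and `quintile_totals` is an illustrative name. The sketch assumes the number of students is divisible by five, as it is for 170.

```python
def quintile_totals(student_totals):
    """Sort total spending per student and sum it within five equal-sized groups."""
    ordered = sorted(student_totals)   # ascending, smallest spender first
    size = len(ordered) // 5           # students per quintile (34 when there are 170)
    return [sum(ordered[i * size:(i + 1) * size]) for i in range(5)]

# Hypothetical total spending for ten students (two per quintile).
spend = [120, 40, 300, 75, 210, 55, 95, 500, 160, 20]
groups = quintile_totals(spend)

print(groups)        # total expenditure in quintiles 1 to 5
print(sum(groups))   # equals the sum of all individual totals
```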
a2)
To obtain the cumulative percentage of the population, the number of students in each
quintile (34) was divided by 170 to obtain a proportion of 0.2, which
corresponds to 20%. These proportions were then added successively, so that the
cumulative value reaches 100% at quintile five. Similarly, the
expenditure in each quintile was divided by the total sum of all
expenditure to obtain the percentage of spending accounted for by that quintile.
Each percentage was added to the running total to obtain the cumulative expenditure schedule
shown in the following table.
Quintile   Cumulative % of population   Cumulative % of expenditure
1          20%                          5%
2          40%                          15%
3          60%                          30%
4          80%                          48%
5          100%                         100%
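The cumulative schedule can be checked directly from the quintile totals reported earlier. The sketch below uses only those five totals; the variable names are illustrative.

```python
from itertools import accumulate

# Total expenditure by quintile, as reported in the table for part a).
quintile_totals = [4177, 9697, 13391, 16913, 46961]
grand_total = sum(quintile_totals)

# Each quintile holds 34 of 170 students, i.e. 20% of the population.
cum_population = [20 * (i + 1) for i in range(5)]

# Running total of expenditure expressed as a percentage of the grand total.
cum_expenditure = [100 * c / grand_total for c in accumulate(quintile_totals)]

for quintile, (pop, exp) in enumerate(zip(cum_population, cum_expenditure), start=1):
    print(f"{quintile}  {pop}%  {exp:.0f}%")
```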
The results show the cumulative population and expenditure. The distributions show a large
inequality in spending among students. Students in the upper quintiles have higher spending
than those in the lower quintiles. Indeed, the students in the
5th quintile, who account for 20% of all students, made 52% of total spending. Similarly, the
students in the first quintile, who account for 20% of all students, made only 5% of total
spending. This shows a large disparity between the highest and lowest spenders.
b.
[Figure 1: Lorenz curve. Horizontal axis: cumulative student percentage; vertical axis: cumulated percent of expenditure; plotted series: line of equality and Lorenz curve.]
The Lorenz curve shown in Figure 1 presents the distribution of expenditure
among students. The curve indicates that the lowest-spending 20% of students account for a cumulative
expenditure of 5%, while the lowest-spending 80% account for a cumulative expenditure of 48%. If
spending in all quintiles were equal, the Lorenz curve would follow the line of equality. The line
of equality represents the situation where every student has equal income or
wealth.
c.
The Gini coefficient measures the inequality of a distribution. It is a ratio with values between 0
and 1. The numerator is the area between the Lorenz curve of the distribution and the line of
equality; the denominator is the area under the line of equality.
The Gini coefficient is 0.49. The value is estimated from the cumulative
expenditure and cumulative population percentages. A smaller value is preferred in
a society where equality is advocated.
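One common way to approximate the area-ratio definition above is the trapezoid rule applied to the points of the Lorenz curve. The sketch below applies it to the five quintile totals from part a); `gini_from_cumulative` is an illustrative name, and this is not necessarily the method used to obtain the reported 0.49. With only five groups the approximation is coarse (it gives roughly 0.41 here) because it ignores inequality within each quintile, so it need not match a figure computed from the individual student data.

```python
from itertools import accumulate

def gini_from_cumulative(cum_pop, cum_exp):
    """Trapezoid-rule Gini from cumulative population and expenditure shares (0 to 1)."""
    area_under_lorenz = 0.0
    prev_p, prev_e = 0.0, 0.0
    for p, e in zip(cum_pop, cum_exp):
        # Trapezoid between two consecutive points on the Lorenz curve.
        area_under_lorenz += (p - prev_p) * (prev_e + e) / 2
        prev_p, prev_e = p, e
    # Area under the line of equality is 0.5, so
    # Gini = (0.5 - area_under_lorenz) / 0.5 = 1 - 2 * area_under_lorenz.
    return 1 - 2 * area_under_lorenz

quintile_totals = [4177, 9697, 13391, 16913, 46961]
grand_total = sum(quintile_totals)
cum_exp = [c / grand_total for c in accumulate(quintile_totals)]
cum_pop = [0.2, 0.4, 0.6, 0.8, 1.0]

gini = gini_from_cumulative(cum_pop, cum_exp)
print(f"{gini:.3f}")
```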
d.
A higher Gini coefficient implies greater inequality, where high-income individuals receive or
spend large amounts. This means that this year's cohort was more unequal compared to last
year's cohort because the Gini coefficient of 0.49 for this year is greater than the 0.35 Gini
coefficient for last year.
Question 3
a.
=COUNTIF(C2:C171, ">0"), which yields 108 students who consumed alcohol, tobacco
or narcotics
b1.
$$P(ATN) = \frac{ATN}{Total} = \frac{108}{170} = 0.635$$
b2.
$$P(ATN') = 1 - P(ATN) = 0.365$$
c.
=COUNTIF(J2:J171, ">0")
50 students recorded positive expenditure on recreation and culture (RC)
d.
=COUNTIFS(C2:C171, ">0", J2:J171, ">0")
Number of students   ATN   No-ATN   Total
RC                   33    17       50
No-RC                75    45       120
Total                108   62       170
e.
Proportion of students   ATN    No-ATN   Total
RC                       0.19   0.10     0.29
No-RC                    0.44   0.26     0.71
Total                    0.64   0.36     1.00
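The proportions in part e. follow from dividing each count in part d. by 170. A minimal Python sketch, using the four cell counts from the table (the dictionary layout and variable names are illustrative):

```python
# Cell counts from the table in part d.
counts = {
    ("RC", "ATN"): 33,
    ("RC", "No-ATN"): 17,
    ("No-RC", "ATN"): 75,
    ("No-RC", "No-ATN"): 45,
}
total = sum(counts.values())

# Joint proportions: each cell divided by the total number of students.
joint = {cell: n / total for cell, n in counts.items()}

# Marginal proportions are sums across a row or a column.
p_rc = joint[("RC", "ATN")] + joint[("RC", "No-ATN")]
p_atn = joint[("RC", "ATN")] + joint[("No-RC", "ATN")]

for cell, p in joint.items():
    print(cell, f"{p:.2f}")
print(f"P(RC)={p_rc:.2f}, P(ATN)={p_atn:.2f}")
```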
f.
$$P(ATN \mid RC) = \frac{P(ATN, RC)}{P(RC)} = \frac{0.19}{0.29} = 0.66$$
g.
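The consumption of ATN is not independent of RC. This is because the conditional
probability P(ATN|RC) = 0.66 differs from the unconditional probability P(ATN) = 0.635; under
independence the two would be equal. The difference is small, so the dependence is weak. Additionally, if a student can afford to pay for
recreation and culture, there is a good chance that the student can also pay for ATN.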
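Independence can be checked numerically by comparing P(ATN|RC) with P(ATN), or equivalently P(ATN, RC) with P(ATN) × P(RC). The sketch below uses the counts reported in parts a. to d.; the variable names are illustrative.

```python
# Counts reported in the coursework.
n_total = 170   # all students
n_atn = 108     # students with positive ATN spending
n_rc = 50       # students with positive RC spending
n_both = 33     # students with positive spending on both

p_atn = n_atn / n_total
p_rc = n_rc / n_total
p_joint = n_both / n_total
p_atn_given_rc = n_both / n_rc   # same as p_joint / p_rc

# Under independence, P(ATN|RC) equals P(ATN) and P(ATN, RC) equals P(ATN) * P(RC).
p_product = p_atn * p_rc

print(f"P(ATN|RC)={p_atn_given_rc:.3f} vs P(ATN)={p_atn:.3f}")
print(f"P(ATN,RC)={p_joint:.3f} vs P(ATN)*P(RC)={p_product:.3f}")
```

The two pairs of values are close but not equal, which is why the events are not exactly independent in this sample.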
Question 4
a.
In statistics, a confidence interval is a range of values, computed from sample data, that is
expected to contain a population parameter with a stated level of confidence. It expresses the degree of uncertainty
about where the population parameter lies. The most commonly
used confidence levels are 95% and 99%. The confidence interval for a population parameter
is estimated from a statistic computed on a sample taken from the population, by
adding and subtracting a multiple of the standard error of the statistic from the statistic itself to give a
range for the parameter.
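The population standard deviation is σ = 6, n is 170, and the sample mean is 19.33.
The confidence interval is given by the following formula
$$CI = \left(\bar{x} - z \times \frac{\sigma}{\sqrt{n}},\ \bar{x} + z \times \frac{\sigma}{\sqrt{n}}\right)$$
$$= \left(19.33 - 1.96 \times \frac{6}{\sqrt{170}},\ 19.33 + 1.96 \times \frac{6}{\sqrt{170}}\right)$$
$$= (18.4, 20.2)$$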
With 95% confidence, the population mean is between 18.4 and 20.2, based on a sample of 170 students.
This means that we are 95% confident that the average spending of students on ATN lies
between £18.4 and £20.2.
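The interval above can be reproduced with a few lines of Python. This is a sketch using the figures stated in the text (mean 19.33, standard deviation 6, n = 170); `z_interval` is an illustrative name, and the critical value is taken from the standard normal distribution rather than typed in as 1.96.

```python
import math
from statistics import NormalDist

def z_interval(mean, sd, n, confidence=0.95):
    """Confidence interval for a mean when the standard deviation is known."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    margin = z * sd / math.sqrt(n)                   # z times the standard error
    return mean - margin, mean + margin

low, high = z_interval(19.33, 6, 170)
print(f"({low:.1f}, {high:.1f})")
```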
b.
When the standard deviation of the population is unknown, the confidence
interval is computed with the sample standard deviation and Student's
t-distribution. However, with a large sample size, Student's t-distribution
is well approximated by the normal distribution, so the critical value of 1.96 is used. The confidence interval is computed as
follows:
$$CI = \left(\bar{x} - t \times \frac{s}{\sqrt{n}},\ \bar{x} + t \times \frac{s}{\sqrt{n}}\right)$$
$$= \left(19.33 - 1.96 \times \frac{22.83}{\sqrt{170}},\ 19.33 + 1.96 \times \frac{22.83}{\sqrt{170}}\right)$$
$$= (15.9, 22.8)$$
With 95% confidence, the population mean is between 15.9 and 22.8, based on a sample of 170 students.
c.
Based on my findings, I do not agree with the newspaper article's claim that university students
usually spend £25. This is because £25 lies outside the confidence interval calculated at a
95% confidence level.
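The comparison with the newspaper figure can be sketched as a simple containment check. The snippet below recomputes both intervals from parts a. and b. with the critical value 1.96 used in the text and tests whether £25 falls inside each; the helper name `interval` and the dictionary labels are illustrative.

```python
import math

def interval(mean, sd, n, critical=1.96):
    """Interval of the form mean plus or minus critical * sd / sqrt(n)."""
    margin = critical * sd / math.sqrt(n)
    return mean - margin, mean + margin

claimed = 25   # spending figure quoted by the newspaper article
results = {}
for label, sd in (("known sd (part a)", 6), ("sample sd (part b)", 22.83)):
    low, high = interval(19.33, sd, 170)
    # Record the rounded bounds and whether the claimed value lies inside.
    results[label] = (round(low, 1), round(high, 1), low <= claimed <= high)

for label, (low, high, inside) in results.items():
    print(f"{label}: ({low}, {high}) contains {claimed}? {inside}")
```

In both cases the claimed value lies above the upper bound of the interval.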