Introductory Quantitative Methods in Economics and Business 1
Coursework
Student Name
Student ID
Date

Question 1

a1) Computation of descriptive statistics for the amount that students spend on alcohol, tobacco and other narcotics (ATN).

The following table shows the computed statistics for the ATN variable.

Statistic             Value
Average               19.33
Median                14.00
Standard Deviation    22.83

The average is calculated through the formula $\bar{x} = \frac{\sum x_i}{N}$. This involves adding all the values of student spending on alcoholic drinks, tobacco, and narcotics and dividing by the number of students.

The formula for the standard deviation is $\sigma_x = \sqrt{\frac{\sum (x_i - \bar{x})^2}{N}}$.

The median is the middle value when the observations are arranged in increasing order. In this case, the median is the average of the two middle values, i.e. the 85th and 86th values: (14 + 14)/2 = 14. This means that half of the students in the dataset spend less than £14 on ATN while the other half spend more than £14 on ATN.

a2) The average value of 19.33 is larger than the median value of 14.00. This means that the data distribution is positively skewed, with a longer tail to the right. In terms of student spending, relatively few students spend large amounts on alcohol, tobacco and narcotics, while a large number spend comparatively little.

b) Statistics for spending on restaurants and hotels:

Statistic             Value
Average               19.34
Median                0
Standard Deviation    72.07

The distribution of spending on restaurants and hotels is more right-skewed than that of spending on alcoholic drinks, tobacco and narcotics because the mean is far larger than the median. At least half of the students spend nothing on hotels and restaurants. Additionally, spending on hotels and restaurants is more dispersed than spending on alcohol, tobacco, and narcotics, as indicated by the standard deviations of 72.07 and 22.83 respectively.
Being more dispersed means that the differences between students' spending on hotels and restaurants are larger than those between students' spending on alcohol, tobacco, and narcotics.

c1) The sample mean is 17. The sample mean is normally distributed with a mean of 19.33 and a standard error of $12.04/\sqrt{5} = 5.38$.

c2) The sample was selected using a 2-digit random number table. If the number was greater than 170 it was ignored. If the number was less than or equal to 170, the student with that ID was recruited for the sample. The process was repeated to obtain a sample of five students. The student IDs of the selected students are 19, 37, 63, 111, and 136. This selection made use of simple random sampling, where each observation in the variable under consideration was given an equal chance of being selected.

The formula for the sample standard deviation is $s_x = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$, and the sample standard deviation is 12.04.

The computation is done by first calculating the mean, or average, spending amount for the 170 students. This involves adding all the amounts of spending and dividing the resulting figure by the total number of students, that is 170. The figure obtained is the arithmetic mean, or the average amount spent. The mean spending is then subtracted from each of the amounts of spending to obtain a difference for each student. The difference is then squared for each student, and the squared values are summed to obtain the numerator under the square root in the formula, $\sum (x_i - \bar{x})^2$. The figure is then divided by 169, which is (n-1). This gives the variance of the sample data on spending. Taking the square root of the variance gives the standard deviation of the amount that students spend. The standard deviation so obtained represents the extent of dispersion or variation of spending by students on alcohol, tobacco and other narcotics.
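The difference between the population formula (dividing by N) and the sample formula (dividing by n - 1), and the standard error quoted in c1, can be illustrated with a short Python sketch. The five spending values below are hypothetical and are not the coursework data; only the figures 12.04 and n = 5 are taken from the text. The helper name `standard_error` is likewise an assumption made for illustration.

```python
import math
import statistics

# Hypothetical spending values (in pounds) for five students.
# These are NOT the coursework data; they only illustrate the formulas.
sample = [0.0, 5.0, 12.0, 20.0, 30.0]

n = len(sample)
mean = sum(sample) / n

# Sum of squared deviations from the mean (the numerator in both formulas).
sum_sq = sum((x - mean) ** 2 for x in sample)

pop_sd = math.sqrt(sum_sq / n)           # population formula: divide by N
sample_sd = math.sqrt(sum_sq / (n - 1))  # sample formula: divide by n - 1

# Cross-check the hand-written formulas against the standard library.
assert math.isclose(pop_sd, statistics.pstdev(sample))
assert math.isclose(sample_sd, statistics.stdev(sample))


def standard_error(sd, n):
    """Standard error of the mean: sd divided by the square root of n."""
    return sd / math.sqrt(n)


# Values reported in c1: sample standard deviation 12.04, sample size 5.
se_reported = standard_error(12.04, 5)
print(round(se_reported, 2))  # 5.38
```

For the same data, the sample formula always gives a slightly larger value than the population formula, and the gap is most noticeable for small samples such as n = 5.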
Question 2

a) Distribution of total expenditure

The classification of total expenditure is obtained by first summing all the expenditure of each student to obtain total spending. Values are then arranged in ascending order, from the smallest to the largest, using the sort function in Excel. Since quintiles require the data to be divided into five ascending groups, the total number of students (170) is divided by 5 to obtain 34. This is the number of students in each quintile. The first 34 amounts are summed to obtain the total expenditure of the first quintile. The process is repeated for the next 34 students to obtain the total expenditure for quintile 2. Total expenditure for the other quintiles is obtained by the same method, with results shown in the following table. All the workings were done in the Microsoft Excel spreadsheet package.

Quintile    Total expenditure by quintile
1           £4177
2           £9697
3           £13391
4           £16913
5           £46961
Total       £91139

The data shows the distribution of total expenditure by the students in each group. Each of the groups has 34 students, totalling 170 students. The data was organized in ascending order and divided into five quintiles. The total shows the sum of all amounts spent by the students. The table shows that students in the 5th quintile spent, on average, over ten times as much as those in the first quintile. The total spending amounted to £91139.

a2) To obtain the cumulative percentage of the population, the number of students in each quintile (34) was divided by 170 to obtain a proportion of 0.2, which corresponds to 20%. The proportion for each successive quintile was then added to obtain a cumulative value that reaches 100% at quintile five.
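The grouping steps described in a) and the cumulative population percentages in a2) can be sketched in Python as an alternative to the spreadsheet workflow. The data here are synthetic, generated with a fixed seed, and stand in for the coursework spreadsheet, which is not reproduced in this document. The function name `quintile_totals` is a hypothetical helper.

```python
import random


def quintile_totals(values):
    """Sort ascending, split into five equal groups, and sum each group.

    Assumes the number of values is divisible by 5 (as with 170 students).
    """
    ordered = sorted(values)
    size = len(ordered) // 5
    return [sum(ordered[i * size:(i + 1) * size]) for i in range(5)]


# Synthetic total spending for 170 students (NOT the coursework dataset).
random.seed(0)
spending = [round(random.expovariate(1 / 500)) for _ in range(170)]

totals = quintile_totals(spending)

# Each quintile holds 34 of 170 students, i.e. a share of 0.2.
group_share = (len(spending) // 5) / len(spending)
cum_population = [group_share * (k + 1) for k in range(5)]

print(totals)
print([f"{p:.0%}" for p in cum_population])
```

Because the values are sorted before grouping, the five totals are in non-decreasing order and add up to the overall total.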
Similarly, the expenditure in each quintile was divided by the total sum of all expenditure to obtain the percentage of spending accounted for by each quintile. Each figure was added to the next to obtain the cumulative expenditure schedule shown in the following table.

Quintile    Cumulative % of population    Cumulative % of expenditure
1           20%                           5%
2           40%                           15%
3           60%                           30%
4           80%                           48%
5           100%                          100%

The results show the cumulative population and expenditure. The distribution shows a large inequality in spending among students. Students in the upper quintiles have higher spending than those in the lower quintiles. Indeed, the students in the 5th quintile, who account for 20% of all students, accounted for 52% of total expenditure. Similarly, the students in the first quintile, who also account for 20% of all students, accounted for only 5% of the total. This shows a large disparity between the highest-spending and the lowest-spending students.

b.
[Figure 1: Lorenz curve. Horizontal axis: cumulative percentage of students. Vertical axis: cumulative percentage of expenditure. Series: Lorenz curve and line of equality.]

The Lorenz curve shown in Figure 1 presents the distribution of expenditure among students. The curve indicates that the first 20% of students account for 5% of cumulative expenditure, while the first 80% account for 48%. If spending in all quintiles were equal, the Lorenz curve would follow the line of equality. The line of equality represents the situation where every student has an equal share.

c. The Gini coefficient evaluates the inequality of a distribution. It is a ratio with values between 0 and 1. The numerator is the area between the Lorenz curve of the distribution and the line of equality; the denominator is the area under the line of equality. The Gini coefficient is 0.49.
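One common way to estimate a Gini coefficient from grouped data is to apply the trapezoid rule to the Lorenz curve points. The Python sketch below does this for the quintile totals reported in part a). The text does not state how the figure of 0.49 was obtained, so this illustrates the area definition and is not necessarily the method used. The names `totals_by_quintile` and `gini_trapezoid` are assumptions made for the example. An estimate based on five groups ignores inequality within each quintile and therefore tends to understate the coefficient.

```python
from itertools import accumulate

# Quintile totals in pounds, as reported in the table in part a).
totals_by_quintile = [4177, 9697, 13391, 16913, 46961]
grand_total = sum(totals_by_quintile)  # 91139

# Cumulative shares of expenditure and of population (five equal groups).
cum_expenditure = [c / grand_total for c in accumulate(totals_by_quintile)]
cum_population = [(k + 1) / 5 for k in range(5)]


def gini_trapezoid(pop, exp):
    """Gini = 1 - 2 * (area under the Lorenz curve), via the trapezoid rule.

    The area under the line of equality is 0.5, so the ratio
    (0.5 - area) / 0.5 simplifies to 1 - 2 * area.
    """
    xs = [0.0] + list(pop)
    ys = [0.0] + list(exp)
    area = sum(
        (xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1]) / 2
        for i in range(1, len(xs))
    )
    return 1 - 2 * area


gini_grouped = gini_trapezoid(cum_population, cum_expenditure)

print([round(100 * s) for s in cum_expenditure])  # [5, 15, 30, 48, 100]
print(round(gini_grouped, 2))                     # 0.41
```

The cumulative shares reproduce the table above. The grouped estimate of about 0.41 is below the reported 0.49, which is consistent with the understatement expected when only quintile totals are used.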
The Gini coefficient is estimated from the cumulative expenditure shares and the cumulative population percentages. A smaller value is preferred in a society where equality is advocated.

d. A higher Gini coefficient implies greater inequality, where high-income individuals receive or spend large amounts. This means that this year's cohort was more unequal compared to last year's cohort, because the Gini coefficient of 0.49 for this year is greater than the Gini coefficient of 0.35 for last year.

Question 3

b. =COUNTIF(C2:C171, ">0"), which yields 108 students who consumed alcohol, tobacco or narcotics (ATN).

b1. $P(ATN) = \frac{ATN}{Total} = \frac{108}{170} = 0.635$

b2. $P(ATN)' = 1 - P(ATN) = 0.365$

c. =COUNTIF(J2:J171, ">0"), which yields 50 students who recorded positive expenditure on recreation and culture (RC).

d. =COUNTIFS(C2:C171, ">0", J2:J171, ">0")

Number of students    ATN     No-ATN    Total
RC                    33      17        50
No-RC                 75      45        120
Total                 108     62        170

e.

Proportion of students    ATN     No-ATN    Total
RC                        0.19    0.10      0.29
No-RC                     0.44    0.26      0.71
Total                     0.64    0.36      1.00

f. $P(ATN|RC) = \frac{P(ATN, RC)}{P(RC)} = \frac{0.19}{0.29} = 0.66$

g. The consumption of ATN is not independent of RC. This is because the conditional probability P(ATN|RC) = 0.66 differs from the unconditional probability P(ATN) = 0.635; under independence the two would be equal, although here the difference is small. Additionally, if a student can afford to pay for recreation and culture, there is a good chance that the student can also afford to pay for ATN.

Question 4

a. In statistics, a confidence interval is a range of values within which a population parameter is expected to fall with a given level of confidence. The confidence interval expresses the degree of uncertainty about whether a population parameter lies between certain values. The most commonly used confidence levels are 99% and 95%. The confidence interval for a population parameter is estimated from a statistic computed from a sample taken from the population, by adding and subtracting a multiple of the standard error of the statistic from the statistic itself to give a range for the parameter. The standard deviation is 6, n is 170, and the sample mean is 19.33.
The confidence interval is given by the following formula:

$CI = \left(\bar{x} - z \cdot \frac{s}{\sqrt{n}},\ \bar{x} + z \cdot \frac{s}{\sqrt{n}}\right) = \left(19.33 - 1.96 \cdot \frac{6}{\sqrt{170}},\ 19.33 + 1.96 \cdot \frac{6}{\sqrt{170}}\right) = (18.4, 20.2)$

With 95% confidence, the population mean is between 18.4 and 20.2, based on a sample of 170 students. This means that we are 95% confident that the average spending of students on ATN lies between £18.4 and £20.2.

b. When the standard deviation of the population is unknown, the computation of the confidence interval uses the sample standard deviation, and Student's t-distribution is assumed. However, with a large sample size Student's t-distribution approximates the normal distribution. As such, the confidence interval is computed as follows:

$CI = \left(\bar{x} - t \cdot \frac{s}{\sqrt{n}},\ \bar{x} + t \cdot \frac{s}{\sqrt{n}}\right) = \left(19.33 - 1.96 \cdot \frac{22.83}{\sqrt{170}},\ 19.33 + 1.96 \cdot \frac{22.83}{\sqrt{170}}\right) = (15.9, 22.8)$

With 95% confidence, the population mean is between 15.9 and 22.8, based on a sample of 170 students.

c. Based on my findings, I do not agree with the newspaper article's claim that university students usually spend £25. This is because £25 lies outside the confidence interval calculated at the 95% confidence level.
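The two intervals above, and the comparison with the newspaper figure of £25, can be checked with a short Python sketch. It uses only the numbers quoted in the text (mean 19.33, standard deviations 6 and 22.83, n = 170, critical value 1.96). The function name `normal_interval` is a hypothetical helper introduced for illustration.

```python
import math


def normal_interval(mean, sd, n, critical=1.96):
    """Interval of the form mean +/- critical * sd / sqrt(n)."""
    margin = critical * sd / math.sqrt(n)
    return mean - margin, mean + margin


# Part a: standard deviation of 6, as given in the question.
ci_a = normal_interval(19.33, 6, 170)

# Part b: sample standard deviation of 22.83, using 1.96 as the
# large-sample approximation to the t critical value.
ci_b = normal_interval(19.33, 22.83, 170)

# Part c: is the newspaper's figure of 25 inside the interval from part b?
claimed = 25
claim_inside = ci_b[0] <= claimed <= ci_b[1]

print(tuple(round(v, 1) for v in ci_a))  # (18.4, 20.2)
print(tuple(round(v, 1) for v in ci_b))  # (15.9, 22.8)
print(claim_inside)                      # False
```

The exact t critical value for 169 degrees of freedom is about 1.97 rather than 1.96, so using it would widen the interval in part b only marginally and would not change the conclusion in part c.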