Uploaded by warapolb

MATHS1041 z5162418

MATH1041 Statistics for Life and Social Science
Semester 2, 2017
Computing Assignment
Assignment release date: The assignment will be released to all students on the 20th
of September (Wednesday, Week 9) on Moodle (see “Assessments Information” section).
Submission date: Wednesday 11th October (Week 11) before 6pm (Sydney
time).
Please submit your assignment through Moodle; see the “Assessments Information” section on Moodle for further information regarding online submission. You must
submit a neatly typed assignment converted to PDF format.
Data: A data set (in the text file format) will be sent to you via email at your official
university email address (see page 2 of this document for further details).
Assignment length: No more than six single-sided A4 pages.
You are required to include with your submission:
• This cover sheet as the first page (this cover sheet is also available as a single
document on Moodle).
• Your written assignment (no more than three pages). This should also include the
attached table completed with your answers (see last page of this document).
• Two pages for figures (including a histogram, two normal quantile plots and a comparative boxplot).
Results table   /5
Q1              /9
Q2              /20
Q3              /19
Q4              /7
Total           /60
1(a)(b)
Mean: 6.810345
Standard Deviation: 5.002304
The central location of the histogram, measured by the sample mean, is about 6.81. The data are
spread between roughly 0 and 15. The shape of the histogram is strongly right skewed.
2(a)(i)
The null hypothesis is that the mean points per game (PPG) scored by NBA players in the 2016-2017
season is 8.40 (H0: µ = 8.40), and the alternative hypothesis is that the mean in the 2016-2017
season has changed from 8.40 (Ha: µ ≠ 8.40).
The value of the t-statistic is obtained from
t = (x̄ - µ0) / (s/√n) = (6.810345 - 8.40) / (5.002304/√58) = -2.42.
As this is a two-sided hypothesis test, the P-value is 2P(T ≥ |t|), where T follows a t-distribution
with n - 1 = 57 degrees of freedom. The P-value obtained from 2P(T ≥ 2.42) is approximately 0.0106.
As the P-value (0.0106) is smaller than 0.05, the data provide evidence at the 5% level that the true
mean in the 2016-2017 season has changed.
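The t-statistic in 2(a)(i) can be checked from the summary statistics alone. The following is a minimal Python sketch under that assumption; the function name `one_sample_t` is hypothetical and is not taken from the assignment.

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """Return the one-sample t-statistic (xbar - mu0) / (s / sqrt(n))."""
    standard_error = s / math.sqrt(n)
    return (xbar - mu0) / standard_error

# Summary statistics reported in 1(a)(b) and the hypothesised mean from H0.
t_original = one_sample_t(xbar=6.810345, mu0=8.40, s=5.002304, n=58)
print(round(t_original, 2))  # -2.42
```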
2(a)(ii)
The 95% confidence interval for the true mean of PPG in the 2016-2017 NBA season can be
obtained by applying
x̄ ± t* × s/√n = 6.810345 ± t* × 5.002304/√58 = (5.495056, 8.125634) = (5.50, 8.13) (2 d.p.),
where t* is the 0.975 quantile of the t-distribution with 57 degrees of freedom.
Hence we are 95% confident that the true mean of the PPG of the 2016-2017 NBA season lies
between 5.50 and 8.13. The confidence interval does not include the hypothesised mean of 8.40 and
thus is consistent with the hypothesis test in 2(a)(i), which indicated that the true mean has shifted.
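The interval in 2(a)(ii) can be sketched in the same way. The Python standard library has no t-quantile function, so the critical value below is an assumption taken from tables or statistical software (about 2.0025 for 57 degrees of freedom); the function and constant names are hypothetical.

```python
import math

def t_confidence_interval(xbar, s, n, t_crit):
    """Return (lower, upper) for xbar +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Assumed critical value: 0.975 quantile of the t-distribution with 57 df,
# looked up externally rather than computed here.
T_CRIT_57 = 2.0025

lower, upper = t_confidence_interval(xbar=6.810345, s=5.002304, n=58, t_crit=T_CRIT_57)
print(round(lower, 2), round(upper, 2))  # 5.5 8.13
```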
2(b)
2(c)
The normal quantile plot indicates right-skewed data, as the points do not quite follow a
straight line. In addition, there are outliers present in the plot. The use of the
t-distribution as a sampling distribution for parts 2(a)(i) and 2(a)(ii) is nevertheless reasonable,
because the sample size (n = 58) is large enough for the sample mean to be approximately normally
distributed despite the skewness.
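For readers unfamiliar with how a normal quantile plot is built, the sketch below pairs each ordered observation with a standard normal quantile. The sample values are hypothetical (the assignment data are not reproduced here), and the plotting positions (i - 0.5)/n are one common convention among several.

```python
from statistics import NormalDist

def normal_quantile_points(sample):
    """Pair each ordered observation with its theoretical normal quantile."""
    ordered = sorted(sample)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        (std_normal.inv_cdf((i - 0.5) / n), x)
        for i, x in enumerate(ordered, start=1)
    ]

# Hypothetical right-skewed PPG-like values, for illustration only.
hypothetical_ppg = [0.8, 1.5, 2.3, 3.1, 4.0, 5.2, 6.9, 9.4, 13.7, 21.5]
points = normal_quantile_points(hypothetical_ppg)
for z, x in points:
    print(f"{z:6.2f}  {x:5.1f}")
```

If the points fall close to a straight line, the data are approximately normal; an upward curve at the right-hand end, as these hypothetical values would give, indicates right skew.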
3(a)
(i) The null hypothesis for the transformed data is that the mean of the transformed PPG of NBA
players in the 2016-2017 season is 1.63 (H0: µtrans = 1.63), and the alternative hypothesis
is that this mean has changed from 1.63 (Ha: µtrans ≠ 1.63).
The value of the t-statistic is obtained from
t = (x̄ - µ0) / (s/√n) = (1.54218 - 1.63) / (0.2812586/√58) = -2.377943978.
As this is a two-sided hypothesis test, the P-value is 2P(T ≥ |t|), where T follows a t-distribution
with 57 degrees of freedom. The P-value obtained from 2P(T ≥ 2.378) is approximately 0.02079.
As the P-value (0.02079) is smaller than 0.05, the data provide evidence at the 5% level that the
true mean of the transformed PPG in the 2016-2017 season has changed.
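The two-sided P-value for the transformed data can be approximated without a statistics package by integrating the t density numerically. This is a sketch under stated assumptions: it uses Simpson's rule, and the names `t_pdf` and `two_sided_p_value` are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p_value(t, df, steps=2000):
    """Approximate 2 * P(T >= |t|) using Simpson's rule on [0, |t|]."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 <= T <= |t|)
    return 1.0 - 2.0 * central_area

p_transformed = two_sided_p_value(-2.377943978, df=57)
print(round(p_transformed, 3))  # about 0.021
```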
(ii) The 95% confidence interval for the true mean of the transformed PPG in the 2016-2017 NBA
season can be obtained by applying
x̄ ± t* × s/√n = 1.54218 ± t* × 0.2812586/√58 = (1.468227, 1.616133) = (1.47, 1.62) (2 d.p.),
where t* is the 0.975 quantile of the t-distribution with 57 degrees of freedom.
Hence we are 95% confident that the true mean of the transformed PPG of the 2016-2017 NBA season
lies between 1.47 and 1.62. The confidence interval does not include the hypothesised transformed
mean of 1.63 and thus is consistent with the hypothesis test in 3(a)(i), which indicated that the
true mean has shifted.
3(b)
Because the P-value of the hypothesis test is less than 0.05, the result is statistically
significant at level 0.05. This shows that there is evidence against the null hypothesis. The
confidence interval supports this conclusion, as the hypothesised value (1.63) does not lie
within the 95% confidence interval of 1.47 to 1.62.
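The agreement between the test and the interval is not a coincidence: a two-sided t-test at level 0.05 rejects H0 exactly when the hypothesised mean falls outside the 95% confidence interval. The sketch below checks both decisions from the transformed-data summary statistics; the critical value 2.0025 is an assumed table value for 57 degrees of freedom, and the function name is hypothetical.

```python
import math

def decisions_from_test_and_interval(xbar, s, n, mu0, t_crit):
    """Return (reject_by_test, reject_by_interval) for a two-sided t-test."""
    standard_error = s / math.sqrt(n)
    t_stat = (xbar - mu0) / standard_error
    reject_by_test = abs(t_stat) > t_crit
    lower = xbar - t_crit * standard_error
    upper = xbar + t_crit * standard_error
    reject_by_interval = not (lower <= mu0 <= upper)
    return reject_by_test, reject_by_interval

# Transformed-data summary statistics; t_crit is an assumed table value.
decisions = decisions_from_test_and_interval(
    xbar=1.54218, s=0.2812586, n=58, mu0=1.63, t_crit=2.0025
)
print(decisions)  # (True, True)
```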
3(c)
The transformed PPG data are much closer to normally distributed, as the points follow a straight
line more closely. Comparing the normal quantile plot of the transformed data with that of the
untransformed data, the transformed data follow a relatively more linear path. However, it could also
be argued that the transformed data quantile plot is still slightly right skewed, similar to the
untransformed data normal quantile plot.
4(a)
Comparative Boxplot: Age Class and PPG
4(b)
The comparative boxplot reveals a clear difference between the 25 years and over category
and the 25 years and under category. The 25 years and over category has a much smaller spread
than the 25 years and under category. The 25 years and under category also has two outliers.
The central location of both categories is 6.81.
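The quantities a boxplot displays (quartiles, spread and outliers) can be sketched as follows. The values are hypothetical and are not the assignment data; the 1.5 × IQR rule for flagging outliers is the usual boxplot convention, and the quartile method is an assumption, since software packages differ slightly in how they interpolate.

```python
from statistics import quantiles

def boxplot_summary(values):
    """Return the five-number summary, IQR and 1.5*IQR outliers."""
    ordered = sorted(values)
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in ordered if v < lower_fence or v > upper_fence]
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
        "iqr": iqr,
        "outliers": outliers,
    }

# Hypothetical PPG values for one age class, for illustration only.
hypothetical_group = [1.2, 2.5, 3.8, 4.4, 5.0, 6.3, 7.1, 8.6, 24.9]
summary = boxplot_summary(hypothetical_group)
print(summary["median"], summary["outliers"])  # 5.0 [24.9]
```

Comparing the medians and IQRs of the two age classes side by side gives a numerical counterpart to the visual comparison of centre and spread.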
4(c)
In conclusion, players aged 25 and under had a larger range of average PPG values, which suggests they
had a much better performance than the 25 years and over category. This is based on the
comparative boxplot, in which the 25 years and over category has a much smaller spread than the
25 years and under category.
                              Original Data     Transformed Data
Sample size n                 58                58
Sample mean x̄                 6.810345          1.54218
Sample standard deviation s   5.002304          0.2812586
t-statistic                   -2.42             -21.458
Degrees of freedom            57                57
P-value for test              0.0106            0.02079
Reject H0 (Yes/No)            Yes               Yes
95% confidence interval       (5.50, 8.13)      (1.47, 1.62)