Statistics assignment for Victoria University MMPA 507

SUSTAINABLE DEVELOPMENT GOAL INDICATORS
A Data and Insights Report
9 February 2024
HYPOTHESIS TEST

I chose the ratio of female-to-male labour force participation rate (%).

The null hypothesis (H0) is that the population mean ratio of female-to-male labour force
participation rates is 50%. This means that, on average, the female participation rate is half
the male participation rate. The alternative hypothesis (H1) is that the population mean ratio
is not 50%.

As the test statistic (15.16) exceeds the critical value (1.98) and the p-value is
8.27 × 10^-34, I rejected H0. This p-value means that, if H0 is true, the probability of
observing a sample mean at least as far from 50% as 73.35% is 8.27 × 10^-32%.

Therefore, there is sufficient statistical evidence to conclude that the population mean is
significantly different from 50%. This decision was made at the 5% significance level, which
means that there is a 5% probability of rejecting H0 when it is true (Type I error).
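
The decision rule above can be sketched in a few lines of Python. This is only an illustration: the sample mean (73.35) and the hypothesised mean (50) come from the report, but the sample standard deviation and sample size below are hypothetical placeholders, and the function names are my own.

```python
import math

def one_sample_t(sample_mean, hypothesised_mean, sample_sd, n):
    """t statistic for H0: population mean equals hypothesised_mean."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesised_mean) / standard_error

def reject_two_tailed(t_stat, critical_value):
    """Reject H0 when the statistic falls outside +/- critical_value."""
    return abs(t_stat) > critical_value

# sample_sd and n are hypothetical; they are not given in the report.
t_stat = one_sample_t(73.35, 50.0, sample_sd=17.8, n=133)
print(round(t_stat, 2), reject_two_tailed(t_stat, critical_value=1.98))
```

With the real standard deviation and sample size in place of the placeholders, the first printed value should match the reported test statistic.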

The female participation rate may therefore be higher or lower than half the male rate (the
sample mean of 73.35% points to higher). This can inform economic policy, workforce, and
gender equality efforts. However, regional differences in culture, economy, and laws must be
considered.
CONFIDENCE INTERVAL

I chose the timeliness of administrative proceedings (worst 0 - 1 best).

We can be 95% confident that the population average timeliness score for administrative
proceedings falls within the confidence interval of 0.47 to 0.52. This does not mean
that there is a 95% probability that the population average lies within this particular
interval. It means that if we repeated this study many times, we would expect 95% of
the confidence intervals calculated from those studies to contain the population average.
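
The repeated-sampling interpretation can be checked with a small simulation. This is a sketch under stated assumptions: the population mean, standard deviation and sample size are hypothetical, the population is taken to be normal, and for simplicity the interval uses a known standard deviation with a normal critical value rather than the t-based interval a report like this would normally use.

```python
import random
from statistics import NormalDist, mean

def z_interval(sample, sigma, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sigma / len(sample) ** 0.5
    centre = mean(sample)
    return centre - half_width, centre + half_width

def coverage(true_mean, sigma, n, repetitions, seed=0):
    """Share of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma)
        hits += low <= true_mean <= high
    return hits / repetitions

# All inputs are hypothetical and chosen only for illustration.
print(coverage(true_mean=0.5, sigma=0.15, n=133, repetitions=2000))
```

The printed share should be close to 0.95. Any single interval either contains the population mean or does not; the 95% describes the procedure over many repetitions.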

The confidence interval suggests a moderate level of efficiency in administrative
proceedings. To improve timeliness, policymakers could consider strategies that push the
average score above the upper limit of the current confidence interval such as rolling out
digital technologies in government institutions. However, the way administrative timeliness
was measured is subjective and may be biased. We should consider this when interpreting
the interval.
SIMPLE REGRESSION

The Y intercept, b0 = 1.73, indicates that, when administrative timeliness is zero, the predicted
mean ratio of female-to-male labour force participation rates is e^1.73 (or 56.45%). The
slope, b1 = 0.25, indicates that the female-to-male labour force participation ratio is
predicted to be multiplied by a factor of e^0.25 for every one-point increase in administrative
timeliness.
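
Because the dependent variable is on a log scale, the coefficients are easier to read after back-transformation: the intercept becomes a predicted ratio and the slope becomes a multiplicative factor. The sketch below assumes a model of the form log(Y) = b0 + b1·X. The report writes the back-transformation with e, so that is the default here; the base is left as a parameter so that a base-10 model could be checked in the same way. The function names are illustrative.

```python
import math

def predicted_ratio(b0, b1, x, base=math.e):
    """Back-transform a log-scale prediction to the original scale."""
    return base ** (b0 + b1 * x)

def slope_factor(b1, base=math.e):
    """Multiplicative change in Y for a one-unit increase in X."""
    return base ** b1

# Coefficients as reported for the simple regression.
print(slope_factor(0.25))            # factor if natural logs were used
print(slope_factor(0.25, base=10))   # factor if base-10 logs were used
```

A factor above 1 means the predicted ratio rises with timeliness, and a factor below 1 means it falls.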

r² = 0.072. This means that 7.2% of the variation in the (log-scale) female-to-male labour
force participation ratio is explained by variation in administrative timeliness. Together with
the positive slope, this indicates a weak positive linear relationship between the two variables.
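
For reference, the intercept, slope and r² of a simple regression can be computed directly from sums of squares. The sketch below uses a small hypothetical data set, not the report's data, and the function name is my own. An r² of 0.072 corresponds to a correlation of about 0.27 in absolute value.

```python
def simple_ols(x, y):
    """Least-squares intercept, slope and r-squared for y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx                     # slope
    b0 = mean_y - b1 * mean_x          # intercept
    r_squared = sxy ** 2 / (sxx * syy) # share of variation explained
    return b0, b1, r_squared

# Hypothetical timeliness scores and log-scale participation ratios.
x = [0.2, 0.4, 0.5, 0.7, 0.9]
y = [1.80, 1.95, 1.78, 1.92, 1.90]
b0, b1, r2 = simple_ols(x, y)
print(round(b0, 3), round(b1, 3), round(r2, 3))
```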
P-value

As I predicted that the slope was positive, the hypotheses are:
H0: β ≤ 0
H1: β > 0

The p-value is 0.00089, which is less than the chosen significance level (α = 0.05). There is
sufficient evidence at the 5% significance level to conclude that β is greater than 0, which
means that the slope is positive.
Assumptions

Independence: N/A - the data set was collected during the same period.
Linearity: there appears to be a pattern between the residuals and administrative
timeliness, as the residuals are unevenly clustered. Therefore, the linear model
does not appear appropriate.
Equal variance: The residuals appear less varied with greater administrative timeliness
ratings. Therefore, there appears to be a violation of the assumption of equal variance.
[Figure: Residual Plot Against Timeliness of Administrative Proceedings. x-axis: Timeliness of Administrative Proceedings (worst 0 - 1 best); y-axis: Residuals.]

Normality: The histogram and normal probability plot suggest left skewness. Therefore, the
assumption of normality may be violated.
[Figure: Normal Probability Plot. x-axis: Theoretical Z; y-axis: Residuals.]
MULTIPLE REGRESSION

The slope for timeliness (b1 = 0.27) indicates that, for a given ratio of female-to-male mean
years of education received, the labour force participation ratio is estimated to change by a
factor of e^0.27 with each one-point increase in the timeliness rating.

The slope for the ratio of female-to-male mean years of education received (b2 = −0.00057)
indicates that, for a given timeliness rating, the labour force participation ratio is
estimated to change by a factor of e^−0.00057 (a slight decrease) for each additional
percentage point in the education ratio.

R² of 0.0755 indicates that 7.55% of the variation in labour force participation rate is
explained by the variation in the timeliness rating and years of education received (%).

Adjusted R² of 0.0612 indicates that 6.12% of the variation in labour force participation rate
is explained by the multiple regression model, adjusted for the 2 independent variables and
sample size.
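
The adjustment can be reproduced from the reported figures. A minimal sketch, assuming the usual formula 1 − (1 − R²)(n − 1)/(n − k − 1) and a sample size of n = 133; the sample size is not stated in the report and is inferred here from the 130 residual degrees of freedom with k = 2 independent variables.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R-squared for n observations and k independent variables."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

k = 2
n = 130 + k + 1  # inferred from the residual degrees of freedom (assumption)
print(round(adjusted_r_squared(0.0755, n, k), 4))
```

The result agrees with the reported 0.0612 to within rounding of the R² input.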
F Statistic
H0: β1 = β2 = 0
H1: at least one βj ≠ 0 (j = 1, 2)
The critical value of the F distribution with 2 and 130 degrees of freedom is 3.07. The F
statistic is 5.30. As 5.30 > 3.07, we reject H0. We have sufficient evidence at the 5% level of
significance that at least one of the independent variables (timeliness and/or years of
education) is related to the labour force participation rate.
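
As a cross-check, the overall F statistic can be recovered from R² alone. The sketch below assumes the standard identity F = (R²/k) / ((1 − R²)/(n − k − 1)), with k = 2 and n = 133 inferred from the stated degrees of freedom; the function name is illustrative.

```python
def overall_f(r_squared, n, k):
    """Overall F statistic of a regression, computed from R-squared."""
    explained_per_df = r_squared / k
    unexplained_per_df = (1 - r_squared) / (n - k - 1)
    return explained_per_df / unexplained_per_df

f_stat = overall_f(0.0755, n=133, k=2)
print(round(f_stat, 2), f_stat > 3.07)  # compare with the critical value 3.07
```

The value obtained this way is consistent with the reported F statistic of 5.30, allowing for rounding of R².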
T Statistic
H0: βj = 0
H1: βj ≠ 0
(tested separately for j = 1, timeliness, and j = 2, education)
The critical values for 130 degrees of freedom are ±1.98.

First t test:
Because t = 3.22 > 1.98 and the p-value is 0.0016 < 0.05, H0 is rejected. There is a
significant relationship at the 5% level of significance between administrative timeliness and
the labour force participation rate, taking years of education received into account.

Second t test:
Because −1.98 < t = −0.69 < 1.98 and the p-value is 0.49 > 0.05, we do not reject H0. There is
no significant relationship at the 5% level of significance between years of education
received and the labour force participation rate, taking administrative timeliness into account.
Residuals
[Figure: Timeliness of administrative proceedings (worst 0 - 1 best) Residual Plot. x-axis: Timeliness of administrative proceedings (worst 0 - 1 best); y-axis: Residuals.]
[Figure: Ratio of female-to-male mean years of education received (%) Residual Plot. x-axis: Ratio of female-to-male mean years of education received (%); y-axis: Residuals.]
[Figure: Ratio of female-to-male labour force participation rate (%) - LOG Scale - Residual Plot. x-axis: Predicted Ratio of female-to-male labor force participation rate (%) - LOG; y-axis: Residuals.]
Therefore, the problems with the residuals noted in the simple regression do not seem to be
mitigated by including the education variable.
Collinearity

The correlation between the two independent variables is 0.13, and the variance inflationary
factor (VIF) is 1.15. As the VIF is close to 1, administrative timeliness and the ratio of
female-to-male mean years of education received are only weakly correlated, so collinearity is
not a concern.
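
For context, the VIF of an independent variable is 1/(1 − Rj²), where Rj² comes from regressing that variable on the other independent variables. With only two independent variables, Rj² equals the square of their correlation. The sketch below shows both readings of the 0.13 figure: a VIF of 1.15 corresponds to Rj² = 0.13, whereas a correlation coefficient of 0.13 would give a VIF nearer 1.02. Either way the VIF is close to 1. The function name is illustrative.

```python
def vif(r_squared_j):
    """Variance inflationary factor from the auxiliary R-squared."""
    return 1 / (1 - r_squared_j)

print(round(vif(0.13), 2))       # 0.13 read as Rj squared -> 1.15
print(round(vif(0.13 ** 2), 2))  # 0.13 read as the correlation r -> 1.02
```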
Conclusion

Including the ratio of female-to-male mean years of education received in the model does not
appear appropriate. This is because it does not add substantial explanatory power to the
model. Also, the issues identified in the simple regression model remain.