SUSTAINABLE DEVELOPMENT GOAL INDICATORS
A Data and Insights Report
9 February 2024

HYPOTHESIS TEST

I chose the ratio of female-to-male labour force participation rate (%). The null hypothesis (H0) is that the mean ratio of female-to-male labour force participation is 50%, meaning that the number of females participating in the labour force is half that of males. The alternative hypothesis (H1) is that the mean ratio is not 50%.

As the test statistic (15.16) exceeds the critical value (1.98) and the p-value is 8.27 × 10⁻³⁴, I rejected H0. This p-value means that, if H0 is true, the probability of observing a sample mean at least as far from 50% as 73.35 is 8.27 × 10⁻³²%. Therefore, there is sufficient statistical evidence to conclude that the population mean is significantly different from 50%. This decision was made at the 5% significance level, which means that there is a 5% probability of rejecting H0 when it is true (Type I error).

Because the test is two-tailed, the female participation rate may be either more or less than half that of males, although the sample mean of 73.35% lies above 50%. This result can inform economic policy, workforce planning, and gender equality efforts. However, regional differences in culture, economy, and laws must be considered.

CONFIDENCE INTERVAL

I chose the timeliness of administrative proceedings (worst 0 - 1 best). We can be 95% confident that the population average timeliness score for administrative proceedings falls between 0.47 and 0.52. This does not mean that there is a 95% probability that the population average rating is within this interval. It means that, if we repeated this study numerous times, we would expect 95% of the confidence intervals calculated from those studies to contain the population average. The confidence interval suggests a moderate level of efficiency in administrative proceedings.
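The repeated-sampling interpretation above can be illustrated with a small simulation. This is a minimal sketch, not the calculation used in this report: the population mean (0.5), standard deviation (0.15), sample size (133) and number of repetitions are hypothetical values chosen for illustration, and the interval uses a normal critical value as an approximation to the t critical value.

```python
import random
import statistics


def ci_coverage(pop_mean=0.5, pop_sd=0.15, n=133, reps=2000, level=0.95, seed=1):
    """Return the share of simulated confidence intervals that contain pop_mean.

    All default values are hypothetical and only serve the illustration.
    """
    rng = random.Random(seed)
    # Two-sided normal critical value (about 1.96 for a 95% interval).
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    hits = 0
    for _ in range(reps):
        # Draw one "study" from the assumed population.
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        mean = statistics.fmean(sample)
        half_width = z * statistics.stdev(sample) / n ** 0.5
        # Count the study if its interval captures the true mean.
        if mean - half_width <= pop_mean <= mean + half_width:
            hits += 1
    return hits / reps


coverage = ci_coverage()
print(f"Share of intervals containing the population mean: {coverage:.3f}")
```

The printed share should be close to 0.95. Any single interval either contains the population mean or it does not; the 95% describes the procedure across repeated studies.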
To improve timeliness, policymakers could consider strategies that push the average score above the upper limit of the current confidence interval, such as rolling out digital technologies in government institutions. However, the measurement of administrative timeliness is subjective and may be biased. We should consider this when interpreting the interval.

SIMPLE REGRESSION

The Y intercept, b0 = 1.73, indicates that, when administrative timeliness is zero, the mean ratio of female-to-male labour force participation rate is e^1.73 (or 56.45%). The slope, b1 = 0.25, indicates that the female-to-male labour force participation rate is predicted to be multiplied by a factor of e^0.25 for every one-point increase in administrative timeliness.

r² = 0.072. This means that 7.20% of the variation in the (log) female-to-male labour force participation rate is explained by the variability of administrative timeliness. This indicates a weak positive linear relationship between the two variables.

P-value

As I predicted that the slope was positive, the hypotheses are:
H0: β1 ≤ 0
H1: β1 > 0
P-value = 0.00089. This is less than the chosen significance level (α = 0.05). There is sufficient evidence to conclude, at the 5% significance level, that β1 is greater than 0, which means that the slope is positive.

Assumptions

Independence: N/A - the data set was collected during the same period.

Linearity: There appears to be a pattern between the residuals and administrative timeliness, as the residuals are unevenly clustered. Therefore, the linear model does not appear appropriate.

Equal variance: The residuals appear less varied at higher administrative timeliness ratings. Therefore, there appears to be a violation of the assumption of equal variance.
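The intercept and slope interpretations in the simple regression rely on back-transforming coefficients from a log-scale model. The following is a minimal sketch, assuming the fitted model has the form log(ratio) = b0 + b1 × timeliness with the rounded coefficients quoted above. The function name `back_transform` is hypothetical, and because the logarithm base is an assumption, the sketch evaluates both the natural log and log base 10 for comparison.

```python
import math


def back_transform(b0, b1, base=math.e):
    """Return (baseline, factor) on the original scale for log_base(y) = b0 + b1 * x."""
    baseline = base ** b0  # predicted y when x = 0
    factor = base ** b1    # multiplicative change in y per one-unit increase in x
    return baseline, factor


# Rounded coefficients as quoted in the report.
b0, b1 = 1.73, 0.25

for name, base in (("natural log", math.e), ("log base 10", 10.0)):
    baseline, factor = back_transform(b0, b1, base)
    print(f"{name}: baseline = {baseline:.2f}, factor per point = {factor:.2f}")
```

Because the model is linear in the log of the ratio, a one-point increase in timeliness multiplies the predicted ratio by the factor rather than adding a fixed amount to it.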
[Figure: Residual plot against timeliness of administrative proceedings (worst 0 - 1 best)]

Normality: There appears to be left skewness based on the histogram and normal probability plot. Therefore, there may be a violation of the assumption of normality.

[Figure: Normal probability plot of residuals against theoretical Z]

MULTIPLE REGRESSION

The slope for timeliness (b1 = 0.27) indicates that, for a given ratio of female-to-male mean years of education received (%), the female-to-male labour force participation rate is predicted to be multiplied by a factor of e^0.27 with each one-point increase in timeliness rating. The slope for the ratio of female-to-male mean years of education received (b2 = −0.00057) indicates that, for a given timeliness rating, the labour force participation rate is estimated to be multiplied by a factor of e^−0.00057 for each additional percentage point in the education ratio.

R² of 0.0755 indicates that 7.55% of the variation in the labour force participation rate is explained by the variation in the timeliness rating and years of education received (%). Adjusted R² of 0.0612 indicates that 6.12% of the variation in the labour force participation rate is explained by the multiple regression model, adjusted for the 2 independent variables and the sample size.

F Statistic

H0: β1 = β2 = 0
H1: at least one βj ≠ 0 (j = 1, 2)
The critical value of the F distribution with 2 and 130 degrees of freedom is 3.07. The F statistic is 5.30. As 5.30 > 3.07, we reject H0. We have sufficient evidence at the 5% level of significance that at least one of the independent variables (timeliness and/or years of education) is related to the labour force participation rate.

T Statistic

H0: βj = 0
H1: βj ≠ 0 (tested separately for j = 1, 2)
The critical values for 130 degrees of freedom are ±1.98.

First t test: Because t = 3.22 > 1.98 and the p-value is 0.0016 < 0.05, H0 is rejected.
There is a significant relationship at the 5% level of significance between administrative timeliness and labour force participation, taking into account mean years of education.

Second t test: Because t = −0.69 lies between the critical values of ±1.98 and the p-value is 0.49 > 0.05, we do not reject H0. There is no significant relationship at the 5% level of significance between years of education received and the labour force participation rate, taking into account administrative timeliness.

Residuals

[Figure: Residual plot against timeliness of administrative proceedings (worst 0 - 1 best)]

[Figure: Residual plot against ratio of female-to-male mean years of education received (%)]

[Figure: Residual plot against predicted ratio of female-to-male labour force participation rate (%), log scale]

The problems with the residuals in the simple regression (noted above) do not seem to be mitigated by the inclusion of this independent variable.

Collinearity

The correlation between the two independent variables is 0.13, and the variance inflationary factor (VIF) is 1.15. As the VIF is close to 1, there is little collinearity between administrative timeliness and the ratio of female-to-male mean years of education received.

Conclusion

Including the ratio of female-to-male mean years of education received in the model does not appear appropriate, because it does not add substantial explanatory power. The issues identified in the simple regression model also remain.
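The summary statistics in the multiple regression section can be cross-checked from R² with the standard textbook formulas. This is a minimal sketch under stated assumptions: the sample size n = 133 is inferred from the 130 residual degrees of freedom (n − k − 1 with k = 2) rather than stated in the report, and the function names are illustrative.

```python
def f_statistic(r2, n, k):
    """Overall F statistic from R², sample size n and k independent variables."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


def adjusted_r2(r2, n, k):
    """R² adjusted for the number of independent variables and sample size."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def vif(r2_j):
    """Variance inflationary factor, where r2_j is the R² from regressing
    one independent variable on the other independent variable(s)."""
    return 1 / (1 - r2_j)


n, k = 133, 2   # n is an assumption inferred from 130 = n - k - 1
r2 = 0.0755     # R² as reported

print(f"F statistic: {f_statistic(r2, n, k):.2f}")
print(f"Adjusted R²: {adjusted_r2(r2, n, k):.4f}")
# With two independent variables, r2_j is the squared correlation between them.
print(f"VIF if 0.13 is the correlation r: {vif(0.13 ** 2):.3f}")
print(f"VIF if 0.13 is already R²:        {vif(0.13):.3f}")
```

Under these assumptions the sketch gives an F statistic of about 5.31 and an adjusted R² of about 0.061, close to the reported 5.30 and 0.0612. The VIF is about 1.02 if 0.13 is the correlation coefficient and about 1.15 if 0.13 is the R² from regressing one independent variable on the other. Either value is close to 1.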