Final Project Problem 2

We test the null hypothesis that the ice breakup day is not linearly related to the year. The Nenana record through 2017 contains 101 observations. The slope of the regression line models the relationship between the ice breakup day and the year, so our null hypothesis is that the slope of the regression is 0.

H0: Ice breakup days are not linearly related to years; $\beta_1 = 0$.
HA: Ice breakup days are linearly related to years; $\beta_1 \neq 0$.

[Figure: Years Since 1900 line fit plot. Observed and predicted Ice Breakup Days After Midnight (vertical axis) against Years Since 1900 (horizontal axis).]

Linearity Condition: There is no obvious curve in the scatter plot of Years Since 1900 against Ice Breakup Days After Midnight.

Independence Condition: Because the data form a time series rather than a random sample, we check the scatter plot of the residuals against time for any pattern. We do not find one.

Equal Spread Condition: The plot of the residuals against the predicted values shows no obvious pattern. The spread is about the same for all predicted values, and the scatter appears random.

Nearly Normal Condition: The histogram of the residuals looks roughly symmetric, although the residuals pile up into two peaks of approximately equal height. The normal probability plot is reasonably straight, with only a slight deviation at the higher residual values, so the residuals appear approximately normally distributed.
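The test stated above can be written out explicitly. The block below is a sketch of the standard t-test for a regression slope, assuming the usual notation in which $b_1$ is the estimated slope and $SE(b_1)$ is its standard error; the 99 degrees of freedom follow from the 101 observations.

```latex
H_0:\ \beta_1 = 0 \qquad \text{versus} \qquad H_A:\ \beta_1 \neq 0

t = \frac{b_1 - 0}{SE(b_1)}, \qquad df = n - 2 = 101 - 2 = 99
```

Under H0 this statistic follows a Student's t distribution with 99 degrees of freedom, and the two-sided P-value is the probability of a value at least as extreme as the observed t.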
Simple linear regression results:

Dependent Variable: Ice Breakup Days After Midnight
Independent Variable: Years Since 1900
Ice Breakup Days After Midnight = 128.99805 - 0.07636707 * Years Since 1900
Slope: -0.07636707
Sample size: 101
R (correlation coefficient) = -0.35577203
R-sq = 0.12657
Estimate of error standard deviation: 5.9074493

                    Coefficients    Standard Error    t Stat          P-value
Intercept           128.9980492     1.473193569       87.56354348     0.000000
Years Since 1900    -0.07636707     0.020161826       -3.787706131    0.0003

Regression Statistics
Multiple R           0.355772031
R Square             0.126573738
Adjusted R Square    0.11775125
Standard Error       5.90744931
Observations         101

ANOVA
              df     SS             MS             F              Significance F
Regression    1      500.6711435    500.6711435    14.34671773    0.000261041
Residual      99     3454.897778    34.89795735
Total         100    3955.568921

The computed P-value for the slope is 0.0003, which is less than the significance level of 0.05. This means the association we see in the data is unlikely to have occurred by chance. Therefore, we reject the null hypothesis and conclude that there is strong evidence that ice breakup days are linearly related to years.

95% confidence intervals:

                    Lower 95.0%     Upper 95.0%
Intercept           126.0749135     131.9211848
Years Since 1900    -0.116372507    -0.036361634

From the confidence interval for the intercept, we can be 95% confident that the mean ice breakup day in the year 1900 is between 126.07 and 131.92 days. This interval does not contain zero.

Q1. The rate of change in the date of breakup over time is the slope, $b_1 = -0.0764$ days per year.
95% confidence interval for the slope: $b_1 \pm t^*_{n-2} \times SE(b_1) = -0.076367 \pm 1.984 \times 0.02016182 = (-0.1164, -0.0364)$.
We can be 95% confident that, on average, the ice breaks up between 0.0364 and 0.1164 days earlier with each passing year.

Q2. If the ice is breaking up earlier, what is your conclusion?
The rate of change is the slope ($b_1$) of ice breakup day over time, which is -0.0764. This means that for every year that passes, the ice takes about 0.0764 fewer days to break up.
We can therefore conclude that yes, the ice is breaking up earlier. The model predicts a breakup that is earlier by about 0.0764 days each year, which accumulates to approximately 7.64 days over 100 years.

Q3. Does this necessarily suggest global warming?
There is a correlation between the two variables (Ice Breakup Days After Midnight and years). It is weak (r = -0.35577203 and r-squared = 0.12657), but it exists and cannot be neglected. However, correlation does not imply causation: even though the two variables are linked (as the number of years increases, the ice breakup day decreases at a rate of about 0.076 days per year), we cannot assume that one causes the other. Therefore, this result alone does not establish global warming as the sole cause of the trend.

Q4. What could be the other reasons for this trend?
There could be multiple reasons why the ice is melting earlier as the years pass, such as an increase in the town's population, the establishment of more industries, acid rain, various pollutants, or the large crowds that gather because of the competition. Rising local temperatures from more vehicles and machines operating in the area could also cause the ice to melt faster. So this trend does not necessarily imply global warming. More study is needed to identify the true reason, because it cannot be determined from this data alone.

Q5. What is the predicted breakup date for the year 2020?
For the year 2020, the number of years since 1900 is X0 = 120.
With $b_0 = 128.998$ and $b_1 = -0.0764$, the predicted Ice Breakup Days After Midnight at $x_0 = 120$ is
$\hat{y}_0 = b_0 + b_1 x_0 = 128.998 + (-0.0764)(120) = 119.83$ days,
which corresponds to a predicted breakup date of Wednesday, 29-Apr-2020.

The quantities needed for the 95% intervals are: $t^*_{n-2} = 1.984$; $\bar{x} = 67$; $SE^2(b_1) = 0.000406499$; $(x_0 - \bar{x})^2 = 2809.00$; $s_e^2 / n = 0.34552433$; $s_e^2 = 34.89795735$.

95% confidence interval for the mean breakup day at $x_0 = 120$:
$\hat{y}_0 \pm t^*_{n-2} \sqrt{SE^2(b_1)\,(x_0 - \bar{x})^2 + \frac{s_e^2}{n}} = 119.83 \pm 1.984 \sqrt{0.000406499 \times 2809.00 + 0.34552433} = (117.41, 122.25)$.
We are 95% confident that the average number of days after midnight at which ice breakup occurs in the year 2020 is between 117.41 and 122.25.

95% prediction interval for the breakup day in 2020:
$\hat{y}_0 \pm t^*_{n-2} \sqrt{SE^2(b_1)\,(x_0 - \bar{x})^2 + \frac{s_e^2}{n} + s_e^2} = 119.83 \pm 1.984 \sqrt{0.000406499 \times 2809.00 + 0.34552433 + 34.89795735} = (107.87, 131.80)$,
i.e. Saturday, 18-Apr-20 to Tuesday, 12-May-20.
We are 95% confident that the interval (107.87, 131.80), i.e. Saturday, 18-Apr-20 to Tuesday, 12-May-20, captures the actual day on which the ice breakup happens in the year 2020.
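The interval arithmetic above can be checked with a short script. The following is a minimal sketch, not the method used to produce the report: it copies the summary statistics from the regression output, hard-codes the critical value 1.984 quoted in the text, and uses function names (`predict`, `slope_interval`, `mean_response_interval`, `prediction_interval`) that are purely illustrative.

```python
import math

# Summary statistics copied from the regression output in this report.
B0 = 128.9980492       # intercept
B1 = -0.07636707       # slope (days per year)
SE_B1 = 0.020161826    # standard error of the slope
S_E = 5.90744931       # residual standard deviation
N = 101                # number of observations
X_BAR = 67.0           # mean of Years Since 1900
T_STAR = 1.984         # critical t value quoted in the report (99 df, 95%)


def predict(x0):
    """Point prediction of the breakup day for x0 years since 1900."""
    return B0 + B1 * x0


def slope_interval():
    """95% confidence interval for the slope."""
    margin = T_STAR * SE_B1
    return B1 - margin, B1 + margin


def mean_response_interval(x0):
    """95% confidence interval for the mean breakup day at x0."""
    se = math.sqrt(SE_B1**2 * (x0 - X_BAR) ** 2 + S_E**2 / N)
    y0 = predict(x0)
    return y0 - T_STAR * se, y0 + T_STAR * se


def prediction_interval(x0):
    """95% prediction interval for a single breakup day at x0."""
    se = math.sqrt(SE_B1**2 * (x0 - X_BAR) ** 2 + S_E**2 / N + S_E**2)
    y0 = predict(x0)
    return y0 - T_STAR * se, y0 + T_STAR * se


if __name__ == "__main__":
    x0 = 120  # the year 2020
    print("predicted day:", round(predict(x0), 2))
    print("slope CI:", tuple(round(v, 4) for v in slope_interval()))
    print("mean CI:", tuple(round(v, 2) for v in mean_response_interval(x0)))
    print("prediction interval:",
          tuple(round(v, 2) for v in prediction_interval(x0)))
```

Run with `x0 = 120`, the script should reproduce the reported intervals up to rounding. The only difference between the two interval functions is the extra $s_e^2$ term under the square root, which accounts for the scatter of an individual year around the mean and is why the prediction interval is so much wider.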