Biostatistics 3211 Worksheet 3
Mary Ross
1031871
7/4/2021

1. The following data were recorded as independent observations for weights (grams) of eggs from chickens treated with calcium supplements. We want to know whether the average weight of the eggs differs statistically from 19 grams.

One Sample T-test

weight <- c(18.50, 18.55, 18.60, 18.65, 18.70, 18.75, 18.80, 18.85, 18.90,
            18.95, 19.00, 19.05, 19.10, 19.15, 19.20, 19.25, 19.30, 19.35,
            19.40, 19.45, 19.50, 19.55, 19.60, 19.65, 19.70, 19.75, 19.80,
            19.85, 19.90, 19.95, 20.00, 20.05, 20.10, 20.15, 20.20, 20.25,
            20.30, 20.35, 20.40, 20.45, 20.50)
Eggs <- data.frame(cbind(weight))

The vector weight has been stored in a data frame named Eggs.

head(Eggs)
##   weight
## 1  18.50
## 2  18.55
## 3  18.60
## 4  18.65
## 5  18.70
## 6  18.75

The function head shows the first 6 observations from the data frame.

Identify your variable Parameters

str(Eggs)
## 'data.frame':    41 obs. of  1 variable:
##  $ weight: num  18.5 18.6 18.6 18.6 18.7 ...

The Eggs data is made up of 41 observations of 1 variable; the variable weight is numerical.

Visualization of data

boxplot(Eggs, col = "light pink")

Assumptions- The boxplot suggested that the data are not normally distributed.
State your Hypothesis

Null Hypothesis- The mean weight of eggs from treated chickens is equal to 19 g.
Alternative Hypothesis- The mean weight of eggs from treated chickens is greater than 19 g (one tailed).

Descriptive statistics

min(Eggs)
## [1] 18.5
The minimum value of the Eggs dataset is 18.5.

max(Eggs)
## [1] 20.5
The maximum value of the Eggs dataset is 20.5.

mean(Eggs$weight)
## [1] 19.5
The mean value of the Eggs dataset is 19.5.

median(Eggs$weight)
## [1] 19.5
The median of the Eggs dataset is 19.5.

range(Eggs)
## [1] 18.5 20.5
The values of the Eggs dataset range from 18.5 to 20.5.

sd(Eggs$weight)
## [1] 0.5989574
The standard deviation of the Eggs dataset is 0.5989574.

var(Eggs)
##         weight
## weight 0.35875
The variance of the Eggs dataset is equal to 0.35875.

quantile(Eggs$weight)
##   0%  25%  50%  75% 100%
## 18.5 19.0 19.5 20.0 20.5

summary(Eggs)
##      weight
##  Min.   :18.5
##  1st Qu.:19.0
##  Median :19.5
##  Mean   :19.5
##  3rd Qu.:20.0
##  Max.   :20.5

Check other Assumptions

Since the sample size is greater than 30 (n = 41), the central limit theorem implies that the sampling distribution of the mean is approximately normal, so a formal normality test is not required before applying the t-test.

hist(Eggs$weight, col = "light blue")

A one sample t-test was conducted.

Onesample <- t.test(Eggs$weight, mu=19, alternative = "greater")

Results

t.test(Eggs$weight, mu=19, alternative = "greater")
##
##  One Sample t-test
##
## data:  Eggs$weight
## t = 5.3452, df = 40, p-value = 1.959e-06
## alternative hypothesis: true mean is greater than 19
## 95 percent confidence interval:
##  19.34249      Inf
## sample estimates:
## mean of x
##      19.5

The test was conducted as a one-tailed (right-sided) test. The p-value is 1.959e-06, which is less than 0.05. Therefore we reject the null hypothesis that the mean weight of eggs from treated chickens is equal to 19 g. According to the output of the test, the t statistic is 5.3452, the degrees of freedom are 40, and the one-sided 95 percent confidence interval for the mean runs from 19.34249 to infinity (lower bound 19.34249).
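As a check on the output above, the t statistic, the right-tailed p-value and the lower confidence bound can be recomputed from the sample mean and standard deviation. This is a minimal sketch in base R. It assumes the 41 weights form the evenly spaced sequence from 18.50 to 20.50 in steps of 0.05 (which matches the values entered above), and the names w, se, t_stat, p_right and lower are illustrative, not part of the worksheet.

```r
# Rebuild the 41 egg weights (assumption: evenly spaced, 18.50 to 20.50 by 0.05)
w <- seq(18.50, 20.50, by = 0.05)
n <- length(w)

# Standard error of the mean
se <- sd(w) / sqrt(n)

# t = (sample mean - hypothesised mean) / standard error
t_stat <- (mean(w) - 19) / se

# Right-tailed p-value from the t distribution with n - 1 degrees of freedom
p_right <- pt(t_stat, df = n - 1, lower.tail = FALSE)

# Lower bound of the one-sided 95 percent confidence interval
lower <- mean(w) - qt(0.95, df = n - 1) * se

print(c(t = t_stat, df = n - 1, p = p_right, lower = lower))
```

If the sketch is run, the printed values should agree with the t.test() output above (t = 5.3452, df = 40, lower bound 19.34249).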
Conclusion

The one sample t-test was used because we assume that the data are normally distributed; the test has 40 degrees of freedom. The p-value of the test is 1.959e-06, which is less than the significance level (0.05). Therefore we reject the null hypothesis and conclude that the average weight of the eggs is significantly greater than 19 g. The hypothesis used is right tailed.

2. Sample scores from a math test yield the following results for tutorial sessions.
8,7,10,8,9,10,8,10,7,10,8,10,6,8,10,9,10,7,10,10,7,10,8,10,6,8,10,9,10,7
Test whether there is a significant difference from the theoretical value of a score of 7. Would you say the hypothesis is left or right tailed?

maths_test <- c(8,7,10,8,9,10,8,10,7,10,8,10,6,8,10,9,10,7,10,10,7,10,8,10,6,
                8,10,9,10,7)
Test_score <- data.frame(cbind(maths_test))

The vector maths_test has been stored in a data frame named Test_score.

head(Test_score)
##   maths_test
## 1          8
## 2          7
## 3         10
## 4          8
## 5          9
## 6         10

The function head shows the first 6 observations from the data frame.

Identify your variable Parameters

str(Test_score)
## 'data.frame':    30 obs. of  1 variable:
##  $ maths_test: num  8 7 10 8 9 10 8 10 7 10 ...

This data set is made up of 30 observations of 1 variable; the variable maths_test is numerical.

Visualization of data

boxplot(Test_score, col = "light green")

Assumptions- The boxplot suggested that the data are not normally distributed.

State your Hypothesis

Null Hypothesis- The mean score is not significantly different from the theoretical value of 7.
Alternative Hypothesis- The mean score is significantly different from the theoretical value of 7.

Descriptive statistics

min(Test_score)
## [1] 6
The minimum value is 6.
max(Test_score)
## [1] 10
The maximum value is 10.

mean(Test_score$maths_test)
## [1] 8.666667
The mean value is 8.666667.

median(Test_score$maths_test)
## [1] 9
The median value is 9.

range(Test_score)
## [1]  6 10
The values of the maths_test data range from 6 to 10.

sd(Test_score$maths_test)
## [1] 1.372974
The standard deviation is 1.372974.

var(Test_score)
##            maths_test
## maths_test   1.885057
The variance is equal to 1.885057.

quantile(Test_score$maths_test)
##   0%  25%  50%  75% 100%
##    6    8    9   10   10

summary(Test_score)
##    maths_test
##  Min.   : 6.000
##  1st Qu.: 8.000
##  Median : 9.000
##  Mean   : 8.667
##  3rd Qu.:10.000
##  Max.   :10.000

Check other Assumptions

The sample size is equal to 30, so the t-test is applied on the basis of the central limit theorem (the sampling distribution of the mean is taken to be approximately normal). A histogram and a Shapiro-Wilk test were still used to inspect the shape of the data.

hist(Test_score$maths_test, col = "pink")

Hypothesis for Shapiro-Wilk test
Null hypothesis- The data are normally distributed.
Alternative hypothesis- The data are not normally distributed.

shapiro.test(Test_score$maths_test)
##
##  Shapiro-Wilk normality test
##
## data:  Test_score$maths_test
## W = 0.82959, p-value = 0.0002385

The p-value is 0.0002385, which is less than 0.05, so we reject the null hypothesis of normality: the data are not normally distributed.

A one sample t-test was conducted.

t.test(Test_score$maths_test, mu=7, alternative = "less")
##
##  One Sample t-test
##
## data:  Test_score$maths_test
## t = 6.6489, df = 29, p-value = 1
## alternative hypothesis: true mean is less than 7
## 95 percent confidence interval:
##      -Inf 9.092586
## sample estimates:
## mean of x
##  8.666667

Results

The test was conducted as a one-tailed (left-sided) test.
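The direction of a one-tailed test determines which tail of the t distribution supplies the p-value. The sketch below, in base R, recomputes the t statistic for the scores by hand and evaluates both tails. The names x, t_stat, p_left and p_right are illustrative assumptions, not part of the worksheet.

```r
# Math test scores as entered in the worksheet
x <- c(8,7,10,8,9,10,8,10,7,10,8,10,6,8,10,9,10,7,10,10,7,10,8,10,6,
       8,10,9,10,7)
n <- length(x)

# t = (sample mean - hypothesised mean) / standard error of the mean
t_stat <- (mean(x) - 7) / (sd(x) / sqrt(n))

# Left-tailed p-value, corresponding to alternative = "less"
p_left <- pt(t_stat, df = n - 1)

# Right-tailed p-value, corresponding to alternative = "greater"
p_right <- pt(t_stat, df = n - 1, lower.tail = FALSE)

print(c(t = t_stat, df = n - 1, p_left = p_left, p_right = p_right))
```

The two one-tailed p-values always sum to 1, so a left-tailed p-value that rounds to 1 corresponds to a right-tailed p-value close to 0.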
The p-value is 1, which is greater than 0.05. Therefore we fail to reject the null hypothesis in favour of the left-tailed alternative that the mean score is less than 7. According to the output of the test, the t statistic is 6.6489, the degrees of freedom are 29, and the one-sided 95 percent confidence interval for the mean runs from minus infinity to 9.092586 (upper bound 9.092586).

Conclusion

The one sample t-test was used because, with 30 observations, we assume that the sampling distribution of the mean is approximately normal; the test has 29 degrees of freedom. The p-value of the test is 1, which is greater than the significance level (0.05), so we fail to reject the null hypothesis: there is no evidence that the mean score is less than 7. However, the sample mean (8.666667) lies above 7 and the t statistic is positive (6.6489), so a left-tailed test cannot detect this difference. The hypothesis is right tailed, and the right-tailed p-value is the complement of the left-tailed one (1 minus a value that rounds to 1), which is far below 0.05.

3. Test the null hypothesis that the long jump scores from Team A are not different from those of Team B, and/or whether Team A scores are greater than or less than Team B scores.
Team A: 1,2,2,3,3,4,4,5,5,6
Team B: 1,2,4,5,5,5,6,6,7,9

Team_A <- c(1,2,2,3,3,4,4,5,5,6)
Team_B <- c(1,2,4,5,5,5,6,6,7,9)
dat <- data.frame(cbind(Team_A,Team_B))

Identify your variable Parameters

str(dat$Team_A)
##  num [1:10] 1 2 2 3 3 4 4 5 5 6
str(dat$Team_B)
##  num [1:10] 1 2 4 5 5 5 6 6 7 9

Both variables are numerical, with 10 observations each.

Visualization of data

boxplot(Team_A, Team_B, col = "yellow")

Assumptions- The boxplot suggested that the data are not normally distributed.
State your Hypothesis

Null Hypothesis- The mean jump score of Team A is equal to that of Team B.
Alternative Hypothesis- The mean jump score of Team A is greater than that of Team B (one tailed).

Descriptive statistics

min(Team_A)
## [1] 1
The minimum value of Team_A is 1.
min(Team_B)
## [1] 1
The minimum value of Team_B is 1.

max(Team_A)
## [1] 6
The maximum value of Team_A is 6.
max(Team_B)
## [1] 9
The maximum value of Team_B is 9.

mean(Team_A)
## [1] 3.5
The mean value of Team_A is 3.5.
mean(Team_B)
## [1] 5
The mean value of Team_B is 5.

median(Team_A)
## [1] 3.5
The median value of Team_A is 3.5.
median(Team_B)
## [1] 5
The median value of Team_B is 5.

range(Team_A)
## [1] 1 6
The values of Team_A range from 1 to 6.
range(Team_B)
## [1] 1 9
The values of Team_B range from 1 to 9.

sd(Team_A)
## [1] 1.581139
The standard deviation of Team_A is 1.581139.
sd(Team_B)
## [1] 2.309401
The standard deviation of Team_B is 2.309401.

var(Team_A)
## [1] 2.5
The variance of Team_A is equal to 2.5.
var(Team_B)
## [1] 5.333333
The variance of Team_B is equal to 5.333333.

quantile(Team_A)
##   0%  25%  50%  75% 100%
## 1.00 2.25 3.50 4.75 6.00
quantile(Team_B)
##   0%  25%  50%  75% 100%
## 1.00 4.25 5.00 6.00 9.00

Check other Assumptions

The sample size is less than 30, therefore we need to check whether the data follow a normal distribution.

hist(Team_A, col = "light green")
hist(Team_B, col = "green")

Hypothesis for Shapiro-Wilk test
Null hypothesis- The data are normally distributed.
Alternative hypothesis- The data are not normally distributed.

Shapiro-Wilk normality test

shapiro.test(dat$Team_A)
##
##  Shapiro-Wilk normality test
##
## data:  dat$Team_A
## W = 0.96572, p-value = 0.8486

The p-value is 0.8486, which is greater than 0.05, so we fail to reject the null hypothesis: the Team_A data are consistent with a normal distribution.

shapiro.test(Team_B)
##
##  Shapiro-Wilk normality test
##
## data:  Team_B
## W = 0.95939, p-value = 0.7789

The p-value is 0.7789, which is greater than 0.05, so we fail to reject the null hypothesis: the Team_B data are consistent with a normal distribution.
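Before the means are compared, the equal-variance assumption can be examined from the sample variances already computed. The sketch below, in base R, forms the variance ratio by hand and derives a two-sided p-value from the F distribution. The names f_ratio, df_a, df_b, p_lower and p_two_sided are illustrative, and doubling the smaller tail probability is the usual two-sided convention, assumed here.

```r
# Long jump scores as entered in the worksheet
Team_A <- c(1,2,2,3,3,4,4,5,5,6)
Team_B <- c(1,2,4,5,5,5,6,6,7,9)

# F statistic: ratio of the two sample variances
f_ratio <- var(Team_A) / var(Team_B)

# Degrees of freedom for numerator and denominator
df_a <- length(Team_A) - 1
df_b <- length(Team_B) - 1

# Two-sided p-value: twice the smaller tail probability of the F distribution
p_lower <- pf(f_ratio, df_a, df_b)
p_two_sided <- 2 * min(p_lower, 1 - p_lower)

print(c(F = f_ratio, num_df = df_a, denom_df = df_b, p = p_two_sided))
```

A ratio close to 1 with a p-value above 0.05 gives no evidence against equal variances, which is what justifies var.equal = TRUE in a pooled two sample t-test.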
Test Homogeneity of variance

Hypothesis for Homogeneity
Null Hypothesis- The variances are equal.
Alternative Hypothesis- The variances are not equal.

var.test(Team_A, Team_B)
##
##  F test to compare two variances
##
## data:  Team_A and Team_B
## F = 0.46875, num df = 9, denom df = 9, p-value = 0.2744
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.1164309 1.8871848
## sample estimates:
## ratio of variances
##            0.46875

Since the p-value (0.2744) is greater than 0.05, we fail to reject the null hypothesis, so equal variances are assumed for both data sets.

One Tailed test.

Jump <- t.test(Team_A, Team_B, var.equal = TRUE, alternative = "greater")

Results

Jump
##
##  Two Sample t-test
##
## data:  Team_A and Team_B
## t = -1.6948, df = 18, p-value = 0.9463
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  -3.034752       Inf
## sample estimates:
## mean of x mean of y
##       3.5       5.0

Conclusion

The two sample, right tailed t-test with equal variances was chosen since both data sets were found to be normally distributed with equal variances; the test has 18 degrees of freedom. The p-value of the test is 0.9463, which is greater than the significance level (0.05). Therefore we fail to reject the null hypothesis: there is not sufficient evidence to conclude that Team A's mean jump score is greater than Team B's.

4. The pre and post test scores for a randomly selected group of students before and after exposure to new teaching methods are as follows:
Pretest: 10,5,8,6,4,5,5,8,7,6,10,5,8,6,4,5,5,8,7,6
Posttest: 10,6,8,8,5,4,5,9,10,7,8,5,4,5,9,10,7,10,6,8
Is there a significant difference in performance before compared to after exposure to a different teaching method?