Uploaded by Mary Ross

Biostatistics 3211 Worksheet 3
Mary Ross 1031871
7/4/2021
1. The following data were recorded as independent observations of the weights (grams) of
eggs from chickens treated with calcium supplements. We want to know whether the average
weight of the eggs differs statistically from 19 grams.
One-sample t-test
weight <- c(18.50, 18.55, 18.60, 18.65, 18.70, 18.75, 18.80, 18.85, 18.90,
            18.95, 19.00, 19.05, 19.10, 19.15, 19.20, 19.25, 19.30, 19.35,
            19.40, 19.45, 19.50, 19.55, 19.60, 19.65, 19.70, 19.75, 19.80,
            19.85, 19.90, 19.95, 20.00, 20.05, 20.10, 20.15, 20.20, 20.25,
            20.30, 20.35, 20.40, 20.45, 20.50)
Eggs <- data.frame(weight)
The weight vector has been stored in a data frame named Eggs.
head(Eggs)
##   weight
## 1  18.50
## 2  18.55
## 3  18.60
## 4  18.65
## 5  18.70
## 6  18.75
The head() function shows the first 6 observations in the data frame.
Identify your variable parameters
str(Eggs)
## 'data.frame':    41 obs. of  1 variable:
##  $ weight: num  18.5 18.6 18.6 18.6 18.7 ...
The Eggs data is made up of 41 observations of 1 variable, the variable weight is numerical.
Visualization of data
boxplot(Eggs, col = "light pink")
Assumptions: the boxplot suggests that the data are not normally distributed.
State your hypothesis
Null hypothesis: the mean weight of eggs from treated chickens is equal to 19 g.
Alternative hypothesis: the mean weight of eggs from treated chickens is greater than 19 g.
Descriptive statistics
min(Eggs)
## [1] 18.5
The minimum value of the Eggs dataset is 18.5.
max(Eggs)
## [1] 20.5
The maximum value of the Eggs dataset is 20.5.
mean(Eggs$weight)
## [1] 19.5
The mean of the Eggs dataset is 19.5.
median(Eggs$weight)
## [1] 19.5
The median of the Eggs dataset is 19.5.
range(Eggs)
## [1] 18.5 20.5
The values in the Eggs dataset range from 18.5 to 20.5.
sd(Eggs$weight)
## [1] 0.5989574
The standard deviation of the Eggs dataset is 0.5989574.
var(Eggs)
##         weight
## weight 0.35875
The variance of the Eggs dataset is equal to 0.35875.
quantile(Eggs$weight)
##   0%  25%  50%  75% 100%
## 18.5 19.0 19.5 20.0 20.5
summary(Eggs)
##      weight
##  Min.   :18.5
##  1st Qu.:19.0
##  Median :19.5
##  Mean   :19.5
##  3rd Qu.:20.0
##  Max.   :20.5
Check other assumptions
Since the sample size is greater than 30 (n = 41), the central limit theorem implies that the
sample mean is approximately normally distributed, so a formal normality test is not needed.
A histogram is still drawn to inspect the shape of the data.
hist(Eggs$weight, col = "light blue")
A one-sample t-test was conducted.
Onesample <- t.test(Eggs$weight, mu=19, alternative = "greater")
Results
t.test(Eggs$weight, mu=19, alternative = "greater")
##
##  One Sample t-test
##
## data:  Eggs$weight
## t = 5.3452, df = 40, p-value = 1.959e-06
## alternative hypothesis: true mean is greater than 19
## 95 percent confidence interval:
##  19.34249      Inf
## sample estimates:
## mean of x
##      19.5
The test was conducted as a one-tailed (right-tailed) test. The p-value is 1.959e-06, which is
less than 0.05, so we reject the null hypothesis that the mean weight of eggs from treated
chickens is 19 g.
According to the test output, the t statistic is 5.3452, the degrees of freedom are 40, and the
one-sided 95% confidence interval for the mean runs from 19.34249 to infinity.
Conclusion
The one-sample t-test was used because we assume that the data are approximately normally
distributed (n = 41, so there are 40 degrees of freedom). The p-value of the test is 1.959e-06,
which is less than the significance level (0.05). We therefore reject the null hypothesis and
conclude that the average weight of the eggs is significantly greater than, and therefore
different from, 19 g. The hypothesis used is right tailed.
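The t statistic, p-value and confidence bound reported above can be reproduced by hand. The following is a minimal sketch, assuming the 41 weights are the evenly spaced values listed in the question (rebuilt here as 18.50 + 0.05 * k). The names n, se, t_stat, p_right, lower and check are chosen for illustration and are not part of the original worksheet.

```r
# Rebuild the 41 egg weights (assumption: 18.50 to 20.50 g in steps of 0.05 g,
# which matches the values listed in the question).
weight <- 18.50 + 0.05 * (0:40)

n  <- length(weight)          # sample size
se <- sd(weight) / sqrt(n)    # standard error of the mean

# t statistic for H0: mean = 19
t_stat <- (mean(weight) - 19) / se

# Right-tailed p-value with n - 1 degrees of freedom
p_right <- pt(t_stat, df = n - 1, lower.tail = FALSE)

# Lower bound of the one-sided 95% confidence interval
lower <- mean(weight) - qt(0.95, df = n - 1) * se

# Built-in test, for comparison with the hand calculation
check <- t.test(weight, mu = 19, alternative = "greater")

c(t_stat = t_stat, p_right = p_right, lower = lower)
```

The hand-computed values should agree with the t.test() output, which shows that the test only needs the sample mean, the standard deviation and the sample size.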
2. Sample scores from a math test yield the following results for tutorial sessions:
8,7,10,8,9,10,8,10,7,10,8,10,6,8,10,9,10,7,10,10,7,10,8,10,6,8,10,9,10,7
Test whether there is a significant difference from the theoretical value of a score of 7.
Would you say the hypothesis is left or right tailed?
maths_test <- c(8,7,10,8,9,10,8,10,7,10,8,10,6,8,10,9,10,7,10,10,7,10,8,10,6,
8,10,9,10,7)
Test_score <- data.frame(maths_test)
The maths_test vector has been stored in a data frame named Test_score.
head(Test_score)
##   maths_test
## 1          8
## 2          7
## 3         10
## 4          8
## 5          9
## 6         10
The head() function shows the first 6 observations in the data frame.
Identify your variable parameters
str(Test_score)
## 'data.frame':    30 obs. of  1 variable:
##  $ maths_test: num  8 7 10 8 9 10 8 10 7 10 ...
This data set is made up of 30 observations of 1 variable. The variable maths_test is
numerical.
Visualization of data
boxplot(Test_score, col = "light green")
Assumptions: the boxplot suggests that the data are not normally distributed.
State your hypothesis
Null hypothesis: there is no significant difference between the mean score and the theoretical
value of 7.
Alternative hypothesis: there is a significant difference between the mean score and the
theoretical value of 7.
Descriptive statistics
min(Test_score)
## [1] 6
The minimum value is 6.
max(Test_score)
## [1] 10
The maximum value is 10.
mean(Test_score$maths_test)
## [1] 8.666667
The mean is 8.666667.
median(Test_score$maths_test)
## [1] 9
The median is 9.
range(Test_score)
## [1]  6 10
The values of maths_test range from 6 to 10.
sd(Test_score$maths_test)
## [1] 1.372974
The standard deviation is 1.372974.
var(Test_score)
##            maths_test
## maths_test   1.885057
The variance is equal to 1.885057.
quantile(Test_score$maths_test)
##   0%  25%  50%  75% 100%
##    6    8    9   10   10
summary(Test_score)
##    maths_test
##  Min.   : 6.000
##  1st Qu.: 8.000
##  Median : 9.000
##  Mean   : 8.667
##  3rd Qu.:10.000
##  Max.   :10.000
Check other assumptions
The sample size is 30, so by the central limit theorem the sample mean can be treated as
approximately normally distributed. The shape of the data is still checked with a histogram
and the Shapiro-Wilk test.
hist(Test_score$maths_test, col = "pink")
Hypothesis for the Shapiro-Wilk test
Null hypothesis: the data are normally distributed.
Alternative hypothesis: the data are not normally distributed.
shapiro.test(Test_score$maths_test)
##
## Shapiro-Wilk normality test
##
## data: Test_score$maths_test
## W = 0.82959, p-value = 0.0002385
The p-value is 0.0002385, which is less than 0.05, so we reject the null hypothesis: the data
are not normally distributed.
A one-sample t-test was conducted.
t.test(Test_score$maths_test, mu=7, alternative = "less")
##
## One Sample t-test
##
## data: Test_score$maths_test
## t = 6.6489, df = 29, p-value = 1
## alternative hypothesis: true mean is less than 7
## 95 percent confidence interval:
##      -Inf 9.092586
## sample estimates:
## mean of x
## 8.666667
Results
The test was conducted as a one-tailed (left-sided) test. The p-value is 1, which is greater
than 0.05, so we fail to reject the null hypothesis in favour of the alternative that the mean
score is less than 7.
According to the test output, the t statistic is 6.6489, the degrees of freedom are 29, and the
one-sided 95% confidence interval for the mean runs from minus infinity to 9.092586.
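Conclusion
The one-sample t-test was used because the sample has 30 observations (29 degrees of
freedom), so the sample mean is treated as approximately normally distributed even though the
Shapiro-Wilk test rejected normality of the raw scores. The p-value of the left-tailed test is
1, which is greater than the significance level (0.05), so we fail to reject the null
hypothesis: there is no evidence that the mean score is less than 7.
This result does not show that the mean equals 7. The sample mean (8.667) is above 7 and the
t statistic is large and positive (6.6489), so a left-tailed p-value of 1 corresponds to a
right-tailed p-value close to 0. The hypothesis suited to these data is therefore right tailed.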
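The choice of tail matters for this question. The following is a minimal sketch that runs the same one-sample t-test with each of the three alternatives on the scores from the question; the object names left, right, two and p_values are chosen for illustration and do not appear in the original worksheet.

```r
# Scores from the question
maths_test <- c(8,7,10,8,9,10,8,10,7,10,8,10,6,8,10,9,10,7,10,10,
                7,10,8,10,6,8,10,9,10,7)

# Same data and same mu, three different alternative hypotheses
left  <- t.test(maths_test, mu = 7, alternative = "less")
right <- t.test(maths_test, mu = 7, alternative = "greater")
two   <- t.test(maths_test, mu = 7)   # default is two-sided

# The t statistic is identical in all three tests; only the p-value changes.
p_values <- c(less      = left$p.value,
              greater   = right$p.value,
              two.sided = two$p.value)
p_values

# The left- and right-tailed p-values are complementary tail areas,
# so together they add up to 1.
left$p.value + right$p.value
```

Because the sample mean lies above 7, the left-tailed p-value is close to 1 and the right-tailed p-value is close to 0, so only the right-tailed or two-sided version can detect the difference.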
3. Test the null hypothesis that the long jump scores of Team A are not different from those of
Team B, and/or whether Team A's scores are greater than or less than Team B's.
Team A: 1,2,2,3,3,4,4,5,5,6
Team B: 1,2,4,5,5,5,6,6,7,9
Team_A <- c(1,2,2,3,3,4,4,5,5,6)
Team_B <- c(1,2,4,5,5,5,6,6,7,9)
dat <- data.frame(cbind(Team_A,Team_B))
Identify your variable parameters
str(dat$Team_A)
##  num [1:10] 1 2 2 3 3 4 4 5 5 6
str(dat$Team_B)
##  num [1:10] 1 2 4 5 5 5 6 6 7 9
Visualization of data
boxplot(Team_A, Team_B, col = "yellow")
Assumptions: the boxplot suggests that the data may not be normally distributed.
State your hypothesis
Null hypothesis: the mean jump score of Team A is equal to that of Team B.
Alternative hypothesis: the mean jump score of Team A is greater than that of Team B (one tailed).
Descriptive statistics
min(Team_A)
## [1] 1
The minimum value of Team_A is 1
min(Team_B)
## [1] 1
The minimum value of Team_B is 1
max(Team_A)
## [1] 6
The maximum value of Team_A is 6
max(Team_B)
## [1] 9
The maximum value of Team_B is 9
mean(Team_A)
## [1] 3.5
The mean value of Team_A is 3.5
mean(Team_B)
## [1] 5
The mean value of Team_B is 5
median(Team_A)
## [1] 3.5
The median of Team_A is 3.5.
median(Team_B)
## [1] 5
The median of Team_B is 5.
range(Team_A)
## [1] 1 6
The values of Team_A range from 1 to 6.
range(Team_B)
## [1] 1 9
The values of Team_B range from 1 to 9.
sd(Team_A)
## [1] 1.581139
The standard deviation of Team_A is 1.581139.
sd(Team_B)
## [1] 2.309401
The standard deviation of Team_B is 2.309401.
var(Team_A)
## [1] 2.5
The variance of Team_A is equal to 2.5.
var(Team_B)
## [1] 5.333333
The variance of Team_B is equal to 5.333333.
quantile(Team_A)
##   0%  25%  50%  75% 100%
## 1.00 2.25 3.50 4.75 6.00
quantile(Team_B)
##   0%  25%  50%  75% 100%
## 1.00 4.25 5.00 6.00 9.00
Check other assumptions
The sample size is less than 30, so we need to check whether the data follow a normal
distribution.
hist(Team_A, col = "light green")
hist(Team_B, col = "green")
Hypothesis for the Shapiro-Wilk test
Null hypothesis: the data are normally distributed.
Alternative hypothesis: the data are not normally distributed.
Shapiro-Wilk normality test
shapiro.test(dat$Team_A)
##
## Shapiro-Wilk normality test
##
## data: dat$Team_A
## W = 0.96572, p-value = 0.8486
The p-value is 0.8486, which is greater than 0.05, so we fail to reject the null hypothesis:
the Team_A data are consistent with a normal distribution.
shapiro.test(Team_B)
##
## Shapiro-Wilk normality test
##
## data: Team_B
## W = 0.95939, p-value = 0.7789
The p-value is 0.7789, which is greater than 0.05, so we fail to reject the null hypothesis:
the Team_B data are consistent with a normal distribution.
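The two normality checks above follow the same decision rule, so they can be run in one step. The following is a minimal sketch, assuming the 5% significance level used throughout the worksheet; the names p_norm and looks_normal are chosen for illustration.

```r
# Long jump scores from the question
Team_A <- c(1,2,2,3,3,4,4,5,5,6)
Team_B <- c(1,2,4,5,5,5,6,6,7,9)
dat <- data.frame(Team_A, Team_B)

# Shapiro-Wilk p-value for every column of the data frame
p_norm <- sapply(dat, function(x) shapiro.test(x)$p.value)

# TRUE where we fail to reject normality at the 5% level
looks_normal <- p_norm > 0.05

p_norm
looks_normal
```

A p-value above 0.05 does not prove normality; it only means the test found no evidence against it, which is harder to find with just 10 observations per team.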
Test homogeneity of variance
Hypothesis for homogeneity
Null hypothesis: the variances are equal.
Alternative hypothesis: the variances are not equal.
var.test(Team_A, Team_B)
##
##  F test to compare two variances
##
## data:  Team_A and Team_B
## F = 0.46875, num df = 9, denom df = 9, p-value = 0.2744
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.1164309 1.8871848
## sample estimates:
## ratio of variances
##            0.46875
The two data sets are treated as having equal variances, since the p-value (0.2744) is greater than 0.05.
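The F statistic reported by var.test() is simply the ratio of the two sample variances (2.5 / 5.333333 = 0.46875). The following is a minimal sketch of that calculation; the names F_stat, df1, df2, lower_tail and p_two are chosen for illustration.

```r
# Long jump scores from the question
Team_A <- c(1,2,2,3,3,4,4,5,5,6)
Team_B <- c(1,2,4,5,5,5,6,6,7,9)

# F statistic: ratio of the sample variances
F_stat <- var(Team_A) / var(Team_B)

# Degrees of freedom for the numerator and denominator
df1 <- length(Team_A) - 1
df2 <- length(Team_B) - 1

# Two-sided p-value: twice the smaller of the two tail areas
lower_tail <- pf(F_stat, df1, df2)
p_two <- 2 * min(lower_tail, 1 - lower_tail)

c(F = F_stat, p_value = p_two)
```

A ratio close to 1 supports equal variances. Here the ratio is below 1 because Team_B is more spread out, but the difference is not significant at the 5% level.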
One-tailed test
Jump <- t.test(Team_A, Team_B, var.equal = TRUE, alternative = "greater")
Results
Jump
##
##  Two Sample t-test
##
## data:  Team_A and Team_B
## t = -1.6948, df = 18, p-value = 0.9463
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  -3.034752       Inf
## sample estimates:
## mean of x mean of y
##       3.5       5.0
Conclusion
The two-sample, right-tailed t-test with equal variances was chosen because both data sets
were found to be normally distributed with equal variances; the degrees of freedom are 18.
The p-value of the test is 0.9463, which is greater than the significance level (0.05). We
therefore fail to reject the null hypothesis: there is insufficient evidence that Team A's
jump scores are greater than Team B's.
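The pooled two-sample t statistic above can be rebuilt from the group means and variances. The following is a minimal sketch of that calculation. Because the question also asks whether the teams differ in either direction, it computes the left-tailed and two-sided p-values alongside the right-tailed one. The names sp2, se, t_stat, df, p_greater, p_less and p_two are chosen for illustration.

```r
# Long jump scores from the question
Team_A <- c(1,2,2,3,3,4,4,5,5,6)
Team_B <- c(1,2,4,5,5,5,6,6,7,9)

nA <- length(Team_A)
nB <- length(Team_B)
df <- nA + nB - 2               # degrees of freedom for the pooled test

# Pooled variance: weighted average of the two sample variances
sp2 <- ((nA - 1) * var(Team_A) + (nB - 1) * var(Team_B)) / df

# Standard error of the difference in means
se <- sqrt(sp2 * (1 / nA + 1 / nB))

# t statistic for H0: mean(Team_A) - mean(Team_B) = 0
t_stat <- (mean(Team_A) - mean(Team_B)) / se

# p-values for the three possible alternatives
p_greater <- pt(t_stat, df, lower.tail = FALSE)  # Team A > Team B
p_less    <- pt(t_stat, df)                      # Team A < Team B
p_two     <- 2 * pt(-abs(t_stat), df)            # Team A != Team B

c(t = t_stat, greater = p_greater, less = p_less, two.sided = p_two)
```

The t statistic is negative because Team A's mean (3.5) is below Team B's (5), which is why the right-tailed p-value is so close to 1.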
4. The pre- and post-test scores for a randomly selected group of students before and after
exposure to new teaching methods are as follows:
Pretest: 10,5,8,6,4,5,5,8,7,6,10,5,8,6,4,5,5,8,7,6
Posttest: 10,6,8,8,5,4,5,9,10,7,8,5,4,5,9,10,7,10,6,8
Is there a significant difference in performance before compared to after exposure to a
different teaching method?