SLE251 Research Methods and Data Analysis MOCK EXAM 2018
SLE251 Research Methods and Data Analysis
Trimester 1, 2018
Mock Exam
You have 90 minutes
Instructions:
• Calculators permitted.
• Answer all questions.
• Section 1 answers would be completed on the multiple choice answer sheet. All other answers would be written in your Answer Booklet(s).
• Total marks = 60.
• Section 1 (multiple choice) = 10 questions – 2 marks each (20 marks)
• Section 2 (4 multi-part short-answer questions) – marks for each question as indicated in text (40 marks)
Section 1 (20 marks)
Multiple choice: choose the one letter which indicates the most correct answer for each of
questions 1 to 10 and mark the appropriate letter for each question on your multiple choice
answer sheet (2 marks each):
Q1. The Pearson correlation coefficient (r) can be interpreted as
(a) a measure of the unexplained variation in the response (Y) variable
(b) the significance of the test for zero slope
(c) the difference between group means
(d) the strength of the linear relationship between two variables
Q2. Which of the following is an assumption of chi-square tests of goodness-of-fit?
(a) straight line relationship
(b) equal variances
(c) data in each cell must be percentages
(d) average expected value across all cells is greater than 2
Q3. The assumption of independence in an analysis of variance:
(a) is only relevant for untransformed data
(b) means that one observation should not influence the value of any other observation
(c) only matters if the null hypothesis is true
(d) implies that the distribution of the dependent (response) variable must be normal
Q4. You get a significant result from the F-ratio test in a one factor ANOVA with four groups.
What is considered the best way to accurately determine the differences between the
group means?
(a) use a Tukey’s multiple comparison test
(b) do a correlation test of the relationship between group means and group variances
(c) look to see whether the boxplots overlapped for the four groups
(d) do a chi-square goodness-of-fit test
Q5. The slope of a regression equation measures the:
(a) value of Y when X equals zero
(b) unexplained variation in Y
(c) correlation between Y and X
(d) change in Y for a unit change in X
Questions 6 and 7 refer to the same example:
You are conducting a study to investigate the effects of a new drug used to treat high blood
pressure during pregnancy. You have 50 women who are given the drug, and 50 women who are
given a placebo. You also wish to know if there are any differences in the effect of the drug in
relation to whether the women had previously given birth: 25 women in each drug treatment
group had previously given birth, but for the other 25 women it was their first pregnancy. Each
woman has her blood pressure measured 10 times during the study period.
Q6. What is the appropriate test to establish whether there are differences in blood pressure
that are predicted by the drug treatment and pregnancy history?
(a) Chi-squared test of heterogeneity
(b) Single-factor Analysis of Variance
(c) Two-factor Analysis of Variance
(d) Linear regression
Q7. What is the appropriate independent unit of replication in the above study?
(a) The 1000 measures of blood pressure in the study
(b) The 100 women in the study
(c) The 2 drug treatment groups in the study
(d) The 2 pregnancy history groups in the study
Q8. In a two-factor Analysis of Variance, a P-value of 0.497 associated with the interaction term
indicates:
(a) There is no significant effect of either predictor factor on the response variable
(b) The effect of the first predictor on the response variable is the same regardless of the
state of the second predictor
(c) The effect of the first predictor on the response variable is only significant when the effect
of the second predictor is also significant
(d) The variances of the groupings are not different from each other
Q9. When is it appropriate to use a Fisher exact test?
(a) When investigating the relationship between two binary (i.e. two-state) categorical
variables and you have a small sample size
(b) When investigating the relationship between two continuous variables and the scatterplot
indicates curvilinearity
(c) When the continuous data in an analysis of variance are non-normally distributed
(d) When a chi-square test gives you a non-significant result
Q10. A researcher collates data from health records of 13,000 Australians, and observes a
strongly significant (P<0.001) positive correlation between blood pressure and number of GP
visits per year. He concludes that visiting a GP causes stress, elevating blood pressure. Why
might this be a bad conclusion?
(a) Analysis of variance, not correlation, is the appropriate test that should be used
(b) The large sample size makes statistical analysis, with P-values, unnecessary
(c) Measures of blood pressure are likely to vary between individual GPs, making the results
unreliable
(d) Correlation does not automatically imply causation; further evidence (experimental or
indirect) is needed.
Section 2 (40 marks)
Answer all parts of questions 11 to 14 in your answer book. Marks per question are indicated.
Q11. (10 marks)
A team of researchers wished to investigate whether being held in captivity affected stress
levels in elephants. They compared three different groups of elephants: those found in the wild,
those housed in confined enclosures in zoos, and those found in large free-ranging enclosures at
safari parks. They took blood samples from drugged elephants and measured corticosterone
levels (a hormone that is found in higher concentrations when the animal is stressed) in
microliters of corticosterone per litre of blood. The results are presented below:
R Output

Results using raw data

[Figure: plot of corticosterone (y-axis, 1500 to 4500) against Conditions (Safaripark, Wild, Zoo)]

               mean        sd data:n
Safaripark 1758.279  336.3332     30
Wild       2889.206  602.5395     30
Zoo        3432.790 2531.3777     30

Levene's Test for Homogeneity of Variance (center = mean)
      Df F value    Pr(>F)
group  2  9.3022 0.0002182 ***
      87
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Results using log-transformed data

[Figure: plot of logcorticosteron (y-axis, 3.1 to 3.6) against Conditions (Safaripark, Wild, Zoo)]

               mean         sd data:n
Safaripark 3.237201 0.08486613     30
Wild       3.450229 0.10133602     30
Zoo        3.446849 0.13773478     30

Levene's Test for Homogeneity of Variance (center = mean)
      Df F value  Pr(>F)
group  2   2.938 0.05825 .
      87
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> summary(AnovaModel.2)
            Df Sum Sq Mean Sq F value   Pr(>F)
Conditions   2 0.8934  0.4467   36.77 2.66e-12 ***
Residuals   87 1.0568  0.0121
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

> Multiple Comparisons of Means: Tukey Contrasts
Fit: aov(formula = logcorticosteron ~ Conditions, data = Dataset)
Linear Hypotheses:
                        Estimate Std. Error t value Pr(>|t|)
Wild - Safaripark == 0   0.21303    0.02846   7.486   <1e-06 ***
Zoo - Safaripark == 0    0.20965    0.02846   7.367   <1e-06 ***
Zoo - Wild == 0         -0.00338    0.02846  -0.119    0.992
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)

> cld(.Pairs) # compact letter display
Safaripark       Wild        Zoo
       "a"        "b"        "b"
(a) What null hypothesis was being tested by the ANOVA? (2 marks)
(b) The data were log-transformed - why? (2 marks)
(c) What conclusions would you draw from the ANOVA F-test and Tukey’s test? (4 marks)
(d) Other than homogeneity of variances, what are the other 2 main assumptions about the
data that must be met for this ANOVA test to be reliable? (2 marks)
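
Revision note: the sums of squares, F value and Tukey standard error in the Q11 output can be cross-checked from the group summary statistics alone. The following is a minimal Python sketch of that arithmetic, not the R code that produced the exam output. The function name anova_from_summary is hypothetical, and the group sizes, means and standard deviations are copied from the log-transformed summary table. Because those inputs are rounded, the results should agree with the R table only to within rounding.

```python
# Rebuild a one-factor ANOVA table from per-group summary statistics.
# Inputs are copied from the log-transformed summary table in Q11;
# the function name is hypothetical (an illustration, not the exam's code).
from math import sqrt


def anova_from_summary(ns, means, sds):
    """Return (ss_between, ss_within, df_between, df_within, f_ratio)."""
    total_n = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / total_n
    # Between-group SS: weighted squared deviations of group means
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    # Within-group (residual) SS: pooled from the group variances
    ss_within = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds))
    df_between = len(ns) - 1
    df_within = total_n - len(ns)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio


ns = [30, 30, 30]
means = [3.237201, 3.450229, 3.446849]        # Safaripark, Wild, Zoo
sds = [0.08486613, 0.10133602, 0.13773478]

ss_b, ss_w, df_b, df_w, f_ratio = anova_from_summary(ns, means, sds)

# Standard error of a difference between two group means (equal n),
# based on the residual mean square
se_diff = sqrt((ss_w / df_w) * (1 / ns[0] + 1 / ns[1]))

print(f"Conditions: df = {df_b}, Sum Sq = {ss_b:.4f}")
print(f"Residuals:  df = {df_w}, Sum Sq = {ss_w:.4f}")
print(f"F = {f_ratio:.3f}, SE of pairwise difference = {se_diff:.5f}")
```

The same function can be applied to the raw-data summary table to see how strongly the unequal standard deviations affect the residual sum of squares.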
Q12. (10 marks)
A study was done off Phillip Island to test whether the size of
starfish varies with the colour of the animal (red, blue, albino) and
its sex (male, female). Thirty starfish were randomly chosen
within each combination of colour and sex and their size was
measured.
The results were (bar colour indicates starfish colour):

[Figure: bar graph titled "Mean body size"; y-axis: Mean starfish size (1.0 to 5.0); x-axis: Male, Female]

ANOVA
Source of variation     df    F-ratio    P value
Colour                   2      5.7       0.015
Sex                      1      4.2       0.021
Colour x Sex             1      5.1       0.010
Residual               174

(a) What 3 null hypotheses are being tested in the ANOVA? (3 marks)
(b) What are the conclusions from the three statistical tests of the null hypotheses? (3 marks)
(c) How would you interpret this result, based on the bar graph of means? (4 marks)
Q13. (8 marks)
A biologist wished to model the relationship between a predictor variable, body weight
(measured to nearest 100 g), and a response variable, number of eggs produced (measured
to nearest 10³), for a species of fish called the cabezon (Scorpaenichthys marmoratus). He
sampled 11 fish and recorded body weight and number of eggs spawned for each fish. He
then did a linear regression analysis of number of eggs spawned against body weight. The
results are presented below.
R output :
(a) What % of the variation in egg production was explained by body weight? (1 mark)
(b) Complete the regression equation by filling in the blanks and writing it out fully in your
answer book. (2 marks)
# eggs = ______ + ______ * body weight
(c) Using this regression equation calculate the predicted number of eggs that would be produced by a
cabezon weighing 1000 g (1 mark)
(d) What two null hypotheses are being tested with the output shown above? (2 marks)
(e) What biological conclusion would you draw from this analysis? (2 marks)
Q14. (12 marks)
A biologist was studying the amount of damage by koalas to juvenile leaves of eucalypts
(gum trees). She collected samples of juvenile leaves from a number of different trees of
five species (A, B, C, D, E) and recorded the numbers of leaves damaged and not damaged
by koalas. The data were:
Species    No. leaves damaged    No. leaves not damaged
A                  10                     40
B                  38                      8
C                  15                     48
D                  15                     35
E                  68                     19

The χ² analysis output was as follows:
Pearson's Chi-squared test
data: .Table
X-squared = 90.4111, p-value < 2.2e-16

> .Test$expected # Expected Counts
         1        2        3        4        5
1 24.66216 22.68919 31.07432 24.66216 42.91216
2 25.33784 23.31081 31.92568 25.33784 44.08784

> round(.Test$residuals, 2) # Chi-square Components
      1     2     3     4     5
1 -2.95  3.21 -2.88 -1.95  3.83
2  2.91 -3.17  2.84  1.92 -3.78
(a) How many degrees of freedom are there for this test? (2 marks)
(b) What null hypothesis is being tested by the χ² test? (2 marks)
(c) What are the assumptions of the χ² test? Are they met in this analysis? (4 marks)
(d) What biological conclusions should the ecologist draw from the χ² test and the
standardised residuals? (4 marks)
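
Revision note: the expected counts, chi-square components and X-squared value in the Q14 output follow directly from the observed table. The following is a minimal Python sketch of that arithmetic using only the standard library. It is an illustration and not the R code that generated the exam output. Rows 1 and 2 are assumed to be "damaged" and "not damaged", and columns 1 to 5 are assumed to be species A to E, which matches the expected counts shown.

```python
# Expected counts, Pearson residuals and the chi-square statistic for the
# Q14 leaf-damage table. Row/column order is an assumption:
# rows = damaged, not damaged; columns = species A, B, C, D, E.
from math import sqrt

observed = [
    [10, 38, 15, 15, 68],   # No. leaves damaged
    [40, 8, 48, 35, 19],    # No. leaves not damaged
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count = row total * column total / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Pearson residual (chi-square component) = (observed - expected) / sqrt(expected)
residuals = [
    [(o - e) / sqrt(e) for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

# The test statistic is the sum of the squared residuals
chi_square = sum(res ** 2 for row in residuals for res in row)

for row in expected:
    print([round(e, 5) for e in row])
for row in residuals:
    print([round(res, 2) for res in row])
print(round(chi_square, 4))
```

Squaring each residual shows how much that species-by-damage cell contributes to the overall statistic.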
END OF MOCK EXAM