Intro to Biostatistics Health Sciences
Assignment#2_2025
Due: 9th May, 2025
[50 marks]
1. Correlation: Sleeping hours vs awake hours for the 10 people
Sleeping hours vs hours awake
Person   Hours Sleeping   Hours Awake
1        8                16
2        6                18
3        7.25             16.75
4        8.5              15.5
5        5                19
6        7.5              16.5
7        6.75             17.25
8        7.75             16.25
9        7.1              16.9
10       6.5              17.5
Task: Using the hypothetical data above, enter the data into SPSS and complete the tasks below. (10 marks)
a) Present the data in a scatter plot. (2 marks)
b) Is there a correlation between the two variables? What is the correlation coefficient (r) obtained? (2 marks)
c) What is the p-value obtained? (1 mark)
d) Explain what the p-value obtained in (c) means. Is there a statistically significant correlation between the two variables? (2 marks)
e) Summarise your findings. (3 marks)
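As a cross-check on the SPSS output for part (b), Pearson's r can also be computed by hand. The sketch below is a minimal illustration in plain Python, not the SPSS procedure; the function name `pearson_r` and the variable names are assumptions made for this example.

```python
import math

# Data copied from the table in Question 1 (hours per day).
sleeping = [8, 6, 7.25, 8.5, 5, 7.5, 6.75, 7.75, 7.1, 6.5]
awake = [16, 18, 16.75, 15.5, 19, 16.5, 17.25, 16.25, 16.9, 17.5]


def pearson_r(x, y):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(sleeping, awake)
# Each person's two values add up to 24 hours, so r should come out
# very close to -1 (a perfect negative linear relationship).
print(f"r = {r:.4f}")
```

The p-value is best read from the SPSS correlation table, since the usual t-based formula divides by 1 - r², which is zero (or nearly so) when r is at or near -1.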
2. Chi-square test
A chi-square test of independence is performed to see if there is an association between gender
(independent variable) and smoking status (dependent or outcome variable) in the sample
(n=200).
Hypothesis
Ho: There is no association between gender and smoking status
Ha: There is an association between gender and smoking status
Table 2. Observed data
          Smokers   Non-Smokers   Total
Male      29        71            100
Female    16        84            100
Total     45        155           200
Enter the data above (Table 2) into SPSS and perform the analysis.
Perform the tasks below:
a) Calculate the expected values. [2 marks]
b) Calculate the chi-squared test statistic. [2 marks]
c) Compute the p-value. [1 mark]
d) Is the p-value statistically significant? [1 mark]
e) Do you accept or reject the null hypothesis? Explain. [2 marks]
f) Is there an association between gender and smoking status? Explain. [2 marks]
[10 marks]
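The arithmetic behind parts (a) to (c) can be sketched in plain Python as a way to check the SPSS crosstabs output. This is an illustrative sketch only: the variable names are assumptions, and it computes the uncorrected Pearson chi-square, whereas SPSS also reports a continuity-corrected value for 2x2 tables.

```python
import math

# Observed counts from Table 2: rows = gender, columns = (smokers, non-smokers).
observed = {
    "Male": [29, 71],
    "Female": [16, 84],
}

row_totals = {group: sum(counts) for group, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
n = sum(row_totals.values())

# Expected count for a cell = (row total * column total) / grand total.
expected = {
    group: [row_totals[group] * ct / n for ct in col_totals]
    for group in observed
}

# Pearson chi-square statistic: sum of (O - E)^2 / E over all four cells.
chi_square = sum(
    (o - e) ** 2 / e
    for group in observed
    for o, e in zip(observed[group], expected[group])
)

# A 2x2 table has 1 degree of freedom. For 1 df, the upper-tail
# probability of the chi-square distribution equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print("Expected counts:", expected)
print(f"chi-square = {chi_square:.3f}, p = {p_value:.4f}")
```

The `erfc` shortcut only holds for 1 degree of freedom; larger tables need a general chi-square distribution function, which SPSS provides.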
3. Measures of disease frequency [10 marks]
Calculate the following measures (rates) for NCD and PNG respectively, using the data below: attack rate, incidence rate, mortality rate and case fatality rate.
Table 1. Covid-19 in PNG (March 2020)
Measure              NCD                                      PNG
Attack rate          55 positive cases from 9885 tests done   62 total positive cases from 9885 tests done
Incidence rate       55 cases in a population of 437695       62 cases in a population of 8 million
Mortality rate       1 death from 55 positive cases           1 death from 62 positive cases
Case fatality rate   1 death from 55 positive cases           1 death from 62 positive cases
*Express each rate and interpret what the numbers mean accordingly.
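Each of these measures is a count of events divided by a denominator and scaled to a convenient unit. The sketch below illustrates that pattern in Python; the helper `rate`, the dictionary keys and the choice of scaling (per 100 and per 100,000) are assumptions for this example. The table gives the same figures for the mortality and case fatality rows, so the sketch assumes that the mortality rate takes the whole population as its denominator; check this against the course notes before relying on it.

```python
def rate(events, denominator, per=100):
    """Return events / denominator scaled to `per` units (100 gives a percentage)."""
    return events / denominator * per


# Figures copied from Table 1; "8 million" is written out as an integer.
areas = {
    "NCD": {"cases": 55, "tests": 9885, "population": 437_695, "deaths": 1},
    "PNG": {"cases": 62, "tests": 9885, "population": 8_000_000, "deaths": 1},
}

results = {}
for name, d in areas.items():
    results[name] = {
        # Positive cases as a percentage of tests done.
        "attack_rate_pct": rate(d["cases"], d["tests"]),
        # Cases per 100,000 population.
        "incidence_per_100k": rate(d["cases"], d["population"], 100_000),
        # ASSUMPTION: deaths per 100,000 population (not per case).
        "mortality_per_100k": rate(d["deaths"], d["population"], 100_000),
        # Deaths as a percentage of positive cases.
        "case_fatality_pct": rate(d["deaths"], d["cases"]),
    }

for name, measures in results.items():
    print(name)
    for measure, value in measures.items():
        print(f"  {measure}: {value:.4f}")
```

Scaling to 100,000 keeps small rates readable; any other multiplier works as long as it is stated alongside the result.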
4. Measure of Association
Relative Risk (RR) [10 marks]
Myocardial infarctions (MI) in those who smoke cigarettes compared with those who do not.
           Myocardial infarction
Smoking    +      -
+          a      b
-          c      d

           Disease
Exposure   +      -      Total
+          355    3140   3495
-          140    2507   2647
Total      495    5647   6142
a) Calculate the incidence of disease among the exposed. (1 mark)
b) Calculate the incidence of disease among those not exposed. (1 mark)
c) Calculate the relative risk or risk ratio (show your calculations/working). (3 marks)
d) Interpret the risk ratio obtained. (2 marks)
e) What study design would you use to calculate a risk ratio, and why is it a useful measure? (3 marks)
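Using the a, b, c, d layout above, the incidence among the exposed is a / (a + b), the incidence among the unexposed is c / (c + d), and the relative risk is the ratio of the two. A minimal Python sketch of that calculation follows; the variable names are assumptions chosen for this example.

```python
# Cell counts from the 2x2 table: rows = smoking exposure, columns = MI status.
a, b = 355, 3140   # exposed: disease +, disease -
c, d = 140, 2507   # unexposed: disease +, disease -

# Incidence (risk) of disease in each exposure group.
incidence_exposed = a / (a + b)
incidence_unexposed = c / (c + d)

# Relative risk = risk in the exposed / risk in the unexposed.
relative_risk = incidence_exposed / incidence_unexposed

print(f"Incidence among exposed:   {incidence_exposed:.4f}")
print(f"Incidence among unexposed: {incidence_unexposed:.4f}")
print(f"Relative risk:             {relative_risk:.2f}")
```

A relative risk above 1 indicates a higher risk in the exposed group, a value of 1 indicates no difference, and a value below 1 indicates a lower risk in the exposed group.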
5. A new diagnostic test for lung cancer has been recommended for use in the oncology clinic. We
want to use this test to distinguish persons who have lung cancer from those who do not, by smoking status.
Using the test with 500 patients, we find the following:
Effects of prevalence of disease (lung cancer) on screening test results in 500 people
             Disease status
Exposure     Lung Cancer (+)   No Lung Cancer (-)   Total
Smoking      240               25                   265
No Smoking   15                220                  235
Total        255               245                  500
From the above hypothetical information calculate the following:
1) Sensitivity [2 marks]
2) Specificity [2 marks]
3) Prevalence of disease [2 marks]
4) Positive predictive value (+) [2 marks]
5) Negative predictive value (-) [2 marks]
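All five measures come from the four cells of the 2x2 table. The Python sketch below shows the standard formulas, assuming that the "Smoking" row is treated as the positive test result and the "No Smoking" row as the negative one; the variable names are assumptions for this example.

```python
# Cell counts from the table (ASSUMPTION: "Smoking" row = test positive).
true_pos = 240    # test positive, lung cancer
false_pos = 25    # test positive, no lung cancer
false_neg = 15    # test negative, lung cancer
true_neg = 220    # test negative, no lung cancer

total = true_pos + false_pos + false_neg + true_neg

# Proportion of diseased people the test correctly identifies.
sensitivity = true_pos / (true_pos + false_neg)
# Proportion of non-diseased people the test correctly identifies.
specificity = true_neg / (true_neg + false_pos)
# Proportion of everyone tested who has the disease.
prevalence = (true_pos + false_neg) / total
# Probability of disease given a positive test.
ppv = true_pos / (true_pos + false_pos)
# Probability of no disease given a negative test.
npv = true_neg / (true_neg + false_neg)

for label, value in [
    ("Sensitivity", sensitivity),
    ("Specificity", specificity),
    ("Prevalence", prevalence),
    ("Positive predictive value", ppv),
    ("Negative predictive value", npv),
]:
    print(f"{label}: {value:.1%}")
```

Sensitivity and specificity are properties of the test itself, while the two predictive values shift with the prevalence of disease in the group being tested.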
END OF ASSIGNMENT