Report Writing Example. Worksheet No.2: Problem 2

Introduction. Blood pressure varies considerably over the course of a day.
The systolic blood pressure reading of a patient with high blood pressure
is assumed to follow a normal distribution with a mean of 160 mm of mercury and a standard
deviation of 20 mm. The variable SBP holds a sample of 100 values randomly generated from this
normal distribution. The observations are grouped in consecutive sets of five, and each set is assigned
a group number, which is recorded in the variable GROUP. The mean, standard deviation, median, and
quartiles for the entire data set are summarized in the table below.
Variable   Mean     S.D        L.Quartile   Median   U.Quartile
"SBP"      157.33   20.28113   144.125      156.55   168.625
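The data-generation step described in the introduction can be sketched as follows. This is a minimal Python sketch, not the procedure actually used for the worksheet: the seed, the names sbp and group, and the "inclusive" quartile convention are all assumptions, so its output will not reproduce the values in the table.

```python
import random
import statistics

# Assumed seed for reproducibility; the worksheet's generator and seed are unknown.
rng = random.Random(2)

# 100 readings from a normal distribution with mean 160 and SD 20 (mm of mercury).
sbp = [rng.gauss(160, 20) for _ in range(100)]

# Consecutive sets of five observations share a group number, 1 through 20.
group = [i // 5 + 1 for i in range(100)]

# Summary statistics for the whole sample (quartile convention is an assumption).
lower_q, median, upper_q = statistics.quantiles(sbp, n=4, method="inclusive")
print("Mean:", statistics.mean(sbp))
print("S.D:", statistics.stdev(sbp))
print("Quartiles:", lower_q, median, upper_q)
```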
The following figures show the histogram and the QQ plot for the entire data set. In the histogram
(below on the left), we choose a class width of 15mm and can see that the shape is fairly
symmetric. The QQ plot (below on the right) suggests that the sample is normal: the points
(red dots) lie reasonably close to a straight line, and no systematic pattern of departure
from the line is observed.
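For readers who want to see what a normal QQ plot compares, here is a rough sketch of how its points can be computed. The function name qq_points, the plotting positions (i + 0.5) / n, and the simulated sample are assumptions made for illustration; the worksheet's software may use a different convention.

```python
import random
from statistics import NormalDist, mean, stdev

def qq_points(data):
    """Return (theoretical, observed) pairs for a normal QQ plot."""
    n = len(data)
    # Normal distribution fitted with the sample mean and sample SD.
    fitted = NormalDist(mean(data), stdev(data))
    ordered = sorted(data)
    # Plotting positions (i + 0.5) / n are one common convention, assumed here.
    return [(fitted.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(ordered)]

rng = random.Random(2)  # assumed seed; this is not the worksheet data
sample = [rng.gauss(160, 20) for _ in range(100)]
points = qq_points(sample)

# For normal data the points should fall near the line y = x.
largest_gap = max(abs(obs - theo) for theo, obs in points)
print("Largest vertical distance from the line:", largest_gap)
```

Plotting the pairs with the theoretical quantile on one axis and the observed value on the other gives the QQ plot; a curved or S-shaped pattern would indicate a departure from normality.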
Data analysis for sample means and standard deviations. Now, using the variable GROUP, we can
calculate the sample mean (Mean) and the sample standard deviation (S.D) within each group. The
results are shown in the following table.
GROUP   Mean     S.D
1       145.82   15.01922
2       140.72   30.5966
3       166.74   28.0819
4       170.74   38.2319
5       162.32   6.967568
6       155.2    12.9611
7       150.7    18.98631
8       150.2    16.97027
9       167.32   21.0927
10      158.44   18.32984
11      151.24   6.581261
12      168.94   18.96399
13      157.76   27.0912
14      156.8    15.69235
15      154.14   10.80176
16      168.36   3.013802
17      151.3    23.90868
18      152.36   23.53621
19      164.74   26.6935
20      152.76   9.826647
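The within-group calculation can be sketched in Python as below. The raw SBP values are not listed in this report, so the sketch regenerates a sample with an assumed seed; the names by_group, group_mean, and group_sd are illustrative, and the printed numbers will differ from the table.

```python
import random
import statistics

rng = random.Random(2)  # assumed seed; output will differ from the table
sbp = [rng.gauss(160, 20) for _ in range(100)]
group = [i // 5 + 1 for i in range(100)]

# Collect the readings belonging to each group number.
by_group = {}
for g, x in zip(group, sbp):
    by_group.setdefault(g, []).append(x)

# Sample mean and sample SD (divisor n - 1) within each group.
group_mean = {g: statistics.mean(xs) for g, xs in by_group.items()}
group_sd = {g: statistics.stdev(xs) for g, xs in by_group.items()}

for g in sorted(by_group):
    print(g, round(group_mean[g], 2), round(group_sd[g], 4))
```

Because every group has the same size, the average of the 20 group means equals the mean of all 100 readings.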
We can view the above list as data from the sampling distributions of the mean and the SD, and
calculate their mean, standard deviation, median, and quartiles.
Variable   Mean       S.D        L.Quartile   Median     U.Quartile
"Mean"     157.33     8.407544   151.285      156        165.24
"S.D"      18.66734   8.979727   12.42127     18.64691   24.60488
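As a check, the "Mean" row can be recomputed from the 20 group means listed earlier. This is a small Python sketch; the "inclusive" quartile method is an assumption about the convention used by the original software.

```python
import statistics

# Group means copied from the table of 20 groups.
group_means = [145.82, 140.72, 166.74, 170.74, 162.32, 155.2, 150.7, 150.2,
               167.32, 158.44, 151.24, 168.94, 157.76, 156.8, 154.14, 168.36,
               151.3, 152.36, 164.74, 152.76]

mean_of_means = statistics.mean(group_means)
sd_of_means = statistics.stdev(group_means)  # sample SD, divisor n - 1

# The "inclusive" method is an assumed quartile convention.
lower_q, median, upper_q = statistics.quantiles(
    group_means, n=4, method="inclusive")

print(round(mean_of_means, 2), round(sd_of_means, 4))
print(round(lower_q, 3), round(median, 2), round(upper_q, 2))
```

With this convention the quartiles come out as 151.285, 156, and 165.24, which agrees with the tabulated row.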
Conclusion. The histogram for the “Mean” variable (below on the left) shows that it is centered around
157mm, which is exactly the average of the entire data set. However, its standard deviation (8.4mm) is
much smaller than that of the entire data set (20.3mm), in line with the theoretical value
20/√5 ≈ 8.9mm for means of five observations. The sample mean is an unbiased estimator, and
because the population is normal, its sampling distribution must also be normal. The histogram below
reveals that it is slightly skewed to the right, which may be due to having only 20 groups.
The histogram for the “S.D” variable (below on the right) shows that it is centered around 18.7mm, which
is smaller than the standard deviation (20.3mm) of the entire data set. This observation is consistent with
the fact that the sample standard deviation is a biased estimator that tends to underestimate the
population standard deviation. The shape of the histogram is slightly skewed to the right, as expected
from the properties of the sample standard deviation.
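The theoretical values behind these comparisons can be worked out directly. The sketch below assumes the stated population (normal, standard deviation 20mm) and groups of five; the constant c4 is the standard bias factor for the sample standard deviation of normal data, and the variable names are illustrative.

```python
import math

sigma = 20.0  # population SD stated in the introduction
n = 5         # observations per group

# Standard deviation of the sample mean: sigma / sqrt(n).
sd_of_mean = sigma / math.sqrt(n)

# For normal data, E[S] = c4(n) * sigma with
# c4(n) = sqrt(2 / (n - 1)) * Gamma(n / 2) / Gamma((n - 1) / 2).
c4 = math.sqrt(2 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)
expected_s = c4 * sigma

print(round(sd_of_mean, 2))  # about 8.94
print(round(expected_s, 2))  # about 18.8
```

Both values are close to the observed 8.4mm and 18.7mm, so the 20 groups behave roughly as the theory predicts.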
Suppose that an individual measurement exceeding 150mm is considered high blood pressure (see the
original statement in worksheet No.2). Then 64% of the single measurements (below on the left) indicate
that the patient has high blood pressure. On the other hand, 90% of the mean readings computed from
multiple measurements (below on the right) indicate high blood pressure. Recall that the original
data are sampled from the normal distribution with mean 160mm and standard deviation 20mm. Since
the true mean (160mm) counts as high blood pressure, the mean blood pressure reading has the
advantage of detecting high blood pressure more often than a single measurement.
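The observed percentages can be compared with what the stated population implies. The following Python sketch assumes the 150mm threshold and groups of five; the variable names are illustrative, and only the share of group means is recomputed from this report, since the 100 raw readings are not listed.

```python
from statistics import NormalDist

threshold = 150.0
single = NormalDist(mu=160, sigma=20)                # one reading
mean_of_five = NormalDist(mu=160, sigma=20 / 5 ** 0.5)  # mean of five readings

# Theoretical probabilities of exceeding the threshold.
p_single = 1 - single.cdf(threshold)
p_mean = 1 - mean_of_five.cdf(threshold)

# Observed share of group means above the threshold, from the group table.
group_means = [145.82, 140.72, 166.74, 170.74, 162.32, 155.2, 150.7, 150.2,
               167.32, 158.44, 151.24, 168.94, 157.76, 156.8, 154.14, 168.36,
               151.3, 152.36, 164.74, 152.76]
observed = sum(m > threshold for m in group_means) / len(group_means)

print(round(p_single, 3), round(p_mean, 3), observed)
```

The theoretical probabilities are roughly 0.69 for a single reading and 0.87 for a mean of five, in reasonable agreement with the observed 64% and 90%.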
Worksheet No.2: Problem 3: Here you need to finish (d) and (e).
Introduction. A new reading program was being evaluated in the fourth grade at an elementary
school. Twenty students were randomly selected, and their reading speed and reading comprehension were
thoroughly tested. Based on a fixed-length standardized test reading passage, the speeds (in minutes)
and comprehension scores (on a 100-point scale) were obtained. The summary statistics for the
variables “Speed” (in minutes) and “Comprehension” (on a 100-point scale) are given in the table
below.
Variable          Mean    S.D        L.Quartile   Median   U.Quartile
"Speed"           9.1     2.573141   7            8.5      11
"Comprehension"   82.05   10.8796    75.75        82       90.25
Data analysis. The following figures show the histograms for the variables “Speed” and
“Comprehension.” Their class widths are chosen to be 2 minutes and 10 points, respectively. The
histogram for speed is fairly symmetric, while the shape of the histogram for comprehension suggests
that it is possibly bimodal.
Histogram for Speed (in minutes)
Histogram for Comprehension (in points)
The QQ plots (below) show that the points lie reasonably close to a straight line, suggesting that
the samples for both variables, speed and comprehension, are normal.
QQ plot for speed
QQ plot for comprehension
The next graph is the scatter plot of speed (horizontal x-axis) against comprehension (vertical y-axis).
It shows a positive trend: the longer a student spends reading, the higher his or her
comprehension score tends to be. Thus, there seems to be a relationship between reading speeds and
comprehension scores. The graph, however, shows two outliers from this trend.
Conclusion. Here you must discuss our general understanding of the sampling distribution for the
average score [the answer to (d)]. And you will explain whether you can support the researchers' claim
or not [the answer to (e)].