Uploaded by crparker.art

1.A

PART A: Answer the following questions without using STATA. (45 points)
1. (8 points) (Pagano/Gauvreau Ch2Q7) State whether each of the following observations is
an example of discrete or continuous data:
a. The number of suicides in the US in a specified year
b. The concentration of lead in a sample of water
c. The length of time that a cancer patient survives after diagnosis
d. The number of previous miscarriages an expectant mother had
a. discrete
b. continuous
c. continuous
d. discrete
2. (8 points) For a group of students with ages: 19, 30, 17, 22, 20, 19, 15, 50
a. Calculate the mean and median age.
b. Which summary, mean or median, do you favor as a measure of central tendency,
and why?
c. Calculate the range, variance, and standard deviation.
d. What is the unit of measurement for each of these measures of dispersion?
15, 17, 19, 19, 20, 22, 30, 50
a. median: 19.5, mean: 24
b. Median. The mean is pulled toward extreme values at either end and is therefore more
susceptible to outliers and data-collection errors. Here the single 50-year-old raises the mean
to 24, which is higher than six of the eight ages.
c. Range: 15-50 (35 years), Variance: 130.29 (sample, dividing by n-1; 114 if dividing by n),
Standard deviation: 11.41 (sample; 10.68 if dividing by n)
d. The range and standard deviation are measured in years, the same unit as the ages. The
variance is measured in years squared.
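The arithmetic for question 2 can be checked with a short script. This is only a sketch using Python's standard statistics module, which is not part of the assignment. It reports the variance and standard deviation both ways, dividing by n-1 (sample) and by n (population), because the two conventions give different answers.

```python
import statistics

# Ages from question 2, in years.
ages = [19, 30, 17, 22, 20, 19, 15, 50]

mean_age = statistics.mean(ages)
median_age = statistics.median(ages)
age_range = max(ages) - min(ages)

# Sample versions divide the sum of squared deviations by n - 1.
sample_variance = statistics.variance(ages)
sample_sd = statistics.stdev(ages)

# Population versions divide by n.
population_variance = statistics.pvariance(ages)
population_sd = statistics.pstdev(ages)

print("mean:", mean_age, "median:", median_age, "range:", age_range)
print("sample variance:", round(sample_variance, 2), "sample SD:", round(sample_sd, 2))
print("population variance:", round(population_variance, 2),
      "population SD:", round(population_sd, 2))
```

The range and both standard deviations come out in years, and both variances in years squared.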
3. (10 points) The following boxplots and table summarize information about death counts
for 16 counties in Maine.
a. Approximately, what is the median death count for females?
b. Approximately, is the interquartile range of male death counts 100, 200, 300, or 500?
c. From looking at the boxplot, are the mean death counts for males and females greater
than, less than, or approximately the same as the median death counts?
d. What is the percentage of counties with a female death count between 200 and 299?
e. What is the percentage of counties with female death count higher than 500?
a. 220
b. 300
c. Both means would be greater than the corresponding medians because of the high values at the
top of each boxplot. These outliers pull the mean upward but barely move the median.
d. 25% of counties
e. 31.25%
4. (9 points) Consider the following three histograms (A, B, and C).
[Figure: Histogram A, Histogram B, Histogram C]
a. Which histogram would you expect to have the smallest mean? The largest
median? The smallest standard deviation? Explain your reasoning.
b. For each histogram, would you expect mean = median, mean > median, or mean <
median?
a. Histogram C would have the smallest mean. It is the most right-skewed distribution, with the
greatest frequency at the lowest values. Because low values are so frequent, the few high
values do not raise the mean much.
Histogram A would have the largest median. It has the greatest frequency of values close to 5
and is roughly normally distributed, while the other histograms have higher frequencies at
values below 5.
Histogram A would have the smallest standard deviation because it is the most symmetric,
roughly normal distribution with no evident outliers.
b. B and C are both right-skewed, so mean > median for each. A would have mean = median
(it may be slightly right-skewed as well, but its roughly normal shape suggests the mean and
median are close).
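The relationship between skew and the mean/median ordering can be illustrated with a small sketch. The two data sets below are hypothetical, made up for illustration; they are not read from histograms A, B, or C.

```python
import statistics

# Hypothetical symmetric data centered on 5 (illustration only).
symmetric = [3, 4, 4, 5, 5, 5, 6, 6, 7]

# Hypothetical right-skewed data: many low values, a few high ones.
right_skewed = [0, 0, 0, 1, 1, 2, 3, 5, 10]

def mean_and_median(values):
    """Return the (mean, median) pair for a list of numbers."""
    return statistics.mean(values), statistics.median(values)

sym_mean, sym_median = mean_and_median(symmetric)
skew_mean, skew_median = mean_and_median(right_skewed)

print("symmetric:    mean =", sym_mean, " median =", sym_median)
print("right-skewed: mean =", round(skew_mean, 2), " median =", skew_median)
```

In the symmetric set the mean and median coincide, while in the right-skewed set the long upper tail pulls the mean above the median.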
5. (10 points) (Pagano/Gauvreau Ch3Q7) Eight individuals experienced an unexplained
episode of vitamin D intoxication. Blood levels of calcium and albumin for each subject
at the time of hospitalization are given below.
Calcium (mmol/l)
2.92
3.84
2.37
2.99
2.67
3.17
3.74
3.44
Albumin (g/l)
43
42
42
40
42
38
34
42
a. Calculate the mean, median, range, and standard deviation of the calcium levels.
b. Calculate the mean, median, range, and standard deviation of the albumin levels.
c. For healthy individuals, the normal range of calcium values is 2.12 to 2.74 mmol/l,
while the range of albumin levels is 32 to 55 g/l. Do you believe that patients
suffering from vitamin D intoxication have normal blood levels of calcium and
albumin?
a. mean: 3.14mmol/l
median: 3.08mmol/l
range: 2.37-3.84mmol/l (1.47mmol/l)
standard deviation: 0.511mmol/l (sample, dividing by n-1; 0.478mmol/l if dividing by n)
b. mean: 40.38g/l
median: 42g/l
range: 34-43g/l (9g/l)
standard deviation: 3.02g/l (sample, dividing by n-1; 2.83g/l if dividing by n)
c. No for calcium, yes for albumin. All of the patients have albumin levels within the normal
range, with most falling near the middle of that range. 75% of patients (6 of 8) had calcium
levels above the normal range, and 50% (4 of 8) had levels of 3.17mmol/l or higher, more than
15% above the upper limit of 2.74mmol/l.
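The question 5 summaries and the comparison with the normal ranges can be checked the same way. This is a sketch using Python's standard statistics module; the helper names summarize and fraction_above are made up for this illustration.

```python
import statistics

# Measurements from question 5.
calcium = [2.92, 3.84, 2.37, 2.99, 2.67, 3.17, 3.74, 3.44]  # mmol/l
albumin = [43, 42, 42, 40, 42, 38, 34, 42]                   # g/l

def summarize(values):
    """Return the mean, median, range, and both standard deviations."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "sd_sample": statistics.stdev(values),       # divides by n - 1
        "sd_population": statistics.pstdev(values),  # divides by n
    }

def fraction_above(values, upper_limit):
    """Return the fraction of values strictly above upper_limit."""
    return sum(v > upper_limit for v in values) / len(values)

calcium_summary = summarize(calcium)
albumin_summary = summarize(albumin)

# Normal ranges from part c: calcium 2.12 to 2.74 mmol/l, albumin 32 to 55 g/l.
calcium_high = fraction_above(calcium, 2.74)
albumin_all_normal = all(32 <= v <= 55 for v in albumin)

for name, summary in (("calcium", calcium_summary), ("albumin", albumin_summary)):
    print(name, {key: round(value, 3) for key, value in summary.items()})
print("fraction of calcium values above 2.74:", calcium_high)
print("all albumin values within 32 to 55:", albumin_all_normal)
```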