Uploaded by Somayeh Hazeri

Biostatistics
Report No 1
October 3, 2018
Question 1:
Series 1 (Table 1) gives the weekly study time of 20 students.
Table 1: Study time of students
Student                1   2   3   4   5   6   7   8   9  10
Study time per week    2  15  18  23  25  27  31  32  35  37

Student               11  12  13  14  15  16  17  18  19  20
Study time per week   41  42  43  45  49  50  53  56  57  60
According to the histogram in Figure 1, this distribution is left-skewed (negatively
skewed). Table 2 summarizes the descriptive statistics: mean, median, standard deviation,
variance, coefficient of variation, and interquartile range. From the computations,
Q1 = 26.50, Q2 = 39, and Q3 = 49.25. The median (Q2) is closer to the upper quartile (Q3)
than to the lower quartile (Q1): Q3 - Q2 = 10.25 is less than Q2 - Q1 = 12.50, which
indicates negative skew. The boxplot in Figure 2 reflects the same skewness. In other
words, the study times in the upper half of the data are packed more tightly than those
in the lower half, and the tail of the distribution extends toward low values. Consistent
with this, the mean (37.05) is less than the median (39).
Table 2: Data for left-skewed distribution
Statistic                       Value
Mean                            37.05
Median                          39
Standard deviation              15.52
Variance                        240.79
Coefficient of variation (%)    41.88
Interquartile range             22.75
Figure 1: Study time of students per week
Figure 2: Average study time of students per week
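As a cross-check on Table 2, the statistics can be recomputed from the Table 1 values. The sketch below uses Python's standard `statistics` module. It assumes the sample (n - 1) forms of the variance and standard deviation and the "inclusive" quartile method. The report does not state which conventions were used; these are assumptions chosen because they reproduce the reported numbers.

```python
import statistics

# Weekly study times of the 20 students in Table 1.
study_time = [2, 15, 18, 23, 25, 27, 31, 32, 35, 37,
              41, 42, 43, 45, 49, 50, 53, 56, 57, 60]

mean = statistics.mean(study_time)
median = statistics.median(study_time)
variance = statistics.variance(study_time)   # sample variance (n - 1), an assumption
std_dev = statistics.stdev(study_time)       # sample standard deviation (n - 1)
cv_percent = 100 * std_dev / mean            # coefficient of variation, in percent

# Assumption: "inclusive" quartiles (linear interpolation between data points).
q1, q2, q3 = statistics.quantiles(study_time, n=4, method="inclusive")
iqr = q3 - q1

print(f"mean={mean:.2f} median={median} sd={std_dev:.2f} var={variance:.2f}")
print(f"cv={cv_percent:.2f}% Q1={q1} Q2={q2} Q3={q3} IQR={iqr}")
```

Under these assumptions the rounded results agree with Table 2 and with the quartiles quoted in the text.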
Question 2:
Series 2 (Table 3) gives the salaries of a group of 20 employees.
Table 3: Employees and salaries
Employee   Salary
1          30000
2          32000
3          32000
4          33000
5          33000
6          34000
7          34000
8          38000
9          38000
10         38000
11         42000
12         43000
13         45000
14         45000
15         48000
16         50000
17         55000
18         55000
19         65000
20         110000
According to the histogram in Figure 3, this distribution is right-skewed (positively
skewed). Table 4 summarizes the descriptive statistics: mean, median, standard deviation,
variance, coefficient of variation, and interquartile range. From the computations,
Q1 = 33750, Q2 = 40000, and Q3 = 48500. The median (Q2) is closer to the lower quartile
(Q1) than to the upper quartile (Q3): Q2 - Q1 = 6250 is less than Q3 - Q2 = 8500, which
indicates positive skew. The boxplot in Figure 4 reflects the same skewness. In other
words, the salaries in the lower half of the data are packed more tightly than those in
the upper half, and most employees earn less than the mean (45000), which exceeds the
median (40000). There is one extreme value in the data (110000) that lies far from the
rest of the salaries. Maybe that is the salary of the manager!
Table 4: Data for right-skewed distribution
Statistic                       Value
Mean                            45000
Median                          40000
Standard deviation              17935.56
Variance                        321684211
Coefficient of variation (%)    39.87
Interquartile range             14750
Figure 3: Salary of employees
Figure 4: Salary per employee
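The extreme salary mentioned above can also be flagged numerically. The sketch below applies the conventional 1.5 x IQR boxplot fences to the Table 3 salaries. Both the fence rule and the "inclusive" quartile method are assumptions made for this illustration; the report does not say how its boxplot defines outliers.

```python
import statistics

# Salaries of the 20 employees in Table 3, in ascending order.
salaries = [30000, 32000, 32000, 33000, 33000, 34000, 34000, 38000, 38000, 38000,
            42000, 43000, 45000, 45000, 48000, 50000, 55000, 55000, 65000, 110000]

# Assumption: "inclusive" quartiles, which reproduce Q1 = 33750 and Q3 = 48500.
q1, q2, q3 = statistics.quantiles(salaries, n=4, method="inclusive")
iqr = q3 - q1

# Assumption: conventional 1.5 * IQR fences (not stated in the report).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [s for s in salaries if s < lower_fence or s > upper_fence]
below_mean = sum(s < statistics.mean(salaries) for s in salaries)

print("fences:", lower_fence, upper_fence)
print("outliers:", outliers)
print("salaries below the mean:", below_mean, "of", len(salaries))
```

With these assumptions the only value outside the fences is 110000, and 12 of the 20 salaries fall below the mean.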
Question 3:
Series 3 (Table 5) gives the biostatistics midterm grades of a group of 29 students.
Table 5: Midterm grades of Biostatistics students
Student   Biostatistics midterm grade
1         61
2         65.5
3         66.75
4         70
5         70.5
6         71.9
7         72.4
8         73
9         74.9
10        75
11        75.2
12        76
15        76
13        76.8
14        77
16        78
18        78
17        78.15
19        80
20        81.2
24        81.3
21        82.3
23        81.6
27        82.2
25        81.35
22        85.2
26        85.1
28        85.15
29        93
According to the histogram in Figure 5, this distribution is approximately symmetric.
Table 6 summarizes the descriptive statistics: mean, median, standard deviation, variance,
coefficient of variation, and interquartile range. In a symmetric distribution, the median
(Q2) lies midway between the lower quartile (Q1) and the upper quartile (Q3). From the
computations, Q1 = 73, Q2 = 77, and Q3 = 81.35, so Q2 - Q1 = 4.00 and Q3 - Q2 = 4.35 are
nearly equal. The boxplot in Figure 6 reflects this symmetry. It means that the midterm
grades of the biostatistics course are spread almost evenly on both sides of the median.
The mean (77.05) is also nearly equal to the median (77).
Table 6: Data for symmetrical distribution
Statistic                       Value
Mean                            77.05
Median                          77
Standard deviation              6.71
Variance                        45.01
Coefficient of variation (%)    8.7
Interquartile range             8.35
Figure 5: Distribution of biostatistics midterm exam grades
Figure 6: Distribution of biostatistics midterm exam grades
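The quartile comparison used in Questions 1 to 3 can be condensed into a single number. The sketch below computes the quartile (Bowley) skewness coefficient, ((Q3 - Q2) - (Q2 - Q1)) / (Q3 - Q1), from the quartiles quoted in the text. This coefficient is not used in the report; it is introduced here only as an illustration, and the function name is hypothetical.

```python
def bowley_skewness(q1: float, q2: float, q3: float) -> float:
    """Quartile skewness: ((Q3 - Q2) - (Q2 - Q1)) / (Q3 - Q1)."""
    return ((q3 - q2) - (q2 - q1)) / (q3 - q1)

# Quartiles (Q1, Q2, Q3) reported in Questions 1 to 3.
series = {
    "Series 1 (study time)": (26.50, 39, 49.25),
    "Series 2 (salaries)": (33750, 40000, 48500),
    "Series 3 (midterm grades)": (73, 77, 81.35),
}

for name, (q1, q2, q3) in series.items():
    print(f"{name}: {bowley_skewness(q1, q2, q3):+.3f}")
```

The coefficient is negative for Series 1, positive for Series 2, and closest to zero for Series 3, in line with the left-skewed, right-skewed, and symmetric readings above.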
Question 4:
This study (João and Ferreira, 2016) gives an overview of the application of boxplots in
food chemistry. Five examples illustrate the approach: the relative sweetness of sugars
and sugar alcohols with respect to sucrose, the potassium content of fruits and
vegetables, the amino acid content of egg white and yolk, the chemical composition of
freshwater and saltwater fish, and the change in fatty acid composition of soybean oil
through traditional cultivation or genetic engineering techniques. The authors concluded
that the boxplot is an easy way to interpret studies in food chemistry and that it
provides a good overview of the data.
One of the boxplots in this study (Figure 7) was selected to examine the distribution of
the data.
Figure 7: Boxplots for the chemical composition (protein and fat) of freshwater and saltwater fish.
In the freshwater protein series, the median (Q2) is closer to the lower quartile (Q1)
than to the upper quartile (Q3), so the distribution of this series is right-skewed. It
means that the fish species between Q1 and Q2 span a narrower range of protein content
than those between Q2 and Q3. The same holds for fat in both freshwater and saltwater
fish: the range of fat content below the median is narrower than the range above it, so
these distributions are also right-skewed (positively skewed). For protein in saltwater
fish, the upper and lower parts of the box are equal, which means that Q2 lies midway
between Q1 and Q3. In other words, the protein content ranges on either side of the
median are equal, and this distribution is symmetric. Table 7 summarizes the
correspondence between the boxplot shape and the distribution.
Table 7: Distribution of chemical composition (protein and fat) of freshwater and saltwater fish.
Component   Fish type         Distribution
Protein     Freshwater fish   Right-skewed
Protein     Saltwater fish    Symmetric
Fat         Freshwater fish   Right-skewed
Fat         Saltwater fish    Right-skewed
Bibliography
João, E. V., and Ferreira, R. M. (2016). Box-and-Whisker plots applied to food chemistry.
Journal of Chemical Education, 2026-2032.
Question 5:
Table 8 presents the data series used in this problem.
Table 8: X and Y data sets
X    2     4      7      8.3     9     12    14     14.8   18
Y    16    14.5   12.5   12.01   11    9     7.5    6.4    3
In Figure 8, the values of set Y decrease as the values of set X increase. Therefore, the
two sets of related data show a negative trend: a trend line fitted to these data has a
negative slope.
Figure 8: Negative trend in series
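The sign of the trend line's slope can be checked directly from the Table 8 values. The sketch below fits an ordinary least-squares line; this fitting method is an assumption, since the report only imagines a trend line and does not say how one would be drawn. The function name is hypothetical.

```python
def least_squares_slope(x, y):
    """Slope b of the ordinary least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# X and Y values from Table 8.
x = [2, 4, 7, 8.3, 9, 12, 14, 14.8, 18]
y = [16, 14.5, 12.5, 12.01, 11, 9, 7.5, 6.4, 3]

slope = least_squares_slope(x, y)
print(f"slope = {slope:.3f}")  # a negative value indicates a negative trend
```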
Question 6:
Table 9 shows the data series used in this problem.
Table 9: X and Y data sets
X    3    4    8     9     12    15    18    30    35    40
Y    6    9    12    14    18    20    24    33    36    40
In Figure 9, the values of set Y increase as the values of set X increase, so the two sets
of related data show a positive trend. The trend line for these sets has a positive
slope.
Figure 9: Positive trend in series
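The strength and direction of this trend can be summarized with the Pearson correlation coefficient. The report does not compute this coefficient; the sketch below is only an illustration applied to the Table 9 values, and the function name is hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# X and Y values from Table 9.
x = [3, 4, 8, 9, 12, 15, 18, 30, 35, 40]
y = [6, 9, 12, 14, 18, 20, 24, 33, 36, 40]

r = pearson_r(x, y)
print(f"r = {r:.3f}")  # a value close to +1 indicates a strong positive trend
```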
Question 7:
Table 10 presents the data series used in this problem.
Table 10: X and Y data sets
X    -41    -32    -28    -24    -11    6      10     12     48     110
Y    1.75   2.9    3.3    4.1    4.5    5.3    6.1    6.55   7.5    8
Table 11 gives the means and standard deviations of the two series (X, Y). The two series
have the same mean (5.0) but different standard deviations: series X (45.49) is far more
spread out than series Y (2.05).
Table 11: Two series with same mean and different standard deviation
Series   Mean   Standard deviation
X        5.0    45.49
Y        5.0    2.05
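The Table 11 values can be reproduced from the Table 10 data. The sketch below is a minimal check that assumes the sample (n - 1) standard deviation; the report does not state which form was used, but this one matches the reported figures.

```python
import statistics

# X and Y values from Table 10.
x = [-41, -32, -28, -24, -11, 6, 10, 12, 48, 110]
y = [1.75, 2.9, 3.3, 4.1, 4.5, 5.3, 6.1, 6.55, 7.5, 8]

for name, values in (("X", x), ("Y", y)):
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1), an assumption
    print(f"Series {name}: mean = {mean:.1f}, standard deviation = {sd:.2f}")
```

Both series average 5.0, yet the spread of X is roughly 22 times that of Y, which shows why the mean alone does not describe a data set.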