Homework 1 (Due date: January 30, 2004 in class)
Make sure to write your name and section number on each page clearly. Do not write your SSN or
ID.
Staple everything together and place it on the table before class starts. There is a 5 pt. penalty if
you do not staple the pages together.
If you come to class after the lecture has started, do not walk through the classroom to turn your
homework in. Wait until the class ends and give it to me before I leave the classroom.
There is no partial credit for numerical mistakes. Include 4 digits after the
decimal point to reduce rounding error.
The Cloud data (there is a link under datasets on the course webpage) were collected in a cloud-seeding
experiment in Tasmania between mid-1964 and January 1971. The rainfalls recorded are period rainfalls
in inches. TE and TW are the east and west target areas respectively, while NC, SC and NWC are the
corresponding rainfalls in the north, south and north-west control areas respectively. S = seeded, U =
unseeded.
Use these data to answer the following questions. I suggest you use a spreadsheet such as Excel to reduce the
time spent on computations and the numerical mistakes you may make in them.
1. (15 pt.) Construct a stem-and-leaf display for rainfalls in the east target area (TE), using the integer part as the stem
and the decimals as the leaf. Since each value has two decimal digits, use only the first one as the leaf; do not
round it up or down. (Example: 0.74 has stem 0 and decimals 74, but you record only 7 as the leaf,
not 74. Likewise, 2.48 has stem 2 and decimals 48, but you record only 4 as the leaf, not 48.) Also split
each stem into two parts (the first part with leaves 0 to 4 and the second part with leaves 5 to 9).
Comment on the skewness. Are the data unimodal?
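The truncation and stem-splitting rules in question 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `split_stem_and_leaf` and the sample values are hypothetical (they are not taken from the Cloud data), and the values are assumed to be non-negative and recorded to two decimals.

```python
from collections import defaultdict

def split_stem_and_leaf(values):
    """Build a stem-and-leaf display with each stem split into leaves 0-4 and 5-9."""
    rows = defaultdict(list)
    for v in sorted(values):
        hundredths = round(v * 100)          # work in whole hundredths to avoid float noise
        stem = hundredths // 100             # integer part of the value
        leaf = (hundredths // 10) % 10       # first decimal digit, truncated (not rounded)
        half = 0 if leaf <= 4 else 1         # 0 = leaves 0-4, 1 = leaves 5-9
        rows[(stem, half)].append(str(leaf))
    stems = [stem for stem, _ in rows]
    lines = []
    # Print every stem between the smallest and largest, even if a row is empty,
    # so that gaps in the data remain visible.
    for stem in range(min(stems), max(stems) + 1):
        for half in (0, 1):
            leaves = "".join(rows.get((stem, half), []))
            lines.append(f"{stem} | {leaves}".rstrip())
    return lines

# Hypothetical values for illustration only.
sample = [0.74, 2.48, 0.12, 0.58, 1.03]
for line in split_stem_and_leaf(sample):
    print(line)
```

Each stem appears twice: the first row holds leaves 0 to 4 and the second holds leaves 5 to 9. Empty rows are kept on purpose, since dropping them would hide gaps and distort the shape of the display.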
2. (10 pt.) Construct a frequency distribution with the frequency, relative frequency, and cumulative relative
frequency for rainfalls of the seasons.
3. (15 pt.) Construct a histogram for rainfalls in the east target area using the class intervals
(0,1], (1,2], (2,3], (3,4], (4,5], (5,6]. Comment on the skewness and gaps. (Hint: for example, (0,1]
means data larger than 0 and at most 1.)
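The right-closed convention in the hint matters for values that fall exactly on a class boundary. The sketch below shows one way to count observations in unit-width classes (k, k+1]; the function name `right_closed_counts` and the sample values are hypothetical and are not the homework data.

```python
import math

def right_closed_counts(values, lower=0, upper=6):
    """Count values in the classes (lower, lower+1], ..., (upper-1, upper]."""
    counts = {(k, k + 1): 0 for k in range(lower, upper)}
    for v in values:
        # ceil(v) is the right endpoint of the class containing v, so a value
        # exactly on a boundary (e.g. 1.0) goes to the class below it, (0, 1].
        k = math.ceil(v) - 1
        if (k, k + 1) not in counts:
            raise ValueError(f"{v} is outside ({lower}, {upper}]")
        counts[(k, k + 1)] += 1
    return counts

# Hypothetical values for illustration only.
sample = [0.5, 1.0, 1.01, 2.0, 5.99, 6.0]
for (left, right), count in right_closed_counts(sample).items():
    print(f"({left},{right}]: {count}")
```

Note that many spreadsheet and plotting tools default to left-closed classes [k, k+1), which would place a value such as 1.00 in the next class up, so it is worth checking the convention before trusting automatic bin counts.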
4. (12 pt.) (a) What proportion of rainfalls in the east target area occur in winter?
(b) What proportion of seeded rainfalls in winter is observed?
(c) What proportion of rainfalls in the east target area are at most 1.00 inches?
(d) What proportion of rainfalls in the east target area are at least 1.00 inches?
5. (12 pt.) Calculate the mean, median, lower quartile, upper quartile, minimum, and maximum for the
rainfalls in the east target area.
6. (8 pt.) Calculate the range, interquartile range, standard deviation, and coefficient of variation for the
rainfalls in the east target area.
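As a cross-check on spreadsheet work for questions 5 and 6, the summary measures can be sketched in Python. Several points here are assumptions rather than course requirements: the function name `summarize` is hypothetical, the quartiles use the standard library's "exclusive" method (the course textbook may define quartiles differently, which can change the result slightly), the standard deviation is the sample version (divisor n - 1), and the coefficient of variation is taken as standard deviation divided by mean. The sample values are made up for illustration.

```python
import statistics

def summarize(values):
    """Return location and spread measures, rounded to 4 decimal places."""
    data = sorted(values)
    # Quartile convention is an assumption; textbooks differ on this.
    q1, median, q3 = statistics.quantiles(data, n=4, method="exclusive")
    mean = statistics.mean(data)
    sd = statistics.stdev(data)              # sample standard deviation (n - 1)
    summary = {
        "min": data[0],
        "max": data[-1],
        "mean": mean,
        "median": median,
        "lower_quartile": q1,
        "upper_quartile": q3,
        "range": data[-1] - data[0],
        "iqr": q3 - q1,
        "sd": sd,
        "cv": sd / mean,                     # coefficient of variation as a ratio
    }
    return {name: round(value, 4) for name, value in summary.items()}

# Hypothetical values for illustration only.
for name, value in summarize([1.0, 2.0, 3.0, 4.0, 5.0]).items():
    print(f"{name}: {value}")
```

Rounding is applied only at the end, in line with the instruction to report 4 digits after the decimal point; rounding intermediate results would compound the error.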
7. (10 pt.) Construct a boxplot for rainfalls in the east target area. Comment on the skewness and
outliers.
8. (6 pt.) Consider the rainfalls in the east target area. By how much can the smallest observation be increased
without affecting the sample median? By how much can the smallest observation be decreased without
affecting the sample median? If the smallest observation is lowered to the smallest value possible,
would it be an outlier? In answering these, consider what numeric values can be used to express
rainfalls.
9. (6 pt.) If each observation in this dataset is lowered by 0.1 inches,
(a) By how much does the upper quartile differ from its previous value? (larger by ? inches/smaller by ?
inches/stays the same)
(b) By how much does the range differ from its previous value? (larger by ? inches/smaller by ?
inches/stays the same)
(c) Would this change affect the skewness? (less skewed/more skewed/stays the same)
10. (6 pt.) The following is the comparative boxplot for the rainfalls in the north, south and north-west
control areas.
[Figure: comparative boxplot of rainfalls for NC, SC and NWC (N = 108 for each area). Vertical axis runs from -2 to 12. Outliers are labeled S or U.]
(a) Identify the skewness for each boxplot.
(b) Outliers are labeled according to whether they were seeded (S) or unseeded (U). Are the seeded rainfalls heavier than the
unseeded ones? (Yes/No/not possible to determine)
(c) Looking at the boxplot, which area (NC/SC/NWC) has the largest median?
(d) Looking at the boxplot, which area (NC/SC/NWC) has the lowest variation?