University of Windsor
Odette School of Business
Business Data Analysis I 73-105 (Day and Night Sections) Fall 2001
Assignment 2
Due: Monday, October 15, at the beginning of class.
General note: Staple a cover sheet to the front of the assignment showing your name,
section, and student ID. Make sure to show your work, as most of the marks will be given
for correctly setting up solutions. You may discuss the problems with other students, but
you must hand in your own work, in your own words.
1. (22 points) Hand computation: A professor of statistics wants to report the results of a
midterm exam taken by his class of 12 students. The marks are as follows: 81, 87, 94,
84, 18, 96, 92, 94, 90, 95, 73, 70
a. (3 points) Find the mean, median, and mode of the data.
b. (3 points) Briefly interpret the values obtained in Part a. Use no more than 50
words in total.
c. (2 points) Compute the 30th percentile and the 70th percentile of the data.
d. (4 points) Compute the upper and the lower quartiles and the interquartile range.
Using these values, identify outlier(s), if any.
e. (4 points) How do the mean, median, and mode change if the outlier(s) is (are)
removed from the data? Do the changes justify the statement that the mean is
sensitive to extreme values (but the median and mode are not so sensitive)?
f. (3 points) Draw a box plot. Show the five-number summary, the box, the whiskers, and
the other components of the box plot. A sample is shown in the text, in Figure 4.6, on
p. 137.
g. (3 points) Is the distribution symmetric, positively skewed, or negatively skewed?
Justify your answer with the results of Parts a and f.
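As a way to check the hand computations in Problem 1, the following Python sketch computes the measures of central location, the quartiles, and the outlier fences for the 12 marks. It rests on two assumptions that should be compared with the text: percentiles are located with the rule L = (n + 1)P/100 using linear interpolation, and outliers are values more than 1.5 times the interquartile range beyond the quartiles. The function name `percentile` is illustrative only.

```python
from statistics import mean, median, multimode

marks = [81, 87, 94, 84, 18, 96, 92, 94, 90, 95, 73, 70]


def percentile(values, p):
    """Return the p-th percentile using the location rule L = (n + 1) * p / 100.

    Assumption: this rule, with linear interpolation between neighbouring
    observations, matches the text. Other books and software use other rules.
    """
    ordered = sorted(values)
    n = len(ordered)
    location = (n + 1) * p / 100         # 1-based position in the sorted data
    location = min(max(location, 1), n)  # keep the position inside the data
    lower = int(location)                # whole part of the position
    fraction = location - lower          # fractional part of the position
    if lower == n:
        return float(ordered[-1])
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


q1 = percentile(marks, 25)   # lower quartile
q3 = percentile(marks, 75)   # upper quartile
iqr = q3 - q1                # interquartile range

# Assumption: the 1.5 * IQR fence rule is used to flag outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [m for m in marks if m < lower_fence or m > upper_fence]

print("mean:", mean(marks))
print("median:", median(marks))
print("mode(s):", multimode(marks))
print("P30:", percentile(marks, 30), "P70:", percentile(marks, 70))
print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
print("outliers:", outliers)
```

Rerunning the first three print statements on the marks with the flagged value(s) removed gives a direct comparison for Part e.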
2. (13 points) Use data from file XR04-51 and Excel for Parts b and c:
a. (3 points) Answer Problem 4.51(a), text p. 140. You may construct the box plot
using Excel or by hand (not both). If you use Excel, you will need Data Analysis
Plus, which comes on the CD included with the text. Data Analysis Plus is installed
on all computers in Room OB 210. A sample Excel box plot is shown in the text, on
p. 138. If you construct the box plot by hand, you may show the five-number summary,
the box, the whiskers, and the other components of the box plot. A sample is shown
in the text, in Figure 4.6, on p. 137 (notice that the figure does not show data
points, so you do not need to plot 100 data points).
b. (3 points) Answer Problem 4.51(b), text p. 140.
c. (3 points) Find the mean, median, and mode.
d. (4 points) Answer Problem 4.51(c), text p. 140. State whether the distribution is
symmetric, positively skewed, or negatively skewed, and whether there are any
outliers. Justify your answer with the results of Parts a, b and c.
3. (14 points) Hand computation: Consider the following sample:
x     y
4     8
8     11
9     13
11    16
5     15
a. (4 points) Compute the variance, standard deviation, and coefficient of variation of
the x values.
b. (2 points) Compute the standard deviation of the y values using the short-cut formula.
c. (4 points) Compute cov(x,y).
d. (1 point) Interpret cov(x,y).
e. (1 point) Compute the coefficient of correlation, r.
f. (2 points) Interpret r.
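The hand computations in Problem 3 can be cross-checked with a short Python sketch. It assumes that the listed values pair off row by row as (x, y), that the sample formulas (divisor n - 1) are intended, and that the short-cut formula means the form based on the sum of squares and the square of the sum. All function names are illustrative.

```python
from math import sqrt

# Assumption: the listed values pair off row by row as (x, y).
x = [4, 8, 9, 11, 5]
y = [8, 11, 13, 16, 15]


def sample_variance(values):
    """Definitional formula: sum of squared deviations divided by n - 1."""
    n = len(values)
    m = sum(values) / n
    return sum((v - m) ** 2 for v in values) / (n - 1)


def shortcut_variance(values):
    """Short-cut formula: (sum of v^2 - (sum of v)^2 / n) / (n - 1)."""
    n = len(values)
    return (sum(v * v for v in values) - sum(values) ** 2 / n) / (n - 1)


def sample_covariance(xs, ys):
    """Sample covariance: sum of cross-deviations divided by n - 1."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (n - 1)


var_x = sample_variance(x)
sd_x = sqrt(var_x)
cv_x = sd_x / (sum(x) / len(x))       # coefficient of variation = s / mean
sd_y = sqrt(shortcut_variance(y))     # Part b uses the short-cut form
cov_xy = sample_covariance(x, y)
r = cov_xy / (sd_x * sd_y)            # coefficient of correlation

print("var_x:", var_x, "sd_x:", sd_x, "cv_x:", cv_x)
print("sd_y:", sd_y)
print("cov(x,y):", cov_xy, "r:", r)
```

The definitional and short-cut formulas are algebraically equivalent, so `sample_variance(y)` and `shortcut_variance(y)` should agree up to rounding.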
4. (16 points) Use data from file XR04-57 and Excel: Treat the data as a sample and
answer the following:
a. (3 points) Compute the sample variance, standard deviation, and coefficient of
variation of the x values. Note: The Excel VAR and STDEV functions use the sample
formulas, so their values do not need to be adjusted (unlike the covariance in
Part b).
b. (2 points) Compute cov(x,y). Note: The Excel COVAR function uses the formula for
population covariance, so its value must be adjusted to give the sample covariance.
c. (2 points) Compute the coefficient of correlation, r.
d. (1 point) Construct a scatter diagram showing x-values on the horizontal axis and
y-values on the vertical axis. In Step 1 of the Excel chart dialog box, choose the
first scatter chart subtype, the one that has no lines.
e. (1 point) Suppose that you want to fit a straight line that best represents the data
shown on your scatter diagram. As a first step, compute a fitted y-value for each
x-value. Use the Excel function TREND and place the fitted y-values in cells
C2:C51, leaving the original data in cells A1:B51.
f. (2 points) Modify your scatter diagram of Part d to include the y-values obtained
in Part e as the second series of the scatter diagram. Format the second data series
patterns to show lines and no markers. Sample charts are shown in lecture notes
ch4_c_f01_105.ppt, slides 25-27.
g. (5 points) Comment on whether the relationship between x and y is positive/negative,
linear/non-linear, etc. Justify your answer with the results of Parts b, c and f.
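To see what the adjustment in Part b and the TREND values in Part e amount to, here is a minimal Python sketch. The data are hypothetical, because the contents of file XR04-57 are not reproduced here, and the function names are illustrative. It assumes the usual relationship between the two covariance formulas (the sample value equals the population value times n / (n - 1)) and that TREND returns the fitted values of the least-squares line.

```python
from math import sqrt

# Hypothetical data, standing in for the x and y columns of the data file.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def population_cov(xs, ys):
    """Population covariance (divisor n), the form COVAR is described as using."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / n


def sample_cov(xs, ys):
    """Sample covariance: population covariance rescaled by n / (n - 1)."""
    n = len(xs)
    return population_cov(xs, ys) * n / (n - 1)


def fitted_values(xs, ys):
    """Fitted y-values of the least-squares line through the data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sample_cov(xs, ys) / sample_cov(xs, xs)  # cov(x,y) / var(x)
    intercept = my - slope * mx
    return [intercept + slope * a for a in xs]


cov_pop = population_cov(x, y)
cov_sample = sample_cov(x, y)
# cov(x, x) is the variance of x, so r = cov(x,y) / (s_x * s_y).
r = cov_sample / sqrt(sample_cov(x, x) * sample_cov(y, y))
y_hat = fitted_values(x, y)

print("population cov:", cov_pop, "sample cov:", cov_sample)
print("r:", r)
print("fitted y-values:", y_hat)
```

Plotting `y_hat` against `x` as a line, on top of the scatter of `y` against `x`, corresponds to the chart requested in Part f.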
Notes on printout for Problem 4: When you print your work for Part e, you do not need to
print all 51 rows. You may hide some rows, say rows 11 to 45 (highlight the cells you
want to hide and click Format, Row, Hide). Print the remaining rows on one sheet of
paper, and show your answers to Parts a, b and c on the same sheet. If you print the
chart required in Part f, you do not need a separate printout for Part d. Submit one
chart that shows your work for Parts d and f.