STAT 360: HW #2
Fall 2014
Points: 35 pts
Name(s): ________________________
_______________________
_______________________
Data Description
These data are haystack measurements taken in Nebraska in 1927 and 1928, when farmers sold hay
unbaled and in the stack, requiring estimation of the volume of a stack. Two measurements that could
be made easily with a rope were usually employed: the circumference around the base of the stack and
the OVER, the distance from the ground on one side of a stack to the ground on the other side of the
stack. Stacks vary in height and shape so using a simple computation like the volume of a hemisphere,
while perhaps a useful first approximation, may or may not be sufficiently accurate.
Source: Methods of Correlation Analysis, Ezekiel, M., (1941), 2nd ed, New York: Wiley, p. 378-380
Data
Typical Haystack (I think…)
Answer the following using whatever software you’d like.
Marginal Distribution
1. Obtain the following quantities for Volume. (3 pts)
Mean =
Variance =
Total Unexplained Variation =
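One way these three quantities could be computed is sketched below. It assumes that "total unexplained variation" for the marginal distribution means the sum of squared deviations of Volume from its mean. The function name `marginal_summary` and the three example values are placeholders for illustration only; they are not the haystack data.

```python
from statistics import mean, variance


def marginal_summary(volume):
    """Return (mean, sample variance, total variation about the mean)."""
    m = mean(volume)
    s2 = variance(volume)  # sample variance, divides by n - 1
    # Assumed definition: sum of squared deviations from the mean.
    total = sum((v - m) ** 2 for v in volume)
    return m, s2, total


# Placeholder values for illustration only; NOT the haystack measurements.
example_volume = [2.0, 4.0, 6.0]
vol_mean, vol_var, vol_total = marginal_summary(example_volume)
print(vol_mean, vol_var, vol_total)
```

Under this definition the total variation equals (n - 1) times the sample variance, which gives a quick check on the arithmetic.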
Distribution of Volume | Circumference
2. Create a plot of Volume vs. Circumference. Volume is the response variable for this
investigation, so place this variable on the y-axis in your plot. Give a brief statement (one or two
sentences) about the general pattern you see in this plot. (3 pts)
3. Show the math as to why it is reasonable to estimate the volume of a haystack using the
quantity below. (3 pts)
Hint: You may want to start with the volume of a sphere – Google it!
π‘‰π‘œπ‘™π‘’π‘šπ‘’ ≈
πΆπ‘–π‘Ÿπ‘π‘’π‘šπ‘“π‘’π‘Ÿπ‘’π‘›π‘π‘’ 3
12πœ‹ 2
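A possible line of reasoning, assuming the stack is treated as a hemisphere of radius $r$ resting on its flat side so that the measured circumference $C$ is the circumference of the base, is sketched here. The symbols $r$, $C$, and $V$ are introduced only for this sketch.

```latex
\begin{align*}
V_{\text{sphere}} &= \tfrac{4}{3}\pi r^{3}
  && \text{(volume of a full sphere)} \\
V_{\text{hemisphere}} &= \tfrac{1}{2}\cdot\tfrac{4}{3}\pi r^{3} = \tfrac{2}{3}\pi r^{3} \\
C &= 2\pi r \;\Longrightarrow\; r = \frac{C}{2\pi} \\
V_{\text{hemisphere}} &= \tfrac{2}{3}\pi\left(\frac{C}{2\pi}\right)^{3}
  = \tfrac{2}{3}\pi\cdot\frac{C^{3}}{8\pi^{3}}
  = \frac{C^{3}}{12\pi^{2}}
\end{align*}
```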
4. Use the function above as an estimate of the mean function for Volume | Circumference. Plot
this estimated mean function on the graph created for Problem #2. Note: You may have to
obtain an estimate of the mean function for each data point in the dataset in order to construct
this plot. (3 pts)
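A minimal sketch of evaluating this mean function at each data point follows. The function name `volume_from_circumference` and the example circumferences are placeholders, not values from the dataset. The resulting fitted values could then be overlaid on the scatterplot with whatever plotting tool is in use.

```python
import math


def volume_from_circumference(circumference):
    """Hemisphere-based estimate of volume: C^3 / (12 * pi^2)."""
    return circumference ** 3 / (12 * math.pi ** 2)


# Placeholder circumferences for illustration only; NOT the haystack data.
example_circumference = [50.0, 60.0, 70.0]
fitted_volume = [volume_from_circumference(c) for c in example_circumference]
print(fitted_volume)
```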
5. Obtain the Residual² value for each point in your dataset. Sum up these values to obtain the total
unexplained variation for the mean function given above. (2 pts)
Total Unexplained Variation =
6. What proportion of the total unexplained variation is being explained by using this mean
function? Show the math here. (2 pts)
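The calculations in Problems #5 and #6 might be organized as in the sketch below. It assumes the proportion explained is (total variation about the mean minus unexplained variation) divided by total variation about the mean. The function names and example numbers are placeholders, not the haystack data.

```python
def sum_squared_residuals(observed, fitted):
    """Total unexplained variation: sum of (observed - fitted)^2."""
    return sum((y - f) ** 2 for y, f in zip(observed, fitted))


def proportion_explained(observed, fitted):
    """Share of the variation about the mean removed by the mean function."""
    m = sum(observed) / len(observed)
    total = sum((y - m) ** 2 for y in observed)
    unexplained = sum_squared_residuals(observed, fitted)
    return (total - unexplained) / total


# Placeholder values for illustration only; NOT the haystack data.
example_observed = [2.0, 4.0, 6.0]
example_fitted = [2.5, 4.0, 5.5]
print(sum_squared_residuals(example_observed, example_fitted))
print(proportion_explained(example_observed, example_fitted))
```

Because the mean function here is fixed by geometry rather than fitted to the data, this proportion can come out negative if the function predicts worse than the overall mean does.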
Distribution of Volume | Over
7. What would be a reasonable function for estimating volume if the Over measurement was used
instead of the Circumference measurement? Explain how you obtained this value. (4 pts)
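One candidate function, offered only as a sketch, again treats the stack as a hemisphere of radius r. Under that assumption the Over measurement runs along half of a great circle, so Over = πr and the volume (2/3)πr³ becomes 2·Over³ / (3π²). The function name `volume_from_over` is a placeholder, and other stack shapes would lead to other functions.

```python
import math


def volume_from_over(over):
    """Hemisphere-based estimate assuming Over = pi * r: 2 * Over^3 / (3 * pi^2)."""
    return 2 * over ** 3 / (3 * math.pi ** 2)


# A hemisphere of radius 1 has Over = pi; the estimate should equal 2*pi/3.
print(volume_from_over(math.pi), 2 * math.pi / 3)
```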
8. Use your function in Problem #7 to obtain the Residual² value for each point in your dataset.
Sum up these values to obtain the total unexplained variation for the mean function using Over.
(3 pts)
Total Unexplained Variation =
9. Once again, what proportion of the total unexplained variation is being explained by using a
mean function based on Over? (2 pts)
Comparing the Estimating Mean Functions
10. Which is a better estimate to use for estimating the average volume of a haystack – the
Circumference or the Over measurement? Explain. (4 pts)
Investigation of the Variance Function
11. Pick one set of the residuals obtained above. Obtain the |Residual| value for each data point
and plot it against Circumference or Over, depending on which mean function was used. (3 pts)
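A small sketch of preparing the points for this plot is shown here. It pairs each predictor value with the absolute residual from the chosen mean function. The function name `absolute_residual_points` and the example numbers are placeholders, not the haystack data.

```python
def absolute_residual_points(predictor, observed, fitted):
    """Return (predictor, |observed - fitted|) pairs, ready to plot."""
    return [(x, abs(y - f)) for x, y, f in zip(predictor, observed, fitted)]


# Placeholder values for illustration only; NOT the haystack data.
example_predictor = [50.0, 60.0, 70.0]
example_observed = [2.0, 4.0, 6.0]
example_fitted = [2.5, 4.0, 5.0]
points = absolute_residual_points(example_predictor, example_observed, example_fitted)
print(points)
```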
12. Discuss the general pattern seen in the above plot. Does the variance function appear to
increase/decrease/not change as the haystacks get larger? Discuss the possible consequences
of a changing variance function on someone who purchases hay in this manner. (3 pts)