(i)
(ii) Calculate the sample mean and median of homevalue2006Mar. Comment on the shape of the distribution without creating a histogram.

Shape of the distribution: it is positively skewed, since the median is smaller than the mean.

Report the five number summary and box-whisker plots for homevalue2006Mar and homevalue2010Jul.

(iii) Compute the standard deviation and IQR of homevalue2006Mar and homevalue2010Jul, respectively.

(iv) Summarize your findings from (ii) and (iii).

Home values were lower in July 2010 than in March 2006. The minimum home value was lower in 2010 by 3k, which pales in comparison to the 22k difference in maximum values between the two dates. The median home value was also lower than in 2006. Furthermore, both the range and the IQR are smaller in July 2010. The smaller IQR shows that the middle 50% of values in my dataset cluster more tightly, meaning there is less variability in home values.

(v) What do you think caused your findings in (iv)? Explain. What additional data would be helpful to support your explanation?

Many factors might have contributed to this change in home values. However, I believe the most important one is likely the financial crisis. The 2008 recession could have played a part in pushing housing prices down: foreclosures and financial hardship reduced people's ability and willingness to buy homes, lowering home values in the process. To support this explanation, I would need more data, such as quarterly home values between March 2006 and July 2010. This would give a more complete picture of whether the decline lines up with the recession.
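As a rough sketch of how the statistics in (ii) and (iii) could be computed, the Python below uses only the standard library. The two lists are made-up values in thousands of dollars, not the real homevalue2006Mar and homevalue2010Jul columns, and the helper names five_number_summary and describe are hypothetical. The quartiles are taken as medians of the lower and upper halves of the sorted data; other conventions give slightly different numbers.

```python
import statistics


def five_number_summary(values):
    """Return (min, lower fourth, median, upper fourth, max).

    The fourths are the medians of the lower and upper halves of the
    sorted data; when n is odd the middle value is included in both halves.
    """
    xs = sorted(values)
    n = len(xs)
    half = (n + 1) // 2  # size of each half
    lower = statistics.median(xs[:half])
    upper = statistics.median(xs[n - half:])
    return xs[0], lower, statistics.median(xs), upper, xs[-1]


def describe(values):
    """Collect the summary statistics asked for in (ii) and (iii)."""
    lo, q1, med, q3, hi = five_number_summary(values)
    return {
        "mean": statistics.mean(values),
        "median": med,
        "five_number": (lo, q1, med, q3, hi),
        "stdev": statistics.stdev(values),  # sample SD, divides by n - 1
        "iqr": q3 - q1,
        "range": hi - lo,
    }


# Made-up home values (in $1000s), for illustration only.
hypothetical_2006 = [120, 135, 150, 160, 175, 210, 320]
hypothetical_2010 = [117, 128, 140, 150, 162, 190, 298]

summary_2006 = describe(hypothetical_2006)
summary_2010 = describe(hypothetical_2010)

for label, summary in (("2006Mar", summary_2006), ("2010Jul", summary_2010)):
    print(label, summary)

# A mean above the median is the usual sign of positive skew.
print("2006 mean > median:", summary_2006["mean"] > summary_2006["median"])
```

Comparing the "iqr" and "range" entries of the two dictionaries is the same comparison made in (iv), just on invented numbers.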
Part II

Q-26
It appears that the vast majority of runners slowed down toward the end of the race, although most did so by less than 300 seconds. The histogram is unimodal, asymmetric and positively skewed. The typical difference falls in the 50 to 100 second interval, followed closely by the 100 to 150 and 150 to 200 second intervals. That means that while runners mostly did slow down, they did not do so by a large margin. Only a very small percentage of runners sped up over the last stretch of the race, probably less than 8% of the total number of marathon runners.

Q-32
1. 2. 3. 1000-1190 63.7%

Q-44
1. 18.7
2. Mean is 19.25743, median is 19.2. My representative value was not far from either measure; I just did not account for some of the spikes in the data. The median and mean are almost exactly equal, which suggests that the distribution is roughly symmetric.

Q-48
Let x1, ..., xn be a sample, and let a and b be constants with a ≠ 0. Define a new sample y1, ..., yn by y1 = ax1 + b, ..., yn = axn + b.
1. The mean of y1, ..., yn equals a times the mean of x1, ..., xn, plus b. To illustrate, take x1, ..., x4 = 2, 3, 4, 5 with a = 6 and b = 0, so that y1, ..., y4 = 12, 18, 24, 30. The mean of the x values is 3.5, while the mean of the y values is 21. That means the mean was multiplied by 6, as 3.5 x 6 = 21. Additionally, if I were to assign a different value to b, like b = 1, the mean of the y values would increase by 1, to 22.
2. Likewise, the median of the y values equals a times the median of the x values, plus b. In the example, the median of the x values is 3.5, while the median of the y values is 21.

Q-54
1.
2. Mean is 115.58, and the deviations are: .8200015, .3200015, -.9800015, -.3800031, .2200031. The sample standard deviation is .694264, while the sample variance is .4820025.
3. Sxx is 1.92801, that is, the sum of the squared deviations, which divided by n-1 (or 4) is equal to .4820025.
4. The sample variance was the same, .4820025.
This is because decreasing every observation by 100 decreases the mean by 100 as well, which leaves the deviations unchanged and therefore gives the same value for s^2.

Q-66
a. The Great Divide Hercules is an outlier, as 9.10 > (5.95 + 1.5(1.6)) = 8.35. There is only one extreme outlier, the Rogue Imperial Stout, as 11.6 > (5.95 + 3(1.6)) = 10.75.
b. This box plot is fairly symmetric, with the median close to the middle of the box. It features one mild outlier and one extreme outlier, marked by the dot and the x, respectively.
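The outlier checks in Q-66 a. can be sketched in Python as below. This reads 5.95 as the upper fourth and 1.6 as the fourth spread, which is how the inequalities above use them; the function name classify_outlier is hypothetical, and the lower fourth is derived as upper fourth minus spread rather than taken from the data.

```python
def classify_outlier(value, upper_fourth, fourth_spread):
    """Label a value 'none', 'mild', or 'extreme'.

    A value is a mild outlier if it lies more than 1.5 fourth spreads
    beyond the nearer fourth, and extreme if it lies more than 3 beyond.
    """
    lower_fourth = upper_fourth - fourth_spread  # derived, not from the data
    # Distance outside the box; negative or zero if the value is inside it.
    distance = max(lower_fourth - value, value - upper_fourth)
    if distance > 3 * fourth_spread:
        return "extreme"
    if distance > 1.5 * fourth_spread:
        return "mild"
    return "none"


# Values quoted in the answer to Q-66 a.
hercules = classify_outlier(9.10, upper_fourth=5.95, fourth_spread=1.6)
imperial_stout = classify_outlier(11.6, upper_fourth=5.95, fourth_spread=1.6)

print("Great Divide Hercules:", hercules)
print("Rogue Imperial Stout:", imperial_stout)
```

The two fences on the upper side work out to 8.35 and 10.75, so 9.10 falls between them and 11.6 falls beyond both.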