Stat 4220 Homework
Due March 27

1) A research group is studying whether the height of a building affects the ground temperature around the building. To find out, they randomly sampled 250 buildings and recorded their height and the temperature around the building. The output from the regression is shown below. Test whether there is evidence of building height affecting the ground temperature.

Regression Analysis: Temperature versus Height

The regression equation is
Temperature = 75.0 - 0.00164 Height

Predictor  Coef       SE Coef   T       P
Constant   74.9795    0.3069    244.35  0.000
Height     -0.001638  0.001072  -1.53   0.127

S = 3.34230   R-Sq = 0.5%   R-Sq(adj) = 0

2) A study on the average wattage of Laramie power lines sampled 100 randomly chosen power lines and found a 99% confidence interval of (1158.8, 1262.3) Watts. Which of the following sentences is statistically accurate?
a) If we did a new study of power lines in Laramie, we have a 99% probability of getting a confidence interval for the wattage between 1158.8 and 1262.3.
b) We are 99% confident that the true population average wattage for all power lines anywhere in the US is between 1158.8 and 1262.3.
c) Of all Laramie power lines, 99% of them will have a wattage level that falls within the interval (1158.8, 1262.3).
d) There is a 99% probability the next confidence interval done would correctly capture the true average wattage of power lines in Laramie.
e) 99% of the time that an interval on the wattage of power lines is made from Laramie power lines, the population average will be in (1158.8, 1262.3).

3) A regression model was fit to determine how time studying for a test affects grade. The plot of the residuals is given below. Based on this plot, which assumptions necessary for regression do you think may have been violated?

[Figure: "Time studying Residual Plot". Vertical axis: Residuals, from -80 to 100. Horizontal axis: Time studying, from 0 to 8.]

4) Which of the following data sets has the highest value of R?
[Figure: four plots labeled A), B), C), and D)]

A study on how the time of exercise affects heart rate had the following output.

5) According to the output, if I exercise for time=150, what should be my heart rate?

6) After exercising, everybody has different heart rates, which means there is a lot of variability in heart rates. How much of that variability is explained by exercise time?

A study was done to compare tree height with trunk thickness. The following output was generated from the regression model.

Simple linear regression results:
Dependent Variable: Tree Height
Independent Variable: Trunk Size
Height = 26.540844 + 8.024617 Trunk
Sample size: 51
R (correlation coefficient) = 0.9415
R-sq = 0.88648456
Estimate of error standard deviation: 8.624407

Parameter estimates:
Parameter  Estimate   Std. Err.
Intercept  26.540844  2.7416365
Trunk      8.024617   0.41022122

7) Assuming the conditions are met, test if trunk size is a good predictor of tree height.

The output below studies whether salary should increase each year that you get older.

Simple linear regression results:
Dependent Variable: Salary
Independent Variable: Age
Salary = 43130.348 + 8.739329 Age
Sample size: 100
R (correlation coefficient) = 0.0142
R-sq = 2.0028706E-4
Estimate of error standard deviation: 10103.091

Parameter estimates:
Parameter  Estimate   Std. Err.  DF  T-Stat      P-Value
Intercept  43130.348  3426.885   98  12.5858755  <0.0001
Slope      8.739329   62.372787  98  0.14011447  0.00489

8) Would it be a good idea to use this model to predict salary given a specific age?

9) The temperature of the reactor in a nuclear submarine is normally distributed. A random sample of 3 different times showed an average temperature of 324°C with a standard deviation of 54°C. Find a 95% confidence interval for the true average temperature of the sub’s reactor.

10) A 95% confidence interval for μ1-μ2, based on two independent samples of sizes 38 and 40, respectively, is (45.6, 56.7).
a) Is the difference between the two sample means included in the 95% confidence interval?
b) Is the difference between the two population means included in the 95% confidence interval?
c) Would the interval contain more values if the sample sizes were increased?
d) Is the probability that the difference between the two population means, μ1-μ2, falls between 45.6 and 56.7 equal to 0.95?

11) The average August temperatures (y) and geographic latitudes (x) of 20 cities in the United States were studied. The regression equation for these data is
Temperature = 113.6 - 1.01*(latitude)
a. What is the slope of the line?
b. Interpret the slope (how the mean August temperature is affected by a change in latitude).
c. Estimate the mean August temperature for a city with latitude of 32.
d. San Francisco has a latitude of 38. What would you predict for the mean August temperature of San Francisco?
e. Given that the mean August temperature in San Francisco is actually 64, calculate the residual (prediction error) for San Francisco.
f. The latitude at the equator is 0. Estimate the average August temperature at the equator.
g. Explain why we should not use this equation to estimate average August temperature at the equator.

12) A car was driven 20 different times with different octane levels. Using the output from the regression, give a 99% confidence interval for the effect of octane on the car.

Simple linear regression results:
Dependent Variable: mileage
Independent Variable: octane
mileage = -53.426544 + 0.8503097 octane
Sample size: 20
R (correlation coefficient) = 0.9134
R-sq = 0.8343458
Estimate of error standard deviation: 1.8180993

Parameter estimates:
Parameter  Estimate    Std. Err.   DF  T-Stat     P-Value
Intercept  -53.426544  7.824635    18  -6.827992  <0.0001
Slope      0.8503097   0.08930362  18  9.52156    <0.0001

13) 50 different companies are competing for a bid with WYDOT to build roads. Each company submitted a sample of their asphalt for WYDOT to test.
The plot below shows the relationship between asphalt strength and the asphalt tar concentration from each company. What would be an appropriate conclusion based on this graph? Check all that apply (there may be more than one correct answer).
________ The more tar that is put into the asphalt, the stronger the asphalt will be
________ High levels of tar concentration are associated with stronger asphalt
________ Tar causes asphalt to be stronger
________ There is a correlation between tar concentration and asphalt strength
________ The stronger the asphalt is, the more tar that will be put into it

14) In January 2013, the journal Pediatrics published data collected from 214 mother and infant pairs of low-income African-American mothers aged 18 to 35 years in central North Carolina. Data was collected on the number of televisions in the household. The data showed a mean of 3.0 televisions with a standard deviation of 1.2.

“Maternal Characteristics and Perception of Temperament Associated with Infant TV Exposure,” Pediatrics, February 1, 2013, vol. 131, no. 2, e390-e397. doi: 10.1542/peds.2012-1224
http://pediatrics.aappublications.org.libproxy.uwyo.edu/content/131/2/e390/T2.expansion.html

a) Construct a 95% confidence interval for the mean number of televisions per household.
b) Interpret your interval with a proper English sentence.
c) If the t-score were obtained from a computer instead of the t-table, how would the interval change?
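
For checking the arithmetic in problem 14(a), the interval has the usual form: sample mean plus or minus (critical value) x (standard deviation / sqrt(n)). The Python sketch below is one way to set that up and is not part of the assignment. The function name mean_confidence_interval is made up for illustration, and the critical value is left as an argument so that either a t-table value or a software value can be passed in. As an assumption, the example call uses the standard normal quantile as a large-sample stand-in for the t critical value with 213 degrees of freedom.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(mean, sd, n, critical_value):
    """Return (lower, upper) for mean +/- critical_value * sd / sqrt(n)."""
    standard_error = sd / math.sqrt(n)
    margin = critical_value * standard_error
    return mean - margin, mean + margin


# Summary statistics quoted in problem 14.
n, mean, sd = 214, 3.0, 1.2

# Assumption: standard normal quantile used as a large-sample stand-in
# for the t critical value with n - 1 = 213 degrees of freedom.
z_crit = NormalDist().inv_cdf(0.975)

lower, upper = mean_confidence_interval(mean, sd, n, z_crit)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

Passing a different critical_value (for example, one read from a t-table row) changes only the margin, which makes it easy to compare the resulting intervals.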