Linear Regression Problems

1. As Earth’s population continues to grow, the solid waste generated by the population grows with it. Governments must plan for disposal and recycling of ever-growing amounts of solid waste. Planners can use data from the past to predict future waste generation and plan for enough facilities for disposing of and recycling the waste. Given the following data on the waste generated in Florida from 1990 to 1994, how can we construct a function to predict the waste that was generated in the years 1995-1999? The scatter plot is shown in Figure 1.85.

Year    Tons of Solid Waste Generated (in thousands)
1990    19,358
1991    19,484
1992    20,293
1993    21,499
1994    23,561

a) Make a scatterplot of the data, letting x represent the number of years since 1990.
b) Use a graphing calculator to fit linear, quadratic, cubic, and power functions to the data. By comparing the values of R², determine the function that best fits the data.
c) Graph the function of best fit with the scatterplot of the data.
d) With each function found in part (b), predict the tons of waste generated in 2000 and 2005, and determine which function gives the most realistic predictions.

2. The numbers of insured commercial banks y (in thousands) in the United States for the years 1987 to 1996 are shown in the table. (Source: Federal Deposit Insurance Corporation)

Year    y
1987    13.70
1988    13.12
1989    12.71
1990    12.34
1991    11.92
1992    11.46
1993    10.96
1994    10.45
1995     9.94
1996     9.53

a) Make a scatterplot of the data, letting x represent the number of years since 1987.
b) Use a graphing calculator to fit linear, quadratic, cubic, and power functions to the data. By comparing the values of R², determine the function that best fits the data.
c) Graph the function of best fit with the scatterplot of the data.
d) With each function found in part (b), predict the number of insured commercial banks in 2000 and 2005, and determine which function gives the most realistic predictions.
e) Plot the actual data and the model you selected on the same graph. How closely does the model represent the data?

A. Quesada, Director, Project AMP

3. U.S. Farms. As the number of farms has decreased in the United States, the average size of the remaining farms has grown larger, as shown in the table below.

Year    Average Acreage Per Farm
1910    139
1920    149
1930    157
1940    175
1950    216
1959    303
1969    390
1978    449
1987    462
1997    487

a) Make a scatterplot of the data, letting x represent the number of years since 1900.
b) Use a graphing calculator to fit linear, quadratic, cubic, and power functions to the data. By comparing the values of R², determine the function that best fits the data.
c) Graph the function of best fit with the scatterplot of the data.
d) With each function found in part (b), predict the average acreage in 2000 and 2010, and determine which function gives the most realistic predictions.

4. Sports. The winning times (in minutes) in the women’s 400-meter freestyle swimming event in the Olympics from 1936 to 1996 are given by the following ordered pairs.

(1936, 5.44)    (1972, 4.32)
(1948, 5.30)    (1976, 4.16)
(1952, 5.20)    (1980, 4.15)
(1956, 4.91)    (1984, 4.12)
(1960, 4.84)    (1988, 4.06)
(1964, 4.72)    (1992, 4.12)
(1968, 4.53)    (1996, 4.12)

a) Make a scatterplot of the data, letting x represent the number of years since 1972.
b) Use a graphing calculator to fit linear, quadratic, cubic, and power functions to the data. By comparing the values of R², determine the function that best fits the data.
c) Graph the function of best fit with the scatterplot of the data.
d) Plot the actual data and the model you selected on the same graph. How closely does the model represent the data?

Quadratic Regression Problems

1. The following data were obtained by throwing a rubber ball at a CBR.

Time (sec)    Height (m)
0.0000        1.03754
0.1080        1.40205
0.2150        1.63806
0.3225        1.77412
0.4300        1.80392
0.5375        1.71522
0.6450        1.50942
0.7525        1.21410
0.8600        0.83173

a) Use the data above to make a scatterplot, letting x represent the number of seconds elapsed.
b) Next, use a graphing calculator to find the model that best expresses the height and vertical velocity of the rubber ball. We can also use this model to predict the maximum height of the ball and its vertical velocity when it hits the face of the CBR.
c) Fit linear, quadratic, cubic, and power functions to the data. By comparing the values of R², determine the function that best fits the data.
d) Graph the function of best fit with the scatterplot of the data.
e) Determine the maximum height of the ball (in meters).
f) With the model you selected in part (b), predict when the height of the ball is at least 1.5 meters.

2. Stopping Distance. A state highway patrol safety division collected the data on stopping distances in Table 2.16.

Table 2.16 Highway Safety Division
Speed (mph)    Stopping Distance (ft)
10              15.1
20              39.9
30              75.2
40             120.5
50             175.9

a) Draw a scatter plot of the data.
b) Fit linear, quadratic, cubic, and power functions to the data. By comparing the values of R², determine the function that best fits the data.
c) Superimpose the regression curve on the scatter plot.
d) Use the regression model to predict the stopping distance for a vehicle traveling at 25 mph.
e) Use the regression model to predict the speed of a car if the stopping distance is 300 ft.

3. Home Schooling Growth. The estimated numbers of U.S. children who were home-schooled in the years 1992 to 1997 were:

Table 1.13 Home Schooling
Year    Number
1992      703,000
1993      808,000
1994      929,000
1995    1,060,000
1996    1,220,000
1997    1,347,000

(a) Produce a scatter plot of the number of children home-schooled in thousands (y) as a function of years since 1990 (x).
(b) Find the linear regression equation.
(Round the coefficients to the nearest 0.01.)
(c) Does the value of r² suggest that the linear model is appropriate?
(d) Find the quadratic regression equation. (Round the coefficients to the nearest 0.01.)
(e) Does the value of R² suggest that a quadratic model is appropriate?
(f) Use both curves to predict the number of U.S. children who are home-schooled in the year 2005. How different are the estimates?
(g) Writing to Learn. Use the results of this exploration to explain why it is risky to use regression equations to predict y-values for x-values that are not very close to the data points, even when the curves fit the data points very well.

4. Leisure Time. The following table shows the median number of hours of leisure time that Americans had each week in various years.

Year    x     Median Number of Leisure Hours Per Week
1973     0    26.2
1980     7    19.2
1987    14    16.6
1993    20    18.8
1997    24    19.5

Source: Louis Harris and Associates

(a) Make a scatterplot of the data, letting x represent the number of years since 1973, and determine which model best fits the data.
(b) Use a graphing calculator to fit the type of function determined in part (a) to the data.
(c) Graph the equation with the scatterplot. Then use the function found in part (b) to estimate the number of leisure hours per week in 1978, 1990, and 2005.

5. On-line Travel Revenue. With the explosion of Internet use, more and more travelers are booking their travel reservations on-line. The following table lists the total on-line revenue for recent years. Most of the revenue is from airline tickets.

Year    On-Line Travel Revenue (in millions of dollars)
1996     276
1997     827
1998    1900
1999    3200
2000    4700
2001    6500
2002    8900

Source: Travel and Interactive Technology 1999

(a) Create a scatterplot of the data. Let x = the number of years since 1996.
(b) Use a graphing calculator to fit the data with linear, quadratic, and exponential functions. Determine which function has the best fit.
(c) Graph all three functions found in part (b) with the scatterplot in part (a).
(d) Use the functions found in part (b) to estimate the on-line travel revenue in 2010. Which function provides the most realistic prediction?

Quartic Regression Problems

1. Consumer Debt. Nonmortgage consumer debt is mounting in the United States, as shown in the table below.

Year    Nonmortgage Debt (in billions of dollars)
1989     762
1990     789
1991     783
1992     775
1993     804
1994     902
1995    1038
1996    1161
1997    1216
1998    1266

a) Draw a scatter plot of the data.
b) Fit linear, exponential, power, cubic, and quartic functions to the data. By comparing the values of R², determine the function that best fits the data.
c) Superimpose the regression curve on the scatter plot.
d) Use the regression model to predict when consumer debt will reach 1400 billion dollars.

2. Declining Number of Farms in the United States. Today U.S. farm acreage is about the same as it was in the early part of the twentieth century, but the number of farms has shrunk.

Year    Number of Farms (in millions)
1910    6.4
1920    6.5
1930    6.3
1940    6.1
1950    5.4
1959    3.7
1969    2.7
1978    2.3
1987    2.1
1997    1.9

Looking at the table above, we note that the data could be modeled with a cubic or a quartic function.

(a) Model the data with both cubic and quartic functions. Let the first coordinate of each data point be the number of years after 1900. That is, enter the data as (10, 6.4), (20, 6.5), and so on. Then, using R², the coefficient of determination, decide which function is the better fit. The R² value gives an indication of how well the function fits the data. The closer R² is to 1, the better the fit.
(b) Graph the function with the scatterplot of the data.
(c) Use the answer to part (a) to estimate the number of farms in 1900, 1975, and 2003.

Exponential Regression Problems

1. In the years before the Civil War, the population of the United States grew rapidly, as shown in the following table from the U.S. Bureau of the Census.

Year    Population (in millions)
1790     3.93
1800     5.31
1810     7.24
1820     9.64
1830    12.86
1840    17.07
1850    23.19
1860    31.44

a) Draw a scatter plot of the data.
b) Fit linear, quadratic, exponential, power, logarithmic, and logistic functions to the data. By comparing the values of R², determine the function that best fits the data.
c) Superimpose the regression curve on the scatter plot.
d) Use the regression model to predict the population in 1870.
e) Use the regression model to predict the population in 1930. Explain why you do or do not feel this prediction is valid. (Hint: you may want to complete this problem after you finish the problem dealing with Census records after the Civil War.)

2. Projected Number of Alzheimer’s Patients. German psychiatrist Alois Alzheimer first described the disease, later called Alzheimer’s disease, in 1906. Since life expectancy has significantly increased in the last century, the number of Alzheimer’s patients has increased dramatically. The number of patients in the United States reached 4 million in 2000. The following table lists projected data regarding the number of Alzheimer’s patients in years beyond 2000.

Year, x    Projected Number of Alzheimer’s Patients in the United States (in millions)
2000        4.0
2010        5.8
2020        6.8
2030        8.7
2040       11.8
2050       14.3

a) Draw a scatter plot of the data.
b) Fit linear, exponential, power, logistic, and logarithmic functions to the data. By comparing the values of R², determine the function that best fits the data.
c) Superimpose the regression curve on the scatter plot.
d) Use the regression model to estimate the number of Alzheimer’s patients in 2005, 2025, and 2100.

3. Number of Physicians. The following table contains data regarding the number of physicians in the United States in selected years.

Year    Total Number of Physicians
1950    219,997
1955    241,711
1960    260,484
1965    292,088
1970    334,028
1975    393,742
1980    467,679
1985    552,716
1990    615,421
1994    684,414
1995    720,325
1996    737,764

a) Draw a scatter plot of the data.
b) Fit linear, quadratic, cubic, exponential, quartic, and power functions to the data. By comparing the values of R², determine the function that best fits the data.
c) Superimpose the regression curve on the scatter plot.
d) Use the regression model to predict the number of physicians in 1975.
e) Use the regression model to estimate the number of physicians in 2000 and 2025.

4. Credit Card Volume. The total credit card volume for Visa, MasterCard, American Express, and Discover has increased dramatically in recent years, as shown in the table below. (Source: CardWeb Inc.’s CardData)

Year, x    Credit Card Volume, y (in billions)
1988       261.0
1989       296.3
1990       338.4
1991       361.0
1992       403.1
1993       476.7
1994       584.8
1995       701.2
1996       798.3
1997       885.2

a) Draw a scatter plot of the data.
b) Fit linear, quadratic, cubic, exponential, quartic, and power functions to the data. By comparing the values of R², determine the function that best fits the data.
c) Superimpose the regression curve on the scatter plot.
d) Use the regression model to predict the credit card volume in 2003 and in 2010.

Logarithmic Regression Problems

1. Forgetting. In an art class, students were tested at the end of the course on a final exam. Then they were retested with an equivalent test at subsequent time intervals. Their scores after time t, in months, are given in the table.

Time, t (in months)    Score, y
1                      84.9%
2                      84.6%
3                      84.4%
4                      84.2%
5                      84.1%
6                      83.9%

a) Draw a scatter plot of the data.
b) Fit linear, quadratic, logarithmic, and power functions to the data. By comparing the values of R², determine the function that best fits the data.
c) Superimpose the regression curve on the scatter plot.
d) Use the regression model to predict test scores after 8, 10, 24, and 36 months.
e) After how long will the test scores fall below 82%?

2. Jamie, a meteorologist, is interested in finding a function that explains the relation between the height of a weather balloon (in kilometers) and the atmospheric pressure (measured in millimeters of mercury) on the balloon. She collects the data shown in Table 10.

Table 10
Atmospheric Pressure, p    Height, h
760                        0
740                        0.184
725                        0.328
700                        0.565
650                        1.079
630                        1.291
600                        1.634
580                        1.862
550                        2.235

a) Using a graphing utility, draw a scatter diagram of the data with atmospheric pressure as the independent variable.
b) Fit linear, quadratic, logarithmic, and power functions to the data. By comparing the values of R², determine the function that best fits the data.
c) Superimpose the regression curve on the scatter plot.
d) Use the function in part (b) to predict the height of the weather balloon if the atmospheric pressure is 560 millimeters of mercury.

3. Economics and Marketing. The following data represent the price and quantity supplied in 2005 for IBM personal computers.

Price ($/Computer)    Quantity Supplied
2300                  180
2000                  173
1700                  160
1500                  150
1300                  137
1200                  130
1000                  113

(a) Using a graphing utility, draw a scatter diagram of the data with price as the dependent variable.
(b) Using a graphing utility, try a variety of function families. Compare the values of R² to find the function that best fits the data.
(c) Using a graphing utility, draw the function found in part (b) on the scatter diagram.
(d) Use the function found in part (b) to predict the number of IBM personal computers that will be supplied if the price is $1650.

Power Regression Problems

1. Use the data in the table below to obtain a model for speed p versus distance traveled d. Consider linear, quadratic, exponential, power, and quartic models.
Then use the model you selected as the best fit to predict the speed of the ball at impact, given that impact occurs when d ≈ 1.80 m.

Table 2.12 Rubber Ball Data from CBR Experiment
Distance (m)    Speed (m/s)
0.00000         0.00000
0.04298         0.82372
0.16119         1.71163
0.35148         2.45860
0.59394         3.05209
0.89187         3.74200
1.25557         4.49558

2. The length of time that a planet takes to make one complete revolution around the sun is its year. The table shows the length (in Earth years) of each planet’s year and the distance of that planet from the sun (in millions of miles). Find a model for these data in which x is the length of the year and y is the distance from the sun.

Planet     Year      Distance
Mercury      0.24      36.0
Venus        0.62      67.2
Earth        1.00      92.9
Mars         1.88     141.6
Jupiter     11.86     483.6
Saturn      29.46     886.7
Uranus      84.01    1783.0
Neptune    164.79    2794.0
Pluto      247.69    3674.5

3. Cholesterol Level and the Risk of Heart Attack. The data in the following table show the relationship of cholesterol level in men to the risk of a heart attack.

Cholesterol Level, x    Men, Per 10,000, Who Suffer a Heart Attack, y
100                      30
200                      65
250                     100
275                     130
300                     180

(a) Use a graphing calculator to fit a model function to the data. Consider linear, exponential, power, and cubic functions.
(b) Graph the function with the scatterplot of the data.
(c) Use the answer to part (a) to estimate the heart attack rate for men with cholesterol levels of 150, 350, and 400.

Logistic Regression Problems

1. After the Civil War, the U.S. population increased, as shown below.

Year    Population (in millions)
1870     38.56
1880     50.19
1890     62.98
1900     76.21
1910     92.23
1920    106.02
1930    123.20
1940    132.16
1950    151.33
1960    179.32
1970    202.30
1980    226.54
1990    248.72
2000    281.42

a) Draw a scatter plot of the data.
b) Fit linear, quadratic, exponential, power, logarithmic, and logistic functions to the data. By comparing the values of R², determine the function that best fits the data.
c) Use the regression model to predict the population in 1975 and in 2010. Explain why you do or do not feel this prediction is valid.

2. Effect of Advertising. A company introduces a new software product on a trial run in a city. The company advertised the product on television and collected the following data relating the percent P of people who bought the product after x ads were run.

Number of Ads, x    % Who Bought, P
 0                   0.2
10                   0.7
20                   2.7
30                   9.2
40                  27
50                  57.6
60                  83.3
70                  94.8
80                  98.5
90                  99.6

Draw a scatter plot of the data. Then fit linear, exponential, power, logistic, and logarithmic functions to the data. By comparing the values of R², determine the function that best fits the data. Then use the regression model to predict the percent P of people who will buy the software after 100 ads are run.

* Relate what you have discovered in this exercise to what you have observed in television ads. What could the company do to change this pattern?
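
Note: The workflow these problems repeat (fit several function families, compare R², then predict) can also be tried in software instead of on a graphing calculator. The Python sketch below is only an illustration under stated assumptions, not the worksheet's own method. It uses the stopping-distance data from Table 2.16 and NumPy's polyfit; the helper names r_squared, fit_polynomial, and fit_power are hypothetical. The power model is fitted by least squares on log-transformed data, and its R² is computed here on the original scale, so it may differ from the value a calculator reports.

```python
import numpy as np

# Stopping-distance data from Table 2.16.
speed = np.array([10, 20, 30, 40, 50], dtype=float)      # mph
distance = np.array([15.1, 39.9, 75.2, 120.5, 175.9])    # ft


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit; returns (coefficients, R^2)."""
    coeffs = np.polyfit(x, y, degree)
    return coeffs, r_squared(y, np.polyval(coeffs, x))


def fit_power(x, y):
    """Fit y = a * x**b via a straight line through (ln x, ln y).

    Assumption: all x and y are positive, so the logarithms exist.
    R^2 is evaluated on the original (untransformed) data.
    """
    b, ln_a = np.polyfit(np.log(x), np.log(y), 1)
    a = np.exp(ln_a)
    return (a, b), r_squared(y, a * x ** b)


results = {}
for name, degree in [("linear", 1), ("quadratic", 2), ("cubic", 3)]:
    results[name] = fit_polynomial(speed, distance, degree)
results["power"] = fit_power(speed, distance)

for name, (params, r2) in results.items():
    print(f"{name:10s} R^2 = {r2:.5f}")

# Predict the stopping distance at 25 mph with the quadratic model.
quad_coeffs, _ = results["quadratic"]
pred_25 = np.polyval(quad_coeffs, 25.0)
print(f"quadratic model at 25 mph: {pred_25:.1f} ft")
```

Because each polynomial family contains the lower-degree ones, R² can only stay the same or rise as the degree increases, so a higher R² alone does not make a model's long-range predictions realistic. That is the point several of the problems above ask you to examine.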