What’s Your Function?

Prelude

I’m thinking of a function. It is of one of the following three forms:

    $y = ax + b$      (linear)
    $y = bx^{a}$      (power)
    $y = be^{ax}$     (exponential)

I’m not going to tell you which form it is, nor am I going to tell you what $a$ and $b$ are. Instead, I’ll give you eight points which lie on the graph of the mystery function. You have to decide what type of function it is (linear, power, or exponential) and what the values of $a$ and $b$ are. Sound impossible? Keep reading…

Whether one is trying to gain insight into how some physical phenomenon works (acceleration of gravity) or why biological species evolve the way they do (wingspan of birds), scientific experimentation or observation is often the starting point. These experiments and observations result in large amounts of data. Somehow, by examining these data, certain relationships between the quantities are supposed to become apparent. How can this be done? In this activity we will investigate how a basic knowledge of logarithms provides the power to tell a large number of datasets apart.

Logarithms

You should already be familiar with the basic properties of logarithms. In particular, the following algebraic properties will be the most useful for the successful completion of this exploration:

    $\log(AB) = \log(A) + \log(B)$
    $\log\left(\frac{A}{B}\right) = \log(A) - \log(B)$
    $\log(A^{r}) = r\log(A)$

Fitting Lines to Data

Fitting lines to data is probably one of the most common procedures used when analyzing scatterplots (xy-plots). When studying statistics this is sometimes referred to as regression analysis. It is important to realize that statisticians and mathematicians have agreed on the “best” way to fit a straight line to any dataset. This is important, because if two different people are trying to fit a straight line to a set of data, it would be nice if they could both independently produce the same straight line!
The process used is called least squares regression. It amounts to finding the one line that minimizes the sum of the squared vertical distances from the data points to the line, so that it comes closest (on average) to all of the data points. The process is not difficult; it simply involves a lot of computations that are often left to a calculator or computer.

Clearly, straight lines will approximate some datasets very well and some quite poorly. Luckily, there is a way to quantify precisely how closely the data points fall to the best-fit line. It is called the correlation coefficient and is denoted by $r$. You may have seen this before in a statistics class. A few important facts about $r$ are:

- $-1 \le r \le 1$, and $r = 1$ or $r = -1$ only when the data points are exactly on a line.
- The closer $r$ is to 1 or $-1$, the closer the data points are to a straight line.
- If $r$ is close to 0, then the data points are nowhere near a straight line.
- If $r$ is positive, then the data points approximate a line with positive slope.
- If $r$ is negative, then the data points approximate a line with negative slope.

Families of Functions

You have probably already studied many different types of functions: linear, quadratic, cubic and higher powers, square root and other fractional powers, inverse square and other negative powers, exponential, and logarithmic.

The following table represents four datasets (each dataset uses the same x-variable and has its own unique y-variable associated with it). Each dataset was obtained by using one of the common functions listed earlier. In this activity, you will learn how to decide which function is paired with which dataset.

x     2         5         6         7         10        12        25        30
y1    19.2      18        17.6      17.2      16        15.2      10        8
y2    14.14214  22.36068  24.4949   26.45751  31.62278  34.64102  50        54.77226
y3    5.256355  5.665742  5.809171  5.956231  6.420127  6.749294  9.34123   10.585
y4    0.6       3.75      5.4       7.35      15        21.6      93.75     135

Linear Data

How can you tell when data are linear? Often, you can simply create a scatterplot and see if the data appear to be linear.
If so, you can have a computer or calculator find the equation of the regression line as well as the r-value.

1. For each dataset, compute the equation of the regression line and the corresponding r-value.
2. Which of these datasets appear to be exactly linear? How can you tell?

Power Functions

A power function is any function of the form $y = bx^{a}$, where $b$ and $a$ are any constant values.

3. The square root function, $y = \sqrt{x}$, is a power function. What are $b$ and $a$?
4. The inverse square function, $y = \frac{1}{x^{2}}$, is a power function. What are $b$ and $a$?
5. Show that the equation $y = bx^{a}$ is equivalent to the equation $\log(y) = \log(b) + a\log(x)$. Note: it doesn’t matter what base logarithm you use here.
6. Introduce two new variables $x'$ and $y'$ by:

       $x' = \log(x)$
       $y' = \log(y)$

   Re-write the last equation from question 5 in terms of these two new variables.

This last exercise shows that if you have two columns of data that are related by a power function ($y = bx^{a}$), then the logarithms of the data should be linear ($y' = ax' + \log(b)$)!

7. For the remaining datasets (those that were not exactly linear), create new datasets by taking the logarithm of all of the x-values and all of the y-values.
8. For each of these new datasets, graph the data, compute the equation of the regression line, and compute the corresponding r-value.
9. Do any of these new datasets appear to be exactly linear? For each of these, identify the underlying power function of the original dataset (i.e. find $a$ and $b$). Hint: This involves reversing the algebraic steps you performed in question 5.

Exponential Functions

An exponential function is one of the form $y = be^{ax}$, where $b$ and $a$ are any constant values.

10. Show that the equation $y = be^{ax}$ is equivalent to the equation $\ln(y) = \ln(b) + ax$. Note: Here you want to use the natural logarithm (base $e$) since $\ln(e) = 1$.

This is similar to what we did above, but here we see that $\ln(y)$ and $x$ are linearly related!
Therefore, if you graph the original x-values together with the natural logarithm of the original y-values, you will see a straight line if the original data were related by an exponential function.

11. For the remaining datasets (those that were not exactly linear or power functions), create a graph of $x$ vs. $\ln(y)$, and compute the equation of the regression line and the r-value.
12. Did you uncover any exponential functions? If so, identify the underlying exponential function of the original dataset.

In reality, one will probably never have r-values exactly equal to 1 or $-1$. However, by comparing the r-values that are produced by the above analyses, one can determine which function might come closest to matching the original data.
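
For readers who would rather let a computer do the arithmetic, the three analyses above can be sketched in a few lines of Python. This is only one possible sketch, not part of the original activity: the function names `fit_line`, `candidate_fits`, and `best_form` are made up for illustration, the natural logarithm is used for both transformations, and the "winner" is simply taken to be the form whose r-value is closest to 1 or -1 on its own transformed data. It assumes every x and y is positive so that the logarithms exist.

```python
import math

def fit_line(xs, ys):
    """Least-squares line ys ~ slope*xs + intercept, plus the correlation r."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

def candidate_fits(xs, ys):
    """Return {form: (a, b, r)} for the three forms in the Prelude.

    Assumption: all x and y are positive, so ln(x) and ln(y) are defined.
    """
    ln_x = [math.log(x) for x in xs]
    ln_y = [math.log(y) for y in ys]
    fits = {}
    a, b, r = fit_line(xs, ys)          # y = a*x + b
    fits["linear"] = (a, b, r)
    a, ln_b, r = fit_line(ln_x, ln_y)   # ln(y) = a*ln(x) + ln(b)
    fits["power"] = (a, math.exp(ln_b), r)
    a, ln_b, r = fit_line(xs, ln_y)     # ln(y) = a*x + ln(b)
    fits["exponential"] = (a, math.exp(ln_b), r)
    return fits

def best_form(xs, ys):
    """Name of the form whose r-value is closest to 1 or -1."""
    fits = candidate_fits(xs, ys)
    return max(fits, key=lambda form: abs(fits[form][2]))

# The four datasets from the table in "Families of Functions".
x = [2, 5, 6, 7, 10, 12, 25, 30]
datasets = {
    "y1": [19.2, 18, 17.6, 17.2, 16, 15.2, 10, 8],
    "y2": [14.14214, 22.36068, 24.4949, 26.45751,
           31.62278, 34.64102, 50, 54.77226],
    "y3": [5.256355, 5.665742, 5.809171, 5.956231,
           6.420127, 6.749294, 9.34123, 10.585],
    "y4": [0.6, 3.75, 5.4, 7.35, 15, 21.6, 93.75, 135],
}

for name, ys in datasets.items():
    print(name, "looks", best_form(x, ys))
    for form, (a, b, r) in candidate_fits(x, ys).items():
        print(f"    {form:12s} a = {a:.6g}  b = {b:.6g}  r = {r:.6f}")
```

Try the questions by hand first and use the printed values of a, b, and r only as a check. Because the table entries are rounded, a computed r-value may differ from 1 or -1 in the last few decimal places even when the underlying function matches exactly.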