Regression Analysis With the TI

Why Is Alaska So Cool?
TI-83 + DATA = MODEL
Activity 1: TI-83 + DATA = MODEL
Introduction to Computational Science and Mathematical Modeling
Using the TI-83 Plus Graphing Calculator
Guiding Question
How can scientific data be analyzed and modeled with a mathematical equation?
Prediction of Results
After reading the situation described below in “What is Mathematical Modeling?” and
studying the plot of the data, predict what type of mathematical equation would “best
fit” the data.
Objective
To determine a line of best fit (regression line) from a given set of data using the TI-83
Plus graphing calculator.
Materials
TI-83 Plus, log notebook
Vocabulary Terms
regression analysis
trend line
computational science
direct variation
What is Mathematical Modeling?
Consider the following situation.
An investigator wondered: What would happen if you took a certain amount of gas and
measured its volume at different temperatures? So, the investigator ran a test of how
the volume of a gas changes as its temperature varies (at
constant pressure). The investigator took a syringe, filled it about half full of air, and
sealed off the end so no further air could get in or out. The syringe was placed in an
oven that allowed for precise temperature control. As the temperature changed, the
plunger on the syringe moved inward or outward showing a change in the volume of
the gas at a given temperature. The investigator observed and measured carefully
certain temperatures in degrees Celsius and the volume of air in cm³ in the syringe at
those temperatures, then recorded their values.
When the test was completed, the investigator then plotted on a grid the data points of
temperature and air volume at that temperature. Figure 1 shows the resulting scatter
plot. As you consider the plot, what do you note?
Figure 1 – Scatter Plot (axes labeled Temperature and Volume)
The following questions then came to mind:
1. Is there a predictable pattern to the way the volume changes with temperature?
2. If there is such a pattern, is there a mathematical equation that describes or models
the pattern?
3. In fact, exactly what mathematical equation gives a reliable enough model for
representing the data?
4. Can this model predict accurately enough what the volume would do at other
temperatures that were not investigated?
These are questions that are involved in mathematical modeling. For our purposes,
mathematical modeling is an attempt to create a reasonable model for a set of real
world data by the use of a mathematical equation. That is, when scientists have
completed an investigation in which they have collected an amount of data, they try to
develop a mathematical model (a) that explains accurately enough all the data they
have collected, and (b) that can predict accurately enough trends for other data that
were not collected in the original investigation.
This type of mathematical modeling is called regression analysis (also called finding a
trend line, line of best fit, or least squares line). The object is to find a line (or curve)
through actual data points so that the differences between the actual data points and the
points on the line are as small as possible. Drawing a line of best fit by hand can be done
by “eyeballing” the points and estimating the position of the line that is closest to all the
points. The equation of such a line can be derived by finding any two points on the line,
then using them to find the slope and write an equation. While the mathematics
involved in finding the best fit line that minimizes the differences from the actual data
points to the line can be complicated and is usually reserved for the study of calculus or
higher-level statistics, equations for regression lines can easily be found using a
computer or graphing calculator.
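If you have access to a computer, the hand method described above can be sketched in a few lines of Python. This is only an illustration and is not part of the calculator activity: the function names line_through and sum_squared_differences and the four sample points are made up for this example.

```python
def line_through(p1, p2):
    """Return slope a and intercept b of the line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    a = (y2 - y1) / (x2 - x1)   # slope = rise / run
    b = y1 - a * x1             # solve y1 = a*x1 + b for b
    return a, b


def sum_squared_differences(points, a, b):
    """Add up the squared vertical gaps between each point and the line."""
    return sum((y - (a * x + b)) ** 2 for x, y in points)


# Hypothetical data points that fall roughly along a line.
points = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]

# "Eyeball" a line through the first and last points.
a, b = line_through(points[0], points[-1])
sse = sum_squared_differences(points, a, b)
print("slope:", a, "intercept:", b, "sum of squared differences:", sse)
```

A regression finds the slope and intercept that make this sum of squared differences as small as possible, which is why the result is also called the least squares line.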
Computational Science: Regression Analysis Activity on the TI-83 Plus
Computational science is the use of computers to create mathematical models of real
world events. Your calculator is a highly specialized computer as it uses a
programmable central processing unit (cpu, Z-80) for its operation. Consequently, you
will conduct a genuine analysis through computational science during the next activity.
In the following activity, you will not need to conduct an actual investigation such as
the one described before. You will, however, use the data below as if you were the
investigator above and had already conducted the test and recorded the
temperatures and volumes in a daily log, reproduced in Chart 1 below.
Chart 1 (Constant Pressure)
Volume (cm³)    Temperature (°C)
273             0
283             10
293             20
323             50
373             100
423             150
473             200
523             250
546             273
With your TI-83 Plus calculator, you will now have the opportunity to conduct a
regression analysis on the data in Chart 1. The question at hand is: What mathematical
equation best describes the relationship between temperature and volume in the
investigation?
To perform regression analysis on your calculator, you will follow these six steps:
(1) clear the calculator of any current data in the lists,
(2) enter the data in the lists,
(3) examine the resulting data scatter plot visually to see possible types of curves or
lines,
(4) choose a type of equation for the regression analysis,
(5) have the calculator’s cpu run the regression and derive an equation, superimpose
the model (graphed equation) onto the data scatter plot from the data, and
(6) conduct a visual comparison and further analysis of the original data and your
model.
Clear Lists
Data tables for this calculator are called “lists.” There can be several lists, and they are
numbered L1, L2, L3, etc. To see if there is data in the lists in your calculator, press
STAT, then select 1:Edit to see the lists. There are two ways to quickly clear data from
the lists. The first is safer, but the second may be faster if you have data in more than
one list. To clear a list, press STAT, then select 4:ClrList, then enter one or more list
names, separated by commas, that you wish to clear. (The list names are in yellow
above the number keys). Press ENTER, then you should see the word Done on the
screen. (To clear the calculator of all lists at once, you could do the following. Press the
ON key, then press 2nd [MEM] 4 ENTER . Please do NOT press any other number
than 4! Pressing any other number may clear other items from your calculator’s
memory that make the rest of your unit project difficult or impossible.)
Clear Functions
Press Y= . If there are any Y= equations in this window, move the cursor to each
equation with the up (▲) or down (▼) cursor key and press CLEAR . You should have
only a column of Y= in the window when you are done, as in the screen shot below.
You may now proceed to enter data.
Enter New Data
You will need to enter the temperature and gas volume data from Chart 1. To
do this, you will enter the values into the calculator’s data tables, or lists.
You will place the gas volume values in list L1, and the temperature values in L2. Please
start with the volumes.
Press STAT ENTER (or select 1:Edit…). The screen opens to the lists, and the
calculator is ready to receive the first value for volume in L1. The screen looks like the
one below.
Enter the first value, 273, then press ENTER . The calculator enters the value in the first
position, and the cursor moves down to the next line, ready for the next value. Enter the
second value, press ENTER . Continue doing this until L1 contains all the volume
values.
Then press the right cursor key (►). The cursor moves to the first position in L2. Enter
the first temperature value, 0, press ENTER . Continue entering the temperature values
as you did the volumes.
When all the values are entered, the two lists should have the same numbers of entries,
nine each. The screen should look like the one below:
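For comparison, here is one way the same two lists might look in Python. This is only a sketch for readers who want to follow along on a computer; the variable names L1 and L2 simply mirror the calculator's list names.

```python
# Chart 1 as two parallel lists, named after the calculator's L1 and L2.
L1 = [273, 283, 293, 323, 373, 423, 473, 523, 546]  # volume (cm^3)
L2 = [0, 10, 20, 50, 100, 150, 200, 250, 273]       # temperature (deg C)

# Each volume must have a matching temperature: nine entries in each list.
assert len(L1) == len(L2) == 9

# Print the pairs row by row, like the rows of the list editor.
for volume, temperature in zip(L1, L2):
    print(volume, temperature)
```

As on the calculator, what matters is that the values stay paired: the first entry in L1 belongs with the first entry in L2, and so on.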
You are now ready to examine your plot and graph.
Set Plot Options
Be sure that the plot function is turned on, so that you can see the scatter plot formed by
the data you have entered. To turn it on, press 2nd [STAT PLOT]. If 1:Plot 1 shows
ON, with a graph type of a dotted line, and L1 and L2 as the lists, as in the screen shot
below, then you need do no more.
If 1:Plot 1 shows OFF, (like 2:Plot 2 above) do the following:
With the cursor on 1 (as in the screen above), press ENTER , then select On by pressing
ENTER while the cursor is flashing on the word On. This turns the Plot 1 function to
On. Make sure that the rest of the screen is as shown below so that the values in L1 (the
volume) will be plotted on the x-axis and the values in L2 (the temperature) will be
plotted on the y-axis. The graph type should be a scatter plot (the first graph choice).
You can also select the type of mark you would like to use for the data points, either a
dot, a plus sign, or a small box.
This screen lets you change the way the plotted points display. To change any of the
options, scroll with the down (▼) or up (▲) cursor key until the cursor is on the line you
wish to change. You may choose the type of graph or mark by moving the cursor to the
right or to the left. When it is flashing on the option you want, press ENTER to select
it. If you need to change one of the list choices, you must press 2nd then the list name key
to enter the new list name. Press 2nd [QUIT] to go to the home screen.
Analyze the Data Visually
Examine the scatter plot created by your data. To do this, press the blue GRAPH key,
and the calculator will build a scatter plot showing your data on a coordinate plane. If
you cannot see the plot, it may be that your data is off the screen. Press ZOOM 9
(ZoomStat) to have the window set up necessary parameters automatically. Your screen
should be similar to the one in Figure 1 of the first page of this assignment.
Now that you can see the data points, what do you notice about them? Do you see that
they seem to fall in a pattern of a line (more or less) with an upward slope as you go to
the right? It is based on this visual appearance that you will proceed to the regression
analysis.
Regression Type
There are many types of regression functions, or equations, possible, and you need to
choose one to compute. So, let us assume that a line does indeed describe the plotted
points, and that therefore a linear equation will fit the bill. From your studies in algebra
what do you know about the equation of a line? You may remember that a linear
equation takes the generalized form of y = ax+b, where a is the slope of the line and b is
its y-intercept.
Since we are considering a linear equation, you will choose a linear regression with your
calculator out of the many types of regressions it can perform. To do this, press STAT
then move your cursor to the right (►) to select CALC. At this point your calculator
lists a number of possible calculations and regressions it can perform. You should have
seen the screens below in order.
Choose LinReg(ax+b) by scrolling down to it and pressing ENTER , or, by pressing 4 .
The window returns to the home screen and displays LinReg(ax+b) as the next screen
shot shows. This means the calculator is ready to run a linear regression of the type
ax+b. Note: The screen may have other information on it, but the last line on the home
screen must have LinReg(ax+b).
Run Regression
You must indicate what data should be used for the regression. To do this, press 2nd
[L1] , 2nd [L2] (be sure to include the comma). The result looks like this:
This instructs the calculator to run a linear regression on the values in the two lists, L1
and L2. Consequently, when you press the enter key, the cpu will perform a long series
of calculations and will find an equation for the line of best fit for the data points in the
lists. (Note: If you do not specify the lists to use for the regression equation, it should be
calculated based on L1 and L2 by default.)
Press ENTER .
After a few moments of calculating, the calculator returns a screen similar to Figure 2.
Figure 2 – Linear Regression
Notice that it gives an equation of the type y=ax+b, and gives values for a and for b.
You know that a gives the slope of the line (its “steepness”) and b gives the y-intercept
(the value for y when x = 0).
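To get a feel for what the calculator is computing, the sketch below applies the standard least-squares formulas for a line to the Chart 1 data. It is an illustration written for this activity, not the calculator's actual program, and the function name linreg is an assumption chosen to echo LinReg(ax+b).

```python
def linreg(xs, ys):
    """Least-squares slope a and intercept b for the line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x varies on its own, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx              # slope
    b = mean_y - a * mean_x    # the line passes through (mean_x, mean_y)
    return a, b


# Chart 1: x = volume (cm^3) as in L1, y = temperature (deg C) as in L2.
volumes = [273, 283, 293, 323, 373, 423, 473, 523, 546]
temps = [0, 10, 20, 50, 100, 150, 200, 250, 273]

a, b = linreg(volumes, temps)
print("a =", a)
print("b =", b)
```

Compare the printed values of a and b with the ones on your calculator screen. Because every point in Chart 1 lies exactly on one line, both results should agree.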
Now, you will graph the equation itself, superimposed on the data points you collected.
To do this, press Y= and move the cursor to the first available function. Press VARS 5
(5:Statistics…), then move the cursor to the right twice (► ►) to select the EQ menu.
Then select 1:RegEQ (or press ENTER ) to copy the regression equation to the cursor
position. Press ENTER again. Now the screen should look like the one below:
View Regression Graph and Scatter Plot
Press GRAPH to see the scatter plot and regression curve together, as below. (Your
graph may not show the x and y coordinate labels. This is fine.) Press ZOOM 9 to
make sure your graph otherwise looks like the one below.
Does the line produced by your linear regression move through the scatter plot in the
directions that you anticipated? If so, good! In this example, all the data points line up
on the regression line (and the slope and y-intercept values are nice and neat), but
usually the data points do not lie in a straight path. In fact, suppose in the experiment
above that the investigator measured several volumes of gas at additional temperatures.
Add these data points to the values you already entered in L1 and L2:
Volume (cm³)    Temperature (°C)
315             45
401             126
452             183
510             232
To enter new data points, press STAT, select 1:Edit…, then move your cursor to the
bottom of each list to add the new data. Note that the calculator does not care if the
points are in numerical order. Press GRAPH , then ZOOM , 9:ZoomStat to see the new
data points plotted. Notice that the line that is plotted was the regression line for the
original set of points. Is this line still the “best” line for the new set of data?
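One way to answer this question with numbers is to add up the squared vertical gaps between every data point and each candidate line. The Python sketch below does this for the old line (fitted to the first nine points only) and for a line fitted to all thirteen points. It is an illustration only; the function names linreg and sum_squared_errors are made up for this example.

```python
def linreg(xs, ys):
    """Least-squares slope a and intercept b for the line y = a*x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def sum_squared_errors(xs, ys, a, b):
    """Total of the squared vertical gaps between the points and the line."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


# The nine original points from Chart 1 followed by the four new points.
volumes = [273, 283, 293, 323, 373, 423, 473, 523, 546, 315, 401, 452, 510]
temps = [0, 10, 20, 50, 100, 150, 200, 250, 273, 45, 126, 183, 232]

old_a, old_b = linreg(volumes[:9], temps[:9])   # line from the original data
new_a, new_b = linreg(volumes, temps)           # line from all thirteen points

sse_old = sum_squared_errors(volumes, temps, old_a, old_b)
sse_new = sum_squared_errors(volumes, temps, new_a, new_b)
print("old line:", sse_old)
print("new line:", sse_new)
```

The line with the smaller total is the better fit in the least-squares sense. The old line is still close, but a line fitted to all thirteen points does slightly better.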
Determining the “Best” Fit
When all the points of a data set do not lie exactly on the same line, the regression line is
the line that makes the squared vertical distances from the data points, taken together,
as small as possible. How closely the data points cluster around the line of best fit is
summarized by a number “r” (called the correlation coefficient) and a value of r², called
the coefficient of determination. In general, the closer the value of r² is to 1 (and r to 1
or -1), the better the equation represents the
actual data. Before you find a new regression line for the new set of data, let’s turn on
the diagnostic mode of the calculator so that it will display the values of r and r² for
each regression equation. To turn on the diagnostics, press 2nd [CATALOG] (above the 0
key). Scroll down until the right arrow is pointing to the word DiagnosticOn, then press
ENTER twice: once to paste the command onto the home screen and once to run it. Now
use the same procedure as before to find the regression equation for the
data in L1 and L2. You should see the new values of the variables on a screen similar to:
LinReg
y=ax+b
a=.9946635508
b=-270.8703463
r²=.9995427351
r=.9997713414
Notice that the numbers are not as “clean” and that the equation has changed
somewhat, although it is still close to the original equation. Enter this new regression
equation in the list of equations as Y1 following the same procedure outlined above and
press GRAPH to see the new equation graphed along with the data points.
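If you would like to check the calculator's screen on a computer, the following Python sketch computes a, b, r, and r² for all thirteen data points using the standard least-squares and correlation formulas. It is an illustration under that assumption, not a copy of the calculator's internal program, and the function name linreg_with_r is made up for this example.

```python
import math


def linreg_with_r(xs, ys):
    """Return slope a, intercept b, and correlation coefficient r."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r


# L1 and L2 after the four new points have been added.
volumes = [273, 283, 293, 323, 373, 423, 473, 523, 546, 315, 401, 452, 510]
temps = [0, 10, 20, 50, 100, 150, 200, 250, 273, 45, 126, 183, 232]

a, b, r = linreg_with_r(volumes, temps)
r2 = r * r
print(f"a={a:.10f}")
print(f"b={b:.7f}")
print(f"r2={r2:.10f}")
print(f"r={r:.10f}")
```

The printed values should agree with the sample screen above to the digits shown, apart from possible rounding in the last place.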
You are now ready to complete further analysis.
Mathematical Analysis
Note that the regression line has a certain slope. If the temperature increases (y-axis), so
does the volume (x-axis). If the temperature decreases, the volume does likewise; if you
follow the line further to the left, what does it predict about the air volume at colder and
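colder temperatures? These two variables are thus in direct relationship or direct
variation with each other. You may remember that a direct variation is of the form
y = kx where k is the constant of variation. Strictly speaking, a line where the equation is
y = ax + b is a direct variation only when b = 0. When b is not 0, it is the change in y that
is proportional to the change in x, with the slope (a) as the constant. As the Extension at
the end of this activity shows, measuring temperature from a different zero point removes
b, and then the constant of variation (k) is the same as the slope of the line (a).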
Let’s zoom out to take a look at a greater portion of the regression graph and scatter
plot. Press ZOOM 3 ENTER .
You may now see the regression graph as it crosses both the y- and x-axes. (The values
X=409.5 and Y=136.5, or other values on your calculator, simply show on which data
point the cursor is.) This view of the graph shows a large array of predicted
temperatures and volumes that have never been directly experienced (were not part of
the data actually collected).
You are now in a position to analyze temperatures and volumes you did not actually
measure in the original investigation, and predict the values the regression yields for
those temperatures or volumes. You may do this analysis using the techniques you
learned when you completed the introductory exercises for your calculator.
You may evaluate the regression directly (a) by setting x to a value that the calculator
will solve for y; (b) by creating a table of values for the regression, and scrolling to the
value of interest; and (c) by exploring the graph created by the regression with the
Trace command. You may wish to review these techniques particularly starting with
the section, “Evaluate a Function,” found in the document titled A Tutorial for the TI-83
Plus Graphing Calculator.
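The same three ideas (evaluating at one value, building a table, and working backward from y to x) can be sketched in Python. The values of a and b below are copied from the sample regression screen earlier in this activity; the function names temperature_at and volume_at, and the volumes chosen for the table, are made up for this illustration.

```python
# Slope and intercept copied from the sample LinReg screen in this activity.
a = 0.9946635508
b = -270.8703463


def temperature_at(volume):
    """Evaluate the model y = a*x + b (x = volume, y = temperature)."""
    return a * volume + b


def volume_at(temperature):
    """Solve y = a*x + b for x to go from a temperature back to a volume."""
    return (temperature - b) / a


# (a) Evaluate the regression at a single value of x.
print(temperature_at(600))

# (b) Build a small table of values, like the calculator's TABLE screen.
for volume in range(300, 601, 100):
    print(volume, round(temperature_at(volume), 2))

# Working backward: which volume does the model give for that temperature?
print(volume_at(temperature_at(600)))
```

Note that the model takes a volume and returns a temperature. To start from a temperature, you must solve the equation for x, which is what volume_at does.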
Concluding Questions
Using any of the techniques indicated in the above paragraph, “Mathematical
Analysis,” answer the following questions.
1. What is the volume when the temperature is 175 °C? ___________________
Which method did you use? ________________________________________
2. What is the temperature when the volume is 800 cm³? __________________
Which method did you use? ________________________________________
3. Consider: If the temperature continually gets lower and lower, the volume
continues to shrink until (in theory) the volume has decreased to 0 cm³.
Determine the temperature at which the volume theoretically is reduced to zero.
__________________________
Which method did you use? ________________________________________
4. Thought Challenge: Give two reasons it is illogical to think that gases at the lowest
temperatures in reality have no volume (feel free to research the question).
(a) ______________________________________________________________
______________________________________________________________
(b) ______________________________________________________________
______________________________________________________________
Extension: A Last Thought
Some students, especially those in chemistry or physics, may have noted that this
investigation about gas volumes and temperatures at constant pressure concerns
Charles’ Law about gases. Many such experiments using regression analysis have led
scientists to accept the notion of absolute zero, -273 °C, the theoretical coldest
temperature possible. It is thought that at this temperature all molecular heat
motion ceases, and matter has no internal heat energy.
Scientists created the absolute scale for temperature based upon these findings. It is
defined as equal to degrees Celsius plus 273, and is often called the kelvin scale. Thus,
kelvin = degrees Celsius + 273, or K = C + 273, so absolute zero is 0 K = -273 °C.
The regression on the original data in Chart 1 gave temperature = volume - 273. Solving
for the volume and relabeling the variables, we have y=ax+b, where y = volume, a = 1,
x = degrees Celsius, and b = 273. Therefore,
y=(1)(x)+273 = x+273 = kelvins.
Since y is the volume, we may use V for y, and we may use T for kelvins. Substituting,
V=T; the coefficient of T being an implied 1, which is the slope of the linear equation!
So, V/T = 1, a constant for any volume/temperature ratio.
As it turns out, the constant ratio is not always 1: its value depends on the amount of gas
and on the pressure, so that the slope of the volume/temperature line for a given gas
sample gives the constant value (k) for that sample. Expressing
this direct relationship generally as V/T=k is Charles’ Law. That is, the ratio of volume
to temperature is a constant, so the volume of a gas and its temperature (at constant
pressure) are said to be directly proportional to each other.
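As a quick check of this idea, the Python sketch below converts the Chart 1 temperatures to kelvins and computes the ratio V/T for each data pair. It is an illustration only; the variable names are made up, and the conversion uses the rounded value 273 from the text.

```python
# Chart 1 data: volume in cm^3 and temperature in degrees Celsius.
volumes = [273, 283, 293, 323, 373, 423, 473, 523, 546]
celsius = [0, 10, 20, 50, 100, 150, 200, 250, 273]

# Convert to the absolute (kelvin) scale: K = C + 273.
kelvins = [c + 273 for c in celsius]

# Charles' Law says V/T should be the same constant k for every pair.
ratios = [v / t for v, t in zip(volumes, kelvins)]
for v, t, k in zip(volumes, kelvins, ratios):
    print(v, t, k)
```

For this data set every ratio comes out the same, so the volume is directly proportional to the kelvin temperature. With real measurements the ratios would only be approximately equal.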
Such is the power of the use of mathematical models in the understanding of real world
events.
You are now ready to begin the use of your calculator and CBL unit in the
investigations ahead.
Take a moment to reflect on today’s activities. What was meaningful? What was
confusing? Enter your reflections in your daily log.